Comments on threats to DirectionlessGov

William Heath recently talked about what would happen if Google killed Directionlessgov (here).
I doubt they would – we don’t make money, just a mockery. That said, there are threats to Directionless, ones which are far more deadly than Google could be.

The part of Google we use is their search engine, and when I talk about Google here, that’s the only bit I mean – www.google.com/search. Search is big business – Google (GOOG) alone is worth about $112 billion. Even so, if Google suddenly stopped being available to us for any reason, we would just use a different search engine. It may suck more, but we’d be OK. The reason might be that they don’t want us to use their site, or that they make changes which are incompatible with what we do, or maybe the ISS crashed into them. Search is a commodity. As with many sites on the internet, if you can’t use them for your purpose, whether they exist is irrelevant.

Much of what Directionless and other sites such as http://www.TheyWorkForYou.com, http://www.PublicWhip.org, and overseas equivalents such as http://www.GovTrack.US do is scraping. We read the HTML page that Parliament (or wherever) serves out for browsers, then interpret it and pull out the content using automated programs. Those programs are, and have to be, quite conservative in what they understand, as accurate representation is critical. When those page structures change, our sites break until we catch up. While change is annoying, progress is a good thing (“progress” backwards, however, is just annoying).
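To make “conservative” concrete, here is a minimal sketch of a scraper that refuses to guess when a page no longer looks the way it expects. It is not the code any of these sites actually run; the `<p class="speaker">` markup, `SpeakerParser`, `extract_speakers` and `ScrapeError` are all made up for illustration.

```python
from html.parser import HTMLParser


class ScrapeError(Exception):
    """Raised when a page does not match the structure we expect."""


class SpeakerParser(HTMLParser):
    """Collects the text of every <p class="speaker"> element.

    The markup is hypothetical; a real page would need its own rules.
    """

    def __init__(self):
        super().__init__()
        self.in_speaker = False
        self.speakers = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "p" and ("class", "speaker") in attrs:
            self.in_speaker = True
            self.speakers.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_speaker = False

    def handle_data(self, data):
        if self.in_speaker:
            self.speakers[-1] += data


def extract_speakers(html):
    """Return speaker names, or raise rather than return a guess."""
    parser = SpeakerParser()
    parser.feed(html)
    parser.close()
    speakers = [s.strip() for s in parser.speakers]
    # Conservative: no matches, or an empty match, means the page
    # structure has probably changed, so we stop instead of guessing.
    if not speakers or any(not s for s in speakers):
        raise ScrapeError("page structure not recognised")
    return speakers
```

The point of raising an error is that a broken site is better than a site that quietly misreports what was said. The price is exactly the breakage described above whenever the source page changes.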

There is no free, reliable source of the postcode to Local Authority (or Constituency) lookup table – it costs a very large amount of money. With a budget of 3 tea bags, some milk and a chocolate biscuit, Local Directionless needs to know which Authority covers your postcode so that we can search the relevant website. We scrape this information from another site. However, when the site we scrape changes, stops working or moves to block us, we stop working until we fix it. We can adapt, but we are on the defensive in an arms race, and will only win because of more motivation and supplies of tea.

The biggest threat to Directionless, and all similar civic sites, is not that one commodity part stops working until we switch (Google to Yahoo, for example); it’s that one vital part prevents us from working. For some projects, it’s mapping data (unavailable, rather than restricted); for some, it’s the licence of Hansard and potential problems caused by legal threats; for others, it’s what happens when a partially open site takes offline the part we use, just where we need up-to-date lookups.

What took most of the time of the local search was not the implementation, but creating the lookup table between Local Authorities/Councils (and how they were titled: “Cambridge” is not the same as “Cambridge City” or “Cambridge County”). That’s the real core of Directionless Local. And there’s no easy way for us to get that list from any other public site.

That lookup isn’t commoditised. Yet. We could protect it and commoditise it by creating a list of postcodes, one postcode per authority and constituency (and possibly even ward?), and making it publicly available. There are many websites which allow you to do a lookup, and definitive lookup tables could be created, by anyone, from any of them. But only if there is a master list which we know is comprehensive. Then Directionless, and any other site, could swap in any site we wanted to get the information from. Some of these jobs aren’t sexy, but need doing.
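As a rough sketch of how such a master list would make the lookup swappable: take one sample postcode per authority, ask whichever lookup site is available, and fall back to the next if it can’t answer. Everything here is hypothetical. `build_lookup`, the authority ids, the postcodes and the two stand-in “sites” (plain functions rather than real websites) are invented for illustration.

```python
def build_lookup(master_postcodes, sources):
    """Build an authority -> name table from interchangeable sources.

    master_postcodes: dict of authority id -> one sample postcode
    sources: list of callables taking a postcode and returning the
             authority name, or None if they cannot answer
    Returns (table, missing), where missing lists the authority ids
    that no source could resolve.
    """
    table = {}
    missing = []
    for authority, postcode in master_postcodes.items():
        for source in sources:
            try:
                answer = source(postcode)
            except Exception:
                # A source that is down or blocking us is just skipped.
                answer = None
            if answer:
                table[authority] = answer
                break
        else:
            # No source answered: only a comprehensive master list
            # lets us notice this gap at all.
            missing.append(authority)
    return table, missing


# Stand-ins for two lookup websites (made-up data).
site_a = {"ZZ1 1AA": "Example City"}.get
site_b = {"ZZ1 1AA": "Example City Council",
          "ZZ2 2BB": "Sample County Council"}.get

master = {
    "example-city": "ZZ1 1AA",
    "sample-county": "ZZ2 2BB",
    "nowhere": "ZZ9 9ZZ",
}

table, missing = build_lookup(master, [site_a, site_b])
```

With the master list in hand, dropping `site_a` from the list (because it has taken pages down, say) changes nothing else in the code. Without it, there would be no way to know that “nowhere” had been missed.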

Why does the local search currently not work in places? Because the site we scrape (which we were able to get a definitive list of authority names from) has taken down some authorities while they correct the data they show. Unfortunately, until they do, we stay broken in those areas, and there’s not a thing we can do about it.

That’s the biggest threat to www.Directionlessgov.com – well-meaning, justifiable, simple, and deadly.

posted: 22 May 2006