Regional TheGovernmentSays.com

The thing in TGS that I’ve been working on for the last week or so is http://travel.thegovernmentsays.com – travel advice.

The FCO does a wonderful job of updating its travel advice for every country in the world on a regular (daily in some cases) basis. Unfortunately, they provide only very limited mechanisms for pushing updates. Checking the website is great when you want to know what the advice is. When it changes, sometimes you want to know right now without constantly hitting refresh (which would really interfere with your holiday).

As a result, if you request an email alert for travel advice, the site will send you an alert whenever we see the advice has changed, which is possibly a number of times a day. The FCO website is checked for updates every 20 minutes.
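
A rough sketch of how "has the advice changed?" can be decided on each 20 minute pass. This is not the code TGS runs: the function names and the idea of storing one fingerprint per country are assumptions for illustration, and fetching the pages and sending the emails are left out.

```python
import hashlib


def fingerprint(page_text: str) -> str:
    """Return a stable digest of the page text, ignoring whitespace-only differences."""
    normalised = " ".join(page_text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def changed_countries(previous: dict[str, str], pages: dict[str, str]) -> list[str]:
    """Compare freshly fetched advice against the fingerprints seen last time.

    `previous` maps country -> last seen fingerprint and is updated in place.
    A country seen for the first time is stored but not reported as changed.
    """
    changed = []
    for country, text in sorted(pages.items()):
        digest = fingerprint(text)
        if country in previous and previous[country] != digest:
            changed.append(country)
        previous[country] = digest
    return changed
```

Each country returned by `changed_countries` would then trigger the alerts for everyone subscribed to it.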

I’ve also put live the regional versions of TGS. See the top of www.thegovernmentsays.com for details. The regional (and other nations) coverage is a mixture of localised versions of national press releases, and also releases which only affect a localised area.

Have fun.

posted: 20 Aug 2006

TGS and MPTables updates

After spending most of the morning fighting with the remaining new extensions to TGS, they’re finally done. Now, we just need our design guru to send me the changes in a way which doesn’t suck. You can always tell the bits of design that I do, and the bits that someone else does – my bits look ugly.

Once that’s done, I’ll probably throw open a competition to do the designs for the logo at the top of the regional sites. At the moment, it’s just the bog standard TGS logo, but there should be scope for doing fun stuff – adding a dragon to Wales; a bottle of whisky to Scotland etc.

I also found out where all the TGS traffic of the last few days has come from. It was mentioned in a sidebar in last Sunday’s Observer newspaper, and a large range of organisations have picked up on it. I especially like their comment

Something you can play with now is an addition to mptables.com which I thought would take ages but ended up taking about 15 minutes. MPTables includes a (small) number of data sources, but it would be good for people to be able to include their own data on constituencies and match it against ours. Especially for testing purposes before offering it for inclusion, but more generally as well. Now you can.

posted: 16 Aug 2006

So I’ve taken a few days off my day job to spend some time cracking on with various projects I’ve been thinking or talking about for the last few months. The first thing to do was to update the config files which control all of the TGS feeds. Most of these are automated, but each config file requires the URL of the relevant department adding. Many of them are easy to guess, most are the top hit on google, but just enough aren’t that the failure rate of automation was too high. So it got stuck for a while until after the move to the mysociety servers.

One of the advantages of that move, besides the immediate results in terms of increased content, has been the increased volume of traffic that we’ve been getting.

Some of those are now available if you know where to look, some are being finalised and tested in that window just there --> (assuming the FCO website stays up long enough for the script to ever finish).

Anyway, will hopefully be a busy few days.

Should be fun.

posted: 16 Aug 2006

News boxes.

Another project I spent some time on, about a year ago now, with significant amounts of help from Richard Pope, is some way of easily putting democracy news onto webpages. Think googleads, but for local democracy information.

Behind the scenes, it again uses the TheGovernmentSays.com model, of pulling in RSS feeds and then spitting them out of the database. Instead of aiming at big webpages and sites, it is targeted at small fragments of html. You visit the site, enter your postcode, pick which information you want included (based on your postcode, it tells you who your MP is and localises what you see), then you get given a fragment of html to paste into your webpages. That’s it.
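
To give a flavour of the "spitting out" half, here is a minimal sketch of turning stored items into a small block of html. The function name, the `newsbox` class and the (title, link) item shape are all made up for the example; the real news boxes may build their output quite differently.

```python
from html import escape


def render_fragment(mp_name: str, items: list[tuple[str, str]], limit: int = 5) -> str:
    """Build a small HTML list from (title, link) news items for one area."""
    rows = [
        f'<li><a href="{escape(link, quote=True)}">{escape(title)}</a></li>'
        for title, link in items[:limit]  # keep the box small
    ]
    heading = f"<h4>News from {escape(mp_name)}</h4>"
    return f'<div class="newsbox">{heading}<ul>{"".join(rows)}</ul></div>'
```

Escaping the titles and links matters here because the output ends up inside someone else's page.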

Whenever someone looks at your website after inclusion of the fragment, they get the latest democracy news for your area, what your mp has been doing, saying, and what pledgebank pledges are happening in your area. We hope to add more sources of information over time, but this is what we’re starting with (as the information is easily accessible).

While this implementation is focussed on democracy, there is very little which is tied to democracy. It could be reused for any area of interest where there are multiple feeds of interest which need aggregation, in a simple fashion.

Over time, an enhancement could be to, where a browser has an override cookie, show information local to the browser itself, rather than to the website owner. But that’s going to be left as an exercise for the reader.

posted: 26 Jun 2006

mySociety panopticon / rss aggregator

Something that I’ve recently built, but isn’t quite live (at time of writing) is the mysociety panopticon – which is just a web based rss aggregator with a fancy name. See http://panopticon.mysociety.org/

mySociety and its sites get a large number of mentions in many places around the web. The panopticon uses google blogs and other search to pick up where they get mentioned and stores them all. The new panopticon (replacing one based on drupal by John Handelaar) is in the panopticon directory of mysociety cvs, and should be pretty easy to reuse if you wish to do so (see the README in the cvs directory for the simple instructions).

It’s heavily based on the code used for displaying thegovernmentsays.com, extended to cope with topics (e.g. pledgebank, theyworkforyou) which entries are tagged with based on the feed that they come in through. Multiple feeds can use the same tag to put entries into categories.
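
The tagging idea is simple enough to sketch in a few lines. The feed URLs below are placeholders and the in-memory dict stands in for the database; this illustrates the approach rather than the code in cvs.

```python
from collections import defaultdict

# Hypothetical feed URLs; several feeds may share one topic tag.
FEED_TAGS = {
    "http://example.org/search?q=pledgebank": "pledgebank",
    "http://example.org/blogs?q=pledgebank": "pledgebank",
    "http://example.org/search?q=theyworkforyou": "theyworkforyou",
}


def group_by_topic(entries):
    """Group (feed_url, title) pairs under the tag of the feed they arrived through."""
    topics = defaultdict(list)
    for feed_url, title in entries:
        tag = FEED_TAGS.get(feed_url)
        if tag is not None:  # entries from unknown feeds are skipped
            topics[tag].append(title)
    return dict(topics)
```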

posted: 26 Jun 2006

TheGovernmentSays.com server updates

After being hosted on the end of my ADSL line for the last few months, TheGovernmentSays.com has moved to a large shiny new server as part of the “mysociety friends and family” projects.

Various other projects will land there shortly, but it does mean that TGS now has reliable hosting. Apologies to those who have been affected by the problems caused by me moving house and the fan falling off the cpu of the box that previously hosted it.

One of the benefits of this is that we can much more easily scale up the processing. A single fetch/process/update cycle for TGS used to be about 30 minutes. And since we rerun the process every hour early in the mornings, that was about the limit.

You may have noticed from the vast numbers of posts today that I’ve added in all of the missing major departments (about 30 new feeds were added today). I have another 40 or 50 to add when the next major feature arrives – which is versions of TGS for each of the regions (and Wales, Scotland and Northern Ireland). Various bits of stuff will be added over the next few weeks and months, but hopefully it’ll be all done by the end of the summer.

posted: 26 Jun 2006

Comments on threats to DirectionlessGov

William Heath recently talked about what would happen if Google killed Directionlessgov (here).
I doubt they would – we don’t make money, just a mockery. That said, there are threats to Directionless, ones which are far more deadly than Google could be.

The part of Google we use is their search engine, and when I talk about Google here, that’s the only bit I mean – www.google.com/search. Search is big business – Google (GOOG) alone is worth about $112 billion. Even so, if Google suddenly didn’t exist for any reason, we would just use a different search engine. It may suck more, but we’d be OK. The reason they no longer exist for us may be that they don’t want us to use their site, maybe they make changes which are incompatible with what we do, or maybe the ISS crashed into them. Search is a commodity. As with many of the sites on the internet, if you can’t use them for your purpose, whether they exist is irrelevant.

Much of what Directionless, and other sites such as http://www.TheyWorkForYou.com, http://www.PublicWhip.org, and overseas equivalents such as http://www.GovTrack.US, do, is scraping. We read the html page that Parliament (or wherever) serves out for browsers, and interpret it and pull out the content using automated programs. Those programs are, and have to be, quite conservative in what they understand – as accurate representation is critical. When those page structures change, our sites break until we catch up. While change is annoying, progress is a good thing (“progress” backwards, however, is just annoying).
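
As an illustration of what "conservative" means in practice, the sketch below only accepts one exact piece of markup and raises an error rather than guessing when it isn't there. The `<p class="speech">` layout is invented for the example and is not how Parliament's pages are actually structured.

```python
from html.parser import HTMLParser


class SpeechExtractor(HTMLParser):
    """Collect the text of every <p class="speech"> element (a hypothetical layout)."""

    def __init__(self):
        super().__init__()
        self.speeches = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "p" and dict(attrs).get("class") == "speech":
            self._inside = True
            self.speeches.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.speeches[-1] += data


def extract_speeches(html_text):
    """Return speech texts, refusing to guess when the expected markup is absent."""
    parser = SpeechExtractor()
    parser.feed(html_text)
    parser.close()
    if not parser.speeches:
        raise ValueError("page structure not recognised; scraper needs updating")
    return [s.strip() for s in parser.speeches]
```

Failing loudly is the point: a scraper that quietly returns nothing, or the wrong thing, is worse than one that stops until somebody fixes it.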

There is no free, reliable, source of the postcode to Local Authority (or Constituency) lookup table – it costs a very large amount of money. With a budget of 3 tea bags, some milk and a chocolate biscuit, Local Directionless needs to know which Authority covers your postcode so that we can search the relevant website. We scrape this information from another site. However, when the site we scrape changes, stops working or moves to block us, we stop working until we fix it. We can adapt, but we are on the defensive in an arms race, and will only win because of more motivation and supplies of tea.

The biggest threat to directionless, and all similar civic sites, is not that one commodity part stops working until we switch (Google to Yahoo for example), it’s that one vital part prevents us from working. For some projects, it’s mapping data (unavailable, rather than restricted), for some, it’s the license of Hansard and potential problems caused by legal threats, in others, it’s what happens when a partially open site takes offline part of the site we use, and where we need up to date lookups.

What took most of the time of the local search was not the implementation, but creating the lookup table between Local Authorities/Councils (and how they were titled: “Cambridge” is not the same as “Cambridge City” or “Cambridge County”). That’s the real core of Directionless Local. And there’s no easy way for us to get that list from any other public site.

That lookup isn’t commoditised. Yet. We could protect it and commoditise it by creating a list of postcodes, one postcode per authority and constituency (and possibly even ward?), and make it publicly available. There are many websites which allow you to do a lookup, and definitive lookup tables could be created, by anyone, from any of them. But only if there is a master list which we know is comprehensive. Then Directionless, and any other site, could swap in any site we wanted to get the information from. Some of these jobs aren’t sexy, but need doing.
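
A minimal sketch of how such a master list could be used to vet any candidate lookup site before swapping it in. No such list exists yet, so the `master` mapping and the `lookup` callable are both assumptions.

```python
def check_source(master, lookup):
    """Return the authorities for which `lookup` fails to give the expected answer.

    `master` maps authority name -> one sample postcode known to lie in it.
    `lookup` is any callable taking a postcode and returning an authority name,
    or None when it has no answer.
    """
    return sorted(
        authority
        for authority, postcode in master.items()
        if lookup(postcode) != authority
    )
```

An empty result means the candidate site covers every authority on the list; anything else is the list of places where switching to it would leave us broken.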

Why does the local search currently not work in places? Because the site we scrape (which we were able to get a definitive list of authority names from) has taken down some authorities while they correct the data they show. Unfortunately, until they do, we stay broken in those areas, and there’s not a thing we can do about it.

That’s the biggest threat to www.Directionlessgov.com – well meaning, justifiable, simple, and deadly.

posted: 22 May 2006

OpenBSD RSS feeds

Not really anything to do with democracy, but more to do with actually being useful for me, I’ve created some RSS feeds for commits to the openbsd stable branches. For details, see http://flirble.disruptiveproactivity.com/rss/

posted: 11 May 2006

Directionlessgov.com/local

Much has recently been made of direct.gov.uk’s local searching
capabilities, putting you straight to the page you want (via a
handful of interstitial pages).

In the best directionlessgov.com traditions, here’s something better:
http://www.directionlessgov.com/local/.

Direct.gov.uk requires you to pick the category that your thing fits
into (most closely) from their list, and then click through several
pages to reach what you’re actually looking for.

Previously, we’ve done a direct comparison of google and
Direct.gov.uk, but, given the extremely strict categorisation of
direct.gov.uk (and the extreme lack of strictness or categories in
what people actually look for), it’s not worth the
many days it would take (against the day or so it’s taken so
far). So you get national direct.gov.uk results, and local
google results.

We map your postcode onto your local authority, and then use this list of councils and websites (which took a couple of days to
create for this purpose) to search just your council(s).
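
The second step can be sketched as building a search query restricted to the council websites using Google's `site:` operator. The council names and hostnames below are placeholders rather than entries from the real list, and the live site may call Google in a different way.

```python
from urllib.parse import urlencode

# Hypothetical excerpt of the council list: authority name -> website host(s).
COUNCIL_SITES = {
    "Example City Council": ["www.examplecity.gov.uk"],
    "Example County Council": ["www.examplecounty.gov.uk"],
}


def local_search_url(term, authorities):
    """Build a Google search URL limited to the given authorities' websites."""
    hosts = [host for name in authorities for host in COUNCIL_SITES[name]]
    restriction = " OR ".join(f"site:{host}" for host in hosts)
    return "https://www.google.com/search?" + urlencode({"q": f"{term} {restriction}"})
```

Where a postcode is covered by both a district and a county council, passing both names searches both sites in one query.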

So, there you have it: Directionlessgov.com/local.

posted: 12 Apr 2006

Direct Gov – but directly to where?

[Note: most of this was originally written in summer 2005, with some minor tweaks and published in February 2006]

Work done by Government is generally one of 3 things; doing
work people don’t see, doing work people see, or moving
something from one category to the other. Search engines and
portal tools are often used to help citizens do what they
can’t find. Vast quantities of money, goodwill and time are
wasted because of poor re-implementations of a moderately
hard problem.

In a direct comparison early in 2005, 75% of users, when
shown results of search queries side by side, selected a
result from Google (limited to .gov.uk domains) over the
specialised £4.4m portal direct.gov.uk[2].

Users submitted a term to a webpage, and were shown, side by
side, the results of their search in the direct.gov.uk
engine and in google.

When the term was submitted, a request was sent to the
direct.gov.uk search engine, and the resulting HTML parsed
and the results extracted and displayed. Google’s knowledge
of .gov.uk sites was then searched[1] and similarly displayed.

The results were shown in the order that the search engine returned them,
and the default number of results for each engine shown side
by side (20 results for direct.gov.uk and 10 for Google). When a
user clicked on a page they wanted to look at, they
were first taken to a script which recorded their search term,
preferred url and which search engine returned that link, before bouncing
them to where they wanted to go. Users who opened multiple links for the
same search term (for whatever reason) generated multiple records.
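
The recording step and the percentages quoted in this post can be sketched roughly as follows. The in-memory list stands in for whatever storage the original script used, and the function names are invented for illustration.

```python
from collections import Counter

click_log = []  # in-memory stand-in for the real storage


def record_click(term, url, engine):
    """Store one click and return the URL the user should be bounced to."""
    click_log.append({"term": term, "url": url, "engine": engine})
    return url


def engine_share(log):
    """Return each engine's percentage of recorded clicks, rounded to whole numbers."""
    counts = Counter(row["engine"] for row in log)
    total = sum(counts.values())
    return {engine: round(100 * n / total) for engine, n in counts.items()}
```

Because every click is its own record, a user who opens several links for one search term counts several times, exactly as described above.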

Initially, we displayed google results on the left, and
Direct.gov.uk results on the right. After a month of usage,
these were switched to look into whether the earlier display
of google (on the left) was disadvantaging direct.gov.uk.
When the direct.gov.uk results were shown first, 28% of
people selected a direct.gov.uk result, reflecting a small (3%)
decrease in the percentage who clicked a google result when it
was shown second rather than first.

People generally searched using “keywords”, rather than a
sentence or phrase. Where longer search terms were submitted
(i.e. four words or more), these were generally for very
specific searches (e.g. “national spatial address
infrastructure”, or “freedom of information request
exemption”), rather than English phrasing around what was
wanted (e.g. “how much council tax is band e in selsey”).

Direct.gov.uk “improved” searching

In the summer of 2005, partially as a result of criticism,
direct.gov.uk loudly announced a new, improved search engine
in August. Taking only results from September 2005 onwards,
and repeating the above comparison, the newer engine made
no difference – direct.gov.uk was still selected 28% of
the time (we still showed direct.gov.uk results first).

Footnotes

1. used the Google API which returns identical results to www.google.com but in a more computer readable format

2. These stats were generated in a 3 month period in 2005.

posted: 12 Apr 2006