Thursday, May 31, 2007

Hot or Not? Heat theme maps

Around the web these days you see lots of "heat theme" maps, which pull out the "hot" keywords from blocks of text (mainly blogs, but it can be anything: example). They take a block of text and find the words used most often in it. Most of the time the text comes from a specific database, but it doesn't have to: http://services.disruptiveproactivity.com/heattheme/
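
For anyone wondering what "find the words used most" looks like in practice, here is a minimal sketch in Python. It is not the panopticon code; the stop-word list, the function names and the font-size range are all made up for illustration.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; a real service would need a much longer one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "on", "for"}


def hot_words(text, top_n=5):
    """Return the top_n most frequent non-stop-words in text, with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


def font_size(count, max_count, smallest=10, largest=32):
    """Scale a word count linearly to a font size (assumed range, in points)."""
    return smallest + (largest - smallest) * count // max_count


if __name__ == "__main__":
    sample = "The petition data and the petition site: data for the petition"
    top = hot_words(sample, top_n=3)
    biggest = top[0][1]
    for word, count in top:
        print(word, count, font_size(count, biggest))
```

The "heat" is then just a matter of drawing the more frequent words bigger or in a warmer colour.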

This sort of visualisation of words is also useful in other areas, such as analysing the output from questionnaires and other responses, or large reports where you want to see what's talked about lots and what isn't. So I took the code from the mySociety panopticon (which tells us what people are saying about our sites) and attached its input to an HTML form rather than the database. That lets you put a load of text into a box and get a diagram of what's mentioned lots.

But we already have a service designed to make commenting on large reports easier, and which makes a text version available for doing things with. So we also feed the output of the text version into the input of the heat theme form, and find what the Transformational Government document spends most of the time talking about: like this.


Thanks to William for his help in testing it - and for giving me the idea in the first place.

Have fun.

Tuesday, May 29, 2007

Drinks and democracy in Manchester

Saturday June 23rd is the day before the Labour Leadership Conference starts in Manchester.
Come and join mySociety and friends for beer and conversation.

It'll be at the Briton's Protection from 7:30 till around 10ish. I'll be the guy in the mySociety hoodie, probably in one of the two back rooms, or maybe the beer garden if it's nice.

Sunday, May 13, 2007

Petition the PM and TheGovernmentSays.com

One of TheGovernmentSays.com's features is that it allows people to comment on any story. Generally, there aren't that many comments, but the vast majority of those we've seen in the last month or two have been on petitions, mainly from people who found the petition by googling it and then commented on it. I've had the idea for a while of attaching the RSS feed for new petitions to a bulletin board system, creating a new thread for each new petition and seeing what happens.
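
The feed-to-threads idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the sample feed, its URLs and the shape of a "thread" are all invented, and a real bulletin board would have its own way of creating threads.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed standing in for the real new-petitions feed.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>New petitions</title>
<item><title>Example petition one</title><link>http://example.org/one/</link></item>
<item><title>Example petition two</title><link>http://example.org/two/</link></item>
</channel></rss>"""


def new_threads(feed_xml, seen_links):
    """Return one thread per feed item whose link is not already in seen_links.

    seen_links is updated in place, so running the same feed twice
    does not create duplicate threads.
    """
    threads = []
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        title = item.findtext("title", default="").strip()
        if link and link not in seen_links:
            seen_links.add(link)
            # Hypothetical thread structure: a title, a link back, and no comments yet.
            threads.append({"title": title, "link": link, "comments": []})
    return threads


if __name__ == "__main__":
    seen = set()
    for thread in new_threads(SAMPLE_FEED, seen):
        print("New thread:", thread["title"], thread["link"])
```

Keeping track of which links have been seen is what makes it safe to poll the feed repeatedly.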

There is clearly demand for people to comment on petitions. Some petitions clearly provoke more comments than others. Government involvement in anything has impacts, the issues are never clear cut, and some issues are highly emotional.

There is a large question over whether the Government itself could host such comments, legally, politically and practically, although the debate around issues should be hosted and engaged with. Independence of operations may be useful.

Debate is essential, and a large amount of it may get quite heated and passionate. Some of it will be criticism, some of it will be right, some of it will be trivial, and some of it will be substantive. All of which must be allowed.

Sunday, May 06, 2007

Making Government statistics more accessible

There are often conversations about what Government organisations could do to make their data more available. This data is already made available in multiple forms; most websites have a version covering the latest fad when it was being designed (the varying "mobile" access has been a constant for the last few years, although the actual methods have varied - think WAP).

What would happen if a site was designed to be reused, as well as used?

It would probably involve outputting XML or CSV or similar, rather than HTML, but the content may be substantially similar. Other projects go from HTML to XML...
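
To show how little extra work that could be, here is a rough Python sketch that renders the same records once as HTML for people and once as CSV for reuse. The area names and figures are invented, and the function names are mine.

```python
import csv
import io
from html import escape

# Invented example records; not real statistics.
ROWS = [
    {"area": "Area A", "population": 1200},
    {"area": "Area B", "population": 850},
]


def to_html(rows):
    """Render the rows as an HTML table, for people to read."""
    fields = list(rows[0])
    head = "".join(f"<th>{escape(f)}</th>" for f in fields)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(r[f]))}</td>" for f in fields) + "</tr>"
        for r in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


def to_csv(rows):
    """Render the same rows as CSV, for programs to reuse."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_html(ROWS))
    print(to_csv(ROWS))
```

Both outputs come from the same data, so the reusable version stays in step with the one people see.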

One project I've recently been working on involves reusing data from Neighbourhood Statistics.

Neighbourhood Statistics has many constituent parts, but its main aim is for you to give it an area, and for it to tell you statistics about that area. It can do that in multiple different ways, as data, as descriptions and with pretty maps, but the core is the data. All easily accessible in nice friendly ways. In fact, NeSS is a good example of what most sites should do about making data accessible to people.

For my purposes, though, I didn't want to type 140,000 postcodes into NeSS. I'm lazy and that type of work is what we should get the computer to do. The data is well formed - we can just scrape it.

While that worked as a one-off (results will be available soon), what about something for next time?

NeSSY is designed to make it easier to make repeated requests for specific information (or find out what is available for an area) from Neighbourhood Statistics. It operates the same way as the NeSS web interface does, but you can tell it which links you'd like it to follow to get the data you want.
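
The "tell it which links to follow" idea can be sketched like this in Python. To be clear, this is not the NeSSY code: the pages are a made-up in-memory site rather than NeSS itself, and the link names, paths and function names are all hypothetical.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collect a mapping of link text to href for every <a href> on a page."""

    def __init__(self):
        super().__init__()
        self.links = {}
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links["".join(self._text).strip()] = self._href
            self._href = None


def follow(pages, start, link_texts):
    """Follow the named links in order and return the HTML of the final page."""
    url = start
    for text in link_texts:
        finder = LinkFinder()
        finder.feed(pages[url])
        url = finder.links[text]  # KeyError if the page has no such link
    return pages[url]


# A hypothetical three-page site; a real run would fetch each page over HTTP.
PAGES = {
    "/": '<a href="/area">Choose area</a> <a href="/help">Help</a>',
    "/area": '<a href="/area/summary">Population summary</a>',
    "/area/summary": "<table><tr><td>Population</td><td>1200</td></tr></table>",
}

if __name__ == "__main__":
    print(follow(PAGES, "/", ["Choose area", "Population summary"]))
```

Naming links by their text rather than their URL means the request reads the same way a person would click through the site.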

Retrofitting this has significant downsides, as significant amounts of context are lost in the conversion from the database into HTML, and putting it back requires both vast amounts of care (getting it wrong is bad) and more knowledge and time than I've given it so far (about 8 hours of work). There may be areas that it doesn't cope with at all.

What it does show is that there are relatively simple and fast ways of redisplaying information that Government could add to its systems to make reuse far easier.

Multiple choice games could be a "fun" way to see how much people know about their local areas... The possibilities of data reuse are endless. If anyone is interested in working on further development, the code is here, and please drop me an email to let me know what you're working on.