Saturday, April 11, 2009

CrowdSourcing and human rights

There's been an ongoing discussion this week over at Paul Currion's blog about the benefit of crowdsourcing information in the human rights sphere, most of it criticising probably crazy ideas.

Disaster first responders will need customised tools, and if they find something that will save them time, then it will get adopted. But that's always going to be harder than giving them a wiki and expecting them to solve everything, or possibly, anything. For Paul's work, the solutions for first responders won't be the same as for everyone else - the environment is likely too different, as there needs to be some co-ordination, communication and planning. But I wonder, what is there that first responders do that they don't have time to look at? I suspect there is a huge difference between natural disasters and civil unrest (of which the G20 demonstrations could be considered a very tame example). Either way, Paul's interest is not about crowdsourcing what human rights groups can already do, it's about helping them do what they can't.

Paul's argument is that crowdsourcing won't help that much in the areas in which he deals, as first responders in an incident are overwhelmed, busy and don't have time to edit wikis or do much that doesn't give direct benefit. That's completely true. By their very nature, first responders are there when few others are, when the infrastructure is questionable (in that you may have no idea yet what's left) and generally the focus is on saving lives; they're there when there is no crowd. At that point, the only thing that makes those people's jobs better is better access to more useful information. And that's not going to be a normal wiki. What it might be (and this is also unworkable) is, say, an iPhone app that lets them report things with a click ("3 bodies by side of road", "house burnt out"), which get GPS co-ordinates and a timestamp attached, and are uploaded over a text message or later when there's a connection, depending on what they know. This has the downside that the battery runs out somewhere they probably can't recharge it easily. These problems are very real, nuanced and hard, but the potential is there for some novel solutions that help responders help more people faster. That's probably not crowdsourcing, which is a solution to different problems. But if we step away from real time and move to an issue-based process, there are wider uses which aren't limited to first responders but are open to anyone with a cell phone, such as the G20 police ID matching. That's where crowdsourcing comes into its own.
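
The "click now, upload when there's a connection" idea is a store-and-forward queue. Here is a minimal sketch in Python, not a real app: the names `Report`, `ReportQueue` and the `send` callback are all made up for illustration, and the text-message format is an assumption.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Report:
    """One click in the hypothetical app: a canned observation plus where and when."""
    kind: str          # e.g. "house burnt out"
    lat: float
    lon: float
    timestamp: float   # seconds since the epoch

    def to_sms(self) -> str:
        # Assumed compact format, short enough for a single text message.
        return f"{self.kind}|{self.lat:.5f}|{self.lon:.5f}|{int(self.timestamp)}"


class ReportQueue:
    """Keeps reports on the device and sends them once a connection exists."""

    def __init__(self, send: Callable[[str], bool]):
        # `send` returns True if the message got through, False otherwise.
        self._send = send
        self._pending: List[Report] = []

    def record(self, kind: str, lat: float, lon: float,
               timestamp: Optional[float] = None) -> Report:
        # Timestamp at the moment of the click, not at upload time.
        report = Report(kind, lat, lon, time.time() if timestamp is None else timestamp)
        self._pending.append(report)
        return report

    def pending(self) -> int:
        return len(self._pending)

    def flush(self) -> int:
        """Try to send everything; keep whatever fails for the next attempt."""
        still_pending: List[Report] = []
        sent = 0
        for report in self._pending:
            if self._send(report.to_sms()):
                sent += 1
            else:
                still_pending.append(report)
        self._pending = still_pending
        return sent
```

The point of the design is that recording never depends on the network: `record` always succeeds locally, and `flush` can be retried as often as the battery allows.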


I spent a few days last week in London, and was around some of the G20 protests with some friends. Those protests were in central London - one of the biggest cities in one of the richest countries in the world, with the most advanced infrastructure and resources, and everything was heavily covered by cameras from the press, the police and individuals. Mostly individuals, who then uploaded their photos to flickr or elsewhere and tagged them appropriately.

I doubt anyone in the UK hasn't seen the pictures of Ian Tomlinson being pushed to the ground shortly before he died, with some police in the background, none of whom have ID numbers visible in the footage (possibly due to active removal, but probably due to camera angles), or today's new footage. All we have is their faces, in a variety of shots, from a variety of angles, from different places, at different times.

Someone had the great idea of crowdsourcing the deduction of the identification numbers of those PCs. You know what those present look like from the video; you can then get better photos from before/after from flickr, and therefore have high quality photos. Even if you can't get the ID numbers from that, you then have a decent photo, and can start to look through the huge number of other photos of police on the day to find them (the dog handlers being particularly easy). It's a massively manual task that will take huge amounts of time and people, but time is something that the project has at this point. And the task is massively parallelisable with no communication overhead until you find a match - or, as someone said on Twitter, Tomlinson's Law: Given enough eyeballs, all thugs are shallow.
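
"Parallelisable with no communication overhead until you find a match" can be sketched as handing each volunteer their own batch of photos and only merging results at the end. This is an illustration in Python under assumed names (`assign_batches`, `merge_matches`, and the photo and officer labels are all hypothetical), not how the actual effort was organised.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def assign_batches(photo_ids: Sequence[str],
                   volunteers: Sequence[str]) -> Dict[str, List[str]]:
    """Deal photos out round-robin so nobody needs to talk to anyone else."""
    if not volunteers:
        raise ValueError("need at least one volunteer")
    batches: Dict[str, List[str]] = {v: [] for v in volunteers}
    for i, photo in enumerate(photo_ids):
        batches[volunteers[i % len(volunteers)]].append(photo)
    return batches


def merge_matches(reports: Iterable[Iterable[Tuple[str, str]]]) -> Dict[str, List[str]]:
    """Combine (photo_id, officer_label) pairs from every volunteer.

    This is the only step where results from different people meet.
    """
    merged: Dict[str, List[str]] = {}
    for report in reports:
        for photo_id, officer in report:
            merged.setdefault(officer, []).append(photo_id)
    return {officer: sorted(photos) for officer, photos in merged.items()}
```

Each volunteer works through their own list at their own pace; only the matches, which are rare, ever need to be shared.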


Saturday, February 28, 2009

Irrepressible.info for video

Irrepressible.info (take a look if you've not seen it, otherwise this won't make sense) is a site which republishes information that someone wants to try and hide - items on democracy which are banned inside China, for example.

We talked about this very briefly at the convention.

Doing this for video (and PDF etc. - it's all regular expressions) is insanely easy, if there are people willing to help mirror stuff (and there will be).

Firstly, write a piece of JavaScript which blog authors can embed in their blogs, which will push all the YouTube, Vimeo, MP3 etc. URLs up to a server which publishes the full list. Then you have many clients all hitting the list at random and mirroring all/some of the material to their own private, local copies.
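
As a rough sketch of the two halves (pull media URLs out of a page, then have each mirror pick from the published list at random), here is the idea in Python rather than the JavaScript embed described above. The pattern, the function names and the list of recognised hosts and file types are all assumptions for illustration; a real version would need a much longer list.

```python
import random
import re
from typing import Iterable, List, Optional

# Assumed pattern: YouTube/Vimeo links, or anything ending in .mp3 or .pdf.
MEDIA_URL = re.compile(
    r"""https?://(?:(?:www\.)?(?:youtube\.com|vimeo\.com)/[^\s"'<>]+"""
    r"""|[^\s"'<>]+\.(?:mp3|pdf)\b)""",
    re.IGNORECASE,
)


def extract_media_urls(html: str) -> List[str]:
    """Return the distinct media URLs found in a page, in sorted order."""
    return sorted(set(MEDIA_URL.findall(html)))


def pick_to_mirror(published: Iterable[str], how_many: int,
                   rng: Optional[random.Random] = None) -> List[str]:
    """Each mirror client takes a random sample of the published list."""
    rng = rng or random.Random()
    urls = sorted(set(published))
    return rng.sample(urls, min(how_many, len(urls)))
```

Because every client samples independently, no one has to co-ordinate who mirrors what; with enough clients, everything on the list ends up copied somewhere.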

Because someone has gone to the trouble to issue a takedown request, it must matter to someone. More importantly, as the mirroring is automatic, the blog owner can take the material down without having to worry that it's gone for good. More complex JavaScript could notice that the link is no longer in that page and flag that as something potentially interesting - that way, blog owners wouldn't even need to proactively publish that stuff had been taken down; the entire system would just notice.
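
Noticing that a link has gone is just a comparison between what a page linked to last time and what it links to now. A minimal sketch, again in Python rather than JavaScript, with a made-up `LinkWatcher` class:

```python
from typing import Dict, Iterable, List, Set


class LinkWatcher:
    """Remembers the links last seen on each page and reports the ones that vanish."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, Set[str]] = {}

    def observe(self, page_url: str, links: Iterable[str]) -> List[str]:
        """Record the page's current links; return those present last time but not now."""
        current = set(links)
        previous = self._last_seen.get(page_url, set())
        self._last_seen[page_url] = current
        return sorted(previous - current)
```

The first visit to a page never flags anything, since there is nothing to compare against; anything returned on later visits is a candidate for "potentially interesting".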

This is something I've mentioned before. Anyone want to build it at Rewired State next Saturday?
Drop me an email if so.


Notes on the Convention on Modern Liberty

It was a fantastic day. Congratulations to all those who worked hard to make it so. My slides (which I read the notes from, but didn't show on screen due to cabling issues) are available here (and I'm somewhat surprised by just how many people have looked at them). There'll be video online at some point.

There were various questions asked in the session that could have been answered in more detail. So here are some thoughts.

How many people do you think it takes?



This was the title of my talk, even though it never actually appeared. And an underlying theme of all of the people talking in my session was that it takes one person, but lots of one people. Heather Brooke is fighting the FoI fight, generally on her own, but she's not alone. There are huge numbers of people doing little bits. And small pieces loosely joined is something that can generate a huge system.

Ben Goldacre and Phil Booth also talked a bit about this. I'll probably return to it at some point in future.


What about politicians who are getting buried under the online engagement tools?



Jason Kitcat (a Green Councillor in Brighton) asked about what's happening on the other side of the engagement fence. mySociety and others are busy building better tools to engage and contact representatives, and the representatives are cowering in their bunkers saying "no more" because they don't have anyone building them better tools.

There are solutions to that, but many of them come down to "pay mySociety money" and they'll build such tools for you. With good tools on both sides, this can be a good thing. With inequality comes chaos on one side. Representatives have some motivation to do it, but at the moment they aren't. It's likely, after the next general election, that some of the new cohort who get Twitter/blogging etc. (or RSS-sourced data more generally as an additional filter to email) will look at getting something done. It's not hard, it's just that no one's doing it. It only takes one person to lead, but they'll have to develop it.


Consistency of Data over time? It isn't



One of the questions in the Q&A came from someone who, having seen Hans Rosling's graphs, asked why the government doesn't make data available over time, as they want to be able to produce such graphs via cut and paste. Unfortunately, life's not that simple, as the real world has a habit of changing and screwing stuff up. If you want consistent data for a long number of years, you're probably going to have to understand it and do some work - definitions change over time, especially when you're looking at statistics, and so numbers may not be directly comparable from one year to the next. Heather mentioned how knife crime stats are fiddled to get headlines. But there are legitimate changes - think how the place in which you live has changed over the last 25 years. Brian Eno in his panel talked about long term thinking, and that's vital, but equally, it's hard to do evidence based policy over long time periods. And most importantly, over those time periods, what you choose to care about depends not just on the data you have, but as much on the question you're asking. For some questions you might care about an ethnic group breakdown; some people might just want "us" and "them". There is no right answer.


APIs are great



For people wandering in from elsewhere, I should probably state that APIs are generally good things.

But, that said, there are things that they shouldn't be used for. One of the prime examples is the definitive statistics for the country, which in the UK are called National Statistics (capitalised like so). While they may be produced by the Office for National Statistics (ONS), they are not necessarily produced by the ONS, and can be produced by any department.

The thing about National Statistics is that they are used to create policy, so information over time, the definitive stats, is vital. If you want the National Statistics for the country in 1909, you can get them (of course, they weren't called that back in 1909).

This stuff can't change without significant and careful consideration; and most importantly, it can never ever go away. There is an API for Statistics (data exchange) which does some of this, but it must not become the definitive source. Definitive copies must be replicated and kept, so that they can be referred to. Some National Statistics have a significant discontinuity in/around 1997, because the Tory government that had kept some statistics going since 1979 was replaced by a Labour government that wanted to measure different things (legitimately). There's no reason to assume that won't happen again, and, indeed, it's somewhat likely that a new government will care about different issues and want statistics on what is changing to inform policy.

Many things depend on those statistics, including budgets. If you consider a budget to be the number in the outcome cell of a spreadsheet, that's fine. But budgets determine how many people have jobs. And you don't want those revised automatically as data changes, without some care - that needs somewhat careful planning and monitoring of change over time. Otherwise you may find that your budget suddenly drops as data changes. APIs aren't everything.
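
One way to picture the difference between "whatever the API says today" and a definitive copy is to pin a budget to a specific dated release, so that a later revision changes nothing until someone decides it should. This is a toy sketch in Python: `Release`, `Archive` and `budget`, along with the series name and numbers, are invented for illustration and are not how National Statistics are actually published.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Tuple


@dataclass(frozen=True)
class Release:
    """One published figure. Frozen: once out, it never changes."""
    series: str
    published: date
    value: float


class Archive:
    """Keeps every release ever published; revisions are added, never overwritten."""

    def __init__(self) -> None:
        self._releases: Dict[Tuple[str, date], Release] = {}

    def publish(self, series: str, published: date, value: float) -> Release:
        key = (series, published)
        if key in self._releases:
            raise ValueError("a published release is never overwritten")
        release = Release(series, published, value)
        self._releases[key] = release
        return release

    def get(self, series: str, published: date) -> Release:
        return self._releases[(series, published)]

    def latest(self, series: str) -> Release:
        candidates = [r for r in self._releases.values() if r.series == series]
        if not candidates:
            raise KeyError(series)
        return max(candidates, key=lambda r: r.published)


def budget(release: Release, rate_per_head: float) -> float:
    """A budget worked out from one specific, named release."""
    return release.value * rate_per_head
```

A budget computed from `archive.get(...)` stays put when the figure is revised; one computed from `archive.latest(...)` silently moves with the data, which is exactly the surprise to avoid.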

The real world is not just a messed up place (see here); stats should reflect that it's a sequence of messed up places, changing over time. While you can produce one statistic covering something for ever, the only thing you can be certain about is that it's wrong.



Summary



But thanks to Sunny for organising the Bloggers Summit and inviting me to speak. It was great.
