One of those irregular verbs? Open Research Data

I’m reminded of a survey JISC funded about attitudes of PhD students to lots of things to do with PhDs. It asked them about their opinions on a load of topics, including their views on open access to research. It was about 85% in favour (and there were some who didn’t know enough to answer as they’d only just started).

But that’s not quite what they meant to ask in their rolling longitudinal study. They meant to ask about open access to their own research, at which point the percentage in favour almost inverted.

The #datadebate on open access to Science on Tuesday evening was focussing on open access to research data; but in the same way that one person’s output is another person’s input, it was only looking at one side of the coin.


Baroness O’Neill’s argument that there should be some consideration of the requestor goes against the fundamental notion of FoI that it is applicant blind. Monbiot (badly) made the good point that this would just make the problems around FoI of climate data worse.

But there was no discussion on the panel of how open access to data could help them in their own research, rather than simply being the burden of other people wanting your material. I suspect it would have somewhat illuminated the discussion. For every data request, there is a requestor, and generally requests have reasons.

When I said that health data was likely to be the next thing covered by whatever the PDC became, and that it might go badly, I wasn’t expecting it to be quite so spectacular or quick (I was mostly inferring from direction, not knowledge). In winning one debate on the trading fund data, the community discussion didn’t know that it was conflated in various heads with the personal data of the NHS. Although, in hindsight, I can see how that accident was easy to make from outside, without seeing how the coalition looks at the NHS. Treating the NHS like a really-large-trading-fund is one explanation for some positions. Decision makers with this project history also don’t inspire confidence in the process.

Yesterday, the Oxford Internet Institute held a meeting on the anonymised NHS data. There is legitimate cause for concern: the details aren’t available yet (and probably don’t exist), but in a way that lets No2ID go thermal.

While Professor Ross Anderson is right on the specific case he designs and demolishes, there’s an argument that his concerns can be negated by a process of a different design. It is exceptionally difficult to anonymise a dataset in the abstract without knowing what is in the dataset, or what it can be used for. More flexibility requires more disclosure control, which reduces the quality of the data. For a given set of uses, for a specific level of access, any dataset can be made safe. Whether it can also be made useful is the bit that requires knowledgeable thought.

I doubt Ross or anyone would argue against research being good for society. I have also seen no arguments that say there is no potential benefit to society in this.

The significant and legitimate disagreement is how to do this safely.

There are ways. The various Government Data Labs, HMRC, ONS VML, etc, are one option. Give accredited people access to what they legitimately ask for, under tight conditions. You have to be very specific about what you’re asking for, and it has to be of commensurate benefit, impossible from any other sources, and reviewed.

Fundamentally, not having copies of the data in other places is the best way to avoid losing them.

None of the NHS discussion is about open data. I would also argue that real #opendata has a significant role to play in the transparency of the process. Who is requesting access, to what, and the outcomes they find, are all places where the strengths of opendata come into their own.

Ben Goldacre writes extensively on problems in the Pharma ecosystem, and many of the issues that Ross and others are concerned about have very close equivalents. The nice thing about Ben is that he also offers solutions to the problems he pushes. Many of them are the transparency aims shared by the opendata movement.

For example: requiring publication of all requests for access, as well as open access to the resulting publications (and hence, clear and obvious gaps).

Within the research world, for sensitive data, this is normal.

This data is also effectively impossible to release freely as open data.

On both sides of the debate, both on the NHS and the access to scientific data from Tuesday, as groups talk past each other, an old Yes Minister quote springs to mind:

I use narrow specific exemptions; He uses invalid exemptions; They are Vexatious.

Accepting that (at least in others) is a prerequisite for progress.

Disclosure: I used to deal a (very) little with some of the organisations linked to above as part of my old day job. I don’t now.

Dec 2011