Thursday, April 5, 2012

Open Data: The Science of Relationships

Typically, I find the majority of Wiley's resources useful and relevant, but at first pass, the first video in the Open Data section of the course was neither, for me. As a learner, I'm trying to understand conceptual relationships between all things "open," as described in each section. The TEDx video of Tim Berners-Lee fast-talking about "linked data" and leading the audience in a chant of "Raw Data Now!" left me feeling less informed about open data than before I watched it (The Next Web of Open, Linked Data, TEDx March 13, 2009).

Part of the problem was that I couldn't keep up with him - he's clearly a technical wizard - and his wildly excitable gestures and his intermittent use of tech-savvy terms were distracting at best. After watching this video I wasn't convinced a) that I had a clear understanding of "open data" or b) that it would matter to me at all even if I did understand it.

So I move ahead, reading through the other resources. And lo and behold, what do I find? I find this, from the Wikipedia entry Wiley links to: "The concept of open data is not new; but although the term is currently in frequent use, there are no commonly agreed definitions..." (http://en.wikipedia.org/wiki/Open_data). Oh, goodie! Another completely unformed concept to learn about. I can't say that hasn't been a trend in this particular course of study, so I'm not surprised.

A little puzzled, I went back to the video with Berners-Lee. He said something early in the talk about the differences between data and documents, and that's been helpful - data is something you typically can't use by itself, in isolation. It has relationships to other things, other data. I wanted to revisit that, to see if I could get something more from it. I did.

According to Berners-Lee, documents on the web are usually stand-alone items that you can (if they are open) use, repurpose, and share, all of which can inform you and others. Data has to be related to something else - or put in context, if you will - to be usable and relevant. So I start to think about data the way Berners-Lee recommends: as relationships. He mentions social networking. So I friend someone on Facebook, for example. Okay, well, that's data, he says. That's data about me, about my friend, about the networks I'm in, about the things I'm interested in, and so on. So the connection itself is data, but it's not meaningful unless taken all together: looking at what and who I'm connecting to, and why, gives the world of Facebook some information about me. And that information, when 'linked' with other users' information, is powerful and can provide a comprehensive portrait of who I am online. Interesting! That is an explanation I can relate to.

So the sum is greater than the value of each part. Got it. But with that metaphor we're wandering dangerously close to another subject I find uninteresting, at best, which is math.

So back to relationships. I like data better when I think about it as relationships. So the "linking data" Berners-Lee is talking about is basically like taking little bits of information produced by and about people, and then connecting them to another person's little bits about them, and so on, and so on, until there is a huge network of relationships/data available to, by and for everyone. Neat. Don't get too overwhelmed, now, but that 'network' of relationships, i.e. data, looks something like this:
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net
CC-BY-SA license
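
For anyone who likes to see an idea in miniature, here is a rough sketch in Python of the "data as relationships" picture, at a much smaller scale than the diagram. To be clear, this is my own toy illustration and not anything from the talk: the people, the relationship names ("friend_of", "likes") and the helper functions are all made up. Each fact is a little three-part statement, and the interesting part only shows up once the facts are linked together.

```python
from collections import defaultdict

# Hypothetical facts, each written as (subject, relationship, object).
# The names and relationship labels are invented for illustration only.
triples = [
    ("me", "friend_of", "alex"),
    ("alex", "friend_of", "sam"),
    ("me", "likes", "open education"),
    ("alex", "likes", "open education"),
    ("sam", "likes", "open data"),
]


def build_index(facts):
    """Group facts by subject so each person's links can be looked up."""
    index = defaultdict(list)
    for subject, relationship, obj in facts:
        index[subject].append((relationship, obj))
    return index


def portrait(index, person):
    """Combine a person's own facts with what their friends like."""
    own_likes = {obj for rel, obj in index[person] if rel == "likes"}
    friends = {obj for rel, obj in index[person] if rel == "friend_of"}
    friends_likes = {
        obj
        for friend in friends
        for rel, obj in index[friend]
        if rel == "likes"
    }
    return {
        "friends": sorted(friends),
        "likes": sorted(own_likes),
        "shared_with_friends": sorted(own_likes & friends_likes),
    }


index = build_index(triples)
print(portrait(index, "me"))
```

No single fact in that list says much by itself. The "shared_with_friends" entry only exists because two separate facts were connected, which is the sum-greater-than-the-parts idea in a very small box.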


There are other resources from Wiley that paint a descriptive picture of what open data might be, and how it might be relevant or useful. The United States government is working with open data and providing it to anyone in the world, via its website, data.gov. This is a cool project, I think, because aside from data that could compromise or influence issues of national security, the government puts information about projects, initiatives, and all sorts of other things being paid for by tax dollars out on the web for anyone to see and use and reuse. That kind of transparency is arguably one ingredient of a successful democratic process. That is a pretty big implication about open data and the value it can provide. And from what I can tell, it's one of many more yet to be discovered.