Showing posts with label open science. Show all posts

Thursday, April 5, 2012

Open Data: The Science of Relationships

Typically, I find the majority of Wiley's resources useful and relevant, but at first pass, the first video in the Open Data section of the course was neither, for me. As a learner, I'm trying to understand conceptual relationships between all things "open," as described in each section. The TEDx video of Tim Berners-Lee fast-talking about "linked data" and leading the audience in a chant of "Raw Data Now!" really had me feeling less informed about open data, after watching it (The Next Web of Open, Linked Data, TEDx March 13, 2009).

Part of the problem was that I couldn't keep up with him - he's clearly a technical wizard - and his wildly excitable gestures and his intermittent use of tech-savvy terms were distracting at best. After watching this video I wasn't convinced a) that I had a clear understanding of "open data" or b) that it would matter to me at all even if I did understand it.

So I move ahead, reading through the other resources. And lo and behold, what do I find? I find this, from the Wikipedia entry Wiley links to: "The concept of open data is not new; but although the term is currently in frequent use, there are no commonly agreed definitions..." (http://en.wikipedia.org/wiki/Open_data). Oh, goodie! Another completely unformed concept to learn about. I can't say that hasn't been a trend in this particular course of study, so I'm not surprised.

A little puzzled, I went back to the video with Berners-Lee. He said something early in the talk about the differences between data and documents, and that's been helpful - data is something you typically can't use by itself, in isolation. It has relationships to other things, other data. I wanted to revisit that, to see if I could get something more from it. I did.

According to Berners-Lee, documents on the web are usually stand-alone items that you can (if they are open) use, repurpose, and share, all of which can inform you and others. Data has to be related to something else - or put in context, if you will - to be usable and relevant. So I start to think about data the way Berners-Lee recommends: as relationships. He mentions social networking. So I friend someone on Facebook, for example. Okay, well, that's data, he says. That's data about me, about my friend, about the networks I'm in, about the things I'm interested in, and so on. So the connection itself is data, but it's not meaningful unless taken all together: looking at what and who I'm connecting to, and why, gives the world of Facebook some information about me. And that information, when 'linked' with other users' information, is powerful and can provide a comprehensive portrait of who I am online. Interesting! That is an explanation I can relate to.
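For anyone who likes to see an idea written down, here is a tiny sketch of "data as relationships." It is my own illustration, not anything Berners-Lee showed in the talk: the names, the relationship labels, and the two helper functions are all made up, and real linked data uses web addresses rather than plain words. Each fact is a little three-part statement (who, what kind of connection, to what), and the interesting part only shows up when facts about different people are put together.

```python
# Hypothetical example data: each fact is a (subject, relationship, object) triple.
# None of these names or labels come from the talk; they are for illustration only.
triples = [
    ("me", "friend_of", "alex"),
    ("alex", "friend_of", "sam"),
    ("me", "likes", "open education"),
    ("alex", "likes", "open education"),
    ("sam", "likes", "astronomy"),
]


def facts_about(person, data):
    """Return every (relationship, object) pair recorded for one person."""
    return [(rel, obj) for subj, rel, obj in data if subj == person]


def shared_interests(a, b, data):
    """Return the things both people 'like', found by linking their separate facts."""
    likes_a = {obj for rel, obj in facts_about(a, data) if rel == "likes"}
    likes_b = {obj for rel, obj in facts_about(b, data) if rel == "likes"}
    return likes_a & likes_b


if __name__ == "__main__":
    print(facts_about("me", triples))
    print(shared_interests("me", "alex", triples))
```

On its own, a single triple says very little. Asking a question across several of them (what do my friend and I have in common?) is where the linking starts to pay off.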

So the whole is greater than the sum of its parts. Got it. But with that metaphor we're wandering dangerously close to another subject I find uninteresting, at best, which is math.

So back to relationships. I like data better when I think about it as relationships. So the "linking data" Berners-Lee is talking about is basically like taking little bits of information produced by and about people, and then connecting them to another person's little bits about them, and so on, and so on, until there is a huge network of relationships/data available to, by and for everyone. Neat. Don't get too overwhelmed, now, but that 'network' of relationships, i.e. data, looks something like this:
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net
CC-BY-SA license


There are other resources from Wiley that paint a descriptive picture of what open data might be, and how it might be relevant or useful. The United States government is working with open data and providing it to anyone in the world, via its website, data.gov. This is a cool project, I think, because aside from data that could compromise or influence issues of national security, the government puts information about projects, initiatives, and all sorts of other things being paid for by tax dollars, out on the web for anyone to see and use and reuse. That kind of transparency is arguably one ingredient of a successful democratic process. That is a pretty big implication about open data and the value it can provide. And from what I can tell, it's one of many more yet to be discovered.




Thursday, March 22, 2012

Open Science: A Fundamental Tension

The module on Open Science in Wiley's course seemed to me an amalgamation of all the other "open" topics and concepts covered in the course thus far.
There are also some very familiar and consistent themes in the conversation about Open Science: there are many and varied views about what, exactly, is meant by "open science;" there are many different applications of "open science," and there are quite a few barriers to the implementation of "open science."
Open Science was the first module in the course during which I started to feel really unsure about this whole "open" thing. Up to now, everything sounded just great. Yes, let's share resources and educational content! Yes, let's allow each other to build on and improve our work! Yes, let's make resources cheaper and more accessible to anyone interested!
And then I read about Open Science. To be clear, I still very much appreciate the idea of openness as it applies to science, but this topic seems more complicated than a few of the others Wiley approaches. For the first time in the course (for me, anyway), the concept of "openness" is applied to a really specific topic, and that's when things really start to fall apart. There seem to be more questions than answers when it comes to Open Science.
As with any other community of experts, scientists belong to a culture all their own. The culture of science promotes some fairly conservative values about sharing information. In fact, according to Michael Nielsen in his speech at TEDx Waterloo in April 2011, traditional scientists actually look down on sharing. There is no prestige to be had from sharing your data with someone else. Data is considered a highly personal, sometimes even secretive aspect of scientific exploration. As a result, science tends to occur in isolation, with scientists hoarding sets of data for their own use, and hiding their work from each other.
Certainly, there have been some changes and shifts in the culture of science over the centuries. Again, back in the 15th Century, the printing press comes onto the scene and changes everything, even science. For the first time ever, scientific endeavors can be widely communicated and acknowledged. But even with that information revolution, the mores and values of the community of science didn't change much. Typically, scientific results are shared by way of professional publications and academic journals. Results are achieved when specific data or sets of data are manipulated in specific ways (can you tell I'm not a scientist?). So, while there is value to the results of scientific exploration, what's not being shared is data. Scientists do not like to share their data, and there is no incentive for them to do so. In fact, there are drawbacks, which must be obvious at this point - they must not like to share for a good reason, right? Right. If they share their data, they give someone else - a peer, a competitor, a colleague - the opportunity to change, mix, or even damage their work (Bissell and Kirn, Open Science and OER: Where Do They Intersect? 2011).
In order for science to begin to entertain the idea of sharing, Nielsen says we need to make scientists see sharing as part of their job, and we need to reward them for it. Providing incentives to share, promoting conversations about the values of sharing - these all need to happen in order for science to look upon "openness" as a truly worthwhile endeavor.
According to the Science Commons, there are several dominating principles that must be followed by institutions or individuals, in order to participate in Open Science. Institutions, organizations and people must:
  1. Provide Open Access to Literature from Funded Research,
  2. Provide Access to Research Tools from Funded Research,
  3. Put Data from Funded Research in the Public Domain, and
  4. Make Investments in Open Cyberinfrastructures.
What's most confusing and thought-provoking to me about Open Science is the global relevance of science in general. Science affects every aspect of our lives as human beings. It shapes our world in very specific and influential ways. So, in that sense, the culture of science as it is now is a global problem. Scientists all over the world are potentially duplicating each other's efforts and results, redoing each other's work and research. Wouldn't it make more sense - to everyone, not just scientists - to share? Share the work, the data, the results; share the prestige, the fame, the failure? Without sharing the work, how much time are we losing in the battle against things like cancer, diabetes, HIV? These are diseases that affect everyone, everywhere. Why wouldn't the scientific community want to contribute to each other's work and knowledge? Wouldn't societies as a whole stand to gain more through that type of sharing than any one scientist could lose?
Well, it might make sense, and we might gain more as a society if science were more open, but it's not happening - not now, and not for a while - if ever. The other troubling part of implementing Open Science is the sub-concept of "Open data." Data, in and of itself, drives much of what science is and means and does. And it's the data - what it can do, what it can mean, and what it can change - that's so controversial in this context. Let's say, for example, an experiment leads to the discovery of a biological warfare method. If we are applying the principles listed at the Science Commons, and this data is in the public domain, accessible by way of technology, what harm could be done if the data were found and used by someone who, let's say, isn't a scientist, or someone who doesn't have an altruistic purpose in mind? If this data were to fall into the hands of someone who wants to cause other people harm, using it to recreate the biological warfare method is a possible outcome. Application of the result may be detrimental to society - how much of society, I can't say, but one or more people/places could be negatively affected. So clearly there is an ethical concern with "open data," as it's understood in the context of open science.
However negative I may sound towards Open Science, in some ways I do support it, and I do understand its value. And I think behind any and all of these movements towards being open, we do recognize that technology affords us connections to a global community that we can all participate in if we choose to, and that's something that was never possible before. Technology can remove all sorts of barriers, but it can also raise a lot of questions.
While technology advances at a rapid pace, obviously the evolution of humanity moves at a slower clip. All these ideas and concepts are great, but when taken in context, are pretty radical. Sure, we have the technology and the desire to see what's possible to achieve with it, but really, we are trying to change longstanding cultural norms. That will take time, and it will require fundamental changes to the culture and professional field of science.