While Big Data is a buzzword thrown about as a promise to change health and healthcare, the reality is that we still face many barriers to realizing what our health data can really do.  Big Data is the concept of really large data sets.  As part of Obama’s Open Government Initiative, the federal government has posted an enormous number of the data sets it generates.  It has also released data specific to health: clinical care provider quality information, nationwide health service provider directories, databases of the latest medical and scientific knowledge, consumer product data, community health performance information, and government spending data.

But this does not include data held by private companies or research institutions that conduct clinical trials.  That leaves data sitting in repositories when it could improve public health, enhance patient safety, and spur drug development if shared more widely both within and across sectors.  But the tide is turning.

This week, the Institute of Medicine of the National Academies (IOM) released its summary of a workshop on sharing clinical research data.  The workshop brought together stakeholders including pharmaceutical companies, academic researchers, and government agencies to discuss compiling the large quantities of clinical research data they collect.  According to the IOM, sharing this data would not only improve health but could also increase public trust in clinical trials and the conclusions derived from them by lending transparency to the clinical research process.

As the IOM notes, despite several barriers to data sharing – such as concerns about data mining, erroneous secondary analyses of data, unwarranted litigation, and a desire to protect commercial information – there is increasing acknowledgement among researchers of the importance and potential benefits of sharing clinical research data at various stages of the research, discovery, and development pipeline.

As William Potter, Co-Chair emeritus of the Neuroscience Steering Committee for the Biomarkers Consortium of the Foundation for the National Institutes of Health, observes, “advances in treatment for the complex illnesses faced by society today are not likely to result from the data from a single study.”  When data from clinical trials is released, it can be reviewed for errors or added to a collection of data to form new studies.  Data sharing can perhaps reduce duplication of effort, since separate studies collect some of the same information.  Each trial, whether by the same researcher or another, can be built upon, rather than reinventing the wheel or going down a path that was not successful or led to erroneous results.

Last September, I had the privilege to be part of Partnership WITH Patients and the associated Health in Kansas City.  One of our breakout sessions was a discussion with Eli Lilly’s Clinical Open Innovations Group.  The attendees included me and four other ePatients (@Strangely_T1, @ReginaHolliday, @afternoonnapper, @TiffanyAndLupus).  All of us have participated in clinical trials (most notably for me, the clinical trials for Dexcom, a continuous glucose monitor), and we were fascinated by the idea that a pharmaceutical company wanted to listen to US about how to improve its development processes.  We talked about the potential impact that opening up clinical data could have on research and treatment.  I think @Strangely_T1 sums it up best in his blog post:

Personally, I think this is brilliant and it is outstanding to see a major pharma company starting an initiative in this area.  Cooperative development.  I liken the current pharma industry environment to pre-Renaissance Europe, where everyone was so isolated that collaboration was simply impossible.  The idea of development collaboration is like a bunch of highly skilled artists, scholars, and philosophers deciding to move to Florence.  All at once.  I mean, seriously what could happen?  Well a little thing called The Renaissance for one thing.

Soon after, in October, I had the pleasure of being invited to the Rev Forum sponsored by the Livestrong Foundation and Genentech, where much of the discussion was about freeing data and revolutionizing research through collaboration.  The same themes came up across the entire forum as speaker after speaker pointed to the same benefits.

But we all recognize the barriers preventing us from realizing the promises of Even Bigger Data.  Logistically, de-identifying and standardizing data to be shared can take a lot of time.  But I think that as we standardize data collection and require interoperability through other means like EHRs, this can be overcome by agreeing on an industry standard.  The bigger barrier is convincing researchers and companies to challenge the status quo.  Too worried about their own proprietary interests, they hold this information back.  We need to convince them that by releasing the data they hold so tightly, they may realize greater recognition for their work and discover new, very profitable outcomes.

Patient privacy is often cited as a barrier.  Interestingly, though, the patient has no access to this data at all.  Patients sign consents when they enter a trial.  These consent forms would have to be redeveloped (they need to be amended anyway, as there are serious concerns as to whether a patient is informed at all by signing a consent), and patients could then be given the option to share their data with industry, their treating physicians, and themselves.  As an attorney who spends a lot of time focusing on HIPAA and HITECH laws, I don’t think that the patient privacy issue is as large a barrier as some might claim.  But I do think that patients NOT having access to their own data collected in a trial is a much bigger issue that needs to be addressed.  In fact, the IOM states in the report, “clinical trials participants deserve to receive information from trials that will help them make health decisions.”

Patient access to data is a post for another day, but opening up data at any level is the first step toward making true progress in personal and population health.  The government recognizes it and publishes its data sets.

Additionally, in February, I received an email from the Obama Administration in response to a “We the People” petition asking for increased public access to the results of scientific research.  The email announced a new policy directing federal agencies with more than $100 million in research and development expenditures to develop plans to make the results of federally-funded research publicly available free of charge within 12 months after original publication (memorandum here).  The National Institutes of Health (NIH) already has a public access policy.  And though this doesn’t open up the data itself, it does take another small step in that direction.

Now if we can persuade other private industry partners to follow Lilly COI and step forward to collaborate with each other by sharing clinical data, we can take a big leap forward. It may seem a huge chasm to cross – the space between competition and collaboration, between data sets held separately and those combined for innovation.  But as the word artist and poet Sekou Andrews (@sekouworld) said in closing the Rev Forum –

There is “no distance our human capacity cannot close” our footprints are on the moon,

we can fix healthcare.

One way we can fix it…sharing data.

