Making Data Sharing Happen

Post on 10-May-2015

DESCRIPTION

Flash talk for Beyond the PDF 2, Amsterdam, 2013

TRANSCRIPT

<ul><li>1. Making It Happen: Sustainable Data Preservation and Use. March 19, 2013. Anita de Waard, VP Research Data Collaborations, Elsevier RDS, a.dewaard@elsevier.com</li></ul>
<p>2. What aspects/tools/capabilities/frameworks are related to this idea?</p>
<ul><li>There are many different research databases, both generic (Dryad, Dataverse, ...) and specific (NIF, IEDA, PDB, ...).</li>
<li>There are many systems for creating/sharing workflows (Taverna, MyExperiment, Vistrails, Workflow4Ever, etc.).</li>
<li>There are many e-lab notebooks (LabGuru, LabArchives, LaBlog, etc.).</li>
<li>There are scores of projects, committees, standards, bodies, grants, initiatives, and conferences for discussing and connecting all of this (KEfED, Pegasus, PROV, RDA, ScienceGateways, Codata, BRDI, Earthcube, etc.).</li>
<li>You can make a living out of this ;-)! (and many of us do)</li></ul>
<p>3. ... but this is what scientists do: Using antibodies and squishy bits, grad students experiment and enter details into their lab notebook. The PI then tries to make sense of this, and writes a paper. End of story.</p>
<p>4. Why save research data?</p>
<ul><li>A. Data Preservation: Preserve record of scientific process, provenance. Enable reproducible research.</li>
<li>B. Data Use: Use results obtained by others. Do better science! Improve interdisciplinary work.</li>
<li>C. Sustainable Models: Technology transfer; societal/industrial development. Reward scientists for data creation (credit/attribution). Long-term archiving.</li></ul>
<p>5. Where The Data Goes Now: &gt; 50 My Papers; 2 M scientists; 2 M papers/year.</p>
<ul><li>A small portion of data (1-2%?) is stored in small, topic-focused data repositories. PDB: 88,3 k; PetDB: 1,5 k; SedDB: 0.6 k; MiRB: 25k; TAIR: 72,1 k.</li>
<li>Some data (8%?) is stored in large, generic data repositories. Dryad: 7,631 files; Dataverse: 0.6 M; Datacite: 1.5 M.</li>
<li>The majority of data (90%?) is stored on local hard drives.</li></ul>
<p>6. Key Needs: DEVELOP SUSTAINABLE MODELS. INCREASE DATA PRESERVATION.</p>
<p>7. Objections (and rebuttals) to data sharing:</p>
<ul><li>Objection: Our lab notebooks are all on paper; it's how we do things. Rebuttal: Graft tools closely on scientists' daily practice.</li>
<li>Objection: I need to see a direct benefit of any effort I put in. Rebuttal: Create tools to allow better insight in own and others' results.</li>
<li>Objection: I don't really trust anyone else's data and don't think they'll trust mine. Rebuttal: Create social networking context and allow data owner to provide granular access control.</li>
<li>Objection: I am afraid other people might scoop my discoveries. Rebuttal: =&gt; Reward system moves from a competition to a shared mission.</li></ul>
<p>8. From insular CoSI-Factories ... (Diagram: two separate cycles, each with the steps Prepare, Observe, Analyze, Ponder, Communicate.)</p>
<p>9. ... to shared experimental repositories: Across labs, experiments: track reagents and how they are used. (Diagram: shared Observations feeding separate Prepare, Analyze, Communicate steps.)</p>
<p>10. ... to shared experimental repositories: Compare outcome of interactions with these entities. (Diagram: shared Observations feeding separate Prepare, Analyze, Communicate steps.)</p>
<p>11. ... to shared experimental repositories: Build a "virtual reagent spectrogram" by comparing how different entities interacted in different experiments. (Diagram: shared Observations and a Think step, with separate Prepare, Analyze, Communicate steps.)</p>
<p>12. Some examples:</p>
<ul><li>Grafting tools on workflow: create tailored metadata collection tools on mini-tablets in labs to replace paper notebook.</li>
<li>Direct rewards: through PI-Dashboard: allow immediate access/analysis of shared data: new science!</li>
<li>Data sharing rewards: Data Rescue Challenge: collect and reward stories/practices of data preservation/use in Earth/Lunar Science.</li>
<li>Improve data use: With NIF/Eagle-I: add antibodies as key entities to paper, link to AB repository consortium.</li></ul>
<p>13. How do we make data use happen:</p>
<ul><li>We are creating repositories of shared experiments: you are part of a greater whole!</li>
<li>Collect and share stories and practices re. data use and sustainable systems: What gets to them?</li>
<li>Develop system of rewards for data sharing: enable demonstrably better science!</li>
<li>Work with grant agencies, repositories (generic/specific, institutional, cross-national) to integrate and annotate existing datasets and enable cross-use.</li>
<li>Collectively pioneer long-term funding options; support/develop shared mission funding challenges.</li></ul>
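<p>The idea in slides 9 to 11 (track reagents across labs, then compare how they behaved in different experiments) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the <code>Observation</code> record, its fields, the <code>reagent_profile</code> helper, and the sample lab, reagent, and target names are hypothetical and do not come from the talk or from any existing repository.</p>

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One shared observation (hypothetical schema, not from the talk)."""
    lab: str      # which lab recorded it
    reagent: str  # e.g. an antibody identifier
    target: str   # entity the reagent was applied to
    outcome: str  # recorded result, kept as free text here


def reagent_profile(observations):
    """Group outcomes by reagent, then by target, across all labs."""
    profile = defaultdict(lambda: defaultdict(list))
    for obs in observations:
        profile[obs.reagent][obs.target].append((obs.lab, obs.outcome))
    # Return plain dicts so missing keys raise instead of being created.
    return {reagent: dict(targets) for reagent, targets in profile.items()}


# Invented sample data standing in for a shared experimental repository.
shared = [
    Observation("lab_a", "antibody_1", "tissue_x", "binds"),
    Observation("lab_b", "antibody_1", "tissue_x", "no binding"),
    Observation("lab_b", "antibody_1", "tissue_y", "binds"),
    Observation("lab_a", "antibody_2", "tissue_x", "binds"),
]

profile = reagent_profile(shared)
for target, results in profile["antibody_1"].items():
    print(target, results)
```

<p>Grouping by reagent first puts conflicting outcomes from different labs side by side, which is one simple reading of the cross-experiment comparison the slides describe.</p>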