The Future of Open Science Philip E. Bourne http://www.slideshare.net/pebourne/ 4/08/14 NIAID Workshop on Open Science 1

Upload: philip-bourne

Post on 06-May-2015


DESCRIPTION

Presented at the NIAID Festival on Open Science April 8-9, 2014 NIH Campus, MD, USA

TRANSCRIPT

Page 1: The Future of Open Science

NIAID Workshop on Open Science 1

The Future of Open Science

Philip E. Bourne

http://www.slideshare.net/pebourne/

4/08/14

Page 2: The Future of Open Science


The future depends on who you ask

Here is my biased viewpoint

Page 3: The Future of Open Science

My Background/Bias

• RCSB PDB/IEDB Database Developer – Views on community, quality, sustainability …

• PLOS Journal Co-founder – Open science advocate

• Associate Vice Chancellor for Innovation – Business models, interaction with the private sector, sustainability

• Professor – Mentoring, reward system, value (or not) of research

• NIH Strategist/Transformer – ??

Page 4: The Future of Open Science


Perhaps the first question to ask is:

What is an endpoint?

Page 5: The Future of Open Science

What is an Endpoint?

Page 6: The Future of Open Science

What Does The Democratization of Science Imply?

• The obvious – participation by all

• Not so obvious
– More scrutiny
– New types of rewards
– More equal value placed on all participants
– The removal of artificial boundaries that corral knowledge (through power and resources) within silos that do not make sense as complexity increases

Page 7: The Future of Open Science

Consider some personal examples that illustrate these implications


Page 8: The Future of Open Science

More Scrutiny – Highlights Lack of Reproducibility

• I can’t immediately reproduce the research in my own laboratory:

• It took an estimated 280 hours for an average user to approximately reproduce the paper

• Workflows are maturing and becoming helpful

• Data and software versions and accessibility prevent exact reproducibility

Daniel Garijo et al. 2013 Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome. PLOS ONE 8(11) e80278.

Page 9: The Future of Open Science

Why New Types of Rewards?

• I have a paper with 16,000 citations that no one has ever read

• I have papers in PLOS ONE that have more citations than ones in PNAS

• I have data sets I am proud of but few places to put them

• I edited a journal but it did not count for much

Page 10: The Future of Open Science

Equal Value Placed on Participants

• The UC System has Research Scientists (RS) & Project Scientists (PS) as well as tenured faculty
– RS/PS have no senate rights yet:
– RS/PS frequently teach
– RS/PS frequently have more grant money
– RS/PS typically perform more service
– RS/PS are most of the data scientists you know

Page 11: The Future of Open Science

Are Increasingly Found on the Google Bus

Page 12: The Future of Open Science

Institutional Boundaries

• Academia – Departments of physics, math, biology, chemistry etc. persist but scholars rarely confine themselves to these disciplines

• NIH – 27 institutes and centers, many dedicated to specific diseases & conditions – yet a specific gene undoubtedly transcends ICs


Page 13: The Future of Open Science

The Era of Open Has The Potential to Deinstitutionalize

Daniel Hulshizer/Associated Press

Page 14: The Future of Open Science

An Example of That Potential: The Story of Meredith

http://fora.tv/2012/04/20/Congress_Unplugged_Phil_Bourne

Page 15: The Future of Open Science

The Era of Open Has The Potential to Deinstitutionalize

Daniel Hulshizer/Associated Press

Page 16: The Future of Open Science


I have argued that the democratization of science is compelling

and that much has happened around open literature, open software and now open data

Page 17: The Future of Open Science

I Would Also Argue That This Process is About to Accelerate

• Others provide a more compelling argument:
– Google car
– 3D printers
– Waze
– Robotics

Page 18: The Future of Open Science

From the Second Machine Age


From: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies by Erik Brynjolfsson & Andrew McAfee

Page 19: The Future of Open Science


So what will this look like for an institution?


Institutions will become digital enterprises

Page 20: The Future of Open Science


Components of The Academic Digital Enterprise

• Consists of digital assets
– E.g. datasets, papers, software, lab notes

• Each asset is uniquely identified and has provenance, including access control
– E.g. publishing simply involves changing the access control

• Digital assets are interoperable across the enterprise
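The idea that publishing is only a change of access control can be illustrated with a minimal sketch. Everything named here (DigitalAsset, Access, the four access levels, the provenance record layout) is a hypothetical illustration and not part of the talk or of any existing system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
import uuid


class Access(Enum):
    """Hypothetical access levels; the slide does not define any."""
    PRIVATE = "private"
    LAB = "lab"
    INSTITUTION = "institution"
    PUBLIC = "public"


@dataclass
class DigitalAsset:
    """One digital asset (dataset, paper, software, lab note)."""
    kind: str
    title: str
    owner: str
    access: Access = Access.PRIVATE
    # Unique identifier generated when the asset is created.
    asset_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Append-only list of provenance events.
    provenance: list = field(default_factory=list)

    def record(self, actor: str, action: str) -> None:
        """Append a timestamped provenance event."""
        self.provenance.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "action": action,
        })

    def publish(self, actor: str) -> None:
        """Publishing only changes the access level and logs the change."""
        self.record(actor, f"access {self.access.value} -> {Access.PUBLIC.value}")
        self.access = Access.PUBLIC

    def can_read(self, user: str) -> bool:
        """Simplified rule: the owner can always read, others only if public."""
        return user == self.owner or self.access is Access.PUBLIC


if __name__ == "__main__":
    notes = DigitalAsset(kind="lab notes", title="Rotation notebook", owner="jane")
    print(notes.can_read("jack"))
    notes.publish(actor="jane")
    print(notes.can_read("jack"), notes.provenance)
```

The asset itself never moves or gets copied in this sketch; only its access field and provenance log change, which is the point the slide makes.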

Page 21: The Future of Open Science

Life in the Academic Digital Enterprise

• Jane scores extremely well in parts of her graduate on-line neurology class. Neurology professors, whose research profiles are on-line and well described, are automatically notified of Jane’s potential based on a computer analysis of her scores against the background interests of the neuroscience professors. Consequently, professor Smith interviews Jane and offers her a research rotation. During the rotation she enters details of her experiments related to understanding a widespread neurodegenerative disease in an on-line laboratory notebook kept in a shared on-line research space – an institutional resource where stakeholders provide metadata, including access rights and provenance beyond that available in a commercial offering. According to Jane’s preferences, the underlying computer system may automatically bring to Jane’s attention Jack, a graduate student in the chemistry department whose notebook reveals he is working on using bacteria for purposes of toxic waste cleanup. Why the connection? They reference the same gene a number of times in their notes, which is of interest to two very different disciplines – neurology and environmental sciences. In the analog academic health center they would never have discovered each other, but thanks to the Digital Enterprise, pooled knowledge can lead to a distinct advantage. The collaboration results in the discovery of a homologous human gene product as a putative target in treating the neurodegenerative disorder. A new chemical entity is developed and patented. Accordingly, by automatically matching details of the innovation with biotech companies worldwide that might have potential interest, a licensee is found. The licensee hires Jack to continue working on the project. Jane joins Joe’s laboratory, and he hires another student using the revenue from the license. The research continues and leads to a federal grant award. The students are employed, further research is supported and in time societal benefit arises from the technology.

From "What Big Data Means to Me", JAMIA 2014 21:194
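The step where Jane and Jack are connected because both notebooks reference the same gene a number of times could be sketched roughly as follows. This is an assumption-laden illustration only: the function names, the crude token pattern, the min_mentions threshold and the placeholder gene symbols (GENE1, GENE2) are all hypothetical, and a real system would need proper gene-name recognition plus the users' access preferences.

```python
from collections import Counter
from itertools import combinations
import re

# Crude assumption: gene symbols appear as upper-case alphanumeric tokens.
TOKEN_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]{2,}\b")


def gene_counts(text, known_genes):
    """Count how often each known gene symbol appears in one notebook."""
    tokens = TOKEN_PATTERN.findall(text)
    return Counter(t for t in tokens if t in known_genes)


def suggest_links(notebooks, known_genes, min_mentions=2):
    """Return (owner_a, owner_b, shared_genes) for every pair of notebooks
    in which both owners mention the same gene at least min_mentions times."""
    counts = {owner: gene_counts(text, known_genes)
              for owner, text in notebooks.items()}
    links = []
    for a, b in combinations(sorted(counts), 2):
        shared = sorted(
            gene for gene in counts[a]
            if counts[a][gene] >= min_mentions and counts[b][gene] >= min_mentions
        )
        if shared:
            links.append((a, b, shared))
    return links


if __name__ == "__main__":
    # Placeholder notebook text; GENE1 and GENE2 are made-up symbols.
    notebooks = {
        "Jane": "GENE1 knockdown reduced aggregation. GENE1 expression rose.",
        "Jack": "Bacterial GENE1 homolog degrades the toxin; GENE1 is induced.",
        "Ann": "GENE2 was mentioned once.",
    }
    print(suggest_links(notebooks, known_genes={"GENE1", "GENE2"}))
```

Requiring repeated mentions on both sides is one simple way to avoid suggesting a connection on the basis of a single passing reference.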

Page 22: The Future of Open Science

Life in the NIH Digital Enterprise

• Researcher x is made aware of researcher y through commonalities in their data located in the data commons. Researcher x reviews the grants profile of researcher y and publication history and impact from those grants in the past 5 years and decides to contact her. A fruitful collaboration ensues and they generate papers, data sets and software. Metrics automatically pushed to company z for all relevant NIH data and software in a specific domain with utilization above a threshold indicate that their data and software are heavily utilized and respected by the community. An open source version remains, but the company adds services on top of the software for the novice user and revenue flows back to the labs of researchers x and y which is used to develop new innovative software for open distribution. Researchers x and y come to the NIH training center periodically to provide hands-on advice in the use of their new version and their course is offered as a MOOC.
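The part of this scenario where metrics are pushed for resources in one domain whose utilization is above a threshold amounts to a filter over usage records. The sketch below assumes a hypothetical record type (ResourceMetric) and a made-up usage measure (monthly_uses); the talk does not specify how utilization would be measured or delivered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceMetric:
    """Hypothetical usage record for one data set or software package."""
    name: str
    domain: str
    monthly_uses: int


def resources_to_report(metrics, domain, threshold):
    """Select resources in the given domain whose utilization exceeds the
    threshold, most heavily used first."""
    hits = [m for m in metrics
            if m.domain == domain and m.monthly_uses > threshold]
    return sorted(hits, key=lambda m: m.monthly_uses, reverse=True)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    metrics = [
        ResourceMetric("tool-a", "structural biology", 1200),
        ResourceMetric("dataset-b", "structural biology", 300),
        ResourceMetric("tool-c", "immunology", 5000),
    ]
    for m in resources_to_report(metrics, "structural biology", threshold=500):
        print(m.name, m.monthly_uses)
```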

Page 23: The Future of Open Science

To get to that end point we have to consider the complete digital research lifecycle


Page 24: The Future of Open Science


The Digital Research Life Cycle

IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION

Page 25: The Future of Open Science

Tools and Resources Will Be Better Coordinated

IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION

Authoring Tools, Lab Notebooks, Data Capture, Software, Analysis Tools, Visualization, Scholarly Communication

Page 26: The Future of Open Science


Through Interconnection Around a Common Framework

IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION

Authoring Tools, Lab Notebooks, Data Capture, Software, Analysis Tools, Visualization, Scholarly Communication

Page 27: The Future of Open Science

New/Extended Support Structures Will Emerge

IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION

Authoring Tools, Lab Notebooks, Data Capture, Software, Analysis Tools, Visualization, Scholarly Communication

Commercial & Public Tools, Git-like Resources By Discipline, Data Journals, Discipline-Based Metadata Standards, Community Portals, Institutional Repositories, New Reward Systems, Commercial Repositories, Training

Page 28: The Future of Open Science


We Have a Ways to Go

IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION

Authoring Tools, Lab Notebooks, Data Capture, Software, Analysis Tools, Visualization, Scholarly Communication

Commercial & Public Tools, Git-like Resources By Discipline, Data Journals, Discipline-Based Metadata Standards, Community Portals, Institutional Repositories, New Reward Systems, Commercial Repositories, Training

Page 29: The Future of Open Science

But Let’s Not Forget NIH Has Contributed a Lot

• NLM/NCBI

• Individual IC support

• Open access policies – PubMed Central

• Emergent data sharing plans

• Big Data to Knowledge (BD2K)

• Office of the Associate Director for Data Science

• … And more to come…

Page 30: The Future of Open Science


Call Out to Eric Green, and the Team…


bd2k.nih.gov

Page 31: The Future of Open Science


Interesting Observations So Far

• We need to start by asking, how are we using the data now?

• We have the why for data sharing, but not the how

• Training is spotty

• Existing data resources need attention

• Sometimes it is enough for me to sit down

Page 32: The Future of Open Science

Office of Data Science

The Biomedical Research Digital Enterprise

Programmatic Theme: Sustainability, Education, Innovation, Process

Deliverable: Data Commons, Training Center, BD2K, Review

Example Features:

• Cloud – Data & Compute
• Search
• Security
• Reproducibility Standards
• App Store
• Hands-on
• MOOCs
• Community Engagement
• Data Science Centers
• Training Grants
• DDI
• Analysis
• Domain Support
• Data Resource Support
• Metrics
• Best Practices
• Evaluation
• Portfolio Analysis

Communication

Collaboration

• To IC’s
• To Researchers
• To Federal Agencies
• To International Partners
• To Computer Scientists

Scientific Data Council

External Advisory Board

04/03/14

Page 33: The Future of Open Science

One Possible End Point

0. Full text of PLoS papers stored in a database
1. A link brings up figures from the paper
2. Clicking the paper figure retrieves data from the PDB which is analyzed
3. A composite view of journal and database content results
4. The composite view has links to pertinent blocks of literature text and back to the PDB

1. User clicks on thumbnail
2. Metadata and a webservices call provide a renderable image that can be annotated
3. Selecting a feature provides a database/literature mashup
4. That leads to new papers

PLoS Comp. Biol. 2005 1(3) e34

Page 34: The Future of Open Science


Open Science Will:

• Lead to the democratization of science

• Change how institutions think and operate – they will become digital enterprises

• Impact all aspects of the scholarly research lifecycle

• Accelerate seek{ing} fundamental knowledge about the nature and behavior of living systems and the application of that knowledge to enhance health, lengthen life, and reduce illness and disability

Page 35: The Future of Open Science

Thank You! Questions?

[email protected]

Acknowledgements

• Vivien Bonazzi
• Eric Green
• Mark Guyer
• Jennie Larkin
• David Lipman
• Peter Lyster
• Many more….