genomics:gtl information and data sharing policy

Genomics:GTL Information and Data Sharing Policy. Susan K. Gregurick, U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research. BERAC Briefing, May 19th, 2008. Susan.Gregurick@science.doe.gov. Genomicsgtl.energy.gov


Page 1: Genomics:GTL  Information and Data Sharing Policy

Genomics:GTL Information and Data Sharing Policy

Susan K. Gregurick
U.S. Department of Energy

Office of Science
Office of Biological and Environmental Research

BERAC Briefing
May 19th, 2008

Susan.Gregurick@science.doe.gov

Genomicsgtl.energy.gov

Page 2: Genomics:GTL  Information and Data Sharing Policy

“As long as the attention to data policies and data management by funding agencies does not catch up with the rapidly changing research environment, we will continue the systemic loss and underutilization of valuable data derived from public investments.”

(excerpted from Uhlir and Schröder, Data Science Journal Vol. 6, 2007).

Page 3: Genomics:GTL  Information and Data Sharing Policy

UPSIDE: Uniform Principle for Sharing Integral Data and Materials Expeditiously

– Community standards for sharing publication-related data and materials state that it is an author’s obligation not only to release the data and materials but also to provide them in a format that allows other scientists to build on them for future research.

Sharing Publication-Related Data and Materials: Responsibilities of Authorship in the Life Sciences. National Research Council, 2003.

Page 4: Genomics:GTL  Information and Data Sharing Policy

Principles of data sharing: A community-led approach

• General:
  – Information and data arising from public research investment should be publicly available
• Scientific:
  – Data and information sharing is essential for the highly focussed Genomics:GTL research program
  – New technologies mean increasingly larger amounts of research data

• Ownership of data generated through GTL-sponsored research lies with researchers and institutions, but needs to be shared across the program

• Our role is to provide guidance and mechanisms to facilitate and support data and information sharing within the GTL program

Page 5: Genomics:GTL  Information and Data Sharing Policy

A checklist for developing the GTL data policy.

• Identify science driver(s) necessitating a formal policy
• Create a working group to bring the policy to fruition
• Poll GTL researchers with respect to data policy needs and developments
• Research current policies and data sharing opinions/practices from literature
• Draft a strawman document and define key aspects of the policy:
  – Scope of policy (data types covered), applicability (which data fall under the policy), rules of data sharing, compliance with standards, submission to appropriate repositories
  – Expectation for compliance
  – Consequences of non-compliance
• Subject strawman draft policy to internal and then external round(s) of consultation followed by iterative improvements
• Post final draft onto public website and publicize at GTL Awardees Meeting
• Set into motion support for policy
  – Could include: creation/extension of data centers, physical archives, facilities, institutions, ring-fenced funds for competitive award programs, education, outreach
• Monitor compliance and enforce policy
• Extend policy to cover sub-areas of science/data as required
• Revise policy and any implementation as required

Page 6: Genomics:GTL  Information and Data Sharing Policy

GTL Data and Information Sharing Policy

• The Office of Biological and Environmental Research (OBER) will require that all publishable information resulting from GTL funded research must conform to community recognized standard formats when they exist, be clearly attributable, and be deposited within a community recognized public database(s) appropriate for the research conducted. Furthermore, all experimental data obtained as a result of GTL funded research must be kept in an archive maintained by the Principal Investigator (PI) for the duration of the funded project. Any publications resulting from the use of shared experimental data must accurately acknowledge the original source or provider of the attributable data. The publication of information resulting from GTL funded research must be consistent with the Intellectual Property provisions of the contract under which the publishable information was produced.

Page 7: Genomics:GTL  Information and Data Sharing Policy

Details of Data and Information Sharing Policy

Effective October 1, 2008

All investigators are expected to submit their publication-related information to a national or international public repository, when one exists, according to the repository’s established standards for content and timeliness but no later than 3 months after publication.

This includes:

• Experimental protocols,
• Raw and/or processed data, as required by the repository,
• Other relevant supporting materials.
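
The 3-month deposit window above can be illustrated with a short date calculation. This is only a sketch: it assumes "3 months" means calendar months, with the day clamped to the end of shorter months, and the function names (`add_months`, `deposit_deadline`, `is_deposit_on_time`) are hypothetical rather than part of the policy. A repository's own timeliness standards may require earlier deposit.

```python
import calendar
from datetime import date

# Assumption: "3 months after publication" is read as calendar months.
DEPOSIT_WINDOW_MONTHS = 3


def add_months(d: date, months: int) -> date:
    """Shift d forward by whole calendar months, clamping the day to the
    last day of the target month (e.g. Nov 30 + 3 months -> Feb 28/29)."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def deposit_deadline(publication_date: date) -> date:
    """Latest deposit date under the assumed reading of the policy."""
    return add_months(publication_date, DEPOSIT_WINDOW_MONTHS)


def is_deposit_on_time(publication_date: date, deposit_date: date) -> bool:
    """True if the deposit happened on or before the deadline."""
    return deposit_date <= deposit_deadline(publication_date)


if __name__ == "__main__":
    published = date(2008, 10, 15)
    print(deposit_deadline(published))
    print(is_deposit_on_time(published, date(2009, 2, 1)))
```

A project could run a check like this over its publication list to flag items whose deposit is overdue.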

Page 8: Genomics:GTL  Information and Data Sharing Policy

Protection of Intellectual Property

• For cases where information sharing standards or databases do not yet exist, the information sharing and data archiving plan provided by a project’s PI must state these limitations.

• Data and information that are necessary elements of protected intellectual property and related to a pending or future patent application are explicitly exempt from public access until completion of the patenting process.

Page 9: Genomics:GTL  Information and Data Sharing Policy

Nationally and Internationally-Accepted Databases and Ontologies

• Sequence Data and Information:
  – Deposit and report accession number
    • GenBank/EMBL, UniProtKB/Swiss-Prot Protein Knowledge database
• Three Dimensional Structures:
  – Deposit and record accession code
    • PDB, NAD

Page 10: Genomics:GTL  Information and Data Sharing Policy

Microarray and Gene Expression Data

• Microarray and Gene Expression Data (MGED) Society: focuses on establishing standards for microarray and other functional genomics data, including data quality, management, annotation and exchange.
  – MIAME describes the Minimum Information About a Microarray Experiment that is needed to enable the interpretation of the results of the experiment unambiguously.
  – A number of high-impact journals require MIAME-compliant data as a condition for publishing microarray-based papers (Nature, Science, PLoS, …)
• GTL Microarray and Gene Expression Data (recommended):
  – Deposit in MIAME-compliant format
    • Gene Expression Omnibus, ArrayExpress, Stanford Microarray Database

Page 11: Genomics:GTL  Information and Data Sharing Policy

Proteomics and Molecular Interaction Experiments

• Proteomics Standards Initiative (PSI), a working group of the Human Proteome Organization (HUPO): defines community standards for proteomics data to facilitate data comparison, exchange and verification.
  – Minimum information about a proteomics experiment (MIAPE) and minimum information required for reporting a molecular interaction experiment (MIMIx)
  – A number of databases now accept PSI Molecular Interaction standards (BIND, DIP, HPRD, Hybrigenics, IntAct, MINT, and MIPS)
• GTL Proteomics Data (recommended):
  – Deposit in MIAPE and MIMIx compliant format
    • Open Proteomic Database (OPD, PRIDE) and PEDRo (Proteome Experimental Data Repository)

Page 12: Genomics:GTL  Information and Data Sharing Policy

Information Sharing Systems and Databases Under Development

• Other Technologies (recommended):
  – In cases where there are no public repositories or community driven standard ontologies, data and information should be made publicly available by the PI
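
The repository and standard recommendations on the preceding slides can be summarized as a simple lookup with a fallback for data types that have no listed home. This is a hedged sketch: the dictionary keys, the `Guidance` class and the `deposit_guidance` function are hypothetical names chosen for illustration, and the entries only restate what the slides list.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Guidance:
    standard: Optional[str]          # reporting standard named on the slides, if any
    repositories: Tuple[str, ...]    # repositories named on the slides
    note: str = ""


# Keys are hypothetical labels; values restate the slides' recommendations.
RECOMMENDATIONS = {
    "sequence": Guidance(None, ("GenBank/EMBL", "UniProtKB/Swiss-Prot"),
                         "Deposit and report accession number"),
    "3d_structure": Guidance(None, ("PDB", "NAD"),  # names as listed on the slide
                             "Deposit and record accession code"),
    "microarray": Guidance("MIAME", ("Gene Expression Omnibus", "ArrayExpress",
                                     "Stanford Microarray Database")),
    "proteomics": Guidance("MIAPE", ("Open Proteomic Database", "PRIDE", "PEDRo")),
    "molecular_interaction": Guidance("MIMIx", ("Open Proteomic Database",
                                                "PRIDE", "PEDRo")),
}

# Fallback for technologies without a public repository or standard ontology.
FALLBACK = Guidance(None, (),
                    "No listed repository or standard; data and information "
                    "should be made publicly available by the PI")


def deposit_guidance(data_type: str) -> Guidance:
    """Return the slide-level recommendation for a data type, or the fallback."""
    return RECOMMENDATIONS.get(data_type.strip().lower(), FALLBACK)


if __name__ == "__main__":
    for kind in ("microarray", "metabolomics"):
        print(kind, deposit_guidance(kind))
```

Keeping the fallback explicit mirrors the policy's structure: a named repository where one exists, otherwise public release by the PI.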

Page 13: Genomics:GTL  Information and Data Sharing Policy

Protection of Human Subjects

• Research using human subjects provides important scientific benefits but these benefits never outweigh the need to protect individual rights and interests. OBER will require that grantees and contractors follow the DOE principles and regulations for the protection of human subjects involved in DOE research. Minimally this will require an IRB review. These principles are stated clearly in the Policy and Order documents: DOE P 443.1A and DOE O 443.1A, which are available online at www.directives.doe.gov.

Page 14: Genomics:GTL  Information and Data Sharing Policy

Computational Software

• The International Society for Computational Biology (ISCB) recommends that funding agencies follow ISCB guidelines for open-source software at a “Level 0” availability.
  – ISCB states that research software will be made available free of charge, in binary form, on an “as is” basis for non-commercial use and without providing software users the right to redistribute.
• OBER will follow ISCB recommendations at a Level 0 availability.
  – Research software (binary) is to be made accessible through either an open source license (www.opensource.org) or deposited to an open source software community such as SourceForge.

Page 15: Genomics:GTL  Information and Data Sharing Policy

Laboratory Information Management Systems (LIMS) for Data Management and Archiving

• Research projects that involve more than one senior investigator will be required to implement a LIMS or a similar type of system for data and information archiving and retrieval across the entire project.

• The LIMS plan should balance the clear value of data availability and sharing within the project against the cost and effort of archive construction and maintenance.

Page 16: Genomics:GTL  Information and Data Sharing Policy

Summary

• Data and information should conform to existing community recognized standard formats wherever possible, be clearly attributable, and be deposited, in a timely manner, within a community recognized public database(s) appropriate for the research conducted.

• OBER is committed to encouraging development of public repositories and standard ontologies for the GTL research community.

• OBER recognizes that this policy necessarily will be updated to incorporate new standards, data types, and other advances.

• This information and data-sharing policy and related materials can be found at genomicsgtl.energy.gov/datasharing.

Page 17: Genomics:GTL  Information and Data Sharing Policy

Funding Body | Country | Parent Policy Title | Year | Location on Web (URL)
Economic and Social Research Council (ESRC) | UK | ESRC Data Policy | 1994 | http://www.esrcsocietytoday.ac.uk/ESRCInfoCentre/Images/DataPolicy2000_tcm6-12051.pdf
Natural Environment Research Council (NERC) | UK | NERC Data Policy | 1996 | http://www.nerc.ac.uk/research/sites/data/policy.asp
National Science Foundation (NSF) | US | NSF Data Sharing Policy | 2001 | http://www.nsf.gov/pubs/2001/gc101/gc101rev1.pdf
National Institutes of Health (NIH) | US | NIH Data Sharing Policy | 2003 | http://grants.nih.gov/grants/policy/data_sharing/
Gordon and Betty Moore Foundation (GBMF)* | US | GBMF Data Sharing Policy and Implementation Guidance | 2005 | http://www.moore.org/docs/GBMF_Data_Sharing_Policy_Impl_Guide_v4.pdf
Genome Canada | Canada | Genome Canada Data Release and Resource Sharing Policy | 2005 | http://www.genomecanada.ca/xcorporate/policies/DataReleasePolicy.pdf
Biotechnology and Biological Sciences Research Council (BBSRC)* | UK | BBSRC Data Sharing Policy | 2006 | http://www.bbsrc.ac.uk/support/guidelines/datasharing/context.html
Medical Research Council (MRC) | UK | MRC Data Sharing and Preservation Policy | 2006 | http://www.mrc.ac.uk/PolicyGuidance/EthicsAndGovernance/DataSharing/PolicyonDataSharingandPreservation/MRC002551
Wellcome Trust | UK | Wellcome Trust Policy on Data Management and Sharing | | http://www.wellcome.ac.uk/doc_wtx035043.html
Department of Energy (DOE) | US | Genomics: GTL Program Information and Data Sharing Policy (Office of Biological and Environmental Research) | 2008 | http://genomicsgtl.energy.gov/datasharing

Page 18: Genomics:GTL  Information and Data Sharing Policy

GTL Knowledgebase Workshop

DOE-OBER workshop
GTL Knowledgebase for Systems Biology
Washington DC, May 28-30, 2008

Workshop Purpose

Identify research needs and opportunities for a Systems Biology Knowledgebase to capitalize on GTL research investments.
Provide an assessment of where the science and technology now stands and where barriers to progress might exist.

Describe the directions for fundamental research that can be pursued to meet these goals:

Data and information acquisition and curation
Organization of information driven by scientific inquiry
Infrastructure and Technology

Page 19: Genomics:GTL  Information and Data Sharing Policy

Data and Information Sharing Policy Working Group

• Chair: J. Fredrickson
• Members:
  – A. P. Arkin
  – E. Uberbacher
  – D. Platt
  – S. Kravitz
  – S. Salzberg
  – J. Stanford
  – P. Karp
  – D. Schmoyer
  – K. Andrews-Cramer
  – H. Berman
  – N. Baliga
  – B. Davison
  – G. Anderson
  – T. Critchlow
  – JBEI, GLBRC & BESC