The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

The Diversity of Biomedical Data, Databases and Standards. Peter McQuilton, BioSharing Content Lead, https://www.biosharing.org, @biosharing. IG Elixir Bridging Force, WG Biosharing Registry, WG Data Type Registries, WG Metadata Standards Catalog. International Data Week, RDA, Denver, 15th September, 2016.

Upload: peter-mcquilton

Post on 13-Apr-2017

TRANSCRIPT

Page 1: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

The Diversity of Biomedical Data, Databases and Standards

Peter McQuilton

BioSharing Content Lead

https://www.biosharing.org

@biosharing

IG Elixir Bridging Force, WG Biosharing Registry, WG Data Type Registries, WG Metadata Standards Catalog

International Data Week, RDA, Denver, 15th September, 2016

Page 2: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

A growth in data, a growth in databases, a growth in standards

Number of databases in the NAR database issue, up to 2015 (from @AlexBateman1)

Page 3: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

Data has to be structured for sharing – we need standards

• Data/content standards:

• Structure, enrich and report the description of the datasets and the experimental context under which they were produced

• Facilitate the discovery, sharing, understanding and reuse of datasets

• Ensure all digital research outputs are Findable, Accessible, Interoperable and Reusable (FAIR)

Page 4: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

Content standards – enablers

Formats: conceptual model, conceptual schema, exchange formats etc.
o Allow data to flow from one system to another
o e.g. FASTA

Terminologies: controlled vocabularies, taxonomies, thesauri, ontologies etc.
o Use the same word and refer to the same ‘thing’
o e.g. Gene Ontology

Guidelines: minimum information reporting requirements, checklists
o Report the same core, essential information
o e.g. MIAME guidelines
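
To illustrate how an exchange format lets data flow from one system to another, here is a minimal sketch of reading FASTA text in Python. It assumes the common convention that a record starts with a ">" header line and that the sequence may wrap over several lines. The function name `parse_fasta` and the example records are hypothetical, not taken from the talk.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a mapping of record identifier -> sequence.

    Assumption: the identifier is the first whitespace-delimited token
    after the ">" on a header line; sequence lines are concatenated.
    """
    chunks: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            parts = line[1:].split()
            current = parts[0] if parts else ""
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


# Hypothetical example input, for illustration only.
example = """>seq1 hypothetical example record
ACGTAC
GTTA
>seq2
MKV
""".splitlines()

sequences = parse_fasta(example)
print(sequences)
```

Any tool that follows the same convention can read these records without knowing which system wrote them, which is the point the slide makes about formats.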

Page 5: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

Over 700 content standards in biomedical sciences

de jure / de facto; grass-roots groups / standard organizations

Nanotechnology Working Group

Formats: MAGE-Tab, GCDML, SRAxml, SOFT, FASTA, DICOM, mzML, SBRML, SEDML, GELML, ISA-Tab, CML, MITAB, …

Terminologies: AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO, TEDDY, PRO, XAO, DO, VO, …

Guidelines: MIAME, MIAPA, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE, REMARK, CONSORT, …

Page 6: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

Diversity in Standards

Technologically-focused content standards: transcriptomics, proteomics, metabolomics

Technologies shown: Arrays, Scanning, Arrays & Scanning, Columns, Gels, MS, MS, FTIR, NMR

Biologically-focused content standards: plant biology, epidemiology, microbiology

Even if common features exist, e.g.:
- description of source biomaterial
- experimental design components
these are inconsistently duplicated

Page 7: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

What is BioSharing?

A web-based, curated and searchable portal that monitors the development and evolution of standards, their use in databases and the adoption of both in data policies, to inform and educate the user community.

Page 8: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

What is BioSharing?

Standards are digital objects too and we make them FAIR

Page 9: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

Complex and evolving landscape

Data policies by funders, journals and other organizations (>100)

Databases, tools and services (>1000)

Content standards (>700): Formats, Terminologies, Guidelines

Page 10: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

Working with and for the community

Page 11: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

What data do we capture?

NCBI Taxon

~1400 tags
Some hierarchy
Synonyms
4 axes:
- Process
- Material
- Datatype
- Property
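
The tagging scheme on this slide (tags with synonyms, some hierarchy, and four axes) could be modelled roughly as below. This is only a sketch under assumptions: the class names, fields and example tags are hypothetical and do not describe BioSharing's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Axis(Enum):
    # The four axes named on the slide.
    PROCESS = "Process"
    MATERIAL = "Material"
    DATATYPE = "Datatype"
    PROPERTY = "Property"


@dataclass
class Tag:
    name: str
    axis: Axis
    synonyms: List[str] = field(default_factory=list)
    parent: Optional[str] = None  # name of a broader tag, if any


def build_index(tags: List[Tag]) -> Dict[str, Tag]:
    """Map every preferred name and synonym (lower-cased) to its tag."""
    index: Dict[str, Tag] = {}
    for tag in tags:
        for label in [tag.name, *tag.synonyms]:
            index[label.lower()] = tag
    return index


def ancestors(tag: Tag, index: Dict[str, Tag]) -> List[str]:
    """Walk up the (partial) hierarchy and return broader tag names."""
    chain: List[str] = []
    current = tag
    while current.parent is not None:
        current = index[current.parent.lower()]
        chain.append(current.name)
    return chain


# Hypothetical tags, for illustration only.
tags = [
    Tag("assay", Axis.PROCESS),
    Tag("sequencing assay", Axis.PROCESS, synonyms=["sequencing"], parent="assay"),
    Tag("protein", Axis.MATERIAL, synonyms=["polypeptide"]),
]
index = build_index(tags)
resolved = index["sequencing"]  # a synonym resolves to its preferred tag
print(resolved.name, ancestors(resolved, index))
```

Resolving synonyms to one preferred tag is the same idea as the terminologies slide: use the same word to refer to the same thing.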

Page 12: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

Grouping records for different use cases

Collections group together one or more types of resource by domain, project or organization.

Recommendations are a core set of resources that are selected and recommended by a funder or journal data policy.
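
The two groupings can be sketched as simple data structures, for example to check which resources in a collection are not yet covered by a policy's recommendation. All class names, the helper `not_recommended`, "ExampleDB" and the example policy are hypothetical. MIAME, FASTA and Gene Ontology are used only because they appear earlier in the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Record:
    name: str
    kind: str  # assumed values: "standard", "database" or "policy"


@dataclass
class Collection:
    name: str
    grouped_by: str  # "domain", "project" or "organization"
    records: List[Record] = field(default_factory=list)


@dataclass
class Recommendation:
    policy: str  # the funder or journal data policy making the recommendation
    records: List[Record] = field(default_factory=list)


def not_recommended(collection: Collection, recommendation: Recommendation) -> List[Record]:
    """Records in the collection that the policy does not (yet) recommend."""
    recommended = set(recommendation.records)
    return [r for r in collection.records if r not in recommended]


# Hypothetical example data, for illustration only.
miame = Record("MIAME", "standard")
fasta = Record("FASTA", "standard")
gene_ontology = Record("Gene Ontology", "standard")
example_db = Record("ExampleDB", "database")

collection = Collection(
    "Example project collection", "project",
    [miame, fasta, gene_ontology, example_db],
)
recommendation = Recommendation("Example journal data policy", [miame, gene_ontology])

missing = not_recommended(collection, recommendation)
print([r.name for r in missing])
```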

Page 13: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)
Page 14: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)
Page 15: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)
Page 16: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

“BioSharing and its interactive browser will allow us to discover which databases and standards are not currently included in our author guidelines, enabling us to regularly monitor and refine our policies as appropriate, in support of our mission to help our authors enhance the reproducibility of their work.” – Holly Murray, F1000Research

Page 17: The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th plenary)

Advisory Board

Operational Team