Data Publication: Discover, Explore, Visualise

Alejandra Gonzalez-Beltran, PhD
Research Lecturer
Oxford e-Research Centre, University of Oxford
Data Visualisation and the Future of Academic Publishing
University of Oxford and Oxford University Press
June 10th 2016
@alegonbel




Philippe Rocca-Serra, PhD, Senior Research Lecturer
Alejandra Gonzalez-Beltran, PhD, Research Lecturer
Milo Thurston, DPhil, Research Software Engineer
Massimiliano Izzo, PhD, Research Software Engineer
Peter McQuilton, PhD, Knowledge Engineer
Allyson Lister, PhD, Knowledge Engineer
Eamonn Maguire, DPhil, Software Engineer contractor
David Johnson, PhD, Research Software Engineer
Susanna-Assunta Sansone, PhD, Principal Investigator, Associate Director

Our main areas of research and activity:
• Enabling reproducible research through…
• Data collection, curation, representation etc.
• Data publication
• Data provenance
• Development of software, infrastructure
• Open, community ontologies and standards
• Semantic web / linked data
• Training

Communities we work with/for: [logos not reproduced]

Outline

• Challenges associated with scholarly data
• Importance of all research outputs / metadata
• Reproducibility crisis
• Experiments description
• Data availability
• Data publication
• Springer Nature Scientific Data
• Discover, Explore, Visualise Scholarly Data
• Scientific Data ISA-explorer


Credit to: https://www.digital-science.com/blog/news/five-top-reasons-to-protect-your-data-and-practise-safe-science/

Challenges related to scholarly data

• Outputs are multi-dimensional, diverse, not always well cited / stored
  o Software, codes, workflows etc.; hard(er) to get hold of
• Data often distributed and fragmented to fit (siloed) databases
  o Without enough information for others to understand it
• Uneven level of details and annotation across different databases
  o Specialized, generalist, public and institutional
• Data curation activities are perceived as time consuming
  o Collection and harmonization of detailed methods and experimental steps is done/rushed at publication stage

But… shared data is not always understandable, reusable

Importance of:
- avoiding selective reporting
- experimental design
- statistical power
- statistical analysis
- code/methods availability
- data availability

• Incentive, credit for sharing
  o Big and small data
  o Unpublished data
  o Long tail of data
  o Curated aggregation
• Peer review of data
• Value of data vs. analysis
• Discoverability and reusability
  o Complementing community databases

Growing number of data papers and data journals

nature.com/scientificdata

Honorary Academic Editor: Susanna-Assunta Sansone, PhD
Managing Editor: Andrew L Hufton, PhD
Editorial Curator: Varsha Khodiyar
Publisher: Iain Hrynaszkiewicz

A new open-access, online-only publication for descriptions of scientifically valuable datasets

Supported by


Research papers
Data records
Data Descriptors

Value added: complement between traditional articles & repositories

Scientific hypotheses:
• Synthesis
• Analysis
• Conclusions

Methods and technical analyses supporting the quality of the measurements:
• What did I do to generate the data?
• How was the data processed?
• Where is the data?
• Who did what when?

Relation with traditional articles – content

Citation of and links to data files and databases

Credit for data producers

A new article type

A new category of publication that provides detailed descriptors of scientifically valuable datasets

Mandates open data, without unnecessary restrictions, as a condition of submission

Summary Table

Web app to discover, explore, visualise data descriptors

http://scientificdata.isa-explorer.org/

Browse

Keyword search

Filter

Filtering options

See annotations and number of associated data descriptors

Combination of filters

Visualise the samples' characteristics

Publication date

Open associated data descriptor

Download Metadata

Licensing

Links to data repositories to access the data

Assays details

Summary

• Challenges associated with scholarly data
• Importance of all research outputs / metadata
• Reproducibility crisis
• Experiments description
• Data availability
• Data publication
• Springer Nature Scientific Data
• Discover, Explore, Visualise Scholarly Data
• Scientific Data ISA-explorer

