Post on 06-Aug-2015

From peer-reviewed to peer-reproduced: a role for research objects in scholarly publishing in the life sciences

Alejandra González-Beltrán

Oxford e-Research Centre, University of Oxford

-ontology.org

Bioinformatics Open Source Conference (BOSC), Dublin, Ireland

July 10-11 2015

"AGBell Notebook" by Alexander Graham Bell. (d. 1922) - page 40-41 of Alexander Graham Bell Family Papers in the Library of Congress' Manuscript Division.

Licensed under Public Domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:AGBell_Notebook.jpg#/media/File:AGBell_Notebook.jpg

http://petcaretips.net/bonding-rabbit-to-pets.html

Much has been said about the challenges of scientific reproducibility and how it can go wrong…

Difficulties arise when the description of the experimental steps is only available in lab notebooks and scientific articles, and when the data or the software tools required for analysis are missing

Can data models and computational workflows help in capturing the experimental processes and reproduce findings?

How?

• experimental description (design & steps)
• conclusions
• computational workflows
• aggregation & workflow preservation

• open peer-review
• availability of
  • data
  • analysis scripts
  • documentation

Evaluation of the SOAPdenovo2 tool for the de novo assembly of genomes from short DNA reads produced by next-generation sequencing; SOAPdenovo2 implements improvements over the SOAPdenovo1 assembler.

pre-publication history

https://github.com/aquaskyline/SOAPdenovo2

http://sourceforge.net/projects/soapdenovo2/

Experimental Description

EXCELERATE interoperability component

http://www.ncbi.nlm.nih.gov/books/NBK279831/

http://elixir-uk.org/interoperability-infrastructure

The experimental plan - computational case

Predictor Variables (Factor Name, Factor Type)

• genome assembly algorithm: SOAPdenovo2, SOAPdenovo1, ALL-PATHS-LG
• genome size: bacterial genome, insect genome, human genome

3x3 factorial design, 9 study groups: each genome assembly algorithm crossed with each genome size

Genomes: S. aureus, R. sphaeroides, B. impatiens, Chinese Han genome (or YH genome)
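The 3x3 factorial design in the plan above can be enumerated mechanically. The following is a minimal sketch, assuming only the two factors and the level names shown on the slide; the helper `full_factorial` is a hypothetical name introduced for illustration and is not part of any ISA tool.

```python
from itertools import product

# Factor levels as listed on the slide.
ASSEMBLY_ALGORITHMS = ["SOAPdenovo2", "SOAPdenovo1", "ALL-PATHS-LG"]
GENOME_SIZES = ["bacterial genome", "insect genome", "human genome"]


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


study_groups = full_factorial({
    "genome assembly algorithm": ASSEMBLY_ALGORITHMS,
    "genome size": GENOME_SIZES,
})

# One line per study group: 3 algorithms x 3 genome sizes.
for number, group in enumerate(study_groups, start=1):
    print(number, group["genome assembly algorithm"], "x", group["genome size"])
```

Declaring the factors as data, rather than listing the groups by hand, makes it easy to check that every combination was planned and later reported.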

Response Variables (with units)

genome coverage (%)

computation run time (h)

peak memory consumption (GB)

contig N50 (kb or bp)

scaffold N50 (kb or bp)

number of errors
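Of the response variables above, N50 is the one with a less obvious definition. The sketch below uses the commonly cited definition (the length such that sequences of at least that length make up half or more of the total assembled length); the slides do not state which variant the authors computed, and the toy lengths are made up for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Made-up lengths, for illustration only (total 290, half 145).
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))  # prints 70, since 80 + 70 = 150 >= 145
```

The same function applies to contig N50 and scaffold N50; only the input lengths differ.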

The experimental steps

• Experimental design: identify experimental goal, independent and response variables
• Experimental workflows: identification of processes, their inputs and outputs
• Unambiguous identification of resources (e.g. record from public repositories); persistent identifiers if available (ORCIDs, DOIs); we suggest a dedicated article section
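Describing each process with its inputs and outputs makes gaps in the experimental description easy to detect. The following is a rough sketch under stated assumptions: the `Process` class, the `unresolved_inputs` helper and all step and resource names are hypothetical, and they are not taken from the ISA model or from the SOAPdenovo2 study.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """One experimental step with named inputs and outputs."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def unresolved_inputs(processes, starting_materials):
    """List (process, input) pairs whose input is produced by no earlier step."""
    available = set(starting_materials)
    missing = []
    for process in processes:
        for item in process.inputs:
            if item not in available:
                missing.append((process.name, item))
        available.update(process.outputs)
    return missing


# Hypothetical step and resource names, for illustration only.
workflow = [
    Process("read correction", inputs=["raw reads"], outputs=["corrected reads"]),
    Process("genome assembly", inputs=["corrected reads"],
            outputs=["contigs", "scaffolds"]),
    Process("assembly evaluation", inputs=["scaffolds", "reference genome"],
            outputs=["evaluation report"]),
]

# The reference genome is used but never identified as a starting material.
print(unresolved_inputs(workflow, starting_materials=["raw reads"]))
```

An input flagged this way is a resource the description still needs to identify, ideally with a persistent identifier.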

Reproducing SOAPdenovo2 results with Galaxy workflows

S. aureus pipeline

[Figure: Galaxy workflow for the S. aureus pipeline with a table of results. Values as extracted: 2241, 400, 30, 119.0, 11, 106, 24, 68, 0; row and column labels are not recoverable from the transcript.]

Publishing findings as nanopublications

A nanopublication (NP) represents structured data along with its provenance in a single publishable and citable entity. It has three parts:

• assertion
• provenance
• publication info
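The three parts of a nanopublication can be pictured as a small data structure. This is only a sketch: real nanopublications are expressed as RDF named graphs, whereas the `Nanopublication` class, its field names and the placeholder values below are hypothetical. The assertion paraphrases the genome coverage finding quoted later in the talk.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class Nanopublication:
    """Three-part container mirroring the structure named on the slide."""
    assertion: dict          # the structured claim itself
    provenance: dict         # how the assertion came about
    publication_info: dict   # metadata about the nanopublication as a whole


# All field names and placeholder values below are hypothetical.
example = Nanopublication(
    assertion={
        "subject": "SOAPdenovo2",
        "compared_with": "SOAPdenovo1",
        "response_variable": "genome coverage",
        "dataset": "human genome",
        "direction": "increased",
    },
    provenance={"derived_from": "<identifier of the analysis workflow run>"},
    publication_info={"creator": "<ORCID of the author>", "created": "<date>"},
)

print(json.dumps(asdict(example), indent=2))
```

Keeping provenance and publication info apart from the assertion is what lets a single finding be cited and traced independently of the article that reports it.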

Abstract & Conclusions

assertion, provenance

Generation of nanopublications for all the results of the response variables

NanoMaton

templates for nanopublications

Prevent priming; report all findings corresponding to the identified response variables

Remain neutral and report all findings of similar importance with the same weight

“genome coverage increased over the human data when comparing SOAPdenovo2 against SOAPdenovo1”

Link conclusions to experimental description
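Reporting all findings for the identified response variables can be checked once those variables are declared as data. The sketch below is an assumption-laden illustration: the `unreported` helper and the draft findings are hypothetical, and only the response variable names come from the experimental plan.

```python
# Response variables as declared in the experimental plan.
RESPONSE_VARIABLES = [
    "genome coverage",
    "computation run time",
    "peak memory consumption",
    "contig N50",
    "scaffold N50",
    "number of errors",
]


def unreported(declared, findings):
    """Return declared response variables that no finding reports on."""
    reported = {finding["response_variable"] for finding in findings}
    return [variable for variable in declared if variable not in reported]


# Hypothetical draft: only two findings have been written up so far.
draft_findings = [
    {"response_variable": "genome coverage", "statement": "<statement>"},
    {"response_variable": "contig N50", "statement": "<statement>"},
]

print(unreported(RESPONSE_VARIABLES, draft_findings))
```

A check like this ties each conclusion back to the experimental description and highlights results that were measured but left out of the narrative.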

http://www.researchobject.org/

Aggregation and workflow preservation as Research Objects

Research Object: enables the aggregation of the digital resources contributing to findings of computational research, including results, data and software, as citable compound digital objects

http://isa-tools.github.io/soapdenovo2
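The idea of aggregating results, data and software into one citable compound object can be sketched as a simple manifest. This is not the Research Object model's actual vocabulary: the `describe` and `build_manifest` helpers, the keys, the file paths and the identifier are all hypothetical placeholders.

```python
import hashlib
import json


def describe(path, content, role):
    """Describe one aggregated resource, with a checksum of its content."""
    return {
        "path": path,
        "role": role,
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def build_manifest(identifier, resources):
    """Bundle resource descriptions under one citable identifier."""
    return {"id": identifier, "aggregates": list(resources)}


# Paths, contents and the identifier are hypothetical placeholders.
manifest = build_manifest(
    "<persistent identifier of the research object>",
    [
        describe("data/reads.fastq", b"@read1\nACGT\n+\nIIII\n", "data"),
        describe("workflow/assembly.ga", b"{}", "workflow"),
        describe("results/assembly_stats.tsv", b"metric\tvalue\n", "result"),
    ],
)

print(json.dumps(manifest, indent=2))
```

Recording a checksum per resource is one way to let a reader confirm that the data and workflow they retrieve are the ones the findings were derived from.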

From narrative to self-described structured data

• Model & workflow assisted experimental description and review process
• Depth and breadth of semantic resources, clear meaning of experimental elements

Team

SOAPdenovo2

Ruibang Luo, University of Hong Kong
Tin-Lap Lee, Chinese University of Hong Kong
Tak-wah Lam, University of Hong Kong

Scott Edmunds, GigaScience
Peter Li, GigaScience

Marco Roos, Leiden University
Mark Thompson, Leiden University
Rajaram Kaliyaperumal, Leiden University
Eelke van der Horst, Leiden University

Jun Zhao, Lancaster University

María Susana Avila García, Oxford University
Philippe Rocca-Serra, Oxford University
Susanna-Assunta Sansone, Oxford University
Alejandra González-Beltrán, Oxford University

Questions?

You can email us: isatools@googlegroups.com

View our blog: http://isatools.wordpress.com

Follow us on Twitter: @isatools

View our websites

View our Git repo & contribute: http://github.com/ISA-tools

Thanks for your attention!
