Practical interoperability across semantic stores of data for ecological, taxonomic, phylogenetic, and metagenomics research

Uploaded by: cyndy-parr

Posted on 01-Sep-2014

DESCRIPTION

Presented at the Biodiversity Information Standards (Taxonomic Databases Working Group) 2013 meeting in Florence, Italy on 31 October 2013. Essentially, an introduction to aspects of the back end of the new trait repository of Encyclopedia of Life.

TRANSCRIPT

Page 1: Practical interoperability across semantic stores of data for ecological, taxonomic, phylogenetic, and metagenomics research

Practical interoperability across semantic stores of data for blah blah blah

eol.org @eol @cydparr

Page 2:

The road to TraitBank

In second year of 2-year project:
• Marine
• Expert Audience
• Conservation science

Virtuoso triple store
<EOL taxon id> <hasAvgBodyMass in g> <value>
<EOL taxon id> <preysOn> <scientific name>
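The two triple patterns above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not TraitBank's actual schema: the taxon page URI pattern, the predicate URIs under http://eol.org/schema/terms/, the taxon id 1, and the helper names are all hypothetical. The output is plain N-Triples text of the kind a triple store can load.

```python
# Hypothetical URI patterns for illustration; real TraitBank identifiers may differ.
EOL_TAXON = "http://eol.org/pages/{taxon_id}"
EOL_TERM = "http://eol.org/schema/terms/{name}"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Render a value as a plain string literal, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def body_mass_triple(taxon_id, grams):
    """<EOL taxon id> <hasAvgBodyMass in g> <value>"""
    return (
        uri(EOL_TAXON.format(taxon_id=taxon_id)),
        uri(EOL_TERM.format(name="hasAvgBodyMass")),
        literal(grams),
    )


def preys_on_triple(taxon_id, scientific_name):
    """<EOL taxon id> <preysOn> <scientific name>"""
    return (
        uri(EOL_TAXON.format(taxon_id=taxon_id)),
        uri(EOL_TERM.format(name="preysOn")),
        literal(scientific_name),
    )


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples([
        body_mass_triple(1, 4200),
        preys_on_triple(1, "Equus quagga"),
    ]))
```

Note that the prey is stored here as a scientific-name string, following the slide's pattern; resolving that name to a taxon identifier would be a separate step.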

Beta testing NOW for public launch early 2014
21 datasets with 2.8 million data records for 520,000 taxa

Harvest, display, curate, search, download

MOST DATA NOT BORN SEMANTIC

• From text mining
• From literature tables
• From data papers
• From databases

Page 3:

Term URIs from existing ontologies, e.g. those registered in bioportal.bioontologies.org

• Statistics from Semantic Science Integrated Ontology
• Units Ontology
• Environments Ontology EnvO
• Gene Ontology
• ETHAN (Natural history, with Joel Sachs)
• Vertebrate Trait Ontology
• Plant Trait Ontology

• Where necessary: request terms
• Last resort: create provisional terms with http://eol.org/schema/terms/xxxx
• Of course, also using unique EOL taxon identifiers, which we’ve mapped to identifiers of other projects
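The term policy on this slide (reuse an existing ontology term where one exists, fall back to a provisional EOL term as a last resort) can be sketched as a lookup with a fallback. This is a minimal sketch under assumptions: the registry contents use placeholder example.org URIs rather than real ontology identifiers, and the camelCase naming of provisional terms is a guess, since the slide shows only http://eol.org/schema/terms/xxxx.

```python
import re

# Stand-in registry of terms already defined in existing ontologies.
# These URIs are placeholders, not verified ontology identifiers.
KNOWN_TERMS = {
    "body mass": "http://example.org/ontology/body_mass",
    "habitat": "http://example.org/ontology/habitat",
}

# Base URI for provisional terms, as given on the slide.
PROVISIONAL_BASE = "http://eol.org/schema/terms/"


def provisional_uri(label):
    """Build a provisional term URI; the camelCase slug scheme is an assumption."""
    words = re.findall(r"[A-Za-z0-9]+", label)
    if not words:
        raise ValueError("label has no usable characters")
    slug = words[0].lower() + "".join(w.capitalize() for w in words[1:])
    return PROVISIONAL_BASE + slug


def resolve_term(label):
    """Return (uri, is_provisional), preferring terms from existing ontologies."""
    key = " ".join(label.lower().split())  # normalize case and whitespace
    if key in KNOWN_TERMS:
        return KNOWN_TERMS[key], False
    return provisional_uri(label), True


if __name__ == "__main__":
    print(resolve_term("Body  Mass"))
    print(resolve_term("clutch size"))
```

Returning the provisional flag alongside the URI makes it easy to list the terms that still need to be requested from an ontology maintainer.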

Page 4:

Known URIs tool

Only light reasoning so far: just to infer inverse relationships like “eats” and “is eaten by”
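The inverse-relationship inference mentioned above can be sketched in a few lines. This is an illustrative sketch, not the reasoning the triple store itself performs: the predicate names `eats` and `isEatenBy` and the plain-string taxa are assumptions for the example.

```python
# Declared inverse pairs; the predicate names are illustrative assumptions.
INVERSE_PAIRS = [("eats", "isEatenBy")]

# Make the table symmetric so either direction can be looked up.
INVERSES = {}
for forward, backward in INVERSE_PAIRS:
    INVERSES[forward] = backward
    INVERSES[backward] = forward


def infer_inverses(triples):
    """Return the inverse statements implied by `triples` but not yet asserted."""
    asserted = set(triples)
    inferred = set()
    for subject, predicate, obj in asserted:
        inverse = INVERSES.get(predicate)
        if inverse is None:
            continue  # no inverse declared for this predicate
        candidate = (obj, inverse, subject)
        if candidate not in asserted:
            inferred.add(candidate)
    return inferred


if __name__ == "__main__":
    data = [("Panthera leo", "eats", "Equus quagga")]
    print(infer_inverses(data))
```

Because each pair is declared once and applied in both directions, a dataset that records only "is eaten by" statements yields the corresponding "eats" statements as well.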

Page 5:

• 14 datasets with 25k taxa, 422k interactions, for 3k locations
• alpha version of ingestion, normalization, aggregation
• alpha version of web API
• alpha version of data exports

GLoBI http://globalbioticinteractions.wordpress.com/
Jorrit Poelen, Chris Mungall, James Simon (GoMexSi)
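As a rough picture of what "normalization, aggregation" can mean for interaction records, the sketch below tidies taxon names and counts duplicate interactions across datasets. It is a generic illustration and not GLoBI's actual pipeline: the record layout, the name-tidying rule, and the function names are all assumptions.

```python
from collections import Counter


def normalize_name(name):
    """Collapse whitespace and apply binomial-style casing (Genus species)."""
    parts = name.split()
    if not parts:
        return ""
    return " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])


def aggregate(records):
    """Count (source, interaction, target) records after normalizing names."""
    counts = Counter()
    for source, interaction, target in records:
        key = (normalize_name(source), interaction, normalize_name(target))
        counts[key] += 1
    return counts


if __name__ == "__main__":
    records = [
        ("panthera  leo", "preysUpon", "Equus quagga"),
        ("Panthera leo", "preysUpon", "equus QUAGGA"),
    ]
    for key, n in aggregate(records).items():
        print(key, n)
```

Real name normalization usually also involves resolving synonyms against a taxonomic authority, which this sketch does not attempt.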

Page 6:

GLoBI ontology work
https://github.com/jhpoelen/eol-globi-data/tree/master/eol-globi-ontology

• Interaction processes from Gene Ontology
• Relations from OBO Relations Ontology
• Life cycle stages and body parts from UBERON
• Observation and specimen terms from various
• Behaviors from NeuroBehaviorOntology and Habitat keywords from Environment Ontology

New terms: /eats, /interactsWith, /preysUpon, /hasHost, /hosts, /parasitizes

Page 7:

Adding data

Page 8:

To do

• Term evaluation and recommendations
• Map similar terms
• Map terms to upper ontology like Species Profile Model
• Leverage reasoning for data validation

To access the Beta test, happening NOW, send your EOL login to:

@cydparr [email protected]