Synthesising disparate data resources to obtain composite estimates of geophylogeny

Post on 18-Dec-2014

DESCRIPTION

Invited talk to the 2nd BioVeL workshop, Gothenburg, Sweden, 10 May 2012

TRANSCRIPT

SYNTHESISING DISPARATE DATA RESOURCES TO OBTAIN COMPOSITE ESTIMATES OF GEOPHYLOGENY

Rutger Vos

A simple assignment?

Refine a tree for the Primates with taxonomic and systematic data

Add divergence dates

Add occurrence data

Visualize the result

Use public web services

Actually not so easy…

The Tree of Life Web Service

Using PhyloWS we traversed the Tree of Life and built a local, semantically annotated copy of the Primate clade

Adding taxonomic metadata

Using the uBio PhyloWS service we enhanced our tree with further taxonomic annotations and links, and expanded some genera

Fetching additional tree data

Using the TreeBASE PhyloWS service we fetched additional data to resolve the tree further using a “supertree” approach
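
The talk does not say which supertree method was used. As one common possibility, the sketch below shows matrix representation with parsimony (MRP) coding in Python: each non-trivial clade of each source tree becomes one binary character, with "?" for taxa absent from that tree. The nested-tuple tree format and the function names are assumptions made for illustration, and the usual all-zero outgroup row is omitted.

```python
def clades(tree):
    """Return (all_tips, informative_clades) for a tree of nested tuples.

    Tips are strings; internal nodes are tuples of subtrees. This toy
    representation is an assumption, not a format used in the talk.
    """
    found = []

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        tips = frozenset().union(*(walk(child) for child in node))
        found.append(tips)
        return tips

    all_tips = walk(tree)
    # The root clade contains every tip and carries no grouping information.
    return all_tips, [clade for clade in found if clade != all_tips]


def mrp_matrix(trees):
    """Code every clade of every source tree as one 0/1/? character."""
    per_tree = [clades(tree) for tree in trees]
    taxa = sorted(set().union(*(tips for tips, _ in per_tree)))
    matrix = {taxon: "" for taxon in taxa}
    for tips, tree_clades in per_tree:
        for clade in tree_clades:
            for taxon in taxa:
                if taxon not in tips:
                    matrix[taxon] += "?"   # taxon not sampled in this tree
                elif taxon in clade:
                    matrix[taxon] += "1"   # taxon inside the clade
                else:
                    matrix[taxon] += "0"   # sampled, but outside the clade
    return matrix


# Two toy source trees with partly overlapping taxon sets.
source_trees = [
    (("Homo", "Pan"), "Gorilla"),
    ((("Homo", "Gorilla"), "Pongo"), "Macaca"),
]
matrix = mrp_matrix(source_trees)
```

The resulting matrix would then be handed to a parsimony program, which is one of the computationally intensive steps that no web service covered.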

Computing node ages

The TimeTree PhyloWS service allowed us to anchor molecular (i.e. relative) node ages on absolute dates
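
A minimal sketch of what anchoring relative ages on an absolute date can mean, assuming the simplest case: a single calibration point and proportional rescaling of all node ages. The function name, node labels, and numbers are placeholders, and the talk does not state that this exact procedure was used.

```python
def calibrate_ages(relative_ages, calibration_node, calibration_age_mya):
    """Rescale relative node ages so one node matches a known absolute age.

    relative_ages: dict of node label -> relative age (arbitrary units).
    calibration_node: label of the node with a known absolute age.
    calibration_age_mya: that node's age in millions of years.
    Assumes one calibration and a constant scaling factor for all nodes.
    """
    reference = relative_ages[calibration_node]
    if reference <= 0:
        raise ValueError("calibration node needs a positive relative age")
    scale = calibration_age_mya / reference
    return {node: age * scale for node, age in relative_ages.items()}


# Placeholder values, not real divergence dates.
relative = {"root": 1.0, "nodeA": 0.5, "nodeB": 0.25}
absolute = calibrate_ages(relative, "nodeA", 10.0)
```

With several calibration points the rescaling is no longer a single multiplication, and a dedicated dating method would be needed.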

Adding occurrence data

Using the GBIF XML API, we then fetched occurrence records for the species in our tree

Visualizing the result

Implementation

Except for GBIF, all services return NeXML and implement PhyloWS

Semantic annotations using RDFa

Glued together with Perl
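
The original glue code was Perl. As a rough Python illustration of what reading RDFa-style annotations from NeXML involves, the sketch below pulls the `meta` annotations off each `otu` element. The sample document is hand-written and simplified, has not been validated against the NeXML schema, and uses a made-up `ex:` vocabulary on example.org.

```python
import xml.etree.ElementTree as ET

NEXML_NS = "http://www.nexml.org/2009"

# Hand-written, simplified sample; the ex: predicates are hypothetical.
SAMPLE = """<nex:nexml version="0.9"
    xmlns:nex="http://www.nexml.org/2009"
    xmlns="http://www.nexml.org/2009"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ex="http://example.org/terms#">
  <otus id="otus1">
    <otu id="otu1" label="Homo sapiens">
      <meta id="m1" xsi:type="nex:ResourceMeta"
            rel="ex:taxonPage" href="http://example.org/taxon/1"/>
      <meta id="m2" xsi:type="nex:LiteralMeta"
            property="ex:rank" content="species"/>
    </otu>
    <otu id="otu2" label="Pan troglodytes"/>
  </otus>
</nex:nexml>"""


def otu_annotations(nexml_text):
    """Map each OTU label to its list of (predicate, value) annotations."""
    root = ET.fromstring(nexml_text)
    result = {}
    for otu in root.iter(f"{{{NEXML_NS}}}otu"):
        pairs = []
        for meta in otu.findall(f"{{{NEXML_NS}}}meta"):
            # Resource annotations use rel/href; literal ones use property/content.
            predicate = meta.get("rel") or meta.get("property")
            value = meta.get("href") or meta.get("content")
            pairs.append((predicate, value))
        result[otu.get("label")] = pairs
    return result


annotations = otu_annotations(SAMPLE)
```

Because the services share this output format, the same small reader could in principle be reused on each response before merging the annotations into the local tree.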

Challenges

Although some services have the same API, no GUI exists to chain them together

No web services for computationally intensive steps

Data and metadata are messy and sparse

Conclusions

The tree of life can be covered with all sorts of metadata (taxonomic, molecular, biogeographic, paleontological), viewable in different ways

Standards are still incompletely defined and incompletely adhered to, though

Shameless plug: PhyloTastic

A web service to extract subsets of taxa from megatrees and annotate them

Deliverable of the first HIP hackathon, at NESCent, in June 2012
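
The core operation described here, extracting a subset of taxa from a larger tree, can be sketched in Python as recursive pruning. This is an illustration on a toy nested-tuple tree, not PhyloTastic's actual implementation; the function name and the tree format are assumptions.

```python
def prune(tree, keep):
    """Return the subtree induced by the taxa in `keep`, or None if empty.

    Tips are strings; internal nodes are tuples of subtrees. Internal
    nodes left with a single child are collapsed into that child.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [sub for sub in (prune(child, keep) for child in tree)
            if sub is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]  # collapse a node that no longer branches
    return tuple(kept)


# Toy "megatree" and a requested subset of taxa.
megatree = ((("Homo", "Pan"), "Gorilla"), ("Macaca", "Papio"))
subset = prune(megatree, {"Homo", "Gorilla", "Macaca"})
```

Annotation of the pruned tree would follow as a separate step, along the lines of the services chained together earlier in the talk.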

Acknowledgements
