hardisty roberts tdwg_301013_min

ViBRANT Virtual Biodiversity e-infrastructure, SEVENTH FRAMEWORK PROGRAMME. A decadal view of biodiversity informatics: challenges and priorities. Alex Hardisty, Dave Roberts, and the biodiversity informatics community* (* 80 people took part in the open debate that led to this paper)

Upload: vibrantmanager

Post on 13-Jan-2015


DESCRIPTION

TDWG (Firenze, 30 Oct 2013). Description of community view of priorities for future work in biodiversity informatics.

TRANSCRIPT

Page 1: Hardisty roberts tdwg_301013_min

ViBRANT Virtual Biodiversity

e-infrastructure, SEVENTH FRAMEWORK PROGRAMME

A decadal view of biodiversity informatics: challenges and priorities

Alex Hardisty, Dave Roberts, and the biodiversity informatics community*

* 80 people took part in the open debate that led to this paper

Page 2: Hardisty roberts tdwg_301013_min


“We are drowning in information, while starving for wisdom. The world henceforth will be run by synthesizers, people able to put together the right information at the right time, think critically about it, and make important choices”

E. O. Wilson, "Consilience: The Unity of Knowledge" (1998)

Page 3: Hardisty roberts tdwg_301013_min


Time to model all life on Earth.

Purves et al. (2013) Nature, 493: 295-297


Page 4: Hardisty roberts tdwg_301013_min


The Grand Challenge for Biodiversity Informatics

An infrastructure to allow the available data to be brought into a coordinated coupled modelling environment, capable of addressing questions relating to our use of the natural environment, that captures the variety, distinctiveness and complexity of all life on Earth

To achieve it we need:

• To build user confidence
• Integrative flexible e-Science environments
• Predictive models across multiple scales, coupled

Page 5: Hardisty roberts tdwg_301013_min


1. Open Data should be normal practice;

Page 6: Hardisty roberts tdwg_301013_min


2. Data encoding should allow analysis across multiple scales;

Page 7: Hardisty roberts tdwg_301013_min


3. Infrastructure projects should devote significant resources to market the service they develop;

Page 8: Hardisty roberts tdwg_301013_min


Actinobacillus actimomycetemcomitans
Actinobacillus actimycetemcomitans
Actinobacillus actinmycetemcomitans
Actinobacillus actinomicetemcomitans
Actinobacillus actinomy
Actinobacillus actinomyce
Actinobacillus actinomycemcomitans
Actinobacillus actinomyceremcomitans
Actinobacillus actinomycetam
Actinobacillus actinomycetamcomitans
Actinobacillus actinomycetecomitans
Actinobacillus actinomycetemcmitans
Actinobacillus actinomycetemcomintans
Actinobacillus actinomycetemcomitance
Actinobacillus actinomycetemcomitans
Actinobacillus actinomycetemcomitants
Actinobacillus actinomycetemcommitans
Actinobacillus actinomycetemocimitans
Actinobacillus actinomycetencomitans
Actinobacillus actinomycetum
Actinobacillus actinomyctemcomitans
Actinobacillus actinomyectomcomitans
Actinobacillus actinomyetemcomitans
Actinobacillus actinonmycetemcomitans
Actinobacillus actionomycetemcomitans
Actinobacillus actynomicetemcomitans
Actinobacillus antinomycetemcomitans

Difficulties with Latinized Names: Transcription errors

Nomenclator provides correct spelling. Indexing infrastructure resolves to it.

Names as strings of characters…

4. A list of taxon names

DOI: 10.4289/0013-8797.115.1.75

5. Persistent Identifiers
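The idea that a nomenclator holds the correct spelling and an index resolves variants to it can be sketched with approximate string matching. This is only an illustration, not the method of any real name service: the one-entry `NOMENCLATOR` list, the `resolve_name` function and the 0.8 similarity cutoff are all assumptions made for the example, and the matching comes from Python's standard `difflib` module.

```python
import difflib

# Hypothetical one-entry nomenclator; a real one would hold many names.
NOMENCLATOR = ["Actinobacillus actinomycetemcomitans"]


def resolve_name(raw, nomenclator=NOMENCLATOR, cutoff=0.8):
    """Return the closest nomenclator entry for a raw name string, or None.

    The cutoff is an assumed similarity threshold (0 to 1); names that are
    less similar than this to every entry are left unresolved.
    """
    matches = difflib.get_close_matches(raw, nomenclator, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for variant in ("Actinobacillus actinomycetemcomitants",
                    "Actinobacillus antinomycetemcomitans",
                    "Actinobacillus actinomy"):
        print(variant, "->", resolve_name(variant))
```

With this cutoff, single-letter slips resolve to the correct spelling, while a heavily truncated string such as "Actinobacillus actinomy" is left unresolved. That suggests truncations may need separate handling, for example prefix matching.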

Page 9: Hardisty roberts tdwg_301013_min


6. Author identifiers

Page 10: Hardisty roberts tdwg_301013_min


7. 3rd party authentication

Page 11: Hardisty roberts tdwg_301013_min


8. Classification Bank

Actinomycetes: the antibiotic factories

[Figure: classification of the Actinomycetes. Sources: Embley & Stackebrandt (1994); Bergey’s Manual, 2nd Edition (2012). Labelled groups: Atopobium minutum; Sphaerobacter thermophilus; strain TH3; Bifidobacteriaceae; Actinomycetaceae; Arthrobacteriaceae, Cellomonadaceae, Microbacteriaceae, Dermatophilaceae and relatives; Propionibacteriaceae; Nocardioidaceae; Frankiaceae; Corynebacteriaceae, Mycobacteriaceae, Nocardiaceae and relatives; Actinoplanaceae; Pseudonocardiaceae; Streptomycetaceae, Streptosporangiaceae and relatives. Annotation: insertion element in 23S rRNA.]

Page 12: Hardisty roberts tdwg_301013_min


9. Accepted names

[Screenshot: Catalogue of Life website, Dynamic Edition. © 2013, Species 2000 at University of Reading]

'The most comprehensive and authoritative global index of species currently available, the Catalogue of Life consists of a single integrated checklist and taxonomic hierarchy for all the world's species.'

This Dynamic Edition is a constantly evolving version of the Catalogue of Life. The Annual Checklist is a snapshot of the entire Catalogue of Life: a fixed imprint.

Now tracking 70% of species known to science: 1,315,754 species

I.P.N.I.

Page 13: Hardisty roberts tdwg_301013_min


10. Tools to make LOD

Implicit semantics. Humans are good at interpreting this:

“Compound 2a melted at 119 °C”

Explicit semantics (CML Schema). Machines need this:

    <cml:molecule ref="2a">
      <cml:property>
        <cml:scalar dictRef="prop:mpt" units="units:celsius" dataType="xsd:float">119</cml:scalar>
      </cml:property>
    </cml:molecule>

Molecules in CML/InChI; propertyDictionary; unitsDictionary; W3C Schema

4 namespaces, 3 dictionaries
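To illustrate why the explicit form suits machines, the sketch below reads the melting point, its units and its dictionary reference straight out of the markup using Python's standard `xml.etree.ElementTree`. It rests on assumptions: the slide does not declare its namespaces, so a namespace URI for the `cml` prefix is added here to make the fragment parseable, and the `read_scalars` helper is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the cml prefix; the slide omits the declaration.
CML_NS = "http://www.xml-cml.org/schema"

DOC = f"""<cml:molecule xmlns:cml="{CML_NS}" ref="2a">
  <cml:property>
    <cml:scalar dictRef="prop:mpt" units="units:celsius"
                dataType="xsd:float">119</cml:scalar>
  </cml:property>
</cml:molecule>"""


def read_scalars(xml_text):
    """Return one dict per cml:scalar element found in the molecule."""
    root = ET.fromstring(xml_text)
    results = []
    for scalar in root.iter(f"{{{CML_NS}}}scalar"):
        results.append({
            "molecule": root.get("ref"),         # which compound
            "property": scalar.get("dictRef"),   # what was measured
            "units": scalar.get("units"),        # in which units
            "value": float(scalar.text),         # the number itself
        })
    return results


if __name__ == "__main__":
    print(read_scalars(DOC))
```

Nothing here depends on parsing the English sentence: the compound, property, units and value each arrive as separate, labelled fields.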

Page 14: Hardisty roberts tdwg_301013_min


GBIF/GBIC – 2-4 Jul 2012 – Copenhagen, © 2012, R. J. Robbins

The generation of important new insights while handicapped with limited technology, indirect measurement, and fuzzy data is the mark of scientific greatness.

Page 15: Hardisty roberts tdwg_301013_min


11. Data fit for purpose

Data are received at face-value, examined and tested. If the user is satisfied, then the data will be applied.
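One way to read this is as a filter whose tests are chosen by the user rather than by the data provider. The sketch below is a minimal illustration under that assumption. The `Occurrence` record, its fields and the two example checks are hypothetical and do not come from the slides or from any particular data standard.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Occurrence:
    """Hypothetical occurrence record with only the fields the checks need."""
    name: str
    latitude: Optional[float]
    longitude: Optional[float]
    year: Optional[int]


Check = Callable[[Occurrence], bool]


def has_coordinates(rec: Occurrence) -> bool:
    """Example test: the record carries both coordinates."""
    return rec.latitude is not None and rec.longitude is not None


def recorded_since(first_year: int) -> Check:
    """Build an example test: the record is dated first_year or later."""
    def check(rec: Occurrence) -> bool:
        return rec.year is not None and rec.year >= first_year
    return check


def fit_for_purpose(records: List[Occurrence],
                    checks: List[Check]) -> List[Occurrence]:
    """Keep only the records that pass every user-supplied test."""
    return [rec for rec in records if all(check(rec) for check in checks)]
```

A user modelling recent distributions might call `fit_for_purpose(records, [has_coordinates, recorded_since(2000)])`, while a user reconstructing historical ranges would pass a different set of checks over the same records.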

Page 16: Hardisty roberts tdwg_301013_min


12. Observational data infrastructure

Agriculture Systems

Climate Change

Forest Management

Invasion Biology

Urban Ecosystems

http://www.teamnetwork.org

http://www.earthobservations.org/geobon.shtml

http://www.eubon.eu

http://www.neoninc.org

http://mooreabiocode.org/

Moorea Biocode Project

http://www.ilternet.edu

Page 17: Hardisty roberts tdwg_301013_min

GBIO Document

Courtesy of Donald Hobern: http://tinyurl.com/BIH13-hobern

http://www.biodiversityinformatics.org/

Page 18: Hardisty roberts tdwg_301013_min

RESEARCH INFRASTRUCTURE INVESTMENTS

OTHER INFORMATION

DOMAINS

ASSESSMENTS AND INDICATORS

GBIO Framework

Courtesy of Donald Hobern: http://tinyurl.com/BIH13-hobern

Page 19: Hardisty roberts tdwg_301013_min

Focus Area: Evidence

• Organised views of biodiversity data
  – Consistent assessment of quality and fitness-for-use
  – Comprehensive digital nomenclature and taxonomy
  – Access to all evidence for recorded species occurrence
  – Access to species traits, measurements and interactions
  – Services and interfaces to access data as needed

• Provide comprehensive organised views of all relevant data

• Act as a “lens” into primary data

Courtesy of Donald Hobern: http://tinyurl.com/BIH13-hobern

Page 20: Hardisty roberts tdwg_301013_min

Structuring the biodiversity informatics community at the European level and beyond

http://tinyurl.com/oalvv8r

The biodiversity informatics community needs:

• Clarity of vision, greater focus on end-goals;
• Good, simple tools with syntactic operability;
• Community identity;
• Better links within our community and with other disciplines: ecology, agriculture, socioeconomics, remote sensing, etc.

We have a lot of data. Now we need to show that those data are actually useful.

What questions can these data address? Stop mobilising just any data. Invert the system and direct what data are to be recovered by the question that is being addressed. This will also dictate the quality level.

Page 21: Hardisty roberts tdwg_301013_min


To build user confidence

Thus far, all projects share a common problem: keeping services running after project funding has ended

New models are needed

To create translational pipelines to industry adoption

To encourage institutional adoption for care and maintenance

For recognition of contribution other than through publication of academic papers

Stronger marketing and outreach

Invest more in up-skilling and hand-holding

Page 22: Hardisty roberts tdwg_301013_min


Integrative flexible e-Science environments

Using standardised building blocks and workflows

Interoperable components

With access to data from multiple sources

Recognise different kinds of VRE

General-purpose / specialised / single scientific objective

- cf. chemistry laboratory vs forensics lab vs HIV vaccine lab

- Scratchpads & BioVeL / AquaMaps and iMarine / CarbonWaterCloud

Must generate immediate benefit for users

Science driven, with scientists as active participants in creation of infrastructure

Functions people find useful: simple and intuitive

Technology invisible (disappears into background)

Page 23: Hardisty roberts tdwg_301013_min


Predictive models across multiple scales

A new framework of methods, techniques, standards to bring about interoperability of data and models across different biological scales

From genetic through species and ecosystem to landscape

Learn from the Virtual Physiological Human and from numerical weather prediction and climatology: Edwards (2010), A Vast Machine

“General Ecological Models” Purves et al. (2013). doi:10.1038/493295a

Evolvable to incorporate new scientific insights

Re-analysis models

Making data we have global

Implies ‘inversion’ of existing infrastructure

‘Inversion’ of existing infrastructure means re-examining every element of data we have in order to reconstruct past biodiversity, as a guide and calibrator for models that can predict the future

Page 24: Hardisty roberts tdwg_301013_min

http://h2020.myspecies.info

ViBRANT Virtual Biodiversity

Structuring the biodiversity informatics community at the European level and beyond

Our goal, sine qua non, is to deliver predictive modelling of the biosphere.