Introduction to EOL.org for scientists


Upload: cyndy-parr

Post on 01-Sep-2014


DESCRIPTION

A talk given at the Semantic Reasoning workshop held at the National Museum of Natural History September 6, 2012. The audience included computer scientists and biological scientists interested in using EOL for their research.

TRANSCRIPT

Page 1: Introduction to EOL.org for scientists

@cydparr @eol

Cynthia Parr

Semantic reasoning workshop

Washington, DC, 6-7 September 2012

Introduction to eol.org

Page 2: Introduction to EOL.org for scientists

Whirlwind tour

• What kind of information we have
• How we assemble that information
• How machines and people interact with EOL
• Next steps

Page 3: Introduction to EOL.org for scientists

More than 1.1 million taxon pages with content from more than 200 providers and thousands of individuals

5 million content objects

Page 4: Introduction to EOL.org for scientists

Details tab

Leafy Seadragon example

Page 5: Introduction to EOL.org for scientists

Total of 1,344,711 images, 9,586 videos, and 28,569 sounds

Page 6: Introduction to EOL.org for scientists

Maps

Page 7: Introduction to EOL.org for scientists

Literature

Page 8: Introduction to EOL.org for scientists

EOL has Global Partners and is internationalized

China

Australia

Dutch

South Africa

Costa Rica

Mexico

Egypt

India

Colombia

Peru

Taiwan

Norway

USA

Page 9: Introduction to EOL.org for scientists

EOL summarizes knowledge

Erosaria caputserpentis

Serpent's Head Cowrie

Depth range based on 51 specimens in 2 taxa.

Water temperature and chemistry ranges based on 40 samples.

Environmental ranges
• Depth range (m): -5 - 67
• Temperature range (°C): 23.011 - 28.496
• Nitrate (umol/L): 0.048 - 0.923
• Salinity (PPS): 33.821 - 35.837
• Oxygen (ml/l): 4.349 - 4.825
• Phosphate (umol/l): 0.088 - 0.228
• Silicate (umol/l): 0.983 - 4.026

From Moorea Biocode

From GBIF

From OBIS

Page 10: Introduction to EOL.org for scientists

Erosaria caputserpentis

Serpent's Head Cowrie

Salinity envelope (n=40)

From OBIS

Page 11: Introduction to EOL.org for scientists

Cynthia Parr

Species Pages Group

Global Content Summit

17-19 Jan 2011

Richness scores

http://eol.org/pages/704102

Page 12: Introduction to EOL.org for scientists

Whirlwind tour

• What kind of information we have
• How we assemble that information
  – Big picture
  – Subject semantics
  – Names infrastructure
  – Curation
  – Richness score
• How machines and people interact with EOL
• Next steps

Page 13: Introduction to EOL.org for scientists

EOL aggregates and curates

Scientific databases, including BHL, GBIF, ALA, INBio, COL, Scratchpads, LifeDesks

Scientific journals

Aggregate

eol.org

Quality control

Curate

Comment, Rate, Collect

Page 14: Introduction to EOL.org for scientists

EOL v2

Plinian Core

DwC description

SPM infoitem

using Darwin Core Archive flat files as transport mechanism

Sharing process adds semantics to content objects

Page 15: Introduction to EOL.org for scientists

[Figure: bar chart of the number of text objects (axis 0 to 800,000) by subject of text object. Subjects shown: Distribution, MolecularBiology, Multiple topics, TypeInformation, Habitat, ConservationStatus, Threats, Morphology, Conservation, Management, Trends, Size, Associations, Uses, TrophicStrategy, Cyclicity & Life Cycle, PopulationBiology, Reproduction, Migration, Taxonomy, LifeExpectancy, Identification, Behaviour, Ecology, Diseases]

Page 16: Introduction to EOL.org for scientists

Content objects are associated with taxon names

Wikimedia Commons: Physeter macrocephalus

(note we actually have over 3.3 million named pages)

Page 17: Introduction to EOL.org for scientists

Names from different providers are matched

Animal Diversity Web .... Physeter catodon Linnaeus, 1758
ARKive .................. Physeter macrocephalus Linné
BioPix .................. Physeter macrocephalus L.
INBio ................... Physeter catodon
IUCN .................... Physeter Macrocephalus
ITIS .................... Physeter macrocephalus Linnaeus, 1758
MarLIN .................. Physeter macrocephalus Linné
NCBI .................... Physeter Catodon
Species 2000 ............ Physeter macrocephalus Linnaeus, 1758
Taxon Concept ........... Physeter australasianus Desmoulins, 1822
Wikimedia Commons ....... Physeter macrocephalus
WORMS ................... Physeter macrocephalus Linnaeus 1758

Physeter macrocephalus
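As a rough illustration of what matching these strings involves, here is a minimal Python sketch. It is not EOL's actual names infrastructure: `canonical_form`, `resolve`, and the `SYNONYMS` table are hypothetical helpers introduced only for this example. The sketch handles simple binomials only, and the synonym entries are read off the slide, where every listed name ends up on the Physeter macrocephalus page.

```python
# Simplified illustration only; NOT EOL's real name-matching pipeline.
# canonical_form, resolve and SYNONYMS are hypothetical names for this sketch.

def canonical_form(name_string):
    """Reduce a provider name string to 'Genus epithet', dropping authorship.

    Only simple binomials are handled; subspecies, hybrids, etc. are not.
    """
    tokens = name_string.split()
    if len(tokens) < 2:
        return name_string.strip().capitalize()
    genus, epithet = tokens[0], tokens[1]
    # Normalize capitalization: "Physeter Macrocephalus" -> "Physeter macrocephalus"
    return f"{genus.capitalize()} {epithet.lower()}"


# Synonymy as implied by the slide: all names map to one taxon page.
SYNONYMS = {
    "Physeter catodon": "Physeter macrocephalus",
    "Physeter australasianus": "Physeter macrocephalus",
}


def resolve(name_string, synonyms=SYNONYMS):
    """Return the preferred canonical name for a provider name string."""
    canonical = canonical_form(name_string)
    return synonyms.get(canonical, canonical)


# A few of the provider strings from the slide.
provider_names = {
    "Animal Diversity Web": "Physeter catodon Linnaeus, 1758",
    "IUCN": "Physeter Macrocephalus",
    "NCBI": "Physeter Catodon",
    "Taxon Concept": "Physeter australasianus Desmoulins, 1822",
    "WORMS": "Physeter macrocephalus Linnaeus 1758",
}
matched = {provider: resolve(name) for provider, name in provider_names.items()}
```

String cleanup alone resolves the capitalization and authorship variants; the synonyms (catodon, australasianus) need a curated lookup, which is why the later slides stress curation.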

Page 18: Introduction to EOL.org for scientists

Taxon concept pages: multiple hierarchies on Names tab

Page 19: Introduction to EOL.org for scientists

Problem: one taxon may have several names

Animal Diversity Web .... Physeter catodon Linnaeus, 1758
ARKive .................. Physeter macrocephalus Linné
BioPix .................. Physeter macrocephalus L.
INBio ................... Physeter catodon
IUCN .................... Physeter Macrocephalus
ITIS .................... Physeter macrocephalus Linnaeus, 1758
MarLIN .................. Physeter macrocephalus Linné
NCBI .................... Physeter Catodon
Species 2000 ............ Physeter macrocephalus Linnaeus, 1758
Taxon Concept ........... Physeter australasianus Desmoulins, 1822
Wikimedia Commons ....... Physeter macrocephalus
WORMS ................... Physeter macrocephalus Linnaeus 1758

Page 20: Introduction to EOL.org for scientists

Problem: the same name may apply to more than one taxon

Page 21: Introduction to EOL.org for scientists

EOL curation

• Trust or untrust taxon associations
• Add new taxon association
• Set preferred hierarchies
• Set preferred common names
• Leave comments

Coming: Taxonomic concept curation

Page 22: Introduction to EOL.org for scientists

EOL is not Wikipedia

…though we have more than 212,000 Wikipedia articles and 115,000 Wikimedia images

Can’t currently edit within text objects

Page 23: Introduction to EOL.org for scientists

Whirlwind tour

• What kind of information we have
• How we assemble that information
• How machines and people interact with EOL
  – API
  – Third party apps
  – Collections and communities
• Next steps

Page 24: Introduction to EOL.org for scientists

EOL enables machine interaction

Curate

Comment, Rate, Collect

eol.org

Aggregate

API

Third party apps

Page 25: Introduction to EOL.org for scientists

Third party applications

eol.org/api

Page 26: Introduction to EOL.org for scientists

People interact with EOL content & each other

Collections

Communities

Page 27: Introduction to EOL.org for scientists

Studies currently underway with University of Maryland

• Cross-cultural study on motivation to engage in citizen science – Dana Rotman

• Interaction among scientists and non-scientists on EOL’s social network – Jae-wook Ahn

• Website traffic analysis to aid conservation communication – Yurong He and Bill Fagan

Page 28: Introduction to EOL.org for scientists

Whirlwind tour

• What kind of information we have
• How we assemble that information
• How machines and people interact with EOL
• Next steps

Page 29: Introduction to EOL.org for scientists

Using EOL collections to get computable data

Step 1: Search on EOL for organisms with characteristics of interest. Add each one to an EOL collection.

Step 2: Write a program using EOL API methods to retrieve the external database identifiers for the species in that collection.

Step 3: Add to your program code to retrieve data using external database APIs.

Step 4: Analyze, rinse, repeat.

From Arthur Chapman
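A minimal Python sketch of Step 2 might look like the following. Everything specific in it is an assumption for illustration rather than documented EOL behavior: the URL layout under `eol.org/api`, the JSON field names (`collection_items`, `object_type`, `object_id`, `taxonConcepts`, `nameAccordingTo`, `sourceIdentifier`), and all function names. Check the API documentation at eol.org/api before relying on any of them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# ASSUMPTION: base URL and path layout; verify against eol.org/api.
API_BASE = "https://eol.org/api"


def collection_url(collection_id, page=1, per_page=50):
    """Build the (assumed) URL for one page of a collection's items."""
    query = urlencode({"page": page, "per_page": per_page})
    return f"{API_BASE}/collections/1.0/{collection_id}.json?{query}"


def page_url(page_id):
    """Build the (assumed) URL for a taxon page record."""
    return f"{API_BASE}/pages/1.0/{page_id}.json"


def fetch_json(url):
    """Download a URL and decode its JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def taxon_page_ids(collection_json):
    """Pick out taxon page ids from a collection response (assumed shape)."""
    items = collection_json.get("collection_items", [])
    return [item["object_id"] for item in items
            if item.get("object_type") == "TaxonConcept"]


def external_identifiers(page_json):
    """Map provider name -> that provider's identifier (assumed field names)."""
    result = {}
    for concept in page_json.get("taxonConcepts", []):
        provider = concept.get("nameAccordingTo")
        identifier = concept.get("sourceIdentifier")
        if provider and identifier:
            result[provider] = identifier
    return result


def identifiers_for_collection(collection_id, fetch=fetch_json):
    """Step 2: external identifiers for every taxon in a collection.

    Reads only the first page of collection items; a real script would
    loop over pages.
    """
    page_ids = taxon_page_ids(fetch(collection_url(collection_id)))
    return {pid: external_identifiers(fetch(page_url(pid))) for pid in page_ids}
```

The `fetch` parameter lets the logic be exercised against canned responses without network access. For Step 3, the returned identifiers would be passed to each external database's own API.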

Page 30: Introduction to EOL.org for scientists

Crowd-sourcing for computable data

Lovell and Libby Langstroth, Calphotos

Foodwebs.org

Page 31: Introduction to EOL.org for scientists

Efforts underway

Phylogenetic trees: Collaboration with Open Tree of Life project for draft tree

Computable data challenge

http://eol.org/info/data_challenge

Rod Page’s Bionames project

Alexandria Archive Institute

Devries and Thessen using DBPedia Spotlight to extract associations among taxa and add to Linked Open Data cloud

Sloan 2 project: Marine computable data

TraitBank ABI proposal

Page 32: Introduction to EOL.org for scientists

Research wishes

• Collecting nominations for research ideas where EOL can help:

http://eol.org/info/wishes_for_research

DUE 15 SEPTEMBER

• Will follow with Rubenstein Fellows call for proposals

Page 33: Introduction to EOL.org for scientists

Thanks to

Our funders
• John D. and Catherine T. MacArthur Foundation
• Alfred P. Sloan Foundation
• Smithsonian Institution
• Marine Biological Laboratory
• Harvard University
• David Rubenstein and other funders and donors

All our content providers and global partners

Volunteer curators and individual contributors via Flickr, Wikimedia, and members of EOL

Page 34: Introduction to EOL.org for scientists

Summary of EOL page richness

Overall
• 950,000 have content
• 2% are rich
• ~22% have only links to literature

Hot List
• 30% of 75K are rich
• Average richness = ~30

Red Hot List
• 56% of 3K are rich
• Average richness = 43

Page 35: Introduction to EOL.org for scientists

[Figure: Long Tail in databases contributing to EOL. Two panels plot the number of taxa for which content is contributed to EOL against partners in order of # taxa contributed to EOL: one on a linear scale (0 to 600,000) and one viewed on a log scale (1 to 1,000,000).]

Page 36: Introduction to EOL.org for scientists

Taxon page richness algorithm

Richness = a (Breadth) + b (Depth) + c (Diversity)

Weights: a = 60%, b = 30%, c = 10%

Breadth: Images, topics of text objects, references, maps, videos, sounds, conservation status

Depth: # words per text object, # words total

Diversity: Sources (partners)

Score range: 0 – 100, Threshold 40
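The weighted sum can be sketched in a few lines of Python. This is an illustration under stated assumptions, not EOL's implementation: it assumes each component has already been normalized to the range 0 to 1 (the slide does not say how), that the 60/30/10 weights apply to Breadth, Depth, and Diversity in that order, and that a page counts as rich when its score is at or above the threshold of 40. The names `richness` and `is_rich` are hypothetical.

```python
# Sketch only; component normalization and the ">=" threshold rule are assumptions.

# Weights in percentage points, so three components of 1.0 give a score of 100.
WEIGHTS = {"breadth": 60, "depth": 30, "diversity": 10}
RICH_THRESHOLD = 40


def richness(breadth, depth, diversity):
    """Combine three components (each assumed to lie in [0, 1]) into a 0-100 score."""
    components = {"breadth": breadth, "depth": depth, "diversity": diversity}
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return sum(WEIGHTS[name] * value for name, value in components.items())


def is_rich(score):
    """Assumed rule: a page is 'rich' when its score reaches the threshold."""
    return score >= RICH_THRESHOLD
```

With these weights, a page with full breadth but no depth or diversity would score 60, while full depth and diversity with no breadth would score exactly 40.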