Transcript
Page 1

IRMNG – the Interim Register of Marine and Nonmarine Genera: rationale and current status

Tony Rees – CSIRO Marine and Atmospheric Research, Australia
for: GN-CoL names and taxonomy sharing workshop, Hawaii, March 2012

www.obis.org.au/irmng

Page 2

Tony Rees: IRMNG March 2012

The Dream…

Imagine a system that would…

• Automatically classify “any” genus & species name to kingdom / phylum / class / order / family (as far down as possible) – “what is this critter” – plus hierarchical relations e.g. parents / children / siblings

• Return whether the name is current (valid) or non-current, e.g. a synonym

• Check spelling for correctness, also authority details, plus supply original publication ref. as available

• Return associated attributes such as extant / fossil status, habitat information, geographic / geologic range, more…

• Work seamlessly, with a single point of entry, across all groups and geologic epochs including present day

• Be as up-to-date as possible (latest content), and authoritative (maintained by relevant experts)

Page 3

Realising the Dream…

• For extant taxa: role of Cat. of Life, however ~30% of species still to go; for fossil taxa: PaleoDB (unknown proportion missing, maybe 50%?)

• In the meantime, could make progress by assembling a global genera list, and infilling with species names as available

• IRMNG is an attempt along these lines… a work in progress, with modest resourcing, but available for use now.

[Diagram labels: genera, species]

Page 4

IRMNG data sources

• Animal genera + auth’s from Nomenclator Zoologicus and elsewhere, tax. placements and synonymies from multiple sources including CoL, individual taxon treatments and printed works

• Botanical genera and auth’s from Index Nominum Genericorum (ING) supplemented with other sources, tax. placements and synonymies from multiple sources including GRIN (APGIII in the main), Index Fungorum, AlgaeBase, CyanoDB, more

• Prokaryote genera, auth’s and tax. placements from LPSN (Euzéby list), previous/non-valid names from multiple sources

• Virus genera and tax. placements from ICTV db (multiple versions – very different through time)

• Species lists (all groups) from CoL 2006, Aphia/WoRMS 2006, AFD, NZ Organisms Register + more.

Page 5

IRMNG content as at March 2012 (cf. e.g. Cat. of Life):

• Not all IRMNG genera yet linked to relevant families, but ~370k are (remainder linked to higher taxon i.e. phylum, class or order)

• Extant/fossil, marine/nonmarine flags held for majority of names

• Nomenclatural status known for most names, tax. status i.e. valid name/synonym for only a subset at this time (varies by group)

• Authority known for >97% of genera, publication details for “animal” subset (from Nomenclator Zoologicus in the main)

• Fuzzy matching (TAXAMATCH) deployed over all web-based queries for correction of potential errors in input names to be matched.

IRMNG:
• 19k families
• 454k genera
• 1.46m species names (including synonyms)

Cat. of Life (2011 version):
• 8k families
• 178k genera
• 2.25m species names (including synonyms)
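
The fuzzy matching of input names mentioned on this slide can be illustrated with a minimal sketch. This is not the TAXAMATCH algorithm: it substitutes Python's standard-library `difflib` similarity as a stand-in, and the candidate list, the `suggest` function and the 0.8 cutoff are all assumptions made for illustration.

```python
import difflib

# Illustrative candidate list only; a real service would query the full register.
GENERA = ["Lawsonia", "Quercus", "Escherichia", "Drosophila", "Homo"]


def suggest(name, candidates, cutoff=0.8):
    """Return up to three candidate genus names that closely resemble `name`.

    Uses difflib's sequence similarity as a stand-in for TAXAMATCH.
    Comparison is case-insensitive; original capitalisation is returned.
    """
    lookup = {c.lower(): c for c in candidates}
    hits = difflib.get_close_matches(name.lower(), list(lookup), n=3, cutoff=cutoff)
    return [lookup[h] for h in hits]


print(suggest("Lawsonnia", GENERA))    # misspelled input
print(suggest("Drosophilla", GENERA))  # misspelled input
```

A generic string-similarity measure like this one knows nothing about typical errors in scientific names, so it should be read only as a picture of where fuzzy matching sits in a query, not of how well a purpose-built matcher performs.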

Page 6

IRMNG in practice – example genus = “Lawsonia”

• Same name is currently a valid genus in 3 Codes i.e. plants, animals and bacteria (no barriers to this)
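
One way to picture the homonym problem is a lookup that returns every Code under which a name is held, rather than a single record. The sketch below is an assumption-laden illustration, not IRMNG's schema: the record layout and the function names are hypothetical, and the sample data only restates what this slide says about Lawsonia.

```python
from collections import defaultdict

# Hypothetical (name, group) records restating the slide: one name, three Codes.
RECORDS = [
    ("Lawsonia", "plants"),
    ("Lawsonia", "animals"),
    ("Lawsonia", "bacteria"),
]


def build_index(records):
    """Map each lower-cased genus name to the list of groups it occurs in."""
    index = defaultdict(list)
    for name, group in records:
        index[name.lower()].append(group)
    return index


def groups_for(index, name):
    """Return the sorted groups holding `name`; an empty list if unknown."""
    return sorted(index.get(name.lower(), []))


index = build_index(RECORDS)
print(groups_for(index, "Lawsonia"))
```

Returning a list makes the ambiguity explicit to the caller, who then needs further context (for example a family or a nomenclatural Code) to pick the intended taxon.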

Page 7

Required base information is scattered in multiple systems / printed works at this time

(etc.)

[Screenshot labels: plant, animal, bacterium]

Page 8

Required base information is scattered in multiple systems / printed works at this time

(etc.)

[Screenshot labels: plant, animal, bacterium]

Page 9

IRMNG query as at March 2012

Page 10

IRMNG query as at March 2012

[Screenshot callouts: parents; children; synonym of (as known); extant, habitat flags]

Page 11

Note: IRMNG fields displayed on the web are only a subset of full information held for any name, e.g.:

Page 12

IRMNG core fields

• IRMNG ID, Rank

• Scientific name (for species: epithet + parent ID)

• Authority

• Publication (as “microcitation” – subset with link to refs. module)

• Source(s) for above

• Orthography verified against (authoritative source)

• Parent ID (+ “according to…”) – Linnaean ranks only at this time

• Nomenclatural status (+ relation with other names as needed) + “according to…”

• Taxonomic status (same)

• Nomenclatural Code

• Taxonomic or nomenclatural remarks

• Extant/fossil, marine/nonmarine flags + “according to” (could be “as per parent”)

• Date entered, last modified, deprecated (where required)

(under consideration…)

• Intermediate ranks e.g. subfamily, subgenus, also infraspecies (not currently held)

• Type genus / species indicator

• Freshwater / terrestrial flags vs. present “nonmarine”

• Geo flags (country codes etc.)

• Palaeo range (periods/epochs)

• Vernacular names as available
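
The core fields on this slide can be pictured as a simple record type. The sketch below is a hypothetical illustration only: field names, types and the helper functions are assumptions, not IRMNG's actual schema, and the IDs in the example are invented. It shows two points from the list: a species stored as epithet plus parent ID, and a flag left empty meaning "as per parent".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NameRecord:
    irmng_id: int                       # IDs used below are invented for the example
    rank: str
    name: str                           # for species: the epithet only
    parent_id: Optional[int] = None
    authority: Optional[str] = None
    publication: Optional[str] = None   # "microcitation"
    sources: List[str] = field(default_factory=list)
    nomenclatural_status: Optional[str] = None
    taxonomic_status: Optional[str] = None
    nomenclatural_code: Optional[str] = None
    remarks: Optional[str] = None
    extant: Optional[bool] = None       # None is read here as "as per parent"
    marine: Optional[bool] = None       # None is read here as "as per parent"


def full_name(rec: NameRecord, by_id: Dict[int, NameRecord]) -> str:
    """Rebuild a binomial from the epithet and the parent genus record."""
    if rec.rank == "species" and rec.parent_id is not None:
        return f"{by_id[rec.parent_id].name} {rec.name}"
    return rec.name


def effective_flag(rec: NameRecord, by_id: Dict[int, NameRecord], flag: str):
    """Walk up the parents until the flag is set; None if it is never set."""
    current = rec
    while current is not None:
        value = getattr(current, flag)
        if value is not None:
            return value
        current = by_id.get(current.parent_id)
    return None


genus = NameRecord(1, "genus", "Lawsonia", extant=True)
species = NameRecord(2, "species", "inermis", parent_id=1)
by_id = {r.irmng_id: r for r in (genus, species)}
print(full_name(species, by_id), effective_flag(species, by_id, "extant"))
```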

Page 13

IRMNG is not just a “passive” aggregator…

Editorial / curatorial decisions / actions required to:

• Correct obvious data errors

• Assemble “complete” records from multiple sources (where one source data deficient)

• Normalise authority data (in particular) to a “house style”

• Digitise or transcribe print material into electronic form where not otherwise available

• Decide between conflicting content in data sources e.g. for authority orthography/year, taxonomic placement, valid/synonym status and more

• Cross-link names e.g. synonyms -> current names, basionyms -> replacement names, misspelled names to their correctly spelled counterparts, etc. etc.

• Reconcile variant higher taxonomies as supplied to a single hierarchy

• Add nomenclatural or taxonomic remarks as required.
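
The cross-linking step (synonyms -> current names) can be sketched as following "synonym of" links until a name with no onward link is reached. This is a minimal illustration under stated assumptions: the `synonym_of` mapping, the integer IDs and the function name are hypothetical, and real curation involves judgement that no such loop captures.

```python
def resolve_current(name_id, synonym_of):
    """Follow synonym links from `name_id` to the name with no onward link.

    `synonym_of` maps a name ID to the ID of the name it is a synonym of.
    Raises ValueError if the links form a loop, which would signal a data
    error needing editorial attention.
    """
    seen = set()
    current = name_id
    while current in synonym_of:
        if current in seen:
            raise ValueError(f"circular synonymy involving ID {current}")
        seen.add(current)
        current = synonym_of[current]
    return current


# Invented IDs: 3 is a synonym of 2, which is itself a synonym of 1.
links = {3: 2, 2: 1}
print(resolve_current(3, links))
```

The loop check matters when links are merged from several sources that disagree on which name is current.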

Page 14

Relevance to present meeting?

• Demonstrates utility of a single entry point to a system permitting query on “any name” – i.e., a [comprehensive] Taxonomic Name Resolution Service (TNRS) covering all life

• Envisage something like OBIS or GBIF, but for taxonomy – the aggregator / central query point is not a content author, but provides integration and value-added services

• IRMNG – based on static snapshot/s of multiple data sources; cf. a “super catalogue” should be based on live feeds from relevant authoritative sources, continuously updated as available (?+ some static data not available as feeds)

• Maybe the static data lives outside the “data aggregation/query” point, becomes a separately managed source

• How does / should GNA facilitate this?

• Will the need for an IRMNG (or IRMNG equivalent) disappear or grow in the above scenario? (for example could this role be taken by another player or group of players…)

Page 15

Thank you!

Page 16

(supplementary slides)

Page 17

Size of the task: IRMNG 2011 content cf. Cat. of Life 2011

                      Cat. of Life    % with    IRMNG Oct 2011     % with    IRMNG Oct 2011
Rank                  2011 edition    auth's    extant + fossil    auth's    fossil only
Kingdoms                         8                            7              (0)
Phyla                          111                          153              (12)
Classes                        288                          509              (64)
Orders                       1,233                        2,645              (715)
Families                     8,071    0%                 19,639    22.1%     (6,542)
Subfamilies                      -    -                       -    -         -
Genera                     178,515    0%                452,848    97.1%     (90,278)
Subgenera                        -    -                       -    -         -
Species (valid)          1,347,224    ~100%           1,020,519    ~100%     (16,792)
Species (synonyms)         895,441    ~100%             440,738    ~100%     (100)

• CoL has 70% of valid extant species names (of est. 1.9m total), thus maybe also 70% of valid extant genera (with subset of genus-level synonyms)

• IRMNG has further ~180k extant genus names and ~90k fossil names at this time (including syns) – est. ~25k still missing
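
The rounded figures in these two bullets can be checked against the table on this slide. The short calculation below is a sketch, not the author's working: it assumes that all Cat. of Life genera are extant and that all of them are also held in IRMNG, so the "further" extant genera are taken as IRMNG extant genera minus the CoL total.

```python
# Figures copied from the table on this slide.
col_valid_species = 1_347_224
est_total_species = 1_900_000      # "est. 1.9m total"
col_genera = 178_515
irmng_genera_all = 452_848         # extant + fossil
irmng_genera_fossil = 90_278

# Share of estimated valid extant species already in CoL.
coverage_pct = round(100 * col_valid_species / est_total_species, 1)

# Assumption: every CoL genus is extant and is also present in IRMNG.
irmng_genera_extant = irmng_genera_all - irmng_genera_fossil
further_extant = irmng_genera_extant - col_genera

print(coverage_pct, irmng_genera_extant, further_extant)
```

Under those assumptions the results come out at about 70.9%, 362,570 and 184,055, which agree with the slide's "70%", and "~180k extant" alongside "~90k fossil".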

Page 18

Taxonomic names: what the customer is currently offered (+ more…)

[Diagram. Headings: publication; discovery; official registers; taxon-specific DB’s; integrated DB’s; “all names”. Section labels: Botany; Zoology. Resources shown:]

• New names published (in primary literature)
• Journal TOC’s, RSS feeds, text mining
• Abstracting services
• Subject bibliographies
• Reviews, secondary literature
• Zoological Record
• ION (Index of Organism Names)
• ICBN Decisions
• ICZN Decisions
• ICTV Viruses DB
• LPSN (Prokaryote names)
• The Plant List, IPNI, TROPICOS, ING
• Index Fungorum, MycoBank
• AlgaeBase
• CyanoDB
• Plant GSD’s
• Animal GSD’s
• PaleoDB
• Nomenclator Zoologicus
• Catalogue of Life
• ITIS, NCBI Taxonomy, WoRMS etc.
• ChecklistBank, GNI, GNUB
• other compilations e.g. regional lists, Wikispecies, Wikipedia, more…

Page 19

Two approaches - GNI and Cat. of Life

NameBank / GNI
• 20m+ names – all ranks, no hierarchy
• mix of “clean” and “dirty” names
• many duplicates
• extant + fossil, most sectors with at least some names

Page 20

GNI search result – “Lawsonia” (all ranks returned) (Mar 2012)

…candidate genus names highlighted in red (although could be other ranks too)

… need access to original taxonomic / nomenclatural resources to sort out / see if anything missed

Page 21

Two approaches - GNI and Cat. of Life

NameBank / GNI
• 20m+ names – all ranks, no hierarchy
• mix of “clean” and “dirty” names
• many duplicates
• extant + fossil, most sectors with at least some names

Cat. of Life
• <2m names – Linnaean ranks, in hierarchy
• all “clean”/ vetted names / relationships
• extant only, sectors either complete or absent

Page 22

Cat. of Life search result – “Lawsonia” (Mar 2012)

