BioDiP - a proposed infrastructure to link the taxonomic to the genomic and other domains
![Page 1: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/1.jpg)
Donat Agosti
Plazi
SSS Day 2015, November 6, NHMB, Bern
BioDiP - a proposed infrastructure to link the taxonomic to the genomic and other domains
![Page 2: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/2.jpg)
Content:
What is the issue?
Where do we stand?
What is planned?
What can you do?
![Page 3: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/3.jpg)
Improve the Role of Published Biodiversity Literature in Policy decisions
Published observation records (Scientific literature)
EU BON Policy Brief
![Page 4: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/4.jpg)
The Scientific Challenge: Annotate genes
  1 tnntttccca cgaataaata atataagatt ttgattatta cctccttctt taattttatt
 61 attatcaaga agattagttt ataaaggagt aggaacagga tgaactgttt atcctccttt
121 atctaataat ttatatcata atggattttc aactgattta gcaatttttt ctttacatat
181 tgcaggaata tcatcaatta taggagcaat taattttatt tcaacaattt taaatataca
241 tcataaaaat ttatcattag ataaaattcc attgttagtt tgatcaattt taattacagc
301 tattttatta ttattatctt tacctgtatt agcaggtgca attactatat tattaactga
361 tcgaaatcta aatacaactt tttttgatcc ttcgggtgga ggagatccaa ttttatatca
421 acatttattt
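Before a sequence like the one above can be annotated, it has to be reduced to a plain string of bases. The following is a minimal sketch, assuming the GenBank-style layout shown on the slide (a position counter followed by groups of ten bases); the function name `parse_origin` is hypothetical and the sample uses only the first two rows of the slide's sequence.

```python
import re
from collections import Counter


def parse_origin(block: str) -> str:
    """Strip position counters and whitespace from a GenBank-style sequence block."""
    # Keep letters only: digits are position counters, spaces separate 10-base groups.
    return re.sub(r"[^a-z]", "", block.lower())


# First two rows of the sequence shown on the slide.
sample = """
  1 tnntttccca cgaataaata atataagatt ttgattatta cctccttctt taattttatt
 61 attatcaaga agattagttt ataaaggagt aggaacagga tgaactgttt atcctccttt
"""

sequence = parse_origin(sample)
base_counts = Counter(sequence)  # includes the ambiguous base "n"
print(len(sequence), dict(base_counts))
```

Counting the ambiguous `n` bases is one simple quality check that could be run before any attempt at annotation.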
![Page 5: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/5.jpg)
Text
<tax:treatment>
<tax:nomenclature>
<tax:name>
<tax:xid source="HNS" identifier="193329"/>
<tax:xmldata>
<dc:Genus>Mystrium</dc:Genus>
<dc:Species>leonie</dc:Species>
</tax:xmldata>
Mystrium leonie
</tax:name>
<tax:status>n. sp.</tax:status>
Fig 1 D - F
</tax:nomenclature>
<tax:div type="description">
<tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL
1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin
to a sharp apical tooth, the apex parallel to the anterior
(Holotype with material in mandibles, so mandibles and
$ described below from paratypes.) Median clypeus
....
</tax:treatment>
XML: Semantically enhanced text (e.g. TaxonX)
From human to machine readable text
RDF
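Once the text is marked up this way, a program can read the name components directly. The sketch below parses a shortened copy of the fragment with Python's standard library. The namespace URIs are placeholders (an assumption, since the slide shows only the `tax:` and `dc:` prefixes), and the fragment is trimmed to the nomenclature part.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption): the slide shows only the prefixes.
NS = {"tax": "urn:example:taxonx", "dc": "urn:example:dc"}

fragment = """
<tax:treatment xmlns:tax="urn:example:taxonx" xmlns:dc="urn:example:dc">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>
"""

root = ET.fromstring(fragment)
name = root.find("tax:nomenclature/tax:name", NS)
xid = name.find("tax:xid", NS)

# Collect the machine-readable parts of the name into one record.
record = {
    "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
    "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
    "source": xid.get("source"),
    "identifier": xid.get("identifier"),
    "status": root.findtext("tax:nomenclature/tax:status", namespaces=NS),
}
print(record)
```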
![Page 6: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/6.jpg)
Countries (Region): Australia (Queensland)
Export species materials citations (DwC)
Visualization of taxonomic literature
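An export of materials citations to Darwin Core (DwC) could look roughly like the following. This is a minimal sketch: `country` and `stateProvince` are Darwin Core term names, the two values are the ones shown on the slide, and the helper `to_dwc_csv` is hypothetical rather than part of any Plazi tool.

```python
import csv
import io

# One materials citation, reduced to the two fields visible on the slide.
materials_citations = [
    {"country": "Australia", "stateProvince": "Queensland"},
]


def to_dwc_csv(records):
    """Serialize records as CSV with Darwin Core term names as column headers."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["country", "stateProvince"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_dwc_csv(materials_citations))
```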
![Page 7: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/7.jpg)
Text mining tools: Visualization of treatment content
Summary of the content of 37 Zootaxa spider publications and 8 Biodiversity Data Journal articles (Miller et al., 2015).
![Page 8: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/8.jpg)
Pseudomyrmex ants and Vachellia ant-acacias are a classic example of mutualism in biology.
allenii
melanoceras
ruddiae
chiapensis
collinsii
cookii
cornigera
globulifera
hindsii
janzenii
mayana
sphaerocephala
boopis
flavicornis
hesperius
ita
janzeni
kuenckeli
mixtecus
nigrocinctus
nigropilosus
opaciceps
particeps
peperi
reconditus
satanicus
simulans
spinicola
subtilissimus
veneficus
ferrugineus
gentlei
gracilis
Transbiotic link network
Associated species linked through references in taxonomic treatments
Acacia-ant species: Pseudomyrmex gracilis
Treatment: redescription
Associated ant-acacia: Acacia gentlei
Ants Plants
Photo credits: Alex Wild
Treatment
Treatments linked through citations
Text mining tools: Visualization of treatment content
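A link network of this kind can be represented as a simple adjacency map built from (ant, plant) pairs mined from treatments. The sketch below is illustrative only: the function `build_network` is hypothetical, and the single pair used is the one named on the slide.

```python
from collections import defaultdict

# (ant species, associated plant species) pairs as they might be extracted from
# treatments. The single pair below is the one named on the slide.
associations = [
    ("Pseudomyrmex gracilis", "Acacia gentlei"),
]


def build_network(pairs):
    """Return an undirected adjacency map linking ants and their associated plants."""
    network = defaultdict(set)
    for ant, plant in pairs:
        network[ant].add(plant)
        network[plant].add(ant)
    return network


network = build_network(associations)
print(sorted(network["Pseudomyrmex gracilis"]))
```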
![Page 9: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/9.jpg)
The Scientific Challenge: link a name to its treatment
![Page 10: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/10.jpg)
What is the issue? Build the necessary system
A system that allows text and data mining of the corpus of taxonomic literature.
A system that links taxonomic names to their treatments.
![Page 11: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/11.jpg)
Trait information machine ready
Resolution
Reconciliation
TreatmentBank
NAMES MANAGEMENT
CITATION MANAGEMENT
REFBANK
TREATMENT MANAGEMENT
ATOMIZATION & SEMANTICIZATION OF CONTENT MARKUP / initial trait extraction
Specialist taxonomic databases
Build the necessary system
![Page 12: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/12.jpg)
Where do we stand?
![Page 13: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/13.jpg)
Number of citations: Better than most scientific papers
The origin
Where do we stand?
![Page 14: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/14.jpg)
Citations of the ALL-Book (Google Scholar)

| Part | Citations |
|---|---|
| Book | 458 |
| Chapter 1: Alonso, L.E., and D. Agosti. Biodiversity Studies, Monitoring, and Ants: An Overview | 162 |
| Chapter 2: Kaspari, M. A Primer in Ant Ecology | 85 |
| Chapter 3: Andersen, A.N. A Global Ecology of Rainforest Ants: Functional Groups in Relation to Environmental Stress and Disturbance | 172 |
| Chapter 4: Schultz, T.R., and T.P. McGlynn. The Interaction of Ants with Other Organisms | 94 |
| Chapter 5: Brown, W.L.Jr. Diversity of Ants | 220 |
| Chapter 6: Alonso, L.E. Ants as Indicators of Diversity | 164 |
| Chapter 7: Kaspari, M., J.D. Majer. Using Ants to Monitor Environmental Change | 116 |
| Chapter 8: Ward, P.S. Broad-scale Patterns of Diversity in Leaf litter Ant Communities | 112 |
| Chapter 9: Bestelmeyer, B.T., D. Agosti, L.E. Alonso, C.R.F. Brandão, W.L. Brown Jr., J.H.C. Delabie, and R. Silvestre. Field Techniques for the Study of Ground-Dwelling Ants: An Overview, Description and Evaluation | 388 |
| Chapter 10: Delabie, J.H.C., B.L. Fisher, J.D. Majer, and I.W. Wright. Sampling Effort and Choice of Methods | 117 |
| Chapter 11: Lattke, J.E. Specimen Processing: Building and Curating an Ant Collection | 36 |
| Chapter 12: Brandão, C.R.F. Major Regional and Type Collections of Ants (Formicidae) of the World and Sources for the Identification of Ant Species | 41 |
| Chapter 13: Longino, J.R. What to Do with the Data | 89 |
| Chapter 14: Agosti, D., and L.E. Alonso. The ALL Protocol: A Standard Protocol for the Collection of Ground-Dwelling Ants | 179 |
| Chapter 15: Fisher, B.L., A.K.F. Malsch, R. Gadagkar, J.H.C. Delabie, H.L. Vasconcelos, and J.D. Majer. Applying the ALL Protocol: Selected Case Studies | 21 |
| Total | 2454 |
![Page 15: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/15.jpg)
Thanks to
![Page 16: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/16.jpg)
Open Access
Before antbase.org, Harvard’s Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 to the present. Through antbase.org’s digital library, access to this body of literature is worldwide, and it is actively used (>10,000 visits in a single month).
2004
![Page 17: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/17.jpg)
.. Open Access to ca. 80% of the papers citing the Antbook
(most of it not through the publishers’ websites).
.. But it takes time to find the articles
Thanks to
Manual work
![Page 18: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/18.jpg)
Access through Citation
2015
ZooKeys DOI
DOI: Digital Object Identifier
![Page 19: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/19.jpg)
Create a citable open corpus of taxonomic publications
![Page 20: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/20.jpg)
Create a citable open corpus of taxonomic publications
Content (Nov 2015)
Active:
• 4,500 Proctotrupoidea and ant articles
• DOI provider for Revue Suisse de Zoologie, Israel Journal of Entomology, Polish Forestry Institute, Revue de Paléobiologie
Pipeline:
• 5,000 ant articles
• 16,000 drosophilid articles
• All Pensoft journal articles, images, tables
Planned:
• Back issues of RSZ
• Backbone for Plazi and Pensoft cited articles
Participation:
• Please join! Make all Swiss-based treatments accessible?
![Page 21: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/21.jpg)
![Page 22: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/22.jpg)
Biodiversity Literature Repository: Record
Treatment
Illustration
![Page 23: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/23.jpg)
Text
<tax:treatment>
<tax:nomenclature>
<tax:name>
<tax:xid source="HNS" identifier="193329"/>
<tax:xmldata>
<dc:Genus>Mystrium</dc:Genus>
<dc:Species>leonie</dc:Species>
</tax:xmldata>
Mystrium leonie
</tax:name>
<tax:status>n. sp.</tax:status>
Fig 1 D - F
</tax:nomenclature>
<tax:div type="description">
<tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL
1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin
to a sharp apical tooth, the apex parallel to the anterior
(Holotype with material in mandibles, so mandibles and
$ described below from paratypes.) Median clypeus
....
</tax:treatment>
XML: Semantically enhanced text (e.g. TaxonX)
From human to machine readable text
RDF
![Page 24: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/24.jpg)
Taxonomic publication: deconstruct but keep parts linked
Journal has part Article, Article has part Treatment, Treatment has part Material citation
![Page 25: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/25.jpg)
Treatment: a well defined part of an article that defines the particular usage of a scientific name by an authority at a given time (a page(s) in a publication).
Treatment
The special case of taxonomic literature: the cited elements are treatments, not articles
Formica obsoleta Linnaeus, 1758: 580
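A treatment citation such as the one above already carries the pieces needed to address a treatment: name, authority, year, and page. The sketch below splits it with a regular expression. It covers only this simple "Genus species Authority, year: page" form, which is an assumption; real citations vary far more, and the names `TreatmentCitation` and `parse_treatment_citation` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class TreatmentCitation(NamedTuple):
    genus: str
    species: str
    authority: str
    year: int
    page: int


# Matches only the simple "Genus species Authority, year: page" form (assumption).
PATTERN = re.compile(
    r"^(?P<genus>[A-Z][a-z]+) (?P<species>[a-z]+) "
    r"(?P<authority>[^,]+), (?P<year>\d{4}): (?P<page>\d+)$"
)


def parse_treatment_citation(text: str) -> Optional[TreatmentCitation]:
    """Return the parsed citation, or None if the text does not fit the pattern."""
    match = PATTERN.match(text.strip())
    if match is None:
        return None
    return TreatmentCitation(
        genus=match["genus"],
        species=match["species"],
        authority=match["authority"],
        year=int(match["year"]),
        page=int(match["page"]),
    )


citation = parse_treatment_citation("Formica obsoleta Linnaeus, 1758: 580")
print(citation)
```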
![Page 26: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/26.jpg)
Treatment
![Page 27: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/27.jpg)
Original combinations
Reference to an original combination
Subsequent usages of names cite the referenced treatment
What is a treatment?
![Page 28: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/28.jpg)
Treatment and treatment reference and citation
Treatment citation
Treatment references
![Page 29: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/29.jpg)
Treatment Graph for the Malagasy Ants Aphaenogaster
Nodes: Original description, Re-description (several)
Links: cites, cites / synonymizes
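A treatment graph of this kind can be walked programmatically, for example to find the original description behind a chain of re-descriptions. The following is a minimal sketch: the treatment identifiers are hypothetical placeholders, not real TreatmentBank records, and `trace_to_origin` is an illustrative helper.

```python
# Each key is a treatment; each value lists the earlier treatments it cites.
# Identifiers are hypothetical placeholders.
cites = {
    "redescription-C": ["redescription-B"],
    "redescription-B": ["original-A"],
    "original-A": [],
}


def trace_to_origin(treatment, graph):
    """Follow 'cites' links back to the treatments that cite nothing earlier."""
    origins, stack, seen = set(), [treatment], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in badly formed data
        seen.add(current)
        earlier = graph.get(current, [])
        if not earlier:
            origins.add(current)
        stack.extend(earlier)
    return origins


print(trace_to_origin("redescription-C", cites))
```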
![Page 30: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/30.jpg)
Material citation is part of Treatment is part of Article is part of Journal
Taxonomic publication: deconstruct but keep parts linked
![Page 31: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/31.jpg)
Material citation is part of Treatment is part of Article is part of Journal
cites, cites, cites
Taxonomic publication: deconstruct but keep parts linked
Identifiers: Journal (ISSN), Article (DOI), Treatment (httpURI), Material citation (httpURI)
![Page 32: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/32.jpg)
Material citation is part of Treatment is part of Article is part of Journal
cites, cites, cites
Taxonomic publication: deconstruct but keep parts linked
Identifiers: Journal (ISSN), Article (DOI), Treatment (httpURI), Material citation (httpURI)
CIEPS / ISSN, CrossRef / DataCite, Client, Client
Plazi, Plazi, Biodiversity Literature Repository / Zenodo
CERN
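The "deconstruct but keep parts linked" idea can be sketched as a chain of objects, each carrying its own identifier and a link to the unit it is part of. This is an illustration only: the pairing of identifier types with levels follows one reading of the slide, the identifier values are placeholders, and the `Part` class is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Part:
    kind: str                          # "Journal", "Article", "Treatment", ...
    id_scheme: str                     # identifier type named on the slide
    identifier: str                    # placeholder value (assumption)
    part_of: Optional["Part"] = None   # the enclosing unit, if any


def lineage(part: Optional[Part]) -> List[str]:
    """Walk the 'is part of' links from a part up to the journal."""
    chain = []
    while part is not None:
        chain.append(f"{part.kind} ({part.id_scheme})")
        part = part.part_of
    return chain


journal = Part("Journal", "ISSN", "issn-placeholder")
article = Part("Article", "DOI", "doi-placeholder", part_of=journal)
treatment = Part("Treatment", "httpURI", "treatment-uri-placeholder", part_of=article)
material = Part("Material citation", "httpURI", "material-uri-placeholder", part_of=treatment)

print(" -> ".join(lineage(material)))
```

Because every level keeps its own identifier, a material citation can be cited on its own while still resolving back to the article and journal it came from.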
![Page 33: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/33.jpg)
Treatment Citation Life
article
treatment
Dikow & Agosti, 2015.
![Page 34: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/34.jpg)
Taxonomic publication
Treatment Verbatim
Material citations
Specimen ID
Treatment citation
Bibliogr. citation
Taxonomic Name
Usages
is part of
cites
Illustration
![Page 35: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/35.jpg)
Taxonomic publication
Material citation
GenbankID
Collection Accession
#
SpecimenID
Digital Object ID
Collecting Event ID
Host ID
Verbatim
is part of
cites
![Page 36: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/36.jpg)
Treatment: implicit and explicit links
![Page 37: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/37.jpg)
Plazi Tools: Data extraction: tables
«Treatment», scientific species name, distribution record, bibliographic records
Cataglyphis tartessica workers

| Variable | mean ± SD |
|---|---|
| Head length | 11.23 ± 0.12 |
| Head width | 11.15 ± 0.12 |
| Scape length | 11.47 ± 0.12 |
| Mesosoma length | 11.94 ± 0.16 |
| Femur length | 12.03 ± 0.14 |
| Cephalic index | 0 93.60 ± 3.940 |
| Scape index | 128.10 ± 7.660 |
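Extracting such a measurement table means turning each "variable mean ± SD" entry into structured values. The sketch below does this with a regular expression for three of the rows, copied verbatim from the slide. It is not the Plazi implementation; `parse_row` and the pattern are illustrative assumptions.

```python
import re

# Three rows copied verbatim from the slide, one entry per line.
rows = [
    "Head length 11.23 ± 0.12",
    "Head width 11.15 ± 0.12",
    "Scape length 11.47 ± 0.12",
]

# Variable name, then a mean, the "±" sign, and a standard deviation.
ROW = re.compile(
    r"^(?P<variable>.+?)\s+(?P<mean>\d+(?:\.\d+)?)\s*±\s*(?P<sd>\d+(?:\.\d+)?)$"
)


def parse_row(line):
    """Return (variable, mean, sd), or None if the line does not fit the pattern."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    return match["variable"], float(match["mean"]), float(match["sd"])


parsed = [parse_row(row) for row in rows]
print(parsed)
```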
![Page 38: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/38.jpg)
Plazi tools: discovery of scientific names
![Page 39: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/39.jpg)
Plazi tools: discovery and parsing of bibliographic references
![Page 40: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/40.jpg)
Plazi tools: discovery and parsing of observation data
![Page 41: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/41.jpg)
Plazi tools: discovery of treatments
![Page 42: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/42.jpg)
Status quo
• 50,000+ treatments live, growing daily
• RDF in beta version
• GoldenGate Imagine (PDF and text mining tool) in beta version
• Data provider for NCBI, Wikidata, GBIF, EOL, AntWeb
• Biodiversity Literature Repository functional
![Page 43: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/43.jpg)
LOD
PDF
HNS
HNS
The Scientific Challenge
![Page 44: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/44.jpg)
The Scientific Challenge
![Page 45: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/45.jpg)
article
treatment
cites (httpURI)
cites (DOI)
Scientific name
https://www.wikidata.org/wiki/Property:P1992
Feed Wikipedia with taxonomic data
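The Wikidata property linked above (P1992) can be queried through the public Wikidata SPARQL endpoint. The sketch below only builds the request URL and makes no network call. It assumes that P1992 holds the Plazi identifier and uses P225 (taxon name) alongside it; the function name `plazi_id_query_url` is hypothetical.

```python
from urllib.parse import urlencode

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def plazi_id_query_url(limit: int = 10) -> str:
    """Build a URL asking Wikidata for items that carry a P1992 value."""
    query = (
        "SELECT ?item ?taxonName ?plaziId WHERE { "
        "?item wdt:P1992 ?plaziId ; wdt:P225 ?taxonName . "
        f"}} LIMIT {int(limit)}"
    )
    # urlencode percent-encodes the query so it can be sent as a GET parameter.
    return f"{SPARQL_ENDPOINT}?{urlencode({'query': query, 'format': 'json'})}"


print(plazi_id_query_url(5))
```

Fetching the resulting URL with any HTTP client would return JSON bindings for `item`, `taxonName`, and `plaziId`.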
![Page 46: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/46.jpg)
Status quo
• 50,000+ treatments live, growing daily
• RDF in beta version
• GoldenGate Imagine (PDF and text mining tool) in beta version
• Data provider for NCBI, Wikidata, GBIF, EOL, AntWeb
• Biodiversity Literature Repository functional
![Page 47: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/47.jpg)
What is planned? What can you do?
![Page 48: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/48.jpg)
What is planned? What can you do?
A system that allows text and data mining of the corpus of taxonomic literature.
A system that links taxonomic names to their treatments.
![Page 49: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/49.jpg)
Trait information machine ready
Resolution
Reconciliation
TreatmentBank
NAMES MANAGEMENT
CITATION MANAGEMENT
REFBANK
TREATMENT MANAGEMENT
ATOMIZATION & SEMANTICIZATION OF CONTENT MARKUP / initial trait extraction
Specialist taxonomic databases
What is planned? What can you do?
existing prototype planned
prototype
![Page 50: BioDIP - a proposed infrastructure to link the taxonomic to the genomic and other domains](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587c3dbb1a28ab5a1d8b591d/html5/thumbnails/50.jpg)
BioDiP
Program: SUK-Programm 2013-2016 P-2 «Wissenschaftliche Information: Zugang, Verarbeitung und Speicherung» (Scientific Information: Access, Processing and Storage)
Partners: HES-SO, HEG Geneva (Swiss Institute of Bioinformatics), Plazi, open at various levels – from adding content to BLR to data mining and building applications
Submission: February 2016
Duration: 2-3 years
What is planned? What can you do?