the gbif information architecture technological integration at the global level donald hobern gbif...

29
The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability March 2005

Upload: isaac-banks

Post on 18-Dec-2015

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

The GBIF Information ArchitectureTechnological integration at the global level

Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

March 2005

Page 2: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

GBIF and TDWG

GBIF – Global Biodiversity Information Facility• Megascience activity involving 43 countries/economies and 28 international organisations• Secretariat based in Copenhagen, Denmark• Mission

• Free and universal access to world’s biodiversity data via Internet • Sharing primary biodiversity data for society, science and a sustainable future

• Products• Registry of biodiversity data resources• Index of biodiversity data• Software tools• Web portals (http://www.gbif.net) and data services

TDWG – Taxonomic Databases Working Group• Not-for-profit scientific and educational association• Affiliated to the International Union of Biological Sciences• Mission

• To provide an international forum for biological data projects • To develop and promote the use of standards• To facilitate data exchange

• Products• Standards/guidelines for recording/exchanging data about organisms• Promotion of use of these standards• Forum for discussion (especially annual meeting)

Page 3: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Vernacular (FR): Pyrale du maïs

Vernacular (ES): Piral del maíz

Vernacular (DE): Maiszünsler

Diagnosis: Wingspan 26-30mm; sexually dimorphic;male: forewings ochreous to dark brown; female: forewings pale yellow; …

Foodplant: Zea mais L. 1753

Biodiversity data

Species: Ostrinia nubilalis (Hübner, 1796)

Family: Pyralidae

Order: Lepidoptera

Class: Insecta

Genus: Ostrinia Hübner, 1825

Vernacular (EN): European Corn-borer

Family: Gramineae

Taxonomic Names

Collection: DGH LepidopteraRecord id: DGHEUR_003217Country: FranceCoordinates: 03.047˚E 48.730˚NDate: 28 June 2003Collector: Donald Hobern

Specimens and Observations

Ecological Interactions

Locus: AAL35331Definition: acyl-CoA Z/E11 desaturase

1 mvpyattadg hpekdecfed...

Sequence Data

Average RainfallLocation: 48.82°N 2.29°E Jan Feb Mar Apr ...182.3 120.6 158.1 204.9 ...

Abiotic Data

Taxonomic Descriptions

Pheromones of Ostriniahttp://www.nysaes.cornell.edu/fst/faculty/acree/pheronet/phlist/ostrinia.html

Digital Literature and Web Resources

Synonym: Pyralis nubilalis Hübner, 1796

Page 4: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Resources vary in interoperability and scope for dynamic integration

• Images, audio, etc.• Require text (metadata) for interpretation• Very hard for software to interpret automatically• (Normally) link to resource as a unit (e.g. via URL in web page)

• Unstructured text documents (including e.g. PDF, Word, HTML)• Can be indexed for full-text search (Google)• Hard for software to interpret automatically• Normally link to document as a unit (e.g. URL in web page)

• Structured XML documents• Can be indexed for full-text search• Can be parsed and interpreted by software tools (for known schemas)• Can link to document as a unit or to fragments of a document• Can use structure to develop highly-specific searches• Normally process whole documents (not random-access)

• Databases• Can be exposed as structured XML documents• Can process any subset of the data (random-access selection of specific records)

Data resources

Page 5: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

<?xml version="1.0" encoding="UTF-8"?><response>

<record><darwin:DateLastModified>2003-06-08</darwin:DateLastModified><darwin:InstitutionCode>DGH</darwin:InstitutionCode><darwin:CollectionCode>DGH Lepidoptera</darwin:CollectionCode><darwin:CatalogNumber>DGHEUR_0002976</darwin:CatalogNumber><darwin:ScientificName>Dichomeris marginella (Fabricius, 1781)</darwin:ScientificName><darwin:BasisOfRecord>O</darwin:BasisOfRecord><darwin:Kingdom>Animalia</darwin:Kingdom><darwin:Order>Lepidoptera</darwin:Order><darwin:Family>Gelechiidae</darwin:Family><darwin:Genus>Dichomeris</darwin:Genus><darwin:Species>marginella</darwin:Species><darwin:ScientificNameAuthor>(Fabricius, 1781)</darwin:ScientificNameAuthor><darwin:IdentifiedBy>Donald Hobern</darwin:IdentifiedBy><darwin:Collector>Donald Hobern</darwin:Collector><darwin:YearCollected>2003</darwin:YearCollected><darwin:MonthCollected>06</darwin:MonthCollected><darwin:DayCollected>08</darwin:DayCollected><darwin:ContinentOcean>Europe</darwin:ContinentOcean><darwin:Country>Denmark</darwin:Country><darwin:County>Københavns Amt</darwin:County><darwin:Locality>Merianvej, Hellerup</darwin:Locality><darwin:Longitude>12.538</darwin:Longitude><darwin:Latitude>55.737</darwin:Latitude><darwin:CoordinatePrecision>100</darwin:CoordinatePrecision><darwin:IndividualCount>1</darwin:IndividualCount><darwin:Notes>1 in Skinner trap</darwin:Notes>

</record></response>

June 2003 S M T W T F S 1 2 3 4 5 6 7 8 9 10 11 12 13 1415 16 17 18 19 20 2122 23 24 25 26 27 2829 30

Observation record formatted using the Darwin Core

Structured XML data

Page 6: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Some TDWG Data Standards

Darwin Core• Simple XML data model to represent taxon occurrence records (only core attributes)• Extensions to handle e.g. curation details, microbial data, image data

ABCD Schema – Access to Biological Collection Data• More complex XML data model to represent collection or observation data• Detailed document structure including features for different communities

DiGIR – Distributed Generic Information Retrieval• XML protocol for searching remote data resources • Suitable for use with a wide range of different data models

BioCASe Protocol• XML protocol for searching remote data resources with more complex schema (e.g. ABCD)• Derived from DiGIR – new unified DiGIR/BioCASe protocol being developed (”TAPIR”)

Taxon Concept Schema• XML data model currently under development for exchange of nomenclatural/taxonomic data• First version to be used for implementation in 2005

SDD Schema – Structured Descriptive Data• XML data model for descriptive data relating to taxa or specimens (highly generalised)• Suitable for representation of character tables, diagnostic keys, etc.

Page 7: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

DiGIR Protocol <?xml version="1.0" encoding="UTF-8"?><request xmlns="http://digir.net/schema/protocol/2003/1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:darwin="http://digir.net/schema/conceptual/darwin/2003/1.0"> <header> <version>1.0.0</version> <sendTime>2004-07-20T09:56:00-0500</sendTime> <source>127.0.0.1</source> <destination resource="HDS">http://any.net/digir/DiGIR.php</destination> <type>search</type> </header> <search> <filter> <and> <equals> <darwin:Collector>Thomas, J.</darwin:Collector> </equals> <equals> <darwin:Genus>Rubus</darwin:Genus> </equals> </and> </filter> <records limit="3" start="0"> <structure> <xsd:element name="record“ minOccurs="0“ maxOccurs="unbounded"> <xsd:complexType> <xsd:sequence> <xsd:element ref="darwin:InstitutionCode"/> <xsd:element ref="darwin:CollectionCode"/> <xsd:element ref="darwin:CatalogNumber"/> <xsd:element ref="darwin:ScientificName"/> <xsd:element ref="darwin:Latitude"/> <xsd:element ref="darwin:Longitude"/> </xsd:sequence> </xsd:complexType> </xsd:element> </structure> </records> <count>true</count> </search></request>

<?xml version="1.0" encoding="utf-8" ?><response xmlns="http://digir.net/schema/protocol/2003/1.0"> <header> <version>$Revision: 1.96 $</version> <sendTime>2004-07-20T09:56:03-0500</sendTime> <source resource="HDS">http://any.net/digir/DiGIR.php</source> <destination>127.0.0.1</destination> <type>search</type> </header> <content xmlns:darwin="http://digir.net/schema/conceptual/darwin/2003/1.0"> <record> <darwin:InstitutionCode>INS</darwin:InstitutionCode> <darwin:CollectionCode>COL</darwin:CollectionCode> <darwin:CatalogNumber>27694</darwin:CatalogNumber> <darwin:ScientificName>Rubus brasiliensis</darwin:ScientificName> <darwin:Latitude>-23.6625</darwin:Latitude> <darwin:Longitude>-46.7738</darwin:Longitude> </record> <record> <darwin:InstitutionCode>INS</darwin:InstitutionCode> <darwin:CollectionCode>COL</darwin:CollectionCode> <darwin:CatalogNumber>27907</darwin:CatalogNumber> <darwin:ScientificName>Rubus brasiliensis</darwin:ScientificName> <darwin:Latitude>-23.412</darwin:Latitude> <darwin:Longitude>-46.7899</darwin:Longitude> </record> <record> <darwin:InstitutionCode>INS</darwin:InstitutionCode> <darwin:CollectionCode>COL</darwin:CollectionCode> <darwin:CatalogNumber>27908</darwin:CatalogNumber> <darwin:ScientificName>Rubus urticaefolius</darwin:ScientificName> <darwin:Latitude>-23.412</darwin:Latitude> <darwin:Longitude>-46.7899</darwin:Longitude> </record> </content> <diagnostics> <diagnostic code="STATUS_INTERVAL" severity="info">3600</diagnostic> <diagnostic code="STATUS_DATA" severity="info">191,1,1</diagnostic> <diagnostic code="MATCH_COUNT" severity="info">26</diagnostic> <diagnostic code="RECORD_COUNT" severity="info">3</diagnostic> <diagnostic code="END_OF_RECORDS" severity="info">false</diagnostic> </diagnostics></response>

Request Response

Page 8: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

DiGIR-BioCASe Protocol and Nested Networks

User

Get Darwin Core records where darwin:ScientificName equals Puma concolor from any provider.

MaNIS Provider

Darwin Core

Curatorial

Taxon Occurrence

OBIS Provider

Darwin Core

Marine

Taxon Occurrence

IPGRI Banana Provider

Darwin Core

IPGRI Passport

Banana Descriptor

Taxon Occurrence

IPGRI Soy Bean Provider

Darwin Core

IPGRI Passport

Soy Bean Descriptor

Taxon Occurrence

BioCASe Provider

Darwin Core

ABCD

Taxon Occurrence

BioCASe Provider

Darwin Core

ABCD

Taxon Occurrence

Get standard plant genetic resource Passport data for all crop types.

Get full set of Soy Bean crop descriptors.

Get complete ABCD documents from each BioCASe provider

Get DiGIR-style records each with a set of Darwin Core descriptors and a complete ABCD Unit

Page 9: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Data navigation

SpecimenCatalogue Number

SpecimenCatalogue Number

SpecimenCatalogue Number

Taxon ConceptName

Taxon ConceptName

Identified as

Identified as

Taxon ConceptName

PublicationTitle, Year

PublicationTitle, Year

PublicationTitle, Year

SynonymyRelationship

AuthorName

AuthorName

CollectorName

CollectorName

SequenceData

BarcodeData

BarcodeData

BarcodeData

Type material

Type material

Identified as

PDF

XML

XML

Referenced material

CharacterData

CharacterData

CharacterData

Type material

Page 10: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Integration strategiesDifferent approaches to integrating distributed information:

• Data Warehouseo Bring all data togethero Data owners can still maintain control over their datao Single agreed central data modelo All data in one place any complex query can be constructed

• Federated Networko Data remain with original ownerso Data owners can manage data as best suits themo Agreed data models for exchangeo Queries can become expensive as number of resources increases

• Indexed Network o Data remain with original ownerso Data owners can manage data as best suits themo Agreed data models for exchangeo Most frequently-issued queries can be handled in index

• Selection may be influenced by community culture

• Network may include warehouses/indices for specific areas

• Key driver should be user requirements (use cases)

Page 11: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Central services

GBIF plans to offer the following central services:

• Service Registry – discover data resources

• Data Index and Aggregated Data Services – find distributed data

• Schema Repository – interpret structured data

• Feedback Mechanisms – add value by connecting users to providers

• Data Validation Services – report potential inconsistencies within data

• Usage Reporting Mechanisms – recognise providers for sharing data

• Globally Unique Identifier Services – provide persistent references to data records

Page 12: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Networked data resources

Specimens:Flowering Plants of

Africa

Specimens:Proteaceae of

the World

Taxon Names:Proteaceae of

the World

Observations:Birds of Central

America

Observations:Butterflies of

Belize

Checklist:Birds of Belize

Specimens:Mammals of North Europe

Taxon Names:Mammals of

the World

Specimens:Bacteria Cultures

Taxon Names:Bacteria

Further Links:Bacteria

Further Links:Mammals

Museum A

Museum C University D

Observer Network B

GBIFNetwork

Page 13: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Service registry

Data Node Type of data Taxon Region Records

Museum A Specimen/Observation

Flowering Plants Africa 327000

Specimen/Observation

Proteaceae World 23000

Taxonomic Names Proteaceae World 1500

Observer Network B

Specimen/Observation

Birds Central America 68500

Specimen/Observation

Butterflies Belize 4200

Name List Birds Belize 587

Museum C Specimen/Observation

Mammals North Europe 1800

Taxonomic Names Mammals World 8000

General Resources Mammals World 600

University D Specimen/Observation

Bacteria World 1200

Taxonomic Names Bacteria World 5000

General Resources Bacteria World 400

Page 14: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Data index

Catalogue of Life

Biodiversity Data Access

Biodiversity DataIndex

Taxonomic Name

Service (ECAT)

Userrequests

GBIF Data Nodes

Specimen DataSpecimen DataLinks to other data

Specimen DataSpecimen DataName Lists

Specimen DataSpecimen DataObservation Data

Specimen DataSpecimen DataSpecimen Data

Page 15: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Data index

Page 16: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Distributed data access

GBIF Service Registry

GBIF Global Portal Regional PortalThematic Portal

Taxonomy / Nomenclature

PDFHTML web page

Struc-tured XML

Image

GBIF Data Index Entrez

Software Applications

Discovery

Indexing

Presentation &Analysis

ResourcesSpecimens

Descriptive Data

Sequences / Barcodes

OBIS

Network-specific

Registries

Structured Unstructured

Databases Documents

Access XML XMLTCS

XMLABCDDwC

XMLSDD

XMLTaXMLit

URL URL URL

CGIAR

Page 17: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Handling of unstructured data – a possible approach

• Develop set of keywords to classify web pages– Need standard classification of biodiversity information (probably with varying levels

of granularity)– Register pages under { taxonName, url, keywords, audience, language, ... }– Sufficient immediately to construct tables of links as part of taxon-specific web

pages

• Develop corresponding XML tags to mark up sections of web text (<description>, <habitat>, etc.)

– See TaXMLit from Biologia Centrali-Americana project for related approach– Possible to extract sections for re-use elsewhere– Example: a web site requires a description of common garden birds

• Find suitable text by audience and language for each taxon?

• Maximise benefits from effort expended in developing content– All information developed in any format can be indexed and discovered– All resources can contribute to the global data pool– Can develop plug-and-play taxonomic/regional/thematic portals based on the entire

network of registered data elements, structured and unstructured

Page 18: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Data validation

Administrator

Specimen

Researcher

Data Input

Database

Web Server

Internet

IndexingService

ValidationService

Validation service:

• Extensible framework for checking XML data

• Basic checks for XML validity (e.g. UTF-8)

• Validation against schema

• Checks that element content has expected structure or uses expected values

• Geospatial checks – coordinates in expected countries or on land / in sea

• Taxonomic checks – is name known? does it appear in referenced authority list? is higher taxonomy consistent?

Page 19: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Schema repository – version history

Schema A

1.0 1.1

Schema B

1.0 1.1 1.2

Structured store of data models for each schema

Modelling of version history for each schema

Single location from which to generate new machine-readable formats

Should such a tool also provide an environment in which to develop new data models?

.xsd

.dtd

.xsd

.???

.dtd

• Elements• Datatypes• Cardinality• Enumerated

values• Annotations

Page 20: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Schema repository – documentation

Automated generation of standardised HTML, PDF and other human readable documentation for each standard

Can easily generate documentation e.g. of inter-version differences

Schema A

1.0 1.1

Schema B

1.0 1.1 1.2

Page 21: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Schema repository – conceptual mappings

Schema A

1.0 1.1

Schema B

1.0 1.1 1.2

Identify relationships between elements in different versions of the same schema or different schemas (need interfaces to allow these linkages to be defined)

Generate software products to automate transformation between different versions and schemas

How should we handle different content models for related concepts?

Should mappings simply be between different imported schemas, or should they all be made against some central set of concepts?

.xslt

.pl?

.java?

Page 22: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Aggregator DiGIR provider

Museum B DiGIR provider

Museum A DiGIR provider

Problems to solve – data served from multiple locations

Different networks of providers will serve up the same records through different routes, and software tools will not be able to detect all duplicate records

Aggregator nodes may in many cases themselves be the sole provider for other important records

The same situation will also occur e.g. if a taxonomist serves data for all specimens from many institutions for a single genus, including some that may also have been digitised by the institutions themselves

No reliable mechanism to relate Agg:MusA/CollX to MusA:CollX or Agg:MusA/CollX:1001 to MusA:CollX:1001

Inst Coll CatNo

MusA CollX 1001

MusA CollX 1002

Inst Coll CatNo

MusB CollY 202-1

MusB CollY 465-2

Inst Coll CatNo

Agg MusA/CollX 1001

Agg MusA/CollX 1002

Agg MiscOther 1000

Agg MiscOther 1001

Global Provider Registry

Key Service Name

1 Museum A DiGIR

2 Museum B DiGIR

3 Aggregator DiGIR

Global Data Index

Service Inst Coll CatNo

1 MusA CollX 1001

1 MusA CollX 1002

2 MusB CollY 202-1

2 MusB CollY 465-2

3 Agg MusA/CollX 1001

3 Agg MusA/CollX 1002

3 Agg MiscOther 1000

3 Agg MiscOther 1001

Import by aggregator

Page 23: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Museum A DiGIR providerhttp://www.museumA.org/digir.php

Problems to solve – referring to data from outside network

Data records have no stable URLs on the network (even using constructed DiGIR queries)

Provider endpoints may change, and specimens may move to different institutions

There needs to be a reliable way to reference each data record from other web sites and in external media

The solution needs to handle situations in which data providers move to new locations

It should ideally also be possible to refer to a particular version of each record in case of modifications

Inst Coll CatNo

MusA CollX 1001

MusA CollX 1002

Global Data Indexhttp://www.gbif.net/data/digir.jsp

Service Inst Coll CatNo

1 MusA CollX 1001

1 MusA CollX 1002

Revision of the genus Aenetus

Material examined

Museum A, Collection X

1001, 1002

Revision of the genus Aenetus

Material examined

Museum A, Collection X

1001, 1002

Aenetus virescens

Distribution

Auckland (see 1001)

Wellington (see 1002)

?

?

Page 24: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Taxonomy Server B

Concept: Xus bus sensu J. Owen, 2002== Aus bus sensu P. Brown, 1949

Concept: Yus dus sensu J. Owen,2002== Aus dus sensu P. Brown, 1949

Taxonomy Server A

Concept: Aus bus sensu Smith, 2003== Aus bus sensu Jones, 1965== Aus cus bus sensu Black, 1926> Aus bus sensu Brown, 1949> Aus dus sensu Brown, 1949

Problems to solve – referring to taxon concepts

No standardised way to reference taxonomic publications or taxon concepts to allow reliable machine processing of the relationships between them (is ”P. Brown, 1949” the same as ”Brown, 1949”?)

Need mechanisms to allow data providers to refer to taxon concepts in a reliable and consistent way

User

What is the relationship between the concepts from Taxonomy Server A and Taxonomy Server B?

Page 25: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

• Universal reusable mechanism which can be applied to any type of biodiversity data record (specimen, collection, institution, taxon concept, taxonomic publication, image, etc.)

• Identifiers must refer uniquely to a single data element

• Users must be able to resolve each identifier to locate the data to which it relates

• Identifiers must be resolvable even if data changes ownership or the server is moved to a new endpoint

• Identifiers must be suitable for use by the wider life science community and others

• It should be as simple as possible for researchers– to create new identifiers– to discover existing identifiers associated with data items– to resolve identifiers to retrieve the associated data

Globally unique identifiers – requirements

Page 26: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Potential model – Life Science Identifiers (LSIDs)

Specification for Uniform Resource Name model from EMBL, IBM, I3C and OMG

”A straightforward approach to naming and identifying data resources stored in multiple, distributed data stores in a manner that overcomes the limitations of naming schemes in use today”

All LSIDs are formed from 5 or 6 colon-delimited components: urn, lsid, the domain name of the authority for the identifier, a namespace for the identifier, the identifier for the object within the namespace, and an optional revision number

Format

urn:lsid:<domainName>:<namespace>:<objectId>[:<revisionId>]

Example: referencing a PubMed article

urn:lsid:ncbi.nlm.nih.gov:pubmed:12571434

Example: referencing first version of the 1AFT protein in the Protein Data Bank

urn:lsid:pdb.org:pdb:1AFT:1

Potential example: referencing specimen record in GBIF Network (identifiers assigned centrally)

urn:lsid:gbif.net:Specimen:2706712

Potential example: name record from IPNI

urn:lsid:ipni.org:TaxonName:82090-3:1.1

Page 27: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Web-based taxonomy?

GBIF Registry

GBIF Global Portal Thematic Portal

Taxonomy / Nomenclature

PDFHTML web page

Struc-tured XML

Image

GBIF Data Index

Discovery

Indexing

Presentation &Analysis

ResourcesSpecimens

Descriptive Data

Sequences / Barcodes

Taxonomic Collaboration Environment

Structured Unstructured

Databases Documents

Sequences / Barcodes

Taxonomy / Nomenclature

Specimens

Characters

Literature

Network resources

Group resources

Network data

Group data

Taxonomist input

Revision

Page 28: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Taxonomy / Nomenclature

PDFHTML web page

Struc-tured XML

Image

SpecimensDescriptive

DataSequences /

Barcodes

What is Species Bank?

GBIF Data Portal

GBIF Data Index

User Browser

/ Client Tool

Map Service

Taxon Portal

HTML web pages

Regional Portal

Species Bank ?

Species Bank ?

Page 29: The GBIF Information Architecture Technological integration at the global level Donald Hobern GBIF Program Officer for Data Access and Database Interoperability

Links

Global Biodiversity Information Facility

http://www.gbif.org/Communications Portal

http://www.gbif.net/Data Portal

http://circa.gbif.net/Public/irc/gbif/dadi/library?l=/architectureArchitecture documents

Taxonomic Databases Working Group

http://www.tdwg.org/Including access to working groups