Page 1: GLOBAL BIODIVERSITY INFORMATION FACILITY Greg Riccardi Co-chair 9 November 2009  Outcomes of the GBIF LSID-GUID Task Group

GLOBAL BIODIVERSITY INFORMATION FACILITY

Greg Riccardi

Co-chair

9 November 2009

WWW.GBIF.ORG

Outcomes of the GBIF LSID-GUID Task Group

Page 2:

Overview

Task Group Overview
The Characteristics of Effective Identifiers
Benefits and Opportunities
Recommendations
Discussion Session: Thursday, 12 Nov, 1400-1530

Page 3:

GUID Goals from GBIF Strategic Plans

The GBIF strategic plans document includes the following goals:

To consolidate the underlying enabling infrastructure and standardisation for global connectivity of biodiversity data and information

To develop a system of globally unique identifiers and encourage their use throughout biodiversity informatics

To use TDWG standards to allow all data objects to be identified using standard actionable globally unique identifiers

To provision GBIF web services and user interfaces to allow users to locate and view any data object with a standard globally unique identifier.

Page 4:

Call to the Task Group

GBIF convened a task group, the “LSID GUID Task Group” (LGTG), to explore the issues and offer recommendations on the way forward, with particular reference to the GBIF network, that will enable GBIF to provide architecture leadership and best practices for implementation.

The principal objective of the group is to provide recommendations and guidelines on the deployment of identifiers on the GBIF network, with particular reference to the potential role of GBIF as a stable, long-term provider of identifier resolution services.

Page 5:

Members

Phil Cryer (Missouri Botanical Garden)
Roger Hyam (Natural History Museum and PESI)
Chuck Miller (Missouri Botanical Garden)
Nicola Nicolson (Royal Botanic Gardens, Kew)
Éamonn Ó Tuama (GBIF)
Rod Page (University of Glasgow)
Jonathan Rees (Science Commons)
Greg Riccardi (co-chair, Florida State University)
Kevin Richards (Landcare Research, New Zealand)
Richard White (co-chair, Cardiff University)

Page 6:

Results

Report document
  Draft written at the August 2009 workshop at GBIF
  Revised for distribution in October 2009
Contents of report
  Overview of definitions and technology
  Recommendations for the GBIF secretariat and for the biodiversity community
Report delivered to GBIF Science Committee
  Response of committee (at end of talk)

Page 7:

Overview

Task Group Overview
The Characteristics of Effective Identifiers
Benefits and Opportunities
Recommendations
Discussion Session: Thursday, 12 Nov, 1400-1530

Page 8:

Preliminary Definition

An identifier is a character string associated with an object.
  Identifiers are used in informatics to refer to objects in data sets, documents and repositories.
Some identifiers are useful
Some are more useful

Page 9:

Characteristics of Effective Identifiers

Two use cases that make identifiers effective for users
Uniqueness of reference to a single object
  An identifier can be used to aggregate information about the identified object
  For example, information received from multiple sources associated with a single identifier is information about a single object.
Actions may be carried out using the identifier
  An identifier can be used to find further information about the object, concept or data to which it refers.
  This information might be interpreted directly or used to support services.
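
The first use case, aggregation by identifier, can be sketched in a few lines of Python. This is only an illustration: the record layout, the provider names and the `urn:example:` identifiers are invented here and do not come from the task group report.

```python
from collections import defaultdict


def aggregate_by_identifier(records):
    """Group (identifier, source, facts) records under their identifier."""
    merged = defaultdict(dict)
    for identifier, source, facts in records:
        # Facts from different sources that share an identifier are
        # treated as statements about one and the same object.
        for key, value in facts.items():
            merged[identifier].setdefault(key, {})[source] = value
    return dict(merged)


# Hypothetical records from two hypothetical providers.
records = [
    ("urn:example:specimen:42", "museum_a", {"country": "Cuba"}),
    ("urn:example:specimen:42", "portal_b", {"collector": "J. Doe"}),
    ("urn:example:specimen:43", "museum_a", {"country": "Peru"}),
]

aggregated = aggregate_by_identifier(records)
print(sorted(aggregated["urn:example:specimen:42"]))
```

Each fact keeps a note of the source that supplied it, so conflicting values from two providers stay visible instead of silently overwriting each other.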

Page 10:

Problems with terminology

The task group struggled with terms
GUID is problematic
  Used in IT to refer to the way that Microsoft uses 128 bit UUIDs
  Used in biodiversity to refer to …
Persistent, actionable identifier
  The Task Group recommendation for terminology
  Two required characteristics: persistent and actionable

Page 11:

Persistent Identifier

Persistence: The property that an identifier always refers to a specific object.
  All information associated with a persistent identifier is about the same object.
  The properties of the object are subject to change, but once a persistent identifier is assigned to one object, it cannot be reused to refer to a different object.
Example
  ITIS TSNs are integers that are persistent identifiers for taxa
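
The persistence rule (properties may change, the binding may not) can be illustrated with a toy registry. This is a sketch under stated assumptions: the `PersistentRegistry` class, the `id:1001` identifier and the object keys are hypothetical and do not describe how ITIS or any other provider manages its identifiers.

```python
class PersistentRegistry:
    """Toy registry: an identifier, once assigned, never points elsewhere."""

    def __init__(self):
        self._objects = {}     # identifier -> object key (never reassigned)
        self._properties = {}  # object key -> mutable properties

    def assign(self, identifier, object_key, **properties):
        existing = self._objects.get(identifier)
        if existing is not None and existing != object_key:
            # Reuse for a different object would break persistence.
            raise ValueError(f"{identifier!r} already refers to {existing!r}")
        self._objects[identifier] = object_key
        self._properties.setdefault(object_key, {}).update(properties)

    def update(self, identifier, **properties):
        # Properties of the object may change; the binding may not.
        self._properties[self._objects[identifier]].update(properties)

    def describe(self, identifier):
        return dict(self._properties[self._objects[identifier]])


registry = PersistentRegistry()
registry.assign("id:1001", "taxon-a", name="Example taxon")
registry.update("id:1001", rank="species")

try:
    registry.assign("id:1001", "taxon-b")  # attempt to reuse the identifier
    reused = True
except ValueError:
    reused = False

print(registry.describe("id:1001"), reused)
```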

Page 12:

Actionable Identifiers

An identifier is actionable if there is a service that, given the identifier, provides information about the object identified
  E.g., a resolution service maps an identifier into a Web service that provides information about the identifier and its associated object
Example
  An HTTP URI is actionable.
  The HTTP system provides mechanisms for clients to access information about a data object from its associated identifier.
  ITIS TSNs are actionable because ITIS supports services that provide information for TSNs.
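
A resolution step of this kind might look like the following sketch. The resolver host `resolver.example.org` and the scheme table are assumptions made for illustration; real resolution services publish their own URL patterns, and the sketch only computes the URL a client would fetch without making any network request.

```python
from urllib.parse import quote

# Hypothetical resolver table, keyed by identifier scheme.
RESOLVERS = {
    "http": lambda ident: ident,   # an HTTP URI is its own locator
    "https": lambda ident: ident,
    "urn": lambda ident: "https://resolver.example.org/" + quote(ident, safe=":"),
}


def resolution_url(identifier):
    """Return the URL a client would fetch to learn about the identified object."""
    scheme = identifier.split(":", 1)[0].lower()
    try:
        return RESOLVERS[scheme](identifier)
    except KeyError:
        # No known service: the identifier is not actionable on its own.
        raise LookupError(f"no resolution service known for {identifier!r}") from None


print(resolution_url("http://example.org/specimen/42"))
print(resolution_url("urn:lsid:example.org:specimens:42"))
```

A bare integer such as `12345` raises `LookupError` in this sketch; it becomes actionable only once some service agrees to answer for it.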

Page 13:

Good Identifier Technologies

HTTP URI: A fundamental technology of WWW
  Persistence assured using DNS
  Actionable through HTTP protocol
LSID: Life Science Identifiers
  Persistence assured by convention
  Actionable according to the LSID services model
  May be mapped into HTTP URI by resolution services
Recommendation: Both are important to biodiversity and should be supported by GBIF
UUID
  Persistence assured by random assignment
  Not independently actionable
  Can be an effective part of HTTP URI and LSID technologies
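
How the three technologies can fit together is sketched below: an LSID is parsed according to its `urn:lsid:authority:namespace:object[:revision]` syntax, mapped to an HTTP URI through a proxy, and a random UUID serves as the local part of both identifier forms. The proxy address, authority, namespace and base URI are hypothetical placeholders, not services named in the slides.

```python
import uuid
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text):
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    prefix = [p.lower() for p in parts[:2]]
    if len(parts) not in (5, 6) or prefix != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"empty LSID component in {text!r}")
    return LSID(*parts[2:])


def lsid_to_http_uri(lsid_text, proxy="https://lsid-proxy.example.org/"):
    """Map an LSID to an HTTP URI via a (hypothetical) resolution proxy."""
    parse_lsid(lsid_text)  # validate before building the URI
    return proxy + lsid_text


def mint_identifiers(authority, namespace, base_uri):
    """Use one random UUID as the local part of an LSID and an HTTP URI."""
    local_id = str(uuid.uuid4())
    lsid_text = f"urn:lsid:{authority}:{namespace}:{local_id}"
    http_uri = f"{base_uri.rstrip('/')}/{local_id}"
    return lsid_text, http_uri


lsid_text, http_uri = mint_identifiers(
    "example.org", "specimens", "https://data.example.org/specimen/"
)
print(lsid_text)
print(http_uri)
print(lsid_to_http_uri(lsid_text))
```

The UUID is not actionable by itself in this sketch; it becomes so only once it is embedded in an LSID or an HTTP URI that some service answers for.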

Page 14:

Overview

Task Group Overview
The Characteristics of Effective Identifiers
Benefits and Opportunities
Recommendations
Discussion Session: Thursday, 12 Nov, 1400-1530

Page 15:

Example Benefits of IDs

Tracking citation and impact
  The association among objects might be contained in a blog post:
    Joe writes “I searched the GBIF repository for all frogs from Cuba. The objects that I found useful are in the collection [ID1]. I plotted the locations of the records [ID2] and reported the results in my paper [ID3].”
  Such an association provides feedback and is used by search engines in rankings and ratings
Management and disambiguation of taxon names
  Disambiguation of taxon names requires services that support tests of difference as well as of equality.
  Different identifiers do not necessarily refer to different objects.
  Tests of inequality for objects must rely on evaluation of metadata or of the objects themselves.
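
The asymmetry between equality and inequality tests can be shown with a small comparison function. This is a hedged sketch: the metadata keys, the four outcome labels and the rule that any disagreement in shared metadata means "different" are simplifying assumptions, and the name "Aus bus" is a placeholder.

```python
def compare(record_a, record_b, keys=("name", "author", "year")):
    """Classify two identified records as same, different, or unresolved."""
    if record_a["id"] == record_b["id"]:
        return "same"  # one identifier refers to one object
    # Different identifiers prove nothing by themselves; inspect the metadata.
    shared = [k for k in keys if k in record_a and k in record_b]
    if not shared:
        return "undetermined"
    if any(record_a[k] != record_b[k] for k in shared):
        return "different"
    # Matching metadata suggests, but does not prove, a single object.
    return "possibly same"


a = {"id": "id:1", "name": "Aus bus", "author": "Example", "year": 1900}
b = {"id": "id:2", "name": "Aus bus", "author": "Example", "year": 1900}
c = {"id": "id:3", "name": "Aus bus", "author": "Other", "year": 1950}

print(compare(a, a), compare(a, b), compare(a, c))
```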

Page 16:

Opportunity

Integrating identifiers with the Semantic Web and the Linked Data model
  Linked Data (http://linkeddata.org) is a vision of a web of interconnected data, to be consumed by machines
  HTTP URIs are used as identifiers, and the data is described using RDF
  If we use HTTP URIs for identifiers, we will be part of Linked Data
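
To make the idea concrete, the sketch below describes one record with RDF triples and serializes them as N-Triples using only the Python standard library. The subject and linked URIs under `example.org` and `example.net` are hypothetical, as is the choice of Darwin Core terms and `owl:sameAs` for this record; the slides do not prescribe a vocabulary.

```python
def uri(value):
    """Format a URI as an N-Triples term."""
    return f"<{value}>"


def literal(value):
    """Format a plain string literal, escaping characters N-Triples reserves."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


DWC = "http://rs.tdwg.org/dwc/terms/"                # Darwin Core terms namespace
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

specimen = "http://data.example.org/specimen/42"     # hypothetical HTTP URI identifier
triples = [
    (uri(specimen), uri(DWC + "scientificName"), literal("Aus bus")),
    (uri(specimen), uri(DWC + "country"), literal("Cuba")),
    # Link to another provider's record for the same object (hypothetical URI).
    (uri(specimen), uri(OWL_SAME_AS), uri("http://other.example.net/records/abc")),
]

ntriples = to_ntriples(triples)
print(ntriples)
```

Because both the subject and the linked record are HTTP URIs, a machine client can follow either one to fetch more data, which is the interconnection the Linked Data model relies on.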

Page 17:

Potential Linked Data Model

Page 18:

Overview

Task Group Overview
The Characteristics of Effective Identifiers
Benefits and Opportunities
Recommendations
Discussion Session: Thursday, 12 Nov, 1400-1530

Page 19:

Recommendations: GBIF Should

Take the leadership role in driving the application and use of identifiers in biodiversity informatics,
Provide materials such as an executive summary targeted to administrative leadership explaining the costs and benefits of implementing persistent identifiers,
Educate the community in general persistent identifier principles and practices,
Encourage, support and advise on the use of appropriate identifier technologies, in particular LSIDs and HTTP URIs, but not impose a requirement for one at the expense of the other, and provide specific advice for the issuing and use of LSIDs and HTTP URIs,
Support a promotional programme,
Demonstrate good practice in its data portal,
Assist providers that are not currently maintaining their own persistent identifiers to do so: this includes both education and technology,
Make data more inter-connected,
Start a programme to become an RDF consumer and encourage data providers to deploy RDF services,
Provide services to support identifier resolution, redirection, metadata hosting, and caching,
Provide additional services, including persistent identifier monitoring services,
Extend the role of its data portal by hosting resources related to the use of identifiers, such as the TDWG vocabularies,
Assist with the availability of software for data and service providers, and
Continue to be funded to provide support to data providers for the foreseeable future.

Page 20:

Response of the GBIF Science Committee

The SC reviewed and endorsed the report of the LSID GUID TG (LGTG).
The SC recommends that
  An additional full case study be developed in the document to highlight the new quality control mechanisms that can be established to have users report and receive feedback on the quality of data being served.
Additionally, the LGTG report makes excellent “obligatory reading material” for the Biodiversity Informatics community in general and for GBIF Participants in particular.
The SC strongly recommends that all participants read it and be aware of the impact that the implementation of tools such as IPT and GBRDS will have in their local contexts as well as globally.

Page 21:

How to contact GBIF:

Web site: www.gbif.org
Data portal: data.gbif.org

GBIF Secretariat
Universitetsparken 15
2100 Copenhagen
Denmark

E-mail: [email protected]
Tel: +45 3532 1470
Fax: +45 3532 1480