Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Upload: abigail-harrison

Post on 13-Jan-2016


Page 1: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Global Working Checklist of Compositae

A TICA Project

Seed Funded by GBIF ECAT

Page 2: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Long Term Vision

Peter Raven (email to Vicki Funk 9.9.2005):

“…Whatever happens, we want and need one consolidated, agreed list [of Compositae species], and not a series of choices from various lists.”

Page 3: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

How to get there?

• Phase 1:

– Creation, consolidation and initial editing of a list of names of taxa integrated from existing electronic checklists and floras that are (nearly) complete, and which are available in structured databases or digital form.

– Followed by processing hard copy publications.

Page 4: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

How to get there?

• Phase 2:

– Full or partial checklist reports for taxa available for downloading from the TICA website.

– Taxonomists to examine taxonomy and nomenclature.

– Recording of comments and corrections.

Page 5: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

How to get there?

• Phase 3:

– Dealing with taxonomic differences.

Page 6: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

GBIF ECAT Seed Fund

• Duration: 1 March 2006 – 31 August 2007.

• Partners:
– Landcare Research, New Zealand (Lead Partner)
– Missouri Botanical Garden
– Royal Botanic Gardens, Kew
– Botanic Garden and Botanical Museum Berlin-Dahlem
– Australian National Herbarium, Centre for Plant Biodiversity Research, CSIRO
– University of Tokyo
– Smithsonian Institution
– South African National Biodiversity Institute, Pretoria (SANBI)
– Instituto de Botánica Darwinion, Buenos Aires
– The International Compositae Alliance (TICA)

Page 7: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

GBIF ECAT Seed Fund Project Team

• Jerry Cooper & Ilse Breitwieser
– Aaron Wilton
– Kevin Richards
– Christina Flann

Page 8: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Scope of the ECAT GBIF project

• Creation, consolidation and initial editing of names of Compositae taxa integrated from existing electronic checklists and Floras that are complete, or nearly complete, and which are available in structured databases or digital form.

• This will be followed by processing additional digital and hard copy publications (as many as possible within timeframe).

Phase 1 of Compositae checklist

Page 9: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Contracted objectives of the project

• The collation and integration of prioritised existing checklists into a Global Working Checklist of the Compositae;

• Where possible resolve and complete nomenclatural content (including homotypic synonyms);

• Capture, examine, report and resolve (as much as possible) differences in taxon concepts;

• Provide data contributors with regular reports of editorial changes;

• Make the developing checklist accessible via the Internet, hosted by TICA, and eventually linked to GBIF ECAT;

• Provide a framework for facilitating information flow and content revision among data contributors and the broader TICA community;

• Provide a substantial information basis (including a gap analysis) and operating framework for the completion and long-term maintenance of the global checklist.

Page 10: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Aims of the workshop

• Awareness of the project

• Feedback

• Phase 2/3 discussion, agreement, and planning
– How to continue once the GBIF contract is finished (Aug. 2007)?
– Future funding?
– Decisions on mechanisms for dealing with taxonomic differences need to be made at this workshop.

• Possible models: creation of an editorial board supported by specialist subgroups who will determine authoritative taxonomic views. What lessons have been learnt from similar projects, e.g. Euro+Med?

Page 11: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Global Working Checklist of Compositae

Background to the project

Jerry Cooper

Page 12: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

What is GBIF?

• The Global Biodiversity Information Facility

• Formed in 2000. Secretariat in Copenhagen

• Intergovernmental. 47 country signatories, and based on a ‘Memorandum of Understanding’

• In support of the Convention on Biological Diversity (CBD)

• An Internet based data sharing network for collection/observation/taxonomic data

• Currently serves 96 million records from 707 sources, and growth remains exponential

Page 13: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

What is ECAT?

• Electronic Catalogue of Names of Known Organisms

• A principal GBIF work programme

• Names of Taxa are the key to unlocking biodiversity data

• GBIF Seed funding awarded annually to start key databasing projects to deliver the ECAT

Page 14: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Why is ECAT a database mediated programme?

• Why a database?

– It makes explicit (‘unlocks’) the implicit information content of a checklist

– Ease of maintenance and transparency of derivation of content

– Application of Unique Identifiers facilitates digital connectivity of information across linked resources

– Efficient & flexible (re)use of information in many forms

Page 15: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Why is ECAT a database mediated programme?

• Why is it necessary to collate existing digital data as a first step?

• One centralized database, or multiple, distributed, connected databases?

Page 16: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Why is ECAT a database mediated programme?

A global database of names of taxa, and taxon concepts, will provide an essential digital backbone for unlocking and linking existing digital data, and for facilitating future taxonomic [database] checklist work.

Page 17: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Related global ‘names’ initiatives

• Catalogue of Life Consortium (Species 2000/ITIS), uBIO, GenBank Taxonomic Framework, CBOL …

• Taxonomic Databases Working Group

Page 18: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

What are we trying to achieve in this GBIF seed project?

Phase 1 IS NOT a taxonomic project

• The emphasis is on:
– Collating and integrating existing digital data
– Applying data standards
– Providing the resulting digital backbone as a service

• The value of the resulting consolidated database is considerable:
– Consistent nomenclature
– Gap analysis
– Identifying taxonomic opinion
– Significant contribution to the global, digitally accessible catalogue of life
– ‘Digital backbone’ of Compositae information

Page 19: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Scope & priorities for collation

1. Nomenclature

• Genus/species/infraspecific epithets (+orthographic variants)

• Linkage to basionym/replacement names (providing homotypic synonymy)

• Standardized Authors

• Linkage to place of publication
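To illustrate how a basionym link yields homotypic synonymy, here is a minimal sketch. It assumes each name record carries a hypothetical `basionym_id` field pointing directly at its basionym (or `None` when the name is itself a basionym); the record layout, the identifiers and the placeholder names are illustrative and are not taken from the project database.

```python
from collections import defaultdict

def homotypic_groups(names):
    """Group name ids that share a basionym (hypothetical record layout).

    `names` maps a name id to a dict with a `basionym_id` key. Names that
    point at the same basionym, plus the basionym itself, form one group.
    """
    groups = defaultdict(list)
    for name_id, record in names.items():
        # A name without a basionym link is treated as its own root.
        root = record["basionym_id"] or name_id
        groups[root].append(name_id)
    return {root: sorted(members) for root, members in groups.items()}

# Placeholder names, not real taxa.
example_names = {
    "n1": {"name": "Aus bus L.", "basionym_id": None},
    "n2": {"name": "Cus bus (L.) Auth.", "basionym_id": "n1"},
    "n3": {"name": "Aus dus Auth.", "basionym_id": None},
}
groups = homotypic_groups(example_names)
```

In this sketch a replacement name would need the same kind of link to the name it replaces; chains of more than one hop are not followed.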

Page 20: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Scope & priorities for collation

2. Taxonomic Opinion

• Heterotypic synonyms

• Preferred name for synonyms according to X in publication Y (basic taxon concepts)

• Position in a taxonomic hierarchy (genus-tribe-family – FGVP)

Page 21: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Scope & priorities for collation

3. Metadata

• Who provided which data

• How the provided data was consolidated, edited, and any consensus derived

• Unique identifiers for tracking both names & taxon concepts

Page 22: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Limitations to what we can achieve in phase 1

• Consensus taxonomic opinion?

• Infraspecific names?

• Common names?

• Distribution information?

• Consolidated bibliography?

• Published revisions?

Page 23: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Key technical outputs from phase 1

• Feedback to providers on overlap/mismatch

• Provision of URIs

• Web site providing easy access to information

• Web services

– providing end users with ability to incorporate/link catalogue data into other, new/existing work and maintain currency of these data

– providing GBIF ECAT with current information

Page 24: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Global Working Checklist of Compositae:

Project Methodology

Aaron Wilton

Page 25: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

From Agenda

Project Details
– Information ownership and acknowledgement
– The proposed methodology
– Nomenclature and Taxonomy
– Data integration methodology and the priority databases
– Database contributors
– Information services

Page 26: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Process Overview

[Diagram] Database (×3) → 1. Export → Data set (×3) → 2. Transform → 3. Import → Checklist Database → 4. Integrate → 5. Edit → 6. Report → 7. Checklist Website

Page 27: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

1. Data sets from Providers

• Format flexible

• Content
– Nomenclature
– Taxonomy
– References/Literature
– Important: Unique IDs and Modified Dates

• Metadata for website

Page 28: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

2. Transformation and Importation

• Transformation
– Convert to standard format
– Largely manual

• Importation
– Data sets added as prepared
– Maintain distinct records
– Linked to provider metadata
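As a rough illustration of the transformation and importation steps, the sketch below maps one provider row onto a common record while keeping the provider's own identifier. The standard field names, the `field_map` and the `provider_id` values are all assumptions made for the example; the slides do not specify the actual standard format, and the real transformation is described as largely manual.

```python
# Hypothetical standard fields; the project's real format is not given in the slides.
STANDARD_FIELDS = ("name", "author", "year", "citation", "page")

def to_standard(raw, field_map, provider_id, id_field):
    """Convert one provider row (a dict) into a standard-format record.

    `field_map` maps the provider's column names to standard field names and
    would be written by hand for each data set.
    """
    record = {field: None for field in STANDARD_FIELDS}
    for source_field, target_field in field_map.items():
        value = raw.get(source_field)
        if value not in (None, ""):
            record[target_field] = str(value).strip()
    # Keep the link back to the provider so records stay distinct and traceable.
    record["provider_id"] = provider_id
    record["provider_record_id"] = raw.get(id_field)
    return record

row = {"ID": "42", "Genus": " Antennaria ", "Auth": "Gaertn.", "Yr": 1791}
mapping = {"Genus": "name", "Auth": "author", "Yr": "year"}
standard = to_standard(row, mapping, provider_id="provider-1", id_field="ID")
```

Each imported record stays separate from the others; merging only happens later, in the integration step.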

Page 29: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

4. Integration

• Build list of “consensus records”

• Two steps
– Matching records
– Calculating consensus record

• Matching
– Use nomenclatural data
– Exact and fuzzy matches
– Matched records linked to consensus record
– New records assigned unique id

Page 30: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

4. Example of matching

Provider records (provider number, name):

1 Antennaria Link ex Fr.
1 Antennaria Gaertn.
2 Antennaria Fr.
2 Antennaria Gaertn.
3 Antennaria
3 Anaphalis DC.

Consensus records:

Antennaria Link ex Fr.
Antennaria Gaertn.
Anaphalis DC.
?
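The matching step can be sketched as follows. This is only an illustration: the slides do not say which fuzzy-matching method is used, so the example substitutes Python's standard `difflib.SequenceMatcher` with an arbitrary similarity threshold of 0.9, and the function names and integer ids are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import count

def normalise(name):
    """Lower-case, drop full stops and collapse whitespace before comparing."""
    return " ".join(name.lower().replace(".", " ").split())

def match_record(name, consensus, next_id, threshold=0.9):
    """Link a provider name to a consensus record.

    Returns (consensus_id, kind) where kind is "exact", "fuzzy" or "new".
    `consensus` maps consensus ids to names and is updated for new records.
    """
    key = normalise(name)
    for cid, cname in consensus.items():
        if normalise(cname) == key:
            return cid, "exact"
    best_id, best_score = None, 0.0
    for cid, cname in consensus.items():
        score = SequenceMatcher(None, key, normalise(cname)).ratio()
        if score > best_score:
            best_id, best_score = cid, score
    if best_id is not None and best_score >= threshold:
        return best_id, "fuzzy"  # a near match; an editor should verify it
    new_id = next(next_id)       # unmatched names get a new unique id
    consensus[new_id] = name
    return new_id, "new"

consensus_names = {}
ids = count(1)
first = match_record("Antennaria Gaertn.", consensus_names, ids)
again = match_record("Antennaria Gaertn.", consensus_names, ids)
typo = match_record("Antenaria Gaertn.", consensus_names, ids)
other = match_record("Anaphalis DC.", consensus_names, ids)
```

A pure string-similarity score cannot decide cases such as a name supplied without an author, which is why near matches are flagged for editorial verification rather than accepted silently.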

Page 31: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

4. Calculating Consensus

• Calculate from all linked records

• Each field based on majority except
– Ties
– Editor's record

Page 32: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

4. Example

Provider   Name        Author   Year  Citation           Page
1          Antennaria  Gaertn.  1821  Fruct. Sem. Pl. 2  410
2          Antennaria  Gaertn.  1821  Fruct. Sem. Pl. 2  419
3          Antennaria  Gaertn.  1791

Consensus  Antennaria  Gaertn.  1821  Fruct. Sem. Pl. 2  <null>  (Warning, Warning)

Editor                          1791                     410

Consensus  Antennaria  Gaertn.  1791  Fruct. Sem. Pl. 2  410
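The consensus calculation in this example can be sketched as below. The rule set is an interpretation of the slides, not the project's code: each field takes the majority value, a tie yields a null value, and an editor's value overrides both. Two further points are assumptions: missing values are ignored rather than counted as votes, and a warning is raised whenever providers disagree on a field.

```python
from collections import Counter

FIELDS = ("name", "author", "year", "citation", "page")

def consensus_field(values, editor_value=None):
    """Return (value, warning) for one field across all linked records."""
    if editor_value is not None:
        return editor_value, False           # the editor's record wins
    present = [v for v in values if v is not None]
    if not present:
        return None, False
    ranked = Counter(present).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None, True                    # tie: no consensus value
    return ranked[0][0], len(ranked) > 1     # majority; warn on disagreement

def consensus_record(records, editor=None):
    """Build a consensus record and list the fields that raised a warning."""
    editor = editor or {}
    result, warnings = {}, []
    for field in FIELDS:
        values = [record.get(field) for record in records]
        value, warn = consensus_field(values, editor.get(field))
        result[field] = value
        if warn:
            warnings.append(field)
    return result, warnings

providers = [
    {"name": "Antennaria", "author": "Gaertn.", "year": "1821",
     "citation": "Fruct. Sem. Pl. 2", "page": "410"},
    {"name": "Antennaria", "author": "Gaertn.", "year": "1821",
     "citation": "Fruct. Sem. Pl. 2", "page": "419"},
    {"name": "Antennaria", "author": "Gaertn.", "year": "1791"},
]
automatic, flagged = consensus_record(providers)
edited, remaining = consensus_record(providers, {"year": "1791", "page": "410"})
```

With the three provider records from the slide, the automatic pass keeps 1821 as the year and leaves the page empty, flagging both fields; the editor's record then sets the year to 1791 and the page to 410.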

Page 33: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

5. Editing

• Data priorities
– Nomenclatural
– References
– Taxonomy
– Other data

• Process
– Resolve data conflicts
– Verify links (provider to consensus records)
– Verify difference between near matches
– Fill gaps

• Editorial work recorded
– Editor's record created to record changes and inserts
– Verification flags

Page 34: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

6. Reporting

• Web services
– Available to data providers
– HTML or XML
– Functions will provide means to get
  • Full consensus data for a name
  • Comparison matrix showing
    – TICA ID and other provider IDs
    – Full data by data provider
  • Resolution of deprecated TICA IDs
  • All TICA IDs

• Manual
– As required
– Gap analysis
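One way the resolution of deprecated TICA ids could work is sketched below. It assumes a hypothetical lookup table, `replaced_by`, mapping each deprecated id to the id that superseded it; the id strings and the function name are made up for illustration and do not describe the actual web service.

```python
def resolve_id(tica_id, replaced_by):
    """Follow replacement links until a current (non-deprecated) id is reached."""
    seen = set()
    while tica_id in replaced_by:
        if tica_id in seen:
            # Guard against a corrupt table in which ids replace each other.
            raise ValueError(f"cyclic replacement chain at {tica_id!r}")
        seen.add(tica_id)
        tica_id = replaced_by[tica_id]
    return tica_id

# Hypothetical ids: T1 was merged into T2, which was later merged into T5.
replacements = {"T1": "T2", "T2": "T5"}
current = resolve_id("T1", replacements)
unchanged = resolve_id("T9", replacements)
```

Keeping deprecated ids resolvable lets data providers continue to use identifiers they stored earlier, even after consensus records have been merged.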

Page 35: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

7. Website

• Website presents data for
– Consensus record
– Taxonomic concepts

• Hybrid, preferred name

• Acknowledge contributions

• Automatically updated

Page 36: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Summary of Scope

Capture Integrate Edit Display

Nomenclature Taxonomy () ()

Literature () () ()

Other ()

Page 37: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Work Plan

• Integrator Development
– Nomenclature & Taxonomy (May – Sept)
– Literature (Sept – Dec)

• Web site
– Initial conversion (Complete)
– New reports and enhancements (Nov – Dec)
– Web services (Nov/Dec 2006)

• Data Editing (1 Sept 2006 to 30 August 2007)

• Data sets from Providers (now – end August?)

Page 38: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT

Data Received to Date

• IPNI (Compositae)

• Kadereit et al. Compositae from Families and Genera of Vascular Plants

• World Checklist of Seed Plants (A-I), Rafaël Govaerts

• Flora of Japan

• New Zealand Plant Names

Page 39: Global Working Checklist of Compositae A TICA Project Seed Funded by GBIF ECAT