EuKRef. A community effort towards phylogenetic-based curation of ribosomal databases for environmental sequencing

Javier del Campo, Laura Parfrey

Posted on 14-Aug-2015

Page 1: EuKRef. A community effort towards phylogenetic-based curation of ribosomal databases for environmental sequencing

A community effort towards phylogenetic-based curation of ribosomal databases for environmental sequencing

Javier del Campo

Laura Parfrey

Page 2

Motivation

Integrate expert views on taxonomy into public database resources

Improve resources for high throughput sequence annotation

Page 3

Aims

• Catalyze experts in protist taxonomy to engage in curation and validation of a ribosomal DNA marker gene database for eukaryotic lineages across the tree of life.

• Synthesize the efforts of individual curators to produce a phylogenetically curated ribosomal DNA marker gene database for eukaryotes.

• Use the improved reference database to characterize the environmental distribution of eukaryotic microbes from large-scale HTES datasets.

Page 4

High-throughput environmental sequence (HTES) analysis of eukaryotes

1) HTES 18S rDNA sequence retrieval

2) Reference database annotation

3) Community analysis using classification

Page 5

Starting reference database 18S phylogeny

Page 6

Use the phylogeny to improve classification
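
One way to picture this step, as a minimal sketch rather than the project's actual method: if the labeled sequences in a clade carry taxonomy paths of varying depth, the ranks they all share are a conservative classification for unlabeled sequences that fall in the same clade. The function name `shared_classification` and the example paths are illustrative assumptions, not part of the slides.

```python
from typing import List, Sequence


def shared_classification(paths: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest leading run of ranks shared by every taxonomy path."""
    if not paths:
        return []
    shared: List[str] = []
    # zip stops at the shortest path, so paths of unequal depth are fine.
    for ranks in zip(*paths):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Hypothetical taxonomy paths for the labeled tips of one clade.
clade = [
    ["Eukaryota", "Opisthokonta", "Fungi", "Chytridiomycota"],
    ["Eukaryota", "Opisthokonta", "Fungi"],
    ["Eukaryota", "Opisthokonta", "Fungi", "Ascomycota"],
]
print(shared_classification(clade))
```

Because the paths are plain lists, the sketch places no limit on the number of ranks a curator may use.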

Page 7

Reference database 18S phylogeny after curation

Page 8

Integrate environmental metadata

Where was this sequence isolated?

• Fresh water or marine?

• Aerobic or anoxic?

• Host information? (symbiotic clades)
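
The questions above could be captured per sequence in a small record. The following is only a sketch under assumed names: the class `EnvMetadata`, its fields, and the accession `SEQ0001` are hypothetical and do not reflect the database's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnvMetadata:
    """Environmental metadata for one reference sequence (hypothetical schema)."""
    accession: str
    habitat: Optional[str] = None  # e.g. "freshwater" or "marine"
    oxygen: Optional[str] = None   # e.g. "aerobic" or "anoxic"
    host: Optional[str] = None     # host taxon, relevant for symbiotic clades

    def is_symbiotic(self) -> bool:
        # Treat a sequence as host-associated when a host is recorded.
        return self.host is not None


record = EnvMetadata(accession="SEQ0001", habitat="marine", oxygen="anoxic")
print(record.is_symbiotic())
```

Leaving every field optional reflects that many public sequences lack one or more of these annotations.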

Page 9

A manually curated reference DB: the opisthos example

[Figure: chart with an axis of HTS reads, 0 to 3500]

Page 10

Outputs for each group

• Set of curated sequences

– Files with chimeric sequences and short sequences

• Alignment of these sequences

• Phylogenetic tree

• Database

– Full classification (unlimited ranks)

– Environmental metadata

• Open access after 1 year embargo (if desired)
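
As an illustration of how short sequences might be set aside in their own file, here is a minimal sketch in plain Python. The helper names `parse_fasta` and `split_by_length` and the threshold of 10 bases are assumptions chosen for the toy example, not values from the curation pipeline; chimera detection needs a dedicated tool and is not covered.

```python
from typing import Dict, Iterable, Tuple


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA text into {sequence id: sequence}, joining wrapped lines."""
    records: Dict[str, str] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the id.
            name = (line[1:].split() or [""])[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def split_by_length(
    records: Dict[str, str], min_len: int
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split records into (kept, short) using a minimum length cutoff."""
    kept = {n: s for n, s in records.items() if len(s) >= min_len}
    short = {n: s for n, s in records.items() if len(s) < min_len}
    return kept, short


example = """>seqA
ACGTACGTAC
GTACGT
>seqB
ACGT
""".splitlines()

# min_len=10 is a toy cutoff for this example only.
kept, short = split_by_length(parse_fasta(example), min_len=10)
print(sorted(kept), sorted(short))
```

Keeping the rejected records rather than discarding them matches the listed output of separate files for short sequences.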

Page 11

18S reference DB curation pipeline (simplified)

Page 12

18S reference DB curation pipeline

Page 13

Workshop Outputs

• A refined curation pipeline, associated computational tools, and curation instructions

• Reference databases for individual lineages

• Synthesis of classification for each group

Page 14

After the workshop…

• Continued curation.

• Recruit new curators using refined tools.

• Coordinate with other groups.

• Integrate data from different curation efforts (into a cohesive database).

• Data sharing and distribution.

Page 15

Acknowledgments

Advisers and participants

Thank you!