Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting...

TIGR TIGR

Upload: jonathan-eisen

Post on 01-Jun-2015


DESCRIPTION

Talk by Jonathan Eisen on Phylogenomics of microbes at Lake Arrowhead Small Genomes meeting in 2002.

TRANSCRIPT

Page 1: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


Page 2: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

“Nothing in biology makes sense except in the light of evolution.”

T. H. Dobzhansky (1973)

Page 3: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Topics of Discussion

• Introduction to phylogenomics
• Uses of evolutionary analysis in genomics
– Selection of species
– Functional prediction
– Gene duplication
– Gene loss
– Genome rearrangements
– Lateral transfer
– Uncultured species
– Specialization

Page 4: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


Phylogenomic Analysis

Phylogenomics involves combining evolutionary reconstructions of genes, proteins, pathways, and species with analysis of complete genome sequences.

Page 5: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Uses of Phylogenomics

• Selection of species
• Functional prediction
• Gene duplication
• Intragenomic movement
• Gene loss
• Lateral transfer
• Genome rearrangements
• Uncultured species

Page 6: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Strain Selection and Evolution

• Increasing phylogenetic representation
• Determining relatedness to model organism
• Understanding major evolutionary transitions
• Identifying taxa with unusual (high or low) rates of evolution
• Identifying source of DNA from uncultured species
• Species naming and type strains (e.g., see Ward et al. 2001)

Page 7: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Evolutionary Diversity Still Poorly Represented in Complete Genomes

[Figure panels: Bacteria, Archaea]

Page 8: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

S. pombe Genome Analysis

Eukaryotes vs. Prokaryotes

[Figure labels: Bacteria, Archaea, Eukaryotes. Eukaryote taxa shown: Giardia, Trichomonas, Naegleria, Trypanosoma, Euglena, Plasmodium, Tetrahymena, Phytophthora, Arabidopsis, Chlamydomonas, Dictyostelium, Humans, Fly, Worm, Encephalitozoon, S. cerevisiae, S. pombe]

Page 9: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Single vs. Multi-celled

[Figure labels, taxa: Giardia, Trichomonas, Naegleria, Trypanosoma, Euglena, Plasmodium, Tetrahymena, Phytophthora, Arabidopsis, Chlamydomonas, Dictyostelium, Humans, Fly, Worm, Encephalitozoon, S. cerevisiae, S. pombe]

[Figure labels, groups: Plants, Fungi, Animals, Parabasalia, Diplomonads, Microsporidia, Dictyostelia, Heterokonts, Ciliates, Apicomplexa, Kinetoplastids, Euglenas, Acrasidae]

Page 10: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Uses of Phylogenomics

• Selection of species
• Functional prediction
• Gene duplication
• Intragenomic movement
• Gene loss
• Lateral transfer
• Genome rearrangements
• Uncultured species

Page 11: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Predicting Function

• Identification of motifs
• Homology/similarity based methods
– Highest hit, top hit, HMMs, threading
• Evolutionary methods
– Phylogenetic trees
– Ds/Dn
– Phylogenetic profiles
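As a rough illustration of the phylogenetic profiles idea listed above: each gene can be represented as a presence/absence vector across a set of genomes, and genes with matching vectors become candidates for a shared function. The sketch below is a minimal example under stated assumptions. The gene names, the number of genomes, and the profile values are invented for illustration and do not come from the talk.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) profiles across five genomes.
# Gene names and values are made up for illustration only.
PROFILES = {
    "geneA": (1, 0, 1, 1, 0),
    "geneB": (1, 0, 1, 1, 0),
    "geneC": (0, 1, 0, 0, 1),
    "geneD": (1, 1, 1, 0, 0),
}


def hamming(p, q):
    """Number of genomes in which two profiles disagree."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same genomes")
    return sum(a != b for a, b in zip(p, q))


def similar_pairs(profiles, max_distance=0):
    """Return gene pairs whose profiles differ in at most max_distance genomes."""
    pairs = []
    for g1, g2 in combinations(sorted(profiles), 2):
        if hamming(profiles[g1], profiles[g2]) <= max_distance:
            pairs.append((g1, g2))
    return pairs


if __name__ == "__main__":
    # Exact profile matches only.
    print(similar_pairs(PROFILES))
    # Allow up to two disagreements.
    print(similar_pairs(PROFILES, max_distance=2))
```

A matching profile is only a hint of functional linkage. Sampling of genomes and gene loss can produce the same pattern, so hits from a sketch like this would still need checking against trees.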


Page 12: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

[Figure, panels A-D: tree of MutS family proteins, with subfamilies MSH2, MSH6, MSH3, MSH1, MutS1, MutS2, MSH5, and MSH4 marked. Function labels in panel D:]

• MSH4: Segregation & Crossover
• MSH5: Segregation & Crossover
• MutS1: All MMR (Bacteria)
• MSH1: MMR in Mitochondria
• MSH3: MMR of Large Loops in Nucleus
• MSH6: MMR of Mismatches and Small Loops in Nucleus
• MSH2: All MMR in Nucleus

Page 13: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


rRNA and Uncultured Microbes

Page 14: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Evolutionary Rate Variation

Page 15: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Uses of Phylogenomics

• Selection of species
• Functional prediction
• Gene duplication
• Gene loss
• Lateral transfer
• Genome rearrangements
• Uncultured species

Page 16: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


Why Duplications Are Useful to Identify

• Allows division into orthologs and paralogs

• Improves functional predictions

• Helps identify mechanisms of duplication

• Can be used to study mutation processes in different parts of a genome

• Lineage-specific duplications may indicate species-specific adaptations
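One way to make the ortholog/paralog division above concrete is the species-overlap rule on a rooted gene tree: an internal node is called a duplication if the same species appears on both sides, and a speciation otherwise. The sketch below is a minimal illustration of that rule, not the method used in the talk. The nested-tuple tree encoding, the `Species|gene` leaf naming, and the toy tree are all assumptions made for the example.

```python
def species_of(leaf):
    """Leaves are 'Species|gene' strings (a naming convention assumed here)."""
    return leaf.split("|")[0]


def leaves(node):
    """Return all leaf names under a node of a nested-tuple binary tree."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)


def label_nodes(node, events=None):
    """Post-order walk labelling each internal node.

    A node is a 'duplication' if its two children share any species,
    otherwise a 'speciation'.
    """
    if events is None:
        events = []
    if isinstance(node, str):
        return events
    left, right = node
    label_nodes(left, events)
    label_nodes(right, events)
    left_species = {species_of(leaf) for leaf in leaves(left)}
    right_species = {species_of(leaf) for leaf in leaves(right)}
    kind = "duplication" if left_species & right_species else "speciation"
    events.append((kind, sorted(leaves(left)), sorted(leaves(right))))
    return events


def ortholog_pairs(tree):
    """Gene pairs whose last common ancestor node is a speciation."""
    pairs = set()
    for kind, left, right in label_nodes(tree):
        if kind == "speciation":
            for a in left:
                for b in right:
                    pairs.add(tuple(sorted((a, b))))
    return pairs


# Toy gene tree: one duplication at the root, then a speciation in each copy.
TOY_TREE = (("X|g1", "Y|g1"), ("X|g2", "Y|g2"))

if __name__ == "__main__":
    for event in label_nodes(TOY_TREE):
        print(event)
    print(sorted(ortholog_pairs(TOY_TREE)))
```

In the toy tree, `X|g1` and `Y|g1` come out as orthologs, while `X|g1` and `X|g2` are separated by the root duplication and so are paralogs.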

Page 17: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Lineage Specific Duplications in Wolbachia wMel

Annotation column of the table. Distinct entries are listed once; most appear several times in the table, "hypothetical protein" and "conserved hypothetical protein" most often.

• ankyrin repeat domain protein
• conserved domain protein
• conserved hypothetical protein
• conserved hypothetical protein, FRAMESHIFT
• conserved hypothetical protein, POINT MUTATION
• conserved hypothetical protein, degenerate
• conserved hypothetical protein, interruption-C
• conserved hypothetical protein, truncated
• conserved hypothetical protein, truncation
• DNA mismatch repair protein MutL (mutL)
• DNA repair protein RadC, putative
• DNA repair protein RadC, putative, truncation
• DNA repair protein RadC, truncation
• DnaJ domain protein
• exopolysaccharide synthesis protein ExoD-related protein
• HNH endonuclease family protein
• hypothetical protein
• major facilitator family transporter
• membrane protein, putative
• MutL family protein
• Na+/H+ antiporter family protein
• Na+/H+ antiporter, putative
• permease, putative
• portal protein, FRAMESHIFT
• prophage LambdaW1, DNA methylase
• prophage LambdaW1, terminase large subunit, putative
• prophage LambdaW2, ankyrin repeat domain protein
• prophage LambdaW2, baseplate assembly protein J, putative
• prophage LambdaW2, baseplate assembly protein V, putative, FRAMESHIFT
• prophage LambdaW2, baseplate assembly protein W, putative
• prophage LambdaW2, minor tail protein Z, putative, FRAMESHIFT
• prophage LambdaW2, site-specific recombinase, resolvase family
• prophage LambdaW4, ankyrin repeat domain protein
• prophage LambdaW4, DNA methylase
• prophage LambdaW4, portal protein, FRAMESHIFT
• prophage LambdaW4, terminase large subunit, putative
• prophage LambdaW5, ankyrin repeat domain protein
• prophage LambdaW5, baseplate assembly protein J, putative, FRAMESHIFT
• prophage LambdaW5, baseplate assembly protein V, putative
• prophage LambdaW5, baseplate assembly protein W, putative
• prophage LambdaW5, minor tail protein Z, putative, degenerate, FRAMESHIFT
• prophage LambdaW5, site-specific recombinase, resolvase family
• regulatory protein RepA, putative
• reverse transcriptase, putative
• sodium/alanine symporter family protein
• TenA/THI-4 family protein
• transcriptional regulator
• transcriptional regulator, putative
• translation elongation factor Tu (tuf)
• transposase, degenerate
• transposase, IS4 family
• transposase, IS5 family, interruption-N
• transposase, IS5 family, truncation
• transposase, putative, degenerate
• type IV secretion system protein VirB4, putative
• UDP-N-acetylglucosamine pyrophosphorylase-related protein

Page 18: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

MutL Duplication in Wolbachia wMel

ORF01096 DNA mismatch repair protein MutL (mutL)
ORF00446 MutL family protein

Page 19: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


MutL Duplication in Wolbachia wMel

Page 20: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Older Duplication of UVDE

[Figure: tree, scale bar 0.1. Leaf labels:]

• Schizosaccharomyces pombe GP139
• Neurospora crassa PIRS55262S552
• Clostridium perfringens GP18145
• Bacillus subtilis SPP45864YWJD
• Bacillus cereus GP6759487embCAB
• B BACAN 01914 UV endonuclease
• Bacillus halodurans OMNINTL01BH
• B BACAN 01459 UV endonuclease
• Deinococcus radiodurans GP61167
• Nostoc sp. PCC 7120 GP17130610d

Page 21: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Uses of Phylogenomics

• Selection of species
• Functional prediction
• Gene duplication
• Intragenomic movement
• Gene loss
• Lateral transfer
• Genome rearrangements
• Uncultured species

Page 22: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


X-files

Eisen et al. 2000. Genome Biology 1(6): 11.1-11.9

Also see Tillier and Collins. 2000. Nature Genetics 26(2): 195-7 and Suyama and Bork. 2001. Trends in Genetics 17: 10-13.

Page 23: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

C. trachomatis vs C. pneumoniae Dot Plot

[Figure: dot plot. Axes: C. trachomatis MoPn and C. pneumoniae AR39. Origin and Terminus are marked.]

Read et al. 2000
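The X-shaped pattern in dot plots like this one is interpreted in Eisen et al. 2000 as the result of inversions that are symmetric about the replication origin. The toy simulation below sketches why that produces an X: a gene at distance d on one side of the origin ends up at distance d on the other side, so every point stays on the diagonal or lands on the anti-diagonal. Genome size, number of inversions, and the random seed are arbitrary assumptions, and genes are reduced to integer indices.

```python
import random


def symmetric_inversion(order, k):
    """Invert the segment spanning k genes on each side of the origin.

    `order` is a circular gene order whose index 0 is the gene at the
    replication origin. The inverted segment covers indices -k..+k.
    """
    n = len(order)
    if not 0 < k < n / 2:
        raise ValueError("k must be positive and less than half the genome")
    new = list(order)
    for i in range(-k, k + 1):
        # The gene at offset -i from the origin moves to offset +i.
        new[i % n] = order[(-i) % n]
    return new


def simulate(n=100, inversions=5, seed=0):
    """Apply several random origin-centred inversions to a genome of n genes."""
    rng = random.Random(seed)
    genome = list(range(n))
    for _ in range(inversions):
        genome = symmetric_inversion(genome, rng.randint(1, n // 2 - 1))
    return genome


def dot_plot(reference, derived):
    """Return (position in reference, position in derived) for every gene."""
    position = {gene: i for i, gene in enumerate(derived)}
    return [(i, position[gene]) for i, gene in enumerate(reference)]


if __name__ == "__main__":
    n = 100
    points = dot_plot(list(range(n)), simulate(n))
    on_diagonal = sum(1 for x, y in points if y == x)
    on_anti_diagonal = sum(1 for x, y in points if y != x and y == (n - x) % n)
    print(on_diagonal, on_anti_diagonal, len(points))
```

Real genomes also experience asymmetric inversions, gene loss, and insertions, so an observed dot plot is noisier than this idealized case.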

Page 24: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

StrpB vs. StrpA: All

[Figure: plot with axis ticks 0 to 2500 and 13621300 to 13623100]

Page 25: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

StrpB vs. StrpA: Orthologs

[Figure: plot with axis ticks 0 to 2500 and 13621300 to 13623100]

Page 26: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Uses of Phylogenomics

• Selection of species
• Functional prediction
• Gene duplication
• Intragenomic movement
• Gene loss
• Lateral transfer
• Genome rearrangements
• Uncultured species

Page 27: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Most ‘Evidence’ for Gene Transfer has Alternative Explanations

Observation | Other Causes | Always Occurs
Unusual Distribution | Sampling bias | Not if recipient already has gene.
Unusual GC/Codons | Selection | Not if donor/recipient similar. Not if it occurred long ago.
High hit to "distant" species | Selection, Rate variation, Gene loss | Usually.
Incongruent trees | Bad trees, Missed paralogs | Usually.
Correlation of above with neighbors | Selection | Only if genes keep order after transfer.
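To make the "Unusual GC/Codons" row concrete, the sketch below flags genes whose GC fraction sits far from the mean across a set of genes. It is a minimal illustration under stated assumptions: the z-score cutoff of 2.0, the toy sequences, and the gene names are all invented here. As the table warns, a flagged gene is only a candidate, since selection can also shift composition, and an unflagged gene may still have been transferred.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_outliers(genes, z_cutoff=2.0):
    """Names of genes whose GC fraction is more than z_cutoff population
    standard deviations away from the mean of all genes supplied."""
    fractions = {name: gc_fraction(seq) for name, seq in genes.items()}
    mu = mean(fractions.values())
    sigma = pstdev(fractions.values())
    if sigma == 0:
        return []  # all genes have identical composition
    return sorted(
        name for name, frac in fractions.items()
        if abs(frac - mu) / sigma > z_cutoff
    )


# Toy data, invented for illustration: five genes at 50% GC, one at 100% GC.
TOY_GENES = {
    "g1": "ATGCATGC",
    "g2": "ATGCATGC",
    "g3": "ATGCATGC",
    "g4": "ATGCATGC",
    "g5": "ATGCATGC",
    "gX": "GCGCGCGC",
}

if __name__ == "__main__":
    print(gc_outliers(TOY_GENES))
```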

Page 28: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Steps in Lateral Gene Transfer

[Figure: steps numbered 1, 2, 3-5, and 6; labels A, B, C, D]

Page 29: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


Mitochondrial Genome Integration into A. thaliana chrII

Lin et al., 1999

Page 30: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Number of pBVTs Depends on # of Genomes Analyzed

[Figure: chart. Categories: 1, 2, 3, 4, 5, Other. Axis: Number of protein sets (0 to 1800). Legend: Fruit fly, C. elegans, Arabidopsis, Yeast, Parasites.]

Salzberg et al. 2001

Page 31: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


Trees Don’t Support Transfer II

Page 32: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Uses of Phylogenomics

• Selection of species
• Functional prediction
• Gene duplication
• Intragenomic movement
• Gene loss
• Lateral transfer
• Genome rearrangements
• Uncultured species

Page 33: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Beja O, et al., Science (2000) 289: 1902-6; Nature (2001) 411: 786-789

Page 34: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


Puf Operons from Uncultured Bacteria

Page 35: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


Puf Operons vs. Cultured Species

Page 36: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Alternative Phylogenetic Anchors

Chlorobium tepidum

Cytophaga hutchinsonii

Prevotella ruminicola

Bacteroides fragilis

Porphyromonas gingivalis

MBBAD68TR

MBBAD65TR

Page 37: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

Acknowledgements

• Outside TIGR

–A. Stoltzfus

–H. Ochman

–D. Bryant

–W. F. Doolittle

–M. Eisen

–M-I Benito

• $$$:

–NSF

–NIH

–ONR

–DOE

–NEB

Page 38: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

B. anthracis lineage specific duplications

ORF04205 molybdopterin biosynthesis protein MoeA (moeA)
ORF05907 molybdopterin biosynthesis protein MoeA (moeA)
ORF02636 molybdopterin biosynthesis protein MoeA (moeA)
ORF04204 molybdopterin biosynthesis protein MoeB, putative
ORF05908 molybdopterin biosynthesis protein MoeB, putative
ORF02634 molybdopterin biosynthesis protein MoeB, putative
ORF05904 molybdopterin converting factor, subunit 1 (moaD)
ORF02639 molybdopterin converting factor, subunit 1 (moaD)
ORF04206 molybdopterin converting factor, subunit 2 (moaE)
ORF05905 molybdopterin converting factor, subunit 2 (moaE)
ORF02638 molybdopterin converting factor, subunit 2 (moaE)

Based on Read et al. submitted

Page 39: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002

[Figure: tree, scale bar 0.1. Leaf labels:]

• Schizosaccharomyces pombe GP139
• Neurospora crassa PIRS55262S552
• Clostridium perfringens GP18145
• Bacillus subtilis SPP45864YWJD
• Bacillus cereus GP6759487embCAB
• B BACAN 01914 UV endonuclease
• Bacillus halodurans OMNINTL01BH
• B BACAN 01459 UV endonuclease
• Deinococcus radiodurans GP61167
• Nostoc sp. PCC 7120 GP17130610d

Page 40: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


Page 41: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


C. pneumoniae Paralogs by Position

Page 42: Jonathan Eisen talk on "Phylogenomics of Microbes" at Lake Arrowhead Small Genomes Meeting 2002


C. pneumoniae Paralogs - Lineage Specific