USING GENOMIC-BASED INFORMATION FOR THE MODELLING OF BACTERIAL ENVIRONMENTS AND LIFESTYLE. Shiri Freilich; Eytan Ruppin, Roded Sharan. School of Computer Sciences, Tel Aviv University. May 2009


Page 1: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

USING GENOMIC-BASED INFORMATION FOR THE MODELLING OF BACTERIAL ENVIRONMENTS AND LIFESTYLE

Shiri Freilich

Eytan Ruppin, Roded Sharan

School of Computer Sciences, Tel Aviv University

May 2009

Page 2: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

Species evolve to adapt to their environment.

Page 3: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

Environment/lifestyle Phenotype Genotype

Can we use the genotype to predict the lifestyle of a species?

Page 4: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

FROM GENOMIC INFORMATION TO PHENOTYPIC (METABOLIC) INFORMATION

Genomic information

Metabolic information

Gene

Enzyme

Page 5: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

GENOMIC ERA: SYSTEMATIC CONSTRUCTION OF HUNDREDS OF METABOLIC NETWORKS

Genomic information

Metabolic information

Hundreds of fully sequenced bacterial species

Page 6: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

FROM METABOLIC INFORMATION TO ENVIRONMENTAL INFORMATION

Metabolic information

Environmental information

Internal metabolite

External metabolite

Predicted natural metabolic environments

Page 7: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

D-GLUCOSE IS AN EXAMPLE OF AN EXTERNAL METABOLITE

Page 8: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

NEW APPROACHES ALLOW RECONSTRUCTION OF SPECIES’ METABOLIC-ENVIRONMENTS

From Borenstein et al., PNAS 2008

Based on the network topology, identifying the set of compounds that are exogenously acquired

Internal metabolite

External metabolite
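The topology-based identification of exogenously acquired compounds can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the network is a directed graph of metabolites and treats any compound in a strongly connected component with no incoming edges from other components as external, because nothing else in the network can produce it. The toy network, its metabolite names and the function names are hypothetical.

```python
from collections import defaultdict

def strongly_connected_components(graph):
    """Kosaraju's algorithm; graph maps node -> set of successor nodes."""
    nodes = set(graph) | {v for succ in graph.values() for v in succ}
    reverse = defaultdict(set)
    for u, succ in graph.items():
        for v in succ:
            reverse[v].add(u)

    # First pass: record nodes in order of DFS completion on the graph.
    order, visited = [], set()
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(sorted(graph.get(start, ()))))]
        while stack:
            node, successors = stack[-1]
            nxt = next((s for s in successors if s not in visited), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                visited.add(nxt)
                stack.append((nxt, iter(sorted(graph.get(nxt, ())))))

    # Second pass: explore the reversed graph in decreasing completion order.
    component_of, components = {}, []
    for start in reversed(order):
        if start in component_of:
            continue
        members, stack = set(), [start]
        component_of[start] = len(components)
        while stack:
            node = stack.pop()
            members.add(node)
            for pred in reverse.get(node, ()):
                if pred not in component_of:
                    component_of[pred] = len(components)
                    stack.append(pred)
        components.append(members)
    return components, component_of

def find_external_metabolites(graph):
    """Return compounds in components that no other component feeds into."""
    components, component_of = strongly_connected_components(graph)
    has_incoming = set()
    for u, succ in graph.items():
        for v in succ:
            if component_of[u] != component_of[v]:
                has_incoming.add(component_of[v])
    external = set()
    for index, members in enumerate(components):
        if index not in has_incoming:
            external |= members
    return external

# Hypothetical toy network: an edge u -> v means u is a substrate for producing v.
toy_network = {
    "D-glucose": {"G6P"},
    "G6P": {"F6P"},
    "F6P": {"G6P", "pyruvate"},
    "pyruvate": set(),
    "sulfate": {"cysteine"},
    "cysteine": set(),
}
external = find_external_metabolites(toy_network)
```

In this toy network only D-glucose and sulfate have no producer, so they are the compounds reported as external; G6P and F6P form a cycle but are fed by D-glucose, so they count as internal.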

Page 9: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

CONSTRUCTING PREDICTED ENVIRONMENTS ACROSS HUNDREDS OF SPECIES

Metabolic information

Environmental information

Predicted metabolic environments

Page 10: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

SO WHAT DO WE HAVE AND WHAT IS IT GOOD FOR?

Metabolic networks

Environments

Species

Genomes

?

• Can we characterize the lifestyle of a species based on genomic attributes?

• How does the structure of the metabolic network reflect adaptation to species’ lifestyle?

• Can we characterize ecological strategies based on genomic attributes?

• Can we characterize ecological communities based on genomic attributes?

• Why should we do it?

Page 11: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

FIRST QUESTION

Metabolic networks

Environments

Species

Genomes

?

• Can we characterize the lifestyle of the species based on genomic attributes?

Can we predict, based on genomic knowledge, whether a species is a specialist or generalist?

Can we estimate the range of environments it can inhabit?

Page 12: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

AND TO BE MORE SPECIFIC – WE SHOULD COUNT IN HOW MANY ENVIRONMENTS A SPECIES LIVES

Predicted metabolic environments

External/input metabolites

Internal/essential metabolites

Page 13: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

GENOMIC-BASED PREDICTED DIVERSITY CORRESPONDS WITH ECOLOGICAL KNOWLEDGE

Specific examples, comparing genomic-based predicted environments with available systematic estimates of environmental variability (NCBI annotations and fraction of regulatory genes):

Pseudomonas aeruginosa: NCBI annotation "Multiple"; fraction of regulatory genes high

Desulfotalea psychrophila: NCBI annotation "Specialized"; fraction of regulatory genes low

[Figure: genomic-based predicted environments for the two species, each marked √ or x]

Beyond specific examples: strong correlation between metabolic-environment variability and established measures of environment variability

Page 14: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

WE NOW HAVE AN ENVIRONMENTAL MODEL

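[Figure: environmental viability matrix. Rows are species (Spc 1, Spc 2, ..., Spc N) and columns are environments (Env 1, Env 2, ..., Env N); each cell is marked viable or not viable. Rows give information on species; columns give information on environments.]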
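A minimal sketch of how such a matrix can be read in both directions, assuming it is stored as rows of booleans. The values, species labels and function names below are hypothetical and only illustrate the bookkeeping: row sums count the environments a species can inhabit (a proxy for specialist versus generalist), and column sums count the species viable in each environment.

```python
# Hypothetical toy data: True means the species is predicted viable in that environment.
species = ["Spc 1", "Spc 2", "Spc 3"]
environments = ["Env 1", "Env 2", "Env 3", "Env 4"]
viability = [
    [True,  True,  True,  False],   # Spc 1
    [False, True,  False, False],   # Spc 2
    [False, True,  True,  True],    # Spc 3
]

def environmental_diversity(matrix):
    """Number of environments in which each species (row) is viable."""
    return [sum(row) for row in matrix]

def environment_population(matrix):
    """Number of species viable in each environment (column)."""
    return [sum(column) for column in zip(*matrix)]

diversity = dict(zip(species, environmental_diversity(viability)))
population = dict(zip(environments, environment_population(viability)))
```

Here Spc 2 is viable in a single environment and would read as the specialist of the three, while Env 2 is the most populated environment.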

Page 15: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

SECOND QUESTION

Metabolic networks

Environments

Species

Genomes

?

• How does the structure of the metabolic network reflect adaptation to species’ lifestyle?

Studying essentiality of reactions across hundreds of bacterial species and across many simulated growth environments.

Page 16: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

ESSENTIALITY OF ENZYMES ACROSS SPECIES AND ENVIRONMENTS

[Figure: enzymes × predicted environments matrix, with each cell marked √ or x]

[Figure: toy network labelled α, β, γ, δ, ε, ζ and η, drawn for Environment I, Environment II and Environment III. Node types: external metabolite, intermediate product, essential biomass product. Reaction types: backed-up reaction, essential reaction, conditional-dependent reaction.]

Accuracy: 0.86 (E. coli) and 0.85 (B. subtilis)
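The three reaction types in the legend can be illustrated with a small classification sketch. It assumes, from the category names only, that a reaction is essential when its removal abolishes growth in every simulated environment, conditional-dependent when it does so in some environments but not all, and backed-up when it never does. It also assumes the Greek labels denote reactions. The knockout table is invented; its outcomes were chosen so that the category counts match the 4/7, 1/7 and 2/7 split quoted on the species point-of-view slide.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical knockout outcomes: True means deleting the reaction
# abolishes growth in that simulated environment.
knockout_lethal = {
    "alpha":   {"I": False, "II": True,  "III": False},
    "beta":    {"I": False, "II": False, "III": True},
    "gamma":   {"I": True,  "II": True,  "III": True},
    "delta":   {"I": False, "II": False, "III": False},
    "epsilon": {"I": False, "II": False, "III": False},
    "zeta":    {"I": False, "II": False, "III": False},
    "eta":     {"I": False, "II": False, "III": False},
}

def classify(lethal_by_environment):
    """Assign one of the three legend categories to a reaction."""
    outcomes = list(lethal_by_environment.values())
    if all(outcomes):
        return "essential"
    if any(outcomes):
        return "conditional-dependent"
    return "backed-up"

categories = {name: classify(envs) for name, envs in knockout_lethal.items()}
counts = Counter(categories.values())
fractions = {cat: Fraction(n, len(categories)) for cat, n in counts.items()}
```

Keeping the per-environment outcomes, rather than a single yes/no flag per reaction, is what lets the conditional-dependent class be separated from the other two.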

Page 17: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

Taking an enzyme’s point of view

• High-throughput identification of essential reactions across species:

- species-specific (or group-specific) essentiality

- looking for drug targets against reactions with a wide phylogenetic coverage

• This approach can be applied for highlighting essentiality in groups of medical, ecological or agricultural interest, e.g., human pathogens versus human commensals.

[Figure: enzymes × species essentiality matrix comparing pathogens with non-pathogenic bacteria. Legend: backed-up reaction, essential reaction, conditional-dependent reaction.]

Page 18: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

One example:

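[Figure: pathway around 10-Formyl-THF, with THF, 5,10-Methenyl-THF and the enzymes FTL and MCH. 10-Formyl-THF is linked to fMet-tRNA (initiation of protein synthesis) and to purine synthesis. Enzyme annotations: "most commensals, few pathogens" and "most commensals, most pathogens"; a potential drug target is marked.]

MCH: Methenyltetrahydrofolate cyclohydrolase

FTL: Formyltetrahydrofolate synthetase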

Page 19: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

TAKING A SPECIES POINT OF VIEW: ESTIMATING ROBUSTNESS OF METABOLIC NETWORK

4/7 Backed-up reaction

1/7 Essential reaction

2/7 Conditional-dependent reaction

[Figure: toy network labelled α, β, γ, δ, ε, ζ and η]

[Figure: the fraction of reactions across species, plotted against environmental diversity]

Mean ~0.75

Description of network robustness

E. coli: 0.78 (0.83 backed-up genes)

M. genitalium: 0.35 (0.2-0.45 backed-up genes)

Page 20: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

GENETIC ROBUSTNESS

• What is it? The ability of a biological system to continue functioning following mutations.

• How robust are biological systems? Under laboratory conditions most genes are dispensable; dispensability depends on the experimental setting.

• How can we explain robustness in evolutionary terms? The origin of robustness is under debate:

- Direct selection in favor of resistance to mutations

- By-product of the selection for other traits (e.g., increasing steady-state metabolic fluxes)

- Genetic robustness reflects environmental robustness

Page 21: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

NETWORK ROBUSTNESS: ENVIRONMENTAL-DEPENDENT AND INDEPENDENT COMPONENTS

[Figure: the fraction of reactions across species, plotted against environmental diversity; correlations of 0.8 and 0.1 are marked]

The environmentally-dependent component is strongly associated with environmental diversity (rho=0.8) and is responsible for the robustness of no more than 20% of metabolic reactions over all species and environments modeled.
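As a worked illustration of how such an association can be measured, the sketch below computes a rank correlation between environmental diversity and the fraction of conditional-dependent reactions. It assumes that rho denotes Spearman's rank correlation, which the slide does not state. The five data points and the function names are hypothetical and are not the values behind the reported 0.8.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return covariance / (spread_x * spread_y)

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical values for five species.
environmental_diversity = [1, 2, 4, 8, 16]
conditional_fraction = [0.02, 0.05, 0.04, 0.12, 0.18]
rho = spearman(environmental_diversity, conditional_fraction)
```

A rank-based measure is a natural choice here because the number of environments per species can span orders of magnitude, and only the ordering of species matters for the claim.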

How can we explain the environmentally-independent component?

Page 22: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

ENVIRONMENTALLY-INDEPENDENT ROBUSTNESS IS ASSOCIATED WITH THE METABOLIC CAPACITIES

[Figure: observed growth rate (log) plotted against the prediction for growth rate (log), based on network robustness]

How can we explain the environmentally-independent component?

The environmentally-independent component is associated (correlation ~0.6) with the metabolic capacities of a species: higher robustness is observed in fast growers or in organisms with extensive production of secondary metabolites.

Page 23: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

SECOND QUESTION

Metabolic networks

Environments

Species

Genomes

?

• How does the structure of the metabolic network reflect adaptation to species’ lifestyle?

The design of metabolic networks represents a species-specific adaptation to both its needs and its environment.

Page 24: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

THIRD QUESTION

Metabolic networks

Environments

Species

Genomes

?

• The structure of the metabolic network reflects adaptation to species’ lifestyle. Can we characterize complex ecological attributes based on genomic attributes?

Can we predict the level of competition a species encounters in its natural environments and its rate of growth?

Page 25: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

ONCE AGAIN – OUR ENVIRONMENTAL MODEL

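[Figure: environmental viability matrix. Rows are species (Spc 1, Spc 2, ..., Spc N) and columns are environments (Env 1, Env 2, ..., Env N); each cell is marked viable or not viable. Rows give information on species; columns give information on environments.]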

Page 26: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

APPLYING THE ENVIRONMENTAL-MODEL FOR THE CHARACTERIZATION OF ECOLOGICAL ATTRIBUTES – COMPETITION

[Figure: environmental viability matrix (viable / not viable) for Spc 1 to Spc 4 across Env 1 to Env 4, with a co-habitation vector and a Max-CHS value derived for each species. Co-habitation vectors shown: {1,3,2}, {3}, {3,2} and, for Spc 4, {1}. Max-CHS values shown: 3, 3, 3, 1.]

[Figure: mean level of population for environments populated by bacteria of an annotated lifestyle]

The population of environments is in agreement with ecological knowledge
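The co-habitation vector and Max-CHS can be sketched as follows, assuming that a species' co-habitation vector lists, for each environment in which it is viable, the number of species viable there, and that Max-CHS is the largest entry. The assignment of environments to species below is an assumption, reverse-engineered so that it reproduces the vectors {1,3,2}, {3}, {3,2} and {1} from the slide; the function names are hypothetical.

```python
# Hypothetical viability assignments, chosen to reproduce the slide's vectors.
viability = {
    "Spc 1": {"Env 1", "Env 2", "Env 3"},
    "Spc 2": {"Env 2"},
    "Spc 3": {"Env 2", "Env 3"},
    "Spc 4": {"Env 4"},
}
environments = ["Env 1", "Env 2", "Env 3", "Env 4"]

def population(viability, environment):
    """Number of species predicted viable in the given environment."""
    return sum(environment in envs for envs in viability.values())

def cohabitation_vector(viability, species, environments):
    """Population of every environment in which the species is viable."""
    return [
        population(viability, env)
        for env in environments
        if env in viability[species]
    ]

def max_chs(viability, species, environments):
    """Largest entry of the co-habitation vector (0 if never viable)."""
    return max(cohabitation_vector(viability, species, environments), default=0)

vectors = {s: cohabitation_vector(viability, s, environments) for s in viability}
maxima = {s: max_chs(viability, s, environments) for s in viability}
```

Under these assumptions Spc 4 lives alone in its only environment and gets a Max-CHS of 1, while the other three species all meet in Env 2 and get a Max-CHS of 3.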

Page 27: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

DELINEATING ECOLOGICAL STRATEGIES FOR RATE OF GROWTH:

[Figure: maximal co-habitation plotted against environmental diversity]

Ecological diversity with intense co-inhabitation, associated with a typical fast rate of growth.

A specialized niche with little co-inhabitation, associated with a typical slow rate of growth.

Page 28: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

THIRD QUESTION

Metabolic networks

Environments

Species

Genomes

?

• Can we characterize ecological attributes based on genomic attributes?

The patterns observed suggest a universal principle where metabolic flexibility is associated with a need to grow fast, possibly in the face of competition.

Beyond specific examples, the interplay between environmental diversity and maximal co-habitation allows training a predictor for growth rate (ROC score of 0.75).

Page 29: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

FOURTH QUESTION

Metabolic networks

Environments

Species

Genomes

?

• Can we characterize ecological communities based on genomic attributes?

Characterization of pairwise relationships between bacterial species to identify competition and cooperation

Page 30: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

SPECIES DO NOT LIVE IN A VACUUM

Page 31: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

WHY SHOULD WE MODEL COMMUNITIES?

• The composition of bacterial communities is a major factor in human health.

• Variations in the identity and abundance of species affect the community's metabolic potential and hence have important medical implications.

• Computational approaches can now be applied for the modeling of bacterial interactions.

• The ultimate goal is to be able to manipulate bacterial communities to our advantage.

Page 32: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

FIRST STEP: MODELING PAIR-WISE INTERACTIONS

[Figure: the environmental viability matrix (Spc 1, Spc 2, ..., Spc N × Env 1, Env 2, ..., Env N; cells: viable / not viable) alongside a new type of data, pairwise interactions data (Spc 1, Spc 2, ..., Spc N × Spc 1, Spc 2, ..., Spc N; cells: interaction (competition/cooperation) / no interaction)]

Page 33: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

LACK OF SYSTEMATIC KNOWLEDGE OF PAIR-WISE INTERACTIONS

Modelling lifestyle

[Figure: environmental viability matrix (Spc 1, Spc 2, ..., Spc N × Env 1, Env 2, ..., Env N) with per-environment counts, next to a plot of the mean level of population for environments populated by bacteria of an annotated lifestyle]

Systematic knowledge for species-specific lifestyle

Modeling pairwise interactions

[Figure: pairwise interactions matrix (Spc 1, Spc 2, ..., Spc N × Spc 1, Spc 2, ..., Spc N)]

Systematic knowledge for pairwise interactions

Metagenomic data?

Page 34: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

WHAT ARE METAGENOMIC DATA?

Page 35: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

IDENTIFYING “OUR” SPECIES IN ENVIRONMENTAL SAMPLES

[Figure: species (Spc 1, Spc 2, Spc 3), each represented by its 16S rRNA, are matched by BLAST against environmental samples (Gut, Marine, Soil)]

Page 36: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

CONSTRUCTING EXPERIMENTALLY-DRIVEN PAIRWISE INTERACTIONS DATA

[Figure: occurrence of species (Spc 1 to Spc 6) in environmental samples (columns labelled Gut, Marine, PM3), converted into an environmentally-driven database of interactions (Spc 1 to Spc 6 × Spc 1 to Spc 6)]

- 134 species (including 47 species in the gut and 81 marine species)

- ~1200 interactions (limited to interactions within the gut and between marine species)
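One way to turn sample occurrences into pairwise data is to count, for every pair of species, the number of samples in which both were detected. The sketch below assumes each sample has already been reduced to the set of species identified in it; the sample names, species assignments and function name are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical samples: the set of species identified in each one.
samples = {
    "gut sample A": {"Spc 1", "Spc 2", "Spc 3"},
    "gut sample B": {"Spc 1", "Spc 3"},
    "marine sample A": {"Spc 4", "Spc 5"},
    "marine sample B": {"Spc 4", "Spc 5", "Spc 6"},
}

def co_occurrence_counts(samples):
    """Count the samples shared by each unordered pair of species."""
    counts = Counter()
    for members in samples.values():
        # Sorting gives every pair a single canonical (a, b) key.
        for a, b in combinations(sorted(members), 2):
            counts[(a, b)] += 1
    return counts

pair_counts = co_occurrence_counts(samples)
```

In this toy input no gut species ever shares a sample with a marine species, so every cross-habitat pair has a count of zero, which mirrors the restriction to interactions within the gut and between marine species.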

Page 37: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

PUBMED IS A LARGE AND COMPREHENSIVE DATA SOURCE

Müller & Mancuso, PLoS ONE, 2008

Co-occurrence analysis is a technique often applied in text mining, comparative genomics, and promoter analysis. Co-occurrence between genes and proteins was shown to reveal functional interactions.

Page 38: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

CONSTRUCTING LITERATURE-DRIVEN PAIRWISE INTERACTIONS DATA

[Figure: occurrence of species (Spc 1 to Spc 6) in PubMed papers (PM1, PM2, PM3), converted into a co-occurrence-based database of interactions (Spc 1 to Spc 6 × Spc 1 to Spc 6)]

- All species are covered by the database (~400 species)

- ~6000 interactions (cut-offs vary depending on the statistical approach taken)
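The slide notes that cut-offs depend on the statistical approach but does not name one. As one possible approach, not necessarily the one used by the authors, a hypergeometric tail probability can score how surprising it is that two species are mentioned together in a given number of papers. The function name, parameter names and numbers below are hypothetical.

```python
from math import comb

def co_occurrence_p_value(n_total, n_a, n_b, n_both):
    """Probability of at least n_both shared papers under random mentions.

    n_total: papers in the corpus
    n_a, n_b: papers mentioning species A and species B, respectively
    n_both: papers mentioning both
    """
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, upper + 1)
    )
    return tail / comb(n_total, n_b)

# Hypothetical corpus of 10 papers; each species appears in 3 of them.
p_all_shared = co_occurrence_p_value(n_total=10, n_a=3, n_b=3, n_both=3)
p_any_overlap = co_occurrence_p_value(n_total=10, n_a=3, n_b=3, n_both=0)
```

A pair would then be kept as an interaction when its p-value falls below a chosen threshold; with thousands of species pairs tested, that threshold would normally need a multiple-testing correction.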

Page 39: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

CO-OCCURRENCE BASED INTERACTIONS DATA ARE IN AGREEMENT WITH ECOLOGICAL PROPERTIES

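[Figure: pairwise interaction data (Spc 1, Spc 2, Spc 3 × Spc 1, Spc 2, Spc 3), with the number of co-associated partners counted per species (2, 1, 0)]

Significant enrichment in experimentally-based interactions

[Figure: number of co-associated partners across lifestyle categories (Spec., Obli., Aqua., Hoas., Mult., Soil)]

Strong correlation with systematic annotations of ecological diversity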

Page 40: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

PAIRWISE INTERACTIONS OF GUT BACTERIA

Typical gut bacteria

Potential gut bacteria

Potential human-associated bacteria

Other

Page 41: Using genomic-based information for the  modelling  of bacterial environments and lifestyle

THANKS

Anat Kreimer, Uri Gophna, Roded Sharan, Eytan Ruppin

Nir Yosef, Isaac Meilijson, Elhanan Borenstein, Moshe Mevarech