using genomic-based information for the modelling of bacterial environments and lifestyle

Post on 24-Feb-2016

31 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

Using genomic-based information for the modelling of bacterial environments and lifestyle . Shiri Freilich Eytan Ruppin , Roded Sharan School of Computer Sciences Tel Aviv University May 2009. Species evolve to adapt to their environment . . Phenotype. Genotype. - PowerPoint PPT Presentation

TRANSCRIPT

USING GENOMIC-BASED INFORMATION FOR THE MODELLING OF BACTERIAL ENVIRONMENTS AND LIFESTYLE

Shiri Freilich

Eytan Ruppin, Roded SharanSchool of Computer Sciences Tel Aviv UniversityMay 2009

Species evolve to adapt to their environment.

Environment/lifestyle Phenotype Genotype

Can we use the genotype to predict the lifestyle of a species?

FROM GENOMIC INFORMATION TO PHENOTYPIC (METABOLIC) INFORMATION

Genomicinformation

Metabolicinformation

GeneEnzyme

GENOMIC ERA: SYSTEMATIC CONSTRUCTION OF HUNDREDS OF METABOLIC NETWORKS

Genomicinformation

Metabolicinformation

Hundreds offully sequencedbacterial species

FROM METABOLIC INFORMATION TO ENVIRONMENTAL INFORMATION

Metabolicinformation

Environmentalinformation

Internal metaboliteExternal metabolite

Predicted natural metabolic environments

D-GLUCOSE IS AN EXAMPLE OF AN EXTERNALMETABOLITE

NEW APPROACHES ALLOW RECONSTRUCTION OF SPECIES’ METABOLIC-ENVIRONMENTS

From Borenstein et al, PNAS 2008

Based on the network topology, identifying the set of compounds that are exogenously acquired

Internal metabolite

External metabolite

CONSTRUCTING PREDICTED ENVIRONMENTS ACROSS HUNDREDS OF SPECIES

Metabolicinformation

Environmentalinformation

Predicted metabolic environments

SO WHAT DO WE HAVE AND WHAT IS IT GOOD FOR?

Metabolic networks

Environments

Species

Genomes

?

• Can we characterize the lifestyle of a species based onGenomic attributes?

• How does the structure of the metabolic network reflectadaptation to species’ lifestyle?

• Can we characterize ecological strategies based on genomic attributes?

• Can we characterize ecological communities based on genomic attributes?

• Why should we do it?

FIRST QUESTION

Metabolic networks

Environments

Species

Genomes

?

• Can we characterize the lifestyle of the species based onGenomic attributes?

Can we predict, based on genomic knowledge, whether a speciesis a specialist or generalist?Can we estimate the range of environments it can inhabit?

AND TO BE MORE SPECIFIC – WE SHOULD COUNT IN HOW MANY ENVIRONMENTS A SPECIES LIVES

Predicted metabolic environments

External/input metabolites

Internal/essential metabolites

GENOMIC-BASE PREDICTED DIVERSITY CORRESPONDS WITH ECOLOGICAL KNOWLEDGE

Specific examples :

Pseudomonas aeruginosa

Desulfotalea psychrophila

Genomic- based predicted environments

√ √ √√xx

NCBI annotations

Available systematic estimates/information for environmental variability

Fraction of reg. genes

Multiple

Specialized

High

{Low}

Beyond specific examples:strong correlation between metabolic-environment variability and established measures of environment variability

WE NOW HAVE AN ENVIRONMENTAL MODEL

ViableNot viable

Environmental viability matrix

Env 1 Env 2Env N

Spc N

Spc 1Spc 2 Information on species

Information on environment

SECOND QUESTION

Metabolic networks

Environments

Species

Genomes

?

• How does the structure of the metabolic network reflects adaptation to species’ lifestyle?

Studying essentialityof reactions acrosshundreds of bacterial-species across many simulated growth-environments.

ESSENTIALITY OF ENZYMES ACROSS SPECIES AND ENVIRONMENTS

Predicted environments

√ √ √√xx

x √ √Enzy

mes

Predicted environments

√ √ √√xx

x √ √

Environment I:

α β

γ δ

ε ζ

η

α

γ δ

ε ζ

η

Environment II: Environment III:

β

γ δ

ε ζ

η

External metabolite

Intermediate product

Essential biomass product

Backed-up reaction

Essential reaction

Conditional-dependent reaction

Accuracy: 0.86 (E. coli ) and 0.85 (B. subtilis)

Taking an enzymes’ point of view

High-throughput identification of essential reactions across species species-specific (or group-specific) essentiality looking for drug targets against reaction with a wide phylogenetic coverage.

This approach can be applied for highlighting essentiality in

groups of medical, ecological or agricultural interest, e.g., human pathogens versus human commensals.

pathogens

Enzy

mes

Backed-up reaction

Essential reaction

Conditional-dependent reaction

Non pathogenic-bacteria

One example:

10-Formyl-THF

fMet-tRNA

Initiation of protein synthesis

Purine synthesis

THF 5,10-MethenylTHFFTL MCH

Most commensals Few pathogens

Most commensals Most pathogens

Potential drug targetMCH: Methenyltetrahydrofolate cyclohydrolase FTL: Formyltetrahydrofolate synthetase

TAKING A SPECIES POINT OF VIEW: ESTIMATING ROBUSTNESS OF METABOLIC NETWORK

4/7 Backed-up reaction

1/7 Essential reaction

2/7 Conditional-dependent reaction

α β

γ δ

ε ζ

η

The fraction of reaction across species

Environmental diversity

Frac

tion

Mean ~0.75

Descriptionofnetwork-robustness

E.coli: 0.78 (0.83 backed-up genes)

M. genitalium: 0.35 (0.2-0.45 backed-up genes)

GENETIC ROBUSTNESS What is it? The ability of a biological system

to continue functioning following mutations How robust are biological systems? Under

laboratory conditions most genes are dispensable; dispensability depends on the experimental setting.

How can we explain robustness in evolutionary terms? The origin of robustness is under debate: Direct selection in favor of resistance to

mutations By product of the selection for other traits (e.g.,

increasing steady-state metabolic fluxes) Genetic robustness reflects environmental

robustness

NETWORK ROBUSTNESS: ENVIRONMENTAL-DEPENDENT AND INDEPENDENT COMPONENTS

The fraction of reaction across species

Environmental diversity

Frac

tion

Correlation: 0.8

Correlation: 0.1

environmentally-dependent component component is strongly associated with environmental diversity (rho=0.8) and responsible for the robustness of no more than 20% of metabolic reactions over all species and environments modeled.

How can we explain the environmentally-independent component?

ENVIRONMENTALLY-INDEPENDENT ROBUSTNESS IS ASSOCIATED WITH THE METABOLIC CAPACITIES

Obse

rved

gro

wth

rate

(log

)

Prediction for growth rate (log), based on network robustness

How can we explain the environmentally-independent component?The environmentally-independent component is associated (correlation=~0.6) with the metabolic capacities of a species -- higher robustness is observed in fast-growers or in organisms with an extensive production of secondary-metabolites.

SECOND QUESTION

Metabolic networks

Environments

Species

Genomes

?

• How does the structure of the metabolic network reflectadaptation to species’ lifestyle?

The design of metabolic networks represents a species-specific adaptation to both its needs and its environment.

THIRD QUESTION

Metabolic networks

Environments

Species

Genomes

?

• The structure of the metabolic network reflectsadaptation to species’ lifestyle. Can we characterize complexecological attributes based on genomic attributes?

Can we predict the level ofcompetition a species encounters in its natural environments andits rate of growth?

ONCE AGAIN – OUR ENVIRONMENTAL MODEL

Viable

Not viable

Environmental viability matrix

Env 1 Env 2 Env N

Spc N

Spc 1

Spc 2 Information on species

Information on environment

APPLYING THE ENVIRONMENTAL-MODEL FOR THE CHARACTERIZATION OF ECOLOGICAL ATTRIBUTES – COMPETITION

Viable

Not viable

Environmental Viability MatrixEnv 1 Env 2 Env 3

Spc 3

Spc 1

Spc 2

Spc 4

Env 4

Co-Habitation vector

Spc 3

Spc 1

Spc 2

{1,3,2}

{3}

{3,2}

Max-CHS

Spc 4 {1}

3

3

31

Environments populated by bacteria of an annotated lifestyleM

ean

leve

l of p

opul

atio

n

Population of environments are in agreement withecological knowledge

DELINEATING ECOLOGICAL STRATEGIES FOR RATE OF GROWTH:

Environments diversity

Max

imal

co-

habi

tatio

n

Ecological diversity with intense co-inhabitation, associated with a typical fast rate of growth.

A specialized niche with little co-inhabitation, associated with a typical slow rate of growth

THIRD QUESTION

Metabolic networks

Environments

Species

Genomes

?

• Can we characterize ecological attributes based on genomic attributes?

The patterns observed suggests a universal principle where metabolic flexibility is associated with a need to grow fast, possibly in the face of competition.Beyond specific examples, the interplay between the environmental diversity – and maximal co habitationallows training a predictor for growth rate (ROC score of 0.75 )

FOURTH QUESTION

Metabolic networks

Environments

Species

Genomes

?

• Can we characterize ecological Communities based on genomic attributes?

Characterization of pair-wise relationship between bacterial species to identify competitionand cooperation

SPECIES DO NOT LIVE IN A VACUUM

WHY SHOULD WE MODEL COMMUNITIES? The composition of bacterial communities is

a major factor in human health. Variations in the identity and abundance of

species affect its metabolic potential and hence have important medical implications.

Computational approaches can now be applied for the modeling of bacterial interactions.

The ultimate goal is to be able to manipulate bacterial communities to our advantage.

FIRST STEP: MODELING PAIR-WISE INTERACTIONS

Environmental viability matrix

Env 1 Env 2 Env N

Spc N

Spc 1Spc 2

Pairwise interactions data

Spc1 Spc2 Spc N

Spc N

Spc 1Spc 2

New type of data:

Interaction (competition/cooperation)

No interaction

Viable

Not viable

Environments populated by bacteria of an annotated lifestyle

Mea

n le

vel o

f pop

ulat

ion

LACK OF SYSTEMATIC KNOWLEDGE OF PAIR-WISE INTERACTIONS

Modelling lifestyle

Systematic knowledge for species-specific lifestyle

Modeling pairwise interactions

Env 1 Env 2 Env N

Spc N

Spc 1Spc 2

3

1

2

Spc1 Spc2 Spc N

Spc N

Spc 1Spc 2

Systematic knowledge for pairwise interactions

Metagenomic data?

WHAT ARE METAGENOMIC DATA?

IDENTIFYING “OUR” SPECIES IN ENVIRONMENTAL SAMPLES

Species represented by 16s rRNA

Spc 1

Spc 2

Gut

Spc 3

Environmental samples

Marine

Soil

BLAST

CONSTRUCTING EXPERIMENTALLY-DRIVEN PAIRWISE INTERACTIONS DATA

Environmental samples

Gut Marine PM3

Spc 3

Spc 1Spc 2

Spc 4

Spc 5

Spc 6

Environmentally-drivenDatabase of interactions

Spc 3

Spc 1Spc 2

Spc 4

Spc 5

Spc 6

Spc 1 Spc 2 Spc 3 Spc 4 Spc 5 Spc 6

-134 species (including 47 species in the gut and 81 marine species)-~1200 interactions (limited to interactions within the gut and between marine species)

PUBMED IS A LARGE AND COMPREHENSIVE DATA SOURCE

Müller & Mancuso, Plos ONE, 2008

Co-occurrence analysis is a technique often applied in text mining, comparative genomics, and promoter analysis. Co-occurrence between genes and proteins was shown to reveal functional interactions.

CONSTRUCTING LITERATURE-DRIVEN PAIRWISE INTERACTIONS DATA

Papers in Pubmed

PM1 PM2 PM3

Spc 3

Spc 1Spc 2

Spc 4

Spc 5

Spc 6

Co-occurrence basedDatabase of interactions

Spc 3

Spc 1Spc 2

Spc 4

Spc 5

Spc 6

Spc 1 Spc 2 Spc 3 Spc 4 Spc 5 Spc 6

-All species are covered by the database (~400 species)-~6000 interactions (cut-offs vary in dependence with the statistical approach taken)

CO-OCCURRENCE BASED INTERACTIONS DATA ARE IN AGREEMENT WITH ECOLOGICAL PROPERTIES

Num

ber of co-associated partners

Spc1 Spc2 Spc 3

Spc 3

Spc 1Spc 2

2

1

0

Pairwise interaction data

Significant enrichment in experimentally-based

interactions

Spec. Obli. Aqua. Hoas. Mult. Soil

Strong correlationwith systematic annotations

of ecological diversity

PAIRWISE INTERACTIONS OF GUT BACTERIA

Typical gut bacteria

Potential gut bacteria

Potential human-associatedbacteria

Other

THANKS

Anat KreimerUri GophnaRoded SharanEytan Ruppin

Nir YosefIsaac MeilijsonElhanan BorensteinMoshe Mevarech

top related