Probabilistic Models that uncover the hidden Information Flow in Signalling Networks
Achim Tresch
-2-
Which model?
A model that explains the data merely finds associations. E.g.: Epidemiology (predict colon cancer risk from SNPs)
A model that explains the mechanism finds explanations. E.g.: Physics, Systems Biology (predict the signal flow through a cascade of transcription factors)
-3-
Which model?
1st Idea
Our choice: Graphical Models. Nodes correspond to physical entities, arrows correspond to interactions.
Need for interventional data
Two different types of nodes:
• Observable components
• Perturbed components (signals)
-4-
How do marionettes walk?
-5-
How do marionettes walk?
[Figure: two panels, "This is what we observe" and "This is the true model"]
Both models explain the observations perfectly.
What makes the right model (biologically) more plausible?
-6-
How do marionettes walk?
2nd Idea
[Figure: two panels, "This is what we observe" and "This is the true model"]
Both models explain the observations perfectly.
What makes a model (biologically) more plausible?
Signal transmission is expensive!
Find a consistent model with a most parsimonious effects graph.
Signals: signal graph Γ
Observables: effects graph Θ
2nd Idea
-7-
Nested Effects Models
Signal graph, adjacency matrix (with 1's in the diagonal): $\Gamma = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$
Effects graph, adjacency matrix: $\Theta = \begin{pmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix}$
Predicted effects (rows: observables, columns: signals): $F^t = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$
Parsimony Assumption: Each observable is linked to exactly one action
Definition [Markowetz, Bioinformatics 2005]: A Nested Effects Model (NEM) is a model F for which F = Γ Θ
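The definition F = Γ Θ can be checked by hand on the example matrices of this slide. The following is a minimal sketch in plain Python; the helper names `bool_matmul` and `transpose` are illustrative and not taken from any NEM package.

```python
def bool_matmul(A, B):
    """Boolean matrix product: entry (i, j) is 1 if some k has A[i][k] = B[k][j] = 1."""
    inner = len(B)
    return [[int(any(A[i][k] and B[k][j] for k in range(inner)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(M):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*M)]


# Signal graph (signals x signals), with 1's in the diagonal.
GAMMA = [[1, 0, 0],
         [1, 1, 1],
         [0, 0, 1]]

# Effects graph (signals x observables); every column contains exactly one 1,
# i.e. each observable is linked to exactly one signal.
THETA = [[1, 0],
         [0, 0],
         [0, 1]]

F = bool_matmul(GAMMA, THETA)   # predicted effects, signals x observables
F_t = transpose(F)              # observables x signals, as drawn on the slide
```

Under the parsimony assumption every column of Θ has a single 1, so the ordinary and the Boolean matrix product coincide here.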
-8-
Nested Effects Models
Why "nested"?
If the signal graph is transitively closed, then the observed effects are nested in the sense that a → b implies effects(a) ⊇ effects(b).
The present formulation of a NEM drops the transitivity requirement.
[Figure: predicted effects $F^t = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$, rows: observables, columns: signals]
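The nesting property can be illustrated on a small hypothetical chain a → b → c (indices 0, 1, 2) with a single observable attached to c. This is a sketch under those assumptions; the function names are made up for this example.

```python
def transitive_closure(gamma):
    """Warshall's algorithm on a 0/1 adjacency matrix."""
    n = len(gamma)
    closed = [row[:] for row in gamma]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if closed[i][k] and closed[k][j]:
                    closed[i][j] = 1
    return closed


def predicted_effects(gamma, theta):
    """F = Gamma Theta as a Boolean product (signals x observables)."""
    n = len(gamma)
    return [[int(any(gamma[s][t] and theta[t][a] for t in range(n)))
             for a in range(len(theta[0]))]
            for s in range(n)]


def is_nested(gamma, theta):
    """True if every edge a -> b satisfies effects(a) >= effects(b) as sets."""
    F = predicted_effects(gamma, theta)
    effects = [{a for a, hit in enumerate(row) if hit} for row in F]
    n = len(gamma)
    return all(effects[a] >= effects[b]
               for a in range(n) for b in range(n) if gamma[a][b])


# Hypothetical chain 0 -> 1 -> 2 without the transitive edge 0 -> 2.
CHAIN = [[1, 1, 0],
         [0, 1, 1],
         [0, 0, 1]]
# One observable, attached to signal 2.
THETA_CHAIN = [[0], [0], [1]]

nested_without_closure = is_nested(CHAIN, THETA_CHAIN)
nested_with_closure = is_nested(transitive_closure(CHAIN), THETA_CHAIN)
```

Without the edge 0 → 2, signal 0 has no predicted effect although its child 1 has one, so the effects are not nested; after transitive closure they are.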
-9-
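Nested Effects Models
The final ingredient: A quantification of the measured effect strength
Effect of signal s on observable a:
$$R_{a,s} = \log \frac{P(\mathrm{Data}_{a,s} \mid a \text{ responds if } s \text{ is perturbed})}{P(\mathrm{Data}_{a,s} \mid a \text{ does NOT respond if } s \text{ is perturbed})}$$
$R_{a,s} > 0$ if the data favours an effect of s on a
[Figure: two panels with axes "Signals" and "Observables", labelled "Predicted effects = $F^t$" (here $F^t = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$) and "Measured effects = $R^t$"]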
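How such a log-ratio behaves can be sketched with a toy noise model. The Gaussian densities below are purely an assumption for illustration (the slides do not specify the data model): a measurement is taken to be N(0, 1) distributed if a does not respond, and N(effect_mean, 1) distributed if it does.

```python
import math


def normal_logpdf(x, mean, sd=1.0):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def log_ratio(x, effect_mean=2.0, sd=1.0):
    """R = log P(x | a responds) - log P(x | a does NOT respond).

    effect_mean and sd are hypothetical parameters of the toy model.
    """
    return normal_logpdf(x, effect_mean, sd) - normal_logpdf(x, 0.0, sd)


strong_measurement = log_ratio(2.0)   # positive: data favours an effect
null_measurement = log_ratio(0.0)     # negative: data favours no effect
ambiguous = log_ratio(1.0)            # both hypotheses equally likely
```

In this toy model the ratio simplifies to effect_mean · x − effect_mean² / 2, so its sign flips at half the assumed effect size.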
-10-
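Nested Effects Models
Assuming independent data,
$$P(D \mid F) = \prod_{a \in \text{Observables}} \; \prod_{s \in \text{Signals}} P(D_{a,s} \mid F_{s,a}),$$
it follows that
$$\log P(D \mid F) - \log P(D \mid F = 0) = \sum_{(a,s)} \log \frac{P(D_{a,s} \mid F_{s,a})}{P(D_{a,s} \mid F_{s,a} = 0)} = \sum_{(a,s)} \begin{cases} R_{a,s} & \text{if } F_{s,a} = 1 \\ 0 & \text{if } F_{s,a} = 0 \end{cases}$$
$$= \sum_{(a,s)} R_{a,s} F_{s,a} = \sum_{a} (RF)_{a,a} = \operatorname{tr}(RF) = \operatorname{tr}(FR)$$
Note: Missing data is handled easily: set $R_{a,s} = 0$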
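The score tr(FR) is just the sum of the measured log-ratios over all predicted effects. A minimal sketch, with F as signals × observables and R as observables × signals; the numbers in `R_EXAMPLE` are invented for illustration, and one entry is set to 0 to mimic a missing measurement.

```python
def nem_score(F, R):
    """Sum of R[a][s] over all pairs with a predicted effect F[s][a] = 1."""
    return sum(F[s][a] * R[a][s]
               for s in range(len(F))
               for a in range(len(R)))


def trace_of_product(F, R):
    """tr(FR), computed via the explicit matrix product for comparison."""
    n = len(F)
    prod = [[sum(F[s][a] * R[a][t] for a in range(len(R))) for t in range(n)]
            for s in range(n)]
    return sum(prod[s][s] for s in range(n))


# Predicted effects (signals x observables) of the example model.
F_EXAMPLE = [[1, 0],
             [1, 1],
             [0, 1]]

# Hypothetical log-ratios (observables x signals); R[1][0] = 0 marks missing data.
R_EXAMPLE = [[1.5, 0.5, -1.0],
             [0.0, 2.0, 1.0]]

score = nem_score(F_EXAMPLE, R_EXAMPLE)
score_via_trace = trace_of_product(F_EXAMPLE, R_EXAMPLE)
```

A missing entry contributes nothing to the score regardless of whether the model predicts an effect there, which is why setting it to 0 is harmless.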
-11-
NEM Estimation
There are two ways of finding a high scoring NEM:
Maximum Likelihood: $(\hat\Gamma, \hat\Theta) = \arg\max_{\Gamma,\Theta} \operatorname{tr}(\Gamma \Theta R)$
Bayesian, posterior mode: $\hat\Gamma = \arg\max_{\Gamma} \int \exp(\operatorname{tr}(\Gamma \Theta R)) \, P(\Theta) \, d\Theta$
For n≤5 signals, an exhaustive parameter space search is possible.
For larger n, apply standard optimization strategies (gradient ascent, simulated annealing) or heuristics tailored to NEMs: Module networks [Fröhlich et al., BMC Bioinformatics 2007], Triplet search [Markowetz et al., Bioinformatics 2007]
Theorem (Tresch, SAGeMB 2008): For ideal data, $\hat\Gamma$ is unique up to reversals (Corollary: $(\hat\Gamma, \hat\Theta)$ is unique if Γ is a DAG).
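The exhaustive maximum likelihood search for small n can be sketched as follows. This is not the code of the Nessy or NEM packages; it is a plain Python illustration that enumerates every signal graph with 1's in the diagonal and, for each graph, attaches every observable to the signal that maximises its contribution to tr(ΓΘR). That column-wise choice of Θ follows from the parsimony assumption (one signal per observable). `R_IDEAL` is hypothetical ideal data for the example model: +1 where an effect is predicted, −1 elsewhere.

```python
from itertools import product


def best_attachment(R, gamma):
    """For a fixed signal graph, choose the best signal for each observable."""
    n = len(gamma)
    total, attach = 0.0, []
    for row in R:  # one row of R per observable
        # Score of attaching this observable to signal s: (R Gamma)[a][s].
        col_scores = [sum(row[t] * gamma[t][s] for t in range(n)) for s in range(n)]
        s_best = max(range(n), key=lambda s: col_scores[s])
        total += col_scores[s_best]
        attach.append(s_best)
    return total, attach


def exhaustive_ml(R, n):
    """Enumerate all 2^(n(n-1)) signal graphs and return the best-scoring one."""
    off_diag = [(i, j) for i in range(n) for j in range(n) if i != j]
    best = None
    for bits in product((0, 1), repeat=len(off_diag)):
        gamma = [[int(i == j) for j in range(n)] for i in range(n)]
        for (i, j), bit in zip(off_diag, bits):
            gamma[i][j] = bit
        score, attach = best_attachment(R, gamma)
        if best is None or score > best[0]:
            best = (score, gamma, attach)
    return best


# Hypothetical ideal data (observables x signals) for the 3-signal example.
R_IDEAL = [[1, 1, -1],
           [-1, 1, 1]]

best_score, best_gamma, best_attach = exhaustive_ml(R_IDEAL, 3)
# Predicted effects of the winning model: F[t][a] = Gamma[t][attach[a]].
F_hat = [[best_gamma[t][s] for s in best_attach] for t in range(3)]
```

For n = 4 signals there are 4 · 3 = 12 possible edges and hence 2^12 = 4096 signal graphs. With only two observables several signal graphs reach the same maximal score, so the sketch returns the first one found; the predicted effects of the winner are nevertheless determined.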
-12-
Simulation
[Figure: true graphs Γ, Θ (signals 1 to 4, observables 1 to 30), ideal measurements (ΓΘ) and simulated measurements (R)]
R/Bioconductor package: Nessy
-13-
Simulation
[Figure: true graph and estimated graph (signals 1 to 4, observables 1 to 30)]
[Figure: "Distribution of the likelihoods", histogram of res$posteriors (x-axis: res$posteriors, y-axis: Frequency)]
12 edges, 2^12 = 4096 signal graphs, ~4 seconds
-14-
Application: Synthetic Lethality
Hypotheses:
• Synthetic lethality (SL) between two genes occurs if the genes are located in different pathways
• Genes sharing the same synthetic lethality partners have an increased chance of being located in the same pathway [Ye, Bader et al., Mol. Systems Biology 2005]
[Figure: Pathway I and Pathway II, linked by synthetic lethality; nodes a, b and 1, 2, 3]
Consequence:
• A gene b whose SL partners are nested into the SL partners of another gene a is likely to be located beneath a in the same pathway.
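The consequence stated above amounts to a subset test on SL partner sets. A minimal sketch, assuming the SL screen is available as a mapping from gene to its set of SL partners; the gene names and the function `nested_pairs` are hypothetical and serve only as an illustration.

```python
def nested_pairs(sl_partners):
    """Return pairs (a, b) where the SL partners of b are a non-empty,
    proper subset of the SL partners of a, i.e. b is a candidate for
    lying beneath a in the same pathway."""
    pairs = []
    for a, partners_a in sl_partners.items():
        for b, partners_b in sl_partners.items():
            if a != b and partners_b and partners_b < partners_a:
                pairs.append((a, b))
    return sorted(pairs)


# Hypothetical SL partner sets.
SL_PARTNERS = {
    "geneA": {"g1", "g2", "g3"},
    "geneB": {"g1", "g2"},
    "geneC": {"g4"},
}

candidates = nested_pairs(SL_PARTNERS)
```

Using a proper subset gives the relation a direction; genes with identical partner sets would have to be treated separately, for example as members of the same node.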
-15-
Application: Synthetic Lethality
[Figure: matrix plot with the following yeast genes as axis labels: YAL021C, YBL008W, YBR136W, YBR195C, YBR274W, YBR278W, YCL016C, YCL061C, YCR066W, YCR086W, YDL013W, YDL040C, YDL074C, YDL101C, YDR004W, YDR076W, YDR121W, YDR191W, YDR386W, YDR439W, YEL061C, YER016W, YER095W, YER173W, YGL020C, YGL058W, YGL163C, YHR013C, YHR191C, YIL132C, YJL047C, YJL092W, YJL115W, YJR043C, YKL113C, YLL002W, YLR032W, YLR288C, YLR320W, YML032C, YML085C, YML102W, YMR038C, YMR048W, YMR078C, YMR190C, YMR224C, YNL250W, YNL273W, YNR052C, YOL068C, YOR025W, YOR038C, YOR080W, YOR144C, YOR209C, YOR368W, YPL055C, YPL153C, YPL194W, YPR023C, YPR135W, YPR164W]
Pan et al., Cell 2006
-16-
Application: Synthetic Lethality
7 of 10 Genes directly linked to DNA repair
Tresch, unpublished
-17-
Software, References
References:
• Structure Learning in Nested Effects Models. A. Tresch, F. Markowetz, to appear in SAGeMB 2008, available on arXiv
• Nested Effects Models as a Means to learn Signaling Networks from Intervention Effects. H. Fröhlich, A. Tresch, F. Markowetz, M. Fellmann, R. Spang, T. Beissbarth, in preparation
• Computational identification of cellular networks and pathways. F. Markowetz, Olga G. Troyanskaya, Dennis Kostka, Rainer Spang. Molecular BioSystems, Bioinformatics 2007
• Non-transcriptional Pathway Features Reconstructed from Secondary Effects of RNA Interference. F. Markowetz, J. Bloch, R. Spang, Bioinformatics 2005
R/Bioconductor packages:
• NEM (Markowetz, Fröhlich, Beissbarth)
• Nessy (Tresch)
-18-
Research & Teaching Activities
Research related to the theory of NEMs
• Integration of multiple data sources
• Time-dependent NEMs
• Allow for arbitrary signalling model
Teaching
• Lectures & Exercises in Bioinformatics, Machine Learning, Statistics for Physicians, Group Theory, Microarray Analysis
• E-learning Core Group of the Faculty
• Bachelor-/Master- and PhD theses
Other Research Topics
• Software for data acquisition, -processing & -visualization for high-density technologies
• Design and analysis of biological/clinical experiments, consulting
-19-
Acknowledgements
Florian Markowetz, Lewis-Sigler Institute, Princeton
Tim Beissbarth, Holger Fröhlich, German Cancer Research Center, Heidelberg
Rainer Spang, Computational Diagnostics Group, Regensburg
-20-
Conclusion
Thank You!
Exercise:
Why is this administration model inefficient?
Construct a model that scores better!
-21-
-22-
What I did not show …
[Figure: Estimated graph (68 'known' observations, numbered 1 to 68); visible node labels: rel-, key-, tak-, mkk4hep-]
Automatic feature selection, without control experiment: estimated graph (120 genes selected)
-23-
What I did not show …
The "observed" graph of the Fellmann estrogen receptor dataset
-24-
What I did not show …
15 Genes
17 Knockdown Experiments
6 of them double Knockdowns
-25-
What I did not show …
Same data, with prior knowledge.
[Figure: matrix plot. Genes: AKT1, AKT2, BCL2, CCNG2, ERK1, ERK2, ESR1, FOXA1, GPR30, HSPB8, LOC120224, STAT3, STAT5B, STC2, XBP1. Knockdown experiments: AKT1, AKT2, BCL2, CCNG2, ESR1, FOXA1, HSPB8, LOC120224, STAT5B, STC2, XBP1, ERK1+STAT5B, ERK2+STAT5B, ESR1+ERK1, ESR1+ERK2, ESR1+GPR30, STAT3+STAT5B]