inference of predictive gene interaction...
TRANSCRIPT
Inference of predictive gene interaction networks
Benjamin Haibe-Kains
DFCI/HSPH
December 15, 2011
at the Dana-Farber Cancer Institute and Harvard School of Public Healthat the Dana-Farber Cancer Institute and Harvard School of Public Health
The Computational Biology and Functional Genomics Laboratory
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 1 / 12
Background
Phenotypes result from biological networks, not individual genes
New biotechnologies allow us to analyze multiple genes in parallel:I next generation sequencingI gene expression profilingI Chip-seqI . . .
Understand the complex interactions between genes and the behaviorof a network is fundamental
I to bring new biological insightsI to make ”useful” predictions
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 2 / 12
Relevance
Aim: Infer reliable predictive gene interaction networks from geneexpression data
Beyond biological understanding, such networks would be efficient toolsfor:
predicting the response of an organism (cancer patient) toperturbations (targeted therapies)
identifying the key genes to target for significantly decreasing apathway activity
optimizing combination of drugs and therapy regiments
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 3 / 12
Gene Interaction Network
Genes are represented as”nodes”
Interactions are representedby ”edges”
Edges can be directed toshow ”causal” interactions
Edges are not necessarilydirect interactions
gene a
gene b
gene c
gene d
gene e
gene f
gene g
gene h
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 4 / 12
Gene Interaction Network and Perturbation
gene a
gene b
gene c
gene d
gene e
gene f
gene g
gene h
Highexpression
Lowexpression
gene a
gene b
gene c
gene d
gene e
gene e
gene g
gene h
Perturbation
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 5 / 12
Challenges in network inference. . . and how we address them
1 Problem complexity : seed the search for the ”best” network by usingprior biological knowledge about gene interactions
ß Predictive Networks web application
2 Curse of dimensionality : development of a local regression-basednetwork inference to enable analysis of hundreds of genes in parallel
3 Lack of validation: development of performance criteria to assess thequality of network models
4 Lack of software: implementation of network inference methods andrelated tools in R
ß predictionet package
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 6 / 12
Challenges in network inference. . . and how we address them
1 Problem complexity : seed the search for the ”best” network by usingprior biological knowledge about gene interactions
ß Predictive Networks web application
2 Curse of dimensionality : development of a local regression-basednetwork inference to enable analysis of hundreds of genes in parallel
3 Lack of validation: development of performance criteria to assess thequality of network models
4 Lack of software: implementation of network inference methods andrelated tools in R
ß predictionet package
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 6 / 12
Challenges in network inference. . . and how we address them
1 Problem complexity : seed the search for the ”best” network by usingprior biological knowledge about gene interactions
ß Predictive Networks web application
2 Curse of dimensionality : development of a local regression-basednetwork inference to enable analysis of hundreds of genes in parallel
3 Lack of validation: development of performance criteria to assess thequality of network models
4 Lack of software: implementation of network inference methods andrelated tools in R
ß predictionet package
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 6 / 12
Challenges in network inference. . . and how we address them
1 Problem complexity : seed the search for the ”best” network by usingprior biological knowledge about gene interactions
ß Predictive Networks web application
2 Curse of dimensionality : development of a local regression-basednetwork inference to enable analysis of hundreds of genes in parallel
3 Lack of validation: development of performance criteria to assess thequality of network models
4 Lack of software: implementation of network inference methods andrelated tools in R
ß predictionet package
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 6 / 12
Predictive Networks web application
https://compbio.dfci.harvard.edu/predictivenetworks/
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 7 / 12
Regression-based network inference
Gene expression data
Undirected graph
Directedgraph
Directedgraph
Networktopology
Predictive models
Priors
MRMR
causalityInference
(linear)regression
Predictive Networksweb application
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 8 / 12
Performance criteria for network inference
We implement a cross-validation framework to assess:
edge-specific stability: what are the interactions inferred in most ofthe cross-validation folds?
gene-specific prediction score: what are the genes whose expressioncan be well predicted by their parent/source genes?
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 9 / 12
Predictionet R package
https://github.com/bhaibeka/predictionet
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 10 / 12
Acknowledgements
Computational Biology andFunctional Genomics Laboratory
Machine Learning Group Entagen
Dana-Farber Cancer Institute, USA Universite Libre deBruxelles, Belgium
Entagen, USA
Renee RubioKathleen FlemingNiall PrendergastRaktim SinhaDaniel SchlauchAlejandro QuirozMegha PadiStefan BentinkJohn Quackenbush
Catharina OlsenGianluca Bontempi
Christopher BoutonErik BakkeJames Hardwick
Ontario Cancer Institute
Universty of Toronto,Canada
Amira Djebbari
Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 11 / 12