Cis-regulatory module 10/24/07. TFs often work synergistically (Harbison 2004)
Post on 22-Dec-2015
![Page 1: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/1.jpg)
Cis-regulatory module
10/24/07
![Page 2: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/2.jpg)
TFs often work synergistically
(Harbison 2004)
![Page 3: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/3.jpg)
Combinatorial control
![Page 4: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/4.jpg)
[Figure: λ phage infecting E. coli, showing lysogenic growth and lytic growth (source: Gary Kaiser)]
![Page 5: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/5.jpg)
[Figure: the λ operon, showing the operator OR and the genes cI and cro]
![Page 6: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/6.jpg)
[Figure: the λ operon during lysogenic growth: cI on, cro off]
![Page 7: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/7.jpg)
[Figure: the λ operon during lytic growth: cI off, cro on. The operator OR contains three sites: OR1, OR2, OR3.]
![Page 8: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/8.jpg)
[Figure: occupancy of the λ operon by cI, cro, and Pol II in the lysogenic and lytic states]
![Page 9: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/9.jpg)
Cis-regulatory module (CRM)
• “A CRM is a DNA segment, typically a few hundred base pairs in length containing multiple binding sites, that recruits several cooperating factors to a particular genomic location” – Ji and Wong (2006)
![Page 10: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/10.jpg)
Statistical Methods
• Predict modules when the motifs are known (simpler).
– LRA, by Wasserman and Fickett (1998)
• Predict modules when the motifs also need to be discovered (more difficult).
– CisModule, by Zhou and Wong (2004)
– EMCModule, by Gupta and Liu (2005)
![Page 11: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/11.jpg)
LRA
![Page 12: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/12.jpg)
LRA
Cooperative motifs:
Basic idea: True regulatory regions are likely to have multiple motif sites.
[Figure: y-axis: probability of being regulatory]
![Page 13: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/13.jpg)
LRA
• Training data contain a subset of known regulatory and control regions.
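$$\operatorname{logit}(p) = \log\frac{p}{1-p}$$

$$\operatorname{logit}(p) = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n$$

where
– $x_i$: highest motif matching score within a given sequence
– $\beta_i$: regression coefficient
– $p$: probability of being a regulatory region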
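A minimal sketch of how the logit model on this slide could be evaluated for one sequence. The PWM, the uniform background, and the coefficients passed to `regulatory_probability` are hypothetical values chosen for illustration; none of them come from the lecture or from Wasserman and Fickett.

```python
import math

# Hypothetical 4-column PWM with consensus ACGT (assumption, for illustration only).
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background frequency


def max_match_score(seq, pwm):
    """x_i: highest log-odds PWM score over all windows of the sequence."""
    width = len(pwm)
    best = -math.inf
    for start in range(len(seq) - width + 1):
        score = sum(
            math.log(pwm[j][seq[start + j]] / BACKGROUND) for j in range(width)
        )
        best = max(best, score)
    return best


def regulatory_probability(x, beta0, betas):
    """Invert logit(p) = beta0 + sum(beta_i * x_i) to recover p."""
    eta = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-eta))


# Usage: one covariate, with made-up coefficients beta0 = -2.0 and beta1 = 1.0.
x1 = max_match_score("TTACGTTT", PWM)
p = regulatory_probability([x1], beta0=-2.0, betas=[1.0])
```

With a positive coefficient, a stronger best match raises the predicted probability; in practice the coefficients would be fitted from the training regions rather than set by hand.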
![Page 14: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/14.jpg)
Application: skeletal-muscle gene regulation
• 5 muscle-specific TFs are known:
– Mef-2, Myf, SRF, Tef, Sp-1
• 29 regulatory regions are known.
• Can we predict the regulatory regions just from sequence motif information?
![Page 15: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/15.jpg)
Computational Procedure
• Motif matrices are identified by Gibbs sampling using sequence information from the 29 regulatory regions.
• For some TFs, motifs cannot be found by the de novo approach. Use literature motifs instead.
• Top two matching scores for each TF are included as covariates.
• Apply the LRA model. Use leave-one-out cross-validation to evaluate model performance.
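The evaluation step can be sketched as follows, assuming a plain logistic regression fitted by gradient ascent. The data are synthetic stand-ins for motif-score covariates, and the learning rate, step count, and 0.5 threshold are arbitrary choices for illustration, not the settings of the original study.

```python
import math
import random


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(beta, x):
    """Predicted probability for covariates x; beta[0] is the intercept."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


def fit_logistic(X, y, lr=0.1, steps=500):
    """Fit coefficients by batch gradient ascent on the log-likelihood."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            err = yi - predict(beta, xi)
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta


def leave_one_out(X, y, threshold=0.5):
    """Hold out each sequence in turn, train on the rest, classify the held-out one."""
    calls = []
    for i in range(len(X)):
        beta = fit_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        calls.append(1 if predict(beta, X[i]) >= threshold else 0)
    return calls


# Synthetic data (assumption): two covariates per sequence, 15 "regulatory"
# sequences with high scores and 15 "control" sequences with low scores.
random.seed(0)
X = [[random.gauss(2.0, 0.5), random.gauss(2.0, 0.5)] for _ in range(15)]
X += [[random.gauss(-2.0, 0.5), random.gauss(-2.0, 0.5)] for _ in range(15)]
y = [1] * 15 + [0] * 15

calls = leave_one_out(X, y)
accuracy = sum(c == t for c, t in zip(calls, y)) / len(y)
```

Each sequence is scored by a model that never saw it during fitting, which is what makes leave-one-out a fair estimate when only a few dozen regulatory regions are available.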
![Page 16: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/16.jpg)
![Page 17: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/17.jpg)
![Page 18: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/18.jpg)
Results
• Single motifs are highly non-specific.
• Simple multi-site analysis improves specificity at the cost of reduced sensitivity.
• Logistic regression further improves specificity at a smaller cost in sensitivity.
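For reference, the two quantities traded off in these results can be computed as below. This is a generic sketch; the label vectors in the usage lines are invented and are not the lecture's data.

```python
def sensitivity_specificity(truth, calls):
    """Return (sensitivity, specificity) for binary labels, 1 = regulatory."""
    tp = sum(1 for t, c in zip(truth, calls) if t == 1 and c == 1)
    fn = sum(1 for t, c in zip(truth, calls) if t == 1 and c == 0)
    tn = sum(1 for t, c in zip(truth, calls) if t == 0 and c == 0)
    fp = sum(1 for t, c in zip(truth, calls) if t == 0 and c == 1)
    # Sensitivity: fraction of true regulatory regions that are recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of control regions that are correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up example: 4 regulatory regions and 6 controls.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(truth, calls)
```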
![Page 19: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/19.jpg)
Limitations of LRA
• Motifs must be known in advance.
• When known regulatory sequences are few, it is difficult to identify motifs by using traditional methods.
Objective:
• Integrate motif discovery and module finding in a single statistical model.
![Page 20: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/20.jpg)
De novo module identification
Two tasks
• Identify TF motifs
• Identify CRMs.
![Page 21: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/21.jpg)
Why a module approach can help motif discovery
• Due to poor specificity, a short sequence can be enriched simply by chance.
• The probability of random matches is much smaller for motif co-occurrence.
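A back-of-the-envelope sketch of the co-occurrence argument. It assumes each motif matches any given position of random sequence independently with a small probability `q`, and it ignores motif width and overlaps; both `q` and the window size are hypothetical numbers chosen only to show the trend.

```python
def p_window_hit(q, window):
    """Chance that a random window of `window` positions holds at least one match."""
    return 1.0 - (1.0 - q) ** window


def p_cooccurrence(qs, window):
    """Chance that every motif in `qs` matches somewhere in the same random window."""
    prob = 1.0
    for q in qs:
        prob *= p_window_hit(q, window)
    return prob


q = 1e-3       # hypothetical per-position false-match rate for one motif
window = 200   # hypothetical module-sized window, in base pairs

for k in (1, 2, 3):
    print(k, p_cooccurrence([q] * k, window))
```

Under these assumptions the chance of a spurious hit shrinks multiplicatively with every additional motif required in the same window, which is why co-occurrence is a more specific signal than a single match.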
![Page 22: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/22.jpg)
![Page 23: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/23.jpg)
cisModule
Basic idea:
• A two-level hierarchical mixture model (HMx).
– Level 1: modules within sequences
– Level 2: motifs within modules
(Zhou and Wong 2004)
![Page 24: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/24.jpg)
HMx Model as a Stochastic Process
• Treat the HMx model as a stochastic machine that generates sequences.
– Starting from the first sequence position, make a series of random decisions on whether to initiate a module of length l or generate a letter from the background model.
– Inside a module, if a site for the kth motif is initiated at position n, generate wk letters from its PWM and place them at [n, n+wk-1]; otherwise generate a letter from the background.
– After reaching the end of the current module, decide whether to sample from the background or initiate a new module.
(Zhou and Wong 2004)
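The generative process described on this slide can be sketched roughly as below. This is a simplified illustration, not the cisModule implementation: the motif for each candidate site is chosen uniformly and then started with a single probability `q_site`, and the two PWMs, `r_module`, and `q_site` are all hypothetical.

```python
import random

ALPHABET = "ACGT"


def column(consensus_base, p=0.85):
    """One PWM column that favors `consensus_base` (illustrative values)."""
    rest = (1.0 - p) / 3
    return {a: (p if a == consensus_base else rest) for a in ALPHABET}


# Two hypothetical motifs and a uniform background (assumptions).
PWMS = [[column(b) for b in "TGACGT"], [column(b) for b in "CCAAT"]]
BACKGROUND = {a: 0.25 for a in ALPHABET}


def sample_letter(probs, rng):
    return rng.choices(ALPHABET, weights=[probs[a] for a in ALPHABET])[0]


def generate_sequence(length, module_len, pwms, background, r_module, q_site, rng):
    """Generate one sequence; return (sequence, module spans, motif sites)."""
    seq, modules, sites = [], [], []
    while len(seq) < length:
        # Outside a module: initiate a module (if it fits) or emit a background letter.
        if length - len(seq) >= module_len and rng.random() < r_module:
            end = len(seq) + module_len
            modules.append((len(seq), end))
            while len(seq) < end:
                k = rng.randrange(len(pwms))
                w = len(pwms[k])
                # Inside a module: start a site for motif k (if it fits) or emit background.
                if len(seq) + w <= end and rng.random() < q_site:
                    n = len(seq)
                    seq.extend(sample_letter(col, rng) for col in pwms[k])
                    sites.append((k, n, n + w - 1))  # site occupies [n, n + w_k - 1]
                else:
                    seq.append(sample_letter(background, rng))
        else:
            seq.append(sample_letter(background, rng))
    return "".join(seq), modules, sites


rng = random.Random(1)
seq, modules, sites = generate_sequence(200, 30, PWMS, BACKGROUND, 0.02, 0.1, rng)
```

Inference runs this machine in reverse: given only the sequences, it asks which module spans, site positions, and PWMs would most plausibly have generated them.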
![Page 25: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/25.jpg)
Model inference: Gibbs sampling
• Given the alignment, update the model parameters.
• Given the model parameters, update the module/motif locations.
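The alternation between the two updates can be illustrated with a much simpler sampler than cisModule: a single motif with exactly one site per sequence and no module layer. The sketch below is that reduced case only; the pseudocount, uniform background, and toy input are assumptions made for illustration.

```python
import random

ALPHABET = "ACGT"


def estimate_pwm(seqs, starts, width, skip=None, pseudo=0.5):
    """Given the alignment (site starts), update the model parameters (PWM)."""
    counts = [{a: pseudo for a in ALPHABET} for _ in range(width)]
    for i, (s, st) in enumerate(zip(seqs, starts)):
        if i == skip:
            continue  # leave out the sequence whose site is being resampled
        for j in range(width):
            counts[j][s[st + j]] += 1
    pwm = []
    for col in counts:
        total = sum(col.values())
        pwm.append({a: c / total for a, c in col.items()})
    return pwm


def sample_start(seq, pwm, rng, background=0.25):
    """Given the model parameters, sample a new site location in one sequence."""
    width = len(pwm)
    weights = []
    for st in range(len(seq) - width + 1):
        ratio = 1.0
        for j in range(width):
            ratio *= pwm[j][seq[st + j]] / background
        weights.append(ratio)  # likelihood ratio of motif vs. background
    return rng.choices(range(len(weights)), weights=weights)[0]


def gibbs_motif_sampler(seqs, width, sweeps, rng):
    """Alternate the two updates; return final site starts and PWM."""
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(sweeps):
        for i, s in enumerate(seqs):
            pwm = estimate_pwm(seqs, starts, width, skip=i)
            starts[i] = sample_start(s, pwm, rng)
    return starts, estimate_pwm(seqs, starts, width)


# Toy input: random sequences, used only to show that the sampler runs.
rng = random.Random(0)
seqs = ["".join(rng.choice(ALPHABET) for _ in range(30)) for _ in range(8)]
starts, pwm = gibbs_motif_sampler(seqs, width=6, sweeps=5, rng=rng)
```

Because the sampler is stochastic and can settle in a local mode, it is usually restarted from several random initial alignments and the runs are compared.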
![Page 26: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/26.jpg)
A numerical experiment
• Merge the 29 regulatory regions with a set of sequences randomly selected from ENSEMBL promoters.
• Test the ability of cisModule to identify motifs in a “noisy” environment.
![Page 27: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/27.jpg)
Results
![Page 28: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/28.jpg)
Limitations of CisModule
• The module length and the number of motifs must be provided externally.
• Convergence can be slow. Multiple cycles are needed, each starting from a new seed.
• The model assumes that combinations of different motifs are independent.
![Page 29: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/29.jpg)
EMCModule
• Gupta and Liu (2005) developed a similar approach called EMCModule.
• Main differences:
– They use a collection of literature motifs as initial “seeds” for motif discovery.
– Their method improves the convergence speed.
– Their definition of CRMs is a little different: the number of motifs is fixed within one module, but the order of and distance between different motifs can vary.
![Page 30: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/30.jpg)
Further issues
• A comparative genomic approach can also be incorporated into module discovery (Zhou and Wong 2007).
• The modules identified by these methods can be viewed as belonging to one “type”. New methods need to be developed to discover multiple module types.
• While a module-based approach is helpful for finding cooperative motifs, it may hurt the discovery of single motifs.
![Page 31: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/31.jpg)
(Yuh et al. 1998)
![Page 32: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/32.jpg)
(Yuh et al. 1998)
![Page 33: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/33.jpg)
(Yuh et al. 1998)
![Page 34: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/34.jpg)
(Yuh et al. 1998)
![Page 35: Cis-regultory module 10/24/07. TFs often work synergistically (Harbison 2004)](https://reader030.vdocuments.mx/reader030/viewer/2022032523/56649d7e5503460f94a61398/html5/thumbnails/35.jpg)
Reading List
• Wasserman and Fickett (1998)
– LRA. One of the first works on cis-regulatory modules.
• Zhou and Wong (2004)
– cisModule. A statistical method to identify cis-regulatory modules without prior knowledge of the motifs.
• Yuh et al. (1998)
– An influential biological paper on how information from different modules can be integrated to regulate gene expression.