An Introduction to Promoter Prediction and Analysis in Plants
Presented by: Sarbesh D. Dangol, PhD student
March 30, 2016
Plant Genomics
Promoters
• Gene promoters are DNA sequences located upstream of gene coding regions.
• They contain multiple cis-acting elements, which are specific binding sites for transcription factors (TFs).
• They contain a “core promoter” (∼40 bp upstream of the transcriptional initiation site), which comprises the TATA box.
• Chromatin folding allows distant cis-acting elements to become spatially proximal to the regulatory complex.
Cis and Trans regulation
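• Cis-acting: A DNA sequence that changes the expression of a gene on the same DNA molecule, typically adjacent to it.
• Trans-acting: A sequence encoding a diffusible product, such as a transcription factor (TF), that controls the expression of genes elsewhere in the genome.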
Types of Promoters
a) Constitutive promoters
• Drive fairly constant levels of gene expression in all tissues, at all times.
• No promoters are truly constitutive.
• E.g., the CaMV35S promoter.
• Highly expressed housekeeping genes are a good source (ubiquitin, actin, tubulin, EIF genes).
Types of Promoters
b) Spatiotemporal promoters
• More precise control of native genes and transgenes.
• Restrict gene expression to certain cells, tissues, organs, or developmental stages.
• Seed-specific promoters in Hordein and Glutenin genes.
• Fruit-specific promoters in Expansin genes (during fruit ripening).
• Anther/pollen-specific promoters.
Tuber/storage organ-specific promoters
• Found in the pDJ3S gene in potato and in β-amylase and sporamin genes, in species such as potato, cassava, carrot, and sweet potato.
• They contain sugar-responsive elements such as the TGGACGG motif present in sporamin and β-amylase genes.
• Sucrose responsive elements (SURE).
Types of Promoters
c) Inducible promoters
• Responsive to environmental stimuli (biotic and abiotic stresses) and external chemical stimuli.
• The alc system is induced by ethanol or acetaldehyde.
• Drought-responsive element in DRE genes (A/GCCGAC).
• C-repeat binding factor in CBF genes.
• ABA-responsive element in ABRE genes (ACGTGG/T).
• Defense response promoter: pathogen-inducible (defensin promoter of OsPR10a gene).
• Wound-responsive promoter.
PR genes: pathogenesis-related genes.
Models for finding Binding Sites
A) Exact String Model
• Searches for an exact sequence in the DNA sequence.
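A minimal sketch of the exact string model in Python. The function name `find_exact` and the example promoter fragment are hypothetical, chosen only for illustration; the motif is the TGGACGG sugar-responsive element mentioned earlier.

```python
def find_exact(sequence, motif):
    """Return the 0-based start positions of every exact occurrence of motif."""
    sequence, motif = sequence.upper(), motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Restart one base later so overlapping occurrences are also reported.
        start = sequence.find(motif, start + 1)
    return positions


# Hypothetical promoter fragment, not taken from any real gene.
promoter = "AATGGACGGTTTATAAATGGACGGC"
hits = find_exact(promoter, "TGGACGG")
print(hits)
```

Only the given strand is searched here; a fuller scan would also search the reverse complement.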
Models for finding Binding Sites
B) String Mismatches Model
• Tries to find an almost exact sequence, tolerating a mismatch at one of the positions.
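The mismatch model can be sketched as a sliding window that counts differing positions (Hamming distance). The function name `find_with_mismatches`, the `max_mismatches` parameter, and the example sequence are assumptions made for illustration.

```python
def find_with_mismatches(sequence, motif, max_mismatches=1):
    """Return (start, matched_window, mismatch_count) for each window
    that differs from motif at no more than max_mismatches positions."""
    sequence, motif = sequence.upper(), motif.upper()
    width = len(motif)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Hamming distance between the window and the motif.
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Hypothetical fragment: one exact TGGACGG site and one single-mismatch variant.
fragment = "CCTGGACGGAATGGTCGGTT"
for hit in find_with_mismatches(fragment, "TGGACGG", max_mismatches=1):
    print(hit)
```

Setting `max_mismatches=0` reduces this to the exact string model.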
Models for finding Binding Sites
C) Degenerate String Model (Consensus model)
• Tries to find a sequence while allowing several alternative bases at specific positions of the sequence.
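One way to sketch the consensus model is to write the motif in IUPAC degenerate codes and translate it into a regular expression. Here the drought-responsive element A/GCCGAC is written as RCCGAC and the ABA-responsive element ACGTGG/T as ACGTGK; the function names and the test sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile a degenerate consensus into a regex.
    The lookahead wrapper lets overlapping sites be reported."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    return re.compile("(?=(" + body + "))")


def find_consensus(sequence, consensus):
    """Return (start, matched_text) for every site matching the consensus."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical fragment containing two DRE-like sites and one ABRE-like site.
fragment = "TTACCGACAAGCCGACTTACGTGTCC"
print(find_consensus(fragment, "RCCGAC"))   # DRE: A/GCCGAC
print(find_consensus(fragment, "ACGTGK"))   # ABRE: ACGTGG/T
```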
Models for finding Binding Sites
D) Position Weight Matrix Model (Position Specific Scoring Matrix Model)
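A position weight matrix scores each base at each motif position instead of demanding a fixed string. The sketch below builds log-odds scores from a few aligned sites and scans a sequence with them. The aligned sites are invented TATA-box-like examples, and the pseudocount, the uniform background of 0.25, and the score threshold are all illustrative assumptions rather than values from any published matrix.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds matrix (one dict per position) from aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for windows scoring at or above threshold."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if set(window) <= set(BASES):  # skip windows with ambiguous bases
            score = score_window(pwm, window)
            if score >= threshold:
                hits.append((start, window, round(score, 2)))
    return hits


# Hypothetical aligned binding sites (TATA-box-like), for illustration only.
sites = ["TATAAA", "TATATA", "TATAAA", "TATAAG", "CATAAA"]
pwm = build_pwm(sites)
print(scan("GGCTATAAAGGC", pwm, threshold=5.0))
```

Unlike the consensus model, a near-match is not simply accepted or rejected: it receives a lower score, and the threshold decides how much deviation is tolerated.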
• HbMFT1::GUS activity was detected in stamens and mature seeds.
• Promoters were analyzed with the PLACE database.
• Elements found in the sequence: GTGANTG10 and POLLEN1LELAT52 for pollen expression; GATABOX for seed/embryo expression, etc.
• qRT-PCR analysis showed that HbMFT1 was induced mainly under short-day conditions, with only weak expression under long-day conditions (different photoperiods and temperatures).
• Characterization of the HbMFT1::GUS promoter in Arabidopsis.
• GhMYB25-like genes have promoters with SURE elements.
• Four copies of the SP8b-like box (TACTtTT) were also found in the GhMYB25-like promoter.
• The expression of GhMYB25-like may be regulated by sugar signaling through SURE and SP8 motifs.
• Experimental approach required.
PRIMA (Promoter Integration in Microarray Analysis)
• To find the binding sites of TFs in the promoter region.
• Assumption: Co-expressed genes are regulated by common TFs and share common regulatory elements in their promoters.
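The co-expression assumption can be illustrated with a simple over-representation test: if a motif occurs in the promoters of a co-expressed cluster more often than expected from its genome-wide frequency, a shared TF is a plausible explanation. The sketch below uses a plain hypergeometric tail probability; it is not PRIMA's actual implementation, and the function name and all gene counts are made up for illustration.

```python
from math import comb


def hypergeom_enrichment(total_genes, total_with_motif, cluster_size, cluster_with_motif):
    """Probability of seeing at least cluster_with_motif motif-containing
    promoters in a cluster of cluster_size genes drawn at random from
    total_genes genes, of which total_with_motif contain the motif."""
    upper = min(cluster_size, total_with_motif)
    denominator = comb(total_genes, cluster_size)
    tail = sum(
        comb(total_with_motif, k) * comb(total_genes - total_with_motif, cluster_size - k)
        for k in range(cluster_with_motif, upper + 1)
    )
    return tail / denominator


# Hypothetical numbers: 20,000 genes, 1,000 promoters with the motif,
# and a co-expressed cluster of 50 genes in which 12 promoters carry it.
p_value = hypergeom_enrichment(20000, 1000, 50, 12)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value only suggests that the motif is over-represented in the cluster; as with any in silico prediction, it would still need experimental confirmation.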
Some Bioinformatics approaches in sequence analysis
• MEME algorithm: Identifies likely motifs within the input set of sequences.
• CREME (Cis-regulatory Module Explorer): Identifies and visualizes spatially clustered binding sites in the promoters of co-expressed genes.
• Markov models, hidden Markov models, hybrid models, etc.
• PRIMA (Promoter Integration in Microarray Analysis): Finds the binding sites of TFs in the promoter region. Assumption: Co-expressed genes are regulated by common TFs and share common regulatory elements in their promoters.
Promoter predictors
• CSHL: http://rulai.cshl.org/software/index1.htm
• BDGP: fruitfly.org/seq_tools/promoter.html
• ICG: TATA-Box predictor
PlantPAN 2.0
• Contains 16,960 TFs and 1143 TF binding site matrices among 76 plant species.
• Used to detect transcription factor binding sites, TFs, CpG islands, and tandem repeats.
• Identification of conserved regions between similar gene promoters.
• TF information (response conditions, target genes, etc.).
• Co-expression profile.
• Protein-protein interaction/co-factor analysis.
• DPEs.
References
1. Chow C. et al. (2015) PlantPAN 2.0: an update of plant promoter analysis navigator for reconstructing transcriptional regulatory networks in plants. Nucleic Acids Research.
2. Bi Z. et al. (2016) Identification, functional study, and promoter analysis of HbMFT1, a homolog of MFT from rubber tree. Int J Mol Sci.
3. Carlos M. (2014) Identification and validation of promoters and cis-acting regulatory elements. Plant Science, pp. 109-119.
4. Boeva V. (2016) Analysis of genomic sequence motifs for deciphering transcription factor binding and transcriptional regulation in eukaryotic cells. Frontiers in Genetics.
5. Wang L. et al. (2014) Silencing the vacuolar invertase gene GhVIN1 blocks cotton fiber initiation from the ovule epidermis, probably by suppressing a cohort of regulatory genes via sugar signaling. The Plant Journal 78: 686–696.
6. Campbell M. (2002) “Discovering Genomics, Proteomics and Bioinformatics.” Cold Spring Harbor Laboratory Press.
7. Lewin B. (2004) “Genes VIII.”
8. Watson et al. (2004) “Molecular Biology of the Gene.” Fifth edition. Cold Spring Harbor Laboratory Press.
9. Cullis C.A. (2004) “Plant Genomics and Proteomics.” Wiley-Liss.