Promoter Analysis: TFBS Detection. Daniel Rico, PhD. drico@cnio.es
![Page 1: Promoter Analysis TFBS Detection Daniel Rico, PhD. drico@cnio.es Daniel Rico, PhD. drico@cnio.es](https://reader031.vdocuments.mx/reader031/viewer/2022020417/56649d9d5503460f94a8667e/html5/thumbnails/1.jpg)
Promoter Analysis: TFBS Detection
Daniel Rico, PhD.
1. Promoters and gene regulation in Eukaryotes
2. Position Weight Matrices (PWM)
3. PWM Databases
4. TFBS prediction using PWMs
5. Pattern Discovery: Finding unknown motifs
6. Exercise: Use the human NOS2 sequence to predict TFBS with Match and JASPAR
1. Promoters and gene regulation in Eukaryotes
[Figure: gene locus with an enhancer, the “proximal” promoter (100 bp to 2 kb upstream, 5’) and the gene; TSS: Transcription Start Site.]
Promoters
Promoters are DNA segments upstream of transcribed regions where transcription is initiated.
The promoter attracts RNA polymerase to the transcription start site.
5’ Promoter 3’
Genes in Ensembl
[Figure: Ensembl gene view showing the forward (+) strand, drawn 5’ to 3’, and the reverse (-) strand.]
[Figure: gene model with the Transcription Start Site and the Transcription Termination Site marked.]
Promoter Structure in Prokaryotes (E. coli)
Transcription starts at offset 0.
• Pribnow Box (-10)
• Gilbert Box (-35)
• Ribosomal Binding Site (+10)
Promoter Structure in Eukaryotes
Experimental Transcription Start Sites (TSS) by CAGE
CAGE (Cap Analysis of Gene Expression) detects the transcriptional activity of each promoter.
Representation of CAGE preparation protocol adapted to various platforms.
Now Solexa and Illumina are preferred. 454 Life Sciences (FLX system) is no longer used because concatenation requires additional PCR cycles and complicated manipulation.
In the future, single-molecule sequencing technology will be preferred because PCR may not be required.
http://www.osc.riken.jp/english/activity/cage/basic/
http://fantom.gsc.riken.jp/4/edgeexpress/view/
http://www.epd.isb-sib.ch/
Sequence Analysis: Searching Transcription Factor Binding Sites (TFBS)
TFBS: Detection methods
• in vivo: functional analysis, ChIP
• in vitro on cloned fragments: footprinting reactions, exonuclease digests, gel retardation (EMSA), UV crosslinking
• in vitro on artificial DNA: SELEX (Systematic Evolution of Ligands by Exponential enrichment)
Transcription factors bind to TFBS in DNA
[Figure labels: affinity, specificity.]
Source: Determining the specificity of protein-DNA interactions. Nat Rev Genet. 2010 Nov;11(11):751-60. Epub 2010 Sep 28.
TF Binding Sites
Problems:
• Often poorly defined consensus
• Sequences not conserved within species, and even worse between species
• Examples of enhancers that are functionally conserved but not sequence-conserved
• Most of the TFBS sequence data comes from just a few species
• Very often from in vitro experiments
• Two completely different binding sites could be merged in the same matrix/consensus
2. Position Weight Matrices (PWM)
Data collection
Probabilities can be calculated and corrected for background.
PWMs are also called position-specific scoring matrices (PSSMs). Their values are on a log scale.
From PFM to PWM/PSSM
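The PFM-to-PWM step can be sketched as follows. This is a minimal illustration and not the formula shown on the slide: it assumes a uniform background (0.25 per base), a pseudocount of 1 shared out according to the background, and log2 odds. The function name `pfm_to_pwm` is hypothetical, and the counts are those of the six-position ARGBTC example used in this deck.

```python
import math

# Position frequency matrix (PFM): counts of each base at 6 positions.
PFM = {
    "A": [4, 3, 0, 0, 0, 0],
    "C": [0, 0, 0, 2, 0, 4],
    "G": [0, 1, 4, 1, 0, 0],
    "T": [0, 0, 0, 1, 4, 0],
}


def pfm_to_pwm(pfm, background=None, pseudocount=1.0):
    """Turn counts into log2 odds scores (assumed formula, see the text above)."""
    bases = sorted(pfm)
    if background is None:
        # Assumption: every base is equally likely in the background.
        background = {b: 1.0 / len(bases) for b in bases}
    length = len(next(iter(pfm.values())))
    pwm = {b: [] for b in bases}
    for i in range(length):
        total = sum(pfm[b][i] for b in bases)
        for b in bases:
            # The pseudocount keeps zero counts from producing log(0).
            prob = (pfm[b][i] + pseudocount * background[b]) / (total + pseudocount)
            # Correct for background and move to log scale.
            pwm[b].append(math.log2(prob / background[b]))
    return pwm


PWM = pfm_to_pwm(PFM)
for base in "ACGT":
    print(base, " ".join(f"{w:6.2f}" for w in PWM[base]))
```

Positive weights mark bases seen more often than the background predicts; negative weights mark bases that are rare or absent at that position.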
SEQUENCE LOGOS: The information content of a matrix column ranges from 0 bits (no base preference) to 2 bits (only 1 base used).
http://weblogo.berkeley.edu/
http://www.lecb.ncifcrf.gov/~toms/sequencelogo.html
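The 0 to 2 bit range can be checked with a short calculation. The sketch below assumes the usual Shannon definition (2 bits minus the column entropy, for a four-letter alphabet) and omits the small-sample correction that logo tools may apply. `column_information` is a hypothetical helper name.

```python
import math


def column_information(counts):
    """Information content in bits of one matrix column (counts for A, C, G, T)."""
    total = sum(counts)
    entropy = 0.0
    for c in counts:
        if c > 0:  # a base that never occurs contributes nothing
            p = c / total
            entropy -= p * math.log2(p)
    # Maximum entropy for 4 bases is log2(4) = 2 bits.
    return math.log2(len(counts)) - entropy


# Columns are given as [A, C, G, T] counts.
for name, column in [
    ("only one base used", [4, 0, 0, 0]),
    ("no base preference", [1, 1, 1, 1]),
    ("A or G (3:1)", [3, 0, 1, 0]),
]:
    print(f"{name}: {column_information(column):.2f} bits")
```

In a sequence logo, this value sets the total height of the letter stack at each position.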
Summary
Aligned binding sites:
AAGTTC
AAGCTC
AGGCTC
AAGGTC
Position frequency matrix:
Pos 1 2 3 4 5 6
A   4 3 0 0 0 0
C   0 0 0 2 0 4
G   0 1 4 1 0 0
T   0 0 0 1 4 0
Consensus: ARGBTC
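The summary can be reproduced in a few lines: count bases per position across the four aligned sites, then write each position as the IUPAC code for the set of bases seen there. The "any observed base" rule used here is an assumption that happens to reproduce the slide's consensus; other conventions apply frequency thresholds. `build_pfm` and `consensus` are hypothetical names.

```python
SITES = ["AAGTTC", "AAGCTC", "AGGCTC", "AAGGTC"]

# IUPAC nucleotide codes keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def build_pfm(sites):
    """Count each base at each position of equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pfm = {base: [0] * length for base in "ACGT"}
    for site in sites:
        for i, base in enumerate(site):
            pfm[base][i] += 1
    return pfm


def consensus(pfm):
    """IUPAC consensus built from every base observed at least once."""
    length = len(pfm["A"])
    return "".join(
        IUPAC[frozenset(b for b in "ACGT" if pfm[b][i] > 0)]
        for i in range(length)
    )


pfm = build_pfm(SITES)
for base in "ACGT":
    print(base, *pfm[base])
print("Consensus:", consensus(pfm))  # ARGBTC
```

The consensus loses the counts: R at position 2 hides that A was seen three times and G once, which is the information the matrix keeps.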
3. PWM Databases
TRANSFAC: not free, 848 matrices, extensive information and references, quality score based on the methods used
JASPAR: open source, 123 matrices, minimal information, majority (80%) based on the SELEX method
TRANSFAC®
http://www.gene-regulation.com/pub/databases.html
http://jaspar.cgb.ki.se/
http://jaspar.genereg.net/
JASPAR example: Pax6
4. Pattern Matching: TFBS prediction using PWMs
[Screenshot annotation: Click here to select all TFBS]
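Pattern matching with a PWM amounts to sliding the matrix along the sequence, on both strands, and reporting the windows that score above a cutoff. The sketch below is a generic illustration and not the actual algorithm of Match or JASPAR. The relative-score cutoff of 0.9, the uniform background, the pseudocount, and the names `log_odds` and `scan` are all assumptions; the counts are the ARGBTC example from this deck and the test sequence is made up.

```python
import math

BASES = "ACGT"
COUNTS = {
    "A": [4, 3, 0, 0, 0, 0],
    "C": [0, 0, 0, 2, 0, 4],
    "G": [0, 1, 4, 1, 0, 0],
    "T": [0, 0, 0, 1, 4, 0],
}


def log_odds(counts, pseudocount=1.0, bg=0.25):
    """One dict of log2 odds per matrix column (uniform background assumed)."""
    columns = []
    for i in range(len(counts["A"])):
        total = sum(counts[b][i] for b in BASES) + pseudocount
        columns.append({
            b: math.log2(((counts[b][i] + pseudocount * bg) / total) / bg)
            for b in BASES
        })
    return columns


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq, pwm, threshold):
    """Return (start, strand, site, relative_score) for windows above threshold.

    start is 1-based on the forward strand; relative_score rescales the raw
    score to 0..1 between the worst and best possible matrix scores.
    """
    width = len(pwm)
    best = sum(max(col.values()) for col in pwm)
    worst = sum(min(col.values()) for col in pwm)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            window = s[i:i + width]
            if any(b not in BASES for b in window):
                continue  # skip windows containing N or other symbols
            score = sum(pwm[j][b] for j, b in enumerate(window))
            rel = (score - worst) / (best - worst)
            if rel >= threshold:
                start = i + 1 if strand == "+" else len(seq) - i - width + 1
                hits.append((start, strand, window, round(rel, 3)))
    return hits


PWM = log_odds(COUNTS)
hits = scan("TTTAAGCTCGGGGAGCTTAAA", PWM, threshold=0.9)
for hit in hits:
    print(hit)
```

Lowering the cutoff returns more candidate sites at the cost of more false positives, which is why short matrices tend to match very often in long promoters.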
5. Pattern Discovery: Finding unknown motifs
Pattern discovery
Oligo    Expected frequency (reference genome)    Observed frequency (sequences of interest)
AAAAAA   0.00024                                  0.00023
AAAAAC   0.00030                                  0.00031
AAAAAG   0.00031                                  0.00125 ***
AAAAAT   0.00024                                  0.00018
AAAACC   0.00028                                  0.00026
…
*** marks the over-represented oligo.
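The comparison behind pattern discovery can be sketched as an observed/expected ratio per oligo. The frequencies below are the five rows shown on the slide. The 2-fold cutoff and the names `oligo_frequencies` and `over_represented` are assumptions made for illustration; real tools assess statistical significance rather than applying a fixed ratio.

```python
from collections import Counter


def oligo_frequencies(sequences, k=6):
    """Relative frequency of every overlapping k-mer in a set of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {oligo: n / total for oligo, n in counts.items()}


def over_represented(observed, expected, min_ratio=2.0):
    """Oligos whose observed/expected ratio reaches min_ratio (assumed cutoff)."""
    flagged = {}
    for oligo, obs in observed.items():
        exp = expected.get(oligo)
        if exp and obs / exp >= min_ratio:
            flagged[oligo] = obs / exp
    return flagged


# Expected frequencies come from the reference genome,
# observed frequencies from the sequences of interest.
EXPECTED = {"AAAAAA": 0.00024, "AAAAAC": 0.00030, "AAAAAG": 0.00031,
            "AAAAAT": 0.00024, "AAAACC": 0.00028}
OBSERVED = {"AAAAAA": 0.00023, "AAAAAC": 0.00031, "AAAAAG": 0.00125,
            "AAAAAT": 0.00018, "AAAACC": 0.00026}

flagged = over_represented(OBSERVED, EXPECTED)
for oligo, ratio in flagged.items():
    print(f"{oligo}: {ratio:.1f}x more frequent than expected")
```

With real data, `oligo_frequencies` would be run once on the reference genome (or a background set of promoters) and once on the sequences of interest to produce the two tables.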
http://meme.sdsc.edu/meme/
6. Exercise: Use the human NOS2 sequence to predict TFBS with Match and JASPAR
EXERCISE Step by step
a. Download the human NOS2 gene plus 5000 bases upstream from UCSC or Ensembl. Select the first 1 kb of the “proximal promoter”: from -1000 to the TSS (hint: there is no zero position!)
b. Go to JASPAR and search for TFBS in the promoter with the default settings.
c. Do the same exercise with the mouse NOS2 gene.
d. Compare the results.
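The hint in step a can be made concrete with a small coordinate helper. It is a sketch under stated assumptions: the downloaded sequence reads 5' to 3' on the gene's strand, it begins with exactly 5000 upstream bases so that the TSS (+1) is the base at index 5000, and "from -1000 to TSS" is read as positions -1000 to -1. `promoter_window` is a hypothetical name and the sequence below is a placeholder, not NOS2.

```python
def promoter_window(seq, upstream_len, start=-1000, end=-1):
    """Return positions start..end (TSS-relative, inclusive) from seq.

    Positions run ..., -2, -1, +1, +2, ...; there is no position 0.
    The TSS (+1) is assumed to sit at seq[upstream_len].
    """
    def to_index(pos):
        if pos == 0:
            raise ValueError("there is no position 0: -1 is followed by +1")
        return upstream_len + pos if pos < 0 else upstream_len + pos - 1

    lo, hi = to_index(start), to_index(end)
    if lo < 0 or hi >= len(seq) or lo > hi:
        raise ValueError("window falls outside the sequence")
    return seq[lo:hi + 1]


# Placeholder sequence: 5000 upstream bases ("u") followed by the gene ("G").
example = "u" * 5000 + "G" * 200

proximal = promoter_window(example, upstream_len=5000)
print(len(proximal))                               # 1000 bases, -1000 to -1
print(promoter_window(example, 5000, -2, 2))       # -2, -1, +1, +2
```

Because -1 is followed directly by +1, a window from -2 to +2 holds four bases, not five; forgetting this shifts every predicted site by one base.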
Chromatin Accessibility
Access to experimental information
http://www.nature.com/scitable/
Euchromatin and Heterochromatin
Late replication
Early replication
Source: Determinants and dynamics of genome accessibility. Nat Rev Genet. 2011 Jul 12;12(8):554-64. doi: 10.1038/nrg3017.
ENCODE: www.genome.gov/10005107
• ENCyclopedia Of DNA Elements, NHGRI
• Consortium of international researchers
• UCSC is the Data Coordination Center
Slides from http://www.openhelix.com/ENCODE
ENCODE Background
• Pilot phase, or phase I: www.genome.gov/26525202
• Selected regions of the genome: 1%, 30 Mb
ENCODE Pilot Data and Beyond
• ENCODE portal: http://genome.ucsc.edu/ENCODE/
• Pilot ENCODE browser: genome.ucsc.edu/ENCODE/pilot.html
ENCODE Next Phase: Production Phase
• UCSC is the DCC for human and mouse data
• The portal is available: genome.ucsc.edu/ENCODE/
• New aspects of the Production Phase projects
ENCODE Production Phase Focus
• ENCODE is now genome-wide
• Specific cell types and new technologies being applied
• Project focus topics selected, then supplemented
[Figure labels: chromatin; transcriptome/genes; promoters/regulatory sites; DNase sites.]
Copyright OpenHelix. No use or reproduction without express written consent.
ENCODE Data is Flowing!
• Data being submitted to UCSC DCC by data providers
• “Wranglers” ensure metadata is present
• Quality checks occur, data is released for use
ENCODE Data Types
• Mapping data
• Genes
• Expression
• Regulation
• Variation
ENCODE tracks are identified with an icon.
Regulation Data
• Regulation data
• Structure: modifications, open vs. closed chromatin
Image from NIH
Regulation Data II
• Transcription factor binding sites (TFBS)
• RNA binding proteins
TATA bound to DNA