Structure of proximal and distant regulatory elements in the human genome
Ivan Ovcharenko, Computational Biology Branch, National Center for Biotechnology Information, National Institutes of Health. September 23, 2010.
![Page 1: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/1.jpg)
Structure of proximal and distant regulatory elements in the human genome
Ivan Ovcharenko
Computational Biology Branch
National Center for Biotechnology Information
National Institutes of Health
September 23, 2010
![Page 2: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/2.jpg)
The Genome Sequence: The Ultimate Code of Life
3 billion letters
~3% codes for proteins
~45% is “junk” (repetitive elements)
Gene regulatory elements (REs) reside somewhere in the remaining ~50%
![Page 3: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/3.jpg)
Distant Regulatory Elements
![Page 4: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/4.jpg)
Hirschsprung disease is associated with a noncoding SNP
[Figure: the RET locus]
![Page 5: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/5.jpg)
Hundreds of noncoding disease SNPs
![Page 6: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/6.jpg)
Combinations of binding sites define the biological function of regulatory elements
• Transcription factors (TFs) bind to very short binding sites (TFBS) of 6-10 nucleotides
• Combinatorial binding of multiple TFs to an RE defines a specific pattern of gene expression
• Correlating patterns of TFBS in REs with biological function will “decode” the gene regulatory encryption
[Diagram: Proteins A, B, and C bind three TFBS (uppercase) within a regulatory element (RE) on the DNA next to a GENE]
aCTGACTgaaaaCTGATATTGacagtTTGTTGTTGttaa
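A minimal sketch of how short binding sites like those in the diagram can be located by scoring every window of a sequence against a position weight matrix (PWM). The count matrix below is hypothetical (built around the first uppercase block, CTGACT, of the example sequence) and is not a real TRANSFAC or JASPAR entry; the uniform background, the threshold of 8.0, and the forward-strand-only scan are also assumptions made for illustration.

```python
import math

# Hypothetical count matrix: 7 observations of the consensus base and 1 of
# each other base per column. Not taken from TRANSFAC or JASPAR.
CONSENSUS = "CTGACT"
COUNTS = [{base: (7 if base == c else 1) for base in "ACGT"} for c in CONSENSUS]


def log_odds_matrix(counts, background=0.25):
    """Convert per-column base counts into log2-odds scores against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values())
        matrix.append({b: math.log2((n / total) / background) for b, n in column.items()})
    return matrix


def scan(sequence, matrix, threshold):
    """Score every forward-strand window and return (start, window, score) for hits."""
    seq = sequence.upper()  # assumes the sequence contains only A, C, G, T
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


SLIDE_SEQUENCE = "aCTGACTgaaaaCTGATATTGacagtTTGTTGTTGttaa"
PWM = log_odds_matrix(COUNTS)
HITS = scan(SLIDE_SEQUENCE, PWM, threshold=8.0)
print(HITS)
```

With these toy settings only an exact match to the consensus clears the threshold. A real scan would also score the reverse strand and use a background model fitted to the genome.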
![Page 7: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/7.jpg)
![Page 8: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/8.jpg)
Homotypic TFBS clusters
a. Occur widely in nature (Arnone and Davidson, 1997)
b. Provide redundancy for key regulatory events, a cornerstone of developmental stability
c. Respond to a range of TF concentrations (e.g., allow low-abundance TFs to bind)
Berman et al. (2002) PNAS 99:757
![Page 9: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/9.jpg)
Searching the human genome for homotypic TFBS clusters
[Plot: E2F_Q6_01 profile along genomic positions 99530000 to 99544000 (x-axis); y-axis range 4.00E-05 to 1.00E-04; the cluster is marked]
![Page 10: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/10.jpg)
Homotypic TFBS clusters in the human genome
• ~700 TRANSFAC & JASPAR PWMs were used to annotate putative TFBS in the non-repetitive, non-exonic part of the human genome
• A 2-state HMM was trained to identify genomic regions with an elevated density of TFBS events
[Diagram: a TFBS cluster made of TFBS “A” sites; labeled distances: < 3 kb and < 500 bp]
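A minimal sketch of the 2-state idea, assuming the genome is cut into fixed bins and each bin is marked 1 if a predicted site for one PWM falls in it and 0 otherwise. The slides say only that a 2-state HMM was trained; the state names, the binning, and every probability below are hand-set assumptions for illustration, not the trained parameters.

```python
import math

STATES = ("background", "cluster")

# All probabilities are hypothetical, chosen by hand rather than trained.
START = {"background": 0.99, "cluster": 0.01}
TRANS = {
    "background": {"background": 0.99, "cluster": 0.01},
    "cluster": {"background": 0.2, "cluster": 0.8},
}
EMIT = {
    "background": {0: 0.98, 1: 0.02},  # sites are rare outside clusters
    "cluster": {0: 0.4, 1: 0.6},       # sites are dense inside clusters
}


def viterbi(observations):
    """Return the most probable state path for a 0/1 observation sequence (log space)."""
    score = {s: math.log(START[s]) + math.log(EMIT[s][observations[0]]) for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        new_score, pointer = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[best_prev] + math.log(TRANS[best_prev][s])
                            + math.log(EMIT[s][obs]))
            pointer[s] = best_prev
        backpointers.append(pointer)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    path.reverse()
    return path


def cluster_intervals(path):
    """Collapse a state path into half-open (start, end) bin intervals labeled 'cluster'."""
    intervals, start = [], None
    for i, state in enumerate(path):
        if state == "cluster" and start is None:
            start = i
        elif state != "cluster" and start is not None:
            intervals.append((start, i))
            start = None
    if start is not None:
        intervals.append((start, len(path)))
    return intervals


# Toy input: 20 empty bins, a dense run of sites with one gap, 20 empty bins.
OBS = [0] * 20 + [1, 1, 0, 1, 1, 1] + [0] * 20
PATH = viterbi(OBS)
CLUSTERS = cluster_intervals(PATH)
print(CLUSTERS)
```

The single empty bin inside the run stays in the cluster state because leaving and re-entering costs more than emitting one 0 there, which is the behavior that makes an HMM more forgiving than a fixed site-count threshold.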
![Page 11: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/11.jpg)
Only 33 PWMs have more than 1000 clusters
[Bar chart: number of clusters in the human genome (y-axis, 0 to 5000) across 700+ transcription factors (x-axis); legend: Direct, Indirect, Human specific]
• 126,000 homotypic TFBS clusters
• 272 (40%) of TFs have at least 5 clusters
• Median length: 597 bp
• Median number of TFBS per cluster: 5
• Total genome span: 50.4 Mb (1.6%)
![Page 12: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/12.jpg)
Homotypic TFBS clusters are strongly associated with promoters
2290 clusters (47% of 4894 total) are in promoters
51% of human promoters contain at least 1 cluster
![Page 13: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/13.jpg)
Fraction of clusters in promoters
[Stacked bar chart: share of clusters in promoters vs. not in promoters (y-axis, 0% to 100%) for V$J3_ETS1_HSAP, V$AP2_Q6_01, V$AHRHIF_Q6, V$SP1_Q6, V$AREB6_03, V$HNF6_Q6, V$J3_HNF4A_, V$J3_FOXD3_RNOR]
p-val < 0.005 for 78 TFs
![Page 14: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/14.jpg)
SNP density in clusters
![Page 15: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/15.jpg)
Comparing TFBS to inter-site regions within clusters to avoid ascertainment bias
[Diagram: a cluster with its inter-site regions marked between the TFBS]
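A minimal sketch of the bookkeeping behind this comparison: split a cluster into its binding sites and the inter-site gaps between them, then compute SNPs per base in each. All coordinates and SNP positions are made up to exercise the code and carry no biological meaning; the function names are hypothetical, and intervals are assumed to be half-open.

```python
def inter_site_regions(cluster, sites):
    """Return the half-open gaps inside `cluster` that no site covers."""
    start, end = cluster
    gaps, cursor = [], start
    for site_start, site_end in sorted(sites):
        if site_start > cursor:
            gaps.append((cursor, site_start))
        cursor = max(cursor, site_end)
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


def snp_density(intervals, snps):
    """SNPs per base over a set of non-overlapping half-open intervals."""
    length = sum(e - s for s, e in intervals)
    count = sum(1 for pos in snps if any(s <= pos < e for s, e in intervals))
    return count / length


# Toy data only: one 600-bp cluster with five 10-bp sites and eight SNPs.
CLUSTER = (1000, 1600)
SITES = [(1010, 1020), (1150, 1160), (1300, 1310), (1450, 1460), (1580, 1590)]
SNPS = [1005, 1100, 1155, 1200, 1250, 1400, 1500, 1595]

GAPS = inter_site_regions(CLUSTER, SITES)
SITE_DENSITY = snp_density(SITES, SNPS)
GAP_DENSITY = snp_density(GAPS, SNPS)
print(SITE_DENSITY, GAP_DENSITY)
```

Because both densities are measured inside the same clusters, any bias in how SNPs were discovered in those regions affects sites and gaps alike, which is the point of the within-cluster comparison.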
![Page 16: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/16.jpg)
Two lines of evidence of negative selection acting on TFBS within TFBS clusters
![Page 17: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/17.jpg)
Overlap with in vivo developmental enhancers
http://enhancer.lbl.gov
346 ENHANCERS, 503 NEGATIVES
“deep” or “ultra” conservation
![Page 18: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/18.jpg)
LBL enhancers overlapping conserved homotypic clusters
Expected: 5 (1.5%) enhancers overlapping clusters
Observed: 163 (47%) enhancers overlapping clusters
p-value < 10^-100
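One way to sanity-check the order of magnitude is a one-sided binomial tail: the probability of seeing at least 163 overlaps among 346 enhancers when each overlaps with probability 5/346. The slides do not say which test was used, so the choice of a binomial model here is an assumption; the sum is done in log space because the value is far below what a float can hold comfortably.

```python
import math


def log10_binomial_tail(k, n, p):
    """Return log10 of P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    log_terms = []
    for i in range(k, n + 1):
        log_choose = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        log_terms.append(log_choose + i * math.log(p) + (n - i) * math.log1p(-p))
    # Log-sum-exp: factor out the largest term to avoid underflow.
    peak = max(log_terms)
    total = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return total / math.log(10)


N_ENHANCERS = 346           # enhancers tested (from the slide)
OBSERVED = 163              # enhancers overlapping clusters (from the slide)
EXPECTED_FRACTION = 5 / 346  # expected overlap rate (from the slide)

LOG10_P = log10_binomial_tail(OBSERVED, N_ENHANCERS, EXPECTED_FRACTION)
FOLD_ENRICHMENT = OBSERVED / 5
print(LOG10_P, FOLD_ENRICHMENT)
```

Under this model the exponent lands well below -100, consistent with the bound on the slide, and the observed count is about 33 times the expected one.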
![Page 19: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/19.jpg)
Breaking the code. TF – tissue associations.
![Page 20: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/20.jpg)
3-fold stronger association with p300 binding than expected
enhancer
![Page 21: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/21.jpg)
Tissue-specific association of NOBOX and E2F4
25-fold difference, P = 2.99·10^-50
[Panels: E2F4 HCT, NOBOX HCT]
![Page 22: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/22.jpg)
Experimental validation, E2F4 & NRF1 clusters
[Figure: panels A, B, and C; expression labels: diencephalon; pancreas; caudal somites; subregions of forebrain, midbrain, hindbrain; neural tube]
Lawrence Berkeley Lab: Axel Visel, Len Pennacchio
![Page 23: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/23.jpg)
Summary
Homotypic TFBS clusters are abundant in the human genome; they span 50.4 Mb (1.6% of the genome) – about as much as coding DNA
~50% of human promoters contain a homotypic cluster of binding sites
~50% of validated enhancers contain a homotypic cluster of binding sites
![Page 24: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/24.jpg)
Acknowledgements
Valer Gotea
Lawrence Berkeley Lab
Axel Visel
Len Pennacchio
![Page 25: Structure of proximal and distant regulatory elements in the human genome](https://reader036.vdocuments.mx/reader036/viewer/2022062321/56812f8d550346895d9509a7/html5/thumbnails/25.jpg)
SNP ascertainment bias leads to low SNP density in clusters