
Upload: hubert-dennis

Post on 17-Dec-2015


Telomeres are DNA at the ends of chromosomes. Across all organisms, telomeres have guanine-rich sequences. Many of these sequences have been shown to form G4s in vitro (and some in vivo). As human cells age, telomeres shorten. When telomeres shorten, proteins that are associated with the ends of telomeres are released. Rap1p is a transcription regulator that is known to bind to double-stranded DNA (including telomere DNA) and to G4s. Individuals with the premature aging disease Werner syndrome lack a DNA helicase, WRN, that preferentially unwinds G4s and helps maintain telomeres.

Yeast as a model organism to study aging

Yeast naturally maintain telomere length using the enzyme telomerase. Yeast tlc1 mutants lack telomerase activity, so their telomeres shorten with each cell division. Eventually this telomere shortening causes permanent cell-cycle arrest (senescence). Yeast tlc1 mutants are therefore a model for studying human cell senescence.

Evidence That G-Quadruplexes Regulate Transcription in S. cerevisiae

Steven Hershman, Qijun Chen, Julia Lee, Marina Kozak, Jasmine Smith, Alex Chavez, Li-San Wang and F. Brad Johnson

Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104

Results

In silico prediction shows yeast genome has high G4 forming potential

Evidence of non-telomeric G4s in the human genome

In humans, sequences with G4-forming potential are overrepresented, particularly in promoters and nucleosome-free regions (Huppert, 2005). Additionally, G4-forming sequences are overrepresented in proto-oncogenes and underrepresented in tumor suppressor genes (Eddy, 2006).

Role of G-quadruplexes in telomeres and their connection with aging

Scientific problem

Are there in vivo non-telomeric G4s in yeast? If so, what role do they play?

Computational Prediction of G4 DNA

The most conservative G4 prediction algorithms use a regular expression in which four runs of at least 3 guanine residues are separated by loops of 1-7 nucleotides of any type.

Regular Expression (Huppert, 2005)

$G_{3+}N_{1-7}G_{3+}N_{1-7}G_{3+}N_{1-7}G_{3+}$
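As an illustration only, the pattern above can be written as a Python regular expression. This is a minimal sketch, not the code used for the poster: the function name `find_g4_regex` is hypothetical, loops are assumed to contain only A, C, G or T, and only the given strand is scanned (a G4 on the opposite strand would appear as runs of C).

```python
import re

# Four runs of >= 3 G separated by three loops of 1-7 nucleotides.
# Assumption: loop positions are limited to the unambiguous bases A, C, G, T.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_g4_regex(seq):
    """Return (start, end, motif) for each non-overlapping match on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence with four GGG runs separated by single-base loops.
    for hit in find_g4_regex("ttgggagggagggagggtt"):
        print(hit)
```

Because `finditer` reports non-overlapping matches, a long G-rich region is counted once rather than once per possible folding.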

Sliding Window (Eddy, 2006)

An alternative method for finding G4s relies on looking at the distance between the first and fourth GGG repeat. This allows for more flexible lengths in the loop regions.
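The distance-based idea can be sketched as follows. This is an assumption-laden illustration rather than the published algorithm: the function name `find_g4_window`, the 50-nucleotide default (borrowed from the "length 50 or less" cutoff used in the results) and the requirement of at least one nucleotide between GGG repeats are all choices made for this example.

```python
import re


def find_g4_window(seq, max_len=50):
    """Report spans where four separated GGG repeats fit within max_len nucleotides.

    For each possible first GGG, the next three GGG repeats are chosen greedily
    (each starting at least one nucleotide after the previous repeat ends), which
    gives the shortest span beginning at that first repeat.
    """
    seq = seq.upper()
    # Start index of every GGG triplet; the lookahead allows overlapping triplets.
    starts = [m.start() for m in re.finditer(r"(?=GGG)", seq)]
    hits = []
    for i, first in enumerate(starts):
        chosen = [first]
        for s in starts[i + 1:]:
            if s >= chosen[-1] + 4:  # 3 nt for the repeat plus a loop of >= 1 nt
                chosen.append(s)
                if len(chosen) == 4:
                    break
        if len(chosen) == 4:
            end = chosen[-1] + 3
            if end - first <= max_len:
                hits.append((first, end))
    return hits


if __name__ == "__main__":
    # Loops of 10 nt are too long for the 1-7 nt regular expression
    # but still fit inside a 50 nt window.
    toy = "GGG" + "A" * 10 + "GGG" + "A" * 10 + "GGG" + "A" * 10 + "GGG"
    print(find_g4_window(toy))
```

Unlike the regular expression, this version puts no limit on any individual loop; only the total span is constrained.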

Structural biology and biochemistry of G-quadruplexes (G4s)

G-quadruplexes (G4s) are four-stranded DNA structures that form when guanine residues participate in Hoogsteen hydrogen bonding (Left), forming boxes known as G-quartets. These G-quartets stack to form a quadruplex (Right).

G4s are particularly stable under physiological pH and salt conditions.

Top: The enrichment ratio of quadruplexes of different lengths in ORFs and promoters. Control sequences were simulated by conserving position-wise A/T/G/C frequencies. G4 sequences show a greater preference for promoters than for open reading frames, especially at shorter, more trustworthy G4 lengths.

Bottom: A sliding window was used to measure the position of G4 sequences of length 50 or less relative to the translation start site across all genes. The greatest peak occurred about 425 base pairs upstream of the translation start site. An example control sequence, shown in yellow, has many fewer G4-forming sequences.
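The poster does not describe how the control sequences were generated beyond "conserving position-wise A/T/G/C frequencies". One possible reading, shown here purely as an assumption, is to take equal-length sequences aligned on the translation start site and shuffle the bases within each position across genes. The function name `simulate_positionwise_controls` and the `seed` parameter are hypothetical.

```python
import random


def simulate_positionwise_controls(seqs, seed=0):
    """Shuffle bases within each aligned position across sequences.

    Each output position keeps exactly the same A/T/G/C counts as the input,
    but the link between neighbouring positions in any one sequence is broken.
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must all have the same length")
    rng = random.Random(seed)
    columns = []
    for pos in range(length):
        column = [s[pos] for s in seqs]
        rng.shuffle(column)  # permute this position across genes
        columns.append(column)
    # Reassemble one control sequence per input sequence.
    return ["".join(col[i] for col in columns) for i in range(len(seqs))]


if __name__ == "__main__":
    promoters = ["AAGGGTGGG", "CCGGGAGGG", "TTGGGCGGA"]  # toy data
    for control in simulate_positionwise_controls(promoters, seed=1):
        print(control)
```

Running the same G4 search on real and control sequences and dividing the two counts would give an enrichment ratio of the kind plotted in the top panel.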

Literature Cited:

Eddy J and Maizels N (2006). Gene function correlates with potential for G4 DNA formation in the human genome. Nucleic Acids Research 34, 3887-3896.

Huppert JL and Balasubramanian S (2005). Prevalence of quadruplexes in the human genome. Nucleic Acids Research 33, 2908-2916.

Huber MD, Lee DC, and Maizels N (2002). G4 DNA unwinding by BLM and Sgs1p: substrate specificity and substrate-specific inhibition. Nucleic Acids Research 30, 3954-3961.

Williamson JR (1994). G-quartet structures in telomeric DNA. Annu. Rev. Biophys. Biomol. Struct. 23, 703-730.

Williamson JR, Raghuraman MK, and Cech TR (1989). Monovalent cation-induced structure of telomeric DNA: the G-quartet model. Cell 59, 871-880.

Rawal P, Kummarasetti VB, Ravindran J, Kumar N, Halder K, Sharma R, Mukerji M, Das SK, Chowdhury S (2006). Genome-wide prediction of G4 DNA as regulatory motifs: role in Escherichia coli global regulation. Genome Research 16, 644-655.

Acknowledgements:

Biostatistics Core, School of Medicine, U. of Pennsylvania, for early advice as to which statistical analyses to use.

Cara Winter, Wager Lab, Department of Biology, U. of Pennsylvania, for help with microarray analysis.

Supported by the Vagelos Program, Roy and Diana Vagelos Science Challenge Award and National Institute on Aging Grant 1R01AG021521.

Future Projects

GGG repeats in quadruplex-forming sequences can be mutated so that they should no longer form quadruplexes. We would like to test how this affects regulation of yeast genes that bind Rap1p, but not in a double-stranded DNA-dependent manner.

If quadruplexes play a role in gene regulation, then by adding drugs that stabilize quadruplex formation, we should be able to find genes whose expression levels change. We would like to explore this with various quadruplex-binding drugs in microarray experiments.

In addition to intrastrand quadruplexes, it is also possible that multiple strands can come together to form a quadruplex. We would like to explore this possibility computationally by allowing the four GGG repeats to occur on either the sense or anti-sense strand.
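A first computational pass at this idea might look like the sketch below. It rests on a simple observation: a GGG repeat on the anti-sense strand appears as CCC when reading the sense strand, so each repeat can be allowed to be a run of G or a run of C. The function name `find_either_strand_runs` and the reuse of the 1-7 nucleotide loop rule are assumptions for illustration; whether such mixed-strand motifs actually fold is exactly the open question.

```python
import re

# Each repeat is a run of >= 3 G (sense strand) or >= 3 C (G-run on the anti-sense strand).
RUN = r"(?:G{3,}|C{3,})"
LOOP = r"[ACGT]{1,7}"  # assumption: same loop rule as the intrastrand pattern
EITHER_STRAND_PATTERN = re.compile(RUN + (LOOP + RUN) * 3)


def find_either_strand_runs(seq):
    """Return (start, end, motif) where four repeats occur on any mix of strands."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in EITHER_STRAND_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Two repeats on the sense strand (GGG) and two on the anti-sense strand (CCC).
    print(find_either_strand_runs("GGGACCCAGGGACCC"))
```

Purely intrastrand motifs (all G or all C) also match this pattern, so they would need to be subtracted to isolate candidates that require both strands.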

Conclusions

In the yeast genome, the increased frequency of G-quadruplex sequences and their preferential location in promoters suggest a role in gene regulation.

G-quadruplexes may help coordinate transcriptional responses to telomere shortening. Rap1p might be involved in the mechanism.

G4 potential is correlated with senescence

Genes with G4 sequences of length 50 or less in their promoters are correlated with genes that are upregulated 2-fold in tlc1 mutant cells that have reached senescence.

Fisher exact P = 0.00036

Venn diagram counts: Tlc1 up 244, overlap 33, G4-DNA 354
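For readers who want to see how an overlap of this kind is tested, here is a minimal sketch of a one-sided Fisher exact test computed as a hypergeometric tail. Several things are assumptions: the Venn diagram is read as 244 genes only in the Tlc1-up set, 33 in the overlap and 354 only in the G4 set; the size of the gene universe is not given on the poster, so the value 6000 is a placeholder; and the function name `enrichment_p_value` is hypothetical. The result is therefore not expected to reproduce the reported P value.

```python
from math import comb


def enrichment_p_value(overlap, size_a, size_b, universe):
    """One-sided Fisher exact test: P(overlap >= observed) under random draws.

    size_a and size_b are the total sizes of the two gene sets (including the
    overlap); universe is the number of genes the sets were drawn from.
    """
    upper = min(size_a, size_b)
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    tlc1_up = 244 + 33    # assumed reading of the Venn diagram
    g4_genes = 354 + 33   # assumed reading of the Venn diagram
    universe = 6000       # placeholder: gene universe size is not stated on the poster
    print(enrichment_p_value(33, tlc1_up, g4_genes, universe))
```

The exact integer arithmetic from `math.comb` avoids the rounding problems that factorial-based floating-point formulas run into at genome scale.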

Background

G4 potential is correlated with non-double-stranded Rap1p DNA targets

Top: A list of genes that associate with Rap1p, but not through double-stranded DNA binding, was developed by taking the genes that associate with Rap1p in ChIP-chip assays and removing any gene whose double-stranded DNA associated with Rap1p in vitro. This list of genes was enriched for G4 potential.

Genes that interact with Rap1p are correlated with genes that are downregulated at senescence (Fisher exact P=0.00087). G4 cannot explain this connection because the tlc1-G4 connection was based on genes that were upregulated.

Bottom: A GO analysis was completed on the genes in this double overlap. Classes that involved metabolism and transcription were overrepresented. This echoes similar results found in E. coli by Rawal (2006).

Fisher exact $P = 1.4 \times 10^{-13}$

Venn diagram counts: non-dsDNA Rap1p targets 222, overlap 55, G4-DNA 390

Class  GO Term                                              P Value
BP     telomerase-independent telomere maintenance          8.11E-08
BP     mitotic recombination                                3.00E-06
MF     DNA helicase activity                                9.87E-06
MF     helicase activity                                    0.000121
BP     energy derivation by oxidation of organic compounds  0.001313
MF     transcription factor activity                        0.001485
BP     DNA recombination                                    0.001967
BP     generation of precursor metabolites and energy       0.00367

Model: Rap1p-G4 interactions

To explain the Rap1p/tlc1/G4 overlap results, we propose the model to the right.

Rap1p leaves its telomere binding sites as telomeres become short at senescence.

Top: When Rap1p interacts with quadruplexes in the absence of a double-stranded binding site, it activates transcription.

Bottom: If there is a double-stranded binding site present, then Rap1p inhibits transcription.

[Figure: model diagram showing Rap1p with transcription OFF and ON.]

[Figure, bottom panel: number of G4 sequences within 200 bp (y-axis, 0 to 200) versus position relative to the translation start site (x-axis, -1500 to 1500) for Genome and Control sequences, with promoter and ORF regions marked.]

[Figure, top panel: fold enrichment of G4 sequences (y-axis, 0 to 20) versus maximum G4 length (x-axis, 25 to 1000) for ORF and Promoter.]