DNA METHYLATION PROFILING REVEALS NOVEL BIOMARKERS AND
IMPORTANT ROLES FOR DNA METHYLTRANSFERASES IN PROSTATE
CANCER
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF GENETICS
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Yuya Kobayashi
February 2011
http://creativecommons.org/licenses/by-nc/3.0/us/
This dissertation is online at: http://purl.stanford.edu/zx378tc0675
© 2011 by Yuya Kobayashi. All Rights Reserved.
Re-distributed by Stanford University under license with the author.
This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Richard Myers, Primary Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Gavin Sherlock, Co-Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
James Brooks
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Joseph Lipsick
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Hua Tang
Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost Graduate Education
This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.
ABSTRACT
Candidate gene-based studies have identified a handful of aberrant CpG DNA
methylation events in prostate cancer (Brooks et al. 1998; Yegnasubramanian et al.
2004). However, large scale DNA methylation profiles have not been examined for
normal prostates or prostate tumors. Additionally, the mechanisms behind these DNA
methylation alterations are unknown. In this thesis, I describe the results of my efforts to
better understand these previously unexplored areas of biology.
For the study presented in this thesis, I quantitatively profiled 95 primary prostate tumors
and 86 healthy prostate tissue samples for their DNA methylation levels at 26,333 CpGs
representing 14,104 gene promoters by using the Illumina HumanMethylation27
platform. When the profiles of the prostate tissue samples were compared, I observed a
substantial number of tumor-specific DNA methylation alterations. A 2-class
Significance Analysis of this dataset revealed 5,912 CpG sites with increased DNA
methylation and 2,151 CpG sites with decreased DNA methylation in tumors (FDR <
0.8%). Prediction Analysis of this dataset identified 87 CpGs that are the most predictive
diagnostic methylation biomarkers of prostate cancer. By integrating available clinical
follow-up data, I also identified 69 prognostic DNA methylation alterations that correlate
with biochemical recurrence of the tumor.
To identify the mechanisms responsible for these genome-wide DNA methylation
alterations, I measured the gene expression levels of several DNA methyltransferases
(DNMTs) and their interacting proteins by TaqMan qPCR and observed increased
expression of DNMT3A2, DNMT3B, and EZH2 in tumors. Subsequent transient
transfection assays in cultured primary prostate cells revealed that DNMT3B1 and
DNMT3B2 overexpression resulted in increased methylation of a substantial subset of
CpG sites that also showed tumor-specific increased methylation.
ACKNOWLEDGEMENTS
I would like to thank the following people and organizations for material, intellectual and
moral support:
All the patients who donated their prostates for the advancement of science.
Devin Absher, Kenny Day, Krista Stanton and Kevin Roberts for assistance with
experiments and analysis of the HumanMethylation27 data.
James Brooks and Zulfiqar Gulzar for collecting the samples and their guidance and
support.
Donna Peehl and Sarah Young for assistance with the cultured primary prostate
cells.
All the faculty, staff and students in the Stanford Genetics Department for creating
an intellectually stimulating and fun environment.
My thesis committee members Hua Tang, Joseph Lipsick and James Brooks for
helpful comments and discussion.
My advisors Rick Myers and Gavin Sherlock for outstanding guidance and
mentorship throughout my graduate career.
All my current and past lab-mates in the Myers and Sherlock Labs for their
friendship, support and insight, day-in and day-out.
My fantastic Stanford friends for all the great memories.
My family for their continued support and doing everything to open every door of
opportunity for me.
My wife Katie for her patience and love.
TABLE OF CONTENTS
CHAPTER 1: Introduction
CHAPTER 2: Methods
CHAPTER 3: DNA methylation profiles of normal prostates and prostate tumors
CHAPTER 4: DNA Methyltransferases in prostate
CHAPTER 5: Discussion
APPENDIX
REFERENCES
List of Tables
Table 2.1: Primer sequences used in PyroMark assays
Supplementary Table S1: Clinical information associated with prostate samples
Supplementary Table S2: Diagnostic methylation markers of prostate cancer identified by PAM
Supplementary Table S3: Prognostic methylation markers of prostate cancer identified by SAM survival
List of Illustrations
Figure 3.1: Hierarchical clustering of prostate tissues by DNA methylation
Figure 3.2: Normal vs Tumor unpaired 2-class SAM analysis of the 181 prostate samples
Figure 3.3: Differentially methylated CpGs of prostate tumors
Figure 3.4: Normal vs Tumor paired 2-class SAM analysis of the 181 prostate samples
Figure 3.5: Quantitative SAM analysis of 86 prostate samples based on age of patient at the time of surgery
Figure 3.6: 749 age-dependent CpGs
Figure 3.7: GSTP1 CpG island hypermethylation in prostate tumors
Figure 3.8: APC proximal promoter hypermethylation in prostate tumors
Figure 3.9: RASSF1 proximal promoter hypermethylation in prostate tumors
Figure 3.10: Diagnostic markers of prostate cancer identified by PAM
Figure 3.11: PyroMark validates HumanMethylation27 results
Figure 3.12: Comparison of neighboring CpGs by PyroMark
Figure 4.1: Expression of DNMTs and EZH2 correlates with global hypermethylation in prostate tumors
Figure 4.2: Overexpression of DNMTs and EZH2 results in increased methylation at a subset of prostate tumor hypermethylation sites
CHAPTER 1
INTRODUCTION
Prostate cancer is the most commonly diagnosed malignancy and second leading cause of
cancer mortality for men in the United States with an estimated 217,730 new cases and
32,050 prostate cancer deaths projected for 2010 (Jemal et al. 2010). With nearly one in
six men diagnosed with this disease in their lifetime, it represented a $9.9 billion burden to
the U.S. healthcare system in 2006 (National Cancer Institute 2010).
After more than two decades of widespread serum prostate specific antigen (PSA)
testing, clinical prostate cancer has shifted to a predominantly localized disease.
However, there are two key challenges in our current diagnostic and prognostic strategies
with regard to this disease. First, PSA testing is not specific to cancer – more common
and less serious conditions such as benign prostatic hyperplasia and prostate infections
also increase PSA levels. For this reason, PSA testing has a false positive rate of greater
than 70% (P Finne et al. 2000). This high rate of false positives has not only been shown
to cause psychological harm to patients (K Lin et al. 2008), but also leads to many
unnecessary prostate biopsies.
Second, despite the shift towards localized disease, two recent large-scale, randomized
trials of PSA screening suggest that prostate cancer is over-diagnosed and over-treated
(Andriole et al. 2009; Schröder et al. 2009). In the Andriole study, no difference in
prostate cancer mortality was observed between two groups of patients: cases comprised
of over 38,000 patients who received annual screening and an equal size control group
which received 'usual care' outlined by their personal healthcare providers. In the
Schröder study, 182,000 men were either offered PSA screening every 4 years or were
offered no screening. A slight decrease in mortality rate was observed, but it was noted
that in order to save one life, 1,410 men would need to be screened and 49 patients would
need to be treated. While both studies observed higher levels of detection in the screened
group, this did not strongly correlate with reduced mortality.
This overtreatment is most likely because many cancers that are detected are never
destined to progress. While some patients die of metastatic disease within 2 to 3 years of
diagnosis, with a 10-year survival rate of 91% and average age of onset of 72 years, most
patients live with a relatively indolent form of the disease and ultimately die of unrelated
causes. The broad range of clinical behavior is likely a reflection of the underlying
genomic diversity of the tumors (Taylor et al. 2010).
Previous studies of prostate tumors reported significant heterogeneity in the gene
expression profiles and genomic structural alterations including DNA copy number
changes and gene fusions involving the ETS family of transcription factors detectable in
approximately half of prostate tumors (Lapointe et al. 2004; Singh et al. 2002; Sboner et
al. 2010; Tomlins et al. 2008, 2005; King et al. 2009; Taylor et al. 2010; Robbins et al.
2011; Pflueger et al. 2011). However, exon sequencing of known oncogenes and tumor
suppressors has found few somatic mutations and the calculated background mutation
rate appears to be relatively low (Taylor et al. 2010). This suggests the presence of other
forms of genomic aberrations that contribute to the observed gene expression variations,
and in turn, the diversity in tumor behavior.
DNA methylation has long been suspected to play a role in tumorigenesis and cancer
progression in various tissue types (Lapeyre et al. 1981; Jones 1986; P W Laird and R
Jaenisch 1994, 1996; Samir K. Patra et al. 2002; Das and Singal 2004; Melanie Ehrlich 2002;
Manel Esteller and Herman 2002). Early studies in cancer epigenetics revealed an overall
reduction of 5-methylcytosine in various tumor genomes (Gama-Sosa et al. 1983; A P
Feinberg and Vogelstein 1983). In contrast, more recent studies identified many
hypermethylation events in CpG islands near known tumor suppressor transcriptional
start sites, which correlated with reduction in transcript levels (Brooks et al. 1998). Many
of these candidate gene-based approaches have led to discovery of potentially prognostic
DNA methylation events (Hannes M. Müller et al. 2003; Eun-Jung Kim et al. 2008).
However, recent advances in microarray and high-throughput massively-parallel
sequencing technologies have enabled investigators to study site-specific DNA
methylation events on a much broader scale. Recent studies of the DNA methylome in
colorectal cancer and glioblastomas have revealed valuable new insights into those
diseases, including the discovery of hundreds of affected genes previously not identified
(Irizarry et al. 2009; Noushmehr et al. 2010; The Cancer Genome Atlas. 2008).
In prostate cancer, hypermethylation of several tumor suppressor promoters has been well
documented. Most notably, the hypermethylation of the CpG island overlapping the
transcriptional start site of the GSTP1 gene has been associated with transcriptional
silencing and is described as the most common molecular alteration in prostate cancer
identified to date (X Lin et al. 2001; Woodson et al. 2008). Since GSTP1 promoter
methylation is very common and specific for prostate cancer, many investigators have
proposed using this methylation event as a diagnostic biomarker (Nakayama et al. 2004;
Cairns et al. 2001). A 2004 study looking at eight additional candidate-sites known to be
differentially methylated in other tumor types identified three gene promoters that were
specifically hypermethylated in prostate cancer (Jerónimo et al. 2004), APC, RASSF1
and CRBP1. Additionally, studies looking at DNA methyltransferases (DNMTs) and
DNMT-interacting proteins have suggested that dysregulation of these genes in prostate
cancer is responsible for the improper DNA methylation events in primary tumors and
cell lines (Hoffmann et al. 2007; Yaqinuddin et al. 2008; Ley et al. 2010; Brooks et al.
1998). These findings suggest that DNA methylation alteration is a common event in
prostate cancer. However, discovery of specific sites of alteration has been limited by
the throughput of the quantitative DNA methylation assays that were
available at the time.
More recently, Kron et al. reported the DNA methylation profiles of 20 prostate tumors at
CpG islands across the genome using a human CpG island microarray. However, this
study did not include the profiles of normal prostate tissues and was only able to make
comparisons between the prostate tumors and six cases of age-matched lymphocytes
(Kron et al. 2009). While they were able to identify sites that were methylated or
unmethylated in prostate cancer, they could not examine the change in methylation states
due to their study design.
To identify DNA methylation alterations that occur in prostate cancer, I quantitatively
profiled 95 primary prostate tumors and 86 healthy prostate tissues for their DNA
methylation levels at 26,333 CpG sites in 14,104 gene promoters using the Illumina
HumanMethylation27 microarray platform. By applying existing microarray analysis
tools, I identified the differentially methylated CpGs and explored subsets of them that
accurately distinguished tumor and normal prostate tissues. Furthermore, I then
integrated available clinical data to discover novel prognostic markers of aggressive
tumors. Finally, I investigated the DNMT protein family, as well as their interacting
partners, for their role in the alteration of DNA methylation in prostate cancer.
CHAPTER 2
METHODS
Sample collection and preparation
All prostate samples used for this study were collected at the Stanford University Medical
Center between 1999 and 2007 with patients' informed consent under an IRB-approved
protocol. Multiple tissue samples were harvested from each prostate, flash frozen and
stored at -80°C. Sections of each prostate tissue sample were evaluated by a
genitourinary pathologist. The tumor and non-tumor areas were marked and
contaminating tissues were trimmed away from the block as described previously
(Lapointe et al. 2004). Tumor samples in which at least 90% of the epithelial cells were
cancerous, and non-tumor samples having no observable tumor epithelium, were selected
for extraction of DNA and RNA. Clinical information associated with prostate samples
included in the analysis is summarized in Supplementary Table S1.
Primary prostate cell culture and transfection assays
A primary culture of human prostatic epithelial cells (E-PZ-231) was established from
benign tissue of the peripheral zone of the prostate of a 56 year-old man who underwent
radical prostatectomy to treat prostate cancer. Using previously described methods
(Peehl 2002), primary cultures were serially passaged. When tertiary passage cells were
about 50% confluent, they were fed Complete PFMR-4A medium (Peehl 2002) without
gentamycin until they reached ~85% confluency. Cells in each 60-mm, collagen-coated
dish were then transfected with 10 µg of plasmid DNA using Lipofectamine 2000
(Invitrogen) according to the manufacturer's instructions. After 48 hours, cells from
three 60-mm dishes per condition were dissociated with TrypLE Express (Invitrogen),
centrifuged, and snap-frozen in liquid nitrogen. These cell pellets were then used for
DNA isolation.
Nucleic acid isolation
DNA and RNA were isolated from tissue samples or cell cultures using Qiagen AllPrep
DNA/RNA mini kit (Qiagen) following the manufacturer's protocol, with the exception
of the RNA from primary prostate cell cultures. This RNA was isolated with Trizol
Reagent (Invitrogen) according to the manufacturer's instructions.
Sodium bisulfite conversion
Sodium bisulfite conversion of genomic DNA was performed using the EZ-96 DNA
Methylation Kit (Deep-Well format) (ZymoResearch). The conversion was completed
using the alternative incubation protocol for Illumina Infinium Methylation Assay, as
described by the manufacturer.
Methylation analysis by Illumina Infinium HumanMethylation27
Five hundred ng of sodium bisulfite-converted genomic DNA from patient samples or
cultured cells were assayed by Infinium HumanMethylation27, RevB BeadChip Kits
(Illumina). The assay was performed using the protocol as described by the
manufacturer.
Beta score calculations, quality filtering and batch normalization
HumanMethylation27 array results were initially extracted and analyzed using Illumina
BeadStudio software with the Methylation Module v3.2. Beta scores were calculated
manually using values exported from BeadStudio. For each probe intensity value, I
subtracted the median negative background control probe value based on the color
channel. The beta score was calculated using the background subtracted intensity values
as: β = Intensity_Methylated / (Intensity_Methylated + Intensity_Unmethylated). Any negative beta
scores were converted to a zero. Any beta scores with an associated detection p-value of
greater than 0.01 were converted to "missing values". To correct for any array-by-array
variation, I imputed all missing values using KNN (Troyanskaya et al. 2001), then
performed normalization using the ComBat R-package (W. Evan Johnson et al. 2006).
All previously imputed values were converted back to "missing values" for subsequent
analyses.
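The per-probe steps above (background subtraction, the beta ratio, zeroing of negative scores, and masking by detection p-value) can be sketched as follows. This is a minimal illustration, not the code used for this thesis: the function name `beta_score`, its parameter names, and the handling of a zero denominator are assumptions, and the KNN imputation and ComBat normalization steps are not shown.

```python
def beta_score(meth, unmeth, background, detection_p, p_cutoff=0.01):
    """Return the beta score for one probe, or None for a missing value.

    background is the median negative-control intensity for the probe's
    color channel (assumed here to apply to both intensities of the probe).
    """
    # Probes with a detection p-value above the cutoff become missing values.
    if detection_p > p_cutoff:
        return None
    # Subtract the channel background from each intensity.
    m = meth - background
    u = unmeth - background
    total = m + u
    if total == 0:
        # Assumption: an undefined ratio is treated as a missing value.
        return None
    beta = m / total
    # Negative beta scores are converted to zero.
    return max(beta, 0.0)
```

For example, with hypothetical intensities of 900 (methylated) and 300 (unmethylated) and a background of 100, the score is 800 / (800 + 200) = 0.8.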
To remove CpG probes with potentially problematic hybridization, I performed BLAT on
all 27,578 probe sequences against the GRCh37/hg19 build of the human genome. One
thousand and twenty eight probes showed questionable mapping and therefore were
removed from analysis. I also identified 217 probes that included a SNP of greater than
3% minor allele frequency within 15 bp of the assayed CpG. These probes were also
rejected with consideration to potential variation in probe hybridization due to the
common SNP.
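The probe filtering amounts to removing two sets of probe identifiers from the full probe list. The sketch below is illustrative only: the function `filter_probes` and its argument names are hypothetical, and the count check assumes that the two rejected sets do not overlap, which is consistent with the 26,333 CpG sites reported after filtering.

```python
def filter_probes(all_probes, questionable_mapping, near_common_snp):
    """Keep probes that map cleanly and have no common SNP near the CpG."""
    rejected = set(questionable_mapping) | set(near_common_snp)
    return [probe for probe in all_probes if probe not in rejected]


# Counts reported in the text (assumes the two rejected sets are disjoint).
TOTAL_PROBES = 27578
QUESTIONABLE_MAPPING = 1028
COMMON_SNP = 217
remaining = TOTAL_PROBES - QUESTIONABLE_MAPPING - COMMON_SNP  # 26333
```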
Clustering
Prior to each hierarchical clustering, the beta scores were mean centered. Hierarchical
clustering of the arrays was done using the software Cluster 3.0 with Average Linkage.
Because these datasets were too large to cluster the genes by Cluster 3.0, gene clustering
was done using XCluster, available through the Stanford Microarray Database (Sherlock
et al. 2001), using non-centered Pearson Correlation to perform the hierarchical
clustering.
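Two ingredients of the clustering described above are mean centering and the non-centered (uncentered) Pearson correlation. The following is a minimal sketch of both, not a reimplementation of Cluster 3.0 or XCluster. It assumes that centering is applied to each CpG across samples, that missing values are represented as `None`, and that only positions present in both vectors enter the correlation; the function names are hypothetical.

```python
import math


def mean_center(row):
    """Subtract the mean of the non-missing values from each value in a row."""
    present = [v for v in row if v is not None]
    if not present:
        return list(row)
    mean = sum(present) / len(present)
    return [None if v is None else v - mean for v in row]


def uncentered_pearson(x, y):
    """Uncentered Pearson correlation over positions present in both vectors."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    sxy = sum(a * b for a, b in pairs)
    sxx = sum(a * a for a, _ in pairs)
    syy = sum(b * b for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for an all-zero vector
    return sxy / math.sqrt(sxx * syy)
```

Unlike the centered Pearson correlation, the uncentered form does not subtract each vector's mean, so two profiles with the same shape but different offsets are not treated as identical.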
Significance Analysis of Microarray (SAM)
Each SAM was performed as described in the software manual. The data were analyzed
using the latest version of SAM available at the time of this manuscript preparation,
which was version 3.09c. SAM was implemented using R version 2.10.0.
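For orientation, the two-class unpaired comparison in SAM ranks each feature by a relative difference statistic, d = (mean1 - mean2) / (s + s0), as defined by Tusher et al. (2001). The sketch below computes that statistic for a single CpG. It is an illustration only and does not reproduce the SAM software: the permutation-based FDR estimate is omitted, the fudge factor `s0` is passed in rather than estimated from the data, and the function name is hypothetical.

```python
import math


def sam_relative_difference(group1, group2, s0=0.0):
    """Relative difference d for one feature in a two-class unpaired design."""
    n1, n2 = len(group1), len(group2)
    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2
    # Pooled sum of squared deviations from each group mean.
    ss = sum((x - mean1) ** 2 for x in group1)
    ss += sum((x - mean2) ** 2 for x in group2)
    a = (1.0 / n1 + 1.0 / n2) / (n1 + n2 - 2)
    s = math.sqrt(a * ss)
    # s0 is a small positive constant that damps features with tiny variance;
    # with s0 = 0 and s = 0 this division is undefined.
    return (mean1 - mean2) / (s + s0)
```

With `s0 = 0` the statistic reduces to the equal-variance two-sample t statistic; a positive `s0` shrinks the magnitude of d for every feature, most strongly for those with small `s`.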
Prediction Analysis of Microarray (PAM)
Prior to PAM, the CpGs were sorted by standard deviation across all tumors and normals.
To improve statistical power, only CpGs which had a standard deviation of 0.04 or
greater were analyzed. PAM was performed as described in the software manual. The
data were analyzed using the latest version of PAM available at the time of this
manuscript preparation, which was version 2.11. PAM was implemented using R version
2.10.0. Based on visual examination of the training errors and the cross-validation
results, I set the shrinkage threshold to 10.5.
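The variance filter applied before PAM can be sketched as below. This is a hedged illustration rather than the original analysis code: the function `filter_by_sd` is hypothetical, the sample standard deviation is assumed (the text does not say whether the sample or population form was used), the descending sort order is an assumption, and missing values are represented as `None`.

```python
import statistics


def filter_by_sd(beta_by_cpg, min_sd=0.04):
    """Return CpG ids with SD across samples >= min_sd, highest SD first."""
    kept = []
    for cpg, values in beta_by_cpg.items():
        present = [v for v in values if v is not None]
        if len(present) < 2:
            continue  # SD is undefined for fewer than two values
        sd = statistics.stdev(present)  # assumption: sample standard deviation
        if sd >= min_sd:
            kept.append((sd, cpg))
    kept.sort(reverse=True)
    return [cpg for _, cpg in kept]
```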
PyroMark assays
PyroMark assays were performed at the Stanford Protein and Nucleic Acid Facility using
the manufacturer's recommended protocol (Qiagen). For each target region, 3 primers
were used: a forward and reverse PCR primer and a sequencing primer. Primer
sequences are listed in Table 2.1.
Target CpG Promoter Primer Sequence
cg19790294 CYBA Forward 5'-GTTTTTGAGTTTTTTTAGGGTTTTTTAAATT-3'
cg19790294 CYBA Reverse 5'-CCTTCACACCTTATCCTACTATTA-3'
cg19790294 CYBA Sequencing 5'-GATAAGTGTTTTTGTTTAATGT-3'
cg04448487 GDAP1L1 Forward 5'-TTTTTATTTTTGTAGGGAGTTTGA-3'
cg04448487 GDAP1L1 Reverse 5'-CTCTCTCTCCCCCAACATCACATA-3'
cg04448487 GDAP1L1 Sequencing 5'-AGGGAGTTTGATATTGAG-3'
cg02879662 HIF3A Forward 5'-GGGGTTTTTTTTTTGGAGATTT-3'
cg02879662 HIF3A Reverse 5'-CACCCCTACAATCCCTAA-3'
cg02879662 HIF3A Sequencing 5'-TTGGATTGTTGGGGG-3'
cg19853760 LGALS1 Forward 5'-TGAGGGGGGGTAGTAGTT-3'
cg19853760 LGALS1 Reverse 5'-ATCCCCACACTCACACAAA-3'
cg19853760 LGALS1 Sequencing 5'-TGATTTGTAATTGGTTGAAT-3'
cg04622802 LOC387758 Forward 5'-GGGTAATAGAGTTAGTATTTTGTTAG-3'
cg04622802 LOC387758 Reverse 5'-CCCAACAAACTTCATATAACTCTACA-3'
cg04622802 LOC387758 Sequencing 5'-AGTTAGTATTTTGTTAGGGT-3'
cg21096399 MCAM Forward 5'-TAGGTTTTTGGTTTGGGAAG-3'
cg21096399 MCAM Reverse 5'-AATCCCCTAAAAACTACATTAACT-3'
cg21096399 MCAM Sequencing 5'-GGGTAGTGATAGGTGT-3'
cg24340926 RAB33A Forward 5'-GGGTTTTTTTTATTGGTTAGTTAAAT-3'
cg24340926 RAB33A Reverse 5'-AACCCCAACATCCCCTTATCACA-3'
cg24340926 RAB33A Sequencing 5'-TTTTTATTGGTTAGTTAAATATAAT-3'
cg13102585 RPIP8 Forward 5'-GGGGATGGTTATGGAAGG-3'
cg13102585 RPIP8 Reverse 5'-ACAACCCCAAAACCATAATAATCT-3'
cg13102585 RPIP8 Sequencing 5'-GGTTATGGAAGGGTTGA-3'
cg22862656 SCGB2A2 Forward 5'-GGAATAAATAGAGTAAGGTTGGGTGTT-3'
cg22862656 SCGB2A2 Reverse 5'-ACCCCCAACATAAAAACCATCAACAACTTC-3'
cg22862656 SCGB2A2 Sequencing 5'-AGGTTGGGTGTTTATTTTTATA-3'
Table 2.1 Primer sequences used in PyroMark assays.
TaqMan gene expression assay
Expression levels of genes encoding several DNMT and DNMT-interacting proteins, as
well as beta-2-microglobulin as an endogenous control, were measured in 10 normal and
36 tumor samples by TaqMan Gene Expression Assay. I used the following Applied
Biosystems inventoried assays with FAM/MGB labeled probes (Assay ID in
parentheses): DNMT1 (Hs00945900_g1), DNMT3A (Hs00173377_m1), DNMT3A2
(Hs00601097_m1), DNMT3B (Hs01003405_m1), DNMT3L (Hs01081364_m1), EZH2
(Hs01016789_m1) and the Human B2M (beta-2-microglobulin) Endogenous Control.
Twenty five ng of cDNA were assayed in triplicate for each target, using the protocol as
described by the manufacturer, on the ABI PRISM 7900HT instrument. The results were
analyzed using the ABI SDS 2.4 and ABI RQ Manager 1.2.1 software. Briefly, the
average CT and delta-CT were calculated for each DNMT and EZH2. By integrating the
average CT value from the B2M CT, I calculated the delta-delta-CT. All sample delta-
delta-CT values were normalized to that of a tumor sample PC625T to generate an RQ
value. To present the RQ value as a positive value, 5 was added to each RQ value.
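For readers unfamiliar with the comparative CT approach, the conventional calculation can be sketched as follows. This is the textbook delta-delta-CT method, offered as an assumption about the general procedure and not as the exact computation performed by the ABI software: the function names are hypothetical, the example CT values are invented for illustration, and the scale on which the reported RQ values were shifted by 5 is not reproduced here.

```python
from statistics import mean


def delta_ct(target_cts, control_cts):
    """Mean CT of the target gene minus mean CT of the endogenous control."""
    return mean(target_cts) - mean(control_cts)


def delta_delta_ct(sample_dct, reference_dct):
    """Delta-CT of a sample relative to the reference (calibrator) sample."""
    return sample_dct - reference_dct


def fold_change(ddct):
    """Conventional relative quantity, assuming a doubling per PCR cycle."""
    return 2.0 ** (-ddct)


# Hypothetical triplicate CT values for one target and for B2M in one sample.
sample_dct = delta_ct([25.0, 25.2, 24.8], [20.0, 20.1, 19.9])
# Hypothetical delta-CT of the reference sample for the same target.
ddct = delta_delta_ct(sample_dct, 7.0)
```

A lower delta-CT than the reference gives a negative delta-delta-CT and therefore a fold change above 1, meaning higher expression than in the reference sample.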
Expression vectors
The pcDNA3/Myc-EZH2 construct was a generous gift from A. Chinnaiyan (Okano et
al. 1999). The pcDNA3/Myc-DNMT3A, pcDNA3/Myc-DNMT3A2, pcDNA3/Myc-
DNMT3B1, pcDNA3/Myc-DNMT3B2 and pcDNA3/Myc-DNMT3B3 constructs were a
generous gift from A. Riggs (Chen et al. 2005).
CHAPTER 3
DNA METHYLATION PROFILES OF NORMAL
PROSTATES AND PROSTATE TUMORS
Prostate DNA methylation profiles
To explore the prostate DNA methylome, I profiled 95 primary prostate tumors and 86
normal prostate tissues, including 70 matched pairs, using the Illumina
HumanMethylation27 microarrays. This platform assays 27,578 CpG sites, all but 600 of
which are in the proximal promoter regions of 14,475 transcription start sites. After
batch correcting and quality filtering the data, I was able to determine quantitative
methylation status (beta scores) for 26,333 CpG sites in 14,104 promoters. To investigate
the similarities and differences of the DNA methylation profiles of the normal samples
and tumor samples, as well as their heterogeneity, I performed unsupervised hierarchical
clustering on the entire dataset (Figure 3.1). When the data were clustered by sample, I
observed two main clusters – one comprised almost entirely of normal samples (77/88)
and the other comprised almost entirely of the tumor samples (67/71). The branch
lengths in the normal sample cluster were generally shorter than the branch lengths in the
tumor sample cluster, indicating more heterogeneity in methylation profiles among the
tumor samples. Twenty-two of the samples did not fall into either of the two main
clusters and formed long off-shooting branches or small clusters. Eighteen of these were
tumor samples, further indicative of the heterogeneous nature of the tumor DNA
methylome. By visual inspection, the majority of the CpG sites showed relatively little
methylation change between the tumor and normal clusters (Figure 3.1), and most of
these invariable CpG sites showed low levels of methylation in both normal and tumor
samples. However, there were distinct CpG clusters with methylation patterns that
distinguished the normal or tumor sample clusters, and, strikingly, a large number of CpG
sites showed increased methylation in the tumor cluster compared to the normal cluster.
Figure 3.1: Hierarchical clustering of prostate tissues by DNA methylation.
Unsupervised hierarchical clustering of 181 prostate tissues and 26,333 CpGs, by sample
and by CpG. Red branches represent tumor samples and blue branches represent normal
samples. Red pixels represent high DNA methylation while green pixels represent low
DNA methylation. As indicated in the Methods section, the beta scores were mean
centered prior to clustering.
To identify the CpG sites with statistically different DNA methylation status between
normal prostate tissues and tumors, I performed a two-class Significance Analysis of
Microarrays (SAM) (Tusher et al. 2001). As I had matched normal tissues for only 70 of
the 95 tumors used in this study, I conducted the SAM analysis as unpaired. The analysis
identified 5,912 CpG sites hypermethylated in tumors compared to normal tissues, and
2,151 CpG sites hypomethylated at FDR < 0.8% (Figure 3.2). This corresponds to 4,224
and 1,792 promoters, respectively. I performed hierarchical clustering on all samples
based on these 8,063 differentially methylated CpG sites (Figure 3.3). Of the 11,116
gene promoters represented by two or more CpG sites, only 223 had opposite methylation
effects (i.e., at least one hypermethylated CpG and at least one hypomethylated CpG).
When the distances from transcriptional start sites were compared in these promoters
with opposite methylation effects, I saw enrichment for hypermethylated CpGs in the
-100 bp to +800 bp range, whereas I saw enrichment for the hypomethylated CpGs in the
-700 bp to -200 bp range. Thus, overall, nearly one third (8,063/26,333) of assayed
promoter CpGs had a statistically significant change in DNA methylation, with most of
those showing an increase in methylation. Interestingly, 43% (6,015/14,104) of all gene
promoters assayed had at least one CpG with a tumor-specific methylation change.
Figure 3.2: Normal vs Tumor unpaired 2-class SAM analysis of the 181 prostate
samples. False discovery rate of 0.78% resulted in 8,063 differentially methylated CpGs
including 5,912 hypermethylated CpGs (red) and 2,151 hypomethylated CpGs (green).
Figure 3.3: Differentially methylated CpGs of prostate tumors. Unsupervised
hierarchical clustering of 181 prostate tissues based on the 5,912 and 2,151 CpG sites
hypermethylated and hypomethylated in prostate tumors, respectively. Red branches
represent tumor samples and blue branches represent normal samples. Red pixels
represent high DNA methylation while green pixels represent low DNA methylation.
As the SAM analysis was unpaired, inter-individual DNA
methylation differences could be a confounding factor. The relative
homogeneity of the normal samples, indicated by the short branch lengths and tight
clustering compared to the tumor samples, suggested that this was likely not a problem,
but it was still a concern I chose to address. I thus also conducted a two-class paired
SAM analysis on only the 70 matched sample pairs, identifying 5,556 hypermethylated
and 2,185 hypomethylated CpGs at a similar significance cutoff (FDR < 0.84%) (Figure
3.4). This paired analysis identified only 306 novel, differentially methylated CpGs that
were not discovered in the unpaired analysis. Furthermore, when the list of differentially
methylated CpGs was ranked by significance, most of the CpGs uniquely identified in the
paired analysis (278/306) fell in the bottom 25%. These data indicate that the prostate
tumor methylation signal is strong enough to overcome any inter-individual methylation
differences for the vast majority of CpGs assayed on this platform. For all subsequent
analyses, I used the unpaired SAM results to include all samples.
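The comparison of the paired and unpaired results is a set difference followed by a rank check. A minimal sketch, assuming the paired calls are available as a list ordered from most to least significant; the function name and the toy identifiers in the usage note are hypothetical.

```python
def summarize_novel_calls(paired_ranked, unpaired_calls, bottom_fraction=0.25):
    """Count paired-only CpGs and how many of them rank in the bottom fraction.

    paired_ranked lists the CpGs called in the paired analysis, ordered from
    most to least significant.
    """
    unpaired = set(unpaired_calls)
    n = len(paired_ranked)
    first_bottom_index = n - int(n * bottom_fraction)
    novel = [cpg for cpg in paired_ranked if cpg not in unpaired]
    novel_in_bottom = [
        cpg
        for index, cpg in enumerate(paired_ranked)
        if index >= first_bottom_index and cpg not in unpaired
    ]
    return len(novel), len(novel_in_bottom)
```

Applied to a toy ranked list of eight CpGs of which two are absent from the unpaired calls, the function returns the number of paired-only CpGs and the number of those falling in the last two ranks.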
While the impact of inter-individual methylation differences was minimal for the purpose
of distinguishing tumor and normal samples, it was clear that such differences were
present. I chose to explore this phenomenon in the normal prostate samples using
available patient data. I performed a Quantitative SAM analysis to identify CpGs that
showed differential methylation relative to the age of the patient at the time of surgery
(range: 44-74 yrs). This analysis revealed 749 CpG sites that showed increasing
methylation with age, while no CpG showed decreasing methylation with age (FDR <
4.82%) (Figure 3.5 and Figure 3.6). When I repeated the age-dependent Quantitative
SAM analysis with tumor samples, no CpG showed correlation with age, despite the fact
that the majority of these came from the same patients as the normal samples. This
strongly suggests that the age-dependent DNA methylation pattern that I observe is
overridden by the alterations that occur in cancer.
Figure 3.4: Normal vs Tumor paired 2-class SAM analysis of the 181 prostate
samples. False discovery rate of 0.84% resulted in 7,741 differentially methylated CpGs
including 5,556 hypermethylated CpGs (red) and 2,185 hypomethylated CpGs (green).
Figure 3.5: Quantitative SAM analysis of 86 prostate samples based on age of
patient at the time of surgery. False discovery rate of 4.82% resulted in 749
differentially methylated CpGs, all hypermethylated (red).
Figure 3.6: Age-dependent CpGs. The 749 age-dependent CpGs, clustered by CpG, with
patients ordered by age (44 to 74 years).
Diagnostic methylation markers
Among the CpG sites that I found to be differentially methylated in tumor versus normal
prostate tissues by SAM, and shown clustered in Figure 3.3, were several sites that had
been previously characterized in prostate tumors, most notably several CpG sites near or
within the GSTP1 gene. Hypermethylation of the CpG island overlapping the
transcriptional start site of the GSTP1 gene has been associated with transcriptional
silencing and is described as the most common molecular alteration in prostate cancer
identified to date (X Lin et al. 2001; Woodson et al. 2008). Since GSTP1 promoter
methylation is very common and specific for prostate cancer, many investigators have
proposed using this methylation event as a diagnostic biomarker for prostate cancer
(Nakayama et al. 2004; Cairns et al. 2001). The HumanMethylation27 arrays contain
seven CpG sites in the GSTP1 promoter. Five of these sites showed significantly
increased DNA methylation in tumors, four of which are located in the promoter CpG
island that had been previously characterized as a site of hypermethylation in prostate
cancer (Brooks et al. 1998), while the fifth lies 88 bp downstream of the annotated CpG
island boundary (red circles in Figure 3.7A). The two remaining CpGs showed either no
differential methylation (gray circle in Figure 3.7A) or slight but statistically significant
hypomethylation (green circle in Figure 3.7A); both lie further upstream of the
transcriptional start site, outside of the promoter CpG island. Our data not only confirm
the previously described hypermethylation of the GSTP1 promoter CpG island, but also
show that CpG DNA methylation alteration is highly context dependent even within a
single promoter.
Figure 3.7: GSTP1 CpG island hypermethylation in prostate tumors. (A) Diagram
of the GSTP1 gene. Blue boxes represent the RefSeq annotation of the GSTP1 gene. The
green box represents a CpG island calculated by UCSC Genome Browser. Circles are
CpG sites assayed by HumanMethylation27: red circles represent probes that were
identified to be hypermethylated in prostate tumors by 2-class SAM, the green circle
represents a probe that was hypomethylated, and the gray circle represents a probe that
showed no significant change. The numbers below the circles indicate the relative
distance in base pairs from the predicted TSS. (B) Heatmap depicts DNA methylation
pattern of the 7 probes near GSTP1. The dendrogram is based on the hierarchical
clustering from Figure 2. Red branches represent tumor samples and blue branches
represent normal samples. Coordinates are based on NCBI36/hg18 human genome
assembly.
[Figure 3.7A: assayed CpG positions relative to the predicted TSS, in bp: -831, -566, -89, +206, +543, +721, +976]
In addition to GSTP1, I also examined our data specifically for methylation changes in
the promoters of APC and RASSF1, which have also been previously shown to have
hypermethylation in prostate cancer (Jerónimo et al. 2004) and were represented by
multiple probes on the HumanMethylation27 array. With APC, all six CpG sites
represented on the array showed hypermethylation in tumors, located 122 bp upstream to
488 bp downstream of the TSS (Figure 3.8). With RASSF1, three CpG sites were
probed, located 58 bp upstream to 176 bp downstream of the TSS and within a CpG
island boundary; all three were hypermethylated (Figure 3.9). However, five of the six
probes located more than 2 kb downstream of the TSS in a second CpG island did not
show differential methylation.
[Figure 3.8A: assayed CpG positions relative to the predicted TSS, in bp: -122, -53, +15, +131, +214, +488]
Figure 3.8: APC proximal promoter hypermethylation in prostate tumors. (A)
Diagram of the APC gene. Blue boxes represent the RefSeq annotation of the APC gene.
There are no CpG islands in this window, as calculated by the UCSC Genome Browser.
Circles are CpG sites assayed by HumanMethylation27: red circles represent probes that
were identified to be hypermethylated in prostate tumors by 2-class SAM. The numbers
above and below the circles indicate the relative distance in base pairs from the predicted
TSS. (B) Heatmap depicts DNA methylation pattern of the 6 probes near APC. The
dendrogram is based on the hierarchical clustering from Figure 2. Red branches
represent tumor samples and blue branches represent normal samples. Coordinates are
based on NCBI36/hg18 human genome assembly.
[Figure 3.9A: assayed CpG positions relative to the predicted TSS, in bp: -58, -46, +176, +2693, +2908, +3115, +3438, +3697, +4063]
Figure 3.9: RASSF1 proximal promoter hypermethylation in prostate tumors. (A)
Diagram of the RASSF1 gene. Blue boxes represent the RefSeq annotation of the
RASSF1 gene. Green boxes represent the CpG islands calculated by UCSC Genome
Browser. Circles are CpG sites assayed by HumanMethylation27: red circles represent
probes that were identified to be hypermethylated in prostate tumors by 2-class SAM and
the gray circles represent probes that showed no significant change. The numbers above
and below the circles indicate the relative distance in basepairs from the predicted TSS.
(B) Heatmap depicts DNA methylation pattern of the 9 probes near RASSF1. The
dendrogram is based on the hierarchical clustering from Figure 2. Red branches
represent tumor samples and blue branches represent normal samples. Coordinates are
based on NCBI36/hg18 human genome assembly.
While hierarchical clustering of samples using the most differentially methylated CpG
sites (the set shown in Figure 3.3) was able to distinguish most tumors from normal
tissues, the classification was not perfect, as indicated by the inclusion of normal tissue
samples within the tumor cluster and vice versa. To identify CpG sites that could best
predict either the tumor state or the normal state, I used Prediction Analysis of
Microarrays (PAM) for sample classification (Robert Tibshirani et al. 2002).
This analysis generated a list of 87 predictive CpG sites, most of which had increased
methylation in the tumor samples (83/87), and represented 82 gene promoters total
(Figure 3.10). The CYBA, GSTP1, KLK10, PPT2 and CXCL1 promoters each had two
CpGs represented in this list. Notably, in this ranked list of 87 predictive methylation
alterations, the GSTP1 hypermethylation was ranked 57th (Supplementary Table S2).
Thus I have identified 56 molecular events, most of which had not been previously
characterized, that are better markers of prostate cancer than is GSTP1.
Figure 3.10: Diagnostic markers of prostate cancer identified by PAM.
Unsupervised hierarchical clustering of 181 prostate samples based on the 87 diagnostic
CpG sites identified by PAM. Red branches represent tumor samples and blue branches
represent normal samples. Red pixels represent high DNA methylation while green
pixels represent low DNA methylation.
Validation by PyroMark sequencing
To validate the Prediction Analysis results and to validate the Illumina
HumanMethylation27 platform, I designed PyroMark assays for a subset of our
predictive CpGs. This pyrosequencing method provides quantitative percent methylation
for regions up to about 120 bp in length. I designed primers that would enable us to
assay nine diagnostic CpGs from the promoters of the RAB33A, HIF3A, GDAP1L1,
MCAM, LGALS1, RPIP8, CYBA, SCGB2A2, and LOC387758 genes, which were
selected from the top 40 most diagnostic CpGs identified in the PAM analysis. I selected
DNA from 12 matched tumor-normal pairs, previously assayed by HumanMethylation27,
treated them with bisulfite and performed DNA sequencing. When the percent
methylation readout from PyroMark and HumanMethylation27 of the same loci were
compared, I observed a striking correlation between the two platforms at all nine CpGs
(r² values from 0.89 to 0.98) (Figure 3.11). While PyroMark and
HumanMethylation27 both rely on bisulfite conversion, the subsequent chemistry
involved in quantifying methylation levels is substantially different. The strong
correlation between the two platforms not only indicated a high level of accuracy in
methylation quantification by both platforms, but also validated differential methylation
at the nine predictive CpG sites.
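The platform comparison rests on a simple linear regression of one set of methylation fractions on the other. The sketch below shows how the slope, intercept and r² of such a fit can be computed; it is an illustration under stated assumptions rather than the analysis code used here, and the function name and the paired values are invented.

```python
def linear_fit_r2(xs, ys):
    """Ordinary least-squares fit of ys on xs.

    Returns (slope, intercept, r2), where r2 = 1 - SS_residual / SS_total.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or ss_total == 0:
        raise ValueError("both variables must vary to fit a regression line")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1.0 - ss_residual / ss_total


# Made-up paired measurements for one CpG in four samples.
array_beta = [0.10, 0.20, 0.60, 0.80]     # x-axis: HumanMethylation27 beta score
pyro_fraction = [0.15, 0.18, 0.65, 0.75]  # y-axis: PyroMark fraction methylated
slope, intercept, r2 = linear_fit_r2(array_beta, pyro_fraction)
print(slope, intercept, r2)
```

An r² close to 1 indicates that the two platforms rank and scale the samples almost identically at that CpG.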
Figure 3.11: PyroMark validates HumanMethylation27 results. PyroMark
sequencing results compared to HumanMethylation27 beta scores at 9 diagnostic CpGs
identified by PAM. Blue circles are normal samples and red circles are tumor samples.
Y-axis: fraction methylation calculated from PyroMark. X-axis: fraction methylation
calculated from HumanMethylation27 (beta scores). Black line: linear regression. (A)
CYBA (cg19790294). (B) GDAP1L1 (cg04448487). (C) HIF3A (cg02879662). (D)
LGALS1 (cg19853760). (E) LOC387758 (cg04622802). (F) MCAM (cg21096399). (G)
RPIP8 (cg13102585). (H) RAB33A (cg24340926). (I) SCGB2A2 (cg22862656).
Because the PyroMark results provided sequence lengths of up to 120 bp, I was able to
explore the methylation states of additional CpGs near each of the nine predictive CpGs.
Each region contained 4 to 13 additional CpGs that could be measured by the PyroMark
assay. In most cases, neighboring CpGs within a region had similar methylation levels
(Figure 3.12A – 3.12I). However, there were several notable exceptions where I
observed dramatically different levels of methylation within these short regions. In the
MCAM region (Figure 3.12F), one CpG had 20% methylation while another CpG only
85 bp away had 90% methylation. Similarly, in the RPIP8 region (Figure 3.12G), one
CpG had 10% methylation while another CpG 54 bp away had as much as 80%
methylation. In the most extreme case, two CpGs in the SCGB2A2 region, only 2 bp apart,
differed in methylation by 40% (Figure 3.12I). It is clear from these plots that the
differences in methylation between tumor samples and normal samples were variable
within a single region. Even though I selected these nine regions because a single CpG in
each was among the most differentially methylated on the HumanMethylation27 array, if
another neighboring CpG in the same region had been assayed instead, it might
not have been called substantially different between tumor and normal tissues. The most
striking example of this is in the RPIP8 region (Figure 3.12G), where a CpG 72 bp away
from the CpG assayed on HumanMethylation27 showed no difference between tumor
samples and normal samples. All the assayed CpGs were screened against dbSNP (build
130) for possible influence on the PyroMark readout by sequence variation. No SNPs
were found in these regions that could confound the PyroMark methylation
quantification.
Figure 3.12: Comparison of neighboring CpGs by PyroMark. PyroMark sequencing
results comparing neighboring CpGs of the 9 diagnostic CpGs identified by PAM. Each
diamond represents a CpG methylation level for an individual sample. Lines connect
CpGs from each sample. Blue lines are normal samples, red lines are tumor samples. Y-
axis: fraction methylation calculated from PyroMark. X-axis: relative coordinates in
basepairs. Box indicates CpG assayed by HumanMethylation27. (A) CYBA
(cg19790294). (B) GDAP1L1 (cg04448487). (C) HIF3A (cg02879662). (D) LGALS1
(cg19853760). (E) LOC387758 (cg04622802). (F) MCAM (cg21096399). (G) RPIP8
(cg13102585). (H) RAB33A (cg24340926). (I) SCGB2A2 (cg22862656).
Prognostic methylation markers
To explore tumor heterogeneity, I compared the methylation profiles of the 86 tumors
with respect to Gleason grade and time to biochemical recurrence (defined as serum PSA
> 0.07 ng/mL after surgery) of the donors. Gleason grade is a powerful predictor of
treatment failure, tumor progression and death from prostate cancer, and biochemical
recurrence has also been correlated with prostate cancer-specific mortality (Freedland et
al. 2005). I conducted a multiclass SAM in an effort to identify methylation events that
distinguished tumors of different Gleason grades, but was unable to identify such events.
Next, I conducted a SAM survival analysis with the time to biochemical recurrence as the
survival variable. With a false discovery rate of 26.7%, I identified six CpGs that showed
greater methylation in tumors from men who had shorter time to recurrence and 63 CpGs
that showed lower methylation in patients with shorter time to recurrence (Supplementary
Table S3). This strong bias towards lower methylation in aggressive tumors was striking
because the tumor/normal comparison was biased towards CpG sites with increased
methylation. While I was able to identify only a small number of CpGs whose
methylation state correlated with time to recurrence, I noted that several of these CpG
sites are in the proximal promoters of known cancer-related genes, including 3
CpGs near MAGE gene family members, which encode strictly tumor-specific
antigens (Chomez et al. 2001), and 4 CpGs near WT1, a transcription factor gene
associated with Wilms' tumor.
CHAPTER 4
DNA METHYLTRANSFERASES IN PROSTATE
Correlation of tumor hypermethylation with DNA methyltransferase expression
With nearly one third of assayed CpGs showing changes in DNA methylation between
tumor and normal samples, I hypothesized that one or more of the DNA
methyltransferases (DNMTs), or a protein that interacts with a DNMT, had altered
activity, possibly due to changes in transcript abundance, in prostate tumors. Such
alterations in activity could in turn lead to global DNA methylation changes. To test this
hypothesis, I selected RNA from 10 of the normal and 36 of the tumor samples, and
measured the transcript abundance of DNMT1, DNMT3A, DNMT3A2, DNMT3B,
DNMT3L and EZH2 using the TaqMan Gene Expression assay. These genes comprise
the known maintenance methyltransferase (DNMT1) (Chuang et al. 1997), all known
methyltransferases with de novo capability [DNMT1 (Estève et al. 2005), DNMT3A
(Okano et al. 1999), DNMT3B (Okano et al. 1999)], and two interacting proteins thought
to target methyltransferases to specific genomic regions [DNMT3L (El-Maarri et al.
2009) and EZH2 (Okano et al. 1999)]. In addition, I assayed DNMT3A and its
alternative promoter variant DNMT3A2 separately by using transcript-specific primers and probes.
While several splice variants of DNMT3B have been characterized, I was unable to
design variant-specific primers and probes for them, so I instead designed primers and
probes to the common region of all DNMT3B variants. I did not detect DNMT3L
transcript in either tumor or normal samples (data not shown).
When the transcript levels of the remaining genes were compared between normal and
tumor samples with a two-tailed t-test, three showed significant changes: DNMT3A2 (P =
0.0013), DNMT3B (P = 0.024) and EZH2 (P = 0.026), while DNMT1 and DNMT3A did
not (Figure 4.1A).
Figure 4.1: Expression of DNMTs and EZH2 correlates with global
hypermethylation in prostate tumors. Comparison of transcript levels of DNMTs and
EZH2 measured by TaqMan qPCR with the average DNA methylation levels of CpG
sites that are hypermethylated in prostate tumors. Blue circles are normal samples and
red circles are tumor samples. P-value was calculated by linear regression analysis. Y-
axis: average DNA methylation levels (beta score). X-axis: relative gene expression
levels [log2(RQ)]. Black line: linear regression. (A) DNMT1 expression. (B) DNMT3A
expression. (C) DNMT3A2 expression. (D) DNMT3B expression. (E) EZH2 expression.
(F) Comparison of DNMT and EZH2 transcript levels between normal tissues (blue) and
tumors (red). Significant differences are indicated by asterisks; P values were calculated
by t-test. Standard errors are depicted by error bars. Y-axis: relative gene expression
levels [log2(RQ)].
I compared the expression values for these five genes to global DNA methylation levels.
Specifically, I plotted the mean percent methylation of all 5,912 hypermethylated CpG
sites against relative expression of each methyltransferase or interacting protein, and
calculated the regression and its goodness-of-fit for each gene. Again,
DNMT3A2 (r² = 0.272, P = 0.0031), DNMT3B (r² = 0.197, P = 0.0056) and EZH2 (r² =
0.211, P = 0.0037) all showed significant correlation between expression and global
hypermethylation, while DNMT1 and DNMT3A did not (Figure 4.1B – 4.1F). The
correlation between DNMT3A2, DNMT3B and EZH2 expression and global
hypermethylation, in conjunction with the observed over-expression of the same genes in
tumors, suggests a possible causal role in the global methylation changes seen in prostate
tumors.
DNMT overexpression recapitulates hypermethylation events seen in prostate
tumors
To determine whether the increased transcript abundance of DNMT3A2, DNMT3B and
EZH2 in tumor cells has a causal role in the hypermethylation of a large number of
promoter CpGs, I expressed these genes from the CMV promoter in transient transfection
assays in primary cultures of normal prostatic epithelial cells. I used plasmids expressing
DNMT3A, DNMT3A2, DNMT3B1, DNMT3B2, and DNMT3B3, an EZH2-cDNA plasmid,
and a no-insert plasmid. I co-transfected each cDNA plasmid with the no-insert plasmid,
and independently with the EZH2 plasmid, and also included a mock transfection with
the no-insert plasmid only. I calculated the change in DNA methylation for each CpG between
each cDNA transfection and the mock transfection after 48 hours. I then plotted the ideal
cumulative distribution function of the DNA methylation level change at all 26,333 CpG
sites along with the empirical cumulative distribution function of just the changes at the
5,912 CpG sites hypermethylated in tumors (Figure 4.2A – 4.2K), and tested the
difference in the two distribution functions using the Kolmogorov-Smirnov (K-S) test. In
all eleven experimental transfections, the distribution of the 5,912 CpG sites was
significantly enriched compared to the null: DNMT3A (P = 6.0E-45), DNMT3A2 (P =
3.5E-62), DNMT3B1 (P = 1.2E-31), DNMT3B2 (P = 5.2E-39), DNMT3B3 (P = 4.6E-44),
EZH2 (P = 1.1E-59), DNMT3A+EZH2 (P = 7.8E-64), DNMT3A2+EZH2 (P = 9.8E-65),
DNMT3B1+EZH2 (P = 2.1E-29), DNMT3B2+EZH2 (P = 6.7E-42), DNMT3B3+EZH2 (P
= 2.5E-67). Consistent with our hypothesis, when the plots of the empirical cumulative
distribution functions were visually inspected, I observed that the low P-value of the K-S
test appeared to be driven more by CpGs with increased methylation than by CpGs
with decreased methylation in all eleven conditions.
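The K-S comparison described above can be sketched as follows. The code computes the K-S statistic D as the largest vertical gap between the empirical distribution of a subset of methylation changes and that of the full set, and then an asymptotic P-value from the Kolmogorov series. Treating the full-set distribution as a fixed reference (a one-sample test) is an assumption made for this illustration, as are the function names and the toy values; the asymptotic P-value is only meaningful for large subsets such as the 5,912 CpGs.

```python
import math
from bisect import bisect_right


def ks_statistic(subset, reference):
    """Largest absolute difference between the two empirical CDFs."""
    sub = sorted(subset)
    ref = sorted(reference)
    d = 0.0
    # Both CDFs are step functions, so the maximum gap occurs at a data point.
    for x in set(sub) | set(ref):
        f_sub = bisect_right(sub, x) / len(sub)
        f_ref = bisect_right(ref, x) / len(ref)
        d = max(d, abs(f_sub - f_ref))
    return d


def kolmogorov_p_value(d, n, terms=100):
    """Asymptotic one-sample P-value: 2 * sum((-1)^(k-1) * exp(-2 k^2 lam^2))."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here; the true value is ~1
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))


# Made-up methylation changes (transfection minus mock) for ten CpGs,
# four of which belong to the hypothetical tumor-hypermethylated subset.
all_changes = [-0.02, -0.01, 0.0, 0.01, 0.02, 0.03, 0.06, 0.08, 0.10, 0.12]
subset_changes = [0.06, 0.08, 0.10, 0.12]
d = ks_statistic(subset_changes, all_changes)
print(d, kolmogorov_p_value(d, len(subset_changes)))
```

In this toy case the subset is shifted towards positive changes, which is the pattern the visual inspection of the distribution functions was looking for.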
Figure 4.2: Overexpression of DNMTs and EZH2 results in increased methylation
at a subset of prostate tumor hypermethylation sites. Ideal (black) and empirical (red)
cumulative distribution functions of change in DNA methylation after DNMT or EZH2
transfection into cultured normal prostate cells. The empirical distribution functions are
based on the 5,912 CpGs that were hypermethylated in prostate tumors, while the ideal
distribution functions are based on all 26,333 CpGs assayed on the array.
Overexpression of (A) DNMT3A, (B) DNMT3A2, (C) DNMT3B1, (D) DNMT3B2, (E)
DNMT3B3, (F) EZH2, (G) DNMT3A and EZH2, (H) DNMT3A2 and EZH2, (I)
DNMT3B1 and EZH2, (J) DNMT3B2 and EZH2, and (K) DNMT3B3 and EZH2.
To test specifically whether the list of 5,912 CpG sites was statistically enriched for
CpGs with substantially increased DNA methylation, I performed a series of chi-square
tests. Based on the distribution of CpG methylation levels in tumor and normal tissues at
these CpG sites, I set a cutoff value of 0.05. In other words, CpG sites where the
methylation increased by 5 percent or greater in the experimental transfection compared
to the mock transfection were considered to have substantially increased DNA
methylation. I calculated expected values based on the distribution of these CpGs with
substantially increased DNA methylation in the entire set of 26,333 CpGs. When chi-
square tests were performed, all eleven experimental conditions had very low p-values:
DNMT3A (P = 1.1E-45), DNMT3A2 (P = 1.7E-66), DNMT3B1 (P = 8.9E-127),
DNMT3B2 (P = 1.8E-157), DNMT3B3 (P = 6.6E-10), EZH2 (P = 9.4E-31),
DNMT3A+EZH2 (P = 1.5E-13), DNMT3A2+EZH2 (P = 1.1E-11), DNMT3B1+EZH2 (P
= 1.9E-185), DNMT3B2+EZH2 (P = 9.4E-107), DNMT3B3+EZH2 (P = 2.3E-68).
Again, DNMT3B1 and DNMT3B2, which are alternative splicing isoforms differing by
the presence of one exon, both in the presence and absence of EZH2 co-transfection,
showed the lowest P-values, all less than 1E-100. From these data, I conclude that our
list of 5,912 CpGs is indeed enriched for CpGs with substantially increased methylation
when DNMTs or EZH2 were overexpressed, with DNMT3B1 and DNMT3B2 appearing
to have the strongest impact on the DNA methylation levels at these sites.
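The enrichment test described above can be written as a chi-square goodness-of-fit calculation with one degree of freedom: the observed number of subset CpGs gaining at least 0.05 in methylation is compared with the number expected from the background rate in the full set. This is a minimal sketch under that reading of the text, not the original analysis code; the function name and the toy data are hypothetical.

```python
import math


def enrichment_chi_square(subset_changes, all_changes, cutoff=0.05):
    """Chi-square test (1 df) for enrichment of changes >= cutoff in a subset.

    Expected counts come from the fraction of CpGs in the full set whose
    methylation change meets the cutoff. Returns (chi2, p_value).
    """
    background = sum(1 for c in all_changes if c >= cutoff) / len(all_changes)
    if background in (0.0, 1.0):
        raise ValueError("background rate must be strictly between 0 and 1")
    n = len(subset_changes)
    observed_up = sum(1 for c in subset_changes if c >= cutoff)
    observed_rest = n - observed_up
    expected_up = n * background
    expected_rest = n - expected_up
    chi2 = ((observed_up - expected_up) ** 2 / expected_up
            + (observed_rest - expected_rest) ** 2 / expected_rest)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up data: 10 of 100 CpGs overall gain methylation, but 8 of 20 in the subset.
all_changes = [0.10] * 10 + [0.00] * 90
subset_changes = [0.10] * 8 + [0.00] * 12
chi2, p_value = enrichment_chi_square(subset_changes, all_changes)
print(chi2, p_value)
```

Here 2 of the 20 subset CpGs would be expected to pass the cutoff by chance, so observing 8 gives a large statistic and a small P-value.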
Based on these data, I further investigated the altered DNA methylation in the
DNMT3B1 and DNMT3B2 overexpression experiments. Because these splice isoforms
differ by only one exon coding for 21 amino acids in a linker region (Sakai et al. 2004), I
suspected that they would share many targets. To identify the CpGs targeted by
DNMT3B1 and DNMT3B2 in prostate tumors, I examined the list of CpGs that were
hypermethylated in prostate tumors and in the overexpression experiments. Specifically,
I looked for overlaps in the list of CpGs with 5% or greater increase in methylation
compared to the mock in the DNMT3B1 (1267 CpGs), DNMT3B1+EZH2 (1322 CpGs),
DNMT3B2 (1261 CpGs), and DNMT3B2+EZH2 (1235 CpGs) overexpression
experiments. Four hundred and thirty eight CpGs were represented in all 4 lists and an
additional 425 CpGs were represented in 3 of the 4 lists. I performed two permutation
tests to determine the likelihood of our results. In the first permutation test, I generated 4
lists of CpGs (1267, 1322, 1261 and 1235 CpGs, respectively) drawn randomly from the
whole list of 26,333 CpGs and counted the number of iterations in which there was an
overlap of 438 CpGs in all 4 lists. This was never observed in the 10,000 iterations. In our
second permutation test, I repeated the first permutation test but changed the criterion to
observing at least 863 CpGs overlapping in 3 of the 4 lists. This too was never observed
in 10,000 iterations. This provided further evidence that the differentially methylated
CpGs in the DNMT3B1 and DNMT3B2 overexpression experiments indeed significantly
deviated from random sampling, and are likely to be those that are specifically, directly
or indirectly, targeted by these methyltransferases.
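The two permutation tests can be sketched as below. Each iteration draws four random lists of the stated sizes from 26,333 CpG indices and counts how many CpGs fall in all four lists and how many fall in at least three. Interpreting the first criterion as "at least 438" and the second as "at least 863 in three or more lists" is an assumption made here, as are the function names and the random seed; the example runs fewer iterations than the 10,000 used in the text to keep it fast.

```python
import random
from collections import Counter


def overlap_counts(lists):
    """Return (number of items in every list, number in all but at most one)."""
    membership = Counter()
    for items in lists:
        membership.update(set(items))
    k = len(lists)
    in_all = sum(1 for c in membership.values() if c == k)
    in_most = sum(1 for c in membership.values() if c >= k - 1)
    return in_all, in_most


def permutation_test(n_total, sizes, observed_all, observed_most,
                     iterations=10000, seed=0):
    """Count iterations whose random overlaps reach the observed values."""
    rng = random.Random(seed)
    population = range(n_total)
    hits_all = hits_most = 0
    for _ in range(iterations):
        lists = [rng.sample(population, size) for size in sizes]
        in_all, in_most = overlap_counts(lists)
        hits_all += in_all >= observed_all
        hits_most += in_most >= observed_most
    return hits_all, hits_most


# List sizes and observed overlaps as reported in the text; 200 iterations only.
sizes = [1267, 1322, 1261, 1235]
hits = permutation_test(26333, sizes, observed_all=438, observed_most=863,
                        iterations=200, seed=1)
print(hits)
```

Under random sampling, fewer than one CpG is expected in all four lists, which is why the observed overlaps are never reached.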
CHAPTER 5
DISCUSSION
DNA methylation changes in prostate cancer and their potential as diagnostic and
prognostic biomarkers
Alterations in DNA methylation have been shown to play a role in tumorigenesis and
cancer progression in many malignancies, including prostate cancer. Until recently,
technical limitations have restricted these findings to either characterization of a handful
of candidate loci or of overall abundance of 5-methylcytosine in the genome. Here, I
present quantitative DNA methylation levels at more than 26,000 loci across 14,000 gene
promoters. Because I assayed 95 cancers and 86 normal prostate tissues in parallel at
CpGs specifically enriched at gene promoters, I was able to show that 43% of gene
promoters represented in our assay had a tumor-specific methylation change. In addition
to confirming methylation changes seen in previously published candidate loci studies, I
also identified thousands of novel changes, including a set of hypermethylated loci more
strongly predictive of prostate cancer than GSTP1. Our data show that DNA methylation
changes in prostate cancer occur on a broad scale, at many loci throughout the genome.
DNA methylation alteration has been observed in early cancers and precursor lesions,
suggesting that methylation changes drive malignant initiation rather than tumor
progression (Baylin et al. 2001; Belinsky et al. 1998; Guerrero-Preston et al. 2009;
Brooks et al. 1998). Our observations are largely consistent with this hypothesis. If the
acquisition of DNA methylation alterations continues throughout tumor progression,
variation in methylation profiles should be observed in tumors of different histological
grades and clinical outcomes. Although I detected more heterogeneity among tumors
than among normal tissues, the vast majority of tumors fell in a single cluster and I did
not observe obvious subclassifications, though some tumor samples did cluster with
normal samples. I compared clinical outcomes of the donors of the tumors that clustered
with normal tissues against the donors of the other tumors but did not observe any
differences in Gleason grades or time-to-recurrence (data not shown). However, from the
little inter-tumor heterogeneity that did exist, I identified several dozen DNA methylation
changes that correlated with patients' time-to-recurrence.
Both the diagnostic and prognostic markers identified in this study have the potential to
be clinically useful. Following the identification of GSTP1 hypermethylation in prostate
cancer, it was suggested that DNA methylation alterations in prostate cells can be
detected from patient urine, semen or blood (Goessl et al. 2001). DNA methylation,
unlike RNA transcript levels or protein abundance, is a stable mark, making
it an ideal target for clinical testing. Furthermore, because I saw tumor-specific
acquisition and loss of methylation marks, diagnosis may be made simply by the presence
or absence of certain methylation events rather than by looking for fold changes in
abundance, and therefore is not particularly dependent on patients' background levels.
A urine-, semen- or blood-based test utilizing these DNA methylation changes
could potentially supplant PSA testing and/or reduce the number of biopsies performed.
Potentially more importantly, if a DNA methylation test can accurately predict the course
of prostate tumor progression, the number of unnecessary prostatectomies can be
drastically reduced. However, before either of these tests can be developed, it would be
necessary to repeat our findings in an independent replication sample set.
Mechanisms of DNA methylation alterations
The fact that I observed changes at a very specific subset of CpG sites across most
tumors, rather than a global DNA methylation deregulation or instability, suggests a
common mechanism among prostate cancers. This specificity in target sites was
particularly apparent in gene promoters assayed by multiple probes and by the PyroMark
assay (Supplemental Text S2). The case of GSTP1 illustrates this point well, where the
methylation changes were highly context dependent: only the CpG island overlapping the
transcriptional start site was hypermethylated. Based on these findings, I suspect that
cellular processes involved with targeted CpG methylation regulation are themselves
misregulated or altered in early tumor initiation. The most likely candidates are DNMTs
and DNMT-interacting proteins. In support of this hypothesis, I observed significant
correlations between the gene expression levels and levels of global hypermethylation for
several of these candidates. In vitro experiments in normal prostatic epithelial cells
confirmed that overexpression of DNMT3B1 and DNMT3B2 leads to
hypermethylation of a subset of the sites hypermethylated in prostate tumors. These data, together
with previous observations, strongly suggest that dysregulation of DNMTs, and possibly of
DNMT-interacting proteins, is among the earliest events in tumorigenesis.
While I did not address the mechanism for the observed decreased methylation of some
CpGs in tumors, there are four likely possibilities. First, there may be aberrations in the
maintenance DNA methyltransferase gene, DNMT1. Although I did not observe a
decrease in the DNMT1 transcript level, there may be translational dysregulation of this
gene or mutations that lead to decreased activity. A decrease in DNMT1 activity may
lead to improper maintenance and gradual loss of methylation with every DNA
replication. However, this would likely lead to a global loss rather than targeted loss at
particular CpGs, and therefore, is the least likely scenario. A second possibility is the
dysregulation of a direct or indirect DNA demethylase. While there have been a few
reports of such enzymes in mammalian cells, none has been conclusive and their
existence is still speculative (Bhutani et al. 2010; Iyer et al. 2009; Okada et al. 2010). A
third possibility is that the targeted hypomethylation may be the result of dysregulation of
an interacting protein of DNMT1 or the hypothetical DNA demethylase. With more than
twenty DNMT1-interacting proteins already identified, it is conceivable that one or more
of them are involved in DNMT1 targeting. Finally, there could be a chromatin level
rearrangement that is influencing DNA accessibility. It could be that an active
demethylase gains access to previously inaccessible regions after chromatin remodeling.
To further interrogate these possibilities, a better understanding of DNA demethylation
is needed.
Conclusions and future directions
By approaching DNA methylation in cancer from a genomic perspective, I was able to
gain new insights into the underlying biology of prostate cancer, as well as discover
novel markers for more accurate diagnosis of the disease. However, our study was
limited in scale by technology and practicality: with only 26,000 assayed CpGs, mostly
biased towards gene promoters, it is likely that these results are not representative of the
28 million CpGs found in the human genome. The vast majority of sites targeted by the
Illumina HumanMethylation27 microarrays were chosen because of their proximity to an
annotated transcription start site and CpG islands. Because ectopic DNA methylation at
CpG islands has been thought to silence nearby genes (Bird 2002; Jones and Baylin
2002), this was a reasonable approach to select CpGs to be assayed. However this
platform assays only about half of all CpG islands. Even among these assayed CpG
islands, the PyroMark assay revealed some degree of variability in methylation levels
(Figure 3.12). This suggests that, because of the low coverage of about 2 CpGs per
island, the methylation levels I observed are not necessarily representative of the whole CpG island.
This is particularly significant as recent studies have suggested that the edges of and
regions adjacent to CpG islands, or CpG island shores, may be the most functional region
in terms of gene regulation by DNA methylation (Irizarry et al. 2009). These
observations warrant a more detailed look at global DNA methylation patterns.
Since the time the experiments detailed here were completed, Illumina released the 2nd
generation of these arrays, which covers approximately 450,000 CpGs, including those in
CpG islands, CpG island shores, miRNA promoter regions, intergenic regions and
additional gene promoters not covered by the HumanMethylation27, while maintaining
90% of the sites assayed by the HumanMethylation27 platform. This method would be
ideal for future studies intended to study CpG sites that are likely to have a functional
impact based on our current assumptions of DNA methylation. In addition to the
advantages of the targeted approach of the HumanMethylation27, this
HumanMethylation450 platform would assay additional regions and provide greater
depth of coverage at each target region.
As an alternative to HumanMethylation450, researchers have developed several
high-throughput sequencing-based methods for assaying genome-wide DNA methylation
(Brunner et al. 2009; Meissner et al. 2005). In particular, the Reduced Representation
Bisulfite Sequencing (RRBS) method developed by Meissner et al. can provide a
quantitative DNA methylation profile of several hundred thousand to a couple of million
CpGs. While these sequencing-based approaches are more time-consuming and
expensive, they provide DNA methylation information in a more agnostic way; that is,
the sites assayed are not limited to those hand-picked based on genome annotation.
However, that is not to say that these methods are without bias. By design, both
methods are biased towards CpG-rich regions of the genome and can only interrogate
regions bounded by two MspI recognition sequences (CCGG).
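The restriction to regions bounded by two MspI sites can be illustrated with a small in silico digest. The following is a minimal sketch, not the protocol used by Meissner et al.: the function names, the toy sequence and the fragment size window are hypothetical choices made for illustration, and the only facts assumed are that MspI recognizes CCGG and cuts after the first C.

```python
import re

def mspi_fragments(seq, min_len=40, max_len=220):
    """Return (start, end) pairs for fragments bounded by two MspI sites.

    MspI recognizes CCGG and cuts after the first C (C^CGG), so each cut
    position is one base into the site. The size window is a hypothetical
    stand-in for a gel size-selection step.
    """
    seq = seq.upper()
    cuts = [m.start() + 1 for m in re.finditer("CCGG", seq)]
    fragments = []
    # Only sequence lying between two adjacent cut sites can be recovered.
    for left, right in zip(cuts, cuts[1:]):
        if min_len <= right - left <= max_len:
            fragments.append((left, right))
    return fragments

def count_cpgs(seq, start, end):
    """Count CpG dinucleotides lying fully inside seq[start:end]."""
    return seq.upper().count("CG", start, end)

# Toy sequence with four MspI sites (illustrative only, not genomic data).
toy = "AA" + "CCGG" + "TTTT" + "CCGG" + "AA" + "CCGG" + "A" * 16 + "CCGG"
for start, end in mspi_fragments(toy, min_len=7, max_len=10):
    print(start, end, count_cpgs(toy, start, end))
```

Sequence outside the outermost sites, and fragments that fall outside the size window, never appear in the output, which is the source of the bias towards CpG-rich regions described above.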
While both HumanMethylation450 and RRBS would allow for a more thorough
exploration of the prostate genome, they have different advantages and disadvantages.
HumanMethylation450 is more likely to identify potentially functional methylation
changes but is limited to those sites selected by Illumina. RRBS allows for an agnostic
scan of DNA methylation profiles, but is still restricted to specific regions of the
genome and is more expensive. For the immediate future, either of these approaches
could significantly improve the quality of the DNA methylation profiles I depicted in this
study, and the appropriate method should be chosen based on these differences. However,
I speculate that as the cost of sequencing rapidly drops and third-generation sequencing
technologies, including those that can directly measure DNA methylation, become more
widely available, researchers will soon be able to affordably perform whole-genome
DNA methylation sequencing on large sample sets. This would dramatically improve our
understanding of DNA methylation alterations that occur in cancer.
Beyond DNA methylation, recent success in integrative analysis of copy-number
variation (CNV) and gene expression data highlights the great value in studying prostate
cancer from multiple perspectives (Taylor et al. 2010). Expanding such an integrative
analysis to include DNA methylation data along with gene expression and CNV data is
likely to lead to a better understanding of prostate cancer biology.
Finally, our quantitative methylation analyses revealed a wide spectrum of methylation
states (beta scores) rather than the expected binary states of “methylated” or
“unmethylated.” This was true in both tumor and normal tissues. This indicates cell
population heterogeneity at the molecular level, despite careful dissection of the samples
by a pathologist based on histological characteristics. A careful investigation of this
population heterogeneity may lead not only to a more detailed picture of DNA
methylation alterations, but also to a better understanding of tumor progression, such as
mutually exclusive or obligate co-occurring events.
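One way to see how population heterogeneity can produce intermediate beta scores is to treat the bulk measurement as a weighted average over subpopulations. The sketch below rests on an idealized assumption that the measured beta equals the fraction of methylated DNA copies at a CpG; the function name and the example proportions are hypothetical and are not values from this study.

```python
def mixture_beta(fractions, betas):
    """Expected bulk beta for a sample made of several cell subpopulations.

    fractions: share of the sample contributed by each subpopulation
               (must sum to 1).
    betas:     methylation level of the CpG within each subpopulation
               (0 = unmethylated, 1 = fully methylated).
    Idealized model: ignores probe noise and background signal.
    """
    if len(fractions) != len(betas):
        raise ValueError("fractions and betas must have the same length")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(f * b for f, b in zip(fractions, betas))

# A sample with 70% fully methylated cells and 30% unmethylated cells.
print(mixture_beta([0.7, 0.3], [1.0, 0.0]))
```

Under this model, a CpG that is strictly binary in every individual cell still yields an intermediate bulk value whenever the subpopulations disagree.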
APPENDIX
SUPPLEMENTARY TABLES
Supplementary Table S1: Clinical information associated with prostate samples.
PC#  Age  Pre-treatment PSA  Path Gr (Gleason)  Months followed-up  Recurrence  Days to recurrence
15 43 5.12 3+3 90.2 None -
19 57 4.42 4+5 87.2 Biochemical 574
21 61 7.82 3+4 88.4 None -
22 53 4.64 3+3 85.3 None -
26 62 4.32 3+4 81.6 Biochemical 1809
37 50 3.2 4+3 85.3 Biochemical 1217
45 56 6.58 3+4 91.8 None -
47 51 9.92 3+4 64.7 Biochemical 1128
83 64 7.92 4+5 31.2 None -
84 71 8.48 3+4 80.9 None -
85 47 4.8 3+4 70 None -
86 65 2.1 3+4 69 None -
88 53 5.31 3+3 77.8 None -
92 69 4.51 3+4 74.5 None -
97 53 4.1 4+3 85.8 Biochemical 1834
99 73 6.26 3+4 83.6 None -
100 67 4.4 4+3 83.1 None -
103 61 5.08 4+4 87.5 Biochemical 177
111 57 44.46 3+4 61.4 None -
159 47 2.7 3+4 79.4 None -
163 65 4.2 4+5 62.7 None -
166 44 2.79 3+4 36.8 None -
167 58 6.68 3+4 54.9 None -
184 63 6.71 4+3 44.6 None -
185 55 6.03 3+4 45.2 None -
188 72 9.14 3+3 69.5 None -
190 72 6.03 3+4 58.1 None -
205 66 6.76 3+4 16.9 None -
223 62 5.1 3+4 74.9 Biochemical 1852
228 45 6.55 3+4 74.3 Biochemical 2013
229 48 6.04 3+4 54.7 None -
233 69 9.74 3+4 15.4 None -
237 66 4+5 26 Unknown -
242 56 10.92 3+3 79 None -
248 73 15.9 4+3 116 Biochemical 90
252 65 42 3+4 22.9 Biochemical 306
265 59 4.53 4+4 20.8 Biochemical 83
274 62 3.9 3+4 105.5 None -
283 70 10.8 3+4 104.1 None -
335 58 3.78 4+3 45.5 None -
336 65 5.96 3+4 22.2 None -
343 69 10.77 3+3 62.0 None -
348 63 2.16 3+4 73 None -
351 62 3.04 4+3 62.3 None -
352 51 3.62 4+3 66.1 None -
361 72 3.09 3+4 24.4 None -
362 67 2.78 3+3 44.8 None -
366 57 4.01 4+3 63.8 Biochemical 370
367 52 5.9 3+4 3 None -
370 56 10.76 4+4 62.7 Biochemical 475
375 61 4.29 3+4 16.5 None -
378 62 8.976 3+4 60.6 Biochemical 1455
389 64 7.45 4+5 40.8 None -
393 54 5.18 3+3 33.6 None -
398 72 12.86 3+4 53.7 Biochemical 92
405 64 15.44 3+4 40.1 Biochemical 75
423 68 8.13 3+3 49.0 None -
430 55 9.37 3+4 9.6 None -
448 63 7.1 3+4 35.6 None -
452 62 5.12 4+3 48.2 Biochemical 575
453 71 4.97 4+3 55.1 None -
455 56 4.5 3+4 49.1 None -
457 62 5.74 3+4 38.7 None -
463 70 4.11 4+3 52.1 None -
470 72 6.8 3+4 47.9 None -
473 66 4.64 3+4 52.8 None -
474 68 6.16 3+4 43.3 None -
477 58 4.62 3+4 11.9 None -
480 51 0.94 3+3 48.2 None -
482 64 2.84 3+3 45.9 None -
485 61 5.91 3+4 49.7 Biochemical 1491
488 66 9.61 4+3 28.9 None -
490 63 4.48 3+4 44 None -
491 64 6.21 3+3 9.3 None -
491 64 6.21 3+3 9.3 None -
494 65 2.38 3+4 41.4 None -
494 65 2.38 3+4 41.4 None -
498 50 4.68 3+3 50.7 None -
501 61 3.93 4+3 41.4 None -
527 61 7.66 3+3 40 None -
537 51 22.7 4+3 30.1 None -
538 69 10.7 3+4 33.3 Biochemical 746
540 46 6.2 3+4 18.3 None -
544 59 7 3+4 0 None -
547 65 10.1 4+5 30.7 Biochemical 96
551 48 4.8 3+3 22.8 None -
555 64 5.5 3+4 24.2 None -
563 63 9.8 3+4 5.6 None -
574 43 9.6 4+5 51 Biochemical 570
575 50 9.71 3+3 25.1 None -
579 48 4.7 3+3 22.6 None -
582 58 3.61 3+4 19.7 None -
593 50 7.56 3+3 8.8 None -
594 60 4.3 3+4 27.0 None -
599 49 8.9 3+4 24.5 None -
601 56 5.3 4+3 0 None -
604 54 3.8 3+4 0 None -
610 64 3.6 4+3 0 None -
616 59 4.45 4+3 1.4 None -
619 71 4.17 3+4 3.3 None -
621 60 3.13 3+4 16.8 None -
625 64 3 3+4 21.0 None -
626 54 3.28 3+4 1.4 None -
627 53 7.5 4+5 19.1 Biochemical 41
629 71 3.2 3+3 20.5 None -
634 70 14.91 4+5 11.7 None -
636 59 6.38 3+3 16.8 Biochemical 139
643 61 7.16 3+4 18.1 None -
645 53 3.3 3+3 15.4 None -
646 64 4.9 3+4 16.6 None -
648 64 8.4 4+4 0 None -
07 8392 66 21.2 3+5 12 None -
07 866 66 13.5 3+4 21 None -
07 8980 72 23.2 4+3 12 Biochemical 1
07 9957 53 4.4 3+4 12 None -
TB 1872 71 68 Biochemical 1
TB 2682 57 28.5 4+3 45 None -
TB1875 73 12 Biochemical
Supplementary Table S2: Diagnostic methylation markers of prostate cancer identified
by PAM.
Rank CpG ID Gene Symbol
1 cg00489401 FLT4
2 cg10541755 EIF5A2
3 cg05270634 RND2
4 cg02879662 HIF3A
5 cg17231524 MGC39606
6 cg26537639 CYBA
7 cg22262168 MOBKL2B
8 cg14563260 EDG2
9 cg19790294 CYBA
10 cg07186138 APOBEC3C
11 cg14672994 FLJ20920
12 cg21096399 MCAM
13 cg15146752 EPHA2
14 cg24340926 RAB33A
15 cg20557104 B3GALT7
16 cg04622802 LOC387758
17 cg17965019 HIST1H3J
18 cg09300114 SLC16A5
19 cg08359956 LR8
20 cg10453365 RHCG
21 cg08924430 FLJ20032
22 cg13102585 RPIP8
23 cg00848728 DAB1
24 cg03085312 RARA
25 cg06428055 ELF4
26 cg04448487 GDAP1L1
27 cg09851465 C1orf87
28 cg08348496 HAPLN3
29 cg22862656 SCGB2A2
30 cg22319147 CDH5
31 cg27223047 FBN2
32 cg08965235 LTBP3
33 cg24715245 UCHL1
34 cg02254461 AXUD1
35 cg26025891 PSTPIP1
36 cg01683883 CMTM2
37 cg17606785 EFS
38 cg21307628 URB
39 cg18328334 TNS1
40 cg19853760 LGALS1
41 cg16232979 TPM4
42 cg23502772 MGC42105
43 cg04034767 GRASP
44 cg20083676 EDG3
45 cg21623671 ANXA6
46 cg12627583 AOX1
47 cg19713460 SYNGR1
48 cg19423196 MAT1A
49 cg22892110 MAPK15
50 cg12727795 PDGFRB
51 cg15835232 HLF
52 cg12100791 PYCARD
53 cg09704415 SPATA6
54 cg04337944 FBLN1
55 cg14360917 SP2
56 cg26420196 GAS6
57 cg04920951 GSTP1
58 cg27554782 CHRNB4
59 cg00727590 PLA2G3
60 cg14188232 ITGA11
61 cg18145505 GREM1
62 cg18711066 NFATC3
63 cg26124016 RARB
64 cg24512400 KLK10
65 cg15528736 FCGRT
66 cg01777397 RSNL2
67 cg03513363 DUSP15
68 cg21790626 ZNF154
69 cg02659086 GSTP1
70 cg00862041 GPRASP2
71 cg18552413 DARC
72 cg23499956 S100A16
73 cg17329164 PPT2
74 cg18006568 FLJ12056
75 cg14539231 EPSTI1
76 cg04273431 PRR3
77 cg15910208 KLK10
78 cg12585943 PPT2
79 cg15309006 LOC63928
80 cg17568996 NFAM1
81 cg24467291 RSN
82 cg02029926 CXCL1
83 cg20786074 EFEMP1
84 cg25806808 CXCL1
85 cg23092823 PODN
86 cg09099744 CDKN2A
87 cg25259754 FCRL3
Supplementary Table S3: Prognostic methylation markers of prostate cancer identified
by SAM survival.
Rank CpG ID Gene Symbol q-value (%)
1 cg01352108 KCNK4 0
2 cg24068372 LOC349136 0
3 cg20870559 OAS2 0
4 cg03734874 FLJ42486 0
5 cg03640944 KIAA1754 18.05041
6 cg02320454 GPR150 25.7863
7 cg17173423 MS4A3 15.04201
8 cg05047411 MAGEA8 15.04201
9 cg26164184 FCN2 15.04201
10 cg04645174 OR7A17 26.81402
11 cg05828624 REG1A 26.81402
12 cg21325760 MAGEL2 26.81402
13 cg20804821 GPR62 26.81402
14 cg03600318 SFTPD 26.81402
15 cg11061975 SIRPB2 26.81402
16 cg14620221 OR8B8 26.81402
17 cg13311440 CD48 26.81402
18 cg27504299 TCL1B 26.81402
19 cg03109316 ZNF80 26.81402
20 cg00918005 REG3G 26.81402
21 cg17836145 VNN2 26.81402
22 cg15457079 CPN1 26.81402
23 cg07688234 PFC 26.81402
24 cg22511262 WT1 26.81402
25 cg24169915 FLJ25773 26.81402
26 cg03833774 ZCCHC5 26.81402
27 cg20832020 VSIG9 26.81402
28 cg17338403 SLCO3A1 26.81402
29 cg01564343 TREML1 26.81402
30 cg22228134 GZMH 26.81402
31 cg22442090 GIMAP5 26.81402
32 cg01731341 FGF6 26.81402
33 cg19000186 CNGA1 26.81402
34 cg15711744 ANP32D 26.81402
35 cg03544379 OR7C2 26.81402
36 cg07443748 CESK1 26.81402
37 cg04353769 MS4A6A 26.81402
38 cg04014889 MAGEL2 26.81402
39 cg07379574 C19orf4 26.81402
40 cg10994126 PAPPA2 26.81402
41 cg03014957 DEFB118 26.81402
42 cg24012708 HDHD3 26.81402
43 cg13447818 FLG 26.81402
44 cg05222924 WT1 26.81402
45 cg18368125 TMED6 26.81402
46 cg19718882 WIT-1 26.81402
47 cg13097816 GPR35 26.81402
48 cg12237269 SLN 26.81402
49 cg19241311 DEFB123 26.81402
50 cg16777782 CDH13 26.81402
51 cg05248470 LILRB2 26.81402
52 cg16158220 REGL 26.81402
53 cg21353232 SEZ6L 26.81402
54 cg13482233 HEPH 26.81402
55 cg12234947 GNAT2 26.81402
56 cg15075718 MFRP 26.81402
57 cg01351032 CIITA 26.81402
58 cg01693350 WT1 26.81402
59 cg12878228 PRSS1 26.81402
60 cg06550629 GPR133 26.81402
61 cg01757745 C10orf93 26.81402
62 cg02813121 S100A12 26.81402
63 cg06806711 MS4A1 26.81402
64 cg13297249 FLJ38379 26.81402
65 cg01369413 UBQLN3 26.81402
66 cg09217923 TAAR2 26.81402
67 cg00690280 WFDC10B 26.81402
68 cg21742836 PPP4C 26.81402
69 cg24122922 C20orf39 26.81402
REFERENCES
Andriole GL, Grubb RL, Buys SS, Chia D, Church TR, Fouad MN, Gelmann EP, Kvale
PA, Reding DJ, Weissfeld JL, et al. 2009. Mortality Results from a Randomized
Prostate-Cancer Screening Trial. N Engl J Med 360: 1310-1319.
Baylin SB, Esteller M, Rountree MR, Bachman KE, Schuebel K, and Herman JG. 2001.
Aberrant patterns of DNA methylation, chromatin formation and gene expression
in cancer. Hum. Mol. Genet. 10: 687-692.
Belinsky SA, Nikula KJ, Palmisano WA, Michels R, Saccomanno G, Gabrielson E,
Baylin SB, and Herman JG. 1998. Aberrant methylation of p16INK4a is an early
event in lung cancer and a potential biomarker for early diagnosis. Proceedings of
the National Academy of Sciences of the United States of America 95: 11891-11896.
Bhutani N, Brady JJ, Damian M, Sacco A, Corbel SY, and Blau HM. 2010.
Reprogramming towards pluripotency requires AID-dependent DNA
demethylation. Nature 463: 1042-1047.
Bird A. 2002. DNA methylation patterns and epigenetic memory. Genes & Development
16: 6-21.
Brooks JD, Weinstein M, Lin X, Sun Y, Pin SS, Bova GS, Epstein JI, Isaacs WB, and
Nelson WG. 1998. CG island methylation changes near the GSTP1 gene in
prostatic intraepithelial neoplasia. Cancer Epidemiology Biomarkers &
Prevention 7: 531-536.
Brunner AL, Johnson DS, Kim SW, Valouev A, Reddy TE, Neff NF, Anton E, Medina
C, Nguyen L, Chiao E, et al. 2009. Distinct DNA methylation patterns
characterize differentiated human embryonic stem cells and developing human
fetal liver. Genome Res 19: 1044-1056.
Cairns P, Esteller M, Herman JG, Schoenberg M, Jeronimo C, Sanchez-Cespedes M,
Chow N, Grasso M, Wu L, Westra WB, et al. 2001. Molecular Detection of
Prostate Cancer in Urine by GSTP1 Hypermethylation. Clinical Cancer Research
7: 2727-2730.
Chen Z, Mann JR, Hsieh C, Riggs AD, and Chédin F. 2005. Physical and functional
interactions between the human DNMT3L protein and members of the de novo
methyltransferase family. J. Cell. Biochem 95: 902-917.
Chomez P, De Backer O, Bertrand M, De Plaen E, Boon T, and Lucas S. 2001. An
overview of the MAGE gene family with the identification of all human members
of the family. Cancer Res 61: 5544-5551.
Chuang LS, Ian H, Koh T, Ng H, Xu G, and Li BF. 1997. Human DNA-(Cytosine-5)
Methyltransferase-PCNA Complex as a Target for p21WAF1. Science 277: 1996-2000.
Das PM, and Singal R. 2004. DNA Methylation and Cancer. Journal of Clinical
Oncology 22: 4632-4642.
Ehrlich M. 2002. DNA methylation in cancer: too much, but also too little. Oncogene 21:
5400-5413.
El-Maarri O, Kareta MS, Mikeska T, Becker T, Diaz-Lacava A, Junen J, Nusgen N,
Behne F, Wienker T, Waha A, et al. 2009. A systematic search for DNA
methyltransferase polymorphisms reveals a rare DNMT3L variant associated with
subtelomeric hypomethylation. Hum. Mol. Genet. 18: 1755-1768.
Esteller M, and Herman JG. 2002. Cancer as an epigenetic disease: DNA methylation and
chromatin alterations in human tumours. J. Pathol. 196: 1-7.
Estève P, Chin HG, and Pradhan S. 2005. Human maintenance DNA (cytosine-5)-
methyltransferase and p53 modulate expression of p53-repressed promoters.
Proceedings of the National Academy of Sciences of the United States of America
102: 1000-1005.
Feinberg AP, and Vogelstein B. 1983. Hypomethylation distinguishes genes of some
human cancers from their normal counterparts. Nature 301: 89-92.
Finne P, Finne R, Auvinen A, Juusela H, Aro J, Määttänen L, Hakama M, Rannikko S,
Tammela TL, and Stenman U. 2000. Predicting the outcome of prostate biopsy in
screen-positive men by a multilayer perceptron network. Urology 56: 418-422.
Freedland SJ, Humphreys EB, Mangold LA, Eisenberger M, Dorey FJ, Walsh PC, and
Partin AW. 2005. Risk of Prostate Cancer–Specific Mortality Following
Biochemical Recurrence After Radical Prostatectomy. JAMA: The Journal of the
American Medical Association 294: 433-439.
Gama-Sosa MA, Slagel VA, Trewyn RW, Oxenhandler R, Kuo KC, Gehrke CW, and
Ehrlich M. 1983. The 5-methylcytosine content of DNA from human tumors.
Nucleic Acids Res 11: 6883-6894.
Goessl C, Müller M, Heicappell R, Krause H, Straub B, Schrader M, and Miller K. 2001.
DNA-based detection of prostate cancer in urine after prostatic massage. Urology
58: 335-338.
Guerrero-Preston R, Báez A, Blanco A, Berdasco M, Fraga M, and Esteller M. 2009.
Global DNA methylation: a common early event in oral cancer cases with
exposure to environmental carcinogens or viral agents. P R Health Sci J 28: 24-29.
Hoffmann MJ, Engers R, Florl AR, Otte AP, Muller M, and Schulz WA. 2007.
Expression changes in EZH2, but not in BMI-1, SIRT1, DNMT1 or DNMT3B are
associated with DNA methylation changes in prostate cancer. Cancer Biol. Ther
6: 1403-1412.
Irizarry RA, Ladd-Acosta C, Wen B, Wu Z, Montano C, Onyango P, Cui H, Gabo K,
Rongione M, Webster M, et al. 2009. The human colon cancer methylome shows
similar hypo- and hypermethylation at conserved tissue-specific CpG island
shores. Nat Genet 41: 178-186.
Iyer LM, Tahiliani M, Rao A, and Aravind L. 2009. Prediction of novel families of
enzymes involved in oxidative and other complex modifications of bases in
nucleic acids. Cell Cycle 8: 1698-1710.
Jemal A, Siegel R, Xu J, and Ward E. 2010. Cancer Statistics, 2010. CA Cancer J Clin
60: 277-300.
Jerónimo C, Henrique R, Hoque MO, Mambo E, Ribeiro FR, Varzim G, Oliveira J,
Teixeira MR, Lopes C, and Sidransky D. 2004. A Quantitative Promoter
Methylation Profile of Prostate Cancer. Clinical Cancer Research 10: 8472-8478.
Johnson WE, Rabinovic A, and Li C. 2006. Adjusting batch effects in microarray
expression data using Empirical Bayes methods. Biostat kxj037.
Jones PA, and Baylin SB. 2002. The fundamental role of epigenetic events in cancer.
Nat. Rev. Genet 3: 415-428.
Jones PA. 1986. DNA Methylation and Cancer. Cancer Research 46: 461-466.
Kim E, Kim Y, Jeong P, Ha Y, Bae S, and Kim W. 2008. Methylation of the RUNX3
Promoter as a Potential Prognostic Marker for Bladder Tumor. The Journal of
Urology 180: 1141-1145.
King JC, Xu J, Wongvipat J, Hieronymus H, Carver BS, Leung DH, Taylor BS, Sander
C, Cardiff RD, Couto SS, et al. 2009. Cooperativity of TMPRSS2-ERG with PI3-
kinase pathway activation in prostate oncogenesis. Nat Genet 41: 524-526.
Kron K, Pethe V, Briollais L, Sadikovic B, Ozcelik H, Sunderji A, Venkateswaran V,
Pinthus J, Fleshner N, van der Kwast T, et al. 2009. Discovery of Novel
Hypermethylated Genes in Prostate Cancer Using Genomic CpG Island
Microarrays. PLoS ONE 4: e4830.
Laird PW, and Jaenisch R. 1994. DNA methylation and cancer. Hum. Mol. Genet 3 Spec
No: 1487-1495.
Laird PW, and Jaenisch R. 1996. The role of DNA methylation in cancer genetics and
epigenetics. Annu. Rev. Genet 30: 441-464.
Lapeyre JN, Walker MS, and Becker FF. 1981. DNA methylation and methylase levels in
normal and malignant mouse hepatic tissues. Carcinogenesis 2: 873-878.
Lapointe J, Li C, Higgins JP, van de Rijn M, Bair E, Montgomery K, Ferrari M, Egevad
L, Rayford W, Bergerheim U, et al. 2004. Gene expression profiling identifies
clinically relevant subtypes of prostate cancer. Proceedings of the National
Academy of Sciences of the United States of America 101: 811-816.
Ley TJ, Ding L, Walter MJ, McLellan MD, Lamprecht T, Larson DE, Kandoth C, Payton
JE, Baty J, Welch J, et al. 2010. DNMT3A Mutations in Acute Myeloid
Leukemia. N Engl J Med. http://www.ncbi.nlm.nih.gov/pubmed/21067377
(Accessed November 29, 2010).
Lin K, Lipsitz R, Miller T, and Janakiraman S. 2008. Benefits and Harms of Prostate-
Specific Antigen Screening for Prostate Cancer: An Evidence Update for the U.S.
Preventive Services Task Force. Annals of Internal Medicine 149: 192-199.
Lin X, Tascilar M, Lee W, Vles W, Lee B, Veeraswamy R, Asgari K, Freije D, Van Rees
B, Gage W, et al. 2001. GSTP1 CpG island hypermethylation is responsible for
the absence of GSTP1 expression in human prostate cancer cells. American
Journal of Pathology 159: 1815-1826.
Meissner A, Gnirke A, Bell GW, Ramsahoye B, Lander ES, and Jaenisch R. 2005.
Reduced representation bisulfite sequencing for comparative high-resolution
DNA methylation analysis. Nucleic Acids Res 33: 5868-5877.
Müller HM, Widschwendter A, Fiegl H, Ivarsson L, Goebel G, Perkmann E, Marth C,
and Widschwendter M. 2003. DNA Methylation in Serum of Breast Cancer
Patients. Cancer Research 63: 7641-7645.
Nakayama M, Gonzalgo ML, Yegnasubramanian S, Lin X, De Marzo AM, and Nelson
WG. 2004. GSTP1 CpG island hypermethylation as a molecular biomarker for
prostate cancer. J. Cell. Biochem 91: 540-552.
National Cancer Institute. 2010. Cancer Trends Progress Report - 2009/2010 Update.
http://progressreport.cancer.gov/ (Accessed January 8, 2011).
Noushmehr H, Weisenberger DJ, Diefes K, Phillips HS, Pujara K, Berman BP, Pan F,
Pelloski CE, Sulman EP, Bhat KP, et al. 2010. Identification of a CpG Island
Methylator Phenotype that Defines a Distinct Subgroup of Glioma. Cancer Cell
17: 510-522.
Okada Y, Yamagata K, Hong K, Wakayama T, and Zhang Y. 2010. A role for the
elongator complex in zygotic paternal genome demethylation. Nature 463: 554-558.
Okano M, Bell DW, Haber DA, and Li E. 1999. DNA methyltransferases Dnmt3a and
Dnmt3b are essential for de novo methylation and mammalian development. Cell
99: 247-257.
Patra SK, Patra A, Zhao H, and Dahiya R. 2002. DNA methyltransferase and
demethylase in human prostate cancer. Mol. Carcinog. 33: 163-171.
Peehl DM. 2002. Human Prostatic Epithelial Cells - Culture of Epithelial Cells, Second
Edition (eds R. I. Freshney and M. G. Freshney), John Wiley & Sons, Inc., New
York, USA.
http://onlinelibrary.wiley.com/doi/10.1002/0471221201.ch6/summary (Accessed
September 27, 2010).
Pflueger D, Terry S, Sboner A, Habegger L, Esgueva R, Lin P, Svensson MA,
Kitabayashi N, Moss BJ, MacDonald TY, et al. 2011. Discovery of non-ETS gene
fusions in human prostate cancer using next-generation RNA sequencing.
Genome Research 21: 56-67.
Robbins CM, Tembe WA, Baker A, Sinari S, Moses TY, Beckstrom-Sternberg S,
Beckstrom-Sternberg J, Barrett M, Long J, Chinnaiyan A, et al. 2011. Copy
number and targeted mutational analysis reveals novel somatic events in
metastatic prostate tumors. Genome Research 21: 47-55.
Sakai Y, Suetake I, Shinozaki F, Yamashina S, and Tajima S. 2004. Co-expression of de
novo DNA methyltransferases Dnmt3a2 and Dnmt3L in gonocytes of mouse
embryos. Gene Expression Patterns 5: 231-237.
Sboner A, Demichelis F, Calza S, Pawitan Y, Setlur S, Hoshida Y, Perner S, Adami H,
Fall K, Mucci L, et al. 2010. Molecular sampling of prostate cancer: a dilemma
for predicting disease progression. BMC Medical Genomics 3: 8.
Schröder FH, Hugosson J, Roobol MJ, Tammela TL, Ciatto S, Nelen V, Kwiatkowski M,
Lujan M, Lilja H, Zappa M, et al. 2009. Screening and Prostate-Cancer Mortality
in a Randomized European Study. N Engl J Med 360: 1320-1328.
Sherlock G, Hernandez-Boussard T, Kasarskis A, Binkley G, Matese JC, Dwight SS,
Kaloper M, Weng S, Jin H, Ball CA, et al. 2001. The Stanford Microarray
Database. Nucleic Acids Res 29: 152-155.
Singh D, Febbo P, Ross K, Jackson D, Manola J, Ladd C, Tamayo P, Renshaw A,
D'Amico A, Richie J, et al. 2002. Gene expression correlates of clinical prostate
cancer behavior. Cancer Cell 1: 203-209.
Taylor BS, Schultz N, Hieronymus H, Gopalan A, Xiao Y, Carver BS, Arora VK,
Kaushik P, Cerami E, Reva B, et al. 2010. Integrative Genomic Profiling of
Human Prostate Cancer. Cancer Cell 18: 11-22.
The Cancer Genome Atlas. 2008. Comprehensive genomic characterization defines
human glioblastoma genes and core pathways. Nature 455: 1061-1068.
Tibshirani R, Hastie T, Narasimhan B, and Chu G. 2002. Diagnosis of multiple cancer
types by shrunken centroids of gene expression. Proceedings of the National
Academy of Sciences of the United States of America 99: 6567-6572.
Tomlins SA, Rhodes DR, Perner S, Dhanasekaran SM, Mehra R, Sun X, Varambally S,
Cao X, Tchinda J, Kuefer R, et al. 2005. Recurrent fusion of TMPRSS2 and ETS
transcription factor genes in prostate cancer. Science 310: 644-648.
Tomlins SA, Rhodes DR, Yu J, Varambally S, Mehra R, Perner S, Demichelis F,
Helgeson BE, Laxman B, Morris DS, et al. 2008. The role of SPINK1 in ETS
rearrangement-negative prostate cancers. Cancer Cell 13: 519-528.
Troyanskaya O, Cantor M, Sherlock G, Brown P, Hastie T, Tibshirani R, Botstein D, and
Altman RB. 2001. Missing value estimation methods for DNA microarrays.
Bioinformatics 17: 520-525.
Tusher VG, Tibshirani R, and Chu G. 2001. Significance analysis of microarrays applied
to the ionizing radiation response. Proceedings of the National Academy of
Sciences of the United States of America 98: 5116-5121.
Woodson K, O'Reilly KJ, Hanson JC, Nelson D, Walk EL, and Tangrea JA. 2008. The
usefulness of the detection of GSTP1 methylation in urine as a biomarker in the
diagnosis of prostate cancer. J. Urol 179: 508-511; discussion 511-512.
Yaqinuddin A, Qureshi S, Qazi R, and Abbas F. 2008. Down-regulation of DNMT3b in
PC3 cells effects locus-specific DNA methylation, and represses cellular growth
and migration. Cancer Cell International 8: 13.
Yegnasubramanian S, Kowalski J, Gonzalgo ML, Zahurak M, Piantadosi S, Walsh PC,
Bova GS, De Marzo AM, Isaacs WB, and Nelson WG. 2004. Hypermethylation
of CpG Islands in Primary and Metastatic Human Prostate Cancer. Cancer
Research 64: 1975-1986.