serum dna motifs predict disease and clinical status in multiple sclerosis

8
JMD CME Program Serum DNA Motifs Predict Disease and Clinical Status in Multiple Sclerosis Julia Beck,* Howard B. Urnovitz,* Marina Saresella, Domenico Caputo, Mario Clerici, †§ William M. Mitchell, and Ekkehard Schu ¨ tz* From Chronix Biomedical,* Goettingen, Germany; the Laboratory of Molecular Medicine and Biotechnology, and the Multiple Sclerosis Unit, Don C. Gnocchi ONLUS (Organizzazione Non Lucrativa di Utilitá Sociale) Foundation IRCCS (Istituto di Ricovero e Cura a Carattere Scientifico), Milan, Italy; the Department of Biomedical Sciences and Technologies, § University of Milano, Milan, Italy; and the Department of Pathology, Vanderbilt University, Nashville, Tennessee Using recently available mass sequencing and assem- bly technologies, we have been able to identify and quantify unique cell-free DNA motifs in the blood of patients with multiple sclerosis (MS). The most com- mon MS clinical syndrome , relapsing-remitting MS (RRMS) , is accompanied by a unique fingerprint of both inter- and intragenic cell-free circulating nucleic acids as specific DNA sequences that provide signifi- cant clinical sensitivity and specificity. Coding genes that are differentially represented in MS serum en- code cytoskeletal proteins , brain-expressed regulators of growth , and receptors involved in nervous system signal transduction. Although coding genes distinguish RRMS and its clinical activity , several repeat sequences , such as the L1M family of LINE elements , are consis- tently different in all MS patients and clinical status versus the normal database. These data demonstrate that DNA motifs observed in serum are characteris- tic of RRMS and disease activity and are promising as a clinical tool in monitoring patient responses to treatment modalities. (J Mol Diagn 2010, 12:312–319; DOI: 10.2353/jmoldx.2010.090170) Although multiple sclerosis (MS) remains a clinical diagno- sis, 1 the definitive standard for the confirmation of diagnosis and the clinical assessment of MS disease activity is T1- weighted gadolinium (Gd) enhanced magnetic resonance imaging (MRI). Gd-MRI supplies information about current disease activity by highlighting areas of breakdown in the blood-brain barrier that indicate inflammation. 2 Areas of inflammation appear as active lesions. T1-weighted images also show “black holes,” which are thought to indicate areas of permanent damage. T2-weighted MRI scans are used to provide information about disease burden or lesion load. The high costs of randomized clinical trials in MS are di- rectly associated with the requirement of frequent Gd-MRI scans to assess clinical activity as a function of pharmaceu- tical intervention and optimal dose assessment. Gadolinium carries significant risk for some patients. In 2007 the U.S. Food and Drug Administration issued a “black box” warning for the use of gadolinium (http://www.fda.gov/Drugs/DrugSafety/ PostmarketDrugSafetyInformationforPatientsandProviders/ ucm142884.htm, last accessed October 13, 2009). This warning was based on research that linked the use of gadolinium as an image enhancement aid for MRI to the development of nephrogenic systemic fibrosis, a debilitat- ing and potentially fatal disease. The development of new therapies for MS is hin- dered by the lack of a low-cost, minimally invasive diagnostic assay to monitor disease activity. The high per patient cost of Gd-MRI prohibits the number of patients studied in randomized controlled clinical trials as well as the rate at which important questions can be tested. High per patient costs make it prohibitively expensive to study the comparative effectiveness of a treatment, prevention, or diagnostic regimen as it tran- sitions from clinical trial to the larger venue of clinical practice. The cost for maximizing disease control in clinical practice has adverse economic consequences for the uninsured patient as well as societal effects on the health insurance industry and on local, state, and federal governments. The basic sequence data re- ported here from mass sequence and assembly (MSA) technology provides significant promise that the differ- ential frequencies of specific DNA motifs in patients with relapsing-remitting MS (RRMS) can be translated Supported by the EMPRO (European Microbicides Project) and AVIP (AIDS Vaccine Integrated Project) European Commission WP6 Projects, the nGIN (Next Generation HIV-1 Immunogens inducing broadly reactive Neutralising antibodies) and GISHEAL (Genetic and Immunological Stud- ies of European and African HIV-1 Long Term Non Progressors) WP7 Projects, the Japan Health Science Foundation, and 2007 Ricerca Final- izzata (Italian Ministry of Health). Accepted for publication November 6, 2009. None of the authors disclosed any relevant financial relationships. Supplemental material for this article can be found on http://jmd. amjpathol.org. H.B.U., J.B., and E.S. are employees of Chronix Biomedical. W.M.M. is an independent board of directors member. M.C. and W.M.M. serve on the Chronix Medical Advisory Committee. Address reprint requests to William M. Mitchell, M.D., Ph.D., Department of Pathology, Vanderbilt University, Nashville, TN 37232. E-mail: bill.mitchell@ vanderbilt.edu. Journal of Molecular Diagnostics, Vol. 12, No. 3, May 2010 Copyright © American Society for Investigative Pathology and the Association for Molecular Pathology DOI: 10.2353/jmoldx.2010.090170 312

Upload: ekkehard

Post on 27-Jan-2017

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Serum DNA Motifs Predict Disease and Clinical Status in Multiple Sclerosis

JMD

CME Pro

gram

Serum DNA Motifs Predict Disease and ClinicalStatus in Multiple Sclerosis

Julia Beck,* Howard B. Urnovitz,*Marina Saresella,† Domenico Caputo,‡

Mario Clerici,†§ William M. Mitchell,¶

and Ekkehard Schutz*From Chronix Biomedical,* Goettingen, Germany; the Laboratory

of Molecular Medicine and Biotechnology,† and the Multiple

Sclerosis Unit,‡ Don C. Gnocchi ONLUS (Organizzazione Non

Lucrativa di Utilitá Sociale) Foundation IRCCS (Istituto di

Ricovero e Cura a Carattere Scientifico), Milan, Italy; the

Department of Biomedical Sciences and Technologies,§ University

of Milano, Milan, Italy; and the Department of Pathology,¶

Vanderbilt University, Nashville, Tennessee

Using recently available mass sequencing and assem-bly technologies, we have been able to identify andquantify unique cell-free DNA motifs in the blood ofpatients with multiple sclerosis (MS). The most com-mon MS clinical syndrome, relapsing-remitting MS(RRMS), is accompanied by a unique fingerprint ofboth inter- and intragenic cell-free circulating nucleicacids as specific DNA sequences that provide signifi-cant clinical sensitivity and specificity. Coding genesthat are differentially represented in MS serum en-code cytoskeletal proteins, brain-expressed regulatorsof growth, and receptors involved in nervous systemsignal transduction. Although coding genes distinguishRRMS and its clinical activity, several repeat sequences,such as the L1M family of LINE elements, are consis-tently different in all MS patients and clinical statusversus the normal database. These data demonstratethat DNA motifs observed in serum are characteris-tic of RRMS and disease activity and are promisingas a clinical tool in monitoring patient responses totreatment modalities. (J Mol Diagn 2010, 12:312–319;DOI: 10.2353/jmoldx.2010.090170)

Although multiple sclerosis (MS) remains a clinical diagno-sis,1 the definitive standard for the confirmation of diagnosisand the clinical assessment of MS disease activity is T1-weighted gadolinium (Gd) enhanced magnetic resonanceimaging (MRI). Gd-MRI supplies information about currentdisease activity by highlighting areas of breakdown in theblood-brain barrier that indicate inflammation.2 Areas ofinflammation appear as active lesions. T1-weighted imagesalso show “black holes,” which are thought to indicate areasof permanent damage. T2-weighted MRI scans are used toprovide information about disease burden or lesion load.

The high costs of randomized clinical trials in MS are di-rectly associated with the requirement of frequent Gd-MRIscans to assess clinical activity as a function of pharmaceu-tical intervention and optimal dose assessment. Gadoliniumcarries significant risk for some patients. In 2007 theU.S. Foodand Drug Administration issued a “black box” warning for theuse of gadolinium (http://www.fda.gov/Drugs/DrugSafety/PostmarketDrugSafetyInformationforPatientsandProviders/ucm142884.htm, last accessed October 13, 2009). Thiswarning was based on research that linked the use ofgadolinium as an image enhancement aid for MRI to thedevelopment of nephrogenic systemic fibrosis, a debilitat-ing and potentially fatal disease.

The development of new therapies for MS is hin-dered by the lack of a low-cost, minimally invasivediagnostic assay to monitor disease activity. The highper patient cost of Gd-MRI prohibits the number ofpatients studied in randomized controlled clinical trialsas well as the rate at which important questions can betested. High per patient costs make it prohibitivelyexpensive to study the comparative effectiveness of atreatment, prevention, or diagnostic regimen as it tran-sitions from clinical trial to the larger venue of clinicalpractice. The cost for maximizing disease control inclinical practice has adverse economic consequencesfor the uninsured patient as well as societal effects onthe health insurance industry and on local, state, andfederal governments. The basic sequence data re-ported here from mass sequence and assembly (MSA)technology provides significant promise that the differ-ential frequencies of specific DNA motifs in patientswith relapsing-remitting MS (RRMS) can be translated

Supported by the EMPRO (European Microbicides Project) and AVIP(AIDS Vaccine Integrated Project) European Commission WP6 Projects,the nGIN (Next Generation HIV-1 Immunogens inducing broadly reactiveNeutralising antibodies) and GISHEAL (Genetic and Immunological Stud-ies of European and African HIV-1� Long Term Non Progressors) WP7Projects, the Japan Health Science Foundation, and 2007 Ricerca Final-izzata (Italian Ministry of Health).

Accepted for publication November 6, 2009.

None of the authors disclosed any relevant financial relationships.

Supplemental material for this article can be found on http://jmd.amjpathol.org.

H.B.U., J.B., and E.S. are employees of Chronix Biomedical. W.M.M. isan independent board of directors member. M.C. and W.M.M. serve onthe Chronix Medical Advisory Committee.

Address reprint requests to William M. Mitchell, M.D., Ph.D., Department ofPathology, Vanderbilt University, Nashville, TN 37232. E-mail: [email protected].

Journal of Molecular Diagnostics, Vol. 12, No. 3, May 2010

Copyright © American Society for Investigative Pathology

and the Association for Molecular Pathology

DOI: 10.2353/jmoldx.2010.090170

312

Page 2: Serum DNA Motifs Predict Disease and Clinical Status in Multiple Sclerosis

into a rapid serum-based diagnostic assay for MS andassessment of its clinical activity.

Materials and Methods

Patients

Patients (n � 28) with RRMS from the Centro SclerosiMultipla, Don Gnocchi Foundation (Milan, Italy) who sat-isfied the Poser criteria for the diagnosis of clinicallydefinite MS were included in this study. All patients gaveinformed consent according to a protocol approved bythe internal review board of the Don Gnocchi Foundation.Thirteen patients were in clinical relapse, and blood sam-ples were obtained within 7 days of clinical relapse andbefore the initiation of therapy. These patients are clas-sified as having relapsing MS and included 12 femalesand 1 male (median age 38 years, range 31�55 years)with median duration of MS of 12 years (range, 3 to 21years) and a median Kurtzke Expanded Disability StatusScale score of 4.5 (range, 0.0 to 10.0). The ExpandedDisability Status Scale ranks patients as a function of theirphysical disability. A median score of 4.5 signifies arelatively severe disability but requiring minimal assis-tance. These patients had not received immunomodula-tory drug treatment for 1 year or more at the time ofcirculating nucleic acid (CNA) analysis. Fifteen RRMSpatients with clinically stable disease (relapse-free for atleast 6 months before CNA analysis) are classified for thisstudy as stable MS. These included 12 females and 4males (median age 39 years, range, 25 to 53 years) witha median disease duration of 12.5 years (range, 2–23years) and a median Expanded Disability Status Scalescore of 4.0. The diagnoses of relapsing MS and stableMS were confirmed by brain and spinal cord Gd-MRI.Enhancing lesions (dye-enhanced MRI plaques of acuteinflammation)2 were present in all relapsing MS patients,but no areas of enhancement were seen at the time ofenrollment in patients with stable MS. Serum from appar-ently healthy individuals (n � 50) was used as the controlcohort.5

Sampling

Serum samples were collected, processed within 2hours, and stored at �80°C. Frozen serum was thawed at4°C, and cell debris was removed by brief centrifugationat 4000 � g for 20 minutes. Total nucleic acids wereextracted from the supernatant using the High Pure Nu-cleic Acids Extraction Kit (Roche Diagnostics, Indianap-olis, IN) according to the manufacturer’s instructions.

Generation of Random Circulating Nucleic AcidLibraries

One microliter of the total nucleic acid solution was sub-jected to a random primer DNA amplification protocolusing the GenomePlex WGA4 kit (Sigma-Aldrich, St.Louis, MO) according to the manufacturer’s instructions.

Resulting circulating DNA preparations were molecularbarcoded, pooled, and sequenced using a GS FLX high-throughput sequencer (Roche/454 Life Sciences, Bran-ford, CT) according to the manufacturer’s instructions.Raw sequences were trimmed for the adapters/primersused in library generation.

Sequence Analysis

Repetitive elements were detected using the RepeatMaskerprogram (Smit AFA, Hubley R, Green P. RepeatMaskerOpen 3.0, http://www.repeatmasker.org, last accessedApril 30, 2008), which makes use of Repbase (version12.09; Genetic Information Research Institute, MountainView, CA).3 The genomic origin of the nonrepetitive cir-culating DNA was investigated by local alignment analy-ses using the BLAST program with high stringent param-eters.4 Each sequence was subjected to sequentialBLAST analyses querying databases of known bacterialgenomes, viral genomes, and the human genome (refer-ence genome build 36.2). For each query fragment andeach database search, the highest scoring Blast hit witha length of �40 bp was recorded in a SQL database withthe start and stop positions for query and hit and thedescription of the respective matching subject. With useof the genome annotation file as obtained from NationalCenter for Biotechnology Information (NCBI) the positionof each hit was evaluated, leading to the hit counts andcorresponding hit lengths within annotated genes, pseu-dogenes, the transcribed parts of a gene (denoted asRNA), the coding sequences and untranslated regions(UTRs). The genome annotation file was used to calculatethe expected values for the respective features as per-centages of the total genome length.

Statistics

Nucleotides matching to the different chromosomes werenormalized by all nucleotides matching to the genome.Nucleotides matching to repetitive elements were nor-malized by all nucleotides matching to repetitive ele-ments. A database from 50 apparently healthy humanswas used as a normal control group.5 Comparisons be-tween the healthy and diseased groups were evaluatedusing the C value (index) of receiver operator character-istic (ROC) curves,6 which was calculated as area underthe curve by applying the linear trapezoidal rule. All ele-ments showing at least an area under the curve value of0.65 were considered for further evaluation. Fisher’s ex-act P values were calculated at the point of best separa-tion (as defined by the maximum of the product of sen-sitivity and specificity). Multivariate regression modelswere estimated using the stepwise-in approach. All sta-tistical evaluations were calculated using Microsoft Excel.

Results

We sequenced serum DNA from 28 patients with definiteRRMS using MSA. Thirteen patients were diagnosed as

Serum DNA Motifs in MS 313JMD May 2010, Vol. 12, No. 3

Page 3: Serum DNA Motifs Predict Disease and Clinical Status in Multiple Sclerosis

undergoing clinical relapse. Blood was drawn within 1week of the diagnosis and before any therapy was initi-ated. The remaining 15 patients were in a remission for atleast 6 months before sampling. A total of 2.1 � 105

high-quality sequences and 3.75 � 107 nucleotides weregenerated. The average number of sequences per sam-ple was 7.6 � 103, averaging 178 nucleotides in length.Of all nucleotides obtained from relapsing MS and stableMS patients, 82% produced significant hits on one of thepublic databases, of which �99% aligned to the humangenome. The relative amount of genes, pseudogenes,transcribed, and untranslated regions (annotated as RNAand UTR) and coding sequences was calculated andcompared with the relative amounts observed in the cir-culating DNA pool of control samples obtained from 50apparently healthy individuals.6 The representation ofcoding sequences, genes, RNAs, UTRs, and pseudo-genes in the circulating DNA of RRMS patients was com-parable with the representation for the circulating DNA(P � 0.1) from healthy control subjects.

The relative abundance of serum DNA fragmentsmatching to specific genes was compared between thedifferent patient groups. Comparison of the relapsing MSpatients with the healthy control group yielded fragmentmatches to 20 genes, which are differentially representedand yielded c values of �0.65 in the ROC statistics(Supplemental Table S1, see http://jmd.amjpathol.org).Interestingly, 12 of the 20 genes that yielded a c value�0.65 in the individual analysis are either exclusively orpredominantly expressed in the central nervous system(CNS). The combination of the differential representationof fragments matching to 7 of these genes in a multivar-iate regression (MVR) model is sufficient to separate therelapsing MS patients from the normal control group witha combined c value of 0.98 (P � 0.0001) (Table 1). Fourof these genes are expressed predominantly in the CNS(GRID2, CHL1, CNTNAP5, and NELL2). The remainingthree genes (PLCB1, DTNB, and FHOD3) of the MVR

model are ubiquitous in their tissue expression (Table 2).All of the gene fragments identified in the MVR analysisare underrepresented relative to the normal control sub-jects with the exception of CHL1 and CNTNAP5. Of thegenes that were differentially represented between re-lapsing MS patients and normal control subjects onlyGRID2 was also underrepresented in the stable MS groupcompared with the normal control group. Contrary to therelapsing MS group, comparison of the stable MS groupwith the normal control group yielded only a few geneswith predominantly CNS expression (Table 1, Supple-mental Table S2, see http://jmd.amjpathol.org). Similar tothe relapsing MS group, MVR yielded seven genes thatare sufficient to separate stable MS patients from the normalcontrol group (c value � 0.98; P � 0.0001). Differentiationbetween relapsing MS and stable MS patients (Supplemen-tal Table S3, see http://jmd.amjpathol.org) can be achievedby combining three genes (SATB2, ELAVL2, and CUGBP2)in a MVR model (c value � 1; P � 0.0001). Finally, 18 genesshowed significantly different representation in the serum ofall MSpatients comparedwith the normal control group (Table1, Supplemental Table S4, see http://jmd.amjpathol.org). Amultivariate regression model using the representation ofseven genes results in a C value of 0.98 (P � 0.00001).

The majority of differentially represented genes can begrouped into functional clusters as shown in Table 2. Fourmain functional groups were identified. The first groupconsists of proteins forming the cytoskeleton or cell mor-phology. The second group comprises regulators ofgrowth and differentiation, the majority of which are ex-plicitly expressed in the brain. The third major groupcomprises phosphatases, phospholipases, and kinases.The fourth group consists of receptors and carrier pro-teins involved in nervous system signal transduction. Dif-ferential serum representation of neuronal growth regu-lators is especially pronounced in the relapsing MSpatient group in whom significant differences from the

Table 1. Serum DNA Signatures Predictive of RRMS and Disease Activity Obtained by MVR Analysis

Normal* c statistic Stable c statistic

Genes†

All MS PARN(�), COMMD10(�), TMEM117(�),SPAG17(�), NEB(�), PTPRR(�),TRIO(�)

0.97 (0.85–0.99)

Relapsing MS GRID2(�), CHL1(�), PLCB1(�), DTNB(�),CNTNAP5(�), NELL2(�), FHOD3(�)

0.98 (0.81–0.98) SATB2(�), ELAVL2(�),CUGBP2(�)

1.0 (0.90–1.0)

Stable MS FRMD3(�), PLD5(�), GPR158(�),HPSE2(�), PPM1H(�), ABLIM1(�),RHBDD1(�)

0.98 (0.90–1.0)

Repeats‡

All MS (TG)n(�), AT_rich(�), L1MA4(�),L1M5(�), L1MD(�), (GGTTG)n(�),L1MC(�)

0.98 (0.93–1.0)

Relapsing MS AT_rich(�), L1M5(�), Hs17(�), L1MD(�),Satellite(�), HERVK-int(�), L1MA4(�)

1.0 (0.98–1.0) (CAA)n(�), LTR16B1(�),Satellite(�), MER21-int(�), (TCTCTG)n(�),L1ME4a(�)

1.0 (0.90–1.0)

Stable MS (TG)n(�), AT_rich(�), CER(�), L1MC(�),(TC)n(�), L1MD(�), LTR16A(�)

1.0 (0.98–1.0)

*(�) signifies an excess compared with the normal database; (�) signifies a reduction compared with the normal database.†Derived from Supplemental Tables S1 to S4 (see http://jmd.amjpathol.org).‡Derived from Supplemental Tables S5 to S8 (see http://jmd.amjpathol.org).

314 Beck et alJMD May 2010, Vol. 12, No. 3

Page 4: Serum DNA Motifs Predict Disease and Clinical Status in Multiple Sclerosis

normal group were detected for five genes (MDGA2,CNTNAP5, CHL1, CNTN5, and NAV3) that are involved inaxon guidance, nervous system cell development, andregeneration. Furthermore, we have found differential se-rum representations for the neural epidermal growth fac-tor gene (NELL2) and FAM19A2 related sequences.NELL2 is involved in cell growth regulation and differen-tiation, whereas the FAM19A2 protein possibly acts as aregulator of immune and nervous system cells. Further-more, four neuronal growth factor/regulator genes showedclear differences in representation in the serum of relapsingMS and stable MS patients (Table 2).

MS patient DNA motifs are further distinguished fromthose of normal subjects by alterations in the frequenciesof repetitive elements and fragments derived from spe-cific human chromosomes. An MVR model that uses therepetitive elements, (TG)n, AT_rich, L1MA4, L1M5, L1MD,(GGTTG)n, and L1MC, distinguishes MS patients fromnormal control subjects (Table 1, Supplemental Table S5,see http://jmd.amjpathol.org) with high statistical reliability(c value � 1, P � 0.0001). A second MVR model wascalculated using families of repetitive element DNA se-quences and sequences derived from different chromo-somes. The combination of sequences matching to chro-mosome 17 (Hs17), LINE1 elements belonging to the L1Mfamily, chromosome 4 (Hs4), and the mammalian apparentlong terminal repeat-retrotransposon family (LTR/MaLR)yields a similar highly significant relative risk P value for MSpatients compared with the normal control group. The re-petitive elements, repetitive element families, and chromo-

somes with individual c values of �0.65 are given in Sup-plemental Table S5 (see http://jmd.amjpathol.org). Eight ofthe 16 individual repetitive elements with high c values aremicrosatellites and 6 are different LINE1 elements. All LINEelements are significantly reduced compared with nor-mal control LINE element DNA motifs, whereas (TG)n,(CACAA)n, (CAACC)n, (CCCCAA)n, and (GGTTG)n re-peats are overrepresented.

Differentially represented repetitive elements in MSDNA motifs provided additional evidence that distin-guishes active from stable disease (Table 1). An MVRmodel using the repetitive elements (CAA)n, LTR16B1,satellite, MER21, (TCTCTG)n, and L1ME4a yielded a cvalue of 1 (P � 0.0001) between the two groups (Table 1,Supplemental Table S6, see http://jmd.amjpathol.org).Stable MS patients can be distinguished from healthycontrol subjects by combining the down-regulation ofL1MC, L1MD, and AT_rich small repeats, and up-regu-lation of CER, (TG)n, and LTR16A sequences (P �0.0001) (Table 1, Supplemental Table S7, see http://jmd.amjpathol.org. An MVR model using the single repet-itive elements AT_rich, L1M5, Hs17, L1MD, satellite,HervK-int (endogenous retrovirus), L1MA4, and chromo-some-specific nonrepetitive elements of chromosomeHs17 provided highly significant differences (c value � 1,P � 0.0001) that distinguish relapsing MS patients fromhealthy control subjects (Table 1, Supplemental TableS8, see http://jmd.amjpathol.org). An MVR model usingthe repetitive element families L1M, satellite, and DNA/hAT and nonrepetitive sequences derived from chromo-

Table 2. Gene Functions of Dysregulated CNAs from Patients with RRMS Versus Healthy Control Subjects

Relapsing MSversus normal

Stable MS versusnormal

Relapsing MS versusnormal

Relapsing MSversus stable MS Total

Cytoskeleton/cell morphology NEBL, DTNB,KLHL1, FHOD3,ARHGAP15

ABLIM1, FRMD3,FARP1, CLASP1

SNTG2, SPAG17, NEB,SGCZ

NEBL, FHOD3,UBR4, FARP1,SORBS1

18

Neuronal growthfactors/growth regulators/differentiation regulators

NELL2, MDGA2,CHL1, CNTN5,CNTNAP5,NAV3,FAM19A2

SDK1 TRIO, PDGFD, PTPRR CNTNAP5, CHL1,ELAVL2, NEGR1

15

Phosphatases/phospholipases/kinases

DGKI, MAST4,ITPK1, PLCB1

PPM1H, PLD5, KSR2 ITPK1, PRKCB1,KSR2, PLD5

11

Nervous signal transduction GRID2, GRIA1 GRID2, SLC44A5 GRID2, STXBP6 6Ion channels/ion transporters/

ion channel interactingNKAIN3 KCNMA1 SFXN5 PDZD2 4

RNA processing SRPK2, PARN CUGBP2 3Polysaccharide metabolism HPSE CHSY2 2G protein-coupled receptors GPR158 GPR158 2Apoptosis WWOX, ELMO1 2Cell cycle regulation MCC 1Proteases RHBDD1 1Inflammation COMMD10 1Lipid rafts RFTN1 1Unclassified MDS1 PLEKHA5, SPATA5,

TMEM117CUBN, SATB2,

CCDC60,SERGEF, GSG1L,MDS1, MUC16,EBF1,TBC1D22A,SSBP2

14

Total 20 15 18 28 81

Genes with exclusive or predominant expression in nervous tissues are in bold; genes included in MVR(C) are underlined.

Serum DNA Motifs in MS 315JMD May 2010, Vol. 12, No. 3

Page 5: Serum DNA Motifs Predict Disease and Clinical Status in Multiple Sclerosis

somes Hs17 and Hs4 yielded a similar highly significantdifferentiation (Supplemental Table S8, see http://jmd.amjpathol.org). The differentially represented Hs17-spe-cific DNA motifs are derived from three hot spots of

overrepresentation (Figure 1) that occur at approximatenucleotide positions 2.5 � 107, 5.3 � 107, and 7.0 � 107.

Our results demonstrate that serum DNA can beshown to have profiles that are distinct from those ofgenomic DNA when MSA procedures are used. Figure 2,A–D, illustrates ROC curves for acute and stable RRMSpatients as a function of differential frequencies of codinggenes and repeat elements versus those in normal pa-tients. These profiles are highly statistically different in theserum DNA from MS patients compared with serum DNAprofiles of healthy normal subjects. Relative frequenciesof genes and repetitive elements can be used to distin-guish MS patients from normal control subjects by serum-based analysis as well as distinguish disease activity. Aknown MS locus is present on Hs17.7 In our analysisHs17 is significantly overrepresented as a source of DNAmotifs in relapsing MS patients versus control subjectswith three distinct chromosomal hot spots. These resultsconfirm the value of MSA technologies to detect informa-tive serum DNA biomarkers that we have reported re-cently in experimental neurodegenerative diseases ofcattle8,9 and elk.9

Discussion

Progressive MS is a physically devastating human demy-elinating disease of the CNS expressed in a complexgenetic and immunological milieu and is widely consid-ered to be a disease of autoimmunity against the com-ponents of myelin.10–12 Autoimmunity to myelin compo-

Figure 1. Relative representation of DNA sequences assignable to homosapiens chromosome 17 as a function of distribution over the chromosome,based on a 0.5-kb window. The green line shows distribution in 50 normalsubjects, and the red line represents patients with relapsing MS. The bluehorizontal line depicts the border of significance adopted for patients versusnormal control subjects, which is equivalent to the mean � 4.6 SD (equalsP � 0.01 after Bonferroni correction). The blue box indicates the genomicregion linked to MS.4

Figure 2. MVR index distribution (inset) andresulting ROC curves of chromosome (Hs) andrepetitive element (RE) representation as well asintragenic representation in serum DNA fromRRMS patients versus normal healthy control se-rum DNA. A: Representation of Hs and RE inrelapsing MS patients versus normal control sub-jects. B: Representation of Hs and RE in stableMS patients versus normal control subjects. C:Representation of genes in relapsing MS patientsversus normal control subjects. D: Representa-tion of genes in stable MS patients versus normalcontrol subjects. Main figures (ROC curves): theblack line is the actual ROC curve, the solid redlines indicate the 95% confidence interval (2.5 to97.5% limits) of the ROC curves, and the dashedline represents a line of no separation (c value �0.5). Data insets: Red dots depict either relaps-ing MS or stable MS values, green dots are nor-mal control values; the solid line represents thegroup median.

316 Beck et alJMD May 2010, Vol. 12, No. 3

Page 6: Serum DNA Motifs Predict Disease and Clinical Status in Multiple Sclerosis

nents can be demonstrated in patients with MS andexperimentally induced murine models (experimentalautoimmune encephalomyelitis). Immune responsesagainst these components provide supportive experi-mental evidence for this hypothesis. Alternatively, a vari-ety of intracellular pathogens have been proposed asetiological agents and include human herpesvirus-6,13

endogenous retroviruses,14 and Chlamydophila pneu-moniae 15,16. Intrathecal immunoglobulins observed asoligoclonal bands in the cerebrospinal fluid are mostcommonly observed in chronic infections of the CNS.17

Antibodies to C. pneumoniae as components of the oli-goclonal bands in MS18 provide further supportive evi-dence for the role of this pathogen in MS. Molecularmimicry to antigens of C. pneumoniae19,20 has been dem-onstrated in murine models of MS. The severity of exper-imental autoimmune encephalomyelitis is increasedwith systemic infection with C. pneumoniae that dissemi-nates to the CNS.19 In the rat experimental autoimmuneencephalomyelitis model immunization with a 20-merprotein homologous with a C. pneumoniae protein thatshares a critical seven-amino acid motif with myelin basicprotein results in a severe experimental autoimmune en-cephalomyelitis response.20 Our mass sequence analy-sis of serum DNA provides additional evidence of aninfectious process. The ecotropic viral integration site,EVI1, is predictive of the progressive spongiform enceph-alopathy observed in chronic wasting disease of elk.9

Other endogenous retroviral elements have been simi-larly observed in bovine spongiform encephalopathy.8,9

EVI1 is recognized also as the CCR7 chemokine receptorgene. The EVI1/CCR7 gene is important in that humanendogenous retrovirus (HERV) peptides in the expres-sion of type 1 cytokine profiles in vitro of peripheral bloodmononuclear cells from patients in relapsing phases ofMS but not from stable MS patients have been report-ed.21 Moreover, LTR16B1 of ERV3 was significantly de-creased in serum of patients with relapsing MS. Down-

regulation of the MS-associated retrovirus has similarlybeen positively associated in 11 patients with response totreatment with recombinant interferon-�.22 Our MSA andcomparative analysis of serum DNA has expanded theobservation of HERV sequences in MS to include theunderexpression of AT_rich repeats and LINE elementsL1M5, L1MA4, L1MD, and L1MC compared with the nor-mal database. An overexpression of (TG)n and (GGTTG)nwas found in the all-inclusive MS group compared withthe normal database.

An MS susceptibility locus occurs on human chromo-some 17q24 flanked by long segmental, highly homolo-gous, intrachromosomal duplications.7 We have ob-served DNA fragments traceable to three hot spots on Hs17 in MS patients. Cell-free DNA in the blood is associ-ated with histones (nucleosomes) protected from nucle-ase digestion.23 Epigenetic modifications such as DNAmethylation as well as acetylation and methylation ofhistones result in nucleosome positioning alterations andchromatin order.24 Extensive hypomethylation of LINEelements and endogenous and endogenous retroviruseshave been reported in several cancers.25 The duplica-tions flanking chromosome 17q24 may affect the biolog-ical activity of the regional genes.7 The observed differ-ential frequencies of serum repetitive element DNA in theblood of MS patients versus healthy controls may simi-larly be a reflection of DNA plasticity. Alternatively, differ-ential repetitive element representation in the CNA poolmay originate from different processes by which DNA isreleased from cells. Several forms of cell death other thanapoptosis (eg, necrosis, autophagy, or mitotic catastro-phe) as well as an active release of newly synthesizedDNA have been hypothesized as possible sources ofblood CNA.

L1 elements (LINES) and HERV encode reverse tran-scriptase that provides an RNA intermediate for newchromosomal integration sites. L1s are the most activeautonomous elements of the human genome26 and are

Table 3. Prevalence of Repetitive Circulating DNA in Multiple Sclerosis Patients Versus Healthy Control Subjects

Repetitive element

Statistically significant MS dysregulation versus healthy CNA

MS P AMS P SMS P Comments

LINE All LINE elements decreased in MS regardless of diseaseactivityL1MA4 2 0.0003 2 0.0021

L1M5 2 �0.0001 2 �0.0001L1MC 2 0.0004 2 0.0002L1MD 2 �0.0001 2 0.0008 2 0.0008

Small repeats Mixed levels of the small repeat repetitive elementscharacterize disease activity; decrease of AT richversus healthy controls equivalent in both relapsing andremitting MS

(TG)n 1 �0.0001 1 �0.0001(TC)n 1 0.0025AT_rich 2 �0.0001 2 �0.0001 2 �0.0001(GGTTG)n 1 �0.0001

LTRs LTR elements clearly distinguish between relapsing andremitting MSCER 1 0.0007

LTR16A 1 0.0084HervK-int 2 0.0015

Satellite 2 0.0009 Decreased levels during acute disease activityMVR analysis 0.0001 0.0001 0.0001 All MVR analyses using the indicated repetitive elements

are highly significant

Repetitive elements were used in the predictive algorithms for each category (all MS, relapsing MS, and stable MS). Complete quantitative data onall repetitive data and MVR formulas are available in Supplemental Tables S5 to S8 (see http://jmd.amjpathol.org).

AMS, relapsing (acute) MS; SMS, remitting (stable) MS.

Serum DNA Motifs in MS 317JMD May 2010, Vol. 12, No. 3

Page 7: Serum DNA Motifs Predict Disease and Clinical Status in Multiple Sclerosis

estimated to be present in �500,000 copies.27 Becauseof 5� truncation-associated RT dissociation from its RNAtemplate and various disruptive mutations, only 30 to 60LINES are active transposing elements. Based on thesimilarity of LINE-generated pseudogenes and theHERV-W family of retroviruses, it has been postulated thatthey are derived from LINE-mediated retrotranspositionof retroviral mRNA.28 Nevertheless, the number of com-plete pseudogenes is greater than expected.29 A signif-icant question arises from the results of our total se-quencing approach of serum DNAs: What role do L1retroelements and microsatellite repeats play in MS? Theimportance of this question is related to whether or notthese repeat elements should be considered for pharma-ceutical targeting because L1 retrotransposons havebeen demonstrated to be regulatory-sensitive to smallinterfering RNAs.30 Further, these newly identified bio-markers might have some role in the dynamic equilibriumof autoreactive T lymphocytes that play a pivotal role inthe prevention of autoimmune diseases such as MS.31

Gene expression profiles derived from peripheralblood mononuclear cells have provided evidence that asmall number of genes are down-regulated in MS,32 sim-ilar to the reduction in both genes and repetitive elementcirculating DNA that is observed in the present study,especially in the relapsing disease stage. Translation ofthe observed microchip data to a complex ratio of geneexpression profiles in peripheral blood mononuclear cellsusing quantitative RT-PCR has demonstrated the poten-tial utility in clinical diagnosis.33 On the basis of the datapresented here, it should be possible to develop aninexpensive serum-based quantitative multiplex PCR as-say to diagnose MS without the significant expense ofmRNA preservation in peripheral blood mononuclearcells and a reverse transcription step. In addition, ourmass sequencing data allow a simple laboratory analysisof disease activity. Our ROC MVR analysis suggests thatthe repetitive elements will yield the best separation oftrue positive results from the normal population with pre-dicted extraordinary sensitivity and specificity (Figure 2).Table 3 summarizes the various repetitive elements thatdistinguish RRMS and its clinical activity from normalindividuals. Current list prices for hospital-based Gd-MRIaverage approximately $3500 (https://www.mymedicalcosts.com/Pricing.aspx, last accessed October 18, 2009). Withthe ever-increasing efforts to reduce the costs of medicalcare, serum-based assays (estimated to be $100 to $250)with appropriate validation could replace Gd-MRI as theprimary standard for diagnosis as well as disease monitor-ing as a function of drug therapy. The data further suggestthat circulating nucleic acids in MS may provide additionaltools to address pathogenesis.

Acknowledgments

We thank Sara Henneke, Stefan Balzer, and CarstenMuller for their skillful assistance, and Sascha Glinka andBirgit Ottenwalder at Eurofins Medigenomix GmbH for per-forming the GS FLX/454 sequencing. Access to the MS

serum bank maintained by the Don C. Gnocchi ONLUSFoundation IRCCS is gratefully acknowledged.

References

1. Birnbaum G: Making the diagnosis of multiple sclerosis. Adv Neurol2006, 98:111–124

2. Traboulsee A, Li DK: The role of MRI in the diagnosis of multiplesclerosis. Adv Neurol 2006, 98:125–146

3. Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, WalichiewiczJ: Repbase ;Update, a database of eukaryotic repetitive elements.Cytogenet Genome Res 2005, 110:462–467

4. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W,Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation ofprotein database search programs. Nucleic Acids Res 1997,25:3389–3402

5. Beck J, Urnovitz HB, Riggert J, Clerici M, Schutz E: Profile of thecirculating DNA in healthy individuals. Clin Chem 2009, 55:730–738

6. Kestler HA: ROC with confidence—a Perl program for receiver oper-ator characteristic curves. Comput Methods Programs Biomed 2001,64:133–136

7. Chen DC, Saarela J, Clark RA, Miettinen T, Chi A, Eichler EE, PeltonenL, Palotie A: Segmental duplications flank the multiple sclerosis locuson chromosome 17q. Genome Res 2004, 14:1483–1492

8. Beck J, Urnovitz H, Grochup M, Ziegler U, Brenig B, Schutz E. Serumnucleic acids in an experimental bovine transmissible spongiformencephalopathy model. Zoonoses Public Health, 2009, 56:384–390

9. Gordon PM, Schutz E, Beck J, Urnovitz HB, Graham C, Clark R,Dudas S, Czub S, Sensen M, Brenig B, Groschup MH, Church RB,Sensen CW: Disease-specific motifs can be identified in circulatingnucleic acids from live elk and cattle infected with transmissiblespongiform encephalopathies. Nucleic Acids Res 2009, 37:550–556

10. Noseworthy JH, Lucchinetti C, Rodriguez M, Weinshenker BG: Mul-tiple sclerosis. N Engl J Med 2000, 343:938–952

11. Hartung HP, Kieseier BC, Hemmer B: Purely systemically activeanti-inflammatory treatments are adequate to control multiple sclero-sis. J Neurol 2005, 252(Suppl 5):v30–v37

12. Trapp BD, Nave KA: Multiple sclerosis: an immune or neurodegen-erative disorder? Annu Rev Neurosci 2008, 31:247–269

13. Fotheringham J, Jacobson S: Human herpesvirus 6 and multiplesclerosis: potential mechanisms for virus-induced disease. Herpes2005, 12:4–9

14. Rolland A, Jouvin-Marche E, Viret C, Faure M, Perron H, Marche PN:The envelope protein of a human endogenous retrovirus-W familyactivates innate immunity through CD14/TLR4 and promotes Th1-likeresponses. J Immunol 2006, 176:7636–7644

15. Sriram S, Stratton CW, Yao S, Tharp A, Ding L, Bannan JD, MitchellWM: Chlamydia pneumoniae infection of the central nervous systemin multiple sclerosis. Ann Neurol 1999, 46:6–14

16. Tang YW, Sriram S, Li H, Yao SY, Meng S, Mitchell WM, Stratton CW.Qualitative and quantitative detection of Chlamydophila pneumoniaeDNA in cerebrospinal fluid from multiple sclerosis patients and con-trols. PLoS ONE 2009, 4:e5200

17. Stratton CW, Wheldon DB: Multiple sclerosis: an infectious syn-drome involving Chlamydophila pneumoniae. Trends Microbiol2006, 14:474–479

18. Yao SY, Stratton CW, Mitchell WM, Sriram S: CSF oligoclonal bands inMS include antibodies against Chlamydophila antigens. Neurology2001, 56:1168–1176

19. Du C, Yao SY, Ljunggren-Rose A, Sriram S: Chlamydia pneumoniaeinfection of the central nervous system worsens experimental allergicencephalitis. J Exp Med 2002, 196:1639–1644

20. Lenz DC, Lu L, Conant SB, Wolf NA, Gerard HC, Whittum-Hudson JA,Hudson AP, Swanborg RH: A Chlamydia pneumoniae-specific pep-tide induces experimental autoimmune encephalomyelitis in rats.J Immunol 2001, 167:1803–1808

21. Clerici M, Fusi ML, Caputo D, Guerini FR, Trabattoni D, Salvaggio A,Cazzullo CL, Arienti D, Villa ML, Urnovitz HB, Ferrante P: Immuneresponses to antigens of human endogenous retroviruses in pa-tients with acute or stable multiple sclerosis. J Neuroimmunol1999, 99:173–182

318 Beck et alJMD May 2010, Vol. 12, No. 3

Page 8: Serum DNA Motifs Predict Disease and Clinical Status in Multiple Sclerosis

22. Mameli G, Serra C, Astone V, Castellazzi M, Poddighe L, Fainardi E,Neri W, Granieri E, Dolei A: Inhibition of multiple-sclerosis-associatedretrovirus as biomarker of interferon therapy. J Neurovirol 2008,14:73–77

23. Holdenrieder S, Nagel D, Schalhorn A, Heinemann V, Wilkowski R,von Pawel J, Raith H, Feldmann K, Kremer AE, Muller S, Geiger S,Hamann GF, Seidel D, Stieber P: Clinical relevance of circulatingnucleosomes in cancer. Ann NY Acad Sci 2008, 1137:180–189

24. Ballestar E, Esteller M: The impact of chromatin in human cancer:linking DNA methylation to gene silencing. Carcinogenesis 2002,23:1103–1109

25. Ehrlich M: DNA methylation in cancer: too much, but also too little.Oncogene 2002, 21:5400–5413

26. Kazazian HH Jr, Moran JV: The impact of L1 retrotransposons on thehuman genome. Nat Genet 1998, 19:19–24

27. International Human Genome Sequencing Consortium: Initial sequenc-ing and analysis of the human genome. Nature 2001, 409:860–921

28. Pavlícek A, Paces J, Elleder D, Hejnar J: Processed pseudogenesof human endogenous retroviruses generated by LINEs: their

integration, stability, and distribution. Genome Res 2002, 12:391–399

29. Pavlícek A, Paces J, Zíka R, Hejnar J: Length distribution of longinterspersed nucleotide elements (LINEs) and processed pseudo-genes of human endogenous retroviruses: implications for retrotrans-position and pseudogene detection. Gene 2002, 300:189–194

30. Yang N, Kazazian HH Jr. L1 retrotransposition is suppressed byendogenously encoded small interfering RNAs in human culturedcells. Nat Struct Mol Biol. 2006, 13:763–771

31. Saresella M, Marventano I, Guerini FR, Zanzottera M, Delbue S,Marchioni E, Maserati R, Longhi R, Ferrante P, Clerici M: Myelin basicprotein-specific T lymphocytes proliferation and programmed celldeath in demyelinating diseases. Clin Immunol 2008, 129:509–517

32. Maas K, Chan S, Parker J, Slater A, Moore J, Olsen N, Aune TM:Cutting edge: molecular portrait of human autoimmune disease. J Im-munol 2002, 169:5–9

33. Fossey SC, Vnencak-Jones CL, Olsen NJ, Sriram S, Garrison G, DengX, Crooke PS 3rd, Aune TM: Identification of molecular biomarkers formultiple sclerosis. J Mol Diagn 2007, 9:197–204

Serum DNA Motifs in MS 319JMD May 2010, Vol. 12, No. 3