Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy


Inherited cardiomyopathy is genetically diverse and has been linked to mutations that are rare in the general population. Genetic testing for cardiomyopathy is a useful adjunct for diagnosis because it can provide prognostic information for individuals and families. Specific gene mutations may suggest a greater risk of arrhythmias or a rapid course, and, importantly, gene-positive individuals with early signs of cardiomyopathy may benefit from early treatment.1–4 Currently, clinical genetic testing for cardiomyopathy relies on screening the coding region of multiple genes simultaneously as a gene panel. The first gene panel for cardiomyopathy, introduced in 2007, sampled only 5 genes, whereas current panels assess >50 different genes.5 Concomitant with panel expansion, sensitivity for mutation identification has increased.

Clinical Perspective on p 759

Massively parallel, next generation sequencing is now being transitioned into the clinical arena.6 Options include whole exome sequencing (WES) and whole genome sequencing (WGS). WES interrogates only the coding sequence and relies on an exon capture step. This capture step is limited by oligonucleotide design and may be incomplete because of uneven exon capture caused by GC bias, off-target sequencing, and omission of noncanonical transcripts, all especially important in the heart.7,8 WES arrays are regularly updated to reflect changes in the annotation of the coding region of the genome.9 Like panel-based testing, WES may necessitate recapture and resequencing as genome annotation continues. Currently, WES is less expensive than WGS, but that is rapidly changing. High-throughput sequencing technology is now available that can produce high coverage (30×) genomes for $1000 (http://www.illumina.com/systems/hiseq-x-sequencing-system.ilmn). This progression in sequencing technology narrows the cost gap between targeted sequencing and whole genome approaches, making whole genome sequencing a viable alternative to panel sequencing.

The declining cost of WGS and the 100-fold increase in genome coverage make it a viable alternative to genetic panels and WES for genetic testing of cardiomyopathy. To transition WGS into a useful tool for diagnosing cardiomyopathy

Background—Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of >50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift toward comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy.

Methods and Results—Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1 to 14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and segregation analysis, where available. Three of 3 previously identified primary mutations were detected by this analysis. In 6 subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and had additional pathological correlation to provide evidence for causality. For 2 subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects.

Conclusions—These pilot data demonstrate that ≈30 to 40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes. (Circ Cardiovasc Genet. 2014;7:751-759.)

Key Words: cardiomyopathies ◼ genetics ◼ genomics ◼ humans

© 2014 American Heart Association, Inc.

Circ Cardiovasc Genet is available at http://circgenetics.ahajournals.org DOI: 10.1161/CIRCGENETICS.113.000578

Received May 17, 2013; accepted July 9, 2014.

From the Department of Medicine (J.R.G., M.J.P., L.D.-C., J.P.F., P.P., E.M.M.), Department of Human Genetics (V.N., E.M.M.), and Department of Pathology (P.P.), The Computation Institute (L.P.P.), The University of Chicago and Argonne National Laboratories, Chicago, IL.

*Drs Golbus and Puckelwartz contributed equally to this work.

The Data Supplement is available at http://circgenetics.ahajournals.org/lookup/suppl/doi:10.1161/CIRCGENETICS.113.000578/-/DC1.

Correspondence to Elizabeth McNally, Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, 303 E. Superior St. 7-123, Chicago, IL 60611. E-mail [email protected]

Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy

Jessica R. Golbus, MD*; Megan J. Puckelwartz, PhD*; Lisa Dellefave-Castillo, MS; John P. Fahrenbach, PhD; Viswateja Nelakuditi, MS; Lorenzo L. Pesce, PhD; Peter Pytel, MD; Elizabeth M. McNally, MD, PhD

Original Article

Downloaded from http://circgenetics.ahajournals.org/ by guest on April 29, 2016

752 Circ Cardiovasc Genet December 2014

requires development of an analytic approach that permits detection of rare mutations. As a first step for cardiomyopathy WGS analysis, we created a super gene set with 204 known and putative cardiomyopathy genes. This super gene set exceeds the number of genes on most clinically available gene panels for cardiomyopathy because it was extended based on association with cardiomyopathy in humans and animal-based modeling. WGS data were filtered through this super gene set to determine the feasibility of using WGS as a reliable screening method for cardiomyopathy variants. To test the sensitivity of WGS combined with this analytic approach, we applied it to 11 unrelated subjects with dilated cardiomyopathy (DCM). The pathogenic or likely pathogenic variants were identified in 9 of 11 subjects (82% detection rate). These data demonstrate that ≈30 to 40× coverage WGS is a reliable alternative to panel-based testing for cardiomyopathy. Furthermore, for those individuals in whom the super gene set was unrevealing, the remaining genome sequence is available for immediate further interrogation, instead of waiting for additional panel-based testing, justifying the extra cost and time associated with WGS.

Methods

Study Subjects

Eleven unrelated subjects with nonischemic DCM were selected for WGS. Personal and family history of cardiomyopathy were available for all subjects. The study was approved by the University of Chicago Institutional Review Board. All study subjects provided written informed consent. Genetic counseling for WGS was provided.

Generation of Whole Genome Sequence Data

Genomic DNA was extracted from the peripheral blood of 10 study subjects and the explanted heart tissue of 1 subject. Reversible terminator massively parallel sequencing was performed by Illumina (San Diego, CA) on the HiSeq2000. Paired end reads were mapped to the National Center for Biotechnology Information reference genome 37.1 (hg19) and variants were called using Illumina’s proprietary software (ELAND/CASAVA).

Myopathy Super Gene Set

The myopathy gene set comprises 204 genes (245 transcripts) identified by published association with cardiac or skeletal myopathies or cardiac arrhythmias as a single gene disorder, association-based study, or in animal models. In genes with multiple transcripts, those transcripts with the highest expression in cardiac and skeletal muscle were included in the gene set. For all transcripts in the gene set, exonic boundaries ±10 base pairs (bps) were downloaded from the Ensembl Genome Browser (http://useast.ensembl.org/index.html).
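As an illustration of how a super gene set with ±10 bp flanks can act as a positional filter, the following is a minimal sketch, not the authors' implementation. The `Interval` type, the function names, and the coordinates are hypothetical; real exon boundaries would come from an Ensembl export.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive


def pad_exons(exons: List[Interval], flank: int = 10) -> List[Interval]:
    """Extend every exon by `flank` bp on both sides (the paper uses 10 bp)."""
    return [Interval(e.chrom, max(1, e.start - flank), e.end + flank) for e in exons]


def in_target(chrom: str, pos: int, targets: List[Interval]) -> bool:
    """Return True when a variant position falls inside any padded exon."""
    return any(t.chrom == chrom and t.start <= pos <= t.end for t in targets)


# Hypothetical exon coordinates, for illustration only.
exons = [Interval("chr2", 1000, 1100), Interval("chr2", 2000, 2050)]
targets = pad_exons(exons)
```

A linear scan is adequate for a few hundred transcripts; a genome-scale filter would more likely use a sorted index or an interval tree.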

Analysis of Protein Coding Single Nucleotide Variants

The effect of single nucleotide variants (SNVs) was determined using SeattleSeq Annotation 134 (http://snp.gs.washington.edu/SeattleSeqAnnotation134/). Variants were analyzed using PolyPhen-2 (PP2), Sorting Tolerant From Intolerant (SIFT), PhastCons, Genomic Evolutionary Rate Profiling (GERP), Protein analysis through evolutionary relationships (Panther), and ConSeq.10–15 Frequency was assessed using 3 publicly available databases: the March 2012 Integrated Phase 1 release of the 1000 Genomes Project,16 the National Heart, Lung, and Blood Institute Exome Sequencing Project (ESP 5400) (http://evs.gs.washington.edu/EVS/), and dbSNP 135/136 (http://www.ncbi.nlm.nih.gov/projects/SNP/). A minor allele frequency of ≤0.01 was used to restrict variants.
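The frequency restriction can be sketched as follows. The text does not say how the three databases were combined, so this sketch assumes a variant is kept only if its minor allele frequency is ≤0.01 in every database that reports it, and that absence from a database counts as rare. The function name and dictionary keys are hypothetical.

```python
from typing import Dict, Optional

MAF_CUTOFF = 0.01  # threshold stated in the Methods


def is_rare(freqs: Dict[str, Optional[float]], cutoff: float = MAF_CUTOFF) -> bool:
    """freqs maps a database name to a minor allele frequency, or None if absent.

    Assumption: a variant passes only if no database reports a frequency
    above the cutoff; a missing entry is treated as rare.
    """
    return all(f is None or f <= cutoff for f in freqs.values())


# Hypothetical lookups, for illustration only.
rare_example = {"1000G": None, "ESP5400": 0.0004, "dbSNP": None}
common_example = {"1000G": 0.05, "ESP5400": 0.04, "dbSNP": 0.05}
```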

Prioritization of Intronic SNVs

Variants were analyzed by MaxEntScan using publicly available perl scripts.17 Retained variants were then analyzed for frequency using the March 2012 Integrated Phase 1 release of the 1000 Genomes Project16 and dbSNP 135/136 (http://www.ncbi.nlm.nih.gov/projects/SNP/).

Identification and Analysis of Insertion/Deletion Polymorphisms

Insertions/deletions (Indels) were annotated using the Variant Effect Predictor (http://useast.ensembl.org/index.html). Indels in the coding sequence or at a splice junction were scored for frequency using the 1000 Genomes Project.16

Desmin Expression

Full-length human desmin cDNA (Origene) was mutated. Wild-type and mutant plasmid was placed into pcDNA3.1/His C (Invitrogen) and transfected into C2C12 cells (ATCC) grown in DMEM supplemented with 10% fetal bovine serum and 1% penicillin/streptomycin in a 10% CO2 incubator at 37°C. After 48 hours, cells were fixed in 100% methanol (−20°C) and stained as described using an anti-Xpress Antibody (Invitrogen). Results were imaged as described.18

Results

Eleven unrelated subjects with nonischemic cardiomyopathy (Table 1) were selected for WGS and targeted analysis. Nine of the 11 subjects or an affected family member had undergone previous panel-based clinical genetic testing (Table 1). The number of genes assessed by this previous testing varied with the year in which the subject had undergone testing. In 3 of the 9 subjects who had undergone clinical testing, a pathogenic mutation was found via clinical testing (Table 1). WGS was performed in these families to confirm that WGS is sensitive enough to detect known mutations. Also, 2 of the families (DCM-AAB and DCM-BI) exhibited marked phenotypic variability between generations. In these families the probands required heart transplant in the second decade of life, whereas other family members, carrying the clinically identified mutation, were only mildly affected into the fourth decade of life. These data suggested additional genetic modifiers. For 6 of 9 subjects previous clinical genetic testing was unrevealing. WGS was completed with an average coverage of 37.1-fold. On average, 113.4 GB of data per individual passed filter (Q≥20) and aligned to the reference genome, covering 97.9% of the non-N reference genome (Table I in the online-only Data Supplement). We restricted our analysis to a super gene set that included 204 genes (245 transcripts) previously associated with Mendelian or non-Mendelian forms of cardiomyopathy, skeletal myopathies, or cardiac arrhythmias as demonstrated in either humans or animal models (Table II in the online-only Data Supplement).

Analysis of SNVs

WGS identified an average of 3.7 million SNVs per individual with a greater number of SNVs identified in the 2 non-white individuals (4.0 million). Each genome had ≈11 586 nonsynonymous SNVs. Restricting analysis to the super gene set reduced the average number of missense SNVs to 167 per individual (Table III in the online-only Data Supplement). Missense SNVs were filtered using a combination of algorithms that predict effect based on conservation and structure.


Golbus et al Whole Genome Sequencing in Cardiomyopathy 753

SNVs were identified as rare based on their frequency in the population at large (Figure 1). For TTN, only truncating variants were considered as described by Herman et al,19 as is the standard practice for clinical genetic testing. Missense variants in TTN were not considered because these variants are highly prevalent in the general population and vastly exceed the frequency of cardiomyopathy, making them difficult to interpret at this time.20 Variants common to multiple individuals in the sequencing cohort and absent from frequency databases were discarded as they represent sequencing or aligning artifacts. This analysis pipeline reduced the number of potentially damaging missense variants to 0 to 11 per individual (Table IV in the online-only Data Supplement). Variants were confirmed using Sanger sequencing. We tested the sensitivity of this pipeline using 100 independent mutations reported in inherited cardiomyopathy and found it to be 91% sensitive (Table V in the online-only Data Supplement). We also detected the one known pathogenic missense mutation previously identified in the cohort in subject DCM-AAB03 (TPM1 D230N).
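The artifact filter described above (variants shared by multiple cohort members yet absent from frequency databases) might be sketched like this. The threshold of 2 carriers is an assumption, since the text says only "multiple"; the variant keys, subject labels, and function name are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def drop_cohort_artifacts(
    calls: Dict[str, List[str]],
    known_in_databases: Set[str],
    min_carriers: int = 2,  # assumed meaning of "multiple individuals"
) -> Dict[str, List[str]]:
    """Remove variants seen in >= min_carriers subjects but in no frequency database."""
    carriers = Counter(v for variants in calls.values() for v in set(variants))
    return {
        subject: [
            v for v in variants
            if not (carriers[v] >= min_carriers and v not in known_in_databases)
        ]
        for subject, variants in calls.items()
    }


# Hypothetical calls, for illustration only.
calls = {
    "S1": ["chr1:100A>G", "chr2:5C>T"],
    "S2": ["chr1:100A>G", "chr3:7G>A"],
    "S3": ["chr2:5C>T"],
}
known = {"chr2:5C>T"}
filtered = drop_cohort_artifacts(calls, known)
```

Here `chr1:100A>G` is shared by two subjects and unknown to the databases, so it is discarded, while the shared but catalogued `chr2:5C>T` is kept.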

To assess variants positioned to alter splicing, SNVs within 10 bps of exon boundaries were filtered through maximum entropy and evaluated for frequency in the population at large (Figure 1). MaxENT estimates the strength of the 5′ and 3′ splice junctions using a maximum entropy model.17 An average of 50 intronic SNVs (range, 44–63) was identified in each individual using the super gene set. Filtering with MaxENT reduced this list to ≤4 per individual. By including only rare variants, the number of splice site altering SNVs was reduced to 0 to 2 per individual (Table IV in the online-only Data Supplement). Variants were confirmed using Sanger sequencing. We tested the sensitivity of this approach using a control data set of 25 known splice site altering mutations and found it to be 88% sensitive (Table V in the online-only Data Supplement). This approach also detected the single known splice variant within the cohort in MDC-01 (DES c.735+3 A>G); this variant was previously shown to disrupt splicing.21,22 Combining the analyses for missense, nonsense, and splicing variants produced 1 to 13 potentially pathogenic SNVs per individual (Table IV in the online-only Data Supplement). All variants that passed pipeline criteria were then manually curated based on the specific phenotype of the proband. For example, variants in genes that are usually associated with muscle involvement were ranked

Table 1. Clinical Features of Subjects

Subject | Age at Dx | Clinical Features | Age at Death (D) or Transplant (T) | No. of Affected Family Members | Previous Genes Tested
DCM-O1 | 12 | Heart block; pacemaker age 12 | 32 (D) | 7 (5 DCM, 1 SCD, and 1 arrhythmia) | ACTC, LDB3/ZASP, LMNA, MYBPC3, MYH7, PLN, TAZ, TNNI3, TNNT2, and TPM1 (affected relative tested)
DCM-AAB03 | 20 | VT; BiV ICD | 20 (T) | 7 (DCM) | LMNA, MYBPC3, MYH7, TNNI3, TNNT2, and TPM1; *TPM1 D230N
DCM-AAW02 | 33 | Postpartum CMP with EF of 20%; VT; ICD | ... | 4 (DCM) | ACTC, GLA, LAMP2, MYBPC3, MYH7, MYL3, MYL2, PRKAG2, TNNT2, TNNI3, TNNC1, TPM1, and LMNA
DCM-AAY02 | 57 | LAF and LIV conduction delay | ... | 2 (DCM) | ABCC9, ACTN2, CSRP3, CTF1, DES, EMD, LDB3, LMNA, MYBPC3, MYH7, PLN, SGCD, TAZ, TCAP, TNNI3, TPM1, and VCL
LGMD-AH01 | 30s | Dual chamber ICD+LGMD | ... | 3 (LGMD) | LMNA, LDB3, TNNT2, DES, SGCD, PLN, ACTC1, MYH7, TPM1, TNNI3, TAZ, TTR, MYBPC3, LAMP2, and cardiomyopathy mitochondrial genes
DCM-Q14 | 21 | ... | Third decade (D) | 12 (10 DCM, 2 SCD) | ...
SD-303 | 52 | VT | 61 (T) | 13 | ...
DCM-AAL01 | 32 | VT and AF; BiV ICD | 43 (T) | 13 (7 DCM, 6 SCD) | LMNA, MYH7, MYBPC3, TNNT2, TNNI3, TPM1, ACTC, MYL2, MYL3, LAMP2, and PRKAG2
MDC-01 | ... | AF+LGMD | 37 (D) | 17 (13 DCM, 4 LGMD) | ABCC9, ACTC, ACTN2, CSRP3, CTF1, DES, EMD, MDB3, LMNA, MYBPC3, MYH7, PLN, SGCD, TAZ, TCAP, TNNI3, TNNT2, TPM, and VCL; *DES c.735+3 A>G (ref. 22)
DCM-BI01 | 16 | ... | 16 (T) | 5 (DCM) | ANKRD1, ACTC, LDB3, LMNA, MYBPC3, MYH7, PLN, SCN5A, TNNC1, TNNI3, TNNT2, and TPM1 (affected relative tested); *TNNT2 K210del
DCM-BH01 | 62 | LBBB; ICD | ... | 2 (DCM) | LMNA, MYH7, TNNT2, ACTC1, DES, MYBPC3, TPM1, TNNI3, ZASP, TAZ, PLN, TTR, LAMP2, SGCD, MYL2, MYL3, PRKAG2, and cardiomyopathy mitochondrial genes

AF indicates atrial fibrillation; BiV, biventricular; CMP, cardiomyopathy; DCM, dilated cardiomyopathy; Dx, diagnosis; EF, ejection fraction; ICD, implantable cardioverter defibrillator; LAF, left atriofascicular; LBBB, left bundle branch block; LIV, left intraventricular; LGMD, limb girdle muscular dystrophy; SCD, sudden cardiac death; and VT, ventricular tachycardia.

*Primary mutation identified by panel sequencing.


lower in probands without muscle disease. Variants were then tested for segregation, where possible (Table 2).

Analysis of Indel Polymorphisms

Each genome had on average 293 729 insertions and 312 947 deletions. Filtering these variants using the super gene set reduced the number to ≈88 indels per individual (Table III in the online-only Data Supplement). Indels in the coding sequence or at a splice junction and present in the 1000 Genomes database at frequency ≤0.01 were retained. Indels common in the sequenced cohort were omitted as these are likely sequencing/aligning artifacts. This analysis reduced the number of potentially pathogenic indels to 0 to 1 per individual, which were confirmed using Sanger sequencing. This analysis also detected the single known pathogenic indel within the cohort in DCM-BI01 (TNNT2 K210del). Combining the analysis for indels with that for SNVs produced 1 to 14 potentially pathogenic variants per individual.

Figure 1. Variant analysis pipeline. A, Missense variant analysis. The super gene set includes genes linked to cardiomyopathy. Approximately 3.7 million variants were identified; restricting to the super gene set reduces the number of variants to ≈11.5K. Missense single nucleotide variants (SNVs) from these genes were analyzed using PolyPhen-2, SIFT, PhastCons, and GERP.10–13 Variants were retained if predicted to be probably or possibly damaging by PolyPhen-2 or if damaging by 2 of the 3 remaining programs. Cutoff scores of 0.95 and 3 were considered damaging by PhastCons and GERP, respectively. If no prediction was made by at least 3 of 4 programs, the variant was retained for further analysis. Variants were then analyzed by Panther and ConSeq.14,15 Variants with a Panther substitution position-specific evolutionary conservation score of <−2 or a ConSeq score <0 were retained. If neither program was able to make a prediction, the variant was retained. This filtering reduces the number of variants to ≈167 per genome. Variants were analyzed for frequency using 3 databases: the 1000 Genomes Project, National Heart, Lung, and Blood Institute Exome Sequencing Project (ESP 5400), and dbSNP 135/136.16 Variants present at a frequency of ≤0.01 were retained, resulting in 0 to 11 nonsynonymous single nucleotide variants per genome. Nonsense SNVs within the myopathy super gene set were filtered using frequency. B, Splice site variant analysis. Analysis was restricted to intronic SNVs within 10 bp of an exon, reducing variant lists from ≈3.7 million to 50 per genome. Variants were analyzed using maximum entropy.17 The 1000 Genomes Project and dbSNP 135/136 were used to determine frequency. Variants present at a frequency of ≤0.01 were deemed potentially deleterious, resulting in 0 to 2 variants per genome.
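The missense retention rules in panel A can be written out as a small decision function. This is a sketch of one reading of the caption, not the authors' code: it assumes the PhastCons and GERP cutoffs are inclusive (≥0.95 and ≥3), that SIFT is supplied as a damaging/tolerated call, and that `None` means a program made no prediction. All function and parameter names are hypothetical.

```python
from typing import Optional

PP2_DAMAGING = ("probably damaging", "possibly damaging")


def passes_stage1(
    pp2: Optional[str],
    sift_damaging: Optional[bool],
    phastcons: Optional[float],
    gerp: Optional[float],
) -> bool:
    """PolyPhen-2 damaging, or damaging by 2 of SIFT/PhastCons/GERP."""
    missing = sum(x is None for x in (pp2, sift_damaging, phastcons, gerp))
    if missing >= 3:  # no prediction from at least 3 of 4 programs: retain
        return True
    if pp2 in PP2_DAMAGING:
        return True
    votes = [
        sift_damaging is True,
        phastcons is not None and phastcons >= 0.95,  # assumed inclusive cutoff
        gerp is not None and gerp >= 3,               # assumed inclusive cutoff
    ]
    return sum(votes) >= 2


def passes_stage2(panther_subpsec: Optional[float], conseq: Optional[float]) -> bool:
    """Panther subPSEC < -2 or ConSeq < 0; retain if neither can predict."""
    if panther_subpsec is None and conseq is None:
        return True
    return (panther_subpsec is not None and panther_subpsec < -2) or (
        conseq is not None and conseq < 0
    )


def retain_missense(pp2, sift_damaging, phastcons, gerp, panther_subpsec, conseq) -> bool:
    """A variant moves on to frequency filtering only if both stages pass."""
    return passes_stage1(pp2, sift_damaging, phastcons, gerp) and passes_stage2(
        panther_subpsec, conseq
    )
```

Note that both stages default to retaining a variant when predictions are unavailable, which favors sensitivity over specificity and matches the caption's stated handling of missing predictions.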

Table 2. Pathogenic and Likely Pathogenic Variants Identified by Whole Genome Sequencing

Subject | Gene | Function | Position | NHLBI ESP Frequency | 1000 Genomes Frequency | Additional Evidence for Causality
DCM-AAB03 | GLA; TPM1* | Missense; Missense | R118C; D230N | 0.0004; Absent | Absent; Absent | Segregation n=7
DCM-AAW02 | TTN | Nonsense | E3707X | Absent | Absent | Segregation n=2
DCM-AAY02 | FLNC | Intronic, SS | c.3791-1 G>C | ... | Absent | Absent in unaffected sibling
DCM-Q14 | TTN | 1-bp insertion | L20605PfsX2 | ... | Absent | Segregation n=38
DCM-AAL01 | DES | Missense | R127P | Absent | Absent | Histopathology functional studies (Figure 3)
MDC-01* | DES | Intronic, SS | c.735+3 A>G | ... | Absent | Reference 22
DCM-BI01 | TTN; TNNT2* | Intronic, SS; 3-bp deletion | c.42521-5 C>G; K210del | ... | Absent; Absent | Segregation n=3
DCM-BH01 | SCN5A | Missense | G1318V | 0.00008 | Absent | Segregation n=2
SD-303 | SCN5A | Missense | R814W | Absent | Absent | Segregation n=16

Frequency refers to the overall variant frequency. bp indicates basepair; DCM, dilated cardiomyopathy; NHLBI ESP, National Heart, Lung, and Blood Institute Exome Sequencing Project; and SS, splice site.

*Identified by panel-based genetic testing.


Likely Pathogenic Cardiomyopathy Variants Detected by WGS

Potentially pathogenic variants were filtered based on the genes’ tissue expression pattern, association with syndromic features, and pattern of inheritance. Candidate variants were confirmed by Sanger sequencing and sequenced in family members for segregation when available. We identified 11 likely pathogenic mutations in 9 individuals, including 9 new mutations not identified by traditional panel-based screening (Table 2; Tables VII–XIII in the online-only Data Supplement). Of these mutations, 2 are second mutations identified within an individual that may function as disease modifiers. For 7 subjects, segregation analysis was performed in 2 to 38 family members, including both affected and unaffected individuals (Table 2; Figures I–IV in the online-only Data Supplement).

Subject DCM-BH01 presented at the age of 62 years with nonischemic DCM, nonsustained ventricular tachycardia, and a wide QRS complex consistent with cardiac conduction system disease (Figure 2A). Clinical genetic testing for 20 DCM genes was negative (Table 1). WGS identified 3 potentially pathogenic variants, including missense variants in SCN5A, FKTN, and HPS6 (Figure 2B). Both HPS6 and FKTN were excluded because of association with syndromic, autosomal recessive disorders.23,24 The SCN5A G1318V variant was absent from public databases. SCN5A mutations near this region have been linked to both DCM and inherited arrhythmic disorders,25 consistent with the striking conduction system disease observed in this individual.26 The SCN5A G1318V variant was also found in an adult offspring whose left ventricular (LV) ejection fraction was 47.8% (Figure 2C).

An SCN5A variant was also identified in subject SD-303 (Figure I in the online-only Data Supplement). SD-303 has a history of arrhythmias beginning in the third decade of life. Congestive heart failure was diagnosed in the sixth decade of life and a heart transplant was performed at the age of 61 years. Family history is remarkable for arrhythmia, sudden death, and congestive heart failure with some family members requiring transplant before the age of 40 years. WGS identified 7 potentially pathogenic variants, all missense (Table VII in the online-only Data Supplement). Segregation analysis of SCN5A R814W in 16 family members confirmed its association with disease (Figure I in the online-only Data Supplement). SCN5A R814W was both rare and considered highly deleterious by all algorithms. This particular variant has also been described as a de novo mutation in a 23-year-old woman with sporadic DCM, atrial flutter, and short runs of nonsustained ventricular tachycardia.27 Nguyen et al28 found that the SCN5A R814W mutation disrupted both activation and deactivation of Nav1.5.

Subject DCM-AAL01 was diagnosed with DCM at the age of 32 years and underwent heart transplantation at the age of 43 years (Figure 3A). Clinical genetic testing was negative for 11 genes (Table 1). Seven potentially pathogenic missense variants were identified in the myopathy super gene set (Table VIII in the online-only Data Supplement). Variants in LDB3, MYH11, and DES were identified as potentially relevant based on the proband’s phenotype and were confirmed by Sanger sequencing (Figure 3B). The variant in MYH11 is unlikely to be the primary driver because MYH11 variants are often associated with aortic aneurysm, which was not a factor in this subject’s disease.29,30 We performed immunohistochemical staining for desmin and electron microscopy on frozen sections from the proband’s explanted heart. Desmin aggregates were readily evident, diagnostic of a desmin-related myopathy (Figure 3C).31,32 Segregation analysis could not be performed as all affected family members had died or were unavailable for testing. Epitope-tagged DES R127P or wild-type desmin was introduced into myogenic C2C12 cells to assess pathogenicity. Intracellular desmin aggregates were readily detected with DES R127P but not with wild-type desmin, confirming the pathogenicity of this variant (Figure 3D). LDB3, also known as ZASP or CYPHER, has been linked to cardiomyopathy and, given the variant’s rarity, may contribute to the phenotype.33 As desmin protein aggregates were observed in the subject’s explanted heart, and as these aggregates are not known to be a feature of LDB3 mutations, these data lead us to conclude that DES is the primary genetic mutation in this individual. Genetic counseling was provided to the subject.

Figure 2. SCN5A G1318V detected by whole genome sequencing (WGS). WGS was used to assess the genome of DCM-BH01 with DCM, left bundle-branch block, and nonsustained ventricular tachycardia. A, Twelve-lead EKG from proband. B, Variants were filtered through the analysis pipeline, identifying 3 potentially pathogenic variants. FKTN and HPS6 were excluded as these genes are linked to recessive, syndromic disease.23,24 The SCN5A G1318V variant was considered pathogenic because SCN5A gene mutations are known to affect the cardiac conduction system in addition to causing DCM.26 C, SCN5A G1318V variant was identified in the proband’s offspring with DCM. NHLBI ESP indicates National Heart, Lung, and Blood Institute Exome Sequencing Project.


756 Circ Cardiovasc Genet December 2014

Whole Genome Sequencing Identifies Second Hits as Disease Modifiers

WGS and the super gene set were applied to 2 families that displayed a range of phenotypes, consistent with disease modifiers. The first was an individual with a known TPM1 mutation with a more severe clinical course compared with other affected family members (Figure 4A). Subject DCM-AAB03 underwent heart transplantation at the age of 20 years after presenting with refractory heart failure symptoms and severe LV systolic dysfunction (LV ejection fraction <20%). With the exception of 1 brother who died from cardiomyopathy and muscle weakness at the age of 12 years, other family members remain only mildly affected into the fourth decade and beyond. WGS identified 6 potentially pathogenic missense mutations, including a mutation in the X-encoded GLA gene (Table IX in the online-only Data Supplement). Mutations in GLA, which encodes the protein α-galactosidase A, cause the X-linked disorder Fabry disease.34,35 DCM-AAB03 was hemizygous for the GLA R118C variant, whereas his less severely affected sisters and mother were heterozygous for the variant and his more mildly affected brother carried only the primary mutation in TPM1.

The second family was known to have a primary TNNT2 mutation but showed evidence of phenotypic variability. Specifically, subject DCM-BI01 presented at the age of 16 years with an LV ejection fraction of 9.8% and underwent cardiac transplantation (Figure 4B). Clinical genetic testing identified TNNT2 K210del as a pathogenic mutation. The TNNT2 K210del mutation was also detected using the filtering pipeline for WGS. Both the patient and his brother, who also required heart transplantation in adolescence, carry the TNNT2 K210 deletion. This mutation was also found in the subject’s affected mother and in his mildly affected grandmother, who at the age of 65 years had asymptomatic LV dysfunction (LV ejection fraction 45%). This phenotypic variability has been noted previously with the TNNT2 K210del variant.36–38 WGS also identified a TTN splice variant c.42521–5 G>C (Table X in the online-only Data Supplement). This variant was found in the 2 boys who required heart transplants in their second decade but not in their grandmother, who was only mildly affected at the age of 65 years (Figure 4B). This TTN splice variant has a MaxEnt score of 5.97, which is deleterious by the criteria set forth by Herman et al19 for TTN splice sites.

Discussion

WGS Detects Rare Variants for Cardiomyopathy

WGS offers a comprehensive approach to identifying genetic variation across the genome. We now show that rare variants can be detected in cardiomyopathy genes using WGS and targeted analysis. Although the comprehensive nature of WGS or even WES is attractive, analytic tools must be refined to identify pathogenic variants. The pipeline applied herein relied on (1) restricting the number of genes for analysis to a super gene set, (2) filtering based on frequency with the assumption that a rare disease is caused by rare genetic variation, and (3) protein prediction algorithms that largely rely on disrupting conserved regions. The method successfully identified 3 known mutations (DES c.735+3 A>G in patient MDC-01, TNNT2 K210del in patient DCM-BI01, TPM1 D230N in patient DCM-AAB03), providing proof of principle that WGS at 30 to 40× coverage is sensitive enough to detect these rare variants. Likely pathogenic variants were identified in 6 of the remaining patients. Each proband had 1 to 14 variants that passed the filtering criteria of our pipeline. We relied on segregation analysis, functional data, and phenotypic data from each subject to further refine the variant list. It is important to note that phenotypic information is invaluable when identifying likely pathogenic variation and that manual curation of each high-probability variant is required.
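The three filtering steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the `Variant` record, the thresholds `max_freq` and `min_damaging_fraction`, the four-gene subset of the super gene set, and the toy variant records are all hypothetical and serve only to show the order in which the filters are applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    label: str
    population_freq: float  # overall frequency in public databases (0.0 if absent)
    damaging_calls: int     # prediction algorithms scoring the variant as deleterious
    total_algorithms: int   # prediction algorithms that returned a score


# Hypothetical subset of a super gene set (the study used 204 genes).
SUPER_GENE_SET = frozenset({"SCN5A", "DES", "TTN", "FLNC"})


def filter_variants(variants, gene_set, max_freq=0.01, min_damaging_fraction=0.5):
    """Apply the three filters in order: gene set, frequency, prediction."""
    kept = []
    for v in variants:
        # (1) Restrict analysis to genes in the virtual panel.
        if v.gene not in gene_set:
            continue
        # (2) Rare disease is assumed to be caused by rare variation.
        if v.population_freq > max_freq:
            continue
        # (3) Require agreement among the prediction algorithms.
        if v.total_algorithms and v.damaging_calls / v.total_algorithms < min_damaging_fraction:
            continue
        kept.append(v)
    return kept


# Toy records with made-up frequencies and prediction counts.
EXAMPLE_VARIANTS = [
    Variant("SCN5A", "toy_missense_1", 0.0, 4, 4),     # rare, predicted damaging
    Variant("TTN", "toy_missense_2", 0.12, 3, 4),      # too common
    Variant("DES", "toy_missense_3", 0.0005, 1, 4),    # rare, mostly predicted benign
    Variant("GENE_X", "toy_missense_4", 0.0, 4, 4),    # outside the gene set
]

if __name__ == "__main__":
    for v in filter_variants(EXAMPLE_VARIANTS, SUPER_GENE_SET):
        print(v.gene, v.label)
```

Variants surviving such a filter are candidates only; as noted above, segregation, functional, and phenotypic data plus manual curation are still required.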

In subject DCM-AAY02, a putative mutation in the gene encoding γ-filamin, also known as filamin C (FLNC), was found (Table 2; Table XI in the online-only Data Supplement). DCM-AAY02 presented at the age of 57 years with left anterior hemiblock and intraventricular conduction delay, but without skeletal muscle disease. FLNC encodes an actin-crosslinking protein that interacts with the dystrophin-associated protein complex, and mutations in FLNC lead to skeletal myopathy in humans and mice.39,40 Moreover, a medaka fish model with reduced FLNC expression developed an enlarged and mechanically weakened heart.41 Furthermore, a quantitative proteomic assessment of aggregates in desminopathy identified filamin C as the second most abundant protein in these aggregates, providing additional support for the role of filamin C as a mediator of cardiomyopathy.42 The FLNC mutation identified in DCM-AAY02 is located 1 bp from a splice acceptor, is absent from the 1000 Genomes database, is predicted to be strongly deleterious by MaxEnt, and is not present in an unaffected sibling. Although further functional studies are required to confirm the pathogenicity of this variant, these data underscore the utility of a mutable super gene set that allows for the interrogation of putative cardiomyopathy genes.

Figure 3. Whole genome sequencing identified a desmin (DES) gene mutation, R127P. A, Dilated cardiomyopathy (DCM)-AAL01 pedigree reveals DCM and sudden cardiac death (SCD). The proband (*) underwent cardiac transplantation at the age of 42 years. B, Seven variants were identified as potentially deleterious; 3 were relevant to this individual’s phenotype. MYH11 is linked to muscle phenotypes and therefore was excluded.29,30 LDB3 I558V may be a contributing variant. The DES R127P variant was further tested. C, The explanted heart from the proband demonstrated desmin aggregates on immunohistochemistry (brown staining, left), a feature pathognomonic for desmin-related myopathies.32 Electron microscopy revealed granulofilamentous material consistent with desmin aggregates (right). D, When expressed in myogenic C2C12 cells, the DES R127P variant formed aggregates (right, arrowhead), whereas wild-type desmin (left) did not. Expressed desmin was tagged with the Xpress epitope tag (green). Nuclei are stained with 4',6-diamidino-2-phenylindole. ALT indicates alternate; REF, referent.

Perhaps the greatest value of broad-based sequencing is the capacity to further interrogate the data if a mutation is not identified on first-pass analysis. Here, the super gene set is defined by genes previously associated with cardiomyopathy, restricting our ability to identify new cardiomyopathy-associated genes. The super gene set is mutable and genes can be added and removed at any time, allowing for the identification of variation in genes that may be suspected in the pathogenesis of cardiomyopathy. Analysis can be rerun in minutes to hours depending on the size of the data set. However, this approach is candidate driven and will not identify variation in genes not on the super gene set. Additional panel-based testing can take weeks to months to complete and may still not supply variants of interest. Furthermore, as WGS becomes more commonplace, it is possible that patients will already have sequencing data. Although panel-based testing generally provides higher coverage, this pilot study confirms that 30 to 40× coverage sequencing data are appropriate for identifying cardiomyopathy mutations.
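Because the super gene set is virtual, reanalysis amounts to re-filtering stored variant calls against an updated gene list rather than resequencing. A minimal sketch, assuming the calls are held as (gene, variant) pairs; the panel contents, the placeholder gene `GENE_X`, and the variant labels are hypothetical.

```python
def variants_in_panel(stored_calls, gene_set):
    """Return the stored calls whose gene is in the current virtual panel."""
    return [call for call in stored_calls if call[0] in gene_set]


def newly_surfaced(stored_calls, old_set, new_set):
    """Return calls that become reportable only after the panel is updated."""
    added_genes = set(new_set) - set(old_set)
    return [call for call in stored_calls if call[0] in added_genes]


# Hypothetical stored calls from one genome: (gene, variant label).
stored_calls = [
    ("DES", "variant_a"),
    ("FLNC", "variant_b"),
    ("GENE_X", "variant_c"),
]

panel_v1 = {"DES", "TTN"}
panel_v2 = panel_v1 | {"FLNC"}  # a putative gene is added to the panel

if __name__ == "__main__":
    print("first pass:", variants_in_panel(stored_calls, panel_v1))
    print("after update:", newly_surfaced(stored_calls, panel_v1, panel_v2))
```

Variants in genes that remain outside the panel (here `GENE_X`) are still not reported, which reflects the candidate-driven limitation described above.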

For 2 subjects in this pilot study, WGS did not identify a clear mutation. We expect that these 2 individuals have mutations in genes outside the super gene set. Importantly, the available WGS sequence from these genomes will permit ongoing analysis and will not require reinvestment in additional sequencing. Larger gene panels have increased sensitivity for mutation detection, especially for DCM. Therefore, comprehensive sequencing can be expected to have even greater power to detect primary mutations and, importantly, secondary modifier mutations, which may account for disease severity or point to potentially treatable disorders, such as Fabry disease, a cardiomyopathy amenable to enzyme replacement therapy.43

Cost Considerations

The cost of targeted panel sequencing, whole exome, and whole genome sequencing is a major consideration, and costs are shifting rapidly with new technology. Clinical testing for cardiomyopathy using targeted panel sequencing is around $4000 at this time. Current pricing for clinical exome sequencing is ≈$7000 but varies widely depending on the extent of variant analysis. Clinical whole genome sequencing is ≈$9000 to $9500 and, like other clinical genetic testing costs, includes the cost of analysis. The anticipated $1000 genome is for research-based studies and does not include the cost of analysis.

Figure 4. Whole genome sequencing (WGS) identified primary and secondary pathogenic sequence variation. A, Dilated cardiomyopathy (DCM)-AAB03 pedigree. The TPM1 D230N variant segregates with cardiomyopathy in all family members. Several members of this family had earlier onset disease. Individual II-3 presented at the age of 20 years with heart failure and was found to be hemizygous for the X-linked GLA R118C variant. B, DCM-BI01 pedigree. The TNNT2 K210del variant has been described previously in cardiomyopathy.33,38 The 2 younger members who required cardiac transplantation had an additional TTN mutation predicted to disrupt the splice site and truncate TTN.

Sequencing depth is also a major consideration for sequencing studies. Next generation sequencing technologies have a higher per-base error rate than traditional Sanger sequencing, which reduces the probability that a called variant is a true positive.44 The higher error rate requires higher depth, whereby each base is sequenced multiple times by high-quality reads. However, as depth increases so does cost. In this study, WGS was performed at 30 to 40× because previous work has shown that the number of readable bases and the number of SNVs identified rise steeply with increasing depth and plateau at 30×, indicating diminishing returns with additional sequencing.45 It is also important to note that sequencing reads should undergo extensive filtering to identify and discard low-quality reads. It is imperative that analysis pipelines be applied to sequencing data to control for faulty reads, misalignment to the referent genome, and low-quality variant calls. These steps contribute to analysis costs. Given the range of approaches used for bioinformatics, costs for analysis will vary widely. Reductions in analytic costs will derive from improving alignment and variant calling pipelines as well as refining which genes to analyze and automating pipelines to reduce human analytic time. The informatically designed super gene set, which was applied here to WGS, is expected to be refined to provide more specific results. The utility of this approach is that the super gene set is virtual, and as such the analysis can readily be updated and optimized without the need for recapture and resequencing. Thus, for those individuals for whom a primary mutation is not identified, the remaining genome information is available for analysis. These data provide a wealth of information that may inform not only about new genes that cause cardiomyopathy but also about combinations of genes that may provide important prognostic information.
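The diminishing return from additional depth discussed above can be illustrated with the idealized Poisson model of shotgun sequencing. This is a textbook simplification and not the analysis reported in reference 45: it assumes reads land uniformly at random across the genome, which real libraries do not (GC bias, repeats, mapping failures). The threshold of 10 reads per base is an arbitrary illustrative choice, and the function name is hypothetical.

```python
import math


def fraction_covered(mean_depth, min_reads):
    """P(depth >= min_reads) when per-base depth follows Poisson(mean_depth)."""
    below = sum(
        math.exp(-mean_depth) * mean_depth ** k / math.factorial(k)
        for k in range(min_reads)
    )
    return 1.0 - below


if __name__ == "__main__":
    # Expected fraction of bases seen by at least 10 reads at several mean depths.
    for depth in (5, 10, 20, 30, 40):
        share = fraction_covered(depth, 10)
        print(f"{depth:>2}x mean depth: {share:.4f} of bases at >=10 reads")
```

Under this model the gain from 10× to 20× is large, whereas the gain from 30× to 40× is negligible, which mirrors the plateau described in the text. Observed coverage will be lower than the model predicts wherever sequencing is uneven.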

Acknowledgments

We thank the families for their participation.

Sources of Funding

This work was supported by Doris Duke Charitable Foundation, New York, NY; Sarnoff Foundation, Great Falls, VA; and National Institutes of Health, Bethesda, MD, NIH T32 HL007237, NIH F32 HL097587.

Disclosures

None.

References

1. Cirino AL, Ho CY. Genetic testing for inherited heart disease. Circulation. 2013;128:e4–e8.

2. Lakdawala NK, Thune JJ, Colan SD, Cirino AL, Farrohi F, Rivero J, et al. Subtle abnormalities in contractile function are an early manifestation of sarcomere mutations in dilated cardiomyopathy. Circ Cardiovasc Genet. 2012;5:503–510.

3. Maron BJ, Roberts WC, Arad M, Haas TS, Spirito P, Wright GB, et al. Clinical outcome and phenotypic expression in LAMP2 cardiomyopathy. JAMA. 2009;301:1253–1259.

4. Mestroni L, Taylor MR. Lamin A/C gene and the heart: how genetics may impact clinical care. J Am Coll Cardiol. 2008;52:1261–1262.

5. Zimmerman RS, Cox S, Lakdawala NK, Cirino A, Mancini-DiNardo D, Clark E, Leon A, et al. A novel custom resequencing array for dilated cardiomyopathy. Genet Med. 2010;12:268–278.

6. Meder B, Haas J, Keller A, Heid C, Just S, Borries A, et al. Targeted next-generation sequencing for the molecular genetic diagnostics of cardiomyopathies. Circ Cardiovasc Genet. 2011;4:110–122.

7. Clark MJ, Chen R, Lam HY, Karczewski KJ, Chen R, Euskirchen G, Butte AJ, et al. Performance comparison of exome DNA sequencing technologies. Nat Biotechnol. 2011;29:908–914.

8. Asan, Xu Y, Jiang H, Tyler-Smith C, Xue Y, Jiang T, Wang J, et al. Comprehensive comparison of three commercial human whole-exome capture platforms. Genome Biol. 2011;12:R95.

9. Bamshad MJ, Ng SB, Bigham AW, Tabor HK, Emond MJ, Nickerson DA, et al. Exome sequencing as a tool for Mendelian disease gene discovery. Nat Rev Genet. 2011;12:745–755.

10. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7:248–249.

11. Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005;15:1034–1050.

12. Ng PC, Henikoff S. Predicting deleterious amino acid substitutions. Genome Res. 2001;11:863–874.

13. Cooper GM, Stone EA, Asimenos G, Green ED, Batzoglou S, Sidow A; NISC Comparative Sequencing Program. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 2005;15:901–913.

14. Thomas PD, Kejariwal A, Campbell MJ, Mi H, Diemer K, Guo N, Ladunga I, et al. PANTHER: a browsable database of gene products organized by biological function, using curated protein family and subfamily classification. Nucleic Acids Res. 2003;31:334–341.

15. Ashkenazy H, Erez E, Martz E, Pupko T, Ben-Tal N. ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res. 2010;38(Web Server issue):W529–W533.

16. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–1073.

17. Yeo G, Burge CB. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J Comput Biol. 2004;11:377–394.

18. Demonbreun AR, Fahrenbach JP, Deveaux K, Earley JU, Pytel P, McNally EM. Impaired muscle growth and response to insulin-like growth factor 1 in dysferlin-mediated muscular dystrophy. Hum Mol Genet. 2011;20:779–789.

19. Herman DS, Lam L, Taylor MR, Wang L, Teekakirikul P, Christodoulou D, et al. Truncations of titin causing dilated cardiomyopathy. N Engl J Med. 2012;366:619–628.

20. Golbus JR, Puckelwartz MJ, Fahrenbach JP, Dellefave-Castillo LM, Wolfgeher D, McNally EM. Population-based variation in cardiomyopathy genes. Circ Cardiovasc Genet. 2012;5:391–399.

21. Dalakas MC, Park KY, Semino-Mora C, Lee HS, Sivakumar K, Goldfarb LG. Desmin myopathy, a skeletal myopathy with cardiomyopathy caused by mutations in the desmin gene. N Engl J Med. 2000;342:770–780.

22. Park KY, Dalakas MC, Goebel HH, Ferrans VJ, Semino-Mora C, Litvak S, et al. Desmin splice variants causing cardiac and skeletal myopathy. J Med Genet. 2000;37:851–857.

23. Gahl WA, Huizing M. Hermansky-Pudlak syndrome. 1993.

24. Arimura T, Hayashi YK, Murakami T, Oya Y, Funabe S, Arikawa-Hirasawa E, Hattori N, et al. Mutational analysis of fukutin gene in dilated cardiomyopathy and hypertrophic cardiomyopathy. Circ J. 2009;73:158–161.

25. Ruan Y, Liu N, Priori SG. Sodium channel mutations and arrhythmias. Nat Rev Cardiol. 2009;6:337–348.

26. Schott JJ, Alshinawi C, Kyndt F, Probst V, Hoorntje TM, Hulsbeek M, et al. Cardiac conduction defects associate with mutations in SCN5A. Nat Genet. 1999;23:20–21.

27. Olson TM, Michels VV, Ballew JD, Reyna SP, Karst ML, Herron KJ, et al. Sodium channel mutations and susceptibility to heart failure and atrial fibrillation. JAMA. 2005;293:447–454.

28. Nguyen TP, Wang DW, Rhodes TH, George AL Jr. Divergent biophysical defects caused by mutant sodium channels in dilated cardiomyopathy with arrhythmia. Circ Res. 2008;102:364–371.

29. Babu GJ, Warshaw DM, Periasamy M. Smooth muscle myosin heavy chain isoforms and their role in muscle physiology. Microsc Res Tech. 2000;50:532–540.

30. Tajsharghi H, Ohlsson M, Palm L, Oldfors A. Myopathies associated with β-tropomyosin mutations. Neuromuscul Disord. 2012;22:923–933.

31. Chourbagi O, Bruston F, Carinci M, Xue Z, Vicart P, Paulin D, et al. Desmin mutations in the terminal consensus motif prevent synemin-desmin heteropolymer filament assembly. Exp Cell Res. 2011;317:886–897.

32. Clemen CS, Herrmann H, Strelkov SV, Schröder R. Desminopathies: pathology and mechanisms. Acta Neuropathol. 2013;125:47–75.

33. Hershberger RE, Parks SB, Kushner JD, Li D, Ludwigsen S, Jakobs P, Nauman D, et al. Coding sequence mutations identified in MYH7, TNNT2, SCN5A, CSRP3, LBD3, and TCAP from 313 patients with familial or idiopathic dilated cardiomyopathy. Clin Transl Sci. 2008;1:21–26.

34. Schafer E, Baron K, Widmer U, Deegan P, Neumann HP, Sunder-Plassmann G, et al. Thirty-four novel mutations of the GLA gene in 121 patients with Fabry disease. Hum Mutat. 2005;25:412.

35. Sheppard MN. The heart in Fabry’s disease. Cardiovasc Pathol. 2011;20:8–14.

36. Hershberger RE, Pinto JR, Parks SB, Kushner JD, Li D, Ludwigsen S, Cowan J, et al. Clinical and functional characterization of TNNT2 mutations identified in patients with dilated cardiomyopathy. Circ Cardiovasc Genet. 2009;2:306–313.

37. Richard P, Charron P, Carrier L, Ledeuil C, Cheav T, Pichereau C, Benaiche A, et al; EUROGENE Heart Failure Project. Hypertrophic cardiomyopathy: distribution of disease genes, spectrum of mutations, and implications for a molecular diagnosis strategy. Circulation. 2003;107:2227–2232.

38. Hanson EL, Jakobs PM, Keegan H, Coates K, Bousman S, Dienel NH, Litt M, et al. Cardiac troponin T lysine 210 deletion in a family with dilated cardiomyopathy. J Card Fail. 2002;8:28–32.

39. Fürst DO, Goldfarb LG, Kley RA, Vorgerd M, Olivé M, van der Ven PF. Filamin C-related myopathies: pathology and mechanisms. Acta Neuropathol. 2013;125:33–46.

40. Thompson TG, Chan YM, Hack AA, Brosius M, Rajala M, Lidov HG, et al. Filamin 2 (FLN2): A muscle-specific sarcoglycan interacting protein. J Cell Biol. 2000;148:115–126.

41. Fujita M, Mitsuhashi H, Isogai S, Nakata T, Kawakami A, Nonaka I, Noguchi S, et al. Filamin C plays an essential role in the maintenance of the structural integrity of cardiac and skeletal muscles, revealed by the medaka mutant zacro. Dev Biol. 2012;361:79–89.

42. Maerkens A, Kley RA, Olive M, Theis V, van der Ven PF, Reimann J, et al. Differential proteomic analysis of abnormal intramyoplasmic aggregates in desminopathy. J Proteomics. 2013;90:14–27.

43. Lidove O, West ML, Pintos-Morell G, Reisin R, Nicholls K, Figuera LE, Parini R, et al. Effects of enzyme replacement therapy in Fabry disease–a comprehensive review of the medical literature. Genet Med. 2010;12:668–679.

44. Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456:53–59.

45. Ajay SS, Parker SC, Abaan HO, Fajardo KV, Margulies EH. Accurate and comprehensive sequencing of personal genomes. Genome Res. 2011;21:1498–1505.

CLINICAL PERSPECTIVE

Cardiomyopathy is highly heritable, and knowledge of the underlying genetic defects provides prognostic information to guide clinical decision making and the use of preventive strategies. Genetic diagnosis currently relies on sequencing 1 to 80 genes simultaneously with <50% sensitivity for dilated cardiomyopathy. With the declining cost of gene sequencing, genetic testing is likely to rely on whole genome sequencing coupled with targeted gene analysis. To gauge the effectiveness of whole genome sequencing to diagnose cardiomyopathy, we piloted whole genome sequencing for 11 subjects with cardiomyopathy. Initial analysis focused on 204 known and putative cardiomyopathy genes. Variants within these genes were analyzed using a pipeline that combined multiple prediction algorithms along with frequency data from the 1000 Genomes project and the National Heart, Lung, and Blood Institute Exome Sequencing Project. This pipeline yielded 1 to 14 potentially pathogenic variants per individual. Variants were further analyzed using phenotypic data and segregation analysis. Three of the 11 subjects had a previously identified primary mutation, and all 3 mutations were detected by the analysis pipeline. In 2 of these subjects, additional potential modifier loci were seen that may contribute to the clinical heterogeneity seen in these families. In 6 subjects for whom the primary mutation was unknown, we identified mutations that segregated with disease, had clinical correlates, and had additional pathological correlation sufficient to deem these variants pathogenic or likely pathogenic. In total, we identified the likely pathogenic mutation in 9 of 11 (82%) subjects, indicating that whole genome sequencing data can be used to identify clinically relevant variants.


Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy

Jessica R. Golbus, Megan J. Puckelwartz, Lisa Dellefave-Castillo, John P. Fahrenbach, Viswateja Nelakuditi, Lorenzo L. Pesce, Peter Pytel and Elizabeth M. McNally

Circ Cardiovasc Genet. 2014;7:751-759; originally published online September 1, 2014; doi: 10.1161/CIRCGENETICS.113.000578

Circulation: Cardiovascular Genetics is published by the American Heart Association, 7272 Greenville Avenue, Dallas, TX 75231

Copyright © 2014 American Heart Association, Inc. All rights reserved. Print ISSN: 1942-325X. Online ISSN: 1942-3268

The online version of this article, along with updated information and services, is located on the World Wide Web at: http://circgenetics.ahajournals.org/content/7/6/751

Data Supplement (unedited) at: http://circgenetics.ahajournals.org/content/suppl/2014/09/01/CIRCGENETICS.113.000578.DC1.html


SUPPLEMENTAL MATERIAL

Supplemental Table 1: Whole Genome Sequencing Metrics.

| Subject | Coverage (X) | Data passing filter (>Q20) & aligned (Gb) | % of Non-N Reference Covered |
|---|---|---|---|
| DCM-O1 | 43.1 | 131.8 | 97.9 |
| DCM-AAB03 | 34.5 | 104.7 | 97.8 |
| DCM-AAW02 | 35.8 | 107.4 | 98.1 |
| DCM-AAY02 | 35.1 | 108.7 | 98.1 |
| LGMD-AH01 | 35.8 | 109.3 | 97.8 |
| DCM-Q14 | 36.7 | 112.5 | 97.9 |
| SD-303 | 31.9 | 98.3 | 97.9 |
| DCM-AAL01 | 35.0 | 108.6 | 97.8 |
| MDC-01 | 37.5 | 114.7 | 97.9 |
| DCM-BI01 | 42.6 | 130.9 | 98.0 |
| DCM-BH01 | 39.6 | 120.3 | 97.9 |
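As a quick arithmetic check on Supplemental Table 1, the summary statistics for coverage can be recomputed from the per-genome values. The snippet only reuses the coverage column of the table; the variable names are illustrative.

```python
from statistics import mean

# Coverage (X) per genome, copied from Supplemental Table 1.
coverage = {
    "DCM-O1": 43.1,
    "DCM-AAB03": 34.5,
    "DCM-AAW02": 35.8,
    "DCM-AAY02": 35.1,
    "LGMD-AH01": 35.8,
    "DCM-Q14": 36.7,
    "SD-303": 31.9,
    "DCM-AAL01": 35.0,
    "MDC-01": 37.5,
    "DCM-BI01": 42.6,
    "DCM-BH01": 39.6,
}

mean_coverage = mean(coverage.values())
lowest = min(coverage, key=coverage.get)
highest = max(coverage, key=coverage.get)

if __name__ == "__main__":
    print(f"genomes: {len(coverage)}")
    print(f"mean coverage: {mean_coverage:.1f}x")
    print(f"range: {coverage[lowest]}x ({lowest}) to {coverage[highest]}x ({highest})")
```

The mean rounds to 37×, which agrees with the average coverage stated in the abstract.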


Supplemental Table 2. Myopathy Super Panel Genes

Gene Gene Name Transcript #exons

ABCC9 ATP-binding cassette, subfamily C NM_005691.2 NM_020297.2

38 38

ABCG5 ATP-binding cassette, subfamily G, member 5 NM_022436.2 13

ACADVL

Acyl-CoA dehydrogenase, very long chain NM_000018.2 20

ACTA1 Actin, alpha 1, skeletal muscle NM_001100.3 7

ACTA2 Actin, alpha 2, smooth muscle, aorta NM_001613.2 NM_001141945.1

9 9

ACTC1 Actin, alpha NM_005159.4 7 ACTN2 Alpha actinin NM_001103.2 21

ADRA1A Alpha-1A-adrenergic receptor

NM_000680.2 NM_033304.2 NM_033303.3

2 3 3

ADRA2A Alpha-2A-adrenergic receptor NM_000681.3 1

ADRA2B Alpha-2B-adrenergic receptor NM_000682.5 1

ADRA2C Alpha-2C-adrenergic receptor NM_000683.3 1

ADRB1 Beta-1-adrenergic recepter NM_000684.2 1

ADRB2 Beta-2-adrenergic recepter NM_000024.5 1

AGT Angiotensinogen NM_000029.3 5 AKAP6 A-Kinase Anchor protein NM_004274.4 14

AKAP9 A kinase (PRKA) anchor protein 9 NM_005751.4 50

ALG6

Asparagine-linked glycosylation 6, alpha-1,3-glucosyltransferase homolog NM_013339.3 15

ANK2 Ankyrin 2 NM_001148.4 46

ANKRD1 Ankyrin repeat domain-containing protein 1 NM_014391.2 9

AP3B1

Adaptor-related protein complex 3, beta 1 subunit NM_003664.3 27

APLNR Apelin recepter NM_005161.4 2 AQP4 Aquaporin 4 NM_001650.4 5

ARL6 ADP-ribosylation factor-like 6 NM_032146.3 NM_177976.1

9 9

  3  

ATP1B1

ATPase, Na+/K+, transporting beta-1 polypeptide NM_001677.3 6

ATP2A2

ATPase, Ca++ transporting, cardiac muscle, slow twitch 2 NM_001681.3 NM_170665.3

21 20

BAD BCL2-associated agonist of cell death NM_032989.2 NM_004322.3

4 3

BAG3 BCL2-Associated Athanogene 3 NM_004281.3 4

BAK1 BCL2-antagonist/killer 1 NM_001188.3 6

BAX BCL2-associated X protein NM_004324.3 5

BBS1 Bardet-Biedl syndrome 1 NM_024649.4 17

BBS12 Bardet-Biedl syndrome 12 NM_001178007.1 3

BBS2 Bardet-Biedl syndrome 2 NM_031885.3 17 BBS4 Bardet-Biedl syndrome 4 NM_033028.3 16 BBS5 Bardet-Biedl syndrome 5 NM_152384.2 12 BBS7 Bardet-Biedl syndrome 7 NM_176824.2 19 BBS9 Bardet-Biedl syndrome 9 NM_198428.2 23 BCL2 B-cell CLL/lymphoma 2 NM_000633.2 3

BCL2L1 BCL2-like 1 NM_138578.1 NM_001191.2 3 3

BDKRB1 Bradykinin receptor B1 NM_000710.3 3 BDKRB2 Bradykinin receptor B2 NM_000623.3 3

BLOC1S3

Biogenesis of lysosomal organelles complex-1, subunit 3 NM_212550.3 2

BMP1 Bone morphogenetic protein 1 NM_001199.3 NM_006129.4

16 20

BNIP3

BNIP3 BCL2/adenovirus E1B 19kDa interacting protein 3 NM_004052.2 6

BNIP3L

BCL2/adenovirus E1B 19kDa interacting protein 3-like NM_004331.2 6

BRAF

V-Raf Murine Sarcoma Viral Oncogene Homolog B1 NM_004333.4 18

BSCL2

Berardinelli-Seip congenital lipodystrophy 2 (seipin) NM_032667.6 NM_001122955.3

11 11

CACNA1C

Calcium channel, voltage-dependent, L type, alpha 1C subunit NM_000719.6 47

CACNB2

Calcium channel, voltage-dependent, beta-2 subunit NM_000724.3 13

  4  

CALM3 Calmodulin 3 NM_005184.2 6
CAMK2D Calcium/calmodulin-dependent protein kinase II, delta NM_001221.3 19
CAPN3 Calpain-3 NM_000070.2 24
CASQ2 Calsequestrin 2 NM_001232.3 11
CAV3 Caveolin 3 NM_033337.2 2
CEP290 Centrosomal protein 290kDa NM_025114.3 54
CHKB Choline kinase beta NM_005198.4 15
CLCN1 Chloride channel 1, skeletal muscle NM_000083.2 23
CLCNKA Chloride channel K1 NM_004070.3 20
CLCNKB Chloride channel Kb NM_000085.3 20
COL6A1 Collagen, type VI, alpha 1 NM_001848.2 36
COL6A2 Collagen, type VI, alpha 2 NM_001849.3 28
COL6A3 Collagen, type VI, alpha 3 NM_004369.3 44
COQ9 Coenzyme Q9 homolog NM_020312.3 9
COX15 COX15 homolog, cytochrome c oxidase assembly protein NM_078470.4 NM_004376.5 9 9
CREB1 cAMP response element binding protein 1 NM_134442.3 9
CRYAB Crystallin, alpha-B NM_001885.1 3
CSRP3 Cysteine and glycine-rich protein 3 NM_003476.3 6
CTF1 Cardiotrophin 1 NM_001330.3 3
CTNNB1 Catenin (cadherin-associated protein), beta 1, 88kDa NM_001904.3 NM_001098209.1 15 16
CTSL Cathepsin L NM_001912.4 8
DAG1 Dystroglycan 1 NM_001177634.1 6
DES Desmin NM_001927.3 9
DICER1 Dicer 1, ribonuclease type III NM_177438.2 27
DLL4 Delta-like 4 NM_019074.3 11
DMD Dystrophin NM_004006.2 79
DMPK Dystrophia myotonica-protein kinase NM_001081563.1 14
DNAJA3 DNAJ (Hsp40) homolog, subfamily A, member 3 NM_005147.5 12
DNAJC19 DNAJ (Hsp40) homolog, subfamily C, member 19 NM_145261.3 6
DNM1L Dynamin 1-like NM_012062.3 20


DPM3 Dolichyl-phosphate mannosyltransferase polypeptide 3 NM_018973.3 NM_153741.1 1 2
DSC2 Desmocollin 2 NM_004949.3 NM_024422.3 17 16
DSG2 Desmoglein 2 NM_001943.3 15
DSP Desmoplakin NM_004415.2 24
DTNA Dystrobrevin, alpha NM_001391.5 NM_032975.3 NM_001390.4 17 22 22
DYSF Dysferlin NM_003494.3 NM_001130987.1 55 56
ELN Elastin NM_000501.2 33
EMD Emerin NM_000117.2 6
EYA4 Eyes absent homolog 4 NM_172105.3 NM_172103.3 NM_004100.4 20 19 20
FBN1 Fibrillin 1 NM_000138.4 66
FBXO32 F-box only protein 32 NM_058229.3 9
FHL1 Four and a half LIM domains 1 NM_001449.4 7
FHL2 Four and a half LIM domains 2 NM_201557.3 6
FIS1 Fission 1 NM_016068.2 5
FKBP1A FK506-binding protein 1A NM_000801.4 5
FKRP Fukutin-related protein NM_024301.4 NM_001039885.2 4 4
FKTN Fukutin NM_006731.2 10
FLNC Filamin C, gamma NM_001458.4 48
FOXD4 Forkhead box D4 NM_207305.3 1
FOXO1 Forkhead box O1 NM_002015.3 3
FOXO3 Forkhead box O3 NM_201559.2 NM_001455.3 4 3
FXN Frataxin NM_000144.4 5
GABARAP GABA(A) receptor-associated protein NM_007278.1 4
GLA Alpha-galactosidase NM_000169.2 7
GLB1 Galactosidase, beta 1 NM_000404.2 16
GPD1L Glycerol-3-phosphate dehydrogenase 1-like NM_015141.3 8
GSN Gelsolin NM_000177.4 17
HADHA Hydroxyacyl-CoA dehydrogenase/3-ketoacyl-CoA thiolase/enoyl-CoA hydratase (trifunctional protein), alpha subunit NM_000182.4 20


HADHB Hydroxyacyl-CoA dehydrogenase/3-ketoacyl-CoA thiolase/enoyl-CoA hydratase (trifunctional protein), beta subunit NM_000183.2 16
HDAC1 Histone deacetylase 1 NM_004964.2 14
HDAC2 Histone deacetylase 2 NM_001527.3 14
HDAC5 Histone deacetylase 5 NM_001015053.1 27
HDAC6 Histone deacetylase 6 NM_006044.2 29
HPS1 Hermansky-Pudlak syndrome 1 NM_000195.3 20
HPS3 Hermansky-Pudlak syndrome 3 NM_032383.3 17
HPS4 Hermansky-Pudlak syndrome 4 NM_022081.4 14
HPS5 Hermansky-Pudlak syndrome 5 NM_181507.1 23
HPS6 Hermansky-Pudlak syndrome 6 NM_024747.5 1
ITPR1 Inositol 1,4,5-trisphosphate receptor, type 1 NM_001168272.1 NM_001099952.2 61 59
ITPR2 Inositol 1,4,5-trisphosphate receptor, type 2 NM_002223.2 57
ITPR3 Inositol 1,4,5-trisphosphate receptor, type 3 NM_002224.3 58
JAG1 Jagged 1 NM_000214.2 26
JUP Plakoglobin NM_002230.2 14
KCNE1 Potassium voltage-gated channel, Isk-related family, member 1 NM_000219.3 4
KCNE2 Potassium voltage-gated channel, Isk-related family, member 2 NM_172201.1 2
KCNE3 Potassium voltage-gated channel subfamily E member 3 NM_005472.4 3
KCNH2 Potassium voltage-gated channel, subfamily H (eag-related), member 2 NM_000238.3 15
KCNJ2 Potassium inwardly-rectifying channel, subfamily J, member 2 NM_000891.2 2
KCNJ5 Potassium inwardly-rectifying channel, subfamily J, member 5 NM_000890.3 3


KCNQ1 Potassium voltage-gated channel, KQT-like subfamily, member 1 NM_000218.2 16
LAMA4 Laminin, alpha 4 NM_001105206.1 39
LAMP2 Lysosome-associated membrane protein-2 NM_013995.2 9
LDB3 LIM domain binding 3 NM_007078.2 NM_001080116.1 13 8
LMNA Lamin A/C NM_005572.3 NM_170707.2 10 12
LTBP4 Latent transforming growth factor-beta-binding protein 4 NM_003573.2 NM_001042544.1 NM_001042545.1 34 34 31
MATR3 Matrin 3 NM_018834.5 15
MCU Mitochondrial calcium uniporter NM_138357.1 8
MFF Mitochondrial fission factor NM_020194.4 11
MFN1 Mitofusin-1 NM_033540.2 19
MFN2 Mitofusin-2 NM_014874.3 NM_001127660.1 19 18
MYBPC3 Myosin binding protein C NM_000256.3 35
MYH11 Myosin, heavy chain 11, smooth muscle NM_022844.2 41
MYH6 Myosin heavy chain 6 NM_002471.3 39
MYH7 Myosin heavy chain 7 NM_000257.2 40
MYL2 Myosin, light chain 2, regulatory, cardiac, slow NM_000432.3 7
MYL3 Myosin, light chain 3, alkali; ventricular, skeletal, slow NM_000258.2 7
MYLK2 Myosin light chain kinase 2 NM_033118.3 13
MYOT Myotilin NM_006790.2 10
NEXN Nexilin NM_144573.3 13
NFE2L2 (NRF-2) Nuclear factor (erythroid-derived 2)-like 2 NM_006164.4 5
NOS1AP Nitric oxide synthase 1 (neuronal) adaptor protein NM_014697.2 10
NOTCH2 Notch homolog 2 NM_024408.3 34
OPA3 Optic atrophy 3 NM_025136.3 NM_001017989.2 2 2
OPA1 Optic atrophy 1 NM_130833.2 NM_130834.2 NM_015560.2 29 30 29


PACS2 Phosphofurin acidic cluster sorting protein 2 NM_001100913.2 24
PARK2 (PARKIN) Parkinson protein 2, E3 ubiquitin protein ligase NM_004562.2 12
PINK1 PTEN induced putative kinase 1 NM_032409.2 8
PKP2 Plakophilin 2 NM_004572.3 14
PLN Phospholamban NM_002667.3 2
PMM2 Phosphomannomutase 2 NM_000303.2 8
PPARA Peroxisome proliferator-activated receptor alpha NM_001001928.2 NM_005036.4 8 8
PPARD (PPARβ) Peroxisome proliferator-activated receptor delta NM_006238.4 NM_177435.2 8 7
PPARG Peroxisome proliferator-activated receptor gamma NM_015869.4 NM_005037.5 7 7
PPARGC1A (PGC-1α) Peroxisome proliferator-activated receptor gamma, coactivator 1 alpha NM_013261.3 13
PPID Peptidylprolyl isomerase D NM_005038.2
PSEN1 Presenilin 1 NM_000021.3 12
PSEN2 Presenilin 2 NM_000447.2 13
RBM20 RNA-binding motif protein 20 NM_001134363.1 14
RXRA Retinoid X receptor, alpha NM_002957.4 10
RXRB Retinoid X receptor, beta NM_021976.3 10
RXRG Retinoid X receptor, gamma NM_006917.4 10
RYR1 Ryanodine receptor 1 NM_000540.2 106
RYR2 Ryanodine receptor 2 NM_001035.2 105
SGCE Sarcoglycan epsilon NM_003919.2 NM_001099400.1 NM_001099401.1 11 11 12
SCN1B Sodium channel, voltage-gated, type I, beta NM_001037.4 NM_199037.3 6 3
SCN3A Sodium channel, voltage-gated, type III, alpha subunit NM_006922.3 6
SCN3B Sodium channel, voltage-gated, type III, beta subunit NM_018400 28
SCN4A Sodium channel, voltage-gated, type IV, alpha subunit NM_000334.4 24
SCN4B Sodium channel, voltage-gated, type IV, beta NM_174934.3 5

SCN5A Sodium channel, voltage-gated, type V, alpha subunit NM_000335.4 28
SDHA Succinate dehydrogenase complex, subunit A, flavoprotein NM_004168.2 13
SGCA Alpha-sarcoglycan NM_000023.2 10
SGCB Beta-sarcoglycan NM_000232.4 6
SGCD Sarcoglycan, delta NM_000337.5 9
SGCG Sarcoglycan, gamma NM_000231.2 8
SIRT1 Sirtuin 1 NM_012238.4 NM_001142498.1 9 8
SNTA1 Syntrophin, alpha 1 NM_003098.2 8
SOD2 Superoxide dismutase 2 NM_001024465.1 NM_000636.2 6 5
SQSTM1 Sequestosome 1 NM_003900.4 8
SYNE1 Synaptic nuclear envelope protein 1 NM_182961.3 146
SYNE2 Synaptic nuclear envelope protein 2 NM_182914.2 116
TAZ Tafazzin NM_000116.3 11
TCAP Titin-cap (telethonin) NM_003673.3 2
TFAM Transcription factor A, mitochondrial NM_003201.1 7
TMEM15 Dolichol kinase NM_014908.3 1
TMEM43 Transmembrane protein 43 NM_024334.2 12
TMPO Thymopoietin NM_003276.2 4
TNNC1 Troponin C type 1 NM_003280.2 6
TNNI3 Troponin I type 3 NM_000363.4 8
TNNT2 Troponin T type 2 NM_001001430.1 NM_000364.2 16 16
TPM1 Tropomyosin 1 NM_001018005.1 NM_000366.5 10 10
TPM2 Tropomyosin 2 (beta) NM_003289.3 NM_213674.1 9 9
TTN Titin NM_133378.4 NM_003319.4 312 191
TTR Transthyretin NM_000371.3 4
VCL Vinculin NM_014000.2 22
ZFPM2 Zinc finger protein, multitype 2 NM_012082.3 8

 


Supplemental Table 3. WGS variants in the Myopathy Super-Panel of 204 genes.

Subject #SNVs/genome #nsSNVs/exome #SNVs in myopathy transcripts #nsSNVs in myopathy transcripts #Indels/genome #Indels in myopathy transcripts
DCM-01 3677167 11590 786 170 628107 86
DCM-AAB03 3648094 11333 858 164 601993 85
DCM-AAW02 3675915 11423 869 177 602159 82
DCM-AAY02 3741778 11620 922 189 604913 83
LGMD-AH01 3672600 11276 750 141 604991 83
DCM-Q14 3703781 11774 809 167 613501 94
SD-303 3625506 11232 840 182 582597 82
DCM-AAL01 3886088 12114 970 198 638326 110
MDC-01 3677031 11461 769 135 617143 88
DCM-BI01 4020554 12450 907 179 677526 92
DCM-BH01 3646720 11175 740 143 602727 84

SNVs=Single nucleotide variants; nsSNVs=Non-synonymous single nucleotide variants; indel=insertion/deletion.


Supplemental Table 4. Applying protein prediction algorithms and frequency data restricts the number of variants.

Subject Missense Nonsense Intronic, SS Total #SNVs Indels Total #Variants
DCM-01 4 0 2 6 0 6
DCM-AAB03 6 0 0 6 0 6
DCM-AAW02 0 1 0 1 0 1
DCM-AAY02 4 0 2 6 0 6
LGMD-AH01 7 0 0 7 0 7
DCM-Q14 4 0 0 4 1 5
SD-303 7 0 0 7 0 7
DCM-AAL01 7 0 0 7 0 7
MDC-01 2 0 1 3 0 3
DCM-BI01 11 1 1 13 1 14
DCM-BH01 3 0 0 3 0 3

SS=Splice site; SNV=Single nucleotide variant.
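The totals in Supplemental Table 4 can be checked arithmetically. The sketch below is an illustration only: it assumes that "Total #SNVs" is the sum of the missense, nonsense, and intronic splice-site counts, and that "Total #Variants" adds the indels to that figure. The names `TABLE4` and `inconsistent_rows` are hypothetical and do not come from the study's pipeline.

```python
# Rows transcribed from Supplemental Table 4:
# (subject, missense, nonsense, intronic splice-site, total SNVs, indels, total variants)
TABLE4 = [
    ("DCM-01", 4, 0, 2, 6, 0, 6),
    ("DCM-AAB03", 6, 0, 0, 6, 0, 6),
    ("DCM-AAW02", 0, 1, 0, 1, 0, 1),
    ("DCM-AAY02", 4, 0, 2, 6, 0, 6),
    ("LGMD-AH01", 7, 0, 0, 7, 0, 7),
    ("DCM-Q14", 4, 0, 0, 4, 1, 5),
    ("SD-303", 7, 0, 0, 7, 0, 7),
    ("DCM-AAL01", 7, 0, 0, 7, 0, 7),
    ("MDC-01", 2, 0, 1, 3, 0, 3),
    ("DCM-BI01", 11, 1, 1, 13, 1, 14),
    ("DCM-BH01", 3, 0, 0, 3, 0, 3),
]


def inconsistent_rows(rows):
    """Return the subjects whose totals do not equal the sum of their parts.

    Assumption (not stated in the table): total SNVs = missense + nonsense
    + splice-site, and total variants = total SNVs + indels.
    """
    bad = []
    for subject, missense, nonsense, splice, total_snvs, indels, total in rows:
        snvs_ok = missense + nonsense + splice == total_snvs
        total_ok = total_snvs + indels == total
        if not (snvs_ok and total_ok):
            bad.append(subject)
    return bad


if __name__ == "__main__":
    print("Rows with mismatched totals:", inconsistent_rows(TABLE4))
    totals = [row[-1] for row in TABLE4]
    print("Range of variants per subject:", min(totals), "to", max(totals))
```

Under these assumptions every row is internally consistent, and the per-subject totals span the 1 to 14 range of potentially pathogenic variants reported in the abstract.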


   

Supplemental Table 5. Variants used to test SNV Pipeline

Chr Location REF Allele ALT Allele Gene Amino Acid Change
12 21958221 C T ABCC9 A1140T
12 21958118 G A ABCC9 T1174I
15 35085599 C T ACTC1 E101K
15 35085632 G A ACTC1 H90Y
1 236849999 A G ACTN2 Q9R
10 121431885 C T BAG3 P151L
12 2614110 G A CACNA1C G406R
12 2224456 C T CACNA1C A39V
10 18828274 C T CACNB2b S442L
3 8775642 G A CAV3 R27Q
11 19209758 T C CSRP3 K69R
11 19209833 A G CSRP3 L44P
11 19209792 A C CSRP3 C58G
2 220286087 G C DES R350P
2 220290421 C T DES T442I
2 220290678 G T DES S460I
2 220290456 C T DES R454W
20 10633181 C T JAG1 G274D
20 10637100 C T JAG1 C234Y
21 35742856 C T KCNE2 R27C
17 68171392 A T KCNJ2 D71V
17 68171832 C T KCNJ2 R218W
11 2594214 G C KCNQ1 V180L
11 2604687 A T KCNQ1 Y188F
11 2604765 C T KCNQ1 A214V
11 2593319 G A KCNQ1 V127M
10 88446920 G A LDB3 A147T
10 88446975 C T LDB3 A165V
10 88459081 C T LDB3 R268C
1 156084983 C T LMNA L92F
1 156106204 C T LMNA R341W
1 156084975 G T LMNA R89L
1 156105885 G A LMNA R265H
5 138643358 C G MATR3 S85C
11 47369209 G A MYBPC3 R282W
11 47371592 G A MYBPC3 R160W
11 47369220 C T MYBPC3 I820N
14 23863503 A T MYH6 P830L
14 23863473 G A MYH6 A1004S
14 23862646 C A MYH6 E1457K
14 23857123 C T MYH6 R1832C
14 23862177 C G MYH6 Q1065H
14 23894048 C T MYH7 R870H


14 23897840 C T MYH7 E483K
12 111356964 C T MYL2 A13T
12 111351120 G C MYL2 P76A
3 46901001 T C MYL3 M149V
1 78408441 A G NEXN Y588C
1 78408434 G T NEXN G586*
1 78408317 C A NEXN P547T
1 78383902 C G NEXN Q131E
1 78392548 C T NEXN R215C
1 120510178 C T NOTCH2 C405Y
6 118880109 C T PLN R9C
14 73678519 A G PSEN1 M364V
10 112572068 C T RBM20 P638L
10 112572061 C A RBM20 R636S
10 112572055 G A RBM20 R634
19 35524558 C G SCN1B C121W
11 123524481 A G SCN3B L10P
17 62034787 G A SCN4A T704M
17 62018868 T C SCN4A M1692V
17 62036629 C T SCN4A R672H
17 62049557 T C SCN4A I141V
17 62019214 C T SCN4A M1476I
11 118011980 G A SCN4B L179F
3 38622661 C A SCN5A A997S
3 38592386 C T SCN5A R1772H
3 38640461 C T SCN5A G657
5 156186312 G A SGCD E261K
5 156022010 T G SGCD S150A
17 37822118 G A TCAP R87Q
3 52488019 A G TNNC1 Y5H
3 52485301 C T TNNC1 G159d
19 55665514 G C TNNI3 R145G
19 55663219 T G TNNI3 K206Q
19 55663266 T C TNNI3 D190G
19 55668428 T G TNNI3 P33
19 55663276 G C TNNI3 E187D
1 201331116 C A TNNT2 R143L
1 201333494 G A TNNT2 R101W
1 201328764 C T TNNT2 D208N
1 201334766 A T TNNT2 I20N
1 201334425 C T TNNT2 R102Q
1 201334372 A T TNNT2 F105I
1 201332507 C T TNNT2 E104K
15 63349227 T C TPM1 V117A
15 63349227 T C TPM1 V95A
15 63336271 G A TPM1 E54K
15 63336229 G A TPM1 E40K
9 35685747 G C TPM2 R91G
9 35685483 T G TPM2 Q147P


2 179634621 G A TTN T2850I
2 179392277 A G TTN M26986T
2 179667000 C T TTN V54M
10 75871844 C T VCL R975W
10 75842257 C A VCL L277M
X 100653021 G A GLA R356W
X 100653894 C T GLA R227Q
X 100653930 T C GLA N215S


Supplemental Table 6. Variants used to test the Splice Site Pipeline

Chr Position Location REF allele ALT allele Gene bp from exon
11 62462039 c.639+1G>A G A BSCL2 1
14 24728909 c.984+1G>A G A TGM1 1
1 21889602 c.298-1G>A G A ALPL 1
14 23900793 c.818+1G>A G A MYH7 1
14 23900791 c.818+3G>C G A MYH7 1
1 120460386 c.5930-1G>A G A NOTCH2 1
21 39763580 c.2592+1G>A G A ERG 1
15 42694334 c.1536+1G>T G T CAPN3 1
4 74282071 c.1289+1G>A G A ALB 1
X 32591964 c.1603-1G>T G A DMD 1
9 137622081 c.925-2A>G A G COL5A1 2
7 117227888 c.1811+1G>C G C CFTR 1
7 117230496 c.1898+3A>G A G CFTR 3
11 134014890 c.747+1G>T G T JAM3 1
3 37056039 c.790+4A>T A T MLH1 4
3 37083827 c.1731+5G>A G A MLH1 4
13 113769973 c.572-1G>A G A F7 1
X 46736931 c.1073-9T>A T A RP2 9
19 11213463 c.313+1G>A G A LDLR 1
X 32472776 c.3603+3A>T A T DMD 3
3 193360552 c.985-2A>G A G OPA1 2
11 2549153 c.387-5T>A T A KCNQ1 5
16 57017325 c.1407+2T>C T C CETP 2
13 32950933 c.8754+5G>A G A BRCA2 5
X 110385428 c.276+4A>G A G PAK3 4
11 47354745 c.3330+2T>G T G MYBPC3 2
17 41276028 c.81-6T>A T A BRCA1 6
13 32930748 c.7617+2T>G T G BRCA2 2
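The "bp from exon" column of Supplemental Table 6 corresponds to the intronic offset written in the Location column (for example, the "+1" in c.639+1G>A). As a minimal sketch, and not the authors' splice-site pipeline, the offset can be read from the location string with a regular expression. The names `SpliceVariant` and `parse_location` are hypothetical, and the pattern only covers single-nucleotide intronic substitutions of the kind listed in this table, with optional whitespace and "-->" tolerated.

```python
import re
from typing import NamedTuple


class SpliceVariant(NamedTuple):
    exon_base: int  # last (for +) or first (for -) coding position of the flanking exon
    sign: str       # "+" = downstream of the exon, "-" = upstream of the exon
    offset: int     # distance into the intron, i.e. "bp from exon"
    ref: str
    alt: str


# Matches strings such as "c.639+1G>A", "c. 639+1 G >A" or "c.5930-1 G-->A".
_LOCATION = re.compile(
    r"c\.\s*(\d+)\s*([+-])\s*(\d+)\s*([ACGT])\s*-*>\s*([ACGT])"
)


def parse_location(text: str) -> SpliceVariant:
    """Parse an intronic substitution written relative to the coding sequence."""
    match = _LOCATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not an intronic substitution: {text!r}")
    base, sign, offset, ref, alt = match.groups()
    return SpliceVariant(int(base), sign, int(offset), ref, alt)


if __name__ == "__main__":
    for location in ("c. 639+1 G >A", "c.5930-1 G-->A", "c.1073-9T>A"):
        variant = parse_location(location)
        print(location, "->", variant.sign, variant.offset, "bp from exon")
```

Comparing the parsed offset and alleles with the tabulated columns is one simple way to spot transcription slips in a test set such as this one.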

 


Supplemental Table 7. Potentially pathogenic variants identified in SD-303

Gene Function Position Transcript PhastCons GERP PP2 SIFT NHLBI ESP frequency 1000 Genomes frequency
SCN5A Missense R814W NM_000335 1 3.92 Probably Damaging Damaging Absent Absent
BBS5* Missense R11Q NM_152384 1 5.66 Probably Damaging Damaging 0.0005 Absent
HPS4 Missense T251S NM_022081 0.453 4.25 Possibly Damaging Tolerated 0.007 0.0041
SYNE2 Missense Q2615E NM_182914 0 2.02 Possibly Damaging Absent Absent
DSP Missense R1458G NM_004415 0.976 3.82 Possibly Damaging Tolerated 0.001 0.001
ADRB2 Missense T164I NM_000024 1 5.21 Tolerated 0.011 0.01
AKAP9 Missense S2186P NM_005751 0.996 3.11 Benign Tolerated 0.002 Absent

Frequency refers to the overall variant frequency. *Variant does not segregate with disease.


Supplemental Table 8. Potentially pathogenic variants identified in DCM-AAL01

Gene Function Position Transcript PhastCons GERP PP2 SIFT NHLBI ESP frequency 1000 Genomes EUR frequency
DES Missense R127P NM_001927 1 4.64 Probably damaging Damaging Absent Absent
MYH11 Missense R798C NM_022844 0.955 5.11 Probably damaging Absent Absent
AKAP6 Missense E935K NM_004274 0.996 4.95 Probably damaging Damaging Absent Absent
AKAP9 Missense Q3444R NM_005751 1 4.11 Possibly damaging Damaging 0.014 Absent
LDB3 Missense I558V NM_007078 0.997 4.8 Possibly damaging Tolerated Absent Absent
BRAF Missense E26D NM_004333 1 1.93 Possibly damaging Tolerated 0.0052 Absent
FOXO3 Missense P456Q NM_201559 NM_001455 1 5.41 Benign Tolerated Absent Absent

Frequency refers to the overall variant frequency in the NHLBI ESP. EUR=European.


Supplemental Table 9. Potentially pathogenic variants identified in DCM-AAB03

Gene Function Position Transcript PhastCons GERP PP2 SIFT NHLBI ESP frequency 1000 Genomes frequency
TPM1 Missense D230N NM_000366 NM_001018005 1 5.49 Probably damaging Tolerated Absent Absent
SCN5A* Missense S216L NM_000335 1 3.95 Probably damaging Damaging 0.0011 0.0009
GLA Missense R118C NM_000169 0.093 3.27 Possibly damaging Damaging 0.00034 Absent
RXRG Missense S366L NM_006917 0.988 3.91 Possibly damaging Damaging Absent Absent
AKAP6 Missense Q639K NM_004274 0.998 5.76 Possibly damaging Tolerated 0.0021 Absent
CLCN1 Missense K614N NM_000083 1 4.49 Possibly damaging Tolerated 0.0024 Absent

Frequency refers to the overall variant frequency. PP2=PolyPhen-2. *Variant does not segregate with disease.


Supplemental Table 10. Potentially pathogenic variants identified in DCM-BI01

Gene Function Position Transcript PhastCons GERP PP2 MaxEnt NHLBI ESP frequency 1000 Genomes frequency
TNNT2 Indel K210del K219del NM_001001430 NM_000364 Absent
TTN Intronic, SS c.42521-5C>G c.62012-5C>G NM_003319 NM_133378 0.767 4.34 5.97 Absent
CSRP3* Nonsense Q189X NM_003476 0.964 5.71 Absent Absent
FLNC* Missense R2176H NM_001458 0.993 5.29 Absent Absent
SYNE1 Missense S3255L NM_182961 1 5.58 Probably damaging 0.0075 0.0041
AP3B1 Missense E681G NM_003664 1 5.32 Probably damaging 0.00093 0.0046
KCNE2 Missense T8A NM_172201 0.922 5.06 Probably damaging 0.0045 0.0018
SYNE2 Missense E6735K NM_182914 1 5.18 Probably damaging 0.000093 0.0009
COL6A2 Missense A995T NM_001849 0.999 2.49 Probably damaging Absent 0.01
HPS1 Missense E9D NM_000195 0.858 -1.07 Probably damaging 0.0043 0.0041
HDAC6 Missense N1200D NM_006044 0.969 5.47 Possibly damaging 0.0022 0.03
COL6A2 Missense E278K NM_001849 1 3.63 Possibly damaging 0.0075 0.01
MYH6 Missense R860H NM_002471 0.998 4.33 Benign 0.00057 0.0037
FBN1* Missense S1481G NM_000138 1 5.71 Benign 0.0013 0.0005

Frequency refers to the overall variant frequency. PP2=PolyPhen-2. *Variant does not segregate with disease in III-2, III-3 (Figure 4B).


Supplemental Table 11. Potentially pathogenic variants identified in DCM-AAY02

Gene Function Position Transcript PhastCons GERP PP2 SIFT MaxEnt NHLBI ESP frequency 1000 Genomes frequency
FLNC Intronic, SS c.3791-1G>C NM_001458 1 4.61 NA NA -8.07 0 0
TTN* Intronic, SS c.26492-8T>G NM_133378 0.48 4.35 NA NA -6.23 0.0008 0.0009
DYSF* Missense R285Q NM_001130987 0.986 4.65 Probably Damaging NA 0.0004 0
SYNE2 Missense R968H NM_182914 0.998 4.7 Probably Damaging NA 0.0001 0
ABCG5 Missense A98G NM_022436 0.995 5.13 Possibly Damaging Damaging 0.001 0.001
KCNH2 Missense I711V NM_000238 1 4.19 Possibly Damaging Damaging 0.00007 0

Frequency refers to the overall variant frequency. *Identified in unaffected sibling.


Supplemental Table 12. Potentially pathogenic variants identified in DCM-AAW02

Gene Function Position Transcript PhastCons GERP PP2 SIFT NHLBI ESP frequency 1000 Genomes frequency
TTN Nonsense E3707X NM_003319 0.035 5.41 NA NA Absent Absent

Frequency refers to the overall variant frequency in the NHLBI ESP.


Supplemental Table 13. Potentially pathogenic variants identified in DCM-Q14

Gene Function Position Transcript PhastCons GERP PP2 SIFT NHLBI ESP frequency 1000 Genomes frequency
TTN 1bp insertion L20605PfsX2 ENST00000342992 NA NA NA NA 0 0
ACTN2* Missense G111R NM_001103 0.999 5.51 Probably Damaging Damaging 0.0001 0
FKRP* Missense E343Q NM_024301 1 3.73 Probably Damaging Tolerated 0 0
COL6A2* Missense E106K NM_001849 1 3.88 Possibly Damaging Tolerated 0.0076 0.0018
SDHA* Missense M519T NM_004168 0.914 3.26 Possibly Damaging Damaging 0 0

Frequency refers to the overall variant frequency. *Variant did not segregate with disease.


Supplemental Figure 1. Pedigree for subject SD-303. Segregation analysis was performed for the SCN5A R814W variant. The variant segregated with disease in all family members tested (n=16) (carrier= +, noncarrier= -).


  Supplemental Figure 2. Pedigree for subject DCM-AAW02. Both the proband (*) and her father are heterozygous for a nonsense variant in the giant protein titin.

[Pedigree diagram: generations I to III; proband marked *; two individuals labeled TTN E3707X +/-; symbol key: DCM, suspected.]


Supplemental Figure 3. Pedigree for subject DCM-AAY02. The proband (*) carries a predicted splice-site altering variant in the gene FLNC. This variant is absent in her unaffected sister.

[Pedigree diagram: generations I to IV; proband marked *; one individual labeled FLNC c.3791-1G>C +/- and one labeled FLNC c.3791-1G>C -/-; symbol key: DCM.]


Supplemental Figure 4. Pedigree for DCM-Q14. The proband is indicated with an asterisk (*). The frameshifting TTN variant was genotyped in 38 family members and found in 6 affected individuals in the pedigree (indicated with a +).