

Non Optical Semi-Conductor Next Generation Sequencing of the Main Cardiac QT-Interval Duration Genes in Pooled DNA Samples

Juan Gómez & Julian R. Reguero & César Morís & Victoria Alvarez & Eliecer Coto

Received: 1 September 2013 / Accepted: 21 October 2013 / Published online: 5 November 2013
© Springer Science+Business Media New York 2013

Abstract DNA variants in the genes encoding cardiac channels have been associated with inherited arrhythmias and with the QT interval in the general population. Next generation sequencing technologies are of special interest for uncovering the genetic variation in these genes. Amplifying and sequencing DNA pools (instead of single individuals) would facilitate the rapid and cost-effective screening of large numbers of individuals. However, pooling could dilute the signal of rare variants below the detection limit. To validate this approach, a pool of 20 individuals with known rare unique variants in five genes was amplified in only two tubes and sequenced using the non optical semi-conductor technology (Ion Torrent PGM, Life Technologies). We show that this could be an effective strategy for the screening of large cohorts. Among other applications, it would facilitate the discovery of new sequence variants linked to cardiac arrhythmia in the general population.

Keywords Next generation sequencing · Ion Torrent sequencing · Cardiac arrhythmia genes · DNA pools

Introduction

Brugada and long-QT syndromes (BS/LQTS) are arrhythmogenic diseases linked to mutations in key cardiac potassium and sodium channels [1, 2]. To date, at least 10 genes have been implicated in BS/LQTS, with KCNQ1, KCNH2, and SCN5A accounting for most of the mutations [3, 4]. In addition, several nucleotide variants in these and other cardiac channel genes (such as KCNE1 and KCNE2) have been linked to differences in the QT interval duration in the general population [5, 6]. Thus, sequencing these genes is relevant both to search for mutations in patients and families with BS/LQTS and to characterize new variants linked to differences in the QT interval in the general population or to the risk of developing cardiac arrhythmia in response to pharmacological treatment [7, 8].

Due to the large size of the SCN5A, KCNQ1, and KCNH2 genes (>15 coding exons), Sanger sequencing of single amplicons is labor-intensive and expensive. Thus, massive next generation sequencing (NGS) technologies would be relevant for the screening of these genes at a population scale. The Ion Torrent Personal Genome Machine (PGM) is a semi-conductor (instead of optical) NGS technology [9–11]. The reported PGM procedures are based on the amplification of DNA from single patients, followed by labeling of the PCR products with specific bar-code primers and array sequencing [12, 13]. Because each fragment can be recognized by its bar-code, many patients can be sequenced in a single sequencing array. The amplification of DNA pools from several patients could reduce the cost and effort of sequencing large sample sets. The rare nucleotide variants identified in a pool of DNAs could then be assigned to a specific individual through Sanger sequencing (a workflow comparing single and pooled samples is depicted in Supplementary Figure 1). NGS of DNA pools rather than individual samples has the limitation that a nucleotide variant present in only one patient is diluted by the wild-type alleles of the other pool members, which could result in a signal too low to be detected (false negatives). This limitation discourages the use of patient DNA pools in clinical practice.

Associate Editor Enrique Lara-Pezzi oversaw the review of this article

Electronic supplementary material The online version of this article (doi:10.1007/s12265-013-9516-6) contains supplementary material, which is available to authorized users.

J. Gómez · V. Alvarez · E. Coto (*)
Genética Molecular-Laboratorio de Medicina-Fundación Renal (IRSIN-FRIAT), Hospital Universitario Central Asturias, 33006 Oviedo, Spain
e-mail: [email protected]

J. R. Reguero · C. Morís
Cardiología-Fundación Asturcor, Hospital Universitario Central Asturias, Oviedo, Spain

C. Morís · E. Coto
Departamento de Medicina, Universidad de Oviedo, Oviedo, Spain

E. Coto
Red de Investigación Renal (REDINREN), Madrid, Spain

J. of Cardiovasc. Trans. Res. (2014) 7:133–137
DOI 10.1007/s12265-013-9516-6

The purpose of this study was to develop and validate a procedure for sequencing the main arrhythmogenic genes in pooled DNA samples, using a two-tube multiplex amplification and non optical semi-conductor NGS (Ion Torrent Personal Genome Machine, Life Technologies). This would facilitate a rapid and cost-saving search for DNA variants linked to the duration of the QT interval at a population scale.

Methods

Sample DNAs

This study was approved by the Ethical Committee of Hospital Universitario Central de Asturias (HUCA), and all participants gave signed informed consent. A total of 20 patients diagnosed with LQTS or BS and previously Sanger sequenced for the SCN5A, KCNH2, KCNQ1, KCNE1, and KCNE2 genes were recruited through the Cardiomyopathy Unit of the HUCA. Briefly, the coding nucleotides of each exon (plus at least six immediately flanking intronic nucleotides) of the five genes were amplified by polymerase chain reaction (PCR) with primers that matched the intron sequences. PCR fragments were purified and sequenced with BigDye chemistry on an ABI3130xl instrument (Applied Biosystems). Fragment sequences from each patient were compared with the reference sequence in the Ensembl database (www.ensembl.org) to identify the nucleotide variants. Each of the 20 patients was heterozygous for a different variant: a mutation (likely pathogenic and absent among controls; n = 8), a rare single nucleotide polymorphism (SNP) previously linked to the risk of cardiac arrhythmia (n = 6), or a SNP that was non-pathogenic or of uncertain effect (n = 6) (Table 1).

We adjusted each individual DNA to a concentration of 10 ng/μl using real-time TaqMan quantification with RNase P Detection Reagents (FAM™) (Life Technologies) in a 7500 Real Time PCR System (Applied Biosystems). Through this procedure, we also confirmed that all the DNAs were suitable for PCR amplification.

NGS with the Ion Torrent PGM and Data Analysis

A multiplex amplification was designed online (Ion AmpliSeq™ designer) to amplify the whole coding sequence plus at least five flanking intronic nucleotides of SCN5A, KCNH2, KCNQ1, KCNE1, and KCNE2. A total of 116 primer pairs were provided by the manufacturer in two tubes (Supplementary Table 1). The amplicons covered >99 % of the target sequence (Supplementary Tables 2 and 3). A pool containing 10 μl of each DNA was created. Assuming equimolar amounts of the 20 DNAs, each of the 20 control variants should therefore appear in 2.5 % of the reads (1 of the 40 alleles in the pool).
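The expected 2.5 % read fraction follows directly from the pool composition. The short Python sketch below reproduces that arithmetic; the function name and its parameters are illustrative and not part of the published protocol, and it assumes diploid samples contributing equal amounts of DNA, with each carrier heterozygous for the variant.

```python
from fractions import Fraction


def expected_variant_fraction(n_individuals: int, n_carriers: int = 1,
                              ploidy: int = 2) -> Fraction:
    """Expected fraction of reads carrying a variant in an equimolar DNA pool.

    Assumes every carrier is heterozygous (one variant allele each) and that
    all individuals contribute the same amount of DNA to the pool.
    """
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    if not 0 <= n_carriers <= n_individuals:
        raise ValueError("n_carriers must be between 0 and n_individuals")
    return Fraction(n_carriers, ploidy * n_individuals)


# Pool used in this study: 20 patients, each unique variant carried by one patient.
frac = expected_variant_fraction(20)
print(frac, f"= {float(frac):.1%}")  # 1/40 = 2.5%
```

Doubling the pool to 40 individuals would halve the expected fraction to 1/80, which is why the detection limit matters when pools grow.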

We performed a multiplex amplification of the pool followed by Ion Torrent NGS (see Supplementary Methods). Briefly, a total of 10 ng of the DNA pool was amplified in two Ampliseq tubes using the Ion AmpliSeq™ Library Kit (Life Technologies). The reactions were quantified (Agilent Bioanalyzer) and amplified using the Ion PGM template OT2 200 Kit and the Ion One-Touch instrument (Life Technologies). Template-positive spheres were recovered using Dynabeads MyOne Streptavidin C1 beads and qualified using the Ion Sphere quality control assay and the Qubit 2.0 fluorometer (Life Technologies). Sphere particles were loaded in a 316 (100 Mb) semi-conductor chip and sequenced using the PGM 200 sequencing kit protocol in the Ion Torrent PGM. The 316 chip was chosen (instead of the lower-capacity 314 and higher-capacity 318 arrays) to obtain approximately 50 reads per amplicon per patient.

We used 260-flow runs, which support a template read length of approximately 200 bp. Data from the PGM runs were processed using the Torrent Suite v3.4.2 software (Life Technologies) to generate sequence reads and to filter out poor-signal reads.
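As a rough plausibility check of the chip choice, the sequencing output needed for the stated target can be estimated from figures given in the text (116 amplicons, 20 patients, about 50 reads per amplicon per patient, reads of about 200 bp). The Python sketch below is only a back-of-the-envelope estimate: the helper name is hypothetical, the nominal capacities are the 10 Mb (314) and 100 Mb (316) values quoted in this article, and it ignores filtered reads, uneven amplicon coverage, and amplicons shorter than the read length.

```python
def required_megabases(n_amplicons: int, n_samples: int,
                       reads_per_amplicon_per_sample: int,
                       read_length_bp: int) -> float:
    """Approximate sequencing output (in Mb) needed to reach a read target."""
    total_reads = n_amplicons * n_samples * reads_per_amplicon_per_sample
    return total_reads * read_length_bp / 1e6


# Figures taken from the text; the 200 bp read length is an upper bound per read.
needed_mb = required_megabases(116, 20, 50, 200)

# Nominal chip capacities as quoted in this article (in Mb).
chip_capacity_mb = {"314": 10, "316": 100}
sufficient = [chip for chip, mb in chip_capacity_mb.items() if mb >= needed_mb]

print(f"needed: {needed_mb:.1f} Mb; sufficient chips: {sufficient}")
```

Under these assumptions the target exceeds the nominal output of the 314 chip and fits within that of the 316 chip, which is consistent with the choice described above.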

We classified amplicons into three groups according to coverage: those with <25× coverage (<1,000 reads) were discarded; those in the range 25–50× (1,000–2,000 reads) were considered admissible; and those with >50× coverage (>2,000 reads) were considered optimal.
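The coverage thresholds can be expressed as a small classification rule. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes that the "×" coverage values correspond to total reads per amplicon divided by the 40 alleles in the pool, a reading that matches the read counts quoted above (1,000 reads / 40 = 25×) but is not stated explicitly in the text.

```python
def classify_amplicon(total_reads: int, alleles_in_pool: int = 40) -> str:
    """Classify an amplicon by per-allele coverage.

    Thresholds follow the text: <25x discarded, 25-50x admissible, >50x optimal.
    The division by the number of alleles in the pool is an assumption.
    """
    if total_reads < 0:
        raise ValueError("total_reads must be non-negative")
    coverage = total_reads / alleles_in_pool
    if coverage < 25:
        return "discarded"
    if coverage <= 50:
        return "admissible"
    return "optimal"


# Total read counts taken from three rows of Table 1, plus one failed example.
for reads in (1329, 2005, 13362, 400):
    print(reads, classify_amplicon(reads))
```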

Results

The read quality for the SCN5A, KCNQ1, and KCNH2 amplicons is shown in Fig. 1. Of the 116 amplicons, 102 (88 %) gave optimal reads, 4 (3 %) were admissible, and 10 (9 %) gave null or poor reads. Among the non-readable amplicons, seven mapped to KCNH2, two to KCNQ1, and one to SCN5A. To determine whether these read failures were intrinsic to the Ampliseq design, we amplified and sequenced a pool of eight DNAs with the 314 (10 Mb) chip. We replicated the amplicon coverage failures, and thus concluded that the lack of nucleotide signal for these PCR fragments was likely due to amplification failure. The corresponding exons should thus be Sanger sequenced as part of the mutation screening. Because amplicons with high GC content are difficult to amplify, we determined the GC content of the 10 non-readable amplicons: 9 of the 10 had a GC content of >65 % (Supplementary Table 4).
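The GC-content check described above is straightforward to reproduce for any amplicon sequence. The following Python sketch shows one way to do it; the function names and the two example sequences are made up for illustration (the real amplicon sequences are in the supplementary material and are not reproduced here), and the 65 % threshold is the value quoted in the text.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence, ignoring non-ACGT symbols."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100 * sum(b in "GC" for b in bases) / len(bases)


def flag_gc_rich(amplicons: dict, threshold: float = 65.0) -> list:
    """Return the names of amplicons whose GC content exceeds the threshold."""
    return [name for name, seq in amplicons.items() if gc_content(seq) > threshold]


# Hypothetical toy sequences, not the study's amplicons.
examples = {
    "amp_gc_rich": "GGCCGGCCGA",   # 9 of 10 bases are G or C
    "amp_balanced": "ATGCATTAGC",  # 4 of 10 bases are G or C
}
print(flag_gc_rich(examples))
```

Flagging GC-rich amplicons at the design stage would indicate in advance which exons are likely to need Sanger sequencing.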

We identified a total of 37 variants (34 single nucleotide changes and 3 indels) in the five genes: 14 in SCN5A, 11 in KCNH2, 9 in KCNQ1, 2 in KCNE1, and 1 in KCNE2 (Supplementary Table 5; Supplementary Figure 2). Sixteen of the 20 control variants were in readable amplicons, and all of them were successfully identified in the DNA pool. Moreover, the rare allele accounted for approximately 2–4 % of the reads, close to the expected 1:40 ratio (Table 1). In addition, we identified all the known nucleotide polymorphisms in the readable sequences. Only 2 of the 37 variants were novel (not previously reported); both were indels that mapped to homopolymer sequences at least four nucleotides long. Sanger sequencing of the 20 patients showed that both were false positives.
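The agreement between observed and expected read fractions can be summarized from the 16 rare-allele percentages listed in Table 1. The Python sketch below is a simple descriptive summary, not an analysis performed by the authors; the helper name is illustrative, and the values are copied from the last column of Table 1.

```python
from statistics import mean

# Rare-allele read percentages of the 16 detected control variants (Table 1).
observed_pct = [3.66, 3.17, 3.00, 4.08, 3.74, 3.28, 3.43, 3.73,
                3.06, 3.40, 3.45, 2.50, 3.39, 3.54, 2.67, 2.91]
expected_pct = 100 / 40  # one variant allele among 40 alleles in the pool


def summarize(values, expected):
    """Return count, range, mean, and mean/expected ratio of the read fractions."""
    avg = mean(values)
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": avg,
        "mean_over_expected": avg / expected,
    }


summary = summarize(observed_pct, expected_pct)
for key, value in summary.items():
    print(f"{key}: {value:.2f}" if isinstance(value, float) else f"{key}: {value}")
```

None of the 16 values falls below the expected 2.5 %, and the lowest observed fraction equals it, so every detected variant was read at or above the level predicted for an equimolar pool.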

Discussion

The Ion Torrent PGM is a semi-conductor (instead of optical) based sequencing technology [9–11]. The reported PGM procedures are based on the amplification of DNA from single patients. PCR fragments from each individual are specifically bar-coded and sequenced. Li et al. sequenced a total of 15 patients for KCNQ1, KCNH2, SCN5A, KCNE1, KCNE2, and RYR2 using two NGS platforms (MiSeq, Illumina, and Ion Torrent, Life Technologies) [14]. A total of 386 primer pairs were designed to amplify the whole coding sequence of the six genes, and each sample DNA was combined with primer pairs in a microfluidic chip and amplified. Pooled amplicons from each patient were harvested, bar-code labeled, and sequenced. Because each patient's amplicons are labeled with a specific bar-code, the libraries from the 15 patients could be pooled and sequenced in a single NGS run.

The amplification of pooled samples instead of single DNAs in only two tubes containing dozens of primer pairs reduces the need for multiple amplifications per sample (Supplementary Figure 1). In addition, pooling avoids the need for bar-coding and library amplification of each patient, thus reducing the cost and labor of handling large sample sets. In spite of some limitations, such as the poor or absent amplification of some fragments, this approach would facilitate the screening of large numbers of patients, virtually at a population scale. The amplification of single fragments from DNA pools has been used to re-sequence and discover rare variants linked to common diseases [15, 16]. However, to our knowledge, multiplex amplification and PGM-NGS of pooled DNAs has not been reported. Such an approach would reduce the time and cost of screening large numbers of individuals. DNA pools have been successfully sequenced with other NGS platforms, although some authors concluded that bar-code indexing of individual samples before pooling (rather than a genomic DNA pooling strategy) should be preferred to avoid false positives/negatives in larger sample sets [17]. Thus, the ability of this approach to detect rare variants in a pool of DNAs is a critical issue that needs to be addressed.

Table 1 Rare variants in KCNH2, KCNQ1, SCN5A, KCNE1, and KCNE2, each present in only one of the 20 patients used to create the DNA pool

| Gene | nt position | nt change | Exon/intron | Effect | Disease effect* | Total reads | Reads rare var (%) |
|---|---|---|---|---|---|---|---|
| KCNH2 | 150644428 | C/A | Ex 13 | p.R1047L | 1 | 1,329 | 3.66 |
| KCNH2 | 150644805 | G/A | Ex 12 | p.P952S | 2 | 2,350 | 3.17 |
| KCNH2 | 150654622 | A/AG | Int 4−31 | Intron SNP | 3 | 5,277 | 3.00 |
| KCNH2 | 150655280 | Del G | Ex 4 | p.S261fs | 2 | Failed# | – |
| KCNH2 | 150655339 | C/A | Ex 4 | p.R242S | 2 | Failed | – |
| KCNH2 | 150644926 | Ins GG | Ex 12 | p.G911fs | 2 | Failed | – |
| KCNH2 | 150645534 | A/C | Ex 11 | p.K897T | 1 | Failed | – |
| KCNQ1 | 2549061 | G/C | Int 1−97 | Intron SNP | 3 | 5,999 | 4.08 |
| KCNQ1 | 2608850 | G/T | Ex 9 | p.K393N | 1 | 8,089 | 3.74 |
| KCNQ1 | 2790163 | T/C | Int 12+14 | Intron SNP | 3 | 9,616 | 3.28 |
| SCN5A | 38592503 | C/T | Ex 28 | p.S1787N | 1 | 13,362 | 3.43 |
| SCN5A | 38592734 | G/A | Ex 28 | p.S1710L | 2 | 7,620 | 3.73 |
| SCN5A | 38597145 | A/G | Int 26+2 | Splicing | 2 | 2,245 | 3.06 |
| SCN5A | 38597867 | C/T | Int 25+65 | Intron SNP | 3 | 4,546 | 3.40 |
| SCN5A | 38598694 | A/G | Int 24+28 | Intron SNP | 3 | 10,503 | 3.45 |
| SCN5A | 38622687 | C/T | Ex 17 | p.R988Q | 2 | 2,005 | 2.50 |
| SCN5A | 38622727 | G/A | Ex 17 | p.R975W | 1 | 2,444 | 3.39 |
| SCN5A | 38674475 | C/T | Int 2+51 | Intron SNP | 3 | 7,096 | 3.54 |
| KCNE1 | 35821680 | C/T | Ex 4 | p.D85N | 2 | 7,902 | 2.67 |
| KCNE2 | 35742799 | A/G | Ex 2 | p.T8A | 1 | 5,883 | 2.91 |

The number of reads per amplicon (containing these variants) and the percentage of reads of the rare allele are also indicated

* Disease effect: 1 = rare missense polymorphism, increased risk for arrhythmia; 2 = mutation (absent in controls, likely damaging); 3 = rare intronic polymorphism, likely non pathogenic
# Failed = <25× coverage per amplicon

We designed and validated a protocol based on the multiplex amplification of DNA pools with a custom Ampliseq for five cardiac arrhythmia genes. For this purpose, we created a pool of 20 individuals, each a heterozygous carrier of a single nucleotide variant in one of the five genes. These variants were successfully identified in all the readable amplicons. The four non-detected variants were in amplicons with no nucleotide reads, and mapped to GC-rich regions that are difficult to amplify. All 16 detected variants gave rare-allele read fractions of approximately 2–4 %, close to the expected 2.5 % (1/40 alleles). One of the main limitations of the PGM (and other NGS platforms) is the presence of false positives, mainly indels in homopolymer regions >4 nucleotides [6]. This was in agreement with our results, in which the only two false positive calls were in homopolymer sequences.
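Because the false positives were indels in homopolymer stretches, a simple filter can flag indel calls that fall within or next to such runs so that they are prioritized for Sanger confirmation. The Python sketch below illustrates the idea; it is not the Torrent Suite variant caller or the authors' procedure. The function names, the 0-based coordinate convention, and the example sequence are all assumptions made for illustration, and the minimum run length of four follows the description in the Results.

```python
import re


def homopolymer_runs(seq: str, min_len: int = 4) -> list:
    """Return (start, run) tuples for homopolymer runs of at least min_len bases.

    Positions are 0-based offsets into seq.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq.upper())]


def indel_in_homopolymer(seq: str, pos: int, min_len: int = 4) -> bool:
    """True if an indel at 0-based position pos lies within or directly after a run."""
    return any(start <= pos <= start + len(run)
               for start, run in homopolymer_runs(seq, min_len))


# Hypothetical reference fragment, not taken from the study.
fragment = "ACGTTTTTAGGGCAAAAC"
print(homopolymer_runs(fragment))
print(indel_in_homopolymer(fragment, 5), indel_in_homopolymer(fragment, 10))
```

In this toy fragment the run of three G bases is ignored because it is shorter than the minimum length, whereas the T and A runs are reported.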

Finally, our experimental approach should be relevant to uncover rare nucleotide variants linked to differences in the QT interval duration in the general population [14, 15]. However, the use of patient DNA pools has some limitations that make it inadvisable for clinical use. Mainly, although the control variants in all the readable amplicons were detected, we cannot exclude that some samples in another pool might fail to amplify. In that case, the corresponding patient would be wrongly classified as a non-carrier.

In conclusion, our study suggests that Ampliseq + PGM sequencing of DNA pools could be useful for a rapid and cost-effective screening of large cohorts. This procedure would facilitate the identification of the genetic bases of cardiac arrhythmia in the general population. Although a 1/40 dilution was enough to detect the unique control variants, the lower limit of detection was not characterized. Thus, we cannot exclude that rare unique variants could be read above the sequencing noise level in even larger pools.
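How the chance of detecting a unique variant changes with pool size can be explored with a simple binomial model. The Python sketch below is a hypothetical illustration and not an analysis from this study: it assumes reads are sampled independently from an equimolar pool, ignores sequencing error and amplification bias, and uses an arbitrary calling rule (at least 10 variant reads at a depth of 2,000 reads) that would need to be calibrated against the real noise level.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed from the lower tail."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


def detection_probability(pool_size: int, depth: int, min_variant_reads: int) -> float:
    """Probability of seeing at least min_variant_reads reads of a unique
    heterozygous variant in an equimolar pool of diploid individuals."""
    p = 1 / (2 * pool_size)
    return prob_at_least(min_variant_reads, depth, p)


# Hypothetical depth and threshold; pool sizes chosen for illustration.
for pool_size in (20, 50, 100):
    prob = detection_probability(pool_size, depth=2000, min_variant_reads=10)
    print(f"pool of {pool_size}: P(detect) = {prob:.3f}")
```

Under these assumptions the detection probability stays close to 1 for a pool of 20 and drops markedly for a pool of 100 at the same depth, which suggests that larger pools would require deeper sequencing or a lower calling threshold.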

Acknowledgments We thank José L. Martínez, Marcos García, Belén Alonso, and Sara Iglesias for technical assistance. This work was supported by a grant from Instituto de Salud Carlos III-Fondo Europeo de Desarrollo Regional (FIS-12/00287).

Fig. 1 Number of reads per amplicon of the SCN5A, KCNH2, and KCNQ1 genes. Arrows indicate amplicons with <25× coverage


Conflict of Interest The authors declare no conflict of interest.

References

1. Cerrone, M., Napolitano, C., & Priori, S. G. (2012). Genetics of ion-channel disorders. Current Opinion in Cardiology, 27, 242–52.

2. Napolitano, C., Bloise, R., Monteforte, N., et al. (2012). Sudden cardiac death and genetic ion channelopathies: long QT, Brugada, short QT, catecholaminergic polymorphic ventricular tachycardia, and idiopathic ventricular fibrillation. Circulation, 125, 2027–34.

3. Kapa, S., Tester, D. J., Salisbury, B. A., et al. (2009). Genetic testing for long-QT syndrome: distinguishing pathogenic mutations from benign variants. Circulation, 120, 1752–60.

4. Giudicessi, J. R., & Ackerman, M. J. (2013). Genetic testing in heritable cardiac arrhythmia syndromes: differentiating pathogenic mutations from background genetic noise. Current Opinion in Cardiology, 28, 63–71.

5. Gouas, L., Nicaud, V., Chaouch, S., et al. (2007). Confirmation of associations between ion channel gene SNPs and QTc interval duration in healthy subjects. European Journal of Human Genetics, 15, 974–9.

6. Marjamaa, A., Newton-Cheh, C., Porthan, K., et al. (2009). Common candidate gene variants are associated with QT interval duration in the general population. Journal of Internal Medicine, 265, 448–58.

7. Yang, P., Kanki, H., Drolet, B., et al. (2002). Allelic variants in long-QT disease genes in patients with drug-associated torsades de pointes. Circulation, 105, 1943–8.

8. Paulussen, A. D., Gilissen, R. A., Armstrong, M., et al. (2004). Genetic variations of KCNQ1, KCNH2, SCN5A, KCNE1, and KCNE2 in drug-induced long QT syndrome patients. Journal of Molecular Medicine, 82, 182–8.

9. Rothberg, J. M., Hinz, W., Rearick, T. M., et al. (2011). An integrated semiconductor device enabling non-optical genome sequencing. Nature, 475, 348–52.

10. Loman, N. J., Misra, R. V., Dallman, T. J., et al. (2012). Performance comparison of benchtop high-throughput sequencing platforms. Nature Biotechnology, 30, 434–9.

11. Quail, M. A., Smith, M., Coupland, P., et al. (2012). A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genomics, 13, 341.

12. Elliott, A. M., Radecki, J., Moghis, B., et al. (2012). Rapid detection of the ACMG/ACOG-recommended 23 CFTR disease-causing mutations using ion torrent semiconductor sequencing. Journal of Biomolecular Techniques, 23, 24–30.

13. Costa, J. L., Sousa, S., Justino, A., et al. (2013). Nonoptical massive parallel DNA sequencing of BRCA1 and BRCA2 genes in a diagnostic setting. Human Mutation, 34, 629–35.

14. Li, X., Buckton, A. J., Wilkinson, S. L., et al. (2013). Towards clinical molecular diagnosis of inherited cardiac conditions: a comparison of bench-top genome DNA sequencers. PLoS One, 8, e67744.

15. Nejentsev, S., Walker, N., Riches, D., et al. (2009). Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes. Science, 324, 387–9.

16. Jin, S. C., Pastor, P., Cooper, B., et al. (2012). Pooled-DNA sequencing identifies novel causative variants in PSEN1, GRN and MAPT in a clinical early onset and familial Alzheimer's disease Ibero-American cohort. Alzheimer's Research & Therapy, 4, 34.

17. Harakalova, M., Nijman, I. J., Medic, J., et al. (2011). Genomic DNA pooling strategy for next-generation sequencing-based rare variant discovery in abdominal aortic aneurysm regions of interest-challenges and limitations. Journal of Cardiovascular Translational Research, 4, 271–80.
