ge thenome institute - genome.gov | national human genome ... · nhgri current topics in genome...
TRANSCRIPT
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
1
the
Genomeat Washington University
Institute
© Elaine R. Mardis
Elaine R. Mardis Professor in Genetics Co-director, The Genome Institute
Next-Generation Sequencing Technologies
NHGRI Current Topics in Genome Analysis
© Elaine R. Mardis
Current Topics in Genome Analysis 2012
Elaine Mardis, Ph.D.
No Relevant Financial Relationships with ���Commercial Interests
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
2
© Elaine R. Mardis
Overview
• Next-Generation Sequencing (NGS) Instruments • Roche/454 • Illumina • Life Technologies • Pacific Biosciences • Ion Torrent • Oxford Nanopore
• NGS Applications across the spectrum of genomics • Examples from our work • Future Directions
© Elaine R. Mardis
© Elaine R. Mardis
The Trajectory of Throughput: 10 years
E.R. Mardis, Nature (2011) 470: 198-203
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
3
© Elaine R. Mardis
Capillary technology
Applied Biosystems 3730xl (2004)
$15,000,000
Next-gen technology
Illumina HiSeq (2011)
$10,000
Comparative costs: sequencing a human genome
© Elaine R. Mardis
Next-generation Sequencer basics Platforms and their attributes
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
4
© Elaine R. Mardis
Next-generation DNA sequencing instruments
• All commercially-available sequencers have the following shared attributes: • Random fragmentation of starting DNA, ligation with custom linkers
= “a library” • Library amplification on a solid surface (either bead or glass) • Direct step-by-step detection of each nucleotide base incorporated
during the sequencing reaction • Hundreds of thousands to hundreds of millions of reactions imaged
per instrument run = “massively parallel sequencing” • Shorter read lengths than capillary sequencers • A “digital” read type that enables direct quantitative comparisons • A sequencing mechanism that samples both ends of every fragment
sequenced (“paired end” reads)
© Elaine R. Mardis
Paired-end reads
• All next-gen platforms now offer paired end read capability, e.g. sequences can be derived from both ends of the library fragments.
• Differences exist in the _distance_ between read pairs, based on the approach/platform. • “paired ends” : linear fragment sequenced at both ends in
two separate reactions • “mate pairs” : circularized fragment of >1kb, sequenced by
a single reaction read or by two separate end reads (platform dependent)
• In general, paired end reads offer advantages for sequencing large and complex genomes because they can be more accurately placed (“mapped”) than can single ended short reads.
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
5
© Elaine R. Mardis
Roche/454 Library Construction and emPCR
Single Stranded Adapter Ligated Library • Random Fragmentation • Adapter Ligation
• emPCR for clonal amplification on bead
© Elaine R. Mardis
Roche/454 Pyrosequencing
DNA Capture Bead Containing Millions of
Copies of a Single Clonal Fragment
A A T C G G C A T G C T A A A A G T C A
Anneal Primer T
PP i
ATP
Light + oxy luciferin
Sulfurylase
Luciferase
APS
luciferin
Centrifugation
Load Enzyme Beads Load beads into
PicoTiter™Plate
Sequencing by Synthesis
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
6
© Elaine R. Mardis
454 Instrumentation
Instrument Run Time (hr)
Read Length
(bp)
Yield (Mb/run)
Error Type
Error Rate (%)
Purchase Cost
(x1000)
454 FLX+ 18-20 700 900 Indel 1 $30 A
454 FLX Titanium 10 400 500 Indel 1 $500
454 GS Jr. Titanium 10 400 50 Indel 1 $108
A – Requires the 454 FLX Titanium. This is the upgrade cost.
Notable:
• Mate pair paired end reads of 3kb, 8kb and 20 kb separation without an increase in run time.
• Cost per run makes sequencing an entire human genome cost-prohibitive relative to other technologies (~ $20/Mbp)
• Great platform for targeted validation
© Elaine R. Mardis
Illumina Sequencing: Library Preparation
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
7
© Elaine R. Mardis
Illumina Sequencing by Synthesis
Excitation
Emission Incorporate
Detect De-block
Cleave fluor
© Elaine R. Mardis
Illumina Instrumentation
Instrument Run Time (days)
Read Length
(bp)
Yield (Gb/run)
Error Type
Error Rate (%)
Purchase Cost
(x1000)
GAIIx 14 150 x 150 96 Sub >0.1 $525
HiSeq 2000 8 100 x 100 200 x 2 Sub >0.1 $700
HiSeq 2000 v3 10 100 x 100 <600 Sub >0.1 $700
MiSeq 1 150 x 150 2 Sub >0.1 $125
• 2010: HiSeq 2000 • Two flow cells per run
• 100 Gbp/FC or two genome equivalents per run • New scanning mechanics - scans both surfaces of FC lanes
• 2011: HiSeq 2000
• Improved chemistry (v. 3): increased yield and accuracy
• 2011: MiSeq
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
8
© Elaine R. Mardis
Life Technologies: sequencing by ligation
• custom adapter library • emPCR on magnetic beads • sequencing by ligation using fluorescent probes from a common primer • sequential rounds of ligation from a series of primers • fixed/known nucleotides for each probeset identify two bases each cycle, or “two base encoding”
© Elaine R. Mardis
SOLiD Instrumentation
Instrument Run Time (days)
Read Length
(bp)
Yield (Gb/run)
Error Type
Error Rate (%)
Purchase Cost
(x1000)
SOLiD 4 12 50 x 35 PE 71 A-T Bias >0.06 $475
SOLiD 5500 xl 8 75 x 35 PE 60 x 60 MP
155 A-T Bias >0.01 $595
5500 xl • Front-end automation addresses bottlenecks at emPCR, breaking, and enrichment
of beads • 6-lane Flow Chip with independent lanes/2 per run • Cost per whole genome data set is predicted to be $6K by 2011 • Very high accuracy data due to two-base encoding • ECC Module – An optional 6th primer that increases accuracy to 99.999% • Direct conversion of color space to base space • True paired-end chemistry enabled – Ligation reaction can be used in either
direction
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
9
© Elaine R. Mardis
Third generation sequencers??
• Recently, new sequencing platforms were introduced. • The Pacific Biosciences sequencer is a single molecule
detection system that marries nanotechnology with molecular biology.
• The Ion Torrent uses pH rather than light to detect nucleotide incorporations.
• The MiSeq is a scaled down version of the HiSeq, with faster chemistry and scanning.
• All offer a faster run time, lower cost per run, reduced amount of data generated relative to 2nd Gen platforms, and the potential to address genetic questions in the clinical setting.
© Elaine R. Mardis
Comparisons to Third-Generation Sequencers
Company Platform Name Sequencing Amplification Run Time
Roche 454 Ti DNA Polymerase “Pyrosequencing”
emPCR 10 hours
Illumina Hi-Seq/MiSeq
DNA Polymerase Bridge amplification
10 days/24 hours
Life SOLiD/5500
DNA Ligase emPCR 12 days
Ion Torrent PGM Synthesis H+ detection
emPCR 2 hours
Pacific Biosciences
RS Synthesis NONE 45 min
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
10
© Elaine R. Mardis
Pacific Biosciences RS
Shearing (Covaris/Hydroshear)
Polish ends
SMRTbell™ ligation
Sequencing primer annealing
DNA polymerase binding
Load library/polymerase complex onto SMRT cell
Zero Mode Waveguides (ZMWs) Ver. 1.1 = 1 x 45,000 per SMRT cell Ver. 1.2 = 2 x 75,000 per SMRT cell
Sample Prep Library/Polymerase Complex
Movie 1 (v1.1 & v1.2)
Raw reads
Post-filter reads
Mapped reads
Movie 2 (only v1.2)
Raw reads
Post-filter reads
Mapped reads
Sequencing
© Elaine R. Mardis
The SMRTbell™ Library Types
Circular Consensus Small ~250bp Single Movie Multiple passes Many sub-reads
Sub Reads
VLR (“very long read”) Larger ~6kbp Single 45 minute Movie Single long read provides linking information
Standard Large ~2kbp Single Movie Few sub-reads
Sub Reads
Reads
Consensus Read
Read
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
11
© Elaine R. Mardis
PacBio RS Instrumentation
Instrument Run Time (Hours)
Read Length
(bp)
Yield (Mb)
Error Type
Error Rate (%)
Purchase Cost
(x1000)
RS 14 (~8
SMRTCells) 2500 45 per
SMRTCell Insertions 15 $695
mean mapped sub-read accuracy: 86.2% mean mapped sub-read length: 3,416 bp maximum mapped read length: 8,580 bp maximum 95th percentile mapped read length: 5,807 bp
1x45 min movie 8 SMRT cells
Strobe polymerase/strobe reagent/strobe protocol (45 min movie)
2x45 min movies 8 SMRT cells
Strobe polymerase/standard reagent/standard protocol
© Elaine R. Mardis
ION Torrent Personal Genome Machine (PGM)
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
12
© Elaine R. Mardis
Ion Torrent Yield Trajectory
© Elaine R. Mardis
Ion Torrent Yield with Automation Modules
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
13
© Elaine R. Mardis
Oxford Nanopore Sequencing
Exonuclease-aided sequencing
Pore translocation sequencing
© Elaine R. Mardis
Applying Next Generation Sequencing
• Genomes: re-sequencing or de novo • point mutation/indel/structural variation
discovery • Protein:DNA binding
- Chromatin IP/histone binding - Nucleosome/transcription factor binding, etc.
• ncRNA discovery/sequencing/variants • Transcriptome sequencing (RNA-seq) • Genome-wide methylation of DNA (Methyl-seq) • Clinical sequencing for therapeutic decisions
E.R. Mardis, Annual Reviews in Genetics & Genomics (2008) E.R. Mardis, Nature (2011) 470: 198-203
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
14
© Elaine R. Mardis
• Prepare paired end libraries as whole genome fragment/shotgun by random shearing of genomic DNA, adapter ligation, size selection.
• Produce paired end data from each end of billions of library fragments, over-sampling about 30-fold to cover at a depth sufficient to find all types of genome alterations.
• Computer programs align the read pair sequences onto the reference genome and several algorithms are used to discover variants genome-wide.
Whole Genome Sequencing: Data Production and Alignment
© Elaine R. Mardis
Cancer Genomics
© R.K.Wilson 2011
Whole Genome Tumor:Normal Comparison
Ley et al., Nature 2008
• Caucasian female, mid-50s at diagnosis
• De novo M1 AML • Family history of
AML and lymphoma • 100% blasts in
initial BM sample • Relapsed and died
at 23 months • Normal
cytogenetics • Informed consent
for whole genome sequencing
• Solexa sequencer, 32 bp unpaired reads
• 10 somatic mutations detected
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
15
© Elaine R. Mardis
Disease progression model for Patient AML1
Diagnosis: Multiple leukemic clones
present
Clinical remission: loss of most
leukemic clones
Relapse: Acquisition of new mutations in a pre-existing clone
© Elaine R. Mardis
BreakDancer: detecting somatic structural variation
• Read pair analysis with BreakDancer identifies putative SVs based on sets of read pairs that don’t map as expected, with respect to distance apart and/or orientation.
• The identified reads are used to produce an assembly, then validated.
K. Chen et al., Nature Methods 6: 677-81 (2009)
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
16
© Elaine R. Mardis
SV Ref:
Var:
TIGRA_SV: assembling SVs to nucleotide resolution
TIGRA_SV targeted local Assembly
Integration
Features: • de Bruijin graphic approach • Multiple Kmer, read threading • Decode all alleles
A tandem duplication
An indel
AGCTGT---CA!AGCTGTTGTCA!
AGCTGTCA!AGC---CA!
Contig2.1.-5.1.3
• Split reads • Read pairs • Read depth
© Elaine R. Mardis
Clinical case: “AML52”
37 y.o. female with de novo AML; M3 morphology
Complex cytogenetics, persistent leukemia
First remission, referred to WU for SCT. rBM: normal morphology, cytogenetics; negative
for PML/RARA.
Allogeneic SCT
Consolidation + ATRA
Chemo + ATRA
Chemo only
???
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
17
© Elaine R. Mardis
An insertion-derived fusion of PML and RARA
Tumor
Chromosome 17:
Chromosome 15:
PML-RARA: PML exons 1-3 fused to RARA exons 3-9 (bcr3 variant in frame)
RARA-LoxL1: fusion out of frame
LoxL1-PML: truncated protein - premature stop in novel LoxL1 exon 5a
© Elaine R. Mardis Welch et al., JAMA April 20, 2011
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
18
© Elaine R. Mardis
Combining Platforms: de novo Assembly
Two VLR PacBio reads contiguate an Illumina assembly gap
© Elaine R. Mardis
Rapid Genotyping by Ion Torrent
Total Number of Bases [Mbp] 20.35 ‣ Number of Q17 Bases [Mbp] 12.49 ‣ Number of Q20 Bases [Mbp] 9.66 Total Number of Reads 218,782 Mean Length [bp] 93 Longest Read [bp] 202
Amplicon Target Sizes
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
19
© Elaine R. Mardis
Hybrid Capture
• Hybrid capture - fragments from a whole genome library are selected by combining with probes that correspond to most (not all) human exons or gene targets.
• The probe DNAs are biotinylated, making selection from solution with streptavidin magnetic beads an effective means of purification.
• An “exome” by definition, is the exons of all genes annotated in the species’ reference genome.
• Custom capture reagents can be synthesized to target specific loci that may be of interest in a clinical context.
© Elaine R. Mardis
Merkel Cell Polyoma Virus Capture
• Merkel Cell Polyoma virus • MCPyV shows frequent genomic deletions and sequence
mutations that make it difficult to amplify the virus from cases of MCC by PCR
• The circular genome does not contain a defined linearization sequence
• Only FFPE material available for majority of cases
• For proof-of-principle experiments: • Biotinylated PCR amplicons designed to target entire 5Kb viral
genome • Hybrid capture and sequencing • Analysis to identify insertion points in human genomes
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
20
© Elaine R. Mardis
Viral Coverage Plots (4 FFPE Samples)
Duncavage et al., JMD 2011
© Elaine R. Mardis
Finding the Junction (Integration Site) using SLOPE
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
21
© Elaine R. Mardis
RNA Sequencing
RNA isolate Size selection for nc RNA classes
Fragment, RT w/randoms
polyA priming, RT, ds DNA SAGE tags
Adapter-ligated fragments for next-gen sequencing
Alignment to reference database & discovery - Expression levels
- Novel splice isoforms - Allelic bias in transcription
- Fusion transcripts
© Elaine R. Mardis
TNRC6B splice site mutation -> alt. splicing
Tumor WGS
Normal WGS
Tumor RNA-seq
Metastatic breast cancer (to brain)
Acceptor site mutation
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
22
© Elaine R. Mardis
Samples
Microbial Communities
Reference Sequences Metagenomics
16S rRNA WGS
Data Coordination, Analysis and
Concordance to Clinical Data
Subjects
Sources of strains
Goals: - Defining Core Microbiomes in “healthy” subjects - Catalog changes in Microbiomes with disease
Human Microbiome Project
Community Phenotypes
© Elaine R. Mardis
Stability of virome over time
Alp
hapa
pillo
mav
irus
Pol
yom
aviru
s
Ros
eolo
viru
s
Lym
phoc
rypt
oviru
s
Mas
tade
novi
rus
Alp
hapa
pillo
mav
irus
Gam
map
apill
omav
irus
Hum
an
papi
llom
aviru
s
Pol
yom
avir
us
Ros
eolo
viru
s
Lym
phoc
rypt
ovir
us
Mas
tade
novi
rus
Gam
map
apill
omav
irus
Hum
an p
apill
omav
irus
Kristine Wylie, WUGI
GI
Oral
Vaginal
Nasal
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
23
© Elaine R. Mardis
400 MRSA Genomes Legend:
CD
MRSA.6326.326W87
MRSA.6330.330N40
MRSA.6347.347W106
MRSA.6365.G96
MRSA.6433.W211
MRSA.6433.N82
MRSA.6433.G99
MRSA.R103G
MRSA.6421.421N69
MRSA.6254.254W53
MRSA.6382.382W140
MRSA.6415.2790W210
MRSA.6415.4563W208
MRSA.6222.222N8
MRSA.6266.266.1G39
MRSA.6432.A46
MRSA.6398.N80
MRSA.6436.W212
MRSA.R103 714N
MRSA.R402 6488W
MRSA.R402N
MRSA.6370.370W132
MRSA.6342.342W101
MRSA.6341.341A26
MRSA.6331.331W92
MRSA.6412.4124774W170
MRSA.6224.224W20
MRSA.6329.329N39
MRSA.6329.329W90
MRSA.6245.245W45
MRSA.6245.245.1N21
MRSA.6245.245A12
MRSA.6245.245.1A13
MRSA.6245.245.1G25
MRSA.6216.216G6
MRSA.6216.216W14
MRSA.6246.246G26
MRSA.6246.246W46
MRSA.6326.326.1N37
MRSA.6231.231W27
MRSA.6422.4227588W179MRSA.6422.422N70
MRSA.6331.331.1G55MRSA.6332.332W95MRSA.6209.209W5MRSA.6386.N78
MRSA.6386.386A39MRSA.6331.331.1G54
MRSA.6331.3314371W93
MRSA.6500.3300G82MRSA.6501.3301A40
MRSA.6500.3300W155
MRSA.6348.348W107MRSA.6374.374W134
MRSA.6335.335G56
MRSA.6344.344W102MRSA.6345.345N44
MRSA.6345.345W103 MRSA
.630
7.30
7A16
MRSA.6307.307N28
MRSA.6307.307G42
MRSA.6307.307W72MRSA.6261.261G35
MRSA.6341.341.1N43MRSA.6215.2157462W12
MRSA.6407.4076690W166MRSA.6407.4074440W167
MRSA.6407.407N65
MRSA.6255.255G31MRSA.6255.255W54
MRSA.6415.7010W209MRSA.6415.N81
MRSA.6430.430G89
MRSA.6430.430W186
MRSA.6430.430N72
MRSA.6361.361W121
MRSA.6395.395W150
MRSA.6346.3463077W105
MRSA.6437.W213
MRSA.6437.N83
MRSA.6359.359W119
MRSA.6366.3667367W128
MRSA.6366.366W127
MRSA.6366.366G72
MRSA.6366.366N51
MRSA.6384.384A37
MRSA.6387.387G78
MRSA.6387.387N59
MRSA.6386.386W144
MRSA.6402.402N63
MRSA.6386.386G77
MRSA.6389.W206
MRSA.6389.A45
MRSA.6389.N79
MRSA.6401.4016575W160
MRSA.6349.349G62
MRSA.6258.258N25
MRSA.6258.258G33
MRSA.6258.258W56
MRSA.6266.2661353W66
MRSA.6266.2667011W67
MRSA.6266.266W65
MRSA.6266.2665921W68
MRSA.6266.266A15
MRSA.6266.266G38
MRSA.6427.427W184
MRSA.6427.427N71
MRSA.6243.W196
MRSA.6242.242W42
MRSA.6252.2527094W52
MRSA.6252.252G29
MRSA.6252.252W51
MRSA.6502.3302W157
MRSA.6501.3301W156
MRSA.6501.3301G83
MRSA.6501.3301N61
MRSA.6260.260G34
MRSA.6260.260W58
MRSA.6340.340A25
MRSA.6340.340W100
MRSA.6340.340G57
MRSA.6341.341N42
MRSA.6394.394G79
MRSA.6391.391W148
MRSA.6421.4217599W177
MRSA.6303.G93
MRSA.6303.W199
MRSA.6267.267.1N27
MRSA.6267.267W69
MRSA.6403.4036488W163
MRSA.6399.399W154
MRSA.6204.204W4
MRSA.6305.305W71
MRSA.6335.3352985W190
MRSA.6335.335W96
MRSA.6250.250W49
MRSA.6339.339W99
MRSA.6
328.32
8W89
MRSA.6
384.38
4W142
MRSA.6
406.40
6A41
MRSA.6
351.35
1W113
MRSA.6
319.31
9W81
MRSA.6
377.37
7W136
MRSA.6
376.37
6G74
MRSA.6
398.39
8W153
MRSA.631
1.311.1A
20
MRSA.631
1.311.1N
32
MRSA.631
1.311N31
MRSA.6311.
311A19
MRSA.6311.
311W75
MRSA.643
8.W214
MRSA.624
1.241N18
MRSA.624
1.241W41
MRSA.6
416.41
63680W
172
MRSA.6
416.41
6W171
MRSA.6
416.41
6.1G87
MRSA.6
346.34
6W104
MRSA.6
336.33
6W97
MRSA.6
337.33
7W98
MRSA.6265.265G37
MRSA.6265.265W64
MRSA.6350.350W109
MRSA.6349.349.1G63
MRSA.6
349.34
9W108
MRSA.6321.W200
MRSA.6214.214N4
MRSA.6214.214W10
MRSA.6380.380W138
MRSA.6268.268.1G41
MRSA.6268.268G40
MRSA.6268.268W70
MRSA.6308.308W73
MRSA.6385.385N58
MRSA.6385.385G76
MRSA.6385.385W143
MRSA.6385.385A38
MRSA.6385.3855370W192
MRSA.6365.365N50
MRSA.6365.365W126
MRSA.6327.327G50
MRSA.6327.327W88
MRSA.6369.369W131
MRSA.6320.320A21
MRSA.6320.320W82
MRSA.6320.320N33
MRSA.6320.320.1N34
MRSA.6320.320G46
MRSA.6396.3961643W152
MRSA.6396.396G80
MRSA.6396.396W151
MRSA.6355.355A32
MRSA.6355.355W117
MRSA.6355.355G67
MRSA.6355.355N47
MRSA.6309.309W74
MRSA.6309.309.1A18
MRSA.6309.309.1N30
MRSA.6263.2636238W63
MRSA.6263.263W62
MRSA.6328.328A23
MRSA.6327.327.1G51
MRSA.6328.328G52
MRSA.6324.324G48
MRSA.6323.323W83
MRSA.6219.W195
MRSA.6219.G90
MRSA.6219.N74
MRSA.6219.G91
MRSA.6226.226W21
MRSA.6226.226A6
MRSA.6226.226N9
MRSA.6261.261A14
MRSA.6261.261W59
MRSA.6377.377A36MRSA.6377.377N56
MRSA.6229.229W24
MRSA.6223.223G12
MRSA.6223.223W19
MRSA.6363.W203
MRSA.6232.2326105W29
MRSA.6232.232G16
MRSA.6232.232W28
MRSA.6388.388.1N60
MRSA.6390.3905178W147
MRSA.6390.390W146
MRSA.6236.2366121W33
MRSA.6236.236.1G20
MRSA.6236.2366527W35
MRSA.6236.2366195W34
MRSA.6236.236G19
MRSA.6236.236W32
MRSA.6215.2151332W13
MRSA.6215.215A2
MRSA.6215.215G5
MRSA.6215.215W11
MRSA.6221.221G11
MRSA.6221.221W18
MRSA.6388.W205
MRSA.6218.218G9
MRSA.6218.218.1G10
MRSA.6218.218W16
MRSA.6419.419W175
MRSA.6218.218.1A4
MRSA.6218.218N6
MRSA.6360.360N49
MRSA.6360.360W120
MRSA.6210.210G3
MRSA.6210.210W6
MRSA.6227.227N10
MRSA.6227.227.1N11
MRSA.6227.227G13
MRSA.6227.227W22
MRSA.6401.401N62
MRSA.6356.356N48
MRSA.6356.356G68
MRSA.6356.356A33
MRSA.6356.356W118
MRSA.6397.W207
MRSA.6233.233A7
MRSA.6233.233N14
MRSA.6234.234W31
MRSA.6411.4113370W168
MRSA.6412.412W169
MRSA.6412.412N68
MRSA.6431.G98
MRSA.6334.N76
MRSA.6334.W201
MRSA.6203.203G2
MRSA.6203.203W3
MRSA.6203.203A1
MRSA.6308.308A17
MRSA.6362.362G69
MRSA.6362.362W122
MRSA.6431.431A42
MRSA.6431.4313236W187
MRSA.6431.431N73
MRSA.6352.352G64
MRSA.6352.352W114
MRSA.6401.4016336W161
MRSA.6353.353G65MR
SA.6247.G92
MRSA.6354.G94
MRSA.6354.W202
MRSA.6247.N75
MRSA.6247.A43
MRSA.6247.W197
MRSA.6
247.29
47W198
MRSA.6331.3314694W94
MRSA.6368.368W130
MRSA.6217.217.1A3
MRSA.6217.2173205W18
9MRSA.6344.
344A27MRSA.6344.3
44G58MRSA.6396.396
.1G81MRSA.6217.217.1N5MRSA.6217.217.1G8MRSA.6217.217G7
MRSA.6217.217W15MRSA.6383.383W141
MRSA.6342.342342807W191
MRSA.6230.2302975W26
MRSA.6230.230W25
MRSA.6230.230N12
MRSA.6230.230.1G15
MRSA.6230.230.1N13
MRSA.6376.376.1A35
MRSA.6345.345G59
MRSA.6376.376.1N55
MRSA.6376.376A34
MRSA.6376.376W135
MRSA.6376.376.1G75
MRSA.6376.376N54
MRSA.6418.418W174
MRSA.6254.254G30
MRSA.6248.248W47
MRSA.6387.387W145
MRSA.6211.211W7
MRSA.6504.3304W158
MRSA.6506.3306W159
MRSA.6251.251N22
MRSA.6251.251W50
MRSA.6370.370N52
MRSA.6324.324.1N36
MRSA.6325.325W86
MRSA.6
257.25
7W55
MRSA.6
257.25
7G32
MRSA.6
257.25
7N24MRS
A.6262
.26211
99W61
MRSA.6
262.26
2G36
MRSA.6
262.26
2N26
MRSA.6
262.26
2W60
MRSA.6
379.37
9N57
MRSA.6
379.37
9W137
MRSA.6
421.42
17968W
178
MRSA.6
429.42
9W185
MRSA.6
249.24
9.1G28
MRSA.6
249.24
9G27
MRSA.6
249.24
9W48
MRSA.631
4.314G44
MRSA.631
4.314354
1W77MRSA
.6314.31
4.1G45
MRSA.631
4.314W76
MRSA.6375.
N77MRSA.6375.
G97MRSA.6375.W2
04MRSA.6238.23
8A8MRSA.6238.23
8W37MRSA.6238.238N
16MRSA.6238.2938W3
8
MRSA.6350.
3500593W11
2MRSA.
6350.350.1
A28MRSA.
6350.35013
96W110
MRSA.642
6.426W18
3MRSA
.6345.34
5.1G60
MRSA.6402.4026402W162
MRSA.6424.4247207W182
MRSA.6266.A44
MRSA.6381.381W139
MRSA.6420.420W176MRSA.6244.244.1A11MRSA.6244.2444815W44
MRSA.6244.244W43MRSA.6241.241G22MRSA.6244.244G24
MRSA.6244.244A10MRSA.6244.244N20
MRSA.6222.222A5MRSA.6353.353.1N46MRSA.6353.353.1G66MRSA.6353.353N45MRSA.6353.3535901W116MRSA.6353.353.1A31MRSA.6330.330.1N41MRSA.6353.353A30MRSA.6353.353W115MRSA.6417.417G88MRSA.6213.213N3
MRSA.6213.213G4MRSA.6213.213W9
MRSA.6242.242N19MRSA.6239.239G21
MRSA.6239.239W39
MRSA.6239.239A9MRSA.6239.239N17
MRSA.6328.328N38
MRSA.6234.234G18
MRSA.6233.233W30
MRSA.6240.240W40
MRSA.6364.364G70
MRSA.6364.364083W124
MRSA.6364.3647947W125
MRSA.6364.364.1G71
MRSA.6364.364W123
MRSA.6201.2011355W2
MRSA.R201G
MRSA.R201W
MRSA.6201.201N1
MRSA.6201.201.1N2
MRSA.8869.201G1
MRSA.6351.351A29
MRSA.6330.330.1G53
MRSA.6330.330W91
MRSA.6410.410N66
MRSA.6410.4102889W194
MRSA.6410.4106835W193
MRSA.6410.410G85
MRSA.6410.410.1G86
MRSA.6410.410.1N67
MRSA.6212.212W8
MRSA.6220.220W17
MRSA.6394.394W149
MRSA.6348.348G61
MRSA.6323.323G47
MRSA.6324.324W84
MRSA.6324.324.1G49
MRSA.6324.324.1A22
MRSA.6324.3247192W85
MRSA.6228.228.1G14
MRSA.6228.228W23
MRSA.6259.259W57
MRSA.R406 2601W
MRSA.R406N
MRSA.6406.406G84MRSA.6406.406N64MRSA.6406.4062601W165MRSA.6406.4063760W164
MRSA.6367.367W129
MRSA.6237.237.1N15
MRSA.6237.237W36
MRSA.6218.218.1N7
MRSA.6423.4232663W181
MRSA.6423.4237501W180
MRSA.6439.
W215
MRSA.6363.
G95
MRSA.6317.
3176744W79
MRSA.6317.31
7W78
MRSA.6435.43
5W188
MRSA.6336.33
6A24
MRSA.6371.371N
53
MRSA.6371.371G73
MRSA.6371.371W133
© Elaine R. Mardis
Conclusions
• 2nd and 3rd generation sequencing instruments are revolutionizing biological research.
• Earliest impacts have been on cancer genomics and metagenomics.
• The extreme need for bioinformatics-based analytical approaches to interpret these large data sets has revitalized the field and introduced statistical and mathematical rigor.
• Integration across data sets from DNA, RNA, methylation, proteomics, etc. presents the next challenge but provides comprehensive analytical power to inform biology.
• With newer instruments, clinical applications have potential for implementation, with appropriate interpretive algorithms.
NHGRI Current Topics in Genome Analysis 2012 Week 6: Next-Generation Sequencing Technologies
February 22, 2012 Elaine Mardis, Ph.D.
24
© Elaine R. Mardis
Acknowledgements
• The Genome Institute Vince Magrini Todd Wylie Sean McGrath Amy Ly Jason Walker Jasreet Hundal Lisa Cook Li Ding George Weinstock Richard K. Wilson
• Clinical Collaborators (WUSM) Tim Ley Phil Tarr John Pfeifer