Next Generation Sequencing Core Facility
Max Planck Institute for Molecular Genetics
Berlin, Germany
Dr. Bernd Timmermann
Overview of Next Generation Sequencing platform technologies
May 23rd 2012, Budapest
Outline
1. Technologies
   Illumina
   Roche / 454
2. Projects and Applications
   Whole Genome Re-sequencing
   Sequence Capture
   Amplicon Sequencing
3. Outlook
Max Planck Society
Max Planck Institute for Molecular Genetics
80 institutes and research facilities
20,435 people
Budget 1,400 million euro in 2010
Development of Sequencing Throughput
[Figure: throughput per system (kilobases/day, logarithmic axis from 10 to 1,000,000,000) plotted against year (1980 to 2010), covering gel-based systems, capillary sequencing (first- and second-generation capillary sequencers) and next generation sequencing (microwell pyrosequencing, short-read sequencers). Modified after MR Stratton et al., Nature 458, 719-724 (2009).]
Development of Sequencing Technologies
Human Genome Project
• 96 sequences in parallel
1000 Genomes Project
• 3.2 billion sequences per run
Sequencing Capacities at the MPI-MG
7 x Illumina
3 x SOLiD
5 x Roche GS
3 x Capillary Systems
IT Infrastructure
Long Read Technologies
Short Read Technologies
TB
GB
25 x 32 (64) compute servers with 128 (512) GB RAM
4 petabytes storage capacity
Technologies
Genome Sequencer FLX
• de novo Sequencing
• Metagenome Analyses
• Amplicon Sequencing
• Full length Transcriptome Analyses
• Sequencing of target regions
HiSeq 2000/SOLiD
• ChipSeq
• MeDipSeq
• miRNA
• RNAseq
• Sequencing of target regions
• Whole genome resequencing
Principle Illumina Sequencing
Library Preparation
Attachment of single molecules to surface
Amplification to form clusters
Cluster Generation
Sequencing by Synthesis (SBS)
[Figure: template strand and growing complementary strand, drawn 5' to 3', with one base added per cycle]
Cycle 1: Add sequencing reagents
First base incorporated
Remove unincorporated bases
Detect signal
Cycle 2-n: Add sequencing reagents and repeat
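The cycle listed on this slide can be sketched as a toy simulation. This is only an illustration under simplifying assumptions, not the instrument's base-calling software: the function name `sequence_by_synthesis` and the example template are made up for this sketch, and it assumes exactly one base is incorporated and detected per cycle, without errors.

```python
# Toy model of sequencing by synthesis (hypothetical names, not a vendor API).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Return the read produced by `cycles` rounds of single-base extension.

    `template` is written in the order in which it is copied, so the read is
    its base-by-base complement, one base per cycle.
    """
    read = []
    for cycle in range(min(cycles, len(template))):
        # Add sequencing reagents: the complementary base is incorporated.
        incorporated = COMPLEMENT[template[cycle]]
        # Remove unincorporated bases, then detect the signal of the new base.
        read.append(incorporated)
    return "".join(read)


if __name__ == "__main__":
    # Assumed example template; five cycles give a five-base read.
    print(sequence_by_synthesis("GTCAGTCA", cycles=5))
```

The number of cycles sets the read length, which is why longer reads translate directly into longer run times.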
Conversion of image data to DNA sequences
Reference sequence ....CGAGCGAATGAAGTCGGGAGTCGTAATGAGCCCGTAATCCCGTTAGTA....
Sequence reads:
CGAGCGAATGAAGTCGGGAGTCCTAATGAGCCCGTA
GAGCGAATGAAGTCGGGAGTCCTAATGAGCCCGTAA
CGAATGAAGTCGGGAGTCCTAATGAGCCCGTAATCC
TGAAGTCGGGAGTCCTAATGAGCCCGTAATCCCGTT
TCGGGAGTCCTAATGAGCCCGTAATCCCGTTAGTA
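How the reads on this slide relate to the reference can be illustrated with a small pileup. This is a minimal sketch, not a production read mapper: it assumes the fused read lines of the slide split into four 36-base reads and one 35-base read, places each read without gaps at the offset with the fewest mismatches, and reports positions where the majority base differs from the reference. The function names are hypothetical.

```python
from collections import Counter

# Reference stretch and reads copied from the slide; the read boundaries are
# an assumption made for this sketch.
REFERENCE = "CGAGCGAATGAAGTCGGGAGTCGTAATGAGCCCGTAATCCCGTTAGTA"
READS = [
    "CGAGCGAATGAAGTCGGGAGTCCTAATGAGCCCGTA",
    "GAGCGAATGAAGTCGGGAGTCCTAATGAGCCCGTAA",
    "CGAATGAAGTCGGGAGTCCTAATGAGCCCGTAATCC",
    "TGAAGTCGGGAGTCCTAATGAGCCCGTAATCCCGTT",
    "TCGGGAGTCCTAATGAGCCCGTAATCCCGTTAGTA",
]


def best_offset(read: str, reference: str) -> int:
    """Ungapped placement of `read` with the fewest mismatches."""
    def mismatches(offset: int) -> int:
        window = reference[offset:offset + len(read)]
        return sum(a != b for a, b in zip(read, window))
    return min(range(len(reference) - len(read) + 1), key=mismatches)


def call_variants(reads, reference):
    """Return (1-based position, reference base, read base, supporting reads)."""
    pileup = [Counter() for _ in reference]
    for read in reads:
        start = best_offset(read, reference)
        for i, base in enumerate(read):
            pileup[start + i][base] += 1
    variants = []
    for pos, (ref_base, counts) in enumerate(zip(reference, pileup), start=1):
        if not counts:
            continue  # position not covered by any read
        base, support = counts.most_common(1)[0]
        if base != ref_base:
            variants.append((pos, ref_base, base, support))
    return variants


if __name__ == "__main__":
    print(call_variants(READS, REFERENCE))
```

For the reads shown, the sketch reports a single difference: a G in the reference replaced by a C in all five reads, at position 23 of the displayed stretch.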
Facts Illumina Sequencing (HiSeq 2000)
Input Material:
   ~ 1-3 µg DNA (shotgun sequencing)
   ~ 10 ng (ChipSeq sequencing)
Library Preparation: ~ 1.5 days
Cluster Generation: ~ 1 day
Run Time / Read Length:
   Single read ~ 2 days (36 b)
   Paired end ~ 10 days (2 x 100 b)
Data Processing: ~ 1 day
Output: Paired end ~ 500 Gb
Reads: up to 4,800 million
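The figures on this slide allow a rough cross-check of turnaround time and yield. The sketch below is a back-of-the-envelope calculation, not a vendor specification: it assumes the steps run strictly one after another and that the 4,800 million figure counts individual 100-base reads. The dictionary keys and function names are labels chosen for this example.

```python
# Durations in days, copied from the slide.
PREP_DAYS = {
    "library preparation": 1.5,
    "cluster generation": 1.0,
    "data processing": 1.0,
}
RUN_DAYS = {
    "single read (36 b)": 2.0,
    "paired end (2 x 100 b)": 10.0,
}

MAX_READS = 4800e6  # "up to 4800 Mio" reads
READ_LENGTH = 100   # bases per read in a 2 x 100 b run (assumed to apply)


def turnaround_days(run_mode: str) -> float:
    """Total days from sample to processed data, assuming sequential steps."""
    return sum(PREP_DAYS.values()) + RUN_DAYS[run_mode]


def yield_gigabases(reads: float, read_length: int) -> float:
    """Raw yield in gigabases for a given read count and read length."""
    return reads * read_length / 1e9


if __name__ == "__main__":
    for mode in RUN_DAYS:
        print(mode, turnaround_days(mode), "days")
    print(yield_gigabases(MAX_READS, READ_LENGTH), "Gb")
```

Under these assumptions a paired-end run takes about 13.5 days end to end, and 4,800 million reads of 100 bases give 480 Gb, in line with the stated output of roughly 500 Gb.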
454 Sequencing Instrument
1. Genome is loaded into a PicoTiter™ plate
2. Load PicoTiter plate into instrument
3. Load reagents in a single rack
4. Sequencing
Principle 454 Sequencing
Library Preparation
Emulsion PCR
Emulsion Breaking
Depositing DNA Beads into the PicoTiter™ Plate
Pyrosequencing
Facts 454 Sequencing
Input Material: ~ 0.5 µg DNA
Library Preparation: ~ 4 hours
Emulsion PCR: ~ 1 day
Run Time: 20 hours
Data Processing: ~ 10 hours
Output: Titanium+ 700 - 1000 Mb
Reads: Titanium+ 1,000,000 - 1,600,000
Read length: 700 - 800 bases
Sequencing Pipeline
Library Preparation
Library Quantification
Bead Enrichment
Sequencing
Projects and Applications
Goals
• A public database of essentially all SNPs and detectable CNVs with allele frequency > 1% in each of multiple human population samples
• Pioneer and evaluate methods for:
   • Generating data from next-generation sequencing platforms
   • Exchanging and combining data and analytical methods
   • Discovering and genotyping SNPs and CNVs from next-generation sequencing data
   • Imputation with and from next-generation sequencing data
• 454, Illumina and AB SOLiD platforms
• Academic genome centers in US, UK, Germany, China and platform companies
(Nature 2010, Science 2010 and Nature 2011)
OncoTrack, “Methods for systematic next generation oncology biomarker development”, is an international consortium of over 60 scientists that has launched one of Europe’s largest collaborative academic-industry research projects to develop and assess novel approaches for the identification of new markers for colon cancer.
[Figure labels: DNA, RNA, Protein; Methylation, Mutations; mRNA, miRNA; Cell lines, Tissues; Sequencing; Bioinformatics]
RNAseq
expression profiling
total RNA Isolation
quality control
small RNA Depletion
dsDNA generation using random hexamers
Illumina library preparation
massive parallel sequencing
Mapping
Sequence Capture
GWAS Candidate Genes (0.5 - 5 Mb): 385 k Array, Nimblegen; In-solution Enrichment
Whole Exome (35 Mb): 2.1 million Array, Nimblegen; In-solution Enrichment
Targeted Resequencing: Project outline
Work-flow:
Identification of patients
Sample preparation
Sequence capture
Next-Gen sequencing
“Bioinformatics”
Follow-up sequencing
Functional characterization
Principle of sequence capture
DNA Preparation, Enrichment of target regions, Sequencing
Genomic DNA
Fragments (200-500 bp)
Ligation of adapters
[Figure labels: A1, SP1, A2]
Hybridization
Selection with streptavidin beads
Amplification and Quantification
Cleft lip with or without cleft palate (CL/P)
Cooperation with M. Nöthen and E. Mangold
Epidemiology of nonsyndromic CL/P
• Prevalence among live births ~ 1 : 1,000
• Risk for siblings 1 : 20 to 1 : 25
• λs 40 to 50
Mangold E. et al. (2010), Nature Genetics
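The λs range on this slide follows from the other two figures if λs is read as the sibling recurrence risk ratio, that is, the risk for siblings divided by the population prevalence. The short check below works under that assumption; the function name is chosen for this sketch.

```python
from fractions import Fraction


def sibling_relative_risk(sibling_risk: Fraction, prevalence: Fraction) -> Fraction:
    """Sibling recurrence risk ratio: risk for siblings over population prevalence."""
    return sibling_risk / prevalence


# Figures from the slide: prevalence ~ 1 : 1,000, sibling risk 1 : 20 to 1 : 25.
PREVALENCE = Fraction(1, 1000)
LAMBDA_LOW = sibling_relative_risk(Fraction(1, 25), PREVALENCE)
LAMBDA_HIGH = sibling_relative_risk(Fraction(1, 20), PREVALENCE)

if __name__ == "__main__":
    print(LAMBDA_LOW, LAMBDA_HIGH)
```

A sibling risk of 1 : 25 gives a ratio of 40 and a risk of 1 : 20 gives 50, matching the stated range.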
Cleft lip with or without cleft palate (CL/P)
Resequencing as follow up of GWAS
• 3 loci on chr 8 (640 kb), 10 (161 kb) and 17 (340 kb) in 20 affected individuals
• MID tagging and pooling of 10 samples
• Enrichment using the 2.1M NimbleGen array
• Sequencing on a Roche GS FLX system
Mapping
Cleft lip with or without cleft palate (CL/P)
Preliminary Results
• 6,726 unique variants (> 10 x coverage)
• 3,783 variants not listed in dbSNP (hg19)
• 4 coding variants
• Detection of structural variations not yet finished
Mutation detection pipeline quality
Concordance with Affymetrix Array "genome-wide human SNP array 6.0"
Amplicon Sequencing
Aim: Detection and quantification of new and known variants
Method: Amplification and sequencing of target regions; multiple alignments of sequences against a reference
[Figure: patient sequences aligned against a reference]
Amplicon Sequencing
[Figure labels: A, key, MID, A-primer (21 bp), sequence of interest, B-primer (21 bp), MID, key, B; locus-specific PCR amplification followed by emPCR amplification and sequencing]
• Long reads, required to sequence through the locus-specific primer, enable haplotyping over longer distances
• 100s to 1000s of amplicon clones sequenced simultaneously
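The role of the key and MID elements can be illustrated with a small demultiplexing sketch. It rests on several assumptions that are not stated on the slide: that each read starts with the key, followed by the MID, the 21 bp locus-specific primer and then the sequence of interest; that the key is the four bases `TCAG`; and that matching is exact. The MID sequences and sample names are hypothetical placeholders, not real MID sequences.

```python
KEY = "TCAG"  # assumed 4-base key at the start of every read
MIDS = {      # hypothetical MID sequences, for illustration only
    "sample_1": "ACGTACGTAC",
    "sample_2": "TGCATGCATG",
}
PRIMER_LENGTH = 21  # locus-specific primer length given on the slide


def demultiplex(read: str):
    """Return (sample, sequence of interest), or None if key or MID is unknown.

    Exact matching only; real pipelines usually tolerate sequencing errors.
    """
    if not read.startswith(KEY):
        return None
    body = read[len(KEY):]
    for sample, mid in MIDS.items():
        if body.startswith(mid):
            # Skip the MID and the locus-specific primer.
            return sample, body[len(mid) + PRIMER_LENGTH:]
    return None


if __name__ == "__main__":
    example = KEY + MIDS["sample_2"] + "A" * PRIMER_LENGTH + "GGCCTT"
    print(demultiplex(example))
```

Because the read has to pass through key, MID and primer before it reaches the sequence of interest, about 35 bases of each read are spent on these elements under the assumptions above.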
Amplicon Sequencing
IRON Study: Interlaboratory Robustness of NGS
Amplicon Sequencing
IRON Study: Hematology Focus Group
Amplicon Sequencing
IRON Study: Results
• For each amplicon, the median coverage reached was 713-fold, ranging from 553-fold to 878-fold
• A total of 92 variants (44 distinct mutations and 10 SNPs) were observed
• In comparison to data available from Sanger sequencing, 454 amplicon deep-sequencing detected all mutations and SNPs that were previously known
• We here confirm in a multicenter analysis that amplicon-based deep-sequencing is technically feasible, achieves a high concordance across multiple laboratories, and therefore allows a broad and in-depth molecular characterization of hematological malignancies.
Kohlmann et al. (2011), Leukemia
Sensitivity of mutation detection as a function of tumor cell content
Querings et al. (2011), PLoS ONE
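Why sensitivity depends on tumor cell content can be illustrated with a simple binomial model. This is a sketch under stated assumptions, not the analysis of Querings et al.: it assumes a heterozygous mutation in diploid tumor cells, no copy number changes and no sequencing errors, so the expected fraction of variant reads is half the tumor cell content. The function name, the coverage values and the read threshold are all illustrative choices.

```python
from math import comb


def detection_probability(tumor_content: float, coverage: int,
                          min_variant_reads: int) -> float:
    """Probability of seeing at least `min_variant_reads` variant reads.

    Assumes each read carries the variant independently with probability
    tumor_content / 2 (heterozygous mutation, diploid cells, no errors).
    """
    p = tumor_content / 2
    missed = sum(
        comb(coverage, k) * p**k * (1 - p) ** (coverage - k)
        for k in range(min_variant_reads)
    )
    return 1.0 - missed


if __name__ == "__main__":
    # Illustrative settings only: 100-fold coverage, at least 3 variant reads.
    for content in (0.05, 0.1, 0.25, 0.5, 1.0):
        print(content, round(detection_probability(content, 100, 3), 3))
```

In this model, lower tumor cell content has to be compensated by deeper coverage to keep the detection probability constant.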
Establishment of small scale NGS systems
Analysis of complete genomes
Personalized medicine
Outlook
Acknowledgments
Hans Lehrach
Bernhard Herrmann
Hilger Ropers
Martin Vingron
Michal Schweiger
Martin Kerick
Markus Ralser
Sequencing Facility:
Ilona Hauenschild
Sonia Paturej
Tina Moser
Ina Lehmann
Norbert Merges
Daniela Roth
Sabrina Rau
Heiner Kuhl
Sven Klages
Martin Werber
Thanks for your attention!