Presented at the 8th European HIV Drug Resistance Workshop, March 17-19 2010, Sorrento, Italy

Future applications of full length virus genome sequencing

Paul Kellam Virus Genomics

Revisiting early HIV resistance ideas

Nature 1993

AIDS 1991

Virus genome sequencing

Population or single genome

Whole genome sequencing

Primer-walking with M13 adaptors, capillary sequencing

Population biology

The consensus sequence

The treatment

The minority species

2nd generation: 454 sequencing

Throughput: 500 Mb/run (GS FLX)

1 M reads/run

Read length: ~500 bp

HIV 2nd generation sequencing

Drug resistance mutations

Drug resistance – population structure

~400 bp (inc. V3)

Fragmented (nebulised 454 library)

Direct indexing for 454

Compound errors in 2nd generation sequencing

Process

cDNA synthesis error rate

Library representation

PCR error rate for clusters

Sequencing error rates

Technical

Sampling efficiency

Robustness of process

Cost effectiveness

Multiplexing/logistics
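The process error sources listed above compound across the workflow. As a rough sketch, assuming each step introduces errors independently at a fixed per-base rate, the chance that a base passes every step unchanged is the product of the per-step success probabilities. The function name and the rates below are illustrative placeholders, not measured values from this talk.

```python
from math import prod

def compound_error_rate(step_error_rates):
    """Per-base probability that at least one step introduced an error,
    assuming the steps act independently."""
    return 1.0 - prod(1.0 - p for p in step_error_rates)

# Hypothetical per-base error rates (placeholders for illustration only).
example_rates = {
    "cDNA synthesis": 1e-4,
    "PCR": 1e-5,
    "sequencing": 1e-3,
}

total = compound_error_rate(example_rates.values())
print(f"Combined per-base error rate: {total:.6f}")
```

When all rates are small the result is close to their sum, so the largest single source dominates.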

Coverage and errors in 454 sequences

Wang et al, Genome Res. 2007 17: 1195-1201

2nd generation data

@IL29_4275:7:1:1031:2292#7/1
ATCATCTTCCTCACGACGTTTGCCAATTTAGCCTTCTTCTCNTCCCCTCCGACT
+
BCCC@B?BCCBCCC@CCC:CCACCC;;>>>?C4CC>>@BB@&7=A>?<A2=BCC

@IL29_4275:7:1:1054:12506#7/1
ATAATGGATAAAACCATCATATTGAAAGCAAACTTCAGTGTGATTTTTGACCGG
+
CCCBC?BCCCCACCCCCCCCCCCCCCCACCC?BCCCCCC?CCCCCCCCC>CCCC

@IL29_4275:7:1:1060:16244#7/1
ATATTCTGGAGCAATGAAATTTCCATTACTCTCGAAGTTGATTGCATCATTCGG
+
BCBCCBCBCBCCCCCACCCCCCCBCC@CC=CBABCC;CCBBC@=;C?CBCCCC#

@IL29_4275:7:1:1061:2394#7/1
ATTTGGCGTCAAGCGAACAATGGAGAGGACGCAACTGCTGGTCTTACCCACCTG
+
=BBBBBBB@B>?B6BB=B>BBBBB?>A<A65:???8-4;;.:>BBABB/AABB5

@IL29_4275:7:1:1077:4877#7/1
CCTGATGTGTATTTCTTGGTTATGGCCATCTGGTCCACAGTGGTTTTTGTTAGT
+
ADA>2????:>BBAB>BBBA?.?<?BBBB1?????BAB9A89;0:??>B;;8B6

Etc. to ~1-3 million reads

"?" means 99.9% accurate
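Each record in FASTQ data like the reads above has four parts: an "@" identifier, the base calls, a "+" separator, and one quality character per base. A minimal parsing sketch, assuming strict four-line records with no wrapped sequence lines (the name `read_fastq` is illustrative and is not one of the pipeline scripts named in this talk):

```python
import io

def read_fastq(handle):
    """Yield (identifier, sequence, quality) from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input (a blank line also stops parsing)
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in {header!r}")
        yield header[1:], sequence, quality

# First read from the slide, written out as a four-line record.
example = io.StringIO(
    "@IL29_4275:7:1:1031:2292#7/1\n"
    "ATCATCTTCCTCACGACGTTTGCCAATTTAGCCTTCTTCTCNTCCCCTCCGACT\n"
    "+\n"
    "BCCC@B?BCCBCCC@CCC:CCACCC;;>>>?C4CC>>@BB@&7=A>?<A2=BCC\n"
)
records = list(read_fastq(example))
print(records[0][0], len(records[0][1]))
```

Reading with a generator keeps memory use flat, which matters once a run holds millions of reads.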

Phred

Q_PHRED = -10 × log10(P_e), where P_e is the probability that the base is called wrongly
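The Phred relationship above can be applied in both directions. The sketch below is a direct transcription of the formula; the function names are illustrative.

```python
import math

def phred_from_error_prob(p_error):
    """Phred quality score for a given probability of a wrong base call."""
    return -10.0 * math.log10(p_error)

def error_prob_from_phred(q):
    """Probability of a wrong base call for a given Phred quality score."""
    return 10.0 ** (-q / 10.0)

for q in (10, 20, 30, 40, 50):
    p = error_prob_from_phred(q)
    print(f"Q{q}: error probability {p:g}, accuracy {100 * (1 - p):g}%")
```

Because the scale is logarithmic, every 10 points of Phred score corresponds to a tenfold drop in the error probability.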

Phred scores

Phred quality score   Probability that base is called wrongly   Accuracy of base call   ASCII character
10                    1 in 10                                   90%                     +
20                    1 in 100                                  99%                     5
30                    1 in 1000                                 99.9%                   ?
40                    1 in 10000                                99.99%                  I
50                    1 in 100000                               99.999%                 S
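The ASCII characters in the table are consistent with the Sanger FASTQ convention, in which the character code equals the Phred score plus 33 ("+" is code 43, "?" is 63, "I" is 73). A small decoding sketch under that assumption follows. Some older Illumina pipelines used an offset of 64 instead, so the offset is left as a parameter, and the function names are illustrative.

```python
def decode_qualities(quality_string, offset=33):
    """Convert FASTQ quality characters to Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]

def mean_error_probability(quality_string, offset=33):
    """Average per-base error probability across one read."""
    scores = decode_qualities(quality_string, offset)
    return sum(10 ** (-q / 10) for q in scores) / len(scores)

print(decode_qualities("+5?IS"))  # → [10, 20, 30, 40, 50]
```

The average is taken over error probabilities rather than Phred scores, because averaging the logarithmic scores would understate the contribution of a few poor bases.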

[Figure: sequence analysis pipeline flowchart]

Components: splitFastq_by_MID, 454FastaFromSFF, Fastq_QC, Fastq_QA, SSAHA2, SAMTools, pileupConsensus

Pipeline scripts: assemblyPipeline_splitQA.sh, assemblyPipeline_QCQA.sh, assemblyPipeline_SSAHAmap.sh

Legend: Script, FASTQ, SFF, JPEG, SAM/BAM

1/48th of a 454 plate (1/12th of a ¼ plate)

Phred 25

Sequencing whole virus genomes

Bluetongue Virus

Varicella Zoster Virus

Influenza Virus

2nd generation: Illumina (Solexa) sequencing

Fragment DNA and add adapters

Bind fragments to flow cell

Bridge amplification

Denature to return to single-stranded DNA

Produces millions of DNA clusters

Add all 4 labelled terminators

Laser excitation causes fluorescence, which is photographed

Repeat these sequencing cycles

Illumina Influenza H5N1

Pre- & Post Quality Filtering

454-Illumina Coverage Comparison

PB2 PB1 PA HA NP NA M1/M2 NS1/NS2

Dataset   Platform   Total Reads   Mean Read Length   Ref Coverage %   Min Coverage   Max Coverage
557H5N1   454        15,214        425.81             100              259            1,928
557H5N1   Illumina   1,669,501     53.96              100              7,013          61,435

Comparison of platforms' consensus sequences

Reference Position   Called Base (454)   Called Base (Illumina)
5615                 A                   R (A or G)
5621                 C                   Y (C or T)
5624                 T                   W (A or T)
6900                 G                   S (C or G)
8472                 A                   W (A or T)
8477                 G                   R (A or G)
8715                 A                   G (difference)
8937                 T                   K (G or T)
9623                 G                   K (G or T)
12575                T                   K (G or T)
13111                T                   K (G or T)
13280                A                   W (A or T)

12 differences / 13,500 bp genome = 0.089%
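Most of the 12 differences are positions where one platform reports an IUPAC ambiguity code that still includes the other platform's base. The sketch below separates those from outright mismatches. It assumes the first called base at each position is the 454 call and the second is the Illumina call, as the column order suggests. The `IUPAC` table is the standard nucleotide ambiguity code set, and `classify` is an illustrative name.

```python
# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def classify(call_a, call_b):
    """Label a pair of consensus calls as identical, ambiguity, or mismatch."""
    set_a, set_b = set(IUPAC[call_a]), set(IUPAC[call_b])
    if set_a == set_b:
        return "identical"
    if set_a & set_b:
        return "ambiguity"  # the calls share at least one base
    return "mismatch"

# (reference position, first call, second call) as listed on the slide.
calls = [
    (5615, "A", "R"), (5621, "C", "Y"), (5624, "T", "W"), (6900, "G", "S"),
    (8472, "A", "W"), (8477, "G", "R"), (8715, "A", "G"), (8937, "T", "K"),
    (9623, "G", "K"), (12575, "T", "K"), (13111, "T", "K"), (13280, "A", "W"),
]

counts = {}
for _, first, second in calls:
    kind = classify(first, second)
    counts[kind] = counts.get(kind, 0) + 1

print(counts)                        # → {'ambiguity': 11, 'mismatch': 1}
print(f"{len(calls) / 13500:.3%}")   # → 0.089%
```

Only position 8715 is a true disagreement. The other eleven are consistent with a mixed population at that site being reported by one platform and not the other.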

Consensus and population structure

[Figure: one panel each for Patients 1 to 8; axis: Frequency]

The problem of linkage

Diversity and phenotypic potential

Kellam & Larder, J. Virol. 1995, 69(2): 669

Quasispecies – haplotype reconstruction

Eriksson et al, PLoS Comput Biol May 2008, 4(5): e1000074

Conclusions

• Move towards many more consensus whole genomes and abstraction of population structure

• Drive down/control/filter for sequencing errors

• 3rd generation (end 2010) will produce longer reads

• Learn more from ecologists

• Considerations of genome-to-infectivity ratios

Virus Genomics Team

http://www.sanger.ac.uk/Teams/Team146/

Rachael Chiam, Simon Watson, Greg Baillie, Anne Palser, Astrid Gall

HIV: Myra McClure, Deenan Pillay

Influenza: James Wood, Maria Zambon

BTV: Massimo Palmarini & Peter Mertens

VZV: Judy Breuer