Bioinformatics tools for viral quasispecies reconstruction from next-generation sequencing data and
vaccine optimization
PD: Ion Măndoiu, UConn
Co-PDs: Mazhar Khan, UConn
Rachel O’Neill, UConn
Alex Zelikovsky, GSU
Outline
• Background & aims of the project
• Bioinformatics tools for quasispecies spectrum reconstruction from NGS reads
• Experimental validation on IBV data
• Summary and ongoing work
Infectious Bronchitis Virus (IBV)
• Group 3 coronavirus
• Biggest single cause of economic loss in US poultry farms
  – Young chickens: coughing, tracheal rales, dyspnea
  – Broiler chickens: reduced growth rate
  – Layers: egg production drops 5-50%, thin-shelled eggs, watery albumen
• Worldwide distribution, with dozens of serotypes in circulation
  – Co-infection with multiple serotypes is not uncommon, creating conditions for recombination
[Figures: IBV-infected embryo vs. normal embryo; IBV-infected egg defects]
IBV Vaccination
Broadly used, most commonly with attenuated live vaccine
• Short-lived protection
• Layers need to be re-vaccinated multiple times during their lifespan
• Vaccines might undergo selection in vivo and regain virulence [Hilt, Jackwood, and McKinley 2008]
RNA Virus Replication
High mutation rate (~10⁻⁴)
Lauring & Andino, PLoS Pathogens 2011
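To give a sense of scale for this mutation rate, the sketch below multiplies it by a genome length. It assumes the rate is per nucleotide per replication (the slide does not state units) and uses an approximate IBV genome length of 27,600 nt, a value that does not appear in the slides. All names are illustrative.

```python
# Back-of-the-envelope estimate of mutations per genome copy.
MUTATION_RATE = 1e-4    # from the slide; ASSUMED to be per nucleotide per replication
GENOME_LENGTH = 27_600  # ASSUMED approximate IBV genome length (nt), not from the slides


def expected_mutations(rate: float, length: int) -> float:
    """Expected number of mutations in one copy of a genome of the given length."""
    return rate * length


def prob_error_free_copy(rate: float, length: int) -> float:
    """Probability that a copy carries no mutation, assuming independent sites."""
    return (1.0 - rate) ** length


mean_mutations = expected_mutations(MUTATION_RATE, GENOME_LENGTH)
p_exact_copy = prob_error_free_copy(MUTATION_RATE, GENOME_LENGTH)
print(f"expected mutations per copy: {mean_mutations:.2f}")
print(f"probability of an exact copy: {p_exact_copy:.3f}")
```

Under these assumptions most copies carry at least one mutation, which fits the picture of a viral population as a cloud of closely related variants rather than a single sequence.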
Evolution of IBV
Quasispecies identified by cloning and Sanger sequencing in both IBV-infected poultry and commercial vaccines [Jackwood, Hilt, and Callison 2003; Hilt, Jackwood, and McKinley 2008]
How Are Quasispecies Contributing to Virus Persistence and Evolution?
• Variants differ in
  – Virulence
  – Ability to escape immune response
  – Resistance to antiviral therapies
  – Tissue tropism
Lauring & Andino, PLoS Pathogens 2011
Project Aims
• Develop bioinformatics tools for accurate reconstruction of quasispecies sequences and their frequencies from next-generation reads
• Study quasispecies persistence and evolution of IBV in commercial layer flocks following vaccination
• Use results of this study to optimize vaccine development and vaccination protocols
Next Generation Sequencing
http://www.economist.com/node/16349358
• Roche/454 FLX Titanium: 400-600 million reads/run, length up to 1,000 bp
• Illumina HiSeq 2000: up to 6 billion PE reads/run, 35-100 bp read length
• SOLiD 4/5500: 1.4-2.4 billion PE reads/run, 35-50 bp read length
• Ion Torrent PGM: 1-10M reads/run, length up to 400 bp
Shotgun vs. Amplicon Reads
• Shotgun reads: starting positions distributed ~uniformly
• Amplicon reads: reads have predefined start/end positions covering fixed overlapping windows
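The two read layouts on this slide can be contrasted with a small simulation. This is only an illustrative sketch: the function names, region length, read length, and overlap are hypothetical values chosen for the example, not parameters from the project.

```python
import random


def shotgun_starts(region_len, read_len, n_reads, seed=0):
    """Draw read start positions uniformly over the region (shotgun layout)."""
    rng = random.Random(seed)
    return [rng.randint(0, region_len - read_len) for _ in range(n_reads)]


def amplicon_windows(region_len, amplicon_len, overlap):
    """Tile the region with fixed overlapping windows (amplicon layout).

    Returns (start, end) pairs with end exclusive. The last window is placed
    flush with the end of the region, so its overlap may exceed `overlap`.
    """
    if not 0 <= overlap < amplicon_len <= region_len:
        raise ValueError("need 0 <= overlap < amplicon_len <= region_len")
    step = amplicon_len - overlap
    windows = []
    start = 0
    while start + amplicon_len < region_len:
        windows.append((start, start + amplicon_len))
        start += step
    windows.append((region_len - amplicon_len, region_len))
    return windows


# Hypothetical 1,000 nt region, 300 nt amplicons overlapping by at least 50 nt.
print(amplicon_windows(1000, 300, 50))
print(sorted(shotgun_starts(1000, 100, 5)))
```

Every amplicon read starts and ends at one of a handful of fixed coordinates, while shotgun reads can start anywhere, which is why the two data types call for different reconstruction methods.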
Reconstruction from Shotgun Reads: ViSpA
Pipeline: Shotgun reads → Read Error Correction → Read Alignment → Preprocessing of Aligned Reads → Read Graph Construction → Contig Assembly → Frequency Estimation → Quasispecies sequences w/ frequencies
User-specified parameters: (A) number of mismatches, (B) mutation rate
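The slide names a read graph construction step and a user-specified number of mismatches but does not give details. The sketch below shows one plausible reading, not ViSpA's actual code: aligned reads are vertices, and a directed edge joins read a to read b when b starts inside a, extends past its end, and the two agree over their overlap up to the allowed number of mismatches. The read representation and function names are hypothetical.

```python
def overlap_mismatches(read_a, read_b):
    """Count mismatches in the overlap of two aligned reads.

    Each read is (start, sequence) in reference coordinates. Returns None
    unless read_b starts inside read_a and extends past read_a's end.
    """
    start_a, seq_a = read_a
    start_b, seq_b = read_b
    end_a = start_a + len(seq_a)
    end_b = start_b + len(seq_b)
    if not (start_a < start_b < end_a < end_b):
        return None
    suffix_a = seq_a[start_b - start_a:]
    prefix_b = seq_b[:end_a - start_b]
    return sum(x != y for x, y in zip(suffix_a, prefix_b))


def build_read_graph(reads, max_mismatches):
    """Return adjacency lists {read index: [indices of compatible successors]}."""
    edges = {i: [] for i in range(len(reads))}
    for i, read_a in enumerate(reads):
        for j, read_b in enumerate(reads):
            if i == j:
                continue
            mismatches = overlap_mismatches(read_a, read_b)
            if mismatches is not None and mismatches <= max_mismatches:
                edges[i].append(j)
    return edges


# Hypothetical aligned reads: (start position, sequence).
example_reads = [(0, "ACGTAC"), (3, "TACGGA"), (3, "TTCGGA"), (7, "GATTTT")]
print(build_read_graph(example_reads, max_mismatches=0))
print(build_read_graph(example_reads, max_mismatches=1))
```

Raising the mismatch allowance adds edges between reads that differ slightly in their overlap, so the parameter trades tolerance of sequencing errors against the risk of joining reads from different variants.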
Reconstruction from Amplicon Reads: VirA
Inputs: reference in FASTA format; error-corrected SAM/BAM read data
Pipeline: Estimate Amplicons → Amplicon Read Graph → Max-Bandwidth Paths → Frequency Estimation → Viral population variants with frequencies
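The slide names max-bandwidth paths without defining them. A common textbook meaning is the path whose smallest edge weight is as large as possible; the sketch below computes it on a layered directed acyclic graph by dynamic programming over a topological order. It is a generic illustration under that assumption, not VirA's implementation, and the node names and edge weights (which could stand for read support between consecutive amplicons) are hypothetical.

```python
def max_bandwidth_path(edges, topo_order, source, sink):
    """Find the source-to-sink path maximizing the minimum edge weight.

    edges: {node: [(successor, weight), ...]} for a directed acyclic graph.
    topo_order: all nodes listed in a topological order.
    Returns (bandwidth, path); (0.0, []) if the sink is unreachable.
    """
    best = {node: 0.0 for node in topo_order}
    best[source] = float("inf")
    previous = {}
    for node in topo_order:
        if best[node] == 0.0:
            continue  # not reachable from the source
        for successor, weight in edges.get(node, []):
            width = min(best[node], weight)
            if width > best[successor]:
                best[successor] = width
                previous[successor] = node
    if best[sink] == 0.0:
        return 0.0, []
    path = [sink]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[sink], path


# Hypothetical graph: two amplicons, two distinct read sequences in each.
example_edges = {
    "s": [("a1", 40), ("b1", 60)],
    "a1": [("a2", 35), ("b2", 5)],
    "b1": [("b2", 50), ("a2", 10)],
    "a2": [("t", 45)],
    "b2": [("t", 55)],
}
example_order = ["s", "a1", "b1", "a2", "b2", "t"]
print(max_bandwidth_path(example_edges, example_order, "s", "t"))
```

In this toy graph the path through b1 and b2 wins because its weakest link (50) is stronger than the weakest link of any other path.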
Amplicon Sequencing Challenges
• Multiple reads from consecutive amplicons may match over their overlap
• Distinct quasispecies may be indistinguishable in an amplicon interval
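The second challenge can be seen on a toy example. The sketch below groups variants by their sequence inside one amplicon interval and reports the groups that cannot be told apart there. The sequences, names, and window coordinates are hypothetical.

```python
from collections import defaultdict


def indistinguishable_groups(variants, window):
    """Group variant names that share an identical sequence inside a window.

    variants: {name: full sequence}; window: (start, end) with end exclusive.
    Only groups with more than one member are returned.
    """
    start, end = window
    groups = defaultdict(list)
    for name, sequence in variants.items():
        groups[sequence[start:end]].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


# Three hypothetical variants that all differ over the full region.
example_variants = {
    "Q1": "ACGTACGTAC",
    "Q2": "ACGTACGAAC",  # differs from Q1 at position 7
    "Q3": "ATGTACGTAC",  # differs from Q1 at position 1
}
for window in [(0, 6), (4, 10)]:
    print(window, indistinguishable_groups(example_variants, window))
```

Q1 and Q2 look identical in the first window and Q1 and Q3 in the second, so reads from a single amplicon cannot be assigned to one variant without information from neighboring amplicons.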
IBV Genome
Rev. Bras. Cienc. Avic. vol.12 no.2 Campinas Apr./June 2010
RT-PCR of S1 using redesigned primers
Experiment 1
• 10-clone pool (C1 20%, C2 20%, C3 15%, C4 15%, C5 10%, C6 10%, C7 4%, C8 4%, C9 1%, C10 1%) → 454 reads → assembled quasispecies PV1, PV2, PV3, …, PVk
• M42 sample → 454 reads → assembled quasispecies V1, V2, V3, …, Vn
• M42 sample → 53 plasmid clones
Evaluated Reconstruction Flows
Reads Statistics & Coverage
Number of reads:

| Sample | Uncorrected | SAET Corrected | ShoRAH Corrected | KEC Corrected |
|---|---|---|---|---|
| M42 isolate | 53062 | 53062 | 50858 | 48945 |
| M42 clone pool | 21040 | 21040 | 19439 | 17122 |
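The read counts above are easier to compare as the fraction of reads each error-correction method retains. The sketch below does that arithmetic using the counts from the slide; the dictionary layout and function name are illustrative.

```python
# Read counts copied from the slide (Experiment 1).
READ_COUNTS = {
    "M42 isolate": {"Uncorrected": 53062, "SAET": 53062, "ShoRAH": 50858, "KEC": 48945},
    "M42 clone pool": {"Uncorrected": 21040, "SAET": 21040, "ShoRAH": 19439, "KEC": 17122},
}


def retained_fractions(counts):
    """Fraction of the uncorrected reads that remain after each correction method."""
    baseline = counts["Uncorrected"]
    return {
        method: n_reads / baseline
        for method, n_reads in counts.items()
        if method != "Uncorrected"
    }


for sample, counts in READ_COUNTS.items():
    summary = ", ".join(
        f"{method} {fraction:.1%}" for method, fraction in retained_fractions(counts).items()
    )
    print(f"{sample}: {summary}")
```

SAET keeps every read in both samples, ShoRAH keeps roughly 96% and 92%, and KEC keeps roughly 92% and 81%, so KEC is the most aggressive of the three at discarding reads on these data.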
Reads Validation
How well we predicted Sanger clones
How well our prediction is
Average Prediction Error
Neighbor-Joining Tree for M42 Sanger Clones & ViSpA Quasispecies
Experiment 2
Reads Statistics & Coverage
Number of reads:

| Sample | Uncorrected | SAET Corrected | ShoRAH Corrected | KEC Corrected |
|---|---|---|---|---|
| M41 Vaccine | 92113 | 92113 | 87883 | 85311 |
| Field #1 | 38502 | 38502 | 33685 | 32521 |
| Field #2 | 132513 | 132513 | 123370 | 111686 |
| Field #3 | 76906 | 76906 | 71408 | 64507 |
| Field #4 | 44467 | 44467 | 41653 | 37295 |
Neighbor-Joining Tree for Sanger clones and ViSpA Reconstructed Sequences
Summary
• Developed software tools for quasispecies reconstruction from both shotgun and amplicon next-generation reads
  – Code and executables freely available at http://alla.cs.gsu.edu/~software/VISPA/vispa.html and http://alan.cs.gsu.edu/vira/
  – ViSpA plugin developed for users of Ion Torrent, available on the Ion Community
• Experimental results on both simulated and real data show improved accuracy tradeoffs compared to previous methods
• Tools are applicable to quasispecies studies of other viruses
Ongoing Work
• Deployment of ViSpA and VirA on Galaxy servers maintained at UConn and GSU
• Tool validation on Ion Torrent reads
• Comparison of shotgun and amplicon based reconstruction methods
• Combining long and short read technologies
• Quasispecies persistence studies using longitudinal sampling
Tool Validation for Ion Torrent reads
• Shotgun IBV reads generated using an Ion 316 chip
  – 2,384,007 reads (1,177,740 after SAET correction)
  – mean length 203.58 bp
• ViSpA results
  – 23 quasispecies with estimated frequency > 0.5%, 2,200 total
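The figures on this slide imply two simple computations: the share of reads left after correction, and a frequency cutoff applied to the reconstructed quasispecies. The sketch below shows both. The read totals come from the slide; the variant names and frequencies are hypothetical, since the slide does not list the actual ViSpA output.

```python
TOTAL_READS = 2_384_007       # from the slide
READS_AFTER_SAET = 1_177_740  # from the slide

retained = READS_AFTER_SAET / TOTAL_READS
print(f"reads retained after SAET: {retained:.1%}")


def filter_by_frequency(frequencies, min_frequency=0.005):
    """Keep variants with estimated frequency above the cutoff, most frequent first."""
    kept = {name: f for name, f in frequencies.items() if f > min_frequency}
    return dict(sorted(kept.items(), key=lambda item: item[1], reverse=True))


# HYPOTHETICAL estimated frequencies, for illustration only.
example_frequencies = {
    "qsp_d": 0.004,
    "qsp_b": 0.30,
    "qsp_a": 0.62,
    "qsp_e": 0.0009,
    "qsp_c": 0.0751,
}
print(filter_by_frequency(example_frequencies))
```

About 49% of the reads remain after correction. With a 0.5% cutoff, only a small subset of the reconstructed sequences is reported, which matches the slide's 23 out of 2,200.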
Longitudinal Sampling
Amplicon / shotgun sequencing
Contributors
University of Connecticut: Rachel O’Neill, Ph.D.; Mazhar Khan, Ph.D.; Hongjun Wang, Ph.D.; Craig Obergfell; Andrew Bligh
Bassam Tork; Ekaterina Nenastyeva; Alex Artyomenko; Serghei Mangul; Nicholas Mancuso; Alexander Zelikovsky
University of Maryland: Irina Astrovskaya, Ph.D.