CLC bio presentation at 5th SFAF 6/3/2010
DESCRIPTION
My presentation at the 5th Sequencing Finishing and Analysis in the Future (SFAF -- http://www.lanl.gov/conferences/finishfuture/2010SFAF_Meeting_Guide.pdf) June 3, 2010
TRANSCRIPT
![Page 1: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/1.jpg)
CLC bio
A Comprehensive Platform for NGS Data Analysis
Saul A. Kravitz, PhD
Director of Consulting Services
![Page 2: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/2.jpg)
Before the Flood
2005: $5M Human genome – 19 sequencer years
Sample Prep, Sequencing, Analysis, Science
![Page 3: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/3.jpg)
Nextgen Sequencing Revolution
Sample Prep, Sequencing, Analysis, Science
2010: $6k Human genome ~1 sequencer day
Help!!
![Page 4: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/4.jpg)
Bioinformatics Challenges
•Data Analysis Tools for Biomedical Researchers
•GUI-driven
•HPC integration
•Unprecedented data volumes
•Rapid technology change, applications growth
•Multi-platform data integration
•No one-size-fits-all solutions
•Rapid customization and adaptation
![Page 5: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/5.jpg)
CLC bio NGS Analysis Platform
•CLC Genomics Workbench: Easy to use, Wizard-driven Desktop Software
•CLC Genomics Server: Enterprise solution
•CLC Assembly Cell: High performance NGS algorithms
•Developer SDK: Workbench and Server Customization
![Page 6: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/6.jpg)
Swiss Army Knife of NGS Analysis
Genomics, Transcriptomics, Epigenomics
RNA-Seq, miRNA, ChIP-Seq
Read Mapping, De Novo Assembly, SNP/DIP Detection
Visualization
File Format Conversion
Desktop Solutions, Enterprise Solutions
Traditional Bioinformatics
Intuitive GUI, SDK
Tools Integration
High Performance
![Page 7: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/7.jpg)
Why not use free tools?
•Are tools free or “free”?
•Tools vs solutions
•True cost of ownership
•Ease of Use
•Tools integration
•Support
![Page 8: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/8.jpg)
Small RNA Analysis (in Beta soon)
•Identify and filter/trim adapters
•Annotate using miRBase and other resources
- target species of interest
•Merge/group by mature, precursor/reference
•Fully integrated with expression analysis
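The first and third steps above (adapter trimming, then grouping identical sequences) can be illustrated with a minimal sketch. This is not CLC bio's implementation: the adapter sequence, the exact-match search, the minimum length, and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical 3' adapter sequence, used only for illustration.
ADAPTER = "TCGTATGCCG"


def trim_adapter(read, adapter=ADAPTER, min_length=15):
    """Cut the read at the first exact adapter match.

    Returns None when the adapter is absent or the trimmed read is
    shorter than min_length, so such reads are filtered out.
    """
    idx = read.find(adapter)
    if idx == -1:
        return None
    trimmed = read[:idx]
    return trimmed if len(trimmed) >= min_length else None


def count_small_rnas(reads, adapter=ADAPTER, min_length=15):
    """Trim every read and count identical trimmed sequences."""
    counts = Counter()
    for read in reads:
        trimmed = trim_adapter(read, adapter, min_length)
        if trimmed is not None:
            counts[trimmed] += 1
    return counts
```

The resulting counts per distinct sequence are the kind of table that would then be annotated against miRBase and passed on to expression analysis. A real trimmer would also tolerate mismatches and partial adapters at the read end.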
![Page 9: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/9.jpg)
De Novo Assembler
• Human assembly of 38x Illumina paired-end
 - CLC quality equivalent to ABySS
 - CLC: 7 hrs, 1 node, 42 GB of RAM
 - ABySS: 80 hrs, 21 nodes, 336 GB of RAM
• Metagenomics Assembly
 - MetaHIT dataset MH0041: 40M 75 bp paired-end
 - 3 hrs on desktop, 6 GB RAM
 - Higher N50 and total contig size than reported
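To put the figures above in perspective, here is a small sketch that converts the quoted run times into node-hours and shows how N50 and total contig size are commonly computed. The run times, node counts, and RAM figures come from the slide; the helper names are illustrative and the sketch is not taken from any CLC tool.

```python
def node_hours(hours, nodes):
    """Wall-clock hours multiplied by the number of nodes used."""
    return hours * nodes


def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half
    of the total assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Figures quoted on the slide for the human assembly.
clc_node_hours = node_hours(7, 1)        # 7
abyss_node_hours = node_hours(80, 21)    # 1680
node_hour_ratio = abyss_node_hours / clc_node_hours  # 240.0
ram_ratio = 336 / 42                     # 8.0
```

By this measure the quoted ABySS run used 240 times the node-hours and 8 times the RAM of the CLC run. Total contig size is simply `sum(contig_lengths)`; the slide does not give the MetaHIT N50 values themselves, so none are shown here.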
![Page 10: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/10.jpg)
Viral Sequencing at JCVI (See Nadia Fedorova’s Poster!)
• Amplify and Barcode using SISPA, 454 + Illumina Sequencing
• Depth of coverage sometimes >1000x
• De novo Assembly of Consensus for all Segments
• For each segment:
 - Map reads from each technology independently using best full length reference from NCBI, call variations
 - Update reference with variations confirmed by multiple technologies
 - Map reads using updated reference and all reads
 - Convert to consed, analyze, order Sanger closure reactions
Source: Jessica Hostetler, Nadia Fedorova, Tim Stockwell, Danny Katzel
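The "update reference with variations confirmed by multiple technologies" step can be sketched as follows. This is a simplified illustration, not the JCVI pipeline: it assumes variant calls are already available per technology as `{position: base}` substitutions (0-based), ignores insertions and deletions, and uses hypothetical function names.

```python
def confirmed_variants(calls_by_technology, min_technologies=2):
    """Keep substitutions called identically by at least
    min_technologies sequencing technologies."""
    support = {}
    for calls in calls_by_technology.values():
        for pos, base in calls.items():
            support[(pos, base)] = support.get((pos, base), 0) + 1
    return {pos: base for (pos, base), n in support.items()
            if n >= min_technologies}


def update_reference(reference, variants):
    """Apply confirmed substitutions to the reference sequence."""
    seq = list(reference)
    for pos, base in variants.items():
        seq[pos] = base
    return "".join(seq)


# Toy example with made-up calls for one segment.
reference = "ACGTACGT"
calls = {
    "454": {1: "T", 5: "G"},
    "illumina": {1: "T", 6: "A"},
}
confirmed = confirmed_variants(calls)
updated = update_reference(reference, confirmed)
```

Only the substitution at position 1 is seen by both technologies, so it alone is written into the updated reference that all reads are then mapped against.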
![Page 11: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/11.jpg)
Why CLC bio Tools?
• CLC handled hybrid sequencing technologies directly
• Very biased coverage confounded other assemblers that expect random arrival stats. CLC didn’t seem to suffer from biased coverage.
• Very accurate SNP calls in areas of deep coverage.
Tim Stockwell
Director of Viral Informatics
J. Craig Venter Institute
![Page 12: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/12.jpg)
Targeted Resequencing QC
•Assessment of targeted sequencing technology
•Coverage Statistics for Targeted Regions
•Very short schedule, limited bioinformatics staff
•Plug-in development leveraging CLC tools to automate the process and meet short deadline
•QC Report now available as plug-in
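As an illustration of what coverage statistics for targeted regions might involve, here is a minimal sketch. It is not the CLC plug-in: the input layout (a per-base depth list plus 0-based, half-open target intervals), the 20x threshold, and the reported fields are assumptions.

```python
def target_coverage_stats(depth, targets, min_depth=20):
    """Summarise read depth over each target region.

    depth   -- per-base read depth along the reference (list of ints)
    targets -- iterable of (name, start, end), 0-based, end exclusive
    """
    report = []
    for name, start, end in targets:
        region = depth[start:end]
        if not region:
            continue  # skip empty or out-of-range targets
        report.append({
            "target": name,
            "length": len(region),
            "mean_depth": sum(region) / len(region),
            "min_depth": min(region),
            "fraction_covered": sum(d >= min_depth for d in region) / len(region),
        })
    return report


# Toy example with made-up depths.
depth = [0, 5, 10, 30, 40, 50, 20, 0]
report = target_coverage_stats(depth, [("exon1", 2, 6)])
```

Each report row gives the mean and minimum depth of a target and the fraction of its bases at or above the chosen threshold, which is the kind of summary a targeted-sequencing QC report typically tabulates.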
![Page 13: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/13.jpg)
Professional Services
•Developing customized solutions
•Integration with LIMS, workflows, DB
•Bioinformatics Algorithm Development
•Cloud and Grid Integration
•Data Analysis
![Page 15: CLC bio presentation at 5th SFAF 6/3/2010](https://reader036.vdocuments.mx/reader036/viewer/2022062513/555012a7b4c905af648b49d1/html5/thumbnails/15.jpg)
Thank you for listening
Saul A. Kravitz, PhD
skravitz @ clcbio.com
(301) 355-0813
Questions