Next-generation sequencing and PBRC

Post on 19-Dec-2015


Next Generation Sequencer Applications

• De novo sequencing
• Resequencing, comparative genomics
• Global SNP analysis
• Gene expression analysis
• Methylation studies
• ChIP sequencing: transcription factors, histones, polymerases
• Transcriptome analysis: splicing, UTRs, cSNPs, nested transcripts
• MicroRNA discovery and quantitation
• Metagenomics, microbial diversity
• Copy number variation
• Chromosomal aberrations
• Gene regulation studies

AB SOLiD Ligation sequencing

How many sequence tags* do I need for my gene expression application?

• SAGE/CAGE – 2-5 million mappable
• miRNA – 10 million mappable
• ChIP-Seq – 10-20 million mappable
• Whole transcriptome from polyA RNA – 40-50 million mappable
• Whole transcriptome from rRNA-depleted RNA – >50 million mappable
• Whole transcriptome for allele-specific expression – >>50 million mappable

SOLiD™ 4 generates >1.4 billion mappable sequences/run (2 slides)

Libraries can be multiplexed to decrease the cost/sample according to the application and number of sequences needed.

*For human/mouse sized genomes; smaller organisms require fewer sequence tags.
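
The multiplexing arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: it assumes reads are split evenly across barcoded libraries, takes the upper end of each range on the slide as the per-sample requirement, and treats the run capacity as exactly 1.4 billion mappable sequences. The function name and the optional barcode cap are illustrative, not part of any SOLiD software.

```python
# Rough multiplexing estimate: how many libraries fit in one run?
# Assumptions (not from any vendor tool): even read distribution across
# barcodes, and a run capacity of exactly 1.4 billion mappable sequences.

RUN_CAPACITY = 1_400_000_000  # SOLiD 4, 2 slides (slide says ">1.4 billion")

# Mappable tags needed per sample, using the upper end of each range.
TAGS_NEEDED = {
    "SAGE/CAGE": 5_000_000,
    "miRNA": 10_000_000,
    "ChIP-Seq": 20_000_000,
    "Whole transcriptome (polyA)": 50_000_000,
}


def max_samples_per_run(tags_per_sample, run_capacity=RUN_CAPACITY,
                        barcode_limit=None):
    """Return how many samples fit in a run at the requested depth.

    barcode_limit is an optional cap on the number of available barcodes.
    """
    n = run_capacity // tags_per_sample
    if barcode_limit is not None:
        n = min(n, barcode_limit)
    return n


if __name__ == "__main__":
    for application, tags in TAGS_NEEDED.items():
        print(f"{application}: up to {max_samples_per_run(tags)} samples/run")
```

In practice the usable number is lower, since barcode availability and uneven library representation both reduce the effective depth per sample.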

SAGE Sequencing vs. Microarray

| | SOLiD v4 | Microarray (Illumina Ref 8) | Microarray (Illumina Ref 6) |
|---|---|---|---|
| Data points | 3.6 million | 25,600 | 45,200 |
| Transcripts detected | Known and novel transcripts | Known transcripts | Known transcripts |
| Sensitivity | 6 logs | 3 logs | 3 logs |
| Technical reproducibility | >0.99-0.999 | 0.9 | 0.9 |
| Correlation to TaqMan | 0.9 | 0.7-0.8 | 0.7-0.8 |
| Multiplexing/barcoding | Yes, up to 48 RNA or 96 DNA samples | No | No |
| Background | No background; better for low-abundance transcript detection | Hybridization process creates background signal | Hybridization process creates background signal |
| RNA quantity | 5-10 µg | 750 ng | 750 ng |
| 16-sample experiment cost | $7200 full service; $6100 if PI creates library | $3600 | $5200 |
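
To put the 16-sample experiment costs on a common footing, the short Python sketch below converts them to a cost per sample and, more speculatively, a cost per thousand data points. The second figure assumes the "data points" row refers to a single sample, which the slide does not state. The dictionary keys and function names are illustrative.

```python
# Cost comparison derived from the table figures.
# Assumption: "data points" is a per-sample figure (not stated on the slide).

SAMPLES = 16

# platform -> (16-sample experiment cost in USD, data points)
PLATFORMS = {
    "SOLiD v4 (full service)": (7200, 3_600_000),
    "SOLiD v4 (PI creates library)": (6100, 3_600_000),
    "Microarray, Illumina Ref 8": (3600, 25_600),
    "Microarray, Illumina Ref 6": (5200, 45_200),
}


def cost_per_sample(experiment_cost, n_samples=SAMPLES):
    """Experiment cost divided evenly across samples."""
    return experiment_cost / n_samples


def cost_per_thousand_points(experiment_cost, data_points, n_samples=SAMPLES):
    """Per-sample cost normalised to 1,000 data points."""
    return cost_per_sample(experiment_cost, n_samples) / data_points * 1000


if __name__ == "__main__":
    for name, (cost, points) in PLATFORMS.items():
        print(f"{name}: ${cost_per_sample(cost):.2f}/sample, "
              f"${cost_per_thousand_points(cost, points):.4f} per 1,000 data points")
```

On these numbers, sequencing costs more per sample than either array but far less per data point, which is the trade-off the table points to.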

Primary Data Analysis – Images to bases
• Instrument-specific
• Sequences + quality values

Secondary Data Analysis – Bases to alignments/contigs
• Ref Seq + alignment
• Assembly, de novo

Tertiary Data Analysis – Experiment specific
• Differential expression
• Methylation sites
• Binding sites
• Gene association
• Genomic structure

Applications
• Tag profiling
• Small RNA analysis
• Transcriptome seq.
• ChIP-Seq
• Methylation analysis
• Resequencing
• De novo assembly

Algorithms
• Eland
• Maq
• SOAP
• Velvet
• Newbler
• Mapreads
• Others …

Run Quality

Sample/Library Quality

Discovery

Bioinformatics: Geospiza

One or more data sets

Next-gen sequencing: applications

– Genome analysis: basic and translational research
  • Genetics of disease – new frontiers
  • Exome resequencing: confirmation of GWAS
  • Genome sequence as diagnostic tool
  • Genetic counseling

– Epigenome analysis: basic research; biomarkers
  • Analyses of DNA methylation, transcription factors, histone modifications, non-coding RNA
  • Epigenomic biomarkers of disease

– Gene expression analysis: basic research; diagnostics & biomarkers
  • Whole transcriptome: all transcribed sequences in a cell
  • SAGE analysis: expression of known genes
  • Small RNA: microRNA as regulators of biology

– Genotype to phenotype: a new frontier
  • Pathology: systems biology
  • Diagnosis: data filtering
  • Personalized genomic medicine: treatment recommendations

Next-gen sequencing: challenges

– Rapid growth in methodology
  • Technology and equipment changes & upgrades

– High demands on informatics:
  • Staff
  • Software
  • Computational resources

– New ways of handling data needed:
  • Interpretation
  • Publication
  • Storage