Comparison of RNA sequencing with 19,319 lab validated RT-qPCR assays Jan Hellemans, PhD London, UK October 20-21, 2014


Comparison of RNA sequencing with 19,319 lab validated RT-qPCR assays

Jan Hellemans, PhD

London, UK

October 20-21, 2014


Acknowledgements

•  Biogazelle
    •  Biogazelle team & collaborators
•  Ghent University
    •  Steve Lefever
•  SEQC consortium
    •  Christopher Mason
    •  David Kreil
    •  Leming Shi
•  Bio-Rad

Introduction

•  qPCR: reference technology for nucleic acid quantification
    •  sensitivity and specificity
    •  wide dynamic range
    •  speed
    •  relatively low cost
    •  conceptual and practical simplicity
•  easy to perform ≠ easy to do it right
    •  many steps involved
    •  all need to be right

Assays & MIQE

•  design
    •  amplicon length
    •  primer positions (exonic or intron-spanning)
    •  transcript coverage
•  in silico verification
    •  specificity prediction (retropseudogenes and other homologues)
    •  secondary structure analysis
•  empirical (wet lab) validation
    •  specificity assessment (gel, melt, amplicon sequencing)
    •  Cq of NTC (for SYBR assays)
    •  amplification efficiency determination (slope, E, SE(E), r²)


The perfect assay - properties

•  specific for the gene of interest (no off-target amplification)
•  detection of all transcript variants
•  detection not affected by polymorphisms (no allelic bias or drop out)
•  amplification efficiency ~100%
•  no gDNA co-amplification
•  no primer dimer formation


The perfect assay ... or the best possible

•  For some genes, there is no perfect assay
    •  no unique sequence (homology with other genes, pseudogenes)
    •  no common sequence among all transcripts
    •  regions are excluded because of repeats, secondary structures, SNPs, homology, ...
•  Make the best possible compromise and report potential issues
•  Design → in silico quality control → lab validation


Assay design using primerXL

•  database of genomic information (transcripts, SNPs, ...)

•  tools for target region selection (maximize transcript coverage)

•  primer3 design engine

•  analysis of secondary structures and SNPs in primer annealing regions

•  specificity prediction (BiSearch)

•  relaxation cascade (from perfect to best possible)


BiSearch specificity prediction

•  BiSearch loose

•  1222222222222222

•  only the gene of interest (FFAR2)

•  BiSearch strict

•  1233333333333

reads   seq   gene_list   official_symbol   location
2843   CATGGCAGTCACCATCTTCTGCTACTGGCGTTTTGTGTGGATCATGCTCTCCCAGCCCCTTGTGGGGGCCCAGAGGCGGCGCCGAGCCGTGGGGCTGGCTGTGGTGACGCTGCTCAATTTCCTGGTGTGCTTCGGACCTTACAGATCGGAA   ENSG00000126262   FFAR2   19:35940617-35942667
1897   GTAAGGTCCGAAGCACACCAGGAAATTGAGCAGCGTCACCACAGCCAGCCCCACGGCTCGGCGCCGCCTCTGGGCCCCCACAAGGGGCTGGGAGAGCATGATCCACACAAAACGCCAGTAGCAGAAGATGGTGACTGCCATGAGATCGGAA   ENSG00000126262   FFAR2   19:35940617-35942667
1535   GTAAGGTCCGAAGCACACCGAGAGCTGGGAGCAGGAGCTACACAGTCTGCTGGCCTCACTGCACACCCTGCTGGGGGCCCTGTACGAGGGAGCAGAGACTGCTCCTGTGCAGAATGAAGGCCCTGGGGTGGAGATGCTGCTGTCCTCAGAA   ENSG00000141456   AC091153.1   17:4574680-4607632
1097   CATGGCAGTCACCATCTTCTGAGGACAGCAGCATCTCCACCCCAGGGCCTTCATTCTGCACAGGAGCAGTCTCTGCTCCCTCGTACAGGGCCCCCAGCAGGGTGTGCAGTGAGGCCAGCAGACTGTGTAGCTCCTGCTCCCAGCTCTCGG   ENSG00000141456   AC091153.1   17:4574680-4607632
1091   CATGGCAGTCACCATCTTCTGAGGACAGCAGCATCTCCACCCCAGGGCCTTCATTCTGCACAGGAGCAGTCTCTGCTCCCTCGTACAGGGCCCCCAGCAGGGTGTGCAGTGAGGCCAGCAGACTGTGTAGCTCCTGCTCCCAGCTCTCGGT   ENSG00000141456   AC091153.1   17:4574680-4607632
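The read counts listed above can be turned into an on-target fraction for this assay. The following is a minimal sketch, assuming that only the five sequence rows shown on the slide are counted and that FFAR2 is the intended target; the function name `on_target_fraction` is introduced here for illustration and is not part of primerXL or BiSearch.

```python
from collections import defaultdict

# (reads, official_symbol) pairs copied from the amplicon sequencing rows above
READ_COUNTS = [
    (2843, "FFAR2"),
    (1897, "FFAR2"),
    (1535, "AC091153.1"),
    (1097, "AC091153.1"),
    (1091, "AC091153.1"),
]

def on_target_fraction(read_counts, target_symbol):
    """Sum reads per gene and return (reads per gene, fraction on target)."""
    per_gene = defaultdict(int)
    for reads, symbol in read_counts:
        per_gene[symbol] += reads
    total = sum(per_gene.values())
    return dict(per_gene), per_gene[target_symbol] / total

per_gene, fraction = on_target_fraction(READ_COUNTS, "FFAR2")
print(per_gene)                               # {'FFAR2': 4740, 'AC091153.1': 3723}
print(f"on-target fraction: {fraction:.1%}")  # on-target fraction: 56.0%
```

Under these assumptions a little over half of the listed reads come from the gene of interest, which illustrates how sequencing the PCR product exposes co-amplification that a symbol-level prediction alone would not quantify.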

Wet lab validation - setup

•  PCR composition
    •  total volume: 5 µl
    •  instrument: CFX384 (with automation)
    •  mastermix: SsoAdvanced SYBR
    •  primer conc: 250 nM each
•  PCR program
    •  default cycling protocol for SsoAdvanced SYBR (Ta = 60°C)
•  Samples
    •  cDNA: 25 ng (total RNA equivalents; Agilent Universal Human Reference RNA = MAQC A)
    •  gDNA: 2.5 ng (Roche)
    •  NTC: water + carrier (5 ng/µl yeast transfer RNA)
    •  synthetic template (pooled 60-mers, concentration range: 20 million down to 20 copies)

Wet lab validation - some numbers

•  lab validation of 103 053 assays (human, mouse and rat coding genes)
•  1 456 142 reactions
•  3 822 PCR plates (384-well)
    •  equivalent to 15 288 PCR plates (96-well)

[Figure annotation: 305 m]

Amplification efficiency - synthetic templates

•  initial publication: Vermeulen et al., Nucleic Acids Research, 2009
•  Biogazelle approach (easy & cost effective)
    •  60-mer
    •  no modifications, standard desalted
    •  7-point dilution series: 20 000 000 down to 20 molecules
•  equivalent to full-length double-stranded template
•  limitation: the behavior of the first cycles when amplifying from cDNA is not evaluated

[Figure labels: 30 nt 3’, 30 nt 5’]

metric   ds template   ss oligo
r² < 0.99   1   1
median E   2.00   2.01
average E   2.00   2.01
count E outside [1.90-2.10]   1   3
paired t-test p-value   0.14
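The quantities reported for each assay (slope, E, SE(E), r²) follow from a straight-line fit of Cq against log10 of the template input. The sketch below shows the standard relation E = 10^(-1/slope), so that E = 2.00 corresponds to perfect doubling per cycle. It assumes tenfold steps between the 7 dilution points and uses made-up Cq values; `fit_standard_curve` is a name introduced here, and the first-order error propagation for SE(E) is one common choice, not necessarily the one used for the slide data.

```python
import math

def fit_standard_curve(copies, cq):
    """Fit Cq = intercept + slope * log10(copies) by ordinary least squares."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    syy = sum((yi - mean_y) ** 2 for yi in cq)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, cq))
    se_slope = math.sqrt(ss_res / (n - 2) / sxx)
    efficiency = 10 ** (-1.0 / slope)  # E = 2.00 corresponds to perfect doubling
    # First-order error propagation: dE/dslope = E * ln(10) / slope**2
    se_efficiency = efficiency * math.log(10) / slope ** 2 * se_slope
    return {"slope": slope, "E": efficiency, "SE(E)": se_efficiency,
            "r2": 1.0 - ss_res / syy}

# 7-point series from 20 000 000 down to 20 molecules (tenfold steps assumed)
copies = [20 * 10 ** k for k in range(6, -1, -1)]

# Hypothetical Cq values, not measured data: an ideal curve (slope of about
# -3.32 cycles per tenfold dilution) with a small alternating offset
ideal_slope = -1.0 / math.log10(2)
cq = [38.0 + ideal_slope * math.log10(c) + (0.05 if i % 2 else -0.05)
      for i, c in enumerate(copies)]

result = fit_standard_curve(copies, cq)
print({key: round(value, 4) for key, value in result.items()})
```

An assay would then be flagged when E falls outside the [1.90-2.10] window or r² drops below 0.99, the two thresholds that appear in the comparison of ds template and ss oligo above.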

Amplification efficiency distribution (n = 50 133)

[Figure: distribution of amplification efficiencies; annotations: "89%" and "redesign" (twice)]


Specificity - NGS for increased sensitivity

•  amplicon sizing (+ melt analysis for SYBR assays)
    •  limited sensitivity for detecting low-level non-specific co-amplification
    •  failure to observe non-specific amplification of sequences with similar size and/or Tm, e.g. expressed pseudogenes or homologous genes
•  Next level of specificity assessment
    •  in silico specificity predictions by BiSearch
    •  massively parallel sequencing of pooled PCR products
        •  average coverage > 1000-fold → lab specificity > 99.9%
        •  50 – 200 times more sensitive than size analysis and Sanger sequencing
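One way to read the link between coverage and specificity is that, at a given read depth, the smallest off-target fraction that can be observed is a single read. The sketch below works through that arithmetic; it is an interpretation of the slide, not a description of the actual analysis pipeline, and both function names are introduced here for illustration.

```python
def min_detectable_off_target(coverage):
    """Smallest non-zero off-target fraction: one read out of `coverage` reads."""
    return 1.0 / coverage

def implied_detection_limits(ngs_limit, fold_range=(50, 200)):
    """Detection limits of a method that is 50 to 200 times less sensitive."""
    return tuple(ngs_limit * fold for fold in fold_range)

ngs_limit = min_detectable_off_target(1000)
print(f"resolvable specificity at 1000-fold coverage: {1 - ngs_limit:.1%}")
low, high = implied_detection_limits(ngs_limit)
print(f"implied limit for size analysis / Sanger: {low:.0%} to {high:.0%}")
```

Read this way, the quoted 50 to 200-fold gain would put the detection limit of size analysis and Sanger sequencing at roughly 5% to 20% non-specific product.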

Specificity - most assays are 100% on-target

Specificity - 2/3 of non-specific assays may go unnoticed without NGS

[Figure: axis "% on-target" (0% to 100%); a second axis from 0% to 60% over bins of width 0.1, from 0 < x < 0.1 to 0.9 < x < 1]


Specificity - the power of in silico verification

category   count   fraction
perfect   60 293   86%
acceptable (<10% non-specific)   5 866   8%
predicted non-specificity (no specific design found)   1 204   2%
failing specificity QC criteria   2 467   4%
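The percentages on this slide can be reproduced from the four counts. The short sketch below does so, assuming the four categories are exhaustive so that their sum is the denominator.

```python
# Counts per specificity category, as listed on the slide
CATEGORIES = {
    "perfect": 60293,
    "acceptable (<10% non-specific)": 5866,
    "predicted non-specificity (no specific design found)": 1204,
    "failing specificity QC criteria": 2467,
}

total = sum(CATEGORIES.values())
shares = {name: round(100 * count / total) for name, count in CATEGORIES.items()}

print(total)  # 69830
for name, share in shares.items():
    print(f"{name}: {share}%")
```

With that denominator, only the last category (about 4%) consists of assays that failed specificity criteria in the lab without having been flagged at design time.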


MIQE compliant PrimePCR assay validation data sheet

Dynamic range

> 10 000 000 fold

[Figure: gene count (0 to 5000) against copies per cell, from 16 777.216 down to 0.001 in twofold steps; series: human, mouse, rat]


SEQC

•  multisite, cross-platform analysis of RNAseq
•  FDA sponsored and guided MAQC-III
•  Nature Biotechnology, Sept 2014: Focus on RNA sequencing quality control (SEQC)
    •  2 Biogazelle co-authors
•  MAQC samples: reference RNA with built-in controls (known truths)
•  > 100 billion reads
•  compared against qPCR (PrimePCR)


RNAseq vs PrimePCR Differential expression

454 ILMN PGM PRO

0.83 0.89 0.86 0.89

13,190 genes 16,264 genes 14,981 genes 16,242 genes


qPCR (PrimePCR) vs RNAseq (Illumina) r² = 75% for genes detected by both platforms


Saturation analysis - ABRF-NGS dataset

preparation   sample   libraries   reads   GENCODE12 mapping   PrimePCR mapping
ribo-depleted   MAQC A   22   5 304 M   1 955 M (37%)   1 692 M (32%)
ribo-depleted   MAQC B   17   3 370 M   1 447 M (43%)   1 193 M (35%)
poly-A–enriched   MAQC A   4   427 M   291 M (68%)   278 M (65%)
poly-A–enriched   MAQC B   4   446 M   323 M (72%)   297 M (67%)
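The mapping percentages in the table above are the mapped reads divided by the total reads of each row. A minimal sketch that recomputes them from the read counts (in millions) follows; the tuple layout and the function name `mapping_rates` are choices made here for illustration.

```python
# preparation, sample, libraries, reads (M), GENCODE12 mapped (M), PrimePCR mapped (M)
ROWS = [
    ("ribo-depleted", "MAQC A", 22, 5304, 1955, 1692),
    ("ribo-depleted", "MAQC B", 17, 3370, 1447, 1193),
    ("poly-A-enriched", "MAQC A", 4, 427, 291, 278),
    ("poly-A-enriched", "MAQC B", 4, 446, 323, 297),
]

def mapping_rates(rows):
    """Return (preparation, sample, % mapped to GENCODE12, % mapped to PrimePCR)."""
    return [
        (prep, sample, round(100 * gencode / reads), round(100 * primepcr / reads))
        for prep, sample, _libraries, reads, gencode, primepcr in rows
    ]

for prep, sample, gencode_pct, primepcr_pct in mapping_rates(ROWS):
    print(f"{prep:16s} {sample}  GENCODE12 {gencode_pct}%  PrimePCR {primepcr_pct}%")
```

The recomputed values match the table: poly-A enrichment yields roughly twice the mappable fraction of ribo-depletion, while the ribo-depleted libraries contribute far more reads in total.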

Saturation analysis ribo-depletion RNAseq - % of GENCODE12

[Figure: percentage (0% to 100%) against a twofold series from 125 000 to 4 096 000 000, presumably the number of reads; series: MAQC A - detection, MAQC B - detection]


Saturation analysis ribo-depletion RNAseq - % of GENCODE12

[Figure: percentage (0% to 100%) against a twofold series from 125 000 to 4 096 000 000, presumably the number of reads; series: MAQC A - detection, MAQC B - detection, MAQC A - quantification, MAQC B - quantification]


Saturation analysis poly-A RNAseq - % of GENCODE12

[Figure: percentage (0% to 100%) against a twofold series from 125 000 to 4 096 000 000, presumably the number of reads; series: MAQC A - detection, MAQC B - detection, MAQC A - quantification, MAQC B - quantification]


Saturation analysis ribo-depletion RNAseq for MAQC A - GENCODE12 vs PrimePCR

[Figure: percentage (0% to 100%) against a twofold series from 125 000 to 4 096 000 000, presumably the number of reads; series: GENCODE12 detection, PrimePCR detection, GENCODE12 quantification, PrimePCR quantification]


Saturation analysis MAQC A - % of PrimePCR

[Figure: percentage (0% to 100%) against a twofold series from 125 000 to 4 096 000 000, presumably the number of reads; series: ribo-depletion RNAseq - detection, poly-A RNAseq - detection, qPCR - detection, ribo-depletion RNAseq - quantification, poly-A RNAseq - quantification, qPCR - quantification]


Confirmation rate of novel junctions

junction prediction   junctions   confirmed   confirmation rate
multiple algorithms (Cstar + Magic + Subread)   136   136   100%
single algorithm   24   20   83%

•  novel exon
    •  one of the primers in the novel exon
•  novel junction
    •  one of the primers overlapping the novel junction ≥ 5 bases at either side of junction
•  size analysis to confirm expected size for novel transcripts
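The junction-spanning rule and the confirmation rates can be expressed compactly. The sketch below assumes that "≥ 5 bases at either side" means at least 5 primer bases on each side of the junction, and it uses a hypothetical coordinate convention (0-based positions on the spliced transcript, primer end exclusive); neither the function names nor the example coordinates come from the slides.

```python
def overlaps_junction(primer_start, primer_end, junction, min_flank=5):
    """True if the primer covers at least `min_flank` bases on each side.

    Hypothetical convention: the primer occupies transcript positions
    primer_start .. primer_end - 1, and the junction lies just before
    position `junction`.
    """
    upstream = junction - primer_start
    downstream = primer_end - junction
    return upstream >= min_flank and downstream >= min_flank

def confirmation_rate(confirmed, tested):
    """Percentage of predicted junctions confirmed by qPCR, rounded."""
    return round(100 * confirmed / tested)

print(overlaps_junction(100, 120, 105))  # True: 5 bases upstream, 15 downstream
print(overlaps_junction(100, 120, 103))  # False: only 3 bases upstream
print(confirmation_rate(136, 136))       # 100
print(confirmation_rate(20, 24))         # 83
```

Requiring a minimum flank on both sides keeps the primer from annealing to a transcript that contains only one of the two exons.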

Conclusions - I

•  Assay design and in silico verification
    •  Transcript coverage
    •  SNPs and secondary structures
    •  Specificity prediction
•  Empirical assay validation
    •  Efficiency in 90-110% range
    •  Stringent specificity analysis by massively parallel amplicon sequencing
•  validated assays for human, mouse & rat coding genes: PrimePCR


Conclusions - II

•  qPCR based transcriptome profiling

•  Samples from MAQC/SEQC study

•  PCR data as benchmark for evaluation of RNAseq

•  qPCR benefits: high sensitivity and large dynamic range

•  good correlation with RNAseq results

•  for individual genes, RNAseq ≤ 100 M reads gives lower sensitivity than qPCR

•  the majority of novel junctions identified by RNAseq can be confirmed by qPCR
