1

30 Sept. 2010

Genome Sciences Centre

BC Cancer Agency, Vancouver, BC, Canada

Malachi Griffith

ALEXA-Seq analysis reveals breast cell type specific mRNA isoforms

www.AlexaPlatform.org

2

In most genes, transcript diversity is generated by alternative expression

Types of alternative expression

Gene expression

3

Transcript variation is important to the study of human disease

• Alternative expression generates multiple distinct transcript variants from most human loci

• Specific transcript variants may represent useful therapeutic targets or diagnostic markers

(Venables, 2006)

4

Massively parallel RNA sequencing

Tissues/Cell Lines: Luminal, Myoepithelial, hESCs, vHMECs

• Isolate RNAs

• Generate cDNA, fragment, size select, add linkers

• Sequence ends

• Map to genome, transcriptome, and predicted exon junctions

• Discover isoforms and measure abundance

263 million paired reads

21 billion bases of sequence
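The mapping step refers to predicted exon junctions. As a minimal sketch of how such a set of candidates could be enumerated, the snippet below assumes that every ordered pair of exons within a gene is a possible junction. The helper name `candidate_junctions` is hypothetical, and the slides do not state that ALEXA-Seq builds its junction set exactly this way; the `E6-E10` label style simply mirrors the junction labels that appear on a later slide.

```python
from itertools import combinations

def candidate_junctions(exon_count):
    """Return labels for every ordered exon pair (Ei-Ej, i < j) in one gene.

    Hypothetical helper: adjacent pairs (j == i + 1) correspond to canonical
    splicing; all other pairs correspond to potential exon-skipping events.
    """
    exons = range(1, exon_count + 1)
    return [f"E{i}-E{j}" for i, j in combinations(exons, 2)]

junctions = candidate_junctions(4)
print(junctions)  # ['E1-E2', 'E1-E3', 'E1-E4', 'E2-E3', 'E2-E4', 'E3-E4']
```

Under this assumption a gene with n exons yields n(n-1)/2 candidate junctions, which would help explain why junctions far outnumber exons in the feature summary.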

5

Pipeline overview

6

What is an ALEXA-Seq sequence ‘feature’?

Summary of features for human: ~4 million total (14% ‘known’)

• 37k genes

• 62k transcripts

• 278k exons

• 2,210k exon junctions

• 407k alternative exon boundaries

• 560k intron regions

• 227k intergenic regions
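As a quick check of the arithmetic behind the feature summary, the per-type counts on this slide can be tallied directly. The dictionary below only restates the slide's numbers (in thousands); the variable names are illustrative.

```python
# Feature counts as given on the slide, in thousands.
feature_counts_k = {
    "genes": 37,
    "transcripts": 62,
    "exons": 278,
    "exon junctions": 2210,
    "alternative exon boundaries": 407,
    "intron regions": 560,
    "intergenic regions": 227,
}

total_k = sum(feature_counts_k.values())
print(f"total: {total_k}k features")  # total: 3781k features

junction_share = feature_counts_k["exon junctions"] / total_k
print(f"exon junctions: {junction_share:.0%} of all features")  # exon junctions: 58% of all features
```

The total of roughly 3.8 million agrees with the "~4 million" headline figure, and exon junctions account for more than half of all features.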

7

Data analyzed to date

• ALEXA-Seq processing: 19 projects – REMC + 18 others

• 105 libraries (200+ lanes)

• 3.9 billion paired-end reads

• 36-mers to 75-mers

8

Output

• Expression, differential expression and alternative expression values for 3.8 million features for each library processed

• Library quality analysis

• Number of features expressed (above background)
– Genes, transcripts, exon regions, junctions, etc.

• Differential gene expression
– Ranked lists

• Alternative expression
– Ranked lists
– Alternative isoforms involving exon skipping, alternative transcript initiation sites, etc.
– Known or predicted novel isoforms

• Candidate peptides
– Ranked lists
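The output list distinguishes differential expression (DE) from alternative expression (AE). One way to make that distinction concrete is sketched below, assuming DE is a log2 ratio of a feature's expression between two libraries and AE is the same ratio after subtracting the gene-level change (a "splicing index" style measure). The formulas, function names, pseudocount and input values are illustrative assumptions and are not taken from the slides.

```python
import math

def log2_ratio(a, b, pseudocount=1.0):
    """log2 of (a / b); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((a + pseudocount) / (b + pseudocount))

def splicing_index(feature_a, gene_a, feature_b, gene_b, pseudocount=1.0):
    """Feature-level log2 change minus the gene-level log2 change."""
    feature_change = log2_ratio(feature_a, feature_b, pseudocount)
    gene_change = log2_ratio(gene_a, gene_b, pseudocount)
    return feature_change - gene_change

# Hypothetical exon whose expression drops 4-fold while its gene is unchanged:
print(splicing_index(7, 63, 31, 63))  # -2.0 (candidate AE event)

# Same exon change, but the whole gene also drops 4-fold:
print(splicing_index(7, 15, 31, 63))  # 0.0 (explained by gene-level DE)
```

In this framing, a feature is a candidate AE event only when its change is not accounted for by the change in its gene.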

9

ALEXA-Seq data browser (using REMC analysis as an example)

• Goals
– Visualization, interpretation, design of validation experiments, distribute results to internal/external collaborators

• What kinds of questions does ALEXA-Seq allow us to ask/answer?

• http://www.alexaplatform.org/alexa_seq/Breast/Summary.htm

10

Is the RNA-Seq library suitable for alternative expression analysis?

• Library summary

• Read quality

• Tag redundancy

• End bias

• Mapping rates

• Signal-to-noise

• hnRNA & gDNA contamination

• Features detected
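The "Signal-to-noise" and "Features detected" checks imply some cutoff that separates expressed features from background. A minimal sketch follows, assuming the cutoff is a high percentile of the expression values observed in intergenic regions, which are expected to be mostly silent. The 95th percentile, the function names and the example values are assumptions made for illustration; the slides do not specify how the background level is set.

```python
import statistics

def background_cutoff(intergenic_values, percentile=95):
    """Expression level below which a feature is treated as background.

    statistics.quantiles with n=100 returns 99 cut points; the cut point at
    index (percentile - 1) is the requested percentile.
    """
    cuts = statistics.quantiles(intergenic_values, n=100, method="inclusive")
    return cuts[percentile - 1]

def count_detected(feature_values, cutoff):
    """Number of features with expression strictly above the cutoff."""
    return sum(1 for value in feature_values if value > cutoff)

# Hypothetical expression values (arbitrary units).
intergenic = [0.0, 0.1, 0.1, 0.2, 0.3, 0.3, 0.5, 0.8, 1.2, 4.0]
exon_values = [0.2, 3.5, 12.0, 0.0, 45.1]

cutoff = background_cutoff(intergenic)
print(f"cutoff: {cutoff:.2f}, exons detected: {count_detected(exon_values, cutoff)}")
```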

11

Is my favorite gene expressed? Alternatively expressed?

12

What are the most highly expressed genes, exons, etc. in each library?

• Expression

• Differential expression

• Alternative expression

• Provided for each feature type (gene, exon, junction, etc.)

• Ranked lists of events

13

e.g. most highly expressed genes

14

What are the top DE and AE genes for each tissue comparison?

• Candidate genes

• Each comparison

• DE or AE events

• Gains or Losses
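The candidate lists for each comparison are split into gains and losses. A minimal sketch of producing such ranked lists from per-feature log2 changes is shown below. The function name `rank_events`, the threshold of 1.0 (a 2-fold change) and the placeholder feature names and values are all hypothetical and not taken from the ALEXA-Seq results.

```python
def rank_events(log2_changes, min_abs_change=1.0):
    """Split features into gains and losses, each ranked by magnitude of change.

    log2_changes maps a feature name to its log2 change between two libraries;
    features whose absolute change is below the threshold are dropped.
    """
    gains = sorted(
        ((name, value) for name, value in log2_changes.items() if value >= min_abs_change),
        key=lambda item: item[1],
        reverse=True,
    )
    losses = sorted(
        ((name, value) for name, value in log2_changes.items() if value <= -min_abs_change),
        key=lambda item: item[1],
    )
    return gains, losses

# Placeholder features with made-up log2 changes.
changes = {"feature_1": 3.2, "feature_2": -1.5, "feature_3": 0.4,
           "feature_4": 8.7, "feature_5": -4.0}
gains, losses = rank_events(changes)
print(gains)   # [('feature_4', 8.7), ('feature_1', 3.2)]
print(losses)  # [('feature_5', -4.0), ('feature_2', -1.5)]
```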

15

Summary page for vHMECs vs. Luminal

16

Candidate features gained in vHMECs

CD10

vHMECs vs. Luminal

17

Which exons/junctions and corresponding peptides might be suitable for antibody design?

18

Candidate peptides gained in vHMECs

vHMECs vs. Luminal

19

Example housekeeping gene (Actin; no change)

20

CD10 (used to sort myoepithelial cells)

Myoepithelial & vHMECs

Luminal

422-fold higher in Myoepithelial than Luminal

21

CD227 (used to sort luminal epithelial cells)

Myoepithelial

Luminal CD227

CD227

22

Differential gene expression of CASP14 (Caspase 14 gained in vHMECs)

23

Novel skipping of PTEN exon 6

24

Exon 12 skipping of DDX5 (p68)

25

Tissue specific isoforms of CA12

Luminal

Myoepithelial vHMECs

26

Alternative first exons of INPP4B

27

Alternative first exons of SERPINB7

28

FERM domain-containing proteins are alternatively expressed *

* (FRMD6, FRMD4A, FRMD4B are AE) (FRMD3, FRMD8 are DE)

29

Novel isoforms observed only in vHMECs

E6-E10 E7-E10

30

How reliable are predictions from ALEXA-Seq?

• Are novel junctions real?
– What proportion validate by RT-PCR and Sanger sequencing?

• Are differential/alternative expression changes observed between tissues accurate?
– How well do DE values correlate with qPCR?

• To answer these questions we performed ~400 validations of ALEXA-Seq predictions from a comparison of two cell lines…

31

Validation (qualitative)

33 of 189 assays shown. Overall validation rate = 85%

32

Validation (quantitative)

qPCR of 192 exons identified as alternatively expressed by ALEXA-Seq

Validation rate = 88%
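The two validation slides can be combined into a single pooled rate. The sketch below assumes the 85% rate applies to all 189 qualitative assays and the 88% rate to all 192 qPCR assays. The number of validated assays is not given on the slides, so it is approximated from the rounded percentages.

```python
# (number of assays, reported validation rate) for each validation type.
assays = {
    "qualitative (RT-PCR and Sanger sequencing)": (189, 0.85),
    "quantitative (qPCR)": (192, 0.88),
}

total = sum(count for count, _ in assays.values())
# Approximate validated counts, reconstructed from the rounded rates.
approx_validated = sum(count * rate for count, rate in assays.values())
pooled = approx_validated / total

print(f"{total} assays, pooled validation rate: {pooled:.1%}")  # 381 assays, pooled validation rate: 86.5%
```

The 381 assays match the "~400 validations" mentioned earlier, and the pooled estimate is close to the 86% overall rate quoted in the conclusions.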

33

Conclusions

• ALEXA-Seq approach provides comprehensive global transcriptome profile
– Input: paired-end RNA sequence data
– Output: expression, differential expression, alternative expression, candidate peptides, etc.

• Detection of both known and novel isoforms
– Subset that differ between conditions

• Predictions are highly accurate
– 86% validation rate by RT-PCR, qPCR and Sanger sequencing

• www.AlexaPlatform.org

34

Acknowledgements

Supervisor

Marco Marra

Committee

Joseph Connors

Stephane Flibotte

Steve Jones

Gregg Morin

Bioinformatics

Obi Griffith

Ryan Morin

Rodrigo Goya

Allen Delaney

Gordon Robertson

Richard Corbett

Sequencing

Martin Hirst

Thomas Zeng

Yongjun Zhao

Helen McDonald

Laboratory

Trevor Pugh

Tesa Severson

5-FU resistance

Michelle Tang

Isabella Tai

Marco Marra

Multiple Myeloma

Rodrigo Goya

Marco Marra

Neuroblastoma

Olena Morozova

Marco Marra

Morgen

Pamela Hoodless

Jacquie Schein

Inanc Birol

Gordon Robertson

Shaun Jackman

Iressa and Sutent

Obi Griffith

Steven Jones

Lymphoma

Ryan Morin

Marco Marra

Griffith M, Griffith OL, Morin RD, Tang MJ, Pugh TJ, Ally A, Asano JK, Chan SY, Li I, McDonald H, Teague K, Zhao Y, Zeng T, Delaney AD, Hirst M, Morin GB, Jones SJM, Tai IT, Marra MA. Alternative expression analysis by RNA sequencing. In review (Nature Methods).
