annotation-agnostic differential expression analysis

Post on 15-Apr-2017


Annotation-agnostic differential expression analysis

Leonardo Collado-Torres @fellgernon #ENAR2016

www.slideshare.net/lcolladotor

Motivating problem: identify and validate regions of the genome that change expression when analyzing tissues with potentially incomplete transcriptome annotation

RNA-seq reads

Genome (DNA)

RNA transcripts (many possible variants)

Measuring gene expression: RNA-seq

Adapted from @jtleek

Genome (DNA)

Mapped reads

Adapted from @jtleek

Common analysis pipelines:

• Feature counting (gene or exon level)
• Transcript assembly

Challenges in counting

http://www-huber.embl.de/users/anders/HTSeq/doc/count.html

Annotation variation

Frazee et al, Biostatistics, 2014

DER finder approach

• Find contiguous base pairs with Differential Expression signal → DE Regions or DERs

• Find nearest annotated feature

Frazee et al, Biostatistics, 2014
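
The second step of the approach above, matching each DER to its nearest annotated feature, can be illustrated with a minimal sketch. This is not the derfinder implementation: the function names, the 0-based half-open interval convention, and the toy exon coordinates are all assumptions made for illustration.

```python
from typing import Dict, Tuple

Interval = Tuple[int, int]  # assumed convention: 0-based, half-open (start, end)


def interval_gap(a: Interval, b: Interval) -> int:
    """Number of bases separating two intervals (0 if they overlap or touch)."""
    return max(b[0] - a[1], a[0] - b[1], 0)


def nearest_feature(region: Interval, features: Dict[str, Interval]) -> str:
    """Name of the annotated feature with the smallest gap to the region.

    Ties are resolved by dictionary order, an arbitrary choice for this sketch.
    """
    return min(features, key=lambda name: interval_gap(region, features[name]))


# Hypothetical annotation with two exons on one chromosome.
toy_features = {"exon_A": (100, 200), "exon_B": (500, 650)}

print(nearest_feature((230, 260), toy_features))  # region between the exons
print(nearest_feature((520, 530), toy_features))  # region inside exon_B
```

A production tool would also handle chromosomes and strands and use an indexed search instead of a linear scan over all features.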

coverage vector 2 6 0 11 6

Genome (DNA)

Read coverage

Adapted from @jtleek
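
A coverage vector like the one on this slide records, for each base of the genome, how many mapped reads overlap it. Below is a minimal sketch of that computation. The function name, the 0-based half-open read coordinates, and the toy reads are assumptions for illustration, and the toy output is not meant to reproduce the numbers on the slide.

```python
from typing import List, Tuple


def coverage_vector(reads: List[Tuple[int, int]], genome_length: int) -> List[int]:
    """Per-base read coverage for reads given as 0-based, half-open (start, end)."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1

    # A running sum of the difference array gives the depth at each base.
    coverage = []
    depth = 0
    for delta in diff[:genome_length]:
        depth += delta
        coverage.append(depth)
    return coverage


# Hypothetical reads on a 6 bp toy genome.
toy_reads = [(0, 3), (1, 4), (1, 2), (4, 6)]
print(coverage_vector(toy_reads, 6))  # [1, 3, 2, 1, 1, 1]
```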

Jaffe et al, Nat. Neuroscience, 2015

Finding DERs by expressed-regions

Samples per brain region:

CBC: 28, MD: 24, STR: 28, AMY: 31, HIP: 32, DFC: 34, VFC: 30, MFC: 32, OFC: 30, M1C: 25, S1C: 26, IPC: 33, A1C: 30, STC: 35, ITC: 33, V1C: 33

Total N samples: 487

BrainSpan data

Coverage data from BrainSpan: http://download.alleninstitute.org/brainspan/MRF_BigWig_Gencode_v10/

• Data: 3 tissues (liver, testis, heart), 8 samples each
• Align with
• Identify expressed regions with derfinder
  – Adjust coverage (40 million)
  – Find expressed regions (cutoff 5)
  – Discard ERs < 9 bp

GTEx: DERs via expressed-regions
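
The expressed-regions steps on this slide (adjust coverage, apply a cutoff, discard short regions) can be sketched as below. This is not derfinder's API, and it rests on three assumptions: "40 million" is a target library size used to rescale each sample, the cutoff is applied with a strict "greater than" to a mean coverage vector, and intervals are 0-based and half-open. All function names are hypothetical.

```python
from typing import List, Tuple


def adjust_coverage(coverage: List[float], total_mapped_reads: int,
                    target_library_size: int = 40_000_000) -> List[float]:
    """Rescale one sample's coverage to a common library size (assumed meaning)."""
    scale = target_library_size / total_mapped_reads
    return [c * scale for c in coverage]


def find_expressed_regions(mean_coverage: List[float], cutoff: float = 5.0,
                           min_length: int = 9) -> List[Tuple[int, int]]:
    """Contiguous runs with mean coverage above the cutoff, at least min_length bp."""
    regions = []
    start = None
    for pos, value in enumerate(mean_coverage):
        if value > cutoff:
            if start is None:
                start = pos  # a new run begins here
        elif start is not None:
            if pos - start >= min_length:  # discard runs shorter than min_length
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the vector.
    if start is not None and len(mean_coverage) - start >= min_length:
        regions.append((start, len(mean_coverage)))
    return regions


# Toy mean coverage: a 10 bp run above the cutoff and a 4 bp run that is too short.
toy_mean_coverage = [0.0] * 3 + [8.0] * 10 + [2.0] * 2 + [9.0] * 4 + [0.0]
print(find_expressed_regions(toy_mean_coverage))  # [(3, 13)]
```

In a multi-sample setting, each sample would be passed through adjust_coverage first and the adjusted vectors averaged per base before calling find_expressed_regions.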

Presence of intronic ERs

Can strictly intronic ERs differentiate tissues?

PCs differentiate tissues

Differential intronic ERs | exonic ERs


Simulation setup

3 replicates:

• 2 groups, each with 5 samples
• ~2 million paired-end reads for chr17
• 1/6 high, 1/6 low in group 2 vs group 1

Annotation:

• complete
• missing 20% of transcripts (8.28% exons)

Reference set:

• 3868 exons that overlap only 1 transcript

Simulation results

• Similar power to methods that have complete annotation

• Methods with incorrect annotation lose a lot of power

• Higher empirical FDR/FPR
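
The comparison above is stated in terms of empirical power, FDR and FPR. As a minimal sketch of how such rates could be computed when the truth is known, as in a simulation, the function below compares a set of called features against the truly differentially expressed ones. The function name, its inputs and the toy exon labels are hypothetical, and this is not the evaluation code behind the slides.

```python
from typing import Dict, Set


def empirical_rates(called: Set[str], truly_de: Set[str],
                    all_features: Set[str]) -> Dict[str, float]:
    """Empirical power, FDR and FPR of a set of calls against a known truth set."""
    true_positives = len(called & truly_de)
    false_positives = len(called - truly_de)
    negatives = all_features - truly_de  # features that are truly not DE

    return {
        # Fraction of truly DE features that were called.
        "power": true_positives / len(truly_de) if truly_de else float("nan"),
        # Fraction of calls that are false; defined as 0 when nothing is called.
        "fdr": false_positives / len(called) if called else 0.0,
        # Fraction of truly non-DE features that were called.
        "fpr": false_positives / len(negatives) if negatives else float("nan"),
    }


# Toy reference set of six exons, three of them truly DE.
toy_all = {"e1", "e2", "e3", "e4", "e5", "e6"}
toy_truth = {"e1", "e2", "e3"}
toy_called = {"e1", "e2", "e4"}
print(empirical_rates(toy_called, toy_truth, toy_all))
```

With these toy sets, two of three truly DE exons are recovered and one of three calls is false, so power is 2/3 and both FDR and FPR are 1/3.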

Collado-Torres et al, F1000Research, 2015

regionReport

Motivating problem: identify and validate regions of the genome that change expression when analyzing tissues with potentially incomplete transcriptome annotation

derfinder permits discovery of novel expressed regions

1. we identified expressed intronic regions that differentiate tissues independently of the nearest exonic region

2. we have developed tools for reproducible/shareable reporting

Acknowledgements

Hopkins: Jeffrey Leek, Alyssa Frazee, Abhinav Nellore, Chris Wilks, Ben Langmead

LIBD: Andrew Jaffe, Jooheon Shin, Nikolay Ivanov, Amy Deep, Ran Tao, Yankai Jia, Thomas Hyde, Joel Kleinman, Daniel Weinberger

Harvard: Rafael Irizarry, Michael Love

Funding: NIH, LIBD, CONACyT México

References + software + code

• Collado-Torres, et al. bioRxiv (2015) doi:10.1101/015370
  – http://bioconductor.org/packages/derfinder
  – http://lcolladotor.github.io/derSupplement/

• Collado-Torres, et al. F1000Research (2015) doi:10.12688/f1000research.6379.1
  – http://www.bioconductor.org/packages/regionReport
  – http://lcolladotor.github.io/regionReportSupp/

• Nellore, Collado-Torres, et al. bioRxiv (2015) doi:10.1101/019067
  – rail.bio

• Nellore, …, Collado-Torres, et al. bioRxiv (2016) doi:10.1101/038224
  – intropolis.rail.bio

• Jaffe, Shin, Collado-Torres, et al. Nat. Neurosci. (2015) doi:10.1038/nn.3898
  – https://github.com/lcolladotor/libd_n36
  – https://github.com/lcolladotor/enrichedRanges

• Frazee, et al. Biostatistics (2014) doi:10.1093/biostatistics/kxt053
  – https://github.com/leekgroup/derfinder
