A primer on single-cell RNA-seq analysis
Dr Matthew Ritchie @mritchieau
UQ Winter School in Computational Biology, 4th July 2017

A brief history of gene expression technology

1. The Microarray era. Long live Microarrays…

[Figure: timeline of technology, 1995 to 2015, marking Lockhart et al., 1996; DeRisi et al., 1996; Gunderson et al., 2004. Inset: tiling array probes along a gene, 5' to 3'.]

A brief history of gene expression technology

1. The Microarray era. Long live Microarrays…
2. Dawn of the RNA-sequencing era. Long live sequencing…
3. Single-cell RNA-sequencing protocols emerge and proliferate…

[Figure: timeline of technology, 1995 to 2015, marking Lockhart et al., 1996; DeRisi et al., 1996; Gunderson et al., 2004; Cloonan et al., 2008; Marioni et al., 2008; Mortazavi et al., 2008; Tang et al., 2009; Islam et al., 2011 (STRT-seq); Hashimshony et al., 2012 (CEL-seq); Macosko et al., 2015 (Drop-seq).]

Outline

1. Pre-processing scRNA-seq data
   a) Aligning reads, dealing with barcodes
   b) Quality Control (samples & genes)
   c) Normalization
   d) Dimension reduction
2. Downstream analysis
   a) Differential expression (DE) analysis

scRNA-seq workflow

[Workflow diagram; labels in order of appearance:]

e.g. FASTQ files
•  # reads mapped to each gene
•  Unique molecular identifiers (UMIs)
•  Filter cells
•  Filter genes
Remove cell-specific biases
•  Clustering
•  Hypervariability
•  DE analysis
•  GO and Pathway analysis
•  Trajectory analysis

Raw data is typically available as a FASTQ file…

… 100s of millions more rows of data

•  FastQC is often useful for assessing sequence quality
   https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
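To make the FASTQ layout concrete, here is a minimal sketch of reading records and summarising base quality. It assumes the usual four-lines-per-record layout and Phred+33 quality encoding; the two reads are made up, and this is not how FastQC itself is implemented.

```python
import io

def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual  # drop the leading '@'

def mean_phred(qual, offset=33):
    """Mean Phred score, assuming Phred+33 ASCII encoding."""
    return sum(ord(c) - offset for c in qual) / len(qual)

# Two made-up reads; a real file holds hundreds of millions of lines.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!II\n")
records = list(read_fastq(example))
for name, seq, qual in records:
    print(name, seq, mean_phred(qual))
```

For real data a dedicated tool such as FastQC reports far more than a per-read mean.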

Lots of barcodes to deal with in scRNA-seq data…

-  Cell (sample) specific barcodes (often known sequences)
-  Molecule barcodes (UMIs, random sequences)
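As a rough sketch of what "dealing with barcodes" involves, the snippet below splits a barcode read into a UMI and a cell barcode and looks the cell barcode up in a list of known sequences. The read layout (UMI first, then cell barcode), the lengths and the barcode list are all hypothetical; the real layout depends on the protocol.

```python
def split_barcodes(read1, umi_len=6, cell_len=8):
    """Split a barcode read into (umi, cell_barcode).

    The layout and lengths here are assumptions for illustration only.
    """
    if len(read1) < umi_len + cell_len:
        raise ValueError("read shorter than UMI + cell barcode")
    return read1[:umi_len], read1[umi_len:umi_len + cell_len]

def assign_cell(cell_barcode, known_barcodes):
    """Return the cell name for an exact barcode match, else None."""
    return known_barcodes.get(cell_barcode)

known = {"AACCGGTT": "cell_1", "TTGGCCAA": "cell_2"}  # made-up known barcodes
umi, bc = split_barcodes("ACGTACAACCGGTTTTTT")
print(umi, bc, assign_cell(bc, known))
```

Exact matching is the simplest choice; tools in practice often tolerate a mismatch when assigning cell barcodes.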

Unique Molecular Identifiers (UMIs)

[Figure, from Islam et al. Nat Methods. 2014: mRNA molecules (HPRT, HPRT, GAPDH) from cell 1 are converted to cDNA, each tagged with a cell index and its own UMI (UMI1, UMI2, UMI3). Panel labels: Index, UMI, Molecule counting, PCR bias.]
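The idea behind molecule counting can be sketched in a few lines: reads sharing the same cell, gene and UMI are treated as PCR copies of one molecule, so distinct UMIs are counted instead of reads. The read tuples below are made up to mirror the figure (two HPRT molecules, one GAPDH molecule), and the sketch ignores sequencing errors in the UMI.

```python
from collections import defaultdict

def count_reads_and_molecules(alignments):
    """alignments: iterable of (cell, gene, umi) tuples, one per mapped read."""
    reads = defaultdict(int)
    umis = defaultdict(set)
    for cell, gene, umi in alignments:
        reads[(cell, gene)] += 1
        umis[(cell, gene)].add(umi)
    molecules = {key: len(seen) for key, seen in umis.items()}
    return dict(reads), molecules

# Made-up reads with uneven PCR amplification of each molecule
alignments = (
    [("cell1", "HPRT", "UMI1")] * 2
    + [("cell1", "HPRT", "UMI2")] * 3
    + [("cell1", "GAPDH", "UMI3")] * 10
)
reads, molecules = count_reads_and_molecules(alignments)
print(reads)      # read counts are inflated by amplification
print(molecules)  # UMI counts reflect the original molecules
```

Here GAPDH has twice as many reads as HPRT but half as many molecules, which is the distortion UMIs are meant to remove.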

UMI handling software

•  umitools (http://brwnj.github.io/umitools/)
   -  Fastq reformatting and bam de-duping
•  UMI-tools (Smith et al., Genome Research 2017, https://github.com/CGATOxford/UMI-tools)
   -  Model sequencing errors in UMIs using a network-based method
•  umis (Svensson et al., Nature Methods 2017, https://github.com/vals/umis/)
   -  Handles both cellular and molecular UMIs
•  scPipe (https://github.com/LuyiTian/scPipe)
   -  Simple gene-centric approach, with error correction (R-based)


Modelling errors in UMIs

Smith et al. Genome Res. 2017;27:491-499 (Figure 1E)
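A simplified sketch of a network-based approach in the spirit of the "directional" method described by Smith et al.: UMIs at the same locus are nodes, and a UMI is absorbed into a more abundant neighbour when they differ at one base and the neighbour's count is at least 2 × count − 1. This is an illustration written from that description, not the UMI-tools code; the counts are made up and all UMIs are assumed to have equal length.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length UMIs."""
    return sum(x != y for x, y in zip(a, b))

def directional_clusters(umi_counts):
    """Group UMIs that plausibly arose from one molecule via sequencing errors."""
    # Visit UMIs from most to least abundant (ties broken alphabetically).
    order = sorted(umi_counts, key=lambda u: (-umi_counts[u], u))
    assigned = set()
    clusters = []
    for root in order:
        if root in assigned:
            continue
        cluster, stack = [], [root]
        assigned.add(root)
        while stack:
            parent = stack.pop()
            cluster.append(parent)
            for child in order:
                if (child not in assigned
                        and hamming(parent, child) == 1
                        and umi_counts[parent] >= 2 * umi_counts[child] - 1):
                    assigned.add(child)
                    stack.append(child)
        clusters.append(cluster)
    return clusters

counts = {"ACGT": 100, "ACGA": 3, "TCGA": 1, "GGGG": 40, "ACTT": 60}
clusters = directional_clusters(counts)
print(len(clusters), clusters)
```

ACTT is one mismatch from ACGT but too abundant to be explained as an error, so it stays a separate molecule; the estimated molecule count is the number of clusters.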

Read alignment and gene counting

Many well established methods for aligning short reads to a reference genome/transcriptome

•  BWA (Li & Durbin. Bioinformatics 2009;25:1754–60.)
•  STAR (Dobin et al. Bioinformatics. 2013;29:15–21)
•  Rsubread (Liao et al. Nucleic Acids Res. 2013;41:e108)

Once aligned, obtain gene/transcript counts in a UMI-aware fashion

Pseudoaligner options
•  Sailfish (Patro et al. Nat Biotechnol. 2014;32:462–4)
•  Salmon (Patro et al. bioRxiv. 2015. doi:10.1101/021592.)
•  kallisto (Bray et al. Nat Biotechnol. 2016. doi:10.1038/nbt.3519.)

[Diagram: FASTQ → BAM → Table of Counts]

Characteristics of scRNA-seq data

•  High-resolution, high-dimensional and high levels of noise
•  60~70% of counts are zero
•  Each cell expresses 1,000~8,000 genes
•  Up to 100-fold difference in total coverage
•  Quality control is essential!

Gene     S1  S2  S3  S4  S5  S6  S7  …
Rp1       0   0   0   0   0   0   0  …
Sox17     0   1   6  11   2   0   0  …
Mrpl15    0   0   0   0   0   0   0  …
Lypla1   12   0   0   0   0   0  18  …
Tcea1     7   0   0  21   0   2   0  …
Rgs2      0   0   0   0   0   0   0  …
Cldn4     0   0   0   0   0   0   0  …
…         …   …   …   …   …   …   …  …

Columns: ~100–1,000 cells
Rows: ~10,000–40,000 genes
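The characteristics listed above can be checked directly on a count matrix. The sketch below uses only the seven genes and seven cells shown in the excerpt, so the numbers illustrate the calculation rather than the full dataset.

```python
# Counts copied from the excerpt above: genes in rows, cells S1..S7 in columns.
counts = {
    "Rp1":    [0, 0, 0, 0, 0, 0, 0],
    "Sox17":  [0, 1, 6, 11, 2, 0, 0],
    "Mrpl15": [0, 0, 0, 0, 0, 0, 0],
    "Lypla1": [12, 0, 0, 0, 0, 0, 18],
    "Tcea1":  [7, 0, 0, 21, 0, 2, 0],
    "Rgs2":   [0, 0, 0, 0, 0, 0, 0],
    "Cldn4":  [0, 0, 0, 0, 0, 0, 0],
}
n_cells = 7

# Total counts per cell (library size) and number of genes detected per cell
library_size = [sum(row[j] for row in counts.values()) for j in range(n_cells)]
genes_detected = [sum(row[j] > 0 for row in counts.values()) for j in range(n_cells)]

# Proportion of entries that are zero
n_zero = sum(v == 0 for row in counts.values() for v in row)
zero_fraction = n_zero / (len(counts) * n_cells)

print(library_size)
print(genes_detected)
print(round(zero_fraction, 3))
print(max(library_size) / min(library_size))  # fold difference in coverage
```

Even in this small excerpt most entries are zero and total coverage differs roughly 30-fold between cells.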

Quality control (QC)

•  Remove low-quality cells (subset by column)
   -  Filter by library size
   -  Filter by number of expressed genes in each cell
   -  Examine Mitochondrial, Ribosomal or Spike-in proportions
•  Remove low-abundance genes (subset by row)
   -  Filter by average expression level
   -  Filter by expressed in at least n cells
   -  Use a less aggressive approach for studies involving rare cells

scPipe Quality Control Metrics

[Figure: scPipe workflow (panels A to D), with the quality control metrics collected at each step]

Workflow step                           QC metric collected
Fastq reformat (bc_trim_barcode)        Number of removed reads
Reads alignment (Rsubread::align)       Alignment rate
Exon mapping (bc_exon_mapping)          Number of reads mapped to intron/exon
Barcode demultiplex (sc_demultiplex)    Reads per cell; unmatched barcodes
Gene count (sc_gene_counting)           Number of corrected UMI & filtered genes

Outputs: Gene counting matrix; Quality control information matrix; an SCData object for quality control and further downstream analysis

Pairs plots of QC Metrics can be useful

Typically 5-15% of samples are discarded in this process (50% in extreme cases).
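The cell and gene filters used in QC can be sketched as below. All thresholds, the toy matrix and the mitochondrial gene name are hypothetical; suitable cut-offs depend on the protocol and should be chosen after looking at the QC plots.

```python
def filter_cells(counts, mito_genes, min_library_size, min_genes, max_mito_fraction):
    """counts: dict gene -> list of counts per cell. Returns indices of cells kept."""
    genes = list(counts)
    n_cells = len(counts[genes[0]])
    keep = []
    for j in range(n_cells):
        total = sum(counts[g][j] for g in genes)
        detected = sum(counts[g][j] > 0 for g in genes)
        mito = sum(counts[g][j] for g in genes if g in mito_genes)
        mito_fraction = mito / total if total else 1.0  # empty cells always fail
        if (total >= min_library_size
                and detected >= min_genes
                and mito_fraction <= max_mito_fraction):
            keep.append(j)
    return keep

def filter_genes(counts, cells_kept, min_cells):
    """Keep genes expressed (count > 0) in at least min_cells of the kept cells."""
    return [g for g, row in counts.items()
            if sum(row[j] > 0 for j in cells_kept) >= min_cells]

# Toy data: 4 cells; cell 1 is nearly empty, cell 2 is dominated by mitochondrial reads
toy = {
    "GeneA":  [10, 0, 5, 8],
    "GeneB":  [4, 0, 0, 6],
    "GeneC":  [0, 1, 0, 0],
    "mt-Co1": [1, 0, 20, 2],
}
cells_kept = filter_cells(toy, {"mt-Co1"}, min_library_size=5,
                          min_genes=2, max_mito_fraction=0.5)
genes_kept = filter_genes(toy, cells_kept, min_cells=2)
print(cells_kept, genes_kept)
```

Filtering genes after cells means a gene seen only in discarded cells is also dropped; for studies of rare cell types, a lower `min_cells` would be the less aggressive choice.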

Filter samples with care!

Data from Dr James Ryall (Uni Melb)


Other Approaches to QC

•  Ilicic et al. Genome Biology 2016;17:29.
   -  Train a SVM to determine low quality samples (Fluidigm C1)
•  scater (McCarthy et al. Bioinformatics 2017;33(8):1179–86, http://bioconductor.org/packages/scater)
   -  Exploratory data analysis approach: plot QC metrics and determine outliers using selected metrics
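One common way to "determine outliers" on a QC metric is a median absolute deviation (MAD) rule. The sketch below flags cells whose log library size falls more than three MADs below the median; the cut-off of 3 and the library sizes are assumptions for illustration, and this is not the scater API.

```python
import math
import statistics

def low_outliers(values, n_mads=3.0, log=True):
    """Flag values more than n_mads MADs below the median (optionally on a log10 scale)."""
    x = [math.log10(v + 1) for v in values] if log else list(values)
    med = statistics.median(x)
    mad = statistics.median(abs(v - med) for v in x)
    threshold = med - n_mads * mad
    return [v < threshold for v in x]

library_sizes = [52000, 48000, 50500, 51000, 49500, 300]  # made-up values
flags = low_outliers(library_sizes)
print(flags)
```

Because the rule is relative to the bulk of the cells, it adapts to sequencing depth, but it assumes most cells are of good quality.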

Normalization

•  Scaling normalization for bulk RNA-seq
   -  Compute a scaling (size) factor per sample;
   -  Popular methods: TMM, DESeq;
   -  Assumes most genes are not differentially expressed between samples.
•  Methods for scRNA-seq
   -  By library size
   -  BASiCS (Vallejos et al. 2015)
   -  scran (Lun et al. 2016)
   -  ComBat in sva (Leek et al. 2012)
   -  Use spike-in RNAs

Still an open question
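The simplest option in the list, normalizing by library size, can be sketched as follows: each cell gets a size factor proportional to its total count (scaled to average 1), and counts are divided by it before a log transform. The toy counts are made up, and the methods named above (TMM, scran, BASiCS) estimate size factors in more robust ways.

```python
import math

def library_size_factors(counts):
    """Size factor per cell: library size divided by the mean library size."""
    n_cells = len(next(iter(counts.values())))
    lib = [sum(row[j] for row in counts.values()) for j in range(n_cells)]
    mean_lib = sum(lib) / n_cells
    return [size / mean_lib for size in lib]

def normalize(counts, size_factors):
    """Return log2(count / size_factor + 1) for every gene and cell.

    Assumes empty cells (size factor 0) were already removed during QC.
    """
    return {g: [math.log2(c / s + 1) for c, s in zip(row, size_factors)]
            for g, row in counts.items()}

# Three cells with identical composition but 1x, 2x and 4x coverage
toy = {"GeneA": [10, 20, 40], "GeneB": [5, 10, 20]}
sf = library_size_factors(toy)
norm = normalize(toy, sf)
print(sf)
print(norm)
```

After scaling, the three cells have identical values for each gene, which is the cell-specific bias removal this step aims for when coverage is the only difference.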

Dimension Reduction

•  Helps visualise relationships between samples
•  Popular methods: MDS, PCA, t-SNE (t-Distributed Stochastic Neighbor Embedding), etc.

MDS plot: Distance matrix
PCA plot: Covariance matrix
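To show what those two inputs look like, the sketch below builds a sample-by-sample Euclidean distance matrix (what MDS starts from) and a gene-by-gene covariance matrix (what PCA decomposes), then finds the first principal axis by power iteration. The data are made up, and power iteration is a simple stand-in for the full eigendecomposition a real PCA routine would use.

```python
import math

def distance_matrix(rows):
    """Euclidean distances between samples (rows): the input to MDS."""
    return [[math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b))) for b in rows]
            for a in rows]

def covariance_matrix(rows):
    """Gene-by-gene sample covariance (genes are columns): the matrix PCA decomposes."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    x = [[v - m for v, m in zip(row, means)] for row in rows]
    n_genes = len(means)
    return [[sum(r[i] * r[j] for r in x) / (n - 1) for j in range(n_genes)]
            for i in range(n_genes)]

def leading_component(cov, n_iter=200):
    """First principal axis of a covariance matrix by power iteration."""
    v = [1.0] + [0.5] * (len(cov) - 1)  # arbitrary starting vector
    for _ in range(n_iter):
        w = [sum(c * vi for c, vi in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return v

# Four samples, two genes; gene 2 is always twice gene 1
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
dist = distance_matrix(samples)
cov = covariance_matrix(samples)
pc1 = leading_component(cov)
print(dist[0][1], cov, pc1)
```

All variation in this toy example lies along the direction (1, 2), which is what the leading component recovers.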

Dimension Reduction with t-SNE

t-SNE applied to Mouse Blood Cell scRNA-seq data

Data from Dr Christine Biben (WEHI)

MDS applied to Mouse Blood Cell scRNA-seq data


Dimension Reduction with t-SNE

Free public 10X Genomics: 2,700 Peripheral Blood Mononuclear Cells (PBMC)

•  Useful for high-dimensional data:
   -  a large number of cells
   -  more diverse populations
•  May over-interpret results for less heterogeneous data


Notes on using t-SNE

Wattenberg, et al., "How to Use t-SNE Effectively", Distill, 2016. http://distill.pub/2016/misread-tsne/

‘Although impressive, these images can be tempting to misread.’

1. Those hyperparameters really matter.
2. Cluster sizes in a t-SNE plot mean nothing.
3. Distances between clusters might not mean anything.
4. Random noise doesn’t always look random.
5. You can see some shapes, sometimes.
6. For topology, you may need more than one plot.

Differential expression

•  Detect DE genes or markers.
•  Methods designed for bulk data: edgeR, voom, DESeq2, etc.
•  Methods developed for scRNA-seq: monocle, MAST, SCDE, etc.

Differential expression

•  Differential expression analysis by edgeR
   -  Quasi-likelihood (QL) pipeline is not appropriate;
   -  Likelihood ratio test (LRT) is recommended.

‘Generally, however, methods developed for bulk RNA-seq analysis do not perform notably worse than those developed specifically for scRNA-seq.’

Soneson and Robinson, bioRxiv 2017 http://www.biorxiv.org/content/early/2017/05/28/143289
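To illustrate the likelihood ratio test idea behind the edgeR recommendation, here is a deliberately simplified two-group test for one gene. It assumes Poisson counts with library sizes as offsets, whereas edgeR fits negative binomial models with estimated dispersions, so this is a sketch of the principle and not the edgeR procedure. The counts and library sizes are made up.

```python
import math

def poisson_lrt(counts_a, sizes_a, counts_b, sizes_b):
    """Likelihood ratio test of equal expression rate in groups A and B.

    Null: one rate per unit library size; alternative: one rate per group.
    Returns (statistic, p_value), using a chi-square with 1 degree of freedom.
    """
    total_a, total_b = sum(counts_a), sum(counts_b)
    expo_a, expo_b = sum(sizes_a), sum(sizes_b)
    rate_a = total_a / expo_a
    rate_b = total_b / expo_b
    rate_0 = (total_a + total_b) / (expo_a + expo_b)

    def term(total, rate):
        # A group with no counts contributes 0 (0 * log 0 is taken as 0)
        return total * math.log(rate / rate_0) if total > 0 else 0.0

    stat = max(2.0 * (term(total_a, rate_a) + term(total_b, rate_b)), 0.0)
    # Survival function of chi-square(1): P(X > stat) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

same = poisson_lrt([5, 5], [1.0, 1.0], [5, 5], [1.0, 1.0])
diff = poisson_lrt([0, 1, 0], [1.0, 1.0, 1.0], [10, 12, 9], [1.0, 1.0, 1.0])
print(same, diff)
```

Ignoring overdispersion makes the Poisson version overconfident on real scRNA-seq counts, which is one reason dedicated packages are used in practice.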

The value of control datasets

•  Benchmarking efforts for scRNA-seq are in their infancy
•  Lack of good control datasets for comparing analysis methods
•  The wide range of scRNA-seq protocols makes this challenging!

Summary

•  scRNA-seq is a powerful technique to study gene regulation cell by cell;
•  The counts contain high levels of noise with many dropouts;
•  Quality control is essential to remove problematic cells as well as low-abundance genes;
•  Gold standards for data analysis of scRNA-seq data have yet to emerge;
•  Methods developed for bulk data can be applied to scRNA-seq data (with due care).

Further Reading

•  Svensson et al. Moore’s Law in Single Cell Transcriptomics. https://arxiv.org/abs/1704.01379
•  Lun et al. A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor, F1000Research, 2016. https://f1000research.com/articles/5-2122/v2
•  Wattenberg, et al., "How to Use t-SNE Effectively", Distill, 2016. http://distill.pub/2016/misread-tsne/

R-based analysis pipeline for scRNA-seq data

[Figure: Bioconductor package logos. Packages named: Rsubread, scran, scPipe (www.bioconductor.org)]

Acknowledgements

Yunshun (Andy) Chen, Luyi Tian, Shian Su, Stuart Lee, Shalin Naik, Daniela Zalcenstein, Christine Biben, Roberto Bonelli

Davis McCarthy, Aaron Lun, John Marioni, James Ryall, Ernst Wolvertang

AMSI BioInfoSummer 2017

http://bis.amsi.org.au

4-8 December 2017 Monash University