RNA-seq experiences and plans LUMC
Peter A.C. ’t Hoen, Human Genetics, LUMC
Pipelines
• PASSion: a pattern growth algorithm-based pipeline for splice junction detection in paired-end RNA-Seq data
Bioinformatics 28:479-86 (2012)
• eMiR: pipeline for mapping, 5p-3p resolution and annotation of miRNAs
BMC Genomics 11:716 (2010)
PASSion
PASSion: performance on simulated data
eMiR
eMiR results LUMC samples
[Figure: number of sequences per sample (samples 1-48, y-axis 0 to 3.5E+07), split into: not truncated; truncated, not aligned; truncated and aligned sequences]
eMiR results LUMC samples (2)
[Figure: percentage of reads (0-100%) per annotation class: Mt_tRNA_pseudogene, miRNA, miRNA_pseudogene, misc_RNA, misc_RNA_pseudogene, rRNA, rRNA_pseudogene, scRNA_pseudogene, snRNA, snRNA_pseudogene, snoRNA, snoRNA_pseudogene, tRNA_pseudogene, other]
Other studies: Methods
• Tag-based: one read per transcript
  • DeepSAGE: most 3’ CATG
  • DeepCAGE: 5’-end
• RNA-Seq: multiple reads per transcript
  • Whole mRNA sequencing after fragmentation
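The tag-based idea (one tag per transcript, taken at the most 3’ CATG) can be illustrated with a minimal sketch. The function name and the `tag_length` default are hypothetical choices for illustration; the slides do not specify the tag length.

```python
def most_3prime_catg_tag(transcript, tag_length=17):
    """Return 'CATG' plus the downstream tag at the most 3' CATG site.

    tag_length is an assumed, illustrative value (not from the slides).
    Returns None if there is no CATG or too little sequence downstream.
    """
    seq = transcript.upper()
    site = seq.rfind("CATG")  # rfind gives the last, i.e. most 3', occurrence
    if site == -1:
        return None
    start = site + len("CATG")
    tag = seq[start:start + tag_length]
    if len(tag) < tag_length:
        return None  # site lies too close to the 3' end for a full tag
    return "CATG" + tag


# Toy transcript with two CATG sites; only the most 3' one is used.
example = "AACATGTTTTCATGACGTACGTACGTACGTAGG"
print(most_3prime_catg_tag(example, tag_length=5))
```

Because each transcript contributes a single tag position, tag counts can be read as transcript counts without correcting for transcript length.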
DeepSAGE – sample preparation
PCR enrichment and gel purification (~85bp)
Example gene: Gapd
[Figure: values shown 14542 and 12555]
Example gene: alternative polyadenylation
[Figure: values shown 97 and 99]
Expression profiling in a human cohort
• 105 subjects with GWAS and phenotype data
• RNA isolated from total blood
• Expression profiling by deep-SAGE
• 95 passed all QC
Analysis pipeline
• Trimming / addition of nucleotides
• Genome alignment (Bowtie)
• UCSC genome browser .wiggle files for visualization
• Annotation (ENSEMBL/Biomart)
• Reads summed per gene
• OR tagwise analysis
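The "reads summed per gene" step can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the input dictionaries, the coordinate keys, and the gene names are hypothetical, and tags without an annotation are simply left out of the gene-level table.

```python
from collections import Counter


def sum_reads_per_gene(tag_counts, tag_to_gene):
    """Sum tag-level read counts into gene-level counts.

    tag_counts:  {tag_position: read_count}   (hypothetical input format)
    tag_to_gene: {tag_position: gene_name}    (e.g. from an annotation lookup)
    Tags without an annotated gene are skipped here.
    """
    gene_counts = Counter()
    for tag_position, count in tag_counts.items():
        gene = tag_to_gene.get(tag_position)
        if gene is not None:
            gene_counts[gene] += count
    return dict(gene_counts)


# Made-up example: two tags in one gene, one tag in another, one unannotated.
tag_counts = {"chr1:1000": 40, "chr1:1500": 10, "chr2:300": 7, "chr3:42": 5}
tag_to_gene = {"chr1:1000": "geneA", "chr1:1500": "geneA", "chr2:300": "geneB"}
print(sum_reads_per_gene(tag_counts, tag_to_gene))
```

The tagwise alternative keeps `tag_counts` as is, which preserves differences between tags of the same gene, such as alternative polyadenylation sites.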
SNPs: sample swaps detected
Gender-specific gene expression
[Figure: axes "Normalized expression of Y-chr genes" and "Normalized XIST expression"; groups: male, female]
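A check of this kind can be sketched as a simple rule on the two normalized values. The function names, the sample tuples, and the threshold of 1.0 are hypothetical; a real cut-off would have to be chosen from the observed distributions.

```python
def infer_sex(y_expr, xist_expr, threshold=1.0):
    """Infer sex from normalized Y-chr gene and XIST expression.

    threshold is an assumed, illustrative cut-off (not from the slides).
    """
    y_high = y_expr > threshold
    xist_high = xist_expr > threshold
    if y_high and not xist_high:
        return "male"
    if xist_high and not y_high:
        return "female"
    return "ambiguous"  # both high or both low: worth a closer look


def flag_mismatches(samples, threshold=1.0):
    """Return IDs of samples whose inferred sex differs from the recorded one.

    samples: iterable of (sample_id, recorded_sex, y_expr, xist_expr).
    """
    return [
        sample_id
        for sample_id, recorded, y_expr, xist_expr in samples
        if infer_sex(y_expr, xist_expr, threshold) != recorded
    ]


# Made-up values: S3 is recorded as female but looks male.
samples = [
    ("S1", "male", 5.2, 0.1),
    ("S2", "female", 0.0, 6.3),
    ("S3", "female", 4.8, 0.2),
]
print(flag_mismatches(samples))
```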
Contaminated samples detected
Genes associated with BMI
• Differential expression analysis
1. In edgeR (designed for count data)
2. In limma (designed for microarray data; voom: mean-variance model)
• Gender as confounder
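To show what "gender as confounder" means in the model, here is a minimal Python sketch. It is not edgeR or limma: it uses the log2 counts-per-million transform that voom applies (offsets of 0.5 on the count and 1 on the library size) followed by plain ordinary least squares, without voom's precision weights or any variance moderation. Function names and the toy data are hypothetical.

```python
import numpy as np


def log_cpm(counts):
    """log2 counts per million for a genes x samples count matrix.

    Uses the voom-style offsets: (count + 0.5) / (library size + 1) * 1e6.
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=0)  # total reads per sample
    return np.log2((counts + 0.5) / (lib_size + 1.0) * 1e6)


def bmi_slopes(counts, bmi, is_male):
    """Per-gene BMI slope from an OLS fit: expression ~ 1 + BMI + gender.

    Plain least squares only; a stand-in for illustration, not limma/edgeR.
    """
    y = log_cpm(counts)  # genes x samples
    n_samples = y.shape[1]
    design = np.column_stack([np.ones(n_samples), bmi, is_male])
    coef, *_ = np.linalg.lstsq(design, y.T, rcond=None)  # 3 x genes
    return coef[1]  # row 1 holds the BMI coefficient for every gene


# Made-up counts: gene 0 rises with BMI, genes 1 and 2 stay constant.
counts = [
    [100, 120, 150, 180, 200, 230],
    [1000, 1000, 1000, 1000, 1000, 1000],
    [5000, 5000, 5000, 5000, 5000, 5000],
]
bmi = [20, 22, 25, 28, 30, 33]
is_male = [0, 1, 0, 1, 0, 1]
print(bmi_slopes(counts, bmi, is_male))
```

Including the gender column means the BMI slope is estimated within gender, so a gene that differs only between men and women is not reported as BMI-associated merely because BMI and gender are correlated in the cohort.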
Limma and edgeR reasonably consistent
[Figure: -log10 P-values from limma and edgeR; in red: highly expressed genes]
Genes associated with BMI (N=9, FDR 0.05)
Allele specific expression detected for some genes
Helicos single molecule sequencing
Example polyA profiling on Helicos
Eleonora de Klerk
Oculopharyngeal muscular dystrophy: General switch to shorter 3’-UTRs
Eleonora de Klerk
Example RNA-Seq (Helicos)
ADAMTS8
ADAMTS15
NOV
Peter Henneman
Analysis of pre-mRNA processing
Irina Pulyakhina
splicing
pre-mRNA
mature mRNA
intermediate
mRNA
Pre-mRNA analysis tools
• map to both exon-exon junctions and introns;
• prioritize intronic alignments;
• report multiple alignments;
• deal with both low and high coverage;
• deal with indels and mismatches;
• find novel exons and splice sites;
• look for both canonical and non-canonical splice sites
GSNAP
[Figure: GSNAP example with sequences T T T T T T T G T and T T T T T T T G T . . . G T A T C G A]
TopHat
Difference between TopHat and GSNAP results:
[Figure: two panels, "GSNAP alignment" and "TopHat alignment", showing T T T T T T T G T placed against T T T T T T T G T . . . T T T T T T T G C . . . G T A T C G A]
Normal (standard) insert size
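[Figure: read-pair categories by the location of each mate (int, ex, e-i, e-e): int + int, int + ex, int + e-i, int + e-e, ex + ex, ex + e-i, ex + e-e, e-e + e-e, e-i + e-e, e-i + e-i; grouped as pre-splicing, intermediate (pre+post), and post-splicing]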
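One way to turn the mate categories into the three classes is sketched below. The mapping is an assumption made for illustration, not read from the figure: a mate labeled int or e-i is treated as evidence of unspliced RNA, a mate labeled e-e as evidence of spliced RNA, and a mate labeled ex as uninformative on its own. The expansion of the labels (intron, exon-intron boundary, exon-exon junction, exon) is likewise assumed.

```python
# Assumed interpretation of the mate labels (not stated on the slides).
UNSPLICED_EVIDENCE = {"int", "e-i"}  # mate overlaps intronic sequence
SPLICED_EVIDENCE = {"e-e"}           # mate spans an exon-exon junction


def classify_pair(mate1, mate2):
    """Classify a read pair from the labels of its two mates.

    Returns 'pre-splicing', 'post-splicing', 'intermediate' (evidence for
    both), or 'uninformative' (e.g. both mates purely exonic).
    """
    mates = {mate1, mate2}
    has_unspliced = bool(mates & UNSPLICED_EVIDENCE)
    has_spliced = bool(mates & SPLICED_EVIDENCE)
    if has_unspliced and has_spliced:
        return "intermediate"
    if has_unspliced:
        return "pre-splicing"
    if has_spliced:
        return "post-splicing"
    return "uninformative"


for pair in [("int", "int"), ("ex", "e-e"), ("e-i", "e-e"), ("ex", "ex")]:
    print(pair, classify_pair(*pair))
```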
Extremely large insert size
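[Figure: the same read-pair categories (int + int, int + ex, int + e-i, int + e-e, ex + ex, ex + e-i, ex + e-e, e-e + e-e, e-i + e-e, e-i + e-i), with the classification marked "?"]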
Insert size cut-off
Plans for GEUVADIS
• Transcription of repeat sequences such as Macro Satellite Repeats
• Study effect on local and global gene expression
• Study heterogeneity of transcripts expressed from repeats
FSHD: disease mechanism
Lemmers et al. Science 329:1650-3 (2010)
[Figure labels: No FSHD; 11-100 units; 4q35; D4Z4 (3.3 kb units); 4qA; D4Z4 contraction; FSHD; 4-type D4Z4; poly(A) tails; 4qB; No FSHD; 4qA; PAS; DUX4]
Acknowledgements
Shoaib Amini, Irina Pulyakhina, Eleonora de Klerk, Henk Buermans, Yanju Zhang, Kai Ye, Jeroen Laros, Johan den Dunnen, Gertjan van Ommen
Rick Jansen, Jeroen van Zanten, Gerard van Grootheest, Brenda Penninx, Jan Smit
Joukejan Hottenga, Gonneke Willemsen, Dorret Boomsma, Eco de Geus
NTR