transcriptome-wide association study of schizophrenia and ... · metabolic syndrome in men...
TRANSCRIPT
Transcriptome-wide association study of schizophrenia and
chromatin phenotypes
Alexander (Sasha) Gusev
Harvard TH Chan School of Public Health
Challenges of studying complex disease
• Goal: specific mechanistic hypotheses.
• Schizophrenia is highly polygenic, with up to 70% of 1MB regions harboring an association
• For an associated SNP, causal mechanism often not through nearest gene [… or mediated by CNV / TFBS / etc]
• Relevant molecular features (RNA-seq/ChIP-seq in brain) are difficult to collect
Loh et al. 2015b Nat. Genet.; Sekar et al. 2016 Nature
Methods for molecular + GWAS data:
Gusev et al. 2016 Nat. Genet.
Outline
1. TWAS: [cis] molecular features can be predicted into un-assayed samples
2. TWAS to find expression associated with schizophrenia
3. TWAS to find expression associated with ChIP-Seq chromatin activity
Outline
1. TWAS: [cis] molecular features can be predicted into un-assayed samples
2. TWAS to find expression associated with schizophrenia
3. TWAS to find expression associated with ChIP-Seq chromatin activity
TWAS using individual-level data
C T G T A
G T C A A
C A G T C
~
C A G T C
C A G T C
C A C A A
G T G A C
~
individual GWAS data
β1 β2 β3 β4 β5
C A C T C
G T C T A
Linear genetic
predictor
60-80% prediction
accuracy
expression reference
Gamazon et al. 2015 Nat Genet; Gusev et al. 2016 Nat Genet
TWAS using summary-level data
2 1 2 5 1
7
R2 > 0.9 w/
individual-level
prediction
summary GWAS data
C T G T A
G T C A A
C A G T C
~
β1 β2 β3 β4 β5
Linear genetic
predictor
60-80% prediction
accuracy
expression reference
Gusev et al. 2016 Nat Genet; Barbeira et al. biorxiv
Outline
1. TWAS: [cis] molecular features can be predicted into un-assayed samples
2. TWAS to find expression associated with schizophrenia
3. TWAS to find expression associated with ChIP-Seq chromatin activity
TWAS for schizophrenia
CommonMind Consortium (CMC) 1: 500 brain RNA-seq alternative splice variants 2 Netherlands Twin Registry (NTR)3: 1,200 blood array METabolic Syndrome In Men (METSIM): 600 adipose RNA-seq Young Finns Study (YFS): 1,300 blood array
4 Expression Reference Panels
GWAS: Psychiatric Genetics Consortium ~80,000 SCZ samples [PGC 2014 Nature]
[1] Fromer et al. 2016 Nat Neuro; [2] Li et al 2016 Science; [3] Wright et al. 2014 Nat Genet
C T G T A
G T C A A
C A G T C
~
2 1 2 5 1
7 TWAS
157 gene associations (45 in novel loci) 80 splice-variant associations
conditioning on expression explains all genome-wide significant effect at known loci
TWAS for schizophrenia Z-
sco
re
chromosomes
polygenic effects across thousands of genes
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
CMC/brain genes
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
CMC/brain splicing
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
METSIM/adipose
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
YFS/blood
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
NTR/blood
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
ALL jointly
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05brain 0.05
0.04
0.03
0.02
0.01
polygenic score using predicted expression, evaluated in independent SCZ cohort
P-value threshold
R2
w/ SCZ status
more associations with brain and splicing
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
CMC/brain genes
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
CMC/brain splicing
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
METSIM/adipose
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
YFS/blood
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
NTR/blood
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05
5e−0
8
1e−0
6
1e−0
4
0.00
1
0.01
0.05 0.
1
0.2
0.5 1
ALL jointly
P−value threshold
liabi
lity−
scal
e R
2
0.00
0.01
0.02
0.03
0.04
0.05brain brain splice-var
adipose blood blood 0.05
0.04
0.03
0.02
0.01
P-value threshold
all predicted cis-expression explain 26% of total SCZ SNP-heritability (upper bound on mediated effect)
R2
w/ SCZ status
polygenic score using predicted expression, evaluated in independent SCZ cohort
Outline
1. TWAS: [cis] molecular features can be predicted into un-assayed samples
2. TWAS to find expression associated with schizophrenia
3. TWAS to find expression associated with ChIP-Seq chromatin activity
Chromatin variation is under genetic control
[CEU] Waszak et al. 2015 Cell:
Local chromatin-expression interactions form modules.
[YRI] Grubert et al. 2015 Cell:
Chromatin variation and expression under shared genetic control,
genome
connecting genes to chromatin peaks with top hits
for every gene:
expression - QTLs
for every peak
chromatin - QTLs
TWAS for expression-chromatin associations
C T G T A
G T C A A
C A G T C
~
C A G T C
C A G T C
C A C A A
G T G A C
~
individual epigenomic data
β1 β2 β3 β4 β5
C A C T C
G T C T A
linear genetic
predictor
expression reference
TWAS identifies 10x more gene/chromatin associations than eQTL overlap approach
0 100 200 300 400 500
0.0
0.2
0.4
0.6
0.8
1.0
Power to detect gene−mark association
# chromatin samples
Pow
er
A
● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ● ● ●
●
●
TWAS
eQTL/cQTL overlap
● ● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
TWAS
eQTL/cQTL overlap
●
● ●
●
● ●
●
●
0
20
40
60
80
European TWAS
Max distance to gene boundar y# s
ign
ific
an
t a
sso
cia
tio
ns
● ● ● ● ●
●
●
●
●
●
●
●
●
●
●
●
● ● ●
●
●●
●
●
●
●●
●
●
●
●
●
100bp 1kb 2kb 10kb 50kb 100kb 250kb 500kb
●
●
●
●
●
H3K4ME1
H3K4ME3
H3K27AC
PU1
RPB2
B
● ● ●
●● ● ● ●
0
20
40
60
80
European QTL
Max distance to gene boundar y
# s
ign
ific
an
t a
sso
cia
tio
ns
●
● ●
●●
● ● ●
●
●
●
● ● ● ● ●
● ● ● ●● ●
● ●●
● ●
●● ● ● ●
100bp 1kb 2kb 10kb 50kb 100kb 250kb 500kb
●
●
●
●
●
H3K4ME1
H3K4ME3
H3K27AC
PU1
RPB2
●●
●
●
●
●
●
●
0
50
100
150
200
250
300
Yoruban TWAS
Max distance to gene boundar y
# s
ign
ific
an
t a
sso
cia
tio
ns
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●●
●● ● ●
●●
● ●
100bp 1kb 2kb 10kb 50kb 100kb 250kb 500kb
●
●
●
●
H3K4ME1
H3K4ME3
H3K27AC
DHS
● ●●
●● ● ● ●
0
50
100
150
200
250
300
Yoruban QTL
Max distance to gene boundar y#
sig
nific
an
t a
sso
cia
tio
ns
●
●● ● ● ● ● ●
●
● ●● ● ● ● ●
● ● ● ● ● ● ● ●
100bp 1kb 2kb 10kb 50kb 100kb 250kb 500kb
●
●
●
●
H3K4ME1
H3K4ME3
H3K27AC
DHS
chromatin-TWAS overlapping top hits
distance to the gene TSS
Validating TWAS gene-chromatin associations
brain/blood reference
expression
Discovery: TWAS gene–chromatin
associations
Validating TWAS gene-chromatin associations
brain/blood reference
expression
blood target expression
Discovery: TWAS gene–chromatin
associations Replication: measured LCL RNA-seq
expression in chromatin samples
Chromatin variation is highly associated with expression at predicted gene-peak associations
replication R2 ≈ total expression cis-hg2
ChIP-seq and expression from Waszak et al. 2015 Cell
NTR CMC
PU1
R2(m
ark
,GE
)
0.0
0.1
0.2
0.3
0.4
0.5
TWAS−linked peaks
all nearby peaks
random
* *
NTR CMC
RPB2
R2(m
ark
,GE
)
0.0
0.1
0.2
0.3
0.4
0.5
TWAS−linked peaks
all nearby peaks
random
****
**** ****
NTR CMC
H3K4ME1
R2(m
ark
,GE
)
0.0
0.1
0.2
0.3
0.4
0.5
TWAS−linked peaks
all nearby peaks
random
** *** ** ***
NTR CMC
H3K4ME3
R2(m
ark
,GE
)
0.0
0.1
0.2
0.3
0.4
0.5
TWAS−linked peaks
all nearby peaks
random
**
*
***
**
NTR CMC
H3K27AC
R2
(ma
rk,G
E)
0.0
0.1
0.2
0.3
0.4
0.5
TWAS−linked peaks
all nearby peaks
random
** **** ** ****
R2 chromatin-expression
(in LCLs)
blood brain
0.5
0.3
0.0
predicted gene-mark pairs background genes
Enrichment for chromatin associations with SCZ TWAS genes
157 SCZ TWAS genes 42 with chromatin TWAS associations
4x background (P=1x10-11)
significant enrichment for SCZ splice variants with chromatin associations
2 1 2 5 1
7
TWAS provides mechanistic hypotheses
108 SCZ loci 26/108 have TWAS splicing association
48/108 have TWAS splicing or gene association (76% not nearest gene)
34/48 also have chromatin association
(85% not in the promoter)
N=80,000
N=3,500
N=150
TWAS provides mechanistic hypotheses
108 SCZ loci 26/108 have TWAS splicing association
48/108 have TWAS splicing or gene association (76% not nearest gene)
34/48 also have chromatin association
(85% not in the promoter)
N=80,000
N=3,500
N=150
TWAS provides mechanistic hypotheses
108 SCZ loci 26/108 have TWAS splicing association
48/108 have TWAS splicing or gene association (76% not nearest gene)
34/48 also have chromatin association
(85% not in the promoter)
N=80,000
N=3,500
N=150
Conclusion / Future Work
• TWAS implicates specific genes and their regulators.
• Epigenetic mediators can be a common disease mechanism
• Yields zozens of mechanistic hypotheses for schizophrenia loci (following up with experimental validation)
Nick Mancuso Hilary Finucane Yakir Reshef Lingyun Song Alexias Safi Edwin Oh Steve McCarroll Ben Neale Mick O’Donovan Roel Ophoff
recruiting in new lab at Dana Farber Cancer Institute &
Harvard Medical School
gusevlab.org
Gregory Crawford Nicholas Katsanis Patrick F Sullivan Bogdan Pasaniuc Alkes L Price
Psychiatric Genomics Consortium CommonMind Consortium PsychENCODE 1000 Genomes
pre-print on biorxiv