
Supplemental Data. Cubillos et al. (2014). Plant Cell 10.1105/tpc.114.130310

Supplemental Figure 1. RPKM scatter plot for reads spanning SNPs. For each gene, the log10 RPKM (Reads Per Kilobase per Million mapped reads) value for all reads mapping to the gene is plotted against the log10 RPKM value for the reads containing a polymorphic region, which were used to quantify allele-specific expression ratios.
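As a rough illustration of the quantity plotted in this figure, RPKM can be computed from a read count, a feature length and the library size. The function and argument names below are hypothetical and are not taken from the study's pipeline; the numbers are made up.

```python
import math

def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads Per Kilobase per Million mapped reads."""
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    # reads / (length in kb) / (library size in millions)
    return read_count * 1e9 / (length_bp * total_mapped_reads)

def log10_rpkm(read_count, length_bp, total_mapped_reads):
    """log10 RPKM, or None when there are no reads (log of zero is undefined)."""
    value = rpkm(read_count, length_bp, total_mapped_reads)
    return math.log10(value) if value > 0 else None

# Illustrative numbers only: 500 reads on a 2 kb gene in a 10-million-read library.
gene_level = log10_rpkm(500, 2000, 10_000_000)
```

For the SNP-level axis, the same calculation would be applied to the reads overlapping the polymorphic region; the caption does not state which length was used for that region, so it is left open here.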


Supplemental Figure 2. Percentage of reads specifically aligning to the Col (red) and Cvi (blue) genomes before and after the normalisation procedure. Each pair of pie charts represents one setup: parental pools or hybrids in stress ('S') or control ('NS') conditions.


Supplemental Figure 3. Comparison of allele-specific expression ratios obtained from RNA-seq or pyrosequencing in different samples and conditions. Each dot corresponds to a gene.


Supplemental Figure 4. Overlap comparison for the additive effect estimated from local-eQTL and genes under significant allele-specific expression. Scatter plot for genes sharing directionality between the eQTL study (Cubillos et al., 2012b) and the present RNA-seq study. The additive effect in local-eQTLs previously detected in Col x Cvi F6 RILs versus the allele-specific expression ratio in F1 individuals grown in control conditions is shown.


Supplemental Figure 5. Expression changes across conditions in F1 and parental pools estimated by DESeq. (A) Fold change scatter plot in F1 hybrids. Differentially expressed genes are depicted in red. (B) Bar graph for expression changes in 6 genes with q<0.05 estimated by DESeq (red) and RT-qPCR validation (blue). UBC was used as the control gene to normalise expression values. Three PCR reactions were performed per replicate using three biological samples per genotype, with error bars representing standard deviations. (C) Fold change scatter plot in parental pools. Differentially expressed genes are depicted in red.


Supplemental Table 1. GO term enrichment for genes differentially expressed between conditions in F1 hybrids. Enrichment for Biological processes was estimated using agriGO. P-values were estimated using a Fisher test and adjusted for multiple testing using the Yekutieli FDR correction.

GO category  Term type           Term                                  p-value   q-value
GO:0050896   Biological Process  response to stimulus                  1.70E-09  3.50E-06
GO:0042221   Biological Process  response to chemical stimulus         1.50E-08  1.50E-05
GO:0010033   Biological Process  response to organic substance         2.10E-06  0.0011
GO:0009719   Biological Process  response to endogenous stimulus       1.60E-06  0.0011
GO:0006950   Biological Process  response to stress                    4.50E-06  0.0018
GO:0009737   Biological Process  response to abscisic acid stimulus    1.00E-05  0.0034
GO:0009753   Biological Process  response to jasmonic acid stimulus    4.00E-05  0.01
GO:0006979   Biological Process  response to oxidative stress          3.60E-05  0.01
GO:0009611   Biological Process  response to wounding                  5.70E-05  0.013
GO:0009605   Biological Process  response to external stimulus         9.20E-05  0.017
GO:0006790   Biological Process  sulfur metabolic process              9.00E-05  0.017
GO:0009725   Biological Process  response to hormone stimulus          0.00011   0.017
GO:0016137   Biological Process  glycoside metabolic process           0.00011   0.017
GO:0042254   Biological Process  ribosome biogenesis                   0.00012   0.017
GO:0008152   Biological Process  metabolic process                     0.00014   0.018
GO:0022613   Biological Process  ribonucleoprotein complex biogenesis  0.00017   0.021
GO:0031407   Biological Process  oxylipin metabolic process            0.00028   0.027
GO:0019760   Biological Process  glucosinolate metabolic process       0.00027   0.027
GO:0016143   Biological Process  S-glycoside metabolic process         0.00027   0.027
GO:0019757   Biological Process  glycosinolate metabolic process       0.00027   0.027
GO:0044237   Biological Process  cellular metabolic process            0.00023   0.027
GO:0009628   Biological Process  response to abiotic stimulus          0.0003    0.027
GO:0044281   Biological Process  small molecule metabolic process      0.00032   0.028
GO:0009058   Biological Process  biosynthetic process                  0.00033   0.028
GO:0006631   Biological Process  fatty acid metabolic process          0.00048   0.038
GO:0006412   Biological Process  translation                           0.00049   0.038
GO:0006970   Biological Process  response to osmotic stress            0.00064   0.048
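The two statistical steps named in the table caption, a Fisher test followed by the Yekutieli FDR correction, can be sketched as follows. This is a minimal illustration, not agriGO's implementation: the one-sided form of the test is an assumption, the function names are hypothetical, and because the number of GO terms tested is not given here, the q-values in the table cannot be reproduced from the listed p-values alone.

```python
from math import comb

def fisher_enrichment_p(hits, selected, annotated, total):
    """One-sided Fisher exact p-value: P(X >= hits) under the hypergeometric null.

    hits      -- selected genes carrying the GO term
    selected  -- number of selected (e.g. differentially expressed) genes
    annotated -- genes in the background carrying the GO term
    total     -- size of the background gene set
    """
    upper = min(selected, annotated)
    tail = sum(comb(annotated, k) * comb(total - annotated, selected - k)
               for k in range(hits, upper + 1))
    return tail / comb(total, selected)

def yekutieli_adjust(p_values):
    """Benjamini-Yekutieli adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    harmonic = sum(1.0 / k for k in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m * harmonic / rank)
        adjusted[i] = running_min
    return adjusted
```

The harmonic-sum factor is what distinguishes the Yekutieli correction from the Benjamini-Hochberg one; it makes the adjustment valid under arbitrary dependence between tests, which suits overlapping GO categories.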


Supplemental Table 2. Sequencing primers used in the qPCR validation.

Gene       Forward primer           Reverse primer
AT2G38870  TGTCGACCGAATGTCCTAGG     ACTCCCATCCAAAATCACGG
AT1G75040  TCACATTCTCTTCCTCGTGTTC   AGGGCAATTGTTCCTTAGAGTG
AT3G12580  TCGATCTCGGTACAACCTACTC   TCTTCTTCCGATTAGACGCTTAG
AT2G25680  GGTGGGTGTGTGGCACTGT      AGCACACCAACCGGAAACTT
AT3G30720  AAGACCAATAGAGAGCAGGAA    CCTGATGTAGAAGTGTGAGG
AT5G55450  CTGCCCGTACAAGCCTTATC     TCAGCTCAGTTCTTCATGCTTAG
UBC        CTGCGACTCAGGGAATCTTCTAA  TTGTGCCATTGAATTGAACCC


Variance component modeling

To analyse the contribution of variance from environmental factors and from cis or trans genetic effects, the abundance profile of each gene was fit by a linear variance component model. The expression profile $y_g$ of a gene $g$ is modelled by

$$y_g = \underbrace{\sum_{f=1}^{F} \beta_{f,g}\, x_f}_{\text{linear factors}} + \underbrace{\psi_g}_{\text{noise}}.$$

Here, the covariates $x_f$ reflect relevant contrasts in the overall read distribution in $y_g$, including allele-specific effects (cis effects), the contrast of ASE in hybrids and pools (trans effects), environment, and the interaction of environment with either cis or trans factors.

Owing to the small sample size in this study, care is needed to avoid overfitting, which results in overoptimistic variance contributions.

Variance function. First, the noise variance $\psi_i$ for individual samples $i$ is estimated using a variance function that shares strength across genes. Here, we follow the approach used by DESeq [1] and fit a relationship between mean expression and variance across all genes genome-wide, mitigating the limitations of estimating variances from small sample sets. Similar approaches have previously been applied in the analysis of allele-specific expression [2]. The fitted model is used to obtain a per-sample and per-gene variance $\sigma^2_{g,i} = f_i(y_{g,i})$, using a per-sample variance function $f_i$.
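The idea of a variance function that shares strength across genes can be illustrated with a small sketch. The parametric form used here, variance = mean + alpha * mean^2, is a simplifying assumption chosen for brevity; DESeq fits the mean-variance relationship by regression across genes, and the exact form used in this study is not stated. All function names are hypothetical.

```python
from statistics import mean, variance

def gene_moments(replicates_by_gene):
    """Per-gene mean and unbiased sample variance across replicates."""
    means = [mean(r) for r in replicates_by_gene]
    variances = [variance(r) for r in replicates_by_gene]
    return means, variances

def fit_dispersion(means, variances):
    """Least-squares estimate of alpha in var = mu + alpha * mu**2.

    One alpha is estimated from all genes at once, so each gene borrows
    information from the rest instead of relying on its own 2-3 replicates.
    """
    numerator = sum((v - m) * m * m for m, v in zip(means, variances))
    denominator = sum(m ** 4 for m in means)
    if denominator == 0:
        return 0.0
    return max(numerator / denominator, 0.0)  # a negative dispersion is clipped

def make_variance_function(alpha):
    """Return f(y), the variance assigned to an expression value y."""
    return lambda y: y + alpha * y * y
```

Given replicate counts for one sample type, `fit_dispersion(*gene_moments(data))` yields a single `alpha`, and `make_variance_function(alpha)` then plays the role of the per-sample function f_i in the text.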

Random effect modeling. Second, we model the individual linear effect terms as random, circumventing the common problems of overfitting for large numbers of fixed effects: $\beta_{f,g} \sim \mathcal{N}(0, \delta^2_{f,g})$. Marginalizing over the linear weights, the resulting random effect model takes the form

$$p\left(y_g \mid \{\delta^2_{f,g}\}, \{\sigma^2_{g,i}\}\right) = \mathcal{N}\left(y_g \,\Big|\, 0,\; \sum_{f=1}^{F} \delta^2_{f,g}\, x_f x_f^{T} + \operatorname{diag}(\sigma^2_g)\right), \tag{1}$$

where diag denotes the diagonal noise covariance matrix reflecting the noise levels of individual measurements.
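To make Equation (1) concrete, the sketch below builds the marginal covariance from the covariates and variance parameters and evaluates the Gaussian log-density with a Cholesky factorisation. It is a dependency-free illustration under stated assumptions, not the authors' code: the function names are hypothetical, and inputs are plain Python lists (`covariates` holds F vectors of length n, `delta2` holds F variances, `sigma2` holds n noise variances).

```python
import math

def marginal_covariance(covariates, delta2, sigma2):
    """K = sum_f delta2[f] * x_f x_f^T + diag(sigma2)."""
    n = len(sigma2)
    K = [[0.0] * n for _ in range(n)]
    for x, d in zip(covariates, delta2):
        for i in range(n):
            for j in range(n):
                K[i][j] += d * x[i] * x[j]
    for i in range(n):
        K[i][i] += sigma2[i]
    return K

def cholesky(K):
    """Lower-triangular L with L L^T = K (K must be positive definite)."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(K[i][i] - s)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    return L

def log_marginal_likelihood(y, covariates, delta2, sigma2):
    """log N(y | 0, K) for the random effect model."""
    L = cholesky(marginal_covariance(covariates, delta2, sigma2))
    n = len(y)
    z = []  # forward substitution: solve L z = y
    for i in range(n):
        z.append((y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    quad = sum(v * v for v in z)                       # y^T K^{-1} y
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * (quad + logdet + n * math.log(2.0 * math.pi))
```

Because every noise variance is strictly positive, the covariance is positive definite and the factorisation is always defined, whatever non-negative values the `delta2` entries take.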


Model fitting

Expression levels are first transformed to an approximately variance-stabilized scale using an Anscombe transformation. Next, to account for additional technical variability, a variance function is fit for each sample type across its 2-3 replicates. Finally, for every gene and with the variance function held fixed, the marginal likelihood (Eq. (1)) is maximized with respect to the variance parameters $\{\delta^2_{f,g}\}$:

$$\{\hat{\delta}^2_{f,g}\} = \underset{\{\delta^2_{f,g}\}}{\operatorname{argmax}}\; p\left(y_g \mid \{\delta^2_{f,g}\}, \{\sigma^2_{g,i}\}\right). \tag{2}$$

We consider 10 random restarts to mitigate possible local optima. The fitted variance parameters allow for partitioning the gene expression variance into contributions from the respective covariates ($\delta^2_{f,g}$) and the residual noise variance $\sigma^2_{g,i}$.
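The fitting procedure can be sketched for the simplest case of a single covariate (F = 1). Everything here is an illustration under assumptions: the Anscombe transform is taken in its classical Poisson form, the data are centred so that the zero-mean model applies, the likelihood uses the rank-one (Sherman-Morrison) identities, and the optimiser is a plain step-halving search on log delta^2 rather than whatever routine the authors used. Names and numbers are hypothetical.

```python
import math
import random

def anscombe(count):
    """Classical Anscombe transform; Poisson counts get variance close to 1."""
    return 2.0 * math.sqrt(count + 3.0 / 8.0)

def log_marginal(delta2, y, x, sigma2):
    """log N(y | 0, delta2 * x x^T + diag(sigma2)) for one covariate x."""
    s = sum(xi * xi / v for xi, v in zip(x, sigma2))
    t = sum(xi * yi / v for xi, yi, v in zip(x, y, sigma2))
    quad = sum(yi * yi / v for yi, v in zip(y, sigma2)) - delta2 * t * t / (1.0 + delta2 * s)
    logdet = sum(math.log(v) for v in sigma2) + math.log(1.0 + delta2 * s)
    return -0.5 * (quad + logdet + len(y) * math.log(2.0 * math.pi))

def fit_delta2(y, x, sigma2, restarts=10, seed=0):
    """Maximise the marginal likelihood over delta2 from several random starts."""
    rng = random.Random(seed)
    best_delta2, best_ll = None, -math.inf
    for _ in range(restarts):
        log_d2 = rng.uniform(-5.0, 5.0)          # random starting point
        ll = log_marginal(math.exp(log_d2), y, x, sigma2)
        step = 1.0
        while step > 1e-6:
            moved = False
            for candidate in (log_d2 + step, log_d2 - step):
                cand_ll = log_marginal(math.exp(candidate), y, x, sigma2)
                if cand_ll > ll:
                    log_d2, ll, moved = candidate, cand_ll, True
                    break
            if not moved:
                step *= 0.5                      # refine once no move helps
        if ll > best_ll:
            best_delta2, best_ll = math.exp(log_d2), ll
    return best_delta2, best_ll

counts = [120, 95, 30, 22]                       # hypothetical read counts
x = [1.0, 1.0, -1.0, -1.0]                       # hypothetical contrast covariate
transformed = [anscombe(c) for c in counts]
centre = sum(transformed) / len(transformed)
y = [v - centre for v in transformed]            # centred for the zero-mean model
sigma2 = [1.0] * len(y)                          # assumed noise variances
delta2_hat, ll_hat = fit_delta2(y, x, sigma2)
```

With one covariate the likelihood has a single maximum, so the restarts all agree; they matter in the multi-covariate case described in the text, where the same loop would run over a vector of log variances.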

References

[1] S. Anders and W. Huber. Differential expression analysis for sequence count data. Genome Biology, 11(10):R106, 2010.

[2] A. Goncalves, S. Leigh-Brown, D. Thybert, K. Stefflova, E. Turro, P. Flicek, A. Brazma, D. T. Odom, and J. C. Marioni. Extensive compensatory cis-trans regulation in the evolution of mouse gene expression. Genome Research, 22(12):2376–2384, 2012.