Seminar, Escuela de Estadística UNALMED
Testing gene-environment interaction in generalized linear mixed models with family data
November 20, 2017
Section 1
Introduction
Gene, Environment and Disease
Gene-environment interaction
Family data and kinship matrix
Notation
Consider a sample of $N$ independent families.
$n_i$: number of members in the $i$th family ($i = 1, \dots, N$).
$Y_{ij}$: response variable for the phenotype (discrete or continuous).
$X_{ij} = (X_{ij}^1, \dots, X_{ij}^p)^T$: $p$ non-environmental covariates.
$G_{ij} = (G_{ij}^1, \dots, G_{ij}^q)^T$: $q$ observed genotypes at certain targeted genetic marker loci.
$E_{ij}$: some environmental exposure factor of interest.
$S_{ij} = (E_{ij}G_{ij}^1, \dots, E_{ij}G_{ij}^q)^T$: GE interaction.
Observed genotypes - Gij -
Each component of $G_{ij}$ is a single nucleotide polymorphism (SNP). The genetic information is coded as follows:

$$G_{ij}^k = \begin{cases} 2, & \text{if subject } j \text{ in the } i\text{th family is homozygous } BB, \\ 1, & \text{if subject } j \text{ in the } i\text{th family is heterozygous } Bb \text{ or } bB, \\ 0, & \text{if subject } j \text{ in the } i\text{th family is homozygous } bb, \end{cases}$$

with $k = 1, \dots, q$. $B$ and $b$ represent the dominant and recessive alleles, respectively. In addition, $B$ is the minor allele, that is, the allele with the lower frequency.
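The coding above, together with the interaction vector $S_{ij}$ from the Notation slide, can be illustrated with a minimal sketch. The function names `encode_genotype` and `interaction_terms` and the example subject are hypothetical and only serve the illustration.

```python
def encode_genotype(genotype: str, minor_allele: str = "B") -> int:
    """Return the number of copies of the minor allele (0, 1 or 2)."""
    if len(genotype) != 2:
        raise ValueError(f"expected a two-letter genotype, got {genotype!r}")
    return sum(1 for allele in genotype if allele == minor_allele)


def interaction_terms(exposure: float, genotypes: list[int]) -> list[float]:
    """Build S_ij = (E_ij * G_ij^1, ..., E_ij * G_ij^q)."""
    return [exposure * g for g in genotypes]


# Hypothetical subject with q = 3 SNPs and exposure E_ij = 1.5.
raw = ["BB", "bB", "bb"]
G_ij = [encode_genotype(g) for g in raw]
S_ij = interaction_terms(1.5, G_ij)
print(G_ij)  # [2, 1, 0]
print(S_ij)  # [3.0, 1.5, 0.0]
```

Counting copies of the minor allele gives the additive coding 0, 1, 2, so heterozygous `Bb` and `bB` are treated identically.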
Section 2
Model
Generalized linear mixed model (GLMM) for subject j in the ith family

$$g[E(Y_{ij} \mid \alpha_{ij})] = X_{ij}^T\beta_1 + E_{ij}\beta_2 + G_{ij}^T\theta + S_{ij}^T\gamma + \alpha_{ij}, \tag{1}$$

$$\mathrm{Var}(Y_{ij} \mid \alpha_{ij}) = \phi\,\omega_{ij}^{-1}\,\nu[E(Y_{ij} \mid \alpha_{ij})],$$

$$\alpha_i = (\alpha_{i1}, \dots, \alpha_{in_i})^T \sim N(0, 2\sigma^2\Phi_i),$$

where
$g(\cdot)$: known monotone function.
$Y_{ij} \mid \alpha_{ij}$ follows a distribution in the exponential family.
$\nu(\cdot)$: known function.
$\phi$: a scale parameter that may be known or may need to be estimated.
$\omega_{ij}$: known weights (commonly equal to 1).
$\Phi_i$: the kinship matrix; $\sigma^2$ is a parameter to be estimated.
Examples of link functions
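| Family | Link | $g(\cdot)$ | Trait | $\nu(\cdot)$ |
|---|---|---|---|---|
| Binomial | Logit | $\ln\left(\frac{\mu}{1-\mu}\right)$ | Binary | $\mu(1-\mu)$ |
| Gaussian | Identity | $\mu$ | Continuous | $\phi$ |
| Gamma | Inverse | $1/\mu$ | Continuous | $\phi\mu^2$ |
| Inverse.gaussian | Inverse squared | $1/\mu^2$ | Continuous | $\phi\mu^3$ |
| Poisson | Log | $\ln(\mu)$ | Count | $\mu$ |
| Quasi | Identity | $\mu$ | Continuous | $\phi$ |
| Quasibinomial | Logit | $\ln\left(\frac{\mu}{1-\mu}\right)$ | Binary | $\phi\mu(1-\mu)$ |
| Quasipoisson | Log | $\ln(\mu)$ | Count | $\phi\mu$ |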
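As a small illustration of how a link and a variance function are used together, the sketch below evaluates the quantity $\omega_{ij}/[\nu(\mu)g'(\mu)^2]$, which appears later as the diagonal of the weight matrix $W$, for the binomial and Poisson rows of the table. The dictionary `FAMILIES` and the function `working_weight` are hypothetical names, and only two families are included.

```python
import math

# Each family maps to (link g, derivative g', variance function nu).
FAMILIES = {
    "binomial": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda mu: 1.0 / (mu * (1.0 - mu)),
        lambda mu: mu * (1.0 - mu),
    ),
    "poisson": (
        lambda mu: math.log(mu),
        lambda mu: 1.0 / mu,
        lambda mu: mu,
    ),
}


def working_weight(family: str, mu: float, omega: float = 1.0) -> float:
    """Return omega / (nu(mu) * g'(mu)^2), one diagonal entry of W."""
    _, g_prime, nu = FAMILIES[family]
    return omega / (nu(mu) * g_prime(mu) ** 2)


w_binomial = working_weight("binomial", 0.2)  # equals mu * (1 - mu) up to rounding
w_poisson = working_weight("poisson", 3.0)    # equals mu up to rounding
print(w_binomial, w_poisson)
```

For these two canonical links the weight reduces to the variance function itself, which is why the logit and log links are convenient in practice.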
GLMM for ith Family
$$g(\mu_i^b) = X_i\beta_1 + E_i\beta_2 + G_i\theta + S_i\gamma + K_ib_i, \tag{2}$$

where
$2\Phi_i = K_iK_i^T$ (Cholesky decomposition).
$\alpha_i = K_ib_i$, with $b_i \sim N(0, \sigma^2 I_{n_i})$ and $I_{n_i}$ the identity matrix.
$\mu_i^b = E(Y_i \mid b_i)$.
$g(\mu_i^b) = (g(\mu_{i1}^b), \dots, g(\mu_{in_i}^b))^T$.
$X_i = [X_{i1} \dots X_{in_i}]^T$.
$G_i = [G_{i1} \dots G_{in_i}]^T$.
$E_i = (E_{i1}, \dots, E_{in_i})^T$.
$S_i = [S_{i1} \dots S_{in_i}]^T$.
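The reparameterization $\alpha_i = K_ib_i$ can be checked numerically. The sketch below assumes a hypothetical nuclear family (two unrelated parents and two children) and an illustrative value of $\sigma^2$; neither comes from the slides. It uses `numpy.linalg.cholesky`, which returns the lower triangular factor.

```python
import numpy as np

# Hypothetical nuclear family: two unrelated parents and two children.
# Kinship coefficients: 0.5 with oneself, 0.25 parent-child and sib-sib.
Phi = np.array([
    [0.50, 0.00, 0.25, 0.25],
    [0.00, 0.50, 0.25, 0.25],
    [0.25, 0.25, 0.50, 0.25],
    [0.25, 0.25, 0.25, 0.50],
])

K = np.linalg.cholesky(2.0 * Phi)   # lower triangular, K @ K.T equals 2 * Phi
sigma2 = 1.3                        # illustrative variance component

rng = np.random.default_rng(0)
b = rng.normal(scale=np.sqrt(sigma2), size=(4, 100_000))  # b_i ~ N(0, sigma2 * I)
alpha = K @ b                       # alpha_i = K_i b_i, one column per replicate

empirical_cov = np.cov(alpha)
print(np.round(empirical_cov, 2))   # close to 2 * sigma2 * Phi
```

Working with $b_i$ instead of $\alpha_i$ moves the family structure into the design matrix $K_i$, so the random effects themselves are independent with a common variance.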
General GLMM
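$$g(\mu^b) = X\beta + G\theta + S\gamma + Kb, \tag{3}$$

where
$\mu^b = E(Y \mid b)$.
$G = [G_1 \dots G_N]^T$.
$E = (E_1, \dots, E_N)^T$.
$S = [S_1 \dots S_N]^T$.
$K = \mathrm{diag}\{K_1, \dots, K_N\}$.
$b = [b_1 \dots b_N]^T$.
$X = \big[\,[X_1 \dots X_N]^T \;\; E\,\big]$: design matrix of the fixed effects, combining the stacked covariates and the exposure.
$\beta = (\beta_1^T, \beta_2)^T$.

We are interested in testing the hypothesis $H_0: \gamma = 0$.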
Important facts
If $\gamma$ is treated as a fixed vector and the null hypothesis is tested with a score test with $q$ degrees of freedom, power can be lost (Lin et al., 2013).
Another common strategy is to test GE interaction using a single SNP at a time.
Alternatively, assume that $\gamma$ is a random vector following a multivariate normal distribution $N(0, \tau I_q)$ and test the equivalent null hypothesis $H_0: \tau = 0$.

With $\gamma$ fixed, only $Kb$ is random:

$$g(\mu^{b}) = X\beta + G\theta + S\gamma + \underbrace{Kb}_{\text{random}}$$

With $\gamma$ random, both $S\gamma$ and $Kb$ are random:

$$g(\mu^{b,\gamma}) = X\beta + G\theta + \underbrace{S\gamma + Kb}_{\text{random}}$$
Linkage disequilibrium (LD) -PPARG-
Linkage disequilibrium (LD) -CDKAL1-
Remedial considerations to face LD
Given the high correlation among the SNPs in $G$, Lin et al. (2013) proposed, for independent subjects, to penalize the estimation of the parameter $\theta$ using ridge regression, introducing a penalty term $\lambda$ in the quasi-likelihood function. The tuning parameter $\lambda$ is selected by generalized cross-validation (Fu, 2005).
Shen et al. (2013) showed for generalized linear models (GLM) that ridge regression is equivalent to treating the penalized parameters as independent random variables.
The equivalence also holds for GLMMs: assuming $\theta \sim N(0, \sigma_\theta^2 I_q)$, it is possible to show that $\lambda = \phi/\sigma_\theta^2$.

With $\theta$, $\gamma$ and $b$ all random:

$$g(\mu^{b,\gamma,\theta}) = X\beta + \underbrace{G\theta + S\gamma + Kb}_{\text{random}}$$

With $\gamma$ fixed, $G\theta$ and $Kb$ are random:

$$g(\mu^{b,\theta}) = X\beta + \underbrace{G\theta}_{\text{random}} + S\gamma + \underbrace{Kb}_{\text{random}}$$
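The relation $\lambda = \phi/\sigma_\theta^2$ can be verified numerically in the simplest setting. The sketch below is restricted to a Gaussian response with identity link, no fixed effects and no family random effect, which is an assumption made only to keep the check short; the data are simulated toy values.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 30, 5
G = rng.integers(0, 3, size=(n, q)).astype(float)   # toy genotype matrix
y = rng.normal(size=n)                               # toy centred response

phi, sigma2_theta = 1.0, 0.25
lam = phi / sigma2_theta                             # lambda = phi / sigma_theta^2

# Ridge estimate: minimises ||y - G theta||^2 + lam * ||theta||^2.
theta_ridge = np.linalg.solve(G.T @ G + lam * np.eye(q), G.T @ y)

# BLUP of theta when theta ~ N(0, sigma2_theta * I) and errors are N(0, phi * I).
Sigma = sigma2_theta * G @ G.T + phi * np.eye(n)
theta_blup = sigma2_theta * G.T @ np.linalg.solve(Sigma, y)

print(np.allclose(theta_ridge, theta_blup))          # True
```

The two expressions agree because $(G^TG + \lambda I)^{-1}G^T = G^T(GG^T + \lambda I)^{-1}$, so the ridge penalty can be read as the ratio of the scale parameter to the variance of the random genetic effects.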
Section 3
Null model estimation
$$g(\mu^{d_1,d_2}) = X\beta + d_1 + d_2,$$

with $d_1 = G\theta$ and $d_2 = Kb$. Breslow and Clayton (1993) proposed a Fisher scoring solution that may be expressed as the iterative solution to the system

$$\begin{pmatrix} X^TWX & X^TW & X^TW \\ WX & \dfrac{\phi}{\sigma_\theta^2}(GG^T)^{-1} + W & W \\ WX & W & \dfrac{\phi}{\sigma^2}(KK^T)^{-1} + W \end{pmatrix} \begin{pmatrix} \beta \\ d_1 \\ d_2 \end{pmatrix} = \begin{pmatrix} X^TWY \\ WY \\ WY \end{pmatrix},$$

where
$Y = X\beta + d_1 + d_2 + \varepsilon$: working vector, with $\varepsilon \sim N(0, \phi W^{-1})$;
$W = \mathrm{diag}\left\{\omega_{ij} / \left[\nu(\mu_{ij}^d)\,g'(\mu_{ij}^d)^2\right]\right\}$.
The factor $\phi/\sigma_\theta^2$ is the ridge term.
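$$\hat\beta = (X^T\Sigma^{-1}X)^{-1}X^T\Sigma^{-1}Y,$$

$$\begin{pmatrix} \hat d_1 \\ \hat d_2 \end{pmatrix} = \begin{pmatrix} \sigma_\theta^2\,(GG^T)\,\Sigma^{-1}(Y - X\hat\beta) \\ \sigma^2\,(KK^T)\,\Sigma^{-1}(Y - X\hat\beta) \end{pmatrix},$$

with $\Sigma = \sigma_\theta^2 GG^T + \sigma^2 KK^T + \phi W^{-1}$.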
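A minimal sketch of one evaluation of these closed-form expressions follows. It treats the variance components, the weights and the working vector as given, whereas in the actual procedure they are updated iteratively; all numerical values and the identity stand-in for $K$ are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q = 12, 2, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # fixed-effects design
G = rng.integers(0, 3, size=(n, q)).astype(float)               # toy genotypes
K = np.eye(n)                        # stand-in for the Cholesky factor of 2 * Phi
W = np.diag(rng.uniform(0.1, 0.25, size=n))                     # toy working weights
Y = rng.normal(size=n)                                          # toy working vector
sigma2_theta, sigma2, phi = 0.2, 1.2, 1.0                       # assumed known here

W_inv = np.linalg.inv(W)
Sigma = sigma2_theta * G @ G.T + sigma2 * K @ K.T + phi * W_inv
Sigma_inv = np.linalg.inv(Sigma)

beta_hat = np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv @ Y)
resid = Y - X @ beta_hat
d1_hat = sigma2_theta * G @ G.T @ Sigma_inv @ resid
d2_hat = sigma2 * K @ K.T @ Sigma_inv @ resid

# The residual splits into d1, d2 and a noise part phi * W^{-1} Sigma^{-1} resid.
noise_part = phi * W_inv @ Sigma_inv @ resid
print(np.allclose(d1_hat + d2_hat + noise_part, resid))         # True
```

The final check reflects the structure of $\Sigma$: each of its three terms claims its share of the residual, so the predicted genetic and family effects plus the noise part add up to $Y - X\hat\beta$.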
Section 4
GE interaction test
Testing GE interaction
Following the approach developed by Zhang and Lin (2003), we propose as score statistic the quadratic form

$$U_\tau = U_\tau(\hat\beta, \hat\pi) = \left.\frac{1}{2}\left\{(Y - X\beta)^T\Sigma^{-1}SS^T\Sigma^{-1}(Y - X\beta)\right\}\right|_{\hat\beta,\hat\pi},$$

where $\hat\pi$ is the estimator of $\pi = (\sigma_\theta^2, \sigma^2, \phi)^T$.
To correct for bias, we use the restricted maximum likelihood (REML) estimators (Breslow and Clayton, 1993) in the GLMM framework to obtain $\hat\beta$ and $\hat\pi$ under the null hypothesis.
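The quadratic form can be computed without forming $SS^T$ explicitly, since it equals half the squared norm of $S^T\Sigma^{-1}(Y - X\hat\beta)$. The sketch below assumes that $\Sigma$ has already been evaluated at the null estimates; the function name `score_statistic` and the toy inputs are hypothetical.

```python
import numpy as np


def score_statistic(Y, X, S, Sigma):
    """U = 0.5 * r' Sigma^{-1} S S' Sigma^{-1} r, with r = Y - X beta_hat."""
    Sigma_inv_X = np.linalg.solve(Sigma, X)
    beta_hat = np.linalg.solve(X.T @ Sigma_inv_X, Sigma_inv_X.T @ Y)
    r = Y - X @ beta_hat
    v = S.T @ np.linalg.solve(Sigma, r)      # q-vector S' Sigma^{-1} r
    return 0.5 * float(v @ v)


rng = np.random.default_rng(4)
n, q = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # toy fixed-effects design
E = rng.normal(size=n)                                   # toy exposure
G = rng.integers(0, 3, size=(n, q)).astype(float)        # toy genotypes
S = E[:, None] * G                                       # GE interaction columns
Sigma = 0.2 * G @ G.T + 1.5 * np.eye(n)                  # toy null covariance
Y = rng.normal(size=n)                                   # toy working vector

U = score_statistic(Y, X, S, Sigma)
print(U >= 0.0)  # True, since U is half a squared norm
```

Only the null model has to be fitted to evaluate the statistic, which is the practical appeal of a score test when many genes are screened.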
Zhang and Lin (2003) showed that, under $H_0: \tau = 0$, $U_\tau$ approximately follows a mixture of independent chi-square distributions with one degree of freedom. However, for computational ease, we use the Satterthwaite method (Satterthwaite, 1941) to approximate the distribution of $U_\tau$ by a scaled chi-square distribution $\kappa\chi^2_\xi$.
When REML estimates are used to calculate $U_\tau$, its mean and variance can be approximated, respectively, by

$$\left.\mathrm{tr}(PSS^T)\right|_{\hat\pi} \quad \text{and} \quad I_\tau = \left.\frac{1}{2}\left\{\mathrm{tr}(PSS^TPSS^T) - J^TM^{-1}J\right\}\right|_{\hat\beta,\hat\pi}.$$
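$$J = \begin{pmatrix} \mathrm{tr}[PSS^TPGG^T] & \mathrm{tr}[PSS^TPKK^T] & \mathrm{tr}[PSS^TPW^{-1}] \end{pmatrix}^T,$$

$$M = \begin{pmatrix} \mathrm{tr}[PGG^TPGG^T] & \mathrm{tr}[PGG^TPKK^T] & \mathrm{tr}[PGG^TPW^{-1}] \\ \mathrm{tr}[PKK^TPGG^T] & \mathrm{tr}[PKK^TPKK^T] & \mathrm{tr}[PKK^TPW^{-1}] \\ \mathrm{tr}[PW^{-1}PGG^T] & \mathrm{tr}[PW^{-1}PKK^T] & \mathrm{tr}[PW^{-1}PW^{-1}] \end{pmatrix},$$

and $P = \Sigma^{-1} - \Sigma^{-1}X(X^T\Sigma^{-1}X)^{-1}X^T\Sigma^{-1}$.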
Since the mean and variance of $\kappa\chi^2_\xi$ are given by $\kappa\xi$ and $2\kappa^2\xi$, respectively, we obtain the equations $\mathrm{tr}(\hat PSS^T) = \kappa\xi$ and $\hat I_\tau = 2\kappa^2\xi$, where $\hat P$ denotes the matrix $P$ evaluated at $\hat\pi$. Solving these equations gives

$$\kappa = \frac{\hat I_\tau}{2\,\mathrm{tr}(\hat PSS^T)} \quad \text{and} \quad \xi = \frac{2\left[\mathrm{tr}(\hat PSS^T)\right]^2}{\hat I_\tau},$$

so that, approximately,

$$\frac{U_\tau}{\kappa} \sim \chi^2_\xi.$$
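The moment matching behind $\kappa$ and $\xi$ is short enough to sketch directly. The function name `satterthwaite` and the input values are hypothetical; in the setting of the slides the mean would be $\mathrm{tr}(\hat PSS^T)$ and the variance $\hat I_\tau$.

```python
def satterthwaite(mean: float, variance: float) -> tuple[float, float]:
    """Match kappa * chi2(xi) to a given mean and variance.

    Solves kappa * xi = mean and 2 * kappa**2 * xi = variance.
    """
    kappa = variance / (2.0 * mean)
    xi = 2.0 * mean ** 2 / variance
    return kappa, xi


kappa, xi = satterthwaite(mean=6.0, variance=18.0)
print(kappa, xi)  # 1.5 4.0
```

With SciPy available, the p-value of an observed statistic `U` could then be obtained as `scipy.stats.chi2.sf(U / kappa, df=xi)`, which accepts non-integer degrees of freedom.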
Section 5
Simulations
Simulated model
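$$E_{ij} = 2 + 0.01\,\mathrm{Age}_{ij} + 0.1\,I(\mathrm{Female}_{ij}) + \gamma_i + \varepsilon_{ij};$$

$$\begin{aligned} \mathrm{logit}\left[P(Y_{ij} = 1 \mid \mathrm{Age}_{ij}, \mathrm{Female}_{ij}, E_{ij}, G_{1ij}, G_{2ij})\right] ={}& 0.1 + 0.01\,\mathrm{Age}_{ij} + 0.1\,I(\mathrm{Female}_{ij}) + 0.1\,E_{ij} + 0.3\,G_{1ij} \\ &+ 0.3\,G_{2ij} + \gamma_1(G_{1ij} \times E_{ij}) + \gamma_2(G_{2ij} \times E_{ij}) + \alpha_{ij}, \end{aligned}$$

where
$I(\cdot)$ is the indicator function;
$\varepsilon_i = (\varepsilon_{i1}, \dots, \varepsilon_{i10})^T \sim N(0, 4I_{10})$;
$\alpha_i = (\alpha_{i1}, \dots, \alpha_{i10})^T \sim N(0, 2\Phi_i)$;
$\gamma_i \sim N(0, 4)$;
$\Phi_i$ is the kinship matrix corresponding to the aforementioned family pedigree.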
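A rough sketch of how one family could be generated from this model is given below. Several ingredients are assumptions that the slides do not specify: the pedigree (two unrelated founders and eight children), the age range, the sex ratio, the minor allele frequency, and the independent draw of genotypes, which ignores Mendelian transmission within the family.

```python
import numpy as np

rng = np.random.default_rng(3)
n_i = 10                                     # family size used on the slide

# Hypothetical pedigree: two unrelated founders followed by eight children.
Phi = np.full((n_i, n_i), 0.25)              # parent-child and sib-sib kinship
Phi[0, 1] = Phi[1, 0] = 0.0                  # founders are unrelated
np.fill_diagonal(Phi, 0.5)


def simulate_family(gamma1, gamma2, maf=0.3):
    age = rng.uniform(20, 70, size=n_i)      # assumed age range
    female = rng.integers(0, 2, size=n_i)    # assumed 50/50 sex ratio
    # Simplification: genotypes drawn independently across relatives.
    G1 = rng.binomial(2, maf, size=n_i)
    G2 = rng.binomial(2, maf, size=n_i)

    family_effect = rng.normal(scale=2.0)    # gamma_i ~ N(0, 4)
    eps = rng.normal(scale=2.0, size=n_i)    # eps_i ~ N(0, 4 * I)
    E = 2 + 0.01 * age + 0.1 * female + family_effect + eps

    alpha = rng.multivariate_normal(np.zeros(n_i), 2 * Phi)  # alpha_i ~ N(0, 2 Phi_i)
    eta = (0.1 + 0.01 * age + 0.1 * female + 0.1 * E + 0.3 * G1 + 0.3 * G2
           + gamma1 * G1 * E + gamma2 * G2 * E + alpha)
    Y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
    return Y, E, G1, G2


Y, E, G1, G2 = simulate_family(gamma1=0.05, gamma2=0.05)
print(Y.shape, set(Y.tolist()) <= {0, 1})
```

Setting `gamma1 = gamma2 = 0` gives data under the null hypothesis, and increasing both values gives data for a power study.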
Correlation of the 50 simulated SNPs in LD
[Figure: correlation matrix of the 50 simulated SNPs in LD, with SNPs 1 to 50 on both axes and a color scale from −1 to 1.]
Type I error
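We first compared the empirical type I error of the different methods at the 0.05 α-level. To evaluate type I error, we set $\gamma_1 = \gamma_2 = 0$ and varied the number of SNPs $q$ in the gene. The SNPs were either independent or in LD.

| SNPs category | $q$ | $\sigma^2$ | $\sigma_\theta^2$ | $\lambda = 1/\sigma_\theta^2$ | Score | MinP | VCT |
|---|---|---|---|---|---|---|---|
| Independent | 5 | 1.247 | 0.034 | 29.4 | 0.020 | 0.031 | 0.034 |
| Independent | 10 | 1.240 | 0.017 | 58.8 | 0.023 | 0.026 | 0.024 |
| Independent | 50 | 1.222 | 0.003 | 333 | 0.004 | 0.022 | 0.014 |
| LD | 5 | 1.243 | 0.021 | 47.6 | 0.025 | 0.026 | 0.031 |
| LD | 10 | 1.239 | 0.009 | 111 | 0.004 | 0.034 | 0.030 |
| LD | 50 | 1.222 | 0.002 | 500 | 0.000 | 0.028 | 0.022 |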
Empirical power for independent SNPs
[Figure: empirical power (0.0 to 1.0) against γ1 = γ2 (0.00 to 0.10) for the Score, MinP and VCT tests, in three panels: q = 5, q = 10 and q = 50.]
Empirical power for dependent SNPs
[Figure: empirical power (0.0 to 1.0) against γ1 = γ2 (0.00 to 0.10) for the Score, MinP and VCT tests, in three panels: q = 5, q = 10 and q = 50.]
Section 6
Application
Baependi Heart Study
Variables
Phenotype: type II diabetes.
Environmental variable: body mass index (BMI).
Genotype: the following three genetic regions were considered:
  Peroxisome proliferator-activated receptor gamma (PPARG), with 16 variants genotyped;
  Fat mass and obesity-associated protein (FTO), with 149 variants genotyped;
  Cyclin-dependent kinase 5 regulatory subunit associated protein 1-like 1 (CDKAL1), with 186 variants genotyped.
Covariates: age, sex, and the first two principal components of the entire Baependi genotype data.
Summary of cases per subjects and families
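| Gene | Subjects: Control | Subjects: Cases | Subjects: Total | Families: Control | Families: Cases | Families: Total |
|---|---|---|---|---|---|---|
| PPARG | 845 | 83 | 928 | 43 | 42 | 85 |
| FTO | 712 | 71 | 783 | 47 | 38 | 85 |
| CDKAL1 | 661 | 69 | 730 | 47 | 38 | 85 |

Computing environment: R version 3.3.1 on an Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz processor, with 8.00 GB of RAM and a 64-bit operating system.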
Sample size, GLMM parameters, p-values and execution times
Section 7
References
Breslow, N. and Clayton, D., 1993. Approximate inference in generalized linear mixed models. J. Am. Stat. Assoc., 88, 9-25.
Coombes, B., Basu, S. and McGue, M., 2017. A combination test for detection of gene-environment interaction in cohort studies. Genet. Epidemiol., 41, 396-412.
Chen, H. and Conomos, M., 2016. GMMAT: Generalized Linear Mixed Model Association Tests; R Package Version 0.7. Available online: https://content.sph.harvard.edu/xlin/dat/GMMAT_user_manual_v0.7.pdf (accessed on 16 January 2017).
Lin, X., Lee, S., Christiani, D. and Lin, X., 2013. Test for interactions between a genetic marker set and environment in generalized linear models. Biostatistics, 14, 667-681.
Lin, X., Lee, S., Wu, M., Wang, C., Chen, H., Li, Z. and Lin, X., 2016. Test for rare variants by environment interactions in sequencing association studies. Biometrics, 72, 156-164.
Lin, X., 1997. Variance component testing in generalised linear models with random effects. Biometrika, 84, 309-326.
Oliveira, C., Pereira, A., de Andrade, M., Soler, J. and Krieger, J., 2008. Heritability of cardiovascular risk factors in a Brazilian population: Baependi Heart Study. BMC Med. Genet., 9, 32.
Shen, X., Alam, M., Fikse, F. and Ronnegard, L., 2013. A novel generalized ridge regression method for quantitative genetics. Genetics, 193, 1255-1268.
Zhang, D. and Lin, X., 2003. Hypothesis testing in semiparametric additive mixed models. Biostatistics, 4, 57-74.
Satterthwaite, F., 1941. Synthesis of variance. Psychometrika, 6, 309-316.
Fu, W.J., 2005. Nonlinear GCV and quasi-GCV for shrinkage models. Journal of Statistical Planning and Inference, 131, 333-347.
Mazo, M. A., Coombes, B. and de Andrade, M., 2017. An Efficient Test for Gene-Environment Interaction in Generalized Linear Mixed Models with Family Data. International Journal of Environmental Research and Public Health, 14, 1134-1146.
Chen, H., Wang, C., Conomos, M., Stilp, A., Li, Z., Sofer, T., Szpiro, A., Chen, W., Brehm, J., Celedon, J., Redline, S., Papanicolaou, G., Thornton, T., Laurie, C., Rice, K. and Lin, X., 2016. Control for population structure and relatedness for binary traits in genetic association studies via logistic mixed models. The American Journal of Human Genetics, 98, 653-666.
Leal, S., Yan, K. and Muller-Myhsok, B., 2005. SimPed: A Simulation Program to Generate Haplotype and Genotype Data for Pedigree Structures. Hum. Hered., 119-122.
![Page 59: Seminario - Testing gene-environment interaction in generalized … · Breslow, N. and Clayton, D, 1993. Approximate inference in generalizedlinearmixedmodels.J. Am. Stat. Assoc.,88,9-25](https://reader033.vdocuments.mx/reader033/viewer/2022060417/5f14522786231e6f0b2e8978/html5/thumbnails/59.jpg)
Seminario References
Family data and kinship matrix
![Page 60: Seminario - Testing gene-environment interaction in generalized … · Breslow, N. and Clayton, D, 1993. Approximate inference in generalizedlinearmixedmodels.J. Am. Stat. Assoc.,88,9-25](https://reader033.vdocuments.mx/reader033/viewer/2022060417/5f14522786231e6f0b2e8978/html5/thumbnails/60.jpg)
Seminario References
Ridge regression estimation
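$$g(\mu^{d_2}) = X\beta + G\theta + d_2, \qquad d_2 = Kb$$

$$ql(\beta, \theta, \phi_R, \sigma_R) = -\frac{1}{2}\log\left|\frac{\sigma_R^2}{\phi_R}(KK^T)W_R + I_n\right| + \sum_{i=1}^{N}\sum_{j=1}^{n_i} ql_{ij}(\beta, \theta, \phi_R; d_2) - \frac{1}{2}\, d_2^T\left(\sigma_R^2 KK^T\right)^{-1} d_2,$$

where $d_2$ is chosen to maximize the sum of the last two terms, $W_R = \operatorname{diag}\left\{\omega_{ij} / \left[\nu(\mu_{ij}^{d_2})\, g'(\mu_{ij}^{d_2})^2\right]\right\}$ and

$$ql_{ij}(\beta, \theta, \phi_R; d_2) = \int_{Y_{ij}}^{\mu_{ij}^{d_2}} \frac{\omega_{ij}(Y_{ij} - \mu)}{\phi_R\,\nu(\mu)}\, d\mu.$$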
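As an illustration of the quasi-likelihood contribution ql_ij defined above, the following minimal Python sketch evaluates the integral numerically for a single observation and compares it with a closed form. The Poisson-type variance function ν(μ) = μ, the values ω = 1 and φ_R = 1, the sample numbers, and the helper name `quasi_loglik` are all assumptions made for this example; none of them come from the slides.

```python
import math

def quasi_loglik(y, mu, variance, omega=1.0, phi=1.0, steps=20_000):
    """Midpoint-rule approximation of the integral from y to mu of
    omega * (y - t) / (phi * variance(t)) dt (hypothetical helper)."""
    h = (mu - y) / steps
    total = 0.0
    for k in range(steps):
        t = y + (k + 0.5) * h  # midpoint of the k-th sub-interval
        total += omega * (y - t) / (phi * variance(t))
    return total * h

# Assumed toy values: observed count y and fitted mean mu.
y, mu = 3.0, 4.5

# Assumed variance function nu(mu) = mu (Poisson-type).
numeric = quasi_loglik(y, mu, variance=lambda t: t)

# With nu(t) = t, omega = phi = 1, the integral has the closed form
# y * log(mu / y) - (mu - y).
closed_form = y * math.log(mu / y) - (mu - y)

print(numeric, closed_form)
```

The contribution is zero when μ equals Y and negative otherwise, which is why maximizing the sum of these terms pulls the fitted means toward the observations.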
![Page 62: Seminario - Testing gene-environment interaction in generalized … · Breslow, N. and Clayton, D, 1993. Approximate inference in generalizedlinearmixedmodels.J. Am. Stat. Assoc.,88,9-25](https://reader033.vdocuments.mx/reader033/viewer/2022060417/5f14522786231e6f0b2e8978/html5/thumbnails/62.jpg)
Seminario References
Ridge regression estimation
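The ridge regression estimator of θ is obtained by maximizing the function

$$\left[ql(\beta, \theta, \phi_R, \sigma_R) \times \phi_R\right] - \frac{1}{2}\lambda\,\theta^T\theta,$$

where λ is a penalizing factor. The estimates solve the system

$$\begin{pmatrix} X^T W_R X & X^T W_R & X^T W_R \\ W_R X & W_R + \lambda (GG^T)^{-1} & W_R \\ W_R X & W_R & W_R + \frac{\phi_R}{\sigma_R^2}(KK^T)^{-1} \end{pmatrix} \begin{pmatrix} \beta \\ d_1 \\ d_2 \end{pmatrix} = \begin{pmatrix} X^T W_R Y \\ W_R Y \\ W_R Y \end{pmatrix},$$

where $d_1 = G\theta$.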
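The block system above can be assembled and solved directly in a toy setting. The sketch below is only an illustration under stated assumptions: simulated data, an identity link with W_R = I (so a single solve suffices), more markers than subjects so that GG^T is invertible, and arbitrary values for λ, φ_R and σ_R². For other links W_R depends on the current fit and the system would have to be re-solved iteratively; that loop is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions: n subjects, p covariates, q markers (q >= n).
n, p, q = 6, 2, 8
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + covariate
G = rng.normal(size=(n, q))                            # simulated marker matrix
Y = rng.normal(size=n)                                 # simulated response

# Assumed KK^T: two unrelated trios (two parents and one child each).
trio = np.array([[1.0, 0.0, 0.5],
                 [0.0, 1.0, 0.5],
                 [0.5, 0.5, 1.0]])
KKt = np.kron(np.eye(2), trio)

W = np.eye(n)                      # W_R = I: identity-link assumption
lam, phi, sigma2 = 2.0, 1.0, 0.5   # arbitrary illustrative values

GGt_inv = np.linalg.inv(G @ G.T)
KKt_inv = np.linalg.inv(KKt)

# Coefficient matrix and right-hand side of the 3 x 3 block system.
A = np.block([
    [X.T @ W @ X, X.T @ W,           X.T @ W],
    [W @ X,       W + lam * GGt_inv, W],
    [W @ X,       W,                 W + (phi / sigma2) * KKt_inv],
])
rhs = np.concatenate([X.T @ W @ Y, W @ Y, W @ Y])

sol = np.linalg.solve(A, rhs)
beta, d1, d2 = sol[:p], sol[p:p + n], sol[p + n:]

# Recover theta from d1 = G theta as the minimum-norm solution.
theta = G.T @ GGt_inv @ d1

print(beta)
```

In this Gaussian special case the β block coincides with the generalized least squares estimate under V = I + GG^T/λ + (σ_R²/φ_R) KK^T, which gives a convenient sanity check on the assembled system.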