quantitative trait association in parent offspring trios: extension of case/pseudocontrol method and...
TRANSCRIPT
![Page 1: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/1.jpg)
Genetic Epidemiology 31 : 813–833 (2007)
Quantitative Trait Association in Parent Offspring Trios: Extensionof Case/Pseudocontrol Method and Comparison of Prospective
and Retrospective Approaches
Eleanor Wheeler,1 and Heather J. Cordell2�
1The Wellcome Trust Sanger Institute, Cambridge, UK2Institute of Human Genetics, Newcastle University, UK
The case/pseudocontrol method provides a convenient framework for family-based association analysis of case-parent trios,incorporating several previously proposed methods such as the transmission/disequilibrium test and log-linear modelling ofparent-of-origin effects. The method allows genotype and haplotype analysis at an arbitrary number of linked and unlinkedmultiallelic loci, as well as modelling of more complex effects such as epistasis, parent-of-origin effects, maternal genotype andmother-child interaction effects, and gene-environment interactions. Here we extend the method for analysis of quantitative asopposed to dichotomous (e.g. disease) traits. The resulting method can be thought of as a retrospective approach, modellinggenotype given trait value, in contrast to prospective approaches that model trait given genotype. Through simulations andanalytical derivations, we examine the power and properties of our proposed approach, and compare it to several previouslyproposed single-locus methods for quantitative trait association analysis. We investigate the performance of the different methodswhen extended to allow analysis of haplotype, maternal genotype and parent-of-origin effects. With randomly ascertainedfamilies, with or without population stratification, the prospective approach (modeling trait value given genotype) is found to begenerally most effective, although the retrospective approach has some advantages with regard to estimation and interpretabilityof parameter estimates when applied to selected samples. Genet. Epidemiol. 31:813–833, 2007. r 2007 Wiley-Liss, Inc.
Key words: family-based; regression; imprinting; TDT
The supplemental materials described in this article can be found at http://www.interscience.wiley.com/jpages/0741-0395/suppmat
Contract grant sponsor: Wellcome Trust; Contract grant numbers: 074524 and 068612.�Correspondence to: Heather Cordell, Institute of Human Genetics, Newcastle University, International Centre for Life, Central Parkway,Newcastle upon Tyne NE1 3BZ, UK. E-mail: [email protected] 24 November 2006; Revised 15 February 2007; Accepted 27 April 2007Published online 4 June 2007 in Wiley InterScience (www.interscience.wiley.com).DOI: 10.1002/gepi.20243
INTRODUCTION
Numerous methods have been proposed to testfor association between a quantitative trait and adiallelic locus of interest. In a group of unrelatedsubjects, simple linear regression can be used to relatethe quantitative trait phenotype to the genotype.However, this approach can be adversely affected bypopulationstratification [Gauderman, 2003] and hencefamily-based designs are often preferred. Perhaps thesimplest family-based design is to genotype a sampleof unrelated phenotyped individuals and their par-ents, generating a set of parent-offspring trios. Anala-gous to the transmission/disequilibrium test (TDT) fordisease traits [Spielman et al., 1993], tests that arerobust to stratification can be derived by focussing onthe transmission of the parental alleles to the offspring.If a given marker is not linked to a quantitative traitlocus (QTL) (so that the marker alleles are transmittedrandomly from parents to offspring), offspring quan-titative phenotype is independent of offspring marker
genotype given the parental marker genotypes[Whittaker et al., 2003]. This observation led Whittakeret al. [2003] and Gauderman [2003] to propose a testthat is robust to population stratification, by addingterms that code for parental mating type into the linearregression equation. This approach (denoted QTDTM
by Gauderman [2003]) is closely related to testspreviously proposed by Allison [1997] and Lunettaet al. [2000] (for details, see Gauderman [2003]).
An alternative approach was proposed by Fulkeret al. [1999] and Abecasis et al. [2000]. These authorsadded terms to the linear regression model to separateout the within and between mating type information.The within mating type test was referred to as thehierarchical QTDT (HQTDT) by Gauderman [2003]and is the method implemented in the QTDT program[Abecasis et al., 2002]. Gauderman [2003] noted thatfor parent-offspring trios, HQTDT and QTDTM arevirtually identical with regards to inference about theeffects of interest (genotype effects on the trait);however HQTDT has the advantage that it has alsobeen extended to apply to general pedigrees [Abecasis
r 2007 Wiley-Liss, Inc.
![Page 2: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/2.jpg)
et al., 2002]. Yang et al. [2000] described a similarmodel to the HQTDT. The differences between themodels have little or no effect on the estimates andtest of interest [Gauderman, 2003], and the Yang et al.[2000] method was therefore treated as equivalent toHQTDT by Gauderman [2003].
The above models are prospective models, that is,they model the quantitative phenotype in terms of theoffspring genotype. They all assume that the quanti-tative traits are normally distributed, or else rely onthe central limit theorem. To protect against theeffects of possible deviations from either normality orselection on the trait, the HQTDT implemented in theQTDT program can carry out a permutation proce-dure based on permutation of genotypes to producean empirical P-value. The connection between max-imum likelihood inference under an assumed normaldistribution and least-squares regression, however,means that in general we expect these regression-based methods to be reasonably robust to smalldeviations of the trait from normality, even withoutuse of permutation arguments.
An alternative approach is to use a retrospectiveapproach, in which offspring genotype is modelledas a function of quantitative phenotype (possiblygiven parental genotypes). This approach is moreakin to the original TDT of Spielman et al. [1993].Such an approach provides the rationale for thefamily-based association test (FBAT) of Laird et al.[2000], although Lange et al. [2002] showed that thisretrospective FBAT approach is in fact equivalent toimplementing the HQTDT of Abecasis et al. [2000]via a score test rather than via a likelihood ratio test.Kistner and Weinberg [2004, 2005] describe a retro-spective approach in which the offspring genotypeis modelled as a function of their phenotype andparental genotypes, making no explicit assumptionsabout the distribution of the quantitative trait. Thismodel, called the quantitative polytomous logistic(QPL) model, can be thought of as an extension ofthe log-linear model proposed by Weinberg et al.[1998] for qualitative traits.
The log-linear model for qualitative traits pro-posed by Weinberg et al. [1998] is very similar to thecase/pseudocontrol approach for case-parent triosproposed by Cordell and Clayton [2002] (describedin more detail by Cordell et al. [2004]). The maindifference between the two approaches is thatCordell and Clayton [2002] model offspring geno-types conditional both on parental genotype andascertainment through the affected offspring,whereas Weinberg et al. [1998] model the frequenciesof the 15 possible trio types consisting of offspringgenotype and parental mating type. The case/pseudocontrol approach can be thought of as ageneralization of the TDT and the approaches ofSchaid and Sommer [1993], Schaid [1996] and
Weinberg et al. [1998], generalized to allow thefitting of more complex models where several linkedand/or unlinked loci may contribute to disease via acombination of offspring and/or maternal genotypeor haplotype effects, parent-of-origin effects andgene-gene or gene-environment interactions. Giventhe flexibility of the case/pseudocontrol approach,here we extend this approach to deal with quanti-tative traits, and compare the resulting methodto original and extended versions of previouslyproposed quantitative trait association approaches.
METHODS
THE QTDTM
Gauderman [2003] and Whittaker et al. [2003]incorporate parental mating type as a fixed effect ina linear regression model. The model is of the form
yi ¼ aM þ bgiþ ei ð1Þ
where yi denotes a quantitative phenotype, gi thegenotype at a particular locus for the ith individual(gi 5 0, 1, 2 according to whether the genotype is 1/1,1/2 (or 2/1) or 2/2), and aM (M 5 1, y, 6) are mating-type specific intercepts. The residual ei is assumed tobe normally distributed with mean 0 and variance s2.The formulation above suggests that the model isparameterized in terms of six mating-type parameters(the aM) and three child-genotype parameters (b0,b1 and b2), but, in fact, only two child-genotypeparameters are estimable, with one of the genotypecategories being chosen as the reference genotypecategory (the baseline genotype category to whichthe other genotype effects are compared). For example,if 1/1 is chosen as the reference genotype, b0 is setequal to 0 and the test of no association between y andg is H0 :b1 5b2 5 0. Alternatively, if the heterozygousgenotype 1/2 is chosen as the reference genotype, b1
is set equal to 0 and the test of no association betweeny and g is H0 :b0 5b2 5 0.
Similar to the HQTDT of Abecasis et al. [2000], theQTDTM draws information from both within andbetween mating types. The HQTDT models thedifferences across mating types using a betweenmating type parameter whereas the QTDTM usesthe multiple fixed intercepts aM. Gauderman [2003]notes that inferences for the genotype effects arethe same using the HQTDT and QTDTM methods.The differences between the methods are in theestimates and interpretation of the mating-typespecific intercepts, which are treated as randomeffects in the HQTDT and fixed effects in QTDTM.
THE QPL METHOD
Kistner and Weinberg [2004, 2005] describe aretrospective model in which offspring genotype is
Genet. Epidemiol. DOI 10.1002/gepi
814 Wheeler and Cordell
![Page 3: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/3.jpg)
modelled as a function of offspring phenotype andparental genotypes. The model is an extension of thelog-linear model proposed by Weinberg et al. [1998]for qualitative traits and is fit using a polytomouslogistic model with a generalized logit link function.Assuming parental mating symmetry in the popula-tion, there are six distinct parental mating types andthe offspring genotype is modelled conditional onthe offspring’s quantitative trait (yi) and the parentalmating type. Let SM denote the set of possibleoffspring genotypes for mating type M. The summa-tion
Pg�
i2SM
denotes the restricted summation over
the offspring genotypes consistent with SM. Thecontribution of a trio to the likelihood is modelled as
Pðgijgim; gif ; yiÞ ¼expðb00gi
yi þ a00MgiÞP
g�i2SM
expðb00g�i
yi þ a00Mg�iÞð2Þ
where b00g are parameters representing associationbetween quantitative trait and genotype, and a00Mg arenuisance parameters to account for non-Mendelian-ism and/or population stratification and depend onboth parental mating type and the offspring geno-type. In the formulation described by Kistner andWeinberg [2004], these parameters are denoted as bg
and aMg, but here we instead use the double primednotation (b00g and a00Mg) to distinguish these parametersfrom the parameters bg and aM used in theprospective formulation of equation (1). Kistner andWeinberg [2004] code the offspring, maternaland paternal genotypes as 0, 1 or 2 depending onthe number of ‘variant’ alleles (here considered tobe allele 2) they carry (the results of the test will bethe same regardless which allele is considered to be the‘variant’). For the ith trio, let the offspring’s quanti-tative trait value be denoted by yi. Column 3 of Table Ishows the conditional likelihoods for all combina-tions of parent and offspring genotypes that wouldresult from equation (2) if all the parameters wereestimable. Note that there are three child genotypeparameters (b000, b001 and b002) and seven nuisanceparameters (a00010, a00011, a00110, a00111, a00112, a00121, a00122) witha00jkl corresponding to the nuisance parameter for thecategory with (unordered) parental genotypes j andk and child genotype l. In practice, this model isoverparameterized, and Kistner and Weinberg[2004] treat the heterozygous offspring genotype,1 (1/2 or 2/1), as the reference genotype category. Thisresults in a final model with six estimated parameters(b000, b002, a00010, a00110, a00112, and a00122), with b001, a00011, a00111 anda00121 being set equal to zero, as shown in the fourthcolumn of Table I. An alternative parameterizationfor the child’s genotype parameters (which we shallwe use in our simulation study) would be to set b000 tozero and to estimate b001 and b002. Regardless of theparameterization chosen, the categorization into threepossible offspring genotype categories means that
there are a maximum of three possible terms in thedenominator of equation (2), clearly seen in thecolumns 3 and 4 of Table I.
THE QCPG METHOD
The method we propose is closely related to theapproach by Kistner and Weinberg [2004, 2005], butthe parameterization and implementation of themethods are somewhat different. Our approachderives from the case/pseudocontrol method fordichotomous traits [Cordell and Clayton, 2002,Cordell et al., 2004]. This method involves construct-ing (from a sample of case-parent trios) a sampleof cases and matched pseudocontrols. We focus hereon the ‘conditioning on parental genotypes’ (CPG)approach of Cordell et al. [2004], which generatespseudocontrols conditional on the mother’s andfather’s genotypes (and possibly also conditionalon some other event x, such as phase or parent-of-origin being determinable).
The extension of the CPG method to quantitativetraits, here named the quantitative CPG (QCPG), isbased on a calculation of the conditional likelihoodof the offspring genotypes, conditional on theparental genotypes and the offspring phenotypes.For family i, let gi, gim, gif be the offspring, maternaland paternal genotypes respectively, and let yi be theoffspring’s quantitative trait. Then,
Pðgijgim; gif ; yiÞ ¼Pðgi; gim; gif ; yiÞ
Pðgim; gif ; yiÞ¼
Pðgi; gim; gif ; yiÞPg�
iPðg�i ; gim; gif ; yiÞ
¼Pðyijgi; gim; gif ÞPðgijgim; gif ÞPðgim; gif ÞPg�
iPðyijg�i ; gim; gif ÞPðg�i jgim; gif ÞPðgim; gif Þ
¼PðyijgiÞPðgijgim; gif ÞP
g�i2S0
MPðyijg�i ÞPðg
�i jgim; gif Þ
ð3Þ
whereP
g�i
denotes summation over the four
possible offspring genotypes andP
g�i2S0
Mdenotes
summation over all possible offspring genotypesthat could have been transmitted to the offspringgiven the parental genotypes (the probabilitiesPðg�i jgim; gif Þ ¼ 0 for offspring genotypes that areinconsistent with the parental genotypes). This isof the form of the case/pseudocontrol likelihoodfor qualitative traits [Cordell and Clayton, 2002]with the offspring’s affection status replaced by aquantitative phenotype. The likelihood can becalculated via conditional logistic regression asimplemented in standard statistical software. InAppendix A we show that the contribution of a trioto the likelihood may be assumed to be of the form
expðb0giyi þ a0Mgi
ÞPg�
i2S0
Mexpðb0g�
iyi þ a0Mg�
iÞ: ð4Þ
815Quantitative Trait Association in Parent Offspring Trios
Genet. Epidemiol. DOI 10.1002/gepi
![Page 4: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/4.jpg)
Here b0girepresent genotype effects, and a0Mgi
arenuisance parameters modelling non-Mendelianismand population stratification. The likelihood issimilar to Kistner and Weinberg’s QPL likelihoodequation (2), except for the summation in thedenominator. Columns 3 and 4 of Table II show theconditional probabilities corresponding to all com-binations of parent and offspring genotypes for theQCPG method. By comparing column 4 of Tables Iand II, and ignoring constants of proportionality, itcan be seen that the QCPG likelihood is identical tothe QPL likelihood, except for the offspring of twoheterozygous parents. With two heterozygous par-ents, the sum in the denominator of the QCPG like-lihood is over a maximum of four possible off-spring genotypes. However, Kistner and Weinbergsum over a maximum of three possible offspringgenotypes. For example, for a heterozygous off-spring, the contribution to the QPL likelihood is
1
expðb000 yþ a00110Þ þ 1þ expðb002 yþ a00112Þ
whereas for the QCPG method the contribution tothe likelihood is
1
expðb00 yþ a0110Þ þ 1þ 1þ expðb02 yþ a0112Þ:
Essentially, in the QCPG formulation, we distin-guish between the two possible heterozygote off-spring genotypes 1/2 and 2/1 in the summationin the denominator (although in practice – assumingno parent-of-origin effects – the likelihood will beidentical regardless of whether the observed off-spring has genotype 1/2 or 2/1). As a result of thedifferent likelihood formulations, interpretationof the nuisance parameters is different for the QCPGmethod and Kistner and Weinberg’s QPL. Thedifference is most noticeable under the null withno population stratification or non-Mendelianism. InAppendix A we show that, for the QCPG method,the true values of the nuisance parameters a0 underthe null with no population stratification or non-Mendelianism equal zero, whereas in Appendix Bwe show that for Kistner and Weinberg’s QPLmethod, the true values of the nuisance parametersa00 are non-zero. Provided all six estimable para-meters (two offspring genotype effects and four
TABLE I. The conditional likelihoods associated with the QPL model
Parents (gm, gf) Offspring (g)
QPL likelihood
Overparameterized model Six estimable parameters estimated
00 0 1 1
02 1 1 1
22 2 1 1
01 0expðb000yþ a00010Þ
expðb000yþ a00010Þ þ expðb001yþ a00011Þ
expðb000yþ a00010Þ
expðb000yþ a00010Þ þ 1
1expðb001yþ a00011Þ
expðb000yþ a00010Þ þ expðb001yþ a00011Þ1
expðb000yþ a00010Þ þ 1
11 0expðb000yþ a00110Þ
expðb000yþ a00110Þ þ expðb001yþ a00111Þ þ expðb002yþ a00112Þ
expðb000yþ a00010Þ
expðb000yþ a00110Þ þ 1þ expðb002yþ a00112Þ
1expðb001yþ a00111Þ
expðb000yþ a00110Þ þ expðb001yþ a00111Þ þ expðb002yþ a00112Þ1
expðb000yþ a00110Þ þ 1þ expðb002yþ a00112Þ
2expðb002yþ a00112Þ
expðb000yþ a00110Þ þ expðb001yþ a00111Þ þ expðb002yþ a00112Þ
expðb002yþ a00112Þ
expðb000yþ a00110Þ þ 1þ expðb002yþ a00112Þ
12 1expðb001yþ a00121Þ
expðb001yþ a00121Þ þ expðb002yþ a00122Þ1
1þ expðb002yþ a00122Þ
2expðb002yþ a00122Þ
expðb001yþ a00121Þ þ expðb002yþ a00122Þ
expðb002yþ a00122Þ
1þ expðb002yþ a00122Þ
The likelihoods are proportional to P(g|gm, gf, y), corresponding to all combinations of (unordered) parent and offspring genotypes.
816 Wheeler and Cordell
Genet. Epidemiol. DOI 10.1002/gepi
![Page 5: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/5.jpg)
nuisance parameters) are freely estimated duringlikelihood maximization, inference for the para-meters of interest (b000 and b002, or b001 and b002,depending on which genotype category is chosenas reference) should not be affected by this result.However, the QPL result is slightly counter-intui-tive, as one would generally expect that parametersthat are specifically included in the likelihood tomodel certain effects (such as population stratifica-tion or non-Mendelianism) would take the valuezero (i.e. be removable from the likelihood) whenthese effects do not, in fact, exist.
An additional difference between the QCPG andQPL arises with regard to the number of nuisanceparameters estimated (see Appendix C).
Liu et al. [2002] described a method that is closelyrelated to the QCPG. Although the nuisanceparameters are different, the likelihood is essen-tially of the same form. Rather than having sixpossible parental mating type parameters (aM), Liuet al. [2002] have a single baseline parameter, a, anda number of additional parameters for the clustersof individuals whose trait differs from the popula-tion mean due to population stratification, di.However, without knowing the underlying popula-tion stratification, the di parameters are unknownand cannot be estimated from the data. Liu et al.
[2002] avoid having to estimate these nuisanceparameters by showing that even in the presenceof unobservable population stratification, it is stillvalid to test the null of no genetic effect via ascore test, since population stratification hasno effect on the null distribution of the test.Gauderman [2003] refers to the method of Liuet al. [2002] as the retrospective QTDT (RQTDT).To implement this method via a likelihood ratio test,Gauderman assumed that the quantitative traitfollows a normal distribution with mean aþ bgi
and variance s2, without consideration of thedi parameters. In this implementation, a does notuse information on the parental genotypes tomodel population stratification, although someinformation on the parental genotypes is stillincorporated via the genotypes of the offspringand the pseudocontrols.
EXTENSION OF QCPGTO MULTI-LOCUS HAPLOTYPES
Cordell et al. [2004] showed that the case/pseudocontrol approach can easily be extended tofit models for parent-of-origin effects, multiallelicmarkers, multiple linked loci in multiple unlinkedregions, and gene-gene and gene-environment
TABLE II. The conditional likelihoods associated with the QCPG model
Parents (gm, gf) Offspring (g)
QCPG likelihood
Overparameterized model Six estimable parameters estimated
00 0 1 102 1 1 122 2 1 1
01 0expðb00yþ a0010Þ
2 expðb00yþ a0010Þ þ 2 expðb01yþ a0011Þ
expðb00yþ a0010Þ
2 expðb00yþ a0010Þ þ 2
1expðb01yþ a0011Þ
2 expðb00yþ a0010Þ þ 2 expðb01yþ a0011Þ1
2 expðb00yþ a0010Þ þ 2
11 0expðb00yþ a0110Þ
expðb00yþ a0110Þ þ 2 expðb01yþ a0111Þ þ expðb02yþ a0112Þ
expðb00yþ a0110Þ
expðb00yþ a0110Þ þ 2þ expðb02yþ a0112Þ
1expðb01yþ a0111Þ
expðb00yþ a0110Þ þ 2 expðb01yþ a0111Þ þ expðb02yþ a0112Þ1
expðb00yþ a0110Þ þ 2þ expðb02yþ a0112Þ
2expðb02yþ a0112Þ
expðb00yþ a0110Þ þ 2 expðb01yþ a0111Þ þ expðb02yþ a0112Þ
expðb02yþ a0112Þ
expðb00yþ a0110Þ þ 2þ expðb02yþ a0112Þ
12 1expðb01yþ a0121Þ
2 expðb01yþ a0121Þ þ 2 expðb02yþ a0122Þ1
2þ 2 expðb02yþ a0122Þ
2expðb02yþ a0122Þ
2 expðb01yþ a0121Þ þ 2 expðb02yþ a0122Þ
expðb02yþ a0122Þ
2þ 2 expðb02yþ a0122Þ
The likelihoods are proportional to P(g|gm, gf, y), corresponding to all combinations of (unordered) parent and offspring genotypes.
817Quantitative Trait Association in Parent Offspring Trios
Genet. Epidemiol. DOI 10.1002/gepi
![Page 6: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/6.jpg)
interactions, via an adjustment to the conditioningargument that results in differing numbers ofpseudocontrols depending on the model beingfitted. Here we extend this approach to quantitativetraits. Consider models in which the genotypeeffects depend only on child’s phased genotype.Define gi, gim, gif as the offspring, maternal andpaternal phase-known genotypes respectively, and yi
as the offspring’s quantitative trait. The likelihoodis very similar to that in equation (3) but we definean event x as the event that the set of transmittedand untransmitted haplotypes from the parentscan be deduced. The contribution to the conditionallikelihood is
Pðgijgim; gif ; yi; xÞ
¼Pðgi; gim; gif ; yi; xÞ
Pðgim; gif ; yi; xÞ¼
Pðgi; gim; gif ; yi; xÞPg�
iPðg�i ; gim; gif ; yi; xÞ
¼Pðyijgi; gim; gif ; xÞPðgijgim; gif ; xÞPðgim; gif ; xÞPg�
iPðyijg�i ; gim; gif ; xÞPðg�i jgim; gif ; xÞPðgim; gif ; xÞ
¼PðyijgiÞPðgijgim; gif ; xÞP
g�i2S0
MPðyijg�i ÞPðg
�i jgim; gif ; xÞ
whereP
g�i
denotes summation over all possible off-
spring genotypes andP
g�i2S0
Mdenotes summation
over all possible offspring genotypes that could havebeen transmitted to the offspring given the parentalgenotypes. Under Mendelian inheritance the prob-abilities Pðg�i jgim; gif ; xÞ are equal for all g�i 2 fS
0M \
Gxg and equal zero otherwise, where Gx denotes theset of offspring genotypes determined by x. Then thecontribution to the likelihood under Mendelianinheritance is given by
Pðgijgim; gif ; yi; xÞ ¼PðyijgiÞP
g�i2fS0
M\Gxg
Pðyijg�i Þ:
Note that, provided the models that are to be fitteddo not depend on phase, one could also use theQCPG for analysis of unphased multilocus genotypedata, in the same way that the CPG method canbe used for unphased genotype data [Cordellet al., 2004].
EXTENSIONS FOR MATERNAL GENOTYPEAND PARENT-OF-ORIGIN EFFECTS
Kistner et al. [2006] proposed an extension tothe QPL approach to allow testing for maternallymediated effects and parent-of-origin effects. Thelikelihood factors into two parts. The first factor testsfor genotype effects in the offspring and can bemodelled using the original QPL method. Thesecond factor tests for maternal genotype or par-ent-of-origin effects via a logistic regression model.
Maternal genotype effects are incorporated bymodelling the probability that the mother has morecopies of the variant allele than the father for eachmating type. Parent-of-origin effects are incorpo-rated by additionally including a binary indicatorvariable, indicating whether the offspring inheritedonly one copy of the variant allele. This implies thechild is heterozygous, and since the mother has morecopies of the variant allele than the father, the variantallele must have been inherited from the mother.
We may also extend the QCPG method to allow formaternal genotype and parent-of-origin effects. The‘conditioning on exchangeable parental genotypes’(CEPG) method [Cordell et al., 2004] is an extensionof the CPG approach to detecting parent-of-originor maternal genotype effects by assuming exchange-ability of parental genotypes. The method conditionson the set of parental genotypes but not on theirorder, generating additional pseudocontrols con-structed by exchanging the genotypes of the motherand father. Here we extend this approach toquantitative traits. Maternal genotype effects aredefined to be the direct effect of the maternalgenotype on the offspring’s quantitative trait, andparent-of-origin effects are defined (as in Weinberget al. [1998]) to allow the offspring’s quantitativetrait to vary according to the parental origin of thevariant allele, if present. Like Cordell et al. [2004], weintroduce an additional conditioning event x corre-sponding to the event that parent-of-origin andmaternal genotype can be deduced in the trio. Forquantitative trait CEPG method (denoted hereQCEPG), the contribution of a trio to the QCEPGlikelihood is
Pðgi; gim; gif jfgim; gifg; yi; xÞ
¼Pðgi; gim; gif ; fgim; gifg; yi; xÞ
Pðfgim; gifg; yi; xÞ
¼Pðgi; gim; gif ; fgim; gifg; yi; xÞP
g�i;g�
im;g�
ifPðg�i ; g
�im; g
�if ; fgim; gifg; yi; xÞ
¼Pðyijgi; gim; gif ; xÞPðgi; gim; gif jfgim; gifg; xÞPðfgim; gifg; xÞP
g�i;g�
im;g�
ifPðyijg�i ; g
�im; g
�if ; xÞPðg
�i ; g�im; g
�if jfgim; gifg; xÞPðfgim; gifg; xÞ
¼Pðyijgi; gim; gif ; xÞP
g�i;g�
im;g�
if2Gx
Pðyijg�i ; g�im; g
�if ; xÞ
ð5Þ
where fgim; gifg denotes the unordered set of parentalgenotypes and the final restricted sum is over thepossible trios in which parent-of-origin (as well asmaternal genotype) are deducible. Unlike Kistnerand Weinberg’s extension of the QPL method, theQCEPG method does not involve factoring thelikelihood. In addition, the null hypothesis of noparent-of-origin effects considers the transmission ofvariant alleles from the mother to all offspring, notonly those who are heterozygous. The contribution
818 Wheeler and Cordell
Genet. Epidemiol. DOI 10.1002/gepi
![Page 7: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/7.jpg)
of the likelihood (see Appendix D) is assumed to beof the form
expððb0giþb0gim
þbImÞyi þ a0gimgif gi
ÞPg�
i;g�
im;g�
if2Gx
expððb0g�iþ b0g�
imþb�ImÞyi þ a0g�
img�
ifg�
iÞ: ð6Þ
The offspring genotype effects are denoted by b0g�i,
the maternal genotype effects by b0g�im
and parent-of-
origin effects by b�Im, where Im is an indicator of
whether an offspring inherits a variant allele fromthe mother. The nuisance parameters a0g�
img�
ifg�
idepend
on the offspring genotype and the genotypes of theparents. For the QCPG, in the absence of maternalgenotype or parent-of-origin effects, the nuisanceparameters only depended on the parental matingtype and the offspring genotype. However, in theQCEPG, the maternal genotype is of interest andneeds to be included. No additional information isgained by incorporating the parent-of-origin indica-tor into the nuisance parameter since once the phase-known genotypes of the parents and offspring arespecified, and conditional on the fact that parent-of-origin can be deduced, parent-of-origin is alsoestablished. If only maternal genotype effects areof interest then the parent-of-origin indicator cansimply be removed from the model and the nuisanceparameters remain the same. However, if onlyparent-of-origin effects are of interest then thenuisance parameters are of the form a0Mg�
iIm , which
depend on the parental mating type, the offspringgenotype and the parent-of-origin indicator.
The QTDTM method can also be extended toinclude maternal genotype effects, bgim
, and parent-of-origin effects, bIm
, through fitting the linearregression model
yi ¼ aM þ bgiþ bgim
þ bIm:
Trios in which parent-of-origin can be resolved canbe found by first generating a case/pseudocontroldataset as described in Cordell et al. [2004], specify-ing that the parental genotypes are exchangeableand parent-of-origin can be resolved. For theQTDTM method, only the original offspring (the‘case’) from the case/pseudocontrol dataset(together with information about maternal genotypeand parent-of-origin status) is used in the prospec-tive likelihood, whereas in the QCEPG method, thefull set of cases and pseudocontrols is required forthe retrospective likelihood.
SIMULATION STUDY
SINGLE-LOCUS SIMULATIONS
Simulations were performed to investigate thepower and properties of the various methodsdescribed. Initially, a single diallelic QTL locus was
considered. One thousand replicates of data weregenerated, each consisting of a number of genotypedtrios (i.e. a single offspring with a quantitative traitand both parents). Bias in the resulting parameterestimates, 95% confidence intervals, power and typeI error were examined. A method that performs wellwould be expected to give unbiased parameterestimation and to show approximately 95% con-fidence interval coverage. The importance of thenuisance parameters in the retrospective modelswas also investigated by examining the estimatesobtained when they are removed from the modeland also when the offspring genotype is used as asubstitute.
For the single-locus model, six generating scenar-ios were considered as shown in Table 1 (online).Three different sampling schemes were employed:random sampling, one-tail sampling from the uppertail of the offspring trait distribution and two-tailedsampling from the upper and lower tails of theoffspring trait distribution. Under random sam-pling, 500 parent-offspring trios were simulatedper replicate where the offspring’s quantitative traitwas drawn from a normal distribution with geno-type mean and standard deviation as shown in Table1 (online). Population stratification was simulatedby combining data in different proportions from twosubpopulations, each of which was in Hardy-Weinberg equilibrium and showed random mating.The subpopulations had different allele frequenciesand mean quantitative trait values, producing aspurious correlation between the quantitative traitand genotypes when the populations are combined.Under selected sampling from the extremes, 5,000trios were generated per replicate, from which asubset were selected for analysis. For the two-tailedsampling scheme, 500 trios were selected from the5,000 (i.e. the top and bottom 5% of the traitdistribution). For the one-tail sampling scheme, weselected 1,000 trios from the 5,000 (i.e. the top 20%),as convergence problems were encountered whenusing only 500 trios under this sampling scheme.
Table III shows results for the first three scenarioswith no population stratification and where the trioswere randomly sampled. Under the null, all themethods gave unbiased estimates and reasonableconfidence intervals, except for the QPL methodwhere the nuisance parameters have been removed.This is expected since under the null, the a00
parameters in the QPL are nonzero and so theirremoval affects the resulting b00 estimates. Similarly,under the first alternative model (Alt 1) all themethods performed well except the QPL and QCPGmethods where the nuisance parameters have beenremoved. The retrospective models in which thenuisance parameters have been replaced by theoffspring genotype parameters give b0 and b00
819Quantitative Trait Association in Parent Offspring Trios
Genet. Epidemiol. DOI 10.1002/gepi
![Page 8: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/8.jpg)
TA
BL
EII
I.T
rue
an
de
stim
ate
dm
ea
ns,
sta
nd
ard
de
via
tio
ns
(SD
)a
nd
cov
era
ge
(CI)
of
the
95
%co
nfi
de
nce
inte
rva
lsfo
rth
esi
ng
lelo
cus
sim
ula
tio
ns
wit
hra
nd
om
sele
ctio
na
nd
no
po
pu
lati
on
stra
tifi
cati
on
Met
ho
dP
aram
eter
Mo
del
s
Nu
llA
lt1
Alt
2
Tru
em
ean
Mea
n(S
D)
CI
Tru
em
ean
Mea
n(S
D)
CI
Tru
em
ean
Mea
n(S
D)
CI
Lin
ear
reg
ress
ion
Co
nst
ant
–1.
99(0
.46)
––
�0.
01(0
.46)
––
�0.
01(0
.46)
–b 1
0.00
0.05
(0.5
0)0.
961.
001.
05(0
.50)
0.96
0.00
0.05
(0.5
0)0.
96b 2
0.00
0.04
(0.4
8)0.
962.
002.
04(0
.48)
0.96
1.00
1.04
(0.4
8)0.
96Q
TD
TM
b 10.
000.
08(0
.60)
0.95
1.00
1.08
(0.6
0)0.
950.
000.
08(0
.60)
0.95
b 20.
000.
07(0
.63)
0.95
2.00
2.07
(0.6
3)0.
951.
001.
07(0
.63)
0.95
QC
PG
b01
0.00
0.02
(0.1
6)0.
950.
250.
27(0
.17)
0.96
0.00
0.02
(0.1
6)0.
95b0
20.
000.
02(0
.16)
0.95
0.50
0.51
(0.1
7)0.
970.
250.
25(0
.16)
0.96
QP
Lb00 1
0.00
0.02
(0.1
6)0.
950.
250.
27(0
.17)
0.96
0.00
0.02
(0.1
6)0.
95b00 2
0.00
0.02
(0.1
6)0.
950.
500.
51(0
.17)
0.97
0.25
0.25
(0.1
6)0.
96Q
CP
Gas
rem
ov
edb0
10.
000.
02(0
.11)
0.96
0.25
0.26
(0.1
5)0.
960.
000.
02(0
.14)
0.96
b02
0.00
0.01
(0.1
1)0.
950.
500.
40(0
.16)
0.88
0.25
0.23
(0.1
5)0.
96Q
PLa00 s
rem
ov
edb00 1
0.00
0.10
(0.1
0)0.
870.
250.
27(0
.14)
0.97
0.00
0.00
(0.1
3)0.
97b00 2
0.00
0.07
(0.1
0)0.
920.
500.
40(0
.14)
0.88
0.25
0.21
(0.1
3)0.
97Q
CP
Ga0
sre
pla
ced
by
off
spri
ng
gen
oty
pe
(g)
b01
0.00
0.02
(0.1
5)0.
960.
250.
27(0
.16)
0.97
0.00
0.02
(0.1
5)0.
96b0
20.
000.
01(0
.16)
0.95
0.50
0.50
(0.1
6)0.
960.
250.
25(0
.16)
0.96
QP
La00 s
rep
lace
db
yo
ffsp
rin
gg
eno
typ
e(g
)b00 1
0.00
0.02
(0.1
5)0.
960.
250.
24(0
.16)
0.97
0.00
0.00
(0.1
5)0.
96b00 2
0.00
0.01
(0.1
5)0.
960.
500.
47(0
.16)
0.96
0.25
0.23
(0.1
5)0.
96
Th
esi
mu
lati
on
par
amet
ers
are
assh
ow
nin
Tab
le1
(on
lin
esu
pp
lem
enta
rym
ater
ials
).
820 Wheeler and Cordell
Genet. Epidemiol. DOI 10.1002/gepi
![Page 9: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/9.jpg)
estimates very close to the true means and reason-able coverage.
The results for the three scenarios in the presenceof population stratification and under randomsampling are shown in Table IV. Simple linearregression showed the expected bias in the estimatesof b and poor coverage of the estimated 95%confidence intervals since the population stratifica-tion is not accounted for in the method. Under thefirst null model, as in the case without populationstratification (Table III), the QPL method with thea00 parameters removed did not perform well (cov-erages 0.87 and 0.92 instead of 0.95). The remainderof the methods perform well under both nullmodels, even under population stratification. Sub-stitution of the four nuisance parameters by theoffspring genotype in the retrospective methodsappears to account sufficiently for the populationstratification. Under the alternative with populationstratification, only the prospective QTDTM methodproduced unbiased estimates and correct coverage.The retrospective methods (QCPG and QPL) pro-duced biased estimates, as expected (see AppendicesA and C).
Parameter estimates under one-tail selected sam-pling are shown in online Tables 2 and 3 (online).The results under the null (both with and withoutpopulation stratification) are the same as thosefound in the unselected case. Under the alternativewith no population stratification, both prospectivemodels (simple linear regression and QTDTM) showbiased estimates and incorrect coverage of the 95%confidence intervals. This is because the methodscannot account for the selection on quantitative traitvalue. By conditioning on the trait values, theretrospective models should be robust to selection.However, Table 2 (online) suggests that thesemethods are producing biased estimates. By lookingat the median genotype effect estimates (data notshown), we found that the bias is due to a smallnumber of outlying observations. The medians forthe QCPG and QPL methods (with the true nuisanceparameters and with the nuisance parametersreplaced by the offspring genotype) are very closeto the true means. Under the alternative withpopulation stratification, all methods performedpoorly, producing biased estimates and incorrectcoverage. Here, the prospective model QTDTM failssince it cannot account for selection on quantitativetrait value and the retrospective models, QCPG andQPL, fail to estimate the nuisance parameters underthe alternative with population stratification. Similarresults were observed using the two-tailed samplingscheme (Tables 4 and 5 (online)). Without popula-tion stratification (Table 5 (online)), it can be seenthat the bias in the estimates using simple linearregression is not as great under two-tailed sampling
as found when sampling only from the uppertail of the trait distribution (note that under thealternative, the assumption of homoscedasticity ofthe residuals is violated under the one-tailedsampling scheme).
Powers/type I errors are shown in Table 6(online). Since the powers to achieve P value of0.001 for the different methods are all 1.0 under two-tailed sampling, we also investigated the power toachieve a more stringent significance level in thiscase. Under the null with no population stratifica-tion, removal of the nonzero nuisance parameters inthe QPL method generates a bias in the estimatesand hence increased type I error rates, most clearlyseen in Table 6 (online) for the random and one-tailsampling schemes. The remaining methods all havetype I error rates close to or less than the criticalvalues. Highest power to detect a genotypic effectis seen with the linear regression method for therandom and two-tailed sampling schemes and withthe QCPG method with the a0 parameters removedfor the one-tail sampling scheme (powers are mean-ingless for the QPL method with no a00 parameterssince the type I errors are incorrect). In all cases, thehighest powers to detect a genotypic effect are seenfor the two-tailed selected sampling scheme, select-ing from the upper and lower tails of the offspringtrait distribution. In contrast, selection from only theupper tail of the offspring trait distribution actuallydecreases the power to detect a genetic effectcompared to the random sampling scheme, despitehaving the largest sample size, except for the QCPGmethod with the a0 parameters removed. Theseresults also show that, although under the alter-native with selected sampling the QTDTM methodshowed biased estimates and poor coverage of the95% confidence intervals, the method can still beused to test for a genetic effect, as the type 1 error iscorrect. In fact, the large bias in the estimates seenwhen using a two-tailed sampling scheme actuallyincreases the power to detect an effect compared torandom sampling, although this power increase mayalso be due to the fact that the selected subjects carrymore information, since they are concentrated at theextremes of the trait distribution.
Under population stratification, Table 6 (online)shows that for all sampling schemes the simplelinear regression method has increased type I errorrates. Since linear regression cannot account for thepopulation substructure, the resulting bias in theestimates generates a large number of false-positiveassociations. The QPL method with the a00 para-meters removed also has type I errors larger than thenominal values, particularly when selecting from theupper tail of the offspring trait distribution, as foundin the case of no population stratification. The type Ierrors for the QTDTM method under population
821Quantitative Trait Association in Parent Offspring Trios
Genet. Epidemiol. DOI 10.1002/gepi
![Page 10: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/10.jpg)
TA
BL
EIV
.T
rue
an
de
stim
ate
dm
ea
ns,
sta
nd
ard
de
via
tio
ns
(SD
)a
nd
cov
era
ge
(CI)
of
the
95
%co
nfi
de
nce
inte
rva
lsfo
rth
esi
ng
lelo
cus
sim
ula
tio
ns
wit
hra
nd
om
sele
ctio
na
nd
wit
hp
op
ula
tio
nst
rati
fica
tio
n
Met
ho
dP
aram
eter
Mo
del
s
Po
pn
stra
t(N
ull
2)P
op
nst
rat
(Nu
ll2)
Po
pn
stra
t(A
lt)
Tru
em
ean
Mea
n(S
D)
CI
Tru
em
ean
Mea
n(S
D)
CI
Tru
em
ean
Mea
n(S
D)
CI
Lin
ear
reg
ress
ion
Co
nst
ant
–2.
29(0
.21)
––
0.02
(0.1
0)–
–1.
84(0
.19)
–b 1
0.00
�1.
06(0
.26)
0.02
0.00
0.25
(0.1
5)0.
761.
000.
15(0
.24)
0.07
b 20.
00�
2.06
(0.2
2)0.
000.
001.
31(0
.14)
0.00
2.00
0.34
(0.2
0)0.
00Q
TD
TM
b 10.
000.
01(0
.33)
0.95
0.00
0.00
(0.1
7)0.
981.
001.
01(0
.31)
0.95
b 20.
000.
01(0
.39)
0.94
0.00
0.00
(0.2
4)0.
972.
002.
01(0
.35)
0.94
QC
PG
b01
0.00
0.01
(0.2
0)0.
950.
000.
01(0
.15)
0.95
0.25
0.68
(0.2
4)0.
61b0
20.
000.
00(0
.21)
0.95
0.00
0.00
(0.1
6)0.
960.
501.
23(0
.27)
0.17
QP
Lb00 1
0.00
0.01
(0.2
0)0.
950.
000.
01(0
.15)
0.95
0.25
0.68
(0.2
4)0.
61b00 2
0.00
0.00
(0.2
1)0.
950.
000.
00(0
.16)
0.96
0.50
1.23
(0.2
7)0.
17Q
CP
Ga0
sre
mo
ved
b01
0.00
0.00
(0.1
0)0.
960.
000.
01(0
.14)
0.95
0.25
0.14
(0.0
9)0.
81b0
20.
000.
00(0
.12)
0.95
0.00
0.00
(0.1
6)0.
960.
500.
29(0
.10)
0.50
QP
La00 s
rem
ov
edb00 1
0.00
0.08
(0.0
9)0.
910.
000.
02(0
.14)
0.96
0.25
0.20
(0.0
9)0.
92b00 2
0.00
0.01
(0.1
1)0.
970.
000.
01(0
.14)
0.96
0.50
0.30
(0.0
9)0.
50Q
CP
Ga0
sre
pla
ced
by
off
spri
ng
gen
oty
pe
(g)
b01
0.00
0.00
(0.1
9)0.
960.
000.
01(0
.14)
0.95
0.25
0.62
(0.2
2)0.
66b0
20.
00�
0.01
(0.2
0)0.
960.
000.
00(0
.16)
0.95
0.50
1.11
(0.2
3)0.
25Q
PLa00 s
rep
lace
db
yo
ffsp
rin
gg
eno
typ
e(g
)b00 1
0.00
�0.
03(0
.19)
0.95
0.00
0.01
(0.1
4)0.
950.
250.
56(0
.22)
0.79
b00 2
0.00
�0.
07(0
.19)
0.96
0.00
0.04
(0.1
5)0.
950.
501.
00(0
.22)
0.46
Th
esi
mu
lati
on
par
amet
ers
are
assh
ow
nin
Tab
le1
(on
lin
e).
822 Wheeler and Cordell
Genet. Epidemiol. DOI 10.1002/gepi
![Page 11: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/11.jpg)
stratification for the first null model are slightlylarger than expected, across all three samplingschemes. Similarly, the type I error is inflated forthe second null model under the two-tailed sam-pling scheme. The remainder of the methods havetype I errors close to the nominal values. Therefore,the retrospective methods (QCPG and QPL) with thecorrect nuisance parameters or with the nuisanceparameters replaced by the offspring genotype canbe used (and indeed have high power) to test for agenetic effect, even under circumstances where theyproduce biased estimates.
MULTI-LOCUS HAPLOTYPES
Simulations were carried out to investigate theeffect of the nuisance parameters (intended toaccount for population stratification) when themethods are extended to multi-locus haplotypes.Note that, as originally proposed, the QTDTM
method (and simple linear regression) only applyto single loci: to extend these methods to multi-locushaplotypes it is necessary to first infer the child’s(and if necessary, the parents’) haplotypes giventhe observed genotype data, as is done in thefirst stage of the CPG and QCPG methods [Cordellet al., 2004]. The resulting haplotype variablesmay then be entered as predictor variables intoequation (1).
Tables V and VI show the results of simulations inwhich the offspring quantitative trait was influencedby genotype at two linked diallelic markers assumedto be in moderate LD. The four possible haplotypes,1-1, 1-2, 2-1 and 2-2, had haplotype frequencies andhaplotype means as shown in Table 7 (online).Additive effects of haplotypes were assumed so thatfor each trio, the offspring’s quantitative trait wasdrawn from a normal distribution whose mean wasthe sum of the two haplotype means. For eachsimulation, 1,000 replicates of data were generated,each replicate consisting of 1,000 parent-offspringtrios with random selection or 1,000 trios selectedfrom 10,000 in either one-tail (top 10%) or two-tailed(top and bottom 5%) sampling from the extremes ofthe offspring trait distribution.
Three methods were considered, simple linearregression, QTDTM and the retrospective methodQCPG. The QPL method was not considered as itis so closely related to the QCPG. Simple linearregression does not have any additional parametersto account for population stratification. The QTDTM
and QCPG methods, however, have a significantnumber of nuisance parameters when the methodsare extended to multi-locus haplotypes. For exam-ple, for QTDTM, the number of possible matingtypes (assuming parental mating symmetry) is 55,a large increase from the 6 in the single-locus case.
Therefore, in addition to considering the models inwhich the ‘correct’ nuisance parameters are used,we considered models in which the number ofnuisance parameters were reduced. For the QTDTM
we considered either including in the modelthe single-locus mating-type parameters for eachlocus, or including maternal and paternal genotype(rather than mating-type) parameters. For the QCPGmethod, replacing the nuisance parameters by theoffspring genotype (g) was considered.
Table V shows the results for the case with nopopulation stratification. Also shown is the numberof replicates that converged from the original 1,000.Convergence problems were probably a smallsample size problem, due to the large numbers ofparameters to estimate in the models. Under thenull, all the methods produced unbiased parameterestimates, regardless of selection scheme. Under thealternative with no selection, the prospective meth-ods (simple linear regression and QTDTM withthe different sets of a parameters) gave unbiasedparameter estimates. The retrospective QCPG meth-od shows some small-sample bias in the estimateswhen the full set of ‘correct’ a0 parameters was usedand similar bias when the a0 parameters werereplaced by the offspring genotype g. This biasdisappeared when 10,000 trios (as opposed to 1,000)were used (data not shown). Under the alternativewith selection, only the retrospective QCPG methodwhen all the ‘correct’ a0 parameters are used, orwhen the a0 are replaced by the offspring genotype,gave estimates close to the true mean.
The sensitivity of the estimates to the way thenuisance parameters are modelled is most pro-nounced under population stratification as shownin Table VI. Under the null with random selection,the simple linear regression method producesbiased estimates, as does the QTDTM method inwhich the correct a’s are replaced by those thatwould be generated by considering the loci indivi-dually. The remaining methods, QTDTM with thefull set of a parameters or with parental genotypeparameters, and the QCPG methods with thedifferent sets of nuisance parameters, all haveunbiased estimates. Under the null with selectionfrom the upper tail of the offspring trait distribution,all of the methods produced unbiased parameterestimates. For the two-tailed sampling scheme,linear regression showed the expected bias inparameter estimates but QTDTM with the correcta’s, QTDTM with parental genotypes and theretrospective methods (with the different sets ofa0 parameters) produced unbiased parameter esti-mates. Under the alternative with random selectiononly the prospective QTDTM method (with thedifferent sets of a’s) produced unbiased estimates:as explained in Appendices A and C, the nuisance
823Quantitative Trait Association in Parent Offspring Trios
Genet. Epidemiol. DOI 10.1002/gepi
![Page 12: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/12.jpg)
TA
BL
EV
.M
ea
ne
stim
ate
sa
nd
sta
nd
ard
de
via
tio
ns
of
sim
ula
ted
two
-lo
cus
ha
plo
typ
ee
ffe
cts
usi
ng
the
sim
ple
lin
ea
rre
gre
ssio
n,
QT
DT
M(w
ith
dif
fere
nt
sets
of
nu
isan
cep
ara
mete
rs)
an
dth
eQ
CP
G(w
ith
dif
fere
nt
sets
of
nu
isan
cep
ara
mete
rs)
Nu
llm
od
elA
lter
nat
ive
mo
del
Sam
pli
ng
sch
eme
No
.o
fre
pli
cate
s
Ran
do
m97
3T
op
980
To
p1
bo
tto
m98
2R
and
om
978
To
p13
2T
op
1b
ott
om
292
Met
ho
dP
ara-
met
erT
rue
mea
nM
ean
(SD
)M
ean
(SD
)M
ean
(SD
)T
rue
mea
nM
ean
(SD
)M
ean
(SD
)M
ean
(SD
)
Lin
ear
reg
ress
ion
b 20.
000.
00(0
.08)
0.00
(0.0
3)�
0.01
(0.1
7)1.
001.
00(0
.08)
0.21
(0.0
6)2.
18(0
.19)
b 30.
000.
00(0
.07)
0.00
(0.0
3)0.
00(0
.15)
2.00
2.00
(0.0
7)0.
63(0
.05)
3.52
(0.0
5)b 4
0.00
0.00
(0.0
7)0.
00(0
.03)
0.00
(0.1
4)3.
003.
00(0
.07)
1.17
(0.0
6)4.
08(0
.03)
QT
DT
Mal
las
b 20.
000.
00(0
.11)
0.00
(0.0
5)0.
00(0
.27)
1.00
1.00
(0.1
1)0.
16(0
.15)
1.70
(0.2
8)b 3
0.00
0.00
(0.1
0)0.
00(0
.04)
0.00
(0.2
2)2.
002.
00(0
.10)
0.59
(0.1
2)3.
52(0
.14)
b 40.
000.
00(0
.10)
0.00
(0.0
4)0.
00(0
.20)
3.00
3.00
(0.1
0)1.
08(0
.16)
4.50
(0.1
4)Q
TD
TM
asre
pla
ced
by
par
enta
lg
eno
typ
esb 2
0.00
0.00
(0.1
1)0.
00(0
.05)
0.00
(0.2
6)1.
001.
00(0
.11)
0.15
(0.1
4)2.
21(0
.22)
b 30.
000.
00(0
.10)
0.00
(0.0
4)0.
00(0
.22)
2.00
2.00
(0.1
0)0.
57(0
.12)
3.56
(0.0
8)b 4
0.00
0.00
(0.1
0)0.
00(0
.04)
0.00
(0.1
9)3.
003.
00(0
.10)
1.10
(0.1
4)4.
12(0
.06)
QT
DT
Mas
atb
oth
loci
b 20.
000.
00(0
.10)
0.00
(0.0
4)0.
00(0
.22)
1.00
1.00
(0.1
0)0.
14(0
.10)
1.79
(0.2
1)b 3
0.00
�0.
01(0
.09)
0.00
(0.0
4)0.
00(0
.19)
2.00
1.99
(0.0
9)0.
60(0
.10)
3.78
(0.1
0)b 4
0.00
0.00
(0.0
9)0.
00(0
.04)
0.00
(0.1
8)3.
003.
00(0
.09)
1.12
(0.1
3)4.
34(0
.10)
QC
PG
alla0
sb0
20.
000.
00(0
.13)
0.01
(0.3
1)0.
00(0
.06)
1.00
1.11
(0.1
8)0.
68(0
.90)
1.36
(0.9
6)b0
30.
00�
0.01
(0.1
1)�
0.01
(0.2
5)0.
00(0
.05)
2.00
2.23
(0.2
2)1.
76(0
.83)
2.69
(1.4
3)b0
40.
000.
00(0
.11)
0.00
(0.2
5)0.
00(0
.05)
3.00
3.44
(0.3
6)2.
80(0
.97)
3.83
(1.5
6)Q
CP
Ga0
sre
pla
ced
by
off
spri
ng
gen
oty
pe
(g)
b02
0.00
0.00
(0.1
2)0.
00(0
.28)
0.00
(0.0
6)1.
001.
06(0
.14)
0.86
(0.7
6)1.
32(0
.88)
b03
0.00
0.00
(0.1
0)0.
00(0
.24)
0.00
(0.0
5)2.
002.
12(0
.19)
1.86
(0.6
7)2.
60(1
.10)
b04
0.00
0.00
(0.1
0)0.
00(0
.23)
0.00
(0.0
4)3.
003.
19(0
.27)
2.87
(0.7
6)3.
67(1
.17)
Sim
ula
ted
wit
hn
op
op
ula
tio
nst
rati
fica
tio
n.
824 Wheeler and Cordell
Genet. Epidemiol. DOI 10.1002/gepi
![Page 13: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/13.jpg)
TA
BL
EV
I.M
ea
ne
stim
ate
sa
nd
sta
nd
ard
de
via
tio
ns
of
sim
ula
ted
two
-lo
cus
ha
plo
typ
ee
ffe
cts
usi
ng
the
sim
ple
lin
ea
rre
gre
ssio
n,
QT
DT
M(w
ith
dif
fere
nt
sets
of
nu
isa
nce
pa
ram
ete
rs)
an
dth
eQ
CP
G(w
ith
dif
fere
nt
sets
of
nu
isa
nce
pa
ram
ete
rs)
Nu
llm
od
elA
lter
nat
ive
mo
del
Sam
pli
ng
sch
eme
No
.o
f.re
pli
cate
s
Ran
do
m99
5T
op
941
To
p1
bo
tto
m99
5R
and
om
996
To
p82
To
p1
bo
tto
m50
2
Met
ho
dP
ara-
met
erT
rue
mea
nM
ean
(SD
)M
ean
(SD
)M
ean
(SD
)T
rue
mea
nM
ean
(SD
)M
ean
(SD
)M
ean
(SD
)
Lin
ear
reg
ress
ion
b 20.
003.
30(0
.28)
0.00
(0.0
4)4.
14(0
.39)
1.00
4.30
(0.2
8)0.
26(0
.14)
0.96
(0.3
2)b 3
0.00
2.07
(0.3
7)0.
00(0
.05)
2.06
(0.4
1)2.
004.
07(0
.37)
0.31
(0.1
0)6.
74(0
.64)
b 40.
004.
07(0
.16)
0.00
(0.0
3)6.
22(0
.17)
3.00
7.07
(0.1
5)0.
50(0
.09)
9.21
(0.0
2)Q
TD
TM
allas
b 20.
00�
0.01
(0.4
2)0.
00(0
.06)
0.00
(0.5
1)1.
000.
99(0
.42)
0.23
(0.1
9)0.
90(0
.45)
b 30.
00�
0.01
(0.4
2)0.
00(0
.07)
0.00
(0.4
4)2.
001.
99(0
.42)
0.30
(0.1
4)5.
73(0
.93)
b 40.
00�
0.01
(0.3
3)0.
00(0
.04)
0.00
(0.4
1)3.
002.
99(0
.33)
0.50
(0.1
3)8.
06(0
.50)
QT
DT
Mas
rep
lace
db
yp
aren
tal
gen
oty
pes
b 20.
00�
0.01
(0.4
3)0.
000.
06�
0.01
(0.5
1)1.
000.
99(0
.43)
0.13
(0.2
3)0.
84(0
.39)
b 30.
00�
0.01
(0.4
4)0.
000.
060.
00(0
.45)
2.00
1.99
(0.4
4)0.
17(0
.23)
6.48
(0.7
4)b 4
0.00
�0.
01(0
.33)
0.00
0.04
0.00
(0.4
2)3.
002.
99(0
.33)
0.37
(0.2
3)8.
79(0
.16)
QT
DT
Mas
atb
oth
loci
b 20.
000.
20(0
.37)
0.00
(0.0
5)�
0.01
(0.4
7)1.
001.
20(0
.37)
0.24
(0.1
6)0.
66(0
.36)
b 30.
000.
23(0
.35)
0.00
(0.0
5)0.
00(0
.41)
2.00
2.23
(0.3
5)0.
31(0
.11)
6.45
(0.7
3)b 4
0.00
0.04
(0.3
2)0.
00(0
.04)
0.00
(0.4
1)3.
003.
03(0
.32)
0.51
(0.1
0)8.
53(0
.26)
QC
PG
alla0
sb0
20.
000.
00(0
.05)
0.01
(0.3
2)0.
00(0
.04)
1.00
�0.
03(0
.08)
8.83
(12.
9)1.
07(0
.55)
b03
0.00
0.00
(0.0
5)0.
00(0
.36)
0.00
(0.0
3)2.
000.
20(0
.09)
9.59
(12.
7)3.
30(1
.73)
b04
0.00
0.00
(0.0
3)0.
00(0
.23)
0.00
(0.0
2)3.
000.
44(0
.10)
10.5
1(1
2.7)
4.13
(1.7
7)Q
CP
Ga0
sre
pla
ced
by
off
spri
ng
gen
oty
pe
(g)
b02
0.00
0.00
(0.0
3)0.
01(0
.29)
0.00
(0.0
2)1.
000.
08(0
.04)
7.64
(9.1
5)1.
05(0
.50)
b03
0.00
0.00
(0.0
3)0.
01(0
.32)
0.00
(0.0
2)2.
000.
14(0
.04)
8.41
(9.0
1)3.
46(1
.85)
b04
0.00
0.00
(0.0
3)0.
00(0
.22)
0.00
(0.0
2)3.
000.
31(0
.05)
9.31
(9.0
3)4.
30(1
.92)
Sim
ula
ted
wit
hp
op
ula
tio
nst
rati
fica
tio
n.
825Quantitative Trait Association in Parent Offspring Trios
Genet. Epidemiol. DOI 10.1002/gepi
![Page 14: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/14.jpg)
parameters for the QPL and QCPG will not becorrectly estimated under population stratification,except under the null. Under the alternative withselection, all the methods gave biased estimates asfound in the single-locus case.
We also investigated the QTDTM and QCPGmethods with a single replicate of data generatedunder a three-locus haplotype model (data notshown). Results were broadly similar to the two-locus haplotype results, except that the QCPGmethod required a very large number (50,000) triosto produce unbiased estimates, while QTDTM gen-erally achieved convergence and unbiased para-meter estimation with only 1,000 trios.
STEPWISE PROCEDURE
A stepwise procedure (results not shown), as usedby Cordell et al. [2004] for disease traits, was used tocompare the prospective QTDTM (using the full setof nuisance parameters) with the QCPG methodwith the ‘correct’ a0 parameters replaced by theoffspring genotype, under models with and withoutpopulation stratification and selection. In general,the pattern of results in terms of power and type1 error was as expected, with the QTDTM methodbeing the more powerful in general. Under popula-tion stratification we found that the Type I errorswere slightly too large for the QTDTM method underrandom sampling, consistent with the resultsobserved (Table 6 (online)) in the single locus simu-lations. Additional simulations (data not shown)indicated that this problem could be solved by use ofWald tests incorporating robust ‘information sand-wich’ variance estimates [Huber, 1967], rather thanlikelihood ratio tests or Wald tests with the usualvariance estimate (which equals minus the inverse ofthe Hessian matrix). We also investigated the powerand type 1 error of the stepwise approach whenapplied to non-normally distributed traits and foundthat both QTDTM and QCPG appear to be suitablefor the analysis of traits that deviate slightly fromnormality. Neither method was found tobe suitable for the analysis of very nonnormallydistributed traits, although it is worth noting thatthe prospective QTDTM method could easily beextended to enable the analysis of nonnormal traits byuse of robust regression, a generalised linear model(GLM), or by assuming a variance-mean relationship,according to the departure from normality.
MATERNAL GENOTYPEAND PARENT-OF-ORIGIN EFFECTS
The previous single-locus simulations were mod-ified to include maternal genotype and parent-of-origin effects. For each replicate, 1,000 trios weregenerated in which the offspring’s quantitative trait
was influenced by its own genotype, and by eitherthe mother’s genotype, or whether the offspringreceived a variant allele from the mother, or both.Under the alternative, 100 replicates of data weregenerated. Under the null (no maternal genotypeeffects or no parent-of-origin effects) 1,000 replicatesof data were generated. The QCEPG and QTDTM
methods were implemented in Stata. For Kistner andWeinberg’s approach [Kistner et al., 2006], the SASmacro provided at http://dir.niehs.nih.gov/dirbb/weinbergfiles/qpl.htm was used. The expectedeffect estimates for QCEPG and QTDTM should bethe same, since the traits were simulated to have unitvariance. The offspring reference category waschosen to be the 1/1 genotype, and b1 and b2 arethe estimated effects for the 1/2 (2/1) and 2/2genotypes respectively. Similarly, maternal genotypeeffects are denoted as bm1 and bm2, and parent-of-origin effects by bI. However, the expected estimatesfor Kistner and Weinberg’s method shouldbe slightly different. In their method the referencecategory for the offspring genotype effects is theheterozygous genotype, rather than the homozygous(1/1) genotype. The maternal effects, denoted by d01
and d12, compare the difference in quantitative traitfor a mother with 1 variant allele to a mother with 0variant alleles, and a mother with 2 variant alleles toa mother with 1 variant allele respectively (while inthe QCEPG method, both comparisons are madewith the homozygous 1/1 genotype category). Theestimates for the parent-of-origin effects (l1) in theQPL represent the log odds that a heterozygouschild inherits a maternal copy of the variant alleleinstead of a paternal copy, per unit increase in traitvalue. Although based only on heterozygous off-spring, these parameters are expected be the same asfor the QCEPG and QTDTM methods.
Tables 8 and 9 (online) show the true effects, theestimated means and standard deviations. All threemethods produce reasonable estimates under thenull. The results for the prospective QTDTM methodshow the least bias. Under the alternative, theretrospective methods show a bias when parent-of-origin effects are present. The QPL appears toproduce parent-of-origin effects of approximately0.5, when they would have been expected to be 1.This may be due to some unrecognised difference inthe parameterizations: the QCEPG uses the originalparent-of-origin parameterization of Weinberg et al.[1998], whereas the QPL uses a parameterizationcloser to the alternative parameterization suggestedby Weinberg [1999]. Table 10 (online) shows thepowers and type I errors. The type I errors for theQPL and QTDTM methods seem reasonable. How-ever, the type I error for the QCEPG method whentesting maternal genotype effects is very large,although this appears to be a small-sample issue as
826 Wheeler and Cordell
Genet. Epidemiol. DOI 10.1002/gepi
![Page 15: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/15.jpg)
it improved in simulations with a larger number oftrios (data not shown). Overall, the extension of theQPL method had the highest power to detect either amaternal genotype or parent-of-origin effect.
DISCUSSION
In this paper, we have extended the case/pseudocontrol association approach for dichoto-mous phenotypes [Cordell and Clayton, 2002] toperform association analysis with quantitative traits.This approach is very similar to the QPL approachproposed by Kistner and Weinberg [2004], but uses aslightly more intuitive parameterization and extendsmore naturally to allow analysis of multiallelicmarkers, multiple linked loci, multiple unlinkedregions, parent-of-origin or maternal genotypeeffects, gene-gene and gene-environment interac-tions, using the same formulation as Cordell et al.[2004]. We compared this approach to a prospectiveaproach, the QTDTM and also extended the QTDTM
to allow analysis of multiple linked loci (includingmulti-locus haplotypes), parent-of-origin or mater-nal genotype effects. Other extensions to the QTDTM
follow naturally.All the methods incorporate nuisance parameters
intended to account for population stratification.When considering multi-locus haplotypes, the num-ber of nuisance parameters can dramaticallyincrease, and so it is important to find ways toreduce the number of nuisance parameters. It wasfound that replacing the nuisance parameters by theoffspring genotype in the retrospective methodsworked almost as well as the full model, andreplacing the nuisance parameters by parentalgenotypes worked well for the QTDTM. In oursimulations, it was assumed that both parents camefrom the same sub-population. If, in fact, matingsoccurred between individuals from different sub-populations, one might not expect these approxima-tions to work as well as fitting the full set of nuisanceparameters.
Although the retrospective approaches had someadvantages with regard to estimation of parametersunder selected sampling, in general we found theprospective QTDTM to be the most efficientapproach, requiring smaller sample sizes to achieveconvergence and asyptotic behaviour. In addition,the parameter estimates provided by the QTDTM
have a more intuitive interpretation, correspondingto the direct genotype effects on the trait, whereasthe retrospective approaches estimate parametersthat are scaled by division by the unknown(although potentially estimable) trait variance. Cov-ariates are also more easily incorporated into theQTDTM framework, simply by adding them in asterms in the regression equation, although it would
be possible to incorporate covariates in the retro-spective approaches, either by first regressing thetraits on covariates of interest and performingsubsequent analysis on the residuals, or by usingmethods such as those described by Lim et al. [2005].
The QTDTM method was found to be the onlymethod suitable for estimation of effects under thealternative hypothesis with population stratifi-cation (assuming random sampling). Under popula-tion stratification, it was necessary to use robust‘information sandwich’ variance estimates to achievecorrect type 1 errors and confidence interval cover-age with the QTDTM. This is possibly because theparental mating-type stratification parameters act asa surrogate for population membership in the sensethat they soak up the mean level of bias inducedby population stratification, but do they not fullyaccount for population membership, so that thedistribution of trait within parental mating-typeclasses violates the assumption of normality,even if this asssumption holds within eachsub-population.
The analyses described here assumed availabilityof a dataset consisting of parent-offspring trios, withno missing genotype data. A natural extension of themethods proposed here would be to consideranalysis of large extended pedigrees and/or missinggenotype data. The QPL has previously beenextended to allow analysis of multiple siblings andmissing parents [Kistner and Weinberg, 2005] whilean approach asymptotically equivalent to QTDTM,namely the HQTDT of Abecasis et al. [2000], hasbeen extended to apply to pedigrees of arbitrarystructure [Abecasis et al., 2002]. However, theseapproaches focus on testing rather than estimationof effects and apply only to a single locus at a time.A natural way to extend the QCEPG and QTDTM
approaches developed here for analysis of generalpedigrees would be to perform tests using Wald testsand incorporate robust ‘information sandwich’variance estimates that cluster observations accord-ing to pedigree [Huber, 1967]. An alternativeapproach would be to use a random-effects model-ling framework [Xu and Shete, 2006]. With regardsto missing genotype data, methods that sample oraverage over the possible genotype configurationsconsistent with the observed genotype data, in thecorrect proportions [Cordell, 2006], could be con-sidered. Investigation of these approaches and theirbehaviour under complex disease models, in thepresence of population stratification, will form thebasis of future work.
ACKNOWLEDGMENTS
Support for this work was provided by theWellcome Trust (Grant references 074524
827Quantitative Trait Association in Parent Offspring Trios
Genet. Epidemiol. DOI 10.1002/gepi
![Page 16: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/16.jpg)
and 068612). We thank Jenefer Blackwell, JoannaBiernacka and Jeff O’Connell for useful discussions,and to Joanna Howson and Hin-Tak Leung fortechnical assistance.
REFERENCESAbecasis GR, Cardon LR, Cookson WOC. 2000. A general test of
association for quantitative traits in nuclear families. AmJ Hum Genet 66:279–292.
Abecasis GR, Cherney SS, Cookson WO, Cardon LR. 2002. Merlin-rapid analysis of dense genetic maps using sparse gene flowtrees. Nat Genet 30:97–101.
Allison DB. 1997. Transmission-disequilibrium tests for quantita-tive traits. Am J Hum Genet 60:676–690.
Cordell HJ, Barratt BJ, Clayton DG. 2004. Case/pseudocontrolanalysis in genetic association studies: a unified framework fordetection of genotype and haplotype associations, gene-geneand gene-environment interactions, and parent-of-origineffects. Genet Epidemiol 26:167–185.
Cordell HJ, Clayton DG. 2002. A unified stepwise regres-sion procedure for evaluating the relative effects of poly-morphisms within a gene using case/control or family data:application to HLA in type 1 diabetes. Am J Hum Genet 70:124–141.
Cordell HJ. 2006. Estimation and testing of genotypeand haplotype effects in case/control studies: comparison ofweighted regression and multiple imputation procedures.Genet Epidemiol 30:259–275.
Fulker DW, Cherny SS, Sham PC, Hewitt JK. 1999. Combinedlinkage and association sib-pair analysis for quantitative traits.Am J Hum Genet 64:259–267.
Gauderman WJ. 2003. Candidate gene association analysis for aquantitative trait, using parent-offspring trios. Genet Epide-miol 25:327–338.
Huber PJ. 1967. The behaviour of maximum likelihood estimatesunder nonstandard conditions. In: Proceedings of the FifthBerkeley Symposium on Mathematical Statistics and Prob-ability, 221–233.
Kistner EO, Infante-Rivard C, Weinberg CR. 2006. A method forusing incomplete triads to test maternally mediated geneticeffects and parent-of-origin effects in relation to a quantitativetrait. Am J Epidemiol 163:255–261.
Kistner EO, Weinberg CR. 2004. Method for using complete andincomplete trios to identify genes related to a quantitative trait.Genet Epidemiol 27:33–42.
Kistner EO, Weinberg CR. 2005. A method for identifying genesrelated to a quantitative trait, incorporating multiple siblingsand missing parents. Genet Epidemiol 29:155–165.
Laird NM, Horvath S, Xu X. 2000. Implementing a unifiedapproach to family-based tests of association. Genet Epidemiol19 (suppl 1):S36–S42.
Lange C, DeMeo DL, Laird NM. 2002. Power and designconsiderations for a general class of family-based associationtests: quantitative traits. Am J Hum Genet 71:1330–1341.
Lim S, Beyenne J, Greenwood CMT. 2005. Continuous covariatesin genetic association studies of case-parent triads: gene-environment interaction effects, population stratification, andpower analysis. Statisti Appl Genetics Mol Biol 4:1–25.
Liu Y, Tritchler D, Bull SB. 2002. A unified framework fortransmission-disequilibrium test analysis of discrete andcontinuous traits. Genet Epidemiol 22:26–40.
Lunetta K, Faraone S, Biederman J, Laird N. 2000. Family-basedtests of association and linkage that use unaffectedsibs, covariates, and interactions. Am J Hum Genet 66:605–614.
Schaid DJ, Sommer SS. 1993. Genotype relative risks: methods fordesign and analysis of candidate-gene association studies. AmJ Hum Genet 53:1114–1126.
Schaid DJ. 1996. General score tests for associations of geneticmarkers with disease using cases and their parents. GenetEpidemiol 13:423–449.
Spielman RS, McGinnis RE, Ewens WJ. 1993. Transmission test forlinkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet 52:506–1993.
Weinberg CR, Wilcox AJ, Lie RT. 1998. A log-linear approachto case-parent-triad data: assessing effects of disease genesthat act either directly or through maternal effects and thatmay be subject to parental imprinting. Am J Hum Genet 62:969–978.
Weinberg CR. 1999. Methods for detection of parent-of-origineffects in genetic studies of case-parents triads. Am J HumGenet 65:229–235.
Whittaker JC, Gharani N, Hindmarsh P, McCarthy MI. 2003.Estimation and testing of parent-of-origin effects for quantita-tive traits. Am J Hum Genet 72:1035–1039.
Xu H, Shete S. 2006. Mixed-effects logistic approach for associa-tion following linkage scan for complex disorders. HumHered, in press.
Yang Q, Rabinowitz D, Isasi C, Shea S. 2000. Adjusting forconfounding due to population admixture when estimating theeffect of candidate genes on quantitative traits. Hum Hered 50:227–233.
APPENDIX A
EXPRESSION OF QCPG LIKELIHOOD
The retrospective QCPG likelihood is parameterized in terms of parameters of interest (offspring genotypeeffects, denoted b0) and several nuisance parameters (denoted a0) as shown in Table II. Here we express the b0
and a0 parameters in the QCPG likelihood in terms of various parameters (denoted a and b) in a prospectivemodel for trait given genotype. We do not propose to reparameterize the QCPG likelihood of Table II in termsof these prospective parameters before maximization. Rather, we continue to freely estimate the b0 and a0
parameters when we maximise the QCPG likelihood. However, we use the relationship between the
828 Wheeler and Cordell
Genet. Epidemiol. DOI 10.1002/gepi
![Page 17: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/17.jpg)
retrospective and prospective parameters to inform our understanding and interpretation of the restrospectiveparameters, since the a and b parameters in the prospective model are generally more intuitively interpretablethan those on the retrospective scale.
For a normally distributed quantitative trait Y with mean m and variance s2, the probability distributionfunction of the observed trait yi is given by
fðyi; m;s2Þ ¼1ffiffiffiffiffiffiffiffiffiffiffi
2ps2p exp �
ðyi � mÞ2
2s2
� �ð7Þ
For a trio with parental mating type M and offspring genotype gi, m equals the mean of the prospective QTDTM
approach from equation (1), that is m ¼ aM þ bgi. Equation (7) becomes
fðyi; aM;bgi;s2Þ ¼
1ffiffiffiffiffiffiffiffiffiffiffi2ps2p exp �
ðyi � ðaM þ bgiÞÞ
2
2s2
!ð8Þ
From equation (3), the contribution of a trio to the likelihood can be expressed as
Pðgijgim; gif ; yiÞ ¼PðyijgiÞPðgijgim; gif ÞP
g�i2S0
MPðyijg�i ÞPðg
�i jgim; gif Þ
ð9Þ
So, from equations (8) and (9), the contribution of a trio to the likelihood for a normally distributed trait withmean aM þ bgi
and variance s2 is given by
Pðgijgim; gif ; yiÞ ¼
1ffiffiffiffiffiffiffiffi2ps2p exp �
ðyi�ðaMþbgiÞÞ
2
2s2
� �Pðgijgim; gif Þ
Pg�
i2S0
M
1ffiffiffiffiffiffiffiffi2ps2p exp �
ðyi�ðaMþbg�iÞÞ
2
2s2
� �Pðg�i jgim; gif Þ
¼
exp �ðyi�ðaMþbgi
ÞÞ2
2s2
� �Pðgijgim; gif Þ
Pg�
i2S0
Mexp �
ðyi�ðaMþbg�iÞÞ
2
2s2
� �Pðg�i jgim; gif Þ
¼
exp �y2
i
2s2 þ2yiðaMþbgi
Þ
2s2 �ðaMþbgi
Þ2
2s2
� �Pðgijgim; gif Þ
Pg�
i2S0
Mexp �
y2i
2s2 þ2yiðaMþbg�
iÞ
2s2 �ðaMþbg�
iÞ2
2s2
� �Pðg�i jgim; gif Þ
: ð10Þ
In each offspring’s contribution to the likelihood, the quantitative trait, yi, its variance, s2, and the parentalmating type parameter aM are the same for the offspring (the ‘case’) and the pseudocontrols. Therefore,cancelling terms from the numerator and denominator,
Pðgijgim; gif ; yiÞ ¼
expyibgi
s2 �ðaMþbgi
Þ2
2s2
� �Pðgijgim; gif Þ
Pg�
i2S0
Mexp
yibg�i
s2 �ðaMþbg�
iÞ2
2s2
� �Pðg�i jgim; gif Þ
¼
expbgi
s2 yi �aM
2 þ2aMbgiþbgi
2
2s2
� �Pðgijgim; gif Þ
Pg�
i2S0
Mexp
bg�i
s2 yi �aM
2 þ2aMbg�iþb�gi
2
2s2
� �Pðg�i jgim; gif Þ
¼exp
bgi
s2 yi �bgið2aMþbgi
Þ
2s2
� �Pðgijgim; gif ÞP
g�i2S0
Mexp
bg�i
s2 yi �bg�
ið2aMþbg�
iÞ
2s2
� �Pðg�i jgim; gif Þ
which can be written in the following form:
829Quantitative Trait Association in Parent Offspring Trios
Genet. Epidemiol. DOI 10.1002/gepi
![Page 18: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/18.jpg)
Pðgijgim; gif ; yiÞ ¼exp
bgi
s2 yi
h iþ �
bgið2aMþbgi
Þ
2s2 þ lnðPðgijgim; gif ÞÞ
h i� �P
g�i2S0
Mexp
bg�i
s2 yi
� �þ �
bg�ið2aMþbg�
iÞ
2s2 þ lnðPðg�i jgim; gif ÞÞ
� �� � ð11Þ
This is in the general form of equation (4), where
b0g�i¼
bg�i
s2 and a0Mg�i¼ �
bg�ið2aMþbg�
iÞ
2s2 þ lnðPðg�i jgim; gif ÞÞ
Thus the parameters from the retrospective QCPG formulation (equation (4)) may be interpreted as follows.
The term b0g�i¼
bg�i
s2 is the original genotype effect of genotype g�i on the prospective scale, divided by the trait
variance. (Hence, given parameter estimates b0g�i
from fitting the QCPG model, the genotype effects on the
prospective scale could be obtained by multiplying the estimate of b0g�i
by the corresponding trait variance, if it wereknown or estimable.) The termsa0Mg�
i¼ �bg�
ið2aM þ bg�
iÞ=2s2 þ lnðPðg�i jgim; gifÞÞ correspond to nuisance parameters
that allow the model to account for non-Mendelianism and population stratification, as proposed by Kistner andWeinberg [2004]. As discussed in the text, not all of the b0 and a0 are estimable, so restrictions must be made such assetting some of these equal to zero (equivalent to choosing a reference genotype category to which the othergenotype effects are compared). This complicates the interpretation of the nuisance parameters. Suppose that allparameters were in fact estimable. Suppose further that there is no non-Mendelianism, so that Pðg�i jgim; gif Þ ¼ 0:25for all offspring genotypes consistent with the parental genotypes. In that case, we could write out in full therelationships between the retrospective and prospective parameters implied by equation (11):
b00 ¼b0
s2;b01 ¼
b1
s2;b02 ¼
b2
s2
a0010 ¼ �b0ð2a01 þ b0Þ
2s2þ lnð0:25Þ
a0011 ¼ �b1ð2a01 þ b1Þ
2s2þ lnð0:25Þ
a0110 ¼ �b0ð2a11 þ b0Þ
2s2þ lnð0:25Þ
a0111 ¼ �b1ð2a11 þ b1Þ
2s2þ lnð0:25Þ
a0112 ¼ �b2ð2a11 þ b2Þ
2s2þ lnð0:25Þ
a0121 ¼ �b1ð2a12 þ b1Þ
2s2þ lnð0:25Þ
a0122 ¼ �b2ð2a12 þ b2Þ
2s2þ lnð0:25Þ
In practice, not all the parameters are estimable and we set b01, a0011, a0111 and a0121 (the reference parameters)equal to zero. Assuming that on the prospective scale we also set b1 5 0, and rearranging the expressions above,the relationships become
b00 ¼b0
s2;b01 ¼ 0;b02 ¼
b2
s2
a0010 ¼ a0011þb1ð2a01 þ b1Þ
2s2�
b0ð2a01 þ b0Þ
2s2¼ �
b0ð2a01 þ b0Þ
2s2
a0110 ¼ a0111þb1ð2a11 þ b1Þ
2s2�
b0ð2a11 þ b0Þ
2s2¼ �
b0ð2a11 þ b0Þ
2s2
a0112 ¼ a0111þb1ð2a11 þ b1Þ
2s2�
b2ð2a11 þ b2Þ
2s2¼ �
b2ð2a11 þ b2Þ
2s2
a0122 ¼ a0121þb1ð2a12 þ b1Þ
2s2�
b2ð2a12 þ b2Þ
2s2¼ �
b2ð2a12 þ b2Þ
2s2
Under the null hypothesis, the prospective genotype effects b0 and b2 equal zero, and so the true values of theretrospective parameters (both the genotype parameters of interest and the nuisance parameters) will also equal zero.
830 Wheeler and Cordell
Genet. Epidemiol. DOI 10.1002/gepi
![Page 19: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/19.jpg)
A similar argument applies if, instead of setting b1 and b01 to zero, we set b0 and b00 to zero, so thatthe genotype parameters are all calculated relative to the homozygous 1/1 genotype. In this case, we still findthat under the null hypothesis that b1 and b2 equal zero, the true values of the retrospective parameters (boththe genotype parameters of interest and the nuisance parameters) equal zero (data not shown).
The above derivation for the nuisance parameters only works because Pðg�i jgim; gif Þ ¼ 0:25 in all cases and sois subtracted out when we express a0010, a0110 , a0112 and a0122 in terms of a0011, a0111 and a0121. If, in fact, there wasnon-Mendelianism, we would have instead that
a0jkl ¼ �blð2ajk þ blÞ
2s2þ lnðPðg�i ¼ ljfgim; gifg ¼ fj; kgÞÞ
� lnðPðg�i ¼ 1jfgim; gifg ¼ fj; kgÞÞ
¼ �blð2ajk þ blÞ
2s2þ cjkl; say:
In this case the term cjkl will be nonzero, but fitting a0jkl allows the model to account for this, so that even underthe null hypothesis where b0 and b2 equal zero, the nuisance parameters a0jkl will not equal zero.
The effect of population stratification is modelled on the prospective scale via m ¼ aM þ bgiwhich allows the
mean trait value for a child to vary according to the genotype of its parents. If there really were populationstratification, the correct model would in fact be m ¼ mp þ bgi
, where mp denotes the mean trait value in the(unknown) sub-population p from which the child originates. If we replace aM bymp everywhere above, we find that
a0jkl ¼ �blð2mp þ blÞ
2s2þ cjkl
Thus the true value of the nuisance parameters should vary according to sub-population, which we donot allow for. However, under the null hypothesis where b0 and b2 equal zero, a0jkl does not vary by sub-population and so we should still obtain valid inference for the b0 under the null, even though the modelmisspecification means we may not obtain valid inference under the alternative.
APPENDIX B
EXPRESSION OF QPL LIKELIHOOD
We may use a similar argument as in Appendix A to derive the relationship between the a00 andb00 parameters in the QPL model and the prospective a and b parameters. The only difference is in thesummation in the denominator of equation (11), which is over either two or three possible offspring genotypecategories corresponding to the categories shown in Table I, (e.g. 1/1, 1/2 (unordered) and 2/2 for offspringof two heterozygous parents, rather than over four genotype categories 1/1, 1/2, 2/1 and 2/2). The result ofthis is that the relationships implied by equation (11) become
b000 ¼b0
s2; b001 ¼
b1
s2;b02 ¼
b2
s2
a00010 ¼ �b0ð2a01 þ b0Þ
2s2þ lnð0:5Þ
a00011 ¼ �b1ð2a01 þ b1Þ
2s2þ lnð0:5Þ
a00110 ¼ �b0ð2a11 þ b0Þ
2s2þ lnð0:25Þ
a00111 ¼ �b1ð2a11 þ b1Þ
2s2þ lnð0:5Þ
a00112 ¼ �b2ð2a11 þ b2Þ
2s2þ lnð0:25Þ
a00121 ¼ �b1ð2a12 þ b1Þ
2s2þ lnð0:5Þ
a00122 ¼ �b2ð2a12 þ b2Þ
2s2þ lnð0:5Þ
831Quantitative Trait Association in Parent Offspring Trios
Genet. Epidemiol. DOI 10.1002/gepi
![Page 20: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/20.jpg)
where the terms Pðg�i jgim; gif Þ differ according to offspring genotype, for offspring of two heterozygous parents.If, as in Appendix A, we set b1, b01, a0011, a0111 and a0121 to zero, we obtain the following equations:
b000 ¼b0
s2;b001 ¼ 0;b002 ¼
b2
s2
a00010 ¼ a0011þb1ð2a01 þ b1Þ
2s2�
b0ð2a01 þ b0Þ
2s2¼ �
b0ð2a01 þ b0Þ
2s2
a00110 ¼ a0111� lnð0:5Þ þ lnð0:25Þ þb1ð2a11 þ b1Þ
2s2�
b0ð2a11 þ b0Þ
2s2¼ �
b0ð2a11 þ b0Þ
2s2þ lnð0:5Þ
a00112 ¼ a0111� lnð0:5Þ þ lnð0:25Þ þb1ð2a11 þ b1Þ
2s2�
b2ð2a11 þ b2Þ
2s2¼ �
b2ð2a11 þ b2Þ
2s2þ lnð0:5Þ
a00122 ¼ a0121þb1ð2a12 þ b1Þ
2s2�
b2ð2a12 þ b2Þ
2s2¼ �
b2ð2a12 þ b2Þ
2s2
Under the null hypothesis that b0 and b2 equal zero, the a00jkl do not all equal zero, and so it is necessary in theQPL model that the a00jkl be included in order to obtain correct inference, even when there is no populationstratification or non-Mendelianism.
APPENDIX C
DISCUSSION OF NUISANCE PARAMETERS
A difference between the QCPG and QPL methods arises with regard to the number of nuisance parametersestimated. It can be seen in Table I that there are a total of four nuisance parameters, a010, a110, a112,a122.However, by fitting the model using a polytomous logistic model as proposed by Kistner and Weinberg,two additional, essentially inestimable, nuisance parameters are estimated. These are referred to in Kistner andWeinberg ([2004] p. 36) as a012 and a120, corresponding to the situations where either one parent has no copiesof the variant allele but the offspring has two copies, or one parent has two copies but the offspring has none.There is no data to estimate these parameters (they correspond to impossible events) but the computer programtries to estimate them since they are included in the model. Implementation of the QPL approach using SAScode available from http://dir.niehs.nih.gov/dirbb/weinbergfiles/qpl.htm (data not shown) indicates thatestimation of these two inestimable a012 and a120 parameters is poor, with very large estimates and confidenceintervals, although this does not appear to adversely affect estimation of the six genuinely estimableparameters.
For the QCPG and QPL methods, if there are genotype effects present, the nuisance parameters are non-zero.When population stratification exists, the true values of these nuisance parameters may differ in the differentsub-populations (due to differences in aM between subpopulations). If estimated in the combined population,the nuisance parameters will be an average of those from the different sub-populations and so willnot necessarily provide accurate population-specific estimates. Hence, although the type 1 errors will becorrect, we do not necessarily expect the QCPG (or QPL) method to have unbiased parameter estimates whenused for estimation of effects under the alternative hypothesis with population stratification, even though theyshould be unbiased under the null.
When extending the QCPG method to multi-locus haplotypes, the number of nuisance parametersdramatically increases since the number of parental mating types and offspring genotypes increases. Thenuisance parameters in the single-locus case are of the form a0Mg�
i¼ �bg�
ið2aM þ bg�
iÞ=2s2 þ lnðPðg�i jgim; gif ÞÞ
(equation (11)).Through simulations (described in the main text), we investigate whether the parental genotype information
still required in the nuisance parameters can be captured if the nuisance parameters are replaced by parametersrepresenting the offspring genotype alone ðgg�
iÞ, thus reducing the overall number of nuisance parameters.
Equation (11) would then become
Pðgijgim; gif ; yiÞ ¼exp
bgi
s2 yi
h iþ ggi
� �P
g�i2S0
Mexp
bg�i
s2 yi
� �þ gg�
i
� �
832 Wheeler and Cordell
Genet. Epidemiol. DOI 10.1002/gepi
![Page 21: Quantitative trait association in parent offspring trios: Extension of case/pseudocontrol method and comparison of prospective and retrospective approaches](https://reader036.vdocuments.mx/reader036/viewer/2022080306/5750048e1a28ab11489f677e/html5/thumbnails/21.jpg)
The hope is that the population stratification can be accounted for by offspring genotype alone, rather thanby a combination of offspring and parental genotypes.
APPENDIX D
EXPRESSION OF THE QCEPG LIKELIHOOD
The rationale for expressing the likelihood equation (5) in the form of equation (6) is very similar to that usedfor the QCPG approach. Previously, for the QCPG method, the population mean of the quantitative trait Y (m)was assumed to equal the mean of the prospective QTDTM method such that for an offspring with genotypebgi
, m ¼ aM þ bgi, for mating type M and offspring genotype effect bgi
. To include maternal genotype or parent-
of-origin effects, the mean is now assumed to depend on parental mating type and the offspring genotypeeffect as before, but it now also depends on the maternal genotype effect, bgim
, and the parent-of-origin effectbIm
. Therefore,
m ¼ aM þ bgiþ bgim
þ bIm
Hence, the trait distribution can be expressed as a normal distribution with mean m as above, and substitutingthis into equation (5) gives
Pðgijgim; gif ; yi; xÞ ¼
1ffiffiffiffiffiffiffiffi2ps2p exp �
ðyi � ðaM þ bgiþ bgim
þ bImÞÞ
2
2s2
!Pðgijgim; gif Þ
Xg�
i;g�
im;g�
if2Gx
1ffiffiffiffiffiffiffiffiffiffiffi2ps2p exp �
ðyi � ðaM þ bg�iþ bg�
imþ b�Im
ÞÞ2
2s2
!Pðg�i jg
�im; g
�if Þ
¼
exp �ðy2
i � 2yiðaM þ bgiþ bgim
þ bImÞ þ ðaM þ bgi
þ bgimþ bIm
Þ2Þ
2s2
!Pðgijgim; gif Þ
Xg�
i;g�
im;g�
if2Gx
exp �ðy2
i � 2yiðaM þ bg�iþ bg�
imþ b�Im
Þ þ ðaM þ bg�iþ bg�
imþ b�Im
Þ2Þ
2s2
!Pðg�i jg
�im; g
�if Þ
¼
expyiðaM þ bgi
þ bgimþ bIm
Þ
s2�ðaM þ bgi
þ bgimþ bIm
Þ2
2s2
!Pðgijgim; gif Þ
Xg�
i;g�
im;g�
if2Gx
expyiðaM þ bg�
iþ bg�
imþ b�Im
Þ
s2�ðaM þ bg�
iþ bg�
imþ b�Im
Þ2
2s2
!Pðg�i jg
�im; g
�if Þ
¼
expbgi
s2yi þ
bgim
s2yi þ
bIm
s2yi �
o2s2
� �Pðgijgim; gif Þ
Xg�
i;g�
im;g�
if2Gx
expbg�
i
s2yi þ
bg�im
s2yi þ
b�Im
s2yi �
o�
2s2
!Pðgijgim; gif Þ
where
o� ¼ bg�ið2aM þ bg�
iÞ þ bg�
imð2aM þ bg�
imÞ
þ bImð2aM þ b�Im
Þ þ 2bg�iðbg�
imþ b�Im
Þ þ 2bg�imb�Im
833Quantitative Trait Association in Parent Offspring Trios
Genet. Epidemiol. DOI 10.1002/gepi