[Wiley Series in Probability and Statistics] Categorical Data Analysis || Building and Extending Loglinear/Logit Models


Post on 09-Dec-2016






<ul><li><p>CHAPTER 9</p><p>Building and Extending Loglinear/Logit Models</p>
<p>Categorical Data Analysis, Second Edition. Alan Agresti. Copyright 2002 John Wiley &amp; Sons, Inc.</p><p>ISBN: 0-471-36093-7</p>
<p>In Chapters 5 through 7 we presented logistic regression models, which use the logit link for binomial or multinomial responses. In Chapter 8 we presented loglinear models for contingency tables, which use the log link for Poisson cell counts. Equivalences between them were discussed in Section 8.5.3. In this chapter we discuss building and extending these models with contingency tables.</p>
<p>In Section 9.1 we present graphs that show a model's association and conditional independence patterns. In Section 9.2 we discuss selection and comparison of loglinear models. Diagnostics for checking models, such as residuals, are presented in Section 9.3.</p>
<p>The loglinear models of Chapter 8 treat all variables as nominal. In Section 9.4 we present loglinear models of association between ordinal variables. In Sections 9.5 and 9.6 we present generalizations that replace fixed scores by parameters. In the final section we discuss complications that occur with sparse contingency tables.</p>
<p>9.1 ASSOCIATION GRAPHS AND COLLAPSIBILITY</p>
<p>A graphical representation for associations in loglinear models indicates the pairs of conditionally independent variables. This representation helps reveal implications of models. Our presentation derives partly from Darroch et al. (1980), who used mathematical graph theory to represent certain loglinear models (called graphical models) having a conditional independence structure.</p>
<p>9.1.1 Association Graphs</p>
<p>FIGURE 9.1 Association graph for model (WX, WY, WZ, YZ).</p>
<p>An association graph has a set of vertices, each vertex representing a variable. An edge connecting two variables represents a conditional association between them. 
For instance, loglinear model (WX, WY, WZ, YZ) lacks XY and XZ terms. It assumes independence between X and Y and between X and Z, conditional on the remaining two variables. Figure 9.1 portrays this model's association graph. The four variables form the vertices. The four edges represent pairwise conditional associations. Edges do not connect X and Y or X and Z, the conditionally independent pairs.</p>
<p>Two loglinear models with the same pairwise associations have the same association graph. For instance, this association graph is also the one for model (WX, WYZ), which adds a three-factor WYZ interaction.</p>
<p>A path in an association graph is a sequence of edges leading from one variable to another. Two variables X and Y are said to be separated by a subset of variables if all paths connecting X and Y intersect that subset. For instance, in Figure 9.1, W separates X and Y, since any path connecting X and Y goes through W. The subset {W, Z} also separates X and Y.</p>
<p>A fundamental result states that two variables are conditionally independent given any subset of variables that separates them (Kreiner 1987; Whittaker 1990, p. 67). Thus, not only are X and Y conditionally independent given W and Z, but also given W alone. Similarly, X and Z are conditionally independent given W alone.</p>
<p>9.1.2 Collapsibility in Three-Way Contingency Tables</p>
<p>In Section 2.3.3 we showed that conditional associations in partial tables usually differ from marginal associations. Under certain collapsibility conditions, however, they are the same.</p>
<p>For three-way tables, XY marginal and conditional odds ratios are identical if either Z and X are conditionally independent or if Z and Y are conditionally independent.</p>
<p>The conditions state that the variable treated as the control (Z) is conditionally independent of X or Y, or both. These conditions occur for loglinear models (XY, YZ) and (XY, XZ). 
Thus, the fitted XY odds ratio is identical in the partial tables and the marginal table for models with association graphs</p>
<p>X - Y - Z and Y - X - Z</p>
<p>or even simpler models, but not for the model with graph</p>
<p>X - Z - Y</p>
<p>in which an edge connects Z to both X and Y. The proof follows directly from the formulas for models (XY, YZ) and (XY, XZ) (Problem 9.26).</p>
<p>We illustrate for the student survey (Table 8.3) from Section 8.2.4, with A = alcohol use, C = cigarette use, and M = marijuana use. Model (AM, CM) specifies AC conditional independence, given M. It has association graph</p>
<p>A - M - C.</p>
<p>Consider the AM association. Since C is conditionally independent of A, the AM fitted conditional odds ratios are the same as the AM fitted marginal odds ratio collapsed over C. From Table 8.5, both equal 61.9. Similarly, the CM association is collapsible. The AC association is not, because M is conditionally dependent with both A and C in model (AM, CM). Thus, A and C may be marginally dependent, even though they are conditionally independent. In fact, from Table 8.5, the fitted AC marginal odds ratio for this model is 2.7.</p>
<p>For model (AC, AM, CM), no pair is conditionally independent. No collapsibility conditions are fulfilled. Table 8.5 showed that each pair has quite different fitted marginal and conditional associations for this model. When a model contains all two-factor effects, effects may change after collapsing over any variable.</p>
<p>9.1.3 Collapsibility and Logit Models</p>
<p>The collapsibility conditions apply also to logit models. For instance, suppose that a clinical trial studies the association between a binary treatment variable X (\(x_1 = 1\), \(x_2 = 0\)) and a binary response Y, using data from K centers (Z). The logit model</p>
<p>\[\operatorname{logit}[P(Y = 1 \mid X = i, Z = k)] = \alpha + \beta x_i + \beta_k^Z\]</p>
<p>has the same treatment effect for each center. 
Since this model corresponds to loglinear model (XY, XZ, YZ), this effect may differ after collapsing the 2 &times; 2 &times; K table over centers. The estimated XY conditional odds ratio, \(\exp(\hat{\beta})\), typically differs from the sample odds ratio in the marginal 2 &times; 2 table.</p>
<p>Next, consider the simpler model that lacks center effects,</p>
<p>\[\operatorname{logit}[P(Y = 1 \mid X = i, Z = k)] = \alpha + \beta x_i.\]</p>
<p>For a given treatment, the success probability is identical for each center. The model satisfies a collapsibility condition, because it states that Z is conditionally independent of Y, given X. This logit model is equivalent to loglinear model (XY, XZ), for which the XY association is collapsible. So, when center effects are negligible and the simpler model fits nearly as well, the estimated treatment effect is approximately the marginal XY odds ratio.</p>
<p>9.1.4 Collapsibility and Association Graphs for Multiway Tables</p>
<p>Bishop et al. (1975, p. 47) provided a parametric collapsibility condition with multiway tables:</p>
<p>Suppose that a model for a multiway table partitions variables into three mutually exclusive subsets, A, B, C, such that B separates A and C. After collapsing the table over the variables in C, parameters relating variables in A and parameters relating variables in A to variables in B are unchanged.</p>
<p>We illustrate using model (WX, WY, WZ, YZ) (Figure 9.1). Let A = {X}, B = {W}, and C = {Y, Z}. Since the XY and XZ terms do not appear, all parameters linking set A with set C equal zero, and B separates A and C. If we collapse over Y and Z, the WX association is unchanged. Next, identify A = {Y, Z}, B = {W}, C = {X}. Then, conditional associations among W, Y, and Z remain the same after collapsing over X.</p>
<p>This result also implies that when any variable is independent of all other variables, collapsing over it does not affect any other model terms. 
For instance, associations among W, X, and Y in model (WX, WY, XY, Z) are the same as in (WX, WY, XY).</p>
<p>When set B contains more than one variable, although parameter values are unchanged in collapsing over set C, the ML estimates of those parameters may differ slightly. A stronger collapsibility definition also requires that the estimates be identical. This condition of commutativity of fitting and collapsing holds if the model contains the highest-order term relating variables in B to each other. Asmussen and Edwards (1983) discussed this property, which relates to decomposability of tables (Note 8.2).</p>
<p>9.2 MODEL SELECTION AND COMPARISON</p>
<p>Strategies for selecting and comparing loglinear models are similar to those for logistic regression discussed in Section 6.1. A model should be complex enough to fit well but also relatively simple to interpret, smoothing rather than overfitting the data.</p>
<p>9.2.1 Considerations in Model Selection</p>
<p>The potentially useful models are usually a small subset of the possible models. A study designed to answer certain questions through confirmatory analyses may plan to compare models that differ only by the inclusion of certain terms. Also, models should recognize distinctions between response and explanatory variables. The modeling process should concentrate on terms linking responses and terms linking explanatory variables to responses. The model should contain the most general interaction term relating the explanatory variables. From the likelihood equations, this has the effect of equating the fitted totals to the sample totals at combinations of their levels. This is natural, since one normally treats such totals as fixed. Related to this, certain marginal totals are often fixed by the sampling design. 
Any potential model should include those totals as sufficient statistics, so likelihood equations equate them to the fitted totals.</p>
<p>Consider Table 8.8 with I = automobile injury and S = seat-belt use as responses and G = gender and L = location as explanatory variables. Then we treat \(\{n_{g+\ell+}\}\) as fixed at each combination for G and L. For example, 20,629 women had accidents in urban locations, so the fitted counts should have 20,629 women in urban locations. To ensure this, a loglinear model should contain the GL term, which implies from its likelihood equations that \(\{\hat{\mu}_{g+\ell+} = n_{g+\ell+}\}\). Thus, the model should be at least as complex as (GL, S, I) and focus on the effects of G and L on S and I as well as the SI association. If S is also explanatory and only I is a response, \(\{n_{g+\ell s}\}\) should be fixed. With a single categorical response, relevant loglinear models correspond to logit models for that response. One should then use logit rather than loglinear models, when the main focus is describing effects on that response.</p>
<p>For exploratory studies, a search among potential models may provide clues about associations and interactions. One approach first fits the model having single-factor terms, then the model having two-factor and single-factor terms, then the model having three-factor and lower terms, and so on. Fitting such models often reveals a restricted range of good-fitting models. In Section 8.4.2 we used this strategy with the automobile injury data set. Automatic search mechanisms among possible models, such as backward elimination, may also be useful but should be used with care and skepticism. Such a strategy need not yield a meaningful model.</p>
<p>9.2.2 Model Building for the Dayton Student Survey</p>
<p>In Sections 8.2.4 and 8.3.2 we analyzed the use of alcohol (A), cigarettes (C), and marijuana (M) by a sample of high school seniors. The study also classified students by gender (G) and race (R). 
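Table 9.1 shows the five-dimensional contingency table. In selecting a model, we treat A, C, and M as responses and G and R as explanatory. Thus, a model should contain the GR term, which forces the GR fitted marginal totals to equal the sample marginal totals.</p>
<p>Table 9.2 displays goodness-of-fit tests for several models. Because many cell counts are small, the chi-squared approximation for G<sup>2</sup> may be poor, but this index is useful for comparing models. The first model listed contains only the GR association and assumes conditional independence for the other nine pairs of associations. It fits horribly, which is no surprise. Model 2, with all two-factor terms, on the other hand, seems to fit well. Model 3, containing all the three-factor interaction terms, also fits well, but the improvement in fit is not great (difference in G<sup>2</sup> of 15.3 &minus; 5.3 = 10.0 based on df = 16 &minus; 6 = 10).</p>
<table>
<caption>TABLE 9.1 Alcohol, Cigarette, and Marijuana Use for High School Seniors</caption>
<tr><th rowspan="4">Alcohol Use</th><th rowspan="4">Cigarette Use</th><th colspan="8">Marijuana Use</th></tr>
<tr><th colspan="4">Race = White</th><th colspan="4">Race = Other</th></tr>
<tr><th colspan="2">Female</th><th colspan="2">Male</th><th colspan="2">Female</th><th colspan="2">Male</th></tr>
<tr><th>Yes</th><th>No</th><th>Yes</th><th>No</th><th>Yes</th><th>No</th><th>Yes</th><th>No</th></tr>
<tr><td rowspan="2">Yes</td><td>Yes</td><td>405</td><td>268</td><td>453</td><td>228</td><td>23</td><td>23</td><td>30</td><td>19</td></tr>
<tr><td>No</td><td>13</td><td>218</td><td>28</td><td>201</td><td>2</td><td>19</td><td>1</td><td>18</td></tr>
<tr><td rowspan="2">No</td><td>Yes</td><td>1</td><td>17</td><td>1</td><td>17</td><td>0</td><td>1</td><td>1</td><td>8</td></tr>
<tr><td>No</td><td>1</td><td>117</td><td>1</td><td>133</td><td>0</td><td>12</td><td>0</td><td>17</td></tr>
</table>
<p>Source: Harry Khamis, Wright State University.</p>
<table>
<caption>TABLE 9.2 Goodness-of-Fit Tests for Loglinear Models for Table 9.1<sup>a</sup></caption>
<tr><th>Model</th><th>G<sup>2</sup></th><th>df</th></tr>
<tr><td>1. Mutual independence + GR</td><td>1325.1</td><td>25</td></tr>
<tr><td>2. Homogeneous association</td><td>15.3</td><td>16</td></tr>
<tr><td>3. All three-factor terms</td><td>5.3</td><td>6</td></tr>
<tr><td>4a. (2) &minus; AC</td><td>201.2</td><td>17</td></tr>
<tr><td>4b. (2) &minus; AM</td><td>107.0</td><td>17</td></tr>
<tr><td>4c. (2) &minus; CM</td><td>513.5</td><td>17</td></tr>
<tr><td>4d. (2) &minus; AG</td><td>18.7</td><td>17</td></tr>
<tr><td>4e. (2) &minus; AR</td><td>20.3</td><td>17</td></tr>
<tr><td>4f. (2) &minus; CG</td><td>16.3</td><td>17</td></tr>
<tr><td>4g. (2) &minus; CR</td><td>15.8</td><td>17</td></tr>
<tr><td>4h. (2) &minus; GM</td><td>25.2</td><td>17</td></tr>
<tr><td>4i. (2) &minus; MR</td><td>18.9</td><td>17</td></tr>
<tr><td>5. (AC, AM, CM, AG, AR, GM, GR, MR)</td><td>16.7</td><td>18</td></tr>
<tr><td>6. (AC, AM, CM, AG, AR, GM, GR)</td><td>19.9</td><td>19</td></tr>
<tr><td>7. (AC, AM, CM, AG, AR, GR)</td><td>28.8</td><td>20</td></tr>
</table>
<p><sup>a</sup>G, gender; R, race; A, alcohol use; C, cigarette use; M, marijuana use.</p>
<p>Thus, we consider models without three-factor terms. 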
Beginning with model 2, we eliminate two-factor terms. We use backward elimination, sequentially taking out terms for which the resulting increase in G<sup>2</sup> is smallest, when refitting the model.</p>
<p>Table 9.2 shows the start of this process. Nine pairwise associations are candidates for removal from model 2 (all except GR), shown in models 4a through 4i. The smallest increase in G<sup>2</sup>, compared to model 2, occurs in removing the CR term (i.e., model 4g). The increase is 15.8 &minus; 15.3 = 0.5, with df = 17 &minus; 16 = 1, so this elimination seems sensible. After removing it, the smallest additional increase results from removing the CG term (model 5), resulting in G<sup>2</sup> = 16.7 with df = 18, and a change in G<sup>2</sup> of 0.9 based on df = 1. Removing next the MR term (model 6) yields G<sup>2</sup> = 19.9 with df = 19, a change in G<sup>2</sup> of 3.2 based on df = 1.</p>
<p>Further removals have a more severe effect. For instance, removing the AG term increases G<sup>2</sup> by 5.3, with df = 1, for a P-value of 0.02. One cannot take such P-values literally, since the data suggested these tests, but it seems safest not to drop additional terms. [See Westfall and Wolfinger (1997) and Westfall and Young (1993) for methods of adjusting P-values to account for multiple tests.] Model 6, denoted by (AC, AM, CM, AG, AR, GM, GR), has association graph</p>
<p>[Association graph with vertices M, G, C, A, R and edges A-C, A-M, C-M, A-G, A-R, G-M, G-R]</p>
<p>Every path between C and {G, R} involves a variable in {A, M}. Given the outcome on alcohol use and marijuana use, the model states that cigarette use is independent of both gender and race. Collapsing over the explanatory variables race and gender, the conditional associations between C and A and between C and M are the same as with the model (AC, AM, CM) fitted in Section 8.2.4.</p>
<p>Removing the GM term from this model yields model 7 in Table 9.2. Its association graph reveals that A separates {G, R} from {C, M}. 
Thus, all pairwise conditional associations among A, C, and M in model 7 are identical to those in model (AC, AM, CM), collapsing over G and...</p></li></ul>
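The separation criterion of Section 9.1.1 can be checked mechanically. What follows is a minimal sketch, assuming the association graph is stored as an adjacency dictionary built from the two-factor terms of the model. The names `build_graph` and `separates` are illustrative and do not come from the text. The check blocks the candidate separating subset and uses a breadth-first search to see whether any path still joins the two variables.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map from a list of two-variable terms."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def separates(graph, subset, x, y):
    """Return True if every path from x to y passes through `subset`."""
    blocked = set(subset)
    if x in blocked or y in blocked:
        raise ValueError("x and y must lie outside the separating subset")
    seen, queue = {x}, deque([x])
    while queue:
        node = queue.popleft()
        if node == y:
            return False  # found a path that avoids the subset
        for nbr in graph.get(node, ()):
            if nbr not in blocked and nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return True

# Model (WX, WY, WZ, YZ) from Figure 9.1.
fig_9_1 = build_graph([("W", "X"), ("W", "Y"), ("W", "Z"), ("Y", "Z")])
print(separates(fig_9_1, {"W"}, "X", "Y"))       # W alone separates X and Y
print(separates(fig_9_1, {"W", "Z"}, "X", "Y"))  # so does {W, Z}
print(separates(fig_9_1, {"Z"}, "X", "Y"))       # Z alone does not

# Model 6 of Table 9.2: (AC, AM, CM, AG, AR, GM, GR).
model_6 = build_graph([("A", "C"), ("A", "M"), ("C", "M"), ("A", "G"),
                       ("A", "R"), ("G", "M"), ("G", "R")])
print(separates(model_6, {"A", "M"}, "C", "G"))  # {A, M} separates C from G
```

By the result quoted from Kreiner (1987) and Whittaker (1990), each `True` corresponds to a conditional independence implied by the model.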

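The collapsibility statements of Section 9.1.2 for model (AM, CM) can be verified numerically. The sketch below collapses the counts of Table 9.1 over gender and race to obtain the A x C x M table, then uses the closed-form fitted values for a conditional independence model, \(\hat{\mu}_{acm} = n_{a+m} n_{+cm} / n_{++m}\). That closed form is a standard result for this model but is not stated in this section, so treat it as an assumption of the example. The variable names are illustrative.

```python
from itertools import product

# Table 9.1 counts keyed by (alcohol, cigarette); each row lists marijuana
# (yes, no) pairs for white female, white male, other female, other male.
table_9_1 = {
    ("yes", "yes"): [405, 268, 453, 228, 23, 23, 30, 19],
    ("yes", "no"):  [13, 218, 28, 201, 2, 19, 1, 18],
    ("no", "yes"):  [1, 17, 1, 17, 0, 1, 1, 8],
    ("no", "no"):   [1, 117, 1, 133, 0, 12, 0, 17],
}
levels = ("yes", "no")
pairs = list(product(levels, levels))

# Collapse over gender and race: n[a, c, m].
n = {}
for (a, c), row in table_9_1.items():
    n[a, c, "yes"] = sum(row[0::2])
    n[a, c, "no"] = sum(row[1::2])

# Margins needed for model (AM, CM).
n_am = {(a, m): sum(n[a, c, m] for c in levels) for a, m in pairs}
n_cm = {(c, m): sum(n[a, c, m] for a in levels) for c, m in pairs}
n_m = {m: sum(n_am[a, m] for a in levels) for m in levels}

# Fitted values when A and C are conditionally independent given M.
mu = {(a, c, m): n_am[a, m] * n_cm[c, m] / n_m[m]
      for a, c, m in product(levels, levels, levels)}

def odds_ratio(t):
    """Odds ratio of a 2x2 table stored as a dict keyed by (row, col)."""
    return (t["yes", "yes"] * t["no", "no"]) / (t["yes", "no"] * t["no", "yes"])

# AM association: conditional (at each level of C) versus marginal.
am_conditional = {c: odds_ratio({(a, m): mu[a, c, m] for a, m in pairs})
                  for c in levels}
am_marginal = odds_ratio({(a, m): sum(mu[a, c, m] for c in levels)
                          for a, m in pairs})

# AC association: conditional (at each level of M) versus marginal.
ac_conditional = {m: odds_ratio({(a, c): mu[a, c, m] for a, c in pairs})
                  for m in levels}
ac_marginal = odds_ratio({(a, c): sum(mu[a, c, m] for m in levels)
                          for a, c in pairs})

print(am_conditional, am_marginal)  # AM: conditional and marginal agree
print(ac_conditional, ac_marginal)  # AC: conditional is 1, marginal is not
```

The AM values agree with the 61.9 quoted from Table 8.5, and the AC marginal value rounds to the quoted 2.7 even though the fitted conditional AC odds ratios equal 1.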

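The backward elimination steps of Section 9.2.2 reduce to differences of G^2 statistics between nested models. The sketch below recomputes those differences from the Table 9.2 values and, for one degree of freedom only, converts a difference into a chi-squared tail probability through the identity P(chi-squared with 1 df > x) = erfc(sqrt(x / 2)). The dictionary layout and function names are illustrative. As the text cautions, such P-values should not be taken literally, because the data suggested the tests.

```python
import math

# (G^2, df) for selected models in Table 9.2.
fits = {
    "2": (15.3, 16),
    "4g": (15.8, 17),  # model 2 without CR
    "5": (16.7, 18),   # then without CG
    "6": (19.9, 19),   # then without MR
    "7": (28.8, 20),   # then without GM
}

def compare(simpler, fuller):
    """Return (change in G^2, change in df) for two nested models."""
    g2_s, df_s = fits[simpler]
    g2_f, df_f = fits[fuller]
    return round(g2_s - g2_f, 1), df_s - df_f

def chi2_sf_df1(x):
    """Upper tail probability of a chi-squared variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

for simpler, fuller in [("4g", "2"), ("5", "4g"), ("6", "5"), ("7", "6")]:
    change, ddf = compare(simpler, fuller)
    print(simpler, "vs", fuller, change, ddf, round(chi2_sf_df1(change), 3))

# Removing AG from model 6 raises G^2 by 5.3 on 1 df (value quoted in the text).
p_ag = chi2_sf_df1(5.3)
print(round(p_ag, 2))
```

Restricting the tail probability to one degree of freedom keeps the example within the standard library. A comparison such as model 3 against model 2 (df = 10) would need a general chi-squared distribution function.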