
Modeling a Multinomial Response

Bruce A Craig

Department of Statistics, Purdue University

Reading: Faraway Ch. 7, Agresti Ch. 7, KNNL Ch. 14

STAT 526 Topic 11 1


Multinomial Distribution

Model for discrete variable with two or more categories

Probability distribution:

$Y = (Y_1, \ldots, Y_c) \sim \text{Multinomial}(n, p_1, \ldots, p_{c-1})$

  n is considered known (number of trials)

  $p_c = 1 - \sum_{j=1}^{c-1} p_j$

$$p(y_1, y_2, \ldots, y_c) = \left( \frac{n!}{y_1!\, y_2! \cdots y_c!} \right) p_1^{y_1} p_2^{y_2} \cdots p_c^{y_c}$$

$E(Y_j) = np_j$, $\mathrm{Var}(Y_j) = np_j(1 - p_j)$, $\mathrm{Cov}(Y_j, Y_k) = -np_jp_k$

Marginal dist for each $Y_j$ is $B(n, p_j)$

Log-likelihood:

$$l(p) = \sum_{j=1}^{c} y_j \log p_j$$

Maximum Likelihood Estimator: $\hat{p}_j = y_j/n$
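
As a quick numerical check of the formulas on this slide, the following base-R sketch evaluates the multinomial pmf by hand and compares it with `dmultinom`. The counts `y` and probabilities `p` are made-up illustration values, not data from these notes.

```r
# Hypothetical counts and probabilities (illustration only, not from the notes)
y <- c(3, 5, 2)          # observed counts for c = 3 categories
p <- c(0.2, 0.5, 0.3)    # category probabilities, sum to 1
n <- sum(y)              # number of trials

# pmf from the slide: n! / (y1! ... yc!) * p1^y1 ... pc^yc
pmf_manual <- factorial(n) / prod(factorial(y)) * prod(p^y)

# Same quantity from base R
pmf_builtin <- dmultinom(y, size = n, prob = p)

# Maximum likelihood estimates are the sample proportions
p_hat <- y / n

# Log-likelihood kernel sum_j y_j log p_j; it is largest at p_hat
loglik <- function(prob) sum(y * log(prob))
c(at_p = loglik(p), at_p_hat = loglik(p_hat))
```

The two pmf values should agree, and the log-likelihood at `p_hat` should be at least as large as at any other probability vector.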

STAT 526 Topic 11 2


Multinomial GLM Models

Now consider set of I Multinomial($n_i$, $p_i$) observations

Goal now to link predictors $X_i$ to $p_i$

As with binomial setting, can encounter data that are

  Grouped: $n_i > 1$

  Ungrouped: $n_i = 1$

Predictors $X_i$ may be continuous or discrete

Unlike binomial setting, need to distinguish between

  Ordered categories for Y → cumulative logit model

  Nominal categories for Y → multinomial logit model

STAT 526 Topic 11 3


Example 1: Math Aptitude

Predicting a college freshman’s math aptitude given their mathematics PSAT score in 10th grade.

  Response: Aptitude Grade: 4 ordered levels

  Predictor: PSAT score: continuous (10-pt increments)

STAT 526 Topic 11 4


Example 1: Math Aptitude

Aptitude grade (Y) positively related to math score (X)

Overlap in math scores across grades means there is some uncertainty in predicting Y

In this example, $n_i = 1$ for $i = 1, 2, \ldots, I = 500$ students

Interested in conditional probs $P(Y = j \mid x_i)$

With ordered response, often easier to work with the cumulative probabilities

$$P(Y \le j \mid x_i) = \sum_{k \le j} P(Y = k \mid x_i)$$

STAT 526 Topic 11 5


Proportional Odds Model

Also called the cumulative logit model

$$\log\left(\frac{P(Y_i \le j \mid X_i)}{1 - P(Y_i \le j \mid X_i)}\right) = \theta_j - X_i\beta$$

Only parameters $\theta_j$ depend on level j

  They are monotonically increasing with j

Parameters $\beta$ describe cumulative log-odds

  Odds($X_i$)/Odds($X_{i'}$) does not depend on level j

  A $\beta > 0$ means a larger x increases probability of larger response j (positive association)

Like fitting logistic model for each j but $\beta$ same

As with binary setting, can consider other link functions

STAT 526 Topic 11 6


Latent Variable Motivation

Similar to binary setting, can consider a latent variable to motivate the GLM model

For the math aptitude example, we could consider there to be a latent continuous variable Z associated with the aptitude grade that is linearly related to their math score

$$Z_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$

Instead of observing $Z_i$, we observe

$$Y_i = \begin{cases} A & Z_i > c_3 \\ B & c_2 < Z_i < c_3 \\ C & c_1 < Z_i < c_2 \\ D & Z_i < c_1 \end{cases}$$

Can compute the $P(Y_i = j \mid x_i)$ using specified dist of $\varepsilon$

STAT 526 Topic 11 7


Motivation Continued

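Using $c_J = \infty$,

$$\begin{aligned}
P(Y_i \le j \mid x_i) &= P(Z_i \le c_j) \\
&= P(\beta_0 + \beta_1 x_i + \varepsilon_i \le c_j) \\
&= P(\varepsilon_i \le c_j - \beta_0 - \beta_1 x_i) \\
&= F_\varepsilon(\theta_j - \beta_1 x_i)
\end{aligned}$$

where $\theta_j = c_j - \beta_0$

If F is the CDF of the logistic distribution,

$$P(Y_i \le j \mid x_i) = \frac{\exp\{\theta_j - \beta_1 x_i\}}{1 + \exp\{\theta_j - \beta_1 x_i\}}$$

Can use Normal or Gumbel distributions to motivate probit or complementary log-log link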
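
The latent-variable argument can be checked by simulation. The sketch below is only an illustration: the coefficients, cutpoints and the single covariate value are hypothetical, and the errors are drawn from the standard logistic distribution so that the cumulative probabilities should match `plogis(theta - beta1 * x)`.

```r
set.seed(526)

# Hypothetical values (not estimates from the notes)
beta0 <- 0.5
beta1 <- 1.2
cuts  <- c(-1, 0.5, 2)   # c_1 < c_2 < c_3 on the latent scale
x     <- 0.3             # one covariate value
n     <- 1e5             # number of simulated latent draws

# Latent variable Z = beta0 + beta1 * x + eps with logistic errors
z <- beta0 + beta1 * x + rlogis(n)

# Model-based cumulative probabilities F(theta_j - beta1 * x), theta_j = c_j - beta0
theta   <- cuts - beta0
p_model <- plogis(theta - beta1 * x)

# Simulated cumulative probabilities P(Z <= c_j)
p_sim <- sapply(cuts, function(cj) mean(z <= cj))

round(rbind(p_model, p_sim), 4)
```

Replacing `rlogis`/`plogis` with `rnorm`/`pnorm` would give the analogous check for the probit link.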

STAT 526 Topic 11 8


Interpreting the Model Parameters

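Using the logit link, the cumulative odds

$$\log\left(\frac{P(Y \le j \mid X)}{P(Y > j \mid X)}\right) = \theta_j - X\beta$$

Interpretation of a $\beta$ (holding all other x constant)

$$\begin{aligned}
\log\left(\frac{P(Y \le j \mid x + \delta)}{P(Y > j \mid x + \delta)} \div \frac{P(Y \le j \mid x)}{P(Y > j \mid x)}\right)
&= \log\left(\frac{P(Y \le j \mid x + \delta)}{P(Y > j \mid x + \delta)}\right) - \log\left(\frac{P(Y \le j \mid x)}{P(Y > j \mid x)}\right) \\
&= \theta^*_j - \beta(x + \delta) - (\theta^*_j - \beta x) = -\beta\delta
\end{aligned}$$

Change is proportional to the change in x for all j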

STAT 526 Topic 11 9


Inference

Similar inference as in logistic regression when focusing on cumulative probs

  Wald / LR / Score tests for model parameters

  Pearson $\chi^2$, Deviance for goodness of fit

$\beta$ invariant to the number of response categories

Predicting Y now involves a vector of probabilities

  Easiest to first compute cumulative probabilities and then use subtraction to get the probability vector
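
The subtraction step can be sketched in a few lines of base R. The cutpoints `theta` and slope `beta` below are hypothetical values chosen for illustration; the last lines also confirm that the cumulative log-odds ratio for a one-unit change in x equals $-\beta$ at every cutoff j.

```r
# Hypothetical proportional-odds parameters (illustration only)
theta <- c(-1, 0.5, 2)   # increasing cutpoints, J = 4 categories
beta  <- 0.8

# Category probabilities: cumulative probabilities first, then differences
cat_probs <- function(x) {
  cum <- c(plogis(theta - beta * x), 1)  # P(Y <= j), j = 1, ..., J
  diff(c(0, cum))                        # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
}

p0 <- cat_probs(0)
p1 <- cat_probs(1)

# Cumulative odds P(Y <= j) / P(Y > j) for j = 1, ..., J - 1
cum_odds <- function(p) {
  cp <- cumsum(p)[-length(p)]
  cp / (1 - cp)
}

# Log cumulative odds ratio for x = 1 versus x = 0: equals -beta for every j
log_or <- log(cum_odds(p1) / cum_odds(p0))
log_or
```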

STAT 526 Topic 11 10


Maximum Likelihood Estimation

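Let $p_j(X) = P(Y \le j \mid X) - P(Y \le j - 1 \mid X)$

$Y_i$ is vector of length J with one 1 and remainder 0’s

Log-likelihood for the ith observation is:

$$\begin{aligned}
l_i &= \log \prod_{j=1}^{J} p_j(x_i)^{y_{ij}} \\
&= \log \prod_{j=1}^{J} \left[P(Y \le j \mid x_i) - P(Y \le j - 1 \mid x_i)\right]^{y_{ij}} \\
&= \log \prod_{j=1}^{J} \left[\frac{\exp\{\theta_j - x_i\beta\}}{1 + \exp\{\theta_j - x_i\beta\}} - \frac{\exp\{\theta_{j-1} - x_i\beta\}}{1 + \exp\{\theta_{j-1} - x_i\beta\}}\right]^{y_{ij}}
\end{aligned}$$

with the conventions $\theta_0 = -\infty$ and $\theta_J = \infty$, so that $P(Y \le 0 \mid x_i) = 0$ and $P(Y \le J \mid x_i) = 1$

Maximize $\sum_{i=1}^{I} l_i$ with respect to $\theta_j$ and $\beta$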
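
A minimal sketch of this log-likelihood for a single numeric predictor is given below. The function name `po_negloglik`, its arguments and the toy data are all hypothetical; it returns the negative log-likelihood, which is the form a general-purpose optimizer such as `optim` would minimize (cutpoints must stay increasing for the probabilities to be positive).

```r
# Negative log-likelihood of the proportional-odds model (illustrative sketch)
# theta: J - 1 increasing cutpoints; beta: slope; x: numeric predictor;
# y: observed category coded 1, ..., J
po_negloglik <- function(theta, beta, x, y) {
  J     <- length(theta) + 1
  eta   <- outer(-x * beta, theta, "+")   # n x (J - 1): theta_j - x_i * beta
  cum   <- cbind(0, plogis(eta), 1)       # P(Y <= j) for j = 0, ..., J
  probs <- cum[, 2:(J + 1), drop = FALSE] - cum[, 1:J, drop = FALSE]
  -sum(log(probs[cbind(seq_along(y), y)]))  # pick p_j(x_i) for the observed j
}

# Toy data (made up): with beta = 0 and cutpoints at the logits of 1/3 and 2/3,
# every category has probability 1/3
x <- c(-1, 0, 0.5, 2)
y <- c(1, 3, 2, 3)
nll0 <- po_negloglik(qlogis(c(1/3, 2/3)), beta = 0, x = x, y = y)
nll0   # equals 4 * log(3)
```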

STAT 526 Topic 11 11


Calculating the Residual Deviance

Residual deviance

$$G^2 = 2\sum_{i=1}^{I}\sum_{j=1}^{J} y_{ij} \log\left(\frac{1}{\hat{p}_j(x_i)}\right) = -2\sum_{i=1}^{I}\sum_{j=1}^{J} y_{ij} \log \hat{p}_j(x_i)$$

Degrees of freedom are the difference between model dfs

  # params in saturated model (= # observations)

  # params in reduced model (= # of intercepts + # predictors)

Residual deviance degrees of freedom for math aptitude study are 500 − 4 = 496

STAT 526 Topic 11 12


Example: Math Aptitude

In R: Use polr in library MASS

> library(MASS)
> fit = polr(grade1 ~ psat,mathapt)
> summary(fit)
Coefficients:
        Value Std. Error t value
psat -0.01792    0.00124  -14.46

Intercepts:
       Value Std. Error  t value
A|B -11.5391     0.6912 -16.6944
B|C  -8.8652     0.5930 -14.9492
C|D  -6.3311     0.5196 -12.1834

Residual Deviance: 981.8668
AIC: 989.8668

***Function uses int_j + XB, so be wary of sign

Grade A associated with j = 1 and Grade D with j = 4. That is why these are negative coefficients (see note above)

STAT 526 Topic 11 13


Results: Math Aptitude

Deviance is 981.9 on 496 df (from fit$df.residual)

Similar to Bernoulli distribution (ungrouped), this deviance should not be used to assess goodness of fit

  Better to use extension of H-L test or Lipsitz test

Assessment of $\beta$: For each 10-pt increase in score, the odds of being > j versus ≤ j decrease 16.4% ($1 - e^{-0.1792}$)

> exp(10*confint(fit))
Waiting for profiling to be done...
Re-fitting to get Hessian
    2.5 %    97.5 %
0.8154889 0.8558026
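
The 16.4% figure follows directly from the reported slope. The short sketch below reproduces the arithmetic using only the printed estimate −0.01792 (no model refit).

```r
# Reported polr slope for psat (from the summary output above)
beta_hat <- -0.01792

# Multiplicative change in the odds of Y > j for a 10-point increase in score
odds_mult <- exp(10 * beta_hat)

# Percent decrease in those odds
pct_decrease <- 100 * (1 - odds_mult)
round(c(odds_mult = odds_mult, pct_decrease = pct_decrease), 3)
```

The multiplier falls inside the profile-likelihood interval (0.815, 0.856) printed above.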

STAT 526 Topic 11 14


Extension of Hosmer-Lemeshow

Score each observation and then group on these scores

$$s_i = \hat{p}_{i1} + 2\hat{p}_{i2} + \cdots + J\hat{p}_{iJ}$$

$$\sum_{k=1}^{C}\sum_{j=1}^{J} (O_{kj} - E_{kj})^2 / E_{kj} \sim \chi^2_{df}$$

C represents the number of (equal-sized) groups

For this model, df = (C − 2)(J − 1) + (J − 2)

The additional J − 2 df are due to the reduced number of parameters relative to a multinomial model

Note that our Bernoulli version considers J = 2
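
A small sketch of the two ingredients above: the ordinal score $s_i$ and the degrees of freedom. The helper names `ordinal_score` and `hl_df` and the probability matrix `P` are hypothetical; in practice `P` would hold fitted category probabilities, one row per observation.

```r
# Ordinal score s_i = sum_j j * p_ij for each row of a fitted-probability matrix
ordinal_score <- function(P) as.vector(P %*% seq_len(ncol(P)))

# Degrees of freedom for the ordinal Hosmer-Lemeshow statistic
hl_df <- function(C, J) (C - 2) * (J - 1) + (J - 2)

# Made-up fitted probabilities for two observations, J = 4 categories
P <- rbind(c(0.7, 0.2, 0.1, 0.0),
           c(0.1, 0.2, 0.3, 0.4))
s <- ordinal_score(P)
s              # 1.4 and 3.0

hl_df(10, 4)   # C = 10 groups, J = 4 categories gives 26
hl_df(10, 2)   # binary case reduces to C - 2 = 8
```

Observations would then be sorted by `s` and split into C roughly equal-sized groups before tabulating observed and expected counts.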

STAT 526 Topic 11 15


Using R

Can still use logitgof function

> library(generalhoslem)
> logitgof(mathapt$grade1,ord=TRUE,fitted(fit))

Hosmer and Lemeshow test (ordinal model)

data: grade1, fitted(fit)
X-squared = 29.356, df = 26, p-value = 0.2952

Warning message:
In logitgof(grade1, ord = TRUE, fitted(fit)) :
At least one cell in the expected frequencies table is < 1.
Chi-square approximation may be incorrect.

This will be a problem in this example because there is little overlap in math scores between A and D aptitude students

STAT 526 Topic 11 16


Grouped Table - Using Deciles

          Y = 1        Y = 2        Y = 3        Y = 4
Group    O     E      O     E      O     E      O     E     Total
  1     25  21.34    21  23.81     3   4.42     1   0.43     50
  2     11   9.75    28  32.38    18  13.28     0   1.59     57
  3      3   3.85    23  21.93    16  16.69     3   2.53     45
  4      2   2.72    21  19.78    20  22.30     6   4.20     49
  5      0   2.46    26  21.26    33  34.73     8   8.56     67
  6      0   0.75     8   7.55    21  18.38     4   6.33     33
  7      1   0.88     6   9.66    31  32.08    20  15.38     58
  8      0   0.39     5   4.68    17  21.79    21  16.14     43
  9      0   0.25     4   3.19    23  21.63    30  31.93     57
 10      0   0.06     0   0.78    13   7.35    28  32.81     41

STAT 526 Topic 11 17


Lipsitz Test

As with H-L test, sort data into C groups

Define C − 1 group indicator variables

Fit new ordinal logistic regression

$$\log\left(\frac{P(Y \le j \mid X)}{P(Y > j \mid X)}\right) = \theta_j - X\beta + \gamma_1 I_1 + \cdots + \gamma_{C-1} I_{C-1}$$

Use the likelihood ratio test to test $H_0: \gamma_1 = \cdots = \gamma_{C-1} = 0$

Recommend C be such that $6 \le C < N/(5J)$

STAT 526 Topic 11 18


Using R

Can use lipsitz.test function in generalhoslem

> library(generalhoslem)
> lipsitz.test(fit)

Lipsitz goodness of fit test for ordinal response models

data: formula: grade1 ~ psat
LR statistic = 11.226, df = 9, p-value = 0.2605

Tends to have rejection rates > α in small samples

Works best when covariates are continuous

STAT 526 Topic 11 19


Example 2: Dose Response

Effect of intravenous medication doses on patients with subarachnoid hemorrhage trauma (p. 207, OrdCDA)

                        Glasgow Outcome Scale (Y)
Treatment               Veget.   Major    Minor    Good
Group (X)      Death    State    Disab.   Disab.   Recov.
Placebo          59       25       46       48       32
Low dose         48       21       44       47       30
Medium dose      44       14       54       64       31
High dose        43        4       49       58       41

Response: Glasgow Outcome Scale - Ordered

Predictor: Dose level - Ordered

Similar to setting for linear-by-linear association model

Focus, however, is on predicting Y

So how do we treat the levels of X? Score them?

STAT 526 Topic 11 20


Notation for Ordinal Predictor

Back to contingency table summary (grouped data)

                     Y
  X        1      2     ···     J      Total
  1       y11    y12    ···    y1J      y1.
  2       y21    y22    ···    y2J      y2.
  ⋮        ⋮      ⋮              ⋮       ⋮
  I       yI1    yI2    ···    yIJ      yI.
Total     y.1    y.2    ···    y.J       n

Interested in cond probs $P(Y = j \mid X = i) = p_{j|i}$

Proportional-odds model focuses on cumulative probs

$$P(Y \le j \mid X = i) = \sum_{k \le j} p_{k|i}$$

STAT 526 Topic 11 21


Ordinal Odds Ratios

Local odds ratios

$$\theta^L_{ij} = \frac{P(X = i, Y = j) \,/\, P(X = i, Y = j + 1)}{P(X = i + 1, Y = j) \,/\, P(X = i + 1, Y = j + 1)}$$

Global odds ratios

$$\theta^G_{ij} = \frac{P(X \le i, Y \le j) \,/\, P(X \le i, Y > j)}{P(X > i, Y \le j) \,/\, P(X > i, Y > j)}$$

Cumulative odds ratios (conditional on X)

$$\theta^C_{ij} = \frac{P(Y \le j \mid X = i) \,/\, P(Y > j \mid X = i)}{P(Y \le j \mid X = i + 1) \,/\, P(Y > j \mid X = i + 1)}$$

Analogues to correlations, but for categorical variables

STAT 526 Topic 11 22


Ordinal Odds Ratio Estimates

Local odds ratios

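$$\hat{\theta}^L_{ij} = \frac{y_{ij} \,/\, y_{i,j+1}}{y_{i+1,j} \,/\, y_{i+1,j+1}}$$

Global odds ratios

$$\hat{\theta}^G_{ij} = \frac{\sum_{a \le i}\sum_{b \le j} y_{ab} \,/\, \sum_{a \le i}\sum_{b > j} y_{ab}}{\sum_{a > i}\sum_{b \le j} y_{ab} \,/\, \sum_{a > i}\sum_{b > j} y_{ab}}$$

Cumulative odds ratios (conditional on X)

$$\hat{\theta}^C_{ij} = \frac{\sum_{b \le j} y_{ib} \,/\, \sum_{b > j} y_{ib}}{\sum_{b \le j} y_{i+1,b} \,/\, \sum_{b > j} y_{i+1,b}}$$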

Alternative: testing for association with Pearson X 2
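
These estimators can be applied to the dose-response counts from Example 2. The sketch below re-enters the table by hand; the function names (`local_or`, `global_or`, `cumulative_or`) and the short dimnames are hypothetical choices made for illustration.

```r
# Dose-response counts from Example 2 (rows: dose group, columns: outcome)
counts <- matrix(c(59, 25, 46, 48, 32,
                   48, 21, 44, 47, 30,
                   44, 14, 54, 64, 31,
                   43,  4, 49, 58, 41),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(dose = c("Placebo", "Low", "Medium", "High"),
                                 outcome = c("Death", "Veget", "Major", "Minor", "Good")))

# Local odds ratio: adjacent rows i, i + 1 and adjacent columns j, j + 1
local_or <- function(y, i, j) {
  (y[i, j] / y[i, j + 1]) / (y[i + 1, j] / y[i + 1, j + 1])
}

# Global odds ratio: table collapsed to 2 x 2 at row cut i and column cut j
global_or <- function(y, i, j) {
  I <- nrow(y); J <- ncol(y)
  n11 <- sum(y[1:i, 1:j]);       n12 <- sum(y[1:i, (j + 1):J])
  n21 <- sum(y[(i + 1):I, 1:j]); n22 <- sum(y[(i + 1):I, (j + 1):J])
  (n11 / n12) / (n21 / n22)
}

# Cumulative odds ratio: rows i and i + 1, columns collapsed at cut j
cumulative_or <- function(y, i, j) {
  J <- ncol(y)
  (sum(y[i, 1:j]) / sum(y[i, (j + 1):J])) /
    (sum(y[i + 1, 1:j]) / sum(y[i + 1, (j + 1):J]))
}

# Placebo versus low dose at the first cutoff (death versus any other outcome)
c(local = local_or(counts, 1, 1),
  global = global_or(counts, 1, 1),
  cumulative = cumulative_or(counts, 1, 1))
```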

STAT 526 Topic 11 23


Example 2 Analysis : Dose Scored

> library(MASS)
> fit1 = polr(outcome~dose,weights=count,data=prob2)
> summary(fit1)
Coefficients:
      Value Std. Error t value
dose 0.1755    0.05671   3.094

Intercepts:
      Value Std. Error t value
1|2 -0.8946     0.1144 -7.8233
2|3 -0.4941     0.1107 -4.4638
3|4  0.5162     0.1118  4.6150
4|5  1.8815     0.1311 14.3565

Residual Deviance: 2461.349
AIC: 2471.349

Degrees of freedom are 797

Cannot use deviance to assess fit (ungrouped)

> exp(confint(fit1))
   2.5 %   97.5 %
1.066619 1.332269

Increasing dose by 1 level increases the odds of a higher outcome by between 6.6% and 33.2%

STAT 526 Topic 11 24


Plot of Predicted Probabilities

> matplot(predProb, type="l", xlab="Dose+1", ylab="Predicted Probability", cex=3.5)
> legend(x=3,y=0.15, lty=c(1:5), col=c(1:5), paste("Outcome =", c(1:5)))

STAT 526 Topic 11 25


Example 2 Analysis : Dose Categorical

> fit2 = polr(outcome~as.factor(dose),weights=count,data=prob2)
> summary(fit2)
Coefficients:
                  Value Std. Error t value
as.factor(dose)1 0.1176     0.1791  0.6564
as.factor(dose)2 0.3174     0.1740  1.8240
as.factor(dose)3 0.5208     0.1794  2.9029

Intercepts:
      Value Std. Error t value
1|2 -0.9188     0.1322 -6.9488
2|3 -0.5183     0.1291 -4.0154
3|4  0.4922     0.1298  3.7925
4|5  1.8579     0.1462 12.7072

Residual Deviance: 2461.216
AIC: 2475.216

***Two additional parameters

***Test below does not suggest they add much to the fit

> anova(fit1,fit2)
            Model R. df Resid. Dev   Test Df LR stat.   Pr(Chi)
             dose   797   2461.349
  as.factor(dose)   795   2461.216 1 vs 2  2   0.1328 0.9357261

STAT 526 Topic 11 26


Plot of Predicted Probabilities

> matplot(predProb1, type="l", xlab="Dose+1", ylab="Predicted Probability", cex=3.5)
> legend(x=3,y=0.15, lty=c(1:5), col=c(1:5), paste("Outcome =", c(1:5)))

STAT 526 Topic 11 27


Summary

Moving from scoring the ordinal variable to treating it as a nominal factor allows a test of the linearity assumption.

Result can depend on how one scores the different levels of the dose variable

  Equally spaced

  Unequally spaced

Visual comparison can be made via plots of the predicted probabilities like the ones on Slides #25 and #27.

Need to look at grouped goodness of fit statistics

Multiple reasons for a poor fit

  violation of proportional odds; wrong link; wrong functional form or missing predictors; overdispersion

STAT 526 Topic 11 28


Goodness of Fit: Grouped Data

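Let the rows represent each of the groups

Expected cell frequency $\hat{\mu}_{ij}$ in row i and col j:

$$\begin{aligned}
\hat{\mu}_{ij} &= y_{i.}\, \hat{P}(Y = j \mid X = i) \\
&= y_{i.} \left[\hat{P}(Y \le j \mid X = i) - \hat{P}(Y \le j - 1 \mid X = i)\right]
\end{aligned}$$

Pearson $\chi^2$

$$X^2 = \sum_{i=1}^{I}\sum_{j=1}^{J} \frac{(y_{ij} - \hat{\mu}_{ij})^2}{\hat{\mu}_{ij}} \overset{H_0}{\sim} \chi^2_{df}$$

Deviance

$$G^2 = 2\sum_{i=1}^{I}\sum_{j=1}^{J} y_{ij} \log\left(\frac{y_{ij}}{\hat{\mu}_{ij}}\right) \overset{H_0}{\sim} \chi^2_{df}$$

Dose scored: df = [I(J − 1)] − [(J − 1) + 1] = 11

Dose categorical: df = [I(J − 1)] − [(J − 1) + (I − 1)] = 9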
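
The grouped statistics for the dose-scored model can be sketched from the printed polr estimates alone. Two assumptions are made here and should be checked against the actual data: dose is scored 0, 1, 2, 3, and the fitted model has the form logit P(Y ≤ j) = θ_j − β·dose. Because the estimates are rounded, the results are approximate.

```r
# Observed counts from Example 2 (rows: dose group, columns: outcome 1..5)
counts <- matrix(c(59, 25, 46, 48, 32,
                   48, 21, 44, 47, 30,
                   44, 14, 54, 64, 31,
                   43,  4, 49, 58, 41), nrow = 4, byrow = TRUE)

# Rounded estimates printed for the dose-scored proportional-odds fit
theta <- c(-0.8946, -0.4941, 0.5162, 1.8815)
beta  <- 0.1755
dose  <- 0:3   # assumed scoring of the four dose groups

# Fitted cumulative probabilities P(Y <= j | dose), one row per dose group
cum   <- t(sapply(dose, function(d) c(plogis(theta - beta * d), 1)))
probs <- cum - cbind(0, cum[, -5])   # category probabilities by subtraction

# Expected counts, Pearson X^2 and deviance G^2
mu <- rowSums(counts) * probs
X2 <- sum((counts - mu)^2 / mu)
G2 <- 2 * sum(counts * log(counts / mu))

# df = I(J - 1) - [(J - 1) + 1]
df <- nrow(counts) * (ncol(counts) - 1) - ((ncol(counts) - 1) + 1)

c(X2 = X2, G2 = G2, df = df,
  p_X2 = pchisq(X2, df, lower.tail = FALSE),
  p_G2 = pchisq(G2, df, lower.tail = FALSE))
```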

STAT 526 Topic 11 29


Visual Assessment of Proportional Odds : Grouped Data

Focus on each predictor (holding other predictors fixed)

According to the model, for all j and $\delta$:

$$\log\left(\frac{P(Y \le j \mid X + \delta)}{P(Y > j \mid X + \delta)}\right) - \log\left(\frac{P(Y \le j \mid X)}{P(Y > j \mid X)}\right) = -\beta\delta$$

Can plot these differences in cumulative odds using estimates from the saturated model

When proportional odds are appropriate, the differences should be roughly the same for all values of X and levels j

STAT 526 Topic 11 30


Using R

> mat = xtabs(count~dose+outcome,prob2)
> cumProb <- apply( mat/apply(mat, 1, sum), 1, cumsum)
> cumProb
          0         1         2         3
0 0.2809524 0.2526316 0.2125604 0.2205128
1 0.4000000 0.3631579 0.2801932 0.2410256
2 0.6190476 0.5947368 0.5410628 0.4923077
3 0.8476190 0.8421053 0.8502415 0.7897436
4 1.0000000 1.0000000 1.0000000 1.0000000

> logit <- function(x) {log(x/(1-x))}
> plot(0:3, logit(cumProb[-5,2])-logit(cumProb[-5,1]), type="l",
+      ylim=c(-1, 1), xlab="Dose", ylab="Empirical log(OR)", cex=3.5)
> for (i in 3:4) {
+   lines(0:3, logit(cumProb[-5,i])-logit(cumProb[-5,i-1]), col=i, lty=i)
+ }
> abline(h=-coef(fit1), col="red", lwd=2)
> legend("topleft", lty=c(1,3,4), col=c(1,3,4),
+        paste("Cum prob cutoff =", c(1:3)), cex=1)
> legend("topright", lty=c(1), col=c("red"), "Model-based")

STAT 526 Topic 11 31


Are They Relatively Constant?

STAT 526 Topic 11 32


Formal Test for Proportional Odds

Testing

$$H_0: \log\left(\frac{P_j}{1 - P_j}\right) = \theta_j - X\beta$$

$$H_a: \log\left(\frac{P_j}{1 - P_j}\right) = \theta_j - X\beta_j$$

Model under $H_a$ specifies cumulative logit, but not proportional odds, since log(OR) depends on j

The model under H0 is nested within the model under Ha

Thus can compare residual deviances

STAT 526 Topic 11 33


Formal Test in R

Must use vglm function in VGAM package

First fit proportional-odds model

> library(VGAM)
> fit.vgam <- vglm(as.numeric(outcome) ~ dose,
+ cumulative(parallel=TRUE, reverse=FALSE),
+ weights=count,prob2)
> summary(fit.vgam)
Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept):1 -0.89466    0.11456  -7.809 5.74e-15 ***
(Intercept):2 -0.49410    0.11059  -4.468 7.91e-06 ***
(Intercept):3  0.51615    0.11067   4.664 3.10e-06 ***
(Intercept):4  1.88151    0.13020  14.451  < 2e-16 ***
dose          -0.17548    0.05632  -3.116  0.00183 **

Residual deviance: 2461.349 on 75 degrees of freedom

**df based on ungrouped multinomial logit model**

STAT 526 Topic 11 34


Formal Test in R

Now fit relaxed model

> fit.vgam3 <- vglm(as.numeric(outcome) ~ dose,

+ cumulative(parallel=FALSE, reverse=FALSE),

+ weights=count,prob2)

> summary(fit.vgam3)

Coefficients:

Estimate Std. Error z value Pr(>|z|)

(Intercept):1 -0.97749 0.13194 -7.408 1.28e-13 ***

(Intercept):2 -0.36265 0.12034 -3.014 0.00258 **

(Intercept):3 0.52391 0.12011 4.362 1.29e-05 ***

(Intercept):4 1.78941 0.16415 10.901 < 2e-16 ***

dose:1 -0.11292 0.07288 -1.549 0.12130

dose:2 -0.26889 0.06832 -3.936 8.29e-05 ***

dose:3 -0.18234 0.06385 -2.856 0.00430 **

dose:4 -0.11925 0.08470 -1.408 0.15916

Residual deviance: 2447.018 on 72 degrees of freedom

STAT 526 Topic 11 35


Results

Full residual deviance = 2447.018 on 72 df

Reduced residual deviance = 2461.349 on 75 df

Difference is 2461.349− 2447.018 = 14.331 on 3 df

> pchisq(14.331,3,lower=F)

[1] 0.002487536

Cannot accept that reduced model gives adequate fit

Proportional odds not reasonable

However, full cumulative odds model has issues too

Non-parallel lines mean the cumulative logit lines eventually cross
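
The crossing problem can be made concrete with the printed estimates from the relaxed fit. The sketch assumes VGAM's `cumulative(reverse=FALSE)` parameterization logit P(Y ≤ j) = (Intercept):j + dose:j × dose, which is consistent with the sign of the dose coefficient in the parallel fit; treat that form as an assumption.

```r
# Printed estimates from the non-parallel cumulative logit fit
theta <- c(-0.97749, -0.36265, 0.52391, 1.78941)     # (Intercept):1..4
slope <- c(-0.11292, -0.26889, -0.18234, -0.11925)   # dose:1..4

# Cumulative logit lines 1 and 2 cross where
# theta1 + slope1 * x = theta2 + slope2 * x
x_cross <- (theta[2] - theta[1]) / (slope[1] - slope[2])
x_cross   # just beyond the largest observed dose score of 3

# Fitted cumulative probabilities at a given dose
cum_at <- function(x) plogis(theta + slope * x)

diff(cum_at(3))   # all positive: ordering is valid inside the data range
diff(cum_at(5))   # first difference negative: P(Y <= 1) exceeds P(Y <= 2)
```

Beyond the crossing point the implied probability of the second category is negative, so the relaxed model is only usable over a limited range of the predictor.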

STAT 526 Topic 11 36


Multinomial Logit Model

We now shift to case where categories are unordered

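Therefore, cannot work with cumulative probabilities

Instead declare one category as a reference and link the covariates to probs through J − 1 relative prob ratios

$$\eta_{ij} = \log\left(\frac{p_{ij}}{p_{i1}}\right) = x_i\beta_j, \quad j = 2, 3, \ldots, J$$

This model implies

$$p_{ij} = \exp\{x_i\beta_j\}\, p_{i1}, \quad j = 2, 3, \ldots, J$$

and because $\sum_{j=1}^{J} p_{ij} = 1$, this means

$$p_{i1} = \frac{1}{1 + \sum_{k=2}^{J} \exp\{x_i\beta_k\}} \quad \text{and} \quad p_{ij} = \frac{\exp\{x_i\beta_j\}}{1 + \sum_{k=2}^{J} \exp\{x_i\beta_k\}}$$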

STAT 526 Topic 11 37


Multinomial Logit Model

The baseline, or reference, category is arbitrary

Common choices by software are j = 1 or j = J

Separate set of parameters βj for each ratio

Values of βj depend on the choice of baseline

Because all sets of $\beta_j$ are relative to a common category, they jointly define probs

More flexible model than proportional odds but more difficult to interpret (?)

Can be used as classification model using category with highest predicted probability

STAT 526 Topic 11 38


Parameter Interpretation

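In logistic regression and proportional-odds model, a $\beta_j$ represents a log odds ratio

In this model, a $\beta_j$ describes the log change in relative prob ratio

$$\begin{aligned}
\log\left(\frac{p_j(x + 1)/p_1(x + 1)}{p_j(x)/p_1(x)}\right)
&= \log\frac{p_j(x + 1)}{p_1(x + 1)} - \log\frac{p_j(x)}{p_1(x)} \\
&= \beta^*_{0j} + \beta_{1j}(x + 1) - (\beta^*_{0j} + \beta_{1j}x) \\
&= \beta_{1j}
\end{aligned}$$

$$\begin{aligned}
\log\left(\frac{p_j(x + 1)/p_k(x + 1)}{p_j(x)/p_k(x)}\right)
&= \log\frac{p_j(x + 1)}{p_k(x + 1)} - \log\frac{p_j(x)}{p_k(x)} \\
&= \log\frac{p_j(x + 1)}{p_1(x + 1)} - \log\frac{p_k(x + 1)}{p_1(x + 1)} - \log\frac{p_j(x)}{p_1(x)} + \log\frac{p_k(x)}{p_1(x)} \\
&= \beta_{1j} - \beta_{1k}
\end{aligned}$$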

STAT 526 Topic 11 39


Maximum Likelihood Estimation

The log-likelihood for observation i is:

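$$\begin{aligned}
l_i &= \log \prod_{j=1}^{J} p_j(x_i)^{y_{ij}} \\
&= \sum_{j=2}^{J} y_{ij} \log p_j(x_i) + \left(1 - \sum_{j=2}^{J} y_{ij}\right) \log p_1(x_i) \\
&= \sum_{j=2}^{J} y_{ij} \log\frac{p_j(x_i)}{1 - \sum_{k=2}^{J} p_k(x_i)} + \log p_1(x_i) \\
&= \sum_{j=2}^{J} y_{ij}(x_i\beta_j) - \log\left(1 + \sum_{j=2}^{J} \exp\{x_i\beta_j\}\right)
\end{aligned}$$

Maximize $\sum_{i=1}^{I} l_i$ with respect to $\beta_j$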
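
The final line of the derivation translates almost directly into code. The function below is an illustrative sketch for ungrouped data (each row of `Y` contains a single 1); the name `mlogit_loglik`, the argument layout and the toy data are hypothetical.

```r
# Log-likelihood of the baseline-category (multinomial) logit model
# B: p x (J - 1) coefficient matrix, one column per non-baseline category
# X: n x p design matrix (first column of 1s for the intercept)
# Y: n x J indicator matrix, column 1 is the baseline category
mlogit_loglik <- function(B, X, Y) {
  eta <- X %*% B                                  # n x (J - 1) linear predictors
  sum(Y[, -1, drop = FALSE] * eta) -              # sum_i sum_{j>=2} y_ij x_i beta_j
    sum(log(1 + rowSums(exp(eta))))               # sum_i log(1 + sum_j exp(x_i beta_j))
}

# Toy data (made up): 4 observations, J = 3 categories, one predictor
X <- cbind(1, c(0, 1, 2, 3))
Y <- diag(3)[c(1, 2, 3, 2), ]

# With all coefficients zero every category has probability 1/3
ll0 <- mlogit_loglik(matrix(0, nrow = 2, ncol = 2), X, Y)
ll0   # equals -4 * log(3)
```

Negating this function gives an objective that could be handed to a general-purpose optimizer; fitting functions such as `multinom` carry out the maximization internally.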

STAT 526 Topic 11 40


Example 2: Dose Response

Let’s revisit our dose response study but use the multinomial logit model

Let’s consider dose

As a categorical predictor

  There are 3 indicator variables + intercept for each of the 4 logits

  Total of 4(4) = 16 parameters

As a continuous predictor

  Will assign scores to the categories

  Total of 4(2) = 8 parameters

Previous proportional-odds models had 7 and 5 parameters, respectively

STAT 526 Topic 11 41


Example 2: Dose Categorical

> library(nnet)

> fit1 <- multinom(outcome ~ as.factor(dose), weights=count, prob2)

> summary(fit1)

Coefficients:

(Intercept) as.factor(dose)1 as.factor(dose)2 as.factor(dose)3

2 -0.8586335 0.03194971 -0.2864809 -1.5161958

3 -0.2488754 0.16185828 0.4536705 0.3794879

4 -0.2063195 0.18526707 0.5810140 0.5055581

5 -0.6117850 0.14178037 0.2615807 0.5641507

Std. Errors:

(Intercept) as.factor(dose)1 as.factor(dose)2 as.factor(dose)3

2 0.2386396 0.3541205 0.3887204 0.5746170

3 0.1966936 0.2867909 0.2827264 0.2869711

4 0.1943777 0.2826526 0.2759257 0.2797853

5 0.2195434 0.3199468 0.3212239 0.3095891

Residual Deviance: 2443.166

AIC: 2475.166

STAT 526 Topic 11 42


Dose Categorical - Predicted Probs

> predProb <- unique(fit1$fitted.values)
> matplot(predProb,las=1,type="l")
> legend("bottomleft", lty=c(1:5), col=c(1:5),
+        paste("Response =", c(0:4)),cex=0.75)

STAT 526 Topic 11 43


Calculation of Residual Deviance

Saturated model when the data are treated as grouped:

Model-based predicted probs = sample proportions

> m / apply(m, 1, sum)
          0          1         2         3         4
0 0.2809524 0.11904762 0.2190476 0.2285714 0.1523810
1 0.2526316 0.11052632 0.2315789 0.2473684 0.1578947
2 0.2125604 0.06763285 0.2608696 0.3091787 0.1497585
3 0.2205128 0.02051282 0.2512821 0.2974359 0.2102564

Deviance for grouped data

$$G^2 = 2\sum_{i=1}^{I}\sum_{j=1}^{J} y_{ij} \log\left(\frac{y_{ij}}{\hat{\mu}_{ij}}\right) = 2\sum_{i=1}^{I}\sum_{j=1}^{J} y_{ij} \log\left(\frac{y_{ij}}{y_{ij}}\right) = 0$$

Deviance for ungrouped data

$$G^2 = 2\sum_{i=1}^{I}\sum_{j=1}^{J} y_{ij} \log\left(\frac{1}{\hat{p}_j(x_i)}\right) = -2\sum_{i=1}^{I}\sum_{j=1}^{J} y_{ij} \log \hat{p}_j(x_i) = 2443.166$$

with I × J × (J − 1) − I · (J − 1) = 4 · 5 · 4 − 4 · 4 = 64 df

STAT 526 Topic 11 44


Example 2: Dose Scored

> library(nnet)

> fit2 <- multinom(outcome ~ dose, weights=count, prob2)

> summary(fit2)

Coefficients:

(Intercept) dose

2 -0.6999134 -0.3544346

3 -0.2194566 0.1470232

4 -0.1772963 0.1945578

5 -0.6544057 0.1914772

Std. Errors:

(Intercept) dose

2 0.2051749 0.13796048

3 0.1676773 0.09130087

4 0.1649761 0.08894654

5 0.1896008 0.10105460

Residual Deviance: 2449.145

AIC: 2465.145

STAT 526 Topic 11 45


Dose Scored - Predicted Probs

> predProb <- unique(fit2$fitted.values)
> matplot(predProb,las=1,type="l")
> legend("bottomleft", lty=c(1:5), col=c(1:5),
+        paste("Response =", c(0:4)),cex=0.75)

STAT 526 Topic 11 46


Conclusions

Can compare the two models to test for linearity

> anova(fit1,fit2)
            Model Res. df Resid. Dev Df LR stat.   Pr(Chi)
1            dose      72   2449.145
2 as.factor(dose)      64   2443.166  8  5.97846 0.6496448

Conclude that it is sufficient to consider linearity

Can do grouped goodness of fit test to assess fit

  $G^2$ = 5.98 on 8 df (same because grouped Model 2 saturated)

This model does not fit as well as the relaxed cumulative-odds model

  $G^2$ = 2447.018 on 72 df versus $G^2$ = 2449.145 on 72 df

STAT 526 Topic 11 47


Test for Equality of βj

Can test if different slope needed for each class j

$$H_0: \log\left(\frac{p_j(X)}{p_1(X)}\right) = \beta_{0j} + \beta X, \quad j = 2, \ldots, J$$

$$H_a: \log\left(\frac{p_j(X)}{p_1(X)}\right) = \beta_{0j} + \beta_j X, \quad j = 2, \ldots, J$$

# -----separate beta_j for each response category-----

# ------the last category is the baseline in VGAM------

> fit3 <- vglm(outcome~dose, multinomial(parallel=FALSE),

+ weights=count,prob2)

> summary(fit3)

# -------same beta_j for each response category-------

> fit3.parallel <- vglm(outcome~dose,multinomial(parallel=TRUE),

+ weights=count,prob2)

> summary(fit3.parallel)

> 1 - pchisq(2*(logLik(fit3)-logLik(fit3.parallel)),

df=length(coef(fit3))-length(coef(fit3.parallel)))

[1] 0.0001767769

STAT 526 Topic 11 48


Example 4: Housing Satisfaction

1681 Copenhagen residents in study (housing in MASS)

Three categorical predictors (1 nominal, 2 ordered)

                  Contact:           Low                   High
                  Satisfaction:  Low  Medium  High     Low  Medium  High
Housing           Influence
Tower blocks      Low             21    21     28       14    19     37
                  Medium          34    22     36       17    23     40
                  High            10    11     36        3     5     23
Apartments        Low             61    23     17       78    46     43
                  Medium          43    35     40       48    45     86
                  High            26    18     54       15    25     62
Atrium houses     Low             13     9     10       20    23     20
                  Medium           8     8     12       10    22     24
                  High             6     7      9        7    10     21
Terraced houses   Low             18     6      7       57    23     13
                  Medium          15    13     13       31    21     13
                  High             7     5     11        5     6     13

STAT 526 Topic 11 49


Mosaic Plot

STAT 526 Topic 11 50


Multinomial Logit Null Model Fit

Distribution of satisfaction same for all residents

> fit.null <- multinom(Sat~1,weights=Freq,housing)

> summary(fit.null)

Call:

multinom(formula = Sat ~ 1, data = housing, weights = Freq)

Coefficients:

(Intercept)

Medium -0.2400404 #Low: 1/(1+exp(-0.2400404)+exp(.1639289))=0.3372992

High 0.1639289 #Medium: =0.2653183

#High: =0.3973825

Std. Errors:

(Intercept)

Medium 0.06329155

High 0.05710232

# DF = 4*3*2*3*(3-1) - 2 = 142 (ungrouped)

# DF = 4*3*2*(3-1) - 2 = 46 (grouped)

Residual Deviance: 3648.878

AIC: 3652.878
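
The probabilities in the comments above come from inverting the baseline-category logits, and under the null model they should match the observed marginal proportions of satisfaction. A short check in R, using the intercepts copied from the output (the names `eta`, `p_model` and `p_obs` are introduced here for illustration):

```r
library(MASS)  # housing data

# Intercepts reported by summary(fit.null); Low is the baseline (logit 0)
eta <- c(Low = 0, Medium = -0.2400404, High = 0.1639289)
p_model <- exp(eta) / sum(exp(eta))   # inverse of the baseline-category logit

# Observed marginal proportions of satisfaction
counts <- tapply(housing$Freq, housing$Sat, sum)
p_obs <- counts / sum(counts)

print(round(rbind(model = p_model, observed = p_obs), 4))
```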

Multinomial Logit Fit

Consider influence as nominal variable

> fit.multinom <- multinom(Sat~Infl+Type+Cont,weights=Freq,housing)

> summary(fit.multinom)

Coefficients:

(Intercept) InflMedium InflHigh TypeApartment

Medium -0.4192316 0.4464003 0.6649367 -0.4356851

High -0.1387453 0.7348626 1.6126294 -0.7356261

TypeAtrium TypeTerrace ContHigh

Medium 0.1313663 -0.6665728 0.3608513

High -0.4079808 -1.4123333 0.4818236

Std. Errors:

(Intercept) InflMedium InflHigh TypeApartment

Medium 0.1729344 0.1415572 0.1863374 0.1725327

High 0.1592295 0.1369380 0.1671316 0.1552714

TypeAtrium TypeTerrace ContHigh

Medium 0.2231065 0.2062532 0.1323975

High 0.2114965 0.2001496 0.1241371

Residual Deviance: 3470.084

AIC: 3498.084

Should we consider interactions among predictors?
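
`summary()` for a `multinom` fit reports estimates and standard errors but no test statistics. The sketch below refits the main-effects model and derives Wald z statistics, two-sided p-values and odds ratios from it. It assumes the `nnet` and `MASS` packages are installed; the names `est`, `se`, `z` and `pval` are introduced here for illustration.

```r
library(MASS)   # housing data
library(nnet)   # multinom()

fit.multinom <- multinom(Sat ~ Infl + Type + Cont, weights = Freq,
                         data = housing, trace = FALSE)

est  <- coef(fit.multinom)                      # 2 x 7 matrix of estimates
se   <- summary(fit.multinom)$standard.errors   # matching standard errors
z    <- est / se                                # Wald z statistics
pval <- 2 * pnorm(abs(z), lower.tail = FALSE)   # two-sided p-values

print(round(exp(est), 3))   # odds ratios relative to Sat = Low
print(signif(pval, 3))
```

Each row of the output compares one satisfaction level (Medium or High) with the baseline level Low.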

Surrogate Log-Linear Models

Again focusing on satisfaction as multinomial response with other three variables as predictors

Will use associations between variables to develop predictive model

Model #1: Satisfaction is indep of the three predictors

  If true, conditional distribution of satisfaction is the same for all predictor combinations
  In other words, conditional probs do not vary with predictors
  This is the same as the multinomial null model
  Can express as log-linear model using

> fit <- glm(Freq~Infl*Type*Cont+Sat,family=poisson,housing)
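
A brief sketch of why this Poisson model reproduces the multinomial null model. The notation is introduced here and is not from the slides: $i$ indexes the 24 predictor combinations, $j$ the satisfaction level, $\lambda_i$ collects all `Infl*Type*Cont` terms, and $\alpha_j$ is the satisfaction effect with $\alpha_{\text{Low}} = 0$.

```latex
\log \mu_{ij} = \lambda_i + \alpha_j
\quad\Longrightarrow\quad
p_{j \mid i} = \frac{\mu_{ij}}{\sum_k \mu_{ik}}
             = \frac{e^{\lambda_i} e^{\alpha_j}}{e^{\lambda_i} \sum_k e^{\alpha_k}}
             = \frac{e^{\alpha_j}}{\sum_k e^{\alpha_k}},
\qquad
\log\frac{p_{j \mid i}}{p_{\text{Low} \mid i}} = \alpha_j .
```

The $\lambda_i$ terms cancel, so the conditional probabilities do not depend on the predictors, and the satisfaction coefficients play the role of the intercepts in the null multinomial logit model.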

Model #1 Results

> summary(fit)

Coefficients:

Estimate Std. Error z value Pr(>|z|)

(Intercept) 3.162e+00 1.243e-01 25.433 < 2e-16 ***

InflMedium 2.733e-01 1.586e-01 1.723 0.084868 .

InflHigh -2.054e-01 1.784e-01 -1.152 0.249511

TypeApartment 3.666e-01 1.555e-01 2.357 0.018403 *

TypeAtrium -7.828e-01 2.134e-01 -3.668 0.000244 ***

TypeTerrace -8.145e-01 2.157e-01 -3.775 0.000160 ***

ContHigh -1.190e-15 1.690e-01 0.000 1.000000

Sat1Medium -2.400e-01 6.329e-02 -3.793 0.000149 ***

Sat1High 1.639e-01 5.710e-02 2.871 0.004094 **

InflMedium:TypeApartment -1.177e-01 2.086e-01 -0.564 0.572571

InflHigh:TypeApartment 1.753e-01 2.279e-01 0.769 0.441783

InflMedium:TypeAtrium -4.068e-01 3.035e-01 -1.340 0.180118

InflHigh:TypeAtrium -1.692e-01 3.294e-01 -0.514 0.607433

InflMedium:TypeTerrace 6.292e-03 2.860e-01 0.022 0.982450

InflHigh:TypeTerrace -9.305e-02 3.280e-01 -0.284 0.776633

InflMedium:ContHigh -1.398e-01 2.279e-01 -0.613 0.539715

InflHigh:ContHigh -6.091e-01 2.800e-01 -2.176 0.029585 *

TypeApartment:ContHigh 5.029e-01 2.109e-01 2.385 0.017083 *

TypeAtrium:ContHigh 6.774e-01 2.751e-01 2.462 0.013811 *

TypeTerrace:ContHigh 1.099e+00 2.675e-01 4.106 4.02e-05 ***

InflMedium:TypeApartment:ContHigh 5.359e-02 2.862e-01 0.187 0.851450

InflHigh:TypeApartment:ContHigh 1.462e-01 3.380e-01 0.432 0.665390

InflMedium:TypeAtrium:ContHigh 1.555e-01 3.907e-01 0.398 0.690597

InflHigh:TypeAtrium:ContHigh 4.782e-01 4.441e-01 1.077 0.281619

InflMedium:TypeTerrace:ContHigh -4.980e-01 3.671e-01 -1.357 0.174827

InflHigh:TypeTerrace:ContHigh -4.470e-01 4.545e-01 -0.984 0.325326

Null deviance: 833.66 on 71 degrees of freedom

Residual deviance: 217.46 on 46 degrees of freedom

AIC: 610.43

Large deviance suggests probs vary with predictors

Residual deviance based on Poisson dist here

Coefs for Sat1 are the same as null multinomial intercepts
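
The last point can be checked directly. The sketch below assumes that `Sat1` is an unordered copy of the ordered factor `Sat`; the slides do not show how `Sat1` was created, so that definition is a guess based on the coefficient names in the output.

```r
library(MASS)  # housing data

# Assumption: Sat1 is an unordered copy of the ordered factor Sat
housing$Sat1 <- factor(housing$Sat, ordered = FALSE)

fit <- glm(Freq ~ Infl * Type * Cont + Sat1, family = poisson, data = housing)

sat_coef <- coef(fit)[c("Sat1Medium", "Sat1High")]
print(sat_coef)   # compare with the null multinom intercepts
print(c(deviance = deviance(fit), df = df.residual(fit)))
```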

Additive Contributions of Predictors

Can assess whether Sat1 depends on each of the 3 predictors individually by adding interactions with it

> addterm(fit, ~. + Sat1:(Infl+Type+Cont), test="Chisq")

Single term additions

Model:

Freq ~ Infl * Type * Cont + Sat

Df Deviance AIC LRT Pr(Chi)

<none> 217.46 610.43

Infl:Sat1 4 111.08 512.05 106.371 < 2.2e-16 ***

Type:Sat1 6 156.79 561.76 60.669 3.292e-11 ***

Cont:Sat1 2 212.33 609.30 5.126 0.07708 .

Infl: max reduction in resid. deviance & AIC

Even though Cont:Sat1 not significant, let’s look at model with all three interactions
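
The single-term additions reported above are deviance differences between nested Poisson fits. The sketch below reproduces them without `addterm`, refitting the model once per added interaction. As before, it assumes `Sat1` is an unordered copy of `Sat`; the helper `lrt_one` is hypothetical.

```r
library(MASS)  # housing data
housing$Sat1 <- factor(housing$Sat, ordered = FALSE)  # assumed definition of Sat1
fit <- glm(Freq ~ Infl * Type * Cont + Sat1, family = poisson, data = housing)

# Hypothetical helper: LRT for adding a single term to the base model
lrt_one <- function(term) {
  bigger <- update(fit, as.formula(paste(". ~ . +", term)))
  stat <- deviance(fit) - deviance(bigger)          # drop in residual deviance
  df <- df.residual(fit) - df.residual(bigger)      # extra parameters used
  c(Df = df, LRT = stat, p = pchisq(stat, df, lower.tail = FALSE))
}

tab <- t(sapply(c("Infl:Sat1", "Type:Sat1", "Cont:Sat1"), lrt_one))
print(tab)
```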

Model #2: Interactions with Sat

> fit2 <- glm(Freq~Infl*Type*Cont+Sat1:Infl+Sat1*Type+Sat1*Cont,

+ family=poisson,housing)

> summary(fit2)

Coefficients:

Estimate Std. Error z value Pr(>|z|)

(Intercept) 3.32106 0.14761 22.498 < 2e-16 ***

InflMedium -0.14543 0.17855 -0.814 0.415369

InflHigh -1.17183 0.21803 -5.375 7.68e-08 ***

TypeApartment 0.68296 0.17522 3.898 9.71e-05 ***

TypeAtrium -0.70064 0.24137 -2.903 0.003698 **

TypeTerrace -0.32511 0.23230 -1.400 0.161652

ContHigh -0.28230 0.18441 -1.531 0.125814

Sat1Medium -0.41923 0.17293 -2.424 0.015342 *

Sat1High -0.13874 0.15923 -0.871 0.383570

InflMedium:TypeApartment -0.01788 0.21050 -0.085 0.932302

InflHigh:TypeApartment 0.38687 0.23330 1.658 0.097263 .

InflMedium:TypeAtrium -0.36031 0.30498 -1.181 0.237432

InflHigh:TypeAtrium -0.03679 0.33479 -0.110 0.912503

InflMedium:TypeTerrace 0.18515 0.28889 0.641 0.521580

InflHigh:TypeTerrace 0.31075 0.33482 0.928 0.353345

InflMedium:ContHigh -0.20006 0.22875 -0.875 0.381799

InflHigh:ContHigh -0.72579 0.28235 -2.571 0.010155 *

TypeApartment:ContHigh 0.56969 0.21215 2.685 0.007247 **

TypeAtrium:ContHigh 0.70211 0.27606 2.543 0.010979 *

TypeTerrace:ContHigh 1.21593 0.26997 4.504 6.67e-06 ***

InflMedium:Sat1Medium 0.44640 0.14156 3.153 0.001613 **

InflHigh:Sat1Medium 0.66494 0.18634 3.568 0.000359 ***

InflMedium:Sat1High 0.73486 0.13694 5.366 8.03e-08 ***

InflHigh:Sat1High 1.61263 0.16713 9.649 < 2e-16 ***

TypeApartment:Sat1Medium -0.43569 0.17253 -2.525 0.011562 *

TypeAtrium:Sat1Medium 0.13137 0.22311 0.589 0.555980

TypeTerrace:Sat1Medium -0.66657 0.20625 -3.232 0.001230 **

TypeApartment:Sat1High -0.73563 0.15527 -4.738 2.16e-06 ***

TypeAtrium:Sat1High -0.40798 0.21150 -1.929 0.053730 .

TypeTerrace:Sat1High -1.41233 0.20015 -7.056 1.71e-12 ***

ContHigh:Sat1Medium 0.36085 0.13240 2.726 0.006420 **

ContHigh:Sat1High 0.48183 0.12414 3.881 0.000104 ***

InflMedium:TypeApartment:ContHigh 0.04690 0.28621 0.164 0.869837

InflHigh:TypeApartment:ContHigh 0.12623 0.33821 0.373 0.708979

InflMedium:TypeAtrium:ContHigh 0.15724 0.39072 0.402 0.687364

InflHigh:TypeAtrium:ContHigh 0.47861 0.44424 1.077 0.281320

InflMedium:TypeTerrace:ContHigh -0.50016 0.36713 -1.362 0.173091

InflHigh:TypeTerrace:ContHigh -0.46310 0.45471 -1.018 0.308467

Null deviance: 833.657 on 71 degrees of freedom

Residual deviance: 38.662 on 34 degrees of freedom

AIC: 455.63

Model #2: Interactions with Sat

Same model as our main-effects multinomial model

Different deviances due to different saturated models.

  In multinom the saturated model is for subjects
  In surrogate log-linear model, it is for cells (grouped)

Comparison with null model
  multinom: 3648.9 − 3470.1 = 178.8 and 142 − 130 = 12 df
  log-linear: 217.5 − 38.7 = 178.8 and 46 − 34 = 12 df

Could also consider higher-order interactions
  Represent non-additive effects of predictors on Sat

addterm(fit1, .~.+Sat:(Infl+Type+Cont)^2, test="Chisq")

None are found significant
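
The agreement between the two comparisons with the null model can be checked with a few lines of arithmetic. The deviances below are copied from the outputs shown earlier; the variable names are introduced here for illustration.

```r
# Residual deviances copied from the outputs above
dev_multinom <- c(null = 3648.878, main = 3470.084)
dev_loglin   <- c(null = 217.46,   main = 38.662)

lrt_multinom <- unname(dev_multinom["null"] - dev_multinom["main"])
lrt_loglin   <- unname(dev_loglin["null"]   - dev_loglin["main"])

# Both tests use 12 df (142 - 130 and 46 - 34)
p_value <- pchisq(lrt_multinom, df = 12, lower.tail = FALSE)

print(c(multinom = lrt_multinom, loglinear = lrt_loglin, p = p_value))
```

The two deviance differences agree up to rounding because the saturated-model terms cancel when nested models are compared on the same data.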

Summary

Models using the Poisson distribution

  Consider E (count response) as a function of predictors
    Poisson regression
    Quasipoisson or negative binomial regression
    Surrogate log-linear model

  Multivariate associations of categorical variables
    Nominal random variables: Log-linear models
    Ordinal random variables: Linear-by-linear model, column-effect models

Models using the multinomial distribution

  Consider E (count response) as a function of predictors
    Ordinal response: cumulative logit model
    Nominal response: multinomial logit model
