Ordinal and Multinomial Models
William Simpson
Research Computing Services
http://intranet.hbs.edu/dept/research/statistics/

Post on 01-Dec-2014

Types of Models

• Models are generalizations of the logit and probit models

• Ordinal logit and probit deal with ordered data (more than 2 categories)

• Multinomial logit deals with unordered data with more than 2 categories

• (Multinomial probit is not commonly used due to computational difficulties)

Outline of Talk

• Review of Binary Models

• Ordinal Models

• Multinomial Logit

Binary Data – View 1 (CDF)

• View 1 – we compute a number that is a linear combination of our predictors, call it $y = \alpha + \beta x$. We then convert $y$ into a probability $p$ by using a cumulative distribution function (CDF). Our final outcome is 1 with probability $p$.
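The CDF view can be sketched in a few lines of Python. This is only an illustration: the coefficient values are hypothetical, and the function names are made up for this example. It shows the linear predictor being passed through the standard normal CDF (probit) and the logistic CDF (logit).

```python
import math
from statistics import NormalDist


def linear_predictor(alpha: float, beta: float, x: float) -> float:
    """Linear combination of the predictor: y = alpha + beta * x."""
    return alpha + beta * x


def probit_probability(y: float) -> float:
    """Standard normal CDF, the conversion used by the probit model."""
    return NormalDist().cdf(y)


def logit_probability(y: float) -> float:
    """Logistic CDF, the conversion used by the logit model."""
    return 1.0 / (1.0 + math.exp(-y))


# Hypothetical coefficients, chosen only for illustration.
alpha, beta = -0.5, 1.0
for x in (-3, 0, 3):
    y = linear_predictor(alpha, beta, x)
    print(x, round(probit_probability(y), 3), round(logit_probability(y), 3))
```

Both conversions map any value of y into the interval (0, 1) and increase with y, which is what makes them usable as probabilities.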

[Figure: CDF curve. Horizontal axis X from -3 to 3; vertical axis prob from 0 to 1.]

Another CDF View

[Figure: a second picture of the CDF view. Axes X and Y; labels p0, p1, and p.]

Binary Data – View 2 (Latent or Unobserved Variable)

• View 2 – we compute a number that is a linear combination of our predictors and then add an error term, call it $y^* = \alpha + \beta x + u$. We then get an outcome of 1 if $y^* \ge 0$ and an outcome of 0 if $y^* < 0$. In this case, the probabilistic element is the error term $u$, and $y^*$ is an unobserved variable.
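A small simulation can make the latent variable view concrete. The sketch below assumes a standard normal error term (the probit case) and uses hypothetical coefficient values; the function name is invented for this example. It draws many outcomes by thresholding y* at zero and compares the share of 1s with the normal CDF evaluated at the linear predictor.

```python
import random
from statistics import NormalDist


def simulate_outcome(alpha, beta, x, rng):
    """Draw one binary outcome via the latent variable y* = alpha + beta*x + u."""
    u = rng.gauss(0.0, 1.0)        # assumed standard normal error (probit case)
    y_star = alpha + beta * x + u  # unobserved latent variable
    return 1 if y_star >= 0 else 0


rng = random.Random(0)             # fixed seed so the run is repeatable
alpha, beta, x = -0.5, 1.0, 1.0    # hypothetical values
n = 20000
share_ones = sum(simulate_outcome(alpha, beta, x, rng) for _ in range(n)) / n
cdf_value = NormalDist().cdf(alpha + beta * x)
print(share_ones, cdf_value)
```

With a large number of draws the simulated share should sit close to the CDF value, up to sampling noise.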

Binary Data – Unobserved Variable View

[Figure: latent variable view. Axes X and Y, with the PDF of Y* drawn on the plot.]

What Happens When Standard Deviation of u Changes

$y^* = \alpha + \beta x + v$, with std(v) > std(u)

[Figure: latent variable plot for the error term v. Axes X and Y.]
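One way to explore the effect of the error standard deviation is the sketch below. It assumes a normal error with mean zero and standard deviation sigma, in which case the probability of an outcome of 1 is the standard normal CDF of (alpha + beta*x)/sigma. The coefficient values and the function name are hypothetical.

```python
from statistics import NormalDist


def prob_one(alpha, beta, x, sigma):
    """P(y* >= 0) when the error is normal with mean 0 and std sigma."""
    return NormalDist().cdf((alpha + beta * x) / sigma)


alpha, beta, x = -0.5, 1.0, 2.0              # hypothetical values
p_small_sd = prob_one(alpha, beta, x, sigma=1.0)
p_large_sd = prob_one(alpha, beta, x, sigma=2.0)

# Doubling alpha, beta, and sigma together leaves the probability unchanged.
p_rescaled = prob_one(2 * alpha, 2 * beta, x, sigma=2.0)
print(p_small_sd, p_large_sd, p_rescaled)
```

Under these assumptions a larger standard deviation pulls the probability toward 0.5, and only the ratio of the coefficients to sigma affects the probabilities.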

Comparing CDF and Latent Variable Views

• The two views are equivalent. Each one can be converted into the other: the cumulative distribution function (CDF) in view 1 matches the CDF of the distribution of u in view 2.
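The equivalence can be written out in one line. The notation $F_u$ for the CDF of $u$ is introduced here for illustration, and the last step assumes the distribution of $u$ is symmetric about zero, which holds for both the normal and the logistic.

```latex
\[
P(\text{outcome} = 1 \mid x)
  = P(y^* \ge 0)
  = P\bigl(u \ge -(\alpha + \beta x)\bigr)
  = 1 - F_u\bigl(-(\alpha + \beta x)\bigr)
  = F_u(\alpha + \beta x)
\]
```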

Combining the Two Views

[Figure, repeated on two slides: the CDF view and the latent variable view drawn on one plot. Axes X and Y; regions labeled p0 and p1.]

Ordinal Outcomes

• 3 or more categorical outcomes, which can be treated as ordered

• Bond ratings (AAA, AA, … B, C, …)

• Likert scales (e.g. responses on a 1-7 scale, from strongly disagree to strongly agree)
  – Often analyzed as continuous

Ordinal Outcomes (Latent Variable View)

[Figure: latent variable view for an ordinal outcome. Axes X and Y; outcome categories labeled 1, 2, and 3.]

Ordinal Outcomes (CDF and Latent Variable View)

[Figure, repeated on four slides: ordinal outcomes in the combined CDF and latent variable view. Axis X; outcome categories labeled 1, 2, and 3; regions labeled p0 and p1.]

SAS and Stata Code

Stata
    oprobit outcome x
or
    ologit outcome x

SAS
    proc logistic; class outcome; model outcome = x / link=probit; run;
or
    proc logistic; class outcome; model outcome = x; run;

Sample Output (Stata oprobit)

---------------------------------------------------------
           y |      Coef.   Std. Err.       z    P>|z|
---------------------------------------------------------
           x |   1.074575   .1209108     8.89    0.000
-------------+-------------------------------------------
       _cut1 |  -2.076242   .1548201   (Ancillary parameters)
       _cut2 |  -.9736895   .0807119
       _cut3 |  -.4528313    .073509
       _cut4 |   1.106628   .0781733
       _cut5 |   2.079342   .0932966
       _cut6 |   3.176076    .167065
---------------------------------------------------------

Interpretation of Stata Output

• Outcome will be in the second ordered category or higher (not the first), if 1.07*x+u > -2.08.

• Outcome will be in the third ordered category or higher (not the first or second), if 1.07*x+u > -.97.

• Outcome will be in the second ordered category exactly, if -.97 > 1.07*x+u > -2.08.

           x |   1.074575   .1209108
-------------+-----------------------
       _cut1 |  -2.076242   .1548201
       _cut2 |  -.9736895   .0807119
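The interpretation above can be turned into category probabilities. The sketch below takes the coefficient and the six cutpoints from the sample oprobit output and assumes a standard normal error, so that the probability of being above cutpoint k is the normal CDF of (coefficient*x - cutpoint). The function name and the choice of x = 0 are for illustration only.

```python
from statistics import NormalDist

# Estimates copied from the sample Stata oprobit output.
beta = 1.074575
cuts = [-2.076242, -0.9736895, -0.4528313, 1.106628, 2.079342, 3.176076]


def category_probabilities(x):
    """Probabilities of the ordered categories 1..7 at a given x."""
    phi = NormalDist().cdf
    # P(outcome above cutpoint k) = P(beta*x + u > cut_k) = Phi(beta*x - cut_k)
    at_or_above = [1.0] + [phi(beta * x - c) for c in cuts] + [0.0]
    return [at_or_above[k] - at_or_above[k + 1] for k in range(len(cuts) + 1)]


probs = category_probabilities(0.0)   # x = 0 chosen only for illustration
for category, p in enumerate(probs, start=1):
    print(category, round(p, 4))
```

Each category probability is the difference between two neighbouring "this category or higher" probabilities, which matches the third bullet above for the second category.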

Sample Output (SAS PROC LOGISTIC with LINK=PROBIT)

Parameter      DF   Estimate   Std Error
Intercept 7     1    -3.1758      0.1666
Intercept 6     1    -2.0793      0.0933
Intercept 5     1    -1.1066      0.0781
Intercept 4     1     0.4528      0.0734
Intercept 3     1     0.9737      0.0807
Intercept 2     1     2.0762      0.1555
x               1     1.0746      0.1208

Interpretation of SAS Output

• Outcome will be in the second ordered category or higher (not the first), if 1.07*x + 2.08 + u > 0.

• Outcome will be in the third ordered category or higher, if 1.07*x + .97 + u > 0.

• Outcome will be in the second ordered category if 1.07*x + 2.08 + u > 0 and 1.07*x + .97 + u < 0.

Intercept 3     1     0.9737      0.0807
Intercept 2     1     2.0762      0.1555
x               1     1.0746      0.1208
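Comparing the two sample outputs suggests that each SAS intercept is the negative of a Stata cutpoint, with "Intercept k" paired with cutpoint k-1. The short check below uses the printed estimates from both outputs; the pairing and the tolerance (to allow for rounding in the printed SAS values) are assumptions of this sketch.

```python
# Cutpoints _cut1.._cut6 from the Stata oprobit output.
stata_cuts = [-2.076242, -0.9736895, -0.4528313, 1.106628, 2.079342, 3.176076]

# "Intercept k" values from the SAS PROC LOGISTIC output with LINK=PROBIT.
sas_intercepts = {2: 2.0762, 3: 0.9737, 4: 0.4528, 5: -1.1066, 6: -2.0793, 7: -3.1758}

# Pair cutpoint k-1 with intercept k and compare after flipping the sign.
differences = {
    k: abs(-cut - sas_intercepts[k])
    for k, cut in enumerate(stata_cuts, start=2)
}
for k, diff in differences.items():
    print(k, round(diff, 6))
```

The differences are at the level of rounding and small numerical differences between the two programs, so the two parameterizations describe the same fitted model.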

Interpreting Coefficients

• The output has either multiple cutpoints and no intercept term (Stata) or multiple intercept terms (SAS)

• Probabilities modeled are probabilities for all outcomes >=k, compared with all outcomes < k.

• Interpret the coefficients the same as in the corresponding binary model.

Interpreting Coefficients (Ordinal Probit)

\[ p_2 = \Phi(\alpha_2 + \beta X) \]
\[ p_3 = \Phi(\alpha_3 + \beta X) \]

$\Phi$ is the cumulative distribution of a standard normal
$p_2$ is the probability of outcome 2 or higher
$p_3$ is the probability of outcome 3 or higher

\[ \text{prob(exactly 2)} = p_2 - p_3 \]

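Interpreting Coefficients (Ordinal Logit)

\[ \log\frac{p_2}{1 - p_2} = \alpha_2 + \beta X \]
\[ p_2 = \frac{\exp(\alpha_2 + \beta X)}{1 + \exp(\alpha_2 + \beta X)} \]
\[ p_3 = \frac{\exp(\alpha_3 + \beta X)}{1 + \exp(\alpha_3 + \beta X)} \]

$p_2$ is the probability of outcome 2 or higher
$p_3$ is the probability of outcome 3 or higher

\[ \text{prob(exactly 2)} = p_2 - p_3 \]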
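As a rough numerical illustration of the ordinal logit formulas, the sketch below plugs in the intercepts and slope printed in the SAS ordered logit output in the appendix. The function names and the evaluation point x = 0 are assumptions made for this example.

```python
import math


def inv_logit(z):
    """Logistic CDF: exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


def cumulative_probs(intercepts, beta, x):
    """p_k = P(outcome is k or higher) for each intercept alpha_k."""
    return {k: inv_logit(a + beta * x) for k, a in intercepts.items()}


# "Intercept k" values and the slope on x from the appendix SAS ordered logit output.
intercepts = {2: 4.3014, 3: 1.7093, 4: 0.7326, 5: -1.8611, 6: -3.6194, 7: -6.1912}
beta = 1.8479

p = cumulative_probs(intercepts, beta, x=0.0)  # x = 0 chosen only for illustration
prob_exactly_2 = p[2] - p[3]
print(round(p[2], 4), round(p[3], 4), round(prob_exactly_2, 4))
```

Taking the log odds of p[2] recovers the linear predictor with intercept 2, which is the first equation on this slide.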

Assumptions of Ordinal Models

• Relationship between probabilities and $\alpha + \beta x$ follows the assumed form (normal for probit, logistic for logit).

• Parallel regressions
  – Coefficient is the same for every hurdle
  – Also known as equal slopes (proportional odds for logistic models)
  – If not, use generalized ordered logit

Parallel Regressions

[Figure: three parallel lines in X, labeled 1, 2, and 3. Axes X and Y; regions labeled p0 and p1.]

Proportional Odds

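\[ \log\frac{p_2}{1 - p_2} = \alpha_2 + \beta X \]
\[ \log\frac{p_3}{1 - p_3} = \alpha_3 + \beta X \]
\[ \log\frac{p_2}{1 - p_2} - \log\frac{p_3}{1 - p_3} = \log\frac{\text{odds}_2}{\text{odds}_3} = \alpha_2 - \alpha_3 \]
\[ \text{odds}_2 = \text{odds}_3 \cdot \exp(\alpha_2 - \alpha_3) \]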
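The proportional odds property can be checked numerically. This sketch uses two intercepts and the slope from the SAS ordered logit output in the appendix and evaluates the odds ratio at a few arbitrary values of x; the function name and the x values are assumptions of the example.

```python
import math


def odds_at_or_above(alpha_k, beta, x):
    """Odds of outcome k or higher in the ordinal logit model: exp(alpha_k + beta*x)."""
    return math.exp(alpha_k + beta * x)


# Intercept 2, Intercept 3, and the slope from the appendix SAS ordered logit output.
alpha_2, alpha_3, beta = 4.3014, 1.7093, 1.8479

# Ratio of odds_2 to odds_3 at several arbitrary values of x.
ratios = [
    odds_at_or_above(alpha_2, beta, x) / odds_at_or_above(alpha_3, beta, x)
    for x in (-1.0, 0.0, 2.0)
]
expected_ratio = math.exp(alpha_2 - alpha_3)
print(ratios, expected_ratio)
```

The ratio is the same at every x because the slope is shared across hurdles, so the term in x cancels.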

Interpreting Cutpoints

Sample Likert Scale with Extra Points

Point   Label
1       SD
2       D
2.3     MoD
3       SoD
4       N
4.2     VSA
5       SoA
6       A
7       SA

SD = Strongly Disagree, D = Disagree, MoD = Moderately Disagree, SoD = Somewhat Disagree
N = Neutral
VSA = Very Slightly Agree, SoA = Somewhat Agree, A = Agree, SA = Strongly Agree

Probability of Responses

[Figure: probability of each response, for the responses SD, D, SoD, N, SoA, A, SA and the extra points MoD and VSA.]

Sample Likert Scale with Uneven Points

 1      2      3       4      5      6       7
-----------------------------------------------------------
 SD     D      MoD     SoD    N      VSA     SA
(1)    (2)    (2.3)   (3)    (4)    (4.2)   (7)

SD = Strongly Disagree, D = Disagree, MoD = Moderately Disagree, SoD = Somewhat Disagree
N = Neutral
VSA = Very Slightly Agree, SA = Strongly Agree

Probabilities with Uneven Scale

[Figure: probability of each response on the uneven scale: SD, D, MoD, SoD, N, VSA, SA.]

Ordinal Outcomes (Latent Variable View)

[Figure: latent variable view for an ordinal outcome. Axes X and Y; outcome categories labeled 1, 2, and 3.]

Multinomial Logit

• A generalization of logistic regression

• More than two outcomes

• Outcomes are not ordered

• We are interested in the relative probabilities of outcomes

Examples

• Choice of transportation – bus, taxi, private car

• Choice of product brand

• Occupational choice (considered as unordered) – craft, blue collar, professional, white collar

Example Data

ID   Distance   Income   Choice
1    5          15       Bus
2    10         10       Car
3    1          12       Car
4    25         18       Bus
5    30         40       Taxi
6    2          20       Bus
7    1          8        Taxi
…    …          …        …

Using a Reference Level

ID   Distance   Income   Choice
1    5          15       Bus
2    10         10       Car
3    1          12       Car
4    25         18       Bus
5    30         40       Taxi
6    2          20       Bus
7    1          8        Taxi
…    …          …        …

Sample Results

-----------------------------------------------------
     outcome |      Coef.   Std. Err.       z    P>|z|
-------------+---------------------------------------
Taxi         |
    distance |  -.0757664   .1305456    -0.58    0.562
      income |    .319901   .0830162     3.85    0.000
       _cons |   -6.22562   1.734012    -3.59    0.000
-------------+---------------------------------------
Car          |
    distance |   .4482523   .1129979     3.97    0.000
      income |   .0447404   .0581754     0.77    0.442
       _cons |  -2.587764   1.214103    -2.13    0.033
-----------------------------------------------------
(Outcome outcome==Bus is the comparison group)
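To see how these coefficients translate into probabilities, the sketch below applies the usual multinomial logit formula with Bus as the comparison group, whose coefficients are fixed at zero. The coefficients are copied from the sample results; the function name and the evaluation point (distance 5, income 15, matching the first row of the example data) are choices made for illustration.

```python
import math

# Coefficients from the sample results; Bus is the comparison group (all zeros).
coefs = {
    "Bus":  {"cons": 0.0,       "distance": 0.0,        "income": 0.0},
    "Taxi": {"cons": -6.22562,  "distance": -0.0757664, "income": 0.319901},
    "Car":  {"cons": -2.587764, "distance": 0.4482523,  "income": 0.0447404},
}


def predicted_probabilities(distance, income):
    """Multinomial logit probabilities for Bus, Taxi, and Car."""
    scores = {
        mode: c["cons"] + c["distance"] * distance + c["income"] * income
        for mode, c in coefs.items()
    }
    denom = sum(math.exp(s) for s in scores.values())
    return {mode: math.exp(s) / denom for mode, s in scores.items()}


probs = predicted_probabilities(distance=5, income=15)
for mode, p in probs.items():
    print(mode, round(p, 4))
```

The log of the ratio of any outcome's probability to the Bus probability equals that outcome's linear predictor, which is how the coefficients in the table are read.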

Sample Results (2)

-----------------------------------------------------
     outcome |      Coef.   Std. Err.       z    P>|z|
-------------+---------------------------------------
Bus          |
    distance |   .0757664   .1305456     0.58    0.562
      income |   -.319901   .0830162    -3.85    0.000
       _cons |    6.22562   1.734012     3.59    0.000
-------------+---------------------------------------
Car          |
    distance |   .5240187   .1245058     4.21    0.000
      income |  -.2751607    .080734    -3.41    0.001
       _cons |   3.637855   1.705811     2.13    0.033
-----------------------------------------------------
(Outcome outcome==Taxi is the comparison group)

Coefficients on Distance

Comparison (reference → outcome)    Coefficient
• Taxi → Bus                          .0757664
• Bus → Taxi                         -.0757664
• Bus → Car                           .4482523
• Taxi → Car                          .5240187

(Bus → Taxi) + (Taxi → Car) = (Bus → Car)

-.0757664 + .5240187 = .4482523

(Bus → Car) = (Taxi → Car) – (Taxi → Bus)
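The same arithmetic applies to every coefficient, not only distance. The sketch below recomputes the Car coefficients with Taxi as the comparison group by subtracting the Taxi coefficients from the Car coefficients (both estimated with Bus as the comparison group), then compares them with the values printed in Sample Results (2). The tolerance allows for rounding in the printed output.

```python
# Estimates with Bus as the comparison group (Sample Results).
taxi_vs_bus = {"distance": -0.0757664, "income": 0.319901, "cons": -6.22562}
car_vs_bus = {"distance": 0.4482523, "income": 0.0447404, "cons": -2.587764}

# Car relative to Taxi = (Car relative to Bus) - (Taxi relative to Bus).
car_vs_taxi = {name: car_vs_bus[name] - taxi_vs_bus[name] for name in car_vs_bus}

# Estimates printed with Taxi as the comparison group (Sample Results (2)).
reported_car_vs_taxi = {"distance": 0.5240187, "income": -0.2751607, "cons": 3.637855}

for name in car_vs_taxi:
    print(name, round(car_vs_taxi[name], 7), reported_car_vs_taxi[name])
```

Changing the comparison group therefore relabels the same fitted model; no new information is estimated.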

Probability Change Plot

[Figure: change in the predicted probability (axis from -.18 to .33) of each outcome, marked B, T, and C, for a change of +/- sd/2 in distance and in income.]

Odds Ratio Plot

[Figure: odds ratio plot of the standardized coefficients for distance and income, with outcomes marked B, T, and C. Factor change scale: .23, .37, .59, .95, 1.54, 2.48, 4. Logit coefficient scale: -1.48, -1.01, -.53, -.05, .43, .91, 1.39. Both axis labels end in "Relative to"; the category they name is not recoverable.]

Independence from Irrelevant Alternatives (IIA)

• Relative odds of two categories shouldn’t change when a new category is added

• E.g., if the choices are car, bus, and Yellow Cab, the relative proportions shouldn’t change if a new choice is added, e.g. Black & White Cab
  – Not realistic in this case. The assumption should be examined carefully.
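A toy calculation shows what IIA implies under the multinomial logit formula. All utility values below are hypothetical and the function name is invented for this sketch; the point is only that adding an alternative leaves the ratio of any two existing probabilities unchanged.

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit probabilities from a dict of linear predictors."""
    denom = sum(math.exp(v) for v in utilities.values())
    return {name: math.exp(v) / denom for name, v in utilities.items()}


# Hypothetical linear predictors for one traveller.
base = {"car": 1.0, "bus": 0.2, "yellow_cab": 0.5}
with_new = dict(base, black_white_cab=0.5)   # a second, essentially identical cab

p_before = choice_probabilities(base)
p_after = choice_probabilities(with_new)

ratio_before = p_before["car"] / p_before["bus"]
ratio_after = p_after["car"] / p_after["bus"]
print(ratio_before, ratio_after)
print(p_before["yellow_cab"], p_after["yellow_cab"])
```

In this sketch the new cab takes share from car and bus in the same proportion as from Yellow Cab, whereas in practice one would expect it to draw mostly from Yellow Cab, which is why the assumption is questionable here.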

Other Models for Nominal Outcomes

• Conditional Logit
  – Attributes of choices can be used as predictors

• Nested Logit
  – Treats a set of choices as a hierarchy
  – IIA assumption can be relaxed

Appendix

Programming Examples
By James Zeitler

Ordered Logit (SAS)

proc logistic data = work.ordinals descending;
  model y = x;
run;

The LOGISTIC Procedure

Model Information
Data Set                  WORK.ORDINALS
..............................................
Model                     cumulative logit
Optimization Technique    Fisher's scoring

Response Profile
Ordered Value    y    Total Frequency
            1    7                  6
.............................
            7    1                  6
Probabilities modeled are cumulated over the lower Ordered Values.

Analysis of Maximum Likelihood Estimates
Parameter      DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept 7     1    -6.1912           0.4312          206.1863       <.0001
Intercept 6     1    -3.6194           0.1804          402.7389       <.0001
Intercept 5     1    -1.8611           0.1414          173.2883       <.0001
Intercept 4     1     0.7326           0.1275           33.0150       <.0001
Intercept 3     1     1.7093           0.1520          126.4030       <.0001
Intercept 2     1     4.3014           0.4189          105.4418       <.0001
x               1     1.8479           0.2176           72.1016       <.0001

Ordered Probit (SAS)

proc logistic data = work.ordinals descending;
  model y = X / LINK = PROBIT;
run;

The LOGISTIC Procedure

Model Information
Data Set                  WORK.ORDINALS
...............................................
Model                     cumulative probit

Response Profile
Ordered Value    y    Total Frequency
            1    7                  6
............................
            7    1                  6
Probabilities modeled are cumulated over the lower Ordered Values.

Analysis of Maximum Likelihood Estimates
Parameter      DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept 7     1    -3.1758           0.1666          363.5568       <.0001
Intercept 6     1    -2.0793           0.0933          496.5331       <.0001
Intercept 5     1    -1.1066           0.0781          200.8158       <.0001
Intercept 4     1     0.4528           0.0734           38.0347       <.0001
Intercept 3     1     0.9737           0.0807          145.4615       <.0001
Intercept 2     1     2.0762           0.1555          178.1792       <.0001
x               1     1.0746           0.1208           79.1034       <.0001

Multinomial Logit (SAS)

/* Use LINK = GLOGIT in PROC LOGISTIC */
/* to estimate a multinomial logit    */
/* Refer to the response profile to   */
/* determine the reference category   */
proc logistic data = transport;
  class Mode;
  model Mode = Distance Income / link = glogit;
run;

The LOGISTIC Procedure

Model Information
Data Set                     WORK.TRANSPORT
Response Variable            Mode
Number of Response Levels    3
Model                        generalized logit

Response Profile
Ordered Value    Mode    Total Frequency
            1    Bus                  27
            2    Car                  42
            3    Taxi                 31
Logits modeled use Mode='Taxi' as the reference category.

Analysis of Maximum Likelihood Estimates
Parameter    Mode   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept    Bus     1     6.2253           1.7340           12.8897       0.0003
Intercept    Car     1     3.6375           1.7057            4.5475       0.0330
Distance     Bus     1     0.0757           0.1305            0.3367       0.5617
Distance     Car     1     0.5240           0.1245           17.7135       <.0001
Income       Bus     1    -0.3199           0.0830           14.8488       0.0001
Income       Car     1    -0.2751           0.0807           11.6155       0.0007

Ordered Logit (SPSS)

Analyze > Regression > Ordinal...

Logit is the default link distribution

Ordered Logit Syntax and Results (SPSS)

PLUM y WITH x
  /CRITERIA = CIN(95) DELTA(0) LCONVERGE(0) MXITER(100) MXSTEP(5) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /LINK = LOGIT
  /PRINT = FIT PARAMETER SUMMARY .

Parameter Estimates
                                                               95% Confidence Interval
                      Estimate  Std. Error     Wald  df  Sig.  Lower Bound  Upper Bound
Threshold  [Y = 1]      -4.302        .419  105.441   1  .000       -5.123       -3.480
           [Y = 2]      -1.709        .152  126.409   1  .000       -2.007       -1.411
           [Y = 3]       -.733        .127   33.018   1  .000        -.983        -.483
           [Y = 4]       1.861        .141  173.282   1  .000        1.584        2.138
           [Y = 5]       3.619        .180  402.733   1  .000        3.266        3.973
           [Y = 6]       6.191        .431  206.180   1  .000        5.346        7.036
Location   X             1.848        .218   72.096   1  .000        1.421        2.274
Link function: Logit.

Ordered Probit (SPSS)

Analyze > Regression > Ordinal...

Set Probit as link distribution

Ordered Probit Syntax and Results (SPSS)

PLUM y WITH x
  /CRITERIA = CIN(95) DELTA(0) LCONVERGE(0) MXITER(100) MXSTEP(5) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /LINK = PROBIT
  /PRINT = FIT PARAMETER SUMMARY .

Parameter Estimates
                                                               95% Confidence Interval
                      Estimate  Std. Error     Wald  df  Sig.  Lower Bound  Upper Bound
Threshold  [Y = 1]      -2.076        .156  178.170   1  .000       -2.381       -1.771
           [Y = 2]       -.974        .081  145.464   1  .000       -1.132        -.815
           [Y = 3]       -.453        .073   38.033   1  .000        -.597        -.309
           [Y = 4]       1.107        .078  200.820   1  .000         .954        1.260
           [Y = 5]       2.079        .093  496.537   1  .000        1.896        2.262
           [Y = 6]       3.176        .167  363.453   1  .000        2.850        3.503
Location   X             1.075        .121   79.106   1  .000         .838        1.311
Link function: Probit.

Multinomial Logit (SPSS)

Analyze > Regression > Multinomial logit...

NOMREG choice WITH distance income
  /CRITERIA = CIN(95) DELTA(0) MXITER(100) MXSTEP(5) CHKSEP(20) LCONVERGE(0) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /MODEL
  /INTERCEPT = INCLUDE
  /PRINT = PARAMETER SUMMARY LRT .

Parameter Estimates
                                                                 95% Confidence Interval for Exp(B)
CHOICE               B  Std. Error    Wald  df  Sig.  Exp(B)  Lower Bound  Upper Bound
Bus    Intercept   2.588     1.214   4.543   1  .033
       DISTANCE    -.448      .113  15.736   1  .000    .639         .512         .797
       INCOME      -.045      .058    .591   1  .442    .956         .853        1.072
Taxi   Intercept  -3.638     1.706   4.548   1  .033
       DISTANCE    -.524      .125  17.714   1  .000    .592         .464         .756
       INCOME       .275      .081  11.616   1  .001   1.317        1.124        1.542
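The Exp(B) column in the SPSS output is the exponentiated coefficient, that is, the factor change in the odds for a one-unit increase in the predictor. The sketch below recomputes it from the printed B values; the tolerance is an assumption that allows for B being rounded to three decimals in the output.

```python
import math

# B values and reported Exp(B) copied from the SPSS parameter estimates.
b_values = {
    ("Bus", "DISTANCE"): -0.448,
    ("Bus", "INCOME"): -0.045,
    ("Taxi", "DISTANCE"): -0.524,
    ("Taxi", "INCOME"): 0.275,
}
reported_exp_b = {
    ("Bus", "DISTANCE"): 0.639,
    ("Bus", "INCOME"): 0.956,
    ("Taxi", "DISTANCE"): 0.592,
    ("Taxi", "INCOME"): 1.317,
}

# Exponentiate each coefficient to get the factor change in the odds.
computed_exp_b = {key: math.exp(b) for key, b in b_values.items()}
for key, value in computed_exp_b.items():
    print(key, round(value, 3), reported_exp_b[key])
```

A value below 1 means the odds of that choice, relative to the omitted category, fall as the predictor increases; a value above 1 means they rise.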