Multinomial Logistic Regression

Inanimate objects can be classified scientifically into three major categories; those that don't work, those that break down and those that get lost (Russell Baker)

Multinomial Logistic Regression
  Also known as polytomous or nominal logistic (logit) regression, or the discrete choice model
  Generalization of binary logistic regression to a polytomous DV
  When applied to a dichotomous DV, identical to binary logistic regression

Polytomous Variables
  Three or more unordered categories
  Categories mutually exclusive and exhaustive
  Sometimes called multicategorical or multinomial variables

Polytomous DVs
  Reason for leaving welfare: marriage, stable employment, move to another state, incarceration, or death
  Status of foster home application: licensed to foster, discontinued application process prior to licensure, or rejected for licensure
  Changes in living arrangements of the elderly: newly co-residing with their children, no longer co-residing, or residing in institutions

Single (Dichotomous) IV Example
  DV = interview tracking effort
    easy-to-interview and track mothers (Easy)
    difficult-to-track mothers who required more telephone calls (MoreCalls)
    difficult-to-track mothers who required more unscheduled home visits (MoreVisits)
  IV = race, 0 = European-American, 1 = African-American
  N = 246 mothers
  What is the relationship between race and interview tracking effort?

Crosstabulation
  Table 3.1

Relationship between race and tracking effort is statistically significant [χ²(2, N = 246) = 8.69, p = .013]
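The chi-square test above can be checked by hand. The sketch below is a minimal illustration, assuming the Table 3.1 cell counts are the ones that appear in the probability calculations in these slides (96, 30, 17 for European-Americans and 53, 24, 26 for African-Americans). It uses only the standard library; for 2 degrees of freedom the upper-tail probability of the chi-square distribution is exp(-x / 2), so no statistics package is needed.

```python
import math

# Observed counts, assumed to match Table 3.1
# (rows: race; columns: Easy, MoreCalls, MoreVisits).
observed = {
    "European-American": [96, 30, 17],
    "African-American": [53, 24, 26],
}

n = sum(sum(row) for row in observed.values())
col_totals = [sum(col) for col in zip(*observed.values())]

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = 0.0
for row in observed.values():
    row_total = sum(row)
    for obs, col_total in zip(row, col_totals):
        expected = row_total * col_total / n
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(col_totals) - 1)

# Closed form that holds only for df = 2.
p_value = math.exp(-chi_square / 2)

print(f"chi2({df}, N = {n}) = {chi_square:.2f}, p = {p_value:.3f}")
```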

Reference Category
  In binary logistic regression, the category of the DV coded 0 implicitly serves as the reference category
  Also known as the baseline, base, or comparison category
  In multinomial logistic regression, necessary to explicitly select the reference category
  Easy selected

Probabilities
  Table 3.1
  More Calls (vs. Easy)
    European-American: .24 = 30 / (30 + 96)
    African-American: .31 = 24 / (24 + 53)
  More Visits (vs. Easy)
    European-American: .15 = 17 / (17 + 96)
    African-American: .33 = 26 / (26 + 53)

Odds & Odds Ratio
  More Calls (vs. Easy)
    European-American: .3125 (.2098 / .6713)
    African-American: .4528 (.2330 / .5146)
    Odds Ratio = 1.45 (.4528 / .3125)
    45% increase in the odds
  More Visits (vs. Easy)
    European-American: .1771 (.1189 / .6713)
    African-American: .4905 (.2524 / .5146)
    Odds Ratio = 2.77 (.4905 / .1771)
    177% increase in the odds
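The odds and odds ratios above follow directly from the cell counts. A minimal sketch, assuming the counts implied by the probability calculations in these slides; the dictionary layout and the function name `odds_vs_easy` are illustrative only.

```python
# Cell counts assumed to match Table 3.1.
counts = {
    "European-American": {"Easy": 96, "MoreCalls": 30, "MoreVisits": 17},
    "African-American": {"Easy": 53, "MoreCalls": 24, "MoreVisits": 26},
}

def odds_vs_easy(group, outcome):
    """Odds of `outcome` relative to the reference category Easy."""
    return counts[group][outcome] / counts[group]["Easy"]

odds_ratios = {}
for outcome in ("MoreCalls", "MoreVisits"):
    odds_ea = odds_vs_easy("European-American", outcome)
    odds_aa = odds_vs_easy("African-American", outcome)
    odds_ratios[outcome] = odds_aa / odds_ea
    change = 100 * (odds_ratios[outcome] - 1)  # percent change in the odds
    print(f"{outcome}: odds EA = {odds_ea:.4f}, odds AA = {odds_aa:.4f}, "
          f"OR = {odds_ratios[outcome]:.2f}, change = {change:.0f}%")
```

Dividing a count by the Easy count gives the same odds as dividing the two row proportions, because the row total cancels.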

Question & Answer
  What is the relationship between race and interview tracking effort?

The odds of requiring more calls, compared to being easy-to-track, are higher for African-Americans by a factor of 1.45 (45%). The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).

Multinomial Logistic Regression
  Set of binary logistic regression models estimated simultaneously
  Number of non-redundant binary logistic regression equations equals the number of categories of the DV minus one

Statistical Significance
  Table 3.2
    β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0
    Reject
  Table 3.3
    β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0
    Reject
  Table 3.4
    β(Race, More Calls vs. Easy) = 0
    Don't reject
    β(Race, More Visits vs. Easy) = 0
    Reject

Odds Ratios
  OR(More Calls vs. Easy) = 1.45
    The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans.
  OR(More Visits vs. Easy) = 2.77
    The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).

Estimated Logits (L)
  Table 3.4

  L(More Calls vs. Easy) = a + B_Race X_Race
  L(More Calls vs. Easy) = -1.163 + (.371)(X_Race)

  L(More Visits vs. Easy) = a + B_Race X_Race
  L(More Visits vs. Easy) = -1.731 + (1.019)(X_Race)
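With a single dichotomous IV, the estimated coefficients can be recovered from the crosstabulation. The intercept is the log odds for the group coded 0, and the slope is the log of the odds ratio. The sketch below assumes the Table 3.1 counts implied by the probability calculations in these slides; it is a check on the reported values, not the estimation routine SPSS uses.

```python
import math

# Cell counts assumed to match Table 3.1.
counts = {
    "European-American": {"Easy": 96, "MoreCalls": 30, "MoreVisits": 17},
    "African-American": {"Easy": 53, "MoreCalls": 24, "MoreVisits": 26},
}

def log_odds(group, outcome):
    """Natural log of the odds of `outcome` vs. Easy within one group."""
    return math.log(counts[group][outcome] / counts[group]["Easy"])

coefficients = {}
for outcome in ("MoreCalls", "MoreVisits"):
    a = log_odds("European-American", outcome)      # intercept: race = 0
    b = log_odds("African-American", outcome) - a   # slope: log odds ratio
    coefficients[outcome] = (a, b)
    print(f"L({outcome} vs. Easy) = {a:.3f} + ({b:.3f})(X_Race)")
```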

Logits to Odds
  African-Americans (X = 1)

  L(More Calls vs. Easy) = -.792 = -1.163 + (.371)(1)
  Odds = e^(-.792) = .45

  L(More Visits vs. Easy) = -.712 = -1.731 + (1.019)(1)
  Odds = e^(-.712) = .49

Logits to Probabilities

African-Americans, L(More Calls vs. Easy) = -.792

African-Americans, L(More Visits vs. Easy) = -.712
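The conversion formulas for this slide did not survive extraction, so the sketch below is an assumption about what was intended rather than a reproduction. It shows two conversions from logits to probabilities for African-Americans: the pairwise probability (outcome vs. Easy only), which matches the .31 and .33 reported on the Probabilities slide, and the three-category probability, which matches the row proportions .2330, .2524, and .5146 used on the Odds slide.

```python
import math

# Intercept a and race slope B from Table 3.4.
coefs = {"MoreCalls": (-1.163, 0.371), "MoreVisits": (-1.731, 1.019)}
x_race = 1  # African-American

logits = {k: a + b * x_race for k, (a, b) in coefs.items()}
odds = {k: math.exp(v) for k, v in logits.items()}

# Pairwise probability: the outcome vs. Easy, ignoring the third category.
pairwise = {k: o / (1 + o) for k, o in odds.items()}

# Three-category probability: the reference category Easy has odds of 1.
denominator = 1 + sum(odds.values())
multinomial = {k: o / denominator for k, o in odds.items()}
multinomial["Easy"] = 1 / denominator

for k in coefs:
    print(f"{k}: logit = {logits[k]:.3f}, odds = {odds[k]:.2f}, "
          f"pairwise p = {pairwise[k]:.2f}, three-category p = {multinomial[k]:.3f}")
print(f"Easy: three-category p = {multinomial['Easy']:.3f}")
```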

Question & Answer
  What is the relationship between race and interview tracking effort?

The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans. The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).

Single (Quantitative) IV Example
  DV = interview tracking effort
    easy-to-interview and track mothers (Easy)
    difficult-to-track mothers who required more telephone calls (MoreCalls)
    difficult-to-track mothers who required more unscheduled home visits (MoreVisits)
  IV = years of education
  N = 246 mothers
  What is the relationship between education and interview tracking effort?

Statistical Significance
  Table 3.6
    β(Education, More Calls vs. Easy) = β(Education, More Visits vs. Easy) = 0
    Reject
  Table 3.7
    β(Education, More Calls vs. Easy) = 0
    Don't reject
    β(Education, More Visits vs. Easy) = 0
    Reject

Odds Ratios
  OR(More Calls vs. Easy) = .88
    The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education.
  OR(More Visits vs. Easy) = .76
    For every additional year of education the odds of needing more visits, compared to being easy-to-track, decrease by a factor of .76 (i.e., -24.1%).
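The percentage change quoted for an odds ratio is 100 × (OR − 1), where OR = e^B. A minimal sketch, using the education slope for More Visits vs. Easy reported in Table 3.7 (B = -.276); the function name and the `units` argument are illustrative additions for showing the change over several years of education.

```python
import math

def percent_change_in_odds(b, units=1):
    """Percent change in the odds for a `units`-unit increase in the IV."""
    return 100 * (math.exp(b * units) - 1)

b_more_visits = -0.276  # education slope, More Visits vs. Easy (Table 3.7)
odds_ratio = math.exp(b_more_visits)

print(f"OR = {odds_ratio:.2f}")
print(f"change per additional year = {percent_change_in_odds(b_more_visits):.1f}%")
print(f"change over 4 additional years = "
      f"{percent_change_in_odds(b_more_visits, units=4):.1f}%")
```

The multi-year change compounds multiplicatively (OR raised to the number of years), so it is not simply four times the one-year percentage.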

Figures
  Education.xls

Estimated Logits (L)
  Table 3.7
  X = 12 (high school education)

  L(More Calls vs. Easy) = -.977 = .583 + (-.130)(12)
  L(More Visits vs. Easy) = -1.235 = 2.077 + (-.276)(12)

Effect of Education on Tracking Effort (Logits)

Data

More Calls

  a        B        X    Logit    Odds    p       Decrease in p
  0.583    -0.130    8   -0.46    0.63    0.39
  0.583    -0.130    9   -0.58    0.56    0.36    0.03
  0.583    -0.130   10   -0.71    0.49    0.33    0.03
  0.583    -0.130   11   -0.84    0.43    0.30    0.03
  0.583    -0.130   12   -0.97    0.38    0.27    0.03
  0.583    -0.130   13   -1.10    0.33    0.25    0.03
  0.583    -0.130   14   -1.23    0.29    0.23    0.02
  0.583    -0.130   15   -1.36    0.26    0.20    0.02
  0.583    -0.130   16   -1.49    0.22    0.18    0.02
  0.583    -0.130   17   -1.62    0.20    0.16    0.02

More Visits

  a        B        X    Logit    Odds    p       Decrease in p
  2.077    -0.276    8   -0.13    0.88    0.47
  2.077    -0.276    9   -0.41    0.66    0.40    0.07
  2.077    -0.276   10   -0.69    0.50    0.34    0.06
  2.077    -0.276   11   -0.96    0.38    0.28    0.06
  2.077    -0.276   12   -1.24    0.29    0.22    0.05
  2.077    -0.276   13   -1.51    0.22    0.18    0.04
  2.077    -0.276   14   -1.79    0.17    0.14    0.04
  2.077    -0.276   15   -2.07    0.13    0.11    0.03
  2.077    -0.276   16   -2.34    0.10    0.09    0.02
  2.077    -0.276   17   -2.62    0.07    0.07    0.02
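The worksheet values above can be regenerated from the Table 3.7 coefficients. This is a minimal sketch, assuming the p column is the pairwise probability odds / (1 + odds), which is consistent with the tabled values. Because the coefficients here are rounded to three decimals, a few entries may differ from the worksheet in the last digit.

```python
import math

# Intercepts (a) and education slopes (B) as reported in Table 3.7.
coefs = {"More Calls": (0.583, -0.130), "More Visits": (2.077, -0.276)}

def logit_odds_p(a, b, x):
    """Return the logit, the odds, and the pairwise probability (vs. Easy) at x."""
    logit = a + b * x
    odds = math.exp(logit)
    return logit, odds, odds / (1 + odds)

for outcome, (a, b) in coefs.items():
    print(outcome)
    for years in range(8, 18):
        logit, odds, p = logit_odds_p(a, b, years)
        print(f"  X = {years:2d}  logit = {logit:6.2f}  odds = {odds:.2f}  p = {p:.2f}")
```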

Logits to Odds
  X = 12 (high school education)

  Odds(More Calls vs. Easy) = e^(-.977) = .38
  Odds(More Visits vs. Easy) = e^(-1.235) = .29

Logits to Probabilities
  X = 12 (high school education)

Question & Answer
  What is the relationship between education and interview tracking effort?

The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education. For every additional year of education the odds of needing more visits, compared to being easy-to-track, decrease by a factor of .76 (i.e., -24.1%).

Multiple IV Example
  DV = interview tracking effort
    easy-to-interview and track mothers (Easy)
    difficult-to-track mothers who required more telephone calls (MoreCalls)
    difficult-to-track mothers who required more unscheduled home visits (MoreVisits)
  IV = race, 0 = European-American, 1 = African-American
  IV = years of education
  N = 246 mothers

Multiple IV Example (cont'd)
  What is the relationship between race and interview tracking effort, when controlling for education?

Statistical Significance
  Table 3.8
    β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = β(Ed, More Calls vs. Easy) = β(Ed, More Visits vs. Easy) = 0
    Reject
  Table 3.9
    β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0
    Reject
    β(Ed, More Calls vs. Easy) = β(Ed, More Visits vs. Easy) = 0
    Reject

Statistical Significance (cont'd)
  Table 3.10
    β(Race, More Calls vs. Easy) = 0
    Don't reject
    β(Race, More Visits vs. Easy) = 0
    Reject
    β(Ed, More Calls vs. Easy) = 0
    Don't reject
    β(Ed, More Visits vs. Easy) = 0
    Reject

Odds Ratios: Race
  OR(More Calls vs. Easy) = 1.36
    The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans, when controlling for education.
  OR(More Visits vs. Easy) = 2.48
    The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.48 (148%), when controlling for education.

Odds Ratios: Education
  OR(More Calls vs. Easy) = .89
    The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education, when controlling for race.
  OR(More Visits vs. Easy) = .77
    For every additional year of education the odds of needing more visits, compared to being easy-to-track, decrease by a factor of .77 (i.e., -23%), when controlling for race.

Figures
  Race & Education.xls

Effect of Education on Tracking Effort for African-Americans (Odds)

Data

More Calls

  a        B_Ed     X_Ed   B_Race   X_Race   Logit    Odds    p
  0.345    -0.120    8     0.307    1        -0.31    0.73    0.42
  0.345    -0.120    9     0.307    1        -0.43    0.65    0.39
  0.345    -0.120   10     0.307    1        -0.55    0.58    0.37
  0.345    -0.120   11     0.307    1        -0.67    0.51    0.34
  0.345    -0.120   12     0.307    1        -0.79    0.45    0.31
  0.345    -0.120   13     0.307    1        -0.91    0.40    0.29
  0.345    -0.120   14     0.307    1        -1.03    0.36    0.26
  0.345    -0.120   15     0.307    1        -1.15    0.32    0.24
  0.345    -0.120   16     0.307    1        -1.27    0.28    0.22
  0.345    -0.120   17     0.307    1        -1.39    0.25    0.20

More Visits

  a        B_Ed     X_Ed   B_Race   X_Race   Logit    Odds    p
  1.416    -0.258    8     0.910    1         0.26    1.30    0.57
  1.416    -0.258    9     0.910    1         0.01    1.01    0.50
  1.416    -0.258   10     0.910    1        -0.25    0.78    0.44
  1.416    -0.258   11     0.910    1        -0.51    0.60    0.38
  1.416    -0.258   12     0.910    1        -0.77    0.46    0.32
  1.416    -0.258   13     0.910    1        -1.03    0.36    0.26
  1.416    -0.258   14     0.910    1        -1.28    0.28    0.22
  1.416    -0.258   15     0.910    1        -1.54    0.21    0.18
  1.416    -0.258   16     0.910    1        -1.80    0.17    0.14
  1.416    -0.258   17     0.910    1        -2.06    0.13    0.11
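The worksheet covers African-Americans only (X_Race = 1). The sketch below uses the same coefficients to compare both groups at several education levels, as one way to see what "controlling for education" means in a model without an interaction term: the race odds ratio is e^(B_Race) at every level of education. The function name and the choice of 8, 12, and 16 years are illustrative.

```python
import math

# Coefficients as listed in the worksheet: intercept, education slope, race slope.
coefs = {
    "More Calls": {"a": 0.345, "b_ed": -0.120, "b_race": 0.307},
    "More Visits": {"a": 1.416, "b_ed": -0.258, "b_race": 0.910},
}

def odds(outcome, years, race):
    """Odds of `outcome` vs. Easy at the given education (years) and race (0 or 1)."""
    c = coefs[outcome]
    return math.exp(c["a"] + c["b_ed"] * years + c["b_race"] * race)

for outcome in coefs:
    for years in (8, 12, 16):
        ratio = odds(outcome, years, race=1) / odds(outcome, years, race=0)
        print(f"{outcome}, {years} years of education: OR (race) = {ratio:.2f}")
```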

Figure x. Effect of Education on Tracking Effort

Figure x. Effect of Education on Tracking Effort for African-Americans

Effect of Education on Tracking Effort for African-Americans (Probabilities)

Figure x. Effect of Education on Tracking Effort

Question & Answer
  What is the relationship between race and interview tracking effort, when controlling for education?

The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans, when controlling for education. The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.48 (148%), when controlling for education.

Assumptions Necessary for Testing Hypotheses
  Assumptions discussed in GZLM lecture
  Independence of irrelevant alternatives (IIA)
    Odds of one outcome (e.g., More Calls) relative to another (e.g., Easy) are not influenced by other alternatives (e.g., More Visits)
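IIA is built into the form of the multinomial logistic model: the ratio of two category probabilities depends only on those two categories' logits. The sketch below illustrates this with the African-American logits from the race example (-.792 and -.712). It computes the odds of More Calls vs. Easy with and without More Visits in the choice set. It demonstrates the model property only; it is not a test of whether the assumption holds in the data.

```python
import math

def multinomial_probs(logits):
    """Category probabilities given logits vs. the reference category Easy."""
    odds = {k: math.exp(v) for k, v in logits.items()}
    denominator = 1 + sum(odds.values())  # Easy contributes odds of 1
    probs = {k: o / denominator for k, o in odds.items()}
    probs["Easy"] = 1 / denominator
    return probs

with_visits = multinomial_probs({"MoreCalls": -0.792, "MoreVisits": -0.712})
without_visits = multinomial_probs({"MoreCalls": -0.792})

ratio_with = with_visits["MoreCalls"] / with_visits["Easy"]
ratio_without = without_visits["MoreCalls"] / without_visits["Easy"]

# The probabilities change when an alternative is removed, but the odds do not.
print(f"p(MoreCalls): {with_visits['MoreCalls']:.3f} vs. {without_visits['MoreCalls']:.3f}")
print(f"odds(MoreCalls vs. Easy): {ratio_with:.4f} vs. {ratio_without:.4f}")
```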

Model Evaluation
  Create a set of binary DVs from the polytomous DV

    recode TrackCat (1=0) (2=1) (3=sysmis) into MoreCalls.
    recode TrackCat (1=0) (2=sysmis) (3=1) into MoreVisits.

  Run separate binary logistic regressions
  Use binary logistic regression methods to detect outliers and influential observations
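For readers working outside SPSS, the recode step can be sketched in plain Python. This assumes TrackCat is coded 1 = Easy, 2 = MoreCalls, 3 = MoreVisits, which is inferred from the recode statements rather than stated in the slides; `None` stands in for SPSS system-missing, and the sample values are hypothetical.

```python
def recode_binary(track_cat, target):
    """Return 0 for Easy (1), 1 for the target category, None (missing) otherwise."""
    if track_cat == 1:
        return 0
    if track_cat == target:
        return 1
    return None

# Hypothetical TrackCat values for six mothers.
track_cat = [1, 2, 3, 1, 3, 2]

more_calls = [recode_binary(v, target=2) for v in track_cat]
more_visits = [recode_binary(v, target=3) for v in track_cat]

print(more_calls)
print(more_visits)
```

Each binary DV keeps only the reference category and one comparison category, so each separate binary logistic regression drops the cases in the other comparison category.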

Model Evaluation (cont'd)
  Index plots
    Leverage values
    Standardized or unstandardized deviance residuals
    Cook's D
  Graph and compare observed and estimated counts

Analogs of R²
  None in standard use, and each may give different results
  Typically much smaller than R² values in linear regression
  Difficult to interpret
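As one illustration of how small these analogs tend to be, the sketch below computes McFadden's pseudo-R² for the race-only model. The slides do not name a specific analog, so McFadden's is chosen here purely as an example. The sketch assumes the Table 3.1 counts implied by the probability calculations in these slides, and relies on the fact that a model with a single dichotomous IV reproduces each group's observed proportions.

```python
import math

# Cell counts assumed to match Table 3.1 (Easy, MoreCalls, MoreVisits).
counts = {
    "European-American": [96, 30, 17],
    "African-American": [53, 24, 26],
}

def log_likelihood(groups):
    """Multinomial log-likelihood when fitted probabilities equal observed proportions."""
    total = 0.0
    for row in groups:
        n = sum(row)
        total += sum(k * math.log(k / n) for k in row if k > 0)
    return total

pooled = [sum(col) for col in zip(*counts.values())]
ll_null = log_likelihood([pooled])               # intercept-only model
ll_race = log_likelihood(list(counts.values()))  # race as the single IV

mcfadden_r2 = 1 - ll_race / ll_null
lr_chi_square = 2 * (ll_race - ll_null)  # likelihood-ratio chi-square, 2 df

print(f"McFadden R2 = {mcfadden_r2:.3f}, LR chi-square = {lr_chi_square:.2f}")
```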

Multicollinearity
  SPSS multinomial logistic regression doesn't compute multicollinearity statistics
  Use SPSS linear regression
  Problematic levels
    Tolerance < .10 or VIF > 10
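Tolerance for an IV is 1 − R² from regressing that IV on the other IVs, and VIF is 1 / tolerance, so the two cutoffs above describe the same threshold. With only two IVs, R² is the squared Pearson correlation between them. The sketch below uses made-up race and education values purely for illustration; they are not the study data.

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical values for eight cases (not the study data).
race = [0, 0, 0, 0, 1, 1, 1, 1]
education = [12, 14, 16, 10, 11, 12, 9, 13]

r = pearson_r(race, education)
tolerance = 1 - r ** 2
vif = 1 / tolerance
problematic = tolerance < 0.10 or vif > 10

print(f"r = {r:.3f}, tolerance = {tolerance:.3f}, VIF = {vif:.3f}, "
      f"problematic = {problematic}")
```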

Additional Topics
  Polytomous IVs
  Curvilinear relationships
  Interactions

Additional Regression Models for Polytomous DVs
  Multinomial probit regression
    Substantive results essentially indistinguishable from multinomial logistic regression
    Choice between this and multinomial logistic regression largely one of convenience and discipline-specific convention
    Many researchers prefer multinomial logistic regression because it provides odds ratios whereas probit regression does not, and because it comes with a wider variety of fit statistics

Additional Regression Models for Polytomous DVs (cont'd)
  Discriminant analysis
    Limited to continuous IVs