Multinomial Logistic Regression, David F. Staples

Post on 14-Dec-2015

  • Slide 1

Multinomial Logistic Regression
David F. Staples

  • Slide 2

Outline

Review of Logistic Regression
BCS Example
Extension to Multiple Response Groups
Nominal Categories
Ordinal Categories
Model Fitting & Interpretation
Shallow Lake Trophic Status

  • Slide 3

Logistic Regression

Based on a binomial random variable: Y ∈ {0, 1}

Prob(Y = 1) = p
Prob(Y = 0) = 1 - p

$$p(x) = P(Y_i = 1 \mid X_i) = \frac{e^{X\beta}}{1 + e^{X\beta}}, \quad \text{where } X\beta = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k.$$

  • Slide 4

Logistic Regression

A logit transformation is used to linearize p(x):

$$\log\left(\frac{p(x)}{1 - p(x)}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k = X\beta$$

The left-hand side is the log odds of success. The $\beta$s give the additive effect of the Xs on the log odds.

  • Slide 5

Logistic Regression Example

Model p as a function of Macrophyte Patch Area.

The dichotomous variable is the presence/absence of BCS:

Y = 1 if BCS present
Y = 0 if BCS absent
p = Prob(BCS present)

    glm(BCS ~ Patch_area, family = binomial)

                  Estimate         SE       z  Pr(>|z|)
    Intercept   -2.433e+00  5.108e-01  -4.764   1.9e-06
    Patch_area   1.765e-04  4.725e-05   3.736    0.0001

  • Slide 6

Interpreting Logistic Regression

Effect of Patch Area on P(BCS):

Non-Linear Transformation
Value of Intercept
Value of Other Variables

  • Slide 7

Interpreting Logistic Regression

For the average-size patch area (8374), the log odds would be:

    -2.433 + 0.0001765 * 8374 = -0.955

Exponentiate to get the odds of success:

    exp(-0.955) = p / (1 - p) = 0.38

Solve for p: Prob(BCS Present | Area = 8374) = 0.28

  • Slide 8

Interpreting Logistic Regression

When p = 0.5, the log odds equals 0, so -2.433 + 0.0001765 * Area = 0.
Thus, the patch area for p = 0.50 is 2.433 / 0.0001765 = 13784.7.

  • Slide 9

Multinomial Logistic Regression

Logistic regression with > 2 response categories
Model probabilities relative to a reference category
Response may be nominal or ordinal

[Figure panels: Nominal, Ordinal]

  • Slide 10

Shallow Lake Trophic Status

3 categories defining lake state:

Y = 1 if lake clear
Y = 2 if lake shifting states
Y = 3 if lake turbid

  • Slide 11

Nominal (un-ordered) Multinomial Logistic

    library(nnet)
    multinom(StateNom ~ TP)

    Coefficients:
       (Int)     TP
    2  -2.47  0.012
    3  -1.89  0.014

    Std. Errors:
       (Int)     TP
    2  0.549  0.004
    3  0.447  0.004

    Residual Deviance: 113.8345
    AIC: 121.8345

  • Slide 12

Nominal (un-ordered) Multinomial Logistic

For TP = 50, p(Shifting) is about 16% of p(Clear).

  • Slide 13

Nominal (un-ordered) Multinomial Logistic

For TP = 50, p(Turbid) is about 30% of p(Clear).

  • Slide 14

Nominal (un-ordered) Multinomial Logistic

[Figure: Odds of Shifting State vs. Clear State]

  • Slide 15

Ordinal Multinomial Logistic, a.k.a. Proportional Odds Model

3 ordered status categories:

Y = 1 if lake clear
Y = 2 if lake shifting states
Y = 3 if lake turbid

  • Slide 16

Ordinal Multinomial Logistic, a.k.a.
Proportional Odds Model

    library(MASS)
    StateOrd = as.ordered(StateNom)
    polr(StateOrd ~ TP)

    Coefficients:
        Value     SE  t value
    TP  0.009  0.002     3.81

    Intercepts:
         Value     SE  t value
    1|2  1.103  0.342     3.22
    2|3  1.889  0.397     4.76

    Residual Deviance: 118.99
    AIC: 124.9897

Assume same slope => fewer parameters.

  • Slide 17

    m2 = polr(StateOrd ~ TP)
    newd = data.frame(TP = seq(0, 600))
    prd = predict(m2, newdata = newd, type = "probs")
    matplot(newd$TP, prd)

  • Slide 18

Nominal/Ordinal Comparison

  • Slide 19

Nominal (un-ordered) Multinomial Logistic

For J = 3 categories defining lake state:

Y = 1 if lake clear
Y = 2 if lake shifting states
Y = 3 if lake turbid

    library(nnet)
    multinom(StateNom ~ TP)

    Coefficients:
       (Intercept)          TP
    2    -2.469517  0.01248172
    3    -1.891459  0.01384079

    Std. Errors:
       (Intercept)           TP
    2    0.5486044  0.004183882
    3    0.4465049  0.003932610

    Residual Deviance: 113.8345
    AIC: 121.8345

  • Slide 20

Ordinal Multinomial Logistic, a.k.a. Proportional Odds Model

For J = 3 categories defining lake state (State 2 is intermediate between 1 & 3).

    library(MASS)
    StateOrd = as.ordered(StateNom)
    polr(StateOrd ~ TP, Hess = TRUE)

    Coefficients:
         Value      SE  t value
    TP  0.0086  0.0023   3.8085

    Intercepts:
          Value      SE  t value
    1|2  1.1028  0.3417   3.2277
    2|3  1.8889  0.3968   4.7605

    Residual Deviance: 118.9897
    AIC: 124.9897
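
As a worked illustration of the nominal/ordinal comparison, the coefficients reported on Slides 19 and 20 can be turned into category probabilities at TP = 50, the value used on Slides 12 and 13. This is a sketch and not part of the original slides. It assumes the baseline-category form for multinom with Clear (Y = 1) as the reference, and the parameterization documented for MASS::polr, logit P(Y ≤ j) = ζ_j − β·TP, with the printed intercepts 1|2 and 2|3 taken as ζ_1 and ζ_2. The symbols η_j and ζ_j are introduced here for illustration, and all results are rounded.

```latex
% Nominal model (multinom), reference category Y = 1 (Clear), TP = 50
\begin{align*}
\eta_2 &= -2.469517 + 0.01248172 \times 50 = -1.8454, & e^{\eta_2} &\approx 0.1580 \\
\eta_3 &= -1.891459 + 0.01384079 \times 50 = -1.1994, & e^{\eta_3} &\approx 0.3014 \\
P(Y = 1) &= \frac{1}{1 + e^{\eta_2} + e^{\eta_3}} \approx \frac{1}{1.4593} \approx 0.685 \\
P(Y = 2) &= \frac{e^{\eta_2}}{1.4593} \approx 0.108, &
P(Y = 3) &= \frac{e^{\eta_3}}{1.4593} \approx 0.207
\end{align*}

% Ordinal model (polr), assuming logit P(Y <= j) = zeta_j - beta * TP, TP = 50
\begin{align*}
\operatorname{logit} P(Y \le 1) &= 1.1028 - 0.0086 \times 50 = 0.6728, &
P(Y \le 1) &= \frac{1}{1 + e^{-0.6728}} \approx 0.662 \\
\operatorname{logit} P(Y \le 2) &= 1.8889 - 0.0086 \times 50 = 1.4589, &
P(Y \le 2) &= \frac{1}{1 + e^{-1.4589}} \approx 0.811 \\
P(Y = 1) &\approx 0.662, \quad P(Y = 2) \approx 0.811 - 0.662 = 0.149, &
P(Y = 3) &\approx 1 - 0.811 = 0.189
\end{align*}
```

The nominal odds 0.158 and 0.301 are the "about 16%" and "about 30%" figures quoted on Slides 12 and 13. Under these assumptions both models place most of the probability on the clear state at TP = 50 (about 0.69 for the nominal fit and 0.66 for the ordinal fit), while the ordinal model gets there with a single shared slope.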