logistic regression


Post on 19-Mar-2016









  • Logistic Regression

    Chongming Yang

    Research Support Center, FHSS College

  • Rules of Logarithm

    $\log(uv) = \log(u) + \log(v)$

    $\log(u/v) = \log(u) - \log(v)$

    $\log(u^v) = v \log(u)$

  • Rules of Exponentiation(0
  • Exponential & Logarithmic: Inverse of One Another

    $y = a^x$

    $x = \log_a(y)$

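  • Assumptions of Linear Regression

    $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

    $Y_i$ is continuous and unbounded

    The expected value (mean) of the errors is zero: $E(\varepsilon_i) = 0$

    $\varepsilon_i$ is normally distributed

    $\varepsilon_i$ is not correlated with the predictors

    Absence of perfect multicollinearity

    No measurement error in any variable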

  • Violation of Linear Regression (LR) Assumptions

    Dichotomous Dependent Variable (DV)

    Unordered Categorical (Nominal) DV

    Ordered Categorical (Ordinal) DV

  • Natural Logarithmic Transformation (Binary DV)

    Let p = probability of an event

  • Logit Model

  • Rearranged Logit Model

  • Logistic Model

  • Odds Ratio
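
The equations for the Logit Model, Rearranged Logit Model, Logistic Model, and Odds Ratio slides did not survive in this transcript. As a sketch only, the standard textbook forms for a single predictor $X$ are given below. The notation ($\beta_0$, $\beta_1$, $p$) is an assumption, not a recovery of the original slide content.

```latex
\begin{align*}
\text{Logit model:} \quad
  & \ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X \\
\text{Rearranged logit model (odds):} \quad
  & \frac{p}{1-p} = e^{\beta_0 + \beta_1 X} \\
\text{Logistic model:} \quad
  & p = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}
      = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}} \\
\text{Odds ratio for a one-unit increase in } X\text{:} \quad
  & \frac{\text{odds}(X+1)}{\text{odds}(X)} = e^{\beta_1}
\end{align*}
```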

  • Interpretation of Coefficients (odds ratio)

    Dichotomous predictor X1: The predicted odds of a positive response for group A are ? times the odds for group B. Equivalently, the odds of a positive response for group A are ?% higher than the odds for group B.

    Continuous predictor X2: A one-unit increase in X2 is associated with a ?% increase in the predicted odds of a positive response.
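
The "?" placeholders in the interpretation above can be filled in from a fitted coefficient. The following is a minimal sketch, assuming the usual relationship that the odds ratio equals the exponential of the logit coefficient. The function names and the coefficient values are hypothetical and chosen only for illustration.

```python
import math

def odds_ratio(beta):
    """Odds ratio implied by a logit coefficient: exp(beta)."""
    return math.exp(beta)

def percent_change_in_odds(beta):
    """Percent change in the odds per one-unit increase in the predictor."""
    return (math.exp(beta) - 1.0) * 100.0

# Hypothetical coefficients, not taken from any real output.
b_group = math.log(2.0)   # dichotomous predictor X1 (group A = 1, group B = 0)
b_cont = 0.05             # continuous predictor X2

print(f"Group A odds are {odds_ratio(b_group):.2f} times the group B odds")
print(f"Group A odds are {percent_change_in_odds(b_group):.0f}% higher")
print(f"One-unit increase in X2: {percent_change_in_odds(b_cont):.1f}% higher odds")
```

A negative coefficient gives an odds ratio below 1 and a negative percent change, which reads as a decrease in the odds.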

  • Interpretation

    See Handout

  • Interpretation of Interaction

    Definition: The effect of one covariate depends on the level of another covariate.

    Interpretation:

    Plug in some values of the two variables

    Plot the estimated logit

    Interpret the interaction effect only when the main effects are present in the model
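
The "plug in some values" step can be sketched as follows. This assumes a model of the form logit(p) = b0 + b1*x1 + b2*x2 + b3*x1*x2. The coefficient values and function names are hypothetical, picked only to show how the effect of x1 changes with x2.

```python
import math

# Hypothetical coefficients for logit(p) = B0 + B1*x1 + B2*x2 + B3*x1*x2.
B0, B1, B2, B3 = -1.0, 0.8, 0.5, -0.3

def logit_hat(x1, x2):
    """Estimated logit at the given covariate values."""
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2

def effect_of_x1(x2):
    """Change in the logit per one-unit increase in x1, holding x2 fixed."""
    return logit_hat(1, x2) - logit_hat(0, x2)

# Tabulate the effect of x1 at several levels of x2; these are the values
# one would plot to visualize the interaction.
for x2 in (0, 1, 2):
    effect = effect_of_x1(x2)
    print(f"x2 = {x2}: effect of x1 on logit = {effect:+.2f}, "
          f"odds ratio = {math.exp(effect):.2f}")
```

With these made-up numbers the effect of x1 shrinks as x2 grows, because the interaction coefficient is negative.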

  • Likelihood at a Value of X (left side of equation)

  • Log Likelihood (left side of equation)

  • Log Logit Model (right side of equation)

  • Maximum Likelihood Estimation
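
The likelihood slides above lost their equations. As an illustration of the idea, the sketch below computes the Bernoulli log likelihood of a one-predictor logit model and maximizes it with Newton-Raphson. This is one common estimation approach, not necessarily what any particular statistics package does. The data, function names, and iteration count are assumptions made up for the example.

```python
import math

def sigmoid(z):
    """Logistic function: converts a logit into a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Sum over cases of y*ln(p) + (1 - y)*ln(1 - p), with p = sigmoid(b0 + b1*x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logistic(xs, ys, iterations=25):
    """Newton-Raphson for a one-predictor logit model; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the gradient (g) and the information matrix (h).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information * step = gradient).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up data with overlapping outcomes (not perfectly separable).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(xs, ys)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, "
      f"log likelihood = {log_likelihood(b0, b1, xs, ys):.3f}")
```

The sketch has no safeguards: with perfectly separated data the estimates grow without bound, which is the "extremely large parameter estimates" problem that arises with sparse data.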

  • Likelihood Ratio Test of $\beta_0$, $\beta_1$

    Likelihood ratio test = deviance:

    $D = -2 \ln\left(\dfrac{\text{likelihood of fitted model}}{\text{likelihood of saturated model}}\right)$

    Because the likelihood of the saturated model = 1:

    $D = -2 \ln(\text{likelihood of fitted model})$

  • $\chi^2$ Test of $\beta_0$, $\beta_1$

    1. $\chi^2 = -2 \ln\left(\dfrac{\text{likelihood of model without } x}{\text{likelihood of model with } x}\right)$

    2. Degrees of freedom = $j - (p + 1)$, where $j$ = (# of categories) + (# of continuous variables) and $p$ = # of parameters
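
A small sketch of the test statistic above, working on the log scale so that the ratio of likelihoods becomes a difference of log likelihoods. It assumes the two models are nested and differ by exactly one parameter, so the statistic is referred to a chi-square distribution with 1 degree of freedom. The function name and the log likelihood values are hypothetical.

```python
import math

def likelihood_ratio_test_1df(loglik_without_x, loglik_with_x):
    """Return (chi-square, p-value) for adding one parameter (df = 1)."""
    # -2 * ln(L_without / L_with) = -2 * (lnL_without - lnL_with)
    chi2 = -2.0 * (loglik_without_x - loglik_with_x)
    # Chi-square survival function for 1 df: P(Z^2 > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical log likelihoods, for illustration only.
chi2, p = likelihood_ratio_test_1df(-120.0, -115.0)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

For tests with more than one degree of freedom, the closed form used here no longer applies and a general chi-square distribution function would be needed.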

  • Hosmer-Lemeshow Test ($\chi^2$)

    Subjects are grouped by percentiles of the estimated probability p.

    Where g = 10 groups, k = 1, ..., 10, $n'_k$ = number of subjects in the kth group, $c_k$ = # of covariate patterns in the kth group, $\bar{p}$ = average estimated probability, df = g - 2

  • Observed vs. estimated frequencies by group (table)

    Rows: y = 1 and y = 0. Columns: Group 1 (10% prob.), Group 2 (20% prob.), through Group 10 (100% prob.), each with Estimated and Observed counts (cell entries N1, N2, N3, N4).
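
The Hosmer-Lemeshow formula itself is missing from the transcript. The sketch below assumes the commonly cited form, which sums (observed - expected)^2 / (n_k * p_bar * (1 - p_bar)) over groups formed by sorting subjects on their estimated probability. The function name, the equal-size grouping rule, and the toy data are assumptions for illustration.

```python
def hosmer_lemeshow(ys, p_hats, groups=10):
    """Hosmer-Lemeshow statistic from 0/1 outcomes and estimated probabilities.

    Returns (statistic, degrees_of_freedom) with df = groups - 2.
    """
    # Order subjects by estimated probability (stable sort keeps ties in input order).
    pairs = sorted(zip(p_hats, ys), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for k in range(groups):
        chunk = pairs[k * n // groups:(k + 1) * n // groups]
        n_k = len(chunk)
        if n_k == 0:
            continue  # skip empty groups when n < groups
        observed = sum(y for _, y in chunk)        # observed events in group k
        p_bar = sum(p for p, _ in chunk) / n_k     # average estimated probability
        expected = n_k * p_bar                     # expected events in group k
        stat += (observed - expected) ** 2 / (expected * (1.0 - p_bar))
    return stat, groups - 2

# Toy example with two groups of two subjects each (far too small for real use).
stat, df = hosmer_lemeshow([0, 1, 1, 1], [0.2, 0.3, 0.7, 0.8], groups=2)
print(f"statistic = {stat:.4f}, df = {df}")
```

The statistic is compared against a chi-square distribution with g - 2 degrees of freedom. A group whose average estimated probability is exactly 0 or 1 would cause a division by zero in this sketch.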

  • Wald Test of $\beta_0$, $\beta_1$

    $W = \hat{\beta} / se(\hat{\beta})$ (se = standard error)

    W is tested against the normal distribution
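
As a sketch of the Wald test, the code below divides an estimate by its standard error and converts the result to a two-sided p-value under the standard normal distribution. The function name and the input values are hypothetical.

```python
import math

def wald_test(beta_hat, se):
    """Return (W, two-sided p-value), with W = beta_hat / se compared to N(0, 1)."""
    w = beta_hat / se
    # Two-sided normal tail probability: P(|Z| > |w|) = erfc(|w| / sqrt(2)).
    p_value = math.erfc(abs(w) / math.sqrt(2.0))
    return w, p_value

# Hypothetical estimate and standard error.
w, p = wald_test(0.8, 0.4)
print(f"W = {w:.2f}, p = {p:.4f}")
```

Some software reports the square of W and refers it to a chi-square distribution with 1 degree of freedom, which gives the same p-value.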

  • Multinomial Logistic Regression (non-ordered categorical DV)

    P = probability of a response category

    $P_{i1} + P_{i2} + P_{i3} = 1$

  • Multinomial Logistic Regression
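
The model equations for this slide are missing. A common formulation, assumed here, is the baseline-category logit model: each non-reference category has its own logit relative to a reference category, and the category probabilities are obtained by normalizing so that they sum to 1. The function name and logit values below are hypothetical.

```python
import math

def multinomial_probs(linear_predictors):
    """Category probabilities from baseline-category logits.

    linear_predictors holds the logits of categories 2..J versus category 1,
    the reference category, whose logit is fixed at 0.
    """
    exps = [1.0] + [math.exp(eta) for eta in linear_predictors]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for categories 2 and 3 versus category 1.
p1, p2, p3 = multinomial_probs([0.4, -0.7])
print(f"P1 = {p1:.3f}, P2 = {p2:.3f}, P3 = {p3:.3f}, sum = {p1 + p2 + p3:.3f}")
```

Each logit can be read back as the log of a probability ratio, for example ln(P2 / P1) for category 2 versus the reference.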

  • Interpretation

    See handout

  • Ordinal Logistic Models

    Adjacent Category Model: compare two adjacent categories

  • Adjacent Categories Model

    Let j be a category of an ordinal scale, with categories numbered from j = 1

    j & j+1 = two adjacent categories

    Model
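
The model equation for this slide is missing from the transcript. One common parameterization of the adjacent-categories logit model is sketched below. The symbols $\alpha_j$, $\beta$, and $J$ (the number of categories) are assumptions, and texts differ on which category goes in the numerator and on the sign of $\beta$.

```latex
\ln\!\left(\frac{P(Y = j+1)}{P(Y = j)}\right) = \alpha_j + \beta X,
\qquad j = 1, \dots, J-1
```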

  • Practice

    Run Logistic Regression using binary.sav

    DV = Admit

    IV = gre, gpa, rank

    Annotated output: http://www.ats.ucla.edu/stat/spss/dae/logit.htm

  • Pseudo R-squared (based on Likelihood)

    Explained Variability

    Improvement from null model to fitted model

    Square of correlation (predicted and observed)

  • Pseudo R-squared

    Cox & Snell: improvement of the full model over the intercept model

    Nagelkerke: improvement of the full model over the intercept model

    McFadden: its adjusted form parallels the adjusted R-squared in OLS, penalizing a model with too many predictors
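
The three measures can be computed from the log likelihoods of the intercept-only and fitted models. The sketch below uses the commonly cited definitions. The function name and input values are hypothetical, and conventions differ on whether the penalty k in the adjusted McFadden measure counts the intercept.

```python
import math

def pseudo_r_squared(ll_null, ll_full, n, k):
    """Pseudo R-squared measures from log likelihoods.

    ll_null: log likelihood of the intercept-only model
    ll_full: log likelihood of the fitted (full) model
    n: number of cases
    k: number of parameters penalized in the adjusted McFadden measure
    """
    mcfadden = 1.0 - ll_full / ll_null
    mcfadden_adj = 1.0 - (ll_full - k) / ll_null
    # Cox & Snell: 1 - (L_null / L_full)^(2/n), written on the log scale.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_full) / n)
    # Nagelkerke: Cox & Snell divided by its maximum attainable value.
    nagelkerke = cox_snell / (1.0 - math.exp(2.0 * ll_null / n))
    return {
        "mcfadden": mcfadden,
        "mcfadden_adjusted": mcfadden_adj,
        "cox_snell": cox_snell,
        "nagelkerke": nagelkerke,
    }

# Hypothetical values, for illustration only.
result = pseudo_r_squared(ll_null=-100.0, ll_full=-80.0, n=200, k=3)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

These measures are not proportions of explained variance in the OLS sense, so they are mainly useful for comparing models fitted to the same data.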


  • Practice (continued)

    Run Multinomial Logistic Regression using mlogit.sav

    DV = Brand

    IV = female, age

    Annotated output: http://www.ats.ucla.edu/stat/spss/dae/mlogit.htm

  • Practice (continued)

    Run Ordinal Logistic Regression using ologit.sav

    DV = admit

    IV = gre, gpa, topnotch

    Annotated output: http://www.ats.ucla.edu/stat/SPSS/dae/ologit.htm

  • Practical Issues

    1. Low Ratio of Cases to Variables

    Problem: Extremely large parameter estimates and standard errors

    Solution: Collapse categories, delete the offending category, or delete discrete predictors

  • Practical Issues

    2. Inadequacy of Expected Frequencies & Power

    Problem: Low power when cell frequencies are small

    Solution: Accept low power; collapse categories or delete discrete predictors; evaluate model fit with $\chi^2$

  • Practical Issues

    3. Presence of Multicollinearity

    Problem: Large standard errors or estimates

    Solution: Run multiway frequency tables to identify collinear categorical variables; run correlations to identify collinear continuous variables; delete theoretically less important predictors or combine with other procedures

  • Practical Issues

    Rare events may be appropriate for Poisson regression or negative binomial regression.

  • References

    Allison, P. D. Logistic regression using the SAS system. Cary, NC: SAS Institute, Inc.

    Hosmer, D. W. & Lemeshow, S. (2000). Applied logistic regression. New York: John Wiley & Sons, Inc.

    Menard, S. (1994). Applied logistic regression analysis. Thousand Oaks, CA: Sage Publications, Inc.

    Liao, T. F. (1994). Interpreting Probability models: logit, probit, and other generalized linear models. Thousand Oaks, CA: Sage Publications, Inc.

    Long, J. S. & Freese, J. (2006). Regression models for categorical dependent variables using Stata. College Station, TX: Stata Press.
