lecture slides stats1.13.l20.air

Statistics One, Lecture 20: Binary Logistic Regression

Upload: atutorte

Post on 29-Jun-2015


TRANSCRIPT

Page 1: Lecture slides stats1.13.l20.air

Statistics One

Lecture 20 Binary Logistic Regression


Page 2: Lecture slides stats1.13.l20.air

Two segments

•  Overview
•  Example

Page 3: Lecture slides stats1.13.l20.air

Lecture 20 ~ Segment 1

Binary Logistic Regression Overview


Page 4: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Appropriate when predicting a binary categorical outcome variable from a set of predictor variables that may be continuous and/or categorical
– Same logic as multiple regression, but the outcome variable is categorical and binary

Page 5: Lecture slides stats1.13.l20.air

Binary logistic regression

•  When the outcome has two levels
– Binary logistic regression

•  When the outcome has more than two levels
– Multinomial logistic regression

Page 6: Lecture slides stats1.13.l20.air

Multiple regression

•  Ŷ = B0 + Σ(BkXk)

Ŷ = predicted value on the outcome variable Y
B0 = predicted value on Y when all X = 0
Xk = predictor variables
Bk = unstandardized regression coefficients
(Y – Ŷ) = residual (prediction error)
k = the number of predictor variables

Page 7: Lecture slides stats1.13.l20.air

Binary logistic regression

•  ln(Ŷ / (1 – Ŷ)) = B0 + Σ(BkXk)

Ŷ = predicted probability of the outcome (Y = 1)
B0 = predicted logit (log-odds) when all X = 0
Xk = predictor variables
Bk = unstandardized regression coefficients (change in the logit per one-unit increase in Xk)
(Y – Ŷ) = residual (prediction error)
k = the number of predictor variables

Page 8: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Why ln(Ŷ / (1 – Ŷ))?

•  Predicted score must fall between 0 and 1


Page 9: Lecture slides stats1.13.l20.air

Binary logistic regression

Page 10: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Why not P(outcome) = B0 + Σ(BkXk) ???

•  There is no guarantee that the linear combination of predictors will produce a score between 0 and 1

•  A transformation is therefore applied
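To make the point concrete, here is a minimal Python sketch that evaluates a linear combination directly and then passes it through the inverse of the logit. The coefficients `b0` and `b1` and the helper name `inverse_logit` are made up for illustration; they are not from the lecture.

```python
import math

def inverse_logit(z):
    """Map a log-odds value z to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, chosen only for illustration
b0, b1 = -0.5, 0.3

for x in [-5, 0, 5, 10]:
    linear_score = b0 + b1 * x                 # unbounded: can fall below 0 or above 1
    probability = inverse_logit(linear_score)  # always between 0 and 1
    print(f"x={x:>3}  linear={linear_score:5.2f}  P={probability:.3f}")
```

The raw linear score leaves the 0 to 1 range for several of these x values, while the transformed value never does.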

Page 11: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Odds = P(outcome) / (1 – P(outcome))

•  For example, what are the odds a flipped coin will land heads? Odds = .5 / .5 = 1

•  Then take the natural log of the odds, which is called the log-odds or logit

•  Logit = ln(P(outcome) / (1 – P(outcome)))
•  Logit = ln(Ŷ / (1 – Ŷ))

Page 12: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Logit = ln(Ŷ / (1 – Ŷ))
•  Ŷ = P(outcome)

Page 13: Lecture slides stats1.13.l20.air

Binary logistic regression

•  P(outcome) = odds / (1 + odds)

•  Odds = P(outcome) / P(~outcome)

•  For example, if P = .50 then Odds = 1 and Logit = 0
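The conversions between probability, odds, and logit can be sketched in a few lines of Python. The function names below are illustrative choices, not part of the lecture.

```python
import math

def odds_from_prob(p):
    """Odds = P(outcome) / (1 - P(outcome))."""
    return p / (1.0 - p)

def prob_from_odds(odds):
    """P(outcome) = odds / (1 + odds)."""
    return odds / (1.0 + odds)

def logit(p):
    """Logit = natural log of the odds."""
    return math.log(odds_from_prob(p))

p = 0.50
print("odds :", odds_from_prob(p))
print("logit:", logit(p))
# Round trip: probability -> odds -> probability
print("back :", prob_from_odds(odds_from_prob(0.80)))
```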

Page 14: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Example
•  Outcome variable = Faculty promotion to tenure
•  Predictor variable = Publications (Pubs)

•  Logit(Promotion) = B0 + B1(Pubs)
•  Logit(Promotion) = 0.00 + .39(Pubs)

•  For every one-unit increase in Pubs, the logit increases by .39

Page 15: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Logit = ln(P(outcome) / (1 – P(outcome)))

•  Odds = P(outcome) / (1 – P(outcome))

•  A logit coefficient of .39 translates to an odds ratio of e^.39 = 1.48
– This means that the odds of promotion are multiplied by 1.48 for each one-unit increase in Pubs

Page 16: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Thus, if the odds of Promotion with 16 publications are 1.27, then the odds of Promotion with 17 publications are 1.27 × 1.48 = 1.88

•  This can also be presented in terms of probability
•  Pubs = 17 means P(Promotion) = .65 because P(Promotion) = Odds / (1 + Odds) = 1.88 / 2.88 = .65
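The arithmetic in this example can be checked with a short Python sketch. The slope (.39) and the odds at 16 publications (1.27) are taken from the slides; the variable names are illustrative.

```python
import math

b1 = 0.39                    # slope for Pubs, from the slide
odds_ratio = math.exp(b1)    # multiplicative change in odds per additional publication

odds_16 = 1.27               # odds of promotion with 16 publications, from the slide
odds_17 = odds_16 * odds_ratio
p_17 = odds_17 / (1.0 + odds_17)

print("odds ratio:", round(odds_ratio, 2))
print("odds at 17:", round(odds_17, 2))
print("P at 17   :", round(p_17, 2))
```

The sketch uses the unrounded odds ratio, so intermediate values differ slightly from the slide's hand calculation but round to the same figures.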

Page 17: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Hypothesis tests
– Is an individual predictor variable significant?
– Is the overall model significant?
– Is Model A significantly better than Model B?

Page 18: Lecture slides stats1.13.l20.air

Binary logistic regression

•  To test each predictor variable
– Regression coefficient
– Odds ratio
– Wald test: tests the model vs. the model without the predictor
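As a rough illustration, a Wald statistic for a single coefficient can be computed as the coefficient divided by its standard error, with the square of that ratio referred to a chi-square distribution with 1 degree of freedom. In the sketch below the coefficient .39 comes from the tenure example, but the standard error is a hypothetical value invented for illustration, since the slides do not report one.

```python
import math

def wald_test(b, se):
    """Return (z, chi-square, two-sided p-value) for H0: coefficient = 0."""
    z = b / se
    chi_square = z ** 2                            # Wald chi-square, df = 1
    p_value = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided p from the standard normal
    return z, chi_square, p_value

b1 = 0.39      # coefficient from the tenure example
se_b1 = 0.15   # HYPOTHETICAL standard error, for illustration only

z, chi_square, p_value = wald_test(b1, se_b1)
print(f"z = {z:.2f}, Wald chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```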

Page 19: Lecture slides stats1.13.l20.air

Binary logistic regression

•  To test the overall model
– Compare the chi-square for the model to the chi-square of a model with no predictors (the null model)
– And/or compare multiple models

•  Also, does the model classify cases correctly?
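The sketch below illustrates both ideas on a tiny made-up data set: it fits a one-predictor logistic model, compares its log-likelihood with that of an intercept-only (null) model to get a model chi-square, and then counts correctly classified cases at a .5 cutoff. The data, the function names, and the choice of Newton-Raphson for fitting are assumptions made for illustration; the lecture does not specify a fitting routine.

```python
import math

def sigmoid(z):
    """Inverse logit: convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, iterations=25):
    """Fit logit(P) = b0 + b1*x by Newton-Raphson (maximum likelihood)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        ps = [sigmoid(b0 + b1 * x) for x in xs]
        ws = [p * (1.0 - p) for p in ps]
        # Gradient of the log-likelihood
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Information matrix (negative Hessian), 2 x 2
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system by hand
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def log_likelihood(ys, ps):
    """Bernoulli log-likelihood of outcomes ys given probabilities ps."""
    return sum(y * math.log(p) + (1 - y) * math.log(1.0 - p)
               for y, p in zip(ys, ps))

# MADE-UP data: x is a single predictor, y is the binary outcome
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
model_ps = [sigmoid(b0 + b1 * x) for x in xs]
null_ps = [sum(ys) / len(ys)] * len(ys)   # null model predicts the base rate

# Model chi-square: twice the gain in log-likelihood over the null model
model_chi_square = 2.0 * (log_likelihood(ys, model_ps) - log_likelihood(ys, null_ps))

# Classification success at a .5 cutoff
predicted = [1 if p >= 0.5 else 0 for p in model_ps]
accuracy = sum(int(pred == y) for pred, y in zip(predicted, ys)) / len(ys)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"model chi-square (df = 1) = {model_chi_square:.3f}")
print(f"proportion classified correctly = {accuracy:.2f}")
```

With one predictor the model chi-square has 1 degree of freedom; in general the degrees of freedom equal the number of predictors added to the null model.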

Page 20: Lecture slides stats1.13.l20.air

Segment summary

•  Binary logistic regression is appropriate when predicting a binary categorical outcome variable from a set of predictor variables that may be continuous and/or categorical

Page 21: Lecture slides stats1.13.l20.air

Segment summary

•  Main components of the output are
– Regression coefficients
– Odds ratios
– Wald tests
– Model chi-square
– Classification success

Page 22: Lecture slides stats1.13.l20.air

END SEGMENT


Page 23: Lecture slides stats1.13.l20.air

Lecture 20 ~ Segment 2

Binary Logistic Regression Example


Page 24: Lecture slides stats1.13.l20.air

Binary logistic regression

•  This example is based on “mock jury” research by Diamond & Casper (1992)
–  People (mock jurors) watched a video of the sentencing phase of a murder trial in which the defendant had already been found guilty
–  The issue for the jurors to decide was whether the defendant deserved the death penalty

Page 25: Lecture slides stats1.13.l20.air

Binary logistic regression

•  This example is based on “mock jury” research by Diamond & Casper (1992)
–  Assume the data were collected “pre-deliberation”, which means that each juror was asked to provide his or her vote on the death penalty verdict before the jurors met as a group to decide the overall jury verdict

Page 26: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Outcome variable (Y): Verdict
– 1 = Voted for the death penalty
– 0 = Voted against the death penalty

•  Predictors (Xs), all measured on a scale of 0 – 10
– Danger
– Rehab
– Punish
– Gendet
– Specdet
– Incap

Page 27: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Danger (Dangerousness)
– Individual’s beliefs as to the future dangerousness of the defendant

•  Rehab (Rehabilitation)
– Individual’s beliefs as to the importance of rehabilitation as a goal of criminal sentencing

•  Punish (Punishment)
– Individual’s beliefs as to the importance of punishment as a goal of criminal sentencing

Page 28: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Gendet (General deterrence)
– Individual’s beliefs as to the importance of general deterrence as a goal of criminal sentencing (sentencing should deter the general public)

•  Specdet (Specific deterrence)
– Individual’s beliefs as to the importance of specific deterrence as a goal of criminal sentencing (sentencing should deter the specific defendant)

•  Incap (Incapacitation)
– Individual’s beliefs as to the importance of incapacitation as a goal of criminal sentencing

Page 29: Lecture slides stats1.13.l20.air

Binary logistic regression

•  The General Linear Model will not guarantee a predicted outcome score between 0 and 1

•  The Logit transformation is a feature of an even more “general” mathematical framework in regression

•  The Generalized Linear Model
– Allows for non-linear relationships between predictors and the outcome variable (see Lecture 23)

Page 30: Lecture slides stats1.13.l20.air

Binary logistic regression

Page 31: Lecture slides stats1.13.l20.air

Binary logistic regression

Page 32: Lecture slides stats1.13.l20.air

Binary logistic regression

Page 33: Lecture slides stats1.13.l20.air

Binary logistic regression

Page 34: Lecture slides stats1.13.l20.air

Binary logistic regression

Page 35: Lecture slides stats1.13.l20.air

Binary logistic regression

Page 36: Lecture slides stats1.13.l20.air

Binary logistic regression

Page 37: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Evaluation of individual predictors
–  Odds ratios
•  For a one-unit increase in X, the predicted change in odds
•  Can also report confidence intervals for odds ratios
–  Wald test
•  A function of the regression coefficient. A Wald test is calculated for each predictor variable and compares the fit of the model to the fit of the model without the predictor.
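One common way to obtain a confidence interval for an odds ratio is to build the interval on the logit scale and then exponentiate both ends. The sketch below reuses the coefficient .39 from the tenure example in Segment 1; the standard error is a hypothetical value invented for illustration, and the function name is likewise an assumption.

```python
import math

def odds_ratio_ci(b, se, z_crit=1.96):
    """Odds ratio with an approximate 95% confidence interval.

    The interval is computed for the coefficient b on the logit scale
    (b +/- z_crit * se) and then exponentiated.
    """
    odds_ratio = math.exp(b)
    lower = math.exp(b - z_crit * se)
    upper = math.exp(b + z_crit * se)
    return odds_ratio, lower, upper

b1 = 0.39      # coefficient from the tenure example
se_b1 = 0.15   # HYPOTHETICAL standard error, for illustration only

odds_ratio, lower, upper = odds_ratio_ci(b1, se_b1)
print(f"odds ratio = {odds_ratio:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

An interval that excludes 1 corresponds to a predictor whose coefficient differs significantly from 0 at the matching alpha level.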

Page 38: Lecture slides stats1.13.l20.air

Binary logistic regression

•  Evaluation of the model
– Model chi-square: compares the fit of the model to the fit of the null model
– Classification success: percentage of cases classified correctly

Page 39: Lecture slides stats1.13.l20.air

Binary logistic regression

•  More than 2 categories on the outcome
– Multinomial logistic regression

•  (A – 1) logistic regression equations are formed
– Where A = # of groups
– One group serves as reference group
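To show how A – 1 equations yield A probabilities, the following sketch converts a set of logits, each expressed relative to the reference group, into group probabilities. The logit values and the function name are hypothetical and chosen only for illustration.

```python
import math

def multinomial_probs(logits_vs_reference):
    """Convert A-1 logits (each relative to the reference group) into A probabilities.

    Returns [P(reference), P(group 1), ..., P(group A-1)].
    """
    exp_terms = [math.exp(eta) for eta in logits_vs_reference]
    denominator = 1.0 + sum(exp_terms)   # the reference group contributes exp(0) = 1
    return [1.0 / denominator] + [term / denominator for term in exp_terms]

# HYPOTHETICAL logits for A = 3 groups, so A - 1 = 2 equations
logits = [0.4, -1.2]
probs = multinomial_probs(logits)
print([round(p, 3) for p in probs], "sum =", round(sum(probs), 3))
```

With a single logit (A = 2) the same function reduces to binary logistic regression.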

Page 40: Lecture slides stats1.13.l20.air

END SEGMENT


Page 41: Lecture slides stats1.13.l20.air

END LECTURE 20
