# Logistic regression for binary response variables

Post on 01-Jan-2016

• Logistic regression for binary response variables

• Space shuttle example. n = 24 space shuttle launches prior to the Challenger disaster on January 28, 1986. Response y is an indicator variable: y = 1 if there were O-ring failures during launch, y = 0 if there were no O-ring failures during launch. Predictor x1 is launch temperature, in degrees Fahrenheit.

• Space shuttle example

• A model

• The mean of a binary response

• A linear regression model for a binary response. The simple linear regression model is $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, for $Y_i = 0, 1$.
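A short worked step on the mean of a binary response may help here. The notation $\pi_i = P(Y_i = 1)$ is an assumption, not taken from the slides:

$$E(Y_i) = 1 \cdot \pi_i + 0 \cdot (1 - \pi_i) = \pi_i .$$

A straight-line model for the mean would therefore set $\pi_i = \beta_0 + \beta_1 X_i$, which can fall outside $[0, 1]$ for extreme values of $X_i$. That is the usual motivation for replacing the line with the logistic function.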

• Space shuttle example

• (Simple) logistic regression function

• Space shuttle example

• Alternative formulation of the (simple) logistic regression function (by algebra): the logit
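A sketch of the standard algebra behind this reformulation, with $\pi = E(Y)$ as assumed notation:

$$\pi = \frac{\exp(\beta_0 + \beta_1 X_1)}{1 + \exp(\beta_0 + \beta_1 X_1)} \;\Longrightarrow\; \frac{\pi}{1 - \pi} = \exp(\beta_0 + \beta_1 X_1) \;\Longrightarrow\; \log\!\left(\frac{\pi}{1 - \pi}\right) = \beta_0 + \beta_1 X_1 .$$

The left-hand side of the last equation is the logit of $\pi$: the log odds are linear in $X_1$ even though $\pi$ itself is not.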

• Space shuttle example

• Interpretation of slope coefficients

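• Odds. If there are 20% smokers and 80% non-smokers, the odds of being a non-smoker are 0.80/0.20 = 4, that is, 4 to 1: 4 non-smokers to 1 smoker. In general, odds $= \pi/(1 - \pi)$ and $\pi = \text{odds}/(1 + \text{odds})$.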

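• Odds ratio. MALE: 20% smokers and 80% non-smokers, so the odds of being a non-smoker are 0.80/0.20 = 4. FEMALE: 40% smokers and 60% non-smokers, so the odds of being a non-smoker are 0.60/0.40 = 1.5. The odds that a male is a nonsmoker are 4/1.5 = 2.67 times the odds that a female is a nonsmoker.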

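• Odds ratio. Group 1: $\text{odds}_1 = \pi_1/(1 - \pi_1)$. Group 2: $\text{odds}_2 = \pi_2/(1 - \pi_2)$. The odds ratio: $\text{odds}_1/\text{odds}_2$.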
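The smoker arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function names `odds` and `odds_ratio` are illustrative choices, not part of the original slides.

```python
def odds(p):
    """Odds in favour of an event that has probability p (0 <= p < 1)."""
    return p / (1.0 - p)


def odds_ratio(p1, p2):
    """Odds of the event in group 1 relative to the odds in group 2."""
    return odds(p1) / odds(p2)


# Probabilities of being a non-smoker, taken from the smoker example.
male_nonsmoker_odds = odds(0.80)      # 80% non-smokers among males
female_nonsmoker_odds = odds(0.60)    # 60% non-smokers among females
male_vs_female = odds_ratio(0.80, 0.60)

print(round(male_nonsmoker_odds, 2),
      round(female_nonsmoker_odds, 2),
      round(male_vs_female, 2))
```

Note that an odds ratio compares odds, not probabilities: the ratio of the two non-smoking probabilities here is only 0.80/0.60, about 1.33.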

• Space shuttle example. Predicted odds:

• Space shuttle example. Predicted odds ratio for x1 = 55 relative to x1 = 80: the odds of O-ring failure at 55 degrees Fahrenheit are 76 times the odds of O-ring failure at 80 degrees Fahrenheit!

• Interpretation of slope coefficients. The ratio of the odds at X1 = A relative to the odds at X1 = B (for fixed values of the other Xs) is $\exp[\beta_1 (A - B)]$.
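As a rough check of the odds-ratio interpretation, the sketch below plugs in the estimates printed in this transcript's regression table (Constant 10.875, temp -0.17132). The function names are illustrative. With these rounded estimates the 55-versus-80 ratio comes out near 72 rather than the 76 quoted above. A slope closer to -0.173 would reproduce 76, so the gap presumably reflects rounding or a slightly different fit.

```python
import math

# Estimates copied from the logistic regression table in this transcript.
B0 = 10.875     # Constant
B1 = -0.17132   # temp (per degree Fahrenheit)


def predicted_odds(temp_f):
    """Predicted odds of O-ring failure at a given launch temperature."""
    return math.exp(B0 + B1 * temp_f)


def predicted_probability(temp_f):
    """Predicted probability of O-ring failure, recovered from the odds."""
    o = predicted_odds(temp_f)
    return o / (1.0 + o)


def odds_ratio(temp_a, temp_b):
    """Odds at temp_a relative to odds at temp_b: exp(B1 * (A - B))."""
    return math.exp(B1 * (temp_a - temp_b))


ratio_55_vs_80 = odds_ratio(55, 80)
print(round(ratio_55_vs_80, 1),
      round(predicted_probability(55), 3),
      round(predicted_probability(80), 3))
```

The intercept cancels in the ratio, which is why `odds_ratio` needs only the slope and the temperature difference.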

• Estimation of logistic regression coefficients

• Maximum likelihood estimation. Choose as estimates of the parameters the values that assign the highest probability to (maximize the likelihood of) the observed outcome.

• Suppose

• If $\beta_0 = 10$ and $\beta_1 = -0.15$, what is the probability of the observed outcome?

• If $\beta_0 = 10.8$ and $\beta_1 = -0.17$, what is the probability of the observed outcome?
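The comparison these questions ask for can be sketched as follows. The launch-by-launch data are not reproduced in this transcript, so the five observations below are HYPOTHETICAL toy values used only to show the mechanics; they are not the shuttle data, and the function names are illustrative.

```python
import math


def failure_probability(b0, b1, x):
    """Logistic regression function evaluated at predictor value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def log_likelihood(b0, b1, xs, ys):
    """Log of the probability of the observed 0/1 outcomes under (b0, b1)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = failure_probability(b0, b1, x)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


# HYPOTHETICAL toy data (temperature in degrees F, failure indicator).
toy_temps = [53, 60, 67, 72, 78]
toy_fail = [1, 1, 0, 0, 0]

# The two candidate parameter pairs posed in the questions above.
ll_a = log_likelihood(10.0, -0.15, toy_temps, toy_fail)
ll_b = log_likelihood(10.8, -0.17, toy_temps, toy_fail)

# exp(log-likelihood) is the probability of the observed outcome.
print(math.exp(ll_a), math.exp(ll_b))
```

Maximum likelihood estimation repeats this comparison over all candidate pairs and keeps the pair with the largest value. In practice a numerical optimizer does the search rather than trial and error.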

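• Space shuttle example. Link Function: Logit

Response Information

| Variable | Value | Count | |
|---|---|---|---|
| failure | 1 | 7 | (Event) |
| | 0 | 17 | |
| | Total | 24 | |

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | 10.875 | 5.703 | 1.91 | 0.057 | | | |
| temp | -0.17132 | 0.08344 | -2.05 | 0.040 | 0.84 | 0.72 | 0.99 |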

• Properties of MLEs. If a model is correct and the sample size is large enough: MLEs are essentially unbiased; formulas exist for estimating the standard errors of the estimators; the estimators are about as precise as any nearly unbiased estimators; and MLEs are approximately normally distributed.

• Test and confidence intervals for single coefficients

• Inference for $\beta_j$. Test statistic: $z = b_j / \mathrm{SE}(b_j)$ follows an approximate standard normal distribution when $\beta_j = 0$.
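A minimal sketch of this Wald-type inference, using the temp coefficient and standard error reported in the shuttle output (-0.17132 and 0.08344). The helper name `wald_summary` and the hard-coded 1.96 critical value for a 95% interval are choices made for this illustration.

```python
import math


def wald_summary(coef, se, z_crit=1.96):
    """Z statistic, two-sided p-value, and odds-ratio CI for one coefficient."""
    z = coef / se
    # Two-sided standard normal tail area: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return {
        "z": z,
        "p": p_value,
        "odds_ratio": math.exp(coef),
        # The interval is built on the coefficient scale, then exponentiated.
        "ci_low": math.exp(coef - z_crit * se),
        "ci_high": math.exp(coef + z_crit * se),
    }


temp = wald_summary(-0.17132, 0.08344)
print({name: round(value, 3) for name, value in temp.items()})
```

The rounded results agree with the Z, P, odds ratio, and 95% CI columns of the shuttle output.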

• Space shuttle example. There is sufficient evidence, at the α = 0.05 level, to conclude that temperature is related to the probability of O-ring failure. For every 1-degree increase in temperature, the odds of O-ring failure are estimated to be multiplied by 0.84 (95% CI: 0.72 to 0.99).

• Survival in the Donner Party. In 1846, the Donner and Reed families traveled from Illinois to California by covered wagon. The group became stranded in the eastern Sierra Nevada mountains when hit by heavy snow. 40 of 87 members died from famine and exposure. Are females better able to withstand harsh conditions than are males?

• Survival in the Donner Party

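• Survival in the Donner Party. Link Function: Logit

Response Information

| Variable | Value | Count | |
|---|---|---|---|
| STATUS | SURVIVED | 20 | (Event) |
| | DIED | 25 | |
| | Total | 45 | |

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | 1.633 | 1.110 | 1.47 | 0.141 | | | |
| AGE | -0.07820 | 0.03729 | -2.10 | 0.036 | 0.92 | 0.86 | 0.99 |
| Gender | 1.5973 | 0.7555 | 2.11 | 0.034 | 4.94 | 1.12 | 21.72 |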
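To see how the two coefficients combine, here is a hedged sketch that turns the fitted Donner Party model into predicted odds and probabilities. The output does not show how Gender is coded, so treating it as a 0/1 indicator is an assumption, as are the function names and the example age of 30.

```python
import math

# Estimates copied from the Donner Party logistic regression table.
B0 = 1.633
B_AGE = -0.07820
B_GENDER = 1.5973   # ASSUMPTION: Gender enters the model as a 0/1 indicator


def survival_odds(age, gender):
    """Predicted odds of survival for a given age and gender indicator."""
    return math.exp(B0 + B_AGE * age + B_GENDER * gender)


def survival_probability(age, gender):
    """Predicted probability of survival, recovered from the odds."""
    o = survival_odds(age, gender)
    return o / (1.0 + o)


# Odds ratio between the two gender groups at the same (hypothetical) age.
gender_odds_ratio = survival_odds(30, 1) / survival_odds(30, 0)

# Multiplicative change in the odds of survival for 10 additional years of age.
age_odds_ratio_10yr = math.exp(B_AGE * 10)

print(round(gender_odds_ratio, 2), round(age_odds_ratio_10yr, 2))
```

Because the model has no interaction term, the gender odds ratio is the same at every age, and it matches the Odds Ratio column for Gender.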