Logistic Regression
Will a patient live or die after being admitted to a hospital?
I don't know. But this question helps to explain logistic regression, because logistic regression is a method for modelling categorical outcomes such as this one.
Logistic Regression vs. Regression

                       Logistic Regression                Regression
Independent Variable   Quantitative, Qualitative          Quantitative, Qualitative
Dependent Variable     Qualitative                        Quantitative
Example                Result (Pass, Fail) is a function  Marks obtained is a function
                       of time given to study             of time given to study
Binomial Distribution
Data: Qualitative data with two categories
Number of Trials: Fixed or known
Relation between Trials: Independent
Probability: Constant probability of success and failure
[Figure: Regression plots Marks against Study Hours. Logistic Regression plots Result (Pass, Fail) against Study Hours. A Passing Marks level is marked.]
What is Logistic Regression?

Probability and Odds
Let there be 7 chances of success and 3 chances of failure out of a total of 10 chances.
Probability = Chances of event / Total chances = 7/10 = 0.7
Odds = Chances of event / Chances of not event = 7/3 = 2.33

Change the given probabilities into odds:
Probability   .01    .1    .5    .6    .9    .99
Odds          1:99   1:9   1:1   3:2   9:1   99:1
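The conversion from probability to odds can be sketched in a few lines of Python. This is only an illustration: the function name `odds_from_probability` is made up for this example, and exact fractions are used so that the odds print in the same a:b style as the slide.

```python
from fractions import Fraction

def odds_from_probability(p: Fraction) -> Fraction:
    """Return the odds p / (1 - p) for a probability with 0 <= p < 1."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

# Probabilities from the slide, written as exact fractions.
probabilities = [Fraction(1, 100), Fraction(1, 10), Fraction(1, 2),
                 Fraction(3, 5), Fraction(9, 10), Fraction(99, 100)]

odds_table = [odds_from_probability(p) for p in probabilities]
for p, odds in zip(probabilities, odds_table):
    # numerator:denominator reproduces the "chances for : chances against" form
    print(f"p = {float(p):<5} odds = {odds.numerator}:{odds.denominator}")
```

The same function applied to the 7-out-of-10 example gives odds of 7:3.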
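Probability, Odds and Log Odds (Logit)

Formula
Probability p = Chances of event / Total chances
Odds = p / (1 - p)
Log odds (logit) = ln(p / (1 - p))

Values
Probability   Odds    Log Odds (Logit)
.01           1:99    -4.60
.1            1:9     -2.20
.5            1:1      0
.6            3:2      0.41
.9            9:1      2.20
.99           99:1     4.60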
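The log odds values can be checked with a short Python sketch. The helper name `logit` is chosen for this illustration only; it is the natural log of the odds, nothing more.

```python
import math

def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Probabilities from the slide.
logits = {p: logit(p) for p in (0.01, 0.1, 0.5, 0.6, 0.9, 0.99)}
for p, value in logits.items():
    print(f"p = {p:<5} logit = {value:+.2f}")
```

Note the symmetry: probabilities p and 1 - p give logits of equal size and opposite sign, and p = .5 gives a logit of 0.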
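Logistic Regression Theory
Different Methods to Express Logistic Regression

Form                           Formula                                  Range
Odds form                      p / (1 - p) = e^(b0 + b1*x)              0 to +∞
Logit form                     ln(p / (1 - p)) = b0 + b1*x              -∞ to +∞
Conditional probability form   p = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))  0 to 1

Values
Conditional probability   .01      .1      .5    .6     .9     .99
Odds                      1:99     1:9     1:1   3:2    9:1    99:1
Odds in numbers           0.0101   0.111   1     1.5    9      99
Logit                     -4.60    -2.20   0     0.41   2.20   4.60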
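The three forms describe the same model on different scales. A minimal Python sketch of that equivalence follows; the coefficients `B0` and `B1` are hypothetical values picked for illustration, not estimates from any data in these slides.

```python
import math

def logit_form(b0: float, b1: float, x: float) -> float:
    """Log odds: ranges over -inf to +inf."""
    return b0 + b1 * x

def odds_form(b0: float, b1: float, x: float) -> float:
    """Odds: ranges over 0 to +inf."""
    return math.exp(b0 + b1 * x)

def probability_form(b0: float, b1: float, x: float) -> float:
    """Conditional probability of success given x: ranges over 0 to 1."""
    z = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-z))  # algebraically equal to e^z / (1 + e^z)

B0, B1 = -1.0, 0.5  # hypothetical coefficients, for illustration only

for x in (0, 2, 4):
    z = logit_form(B0, B1, x)
    odds = odds_form(B0, B1, x)
    p = probability_form(B0, B1, x)
    print(f"x = {x}: logit = {z:+.2f}, odds = {odds:.3f}, probability = {p:.3f}")
```

Whichever form is used, the other two can be recovered: odds = e^logit and probability = odds / (1 + odds).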
Logistic Regression Theory

Contingency Table
         Male   Female   Total
Pass     45     25       70
Fail     5      25       30
Total    50     50       100

Contingency Table with Conditional Probabilities [ ]
         Male                Female                Total
Pass     45 [.9] Pass/Male   25 [.5] Pass/Female   70 (.7)
Fail     5 [.1] Fail/Male    25 [.5] Fail/Female   30 (.3)
Total    50 (.5)             50 (.5)               100 (1)

Values in [ ] are conditional probabilities given gender; values in ( ) are proportions of the grand total.
Males have more chances of passing.
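The bracketed conditional probabilities are each cell count divided by its column total. A small Python sketch, with the dictionary layout and the function name `conditional_probabilities` chosen only for this illustration:

```python
# Pass/fail counts by gender, taken from the contingency table.
counts = {
    "Male": {"Pass": 45, "Fail": 5},
    "Female": {"Pass": 25, "Fail": 25},
}

def conditional_probabilities(table):
    """P(outcome | group) for every group: cell count / group total."""
    result = {}
    for group, outcomes in table.items():
        group_total = sum(outcomes.values())
        result[group] = {name: n / group_total for name, n in outcomes.items()}
    return result

cond = conditional_probabilities(counts)

# Marginal proportion of passing, the value shown in ( ) in the Total column.
grand_total = sum(sum(outcomes.values()) for outcomes in counts.values())
p_pass = sum(outcomes["Pass"] for outcomes in counts.values()) / grand_total

for group, probs in cond.items():
    print(group, {name: round(p, 2) for name, p in probs.items()})
print("P(Pass) overall:", round(p_pass, 2))
```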
Odds from the Contingency Table

Odds
         Male    Female   Total
Pass     45:5    25:25    70:30
Fail     5:45    25:25    30:70
Total    50      50       100

Simplified Odds
         Male    Female   Total
Pass     9:1     1:1      7:3
Fail     1:9     1:1      3:7
Total    50      50       100

Male's odds of passing: 9:1
Female's odds of passing: 1:1
Males have better odds.
Odds in Fraction
         Male    Female   Total
Pass     9       1        2.33
Fail     .111    1        .429
Total    50      50       100

Relative Odds Ratio
Ratio of two odds: male's odds of passing / female's odds of passing = 9/1 = 9
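The odds and their ratio for the pass/fail table can be reproduced as follows. This is a sketch using exact fractions; the helper name `odds` is an assumption of this example.

```python
from fractions import Fraction

def odds(successes: int, failures: int) -> Fraction:
    """Odds of success as an exact fraction: successes / failures."""
    return Fraction(successes, failures)

male_odds = odds(45, 5)      # 45 passed, 5 failed
female_odds = odds(25, 25)   # 25 passed, 25 failed
total_odds = odds(70, 30)    # all students together

# Ratio of the two odds: how many times better the male odds are.
relative_odds_ratio = male_odds / female_odds

print("Male odds of passing:  ", male_odds)
print("Female odds of passing:", female_odds)
print("Overall odds of passing:", total_odds, "=", round(float(total_odds), 2))
print("Overall odds of failing:", 1 / total_odds, "=", round(float(1 / total_odds), 3))
print("Relative odds ratio (male vs. female):", relative_odds_ratio)
```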
We are interested in the relationship between unemployment and ethnic group for a sample of 18-year-olds. The following data is available.

Unemployed at 18   Ethnic Groups
                   White   Black   Total
No                 1700    40      1740
Yes                112     8       120
Total              1812    48      1860

Calculate:
1. Conditional probability of being unemployed given each ethnic group
2. Odds of being unemployed for both ethnic groups
3. Simplified odds and odds in numbers
4. Relative odds ratios
Conditional Probability for being Unemployed given each ethnic Group

Unemployed at 18   Ethnic Groups
                   White       Black   Total
No                 1700/1812   40/48   1740
Yes                112/1812    8/48    120
Total              1812/1812   48/48   1860

Unemployed at 18   Ethnic Groups
                   White   Black   Total
No                 .94     .83     1740
Yes                .06     .17     120
Total              1       1       1860
Odds for being Unemployed for each ethnic Group

Unemployed at 18   Ethnic Groups
                   White      Black   Total
No                 1700:112   40:8    1740
Yes                112:1700   8:40    120
Total              1812       48      1860

Unemployed at 18   Ethnic Groups
                   White   Black   Total
No                 15.2    5       1740
Yes                .066    .2      120
Total              1812    48      1860
Relative Odds Ratio for being Unemployed for White and Black

Relative odds ratio = Odds of one group for being unemployed / Odds of the other group for being unemployed

White relative to Black: .066 / .2 = 0.33 to 1
Black relative to White: .2 / .066 = 3 to 1 (approximately)
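The two relative odds ratios are reciprocals of each other, so which group goes in the numerator only changes the direction of the comparison. A short Python check using the unemployment counts (variable names are chosen for this illustration):

```python
# Counts from the unemployment table: (unemployed, not unemployed).
white_yes, white_no = 112, 1700
black_yes, black_no = 8, 40

white_odds = white_yes / white_no   # odds of being unemployed, White
black_odds = black_yes / black_no   # odds of being unemployed, Black

white_vs_black = white_odds / black_odds
black_vs_white = black_odds / white_odds

# Equivalent shortcut: the cross-product of the 2x2 table.
cross_product = (white_yes * black_no) / (white_no * black_yes)

print(f"White odds = {white_odds:.3f}, Black odds = {black_odds:.3f}")
print(f"White relative to Black: {white_vs_black:.2f} to 1")
print(f"Black relative to White: {black_vs_white:.2f} to 1")
```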
Logistic Example Manually & Through SPSS

Frequency Data
Behavioral Problem   Ethnic Groups
                     White   Black   Total
No                   90      30      120
Yes                  19      33      52
Total                109     63      172

Conditional Probability
Behavioral Problem   Ethnic Groups
                     White       Black      Total
No                   0.83        0.48       120 (.7)
Yes                  0.17        0.52       52 (.3)
Total                109 (.63)   63 (.37)   172 (1)
Odds
Behavioral Problem   Ethnic Groups
                     White   Black   Total
No                   90:19   30:33   120
Yes                  19:90   33:30   52
Total                109     63      172

Odds in Fraction
Behavioral Problem   Ethnic Groups
                     White   Black   Total
No                   4.74    0.91    120
Yes                  0.21    1.1     52
Total                109     63      172
                          White Having         Black Having
                          Behavioral Problem   Behavioral Problem
Conditional Probability   0.17                 0.52
Odds, Fraction            19:90 = 0.21         33:30 = 1.1
Relative Odds Ratio       0.192 to 1           5.21 to 1
Ln of Odds                -1.561               0.095
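Logistic Equation
Ln(Odds) = -1.56 + 1.65X, where X = 0 for White and X = 1 for Black
The slope 1.65 is the natural log of the relative odds ratio, ln(5.21).

X = 0: Ln(Odds) = -1.56, Odds = 0.21, Probability = 0.17
X = 1: Ln(Odds) = 0.095, Odds = 1.1, Probability = 0.52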
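With a single two-category predictor, the intercept and slope can be recovered by hand from the frequency table: the intercept is the log odds of the group coded 0 and the slope is the log of the ratio of the two odds. The sketch below assumes White is coded X = 0 and Black X = 1. Unrounded counts give an intercept near -1.555 rather than -1.56, because the slide takes the log of the rounded odds 0.21.

```python
import math

# Behavioral problem counts from the frequency table.
no_white, yes_white = 90, 19
no_black, yes_black = 30, 33

b0 = math.log(yes_white / no_white)        # log odds when X = 0 (White)
b1 = math.log(yes_black / no_black) - b0   # change in log odds when X = 1 (Black)

def predicted_probability(x: int) -> float:
    """Probability of a behavioral problem from the logistic equation."""
    z = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-z))

print(f"Ln(Odds) = {b0:.3f} + {b1:.3f} X")
for x, group in ((0, "White"), (1, "Black")):
    log_odds = b0 + b1 * x
    print(f"{group}: ln(odds) = {log_odds:.3f}, odds = {math.exp(log_odds):.2f}, "
          f"probability = {predicted_probability(x):.2f}")
```

The predicted probabilities equal the conditional probabilities 19/109 and 33/63 from the table, which is a useful sanity check on the equation.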
Diagnosis of LR

-2 Log-likelihood
Used to compare the fit of two models: how well a model fits as compared to the other.
The lower the value, the better the fit of the alternative model.
Outcome: base model is better, or alternative is better.

Chi Square Test
Based on the difference between the base model and the proposed model.
A larger difference is better.
Outcome: both models are the same, or the proposed model is better (P < 0.05).

Classification Table
Table showing how many observations have been predicted correctly.
The higher the correct prediction, the better.
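A classification table can be built by hand for the behavioral problem example. The sketch below assumes a cutoff of 0.5 on the predicted probability and uses each group's observed rate as its predicted probability, which is what a model with one two-category predictor produces. The names `CUTOFF` and `predicted_outcome` are illustrative.

```python
# (group, observed outcome) -> frequency, from the behavioral problem table.
counts = {
    ("White", "No"): 90, ("White", "Yes"): 19,
    ("Black", "No"): 30, ("Black", "Yes"): 33,
}
CUTOFF = 0.5  # assumed classification cutoff

def predicted_outcome(group: str) -> str:
    """Predict 'Yes' when the group's probability of a problem reaches the cutoff."""
    yes = counts[(group, "Yes")]
    no = counts[(group, "No")]
    return "Yes" if yes / (yes + no) >= CUTOFF else "No"

# Classification table keyed by (observed, predicted).
classification = {}
for (group, observed), n in counts.items():
    key = (observed, predicted_outcome(group))
    classification[key] = classification.get(key, 0) + n

total = sum(counts.values())
correct = classification.get(("No", "No"), 0) + classification.get(("Yes", "Yes"), 0)
proposed_accuracy = correct / total

# Base model: predict the more frequent outcome for everyone.
observed_no = counts[("White", "No")] + counts[("Black", "No")]
base_accuracy = max(observed_no, total - observed_no) / total

print(classification)
print(f"Proposed model: {proposed_accuracy:.1%} correct, base model: {base_accuracy:.1%} correct")
```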
Likelihood Ratio Test
Based On: Log-likelihood function, expressed as -2 log-likelihood
What is it? It checks whether the fuller model is better than the base model. -2 log-likelihood measures the discrepancy between the observed and predicted values.
Interpretation: The lower the value of -2 log-likelihood, the better
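As a rough illustration, the likelihood ratio test can be worked through for the behavioral problem data. The sketch assumes the base model predicts the overall rate for everyone (52 of 172) and the proposed model predicts each ethnic group's own rate, which is the fitted result for a single two-category predictor. The p-value uses the chi-square distribution with 1 df through `math.erfc`; that shortcut is valid for 1 df only.

```python
import math

def log_likelihood(groups):
    """Binomial log-likelihood where each (yes, no) group is fitted by its own rate."""
    ll = 0.0
    for yes, no in groups:
        p = yes / (yes + no)
        ll += yes * math.log(p) + no * math.log(1.0 - p)
    return ll

ll_base = log_likelihood([(52, 120)])               # one rate for everyone
ll_proposed = log_likelihood([(19, 90), (33, 30)])  # separate rates: White, Black

minus2ll_base = -2.0 * ll_base
minus2ll_proposed = -2.0 * ll_proposed
chi_square = minus2ll_base - minus2ll_proposed

# Upper tail of the chi-square distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi_square / 2.0))

print(f"-2LL base = {minus2ll_base:.2f}, -2LL proposed = {minus2ll_proposed:.2f}")
print(f"Chi-square = {chi_square:.2f}, p = {p_value:.6f}")
```

A lower -2 log-likelihood for the proposed model together with p < 0.05 indicates that the fuller model fits better than the base model.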
Wald Test
Based On: Squared ratio between b1 and its standard error Sb1, (b1 / Sb1)^2
What is it? A statistic that follows a Chi Square distribution at 1 df
Interpretation: Larger value is significant
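A hand calculation of the Wald statistic for the behavioral problem data might look like the sketch below. It assumes the usual large-sample standard error of a log odds ratio from a 2x2 table, the square root of the sum of reciprocal cell counts, as the value of Sb1.

```python
import math

# Behavioral problem counts from the frequency table.
yes_white, no_white = 19, 90
yes_black, no_black = 33, 30

# Slope: log of the ratio of the two odds (Black relative to White).
b1 = math.log((yes_black / no_black) / (yes_white / no_white))

# Assumed standard error: sqrt of the sum of reciprocal cell counts.
se_b1 = math.sqrt(1 / yes_white + 1 / no_white + 1 / yes_black + 1 / no_black)

wald = (b1 / se_b1) ** 2

# Upper tail of the chi-square distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(wald / 2.0))

print(f"b1 = {b1:.3f}, Sb1 = {se_b1:.3f}, Wald = {wald:.2f}, p = {p_value:.6f}")
```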
Measure of the Proportion of Variance
Based On: Comparison of the log-likelihood of the base and proposed model
What is it? Measure of the proportion of variation explained
Measures: Cox & Snell's R2 (does not attain 1 for the perfect model) and Nagelkerke's R2 (attains 1 for the perfect model)
Interpretation: The higher the better (value is between 0 and 1)
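Both measures can be computed from the two log-likelihoods. The sketch below uses the standard definitions, Cox & Snell R2 = 1 - exp(2(LL_base - LL_proposed)/n) and Nagelkerke R2 = Cox & Snell R2 divided by its maximum 1 - exp(2 LL_base/n), applied to the behavioral problem data. Function names are illustrative.

```python
import math

def binomial_ll(yes: int, no: int) -> float:
    """Log-likelihood of a group fitted by its own observed rate."""
    p = yes / (yes + no)
    return yes * math.log(p) + no * math.log(1.0 - p)

def cox_snell_r2(ll_base: float, ll_proposed: float, n: int) -> float:
    return 1.0 - math.exp(2.0 * (ll_base - ll_proposed) / n)

def nagelkerke_r2(ll_base: float, ll_proposed: float, n: int) -> float:
    # Rescale by the largest value Cox & Snell's R2 can reach.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_base / n)
    return cox_snell_r2(ll_base, ll_proposed, n) / max_cox_snell

n = 172
ll_base = binomial_ll(52, 120)                         # overall rate only
ll_proposed = binomial_ll(19, 90) + binomial_ll(33, 30)  # rate per ethnic group

cs = cox_snell_r2(ll_base, ll_proposed, n)
nk = nagelkerke_r2(ll_base, ll_proposed, n)
print(f"Cox & Snell R2 = {cs:.3f}, Nagelkerke R2 = {nk:.3f}")

# A perfect model has log-likelihood 0: Nagelkerke reaches 1, Cox & Snell does not.
print(cox_snell_r2(ll_base, 0.0, n), nagelkerke_r2(ll_base, 0.0, n))
```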
The Hosmer-Lemeshow Goodness-of-Fit Test
Based On
What is it?
Interpretation Significant means the fit is bad
Interpreting the Logistic Model

Interpretation by form of the model:

Logit
With a one-unit increase in x, the log odds of success increase by 1.3 units on average.

Odds
With a one-unit increase in x, the odds of success are multiplied by e^1.3, or about 3.67.

Probability
It gives the probability of success for a particular value of x.
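The three readings of a slope of 1.3 can be compared numerically. In the sketch below only the slope comes from the slide; the intercept `B0` is a hypothetical value added so that probabilities can be computed.

```python
import math

B1 = 1.3    # slope from the slide
B0 = -2.0   # hypothetical intercept, for illustration only

def log_odds(x: float) -> float:
    return B0 + B1 * x

def odds(x: float) -> float:
    return math.exp(log_odds(x))

def probability(x: float) -> float:
    return odds(x) / (1.0 + odds(x))

odds_multiplier = math.exp(B1)  # factor applied to the odds per unit of x

for x in (0, 1, 2):
    print(f"x = {x}: log odds = {log_odds(x):+.2f}, odds = {odds(x):.3f}, "
          f"probability = {probability(x):.3f}")
print(f"Odds multiplier per unit of x: {odds_multiplier:.2f}")
```

The log odds rise by a constant 1.3 per unit, the odds are multiplied by a constant factor, and the change in probability depends on where x starts.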
Conducting Logistic Regression Using SPSS
Data Codes
Interpreting the Logistic Model

Interpretation for the unemployment example:

Logit
• Log odds of being unemployed is -1.6 for the Black group
• Log odds of being unemployed decreases by 1.1 for the White group

Odds
• Odds of being unemployed are e^(-1.6) = 0.20 for the Black group
• Odds of being unemployed are e^(-1.6 - 1.1) = e^(-2.7) = 0.067 for the White group

These agree with the odds of .2 (Black) and .066 (White) computed from the frequency table.
Logistic Regression with Quantitative Independent Variable
We want to determine whether the marks of the students really determine the result of the students.