Logistic regression
Recall the simple linear regression model:
y = β₀ + β₁x + ε
where we are trying to predict a continuous dependent variable y from a continuous independent variable x.
This model can be extended to the multiple linear regression model:

y = β₀ + β₁x₁ + β₂x₂ + … + βₚxₚ + ε

Here we are trying to predict a continuous dependent variable y from several continuous independent variables x₁, x₂, …, xₚ.
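For the simple model above, the least-squares estimates have a closed form. A minimal Python sketch (an illustration only; the function and variable names are mine, not from the slides):

```python
def simple_ols(xs, ys):
    """Least-squares estimates (b0, b1) for y = b0 + b1*x + error."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    # Slope: sum of cross-deviations over sum of squared x-deviations.
    b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
         / sum((x - xbar) ** 2 for x in xs)
    b0 = ybar - b1 * xbar  # the fitted line passes through (xbar, ybar)
    return b0, b1
```

For exactly linear data such as y = 1 + 2x the estimates recover the intercept 1 and slope 2.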
Now suppose the dependent variable y is binary.
It takes on two values “Success” (1) or “Failure” (0)
This is the situation in which Logistic Regression is used.
We are interested in predicting y from a continuous independent variable x.
Example
We are interested in how the success (y) of a new antibiotic cream in curing "acne problems" depends on the amount (x) that is applied daily.
The values of y are 1 (Success) or 0 (Failure).
The values of x range over a continuum
The Logistic Regression Model

Let p denote P[y = 1] = P[Success]. This quantity will increase with the value of x.

The ratio

p / (1 − p)

is called the odds ratio. This quantity will also increase with the value of x, ranging from zero to infinity.

The quantity

ln[ p / (1 − p) ]

is called the log odds ratio.
Example: odds ratio, log odds ratio

Suppose a die is rolled. Success = "roll a six", so p = 1/6.

The odds ratio:

p / (1 − p) = (1/6) / (5/6) = 1/5 = 0.2

The log odds ratio:

ln[ p / (1 − p) ] = ln(0.2) = −1.60944
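The die example can be checked in a couple of lines (a sketch; the variable names are mine):

```python
import math

p = 1 / 6                  # P[roll a six]
odds = p / (1 - p)         # odds ratio: (1/6)/(5/6) = 1/5 = 0.2
log_odds = math.log(odds)  # log odds ratio: ln(0.2) is approximately -1.609
```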
The Logistic Regression Model

ln[ p / (1 − p) ] = β₀ + β₁x

i.e., in terms of the odds ratio:

p / (1 − p) = e^(β₀ + β₁x)

This assumes the log odds ratio is linearly related to x.
The Logistic Regression Model

p / (1 − p) = e^(β₀ + β₁x)

Solving for p in terms of x:

p = (1 − p) e^(β₀ + β₁x)

p + p e^(β₀ + β₁x) = e^(β₀ + β₁x)

or

p = e^(β₀ + β₁x) / (1 + e^(β₀ + β₁x))
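The solved-for form of p is the logistic function. A minimal sketch (function name is mine):

```python
import math

def logistic_p(x, b0, b1):
    """p = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))."""
    z = b0 + b1 * x
    return math.exp(z) / (1.0 + math.exp(z))
```

For b1 > 0 this rises from near 0 to near 1 as x increases, and equals 0.5 exactly at x = −b0/b1.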
Interpretation of the parameter β₀ (determines the intercept)

[Graph: p versus x (x from 0 to 10, p from 0 to 1), showing the intercept of the logistic curve]

At x = 0:

p = e^(β₀) / (1 + e^(β₀))
Interpretation of the parameter β₁ (determines, along with β₀, when p is 0.50)

[Graph: p versus x (x from 0 to 10, p from 0 to 1), marking the x at which p = 0.50]

p = e^(β₀ + β₁x) / (1 + e^(β₀ + β₁x)) = 1/2

when

β₀ + β₁x = 0, or x = −β₀ / β₁
Also

dp/dx = d/dx [ e^(β₀ + β₁x) / (1 + e^(β₀ + β₁x)) ]

= [ β₁ e^(β₀ + β₁x) (1 + e^(β₀ + β₁x)) − β₁ e^(β₀ + β₁x) e^(β₀ + β₁x) ] / (1 + e^(β₀ + β₁x))²

= β₁ e^(β₀ + β₁x) / (1 + e^(β₀ + β₁x))²

When p = 1/2, β₀ + β₁x = 0 and e^(β₀ + β₁x) = 1, so

dp/dx = β₁ (1) / (1 + 1)² = β₁ / 4

Thus β₁ / 4 is the rate of increase in p with respect to x when p = 0.50.
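The β₁/4 result can be confirmed numerically with a central-difference derivative (a sketch; the parameter values are arbitrary illustrations, not from the slides):

```python
import math

def p(x, b0, b1):
    z = b0 + b1 * x
    return math.exp(z) / (1.0 + math.exp(z))

b0, b1 = -2.0, 1.5        # illustrative values
x_half = -b0 / b1         # the x at which p = 0.5
h = 1e-6
# Central-difference approximation to dp/dx at p = 0.5;
# it should be very close to b1 / 4 = 0.375.
slope = (p(x_half + h, b0, b1) - p(x_half - h, b0, b1)) / (2 * h)
```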
Interpretation of the parameter β₁ (determines the slope when p is 0.50)

[Graph: p versus x (x from 0 to 10, p from 0 to 1), with slope = β₁/4 marked at p = 0.50]
The data
The data for each case will consist of:
1. a value for x, the continuous independent variable
2. a value for y (1 or 0) (Success or Failure)
Total of n = 250 cases
![Page 14: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/14.jpg)
case   x     y          case   x     y
1      0.8   0          230    4.7   1
2      2.3   1          231    0.3   0
3      2.5   0          232    1.4   0
4      2.8   1          233    4.5   1
5      3.5   1          234    1.4   1
6      4.4   1          235    4.5   1
7      0.5   0          236    3.9   0
8      4.5   1          237    0.0   0
9      4.4   1          238    4.3   1
10     0.9   0          239    1.0   0
11     3.3   1          240    3.9   1
12     1.1   0          241    1.1   0
13     2.5   1          242    3.4   1
14     0.3   1          243    0.6   0
15     4.5   1          244    1.6   0
16     1.8   0          245    3.9   0
17     2.4   1          246    0.2   0
18     1.6   0          247    2.5   0
19     1.9   1          248    4.1   1
20     4.6   1          249    4.2   1
                        250    4.9   1
(cases 21-229 not shown)
Estimation of the parameters
The parameters are estimated by maximum likelihood estimation, which requires a statistical package such as SPSS.
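The maximum likelihood estimates have no closed form; a package finds them iteratively. As a rough illustration of the idea only (not the algorithm SPSS uses, which is typically Newton-Raphson), a crude gradient-ascent fit in Python on the first 20 cases shown above:

```python
import math

def fit_logistic(xs, ys, lr=0.5, steps=20000):
    """Crude gradient ascent on the log-likelihood of the logistic model."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # d(logL)/d(b0)
            g1 += (y - p) * x    # d(logL)/d(b1)
        b0 += lr * g0 / n        # step uphill on the likelihood surface
        b1 += lr * g1 / n
    return b0, b1

# First 20 cases from the data set above.
xs = [0.8, 2.3, 2.5, 2.8, 3.5, 4.4, 0.5, 4.5, 4.4, 0.9,
      3.3, 1.1, 2.5, 0.3, 4.5, 1.8, 2.4, 1.6, 1.9, 4.6]
ys = [0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(xs, ys)
```

With this subset, the fitted slope comes out positive (success is more likely at larger x), in the same direction as the SPSS fit on the full n = 250 cases.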
Using SPSS to perform Logistic regression
Open the data file:
Choose from the menu:
Analyze -> Regression -> Binary Logistic
The following dialogue box appears
Select the dependent variable (y) and the independent variable (x) (covariate).
Press OK.
Here is the output
The Estimates and their S.E.
The parameter estimates:

             Estimate      SE
X              1.0309    0.1334
Constant      −2.0475    0.332

i.e. β̂₁ = 1.0309 and β̂₀ = −2.0475.
Interpretation of the parameter β₀ (determines the intercept):

intercept = e^(β̂₀) / (1 + e^(β̂₀)) = e^(−2.0475) / (1 + e^(−2.0475)) = 0.1143

Interpretation of the parameter β₁ (determines, along with β₀, when p is 0.50):

x = −β̂₀ / β̂₁ = 2.0475 / 1.0309 = 1.986
Another interpretation of the parameter β₁:

β̂₁ / 4 = 1.0309 / 4 = 0.258

is the rate of increase in p with respect to x when p = 0.50.
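All three interpretations can be computed directly from the SPSS estimates (a sketch; variable names are mine):

```python
import math

b0_hat, b1_hat = -2.0475, 1.0309   # estimates from the SPSS output

# p at x = 0 (the "intercept" on the probability scale); about 0.1143
intercept_p = math.exp(b0_hat) / (1 + math.exp(b0_hat))

# x at which p = 0.50; about 1.986
x_half = -b0_hat / b1_hat

# slope of p with respect to x at p = 0.50; about 0.258
slope_half = b1_hat / 4
```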
The Multiple Logistic Regression model
Here we attempt to predict the outcome of a binary response variable Y from several independent variables X₁, X₂, …, Xₚ.

ln[ p / (1 − p) ] = β₀ + β₁X₁ + … + βₚXₚ

or

p = e^(β₀ + β₁X₁ + … + βₚXₚ) / (1 + e^(β₀ + β₁X₁ + … + βₚXₚ))
Multiple Logistic Regression: an example
In this example we are interested in determining the risk of infants (who were born prematurely) of developing BPD (bronchopulmonary dysplasia)
More specifically, we are interested in developing a predictive model which will determine the probability of developing BPD from X₁ = gestational age and X₂ = birth weight.
For n = 223 infants in a prenatal ward the following measurements were determined:

1. X₁ = gestational age (weeks),
2. X₂ = birth weight (grams), and
3. Y = presence of BPD
The data:

case   Gestational Age   Birth weight   Presence of BPD
1          28.6             1119              1
2          31.5             1222              0
3          30.3             1311              1
4          28.9             1082              0
5          30.3             1269              0
6          30.5             1289              0
7          28.5             1147              0
8          27.9             1136              1
9          30.0              972              0
10         31.0             1252              0
11         27.4              818              0
12         29.4             1275              0
13         30.8             1231              0
14         30.4             1112              0
15         31.1             1353              1
16         26.7             1067              1
17         27.4              846              1
18         28.0             1013              0
19         29.3             1055              0
20         30.4             1226              0
21         30.2             1237              0
22         30.2             1287              0
23         30.1             1215              0
24         27.0              929              1
25         30.3             1159              0
26         27.4             1046              1
(remaining cases not shown)
The results:

Variables in the Equation

                    B       S.E.     Wald    df   Sig.   Exp(B)
Birthweight       −.003     .001     4.885    1   .027     .998
GestationalAge    −.505     .133    14.458    1   .000     .604
Constant         16.858    3.642    21.422    1   .000   2.1E+07

a. Variable(s) entered on step 1: Birthweight, GestationalAge.

The fitted model:

ln[ p / (1 − p) ] = 16.858 − .003 BW − .505 GA

or

p = e^(16.858 − .003 BW − .505 GA) / (1 + e^(16.858 − .003 BW − .505 GA))
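The fitted model can be used to estimate the risk for a given infant. A sketch using the rounded coefficients from the output (so the probabilities are approximate; the function name and the example inputs are mine):

```python
import math

def p_bpd(bw, ga):
    """Estimated P[BPD] from the fitted model (rounded SPSS coefficients)."""
    z = 16.858 - 0.003 * bw - 0.505 * ga
    return math.exp(z) / (1 + math.exp(z))
```

For example, an infant with birth weight 1000 g and gestational age 28 weeks gets an estimated risk of about 0.43, and the risk falls as either birth weight or gestational age increases.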
Graph: Risk of BPD vs GA and birth weight

[Graph: estimated P[BPD] (0 to 1) versus birth weight (700 to 1700 g), one curve for each of GA = 27, 28, 29, 30, 31, 32]
Discrete Multivariate Analysis
Analysis of Multivariate Categorical Data
Example 1
Data Set #1 - A two-way frequency table

Serum          Systolic Blood Pressure
Cholesterol    <127   127-146   147-166   167+   Total
<200            117      121        47     22      307
200-219          85       98        43     20      246
220-259         119      209        68     43      439
260+             67       99        46     33      245
Total           388      527       204    118     1237

In this study we examine n = 1237 individuals, measuring X, systolic blood pressure, and Y, serum cholesterol.
Example 2
The following data was taken from a study of parole success involving 5587 parolees in Ohio between 1965 and 1972 (a ten percent sample of all parolees during this period).
The study involved a dichotomous response Y:
- Success (no major parole violation), or
- Failure (returned to prison either as technical violators or with a new conviction),
based on a one-year follow-up.

The predictors of parole success included are:
1. Type of committed offense (Person offense or Other offense),
2. Age (25 or Older or Under 25),
3. Prior Record (No prior sentence or Prior sentence), and
4. Drug or Alcohol Dependency (No drug or alcohol dependency or Drug and/or alcohol dependency).
• The data were randomly split into two parts. The counts for each part are displayed in the table, with those for the second part in parentheses.
• The second part of the data was set aside for a validation study of the model to be fitted in the first part.
Table

                      No drug or alcohol dependency      Drug and/or alcohol dependency
                      25 or older      Under 25          25 or older      Under 25
                      Person  Other    Person  Other     Person  Other    Person  Other
No prior sentence
of any kind
  Success             48      34       37      49        48      28       35      57
                      (44)    (34)     (29)    (58)      (47)    (38)     (37)    (53)
  Failure             1       5        7       11        3       8        5       18
                      (1)     (7)      (7)     (5)       (1)     (2)      (4)     (24)
Prior sentence
  Success             117     259      131     319       197     435      107     291
                      (111)   (253)    (131)   (320)     (202)   (392)    (103)   (294)
  Failure             23      61       20      89        38      194      27      101
                      (27)    (55)     (25)    (93)      (46)    (215)    (34)    (102)
Analysis of a Two-way Frequency Table:
Frequency Distribution (Serum Cholesterol and Systolic Blood Pressure)

Serum          Systolic Blood Pressure
Cholesterol    <127   127-146   147-166   167+   Total
<200            117      121        47     22      307
200-219          85       98        43     20      246
220-259         119      209        68     43      439
260+             67       99        46     33      245
Total           388      527       204    118     1237
Joint and Marginal Distributions (Serum Cholesterol and Systolic Blood Pressure)

Serum            Systolic Blood Pressure              Marginal distn
Cholesterol      <127   127-146   147-166   167+     (Serum Chol.)
<200             9.46      9.78      3.80    1.78        24.82
200-219          6.87      7.92      3.48    1.62        19.89
220-259          9.62     16.90      5.50    3.48        35.49
260+             5.42      8.00      3.72    2.67        19.81
Marginal        31.37     42.60     16.49    9.54       100.00
distn (BP)

The marginal distributions allow you to look at the effect of one variable, ignoring the other. The joint distribution allows you to look at the two variables simultaneously.
Conditional Distributions (Systolic Blood Pressure given Serum Cholesterol)

The conditional distribution allows you to look at the effect of one variable when the other variable is held fixed or known.

Serum            Systolic Blood Pressure
Cholesterol      <127   127-146   147-166    167+    Total
<200            38.11     39.41     15.31    7.17    100.00
200-219         34.55     39.84     17.48    8.13    100.00
220-259         27.11     47.61     15.49    9.79    100.00
260+            27.35     40.41     18.78   13.47    100.00
Marginal        31.37     42.60     16.49    9.54    100.00
distn (BP)
Conditional Distributions (Serum Cholesterol given Systolic Blood Pressure)

Serum            Systolic Blood Pressure               Marginal distn
Cholesterol      <127   127-146   147-166    167+     (Serum Chol.)
<200            30.15     22.96     23.04   18.64         24.82
200-219         21.91     18.60     21.08   16.95         19.89
220-259         30.67     39.66     33.33   36.44         35.49
260+            17.27     18.79     22.55   27.97         19.81
Total          100.00    100.00    100.00  100.00        100.00
Graph: Conditional distributions of Systolic Blood Pressure given Serum Cholesterol

[Bar chart: percentage (10% to 50%) in each systolic blood pressure class (<127, 127-146, 147-166, 167+) within each serum cholesterol class (<200, 200-219, 220-259, 260+), together with the marginal distribution]
Notation:

Let x_ij denote the frequency (no. of cases) where X (the row variable) is i and Y (the column variable) is j.

Row totals:     x_i· = R_i = Σ_{j=1}^{c} x_ij

Column totals:  x_·j = C_j = Σ_{i=1}^{r} x_ij

Grand total:    x_·· = N = Σ_{i=1}^{r} Σ_{j=1}^{c} x_ij
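The margins are straightforward to compute; a sketch in Python using the serum cholesterol by blood pressure table from Example 1 (function name is mine):

```python
def margins(table):
    """Row totals R_i, column totals C_j, and grand total N of a two-way table."""
    R = [sum(row) for row in table]          # R_i: sum across each row
    C = [sum(col) for col in zip(*table)]    # C_j: sum down each column
    N = sum(R)                               # grand total
    return R, C, N

# Serum cholesterol (rows) by systolic blood pressure (columns), Example 1.
x = [[117, 121, 47, 22],
     [85,   98, 43, 20],
     [119, 209, 68, 43],
     [67,   99, 46, 33]]
R, C, N = margins(x)
```

This reproduces the margins shown in the table: row totals 307, 246, 439, 245; column totals 388, 527, 204, 118; N = 1237.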
Different Models

The Multinomial Model:

Here the total number of cases N is fixed and the x_ij follow a multinomial distribution with parameters π_ij = P[X = i, Y = j]:

f(x_11, x_12, …, x_rc) = [ N! / (x_11! x_12! ⋯ x_rc!) ] π_11^(x_11) π_12^(x_12) ⋯ π_rc^(x_rc)

with

E[x_ij] = m_ij = N π_ij
The Product Multinomial Model:

Here the row (or column) totals R_i are fixed and, for a given row i, the x_ij follow a multinomial distribution with parameters π_{j|i}:

f(x_11, x_12, …, x_rc) = Π_{i=1}^{r} [ R_i! / (x_i1! ⋯ x_ic!) ] π_{1|i}^(x_i1) ⋯ π_{c|i}^(x_ic)

with

E[x_ij] = m_ij = R_i π_{j|i}
The Poisson Model:

In this case we observe over a fixed period of time and all counts in the table (including row, column and overall totals) follow a Poisson distribution. Let λ_ij denote the mean of x_ij. Then

f(x_11, x_12, …, x_rc) = Π_{i=1}^{r} Π_{j=1}^{c} [ λ_ij^(x_ij) / x_ij! ] e^(−λ_ij)

with

E[x_ij] = λ_ij
Independence
Multinomial Model

If X and Y are independent, then

π_ij = P[X = i, Y = j] = P[X = i] P[Y = j] = π_i· π_·j

and

m_ij = N π_ij = N π_i· π_·j

The estimated expected frequency in cell (i, j) in the case of independence is:

m̂_ij = N π̂_i· π̂_·j = N (x_i·/N)(x_·j/N) = x_i· x_·j / N = R_i C_j / N
The same can be shown for the other two models (the Product Multinomial model and the Poisson model); namely, the estimated expected frequency in cell (i, j) in the case of independence is:

m̂_ij = R_i C_j / N = x_i· x_·j / x_··

Standardized residuals are defined for each cell:

r_ij = (x_ij − m̂_ij) / √m̂_ij
The Chi-Square Statistic

χ² = Σ_{i=1}^{r} Σ_{j=1}^{c} r_ij² = Σ_{i=1}^{r} Σ_{j=1}^{c} (x_ij − m̂_ij)² / m̂_ij

The Chi-Square test for independence:

Reject H₀: independence if

χ² > χ²_α  with df = (r − 1)(c − 1)
Table: Expected frequencies, Observed frequencies, Standardized residuals

Serum          Systolic Blood Pressure
Cholesterol    <127      127-146    147-166    167+      Total
<200            96.29     130.79     50.63     29.29      307
               (117)     (121)      (47)      (22)
                 2.11      -0.86     -0.51     -1.35
200-219         77.16     104.80     40.57     23.47      246
               (85)       (98)       (43)      (20)
                 0.89      -0.66      0.38     -0.72
220-259        137.70     187.03     72.40     41.88      439
               (119)      (209)      (68)      (43)
                -1.59       1.61     -0.52      0.17
260+            76.85     104.38     40.40     23.37      245
               (67)       (99)       (46)      (33)
                -1.12      -0.53      0.88      1.99
Total          388        527       204       118        1237

χ² = 20.85 (p = 0.0133)
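The whole calculation fits in a short function (a sketch; function and variable names are mine). Running it on the table above reproduces the statistic:

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and df for a two-way table."""
    R = [sum(row) for row in table]
    C = [sum(col) for col in zip(*table)]
    N = sum(R)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, x_ij in enumerate(row):
            m_hat = R[i] * C[j] / N           # expected count under independence
            chi2 += (x_ij - m_hat) ** 2 / m_hat
    df = (len(R) - 1) * (len(C) - 1)
    return chi2, df

# Serum cholesterol by systolic blood pressure table.
x = [[117, 121, 47, 22],
     [85,   98, 43, 20],
     [119, 209, 68, 43],
     [67,   99, 46, 33]]
chi2, df = chi_square_independence(x)   # chi2 is about 20.85 with df = 9
```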
Example
In this example, N = 57,407 cases were studied in which individuals were victimized twice by crimes.
The crime of the first victimization (X) and the crime of the second victimization (Y) were noted.
The data were tabulated on the following slide
Table 1: Frequencies

                                  Second Victimization in Pair
                     Ra     A     Ro   PP/PS     PL      B     HL    MV   Total
First     Ra         26     50    11      6      82     39     48    11     273
Victim-   A          65   2997   238     85    2553   1083   1349   216    8586
ization   Ro         12    279   197     36     459    197    221    47    1448
in pair   PP/PS       3    102    40     61     243    115    101    38     703
          PL         75   2628   413    229   12137   2658   3689   687   22516
          B          52   1117   191    102    2649   3210   1973   301    9595
          HL         42   1251   206    117    3757   1962   4646   391   12372
          MV          3    221    51     24     678    301    367   269    1914
          Total     278   8645  1347    660   22558   9565  12394  1960
Table 2: Standardized residuals

                                  Second Victimization in Pair
                     Ra      A     Ro   PP/PS     PL      B     HL    MV
First     Ra       21.5    1.4    1.8    1.6    -2.4   -1.0   -1.9   0.6
Victim-   A         3.6   47.4    2.6   -1.4   -14.1   -9.2  -11.7  -4.5
ization   Ro        1.9    4.1   28.0    4.7    -4.6   -2.8   -5.2  -0.3
in pair   PP/PS    -0.2   -0.4    5.8   18.6    -2.0   -0.2   -4.1   2.9
          PL       -3.3  -13.1   -5.0   -1.9    35.0  -17.9  -16.8  -2.9
          B         0.8   -8.6   -2.3   -0.8   -18.3   40.3   -2.2  -1.5
          HL       -2.3  -14.2   -4.9   -2.1   -15.8   -2.2   38.2  -1.5
          MV       -2.1   -4.0    0.9    0.4    -2.7   -1.0   -2.3  25.2

χ² = 11,430 (highly significant)
Table 3: Conditional distribution of second victimization given the first victimization (%)

                                  Second Victimization in Pair
                     Ra      A     Ro   PP/PS     PL      B     HL    MV   Total
First     Ra        9.5   18.3    4.0    2.2    30.0   14.3   17.6   4.0   100.0
Victim-   A         0.8   34.9    2.8    1.0    29.7   12.6   15.7   2.5   100.0
ization   Ro        0.8   19.3   13.6    2.5    31.7   13.6   15.3   3.2   100.0
in pair   PP/PS     0.4   14.5    5.7    8.7    34.6   16.4   14.4   5.4   100.0
          PL        0.3   11.7    1.8    1.0    53.9   11.8   16.4   3.1   100.0
          B         0.5   11.6    2.0    1.1    27.6   33.5   20.6   3.1   100.0
          HL        0.3   10.1    1.7    0.9    30.4   15.9   37.6   3.2   100.0
          MV        0.2   11.5    2.7    1.3    35.4   15.7   19.2  14.1   100.0
          Marginal  0.5   15.1    2.3    1.1    39.3   16.7   21.6   3.4   100.0
Log Linear Model
Recall, if the two variables, rows (X) and columns (Y), are independent, then

m_ij = N π_ij = N π_i· π_·j

and

ln m_ij = ln N + ln π_i· + ln π_·j
In general let

ln m_ij = u + u_{1(i)} + u_{2(j)} + u_{12(i,j)}     (1)

where

u = (1/rc) Σ_{i} Σ_{j} ln m_ij

u_{1(i)} = (1/c) Σ_{j} ln m_ij − u

u_{2(j)} = (1/r) Σ_{i} ln m_ij − u

u_{12(i,j)} = ln m_ij − u − u_{1(i)} − u_{2(j)}

Then

Σ_{i} u_{1(i)} = Σ_{j} u_{2(j)} = Σ_{i} u_{12(i,j)} = Σ_{j} u_{12(i,j)} = 0

Equation (1) is called the log-linear model for the frequencies x_ij.
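The u-terms of equation (1) can be computed directly from a table of (expected) frequencies. A sketch (function name and the illustrative table are mine; the table below is rank-one, i.e. it satisfies independence exactly):

```python
import math

def u_terms(m):
    """Decompose ln(m_ij) into u, u1(i), u2(j), u12(i,j) as in equation (1)."""
    r, c = len(m), len(m[0])
    L = [[math.log(v) for v in row] for row in m]
    u = sum(sum(row) for row in L) / (r * c)                          # overall mean
    u1 = [sum(L[i]) / c - u for i in range(r)]                        # row effects
    u2 = [sum(L[i][j] for i in range(r)) / r - u for j in range(c)]   # column effects
    u12 = [[L[i][j] - u - u1[i] - u2[j] for j in range(c)]
           for i in range(r)]                                         # interactions
    return u, u1, u2, u12

# An independent (multiplicative) table: each entry is (row factor) x (column factor).
m = [[2.0, 4.0], [6.0, 12.0]]
u, u1, u2, u12 = u_terms(m)
```

By construction the constraints Σᵢ u_{1(i)} = Σⱼ u_{2(j)} = 0 hold, and for this independent table every interaction term u_{12(i,j)} is (numerically) zero, matching the note that follows.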
Note: X and Y are independent if

u_{12(i,j)} = 0 for all i, j

In this case the log-linear model becomes

ln m_ij = u + u_{1(i)} + u_{2(j)}
Three-way Frequency Tables
Example

Data from the Framingham Longitudinal Study of Coronary Heart Disease (Cornfield [1962])

Variables:
1. Systolic Blood Pressure (X): <127, 127-146, 147-166, 167+
2. Serum Cholesterol: <200, 200-219, 220-259, 260+
3. Heart Disease: Present, Absent

The data are tabulated on the next slide.
Three-way Frequency Table

Coronary   Serum          Systolic Blood Pressure (mm Hg)
Heart      Cholesterol
Disease    (mg/100 cc)    <127   127-146   147-166   167+
Present    <200              2        3         3      4
           200-219           3        2         0      3
           220-259           8       11         6      6
           260+              7       12        11     11
Absent     <200            117      121        47     22
           200-219          85       98        43     20
           220-259         119      209        68     43
           260+             67       99        46     33
Log-Linear model for three-way tables

Let m_ijk denote the expected frequency in cell (i, j, k) of the table. Then in general

ln m_ijk = u + u_{1(i)} + u_{2(j)} + u_{3(k)} + u_{12(i,j)} + u_{13(i,k)} + u_{23(j,k)} + u_{123(i,j,k)}

where the u-terms satisfy

Σ_{i} u_{1(i)} = Σ_{j} u_{2(j)} = Σ_{k} u_{3(k)} = 0

Σ_{i} u_{12(i,j)} = Σ_{j} u_{12(i,j)} = 0, and similarly for u_{13(i,k)} and u_{23(j,k)}

Σ_{i} u_{123(i,j,k)} = Σ_{j} u_{123(i,j,k)} = Σ_{k} u_{123(i,j,k)} = 0
Hierarchical Log-linear models for categorical Data
For three way tables
The hierarchical principle:
If an interaction is in the model, also keep lower order interactions and main effects associated with that interaction
1. Model (all main effects model):

ln μijk = u + u1(i) + u2(j) + u3(k)
i.e. u12(i,j) = u13(i,k) = u23(j,k) = u123(i,j,k) = 0.
Notation:
[1][2][3]
Description:
Mutual independence between all three variables.
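For concreteness, the fitted values under the mutual-independence model have the standard closed form μ̂ijk = xi·· x·j· x··k / N². A minimal NumPy sketch; the function name and the toy counts are illustrative, not from the slides:

```python
import numpy as np

def expected_mutual_indep(x):
    """Fitted values under the mutual-independence model [1][2][3]:
    mu_hat_{ijk} = x_{i..} x_{.j.} x_{..k} / N^2."""
    N = x.sum()
    a = x.sum(axis=(1, 2))   # x_{i..}
    b = x.sum(axis=(0, 2))   # x_{.j.}
    c = x.sum(axis=(0, 1))   # x_{..k}
    return a[:, None, None] * b[None, :, None] * c[None, None, :] / N**2

# Hypothetical 2x2x2 table of counts
x = np.array([[[10., 20.], [30., 40.]],
              [[ 5., 15.], [25., 35.]]])
mu = expected_mutual_indep(x)
```

The fitted values reproduce the observed grand total, as they must for any log-linear model containing the overall mean term.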
![Page 65: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/65.jpg)
2. Model:

ln μijk = u + u1(i) + u2(j) + u3(k) + u12(i,j)
i.e. u13(i,k) = u23(j,k) = u123(i,j,k) = 0.
Notation:
[12][3]
Description:
Independence of Variable 3 with variables 1 and 2.
![Page 66: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/66.jpg)
3. Model:

ln μijk = u + u1(i) + u2(j) + u3(k) + u13(i,k)
i.e. u12(i,j) = u23(j,k) = u123(i,j,k) = 0.
Notation:
[13][2]
Description:
Independence of Variable 2 with variables 1 and 3.
![Page 67: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/67.jpg)
4. Model:

ln μijk = u + u1(i) + u2(j) + u3(k) + u23(j,k)
i.e. u12(i,j) = u13(i,k) = u123(i,j,k) = 0.
Notation:
[23][1]
Description:
Independence of Variable 1 with variables 2 and 3.
![Page 68: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/68.jpg)
5. Model:

ln μijk = u + u1(i) + u2(j) + u3(k) + u12(i,j) + u13(i,k)
i.e. u23(j,k) = u123(i,j,k) = 0.
Notation:
[12][13]
Description:
Conditional independence between variables 2 and 3 given variable 1.
![Page 69: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/69.jpg)
6. Model:

ln μijk = u + u1(i) + u2(j) + u3(k) + u12(i,j) + u23(j,k)
i.e. u13(i,k) = u123(i,j,k) = 0.
Notation:
[12][23]
Description:
Conditional independence between variables 1 and 3 given variable 2.
![Page 70: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/70.jpg)
7. Model:

ln μijk = u + u1(i) + u2(j) + u3(k) + u13(i,k) + u23(j,k)
i.e. u12(i,j) = u123(i,j,k) = 0.
Notation:
[13][23]
Description:
Conditional independence between variables 1 and 2 given variable 3.
![Page 71: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/71.jpg)
8. Model:

ln μijk = u + u1(i) + u2(j) + u3(k) + u12(i,j) + u13(i,k) + u23(j,k)

i.e. u123(i,j,k) = 0.

Notation:

[12][13][23]

Description:

Pairwise relations among all three variables, with each two-variable interaction unaffected by the value of the third variable.
![Page 72: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/72.jpg)
9. Model (the saturated model):

ln μijk = u + u1(i) + u2(j) + u3(k) + u12(i,j) + u13(i,k) + u23(j,k) + u123(i,j,k)
Notation:
[123]
Description:
No simplifying dependence structure.
![Page 73: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/73.jpg)
Hierarchical Log-linear models for 3 way table
Model Description
[1][2][3] Mutual independence between all three variables.
[1][23] Independence of Variable 1 with variables 2 and 3.
[2][13] Independence of Variable 2 with variables 1 and 3.
[3][12] Independence of Variable 3 with variables 1 and 2.
[12][13] Conditional independence between variables 2 and 3 given variable 1.
[12][23] Conditional independence between variables 1 and 3 given variable 2.
[13][23] Conditional independence between variables 1 and 2 given variable 3.
[12][13][23] Pairwise relations among all three variables, with each two-variable interaction unaffected by the value of the third variable.
[123] The saturated model
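Several of these hierarchical models have closed-form fitted values. For example, under the conditional-independence model [12][13], the standard result is μ̂ijk = xij· xi·k / xi··. A minimal NumPy sketch; the function name and the toy counts are illustrative, not from the slides:

```python
import numpy as np

def expected_cond_indep(x):
    """Closed-form fitted values for the model [12][13]:
    variables 2 and 3 conditionally independent given variable 1,
    mu_hat_{ijk} = x_{ij.} x_{i.k} / x_{i..}."""
    x_ij = x.sum(axis=2)            # x_{ij.}
    x_ik = x.sum(axis=1)            # x_{i.k}
    x_i = x.sum(axis=(1, 2))        # x_{i..}
    return x_ij[:, :, None] * x_ik[:, None, :] / x_i[:, None, None]

# Hypothetical 2x2x2 table of counts
x = np.array([[[10., 20.], [30., 40.]],
              [[ 5., 15.], [25., 35.]]])
mu = expected_cond_indep(x)
```

Summing the fitted values over j and k within each level of variable 1 recovers the observed margins xi··, so the grand total is also preserved.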
![Page 74: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/74.jpg)
Maximum Likelihood Estimation
Log-Linear Model
![Page 75: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/75.jpg)
For any model it is possible to determine the maximum likelihood estimators of the parameters.

Example

Two-way table – independence – multinomial model

f(x11, x12, … , xrc) = [N! / (x11! x12! ⋯ xrc!)] π11^x11 π12^x12 ⋯ πrc^xrc

where E[xij] = N πij = μij, or πij = μij / N.
![Page 76: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/76.jpg)
Log-likelihood

l(π11, π12, …) = ln N! − Σi Σj ln xij! + Σi Σj xij ln πij
             = K + Σi Σj xij ln μij

where K = ln N! − Σi Σj ln xij! − N ln N.

With the model of independence

ln μij = u + u1(i) + u2(j)
![Page 77: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/77.jpg)
and

l(u, u1(1), …, u1(r), u2(1), …, u2(c)) = K + Σi Σj xij (u + u1(i) + u2(j))
  = K + N u + Σi xi· u1(i) + Σj x·j u2(j)

with Σi u1(i) = Σj u2(j) = 0.

Also

Σi Σj μij = Σi Σj e^u e^{u1(i)} e^{u2(j)} = e^u (Σi e^{u1(i)}) (Σj e^{u2(j)}) = N
![Page 78: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/78.jpg)
Let

g(λ, u, u1(1), …, u1(r), u2(1), …, u2(c))
  = K + N u + Σi xi· u1(i) + Σj x·j u2(j) + λ [ e^u (Σi e^{u1(i)}) (Σj e^{u2(j)}) − N ]

Now

∂g/∂u = N + λ e^u (Σi e^{u1(i)}) (Σj e^{u2(j)}) = N + λN = 0

hence λ = −1.
![Page 79: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/79.jpg)
Also

∂g/∂u1(i) = xi· + λ e^u e^{u1(i)} (Σj e^{u2(j)}) = xi· − N e^{u1(i)} / (Σi′ e^{u1(i′)}) = 0

so that

e^{u1(i)} / Σi′ e^{u1(i′)} = xi· / N

since Σi xi· / N = 1 and Σi u1(i) = 0.
![Page 80: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/80.jpg)
Now e^{u1(i)} = K1 xi·, or u1(i) = ln xi· + ln K1,

and Σi u1(i) = Σi ln xi· + r ln K1 = 0.
![Page 81: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/81.jpg)
Hence

u1(i) = ln xi· − (1/r) Σi′ ln xi′·

and

ln K1 = −(1/r) Σi′ ln xi′·.

Similarly

u2(j) = ln x·j − (1/c) Σj′ ln x·j′.

Finally

e^u (Σi e^{u1(i)}) (Σj e^{u2(j)}) = N.
![Page 82: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/82.jpg)
Hence

e^{u1(i)} = xi· / (Πi′ xi′·)^{1/r}

and

e^{u2(j)} = x·j / (Πj′ x·j′)^{1/c}.

Now

e^u = N / [ (Σi e^{u1(i)}) (Σj e^{u2(j)}) ]
    = N (Πi xi·)^{1/r} (Πj x·j)^{1/c} / [ (Σi xi·) (Σj x·j) ]
    = (1/N) (Πi xi·)^{1/r} (Πj x·j)^{1/c}

since Σi xi· = Σj x·j = N.
![Page 83: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/83.jpg)
Hence

u = (1/r) Σi ln xi· + (1/c) Σj ln x·j − ln N.

Note

ln μ̂ij = u + u1(i) + u2(j)
       = (1/r) Σi′ ln xi′· + (1/c) Σj′ ln x·j′ − ln N
         + ln xi· − (1/r) Σi′ ln xi′· + ln x·j − (1/c) Σj′ ln x·j′
       = ln xi· + ln x·j − ln N

or

μ̂ij = xi· x·j / N.
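The result μ̂ij = xi· x·j / N is easy to verify numerically. A minimal NumPy sketch; the function name and toy counts are illustrative, not from the slides:

```python
import numpy as np

def expected_independence(x):
    """MLEs under independence for a two-way table:
    mu_hat_{ij} = x_{i.} x_{.j} / N, as derived above."""
    row = x.sum(axis=1)             # x_{i.}
    col = x.sum(axis=0)             # x_{.j}
    return np.outer(row, col) / x.sum()

# Hypothetical 2x2 table of counts
x2 = np.array([[20., 30.], [10., 40.]])
mu2 = expected_independence(x2)
```

For these toy counts the row totals are (50, 50) and the column totals are (30, 70), giving fitted values 15, 35, 15, 35.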
![Page 84: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/84.jpg)
Comments
• Maximum likelihood estimates can be computed for any hierarchical log-linear model (i.e. with any number of variables).
• In certain situations the equations need to be solved numerically.
• For the saturated model (all interactions and main effects), the estimate of μijk… is xijk… .
![Page 85: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/85.jpg)
Goodness of Fit Statistics
These statistics can be used to check whether a log-linear model fits the observed frequency table.
![Page 86: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/86.jpg)
Goodness of Fit Statistics

The Chi-squared statistic:

χ² = Σ (Observed − Expected)² / Expected = Σijk (xijk − μ̂ijk)² / μ̂ijk

The Likelihood Ratio statistic:

G² = 2 Σ Observed ln(Observed / Expected) = 2 Σijk xijk ln(xijk / μ̂ijk)

d.f. = # cells − # parameters fitted

We reject the model if χ² or G² is greater than χ²α, the upper α critical point of the chi-squared distribution with the above degrees of freedom.
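Both statistics are one-liners given the observed and fitted tables. A minimal NumPy sketch; the function name, the toy counts, and the parameter count are illustrative, not from the slides:

```python
import numpy as np

def fit_stats(observed, expected, n_params):
    """Pearson chi-squared and likelihood-ratio G^2 for a fitted
    log-linear model; d.f. = # cells - # parameters fitted."""
    o = np.asarray(observed, dtype=float).ravel()
    e = np.asarray(expected, dtype=float).ravel()
    chi2 = ((o - e) ** 2 / e).sum()
    g2 = 2.0 * (o * np.log(o / e)).sum()   # assumes all observed counts > 0
    df = o.size - n_params
    return chi2, g2, df

# Hypothetical 2x2 table with its independence fit;
# independence fits 1 + (r-1) + (c-1) = 3 parameters here
obs = np.array([[20., 30.], [10., 40.]])
exp_ = np.array([[15., 35.], [15., 35.]])
chi2, g2, df = fit_stats(obs, exp_, n_params=3)
```

Both statistics would then be compared with the upper critical point of a chi-squared distribution on `df` degrees of freedom.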
![Page 87: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/87.jpg)
Example: Variables

1. Systolic Blood Pressure (B)
2. Serum Cholesterol (C)
3. Coronary Heart Disease (H)

                                 Systolic Blood Pressure (mm Hg)
Coronary Heart   Serum Cholesterol
Disease          (mg/100 cc)       <127   127-146   147-166   167+
Present          <200                 2         3         3      4
                 200-219              3         2         0      3
                 220-259              8        11         6      6
                 260+                 7        12        11     11
Absent           <200               117       121        47     22
                 200-219             85        98        43     20
                 220-259            119       209        68     43
                 260+                67        99        46     33
![Page 88: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/88.jpg)
Goodness of fit testing of Models

MODEL       DF   LIKELIHOOD-    PROB.    PEARSON   PROB.
                 RATIO CHISQ             CHISQ
---------   --   -----------    ------   -------   ------
B,C,H.      24      83.15       0.0000   102.00    0.0000
B,CH.       21      51.23       0.0002    56.89    0.0000
C,BH.       21      59.59       0.0000    60.43    0.0000
H,BC.       15      58.73       0.0000    64.78    0.0000
BC,BH.      12      35.16       0.0004    33.76    0.0007
BH,CH.      18      27.67       0.0673    26.58    0.0872   n.s.
CH,BC.      12      26.80       0.0082    33.18    0.0009
BC,BH,CH.    9       8.08       0.5265     6.56    0.6824   n.s.

Possible Models:
1. [BH][CH] – B and C independent given H.
2. [BC][BH][CH] – all two-factor interaction model.
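The all-two-factor model [BC][BH][CH] has no closed-form fitted values, so it is the case where "the equations need to be solved numerically." The standard numerical approach is iterative proportional fitting, which cycles through the three two-way margins until the fitted table matches all of them. A minimal sketch, with a hypothetical function name and toy counts (not the heart-disease data):

```python
import numpy as np

def ipf_two_factor(x, n_iter=100):
    """Iterative proportional fitting for the model [12][13][23]:
    repeatedly rescale the fitted table so that each of the three
    observed two-way margins is matched in turn."""
    mu = np.ones_like(x, dtype=float)
    for _ in range(n_iter):
        mu = mu * (x.sum(axis=2) / mu.sum(axis=2))[:, :, None]
        mu = mu * (x.sum(axis=1) / mu.sum(axis=1))[:, None, :]
        mu = mu * (x.sum(axis=0) / mu.sum(axis=0))[None, :, :]
    return mu

# Hypothetical 2x2x2 table with strictly positive counts
x = np.array([[[10., 20.], [30., 40.]],
              [[ 5., 15.], [25., 35.]]])
mu = ipf_two_factor(x)
```

At convergence the fitted table reproduces all three observed two-way margins, which is exactly the likelihood equation for this model.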
![Page 89: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/89.jpg)
Model 1: [BH][CH] Log-linear parameters

Heart disease – Blood Pressure Interaction, uHB(i,j):

Hd \ Bp    <127    127-146  147-166   167+
Pres      -0.256   -0.241    0.066    0.431
Abs        0.256    0.241   -0.066   -0.431

z = uHB(i,j) / s.e.(uHB(i,j)):

Hd \ Bp    <127    127-146  147-166   167+
Pres      -2.607   -2.733    0.660    4.461
Abs        2.607    2.733   -0.660   -4.461
![Page 90: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/90.jpg)
Multiplicative effect, exp(uHB(i,j)) = e^{uHB(i,j)}:

Hd \ Bp    <127    127-146  147-166   167+
Pres       0.774    0.786    1.068    1.538
Abs        1.291    1.272    0.936    0.650

Log-Linear Model:

ln μijk = u + uH(i) + uB(j) + uC(k) + uHB(i,j) + uHC(i,k)

or

μijk = e^u e^{uH(i)} e^{uB(j)} e^{uC(k)} e^{uHB(i,j)} e^{uHC(i,k)}
Heart Disease – Cholesterol Interaction, uHC(i,k):

Hd \ Chol   <200    200-219  220-259   260+
Pres       -0.233   -0.325    0.063    0.494
Abs         0.233    0.325   -0.063   -0.494

z = uHC(i,k) / s.e.(uHC(i,k)):

Hd \ Chol   <200    200-219  220-259   260+
Pres       -1.889   -2.268    0.677    5.558
Abs         1.889    2.268   -0.677   -5.558
![Page 92: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/92.jpg)
Multiplicative effect, exp(uHC(i,k)) = e^{uHC(i,k)}:

Hd \ Chol   <200    200-219  220-259   260+
Pres        0.792    0.723    1.065    1.640
Abs         1.262    1.384    0.939    0.610
![Page 93: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/93.jpg)
Model 2: [BC][BH][CH] Log-linear parameters

Blood pressure – Cholesterol interaction, uBC(j,k):

Bp \ Chol    <200    200-219  220-259   260+
<200         0.222   -0.019   -0.034   -0.169
200-219      0.114   -0.041    0.013   -0.086
220-259     -0.114    0.154   -0.058    0.018
260+        -0.221   -0.094    0.079    0.237
![Page 94: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/94.jpg)
z = uBC(j,k) / s.e.(uBC(j,k)):

Bp \ Chol    <200    200-219  220-259   260+
<200         2.680   -0.236   -0.326   -1.291
200-219      1.270   -0.472    0.117   -0.626
220-259     -1.502    2.253   -0.636    0.167
260+        -2.487   -1.175    0.785    2.051

Multiplicative effect, exp(uBC(j,k)) = e^{uBC(j,k)}:

Bp \ Chol    <200    200-219  220-259   260+
<200         1.248    0.981    0.967    0.844
200-219      1.120    0.960    1.013    0.918
220-259      0.892    1.166    0.944    1.018
260+         0.802    0.910    1.082    1.267
![Page 95: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/95.jpg)
Heart disease – Blood Pressure Interaction, uHB(i,j):

Hd \ Bp    <127    127-146  147-166   167+
Pres      -0.211   -0.232    0.055    0.389
Abs        0.211    0.232   -0.055   -0.389

z = uHB(i,j) / s.e.(uHB(i,j)):

Hd \ Bp    <127    127-146  147-166   167+
Pres      -2.125   -2.604    0.542    3.938
Abs        2.125    2.604   -0.542   -3.938
![Page 96: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/96.jpg)
Multiplicative effect, exp(uHB(i,j)) = e^{uHB(i,j)}:

Hd \ Bp    <127    127-146  147-166   167+
Pres       0.809    0.793    1.056    1.475
Abs        1.235    1.261    0.947    0.678
![Page 97: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/97.jpg)
Heart Disease – Cholesterol Interaction, uHC(i,k):

Hd \ Chol   <200    200-219  220-259   260+
Pres       -0.212   -0.316    0.069    0.460
Abs         0.212    0.316   -0.069   -0.460

z = uHC(i,k) / s.e.(uHC(i,k)):

Hd \ Chol   <200    200-219  220-259   260+
Pres       -1.712   -2.199    0.732    5.095
Abs         1.712    2.199   -0.732   -5.095
![Page 98: Logistic regression. Recall the simple linear regression model: y = 0 + 1 x + where we are trying to predict a continuous dependent variable y from](https://reader035.vdocuments.mx/reader035/viewer/2022081508/5697bf701a28abf838c7d86e/html5/thumbnails/98.jpg)
Multiplicative effect, exp(uHC(i,k)) = e^{uHC(i,k)}:

Hd \ Chol   <200    200-219  220-259   260+
Pres        0.809    0.729    1.071    1.584
Abs         1.237    1.372    0.933    0.631