ramin shamshiri abe6981 hw_03

Upload: raminshamshiri

Post on 04-Apr-2018


  • 7/30/2019 Ramin Shamshiri ABE6981 HW_03


    Ramin Shamshiri ABE 6986, HW #3 Due 09/17/08

    Homework #3

    Due 09/17/08

    ABE 6986

    Ramin Shamshiri

    UFID # 90213353

    Incidence of coronary heart disease with age

    Reference: Hosmer, D.W. and S. Lemeshow. 1989. Applied Logistic Regression. John Wiley & Sons, New York, NY.

    Table 1: Dependence of coronary heart disease on age

    Age Group (yr)   Age (yr)   F      Z
    20-29            25.0       0.10   -2.079
    30-34            32.5       0.13   -1.779
    35-39            37.5       0.25   -0.956
    40-44            42.5       0.33   -0.547
    45-49            47.5       0.46    0.044
    50-54            52.5       0.63    0.847
    55-59            57.5       0.76    1.692
    60-69            65.0       0.80    2.079

    Data are given in Table 1 for the incidence of coronary heart disease by age group among 100 subjects, where F is the fraction of the 100 with significant symptoms. Assume the result can be described by the logistic model given by:

    $$F = \frac{A}{1 + \exp(b - c \cdot \mathrm{Age})}$$

    where A is the maximum value of F at high age, b is the intercept parameter, and c is the response coefficient (yr^-1). The equation above can be rearranged to the linearized form

    $$Z = -\ln\left(\frac{A}{F} - 1\right) = c \cdot \mathrm{Age} - b$$
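    The rearrangement can be checked step by step. The sketch below assumes the logistic form with exponent b - c*Age (the form quoted in the fit of part 7) and the sign convention under which Z increases with age, as the Z column of Table 1 does.

```latex
\begin{align}
F &= \frac{A}{1 + \exp(b - c\,\mathrm{Age})} \\
\frac{A}{F} - 1 &= \exp(b - c\,\mathrm{Age}) \\
\ln\!\left(\frac{A}{F} - 1\right) &= b - c\,\mathrm{Age} \\
Z = -\ln\!\left(\frac{A}{F} - 1\right) = \ln\frac{F}{A - F} &= c\,\mathrm{Age} - b
\end{align}
```

    As a check, A = 0.90 and F = 0.10 give Z = ln(0.10/0.80) = ln(0.125), about -2.079, which is the first Z entry of Table 1. In a regression of Z on Age the slope is therefore c and the intercept is -b.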


    1- Plot F vs. Age on linear-linear graph paper.
    Answer:

    Age (yr)   F
    25.0       0.10
    32.5       0.13
    37.5       0.25
    42.5       0.33
    47.5       0.46
    52.5       0.63
    57.5       0.76
    65.0       0.80


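    2- Calculate values of Z for each age for values of A = 0.90, 0.91, 0.92, 0.93, 0.94 and 0.95.
    Answer:

    $$Z = -\ln\left(\frac{A}{F} - 1\right) = c \cdot \mathrm{Age} - b$$

    A = 0.90, Age = 25 => F = 0.10:

    $$Z = -\ln\left(\frac{0.90}{0.10} - 1\right) = -2.079$$

    ...

    A = 0.95, Age = 65 => F = 0.80:

    $$Z = -\ln\left(\frac{0.95}{0.80} - 1\right) = 1.6739$$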

    Table 2: Values of Z for each age for the given values of A

    Age (yr)   Z1 (A=0.90)   Z2 (A=0.91)   Z3 (A=0.92)   Z4 (A=0.93)   Z5 (A=0.94)   Z6 (A=0.95)
    25.0       -2.0794       -2.0919       -2.1041       -2.1163       -2.1282       -2.1401
    32.5       -1.7789       -1.7918       -1.8045       -1.8171       -1.8295       -1.8418
    37.5       -0.95551      -0.97078      -0.9858       -1.0006       -1.0152       -1.0296
    42.5       -0.54654      -0.56394      -0.5810       -0.59784      -0.61437      -0.63063
    47.5        0.044452      0.021979      0            -0.02151      -0.04256      -0.06318
    52.5        0.8473        0.81093       0.7758        0.74194       0.70915       0.6774
    57.5        1.6917        1.6227        1.5581        1.4975        1.4404        1.3863
    65.0        2.0794        1.9841        1.8971        1.8171        1.743         1.674
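    The values in Table 2 can be reproduced with a few lines of code. This is a minimal sketch using only the Python standard library; the names `z_value` and `z_table` are illustrative, and the formula assumes Z = ln(F/(A - F)), the sign convention that matches the tabulated values.

```python
import math

# Data from Table 1.
AGES = [25.0, 32.5, 37.5, 42.5, 47.5, 52.5, 57.5, 65.0]
F_OBS = [0.10, 0.13, 0.25, 0.33, 0.46, 0.63, 0.76, 0.80]
A_VALUES = [0.90, 0.91, 0.92, 0.93, 0.94, 0.95]


def z_value(f, a):
    """Linearized response Z = -ln(A/F - 1) = ln(F / (A - F)); requires 0 < F < A."""
    return math.log(f / (a - f))


def z_table(f_values, a_values):
    """Return {A: [Z for each F]} for every trial value of A."""
    return {a: [z_value(f, a) for f in f_values] for a in a_values}


if __name__ == "__main__":
    table = z_table(F_OBS, A_VALUES)
    print("Age    " + "  ".join(f"A={a:.2f} " for a in A_VALUES))
    for i, age in enumerate(AGES):
        row = " ".join(f"{table[a][i]:8.4f}" for a in A_VALUES)
        print(f"{age:5.1f} {row}")
```

    Each trial A must stay above the largest observed F (0.80 here); otherwise the logarithm is undefined.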


    3- Estimate values of parameters b and c corresponding to each value of A by linear regression of Z vs. Age. Include the correlation coefficient (r) to 5 decimal places.
    Answer:

    Linear model for the Z values of Table 1 (A = 0.90): $Z = -\ln(A/F - 1) = c \cdot \mathrm{Age} - b$, fitted as $Z = 0.1144 \cdot \mathrm{Age} - 5.235$
    Coefficients (with 95% confidence bounds):
    c = 0.1144 (0.09751, 0.1313)
    intercept (-b) = -5.235 (-6.022, -4.447)
    Goodness of fit:
    SSE: 0.3531   R-square: 0.9787   r = 0.98929
    Adjusted R-square: 0.9751   RMSE: 0.2426

    Age (yr)   Z
    25.0       -2.079
    32.5       -1.779
    37.5       -0.956
    42.5       -0.547
    47.5        0.044
    52.5        0.847
    57.5        1.692
    65.0        2.079
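    The slope, intercept and correlation coefficient above follow from the standard least-squares formulas. The sketch below recomputes them for the Table 1 values in plain Python; `linear_fit` is an illustrative helper, not the MATLAB routine that produced the reported output, so the last digit of r can differ slightly from the quoted value.

```python
import math

AGES = [25.0, 32.5, 37.5, 42.5, 47.5, 52.5, 57.5, 65.0]
Z_TABLE1 = [-2.079, -1.779, -0.956, -0.547, 0.044, 0.847, 1.692, 2.079]


def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


if __name__ == "__main__":
    slope, intercept, r = linear_fit(AGES, Z_TABLE1)
    # slope estimates c; the intercept estimates -b in Z = c*Age - b
    print(f"c = {slope:.4f}, intercept = {intercept:.3f}, r = {r:.5f}")
```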


    Linear model for A = 0.90: $Z_1 = -\ln(A/F - 1) = c \cdot \mathrm{Age} - b$, fitted as $Z_1 = 0.1144 \cdot \mathrm{Age} - 5.235$
    Coefficients (with 95% confidence bounds):
    c = 0.1144 (0.09754, 0.1312)
    intercept (-b) = -5.235 (-6.022, -4.448)
    Goodness of fit:
    SSE: 0.3523   R-square: 0.9787   r = 0.98929
    Adjusted R-square: 0.9752   RMSE: 0.2423

    Age (yr)   Z1
    25.0       -2.0794
    32.5       -1.7789
    37.5       -0.95551
    42.5       -0.54654
    47.5        0.044452
    52.5        0.8473
    57.5        1.6917
    65.0        2.0794


    Linear model for A = 0.91: $Z_2 = -\ln(A/F - 1) = c \cdot \mathrm{Age} - b$, fitted as $Z_2 = 0.1123 \cdot \mathrm{Age} - 5.178$
    Coefficients (with 95% confidence bounds):
    c = 0.1123 (0.09589, 0.1288)
    intercept (-b) = -5.178 (-5.946, -4.41)
    Goodness of fit:
    SSE: 0.3358   R-square: 0.979   r = 0.98944
    Adjusted R-square: 0.9754   RMSE: 0.2366

    Age (yr)   Z2
    25.0       -2.0919
    32.5       -1.7918
    37.5       -0.97078
    42.5       -0.56394
    47.5        0.021979
    52.5        0.81093
    57.5        1.6227
    65.0        1.9841


    Linear model for A = 0.92: $Z_3 = -\ln(A/F - 1) = c \cdot \mathrm{Age} - b$, fitted as $Z_3 = 0.1105 \cdot \mathrm{Age} - 5.127$
    Coefficients (with 95% confidence bounds):
    c = 0.1105 (0.09435, 0.1266)
    intercept (-b) = -5.127 (-5.88, -4.374)
    Goodness of fit:
    SSE: 0.3226   R-square: 0.9791   r = 0.989797
    Adjusted R-square: 0.9756   RMSE: 0.2319

    Age (yr)   Z3
    25.0       -2.1041
    32.5       -1.8045
    37.5       -0.9858
    42.5       -0.5810
    47.5        0
    52.5        0.7758
    57.5        1.5581
    65.0        1.8971


    Linear model for A = 0.93: $Z_4 = -\ln(A/F - 1) = c \cdot \mathrm{Age} - b$, fitted as $Z_4 = 0.1088 \cdot \mathrm{Age} - 5.082$
    Coefficients (with 95% confidence bounds):
    c = 0.1088 (0.09291, 0.1246)
    intercept (-b) = -5.082 (-5.823, -4.341)
    Goodness of fit:
    SSE: 0.312   R-square: 0.9791   r = 0.989494
    Adjusted R-square: 0.9757   RMSE: 0.2281

    Age (yr)   Z4
    25.0       -2.1163
    32.5       -1.8171
    37.5       -1.0006
    42.5       -0.59784
    47.5       -0.02151
    52.5        0.74194
    57.5        1.4975
    65.0        1.8171


    Linear model for A = 0.94: $Z_5 = -\ln(A/F - 1) = c \cdot \mathrm{Age} - b$, fitted as $Z_5 = 0.1072 \cdot \mathrm{Age} - 5.041$
    Coefficients (with 95% confidence bounds):
    c = 0.1072 (0.09156, 0.1228)
    intercept (-b) = -5.041 (-5.772, -4.311)
    Goodness of fit:
    SSE: 0.3035   R-square: 0.9791   r = 0.989494
    Adjusted R-square: 0.9756   RMSE: 0.2249

    Age (yr)   Z5
    25.0       -2.1282
    32.5       -1.8295
    37.5       -1.0152
    42.5       -0.61437
    47.5       -0.04256
    52.5        0.70915
    57.5        1.4404
    65.0        1.743


    Linear model for A = 0.95: $Z_6 = -\ln(A/F - 1) = c \cdot \mathrm{Age} - b$, fitted as $Z_6 = 0.1057 \cdot \mathrm{Age} - 5.004$
    Coefficients (with 95% confidence bounds):
    c = 0.1057 (0.09028, 0.1212)
    intercept (-b) = -5.004 (-5.726, -4.283)
    Goodness of fit:
    SSE: 0.2964   R-square: 0.979   r = 0.989444
    Adjusted R-square: 0.9755   RMSE: 0.2223

    Age (yr)   Z6
    25.0       -2.1401
    32.5       -1.8418
    37.5       -1.0296
    42.5       -0.63063
    47.5       -0.06318
    52.5        0.6774
    57.5        1.3863
    65.0        1.674


    4- Select the values of A, b, and c for the optimum r.
    Answer:
    The results of problem 3 give the following r values:

    A      r
    0.90   0.98929
    0.91   0.98944
    0.92   0.98979
    0.93   0.98949
    0.94   0.98949
    0.95   0.98944

    The correlation coefficient should be as close to 1.0 as possible. In the table above, r = 0.98979 is the largest value and the closest to one, so it is taken as the optimum r. It corresponds to A = 0.92:

    $Z_3 = 0.1105 \cdot \mathrm{Age} - 5.127$
    A = 0.92
    intercept (-b) = -5.127 (-5.88, -4.374), so b = 5.127 in the logistic model
    c = 0.1105 (0.09435, 0.1266)
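    The selection step amounts to repeating the regression for each trial A and keeping the case with the largest r. The sketch below does this from the F data directly; `fit_line`, `scan_a` and `best_a` are illustrative names, and b is taken as the negative of the fitted intercept (an assumption that follows from Z = c*Age - b). The r values of the best candidates differ only in the fifth decimal place, so the ranking is sensitive to how r is rounded, and a recomputation may not reproduce the quoted ordering exactly.

```python
import math

AGES = [25.0, 32.5, 37.5, 42.5, 47.5, 52.5, 57.5, 65.0]
F_OBS = [0.10, 0.13, 0.25, 0.33, 0.46, 0.63, 0.76, 0.80]
A_VALUES = [0.90, 0.91, 0.92, 0.93, 0.94, 0.95]


def fit_line(x, y):
    """Least-squares slope, intercept and Pearson r for y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)


def scan_a(a_values):
    """Return one (A, c, b, r) tuple per trial A, with b = -intercept."""
    rows = []
    for a in a_values:
        z = [math.log(f / (a - f)) for f in F_OBS]
        slope, intercept, r = fit_line(AGES, z)
        rows.append((a, slope, -intercept, r))
    return rows


def best_a(rows):
    """Trial A with the largest correlation coefficient."""
    return max(rows, key=lambda row: row[3])[0]


if __name__ == "__main__":
    rows = scan_a(A_VALUES)
    for a, c, b, r in rows:
        print(f"A = {a:.2f}  c = {c:.4f}  b = {b:.3f}  r = {r:.6f}")
    print("largest r at A =", best_a(rows))
```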

    5- Plot Z vs. Age for this case on linear-linear graph paper. Plot the regression line as well.
    Answer:


    6- Plot the estimation equation on part (1)
    Answer:
    First, using the F and Age data set, I plotted F vs. Age on linear-linear paper together with a regression line (Figure 9). Then, using MATLAB, I plotted Eq. 1 symbolically (Figure 10) with:

    A = 0.92
    b = 5.127 (the regression intercept is -b = -5.127)
    c = 0.1105

    $$F = \frac{0.92}{1 + \exp(5.127 - 0.1105 \cdot \mathrm{Age})}$$
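    The estimation equation can also be evaluated numerically at the tabulated ages and compared with the observed fractions. The sketch below assumes the exponent form b - c*Age with b = 5.127, the negative of the regression intercept, and c = 0.1105; `logistic_f` is an illustrative name.

```python
import math

AGES = [25.0, 32.5, 37.5, 42.5, 47.5, 52.5, 57.5, 65.0]
F_OBS = [0.10, 0.13, 0.25, 0.33, 0.46, 0.63, 0.76, 0.80]


def logistic_f(age, a=0.92, b=5.127, c=0.1105):
    """Logistic model F = A / (1 + exp(b - c*Age)); b = 5.127 is an assumed sign choice."""
    return a / (1.0 + math.exp(b - c * age))


if __name__ == "__main__":
    print("Age    F observed   F model")
    for age, f in zip(AGES, F_OBS):
        print(f"{age:5.1f}  {f:10.2f}   {logistic_f(age):7.3f}")
```

    With these parameters the curve reaches half of A at Age = b/c, about 46 yr, and gives roughly 0.82 at Age = 65.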

    Linear model fitted to F vs. Age:
    $F = 0.02024 \cdot \mathrm{Age} - 0.4784$
    R-square: 0.9605   r = 0.98005

    Age (yr)   F
    25.0       0.10
    32.5       0.13
    37.5       0.25
    42.5       0.33
    47.5       0.46
    52.5       0.63
    57.5       0.76
    65.0       0.80


    7- Discuss your results.
    Answer:
    I tried to fit Eq. 1 directly to the Age and F data set using MATLAB to find the constants b and c. The fit came out as follows:

    General model:
    F(Age) = 0.92/(1+exp(b-c*Age))
    Coefficients (with 95% confidence bounds):
    b = 0.402
    c = -9.724
    Goodness of fit:
    SSE: 2.024
    R-square: -2.834

    These results show that the direct fit did not describe this data set (a negative R-square is worse than a horizontal line through the mean of F), so it cannot be used as a prediction model for this data. Converting the F values to the linearized Z values solved this problem: a simple linear regression of Z on Age fits the data with a very good correlation coefficient.
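    A direct fit of Eq. 1 is sensitive to the starting values of b and c. The reported SSE of 2.024 is essentially the sum of the squared F values (about 2.0245), which is what a curve stuck near F = 0 would produce. As a rough sketch, and not the MATLAB procedure used above, a damped Gauss-Newton iteration started from the linearized estimates can be tried. The starting point b = 5.127, c = 0.1105 and all function names are assumptions of this example.

```python
import math

AGES = [25.0, 32.5, 37.5, 42.5, 47.5, 52.5, 57.5, 65.0]
F_OBS = [0.10, 0.13, 0.25, 0.33, 0.46, 0.63, 0.76, 0.80]
A = 0.92  # held fixed, as in the reported fit


def model(age, b, c):
    return A / (1.0 + math.exp(b - c * age))


def sse(b, c):
    """Sum of squared residuals between observed and modeled F."""
    return sum((f - model(x, b, c)) ** 2 for x, f in zip(AGES, F_OBS))


def gauss_newton(b, c, iterations=50):
    """Damped Gauss-Newton for (b, c); the step is halved until the SSE decreases."""
    for _ in range(iterations):
        jbb = jbc = jcc = gb = gc = 0.0
        for x, f in zip(AGES, F_OBS):
            e = math.exp(b - c * x)
            d = A * e / (1.0 + e) ** 2   # magnitude of dF/db
            jb, jc = -d, d * x           # dF/db and dF/dc
            res = f - A / (1.0 + e)
            jbb += jb * jb
            jbc += jb * jc
            jcc += jc * jc
            gb += jb * res
            gc += jc * res
        det = jbb * jcc - jbc * jbc
        if det == 0.0:
            break
        # Solve the 2x2 normal equations (J^T J) delta = J^T residual.
        db = (jcc * gb - jbc * gc) / det
        dc = (jbb * gc - jbc * gb) / det
        step, current = 1.0, sse(b, c)
        while step > 1e-8 and sse(b + step * db, c + step * dc) >= current:
            step /= 2.0
        if step <= 1e-8:
            break  # no further improvement along this direction
        b, c = b + step * db, c + step * dc
    return b, c


if __name__ == "__main__":
    b0, c0 = 5.127, 0.1105
    b_fit, c_fit = gauss_newton(b0, c0)
    print(f"start: b = {b0}, c = {c0}, SSE = {sse(b0, c0):.5f}")
    print(f"fit:   b = {b_fit:.4f}, c = {c_fit:.4f}, SSE = {sse(b_fit, c_fit):.5f}")
```

    Because each accepted step must lower the SSE, the result is never worse than the linearized starting point.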

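    Using the constants from the linear model, I plotted Eq. 1, which fits the F data better. The plotted curve still looked like a poor prediction: at Age = 65, for instance, the given value of F is 0.80 while the curve reads about 0.92. That reading is what Eq. 1 returns when the regression intercept, -5.127, is entered as b directly, because exp(-5.127 - 0.1105*65) is nearly zero and F then sits at A = 0.92. The linearized form is Z = c*Age - b, so the intercept equals -b and b = 5.127. With that sign, Eq. 1 gives F = 0.92/(1 + exp(5.127 - 0.1105*65)), about 0.82 at Age = 65, which is close to the given 0.80.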