therapy analysis


Upload: austin-kinion

Post on 24-Dec-2015


DESCRIPTION

Which Therapy works Best?

TRANSCRIPT

Page 1: Therapy analysis

Austin Kinion
STA 138
SID: 998793649

Project 1

Introduction: A researcher wants to know whether there is a significant difference among three therapies for curing patients of cocaine dependence (defined as not taking cocaine for at least 6 months). She tests 500 patients and obtains the results shown in the table below. We need to determine which of the 8 models described in class is the most parsimonious fit for the data.

There are three variables in the table: Cure (C), Sex (S), and Therapy (T). Cure can take the value Positive (i.e. the patient was cured) or Negative (i.e. the patient was not cured), Sex is classified as Male or Female, and Therapy is any one of the three therapies used to treat the patient.

The table lists the number of patients that fall into each of the 2 × 2 × 3 combinations of the three variables. Just as for two-way contingency tables, the saturated model provides a complete characterization of the data, equivalent to the information in the table. What we are looking for is the smallest model that still fits the data adequately. We will look at each of the models described in class, one by one, to determine which is best.

Materials and methods: I used SAS for the analysis of the data. The SAS code is provided on the last page of this report for ease of reading. I fit each of the 8 models in SAS and obtained the G², AIC, and BIC values for each. From the SAS output, I determined that Model 4 is the best-fitting model for this data, with the lowest AIC (85.32), BIC (89.20), and G² (5.59) for the number of terms it uses. Model 7 had the best G², AIC, and BIC values overall, but it contains more terms than Model 4 and was not significantly different from Model 4, so I chose Model 4.

To check that Model 4 was not significantly different from Model 7, I compared Model 7 (CS, ST, CT) with Model 4 (CS, CT). The difference between the chi-square statistics for the two models was 7.86 – 1.11 = 6.75, which yields a p-value of .106 on 4 – 2 = 2 degrees of freedom. This is not a significant difference, so I was able to adopt Model 4 (CS, CT). This indicates that the interaction between S (Sex) and T (Therapy) does not make a significant contribution.
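As a rough check on the comparison above, the following Python sketch (not part of the SAS analysis; the function names are illustrative) counts the residual degrees of freedom of the two models and evaluates the chi-square upper-tail probability. It assumes the models are hierarchical loglinear models on the 2 × 2 × 3 table and uses the closed form for the chi-square tail that holds only for even degrees of freedom. Evaluated on the figures quoted above, a difference of 6.75 on 2 degrees of freedom gives about .034, while .106 is what the same formula gives for 5.59 – 1.11 = 4.48, i.e. using the G² of 5.59 quoted for Model 4 in the methods section. The SAS output should settle which pair of statistics was intended.

```python
import math
from itertools import combinations


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))


def residual_df(levels, terms):
    """Residual df of a hierarchical loglinear model: cells minus free parameters.

    levels: dict mapping factor name to its number of levels.
    terms:  highest-order terms, e.g. ["CS", "CT"].
    """
    implied = set()
    for term in terms:
        # A hierarchical model contains every lower-order sub-term.
        for r in range(1, len(term) + 1):
            implied.update(combinations(sorted(term), r))
    n_params = 1 + sum(math.prod(levels[f] - 1 for f in combo) for combo in implied)
    return math.prod(levels.values()) - n_params


levels = {"C": 2, "S": 2, "T": 3}
df_model4 = residual_df(levels, ["CS", "CT"])
df_model7 = residual_df(levels, ["CS", "ST", "CT"])
df_diff = df_model4 - df_model7

p_as_stated = chi2_sf_even_df(7.86 - 1.11, df_diff)   # difference quoted in the text
p_from_g2 = chi2_sf_even_df(5.59 - 1.11, df_diff)     # using Model 4's G² from the methods section

print(df_model4, df_model7, df_diff)
print(round(p_as_stated, 4), round(p_from_g2, 4))
```

The degrees-of-freedom count agrees with the 4 – 2 = 2 used in the text.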

A table of all the important values from all models is shown below, with Model 4 and Model 7 highlighted.

The selected model indicates that there is an interaction between Cure and Sex as well as between Cure and Therapy. This can be seen by looking at the odds ratios of the observed data below.

Odds ratios

There is an interaction between Cure and Therapy. The odds of a cure under therapy 1 are 91/25 = 3.64, while those under therapy 2 are 79/45 = 1.76. The odds ratio is 3.64/1.76 = 2.07, i.e. the odds of a cure under therapy 1 are about twice those under therapy 2.

There is an interaction between Cure and Sex. The odds of a cure for males are 221/38 = 5.82, while those for females are 136/105 = 1.30. The odds ratio is 5.82/1.30 = 4.49, i.e. the therapies seem to be much more effective for men than for women.
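The odds and odds ratios above can be reproduced with a few lines of Python. This is only an arithmetic sketch using the counts quoted in the text (cured / not cured); the helper names are illustrative and not taken from the SAS code.

```python
def odds(cured, not_cured):
    """Odds of a cure: number cured divided by number not cured."""
    return cured / not_cured


def odds_ratio(a_cured, a_not_cured, b_cured, b_not_cured):
    """Odds of a cure in group A relative to group B."""
    return odds(a_cured, a_not_cured) / odds(b_cured, b_not_cured)


# Therapy 1 versus therapy 2, counts as quoted in the text.
therapy_or = odds_ratio(91, 25, 79, 45)

# Males versus females, counts as quoted in the text.
sex_or = odds_ratio(221, 38, 136, 105)

print(round(odds(91, 25), 2), round(odds(79, 45), 2), round(therapy_or, 2))
print(round(odds(221, 38), 2), round(odds(136, 105), 2), round(sex_or, 2))
```

An odds ratio above 1 means the first group listed (therapy 1, males) has the higher odds of a cure.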

Finally, I computed the coefficients for Model 4 in SAS as well; they are as follows:

$$\log(\mu_{ijk}) = \lambda + \lambda_i^{X} + \lambda_j^{Y} + \lambda_k^{Z} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}$$

$$= 3.9816 - 1.01 - 1.07 - .48 + .2845 + 1.502 + .3513 - .3779 = 3.1799$$

$$\mu_{ijk} = 23.996$$
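The step from the coefficients to the fitted cell count is a sum followed by an exponential. The sketch below redoes that arithmetic in Python using the terms exactly as printed above; the variable names are illustrative. Because several of the printed terms are rounded to two decimals, the result differs slightly from the reported values: the printed terms sum to about 3.1815, which exponentiates to about 24.08, against the reported 3.1799 and 23.996.

```python
import math

# Terms of the linear predictor exactly as printed in the report
# (intercept first, then the remaining terms in the order given).
printed_terms = [3.9816, -1.01, -1.07, -0.48, 0.2845, 1.502, 0.3513, -0.3779]

# log of the fitted cell count is the sum of the terms.
log_mu = math.fsum(printed_terms)

# Fitted cell count is the exponential of the linear predictor.
mu = math.exp(log_mu)

print(round(log_mu, 4), round(mu, 2))
```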


Conclusion and Results: I have determined that, of the 8 models described in class, Model 4 is the most parsimonious fit for this data. This was determined by analyzing each of the models in SAS and comparing their AIC, BIC, and G² values. This means that there is an interaction between Cure and Therapy and also an interaction between Cure and Sex.

I was able to determine that Model 4 was not significantly different from Model 7 by comparing the chi-square statistics for the two models. Their difference was 7.86 – 1.11 = 6.75, which yields a p-value of .106 on 4 – 2 = 2 degrees of freedom. Since this is not a significant difference, I was able to adopt Model 4 as my most parsimonious fit. It is also important to note that, since Model 4 was the best fit, the interaction between Sex and Therapy does not make a significant contribution.

The important SAS output for Model 4 is provided below:
