Two-Way Analysis of Variance
STAT E-150 Statistical Methods
One-way ANOVA analyzes the relationship between two variables, a quantitative response variable and the groups that are categories of a factor. In a two-way ANOVA model, there are two factors, each with its own number of levels. The two-way ANOVA tests to see if the factors are significant, either separately (called the main effects) or in combination (via an interaction).
In a one-way ANOVA, we test for the equality of the means of several levels of a variable by comparing the variation between the levels with the variation within each level (or group, or treatment).
In two-way ANOVA the total variance is again partitioned into separate components. One is the variation within the groups. But in this case the between-groups variability is further divided into the variability due to Factor A, the variability due to Factor B, and the variability due to the interaction between the factors.
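The partition described above can be sketched in a few lines of Python. This is a minimal illustration that assumes a balanced design (the same number of observations in every cell); the data values and the level names a1, a2, b1, b2, b3 are invented and do not come from the income example.

```python
# Hypothetical balanced data: 2 levels of Factor A x 3 levels of Factor B,
# 2 observations per cell. All numbers are invented for illustration.
cells = {
    ("a1", "b1"): [4.0, 6.0],
    ("a1", "b2"): [7.0, 9.0],
    ("a1", "b3"): [5.0, 7.0],
    ("a2", "b1"): [8.0, 10.0],
    ("a2", "b2"): [9.0, 13.0],
    ("a2", "b3"): [12.0, 14.0],
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


a_levels = sorted({a for a, _ in cells})
b_levels = sorted({b for _, b in cells})
n = len(next(iter(cells.values())))  # observations per cell (balanced design)

all_values = [y for ys in cells.values() for y in ys]
grand = mean(all_values)
cell_mean = {key: mean(ys) for key, ys in cells.items()}
a_mean = {a: mean(y for (ai, _), ys in cells.items() if ai == a for y in ys)
          for a in a_levels}
b_mean = {b: mean(y for (_, bi), ys in cells.items() if bi == b for y in ys)
          for b in b_levels}

# Total variation around the grand mean.
ss_total = sum((y - grand) ** 2 for y in all_values)
# Between-groups variation, split into Factor A, Factor B, and interaction.
ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
ss_ab = n * sum(
    (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
    for a in a_levels
    for b in b_levels
)
# Variation within the groups (cells).
ss_within = sum((y - cell_mean[key]) ** 2 for key, ys in cells.items() for y in ys)

print("SS total :", round(ss_total, 4))
print("SS parts :", round(ss_a + ss_b + ss_ab + ss_within, 4))
```

In a balanced design the four components add up exactly to the total sum of squares. With unequal cell sizes, as in the income example, the Type III sums of squares reported by SPSS generally do not add up this way.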
The assumptions for this test are:
- Independence Assumption: The groups must be independent of each other, and the subjects within each group must be randomly assigned.
- Equal Variance Assumption: The variances of the treatment groups are equal.
- Normal Population Assumption: The values for each treatment group are normally distributed.
Example: Suppose we are interested in analyzing the effect of gender and age on income. We will treat the ages as categories: ages 18 - 29, 30 - 39, 40 - 49, and 50 or higher.
This is the structure of this design:

                Gender
Age category    Female                                   Male
18 - 29         Income for n subjects (Female, 18-29)    Income for n subjects (Male, 18-29)
30 - 39         Income for n subjects (Female, 30-39)    Income for n subjects (Male, 30-39)
40 - 49         Income for n subjects (Female, 40-49)    Income for n subjects (Male, 40-49)
≥ 50            Income for n subjects (Female, ≥ 50)     Income for n subjects (Male, ≥ 50)
We can then address these questions:
- Are there significant mean differences for income between male and female employees?
- Are there significant mean differences for income by age category among employees?
- Is there a significant interaction on income between gender and age category?
The hypotheses are:
H0: μF1 = μF2 = μF3 = μF4 = μM1 = μM2 = μM3 = μM4 (F and M refer to gender; 1, 2, 3, and 4 refer to the age groups)
Ha: the means are not all equal
The first step is to determine whether there is interaction between the two factors, age and gender, by creating an interaction plot. Lines that intersect, or that are clearly not parallel, suggest that there is an interaction between the factors. However, it is important to check the ANOVA results to see if any interaction is significant. Here is the interaction plot for our example:
Since the lines do not intersect, the plot suggests that there is no interaction between gender and age.
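What an interaction plot displays can also be checked numerically: each line connects the cell means for one gender across the age categories, and parallel lines correspond to a gap between the lines that stays the same in every age category. The sketch below uses invented cell means (the actual cell means are not given in these slides), so the numbers are purely hypothetical.

```python
# Hypothetical cell means keyed by (age category, gender).
# These values are invented; they are NOT the means from the income data.
cell_means = {
    ("18-29", "Female"): 10.0, ("18-29", "Male"): 12.0,
    ("30-39", "Female"): 12.5, ("30-39", "Male"): 14.5,
    ("40-49", "Female"): 14.0, ("40-49", "Male"): 16.5,
    ("50+", "Female"): 13.5, ("50+", "Male"): 15.5,
}
ages = ["18-29", "30-39", "40-49", "50+"]

# Vertical gap between the two lines of the interaction plot at each age.
gaps = {age: cell_means[(age, "Male")] - cell_means[(age, "Female")] for age in ages}

# Perfectly parallel lines give a spread of 0; a large spread, or a gap
# that changes sign (crossing lines), points toward an interaction.
spread = max(gaps.values()) - min(gaps.values())
lines_cross = min(gaps.values()) < 0 < max(gaps.values())

for age in ages:
    print(age, "gap (Male - Female):", gaps[age])
print("spread of gaps:", spread, "| lines cross:", lines_cross)
```

A small spread in sample means is only descriptive; whether it reflects a real interaction is decided by the interaction test in the ANOVA table.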
The next step is to check the Equal Variances assumption:

Levene's Test of Equality of Error Variances(a)
Dependent Variable: rincom2
F        df1   df2   Sig.
1.061    7     701   .387
Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a. Design: Intercept + agecat4 + sex + agecat4 * sex

Since p is large (p = .387), the null hypothesis of equal variances is not rejected; we can conclude that the data do not violate the Equal Variances assumption.
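For readers who want to see what the Levene statistic measures, here is a minimal sketch. It assumes the mean-centered form of the test (a one-way ANOVA F ratio computed on the absolute deviations of each value from its group mean); the function name levene_w and the two small groups are invented for illustration. The degrees of freedom at the end use only figures from the output above: 4 age categories x 2 genders = 8 groups and 709 subjects.

```python
def levene_w(groups):
    """Levene statistic based on absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of every observation from its own group mean.
    z = []
    for g in groups:
        group_mean = sum(g) / len(g)
        z.append([abs(y - group_mean) for y in g])

    z_bar = [sum(zi) / len(zi) for zi in z]          # mean deviation per group
    z_grand = sum(sum(zi) for zi in z) / n_total     # overall mean deviation

    between = sum(len(zi) * (zb - z_grand) ** 2 for zi, zb in zip(z, z_bar))
    within = sum((v - zb) ** 2 for zi, zb in zip(z, z_bar) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Two tiny invented groups, the second more spread out than the first.
w = levene_w([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
print("Levene W:", w)

# Degrees of freedom for the income example: 8 groups, 709 subjects.
n_groups, n_subjects = 4 * 2, 709
df1, df2 = n_groups - 1, n_subjects - n_groups
print("df1, df2:", df1, df2)
```

The statistic is referred to an F distribution with df1 and df2 degrees of freedom, which matches the 7 and 701 shown in the SPSS output.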
Finally, here are the results of the ANOVA test. A two-way ANOVA consists of three separate hypothesis tests, for
- the mean difference between levels of the first factor
- the mean difference between levels of the second factor
- any other mean differences that may result from the combination of the factors
Tests of Between-Subjects Effects
Dependent Variable: rincom2

Source             Type III Sum of Squares   df    Mean Square   F          Sig.
Corrected Model    2236.798(a)               7     319.543       15.360     .000
Intercept          121426.954                1     121426.954    5836.953   .000
agecat4            1350.516                  3     450.172       21.640     .000
sex                842.073                   1     842.073       40.478     .000
agecat4 * sex      60.316                    3     20.105        .966       .408
Error              14583.002                 701   20.803
Total              149966.000                709
Corrected Total    16819.800                 708
a. R Squared = .133 (Adjusted R Squared = .124)
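The columns of this table are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, each F is a mean square divided by the error mean square, and R Squared is the Corrected Model sum of squares divided by the Corrected Total. The short sketch below recomputes these from the sums of squares and degrees of freedom printed in the table; the variable names are arbitrary.

```python
# Sums of squares and degrees of freedom copied from the SPSS table.
ss = {"agecat4": 1350.516, "sex": 842.073, "agecat4 * sex": 60.316}
df = {"agecat4": 3, "sex": 1, "agecat4 * sex": 3}
ss_error, df_error = 14583.002, 701
ss_model = 2236.798
ss_corrected_total, df_corrected_total = 16819.800, 708

# Mean square = sum of squares / df; F = effect mean square / error mean square.
ms_error = ss_error / df_error
f_ratio = {name: (ss[name] / df[name]) / ms_error for name in ss}

# R Squared and its adjusted version.
r_squared = ss_model / ss_corrected_total
adj_r_squared = 1 - (1 - r_squared) * df_corrected_total / df_error

print("Error mean square:", round(ms_error, 3))
for name, f in f_ratio.items():
    print(f"F for {name}: {f:.3f}")
print("R Squared:", round(r_squared, 3), "| Adjusted:", round(adj_r_squared, 3))
```

The recomputed values agree with the table to rounding. Note that the three effect sums of squares do not add up to the Corrected Model sum of squares here, which is expected for Type III sums of squares when the cell sizes are unequal.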
The first two are tests for the main effects. The null hypothesis is always that there are no differences between the levels of the factor.
Example: H0: μM - μF = 0
The third test is the test for interaction; the null hypothesis is that there is no interaction between the factors.
It is important to note that these tests are independent; the outcome of one test does not affect the outcome of any other test. Therefore, it is possible to have any combination of significant and nonsignificant main effects and interactions.
What do these results show? We can see that the two factors, Age and Sex, are both significant; however, the interaction is not, as the interaction plot suggested. And so we have two significant main effects and a nonsignificant interaction.
To investigate which groups are different, we can conduct a Scheffe post hoc test to compare all group combinations and identify any significant pairs. Here are the results of this test:

Multiple Comparisons
Dependent Variable: rincom2
Scheffe

(I) 4 categories   (J) 4 categories   Mean Difference                        95% Confidence Interval
of age             of age             (I-J)             Std. Error   Sig.    Lower Bound   Upper Bound
18-29              30-39              -2.1172*          .49334       .000    -3.4997       -.7348
                   40-49              -3.9165*          .51013       .000    -5.3461       -2.4870
                   50+                -3.2930*          .54550       .000    -4.8216       -1.7643
30-39              18-29              2.1172*           .49334       .000    .7348         3.4997
                   40-49              -1.7993*          .44207       .001    -3.0381       -.5605
                   50+                -1.1757           .48245       .116    -2.5277       .1762
40-49              18-29              3.9165*           .51013       .000    2.4870        5.3461
                   30-39              1.7993*           .44207       .001    .5605         3.0381
                   50+                .6235             .49961       .669    -.7765        2.0236
50+                18-29              3.2930*           .54550       .000    1.7643        4.8216
                   30-39              1.1757            .48245       .116    -.1762        2.5277
                   40-49              -.6235            .49961       .669    -2.0236       .7765
Based on observed means. The error term is Mean Square(Error) = 20.803.
*. The mean difference is significant at the .05 level.

We can see that the age category 18 - 29 differs significantly in income from all other age categories. In addition, those 30 - 39 are significantly different in income from those 40 - 49 years of age.
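The confidence intervals in the Multiple Comparisons table follow the usual Scheffe form: mean difference ± multiplier × standard error, where the multiplier is the square root of (k - 1) times the critical F value and k is the number of groups being compared (here, 4 age categories). The sketch below does not look up the critical F; instead it recovers the multiplier from a few rows of the table and checks that a pair is flagged as significant exactly when its interval excludes zero. The row selection and variable names are arbitrary.

```python
# (I, J, mean difference, std. error, lower bound, upper bound),
# copied from four rows of the Multiple Comparisons table.
rows = [
    ("18-29", "30-39", -2.1172, 0.49334, -3.4997, -0.7348),
    ("30-39", "40-49", -1.7993, 0.44207, -3.0381, -0.5605),
    ("30-39", "50+", -1.1757, 0.48245, -2.5277, 0.1762),
    ("40-49", "50+", 0.6235, 0.49961, -0.7765, 2.0236),
]
k = 4  # number of age categories compared

multipliers = []
significant = []
for i, j, diff, se, lower, upper in rows:
    half_width = (upper - lower) / 2
    multiplier = half_width / se          # should equal sqrt((k - 1) * F_crit)
    implied_f_crit = multiplier ** 2 / (k - 1)
    excludes_zero = lower > 0 or upper < 0
    multipliers.append(multiplier)
    significant.append(excludes_zero)
    print(f"{i} vs {j}: multiplier = {multiplier:.3f}, "
          f"implied critical F = {implied_f_crit:.3f}, significant = {excludes_zero}")
```

Every row gives essentially the same multiplier (about 2.80), as it should, since the critical F depends only on the number of groups and the error degrees of freedom, not on the particular pair.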
How do you report the results?
A two-way analysis of variance was conducted to investigate income differences by gender and age category among employees. The results show significant main effects for gender (F(1, 701) = 40.48, p < .001) and age category (F(3, 701) = 21.64, p < .001). The interaction between the factors was not significant (F(3, 701) = 0.97, p = .408).
The Scheffe post hoc test revealed that the age category of 18-29 differed significantly in income from the other age categories. In addition, the income for those employees 30-39 years of age differed significantly from those 40-49 years of age.
SPSS Instructions for Two-Way ANOVA
To create an interaction plot:
- Choose > Graphs > Chart Builder.
- Choose Line and drag the second graph (Multiple) to the preview area.
- Select the response variable and move it to the y-axis.
- Select one predictor and move it to the x-axis.
- Select the other predictor and move it to the Set Color area.
- Click OK.
To perform a Two-Way Analysis of Variance:
- Choose > Analyze > General Linear Model > Univariate.
- Identify the response variable and move it to the Dependent Variable list.
- Select the variables that define the groups and move them into the Fixed Factors box.
- Click on Save, select the unstandardized predicted values and residuals, and click on Continue.
- Click on Options and, under Display, select Descriptive statistics and Homogeneity tests. Click on Continue.
- Click OK.
You will then see the results for Levene's Test and the Tests of Between-Subjects Effects.
The predicted values and residuals will be saved in your data sheet as PRE_1 and RES_1.
In the Univariate dialog box, you can also choose Post-hoc… and then in the next dialog box, choose the Scheffe post hoc test. This will produce the Multiple Comparisons table.
Note that this analysis can also be used for a One-Way ANOVA so that the residuals and predicted values can be saved. Using the data for the Anorexia study, once these two values are saved, the residuals can be graphed.
Use > Analyze > Descriptive Statistics > Explore and choose the residuals as the dependent variable. Then choose Plots and select Histogram and Normality Plots with tests. Then click on Continue and OK.
The results include the following graphs of the residuals: