The Analysis of Variance (ANOVA)

Upload: caitlin-bates

Post on 06-Jan-2018

Page 1: The Analysis of Variance ANOVA

Hypothesis Testing

The Analysis of Variance (ANOVA)

Page 2: The Analysis of Variance ANOVA

Introduction

ANOVA handles situations with more than two samples or categories to compare.

It is easiest to think of ANOVA as an extension of the t test for the significance of the difference between two sample means (Chap. 9). The t test, however, was limited to the two-sample case.

Example from your book: We want to find out whether attitude toward capital punishment is significantly related to age. We will also want to know which age group shows the most support for capital punishment.

Page 3: The Analysis of Variance ANOVA

Example in your book

Table 10.1 shows little difference among the age groups. The means are about the same, and the standard deviation is the same for both tables.

What does this tell you?

The groups all show about the same support for capital punishment.

There is also about the same amount of diversity in support for capital punishment within each group.

This pattern would support the null hypothesis.

Page 4: The Analysis of Variance ANOVA

Table 10.2

Higher scores indicate greater opposition to the death penalty. The oldest group shows the least support for capital punishment, and the youngest group shows the most support.

The greater the differences between categories relative to the differences within categories, the more likely it is that the null is false and that there really is a difference among the groups.

If the groups really are different, then the sample mean for each should be quite different from the others, and the dispersion within the categories should be relatively low.

Page 5: The Analysis of Variance ANOVA

The Logic of the Analysis of Variance

The null hypothesis for ANOVA is that the populations from which the samples are drawn are equal on the characteristic of interest.

In other words, the null hypothesis for ANOVA is that the population means are equal.

For the example, the null states that people of various age groups do not vary in their support for the death penalty. If the null is true, then the average score for the youngest group should be about the same as the average score for each of the other groups.

Page 6: The Analysis of Variance ANOVA

Logic, continued

The averages are unlikely to be exactly the same value, even if the null really is true, since there are always some error or chance fluctuations in the measurement process.

Therefore, we are not asking whether there are differences among the age groups in the sample. We are asking whether the differences among the age groups are large enough to justify a decision to reject the null hypothesis and say there are differences in the populations.

The researcher will be interested in rejecting the null, that is, in showing that support for capital punishment is related to age.

Page 7: The Analysis of Variance ANOVA

Logic, continued

Basically, ANOVA compares the amount of variation between categories with the amount of variation within categories.

The greater the differences between categories, relative to the differences within categories, the more likely that the null of “no difference” is false and can be rejected.
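The comparison of between-category and within-category variation can be sketched in a few lines of Python. The scores below are hypothetical (they are not from the book's tables), and the helper name `between_within` is an assumption made for illustration. Both data sets have the same group means, so the variation between categories is identical. Only the spread within the categories changes.

```python
def between_within(groups):
    """Return (between, within) sums of squared deviations for a list of groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    between = 0.0
    within = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        # Each group mean's distance from the grand mean, weighted by group size
        between += len(g) * (mean - grand_mean) ** 2
        # Each score's distance from its own group mean
        within += sum((x - mean) ** 2 for x in g)
    return between, within


# Hypothetical scores: both data sets have group means of 10, 13, and 16.
tight = [[9, 10, 11], [12, 13, 14], [15, 16, 17]]
wide = [[4, 10, 16], [7, 13, 19], [10, 16, 22]]

for name, data in (("tight", tight), ("wide", wide)):
    b, w = between_within(data)
    print(name, "between:", b, "within:", w, "ratio:", b / w)
```

The "tight" data set has a much larger between-to-within ratio than the "wide" one, which is the pattern that makes a null of "no difference" less plausible.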

Page 8: The Analysis of Variance ANOVA

The Computation of ANOVA

We will be looking at the variances within samples and between samples.

The variance of a distribution is the standard deviation squared, and both are measures of dispersion or variability (or measures of heterogeneity).
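As a quick check of the relationship between the two measures, the following sketch squares a standard deviation and compares it with the variance. The scores are hypothetical, and using the population versions (`pstdev`, `pvariance`) from Python's standard `statistics` module is an assumption made for illustration.

```python
import statistics

scores = [1, 2, 2, 3, 2]  # hypothetical scores, not from the book

sd = statistics.pstdev(scores)      # population standard deviation
var = statistics.pvariance(scores)  # population variance

# The variance should equal the standard deviation squared
# (up to floating-point rounding).
print("standard deviation:", sd)
print("variance:", var)
print("sd squared:", sd ** 2)
```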

Page 9: The Analysis of Variance ANOVA

Computation, continued

We will have two separate estimates of the population variance.

One is based on the pattern of variation within the categories, which is called the sum of squares within (SSW).

The other is based on the variation between categories and is called the sum of squares between (SSB).

Together these add up to the total sum of squares (SST). The relationship of the three sums of squares is given by Formula 10.2:

SST = SSB + SSW
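The relationship SST = SSB + SSW can be verified numerically. The sketch below uses hypothetical scores for three age groups (not the book's data), and the function name `sums_of_squares` is an assumption made for illustration.

```python
def sums_of_squares(groups):
    """Return (SST, SSB, SSW) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)

    # SST: squared deviations of every score from the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_scores)

    ssb = 0.0
    ssw = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        # SSB: squared deviation of each group mean from the grand mean,
        # weighted by the number of cases in the group
        ssb += len(g) * (mean - grand_mean) ** 2
        # SSW: squared deviations of scores from their own group mean
        ssw += sum((x - mean) ** 2 for x in g)
    return sst, ssb, ssw


# Hypothetical scores for three age groups
young = [1, 2, 2, 3, 2]
middle = [3, 4, 4, 5, 4]
old = [5, 6, 6, 7, 6]

sst, ssb, ssw = sums_of_squares([young, middle, old])
print("SST:", sst, "SSB:", ssb, "SSW:", ssw)
print("SSB + SSW:", ssb + ssw)
```

For these scores the total is 46, made up of 40 between the groups and 6 within them.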

Page 10: The Analysis of Variance ANOVA

Five-Step Model for ANOVA

Page 11: The Analysis of Variance ANOVA

Step 1

In the ANOVA test, the assumption that must be made with regard to the population variances is that they are equal.

If they are not equal, then ANOVA cannot separate the effects of different means from the effects of different variances.

If the sample sizes are nearly equal, some of the assumptions can be relaxed. If they are very different, it would be better to use the Chi Square test (in the next chapter), but you will have to collapse the data into a few categories.
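One informal way to look at the equal-variance assumption is to compare the sample variances of the groups side by side. This is only a sketch with hypothetical scores. It is not a formal test, and no cutoff for "too different" is implied.

```python
import statistics

# Hypothetical scores for three age groups (not the book's data)
groups = {
    "young": [1, 2, 2, 3, 2],
    "middle": [3, 4, 4, 5, 4],
    "old": [4, 6, 6, 8, 6],
}

# Sample variance (n - 1 in the denominator) and size for each group
variances = {name: statistics.variance(g) for name, g in groups.items()}
sizes = {name: len(g) for name, g in groups.items()}

for name in groups:
    print(name, "n =", sizes[name], "variance =", variances[name])

# Ratio of the largest to the smallest variance, for inspection only
ratio = max(variances.values()) / min(variances.values())
print("largest / smallest variance:", ratio)
```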

Page 12: The Analysis of Variance ANOVA

Step 2

The null hypothesis states that the means of the populations from which the samples were drawn are equal.

The alternative (research) hypothesis states simply that at least one of the population means is different. If we reject the null, ANOVA does not identify which of the means are significantly different.

In the ANOVA test, if the null hypothesis is true, then the between and within estimates of the population variance (SSB and SSW, each divided by its degrees of freedom) should be roughly equal in value.

Page 13: The Analysis of Variance ANOVA

Step 3

Selecting the sampling distribution and establishing the critical region

The sampling distribution for ANOVA is the F distribution, which is summarized in Appendix D.

There are separate tables for alphas of .05 and .01.

The value of the critical F score will vary by degrees of freedom.

Page 14: The Analysis of Variance ANOVA

Step 3, continued

For ANOVA, there are two separate degrees of freedom, one for each estimate of the population variance.

The numbers across the top of the table are the degrees of freedom associated with the between estimate (dfb), and the numbers down the side of the table are those associated with the within estimate (dfw).

In the two F tables, all the values are greater than 1.00. This is because ANOVA is a one-tailed test and we are concerned only with outcomes in which there is more variance between categories than within categories.

F values of less than 1.00 would indicate that the between estimate was lower in value than the within estimate. Since we would always fail to reject the null in such cases, we simply ignore this class of outcomes.
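The two degrees of freedom used to look up the critical F can be computed directly from the data layout. The sketch below assumes the standard one-way definitions, dfb = k - 1 and dfw = N - k, where k is the number of categories and N is the total number of cases. The scores and the function name are hypothetical.

```python
def degrees_of_freedom(groups):
    """Return (dfb, dfw) for a one-way ANOVA on a list of groups."""
    k = len(groups)                  # number of categories
    n = sum(len(g) for g in groups)  # total number of cases
    dfb = k - 1                      # between estimate (top of the F table)
    dfw = n - k                      # within estimate (side of the F table)
    return dfb, dfw


# Hypothetical scores for three age groups, five cases each
groups = [[1, 2, 2, 3, 2], [3, 4, 4, 5, 4], [5, 6, 6, 7, 6]]

dfb, dfw = degrees_of_freedom(groups)
print("dfb:", dfb, "dfw:", dfw)
```

With three groups of five cases, dfb is 2 and dfw is 12.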

Page 15: The Analysis of Variance ANOVA

Step 4

Computing the test statistic. This is the F ratio: the between estimate of the population variance divided by the within estimate.
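A minimal sketch of the computation follows, assuming the usual one-way definitions: each sum of squares is divided by its degrees of freedom to give a mean square estimate, and F is the between estimate divided by the within estimate. The scores and the function name `f_ratio` are hypothetical.

```python
def f_ratio(groups):
    """Return the obtained F ratio for a one-way ANOVA on a list of groups."""
    all_scores = [x for g in groups for x in g]
    n = len(all_scores)
    k = len(groups)
    grand_mean = sum(all_scores) / n

    ssb = 0.0
    ssw = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        ssb += len(g) * (mean - grand_mean) ** 2
        ssw += sum((x - mean) ** 2 for x in g)

    msb = ssb / (k - 1)  # between estimate of the population variance
    msw = ssw / (n - k)  # within estimate of the population variance
    return msb / msw


# Hypothetical scores for three age groups
groups = [[1, 2, 2, 3, 2], [3, 4, 4, 5, 4], [5, 6, 6, 7, 6]]

f_obtained = f_ratio(groups)
print("F (obtained):", f_obtained)
```

Here SSB is 40 with 2 degrees of freedom and SSW is 6 with 12, so the estimates are 20 and 0.5 and the obtained F is 40.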

Page 16: The Analysis of Variance ANOVA

Step 5

Making a decision

If our F (obtained) exceeds the F (critical), we reject the null.

So, in the test of ANOVA, if the test statistic falls in the critical region, we may conclude that at least one population mean is different.
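The decision rule can be written as a simple comparison. In this sketch the critical value of 3.89 is the standard F-table entry for alpha = .05 with dfb = 2 and dfw = 12. The obtained F values and the function name `decide` are hypothetical.

```python
def decide(f_obtained, f_critical):
    """Apply the ANOVA decision rule to an obtained and a critical F."""
    if f_obtained > f_critical:
        return "reject the null"
    return "fail to reject the null"


# Critical F looked up in a standard F table: alpha = .05, dfb = 2, dfw = 12
F_CRITICAL = 3.89

# Hypothetical obtained F values
print(decide(40.0, F_CRITICAL))  # falls in the critical region
print(decide(0.8, F_CRITICAL))   # below 1.00, so never in the critical region
```

Rejecting the null supports only the conclusion that at least one population mean differs. It does not say which one.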

Page 17: The Analysis of Variance ANOVA

The Limitations of the Test

ANOVA is appropriate whenever you want to test the significance of a difference across three or more categories of a single variable.

This application is called one-way analysis of variance, since we observe the effect of a single variable (age) on another (support for capital punishment). Another example would be the effect of region of residence on TV viewing.

But the test has other applications. You may have a research project in which the effects of two separate variables (e.g., age and gender) on some third variable were observed (a two-way analysis of variance).

Page 18: The Analysis of Variance ANOVA

Limitations, continued

The major limitations of ANOVA are that it requires interval-ratio measurement for the dependent variable, nominal or ordinal measurement for the independent variable, and roughly equal numbers of cases in each of the categories.

Most variables in the social sciences are not interval-ratio.

The requirement of roughly equal numbers of cases is sometimes difficult to meet, since you may want to compare groups that are unequal in size. You may therefore need to sample equal numbers from each group.

Page 19: The Analysis of Variance ANOVA

Limitations, continued

Another major limitation is that ANOVA does not tell you which category or categories are different if the null is rejected.

You can sometimes determine this by inspection of the sample means, but you need to be cautious when drawing conclusions about which means are significantly different.