Slide 1: © 2006 Thomson/South-Western. Slides prepared by John S. Loucks, St. Edward’s University...
TRANSCRIPT
Slide 1
© 2006 Thomson/South-Western
Slides Prepared by
JOHN S. LOUCKS
St. Edward’s University
Slide 2
Chapter 10
Comparisons Involving Means
Part B
Introduction to Analysis of Variance
Analysis of Variance: Testing for the Equality of $k$ Population Means
Slide 3
Introduction to Analysis of Variance
Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means.
Data obtained from observational or experimental studies can be used for the analysis.
We want to use the sample results to test the following hypotheses:
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
$H_a$: Not all population means are equal
Slide 4
Introduction to Analysis of Variance
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
$H_a$: Not all population means are equal
If $H_0$ is rejected, we cannot conclude that all population means are different.
Rejecting $H_0$ means that at least two population means have different values.
Slide 5
Introduction to Analysis of Variance
Sampling Distribution of $\bar{x}$ Given $H_0$ is True
[Figure: a single sampling distribution of $\bar{x}$, with the sample means $\bar{x}_1$, $\bar{x}_3$, $\bar{x}_2$ marked on the horizontal axis.]
Sample means are close together because there is only one sampling distribution when $H_0$ is true.
$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$$
Slide 6
Introduction to Analysis of Variance
Sampling Distribution of $\bar{x}$ Given $H_0$ is False
[Figure: three separate sampling distributions of $\bar{x}$ centered at $\mu_3$, $\mu_1$, $\mu_2$, with the sample means $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$ marked on the horizontal axis.]
Sample means come from different sampling distributions and are not as close together when $H_0$ is false.
Slide 7
Assumptions for Analysis of Variance
For each population, the response variable is normally distributed.
The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations.
The observations must be independent.
Slide 8
Analysis of Variance: Testing for the Equality of $k$ Population Means
Between-Treatments Estimate of Population Variance
Within-Treatments Estimate of Population Variance
Comparing the Variance Estimates: The $F$ Test
ANOVA Table
Slide 9
Between-Treatments Estimate of Population Variance
A between-treatments estimate of $\sigma^2$ is called the mean square treatment and is denoted MSTR.
$$\text{MSTR} = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$
The numerator is the sum of squares due to treatments and is denoted SSTR.
The denominator represents the degrees of freedom associated with SSTR.
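As a rough illustration of the MSTR formula, the following Python sketch computes SSTR and MSTR from summary statistics. The function name `mstr` and its arguments are illustrative choices, not something defined in the slides; the inputs are the sample means and sample sizes from the Reed Manufacturing example that appears later in this deck.

```python
def mstr(sample_means, sample_sizes):
    """Return (SSTR, MSTR) from per-treatment sample means and sample sizes."""
    k = len(sample_means)
    n_total = sum(sample_sizes)
    # Overall mean, weighted by sample size so unequal sizes are handled too.
    overall_mean = sum(n * m for n, m in zip(sample_sizes, sample_means)) / n_total
    # Numerator of MSTR: sum of squares due to treatments.
    sstr = sum(n * (m - overall_mean) ** 2 for n, m in zip(sample_sizes, sample_means))
    # Denominator: k - 1 degrees of freedom.
    return sstr, sstr / (k - 1)


# Sample means 55, 68, 57 with five observations per plant.
sstr, mstr_value = mstr([55, 68, 57], [5, 5, 5])
print(sstr, mstr_value)
```

With equal sample sizes the weighted overall mean reduces to the simple average of the sample means, which is the shortcut the worked example uses.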
Slide 10
Within-Samples Estimate of Population Variance
The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square error and is denoted by MSE.
$$\text{MSE} = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$
The numerator is the sum of squares due to error and is denoted SSE.
The denominator represents the degrees of freedom associated with SSE.
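The MSE formula can be sketched in Python in the same spirit. The function name `mse` is a hypothetical helper, not part of the slides; the inputs are the sample variances and sample sizes from the Reed Manufacturing example later in the deck.

```python
def mse(sample_variances, sample_sizes):
    """Return (SSE, MSE) from per-treatment sample variances and sample sizes."""
    k = len(sample_variances)
    n_total = sum(sample_sizes)
    # Numerator of MSE: sum of squares due to error, pooled across samples.
    sse = sum((n - 1) * s2 for n, s2 in zip(sample_sizes, sample_variances))
    # Denominator: n_T - k degrees of freedom.
    return sse, sse / (n_total - k)


# Sample variances 26.0, 26.5, 24.5 with five observations per plant.
sse, mse_value = mse([26.0, 26.5, 24.5], [5, 5, 5])
print(sse, round(mse_value, 3))
```

Because each sample variance is weighted by its own degrees of freedom, MSE is a pooled variance estimate that does not depend on whether the population means are equal.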
Slide 11
Comparing the Variance Estimates: The $F$ Test
If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSTR/MSE is an $F$ distribution with MSTR d.f. equal to $k - 1$ and MSE d.f. equal to $n_T - k$.
If the means of the $k$ populations are not equal, the value of MSTR/MSE will be inflated because MSTR overestimates $\sigma^2$.
Hence, we will reject $H_0$ if the resulting value of MSTR/MSE appears to be too large to have been selected at random from the appropriate $F$ distribution.
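A small simulation can make the "inflated ratio" argument concrete. The sketch below is only an illustration under assumed settings: normal populations with a common standard deviation of 5, three samples of size 5, and the critical value 3.89 quoted later in the slides for 2 and 12 d.f. The helper names `f_statistic` and `rejection_rate` are hypothetical.

```python
import random
import statistics


def f_statistic(samples):
    """Compute MSTR/MSE for a list of samples (one list per treatment)."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    overall_mean = sum(sum(s) for s in samples) / n_total
    sstr = sum(len(s) * (statistics.mean(s) - overall_mean) ** 2 for s in samples)
    sse = sum((len(s) - 1) * statistics.variance(s) for s in samples)
    return (sstr / (k - 1)) / (sse / (n_total - k))


def rejection_rate(population_means, sigma, n, critical_value, reps, rng):
    """Fraction of simulated experiments in which F exceeds the critical value."""
    rejections = 0
    for _ in range(reps):
        samples = [[rng.gauss(mu, sigma) for _ in range(n)] for mu in population_means]
        if f_statistic(samples) > critical_value:
            rejections += 1
    return rejections / reps


rng = random.Random(0)  # fixed seed so the run is repeatable
# Assumed settings: sigma = 5, n = 5 per sample, critical value 3.89 (2 and 12 d.f.).
rate_h0_true = rejection_rate([60, 60, 60], 5, 5, 3.89, 2000, rng)
rate_h0_false = rejection_rate([55, 68, 57], 5, 5, 3.89, 2000, rng)
print(rate_h0_true, rate_h0_false)
```

When the population means are equal, the rejection rate should land near the .05 significance level; when they differ, large values of MSTR/MSE become much more common.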
Slide 12
Test for the Equality of $k$ Population Means
Hypotheses
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
$H_a$: Not all population means are equal
Test Statistic
$F$ = MSTR/MSE
Slide 13
Test for the Equality of $k$ Population Means
Rejection Rule
$p$-value Approach: Reject $H_0$ if $p$-value $< \alpha$
Critical Value Approach: Reject $H_0$ if $F > F_\alpha$
where the value of $F_\alpha$ is based on an $F$ distribution with $k - 1$ numerator d.f. and $n_T - k$ denominator d.f.
Slide 14
Sampling Distribution of MSTR/MSE
[Figure: sampling distribution of MSTR/MSE, with MSTR/MSE on the horizontal axis. Values below the critical value $F_\alpha$ fall in the "Do Not Reject $H_0$" region; values above it fall in the rejection region, "Reject $H_0$".]
Slide 15
ANOVA Table
Source of Variation    Sum of Squares    Degrees of Freedom    Mean Squares    F
Treatment              SSTR              $k - 1$               MSTR            MSTR/MSE
Error                  SSE               $n_T - k$             MSE
Total                  SST               $n_T - 1$
SST is partitioned into SSTR and SSE.
SST’s degrees of freedom (d.f.) are partitioned into SSTR’s d.f. and SSE’s d.f.
Slide 16
ANOVA Table
SST divided by its degrees of freedom $n_T - 1$ is the overall sample variance that would be obtained if we treated the entire set of observations as one data set.
With the entire data set as one sample, the formula for computing the total sum of squares, SST, is:
$$\text{SST} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = \text{SSTR} + \text{SSE}$$
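The identity SST = SSTR + SSE can be checked by splitting each deviation from the overall mean into a within-sample part and a between-sample part. The following is a standard derivation, included here as a sketch; $\bar{x}_j$ and $s_j^2$ denote the mean and variance of sample $j$, as in the MSTR and MSE formulas.

```latex
\begin{align*}
\text{SST}
  &= \sum_{j=1}^{k} \sum_{i=1}^{n_j}
     \bigl[ (x_{ij} - \bar{x}_j) + (\bar{x}_j - \bar{\bar{x}}) \bigr]^2 \\
  &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2
   + \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2
   + 2 \sum_{j=1}^{k} (\bar{x}_j - \bar{\bar{x}})
       \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j) \\
  &= \sum_{j=1}^{k} (n_j - 1) s_j^2
   + \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2 + 0 \\
  &= \text{SSE} + \text{SSTR}
\end{align*}
```

The cross term vanishes because the deviations of the observations in sample $j$ from their own mean $\bar{x}_j$ sum to zero.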
Slide 17
ANOVA Table
ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error.
Dividing the sum of squares by the appropriate degrees of freedom provides the variance estimates and the $F$ value used to test the hypothesis of equal population means.
Slide 18
Test for the Equality of $k$ Population Means
Example: Reed Manufacturing
Janet Reed would like to know if there is any significant difference in the mean number of hours worked per week for the department managers at her three manufacturing plants (in Buffalo, Pittsburgh, and Detroit).
Slide 19
Test for the Equality of $k$ Population Means
Example: Reed Manufacturing
A simple random sample of five managers from each of the three plants was taken and the number of hours worked by each manager for the previous week is shown on the next slide.
Conduct an $F$ test using $\alpha$ = .05.
Slide 20
Test for the Equality of $k$ Population Means
Observation        Plant 1 Buffalo    Plant 2 Pittsburgh    Plant 3 Detroit
1                  48                 73                    51
2                  54                 63                    63
3                  57                 66                    61
4                  54                 64                    54
5                  62                 74                    56
Sample Mean        55                 68                    57
Sample Variance    26.0               26.5                  24.5
Slide 21
Test for the Equality of $k$ Population Means
$p$-Value and Critical Value Approaches
1. Develop the hypotheses.
$H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_a$: Not all the means are equal
where:
$\mu_1$ = mean number of hours worked per week by the managers at Plant 1
$\mu_2$ = mean number of hours worked per week by the managers at Plant 2
$\mu_3$ = mean number of hours worked per week by the managers at Plant 3
Slide 22
Test for the Equality of $k$ Population Means
$p$-Value and Critical Value Approaches
2. Specify the level of significance.
$\alpha$ = .05
3. Compute the value of the test statistic.
Mean Square Due to Treatments
(Sample sizes are all equal.)
$\bar{\bar{x}}$ = (55 + 68 + 57)/3 = 60
SSTR = 5(55 - 60)$^2$ + 5(68 - 60)$^2$ + 5(57 - 60)$^2$ = 490
MSTR = 490/(3 - 1) = 245
Slide 23
Test for the Equality of $k$ Population Means
$p$-Value and Critical Value Approaches
3. Compute the value of the test statistic. (continued)
Mean Square Due to Error
SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308
MSE = 308/(15 - 3) = 25.667
$F$ = MSTR/MSE = 245/25.667 = 9.55
Slide 24
Test for the Equality of $k$ Population Means
ANOVA Table
Source of Variation    Sum of Squares    Degrees of Freedom    Mean Squares    F
Treatment              490               2                     245             9.55
Error                  308               12                    25.667
Total                  798               14
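The table can be reproduced from the raw observations. The following Python sketch is one possible way to do so; the function name `anova_table` and the dictionary layout are illustrative assumptions, and only the hours data come from the slides.

```python
import statistics

# Hours worked by the five sampled managers at each plant (from the data slide).
hours = {
    "Buffalo": [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit": [51, 63, 61, 54, 56],
}


def anova_table(groups):
    """Return {source: (sum of squares, d.f., mean square, F)} for one-way ANOVA."""
    samples = list(groups.values())
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    overall_mean = sum(sum(s) for s in samples) / n_total
    sstr = sum(len(s) * (statistics.mean(s) - overall_mean) ** 2 for s in samples)
    sse = sum(sum((x - statistics.mean(s)) ** 2 for x in s) for s in samples)
    # SST is computed directly from the pooled data, so it serves as a check
    # on the partition SST = SSTR + SSE.
    sst = sum((x - overall_mean) ** 2 for s in samples for x in s)
    mstr, mse = sstr / (k - 1), sse / (n_total - k)
    return {
        "Treatment": (sstr, k - 1, mstr, mstr / mse),
        "Error": (sse, n_total - k, mse, None),
        "Total": (sst, n_total - 1, None, None),
    }


table = anova_table(hours)
for source, (ss, df, ms, f) in table.items():
    ms_text = "" if ms is None else f"{ms:.3f}"
    f_text = "" if f is None else f"{f:.2f}"
    print(f"{source:<10}{ss:>8.0f}{df:>6}{ms_text:>10}{f_text:>8}")
```

Computing SST separately rather than as SSTR + SSE gives an independent check that the two components add up to the total.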
Slide 25
Test for the Equality of $k$ Population Means
$p$-Value Approach
4. Compute the $p$-value.
With 2 numerator d.f. and 12 denominator d.f., the $p$-value is .01 for $F$ = 6.93. Therefore, the $p$-value is less than .01 for $F$ = 9.55.
5. Determine whether to reject $H_0$.
The $p$-value $<$ .05, so we reject $H_0$.
We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all 3 plants.
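The table lookup above only bounds the $p$-value. When the numerator has exactly 2 d.f., as it does here, the upper-tail area of the $F$ distribution has a simple closed form, $(1 + 2F/d_2)^{-d_2/2}$, where $d_2$ is the denominator d.f. The sketch below relies on that special case only; the function name is a hypothetical helper, and for other numerator d.f. a statistics library would be needed instead.

```python
def f_p_value_2_numerator_df(f_value, denominator_df):
    """Upper-tail area of an F distribution with exactly 2 numerator d.f.

    Uses the closed form (1 + 2F/d2) ** (-d2/2), which holds only when the
    numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f_value / denominator_df) ** (-denominator_df / 2)


# Check against the table value quoted on the slide: F = 6.93 gives about .01.
p_at_table_value = f_p_value_2_numerator_df(6.93, 12)
# p-value for the test statistic MSTR/MSE = 245 / (308/12).
p_at_test_statistic = f_p_value_2_numerator_df(245 / (308 / 12), 12)
print(round(p_at_table_value, 4), round(p_at_test_statistic, 4))
```

The computed $p$-value for the test statistic is roughly .003, consistent with the slide's conclusion that it is less than .01.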
Slide 26
Test for the Equality of $k$ Population Means
Critical Value Approach
4. Determine the critical value and rejection rule.
Based on an $F$ distribution with 2 numerator d.f. and 12 denominator d.f., $F_{.05}$ = 3.89.
Reject $H_0$ if $F >$ 3.89
5. Determine whether to reject $H_0$.
Because $F$ = 9.55 $>$ 3.89, we reject $H_0$.
We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all 3 plants.
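The critical value 3.89 normally comes from an $F$ table. For the special case of 2 numerator d.f., the upper-tail area $(1 + 2F/d_2)^{-d_2/2}$ can be solved for $F$, giving $F_\alpha = (d_2/2)(\alpha^{-2/d_2} - 1)$. The sketch below uses that special case to reproduce the critical value approach; the function name is a hypothetical helper and the formula does not apply to other numerator d.f.

```python
def f_critical_value_2_numerator_df(alpha, denominator_df):
    """Value exceeded with probability alpha under F with 2 numerator d.f.

    Inverts the closed-form tail area (1 + 2F/d2) ** (-d2/2) = alpha,
    which is valid only when the numerator degrees of freedom equal 2.
    """
    return (denominator_df / 2) * (alpha ** (-2 / denominator_df) - 1)


f_alpha = f_critical_value_2_numerator_df(0.05, 12)
f_statistic = 245 / (308 / 12)  # MSTR / MSE from the worked example
reject_h0 = f_statistic > f_alpha
print(round(f_alpha, 2), round(f_statistic, 2), reject_h0)
```

Both approaches must agree: the test statistic exceeds the critical value exactly when its $p$-value is below $\alpha$.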