ANOVA: Determining Which Means Differ in Single Factor Models

Upload: aubrey-oke

Post on 01-Apr-2015


Page 1

ANOVA

Determining Which Means Differ in Single Factor Models

Page 2

Single Factor Models: Review of Assumptions

• Recall that the problem solved by ANOVA is to determine whether at least one of the true mean values of several different treatments differs from the others.

• For ANOVA we assumed:

1. The distribution of the population for each treatment is normal.

2. The standard deviations of each population, although unknown, are equal.

3. Sampling is random and independent.

Page 3

Determining Which Means Differ: Basic Concept

• Suppose the result of performing a single factor ANOVA test is a low p-value, which indicates that at least one population mean does, in fact, differ from the others.

• The natural question is, “Which differ?”

• The answer is that we conclude that two population means differ if their two sample means differ by “a lot”.
  – The statistical question is, “What is ‘a lot’?”

Page 4

Example

• The length of battery life for notebook computers is of concern to computer manufacturers.

• Toshiba is considering 5 different battery models (A, B, C, D, E) that have different costs.

• The question is, “Is there enough evidence to show that average battery life differs among battery types?”

Page 5

Data

Battery     A        B      C      D      E
            130      90     100    140    160
            115      80     95     150    150
            130      95     110    150    155
            125      98     100    125    145
            120      92     105    145    165
            110      85     90     130    125
Mean        121.67   90     100    140    150

Grand mean (all 30 observations) = 120.33
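The single factor ANOVA quantities for this data set can be recomputed by hand. The following is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the dictionary layout are illustrative choices, not part of the original slides, which rely on Excel output.

```python
# Battery-life data from the slide, one list per battery model.
data = {
    "A": [130, 115, 130, 125, 120, 110],
    "B": [90, 80, 95, 98, 92, 85],
    "C": [100, 95, 110, 100, 105, 90],
    "D": [140, 150, 150, 125, 145, 130],
    "E": [160, 150, 155, 145, 165, 125],
}

def one_way_anova(groups):
    """Return the main pieces of a single factor ANOVA table (hypothetical helper)."""
    k = len(groups)
    n_total = sum(len(v) for v in groups.values())
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    grand_mean = sum(sum(v) for v in groups.values()) / n_total
    # Between-treatment and within-treatment (error) sums of squares.
    sstr = sum(len(v) * (means[g] - grand_mean) ** 2 for g, v in groups.items())
    sse = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
    dftr, dfe = k - 1, n_total - k
    mstr, mse = sstr / dftr, sse / dfe
    return {"means": means, "grand_mean": grand_mean,
            "DFE": dfe, "MSE": mse, "F": mstr / mse}

table = one_way_anova(data)
for name, value in table.items():
    print(name, value)
```

The MSE and DFE produced here are the two quantities that the Fisher LSD calculations later in the deck depend on.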

Page 6

Page 7

Output

p-value = .000000000108

We can conclude that differences exist among the mean battery lives.

Page 8

Motivation for the Fisher Procedure

• Fisher’s Procedure is a natural extension of the comparison of two population means when the unknown variances are assumed to be equal.
  – Recall this is an assumption in single factor ANOVA.

• Testing for the difference of two population means (with equal but unknown σ’s) has the form:

H0: μ1 – μ2 = 0
HA: μ1 – μ2 ≠ 0

Reject H0 (Accept HA) if:

$$ t > t_{\alpha/2,\,DF} \quad \text{or} \quad t < -t_{\alpha/2,\,DF}, \qquad \text{where } t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{(\text{best estimate for } \sigma^2)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} $$

What do we use for these two values: the best estimate for σ² and the degrees of freedom, DF?

Page 9

Best Estimate for σ² and the Appropriate Degrees of Freedom

• Recall that when there were only 2 populations, the best estimate for σ² is sp² and the degrees of freedom is (n1-1) + (n2-1) or n1 + n2 - 2.

• For ANOVA, using all the information from the k populations, the best estimate for σ² is MSE and the degrees of freedom is DFE.

                          Two populations
                          with equal variances    ANOVA
Best estimate for σ²      sp²                     MSE
Degrees of freedom        n1 + n2 – 2             DFE
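One way to see why MSE takes over the role of sp² is to write both as weighted averages of the sample variances. The notation s_i² for the sample variance of treatment i is introduced here for illustration and does not appear on the slides:

$$ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}, \qquad \text{MSE} = \frac{\text{SSE}}{\text{DFE}} = \frac{\sum_{i=1}^{k}(n_i - 1)s_i^2}{\sum_{i=1}^{k}(n_i - 1)} $$

With k = 2 the expression for MSE reduces to sp², and DFE reduces to n1 + n2 – 2.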

Page 10

Two Types of Tests

• There are two types of tests that can be applied:

1. A test or a confidence interval for the difference in two particular means
   • e.g. µE and µB

2. A set of tests which determine differences among all means.
   • This is called a set of experimentwise (EW) tests.

• The approach is the same.
  – We will illustrate an approach called the Fisher LSD approach.
  – Only the value used for α will be different.

Page 11

Determining if μi Differs From μj: Fisher’s LSD Approach

H0: μi – μj = 0
HA: μi – μj ≠ 0

Reject H0 (Accept HA) if:

$$ t > t_{\alpha/2,\,DFE} \quad \text{or} \quad t < -t_{\alpha/2,\,DFE}, \qquad \text{where } t = \frac{(\bar{x}_i - \bar{x}_j) - 0}{\sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}} $$

That is, we conclude there is a difference between μi and μj if

$$ |\bar{x}_i - \bar{x}_j| > t_{\alpha/2,\,DFE}\sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = \text{LSD} $$

LSD stands for “Least Significant Difference”

Page 12

When Do We Conclude Two Treatment Means (µi and µj) Differ?

• We conclude that two means differ if their sample means, x̄i and x̄j, differ by “a lot”.

• “A lot” is LSD given by:

$$ \text{LSD} = t_{\alpha/2,\,DFE}\sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} $$

Page 13

Confidence Intervals for the Difference in Two Population Means

• A confidence interval for μi – μj is found by:

Confidence Interval for μi – μj

$$ (\bar{x}_i - \bar{x}_j) \pm t_{\alpha/2,\,DFE}\sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} $$

or, equivalently,

$$ (\bar{x}_i - \bar{x}_j) \pm \text{LSD} $$

Page 14

Equal vs. Unequal Sample Sizes

• If the sample sizes drawn from the various populations differ, then the denominator of the t-statistic will be different for each pairwise comparison.

• But if the sample sizes are equal (n1 = n2 = n3 = …), we can designate the equal sample size by N.

• Then the t-test becomes: Reject H0 (Accept HA) if:

$$ t > t_{\alpha/2,\,DFE} \quad \text{or} \quad t < -t_{\alpha/2,\,DFE}, \qquad \text{where } t = \frac{(\bar{x}_i - \bar{x}_j) - 0}{\sqrt{\text{MSE}\left(\frac{2}{N}\right)}} $$

that is, if

$$ |\bar{x}_i - \bar{x}_j| > t_{\alpha/2,\,DFE}\sqrt{\text{MSE}\left(\frac{2}{N}\right)} = \text{LSD} $$

Page 15

LSD For Equal Sample Sizes

$$ \text{LSD} = t_{\alpha/2,\,DFE}\sqrt{\text{MSE}\left(\frac{2}{N}\right)} $$

Page 16

What Do We Use For α?

• Recall that α is
  – In hypothesis tests: the probability of concluding that there is a difference when there is not.
  – In confidence intervals: the probability the interval will not contain the true difference in mean values.

• If doing a single comparison test or constructing a confidence interval, select α as usual.

• For an experimentwise comparison of all means, use αEW.
  – We will actually be conducting 10 t-tests:
    (1) μE – μD, (2) μE – μC, (3) μE – μB, (4) μE – μA, (5) μD – μC,
    (6) μD – μB, (7) μD – μA, (8) μC – μB, (9) μC – μA, (10) μB – μA

Page 17

αEW = The Probability of Making at Least One Type I Error

• Suppose each test has a probability of concluding that there is a difference when there is not (making a Type I error) = α.
  – Thus for each test, the probability of not making a Type I error is 1 – α.

• So the probability of not making any Type I errors on any of the 10 tests, treating the tests as independent, is (1 – α)^10.

• For α = .05, this is (.95)^10 = .5987.

• The probability of making at least one Type I error in this experiment is denoted by αEW.

• Here, αEW = 1 – .5987 = .4013. That is, the probability we make at least one mistake is now over 40%!

• To have a lower αEW, α for each test must be significantly reduced.
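The arithmetic above is easy to check for other values of α or other numbers of tests. This is a small sketch; the function name `experimentwise_alpha` is made up for illustration, and it assumes the tests are independent, as the calculation on this slide does.

```python
def experimentwise_alpha(alpha, c):
    """P(at least one Type I error) across c tests, each run at level alpha.

    Assumes the c tests are independent (an assumption, as on the slide).
    """
    return 1 - (1 - alpha) ** c

c = 10  # number of pairwise t-tests for the 5 battery types
alpha_ew = experimentwise_alpha(0.05, c)
print(round(alpha_ew, 4))  # 0.4013
```

Trying a smaller per-test α, for example `experimentwise_alpha(0.005, 10)`, shows how quickly the experimentwise rate falls back toward .05.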

Page 18

The Bonferroni Adjustment for α

• To make αEW reasonable, say .05, α for each test must be reduced.

• The Bonferroni Adjustment is as follows:

α for each test: for an experimentwise value, αEW, for each test use

α = αEW/c

where c = number of tests. For k treatments,

c = k(k-1)/2

NOTE: Decreasing α increases β, the probability of not concluding that there is a difference between two means when there really is one. Thus, some researchers are reluctant to make α too small because this can result in very high β values.

Page 19

What Should α for Each Test Be?

The required α values for the individual t-tests for αEW = .10 and αEW = .05 are:

Number of          α for each test    α for each test
treatments, k      (αEW = .10)        (αEW = .05)
3                  0.03333            0.01667
4                  0.01667            0.00833
5                  0.01000            0.00500
6                  0.00667            0.00333
7                  0.00476            0.00238
8                  0.00357            0.00179
9                  0.00278            0.00139
10                 0.00222            0.00111
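The per-test α values follow directly from α = αEW/c with c = k(k-1)/2. A short sketch that regenerates them is below; the helper name `bonferroni_alpha` is illustrative only.

```python
def bonferroni_alpha(alpha_ew, k):
    """Per-test alpha for all pairwise comparisons among k treatments."""
    c = k * (k - 1) // 2  # number of pairwise tests
    return alpha_ew / c

# Columns: k, per-test alpha for alpha_EW = .10, per-test alpha for alpha_EW = .05
for k in range(3, 11):
    print(k, f"{bonferroni_alpha(0.10, k):.5f}", f"{bonferroni_alpha(0.05, k):.5f}")
```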

Page 20

LSDEW For Multiple Comparison Tests

When doing the series of multiple comparison tests to determine which means differ, the test would be to conclude that µi differs from µj if:

$$ |\bar{x}_i - \bar{x}_j| > \text{LSD}_{EW} $$

where LSDEW is given by:

$$ \text{LSD}_{EW} = t_{(\alpha_{EW}/c)/2,\,DFE}\sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} \qquad \text{when } n_i \neq n_j $$

$$ \text{LSD}_{EW} = t_{(\alpha_{EW}/c)/2,\,DFE}\sqrt{\text{MSE}\left(\frac{2}{N}\right)} \qquad \text{when } n_i = n_j = N $$

Here αEW/c is the α for each test.

Page 21

Procedure for Testing Differences Among All Means

• We begin by calculating LSDEW, which we have shown will not change from test to test if the sample sizes are the same from each sample. That is the situation in the battery example that we illustrate here.
  – A different LSD would have to be calculated for each comparison if the sample sizes are different.

• Then we order the sample means and begin doing the tests, comparing the sample means in descending order. (In our example, x̄E = 150, x̄D = 140, x̄A = 121.67, x̄C = 100, x̄B = 90.)

Page 22

Procedure (continued)

• Subtract the second largest from the largest sample mean. If the difference is not greater than LSDEW, then subtract the third largest from the largest sample mean, and so forth, until we find a difference larger than LSDEW. At that point we have determined that the μi associated with the largest sample mean differs from another μj.

• Once we conclude that some μi differs from another μj, we can conclude that μi differs from all other μ’s whose corresponding x’s < xj.

• Repeat this process subtracting from the second largest sample mean, then the third, etc.

Page 23

Tests For the Battery Example

• For the battery example,

1. Which average battery lives can we conclude differ?
   – Multiple comparisons: use LSDEW

2. Give a 95% confidence interval for the difference in average battery lives between:
   • C batteries and B batteries
   • E batteries and B batteries
   – Individual comparisons: use LSD

Page 24

Battery Example Calculations

• Experimentwise error rate: αEW = .05

• For k = 5 populations, α = αEW/10 = .05/10 = .005

• From the Excel output: x̄E = 150, x̄D = 140, x̄A = 121.67, x̄C = 100, x̄B = 90; MSE = 94.05333, DFE = 25, N = 6 from each population

• Use TINV(.005,25) to generate t.0025,25 = 3.078203

$$ \text{LSD}_{EW} = t_{.0025,25}\sqrt{\text{MSE}\left(\frac{2}{N}\right)} = 3.078203\sqrt{94.05333\left(\frac{2}{6}\right)} = 17.2355 $$
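The LSDEW value can be reproduced outside Excel. The sketch below uses only the standard library and takes the critical t value as an input, using the TINV result quoted on this slide rather than computing it; the function name `lsd` is an illustrative choice.

```python
import math

def lsd(t_crit, mse, n_i, n_j):
    """Least significant difference: t * sqrt(MSE * (1/n_i + 1/n_j))."""
    return t_crit * math.sqrt(mse * (1 / n_i + 1 / n_j))

# Values quoted on the slide; T_EW is the Excel TINV(.005, 25) result.
MSE, N = 94.05333, 6
T_EW = 3.078203

lsd_ew = lsd(T_EW, MSE, N, N)
print(round(lsd_ew, 4))
```

Because the function takes n_i and n_j separately, the same call also covers the unequal sample size case, where a different LSD is needed for each pair.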

Page 25

Analysis of Which Means Differ

• We conclude that two population means differ if their sample means differ by more than LSDEW = 17.2355.

• Order the sample means and start with the largest:

  • E and D: x̄E – x̄D = 10 < 17.2355, so we cannot conclude E differs from D
  • E and A: x̄E – x̄A = 28.33 > 17.2355, so we can conclude E differs from A, C, B
  • D and A: x̄D – x̄A = 18.33 > 17.2355, so we can conclude D differs from A, C, B
  • A and C: x̄A – x̄C = 21.67 > 17.2355, so we can conclude A differs from C, B
  • C and B: x̄C – x̄B = 10 < 17.2355, so we cannot conclude C differs from B
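The comparisons can also be checked mechanically. The sketch below tests every one of the 10 pairs directly instead of using the ordered shortcut from the procedure; with equal sample sizes both routes lead to the same conclusions. The function name `differing_pairs` is hypothetical, and the means and LSDEW are the values quoted on the slides.

```python
from itertools import combinations

# Sample means and LSD_EW as quoted on the slides.
means = {"E": 150.0, "D": 140.0, "A": 121.67, "C": 100.0, "B": 90.0}
LSD_EW = 17.2355

def differing_pairs(sample_means, threshold):
    """Map each pair (larger, smaller) to True if the means differ by more than threshold."""
    ordered = sorted(sample_means, key=sample_means.get, reverse=True)
    return {
        (hi, lo): (sample_means[hi] - sample_means[lo]) > threshold
        for hi, lo in combinations(ordered, 2)
    }

results = differing_pairs(means, LSD_EW)
for (hi, lo), differs in results.items():
    verdict = "can conclude a difference" if differs else "cannot conclude a difference"
    print(f"{hi} vs {lo}: {means[hi] - means[lo]:.2f} -> {verdict}")
```

Only the E–D and C–B pairs fall below the threshold, which matches the five comparisons listed above.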

Page 26

LSD For Confidence Intervals

Confidence intervals for the difference between two mean values, μi and μj, are of the form:

$$ (\text{Point Estimate}) \pm t_{\alpha/2,\,DFE}\,(\text{Standard Error}) $$

• Point estimate: x̄i – x̄j

• α = .05 for a 95% confidence interval

• Standard error:

$$ \sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}, \qquad \text{which equals } \sqrt{\text{MSE}\left(\frac{2}{N}\right)} \text{ if } n_i = n_j = N $$

• The product of the t value and the standard error is LSD (not LSDEW).

Confidence Interval for μi – μj

$$ (\bar{x}_i - \bar{x}_j) \pm \text{LSD} $$

Page 27

LSD for Battery Example

For the battery example:

$$ \text{LSD} = t_{.025,25}\sqrt{\text{MSE}\left(\frac{2}{N}\right)} = 2.05954\sqrt{94.05333\left(\frac{2}{6}\right)} = 11.5318 $$

Page 28

The Confidence Intervals

• 95% confidence interval for the difference in mean battery lives between batteries of type C and batteries of type B.

• 95% confidence interval for the difference in mean battery lives between batteries of type E and batteries of type B.

Confidence Interval for μC – μB

$$ (\bar{x}_C - \bar{x}_B) \pm \text{LSD} = (100 - 90) \pm 11.5318, \quad \text{that is, from } -1.5318 \text{ to } 21.5318 $$

Confidence Interval for μE – μB

$$ (\bar{x}_E - \bar{x}_B) \pm \text{LSD} = (150 - 90) \pm 11.5318, \quad \text{that is, from } 48.4682 \text{ to } 71.5318 $$
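As a cross-check of the interval arithmetic, the sketch below recomputes LSD from the t value, MSE and N quoted in the deck and then forms both intervals. The helper name `diff_ci` is illustrative; the t value is taken from the slides rather than computed.

```python
import math

# Values quoted in the deck: MSE, N per group, and t_{.025,25}.
MSE, N = 94.05333, 6
T_025_25 = 2.05954

# LSD for an individual comparison with equal sample sizes.
lsd_value = T_025_25 * math.sqrt(MSE * 2 / N)

def diff_ci(mean_i, mean_j, half_width):
    """Confidence interval (lower, upper) for mu_i - mu_j."""
    d = mean_i - mean_j
    return d - half_width, d + half_width

ci_cb = diff_ci(100, 90, lsd_value)  # C versus B
ci_eb = diff_ci(150, 90, lsd_value)  # E versus B
print(round(lsd_value, 4), ci_cb, ci_eb)
```

The interval for μC – μB contains 0 while the interval for μE – μB does not, which agrees with the earlier finding that C and B cannot be declared different but E and B can.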

Page 29

Excel worksheet annotations (screenshot not recovered):

• Formula for LSDEW: =TINV(.005,C23)*SQRT(D23*(2/6))

• Formula for LSD: =TINV(.05,C23)*SQRT(D23*(2/6))

• Confidence interval limits: D15-D14-K3 and D15-D14+K3; D17-D14-K3 and D17-D14+K3

• Copy and paste the Average and Groups columns, then do a Z to A ordering.

• Compare differences to cell H3.

Page 30

Review

• The Fisher LSD Test
  – What to use for:
    • Best estimate of σ² = MSE
    • Degrees of freedom = DFE
  – Calculation of LSD

• Bonferroni Modification
  – Modify α so that αEW is reasonable
  – α = αEW/c, where the number of tests is c = k(k-1)/2
  – Calculation of LSDEW

• Excel Calculations