chapter 7 inferences regarding population variances

Chapter 7

Inferences Regarding Population Variances

Introduction

• Population Variance: Measure of average squared deviation of individual measurements around the mean

• Sample Variance: Measure of “average” squared deviation of a sample of measurements around their sample mean. Unbiased estimator of $\sigma^2$

$$\sigma^2 = V(Y) = E\left[(Y-\mu)^2\right] = \frac{\sum_{i=1}^{N} (Y_i-\mu)^2}{N}$$

$$s^2 = \frac{\sum_{i=1}^{n} (y_i-\bar{y})^2}{n-1}$$
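The two definitions differ only in the centering value and the divisor (N for a full population, n-1 for a sample). A minimal sketch, using made-up data and hypothetical function names:

```python
import statistics

def population_variance(values):
    """Average squared deviation around the mean, divisor N."""
    mu = sum(values) / len(values)
    return sum((y - mu) ** 2 for y in values) / len(values)

def sample_variance(values):
    """Squared deviations around the sample mean, divisor n - 1."""
    n = len(values)
    ybar = sum(values) / n
    return sum((y - ybar) ** 2 for y in values) / (n - 1)

# Hypothetical measurements, for illustration only.
data = [4.0, 7.0, 9.0, 10.0, 15.0]

print(population_variance(data))  # treats data as the whole population
print(sample_variance(data))      # treats data as a sample

# The standard library provides the same two quantities.
print(statistics.pvariance(data), statistics.variance(data))
```

The sample version is always the larger of the two on the same data, since it divides the same sum of squares by n-1 instead of n.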

Sampling Distribution of $s^2$ (Normal Data)

• Population variance ($\sigma^2$) is a fixed (unknown) parameter based on the population of measurements

• Sample variance ($s^2$) varies from sample to sample (just as the sample mean does)

• When $Y \sim N(\mu, \sigma)$, the distribution of (a multiple of) $s^2$ is Chi-Square with n-1 degrees of freedom: $(n-1)s^2/\sigma^2 \sim \chi^2$ with df = n-1

• Chi-Square distributions
    – Positively skewed with positive density over $(0, \infty)$
    – Indexed by its degrees of freedom (df)
    – Mean = df, Variance = 2(df)
    – Critical values given in Table 8, pp. 686-687
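The stated mean (df) and variance (2 df) of the scaled sample variance can be checked by simulation. The sketch below is only an illustration: the sample size, standard deviation, number of replications, and seed are arbitrary choices, and the function name is hypothetical.

```python
import random
import statistics

def scaled_sample_variances(n, sigma, reps, seed=0):
    """Simulate (n-1)s^2/sigma^2 for repeated normal samples of size n."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        ybar = sum(sample) / n
        s2 = sum((y - ybar) ** 2 for y in sample) / (n - 1)
        draws.append((n - 1) * s2 / sigma ** 2)
    return draws

# n = 11 gives df = 10, so the draws should average near 10
# with variance near 20 if the chi-square result holds.
draws = scaled_sample_variances(n=11, sigma=2.0, reps=20000)
sim_mean = sum(draws) / len(draws)
sim_var = statistics.variance(draws)
print(sim_mean, sim_var)
```

Changing sigma should leave both summaries essentially unchanged, because the statistic is scaled by the true variance.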

Chi-Square Distributions

[Figure: Chi-Square Distributions. Density f(X^2) versus X^2 (0 to 70) for df=4, df=10, df=20, df=30, and df=50.]

Chi-Square Distribution Critical Values

[Figure: Chi-Square Distribution (df=10). Density f(X^2) versus X^2, with area .95 between the critical values 3.247 and 20.48 and area .025 in each tail.]

Critical values $\chi^2_{\alpha}$ for df=10 ($\alpha$ = area above the value):

    alpha    chi-square
    0.995      2.156
    0.990      2.558
    0.975      3.247
    0.950      3.940
    0.900      4.865
    0.100     15.987
    0.050     18.307
    0.025     20.483
    0.010     23.209
    0.005     25.188

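Chi-Square Critical Values (2-Sided Tests/CIs)

[Figure: Chi-square density f(X^2) with area $1-\alpha$ between $\chi^2_L$ and $\chi^2_U$ and area $\alpha/2$ in each tail.]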

$(1-\alpha)100\%$ Confidence Interval for $\sigma^2$ (or $\sigma$)

• Step 1: Obtain a random sample of n items from the population, and compute $s^2$

• Step 2: Choose confidence level $(1-\alpha)$

• Step 3: Obtain $\chi^2_L$ and $\chi^2_U$ from the table of critical values for the chi-square distribution with n-1 df

• Step 4: Compute the confidence interval for $\sigma^2$ based on the formula below

• Step 5: Obtain the confidence interval for the standard deviation $\sigma$ by taking square roots of the bounds for $\sigma^2$

$$(1-\alpha)100\%\ \text{CI for } \sigma^2: \quad \left( \frac{(n-1)s^2}{\chi^2_U},\ \frac{(n-1)s^2}{\chi^2_L} \right)$$
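The five steps can be sketched in a few lines. The function names are hypothetical, and the sample variance (4.0 from n = 11 items) is made up; the critical values 3.247 and 20.483 are the df=10 entries for a 95% interval from the chi-square table in these notes.

```python
import math

def variance_ci(n, s2, chi2_lower, chi2_upper):
    """CI for the variance: ((n-1)s^2/chi2_U, (n-1)s^2/chi2_L)."""
    df = n - 1
    return (df * s2 / chi2_upper, df * s2 / chi2_lower)

def sd_ci(n, s2, chi2_lower, chi2_upper):
    """CI for the standard deviation: square roots of the variance bounds."""
    low, high = variance_ci(n, s2, chi2_lower, chi2_upper)
    return (math.sqrt(low), math.sqrt(high))

# Hypothetical sample: n = 11 items with sample variance 4.0.
var_low, var_high = variance_ci(n=11, s2=4.0, chi2_lower=3.247, chi2_upper=20.483)
sd_low, sd_high = sd_ci(n=11, s2=4.0, chi2_lower=3.247, chi2_upper=20.483)
print(var_low, var_high)  # roughly 1.95 to 12.32
print(sd_low, sd_high)
```

Note that the larger critical value produces the lower bound, and that the interval is not symmetric around $s^2$ because the chi-square distribution is skewed.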

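Statistical Test for $\sigma^2$

• Null and alternative hypotheses
    – 1-sided (upper tail): $H_0: \sigma^2 \le \sigma_0^2$ vs. $H_a: \sigma^2 > \sigma_0^2$
    – 1-sided (lower tail): $H_0: \sigma^2 \ge \sigma_0^2$ vs. $H_a: \sigma^2 < \sigma_0^2$
    – 2-sided: $H_0: \sigma^2 = \sigma_0^2$ vs. $H_a: \sigma^2 \ne \sigma_0^2$

• Test Statistic:

$$\chi^2_{obs} = \frac{(n-1)s^2}{\sigma_0^2}$$

• Decision Rule based on chi-square distribution w/ df = n-1:
    – 1-sided (upper tail): Reject $H_0$ if $\chi^2_{obs} > \chi^2_U = \chi^2_{\alpha}$
    – 1-sided (lower tail): Reject $H_0$ if $\chi^2_{obs} < \chi^2_L = \chi^2_{1-\alpha}$
    – 2-sided: Reject $H_0$ if $\chi^2_{obs} < \chi^2_L = \chi^2_{1-\alpha/2}$ (Conclude $\sigma^2 < \sigma_0^2$) or if $\chi^2_{obs} > \chi^2_U = \chi^2_{\alpha/2}$ (Conclude $\sigma^2 > \sigma_0^2$)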
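The decision rules can be sketched as a small function that takes the critical values from the table. The function name, the sample values, and the hypothesized variance of 1.5 are assumptions made for illustration; the critical values are the df=10 table entries quoted earlier in these notes.

```python
def variance_test(n, s2, sigma0_sq, alternative, chi2_lower=None, chi2_upper=None):
    """Return (chi2_obs, reject) for a test of H0 about sigma^2.

    alternative: "greater", "less", or "two-sided".
    chi2_lower / chi2_upper: table critical values for df = n - 1.
    """
    chi2_obs = (n - 1) * s2 / sigma0_sq
    if alternative == "greater":
        reject = chi2_obs > chi2_upper
    elif alternative == "less":
        reject = chi2_obs < chi2_lower
    elif alternative == "two-sided":
        reject = chi2_obs < chi2_lower or chi2_obs > chi2_upper
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return chi2_obs, reject

# Hypothetical data: n = 11, s^2 = 4.0, hypothesized variance 1.5.
# Two-sided test at alpha = .05 uses chi2_.975 = 3.247 and chi2_.025 = 20.483.
obs_two, reject_two = variance_test(11, 4.0, 1.5, "two-sided", 3.247, 20.483)

# Upper-tail test at alpha = .05 uses chi2_.05 = 18.307.
obs_up, reject_up = variance_test(11, 4.0, 1.5, "greater", chi2_upper=18.307)
print(obs_two, reject_two, reject_up)
```

With these made-up numbers the observed statistic is about 26.67, which exceeds both upper critical values, so both tests reject and conclude the variance is larger than 1.5.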

Inferences Regarding 2 Population Variances

• Goal: Compare variances between 2 populations

• Parameter: $\sigma_1^2/\sigma_2^2$ (Ratio is 1 when variances are equal)

• Estimator: $s_1^2/s_2^2$ (Ratio of sample variances)

• Distribution of (multiple of) estimator (Normal Data):

$$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} = \frac{s_1^2/s_2^2}{\sigma_1^2/\sigma_2^2} \sim F \quad \text{with } df_1 = n_1-1 \text{ and } df_2 = n_2-1$$

F-distribution with parameters $df_1 = n_1-1$ and $df_2 = n_2-1$

Properties of F-Distributions

• Take on positive density over the range $(0, \infty)$

• Cannot take on negative values

• Non-symmetric (skewed right)

• Indexed by two degrees of freedom ($df_1$ (numerator df) and $df_2$ (denominator df))

• Critical values given in Table 9, pp 688-699

• Parameters of F-distribution:

$$\mu = \frac{df_2}{df_2-2} \quad (df_2 > 2) \qquad \sigma^2 = \frac{2\,df_2^2\,(df_1+df_2-2)}{df_1\,(df_2-2)^2\,(df_2-4)} \quad (df_2 > 4)$$
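The mean and variance of an F-distribution depend only on the two degrees of freedom and exist only when $df_2$ is large enough. A short sketch (function names are hypothetical) that evaluates the standard formulas:

```python
def f_mean(df2):
    """Mean of an F-distribution; defined only for df2 > 2."""
    if df2 <= 2:
        raise ValueError("mean requires df2 > 2")
    return df2 / (df2 - 2)

def f_variance(df1, df2):
    """Variance of an F-distribution; defined only for df2 > 4."""
    if df2 <= 4:
        raise ValueError("variance requires df2 > 4")
    return 2 * df2 ** 2 * (df1 + df2 - 2) / (df1 * (df2 - 2) ** 2 * (df2 - 4))

# One of the (df1, df2) pairs plotted in these notes.
print(f_mean(10), f_variance(5, 10))
```

The mean does not involve $df_1$ at all and approaches 1 as $df_2$ grows, which matches the idea that the variance ratio is 1 when the population variances are equal.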

F-Distributions

[Figure: F-Distributions. Density function of F versus F (0 to 10) for f(5,5), f(5,10), and f(10,20).]

Critical Values of F-Distributions

• Notation: $F_{\alpha, df_1, df_2}$ is the value with upper tail area of $\alpha$ above it for the F-distribution with degrees of freedom $df_1$ and $df_2$, respectively

• $F_{1-\alpha, df_1, df_2} = 1/F_{\alpha, df_2, df_1}$ (Lower tail critical values can be obtained from upper tail critical values with “reversed” degrees of freedom)

• Values given for various values of $\alpha$, $df_1$, and $df_2$ in Table 9, pp 688-699

Critical Values of F (df1=5, df2=5)

[Figure: Density function of F versus F (0 to 10) for df1=5, df2=5, with area .90 between the lower and upper critical values and area .05 in each tail. F(.05,5,5) = 5.05; F(.95,5,5) = 1/F(.05,5,5) = 1/5.05 = .198.]

    upper area   middle area   lower area   upper cv   lower cv
    0.25         0.5           0.25          1.8947    0.5278
    0.1          0.8           0.1           3.4530    0.2896
    0.05         0.9           0.05          5.0503    0.1980
    0.025        0.95          0.025         7.1464    0.1399
    0.01         0.98          0.01         10.9671    0.0912
    0.005        0.99          0.005        14.9394    0.0669
    0.001        0.998         0.001        29.7514    0.0336
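The reciprocal rule for lower tail critical values can be checked against the (5,5) table: because the two degrees of freedom are equal, reversing them changes nothing, so each lower critical value should be 1 over the upper one. A small sketch, with the table values copied from above and hypothetical names:

```python
# (tail area, upper critical value, lower critical value) for df1 = df2 = 5.
F55_TABLE = [
    (0.25, 1.8947, 0.5278),
    (0.1, 3.4530, 0.2896),
    (0.05, 5.0503, 0.1980),
    (0.025, 7.1464, 0.1399),
    (0.01, 10.9671, 0.0912),
    (0.005, 14.9394, 0.0669),
    (0.001, 29.7514, 0.0336),
]

def lower_critical_value(upper_cv_reversed_df):
    """F_{1-a, df1, df2} = 1 / F_{a, df2, df1}."""
    return 1.0 / upper_cv_reversed_df

for tail_area, upper_cv, lower_cv in F55_TABLE:
    computed = lower_critical_value(upper_cv)
    print(tail_area, round(computed, 4), lower_cv)
```

When $df_1 \ne df_2$, the upper critical value passed in must come from the table entry with the degrees of freedom swapped.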

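Test Comparing Two Population Variances

• Assumption: the 2 populations are normally distributed

• 1-Sided Test:
    – Hypotheses: $H_0: \sigma_1^2 \le \sigma_2^2$ vs. $H_a: \sigma_1^2 > \sigma_2^2$
    – Test Statistic: $F_{obs} = s_1^2/s_2^2$
    – Rejection Region: $F_{obs} \ge F_{\alpha, n_1-1, n_2-1}$
    – P-value: $P(F \ge F_{obs})$

• 2-Sided Test:
    – Hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_a: \sigma_1^2 \ne \sigma_2^2$
    – Test Statistic: $F_{obs} = s_1^2/s_2^2$
    – Rejection Region: $F_{obs} \ge F_{\alpha/2, n_1-1, n_2-1}$ or $F_{obs} \le F_{1-\alpha/2, n_1-1, n_2-1}$
    – P-value: $2\min\left(P(F \ge F_{obs}),\ P(F \le F_{obs})\right)$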
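The rejection regions can be sketched with critical values read from a table. Everything about the data here is hypothetical (two samples of size 6, so df = (5, 5)); the critical value 5.0503 is $F_{.05,5,5}$ from the table in these notes, and the function names are illustrative.

```python
def f_statistic(s1_sq, s2_sq):
    """Observed ratio of sample variances."""
    return s1_sq / s2_sq

def one_sided_reject(f_obs, f_upper):
    """Upper-tail test: f_upper = F_{alpha, n1-1, n2-1}."""
    return f_obs >= f_upper

def two_sided_reject(f_obs, f_upper, f_upper_reversed):
    """Two-sided test.

    f_upper          = F_{alpha/2, n1-1, n2-1}
    f_upper_reversed = F_{alpha/2, n2-1, n1-1}, so that the lower
                       critical value is 1 / f_upper_reversed.
    """
    f_lower = 1.0 / f_upper_reversed
    return f_obs >= f_upper or f_obs <= f_lower

# Hypothetical sample variances from two samples of size 6.
f_obs = f_statistic(12.0, 3.0)

# alpha = .05 one-sided and alpha = .10 two-sided both use F_{.05,5,5}.
print(f_obs, one_sided_reject(f_obs, 5.0503), two_sided_reject(f_obs, 5.0503, 5.0503))
```

With these numbers the observed ratio of 4.0 falls short of 5.0503, so neither test rejects, even though one sample variance is four times the other. Small samples give the test little power.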

$(1-\alpha)100\%$ Confidence Interval for $\sigma_1^2/\sigma_2^2$

• Obtain ratio of sample variances $s_1^2/s_2^2 = (s_1/s_2)^2$

• Choose $\alpha$, and obtain:
    – $F_L = F_{1-\alpha/2,\, n_2-1,\, n_1-1} = 1/F_{\alpha/2,\, n_1-1,\, n_2-1}$
    – $F_U = F_{\alpha/2,\, n_2-1,\, n_1-1}$

• Compute Confidence Interval:

$$\left( \frac{s_1^2}{s_2^2} F_L,\ \frac{s_1^2}{s_2^2} F_U \right)$$

Conclude population variances unequal if interval does not contain 1
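A sketch of the interval computation, again with table critical values supplied by the caller. The sample variances are made up (two samples of size 6), the function name is hypothetical, and 5.0503 is the $F_{.05,5,5}$ table value, which gives a 90% interval here.

```python
def variance_ratio_ci(s1_sq, s2_sq, f_upper_n1_n2, f_upper_n2_n1):
    """CI for sigma1^2 / sigma2^2.

    f_upper_n1_n2 = F_{alpha/2, n1-1, n2-1}  (F_L is its reciprocal)
    f_upper_n2_n1 = F_{alpha/2, n2-1, n1-1}  (this is F_U)
    """
    ratio = s1_sq / s2_sq
    f_l = 1.0 / f_upper_n1_n2
    f_u = f_upper_n2_n1
    return (ratio * f_l, ratio * f_u)

# Hypothetical sample variances 12.0 and 3.0 with df = (5, 5).
ratio_low, ratio_high = variance_ratio_ci(12.0, 3.0, 5.0503, 5.0503)
contains_one = ratio_low <= 1.0 <= ratio_high
print(ratio_low, ratio_high, contains_one)  # roughly 0.79 to 20.2
```

Because the interval contains 1 for this made-up example, the data would not support concluding that the population variances differ.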

Tests Among t > 2 Population Variances

• Hartley’s Fmax Test
    – Very simple to compute Test Statistic
    – Must have equal sample sizes ($n_1 = \dots = n_t$)
    – Test based on assumption of normally distributed data
    – Uses special table for critical values (Table 10, p. 700)

• Levene’s Test
    – More difficult to compute by hand
    – No assumptions regarding sample sizes/distributions
    – Uses F-distribution for the test
    – Computed automatically by software packages (SAS, SPSS, Minitab)

Hartley’s Fmax Test

• $H_0: \sigma_1^2 = \dots = \sigma_t^2$ (homogeneous variances)

• $H_a$: Population Variances are not all equal

• Data: $s_{max}^2$ is largest sample variance, $s_{min}^2$ is smallest

• Test Statistic: $F_{max} = s_{max}^2/s_{min}^2$

• Rejection Region: $F_{max} \ge F^*$ (Values from Table 10, p. 700, indexed by $\alpha$ (.05, .01), t (number of populations) and $df_2$ (n-1, where n is the individual sample size))
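The Fmax statistic is simple enough to compute in a few lines. This sketch uses invented data for three groups of equal size and a hypothetical function name; the critical value F* still has to be read from the special table.

```python
import statistics

def hartley_fmax(groups):
    """Ratio of the largest to the smallest sample variance.

    Raises ValueError when the groups do not all have the same size,
    since the test requires equal sample sizes.
    """
    if len({len(g) for g in groups}) != 1:
        raise ValueError("Hartley's Fmax test requires equal sample sizes")
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances)

# Hypothetical data: t = 3 groups, n = 5 measurements each.
groups = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [3.0, 3.0, 4.0, 5.0, 5.0],
]
fmax = hartley_fmax(groups)
print(fmax)  # compare with F* for t = 3 and df2 = n - 1 = 4
```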

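Levene’s Test

• $H_0: \sigma_1^2 = \dots = \sigma_t^2$ (homogeneous variances)

• $H_a$: Population Variances are not all equal

• Data: For each group, obtain the following quantities:
    – $y_{ij}$: the $j$th measurement from group $i$ ($i = 1,\dots,t$; $j = 1,\dots,n_i$)
    – $\tilde{y}_i$: sample median for group $i$ ($i = 1,\dots,t$)
    – $z_{ij} = |y_{ij} - \tilde{y}_i|$ ($i = 1,\dots,t$; $j = 1,\dots,n_i$)
    – $\bar{z}_{i.} = \frac{1}{n_i}\sum_{j=1}^{n_i} z_{ij}$
    – $\bar{z}_{..} = \frac{1}{N}\sum_{i=1}^{t}\sum_{j=1}^{n_i} z_{ij}$, where $N = n_1 + \dots + n_t$

• Test Statistic:

$$L = \frac{\sum_{i=1}^{t} n_i\,(\bar{z}_{i.} - \bar{z}_{..})^2 / (t-1)}{\sum_{i=1}^{t}\sum_{j=1}^{n_i} (z_{ij} - \bar{z}_{i.})^2 / (N-t)}$$

• Rejection Region: $L \ge F_{\alpha,\, t-1,\, N-t}$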
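Levene's statistic is tedious by hand but short in code. The sketch below follows the median-centered definition given here, on invented data; the function name is hypothetical, and the result still needs to be compared with an F critical value with (t-1, N-t) degrees of freedom from a table or software.

```python
import statistics

def levene_statistic(groups):
    """Levene's L using absolute deviations from each group's median."""
    t = len(groups)
    total_n = sum(len(g) for g in groups)

    # z_ij = |y_ij - median_i|
    z = []
    for g in groups:
        med = statistics.median(g)
        z.append([abs(y - med) for y in g])

    z_bar_i = [sum(zi) / len(zi) for zi in z]       # group means of z
    z_bar = sum(sum(zi) for zi in z) / total_n      # overall mean of z

    between = sum(len(zi) * (zb - z_bar) ** 2 for zi, zb in zip(z, z_bar_i)) / (t - 1)
    within = sum((zij - zb) ** 2 for zi, zb in zip(z, z_bar_i) for zij in zi) / (total_n - t)
    return between / within

# Hypothetical data: t = 3 groups (sizes need not be equal for this test).
groups_levene = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [3.0, 3.0, 4.0, 5.0, 5.0],
]
levene_l = levene_statistic(groups_levene)
print(levene_l)  # compare with the F critical value for df = (2, 12)
```

The statistic is a one-way ANOVA F ratio computed on the $z_{ij}$ values rather than on the raw measurements, which is why an ordinary F table supplies the critical value.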
