two sample test

Chapter 10

Two-Sample Tests and ANOVA

Learning ObjectivesIn this chapter, you learn hypothesis testing procedures to

test: • The means of two independent populations• The means of two related populations • The proportions of two independent populations• The variances of two independent populations• The means of more than two populations

Chapter Overview

One-Way Analysis of Variance (ANOVA)

F-test

Tukey-Kramer test

Two-Sample Tests

Population Means, Independent Samples

Means, Related Samples

PopulationProportions

PopulationVariances

Two-Sample Tests

Two-Sample Tests

Population Means,

Independent Samples

Means, Related Samples

Population Variances

Mean 1 vs. independent Mean 2

Same population before vs. after treatment

Variance 1 vs.Variance 2

Examples:

Population Proportions

Proportion 1 vs. Proportion 2

Difference Between Two Means

Population means, independent

samples

σ1 and σ2 known

Goal: Test hypothesis or form a confidence interval for the difference between two population means, μ1 – μ2

The point estimate for the difference is

X1 – X2

*

σ1 and σ2 unknown, assumed equal

σ1 and σ2 unknown, not assumed equal

Independent Samples


samples

• Different data sources• Unrelated• Independent

• Sample selected from one population has no effect on the sample selected from the other population

• Use the difference between 2 sample means

• Use Z test, a pooled-variance t test, or a separate-variance t test

*σ1 and σ2 known



Difference Between Two Means


samples

σ1 and σ2 known

*Use a Z test statistic

Use Sp to estimate unknown σ , use a t test statistic and pooled standard deviation



Use S1 and S2 to estimate unknown σ1 and σ2, use a separate-variance t test


samples

σ1 and σ2 known

σ1 and σ2 KnownAssumptions:

Samples are randomly and independently drawn

Population distributions are normal or both sample sizes are 30

Population standard deviations are known

*σ1 and σ2 unknown,

assumed equal



samples

σ1 and σ2 known …and the standard error of X1 – X2 is

When σ1 and σ2 are known and both populations are normal or both sample sizes are at least 30, the test statistic is a Z-value…

2

22

1

21

XX nσ

nσσ

21

(continued)σ1 and σ2 Known


assumed equal



samples

σ1 and σ2 known

2

22

1

21

2121

nσ

nσ

μμXXZ

The test statistic for μ1 – μ2 is:

σ1 and σ2 Known

*

(continued)



Hypothesis Tests forTwo Population Means

Lower-tail test:

H0: μ1 μ2

H1: μ1 < μ2

i.e.,

H0: μ1 – μ2 0H1: μ1 – μ2 < 0

Upper-tail test:

H0: μ1 ≤ μ2

H1: μ1 > μ2

i.e.,

H0: μ1 – μ2 ≤ 0H1: μ1 – μ2 > 0

Two-tail test:

H0: μ1 = μ2

H1: μ1 ≠ μ2

i.e.,

H0: μ1 – μ2 = 0H1: μ1 – μ2 ≠ 0

Two Population Means, Independent Samples

Two Population Means, Independent Samples

Lower-tail test:

H0: μ1 – μ2 0H1: μ1 – μ2 < 0

Upper-tail test:

H0: μ1 – μ2 ≤ 0H1: μ1 – μ2 > 0

Two-tail test:

H0: μ1 – μ2 = 0H1: μ1 – μ2 ≠ 0

a a/2 a/2a

-za -za/2za za/2

Reject H0 if Z < -Za Reject H0 if Z > Za Reject H0 if Z < -Za/2

or Z > Za/2

Hypothesis tests for μ1 – μ2


samples

σ1 and σ2 known

2

22

1

21

21nσ

nσZXX

The confidence interval for μ1 – μ2 is:

Confidence Interval, σ1 and σ2 Known


assumed equal



samples

σ1 and σ2 known

σ1 and σ2 Unknown, Assumed Equal

Assumptions: Samples are randomly and independently drawn

Populations are normally distributed or both sample sizes are at least 30

Population variances are unknown but assumed equal

*σ1 and σ2 unknown, assumed equal



samples

σ1 and σ2 known

(continued)

*

Forming interval estimates:

The population variances are assumed equal, so use the two sample variances and pool them to estimate the common σ2

the test statistic is a t value with (n1 + n2 – 2) degrees of freedom





samples

σ1 and σ2 knownThe pooled variance is

(continued)

1)n(n

S1nS1nS21

222

2112

p

()1*σ1 and σ2 unknown,

assumed equal




samples

σ1 and σ2 known

Where t has (n1 + n2 – 2) d.f.,

and

21

2p

2121

n1

n1S

μμXXt


*

1)n()1(nS1nS1nS

21

222

2112

p

(continued)





samples

σ1 and σ2 known

21

2p2-nn21

n1

n1StXX

21

The confidence interval for μ1 – μ2 is:

Where

1)n()1(n

S1nS1nS21

222

2112

p

*

Confidence Interval, σ1 and σ2 Unknown



Pooled-Variance t Test: ExampleYou are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data: NYSE NASDAQNumber 21 25Sample mean 3.27 2.53Sample std dev 1.30 1.16

Assuming both populations are approximately normal with equal variances, isthere a difference in average yield (a = 0.05)?

Calculating the Test Statistic

1.50211)25(1)-(21

1.161251.301211)n()1(n

S1nS1nS22

21

222

2112

p

2.040

251

2115021.1

02.533.27

n1

n1S

μμXXt

21

2p

2121

The test statistic is:

SolutionH0: μ1 - μ2 = 0 i.e. (μ1 = μ2)

H1: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)a = 0.05df = 21 + 25 - 2 = 44Critical Values: t = ± 2.0154

Test Statistic: Decision:

Conclusion:Reject H0 at a = 0.05

There is evidence of a difference in means.

t0 2.0154-2.0154

.025

Reject H0 Reject H0

.025

2.040

2.040

251

2115021.1

2.533.27t


samples

σ1 and σ2 known

σ1 and σ2 Unknown, Not Assumed Equal

Assumptions: Samples are randomly and independently drawn

Populations are normally distributed or both sample sizes are at least 30

Population variances are unknown but cannot be assumed to be equal*




samples

σ1 and σ2 known

(continued)

*

Forming the test statistic:

The population variances are not assumed equal, so include the two sample variances in the computation of the t-test statistic

the test statistic is a t value (statistical software is generally used to do the necessary computations)





samples

σ1 and σ2 known

2

22

1

21

2121

nS

nS

μμXXt


*

(continued)




Related Populations Tests Means of 2 Related Populations

• Paired or matched samples• Repeated measures (before/after)• Use difference between paired values:

• Eliminates Variation Among Subjects• Assumptions:

• Both Populations Are Normally Distributed• Or, if not Normal, use large samples

Related samples

Di = X1i - X2i

Mean Difference, σD KnownThe ith paired difference is Di , where

Related samples

Di = X1i - X2i

The point estimate for the population mean paired difference is D : n

DD

n

1ii

Suppose the population standard deviation of the difference scores, σD, is known

n is the number of pairs in the paired sample

The test statistic for the mean difference is a Z value:Paired

samples

nσ

μDZD

D

Mean Difference, σD Known(continued)

WhereμD = hypothesized mean differenceσD = population standard dev. of differencesn = the sample size (number of pairs)

Confidence Interval, σD Known

The confidence interval for μD isPaired samples

nσZD D

Where n = the sample size

(number of pairs in the paired sample)

If σD is unknown, we can estimate the unknown population standard deviation with a sample standard deviation:

Related samples

1n

)D(DS

n

1i

2i

D

The sample standard deviation is

Mean Difference, σD Unknown

• Use a paired t test, the test statistic for D is now a t statistic, with n-1 d.f.:Paired

samples

1n

)D(DS

n

1i

2i

D

nS

μDtD

D

Where t has n - 1 d.f.

and SD is:

Mean Difference, σD Unknown

(continued)

The confidence interval for μD isPaired samples

1n

)D(DS

n

1i

2i

D

nStD D

1n

where

Confidence Interval, σD Unknown

Lower-tail test:

H0: μD 0H1: μD < 0

Upper-tail test:

H0: μD ≤ 0H1: μD > 0

Two-tail test:

H0: μD = 0H1: μD ≠ 0

Paired Samples

Hypothesis Testing for Mean Difference, σD Unknown

a a/2 a/2a

-ta -ta/2ta ta/2

Reject H0 if t < -ta Reject H0 if t > ta Reject H0 if t < -ta/2 or t > ta/2 Where t has n - 1 d.f.

• Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:

Paired t Test Example

Number of Complaints: (2) - (1)Salesperson Before (1) After (2) Difference, Di

C.B. 6 4 - 2 T.F. 20 6 -14 M.H. 3 2 - 1 R.K. 0 0 0 M.O. 4 0 - 4 -21

D =Di

n

5.67

1n)D(D

S2

iD

= -4.2

• Has the training made a difference in the number of complaints (at the 0.01 level)?

- 4.2D =

1.6655.67/04.2

n/SμDt

D

D

H0: μD = 0H1: μD 0

Test Statistic:

Critical Value = ± 4.604 d.f. = n - 1 = 4

Reject

a/2 - 4.604 4.604

Decision: Do not reject H0(t stat is not in the reject region)

Conclusion: There is not a significant change in the number of complaints.

Paired t Test: Solution

Reject

a/2

- 1.66a = .01

Two Population Proportions

Goal: test a hypothesis or form a confidence interval for the difference between two population proportions,

π1 – π2

The point estimate for the difference is

Population proportions

Assumptions: n1 π1 5 , n1(1- π1) 5

n2 π2 5 , n2(1- π2) 5

21 pp



21

21

nnXXp

The pooled estimate for the overall proportion is:

where X1 and X2 are the numbers from samples 1 and 2 with the characteristic of interest

Since we begin by assuming the null hypothesis is true, we assume π1 = π2 and pool the two sample estimates



21

2121

n1

n1)p(1p

ppZ ππ

The test statistic for p1 – p2 is a Z statistic:

(continued)

2

22

1

11

21

21

nXp ,

nXp ,

nnXXp

where

Confidence Interval forTwo Population Proportions


2

22

1

1121 n

)p(1pn

)p(1pZpp

The confidence interval for π1 – π2 is:

Hypothesis Tests forTwo Population Proportions


Lower-tail test:

H0: π1 π2

H1: π1 < π2

i.e.,

H0: π1 – π2 0H1: π1 – π2 < 0

Upper-tail test:

H0: π1 ≤ π2

H1: π1 > π2

i.e.,

H0: π1 – π2 ≤ 0H1: π1 – π2 > 0

Two-tail test:

H0: π1 = π2

H1: π1 ≠ π2

i.e.,

H0: π1 – π2 = 0H1: π1 – π2 ≠ 0

Hypothesis Tests forTwo Population Proportions


Lower-tail test:

H0: π1 – π2 0H1: π1 – π2 < 0

Upper-tail test:

H0: π1 – π2 ≤ 0H1: π1 – π2 > 0

Two-tail test:

H0: π1 – π2 = 0H1: π1 – π2 ≠ 0

a a/2 a/2a

-za -za/2za za/2

Reject H0 if Z < -Za Reject H0 if Z > Za Reject H0 if Z < -Za/2

or Z > Za/2

(continued)

Example: Two population Proportions

Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?

• In a random sample, 36 of 72 men and 31 of 50 women indicated they would vote Yes

• Test at the .05 level of significance

• The hypothesis test is:H0: π1 – π2 = 0 (the two proportions are equal)

H1: π1 – π2 ≠ 0 (there is a significant difference between proportions)

The sample proportions are: Men: p1 = 36/72 = .50

Women: p2 = 31/50 = .62

.54912267

50723136

nnXXp

21

21

The pooled estimate for the overall proportion is:


(continued)

The test statistic for π1 – π2 is:


(continued)

.025

-1.96 1.96

.025

-1.31

Decision: Do not reject H0

Conclusion: There is not significant evidence of a difference in proportions who will vote yes between men and women.

1.31

501

721.549)(1.549

0.62.50

n1

n1p(1p

ppz

21

2121

)

Reject H0 Reject H0

Critical Values = ±1.96For a = .05

two sample test

Documents

equal population

independent samples1

test hypothesis

independent samplesmeans

independent mean

equal difference

sample meansuse z test

uppertail test