hypothesis testing. to define a statistical test we 1.choose a statistic (called the test statistic)...

Hypothesis Testing

To define a statistical Test we

1. Choose a statistic (called the test statistic)

2. Divide the range of possible values for the test statistic into two parts

• The Acceptance Region

• The Critical Region

To perform a statistical Test we

1. Collect the data.

2. Compute the value of the test statistic.

3. Make the Decision:

• If the value of the test statistic is in the Acceptance Region we decide to accept H0 .

• If the value of the test statistic is in the Critical Region we decide to reject H0 .

The z-test for Proportions

Testing the probability of success in a binomial experiment

Situation

• A success-failure experiment has been repeated n times

• The probability of success p is unknown. We want to test

– $H_0: p = p_0$ (some specified value of p)

against

– $H_A: p \ne p_0$

The Test Statistic

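$$z = \frac{\hat{p} - p_0}{\sigma_{\hat{p}}} = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$

where $\hat{p}$ is the observed proportion of successes in the n trials.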

The Acceptance and Critical Region

Two-tailed critical region

• Accept H0 if: $-z_{\alpha/2} \le z \le z_{\alpha/2}$

• Reject H0 if: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

One-tailed critical regions

These are used when the alternative hypothesis (HA) is one-sided,

i.e. $H_0: p \le p_0$ and $H_A: p > p_0$

• Accept H0 if: $z \le z_{\alpha}$

• Reject H0 if: $z > z_{\alpha}$

or if $H_0: p \ge p_0$ and $H_A: p < p_0$

• Accept H0 if: $z \ge -z_{\alpha}$

• Reject H0 if: $z < -z_{\alpha}$
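The decision rules for the z-test for proportions can be sketched in code. This is a minimal illustration rather than part of the original slides: the function name `proportion_z_test`, its `alternative` argument and the coin-toss counts (60 heads in 100 tosses) are assumptions made for the example. Critical values are taken from `statistics.NormalDist` in the Python standard library.

```python
import math
from statistics import NormalDist


def proportion_z_test(successes, n, p0, alternative="two-sided", alpha=0.05):
    """z-test for a binomial proportion; returns (z, reject_h0)."""
    p_hat = successes / n
    # Standard deviation of p_hat, computed under H0 (p = p0).
    sigma_p_hat = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sigma_p_hat
    if alternative == "two-sided":      # HA: p != p0
        reject = abs(z) > NormalDist().inv_cdf(1 - alpha / 2)
    elif alternative == "greater":      # HA: p > p0
        reject = z > NormalDist().inv_cdf(1 - alpha)
    elif alternative == "less":         # HA: p < p0
        reject = z < -NormalDist().inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, reject


# Hypothetical data: 60 heads in 100 tosses, testing H0: p = 1/2.
z, reject = proportion_z_test(60, 100, 0.5)
print(f"z = {z:.2f}, reject H0: {reject}")
```

With these made-up counts the statistic lands just inside the two-tailed critical region at α = 0.05, so H0 would be rejected.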

Comments

• Whether you use a one-tailed or a two-tailed test is determined by the choice of the alternative hypothesis HA.

• The alternative hypothesis, HA, is usually the research hypothesis: the hypothesis that the researcher is trying to “prove”.

Examples

1. A person wants to determine if a coin should be accepted as being fair. Let p be the probability that a head is tossed.

One is trying to determine if there is a difference (positive or negative) with the fair value of p.

$H_0: p = \tfrac{1}{2}$ vs $H_A: p \ne \tfrac{1}{2}$

2. A researcher is interested in determining if a new procedure is an improvement over the old procedure. The probability of success for the old procedure is p0 (known). The probability of success for the new procedure is p (unknown) .

One is trying to determine if the new procedure is better (i.e. p > p0) .

$H_0: p \le p_0$ vs $H_A: p > p_0$

3. A researcher is interested in determining if a new procedure is no longer worth considering. The probability of success for the old procedure is p0 (known). The probability of success for the new procedure is p (unknown).

One is trying to determine if the new procedure is definitely worse than the one presently being used (i.e. p < p0) .

$H_0: p \ge p_0$ vs $H_A: p < p_0$

The z-test for the Mean of a Normal Population

We want to test μ, the mean of a normal population.

The Situation

• Let x1, x2, x3 , … , xn denote a sample from a normal population with mean μ and standard deviation σ.

• Let

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

denote the sample mean.

• We want to test if the mean, μ, is equal to some given value μ0.

• If the sample mean is close to μ0 the null hypothesis should be accepted; otherwise the null hypothesis should be rejected.

The Test Statistic

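$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{\sigma} \approx \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{s}$$

The last form replaces σ by the sample standard deviation s.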

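The Acceptance and Critical Region

This depends on H0 and HA.

Two-tailed critical region: $H_0: \mu = \mu_0$ and $H_A: \mu \ne \mu_0$

• Accept H0 if: $-z_{\alpha/2} \le z \le z_{\alpha/2}$

• Reject H0 if: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

One-tailed critical regions: $H_0: \mu \le \mu_0$ and $H_A: \mu > \mu_0$

• Accept H0 if: $z \le z_{\alpha}$

• Reject H0 if: $z > z_{\alpha}$

One-tailed critical regions: $H_0: \mu \ge \mu_0$ and $H_A: \mu < \mu_0$

• Accept H0 if: $z \ge -z_{\alpha}$

• Reject H0 if: $z < -z_{\alpha}$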

Example

A manufacturer of glucosamine capsules claims that each capsule contains on average:

• 500 mg of glucosamine

To test this claim n = 40 capsules were selected and the amount of glucosamine (X) was measured in each capsule.

Summary statistics:

$$\bar{x} = 496.3 \quad\text{and}\quad s = 8.5$$

We want to test:

$H_0: \mu = 500$ (the manufacturer's claim is correct)

against

$H_A: \mu \ne 500$ (the manufacturer's claim is not correct)

The Test Statistic

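$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{\sigma} \approx \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{s} = \frac{\sqrt{40}\,(496.3 - 500)}{8.5} = -2.75$$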

The Critical Region and Acceptance Region

Using α = 0.05:

$z_{\alpha/2} = z_{0.025} = 1.960$

We accept H0 if -1.960 ≤ z ≤ 1.960

We reject H0 if z < -1.960 or z > 1.960

The Decision

Since z = -2.75 < -1.960

we reject H0.

Conclude: the manufacturer's claim is incorrect.
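The arithmetic of the glucosamine example can be checked with a short script. This is only a sketch using the summary statistics quoted above; the variable names are illustrative, and the critical value is computed with `statistics.NormalDist` instead of being read from a table.

```python
import math
from statistics import NormalDist

n, x_bar, s = 40, 496.3, 8.5   # summary statistics from the example
mu0, alpha = 500.0, 0.05       # claimed mean and significance level

z = math.sqrt(n) * (x_bar - mu0) / s          # s stands in for sigma (large n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
reject_h0 = abs(z) > z_crit                   # two-tailed critical region

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```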

“Student's” t-test

Recall: The z-test for means

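The Test Statistic

$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$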

Comments

• The sampling distribution of this statistic is the standard Normal distribution

• The replacement of σ by s leaves this distribution unchanged only if the sample size n is large.

For small sample sizes:

The sampling distribution of

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

is called “Student's” t distribution with n – 1 degrees of freedom.

Properties of Student’s t distribution

• Similar to the standard normal distribution

– Symmetric

– Unimodal

– Centred at zero

• Larger spread about zero.

– The reason for this is the increased variability introduced by replacing σ by s.

• As the sample size increases (degrees of freedom increases) the t distribution approaches the standard normal distribution

[Figure: density curves of the t distribution and the standard normal distribution, plotted from -4 to 4]

The Situation

• Let x1, x2, x3 , … , xn denote a sample from a normal population with mean μ and standard deviation σ. Both μ and σ are unknown.

• Let

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \quad\text{(the sample mean)}$$

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \quad\text{(the sample standard deviation)}$$

• We want to test if the mean, μ, is equal to some given value μ0.

The Test Statistic

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

The sampling distribution of the test statistic is the t distribution with n-1 degrees of freedom

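The Alternative Hypothesis HA and the Critical Region

• $H_A: \mu \ne \mu_0$: reject H0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$

• $H_A: \mu > \mu_0$: reject H0 if $t > t_{\alpha}$

• $H_A: \mu < \mu_0$: reject H0 if $t < -t_{\alpha}$

$t_{\alpha}$ and $t_{\alpha/2}$ are critical values under the t distribution with n – 1 degrees of freedom.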

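Critical values for the t-distribution

[Figure: t density with the upper-tail area α (or α/2) to the right of the critical value $t_{\alpha}$ (or $t_{\alpha/2}$)]

Critical values for the t-distribution are provided in tables. A link to these tables is given with today's lecture.

To use the table, look up the degrees of freedom (df) and the tail area α.

Note: the values tabled for df = ∞ are the same as the values for the standard normal distribution.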

Example

• Let x1, x2, x3 , x4, x5, x6 denote weight loss from a new diet for n = 6 cases.

• Assume that x1, x2, x3 , x4, x5, x6 is a sample from a normal population with mean μ and standard deviation σ. Both μ and σ are unknown.

• We want to test:

$H_0: \mu \le 0$ (new diet is not effective)

versus

$H_A: \mu > 0$ (new diet is effective)

The Test Statistic

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

The Critical region:

Reject H0 if $t > t_{\alpha}$

The Data

Case:         1     2     3     4     5     6
Weight loss:  2.0   1.0   1.4   -1.8  0.9   2.3

The summary statistics:

$$\bar{x} = 0.96667 \quad\text{and}\quad s = 1.462418$$

The Test Statistic

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{0.96667 - 0}{1.462418/\sqrt{6}} = 1.619$$

The Critical Region (using α = 0.05)

Reject H0 if $t > t_{0.05} = 2.015$ for 5 d.f.

Conclusion: Since t = 1.619 < 2.015, we accept H0.
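The diet example can be reproduced from the raw data. The sketch below uses only the Python standard library; because the standard library has no t quantile function, the critical value 2.015 is the table value quoted in the example, not something computed here. Variable names are illustrative.

```python
import math
from statistics import mean, stdev

weight_loss = [2.0, 1.0, 1.4, -1.8, 0.9, 2.3]  # data from the example
mu0 = 0.0
t_crit = 2.015  # t_{0.05} for 5 d.f., table value quoted in the example

n = len(weight_loss)
x_bar = mean(weight_loss)
s = stdev(weight_loss)                  # sample standard deviation (divides by n - 1)
t = (x_bar - mu0) / (s / math.sqrt(n))
reject_h0 = t > t_crit                  # one-tailed test, HA: mu > 0

print(f"x_bar = {x_bar:.5f}, s = {s:.6f}, t = {t:.3f}, reject H0: {reject_h0}")
```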

Confidence Intervals

Confidence Intervals for the mean of a Normal Population, μ, using the Standard Normal distribution

$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

Confidence Intervals for the mean of a Normal Population, μ, using the t distribution

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$


Example

• Let x1, x2, x3 , x4, x5, x6 denote weight loss from a new diet for n = 6 cases.

The Data:

Case:         1     2     3     4     5     6
Weight loss:  2.0   1.0   1.4   -1.8  0.9   2.3

$$\bar{x} = 0.96667 \quad\text{and}\quad s = 1.462418$$

Confidence Intervals (use α = 0.05)

$$\bar{x} \pm t_{0.025}\,\frac{s}{\sqrt{n}} = 0.96667 \pm 2.571\,\frac{1.462418}{\sqrt{6}} = 0.96667 \pm 1.535$$

i.e. -0.57 to 2.50
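The interval can be recomputed from the data as follows. This is a sketch: the helper name `t_interval` is hypothetical, and the critical value 2.571 is the table value used in the example (t at 0.025 with 5 d.f.), supplied by hand.

```python
import math
from statistics import mean, stdev


def t_interval(data, t_crit):
    """Confidence interval x_bar +/- t_crit * s / sqrt(n)."""
    n = len(data)
    half_width = t_crit * stdev(data) / math.sqrt(n)
    return mean(data) - half_width, mean(data) + half_width


weight_loss = [2.0, 1.0, 1.4, -1.8, 0.9, 2.3]  # data from the example
lower, upper = t_interval(weight_loss, t_crit=2.571)
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```

The interval contains 0, which agrees with the earlier decision to accept H0 for these data.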

Comparing Populations

Proportions and means

Sums, Differences, Combinations of R.V.’s

A linear combination of random variables, X, Y, . . . is a combination of the form:

L = aX + bY + …

where a, b, etc. are numbers – positive or negative.

Most common:

Sum = X + Y

Difference = X – Y

Simple linear combination of X: bX + a

Means of Linear Combinations

If L = aX + bY + …, the mean of L is:

Mean(L) = a Mean(X) + b Mean(Y) + …

Most common:

Mean(X + Y) = Mean(X) + Mean(Y)

Mean(X – Y) = Mean(X) – Mean(Y)

Mean(bX + a) = b Mean(X) + a

Variances of Linear Combinations

If X, Y, . . . are independent random variables and

L = aX + bY + … then

Variance(L) = a² Variance(X) + b² Variance(Y) + …

Most common:

Variance( X + Y) = Variance(X) + Variance(Y)

Variance(X – Y) = Variance(X) + Variance(Y)

Variance(bX + a) = b² Variance(X)

Combining Independent Normal Random Variables

If X, Y, . . . are independent normal random variables, then L = aX + bY + … is normally distributed.

In particular:

X + Y is normal with

$$\text{mean} = \mu_X + \mu_Y, \qquad \text{standard deviation} = \sqrt{\sigma_X^2 + \sigma_Y^2}$$

X – Y is normal with

$$\text{mean} = \mu_X - \mu_Y, \qquad \text{standard deviation} = \sqrt{\sigma_X^2 + \sigma_Y^2}$$
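A small numerical illustration of these rules, assuming two independent variables: the function name `linear_combination` and the values used (X with mean 10 and standard deviation 3, Y with mean 4 and standard deviation 4) are made up for the example.

```python
import math


def linear_combination(a, mean_x, sd_x, b, mean_y, sd_y):
    """Mean and standard deviation of L = aX + bY for independent X and Y."""
    mean_l = a * mean_x + b * mean_y
    # Variances add with squared coefficients, so the sign of b does not matter.
    sd_l = math.sqrt(a**2 * sd_x**2 + b**2 * sd_y**2)
    return mean_l, sd_l


# Hypothetical values: X has mean 10 and sd 3, Y has mean 4 and sd 4.
print(linear_combination(1, 10, 3, 1, 4, 4))    # X + Y
print(linear_combination(1, 10, 3, -1, 4, 4))   # X - Y
```

The sum and the difference have different means (14 and 6) but the same standard deviation (5), which is the point of the rule Variance(X – Y) = Variance(X) + Variance(Y).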

Comparing proportions

Situation

• We have two populations (1 and 2)

• Let p1 denote the probability (proportion) of “success” in population 1.

• Let p2 denote the probability (proportion) of “success” in population 2.

• Objective is to compare the two population proportions

We want to test either:

1. $H_0: p_1 = p_2$ vs $H_A: p_1 \ne p_2$

or

2. $H_0: p_1 \le p_2$ vs $H_A: p_1 > p_2$

or

3. $H_0: p_1 \ge p_2$ vs $H_A: p_1 < p_2$

The test statistic:

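$$z = \frac{\hat{p}_1 - \hat{p}_2}{\hat{\sigma}_{\hat{p}_1 - \hat{p}_2}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$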

Where:

A sample of n1 is selected from population 1 resulting in x1 successes

A sample of n2 is selected from population 2 resulting in x2 successes

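$$\hat{p}_1 = \frac{x_1}{n_1} \quad\text{and}\quad \hat{p}_2 = \frac{x_2}{n_2}$$

and

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

is the pooled estimate of the common proportion when p1 = p2.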

Logic:

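$$\sigma_{\hat{p}_1} = \sqrt{\frac{p_1(1 - p_1)}{n_1}}, \qquad \sigma_{\hat{p}_2} = \sqrt{\frac{p_2(1 - p_2)}{n_2}}$$

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\sigma_{\hat{p}_1}^2 + \sigma_{\hat{p}_2}^2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$

$$= \sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \quad\text{if } p_1 = p_2 = p$$

$$\approx \sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$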


The Critical Region

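• $H_A: p_1 \ne p_2$: reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

• $H_A: p_1 > p_2$: reject H0 if $z > z_{\alpha}$

• $H_A: p_1 < p_2$: reject H0 if $z < -z_{\alpha}$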

Example

• In a national study to determine if there was an increase in mortality due to pipe smoking, a random sample of n1 = 1067 male nonsmoking pensioners were observed for a five-year period.

• In addition a sample of n2 = 402 male pensioners who had smoked a pipe for more than six years were observed for the same five-year period.

• At the end of the five-year period, x1 = 117 of the nonsmoking pensioners had died while x2 = 54 of the pipe-smoking pensioners had died.

• Is the mortality rate for pipe smokers higher than that for non-smokers?

We want to test:

$H_0: p_1 \ge p_2$ vs $H_A: p_1 < p_2$

The test statistic:

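$$z = \frac{\hat{p}_1 - \hat{p}_2}{\hat{\sigma}_{\hat{p}_1 - \hat{p}_2}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$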

Note:

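$$\hat{p}_1 = \frac{x_1}{n_1} = \frac{117}{1067} = 0.1097$$

$$\hat{p}_2 = \frac{x_2}{n_2} = \frac{54}{402} = 0.1343$$

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{117 + 54}{1067 + 402} = \frac{171}{1469} = 0.1164$$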

The test statistic:

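$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.1097 - 0.1343}{\sqrt{0.1164\,(1 - 0.1164)\left(\dfrac{1}{1067} + \dfrac{1}{402}\right)}} = -1.315$$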

We reject H0 if:

$z < -z_{\alpha} = -z_{0.05} = -1.645$

Not true, hence we accept H0.

Conclusion: There is not a significant (α = 0.05) increase in the mortality rate due to pipe-smoking.
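The pipe-smoking calculation can be verified with a few lines of code. This is a sketch using the counts from the example; the function name `two_proportion_z` is hypothetical, and the critical value comes from `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for comparing two proportions, using the pooled estimate."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)   # pooled estimate of the common proportion
    se = math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


# Non-smokers: 117 deaths out of 1067; pipe smokers: 54 deaths out of 402.
z = two_proportion_z(117, 1067, 54, 402)
z_alpha = NormalDist().inv_cdf(0.95)   # z_{0.05}
reject_h0 = z < -z_alpha               # one-tailed test, HA: p1 < p2

print(f"z = {z:.3f}, critical value = {-z_alpha:.3f}, reject H0: {reject_h0}")
```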

Estimating a difference in proportions using confidence intervals

Situation

• We have two populations (1 and 2)

• Let p1 denote the probability (proportion) of “success” in population 1.

• Let p2 denote the probability (proportion) of “success” in population 2.

• Objective is to estimate the difference in the two population proportions δ = p1 – p2.

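Confidence Interval for δ = p1 – p2

100P% = 100(1 – α)%:

$$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\,\hat{\sigma}_{\hat{p}_1 - \hat{p}_2}$$

that is,

$$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$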

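Example

• Estimating the increase in the mortality rate for pipe smokers over that for non-smokers, δ = p2 – p1

$$\hat{p}_2 - \hat{p}_1 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

$$= 0.1343 - 0.1097 \pm 1.960\sqrt{\frac{0.1097\,(1 - 0.1097)}{1067} + \frac{0.1343\,(1 - 0.1343)}{402}}$$

$$= 0.0247 \pm 0.0382$$

i.e. -0.0136 to 0.0629, or -1.36% to 6.29%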
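The same interval can be obtained from the raw counts. In this sketch the function name `proportion_difference_interval` is an assumption; it returns an interval for the first proportion minus the second, so the pipe smokers are passed first to match δ = p2 – p1.

```python
import math
from statistics import NormalDist


def proportion_difference_interval(xa, na, xb, nb, alpha=0.05):
    """Confidence interval for p_a - p_b using unpooled standard errors."""
    pa_hat, pb_hat = xa / na, xb / nb
    se = math.sqrt(pa_hat * (1 - pa_hat) / na + pb_hat * (1 - pb_hat) / nb)
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * se
    diff = pa_hat - pb_hat
    return diff - half_width, diff + half_width


# Pipe smokers (54 of 402) minus non-smokers (117 of 1067).
lower, upper = proportion_difference_interval(54, 402, 117, 1067)
print(f"95% confidence interval: {lower:.4f} to {upper:.4f}")
```

Unlike the test statistic, the interval does not pool the two samples, because it is not computed under the assumption p1 = p2.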

Comparing Means

Situation

• We have two normal populations (1 and 2)

• Let μ1 and σ1 denote the mean and standard deviation of population 1.

• Let μ2 and σ2 denote the mean and standard deviation of population 2.

• Let x1, x2, x3 , … , xn denote a sample from normal population 1.

• Let y1, y2, y3 , … , ym denote a sample from normal population 2.

• Objective is to compare the two population means

1. $H_0: \mu_1 = \mu_2$ vs $H_A: \mu_1 \ne \mu_2$

or

2. $H_0: \mu_1 \le \mu_2$ vs $H_A: \mu_1 > \mu_2$

or

3. $H_0: \mu_1 \ge \mu_2$ vs $H_A: \mu_1 < \mu_2$

Consider the test statistic:

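$$z = \frac{\bar{x} - \bar{y}}{\sigma_{\bar{x} - \bar{y}}} = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{\sigma_1^2}{n} + \dfrac{\sigma_2^2}{m}}} \approx \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_x^2}{n} + \dfrac{s_y^2}{m}}}$$

If $H_0: \mu_1 = \mu_2$ is true:

• z will have a standard Normal distribution

• This will also be true for the approximation (obtained by replacing σ1 by sx and σ2 by sy) if the sample sizes n and m are large (greater than 30)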

Note:

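$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

$$\bar{y} = \frac{1}{m}\sum_{i=1}^{m} y_i, \qquad s_y = \sqrt{\frac{\sum_{i=1}^{m} (y_i - \bar{y})^2}{m - 1}}$$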


The Critical Region

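• $H_A: \mu_1 \ne \mu_2$: reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

• $H_A: \mu_1 > \mu_2$: reject H0 if $z > z_{\alpha}$

• $H_A: \mu_1 < \mu_2$: reject H0 if $z < -z_{\alpha}$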

Example

• A study was interested in determining if an exercise program had some effect on reduction of blood pressure in subjects with abnormally high blood pressure.

• For this purpose a sample of n = 500 patients with abnormally high blood pressure were required to adhere to the exercise regime.

• A second sample of m = 400 patients with abnormally high blood pressure were not required to adhere to the exercise regime.

• After a period of one year the reduction in blood pressure was measured for each patient in the study.

We want to test:

$H_0: \mu_1 \le \mu_2$ (the exercise group did not have a higher average reduction in blood pressure)

vs

$H_A: \mu_1 > \mu_2$ (the exercise group did have a higher average reduction in blood pressure)

The test statistic:

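$$z = \frac{\bar{x} - \bar{y}}{\sigma_{\bar{x} - \bar{y}}} = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{\sigma_1^2}{n} + \dfrac{\sigma_2^2}{m}}} \approx \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_x^2}{n} + \dfrac{s_y^2}{m}}}$$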

Suppose the data has been collected and:

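$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = 10.67, \qquad s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = 3.895$$

$$\bar{y} = \frac{1}{m}\sum_{i=1}^{m} y_i = 7.83, \qquad s_y = \sqrt{\frac{\sum_{i=1}^{m} (y_i - \bar{y})^2}{m - 1}} = 4.224$$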

The test statistic:

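$$z = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_x^2}{n} + \dfrac{s_y^2}{m}}} = \frac{10.67 - 7.83}{\sqrt{\dfrac{3.895^2}{500} + \dfrac{4.224^2}{400}}} = \frac{2.84}{0.273765} = 10.4$$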

We reject H0 if:

$z > z_{\alpha} = z_{0.05} = 1.645$

True, hence we reject H0.

Conclusion: There is a significant (α = 0.05) effect due to the exercise regime on the reduction in blood pressure.
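The large-sample comparison of means can be checked from the summary statistics. This is a sketch; the function name `two_sample_z` is hypothetical and the inputs are the values quoted in the exercise example.

```python
import math
from statistics import NormalDist


def two_sample_z(x_bar, s_x, n, y_bar, s_y, m):
    """Large-sample z statistic for comparing two means; returns (z, se)."""
    se = math.sqrt(s_x**2 / n + s_y**2 / m)   # estimated sd of x_bar - y_bar
    return (x_bar - y_bar) / se, se


# Exercise group: mean 10.67, sd 3.895, n = 500; control: mean 7.83, sd 4.224, m = 400.
z, se = two_sample_z(10.67, 3.895, 500, 7.83, 4.224, 400)
z_alpha = NormalDist().inv_cdf(0.95)   # z_{0.05}
reject_h0 = z > z_alpha                # one-tailed test, HA: mu1 > mu2

print(f"se = {se:.6f}, z = {z:.1f}, reject H0: {reject_h0}")
```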

Estimating a difference in means using confidence intervals

Situation

• We have two populations (1 and 2)

• Let μ1 denote the mean of population 1.

• Let μ2 denote the mean of population 2.

• Objective is to estimate the difference in the two population means δ = μ1 – μ2.

Confidence Interval for δ = μ1 – μ2

$$\hat{\mu}_1 - \hat{\mu}_2 \pm z_{\alpha/2}\,\hat{\sigma}_{\hat{\mu}_1 - \hat{\mu}_2}$$

that is,

$$\bar{x} - \bar{y} \pm z_{\alpha/2}\sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}$$

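Example

• Estimating the increase in the average reduction in blood pressure due to the exercise regime, δ = μ1 – μ2

$$\bar{x} - \bar{y} \pm z_{\alpha/2}\sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}} = 10.67 - 7.83 \pm 1.960\sqrt{\frac{3.895^2}{500} + \frac{4.224^2}{400}}$$

$$= 2.84 \pm 1.96\,(0.273765) = 2.84 \pm 0.537$$

i.e. 2.303 to 3.377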
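A short check of this interval, assuming the same summary statistics; the function name `mean_difference_interval` is made up for the illustration.

```python
import math
from statistics import NormalDist


def mean_difference_interval(x_bar, s_x, n, y_bar, s_y, m, alpha=0.05):
    """Large-sample confidence interval for mu1 - mu2."""
    se = math.sqrt(s_x**2 / n + s_y**2 / m)
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * se
    diff = x_bar - y_bar
    return diff - half_width, diff + half_width


lower, upper = mean_difference_interval(10.67, 3.895, 500, 7.83, 4.224, 400)
print(f"95% confidence interval: {lower:.3f} to {upper:.3f}")
```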

Comparing Means – small samples

Situation

• We have two normal populations (1 and 2)

• Let μ1 and σ1 denote the mean and standard deviation of population 1.

• Let μ2 and σ2 denote the mean and standard deviation of population 2.

• Let x1, x2, x3 , … , xn denote a sample from normal population 1.

• Let y1, y2, y3 , … , ym denote a sample from normal population 2.

• Objective is to compare the two population means

1. $H_0: \mu_1 = \mu_2$ vs $H_A: \mu_1 \ne \mu_2$

or

2. $H_0: \mu_1 \le \mu_2$ vs $H_A: \mu_1 > \mu_2$

or

3. $H_0: \mu_1 \ge \mu_2$ vs $H_A: \mu_1 < \mu_2$

Consider the test statistic:

$$z = \frac{\bar{x} - \bar{y}}{\sigma_{\bar{x} - \bar{y}}} = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{\sigma_1^2}{n} + \dfrac{\sigma_2^2}{m}}} \approx \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_x^2}{n} + \dfrac{s_y^2}{m}}}$$

If the sample sizes (m and n) are large the statistic

$$t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_x^2}{n} + \dfrac{s_y^2}{m}}}$$

will have approximately a standard normal distribution.

This will not be the case if the sample sizes (m and n) are small.

The t test – for comparing means – small samples

Situation

• We have two normal populations (1 and 2)

• Let μ1 and σ denote the mean and standard deviation of population 1.

• Let μ2 and σ denote the mean and standard deviation of population 2.

• Note: we assume that the standard deviation for each population is the same:

σ1 = σ2 = σ

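Let

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

$$\bar{y} = \frac{1}{m}\sum_{i=1}^{m} y_i, \qquad s_y = \sqrt{\frac{\sum_{i=1}^{m} (y_i - \bar{y})^2}{m - 1}}$$

The pooled estimate of σ:

$$s_{\text{Pooled}} = \sqrt{\frac{(n - 1)s_x^2 + (m - 1)s_y^2}{n + m - 2}}$$

Note: both sx and sy are estimators of σ. These can be combined to form a single estimator of σ, sPooled.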

The test statistic:

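$$t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_{\text{Pooled}}^2}{n} + \dfrac{s_{\text{Pooled}}^2}{m}}} = \frac{\bar{x} - \bar{y}}{s_{\text{Pooled}}\sqrt{\dfrac{1}{n} + \dfrac{1}{m}}}$$

If μ1 = μ2 this statistic has a t distribution with n + m – 2 degrees of freedom.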


The Critical Region

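• $H_A: \mu_1 \ne \mu_2$: reject H0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$

• $H_A: \mu_1 > \mu_2$: reject H0 if $t > t_{\alpha}$

• $H_A: \mu_1 < \mu_2$: reject H0 if $t < -t_{\alpha}$

$t_{\alpha}$ and $t_{\alpha/2}$ are critical points under the t distribution with n + m – 2 degrees of freedom.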

Example

• A study was interested in determining if administration of a drug reduces cancerous tumour size.

• For this purpose n + m = 9 test animals are implanted with a cancerous tumour.

• n = 3 are selected at random and administered the drug.

• The remaining m = 6 are left untreated.

• Final tumour sizes are measured at the end of the test period.

We want to test:

$H_0: \mu_1 \ge \mu_2$ (the treated group did not have a lower average final tumour size)

vs

$H_A: \mu_1 < \mu_2$ (the treated group did have a lower average final tumour size)

The test statistic:

$$t = \frac{\bar{x} - \bar{y}}{s_{\text{Pooled}}\sqrt{\dfrac{1}{n} + \dfrac{1}{m}}}$$

Suppose the data has been collected and:

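Drug treated:  1.89  1.79  1.29
Untreated:     2.08  1.28  1.75  1.90  2.32  2.16

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = 1.657, \qquad s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = 0.3215$$

$$\bar{y} = \frac{1}{m}\sum_{i=1}^{m} y_i = 1.915, \qquad s_y = \sqrt{\frac{\sum_{i=1}^{m} (y_i - \bar{y})^2}{m - 1}} = 0.3693$$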

The test statistic:

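$$s_{\text{Pooled}} = \sqrt{\frac{(n - 1)s_x^2 + (m - 1)s_y^2}{n + m - 2}} = \sqrt{\frac{2\,(0.3215)^2 + 5\,(0.3693)^2}{7}} = 0.3563$$

$$t = \frac{1.657 - 1.915}{0.3563\sqrt{\dfrac{1}{3} + \dfrac{1}{6}}} = \frac{-0.258}{0.252} = -1.025$$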

We reject H0 if:

$t < -t_{\alpha} = -t_{0.05} = -1.895$ with d.f. = n + m – 2 = 7

Since t = -1.025 is not less than -1.895, we accept H0.

Conclusion: The drug treatment does not result in a significantly (α = 0.05) smaller final tumour size.
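The pooled t-test can be reproduced from the raw tumour sizes. This is a sketch using only the standard library; the critical value 1.895 is the table value quoted in the example (t at 0.05 with 7 d.f.), entered by hand, and the variable names are illustrative.

```python
import math
from statistics import mean, stdev

treated = [1.89, 1.79, 1.29]                      # drug treated, n = 3
untreated = [2.08, 1.28, 1.75, 1.90, 2.32, 2.16]  # untreated, m = 6
t_crit = 1.895  # t_{0.05} for 7 d.f., table value quoted in the example

n, m = len(treated), len(untreated)
s_x, s_y = stdev(treated), stdev(untreated)

# Pooled estimate of the common standard deviation.
s_pooled = math.sqrt(((n - 1) * s_x**2 + (m - 1) * s_y**2) / (n + m - 2))
t = (mean(treated) - mean(untreated)) / (s_pooled * math.sqrt(1 / n + 1 / m))
reject_h0 = t < -t_crit   # one-tailed test, HA: mu1 < mu2

print(f"s_pooled = {s_pooled:.4f}, t = {t:.3f}, reject H0: {reject_h0}")
```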
