TESTING OF HYPOTHESIS

J. V. Ramani, Radiant Academy

INDIA

Sampling Distributions

Introduction. The purpose of most statistical investigations is to gen-

eralize from the information contained in random samples about the

population from which the samples were obtained. In particular, we

are usually concerned with the problem of making inferences about the

"parameters" of populations, such as the mean µ or the standard

deviation σ. In making such inferences, we use statistics such as

x̄ and s, namely, quantities calculated on the basis of sample observa-

tions.

Population. A population consists of the totality of observations with

which we are concerned.

Sample. A sample is a subset of a population.

Statistic. Any function of the random variables constituting a random

sample is called a statistic.

Sampling Distribution. The probability distribution of a statistic is

called a sampling distribution.

Tests of Hypothesis. A statistical hypothesis is an "assertion" or a

"conjecture" concerning one or more populations.

The truth or falsity of a statistical hypothesis is never known with

absolute certainty unless we examine the entire population. This, of

course, would be impractical in most situations. Instead, we take

a random sample from the population of interest and use the data con-

tained in this sample to provide evidence that either supports or does

not support the hypothesis. Evidence from the sample that is inconsis-

tent with the stated hypothesis leads to a rejection of the hypothesis,

whereas evidence supporting the hypothesis leads to its acceptance.

The Role of Probability in Hypothesis Testing. The decision procedure

must be carried out with an awareness of the probability of a wrong

conclusion. The acceptance of a hypothesis merely implies that the

data do not give sufficient evidence to refute it. On the other hand,

rejection implies that the sample evidence refutes it. That is, rejection

means that there is a small probability of obtaining the sample

information observed when, in fact, the hypothesis is true.

The formal statement of a hypothesis is often influenced by the structure

of the probability of a wrong conclusion. The structure of hypothesis

testing will be formulated with the use of the term "null hypothesis".

This refers to any hypothesis we wish to test and is denoted

by H0. The rejection of H0 leads to the acceptance of an "alternate

hypothesis", denoted by H1.

Guidelines for selecting the Null Hypothesis.

When the goal of an experiment is to establish an assertion, the nega-

tion of the assertion should be taken as the null hypothesis. The asser-

tion becomes the alternate hypothesis.

A null hypothesis concerning a population parameter will always be

stated so as to specify an exact value of the parameter, whereas

the alternative hypothesis allows for the possibility of several values.

Hence, if H0 is the null hypothesis p = 0.5 for a binomial population,

the alternative hypothesis H1 would be one of the following:

p > 0.5, p < 0.5 or p ≠ 0.5.

Rejection of the null hypothesis when it is true is called a "TYPE

I" error. Acceptance of the null hypothesis when it is false is called

a "TYPE II" error.

Possible situations for testing a statistical hypothesis are given below:

             H0 is true          H0 is false
Accept H0    Correct decision    TYPE II error
Reject H0    TYPE I error        Correct decision

The probability of committing a TYPE I error, also called the "level

of significance", is denoted by α. The probability of a TYPE II error

will be denoted by β.
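To make α and β concrete, the sketch below computes both for an upper-tailed test on a normal mean with known σ. Every number in it (µ0 = 70, σ = 8.9, n = 100, and especially the "true" mean of 72) is an illustrative assumption; β is not a single number but depends on which alternative value of the parameter is actually true.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution N(0, 1)


def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Beta for the upper-tailed z test of H0: mu = mu0 when the mean is mu_true."""
    z_crit = std_normal.inv_cdf(1 - alpha)       # H0 is rejected when z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # mean of z when mu_true holds
    return std_normal.cdf(z_crit - shift)        # P(H0 is not rejected | mu_true)


# Illustrative (assumed) values only.
alpha = 0.05
beta = type_ii_error(mu0=70, mu_true=72, sigma=8.9, n=100, alpha=alpha)
print(f"alpha = {alpha}, beta = {beta:.3f}")
```

Moving the assumed true mean closer to µ0 makes β larger, which is why β can be specified only for a stated alternative.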

One and two-tailed Tests. A test of any statistical hypothesis, where

the alternative is one sided, such as

H0 : θ = θ0, H1 : θ > θ0

or

H0 : θ = θ0, H1 : θ < θ0

is called a one-tailed test.

A test of any statistical hypothesis where the alternative is two-sided,

such as

H0 : θ = θ0, H1 : θ ≠ θ0

is called a two-tailed test, since the critical region is split into two parts,

often having equal probabilities placed in each tail of the distribution

of the test statistic. The alternative hypothesis θ ≠ θ0 states that either

θ < θ0 or θ > θ0.
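As a rough illustration of how the critical region depends on the form of the alternative, the sketch below looks up standard normal critical values. It assumes the test statistic is a z statistic, and the labels "greater", "less" and "two-sided" are names chosen here for illustration.

```python
from statistics import NormalDist


def critical_values(alpha, alternative):
    """Return the z critical value(s) bounding the critical region at level alpha."""
    z = NormalDist().inv_cdf
    if alternative == "greater":    # H1: theta > theta0, all of alpha in the upper tail
        return (z(1 - alpha),)
    if alternative == "less":       # H1: theta < theta0, all of alpha in the lower tail
        return (z(alpha),)
    if alternative == "two-sided":  # H1: theta != theta0, alpha/2 in each tail
        return (z(alpha / 2), z(1 - alpha / 2))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


one_tailed = critical_values(0.05, "greater")
two_tailed = critical_values(0.01, "two-sided")
print(one_tailed, two_tailed)
```

For the two-tailed case the two values are symmetric about zero because α is split equally between the tails.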

Procedure for Testing of Hypothesis.

1. We formulate a null hypothesis and an appropriate alternative hypothesis

which we accept when the null hypothesis must be rejected.

2. We specify the probability of TYPE I error, i.e., α. If possible, desired

or necessary, we may also specify β.

3. Based on the sampling distribution of an appropriate statistic, we con-

struct a criterion for testing the null hypothesis against the given al-

ternative.

4. We calculate from the data the value of the statistic on which the decision

is to be based.

5. We decide whether to reject the null hypothesis, whether to accept it or

to reserve judgement.
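The five steps can be sketched as a small function for the simplest case, a test on a mean with known σ. This is only one possible arrangement; the function name, its arguments and the data in the call are all made up for illustration.

```python
from math import inf, sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alpha, alternative):
    """Steps 3 to 5 for a test on a mean with known sigma.

    Steps 1 and 2 are the caller's choices: mu0 and `alternative`
    ("greater", "less" or "two-sided") fix H0 and H1, and alpha is the
    probability of a TYPE I error.
    """
    inv = NormalDist().inv_cdf
    # Step 3: criterion based on the sampling distribution of z under H0.
    lower, upper = -inf, inf
    if alternative == "greater":
        upper = inv(1 - alpha)
    elif alternative == "less":
        lower = inv(alpha)
    elif alternative == "two-sided":
        lower, upper = inv(alpha / 2), inv(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    # Step 4: value of the statistic calculated from the data.
    z = (xbar - mu0) / (sigma / sqrt(n))
    # Step 5: decision (True means H0 is rejected).
    return z, (z < lower or z > upper)


# Made-up numbers, for illustration only.
z, reject_h0 = z_test(xbar=52.1, mu0=50.0, sigma=6.0, n=36, alpha=0.05,
                      alternative="greater")
print(f"z = {z:.2f}, reject H0: {reject_h0}")
```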

TESTS CONCERNING MEANS.

Example 1. A random sample of 100 recorded deaths in India during the past

year showed an average life span of 71.8 years. Assuming a population

standard deviation of 8.9 years, does this seem to indicate that the

mean life span today is greater than 70 years? Use a 0.05 level of

significance.

Solution.

1. H0 : µ = 70 years , H1 : µ > 70 years.

2. α = 0.05.

3. Critical region. z > 1.645 where

$$ z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}. $$

4. x̄ = 71.8 years, σ = 8.9 years, n = 100 and

$$ z = \frac{71.8 - 70}{8.9/\sqrt{100}} = 2.02. $$

5. Decision. Reject H0 and we conclude that the mean life span today

is greater than 70 years.
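As a cross-check on Example 1, the sketch below recomputes z and also reports the upper-tail probability beyond it. That tail probability (often called a P-value) is not part of the solution above; it is added here only as another way of seeing how far into the critical region the statistic falls.

```python
from math import sqrt
from statistics import NormalDist

# Data of Example 1; n = 100 is the sample size used in step 4.
xbar, mu0, sigma, n = 71.8, 70.0, 8.9, 100

z = (xbar - mu0) / (sigma / sqrt(n))
p_value = 1 - NormalDist().cdf(z)  # P(Z > z) when H0 is true

print(f"z = {z:.2f}, upper-tail probability = {p_value:.4f}")
```

A tail probability below α = 0.05 leads to the same decision as z > 1.645.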

Example 2. A manufacturer of sports equipment has developed a new

synthetic fishing line that he claims has a mean breaking strength of 8kg

with a standard deviation of 0.5kg. Test the hypothesis that µ = 8kg

against the alternate hypothesis that µ ≠ 8kg if a random sample of

50 lines is tested and found to have a mean breaking strength of 7.8kg.

Use 0.01 level of significance.

Solution.

1. H0 : µ = 8kg, H1 : µ ≠ 8kg.

2. α = 0.01.

3. Critical region. z < −2.575 or z > 2.575 where

$$ z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}. $$

4. x̄ = 7.8kg, n = 50 and

$$ z = \frac{7.8 - 8}{0.5/\sqrt{50}} = -2.83. $$

5. Decision. Reject H0 and we conclude that the average breaking

strength is not equal to 8, but in fact, less than 8kg.
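The same decision in Example 2 can be reached by expressing the two-tailed critical region in kilograms instead of in z units. The sketch below is one way to do that; it uses the exact normal quantile, which differs slightly from the table value in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Data of Example 2.
mu0, sigma, n, alpha = 8.0, 0.5, 50, 0.01
xbar = 7.8

z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
margin = z_crit * sigma / sqrt(n)

# H0 is rejected when the sample mean falls outside (lower, upper).
lower, upper = mu0 - margin, mu0 + margin
reject_h0 = xbar < lower or xbar > upper

print(f"reject H0 if xbar < {lower:.3f} or xbar > {upper:.3f}: {reject_h0}")
```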

Example 3. The specifications for a certain kind of ribbon call for a mean

breaking strength of 180lbs. If five pieces of the ribbon (randomly selected

from different rolls) have a mean breaking strength of 169.5lbs

with a standard deviation of 5.7lbs, test the null hypothesis µ = 180lbs

against the alternate hypothesis µ < 180lbs, at the 0.01 level of signif-

icance.

Solution.

1. H0 : µ = 180lbs , H1 : µ < 180lbs.

2. α = 0.01.

3. Critical region. Reject the null hypothesis if t < −3.747, where

t0.01 = 3.747 for 5 − 1 = 4 degrees of freedom and

$$ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}. $$

4. We have

$$ t = \frac{169.5 - 180}{5.7/\sqrt{5}} = -4.12. $$

5. Decision. Reject the null hypothesis H0, i.e., the breaking strength

is below specification.
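A minimal sketch of the arithmetic in Example 3 follows. Python's standard library has no t distribution, so the critical value is simply typed in from the table, as in the solution; the function name is an illustrative choice.

```python
from math import sqrt


def one_sample_t(xbar, mu0, s, n):
    """t statistic for a small sample with the population sigma unknown."""
    return (xbar - mu0) / (s / sqrt(n))


t = one_sample_t(xbar=169.5, mu0=180.0, s=5.7, n=5)
t_crit = -3.747            # table value for 4 degrees of freedom, alpha = 0.01
reject_h0 = t < t_crit     # lower-tailed test

print(f"t = {t:.2f}, reject H0: {reject_h0}")
```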

Example 4. To test the claim that the resistance of electric wire can

be reduced by more than 0.05 Ω by alloying, 32 values obtained for

standard wire yielded x̄1 = 0.136 Ω and s1 = 0.004 Ω, and 32 values for

alloyed wire yielded x̄2 = 0.083 Ω and s2 = 0.005 Ω. At the 0.05 level of

significance, do the data support the claim?

Solution.

1. H0 : µ1 − µ2 = 0.05 = d0 , H1 : µ1 − µ2 > 0.05.

2. α = 0.05.

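3. Critical region. Reject the null hypothesis if z > 1.645 where

$$ z = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}, $$

with s1 and s2 substituted for σ1 and σ2 because both samples are large.

4. We have

$$ z = \frac{(0.136 - 0.083) - 0.05}{\sqrt{\dfrac{0.004^2}{32} + \dfrac{0.005^2}{32}}} = 2.65. $$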

5. Decision. Since 2.65 > 1.645, H0 must be rejected, i.e., the data

substantiate the claim.
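The computation in Example 4 can be checked with a few lines. The sketch assumes, as is usual for large samples, that the sample standard deviations stand in for the population values; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, s1, s2, n1, n2, d0):
    """Large-sample z statistic for H0: mu1 - mu2 = d0."""
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((xbar1 - xbar2) - d0) / standard_error


z = two_sample_z(xbar1=0.136, xbar2=0.083, s1=0.004, s2=0.005,
                 n1=32, n2=32, d0=0.05)
z_crit = NormalDist().inv_cdf(1 - 0.05)  # upper-tailed test at alpha = 0.05
reject_h0 = z > z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```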

Example 5. An experiment was performed to compare the abrasive wear

of two different laminated materials. Twelve pieces of material 1 were

tested by exposing each piece to a machine measuring wear. Ten pieces

of material 2 were similarly tested. In each case, the depth of wear

was observed. The samples of material 1 gave an average (coded) wear

of 85 units with a sample standard deviation of 4, while the samples

of material 2 gave an average of 81 and a sample standard deviation

of 5. Can we conclude at the 0.05 level of significance that the

abrasive wear of material 1 exceeds that of material 2 by more than 2

units? Assume the populations to be approximately normal with equal

variances.

Solution.

1. H0 : µ1 − µ2 = 2 = d0 , H1 : µ1 − µ2 > 2.

2. α = 0.05.

3. Critical region. t > 1.725 where

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} $$

(σ1 = σ2 but unknown) with ν = 12 + 10 − 2 = 20 degrees of freedom.

4. We have

x̄1 = 85, s1 = 4, n1 = 12

x̄2 = 81, s2 = 5, n2 = 10.

Hence

$$ s_p = \sqrt{\frac{11 \times 16 + 9 \times 25}{12 + 10 - 2}} = 4.478 $$

and

$$ t = \frac{(85 - 81) - 2}{4.478 \sqrt{\dfrac{1}{12} + \dfrac{1}{10}}} = 1.04. $$

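5. Decision. Since t = 1.04 < 1.725, do not reject H0; the data do not show that the abrasive wear of material 1 exceeds that of material 2 by more than 2 units.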
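The pooled estimate and the t statistic of Example 5 can be reproduced as follows. This is a sketch under the example's own assumption of equal population variances; the critical value is the table value quoted in the solution, and the function name is illustrative.

```python
from math import sqrt


def pooled_t(xbar1, s1, n1, xbar2, s2, n2, d0=0.0):
    """Pooled standard deviation and t statistic for H0: mu1 - mu2 = d0."""
    dof = n1 + n2 - 2
    sp = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / dof)
    t = ((xbar1 - xbar2) - d0) / (sp * sqrt(1 / n1 + 1 / n2))
    return sp, t, dof


sp, t, dof = pooled_t(xbar1=85, s1=4, n1=12, xbar2=81, s2=5, n2=10, d0=2)
t_crit = 1.725          # table value for 20 degrees of freedom, alpha = 0.05
reject_h0 = t > t_crit  # upper-tailed test

print(f"sp = {sp:.3f}, t = {t:.2f}, d.f. = {dof}, reject H0: {reject_h0}")
```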

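Hypotheses concerning one proportion. To test the null hypothesis

p = p0 against one of the alternatives

p < p0, p > p0, p ≠ p0

we use the statistic

$$ z = \frac{x - np_0}{\sqrt{np_0(1 - p_0)}}, $$

where x is the number of successes in n trials. For large n, z is approximately distributed as N(0, 1).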

Example 6. In a study designed to investigate whether certain detonators

used with explosives in coal mining meet the requirement that at least

90 percent will ignite the explosive when charged, it is found that 174

of 200 detonators function properly. Test the null hypothesis p = 0.90

against the alternative hypothesis p < 0.90 at the 0.05 level of significance.

Solution.

1. H0 : p = 0.90 , H1 : p < 0.90.

2. α = 0.05.

3. Critical region. Reject H0 if z < −1.645 where

$$ z = \frac{x - np_0}{\sqrt{np_0(1 - p_0)}}. $$

4. We have

$$ z = \frac{174 - 200 \times 0.90}{\sqrt{200 \times 0.90 \times 0.10}} = -1.41. $$

5. Decision. Since z = −1.41 > −1.645, H0 cannot be rejected.
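A short sketch of the calculation in Example 6 is given below. It relies on the large-sample normal approximation to the binomial distribution; the function name is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(x, n, p0):
    """Large-sample z statistic for H0: p = p0, with x successes in n trials."""
    return (x - n * p0) / sqrt(n * p0 * (1 - p0))


z = proportion_z(x=174, n=200, p0=0.90)
z_crit = NormalDist().inv_cdf(0.05)  # lower-tailed test at alpha = 0.05
reject_h0 = z < z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

Here np0 = 180 and n(1 − p0) = 20, so both expected counts are large enough for the approximation to be reasonable.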

Goodness of fit. The question of goodness of fit arises whenever we try

to compare the observed frequency distribution with the corresponding

values of an expected or theoretical distribution.

To test whether the discrepancies between the observed and expected

frequencies can be attributed to chance, we use the statistic

$$ \chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, $$

where the Oi and Ei are the observed and expected frequencies. The

statistic χ² has approximately a chi-square distribution with k − m

degrees of freedom, where k is the number of terms in the formula for

χ² and m is the number of quantities, obtained from the observed data,

that are needed to calculate the expected frequencies.

Example 7. During 400 five-minute intervals the air-traffic control of an

airport received 0,1,2,...,13 radio messages with respective frequencies

of 3,15,47,76,68,74,46,39,15, 9,5,2,0 and 1. Suppose, furthermore, that

we want to check whether these data substantiate the claim that the

number of radio messages which they receive during a 5-minute in-

terval may be looked upon as a random variable having the Poisson

distribution with λ = 4.6.

Solution.

1. H0: Random variable has Poisson distribution with λ = 4.6. H1:

Random variable does not have Poisson distribution with λ = 4.6.

2. α = 0.01.

3. Critical region. Reject H0 if χ² > 21.666, the value of χ²0.01 for

ν = k − m = 10 − 1 = 9 degrees of freedom.

4.

$$ \chi^2 = \frac{(18 - 22.4)^2}{22.4} + \frac{(47 - 42.8)^2}{42.8} + \cdots + \frac{(8 - 8.0)^2}{8.0} = 6.749 $$

5. Since χ² = 6.749 < 21.666, H0 cannot be rejected.

The above computations are done using the following table. Classes with expected frequencies below 5 are combined (0 with 1, and 10 through 13), which leaves k = 10 classes.

No. of radio messages   Observed freq.   Poisson prob.   Expected freq.
 0                        3              0.010             4.0
 1                       15              0.046            18.4
 2                       47              0.107            42.8
 3                       76              0.163            65.2
 4                       68              0.187            74.8
 5                       74              0.173            69.2
 6                       46              0.132            52.8
 7                       39              0.087            34.8
 8                       15              0.050            20.0
 9                        9              0.025            10.0
10                        5              0.012             4.8
11                        2              0.005             2.0
12                        0              0.002             0.8
13                        1              0.001             0.4
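The χ² value in step 4 can be reproduced from the table with the sketch below. It takes the expected frequencies as tabulated instead of recomputing the Poisson probabilities, and it combines the first two and the last four classes so that no expected frequency is below 5, which is how the terms (18 − 22.4)² and (8 − 8.0)² in step 4 arise.

```python
# Observed and expected frequencies for 0, 1, ..., 13 messages (from the table).
observed = [3, 15, 47, 76, 68, 74, 46, 39, 15, 9, 5, 2, 0, 1]
expected = [4.0, 18.4, 42.8, 65.2, 74.8, 69.2, 52.8, 34.8, 20.0, 10.0,
            4.8, 2.0, 0.8, 0.4]

# Combine classes 0-1 and 10-13 so every expected frequency is at least 5.
obs = [sum(observed[:2])] + observed[2:10] + [sum(observed[10:])]
exp = [sum(expected[:2])] + expected[2:10] + [sum(expected[10:])]

k = len(obs)  # number of terms in the chi-square sum
m = 1         # only the total frequency (400) is taken from the data
chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))

print(f"k = {k}, degrees of freedom = {k - m}, chi-square = {chi2:.3f}")
```

The resulting statistic is then compared with the tabulated chi-square critical value for 9 degrees of freedom.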
