8.2 basics of hypothesis testing · pdf file11/8/2014 · 8.2 – basics of...

8.2 – Basics of Hypothesis Testing:

Objectives: 1. State Null and Alternative Hypotheses 2. Find Critical Values 3. Find Test Statistics 4. Find P-Values 5. State Conclusions about Claims 6. Identify Type I and Type II Errors.

Overview: In previous chapters we used “descriptive statistics” when we summarized data using tools such as graphs, and statistics such as the mean and standard deviation. Methods of inferential statistics use sample data to make an inference or conclusion about a population. The two main activities of inferential statistics are using sample data to (1) estimate a population parameter (such as estimating a population parameter with a confidence interval), and (2) test a hypothesis or claim about a population parameter. In Chapter 7 we presented methods for estimating a population parameter with a confidence interval, and in this chapter we present the method of hypothesis testing.

What is a Hypothesis Test? In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test is a standard procedure for testing a claim about a property of a population. The main objective of this chapter is to develop the ability to conduct hypothesis tests for claims made about a population proportion p, a population meanµ, or a population standard deviationσ.

Examples of Hypotheses that can be Tested: • Genetics: The Genetics & IVF Institute claims that its XSORT method allows couples to

increase the probability of having a baby girl. • Business: A newspaper headline makes the claim that most workers get their jobs through

networking. • Medicine: Medical researchers claim that when people with colds are treated with echinacea,

the treatment has no effect. • Aircraft Safety: The Federal Aviation Administration claims that the mean weight of an airline

passenger (including carry-on baggage) is greater than 185 lb, which it was 20 years ago. • Quality Control: When new equipment is used to manufacture aircraft altimeters, the new

altimeters are better because the variation in the errors is reduced so that the readings are more consistent. (In many industries, the quality of goods and services can often be improved by reducing variation.)

When conducting hypothesis tests as described in this chapter and the following chapters, instead of jumping directly to procedures and calculations, be sure to consider the context of the data, the source of the data, and the sampling method used to obtain the sample data.

Rare Event Rule for Inferential Statistics: The methods of this chapter are based on the rare event rule. If, under a given assumption, the probability of a particular observed event is exceptionally small, we conclude that the assumption is probably not correct. Following this rule, we test a claim by analyzing sample data in an attempt to distinguish between results that can easily occur by chance and results that are highly unlikely to occur by chance. Example: Claim: An employee of a company is equally likely to take a sick day on any day of the week. Last year, the total number of sick days taken by all the employees of the company was 143. Of these, 52 were Mondays, 14 were Tuesdays, 17 were Wednesdays, 17 were Thursdays, and 43 were Fridays. What do you conclude about the claim? Do not use formal procedures or exact calculations. Use only the rare event rule and make a subjective estimate to determine whether the event is likely. Solution: Based on the numbers in the data given, it would seem to be highly unlikely that this claim is true. Example: A math teacher tries a new method for teaching her introductory statistics class. Last year the mean score on the final test was 73. This year the mean on the same final was 76. Write the claim that is suggested by the given statement, and then write a conclusion about the claim. Do not use symbolic expressions or formal procedures; use common sense. Solution: A possible claim suggested by this statement is that the new method of teaching introductory statistics has made an improvement in students test scores. The logical conclusion is that this claim is false based on the fact that the difference in mean test scores is highly likely to occur by chance.

Null and Alternate Hypotheses:

1. The null hypothesis (denoted by H0) is a statement that the value of a population parameter (such as proportion, mean, or standard deviation) is equal to some claimed value. We test the null hypothesis directly. Either reject H0 or fail to reject H0.

2. The alternative hypothesis (denoted by H1 or Ha or HA) is the statement that the parameter has a

value that somehow differs from the null hypothesis. The symbolic form of the alternative hypothesis must use one of these symbols: ≠, <, >.

If you are conducting a study and want to use a hypothesis test to support your claim, the claim must be worded so that it becomes the alternative hypothesis. This hypothesis MUST use the symbols ≠≠≠≠, <, >. You can never support a claim that some parameter is equal to some specified value.

Identifying the Null and Alternate Hypotheses: The following flowchart can be helpful for identifying the null and alternate hypothesis of a claim.

1. Identify the specific claim or hypothesis to be tested and express it in symbolic form. 2. Give the symbolic form that must be true when the original claim is false. 3. Of the two symbolic expressions obtained so far, let the alternative hypothesis be the one not

containing equality. Let the Null Hypothesis be the symbolic expression that the parameter equals the fixed value being considered.

Example: A math teacher tries a new method for teaching her introductory statistics class. Last year the mean score on the final exam was 73. This year the mean on the same final was 76. The teacher makes the claim that the new method makes a difference in student’s final exam scores. Write the null and alternate hypothesis. Solution:

1. The hypothesis to be tested is that the mean final exam score is greater than 73. In symbolic form 73>µ .

2. If the original claim is false, then 73=µ .

3. The null hypothesis is the statement containing an equality, therefore, 73:0 =µH . The

alternative hypothesis is then 73:1 >µH . It may be helpful to remember that the Null Hypothesis essentially means no change, difference or effect. Example: The owner of a football team claims that the average attendance at games is over 62,900, and he is therefore justified in moving the team to a city with a larger stadium. State the hypothesis to be tested. Express the null hypothesis and the alternative hypothesis in symbolic form. Solution:

1. The hypothesis to be tested is that average attendance is greater than 62,900. In symbolic form900,62>µ .

2. If the original claim is false, then 900,62=µ .

3. The null hypothesis is the statement containing an equality, therefore, 900,62:0 =µH . The

alternative hypothesis is then 900,62:1 >µH . Example: A skeptical paranormal researcher claims that the proportion of Americans that have seen a UFO is less than 3 in every one thousand. State the hypothesis to be tested. Express the null hypothesis and the alternative hypothesis in symbolic form. Solution:

1. The hypothesis to be tested is that the proportion of Americans that have seen a UFO is less than 3 in every one thousand. In symbolic form 003.0<p .

2. If the original claim is false, then 003.0=p .

3. The null hypothesis is the statement containing an equality, therefore, 003.0:0 =pH . The

alternative hypothesis is then 003.0:1 <pH .

Converting Sample Data to a Test Statistic: The calculations required for a hypothesis test typically involves converting a sample statistic to a test statistic. The test statistic is a value used in making a decision about the null hypothesis, and is found by converting the sample statistic to a score with the assumption that the null hypothesis is true.

Test Statistic for a Proportion:

n

pq

ppz

−=ˆ

Test Statistic for the Mean:

n

xz σ

µ−= or

n

sx

tµ−=

Test Statistic for Standard Deviation: 2

22 )1(

σχ sn −=

The test statistic for the mean uses the normal or t distribution, depending on the conditions that are satisfied. Example: A claim is made that the proportion of children who play sports is less than 0.5, and the sample statistics include n = 1671 subjects with 30% saying that they play a sport. Solution: Convert the sample statistic by using the test statistic formula for a proportion.

35.16

1671

)5.0)(5.0(

5.03.0ˆ−=−=−=

n

pq

ppz

Example: The claim is that the proportion of accidental deaths of the elderly attributable to residential falls is more than 0.10, and the sample statistics include n = 800 deaths of the elderly with 15% of them attributable to residential falls. Solution: Convert the sample statistic by using the test statistic formula for a proportion.

7.4

800

)9.0)(1.0(

1.015.0ˆ=−=−=

n

pq

ppz

Finding Critical Values: The critical region (or rejection region) is the set of all values of the test statistic that cause us to reject the null hypothesis. For example, see the red-shaded region in the following diagram. The significance level (denoted byα) is the probability that the test statistic will fall in the critical region when the null hypothesis is actually true. This is the same α introduced in Section 7-2. Common choices for α are 0.05, 0.01, and 0.10. A critical value is any value that separates the critical region (where we reject the null hypothesis) from the values of the test statistic that do not lead to rejection of the null hypothesis. The critical values depend on the nature of the null hypothesis, the sampling distribution that applies, and the significance levelα. See the previous figure where the critical value of z = 1.645 corresponds to a significance level of α = 0.05. Example: Assume that the data has a normal distribution and the number of observations is greater than fifty. Find the critical z value used to test a null hypothesis. α = 0.05 for a two-tailed test. Solution: With α = 0.05 for a two-tailed test, there will be 0.025 in the left tail and 0.025 in the right tail. Using table A-2 and an area of 0.975, the z-score is 1.96 . Example: Assume that the data has a normal distribution and the number of observations is greater than fifty. Find the critical z value used to test a null hypothesis. α = 0.09 for a right-tailed test. Solution: With α = 0.09 for a right-tailed test, there will be 0.09 in the right tail. Using table A-2 and an area of 0.910, the z-score is 1.34 .

Example: Assume that the data has a normal distribution and the number of observations is greater than fifty. Find the critical z value used to test a null hypothesis. α = 0.05 for a left-tailed test. Solution: With α = 0.05 for a left-tailed test, there will be 0.05 in the left tail. Using table A-2 and an area of 0.05, the z-score is -1.645.

The P-Value: The P-value (or p-value or probability value) is the probability of getting a value of the test statistic that is at least as extreme as the one representing the sample data, assuming that the null hypothesis is true. P-values can be found after finding the area beyond the test statistic. The procedure for finding p-values is summarized as follows:

1. Critical region in the left tail: P-value = area to the left of the test statistic 2. Critical region in the right tail: P-value = area to the right of the test statistic 3. Critical region in two tails: P-value = twice the area in the tail beyond the test statistic

The null hypothesis is rejected if the P-value is very small, such as 0.05 or less. Here is a memory tool useful for interpreting the P-value:

If the P is low, the null must go. If the P is high, the null will fly.

Interpretation: If the P-value is high, we are stating that the null hypothesis is likely to occur and that the alternative hypothesis is false and if the P-value is low the null hypothesis is highly unlikely to occur and therefore the alternative hypothesis is true. The following diagram is very helpful for remembering whether to accept or reject a null hypothesis based on the p-values.

Example: Use the given information to find the P-value. Also, use a 0.05 significance level and state the conclusion about the null hypothesis (reject the null hypothesis or fail to reject the null hypothesis). The test statistic in a right-tailed test is z = 0.52. Solution:

1. Using the test statistic which is the z-score z = 0.52 and Table A-2, we find that the area to the left of value point is 0.6985.

2. The P-Value is the area to the right of this test statistic or P = 1 – 0.6985 = 0.3015. 3. Because the area (probability) of 0.3015 > 0.05 we use the above rule, If the p is high, the null

will fly and fail to reject the null hypothesis. Example: Use the given information to find the P-value. Also, use a 0.05 significance level and state the conclusion about the null hypothesis (reject the null hypothesis or fail to reject the null hypothesis). The test statistic in a left-tailed test is z = -2.05. Solution:

1. Using the test statistic which is z = -2.05 and Table A-2, we find that the area to the left of value point is 0.0202.

2. The P-Value is the area to the left of this test statistic or P = 0.0202. 3. Because the area (probability) of 0.0202< 0.05 we use the above rule, If the p is low, the null

must go and reject the null hypothesis. Example: Use the given information to find the P-value. Also, use a 0.05 significance level and state the conclusion about the null hypothesis (reject the null hypothesis or fail to reject the null hypothesis). The test statistic in a two-tailed test is z = -1.63. Solution:

1. Using the test statistic which is z = -1.63and Table A-2, we find that the area to the left of value point is 0.0516

2. The P-Value is twice the area to the left of this test statistic or P = 0.1032. 4. Because the area (probability) of 0.1032> 0.05 we use the above rule, If the p is high, the null

will fly and fail to reject the null hypothesis.

Types of Hypothesis Tests:The tails in a distribution are the extreme regions bounded by critical values.Determinations of P-values and critical values are affected by whether a critical region is in two tails, the left tail, or the right tail. It therefore becomes important to correctly characterize a hypothesis test as two-tailed, left-tailed, or right-tailed.

Two-tailed Test: α is divided equally between the two tails of the critical regionAlternate Hypothesis Symbol: H1:

Left-tailed Test: Value of α is in the left tail Alternate Hypothesis Symbol: H1: < (points left) Right-tailed Test: Value of α is in the right tail Alternate Hypothesis Symbol: H1:

Types of Hypothesis Tests: Two-tailed, Left-tailed, RightThe tails in a distribution are the extreme regions bounded by critical values.

values and critical values are affected by whether a critical region is in two tails, the left tail, or the right tail. It therefore becomes important to correctly characterize a hypothesis test as

tailed.

is divided equally between the two tails of the critical region : ≠ (less than or greater than)

: < (points left)

: > (points right)

Right-tailed:

values and critical values are affected by whether a critical region is in two tails, the left tail, or the right tail. It therefore becomes important to correctly characterize a hypothesis test as

Conclusions in Hypothesis Testing: We always test the null hypothesis. The initial conclusion will always be one of the following: 1. Reject the null hypothesis. 2. Fail to reject the null hypothesis. The decision to reject or fail to reject the null hypothesis is usually made using either the P-value method or the traditional method. Sometimes the decision is based on confidence intervals. P-value method: Using the significance levelαααα: If P-value ≤ α , reject H0. If P-value > α , fail to reject H0. Traditional method: If the test statistic falls within the critical region, reject H0. If the test statistic does not fall within the critical region, fail to reject H0. Confidence Intervals: A confidence interval estimate of a population parameter contains the likely values of that parameter. If a confidence interval does not include a claimed value of a population parameter, reject that claim.

Wording the Final Conclusion: The final conclusion should be written in simple, non-technical terms. The term “accept” the null hypothesis is generally not use because it implies that the null hypothesis has been proved, but we can never prove a null hypothesis. The phrase “fail to reject” says more correctly that the available evidence is not strong enough to warrant rejection of the null hypothesis.

Example: An entomologist writes an article in a scientific journal which claims that fewer than 12 in ten thousand male fireflies are unable to produce light due to a genetic mutation. Assuming that a hypothesis test of the claim has been conducted and that the conclusion is to reject the null hypothesis, state the conclusion in nontechnical terms. Solution: 0012.00 =H , 0012.01 <H . The sample data support the claim that the true proportion is less

than 12 in ten thousand. Example: A psychologist claims that more than 55% of the population suffers from professional problems due to extreme shyness. Assuming that a hypothesis test of the claim has been conducted and that the conclusion is failure to reject the null hypothesis, state the conclusion in nontechnical terms. Solution: 55.00 =H , 55.01 >H . There is not sufficient evidence to support the claim that the true

proportion is greater than 55%.

Errors in Hypothesis Tests: When testing a null hypothesis, we arrive at a conclusion of rejecting it or failing to reject it. Such conclusions are sometimes correct and sometimes wrong. There are two common types of errors associated with hypothesis testing: Type I Error:

• A Type I error is the mistake of rejecting the null hypothesis when it is actually true. • The symbol a (alpha) is used to represent the probability of a type I error.

Type II Error:

• A Type II error is the mistake of failing to reject the null hypothesis when it is actually false. • The symbol β (beta) is used to represent the probability of a type II error.

Controlling Type I and Type II Errors:

• For any fixedα , an increase in the sample size n will cause a decrease inβ . • For any fixed sample size n, a decrease in α will cause an increase inβ . Conversely, an increase

in α will cause a decrease inβ . • To decrease both α andβ , increase the sample size.

Example: A medical researcher claims that 6% of children suffer from a certain disorder. Identify the type I error for the test. Solution: Reject the claim that the percentage of children who suffer from the disorder is equal to 6% when that percentage is actually 6%. Example: A psychologist claims that more than 3.6% of adults suffer from extreme shyness. Identify the type II error for the test. Solution: Fail to reject the claim that the percentage of adults who suffer from extreme shyness is equal to 3.6% when that percentage is actually greater than 3.6%.

Comprehensive Hypothesis Test: In this set of notes we describe the individual components used in a hypothesis test, but the following lessons will combine those components into comprehensive procedures. General Procedure:

� Identify the specific claim or hypothesis to be tested, and put it in symbolic form. � Give the symbolic form that must be true when the original claim is false. � Of the two symbolic expressions obtained so far, let the alternative hypothesis br the one not

containing an equality. Let the null hypothesis be be the symbolic expression that the parameter equals the fixed value being considered.

� Select the significance levelα based on the seriousness of type 1 error. Make α small if the consequences of rejecting a true 0H are severe. The values of 0.05 and 0.01 are very common.

� Identify the statistic that is relevant to this test and determine its sampling distribution (normal or t-distribution).

P-value Method:

� Find the test statistic and find the P-value. Draw a graph and show the test statistic and the P-value.

� Reject 0H if the P-value is less than or equal to the significance level α . Fail to reject 0H if the

P-value is greater than α . � Restate this previous decision in simple non-technical terms and address the original claim.

Traditional Method:

� Find the test statistic, critical values and the critical region. Draw a graph and include the test statistic, critical values, and critical region.

� Reject 0H if the test statistic is in the critical region. Fail to reject 0H if the test statistic is not in

the critical region. � Restate this previous decision in simple non-technical terms and address the original claim.

Confidence Interval Method: A confidence interval estimate of a population parameter contains the likely values of that parameter. We should therefore reject a claim that the population parameter has a value that is not included in the confidence interval. In some cases, a conclusion based on a confidence interval may be different from a conclusion based on a hypothesis test. See the comments in the individual sections.

8.2 basics of hypothesis testing · pdf file11/8/2014 · 8.2 – basics of...

Documents