Section 10.3: Large-Sample Hypothesis Tests for a Population Proportion
• Test Statistic – is the function of sample data on which a conclusion to reject or fail to reject H0 is based.
• P-value – (also sometimes called the observed significance level) is a measure of inconsistency between the hypothesized value for a population characteristic and the observed sample. It is the probability, assuming that H0 is true, of obtaining a test statistic value at least as inconsistent with H0 as what actually resulted.
Example
• A number of initiatives on the topic of legalized gambling have appeared on state ballots in recent years. Suppose that a political candidate has decided to support legalization of casino gambling if he is convinced that more than two-thirds of U.S. adults approve of casino gambling. USA Today reported the results of a Gallup poll in which 1523 adults (selected at random) were asked whether they approved of casino gambling. The number in the sample who approved was 1035. Does the sample provide convincing evidence that more than two-thirds approve?
• π = true proportion of U.S. adults who approve of casino gambling
• H0: π = 2/3 = .667
• Ha: π > .667
The sample proportion is $p = 1035/1523 = .680$. Does the value of $p$ exceed two-thirds by enough to cast substantial doubt on $H_0$?
Because the sample size is large, the statistic
$$z = \frac{p - .667}{\sqrt{(.667)(1 - .667)/n}}$$
has approximately a standard normal distribution when $H_0$ is true. The calculated value of the test statistic is
$$z = \frac{.680 - .667}{\sqrt{(.667)(1 - .667)/1523}} = \frac{.013}{.012} = 1.08$$
The probability that a z value at least this inconsistent with H0 would be observed if in fact H0 is true is
$$P\text{-value} = P(z \ge 1.08 \text{ when } H_0 \text{ is true}) = \text{area under the } z \text{ curve to the right of } 1.08 = 1 - .8599 = .1401$$
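The arithmetic in this example can be checked with a short Python sketch. It assumes the standard library's `statistics.NormalDist` as a stand-in for the z table; the variable names are illustrative, and the inputs .680 and .667 are the rounded values used in the example.

```python
from math import sqrt
from statistics import NormalDist

n = 1523      # sample size
p = 0.680     # sample proportion, 1035/1523 rounded as in the example
pi0 = 0.667   # hypothesized value, 2/3 rounded as in the example

# Standard deviation of p, computed under the assumption that H0 is true
se = sqrt(pi0 * (1 - pi0) / n)
z = (p - pi0) / se

# Upper-tailed test: the P-value is the area to the right of z
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The computed P-value should land close to the table value .1401; any small difference comes from the table lookup using z rounded to 1.08.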
• A decision as to whether H0 should be rejected results from comparing the P-value to the chosen α:
H0 should be rejected if P-value ≤ α
H0 should not be rejected if P-value > α
Determination of the P-Value When the Test Statistic is z
1. Upper-tailed test:
Ha: π > hypothesized value
P-value = area under the z curve to the right of the calculated z
2. Lower-tailed test:
Ha: π < hypothesized value
P-value = area under the z curve to the left of the calculated z
3. Two-tailed test:
Ha: π ≠ hypothesized value
P-value = 2(area to the right of z) if z is positive, or 2(area to the left of z) if z is negative
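The three cases above (upper-tailed, lower-tailed, two-tailed) can be sketched as a single Python function. This is a minimal illustration, not a library routine: the function name `p_value_from_z` and the labels `"greater"`, `"less"`, and `"two-sided"` are assumptions made for this sketch, and `statistics.NormalDist` from the standard library supplies the area under the z curve.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative):
    """Return the P-value for a calculated z.

    alternative is "greater" (upper-tailed), "less" (lower-tailed),
    or "two-sided"; these labels are illustrative choices.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        return 1 - std_normal.cdf(z)              # area to the right of z
    if alternative == "less":
        return std_normal.cdf(z)                  # area to the left of z
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))   # twice the area beyond |z|
    raise ValueError(f"unknown alternative: {alternative!r}")

print(p_value_from_z(1.08, "greater"))
print(p_value_from_z(-1.08, "less"))
print(p_value_from_z(1.08, "two-sided"))
```

By symmetry of the z curve, the first two calls give the same area, and the two-tailed value is twice that area.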
Example
• In December 2003 a countywide water conservation campaign was conducted in a particular county. In January 2004 a random sample of 500 homes was selected, and water usage was recorded for each home in the sample. The county supervisors want to know whether the data support the claim that fewer than half the households in the county reduced water consumption.
• H0: π = .5 versus Ha: π < .5
Where π is the true proportion of households in the county with reduced water usage.
Suppose that the sample results were n = 500 and p = .440. Because the sample size is large and this is a lower-tailed test, we can compute the P-value by first calculating the value of the z test statistic
$$z = \frac{p - .5}{\sqrt{(.5)(1 - .5)/n}}$$
Then find the area under the z curve to the left of this z.
$$z = \frac{.440 - .5}{\sqrt{(.5)(1 - .5)/500}} = \frac{-.060}{.0224} = -2.68$$
The P-value is then equal to the area under the z curve to the left of -2.68. From the table we find that P-value = .0037.
• At significance level α = .01, we reject H0 because .0037 ≤ .01. The data suggest that the proportion of households with reduced water usage is less than .5.
• Summary of Large-Sample z test for π
Null hypothesis: H0: π = hypothesized value
Test statistic:
$$z = \frac{p - \text{hypothesized value}}{\sqrt{(\text{hypothesized value})(1 - \text{hypothesized value})/n}}$$
• Alternative hypothesis and corresponding P-value:
Ha: π > hypothesized value; P-value = area under z curve to right of calculated z
Ha: π < hypothesized value; P-value = area under z curve to left of calculated z
Ha: π ≠ hypothesized value; P-value = (1) 2(area to right of z) if z is positive, or (2) 2(area to left of z) if z is negative
• Assumptions:
1. p is the sample proportion from a random sample.
2. The sample size is large. This test can be used if n satisfies both n(hypothesized value) ≥ 10 and n(1 - hypothesized value) ≥ 10.
3. If sampling is without replacement, the sample size is no more than 10% of the population size.
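The summary above can be sketched as one Python function. This is an illustrative sketch under stated assumptions, not a reference implementation: the name `one_proportion_z_test` and the `alternative` labels are hypothetical, and only the large-sample condition is checked in code, since random sampling and the 10% condition depend on how the data were collected. The usage line applies it to the water-usage example, where p = .440 with n = 500 corresponds to 220 homes.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, pi0, alternative):
    """Large-sample z test for a population proportion (illustrative sketch).

    Returns (z, p_value). Raises ValueError when the large-sample
    condition n*pi0 >= 10 and n*(1 - pi0) >= 10 is not met.
    """
    if n * pi0 < 10 or n * (1 - pi0) < 10:
        raise ValueError("sample too small for the large-sample z test")

    p = successes / n                            # sample proportion
    z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)    # test statistic

    left = NormalDist().cdf(z)   # area under the z curve to the left of z
    if alternative == "greater":
        p_value = 1 - left
    elif alternative == "less":
        p_value = left
    elif alternative == "two-sided":
        p_value = 2 * min(left, 1 - left)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p_value

# Water-usage example: 220 of 500 homes, H0: pi = .5 versus Ha: pi < .5
z, p_value = one_proportion_z_test(220, 500, 0.5, "less")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

Because the function carries more decimal places than a z table, its P-value may differ from the tabled .0037 in the last digit.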
Steps in a Hypothesis-Testing Analysis
1. Describe the population characteristic about which hypotheses are to be tested.
2. State the null hypothesis H0.
3. State the alternative hypothesis Ha.
4. Select the significance level α for the test.
5. Display the test statistic to be used, with substitution of the hypothesized value identified in Step 2 but without any computation at this point.
6. Check to make sure that any assumptions required for the test are reasonable.
7. Compute all quantities appearing in the test statistic and then the value of the test statistic itself.
8. Determine the P-value associated with the observed value of the test statistic.
9. State the conclusion (which is to reject H0 if P-value ≤ α and not to reject H0 otherwise). The conclusion should then be stated in the context of the problem, and the level of significance should be included.
• Steps 1-4 constitute a statement of the problem.
• Steps 5-8 give the analysis that leads to a decision.
• Step 9 provides the conclusion.
Example
• An article described a study of credit card payment practices of college students. According to the authors of the article, the credit card industry asserts that at most 50% of college students carry a credit card balance from month to month. However, the authors of the article report that, in a random sample of 310 college students, 217 carried a balance each month. Does this sample provide sufficient evidence to reject the industry claim? We answer this question by carrying out a hypothesis test using a .05 significance level.
1. Population characteristic of interest: π = true proportion of college students who carry a balance from month to month
2. Null Hypothesis: H0: π = .5
3. Alternative hypothesis: Ha: π > .5
4. Significance level: α = .05
5. Test statistic:
$$z = \frac{p - \text{hypothesized value}}{\sqrt{(\text{hypothesized value})(1 - \text{hypothesized value})/n}} = \frac{p - .5}{\sqrt{(.5)(1 - .5)/n}}$$
6. Assumptions: This test requires a random sample and a large sample size. The given sample was a random sample with n = 310. Because 310(.5) ≥ 10 and 310(1 - .5) ≥ 10, the large-sample test is appropriate. The sample size is small compared to the population (college students) size.
7. Computations: n = 310 and p = 217/310 = .700, so
$$z = \frac{.700 - .5}{\sqrt{(.5)(1 - .5)/310}} = \frac{.200}{.028} = 7.14$$
8. P-value: This is an upper-tailed test (the inequality in Ha is “greater than”), so the P-value is the area to the right of the computed z value. Because z = 7.14 is so far out in the upper tail of the standard normal distribution, the area to its right is negligible. Thus P-value ≈ 0.
9. Conclusion: Because P-value ≤ α (0 ≤ .05), H0 is rejected at the .05 level of significance.
• We conclude that the proportion of students who carry a credit card balance from month to month is greater than .5. That is, the sample provides convincing evidence that the industry claim is not correct.
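As a check on steps 7 and 8, the following Python sketch recomputes z for this example with and without rounding the denominator. It is only an illustration with made-up variable names. Step 7 divides .200 by a denominator rounded to .028; carrying full precision gives a somewhat smaller z, but the P-value is negligible either way, so the conclusion is unchanged.

```python
from math import sqrt
from statistics import NormalDist

n, carried = 310, 217
pi0 = 0.5
p = carried / n                            # sample proportion, .700

se = sqrt(pi0 * (1 - pi0) / n)             # about .0284 before rounding
z_unrounded = (p - pi0) / se
z_rounded_se = (p - pi0) / round(se, 3)    # denominator rounded to .028

# Upper-tailed test: area to the right of z
p_value = 1 - NormalDist().cdf(z_unrounded)

print(f"z with full precision:        {z_unrounded:.2f}")
print(f"z with denominator .028:      {z_rounded_se:.2f}")
print(f"P-value <= .05 (reject H0)?   {p_value <= 0.05}")
```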
Example: Let’s Do Together
• The Public Policy Institute of California reported that 71% of people nationwide prefer to live in a single-family home. To determine whether the preferences of Californians are consistent with this nationwide figure, a random sample of 2002 Californians was interviewed. Of those interviewed, 1682 said that they consider a single-family home the ideal. Can we reasonably conclude that the proportion of Californians who prefer a single-family home is different from the national figure? We answer the question by carrying out a hypothesis test with α = .01.
1. π = proportion of all Californians who prefer a single-family home
2. H0: π = .71
3. Ha: π ≠ .71 (differs from the national proportion)
4. Significance level: α = .01
5. Test statistic:
$$z = \frac{p - \text{hypothesized value}}{\sqrt{(\text{hypothesized value})(1 - \text{hypothesized value})/n}} = \frac{p - .71}{\sqrt{(.71)(.29)/n}}$$
6. Assumptions: This test requires a random sample and a large sample size. The given sample was a random sample, the population size is much larger than the sample size, and the sample size was n = 2002. Because 2002(.71) ≥ 10 and 2002(.29) ≥ 10, the large-sample test is appropriate.
7. Computations: p = 1682/2002 = .84, so
$$z = \frac{.84 - .71}{\sqrt{(.71)(.29)/2002}} = \frac{.13}{.0101} = 12.87$$
8. P-value: The area under the z curve to the right of 12.87 is approximately 0, so P-value ≈ 2(0) = 0.
9. Conclusion: At significance level .01, we reject H0 because P-value ≈ 0 < .01 = α. The data provide convincing evidence that the proportion in California who prefer a single-family home differs from the nationwide proportion.
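For a z this far out in the tail, a table simply reports an area of about 0. The Python sketch below shows one way to see how small the two-tailed P-value is. It is a hedged illustration: the variable names are made up, and it uses `math.erfc` from the standard library because, for a standard normal Z, P(Z ≥ x) = 0.5 · erfc(x/√2), which stays accurate where 1 minus the cumulative area would round to 0. The computed z differs slightly from 12.87 because the example rounds intermediate values.

```python
from math import erfc, sqrt

n, prefer = 2002, 1682
pi0 = 0.71
p = prefer / n                               # sample proportion, about .84

z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)

# Two-tailed P-value: 2 * P(Z >= |z|) = 2 * 0.5 * erfc(|z| / sqrt(2))
p_value = erfc(abs(z) / sqrt(2))

print(f"z = {z:.2f}")
print(f"P-value = {p_value:.3g}")
print("reject H0 at alpha = .01:", p_value <= 0.01)
```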