lecture7 hypothesis

Upload: christyashley

Post on 07-Mar-2016

23 views

Category:

Documents


0 download

DESCRIPTION

Hypothesis- Statistics

TRANSCRIPT

Introduction

Hypothesis testingHyelim Son

We see so many survey results and statistics cited in newspapers and media. This class is meant to develop how to better understand these statistics and develop a critical thinking of these matter. One should note that statistics could be easily manipulated to give misleading information.

1Chapter 7. Hypothesis Testing7-1 Review and Preview7-2 Basics of Hypothesis Testing7-3 Testing a Claim about a Proportion7-4 Testing a Claim About a Mean 7-5 Testing a Claim About a Standard Deviation or Variance

ReviewThe two main activities of inferential statistics are using sample data to:

(1) estimate a population parameter

(2) Or test a hypothesis or claim about a population parameter.

Confidence intervals allowed us to find ranges of reasonable values for parameters we were interested in.

Hypothesis testing will let us make decisions about specific values of parameters or relationships between parameters.page 386 of Elementary Statistics, 10th EditionVarious examples are provided below definition boxChapter ObjectiveDevelop the ability to conduct hypothesis tests for claims made about

a population proportion pa population mean a population standard deviation . page 386 of Elementary Statistics, 10th EditionVarious examples are provided below definition boxExamples of Hypotheses that can be TestedGenetics: The Genetics & IVF Institute claims that its XSORT method allows couples to increase the probability of having a baby girl.

Business: A newspaper cites a PriceGrabber.com survey of 1631 subjects and claims that a majority have heard of Kindle as an e-book reader.

Health: It is often claimed that the mean body temperature is 98.6 degrees. We can test this claim using a sample of 106 body temperatures with a mean of 98.2 degrees.page 386 of Elementary Statistics, 10th EditionVarious examples are provided below definition boxChapter 7. Hypothesis Testing7-1 Review and Preview7-2 Basics of Hypothesis Testing7-3 Testing a Claim about a Proportion7-4 Testing a Claim About a Mean 7-5 Testing a Claim About a Standard Deviation or Variance

Main ObjectiveThe main objective of this chapter is to develop the ability to conduct hypothesis tests for claims made about a population proportion p, a population mean , or a population standard deviation . page 386 of Elementary Statistics, 10th EditionVarious examples are provided below definition boxIntroductionIn hypothesis testing, we use sample data to choose between two competing hypotheses concerning the values of one or more population parameters

In hypothesis testing, there are 2 choices, the null hypothesis and the alternative hypothesis.

Think of it like a jury trial. There are two options: innocent and guilty. You assume innocence until shown guilty beyond a reasonable doubt.

Key ConceptThis section presents individual components of a hypothesis test. We should know and understand the following:How to identify the null hypothesis and alternative hypothesis from a given claim, and how to express both in symbolic formHow to calculate the value of the test statistic, given a claim and sample dataHow to choose the sampling distribution that is relevantHow to identify the P-value or identify the critical value(s)How to state the conclusion about a claim in simple and nontechnical termsDefinitionsNull Hypothesis

Alternative Hypothesis

DefinitionsA Null hypothesis - the status quo - initially assumed true

An Alternative Hypothesis - the researchers proposal - what you hope to show

The goal of a statistical test is to evaluate the viability of the null hypothesis in the light of the data that we have collected.

Depending on the data, the null hypothesis will either be rejected or it wont be rejected (it is never accepted).Steps 1, 2, 3Identifying H0 and H1

ExampleAssume that 100 babies are born to 100 couples treated with the XSORT method of gender selection that is claimed to make girls more likely.

We observe 58 girls in 100 babies. Write the hypotheses to test the claim the with the XSORT method, the proportion of girls is greater than the 50% that occurs without any treatment.

Analogous Example: IntuitionSuppose the null hypothesis is that some candidate A will win more than 50% of the vote and that the alternative hypothesis is that candidate A will not win more than 50%.

Suppose we sampled 15 voters and recorded who voted for A.

What would you conclude if no one in the sample voted for A (X = 0)?

If 50% of voters really did vote for A, it is highly improbable to randomly draw a sample of 15 and find no voters for AThus, we would reject the null hypothesis.

Step 4Select the Significance Level

Conducting Hypothesis TestingNow that weve learned how to design null and alternative hypothesis, lets point out the elements necessary to actually test the hypothesis

A test statistic: a function of the data points in the sample A rejection region: if the test statistic falls in the rejection region of the sample statistic, we reject the null hypothesis.

To define the critical region, we need to first know significance levelRejection region a previewOne tail rejection region

Step 5Identify the Test Statistic and Determine its Sampling DistributionTest StatisticTest StatisticFirst choose the correct test statistic, and identify the sampling distribution of the statistic.

Rejection RegionThe rejection region specifies the values of the test statistic for which the null hypothesis is to be rejected.

To define the rejection region, knowing the significance level and which distribution to use is important

ExampleExample

Example

Example

Step 6Find the Value of the Test Statistic, Then Find Either the P-Value or the Critical Value(s)First transform the relevant sample statistic to a standardized score called the test statistic.

Then find the P-Value or the critical value(s).ExampleLets again consider the claim that the XSORT gender selection method increases the likelihood of having a girl.

Preliminary results from a test of the XSORT method of gender selection involved 100 couples who gave birth to 58 girls and 42 boys.

Use the given claim and the preliminary results to calculate the value of the test statistic.

Use z test statistic, so that a normal distribution is used to approximate a binomial distribution. Example - ContinuedThe claim that the XSORT method of gender selection increases the likelihood of having a baby girl results in the following null and alternative hypotheses:

We work under the assumption that the null hypothesis is true with p = 0.5.

The sample proportion of 58 girls in 10 births results in:

Example Convert to the Test StatisticWe know from previous chapters that a z score of 1.60 is not unusual.

At first glance, 58 girls in 100 births does not seem to support the claim that the XSORT method increases the likelihood a having a girl (more than a 50% chance).

Hypothesis testing Test statistics alone cannot give us enough information to make a decision about the claim being tested.

We need either Rejection Region or Calculation of P-value. Types of Hypothesis Tests:Two-tailed, Left-tailed, Right-tailedThe tails in a distribution are the extreme regions bounded by critical values.Determinations of P-values and critical values are affected by whether a critical region is in two tails, the left tail, or the right tail.

Therefore, it becomes important to correctly ,characterize a hypothesis test as two-tailed, left-tailed, or right-tailed. Two-tailed Test

is divided equally between the two tails of the critical region

Left-tailed Test

All in the left tail

Right-tailed Test

All in the right tailP-ValueThe P-value (or probability value) is the probability of getting a value of the test statistic that is at least as extreme as the one representing the sample data, assuming that the null hypothesis is true.Critical region in the left tail:Critical region in the right tail:Critical region in two tails:P-value = area to the left of the test statisticP-value = area to the right of the test statisticP-value = twice the area in the tail beyond the test statistic

Procedure for Finding P-ValuesEpickerSuppose we have a test statistic value of 1.96 What is the p-value if we are conducting a right tailed hypothesis test?

0.050.0250.10.975EpickerSuppose we have a test statistic value of 1.96 What is the p-value if we are conducting a left tailed hypothesis test?

0.050.0250.10.975

EpickerSuppose we have a test statistic value of 1.96

What is the p-value if we are conducting a two tailed hypothesis test?

0.050.0250.10.975

P-ValueThe null hypothesis is rejected if the P-value is very small, such as 0.05 or less.ExampleThe claim that the XSORT method of gender selection increases the likelihood of having a baby girl results in the following null and alternative hypotheses:

The test statistic was :

ExampleCalculate the p value for the test statistic with z=1.60

The test statistic of z = 1.60 has an area of 0.0548 to its right, so a right-tailed test with test statistic z = 1.60 has a P-value of 0.0548.

Or in other words: P(z>1.60)=0.0548

Critical Region/ Rejection regionThe critical region (or rejection region) is the set of all values of the test statistic that cause us to reject the null hypothesis. For example, see the red-shaded region in the previous figures.Critical ValueA critical value is any value that separates the critical region (where we reject the null hypothesis) from the values of the test statistic that do not lead to rejection of the null hypothesis.

The critical values depend on the nature of the null hypothesis, the sampling distribution that applies, and the significance level .

ExampleFor the XSORT birth hypothesis test, the critical value and critical region for an = 0.05 test are shown below:

Critical ValueStep 7 : Make a Decision:Reject H0 or Fail to Reject H0The methodologies depend on if you are using the P-Value method or the critical value method.Decision CriterionCritical Value Method:

If the test statistic falls within the critical region, reject H0.

If the test statistic does not fall within the critical region, fail to reject H0.

e.g. If critical region is z=1.645 to infinity and test statistic z=1.60, then the test statistic is not within the critical region, so we fail to reject the null

Decision CriterionExampleFor the XSORT baby gender test, the test had a test statistic of z = 1.60 and a P-Value of 0.0548. We tested:

Using the P-Value method, we would fail to reject the null at the = 0.05 level.

Using the critical value method, we would fail to reject the null because the test statistic of z = 1.60 does not fall in the rejection region.

(You will come to the same decision using either method.)

Step 8 : Restate the Decision Using Simple and Nontechnical TermsState a final conclusion that addresses the original claim with wording that can be understood by those without knowledge of statistical procedures.

ExampleFor the XSORT baby gender test, there was not sufficient evidence to support the claim that the XSORT method is effective in increasing the probability that a baby girl will be born.Wording of Final Conclusion

CautionNever conclude a hypothesis test with a statement of reject the null hypothesis or fail to reject the null hypothesis.

Always make sense of the conclusion with a statement that uses simple nontechnical wording that addresses the original claim.Accept Versus Fail to RejectSome texts use accept the null hypothesis.We are not proving the null hypothesis.Fail to reject says more correctly that the available evidence is not strong enough to warrant rejection of the null hypothesis.Errors in hypothesis testingJust like you could have a mistake in a jury trial, you can make mistakes in hypothesis testing.

For any fixed rejection region, we can make two types of errors in making a decision about the null hypothesis.

Type I ErrorA Type I error is the mistake of rejecting the null hypothesis when it is actually true.The symbol is used to represent the probability of a type I error.

E.g. Innocent man is pronounced guilty. This would mean you rejected the null hypothesis when in fact, the null hypothesis was true. page 398 of Elementary Statistics, 10th EditionType II ErrorA Type II error is the mistake of failing to reject the null hypothesis when it is actually false.The symbol (beta) is used to represent the probability of a type II error.E.g. Guilty man is pronounced innocent. This would mean you did not reject the null hypothesis when in fact, the alternative was true. This is a Type II Error.

Page 398 of Elementary Statistics, 10th EditionErrorsType I error is usually considered more serious. The symbol is used to represent the probability of a type I error. is also called significance levelWe get to set in advance, i.e. have direct control over Type I errors. The symbol (beta) is used to represent the probability of a type II error. 1- is called power.

In any hypothesis test, you need to balance and . You can never get them both to be 0 with a sample. In practice, we choose the largest that is tolerable - usually .01, .05 or .10. The relationship between and is not deterministic. That is to say, when increases, decreases (and power would increase), and vice versa, but you cant say how much because other factors affect the quantities.

Type I and Type II Errors

Examplea) Identify a type I error.b) Identify a type II error.Assume that we are conducting a hypothesis test of the claim that a method of gender selection increases the likelihood of a baby girl, so that the probability of a baby girls is p > 0.5.

Here are the null and alternative hypotheses:

Example - Continued A type I error is the mistake of rejecting a true null hypothesis:

We conclude the probability of having a girl is greater than 50%, when in reality, it is not. In other words, we will falsely conclude that XSORT is effective. Our data misled us.

Example - Continued A type II error is the mistake of failing to reject the null hypothesis when it is false:

There is no evidence to conclude the probability of having a girl is greater than 50% (our data misled us), but in reality, the probability is greater than 50%.

In other words XSORT is actually effective, but our data concluded that it has no effect.Controlling Type I and Type II ErrorsFor any fixed , an increase in the sample size n will cause a decrease in For any fixed sample size n, a decrease in will cause an increase in . Conversely, an increase in will cause a decrease in .To decrease both and , increase the sample size.

Part 2: Beyond the Basics of Hypothesis Testing:The Power of a Test

DefinitionThe power of a hypothesis test is the probability 1 of rejecting a false null hypothesis. The value of the power is computed by using a particular significance level and a particular value of the population parameter that is an alternative to the value assumed true in the null hypothesis.Power

Power and theDesign of ExperimentsJust as 0.05 is a common choice for a significance level, a power of at least 0.80 is a common requirement for determining that a hypothesis test is effective. (Some statisticians argue that the power should be higher, such as 0.85 or 0.90.) When designing an experiment, we might consider how much of a difference between the claimed value of a parameter and its true value is an important amount of difference. When designing an experiment, a goal of having a power value of at least 0.80 can often be used to determine the minimum required sample size. Chapter 7Hypothesis Testing7-1 Review and Preview7-2 Basics of Hypothesis Testing7-3 Testing a Claim about a Proportion7-4 Testing a Claim About a Mean 7-5 Testing a Claim About a Standard Deviation or Variance

Key ConceptThis section presents complete procedures for testing a hypothesis (or claim) made about a population proportion. This section uses the components introduced in the previous section for the P-value method, the traditional method or the use of confidence intervals.Key ConceptTwo common methods for testing a claim about a population proportion are (1) to use a normal distribution as an approximation to the binomial distribution, and (2) to use an exact method based on the binomial probability distribution. Part 1 of this section uses the approximate method with the normal distribution, and Part 2 of this section briefly describes the exact method. Part 1:Basic Methods of Testing Claims about a Population Proportion pNotationn= sample size or number of trials

p= population proportionq= 1 p

1) The sample observations are a simple random sample.2) The conditions for a binomial distribution are satisfied.The conditions np 5 and nq 5 are both satisfied, so the binomial distribution of sample proportions can be approximated by a normal distribution with = np and . Note: p is the assumed proportion not the sample proportion.Requirements for Testing Claims About a Population Proportion p

Test Statistic for Testing a Claim About a ProportionP-values:

Critical Values:Use the standard normal distribution (Table A-2) and refer to Figure 8-1.

Use the standard normal distribution (Table A-2).

Methods to use P-value methodCritical Value method Confidence interval method - usually used for estimationCaution When testing claims about a population proportion, the traditional method and the P-value method are equivalent and will yield the same result since they use the same standard deviation based on the claimed proportion p. However, the confidence interval uses an estimated standard deviation based upon the sample proportion . Consequently, it is possible that the traditional and P-value methods may yield a different conclusion than the confidence interval method.A good strategy is to use a confidence interval to estimate a population proportion, but use the P-value or traditional method for testing a claim about the proportion.

ExampleBased on information from the National Cyber Security Alliance, 93% of computer owners believe they have antivirus programs installed on their computers.

In a random sample of 400 scanned computers, it is found that 380 of them (or 95%) actually have antivirus software programs.

Use the sample data from the scanned computers to test the claim that 93% of computers have antivirus software.Example - ContinuedRequirement check:

The 400 computers are randomly selected.There is a fixed number of independent trials with two categories (computer has an antivirus program or does not).The requirements np 5 and nq 5 are both satisfied with n = 400

Example - ContinuedP-Value Method:

The original claim that 93% of computers have antivirus software can be expressed as p = 0.93.

The opposite of the original claim is p 0.93.

The hypotheses are written as:

Example - ContinuedP-Value Method:

For the significance level, we select = 0.05. Because we are testing a claim about a population proportion, the sample statistic relevant to this test is:

6. The test statistic is calculated as:

Example - ContinuedP-Value Method:

Because the hypothesis test is two-tailed with a test statistic of z = 1.57, the P-value is twice the area to the right of z = 1.57.

The P-value is twice 0.0582, or 0.1164.

Example - ContinuedP-Value Method:

Because the P-value of 0.1164 is greater than the significance level of = 0.05, we fail to reject the null hypothesis.

We fail to reject the claim that 93% computers have antivirus software. We conclude that there is not sufficient sample evidence to warrant rejection of the claim that 93% of computers have antivirus programs.

Example - ContinuedCritical Value Method: Steps 1 5 are the same as for the P-value method.

The test statistic is computed to be z = 1.57. We now find the critical values, with the critical region having an area of = 0.05, split equally in both tails.

Example - ContinuedCritical Value Method:

Because the test statistic does not fall in the critical region, we fail to reject the null hypothesis.

We fail to reject the claim that 93% computers have antivirus software. We conclude that there is not sufficient sample evidence to warrant rejection of the claim that 93% of computers have antivirus programs.

Example - ContinuedConfidence Interval Method:

The claim of p = 0.93 can be tested at the = 0.05 level of significance with a 95% confidence interval.

Using the methods of Section 7-2, we get:

0.929 < p < 0.971

This interval contains p = 0.93, so we do not have sufficient evidence to warrant the rejection of the claim that 93% of computers have antivirus programs.Part 2Exact Method for Testing Claims about a Proportion

Testing Claims Using the Exact MethodWe can get exact results by using the binomial probability distribution. Binomial probabilities are a nuisance to calculate manually, but technology makes this approach quite simple. Also, this exact approach does not require that np 5 and nq 5 so we have a method that applies when that requirement is not satisfied. To test hypotheses using the exact binomial distribution, use the binomial probability distribution with the P-value method, use the value of p assumed in the null hypothesis, and find P-values as follows:Testing Claims Using the Exact MethodLeft-tailed test:The P-value is the probability of getting x or fewer successes among n trials.Right-tailed test:The P-value is the probability of getting x or more successes among n trials.Testing Claims Using the Exact MethodTwo-tailed test:, the P-value is twice the probability of getting x or more successes, the P-value is twice the probability of getting x or fewer successes

ExampleIn testing a method of gender selection, 10 randomly selected couples are treated with the method, and 9 of the babies are girls.

Use a 0.05 significance level to test the claim that with this method, the probability of a baby being a girl is greater than 0.75.Example - ContinuedWe will test

Use manual calculation or technology to find probabilities in a binomial distribution with p = 0.75.

Because it is a right-tailed test, the P-value is the probability of 9 or more successes among 10 trials, assuming p = 0.75.

Example - ContinuedThe probability of 9 or more successes is 0.2440252, which is the P-value of the hypothesis test.

The P-value is high (greater than 0.05), so we fail to reject the null hypothesis.

There is not sufficient evidence to support the claim that with the gender selection method, the probability of a girl is greater than 0.75.7-1 Review and Preview7-2 Basics of Hypothesis Testing7-3 Testing a Claim about a Proportion7-4 Testing a Claim About a Mean 7-5 Testing a Claim About a Standard Deviation or Variance

Chapter 7Hypothesis TestingKey ConceptThis section presents methods for testing a claim about a population mean.

Part 1 deals with the very realistic and commonly used case in which the population standard deviation is not known.

Part 2 discusses the procedure when is known, which is very rare.Part 1When is not known, we use a t test that incorporates the Student t distribution.n = sample size= sample mean

= population meanRequirementsThe sample is a simple random sample.Either or both of these conditions is satisfied: The population is normally distributed or n > 30.Test StatisticFor running the test, both for pvalue method and critical value method, use t distribution with n-1 degrees of freedom

ExampleListed below are the measured radiation emissions (in W/kg) corresponding to a sample of cell phones.

Use a 0.05 level of significance to test the claim that cell phones have a mean radiation level that is less than 1.00 W/kg.

The summary statistics are: .0.380.551.541.550.500.600.920.961.000.861.46

Example - ContinuedRequirement Check:

We assume the sample is a simple random sample.The sample size is n = 11, which is not greater than 30, so we must check a normal quantile plot for normality.Example - ContinuedThe points are reasonably close to a straight line and there is no other pattern, so we conclude the data appear to be from a normally distributed population.

Example - ContinuedStep 1: The claim that cell phones have a mean radiation level less than 1.00 W/kg is expressed as < 1.00 W/kg.

Step 2: The alternative to the original claim is 1.00 W/kg.

Step 3: The hypotheses are written as:

Example - ContinuedStep 4: The stated level of significance is = 0.05.

Step 5: Because the claim is about a population mean , the statistic most relevant to this test is the sample mean: .

Example - ContinuedStep 6: Calculate the test statistic and then find the P-value or the critical value from Table A-3:

Example - ContinuedStep 7: Critical Value Method: Because the test statistic of t = 0.486 does not fall in the critical region bounded by the critical value of t = 1.812, fail to reject the null hypothesis.

Example - ContinuedStep 7: P-value method: Technology, such as a TI-83/84 Plus calculator can output the P-value of 0.3191. Since the P-value exceeds = 0.05, we fail to reject the null hypothesis.

ExampleStep 8: Because we fail to reject the null hypothesis, we conclude that there is not sufficient evidence to support the claim that cell phones have a mean radiation level that is less than 1.00 W/kg.

a) In a left-tailed hypothesis test, the sample size is n = 12, and the test statistic is t = 2.007. b) In a right-tailed hypothesis test, the sample size is n = 12, and the test statistic is t = 1.222.c) In a two-tailed hypothesis test, the sample size is n = 12, and the test statistic is t = 3.456.Assuming that neither software nor a TI-83 Plus calculator is available, use Table A-3 to find a range of values for the P-value corresponding to the given results.Finding P-ValuesExample Confidence Interval MethodWe can use a confidence interval for testing a claim about .

For a two-tailed test with a 0.05 significance level, we construct a 95% confidence interval.

For a one-tailed test with a 0.05 significance level, we construct a 90% confidence interval.

Example Confidence Interval MethodUsing the cell phone example, construct a confidence interval that can be used to test the claim that < 1.00 W/kg, assuming = 0.05Note that a left-tailed hypothesis test with = 0.05 corresponds to a 90% confidence interval. we get:

0.707 W/kg < < 1.169 W/kg

Because the value of = 1.00 W/kg is contained in the interval, we cannot reject the null hypothesis that = 1.00 W/kg .

Based on the sample of 11 values, we do not have sufficient evidence to support the claim that the mean radiation level is less than 1.00 W/kg.

Part 2When is known, we use test that involves the standard normal distribution.

In reality, it is very rare to test a claim about an unknown population mean while the population standard deviation is somehow known.Test Statistic for Testing a Claim About a Mean (with Known)The test statistic is:

For running the test, both for pvalue method and critical value method, use the standard normal distribution

ExampleIf we repeat the cell phone radiation example, with the assumption that = 0.480 W/kg, the test statistic is:

Epicker Find the corresponding p value for the previous example where

and the test is a left tailed test. a) 0.3336 b) 0.05 c) 0.6664

so the P-value=P(z