chapter 16: inference in practice. chapter 16 concepts 2 conditions for inference in practice ...

CHAPTER 16:Inference in Practice

Chapter 16 Concepts2

Conditions for Inference in Practice

Cautions About Confidence Intervals

Cautions About Significance Tests

Planning Studies: Sample Size for Confidence Intervals

Planning Studies: The Power of the Statistical Test

Chapter 16 Objectives3

Describe the conditions necessary for inference Describe cautions about confidence intervals Describe cautions about significance tests Calculate the sample size for a desired margin of

error in a confidence interval Define Type I and Type II errors Calculate the power of a significance test

So far, we have met two procedures for statistical inference. When the “simple conditions” are true: the data are an SRS, the population has a Normal distribution and we know the standard deviation of the population, a confidence interval for the mean is:

To test a hypothesis H0: = 0 we use the one-sample z statistic:

These are called z procedures because they both involve a one-sample z statistic and use the standard Normal distribution.

z Procedures4

5

Conditions for Inference in PracticeAny confidence interval or significance test can be trusted only under specific conditions.

Where did the data come from?When you use statistical inference, you are acting as if your data are a random sample or come from a randomized comparative experiment.

• If your data don’t come from a random sample or randomized comparative experiment, your conclusions may be challenged.

• Practical problems such as nonresponse or dropouts from an experiment can hinder inference.

• Different methods are needed for different designs.• There is no cure for fundamental flaws like voluntary response.

Where did the data come from?When you use statistical inference, you are acting as if your data are a random sample or come from a randomized comparative experiment.

• If your data don’t come from a random sample or randomized comparative experiment, your conclusions may be challenged.

• Practical problems such as nonresponse or dropouts from an experiment can hinder inference.

• Different methods are needed for different designs.• There is no cure for fundamental flaws like voluntary response.

What is the shape of the population distribution?Many of the basic methods of inference are designed for Normal populations.

• Any inference procedure based on sample statistics like the sample mean that are not resistant to outliers can be strongly influenced by a few extreme observations.

What is the shape of the population distribution?Many of the basic methods of inference are designed for Normal populations.

• Any inference procedure based on sample statistics like the sample mean that are not resistant to outliers can be strongly influenced by a few extreme observations.

6

Cautions About Confidence Intervals

A sampling distribution shows how a statistic varies in repeated random sampling.

This variation causes random sampling error because the statistic misses the true parameter by a random amount.

No other source of variation or bias in the sample data influences the sampling distribution.

The margin of error in a confidence interval covers only random sampling errors. Practical difficulties such as undercoverage and nonresponse are often more serious than random sampling error. The margin of error does not take such difficulties into account.

The margin of error in a confidence interval covers only random sampling errors. Practical difficulties such as undercoverage and nonresponse are often more serious than random sampling error. The margin of error does not take such difficulties into account.


7

Significance tests are widely used in most areas of statistical work. Some points to keep in mind when you use or interpret significance tests are:

How small a P is convincing?The purpose of a test of significance is to describe the degree of evidence provided by the sample against the null hypothesis. How small a P-value is convincing evidence against the null hypothesis depends mainly on two circumstances:

• If H0 represents an assumption that has been believed for years, strong evidence (a small P) will be needed.

• If rejecting H0 means making a costly changeover, you need strong evidence.

How small a P is convincing?The purpose of a test of significance is to describe the degree of evidence provided by the sample against the null hypothesis. How small a P-value is convincing evidence against the null hypothesis depends mainly on two circumstances:

• If H0 represents an assumption that has been believed for years, strong evidence (a small P) will be needed.

• If rejecting H0 means making a costly changeover, you need strong evidence.


8


Significance Depends on the Alternative HypothesisThe P-value for a one-sided test is one-half the P-value for the two-sided test of the same null hypothesis based on the same data.

• The evidence against the null hypothesis is stronger when the alternative is one-sided because it is based on the data plus information about the direction of possible deviations from the null.

• If you lack this added information, always use a two-sided alternative hypothesis.

Significance Depends on the Alternative HypothesisThe P-value for a one-sided test is one-half the P-value for the two-sided test of the same null hypothesis based on the same data.

• The evidence against the null hypothesis is stronger when the alternative is one-sided because it is based on the data plus information about the direction of possible deviations from the null.

• If you lack this added information, always use a two-sided alternative hypothesis.


9


Sample Size Affects Statistical SignificanceBecause large random samples have small chance variation, very small population effects can be highly significant if the sample is large.Because small random samples have a lot of chance variation, even large population effects can fail to be significant if the sample is small.Statistical significance does not tell us whether an effect is large enough to be important. Statistical significance is not the same as practical significance.

Sample Size Affects Statistical SignificanceBecause large random samples have small chance variation, very small population effects can be highly significant if the sample is large.Because small random samples have a lot of chance variation, even large population effects can fail to be significant if the sample is small.Statistical significance does not tell us whether an effect is large enough to be important. Statistical significance is not the same as practical significance.

Beware of Multiple AnalysesThe reasoning of statistical significance works well if you decide what effect you are seeking, design a study to search for it, and use a test of significance to weigh the evidence you get.

Beware of Multiple AnalysesThe reasoning of statistical significance works well if you decide what effect you are seeking, design a study to search for it, and use a test of significance to weigh the evidence you get.

Sample Size for Confidence Intervals

A wise user of statistics never plans a sample or an experiment without also planning the inference. The number of observations is a critical part of planning the study.

The margin of error ME of the confidence interval for the population mean µ is

To determine the sample size n that will yield a level C confidence interval for a population mean with a specified margin of error ME:

• Get a reasonable value for the population standard deviation σ from an earlier or pilot study.• Find the critical value z* from a standard Normal curve for confidence level C.• Set the expression for the margin of error to be less than or equal to ME and solve for n:

To determine the sample size n that will yield a level C confidence interval for a population mean with a specified margin of error ME:

• Get a reasonable value for the population standard deviation σ from an earlier or pilot study.• Find the critical value z* from a standard Normal curve for confidence level C.• Set the expression for the margin of error to be less than or equal to ME and solve for n:

Choosing Sample Size for a Desired Margin of Error When Estimating µ

Choosing Sample Size for a Desired Margin of Error When Estimating µ

10

Sample Size for Confidence Intervals

11

Researchers would like to estimate the mean cholesterol level µ of a particular variety of monkey that is often used in laboratory experiments. They would like their estimate to be within 1 milligram per deciliter (mg/dl) of the true value of µ at a 95% confidence level. A previous study involving this variety of monkey suggests that the standard deviation of cholesterol level is about 5 mg/dl.

The critical value for 95% confidence is z* = 1.96.

We will use σ = 5 as our best guess for the standard deviation.

Multiply both sides by square root n and divide both sides by 1.

Multiply both sides by square root n and divide both sides by 1.

Square both sides.Square both sides.

We round up to 97 monkeys to ensure the margin of error is no more than 1 mg/dl at 95% confidence.

We round up to 97 monkeys to ensure the margin of error is no more than 1 mg/dl at 95% confidence.

12

The Power of a Statistical Test

When we draw a conclusion from a significance test, we hope our conclusion will be correct. But sometimes it will be wrong. There are two types of mistakes we can make.

If we reject H0 when H0 is true, we have committed a Type I error.

If we fail to reject H0 when H0 is false, we have committed a Type II error.

If we reject H0 when H0 is true, we have committed a Type I error.

If we fail to reject H0 when H0 is false, we have committed a Type II error.

Truth about the population

H0 trueH0 false(Ha true)

Conclusion based on sample

Reject H0 Type I errorCorrect

conclusion

Fail to reject H0

Correct conclusion

Type II error

The Power of a Statistical Test13

The probability of a Type I error is the probability of rejecting H0 when it is really true. This is exactly the significance level of the test.

The significance level α of any fixed level test is the probability of a Type I error. That is, α is the probability that the test will reject the null hypothesis H0 when H0 is in fact true. Consider the consequences of a Type I error before choosing a significance level.

The significance level α of any fixed level test is the probability of a Type I error. That is, α is the probability that the test will reject the null hypothesis H0 when H0 is in fact true. Consider the consequences of a Type I error before choosing a significance level.

Significance and Type I ErrorSignificance and Type I Error

A significance test makes a Type II error when it fails to reject a null hypothesis that really is false. There are many values of the parameter that satisfy the alternative hypothesis, so we concentrate on one value. We can calculate the probability that a test does reject H0 when an alternative is true. This probability is called the power of the test against that specific alternative.

The power of a test against a specific alternative is the probability that the test will reject H0 at a chosen significance level α when the specified alternative value of the parameter is true.

The power of a test against a specific alternative is the probability that the test will reject H0 at a chosen significance level α when the specified alternative value of the parameter is true.


A potato-chip producer wonders whether the significance test of H0: p = 0.08 versus Ha: p > 0.08 based on a random sample of 500 potatoes has enough power to detect a shipment with, say, 11% blemished potatoes.

We would reject H0 at α = 0.05 if our sample yielded a sample proportion to the right of the green line.

What if p = 0.11?

Since we reject H0 at α = 0.05 if our sample yields a proportion > 0.0999, we’d correctly reject the shipment about 75% of the time.

The power of a test against any alternative is 1 minus the probability of a Type II error for that alternative; that is, power = 1 – β.

The power of a test against any alternative is 1 minus the probability of a Type II error for that alternative; that is, power = 1 – β.


How large a sample should we take when we plan to carry out a significance test? The answer depends on what alternative values of the parameter are important to detect.

Summary of influences on the question “How many observations do I need?”

•If you insist on a smaller significance level (such as 1% rather than 5%), you have to take a larger sample. A smaller significance level requires stronger evidence to reject the null hypothesis.

• If you insist on higher power (such as 99% rather than 90%), you will need a larger sample. Higher power gives a better chance of detecting a difference when it is really there.

• At any significance level and desired power, detecting a small difference requires a larger sample than detecting a large difference.

Summary of influences on the question “How many observations do I need?”

•If you insist on a smaller significance level (such as 1% rather than 5%), you have to take a larger sample. A smaller significance level requires stronger evidence to reject the null hypothesis.

• If you insist on higher power (such as 99% rather than 90%), you will need a larger sample. Higher power gives a better chance of detecting a difference when it is really there.

• At any significance level and desired power, detecting a small difference requires a larger sample than detecting a large difference.

Chapter 16 Objectives Review

16

Describe the conditions necessary for inference

Describe cautions about confidence intervals Describe cautions about significance tests Calculate the sample size for a desired margin

of error in a confidence interval Define Type I and Type II errors Calculate the power of a significance test

chapter 16: inference in practice. chapter 16 concepts 2 conditions for inference in practice ...

Documents

random sample

sample data

test of significance

sample size

sample z statistic

statistical inference

significance testscalculate

sample statistics