Goodness of Fit Tests
Section 8.5

Cathy Poliak, Ph.D.
[email protected]

Office hours: T Th 2:30 pm - 5:15 pm 620 PGH

Department of Mathematics
University of Houston

April 21, 2016

Cathy Poliak, Ph.D. [email protected] Office hours: T Th 2:30 pm - 5:15 pm 620 PGH (Department of Mathematics University of Houston )Section8.5 April 21, 2016 1 / 20

Outline

1 Beginning Questions

2 Beginning Example

3 Chi-Square

4 Examples

Popper Set Up

Fill in all of the proper bubbles.

Use a #2 pencil.

This is popper number 22.

Steps of a Significance Test

When performing a significance test, we follow these steps:

1. Check assumptions.

2. State the null and alternative hypotheses.

3. Graph the rejection region, labeling the critical values.

4. Calculate the test statistic.

5. Find the p-value. If the p-value is less than the significance level, α, we reject the null hypothesis in favor of the alternative hypothesis.

6. Give your conclusion in the context of the problem. When stating the conclusion, give results with a confidence of (1 − α)(100)%.

What if we are not given α?

If the P-value for testing H0 is less than:

0.1, we have some evidence that H0 is false.

0.05, we have strong evidence that H0 is false.

0.01, we have very strong evidence that H0 is false.

0.001, we have extremely strong evidence that H0 is false.

If the P-value is greater than 0.1, we do not have any evidence that H0 is false.
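The evidence scale above can be sketched as a small R helper. The function name evidence_against_h0 is hypothetical (it is not part of the course materials), and treating a P-value of exactly 0.1 as "no evidence" is an assumption, since the slide only covers values below or above 0.1.

```r
# Hypothetical helper: map a P-value to the wording used on the slide.
# Assumption: a P-value of exactly 0.1 is treated as "no evidence".
evidence_against_h0 <- function(p_value) {
  stopifnot(is.numeric(p_value), length(p_value) == 1,
            p_value >= 0, p_value <= 1)
  if (p_value < 0.001) {
    "extremely strong evidence that H0 is false"
  } else if (p_value < 0.01) {
    "very strong evidence that H0 is false"
  } else if (p_value < 0.05) {
    "strong evidence that H0 is false"
  } else if (p_value < 0.1) {
    "some evidence that H0 is false"
  } else {
    "no evidence that H0 is false"
  }
}

print(evidence_against_h0(0.35))
print(evidence_against_h0(0.03))
```

The checks run from the smallest cutoff upward so that each P-value is reported with the strongest wording it qualifies for.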

Popper #22 Questions

1. In a hypothesis test, if the computed P-value is 0.35, our decision is to
   a) retest with a different sample.
   b) fail to reject the null hypothesis.
   c) reject the null hypothesis.
   d) accept the null hypothesis.

2. Consumer Reports (January 1993) stated that the mean retail cost of an AT&T model 3730 cellular phone was $600. A random sample of 10 stores in Los Angeles had a mean cost of $586.50 with a standard deviation of $26.77. Does this indicate that the mean cost in Los Angeles is less than $600? To answer this question, which test should be used?
   a) One Sample T Test for Means
   b) χ² Goodness of Fit Test
   c) Two Sample T Test for Means
   d) One Sample Z Test for Means

Candy

Mars Inc. claims that they produce M&Ms with the following color distribution:

Brown  30%     Red   20%     Yellow 20%
Orange 10%     Green 10%     Blue   10%

A bag of M&Ms was randomly selected from the grocery store shelf, and the color counts were:

Brown  14      Red   14      Yellow 5
Orange 7       Green 6       Blue   10

We want to know if the distribution of colors is the same as the manufacturer's claim.

Goodness-of-fit Test

This is a test to see how well the sample proportions of the categories "match up" with the known population proportions.

The chi-square goodness-of-fit test extends inference on proportions to more than two proportions by enabling us to determine whether a particular population distribution has changed from a specified form.

Hypotheses:
   H0: The proportions are the same as what is claimed.
   Ha: At least one proportion is different from what is claimed.

It is better to state these in the context of the problem. For example, in our M&Ms test:
   H0: The distribution of candy colors is as the manufacturer claims.
   Ha: The distribution of candy colors is not what the manufacturer claims.

Chi-Square Test

Test statistic: the chi-square statistic is a measure of how much the observed cell counts diverge from the expected cell counts. To calculate it for each problem, make a table with the following headings:

Observed Counts (O)     Expected Counts (E)     (O − E)²/E

The sum of the third column is called the chi-square test statistic, χ²:

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

where expected count = total count × claimed proportion of the category.

Chi-square of M&Ms

Color     Observed Counts (O)     Proportion     Expected Counts (E)     (O − E)²/E
Brown     14                      0.3
Red       14                      0.2
Yellow    5                       0.2
Orange    7                       0.1
Green     6                       0.1
Blue      10                      0.1
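The two empty columns of this table can be filled in with a few lines of R. This is a minimal sketch of the by-hand calculation; the variable names (observed, proportions, expected, contribution, mm_table) are chosen here for illustration and do not come from the slides.

```r
# Observed counts and claimed proportions from the M&Ms example.
colors      <- c("Brown", "Red", "Yellow", "Orange", "Green", "Blue")
observed    <- c(14, 14, 5, 7, 6, 10)
proportions <- c(0.3, 0.2, 0.2, 0.1, 0.1, 0.1)

# Expected count = total count x claimed proportion.
expected <- sum(observed) * proportions

# Each row's contribution to the statistic: (O - E)^2 / E.
contribution <- (observed - expected)^2 / expected

mm_table <- data.frame(Color = colors, O = observed,
                       Proportion = proportions, E = expected,
                       Contribution = round(contribution, 4))
print(mm_table)

# The chi-square statistic is the sum of the last column.
chi_sq <- sum(contribution)
print(round(chi_sq, 4))  # 8.4345
```

Most of the statistic comes from the Yellow and Blue rows, whose observed counts are furthest from their expected counts relative to E.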

Chi-square

Chi-square distributions have only positive values and are skewed right.

The distribution has a degrees-of-freedom parameter. For a goodness-of-fit test it is n − 1, where n is the number of categories.

As the degrees of freedom increase, the distribution becomes more like a Normal distribution.

The total area under the χ² curve is 1.

To find the area under the curve to the right of x:
   Table provided
   In R: 1 - pchisq(x, df)
   In TI-83(84): χ²cdf(x, 1e99, df)
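As a short illustration of the R option, the sketch below finds the area to the right of the M&Ms statistic. The values x = 8.4345 and df = 5 are taken from the M&Ms example in these slides; the variable names are illustrative.

```r
x  <- 8.4345  # chi-square statistic from the M&Ms example
df <- 5       # six color categories minus one

# Area to the right of x, in the form shown on the slide.
p_upper <- 1 - pchisq(x, df)

# The same area, requested directly through the lower.tail argument.
p_upper_direct <- pchisq(x, df, lower.tail = FALSE)

print(round(p_upper, 4))  # about 0.1339
print(round(p_upper_direct, 4))
```

Both forms give the same area; lower.tail = FALSE avoids the subtraction, which can matter numerically when the tail area is extremely small.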

Assumptions for a Chi-Square Goodness-of-fit Test

1. The sample must be a simple random sample (SRS) from the population of interest.

2. The population size is at least 10 times the size of the sample.

3. All expected cell counts must be at least 5.
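Of these assumptions, only the third can be checked from the counts themselves. A minimal sketch for the M&Ms data follows; the variable names are illustrative, and assumptions 1 and 2 still have to be judged from how the sample was collected.

```r
# Observed counts and claimed proportions from the M&Ms example.
observed    <- c(14, 14, 5, 7, 6, 10)
proportions <- c(0.3, 0.2, 0.2, 0.1, 0.1, 0.1)

# Expected count = total count x claimed proportion.
expected <- sum(observed) * proportions
print(expected)

# Assumption 3: every expected cell count should be at least 5.
all_large_enough <- all(expected >= 5)
print(all_large_enough)  # TRUE: the smallest expected count is 5.6
```

Note that the check uses the expected counts, not the observed ones: Yellow has an observed count of exactly 5, but its expected count is 11.2.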

Is the manufacturer's claim correct?

Using R

General form:

chisq.test(c(list of observed values), correct = FALSE, p = c(list of proportions))

If we are not given a list of proportions, then every one of the n categories is given proportion p = 1/n. That is the default in R, so we do not need to supply p in that case.

> chisq.test(c(14,14,5,7,6,10), correct=FALSE, p=c(.3,.2,.2,.1,.1,.1))

	Chi-squared test for given probabilities

data:  c(14, 14, 5, 7, 6, 10)
X-squared = 8.4345, df = 5, p-value = 0.1339
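The printed output can also be stored and inspected piece by piece, which is convenient for the decision step. This is a sketch only: the significance level alpha = 0.05 is an assumption (the example does not state one), and the names result, alpha, and reject_h0 are illustrative.

```r
# Store the test result instead of only printing it.
result <- chisq.test(c(14, 14, 5, 7, 6, 10), correct = FALSE,
                     p = c(0.3, 0.2, 0.2, 0.1, 0.1, 0.1))

print(result$statistic)  # X-squared
print(result$parameter)  # degrees of freedom
print(result$p.value)    # area to the right of the statistic
print(result$expected)   # expected counts under H0

# Assumed significance level; not given in the example.
alpha <- 0.05
reject_h0 <- result$p.value < alpha
print(reject_h0)  # FALSE, since 0.1339 is not less than 0.05
```

Under that assumed α, we fail to reject H0: this bag does not give evidence that the color distribution differs from the manufacturer's claim.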

Zodiac Signs

Does your zodiac sign determine how successful you will be in later life? Fortune magazine collected the zodiac signs of 256 heads of the largest 400 companies. The following are the number of births for each sign:

Sign          Births
Aries         23
Taurus        20
Gemini        18
Cancer        23
Leo           20
Virgo         19
Libra         18
Scorpio       21
Sagittarius   19
Capricorn     22
Aquarius      24
Pisces        29

From: Intro Stats, De Veaux, Velleman, Bock. 2nd Edition, Pearson, pg 604.

2. Hypotheses

H0: The number of births is the same across the zodiac signs.

Ha: The number of births is not the same across the zodiac signs.

3 & 4. Chi-square Test statistic and P-value
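One way these two steps might be carried out in R for the zodiac data is sketched below, first by hand and then with chisq.test. The variable names are illustrative; since no proportions are given, each of the 12 signs is assigned proportion 1/12 under H0.

```r
# Births per sign from the zodiac table.
births <- c(Aries = 23, Taurus = 20, Gemini = 18, Cancer = 23,
            Leo = 20, Virgo = 19, Libra = 18, Scorpio = 21,
            Sagittarius = 19, Capricorn = 22, Aquarius = 24, Pisces = 29)

# By hand: under H0 every sign has the same expected count.
expected <- sum(births) / length(births)
chi_sq   <- sum((births - expected)^2 / expected)
df       <- length(births) - 1
p_value  <- 1 - pchisq(chi_sq, df)

# Same test with chisq.test; p defaults to equal proportions.
zodiac_test <- chisq.test(births)

print(round(chi_sq, 3))  # 5.094
print(df)                # 11
print(p_value)
print(zodiac_test)
```

The expected count is 256/12, about 21.3, for every sign, so the requirement that all expected counts be at least 5 is met.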

5 & 6. Decision and Conclusion

Popper #22 Questions

Mars Inc. claims that they produce M&Ms with the following color distribution:

Brown  30%     Red   20%     Yellow 20%
Orange 10%     Green 10%     Blue   10%

A bag of M&Ms was randomly selected from the grocery store shelf, and the color counts were:

Brown  25      Red   23      Yellow 21
Orange 13      Green 15      Blue   14

3. Using the χ² goodness of fit test to determine if the proportion of M&Ms is what is claimed, what is the test statistic?
   a) χ² = 9.231
   b) χ² = 2.716
   c) χ² = 4.616
   d) χ² = 1.960
