Upload: alexandrina-doyle

Post on 21-Dec-2015

11-2 Goodness-of-Fit

In this section, we consider sample data consisting of observed frequency counts arranged in a single row or column (called a one-way frequency table).

We will use a hypothesis test for the claim that the observed frequency counts agree with some claimed distribution, so that there is a good fit of the observed data with the claimed distribution.

Definition

A goodness-of-fit test is used to test the hypothesis that an observed frequency distribution fits (or conforms to) some claimed distribution.

Notation

O represents the observed frequency of an outcome, found from the sample data.

E represents the expected frequency of an outcome, found by assuming that the distribution is as claimed.

k represents the number of different categories or cells.

n represents the total number of trials.

Goodness-of-Fit

Hypotheses and Test Statistic

$H_0$: The frequency counts agree with the claimed distribution.

$H_A$: The frequency counts do not agree with the claimed distribution.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Critical Values

1. Found in Table A-4 using k − 1 degrees of freedom, where k = number of categories.

2. Goodness-of-fit hypothesis tests are always right-tailed.
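
As a minimal sketch of the test statistic, the sum of (O − E)² / E over all categories can be computed directly. The function names and the three-category counts below are made up for illustration and do not come from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(observed):
    """k - 1, where k is the number of categories."""
    return len(observed) - 1


# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
observed = [10, 20, 30]
expected = [20, 20, 20]

statistic = chi_square_statistic(observed, expected)
df = degrees_of_freedom(observed)
print(statistic, df)
```

The statistic is then compared with the right-tail critical value for k − 1 degrees of freedom.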

Finding Expected Frequencies

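If all expected frequencies are assumed equal:

$$E = \frac{n}{k}$$

If the expected frequencies are not all equal:

$$E = np \quad \text{for each individual category}$$

where p is the probability that the claimed distribution assigns to that category.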
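
The two rules for expected frequencies can be sketched as follows. The helper names and the probabilities in the second call are assumptions for illustration only; the first call uses n = 100 and k = 10, which matches the setting of the last-digit example.

```python
def expected_equal(n, k):
    """Expected frequency E = n / k for each of k equally likely categories."""
    return [n / k] * k


def expected_from_probabilities(n, probabilities):
    """Expected frequency E = n * p for each category's claimed probability p."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("claimed probabilities must sum to 1")
    return [n * p for p in probabilities]


# Ten equally likely categories with n = 100 trials.
equal_case = expected_equal(100, 10)

# Hypothetical claimed distribution over three categories.
unequal_case = expected_from_probabilities(200, [0.5, 0.3, 0.2])

print(equal_case)
print(unequal_case)
```

In either case the expected frequencies add up to n, which is a quick check on the arithmetic.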

A close agreement between observed and expected values will lead to a small value of $\chi^2$ and a large P-value. (Do not reject $H_0$.)

A large disagreement between observed and expected values will lead to a large value of $\chi^2$ and a small P-value.

A significantly large value of $\chi^2$ will cause a rejection of the null hypothesis of no difference between the observed and the expected. (Reject $H_0$.)
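
The link between the size of the statistic and the P-value can be sketched with a small right-tail routine. This is an illustrative implementation using only the standard library, based on the closed-form series for the chi-square upper tail with whole-number degrees of freedom; the slides themselves rely on Table A-4, and the function names here are assumptions.

```python
import math


def chi2_right_tail(x, df):
    """P(X >= x) for a chi-square variable with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    chi = math.sqrt(x)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + 2 * density * total


def reject_null(statistic, df, alpha=0.05):
    """Right-tailed decision: reject H0 when the P-value is at most alpha."""
    return chi2_right_tail(statistic, df) <= alpha


small = chi2_right_tail(2.0, 9)    # close agreement: large P-value
large = chi2_right_tail(40.0, 9)   # large disagreement: small P-value
print(small, large, reject_null(2.0, 9), reject_null(40.0, 9))
```

A statistic near zero gives a P-value near 1, while a statistic far in the right tail gives a P-value near 0, which is why the test is always right-tailed.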

Example

A random sample of 100 weights of Californians is obtained, and the last digits of those weights are summarized on the next slide.

When obtaining weights, it is extremely important to actually measure the weights rather than ask people to self-report them.

By analyzing the last digit, we can verify the weights were actually measured since reported weights tend to be rounded to something ending with a 0 or a 5.

Test the claim that the sample is from a population of weights in which the last digits do not occur with the same frequency.

Example - Continued

[Table: frequencies of the last digits (0 through 9) of the 100 weights; the table is not preserved in this transcript.]

Example - Continued

The hypotheses can be written as:

$H_0$: $p_0 = p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = p_7 = p_8 = p_9$

$H_1$: At least one of the probabilities is different.

No significance level was specified, so we select α = 0.05.

Example - Continued

The calculation of the test statistic is given:

Example - Continued

The test statistic is $\chi^2 = 212.800$ and the critical value is $\chi^2 = 16.919$ (Table A-4, using k − 1 = 9 degrees of freedom and α = 0.05).
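
The arithmetic behind a statistic of this size can be sketched as follows. The table of observed counts is not preserved in this transcript, so the counts below are illustrative assumptions: they are chosen to total n = 100, to pile up on the digits 0 and 5, and to reproduce the reported value of 212.800. The critical value 16.919 is the one quoted from Table A-4.

```python
# Illustrative last-digit counts for digits 0..9 (an assumption, not the
# slide's table): heavy on 0 and 5, as rounding in self-reports would produce.
observed = [46, 1, 2, 3, 3, 30, 4, 0, 8, 3]

n = sum(observed)              # total number of weights
k = len(observed)              # number of categories (digits 0-9)
expected = [n / k] * k         # equal expected frequencies, E = n / k

# Test statistic: sum of (O - E)^2 / E over all categories.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 16.919        # Table A-4, 9 degrees of freedom, alpha = 0.05
reject_h0 = statistic > critical_value

print(n, k, round(statistic, 3), reject_h0)
```

With E = 10 for every digit, the two large counts alone contribute (46 − 10)²/10 + (30 − 10)²/10 = 169.6 to the statistic, which already far exceeds the critical value.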

Example - Continued

Since the test statistic $\chi^2_{TS} = 212.8$ exceeds the critical value $\chi^2_{CV} = 16.919$, we have sufficient evidence to reject $H_0$ and support the alternative hypothesis $H_A$. We conclude there is sufficient evidence to support the claim that the last digits do not occur with the same relative frequency.

In other words, we have evidence that the weights were self-reported by the subjects, and the subjects were not actually weighed.