sampling distribution of a sample proportion lecture 25 sections 8.1 – 8.2 fri, feb 29, 2008

Sampling Distribution of a Sample Proportion

Lecture 25

Sections 8.1 – 8.2

Fri, Feb 29, 2008

Sampling Distributions

Sampling Distribution of a Statistic

The Sample Proportion

The letter p represents the population proportion.

The symbol p^ (“p-hat”) represents the sample proportion.

p^ is a random variable. The sampling distribution of p^ is the probability distribution of all the possible values of p^.

Example

Suppose that 2/3 of all males wash their hands after using a public restroom.

Suppose that we take a sample of 1 male. Find the sampling distribution of p^.

Example

Tree diagram for a single male: he washes (W) with probability 2/3 or does not wash (N) with probability 1/3.

P(W) = 2/3

P(N) = 1/3

Example

Let x be the sample number of males who wash.

The probability distribution of x is

x P(x)

0 1/3

1 2/3

Example

Let p^ be the sample proportion of males who wash. (p^ = x/n.)

The sampling distribution of p^ is

p^ P(p^)

0 1/3

1 2/3

Example

Now we take a sample of 2 males, sampling with replacement.

Find the sampling distribution of p^.

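Example

Tree diagram for two males: each one washes (W) with probability 2/3 or does not wash (N) with probability 1/3. Multiplying along the branches gives

P(WW) = (2/3)(2/3) = 4/9

P(WN) = (2/3)(1/3) = 2/9

P(NW) = (1/3)(2/3) = 2/9

P(NN) = (1/3)(1/3) = 1/9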

Example

Let x be the sample number of males who wash.

The probability distribution of x is

x P(x)

0 1/9

1 4/9

2 4/9

Example

Let p^ be the sample proportion of males who wash. (p^ = x/n.)

The sampling distribution of p^ is

p^ P(p^)

0 1/9

1/2 4/9

1 4/9

Samples of Size n = 3

If we sample 3 males, then the sample proportion of males who wash has the following distribution.

p^ P(p^)

0 1/27 = .04

1/3 6/27 = .22

2/3 12/27 = .44

1 8/27 = .30



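Samples of Size n = 4

If we sample 4 males, then the sample proportion of males who wash has the following distribution.

p^ P(p^)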

0 1/81 = .01

1/4 8/81 = .10

2/4 24/81 = .30

3/4 32/81 = .40

1 16/81 = .20



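Samples of Size n = 5

If we sample 5 males, then the sample proportion of males who wash has the following distribution.

p^ P(p^)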

0 1/243 = .004

1/5 10/243 = .041

2/5 40/243 = .165

3/5 80/243 = .329

4/5 80/243 = .329

1 32/243 = .132
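The tables for n = 1 through n = 5 all follow the same pattern: the number x of males who wash is binomial, and p^ = x/n. As a minimal sketch (the function name sampling_distribution is illustrative, not from the lecture), the tables can be reproduced with exact fractions:

```python
from fractions import Fraction
from math import comb

def sampling_distribution(n, p):
    """Return {p_hat: probability} for a sample of size n drawn with replacement."""
    q = 1 - p
    # x = number who wash; p_hat = x/n; P(x) is the binomial probability.
    return {Fraction(x, n): comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}

dist = sampling_distribution(5, Fraction(2, 3))
for p_hat, prob in dist.items():
    # Print the value of p^, the exact probability, and a decimal approximation.
    print(p_hat, prob, round(float(prob), 3))
```

Changing the first argument to 1, 2, 3, or 4 gives the smaller tables.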

Our Experiment

In our experiment, we had 80 samples of size 5.

Based on the sampling distribution when n = 5, we would expect the following counts (each predicted count is 80 × P(p^)).

Value of p^    0.0    0.2    0.4    0.6    0.8    1.0

Actual         (not recorded in these slides)

Predicted      0.3    3.3    13.2   26.3   26.3   10.5
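The predicted counts are the n = 5 probabilities multiplied by the 80 samples. A short sketch of that arithmetic, with variable names chosen here for illustration:

```python
from fractions import Fraction
from math import comb

p, n, num_samples = Fraction(2, 3), 5, 80

# Expected number of samples (out of 80) with x washers, for x = 0, ..., 5.
expected = [
    float(num_samples * comb(n, x) * p**x * (1 - p)**(n - x))
    for x in range(n + 1)
]
print([round(e, 1) for e in expected])
```

The rounded values agree with the Predicted row.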

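The pdf when n = 1

[Figure: bar graph of the distribution of p^ over the values 0 and 1]

The pdf when n = 2

[Figure: bar graph of the distribution of p^ over the values 0, 1/2, 1]

The pdf when n = 3

[Figure: bar graph of the distribution of p^ over the values 0, 1/3, 2/3, 1]

The pdf when n = 4

[Figure: bar graph of the distribution of p^ over the values 0, 1/4, 2/4, 3/4, 1]

The pdf when n = 5

[Figure: bar graph of the distribution of p^ over the values 0, 1/5, 2/5, 3/5, 4/5, 1]

The pdf when n = 10

[Figure: bar graph of the distribution of p^; horizontal axis marked at 0, 2/10, 4/10, 6/10, 8/10, 1]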

Observations and Conclusions

Observation: The values of p^ are clustered around p.

Conclusion: p^ is close to p most of the time.


Observation: As the sample size increases, the clustering becomes tighter.

Conclusion: Larger samples give better estimates.

Conclusion: We can make the estimates of p as good as we want, provided we make the sample size large enough.


Observation: The distribution of p^ appears to be approximately normal.

Conclusion: We can use the normal distribution to calculate just how close to p we can expect p^ to be.
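The first two observations can be checked by simulation. The sketch below is not the class experiment. It assumes p = 2/3, draws 2000 simulated samples for each sample size, and reports the share of samples whose p^ lands within 0.1 of p. The helper name, the seed, the number of samples, and the 0.1 cutoff are all arbitrary choices made here.

```python
import random

def simulate_p_hats(n, p, num_samples, rng):
    """Simulate num_samples sample proportions, each from n independent draws."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(num_samples)]

rng = random.Random(0)  # fixed seed so the run is repeatable
p = 2 / 3
share_close = {}
for n in (5, 10, 100, 1000):
    p_hats = simulate_p_hats(n, p, 2000, rng)
    # Share of simulated samples with p^ within 0.1 of the true p.
    share_close[n] = sum(abs(ph - p) <= 0.1 for ph in p_hats) / len(p_hats)
    print(n, share_close[n])
```

The share should rise toward 1 as n grows, which is the tighter clustering described above.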

One More Observation

However, we must know the values of μ and σ for the distribution of p^.

That is, we have to quantify the sampling distribution of p^.

The Central Limit Theorem for Proportions

It turns out that the sampling distribution of p^ is approximately normal with the following parameters.

Mean of p^: $\mu_{\hat{p}} = p$

Variance of p^: $\sigma^2_{\hat{p}} = \frac{p(1 - p)}{n}$

Standard deviation of p^: $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
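The mean p and variance p(1 – p)/n of p^ can be checked directly against the exact sampling distributions from the hand-washing example. This is a sketch with an illustrative helper name. It uses exact fractions, so the comparison is not affected by rounding.

```python
from fractions import Fraction
from math import comb

def mean_and_variance(n, p):
    """Mean and variance of p^ computed from its exact sampling distribution."""
    q = 1 - p
    probs = {Fraction(x, n): comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}
    mean = sum(value * prob for value, prob in probs.items())
    var = sum((value - mean) ** 2 * prob for value, prob in probs.items())
    return mean, var

p = Fraction(2, 3)
for n in (1, 2, 5, 10):
    mean, var = mean_and_variance(n, p)
    # The last column is True when the variance equals p(1 - p)/n exactly.
    print(n, mean, var, var == p * (1 - p) / n)
```

The mean and variance hold exactly for every n. Only the normal shape is an approximation.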

The Central Limit Theorem for Proportions

The approximation to the normal distribution is excellent if

$np \geq 5$ and $n(1 - p) \geq 5$.

Example

If we gather a sample of 100 males, how likely is it that between 60 and 70 of them, inclusive, wash their hands after using a public restroom?

This is the same as asking the likelihood that 0.60 ≤ p^ ≤ 0.70.

Example

Use p = 0.66. Check that

np = 100(0.66) = 66 > 5 and n(1 – p) = 100(0.34) = 34 > 5.

Then p^ has an approximately normal distribution with

$\mu_{\hat{p}} = 0.66$ and $\sigma_{\hat{p}} = \sqrt{\frac{(0.66)(0.34)}{100}} = 0.04737$.

Example

So

P(0.60 ≤ p^ ≤ 0.70)

= normalcdf(.60,.70,.66,.04737)

= 0.6981.
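normalcdf is a calculator function. For readers without one, the same calculation can be sketched in Python with the standard library's NormalDist. The wrapper below is a stand-in written for this example, not the calculator's own routine.

```python
from math import sqrt
from statistics import NormalDist

def normalcdf(lower, upper, mean, sd):
    """Area under the normal curve with the given mean and sd between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

p, n = 0.66, 100
sd = sqrt(p * (1 - p) / n)           # standard deviation of p^
prob = normalcdf(0.60, 0.70, p, sd)  # P(0.60 <= p^ <= 0.70) under the normal model
print(round(sd, 5), round(prob, 4))
```

The result should agree with the 0.6981 above to about three decimal places. Small differences can come from rounding the standard deviation.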

Why Surveys Work

Suppose that we are trying to estimate the proportion of the male population who wash their hands after using a public restroom.

Suppose the true proportion is 66%. If we survey a random sample of 1000 people, how likely is it that our error will be no greater than 5%?

Why Surveys Work

Now we have

$\sigma_{\hat{p}} = \sqrt{\frac{(0.66)(0.34)}{1000}} = 0.01498.$

Why Surveys Work

Now find the probability that p^ is between 0.61 and 0.71:

normalcdf(.61, .71, .66, .01498) = 0.9992.

It is virtually certain that our estimate will be within 5% of 66%.

Why Surveys Work

What if we had decided to save money and surveyed only 100 people?

If it is important to be within 5% of the correct value, is it worth it to survey 1000 people instead of only 100 people?
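One way to explore this question is to repeat the normal calculation for both sample sizes. The sketch below assumes the same true proportion of 0.66 and the same ±0.05 margin. The function name prob_within is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def prob_within(p, n, margin):
    """Normal-approximation probability that p^ falls within margin of p."""
    sd = sqrt(p * (1 - p) / n)
    dist = NormalDist(p, sd)
    return dist.cdf(p + margin) - dist.cdf(p - margin)

for n in (100, 1000):
    print(n, round(prob_within(0.66, n, 0.05), 4))
```

Under these assumptions the probability for n = 100 comes out near 0.71, compared with 0.9992 for n = 1000.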

Quality Control

A company will accept a shipment of components if there is no strong evidence that more than 5% of them are defective.

H0: 5% of the parts are defective.

H1: More than 5% of the parts are defective.

Quality Control

They will take a random sample of 100 parts and test them. If no more than 10 of them are defective, they will accept the shipment.

What is α? What is β?
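This sketch assumes the question asks for the probabilities of a Type I and a Type II error. With the rule "accept if no more than 10 of 100 are defective," α is the chance of rejecting when exactly 5% are defective. β depends on which alternative is true, so the 10% defect rate used below is a hypothetical value chosen only for illustration. The sketch uses exact binomial probabilities rather than the normal approximation.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X binomial with n trials and success probability p."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

n, cutoff = 100, 10

# Type I error: more than 10 defectives in the sample although H0 (5%) is true.
alpha = 1 - binom_cdf(cutoff, n, 0.05)

# Type II error: 10 or fewer defectives although the true rate is higher.
assumed_true_rate = 0.10  # hypothetical alternative, not given in the slides
beta = binom_cdf(cutoff, n, assumed_true_rate)

print(round(alpha, 4), round(beta, 4))
```

A different assumed defect rate gives a different β, so the second number describes this one alternative only.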
