STT 315

Ashwini Maurya

This lecture is based on Chapter 5.4

Acknowledgement: Author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use/edit some of their slides.

Statistical Inference

• Inference means drawing a conclusion about a population parameter based on a statistic calculated from a sample.

• Conclusions made using statistical inference are probabilistic in nature. We may not be able to say something for sure, but we can say it with a certain level of confidence.

• There are two types of inference:
– Confidence intervals
– Hypothesis tests

Goal

Students will be able to:

• Construct a confidence interval for a proportion.

• Interpret a confidence interval for a proportion.

• Check conditions for the use of inference about a population proportion:
– Independence (or sample less than 10% of population),
– Sample size large enough (successes and failures each greater than 10).

• Explain the relationship between the margin of error, sample size, and level of certainty.

Estimating Smokers

• Suppose I want to estimate the percent of MSU undergraduate students who smoke.

• A random sample of 99 undergraduate students was selected, and 17 of them had smoked tobacco in the last week.

• I want to make a 95% confidence interval for the proportion of MSU undergraduates who smoke, based on this information.

Are the conditions met?

• First, it is a random sample.

• Although the sample is drawn without replacement, it satisfies the 10% condition: MSU has more than 1000 undergraduate students, so 99 students are less than 10% of the population.

• Also, both the number of smokers (17) and the number of non-smokers (82) are larger than 10, so the sample can be considered large enough.
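The checks above can be sketched in a few lines of Python. This is only an illustration: the function name `check_conditions` and the `threshold` parameter are hypothetical, and the population size of 1000 is the lower bound mentioned on the slide, not the actual MSU enrollment. Randomness of the sample cannot be verified from the counts, so it is not checked here.

```python
def check_conditions(successes, n, population_size, threshold=10):
    """Return the 10% check and the two sample-size checks as booleans.

    Whether the sample is random has to be judged from how the data
    were collected; it is not something these numbers can confirm.
    """
    failures = n - successes
    return {
        "ten_percent": n < 0.10 * population_size,   # sample under 10% of population
        "enough_successes": successes > threshold,   # e.g. smokers
        "enough_failures": failures > threshold,     # e.g. non-smokers
    }

# 17 smokers out of 99 sampled; 1000 is an assumed lower bound on the population
checks = check_conditions(successes=17, n=99, population_size=1000)
for name, ok in checks.items():
    print(name, ok)
print("all conditions met:", all(checks.values()))
```

Since the real undergraduate population is larger than 1000, the 10% check only becomes easier to pass.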

Use the results to make a CI

• The sampling distribution results tell us that the sample proportions will be roughly normally distributed around the population proportion.

• So about 95% of sample proportions should fall within two standard deviations of the population proportion.

• But we don’t know the population proportion (that’s what we are trying to estimate)! So we cannot get the standard deviation $\sigma_{\hat{p}}$.

• Therefore we need to use the sample proportion $\hat{p}$ and work backward from there.

Construction of C.I.

In our sample, 17 out of 99 students smoked tobacco in the last week, or 17.2%.

17.2% is a statistic (or a point estimate).

We will use 17.2% to make an interval estimate for the value of the parameter.

To make a 95% confidence interval we must create an interval that extends 2 standard deviations above and below the statistic.

One standard deviation is

$$\sqrt{\frac{(0.172)(0.828)}{99}} = 0.0379.$$

So 2 standard deviations is 2(0.0379) = 0.0758 (and 7.58% is the margin of error).

So a 95% confidence interval has endpoints at 0.172 - 0.0758 = 0.0962 and 0.172 + 0.0758 = 0.2478, and we write:

We are 95% confident that between 9.62% and 24.78% of MSU undergraduates smoke tobacco.

If we want to make a 68% confidence interval, we only have to extend the interval one standard deviation from the statistic in each direction.

So a 68% confidence interval has endpoints at 0.172 - 0.0379 = 0.1341 and 0.172 + 0.0379 = 0.2099, and we write:

We are 68% confident that between 13.4% and 21.0% of MSU undergraduates smoke tobacco.
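As a rough check of the hand calculation, the 68% and 95% intervals can be reproduced in Python. The helper name `empirical_rule_interval` is made up for this sketch. The code keeps the standard deviation unrounded, so the endpoints may differ from the slide values in the fourth decimal place.

```python
import math

x, n = 17, 99
p_hat = round(x / n, 3)                   # 0.172, rounded as on the slide
se = math.sqrt(p_hat * (1 - p_hat) / n)   # one standard deviation, about 0.0379

def empirical_rule_interval(p_hat, se, k):
    """Interval reaching k standard deviations on each side of p_hat."""
    margin = k * se
    return p_hat - margin, p_hat + margin

ci68 = empirical_rule_interval(p_hat, se, 1)   # about 68% confidence
ci95 = empirical_rule_interval(p_hat, se, 2)   # about 95% confidence

print(f"one standard deviation: {se:.4f}")
print(f"68% interval: ({ci68[0]:.4f}, {ci68[1]:.4f})")
print(f"95% interval: ({ci95[0]:.4f}, {ci95[1]:.4f})")
```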

Our 95% CI for smokers was 9.62% to 24.78%.

This means that (find the correct one):

a) 95% of random samples of MSU undergraduates will have between 9.62% and 24.78% smokers.
b) Between 9.62% and 24.78% of MSU undergraduates smoke.
c) 95% of MSU undergraduates smoke between 9.62% and 24.78% of the time.
d) We are 95% sure that between 9.62% and 24.78% of MSU undergraduates smoke.

Standard Error (S.E.)

• If subjects are independent,
• and if the sample size is large enough,

then the sample proportions are approximately normally distributed with mean p and standard deviation

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}},$$

i.e., $\hat{p} \sim N(p, \sigma_{\hat{p}})$ approximately.

• But in an estimation problem, p is unknown. So we replace the population proportion (p) by the sample proportion ($\hat{p}$) in its formula and get the standard error of the sample proportion,

$$S.E.(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{\hat{p}\hat{q}}{n}},$$

where $\hat{q} = 1 - \hat{p}$.

Confidence Interval (C.I.) and Margin of Error (M.E.)

• Since for large n the sample proportion ($\hat{p}$) is approximately normal, we can conclude (using the empirical rule) that
– we are about 68% sure the population proportion (p) lies within a margin of error of 1×S.E. of $\hat{p}$,
– we are about 95% sure the population proportion (p) lies within a margin of error of 2×S.E. of $\hat{p}$.

• So the confidence interval for p is $\hat{p} \pm M.E.$

• The more confidence you require, the larger the margin of error.

Is it 1.96 or 2 for 95% C.I.?

Find the exact area between -2 and 2 standard deviations from the mean on a normal curve using your calculator.

Hint: normalcdf(-2,2,0,1) = 0.954 = 95.4%. So this is not exactly 95%, but slightly more.

On the other hand, the exact area between -1.96 and 1.96 standard deviations from the mean is normalcdf(-1.96,1.96,0,1) = 0.95 = 95%.

Using 1.96, we get the 95% C.I. for p to be: (0.097, 0.246).

Note: the calculator uses 1.96. But how do we use the calculator to construct a C.I.?
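For readers without a TI calculator, the same areas can be checked with Python's standard library. This sketch uses `statistics.NormalDist` (available in Python 3.8 and later) as a stand-in for `normalcdf`. The helper name `central_area` is invented for the example.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

def central_area(k):
    """Area under the standard normal curve between -k and k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

print(f"area within 2 SDs:    {central_area(2):.4f}")
print(f"area within 1.96 SDs: {central_area(1.96):.4f}")

# Going the other way: which multiplier leaves exactly 95% in the middle?
z_star = std_normal.inv_cdf(0.975)
print(f"multiplier for 95%:   {z_star:.4f}")
```

The inverse lookup uses 0.975 rather than 0.95 because a central area of 95% leaves 2.5% in each tail.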

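Formula: C.I. for p

• The formula for the 100(1−α)% C.I. for p is given by

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}},$$

where $z_{\alpha/2}$ is the number such that $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$, where Z is a standard normal variable.

• However, one can use a TI 83/84 to compute C.I.’s for p.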

C.I. with TI 83/84 Plus

We want to make an 85% confidence interval for smokers among MSU undergraduates. In a random sample of 99 MSU undergraduates, 17 smoked tobacco last week.

• Press [STAT].
• Select [TESTS].
• Choose A: 1-PropZInt….
• Input the following:
o x: 17
o n: 99
o C-Level: 85
• Choose Calculate and press [ENTER].

Answer: 85% C.I. for p is (0.117, 0.226).
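The calculator steps above can be mirrored in Python as a sanity check. This is a sketch, not the calculator's actual implementation: the function name `one_prop_z_interval` is hypothetical, and it is assumed that the one-proportion z-interval is the usual sample proportion plus or minus a normal critical value times the standard error. The confidence level is entered as a decimal (0.85).

```python
import math
from statistics import NormalDist

def one_prop_z_interval(x, n, c_level):
    """One-proportion z-interval for x successes in n trials.

    c_level is the confidence level as a decimal, e.g. 0.85 for 85%.
    """
    p_hat = x / n
    # Critical value leaving (1 - c_level)/2 in each tail of the normal curve
    z_star = NormalDist().inv_cdf(1 - (1 - c_level) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = one_prop_z_interval(x=17, n=99, c_level=0.85)
print(f"85% C.I. for p: ({low:.3f}, {high:.3f})")
```

Calling the same function with `c_level=0.95` gives approximately (0.097, 0.246), which agrees with the interval obtained earlier using 1.96.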

How would I find a confidence interval for a proportion using a calculator?

[Sample Input: calculator screen]

[Sample Output: calculator screen]

The sample input shows finding a 99% confidence interval with a sample size of 4040 people and 2048 smokers.

We would interpret the sample output as: “We are 99% confident that between 48.7% and 52.7% of the population smokes.”

Note: This example wasn’t actually about smoking.

Width of a C.I.

• Since the formula of the confidence interval for p is $\hat{p} \pm M.E.$, the width of the C.I. is 2×M.E.

• So if we know the width of the C.I., we can compute the M.E. by halving the width.

• Example: Given that a 90% C.I. for p is (0.23, 0.37), find the values of (a) the sample proportion and (b) the margin of error of the 90% C.I.

Solution: The width is (0.37 - 0.23) = 0.14, and so the margin of error of the 90% C.I. for p is 0.14/2 = 0.07. Moreover, $\hat{p} - M.E. = 0.23$, and so $\hat{p} = 0.07 + 0.23 = 0.30$.
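The same back-calculation can be written as a small Python helper. The name `from_interval` is hypothetical. It takes the sample proportion to be the midpoint of the interval, which is equivalent to adding the margin of error to the lower endpoint as in the solution above.

```python
def from_interval(low, high):
    """Recover the sample proportion and margin of error from a C.I."""
    margin = (high - low) / 2   # half the width
    p_hat = (low + high) / 2    # midpoint, the same as low + margin
    return p_hat, margin

p_hat, margin = from_interval(0.23, 0.37)
print(f"sample proportion: {p_hat:.2f}")
print(f"margin of error:   {margin:.2f}")
```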

Smoker example

• We found that 17.2% of a sample of 99 MSU undergraduates had smoked in the past week.

• We used this to find a 95% confidence interval for the proportion of MSU undergraduates who smoke.

• Our 95% confidence interval is (0.096, 0.248).

• The width of the 95% C.I. is (0.248 - 0.096) = 0.152, and so the margin of error is (0.152/2) = 0.076.

• If we want to reduce the margin of error while keeping the confidence level the same, we could increase the sample size.
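To see how the margin of error responds to sample size, here is a small Python sketch. It assumes, purely for illustration, that the sample proportion stays at 0.172 in the larger samples. The sample sizes 396 and 1584 are hypothetical values chosen as 4 and 16 times the original 99.

```python
import math

def margin_of_error(p_hat, n, z_star=1.96):
    """Margin of error z* x S.E. for a one-proportion interval."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical larger samples with the same observed proportion
for n in (99, 396, 1584):
    print(f"n = {n:5d}  margin of error = {margin_of_error(0.172, n):.4f}")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.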

M.E. and sample size

• If we wanted to reduce the margin of error to 4%, what is the minimum number of undergrads we would have to survey?

• The formula is:

$$n = \frac{(z_{\alpha/2})^2\, pq}{(M.E.)^2}.$$

• But what p to use (remember q = 1-p)?

Two cases:

No information about p is given. In that case use p = 0.5. In our exercise, if nothing about p is known:

$$n = \frac{1.96^2 \times 0.5 \times 0.5}{(M.E.)^2} = \frac{1.96^2 \times 0.5 \times 0.5}{(0.04)^2} = 600.25.$$

So we would need 601 subjects.

If some information about p is known, use that information. If we use the information from the sample: p = 0.172, q = 1-0.172 = 0.828.

$$n = \frac{1.96^2\, pq}{(M.E.)^2} = \frac{3.8416(0.172)(0.828)}{(0.04)^2} = 341.94.$$

So we would need 342 subjects.
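Both cases can be checked with a short Python function. The name `required_sample_size` is made up for this sketch. It uses 1.96 as the 95% multiplier and always rounds up, since rounding down would leave the margin of error slightly above the target.

```python
import math

def required_sample_size(margin, p=0.5, z_star=1.96):
    """Smallest n whose margin of error is at most `margin`.

    p defaults to 0.5, the conservative choice when nothing is known.
    """
    n_exact = z_star ** 2 * p * (1 - p) / margin ** 2
    return math.ceil(n_exact)   # always round up to a whole subject

print(required_sample_size(0.04))            # nothing known about p
print(required_sample_size(0.04, p=0.172))   # using the sample proportion
```

The default p = 0.5 gives the largest possible value of pq, so it is the safe choice when no estimate is available.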

Summary

• A larger sample size gives a smaller margin of error.

• A higher confidence level gives a larger margin of error.

• The level of confidence is the proportion of intervals, over repeated samples, that will contain the value of the population parameter.

• As long as the conditions are met, the confidence interval procedure works.
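The third bullet can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the "true" proportion of 0.17 is invented, the function name `coverage` is hypothetical, and the seed and number of trials are arbitrary. The idea is to draw many samples of 99, build a 95% interval from each, and count how often the interval contains the true proportion.

```python
import math
import random

def coverage(p, n, z_star=1.96, trials=2000, seed=0):
    """Fraction of simulated intervals that contain the true proportion p."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and count the successes
        x = sum(rng.random() < p for _ in range(n))
        p_hat = x / n
        margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / trials

rate = coverage(p=0.17, n=99)
print(f"share of intervals containing p: {rate:.3f}")
```

The observed share should land near 95%, though usually not exactly on it, because the normal approximation is only approximate for a sample of this size.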