

Confidence intervals

Artin Armagan and Sayan Mukherjee

Sta. 113 Chapter 7 of Devore

March 12, 2010





Table of contents

1 Normal distribution known variance

2 Large sample CI, or CLT to the rescue

3 Small sample normal, thank Guinness

4 Confidence intervals on the spread or variance

5 Confidence bounds

6 Sample size computations





Uncertainty

In the last lecture we learned about point estimates using the MLE.

We also learned about uncertainty in the context of Bayesian methods and the posterior density.

We now study how to think of uncertainty within the likelihood framework. This is the idea of a confidence interval, and in statistics lingo it is the frequentist analog of the Bayesian credible interval.





Confidence interval of the mean

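If $X_1, \dots, X_n \overset{iid}{\sim} \mathrm{No}(\mu, \sigma^2)$ with $\sigma$ known, then we know that

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim \mathrm{No}(0, 1).$$

This means that

$$\Pr(-1.96 < Z < 1.96) = .95.$$

Substituting the definition of $Z$ and isolating $\mu$ one step at a time:

$$\Pr\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = .95.$$

$$\Pr\left(-1.96\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96\frac{\sigma}{\sqrt{n}}\right) = .95.$$

$$\Pr\left(-1.96\frac{\sigma}{\sqrt{n}} - \bar{X} < -\mu < -\bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = .95.$$

$$\Pr\left(1.96\frac{\sigma}{\sqrt{n}} + \bar{X} > \mu > \bar{X} - 1.96\frac{\sigma}{\sqrt{n}}\right) = .95.$$

$$\Pr\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = .95.$$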





A random interval

Consider the quantity

$$\Pr\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = .95.$$

$\bar{X}$ is random but $\mu$ is not; it is fixed. The interpretation of the above equation is as a random interval

$$\left(\ell = \bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\; u = \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right).$$

The interval is centered at the sample mean and extends in either direction by $1.96\,\sigma/\sqrt{n}$.

What a statistician would say is "the probability is .95 that the random interval includes the true value $\mu$."





Formal definition

Definition

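Given $x_1, \dots, x_n \overset{iid}{\sim} \mathrm{No}(\mu, \sigma^2)$, compute $\bar{x}$. The 95% confidence interval for $\mu$ is

$$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\; \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right),$$

or as $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$.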
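As a small numerical illustration of the definition, the following MATLAB sketch computes the interval for one simulated sample. The values mu = 1, sigma = 1 and n = 20 are assumptions chosen only for this example; the formula itself requires that sigma be known in advance.

```matlab
% Minimal sketch: 95% CI for the mean with known sigma.
% mu, sigma and n are illustrative assumptions, not values from the lecture.
mu = 1; sigma = 1; n = 20;
x = sigma*randn(1,n) + mu;        % simulated sample from No(mu, sigma^2)
xbar = mean(x);                   % point estimate of mu
half = 1.96*sigma/sqrt(n);        % half-width of the interval
ci = [xbar - half, xbar + half];  % (lower, upper)
fprintf('xbar = %.3f, 95%% CI = (%.3f, %.3f)\n', xbar, ci(1), ci(2));
```

The width of the interval depends only on sigma and n, so it is the same for every sample; only the center moves with the data.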





Meaning of a CI

What you want a confidence interval to say is "the probability that $\mu$ is included between $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ is .95."

Do not say this on an exam.





Meaning of a CI

The 95% CI is interpreted as the limit of the following procedure, for which $\lim_{T \to \infty} \mathrm{val} = .05$:

Out = 0
For t = 1 to T
    draw x1, ..., xn iid ∼ No(µ, σ²)
    compute x̄
    if µ ∉ (x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n) then Out → Out + 1
val = Out/T
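The procedure can be run directly. The MATLAB sketch below is one possible implementation, with mu, sigma, n and T chosen arbitrarily for illustration; it counts how often the interval misses the true mean.

```matlab
% Minimal sketch of the repeated-sampling procedure.
% mu, sigma, n and T are illustrative assumptions.
mu = 1; sigma = 1; n = 20; T = 20000;
out = 0;                                   % number of intervals that miss mu
for t = 1:T
    x = sigma*randn(1,n) + mu;             % fresh sample from No(mu, sigma^2)
    xbar = mean(x);
    half = 1.96*sigma/sqrt(n);
    if mu < xbar - half || mu > xbar + half
        out = out + 1;                     % mu fell outside this interval
    end
end
val = out/T;                               % should be close to .05 for large T
fprintf('fraction of intervals missing mu: %.4f\n', val);
```

Because the samples are random, val fluctuates from run to run and only settles near .05 as T grows.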





Meaning of a CI

The CI is a statement not about the estimate that you performed but about what would happen if you repeated the same estimation procedure again and again.





Example: n = 20 T = 5

[Figures: two plots of simulated 95% confidence intervals, interval endpoints on the horizontal axis and one row per replication.]





Example: n = 20 T = 500

[Figures: two plots of simulated 95% confidence intervals, interval endpoints on the horizontal axis and one row per replication.]





Example: n = 200 T = 50

[Figure: plot of simulated 95% confidence intervals, interval endpoints on the horizontal axis and one row per replication.]





Example: n = 200 T = 500

[Figure: plot of simulated 95% confidence intervals, interval endpoints on the horizontal axis and one row per replication.]





Code

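T = 500;                       % number of repeated experiments
n = 200;                       % sample size per experiment
l = zeros(1,T);                % lower endpoints
u = zeros(1,T);                % upper endpoints
for i = 1:T
    x = randn(1,n) + 1;        % sample from No(1,1), so mu = 1 and sigma = 1
    m = mean(x);
    l(1,i) = m - 1.96/sqrt(n);
    u(1,i) = m + 1.96/sqrt(n);
end
yv = (1:T)*.1;                 % vertical position of each interval
plot(l,yv,'b*');               % lower endpoints
hold on;
plot(u,yv,'r*');               % upper endpoints
plot(ones(1,T),yv,'g+');       % true mean mu = 1
hold off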





Levels of confidence

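We can define any $100(1-\alpha)\%$ CI, not just a 95% CI.

This is done by replacing 1.96 with $z_{\alpha/2}$, since

$$\Pr\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1 - \alpha.$$

Definition

A $100(1-\alpha)\%$ CI of $\mu$ for a normal population with known $\sigma$ is

$$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right),$$

or as $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$.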
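To see how the critical value changes with the confidence level, the MATLAB sketch below computes $z_{\alpha/2}$ for a few values of alpha. It relies on the identity $z_p = -\sqrt{2}\,\mathrm{erfcinv}(2p)$ for the standard normal quantile, which avoids any toolbox dependency; the particular alpha values are chosen only for illustration.

```matlab
% Minimal sketch: critical values z_{alpha/2} for several confidence levels.
% Uses the identity z_p = -sqrt(2)*erfcinv(2*p) for the standard normal
% quantile, so no toolbox is needed. The alpha values are illustrative.
alpha = [0.10 0.05 0.01];
z = -sqrt(2)*erfcinv(2*(1 - alpha/2));     % upper alpha/2 critical values
for k = 1:numel(alpha)
    fprintf('%g%% CI: z = %.3f\n', 100*(1 - alpha(k)), z(k));
end
```

The values are approximately 1.645, 1.960 and 2.576, so asking for more confidence widens the interval.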





Using the CLT

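If $X_1, \dots, X_n$ are drawn i.i.d. from a distribution with mean $\mu$ and variance $\sigma^2$ and $n$ is large, then the CLT holds and

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

is approximately distributed as $\mathrm{No}(0, 1)$, so

$$\Pr\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) \approx 1 - \alpha.$$

We almost never know $\sigma$, so we replace it with the sample standard deviation $S = \sqrt{\sum_i (X_i - \bar{X})^2/(n-1)}$ and use

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}.$$

Now pretend you are in the normal setting.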





Formal definition

Definition

For $n$ big enough ($n > 40$),

$$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$$

is the large sample confidence interval for $\mu$ with confidence level approximately $100(1-\alpha)\%$. This holds as long as the CLT is approximately true.
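The large sample interval does not require normal data. The MATLAB sketch below applies it to exponential draws; the population (exponential with mean 2) and the sample size n = 100 are assumptions made only for this example.

```matlab
% Minimal sketch: large-sample 95% CI for the mean of non-normal data.
% The exponential population (mean 2) and n = 100 are illustrative assumptions.
n = 100; mu = 2;
x = -mu*log(rand(1,n));          % exponential draws with mean mu (inverse CDF)
xbar = mean(x);
s = std(x);                      % sample standard deviation (divides by n-1)
half = 1.96*s/sqrt(n);           % z_{.025} = 1.96
ci = [xbar - half, xbar + half];
fprintf('xbar = %.3f, s = %.3f, approx. 95%% CI = (%.3f, %.3f)\n', ...
        xbar, s, ci(1), ci(2));
```

Here both the center and the width of the interval vary from sample to sample, since s is estimated from the data.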





Application 1

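Suppose we have an estimator $\hat{\theta}$ that is

1 normally distributed

2 approximately unbiased

3 such that its standard deviation $\sigma_{\hat{\theta}}$ is available.

The following is true

$$\Pr\left(-z_{\alpha/2} < \frac{\hat{\theta} - \theta}{\sigma_{\hat{\theta}}} < z_{\alpha/2}\right) \approx 1 - \alpha$$

and

$$\hat{\theta} \pm z_{\alpha/2}\,\sigma_{\hat{\theta}}$$

is the large sample confidence interval for $\theta$ with confidence level approximately $100(1-\alpha)\%$. For the sample mean, $\sigma_{\hat{\theta}}$ is estimated by $s/\sqrt{n}$.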





Application 2: Binomial

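Given $X \sim \mathrm{Bin}(n, p)$ and $\min(np, n(1-p)) \ge 10$, the CLT allows for the normal approximation of $\hat{p} = X/n$, with

$$\sigma_{\hat{p}} = \sqrt{p(1-p)/n}.$$

So

$$\Pr\left(-z_{\alpha/2} < \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$

and we need to solve the above for $p$ so we can put $p$ in the middle.

A good approximation for large $n$ is

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},$$

which is the large sample confidence interval for $p$ with confidence level approximately $100(1-\alpha)\%$.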





Binomial with more pain

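Instead of the approximation

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},$$

we can try to solve the following for $p$:

$$\Pr\left(-z_{\alpha/2} < \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} < z_{\alpha/2}\right) \approx 1 - \alpha,$$

so

$$p = \frac{\hat{p} + \frac{z_{\alpha/2}^2}{2n} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z_{\alpha/2}^2}{4n^2}}}{1 + \frac{z_{\alpha/2}^2}{n}}$$

and

$$\ell = \frac{\hat{p} + \frac{z_{\alpha/2}^2}{2n} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z_{\alpha/2}^2}{4n^2}}}{1 + \frac{z_{\alpha/2}^2}{n}}, \qquad u = \frac{\hat{p} + \frac{z_{\alpha/2}^2}{2n} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z_{\alpha/2}^2}{4n^2}}}{1 + \frac{z_{\alpha/2}^2}{n}}.$$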
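To compare the simple approximation with the interval obtained by solving for $p$, the MATLAB sketch below evaluates both on a made-up data set (16 successes in 48 trials, chosen only for illustration). The variable names are hypothetical.

```matlab
% Minimal sketch: the two binomial intervals side by side.
% x = 16 successes in n = 48 trials is a made-up data set for illustration.
x = 16; n = 48; z = 1.96;                 % z_{.025} for a 95% interval
phat = x/n;
% Simple large-sample interval: phat -/+ z*sqrt(phat*(1-phat)/n)
half = z*sqrt(phat*(1 - phat)/n);
simple = [phat - half, phat + half];
% Interval obtained by solving the inequality for p
center = phat + z^2/(2*n);
spread = z*sqrt(phat*(1 - phat)/n + z^2/(4*n^2));
denom  = 1 + z^2/n;
solved = [(center - spread)/denom, (center + spread)/denom];
fprintf('simple: (%.3f, %.3f)\n', simple(1), simple(2));
fprintf('solved: (%.3f, %.3f)\n', solved(1), solved(2));
```

The solved interval is not centered at $\hat{p}$: its midpoint is pulled toward 1/2, and its endpoints always stay inside (0, 1), which the simple interval does not guarantee.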




The t distribution

Theorem

If $\bar{X}$ and $S$ are the mean and standard deviation of a random sample of size $n$ drawn from a normal distribution with mean $\mu$, then

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

is distributed as a t distribution with $\nu = n - 1$ degrees of freedom.





Student: William Sealy Gosset





t distribution ν = 2

[Figures: two panels plotting the density p(x) against x, each comparing a t distribution with the normal (legend: t dist, normal).]





Properties

Let $t_\nu$ denote the t density with $\nu$ degrees of freedom.

1 $t_\nu$ is centered at zero and bell shaped

2 $t_\nu$ has heavier tails than the normal

3 as $\nu$ increases $t_\nu$ has less spread

4 $\lim_{\nu \to \infty} t_\nu \overset{dist}{=} \mathrm{No}(0, 1)$, that is, as $\nu$ increases $t_\nu$ approaches the standard normal.





tα,ν notation

Definition

The notation $t_{\alpha,\nu}$ denotes the value such that for a t distribution with $\nu$ degrees of freedom

$$\Pr(T \ge t_{\alpha,\nu}) = \alpha$$

or

$$\Pr(T < t_{\alpha,\nu}) = 1 - \alpha.$$





Confidence intervals for Normal rvs

Definition

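Let $\bar{x}$ and $s$ be the sample mean and sample standard deviation of a sample of size $n$ from a normal population with mean $\mu$. The $100(1-\alpha)\%$ confidence interval for $\mu$ is

$$\bar{x} \pm t_{\alpha/2,\nu}\frac{s}{\sqrt{n}}, \qquad \nu = n - 1.$$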
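The effect of using the t critical value instead of the normal one shows up in simulation when n is small. The MATLAB sketch below is one way to check this, assuming n = 10, mu = 0 and sigma = 1; the critical value $t_{.025,9} = 2.262$ is taken from a standard t table rather than computed.

```matlab
% Minimal sketch: coverage of the t interval versus the z interval
% when sigma is estimated by s and n is small.
% n = 10, mu = 0, sigma = 1 and T are illustrative assumptions.
% t_{.025,9} = 2.262 is taken from a standard t table.
n = 10; mu = 0; sigma = 1; T = 20000;
tcrit = 2.262; zcrit = 1.96;
hit_t = 0; hit_z = 0;
for k = 1:T
    x = sigma*randn(1,n) + mu;
    xbar = mean(x);
    se = std(x)/sqrt(n);                       % estimated standard error
    hit_t = hit_t + (abs(xbar - mu) < tcrit*se);
    hit_z = hit_z + (abs(xbar - mu) < zcrit*se);
end
fprintf('coverage with t: %.3f, with z: %.3f\n', hit_t/T, hit_z/T);
```

The t interval should cover mu in about 95% of the replications, while the z interval, being narrower, covers it less often.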





Confidence intervals for the variance

Definition

Let $X_1, \dots, X_n \overset{iid}{\sim} \mathrm{No}(\mu, \sigma^2)$. Then the random variable

$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum_i (X_i - \bar{X})^2}{\sigma^2}$$

has a chi-squared distribution, $\chi^2_\nu$, with $\nu = n - 1$ degrees of freedom.





χ2 distribution ν = 10

[Figures: a sequence of plots of the χ² density p(x) against x, for x from 0 to 200.]





Critical values for χ2

The $\chi^2_\nu$ distribution is not symmetric in general. We denote by $\chi^2_{\alpha,\nu}$ the value such that $100\alpha\%$ of the area lies to the right of it.





Confidence interval of the variance

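If $X_1, \dots, X_n \overset{iid}{\sim} \mathrm{No}(\mu, \sigma^2)$, then we know that

$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}.$$

This means that

$$\Pr\left(\chi^2_{1-\alpha/2,n-1} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{\alpha/2,n-1}\right) = 1 - \alpha.$$

$$\Pr\left(\frac{1}{\chi^2_{1-\alpha/2,n-1}} > \frac{\sigma^2}{(n-1)S^2} > \frac{1}{\chi^2_{\alpha/2,n-1}}\right) = 1 - \alpha.$$

$$\Pr\left(\frac{(n-1)S^2}{\chi^2_{1-\alpha/2,n-1}} > \sigma^2 > \frac{(n-1)S^2}{\chi^2_{\alpha/2,n-1}}\right) = 1 - \alpha.$$

$$\Pr\left(\frac{(n-1)S^2}{\chi^2_{\alpha/2,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,n-1}}\right) = 1 - \alpha.$$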





Formal definition

Definition

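Given $x_1, \dots, x_n \overset{iid}{\sim} \mathrm{No}(\mu, \sigma^2)$, the $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is

$$\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}},\; \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}}\right).$$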
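The variance interval can be checked by simulation in the same way as the interval for the mean. The MATLAB sketch below assumes n = 10 and sigma = 2, and uses the critical values $\chi^2_{.025,9} = 19.023$ and $\chi^2_{.975,9} = 2.700$ from a standard chi-squared table rather than computing them.

```matlab
% Minimal sketch: 95% CI for sigma^2 and its coverage under repeated sampling.
% n = 10, sigma = 2 and T are illustrative assumptions.
% Critical values for nu = 9 are taken from a standard chi-squared table.
n = 10; sigma = 2; T = 5000;
chi_hi = 19.023;                 % chi^2_{.025,9}
chi_lo = 2.700;                  % chi^2_{.975,9}
hits = 0;
for k = 1:T
    x = sigma*randn(1,n);        % sample from No(0, sigma^2)
    s2 = var(x);                 % sample variance S^2 (divides by n-1)
    lo = (n - 1)*s2/chi_hi;      % lower endpoint
    hi = (n - 1)*s2/chi_lo;      % upper endpoint
    hits = hits + (lo < sigma^2 && sigma^2 < hi);
end
fprintf('last interval: (%.3f, %.3f), coverage: %.3f\n', lo, hi, hits/T);
```

Note that the larger critical value goes in the denominator of the lower endpoint, and that the interval is not symmetric around $s^2$.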





Confidence bounds

Sometimes we only care about bounding the uncertainty from above or below. In this case we use confidence bounds. We illustrate this for the normal distribution with known variance.





Normal distribution known variance

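If $X_1, \dots, X_n \overset{iid}{\sim} \mathrm{No}(\mu, \sigma^2)$ with $\sigma$ known, then we know that

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim \mathrm{No}(0, 1).$$

This means that

$$\Pr\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} > -z_{\alpha}\right) = 1 - \alpha.$$

$$\Pr\left(\mu < \bar{X} + z_{\alpha}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha.$$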





Formal definition

Definition

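Given $x_1, \dots, x_n \overset{iid}{\sim} \mathrm{No}(\mu, \sigma^2)$, the $100(1-\alpha)\%$ confidence bounds for $\mu$ are

$$\mu < \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}} \quad \text{(upper bound)}$$

$$\mu > \bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}} \quad \text{(lower bound)}.$$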
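A short MATLAB sketch of the two bounds follows, using simulated data with mu = 1, sigma = 1 and n = 20 (all assumptions for illustration) and the table value $z_{.05} = 1.645$.

```matlab
% Minimal sketch: 95% upper and lower confidence bounds, known sigma.
% mu, sigma and n are illustrative assumptions; z_{.05} = 1.645.
mu = 1; sigma = 1; n = 20; z = 1.645;
x = sigma*randn(1,n) + mu;
xbar = mean(x);
upper = xbar + z*sigma/sqrt(n);   % mu < upper with 95% confidence
lower = xbar - z*sigma/sqrt(n);   % mu > lower with 95% confidence
fprintf('95%% upper bound: mu < %.3f\n', upper);
fprintf('95%% lower bound: mu > %.3f\n', lower);
```

Each bound is a separate 95% statement. Taken together as a two-sided interval, (lower, upper) has confidence level 90%, because each tail leaves out 5%.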





Precision and reliability

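The idea behind a confidence interval is to relate the trade-off between precision, the width of the confidence interval, and reliability, the confidence level or $\alpha$. In the normal case with known variance the width

$$w = 2 z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

and $\alpha$ are inversely related: a smaller $\alpha$ (higher confidence) gives a wider interval.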





Sample size requirements

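A very common problem is to find the smallest sample size $n$ such that a particular level of reliability and precision is satisfied, or given $w$ and $\alpha$ find the smallest $n$ such that

$$w = 2 z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

or

$$n = \left(\frac{2 z_{\alpha/2}\sigma}{w}\right)^2,$$

rounded up to the next integer.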
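The sample size formula is easy to evaluate. The MATLAB sketch below uses made-up inputs (sigma = 25, desired width w = 10, alpha = .05) and computes $z_{\alpha/2}$ through the identity $z_p = -\sqrt{2}\,\mathrm{erfcinv}(2p)$ so that no toolbox is assumed.

```matlab
% Minimal sketch: smallest n giving a 95% CI of width at most w, known sigma.
% sigma = 25 and w = 10 are made-up inputs for illustration.
sigma = 25; w = 10; alpha = 0.05;
z = -sqrt(2)*erfcinv(2*(1 - alpha/2));   % z_{alpha/2}, about 1.96
n = ceil((2*z*sigma/w)^2);               % round up to the next integer
fprintf('required sample size: n = %d\n', n);
```

With these inputs the formula gives about 96.04, so n = 97. Halving the target width w multiplies the required sample size by four.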

