confidence intervals - kasetsart university
![Page 1: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/1.jpg)
Confidence Intervals
Assoc. Prof. Anan Phonphoem, Ph.D.
Intelligent Wireless Network Group (IWING Lab)
http://iwing.cpe.ku.ac.th
Computer Engineering Department
Kasetsart University, Bangkok, Thailand
![Page 2: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/2.jpg)
Confidence Interval
A sample of independent observations x1, x2, …, xn is drawn from an unknown distribution
We want to estimate the population mean μ
How do we know which values of μ are plausible (believable)?
◦ Use a point estimate of μ together with its estimated standard error (s.e.)
Confidence Interval
◦ An interval that covers the parameter (e.g. μ) with a stated degree of confidence (the confidence level)
◦ Ex. We are 95% confident that μ lies in this range
![Page 3: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/3.jpg)
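Chi-Square (\( \chi^2 \)) Distribution
If X is standard normal, \( X \sim N(0,1) \),
◦ \( Y = X^2 \) is Chi-Square with degrees of freedom (d.f.) equal to 1
For independent X1, X2, …, Xn with \( X_i \sim N(0,1) \),
\( Y = \sum_{i=1}^{n} X_i^2 \) is Chi-Square with d.f. = n
For independent X1, X2, …, Xn with \( X_i \sim N(\mu, \sigma^2) \),
\( Y = \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 \) is Chi-Square with d.f. = n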
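The definition above can be checked by simulation. The following is a minimal sketch, assuming a hypothetical d.f. of 5 and an arbitrary random seed; it sums squared standard normal draws and compares the sample mean and variance with the known chi-square values (mean n, variance 2n).

```python
import random
import statistics

def chi_square_draw(df, rng):
    """Return one draw of Y = X_1^2 + ... + X_df^2 with each X_i ~ N(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

rng = random.Random(42)  # fixed seed (an arbitrary choice) for reproducibility
df = 5                   # hypothetical degrees of freedom, for illustration only
draws = [chi_square_draw(df, rng) for _ in range(20000)]

# A chi-square variable with d.f. = n has mean n and variance 2n.
sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)
print(f"mean ~ {sample_mean:.2f} (theory {df}), variance ~ {sample_var:.2f} (theory {2 * df})")
```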
![Page 4: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/4.jpg)
Degree of Freedom (d.f)
The number of data values that do not depend on the others
Example:
◦ Select 3 random numbers
d.f. = 3
◦ Select 3 random numbers whose sum = 10
Any 2 numbers can be selected; the last one is determined by the condition
E.g. Select 2 and 5, the last must be 3
d.f. = 2
◦ Select 3 random numbers whose sum of squares = 54
Any 1 number can be selected; the last two are determined by the condition
E.g. Select 7, the rest must be 1 and 2
d.f. = 1
![Page 5: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/5.jpg)
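t-Distribution
If X is standard normal, \( X \sim N(0,1) \), Y is Chi-Square with d.f. = f, and X and Y are independent, the variable
\( t = \frac{X}{\sqrt{Y/f}} \)
is t-Distribution with d.f. = f
For n random samples from \( X \sim N(\mu, \sigma^2) \), we can calculate \( \bar{x} \) and \( s^2 \)
And we know the distributions
\( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \) and \( \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{(n-1)} \)
Therefore, \( \frac{\bar{X} - \mu}{s/\sqrt{n}} \) is t-Distribution with d.f. = (n-1):
\( t = \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{\sqrt{\frac{(n-1)s^2}{\sigma^2} \big/ (n-1)}} = \frac{\bar{X} - \mu}{s/\sqrt{n}} \)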
![Page 6: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/6.jpg)
Confidence Interval of μ, known σ
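n samples are random with unknown μ, known σ (not practical for study)
We can find \( \bar{x} \) and use it to find the Confidence Interval of μ
From \( X \sim N(\mu, \sigma^2) \), \( \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \), and then
\( Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \)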
![Page 7: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/7.jpg)
Confidence Interval of μ, known σ
From the standard normal distribution, suppose we want to estimate μ with 95% (0.95) confidence
[Figure: standard normal curve with central area 0.95 between z = -1.96 and z = +1.96, and area 0.025 in each tail]
z is between ±1.96
Therefore,
P(-1.96 < Z < 1.96) = 0.95
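The 1.96 cut-off can be verified numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability mass between -1.96 and +1.96 under the standard normal curve.
central_area = std_normal.cdf(1.96) - std_normal.cdf(-1.96)

# z value that leaves 0.025 in the upper tail (cumulative probability 0.975).
z_upper = std_normal.inv_cdf(0.975)

print(f"P(-1.96 < Z < 1.96) = {central_area:.4f}")
print(f"z with upper-tail area 0.025 = {z_upper:.3f}")
```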
![Page 8: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/8.jpg)
Confidence Interval of μ, known σ
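\( P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95 \)
\( -1.96\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96\frac{\sigma}{\sqrt{n}} \)
\( -\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < -\mu < -\bar{X} + 1.96\frac{\sigma}{\sqrt{n}} \)
\( \bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}} \)
Therefore,
\( P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95 \)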
![Page 9: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/9.jpg)
Confidence Interval of μ, known σ
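Confidence Level 95% of μ is \( \bar{X} \pm 1.96\frac{\sigma}{\sqrt{n}} \)
[Figure: the interval from \( \bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \) to \( \bar{X} + 1.96\frac{\sigma}{\sqrt{n}} \), centered at \( \bar{X} \), labeled "Confidence Level 95%"]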
![Page 10: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/10.jpg)
Adjust the Interval
We can select any confidence level
A high confidence level gives a wide confidence interval
(not a precise estimate)
To decrease the confidence interval length
◦ Increase the number of samples (high cost)
◦ Decrease the confidence level
Popular confidence levels
◦ 90%, 95%, and 99%
For a non-normal distribution
◦ Use the central limit theorem
◦ n > 30 is OK for estimation
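The trade-off between confidence level, sample size, and interval length can be tabulated. The sketch below assumes a hypothetical known σ = 10 and arbitrary sample sizes; `half_width` is an illustrative helper, not part of the slides.

```python
import math
from statistics import NormalDist

def half_width(sigma, n, confidence):
    """Half-width z * sigma / sqrt(n) of a known-sigma interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * sigma / math.sqrt(n)

sigma = 10.0  # hypothetical population standard deviation
for confidence in (0.90, 0.95, 0.99):
    widths = [half_width(sigma, n, confidence) for n in (25, 100, 400)]
    print(confidence, [round(w, 2) for w in widths])
```

Each row widens as the confidence level rises, and quadrupling n halves the half-width, which is why a narrower interval at the same confidence level costs so many more samples.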
![Page 11: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/11.jpg)
Example 1
Cylinder measurements follow a normal distribution
Unknown μ, and known σ = 2.009 cm
Randomly select 12 sample cylinders
Calculate \( \bar{x} = 12.91 \) cm
Find the confidence interval of the population mean of the cylinders from this factory with 95% confidence level
Solution
Let μ = population mean
confidence interval = \( \bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} = 12.91 \pm 1.96 \times \frac{2.009}{\sqrt{12}} = 12.91 \pm 1.14 \)
= 11.77 to 14.05
![Page 12: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/12.jpg)
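A more convenient notation
\( z_p \) denotes the z value for which the area under the standard normal curve between 0 and z is p
\( z_{0.475} = 1.96 \) corresponds to the central 95% of the normal distribution
\( z_{0.495} = 2.58 \) corresponds to the central 99% of the normal distribution
Let α be the difference between 1.00 and the required confidence level
(e.g. for 95%, α = 0.05)
Confidence interval (1-α) 100% of μ = \( \bar{X} \pm z_{(1-\alpha)/2} \frac{\sigma}{\sqrt{n}} \)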
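The general formula can be sketched as a small function. This is an illustration only: `z_interval` is a hypothetical helper that reads the z value for a (1-α)100% interval from the standard normal distribution (cumulative probability 0.5 + (1-α)/2), and it is applied here to the cylinder values from Example 1.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Known-sigma confidence interval for the mean: xbar +/- z * sigma / sqrt(n)."""
    alpha = 1.0 - confidence
    # An area of (1 - alpha)/2 between 0 and z means a cumulative probability of 0.5 + (1 - alpha)/2.
    z = NormalDist().inv_cdf(0.5 + (1.0 - alpha) / 2.0)
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# Example 1 values: xbar = 12.91 cm, sigma = 2.009 cm, n = 12
low, high = z_interval(12.91, 2.009, 12, confidence=0.95)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

Passing `confidence=0.99` uses the unrounded z value (about 2.576) rather than the tabulated 2.58, so the result can differ from a hand calculation in the last digit.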
![Page 13: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/13.jpg)
Example 1
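For a higher confidence level = 99%
α = 0.01, α/2 = 0.005, \( z_{0.495} = 2.58 \)
confidence interval = \( \bar{x} \pm z_{0.495}\frac{\sigma}{\sqrt{n}} = \bar{x} \pm 2.58\frac{\sigma}{\sqrt{n}} = 12.91 \pm 2.58 \times \frac{2.009}{\sqrt{12}} \approx 12.91 \pm 1.4964 \)
= 11.41 to 14.41
Which is wider than the 95% confidence level interval, \( 12.91 \pm 1.14 \) (11.77 to 14.05)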
![Page 14: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/14.jpg)
Example 2
Study the salary of employees
σ² = 10,000 Baht²
Randomly ask 100 employees
Calculate \( \bar{x} \) = 2,500 Baht
Find the confidence interval of the population mean salary for this company with 95% confidence level
Solution: σ² = 10,000, so σ = 100; \( \bar{x} \) = 2,500; n = 100
From the central limit theorem, the 95% confidence interval of μ is
\( \bar{x} \pm z_{0.475}\frac{\sigma}{\sqrt{n}} = \bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} = 2500 \pm 1.96 \times \frac{100}{\sqrt{100}} = 2500 \pm 19.60 \)
= 2,480.40 to 2,519.60
![Page 15: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/15.jpg)
Example 3: National Discount, Inc.
National Discount has 260 retail outlets throughout
the United States. National evaluates each
potential location for a new retail outlet in part on
the mean annual income of the individuals in the
marketing area of the new location.
Sampling can be used to develop an interval
estimate of the mean annual income for individuals
in a potential marketing area for National Discount.
A sample of size n = 36 was taken.
◦ The sample mean, \( \bar{x} \), is $21,100
◦ The sample standard deviation, s, is $4,500
We will use .95 as the confidence coefficient in our
interval estimate.
![Page 16: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/16.jpg)
Example 3: National Discount, Inc.
Precision Statement
There is a .95 probability that the value of a
sample mean for National Discount will
provide a sampling error of $1,470 or less.
Determined as follows:
◦ 95% of the sample means that can be observed
are within \( \pm 1.96\,\sigma_{\bar{x}} \) of the population mean μ.
◦ If \( \sigma_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{4{,}500}{\sqrt{36}} = 750 \), then \( 1.96\,\sigma_{\bar{x}} = 1{,}470 \)
![Page 17: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/17.jpg)
Example 3: National Discount, Inc.
Interval Estimate of the Population Mean:
σ Unknown
Interval Estimate of μ is:
$21,100 ± $1,470
or $19,630 to $22,570
We are 95% confident that the interval
contains the population mean.
![Page 18: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/18.jpg)
Many Intervals
For a random data set
◦ Many intervals (a, b) can be created
◦ The interval depends on the confidence level
When 100 confidence intervals are constructed
◦ About (1-α)*100 of the 100 intervals will cover the parameter (e.g. μ)
![Page 19: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/19.jpg)
Example 3
From a normal distribution with μ = 10, σ² = 9
Generate 16 data sets (8 samples per set)
For each set
create the 95% confidence interval around \( \bar{x} \)
![Page 20: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/20.jpg)
Example 3
[Figure: the 16 intervals \( \bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} \) drawn on an axis from 5 to 15]
Set 1: \( \bar{x} = 10.295 \)
With 95% confidence level, (8.216, 12.374) covers μ = 10
Set 16: \( \bar{x} = 12.703 \)
With 95% confidence level, (10.624, 14.781) does not cover μ = 10
Therefore, for 16 sets
not cover: 1/16 = 0.0625 = 6.25%
cover: 15/16 = 0.9375 = 93.75%
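The same experiment can be repeated with many more sets to see the coverage settle near 95%. The sketch below is an assumption-laden illustration: it takes σ = 3 (variance 9), which reproduces the half-width of about 2.079 seen in the intervals above, and uses an arbitrary seed and 10,000 sets instead of 16.

```python
import math
import random

def coverage_rate(mu, sigma, n, num_sets, rng, z=1.96):
    """Fraction of intervals xbar +/- z * sigma / sqrt(n) that contain mu."""
    margin = z * sigma / math.sqrt(n)
    covered = 0
    for _ in range(num_sets):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        if xbar - margin < mu < xbar + margin:
            covered += 1
    return covered / num_sets

rng = random.Random(2024)  # arbitrary seed, chosen only for reproducibility
rate = coverage_rate(mu=10.0, sigma=3.0, n=8, num_sets=10000, rng=rng)
print(f"coverage over 10000 sets: {rate:.3f}")
```

With only 16 sets the observed share (93.75% above) can only move in steps of 6.25%, so it need not equal 95% exactly; the long-run share is what the confidence level describes.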
![Page 21: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/21.jpg)
Interval Estimation of a Population Mean:
Small-Sample Case (n < 30)
Population is Not Normally Distributed
◦ The only option is to increase the sample size to n > 30 and use the large-sample interval-estimation procedures.
Population is Normally Distributed and σ is Known
◦ The large-sample interval-estimation procedure can be used.
Population is Normally Distributed and σ is Unknown
◦ The appropriate interval estimate is based on a probability distribution known as the t distribution.
![Page 22: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/22.jpg)
t Distribution
The t distribution is a family of similar probability
distributions.
A specific t distribution depends on a parameter
known as the degrees of freedom.
As the number of degrees of freedom increases,
the difference between the t distribution and the
standard normal probability distribution becomes
smaller and smaller.
A t distribution with more degrees of freedom has
less dispersion.
The mean of the t distribution is zero.
![Page 23: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/23.jpg)
Interval Estimation of a Population Mean:
Small-Sample Case (n < 30) with σ Unknown
Interval Estimate
\( \bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} \)
where
1 - α = the confidence coefficient
\( t_{\alpha/2} \) = the t value providing an area of α/2
in the upper tail of a t distribution
with n - 1 degrees of freedom
s = the sample standard deviation
![Page 24: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/24.jpg)
Example 4: Apartment Rents
Interval Estimation of a Population Mean:
Small-Sample Case (n < 30) with σ Unknown
A reporter for a student newspaper is writing an article on the
cost of off-campus housing.
A sample of 10 one-bedroom units within a half-mile of
campus resulted in a sample mean of $550 per month and a
sample standard deviation of $60.
Let us provide a 95% confidence interval estimate of the
mean rent per month for the population of one-bedroom units
within a half-mile of campus.
Assume this population to be normally distributed.
![Page 25: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/25.jpg)
Example 4: Apartment Rents
t Value
At 95% confidence, 1 - α = .95, α = .05,
and α /2 = .025.
t.025 is based on n - 1 = 10 - 1 = 9 degrees of
freedom.
In the t distribution table we see that t.025 = 2.262.
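t distribution table (column headings give the Area in Upper Tail):

| Degrees of Freedom | .10 | .05 | .025 | .01 | .005 |
|---|---|---|---|---|---|
| … | … | … | … | … | … |
| 7 | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 |
| 8 | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 |
| 9 | 1.383 | 1.833 | 2.262 | 2.821 | 3.250 |
| 10 | 1.372 | 1.812 | 2.228 | 2.764 | 3.169 |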
![Page 26: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/26.jpg)
Example 4: Apartment Rents
Interval Estimation of a Population Mean:
Small-Sample Case (n < 30) with σ Unknown
\( \bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} = 550 \pm 2.262 \times \frac{60}{\sqrt{10}} \)
= 550 ± 42.92
or $507.08 to $592.92
We are 95% confident that the mean rent per month for the
population of one-bedroom units within a half-mile of campus
is between $507.08 and $592.92.
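The arithmetic of this example can be reproduced with a short function. This is a minimal sketch: `t_interval` is a hypothetical helper, and the t value is passed in as read from a t table rather than computed.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Small-sample interval xbar +/- t * s / sqrt(n); t_crit is read from a t table."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Apartment-rent values: n = 10, xbar = 550, s = 60, t.025 with 9 d.f. = 2.262
low, high = t_interval(550.0, 60.0, 10, t_crit=2.262)
print(f"{low:.2f} to {high:.2f}")
```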
![Page 27: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/27.jpg)
Sample Size for an Interval Estimate
of a Population Mean
Let E = the maximum sampling error mentioned
in the precision statement.
E is the amount added to and subtracted from the
point estimate to obtain an interval estimate.
E is often referred to as the margin of error.
We have
\( E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \)
Solving for n we have
\( n = \frac{(z_{\alpha/2})^2 \sigma^2}{E^2} \)
![Page 28: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/28.jpg)
Example 5: National Discount, Inc.
Sample Size for an Interval Estimate of a
Population Mean
Suppose that National’s management team
wants an estimate of the population
mean such that there is a 0.95 probability
that the sampling error is $500 or less.
How large a sample size is needed to meet
the required precision?
![Page 29: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/29.jpg)
Example 5: National Discount, Inc.
Sample Size for Interval Estimate of a
Population Mean
At 95% confidence, z.025 = 1.96.
Recall that σ = 4,500
\( z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 500 \)
Solving for n we have
\( n = \frac{(1.96)^2 (4500)^2}{(500)^2} = 311.17 \)
We need to sample 312 to reach a desired
precision of $500 at 95% confidence.
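The sample-size calculation can be written as a one-line function. In this sketch `sample_size_for_mean` is an illustrative name; it rounds up because a fractional observation cannot be sampled and rounding down would miss the required precision.

```python
import math

def sample_size_for_mean(z, sigma, margin_of_error):
    """Smallest integer n with z * sigma / sqrt(n) <= margin_of_error."""
    return math.ceil((z * sigma / margin_of_error) ** 2)

# National Discount values: z.025 = 1.96, sigma = 4,500, E = 500
n_required = sample_size_for_mean(z=1.96, sigma=4500.0, margin_of_error=500.0)
print(n_required)
```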
![Page 30: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/30.jpg)
Interval Estimation
of a Population Proportion
Interval Estimate
\( \bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \)
where: 1 - α is the confidence coefficient
\( z_{\alpha/2} \) is the z value providing an area of
α/2 in the upper tail of the standard
normal probability distribution
\( \bar{p} \) is the sample proportion
![Page 31: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/31.jpg)
Example 6: Political Science, Inc.
Interval Estimation of a Population Proportion
Political Science, Inc. (PSI) specializes in voter
polls and surveys designed to keep political office
seekers informed of their position in a race. Using
telephone surveys, interviewers ask registered
voters who they would vote for if the election were
held that day.
In a recent election campaign, PSI found that
◦ 220 registered voters, out of 500 contacted, favored a
particular candidate.
◦ PSI wants to develop a 95% confidence interval estimate
for the proportion of the population of registered voters that
favors the candidate.
![Page 32: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/32.jpg)
Example 6: Political Science, Inc.
Interval Estimate of a Population Proportion
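\( \bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \)
where: n = 500, \( \bar{p} \) = 220/500 = .44, \( z_{\alpha/2} \) = 1.96
\( .44 \pm 1.96 \sqrt{\frac{.44(1 - .44)}{500}} \) = .44 ± .0435
PSI is 95% confident that the proportion of all voters that favors
the candidate is between .3965 and .4835.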
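The proportion interval follows the same pattern as the interval for a mean. Below is a minimal sketch with an illustrative helper name, using the poll counts from this example.

```python
import math

def proportion_interval(successes, n, z=1.96):
    """Interval p +/- z * sqrt(p * (1 - p) / n) for a sample proportion p = successes / n."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# PSI poll: 220 of 500 registered voters favor the candidate
low, high = proportion_interval(220, 500)
print(f"{low:.4f} to {high:.4f}")
```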
![Page 33: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/33.jpg)
Sample Size for an Interval Estimate
of a Population Proportion
Let E = the maximum sampling error mentioned
in the precision statement.
We have
\( E = z_{\alpha/2} \sqrt{\frac{p(1 - p)}{n}} \)
Solving for n we have
\( n = \frac{(z_{\alpha/2})^2\, p(1 - p)}{E^2} \)
![Page 34: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/34.jpg)
Example 7: Political Science, Inc.
Sample Size for an Interval Estimate of a
Population Proportion
Suppose that PSI would like a .99
probability that the sample proportion is
within .03 of the population proportion.
How large a sample size is needed to meet
the required precision?
![Page 35: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/35.jpg)
Example 7: Political Science, Inc.
Sample Size for Interval Estimate of a Population
Proportion
At 99% confidence, z.005 = 2.576.
\( n = \frac{(z_{\alpha/2})^2\, p(1 - p)}{E^2} = \frac{(2.576)^2 (.44)(.56)}{(.03)^2} \approx 1817 \)
Note:
We used 0.44 as the best estimate of p in the above expression.
If no information is available about p, then 0.5 is often
assumed because it provides the highest possible sample size.
If we had used p = 0.5, the recommended n would have been
1844 (1843.27 rounded up).
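Both planning values can be checked with a short function. This is a sketch with an illustrative helper name; it rounds up, as in the sample-size example for a mean, so the unrounded 1843.27 for p = 0.5 becomes 1844.

```python
import math

def sample_size_for_proportion(z, p, margin_of_error):
    """Smallest integer n with z * sqrt(p * (1 - p) / n) <= margin_of_error."""
    return math.ceil(z ** 2 * p * (1.0 - p) / margin_of_error ** 2)

n_planning = sample_size_for_proportion(2.576, 0.44, 0.03)      # planning value p = 0.44
n_conservative = sample_size_for_proportion(2.576, 0.50, 0.03)  # most conservative p = 0.5
print(n_planning, n_conservative)
```

Because p(1 - p) is largest at p = 0.5, that choice gives the largest required n for any given z and E.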
![Page 36: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/36.jpg)
Confidence Interval of μ, unknown σ
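n samples are random with unknown μ, unknown σ
For large n, \( \frac{\bar{X} - \mu}{s/\sqrt{n}} \) is approximately \( N(0,1) \)
Confidence Level 95% of μ is \( \bar{X} \pm 1.96\frac{s}{\sqrt{n}} \)
For n > 30, s is a good estimate of σ
For n < 30, s is not a good estimate of σ
For n < 30, Confidence Level 95% of μ is \( \bar{X} \pm t\frac{s}{\sqrt{n}} \)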
![Page 37: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/37.jpg)
tα for 95% confidence level

| Sample size (n) | tα |
|---|---|
| 4 | 3.182 |
| 8 | 2.365 |
| 12 | 2.201 |
| 20 | 2.093 |
| 25 | 2.064 |
| 30 | 2.045 |
| ∞ | 1.960 |

Note: t increases quickly as n becomes small
t changes slowly for large n
For n → ∞, t → z (the t distribution approaches the normal distribution)
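The t values for a 95% interval can be recomputed without a table. The sketch below is one possible approach, not the method used on the slides: it integrates the Student t density with Simpson's rule and finds the critical value by bisection, using only the Python standard library. The function names, step count, and search bounds are assumptions made for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1.0 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x] plus the 0.5 that lies below zero."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3.0

def t_critical(upper_tail, df):
    """Value t with P(T > t) = upper_tail (upper_tail < 0.5), found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2.0
        if 1.0 - t_cdf(mid, df) > upper_tail:
            lo = mid  # tail still too large, so the critical value is further out
        else:
            hi = mid
    return (lo + hi) / 2.0

# Two-sided 95% level: 0.025 in each tail, d.f. = n - 1
for n in (4, 8, 12, 20, 25, 30):
    print(n, round(t_critical(0.025, n - 1), 3))
```

Where a statistics library is available, its t quantile function would normally replace this hand-rolled integration.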
![Page 38: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/38.jpg)
Example 8
8 samples are randomly selected from a population
Unknown μ with \( \bar{x} = 7.91 \), s = 0.67
Find the 95% confidence interval of μ
Solution: α = 0.05, d.f. = n-1 = 8-1 = 7, \( t_{0.05} = 2.36 \)
\( \bar{x} \pm t_{0.05}\frac{s}{\sqrt{n}} \)
\( 7.91 - 2.36 \times \frac{0.67}{\sqrt{8}} < \mu < 7.91 + 2.36 \times \frac{0.67}{\sqrt{8}} \)
\( 7.35 < \mu < 8.47 \)
With 95% confidence, μ will be in the confidence interval (7.35, 8.47)
![Page 39: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/39.jpg)
Example 9
49 students are randomly selected from the 3rd IUP students
Unknown μ with average grade \( \bar{x} = 2.3 \), s = 0.7
Find the 95% confidence interval of μ
Solution: n = 49, unknown σ, so use s; n > 30, so use z (normal)
\( \bar{x} \pm z_{0.475}\frac{s}{\sqrt{n}} \)
\( 2.3 - 1.96 \times \frac{0.7}{\sqrt{49}} < \mu < 2.3 + 1.96 \times \frac{0.7}{\sqrt{49}} \)
\( 2.10 < \mu < 2.50 \)
With 95% confidence, μ will be in the confidence interval (2.1, 2.5)
![Page 40: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/40.jpg)
Estimating Population Total
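Sometimes we are interested in the population Total more than the population mean
\( \text{Total} = \sum_{i=1}^{N} X_i = N\bar{X} \); N = # population
Confidence Interval (1-α) 100% of Total is \( N\bar{X} \pm t_{(n-1)}\, N \frac{s}{\sqrt{n}} \)
With the fpc factor: \( N\bar{X} \pm t_{(n-1)}\, N \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \)
fpc factor: Finite population correction factor, \( \sqrt{\frac{N-n}{N-1}} \)
N >> n, factor ~ 1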
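The behaviour of the correction factor is easy to see numerically. The sketch below uses hypothetical population and sample sizes chosen only to show the trend; `fpc` is an illustrative helper name.

```python
import math

def fpc(population_size, sample_size):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

# Hypothetical (N, n) pairs: the factor moves toward 1 as N grows relative to n.
for N, n in ((200, 100), (5_000, 100), (1_000_000, 100)):
    print(N, n, round(fpc(N, n), 4))
```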
![Page 41: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/41.jpg)
Example 10 For a maize field, Area = 5,000 Rai
Randomly select 100 Rai
The average output is \( \bar{x} \) = 1,076 Tung/Rai, s = 273 Tung/Rai
Solution: the total output of the 5,000 Rai can be approximated with
\( N\bar{X} \pm t_{(n-1)}\, N \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \)
\( = 5{,}000(1{,}076) \pm (1.9842)(5{,}000)\frac{273}{\sqrt{100}}\sqrt{\frac{5{,}000-100}{5{,}000-1}} \)
\( \approx 5{,}380{,}000 \pm 268{,}134.86 \)
Total output of the maize field is in the interval (5,111,865.14, 5,648,134.86)
![Page 42: Confidence Intervals - Kasetsart University](https://reader033.vdocuments.mx/reader033/viewer/2022042311/625b9f27730c952354208d47/html5/thumbnails/42.jpg)
Reference
Basic Statistics for Social Developers (สถิติพื้นฐานสำหรับนักพัฒนาสังคม), วีนัส พีชวณิชย์, สมจิต วัฒนาชยากูล, and เบญจมาศ ตุลยนิติกุล, Faculty of Science and Technology, Thammasat University, January 2004 (B.E. 2547)