Chapter 8 : Estimation
8.1 Introduction
The field of statistical inference consists of those methods used to make decisions or to draw conclusions about a population. These methods use the information contained in a sample from the population to draw those conclusions.
Suppose I have a sample of 5 numbers and I take their average. The estimator is the rule "take the average of the sample"; it is the estimator of the mean.
If the average comes out as 4, then 4 is the estimate.
Estimator vs. Estimate
Estimator
• In statistics, the method (rule) used to compute a value from a sample
Estimate
• The value obtained by applying that method to a particular sample
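The distinction can be made concrete with a minimal Python sketch. The five sample values are hypothetical, chosen only so that their average is 4 as in the example above.

```python
from statistics import mean

# The estimator is the rule: "take the average of the sample".
estimator = mean

# Hypothetical sample of 5 numbers (illustrative values only).
sample = [2, 3, 4, 5, 6]

# The estimate is the number the rule produces for this particular sample.
estimate = estimator(sample)
print(estimate)
```

A different sample would give a different estimate, while the estimator (the rule) stays the same.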
Point Estimation
Point estimate: an estimate of the parameter using a single number.
◦ E.g.: $\bar{x}$ is the point estimate for $\mu$.
In choosing a point estimator, we depend on the properties of the estimator:
• Unbiased
• Consistent
• Efficient
• Sufficient
Unbiased
• If the expected value of the statistic is equal to the population parameter.
Example 8.1
• If a random sample of size n is taken from a population with mean $\mu$ and variance $\sigma^2$, then $\bar{x}$ is an unbiased estimator for $\mu$.
$$
\begin{aligned}
E(\bar{x}) &= E\left(\frac{x_1 + x_2 + \dots + x_n}{n}\right) \\
&= \frac{1}{n}\left[E(x_1) + E(x_2) + \dots + E(x_n)\right] \\
&= \frac{1}{n}\left[\mu + \mu + \dots + \mu\right] \\
&= \frac{1}{n}\left[\sum_{i=1}^{n} \mu\right] \\
&= \frac{1}{n}\left[n\mu\right] \\
&= \mu
\end{aligned}
$$
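Unbiasedness can also be checked empirically. The following is a rough simulation sketch, not part of the original slides; the population (normal with mean 10 and standard deviation 3), the sample size and the number of repetitions are all assumed values chosen for illustration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 10.0, 3.0  # assumed population parameters (illustrative)
N = 5                  # size of each sample
REPS = 20_000          # number of repeated samples

# Draw many samples of size N and record the sample mean of each one.
sample_means = [
    mean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(REPS)
]

# If the sample mean is unbiased, this average should be close to MU.
avg_of_means = mean(sample_means)
print(round(avg_of_means, 2))
```

Individual sample means scatter around 10, but their long-run average settles near the population mean, which is what $E(\bar{x}) = \mu$ states.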
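Consistent
• If it gets closer to the parameter value as the sample size increases.
• Consistent if its variance decreases as n increases:
$$
\lim_{n \to \infty} \operatorname{var}(\hat{\theta}) = 0
$$
Example 8.2
If a random sample of size n is taken from a population with mean $\mu$ and variance $\sigma^2$, then $\bar{x}$ (the sample mean) is a consistent estimator for $\mu$.
Because the observations in a random sample are independent, the variance of their sum is the sum of their variances:
$$
\begin{aligned}
V(\bar{x}) &= V\left(\frac{x_1 + x_2 + \dots + x_n}{n}\right) \\
&= \frac{1}{n^2} V\left[x_1 + x_2 + \dots + x_n\right] \\
&= \frac{1}{n^2}\left[V(x_1) + V(x_2) + \dots + V(x_n)\right] \\
&= \frac{1}{n^2}\left[\sigma^2 + \sigma^2 + \dots + \sigma^2\right] \\
&= \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}
\end{aligned}
$$
$$
\therefore \lim_{n \to \infty} \frac{\sigma^2}{n} = 0
$$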
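The shrinking variance of the sample mean can be observed in a small simulation. This is an illustrative sketch under assumed settings (normal population with mean 0 and standard deviation 2, 4000 repetitions per sample size); none of these values come from the slides.

```python
import random
from statistics import mean, pvariance

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA = 0.0, 2.0  # assumed population parameters (illustrative)
REPS = 4000           # repeated samples per sample size


def var_of_sample_mean(n):
    """Empirical variance of the sample mean over REPS samples of size n."""
    means = [
        mean(random.gauss(MU, SIGMA) for _ in range(n))
        for _ in range(REPS)
    ]
    return pvariance(means)


# Compare the empirical variance with the theoretical value sigma^2 / n.
results = {n: var_of_sample_mean(n) for n in (5, 20, 80)}
for n, v in results.items():
    print(n, round(v, 3), "theory:", SIGMA ** 2 / n)
```

Each fourfold increase in n should cut the variance to roughly a quarter, in line with $\sigma^2 / n$.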
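Efficient
• For 2 or more unbiased estimators, the one with the smallest variance is considered the most efficient estimator.
Example 8.3
The sample mean $\bar{x}$ is a more efficient estimator than the sample median $\tilde{x}$ for estimating the population mean $\mu$.
Proof:
Variance of the sample mean:
$$
\operatorname{var}(\bar{x}) = \frac{\sigma^2}{n}
$$
Variance of the sample median, $\operatorname{var}(\tilde{x})$: a two-case expression in $\sigma^2$, with one case for odd n and one for even n; its coefficients are not legible in this transcript.
This gives $\operatorname{var}(\bar{x}) < \operatorname{var}(\tilde{x})$ if $n > 2$.
Thus the mean is more efficient in estimating $\mu$.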
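The comparison between mean and median can be illustrated by simulation. The sketch below assumes a standard normal population, samples of size 25 and 5000 repetitions; these settings are assumptions for illustration, and the ordering of the two variances can differ for non-normal populations.

```python
import random
from statistics import mean, median, pvariance

random.seed(2)  # fixed seed so the run is repeatable

N, REPS = 25, 5000  # assumed sample size and number of repetitions

means, medians = [], []
for _ in range(REPS):
    # One sample from an assumed standard normal population.
    sample = [random.gauss(0.0, 1.0) for _ in range(N)]
    means.append(mean(sample))
    medians.append(median(sample))

# Both estimators centre on 0; the one with the smaller spread is more efficient.
var_mean = pvariance(means)
var_median = pvariance(medians)
print(round(var_mean, 4), round(var_median, 4))
```

Under these assumptions the variance of the sample means comes out clearly smaller than that of the sample medians.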
Sufficient
• If it uses all of the information in the sample.
Point estimators for mean, variance, and proportion
Population mean
• Given a sample $X_1, X_2, X_3, \dots, X_n$ of size n taken from a certain population with unknown mean $\mu$ and variance $\sigma^2$, the sample mean $\bar{X}$ is the best estimator of $\mu$.
Population variance
• Given a sample $X_1, X_2, X_3, \dots, X_n$ of size n taken from a certain population with mean $\mu$ and variance $\sigma^2$, the sample variance
$$
S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2
$$
is the best estimator of $\sigma^2$.
Population proportion
• Given a sample $X_1, X_2, X_3, \dots, X_n$ of size n taken from a certain population with unknown proportion $P$, the sample proportion $\hat{P}$ is the best estimator of $P$.
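As a minimal sketch, the three point estimates can be computed with Python's standard library. The data values below are hypothetical and serve only to show the calculations; note that `statistics.variance` uses the $n - 1$ divisor, matching $S^2$.

```python
from statistics import mean, variance

# Hypothetical measurements (illustrative values only).
x = [4.1, 3.8, 5.0, 4.4, 4.7]

x_bar = mean(x)    # point estimate of the population mean
s2 = variance(x)   # sample variance with divisor n - 1

# Hypothetical yes/no outcomes coded as 1/0.
outcomes = [1, 0, 1, 1, 0, 1, 0, 1]
p_hat = sum(outcomes) / len(outcomes)  # point estimate of the proportion

print(round(x_bar, 3), round(s2, 3), p_hat)
```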
8.2 Interval Estimation
Definition 8.1: An Interval Estimate
In interval estimation, an interval is constructed around the point estimate, and it is stated that this interval is likely to contain the corresponding population parameter.
Definition 8.2: Confidence Level and Confidence Interval
Each interval is constructed with regard to a given confidence level and is called a confidence interval. The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter. The confidence level is denoted by $(1 - \alpha)100\%$.
8.2.1 Confidence Interval Estimates for Population Mean
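The $(1 - \alpha)100\%$ Confidence Interval of the Population Mean, $\mu$:
(i) if $\sigma$ is known and the population is normally distributed:
$$
\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}
\quad \text{or} \quad
\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}
$$
(ii) if $\sigma$ is unknown and n is large ($n \geq 30$):
$$
\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}
\quad \text{or} \quad
\bar{x} - z_{\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{s}{\sqrt{n}}
$$
(iii) if $\sigma$ is unknown, the population is normally distributed, and the sample size is small ($n < 30$):
$$
\bar{x} \pm t_{n-1,\,\alpha/2} \frac{s}{\sqrt{n}}
\quad \text{or} \quad
\bar{x} - t_{n-1,\,\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\,\alpha/2} \frac{s}{\sqrt{n}}
$$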
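Cases (i) and (ii) share the same form and can be sketched in a few lines of Python. The function name `z_interval` and its parameters are illustrative, not from the slides. It uses `statistics.NormalDist().inv_cdf` to obtain $z_{\alpha/2}$, so results may differ in the last digit from answers based on rounded table values such as 1.96 or 1.65. Case (iii) needs a t quantile, which the standard library does not provide, so it is left out here.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sd, n, confidence=0.95):
    """Return (lower, upper) for x_bar +/- z_{alpha/2} * sd / sqrt(n).

    sd is sigma in case (i) or the sample standard deviation s in case (ii).
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    half_width = z * sd / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Usage with n = 20, sigma = 15, x_bar = 64.3 at 95% confidence.
low, high = z_interval(64.3, 15, 20, 0.95)
print(round(low, 2), round(high, 2))
```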
Example 8.4
If a random sample of size $n = 20$ from a normal population with the variance $\sigma^2 = 225$ has the mean $\bar{x} = 64.3$, construct a 95% confidence interval for the population mean, $\mu$.
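Solution
It is known that $n = 20$, $\bar{x} = 64.3$ and $\sigma = 15$.
For a 95% CI,
$$
\begin{aligned}
95\% &= 100(1 - \alpha)\% \\
1 - \alpha &= 0.95 \\
\alpha &= 0.05 \\
\frac{\alpha}{2} &= 0.025 \\
z_{\alpha/2} = z_{0.025} &= 1.96
\end{aligned}
$$
Hence, the 95% CI is
$$
\begin{aligned}
\bar{x} \pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)
&= 64.3 \pm 1.96\left(\frac{15}{\sqrt{20}}\right) \\
&= 64.3 \pm 6.57 \\
&= [57.73,\ 70.87]
\end{aligned}
$$
or $57.73 < \mu < 70.87$.
Thus, we are 95% confident that the population mean $\mu$ is between 57.73 and 70.87.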
Example 8.5:
A publishing company has just published a new textbook. Before the
company decides the price at which to sell this textbook, it wants to know the
average price of all such textbooks in the market. The research department at
the company took a sample of 36 comparable textbooks and collected
information on their prices. This information produced a mean price of RM70.50
for the sample. It is known that the standard deviation of the prices of
all such textbooks is RM4.50. Construct a 90% confidence interval for the mean
price of all such college textbooks.
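Solution
It is known that $n = 36$, $\bar{x} = \text{RM}70.50$ and $\sigma = \text{RM}4.50$.
For a 90% CI,
$$
\begin{aligned}
90\% &= 100(1 - \alpha)\% \\
1 - \alpha &= 0.90 \\
\alpha &= 0.1 \\
\frac{\alpha}{2} &= 0.05 \\
z_{\alpha/2} = z_{0.05} &= 1.65
\end{aligned}
$$
Hence, the 90% CI is
$$
\begin{aligned}
\bar{x} \pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)
&= 70.50 \pm 1.65\left(\frac{4.50}{\sqrt{36}}\right) \\
&= 70.50 \pm 1.24 \\
&= [\text{RM}69.26,\ \text{RM}71.74]
\end{aligned}
$$
Thus, we are 90% confident that the mean price of all such college textbooks is between RM69.26 and RM71.74.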
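8.2.3 Confidence Interval Estimates for Population Proportion
The $(1 - \alpha)100\%$ Confidence Interval for $p$ for Large Samples ($n \geq 30$):
$$
\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}
\quad \text{or} \quad
\hat{p} - z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} < p < \hat{p} + z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}
$$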
Example 8.6
According to the analysis of Women Magazine in June 2005, “Stress has become a common part of everyday life among working women in Malaysia. The demands of work, family and home place an increasing burden on average Malaysian women”. According to this poll, 40% of the working women included in the survey indicated that they had only a limited amount of time to relax. The poll was based on a randomly selected sample of 1502 working women aged 30 and above. Construct a 95% confidence interval for the corresponding population proportion.
Solution
Let $p$ be the proportion of all working women aged 30 and above who have a limited amount of time to relax, and let $\hat{p}$ be the corresponding sample proportion. From the given information,
$$
n = 1502, \quad \hat{p} = 0.40, \quad \hat{q} = 1 - \hat{p} = 1 - 0.40 = 0.60
$$
Hence, the 95% CI is
$$
\begin{aligned}
\hat{p} \pm z_{\alpha/2}\left(\sqrt{\frac{\hat{p}\hat{q}}{n}}\right)
&= 0.40 \pm 1.96\left(\sqrt{\frac{0.40(0.60)}{1502}}\right) \\
&= 0.40 \pm 0.02478 \\
&= [0.375,\ 0.425] \text{ or } 37.5\% \text{ to } 42.5\%
\end{aligned}
$$
Thus, we can state with 95% confidence that the proportion of all working women aged 30 and above who have a limited amount of time to relax is between 37.5% and 42.5%.
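The large-sample interval for a proportion can be sketched in Python as follows. The function name `proportion_interval` is illustrative, and the z value is taken from `statistics.NormalDist` rather than a table, so the last digits may differ slightly from hand calculations with 1.96.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat, n, confidence=0.95):
    """Return (lower, upper) for p_hat +/- z_{alpha/2} * sqrt(p_hat(1 - p_hat)/n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Usage with the poll figures: p_hat = 0.40 from n = 1502 respondents.
low, high = proportion_interval(0.40, 1502)
print(round(low, 3), round(high, 3))
```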
8.2.5 Error of Estimation and Determining the Sample Size
Definition 8.3:
$$
\text{Error, } E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}
$$
If $\bar{x}$ is used as an estimate of $\mu$, we can be $100(1 - \alpha)\%$ confident that the error $|\bar{x} - \mu|$ will not exceed a specified amount $E$ when the sample size is
$$
n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2
$$
Example 8.7:
A team of efficiency experts intends to use the mean of a random sample of size n = 150 to estimate the average mechanical aptitude of assembly-line workers in a large industry (as measured by a certain standardized test). If, based on experience, the efficiency experts can assume that $\sigma = 6.2$ for such data, what can they assert with probability 0.99 about the maximum error of their estimate?
Solution
Substituting $n = 150$, $\sigma = 6.2$, and $z_{0.005} = 2.575$ into the expression for the maximum error, we get
$$
E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = \frac{2.575(6.2)}{\sqrt{150}} = 1.30
$$
Thus, the efficiency experts can assert with probability 0.99 that their error will be less than 1.30.
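Both directions of the error formula can be sketched in Python: computing the maximum error for a given n, and computing the n needed for a target error. The function names are illustrative, and the target error of 1.0 in the second call is a hypothetical value, not taken from the example. The sample size is rounded up so that the error bound is actually met.

```python
from math import ceil, sqrt


def max_error(z, sigma, n):
    """Maximum error E = z * sigma / sqrt(n)."""
    return z * sigma / sqrt(n)


def sample_size_for_mean(z, sigma, target_error):
    """Smallest integer n with z * sigma / sqrt(n) <= target_error."""
    return ceil((z * sigma / target_error) ** 2)


# Figures from the example: z_{0.005} = 2.575, sigma = 6.2, n = 150.
E = max_error(2.575, 6.2, 150)
print(round(E, 2))

# Hypothetical follow-up: sample size needed to bring the error down to 1.0.
n_needed = sample_size_for_mean(2.575, 6.2, 1.0)
print(n_needed)
```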
Definition 8.4:
If $\hat{p} = \dfrac{x}{n}$ is used as an estimate of $p$, we can assert with $(1 - \alpha)100\%$ confidence that the error is less than
$$
z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}
$$
If we set $E = z_{\alpha/2} \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$ and solve for $n$, the appropriate sample size is
$$
n = \hat{p}(1 - \hat{p}) \left(\frac{z_{\alpha/2}}{E}\right)^2
$$
Example 8.8:
A study is made to determine the proportion of voters in a sizable community who favor the construction of a nuclear power plant. If 140 of 400 voters selected at random favor the project and we use $\hat{p} = \dfrac{140}{400} = 0.35$ as an estimate of the actual proportion of all voters in the community who favor the project, what can we say with 99% confidence about the maximum error?
Solution
Substituting $n = 400$, $\hat{p} = 0.35$, and $z_{0.005} = 2.575$ into the formula, we get
$$
E = z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 2.575 \sqrt{\frac{(0.35)(0.65)}{400}} = 0.061
$$
Thus, if we use $\hat{p} = 0.35$ as an estimate of the actual proportion of voters in the community who favor the project, we can assert with 99% confidence that the error is less than 0.061.
Example 8.9:
How large a sample is required if we want to be 95% confident that the error in using $\hat{p}$ to estimate $p$ is less than 0.05? If $\hat{p} = 0.12$, find the required sample size.
Solution
$$
\begin{aligned}
n &= \hat{p}(1 - \hat{p}) \left(\frac{z_{0.025}}{E}\right)^2 \\
&= 0.12(0.88) \left(\frac{1.96}{0.05}\right)^2 \cong 162
\end{aligned}
$$
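The sample-size calculation for a proportion can be checked with a short Python sketch (the function name is illustrative). The formula gives about 162.27. The worked solution reports this as approximately 162, whereas rounding up to 163 is the conservative choice if the error must stay strictly below 0.05.

```python
from math import ceil


def sample_size_for_proportion(p_hat, z, target_error):
    """n = p_hat * (1 - p_hat) * (z / E)^2, before rounding."""
    return p_hat * (1 - p_hat) * (z / target_error) ** 2


# Figures from the example: p_hat = 0.12, z_{0.025} = 1.96, E = 0.05.
raw_n = sample_size_for_proportion(0.12, 1.96, 0.05)
n_required = ceil(raw_n)  # round up so the error bound is guaranteed
print(round(raw_n, 2), n_required)
```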