Week 9: The Central Limit Theorem and Estimation Concepts
TRANSCRIPT
Outline: Consistency and Law of Large Numbers · The Central Limit Theorem · Lab 5: The distribution of averages · Some Estimation Concepts
Week 9 Objectives
1 The Law of Large Numbers and the concept of consistency of averages are introduced. The condition for existence of the population mean is elaborated. The Cauchy distribution is used to illustrate the effects of violating this condition.
2 The averages of n = 2 and of n = 3 uniform random variables are used to demonstrate that the exact distribution of averages depends on the sample size.
3 Simulations illustrate the approximate normality of averages.
4 The Central Limit Theorem and the DeMoivre-Laplace Theorem are given.
5 Some estimation concepts are presented.
1 Consistency and Law of Large Numbers
2 The Central Limit Theorem
  Convolutions
  The Main Result
  The DeMoivre-Laplace Theorem
3 Lab 5: The distribution of averages
4 Some Estimation Concepts
  Model-Free and Model-Based Estimators
  Comparing Estimators
  The Method of Moments
We have seen that $\bar X$, $\hat p$, and $S^2$ approximate their population counterparts, and that the approximation improves as the sample size increases. This is called consistency. In general, an estimator $\hat\theta$ of a population parameter $\theta$ is called consistent if
$$\hat\theta \to \theta \quad \text{as } n \to \infty.$$
Consistency is such a basic property that virtually all estimators used in statistics are consistent.
Thus, the sample covariance, sample correlation, and the estimators of the intercept and slope of the simple linear regression model are all consistent estimators.
Consistency is supported by a very famous probability result:
Theorem (The Law of Large Numbers). Let $X_1, \dots, X_n$ be iid and let $g$ be a function such that $-\infty < E[g(X_1)] < \infty$. Then,
$$\hat\theta \equiv \frac{1}{n}\sum_{i=1}^n g(X_i) \;\to\; \theta \equiv E[g(X_1)] \quad \text{as } n \to \infty.$$
Consistency of the aforementioned estimators follows from the consistency of averages stated in the LLN.
For example, the consistency of $S^2$ follows from the fact that $\sigma^2 = E(X^2) - \mu^2$ and the consistency of $\bar X$ and $\frac{1}{n}\sum_{i=1}^n X_i^2$ as estimators of $\mu$ and $E(X^2)$, respectively.
The Assumption of Finite µ and σ²
The LLN requires that the population has a finite mean. Later we will also use the assumption of finite variance. Is it possible for the mean of a r.v. not to exist, or to be infinite? Is it possible for a r.v. to have finite $\mu$ but infinite $\sigma^2$?
The most notorious (for its abnormality) r.v. is the Cauchy, whose pdf is
$$f(x) = \frac{1}{\pi}\,\frac{1}{1 + x^2}, \quad -\infty < x < \infty.$$
The median of this r.v. is zero, but its mean does not exist.
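The contrast between the Cauchy's well-behaved median and its nonexistent mean shows up clearly in simulation. The following sketch is an illustration (not part of the slides; the R lab below does the analogous experiment with rcauchy): it generates standard Cauchy draws by inverse-CDF sampling, using the fact that tan(π(U − 1/2)) is standard Cauchy when U ∼ Unif(0, 1), and tracks the running mean.

```python
import math
import random
import statistics

random.seed(1)

def cauchy():
    # Inverse-CDF sampling: if U ~ Unif(0,1), then tan(pi*(U - 1/2))
    # is a draw from the standard Cauchy distribution.
    return math.tan(math.pi * (random.random() - 0.5))

draws = [cauchy() for _ in range(100_000)]

# The sample median settles near the true median, 0 ...
print("median:", statistics.median(draws))

# ... but the running means keep wandering as n grows,
# because E(X) does not exist.
for n in (100, 1_000, 10_000, 100_000):
    print("mean of first", n, "draws:", sum(draws[:n]) / n)
```

Rerunning with different seeds shows the running means jumping erratically from run to run, while the median stays put.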
It is not hard to construct other PDFs with infinite mean or variance:
Example. Let $X_1$ and $X_2$ have PDFs
$$f_1(x) = x^{-2}, \; 1 \le x < \infty, \qquad f_2(x) = 2x^{-3}, \; 1 \le x < \infty,$$
respectively. Show that $E(X_1) = \infty$, $E(X_2) = 2$, $\mathrm{Var}(X_2) = \infty$.
Solution: By straightforward integration we get
$$E(X_1) = \int_1^\infty x f_1(x)\,dx = \infty, \qquad E(X_1^2) = \int_1^\infty x^2 f_1(x)\,dx = \infty,$$
$$E(X_2) = \int_1^\infty x f_2(x)\,dx = 2, \qquad E(X_2^2) = \int_1^\infty x^2 f_2(x)\,dx = \infty.$$
Thus, $\mathrm{Var}(X_2) = E(X_2^2) - [E(X_2)]^2 = \infty$.
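The conclusions about $X_2$ can also be checked by simulation. Since $f_2$ has CDF $F_2(x) = 1 - x^{-2}$ on $[1, \infty)$, inverse-CDF sampling gives $X_2 = (1 - U)^{-1/2}$. The following is an illustrative sketch (the sampler is derived here, not taken from the slides):

```python
import random

random.seed(0)

def draw_x2():
    # f2(x) = 2 x^-3 on [1, inf) has CDF F2(x) = 1 - x^-2,
    # so inverse-CDF sampling gives X2 = (1 - U)^(-1/2).
    return (1.0 - random.random()) ** -0.5

xs = [draw_x2() for _ in range(200_000)]
mean = sum(xs) / len(xs)
second_moment = sum(x * x for x in xs) / len(xs)
print("sample mean:", mean)               # close to E(X2) = 2
print("sample E(X2^2):", second_moment)   # unstable across runs: E(X2^2) is infinite
```

The sample mean is close to 2 because $E(X_2)$ is finite, but the sample second moment changes wildly between seeds, a symptom of the infinite $E(X_2^2)$.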
Distributions with infinite mean, or infinite variance, are described as heavy tailed.
The LLN may not hold for heavy-tailed distributions: $\bar X$ may not converge (as in the case of the Cauchy), or may diverge. (The CLT, which we will see shortly, also does not hold.)
Samples coming from heavy-tailed distributions are much more likely to contain outliers.
If large outliers exist in a data set, it might be a good idea to focus on estimating another quantity, such as the median, which is well defined also for heavy-tailed distributions.
Convolutions · The Main Result · The DeMoivre-Laplace Theorem
• The LLN says nothing about the quality of the approximation for a given sample size. For that we need the distribution of $\bar X$.
The distribution of the sum of two independent random variables is called the convolution of the two distributions. In some cases it is easy to find the convolution of two RVs.
Example. You and a friend each flip a fair coin a number of times. Let X be the number of heads you get in your n1 flips, and Y the number of heads your friend gets in n2 flips.
(a) Are X and Y independent?
(b) What is the distribution of the total number of heads X + Y?
Solution: (a) Yes. (b) $X + Y \sim \mathrm{Bin}(n_1 + n_2, 0.5)$ (why?).
• In general, if $X \sim \mathrm{Bin}(n_1, p)$ and $Y \sim \mathrm{Bin}(n_2, p)$ are independent, then $X + Y \sim \mathrm{Bin}(n_1 + n_2, p)$.
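This closure property is easy to verify numerically: convolving the two binomial PMFs term by term reproduces the Bin(n1 + n2, p) PMF. A minimal sketch (the values n1 = 5, n2 = 7, p = 0.5 are arbitrary illustrative choices):

```python
from math import comb

def binom_pmf(n, p):
    # PMF of Bin(n, p) as a list indexed by k = 0, ..., n
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(f, g):
    # PMF of X + Y for independent X, Y:
    # P(X + Y = k) = sum_j P(X = j) * P(Y = k - j)
    h = [0.0] * (len(f) + len(g) - 1)
    for j, fj in enumerate(f):
        for i, gi in enumerate(g):
            h[j + i] += fj * gi
    return h

n1, n2, p = 5, 7, 0.5
conv = convolve(binom_pmf(n1, p), binom_pmf(n2, p))
direct = binom_pmf(n1 + n2, p)
print(max(abs(a - b) for a, b in zip(conv, direct)))  # essentially 0
```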
The convolution is also known in the Poisson and normal cases:
If $X \sim \mathrm{Poisson}(\lambda_1)$ and $Y \sim \mathrm{Poisson}(\lambda_2)$ are independent, then $X + Y \sim \mathrm{Poisson}(\lambda_1 + \lambda_2)$. (See Example 5.3-1, p. 216 for the derivation.)
If $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$ are independent, then
$$X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2).$$
• Such nice results are the exception. In general, the distribution of a sum of independent random variables is not easy to determine.
If $X \sim \mathrm{Bin}(n_1, p_1)$ and $Y \sim \mathrm{Bin}(n_2, p_2)$, with $p_1 \ne p_2$, then the distribution of $X + Y$ is not binomial.
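One way to see this numerically: when $p_1 \ne p_2$, the sum's variance is strictly smaller than that of the binomial with the same number of trials and the same mean, so no binomial can match. A sketch with the arbitrary illustrative choice X ∼ Bin(10, 0.3), Y ∼ Bin(10, 0.7):

```python
from math import comb

def binom_pmf(n, p):
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(f, g):
    # PMF of the sum of two independent integer-valued RVs
    h = [0.0] * (len(f) + len(g) - 1)
    for j, fj in enumerate(f):
        for i, gi in enumerate(g):
            h[j + i] += fj * gi
    return h

# X ~ Bin(10, 0.3), Y ~ Bin(10, 0.7): the sum has mean 10, like Bin(20, 0.5),
# but its variance 10(0.3)(0.7) + 10(0.7)(0.3) = 4.2 is smaller than 20(0.25) = 5,
# so the sum cannot be binomial.
conv = convolve(binom_pmf(10, 0.3), binom_pmf(10, 0.7))
mean = sum(k * pk for k, pk in enumerate(conv))
var = sum(k * k * pk for k, pk in enumerate(conv)) - mean**2
print(mean, var)
print(max(abs(a - b) for a, b in zip(conv, binom_pmf(20, 0.5))))  # clearly nonzero
```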
The convolution of two independent U(0,1) random variables is shown in http://personal.psu.edu/acq/401/fig/convol_2Unif.pdf. This is derived in Example 5.3-3, p. 217. A numerical derivation will be done in the next lab.
In general, the formula for the convolution of two random variables is given in relation (5.3.3), p. 219. It is not easy to compute!
The convolution of several random variables is even harder to compute.
Distribution of Sums of Normal Random Variables
Proposition. Let $X_1, X_2, \dots, X_n$ be independent with $X_i \sim N(\mu_i, \sigma_i^2)$, and set $Y = a_1X_1 + \cdots + a_nX_n$. Then $Y \sim N(\mu_Y, \sigma_Y^2)$, where
$$\mu_Y = a_1\mu_1 + \cdots + a_n\mu_n, \qquad \sigma_Y^2 = a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2.$$
Corollary. Let $X_1, X_2, \dots, X_n$ be iid $N(\mu, \sigma^2)$. Then
$$\bar X \sim N\!\left(\mu, \frac{\sigma^2}{n}\right).$$
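The corollary can be checked by simulation: repeated sample means of iid $N(\mu, \sigma^2)$ draws should behave like draws from $N(\mu, \sigma^2/n)$. A sketch (the values µ = 10, σ = 2, n = 25 are arbitrary illustrative choices):

```python
import random
import statistics

random.seed(2)
mu, sigma, n = 10.0, 2.0, 25

# 20000 replications of the sample mean of n iid N(mu, sigma^2) draws;
# by the corollary these means should look like N(mu, sigma^2 / n).
means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(20_000)]
print(statistics.fmean(means))   # near mu = 10
print(statistics.stdev(means))   # near sigma / sqrt(n) = 2 / 5 = 0.4
```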
Why care about such results?
• This result is relevant for answering an important statistical question, namely how to control the estimation error via the sample size:
Example. We want to be 95% certain that $\bar X$ does not differ from the population mean by more than 0.3 units. How large should the sample size n be? It is known that the population distribution is normal with $\sigma^2 = 9$.
Remark: This example assumes normality with known variance. In real life the variance is also unknown. More details on this will be given in Chapter 8.
Solution: Using the previous corollary we have
$$P(|\bar X - \mu| < 0.3) = P\!\left(\left|\frac{\bar X - \mu}{\sigma/\sqrt{n}}\right| < \frac{0.3}{\sigma/\sqrt{n}}\right) = P\!\left(|Z| < \frac{0.3}{\sigma/\sqrt{n}}\right).$$
We want to find n so that $P(|\bar X - \mu| < 0.3) = 0.95$. Thus,
$$\frac{0.3}{\sigma/\sqrt{n}} = z_{0.025}, \quad\text{or}\quad n = \left(\frac{1.96}{0.3}\,\sigma\right)^2 = 384.16.$$
Thus, using n = 385 will satisfy the precision objective.
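The arithmetic can be reproduced in a couple of lines (Python here, purely as a check of the calculation):

```python
import math

# Solve 0.3 / (sigma / sqrt(n)) = z_{0.025} = 1.96 for n,
# with sigma = 3 (since sigma^2 = 9):
sigma, margin, z = 3.0, 0.3, 1.96
n = (z * sigma / margin) ** 2
print(n, "-> use n =", math.ceil(n))  # 384.16 -> use n = 385
```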
The complexity of finding the exact distribution of a sum (or average) of k (> 2) RVs increases rapidly with k. This is why the following is the most important theorem in probability and statistics.
Theorem (The Central Limit Theorem). Let $X_1, \dots, X_n$ be iid with mean $\mu$ and variance $\sigma^2 < \infty$. Then, for large enough n ($n \ge 30$ for the purposes of this course), $\bar X$ has approximately a normal distribution with mean $\mu$ and variance $\sigma^2/n$, i.e.,
$$\bar X \stackrel{\cdot}{\sim} N\!\left(\mu, \frac{\sigma^2}{n}\right).$$
The Central Limit Theorem: Comments
The CLT does not require the $X_i$ to be continuous RVs.
$T = X_1 + \cdots + X_n$ has approximately a normal distribution with mean $n\mu$ and variance $n\sigma^2$, i.e.,
$$T = X_1 + \cdots + X_n \stackrel{\cdot}{\sim} N(n\mu, n\sigma^2).$$
The CLT can be stated for more general linear combinations of independent r.v.s, and also for certain dependent RVs. The present statement suffices for the range of applications of this book.
The CLT explains the central role of the normal distribution in probability and statistics.
Example. Two towers are constructed, each by stacking 30 segments of concrete. The height (in inches) of a randomly selected segment is uniformly distributed in (35.5, 36.5). A roadway can be laid across the two towers provided their heights are within 4 inches of each other. Find the probability that the roadway can be laid.
Solution: The heights of the two towers are
$$T_1 = X_{1,1} + \cdots + X_{1,30}, \qquad T_2 = X_{2,1} + \cdots + X_{2,30},$$
where $X_{i,j}$ is the height of the j-th segment used in the i-th tower. By the CLT,
$$T_1 \stackrel{\cdot}{\sim} N(1080, 2.5), \qquad T_2 \stackrel{\cdot}{\sim} N(1080, 2.5).$$
Example (Continued). Since the two heights $T_1$, $T_2$ are independent r.v.s, their difference, D, is distributed as
$$D = T_1 - T_2 \stackrel{\cdot}{\sim} N(0, 5).$$
Therefore, the probability that the roadway can be laid is
$$P(-4 < D < 4) \doteq P\!\left(\frac{-4}{2.236} < Z < \frac{4}{2.236}\right) = \Phi(1.79) - \Phi(-1.79) = 0.9633 - 0.0367 = 0.9266.$$
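The table lookup $\Phi(1.79)$ can be cross-checked without tables: the standard normal CDF can be written in terms of the error function. A short sketch:

```python
import math

def phi(z):
    # Standard normal CDF via the error function:
    # Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# D = T1 - T2 is approximately N(0, 5), so
# P(-4 < D < 4) = Phi(4 / sqrt(5)) - Phi(-4 / sqrt(5)).
z = 4.0 / math.sqrt(5.0)
prob = phi(z) - phi(-z)
print(round(prob, 4))  # about 0.926
```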
Example. The waiting time for a bus, in minutes, is uniformly distributed in [0, 10]. In five months a person catches the bus 120 times. Find the 95th percentile, $t_{0.05}$, of the person's total waiting time T.
Solution: Write $T = X_1 + \cdots + X_{120}$, where $X_i$ is the waiting time the i-th time the person catches the bus. Since $n = 120 \ge 30$, and $E(X_i) = 5$, $\mathrm{Var}(X_i) = 10^2/12 = 100/12$, we have
$$T \stackrel{\cdot}{\sim} N\!\left(120 \times 5,\; 120\,\frac{100}{12}\right) = N(600, 1000).$$
Therefore,
$$t_{0.05} \simeq 600 + z_{0.05}\sqrt{1000} = 600 + 1.645\sqrt{1000} = 652.02.$$
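As a quick check of the last step (Python, purely for the arithmetic):

```python
import math

# T is approximately N(600, 1000), so the 95th percentile of T is
# t_{0.05} = 600 + z_{0.05} * sqrt(1000), with z_{0.05} = 1.645.
t = 600 + 1.645 * math.sqrt(1000)
print(round(t, 2))  # 652.02
```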
The DeMoivre-Laplace theorem was proved first by DeMoivre in 1733 for p = 0.5 and extended to general p by Laplace in 1812. Due to the prevalence of the Bernoulli distribution in the experimental sciences, it continues to be stated separately even though it is now recognized as a special case of the CLT.
Theorem (DeMoivre-Laplace). If $T \sim \mathrm{Bin}(n, p)$ then, for n satisfying $np \ge 5$ and $n(1-p) \ge 5$,
$$T \stackrel{\cdot}{\sim} N\bigl(np,\; np(1-p)\bigr).$$
The continuity correction
For an integer-valued RV X, and Y the approximating normal RV, the principle of continuity correction specifies
$$P(X \le k) \simeq P(Y \le k + 0.5).$$
The DeMoivre-Laplace theorem and continuity correction yield
$$P(T \le k) \simeq \Phi\!\left(\frac{k + 0.5 - np}{\sqrt{np(1-p)}}\right).$$
For individual probabilities use
$$P(X = k) \simeq P(k - 0.5 < Y < k + 0.5).$$
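The benefit of the correction is easy to quantify. For example, for $T \sim \mathrm{Bin}(10, 0.5)$ and $k = 5$, the exact probability can be compared with the normal approximation with and without the correction (a sketch; erf gives the normal CDF):

```python
from math import comb, erf, sqrt

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p, k = 10, 0.5, 5
exact = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))
mu, var = n * p, n * p * (1 - p)
with_cc = phi((k + 0.5 - mu) / sqrt(var))   # with continuity correction
without_cc = phi((k - mu) / sqrt(var))      # without
print(exact, with_cc, without_cc)
# exact is about 0.6230; 0.6241 with the correction vs 0.5000 without
```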
Figure: Bin(10, 0.5) PMF and N(5, 2.5) PDF. Approximation of $P(X \le 5)$ with and without continuity correction.
Example. A college basketball team plays 30 regular season games, 16 of which are against class A teams and 14 against class B teams. The probability that the team wins a game is 0.4 against a class A team and 0.6 against a class B team. Assuming that the results of different games are independent, approximate the probability that
1 the team will win at least 18 games;
2 the number of wins against class B teams is smaller than that against class A teams.
Solution: If $X_1$ and $X_2$ denote the number of wins against class A teams and class B teams, respectively, then
$$X_1 \sim \mathrm{Bin}(16, 0.4), \qquad X_2 \sim \mathrm{Bin}(14, 0.6).$$
Since $16 \times 0.4$ and $14 \times 0.4$ are both $\ge 5$,
$$X_1 \stackrel{\cdot}{\sim} N(16 \times 0.4,\; 16 \times 0.4 \times 0.6) = N(6.4, 3.84),$$
$$X_2 \stackrel{\cdot}{\sim} N(14 \times 0.6,\; 14 \times 0.6 \times 0.4) = N(8.4, 3.36).$$
Consequently, since $X_1$ and $X_2$ are independent,
$$X_1 + X_2 \stackrel{\cdot}{\sim} N(6.4 + 8.4,\; 3.84 + 3.36) = N(14.8, 7.20),$$
$$X_2 - X_1 \stackrel{\cdot}{\sim} N(8.4 - 6.4,\; 3.84 + 3.36) = N(2, 7.20).$$
Solution continued: Hence, using also the continuity correction, the needed approximation for part 1 is
$$P(X_1 + X_2 \ge 18) = 1 - P(X_1 + X_2 \le 17) \simeq 1 - \Phi\!\left(\frac{17.5 - 14.8}{\sqrt{7.2}}\right) = 1 - \Phi(1.006) = 1 - 0.843 = 0.157,$$
and the needed approximation for part 2 is
$$P(X_2 - X_1 < 0) \simeq \Phi\!\left(\frac{-0.5 - 2}{\sqrt{7.2}}\right) = \Phi(-0.932) = 0.176.$$
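Since the two sample spaces are small, the approximations can also be compared with the exact probabilities, obtained by convolving the two binomial PMFs. A sketch:

```python
from math import comb, erf, sqrt

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def binom_pmf(n, p):
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

pA = binom_pmf(16, 0.4)  # wins against class A teams
pB = binom_pmf(14, 0.6)  # wins against class B teams

# Normal approximations with continuity correction, as in the slides:
approx1 = 1 - phi((17.5 - 14.8) / sqrt(7.2))   # P(X1 + X2 >= 18)
approx2 = phi((-0.5 - 2.0) / sqrt(7.2))        # P(X2 - X1 < 0)

# Exact probabilities by summing over the joint PMF of (X1, X2):
exact1 = sum(pA[i] * pB[j] for i in range(17) for j in range(15) if i + j >= 18)
exact2 = sum(pA[i] * pB[j] for i in range(17) for j in range(15) if j < i)

print(round(approx1, 3), "vs exact", round(exact1, 3))
print(round(approx2, 3), "vs exact", round(exact2, 3))
```

The approximate and exact values should agree to about two decimal places here.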
Consistency/inconsistency results
• Consistency of the sample proportion
n=100; x=rbinom(1,n,0.3); x/n # Repeat with n=1000 and n=10000
• Consistency of the sample mean for normal samples
n=100; mean(rnorm(n)) # Repeat with n=1000 and n=10000
• Inconsistency of the sample mean when E(X) does not exist
n=100; mean(rcauchy(n)) # Repeat with n=1000 and n=10000
Plotting probabilities
• Probabilities can be plotted with the plot and barplot commands. Empirical probabilities can be plotted with hist and barplot.
Plot the pmf of the Bin(10, 0.5):
plot(0:10, dbinom(0:10, 10, 0.5), pch=4, col=2)
lines(0:10, dbinom(0:10, 10, 0.5), col=2)
Alternatively: p=dbinom(0:10, 10, 0.5); barplot(p)
Try also: barplot(p, xlim=c(0,12), ylim=c(0,0.25), col=2)
Plot empirical probabilities resulting from x=rbinom(1000,10,0.5):
hist(x, seq(-0.5, 10.5, 1), col=5)
Alternatively: ph=table(x)/1000; barplot(ph)
Averages of uniform random variables
The pdf of the sum of two independent uniform random variables can be obtained from formula (5.3.3) in the book, i.e.,
$$f_{X_1+X_2}(y) = \int_{-\infty}^{\infty} f_{X_2}(y - x)\,f_{X_1}(x)\,dx,$$
and plotted by
g=function(x){x*(x<1) + (2-x)*(1<=x)*(x<2)}
curve(g,0,2)
Formula (5.3.3) can be applied repeatedly to get the pdf of the sum of 3 or more uniform r.v.s, but it quickly becomes very difficult.
So, we employ simulations to get an idea what the pdf of thesum (or average) of several uniform r.v.s looks like:
m=matrix(runif(5000000), nrow=80)
hist(m[1,], seq(0, 1, 0.1)) # should look like the pdf of the uniform
hist((m[1,]+m[2,])/2, seq(0, 1, 0.01)) # should look like the plotted pdf
hist(colMeans(m), seq(min(colMeans(m))-0.01, max(colMeans(m))+0.01, 0.01)) # shows what the density of X̄ for n = 80 looks like
mean(m[1,]); sd(m[1,]) # should be approximately 0.5 and √(1/12) = 0.2887
mean(colMeans(m)); sd(colMeans(m)) # should be approximately 0.5 and √(1/(12 · 80)) = 0.0323
Averages of log-normal random variables
curve(dlnorm, 0, 10) # this density has µ = exp(0.5) = 1.6487 and σ² = (e − 1)e = 4.6708
m=matrix(rlnorm(5000000), nrow=80)
hist(m[1,], seq(0, max(m[1,])+0.3, 0.3)) # should look like the plotted pdf
hist(colMeans(m), seq(min(colMeans(m))-0.1, max(colMeans(m))+0.1, 0.1)) # shows what the density of X̄ for n = 80 looks like
mean(m[1,]); sd(m[1,]) # should be approximately 1.6487 and √4.6708 = 2.1612
mean(colMeans(m)); sd(colMeans(m)) # should be approximately 1.6487 and √(4.6708/80) = 0.2416
Averages of negative binomial random variables
p=dnbinom(0:15, 5, 0.6); barplot(p) # has µ = 5(1/0.6) − 5 = 3.3333 and σ² = 5(0.4/0.6²) = 5.5556
m=matrix(rnbinom(4000000, 5, 0.6), nrow=80)
hist(m[1,], seq(-0.5, floor(max(m[1,]))+0.5, 1)) # should look like the plotted pmf
hist(colMeans(m), seq(floor(min(colMeans(m)))-0.5, floor(max(colMeans(m)))+1.5, 0.1)) # shows what the density of X̄ for n = 80 looks like
mean(m[1,]); sd(m[1,]) # should be approximately 3.3333 and √5.5556 = 2.357
mean(colMeans(m)); sd(colMeans(m)) # should be approximately 3.3333 and √(5.5556/80) = 0.2635
Model-Free and Model-Based Estimators · Comparing Estimators · The Method of Moments
What is a model?
In probability and statistics the word "model" refers to an assumption we make about the data at hand.
For univariate discrete data we may assume the binomial, Poisson, or another discrete probability model.
For univariate continuous data we may assume the normal, exponential, or another continuous probability model.
For bivariate data we may assume the bivariate normal model, the linear regression model, or another model.
When a model for the distribution of the data is assumed, it is of interest to estimate the model parameters.
In the normal model we want to estimate µ and σ².
In the uniform model, the two endpoints are of interest.
In the SLR model we want $\beta_0$, $\beta_1$, and $\sigma_\varepsilon^2$.
Estimating the model parameters is called fitting a model to the data. Estimating the model parameters is quite different from estimating the population mean, variance, percentiles, etc., which we saw in Chapter 1.
For example, how does one estimate the endpoints of a uniform distribution? (Chapter 6 presents two ways of doing this, and one of them will be discussed.)
Estimators of the model parameters are meaningful only if the assumed model is correct.
Estimators of the endpoints of the uniform distribution would be meaningless if the data are not uniformly distributed.
Estimators of the slope and intercept of the SLR model lose their interpretation if the true regression function is a higher-order polynomial.
The estimators of Chapter 1 do not have this problem. For example, assuming the population mean, µ, exists, $\bar X$ estimates µ regardless of the population distribution.
• The estimators of Chapter 1 will be called nonparametric or model-free.
Fitting models to data leads to alternative estimators of the population mean, variance, proportions, and percentiles.
If the normal model is assumed:
a) The density is estimated by the $N(\bar X, S^2)$ density.
b) The percentiles are also estimated as $\bar X + S z_\alpha$.
c) In addition to sample proportions, $P(X \le x)$ may also be estimated by $\Phi((x - \bar X)/S)$.
If the uniform model is assumed, and $\hat a$, $\hat b$ are the estimators of the two endpoints:
a) The mean and median are also estimated as $(\hat a + \hat b)/2$.
b) The variance is also estimated as $(\hat b - \hat a)^2/12$.
If the Poisson model is assumed, $\bar X$ is an alternative estimator of the variance.
• Such alternative estimators are called model-based.
• The existence of alternative estimators for the same population quantity raises the question of which is "better".
Estimators are compared in terms of their mean and variance. [Piece of jargon: The standard deviation of an estimator is called its standard error.]
If the mean of an estimator $\hat\theta$ equals its true value, i.e.,
$$E_\theta(\hat\theta) = \theta,$$
it is called unbiased. The difference expected value − true value is the bias:
$$\mathrm{bias}(\hat\theta) = E_\theta(\hat\theta) - \theta.$$
$\bar X$ and $\hat p$ are unbiased estimators:
$$E_\mu(\bar X) = \mu, \qquad E_p(\hat p) = p.$$
$S^2$ is also unbiased: $E_{\sigma^2}(S^2) = \sigma^2$. See p. 232 for a proof.
Later we will also see that $\hat\beta_0$ and $\hat\beta_1$ are unbiased.
However,
$S$ is only a consistent (not unbiased) estimator of $\sigma$.
$\frac{1}{n}\sum_{i=1}^n X_i^2 - \bar X^2$ is only a consistent (not unbiased) estimator of $\sigma^2$.
The Mean Square Error Selection Criterion
Variance and bias combine to define the Mean Square Error, $\mathrm{MSE}_\theta(\hat\theta) = E_\theta(\hat\theta - \theta)^2$:
$$\mathrm{MSE}_\theta(\hat\theta) = \mathrm{Var}(\hat\theta) + [\mathrm{bias}(\hat\theta)]^2.$$
The MSE selection criterion says that estimator $\hat\theta_1$ should be preferred over $\hat\theta_2$ if
$$\mathrm{MSE}_\theta(\hat\theta_1) \le \mathrm{MSE}_\theta(\hat\theta_2).$$
If $\hat\theta_1$ and $\hat\theta_2$ are unbiased, we have the variance selection criterion: $\hat\theta_1$ should be preferred over $\hat\theta_2$ if
$$\mathrm{Var}(\hat\theta_1) \le \mathrm{Var}(\hat\theta_2).$$
On the basis of the MSE (or variance) selection criterion, we can make the following statement:
• Under correct model assumptions, model-based estimators are preferable to the model-free ones. For example:
If the normal assumption is correct, $\bar X + S z_\alpha$ is preferable to the $(1-\alpha)100\%$ sample percentile.
If the normal assumption is correct, $\Phi((x - \bar X)/S)$ is preferable to the sample proportion.
If the normal assumption is correct, $\bar X$ is preferable to the sample median $\widetilde X$.
If the Poisson assumption is correct, $\bar X$ is preferable (as an estimator of the variance!) to $S^2$.
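The Poisson case in the last bullet can be checked by simulation: under the Poisson model both $\bar X$ (model-based) and $S^2$ (model-free) estimate the variance $\lambda$, and the simulated MSEs show which is better. A sketch (λ = 4, n = 20 are arbitrary illustrative choices; the Poisson sampler uses Knuth's multiplication method):

```python
import math
import random
import statistics

random.seed(3)
lam, n, reps = 4.0, 20, 5_000

def poisson(lam):
    # Knuth's multiplication method for a Poisson(lam) draw.
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

# Under the Poisson model Var(X) = lambda = E(X), so both Xbar and S^2
# are estimators of the variance; estimate their MSEs by simulation.
mse_mean = mse_var = 0.0
for _ in range(reps):
    x = [poisson(lam) for _ in range(n)]
    mse_mean += (statistics.fmean(x) - lam) ** 2
    mse_var += (statistics.variance(x) - lam) ** 2
mse_mean /= reps
mse_var /= reps
print(mse_mean, mse_var)  # Xbar should show the clearly smaller MSE
```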
For $r \ge 1$, the r-th moment of a r.v. X is defined as $E(X^r)$.
Let $X_1, \dots, X_n$ be iid from the same population as X. Then:
By the LLN, $\frac{1}{n}\sum_{i=1}^n X_i^r$ is a consistent estimator of $E(X^r)$. It is also unbiased.
The method of moments is based on expressing the model parameters in terms of the moments.
If $X \sim \mathrm{Unif}(0, \theta)$ then $\theta = 2\mu$.
If $X \sim \mathrm{Unif}(\alpha, \beta)$ then
$$\alpha = \mu - \sqrt{3\sigma^2}, \qquad \beta = \mu + \sqrt{3\sigma^2},$$
and, of course, $\sigma^2 = E(X^2) - \mu^2$.
Additional examples include:
If $X \sim \mathrm{Exp}(\lambda)$, $\lambda = 1/\mu$; if $X \sim \mathrm{Poisson}(\lambda)$, $\lambda = \mu$.
Ignoring the difference between $\frac{1}{n}\sum_{i=1}^n X_i^2 - \bar X^2$ and $S^2$, the above lead to the following method of moments estimators:
If $X_1, \dots, X_n$ is a sample from $N(\mu, \sigma^2)$, then $\hat\mu = \bar X$, $\hat\sigma^2 = S^2$.
If $X_1, \dots, X_n$ is a sample from $\mathrm{Unif}(0, \theta)$, $\hat\theta = 2\bar X$.
If $X_1, \dots, X_n$ is a sample from $\mathrm{Unif}(\alpha, \beta)$,
$$\hat\alpha = \bar X - \sqrt{3S^2}, \qquad \hat\beta = \bar X + \sqrt{3S^2}.$$
If $X_1, \dots, X_n$ is a sample from $\mathrm{Exp}(\lambda)$, $\hat\lambda = 1/\bar X$.
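As a sanity check, the $\mathrm{Unif}(\alpha, \beta)$ estimators can be tried on simulated data. A sketch (the endpoints 2 and 7 are arbitrary illustrative choices):

```python
import math
import random
import statistics

random.seed(4)
alpha, beta, n = 2.0, 7.0, 10_000

# Simulate a Unif(2, 7) sample and apply the method of moments estimators
# alpha_hat = Xbar - sqrt(3 S^2), beta_hat = Xbar + sqrt(3 S^2).
x = [random.uniform(alpha, beta) for _ in range(n)]
xbar = statistics.fmean(x)
s2 = statistics.variance(x)
alpha_hat = xbar - math.sqrt(3 * s2)
beta_hat = xbar + math.sqrt(3 * s2)
print(alpha_hat, beta_hat)  # both should land close to 2 and 7
```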
• In expressing the model parameters in terms of moments, the method of moments uses the lowest possible moments.
If $X_i$, $i = 1, \dots, n$, are iid $\mathrm{Poisson}(\lambda)$, use $\hat\lambda = \bar X$ instead of $\hat\lambda = S^2$.
• For bivariate data $(X_i, Y_i)$, $i = 1, \dots, n$, the product moment $E(XY)$ is estimated by $\frac{1}{n}\sum_{i=1}^n X_iY_i$.
The method of moments estimator of $\mathrm{Cov}(X, Y)$ is
$$\frac{1}{n}\sum_{i=1}^n X_iY_i - \bar X\,\bar Y,$$
which is slightly different from the sample version of $\mathrm{Cov}(X, Y)$.