Lecture 5: Regression with One Explanator (Chapter 3.1–3.5, 3.7; Chapter 4.1–4.4)

Copyright © 2006 Pearson Addison-Wesley. All rights reserved.

Page 1

Lecture 5: Regression with One Explanator

(Chapter 3.1–3.5, 3.7; Chapter 4.1–4.4)

Page 2

Agenda

• Finding a good estimator for a straight line through the origin: Chapter 3.1–3.5, 3.7

• Finding a good estimator for a straight line with an intercept: Chapter 4.1–4.4

Page 3

Where Are We?

• We wish to uncover quantitative features of an underlying process, such as the relationship between family income and financial aid. How much less aid will I receive on average for each dollar of additional family income?

• We have data, a sample of the process, for example observations on 10,000 students’ aid awards and family incomes.

Page 4

Where Are We? (cont.)

• Other factors ($\varepsilon$), such as number of siblings, influence any individual student’s aid, so we cannot directly observe the relationship between income and aid.

• We need a rule for making a good guess about the relationship between income and financial aid, based on the data.

Page 5

Where Are We? (cont.)

• A good guess is a guess which is right on average.

• We also desire a guess which will have a low variance around the true value.

Page 6

Where Are We? (cont.)

• Our rule is called an “estimator.”

• We started by brainstorming a number of estimators and then comparing their performances in a series of computer simulations.

• We found that the Ordinary Least Squares estimator dominated the other estimators.

• Why is Ordinary Least Squares so good?

Page 7

Where Are We? (cont.)

• To make more general statements, we need to move beyond the computer and into the world of mathematics.

• Last time, we reviewed a number of mathematical tools: summations, descriptive statistics, expectations, variances, and covariances.

Page 8

Where Are We? (cont.)

• As a starting place, we need to write down all our assumptions about the way the underlying process works, and about how that process led to our data.

• These assumptions are called the “Data Generating Process.”

• Then we can derive estimators that have good properties for the Data Generating Process we have assumed.

Page 9

Where Are We? (cont.)

• The DGP is a model to approximate reality. We trade off realism to gain parsimony and tractability.

• Models are to be used, not believed.

Page 10

Where Are We? (cont.)

• Much of this course focuses on different types of DGP assumptions that you can make, giving you many options as you trade realism for tractability.

Page 11

Where Are We? (cont.)

• Two Ways to Screw Up in Econometrics:

– Your Data Generating Process assumptions missed a fundamental aspect of reality (your DGP is not a useful approximation); or

– Your estimator did a bad job for your DGP.

• Today we focus on picking a good estimator for your DGP.

Page 12

Where Are We? (cont.)

• Today, we will focus on deriving the properties of an estimator for a simple DGP: the Gauss–Markov Assumptions.

• First we will find the expectations and variances of any linear estimator under the DGP.

• Then we will derive the Best Linear Unbiased Estimator (BLUE).

Page 13

Our Baseline DGP: Gauss–Markov (Chapter 3)

• $Y_i = \beta X_i + \varepsilon_i$

• $E(\varepsilon_i) = 0$

• $Var(\varepsilon_i) = \sigma^2$

• $Cov(\varepsilon_i, \varepsilon_j) = 0$, for $i \neq j$

• X’s fixed across samples (so we can treat them like constants).

• We want to estimate $\beta$.

Page 14

A Strategy for Inference

• The DGP tells us the assumed relationships between the data we observe and the underlying process of interest.

• Using the assumptions of the DGP and the algebra of expectations, variances, and covariances, we can derive key properties of our estimators, and search for estimators with desirable properties.

Page 15

An Example: g1

$$Y_i = \beta X_i + \varepsilon_i$$

$$E(\varepsilon_i) = 0, \quad Var(\varepsilon_i) = \sigma^2, \quad Cov(\varepsilon_i, \varepsilon_j) = 0 \text{ for } i \neq j$$

X’s fixed across samples (so we can treat them as constants).

$$g_1 = \frac{1}{n}\sum_{i=1}^{n}\frac{Y_i}{X_i}$$

In our simulations, g1 appeared to give estimates close to $\beta$.

Was this an accident, or does g1 on average give us $\beta$?

Page 16

An Example: g1 (cont.)

$$E(g_1) = E\left(\frac{1}{n}\sum_{i=1}^{n}\frac{Y_i}{X_i}\right)
= \frac{1}{n}\sum_{i=1}^{n}E\left(\frac{Y_i}{X_i}\right)
= \frac{1}{n}\sum_{i=1}^{n}E\left(\frac{\beta X_i + \varepsilon_i}{X_i}\right)$$

$$= \frac{1}{n}\sum_{i=1}^{n}E(\beta) + \frac{1}{n}\sum_{i=1}^{n}\frac{1}{X_i}E(\varepsilon_i)
= \frac{1}{n}n\beta + 0 = \beta$$

On average, g1 gives us $\beta$: $E(g_1) = \beta$.

Using the DGP and the algebra of expectations, we conclude that g1 is unbiased.
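A quick numerical companion to this derivation is a small Monte Carlo. This is only a sketch: numpy, the true $\beta = 2$, $\sigma = 1$, and the X values are illustrative assumptions, not from the slides.

```python
import numpy as np

# Monte Carlo sketch: is g1 right on average under the Gauss-Markov DGP?
# beta, sigma, and X below are illustrative choices.
rng = np.random.default_rng(0)
beta, sigma = 2.0, 1.0
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])         # X's fixed across samples

g1_draws = []
for _ in range(10_000):
    eps = rng.normal(0.0, sigma, size=X.size)   # E(eps_i) = 0, Var(eps_i) = sigma^2
    Y = beta * X + eps                          # Y_i = beta*X_i + eps_i
    g1_draws.append(np.mean(Y / X))             # g1 = (1/n) * sum(Y_i / X_i)

print(np.mean(g1_draws))                        # approximately 2.0: unbiased
```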

Page 17

Checking Understanding

$$E(g_1) = E\left(\frac{1}{n}\sum_{i=1}^{n}\frac{Y_i}{X_i}\right)
= \frac{1}{n}\sum_{i=1}^{n}E\left(\frac{\beta X_i + \varepsilon_i}{X_i}\right)
= \frac{1}{n}\sum_{i=1}^{n}E(\beta) + \frac{1}{n}\sum_{i=1}^{n}\frac{1}{X_i}E(\varepsilon_i)
= \frac{1}{n}n\beta + 0 = \beta$$

Question: which DGP assumptions did we need to use?

Page 18

Checking Understanding (cont.)

$$E(g_1) = E\left(\frac{1}{n}\sum_{i=1}^{n}\frac{Y_i}{X_i}\right)
= \frac{1}{n}\sum_{i=1}^{n}E\left(\frac{\beta X_i + \varepsilon_i}{X_i}\right)$$

Here we used $Y_i = \beta X_i + \varepsilon_i$.

$$= \frac{1}{n}\sum_{i=1}^{n}E(\beta) + \frac{1}{n}\sum_{i=1}^{n}\frac{1}{X_i}E(\varepsilon_i)$$

Here we used the assumption that the X’s are fixed across samples.

$$= \frac{1}{n}n\beta + 0 = \beta$$

Here we used $E(\varepsilon_i) = 0$.

Page 19

Checking Understanding (cont.)

We did NOT use the assumptions about the variance and covariances of $\varepsilon_i$.

We will use these assumptions when we calculate the variance of the estimator.

Page 20

Linear Estimators

• g1 is unbiased. Can we generalize?

• We will focus on linear estimators.

• Linear estimator: a weighted sum of the Y’s:

$$\hat{\beta} = \sum w_i Y_i$$

Page 21

Linear Estimators (cont.)

• Linear estimator:

$$\hat{\beta} = \sum w_i Y_i$$

• Example: g1 is a linear estimator:

$$g_1 = \frac{1}{n}\sum\frac{Y_i}{X_i} = \sum\frac{1}{nX_i}Y_i, \qquad w_i = \frac{1}{nX_i}$$

Page 22

Linear Estimators (cont.)

1) Mean of Ratios:

$$g_1 = \frac{1}{n}\sum_{i=1}^{n}\frac{Y_i}{X_i}, \qquad w_i = \frac{1}{nX_i}$$

2) Ratio of Means:

$$g_2 = \frac{\sum Y_i}{\sum X_i}, \qquad w_i = \frac{1}{\sum X_j}$$

3) Mean of Ratio of Changes:

$$g_3 = \frac{1}{n-1}\sum_{i=2}^{n}\frac{Y_i - Y_{i-1}}{X_i - X_{i-1}}, \qquad w_i = \frac{1}{n-1}\left(\frac{1}{X_i - X_{i-1}} - \frac{1}{X_{i+1} - X_i}\right)$$

4) Ordinary Least Squares:

$$g_4 = \frac{\sum X_i Y_i}{\sum X_j^2}, \qquad w_i = \frac{X_i}{\sum X_j^2}$$

• All of our “best guesses” are linear estimators! (A sketch computing all four appears below.)
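To make the four estimators concrete, here is a minimal Python sketch computing each one on simulated data; the DGP parameters and X values are illustrative assumptions.

```python
import numpy as np

# The four "best guesses", each a weighted sum of the Y's.
rng = np.random.default_rng(1)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = 2.0 * X + rng.normal(0.0, 1.0, size=X.size)  # illustrative beta = 2

g1 = np.mean(Y / X)                        # 1) mean of ratios
g2 = Y.sum() / X.sum()                     # 2) ratio of means
g3 = np.mean(np.diff(Y) / np.diff(X))      # 3) mean of ratio of changes
g4 = (X * Y).sum() / (X ** 2).sum()        # 4) ordinary least squares

# Each estimator is sum(w_i * Y_i); e.g., the OLS weights:
w4 = X / (X ** 2).sum()
assert np.isclose(g4, (w4 * Y).sum())
print(g1, g2, g3, g4)
```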

Page 23

Expectation of Linear Estimators

Under the Gauss–Markov DGP ($Y_i = \beta X_i + \varepsilon_i$, $E(\varepsilon_i) = 0$, $Var(\varepsilon_i) = \sigma^2$, $Cov(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$, X’s fixed across samples), for any linear estimator $\hat{\beta} = \sum_{i=1}^{n} w_i Y_i$:

$$E(\hat{\beta}) = E\left(\sum_{i=1}^{n} w_i Y_i\right) = \sum_{i=1}^{n} w_i E(Y_i)
= \sum_{i=1}^{n} w_i E(\beta X_i + \varepsilon_i)
= \sum_{i=1}^{n} w_i\,[\beta X_i + E(\varepsilon_i)]
= \beta \sum_{i=1}^{n} w_i X_i$$

Page 24

Expectation of Linear Estimator (cont.)

$$\hat{\beta} = \sum_{i=1}^{n} w_i Y_i$$

$$E(\hat{\beta}) = \beta\sum_{i=1}^{n} w_i X_i$$

A linear estimator is unbiased if

$$\sum_{i=1}^{n} w_i X_i = 1.$$

Page 25

Expectation of Linear Estimator (cont.)

• A linear estimator is unbiased if $\sum w_i X_i = 1$

• Are g2 and g4 unbiased?

2) Ratio of Means:

$$g_2 = \frac{\sum Y_i}{\sum X_i}, \qquad w_i = \frac{1}{\sum X_j}, \qquad
\sum w_i X_i = \frac{\sum X_i}{\sum X_j} = 1$$

4) Ordinary Least Squares:

$$g_4 = \frac{\sum X_i Y_i}{\sum X_j^2}, \qquad w_i = \frac{X_i}{\sum X_j^2}, \qquad
\sum w_i X_i = \frac{\sum X_i^2}{\sum X_j^2} = 1$$
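The unbiasedness condition is also easy to verify numerically. A minimal sketch, with illustrative X values:

```python
import numpy as np

# Check sum(w_i * X_i) = 1 for the weights of g1, g2, and g4.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = X.size

w1 = 1.0 / (n * X)              # g1: mean of ratios
w2 = np.full(n, 1.0 / X.sum())  # g2: ratio of means
w4 = X / (X ** 2).sum()         # g4: OLS

for name, w in [("g1", w1), ("g2", w2), ("g4", w4)]:
    print(name, (w * X).sum())  # each prints 1.0, so each estimator is unbiased
```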

Page 26

Expectation of Linear Estimator (cont.)

• Similar calculations hold for g3

• All 4 of our “best guesses” are unbiased.

• But g4 did much better than g3. Not all unbiased estimators are created equal.

• We want an unbiased estimator with a low mean squared error.

Page 27

First: A Puzzle…

• Suppose n = 1

– Would you like a big X or a small X for that observation?

– Why?

Page 28

What Observations Receive More Weight?

1) Mean of Ratios:

$$g_1 = \frac{1}{n}\sum_{i=1}^{n}\frac{Y_i}{X_i}, \qquad w_i = \frac{1}{nX_i}$$

2) Ratio of Means:

$$g_2 = \frac{\sum Y_i}{\sum X_i}, \qquad w_i = \frac{1}{\sum X_j}$$

3) Mean of Ratio of Changes:

$$g_3 = \frac{1}{n-1}\sum_{i=2}^{n}\frac{Y_i - Y_{i-1}}{X_i - X_{i-1}}, \qquad w_i = \frac{1}{n-1}\left(\frac{1}{X_i - X_{i-1}} - \frac{1}{X_{i+1} - X_i}\right)$$

4) Ordinary Least Squares:

$$g_4 = \frac{\sum X_i Y_i}{\sum X_j^2}, \qquad w_i = \frac{X_i}{\sum X_j^2}$$

Page 29

What Observations Receive More Weight? (cont.)

$$g_1 = \frac{1}{n}\sum\frac{Y_i}{X_i}, \qquad w_i = \frac{1}{nX_i}$$

$$g_3 = \frac{1}{n-1}\sum_{i=2}^{n}\frac{Y_i - Y_{i-1}}{X_i - X_{i-1}}, \qquad w_i = \frac{1}{n-1}\left(\frac{1}{X_i - X_{i-1}} - \frac{1}{X_{i+1} - X_i}\right)$$

• g1 puts more weight on observations with low values of X.

• g3 puts more weight on observations with low values of X, relative to neighboring observations.

• These estimators did very poorly in the simulations.

Page 30

What Observations Receive More Weight? (cont.)

$$g_2 = \frac{\sum Y_i}{\sum X_i}, \qquad w_i = \frac{1}{\sum X_j}$$

$$g_4 = \frac{\sum X_i Y_i}{\sum X_j^2}, \qquad w_i = \frac{X_i}{\sum X_j^2}$$

• g2 weights all observations equally.

• g4 puts more weight on observations with high values of X.

• These estimators did very well in the simulations.

Page 31

Why Weight More Heavily Observations With High X’s?

• Under our Gauss–Markov DGP, the disturbances are drawn from the same distribution for all values of X.

• To compare a high X choice and a low X choice, ask what effect a given disturbance will have for each.

Page 32

Figure 3.1 Effects of a Disturbance for Small and Large X

Page 33

Linear Estimators and Efficiency

• For our DGP, good estimators will place more weight on observations with high values of X.

• Inferences from these observations are less sensitive to the effects of the same-sized disturbance.

• Only one of our “best guesses” had this property.

• g4 (a.k.a. OLS) dominated the other estimators.

• Can we do even better?

Page 34

Linear Estimators and Efficiency (cont.)

• Mean Squared Error = Variance + Bias²

• To have a low Mean Squared Error, we want two things: a low bias and a low variance.

Page 35

Linear Estimators and Efficiency (cont.)

• An unbiased estimator with a low variance will tend to give answers close to the true value of $\beta$.

• Using the algebra of variances and our DGP, we can calculate the variance of our estimators.

Page 36

Algebra of Variances

For constants $k$ and random variables $X$, $Y$:

(1) $Var(k) = 0$

(2) $Var(kY) = k^2\,Var(Y)$

(3) $Var(k + Y) = Var(Y)$

(4) $Var(X + Y) = Var(X) + Var(Y) + 2\,Cov(X, Y)$

(5) $Var\left(\sum_{i=1}^{n} Y_i\right) = \sum_{i=1}^{n} Var(Y_i) + \sum_{i=1}^{n}\sum_{j \neq i} Cov(Y_i, Y_j)$

• One virtue of independent observations is that $Cov(Y_i, Y_j) = 0$, killing all the cross-terms in the variance of the sum.

Page 37

Our Baseline DGP: Gauss–Markov

• Our benchmark DGP: Gauss–Markov

• $Y_i = \beta X_i + \varepsilon_i$

• $E(\varepsilon_i) = 0$

• $Var(\varepsilon_i) = \sigma^2$

• $Cov(\varepsilon_i, \varepsilon_j) = 0$, for $i \neq j$

• X’s fixed across samples

We will refer to this DGP (very) frequently.

Page 38

Variance of OLS

$$Var(\hat{\beta}_{OLS}) = Var\left(\frac{\sum X_i Y_i}{\sum X_k^2}\right)
= \sum_{i=1}^{n} Var\left(\frac{X_i Y_i}{\sum X_k^2}\right)
+ 2\sum_{i=1}^{n}\sum_{j > i} Cov\left(\frac{X_i Y_i}{\sum X_k^2}, \frac{X_j Y_j}{\sum X_k^2}\right)$$

The covariance terms are all 0, so

$$Var(\hat{\beta}_{OLS}) = \sum_{i=1}^{n}\left(\frac{X_i}{\sum X_k^2}\right)^2 Var(Y_i)
= \sum_{i=1}^{n}\left(\frac{X_i}{\sum X_k^2}\right)^2 Var(\beta X_i + \varepsilon_i)$$

Page 39

Variance of OLS (cont.)

$$Var(\hat{\beta}_{OLS}) = \sum_{i=1}^{n}\left(\frac{X_i}{\sum X_k^2}\right)^2 Var(\beta X_i + \varepsilon_i)
= \sum_{i=1}^{n}\left(\frac{X_i}{\sum X_k^2}\right)^2 (0 + \sigma^2 + 0)
= \sigma^2\,\frac{\sum X_i^2}{\left(\sum X_k^2\right)^2}
= \frac{\sigma^2}{\sum X_k^2}$$

• Note: the higher $\sum X_k^2$, the lower the variance.
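A Monte Carlo sketch of this result; $\beta$, $\sigma$, and the X values are illustrative assumptions:

```python
import numpy as np

# Compare the simulated variance of the OLS estimator with sigma^2 / sum(X_k^2).
rng = np.random.default_rng(2)
beta, sigma = 2.0, 1.0
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

draws = []
for _ in range(100_000):
    Y = beta * X + rng.normal(0.0, sigma, size=X.size)
    draws.append((X * Y).sum() / (X ** 2).sum())   # OLS estimate

print(np.var(draws))                # simulated variance
print(sigma ** 2 / (X ** 2).sum())  # theoretical variance: they agree closely
```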

Page 40

Variance of a Linear Estimator

• More generally:

$$Var\left(\sum w_i Y_i\right) = \sum Var(w_i Y_i) + \text{Covariance Terms}
= \sum Var(w_i Y_i) + 0$$

$$= \sum w_i^2\,Var(Y_i) = \sum w_i^2\,Var(\beta X_i + \varepsilon_i)
= \sum w_i^2\,Var(\varepsilon_i) = \sigma^2 \sum w_i^2$$
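The same kind of check works for any linear estimator. A sketch using g1’s weights, with illustrative parameters:

```python
import numpy as np

# Verify Var(sum(w_i * Y_i)) = sigma^2 * sum(w_i^2) for g1's weights.
rng = np.random.default_rng(3)
beta, sigma = 2.0, 1.0
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = 1.0 / (X.size * X)                 # g1 weights: w_i = 1/(n*X_i)

sims = []
for _ in range(100_000):
    Y = beta * X + rng.normal(0.0, sigma, size=X.size)
    sims.append((w * Y).sum())

print(np.var(sims))                    # simulated variance of g1
print(sigma ** 2 * (w ** 2).sum())     # theoretical: sigma^2 * sum(w_i^2)
```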

Page 41

Variance of a Linear Estimator (cont.)

• The algebras of expectations and variances allow us to get exact results where the Monte Carlos gave only approximations.

• The exact results apply to ANY model meeting our Gauss–Markov assumptions.

Page 42

Variance of a Linear Estimator (cont.)

• We now know mathematically that g1–g4 are all unbiased estimators of $\beta$ under our Gauss–Markov assumptions.

• We also think from our Monte Carlo models that g4 is the best of these four estimators, in that it is more efficient than the others.

• They are all unbiased (we know from the algebra), but g4 appears to have a smaller variance than the other 3.

Page 43

Variance of a Linear Estimator (cont.)

• Is there an unbiased linear estimator better (i.e., more efficient) than g4?

– What is the Best Linear Unbiased Estimator?

– How do we find the BLUE estimator?

Page 44

BLUE Estimators

• Mean Squared Error = Variance + Bias²

• An unbiased estimator is right “on average”

• In practice, we don’t get to average. We see only one draw from the DGP.

Page 45

BLUE Estimators (cont.)

• Some analysts would prefer an estimator with a small bias, if it gave them a large reduction in variance.

• What good is being right on average if you’re likely to be very wrong in your one draw?

Page 46

BLUE Estimators (cont.)

• Mean Squared Error = Variance + Bias²

• In a particular application, there may be a favorable trade-off between accepting a little bias in return for a lot less variance.

• We will NOT look for these trade-offs.

• Only after we have made sure our estimator is unbiased will we try to make the variance small.

Page 47

BLUE Estimators (cont.)

A Strategy for Finding the Best Linear Unbiased Estimator:

1. Start with linear estimators: $\sum w_i Y_i$

2. Impose the unbiasedness condition: $\sum w_i X_i = 1$

3. Calculate the variance of a linear estimator: $Var(\sum w_i Y_i) = \sigma^2 \sum w_i^2$

4. Use calculus to find the $w_i$ that give the smallest variance subject to the unbiasedness condition (a numerical sketch of this step follows below)

Result: the BLUE Estimator for Our DGP
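Step 4 can also be carried out numerically rather than with calculus. A sketch using scipy’s constrained optimizer, with illustrative X values; since $\sigma^2$ is a positive constant, minimizing $\sum w_i^2$ is equivalent:

```python
import numpy as np
from scipy.optimize import minimize

# Minimize sum(w_i^2) subject to sum(w_i * X_i) = 1, and compare the
# resulting weights with the OLS weights X_i / sum(X_j^2).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

result = minimize(
    lambda w: (w ** 2).sum(),                             # variance up to sigma^2
    x0=np.full(X.size, 0.1),                              # arbitrary starting point
    constraints={"type": "eq",
                 "fun": lambda w: (w * X).sum() - 1.0},   # unbiasedness condition
)

print(result.x)               # numerically optimal weights
print(X / (X ** 2).sum())     # the OLS weights: the two match
```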

Page 48

BLUE Estimators (cont.)

Using calculus, we would find

$$w_i = \frac{X_i}{\sum X_j^2}$$

This formula is OLS!

OLS is the Best Linear Unbiased Estimator for the Gauss–Markov DGP.

This result is called the Gauss–Markov Theorem.

Page 49

BLUE Estimators (cont.)

• OLS is a very good strategy for the Gauss–Markov DGP.

• OLS is unbiased: our guesses are right on average.

• OLS is efficient: it has a small variance (or at least the smallest possible variance for unbiased linear estimators).

• Our guesses will tend to be close to right (or at least as close to right as we can get; the minimum variance could still be pretty large!)

Page 50

BLUE Estimator (cont.)

• According to the Gauss–Markov Theorem, OLS is the BLUE Estimator for the Gauss–Markov DGP.

• We will study other DGP’s. For any DGP, we can follow this same procedure:

– Look at Linear Estimators

– Impose the unbiasedness conditions

– Minimize the variance of the estimator

Page 51

Example: Cobb–Douglas Production Functions (Chapter 3.7)

• A classic production function in economics is the Cobb–Douglas function.

• $Y = aL^{\alpha}K^{1-\alpha}$

• If firms pay workers and capital their marginal products, then worker compensation equals a fraction $\alpha$ of total output (or national income).

Page 52

Example: Cobb–Douglas

• To illustrate, we randomly pick 8 years between 1900 and 1995. For each year, we observe total worker compensation and national income.

• We use g1, g2, g3, and g4 to estimate: $Compensation = \alpha \cdot National\ Income + \varepsilon$

Page 53

TABLE 3.6 Estimates of the Cobb–Douglas Parameter α, with Standard Errors

Page 54

TABLE 3.7 Outputs from a Regression* of Compensation on National Income

Page 55

Example: Cobb–Douglas

• All 4 of our estimators give very similar estimates.

• However, g2 and g4 have much smaller standard errors. (We will see the value of small standard errors when we cover hypothesis tests.)

• Using our estimate from g4, 0.738, a 1 billion dollar increase in National Income is predicted to increase total worker compensation by 0.738 billion dollars.

Page 56

A New DGP

• Most lines do not go through the origin.

• Let’s add an intercept term and find the BLUE Estimator (from Chapter 4).

Page 57

Gauss–Markov with an Intercept

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \quad (i = 1 \ldots n)$$

$$E(\varepsilon_i) = 0, \quad Var(\varepsilon_i) = \sigma^2, \quad Cov(\varepsilon_i, \varepsilon_j) = 0,\ i \neq j$$

X’s fixed across samples.

All we have done is add a $\beta_0$.

Page 58

Gauss–Markov with an Intercept (cont.)

• Example: let’s estimate the effect of income on college financial aid.

• Students whose families have 0 income do not receive 0 aid. They receive a lot of aid.

• $E[\text{financial aid} \mid \text{family income}] = \beta_0 + \beta_1 \cdot (\text{family income})$

Page 59

Gauss–Markov with an Intercept (cont.)

Page 60

Gauss–Markov with an Intercept (cont.)

• How do we construct a BLUE Estimator?

• Step 1: focus on linear estimators.

• Step 2: calculate the expectation of a linear estimator for this DGP, and find the condition for the estimator to be unbiased.

• Step 3: calculate the variance of a linear estimator. Find the weights that minimize this variance subject to the unbiasedness constraint.

Page 61

Expectation of a Linear Estimator

$$E(\hat{\beta}) = E\left(\sum w_i Y_i\right) = \sum w_i E(Y_i)
= \sum w_i E(\beta_0 + \beta_1 X_i + \varepsilon_i)$$

$$= \sum w_i E(\beta_0) + \sum w_i E(\beta_1 X_i) + \sum w_i E(\varepsilon_i)
= \beta_0 \sum w_i + \beta_1 \sum w_i X_i + 0$$

Page 62

Checking Understanding

$$E(\hat{\beta}) = \beta_0 \sum w_i + \beta_1 \sum w_i X_i$$

• Question: What are the conditions for an estimator of $\beta_1$ to be unbiased? What are the conditions for an estimator of $\beta_0$ to be unbiased?

Page 63

Checking Understanding (cont.)

$$E(\hat{\beta}) = \beta_0 \sum w_i + \beta_1 \sum w_i X_i$$

• When is the expectation equal to $\beta_1$? When $\sum w_i = 0$ and $\sum w_i X_i = 1$.

• What if we were estimating $\beta_0$? When is the expectation equal to $\beta_0$? When $\sum w_i = 1$ and $\sum w_i X_i = 0$.

• To estimate 1 parameter, we needed 1 unbiasedness condition. To estimate 2 parameters, we need 2 unbiasedness conditions.

Page 64

Variance of a Linear Estimator

$$Var(\hat{\beta}) = Var\left(\sum w_i Y_i\right) = \sum Var(w_i Y_i) + 0
= \sum w_i^2\,Var(\beta_0 + \beta_1 X_i + \varepsilon_i)
= \sum w_i^2\,Var(\varepsilon_i) = \sigma^2 \sum w_i^2$$

• Adding a constant to the DGP does NOT change the variance of the estimator.

Page 65

BLUE Estimator

To compute the BLUE estimator for $\beta_1$, we want to minimize

$$\sigma^2 \sum w_i^2$$

subject to the constraints

$$\sum w_i = 0, \qquad \sum w_i X_i = 1$$

Solution:

$$w_i = \frac{X_i - \bar{X}}{\sum_{j=1}^{n}(X_j - \bar{X})^2}, \qquad
\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{j=1}^{n}(X_j - \bar{X})^2}$$

Page 66

BLUE Estimator of β1

$$\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{j=1}^{n}(X_j - \bar{X})^2}$$

• This estimator is OLS for the DGP with an intercept.

• It is the Best (minimum variance) Linear Unbiased Estimator for the Gauss–Markov DGP with an intercept.
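A sketch of this estimator in Python, checked against numpy’s built-in least-squares fit; the data and the true parameters ($\beta_0 = 1$, $\beta_1 = 0.5$) are illustrative assumptions:

```python
import numpy as np

# OLS slope for the DGP with an intercept, straight from the formula above.
rng = np.random.default_rng(4)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = 1.0 + 0.5 * X + rng.normal(0.0, 0.3, size=X.size)

b1 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()

print(b1)
print(np.polyfit(X, Y, 1)[0])   # same slope from numpy's polynomial fit
```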

Page 67

BLUE Estimator of β1 (cont.)

• This formula is very similar to the formula for OLS without an intercept.

• However, now we subtract the mean values from both X and Y.

$$\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{j=1}^{n}(X_j - \bar{X})^2}$$

Page 68

BLUE Estimator of β1 (cont.)

• OLS places more weight on observations with high values of $X_i - \bar{X}$.

• Observations are more valuable if X is far away from its mean.

$$\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{j=1}^{n}(X_j - \bar{X})^2}$$

Page 69

BLUE Estimator of β1 (cont.)

$$w_i = \frac{X_i - \bar{X}}{\sum_j (X_j - \bar{X})^2}$$

$$Var(\hat{\beta}_1) = \sigma^2 \sum_i w_i^2
= \sigma^2\,\frac{\sum_i (X_i - \bar{X})^2}{\left[\sum_j (X_j - \bar{X})^2\right]^2}
= \frac{\sigma^2}{\sum_j (X_j - \bar{X})^2}$$

Page 70

BLUE Estimator of β0

• The easiest way to estimate the intercept:

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X}$$

• Notice that the fitted regression line always goes through the point $(\bar{X}, \bar{Y})$.

• Our fitted regression line passes through “the middle of the data.”
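Continuing the sketch from Page 66, the intercept estimate and the through-$(\bar{X}, \bar{Y})$ property; the data are again illustrative:

```python
import numpy as np

# Estimate beta0_hat = Ybar - beta1_hat * Xbar and confirm the fitted line
# passes through the point (Xbar, Ybar).
rng = np.random.default_rng(5)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = 1.0 + 0.5 * X + rng.normal(0.0, 0.3, size=X.size)

b1 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
b0 = Y.mean() - b1 * X.mean()

# The fitted value at Xbar equals Ybar exactly:
print(b0 + b1 * X.mean(), Y.mean())
```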

Page 71

Example: The Phillips Curve

• Phillips argued that nations face a trade-off between inflation and unemployment.

• He used annual British data on wage inflation and unemployment from 1861–1913 and 1914–1957 to regress inflation on unemployment.

Page 72

Example: The Phillips Curve (cont.)

• The fitted regression line for 1861–1913 did a good job predicting the data from 1914 to 1957.

• “Out of sample predictions” are a strong test of an econometric model.

Page 73

Example: The Phillips Curve (cont.)

$$\hat{\beta}_0 = 0.06, \qquad \hat{\beta}_1 = -0.55$$

• The US data from 1958–1969 also suggest a trade-off between inflation and unemployment.

$$\widehat{Unemployment}_t = 0.06 - 0.55 \cdot Inflation_t$$

Page 74

Example: The Phillips Curve (cont.)

• How do we interpret these numbers?

• If Inflation were 0, our best guess of Unemployment would be 0.06 percentage points.

• A one percentage point increase of Inflation decreases our predicted Unemployment level by 0.55 percentage points.

$$\widehat{Unemployment}_t = 0.06 - 0.55 \cdot Inflation_t$$
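As a small sketch of the arithmetic behind this interpretation (the fitted line is from the slide; the function name is our own):

```python
# Predicted unemployment from the fitted Phillips-curve line.
def predicted_unemployment(inflation):
    return 0.06 - 0.55 * inflation

print(predicted_unemployment(0.0))   # 0.06 when inflation is zero
# A one-unit increase in inflation lowers predicted unemployment by 0.55:
print(predicted_unemployment(1.0) - predicted_unemployment(0.0))
```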

Page 75

Figure 4.2 U.S. Unemployment and Inflation, 1958–1969

Page 76

TABLE 4.1 The Phillips Curve

Page 77

Example: The Phillips Curve

• We no longer need to assume our regression line goes through the origin.

• We have learned how to estimate an intercept.

• A straight line doesn’t seem to do a great job here. Can we do better?

Page 78

Review

• As a starting place, we need to write down all our assumptions about the way the underlying process works, and about how that process led to our data.

• These assumptions are called the “Data Generating Process.”

• Then we can derive estimators that have good properties for the Data Generating Process we have assumed.

Page 79

Review: The Gauss–Markov DGP

• $Y_i = \beta X_i + \varepsilon_i$

• $E(\varepsilon_i) = 0$

• $Var(\varepsilon_i) = \sigma^2$

• $Cov(\varepsilon_i, \varepsilon_j) = 0$, for $i \neq j$

• X’s fixed across samples (so we can treat them like constants).

• We want to estimate $\beta$.

Page 80

Review

• We will focus on linear estimators.

• Linear estimator: a weighted sum of the Y’s:

$$\hat{\beta} = \sum w_i Y_i$$

Page 81

Review (cont.)

$$Y_i = \beta X_i + \varepsilon_i$$

$$E(\varepsilon_i) = 0, \quad Var(\varepsilon_i) = \sigma^2, \quad Cov(\varepsilon_i, \varepsilon_j) = 0 \text{ for } i \neq j$$

X’s fixed across samples (so we can treat them as constants).

A linear estimator $\hat{\beta} = \sum_{i=1}^{n} w_i Y_i$ has $E(\hat{\beta}) = \beta \sum_{i=1}^{n} w_i X_i$.

A linear estimator is unbiased if

$$\sum_{i=1}^{n} w_i X_i = 1.$$

Page 82

Review (cont.)

$$Y_i = \beta X_i + \varepsilon_i$$

$$E(\varepsilon_i) = 0, \quad Var(\varepsilon_i) = \sigma^2, \quad Cov(\varepsilon_i, \varepsilon_j) = 0 \text{ for } i \neq j$$

X’s fixed across samples (so we can treat them as constants).

A linear estimator is unbiased if $\sum_{i=1}^{n} w_i X_i = 1$.

Many linear estimators will be unbiased. How do I pick the “best” linear unbiased estimator (BLUE)?

Page 83

Review: BLUE Estimators

A Strategy for Finding the Best Linear Unbiased Estimator:

1. Start with linear estimators: $\sum w_i Y_i$

2. Impose the unbiasedness condition: $\sum w_i X_i = 1$

3. Use calculus to find the $w_i$ that give the smallest variance subject to the unbiasedness condition.

Result: The BLUE Estimator for our DGP

Page 84

Review: BLUE Estimators (cont.)

• Ordinary Least Squares (OLS) is BLUE for our Gauss–Markov DGP.

• This result is called the “Gauss–Markov Theorem.”

Page 85

Review: BLUE Estimators (cont.)

• OLS is a very good strategy for the Gauss–Markov DGP.

• OLS is unbiased: our guesses are right on average.

• OLS is efficient: the smallest possible variance for unbiased linear estimators.

• Our guesses will tend to be close to right (or at least as close to right as we can get).

• Warning: the minimum variance could still be pretty large!

Page 86

Gauss–Markov with an Intercept

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \quad (i = 1 \ldots n)$$

$$E(\varepsilon_i) = 0, \quad Var(\varepsilon_i) = \sigma^2, \quad Cov(\varepsilon_i, \varepsilon_j) = 0,\ i \neq j$$

X’s fixed across samples.

All we have done is add a $\beta_0$.

Page 87

Review: BLUE Estimator of β1

$$\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{j=1}^{n}(X_j - \bar{X})^2}$$

• This estimator is OLS for the DGP with an intercept.

• It is the Best (minimum variance) Linear Unbiased Estimator for the Gauss–Markov DGP with an intercept.

Page 88

BLUE Estimator of β0

• The easiest way to estimate the intercept:

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X}$$

• Notice that the fitted regression line always goes through the point $(\bar{X}, \bar{Y})$.

• Our fitted regression line passes through “the middle of the data.”