MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE 1


Page 1

This sequence provides a geometrical interpretation of a multiple regression model with two explanatory variables.

EARNINGS = b1 + b2S + b3EXP + u

[Diagram: axes for EARNINGS, S, and EXP, with the intercept b1 marked on the EARNINGS axis.]

Page 2

Specifically, we will look at an earnings function model where hourly earnings, EARNINGS, depend on years of schooling (highest grade completed), S, and years of work experience, EXP.

Page 3

The model has three dimensions, one each for EARNINGS, S, and EXP. The starting point for investigating the determination of EARNINGS is the intercept, b1.

Page 4

Literally, the intercept gives EARNINGS for those respondents who have no schooling and no work experience. However, there were no respondents with less than 6 years of schooling. Hence a literal interpretation of b1 would be unwise.

Page 5

The next term on the right side of the equation gives the effect of variations in S. A one-year increase in S causes EARNINGS to increase by b2 dollars, holding EXP constant.

[Diagram: the line b1 + b2S in the S direction, labeled "pure S effect".]

Page 6

Similarly, the third term gives the effect of variations in EXP. A one-year increase in EXP causes EARNINGS to increase by b3 dollars, holding S constant.

[Diagram: the line b1 + b3EXP in the EXP direction, labeled "pure EXP effect".]

Page 7

Different combinations of S and EXP give rise to values of EARNINGS which lie on the plane shown in the diagram, defined by the equation EARNINGS = b1 + b2S + b3EXP. This is the nonstochastic (nonrandom) component of the model.

[Diagram: the plane built up from the pure S effect (b1 + b2S), the pure EXP effect (b1 + b3EXP), and the combined effect of S and EXP (b1 + b2S + b3EXP).]

Page 8

The final element of the model is the disturbance term, u. This causes the actual values of EARNINGS to deviate from the plane. In this observation, u happens to have a positive value.

[Diagram: an observation lying above the plane; its value b1 + b2S + b3EXP + u differs from the plane value b1 + b2S + b3EXP by the disturbance u.]

Page 9

A sample consists of a number of observations generated in this way. Note that the interpretation of the model does not depend on whether S and EXP are correlated or not.

[Diagram: as on the previous slide, an observation generated as b1 + b2S + b3EXP + u.]
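The data generating process described on this slide can be illustrated with a short simulation. The sketch below is in Python rather than the Stata used later in the sequence, and the coefficient values, the ranges of S and EXP, and the normal disturbance are illustrative assumptions, not properties of Data Set 21.

```python
# A minimal sketch of how a sample is generated by EARNINGS = b1 + b2*S + b3*EXP + u.
# Parameter values and sample design are assumed for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n = 540                                   # same sample size as the output shown later

b1, b2, b3 = -26.49, 2.68, 0.56           # "true" coefficients (assumed for the sketch)
S = rng.integers(6, 21, size=n)           # years of schooling, 6..20
EXP = rng.integers(0, 31, size=n)         # years of work experience, 0..30
u = rng.normal(0.0, 12.0, size=n)         # disturbance term

EARNINGS = b1 + b2 * S + b3 * EXP + u     # each observation deviates from the plane by u
print(EARNINGS[:5])
```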

Page 10

However, we do assume that the effects of S and EXP on EARNINGS are additive. The impact of a difference in S on EARNINGS is not affected by the value of EXP, or vice versa.

[Diagram: as on the previous slide.]

Page 11

The regression coefficients are derived using the same least squares principle used in simple regression analysis. The fitted value of Y in observation i depends on our choice of b1, b2, and b3.

True model:   Y = b1 + b2X2 + b3X3 + u
Fitted model: Ŷ = b1 + b2X2 + b3X3

Page 12

The residual ei in observation i is the difference between the actual and fitted values of Y:

ei = Yi - Ŷi = Yi - b1 - b2X2i - b3X3i

Page 13

We define RSS, the sum of the squares of the residuals, and choose b1, b2, and b3 so as to minimize it:

RSS = Σ ei² = Σ (Yi - b1 - b2X2i - b3X3i)²
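As a quick illustration of the definitions of ei and RSS, the following Python sketch evaluates them for an arbitrary trial choice of b1, b2, and b3 on made-up data; it is not the author's code.

```python
# Residuals and RSS for a candidate (b1, b2, b3); data are hypothetical.
import numpy as np

Y  = np.array([12.0, 25.0, 17.0, 40.0, 9.0])    # hypothetical EARNINGS values
X2 = np.array([10.0, 16.0, 12.0, 18.0, 8.0])    # hypothetical S values
X3 = np.array([ 5.0, 10.0, 20.0, 15.0, 2.0])    # hypothetical EXP values

b1, b2, b3 = -20.0, 2.5, 0.5                    # an arbitrary trial choice of coefficients
fitted = b1 + b2 * X2 + b3 * X3                 # fitted values: b1 + b2*X2i + b3*X3i
e = Y - fitted                                  # residuals: ei = Yi - fitted
RSS = np.sum(e ** 2)                            # RSS = sum of squared residuals
print(RSS)
```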

Page 14

First, we expand RSS as shown:

RSS = Σ (Yi - b1 - b2X2i - b3X3i)²

    = Σ Yi² + n·b1² + b2² Σ X2i² + b3² Σ X3i²
      - 2b1 Σ Yi - 2b2 Σ X2iYi - 2b3 Σ X3iYi
      + 2b1b2 Σ X2i + 2b1b3 Σ X3i + 2b2b3 Σ X2iX3i

Page 15

Then we use the first-order conditions for minimizing it:

∂RSS/∂b1 = 0        ∂RSS/∂b2 = 0        ∂RSS/∂b3 = 0

Page 16

b1 = Ȳ - b2·X̄2 - b3·X̄3

b2 = [ Σ(X2i - X̄2)(Yi - Ȳ) · Σ(X3i - X̄3)² - Σ(X3i - X̄3)(Yi - Ȳ) · Σ(X2i - X̄2)(X3i - X̄3) ]
     / [ Σ(X2i - X̄2)² · Σ(X3i - X̄3)² - ( Σ(X2i - X̄2)(X3i - X̄3) )² ]

We thus obtain three equations in three unknowns. Solving for b1, b2, and b3, we obtain the expressions shown above. (The expression for b3 is the same as that for b2, with the subscripts 2 and 3 interchanged everywhere.)
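The closed-form expressions above can be checked numerically. The Python sketch below computes b1, b2, and b3 from the formulas on simulated data (the data, coefficient values, and seed are arbitrary assumptions) and compares the result with NumPy's general least squares routine.

```python
# Checking the two-regressor least squares formulas against numpy.linalg.lstsq.
import numpy as np

rng = np.random.default_rng(1)
n = 200
X2 = rng.uniform(6, 20, n)                     # hypothetical schooling values
X3 = rng.uniform(0, 30, n)                     # hypothetical experience values
Y = -26.0 + 2.7 * X2 + 0.6 * X3 + rng.normal(0, 12, n)

d2, d3, dY = X2 - X2.mean(), X3 - X3.mean(), Y - Y.mean()
S22, S33, S23 = np.sum(d2**2), np.sum(d3**2), np.sum(d2*d3)
S2Y, S3Y = np.sum(d2*dY), np.sum(d3*dY)

b2 = (S2Y * S33 - S3Y * S23) / (S22 * S33 - S23**2)   # expression for b2 above
b3 = (S3Y * S22 - S2Y * S23) / (S22 * S33 - S23**2)   # subscripts 2 and 3 interchanged
b1 = Y.mean() - b2 * X2.mean() - b3 * X3.mean()       # b1 = mean(Y) - b2*mean(X2) - b3*mean(X3)

# Cross-check with NumPy's least squares routine.
X = np.column_stack([np.ones(n), X2, X3])
print(np.allclose([b1, b2, b3], np.linalg.lstsq(X, Y, rcond=None)[0]))
```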

Page 17

The expression for b1 is a straightforward extension of the corresponding expression in simple regression analysis:

b1 = Ȳ - b2·X̄2 - b3·X̄3

Page 18

However, the expressions for the slope coefficients are considerably more complex than the expression for the slope coefficient in simple regression analysis.

Page 19

For the general case when there are many explanatory variables, ordinary algebra is inadequate. It is necessary to switch to matrix algebra.
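The slides do not spell out the matrix formulation, but for reference the standard result is that the vector of least squares coefficients is b = (X'X)^(-1) X'y, where X contains a column of ones for the intercept. The Python sketch below illustrates this on simulated data; it is not taken from the textbook.

```python
# OLS via the matrix normal equations (X'X) b = X'y; data simulated for illustration.
import numpy as np

rng = np.random.default_rng(2)
n = 200
X2 = rng.uniform(6, 20, n)
X3 = rng.uniform(0, 30, n)
y = -26.0 + 2.7 * X2 + 0.6 * X3 + rng.normal(0, 12, n)

X = np.column_stack([np.ones(n), X2, X3])        # columns: constant, X2, X3
b = np.linalg.solve(X.T @ X, X.T @ y)            # solves (X'X) b = X'y
print(b)                                         # [b1, b2, b3]
```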

Page 20

Here is the regression output for the earnings function using Data Set 21.

. reg EARNINGS S EXP

      Source |       SS       df       MS              Number of obs =     540
-------------+------------------------------           F(  2,   537) =   67.54
       Model |  22513.6473     2  11256.8237           Prob > F      =  0.0000
    Residual |  89496.5838   537  166.660305           R-squared     =  0.2010
-------------+------------------------------           Adj R-squared =  0.1980
       Total |  112010.231   539  207.811189           Root MSE      =   12.91

------------------------------------------------------------------------------
    EARNINGS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           S |   2.678125   .2336497    11.46   0.000     2.219146    3.137105
         EXP |   .5624326   .1285136     4.38   0.000     .3099816    .8148837
       _cons |  -26.48501    4.27251    -6.20   0.000    -34.87789   -18.09213
------------------------------------------------------------------------------

Page 21

EARNINGS-hat = -26.49 + 2.68 S + 0.56 EXP

It indicates that earnings increase by $2.68 for every extra year of schooling and by $0.56 for every extra year of work experience.

Page 22

Literally, the intercept indicates that an individual who had no schooling or work experience would have hourly earnings of -$26.49.

Page 23

Obviously, this is impossible. The lowest value of S in the sample was 6. We have obtained a nonsense estimate because we have extrapolated too far from the data range.
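To make the extrapolation point concrete, the short Python sketch below evaluates the fitted equation at S = 0, EXP = 0 and at a point inside the data range; the in-sample values of S and EXP are chosen arbitrarily for illustration.

```python
# Evaluating the fitted earnings function from the regression output above.
def fitted_earnings(S: float, EXP: float) -> float:
    """Fitted hourly earnings: EARNINGS-hat = -26.49 + 2.68*S + 0.56*EXP."""
    return -26.49 + 2.68 * S + 0.56 * EXP

print(fitted_earnings(0, 0))    # about -26.49: the nonsense extrapolation
print(fitted_earnings(13, 15))  # about 16.75: a plausible value inside the data range
```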

Page 24

Copyright Christopher Dougherty 2012.

These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.

The content of this slideshow comes from Section 3.2 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre, http://www.oup.com/uk/orc/bin/9780199567089/.

Individuals studying econometrics on their own who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics, http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx, or the University of London International Programmes distance learning course EC2020 Elements of Econometrics, www.londoninternational.ac.uk/lse.

2012.10.28