Introductory Econometrics - Session 5 - The linear model
TRANSCRIPT
Sections: The simple regression model, The multiple regression model, Inference
Introductory Econometrics
Session 5 - The linear model
Roland Rathelot
Sciences Po
July 2011
Rathelot
Introductory Econometrics
Multivariate econometrics

- Outcome
- Covariate(s)
- Model

In the simple regression: 1 outcome and 1 regressor
Assumption: random sample

- The population of interest should be defined
- $(y_i, x_i)_{i=1\dots n}$ are assumed to be iid
- Note that here $y_i$ and $x_i$ are random variables
Assumption: A linear model

$$y = \alpha + \beta x + u$$

- $y$ is the outcome (explained variable)
- $x$ is the explanatory variable (covariate)
- $\alpha$ is the constant (intercept)
- $\beta$ is the coefficient on $x$ (slope)
- $u$ is the error
Correlation and causality

- Comovement of $\Delta x$ and $\Delta y$
- Interpreting $\beta$ in a causal sense: when is it possible?
- In a causal framework, $u$ are the unobserved determinants of $y$
Assumption: Zero expectation of the error

$$E(u) = 0$$

When an intercept is introduced in the model, this assumption comes at no cost
Assumption: Zero conditional expectation

$$E(u \mid x) = 0$$

- This means that the error is not correlated with any function of $x$
- Crucial assumption
- As a consequence: $E(y \mid x) = \alpha + \beta x$
What is the right estimator?

- Based on these assumptions, how can we estimate $\beta$ (and $\alpha$)?
- By the method of moments
- By least squares
The OLS estimator

$$\hat\beta = \frac{\sum_{i=1}^n (y_i - \bar y)(x_i - \bar x)}{\sum_{i=1}^n (x_i - \bar x)^2}$$

$$\hat\alpha = \bar y - \hat\beta \bar x$$
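The closed-form formulas above can be sketched numerically. A minimal NumPy example (sample size, seed, and true coefficients below are illustrative assumptions, not from the slides):

```python
import numpy as np

# Illustrative synthetic sample: true alpha = 1, beta = 2
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
u = rng.normal(size=n)
y = 1.0 + 2.0 * x + u

# OLS estimators from the closed-form expressions
beta_hat = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

print(alpha_hat, beta_hat)  # close to the true values 1 and 2
```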
Algebraic properties of the OLS estimator

Let's define

- the residual $\hat u_i = y_i - \hat\alpha - \hat\beta x_i$
- the predicted value $\hat y_i = \hat\alpha + \hat\beta x_i$

Then

1. $\sum_i \hat u_i = 0$
2. $\sum_i x_i \hat u_i = 0$
3. $(\bar x, \bar y)$ is on the regression line
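These three properties hold exactly by construction, not just approximately. A quick check on arbitrary illustrative data:

```python
import numpy as np

# Arbitrary illustrative data
rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 0.5 + 1.5 * x + rng.normal(size=50)

beta_hat = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()
u_hat = y - alpha_hat - beta_hat * x  # residuals
y_hat = alpha_hat + beta_hat * x      # predicted values

print(u_hat.sum())                                 # ~0 (property 1)
print((x * u_hat).sum())                           # ~0 (property 2)
print(alpha_hat + beta_hat * x.mean() - y.mean())  # ~0 (property 3)
```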
Decomposition of the variance

We define:

$$SST = \sum_i (y_i - \bar y)^2, \quad SSE = \sum_i (\hat y_i - \bar y)^2, \quad SSR = \sum_i \hat u_i^2$$

Then $SST = SSE + SSR$
Goodness of fit

The R-squared is usually used to assess the goodness of fit of the linear regression

$$R^2 = SSE/SST = 1 - SSR/SST$$

It is the share of the explained variance in the total variance
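The decomposition and the two expressions for $R^2$ can be checked numerically (the data below are illustrative):

```python
import numpy as np

# Illustrative sample
rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 2.0 - 1.0 * x + rng.normal(size=100)

beta_hat = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()
y_hat = alpha_hat + beta_hat * x
u_hat = y - y_hat

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
ssr = np.sum(u_hat ** 2)               # residual sum of squares

r2 = sse / sst
print(r2, 1 - ssr / sst)  # the two expressions coincide
```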
Assumptions: Summary

- (A1) Linear model
- (A2) Random sample in the population
- (A3) Variability of the covariate in the sample
- (A4) Zero conditional expectation of the error
Statistical properties

Under assumptions (A1) to (A4), two important properties hold for the OLS estimator $(\hat\alpha, \hat\beta)$:

- The OLS estimator is consistent
- The OLS estimator is unbiased
An additional assumption

- So far, nothing about precision
- (A5) Homoskedasticity: $V(u \mid x) = \sigma^2$
- This assumption means that the variance of the error does not depend on the value of the covariate

Assumptions (A1)-(A5) are sometimes called the Gauss-Markov assumptions
The variance of the OLS estimator

Conditional on the sample $\{x_1 \dots x_n\}$, the variance of the OLS estimator is

$$V(\hat\beta) = \frac{\sigma^2}{\sum_i (x_i - \bar x)^2}$$

$$V(\hat\alpha) = \frac{\sum_i x_i^2}{n} \cdot \frac{\sigma^2}{\sum_i (x_i - \bar x)^2}$$
Estimating the variance of $\hat\beta$

- What is unknown and needed is $\sigma^2$
- The usual estimator is $\hat\sigma^2 = \frac{\sum_i \hat u_i^2}{n-2}$
- This estimator is unbiased
Regression through the origin

It is possible to estimate the model with no intercept

$$y = \beta x + u$$

In this case, $\hat\beta = \frac{\sum_i x_i y_i}{\sum_i x_i^2}$

When should this estimator be used instead of the one with an intercept?

- When there is a strong a priori or theoretical reason to believe that $\alpha = 0$
- When variables have been centered before the regression: $\tilde x_i = x_i - \bar x$
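The second bullet can be verified directly: after centering both variables, the no-intercept formula reproduces the slope of the regression with an intercept. A sketch on illustrative data:

```python
import numpy as np

# Illustrative data with a nonzero mean for x
rng = np.random.default_rng(4)
x = rng.normal(loc=3.0, size=100)
y = 1.0 + 2.0 * x + rng.normal(size=100)

# Slope from the regression with an intercept
beta_hat = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)

# Regression through the origin on centered variables: same slope
xc, yc = x - x.mean(), y - y.mean()
beta_origin = np.sum(yc * xc) / np.sum(xc ** 2)

print(beta_hat, beta_origin)  # identical
```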
The multiple regression model

- Multiple: not just one but several covariates
- $k$ covariates: $1, x_1, x_2, \dots, x_{k-1}$
- Ceteris paribus: principle and examples
Scalar notations

The linear model now writes:

$$y_i = \beta_0 + \beta_1 x_{1,i} + \dots + \beta_{k-1} x_{k-1,i} + u_i$$

where, for the sake of clarity, the subscript $i$ is usually omitted
Matrix notations

$$y = X\beta + u$$

- $y$ is the vector $(y_1 \dots y_n)$, of length $n$
- $X$ is a matrix with $k$ columns and $n$ rows
- $\beta$ is a vector of length $k$
- $u$ is a vector of length $n$
The Gauss-Markov assumptions in the multiple case

- (A1) Linear model
- (A2) Random sample in the population
- (A3) No collinearity between covariates
- (A4) Zero conditional expectation of the error
- (A5) Homoskedasticity: $Var(u_i \mid x_i = x) = \sigma^2$
Obtaining the OLS

Just as before, both approaches:

- Least squares
- Method of moments

provide the same expression for the OLS estimator

$$\hat\beta = (\hat\beta_0, \hat\beta_1, \dots, \hat\beta_{k-1})$$
A compact expression

$$\hat\beta = (X'X)^{-1}(X'y)$$

- Under A1 to A4, the estimator is consistent and unbiased
- Under A1 to A5, its variance is equal to $(X'X)^{-1}\sigma^2$
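The compact formula translates directly to code. A sketch on illustrative data; `np.linalg.solve` is used rather than an explicit inverse for numerical stability, but it computes the same $(X'X)^{-1} X'y$:

```python
import numpy as np

# Illustrative design: a constant plus 2 covariates, n = 300
rng = np.random.default_rng(5)
n, k = 300, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(size=n)

# OLS: solve (X'X) beta = X'y, i.e. beta-hat = (X'X)^{-1} X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to beta_true
```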
Residuals and predicted values

The residuals and the predicted values are defined as before

$$\hat u = y - X\hat\beta, \quad \hat y = X\hat\beta$$
Goodness of fit

The R-squared is still used to assess the model's goodness of fit

$$R^2 = SSE/SST = 1 - SSR/SST$$
Estimating the variance of the OLS estimator

- As in the simple case, $\sigma^2$ has to be estimated
- An unbiased estimator for $\sigma^2$ is:

$$\hat\sigma^2 = \frac{1}{n - k} \sum_i \hat u_i^2$$
Projections

To interpret the meaning of OLS estimators, it is useful to introduce:

$$P_X = X(X'X)^{-1}X'$$
$$M_X = I - X(X'X)^{-1}X'$$

- $P_X$ and $M_X$ are symmetric
- $P_X$ and $M_X$ are projectors

so that $\hat y = P_X y$ and $\hat u = M_X y$
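A numerical sketch of these projector properties (symmetry, idempotence, and the fitted-value/residual identities) on illustrative data:

```python
import numpy as np

# Illustrative design matrix and outcome
rng = np.random.default_rng(6)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
P = X @ XtX_inv @ X.T   # projects onto the column space of X
M = np.eye(n) - P       # projects onto its orthogonal complement

beta_hat = XtX_inv @ (X.T @ y)
print(np.allclose(P, P.T), np.allclose(P @ P, P))        # symmetric, idempotent
print(np.allclose(P @ y, X @ beta_hat))                  # y-hat = P_X y
print(np.allclose(M @ y, y - X @ beta_hat))              # u-hat = M_X y
```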
The Frisch-Waugh theorem

Split the covariates into two groups: $X = (X_1, X_2)$, $\beta = (\beta_1, \beta_2)$

- First regress $y$ on $X_1$ and $X_2$ on $X_1$, and keep the residuals $M_{X_1} y$ and $M_{X_1} X_2$
- Now regress $M_{X_1} y$ on $M_{X_1} X_2$: the obtained estimator is equal to the OLS estimator $\hat\beta_2$
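The two-step procedure can be checked numerically against the full regression (the data and coefficients below are illustrative):

```python
import numpy as np

# Illustrative data: X1 = constant + one covariate, X2 = one covariate
rng = np.random.default_rng(7)
n = 200
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = rng.normal(size=(n, 1)) + 0.5 * X1[:, [1]]  # correlated with X1
X = np.hstack([X1, X2])
y = X @ np.array([1.0, -1.0, 2.0]) + rng.normal(size=n)

# Full regression coefficient on X2
beta_full = np.linalg.solve(X.T @ X, X.T @ y)

# Frisch-Waugh: partial X1 out of both y and X2, then regress residuals
M1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
y_t, X2_t = M1 @ y, M1 @ X2
beta2_fw = np.linalg.solve(X2_t.T @ X2_t, X2_t.T @ y_t)

print(beta_full[2], beta2_fw[0])  # equal up to floating-point error
```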
Frisch-Waugh and ceteris paribus

- Suppose we are especially interested in $\beta_j$, the coefficient on $x_j$
- First regress $x_j$ on all the other covariates and keep the residual $M_{-j} x_j$
- $\hat\beta_j$ may be obtained by the regression of $y$ on $M_{-j} x_j$
Frisch-Waugh and the variance of $\hat\beta_j$

$$Var(\hat\beta_j) = \frac{\sigma^2}{(1 - R_j^2)\, SST_j}$$

- $SST_j = \sum_i (x_{ij} - \bar x_j)^2$
- $R_j^2$ is the R-squared from regressing $x_j$ on all other independent variables
Misspecification

Suppose the true model is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$

- $\hat\beta_1$ is the OLS estimator of the coefficient on $x_1$ in the regression of $y$ on $x_1$ and $x_2$
- $\tilde\beta_1$ is the OLS estimator from the regression of $y$ on $x_1$ alone
Misspecification (2)

- $\tilde\beta_1$ is biased iff:
  1. $\beta_2 \neq 0$
  2. $Cov(x_1, x_2) \neq 0$
- In terms of variance, $\tilde\beta_1$ is always more precise than $\hat\beta_1$:

$$Var(\tilde\beta_1) \leq Var(\hat\beta_1)$$
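A simulation sketch of omitted-variable bias: with $\beta_2 \neq 0$ and $Cov(x_1, x_2) \neq 0$, the short regression converges to $\beta_1 + \beta_2\, Cov(x_1, x_2)/Var(x_1)$ rather than $\beta_1$. All numbers below are illustrative assumptions:

```python
import numpy as np

# Large illustrative sample so the bias dominates the sampling noise
rng = np.random.default_rng(8)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)                  # Cov(x1, x2) = 0.8
y = 1.0 + 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)  # beta2 = 0.5 != 0

# Long regression (both covariates): roughly unbiased for beta1 = 1
X = np.column_stack([np.ones(n), x1, x2])
beta_long = np.linalg.solve(X.T @ X, X.T @ y)

# Short regression (x1 only): converges to 1 + 0.5 * 0.8 = 1.4
beta_short = np.sum((y - y.mean()) * (x1 - x1.mean())) / np.sum((x1 - x1.mean()) ** 2)
print(beta_long[1], beta_short)
```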
BLUE

Among the linear estimators $\tilde\beta_j = \sum_{i=1}^n w_i y_i$ that are unbiased, the OLS estimator is the one with the smallest variance.

It is said to be the Best Linear Unbiased Estimator
Normality

- Even under the Gauss-Markov assumptions, the distribution of $\hat\beta$ may still have any form
- To be able to make inference, we need to add a normality assumption

(A6) $u$ is independent from $x_1 \dots x_k$ and is distributed as $N(0, \sigma^2)$
Distribution of the estimator

Under A1 to A6, the OLS estimator is distributed as:

$$\hat\beta_j \sim N(\beta_j, Var(\hat\beta_j))$$

and

$$\frac{\hat\beta_j - \beta_j}{\sqrt{Var(\hat\beta_j)}} \sim N(0, 1)$$
The t-stat

- The distribution of its empirical counterpart is a Student with $n - k$ degrees of freedom:

$$\frac{\hat\beta_j - \beta_j}{\sqrt{\hat V(\hat\beta_j)}} \sim t_{n-k}$$

- When $n$ is not too small, this distribution is very close to a standard normal
- To test the significance of the coefficient $\beta_j$, the t-statistic is usually used:

$$t = \frac{\hat\beta_j}{\sqrt{\hat V(\hat\beta_j)}}$$
Testing any linear restriction

This test may be used in any case where there is one linear restriction:

- the equality of a coefficient to 0
- the equality of a coefficient to any number
- the equality of two coefficients
- any linear relationship between two or more coefficients
Testing more restrictions

To test several linear restrictions jointly, the Fisher (F) test is used
What if normality is not likely?

- In many cases, the normality of the errors is a strong assumption
- How can we do inference without this assumption?
- We replace A6 by another assumption:
- (A6') $n$ is sufficiently large so that we can use the asymptotic properties of the OLS estimator
Consistency of the OLS estimator

- The OLS estimator is consistent:

$$\operatorname{plim} \hat\beta = \beta$$
Asymptotic normality

Using the Central Limit Theorem, under the Gauss-Markov assumptions:

- $\sqrt{n}(\hat\beta - \beta) \rightsquigarrow N(0, \sigma^2 A^{-1})$
- where $A = \operatorname{plim}(X'X)/n$
- $\hat\sigma^2$ is a consistent estimator of $\sigma^2$
- Finally, $\dfrac{\hat\beta_j - \beta_j}{\sqrt{\hat V(\hat\beta_j)}} \rightsquigarrow N(0, 1)$
Asymptotic inference

When $n$ is large, one may, without the normality assumption A6:

- use the t-test for one linear restriction
- use the Fisher test for several linear restrictions
The asymptotic behavior of the variance

We already know that

$$\widehat{Var}(\hat\beta_j) = \frac{\hat\sigma^2}{(1 - R_j^2)\, SST_j}$$

- $\hat\sigma^2$ is consistent for $\sigma^2$
- $R_j^2$ converges to some value between 0 and 1
- $SST_j/n$ converges to $Var(x_j)$

So the variance is $O(1/n)$ and the standard error $O(1/\sqrt{n})$