TRANSCRIPT
Course Econometrics I
2. Multiple Regression Analysis: Further Issues
Martin Halla
Johannes Kepler University of Linz, Department of Economics
Last update: April 1, 2014
Effects of data scaling on OLS Statistics
Data scaling means changing the units of measurement of the dependent and the independent variables.
- Due to scaling, the estimated coefficients (∂y/∂x), standard errors, test statistics, etc. change in a way that preserves all measured effects and testing outcomes.
- Linear transformations do not change the fit of the regression.
- Data scaling is done for cosmetic purposes:
  - To improve interpretability
Effects of data scaling on OLS Statistics – An example I
Let’s study the effects of data scaling based on an example:
ˆbwght = β0 + β1 cigs + β2 faminc   (1)
bwght . . . child birth weight in ounces
cigs . . . no. of cigarettes smoked by the mother per day
faminc . . . annual family income in USD 1,000
Results are displayed in column (1) on the next slide.
Effects of data scaling on OLS statistics – An example II
>> see do-file 2-1.do <<
Effects of data scaling on OLS Statistics – An example III
Now suppose that we decide to measure birth weight in pounds, rather than in ounces:
ˆbwght/16 = (β0/16) + (β1/16) cigs + (β2/16) faminc   (2)
- New coefficients are the old coefficients divided by 16.
- New standard errors are the old ones divided by 16.
- t-statistics are unchanged.
- R-squared is unchanged.
- SSR has to be divided by 16²; the SER by 16.
Results are displayed in column (2) on the previous slide.
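This rescaling can be verified directly in Stata; a minimal sketch, assuming the BWGHT data used in do-file 2-1.do are in memory (variables bwght, cigs, faminc as defined above):
* measure birth weight in pounds instead of ounces (1 pound = 16 ounces)
gen bwghtlbs = bwght/16
* original specification (1)
reg bwght cigs faminc
* rescaled LHS, specification (2): coefficients and s.e. are the old ones divided by 16;
* t-statistics and R-squared are unchanged
reg bwghtlbs cigs faminc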
Effects of data scaling on OLS Statistics – An example IV
Now suppose that, compared to (1), we change the measurement of cigarettes to packs = cigs/20:
ˆbwght = β0 + (20β1)(cigs/20) + β2 faminc
       = β0 + (20β1) packs + β2 faminc   (3)
- β0 and β2 are unchanged.
- The coefficient and standard error on packs are 20 times as large as those on cigs.
- t-statistics, R-squared, SSR and SER are unchanged.
Results are displayed in column (3) on the penultimate slide.
Effects of data scaling on OLS Statistics – Log. forms
- If the LHS var appears in log form, changing its units of measurement has no impact on the slope coefficients.
- This follows from the fact that log(c1 · yi) = log(c1) + log(yi) for any c1 > 0.
- The new intercept will be log(c1) + β0.
- The same holds for RHS vars that appear in log form: only the intercept changes.
Scaling in the birth weight model
--------------------------------------------------------------------------------
    Variable |       (1)        (2)        (3)         (4)            (5)
             |     bwght   bwghtlbs      bwght  log(bwght)  log(bwghtlbs)
-------------+------------------------------------------------------------------
        cigs |   -0.4634    -0.0290               -0.0040        -0.0040
             |    0.0916     0.0057                0.0009         0.0009
      faminc |    0.0928     0.0058     0.0928     0.0008         0.0008
             |    0.0292     0.0018     0.0292     0.0003         0.0003
       packs |                         -9.2682
             |                          1.8315
       _cons |  116.9741     7.3109   116.9741     4.7440         1.9714
             |    1.0490     0.0656     1.0490     0.0098         0.0098
-------------+------------------------------------------------------------------
          r2 |    0.0298     0.0298     0.0298     0.0265         0.0265
         rss |  5.57e+05  2177.6778   5.57e+05    49.0862        49.0862
        rmse |   20.0628     1.2539    20.0628     0.1883         0.1883
--------------------------------------------------------------------------------
legend: b/se
Beta coefficients – I
A key var may be measured on a scale that is hard to interpret.
- Often the case in other disciplines, e.g., test scores, indices, or responses to attitudinal questions
  - PISA test, trust, or political freedom
- Instead of asking by how much the LHS var changes if the (test) score were 10 points higher, we can ask what happens if the score is one standard deviation (sd) higher.
- The sample sd of all variables can easily be obtained.
- Therefore, we standardize the variables (see Appendix C):
  - Subtract off the sample mean and divide by the sample sd.
  - This generates vars with a mean of zero and an sd of one.
  - These are called z-scores.
Beta coefficients – II
We want to standardize all variables in the following original model:
yi = β0 + β1xi1 + β2xi2 + . . .+ βkxik + ui
(i) subtract means of all variables:
yi − ȳ = β1(xi1 − x̄1) + β2(xi2 − x̄2) + . . . + βk(xik − x̄k) + (ui − 0)
(ii) divide each variable by its sample sd (σy and σk)
(yi − ȳ)/σy = (σ1/σy)β1[(xi1 − x̄1)/σ1] + . . . + (σk/σy)βk[(xik − x̄k)/σk] + (ui/σy)
- Since we divide the LHS by σy → the coefficients are divided by σy.
- Since we divide each RHS var by its σk → its coefficient is multiplied by σk.
- The constant is zero by construction.
Beta coefficients – III
The last equation (without i subscripts) can be re-written as follows:
zy = b1z1 + b2z2 + . . .+ bkzk + error (4)
where zy denotes the z-score of y, z1 the z-score of x1, and so on. The new coefficients are
bj = (σj/σy) βj   for j = 1, . . . , k.   (5)
These bj are either called beta coefficients or standardized coeffs.
- Interpretation: if x1 increases by one sd, then y changes by b1 sds.
- Since we measure all variables in sds, the scale is irrelevant.
- The most important RHS var can easily be identified.
- See Example 6.1 and >> do-file 2-2.do <<.
Calculation of beta coefficients in Stata
. reg price nox crime rooms dist stratio, beta
Source | SS df MS Number of obs = 506
-------------+------------------------------ F( 5, 500) = 174.47
Model | 2.7223e+10 5 5.4445e+09 Prob > F = 0.0000
Residual | 1.5603e+10 500 31205611.6 R-squared = 0.6357
-------------+------------------------------ Adj R-squared = 0.6320
Total | 4.2826e+10 505 84803032 Root MSE = 5586.2
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| Beta
-------------+----------------------------------------------------------------
nox | -2706.433 354.0869 -7.64 0.000 -.340446
crime | -153.601 32.92883 -4.66 0.000 -.1432828
rooms | 6735.498 393.6037 17.11 0.000 .5138878
dist | -1026.806 188.1079 -5.46 0.000 -.2348385
stratio | -1149.204 127.4287 -9.02 0.000 -.2702799
_cons | 20871.13 5054.599 4.13 0.000 .
------------------------------------------------------------------------------
>> do-file 2-2.do <<
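The beta coefficients in the last column can also be reproduced by standardizing all variables by hand; a minimal sketch, assuming the HPRICE2 data used above are in memory:
* z-scores: subtract the sample mean and divide by the sample sd
egen zprice = std(price)
egen znox = std(nox)
egen zcrime = std(crime)
egen zrooms = std(rooms)
egen zdist = std(dist)
egen zstratio = std(stratio)
* the slopes equal the Beta column above; the intercept is zero by construction
reg zprice znox zcrime zrooms zdist zstratio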
More on using logarithmic functional forms – I
Summary of functional forms involving logarithms
Model         Dependent Var.   Independent Var.   Interpretation of β1
Level-level   y                x                  Δy = β1·Δx
Level-log     y                log(x)             Δy = (β1/100)·%Δx
Log-level     log(y)           x                  %Δy = (100β1)·Δx
Log-log       log(y)           log(x)             %Δy = β1·%Δx
More on using logarithmic functional forms – II
Let’s consider the following equation
log(price) = β0 + β1 log(nox) + β2 rooms + u.   (6)
- β1 is the elasticity of price with respect to nox (the NOx pollution).
- β2 is the change in log(price) when Δrooms = 1.
- β2 is also called a semi-elasticity.
- (100β2)Δx2 is the approximate percentage change in price.
- 100[exp(β2Δx2) − 1] is the exact percentage change in price.
More on using logarithmic functional forms – III
When estimated using the data in HPRICE2, we obtain
ˆlog(price) = 9.23 − 0.718 log(nox) + 0.306 rooms
n = 506, R2 = 0.514
- When nox increases by 1%, price falls by 0.718%, holding rooms fixed.
- When rooms increases by one, price increases by approximately 100(0.306) = 30.6%.
- The exact percentage change in price is 100[exp(0.306) − 1] = 35.8%.
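The approximate and the exact percentage change can be verified directly; a sketch, assuming the HPRICE2 data are in memory and the logs are generated as below:
gen lprice = log(price)
gen lnox = log(nox)
reg lprice lnox rooms
* approximate percentage change in price for one additional room
display 100*_b[rooms]
* exact percentage change in price for one additional room
display 100*(exp(_b[rooms]) - 1)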
More on using logarithmic functional forms – IV
Why do so many econometric models utilize logs?
- Log models often more closely satisfy our assumptions.
- Many econ vars are constrained to be positive.
- Taking logs compresses extreme values and curtails the effects of outliers.
- Ratios are often left in levels:
  - They could be expressed in logs.
  - Something like an unemployment rate already has a nice percentage interpretation.
  - Be careful to distinguish between a 0.01 change and a one-unit change.
Models with quadratics – I
- The following model is linear in parameters:
y = β0 + β1x1 + β2x1² + β3x3 + . . . + βkxk + ε.   (7)
- However, the squared term x1² allows more flexibility in the modeling of the relationship between x1 and y.
- β1 and β2 have to be interpreted jointly; if x1 changes, x1² changes too.
- The marginal effect of x1 on y is given by
∂y/∂x1 = β1 + 2β2x1.   (8)
- The effect of x1 on y therefore depends on the level of x1.
Models with quadratics – II
Different combinations of β1 and β2 result in different functional forms:
- β1 < 0, β2 > 0: U-shaped relationship
- β1 > 0, β2 < 0: inverted U-shaped relationship
- β1 > 0, β2 > 0: y increases quadratically in x1
- β1 < 0, β2 < 0: y decreases quadratically in x1
The extremum/turning point is at the value of x1 given by:
∂y/∂x1 = β1 + 2β2x1 = 0  ⇒  x1* = −β1/(2β2).   (9)
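A sketch of how the marginal effect (8) and the turning point (9) can be computed after estimating a quadratic; the variables y and x1 are hypothetical placeholders:
gen x1sq = x1^2
reg y x1 x1sq
* marginal effect of x1 evaluated at, e.g., x1 = 10: beta1 + 2*beta2*10
lincom x1 + 20*x1sq
* turning point -beta1/(2*beta2), with a delta-method standard error
nlcom -_b[x1]/(2*_b[x1sq])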
Models with quadratics – An example
See equation (6.12).
Models with quadratics – A further example
See Example 6.2.
Models with higher order polynomials
Higher order polynomials:
y = β0 + β1x1 + β2x1² + β3x1³ + ε.   (10)
The marginal effect of x1 on y is given by:
∂y/∂x1 = β1 + 2β2x1 + 3β3x1².   (11)
Not as commonly found in empirical work.
Models with interaction terms
Interaction terms are the product of two (or more) RHS variables:
- This technique allows for nonlinearities.
- For instance, consider the determinants of housing prices
price = β0 + β1 sqrft + β2 bdrms + β3 sqrft·bdrms + β4 bthrms + u,
where the partial effect of bdrms on price is given by
∂price/∂bdrms = β2 + β3·sqrft.
- That means that if β3 > 0, an additional bedroom yields a higher increase in housing price for larger houses.
- We must evaluate this effect at interesting values of sqrft.
- The respective standard errors can easily be obtained by reparameterizing the model.
- The same applies to the partial effect of sqrft on price.
Interaction terms – Reparameterization (general case) I
Let's consider the following model including an interaction term:
y = β0 + β1x1 + β2x2 + β3x1x2 + u.
Here β2 gives the partial effect of x2 on y when x1 = 0.
Often, this is not of interest. Therefore, we reparameterize the model:
y = α0 + α1x1 + δ2x2 + β3(x1 − µ1)x2 + u,
where µ1 is the population mean of x1. After rearranging we get:
y = α0 + α1x1 + δ2x2 + β3x1x2 − β3x2µ1 + u.
We can check (see next slide) that δ2 is the partial effect of x2 on y at the mean value of x1:
δ2 = β2 + β3µ1.
Interaction terms – Reparameterization (general case) II
Original model:
y = β0 + β1x1 + β2x2 + β3x1x2 + u
∂y/∂x2 = β2 + β3x1
Reparameterized model:
y = α0 + α1x1 + δ2x2 + β3(x1 − µ1)x2 + u
y = α0 + α1x1 + δ2x2 + β3x1x2 − β3x2µ1 + u
∂y/∂x2 = δ2 + β3x1 − β3µ1
Equating the two partial effects:
β2 + β3x1 = δ2 + β3x1 − β3µ1  ⇒  δ2 = β2 + β3µ1
Interaction terms – Reparameterization (Example 6.3) I
gen priGPA2 = priGPA^2
gen ACT2 = ACT^2
gen interaction = priGPA*atndrte
reg stndfnl atndrte priGPA ACT priGPA2 ACT2 interaction
Source | SS df MS Number of obs = 680
-------------+------------------------------ F( 6, 673) = 33.25
Model | 152.001001 6 25.3335002 Prob > F = 0.0000
Residual | 512.76244 673 .761905557 R-squared = 0.2287
-------------+------------------------------ Adj R-squared = 0.2218
Total | 664.763441 679 .97903305 Root MSE = .87287
------------------------------------------------------------------------------
stndfnl | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
atndrte | -.0067129 .0102321 -0.66 0.512 -.0268035 .0133777
priGPA | -1.62854 .4810025 -3.39 0.001 -2.572986 -.6840938
ACT | -.1280394 .098492 -1.30 0.194 -.3214279 .0653492
priGPA2 | .2959046 .1010495 2.93 0.004 .0974945 .4943147
ACT2 | .0045334 .0021764 2.08 0.038 .00026 .0088068
interaction | .0055859 .0043174 1.29 0.196 -.0028913 .0140631
_cons | 2.050293 1.360319 1.51 0.132 -.6206864 4.721272
------------------------------------------------------------------------------
test atndrte interaction
( 1) atndrte = 0
( 2) interaction = 0
F( 2, 673) = 4.32
Prob > F = 0.0137
Interaction terms – Reparameterization (Example 6.3) II
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
priGPA | 680 2.586775 .5447141 .857 3.93
display _b[atndrte] + _b[interaction]*2.59
0.00775457
* This is the estimated effect of atndrte on stndfnl at the mean of priGPA
* A 10 percentage point increase in atndrte increases stndfnl by 0.078 s.d. from the mean final exam score.
* Is this effect statistically significant from zero?
gen priGPA_mean = priGPA-2.59
gen interaction2 = priGPA_mean* atndrte
reg stndfnl atndrte priGPA ACT priGPA2 ACT2 interaction2
------------------------------------------------------------------------------
stndfnl | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
atndrte | .0077546 .0026393 2.94 0.003 .0025723 .0129368
priGPA | -1.62854 .4810025 -3.39 0.001 -2.572986 -.6840938
ACT | -.1280394 .098492 -1.30 0.194 -.3214279 .0653492
priGPA2 | .2959046 .1010495 2.93 0.004 .0974945 .4943147
ACT2 | .0045334 .0021764 2.08 0.038 .00026 .0088068
interaction2 | .0055859 .0043174 1.29 0.196 -.0028913 .0140631
_cons | 2.050293 1.360319 1.51 0.132 -.6206863 4.721272
------------------------------------------------------------------------------
* Check the estimated coefficient (and standard errors) of atndrte.
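The reparameterization is convenient, but the same point estimate and standard error can also be obtained from the original regression with lincom; a sketch based on Example 6.3 above:
* partial effect of atndrte at priGPA = 2.59, from the non-reparameterized model
reg stndfnl atndrte priGPA ACT priGPA2 ACT2 interaction
lincom atndrte + 2.59*interaction
* matches the atndrte coefficient (0.0078, t = 2.94) in the reparameterized regression above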
Adjusted R-squared
- R-squared cannot decrease when additional RHS vars are added to the model, even if these have no significant effect on the LHS var.
- Judged by R-squared alone, a richer model will always be preferred over a more parsimonious one.
- The adjusted R-squared imposes a penalty for an additional RHS var:
R̄2 = 1 − [SSR/(n − k − 1)] / [SST/(n − 1)]   (12)
- R̄2 increases iff the absolute t statistic on the new var is greater than 1.
- R̄2 can be negative.
- Equivalent formula: R̄2 = 1 − (1 − R2)(n − 1)/(n − k − 1)
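Both formulas can be checked after any regression from Stata's stored results; a sketch, assuming the GPA2 data used later in this chapter (e(df_r) equals n − k − 1, and SST = e(mss) + e(rss)):
reg colgpa sat hsperc hsize
* adjusted R-squared via formula (12)
display 1 - (e(rss)/e(df_r)) / ((e(mss) + e(rss))/(e(N) - 1))
* adjusted R-squared via 1 - (1 - R2)(n - 1)/(n - k - 1)
display 1 - (1 - e(r2))*(e(N) - 1)/e(df_r)
* both match the Adj R-squared reported by regress
display e(r2_a)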
Using R̄2 to choose between nonnested models
The R̄2 can help us to choose a model without redundant variables.
- Models are nonnested if none is a special case of the other(s).
- For instance,
wage = β0 + β1 educ + β2 female + β3 weight   (13)
wage = β0 + β1 educ + β2 female + β3 height   (14)
- R̄2 may be used to make informal comparisons of nonnested models, as long as they have the same dependent variable.
- For the limitations of the R̄2, see Example 6.4.
Controlling for too many factors in regression analysis
- MLR.3 shows that we should worry about omitted vars.
- It is also possible to control for too many vars.
- Do not overemphasize goodness-of-fit.
- Let's assume we want to study the effect of state beer taxes on traffic fatalities:
fatalities = β0 + β1 tax + β2 perc male + . . .   (15)
fatalities = β0 + β1 tax + β2 perc male + β3 beer cons + . . .   (16)
- Shall we control for beer consumption (beer cons)?
  - Remember the ceteris paribus nature of multiple regression.
  - Beer consumption is a so-called bad control var.
Adding regressors to reduce the error variance
Additional RHS vars:
1. exacerbate the multicollinearity problem.
2. reduce the error variance.
- Generally, we do not know which effect will dominate.
- However, always include RHS vars that affect the LHS var and are uncorrelated with all other RHS vars.
- Why? Because such RHS vars
  - will not cause multicollinearity; and
  - reduce the error variance and the standard errors (s.e.).
  - The issue is not unbiasedness, but smaller sampling variance!
  - Think about a randomized controlled trial.
Confidence intervals (CI) for predictions
- Predictions for a specific subpopulation:
  1. Sampling error in ˆy0, because the ˆβj are estimated.
- Predictions for a particular unit:
  1. Sampling error in ˆy0, because the ˆβj are estimated.
  2. Variance of the error in the population, σ².
CI for prediction for a specific subpopulation – I
Fitted values or predictions are subject to sampling variation:
- Suppose we have estimated ˆy = ˆβ0 + ˆβ1x1 + ˆβ2x2 + . . . + ˆβkxk.
- To obtain a prediction we plug in particular values c1, c2, . . . , ck for each of the k RHS vars.
- The parameter we would like to estimate is:
θ0 = β0 + β1c1 + β2c2 + . . . + βkck   (17)
θ0 = E(y|x1 = c1, x2 = c2, . . . , xk = ck)
- The estimator of θ0 is
ˆθ0 = ˆβ0 + ˆβ1c1 + ˆβ2c2 + . . . + ˆβkck.
- To obtain a confidence interval (CI) for θ0, we need the s.e. of ˆθ0.
CI for prediction for a specific subpopulation – II
- We re-write (17) as β0 = θ0 − β1c1 − . . . − βkck and plug this into
y = β0 + β1x1 + β2x2 + . . . + βkxk + u
to obtain
y = θ0 + β1(x1 − c1) + β2(x2 − c2) + . . . + βk(xk − ck) + u
- In other words, we subtract the value cj from each observation on xj, and then run the regression of
yi on (xi1 − c1), . . . , (xik − ck), i = 1, 2, . . . , n.
- The predicted value and its s.e. are obtained from the intercept.
- The variance of the prediction is smallest at the mean values of the xj.
- See Example 6.5 and >> see do-file 2-4.do <<
CI for prediction for a specific subpopulation – III
gen sat0 = sat-1200
gen hsperc0 = hsperc-30
gen hsize0 = hsize-5
gen hsize20 = hsize2-25
reg colgpa sat0 hsperc0 hsize0 hsize20
Source | SS df MS Number of obs = 4137
-------------+------------------------------ F( 4, 4132) = 398.02
Model | 499.030503 4 124.757626 Prob > F = 0.0000
Residual | 1295.16517 4132 .313447524 R-squared = 0.2781
-------------+------------------------------ Adj R-squared = 0.2774
Total | 1794.19567 4136 .433799728 Root MSE = .55986
------------------------------------------------------------------------------
colgpa | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sat0 | .0014925 .0000652 22.89 0.000 .0013646 .0016204
hsperc0 | -.0138558 .000561 -24.70 0.000 -.0149557 -.0127559
hsize0 | -.0608815 .0165012 -3.69 0.000 -.0932327 -.0285302
hsize20 | .0054603 .0022698 2.41 0.016 .0010102 .0099104
_cons | 2.700075 .0198778 135.83 0.000 2.661104 2.739047
------------------------------------------------------------------------------
- The 95% confidence interval for the expected college GPA is about 2.66 to 2.74.
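The same predicted value and CI can also be obtained without centering, by running the regression in the original variables and using lincom; a sketch (hsize2 denotes hsize squared, as in do-file 2-4.do):
reg colgpa sat hsperc hsize hsize2
* expected college GPA at sat = 1200, hsperc = 30, hsize = 5
lincom _cons + 1200*sat + 30*hsperc + 5*hsize + 25*hsize2
* point estimate, s.e., and 95% CI match the intercept of the centered regression above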
CI for prediction for a particular unit – I
- We have just derived the CI of the prediction for the subpopulation with a given set of RHS vars.
- This is different from the CI for a particular unit (e.g., individual, firm, country).
- Here, we must also account for the variance in the unobserved error.
- On average, that error is assumed to be zero; that is, E(u) = 0.
- For a specific value of y, there will be an error ui; we do not know its magnitude.
CI for prediction for a particular unit – V
- Let y0 denote the value for which we would like to construct a prediction interval.
- y0 could represent a unit not in our original sample.
- Let x01, . . . , x0k be the values of the RHS vars we assume to observe, and u0 be the unobserved error.
- Therefore, we have
y0 = β0 + β1x01 + β2x02 + . . . + βkx0k + u0.
- As before, our best prediction of y0 is
ˆy0 = ˆβ0 + ˆβ1x01 + ˆβ2x02 + . . . + ˆβkx0k.
- The prediction error (with E(ˆe0) = 0) is given by
ˆe0 = y0 − ˆy0 = (β0 + β1x01 + . . . + βkx0k) + u0 − ˆy0,
where the term in parentheses plus u0 equals y0.
CI for prediction for a particular unit – VI
- To find the variance of ˆe0, note that u0 is uncorrelated with each ˆβj.
- Therefore, the variance of the prediction error is the sum of the variances:
Var(ˆe0) = Var(ˆy0) + Var(u0) = Var(ˆy0) + σ²,
where Var(u0) is the error variance σ². [See (B.31)]
- There are two sources of variation in ˆe0:
  1. Sampling error in ˆy0, because the ˆβj are estimated.
  2. The variance of the error in the population, σ².
- The s.e. of ˆe0 is given by se(ˆe0) = {[se(ˆy0)]² + ˆσ²}^(1/2).
- Due to the second term, CIs formed for specific values of y will always be wider than those for predictions of the mean y.
CI for prediction for a particular unit – VII
- Using the same reasoning as for the t statistics of the ˆβj, we find that ˆe0/se(ˆe0) has a t distribution with n − (k + 1) degrees of freedom:
P[−t0.025 ≤ ˆe0/se(ˆe0) ≤ t0.025] = 0.95,
where t0.025 is the 97.5th percentile in the t(n−k−1) distribution.
- Plugging in ˆe0 = y0 − ˆy0 and rearranging gives a 95% prediction interval for y0:
ˆy0 ± t0.025 · se(ˆe0).
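In Stata, se(ˆe0) is available through predict's stdf option (standard error of the forecast); a sketch for the college GPA model, assuming the GPA2 data are in memory:
reg colgpa sat hsperc hsize hsize2
predict yhat, xb
predict sef, stdf
* 95% prediction interval for an individual outcome, observation by observation
gen pi_low  = yhat - invttail(e(df_r), 0.025)*sef
gen pi_high = yhat + invttail(e(df_r), 0.025)*sef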
CI for prediction for a particular unit – VIII
See Example 6.6:
- ˆy0 ± 1.96 · se(ˆe0), with se(ˆe0) = {0.02² + 0.56²}^(1/2) ≈ 0.56.
Source | SS df MS Number of obs = 4137
-------------+------------------------------ F( 4, 4132) = 398.02
Model | 499.030503 4 124.757626 Prob > F = 0.0000
Residual | 1295.16517 4132 .313447524 R-squared = 0.2781
-------------+------------------------------ Adj R-squared = 0.2774
Total | 1794.19567 4136 .433799728 Root MSE = .55986
------------------------------------------------------------------------------
colgpa | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sat0 | .0014925 .0000652 22.89 0.000 .0013646 .0016204
hsperc0 | -.0138558 .000561 -24.70 0.000 -.0149557 -.0127559
hsize0 | -.0608815 .0165012 -3.69 0.000 -.0932327 -.0285302
hsize20 | .0054603 .0022698 2.41 0.016 .0010102 .0099104
_cons | 2.700075 .0198778 135.83 0.000 2.661104 2.739047
------------------------------------------------------------------------------
Residual analysis
OLS residuals are often calculated and analyzed after an estimation:
- Purely technical: the residuals may be used to test the validity of several assumptions.
- Systematic behavior in their magnitude, or in their dispersion, would cast doubt on the OLS results.
  - When plotted, do they appear systematic?
  - Does their dispersion appear to be constant, or is it larger for some RHS var values than for others?
- Residual analysis can also show whether particular units have predicted values that are well above or well below the actual outcome.
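A minimal residual-analysis sketch after any regression (y, x1, x2 are hypothetical placeholders):
reg y x1 x2
predict uhat, resid
predict yhat, xb
* plot residuals against fitted values; look for systematic patterns or non-constant dispersion
scatter uhat yhat
* units with the largest positive and negative residuals
gsort -uhat
list y yhat uhat in 1/5
gsort uhat
list y yhat uhat in 1/5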
Predicting y when log(y) is the dependent variable
- ˆy ≠ exp[ˆlog(y)]
- ˆy = ˆα0 exp[ˆlog(y)], where α0 = E[exp(u)]
1. Obtain the fitted values, ˆlog(yi), and residuals, ˆui, from the regression of log(y) on x1, . . . , xk.
2. For each obs. i, create ˆmi = exp[ˆlog(yi)].
3. Regress y on ˆmi without a constant; the coefficient on ˆmi is the estimate of α0.
4. For given values of x1, . . . , xk, obtain ˆlog(y).
5. Plug ˆα0 and ˆlog(y) into the equation above.
See Example 6.7
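A sketch of steps 1–5 in Stata; y, x1, x2 are hypothetical placeholders, and the prediction is evaluated at x1 = 5, x2 = 10:
* step 1: regression in logs; fitted values and residuals
gen logy = log(y)
reg logy x1 x2
predict logyhat, xb
* step 4 (done here while the log regression is still in memory): fitted log(y) at the chosen values
scalar logy0 = _b[_cons] + _b[x1]*5 + _b[x2]*10
* step 2: m_i = exp(fitted value)
gen m = exp(logyhat)
* step 3: regression through the origin; the slope is the estimate of alpha0
reg y m, noconstant
* step 5: predicted y at the chosen values
display _b[m]*exp(logy0)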