
Chapter 14

Additional Topics in Regression Analysis

©


The Stages of Model Building

Model Specification

Coefficient Estimation

Model Verification

Interpretation and Inference


Experimental Design

Dummy variable regression can be used as a tool in experimental design work. The experiments have a single outcome variable, which contains all of the random error. Each experimental outcome is measured at discrete combinations of the experimental (independent) variables, $X_j$.

There is an important difference in philosophy between experimental design and most of the problems considered so far. Experimental design attempts to identify causes for the changes in the dependent variable. This is done by pre-specifying combinations of discrete independent variables at which the dependent variable will be measured. An important objective is to choose the experimental points, defined by the independent variables, so as to provide minimum-variance estimators. The order in which the experiments are performed is chosen randomly, to avoid biases from variables not included in the experiment.


Example: Dummy Variable Specification for Treatment and Blocking Variables (Table 12.1)

Z1 | X1  X2  X3
 1 |  0   0   0
 2 |  1   0   0
 3 |  0   1   0
 4 |  0   0   1

Z2 | X4  X5
 1 |  0   0
 2 |  1   0
 3 |  0   1
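Assuming pandas is available, the coding in Table 12.1 can be generated mechanically. The data frame below is illustrative; `drop_first=True` makes level 1 of each factor the baseline (all dummies zero), matching the table.

```python
import pandas as pd

# Illustrative observations: Z1 is a treatment factor with 4 levels,
# Z2 a blocking factor with 3 levels, coded as in Table 12.1.
df = pd.DataFrame({
    "Z1": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    "Z2": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
})

# drop_first=True drops the level-1 dummy, so Z1 produces three columns
# (playing the role of X1-X3) and Z2 produces two (X4-X5).
dummies = pd.get_dummies(df, columns=["Z1", "Z2"], drop_first=True, dtype=int)
print(dummies.head())
```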


Regressions Involving Lagged Dependent Variables

Consider the following regression model linking a dependent variable, Y, to K independent variables and to the lagged value of the dependent variable:

$$Y_t = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + \cdots + \beta_K x_{Kt} + \gamma Y_{t-1} + \varepsilon_t$$

where $\beta_0, \beta_1, \ldots, \beta_K$, and $\gamma$ are fixed coefficients. If data are generated by this model:

1. An increase of 1 unit in the independent variable $x_j$ in time period t, with all other independent variables held fixed, leads to an expected increase in the dependent variable of $\beta_j$ in period t, $\gamma\beta_j$ in period (t+1), $\gamma^2\beta_j$ in period (t+2), $\gamma^3\beta_j$ in period (t+3), and so on. The total expected increase over all current and future time periods is $\beta_j/(1-\gamma)$; see the numerical sketch after this slide.

2. The coefficients $\beta_0, \beta_1, \ldots, \beta_K$, and $\gamma$ can be estimated by least squares in the usual manner.
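As a quick numerical check on point 1, the sketch below uses assumed illustrative values $\beta_j = 0.8$ and $\gamma = 0.5$ and sums the geometric sequence of period-by-period effects.

```python
# Dynamic multiplier check (illustrative values, not from the text).
beta_j, gamma = 0.8, 0.5

# Expected effect on Y, s periods after a one-unit increase in x_j.
effects = [beta_j * gamma**s for s in range(50)]

print(sum(effects))           # ~1.6: accumulated effect over 50 periods
print(beta_j / (1 - gamma))   # 1.6: the closed-form total beta_j / (1 - gamma)
```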


Regressions Involving Lagged Dependent Variables (continued)

3. Confidence intervals and hypothesis tests for the regression coefficients can be computed precisely as for the ordinary multiple regression model. (Strictly speaking, when the regression equation contains lagged variables, these procedures are only approximately valid. The quality of the approximation improves, all other things being equal, as the number of sample observations increases.)

4. Caution should be used when applying confidence intervals and hypothesis tests to time series data. There is a possibility that the equation errors $\varepsilon_i$ are not independent of one another. When the errors are correlated, the coefficient estimates are unbiased but not efficient, and the usual confidence intervals and hypothesis tests are no longer valid. Econometricians have developed procedures for obtaining estimates under these conditions.


Specification Bias

When significant predictor variables are omitted from the model, the least squares estimates will usually be biased, and the usual inferential statements from hypothesis tests or confidence intervals can be seriously misleading. In addition, the estimated model error will include the effect of the missing variable(s) and thus will be larger. Only in the rare case where the omitted variables are uncorrelated with the independent variables included in the regression model will this not occur. The simulation sketch below illustrates the effect.
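A small simulation makes the bias visible. Everything here is assumed for illustration: the coefficients, the correlation between the predictors, and the sample size.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# x2 is correlated with x1, and both genuinely affect y.
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Correctly specified model: both predictors included.
X_full = np.column_stack([np.ones(n), x1, x2])
b_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)

# Misspecified model: x2 omitted.
X_omit = np.column_stack([np.ones(n), x1])
b_omit, *_ = np.linalg.lstsq(X_omit, y, rcond=None)

print(b_full[1])  # ~2.0: roughly unbiased
print(b_omit[1])  # ~4.1: biased, since x1 picks up the omitted effect of x2
```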


Multicollinearity

Multicollinearity refers to the situation in which high correlation exists between two or more independent variables. Such variables contribute redundant information to the multiple regression model, and including highly correlated independent variables can adversely affect the regression results.
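The slide does not prescribe a diagnostic, but variance inflation factors are one standard way to detect the problem. The sketch below, with an illustrative `vif` helper, regresses each independent variable on the others.

```python
import numpy as np

def vif(X):
    """Variance inflation factors for the columns of X (no intercept column).
    VIF_j = 1 / (1 - R_j^2), where R_j^2 is from regressing column j on the
    remaining columns; values well above ~10 are a common warning sign."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
        b, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ b
        r2 = 1 - resid.var() / target.var()
        factors.append(1 / (1 - r2))
    return factors

# Illustrative check: two nearly collinear predictors plus an unrelated one.
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=200), rng.normal(size=200)])
print(vif(X))  # first two factors are huge, the third is near 1
```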


Multicollinearity: Two Designs with Perfect Multicollinearity (Figure 12.8)

[Figure 12.8: two scatterplot panels, (a) and (b). In each, the vertical axis is $x_{2i}$, ranging from 7,500 to 7,900, and the horizontal axis ranges from 3.0 to 3.4; the plotted design points illustrate perfect multicollinearity between the two independent variables.]


Tests for Heteroscedasticity

Consider a regression model

$$Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i$$

linking a dependent variable to K independent variables and based on n sets of observations. Let $b_0, b_1, \ldots, b_K$ be the least squares estimates of the model coefficients, with predicted values

$$\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_K x_{Ki}$$

and residuals from the fitted model

$$e_i = y_i - \hat{y}_i$$

We want to test the null hypothesis that the error terms $\varepsilon_i$ all have the same variance against the alternative that their variances depend on the expected values.


Tests for Heteroscedasticity (continued)

We estimate a simple auxiliary regression in which the dependent variable is the square of each residual, $e_i^2$, and the independent variable is the predicted value, $\hat{y}_i$:

$$e_i^2 = a_0 + a_1 \hat{y}_i$$

Let $R^2$ be the coefficient of determination of this auxiliary regression. Then, for a test of significance level $\alpha$, the null hypothesis is rejected if $nR^2 > \chi^2_{1,\alpha}$, where $\chi^2_{1,\alpha}$ is the critical value of the chi-square random variable with 1 degree of freedom and probability of error $\alpha$.
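A sketch of this test in numpy/scipy, under the assumption that `X` holds the K independent variables column-wise; the function name is illustrative.

```python
import numpy as np
from scipy import stats

def heteroscedasticity_test(y, X, alpha=0.05):
    """nR^2 test described above: regress e_i^2 on y_hat_i and compare
    n * R^2 of that auxiliary regression to a chi-square(1) critical value."""
    n = len(y)

    # Fit the original model by least squares (intercept added here).
    X1 = np.column_stack([np.ones(n), X])
    b, *_ = np.linalg.lstsq(X1, y, rcond=None)
    y_hat = X1 @ b
    e2 = (y - y_hat) ** 2            # squared residuals

    # Auxiliary regression: e_i^2 = a0 + a1 * y_hat_i
    Z = np.column_stack([np.ones(n), y_hat])
    a, *_ = np.linalg.lstsq(Z, e2, rcond=None)
    r2 = 1 - ((e2 - Z @ a).var() / e2.var())

    stat = n * r2
    crit = stats.chi2.ppf(1 - alpha, df=1)
    return stat, crit, stat > crit   # last value True => reject H0
```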


Autocorrelated Errors

Consider the regression model

$$Y_t = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + \cdots + \beta_K x_{Kt} + \varepsilon_t$$

based on n sets of observations. We are interested in determining whether the error terms are autocorrelated, following the first-order autoregressive model

$$\varepsilon_t = \rho \varepsilon_{t-1} + u_t$$

where $u_t$ is not autocorrelated. The test of the null hypothesis of no autocorrelation,

$$H_0 : \rho = 0$$

is based on the Durbin-Watson statistic

$$d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$$


Autocorrelated Errors (continued)

where $e_t$ are the residuals when the regression equation is estimated by least squares. When the alternative hypothesis is of positive autocorrelation in the errors, that is,

$$H_1 : \rho > 0$$

the decision rule is as follows (implemented in the sketch after this slide):

Reject $H_0$ if $d < d_L$
Accept $H_0$ if $d > d_U$
Test inconclusive if $d_L < d < d_U$

where $d_L$ and $d_U$ are tabulated for values of n and K, and for significance levels of 1% and 5%, in Table 10 of the Appendix.

Occasionally, one wants to test against the alternative of negative autocorrelation, that is,

$$H_1 : \rho < 0$$
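Both the statistic and the decision rule translate directly into code. A minimal numpy sketch with illustrative function names; the bounds $d_L$ and $d_U$ are not computed but looked up, for the given n and K, in Table 10 of the Appendix.

```python
import numpy as np

def durbin_watson(e):
    """d = sum_{t=2}^{n} (e_t - e_{t-1})^2 / sum_{t=1}^{n} e_t^2,
    computed from the least squares residuals e."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def dw_decision(d, d_L, d_U):
    """Decision rule for H1: rho > 0, given tabulated bounds d_L < d_U."""
    if d < d_L:
        return "Reject H0: evidence of positive autocorrelation"
    if d > d_U:
        return "Accept H0"
    return "Test inconclusive"
```

Values of d near 2 are consistent with $\rho = 0$; values well below 2 point toward positive autocorrelation, and values well above 2 toward negative autocorrelation.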


Estimation of Regression Models with Autocorrelated Errors

Suppose that we want to estimate the coefficients of the regression model

$$Y_t = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + \cdots + \beta_K x_{Kt} + \varepsilon_t$$

where the error term $\varepsilon_t$ is autocorrelated. This can be accomplished in two stages, as follows:

(i) Estimate the model by least squares, obtaining the Durbin-Watson statistic, d, and hence the estimate

$$r = 1 - \frac{d}{2}$$

of the autocorrelation parameter.

(ii) Estimate by least squares a second regression in which the dependent variable is $(Y_t - rY_{t-1})$ and the independent variables are $(x_{1t} - rx_{1,t-1}), (x_{2t} - rx_{2,t-1}), \ldots, (x_{Kt} - rx_{K,t-1})$. The estimated coefficients from this second model are the estimates of the parameters $\beta_1, \beta_2, \ldots, \beta_K$. An estimate of $\beta_0$ is obtained by dividing the estimated intercept of the second model by $(1-r)$. Hypothesis tests and confidence intervals for the regression coefficients can be carried out using the output from the second model.
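A numpy sketch of the two-stage procedure (often called Cochrane-Orcutt estimation); it assumes `X` already carries a leading column of ones, and the function name is illustrative.

```python
import numpy as np

def two_stage_autocorr(y, X):
    """Two-stage estimation described above. X must have a leading column
    of ones. Returns (b0, slope estimates, r)."""
    n = len(y)

    # Stage (i): least squares fit, Durbin-Watson d, and r = 1 - d/2.
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
    r = 1 - d / 2

    # Stage (ii): least squares on the quasi-differenced variables
    # (y_t - r*y_{t-1}) and (x_jt - r*x_{j,t-1}).
    y_star = y[1:] - r * y[:-1]
    X_star = X[1:, 1:] - r * X[:-1, 1:]          # quasi-difference the x's
    X_star = np.column_stack([np.ones(n - 1), X_star])
    b_star, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)

    b0 = b_star[0] / (1 - r)                     # recover the intercept
    return b0, b_star[1:], r
```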


Key Words

Autocorrelated Errors

Autocorrelated Errors with Lagged Dependent Variables

Bias from Excluding Significant Predictor Variables

Coefficient Estimation

Dummy Variables

Durbin-Watson Test

Estimation of Regression Models with Autocorrelated Errors

Experimental Design

Heteroscedasticity

Model Interpretation and Inference

Model Specification

Model Verification

Multicollinearity


Key Words (continued)

Regression Involving Lagged Dependent Variables

Test for Heteroscedasticity