part 10: time series applications [ 1/64] econometric analysis of panel data william greene...

Part 10: Time Series Applications [ 1/64]

Econometric Analysis of Panel Data

William Greene

Department of Economics

Stern School of Business


Dear professor Greene, I have a plan to run a (endogenous or exogenous) switching regression model for a panel data set. To my knowledge, there is no routine for this in other software, and I am not so good at coding a program. Fortunately, I am advised that LIMDEP has a built in function (or routine) for the panel switch model.


Endogenous Switching (ca.1980)0 0 0

1 1 1

Regime 0: y

Regime 1: y

Regime Switch: d* = , d = 1[d* > 0]

Regime 0 governs if d = 0, Probability = 1- ( )

Regime 1 governs if d = 1, Probability = ( )

Endogenous

x

x

z

z

z

i i i

i i i

i

i

i

u

20 0

20 1

0 0 0 1 1

0

Switching: ~ 0 , ?

0 1

This is a latent class model with different processes in the two classes.

There is correlation between the unobservabl

i

i

i

N

es that govern the class

determination and the unobservables in the two regime equations.

Not identified. Regimes do not coexist.


Modeling an Economic Time Series

Observed y0, y1, …, yt,… What is the “sample” Random sampling? The “observation window”


Estimators

Functions of sums of observations Law of large numbers?

Nonindependent observations What does “increasing sample size” mean?

Asymptotic properties? (There are no finite sample properties.)


Interpreting a Time Series Time domain: A “process”

y(t) = ax(t) + by(t-1) + … Regression like approach/interpretation

Frequency domain: A sum of terms y(t) = Contribution of different frequencies to the

observed series. (“High frequency data and financial

econometrics – “frequency” is used slightly differently here.)

( ) ( )j jjCos t t


For example,…


In parts…


Studying the Frequency Domain Cannot identify the number of terms Cannot identify frequencies from the time

series Deconstructing the variance,

autocovariances and autocorrelations Contributions at different frequencies Apparent large weights at different frequencies Using Fourier transforms of the data Does this provide “new” information about the

series?


Autocorrelation in Regression Yt = b’xt + εt

Cov(εt, εt-1) ≠ 0

Ex. RealConst = a + bRealIncome + εt U.S. Data, quarterly, 1950-2000


Autocorrelation How does it arise? What does it mean? Modeling approaches

Classical – direct: corrective Estimation that accounts for autocorrelation Inference in the presence of autocorrelation

Contemporary – structural Model the source Incorporate the time series aspect in the model


Stationary Time Series zt = b1yt-1 + b2yt-2 + … + bPyt-P + et

Autocovariance: γk = Cov[yt,yt-k] Autocorrelation: k = γk / γ0 Stationary series: γk depends only on k, not on t

Weak stationarity: E[yt] is not a function of t, E[yt * yt-s] is not a function of t or s, only of |t-s|

Strong stationarity: The joint distribution of [yt,yt-1,…,yt-s] for any window of length s periods, is not a function of t or s.

A condition for weak stationarity: The smallest root of the characteristic polynomial: 1 - b1z1 - b2z2 - … - bPzP = 0, is greater than one. The unit circle Complex roots Example: yt = yt-1 + ee, 1 - z = 0 has root z = 1/ ,

| z | > 1 => | | < 1.


Stationary vs. Nonstationary Series

T

-5.73

-1.15

3.43

8.01

12.59

-10.3120 40 60 80 1000

YT ET

Varia

ble


The Lag Operator

Lc = c when c is a constant Lxt = xt-1

L2 xt = xt-2

LPxt + LQxt = xt-P + xt-Q

Polynomials in L: yt = B(L)yt + et

A(L) yt = et

Invertibility: yt = [A(L)]-1 et


Inverting a Stationary Series

yt= yt-1 + et (1- L)yt = et

yt = [1- L]-1 et = et + et-1 + 2et-2 + …

Stationary series can be inverted Autoregressive vs. moving average form of

series

2 311 ( ) ( ) ( ) ...

1L L L

L


Regression with Autocorrelation yt = xt’b + et, et = et-1 + ut

(1- L)et = ut et = (1- L)-1ut

E[et] = E[ (1- L)-1ut] = (1- L)-1E[ut] = 0

Var[et] = (1- L)-2Var[ut] = 1+ 2u2 + …

= u2/(1- 2)

Cov[et,et-1] = Cov[et-1 + ut, et-1] =

=Cov[et-1,et-1]+Cov[ut,et-1] = u

2/(1- 2)


OLS vs. GLS

OLS Unbiased? Consistent: (Except in the presence of a

lagged dependent variable) Inefficient

GLS Consistent and efficient


+----------------------------------------------------+| Ordinary least squares regression || LHS=REALCONS Mean = 2999.436 || Autocorrel Durbin-Watson Stat. = .0920480 || Rho = cor[e,e(-1)] = .9539760 |+----------------------------------------------------++---------+--------------+----------------+--------+---------+----------+|Variable | Coefficient | Standard Error |t-ratio |P[|T|>t] | Mean of X|+---------+--------------+----------------+--------+---------+----------+ Constant -80.3547488 14.3058515 -5.617 .0000 REALDPI .92168567 .00387175 238.054 .0000 3341.47598| Robust VC Newey-West, Periods = 10 | Constant -80.3547488 41.7239214 -1.926 .0555 REALDPI .92168567 .01503516 61.302 .0000 3341.47598+---------------------------------------------+| AR(1) Model: e(t) = rho * e(t-1) + u(t) || Final value of Rho = .998782 || Iter= 6, SS= 118367.007, Log-L=-941.371914 || Durbin-Watson: e(t) = .002436 || Std. Deviation: e(t) = 490.567910 || Std. Deviation: u(t) = 24.206926 || Durbin-Watson: u(t) = 1.994957 || Autocorrelation: u(t) = .002521 || N[0,1] used for significance levels |+---------------------------------------------++---------+--------------+----------------+--------+---------+----------+|Variable | Coefficient | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|+---------+--------------+----------------+--------+---------+----------+ Constant 1019.32680 411.177156 2.479 .0132 REALDPI .67342731 .03972593 16.952 .0000 3341.47598 RHO .99878181 .00346332 288.389 .0000


Detecting Autocorrelation

Use residuals Durbin-Watson d= Assumes normally distributed disturbances

strictly exogenous regressors Variable addition (Godfrey)

yt = ’xt + εt-1 + ut

Use regression residuals et and test = 0 Assumes consistency of b.

22 1

21

( )2(1 )

Tt t t

Tt t

e er

e


A Unit Root?

How to test for = 1? By construction: εt – εt-1 = ( - 1)εt-1 + ut

Test for γ = ( - 1) = 0 using regression? Variance goes to 0 faster than 1/T. Need a

new table; can’t use standard t tables. Dickey – Fuller tests

Unit roots in economic data. (Are there?) Nonstationary series Implications for conventional analysis


Reinterpreting Autocorrelation

1

1 1 1 1

t

t

Regression form

' ,

Error Correction Form

'( ) ( ' ) , ( 1)

' the equilibrium

The model describes adjustment of y to equilibrium when

x changes.

t t t t t t

t t t t t t t

t

y x u

y y x x y x u

x


Integrated Processes

Integration of order (P) when the P’th differenced series is stationary

Stationary series are I(0) Trending series are often I(1). Then yt – yt-1 =

yt is I(0). [Most macroeconomic data series.]

Accelerating series might be I(2). Then (yt – yt-1)- (yt – yt-1) = 2yt is I(0) [Money stock in hyperinflationary economies. Difficult to find many applications in economics]


Cointegration: Real DPI and Real Consumption

Observ.#

2198

3390

4582

5775

6967

100641 82 123 164 2050

REALDPI REALCONS

Vari

ab

le


Cointegration – Divergent Series?

Observ.#

4.30

8.48

12.67

16.85

21.04

.1220 40 60 80 1000

YT XT

Vari

ab

le


Cointegration X(t) and y(t) are obviously I(1) Looks like any linear combination of x(t) and y(t)

will also be I(1) Does a model y(t) = bx(t) + u(u) where u(t) is

I(0) make any sense? How can u(t) be I(0)? In fact, there is a linear combination, [1,-] that

is I(0). y(t) = .1*t + noise, x(t) = .2*t + noise y(t) and x(t) have a common trend y(t) and x(t) are cointegrated.


Cointegration and I(0) Residuals

Observ.#

-1.00

-.50

.00

.50

1.00

1.50

2.00

-1.5017 34 51 68 85 1020

ET


Cross Country Growth Convergence

1i,t i,t i,t i,t

i,t

i,t i,t-1 i,t-1

i,t 1 i i,t 1

i,t i i,t 1

Y K (A L )

A index of technology

K capital stock = I + (1- )K

I sY Investment Savings

L labor (1 n)L

Solow - Swan Growth Model

Equilibrium Model for Steady

i,t i i i i,t 1 i,t

i i i i

i i

logY t logY

(1- )g, g=technological growth rate

"convergence" parameter; 1- = rate of convergence to steady state

Cross country comparisons of convergence rat

State Income

es are the focus of study.


Heterogeneous Dynamic Model

i,t i i i,t 1 i it i,t

ii

i

logY logY x

Long run effect of interest is 1

Average (over countries) effect: or 1

(1) "Fixed Effects:" Separate regressions, then average results.

(2) Country means (o

ver time) - can be manipulated to produce

consistent estimators of desired parameters

(3) Time series of means across countries (does not work at all)

(4) Pooled - no way to obtain consistent est

i

i i

imates.

(5) Mixed Fixed (Weinhold - Hsiao): Build separate into the

equation with "fixed effects," treat u as random.


“Fixed Effects” Approach


ii

i

i i i

N Nii 1 i 1 i

i

Ni 1 i

Ni 1

logY logY x

, = 1 1

ˆ ˆ(1) Separate regressions; , ,ˆ

ˆ1 1ˆ ˆ(2) Average estimates = orˆN N1

ˆ(1/ N)ˆ Function of averages: =1 (1/ N)

i

i

N

ii=1

ˆ

In each case, each term i has variance O(1/T)

Each average has variance O(1/N) (1/N)O(1/T)

Expect consistency of estimates of long run effects.


Country Means


ii

i

i i i i, 1 i i i

i i, 1

Tt 1 it

ii

Tt 1 i,t 1 i,T i,0

i, 1 ii i

logY logY x

, = 1 1

logY logY x

contains logY

Estimates are inconsistent. But,

logylogY ,

T

logy y ylogY logY log

T T

i T i

i i i i, 1 i i i i i i T i i i i

i i i iii T i

i i i i

Y (y) / T

logY logY x (logY (y) / T) x

x (y) / T1 1 1 1


Country Means (cont.)

i i i i, 1 i i i i i i T i i i i

i i i iii T i

i i i i

0 0 0i i i i

i i i i

logY logY x (logY (y) / T) x

x (y) / T1 1 1 1

(1) Let = /(1 ), expect to be random

= /(1 ) and to be random

(2) Lev

i i T i

0i i

el variable x should be uncorrelated with change (logY (y) / T).

Regression of logY on 1, x should give consistent estimates of and .


Time Series

Nt i 1 i,t

N N Ni 1 i i,t 1 i 1 i i,t i 1 i,t

Ni 1 i i i

N Ni 1 i i,t 1 1,t i 1 i i,t 1

logy (1/ N) logy

= + (1/N) y (1/N) x (1/N)

Use =(1/N) so ( ) then

(1/N) logy logy (1/N) ( ) logy

Likewise for (1/N)

Ni 1 i i,t

Nt 1,t t t i 1 i i,t 1

x

logy logy x ( (1/N) ( ) logy ...)

Disturbance is correlated with the regressor. There is

no way out, and by construction no instrumental variable

that could be correlated

with the regressor and not the

disturbance.


Pooling

Essentially the same as the time series case.

OLS or GLS are inconsistentThere could be no instrument that would

work (by construction)


A Mixed/Fixed Approach

Ni,t i i 1 i i,t i,t 1 i i,t i,t

i,t

i i i

logy d logy x

d = country specific dummy variable.

Treat and as random, is a 'fixed effect.'

This model can be fit consistently by OLS and

efficiently by GLS.


A Mixed Fixed Model Estimator

Ni,t i i 1 i i,t i,t 1 i,t i i,t i,t

i i

2 2 2i i,t i,t w i,t

i,t

logy d logy x (wx )

w

Heteroscedastic : Var[wx ]= x

Use two step least squares.

(1) Linear regression of logy on dummy variables, dummy

i,t-1 i,t

2i,t

2 2w

variables times logy and x .

(2) Regress squares of OLS residuals on x and 1 to

estimate and .

(3) Return to (1) but now use weighted least squares.


Nair-Reichert and Weinhold on Growth

Weinhold (1996) and Nair–Reichert and Weinhold (2001) analyzed growth and development in a panel of 24 developing countries observed for 25 years, 1971–1995. The model they employed was a variant of the mixed-fixed model proposed by Hsiao (1986, 2003). In their specification,

GGDPi,t = αi + γi dit GGDPi,t-1 + β1i GGDIi,t-1 + β2i GFDIi,t-1 + β3i GEXPi,t-1 + β4 INFLi,t-1 + εi,t

GGDP = Growth rate of gross domestic product,GGDI = Growth rate of gross domestic investment,GFDI = Growth rate of foreign direct investment (inflows),GEXP = Growth rate of exports of goods and services,INFL = Inflation rate.The constant terms and coefficients on the lagged dependent variable are country specific.The remaining coefficients are treated as random, normally distributed, with means βk andunrestricted variances. They are modeled as uncorrelated. The model was estimated using a modification of the Hildreth–Houck–Swamy method


Analysis of Macroeconomic Data Integrated series The problem with regressions involving

nonstationary series Spurious regressions Unit roots and misleading relationships

Solutions to the “problem” Random walks and first differencing Removing common trends

Cointegration: Formal solutions to regression models involving nonstationary data

Extending these results to panels Large T and small T cases. Parameter heterogeneity across countries


Nonstationary Data


Integrated Series


Stationary Data


Unit Root Tests

Lt 0 t 1 l 1 l t l t

0

0 1

y y t y

Different restrictions on parameters produce the model.

The parameter of interest is .

Augmented Dickey-Fuller Tests:

Unit root test H : < 1 vs. H : 1.

KPSS Tests:

Null hypothesis is stationarity. Alternative is broadly

defined as nonstationary.


KPSS Test-1


KPSS Test-2


Cointegrated Variables?


Cointegrating Relationships

Implications: Long run vs. short run relationships Problems of spurious regressions (as

usual) Problem for existing empirical

studies: Regressions involving variables of different integration. E.g., regressions of flows on stocks


Money demand example


Panel Unit Root Tests


Implications

Separate analyses by country How to combine data and test statistics Cointegrating relationships across

countries


Purchasing Power Parity

i,t i i i,t i,t

i,t

i,t

i

Cross country purchasing power parity hypothesis:

E P

E log of exchange rate country i with U.S.

P log of aggregate consumer expenditure price ratio

Hypothesis of PPP is 1.

Dat

a on numerous countries (large N) and many periods (large T)

Standard simple regressions based on SUR model?

(See Pedroni, P., "Purchasing Power Parity Tests in Cointegrated

Panels," ReStat, Nov, 2001, p. 727 and related cited papers.)


Application

“Some international evidence on price determination: a non-stationary panel Approach,” Paul Ashworth, Joseph P. Byrne, Economic Modelling, 20, 2003, p. 809-838.

80 quarters, 13 OECD countries

log pi,t = β0 + β1log(unit labor costi,t) + β2 log(world price,t) + β3 log(intermediate goods pricei,t) + β4 (log-output gapi,t) + εi,t

Various tests for unit roots and cointegration


Vector Autoregression The vector autoregression (VAR) model is one of the most successful, flexible,and easy to use models for the analysis of multivariate time series. It isa natural extension of the univariate autoregressive model to dynamic multivariatetime series. The VAR model has proven to be especially useful fordescribing the dynamic behavior of economic and financial time series andfor forecasting. It often provides superior forecasts to those from univariatetime series models and elaborate theory-based simultaneous equationsmodels. Forecasts from VAR models are quite flexible because they can bemade conditional on the potential future paths of specified variables in themodel. In addition to data description and forecasting, the VAR model is alsoused for structural inference and policy analysis. In structural analysis, certainassumptions about the causal structure of the data under investigationare imposed, and the resulting causal impacts of unexpected shocks orinnovations to specified variables on the variables in the model are summarized.These causal impacts are usually summarized with impulse responsefunctions and forecast error variance decompositions.Eric Zivot: http://faculty.washington.edu/ezivot/econ584/notes/varModels.pdf


VAR

1 11 1 12 2 13 3 1 1

2 21 1 22 2 23 3 2 2

3 31 1 32 2 33 3 3 3

( ) ( 1) ( 1) ( 1) ( ) ( )

( ) ( 1) ( 1) ( 1) ( ) ( )

( ) ( 1) ( 1) ( 1) ( ) ( )

(In Zivot's examples,

1. Exchange rates

2. y(t

y t y t y t y t x t t



)=stock returns, interest rates, indexes of industrial production,

rate of inflation


12 2 13 3

2

1 11 1 1

2 1

1

1

VAR Formulation

(t) = (t-1) + x(t) + (t)

SUR with identical regressors.

Granger Causality: Non

( 1) ( 1)

(

zero off diagonal elements in

( ) ( 1) ( ) ( )

( ) 1)

y t y t

y t

y t y t x t t

y t

y y

22 2 2 2

3 33 3

23 3

31 1 3 3 3

1

2 2

2 12

( 1) ( ) ( )

( ) ( 1) ( ) ( )

Hypothesis: does not Granger cause :

( 1)

( 1) ( 1

0

)

=

y t x t t

y t y t x t t

y t

y t

y

y t

y


2

2

Impulse Response

(t) = (t-1) + x(t) + (t)

By backward substitution or using the lag operator (text, 943)

(t) x(t) x(t-1) x(t-2) +... (ad infinitum)

+ (t) (t-1) (t-2)

y y

y

2

1

+ ...

[ must converge to as P increases. Roots inside unit circle.]

Consider a one time shock (impulse) in the system, = in period t

Consider the effect of the impulse on y ( ), s=t, t+1,...

Ef

P

s

0

2

2 12

212

fect in period t is 0. is not in the y1 equation.

affects y2 in period t, which affects y1 in period t+1. Effect is

In period t+2, the effect from 2 periods back is ( )

... and so on

.


Zivot’s Data


Impulse Responses


GARCH Models: A Model for Time Series with Latent Heteroscedasticity

Bollerslev/Ghysel, 1974


ARCH Model


GARCH Model


Estimated GARCH Model----------------------------------------------------------------------GARCH MODELDependent variable YLog likelihood function -1106.60788Restricted log likelihood -1311.09637Chi squared [ 2 d.f.] 408.97699Significance level .00000McFadden Pseudo R-squared .1559676Estimation based on N = 1974, K = 4GARCH Model, P = 1, Q = 1Wald statistic for GARCH = 3727.503--------+-------------------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] Mean of X--------+------------------------------------------------------------- |Regression parametersConstant| -.00619 .00873 -.709 .4783 |Unconditional VarianceAlpha(0)| .01076*** .00312 3.445 .0006 |Lagged Variance TermsDelta(1)| .80597*** .03015 26.731 .0000 |Lagged Squared Disturbance TermsAlpha(1)| .15313*** .02732 5.605 .0000 |Equilibrium variance, a0/[1-D(1)-A(1)]EquilVar| .26316 .59402 .443 .6577--------+-------------------------------------------------------------

part 10: time series applications [ 1/64] econometric analysis of panel data william greene...

Documents

time series applications

time series time domain

bx t t cov t

economic time series

time series aspect

stationary time series

function of t

regression y t