Semiparametric Estimation of Multivariate Fractional Cointegration. Author(s): Willa W. Chen and Clifford M. Hurvich. Source: Journal of the American Statistical Association, Vol. 98, No. 463 (Sep., 2003), pp. 629-642. Published by: American Statistical Association. Stable URL: http://www.jstor.org/stable/30045290


Semiparametric Estimation of Multivariate Fractional Cointegration

Willa W. CHEN and Clifford M. HURVICH

We consider the semiparametric estimation of fractional cointegration in a multivariate process of cointegrating rank $r > 0$. We estimate the cointegrating relationships by the eigenvectors corresponding to the $r$ smallest eigenvalues of an averaged periodogram matrix of tapered, differenced observations. The number of frequencies $m$ used in the periodogram average is held fixed as the sample size grows. We first show that the averaged periodogram matrix converges in distribution to a singular matrix whose null eigenvectors span the space of cointegrating vectors. We then show that the angle between the estimated cointegrating vectors and the space of true cointegrating vectors is $O_p(n^{d_u-d})$, where $d$ and $d_u$ are the memory parameters of the observations and cointegrating errors. The proposed estimator is invariant to the labeling of the component series and thus does not require that one of the variables be specified as a dependent variable. We determine the rate of convergence of the $r$ smallest eigenvalues of the periodogram matrix and present a criterion that allows for consistent estimation of $r$. Finally, we apply our methodology to the analysis of fractional cointegration in interest rates.

KEY WORDS: Fractional cointegration; Long memory; Periodogram; Tapering.

1. INTRODUCTION

Fractional cointegration was originally defined by Engle and Granger (1987) and has been the subject of increasing recent attention in the econometric literature. Robinson (1994) proposed a narrow-band least squares (NBLS) estimator of the cointegrating parameter, with bandwidth tending to $\infty$. Further results on this estimator in the nonstationary case were obtained by Robinson and Marinucci (2001). Chen and Hurvich (2003) considered a tapered NBLS estimator based on differenced data and showed that this estimator, which is invariant to additive polynomial trends of a certain order, can converge faster under some circumstances than the nontapered NBLS. Most of the existing theory for NBLS has been derived in the bivariate case. For series of dimension three or higher, NBLS suffers from the drawback of requiring the specification of one of the component series as a dependent variable. The estimator is not invariant to this choice, and not all choices are even permissible, because the chosen series may not appear in a cointegrating relationship with any of the other series.

In this article we consider the properties of eigenvectors of the averaged periodogram matrix of tapered, differenced observations using a fixed amount of averaging. Such eigenvectors are invariant to the labeling of the variables and are also invariant to additive polynomial trends. Because we hold the amount of averaging fixed, we do not need to normalize the averaged periodogram using an estimate of the (common) memory parameter of the original series. Such a normalization, which may be difficult to carry out accurately in the presence of fractional cointegration, was needed by Robinson and Yajima (2002), who studied the problem of determining the cointegrating rank based on the eigenvalues of a normalized averaged periodogram matrix, where the amount of averaging tends to $\infty$. We note further that the use of fixed averaging was crucial to Chen and Hurvich (2003) in obtaining (in some cases) a faster rate of convergence for NBLS than had been achieved previously by Robinson and Marinucci (2001).

We first derive the limit distribution of the averaged periodogram of the tapered, differenced data, generalizing results obtained by Chen and Hurvich (2003) to the multivariate case. The averaged periodogram matrix converges in distribution to a singular matrix with null space equal to the space of cointegrating vectors. We then use recent results by Barlow and Slapničar (2000) on perturbed (nonstochastic) singular symmetric matrices to obtain bounds in probability on the angle between the data-based eigenvectors and the space of cointegrating vectors. We then derive the rate of convergence of the $r$ smallest eigenvalues of the averaged periodogram matrix and present a model selection criterion, similar to one presented by Robinson and Yajima (2002), which allows for the consistent estimation of the cointegrating rank, $r$. We then present an application of our methodology to the analysis of interest rates. Next we give a discussion, including possible generalizations of our methodology to a situation where there are varying degrees of cointegration. We conclude with a mathematical appendix.

Willa W. Chen is Assistant Professor of Statistics, Texas A&M University, College Station, TX 77843. Clifford M. Hurvich is Professor of Statistics, New York University, New York, NY 10012. The authors thank Rohit Deo for helpful comments and two anonymous referees for suggestions that substantially improved the presentation.

2. MODEL AND AVERAGED PERIODOGRAM MATRIX

Suppose that the original data are a $q \times 1$ time series such that the $(p-1)$th differences $\{y_t\}$ are weakly stationary with a common memory parameter $d \in (-p + 1/2, 1/2)$, where $p \ge 1$ is a fixed integer. The use of $(p-1)$th differences converts any additive polynomial trend of order $p-1$ in the original series into an additive constant. The value of this constant is irrelevant for our purposes, because the estimators considered here are functions of the discrete Fourier transform at nonzero Fourier frequencies. Thus we can take the mean of $\{y_t\}$ to be 0 without loss of generality.

To guarantee that the cointegrating relationships in the stochastic component of the levels are preserved in the differences, we apply a taper to the differences; that is, we multiply the differences by a sequence of constants before Fourier transformation. A convenient family of tapers for use on the differences was given by Hurvich and Chen (2000), who used it for semiparametric estimation of the memory parameter. The same family of tapers was used on differences by Chen and Hurvich (2003) for the tapered NBLS of fractional cointegration in a bivariate process.

© 2003 American Statistical Association. Journal of the American Statistical Association, September 2003, Vol. 98, No. 463, Theory and Methods. DOI 10.1198/016214503000000530


We assume that the differenced series $\{y_t\}$ has the generalized common-components representation

$$y_t = A x_t + B u_t, \tag{1}$$

where $A$ is a $q \times (q-r)$ unknown nonstochastic matrix of rank $q-r$ ($1 \le r < q$), $B$ is a $q \times r$ unknown nonstochastic matrix of rank $r$, $\{x_t\}$ is a $(q-r) \times 1$ unobserved series (the entries of which are called common components), and $\{u_t\}$ is an $r \times 1$ unobserved error series. We further assume that each entry of $\{x_t\}$ has memory parameter $d$, and each entry of $\{u_t\}$ has memory parameter $d_u$, where $d, d_u \in (-p + 1/2, 1/2)$ and $d_u < d$. If $p = 2$ (one difference), and the differences $\{y_t\}$ in (1) have $d = 0$, $d_u = -1$, then we obtain the well-known special case of standard cointegration. Similarly, our results can cover all special cases of integer-valued cointegration by taking $p$ sufficiently large.
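As a small worked illustration of representation (1), consider the bivariate case $q = 2$, $r = 1$. The symbols $a_1, a_2, b_1, b_2$, and $\alpha$ below are introduced here only for illustration and are not notation from the article; the sketch also assumes $a_2 b_1 \ne a_1 b_2$, so that the columns $A$ and $B$ are linearly independent.

$$y_t = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} x_t + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} u_t, \qquad \alpha = \begin{pmatrix} a_2 \\ -a_1 \end{pmatrix}, \qquad \alpha' y_t = (a_2 b_1 - a_1 b_2)\, u_t .$$

Here $\alpha' A = 0$, so the linear combination $\alpha' y_t$ removes the common component and inherits the memory parameter $d_u$ of $u_t$, which is smaller than $d$. In the standard cointegration case ($p = 2$, $d = 0$, $d_u = -1$), the levels have memory $d + 1 = 1$, while the same combination of the levels has memory $d_u + 1 = 0$.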

Here we specify a linear model for the vector series $(x_t', u_t')'$. Let $\Psi_k$ be a sequence of $q \times q$ matrices such that

$$\Psi_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{ik\lambda}\,\Psi(\lambda)\,d\lambda,$$

where for each $\lambda \in [-\pi, \pi]$, $\Psi(\lambda)$ is a complex-valued matrix, not necessarily Hermitian, such that $\Psi(-\lambda) = \overline{\Psi(\lambda)}$ and $\Psi_0$ is an identity matrix. Define the $q \times 1$ vector process $(x_t', u_t')'$ as

$$\begin{pmatrix} x_t \\ u_t \end{pmatrix} = \sum_{k=-\infty}^{\infty} \Psi_k\,\varepsilon_{t-k},$$

where $\{\varepsilon_t = (\varepsilon_{t1}, \dots, \varepsilon_{tq})'\} \sim \mathrm{iid}(0, 2\pi\Sigma)$, $\Sigma$ is a symmetric positive definite matrix with diagonal entries $\sigma_a^2$ and off-diagonal entries $\sigma_{ab} = \sigma_{ba}$, $a \ne b$, $a, b \in \{1, \dots, q\}$, and $E\|\varepsilon_t\|^4 < \infty$, where $\|\cdot\|$ denotes the Euclidean norm. The spectral density matrix of $(x_t', u_t')'$ is

$$f(\lambda) = \Psi(\lambda)\,\Sigma\,\Psi^*(\lambda), \qquad \lambda \in [-\pi, \pi],$$

where the superscript $*$ denotes conjugate transposition. We assume that all entries of $x_t$ are $I(d)$ processes (i.e., integrated of order $d$) and all entries of $u_t$ are $I(d_u)$ processes with $-p + 1/2 < d_u < d < 1/2$. We further assume that for $\lambda \in [-\pi, \pi]$, the $(a, b)$th entry of $\Psi(\lambda)$ is given by

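$$\Psi_{ab}(\lambda) = (1 - e^{-i\lambda})^{-d_{ab}}\,\tau_{ab}(\lambda)\,e^{i\phi_{ab}(\lambda)}, \tag{2}$$

where for $a \in \{1, \dots, q-r\}$, $d_{aa} = d$, $d_{ab} \le d$ if $b \ne a$, and for $a \in \{q-r+1, \dots, q\}$, $d_{aa} = d_u$, $d_{ab} \le d_u$ if $b \ne a$; $\tau_{ab}(\cdot)$ are positive even real-valued functions and $\phi_{ab}(\cdot)$ are odd real-valued functions, all continuously differentiable in an interval containing 0. It follows from (2) that the first derivatives of $\Psi_{ab}(\lambda)$ satisfy

$$\Psi_{ab}'(\lambda) = O\big(|\Psi_{aa}(\lambda)\Psi_{bb}(\lambda)|^{1/2}\,|\lambda|^{-1}\big). \tag{3}$$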

It also follows from (2) that we can write the spectral density matrix as

$$f(\lambda) = T(\lambda)\,f^{\dagger}(\lambda)\,T^*(\lambda), \tag{4}$$

where $T(\lambda) = \mathrm{diag}\{(1 - e^{-i\lambda})^{-d}, \dots, (1 - e^{-i\lambda})^{-d}, (1 - e^{-i\lambda})^{-d_u}, \dots, (1 - e^{-i\lambda})^{-d_u}\}$; that is, the first $q-r$ diagonal entries are $(1 - e^{-i\lambda})^{-d}$ and the remaining diagonal entries are $(1 - e^{-i\lambda})^{-d_u}$, and $f^{\dagger}(\lambda)$ is nonnegative definite, Hermitian, continuous at zero frequency, and therefore real valued at zero frequency. We further assume that $f^{\dagger}_{xx}(0)$, the $(q-r) \times (q-r)$ leading diagonal block matrix of $f^{\dagger}(0)$, is positive definite. Thus $\{x_t\}$ is not fractionally cointegrated, but $\{u_t\}$ is allowed to be fractionally cointegrated (see Robinson and Marinucci 1998). Our assumptions imply that $\{y_t\}$ is fractionally cointegrated. Indeed, for any nonzero $q \times 1$ vector $\alpha$ in the null space of $A'$, denoted by $\mathrm{Ker}(A')$, the linear combination $\{\alpha' y_t\}$ has memory parameter less than or equal to $d_u$, with equality holding if $\{u_t\}$ is not fractionally cointegrated. The space of cointegrating vectors, $\mathrm{Ker}(A')$, has dimension $r$, where $r$ is the cointegrating rank, assumed known.

For any vector sequence of observations $\{\xi_t\}_{t=1}^n$, define the tapered discrete Fourier transform by

$$w_{\xi,j} = \frac{1}{\sqrt{2\pi\sum_{t=1}^n |h_t^{p-1}|^2}}\,\sum_{t=1}^n h_t^{p-1}\,\xi_t\,e^{i\lambda_j t},$$

where $\lambda_j = 2\pi j/n$ is the $j$th Fourier frequency, and $\{h_t\}$ is the complex-valued taper of Hurvich and Chen (2000) (omitting an unnecessary phase-shift factor),

$$h_t = 0.5\,(1 - e^{i2\pi t/n}), \qquad t = 1, \dots, n.$$

Note that $p = 1$ yields the no-tapering case. Next, define the tapered cross-periodogram matrix of two vector sequences, $\{\xi_t\}_{t=1}^n$ and $\{\zeta_t\}_{t=1}^n$, by

$$I_{\xi\zeta,j} = w_{\xi,j}\,w_{\zeta,j}^*.$$

We work with the (real part of the) averaged periodogram matrix of a sample of $n$ observations $\{y_t\}_{t=1}^n$,

$$I_m = \sum_{j=1}^m \mathrm{Re}(I_{yy,j}),$$

where $m$ is a fixed positive integer, $m > q - r$. It follows from (1) that

$$I_m = A\sum_{j=1}^m \mathrm{Re}(I_{xx,j})A' + A\sum_{j=1}^m \mathrm{Re}(I_{xu,j})B' + B\sum_{j=1}^m \mathrm{Re}(I_{ux,j})A' + B\sum_{j=1}^m \mathrm{Re}(I_{uu,j})B'. \tag{5}$$

We show that the right side of (5) is dominated by the first term,

$$M = A\sum_{j=1}^m \mathrm{Re}(I_{xx,j})A'. \tag{6}$$

We also show that, with probability approaching 1, $\sum_{j=1}^m \mathrm{Re}(I_{xx,j})$ is positive definite, so that $M$ is a singular $q \times q$ matrix of rank $q - r$ and $\mathrm{Ker}(M) = \mathrm{Ker}(A')$. Thus, with probability approaching 1, the space of cointegrating vectors is precisely the null space of $M$ (except for the zero vector), that is, the space of eigenvectors of $M$ with corresponding eigenvalue 0. Note that both $I_m$ and $M$ are symmetric nonnegative definite matrices, so all of the eigenvalues of both matrices are real and nonnegative. Furthermore, all eigenvectors of $I_m$ corresponding to distinct eigenvalues are mutually orthogonal.
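The definitions of the tapered discrete Fourier transform and of the averaged periodogram matrix can be sketched numerically as follows. This is only an illustrative sketch: the function names `tapered_dft` and `averaged_periodogram`, the row-per-observation data layout, and the use of NumPy are assumptions made here, not part of the article.

```python
import numpy as np

def tapered_dft(y, p, m):
    """Tapered DFT of an (n x q) array y at Fourier frequencies j = 1..m.

    Uses the taper h_t = 0.5 * (1 - exp(i 2 pi t / n)) raised to the power
    p - 1, so p = 1 corresponds to no tapering. Returns an (m x q) array.
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    t = np.arange(1, n + 1)
    taper = (0.5 * (1.0 - np.exp(2j * np.pi * t / n))) ** (p - 1)
    norm = np.sqrt(2.0 * np.pi * np.sum(np.abs(taper) ** 2))
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n      # Fourier frequencies
    phases = np.exp(1j * np.outer(lam, t))           # (m x n) matrix of exp(i lam_j t)
    return (phases * taper) @ y / norm

def averaged_periodogram(y, p, m):
    """Real part of the sum over j = 1..m of w_j w_j^* (a q x q matrix)."""
    w = tapered_dft(y, p, m)
    return np.real(w.T @ np.conj(w))

# Hypothetical usage on simulated data (illustration only).
rng = np.random.default_rng(0)
y_demo = rng.standard_normal((64, 3))
I_demo = averaged_periodogram(y_demo, p=2, m=5)
```

The resulting matrix is symmetric and nonnegative definite by construction, and adding a constant to every observation leaves it unchanged, because the tapered transform of a constant vanishes at the nonzero Fourier frequencies used here.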


From the foregoing discussion, it seems plausible that the eigenvectors of $I_m$ corresponding to its $r$ smallest eigenvalues should be close in some sense to the space of cointegrating vectors. The primary goal of this article is to prove a precise version of this claim. It must be stressed, however, that the eigenvectors of $I_m$ cannot be considered estimators of the cointegrating vectors. Even though it is useful to view $I_m$ as a perturbed version of $M$ (with a "small" symmetric perturbation), it is not necessarily true that any of the eigenvectors of $I_m$ converge in a meaningful sense to any of the eigenvectors of $M$, even after appropriate labeling and standardization. Instead, we seek to show that any set of $r$ orthonormal eigenvectors $\{a_i\}_{i=1}^r$ of $I_m$ corresponding to the $r$ smallest eigenvalues is, with high probability, close to the space of cointegrating vectors, in the sense that $\sin\Theta = O_p(n^{d_u-d})$, where $\sin\Theta$ is the square root of the sum of the squared lengths of the residuals from the orthogonal projections of $\{a_i\}_{i=1}^r$ on the space of cointegrating vectors. It should be noted that the need to estimate the space of cointegrating vectors rather than individual cointegrating vectors also arises in standard cointegration (see, e.g., Bierens 1997).
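The estimation idea described above can be illustrated with a small simulation. The following is a hedged sketch under simplifying assumptions chosen here for illustration (a bivariate series, a white-noise common component with $d = 0$, an over-differenced white-noise error with $d_u = -1$, and $p = 2$); the function names and the simulated design are hypothetical and not taken from the article.

```python
import numpy as np

def averaged_periodogram(y, p, m):
    """Real part of the averaged tapered periodogram matrix of an (n x q) array."""
    n = y.shape[0]
    t = np.arange(1, n + 1)
    taper = (0.5 * (1.0 - np.exp(2j * np.pi * t / n))) ** (p - 1)
    norm = np.sqrt(2.0 * np.pi * np.sum(np.abs(taper) ** 2))
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    w = (np.exp(1j * np.outer(lam, t)) * taper) @ y / norm
    return np.real(w.T @ np.conj(w))

def estimate_cointegrating_space(y, p, m, r):
    """Orthonormal eigenvectors for the r smallest eigenvalues of I_m."""
    _, eigvec = np.linalg.eigh(averaged_periodogram(y, p, m))  # ascending order
    return eigvec[:, :r]

# Simulated differenced data y_t = A x_t + B u_t with q = 2, r = 1.
rng = np.random.default_rng(1)
n = 2000
a = np.array([1.0, 2.0])                 # the single column of A (assumed)
b = np.array([0.0, 1.0])                 # the single column of B (assumed)
x = rng.standard_normal(n)               # common component, d = 0
u = np.diff(rng.standard_normal(n + 1))  # over-differenced noise, d_u = -1
y = np.outer(x, a) + np.outer(u, b)

alpha_hat = estimate_cointegrating_space(y, p=2, m=5, r=1)[:, 0]
# With r = 1, sin(Theta) is the length of the component of alpha_hat along a.
sin_theta = abs(alpha_hat @ a) / np.linalg.norm(a)
```

In this design the true cointegrating space is spanned by a vector orthogonal to `a`, so a small value of `sin_theta` indicates that the eigenvector for the smallest eigenvalue lies close to that space.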

3. LIMITING DISTRIBUTION OF THE AVERAGED PERIODOGRAM

The limiting distribution of the averaged periodogram follows from that of the tapered discrete Fourier transform. Let $G$ be the $q \times q$ matrix-valued spectral measure on $[-\pi, \pi]$ defined by $G(d\lambda) = f(\lambda)\,d\lambda$. Let $G_n$ be the $q \times q$ renormalized spectral measure on $\mathbb{R}$, defined by

$$G_n(dx) = n\,\Lambda_n G(dx/n)\Lambda_n = \Lambda_n \Psi(x/n)\,\Sigma\,\Psi^*(x/n)\Lambda_n\,dx,$$

where $\Lambda_n = \mathrm{diag}(n^{-d}, \dots, n^{-d}, n^{-d_u}, \dots, n^{-d_u})$; that is, the first $q - r$ diagonal entries are $n^{-d}$, and the remaining diagonal entries are $n^{-d_u}$. It follows from our assumptions that there exists a Hermitian nonnegative definite $q \times q$ matrix-valued measure $G_0$ on $\mathbb{R}$ such that $G_n(S) \to G_0(S)$ as $n \to \infty$ for all bounded Lebesgue-measurable sets $S$. For $x > 0$, we have

$$G_0(dx) = \Pi(x)\,f^{\dagger}(0)\,\Pi^*(x)\,dx \tag{7}$$

and $G_0(-dx) = \overline{G_0(dx)}$, where

$$\Pi(x) = \mathrm{diag}\big(e^{-i\pi d/2}|x|^{-d}, \dots, e^{-i\pi d/2}|x|^{-d},\; e^{-i\pi d_u/2}|x|^{-d_u}, \dots, e^{-i\pi d_u/2}|x|^{-d_u}\big).$$

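We make use of the spectral representation for the vector process $\{\varepsilon_t\}$,

$$\varepsilon_t = \int_{-\pi}^{\pi} e^{i\lambda t}\,dZ_\varepsilon(\lambda),$$

where $dZ_\varepsilon(\lambda)$ is a $(q \times 1)$ complex-valued random vector with the following properties:

$$dZ_\varepsilon(-\lambda) = \overline{dZ_\varepsilon(\lambda)}, \qquad E[dZ_\varepsilon(\lambda)] = 0, \tag{8}$$

$$E[dZ_\varepsilon(\lambda)\,dZ_\varepsilon^*(\mu)] = 0 \quad (\lambda \ne \mu), \tag{9}$$

$$E[dZ_\varepsilon(\lambda)\,dZ_\varepsilon^*(\lambda)] = \Sigma\,d\lambda.$$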

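For any bounded set $A$ in $\mathbb{R}$, define

$$\Psi_s^{A,n} = \frac{1}{2\pi}\int_{A/n} e^{-isx}\,\Psi(x)\,dx, \tag{10}$$

$$Z_n(A) = n^{1/2}\Lambda_n\sum_{s=-\infty}^{\infty}\Psi_s^{A,n}\,\varepsilon_s = n^{1/2}\Lambda_n\int_{A/n}\Psi(x)\,dZ_\varepsilon(x), \tag{11}$$

where the final equality in (11) follows from theorem 4.10.1 of Brockwell and Davis (1991, p. 154). Define $Z_{G_0}$ as the $(q \times 1)$ multivariate complex Gaussian random measure satisfying

$$E[Z_{G_0}(S)] = 0, \qquad E[Z_{G_0}(S)\,Z_{G_0}^*(S)] = G_0(S),$$

$$Z_{G_0}(-S) = \overline{Z_{G_0}(S)}, \tag{12}$$

$$E[Z_{G_0}(S_1)\,Z_{G_0}^*(S_2)] = 0 \quad \text{if } S_1 \cap S_2 = \emptyset,$$

where $S$ is any Borel set in $\mathbb{R}$.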

Lemma 1. If $A_1, \dots, A_M$ are intervals in $\mathbb{R}$ with nonzero endpoints and $\pm A_1, \dots, \pm A_M$ are disjoint, then

$$(Z_n(A_1), \dots, Z_n(A_M)) \xrightarrow{d} (Z_{G_0}(A_1), \dots, Z_{G_0}(A_M)),$$

where the $q \times q$ matrix-valued measure $G_0(S)$ is defined earlier and $Z_{G_0}$ is the multivariate complex Gaussian random measure satisfying the properties given in (12).

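Given the process $(x_t', u_t')' = \sum_k \Psi_k\,\varepsilon_{t-k}$ described earlier, consider the $m$ tapered DFT vectors $w_1^T, \dots, w_m^T$,

$$w_j^T = \frac{1}{\sqrt{2\pi\sum_{t=1}^n |h_t^{p-1}|^2}}\,\sum_{t=1}^n h_t^{p-1}\begin{pmatrix} x_t \\ u_t \end{pmatrix}\exp(i\lambda_j t), \qquad j = 1, \dots, m.$$

It is useful to note that $\sum_{t=1}^n |h_t^{p-1}|^2 = n c_p$, where

$$c_p = 2^{-2(p-1)}\binom{2p-2}{p-1}.$$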
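The identity for the taper normalization can be checked numerically. The short sketch below, with helper names chosen here purely for illustration, compares the direct sum of $|h_t^{p-1}|^2$ with $n c_p$ for a few small values of $p$.

```python
import cmath
import math

def taper_energy(n, p):
    """Direct evaluation of sum_{t=1}^n |h_t^(p-1)|^2 for the complex taper."""
    total = 0.0
    for t in range(1, n + 1):
        h = 0.5 * (1.0 - cmath.exp(2j * math.pi * t / n))
        total += abs(h ** (p - 1)) ** 2
    return total

def c_p(p):
    """Closed form c_p = 2^(-2(p-1)) * binomial(2p-2, p-1)."""
    return 2.0 ** (-2 * (p - 1)) * math.comb(2 * p - 2, p - 1)

# Compare both sides for n = 50 and p = 1, 2, 3, 4.
checks = [(p, taper_energy(50, p), 50 * c_p(p)) for p in range(1, 5)]
```

For example, $c_1 = 1$ (no tapering), $c_2 = 1/2$, and $c_3 = 3/8$, which reflects the fact that $|h_t|^2 = \sin^2(\pi t/n)$.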

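We define the function (for $x \in \mathbb{R}$)

$$\Delta_p(x) = \binom{2p-2}{p-1}^{-1/2}\sum_{k=0}^{p-1}\binom{p-1}{k}(-1)^k\,\Delta(x + 2\pi k),$$

where

$$\Delta(x) = \frac{1}{\sqrt{2\pi}}\,\frac{e^{ix} - 1}{ix}.$$

Theorem 1. As $n \to \infty$, with $m$ a fixed positive integer,

$$\{\Lambda_n w_j^T\}_{j=1}^m \xrightarrow{d} \left\{\int_{\mathbb{R}} \Delta_p(x + 2\pi j)\,dZ_{G_0}(x)\right\}_{j=1}^m,$$

where $\Lambda_n = \mathrm{diag}(n^{-d}, \dots, n^{-d}, n^{-d_u}, \dots, n^{-d_u})$ and $Z_{G_0}$ is the multivariate complex Gaussian random measure satisfying the properties given in (12).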
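One way to see where the kernel $\Delta_p$ comes from is to apply the normalized tapered transform to a pure complex exponential $e^{itx/n}$ and rescale by $n^{-1/2}$; for large $n$ the result should be close to $\Delta_p(x)$. The following sketch checks this numerically. The function names and the choice of test point are assumptions made here for illustration.

```python
import cmath
import math

def delta(x):
    """Delta(x) = (2 pi)^(-1/2) * (exp(ix) - 1) / (ix), for x != 0."""
    return (cmath.exp(1j * x) - 1.0) / (1j * x) / math.sqrt(2.0 * math.pi)

def delta_p(x, p):
    """Tapered kernel built from shifted copies of delta."""
    s = sum(math.comb(p - 1, k) * (-1) ** k * delta(x + 2.0 * math.pi * k)
            for k in range(p))
    return s / math.sqrt(math.comb(2 * p - 2, p - 1))

def finite_n_kernel(x, p, n):
    """n^(-1/2) times the normalized tapered DFT of exp(i t x / n), t = 1..n."""
    total = 0j
    energy = 0.0
    for t in range(1, n + 1):
        h = (0.5 * (1.0 - cmath.exp(2j * math.pi * t / n))) ** (p - 1)
        total += h * cmath.exp(1j * t * x / n)
        energy += abs(h) ** 2
    return total / math.sqrt(2.0 * math.pi * energy) / math.sqrt(n)

# Gap between the finite-n kernel and its limit at an arbitrary point x = 1.3.
gaps = [abs(finite_n_kernel(1.3, p, 20000) - delta_p(1.3, p)) for p in (1, 2, 3)]
```

The test point must avoid values of $x$ for which $x + 2\pi k = 0$, where `delta` as written would divide by zero.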


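We partition $G_0$ into four submatrices,

$$G_0 = \begin{pmatrix} G_{0,xx} & G_{0,xu} \\ G_{0,ux} & G_{0,uu} \end{pmatrix},$$

where $G_{0,xx}$ is a $(q-r) \times (q-r)$ matrix and $G_{0,uu}$ is an $r \times r$ matrix. Defining

$$\mu_j(x) = \tfrac{1}{2}\big[\overline{\Delta_p(-x + 2\pi j)} + \Delta_p(x + 2\pi j)\big]$$

and

$$\nu_j(x) = \tfrac{i}{2}\big[\overline{\Delta_p(-x + 2\pi j)} - \Delta_p(x + 2\pi j)\big],$$

we have the following corollary.

Corollary 1. Under the conditions of Theorem 1,

$$n^{-2d} I_m \xrightarrow{d} A\sum_{j=1}^m (U_j U_j' + V_j V_j')A', \tag{13}$$

where $\mathrm{vec}\{U_j, V_k\}_{j,k=1}^m$ is a $2m(q-r)$-variate normal random variable with mean 0 and covariance determined by

$$E(U_j U_k') = \int_{\mathbb{R}} \mu_j(x)\,\overline{\mu_k(x)}\,G_{0,xx}(dx),$$

$$E(V_j V_k') = \int_{\mathbb{R}} \nu_j(x)\,\overline{\nu_k(x)}\,G_{0,xx}(dx),$$

and

$$E(U_j V_k') = \int_{\mathbb{R}} \mu_j(x)\,\overline{\nu_k(x)}\,G_{0,xx}(dx).$$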

Lemma 2. Let $\Sigma$ denote the covariance matrix of $\mathrm{vec}\{U_1, \dots, U_m, V_1, \dots, V_m\}$ in Corollary 1. Then $\Sigma$ is positive definite.

Remark. It follows from the proof of Lemma 2 that if $\{x_t\}$ were cointegrated, then $\Sigma$ would not be positive definite. Thus positive definiteness of $\Sigma$ provides a characterization of the lack of cointegration in $\{x_t\}$.

Following from Lemma 2 and the theorem of Okamoto (1973), we have the following corollary.

Corollary 2. The averaged periodogram of $\{x_t\}$,

$$n^{-2d}\sum_{j=1}^m \mathrm{Re}(I_{xx,j}),$$

under its limiting distribution, is positive definite with probability 1.

Proof. Let $U = (U_1, \dots, U_m)$ and $V = (V_1, \dots, V_m)$; then the limiting distribution of $n^{-2d}\sum_{j=1}^m \mathrm{Re}(I_{xx,j})$ is

$$\sum_{j=1}^m (U_j U_j' + V_j V_j') = UU' + VV'.$$

Because $\mathrm{vec}\{U_1, \dots, U_m, V_1, \dots, V_m\}$ is a $2m(q-r)$-variate normal random variable with positive definite covariance matrix, the joint distribution of $\mathrm{vec}\{U_1, \dots, U_m, V_1, \dots, V_m\}$ is absolutely continuous. Because we are assuming that $m > q - r$, it follows from the theorem of Okamoto (1973) that both $UU'$ and $VV'$ are positive definite with probability 1.

It follows from (5), together with Corollary 1 and the proof of Corollary 2, that the limiting distribution of $n^{-2d} I_m$ is the same as the limiting distribution of $n^{-2d} M$, where $M$ is given by (6).

4. ANGLE BETWEEN ESTIMATED AND TRUE SPACES OF COINTEGRATING VECTORS

If we define $\tilde H = n^{-2d} I_m$ and $H = n^{-2d} M$, then from (5), we can write

$$\tilde H = H + \Delta H, \tag{14}$$

where the entries of the $q \times q$ matrix $\Delta H$ are $O_p(n^{d_u-d})$. The spaces of eigenvectors of $I_m$ and $\tilde H$ are identical, but it is convenient here to work with $\tilde H$. We see from (14) that $\tilde H$ is a perturbed version of the singular random symmetric matrix $H$, and the perturbation is a random symmetric matrix with entries that are $o_p(1)$. This setup in the nonstochastic case was studied by Barlow and Slapničar (2000), and we generalize their results to the stochastic situation, (14). We use the notation of Barlow and Slapničar (2000), who considered Hermitian matrices with Hermitian perturbations. Because our matrices $H$ and $\Delta H$ are real and symmetric, all eigenvectors are real.

Consider the family of perturbed matrices

$$H(\zeta) = H + \zeta\,\Delta H, \qquad \zeta \in [0, 1].$$

Note that $H = H(0)$ and $\tilde H = H(1)$. For each value of $\zeta$, let $0 \le \lambda_1(\zeta) \le \lambda_2(\zeta) \le \cdots \le \lambda_q(\zeta)$ be the eigenvalues of $H(\zeta)$, and let $\chi_i(\zeta)$, $i \in \{1, \dots, q\}$, be a corresponding set of orthonormal eigenvectors, so that

$$H(\zeta)\chi_i(\zeta) = \lambda_i(\zeta)\chi_i(\zeta), \qquad i = 1, \dots, q.$$

We assume that $\mathrm{Ker}(H)$ has dimension $r$, so that $\lambda_1(0) = \cdots = \lambda_r(0) = 0$. Define

$$\mathcal{I} = \{i : \lambda_i(\zeta) \ne 0 \text{ for all } \zeta \in [0, 1]\}. \tag{15}$$

The set $\mathcal{I}$ and its complement $\mathcal{I}^c$ play an important role in the results of Barlow and Slapničar (2000) in the nonstochastic case. Note that $\mathcal{I}$ is the set of indices $i$ such that the $i$th smallest eigenvalue of $H(\zeta)$ remains positive for all perturbations $\zeta \in [0, 1]$. Clearly, $\{1, \dots, r\} \subset \mathcal{I}^c$. Because the matrices $H(\zeta)$ are random, $\mathcal{I}$ and $\mathcal{I}^c$ are random subsets of $\{1, \dots, q\}$. It seems plausible that because $\dim(\mathrm{Ker}(H)) = r$, and because the difference between $\tilde H$ and $H$ is $o_p(1)$, there is a very high probability that $\mathcal{I}^c = \{1, \dots, r\}$. The following lemma is proved in the Appendix.

Lemma 3. $\Pr\{\mathcal{I}^c \ne \{1, \dots, r\}\} \to 0$ as $n \to \infty$.

Define

$$X_1 = [\chi_{r+1}(0), \dots, \chi_q(0)],$$

a matrix of orthonormal eigenvectors of $H$ associated with its $q - r$ largest eigenvalues, $\lambda_{r+1}(0), \dots, \lambda_q(0)$. Now define

$$\sin\Theta = \|X_1^* \tilde X_2\|_F, \tag{16}$$

where for any matrix $E$, $\|E\|_F = \big(\sum_{i,j} |E_{ij}|^2\big)^{1/2}$ is the Frobenius norm and

$$\tilde X_2 = [\chi_1(1), \dots, \chi_r(1)]$$

is a matrix of orthonormal estimated cointegrating vectors, that is, eigenvectors of $\tilde H$ associated with its $r$ smallest eigenvalues, $\lambda_1(1), \dots, \lambda_r(1)$. We can think of $\sin\Theta$ as the sine of the angle $\Theta$ between the space spanned by the estimated cointegrating vectors and the space spanned by the true cointegrating vectors. A related interpretation of $\sin\Theta$ in terms of orthogonal projections was given in Section 2. Roughly speaking, the smaller $\sin\Theta$ is, the closer the spaces of estimated and true cointegrating vectors are to each other. Barlow and Slapničar (2000) showed that if $\mathcal{I} = \{r + 1, \dots, q\}$, then

$$\sin\Theta = \|X_1^* \tilde X_2\|_F \le \frac{\|H^{\dagger}(0)\,\Delta H\,\tilde X_2\|_F}{\mathrm{relgap}_0(\Lambda_1, \tilde\Lambda_2)}, \tag{17}$$

where the quantities on the right side of this inequality are defined later, without requiring the assumption that $\mathcal{I} = \{r + 1, \dots, q\}$. We define

$$X_2 = [\chi_1(0), \dots, \chi_r(0)],$$

an orthogonal matrix of true cointegrating vectors. We can write $H(0) = X(0)\Lambda(0)X^*(0)$, where $X(0) = [\chi_1(0), \dots, \chi_q(0)]$ and $\Lambda(0) = \mathrm{diag}(0, \dots, 0, \lambda_{r+1}(0), \dots, \lambda_q(0))$. The Moore-Penrose inverse of $H(0)$ is given by

$$H^{\dagger}(0) = X(0)\Lambda^{\dagger}(0)X^*(0),$$

where $\Lambda^{\dagger}(0) = \mathrm{diag}(0, \dots, 0, \lambda_{r+1}^{-1}(0), \dots, \lambda_q^{-1}(0))$. The matrices $\Lambda_1$ and $\tilde\Lambda_2$ are defined by $\Lambda_1 = \mathrm{diag}(\lambda_{r+1}(0), \dots, \lambda_q(0))$ and $\tilde\Lambda_2 = \mathrm{diag}(\lambda_1(1), \dots, \lambda_r(1))$. The generalized relative gap in the eigenvalues of $\Lambda_1$ and $\tilde\Lambda_2$ is given by

$$\mathrm{relgap}_0(\Lambda_1, \tilde\Lambda_2) = \min_{j \in \{1, \dots, r\},\; i \in \{r+1, \dots, q\}} \frac{|\lambda_i(0) - \lambda_j(1)|}{\lambda_i(0)}.$$

Because the eigenvalues of a matrix are continuous functions of the entries of the matrix, and because $\Delta H = o_p(1)$, it follows that $\lambda_1(1), \dots, \lambda_r(1)$ all converge in probability to 0, and the limiting distribution of $\lambda_{r+1}(1), \dots, \lambda_q(1)$ is the same as the limiting distribution of $\lambda_{r+1}(0), \dots, \lambda_q(0)$. Because $H$, under its limiting distribution, has rank $q - r$ with probability 1, it follows that the limiting marginal distributions of $\lambda_{r+1}(0), \dots, \lambda_q(0)$ have no mass at 0. Thus,

$$\mathrm{relgap}_0(\Lambda_1, \tilde\Lambda_2) \xrightarrow{p} 1. \tag{18}$$

Because $\Delta H = O_p(n^{d_u-d})$, we have

$$\|H^{\dagger}(0)\,\Delta H\,\tilde X_2\|_F = O_p(n^{d_u-d}). \tag{19}$$

Note that both (18) and (19) hold without the need to assume that $\mathcal{I} = \{r + 1, \dots, q\}$.
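The quantity in (16) is straightforward to compute when orthonormal bases are available. The sketch below, with function names and a three-dimensional toy example that are assumptions made here, evaluates $\sin\Theta$ both as the Frobenius norm in (16) and as the norm of the projection residuals described in Section 2.

```python
import numpy as np

def sin_theta(X1, X2_hat):
    """Frobenius norm of X1' X2_hat.

    X1: (q x (q-r)) orthonormal basis of the orthogonal complement of the
        true cointegrating space; X2_hat: (q x r) orthonormal estimates.
    """
    return np.linalg.norm(X1.T @ X2_hat, ord="fro")

def projection_residual_norm(X2_true, X2_hat):
    """Root sum of squared residuals from projecting X2_hat onto span(X2_true)."""
    residual = X2_hat - X2_true @ (X2_true.T @ X2_hat)
    return np.linalg.norm(residual, ord="fro")

# Toy example with q = 3, r = 1: the true cointegrating vector is e_3 and the
# estimate is e_3 rotated by an angle of 0.1 radians toward e_1.
angle = 0.1
X1_demo = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
X2_demo = np.array([[0.0], [0.0], [1.0]])
X2_hat_demo = np.array([[np.sin(angle)], [0.0], [np.cos(angle)]])

value_16 = sin_theta(X1_demo, X2_hat_demo)
value_residual = projection_residual_norm(X2_demo, X2_hat_demo)
```

In this toy case both computations return the sine of the rotation angle, which matches the interpretation of $\Theta$ as the angle between the estimated and true spaces.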

We now present our main theorem.

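Theorem 2. Under the conditions of Theorem 1, $\sin\Theta = O_p(n^{d_u-d})$, where $\sin\Theta$ is defined in (16).

Proof. We need to show that $n^{d-d_u}\sin\Theta$ is bounded in probability; that is,

$$\forall\,\varepsilon > 0\;\; \exists\, M_\varepsilon : \Pr\{n^{d-d_u}\sin\Theta > M_\varepsilon\} < \varepsilon \quad \forall \text{ sufficiently large } n.$$

For any constant $C$,

$$\begin{aligned}
\Pr\{n^{d-d_u}\sin\Theta > C\} ={}& \Pr\{n^{d-d_u}\sin\Theta > C \mid \mathcal{I} = \{r+1, \dots, q\}\} \times \Pr\{\mathcal{I} = \{r+1, \dots, q\}\} \\
&+ \Pr\{n^{d-d_u}\sin\Theta > C \mid \mathcal{I} \ne \{r+1, \dots, q\}\} \times \Pr\{\mathcal{I} \ne \{r+1, \dots, q\}\}.
\end{aligned} \tag{20}$$

The final term on the right side of (20) tends to 0 as $n \to \infty$ by Lemma 3. Define

$$R = n^{d-d_u}\,\|H^{\dagger}(0)\,\Delta H\,\tilde X_2\|_F \,/\, \mathrm{relgap}_0(\Lambda_1, \tilde\Lambda_2).$$

By (17), we have

$$\Pr\{n^{d-d_u}\sin\Theta > C \mid \mathcal{I} = \{r+1, \dots, q\}\} \le \frac{\Pr\{(R > C) \cap (\mathcal{I} = \{r+1, \dots, q\})\}}{\Pr\{\mathcal{I} = \{r+1, \dots, q\}\}} \le \frac{\Pr\{R > C\}}{\Pr\{\mathcal{I} = \{r+1, \dots, q\}\}}.$$

From (18) and (19), we conclude that $R = O_p(1)$, so we can say

$$\forall\,\varepsilon > 0\;\; \exists\, M^* : \Pr\{R > M^*\} < \varepsilon/2 \quad \forall \text{ sufficiently large } n.$$

Then for the same $\varepsilon$ and $M^*$ as before and all sufficiently large $n$, we have

$$\Pr\{n^{d-d_u}\sin\Theta > M^*\} < \varepsilon/2 + o(1) < \varepsilon.$$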

5. CONSISTENT ESTIMATION OF THE COINTEGRATING RANK

Robinson and Yajima (2002) proposed using a model selection criterion described by Fujikoshi (1985), Fujikoshi and Veitch (1979), and Gunderson and Muirhead (1997) to estimate the cointegrating rank, $r > 0$. Here we adapt this criterion to our setting, obtaining a consistent estimate of $r$. The method requires some information about $d$ and $d_u$. This shortcoming applies also to the analogous method presented by Robinson and Yajima (2002), whose assumptions D, E, H, and J require an a priori lower bound for $d - d_u$. Whereas Robinson and Yajima (2002) considered the stationary case, we do not require that the original series be stationary, only that the series is stationary (though potentially noninvertible) after suitable integer differencing. We start by obtaining a rate of convergence for the $r$ smallest eigenvalues, $\lambda_1(1), \dots, \lambda_r(1)$, of $\tilde H$.

Lemma 4. Under the conditions of Theorem 1, $\|(\lambda_1(1), \dots, \lambda_r(1))\| = O_p(n^{d_u-d})$.

Proof. Because $\lambda_r(1)\chi_r(1) = \tilde H\chi_r(1)$, using $\|\cdot\|_2$ for the Euclidean norm, we have

$$\lambda_r^2(1) = \|\tilde H\chi_r(1)\|_2^2 \le 2\|H\chi_r(1)\|_2^2 + 2\|\Delta H\chi_r(1)\|_2^2 = 2\|H\chi_r(1)\|_2^2 + O_p(n^{2(d_u-d)}). \tag{21}$$

Note that

$$H\chi_r(1) = X(0)\Lambda(0)X^*(0)\chi_r(1) = X(0)\begin{pmatrix} 0 \\ \vdots \\ 0 \\ \lambda_{r+1}(0)\,\chi_{r+1}^*(0)\chi_r(1) \\ \vdots \\ \lambda_q(0)\,\chi_q^*(0)\chi_r(1) \end{pmatrix}.$$

The entries of $X(0)$ are $O(1)$. Because $H$ converges in distribution, we have $(\lambda_{r+1}(0), \dots, \lambda_q(0)) = O_p(1)$. By Theorem 2,

$$\chi_i^*(0)\chi_r(1) = O_p(n^{d_u-d}), \qquad i = r+1, \dots, q.$$

It follows that $H\chi_r(1) = O_p(n^{d_u-d})$. The lemma now follows from (21).

Our estimator of $r$ is based on eigenvalues of $I_m$, which may be obtained directly from the observed data. We denote these eigenvalues by $\hat\delta_1 \le \cdots \le \hat\delta_q$, so that $\hat\delta_i = n^{2d}\lambda_i(1)$ for $i = 1, \dots, q$. Define

$$\hat\sigma_{j,k} = \sum_{i=j}^{k} \hat\delta_i.$$

Let $\hat r$ be the minimizer of $L(u)$ for $1 \le u \le q$, where

$$L(u) = V(n)(q - u) - \hat\sigma_{u+1,q} \tag{22}$$

and $V(n)$ is a deterministic sequence. Under certain conditions on $V(n)$, the estimator $\hat r$ is consistent for $r$.

Theorem 3. If $V(n)$ is a deterministic sequence such that

$$\frac{n^{d_u+d}}{V(n)} + \frac{V(n)}{n^{2d}} \to 0,$$

then $\Pr\{\hat r = r\} \to 1$.

Proof. We follow the same lines as the proof of theorem 4 of Robinson and Yajima (2002). We have

$$\Pr\{\hat r > r\} \le \sum_{u=r+1}^{q} \Pr\{L(u) < L(r)\} \le q\,\Pr\{\hat\delta_{r+1} < V(n)\} \to 0,$$

because $\hat\delta_{r+1} = O_p(n^{2d})$ and $V(n)/n^{2d} \to 0$. Next,

$$\Pr\{\hat r < r\} \le \sum_{u=1}^{r-1} \Pr\{L(u) < L(r)\} \le r\,\Pr\{V(n) < \hat\delta_r\} \to 0,$$

because $\hat\delta_r = O_p(n^{d_u+d})$ and $V(n)/n^{d_u+d} \to \infty$.
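The criterion (22) can be sketched in a few lines. In the sketch below, the function name `estimate_rank`, the search range $u = 1, \dots, q$, and the numerical eigenvalues and threshold are all illustrative assumptions made here; in particular, the article does not prescribe a specific value of $V(n)$.

```python
def estimate_rank(eigenvalues, v_n):
    """Minimize L(u) = v_n * (q - u) - sum of the q - u largest eigenvalues."""
    delta = sorted(eigenvalues)          # ascending: delta[0] is the smallest
    q = len(delta)

    def criterion(u):
        # sum(delta[u:]) adds the eigenvalues with ranks u+1, ..., q.
        return v_n * (q - u) - sum(delta[u:])

    return min(range(1, q + 1), key=criterion)

# Synthetic eigenvalues: two near zero and two of order one.
r_hat = estimate_rank([7.0, 0.002, 3.0, 0.005], v_n=0.1)
```

Because $L(u+1) - L(u) = \hat\delta_{u+1} - V(n)$, the criterion decreases as long as the next eigenvalue lies below $V(n)$ and increases afterward, so the minimizer in effect counts the eigenvalues that fall below the threshold.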

6. APPLICATION

We applied the methods of this article to a multivariate series of interest rates on U.S. Treasury securities, with maturities of 3 months, 6 months, 1 year, 3 years, 5 years, 7 years, 10 years, and 30 years. The observations were daily, spanning the period January 1, 1982-December 31, 2001. The sample size is $n = 4{,}999$. The data were obtained from the database of the Federal Reserve Board, at http://www.federalreserve.gov/releases/. We took logarithms of the interest rates, differenced the logarithms once, and applied a first-order taper. We then computed eigenvalues and orthonormal eigenvectors of the resulting averaged periodogram matrix, using a bandwidth of $m = 10$. Tapered Gaussian semiparametric estimates of the memory parameter of the tapered differences were not significantly different from 0 for any of the series. This provides some justification for our assumption that all of the series share the same memory parameter. More formally, we used these estimates and their asymptotic standard errors in the Bonferroni-based test for a common memory parameter described by Robinson and Yajima (2002, p. 228). At the 5% level of significance, the test failed to reject the null hypothesis that all of the interest rate series share a common memory parameter, regardless of which of the three different bandwidths for the estimators ($n^{.4}$, $n^{.5}$, or $n^{.6}$) is used.

Figure 1. Log Interest Rates at Maturities of 3 Months, 6 Months, 7 Years, and 10 Years; Daily Data, 1/1/1982-12/31/2001.

Table 1. Eigenvectors, $\chi_j$, and Corresponding Eigenvalues, $\lambda_j$, of Averaged Periodogram for Tapered Differenced Log Interest Rate Data, $m = 10$

| Maturity | $\chi_1$ | $\chi_2$ | $\chi_3$ | $\chi_4$ | $\chi_5$ | $\chi_6$ | $\chi_7$ | $\chi_8$ |
|---|---|---|---|---|---|---|---|---|
| 3 months | -.0153 | .0088 | .3931 | -.0192 | .2733 | .5759 | -.4513 | .4846 |
| 6 months | -.0096 | -.0956 | -.7932 | .0587 | -.1807 | -.0122 | -.2971 | .4869 |
| 1 year | .0282 | .1764 | .4311 | .1664 | -.4245 | -.5806 | -.1202 | .4720 |
| 3 years | -.1525 | -.4270 | .0503 | -.6407 | .3301 | -.2985 | .2557 | .3434 |
| 5 years | .0863 | .7858 | -.1570 | -.0636 | .3735 | -.0217 | .3660 | .2693 |
| 7 years | .7024 | -.3594 | .0284 | .3860 | .1254 | .0713 | .3920 | .2306 |
| 10 years | -.6881 | -.1618 | .0206 | .5143 | .0447 | .1251 | .4205 | .2026 |
| 30 years | .0371 | .0674 | .0453 | -.3747 | -.6681 | .4699 | .4047 | .1436 |
| $\lambda_j$ | 1.80 x 10^-8 | 3.50 x 10^-8 | 5.11 x 10^-8 | 1.25 x 10^-7 | 7.14 x 10^-7 | 3.94 x 10^-6 | 1.01 x 10^-4 | 8.47 x 10^-4 |

Figure 1 plots the log interest rates for two short-term maturities (3 months and 6 months) and two long-term maturities (7 years and 10 years). Two groups are apparent, one group for the short-term rates and the other for the long-term rates. Table 1 gives the eigenvalues, $\lambda_j$, of the averaged periodogram matrix, sorted in order from lowest to highest, as well as the corresponding eigenvectors, $\chi_j$. The ratio of the largest to the smallest eigenvalue, $\lambda_8/\lambda_1$, is $4.7 \times 10^4$. The largeness of this ratio is an indication of near-singularity of the averaged periodogram matrix and thus of potential fractional cointegration of the time series. It is perhaps of interest to note that for $\chi_1$, the only components with an absolute coefficient exceeding .1 are those corresponding to maturities of 3 years, 7 years, and 10 years. Furthermore, $\chi_3$ has large coefficients only for maturities of 3 months, 6 months, 1 year, and 5 years. For $j < 5$, the coefficients of $\chi_j$ for 3 months and 6 months are either both small or both large, as are the coefficients for 7 years and 10 years.
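The summary figures quoted from Table 1 can be reproduced directly from the tabulated values. The following sketch uses only the numbers printed in Table 1; the variable names are chosen here for illustration.

```python
import math

# Eigenvalues from the last row of Table 1, in increasing order.
eigenvalues = [1.80e-8, 3.50e-8, 5.11e-8, 1.25e-7, 7.14e-7, 3.94e-6, 1.01e-4, 8.47e-4]

# First eigenvector (first column of Table 1), ordered by maturity.
maturities = ["3 months", "6 months", "1 year", "3 years",
              "5 years", "7 years", "10 years", "30 years"]
chi_1 = [-.0153, -.0096, .0282, -.1525, .0863, .7024, -.6881, .0371]

ratio = eigenvalues[-1] / eigenvalues[0]          # largest over smallest
norm_chi_1 = math.sqrt(sum(c * c for c in chi_1)) # should be close to 1
large = [m for m, c in zip(maturities, chi_1) if abs(c) > 0.1]
```

The ratio comes out at roughly $4.7 \times 10^4$, the first eigenvector has unit length up to rounding in the table, and the maturities in `large` are 3 years, 7 years, and 10 years, in agreement with the text.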

We multiplied the transpose of each eigenvector by the multivariate series of logarithms of nontapered interest rates, yielding eight univariate residual series. For each residual series, we computed the Gaussian semiparametric estimator of the memory parameter of the tapered differences, using a first-order taper and a bandwidth of 30. For this bandwidth, Figure 2 plots the log periodogram of the tapered differenced series versus log frequency for each of the residual series. It can be seen that the differenced residual series corresponding to the largest eigenvalues seem to have a memory parameter of nearly 0, because the scatterplot shows a slope of approximately 0, whereas those corresponding to the smaller eigenvalues seem to have increasingly negative memory parameters. This suggests the potential presence of fractional cointegration.
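To make the link between the periodogram slope and the memory parameter concrete, the following sketch implements an untapered local Whittle (Gaussian semiparametric) objective in its usual form, $R(d) = \log\{m^{-1}\sum_{j=1}^{m} \lambda_j^{2d} I_j\} - 2d\, m^{-1}\sum_{j=1}^{m} \log \lambda_j$, minimized over a grid. This is an assumption-laden illustration rather than the tapered estimator used in the article: no taper is applied, no differencing is done, and the names `periodogram` and `local_whittle`, the grid limits, and the simulated series are hypothetical.

```python
import cmath
import math
import random


def periodogram(x, m):
    """Fourier frequencies and periodogram ordinates I_j for j = 1, ..., m."""
    n = len(x)
    freqs, pgram = [], []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        w = sum(x[t] * cmath.exp(1j * lam * (t + 1)) for t in range(n))
        freqs.append(lam)
        pgram.append(abs(w) ** 2 / (2.0 * math.pi * n))
    return freqs, pgram


def local_whittle(x, m, lo=-1.5, hi=1.5, step=0.001):
    """Grid minimizer of the untapered local Whittle objective R(d)."""
    freqs, pgram = periodogram(x, m)
    mean_log = sum(math.log(lam) for lam in freqs) / m

    def objective(d):
        g = sum(lam ** (2.0 * d) * i_j for lam, i_j in zip(freqs, pgram)) / m
        return math.log(g) - 2.0 * d * mean_log

    grid = [lo + k * step for k in range(int(round((hi - lo) / step)) + 1)]
    return min(grid, key=objective)


random.seed(1)
n, m = 512, 30
noise = [random.gauss(0.0, 1.0) for _ in range(n)]  # memory parameter 0
walk, level = [], 0.0
for e in noise:                                      # memory parameter 1
    level += e
    walk.append(level)

d_noise = local_whittle(noise, m)
d_walk = local_whittle(walk, m)
print("white noise:", d_noise, "random walk:", d_walk)
```

A flat log periodogram at low frequencies corresponds to an estimate near 0, and a steeply decreasing one to a larger estimate. Strongly negative memory parameters, as arise for overdifferenced residuals, are exactly the case in which the untapered version is unreliable, which is why the text works with tapered differences.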

The Gaussian semiparametric estimators of the memory parameters $d_u$ for each series are given in Figure 2. In each case, the approximate standard error for the estimator is .146. The

[Figure 2: eight panels, one per eigenvector, each plotting the log periodogram (vertical axis) against log frequency (horizontal axis, roughly -6.5 to -3.5). Panel titles: $\chi_1$, $d_u = -.6131$; $\chi_2$, $d_u = -.4522$; $\chi_3$, $d_u = -.7785$; $\chi_4$, $d_u = -.7072$; $\chi_5$, $d_u = -.6033$; $\chi_6$, $d_u = -.3212$; $\chi_7$, $d_u = -.0007$; $\chi_8$, $d_u = .0944$.]

Figure 2. Log Periodograms of Tapered Differenced Cointegrating Residuals Versus Log Frequency. Each residual process is obtained by multiplying an eigenvector $\chi_j$ by the log interest rate data. Tapered Gaussian semiparametric estimated values of $d_u$ with bandwidth 30 are also provided.


standard error was computed using the finite-sample expression given by Hurvich and Chen (2000). Only the series corresponding to the six smallest eigenvalues have estimated memory parameters that are significantly less than 0. Furthermore, there is evidence that the underlying memory parameters corresponding to these six residual series are not all the same. This suggests that some of the cointegrating relationships are stronger than others. We discuss this point further in the next section.
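As a worked check of the significance statement, the estimates reported in Figure 2 can be divided by the common standard error of .146. The sketch below assumes a one-sided test at the 5% level with the normal critical value -1.645; the text does not state which test or level was used, so that choice is an assumption made for illustration.

```python
# Estimates of d_u from the panels of Figure 2, indexed by eigenvector number.
estimates = {
    1: -0.6131, 2: -0.4522, 3: -0.7785, 4: -0.7072,
    5: -0.6033, 6: -0.3212, 7: -0.0007, 8: 0.0944,
}
standard_error = 0.146
critical_value = -1.645  # assumed: one-sided 5% normal critical value

z_ratios = {j: d / standard_error for j, d in estimates.items()}
significant = sorted(j for j, z in z_ratios.items() if z < critical_value)

for j in sorted(z_ratios):
    print(j, round(z_ratios[j], 2))
print("significantly negative:", significant)
```

Under this assumed rule the residual series for the six smallest eigenvalues are flagged and the remaining two are not, in line with the statement in the text.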

It should be stressed that as yet there is no full theoretical basis to justify the foregoing Gaussian semiparametric estimators. However, it seems clear that the presence of fractional cointegration would place rather strict upper bounds on the bandwidth that could be used. This is the reason why we have used the relatively small bandwidth of 30. We also tried a bandwidth of 70. This gave estimated memory parameters that are somewhat closer to 0, but that may be contaminated by bias.

We computed $\hat r$ as the minimizer of $L(u)$ given by (22). We tried several choices for the tuning parameter, $V(n)$. Most of these choices yielded either $\hat r = 6$ or $\hat r = 7$. Overall, this suggests a cointegrating rank of $r = 6$, because the estimated memory parameter is significantly negative for $\chi_6$ but not for $\chi_7$.

7. DISCUSSION

In the case of standard nonparametric cointegration, in which $(d, d_u)$ (on differences) is known to be $(0, -1)$, Bierens (1997) obtained estimators of the cointegrating vectors based on solutions to a generalized eigenvalue problem and showed that for these estimators $\sin\theta = O_p(n^{-1})$, where $\sin\theta$ is defined in the same way as in this article. It might be interesting to compare Bierens's (1997) estimators to ours in the case where $(d, d_u) = (0, -1)$, especially because one of the matrices used in Bierens's generalized eigenvalue problem is closely related to the averaged periodogram matrix with a fixed degree, $m$, of smoothing.

Although we have considered the consistent estimation of the cointegrating rank $r$ assuming that cointegration is known to exist ($r > 0$), we have not presented a test for the presence of fractional cointegration. A Hausman-type test for fractional cointegration was proposed by Marinucci and Robinson (2001), who showed that it works well in simulations. Presumably, the methods of this article could be safely applied once the null hypothesis of no cointegration ($r = 0$) is rejected by such a test.

We stress here that the results obtained in this article allow the cointegrating errors to have different memory parameters. Because we do not assume that $f_{uu}(0)$ is positive definite, it follows that $\{u_t\}$, unlike $\{x_t\}$, is allowed to be cointegrated. If $\{u_t\}$ is cointegrated, then, analogously to (1), we can write

$$u_t = \tilde A_1 u_t^{(1)} + \tilde A_2 u_t^{(2)},$$

so that

$$y_t = A_0 x_t + A_1 u_t^{(1)} + A_2 u_t^{(2)},$$

where $A_1 = B\tilde A_1$ and $A_2 = B\tilde A_2$. In general, we can elaborate the common trend representation in (1) as follows. Let $\{u_t^{(k)}\}$, $k = 1, \dots, s$, be $r_k \times 1$ series with memory parameter $d_{u_k}$, with $\sum_{k=1}^{s} r_k = r$ and $d > d_{u_1} > d_{u_2} > \cdots > d_{u_s}$. Let $A_k$ be $q \times r_k$ full-rank matrices for $k = 1, \dots, s$. We have the common trend representation

$$y_t = A_0 x_t + A_1 u_t^{(1)} + \cdots + A_s u_t^{(s)}, \tag{23}$$

where $A_0$ is a full-rank $q \times (q - r)$ matrix, as was the matrix $A$ in (1). This representation implies that some of the cointegrating relationships among $y_t$ are stronger than others, as we saw in our analysis of the interest rate data. All of the theoretical results in this article hold for this elaborated model, if we take $d_u = d_{u_1}$ throughout. Indeed, the elaborated model is simply a special case of the model that we have assumed in Section 2. Under model (23), there will exist cointegrating vectors such that the cointegrating residual has memory parameter $d_{u_k}$, for each $k$. We are currently exploring the ability of the estimators that we have studied in this article to approximately recover such cointegrating vectors.

APPENDIX: PROOFS

For convenience of notation, throughout we write $v_x = q - r$, $v_u = r$, and $v = v_x + v_u = q$.

Proof of Lemma 1

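From (11) and the properties (8) and (9) of the spectral representation, we conclude that $Z_n(\Delta_j) = \overline{Z_n(-\Delta_j)}$ for $j = 1, \dots, M$ and

$$E[\operatorname{Re} Z_n(\Delta_j)\, \operatorname{Re} Z_n(\Delta_k)'] = E[\operatorname{Re} Z_n(\Delta_j)\, \operatorname{Im} Z_n(\Delta_k)'] = E[\operatorname{Im} Z_n(\Delta_j)\, \operatorname{Re} Z_n(\Delta_k)'] = E[\operatorname{Im} Z_n(\Delta_j)\, \operatorname{Im} Z_n(\Delta_k)'] = 0 \quad (j \ne k).$$

It therefore suffices to prove that $Z_n(\Delta) \xrightarrow{d} Z_{G_0}(\Delta)$ for any interval $\Delta$ with nonzero endpoints and $\Delta \cap -\Delta = \emptyset$. By the Cramér-Wold device, this is equivalent to showing that for all $\alpha, \beta \in \mathbb{R}^v$, $\alpha' \operatorname{Re} Z_n(\Delta) + \beta' \operatorname{Im} Z_n(\Delta) \xrightarrow{d} \alpha' \operatorname{Re} Z_{G_0}(\Delta) + \beta' \operatorname{Im} Z_{G_0}(\Delta)$. Note that $E[Z_n(\Delta) Z_n^*(\Delta)] = G_n(\Delta) \to G_0(\Delta)$ as $n \to \infty$. Note also that from the properties of the spectral representation, $E[Z_n(\Delta) Z_n^*(-\Delta)] = 0$. It follows that

$$\begin{aligned}
\operatorname{var}[\alpha' \operatorname{Re} Z_n(\Delta) + \beta' \operatorname{Im} Z_n(\Delta)]
&= \tfrac{1}{4}(\alpha' - i\beta')\, G_n(\Delta)\, (\alpha + i\beta) + \tfrac{1}{4}(\alpha' + i\beta')\, \overline{G_n(\Delta)}\, (\alpha - i\beta) \\
&\to \tfrac{1}{4}(\alpha' - i\beta')\, G_0(\Delta)\, (\alpha + i\beta) + \tfrac{1}{4}(\alpha' + i\beta')\, \overline{G_0(\Delta)}\, (\alpha - i\beta) \\
&= \operatorname{var}[\alpha' \operatorname{Re} Z_{G_0}(\Delta) + \beta' \operatorname{Im} Z_{G_0}(\Delta)] \\
&:= \sigma^2 \ge 0.
\end{aligned}$$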

Note that it is possible that $\sigma^2 = 0$, because the limiting normal distribution as asserted in Lemma 1 may have a singular covariance matrix. We will require bounds on the entries of $\psi_\Delta(s, n)$, where here we explicitly denote the dependence on $n$ as well as $s$. Without loss of generality, we assume that $\Delta = (A, B]$, where $0 < A < B$. We denote the $(a, b)$ entry by $\psi_\Delta(s, n)_{ab}$. Using integration by parts, we have

$$\begin{aligned}
\psi_\Delta(s, n)_{ab} &= \frac{1}{2\pi} \int_{A/n}^{B/n} e^{-isx}\, \Psi_{ab}(x)\, dx \\
&= \frac{1}{2\pi} \frac{1}{-is}\, e^{-isx}\, \Psi_{ab}(x) \Big|_{A/n}^{B/n} - \frac{1}{2\pi} \frac{1}{-is} \int_{A/n}^{B/n} e^{-isx}\, \Psi'_{ab}(x)\, dx.
\end{aligned}$$

In the sequel, we use $C$ to denote a generic constant. From (2) and (3), we have, for all sufficiently small $x > 0$, $|\Psi_{ab}(x)| \le C x^{-d}$ and $|\Psi'_{ab}(x)| \le C x^{-d-1}$ for $a \in \{1, \dots, v_x\}$, and $|\Psi_{ab}(x)| \le C x^{-d_u}$ and $|\Psi'_{ab}(x)| \le C x^{-d_u-1}$ for $a \in \{v_x + 1, \dots, v\}$. We obtain, for all $s \ne 0$,

$$|\psi_\Delta(s, n)_{ab}| \le \frac{C}{|s|}\left[(A/n)^{-d} + (B/n)^{-d}\right] + \frac{C}{|s|}\, \frac{B - A}{n} \left[(A/n)^{-d-1} + (B/n)^{-d-1}\right] \le \frac{C n^{d}}{|s|} \tag{A.1}$$

for $a \in \{1, \dots, v_x\}$ and

$$|\psi_\Delta(s, n)_{ab}| \le \frac{C n^{d_u}}{|s|} \tag{A.2}$$

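for $a \in \{v_x + 1, \dots, v\}$. If $\alpha_a$ is the $a$th element of $\alpha$ and $\varepsilon_{t,b}$ is the $b$th element of $\varepsilon_t$, then

$$\begin{aligned}
\alpha' \operatorname{Re} Z_n(\Delta) + \beta' \operatorname{Im} Z_n(\Delta)
&= \alpha' n^{1/2} A_n \sum_{s=-\infty}^{\infty} \operatorname{Re} \psi_\Delta(s)\, \varepsilon_s + \beta' n^{1/2} A_n \sum_{s=-\infty}^{\infty} \operatorname{Im} \psi_\Delta(s)\, \varepsilon_s \\
&= n^{1/2-d} \sum_{a=1}^{v_x} \sum_{b=1}^{v} \sum_{s=-\infty}^{\infty} \left[\alpha_a \operatorname{Re} \psi_\Delta(s)_{ab}\, \varepsilon_{s,b} + \beta_a \operatorname{Im} \psi_\Delta(s)_{ab}\, \varepsilon_{s,b}\right] \\
&\quad + n^{1/2-d_u} \sum_{a=v_x+1}^{v} \sum_{b=1}^{v} \sum_{s=-\infty}^{\infty} \left[\alpha_a \operatorname{Re} \psi_\Delta(s)_{ab}\, \varepsilon_{s,b} + \beta_a \operatorname{Im} \psi_\Delta(s)_{ab}\, \varepsilon_{s,b}\right] \\
&= \sum_{s=-\infty}^{\infty} W_{ns},
\end{aligned}$$

where for each $n$, the $\{W_{ns}\}_{s=-\infty}^{\infty}$ are independent random variables given by

$$W_{ns} = n^{1/2} \sum_{b=1}^{v} \left\{ n^{-d} \sum_{a=1}^{v_x} \left[\alpha_a \operatorname{Re} \psi_\Delta(s)_{ab} + \beta_a \operatorname{Im} \psi_\Delta(s)_{ab}\right] + n^{-d_u} \sum_{a=v_x+1}^{v} \left[\alpha_a \operatorname{Re} \psi_\Delta(s)_{ab} + \beta_a \operatorname{Im} \psi_\Delta(s)_{ab}\right] \right\} \varepsilon_{s,b}. \tag{A.3}$$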

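Because $|\operatorname{Re} \psi_\Delta(s)_{ab}| \le |\psi_\Delta(s)_{ab}|$ and $|\operatorname{Im} \psi_\Delta(s)_{ab}| \le |\psi_\Delta(s)_{ab}|$ for $a, b \in \{1, \dots, v\}$, we conclude that for $s \ne 0$,

$$E[W_{ns}^2] \le C n \sum_{b=1}^{v} \left\{ n^{-2d} \sum_{a=1}^{v_x} |\psi_\Delta(s)_{ab}|^2 + n^{-2d_u} \sum_{a=v_x+1}^{v} |\psi_\Delta(s)_{ab}|^2 \right\} \le C n / s^2, \tag{A.4}$$

where the final inequality follows from the bounds (A.1) and (A.2) for the entries of $\psi_\Delta(s, n)$.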

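Let $V_0(n)$ be a nondecreasing sequence, to be determined later. We have

$$\sum_{s=-\infty}^{\infty} W_{ns} = \sum_{|s| \le V_0(n)} W_{ns} + \sum_{|s| > V_0(n)} W_{ns}.$$

Using (A.4), we have

$$E\Big[\Big(\sum_{|s| > V_0(n)} W_{ns}\Big)^2\Big] = \sum_{|s| > V_0(n)} E[W_{ns}^2] \le C n \sum_{|s| > V_0(n)} \frac{1}{s^2} \le C\, \frac{n}{V_0(n)}.$$

If we choose $V_0(n)$ so that $n / V_0(n) \to 0$ as $n \to \infty$, the lemma will follow if we can show that

$$\sum_{|s| \le V_0(n)} W_{ns} \xrightarrow{d} N(0, \sigma^2) \quad \text{if } \sigma^2 > 0,$$

and that

$$\sum_{|s| \le V_0(n)} W_{ns} \xrightarrow{p} 0 \quad \text{if } \sigma^2 = 0.$$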

In the case where $\sigma^2 = 0$, the foregoing equation follows because $\sum_{|s| \le V_0(n)} E[W_{ns}^2] = \sigma^2 + o(1)$. Now suppose that $\sigma^2 > 0$. By the Lyapounov condition (see, e.g., Billingsley 1986, p. 371), it suffices to show that

$$\frac{\sum_{|s| \le V_0(n)} E[W_{ns}^4]}{\big(\sum_{|s| \le V_0(n)} E[W_{ns}^2]\big)^2} \to 0.$$

Because $\sum_{|s| \le V_0(n)} E[W_{ns}^2] = \sigma^2 + o(1)$, it suffices to show that $\sum_{|s| \le V_0(n)} E[W_{ns}^4] \to 0$ for a suitably chosen nondecreasing sequence with $n / V_0(n) \to 0$.

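Because $E\|\varepsilon_t\|^4 < \infty$, we have, from (A.3),

$$E[W_{ns}^4] \le C n^2 \sum_{b=1}^{v} \left\{ \sum_{a=1}^{v_x} |n^{-d} \psi_\Delta(s)_{ab}|^4 + \sum_{a=v_x+1}^{v} |n^{-d_u} \psi_\Delta(s)_{ab}|^4 \right\}.$$

For $a \in \{1, \dots, v_x\}$, we have, from the Cauchy-Schwarz inequality,

$$\begin{aligned}
\max_s |n^{-d} \psi_\Delta(s, n)_{ab}|
&\le \frac{n^{-d}}{2\pi} \left( \int_{A/n}^{B/n} |\Psi_{ab}(x)|^2\, dx \right)^{1/2} \left( \int_{A/n}^{B/n} dx \right)^{1/2} \\
&\le C n^{-d} \left[(B/n)^{-2d+1} + (A/n)^{-2d+1}\right]^{1/2} \left( \frac{B - A}{n} \right)^{1/2} \\
&= C n^{-1}.
\end{aligned}$$

Using similar arguments, we obtain overall that $\max_s E[W_{ns}^4] = O(1/n^2)$, and therefore that

$$\sum_{|s| \le V_0(n)} E[W_{ns}^4] \le C\, \frac{V_0(n)}{n^2}.$$

The proof of the lemma is therefore completed by choosing $V_0(n)$ to be any nondecreasing sequence such that $n / V_0(n) \to 0$ and $V_0(n) / n^2 \to 0$, for example, $V_0(n) = [n^{1.5}]$.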
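The two rate conditions on the truncation sequence can be checked directly for the example choice. The sketch below assumes that the bracket notation in the example denotes the integer part, so that the sequence is the integer part of $n^{1.5}$; the helper name `v0` is hypothetical.

```python
import math


def v0(n):
    """Integer part of n**1.5, computed exactly as floor(sqrt(n**3))."""
    return math.isqrt(n ** 3)


sizes = [10 ** 2, 10 ** 4, 10 ** 6]
tail_rate = [n / v0(n) for n in sizes]         # n / V0(n), should shrink
fourth_rate = [v0(n) / n ** 2 for n in sizes]  # V0(n) / n^2, should shrink

for n, a, b in zip(sizes, tail_rate, fourth_rate):
    print(n, a, b)
```

Both ratios behave like $n^{-1/2}$ for this choice, so the tail bound and the fourth-moment bound vanish at the same rate.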


Proof of Theorem 1

By the Cramér-Wold device, applied to complex-valued random variables, it suffices to show that any linear combination of the $vm$ complex-valued random variables contained in $\{A_n w_j^T\}_{j=1}^{m}$, with fixed complex-valued coefficients, converges in distribution to the corresponding linear combination of the limit distribution given in the statement of Theorem 1. The initial linear combination can be expressed as

$$Y_n = \sum_{j=1}^{m} a_j' A_n w_j^T,$$

where $a_j$ are $v \times 1$ vectors of complex numbers. Using the definitions given here and in the preceding section, together with the change of variable $s = t - k$, we can write

m

a; (Anwf ) j=1

nm n 00

1=

aAnt t-1k

kEt-k exp(iijt)

nm n

2 n a;

A n t-

V27rncp j=1 Jrtkl

k=-o k=-

Im n S 1

LajAn htY 2r np

j=1 t=l1

x 1 ei(t-s) ( Es exp(i=Ljt)

S=-oof

S m n V/27rn

I27 j=

s=- j=1 t=1

x exp(it(X +

jl))AnI(u.)

dAies.

2nJ-/

Using the same argument as before and defining

1 m n h*())

=1Jp a; exp(it(X ++(j)), j=1 t=1

we conclude that

Yn = I

e-iSh*(() An( ()ddes. (A.5) S S=-OO

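Let $A$ be a real number with $0 < A < n\pi$. We write $Y_n = Y_n^A + R_n$, where

$$Y_n^A = \sum_{s=-\infty}^{\infty} \frac{1}{2\pi} \int_{-A/n}^{A/n} e^{-is\lambda}\, h_n^*(\lambda)\, A_n \Psi(\lambda)\, d\lambda\; \varepsilon_s, \tag{A.6}$$

and

$$R_n = \sum_{s=-\infty}^{\infty} \frac{1}{2\pi} \int_{[-\pi,\pi] \setminus [-A/n, A/n]} e^{-is\lambda}\, h_n^*(\lambda)\, A_n \Psi(\lambda)\, d\lambda\; \varepsilon_s. \tag{A.7}$$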

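Define

$$K_n^*(x) = n^{-1/2}\, h_n^*(x/n)$$

and

$$K_0^*(x) = \sum_{j=1}^{m} a_j'\, \Delta_p(x + 2\pi j).$$

Lemma A.1. As $n \to \infty$, $K_n^*(x) \to K_0^*(x)$, uniformly on $[-A, A]$ for any fixed $A > 0$.

Lemma A.2. For $-p + 1/2 < d_u < d < 1/2$,

$$\lim_{n \to \infty} \int_{[-\pi,\pi]} \|h_n^*(x)\|^2\, \operatorname{trace}\{A_n\, dG(x)\, A_n\} = \int_{-\infty}^{\infty} \|K_0^*(x)\|^2\, \operatorname{trace}\{dG_0(x)\} < \infty. \tag{A.8}$$

Lemma A.3.

$$\lim_{A \to \infty} \int_{[-n\pi, n\pi] \setminus [-A, A]} \|K_n^*(x)\|^2\, \operatorname{trace}\{dG_n(x)\} = 0, \tag{A.9}$$

uniformly for $n = 1, 2, \dots$.

It follows from the foregoing properties that we can approximate $K_n^*$ on $[-A, A]$ by step functions of the form

$$g_A(x) = \sum_{\ell=-L}^{L} g_{A\ell}\, 1_{\Delta_\ell}(x),$$

where $\Delta_{-L}, \dots, \Delta_L$ partitions $[-A, A]$ into equal subintervals and $g_{A0} = 0$. Specifically, we have the following.

Lemma A.4. There exist step functions $\{g_A\}_{A>0}$ as before, such that for any $\epsilon > 0$,

$$\int_{-A}^{A} \|K_n^* - g_A\|^2\, \operatorname{trace}\{dG_n(x)\} < \epsilon, \tag{A.10}$$

when $A > A(\epsilon)$ and $n > n(\epsilon)$, and such that

$$\int_{-A}^{A} \|K_0^* - g_A\|^2\, \operatorname{trace}\{dG_0(x)\} < \epsilon, \tag{A.11}$$

when $A > A(\epsilon)$.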

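Define

$$I_n^A = n^{1/2} \sum_{s=-\infty}^{\infty} \sum_{\ell=-L}^{L} g_{A\ell}\, \frac{1}{2\pi} \int_{\Delta_\ell / n} e^{-isx}\, A_n \Psi(x)\, dx\; \varepsilon_s = \sum_{\ell=-L}^{L} g_{A\ell}\, Z_n(\Delta_\ell). \tag{A.12}$$

It follows from Lemma 1 that as $n \to \infty$, for fixed $A$,

$$I_n^A \xrightarrow{d} I^A := \sum_{\ell=-L}^{L} g_{A\ell}\, Z_{G_0}(\Delta_\ell). \tag{A.13}$$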

We complete the proof of the theorem by showing that $Y_n \xrightarrow{d} Y$, where $Y$ is a complex normal random variable given by

$$Y := \int_{-\infty}^{\infty} K_0^*(x)\, dZ_{G_0}(x).$$

If $I^A \xrightarrow{d} Y$ as $A \to \infty$ and for all $\epsilon > 0$,

$$\lim_{A} \limsup_{n} P[|Y_n - I_n^A| \ge \epsilon] = 0, \tag{A.14}$$

then it will follow that $Y_n \xrightarrow{d} Y$ by theorem 25.5 of Billingsley (1986).

We prove (A.14) in Lemma A.5. It remains to show that $I^A \xrightarrow{d} Y$ as $A \to \infty$.

We have

$$I^A - Y = \int_{-\infty}^{\infty} \big(g_A(x) - K_0^*(x)\big)\, dZ_{G_0}(x).$$

By (12) and Cauchy's inequality,

$$E|I^A - Y|^2 \le C \int_{-\infty}^{\infty} \|g_A(x) - K_0^*(x)\|^2\, \operatorname{trace}\{dG_0(x)\}.$$

It follows from (A.11) and (A.8) that $E|I^A - Y|^2 \to 0$, and hence that $I^A \xrightarrow{d} Y$ as $A \to \infty$.

Lemma A.5. For every $\epsilon > 0$,

$$\lim_{A} \limsup_{n} P(|Y_n - I_n^A| > \epsilon) = 0.$$

Proof of Lemma A.5

Because $Y_n - I_n^A = Y_n^A - I_n^A + R_n$, it suffices to show that

$$\lim_{A} \limsup_{n} E|Y_n^A - I_n^A|^2 = 0 \tag{A.15}$$

and

$$\lim_{A} \limsup_{n} E|R_n|^2 = 0. \tag{A.16}$$

We start by proving (A. 15). We have

yA -_J n n

A/ne-is[h (x) - nl/2 A(nx)]Anq(x)dx es.

Write E = Z1/2E1/2, and define the (1 x v) vector 2 (x) by

(x) = [h*(x) - nl/2g, (nx)]An(x)E1/2 = (21,

.

. ,2v). Then

var[YA - IA 00

fA/n A/n1/ S A e-isx[h (x)-nl/2 A(nx)IAnnW(x)

s= f -A/n -A/n

x Ybr*(y)A'neisY[hn(y)

- nl/2gA(ny)]dxdy

00 A/n TA/n =

f el-isx2 (x) dx elSXl2*(y)

dy s=-oo -A/n J-A/n

V 00 A/n 2

=I e-lsx2jW(x)dx

j=l s=-oo-A/n

-Aln = II 2(x)II2 dx (A.17)

fa/n 1/2 g 21/22

yAln

= f lhn(x) - nl/2gA (nx) I2 S-A/n

x tracef{An (x)x *

(x)A'n} dx.

The equality in (A. 17) follows from Parseval's equality, s -o dn

e-isxnQj(x) dxI2 -

A/n IJ(x)2 dx. With a

change of variables, we obtain

var[Ya - 1A< IA lKn() - gA(x)12 trace{dGn(x)}. -A

Equation (A.15) now follows from (A.10) of Lemma A.4. We next prove (A.16). Using an argument very similar to that given in proving (A.15), we conclude from (A.7) that

$$\operatorname{var}[R_n] \le C \int_{[-n\pi, n\pi] \setminus [-A, A]} \|K_n^*(x)\|^2\, \operatorname{trace}\{dG_n(x)\}.$$

Thus (A.16) follows from (A.9).
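The Parseval step invoked for (A.17) can be checked numerically in a simple case. The sketch below assumes the standard normalization, in which the Fourier coefficients $c_s = (2\pi)^{-1}\int_{-\pi}^{\pi} e^{-isx} f(x)\,dx$ satisfy $\sum_s |c_s|^2 = (2\pi)^{-1}\int_{-\pi}^{\pi} |f(x)|^2\,dx$; the constants in the source display may be arranged differently. The indicator function of $[-a, a]$, the value of `a`, and the truncation point `S` are chosen here purely for illustration.

```python
import math


def fourier_coefficient(s, a):
    """(1 / (2*pi)) * integral over [-a, a] of exp(-i*s*x) dx, in closed form."""
    if s == 0:
        return a / math.pi
    return math.sin(s * a) / (math.pi * s)


a = 0.5        # half-width of the interval, assumed to lie inside (0, pi)
S = 100_000    # truncation point for the infinite sum over s

lhs = sum(fourier_coefficient(s, a) ** 2 for s in range(-S, S + 1))
rhs = (2.0 * a) / (2.0 * math.pi)  # (1 / (2*pi)) * integral of |f|^2 over [-a, a]
print(lhs, rhs)
```

The truncated sum agrees with the integral up to a tail of order $1/S$, which is the sense in which the sum over $s$ in the variance calculation collapses to a single integral over the shrinking band.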

Proof of Lemma A.1

We have

K,* (x)

1 m n

na/tl'-1

exp(it(x/n +

lj))

n2 j=1 t= k=O

x (-1)kei27rkt/neit(x/n+pj)

1 m np-1

n2P- 2 a- 1 ( )keit(x/n+Lj+iLk) pj=1 t=1 k=O

11 1 kDn/+(x/ln + j+k),

where $D_n(x)$ is the Dirichlet kernel, $D_n(x) = \sum_{t=1}^{n} e^{ixt}$. Terrin and Hurvich (1994, p. 303) showed that

$$\frac{1}{n} D_n(x/n + \mu_{j+k}) = \frac{1}{n} D_n([x + 2\pi(j + k)]/n) \to \Delta[x + 2\pi(j + k)]$$

uniformly on [-A,A]. Because Kn*(x) is a linear combination of

1Dn (x/n + ILj+k), we have m

K (x)- aP 2 27rcp1 j=1

p-1 x

P

k (-1)kA(x+22n(j+k)) k=0

m

= Laj Ap (x + 27j) = Ko*(x), j=1

uniformly on [-A, A].
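The convergence of the scaled Dirichlet kernel used in this proof can be seen numerically. The sketch below assumes that the limiting kernel is the Riemann-sum limit $\int_0^1 e^{ixs}\,ds = (e^{ix} - 1)/(ix)$, which is what $n^{-1}\sum_{t=1}^{n} e^{ixt/n}$ converges to for fixed $x$; identifying this with the kernel named in the proof is an assumption. The names `dirichlet`, `limit_kernel`, and `max_gap`, and the grid on $[-10, 10]$, are hypothetical.

```python
import cmath


def dirichlet(n, x):
    """Dirichlet kernel D_n(x) = sum over t = 1, ..., n of exp(i*x*t)."""
    return sum(cmath.exp(1j * x * t) for t in range(1, n + 1))


def limit_kernel(x):
    """Assumed limit: integral over [0, 1] of exp(i*x*s) ds."""
    if x == 0:
        return 1.0 + 0.0j
    return (cmath.exp(1j * x) - 1.0) / (1j * x)


def max_gap(n, half_width=10.0, points=201):
    """Largest |D_n(x/n)/n - limit_kernel(x)| over an even grid on [-A, A]."""
    grid = [-half_width + 2.0 * half_width * k / (points - 1) for k in range(points)]
    return max(abs(dirichlet(n, x / n) / n - limit_kernel(x)) for x in grid)


gap_small_n = max_gap(200)
gap_large_n = max_gap(2000)
print(gap_small_n, gap_large_n)
```

The maximum gap over the grid shrinks roughly in proportion to $1/n$, which is consistent with uniform convergence on a fixed interval.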

Proof of Lemma A.2

Applying a change of variable,

Ilhn(x) ll2 trace{An dG(x)An}

[

f ,nn1

lhn (x/n) l2 trace{AndG(x/n)An) [-nr,nr] nl

= j- ,r]

IIK*(x)II2trace{dGn(x/n)}.

From lemma 0 of Hurvich and Chen (2000),

Eheixt <C n

1l(1 + nlxl)P

hence

m Ila II IK (x)I < (1 + x + 2rjl)P j=1


It follows that

IIK* (x) 112 trace{An dG(x/n) An

< C (1 + Ix+ 2j)P (Ix

-2d + Ixl-2d"u)

because

trace{dGn(x/n)} < C(n-2d + -2du u)

= C(xlI-2d + )xl-2du") For d, du e (-p + 1/2, 1/2),

fala*11

2

S (1 + Ix + 2j)P (x2d + x2d cisij=) 111

Furthermore, 1Dn(x/n + j+k) -- A(x), and dGn(x/n) -+ dGo(x). By the dominated convergence theorem, we have

[ -nn lK (x)ll2 trace{dGn (x/n)}

- fIIKo(x)ll2

trace{dGo(x)} < oo.

Proof of Lemma A.3

The lemma will follow if

A IIKn (x) 112 trace{dGn (x)}I

[-nw,,ni ]\[-A,A]

converges to 0 uniformly for n = 1 ..., as A --+ oco. For every n, the

foregoing integral is bounded by

(1 + x + 2 2d dx

which goes to 0 as A -> X o.

Proof of Lemma 2

Denote the 2m x 1 vector qo(x)

= ({vj(x)}jm=l, {vk(x)}m=l)'. Then

= f p(x) p*(x)) 0 Go,xx

(dx).

We show that

a'a =

JRa'[(p(x) p*(x))

Go,xx(dx)]a > 0,

where a is a real-valued 2vxm x 1 constant vector. Because both

(p(x)qp*(x)) and Go0,(dx) are nonnegative definite in R, it suffices to show that

a'[(p(x)p*(x)) 0 G0,xx(dx)]a # 0 (A.18)

for x e S, where S is a Borel set with positive measure. Partitioning a =

(a... a2m), where aj are vx x 1 vectors, we write the left side of (A.18) as

/m

[ajvj(x) + am+j Vj(X)] Go,xx (dx)

x [akvk (X) +am+kvk(X)] . k=l

Note that Go,xx (dx) is positive definite everywhere in R\{0} by (7). We show (A.18) by showing that {vj(x)}j=l and

{vk(x)}m~=1 are lin-

early independent. First, we write

(eix- 1)(2p-?2-

1/2 vj(x) = -1

p-1 x j

k (- 1)k j+k(x)

+ 0-(j+k)(x)} k=O

and

(eix - 1) 2p - 2 -1/2

Sp-1

xE Pk (-1)k { --j+k(x)+ 0-(+k)(x)}, k=O

where

(x) = x + 2 = ,..., (m +p - ).

Let

p-1 (x) = (-i) k (- 1)k

j+k(x) + -(jo+k) (x)}

k=O

and

p-1 ) 1 { (x)= k (- )k{Cj+k(x ) -0--(j+k)(x)}. k=0

It suffices to show that { j(x) }= 1 and { k (x) }= 1 are linearly inde-

pendent. We show this only for the case where m > p; the case where m < p can be handled similarly. Let {cj}=1 and

{J}m be two se-

quences of complex constants where neither sequence is identically 0. Then

p-1 m = (-1)k (-io j) j+k(x)

k=O j= 1

- (iotj

+ j)0-(j+k) (x)}

m j+p-l1

E y

(-1) {(-i , + )(x) j=1 e=j

- (i + 4 Ij) q_(x)}

m+p-1 = m be e(x), (A.19)

lel=1

where

min(m,e)

be = j (-1)-(-iJ + f) j=max(1, -p+ 1)

1 <f < (m+p- 1)

min(m, fe)

:

(--1) -(--i

-

j), j=max ( 1, -p+ 1)

- (m+p - 1) <e <-1.


We need to show that the $\{\phi_\ell(x)\}$'s are linearly independent and the $\{b_\ell\}$'s are not all 0, and hence that (A.19) is not almost everywhere the zero function. We first show that $\{\phi_\ell(x)\}_\ell$ are linearly independent for $x \in S$, where $S$ is a Borel set of nonzero measure. We prove it by contradiction. If $\phi_\ell(x)$ are linearly dependent, then there exists a sequence of constants $c_\ell$, $\ell = \pm 1, \pm 2, \dots, \pm L$, such that $c_\ell$ are not all 0 and

$$\sum_{|\ell|=1}^{L} c_\ell\, \phi_\ell(x) = \frac{\sum_{|\ell|=1}^{L} c_\ell \prod_{h \ne \ell, 0} (x + 2h\pi)}{\prod_{h \ne 0} (x + 2h\pi)} = 0$$

for $x \in S$. For $x \ne 2\pi\ell$, $\ell = \pm 1, \pm 2, \dots, \pm L$, we have

$$\sum_{|\ell|=1}^{L} c_\ell \prod_{h \ne \ell, 0} (x + 2h\pi) = 0. \tag{A.20}$$

Since $c_\ell$ are not all zero, the LHS of the above equation is a polynomial of order $2L - 1$. Hence (A.20) can only have $2L - 1$ solutions which are a subset of $\{2\pi\ell : \ell = \pm 1, \pm 2, \dots, \pm L\}$. If we choose $S$, say, to be $((2\ell - 3/2)\pi, (2\ell - 1)\pi)$, then (A.20) is a contradiction to the linear dependence assumption of $\phi_\ell(x)$.

We now show that the $\{b_\ell\}$ are not all 0. We again prove this by contradiction. Note that

$$b_1 = -i\alpha_1 + \beta_1 \quad \text{and} \quad b_{-1} = -i\alpha_1 - \beta_1.$$

If $b_1 = b_{-1} = 0$, then $\alpha_1 = \beta_1 = 0$. Together with the assumption that $b_2 = b_{-2} = 0$, we have $\alpha_2 = \beta_2 = 0$. Continuing this induction, we have $\alpha_j = \beta_j = 0$ for all $j$. This contradicts our assumptions on $\{\alpha_j\}_{j=1}^{m}$ and $\{\beta_j\}_{j=1}^{m}$.

Proof of Lemma 3

It suffices to show for s= 1,...,q- r that Pr{r+s C} = O(ndu-d). Suppose that r + s E ZC. Then for some

" e [0, 1], we

have H(0) = 0. It follows that .1 (') = ... = r+s(W) = 0, so that dim(Ker(H('))) > r + s. Thus X 1(), - Xr+s are orthonor- mal null vectors of H('). We then have, for i E 1,..., r + s},

0 = H = [H + ~ AH]Xi() = HXi() + Op(ndu-d), so that

JHXi()l)112 - Op(n2(du-d)), i= 1 ... r + s, (A.21)

where I' -II denotes the Euclidean norm.

Because {Xj(0)}-- are an orthonormal basis for Cq, we can write

q

Xi() = a Xj(0), i= 1 ..., r + s, j=1

with

q

M = 1 L=M

LZaL j=0 L

-

M j=1

From (A.21), we have for i = 1 ..., r + 1,

q q q 2

r+ 1(0) L ot )(0) = tij.j(0)xj(0) j=r+1 j=r+1 j=r+1

= IIHxi = Op(n2(du-d)). (A.22)

Because {Xj( are orthonormal, we have o 1I

r q

XL ()XM(t) = 0 = LjMj + 3 cOqaotj, j=1 j=r+1

L, M = 1 ..., r+ 1, L # M. (A.23)

Define

i = , i= 1 . r+ 1,

alir

and define B = [&1,...,r]. We now show that B'B = Tr + op(1), where Tr is an r x r identity matrix. First, note from (A.22) and (A.23) that if L # M, then

&L&M =- > LjMj < 3a2 L aMij j=r+l j=r+l j=r+1

= Op( l2 n2(du-d)

Because 12 (0) converges in distribution to a random variable that

has no atom at 0, we have &-' - +o 0 for L # M and, similarly,

aLaL -+ 1. It follows that B'B = Tr + op(1). Because a I,...cr+1 are not linearly independent, there exist co-

efficients 1..., - - -r such that

Ur+1 = l& + + + Prtr. (A.24)

Because (p 1, ..., r)' = B- 1 r+1, we obtain

r

>M3 =r+1 (BB')1 r+1 = 1 + Op(1). M=

We now obtain an upper bound for r=1 . From (A.23), and the Cauchy-Schwarz inequality, we have

ar+lj PM >3

r+1,jatMj j=1 M=1 j=r+l

r q q2

>3MI >

ar+,j a

M=1 j=r+1 j=r+1

q r r q 2 M q2

L tr+l j [Li j=r+l M=1 L=lj=r+l

n2(du-d) r+1

0

From (A.22), we have r

r+X 1 (0) - +1 (0) r+2 1,j

= Op(n2(du-d).

j=1

Combining this with (A.25), we obtain

r+l1 (0) = Op(n2(du-d))

Thus Pr{IC {l...r}} Pr{.2+1(0) < R}, where R is a random variable that is Op (n2(du-d)). By Corollary 2, the limiting distribution of H = An-2d

o Re(Ix,j)A' has rank q - r with probability 1. We

conclude that in fact 2+1 (0) converges in distribution to a random variable that has no mass at 0. Thus,

Pr{IC {l ..., r}} < Pr{R/2+l(0) 1)} 0,

because $R / \lambda_{r+1}^2(0)$ is $o_p(1)$.

[Received June 2002. Revised January 2003.]


REFERENCES

Barlow, J. L., and Slapničar, I. (2000), "Optimal Perturbation Bounds for the Hermitian Eigenvalue Problem," Linear Algebra and Its Applications, 309, 19-43.

Bierens, H. J. (1997), "Nonparametric Cointegration Analysis," Journal of Econometrics, 77, 379-404.

Billingsley, P. (1986), Probability and Measure (2nd ed.), New York: Wiley.

Brockwell, P. J., and Davis, R. A. (1991), Time Series: Theory and Methods (2nd ed.), New York: Springer-Verlag.

Chen, W. W., and Hurvich, C. M. (2003), "Estimating Fractional Cointegration in the Presence of Polynomial Trends," Journal of Econometrics, forthcoming.

Engle, R. F., and Granger, C. W. J. (1987), "Co-Integration and Error Correction: Representation, Estimation and Testing," Econometrica, 55, 251-276.

Fujikoshi, Y. (1985), "Two Methods for Estimation of Dimensionality in Canonical Correlation Analysis and the Multivariate Linear Model," in Statistical Theory and Data Analysis, ed. K. Matusita, Amsterdam: Elsevier, pp. 233-240.

Fujikoshi, Y., and Veitch, L. G. (1979), "Estimation of Dimensionality in Canonical Correlation Analysis," Biometrika, 66, 345-351.

Gunderson, B. K., and Muirhead, R. J. (1997), "On Estimating the Dimensionality in Canonical Correlation Analysis," Journal of Multivariate Analysis, 62, 121-136.

Hurvich, C. M., and Chen, W. W. (2000), "An Efficient Taper for Potentially Overdifferenced Long-Memory Time Series," Journal of Time Series Analysis, 21, 155-180.

Marinucci, D., and Robinson, P. M. (2001), "Semiparametric Fractional Cointegration Analysis," Journal of Econometrics, 105, 225-247.

Okamoto, M. (1973), "Distinctness of the Eigenvalues of a Quadratic Form in a Multivariate Sample," The Annals of Statistics, 1, 763-765.

Robinson, P. M. (1994), "Semiparametric Analysis of Long-Memory Time Series," The Annals of Statistics, 22, 515-539.

Robinson, P. M., and Marinucci, D. (1998), "Semiparametric Frequency Domain Analysis of Fractional Cointegration," STICERD Discussion Paper EM/98/348, London School of Economics.

Robinson, P. M., and Marinucci, D. (2001), "Narrow-Band Analysis of Nonstationary Processes," The Annals of Statistics, 29, 947-986.

Robinson, P. M., and Yajima, Y. (2002), "Determination of Cointegrating Rank in Fractional Systems," Journal of Econometrics, 106, 217-241.

Terrin, N., and Hurvich, C. M. (1994), "An Asymptotic Wiener-Ito Representation for the Low-Frequency Ordinates of the Periodogram of a Long Memory Time Series," Stochastic Processes and Their Applications, 54, 297-307.

Terrin, N., and Taqqu, M. S. (1991), "Convergence in Distribution of Sums of Bivariate Appell Polynomials With Long-Range Dependence," Probability Theory and Related Fields, 90, 57-81.
