asymptotic normality of narrow-band least squares in the stationary fractional cointegration model...

$: Asymptotic normality of narrow-band least squares in the stationary fractional cointegration model and volatility forecasting$
ARTICLE IN PRESS

Journal of Econometrics 133 (2006) 343–371

0304-4076/$ -

doi:10.1016/j

�CorrespoE-mail ad

www.elsevier.com/locate/jeconom

Asymptotic normality of narrow-band leastsquares in the stationary fractional cointegration

model and volatility forecasting

Bent Jesper Christensena, Morten Ørregaard Nielsenb,�

aSchool of Economics and Management, University of Aarhus, 322 University Park,

DK-8000 Aarhus C, DenmarkbDepartment of Economics, Cornell University, 482 Uris Hall, Ithaca, NY 14853, USA

Accepted 2 August 2004

Available online 12 May 2005

Abstract

We consider semiparametric frequency domain analysis of cointegration between long

memory processes, i.e. fractional cointegration, allowing derivation of useful long-run

relations even among stationary processes. The approach is due to Robinson (1994b. Annals

of Statistics 22, 515–539) and uses a degenerating part of the periodogram near the origin to

form a narrow-band frequency domain least squares (FDLS) estimator of the cointegrating

relation, which is consistent for arbitrary short-run dynamics. We derive the asymptotic

distribution theory for the FDLS estimator of the cointegration vector in the stationary long

memory case, thus complementing Robinson’s consistency result. An application to the

relation between the volatility realized in the stock market and the associated implicit volatility

derived from option prices is offered.

r 2005 Elsevier B.V. All rights reserved.

JEL classification: C14; C22; G13

Keywords: Asymptotic distribution theory; High-frequency data; Long memory; Semiparametric methods;

Stationary fractional cointegration

see front matter r 2005 Elsevier B.V. All rights reserved.

.jeconom.2005.03.018

nding author. Tel.: +1607 255 6338; fax: +1 607 255 2818.

dresses: [email protected] (B.J. Christensen), [email protected] (M.Ø. Nielsen).

www.elsevier.com/locate/jeconom

ARTICLE IN PRESS

B.J. Christensen, M.Ø. Nielsen / Journal of Econometrics 133 (2006) 343–371344

1. Introduction

Cointegration analysis has been one of the most active areas in the econometricsand time series literatures in the last 15 years, starting with the seminal contributionsby Granger (1981) and Engle and Granger (1987). Much of this work has beenprompted by applications in macroeconomics and finance. Most of the analysis hasconsidered the Ið1Þ � Ið0Þ type of cointegration in which linear combinations of twoor more Ið1Þ variables are Ið0Þ. Here, a process is labelled Ið0Þ if it is covariancestationary and has positive finite spectral density at the origin, and Ið1Þ if the oncedifferenced series is Ið0Þ: In the bivariate case, if yt and xt are Ið1Þ and hence inparticular nonstationary (unit root) processes, but there exists a process et which isIð0Þ and a fixed b such that

yt ¼ b0xt þ et, (1)

then xt and yt are said to be cointegrated. Thus, the nonstationary series movetogether in the sense that a linear combination of them is stationary and hence acommon stochastic trend is shared.

The above notion of cointegration is based on the premise that the first differencesof the raw series are somehow special. In particular, cointegrating relations amongunit root processes (whose first differences are Ið0Þ) are defined and studied.However, many economic and financial time series exhibit strong persistence withoutexactly possessing unit roots. The basic theory of cointegration offers no guidance asto the analysis of relations among such series. What is needed is a class of processesthat is more general than Ið1Þ and still admits a criterion for linear co-movementof series. One such class is that of fractionally integrated processes. Thus, a processis IðdÞ (fractionally integrated of order d) if its d’th difference is Ið0Þ. Here, d maybe any real number, i.e. d ¼ 0 or 1 are special cases. For a precise statement,xt 2 IðdÞ if1

ð1� LÞdxt ¼ �t, (2)

where �t 2 Ið0Þ and ð1� LÞd is defined by its binomial expansion

ð1� LÞd ¼X1j¼0

Gðj � dÞ

Gð�dÞGðj þ 1ÞLj ; GðzÞ ¼

Z 10

tz�1e�t dt, (3)

in the lag operator L (Lxt ¼ xt�1). Following the original idea by Granger (1981), anatural generalization of the cointegration concept is to assume that the raw seriesare IðdÞ and that a certain linear combination is IðdeÞ; with d and de positive realnumbers and deod. Thus, the errors are of lower order of fractional integration thanthe levels. In this case, the series share fractionally integrated stochastic trends ofdifferent orders (IðdÞ and IðdeÞ), and a linear combination eliminates the largest.Clearly, this allows the study of co-movement among persistent series much more

1Note that (2) is valid only for do1=2. For nonstationary values of d, IðdÞ is defined either by

initialization or by considering partial sums of series of the type in (2), see e.g. Marinucci and Robinson

(1999).

ARTICLE IN PRESS

B.J. Christensen, M.Ø. Nielsen / Journal of Econometrics 133 (2006) 343–371 345

generally than in the standard (exact) unit root based Ið1Þ � Ið0Þ type cointegrationcase.

Fractional cointegration analysis has been carried out by Cheung and Lai (1993),Baillie and Bollerslev (1994), Dolado and Marmol (1996), Dueker and Startz (1998),and Robinson and Hualde (2003), among others, using parametric methods.However, Robinson (1994b) shows that conventional estimators, and in particularOLS, are inconsistent when the errors are fractionally integrated. He introducesnarrow-band least squares, a semiparametric method, and proves that it is consistenteven in situations where the error term is correlated with the regressors, e.g., as aresult of fractional cointegration. Robinson’s method has been extended to multipleregressors by Lobato (1997) and further developed by Robinson and Marinucci(2003) (henceforth RM) and applied by Marinucci and Robinson (2001). Theseestimators are consistent for general orders of fractional integration d for theindividual series and de for the errors in the cointegrating relation, and for arbitraryshort run dynamics (which would have had to be precisely and correctly specified inthe alternative parametric approaches).

The distribution theory for fractional cointegration is not complete for all relevantparameter regions. RM provide the limiting distribution of their estimators of thecointegrating relation for the case d41=2; deX0. In this paper, we provide thelimiting distribution for the case d40; deX0; d þ deo1=2, which has been missing inthe literature. This is the case of stationary fractional cointegration. Precisely in thiscase, Robinson’s result yields inconsistency of conventional estimators, includingOLS. Thus, in the stationary fractional cointegration model narrow-band leastsquares is a tool for consistent estimation of the regression parameters even insituations when the error term is correlated with the regressors.

The present paper complements the consistency result for narrow-band leastsquares of Robinson (1994b) and the distribution results of RM by providing therelevant distribution theory for the stationary fractional cointegration model. Weshow that the resulting distribution is normal, in contrast to RM’s more complicateddistributions for the different subcases in the region d þ de41=2. This differencecorresponds to what might have been conjectured from earlier results for longmemory processes, e.g. Fox and Taqqu (1986, 1987) and Beran (1994).

We illustrate our new distribution theory in an application to stock marketvolatility. Financial volatility series almost universally exhibit strong serialdependence, but not of the unit root kind, see e.g. Lobato and Velasco (2000),Andersen et al. (2001a), and Andersen et al. (2001b). For univariate stock marketvolatility, Andersen et al. (2001a) show that the parameter region 0odo1=2 isrelevant. We consider the multivariate case, using volatility series measured fromboth stock index returns (realized volatility) and options data (implied volatility).We examine the possibility that the two volatility series are fractionally cointegrated,consistent with the notion that implied volatility forecasts realized volatility. Ouranalysis shows that indeed the region d40; deX0; d þ deo1=2 is relevant for theorders of fractional integration and cointegration in the volatility application, thusextending the result of Andersen et al. (2001a) to the multivariate case. Thecondition d þ deo1=2 implies that the limiting distributions provided by RM do not

ARTICLE IN PRESS


apply. Using our new distribution theory, we find evidence in favor of long-rununbiasedness of the implied volatility forecast of subsequent realized volatility.

The rest of the paper is laid out as follows. In Section 2 we set up the fractionalcointegration model. We then show that in the stationary fractional cointegrationcase, i.e., the above case where the ‘‘total memory’’ of the model (the sum d þ de ofthe memory of the raw data and the cointegrating error) is less than 1=2, theestimator of the coefficients of the cointegrating relation is asymptotically normal.Section 3 briefly reviews the relation between the volatility implied in option pricesand the volatility realized in the stock market, as well as the cointegrationimplications. In Section 4 we present our empirical results using an ultra-highfrequency dataset. Section 5 concludes. The proof of the main theorem appears inthe appendix.

2. Stationary fractional cointegration

The stochastic process fxt; t ¼ 1; 2; . . .g generated by (2) has spectral density

f ðlÞ�gl�2d as l! 0þ, (4)

where g is a constant and the symbol ‘‘�’’ means that the ratio of the left- and right-hand sides tends to one in the limit. Such a process is said to possess long memorysince the autocorrelations die out at a hyperbolic rate, in contrast to the much fasterexponential rate in, e.g., the stationary and invertible ARMA(p; q) case. The mostwell-known model satisfying (4) is the fractional ARIMA model of Granger andJoyeux (1980) and Hosking (1981). For excellent surveys of long memory processesand fractional models see Robinson (1994c), Baillie (1996), and Robinson (2003),and for a textbook treatment see e.g. Beran (1994). The parameter d determines thememory of the process. If d4� 1=2 the process is invertible and possesses a linear(Wold) representation, and if do1=2 it is covariance stationary. If d ¼ 0 the spectraldensity is bounded at the origin and the process has only weak dependence.Furthermore, if do0 the process is said to be anti-persistent, and has mostlynegative autocorrelations, but if d40 the process has long memory. Throughout thispaper, we shall be concerned with the case 0pdo1=2, only. This interval is relevantfor many applications in finance, see e.g. Lobato and Velasco (2000), Andersen et al.(2001a), and Andersen et al. (2001b). In particular, it is the relevant region for thevolatility processes we study below in our empirical application.

Many estimators of the memory parameter d and the scale parameter g have beensuggested in the literature. Two important semiparametric approaches have beendeveloped, the log-periodogram estimator by Geweke and Porter-Hudak (1983) andRobinson (1995b) and the Gaussian semiparametric or local Whittle estimator byKunsch (1987) and Robinson (1995a), recently extended and refined by Lobato(1999) and Shimotsu and Phillips (2002) among others. Estimators based on fullyspecified parametric models are considered by Fox and Taqqu (1986, 1987) andDahlhaus (1989) among others. The semiparametric estimators of the memoryparameter assume only (4) for the spectral density and use a degenerating part of the

ARTICLE IN PRESS


periodogram around the origin to estimate the model. This approach has theadvantage of being invariant to any short- and medium-term dynamics. The fullyparametric approaches are asymptotically efficient, using the entire sample, but areinconsistent if the parametric model is specified incorrectly, e.g., if the lag-structureof the short-term dynamics is misspecified.

A natural generalization of the standard Ið1Þ � Ið0Þ cointegration concept is toassume that the raw data variables are IðdÞ and that a certain linear combination isonly Iðd � bÞ with dXb40. To be more specific, suppose the p raw data series aregathered in zt ¼ ðx

0t; ytÞ

02 Iðd1; . . . ; dpÞ and that a certain linear combination of zt is

IðdeÞ with deominðdiÞ. We shall then call zt fractionally cointegrated and writezt 2 FCIðd1; . . . ; dp; deÞ. This would occur, for example, if ðxt; ytÞ 2 IðdÞ and et 2

IðdeÞ with d4deX0 in the model

yt ¼ b0xt þ et. (5)

Conceptually, an additional condition on the integration orders d1; . . . ; dp is needed.In particular, it is necessary that the two largest orders of integration are equal. Toextend on the previous example, suppose ðx1t;x2t; ytÞ 2 Iðd1; d2; d3Þ and that thevariables are indexed in increasing order of magnitude of di. It would then clearly benecessary that d2 ¼ d3 for x2t and yt to cointegrate. However, no restriction isnecessarily imposed on d1. To see why, suppose yt � b2x2t 2 Iðd 0Þ, whered 0od2 ¼ d3. That is, yt and x2t cointegrate to an Iðd 0Þ process. Then it would onlybe possible for x1t to enter nontrivially in the cointegrating relation if d 0 ¼ d1.Otherwise, we may let b1 ¼ 0 and exclude x1t from the equation. Thus, what isneeded is a generalization of the multicointegration concept to allow the orders ofintegration to be different, but maintaining a structure that ensures that onlyvariables of the same integration order cointegrate with each other.

Some authors have estimated fractionally cointegrated models. For instance,Dolado and Marmol (1996) assume an arbitrary value of d for both the raw data andthe errors, and Dueker and Startz (1998) estimate a fully parametric model of theerror correction type. The main drawback of fully specified parametric models is thatthey provide inconsistent estimates of the long-run parameters if the model is notcorrectly specified, e.g., if the short-run lag structure in the error correction model isincorrect. Robinson (1994b) and subsequently Robinson and Marinucci (2003) andMarinucci and Robinson (2001) attempt to correct this by considering asemiparametric approach to the estimation of the cointegrating relation in (5).Other recent references for fractional cointegration analyses include inter alia Chenand Hurvich (2003a, b) and Velasco (2003), who also consider semiparametricapproaches. However, they all consider only the asymptotic distribution theory forthe case d41=2, where the raw series are nonstationary. This does not apply for thestationary long memory volatility series that we consider below.2 Hence, we developthe necessary theory for the case d 2 ð0; 1=2Þ.

2Cointegration in the stationary case is also considered by Marinucci (2000), where a continuously

averaged (integral) version of FDLS, see (7) below, is advocated. However, only consistency is shown in

ARTICLE IN PRESS


2.1. The model and notation

In this subsection, we follow Robinson and Marinucci (2003) in setting out themodel and the assumptions and present their consistency result. In the followingsubsection, we develop the new asymptotic distribution theory for the stationary cased 2 ð0; 1=2Þ. Suppose we observe the random p-vectors zt ¼ ðx

0t; ytÞ

0; t ¼ 1; . . . ; n,which satisfy the following assumptions.

Assumption A. The vector process fzt; t ¼ 0;�1; . . .g is covariance stationary withspectral density matrix satisfying

f zzðlÞ�LGL as l! 0þ,

where G is a p� p real symmetric matrix whose leading ðp� 1Þ � ðp� 1Þ submatrixhas full rank and

L ¼ diagfl�d1 ; . . . ; l�dpg

for di 2 ð0; 1=2Þ; i ¼ 1; . . . ; p. However, there exists a ðp� 1Þ-vector ba0 and aconstant c 2 ð0;1Þ such that

ð�b0; 1Þf zzðlÞ�b

1

� �¼ f eðlÞ�cl�2de as l! 0þ,

with 0pdeominðdiÞ.

Assumption B. zt admits the Wold representation

zt ¼ mz þX1j¼0

Aj�t�j ;X1j¼0

kAjk2o1,

where the innovations �t satisfy

Eð�tjFt�1Þ ¼ 0; Eð�t�0tjFt�1Þ ¼ S; a:s:;

and �t�0t are uniformly integrable. Here, mz ¼ Eðz0Þ, S is a constant matrix of fullrank, Ft ¼ sðf�s; sptgÞ is the s-field of events generated by f�s; sptg, and k � k is theEuclidean matrix norm.

Assumption A is a natural multivariate generalization of (4) including themultivariate fractionally integrated ARMA model as a special case, see alsoRobinson (1995b) and Lobato (1997). In particular, it implies that zit 2 IðdiÞ andet 2 IðdeÞ. Thus, Assumption A follows if zt 2 FCIðd1; . . . ; dp; deÞ. Notice that eventhough it is not stated as part of Assumption A, the two largest integration ordersmust be equal as mentioned earlier. The rank condition on G ensures nomulticollinearity among the xt variables at the origin, i.e. it restricts the xt from

(footnote continued)

the stationary case and no asymptotic distribution theory is given. Furthermore, the approach of the

present paper does not require mean correction and is therefore probably superior.

ARTICLE IN PRESS


cointegrating among themselves. Assumption B is a regularity condition whichrequires that the first and second moments of the innovations in the Woldrepresentation of zt are martingale difference series.

Define the discrete Fourier transform (DFT) of an observed vectorfat; t ¼ 1; . . . ; ng,

waðlÞ ¼1ffiffiffiffiffiffiffiffi2pnp

Xn

t¼1

ateitl.

If fbt; t ¼ 1; . . . ; ng is another observed vector, the cross periodogram matrixbetween at and bt is

IabðlÞ ¼ waðlÞwn

bðlÞ ¼ IcabðlÞ þ iI

qabðlÞ,

where the asterisk is transposed complex conjugation, and c; q indicate the co- andquadrature periodograms, respectively. We then form the discretely averaged co-periodogram

F abðk; lÞ ¼2pn

Xl

j¼k

IcabðljÞ; 1pkplpn� 1, (6)

for lj ¼ 2pj=n. If F ab is a vector we denote the ith entry FðiÞ

ab, and if it is a matrix wedenote the ði; jÞth entry F

ði;jÞ

ab . We could also have considered a continuously averagedversion of (6) as in Marinucci (2000), but it would not enjoy the property of beinginvariant to mean terms.

With F defined as in (6) we can consider the frequency domain least squares(FDLS) estimator

bm ¼ F�1

xx ð1;mÞF xyð1;mÞ (7)

of b in the regression (5). Notice that by this definition bn�1 is algebraically identicalto the usual OLS estimate of b with allowance for a nonzero mean in et. If

1

mþ

m

n! 0 as n!1, (8)

then bm is called a narrow-band FDLS estimator, since it uses only a degeneratingband of frequencies around the origin. We need m to tend to infinity to gatherinformation, but we also need to remain in a neighbourhood of zero, so m=n musttend to zero.

In this setup, RM show the following result.

Theorem 1 (Robinson and Marinucci, 2003). Under Assumptions A, B, (8), and with

bm defined in (7),

bim � bi ¼ Opn

m

� �de�di

� �; i ¼ 1; . . . ; p� 1 as n!1.

Notice that under fractional cointegration deominðdiÞ, so the estimator bm isconsistent for b. Furthermore, if the integration order of the raw data series iscommon, i.e. di ¼ d for all i ¼ 1; . . . ; p, the stochastic order of magnitude of the

ARTICLE IN PRESS


estimator varies inversely with the strength of the cointegrating relation b ¼ d � de.In the case of different orders di, the estimated coefficients on the variables of highestorder of integration converge fastest.

This completes the RM setup and their consistency result. As already noted, noasymptotic distribution theory has been available for the case di 2 ð0; 1=2Þ, which isrelevant for our financial volatility application below. Hence, we now turn to thedevelopment of such a theory.

2.2. Limiting distribution theory

To derive the limiting distribution of bm for di 2 ð0; 1=2Þ, we need to strengthenAssumptions A and B and introduce new assumptions similar to those in Robinson(1995b) and Lobato (1999). As RM, we find it convenient to use the notationwt ¼ ðx

0t; etÞ

0. Henceforth, f ðlÞ ¼ ff ijðlÞgi;j¼1;...;p denotes the spectral density matrixfor wt. Note that the cross spectral density between xit and et is f ipðlÞ.

Assumption A’. The spectral density matrix of wt satisfies

f ðlÞ�LGL as l! 0þ. (9)

In particular, there exists a 2 ð0; 2� such that

jf ijðlÞ � gijl�di�dj j ¼ Oðla�di�dj Þ as l! 0þ; i; j ¼ 1; . . . ; p, (10)

where gij is the ði; jÞth element of G, which has the same properties as in AssumptionA and gip ¼ gpi ¼ 0, i ¼ 1; . . . ; p� 1.

Assumption B’. wt is a linear process, wt ¼ mþP1

j¼0 Aj�t�j, where the coefficientmatrices are square summable

P1j¼0 kAjk

2o1 and the innovations satisfy, almostsurely, Eð�tjFt�1Þ ¼ 0; Eð�t�0tjFt�1Þ ¼ Ip; and the matrices Eð�t � �t�0tjFt�1Þ, Eð�t�0t ��t�0tjFt�1Þ are nonstochastic, finite, and do not depend on t, with Ft ¼ sðf�s; sptgÞ.Denote the periodogram of �t by JðlÞ.

Assumption C. As l! 0þ

dAiðlÞdl

�� ¼ Oðl�1kAiðlÞkÞ

for i ¼ 1; . . . ; p, where AiðlÞ is the ith row of AðlÞ ¼P1

j¼0 Ajeijl.

Assumption D. The bandwidth m satisfies

1

mþ

m1þ2a

n2a ! 0 as n!1,

with a from Assumption A’.

ARTICLE IN PRESS


Some comments on our assumptions are in order. Assumptions A’ and Cstrengthen Assumption A by imposing a rate of convergence on f ðlÞ and asmoothness condition similar to those used in parametric models, see e.g. Fox andTaqqu (1986, 1987) or Dahlhaus (1989). Assumption A’ is satisfied with a ¼ 2 if, forinstance, wt is a vector fractional ARIMA process. Here, (10) with gip ¼ gpi ¼ 0,i ¼ 1; . . . ; p� 1, is a central assumption, restricting the coherencies between thecointegration residuals and other right hand side variables at the origin. Therestriction is not as strong as it may seem at first, and in particular it does not requirethe regressors xt and the errors et to be uncorrelated at frequencies away from theorigin. Thus, our estimator allows the regressor and error terms to share the sameshort- and medium-term dynamics. This is in stark contrast to most familiarestimation methods, such as OLS and other cointegration methods. In particular,under our assumptions, the OLS estimator of b is generally inconsistent. Note thatunder Assumption A’ with the new notation, the condition that the two largestintegration orders in zt must be equal is automatically satisfied since dy ¼ maxðdiÞ byconstruction.

Assumption B’ follows Robinson (1995a) and Lobato (1999) in imposing a linearstructure on wt with square summable coefficients and martingale differenceinnovations with finite fourth moments. It is satisfied for instance if �t form an i.i.d.process with finite fourth moments. Under Assumption B’ we can write the spectraldensity matrix of wt as

f ðlÞ ¼1

2pAðlÞAnðlÞ, (11)

where the asterisk is complex conjugation combined with transposition. AssumptionD implies (8), so our estimator is in the narrow-band FDLS class, and if a is high, i.e.(9) is a close approximation to (11), the implied constraint on the bandwidth m isweak. If, e.g., wt is a vector fractional ARIMA process such that a ¼ 2 we can takem ¼ oðn4=5Þ.

We are now ready to derive the limiting distribution of the narrow-band FDLSestimator of the cointegrating relation in (5). The proof is in the appendix.

Theorem 2. Under Assumptions A’, B’, C, D, di þ deo1=2; 0pdeodio1=2 for

i ¼ 1; . . . ; p� 1, and with bm defined in (7),ffiffiffiffimp

lde

m~Lmðbm � bÞ!

dNð0; gppH�1JH�1Þ

as n!1, where ~Lm ¼ diagðl�d1

m ; . . . ; l�dp�1

m Þ and H; J are defined in Eqs. (21) and

(37) in the appendix.

Note that the limiting distribution in Theorem 2 is a centered normal, unlike thenon-normal distributions arising when diX1=2 (Marinucci and Robinson, 2001;Robinson and Marinucci, 2003). The particular result that the asymptotic bias of bm

disappears is due to the condition (10) and gip ¼ gpi ¼ 0, i ¼ 1; . . . ; p� 1, inAssumption A’. The fact that the limiting distribution is normal is to someextent expected, in view of the results from Fox and Taqqu (1986, 1987) andsubsequent authors, showing that quadratic forms of long memory processes with

ARTICLE IN PRESS


square-summable autocovariances (2do1=2) are asymptotically Gaussian.This is related to our result, where we work with a quadratic form withdi þ deo1=2. See also Lobato and Robinson (1996). Our asymptotic normalityresult allows for straightforward inference, e.g. based on standard t-ratios (nosimulation needed).

In the simple two-variable case with xt 2 IðdxÞ, et 2 IðdeÞ, and G ¼ diagðgx; geÞ,the asymptotic distribution of bm is given by Theorem 2 as

ffiffiffiffimp

lde�dx

m ðbm � bÞ!dN 0;

geð1� 2dxÞ2

2gxð1� 2dx � 2deÞ

� �. (12)

The asymptotic distribution (12) is easy to interpret and compare to well-knowncases. The parameters ge and gx correspond to (long-run versions of) varðDde etÞ andvarðDdx xtÞ. Ordinary full band spectral regression yields the asymptotic variancege=gx in the short memory case. In that case, Brillinger (1981, Chapters 7–8) andothers have also considered narrow-band analysis and obtained the division by twoin the asymptotic variance, as in (12). This arises since the cross-terms involving et

vanish asymptotically in varðF xeÞ (see the proof of Theorem 2, immediately before(36)). Our Theorem 2 generalizes the analysis to allow long memory at the origin inthe regressors as well as in the errors. Of course, the short memory results appear asthe special case dx ¼ de ¼ 0 of (12) and Theorem 2.

The parameters G; de; di; i ¼ 1; . . . ; p� 1, in the limiting distribution inTheorem 2 can be replaced by consistent estimates. Many such estimates areavailable, but the Gaussian semiparametric estimator of Robinson (1995a) andLobato (1999) has particularly simple asymptotic properties, and is also employedby Marinucci and Robinson (2001) to estimate the fractional integration orders intheir analysis.

To assess the strength of the cointegrating relation, one could calculate theobserved error process et ¼ yt � b

0

mxt; t ¼ 1; . . . ; n, and estimate b ¼ d � d e

(assuming a common integration order di ¼ d for the raw data series) using e.g.the methods mentioned above. Formally, Velasco (2003) shows that Robinson’s(1995a) Gaussian semiparametric estimator applies for residuals under additionalregularity and bandwidth conditions. In any case, this provides an informaldiagnostic.

The condition that di þ deo1=2 for i ¼ 1; . . . ; p� 1 in Theorem 2 includes manyrelevant cases, both empirically and from theoretical models, e.g. the case when theraw data series are stationary with long memory and the error process—thedeviations from equilibrium—has only short memory (or is actually white noise) dueto rational expectations. In the case where dio1=2pdi þ deo1 for some i, it isconjectured that bm converges to a function of the Rosenblatt distribution like thescalar averaged periodogram estimator in Lobato and Robinson (1996).

To sum up, Theorem 2 provides the necessary asymptotic distribution theory forthe stationary fractional cointegration model, thus complementing the consistencyresult of Robinson (1994b) and Robinson and Marinucci (2003). We use this in theempirical application below.

ARTICLE IN PRESS


3. The implied-realized volatility relation

Our application of stationary fractional cointegration analysis is to the relationbetween the volatility implied in option prices and the subsequently realized returnvolatility of the underlying asset. If option market participants are rational andmarkets are efficient, then the price of a financial option should reflect all publiclyavailable information about expected future return volatility of the underlying assetover the life of the option. Empirical analysis of this hypothesis has typicallyemployed the option pricing formula of Black and Scholes (1973) and Merton(1973)—henceforth the BSM formula. According to this, the fair (arbitrage free)price of a European call option with t periods to expiration and strike price k isgiven by

cðs; k; t; r;sÞ ¼ sFðdÞ � e�rtkFðd� sffiffiffitpÞ,

d ¼lnðs=kÞ þ ðrþ 1

2s2Þt

sffiffiffitp , ð13Þ

where s is the price of the underlying asset, r is the riskless interest rate, F is thestandard normal c.d.f., and s is the return volatility of the underlying asset throughexpiration of the option (t periods hence).

Given an observation c of the price of the option, the implied volatility sIV may bedetermined by inverting (13), i.e. solving the nonlinear equation

c ¼ cðs; k; t; r; sIVÞ (14)

numerically with respect to sIV, for given data on s; k; t, and r. If this is done everyperiod t, a time series sIV;t results. Each implied volatility sIV;t may now be comparedto the actually realized return volatility of the underlying asset, from the time t wheresIV;t is calculated and until expiration of the option at tþ t. Realized volatility issimply computed as the sample standard deviation sRV;t of the realized return from t

to tþ t.Christensen and Prabhala (1998) considered the regression specification

yt ¼ aþ bxt þ et, (15)

where yt ¼ ln sRV;t and xt ¼ ln sIV;t are the log-volatilities, and the unbiasednesshypothesis of interest is that b ¼ 1. A monthly sampling frequency was employed forxt and yt. The underlying asset was the S&P100 stock market index, and yt wascalculated from daily returns. The options were at-the-money (kt ¼ st) one-month(t ¼ 1=12) OEX contracts. Basic OLS regression in (15) produced a b-estimate lessthan unity. Christensen and Prabhala (1998) also presented results without the logtransform, and the difference was negligible.

If volatility is persistent and, indeed, fractionally integrated, as empirical literaturesuggests (Andersen et al. (2001a) find fractional integration with d at about0:35� 0:4), whereas the forecasting error et in (15) is serially uncorrelatedor possesses only short memory, then xt and yt are fractionally cointegrated.The condition b ¼ 1 is then a long-run unbiasedness hypothesis. However,from Robinson (1994b) and Robinson and Marinucci (2003), OLS is generally

ARTICLE IN PRESS


inconsistent for b in the case of stationary fractional cointegration. The narrow-bandanalysis is consistent and the results from the previous section provide the necessaryasymptotic distribution theory for inference on b.

Our empirical results below in fact support fractional cointegration of xt and yt.This is consistent with Bandi and Perron (2004), who apply narrow band analysis tothe implied-realized volatility relation in independent contemporaneous work. Dueto the lack of available asymptotic distribution theory, they rely on subsampling forinference, and find results consistent with b ¼ 1. Our inference below is based onasymptotic standard errors from the new theory (Theorem 2), and also agrees withthis long-run unbiasedness hypothesis.

4. Data and empirical results

4.1. Data

We sample options data from the Berkeley options data base (BODB) (seethe BODB User’s Guide or Rubinstein and Vijh (1987) for a description). Weuse all quotes from January 1, 1988, to December 31, 1995, for options on theS&P500 index (SPX options). The SPX options are traded frequently and heavily.When using the BSM formula (13), the SPX options have the advantage overthe OEX contracts considered by Christensen and Prabhala (1998) that they areEuropean style, as assumed in the formula, i.e. there is no early exercise. Quotesare revised frequently due to the heavy trading and are used here since quotesdata are expected to be more reliable than trading prices. Quotes data are recordedautomatically and instantaneously when quotes are revised, whereas tradingprices are recorded manually with a time lag. This could bias results for high-frequency data. Finally, our sampling period starts after the October 1987 stockmarket crash, since Christensen and Prabhala (1998) documented a regime shift inthe implied-realized volatility relation around the time of the crash. Volatilitywas elevated for a period following the crash, and hence we exclude the remainderof the year 1987.

From the high-frequency options data, a 5-min return series for the under-lying (the S&P500 index) is constructed. Thus, there is a total of 8,189,617 quotesbetween January 1, 1988 and December 31, 1995, and we select the quotes closest tothe 5-min points during each trading day. These quotes are associated with timestamps 9:00 AM, 9:05 AM, ..., 2:55 PM, 3:00 PM. This results in a series of 147,022observations. The average absolute error in matching the desired time stamps is25.18 s. The raw mean of the timing errors is 7.87 s, with a standard error of themean 0.6532 s. In the BODB, a simultaneous price of the underlying is recorded foreach quoted option price. We use these underlying index levels to calculate our 5-minreturn series.

The high-frequency (5-min) return series forms the basis for realizedvolatility calculations. In particular, we choose a one-week interval and calculate

ARTICLE IN PRESS


the realized variance

s2RV;t ¼1

K � 1

XK

k¼1

ðrt;k � rtÞ2, (16)

where rt;k are the 5-min annualized returns in week t, and rt is the weekly average.The realized volatility is the square-root of s2RV;t. In practice, we start Mondaymorning at 10:00 AM and end Friday at 3:00 PM to avoid effects of irregularbehavior from the Friday close to the Monday opening. Thus, each estimate is basedon up to K ¼ 348 high-frequency return observations. For comparison, Andersenet al. (2001a) use 5-min returns to form daily realized volatilities and have 79 returnsin each. For increasing sampling frequency, Ks2RV;t converges to the true integratedvariance, see Andersen et al. (2001a) and Barndorff-Nielsen and Shephard (2002).

In addition, we need implied volatilities. Here, we use the Monday 10:00 AMquote (according to the above definition) for the call of shortest maturity and closestto the money. Since options expire at 3:00 PM on the Friday immediately precedingthe third Saturday of each month, the sampled options will have 4 5

6, 9 5

6, 14 5

6, 19 5

6, or

24 56trading days to expiration, except that business holidays may reduce each of

these figures slightly. From each sampled quote, an implied volatility is backed outusing the BSM formula (13), as described in the previous section. We correct fordividends on the S&P500 index as described in Hull (1997, p. 263). Dividend yielddata are obtained from Datastream. We use two different measures of time toexpiration to reflect that calendar days are relevant for interest and dividends, andtrading days for volatilities, following Hull (1997, p. 248). This results in a weeklyimplied volatility series ~sIV;t. However, if the implied volatility ~sIV;t in week t is froman option with one week (k0 ¼ 4 5

6trading days) to expiration then ~sIV;t�i

corresponds to i þ 1 weeks (ki ¼ 5i þ 4 56days) to expiration, i ¼ 0; 1; 2; 3, and in

some cases also for i ¼ 4. In the other cases ~sIV;t�4 is again from a one week option,and which case applies depends on when the third Saturday in the relevant monthoccurs. We convert this heterogeneous series to another weekly series sIV;t that maybe associated with the series sRV;t of realized volatilities covering homogeneousnonoverlapping weekly (length k0) intervals by the formula

s2IV;t�i ¼1

ki � ki�1ðki � ~s2IV;t�i � ki�1 � ~s2IV;t�iþ1Þ, (17)

starting with sIV;t ¼ ~sIV;t for t corresponding to a one week option and applying (17)for i ¼ 1; 2; 3 and if applicable also i ¼ 4. Here, (17) is an identity for the associatedrealized volatilities covering interval lengths k0, ki, and ki�1. This identity becomesan approximation for implied (as opposed to realized) volatilities. In practice, if theRHS of (17) turns out negative, we let sIV;t�i ¼ sIV;t�i�1 for that week. This occurs in30 of the 417 weeks in our sample, or about once every 14 weeks. Thus, due to theapproximations involved, our results below on the forecasting performance ofweekly implied volatility are conservative estimates based on the sample at hand. Byfocusing on a diminishing band of low frequencies in the fractional cointegrationanalysis, the effect of such measurement errors vanishes asymptotically, provided the

ARTICLE IN PRESS


spectral density of these errors is dominated by that of the observed variables, i.e. isof lower order of fractional integration.

4.2. Empirical results

Summary statistics for the two (log) volatility series constructed in Section 4.1appear in Table 1. Each of the series is of length 417 weeks, covering the intervalJanuary 1988 through December 1995. From Table 1, implied volatility has a highermean than realized volatility. In the case of American style options, such a differencewould be at least partly attributed to the early exercise premium embedded inimplied volatility. In particular, the BSM formula (13) does not correct for thepossibility of early exercise. However, we have deliberately chosen the S&P500 (SPX)options because they are of European style, so there is no early exercise premiumissue. Hence, the difference in mean volatility in our case must be attributed to excesshedging costs associated with replicating the options from the underlying (assumedzero in (13)).

The next line in Table 1 shows that implied volatility also has higher variance thanrealized volatility. This is inconsistent with the notion that implied volatility is themarket’s rational forecast of subsequent index volatility, i.e., a conditionalexpectation (see Section 3). In effect, we have only a noisy estimate of true impliedvolatility, due to the approximation in (17), among others, and as already noted, theassessment of the implied-realized volatility relation is conservative in any givensample, even if it is precise in large samples.

The third line in Table 1 shows that there is very little skewness in realized andimplied volatility. Realized volatility is slightly positively skewed, and impliedvolatility is slightly negatively skewed, but the magnitudes are negligible. The nextline in Table 1 shows that realized volatility is not very leptokurtic, consistent withthe findings in Andersen et al. (2001a). However, the table also shows that impliedvolatility in fact is somewhat leptokurtic. This difference is an interesting addition tothe picture of the properties of volatility which has not been noted in the literature.Thus, our results so far complement those of Andersen et al. (2001a) who find thatlog volatilities are indeed Gaussian.

The sample autocorrelation functions for the realized and implied volatility seriesare exhibited in Figs. 1 and 2. For both series, the decay is very slow, with significant

Table 1

Summary statistics

Realized volatility Implied volatility

yt ¼ ln sRV;t xt ¼ lnsIV;t

Mean �2.642928 �2.035616

Variance 0.122568 0.200101

Skewness 0.585426 �0.916736

Excess Kurtosis 0.457148 2.807389

This table reports summary statistics for log realized and log implied volatility.

ARTICLE IN PRESS

0 10 20 30 40 50 60 70 80 90 100

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

Fig. 1. Autocorrelation function up to lag 100 for log realized volatility.

0 10 20 30 40 50 60 70 80 90 100

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

Fig. 2. Autocorrelation function up to lag 100 for log implied volatility.


autocorrelations even beyond the 50th lag. Even when autocorrelations drop belowthe upper 95% confidence limit shown in the figures, they all remain positive. This isconsistent with the hypothesis of long memory in volatility and with the results ofAndersen et al. (2001a).

Next, in Table 2 we turn to the analysis of the memory properties for eachindividual volatility series. First, the impression of highly significant autocorrelationfunctions for both realized and implied volatility is confirmed by the Box–Pierce

ARTICLE IN PRESS

Table 2

Fractional integration order

Realized volatility Implied volatility

yt ¼ ln sRV;t xt ¼ lnsIV;t

Box–Pierce (lags)a

Q ðL ¼ 4Þ 394.7nn 136.8nn

Q ðL ¼ 24Þ 1211nn 649.4nn

GSP (bandwidth)b

d ðm ¼ 20Þ 0.48468 0.46281

ð0:11180Þ ð0:11180Þ

d ðm ¼ 37Þ 0.44758 0.45273

ð0:08220Þ ð0:08220Þ

d ðm ¼ 68Þ 0.41620 0.35033

ð0:06063Þ ð0:06063Þ

ADF (lags)c

t ðL ¼ 0Þ �11.079nn �14.843nn

t ðL ¼ 4Þ �4.6995nn �5.3166nn

aBox–Pierce tests of the significance of the autocorrelation function. Two asterisks indicate significance

at the 1% level in the asymptotic w2ðLÞ distribution.bGSP are Gaussian semiparametric estimates of the fractional integration order as described in

Robinson (1995a). Numbers in parentheses are asymptotic standard errors usingffiffiffiffimpðd � dÞ!dNð0; 1=4Þ.

cADF are augmented Dickey–Fuller tests of the null of a unit root. Two asterisks indicate significance at

the 1% level where the critical value is �3:44 (constant term included).


statistics with 4 or 24 lags (roughly one and six months) in the first two lines of thetable. The second part of the table shows Gaussian semiparametric (henceforth GSP,see Robinson, 1995a) estimates of the fractional integration parameter d for eachseries, and for different choices of the bandwidth parameter m.

Figs. 3 and 4 show the log periodograms for both series plotted against the logFourier frequencies and both exhibit negative linear trends at the long frequencies(i.e. the periodograms have peaks at the long frequencies), thus reinforcing theimpression of long memory series. The chosen bandwidths are n0:5 ¼ 20, n0:6 ¼ 37,and n0:7 ¼ 68, corresponding roughly to the log frequencies �1 to 0, up to which thenegative linear trend in Figs. 3 and 4 is most pronounced. Returning to Table 2, allthe d estimates are below 1=2, but significantly greater than 0 (asymptotic standarderrors in parentheses). This suggests that both realized and implied volatility arecovariance-stationary long memory series.

To be sure that the long memory evident in the volatility series does not in factreflect a unit root, we also in the last part of Table 2 exhibit standard augmentedDickey–Fuller unit root tests with 0 and 4 lags. These tests complement the GSPestimates and show clearly that the volatility series are not unit root processes.

The results so far are consistent with the notion that realized and implied volatilityboth are driven by stationary but fractionally integrated series. The interestingquestion is how closely they move together.

ARTICLE IN PRESS

-4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0

-8

-7

-6

-5

-4

-3

-2

-1

0

Fig. 4. Log periodogram vs. log Fourier frequencies for log implied volatility.

-4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0

-9

-8

-7

-6

-5

-4

-3

-2

-1

0

Fig. 3. Log periodogram vs. log Fourier frequencies for log realized volatility.


Under the long-run unbiasedness hypothesis, we would expect the series to followeach other closely. In Table 3, we turn to the analysis of the potential stationaryfractional cointegration relation between the two series. The first line of the tableshows the results of the OLS regression

yt ¼ aþ bxt þ et (18)

of realized on implied volatility. The parameter of interest, b, is estimated to be 0:38,but viewing OLS as a special case of FDLS, we have the bandwidth parameterm ¼ n� 1 (n ¼ 417 is the number of time series observations), and this is too muchfor calculating standard errors under the maintained hypothesis (based on Table 2)of long memory in the individual volatility series. The point estimate, however, is

ARTICLE IN PRESS

Table 3

Fractional cointegration analysis

Bandwidth am bm s.e. ðbmÞ de

me ¼ 20 me ¼ 37 me ¼ 68 me ¼ 20 me ¼ 37 me ¼ 68

m ¼ n� 1 �1.8628 0.38325 — — — 0.38996 0.33443 0.31102

ð0:1180Þ ð0:08270Þ ð0:06063Þ

m ¼ 15 �0.91075 0.85094 0.18866 0.17005 0.15844 0.10991 0.10048 0.09479

ð0:11180Þ ð0:08270Þ ð0:06063Þ

m ¼ 9 �0.93154 0.84072 0.22678 0.19860 0.18376 0.11400 0.10330 0.09752

ð0:11180Þ ð0:08270Þ 0:06063Þ

m ¼ 6 �0.93049 0.84124 0.25163 0.21971 0.20286 0.11378 0.10315 0.09738

ð0:11180Þ ð0:08270Þ 0:06063Þ

m ¼ 3 �0.83362 0.88882 0.25279 0.23727 0.21978 0.09856 0.09284 0.08669

ð0:11180Þ ð0:08270Þ 0:06063Þ

This table reports semiparametric estimates of the stationary fractional cointegration relation (18) for

different bandwidths. Bandwidth m ¼ n� 1 corresponds to OLS. The standard error for bm is based on

the new asymptotic distribution result Theorem 2. The estimated fractional integration orders of the

residuals de are GSP estimates with bandwidth me ¼ 20; 37 and 68, respectively. For the integration order

dx of the raw regressor series, used in the calculation of standard errors of bm based on (12), the results for

bandwidth 68 from Table 2 were used.


similar to those in the literature, and suggests that implied volatility is a downwardbiased forecast of realized volatility. Of course, the properties of OLS are called intoquestion when the possibility of stationary fractional cointegration is recognized, butthe narrow-band estimator remains consistent even in this case (Robinson, 1994b).Note the high estimated order of integration of the OLS residuals (0.31–0.39,depending on bandwidth), i.e. the residuals possess long memory.

Turning to lower bandwidth parameters m for the estimation of the parameter ofinterest, b, we find much larger point estimates. Following Marinucci and Robinson(2001) and Robinson and Marinucci (2003), we actually place considerable weight onthe estimates resulting from low values of m, such as m ¼ 3 or 6, and only considerestimation with up to m ¼ 15 (RM consider m ¼ 3; 4; 6 for their sample sizes ofn ¼ 116 and 138)3. For example, for m ¼ 15 we find b in excess of 0:85, i.e. morethan twice the value from OLS, and the estimate is significantly greater than zero.Here, the asymptotic standard errors of bm are calculated using the new asymptoticdistribution result in Theorem 2, with the results from Table 2, bandwidth 68, for d.We have chosen bandwidth 68 since the results are quite close and this choice yieldsresults closest to those in the literature. For m ¼ 15 or less, bm is insignificantly less

3The rationale for this has been provided by Robinson and Marinucci (2001) and Chen Hurvich (2003a,

b), who showed that a small (finite) m is sufficient for consistency and produces much less bias. However,

those results refer to the nonstationary case, where much more caution is needed. Thus, we consider also

wider bandwidths up to m ¼ 15, see also Robinson (1994a).

ARTICLE IN PRESS


than unity, whether using me ¼ 20, 37, or 68 for the bandwidth parameter toestimate de (this enters the formula from Theorem 2 for the standard error of bm).

From the residual analysis, we cannot reject the null hypothesis that implied andrealized volatility indeed are stationary fractionally cointegrated, which is also inaccordance with Bandi and Perron (2004). The residuals are of lower order offractional integration than the volatility series themselves, deod and de þ do1=2. Infact, our results are consistent with the even stronger relation that de ¼ 0. Thus, de isinsignificant in Table 3 for m ¼ 15 and lower, whether using bandwidth me ¼ 20, 37or 68 for estimating de.

For the estimation of de, we have chosen the same bandwidths as in the estimationof the memory parameter of the raw data. The log periodogram vs. log Fourierfrequency plots do not give any clear guidance to the choice of bandwidth for theresiduals. For example, the plots using m ¼ 3 and 15 (to estimate b and generate theresiduals) are given in Figs. 5 and 6. The results indicate that the cointegration errorprocess exhibits only short memory, so that all long memory properties in volatilityare common features for implied and realized volatility.

Considering different bandwidths m for the estimation of b, we note amonotonicity, with a tendency towards higher bm for lower m. Thus, the more wefocus the analysis on the long frequencies, the less the data that we use to extractinformation about the long term relation between the series is contaminated by shortterm noise, which could include measurement errors. For m ¼ 3, we find b ¼ 0:89and with a rather narrow confidence band, based on the asymptotic theory fromTheorem 2, which contains unity. These results leave the possibility of b ¼ 1 verylikely, consistent with the notion that implied volatility from option prices isan unbiased forecast of the subsequently realized index volatility in the longrun. These findings based on the new asymptotic theory agree with those obtainedby Bandi and Perron (2004) using subsampling, and are obviously in stark contrast

-4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0

-10

-9

-8

-7

-6

-5

-4

-3

-2

Fig. 5. Log periodogram vs. log Fourier frequencies for residuals with m ¼ 3.

ARTICLE IN PRESS

-4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0

-11

-10

-9

-8

-7

-6

-5

-4

-3

-2

Fig. 6. Log periodogram vs. log Fourier frequencies for residuals with m ¼ 15.


to the raw OLS results in the first line of the table. Clearly, inferences on volatilityrelations may be heavily misleading if the possibility of stationary fractionalcointegration is ignored.

5. Conclusion

In this paper, we have derived the asymptotic distribution theory for the narrow-band least squares estimator of Robinson (1994b) in the stationary fractionalcointegration model. This new distribution theory complements the consistencyresult for the cointegration coefficients from Robinson (1994b). The method isparticularly useful because it provides consistent and asymptotically normal (CAN)estimators even in situations where OLS is inconsistent, which it is in the stationaryfractional cointegration model due to the correlation between regressors and errorterms. By using a degenerating part of the periodogram near the origin, the approachis invariant to short-run dynamics, which would have to be specified correctly in aparametric procedure.

In an empirical application, we show that inference based on the new asymp-totic theory supports long-run unbiasedness in the relation between impliedand realized volatility, consistent with Bandi and Perron’s (2004) findings basedon subsampling.

Acknowledgements

This paper is a revised version of our earlier paper ‘‘Semiparametric Analysisof Stationary Fractional Cointegration and the Implied-Realized Volatility Relation

ARTICLE IN PRESS


in High-Frequency Options Data’’, August 2001, Department of ManagementWorking Paper 2001-9, University of Aarhus, see http://www.econ.au.dk/afv/wp/wp-liste.htm. We thank participants at the Danish Econometric Society annualmeeting, Denmark, 26–28 May 2000, the ‘Market Microstructure and High-Frequency Data in Finance’ conference, Denmark, 10–12 August 2001, and theESRC Econometric Study Group Annual Conference (invited presentation), Bristol,10–12 July 2003, for useful comments. We also thank Tim Bollerslev, twoanonymous referees, and the associate editor for helpful suggestions. Finally, weare grateful to the Centre for Analytical Finance (CAF), the Centre for DynamicModelling in Economics (CDME), and the Danish Social Science Research Council(SSF) for research support.

Appendix. Proof of Theorem 2

We now turn to the proof of Theorem 2. Following Robinson (1995a) and Lobato(1999), the proof of asymptotic normality of F xe ¼ Fxeð1;mÞ first approximates thecross periodogram between xit and et by AiðlÞJðlÞAn

pðlÞ, where JðlÞ is theperiodogram of �t by Assumption B’, and then invokes a central limit theorem formartingale difference arrays.

Proof of Theorem. We want to examine each of the terms in

ffiffiffiffimp

lde

m~Lmðbm � bÞ ¼

ffiffiffiffimp

lde

m~LmF

�1

xx ð1;mÞF xeð1;mÞ

¼ lm~LmF

�1

xx ð1;mÞ~Lm

n o ffiffiffiffimp

lde�1m

~L�1

m Fxeð1;mÞn o

. ð19Þ

Using linearity of Reð�Þ and Theorem 1 of Lobato (1997), which is implied by ourassumptions, we conclude that

F ik � Fik ¼ opffiffiffiffiffiffiF ii

p ffiffiffiffiffiffiffiffiFkk

p� �; 1pi; kpp� 1.

Thus, the first term in (19) is

lm~LmF

�1

xx ð1;mÞ~Lm!

plm~Lm~L�1

m l�1m H�1 ~L�1

m~Lm ¼ H�1 (20)

by the continuous mapping theorem, where

Hik ¼gik

1� di � dk

; 1pi; kpp� 1. (21)

Note that the leading ðp� 1Þ � ðp� 1Þ submatrix of G (and thus H) is invertible byAssumption A’.

For the remaining part of the proof we look at the convergence in distribution offfiffiffiffimp

lde�1m

~L�1

m Fxeð1;mÞ.

http://www.econ.au.dk/afv/wp/wp-liste.htm

http://www.econ.au.dk/afv/wp/wp-liste.htm

ARTICLE IN PRESS


Applying the Cramer–Wold device, we need to examine (Z is an arbitrary ðp� 1Þ � 1vector)

Z0ffiffiffiffimp

lde�1m

~L�1

m Fxeð1;mÞ

¼Xp�1i¼1

Zi

ffiffiffiffimp

ldiþde�1m F ipðlmÞ

¼Xp�1i¼1

Zi

ffiffiffiffimp

ldiþde�1m

2pn

Xm

j¼1

ReðI ipðljÞÞ

¼Xp�1i¼1

Zi

ffiffiffiffimp

ldiþde�1m

2pn

Xm

j¼1

ReðI ipðljÞ � AiðljÞJðljÞAn

pðljÞÞ ð22Þ

þXp�1i¼1

Zi

ffiffiffiffimp

ldiþde�1m

2pn

Xm

j¼1

ReðAiðljÞJðljÞAn

pðljÞÞ. ð23Þ

By summation by parts and (C.2) of Lobato (1999), which is implied by ourassumptions, the first term, i.e. expression (22), is

ð22Þ ¼ Op

Xp�1i¼1

Zi

ffiffiffiffimp

l�1m

1

nm1=3ðlogmÞ2=3 þ logmþ

ffiffiffiffimp

n1=4

� � !

¼ Op

Xp�1i¼1

Zi

1ffiffiffiffimp m1=3ðlogmÞ2=3 þ logmþ

ffiffiffiffimp

n1=4

� � !

¼ OpðlogmÞ2=3

m1=6þ

logmffiffiffiffimp þ

1

n1=4

!!p0.

The second term above, expression (23), can be written as

ð23Þ ¼Xp�1i¼1

Zi

ffiffiffiffimp

ldiþde�1m

2pn

Xm

j¼1

Re AiðljÞ1

2pn

Xn

t¼1

�t eitlj

2

An

pðljÞ

0@

1A

¼Xp�1i¼1

Zi

ffiffiffiffimp

ldiþde�1m

2pn

Xm

j¼1

Re AiðljÞ1

2pn

Xn

t¼1

�t�0tA

n

pðljÞ

!ð24Þ

þXp�1i¼1

Zi

ffiffiffiffimp

ldiþde�1m

2pn

Xm

j¼1

Re AiðljÞ1

2pn

Xn

t¼1

Xsat

�t�0se

iðt�sÞlj An

pðljÞ

!.ð25Þ

Now write (24) as

ð24Þ ¼Xp�1i¼1

Zi

ffiffiffiffimp

ldiþde�1m

1

n

Xm

j¼1

Re AiðljÞAn

pðljÞ

� �ð26Þ

þXp�1i¼1

Zi

ffiffiffiffimp

ldiþde�1m

1

n

Xm

j¼1

Re AiðljÞ1

n

Xn

t¼1

�t�0t � Ip

!An

pðljÞ

!, ð27Þ

ARTICLE IN PRESS


where (26) is bounded by

supi

Offiffiffiffimp

ldiþde�1m

1

n

Xm

j¼1

jf ipðljÞj

!¼ sup

iOð

ffiffiffiffimp

ldiþde�1m

m

nla�di�de

m Þ

¼ Om1þ2a

n2a

� �! 0

by Assumptions A’ and D. For (27) we use the fact that �t�0t � Ip is a martingaledifference sequence with respect to the filtration ðFtÞt2Z;Ft ¼ sðf�s; sptgÞ, such thatin particular n�1

Pnt¼1 �t�0t � Ip ¼ opð1Þ. Then (27) is bounded by

supi

opffiffiffiffimp

ldiþde�1m

1

n

Xm

j¼1

jf ipðljÞj

!¼ op

m1þ2a

n2a

� �!p0

by Assumptions A’ and D.We return to (25),

Xp�1i¼1

Zi

ffiffiffiffimp

ldiþde�1m

1

n2

Xm

j¼1

Re AiðljÞXn

t¼1

Xsat

�t�0se

iðt�sÞlj An

pðljÞ

!

¼Xn

t¼1

�0tXsat

Xp�1i¼1

Zi

ffiffiffiffimp

n2ldiþde�1

m

Xm

j¼1

ReðA0iðljÞeiðt�sÞlj ApðljÞÞ�s

¼Xn

t¼2

�0tXt�1s¼1

ct�s;n�s,

where the bar denotes complex conjugation and we have defined

ctn ¼1

2pnffiffiffiffimp

Xm

j¼1

yj cosðtljÞ,

yj ¼Xp�1i¼1

Zildiþde

m ReðA0iðljÞApðljÞ þ A0pðljÞAiðljÞÞ.

So, ztn ¼ �0tPt�1

s¼1 ct�s;n�s is a martingale difference array with respect to ðFtÞt2Z;and we can apply the CLT if

Xn

t¼1

Eðz2tnjFt�1Þ �Xp�1i¼1

Xp�1k¼1

ZiZkgppJik!p0, (28)

Xn

t¼1

Eðz2tn1ðjztnj4dÞÞ ! 0; d40, (29)

ARTICLE IN PRESS


see Hall and Heyde (1980, Chapter 3.2). A sufficient condition for (29) is

Xn

t¼1

Eðz4tnÞ ! 0. (30)

First, we show (28),

Xn

t¼1

Eðz2tnjFt�1Þ ¼Xn

t¼2

EXt�1s¼1

Xt�1r¼1

�0sc0t�s;n�t�

0tct�r;n�r

Ft�1

!

¼Xn

t¼2

Xt�1s¼1

�0sc0t�s;nct�s;n�s ð31Þ

þXn

t¼2

Xt�1s¼1

Xras

�0sc0t�s;nct�r;n�r. ð32Þ

By (D.10) and (D.11) of Lobato (1999), (32) has mean zero and variance

O nXn

s¼1

kcsnk2

!2

þXn

t¼3

Xt�1u¼2

Xu�1s¼1

kcu�s;nk2Xu�1s¼1

kct�s;nk2

!0@

1A. (33)

Since kyjk ¼ Oððm=jÞdiþdp Þ by Theorem 2 of Robinson (1995b), kcsnk is boun-ded by

kcsnk ¼ O1

nffiffiffiffimp

Xm

j¼1

kyjk

!

¼ Omdiþdp�1=2

n

Xm

j¼1

j�di�dp

!

¼ O

ffiffiffiffimp

n

� �.

Define the functions KiðlÞ ¼ ReðA0iðlÞApðlÞ þ A0pðlÞAiðlÞÞ such that yj ¼Pp�1

i¼1

Zildiþde

m KiðljÞ, KiðlÞ ¼ Oðl�di�de Þ as l! 0þ, and KiðlÞ is differentiable withqKiðlÞ=ql ¼ Oðl�di�de�1Þ as l! 0þ by Assumption 3. Now we can derive analternative bound as

kcsnk ¼ O supi

ldiþde

m

nffiffiffiffimp

Xm

j¼1

KiðljÞ cosðsljÞ

!

¼ O supi

ldiþde

m

nffiffiffiffimp

Xm�1j¼1

ðKiðljÞ � Kiðljþ1ÞÞXj

k¼1

cosðslkÞ

!

þO supi

ldiþde

m

nffiffiffiffimp KiðlmÞ

Xm

j¼1

cosðsljÞ

!

ARTICLE IN PRESS


from summation by parts. Using the Mean Value Theorem it follows thatKiðljÞ � Kiðljþ1Þ ¼ ðljþ1 � ljÞqKiðljÞ=ql ¼ ð2p=nÞqKiðljÞ=ql, and the bound is

kcsnk ¼ O supi

ldiþde

m

nffiffiffiffimp

Xm�1j¼1

n�1l�di�de�1j

Xj

k¼1

cosðslkÞ

!

þO supi

ldiþde

m

nffiffiffiffimp l�di�de

m

Xm

j¼1

cosðsljÞ

!

¼ O1

sffiffiffiffimp

� �

using alsoPl

j¼1 cosðsljÞ

¼ Oðn=sÞ, see Zygmund (2002, p. 2). This bound is betterwhen s4n=m.

Thus, we find that

Xn

s¼1

kcsnk2 ¼ O

X½n=m�

s¼1

kcsnk2 þ

Xn

s¼½n=m�þ1

kcsnk2

0@

1A

¼ On

m

ffiffiffiffimp

n

� �2

þ1

m

Xn

s¼½n=m�þ1

s�2

0@

1A

¼ Oðn�1Þ,

implying that the first term of (33) is Oðn�1Þ. The second term of (33) is bounded by

O nXn

s¼1

kcsnk2

! X½n=2�s¼1

skcsnk2

! !,

see Robinson (1995a, pp. 1646–1647) or Lobato (1999, p. 151). The summand in thelast sum is Oðn�2smþ ðsmÞ�1Þ. Choosing the first bound when sp½n=m2=3�, the lastsum is

OX½n=m2=3�

s¼1

sm

n2þ

X½n=2�s¼½n=m2=3�þ1

1

sm

0@

1A ¼ O

1

m1=3

� �

and (33)¼ Oðn�1 þm�1=3Þ.We still need to show that the mean of (31) is asymptotically equivalent toPp�1i¼1

Pp�1k¼1 ZiZkgppJik. Thus,

Eð31Þ ¼Xn

t¼2

Xt�1s¼1

Eðtrðc0t�s;nct�s;n�s�0sÞÞ

¼Xn

t¼2

Xt�1s¼1

trðc0t�s;nct�s;nÞ

ARTICLE IN PRESS


¼Xn

t¼2

Xt�1s¼1

Xm

j¼1

Xm

k¼1

1

4p2n2mtrðy0jykÞ cosððt� sÞljÞ cosððt� sÞlkÞ

¼Xn

t¼2

Xt�1s¼1

Xm

j¼1

1

4p2n2mtrðy0jyjÞcos

2ððt� sÞljÞ ð34Þ

þXn

t¼2

Xt�1s¼1

Xm

j¼1

Xm

kaj

1

4p2n2mtrðy0jykÞ cosððt� sÞljÞ cosððt� sÞlkÞ. ð35Þ

Notice that, since kyjk ¼ Oððm=jÞdiþdpÞ,

ð35Þ ¼ OXn

t¼2

Xt�1s¼1

Xm

j¼1

Xkoj

1

n2m

m

j

� �2diþ2dp

cosððt� sÞljÞ cosððt� sÞlkÞ

!

¼ O n�1m2diþ2dp�1Xm

j¼1

Xkoj

j�2di�2dp

!

using thatPn

t¼2

Pt�1s¼1 cosððt� sÞljÞ cosððt� sÞlkÞ ¼ �n=2, and hence we can bound

(35) by Oðm=nÞ. Now, trðy0jyjÞ is equal to

Xp�1i;k¼1

trðZiZkldiþdkþ2de

m ReðAn

pðljÞAiðljÞ þ An

i ðljÞApðljÞÞ

�ReðA0kðljÞApðljÞ þ A0pðljÞAkðljÞÞÞ

¼Xp�1i;k¼1

ZiZkldiþdkþ2de

m 4p2ðf ppðljÞf ikðljÞ þ f ipðljÞf kpðljÞ

þ f pkðljÞf piðljÞ þ f ppðljÞf kiðljÞÞ

by definition of f ðlÞ in (11). The second and third terms are of smaller order than thefirst and fourth terms by Assumption A’, so (34) reduces to

Xn

t¼2

Xt�1s¼1

Xm

j¼1

1

n2m

Xp�1i;k¼1

ZiZkldiþdkþ2de

m ðf ppðljÞf ikðljÞ

þ f ppðljÞf kiðljÞÞcos2ððt� sÞljÞ. ð36Þ

Notice that

2pn

Xm

j¼1

ðf ppðljÞf ikðljÞ þ f ppðljÞf kiðljÞÞ

�

Zlm

0

ðf ppðlÞf ikðlÞ þ f ppðlÞf kiðlÞÞdl

ARTICLE IN PRESS


�

Z lm

0

ðgik þ gkiÞgppl�di�dk�2de dl

¼ 2gikgpp

l1�di�dk�2de

m

1� di � dk � 2de

,

so that we can rewrite (36) as

ð36Þ�Xp�1i¼1

Xp�1k¼1

ZiZk

Xn

t¼2

Xt�1s¼1

cos2ððt� sÞljÞ

!1

pnmgikgpp

lm

1� di � dk � 2de

¼Xp�1i¼1

Xp�1k¼1

ZiZk

n

4pmgikgpp

lm

1� di � dk � 2de

,

sincePn�1

t¼1

Pn�ts¼1 cos

2ðsljÞ ¼ ðn� 1Þ2=4. Thus, we finally arrive at

ð36Þ�Xp�1i¼1

Xp�1k¼1

ZiZk

1

2gikgpp

1

1� di � dk � 2de

¼Xp�1i¼1

Xp�1k¼1

ZiZkgppJik,

where

Jik ¼gik

2ð1� di � dk � 2deÞ; 1pi; kpp� 1, (37)

and we have shown (28).To show (30),

Xn

t¼1

Eðz4tnÞ ¼Xn

t¼2

EXt�1s¼1

�0sct�s;n�t�0t

Xt�1r¼1

ct�r;n�rXt�1p¼1

�0pct�p;n�t�0t

Xt�1q¼1

ct�q;n�q

!

pCXn

t¼2

trXt�1s¼1

c0t�s;nct�s;nc0t�s;nct�s;n

!

þXn

t¼2

trXt�1s¼1

c0t�s;n

Xt�1r¼1

ct�r;nc0t�r;nct�s;n

!!

for some constant C40, by Assumption B’. This expression can be bounded

by O nPn

t¼1 kc2tnk

�2� �¼ Oðn�1Þ, which completes the proof of asymptotic normality

of F xe.Hence, we have shown that

ffiffiffiffimp

lde�1m

~L�1

m Fxe!dNð0; gppJÞ (38)

and the theorem follows using (19), (20), and (38) and applying the SlutskyTheorem. &

ARTICLE IN PRESS


References

Andersen, T.G., Bollerslev, T., Diebold, F.X., Ebens, H., 2001a. The distribution of realized stock return

volatility. Journal of Financial Economics 61, 43–76.

Andersen, T.G., Bollerslev, T., Diebold, F.X., Labys, P., 2001b. The distribution of exchange rate

volatility. Journal of the American Statistical Association 96, 42–55.

Baillie, R.T., 1996. Long memory processes and fractional integration in econometrics. Journal of

Econometrics 73, 5–59.

Baillie, R.T., Bollerslev, T., 1994. Cointegration, fractional cointegration, and exchange rate dynamics.

Journal of Finance 49, 737–745.

Bandi, F.M., Perron, B., 2004. Long memory and the relation between implied and realized volatility,

Preprint, Universite de Montreal.

Barndorff-Nielsen, O.E., Shephard, N., 2002. Econometric analysis of realized volatility and its use in

estimating stochastic volatility models. Journal of the Royal Statistical Society Series B 64, 253–280.

Beran, J., 1994. Statistics for Long-Memory Processes. Chapman & Hall, New York.

Black, F., Scholes, M., 1973. The pricing of options and corporate liabilities. Journal of Political Economy

81, 637–654.

Brillinger, D.R., 1981. Time Series, Data Analysis and Theory. Holden Day, Inc., San Francisco.

Chen, W.W., Hurvich, C.M., 2003a. Estimating fractional cointegration in the presence of polynomial

trends. Journal of Econometrics 117, 95–121.

Chen, W.W., Hurvich, C.M., 2003b. Semiparametric estimation of multivariate fractional cointegration.

Journal of the American Statistical Association 98, 629–642.

Cheung, Y.E., Lai, K.S., 1993. A fractional cointegration analysis of purchasing power parity. Journal of

Business and Economic Statistics 11, 103–122.

Christensen, B.J., Prabhala, N.R., 1998. The relation between implied and realized volatility. Journal of

Financial Economics 50, 125–150.

Dahlhaus, R., 1989. Efficient parameter estimation for self-similar processes. Annals of Statistics 17,

1749–1766.

Dolado, J., Marmol, F., 1996. Efficient estimation of cointegrating relationships among higher order and

fractionally integrated processes. Working Paper no. 9617, Banco de Espana.

Dueker, M., Startz, R., 1998. Maximum-likelihood estimation of fractional cointegration with an

application to U.S. and Canadian bond rates. Review of Economics and Statistics 83, 420–426.

Engle, R., Granger, C.W.J., 1987. Cointegration and error correction: representation estimation, and

testing. Econometrica 55, 251–276.

Fox, R., Taqqu, M.S., 1986. Large-sample properties of parameter estimates for strongly dependent

stationary gaussian series. Annals of Statistics 14, 517–532.

Fox, R., Taqqu, M.S., 1987. Limit theorems for quadratic forms in random variables having long-range

dependence. Probability Theory and Related Fields 74, 213–240.

Geweke, J., Porter-Hudak, S., 1983. The estimation and application of long memory time series models.

Journal of Time Series Analysis 4, 221–238.

Granger, C.W.J., 1981. Some properties of time series data and their use in econometric model

specification. Journal of Econometrics 16, 121–130.

Granger, C.W.J., Joyeux, R., 1980. An introduction to long memory time series models and fractional

differencing. Journal of Time Series Analysis 1, 15–29.

Hall, P., Heyde, C.C., 1980. Martingale Limit Theory and its Application. Academic Press, New York.

Hosking, J.R.M., 1981. Fractional differencing. Biometrika 68, 165–176.

Hull, J.C., 1997. Options, Futures, and Other Derivatives, third ed. Prentice-Hall, Englewood Cliffs, NJ.

Kunsch, H.R., 1987. Statistical aspects of self-similar processes. In: Prokhorov, Y., Sazanov, V.V. (Eds.),

Proceedings of the First World Congress of the Bernoulli Society. VNU Science Press, Utrecht,

pp. 67–74.

Lobato, I.N., 1997. Consistency of the averaged cross-periodogram in long memory series. Journal of

Time Series Analysis 18, 137–155.

Lobato, I.N., 1999. A semiparametric two-step estimator in a multivariate long memory model. Journal of


ARTICLE IN PRESS


Lobato, I.N., Robinson, P.M., 1996. Averaged periodogram estimation of long memory. Journal of


Lobato, I.N., Velasco, C., 2000. Long memory in stock-market trading volume. Journal of Business and

Economic Statistics 18, 410–427.

Marinucci, D., 2000. Spectral regression for cointegrated time series with long-memory innovations.

Journal of Time Series Analysis 21, 685–705.

Marinucci, D., Robinson, P.M., 1999. Alternative forms of fractional Brownian motion. Journal of

Statistical Planning and Inference 80, 111–122.

Marinucci, D., Robinson, P.M., 2001. Semiparametric fractional cointegration analysis. Journal of


Merton, R.C., 1973. Theory of rational option pricing. Bell Journal of Economics and Management

Science 4, 141–183.

Robinson, P.M., 1994a. Rates of convergence and optimal spectral bandwidth for long range dependence.

Probability Theory and Related Fields 99, 443–473.

Robinson, P.M., 1994b. Semiparametric analysis of long-memory time series. Annals of Statistics 22,

515–539.

Robinson, P.M., 1994c. Time series with strong dependence. In: Sims, C.A. (Ed.), Advances in

Econometrics. Cambridge University Press, Cambridge, pp. 47–95.

Robinson, P.M., 1995a. Gaussian semiparametric estimation of long range dependence. Annals of

Statistics 23, 1630–1661.

Robinson, P.M., 1995b. Log-periodogram regression of time series with long range dependence. Annals of


Robinson, P.M., 2003. Long-memory time series. In: Robinson, P.M. (Ed.), Time series with long

memory. Oxford University Press, Oxford, pp. 4–32.

Robinson, P.M., Hualde, J., 2003. Cointegration in fractional systems with unknown integration orders.

Econometrica 71, 1727–1766.

Robinson, P.M., Marinucci, D., 2001. Narrow-band analysis of nonstationary processes. Annals of


Robinson, P.M., Marinucci, D., 2003. Semiparametric frequency domain analysis of fractional

cointegration. In: Robinson, P.M. (Ed.), Time Series with Long Memory. Oxford University Press,

Oxford, pp. 334–373.

Rubinstein, M., Vijh, A.M., 1987. The Berkeley options data base: a tool for empirical research. Advances

in Futures and Options Research 2, 209–221.

Shimotsu, K., Phillips, P.C.B., 2002. Exact local Whittle estimation of fractional integration. Cowles

Foundation Discussion Paper 1367. Annals of Statistics, forthcoming.

Velasco, C., 2003. Gaussian semi-parametric estimation of fractional cointegration. Journal of Time Series

Analysis 24, 345–378.

Zygmund, A., 2002. Trigonometric Series, third ed. Cambridge University Press, Cambridge.

asymptotic normality of narrow-band least squares in the stationary fractional cointegration model...

Documents