Long memory models for volatility and high frequency financial

data econometrics

Dmitri Koulikov

Department of Economics

School of Economics and Management

University of Aarhus

Aarhus C, 8000, Denmark

phone: +45 89421577

e-mail: [email protected]

PhD dissertation submitted to

The Faculty of Social Sciences


Completed under supervision of

Professor Niels Haldrup, University of Aarhus

and

Professor Bent Jesper Christensen, University of Aarhus

June 11, 2004

Contents

Preface iii

Summary iii

Dansk resume iv

Chapter 1 Modeling sequences of long memory non-negative covariance

stationary random variables 1

Chapter 2 Long memory ARCH(∞) models: specification and

quasi–maximum likelihood estimation 26

Chapter 3 Non–stationary models for volatility of speculative returns:

with application to foreign exchange data 53

Chapter 4 Conditional heteroscedasticity model for discrete high-frequency

price changes: with application to IBM trades data 73

1 Preface

I wish to thank Niels Haldrup and Bent Jesper Christensen, my PhD thesis supervisors, for their

effort and advice during the period of my study at the Department of Economics, University

of Aarhus. I also wish to use this opportunity to express my gratitude to Niels Haldrup

for stimulating my interest in long memory time series analysis back in 1999, while he was

supervising my masters thesis. I am greatly indebted to Svend Hylleberg for supporting my

masters education at the Department of Economics in 1998–1999, and for his help during my

PhD studies in 2000–2003.

In addition, I would like to thank my teachers and colleagues from the University of Tartu

and Euro Faculty: Raul Eamets, Arne Gotfredsen, Jens A. Larsen, Helje Kaldaru, Tiiu Paas,

Alf Vanags, Morten Hansen and Kenneth Smith for inspiring and supporting my early inroads

into the field of econometric and economic research.

2 Summary

This thesis contributes to the branch of econometric research devoted to long memory models for sequences of non-negative stationary random variables (Xt, ψt) : t ∈ Z. Among the most common fields of application of such models are conditional heteroscedasticity models and econometric models for high frequency financial data. An enlightening recent survey of new theoretical developments for this class of econometric models is Giraitis, Leipus and Surgailis (2003); for an overview of empirical work in this area, refer to Bollerslev, Engle and Nelson (1994).

In Chapter 1 of this thesis I introduce a class of stationary MD-ARCH(∞) models for

(Xt, ψt) : t ∈ Z, defined as follows:

$$X_t = \psi_t \varepsilon_t, \qquad \psi_t = a + \sum_{j=1}^{\infty} \theta_{j-1}\,(X_{t-j} - \psi_{t-j}), \tag{1}$$

where εt : t ∈ Z is a sequence of i.i.d. non-negative random variables and all other parame-

ters are positive. I show that under a mild set of conditions (1) has a non-negative covariance

stationary solution with non-summable autocovariance function, commonly referred to as long

memory. Moreover, the class of MD-ARCH(∞) models includes all short memory covariance

stationary GARCH sequences, and therefore provides a natural extension of the GARCH mod-

els to the long memory case, just as ARFIMA is an extension of the classical ARMA models.

In Chapter 2 I study the class of long memory ARCH(∞) models, as well as the asymptotic

properties of a time-domain QML estimator of such models. In their influential study, Ding

and Granger (1996) define the following ARCH(∞) model:

$$X_t = \psi_t \varepsilon_t, \qquad \psi_t = \sum_{j=1}^{\infty} \pi_{j-1} X_{t-j}, \tag{2}$$

but leave a number of important issues unresolved. In particular, the existence of a non-degenerate long memory solution of this model has not previously been established in the literature. Under certain conditions, covariance stationary solutions of (1) and (2) are equivalent, allowing me to establish properties of the Ding and Granger (1996) model. The second part of the paper examines the QML estimator for this class of models. One of the most notable results is that the asymptotic variance of the memory parameter estimator equals 6/π², the same as in the class of linear ARFIMA models.

In Chapter 3 I propose a framework for non–stationary conditional heteroscedasticity mod-

els and examine some empirical evidence of non–stationary volatility. A family of models,

referred to as cMD–ARCH models, suitable for non–stationary conditional heteroscedasticity

time series is defined as:

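$$X_t = \psi_t = a_0 \quad \text{for } t \le 0,$$

$$X_t = \psi_t \varepsilon_t, \qquad \psi_t = a_t + \sum_{j=1}^{t} \theta_{j-1}\,(X_{t-j} - \psi_{t-j}) \quad \text{for } t > 0,$$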

where at > 0 is a non-stochastic function of the index t and other parameters are similar to the

stationary MD-ARCH(∞) model in Chapter 1. The model allows for separation of deterministic

and stochastic effects in the conditional volatility, similarly to the linear time series case. A

statistical inference theory for the parameters of cMD–ARCH is developed. Consistency and

asymptotic normality of the QML estimator are shown under general assumptions on the innovations

εt. An empirical application of the new model to thirteen major European and Asian foreign

exchange returns is included, illustrating empirical cases of non–stationary volatility.

In Chapter 4 of this thesis I introduce MD-ARCH(∞)-like models for time series of discrete

price changes in high frequency financial data, allowing for separate modeling of conditional

mean and conditional variance parameters. The paper borrows ideas from the ordered probit

model, but includes an observation driven dynamic volatility part. Both short and long memory

volatility models are discussed, and an application to the IBM trades dataset is included.

3 Dansk resume

This thesis is concerned with the area of econometrics that deals with long memory models for sequences of non-negative stationary random variables (Xt, ψt) : t ∈ Z. Among the most common applications of such models are conditional volatility models and econometric models for high-frequency financial data. Giraitis, Leipus and Surgailis (2003) give an enlightening survey of recent theoretical contributions to this class of models, and Bollerslev, Engle and Nelson (1994) summarize the empirical research in the area.

In Chapter 1 I introduce a class of stationary MD-ARCH(∞) models for (Xt, ψt) : t ∈ Z, defined as follows:

$$X_t = \psi_t \varepsilon_t, \qquad \psi_t = a + \sum_{j=1}^{\infty} \theta_{j-1}\,(X_{t-j} - \psi_{t-j}), \tag{3}$$

where εt : t ∈ Z is a sequence of i.i.d. non-negative random variables and all other parameters are positive. Under a set of mild conditions I show that (3) has a non-negative covariance stationary solution with a non-summable autocovariance function, known as long memory. Moreover, the MD-ARCH(∞) class includes all short memory covariance stationary GARCH sequences and therefore provides a natural extension of the GARCH models to the long memory case, just as ARFIMA is an extension of the classical ARMA models.

In Chapter 2 I study the class of long memory ARCH(∞) models, as well as the asymptotic properties of the QML estimator for these models. In their pioneering article, Ding and Granger (1996) define the following ARCH(∞) model:

$$X_t = \psi_t \varepsilon_t, \qquad \psi_t = \sum_{j=1}^{\infty} \pi_{j-1} X_{t-j}, \tag{4}$$

but leave a number of issues unresolved. In particular, the existence of a non-degenerate long memory solution of this model has not been established in the literature. Under certain conditions, covariance stationary solutions of (3) and (4) are identical, which allows me to establish the properties of the Ding and Granger (1996) model. The second part of the chapter examines the QML estimator for this class of models. One of the most notable results is that the asymptotic variance of the memory parameter estimator equals 6/π², the same as for the ARFIMA class.

In Chapter 3 I propose a framework for non–stationary conditional heteroscedasticity and review a number of empirical examples of non–stationary volatility. A group of models called cMD–ARCH models, suited to non–stationary conditional heteroscedasticity, is defined as:

$$X_t = \psi_t = a_0 \quad \text{for } t \le 0,$$

$$X_t = \psi_t \varepsilon_t, \qquad \psi_t = a_t + \sum_{j=1}^{t} \theta_{j-1}\,(X_{t-j} - \psi_{t-j}) \quad \text{for } t > 0,$$

where at > 0 is a non-stochastic function of the index t and the other parameters are identical to those of the stationary MD-ARCH(∞) model in Chapter 1. This model allows a separation of deterministic and stochastic effects in the conditional volatility, similar to linear time series. A statistical inference theory for the parameters of cMD–ARCH is developed. Under general conditions on the innovations εt, the QML estimator is consistent and asymptotically normal. An empirical application of the new model to returns on thirteen major European and Asian exchange rates is also included, illustrating empirical cases of non–stationary volatility.

In Chapter 4 of this thesis I introduce MD-ARCH(∞)-like models for time series of discrete price changes in high-frequency financial data, which allow separate models for the conditional mean and the conditional volatility. The paper borrows ideas from the ordered probit model, but contains a volatility component driven by observations. The chapter also contains a discussion of models with both short and long memory and an empirical application of these models to IBM data.

References

Bollerslev, Tim, Robert F. Engle and Daniel B. Nelson (1994) ARCH models. Handbook of

Econometrics, vol. IV, pp. 2961–3031, New-York: Elsevier Science.

Ding, Z. and Clive W.J. Granger (1996) Modeling volatility persistence of speculative returns:

a new approach. Journal of Econometrics, vol. 73, pp. 185-215.

Giraitis, Liudas, Remigijus Leipus and Donatas Surgailis (2003) Recent advances in ARCH

modelling. Preprint.

Chapter 1: Modeling sequences of long memory non-negative covariance stationary random variables

Modeling sequences of long memory non-negative covariance

stationary random variables

Dmitri Koulikov∗





phone: +45 89421577


This revision:

July 4, 2003

Abstract

This paper extends the class of covariance stationary GARCH processes of Engle (1982)

and Bollerslev (1986) to the case of non-summable autocovariances. We improve on the

results of two previous studies in this field: the FIGARCH model of Baillie, Bollerslev and Mikkelsen (1996), which generates sequences of non-negative random variables with infinite first and higher-order moments and a hyperbolic decay rate of the impulse response function, and the linear ARCH sequences of Giraitis, Robinson and Surgailis (2000), which do not contain the class of short-memory covariance stationary GARCH processes. We use an infinite series representation of GARCH models in terms of martingale difference innovations, referred to as the MD-ARCH(∞) representation. This allows for the case of hyperbolically decaying square-summable weighting coefficients. Conditions for the existence,

non-negativity and covariance stationarity of the MD-ARCH(∞) sequences are derived,

and the functional limit of normalized partial sums of the process is studied. Applications

of long-memory MD-ARCH(∞) processes include volatility modeling and high-frequency

financial data econometrics.

JEL classification: C22, C51

Keywords: Conditional heteroscedasticity, Long-memory, Weak stationarity, ARCH(∞),

Econometrics of high-frequency financial data

∗The author wishes to thank the participants of the seminars at CORE, Universite Catholique de Louvain,

and at Nuffield College, University of Oxford, for their helpful feedback. Comments from Luc Bauwens, Neil

Shephard and Bent Nielsen are gratefully acknowledged. All remaining mistakes are my own.

1 Introduction

In this paper we show that the class of covariance stationary GARCH processes¹ of Engle (1982) and Bollerslev (1986) can be extended to the case of long memory. For the class of linear ARMA models such an extension was proposed in the early eighties by Granger and Joyeux (1980) and Hosking (1981) and is nowadays widely used in various applications in economics. Evidence of long-memory and persistent autocorrelations has been documented in many fields in economics, including the volatility of financial series and trading intensity in financial durations data. Short-memory GARCH processes are widely used in these settings, but their extension to the long-memory case has proved to be non-trivial, largely because of their complicated non-linear structure.

Let εt : t ∈ Z be a sequence of non-negative i.i.d. innovations. Similarly to Engle (1982) and Bollerslev (1986), we seek to model a sequence of non-negative random variables Xt : t ∈ Z recursively defined in the following way:

$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = \psi(\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots), \tag{1}$$

where ψ is a measurable function of the possibly infinite history of innovations. Non-negativity of Xt : t ∈ Z implies that we are interested in the class of non-negative functions ψ.

The general formulation given in (1) includes many models, but the primary interest in the

econometric literature has been concentrated on the class of GARCH models, where the function

ψ is linear in the previous history of (Xt, ψt) : t ∈ Z. GARCH processes are appealing for

modeling financial time-series, where ψt : t ∈ Z is interpreted as the time-varying conditional

second moment of returns, squares of which are represented by the sequence Xt : t ∈ Z. Recently, techniques and ideas employed in the GARCH literature have been utilized for statistical

modeling of other non-negative processes, most notably high-frequency financial durations data

in Engle and Russell (1998) and Engle (2000), where ψt : t ∈ Z represents the time-varying

trading intensity of financial markets. Recent surveys of GARCH literature are Bollerslev,

Engle and Nelson (1994) and Berkes, Horvath and Kokoszka (2002).

During the last decade substantial empirical evidence of long-memory in volatility and

trading intensity of many financial time-series has been accumulated and documented; see

Andersen, Bollerslev, Diebold and Labys (2001) and Jasiak (1998) among many others. Some

empirical regularities of such data can be modeled in the framework of FIGARCH processes

introduced by Baillie, Bollerslev and Mikkelsen (1996) and Ding and Granger (1996). The

simplest FIGARCH process is given by:

$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = a + [1 - (1 - L)^d]\,X_t, \tag{2}$$

where Eε0 = 1, L denotes the lag operator, and a > 0 and 0 ≤ d < 1 are given parameters. In contrast to the original GARCH formulation of Engle (1982) and Bollerslev (1986), the FIGARCH model assigns hyperbolic weights to the previous history of the process. As shown in Baillie, Bollerslev and Mikkelsen (1996), the FIGARCH model implies infinite first and higher-order unconditional moments of (Xt, ψt) : t ∈ Z, but does feature a hyperbolic decay rate of the impulse response function of ψt.

¹ Throughout the paper we adopt the notation and terminology of the ARCH(∞) literature, whereby we only consider sequences of non-negative random variables, representing squared returns and volatilities in the mainstream GARCH literature. Therefore all statistical concepts used in the paper will be based on this notation. For example, “covariance stationary GARCH sequence” refers to a GARCH model with a well-defined time-invariant autocorrelation function of squared returns and volatility.

Infinite unconditional moments of the sequence (Xt, ψt) : t ∈ Z in the FIGARCH model may not be attractive in many empirical settings, especially for modeling financial durations data, where they imply an infinite unconditional expected waiting time until the next high-frequency event. Moreover, Giraitis, Kokoszka and Leipus (2000) show that the covariance stationary version of the Ding and Granger (1996) model, which is closely related to (2), has an absolutely summable autocovariance function, and hence short memory as defined in McLeod and Hipel (1978).

The short-memory nature of the Ding and Granger (1996) model is, among other factors, due to the summable coefficients of the polynomial 1 − (1 − z)^d, which also appears in the definition of ψt in the FIGARCH process (2). Giraitis, Robinson and Surgailis (2000) relax the summability requirement by disturbing ψ∗t with zero-centered random variables Yt : t ∈ Z as shown below:

$$X_t = Y_t^2, \qquad Y_t = \psi^*_t \cdot z_t, \qquad \psi^*_t = a + (1 - L)^{-d}\,Y_{t-1}, \tag{3}$$

where zt : t ∈ Z is a sequence of i.i.d. random variables with mean zero and unit variance. Giraitis, Robinson and Surgailis (2000) further show that for 0 < d < 1/2 the sequence Xt : t ∈ Z is covariance stationary with a non-summable autocovariance function.

Note that the process ψ∗t : t ∈ Z in the linear ARCH model (3) is defined on R, and hence

lacks the usual volatility interpretation of ψt : t ∈ Z in the class of GARCH processes. This

fact precludes applications of linear ARCH models in high-frequency financial econometrics,

where ψt : t ∈ Z is also required to be non-negative. Additionally, expressions for the

autocovariance function of Xt : t ∈ Z in model (3) are complicated due to the square transformation of Yt : t ∈ Z.

In this paper we introduce a new class of processes, referred to as the MD-ARCH(∞) class, which extends the covariance-stationary GARCH sequences of Engle (1982) and Bollerslev (1986) to the case of non-summable autocovariances. We stay in the general framework of (1), similar to the short-memory GARCH processes, where ψt is a linear function of zero-centered innovations Xt − ψt : t ∈ Z weighted by a sequence of coefficients θj : j ≥ 0. We show that conditions for the existence, non-negativity and covariance stationarity of MD-ARCH(∞) sequences allow for the case of square summable θj : j ≥ 0 and long-memory in (Xt, ψt) : t ∈ Z. We derive closed-form expressions for the moments of (Xt, ψt) : t ∈ Z in terms of underlying parameters. For the important case of hyperbolically decaying θj : j ≥ 0 we show the functional limit of normalized partial sums of Xt : t ∈ Z. Finally, an overview of the statistical inference methods for the new class of models is also given.

The paper is organized as follows. Section 2 collects the main results of the paper, introducing the class of MD-ARCH(∞) sequences, conditions for the existence, non-negativity and stationarity of the new model, and ending with a functional CLT for partial sums of Xt : t ∈ Z. Section 3 describes semiparametric and fully parametric approaches to statistical inference for the long-memory MD-ARCH(∞) sequences. The Conclusion summarizes the findings. Full proofs of the main results of the paper are presented in the Appendix.

2 Sequences of long-memory covariance stationary non-negative

random variables

In this section we introduce and study the class of MD-ARCH(∞) processes. The new class

contains short memory covariance stationary GARCH sequences of Engle (1982) and Boller-

slev (1986). We derive sufficient conditions for the covariance stationarity of the MD-ARCH(∞)

processes and show that these conditions allow for long memory.

2.1 Representations of GARCH sequences

In a series of recent articles, Giraitis, Kokoszka and Leipus (2000) and Kazakevicius and Leipus (2002) study statistical properties of a wide class of GARCH models by expressing them in the framework of ARCH(∞) processes defined as follows:

$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = a^* + \sum_{j=1}^{\infty} \theta^*_{j-1} X_{t-j}, \tag{4}$$

where a∗ > 0, θ∗j : j ≥ 0 ⊆ R0+ and εt : t ∈ Z is a sequence of i.i.d. non-negative random

variables. Sufficient conditions for stationarity of ARCH(∞) sequences derived by Giraitis,

Kokoszka and Leipus (2000) imply absolute summability of the coefficients θ∗j : j ≥ 0, and

ultimately the short-memory nature of the process.

Absolute summability of θ∗j : j ≥ 0 in the ARCH(∞) framework of Giraitis, Kokoszka

and Leipus (2000) and Kazakevicius and Leipus (2002) is necessary to ensure convergence of

the infinite series in the definition of ψt in (4). Consider the following alternative to (4):

$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = a + \sum_{j=1}^{\infty} \theta_{j-1}\,(X_{t-j} - \psi_{t-j}), \tag{5}$$

where the following assumptions hold:

A1. εt : t ∈ Z is defined on the common probability space (Ω,F ,P), and consists of i.i.d.

copies of a non-negative random variable ε0 with Eε0 = 1.

A2. a > 0 and θj : j ≥ 0 ⊆ R0+.

The ψt part of (5) is formulated in terms of the sequence of zero-centered innovations Xt−ψt :

t ∈ Z, where Xt − ψt = ψt(εt − 1) and ψt and εt are independent for each t ∈ Z. In this

paper we will only consider the case when the first two moments of ψt are finite for each t ∈ Z.

It follows that E[Xt − ψt] = 0 and E[Xt − ψt|Ft−1] = 0 for each t ∈ Z, Ft being the process filtration, and hence Xt − ψt : t ∈ Z is a sequence of martingale difference innovations. This structure of innovations, much like in the class of linear ARFIMA processes, can potentially ensure convergence of the infinite series in (5) without assuming the absolute summability of θj : j ≥ 0. Model (5) will be referred to as the MD-ARCH(∞) model.

Specification of GARCH models using the sequence of zero-centered innovations Xt − ψt : t ∈ Z can be traced back to Robinson (1991) and Robinson and Henry (1999). In the latter study the authors show that a particular choice of weighting coefficients in such models can lead to non-summable autocovariances and long memory. However, they leave the crucial questions of covariance stationarity and non-negativity of such specifications of GARCH processes open, noting that the results of Giraitis, Kokoszka and Leipus (2000) may contradict their conjectures.

The issues of covariance stationarity and non-negativity of the proposed MD-ARCH(∞)

class of processes are central to this paper. We start with the following examples, demonstrating

the range of potential parametrizations of the MD-ARCH(∞) sequences:

EXAMPLE 1. Consider the covariance stationary GARCH(p,q) model of Engle (1982) and

Bollerslev (1986). It can be written in our notation as:

$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = a[1 - A(1) - B(1)] + [A(L) + B(L)]\,\psi_t + A(L)(X_t - \psi_t), \tag{6}$$

where $A(z) := \sum_{j=1}^{q} \alpha_j z^j$, $B(z) := \sum_{j=1}^{p} \beta_j z^j$, and A1 holds. Recall that the covariance

stationary assumption implies that all roots of 1−A(z)−B(z) = 0 are outside the unit circle,

see Bollerslev (1986), and hence we can rewrite the process as:

$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = a + A(L)[1 - A(L) - B(L)]^{-1}(X_t - \psi_t). \tag{7}$$

This representation of the covariance stationary GARCH(p,q) model is closely related to (5). In

particular, covariance stationarity assumption implies that Xt−ψt : t ∈ Z in (7) is the square-

integrable martingale difference sequence. Power series expansion of A(z)[1 − A(z) − B(z)]−1

gives the sequence of absolutely summable coefficients with exponential rate of decay, which

can be found from the following recursion:

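$$\theta_0 = \alpha_1, \quad \theta_1 = \alpha_2 + [\beta_1 + \alpha_1]\,\theta_0, \quad \theta_2 = \alpha_3 + [\beta_1 + \alpha_1]\,\theta_1 + [\beta_2 + \alpha_2]\,\theta_0, \quad \ldots \tag{8}$$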

Using these results, the MD-ARCH(∞) representation of GARCH(1,1) model can be written

as follows:

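$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = a + \sum_{j=1}^{\infty} \alpha_1 [\beta_1 + \alpha_1]^{j-1}\,(X_{t-j} - \psi_{t-j}). \tag{9}$$

Covariance stationarity of the GARCH(p,q) sequence (7) ensures that $E\psi_0^2 < \infty$ and $EX_0^2 < \infty$.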
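The weights in Example 1 can be computed numerically. The sketch below is illustrative only: the function name `md_arch_weights` and the list-based inputs are assumptions, not part of the paper. It matches powers of z in Θ(z)[1 − A(z) − B(z)] = A(z), with Θ(z) = Σ θj z^(j+1), which gives θj = α(j+1) + Σ (αi + βi) θ(j−i), and it reproduces the GARCH(1,1) weights α1[β1 + α1]^j of (9).

```python
def md_arch_weights(alpha, beta, n):
    """Return theta_0..theta_{n-1}, the MD-ARCH weights of a GARCH(p,q) model.

    alpha = [alpha_1, ..., alpha_q], beta = [beta_1, ..., beta_p].
    theta_j is the coefficient of z^(j+1) in A(z) / (1 - A(z) - B(z)).
    (Hypothetical helper written for illustration.)
    """
    m = max(len(alpha), len(beta))
    # Pad both polynomials with zeros to a common length m.
    a = list(alpha) + [0.0] * (m - len(alpha))
    b = list(beta) + [0.0] * (m - len(beta))
    # c[i] holds alpha_{i+1} + beta_{i+1}.
    c = [ai + bi for ai, bi in zip(a, b)]
    theta = []
    for j in range(n):
        value = a[j] if j < m else 0.0  # alpha_{j+1}, zero beyond order q
        for i in range(min(j, m)):
            value += c[i] * theta[j - 1 - i]
        theta.append(value)
    return theta


# GARCH(1,1) with alpha_1 = 0.1, beta_1 = 0.8: geometric decay at rate 0.9.
weights = md_arch_weights([0.1], [0.8], 6)
print(weights)
```

For higher-order models the same call applies, for example `md_arch_weights([0.1, 0.05], [0.7], 50)`; the exponential decay of the output is what makes the weights absolutely summable.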

EXAMPLE 2. A potentially important parametrization of θj : j ≥ 0 in model (5) is given

by the coefficients from the power series expansion of (1 − z)^{−d}, previously discussed in Robinson (1991) and Robinson and Henry (1999). This specification forms a building block of many parametric long-memory time-series models, the most popular being the class of linear ARFIMA models of Granger and Joyeux (1980) and Hosking (1981). Coefficients of the expansion are given by:

$$\theta_j := \frac{\Gamma(d + j)}{\Gamma(d)\,\Gamma(1 + j)} \quad \forall j \ge 0, \tag{10}$$

where 0 ≤ d < 1/2 and Γ is the gamma function. In the class of linear time-series models,

the sequence of square-summable hyperbolically decaying coefficients (10) is known to lead to

the non-summable autocovariance function, and hence long memory as defined in McLeod and

Hipel (1978); see also Hosking (1996).
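The behaviour of the coefficients in Example 2 is easy to inspect numerically. The following sketch (function names are illustrative assumptions) evaluates (10) through the ratio θj / θ(j−1) = (j − 1 + d) / j, which follows from Γ(x + 1) = xΓ(x), and compares the partial sum of squares with the known limit Γ(1 − 2d) / Γ(1 − d)² while the plain partial sums keep growing.

```python
import math


def frac_weights(d, n):
    """theta_0..theta_{n-1} from (10), using theta_j = theta_{j-1} * (j - 1 + d) / j."""
    theta = [1.0]
    for j in range(1, n):
        theta.append(theta[-1] * (j - 1 + d) / j)
    return theta


def frac_weight_gamma(d, j):
    """Direct evaluation of (10) through log-gamma; requires d > 0."""
    return math.exp(math.lgamma(d + j) - math.lgamma(d) - math.lgamma(1 + j))


d = 0.3
theta = frac_weights(d, 2001)
sum_of_squares = sum(t * t for t in theta)
limit = math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2
# The squares are summable (partial sum stays below the limit),
# whereas the partial sum of the weights themselves grows like n^d.
print(sum_of_squares, limit, sum(theta))
```

The recursion avoids evaluating large gamma functions directly; the log-gamma variant is included only as a cross-check.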

Among the two examples above the first one is of particular importance, showing that

the covariance stationary GARCH(p,q) model is a member of the class of MD-ARCH(∞) se-

quences, just like it is nested within the original ARCH(∞) framework of Giraitis, Kokoszka

and Leipus (2000) and Kazakevicius and Leipus (2002). More importantly, Example 1 demon-

strates that even though the sequence of innovations Xt − ψt : t ∈ Z in (7) is supported on

R, the process (Xt, ψt) : t ∈ Z is non-negative with probability one.

In the remainder of this section we address the following three issues. First, we show that

the covariance stationary MD-ARCH(∞) sequences can be constructed assuming only square

summability of θj : j ≥ 0, and that non-summable autocovariance function of (Xt, ψt) :

t ∈ Z can be obtained. Second, we study conditions for non-negativity of the MD-ARCH(∞)

sequences. Finally, we derive the functional limit of the appropriately normalized partial sums

of Xt : t ∈ Z in the case of square-summable hyperbolically decaying coefficients θj : j ≥ 0 from Example 2. This limit is potentially useful for the semiparametric inference in the class

of long-memory MD-ARCH(∞) sequences.

2.2 Covariance stationary MD-ARCH(∞) sequences

The issue of convergence of the infinite series in the definition of the MD-ARCH(∞) process (5)

is central to the existence and stationarity of the model. The sequence of innovations Xt − ψt : t ∈ Z is constructed from the past history of the process itself, and hence the linear

representation of ψt in (5) cannot be used to study properties of the MD-ARCH(∞) process.

Instead, we follow the approach of Giraitis, Kokoszka and Leipus (2000) and Kazakevicius and

Leipus (2002) and use a Volterra series representation of (5) given by:

$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = a \sum_{k=0}^{\infty} M(k, t), \tag{11}$$

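where for each t ∈ Z the sequence M(k, t) : k ≥ 0 is defined as:

$$M(0, t) := 1, \qquad M(k, t) := \sum_{j_1, \ldots, j_k = 1}^{\infty} \theta_{j_1 - 1} \cdots \theta_{j_k - 1}\,(\varepsilon_{t - j_1} - 1) \cdots (\varepsilon_{t - j_1 - \ldots - j_k} - 1) \quad \forall k \ge 1. \tag{12}$$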

Expressed in the Volterra series form, the ψt part of the MD-ARCH(∞) process (11) is given by the infinite sum of random variables M(k, t) : k ≥ 0, which themselves are non-linear functions of the underlying sequence of innovations εt : t ∈ Z. Under appropriate assumptions on θj : j ≥ 0 and the moments of εt : t ∈ Z, we are able to show that for each t ∈ Z, M(k, t) : k ≥ 0 is an L2 sequence of mutually orthogonal elements with exponentially decreasing variance; see Lemma 1 in the Appendix. This finding provides considerable simplifications in the derivation of the following two theorems.

THEOREM 1. Under A1–A2 and conditions

$$\sum_{j=1}^{\infty} [\log j]^2\,\theta_j^2 < \infty, \tag{13}$$

$$E(\varepsilon_0 - 1)^2 \sum_{j=0}^{\infty} \theta_j^2 < 1, \tag{14}$$

the process (Xt, ψt) : t ∈ Z defined in (5), equivalently (11)–(12), is finite a.e. on (Ω,F ,P),

and is stationary and ergodic.

Note that conditions (13) and (14) involve only squares of θj : j ≥ 0, and therefore

allow for parametrizations involving the hyperbolically decaying coefficients in Example 2. Under the stronger assumption of absolute summability of θj : j ≥ 0, such as in the case of stationary GARCH sequences (7), the existence of the second moment of innovations εt : t ∈ Z implied by (14) can be relaxed; refer to Kazakevicius and Leipus (2002) for a thorough discussion of the latter case.

Sufficient conditions for the covariance stationarity of the MD-ARCH(∞) process are given

in the following theorem:

THEOREM 2. Assume A1–A2 and (14). Then the sequence (Xt, ψt) : t ∈ Z defined in (5)

is covariance stationary, where for each t ∈ Z and k ≥ 0:

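$$EX_t = E\psi_t = a, \tag{15}$$

$$E[(\psi_{t+k} - a)(\psi_t - a)] = \frac{a^2 E(\varepsilon_0 - 1)^2}{1 - E(\varepsilon_0 - 1)^2 \sum_{j=0}^{\infty} \theta_j^2} \sum_{j=0}^{\infty} \theta_j \theta_{j+k}, \tag{16}$$

$$E[(X_{t+k} - a)(X_t - a)] = E[(\psi_{t+k} - a)(\psi_t - a)] + \frac{a^2 E(\varepsilon_0 - 1)^2}{1 - E(\varepsilon_0 - 1)^2 \sum_{j=0}^{\infty} \theta_j^2}\,\theta^*_k, \tag{17}$$

and θ∗k : k ≥ 0 is defined as θ∗0 := 1, θ∗k := θk−1 for k ≥ 1.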
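The moment formulas of Theorem 2 translate directly into a short routine. The sketch below is an illustration under stated assumptions: the infinite sums are truncated to the supplied weights, `kappa` stands for E(ε0 − 1)², and the function name is hypothetical. It first checks condition (14) and then evaluates the autocovariances of ψt and Xt.

```python
def md_arch_autocov(theta, a, kappa, max_lag):
    """Autocovariances of psi_t and X_t for lags 0..max_lag (max_lag < len(theta)).

    theta : truncated weight sequence theta_0, theta_1, ...
    a     : unconditional mean, as in (15)
    kappa : E(eps_0 - 1)^2, the variance of the innovations
    """
    sum_sq = sum(t * t for t in theta)
    if kappa * sum_sq >= 1.0:
        raise ValueError("covariance stationarity condition (14) fails")
    scale = a * a * kappa / (1.0 - kappa * sum_sq)
    n = len(theta)
    acov_psi, acov_x = [], []
    for k in range(max_lag + 1):
        # Truncated version of sum_j theta_j * theta_{j+k}.
        cross = sum(theta[j] * theta[j + k] for j in range(n - k))
        theta_star = 1.0 if k == 0 else theta[k - 1]
        acov_psi.append(scale * cross)               # formula (16)
        acov_x.append(scale * (cross + theta_star))  # formula (17)
    return acov_psi, acov_x


# GARCH(1,1) weights with alpha_1 = 0.1, beta_1 = 0.8 and chi-square(1)
# innovations, for which kappa = 2.
garch_theta = [0.1 * 0.9 ** j for j in range(400)]
acov_psi, acov_x = md_arch_autocov(garch_theta, a=1.0, kappa=2.0, max_lag=5)
print(acov_psi[:3], acov_x[:3])
```

For exponentially decaying weights the truncation error is negligible; for hyperbolically decaying weights a much longer `theta` or a closed-form sum would be needed.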

Covariance stationarity condition (14) in Theorem 2 involves only sums of squared weighting

coefficients θj : j ≥ 0, again allowing for parametrizations suggested in Example 2. In

addition, Theorem 2 demonstrates that behavior of the autocovariance function of (Xt, ψt) :

t ∈ Z is determined by the rate of decay of the weighting coefficients, similarly to the class

of linear ARFIMA processes. Hence, the properties of the autocovariance function of MD-

ARCH(∞) sequences are closely related to the properties of θj : j ≥ 0. Implications of

Theorem 2 are elaborated upon in the following examples:

EXAMPLE 3. Consider the covariance stationary GARCH(p,q) model in Example 1. Bollerslev (1986) and Ding and Granger (1996) derive the autocovariance function of the GARCH(1,1) model (9), where the sequence of innovations εt : t ∈ Z is given by i.i.d. copies of a $\chi^2_1$ random variable. Simple calculations show that the covariance stationarity condition for the MD-ARCH(∞) sequences in Theorem 2 reduces in this case to the well-known restriction $\beta_1^2 + 2\beta_1\alpha_1 + 3\alpha_1^2 < 1$. Similarly, the autocovariance functions (16) and (17) reduce to the corresponding expressions in Ding and Granger (1996).
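The "simple calculations" can be sketched as follows, assuming the GARCH(1,1) weights from (9) and using the fact that a chi-square variable with one degree of freedom has mean 1 and variance 2:

```latex
\theta_j = \alpha_1(\alpha_1+\beta_1)^{j}, \qquad
\sum_{j=0}^{\infty}\theta_j^2 = \frac{\alpha_1^2}{1-(\alpha_1+\beta_1)^2}, \qquad
E(\varepsilon_0-1)^2 = 2,
\\[4pt]
E(\varepsilon_0-1)^2\sum_{j=0}^{\infty}\theta_j^2 < 1
\;\Longleftrightarrow\;
2\alpha_1^2 < 1-(\alpha_1+\beta_1)^2
\;\Longleftrightarrow\;
\beta_1^2 + 2\alpha_1\beta_1 + 3\alpha_1^2 < 1 .
```

The geometric sum requires α1 + β1 < 1, which is implied by the final inequality.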

The moment structure of general GARCH(p,q) sequences is studied in He and Terasvirta (1999). Due to the excessive amount of technicalities necessary for a direct comparison of Theorem 2 with their results, we abstain from doing so in this example.

EXAMPLE 4. The rate of decay of the autocovariance function of GARCH(p,q) sequences in

the previous example is known to be exponential, see He and Terasvirta (1999) and Giraitis,

Kokoszka and Leipus (2000), and the autocovariance function can be shown to be summable.

Example 2 introduces three possible parametrizations of the MD-ARCH(∞) model with

hyperbolically decaying square-summable coefficients θj : j ≥ 0. Consider the sequence of

coefficients from the power series expansion of (1− z)−d given in (10). Using notation in terms

of the lag operator L, the MD-ARCH(∞) model (5) can be written in this case as follows:

$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = a + \gamma\,(1 - L)^{-d}\,(X_{t-1} - \psi_{t-1}), \tag{18}$$

where an additional parameter γ > 0 is needed to guarantee a.s. non-negativity of the process as shown in Example 6 of subsection 2.3. The autocovariance function for this model can be derived using the summation formula

$$\sum_{j=0}^{\infty} \frac{\Gamma(d+j)\,\Gamma(d+j+k)}{\Gamma^2(d)\,\Gamma(1+j)\,\Gamma(1+j+k)} = \frac{\Gamma(1-2d)\,\Gamma(d+k)}{\Gamma(d)\,\Gamma(1-d)\,\Gamma(1-d+k)}$$

found in Hosking (1996), which together with Stirling's formula implies that E[(Xt+k − a)(Xt − a)] = O(k^{2d−1}) when 0 < d < 1/2 and the covariance stationarity condition in Theorem 2 is satisfied. Thus, the autocovariance function of model (18) is non-summable. We refer to model (18) as the long-memory MD-ARCH(∞) model.
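The summation formula and the resulting hyperbolic decay can be checked numerically. The sketch below uses only the standard library; the function names are illustrative assumptions. It compares a truncated version of the left-hand side with the closed form and then looks at how the closed form scales when the lag doubles. In model (18) the weights carry the extra factor γ, so the sums there are multiplied by γ².

```python
import math


def log_theta(d, j):
    """Logarithm of the coefficient Gamma(d + j) / (Gamma(d) * Gamma(1 + j)), d > 0."""
    return math.lgamma(d + j) - math.lgamma(d) - math.lgamma(1 + j)


def cross_sum_truncated(d, k, n_terms):
    """Left-hand side of the summation formula, truncated after n_terms terms."""
    return sum(math.exp(log_theta(d, j) + log_theta(d, j + k)) for j in range(n_terms))


def cross_sum_closed_form(d, k):
    """Right-hand side of the summation formula, valid for 0 < d < 1/2."""
    return math.exp(
        math.lgamma(1 - 2 * d) + math.lgamma(d + k)
        - math.lgamma(d) - math.lgamma(1 - d) - math.lgamma(1 - d + k)
    )


# Truncated sum against the closed form for d = 0.1 and lag k = 3.
print(cross_sum_truncated(0.1, 3, 100_000), cross_sum_closed_form(0.1, 3))

# Doubling the lag should scale the sum by roughly 2^(2d - 1).
d = 0.3
print(cross_sum_closed_form(d, 2000) / cross_sum_closed_form(d, 1000), 2 ** (2 * d - 1))
```

Because the terms of the sum decay like j^(2d−2), the truncated sum converges slowly as d approaches 1/2, which is why the example uses a small d for the direct comparison.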

A potentially interesting variation of model (18), allowing for both summable and non-

summable autocovariance function, can be defined as:

$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = a + \gamma\,(1 - \phi L)^{-1}(1 - L)^{-d}\,(X_{t-1} - \psi_{t-1}), \tag{19}$$

where 0 ≤ φ < 1 and γ > 0. The sequence of coefficients θj : j ≥ 0 in this specification is given by $\theta_j := \gamma \sum_{k=0}^{j} \phi^k \theta^*_{j-k}$ for all j ≥ 0, with θ∗j : j ≥ 0 as defined in (10). For d = 0 and 0 < φ < 1 the process (19) reduces to the covariance stationary GARCH(1,1) model, see (9). The process has a non-summable autocovariance function when d > 0 and the covariance stationarity condition (14) is satisfied.

EXAMPLE 5. One of the models introduced by Ding and Granger (1996) for picking up the long memory dynamics in the volatility of financial returns can be written in our notation as follows:

$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = [1 - (1 - L)^d]\,X_t. \tag{20}$$

In the terminology of Ding and Granger (1996) this model corresponds to the case “µ =

1”. While Giraitis, Kokoszka and Leipus (2000) find that the case “µ < 1” has summable

autocovariance function and hence short memory, they could not reach a definitive conclusion about the model (20) because it violates their sufficient stationarity condition.

Consider the following parametrization of the MD-ARCH(∞) process (5):

$$X_t = \psi_t \cdot \varepsilon_t, \qquad \psi_t = a + [(1 - L)^{-d} - 1]\,(X_t - \psi_t). \tag{21}$$

Using similarity to the long-memory MD-ARCH(∞) model from the previous example, one can show that under the conditions of Theorem 2 the model (21) is covariance stationary with a non-summable autocovariance function. This permits inversion of (1 − L)^{−d} and allows us to establish the equivalence of (21) and the model (20) of Ding and Granger (1996).

2.3 Non-negativity of MD-ARCH(∞) sequences

Definition of the MD-ARCH(∞) process involves the infinite series of weighted zero-centered

innovations Xt − ψt : t ∈ Z. Consequently, non-negativity of the process is not immediate

from the definition and is likely to hold only under suitable parameter restrictions. Recall

that non-negativity of the GARCH(p,q) sequence (7) is a simple consequence of its finite-

dimensional representation (6). Unlike GARCH(p,q), general MD-ARCH(∞) sequences do not

possess such a finite-dimensional form. Instead, we work with a sequence of finite-dimensional

approximations to MD-ARCH(∞) as detailed below.

Define a conditional process $\{(X_{t,n}, \psi_{t,n}) : t \in \mathbb{Z},\ n \ge 0\}$ as follows:

$$X_{t,n} = \psi_{t,n} \cdot \varepsilon_t , \qquad \psi_{t,n} = a + \theta_n(\psi - a) + \sum_{j=1}^{n} \theta_{j-1}\,(X_{t-j,n-j} - \psi_{t-j,n-j}) , \tag{22}$$

where $\psi \in \mathbb{R}_{0+}$ is a known starting value and A1–A2 hold. The conditional process can be regarded as a “started” MD-ARCH(∞) sequence, where the infinite series in (5) is replaced by the finite stretch of previous innovations of length $n$. As stated below in the first part of Theorem 3, the following additional assumption on the sequence of coefficients $\{\theta_j : j \ge 0\}$ is sufficient to ensure a.s. non-negativity of the sequence of conditional models (22):

A3. For the sequence $\{\eta_j : j \ge 0\}$ defined as:

$$\eta_0 := \theta_0 , \qquad \eta_j := \theta_j - \sum_{k=0}^{j-1} \theta_{j-1-k}\,\eta_k \quad \forall j \ge 1 \tag{23}$$

let $\{\eta_j : j \ge 0\} \subseteq \mathbb{R}_{0+}$ and $\sum_{j=0}^{\infty} \eta_j \le 1$.

Assumption A3 strengthens A2 by requiring $\{\theta_j : j \ge 0\}$ to be strictly positive and decline to zero sufficiently fast. The sequence of a.s. non-negative conditional models $\{(X_{t,n}, \psi_{t,n}) : n \ge 0\}$ is shown to converge a.e. to the MD-ARCH(∞) process $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ for each $t \in \mathbb{Z}$ in the second part of Theorem 3, establishing non-negativity of (5):

THEOREM 3. Under A1–A3:

1. For each $t \in \mathbb{Z}$, the sequence $\{(X_{t,n}, \psi_{t,n}) : n \ge 0\}$ defined in (22) is non-negative a.e.

2. Under condition (14), there exists a subsequence $\{n_j : j \ge 0\}$ such that for each $t \in \mathbb{Z}$, $(X_{t,n_j}, \psi_{t,n_j}) \xrightarrow{a.s.} (X_t, \psi_t)$ as $j \to \infty$, where $(X_t, \psi_t)$ is defined in (5).

EXAMPLE 6. Consider the sequence of square-summable hyperbolically decaying coefficients (10) in the long memory MD-ARCH(∞) process (18). It can be easily shown that the following inequality holds for the sequence $\{\eta_j : j \ge 0\}$ implied by the model:

$$\eta_j \ge \gamma\,(d-\gamma)^j \quad \forall j \ge 0 .$$

Hence the condition $d \ge \gamma$ is sufficient for non-negativity of $\{\eta_j : j \ge 0\}$. No closed-form solution is available for the sum of the $\eta_j$ in this model, but numerical results show that the second part of A3 is also satisfied.

In the Ding and Granger (1996) model (21) the sequence $\{\eta_j : j \ge 0\}$ is given by the coefficients of the power series expansion of $1-(1-z)^d$ and therefore satisfies A3.

Consider the combined short- and long-memory model (19). Simple calculations show that the sequence $\{\eta_j : j \ge 0\}$ for this process satisfies:

$$\eta_j \ge \gamma\,(d-\gamma+\phi)^j \quad \forall j \ge 0$$

when $d(1-d-2\phi) \ge 0$. The latter condition together with $d-\gamma+\phi \ge 0$ are therefore sufficient for the first part of A3 to hold. Numerical calculations can be used to show that the second part of A3 is satisfied as well.
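The numerical checks of A3 referred to in this example can be sketched as follows. This is an illustrative sketch under stated assumptions, not the computation actually used: we assume that the coefficients $\theta^*_j$ in (10) are the expansion coefficients $\Gamma(d+j)/(\Gamma(d)\Gamma(1+j))$ of $(1-z)^{-d}$, that model (18) corresponds to $\phi = 0$ in (19), and that a finite truncation is adequate. All function names and parameter values are hypothetical.

```python
def frac_coeffs(d, n):
    """Coefficients of z^0..z^(n-1) in the expansion of (1 - z)^(-d) (assumed form of (10))."""
    c = [1.0]
    for j in range(1, n):
        c.append(c[-1] * (j - 1 + d) / j)
    return c

def eta_from_theta(theta):
    """Recursion (23): eta_0 = theta_0, eta_j = theta_j - sum_{k<j} theta_{j-1-k} * eta_k."""
    eta = []
    for j, th in enumerate(theta):
        eta.append(th - sum(theta[j - 1 - k] * eta[k] for k in range(j)))
    return eta

def theta_combined(d, gamma, phi, n):
    """theta_j = gamma * sum_{k<=j} phi^k * theta*_{j-k}, as stated for model (19)."""
    star = frac_coeffs(d, n)
    return [gamma * sum(phi ** k * star[j - k] for k in range(j + 1)) for j in range(n)]

def theta_ding_granger(d, n):
    """Model (21): theta_{j-1} is the coefficient of z^j in (1 - z)^(-d) - 1."""
    return frac_coeffs(d, n + 1)[1:]

def satisfies_a3(eta, tol=1e-12):
    """Truncated check of A3: non-negative terms and partial sum not exceeding one."""
    return min(eta) >= -tol and sum(eta) <= 1 + tol

if __name__ == "__main__":
    n = 300
    for label, theta in [
        ("long memory (18), d=0.3, gamma=0.2", theta_combined(0.3, 0.2, 0.0, n)),
        ("combined (19), d=0.3, gamma=0.2, phi=0.2", theta_combined(0.3, 0.2, 0.2, n)),
        ("Ding-Granger (21), d=0.3", theta_ding_granger(0.3, n)),
    ]:
        eta = eta_from_theta(theta)
        print(label, "A3 holds on truncation:", satisfies_a3(eta), "partial sum:", sum(eta))
```

A truncated check of this kind can only support, not prove, the summation condition in A3, because the $\eta_j$ decay hyperbolically and the partial sums converge slowly.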

EXAMPLE 7. Representation (6) of the covariance stationary GARCH(p,q) sequences ensures

their a.s. non-negativity. Using (8), one can easily check that A3 is also satisfied, where

$\{\eta_j : j \ge 0\}$ is recursively given by:

$$\eta_0 = \alpha_1 , \quad \eta_1 = \alpha_2 + \beta_1\eta_0 , \quad \eta_2 = \alpha_3 + \beta_1\eta_1 + \beta_2\eta_0 , \quad \ldots$$

This sequence is equivalent to the coefficients $\{\theta^*_j : j \ge 0\}$ in the ARCH(∞) representation (4) of the covariance stationary GARCH(p,q) models. The sufficient covariance stationarity condition of Giraitis, Kokoszka and Leipus (2000), together with A1, implies that $\sum_{j=0}^{\infty} \eta_j < 1$.

2.4 Convergence to fractional Brownian motion

Theorem 2 shows that summability of the autocovariance function for MD-ARCH(∞) model

depends on the properties of θj : j ≥ 0. In particular, GARCH(p,q) model (7) has summable

autocovariances, while the sequence of hyperbolically decaying square-summable coefficients in

long-memory MD-ARCH(∞) models (18) and (21) leads to the non-summable autocovariance

function. In the case of linear long-memory ARFIMA models of Granger and Joyeux (1980) and

Hosking (1981), the limit of appropriately normalized partial sums is given by the fractional

Brownian motion; see Marinucci and Robinson (1999) among others. In this subsection we establish a similar result for partial sums of long-memory MD-ARCH(∞) sequences.

Let us introduce the following notation. For $0 < H < 1$ and $r \in \mathbb{R}$, let $B_H(r)$ denote a zero-mean Gaussian process with the covariance function given by:

$$E[B_H(r_1)B_H(r_2)] = \tfrac12 \left( |r_1|^{2H} + |r_2|^{2H} - |r_1 - r_2|^{2H} \right) .$$

This process was introduced by Mandelbrot and Van Ness (1968) and is commonly referred to as fractional Brownian motion. Note that the case $H = \tfrac12$ corresponds to the standard Brownian motion. Let $\Rightarrow$ stand for convergence of finite-dimensional distributions, and let $\lfloor \cdot \rfloor$ denote the floor function.
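The covariance function above is easy to explore numerically. The following sketch, with hypothetical function names, evaluates it and the implied covariance between unit increments; for $H = \tfrac12$ it reduces to $\min(r_1, r_2)$ for non-negative arguments, and for $H > \tfrac12$ the increments are positively correlated at every lag.

```python
def fbm_cov(r1, r2, hurst):
    """Covariance E[B_H(r1) B_H(r2)] of fractional Brownian motion."""
    h2 = 2.0 * hurst
    return 0.5 * (abs(r1) ** h2 + abs(r2) ** h2 - abs(r1 - r2) ** h2)

def increment_cov(lag, hurst):
    """Covariance between B_H(1) - B_H(0) and B_H(lag + 1) - B_H(lag)."""
    h2 = 2.0 * hurst
    return 0.5 * (abs(lag + 1) ** h2 - 2.0 * abs(lag) ** h2 + abs(lag - 1) ** h2)

if __name__ == "__main__":
    print(fbm_cov(0.3, 0.7, 0.5))   # standard Brownian motion case
    print(increment_cov(5, 0.5))    # uncorrelated increments when H = 1/2
    print(increment_cov(5, 0.8))    # positively correlated increments when H > 1/2
```

In Theorem 4 below the relevant index is $H = d + \tfrac12$, so $d > 0$ corresponds to the positively correlated case.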

It is well known that the functional limit of the normalized partial sums of random variables

depends on the summability of their autocovariances. Under certain additional assumptions,

such as linear structure of the underlying time series process, summable autocovariance function

implies limit given by the standard Brownian motion. On the other hand, non-summable

autocovariances lead to convergence to the fractional Brownian motion.

Although GARCH(p,q) sequences (7) do not belong to the class of linear time series mod-

els, Giraitis, Kokoszka and Leipus (2000) show that the functional limit of such sequences is

given by the standard Brownian motion. Using similar techniques, Giraitis, Robinson and

Surgailis (2000) demonstrate that the normalized partial sums of their linear ARCH model (3)

converge in finite-dimensional distributions to the fractional Brownian motion. In line with

these results we show that the following limit holds for the long memory MD-ARCH(∞) se-

quences:

THEOREM 4. Under A1–A2, let $\{X_t : t \in \mathbb{Z}\}$ be defined by (5) such that for some $d > 0$ the sequence of coefficients $\{\theta_j : j \ge 0\}$ satisfies $\theta_j = O(j^{d-1})$ together with the condition (14) of Theorem 1. Then the following distributional limit holds for each $0 < r \le 1$:

$$\frac{1}{c_d\,T^{d+\frac12}} \sum_{t=1}^{\lfloor Tr \rfloor} (X_t - a) \Rightarrow B_{d+\frac12}(r) \quad \text{as } T \to \infty ,$$

where for $0 < K_d < \infty$ the coefficient $c_d$ is defined as $c_d^2 := E(\varepsilon_0 - 1)^2\, E\psi_0^2\, \dfrac{K_d}{d(1+2d)}$.

3 Statistical inference for MD-ARCH(∞) model

This section presents an account of the methods for statistical inference for the MD-ARCH(∞)

sequences introduced in section 2. Considering the novel structure of the model, our discussion

in this section will necessarily have a preliminary character. Using the functional limit result

of subsection 2.4, we first show how a well known semiparametric estimator can be used to

obtain inference on the parameter d in the long-memory MD-ARCH(∞) sequences. In the sec-

ond subsection we propose time-domain quasi-maximum likelihood estimator for simultaneous

estimation of all parameters of the model.

3.1 Semiparametric inference for long-memory MD-ARCH(∞) sequences

The class of MD-ARCH(∞) models, introduced in section 2, allows for parsimonious statistical

modeling of long memory covariance stationary non-negative sequences Xt : t ∈ Z. In

particular, we showed that the limiting behavior of the autocovariance function of the long

memory MD-ARCH(∞) model (18) and Ding and Granger (1996) model (21) depends on the

parameter d. Using methods developed for general long-memory sequences, inference on d can be obtained separately from the other parameters of the model, which in some applications are of secondary importance to researchers. In this subsection we discuss estimation of d using the R/S statistic of Hurst (1951).

Let xt : 1 ≤ t ≤ T be a sample of the long-memory MD-ARCH(∞) process with param-

eter d satisfying assumptions of Theorem 4. Giraitis, Kokoszka, Leipus and Teyssiere (2000)

consider application of R/S analysis to linear ARCH sequences (3) of Giraitis, Robinson and

Surgailis (2000) for which the functional limit analogous to that in Theorem 4 is available.

Define the empirical range estimator as follows:

$$R_T = \max_{1 \le k \le T} \sum_{t=1}^{k} (x_t - \bar{x}_T) - \min_{1 \le k \le T} \sum_{t=1}^{k} (x_t - \bar{x}_T) ,$$

where $\bar{x}_T$ is the usual sample mean. Let $s_T^2$ be the sample variance estimator given by:

$$s_T^2 = \frac{1}{T} \sum_{t=1}^{T} (x_t - \bar{x}_T)^2 .$$

Under assumptions analogous to those of Theorem 4 in subsection 2.4, Giraitis, Kokoszka, Leipus and Teyssiere (2000) show the following limit of the R/S statistic as $T \to \infty$:

$$\frac{R_T}{T^{d+\frac12}\, s_T} \xrightarrow{d} \frac{c_d \left[ \max_{0 \le r \le 1} B^0_{d+\frac12}(r) - \min_{0 \le r \le 1} B^0_{d+\frac12}(r) \right]}{E\left[(X_0 - a)^2\right]^{\frac12}} , \tag{24}$$

where $\xrightarrow{d}$ denotes convergence in distribution, $c_d$ is given in Theorem 4 and $B^0_{d+\frac12}(r) = B_{d+\frac12}(r) - r\,B_{d+\frac12}(1)$, for $0 \le r \le 1$, is the fractional Brownian bridge. Finally, an estimator of the long memory parameter d can be defined as follows:

$$d_T := \frac{\log (R_T / s_T)}{\log T} - \frac12 ,$$

for which the rate of convergence is given by $d_T - d = O_p\left([\log T]^{-1}\right)$; refer to (24).
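The estimator is straightforward to compute from a sample. The following is a minimal sketch of the three quantities defined above; the function name and the toy input are illustrative assumptions.

```python
import math

def rs_estimate(x):
    """Return (R_T, s_T, d_T) for a sample x, following the definitions in the text."""
    t = len(x)
    mean = sum(x) / t
    # Partial sums of deviations from the sample mean, for k = 1..T.
    partial, running = [], 0.0
    for value in x:
        running += value - mean
        partial.append(running)
    r_t = max(partial) - min(partial)
    s_t = math.sqrt(sum((v - mean) ** 2 for v in x) / t)
    d_t = math.log(r_t / s_t) / math.log(t) - 0.5
    return r_t, s_t, d_t

if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0]  # toy input, far too short for real inference
    print(rs_estimate(sample))
```

Because the rate of convergence is only logarithmic in $T$, estimates from short samples such as the toy input above carry little information about $d$.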

Giraitis, Kokoszka, Leipus and Teyssiere (2000) report results of a Monte Carlo study of the empirical bias and MSE of $d_T$. They conclude that its performance in the setting of the long-memory linear ARCH model (3) is similar to that found for linear time series models. The R/S statistic is also found to have somewhat better MSE compared to a number of other semiparametric estimators of d.

We note that parameter a in covariance stationary MD-ARCH(∞) sequences can simply be estimated by $\bar{x}_T$ using (15). Hosking (1996) derives the rate of convergence of $\bar{x}_T$ under assumptions on the limiting behavior of the autocovariance function similar to those of long-memory MD-ARCH(∞) models (18) and (21). He shows that $\bar{x}_T - a = O_p\left(T^{2d-1}\right)$. However, the limiting distribution of $\bar{x}_T$ shown in Hosking (1996) requires linear structure of $\{X_t : t \in \mathbb{Z}\}$.

A number of other semiparametric estimators of the long memory parameter d are available in the literature. A large class of estimators is based on log periodogram regressions similar to the one pioneered by Geweke and Porter-Hudak (1983).

3.2 The quasi-maximum likelihood estimator

Semiparametric methods presented above allow for estimation of only a subset of the parameters of the MD-ARCH(∞) model (5). Apart from efficiency considerations, testing of some interesting hypotheses is ruled out, in particular those concerning summability of the autocovariance function in the framework of the combined short- and long-memory model (19). For joint estimation of all parameters together with their covariance matrix we propose a time domain quasi-maximum likelihood estimator.

Properties of the QML estimator of short-memory GARCH(p,q) models were previously

addressed by several authors, most notably Lee and Hansen (1994), Lumsdaine (1996) and

Berkes, Horvath and Kokoszka (2003). These studies impose only a handful of moment conditions on the sequence of shocks $\{\varepsilon_t : t \in \mathbb{Z}\}$, largely compatible with A1. However, all three rely on the ARCH(∞) representation of GARCH(p,q) sequences with non-zero $a^*$ and absolutely summable $\{\theta^*_j : j \ge 0\}$ in (4). Hence, properties of the QML estimator of the class of long-memory MD-ARCH(∞) models introduced in section 2 remain unknown. General ideas associated with such an estimator are introduced below.

Recall that the set of parameters of MD-ARCH(∞) processes consists of $a$ and $\{\theta_j : j \ge 0\}$. Let the sequence of coefficients $\{\theta_j : j \ge 0\}$ be further parametrized by a finite-dimensional vector $\kappa$, and let $\lambda_0 = (a, \kappa)'$ denote the vector of true parameters. Under A1–A3, let $\{x_t : 1 \le t \le T\}$ be a sample of the $X_t$ part of the MD-ARCH(∞) process, with parameters evaluated at $\lambda_0$. The following sequence of functions approximates the unobserved sequence $\{p_t : 1 \le t \le T\}$, corresponding to the $\psi_t$ part of the MD-ARCH(∞) process:

$$w_1(u) = \alpha , \qquad w_t(u) = \alpha + \sum_{j=1}^{t-1} \theta_{j-1}(u)\,[x_{t-j} - w_{t-j}(u)] \quad \text{for } 1 < t \le T , \tag{25}$$

where $u = (\alpha, k)'$ has the same dimension as $\lambda_0$. Define the quasi-maximum likelihood function as follows:

$$L_T(u) = -\sum_{t=1}^{T} \left[ \log w_t(u) + \frac{x_t}{w_t(u)} \right] . \tag{26}$$

The quasi-maximum likelihood function (26) would be the proper likelihood function for the

model if, in addition to A1, we were to assume exponential distribution of ε0. For many

other specific distributional assumptions on ε0, notably those involving exponential family

of distributions, the difference between (26) and respective proper likelihoods will consist of

inessential constants. Finally, the QML estimator of $\lambda_0$ based on a sample of $T$ observations of the MD-ARCH(∞) process is defined as:

$$\lambda_T = \arg\max_{u \in U} L_T(u) , \tag{27}$$

where $\lambda_0 \in U$, and $U$ is a compact subspace of $\mathbb{R}^k$, where $k$ is the dimension of $\lambda_0$, such that A2–A3 are satisfied. Non-negativity of $\{w_t(u) : 1 \le t \le T\}$ for all $u \in U$ can then be deduced from the following representation of (25):

$$w_1(u) = \alpha , \qquad w_t(u) = \alpha + \sum_{j=1}^{t-1} \eta_{j-1}(u)\,(x_{t-j} - \alpha) \quad \text{for } 1 < t \le T ,$$

where $\{\eta_j(u) : j \ge 0\}$ is defined in terms of $\{\theta_j(u) : j \ge 0\}$ as in (23).
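The recursion (25), its equivalent representation in terms of $\eta_j(u)$ and the criterion (26) can be sketched in a few lines. The sketch below is illustrative only: the hyperbolic parametrization of $\theta_j$ (the Ding and Granger form with a single parameter $d$), the helper names and the toy sample are assumptions made for the example, and no optimizer is included.

```python
import math

def hyperbolic_theta(d, n):
    """Assumed parametrization: theta_{j-1} is the z^j coefficient of (1 - z)^(-d), j = 1..n."""
    c, out = 1.0, []
    for j in range(1, n + 1):
        c *= (j - 1 + d) / j
        out.append(c)
    return out

def eta_from_theta(theta):
    """Recursion (23)."""
    eta = []
    for j, th in enumerate(theta):
        eta.append(th - sum(theta[j - 1 - k] * eta[k] for k in range(j)))
    return eta

def w_feasible(x, alpha, theta):
    """Recursion (25); list index i corresponds to t = i + 1."""
    w = [alpha]
    for i in range(1, len(x)):
        w.append(alpha + sum(theta[j - 1] * (x[i - j] - w[i - j]) for j in range(1, i + 1)))
    return w

def w_via_eta(x, alpha, eta):
    """Equivalent representation of (25) in terms of eta and the raw observations."""
    return [alpha + sum(eta[j - 1] * (x[i - j] - alpha) for j in range(1, i + 1))
            for i in range(len(x))]

def quasi_loglik(x, w):
    """Criterion (26): exponential quasi log-likelihood."""
    return -sum(math.log(wt) + xt / wt for xt, wt in zip(x, w))

if __name__ == "__main__":
    x = [1.2, 0.4, 2.5, 0.9, 1.7, 0.3, 1.1, 3.0]  # toy non-negative sample
    theta = hyperbolic_theta(0.3, len(x))
    w = w_feasible(x, 1.0, theta)
    print(w)
    print(quasi_loglik(x, w))
```

In practice $\lambda_T$ in (27) would be obtained by passing a function of this kind to a numerical optimizer over the compact set $U$; the $\eta$ representation makes it transparent why $w_t(u)$ stays positive when A3 holds and the data are non-negative.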

However, it is clear that even when evaluated at true parameters, the sequence $\{w_t(\lambda_0) : 1 \le t \le T\}$ is only a rough approximation to the unobserved sequence $\{p_t : 1 \le t \le T\}$. By analogy with Lee and Hansen (1994), Lumsdaine (1996) and Berkes, Horvath and Kokoszka (2003), it appears to be convenient to base the study of asymptotic properties of the QML estimator on the following unobserved quasi-maximum likelihood function:

$$\tilde{L}_T(u) = -\sum_{t=1}^{T} \left[ \log \tilde{w}_t(u) + \frac{x_t}{\tilde{w}_t(u)} \right] , \tag{28}$$

where the sequence of functions $\{\tilde{w}_t(u) : 1 \le t \le T\}$ depends on the unobserved part of the sample $\{x_t : t \le 0\}$ and is defined as:

$$\tilde{w}_t(u) = \alpha + \sum_{j=1}^{\infty} \theta_{j-1}(u)\,[x_{t-j} - \tilde{w}_{t-j}(u)] \quad \text{for } 1 \le t \le T . \tag{29}$$

Upon establishing the properties of the QML estimator based on (28) and (29), it needs to be shown that $\sup_{u \in U} \left| \frac{1}{T} L_T(u) - \frac{1}{T} \tilde{L}_T(u) \right| \to 0$. Much of the previously cited literature on inference and estimation of the GARCH processes refers to the QML estimator based on the likelihood function (26), respectively (28), as the feasible, respectively infeasible, QML estimator.

Recall that Berkes, Horvath and Kokoszka (2003) establish consistency and asymptotic

normality of the QML estimator for GARCH(p,q) processes using their ARCH(∞) representa-

tion. As shown in Section 2, covariance stationary GARCH(p,q) processes have MD-ARCH(∞)

representation with exponentially decaying sequence of weighting coefficients θj : j ≥ 0. We

conjecture that the same conclusions hold with respect to the asymptotic properties of the

QML estimator (27) when applied to the covariance stationary MD-ARCH(∞) sequences. It

is also likely that in the case of long-memory MD-ARCH(∞) sequences the rate of convergence of the first element of $u$, which corresponds to the parameter $a$, will be different from $O_p(T^{-\frac12})$ and will depend on the degree of long memory.

4 Conclusion

This paper introduces a class of models for sequences of long memory covariance stationary

non-negative random variables. The new class, referred to as MD-ARCH(∞) class, is closely

related to the GARCH sequences of Engle (1982) and Bollerslev (1986), having multiplica-

tive structure of innovations and linear dependence on its own history. Using a representation

in terms of a sequence of martingale differences, similar to the one in Robinson (1991) and

Robinson and Henry (1999), the MD-ARCH(∞) model with the sequence of hyperbolically de-

caying square-summable weighting coefficients is shown to have non-summable autocovariance

function. In addition, the MD-ARCH(∞) class includes the usual covariance stationary short

memory GARCH sequences, providing the natural extension of GARCH models to the long

memory case.

Models introduced in this paper have a range of potential empirical applications in a variety

of fields. One of the most interesting questions which can be addressed using the class of models

introduced in this paper is testing the hypothesis of long memory versus the alternative of short

memory in samples of non-negative random variables, e. g. squared returns on financial and real

assets. However, more work is needed before the properties of parametric estimators, such as the

QML estimator described in Section 3, become known for the class of MD-ARCH(∞) processes.

Another area of application of the MD-ARCH(∞) sequences is the econometrics of high-frequency financial data. Time-series dynamics of durations between events in transactions data is complicated and often exhibits long-range dependence. However, direct application of the FIGARCH models to these data implies an infinite first unconditional moment of financial durations, a feature that is hardly attractive for empirical models in this field. We believe that researchers working with high-frequency financial durations data will find the results and ideas of this paper useful.

5 Appendix

This technical appendix collects proofs of the main theorems in Section 2. In the appendix we use notation $u_t := \varepsilon_t - 1$, where by assumption A1 $\{u_t : t \in \mathbb{Z}\}$ is a zero mean i.i.d. sequence. In addition, we use the convention $\sum_{m}^{n} \cdot = 0$ whenever $m > n$ for $m, n \in \mathbb{Z}$.

LEMMA 1. Under conditions of Theorem 2, for each $t \in \mathbb{Z}$ the sequence $\{M(k,t) : k \ge 0\}$ defined in (12) is orthogonal in $L^2$ with moments given by:

$$EM(k,t) = 0 , \qquad EM(k,t)^2 = \left[ Eu_0^2 \sum_{j=0}^{\infty} \theta_j^2 \right]^k$$

for each $k \ge 1$. The sequence $\{M(k,t) : t \in \mathbb{Z}\}$ is stationary and ergodic for each $k \ge 0$.

PROOF: As shown in Kokoszka and Leipus (2000) the following recursive equality holds for the sequence $\{M(k,t) : k \ge 0\}$:

$$M(0,t) = 1 , \qquad M(k,t) = \sum_{j=1}^{\infty} \theta_{j-1}\, u_{t-j}\, M(k-1, t-j) \quad \text{for } k \ge 1 ,\ t \in \mathbb{Z} . \tag{A.1}$$

By the condition (14), $Eu_0^2 < \infty$ and $\{\theta_j : j \ge 0\}$ is square-summable. Then by Lemma 2.2.2 of Stout (1974) $\{M(1,t) : t \in \mathbb{Z}\} \subseteq L^2$. Stationarity and ergodicity of $\{M(1,t) : t \in \mathbb{Z}\}$ follows from Lemma 3.5.8 and Theorem 3.5.8 of Stout (1974) by assumptions on $\{u_t : t \in \mathbb{Z}\}$ and (A.1).

Let $\{M(k-1,t) : t \in \mathbb{Z}\} \subseteq L^2$ be stationary and ergodic for given $k \ge 1$. Then the sequence $\{u_t M(k-1,t) : t \in \mathbb{Z}\}$ is orthogonal in $L^2$, where we use independence of $u_t$ and $M(k-1,t)$ for each $t \in \mathbb{Z}$. From (A.1) and Lemma 2.2.2 of Stout (1974) it follows that $\{M(k,t) : t \in \mathbb{Z}\} \subseteq L^2$. Stationarity and ergodicity of $\{M(k,t) : t \in \mathbb{Z}\}$ is the consequence of Theorem 3.5.8 of Stout (1974) and assumptions on $\{u_t : t \in \mathbb{Z}\}$ and $\{M(k-1,t) : t \in \mathbb{Z}\}$ together with representation (A.1). Induction shows that $\{M(k,t) : k \ge 0\} \subseteq L^2$ for each $t \in \mathbb{Z}$ and that $\{M(k,t) : t \in \mathbb{Z}\}$ is stationary and ergodic for each $k \ge 0$.

Previous results together with Theorem 2.3.1 of Lukacs (1975) allow us to obtain for each $t \in \mathbb{Z}$, $k \ge 1$: $EM(k,t) = \sum_{j=1}^{\infty} \theta_{j-1} E[u_{t-j} M(k-1,t-j)] = 0$ and $EM(k,t)^2 = \sum_{j=1}^{\infty} \theta_{j-1}^2 E[u_{t-j}^2 M(k-1,t-j)^2]$, from where $EM(k,t)^2 = \left[ Eu_0^2 \sum_{j=0}^{\infty} \theta_j^2 \right]^k$.

Similarly, the expression for the covariance of $M(1,t)$ and elements of $\{M(k,t) : k > 1\}$ is given by: $E[M(1,t)M(k,t)] = \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \theta_{j_1-1}\theta_{j_2-1} E[u_{t-j_1} u_{t-j_2} M(k-1,t-j_2)] = 0$, where we use the independence of $u_{t-m}$ from the other terms under the expectation operator for $m = j_1 \wedge j_2$, $j_1, j_2 \ge 1$, $j_1 \ne j_2$, and $EM(k-1,t) = 0$ for each $t \in \mathbb{Z}$, $k > 1$. It follows trivially that $E[M(0,t)M(k,t)] = 0$ for each $k \ge 1$.

Let $M(p-1,t)$ be orthogonal w.r.t. elements of $\{M(k,t) : k \ge 1, k \ne p-1\}$. Then similar arguments show: $E[M(p,t)M(k,t)] = \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \theta_{j_1-1}\theta_{j_2-1} E[u_{t-j_1} u_{t-j_2} M(p-1,t-j_1) M(k-1,t-j_2)] = 0$. By induction $\{M(k,t) : k \ge 0\}$ is orthogonal in $L^2$ for each $t \in \mathbb{Z}$.

LEMMA 2. Under conditions of Theorem 2, for each $t \in \mathbb{Z}$ and $0 \le n < \infty$ define sequences of random variables $\{M_n(k,t) : k \ge 0\}$ and $\{\tilde{M}_n(k,t) : k \ge 0\}$ as follows:

$$M_n(0,t) := 1 , \qquad M_n(k,t) := \sum_{\substack{j_1,\ldots,j_k \ge 1 \\ j_1+\ldots+j_k \le n}} \theta_{j_1-1} \cdots \theta_{j_k-1}\, u_{t-j_1} \cdots u_{t-j_1-\ldots-j_k} \quad \forall\, 1 \le k \le n , \tag{A.2}$$

$$\tilde{M}_n(0,t) := \theta_n , \qquad \tilde{M}_n(k,t) := \sum_{\substack{j_1,\ldots,j_k \ge 1 \\ j_1+\ldots+j_k \le n}} \theta_{j_1-1} \cdots \theta_{j_k-1}\, \theta_{n-j_1-\ldots-j_k}\, u_{t-j_1} \cdots u_{t-j_1-\ldots-j_k} \quad \forall\, 1 \le k \le n , \tag{A.3}$$

and $M_n(k,t) = \tilde{M}_n(k,t) := 0$ for $k > n$. Then $\{M_n(k,t) : k \ge 0\}$ and $\{\tilde{M}_n(k,t) : k \ge 0\}$ are orthogonal sequences in $L^2$ with the following properties:

$$E[M_n(p,t)M(k,t)] = 0 , \qquad E[\tilde{M}_n(p,t)M(k,t)] = 0 , \qquad E[M_n(p,t)\tilde{M}_n(k,t)] = 0$$

for $p, k \ge 0$, $p \ne k$.

PROOF: Definitions of $M_n(k,t)$ and $\tilde{M}_n(k,t)$ involve only finite sums, and it follows immediately that $\{M_n(k,t) : k \ge 0\} \subseteq L^2$ and $\{\tilde{M}_n(k,t) : k \ge 0\} \subseteq L^2$ for each $t \in \mathbb{Z}$ and $0 \le n < \infty$ under assumptions of Theorem 2. Results $EM_n(k,t) = E\tilde{M}_n(k,t) = 0$ for $k \ge 1$ and $E[M_n(p,t)M_n(k,t)] = E[\tilde{M}_n(p,t)\tilde{M}_n(k,t)] = 0$ for $p, k \ge 0$, $p \ne k$ follow easily by the i.i.d. property of $\{u_t : t \in \mathbb{Z}\}$.

Analogously to (A.1), $M_n(k,t)$ and $\tilde{M}_n(k,t)$ admit the following recursive representations:

$$M_n(0,t) = 1 , \qquad M_n(k,t) = \sum_{j=1}^{n-k+1} \theta_{j-1}\, u_{t-j}\, M_{n-j}(k-1,t-j) \quad \forall\, 1 \le k \le n , \tag{A.4}$$

$$\tilde{M}_n(0,t) = \theta_n , \qquad \tilde{M}_n(k,t) = \sum_{j=1}^{n-k+1} \theta_{j-1}\, u_{t-j}\, \tilde{M}_{n-j}(k-1,t-j) \quad \forall\, 1 \le k \le n \tag{A.5}$$

for each $t \in \mathbb{Z}$ and $0 \le n < \infty$. Using Theorem 2.3.1 of Lukacs (1975) together with arguments presented in Lemma 1 we get:

$$E[M_n(1,t)M(k,t)] = \sum_{j_1=1}^{n-k+1} \sum_{j_2=1}^{\infty} \theta_{j_1-1}\theta_{j_2-1} E[u_{t-j_1} u_{t-j_2} M(k-1,t-j_2)] = 0$$

for each $0 \le n < \infty$ and $k > 1$. Trivially, $E[M_n(0,t)M(k,t)] = E[M(0,t)M_n(k,t)] = 0$ for each $k \ge 1$. Assume that for each $0 \le n < \infty$, $M_n(p-1,t)$ is orthogonal w.r.t. elements of $\{M(k,t) : k \ge 1, k \ne p-1\}$. Then similar considerations lead to:

$$E[M_n(p,t)M(k,t)] = \sum_{j_1=1}^{n-k+1} \sum_{j_2=1}^{\infty} \theta_{j_1-1}\theta_{j_2-1} E[u_{t-j_1} u_{t-j_2} M_{n-j_1}(p-1,t-j_1) M(k-1,t-j_2)] = 0 .$$

By induction $E[M_n(p,t)M(k,t)] = 0$ for $p, k \ge 0$, $p \ne k$. A similar line of reasoning can be used to establish the remaining results.

PROOF OF THEOREM 1: As shown in Lemma 1, the sequence $\{M(k,t) : k \ge 0\}$, for given $t \in \mathbb{Z}$, can be written in the linear form (A.1). The sequence of innovations in (A.1) is orthogonal in $L^2$, weighted by the coefficients $\{\theta_j : j \ge 0\}$. Theorem 2.3.2 of Stout (1974) and condition (13) imply that $M(k,t)$ is finite a.e. on $(\Omega, \mathcal{F}, P)$ for every $k \ge 0$.

Consider representation (11) of the MD-ARCH(∞) process. Using the expression for $EM(k,t)^2$ derived in Lemma 1 and condition (14), it follows by the standard infinite series convergence tests that $\sum_{k=1}^{\infty} [\log k]^2\, EM(k,t)^2 < \infty$. Orthogonality of the sequence $\{M(k,t) : k \ge 0\}$ shown in Lemma 1 together with Theorem 2.3.2 of Stout (1974) imply that the infinite series in (11) is finite a.e. on $(\Omega, \mathcal{F}, P)$.

From (11) and (12) $X_t = X(\varepsilon_t, \varepsilon_{t-1}, \ldots)$ and $\psi_t = \psi(\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots)$, where $X$ and $\psi$ are measurable functions of $\{\varepsilon_t : t \in \mathbb{Z}\}$. Hence, stationarity and ergodicity of $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ follows from Lemma 3.5.8 and Theorem 3.5.8 of Stout (1974).

PROOF OF THEOREM 2: Using orthogonality of $\{M(k,t) : k \ge 0\}$, for given $t \in \mathbb{Z}$, shown in Lemma 1, and Lemma 2.2.2 of Stout (1974), the infinite series in (11) will converge in $L^2$ if $\sum_{k=0}^{\infty} EM(k,t)^2 < \infty$. This holds by the condition (14). The first unconditional moment of $\psi_t$ is given by: $E\psi_t = a \sum_{k=0}^{\infty} EM(k,t) = a$, which coincides with the first unconditional moment of $X_t$.

The following auxiliary result is used in the derivation of the autocovariance function of $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$, where $n \ge 0$:

$$E[M(p,t+n)M(k,t)] = \sum_{j_1,j_2=1}^{\infty} \theta_{j_1-1}\theta_{j_2-1} E[u_{t+n-j_1} u_{t-j_2} M(p-1,t+n-j_1) M(k-1,t-j_2)]$$

$$= Eu_0^2 \sum_{j=1}^{\infty} \theta_{j-1}\theta_{j+n-1} E[M(p-1,t-j) M(k-1,t-j)] ,$$

where the last equality is justified by the fact that for $(j_1-n) \ne j_2$ and $m = (j_1-n) \wedge j_2$ for $j_1, j_2 \ge 1$, $(\varepsilon_{t-m} - 1)$ is independent of the rest of the terms under the expectation operator, producing zero terms. By orthogonality of $\{M(k,t) : k \ge 0\}$ for given $t \in \mathbb{Z}$, $E[M(p-1,t-j)M(k-1,t-j)]$ are different from zero only for $p = k$. Hence:

$$E[M(p,t+n)M(k,t)] = \begin{cases} Eu_0^2 \sum_{j=1}^{\infty} \theta_{j-1}\theta_{j+n-1}\, EM(p-1,t-j)^2 & \text{for } p = k , \\ 0 & \text{for } p \ne k . \end{cases}$$

This result together with stationarity of $\{M(k,t) : t \in \mathbb{Z}\}$ for given $k \ge 0$ is used to derive the autocovariance function of $\{\psi_t : t \in \mathbb{Z}\}$ as follows:

$$E[(\psi_{t+n} - a)(\psi_t - a)] = a^2 \sum_{p=1}^{\infty} \sum_{k=1}^{\infty} E[M(p,t+n)M(k,t)]$$

$$= a^2\, Eu_0^2 \left( \sum_{k=0}^{\infty} EM(k,t)^2 \right) \left( \sum_{j=0}^{\infty} \theta_j\theta_{j+n} \right) = \frac{a^2\, Eu_0^2}{1 - Eu_0^2 \sum_{j=0}^{\infty} \theta_j^2} \sum_{j=0}^{\infty} \theta_j\theta_{j+n} .$$

Finally, the autocovariance function of $\{X_t : t \in \mathbb{Z}\}$ is derived using its representation given below:

$$X_t - a = \sum_{j=1}^{\infty} \theta_{j-1}(X_{t-j} - \psi_{t-j}) + (X_t - \psi_t) = \sum_{j=0}^{\infty} \theta^*_j (X_{t-j} - \psi_{t-j}) = \sum_{j=0}^{\infty} \theta^*_j\, u_{t-j}\, \psi_{t-j} .$$

Note that the sequence $\{u_t\psi_t : t \in \mathbb{Z}\}$ is orthogonal in $L^2$ by the previous results and assumptions on $\{u_t : t \in \mathbb{Z}\}$, and $\{\theta^*_j : j \ge 0\}$ is square-summable. Hence, standard results of Brockwell and Davis (1991) can be used to show: $E[(X_{t+n}-a)(X_t-a)] = Eu_0^2 \sum_{j=0}^{\infty} \theta^*_j \theta^*_{j+n}\, E\psi_{t-j}^2$. Using stationarity of $\{\psi_t : t \in \mathbb{Z}\}$ together with the expression for its variance derived above we arrive at the desired result.

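PROOF OF THEOREM 3: By simple recursive substitution, for each $n \ge 0$ and $t \in \mathbb{Z}$, the $\psi_{t,n}$ part of the conditional process (22) can be written as follows:

$$\psi_{t,0} = \psi , \qquad \psi_{t,n} = a + \sum_{j=1}^{n} \eta_{j-1}(X_{t-j,n-j} - a) + \eta_n(\psi - a) \quad \forall n \ge 1 .$$

By A3 the sequence $\{\eta_j : j \ge 0\}$ is non-negative with $\sum_{j=0}^{n} \eta_j \le 1$ for all $n \ge 0$, showing a.s. non-negativity of $\{(X_{t,n}, \psi_{t,n}) : n \ge 0\}$.

To prove the second part of the theorem we use another representation of $\psi_{t,n}$, which again is derived by simple recursive substitution:

$$\psi_{t,n} = \sum_{k=0}^{n} \left[ a M_n(k,t) + (\psi - a)\tilde{M}_n(k,t) \right] \quad \forall n \ge 0 \text{ and } \forall\, t \in \mathbb{Z} .$$

It follows from Lemma 2 that $\{\psi_{t,n} : n \ge 0\} \subseteq L^2$ for every $t \in \mathbb{Z}$. In the following we study $L^2$ convergence of the random variables $(\psi_t - \psi_{t,n})$ as $n \to \infty$. Using the previous expression for $\psi_{t,n}$ together with equation (11) and Lemmas 1 and 2 we can write for each $t \in \mathbb{Z}$ and $n \ge 0$:

$$E(\psi_t - \psi_{t,n})^2 = \sum_{k=0}^{n} E\left[ a[M(k,t) - M_n(k,t)] - (\psi - a)\tilde{M}_n(k,t) \right]^2 + a^2 \sum_{k=n+1}^{\infty} EM(k,t)^2 .$$

Since $\sum_{k=0}^{\infty} EM(k,t)^2 < \infty$ by condition (14), the second part of this expression converges to zero as $n \to \infty$. Consider now the first part. Using inequality (A.8) we can write for each $n \ge 0$:

$$0 \le \sum_{k=0}^{n} E\left[ a[M(k,t) - M_n(k,t)] - (\psi - a)\tilde{M}_n(k,t) \right]^2 \le \sum_{k=0}^{\infty} E\left[ a[M(k,t) - M_n(k,t)] - (\psi - a)\tilde{M}_n(k,t) \right]^2 \le \left( a^2 + (\psi - a)^2 \sum_{j=0}^{\infty} \theta_j^2 \right) \sum_{k=0}^{\infty} EM(k,t)^2 < \infty .$$

Given $\varepsilon > 0$, we can then choose $K \ge 0$ s.t. the following inequality will hold for each $n \ge 0$ and $t \in \mathbb{Z}$:

$$0 \le \sum_{k=0}^{\infty} E\left[ a[M(k,t) - M_n(k,t)] - (\psi - a)\tilde{M}_n(k,t) \right]^2 \le \sum_{k=0}^{K} E\left[ a[M(k,t) - M_n(k,t)] - (\psi - a)\tilde{M}_n(k,t) \right]^2 + \left( a^2 + (\psi - a)^2 \sum_{j=0}^{\infty} \theta_j^2 \right) \sum_{k=K}^{\infty} EM(k,t)^2$$

$$\le \sum_{k=0}^{K} E\left[ a[M(k,t) - M_n(k,t)] - (\psi - a)\tilde{M}_n(k,t) \right]^2 + \frac{\varepsilon}{2} .$$

From (A.7) it follows that the last sum in the expression above converges to zero as $n \to \infty$, establishing that $E(\psi_t - \psi_{t,n})^2 \to 0$. The result of the theorem then follows from Corollary 2.1.1 of Stout (1974). Using a.s. convergence of $\{\psi_{t,n} : n \ge 0\}$ to $\psi_t$ and definitions of $X_{t,n}$ and $X_t$ in respectively (22) and (5), the second part of the theorem is established.

Auxiliary results (A.7) and (A.8), which we use above, are derived as follows. Using recursive representations (A.1), (A.4) and (A.5) together with orthogonality of terms under the summation sign we can write for each $n \ge 0$, $k \ge 1$ and $t \in \mathbb{Z}$:

$$E\left[ a[M(k,t) - M_n(k,t)] - (\psi - a)\tilde{M}_n(k,t) \right]^2 = Eu_0^2 \Bigg[ \sum_{j=1}^{n-k+1} \theta_{j-1}^2\, E\left[ a[M(k-1,t-j) - M_{n-j}(k-1,t-j)] - (\psi - a)\tilde{M}_{n-j}(k-1,t-j) \right]^2 + a^2 \sum_{j=n-k+2}^{\infty} \theta_{j-1}^2\, EM(k-1,t-j)^2 \Bigg] . \tag{A.6}$$

Consider the case $k = 1$. The right-hand side of (A.6) reduces to: $Eu_0^2 \left[ (\psi - a)^2 \sum_{j=0}^{n-1} \theta_j^2 \theta_{n-j-1}^2 + a^2 \sum_{j=n}^{\infty} \theta_j^2 \right]$. Using the property of absolutely convergent series $\left( \sum_{j=0}^{\infty} \theta_j^2 \right)^2 = \sum_{i=0}^{\infty} \sum_{j=0}^{i} \theta_j^2 \theta_{i-j}^2$, we conclude that for each $t \in \mathbb{Z}$ this expression converges to zero as $n \to \infty$. In addition, the following inequality holds: $E\left[ a[M(1,t) - M_n(1,t)] - (\psi - a)\tilde{M}_n(1,t) \right]^2 \le \left( (\psi - a)^2 \sum_{j=0}^{\infty} \theta_j^2 + a^2 \right) EM(1,t)^2$. Assume now that for each $t \in \mathbb{Z}$, $E\left[ a[M(k-1,t) - M_n(k-1,t)] - (\psi - a)\tilde{M}_n(k-1,t) \right]^2 \to 0$ as $n \to \infty$. We can rewrite (A.6) in the following way:

$$E\left[ a[M(k,t) - M_n(k,t)] - (\psi - a)\tilde{M}_n(k,t) \right]^2 = Eu_0^2 \Bigg[ \sum_{i=0}^{\infty} \theta_{n-k-i}^2\, 1_{i \le n-k}\, E\left[ a[M(k-1,t-1-i) - M_{k-1+i}(k-1,t-1-i)] - (\psi - a)\tilde{M}_{k-1+i}(k-1,t-1-i) \right]^2 + a^2 \left( Eu_0^2 \sum_{j=0}^{\infty} \theta_j^2 \right)^{k-1} \sum_{j=n-k+1}^{\infty} \theta_j^2 \Bigg] .$$

The second term on the right-hand side of this expression converges to zero as $n \to \infty$ by the square-summability of $\{\theta_j : j \ge 0\}$ implied by the condition in Theorem 2. The square-summability also ensures that $\sum_{i=0}^{\infty} \theta_{n-k-i}^2\, 1_{i \le n-k} \le \sum_{j=0}^{\infty} \theta_j^2 < \infty$ for all $n \ge 0$, and $\theta_{n-k-i}^2\, 1_{i \le n-k} \to 0$ as $n \to \infty$ for each $i \ge 0$. Using Lemma 3.2.3 of Stout (1974) we conclude that the first right-hand side term of the expression above also converges to zero as $n \to \infty$. By induction we conclude that for each $k \ge 0$ and $t \in \mathbb{Z}$:

$$E\left[ a[M(k,t) - M_n(k,t)] - (\psi - a)\tilde{M}_n(k,t) \right]^2 \to 0 \quad \text{as } n \to \infty , \tag{A.7}$$

where the case $k = 0$ follows trivially. Assume now that for each $t \in \mathbb{Z}$, $E\left[ a[M(k-1,t) - M_n(k-1,t)] - (\psi - a)\tilde{M}_n(k-1,t) \right]^2 \le \left( a^2 + (\psi - a)^2 \sum_{j=0}^{\infty} \theta_j^2 \right) EM(k-1,t)^2$. Substituting this inequality into (A.6) and collecting terms we conclude that for each $n, k \ge 0$ and $t \in \mathbb{Z}$:

$$E\left[ a[M(k,t) - M_n(k,t)] - (\psi - a)\tilde{M}_n(k,t) \right]^2 \le \left( a^2 + (\psi - a)^2 \sum_{j=0}^{\infty} \theta_j^2 \right) EM(k,t)^2 , \tag{A.8}$$

where the case $k = 0$ is trivial.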

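PROOF OF THEOREM 4: Define $\nu_t := \psi_t u_t$ and write $X_t = \psi_t + \nu_t$. Then:

$$\frac{1}{T^{d+\frac12}} \sum_{t=1}^{\lfloor Tr \rfloor} (X_t - EX_t) = \frac{1}{T^{d+\frac12}} \sum_{t=1}^{\lfloor Tr \rfloor} (\psi_t - a) + \frac{1}{T^{d+\frac12}} \sum_{t=1}^{\lfloor Tr \rfloor} \nu_t ,$$

where $EX_t = a$ is from (15) by assumed stationarity of $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$. The second term in this expression is asymptotically negligible:

$$\frac{1}{T^{2d+1}} E\left( \sum_{t=1}^{T} \nu_t \right)^2 = \frac{1}{T^{2d}}\, Eu_0^2\, E\psi_0^2 \to 0 \quad \text{as } T \to \infty ,$$

where we again use the stationarity assumption together with orthogonality of $\{\nu_t : t \in \mathbb{Z}\}$. Hence, it is sufficient to show that:

$$\frac{1}{T^{d+\frac12}} \sum_{t=1}^{\lfloor Tr \rfloor} (\psi_t - a) \Rightarrow c_d\, B_{d+\frac12}(r) . \tag{A.9}$$

As in Giraitis, Robinson and Surgailis (2000), we use representation (5) to write:

$$\psi_t - a = \sum_{j=1}^{\infty} \theta_{j-1}\, E[\psi_{t-j} | \mathcal{F}^+_{t-j-M}]\, u_{t-j} + \sum_{j=1}^{\infty} \theta_{j-1} \left( \psi_{t-j} - E[\psi_{t-j} | \mathcal{F}^+_{t-j-M}] \right) u_{t-j} := z^-_t + z^+_t ,$$

where $M \ge 0$ is a given integer, $\mathcal{F}^+_t$ denotes information generated by the process $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ from $t$ onwards and $\theta_j = O(j^{d-1})$. It follows that $\{z^+_t : t \in \mathbb{Z}\}$ and $\{z^-_t : t \in \mathbb{Z}\}$ are stationary $L^2$ sequences. We can write:

$$\frac{1}{T^{d+\frac12}} \sum_{t=1}^{\lfloor Tr \rfloor} (\psi_t - a) = \frac{1}{T^{d+\frac12}} \sum_{t=1}^{\lfloor Tr \rfloor} z^-_t + \frac{1}{T^{d+\frac12}} \sum_{t=1}^{\lfloor Tr \rfloor} z^+_t . \tag{A.10}$$

We show that, by choosing sufficiently large $M$, the variance of the second term in (A.10) can be made arbitrarily close to zero, and hence it can be ignored in subsequent derivations of the limiting process:

$$\frac{1}{T^{2d+1}} E\left( \sum_{t=1}^{T} z^+_t \right)^2 = \frac{1}{T^{2d+1}} \sum_{t=1}^{T} \sum_{s=1}^{T} E[z^+_t z^+_s] = Eu_0^2\, E\left( \psi_0 - E[\psi_0 | \mathcal{F}^+_{-M}] \right)^2 \frac{1}{T^{2d+1}} \sum_{t=1}^{T} \sum_{s=1}^{T} \sum_{j=0}^{\infty} \theta_j \theta_{j+|t-s|} .$$

The following limit relies on the assumed structure of $\{\theta_j : j \ge 0\}$:

$$\frac{1}{T^{2d+1}} \sum_{t=1}^{T} \sum_{s=1}^{T} \sum_{j=0}^{\infty} \theta_j \theta_{j+|t-s|} \to \frac{K_d}{d(1+2d)} \quad \text{as } T \to \infty ,$$

where the constant $0 < K_d < \infty$ depends on $d$ and is possibly different for various parametrizations of $\{\theta_j : j \ge 0\}$; refer to Giraitis, Robinson and Surgailis (2000) and Example 4. It follows that $\frac{1}{T^{2d+1}} E\left( \sum_{t=1}^{T} z^+_t \right)^2 \to Eu_0^2\, E\left( \psi_0 - E[\psi_0 | \mathcal{F}^+_{-M}] \right)^2 \frac{K_d}{d(1+2d)}$ as $T \to \infty$, which in turn goes to zero as $M$ increases since $E\left( \psi_0 - E[\psi_0 | \mathcal{F}^+_{-M}] \right)^2 \to 0$ as $M \to \infty$. Consider now the first part of (A.10). Using previous results we can write:

$$\frac{1}{T^{2d+1}} E\left( \sum_{t=1}^{T} z^-_t \right)^2 = Eu_0^2\, E\left( E[\psi_0 | \mathcal{F}^+_{-M}] \right)^2 \frac{1}{T^{2d+1}} \sum_{t=1}^{T} \sum_{s=1}^{T} \sum_{j=0}^{\infty} \theta_j \theta_{j+|t-s|} \to Eu_0^2\, E\left( E[\psi_0 | \mathcal{F}^+_{-M}] \right)^2 \frac{K_d}{d(1+2d)} \quad \text{as } T \to \infty .$$

By choosing a sufficiently large value of $M$, the last expression can be made arbitrarily close to $c_d^2 := Eu_0^2\, E\psi_0^2\, \frac{K_d}{d(1+2d)}$. Result (A.9) then follows by arguments in Giraitis, Robinson and Surgailis (2000) based on the linear structure of $z^-_t$, where $\{E[\psi_t | \mathcal{F}^+_{t-M}]\, u_t : t \in \mathbb{Z}\}$ is a stationary $M$-dependent orthogonal sequence in $L^2$.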

References

Andersen, Torben G., Tim Bollerslev, Francis Diebold, and Paul Labys (2001) The distribution of exchange rate volatility. Journal of the American Statistical Association, vol. 96, pp. 42-55.

Baillie, Richard T., Tim Bollerslev and Hans O. Mikkelsen (1996) Fractionally integrated generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, vol. 74, pp. 3-30.

Berkes, Istvan, Lajos Horvath and Piotr Kokoszka (2003) GARCH processes: Structure and

estimation. Bernoulli, vol. 9.

Berkes, Istvan, Lajos Horvath and Piotr Kokoszka (2002) Probabilistic and statistical properties of GARCH processes. Preprint.

Bollerslev, Tim (1986) Generalized autoregressive conditional heteroskedasticity. Journal of

Econometrics, vol. 31, pp. 307-327.

Bollerslev, Tim, Robert F. Engle and D. B. Nelson (1994) ARCH models. Handbook of Econometrics, vol. IV, pp. 2961-3031, New-York: Elsevier Science.

Bollerslev, Tim and Hans O. Mikkelsen (1996) Modeling and pricing long memory in stock

market volatility. Journal of Econometrics, vol. 73, pp. 151-184.

Brockwell, Peter J. and Richard A. Davis (1991) Time series: theory and methods. Second

Edition, New-York: Springer-Verlag.

Chung, Ching-Fan and Richard T. Baillie (1993) Small sample bias in conditional sum-of-

squares estimators of fractionally integrated ARMA models. Empirical Economics, vol. 18,

pp. 791-806.



Engle, Robert F. (2000) The econometrics of ultra-high-frequency data. Econometrica, vol. 68,

no. 1, pp. 1-22.


Engle, Robert F. (1982) Autoregressive conditional heteroskedasticity with estimates of the

variance of U.K. inflation. Econometrica, vol. 50, pp. 987-1008.

Engle, Robert F. and Bollerslev, Tim (1986) Modeling persistence of conditional variance.

Econometric Reviews, vol. 5, pp. 1-50.

Engle, Robert F. and Jeffrey R. Russell (1998) Autoregressive conditional duration: a new

model for irregularly spaced transaction data. Econometrica, vol. 66, pp. 1127-1162.

Geweke, John and Susan Porter-Hudak (1983) The estimation and application of long memory

time series models. Journal of Time Series Analysis, vol. 4, pp. 221-238.

Giraitis, Liudas, Piotr Kokoszka and Remigijus Leipus (2000) Stationary ARCH models: de-

pendence structure and central limit theorem. Econometric Theory, vol. 16, pp. 3-22.

Giraitis, Liudas, Piotr Kokoszka, Remigijus Leipus and Gilles Teyssiere (2000) Semiparametric

estimation of the intensity of long memory in conditional heteroskedasticity. Statistical

Inference for Stochastic Processes, vol. 3, pp. 113-128.

Giraitis, Liudas, Peter M. Robinson and Donatas Surgailis (2000) A model for long memory

conditional heteroscedasticity. The Annals of Applied Probability, vol. 10, pp. 1002-1024.

Granger, Clive W.J. and Joyeux, R. (1980) An introduction to long memory time series models

and fractional differencing. Journal of Time Series Analysis, vol. 1, pp. 15-39.

Hosking, Jonathan R.M. (1981) Fractional differencing. Biometrika, vol. 68, pp. 165-176.

Hosking, Jonathan R.M. (1996) Asymptotic distributions of the sample mean, autocovariances,

and autocorrelations of long-memory time series. Journal of Econometrics, vol. 73, pp. 261-

284.

Jasiak, Joanna (1998) Persistence in intertrade durations. Finance, vol. 19, pp. 166-195.

Kazakevicius, Vytautas and Remigijus Leipus (2002) On stationarity in the ARCH(∞) model.

Econometric Theory, vol. 18, pp. 1-16.

Kokoszka, Piotr and Remigijus Leipus (2000) Change-point estimation in ARCH models.

Bernoulli, vol. 6, pp. 1-28.

Lee, Sang-Won and Bruce E. Hansen (1994) Asymptotic theory for the GARCH(1,1) quasi-

maximum likelihood estimator. Econometric Theory, vol. 10, pp. 29-52.

Lukacs, Eugene (1975) Stochastic convergence. Second Edition, Academic Press, Inc.

Lumsdaine, Robin L. (1996) Consistency and asymptotic normality of the quasi-maximum like-

lihood estimator in IGARCH(1,1) and covariance stationary GARCH(1,1) models. Econo-

metrica, vol. 64, pp. 575-596.

Mandelbrot, B. and J. W. Van Ness (1968) Fractional Brownian motions, fractional noises and

applications. SIAM Review, vol. 10, pp. 422-437.

Marinucci, D. and Peter M. Robinson (1999) Alternative forms of fractional Brownian motion.

Journal of Statistical Planning and Inference, vol. 80, pp. 111-122.

McLeod, A. I. and K. W. Hipel (1978) Preservation of the rescaled adjusted range, 1: a

reassessment of the Hurst phenomenon. Water Resources Research, vol. 14, pp. 491-508.

Nelson, Daniel B. (1990) Stationarity and persistence in the GARCH(1,1) model. Econometric


Theory, vol. 6, pp. 318-334.

Robinson, Peter M. (1991) Testing for strong serial correlation and dynamic conditional het-

eroskedasticity in multiple regression. Journal of Econometrics, vol. 47, pp. 67-84.

Robinson, Peter M. and M. Henry (1999) Long and short memory conditional heteroscedasticity

in estimating the memory parameter of levels. Econometric Theory, vol. 15, pp. 299-336.

Stout, William F. (1974) Almost sure convergence. Academic Press, Inc.


Chapter 2: Long memory ARCH(∞) models: specification and quasi–maximum likelihood

estimation


Long memory ARCH(∞) models: specification and

quasi-maximum likelihood estimation

Dmitri Koulikov∗





phone: +45 89421577


This revision:

December 8, 2003

Abstract

The paper introduces the long memory ARCH(∞) model and studies the asymptotic properties of its QML estimator. The class of ARCH(∞) sequences of Robinson (1991) includes many popular models for dynamic conditional volatility and high frequency financial data econometrics. Giraitis, Kokoszka and Leipus (2000) show that the covariance stationary solution of the ARCH(∞) model with non-zero intercept has a summable autocovariance function, and hence short memory as defined in McLeod and Hipel (1978). The first part of the paper shows that a long memory non-negative solution of the ARCH(∞) model exists as well, but requires the intercept to be zero. The result is established using the equivalence between covariance stationary ARCH(∞) and MD-ARCH(∞) models, the latter studied in Koulikov (2003). The second part of the paper examines the properties of a time-domain QML estimator of the memory parameter of the new model. Strong consistency and asymptotic normality of the infeasible estimator are established, while only the consistency result holds for the feasible case. A Monte Carlo experiment is conducted to assess the properties of the QML estimator in finite samples.

JEL classification: C13, C15, C22

Keywords: Conditional heteroscedasticity, Long-memory, ARCH(∞), Weak stationarity,

Quasi-maximum likelihood estimation

∗Most important parts of the paper were completed during my visit to Nuffield College, University of Oxford,

in spring 2003. Their hospitality is gratefully acknowledged. I would also like to thank Bent Jesper Christensen,

Neil Shephard, Bent Nielsen and Matthias Winkel for their helpful suggestions.


1 Introduction

The class of ARCH(∞) processes, defined in Robinson (1991) and studied in detail in recent papers by Giraitis, Kokoszka and Leipus (2000) and Kazakevicius and Leipus (2002), serves as an important class of models for dynamic sequences of non-negative random variables, henceforth denoted as $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$. The stationary sequence $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ is said to be ARCH(∞) if it satisfies the following set of stochastic equations:

$$X_t = \psi_t \cdot \varepsilon_t , \qquad \psi_t = a^{*} + \sum_{j=1}^{\infty} \pi_{j-1} X_{t-j} , \tag{1}$$

where $\{\varepsilon_t : t \in \mathbb{Z}\}$ is the sequence of non-negative shocks, and all parameters are assumed to be non-negative. The popular GARCH(p,q) model of Bollerslev (1986) and many of its subsequent variations represent the most widely used parametrizations of the ARCH(∞) process; refer to Giraitis, Kokoszka and Leipus (2000) for a thorough description and further examples. This class of models is particularly useful in applications in finance, where non-negative time series are often encountered. Examples of such data include various volatility measures and, more recently, durations between market events in high-frequency financial datasets. The complicated nature of dynamic dependencies in this kind of series is a well established empirical fact, as highlighted inter alia in Andersen, Bollerslev, Diebold and Labys (2001) and Jasiak (1998).

Yet, several outstanding issues related to the ARCH(∞) sequences have hitherto received

limited attention in the literature. One such issue is the existence of the long memory solution

of the ARCH(∞) model. This is especially relevant as many financial series exhibit strong

dependence, often characterized by high persistence of the estimated autocovariance function.

One of the early attempts to define a long memory ARCH(∞) model is the study by Ding and Granger (1996), who also present a wealth of empirical evidence of persistent autocorrelations in squared and absolute returns of several real-world financial data series. One of the conclusions of Ding and Granger (1996) is that existing GARCH and IGARCH volatility models do not provide an adequate fit to the observed autocorrelation structure of the empirical volatility estimates of their series.

In order to account for persistent volatility, Ding and Granger (1996) suggest the following

parametrization of the ARCH(∞) process:

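$$X_t = \psi_t \cdot \varepsilon_t , \qquad \psi_t = a \int_0^1 \frac{1-\alpha-\beta}{1-\beta}\, dF(\alpha,\beta) + \sum_{j=1}^{\infty} \left[ \int_0^1 \alpha \beta^{j-1}\, dF(\alpha,\beta) \right] X_{t-j} ,$$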

where a > 0, α, β ≥ 0 s.t. 0 ≤ α + β ≤ 1, and F (α, β) is a joint distribution function of the

parameters α and β. This model is motivated by the influential study of Granger (1980), where

he shows that a linear fractionally integrated process can arise as the result of aggregation of

an infinite number of component short-memory AR(1) processes, each having the autoregres-

sive coefficient drawn randomly from a specific distribution on the unit interval. Ding and

Granger (1996) use a similar idea and show that aggregation of short-memory GARCH(1,1)


processes leads to the model above, where a, α, β correspond to the respective parameters in

the component GARCH(1,1) processes.

Additional parametric assumptions on the distribution function F (α, β) allow Ding and

Granger (1996) to identify two important cases of their model. The first case corresponds to

the restriction α+ β < 1, resulting in the following ARCH(∞) model:

$$X_t = \psi_t \cdot \varepsilon_t , \qquad \psi_t = a(1-\mu) + \mu \sum_{j=1}^{\infty} \pi_{j-1} X_{t-j} , \tag{2}$$

where $0 < \mu < 1$ and the sequence of coefficients $\{\pi_j : j \ge 0\}$ is defined for $p, d > 0$ as $\pi_j := \frac{d\,\Gamma(p+d)\,\Gamma(p+j)}{\Gamma(p)\,\Gamma(p+d+j+1)}$, where Γ is the gamma function. Using properties of the hypergeometric function one can show that $\sum_{j=0}^{\infty} \pi_j = 1$ for all values of p and d. The second case arises when F(α, β) is such that α + β = 1. Ding and Granger (1996) show that the resulting process can be written as:

$$X_t = \psi_t \cdot \varepsilon_t , \qquad \psi_t = \sum_{j=1}^{\infty} \pi_{j-1} X_{t-j} . \tag{3}$$
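The normalization of the weights in (2) and (3) can be checked numerically. The sketch below is only an illustration: the function names `dg_weights` and `dg_tail` are hypothetical, and the tail formula relies on the telescoping identity π_j = c (b_j − b_{j+1}) with b_j = Γ(p+j)/Γ(p+d+j) and c = Γ(p+d)/Γ(p), which follows directly from the definition of π_j but is not stated in the text.

```python
import math

def dg_weights(p, d, n):
    """Return pi_0, ..., pi_{n-1}, where
    pi_j = d * G(p+d) * G(p+j) / (G(p) * G(p+d+j+1)) and G is the gamma function.
    Log-gamma values are used to avoid overflow for large j."""
    log_c = math.log(d) + math.lgamma(p + d) - math.lgamma(p)
    return [math.exp(log_c + math.lgamma(p + j) - math.lgamma(p + d + j + 1))
            for j in range(n)]

def dg_tail(p, d, n):
    """Return sum_{j >= n} pi_j = G(p+d) * G(p+n) / (G(p) * G(p+d+n)),
    obtained from the telescoping identity described in the lead-in."""
    return math.exp(math.lgamma(p + d) - math.lgamma(p)
                    + math.lgamma(p + n) - math.lgamma(p + d + n))

if __name__ == "__main__":
    for p, d in [(0.65, 0.35), (2.0, 0.2)]:
        n = 2000
        partial = sum(dg_weights(p, d, n))
        # The partial sum approaches 1 only at the hyperbolic rate n^(-d);
        # adding the analytic tail recovers the full sum.
        print(p, d, partial, partial + dg_tail(p, d, n))
```

The partial sums converge slowly because the tail decays like n^(−d), which is the same hyperbolic decay that makes long memory possible in the zero-intercept model.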

Having defined the processes (2) and (3), Ding and Granger (1996) leave a number of issues open for further research. One of the most important issues concerns the existence of covariance stationary solutions of their models, together with the corresponding properties of the autocovariance function. In particular, Ding and Granger (1996) conjecture that the process (2) has a non-summable autocovariance function, or equivalently long memory by the widely used definition of McLeod and Hipel (1978). However, Giraitis, Kokoszka and Leipus (2000) show that a sufficient covariance stationarity condition for this process implies summable autocovariances, and therefore the short memory nature of the model.

Yet more questions arise in connection with the model defined in (3), which in the remainder of the paper is referred to as the ARCH(∞) model with zero intercept. Under i.i.d. assumptions on the sequence of shocks $\{\varepsilon_t : t \in \mathbb{Z}\}$ and $E\varepsilon_t \sum_{j=0}^{\infty} \pi_j < 1$, Giraitis, Kokoszka and Leipus (2000) show that the unique solution of (3) is given by $X_t = \psi_t = 0$ for all $t \in \mathbb{Z}$. The same trivial solution for the class of short memory GARCH processes with zero intercept appears in Bougerol and Picard (1992). Moreover, given that a non-degenerate solution of (3) exists, the issues of covariance stationarity and long memory of such a solution remain open. Note that each of the component GARCH processes in this model is not covariance stationary, indeed not even first order stationary, owing to the restriction α + β = 1 imposed in the derivation of (3).

In this paper we focus on the class of ARCH(∞) processes with zero intercept, analogous to

the one defined in (3). We show that there exists a non-degenerate and non-negative covariance

stationary solution of such processes, which coincides with the solution of the recently studied

class of MD-ARCH(∞) processes of Koulikov (2003). In particular, such a solution is shown to

have non-summable autocovariance function. In section 3 we show consistency and asymptotic

normality of the QML estimator for the mean and memory parameter of a simple long memory


ARCH(∞) model. Finite sample properties of the estimator are examined in a small Monte

Carlo study in section 4. Proofs of the main results of the paper are collected in the appendix.

2 Long memory solution of the ARCH(∞) equations

The class of ARCH(∞) models with zero intercept, such as the model (3) defined by Ding and

Granger (1996), does not have a Volterra series expansion similar to the one used in Giraitis,

Kokoszka and Leipus (2000) and Kazakevicius and Leipus (2002). Hence, sufficient stationarity

and covariance stationarity conditions derived in these papers are not immediately applicable

to this class of models. In order to find the solution of (3) and to study its properties, a

new representation in terms of the sequence of i.i.d. shocks εt : t ∈ Z is needed. As we

show below, under a set of mild conditions on the sequence of coefficients πj : j ≥ 0, the

covariance stationary solution of the ARCH(∞) models with zero intercept coincides with the

long memory solution of the class of MD-ARCH(∞) sequences studied in Koulikov (2003).

We begin by examining a set of necessary conditions for the existence of a covariance

stationary solution of the general ARCH(∞) sequences (1). Kazakevicius and Leipus (2002)

study conditions for existence of a strictly stationary solution of the ARCH(∞) model, but their

approach requires a∗ > 0. As demonstrated in Theorem 1, non-negative covariance stationary

solutions of the ARCH(∞) equations do not require this restriction.

THEOREM 1. Let $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ be a non-negative covariance stationary solution of (1) satisfying $EX_t = E\psi_t = a$ for $a \ge 0$. Then the following conditions hold when $a > 0$:

$$a^{*} \ge 0 , \qquad \sum_{j=0}^{\infty} \pi_j \le 1 , \qquad \text{where } a^{*} > 0 \text{ iff } \sum_{j=0}^{\infty} \pi_j < 1 , \tag{4}$$

$$\sum_{j=1}^{\infty} \pi_{j-1}\, E(X_t - a)(X_{t-j} - a) < \infty . \tag{5}$$

In addition, the only non-negative stationary solution of (1) when $a = 0$ is given by $X_t = \psi_t = 0$ for all $t \in \mathbb{Z}$.

Condition (5), which concerns the autocovariance function of any covariance stationary solution of the ARCH(∞) model, is an important result of the theorem. In particular, it does not rule out long memory solutions of the ARCH(∞) equations, for which the autocovariance function satisfies $E(X_t - a)(X_{t-j} - a) = O(j^{2d-1})$, as the summability of $\{\pi_j : j \ge 0\}$ guarantees that the condition is fulfilled.

An equally notable result of Theorem 1 is the link between $a^{*}$ and the sequence of coefficients $\{\pi_j : j \ge 0\}$ shown in (4). Giraitis, Kokoszka and Leipus (2000) demonstrate that for an ARCH(∞) process with $a^{*} > 0$, the condition $\sum_{j=0}^{\infty} \pi_j < 1$, together with some additional assumptions on the sequence of shocks $\{\varepsilon_t : t \in \mathbb{Z}\}$, implies summability of the autocovariance function of the weakly stationary solution of the process. Hence, any long memory solution of the model (1) has to belong to the class of ARCH(∞) models with zero intercept, such as model (3) of Ding and Granger (1996).

For convenience of subsequent exposition, assumptions related to the ARCH(∞) model (1)

and repeatedly referred to in the remainder of the paper are collected below:

A1. $\{\varepsilon_t : t \in \mathbb{Z}\}$ is defined on the common probability space $(\Omega, \mathcal{F}, P)$, and consists of i.i.d. copies of a non-negative random variable $\varepsilon_0$ with $E\varepsilon_0 = 1$ and $E(\varepsilon_0 - 1)^2 < \infty$.

A2. $a^{*} \ge 0$ and $\{\pi_j : j \ge 0\} \subseteq \mathbb{R}_{0}^{+}$ s.t. $\sum_{j=0}^{\infty} \pi_j \le 1$.

We note that A1 is comparable to the assumptions of Giraitis, Kokoszka and Leipus (2000), and is necessary for the existence of a covariance stationary solution of the model with $a^{*} > 0$.

While Theorem 1 shows that the ARCH(∞) process with zero intercept can have covariance stationary solutions with non-summable autocovariances, the existence of such solutions has yet to be established. Consider the MD-ARCH(∞) process, studied in detail in Koulikov (2003):

$$X_t^{*} = \psi_t^{*} \cdot \varepsilon_t , \qquad \psi_t^{*} = a + \sum_{j=1}^{\infty} \theta_{j-1} (X_{t-j}^{*} - \psi_{t-j}^{*}) . \tag{6}$$

Under A1 and $E(\varepsilon_0 - 1)^2 \sum_{j=0}^{\infty} \theta_j^2 < 1$, a covariance stationary solution of this process is derived in Koulikov (2003), and is given by the following Volterra series expansion of (6):

$$X_t^{*} = \psi_t^{*} \cdot \varepsilon_t , \qquad \psi_t^{*} = a \sum_{k=0}^{\infty} M(k, t) , \tag{7}$$

where each element of the sequence $\{M(k,t) : k \ge 0, t \in \mathbb{Z}\}$ is a non-linear square-integrable function of the underlying sequence of i.i.d. innovations $\{\varepsilon_t : t \in \mathbb{Z}\}$. Let the sequence of coefficients $\{\theta_j : j \ge 0\}$ in (6) be related to $\{\pi_j : j \ge 0\}$ in the ARCH(∞) model as follows. Define $P(z) := 1 - \sum_{j=1}^{\infty} \pi_{j-1} z^j$ on the open complex unit disc. When $P(z) \ne 0$ for all $|z| < 1$, let $P^{-1}(z) = 1 + \sum_{j=1}^{\infty} \theta_{j-1} z^j$ be given by the power series expansion of $1/P(z)$ around $z = 0$. Then $P(z)P^{-1}(z) = 1$ for all $|z| < 1$ and the sequence of coefficients $\{\theta_j : j \ge 0\}$ is related to $\{\pi_j : j \ge 0\}$ as:

$$\theta_0 = \pi_0 , \qquad \theta_j = \pi_j + \sum_{i=0}^{j-1} \theta_{j-1-i}\, \pi_i \quad \forall j \ge 1 . \tag{8}$$
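Recursion (8) is easy to evaluate directly. The following sketch uses, as an illustrative special case, the Ding and Granger weights with p = 1 − d, for which P(z) = (1 − z)^d, so that the θ coefficients should coincide with those of (1 − z)^{−d} − 1. The function names are hypothetical, and the ratio recursions for the two coefficient sequences are standard properties of the binomial series rather than formulas taken from the text.

```python
def pi_weights(d, n):
    """pi_0, ..., pi_{n-1} from the expansion of 1 - (1 - z)^d:
    pi_0 = d and pi_j = pi_{j-1} * (j - d) / (j + 1)."""
    w = [d]
    for j in range(1, n):
        w.append(w[-1] * (j - d) / (j + 1))
    return w

def theta_from_pi(pi):
    """Recursion (8): theta_0 = pi_0 and
    theta_j = pi_j + sum_{i=0}^{j-1} theta_{j-1-i} * pi_i for j >= 1."""
    theta = []
    for j, pj in enumerate(pi):
        theta.append(pj + sum(theta[j - 1 - i] * pi[i] for i in range(j)))
    return theta

def theta_closed_form(d, n):
    """theta_0, ..., theta_{n-1} from the expansion of (1 - z)^(-d) - 1:
    theta_0 = d and theta_j = theta_{j-1} * (j + d) / (j + 1)."""
    t = [d]
    for j in range(1, n):
        t.append(t[-1] * (j + d) / (j + 1))
    return t

if __name__ == "__main__":
    d = 0.35
    recursive = theta_from_pi(pi_weights(d, 10))
    direct = theta_closed_form(d, 10)
    for j, (a, b) in enumerate(zip(recursive, direct)):
        print(j, a, b)
```

The recursion costs O(n^2) operations for n coefficients, which is adequate for an illustration; the closed-form ratio recursion is preferable whenever the fractional parametrization applies.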

Theorem 2 establishes the link between covariance stationary solutions of the MD-ARCH(∞)

and ARCH(∞) models.

THEOREM 2. Under A1–A2, consider covariance stationary solutions of the ARCH(∞) process (1) and of the MD-ARCH(∞) process (6), where $EX_t = EX_t^{*} = a > 0$ and the sequences $\{\theta_j : j \ge 0\}$ and $\{\pi_j : j \ge 0\}$ are related as in (8). Assume that the following condition is satisfied:

$$\sum_{j=N}^{\infty} \Big( \pi_j + \sum_{i=0}^{N-1} \theta_{j-1-i}\, \pi_i \Big) \to 0 \quad \text{as } N \to \infty . \tag{9}$$

[Figure 1 plot omitted: horizontal axis d, from 0 to 0.5; vertical axis E(ε0 − 1)^2, from 0 to 10; areas labelled "Covariance stationary region" and "Non-stationary region".]

Figure 1: Graph of $E(\varepsilon_0 - 1)^2 \left[ \frac{\Gamma(1-2d)}{\Gamma(1-d)^2} - 1 \right] = 1$ with the corresponding stationary and non-stationary regions of model (12).

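Then the solution $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ of the model (1) satisfies the following set of stochastic equations:

$$X_t = \psi_t \cdot \varepsilon_t , \qquad \psi_t = a + \sum_{j=1}^{\infty} \theta_{j-1} (X_{t-j} - \psi_{t-j}) . \tag{10}$$

Similarly, the solution $\{(X_t^{*}, \psi_t^{*}) : t \in \mathbb{Z}\}$ of the model (6) satisfies the following set of stochastic equations:

$$X_t^{*} = \psi_t^{*} \cdot \varepsilon_t , \qquad \psi_t^{*} = a + \sum_{j=1}^{\infty} \pi_{j-1} (X_{t-j}^{*} - a) . \tag{11}$$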

It follows from Theorem 2 that the class of covariance stationary MD-ARCH(∞) sequences studied in Koulikov (2003) has the ARCH(∞) representation (11). In particular, the Volterra series form (7) gives the solution of any covariance stationary ARCH(∞) sequence $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$, including the long memory case studied in Koulikov (2003). By Theorem 1, the long memory solution of (1) has $a^{*} = 0$ and $\sum_{j=0}^{\infty} \pi_j = 1$. Model (3) of Ding and Granger (1996) gives one possible parametrization of the long memory ARCH(∞) processes.

Giraitis, Kokoszka and Leipus (2000) consider the following ARCH(∞) process, corresponding to the restricted version of model (3) of Ding and Granger (1996) for which p = 1 − d:

$$X_t = \psi_t \cdot \varepsilon_t , \qquad \psi_t = [1 - (1-L)^{d}]\, X_t , \tag{12}$$

where L denotes the lag operator. The techniques used by Giraitis, Kokoszka and Leipus (2000) do not allow them to establish conditions for the covariance stationarity of this process. Theorem 2 shows that the covariance stationary solution of (12) is the same as the covariance stationary solution of the following long memory MD-ARCH(∞) model:

$$X_t = \psi_t \cdot \varepsilon_t , \qquad \psi_t = a + [(1-L)^{-d} - 1](X_t - \psi_t) , \tag{13}$$


provided that condition (9) is fulfilled. Indeed, the recursive structure of coefficients from the

power series expansion of polynomials 1− (1− z)d and (1− z)−d − 1 allows us to write:

∞∑j=N

(πj +

N−1∑i=0

θj−1−iπi

)=

θN (N + 1)π0

∞∑j=1

j

N + jπj−1 ≤ K1N

d∞∑j=1

j−2−d

j +N

≤ K2Nd

∫ ∞

0

s−32d

N + sds = O(N−d) → 0 as N →∞ ,

where K1,K2 > 0 are constants, and we use (18.25) in Spiegel and Liu (1999) to evaluate the

integral.

Under A1, the sufficient covariance stationarity condition in Theorem 2 of Koulikov (2003) implies that the parameter d in model (12) has to satisfy the inequality $E(\varepsilon_0 - 1)^2 \left[ \frac{\Gamma(1-2d)}{\Gamma(1-d)^2} - 1 \right] < 1$. This inequality, linking the variance of shocks and the range of parameter d, is akin to similar conditions for the class of short memory GARCH models; refer to Giraitis, Kokoszka and Leipus (2000). Figure 1 depicts the covariance stationary region of the model in the two-dimensional plane $d \times E(\varepsilon_0 - 1)^2$. It follows that for the case of unit variance of shocks, parameter d is restricted to an approximate interval [0, 0.395]. When $\{\varepsilon_t : t \in \mathbb{Z}\}$ is given by i.i.d. copies of a $\chi^2_1$ random variable, a choice popular in the empirical volatility modeling literature, the approximate interval of d corresponding to the covariance stationary region of the model is reduced to [0, 0.340].
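The two approximate upper bounds for d quoted above can be reproduced by solving the boundary equation of Figure 1 numerically. The sketch below is an illustration only: the function names `sum_theta_sq` and `max_stationary_d` are hypothetical, and bisection is simply one convenient root-finding method, not a procedure described in the text.

```python
import math

def sum_theta_sq(d):
    """Gamma(1 - 2d) / Gamma(1 - d)^2 - 1 for 0 <= d < 1/2,
    the term multiplying the shock variance in the stationarity inequality."""
    return math.exp(math.lgamma(1.0 - 2.0 * d) - 2.0 * math.lgamma(1.0 - d)) - 1.0

def max_stationary_d(shock_var, tol=1e-10):
    """Largest d with shock_var * sum_theta_sq(d) < 1, found by bisection.
    The left-hand side is increasing in d and unbounded as d approaches 1/2."""
    lo, hi = 0.0, 0.5 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if shock_var * sum_theta_sq(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Unit-variance shocks, and chi-square(1) shocks whose variance is 2.
    print(max_stationary_d(1.0))
    print(max_stationary_d(2.0))
```

The shock variance enters only as a multiplicative factor, so heavier-tailed shocks shrink the admissible range of d monotonically.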

3 Quasi-maximum likelihood estimator for the long memory

ARCH(∞) model

Since the introduction of the GARCH(p,q) model by Bollerslev (1986), there has been a con-

tinuous interest in the statistical properties of various estimators of the parameters for this

class of models. Recent studies by Giraitis, Kokoszka, Leipus and Teyssiere (2000) and Giraitis

and Robinson (2001) consider certain semiparametric estimators of the ARCH(∞) models with

non-zero intercept, both relying on minimal additional assumptions about the process, such as

the weak limit of normalized partial sums in the former and a specific local behavior of the spec-

tral density around the origin in the latter. The quasi-maximum likelihood (QML) estimator

for the class of GARCH(p,q) processes was previously examined in a number of studies, most

notably Weiss (1986), Lee and Hansen (1994), Lumsdaine (1996) and Berkes, Horvath and

Kokoszka (2003). The latter covers the entire class of stationary GARCH(p,q) sequences by

making use of their ARCH(∞) representation. Robinson and Zaffaroni (2003) attempt to pro-

vide asymptotic theory of the QML estimator for yet wider class of ARCH(∞) sequences with

non-zero intercept, including the FIGARCH model of Baillie, Bollerslev and Mikkelsen (1996).

However, existence and properties of the stationary solutions of the FIGARCH sequences have

not been studied in the literature, making it difficult to develop a rigorous estimation theory

for this class of models.


However, none of the previously cited studies consider the QML estimator in the context

of the long memory ARCH(∞) processes examined in section 2. Recall that by Theorems 1 and 2 the long memory ARCH(∞) model is required to have $a^{*} = 0$, as in the models (3) and (12) of Ding and Granger (1996). This poses a number of challenges not previously addressed in the estimation literature. In particular, existence of $E\psi_t^{-\nu}$ for some $\nu > 0$ is not an immediate consequence of the process structure and has to be established using a new set of techniques. Negative moments of $\psi_t$ are required for existence of the limiting log-likelihood function and various other ratios of random variates in a number of auxiliary results.

Another issue, commonly present in the time-domain QML estimation of the linear long memory

processes, is a relatively slow convergence rate of the log-likelihood function and its derivatives

to their respective limits. This has important implications on the limiting distribution of the

estimator, see Yajima (1985) for the case of Gaussian fractionally integrated processes and

Robinson and Zaffaroni (2003) for the non-stationary FIGARCH model.

In this section we examine properties of the time-domain QML estimator of the long memory ARCH(∞) model (12). For a time series of non-negative random variables $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ with non-summable autocovariances, this model offers a particularly simple parametrization in terms of only two parameters a and d. In this respect (12) is similar to the well studied fractional white noise model of Granger (1980) and Hosking (1981). Moreover, as will be shown in Theorem 3, the asymptotic properties of the two models are remarkably similar as well.

In addition to A1–A2, the QML estimator studied in this section is based on the following

additional assumptions:

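A3. The random variable $\varepsilon_0$ has a distribution function satisfying $\lim_{s \to 0} s^{-\mu} P\{\varepsilon_0 \le s\} = 0$ for some $\mu > 0$.

A4. Let $D := [\underline{d}, \overline{d}]$ for some $0 < \underline{d} < \overline{d} < 1$. Let $d_0 \in D_0$ and $a_0 \in \mathbb{R}^{+}$, where $D_0 \subseteq D$ is the covariance stationary region of the model.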

Note that A3 is equivalent to the corresponding assumption in Berkes, Horvath and Kokoszka (2003), and ensures existence of $N \ge 1$ such that $E\big( \sum_{n=1}^{N} \varepsilon_n \big)^{-1} < \infty$ without imposing specific distributional assumptions on the sequence of i.i.d. shocks $\{\varepsilon_t : t \in \mathbb{Z}\}$.

Suppose that a finite interval $\{X_t : 1 \le t \le T\}$ of a covariance stationary solution $\{X_t : t \in \mathbb{Z}\}$ of the long memory ARCH(∞) model (12) is observed by an econometrician. Under A4, let $a_0$ and $d_0$ be the unknown parameters of (12), and suppose that statistical inference for these parameters is required. Define a sequence of positive functions $\{\pi_j(d) : j \ge 0\}$ as follows:

$$\pi_j(d) := \frac{d\,\Gamma(1-d+j)}{\Gamma(1-d)\,\Gamma(2+j)} , \tag{14}$$

for $d \in D$. Let $\pi_j(d_0)$ be denoted simply as $\pi_j$ for all $j \ge 0$. In particular, definition (14) implies that $\{\pi_j(d) : j \ge 0\} \subseteq \mathbb{R}^{+}$ for all $d \in D$. Denoting the n-th derivative of (14) as $\pi_j^{(n)}(d)$, and observing that $\pi_j^{(0)}(d) = \pi_j(d)$, the following inequality holds for all $j \ge 1$ and $n \ge 0$:

$$\big| \pi_{j-1}^{(n)}(d) \big| \le K_n\, [\log j]^{n}\, j^{-1-d} , \tag{15}$$

where $\{K_n : n \ge 0\}$ is a sequence of non-negative constants. Furthermore, define a sequence of non-negative functions $\{w_t(d) : 1 \le t \le T\}$ as follows:

$$w_t(d) = \sum_{j=1}^{\infty} \pi_{j-1}(d)\, X_{t-j} , \tag{16}$$

where by convention $w_t(d_0) = \psi_t$. Let $w_t^{(n)}(d)$ denote the n-th derivative of (16) with respect to d. Finally, let the log-likelihood function based on (16) be defined as:

$$L_T(d) = -\sum_{t=1}^{T} \left[ \log w_t(d) + \frac{X_t}{w_t(d)} \right] , \tag{17}$$

and the QML estimator of the parameter $d_0$ be given by:

$$\hat{d}_T = \arg\max_{d \in D} \frac{1}{T} L_T(d) . \tag{18}$$

For the case of exponential shocks εt : t ∈ Z, the function (17) will be the proper log-likelihood

function for the model. However, apart from A1 and A3, the asymptotic properties of the

estimator (18) derived in this section do not depend on particular distributional assumptions

about ε0, and therefore we refer to (18) as the QML estimator for the model (12). Theorem 3

shows consistency and asymptotic normality of the QML estimator defined above.

THEOREM 3. Under A1–A4, let the sequence of estimators $\{\hat{d}_T : T \ge 1\}$ of the long memory ARCH(∞) model (12) be defined as in (18). Then:

$$\hat{d}_T \xrightarrow{a.s.} d_0 \quad \text{as } T \to \infty . \tag{19}$$

In addition:

$$T^{1/2} (\hat{d}_T - d_0) \xrightarrow{d} N\Big( 0, \frac{6}{\pi^2} \Big) \quad \text{as } T \to \infty , \tag{20}$$

where N(0, S) denotes a univariate normal distribution with mean 0 and variance S.
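The limiting distribution (20) translates into simple standard errors and confidence intervals for the memory parameter. The sketch below is a hedged illustration: the function names are hypothetical, and the 97.5% standard normal quantile is hard-coded as an assumption of the example.

```python
import math

# Asymptotic variance of T^(1/2) * (d_hat - d0) according to (20).
ASYMPTOTIC_VARIANCE = 6.0 / math.pi ** 2

def asymptotic_se(T):
    """Approximate standard error of the QML estimator for sample size T."""
    return math.sqrt(ASYMPTOTIC_VARIANCE / T)

def confidence_interval(d_hat, T, z=1.959963984540054):
    """Two-sided interval d_hat -/+ z * se; the default z gives 95% coverage
    under the normal limit in (20)."""
    half_width = z * asymptotic_se(T)
    return d_hat - half_width, d_hat + half_width

if __name__ == "__main__":
    for T in (500, 2000):
        print(T, asymptotic_se(T), confidence_interval(0.35, T))
```

Because the asymptotic variance does not depend on d0, a0 or the shock distribution, the interval width is a function of the sample size alone.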

We wish to remark on the similarity of the limiting distribution of $\hat{d}_T$ in (20) to that of the fractional white noise model established in Yajima (1985). Notably, the asymptotic variance of the limiting distribution of $T^{1/2}(\hat{d}_T - d_0)$ is the same in both models in spite of their markedly different structure. In addition, the estimator of the memory parameter is independent of that for the mean parameter $a_0$, as the log-likelihood function (17) does not depend on the latter.

As in Yajima (1985), we propose the following simple estimator of $a_0$:

$$\hat{a}_T = \frac{1}{T} \sum_{t=1}^{T} X_t . \tag{21}$$


Asymptotic properties of the sequence $\{\hat{a}_T : T \ge 1\}$ follow from Theorem 4 of Koulikov (2003) using the equivalence of representations (12) and (13). In particular, $\hat{a}_T$ is weakly consistent for $a_0$ with the limiting distribution given by:

$$T^{1/2 - d_0} (\hat{a}_T - a_0) \xrightarrow{d} N(0, c^2_{a_0,d_0}) \quad \text{as } T \to \infty ,$$

where a closed form expression for $c^2_{a_0,d_0}$ is not available, but is known to depend on both $a_0$ and $d_0$. The rate $T^{1/2 - d_0}$ of the estimator $\hat{a}_T$ is in line with the corresponding results of Brockwell and Davis (1991) for the class of linear long memory models.

In practice, the estimator (18) is not feasible, as the functions $\{w_t(d) : 1 \le t \le T\}$ require availability of an infinite subset of the process $\{X_t : t \le T\}$. However, the asymptotic properties of such an estimator are easier to establish. Therefore, the literature on the maximum likelihood estimation of GARCH models commonly defines both feasible and infeasible estimators, showing their convergence to the same limit. We proceed in an analogous way. Consider the following sequence of functions, related to (16):

$$\tilde{w}_1(d) = a_0 , \qquad \tilde{w}_t(d) = a_0 \sum_{j=t}^{\infty} \pi_{j-1}(d) + \sum_{j=1}^{t-1} \pi_{j-1}(d)\, X_{t-j} \quad \text{for } 1 < t \le T , \tag{22}$$

where $d \in D$. The feasible QML estimator is defined as:

$$\tilde{d}_T = \arg\max_{d \in D} \frac{1}{T} \tilde{L}_T(d) , \tag{23}$$

where the function $\tilde{L}_T(d)$ is given by:

$$\tilde{L}_T(d) = -\sum_{t=1}^{T} \left[ \log \tilde{w}_t(d) + \frac{X_t}{\tilde{w}_t(d)} \right] . \tag{24}$$
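The feasible quantities (22) to (24) can be computed directly from a finite sample. The following is a minimal sketch under stated assumptions: all function names are hypothetical, the infinite tail sum in (22) is evaluated as one minus the corresponding partial sum (using the fact that the weights (14) sum to one), and a plain grid search stands in for whatever numerical optimizer was actually used, which the text does not specify.

```python
import math

def pi_weights(d, n):
    """pi_0(d), ..., pi_{n-1}(d) of (14): pi_0 = d, pi_j = pi_{j-1} * (j - d) / (j + 1)."""
    w = [d]
    for j in range(1, n):
        w.append(w[-1] * (j - d) / (j + 1))
    return w

def feasible_w(x, d, a0):
    """Feasible weights of (22) for t = 1, ..., T, where x[t-1] holds X_t.
    Unobserved pre-sample values are replaced by a0."""
    T = len(x)
    pi = pi_weights(d, T)
    w, head = [], 0.0  # head = sum_{j=1}^{t-1} pi_{j-1}(d)
    for t in range(1, T + 1):
        past = sum(pi[j - 1] * x[t - 1 - j] for j in range(1, t))
        w.append(a0 * (1.0 - head) + past)
        head += pi[t - 1]
    return w

def feasible_loglik(x, d, a0):
    """Feasible log-likelihood (24)."""
    return -sum(math.log(wt) + xt / wt for xt, wt in zip(x, feasible_w(x, d, a0)))

def feasible_qmle(x, a0, d_lo=0.01, d_hi=0.49, steps=97):
    """Grid-search maximizer of (24) over [d_lo, d_hi] (illustrative choice of D)."""
    grid = [d_lo + k * (d_hi - d_lo) / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda d: feasible_loglik(x, d, a0))

if __name__ == "__main__":
    sample = [1.0, 0.5, 2.0, 1.5, 0.7, 1.2, 0.9, 1.8]
    print(feasible_qmle(sample, a0=1.0))
```

Each likelihood evaluation costs O(T^2) operations in this form; replacing a0 by the sample mean (21) gives the variant with an estimated location parameter.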

Theorem 4 shows consistency of the feasible QML estimator $\tilde{d}_T$ for $d_0$.

THEOREM 4. Under A1–A4, let the sequence of feasible estimators $\{\tilde{d}_T : T \ge 1\}$ of the long memory ARCH(∞) model (12) be defined as in (23). Then:

$$\tilde{d}_T \xrightarrow{a.s.} d_0 \quad \text{as } T \to \infty . \tag{25}$$

The limiting distribution of the sequence $T^{1/2}(\tilde{d}_T - d_0)$ is not available, as $\{|\hat{d}_T - \tilde{d}_T| : T \ge 1\}$ has a slower rate than the required $o_p(T^{-1/2})$. This follows from the slow convergence of $\sup_{d \in D} \frac{1}{T} \big| L_T^{(1)}(d) - \tilde{L}_T^{(1)}(d) \big|$ as $T \to \infty$, where $L_T^{(1)}(d)$ and $\tilde{L}_T^{(1)}(d)$ denote the first derivatives of (17) and (24), respectively. We note that Yajima (1985) relies on Gaussianity of his fractional white noise model in deriving the limiting distribution of the QML estimator of the memory parameter. In the context of the FIGARCH model, Robinson and Zaffaroni (2003) assume $d_0 > \frac{1}{2}$ in order to show the equivalent limiting distribution of their feasible and infeasible QML estimators. An analogous restriction is not suitable for the long memory ARCH(∞) case, as the stationarity properties of such models are not known. The Monte Carlo experiment in the next section is designed to examine how well the limiting distribution in Theorem 3 approximates that of the feasible estimator $\tilde{d}_T$.


4 Monte Carlo experiment

In order to assess the finite sample properties of the QML estimator of model (12) studied in the previous section, we conduct a small scale Monte Carlo experiment, the results of which are reported below. Recall that although consistent, the feasible QML estimator defined in (23) has an unknown limiting distribution, owing to the relatively slow convergence of $\sup_{d \in D} \frac{1}{T} \big| L_T^{(1)}(d) - \tilde{L}_T^{(1)}(d) \big|$ as $T \to \infty$. One of the goals of this section is to study experimentally the approximate distribution of $T^{1/2}(\tilde{d}_T - d_0)$ for large T, for a variety of sample sizes and assumptions on the distribution of shocks $\{\varepsilon_t : t \in \mathbb{Z}\}$. In particular, it is of interest to know how well this distribution is approximated by that of the infeasible estimator (18) shown in Theorem 3. Another potentially interesting issue is to assess the effects of the estimation uncertainty of the location parameter $a_0$ on the distribution of (23). Recall that Theorem 4 is proved under the assumption that the true value $a_0$ is available in (22). Finally, the approximate distribution of $T^{1/2}(\tilde{d}_T - d_0)$ is examined for mildly non-stationary values of $d_0$ in model (12).

We use the following experimental setup. The data generating process in all experiments

is based on the set of stochastic equations (12), with d0 = 0.35 for the covariance stationary

version of the model and d0 = 0.45 for the non-stationary case. In both versions of the data

generating process a0 = 1. All experiments are conducted with two choices of the sample size,

given by T = 500 and T = 2000. An often encountered problem with generating long memory

time series on the computer is the slowly dissipating effect of the starting value of the generated

process, owing to the hyperbolic rate of coefficients in the power series expansion of $(1-z)^{-d}$ around $z = 0$; see Beran (1994) and references therein. In order to mitigate this effect, the data

generating algorithm used in the experiments starts the process at its unconditional expected

value a0 and discards the first 3000 realizations of the process. The remainder of the generated

data is used in the estimation phase of the experiment.
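The data generating scheme described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the names `simulate_lm_arch` and `draw_shock` are hypothetical, and "starting the process at its unconditional expected value" is read here as setting every pre-sample value equal to a0.

```python
import random

def pi_weights(d, n):
    """pi_0(d), ..., pi_{n-1}(d): pi_0 = d, pi_j = pi_{j-1} * (j - d) / (j + 1)."""
    w = [d]
    for j in range(1, n):
        w.append(w[-1] * (j - d) / (j + 1))
    return w

def simulate_lm_arch(T, d, a0, draw_shock, burn_in=3000):
    """Simulate T observations from model (12).

    draw_shock is a zero-argument callable returning one non-negative shock
    with mean one. The first burn_in realizations are discarded.
    Assumption of this sketch: all pre-sample values equal a0."""
    n = T + burn_in
    pi = pi_weights(d, n)
    x, head = [], 0.0  # head = sum of the weights already applied to simulated data
    for t in range(n):
        # psi_t = sum_j pi_{j-1} X_{t-j}, with X_s = a0 for pre-sample dates s
        psi = a0 * (1.0 - head) + sum(pi[j] * x[t - 1 - j] for j in range(t))
        x.append(psi * draw_shock())
        head += pi[t]
    return x[burn_in:]

if __name__ == "__main__":
    rng = random.Random(12345)
    path = simulate_lm_arch(200, 0.35, 1.0, lambda: rng.expovariate(1.0), burn_in=500)
    print(len(path), min(path), sum(path) / len(path))
```

The direct convolution costs O((T + burn_in)^2) operations, which is slow in pure Python for the sample sizes of the study but keeps the correspondence with (12) transparent.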

The data is simulated using three alternative distributions of the random shocks εt : t ∈ Z,such that assumptions A1 and A3 are satisfied. The first choice are the exponential random

numbers, for which (24) is the proper log-likelihood function. Finite sample properties of the

feasible QML estimator (23) in this case are expected to be the best. The second case is given

by χ21 distributed shocks ε∗t , normalized as follows:

εt =ε∗t + c

b, where b =

√2 , c = b− 1 .

The normalization scales the variance of ε∗t such that E(ε0 − 1)2 = 1. The third choice of

random numbers are Student's t variates $z_t$ with parameter ν, transformed to $\varepsilon_t$ in order to ensure A1 and to scale the variance to unity as follows:

$$\varepsilon_t = \frac{z_t^2 + c}{b}, \quad \text{where } b = \nu\sqrt{\frac{3}{(\nu - 2)(\nu - 4)}}, \quad c = b - \frac{\nu}{\nu - 2} .$$

The parameter ν in this setup is related to the distribution function of $\varepsilon_t$ by the moment condition $E\varepsilon_t^{\nu^*} < \infty$ for $\nu^* < \nu/2$. In all experiments in this section we use ν = 7.
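As a rough check of the first two shock designs, the following Python sketch draws exponential and normalized chi-squared shocks and compares their sample mean and variance with the target value of one. It assumes that A1 requires non-negative shocks with unit mean; the helper names are hypothetical, and the Student's t design is left out of the sketch.

```python
import math
import random


def exponential_shock(rng):
    """Unit-mean exponential shock (mean 1, variance 1)."""
    return rng.expovariate(1.0)


def chi2_shock(rng):
    """Chi-squared(1) shock normalized as (eps* + c) / b."""
    b = math.sqrt(2.0)
    c = b - 1.0
    eps_star = rng.gauss(0.0, 1.0) ** 2  # chi-squared with 1 d.f.
    return (eps_star + c) / b


def sample_moments(draw, n, seed=0):
    """Sample mean and variance of n draws from `draw`."""
    rng = random.Random(seed)
    xs = [draw(rng) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean, var


if __name__ == "__main__":
    for name, draw in [("Exp.", exponential_shock), ("chi2", chi2_shock)]:
        print(name, sample_moments(draw, 200000, seed=1))
```

Since a chi-squared variate with one degree of freedom has mean 1 and variance 2, the normalized shock has mean (1 + √2 − 1)/√2 = 1 and variance 2/2 = 1, and it is bounded below by (√2 − 1)/√2 > 0.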


Table 1: Results of the Monte Carlo experiment for the data generating process based on (12) with parameters a0 = 1 and d0 = 0.35.

Experiments             Mean      Median    Bias       RMSE      Variance  95% Coverage
a0  Exp.  T = 2000      0.35168   0.35163    0.00168   0.00042   0.00042   0.90600
a0  χ²₁   T = 2000      0.35303   0.35293    0.00303   0.00050   0.00049   0.86800
a0  t     T = 2000      0.35263   0.35484    0.00263   0.00075   0.00074   0.79000
aT  Exp.  T = 2000      0.34316   0.34313   -0.00683   0.00093   0.00088   0.74600
aT  χ²₁   T = 2000      0.34267   0.34198   -0.00733   0.00094   0.00088   0.73600
aT  t     T = 2000      0.33988   0.33845   -0.01012   0.00109   0.00099   0.67400
a0  Exp.  T = 500       0.35972   0.36028    0.00972   0.00198   0.00189   0.87800
a0  χ²₁   T = 500       0.35604   0.35877    0.00604   0.00231   0.00228   0.85200
a0  t     T = 500       0.36286   0.36768    0.01286   0.00348   0.00331   0.75400
aT  Exp.  T = 500       0.32642   0.32698   -0.02358   0.00337   0.00282   0.76000
aT  χ²₁   T = 500       0.32994   0.33332   -0.02006   0.00363   0.00322   0.74800
aT  t     T = 500       0.32826   0.32414   -0.02174   0.00409   0.00361   0.69000

Notes: The experiments are described as follows: a0, respectively aT, indicates that the true parameter, respectively the estimator (21), has been used in (22) during the estimation; Exp., χ²₁ and t denote the normalized distribution of shocks; and T = 500 and T = 2000 give the sample sizes. Numbers reported in the table give the corresponding statistics of the estimator dT across 500 replications. The Variance column shows the sampling variance of the estimator. The 95% Coverage column reports the empirical frequency of the estimator within the 95% confidence interval implied by the limiting distribution of the infeasible estimator in Theorem 3.


Table 2: Results of the Monte Carlo experiment for the data generating process based on (12) with parameters a0 = 1 and d0 = 0.45.

Experiments             Mean      Median    Bias       RMSE      Variance  95% Coverage
a0  Exp.  T = 2000      0.46113   0.46274    0.01113   0.00066   0.00053   0.81800
a0  χ²₁   T = 2000      0.46010   0.46077    0.01010   0.00076   0.00066   0.77400
a0  t     T = 2000      0.45923   0.46168    0.00923   0.00104   0.00096   0.70800
aT  Exp.  T = 2000      0.43395   0.43147   -0.01605   0.00136   0.00111   0.63400
aT  χ²₁   T = 2000      0.43469   0.43199   -0.01531   0.00146   0.00123   0.62200
aT  t     T = 2000      0.43672   0.43422   -0.01328   0.00169   0.00151   0.62400
a0  Exp.  T = 500       0.47574   0.48219    0.02574   0.00294   0.00227   0.80000
a0  χ²₁   T = 500       0.47433   0.48048    0.02433   0.00339   0.00279   0.73200
a0  t     T = 500       0.48043   0.47936    0.03043   0.00463   0.00371   0.66200
aT  Exp.  T = 500       0.42494   0.42282   -0.02506   0.00434   0.00371   0.70800
aT  χ²₁   T = 500       0.42021   0.42340   -0.02979   0.00461   0.00372   0.69600
aT  t     T = 500       0.41299   0.41004   -0.03701   0.00581   0.00444   0.62000

Notes: The experiments are described as follows: a0, respectively aT, indicates that the true parameter, respectively the estimator (21), has been used in (22) during the estimation; Exp., χ²₁ and t denote the normalized distribution of shocks; and T = 500 and T = 2000 give the sample sizes. Numbers reported in the table give the corresponding statistics of the estimator dT across 500 replications. The Variance column shows the sampling variance of the estimator. The 95% Coverage column reports the empirical frequency of the estimator within the 95% confidence interval implied by the limiting distribution of the infeasible estimator in Theorem 3.


The feasible QML estimator studied in this section is defined by equations (22), (23)

and (24). In one set of experiments, the parameter a0 is assumed to be known, as required by

Theorem 4. In another set of experiments a0 is estimated in the first stage by (21), replacing

a0 in (22) during the estimation of the memory parameter. The sampling variance of aT adds

to the estimation uncertainty of parameter d0 in the feasible QML case, but the amount of the

added variation is not quantified by Theorem 4. The design of the Monte Carlo experiment allows

us to assess this effect. The results of all experiments reported in Tables 1 and 2 are based on 500

replications.

As expected, the results of the Monte Carlo experiment point to a higher sampling variance of the feasible estimator d in comparison to the asymptotic variance of the infeasible one shown in Theorem 3. This is also reflected in the lower than nominal 95% coverage frequencies of the estimator. In all experiments the fat-tailed t distributed shocks lead to an increased variance and a worse coverage rate of the estimator, in contrast to both the exponential and χ²₁ cases, both of which have finite moments of all orders. In addition, the sampling variance and bias of d in the non-stationary region of the model shown in Table 2 are uniformly higher than those in Table 1. Recall that the limiting results derived in section 3 are applicable only in the covariance stationary case.
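The comparison with the asymptotic variance can be made concrete with a short Python sketch. It assumes that the limiting variance 6/π² quoted in the conclusion applies to T^{1/2}(d_T − d0), so that the approximate variance of d_T is 6/(π²T), and that the nominal interval is d0 ± 1.96 standard errors; the two sampling variances are taken from the rows "a0 Exp." of Table 1. The function names are hypothetical.

```python
import math

LIMITING_VARIANCE = 6.0 / math.pi ** 2  # about 0.6079


def asymptotic_variance(T):
    """Approximate variance of d_T implied by the limiting distribution."""
    return LIMITING_VARIANCE / T


def nominal_95_interval(d0, T):
    """Assumed form of the nominal interval: d0 +/- 1.96 standard errors."""
    half = 1.959963984540054 * math.sqrt(asymptotic_variance(T))
    return d0 - half, d0 + half


# Sampling variances reported in Table 1 (a0 known, exponential shocks).
reported = {2000: 0.00042, 500: 0.00189}

if __name__ == "__main__":
    for T, v in reported.items():
        ratio = v / asymptotic_variance(T)
        print(T, asymptotic_variance(T), ratio, nominal_95_interval(0.35, T))
```

Under these assumptions the reported sampling variances exceed the asymptotic approximation for both sample sizes, which is in line with the below-nominal coverage frequencies in the tables.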

In line with corresponding results for the covariance stationary linear ARFIMA models

reported in Cheung and Diebold (1994), the absolute bias of d increases sharply when (21) is

used as the estimator of a0 in place of its true value in Table 1. However, the increase in bias

is less noticeable for the non-stationary model. Unlike in the linear case, the bias changes from

positive to negative when changing the estimator of the location parameter in all experiments

reported in Tables 1 and 2. Predictably, the sampling variance of d increases when (21) is used

in the log-likelihood function, and the relative increase is more pronounced for the larger sample size.

5 Conclusion

This paper studies specification and estimation of a class of long memory ARCH(∞) models.

We show that the ARCH(∞) model of Robinson (1991) can have a covariance stationary

solution with non-summable autocovariance function, equivalently long memory as defined in

McLeod and Hipel (1978). A notable feature of the long memory ARCH(∞) model is the absence

of an intercept and a sum of the weighting coefficients equal to unity. This makes the Volterra

series expansion of the classical ARCH(∞) model used in Giraitis, Kokoszka and Leipus (2000)

and Kazakevicius and Leipus (2002) inapplicable in the present context. In order to establish

existence of a non-negative long memory solution of (1) and examine its properties, we show

equivalence between the covariance stationary solutions of the ARCH(∞) model and the class

of MD-ARCH(∞) sequences of Koulikov (2003). Two important parametrizations of the long

memory ARCH(∞) models have been introduced by Ding and Granger (1996), including a

particularly simple two parameter model (12).

In the second part of the paper we examine asymptotic properties of the QML estimator for the mean and memory parameters of the model (12). It is shown that the estimator of the memory parameter is strongly consistent, but asymptotic normality is not available for the feasible case due to a slow convergence rate between the sequences of feasible and infeasible estimators. The asymptotic variance of the infeasible estimator is shown to be $6/\pi^2$, the same as in the fractional white noise model of Granger (1980) and Hosking (1981).

The class of long memory ARCH(∞) models will find potential applications in several areas

of financial econometrics. Apart from the volatility modeling, where the manifestations of long

range dependence have been well documented and extensively studied, it offers an attractive

and parsimonious way of modeling time series of non-negative data in the newly emerged field

of econometric models for high frequency financial data.

6 Appendix

This technical appendix collects proofs of the main results in Section 2 and Section 3. To simplify notation, we use the convention $\sum_{m}^{n} \cdot = 0$ whenever $m > n$ for $m, n \in \mathbb{Z}$. $\{K_n : n \ge 0\}$ is a sequence of non-negative constants.

PROOF OF THEOREM 1: Using stationarity and non-negativity of the solution $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$, and the monotone convergence theorem we write:

$$E\psi_t = E\Big[a^* + \sum_{j=1}^{\infty} \pi_{j-1} X_{t-j}\Big] = a^* + \sum_{j=1}^{\infty} \pi_{j-1} E X_{t-j} = a^* + a \sum_{j=0}^{\infty} \pi_j ,$$

from where $\sum_{j=0}^{\infty} \pi_j \le 1$ for $a > 0$ since $E\psi_t = a$. Since all parameters are non-negative, $a^* > 0$ iff $\sum_{j=0}^{\infty} \pi_j < 1$. Hence, the process can be written as:

$$X_t - a = (X_t - \psi_t) + \sum_{j=1}^{\infty} \pi_{j-1}(X_{t-j} - a) .$$

Then, by $E|X_t - a||X_{t-j} - a| \le E(X_t - a)^2 < \infty$, summability of $\{\pi_j : j \ge 0\}$ and the monotone convergence theorem:

$$E(X_t - a)^2 = E(X_t - \psi_t)(X_t - a) + \sum_{j=1}^{\infty} \pi_{j-1} E(X_t - a)(X_{t-j} - a) .$$

By the Cauchy-Schwarz inequality and covariance stationarity of $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$:

$$E(X_t - \psi_t)(X_t - a) < \infty ,$$

implying that $\sum_{j=1}^{\infty} \pi_{j-1} E(X_t - a)(X_{t-j} - a) < \infty$. Finally, the last part of the theorem follows by non-negativity of the solution $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ and $EX_t = E\psi_t = 0$ when $a = 0$.

PROOF OF THEOREM 2: Suppose a covariance stationary solution $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ of the ARCH(∞) process is given. Then, using (8), for each $N \ge 0$ we can rewrite (1) as:

$$\psi_t = a + \sum_{j=1}^{N} \theta_{j-1}(X_{t-j} - \psi_{t-j}) + \sum_{j=N}^{\infty} \Big(\pi_j + \sum_{i=0}^{N-1} \theta_{j-1-i}\pi_i\Big)(X_{t-1-j} - a) .$$

By covariance stationarity of $\{X_t : t \in \mathbb{Z}\}$ it follows that:

$$E\Big[\psi_t - a - \sum_{j=1}^{N} \theta_{j-1}(X_{t-j} - \psi_{t-j})\Big]^2 \le E(X_t - a)^2 \Big[\sum_{j=N}^{\infty} \Big(\pi_j + \sum_{i=0}^{N-1} \theta_{j-1-i}\pi_i\Big)\Big]^2 ,$$

and the last expression converges to zero as $N \to \infty$ by (9). Since $\sum_{j=1}^{N} \theta_{j-1}(X_{t-j} - \psi_{t-j})$ converges in $L^2$ as $N \to \infty$, we establish representation (10).

Next, consider a covariance stationary solution $\{(X^*_t, \psi^*_t) : t \in \mathbb{Z}\}$ of the MD-ARCH(∞) process (6). By (8), for each $N \ge 0$:

$$\psi^*_t = a + \sum_{j=1}^{N} \pi_{j-1}(X^*_{t-j} - a) - \pi_{N-1}(\psi^*_{t-N} - a) + \sum_{j=N+1}^{\infty} b_{j-1,N-1}(X^*_{t-j} - \psi^*_{t-j}) ,$$

where $\{b_{j,N} : j, N \ge 0\}$ is defined as:

$$b_{j,N} := \begin{cases} \theta_j - \sum_{i=0}^{N-1} \theta_{j-1-i}\pi_i & \text{for } j \ge N \ge 0 \\ \theta_j & \text{otherwise.} \end{cases} \tag{A.1}$$

It is easy to see that $\pi_j \le b_{j,N} \le \theta_j$ for all $j, N \ge 0$, hence the sequence $\{b_{j,N} : j \ge 0\}$ is square-summable for each $N \ge 0$. By the Cauchy-Schwarz inequality we can write:

$$E\Big[\psi^*_t - a - \sum_{j=1}^{N} \pi_{j-1}(X^*_{t-j} - a)\Big]^2 \le 2\,E\Big[\sum_{j=N+1}^{\infty} b_{j-1,N-1}(X^*_{t-j} - \psi^*_{t-j})\Big]^2 + 2\pi_{N-1}^2 E(\psi^*_{t-N} - a)^2 .$$

It remains to show that the right-hand side of this expression converges to zero as $N \to \infty$. The limit of the last term above is clearly zero, as $E(\psi^*_{t-N} - a)^2 < \infty$ for any $N \ge 0$, and the summability of $\{\pi_j : j \ge 0\}$ implies that $\pi_N^2 \to 0$ as $N \to \infty$. Similarly, covariance stationarity of $\{(X^*_t, \psi^*_t) : t \in \mathbb{Z}\}$ together with (6) implies that $\{(X^*_t - \psi^*_t) : t \in \mathbb{Z}\}$ is a sequence of uncorrelated random variables, with $E(X^*_t - \psi^*_t) = 0$ and $E(X^*_t - \psi^*_t)^2 < \infty$ for all $t \in \mathbb{Z}$. It follows that:

$$E\Big[\sum_{j=N+1}^{\infty} b_{j-1,N-1}(X^*_{t-j} - \psi^*_{t-j})\Big]^2 \le \sum_{j=N+1}^{\infty} \theta_{j-1}^2\, E(X^*_{t-j} - \psi^*_{t-j})^2 \to 0 \quad \text{as } N \to \infty ,$$

where we use (A.1). Standard results on stationary time series, such as Brockwell and Davis (1991), imply that $a + \sum_{j=1}^{N} \pi_{j-1}(X^*_{t-j} - a)$ converges in $L^2$ as $N \to \infty$, and hence (11) follows.

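LEMMA 1. Let $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ be a sequence of non-negative random variables satisfying ARCH(∞) equations (12) and $P\{\psi_t = 0\} = 0$ for all $t \in \mathbb{Z}$. Assume that the sequence $\{\pi_j : j \ge 0\}$ is given by (14). Then the following inequalities hold a.s.:

$$K_1 \le \frac{\psi_t}{\psi_{t-1}} \le 1 + \varepsilon_{t-1} ,$$

for some $0 < K_1 < 1$.

PROOF: Using the structure of model (12), we can write:

$$\frac{\psi_t}{\psi_{t-1}} = \pi_0 \frac{X_{t-1}}{\psi_{t-1}} + \frac{\sum_{j=2}^{\infty} \pi_{j-1} X_{t-j}}{\psi_{t-1}} \le 1 + \varepsilon_{t-1} , \tag{A.2}$$

where the recursive structure of (14):

$$\pi_j = \pi_{j-1}\,\frac{j - d_0}{j + 1} \quad \text{for all } j \ge 1 , \tag{A.3}$$

is utilized in the following way:

$$\frac{\sum_{j=2}^{\infty} \pi_{j-1} X_{t-j}}{\psi_{t-1}} = \frac{\sum_{j=1}^{\infty} \pi_{j-1}\frac{j - d_0}{j + 1} X_{t-1-j}}{\psi_{t-1}} \le \frac{\sum_{j=1}^{\infty} \pi_{j-1} X_{t-1-j}}{\psi_{t-1}} = 1 \quad \text{a.s.}$$

noting non-negativity of the summands, and $0 < \frac{j - d_0}{j + 1} < 1$ for all $j \ge 1$ and $0 \le d_0 < 1$. From (A.2) and (A.3) we write:

$$\frac{\psi_t}{\psi_{t-1}} \ge \frac{\sum_{j=2}^{\infty} \pi_{j-1} X_{t-j}}{\psi_{t-1}} = \frac{\sum_{j=1}^{\infty} \pi_{j-1}\frac{j - d_0}{j + 1} X_{t-1-j}}{\psi_{t-1}} \ge \frac{1 - d_0}{2} > 0 ,$$

where we use non-negativity of the summands, and $\frac{1 - d_0}{2} \le \frac{j - d_0}{j + 1}$ for all $j \ge 1$.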
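The two facts about the weights used in this proof can be checked numerically. The Python sketch below generates the weights by the recursion (A.3) and verifies that every ratio π_j/π_{j−1} lies in [(1 − d0)/2, 1). Equation (14) is not restated in this appendix, so the starting value π_0 = d0 and the closed form π_j = d0 Γ(j + 1 − d0)/(Γ(1 − d0) Γ(j + 2)) are assumptions, inferred from the representation w_0(d) = [1 − (1 − L)^d]X_0 used in the proof of Theorem 3; the function names are hypothetical.

```python
import math


def pi_recursive(d, n):
    """First n weights from (A.3), assuming the starting value pi_0 = d."""
    w = [d]
    for j in range(1, n):
        w.append(w[-1] * (j - d) / (j + 1))
    return w


def pi_closed_form(d, j):
    """Assumed closed form: d * Gamma(j+1-d) / (Gamma(1-d) * Gamma(j+2))."""
    log_val = math.lgamma(j + 1 - d) - math.lgamma(1 - d) - math.lgamma(j + 2)
    return d * math.exp(log_val)


if __name__ == "__main__":
    d0 = 0.35
    w = pi_recursive(d0, 1000)
    ratios = [w[j] / w[j - 1] for j in range(1, len(w))]
    print(min(ratios), (1 - d0) / 2)  # smallest ratio vs. lower bound
    print(max(ratios))                # stays below one
    print(sum(w))                     # partial sum approaches one slowly
```

The smallest ratio is attained at j = 1, where it equals (1 − d0)/2 exactly, which is why this constant serves as the lower bound in the lemma.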

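LEMMA 2. Let $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ be a covariance stationary non-negative solution of the ARCH(∞) model (12), where $EX_t = E\psi_t = a > 0$ and assumptions A1–A3 hold. Then:

$$P\{\psi_t = 0\} = 0 , \tag{A.4}$$

$$E\psi_t^{-\nu} < \infty , \tag{A.5}$$

for any $\nu > 0$.

PROOF: We first show (A.4). By A3 the distribution of the shocks $\{\varepsilon_t : t \in \mathbb{Z}\}$ does not contain an atom at zero. Using (12) and the structure of (14), whereby $\{\pi_j : j \ge 0\} \subseteq \mathbb{R}_+$, the following probabilities are equal:

$$P\{\psi_t = 0\} = P\Big\{\bigcap_{j=1}^{\infty} \{\psi_{t-j} = 0\}\Big\} .$$

By Theorem 2, the covariance stationary solution $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ also satisfies (6). Thus, the probability of the event on the right hand side of this expression is zero, since $\psi_{t-1-j} = 0$ for all $j \ge 1$ implies that $\psi_{t-1} = a$.

To show (A.5), it is sufficient to establish that:

$$P\{\psi_t^{-1} \ge s\} = P\{\psi_t \le s^{-1}\} \le O(s^{-\nu^*}) , \tag{A.6}$$

for some $\nu^* > \nu$ and $s \in \mathbb{R}_+$. Choose a sequence of positive numbers $\{c_i : i \ge 0\}$, where $c_0 := 1$, $c_i \to \infty$ as $i \to \infty$, and $\frac{c_i}{c_{i+1}} \ge K_1 > 0$ for all $i \ge 0$. Using non-negativity of the solution $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ and Lemma 1 we can write for an arbitrary $1 \le M < \infty$ and $i \ge 0$:

$$P\Big\{K_2^i c_i \psi_t \le s^{-\frac{1}{2^i}}\Big\} \le P\Big\{K_2^i c_i \psi_{t-M} \sum_{j=1}^{M} \frac{\psi_{t-j}}{\psi_{t-M}}\,\varepsilon_{t-j} \le s^{-\frac{1}{2^i}}\Big\} \le P\Big\{K_2^{i+1} c_i \psi_{t-M} \sum_{j=1}^{M} \varepsilon_{t-j} \le s^{-\frac{1}{2^i}}\Big\} ,$$

where by Lemma 1 there exists a constant $0 < K_2 < 1$ s.t. $K_2 \le \min\Big\{\frac{\psi_{t-1}}{\psi_{t-M}}, \ldots, \frac{\psi_{t-M}}{\psi_{t-M}}\Big\}$. Observe that the following inequality holds for a pair of non-negative random variables $A$ and $B$ and any $s \in \mathbb{R}_+$:

$$P\{A \cdot B \le s\} \le P\{A \le s^{\frac{1}{2}}\} + P\{B \le s^{\frac{1}{2}}\} .$$

We continue writing:

$$P\Big\{K_2^{i+1} \frac{c_i}{c_{i+1}}\, c_{i+1} \psi_{t-M} \sum_{j=1}^{M} \varepsilon_{t-j} \le s^{-\frac{1}{2^i}}\Big\} \le P\Big\{\frac{c_i}{c_{i+1}} \sum_{j=1}^{M} \varepsilon_{t-j} \le s^{-\frac{1}{2^{i+1}}}\Big\} + P\Big\{K_2^{i+1} c_{i+1} \psi_{t-M} \le s^{-\frac{1}{2^{i+1}}}\Big\} .$$

Using stationarity of the solution $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ we can write for an arbitrary $1 \le N < \infty$:

$$P\{\psi_t \le s^{-1}\} \le \sum_{i=1}^{N} P\Big\{\frac{c_{i-1}}{c_i} \sum_{j=1}^{M} \varepsilon_j \le s^{-\frac{1}{2^i}}\Big\} + P\Big\{K_2^N c_N \psi_t \le s^{-\frac{1}{2^N}}\Big\} . \tag{A.7}$$

Choose the sequence $\{c_i : i \ge 0\}$ s.t. $K_2^N c_N \to \infty$ as $N \to \infty$. By (A.4) the last probability on the right hand side of this expression converges to zero. Next, consider the probabilities under the summation sign in equation (A.7). Using Markov's inequality:

$$P\Big\{\frac{c_{i-1}}{c_i} \sum_{j=1}^{M} \varepsilon_j \le s^{-\frac{1}{2^i}}\Big\} \le P\Big\{\Big(K_1 \sum_{j=1}^{M} \varepsilon_j\Big)^{-2^i} \ge s\Big\} \le s^{-\nu^*} E\Big(K_1 \sum_{j=1}^{M} \varepsilon_j\Big)^{-\nu^* 2^i} .$$

Let $M$ in the last expression be given by $M = K_3 \cdot M^*$. By A3 there exists $K_3 \ge 1$ s.t. $E\big(\sum_{n=1}^{K_3} \varepsilon_n\big)^{-1} < \infty$. The harmonic–arithmetic mean inequality, see Spiegel and Liu (1999), helps to establish the following:

$$E\Big(\sum_{j=1}^{M} \varepsilon_j\Big)^{-1} \le M^{-2} \sum_{j=1}^{M^*} E\Big(\sum_{n=1}^{K_3} \varepsilon_{n + K_3(j-1)}\Big)^{-1} = O(M^{-1}) = K_4 .$$

Finally, choose sufficiently large $M$ to ensure $K_1^{-1} K_4 < 1$ and write (A.7) for any given $s \in \mathbb{R}_+$ as follows:

$$P\{\psi_t \le s^{-1}\} \le s^{-\nu^*} \Big(\sum_{i=1}^{\infty} \big[K_1^{-1} K_4\big]^{\nu^* 2^i} + 1\Big) .$$

This finishes the proof of (A.5).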



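ARCH(∞) model (12), and $\{w_t(d) : 1 \le t \le T\}$ be defined as in (16). Then, under A3–A4, for any $\nu > 0$:

$$E\Big(\sup_{d \in D} \frac{\psi_t}{w_t(d)}\Big)^{\nu} < \infty \tag{A.8}$$

PROOF: For any $1 \le M < \infty$ we can write:

$$\frac{\psi_t}{w_t(d)} \le \frac{\psi_t}{\sum_{j=1}^{M} \pi_{j-1}(d) X_{t-j}} \le \frac{\psi_t}{\psi\, \pi(d) \sum_{j=1}^{M} \varepsilon_{t-j}} ,$$

where $\psi := \min\{\psi_{t-1}, \ldots, \psi_{t-M}\}$ and

$$\pi(d) := \min\{\pi_0(d), \ldots, \pi_{M-1}(d)\} . \tag{A.9}$$

Note that $\psi > 0$ by Lemma 2. Using Lemma 1, $\frac{\psi_t}{\psi} \le \max\big\{(1 + \varepsilon_{t-1}), \ldots, \prod_{j=1}^{M}(1 + \varepsilon_{t-j})\big\} \le \prod_{j=1}^{M}(1 + \varepsilon_{t-j})$, and hence we write:

$$\sup_{d \in D} \frac{\psi_t}{w_t(d)} \le K_1 \frac{\prod_{j=1}^{M}(1 + \varepsilon_{t-j})}{\sum_{j=1}^{M} \varepsilon_{t-j}} .$$

The remainder of the proof is analogous to the proof of Lemma 5.1 in Berkes, Horvath and Kokoszka (2003) using A3.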


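ARCH(∞) model (12), and $\{w_t(d) : 1 \le t \le T\}$ be defined as in (16). Then, under A3–A4, for any $\nu > 0$:

$$E\Big(\sup_{d \in D} \Big|\frac{w_t^{(1)}(d)}{w_t(d)}\Big|\Big)^{\nu} < \infty , \tag{A.10}$$

$$E\Big(\sup_{d \in D} \Big|\frac{w_t^{(2)}(d)}{w_t(d)}\Big|\Big)^{\nu} < \infty . \tag{A.11}$$

PROOF: First, consider (A.10). Using (14), we can write for any $1 < N < M < \infty$:

$$\Big|\frac{w_t^{(1)}(d)}{w_t(d)}\Big| \le K_9\, \frac{\frac{1}{d}\sum_{j=1}^{\infty} \pi_{j-1}(d) X_{t-j} + \sum_{j=1}^{\infty} \pi_{j-1}(d) \log j\, X_{t-j}}{\sum_{j=1}^{\infty} \pi_{j-1}(d) X_{t-j}}$$

$$\le \frac{K_9}{d} + K_9\, \frac{\sum_{j=1}^{M-1} \pi_{j-1}(d) \log j\, X_{t-j} + \sum_{j=M}^{\infty} \pi_{j-1}(d) \log j\, X_{t-j}}{\sum_{j=1}^{M-1} \pi_{j-1}(d) X_{t-j}}$$

$$\le K_9\Big[1 + \frac{1}{d}\Big] \log M + \frac{K_9}{\pi(d)} \sum_{j=M}^{\infty} \pi_{j-1}(d) \log j\, \frac{X_{t-j}}{\sum_{n=1}^{N} X_{t-n}} ,$$

where $\pi(d)$ is defined in (A.9). Using properties of the sequence $\{\pi_j(d) : j \ge 0\}$, it follows that:

$$\sup_{d \in D} \Big|\frac{w_t^{(1)}(d)}{w_t(d)}\Big| \le K_1 \log M + K_2 \sum_{j=M}^{\infty} j^{-\gamma} \log j\, \frac{X_{t-j}}{\sum_{n=1}^{N} X_{t-n}} ,$$

for some $\gamma > 1$. Next, consider (A.11). Using similar arguments, it follows for any $1 < N < M < \infty$:

$$\Big|\frac{w_t^{(2)}(d)}{w_t(d)}\Big| \le K_{10}\, \frac{\frac{2}{d}\sum_{j=1}^{\infty} \pi_{j-1}(d) \log j\, X_{t-j} + \sum_{j=1}^{\infty} \pi_{j-1}(d) (\log j)^2 X_{t-j}}{\sum_{j=1}^{\infty} \pi_{j-1}(d) X_{t-j}}$$

$$\le K_{10}\Big[1 + \frac{2}{d}\Big] (\log M)^2 + \frac{K_{10}}{\pi(d)}\Big[1 + \frac{2}{d}\Big] \sum_{j=M}^{\infty} \pi_{j-1}(d) (\log j)^2\, \frac{X_{t-j}}{\sum_{n=1}^{N} X_{t-n}} ,$$

from where we can write:

$$\sup_{d \in D} \Big|\frac{w_t^{(2)}(d)}{w_t(d)}\Big| \le K_{11} (\log M)^2 + K_{12} \sum_{j=M}^{\infty} j^{-\gamma} (\log j)^2\, \frac{X_{t-j}}{\sum_{n=1}^{N} X_{t-n}} .$$

Hence, in order to establish (A.10) and (A.11) it is sufficient to show that:

$$P\Big\{K_{11} (\log M)^2 + K_{12} \sum_{j=M}^{\infty} j^{-\gamma} (\log j)^2\, \frac{X_{t-j}}{\sum_{n=1}^{N} X_{t-n}} > s\Big\} \le O(s^{-\nu^*}) , \tag{A.12}$$

for some $\nu^* > \nu$ and $s \in \mathbb{R}_+$. The following auxiliary result is used in subsequent derivations: $E\,\frac{X_{t-j}}{\sum_{n=1}^{N} X_{t-n}} < \infty$ for all $1 \le N < j < \infty$. Observe that:

$$\frac{X_{t-j}}{\sum_{n=1}^{N} X_{t-n}} = \frac{X_{t-j}}{\Big(\frac{\psi_{t-1}}{\psi_{t-N}}\,\varepsilon_{t-1} + \ldots + \varepsilon_{t-N}\Big)\psi_{t-N}} \le K_4 \Big(\sum_{n=1}^{N} \varepsilon_{t-n}\Big)^{-1} \frac{X_{t-j}}{\psi_{t-N}} ,$$

using the inequality $\frac{\psi_t}{\psi_{t-1}} \ge K_3 > 0$ for all $t \in \mathbb{Z}$ shown in Lemma 1. By Holder's inequality, independence of shocks $\{\varepsilon_t : t \in \mathbb{Z}\}$, covariance stationarity of the solution $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ and Lemma 2, for all $1 \le N < j < \infty$:

$$E\,\frac{X_{t-j}}{\sum_{n=1}^{N} X_{t-n}} \le K_4\, E\Big(\sum_{n=1}^{N} \varepsilon_{t-n}\Big)^{-1} E\,\frac{X_{t-j}}{\psi_{t-N}} \le K_5 \Big(E\,\frac{1}{\psi_0^2}\, E X_0^2\Big)^{\frac{1}{2}} < \infty ,$$

where $N$ is sufficiently large for $E\big(\sum_{n=1}^{N} \varepsilon_{t-n}\big)^{-1} < \infty$ in view of A3.

Returning to (A.12), we write using Markov's inequality and the equation above:

$$P\Big\{K_{12} \sum_{j=M}^{\infty} j^{-\gamma} (\log j)^2\, \frac{X_{t-j}}{\sum_{n=1}^{N} X_{t-n}} > s - K_{11} (\log M)^2\Big\} \le \frac{K_{12}}{s - K_{11} (\log M)^2}\, E \sum_{j=M}^{\infty} j^{-\gamma} (\log j)^2\, \frac{X_{t-j}}{\sum_{n=1}^{N} X_{t-n}} = \frac{K_{13}}{s - K_{11} (\log M)^2} \sum_{j=M}^{\infty} j^{-\gamma} (\log j)^2 ,$$

where the interchange of limits is justified by the monotone convergence theorem and non-negativity of the summands. Finally, it is known that $\sum_{j=M}^{\infty} j^{-\gamma} (\log j)^2 = O(M^{-\gamma^*})$, for some $\gamma > \gamma^* > 1$. Choosing $M = s^{\nu^*/\gamma^*}$, we establish (A.12) since:

$$\frac{K_{13}}{s - K_{11} (\log M)^2} \sum_{j=M}^{\infty} j^{-\gamma} (\log j)^2 = \frac{K_{13}}{s - K_{14} (\log s)^2}\, O(s^{-\nu^*}) ,$$

for large enough $s$, where $s - K_{14} (\log s)^2$ is positive and increasing.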

PROOF OF THEOREM 3: Using Lemma 3 and 4, the proof of (19) follows from the same arguments as the proof of Theorem 4.1 in Berkes, Horvath and Kokoszka (2003). We have the following additional remarks.

First, consider the condition $E|\log w_0(d)| < \infty$. By $|\log w_0(d)| \le w_0^{-1}(d) + w_0(d)$ it is sufficient to show that $E w_0^{-1}(d) < \infty$ and $E w_0(d) < \infty$ for all $d \in D$. We write for any $1 \le N < \infty$:

$$\frac{1}{w_0(d)} \le \frac{1}{\sum_{j=1}^{N} \pi_{j-1}(d) X_{-j}} \le \frac{1}{\pi(d)\Big(\frac{\psi_{-1}}{\psi_{-N}}\,\varepsilon_{-1} + \ldots + \varepsilon_{-N}\Big)\psi_{-N}} \le \Big(\sum_{j=1}^{N} \varepsilon_{-j}\Big)^{-1} \frac{K_1}{\psi_{-N}} , \tag{A.13}$$

where $\pi(d)$ is defined in (A.9), and we use $\frac{\psi_t}{\psi_{t-1}} \ge K_2 > 0$ for all $t \in \mathbb{Z}$ shown in Lemma 1. Hence by Lemma 2 and A3, for sufficiently large $N$:

$$E\,\frac{1}{w_0(d)} \le K_1\, E\Big(\sum_{j=1}^{N} \varepsilon_{-j}\Big)^{-1} E\,\frac{1}{\psi_0} < \infty .$$

In view of (15), we have $w_0(d) \le K_3 \sum_{j=1}^{\infty} j^{-\gamma} X_{-j}$ for some $\gamma > 1$ and all $d \in D$. Hence, by the monotone convergence theorem and stationarity of the solution $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$, $E w_0(d) \le K_3 \sum_{j=1}^{\infty} j^{-\gamma}\, E X_0 < \infty$.

Second, the uniqueness of the maximum of the limiting log-likelihood function $L(d) := -E\Big(\log w_0(d) + \frac{X_0}{w_0(d)}\Big)$ follows from arguments similar to those in Theorem 2.3 in Berkes, Horvath and Kokoszka (2003), where their equation (2.2) can be written as:

$$\varepsilon_{t-m} = \frac{\sum_{j=m+1}^{\infty} (\pi^*_j - \pi_j) X_{t-j}}{(\pi^*_m - \pi_m)\psi_{t-m}} ,$$

where the sequence $\{\pi^*_j : j \ge 0\}$ satisfies A2. The right-hand side of this expression is well-defined, as $P\{\psi_t = 0\} = 0$ for all $t \in \mathbb{Z}$ by Lemma 2.

The asymptotic normality of the estimator (18) follows by the arguments used by Berkes, Horvath and Kokoszka (2003) to prove their Theorem 4.2. Additionally, we would like to comment on the following.

Define $A_0 := E\Big[-\frac{w_0^{(1)}(d_0)}{\psi_0}(1 - \varepsilon_0)\Big]^2$ and $B_0 := -E\Big[\frac{w_0^{(1)}(d_0)}{\psi_0}\Big]^2$, such that they correspond to the respective definitions of Berkes, Horvath and Kokoszka (2003). Existence of $A_0$ and $B_0$ follows from Lemma 3 and A1. In order to establish that they are different from zero, it is sufficient to show $|w_0^{(1)}(d_0)|^2 \ne 0$ a.s. Indeed, using (12) and (14), $|w_0^{(1)}(d_0)|^2 \ge X_{-1}^2$, from where by Lemma 2 and A3 we establish the result.

The limiting variance of the QML estimator in Theorem 3 is derived as follows. Using (12) we can write $w_0(d) = [1 - (1 - L)^d] X_0$, from where:

$$w_0^{(1)}(d_0) = -\log(1 - L)\,(1 - L)^{d_0} X_0 = -\log(1 - L)(X_0 - \psi_0) ,$$

from where the following expectation holds by A1:

$$E\Big[\frac{w_0^{(1)}(d_0)}{\psi_0}\Big]^2 = E\big[\log(1 - L)(\varepsilon_0 - 1)\big]^2 = \frac{\pi^2}{6}\, E(\varepsilon_0 - 1)^2 .$$

The limiting variance of the QML estimator is then:

$$\frac{A_0}{B_0^2} = E(1 - \varepsilon_0)^2 \Big(E\Big[\frac{w_0^{(1)}(d_0)}{\psi_0}\Big]^2\Big)^{-1} = \frac{6}{\pi^2} .$$

This concludes the proof of Theorem 3.
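The constant π²/6 in the derivation above comes from the expansion log(1 − L) = −Σ_{j≥1} L^j / j applied to the uncorrelated sequence ε_t − 1, which gives a variance factor of Σ_{j≥1} j^{−2}. A minimal Python check of this series and of the resulting limiting variance is sketched below; the function name is hypothetical.

```python
import math


def partial_sum_inverse_squares(n):
    """Partial sum of 1/j^2 for j = 1..n, the squared weights of log(1 - L)."""
    return sum(1.0 / (j * j) for j in range(1, n + 1))


if __name__ == "__main__":
    s = partial_sum_inverse_squares(100000)
    print(s, math.pi ** 2 / 6)       # the series converges to pi^2 / 6
    print(1.0 / s, 6 / math.pi ** 2)  # implied limiting variance
```

The truncation error of the partial sum is of order 1/n, so a few decimal places of agreement are all that should be expected for moderate n.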


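ARCH(∞) model (12), and functions $L_T(d)$ and $\tilde{L}_T(d)$ be defined respectively in (17) and (24). Then, under A1, A3–A4, as $T \to \infty$:

$$\sup_{d \in D} \frac{1}{T}\Big|L_T(d) - \tilde{L}_T(d)\Big| \xrightarrow{a.s.} 0 . \tag{A.14}$$

PROOF: Using the definitions of the functions $L_T(d)$ and $\tilde{L}_T(d)$, by the triangle inequality we are able to write:

$$\frac{1}{T}\Big|L_T(d) - \tilde{L}_T(d)\Big| \le \frac{1}{T}\sum_{t=1}^{T} \big|\log w_t(d) - \log \tilde{w}_t(d)\big| + \frac{1}{T}\sum_{t=1}^{T} \frac{X_t}{w_t(d)} \Big|\frac{w_t(d) - \tilde{w}_t(d)}{\tilde{w}_t(d)}\Big| . \tag{A.15}$$

We re-write the two expressions on the right hand side of this equation as follows. First, by non-negativity of $w_t(d)$ and $\tilde{w}_t(d)$, $|\log x - \log y| \le (x^{-1} + y^{-1})|x - y|$ for $x, y \in \mathbb{R}_+$, and (16) and (22) it follows that:

$$\frac{1}{T}\sum_{t=1}^{T} \big|\log w_t(d) - \log \tilde{w}_t(d)\big| \le \frac{1}{T}\sum_{t=1}^{T} \big[w_t^{-1}(d) + \tilde{w}_t^{-1}(d)\big] \sum_{j=t}^{\infty} \pi_{j-1}(d)\,|X_{t-j} - a_0| .$$

Similarly to (A.13), the function $w_t^{-1}(d)$ is bounded uniformly in $d \in D$ by the following expression, for $t \ge N \ge 1$:

$$w_t^{-1}(d) \le \frac{1}{\sum_{i=1}^{N} \pi_{i-1}(d) X_{t-i}} \le \Big(\sum_{i=1}^{N} \varepsilon_{t-i}\Big)^{-1} \frac{K_1}{\psi_{t-N}} .$$

The same bound holds for $\tilde{w}_t^{-1}(d)$ as well. Hence, using (15) we obtain:

$$\sup_{d \in D} \frac{1}{T}\sum_{t=N+1}^{T} \big|\log w_t(d) - \log \tilde{w}_t(d)\big| \le K_2\, \frac{1}{T}\sum_{t=N+1}^{T} \sum_{j=t}^{\infty} j^{-\gamma} \Big(\sum_{i=1}^{N} \varepsilon_{t-i}\Big)^{-1} \frac{|X_{t-j} - a_0|}{\psi_{t-N}} ,$$

for some $\gamma > 1$. Next, consider the second term on the right hand side of (A.15). Using the bounds of the functions $w_t^{-1}(d)$ and $\tilde{w}_t^{-1}(d)$ shown above together with (A.2), (16) and (22), we obtain the following inequality:

$$\sup_{d \in D} \frac{1}{T}\sum_{t=N+1}^{T} \frac{X_t}{w_t(d)} \Big|\frac{w_t(d) - \tilde{w}_t(d)}{\tilde{w}_t(d)}\Big| \le K_3\, \frac{1}{T}\sum_{t=N+1}^{T} \sum_{j=t}^{\infty} j^{-\gamma}\, \frac{\prod_{i=0}^{N}(1 + \varepsilon_{t-i})}{\big(\sum_{i=1}^{N} \varepsilon_{t-i}\big)^2}\, \frac{|X_{t-j} - a_0|}{\psi_{t-N}} .$$

Comparison of the last two equations shows that the former dominates, given an appropriate choice of the generic constants, by non-negativity of the shocks $\{\varepsilon_t : t \in \mathbb{Z}\}$. By Kronecker's Lemma, the following limit is therefore sufficient for the uniform convergence of $\frac{1}{T} L_T(d)$ and $\frac{1}{T} \tilde{L}_T(d)$, apart from the first $N$ terms:

$$\sum_{t=N+1}^{\infty} \frac{1}{t} \sum_{j=t}^{\infty} j^{-\gamma}\, \frac{\prod_{i=0}^{N}(1 + \varepsilon_{t-i})}{\big(\sum_{i=1}^{N} \varepsilon_{t-i}\big)^2}\, \frac{|X_{t-j} - a_0|}{\psi_{t-N}} < \infty \quad \text{a.s.}$$

By Beppo Levi's theorem the desired result is established upon showing convergence of the following infinite series:

$$\sum_{t=N+1}^{\infty} E\Bigg[\frac{1}{t} \sum_{j=t}^{\infty} j^{-\gamma}\, \frac{\prod_{i=0}^{N}(1 + \varepsilon_{t-i})}{\big(\sum_{i=1}^{N} \varepsilon_{t-i}\big)^2}\, \frac{|X_{t-j} - a_0|}{\psi_{t-N}}\Bigg] = \sum_{t=N+1}^{\infty} \frac{1}{t} \sum_{j=t}^{\infty} j^{-\gamma}\, E\Bigg[\frac{\prod_{i=0}^{N}(1 + \varepsilon_{t-i})}{\big(\sum_{i=1}^{N} \varepsilon_{t-i}\big)^2}\Bigg] \Big[E\,\frac{1}{\psi_{t-N}^2}\, E(X_{t-j} - a_0)^2\Big]^{\frac{1}{2}} = K_4 \sum_{t=N+1}^{\infty} \frac{1}{t} \sum_{j=t}^{\infty} j^{-\gamma} < \infty ,$$

where we used A1, covariance stationarity of the solution $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$, Holder's inequality, Lemma 2, and the last part of the proof of Lemma 3.

To finally establish (A.14) it remains to show that the difference between the first $N$ terms of the functions $\frac{1}{T} L_T(d)$ and $\frac{1}{T} \tilde{L}_T(d)$ converges to zero a.s. as $T \to \infty$. We use decomposition (A.15) and write:

$$\sup_{d \in D} \frac{1}{T}\sum_{t=1}^{N} \big|\log w_t(d) - \log \tilde{w}_t(d)\big| \le K_5\, \frac{1}{T}\sum_{t=1}^{N} \sum_{j=t}^{\infty} j^{-\gamma} \sup_{d \in D} \frac{|X_{t-j} - a_0|}{w_t(d)} + K_6\, \frac{1}{T}\sum_{t=1}^{N} \sum_{j=t}^{\infty} j^{-\gamma}\, |X_{t-j} - a_0| ,$$

upon noting $\tilde{w}_t^{-1}(d) \le \Big(a_0 \sum_{j=N}^{\infty} \pi_{j-1}(d)\Big)^{-1} \le K_6$ uniformly in $d \in D$ for $1 \le t \le N$. Since $N$ is fixed, it is sufficient to show that:

$$E\Bigg[\sum_{t=1}^{N} \sum_{j=t}^{\infty} j^{-\gamma} \sup_{d \in D} \frac{|X_{t-j} - a_0|}{w_t(d)}\Bigg] = \sum_{t=1}^{N} \sum_{j=t}^{\infty} j^{-\gamma} \Big[E\Big(\sup_{d \in D} \frac{1}{w_t(d)}\Big)^2 E(X_{t-j} - a_0)^2\Big]^{\frac{1}{2}} < \infty ,$$

where we used Holder's inequality, covariance stationarity of $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$ and an argument similar to (A.13). Similarly:

$$E\Bigg[\sum_{t=1}^{N} \sum_{j=t}^{\infty} j^{-\gamma}\, |X_{t-j} - a_0|\Bigg] = \sum_{t=1}^{N} \sum_{j=t}^{\infty} j^{-\gamma}\, E|X_{t-j} - a_0| < \infty ,$$

by the covariance stationarity of $\{(X_t, \psi_t) : t \in \mathbb{Z}\}$. Finally, the last part of (A.15) can be written as follows:

$$\sup_{d \in D} \frac{1}{T}\sum_{t=1}^{N} \frac{X_t}{w_t(d)} \Big|\frac{w_t(d) - \tilde{w}_t(d)}{\tilde{w}_t(d)}\Big| \le K_7\, \frac{1}{T}\sum_{t=1}^{N} \varepsilon_t \sum_{j=t}^{\infty} j^{-\gamma} \sup_{d \in D} \frac{\psi_t}{w_t(d)}\, |X_{t-j} - a_0| ,$$

where we use the uniform bound on $\tilde{w}_t^{-1}(d)$ shown above. Taking the expectation of the right hand side of this expression and using independence of $\{\varepsilon_t : t \in \mathbb{Z}\}$, Holder's inequality, and Lemma 3 we conclude that:

$$E\Bigg[\sum_{t=1}^{N} \varepsilon_t \sum_{j=t}^{\infty} j^{-\gamma} \sup_{d \in D} \frac{\psi_t}{w_t(d)}\, |X_{t-j} - a_0|\Bigg] = \sum_{t=1}^{N} \sum_{j=t}^{\infty} j^{-\gamma} \Big[E\Big(\sup_{d \in D} \frac{\psi_t}{w_t(d)}\Big)^2 E(X_{t-j} - a_0)^2\Big]^{\frac{1}{2}} < \infty .$$

This finishes the proof of (A.14).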

PROOF OF THEOREM 4: Using Lemma 5, (25) follows directly from the proof of Theorem 4.3

in Berkes, Horvath and Kokoszka (2003).


References

Andersen, Torben G., Tim Bollerslev, Francis Diebold, and Paul Labys (2001) The distribution of exchange rate volatility. Journal of the American Statistical Association, vol. 96, pp. 42-55.

pp. 3-30.

Beran, J. (1994) Statistics for long-memory processes. Chapman and Hall, New York.

Berkes, Istvan, Lajos Horvath and Piotr Kokoszka (2003) GARCH processes: Structure and estimation. Bernoulli, vol. 9, pp. 201-227.

Bougerol, Philippe and Nico Picard (1992) Stationarity of GARCH processes and of some non-negative time series. Journal of Econometrics, vol. 52, pp. 115-127.

Brockwell, Peter J. and Richard A. Davis (1991) Time series: theory and methods. Second Edition, New-York: Springer-Verlag.

Cheung, Yin-Wong and Francis X. Diebold (1994) On maximum likelihood estimation of the differencing parameter of fractionally-integrated noise with unknown mean. Journal of

pendence structure and central limit theorem. Econometric Theory, vol. 16, pp. 3-22.

Inference for Stochastic Processes, vol. 3, pp. 113-128.

Giraitis, Liudas and Peter M. Robinson (2001) Whittle estimation of ARCH models. Econometric Theory, vol. 17, pp. 608-631.

Granger, Clive W.J. (1980) Long memory relationships and aggregation of dynamic models. Journal of Econometrics, vol. 14, pp. 227-238.

Econometric Theory, vol. 18, pp. 1-16.

Koulikov, Dmitri (2003) Modeling sequences of long memory non-negative covariance stationary random variables. CAF Working Paper Series, no. 156.

Lee, Sang-Won and Bruce E. Hansen (1994) Asymptotic theory for the GARCH(1,1) quasi-maximum likelihood estimator. Econometric Theory, vol. 10, pp. 29-52.

Lumsdaine, Robin L. (1996) Consistency and asymptotic normality of the quasi-maximum likelihood estimator in IGARCH(1,1) and covariance stationary GARCH(1,1) models. Econometrica, vol. 64, pp. 575-596.

McLeod, A. I. and K. W. Hipel (1978) Preservation of the rescaled adjusted range, 1: a reassessment of the Hurst phenomenon. Water Resources Research, vol. 14, pp. 491-508.

eroskedasticity in multiple regression. Journal of Econometrics, vol. 47, pp. 67-84.

Robinson, Peter M. and Paolo Zaffaroni (2003) Pseudo–maximum likelihood estimation of ARCH(∞) models. Preprint.

Spiegel, Murray R. and John Liu (1999) Mathematical handbook of formulas and tables. Second Edition, McGraw-Hill.

Weiss, Andrew A. (1986) Asymptotic theory for ARCH models: estimation and testing. Econometric Theory, vol. 2, pp. 107-131.

Yajima, Yoshihiro (1985) On estimation of long–memory time series models. Australian Journal of Statistics, vol. 27, pp. 303-320.

Chapter 3: Non–stationary models for volatility of speculative returns: with application to foreign exchange data

Non–stationary models for volatility of speculative returns: with

application to foreign exchange data

Dmitri Koulikov





phone: +45 89421577


This revision:

December 29, 2003

Abstract

A family of models for non–stationary conditional heteroscedasticity is introduced in the paper, allowing for both a flexible deterministic time dependence and various forms of stochastic trends in the volatility process. The stochastic part of the new model, referred to as cMD–ARCH, is based on a weighted sequence of martingale difference innovations, constructed from the process history. Sufficient conditions for a.s. non–negativity are provided, along with several possible parametrizations based on popular stationary volatility models. Consistency and asymptotic normality of the QML estimator are established, drawing on the contributions of Jensen and Rahbek (2002, 2003). Finally, an empirical application to thirteen series of daily exchange rate returns is included to illustrate the practical potential of the cMD–ARCH process.

JEL classification: C13, C22

Keywords: Conditional heteroscedasticity, Non–stationarity, Quasi-maximum likelihood

estimation


1 Introduction

The literature on modeling conditionally heteroscedastic time series, such as returns on speculative assets, considers infinite sequences of strictly stationary random variables $\{(r_t, \sigma_t^2) : t \in \mathbb{Z}\}$, where $r_t$ usually represents a compound difference between prices of a financial asset at consecutive time periods, and $\sigma_t^2$ can be regarded as a scaling parameter of $r_t$. The most popular class of models for heteroscedastic data is given by the following general family:

$$r_t = \sigma_t z^*_t , \qquad \sigma_t^2 = a^* + \sum_{j=1}^{\infty} \pi_{j-1} r_{t-j}^2 , \tag{1}$$

where $\{z^*_t : t \in \mathbb{Z}\}$ is an infinite sequence of i.i.d. disturbances, and $a^* \ge 0$ and $\{\pi_j : j \ge 0\} \subseteq \mathbb{R}_{0+}$ are the model parameters. Members of (1) include the ARCH model of Engle (1982), the GARCH model of Bollerslev (1986), the IGARCH sequences of Engle and Bollerslev (1986) and a number of other specifications. Robinson (1991) introduced the ARCH(∞) model (1) in the heteroscedasticity testing context. An extensive survey of recent theoretical advances in ARCH modeling is Giraitis, Leipus and Surgailis (2003), while an overview of the applied GARCH literature is found in Bollerslev, Engle and Nelson (1994).

Another class of stationary models for conditionally heteroscedastic data, referred to as MD–ARCH(∞), was recently studied in Koulikov (2003a) and is given as follows:

$$r_t = \sigma_t z^*_t , \qquad \sigma_t^2 = a + \sum_{j=1}^{\infty} \theta_{j-1}(r_{t-j}^2 - \sigma_{t-j}^2) , \tag{2}$$

where $a > 0$ and $\{\theta_j : j \ge 0\} \subseteq \mathbb{R}_+$ is a sequence of square–summable coefficients. This model is particularly suited for covariance stationary sequences $\{(r_t^2, \sigma_t^2) : t \in \mathbb{Z}\}$, allowing for either short or long memory in the conditional volatility process. Koulikov (2003b) demonstrates that covariance stationary non-negative solutions of (2) also belong to the ARCH(∞) family.

ARCH(∞) and MD–ARCH(∞) models define strictly stationary infinite sequences of returns and conditional volatility $\{(r_t, \sigma_t^2) : t \in \mathbb{Z}\}$, where existence of the moments $E\sigma_t^{\nu}$, for $\nu > 0$, depends on the parameters and moment assumptions on the sequence of disturbances $\{z^*_t : t \in \mathbb{Z}\}$. Nelson (1990) derives a general moment condition for the GARCH(1,1) case, including IGARCH(1,1). Ling and McAleer (2002) show necessary and sufficient conditions for the existence of higher order moments $E\sigma_t^{2i}$, for integer $i \ge 1$, in the class of GARCH(p,q) models. The moment conditions for the general ARCH(∞) family of models are derived in Giraitis, Kokoszka and Leipus (2000) and Giraitis, Leipus and Surgailis (2003), while Kazakevicius and Leipus (2002) study existence of strictly stationary ARCH(∞) sequences. Koulikov (2003a) shows second order stationarity conditions for the class of MD–ARCH(∞) models.

In this paper we introduce a family of models for heteroscedastic time series, which allow for non–stationary behavior of the conditional volatility parameter. It is defined by the following set of stochastic equations:

$$r_t^2 = \sigma_t^2 = a_0 \quad \text{for } t \le 0 ,$$

$$r_t = \sigma_t z_t , \qquad \sigma_t^2 = a_t + \sum_{j=1}^{t} \theta_{j-1}(r_{t-j}^2 - \sigma_{t-j}^2) \quad \text{for } t > 0 , \tag{3}$$

where $\{a_t : t \ge 0\} \subseteq \mathbb{R}_+$ is a non-stochastic function on the set of integers, and $\{z_t : t \ge 0\}$ are innovations with $E z_t^2 = 1$. The model implies a set of conditional distributions of the vectors $(r_t, \sigma_t^2)$ given the history of the process and its parameters. Other than this, the definition (3) places few restrictions on the time dynamics of $\{(r_t, \sigma_t^2) : t \ge 0\}$, allowing for explosive behavior, various forms of deterministic trends in $\{a_t : t \ge 0\}$, and diverging stochastic trends in the conditional volatility process. Naturally, only solutions where $\{\sigma_t^2 : t \ge 0\} \subseteq \mathbb{R}_{0+}$ will be of interest for volatility modeling. In the remainder of the paper such solutions will be referred to as cMD–ARCH processes.
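The recursion in (3) can be illustrated with a small Python sketch. It assumes that the summand in (3) is θ_{j−1}(r²_{t−j} − σ²_{t−j}), by analogy with (2), and uses standard normal innovations; the function name and the parameter values are hypothetical. As a sanity check, the usage example takes a constant a_t and geometric weights θ_j = α(α + β)^j, in which case the recursion reproduces a GARCH(1,1) variance path started at its unconditional level.

```python
import random


def simulate_cmd_arch(a, theta, n, seed=0):
    """Simulate (r_t^2, sigma_t^2) for t = 0..n from the assumed form of (3).

    a     : list of length n + 1 with the deterministic levels a_0..a_n
    theta : list of at least n weights theta_0, theta_1, ...
    """
    rng = random.Random(seed)
    sigma2 = [a[0]]  # start-up convention: r_0^2 = sigma_0^2 = a_0
    r2 = [a[0]]
    for t in range(1, n + 1):
        s2 = a[t] + sum(theta[j - 1] * (r2[t - j] - sigma2[t - j])
                        for j in range(1, t + 1))
        z = rng.gauss(0.0, 1.0)  # innovation with E z^2 = 1
        sigma2.append(s2)
        r2.append(s2 * z * z)    # r_t^2 = sigma_t^2 * z_t^2
    return r2, sigma2


if __name__ == "__main__":
    alpha, beta, omega, n = 0.1, 0.8, 0.1, 200
    level = omega / (1.0 - alpha - beta)
    a = [level] * (n + 1)
    theta = [alpha * (alpha + beta) ** j for j in range(n)]
    r2, s2 = simulate_cmd_arch(a, theta, n, seed=42)
    # GARCH(1,1) recursion driven by the same squared returns.
    g = [level]
    for t in range(1, n + 1):
        g.append(omega + alpha * r2[t - 1] + beta * g[t - 1])
    print(max(abs(u - v) for u, v in zip(s2, g)))
```

Nothing in the sketch enforces non-negativity of σ²_t for arbitrary weights, so other choices of θ_j and a_t would need to satisfy the sufficient conditions discussed in the paper.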

As shown in the paper, particular parametrizations of $\{a_t : t \ge 0\}$ and $\{\theta_j : j \ge 0\}$ in the cMD–ARCH model (3) allow the process to converge to stationary limits defined by the ARCH(∞) and MD–ARCH(∞) equations. In particular, stationary GARCH and IGARCH limits are possible. Moreover, the cMD–ARCH model permits a study of volatility processes with unknown or possibly non-stationary solutions, such as the FIGARCH process of Baillie, Bollerslev and Mikkelsen (1996) or the model (2) with non–square–summable coefficients. An attractive property of the cMD–ARCH processes is the ease of statistical inference about the set of parameters in (3). Consistency and asymptotic normality of the QML estimator of the model parameters follow from a set of sufficient conditions in Basawa, Feigin and Heyde (1976), regardless of the limiting stability of the cMD–ARCH process. This result generalizes recent findings of Jensen and Rahbek (2002, 2003), who consider non-stationary ARCH(1) and GARCH(1,1) models, to a wide class of models given by (3).

The simplicity of statistical inference coupled with a wide range of permitted parametriza-

tions of the sequences θj : j ≥ 0 and at : t ≥ 0, including those outside the stationary

regions of known ARCH(∞) and MD–ARCH(∞) processes, allows for estimation and testing

of flexible conditional volatility models for a variety of real–world financial time series. In this

paper we provide an application of the cMD–ARCH model to thirteen series of daily exchange rate

returns on major world currencies. The hypotheses of interest in this application will include

presence of a deterministic trend in at : t ≥ 0 and square–summability of θj : j ≥ 0. We

show that deterministic trends in (3), implied by the IGARCH and FIGARCH limits of the

cMD–ARCH process, are rejected in most foreign exchange rate series in our sample. The

hypothesis of square–summability of the sequence θj : j ≥ 0 is often rejected as well, in

favor of the non–square–summable coefficients. Therefore, the empirically preferred model for

our sample of foreign exchange rate returns does not belong to any known class of stationary

ARCH(∞) or MD–ARCH(∞) processes.

The paper is organized as follows. Statistical properties of the cMD–ARCH model, including the QML inference, are discussed in detail in Section 2. Section 3 presents an empirical application of the cMD–ARCH model to a sample of thirteen foreign exchange returns. Section 4 summarizes the findings and outlines plans for further research. Proofs of the main theorems are collected in the Appendix.

2 Specification and estimation of cMD–ARCH model

In this section we consider a class of cMD–ARCH models which allow for a wide range of

non–stationary behavior in the conditional second moment of heteroscedastic time series. Sub-

section 2.1 introduces the new class of volatility models and provides a number of possible

parametrizations of cMD–ARCH processes. Subsection 2.2 discusses issues related to statistical

inference on the model parameters.

2.1 Framework for non–stationary volatility modeling

Most of the current theoretical and empirical research in volatility modeling is centered around

the class of stationary ARCH(∞) and MD–ARCH(∞) models, which include popular GARCH,

IGARCH and covariance stationary long memory MD–ARCH(∞) processes for the conditional

volatility. All these models imply a set of restrictions on the parameters in stochastic equations (1) and (2) in order to ensure existence of stationary solutions $\{(r_t, \sigma_t^2) : t \in \mathbb{Z}\}$. A review of theoretical properties of the stationary ARCH(∞) models and the implied parameter restrictions is given in Giraitis, Leipus and Surgailis (2003); for the corresponding results pertaining to the covariance stationary MD–ARCH(∞) sequences refer to Koulikov (2003a).

Yet, the empirical and theoretical relevance of these restrictions has received very limited attention in the econometric literature. One of the reasons behind the lack of research in this direction is the underdeveloped statistical inference theory for non–stationary volatility models. Important progress in developing such a theory has recently been made by Jensen and Rahbek (2002, 2003) for non–stationary ARCH(1) and GARCH(1,1) models. They establish consistency and asymptotic normality of the QML estimator for the parameters of the two processes outside the stationary region shown in Nelson (1990). On the other hand, most earlier contributions dealing with statistical inference for ARCH(∞) and MD–ARCH(∞) processes assume stationarity of $\{(r_t, \sigma_t^2) : t \in \mathbb{Z}\}$; refer to Lee and Hansen (1994), Berkes, Horvath and Kokoszka (2003a) and Koulikov (2003b).

Empirically, there is substantial interest in modeling the behavior of various financial assets in potentially non–stationary environments, such as foreign exchange rates during periods of market instability. A recent contribution along these lines is Davidson (2003), who uses a volatility model with time–invariant parameters to fit both the pre– and post–crisis periods in a sample of major Asian currencies. Davidson (2003) reports remarkable stability of parameter estimates across the sub-samples and concludes that some existing volatility models, such as the FIGARCH process of Baillie, Bollerslev and Mikkelsen (1996), provide an adequate statistical tool for potentially non–stationary volatility modeling.


The goal of this paper is to introduce a formal framework for non–stationary volatility

modeling and develop a statistical inference theory in this context. To this end we define the

cMD–ARCH model in (3), where the following assumptions hold:

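A1. $\{z_t^2 - 1 : t \ge 0\}$ is a stochastic non–degenerate sequence of martingale difference innovations satisfying $E[z_t^2 - 1]^2 < \infty$ and $E[z_t^2 - 1 \mid \mathcal{F}_{t-1}] = 0$, where $\mathcal{F}_t$ is the natural filtration.

A2. $\{a_t : t \ge 0\} \subseteq \mathbb{R}_+$ and $\{\theta_j : j \ge 0\} \subseteq \mathbb{R}_+$ are non–stochastic sequences of parameters such that $\{\pi_j : j \ge 0\} \subseteq \mathbb{R}_{0+}$ and $a_t - \sum_{j=0}^{t-1} \pi_j\, a_{t-1-j} \ge 0$ for all $t > 0$, where:

$$
\pi_0 := \theta_0, \qquad \pi_j := \theta_j - \sum_{k=0}^{j-1} \theta_{j-1-k}\,\pi_k \quad \text{for } j > 0. \tag{4}
$$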

We wish to remark on the following points. A1 places few restrictions on the dependence structure and distribution of the shocks $\{z_t : t \ge 0\}$. This compares favorably with the much stronger assumptions on the sequence of shocks $\{z_t^* : t \in \mathbb{Z}\}$ in the stationary ARCH(∞) and MD–ARCH(∞) models, where the i.i.d. property is usually required. In addition, under A1 the innovations $\{r_t^2 - \sigma_t^2 : t \ge 0\}$ in the conditional volatility part of (3) are martingale differences. This follows from $\sigma_t^2 < \infty$ for all $t \ge 0$, whereby $E[\sigma_t^2 (1 - z_t^2) \mid \mathcal{F}_{t-1}] = 0$ is immediate from the definition of the process. Assumption A2 is needed to ensure a.s. non–negativity of the cMD–ARCH process, and is more general than the one in Koulikov (2003a).

For modeling the conditional volatility of financial time series, the process $\{\sigma_t^2 : t \ge 0\}$ defined by the cMD–ARCH model has to remain non–negative. The following theorem formally establishes the required result:

THEOREM 1. Under assumptions A1–A2 the sequence $\{(r_t^2, \sigma_t^2) : t \ge 0\}$ defined by the stochastic equations (3) is a.s. non–negative.
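The two conditions in A2 are straightforward to check numerically for a given parametrization. The sketch below, in Python with NumPy, computes the $\pi_j$ from the $\theta_j$ by formula (4) and evaluates $a_t - \sum_{j=0}^{t-1} \pi_j a_{t-1-j}$ for $t = 1, \dots, T$. The function names `theta_to_pi` and `a2_slack` are hypothetical helpers introduced for this illustration.

```python
import numpy as np

def theta_to_pi(theta):
    """Formula (4): pi_0 = theta_0, pi_j = theta_j - sum_{k<j} theta_{j-1-k} pi_k."""
    theta = np.asarray(theta, dtype=float)
    pi = np.empty_like(theta)
    for j in range(len(theta)):
        # theta[:j][::-1] is theta_{j-1}, ..., theta_0; pi[:j] is pi_0, ..., pi_{j-1}
        pi[j] = theta[j] - np.dot(theta[:j][::-1], pi[:j])
    return pi

def a2_slack(a, pi):
    """Return a_t - sum_{j=0}^{t-1} pi_j a_{t-1-j} for t = 1, ..., T (needs len(pi) >= T)."""
    a = np.asarray(a, dtype=float)
    T = len(a) - 1
    return np.array([a[t] - np.dot(pi[:t], a[:t][::-1]) for t in range(1, T + 1)])

def satisfies_a2(a, theta, tol=0.0):
    """Check non-negativity of all pi_j and of the slack terms up to the sample size."""
    pi = theta_to_pi(theta)
    return bool((pi >= -tol).all() and (a2_slack(a, pi) >= -tol).all())
```

For example, a linear trend $a_t = a_0 + c\,t$ combined with constant weights $\theta_j = \gamma$, $0 < \gamma < 1$, passes the check for any finite horizon.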

Examples 1 to 4 provide a number of possible parametrizations of the cMD–ARCH model and

demonstrate applicability of the non–negativity condition in A2:

EXAMPLE 1. One implication of the IGARCH process of Engle and Bollerslev (1986) is that the volatility forecast conditional on the current information set increases linearly with the forecast horizon. This is a manifestation of the fact that the stationary $\{(r_t, \sigma_t^2) : t \in \mathbb{Z}\}$ process defined by the IGARCH equations has no finite integer moments, including the first moment. Let $\{z_t : t \ge 0\}$ be a sequence of shocks satisfying A1, and let $r_0^2 = \sigma_0^2 = a_0$ denote an arbitrary non–negative initial value, from which we wish to recursively construct a sequence $\{(r_t, \sigma_t^2) : t \ge 0\}$ according to the usual IGARCH equations:

$$
r_t = \sigma_t z_t, \qquad \sigma_t^2 = c + \gamma\,\sigma_{t-1}^2 + (1-\gamma)\,r_{t-1}^2,
$$

where the parameters satisfy $0 < \gamma < 1$ and $c \in \mathbb{R}_+$. Then the corresponding cMD–ARCH model is given by:

$$
\begin{aligned}
r_t^2 &= \sigma_t^2 = a_0 && \text{for } t \le 0, \\
r_t &= \sigma_t z_t, \qquad \sigma_t^2 = a_0 + c\,t + \sum_{j=1}^{t} \gamma\,\bigl(r_{t-j}^2 - \sigma_{t-j}^2\bigr) && \text{for } t > 0.
\end{aligned}
\tag{5}
$$


In this representation $a_t = a_0 + c\,t$ contains a linear trend, corresponding to the linear component in the IGARCH conditional volatility forecast. The sequence of martingale difference innovations in (5) forms a stochastic trend in $\{\sigma_t^2 : t \ge 0\}$.

In this model, the sequence of $\pi_j$-s defined in (4) is given by $\pi_j = \gamma (1-\gamma)^j$ for all $j \ge 0$. The following condition has to be checked in order to establish A2:

$$
a_t - \sum_{j=0}^{t-1} \pi_j\, a_{t-1-j}
= a_0 \Bigl[ 1 - \gamma \sum_{j=0}^{t-1} (1-\gamma)^j \Bigr]
+ c \Bigl[ t - \gamma \sum_{j=0}^{t-1} (1-\gamma)^j (t-1-j) \Bigr]
\ge c > 0 \quad \text{for all } t > 0,
$$

by the inequality $0 < \gamma \sum_{j=0}^{t} (1-\gamma)^j \le 1$ for all $t \ge 0$. Hence, by Theorem 1, the conditional volatility part of (5) is a.s. non–negative.
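For completeness, the closed form $\pi_j = \gamma(1-\gamma)^j$ quoted above can be obtained from (4) by induction. This is a short worked derivation, assuming only that the weights in (5) are constant, $\theta_j = \gamma$ for all $j \ge 0$:

$$
\begin{aligned}
\pi_0 &= \theta_0 = \gamma, \\
\pi_j &= \gamma - \gamma \sum_{k=0}^{j-1} \pi_k
      = \gamma \Bigl[ 1 - \gamma \sum_{k=0}^{j-1} (1-\gamma)^k \Bigr]
      = \gamma \bigl[ 1 - \bigl( 1 - (1-\gamma)^j \bigr) \bigr]
      = \gamma (1-\gamma)^j ,
\end{aligned}
$$

where the second equality uses the induction hypothesis $\pi_k = \gamma(1-\gamma)^k$ for $k < j$ and the third is the finite geometric sum. The same geometric sum gives $\sum_{j=0}^{t} \pi_j = 1 - (1-\gamma)^{t+1} \le 1$, which is the inequality used in the non–negativity argument.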

Finally, we note that unlike in the stationary IGARCH model, the sequence of shocks in the cMD–ARCH sequence (5) need not be i.i.d. In this case, however, convergence of $\{(r_t, \sigma_t^2) : t \ge 0\}$ to the limiting IGARCH process cannot be shown. In the i.i.d. case the result follows from Theorem 2 of Nelson (1990).

EXAMPLE 2. The FIGARCH process of Baillie, Bollerslev and Mikkelsen (1996) belongs to the family of ARCH(∞) sequences (1), with the coefficients $\pi_j = O(j^{-1-d})$ converging to zero at a relatively slow hyperbolic rate controlled by a parameter $0 < d < 1$. Similarly to the IGARCH case, the sequence $\{\pi_j : j \ge 0\}$ in the FIGARCH model sums to unity. However, existence and properties of the stationary solution $\{(r_t, \sigma_t^2) : t \in \mathbb{Z}\}$ satisfying the FIGARCH equations have not been established in the literature, see Giraitis, Leipus and Surgailis (2003).

While the limiting properties of the FIGARCH model remain unknown, its empirical applications on samples of financial data are feasible, where the infinite series in (1) is truncated at the sample size. This effectively overcomes the problem of potential non–stationarity of the model. In the cMD–ARCH framework the truncated FIGARCH process has the following representation:

$$
\begin{aligned}
r_t^2 &= \sigma_t^2 = a_0 && \text{for } t \le 0, \\
r_t &= \sigma_t z_t, \qquad \sigma_t^2 = a_0 + a^* \sum_{j=1}^{t} \theta_{j-1} + \sum_{j=1}^{t} \theta_{j-1}\,\bigl(r_{t-j}^2 - \sigma_{t-j}^2\bigr) && \text{for } t > 0,
\end{aligned}
\tag{6}
$$

where $a_0 \in \mathbb{R}_+$ is a starting value of the process, and the sequence of coefficients $\{\theta_j : j \ge 0\}$ is defined from $\{\pi_j : j \ge 0\}$ by inverting the formula (4). It can be shown that for the hyperbolic $\pi_j = O(j^{-1-d})$ the corresponding $\theta_j = O(j^{-1+d})$, therefore the sequence $\{\theta_j : j \ge 0\}$ is not absolutely summable.

The trend component in (6) is given by $a_t = a_0 + a^* \sum_{j=1}^{t} \theta_{j-1}$, where its asymptotic rate $O(t^d)$ is slower than the corresponding linear trend in model (5). The stochastic part of the model consists of a weighted sequence of martingale difference innovations $\{r_t^2 - \sigma_t^2 : t \ge 0\}$, where the weighting coefficients $\{\theta_j : j \ge 0\}$ are square–summable for $0 < d < \tfrac{1}{2}$, a half of the permitted parameter range.
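The summability thresholds can be read off from a comparison with a $p$-series. The following is a heuristic sketch, assuming the exact power-law behavior $\theta_j \sim C\, j^{-(1-d)}$ for some constant $C > 0$, which is slightly stronger than the $O(\cdot)$ bound stated above:

$$
\sum_{j \ge 1} \theta_j \asymp \sum_{j \ge 1} j^{-(1-d)} = \infty \quad \text{for all } 0 < d < 1,
\qquad
\sum_{j \ge 1} \theta_j^2 \asymp \sum_{j \ge 1} j^{-2(1-d)} < \infty
\iff 2(1-d) > 1 \iff d < \tfrac{1}{2},
$$

$$
\sum_{j=1}^{t} \theta_{j-1} \asymp \int_{1}^{t} x^{-(1-d)}\,dx = \frac{t^{d} - 1}{d} = O(t^{d}).
$$

The first line gives non-summability and the square–summability boundary at $d = \tfrac{1}{2}$; the second line gives the $O(t^d)$ growth rate of the trend component.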

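Non–negativity of the conditional volatility part in (6) holds by Theorem 1 and the following argument:

$$
\begin{aligned}
a_t - \sum_{j=0}^{t-1} \pi_j\, a_{t-1-j}
&= a_0 \Bigl[ 1 - \sum_{j=0}^{t-1} \pi_j \Bigr]
 + a^* \sum_{j=0}^{t-1} \Bigl[ \theta_j - \pi_j \sum_{i=0}^{t-2-j} \theta_i \Bigr] \\
&= a_0 \Bigl[ 1 - \sum_{j=0}^{t-1} \pi_j \Bigr] + a^* \sum_{j=0}^{t-1} \pi_j
 \;\ge\; a^* \sum_{j=0}^{t-1} \pi_j > 0 \quad \text{for all } t > 0,
\end{aligned}
\tag{7}
$$

where we use (4) and the inequality $0 < \sum_{j=0}^{t} \pi_j \le 1$ for all $t \ge 0$.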
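The inversion of formula (4) mentioned in this example can be carried out recursively, since (4) rearranges to $\theta_j = \pi_j + \sum_{k=0}^{j-1} \theta_{j-1-k}\,\pi_k$. The sketch below is in Python with NumPy. It assumes, for illustration only, the FIGARCH(0,d,0) special case in which the ARCH(∞) weights are the coefficients of $1 - (1-L)^d$, indexed so that $\pi_0$ multiplies $r_{t-1}^2$; the function names are hypothetical.

```python
import numpy as np

def figarch_pi(d, n):
    """First n weights pi_0..pi_{n-1} of 1 - (1 - L)^d (assumed FIGARCH(0,d,0) case)."""
    pi = np.empty(n)
    pi[0] = d
    for j in range(1, n):
        pi[j] = pi[j - 1] * (j - d) / (j + 1)
    return pi

def pi_to_theta(pi):
    """Invert (4): theta_j = pi_j + sum_{k<j} theta_{j-1-k} pi_k."""
    pi = np.asarray(pi, dtype=float)
    theta = np.empty_like(pi)
    for j in range(len(pi)):
        theta[j] = pi[j] + np.dot(theta[:j][::-1], pi[:j])
    return theta

def frac_weights(d, n):
    """Coefficients psi_0..psi_{n-1} of (1 - z)^(-d)."""
    psi = np.ones(n)
    for k in range(1, n):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi
```

Under the stated assumption a generating-function argument gives $\theta_j = \psi_{j+1}$, where $\psi_k$ are the coefficients of $(1-z)^{-d}$, so `pi_to_theta(figarch_pi(d, n))` should agree with `frac_weights(d, n + 1)[1:]`. This provides a numerical check of the recursion and displays the slow $j^{-1+d}$ decay of the $\theta_j$.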

EXAMPLE 3. The class of stationary GARCH(p,q) sequences of Bollerslev (1986) and the long memory MD–ARCH(∞) model of Koulikov (2003a) have representation (2); for a detailed discussion refer to Koulikov (2003a, 2003b). By starting the process at a non–negative value $r_0^2 = \sigma_0^2 = a_0$ and truncating the infinite series in equation (2) at $t$, the cMD–ARCH model can be defined as:

$$
\begin{aligned}
r_t^2 &= \sigma_t^2 = a_0 && \text{for } t \le 0, \\
r_t &= \sigma_t z_t, \qquad \sigma_t^2 = a_0 + \sum_{j=1}^{t} \theta_{j-1}\,\bigl(r_{t-j}^2 - \sigma_{t-j}^2\bigr) && \text{for } t > 0,
\end{aligned}
\tag{8}
$$

where the sequence of coefficients $\{\theta_j : j \ge 0\}$ is at least square–summable. It follows from Theorem 3 of Koulikov (2003a) that the cMD–ARCH model (8) converges to a stationary MD–ARCH(∞) limit when $\{z_t : t \ge 0\}$ are i.i.d. shocks.

In contrast to models (5) and (6) in the previous examples, the sequence $\{a_t : t \ge 0\}$ in (8) is constant for all $t \ge 0$. The stochastic part of the model is similar to that in (6), with an added square–summability requirement on the $\theta_j$-s. The latter is needed for the three series theorem to hold, ensuring a.s. convergence of the Volterra series expansion of (8); for details refer to Koulikov (2003a).

Non–negativity of the $\{\sigma_t^2 : t \ge 0\}$ process in (8) follows by the same arguments as in Example 2, on replacing $a^*$ in (7) by 0 and observing that the inequality $\sum_{j=0}^{t} \pi_j \le 1$ holds for the class of GARCH and MD–ARCH(∞) models by Theorem 2.1 in Giraitis, Kokoszka and Leipus (2000) and Theorem 1 in Koulikov (2003b).

EXAMPLE 4. The cMD–ARCH models (5), (6) and (8) in the previous examples are based on specifications of known members of the ARCH(∞) or MD–ARCH(∞) family. They imply different forms of the deterministic and stochastic parts of the conditional volatility process $\{\sigma_t^2 : t \ge 0\}$. In this example we combine ideas of the three models into one general cMD–ARCH specification as follows:

$$
\begin{aligned}
r_t^2 &= \sigma_t^2 = a_0 && \text{for } t \le 0, \\
r_t &= \sigma_t z_t, \qquad \sigma_t^2 = a_0 + c\,t^{\alpha} + \gamma\,(1 - \phi L)^{-1} (1 - L)^{-d} \bigl(r_{t-1}^2 - \sigma_{t-1}^2\bigr) && \text{for } t > 0,
\end{aligned}
\tag{9}
$$


where $L$ denotes the lag operator, and the parameters satisfy $a_0, c \in \mathbb{R}_+$, $0 \le d, \alpha, \gamma \le 1$ and $|\phi| < 1$. A number of cross restrictions on these parameters have to be imposed to ensure a.s. non–negativity of the conditional variance.

This process nests all previous specifications in Examples 1 to 3 as special cases. In particular, the sequence of coefficients $\{\theta_j : j \ge 0\}$ from the generating function $\gamma\,(1 - \phi z)^{-1}(1 - z)^{-d}$ satisfies $\sum_{j=0}^{\infty} \theta_j^{\nu} < \infty$ for $\nu > \frac{1}{1-d}$ and $0 < d < 1$, and is absolutely summable when $d = 0$ and $0 < \phi < 1$. On the other extreme, the case $d = 1$, $\phi = 0$ and $0 < \gamma < 1$ corresponds to model (5) in Example 1.
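The weights $\theta_j$ implied by the generating function $\gamma\,(1-\phi z)^{-1}(1-z)^{-d}$ can be computed with two short recursions. The sketch below, in Python with NumPy, first expands $(1-z)^{-d}$ and then applies the AR(1) filter $(1-\phi z)^{-1}$; the function name `cmd_arch_theta` is an assumption made for this illustration.

```python
import numpy as np

def cmd_arch_theta(gamma, d, phi, n):
    """First n coefficients of gamma * (1 - phi z)^(-1) * (1 - z)^(-d)."""
    # psi_k: coefficients of (1 - z)^(-d), psi_0 = 1, psi_k = psi_{k-1} (k - 1 + d) / k
    psi = np.ones(n)
    for k in range(1, n):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    # (1 - phi z) Theta(z) = gamma Psi(z)  =>  theta_j = phi theta_{j-1} + gamma psi_j
    theta = np.empty(n)
    theta[0] = gamma * psi[0]
    for j in range(1, n):
        theta[j] = phi * theta[j - 1] + gamma * psi[j]
    return theta
```

The two limiting cases named in the text serve as checks: $d = 1$, $\phi = 0$ gives constant weights $\theta_j = \gamma$ as in model (5), and $d = 0$ gives geometrically decaying weights $\theta_j = \gamma\,\phi^j$.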

The deterministic part of the conditional volatility process in (9) allows for substantial flexibility in $\{a_t : t \ge 0\}$, ranging from a linear trend resembling the one in model (5) to the absence of any deterministic time dependence in $\{a_t : t \ge 0\}$ similar to (8). Parameter $c$ permits the importance of deterministic time behavior to vary.

To establish non–negativity of model (9) one needs to restrict the parameters of the model such that A2 is satisfied. The closed–form expressions for $\{\pi_j : j \ge 0\}$ are somewhat complicated, but sufficient conditions for non–negativity of the $\pi_j$-s are given by $d(1 - d - 2\phi) \ge 0$ and $d - \gamma + \phi \ge 0$; refer to Koulikov (2003a). For sufficiently large $d$ these conditions permit a negative $\phi$ parameter. The second condition in A2 follows by:

$$
a_t - \sum_{j=0}^{t-1} \pi_j\, a_{t-1-j}
= a_0 \Bigl[ 1 - \sum_{j=0}^{t-1} \pi_j \Bigr]
+ c \Bigl[ t - \sum_{j=0}^{t-1} \pi_j\, (t-1-j)^{\alpha} \Bigr]
\ge c > 0 \quad \text{for all } t > 0,
$$

by the inequality $0 < \sum_{j=0}^{t} \pi_j \le 1$ according to Koulikov (2003a).

2.2 The quasi–maximum likelihood estimation

A substantial amount of research in the theoretical GARCH literature has been devoted to the issue of statistical inference for this family of time series models. However, most of this work assumes a stationary ARCH(∞) or MD–ARCH(∞) process $\{(r_t, \sigma_t^2) : t \in \mathbb{Z}\}$, from which a finite realization of returns $\{r_t : 1 \le t \le T\}$ is observed. Recent contributions within this framework include Berkes, Horvath and Kokoszka (2003a), Robinson and Zaffaroni (2003), Hall and Yao (2003) and Koulikov (2003b); for a broader review of the literature refer to Giraitis, Leipus and Surgailis (2003). When stationarity of the data generating process of returns and conditional volatility is not assumed, the asymptotic properties of the QML estimator are established only for some special cases. Jensen and Rahbek (2002, 2003) show consistency and asymptotic normality of the QML estimator of the non–intercept parameters of non–stationary explosive ARCH(1) and GARCH(1,1) processes. They note that in contrast to the unit root and explosive non–stationarities in the linear time series models, the properties of the QML estimator in non–stationary volatility processes remain similar to the stationary case. In this subsection we extend their results to the class of cMD–ARCH sequences (3), including the two cases considered in Jensen and Rahbek (2002, 2003).


Let an observed sequence of returns of length $T \ge 1$ satisfy the set of equations (3). Let $\{a_t(u) : t \ge 0\}$ and $\{\theta_j(u) : j \ge 0\}$ be functions of a vector $u$, such that for every $u \in U$ A2 is satisfied, where $U$ is a subset of the finite–dimensional Euclidean space. Let the unknown parameter vector be given by $u_0 \in U$, and let the sequence of shocks $\{z_t : t \ge 0\}$ in the data generating process satisfy A1. We wish to obtain statistical inference on $u_0$ based on the sample $\{r_t : 1 \le t \le T\}$ by maximizing the following stochastic function:

$$
L_T(u) = - \sum_{t=1}^{T} \Bigl[ \log \sigma_t^2(u) + \frac{r_t^2}{\sigma_t^2(u)} \Bigr], \tag{10}
$$

where the sequence of non–negative functions $\{\sigma_t^2(u) : 1 \le t \le T\}$ on $U$ is defined as:

$$
\begin{aligned}
r_t^2 &= \sigma_t^2 = a_0(u) && \text{for } t \le 0, \\
\sigma_t^2(u) &= a_t(u) + \sum_{j=1}^{t} \theta_{j-1}(u) \bigl[ r_{t-j}^2 - \sigma_{t-j}^2(u) \bigr] && \text{for } 1 \le t \le T .
\end{aligned}
$$

It is important to note the following: in contrast to most previously cited studies, where properties of the QML estimator are derived under the assumption of stationarity, the sequence $\{\sigma_t^2(u_0) : 1 \le t \le T\}$ gives true values of the conditional volatility parameter in the sample, not just a consistent approximation. In view of this fact and in order to save on notation we write $\sigma_t^2 = \sigma_t^2(u_0)$ for every $1 \le t \le T$. The superscript $(n)$ next to a function denotes its $n$-th derivative with respect to $u$.
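The objective function (10) is inexpensive to evaluate for given sequences $a_t(u)$ and $\theta_j(u)$. The following minimal sketch in Python with NumPy takes the already evaluated sequences as inputs, so the mapping from $u$ to these sequences is left to the user; the function name `quasi_loglik` and its argument layout are assumptions made for this illustration.

```python
import numpy as np

def quasi_loglik(r, a, theta):
    """Evaluate L_T in (10) and the fitted sigma_t^2 sequence.

    r     : observed returns r_1, ..., r_T
    a     : a_0(u), ..., a_T(u) (length T+1)
    theta : theta_0(u), ..., theta_{T-1}(u) (length >= T)
    """
    r = np.asarray(r, dtype=float)
    T = len(r)
    r2 = np.concatenate(([a[0]], r ** 2))   # pre-sample value r_0^2 = a_0(u)
    sigma2 = np.empty(T + 1)
    sigma2[0] = a[0]                        # sigma_0^2 = a_0(u)
    for t in range(1, T + 1):
        innov = (r2[:t] - sigma2[:t])[::-1]  # r_{t-j}^2 - sigma_{t-j}^2, j = 1..t
        sigma2[t] = a[t] + np.dot(theta[:t], innov)
    s = sigma2[1:]
    loglik = -np.sum(np.log(s) + r2[1:] / s)
    return loglik, sigma2
```

The function returns the pair `(L_T, sigma2)`. In practice the first element would be passed to a numerical optimizer over $u$, with the parameter space restricted so that A2 holds and every $\sigma_t^2(u)$ stays positive.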

The following additional regularity conditions on the parameters $\{a_t(u) : t \ge 0\}$ and $\{\theta_j(u) : j \ge 0\}$ are needed for Theorem 2:

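A3. For a non–zero vector $v$ with Euclidean norm $\|v\| = 1$ and a non–negative finite constant $C$ let:

$$
v' \,\Biggl|\, \log \Bigl[ a_t(u_0) - \sum_{j=0}^{t-1} \pi_j(u_0)\, a_{t-1-j}(u_0) \Bigr]^{(1)}
+ \sum_{j=1}^{t} \bigl[ \log \pi_{j-1}(u_0) \bigr]^{(1)} \Biggr| \le C \quad \text{for all } t \ge 0 .
$$

A4. The following function is continuous element–wise in $u$ in a neighborhood of $u_0$:

$$
\Bigl[ a_t(u_0) - \sum_{j=0}^{t-1} \pi_j(u_0)\, a_{t-1-j}(u_0) \Bigr]^{(2)}
+ \sum_{j=1}^{t} \bigl[ \pi_{j-1}(u_0) \bigr]^{(2)} \quad \text{for all } t \ge 0 .
$$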

Proof of the following result is based on Basawa, Feigin and Heyde (1976), and Jensen and

Rahbek (2002, 2003):

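THEOREM 2. Under A1–A4, let the sequence of estimators $\{\hat{u}_T : T \ge 1\}$ be defined by:

$$
\hat{u}_T = \arg\max_{u \in U} \frac{1}{T} L_T(u) ,
$$

where the log–likelihood function $L_T(u)$ is given in (10). Then:

$$
T^{1/2} (\hat{u}_T - u_0) \xrightarrow{\;d\;} N\bigl(0, A_0^{-1} B_0 A_0^{-1}\bigr) \quad \text{as } T \to \infty ,
$$

where $A_0 := \lim_{T \to \infty} E\bigl[ \tfrac{1}{T} L_T^{(2)}(u_0) \bigr]$, $B_0 := \lim_{T \to \infty} E\bigl[ \tfrac{1}{T} L_T^{(1)}(u_0)\, L_T^{(1)}(u_0)' \bigr]$, and $N(0, S)$ denotes a multivariate normal distribution with mean 0 and variance $S$.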


In the empirical part of the paper we estimate the matrices $A_0$ and $B_0$ by the numerical Hessian and outer product of gradients using the sample log–likelihood function evaluated at the consistent estimates $\hat{u}_T$.
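A sandwich covariance estimate of this kind can be sketched as follows in Python with NumPy. The code assumes a user-supplied function `loglik_terms(u)` returning the $T$ per-observation contributions to (10); that function, the name `sandwich_cov`, the central-difference scheme and the fixed step `h` are all choices made for this illustration and are not taken from the paper's own routines. With parameters of very different magnitudes the step would need to be scaled per coordinate.

```python
import numpy as np

def sandwich_cov(loglik_terms, u_hat, h=1e-4):
    """Estimate Var(u_hat) = A^{-1} B A^{-1} / T by finite differences.

    loglik_terms(u) must return an array with the T per-observation terms l_t(u).
    """
    u_hat = np.asarray(u_hat, dtype=float)
    k = u_hat.size
    T = loglik_terms(u_hat).size
    e = np.eye(k)

    # per-observation scores (T x k) by central differences
    scores = np.column_stack([
        (loglik_terms(u_hat + h * e[i]) - loglik_terms(u_hat - h * e[i])) / (2 * h)
        for i in range(k)
    ])
    B = scores.T @ scores / T          # outer product of gradients

    def f(u):
        return loglik_terms(u).mean()  # average log-likelihood

    # numerical Hessian of the average log-likelihood
    A = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            A[i, j] = (f(u_hat + h * e[i] + h * e[j]) - f(u_hat + h * e[i] - h * e[j])
                       - f(u_hat - h * e[i] + h * e[j]) + f(u_hat - h * e[i] - h * e[j])) / (4 * h * h)
    A_inv = np.linalg.inv(A)
    return A_inv @ B @ A_inv / T
```

Standard errors are the square roots of the diagonal of the returned matrix. For a constant-variance model with $l_t(u) = -(\log u + x_t^2/u)$, the estimate at $\hat{u} = T^{-1}\sum_t x_t^2$ reduces to the sample variance of $x_t^2$ divided by $T$, which is a convenient check.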

The vector of QML estimates $\hat{u}_T$ allows one to calculate estimates of the unobserved sequence of true shocks $\{z_t : 1 \le t \le T\}$ as follows:

$$
\hat{z}_t = \operatorname{sign}(r_t) \cdot \sqrt{ \frac{r_t^2}{\sigma_t^2(\hat{u}_T)} } , \tag{11}
$$

where $\operatorname{sign}(r_t)$ returns $-1$ for $r_t < 0$ and $1$ for $r_t \ge 0$. It follows from Theorem 2 that $\{\hat{z}_t : 1 \le t \le T\}$ are consistent for the sequence of true shocks. In Section 3 we assess the fit of empirical foreign exchange volatility models by checking adequacy of A1 in the sequence of estimated residuals $\{\hat{z}_t : 1 \le t \le T\}$. However, we note that statistical properties of the diagnostic tests based on (11) remain unknown, and are likely to depend on the corresponding properties of $\{\hat{u}_T : T \ge 1\}$. Asymptotic properties of the autocorrelation tests based on $\{\hat{z}_t^2 : 1 \le t \le T\}$ have recently been shown in Berkes, Horvath and Kokoszka (2003b) for the stationary GARCH(p,q) model.
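The residuals in (11) and a portmanteau statistic on them can be computed as in the following Python sketch. The textbook Box–Pierce form $Q(m) = T \sum_{k=1}^{m} \hat{\rho}_k^2$ is used here as an assumption; the exact variant behind the reported statistics may differ, and the function names are hypothetical.

```python
import numpy as np

def standardized_residuals(r, sigma2):
    """Equation (11): sign(r_t) * sqrt(r_t^2 / sigma_t^2), with sign(0) taken as +1."""
    r = np.asarray(r, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    sign = np.where(r < 0, -1.0, 1.0)
    return sign * np.sqrt(r ** 2 / sigma2)

def box_pierce(x, m):
    """Box-Pierce statistic Q(m) = T * sum of the first m squared sample autocorrelations."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    T = len(x)
    denom = np.dot(x, x)
    rho = np.array([np.dot(x[k:], x[:-k]) / denom for k in range(1, m + 1)])
    return T * np.sum(rho ** 2)
```

Applying `box_pierce` to the residuals and to their squares with `m = 100` corresponds to the kind of $Q(100)$ and $Q^2(100)$ diagnostics discussed in the text, under the stated assumption about the form of the statistic.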

3 Application to foreign exchange returns

This section provides an empirical illustration of the new class of conditional volatility models

introduced in section 2. A general overview of the dataset is given in subsection 3.1, followed

by the empirical results in subsection 3.2.

3.1 Data and descriptive statistics

The empirical application of the cMD–ARCH model in this section is based on a sample of thirteen

major European and Asian foreign exchange rate returns series. The dataset contains ten years

of daily foreign exchange rates against the US dollar, from which compound daily returns are

calculated by the usual methodology. All series are obtained from Datastream and cover 2576

returns from 1st January 1994 to 18th November 2003.

The exchange rate series for nine European countries are included in the dataset: Denmark, Finland, Germany, Ireland, Italy, Portugal, Spain, Switzerland, and the United Kingdom. In January 1999 six of these countries became members of the common European currency area and adopted the euro as their national currency. This is likely to affect a number of results reported in the next subsection due to the common dynamics of the six currencies against the US dollar.

In addition, the dataset contains foreign exchange returns series for four major Asian

economies: Indonesia, Japan, South Korea, and Taiwan. The period of Asian economic crisis

during 1997–1998 is covered by the sample. Because of this, and due to a relatively sluggish

economic development in Asia during the last decade in comparison to Europe and America, we

expect to find substantial differences in volatility modeling results for this part of the sample.


Table 1: Descriptive statistics of foreign exchange returns series.

Series  Mean           Variance      Skewness    Kurtosis   Q(100)     Q^2(100)
DKK     -2.870·10^-5   3.483·10^-5   -0.28542*   4.49634*   99.6128    260.666*
FIM     -5.362·10^-5   3.528·10^-5   -0.25159*   4.21366*   95.3173    289.506*
DEM     -1.767·10^-5   3.677·10^-5   -0.27849*   4.57641*   102.359    372.140*
IEP     -2.339·10^-5   3.219·10^-5   -0.18452*   4.73030*   98.6883    389.494*
ITL     -1.606·10^-5   3.297·10^-5   -0.19310*   4.59435*   98.9594    527.978*
PTE     -1.483·10^-5   3.473·10^-5   -0.27748*   4.54844*   108.759    338.646*
ESP     -4.853·10^-6   3.373·10^-5   -0.24890*   4.46264*   104.279    307.245*
CHF     -4.554·10^-5   4.405·10^-5   -0.32785*   5.06566*   89.1619    373.699*
GBP     -5.164·10^-5   2.102·10^-5   -0.09735*   4.73871*   99.0589    188.816*
IDR      5.399·10^-4   3.846·10^-4    1.49429*   44.4082*   664.571*   5083.84*
JPY     -9.539·10^-6   5.152·10^-5   -0.65022*   8.54790*   139.409    941.560*
KRW      1.482·10^-4   1.001·10^-4   -0.75246*   114.191*   1194.46*   3299.99*
TWD      9.443·10^-5   7.410·10^-6    2.56988*   54.5937*   320.633*   737.627*

Notes: The table reports the respective statistics for the sample of returns {r_t : 1 ≤ t ≤ T}, where T = 2576. The series are denoted using the international currency codes. Except for mean and variance, a star next to a reported test statistic indicates significance at the 5% level for the appropriate distribution. The columns Q(100) and Q^2(100) show Box–Pierce statistics for returns and squared returns, respectively, see Li and Mak (1994). Skewness and kurtosis statistics are according to Jarque and Bera (1987).


Summary statistics of the returns series in the dataset are given in Table 1. The reported results point to significant deviations from Gaussianity in the distribution of foreign exchange rate returns, a well known result in the empirical volatility modeling literature. For the sample of four Asian currencies the deviation appears to be especially pronounced, and can be attributed to the effects of the 1997–1998 crisis.

Strong dynamic dependence in the squared returns is present in all foreign exchange series. In addition, the IDR, KRW and TWD series have significant autocorrelation in the returns. This is likely to be an effect of the rapid depreciation of these currencies, where during the crisis period a long sequence of negative returns against the US dollar is introduced into the sample. We use a simple AR(1) filter to eliminate these dependencies before estimating conditional volatility models in the next subsection.
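An AR(1) pre-filter of this kind could look as follows. The text does not state how the filter was estimated, so the sketch assumes ordinary least squares with an intercept; the function name `ar1_filter` is hypothetical.

```python
import numpy as np

def ar1_filter(x):
    """Fit x_t = c + phi * x_{t-1} + e_t by OLS (an assumed estimator) and return residuals.

    Returns (residuals e_2..e_T, c, phi); one observation is lost to the lag.
    """
    x = np.asarray(x, dtype=float)
    y, lag = x[1:], x[:-1]
    X = np.column_stack([np.ones_like(lag), lag])
    (c, phi), *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - c - phi * lag, c, phi
```

The residuals would then take the place of the raw returns when the conditional volatility model is estimated.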

3.2 Empirical results

In this subsection we discuss an empirical application of the cMD–ARCH model to foreign

exchange data. As shown in section 2, the class of cMD–ARCH models imposes very few

restrictions on the parameters of the conditional volatility process, allowing for non–stationary

and explosive types of behavior. At the same time, stationary limiting processes, such as IGARCH

and long memory covariance stationary MD–ARCH(∞) sequences, are also included. The QML

estimator of the cMD–ARCH parameters is shown to be consistent and asymptotically normal.

The goal of the modeling exercise in this section is to use previously developed ideas on the

real world data, testing for possible presence of non–stationary behavior in foreign exchange

returns.

The empirical heteroscedasticity model of foreign exchange returns in this subsection is

based on the cMD–ARCH specification (9). Recall from Example 4 that this specification

includes both the stationary and non–stationary limits, as well as a flexible deterministic part

in the conditional volatility process. In particular, an empirically important IGARCH model

of Engle and Bollerslev (1986) is nested within the specification (9). This allows for testing

the hypothesis of integrated volatility in the sense of Engle and Bollerslev (1986) against a

number of other nested stationary and non–stationary limiting processes. Among these, the

processes with non–zero fractional integration parameter d are of special interest. As shown

in Example 4, $0 < d < 1$ implies non–summable coefficients $\{\theta_j : j \ge 0\}$ in the general

cMD–ARCH model (3), where the square–summability of $\{\theta_j : j \ge 0\}$ and thus the covariance

stationary MD–ARCH(∞) limit of (9) holds for $0 < d < \tfrac{1}{2}$.

During the estimation, the parameter a0 in the cMD–ARCH model (9) has to be fixed in

order to avoid identification issues between a0 and c when α = 0. In all empirical volatility

models estimated in this subsection we fix $a_0$ at the sample average of $\{r_t^2 : 1 \le t \le T\}$.

For the second–order stationary returns processes this choice of a0 naturally corresponds to

the unconditional expected value of the volatility, refer to Example 3. For other cases, a0

represents an unobserved pre–sample value of the squared returns process, permitting a range


Table 2: cMD–ARCH modeling results for thirteen series of foreign exchange returns.

Series  a0             c                α         γ           d          φ
DKK     4.2487·10^-09  2.4097·10^-12    1.8823    0.023134    0.09493    0.98374
        (fixed)        (2.9590·10^-11)  (1.5899)  (0.006463)  (0.11292)  (0.00795)
FIM     4.0123·10^-09  1.7955·10^-19    2.1433    0.021467    0.29847    0.94637
        (fixed)        (1.1868·10^-17)  (2.1034)  (0.005488)  (0.17020)  (0.03482)
DEM     4.8389·10^-09  4.2332·10^-13    2.0502    0.029754    0.01376    0.98639
        (fixed)        (7.0662·10^-12)  (2.1604)  (0.007198)  (0.09756)  (0.00602)
IEP     3.8691·10^-09  5.0261·10^-10    1.4605    0.008056    0.73252    0.89955
        (fixed)        (2.1124·10^-09)  (0.5204)  (0.001789)  (0.12003)  (0.04581)
ITL     3.9091·10^-09  2.1709·10^-10    1.6522    0.027258    0.78169    0.69117
        (fixed)        (9.6382·10^-10)  (0.5444)  (0.005660)  (0.08346)  (0.10746)
PTE     4.2839·10^-09  4.4976·10^-11    1.6040    0.021412    0.44304    0.90978
        (fixed)        (3.8021·10^-10)  (1.0683)  (0.004465)  (0.18011)  (0.06193)
ESP     3.9411·10^-09  8.7586·10^-11    1.5713    0.017922    0.60947    0.84159
        (fixed)        (6.8220·10^-10)  (0.9834)  (0.003846)  (0.10145)  (0.06143)
CHF     7.9079·10^-09  1.2588·10^-14    2.5932    0.097256    0.54513    -0.34532
        (fixed)        (2.0515·10^-13)  (2.0742)  (0.014078)  (0.06818)  (0.12570)
GBP     1.6547·10^-09  2.9505·10^-14    2.3072    0.044645    -0.15794   0.98908
        (fixed)        (6.8264·10^-13)  (3.0115)  (0.011270)  (0.08410)  (0.00408)
IDR     6.4466·10^-06  1.1387·10^-13    3.0680    0.266080    0.99317    -0.34217
        (fixed)        (7.4904·10^-14)  (0.0883)  (0.014152)  (0.00286)  (0.04295)
JPY     2.0041·10^-08  2.5396·10^-10    1.5101    0.089290    0.66882    0.20970
        (fixed)        (2.9743·10^-09)  (1.4339)  (0.016493)  (0.06534)  (0.22682)
KRW     1.1327·10^-06  1.5517·10^-10    2.0602    0.389410    0.96060    -0.42893
        (fixed)        (8.6000·10^-11)  (0.0715)  (0.021410)  (0.01093)  (0.03143)
TWD     2.9625·10^-09  5.7081·10^-08    1.1764    0.457370    0.91817    -0.35245
        (fixed)        (2.3774·10^-08)  (0.0590)  (0.019184)  (0.02405)  (0.03477)

Notes: The table reports estimation results of the cMD–ARCH model (9) on thirteen samples of foreign exchange returns {r_t : 1 ≤ t ≤ T}, where T = 2576. The series are denoted using the international currency codes. Notation of the model parameters corresponds to that in (9). Asymptotic maximum–likelihood standard errors are given in parentheses below the coefficient estimates, except for the fixed a0 parameter. Diagnostic tests are summarized in Table 3.


Table 3: Residuals diagnostics for models in Table 2.

Series  Mean          Variance   Skewness     Kurtosis   Q(100)    Q^2(100)
DKK     -0.00653864   0.987771   -0.298274*   4.46094*   100.614   99.5539
FIM     -0.00914468   1.005780   -0.238782*   4.06924*   98.9181   112.562
DEM     -0.00412344   0.989840   -0.297643*   4.37323*   103.247   110.441
IEP     -0.00825703   0.982476   -0.059215*   5.32161*   99.1079   93.2667
ITL     -0.00458738   0.975643   -0.184337*   4.34671*   97.7134   104.048
PTE     -0.00315817   0.985543   -0.252066*   4.50373*   111.260   112.995
ESP     -0.00008092   0.987327   -0.219347*   4.39398*   107.536   112.684
CHF     -0.00830310   0.980774   -0.336811*   4.61003*   82.2046   111.603
GBP     -0.01527700   0.987953   -0.154136*   4.76172*   96.3491   82.6668
IDR      0.08611490   1.150080    0.586289*   20.2904*   109.062   60.5305
JPY      0.00323757   0.984919   -0.475329*   5.33905*   111.057   94.8046
KRW      0.01435000   0.977692    0.616896*   8.17019*   113.408   95.4020
TWD      0.02019300   1.076450    1.898260*   40.5118*   105.181   43.8986

Notes: The table reports the respective statistics of the estimated residuals from (11), for t = 1, ..., T, for the models in Table 2. The series are denoted using the international currency codes. Except for mean and variance, a star next to a reported test statistic indicates significance at the 5% level for the appropriate distribution. The columns Q(100) and Q^2(100) show Box–Pierce statistics for estimated residuals and squared estimated residuals, respectively, see Li and Mak (1994). Skewness and kurtosis statistics are according to Jarque and Bera (1987).


of empirically reasonable choices of a0.

The estimation results using thirteen series of foreign exchange data described in the previous

subsection are presented in Table 2. A battery of diagnostic tests based on the estimated

residuals series $\{\hat{z}_t : 1 \le t \le T\}$ for each model is reported in Table 3. All estimation and

diagnostic routines used in this subsection are written in Ox version 3.30, see Doornik (2002).

Empirical volatility models reported in Table 2 can be grouped into two main categories. The models for the DKK, FIM, DEM, and GBP series have a statistically insignificant fractional integration parameter $d$, and a point estimate of $\phi$ close to unity. For a larger group of models, based on the IEP, ITL, PTE, ESP, CHF, IDR, JPY, KRW, and TWD series, both $d$ and $\phi$ are statistically significant, where $d$ is close to unity for the IDR and KRW models. Importantly, the estimates of $d$ in all but the PTE model in the second group are above $\tfrac{1}{2}$, even though the asymptotic standard errors point to statistical significance of this hypothesis only in the ITL, IDR, JPY, KRW, and TWD models.
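The comparison of $d$ with $\tfrac{1}{2}$ can be reproduced from the estimates and standard errors in Table 2. The Python sketch below computes the ratio $(\hat{d} - \tfrac{1}{2}) / \mathrm{se}(\hat{d})$ for the second group of models; the use of a two-sided 5% normal critical value of 1.96 is an assumption of this illustration, since the text does not state the exact test employed.

```python
# (estimate of d, asymptotic standard error), copied from Table 2
table2_d = {
    "IEP": (0.73252, 0.12003), "ITL": (0.78169, 0.08346), "PTE": (0.44304, 0.18011),
    "ESP": (0.60947, 0.10145), "CHF": (0.54513, 0.06818), "IDR": (0.99317, 0.00286),
    "JPY": (0.66882, 0.06534), "KRW": (0.96060, 0.01093), "TWD": (0.91817, 0.02405),
}
CRIT = 1.96  # assumed: two-sided 5% critical value of the standard normal

def t_stat(estimate, se, null=0.5):
    """t-ratio for the null hypothesis d = 1/2."""
    return (estimate - null) / se

above_half = [s for s, (d, se) in table2_d.items() if d > 0.5]
significant = [s for s, (d, se) in table2_d.items() if abs(t_stat(d, se)) > CRIT]

for s, (d, se) in table2_d.items():
    print(f"{s}: t = {t_stat(d, se):6.2f}")
```

Under this assumption the list `significant` contains ITL, IDR, JPY, KRW and TWD, in line with the statement in the text, while IEP falls just short of the critical value.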

The deterministic component of the conditional volatility process in the estimated models

shows almost uniform absence of time dependence. Recall that the parameter c determines

importance of time trend in the deterministic part of the cMD–ARCH process (9). The point

estimates of $c$ are statistically insignificant for all but the TWD model. In the latter case $\alpha$ is close

to unity, pointing to IGARCH-like deterministic dynamics in $\{\sigma_t^2 : t \ge 0\}$ for this series.

A set of diagnostic tests reported in Table 3 uses series of estimated residuals (11) corre-

sponding to the empirical volatility models in Table 2. Reported tests indicate overall success

of the estimated cMD–ARCH models in picking up essential volatility dependences in the

data. Remaining skewness and excess kurtosis in $\{\hat{z}_t : 1 \le t \le T\}$ do not contradict A1 and

Theorem 2.

The results of empirical volatility modeling presented in this subsection can be summarized as follows. First, relaxing the strict time dynamics in $\{a_t : t \ge 0\}$ imposed in many familiar conditional heteroscedasticity processes, such as IGARCH and FIGARCH, leads to rejection of the deterministic trend component in empirical volatility models. We find that in all but one model for our sample of foreign exchange rate returns the sequence $\{a_t : t \ge 0\}$ is $O(1)$. Second, only four out of thirteen empirical models imply a summable sequence of coefficients $\{\theta_j : j \ge 0\}$ when expressed in the general cMD–ARCH form (3). Among the remaining models, all but one have a fractional integration parameter $d$ significantly below unity. This finding suggests the empirical importance of conditional heteroscedasticity models with hyperbolically decaying non–summable weighting coefficients. Third, even though the sequence $\{\theta_j : j \ge 0\}$ is close to $O(1)$ for six models in Table 2, the IGARCH hypothesis is not supported by the data due to the lack of a statistically significant linear trend component in $\{a_t : t \ge 0\}$. Fourth, in six out of thirteen cases the empirical results point to non–square–summable $\{\theta_j : j \ge 0\}$ and thus a non–stationary limiting conditional volatility process. This result indicates the potential significance of non–stationary fractionally integrated cMD–ARCH processes (9) with $\tfrac{1}{2} \le d < 1$ and $c = 0$, opening a potentially interesting direction of further research.


4 Concluding remarks and further research

Non–stationary models for linear time series, such as I(d) models for d ≥ 1 and trend–stationary

processes, play a substantial role in theoretical and applied econometric research. The main

goal of this paper is to introduce a framework for non–stationary conditional heteroscedasticity

models and to examine some empirical evidence of non–stationary volatility.

The proposed family of models, referred to in the paper as cMD–ARCH processes, allows

for separation of deterministic and stochastic effects in the conditional volatility, similarly to

the linear time series models. The stochastic part consists of a weighted sequence of martingale

difference innovations, permitting a range of different weighting structures. The deterministic

part may include a time trend component. A set of cross–restrictions between the parameters

of the cMD–ARCH process is imposed to ensure non–negativity of the conditional variance.

The paper examines a number of parametrizations of the new process.

A statistical inference theory for the parameters of cMD–ARCH is developed in the paper,

drawing on previous contributions of Jensen and Rahbek (2002, 2003). Consistency and asymptotic

normality of the QML estimator are shown under general assumptions on the sequence of

innovations.

Finally, an empirical application of the new model to thirteen major European and Asian

foreign exchange returns is included. The results indicate non–stationary volatility for six

cases in the form of non–square–summable weighting coefficients in the stochastic part of the

estimated volatility process.

The findings reported in the paper suggest a potential empirical importance of the following

non–stationary cMD–ARCH model:

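$$
\begin{aligned}
r_t^2 &= \sigma_t^2 = a_0 && \text{for } t \le 0 , \\
r_t &= \sigma_t z_t , \quad \sigma_t^2 = a_0 + \gamma\,(1 - L)^{-d}\,(r_{t-1}^2 - \sigma_{t-1}^2) && \text{for } t > 0 ,
\end{aligned}
$$

where $\frac{1}{2} \le d < 1$ and $0 < \gamma \le d < 1$. More work needs to be done to establish statistical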

properties and practical implications of this process. In particular, a detailed comparison of

this model with IGARCH sequences of Engle and Bollerslev (1986) may provide fruitful insights

into the area of non–stationary volatility modeling.
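A simulation of the process above can be sketched as follows. This is a minimal illustration, not the estimation code behind the reported results. It assumes i.i.d. standard normal innovations $z_t$ (a stronger condition than the paper requires), and the function name `simulate_fi_cmd_arch` is hypothetical. Because $r_t^2 - \sigma_t^2 = 0$ for $t \le 0$, the fractional filter involves only in-sample lags and no truncation error arises.

```python
import numpy as np

def simulate_fi_cmd_arch(T, a0, gamma, d, seed=0):
    """Simulate r_t = sigma_t z_t with
    sigma_t^2 = a0 + gamma (1 - L)^(-d) (r_{t-1}^2 - sigma_{t-1}^2), t = 1..T,
    and r_t^2 = sigma_t^2 = a0 for t <= 0 (pre-sample innovations vanish).
    Array index 0 corresponds to t = 1.
    """
    if not (0.5 <= d < 1 and 0 < gamma <= d):
        raise ValueError("need 1/2 <= d < 1 and 0 < gamma <= d")
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(T)           # assumed Gaussian innovations
    # psi_j: expansion coefficients of (1 - L)^(-d)
    psi = np.ones(T)
    for j in range(1, T):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    sigma2 = np.empty(T)
    e = np.zeros(T)                      # e[t] = r_t^2 - sigma_t^2
    for t in range(T):
        # sum over in-sample lags: psi_0 e_{t-1} + psi_1 e_{t-2} + ...
        sigma2[t] = a0 + gamma * np.dot(psi[:t], e[:t][::-1])
        e[t] = sigma2[t] * (z[t] ** 2 - 1.0)
    r = np.sqrt(sigma2) * z
    return r, sigma2, z

if __name__ == "__main__":
    r, sigma2, _ = simulate_fi_cmd_arch(T=1000, a0=1.0, gamma=0.5, d=0.75)
    print("min sigma2:", sigma2.min(), "sample variance of r:", r.var())
```

The parameter check mirrors the restrictions $\frac{1}{2} \le d < 1$ and $0 < \gamma \le d$ stated above; the values $a_0 = 1$, $\gamma = 0.5$ and $d = 0.75$ in the usage lines are illustrative only.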

5 Appendix

This technical appendix collects proofs of the main results in Section 2. The proof of Theorem 2

is preliminary and is likely to be revised further. Notation is simplified by using the convention $\sum_{m}^{n} \cdot = 0$ whenever $m > n$ for $m, n \in \mathbb{Z}$.

PROOF OF THEOREM 1: Using the definition of $\{\pi_j : j \ge 0\}$ in (4) and substituting recursively,
one can re–write the conditional variance part of (3) as follows:

$$
\sigma_t^2 = a_t - \sum_{j=0}^{t-1} \pi_j\, a_{t-1-j} + \sum_{j=1}^{t} \pi_{j-1}\, r_{t-j}^2 , \tag{A.1}
$$

from which the result follows by induction on $t \ge 0$ by A1–A2.

PROOF OF THEOREM 2: We proceed to establish sufficient conditions of Basawa, Feigin

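and Heyde (1976). First, using (A.1) and denoting $\Pi_t(u) := a_t(u) - \sum_{j=0}^{t-1} \pi_j(u)\, a_{t-1-j}(u)$, we

observe that:

$$
\frac{\sigma_t^{2(1)}(u_0)}{\sigma_t^2}
= \frac{\Pi_t^{(1)}(u_0)}{\sigma_t^2} + \sum_{j=1}^{t} \pi_{j-1}^{(1)}(u_0)\, \frac{r_{t-j}^2}{\sigma_t^2}
\le [\log \Pi_t(u_0)]^{(1)} + \sum_{j=0}^{t} [\log \pi_j(u_0)]^{(1)} , \tag{A.2}
$$

where the following inequalities are used:

$$
\frac{1}{\sigma_t^2} \le \frac{1}{\Pi_t(u_0)} \quad \text{and} \quad \frac{r_{t-j}^2}{\sigma_t^2} \le \frac{1}{\pi_{j-1}(u_0)} \quad \text{for all } t \ge 0 .
$$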

By A3 the right–hand side of (A.2) is bounded in absolute value for all t ≥ 0.

Second, the score vector of (10) is shown to be asymptotically normal as follows. Write:

$$
T^{-\frac{1}{2}} L_T^{(1)}(u_0) = -T^{-\frac{1}{2}} \sum_{t=1}^{T} [1 - z_t^2]\, \frac{\sigma_t^{2(1)}(u_0)}{\sigma_t^2} , \tag{A.3}
$$

where by A1 the vectors under the summation are martingale differences. Then:

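$$
-T^{-1} \sum_{t=1}^{T} E\left( [1 - z_t^2]^2\, \frac{\sigma_t^{2(1)}(u_0)}{\sigma_t^2}\, \frac{\sigma_t^{2(1)}(u_0')}{\sigma_t^2} \,\middle|\, \mathcal{F}_{t-1} \right) \xrightarrow{P} B_0 ,
$$

by A1 and A3, where $B_0$ is positive definite. Since $\{ \sigma_t^{2(1)}(u_0) / \sigma_t^2 : t \ge 0 \}$ is bounded according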

to A3, the asymptotic normality of (A.3) follows by the martingale central limit theorem for

random vectors.

Third, the Hessian of the log–likelihood function evaluated at u0 is given by:

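$$
-T^{-1} L_T^{(2)}(u_0) = -T^{-1} \sum_{t=1}^{T} [2 z_t^2 - 1]\, \frac{\sigma_t^{2(1)}(u_0)}{\sigma_t^2}\, \frac{\sigma_t^{2(1)}(u_0')}{\sigma_t^2} - T^{-1} \sum_{t=1}^{T} [1 - z_t^2]\, \frac{\sigma_t^{2(2)}(u_0)}{\sigma_t^2} .
$$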

The second part on the right–hand side of this expression converges to zero by A1. The limit

of the first part is as follows:

$$
-T^{-1} \sum_{t=1}^{T} [2 z_t^2 - 1]\, \frac{\sigma_t^{2(1)}(u_0)}{\sigma_t^2}\, \frac{\sigma_t^{2(1)}(u_0')}{\sigma_t^2} \xrightarrow{P} -A_0 ,
$$

where A0 is a positive definite matrix.

Finally, condition (B7) in Basawa, Feigin and Heyde (1976) is needed to ensure weak convergence

of the Hessian evaluated in a neighborhood of $u_0$ to $-A_0$. It is non–trivial to check in

the general setting of the cMD–ARCH model without considering specific parametrizations of

the model. In Berkes, Horvath and Kokoszka (2003a) the required convergence is claimed

by sufficient continuity of $T^{-1} L_T^{(2)}(u)$ with respect to $u$ in the neighborhood of the true vector

u0. The analogous continuity requirements are imposed in A4.


References



pp. 3–30.

Berkes, Istvan, Lajos Horvath and Piotr Kokoszka (2003a) GARCH processes: Structure and

estimation. Bernoulli, vol. 9, pp. 201–227.

Berkes, Istvan, Lajos Horvath and Piotr Kokoszka (2003b) Asymptotics for GARCH squared

residual correlations. Econometric Theory, vol. 19, pp. 515–540.


Econometrics, vol. 31, pp. 307–327.

Bollerslev, Tim, Robert F. Engle and D. B. Nelson (1994) ARCH models. Handbook of

Econometrics, vol. IV, pp. 2961–3031, New York: Elsevier Science.

Bougerol, Philippe and Nico Picard (1992) Stationarity of GARCH processes and of some

non–negative time series. Journal of Econometrics, vol. 52, pp. 115–127.

Davidson, James (2003) Moment and memory properties of linear conditional heteroscedasticity

models, and a new model. Preprint.

Doornik, Jurgen A. (1998) Object–Oriented Matrix Programming Using Ox, 3rd ed. London:

Timberlake Consultants Press and Oxford: www.nuff.ox.ac.uk/Users/Doornik.


variance of U.K. inflation. Econometrica, vol. 50, pp. 987–1008.

Engle, Robert F. and Bollerslev, Tim (1986) Modeling persistence of conditional variance.

Econometric Reviews, vol. 5, pp. 1–50.


pendence structure and central limit theorem. Econometric Theory, vol. 16, pp. 3–22.



Inference for Stochastic Processes, vol. 3, pp. 113–128.

Giraitis, Liudas, Remigijus Leipus and Donatas Surgailis (2003) Recent advances in ARCH

modelling. Preprint.

Hall, Peter and Qiwei Yao (2003) Inference in ARCH and GARCH models with heavy–tailed

errors. Econometrica, vol. 71, pp. 285–317.

Jarque, C. M. and A. K. Bera (1987) A test for normality of observations and regression

residuals. International Statistical Review, vol. 55, pp. 163-172.

Jensen, Søren Tolver and Anders Rahbek (2003) Asymptotic normality for Non–stationary,

explosive GARCH. Preprint.

Jensen, Søren Tolver and Anders Rahbek (2002) Non–stationary and no moments asymptotics

for the ARCH model. CAF Working Paper Series, no. 124.


Econometric Theory, vol. 18, pp. 1–16.


Koulikov, Dmitri (2003a) Modeling sequences of long memory non–negative covariance stationary

random variables. CAF Working Paper Series, no. 156.

Koulikov, Dmitri (2003b) Long memory ARCH(∞) models: specification and quasi–maximum

likelihood estimation. CAF Working Paper Series, no. 165.

Lee, Sang-Won and Bruce E. Hansen (1994) Asymptotic theory for the GARCH(1,1) quasi–

maximum likelihood estimator. Econometric Theory, vol. 10, pp. 29–52.

Li, W. K. and T. K. Mak (1994) On the squared residual autocorrelations in non–linear time se-

ries with conditional heteroscedasticity. Journal of Time Series Analysis, vol. 15, pp. 627–

636.

Ling, Shiqing and Michael McAleer (2002) Necessary and sufficient moment conditions for the

GARCH(r,s) and asymmetric power GARCH(r,s) models. Econometric Theory, vol. 18,

pp. 722–729.

Nelson, Daniel B. (1990) Stationarity and persistence in the GARCH(1,1) model. Econometric

Theory, vol. 6, pp. 318–334.


eroskedasticity in multiple regression. Journal of Econometrics, vol. 47, pp. 67–84.

Robinson, Peter M. and Paolo Zaffaroni (2003) Pseudo–maximum likelihood estimation of

ARCH(∞) models. Preprint.


Chapter 4: Conditional heteroscedasticity model for discrete high-frequency price changes: with

application to IBM trades data


Conditional heteroscedasticity model for discrete high-frequency

price changes: with application to IBM trades data

Dmitri Koulikov∗





phone: +45 89421577


This revision:

February 7, 2002

Abstract

In this paper we present conditional heteroscedasticity models for time-series of discrete

price changes in high-frequency financial data. They combine tractability of observation-

driven GARCH models of Bollerslev (1986) with the simplicity of the ordered probit/logit

structure of Hausman, Lo and MacKinlay (1992). In contrast to the ACM model of Russell

and Engle (1998) and the ADS decomposition model of Rydberg and Shephard (2003), we

separate groups of parameters driving conditional mean and conditional variance of the

data, allowing us to test the effects of explanatory variables separately on the two moments

of high-frequency price changes. We introduce two models belonging to the class outlined

above: the IV-GARCH model with short-memory volatility dynamics and the IV-FIARCH model

with long-range dependence in the conditional volatility. An application of the models to the IBM

trades dataset is provided.

JEL classification: C22, C25, C51, G10

Keywords: High-frequency financial data, Time-series of discrete random variables, Con-

ditional heteroscedasticity, Markov chains, Non-linear econometric models.

∗I wish to thank the participants of the following conferences for their helpful suggestions: “Market Mi-

crostructure and High-Frequency Data in Finance”, Sandbjerg, “57th European Meeting of the Econometric

Society”, Venice, and “VIIth Spring Meeting of Young Economists”, Paris. The usual disclaimer applies.


1 Introduction

This paper presents a contribution to the literature on econometric modeling of high-frequency

financial data. We introduce a class of observation-driven models for conditionally heteroscedastic

discrete price changes in the spirit of GARCH models of Engle (1982) and Bollerslev (1986)¹.

This class includes both short- and long-memory models, where the latter is able to accommo-

date substantial persistence found in the volatility of discrete price changes. In addition, our

models admit a relatively straightforward integration with financial duration models, such as

the ACD model of Engle and Russell (1998) and Engle (2000), giving the framework for joint

modeling of stochastically dependent inter-trade durations and price changes. Thus, this paper

follows a research agenda put forward in Rydberg and Shephard (2000), where the authors

propose compound Poisson process as the basic statistical model for high-frequency financial

data.

Recent availability of high-frequency financial data, coupled with increased computing ca-

pacity, has spurred the literature seeking to develop a range of econometric techniques suitable

for its statistical modeling. Among important recent contributions in the area are Davis, Rydberg,

Shephard and Streett (2001), Engle (2000), Engle and Russell (1998), Gerhard and

Pohlmeier (2000), Hausman, Lo and MacKinlay (1992), Rydberg and Shephard (1999, 2000)

and Russell and Engle (1998). A good survey of this relatively new econometric

field is Hautsch and Pohlmeier (2002).

From the viewpoint of established econometric methodology, high-frequency financial data

presents several new challenges. First and foremost, high-frequency datasets contain collections

of variables, such as bid and ask quotes, trade prices and trade volumes, where observations are

separated by stochastic time intervals. As suggested by the market microstructure literature,

these intervals themselves carry an important informational content and therefore should be

modeled simultaneously with other variables. This point has been recently stressed in papers

by Dufour and Engle (2000), Ghysels (2000) and Gerhard and Pohlmeier (2000).

Secondly, trade prices and quotes of many financial assets are discrete, reflecting the insti-

tutional structure of the markets. As documented in Rydberg and Shephard (2000), Campbell,

Lo and MacKinlay (1997) and in section 4 of this paper, discrete price changes in high-frequency

financial data exhibit statistical features similar to those normally observed in continuous low-

frequency returns.

Finally, the real-time character of high-frequency data further contributes to its statistical

complexity. Intra-day volatility patterns, news announcement effects and a range of mar-

ket microstructure-specific idiosyncrasies make econometric modeling of this data particularly

challenging.

In this paper we focus on discreteness of prices and quotes in high-frequency financial data.

Our models are suitable both for price changes series “binned” into regular time intervals in the

spirit of Davis, Rydberg, Shephard and Streett (2001), and for the original irregularly-spaced

data. In the latter case, our models integrate with the class of stochastic duration models

for financial data, providing a foundation for joint modeling of durations and price changes in

high-frequency data.

¹ A comprehensive survey of the GARCH literature can be found in Bollerslev, Chou and Kroner (1992).

Our work is motivated by the need for a tractable model that picks up essential dynamics of

discrete price changes, in particular, of their second moment. The success of GARCH models of

Engle (1982) and Bollerslev (1986) for low-frequency financial returns calls for the development

of a similar observation-driven conditional heteroscedasticity model for discrete data. While

both the Autoregressive Conditional Multinomial (ACM) model of Russell and Engle (1998)

and the ADS decomposition model of Rydberg and Shephard (1999a) are able to describe

the observed volatility clustering phenomenon in high-frequency price changes, neither of the

two has a single underlying parameter driving the volatility. Among other things, this fact

complicates testing of economically relevant hypotheses linked to the second moment of price

changes in high-frequency data.

We reuse the idea of Hausman, Lo and MacKinlay (1992), whereby the discrete distribu-

tion of price changes is approximated using the logistic distribution function in a framework

resembling the ordered logit model. However, unlike Hausman, Lo and MacKinlay (1992), our

models are cast entirely in terms of discrete random variables and therefore allow for simple

estimation and diagnostic procedures. Moreover, the probabilistic structure of the models lends

itself to the study of stationarity and moments of the discrete price change series.

Our models have two separate parameters: one driving conditional first moment of the

discrete distribution of price changes, and the other driving its conditional variance. The latter

is specified similarly to the GARCH model of Bollerslev (1986), allowing for a straightforward

extension for the long-memory case and augmentation with a set of exogenous explanatory

variables, such as announcement dummies and deterministic intra-day seasonality. In the paper

we look in detail at both short- and long-memory cases, referred to as IV-GARCH and IV-FIARCH

respectively².

The paper is organized as follows. Section 2 gives a short overview of the existing models
for high-frequency financial data and discusses the advantages and drawbacks of existing approaches.
Section 3 introduces the IV-GARCH and IV-FIARCH models, together with conditions for their
existence and stationarity and an overview of the estimation and diagnostic methods. Section 4
describes the high-frequency IBM trades dataset used in the empirical part of
the paper. Section 5 presents estimation results for the IV-GARCH and IV-FIARCH models.
The conclusion summarizes the findings and discusses directions for future research.

² These abbreviations stand for Integer-Valued Generalized Autoregressive Conditional Heteroscedasticity and
Integer-Valued Fractionally Integrated Generalized Autoregressive Conditional Heteroscedasticity respectively.

2 Short overview of existing models for high-frequency financial data

In this section we give an overview of three existing models for discrete price changes in high-

frequency financial data: the ordered probit model of Hausman, Lo and MacKinlay (1992), the

ACM model of Russell and Engle (1998), and the ADS decomposition model of Rydberg and

Shephard (2003). We also provide a brief discussion of a number of other contributions in the

field. A short comparison of the three models of price changes with our model is also given.

2.1 Ordered probit model of Hausman, Lo and MacKinlay (1992)

One of the first studies in the literature on modeling discrete price changes in high-frequency

financial data, Hausman, Lo and MacKinlay (1992) propose an ordered probit model to capture

essential features of such data. The ordered probit and logit models are well known from cross-

sectional econometrics, with main application area in the modeling of individual choices with

a natural ordering structure.

Hausman, Lo and MacKinlay (1992) motivate their model using the traditional framework

for ordered probit and logit:

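$$
\Delta p_i^* = x_i' \beta + \varepsilon_i ,
$$

where $x_i$ is a vector of exogenous explanatory variables and $\varepsilon_i$ is assumed to be an independent
normal random variable with variance $\sigma_i^2$. In this setting $\Delta p_i^*$ is an unobserved continuous
state variable³. The observation rule is then given by:

$$
\Delta p_i =
\begin{cases}
-K & \text{if } \Delta p_i^* \in A_{-K} \\
-K + 1 & \text{if } \Delta p_i^* \in A_{-K+1} \\
\;\;\vdots & \qquad \vdots \\
K & \text{if } \Delta p_i^* \in A_K
\end{cases} ,
$$

where $K$ denotes the maximum allowable absolute price change, and $\{A_k : -K \le k \le K\}$ defines
a finite partition of $\mathbb{R}$.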

Hausman, Lo and MacKinlay (1992) make the variance of the $\varepsilon_i$ term dependent on a set of
exogenous variables $w_i$:

$$
\sigma_i^2 = 1 + w_i' \gamma .
$$

The need for heteroscedastic εi in their model is motivated by appealing to the diffusion models

for prices of financial assets, where the variance of increments depends on the sampling interval.

Since price changes in high-frequency financial data are spaced irregularly, they include inter-

trade duration variable in wi to control for possible heteroscedasticity.
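The probabilities implied by the observation rule and the variance specification above can be sketched as follows. This is a minimal illustration under stated assumptions: the partition is taken to consist of intervals separated by $2K$ increasing boundaries (the argument `cuts`), so that the first set is the half-line below the lowest boundary and the last set is the half-line above the highest one. The function names and all numerical values are hypothetical and are not taken from Hausman, Lo and MacKinlay (1992).

```python
import math
import numpy as np

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ordered_probit_probs(x, beta, w, gamma, cuts):
    """P(dp_i = k), k = -K..K, for the latent dp* = x'beta + eps,
    eps ~ N(0, sigma2) with sigma2 = 1 + w'gamma.

    `cuts` holds the 2K increasing boundaries of the assumed interval partition.
    """
    mean = float(np.dot(x, beta))
    sigma2 = 1.0 + float(np.dot(w, gamma))
    if sigma2 <= 0:
        raise ValueError("1 + w'gamma must be positive")
    sd = math.sqrt(sigma2)
    # CDF of the latent variable at each boundary, padded with 0 and 1
    cdf = [0.0] + [norm_cdf((c - mean) / sd) for c in cuts] + [1.0]
    return np.diff(cdf)

if __name__ == "__main__":
    # K = 1: states -1, 0, +1; w is a single (illustrative) duration variable
    for duration in (0.0, 2.0):
        p = ordered_probit_probs([1.0], [0.0], [duration], [0.5], [-0.5, 0.5])
        print(f"duration={duration}: P(-1, 0, +1) = {np.round(p, 3)}")
```

With a positive coefficient on the duration variable, a longer duration raises the latent variance and shifts probability mass from the zero price change to the outer states, which is how heteroscedasticity enters the observed discrete distribution in this setup.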

The models for $\Delta p_i$ developed in section 3 have a structure superficially similar to that of
the ordered probit model of Hausman, Lo and MacKinlay (1992). In particular, our models inherit
the ordered logit mechanism of modeling probabilities of price changes. However, the models
developed in this paper do not have the underlying continuous state variable $\Delta p_i^*$ of Hausman,
Lo and MacKinlay (1992)⁴. This difference is not purely motivational: the variable $\Delta p_i^*$ in the
ordered probit model of Hausman, Lo and MacKinlay (1992) figures prominently throughout
their paper, and in particular the diagnostic procedures developed by the authors rely on
conditional independence and normality of $\Delta p_i^*$. This makes the specification tests unnecessarily
complicated due to the latent character of the state variable.

³ In the remainder of this paper we index observations in the high-frequency financial data by the subscript
$i$. This convention underlines the fact that such data may not be regularly spaced in time.

Moreover, the interpretation of $\Delta p_i^*$ in the setting of financial markets is at best vague.
Hausman, Lo and MacKinlay (1992) carefully avoid linking the state variable $\Delta p_i^*$ to a hypothetical
underlying continuous price, which is then rounded to the nearest tick by the ordering structure
of their model. Their empirical findings indicate that the boundaries of $\{A_k : -K \le k \le K\}$
are misaligned with respect to the \$1/8 grid and vary substantially from stock to stock.
However, when $\Delta p_i^*$ does not have a price interpretation, the relevance of the time-varying $\sigma_i^2$ becomes
unclear.

In this paper we show that the empirical success of the model in Hausman, Lo and MacKin-

lay (1992) lies in the flexibility of the ordered probit/logit structure in fitting the shape of the

discrete distribution of high-frequency price changes. By looking at the ordered logit model

merely as a convenient and parsimonious way to specify a probability distribution function of

discrete ∆pi, we are able to cast an entire class of models in section 3 in terms of discrete

random variables.

2.2 ACM model of Russell and Engle (1998)

The autoregressive conditional multinomial (ACM) model of Russell and Engle (1998) provides a
general framework for modeling the dynamics of a series of discrete random variables. Russell
and Engle (1998) specify a dynamic model for the discrete probability distribution of
high-frequency price changes $\Delta p_i$. For price changes in transaction data the state space of $\Delta p_i$ is
assumed to be a bounded interval of $\mathbb{Z}$ symmetric around zero⁵. Let the probabilities of individual
price changes be denoted by $\pi = (\pi_{-K}, \ldots, \pi_K)'$, where the support of $\Delta p_i$ has length
$2K + 1$. Russell and Engle (1998) suggest a dynamic model for $\pi_i$ of the following form:

$$
h(\pi_i) = \sum_{j=1}^{p} A_j \left( \mathbf{1}_{\Delta p_{i-j}} - \pi_{i-j} \right) + \sum_{j=1}^{q} B_j\, \mathbf{1}_{\Delta p_{i-j}} + \sum_{j=1}^{r} C_j\, h(\pi_{i-j}) + G Z_i , \tag{1}
$$

where $\mathbf{1}_{\Delta p_i}$ denotes a $(2K + 1) \times 1$ vector of the form
$(\mathbf{1}_{\Delta p_i = -K}, \ldots, \mathbf{1}_{\Delta p_i = K})'$, $Z_i$ stands for a
set of weakly exogenous explanatory variables, and the link function $h$ is chosen such that the
probabilities $\{\pi_k : -K \le k \le K\}$ sum up to unity. In their subsequent discussion of the model
and its empirical applications Russell and Engle (1998) utilize the multinomial logit specification
for $h$.

⁴ Historically, ordered probit and logit models were always built upon linear regressions with continuous
disturbances and appropriate classifying observation rules. In medical and biological applications the underlying
state variable is often interpreted as time or drug dosage, whereas in economics it most often represents the level
of utility. For an overview of the models refer to McFadden (1984). In this paper we work only with discrete
random variables whose probability mass function is parametrized similarly to the structure of the ordered logit
model; see section 3.

⁵ We use this assumption throughout the paper. However, both the ACM model of Russell and Engle (1998)
and the models developed in section 3 are not limited to this case.

With this level of generality almost every other model for the time series of discrete random

variables will be a special case of the ACM model. In their practical application to the IBM

transaction data Russell and Engle (1998) impose certain restrictions on the parameters of

the ACM model in equation (1) justified by the considerations of response symmetry. These

restrictions allow them to substantially reduce the number of parameters that have to be

estimated.

While capable of producing very good fits to the high-frequency price change data, the

ACM model has certain drawbacks from the empirical researcher's perspective. The multinomial

logit specification of the link function h leads to difficulties in interpreting parameters of the

model. In particular, direction in which included variables affect the probabilities of the states

of ∆pi will not in general coincide with the signs of respective coefficients, and the total effect

of an explanatory variable will depend on a subset of parameters. In the case of complicated

dynamic specification, such as given in equation (1), numerical simulations from the model

give a feasible solution to this interpretation problem. More generally, parameters of the ACM

model do not naturally fall into groups driving conditional moments of the discrete distribution

of ∆pi. While it is possible to identify a subset of parameters in the model that will enter the

expressions for the conditional moments of ∆pi, it is likely to be a complicated expression that

is difficult to keep track of in practice. Therefore, testing of hypotheses that explicitly involve

restrictions on the conditional moments is difficult.
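The interpretation problem described above can be seen already in a static multinomial logit with a single covariate. The following minimal sketch is a deliberate simplification and not the dynamic ACM specification (1): the log-odds of each non-base state are assumed linear in a scalar $x$, and the function names and coefficient values are hypothetical. In this setting the derivative of the probability of state $k$ with respect to $x$ equals $\pi_k (g_k - \sum_m \pi_m g_m)$, where $g_k$ is the slope of state $k$ and $g = 0$ for the base state.

```python
import numpy as np

def mnl_probs(eta):
    """Multinomial logit: eta holds the log-odds of the non-base states
    relative to the base state; returns probabilities (base state first)."""
    full = np.concatenate(([0.0], np.asarray(eta, dtype=float)))
    ex = np.exp(full - full.max())       # numerically stabilised softmax
    return ex / ex.sum()

def mnl_marginal_effects(x, intercepts, slopes):
    """d pi_k / d x when the log-odds are intercepts + slopes * x.

    Analytic form: pi_k * (g_k - sum_m pi_m g_m), with g = 0 for the base.
    """
    eta = np.asarray(intercepts, dtype=float) + np.asarray(slopes, dtype=float) * x
    pi = mnl_probs(eta)
    g = np.concatenate(([0.0], np.asarray(slopes, dtype=float)))
    return pi * (g - np.dot(pi, g))

if __name__ == "__main__":
    # Both slopes are positive, yet the first non-base state loses probability.
    effects = mnl_marginal_effects(0.0, intercepts=[0.0, 0.0], slopes=[0.5, 2.0])
    print(np.round(effects, 4))
```

In this illustrative example the first non-base state has a positive slope but a negative marginal effect, because the second state gains probability faster. The effect of $x$ on any one probability therefore depends on the full set of slopes, which is the difficulty noted above.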

The models for conditionally heteroscedastic discrete price changes presented in section 3

share many similarities with the ACM model of Russell and Engle (1998). However, several

important differences are apparent. Firstly, our models are designed to have identifiable groups

of parameters driving conditional first and conditional second moments of ∆pi. Hence, a range

of economically interesting hypotheses related to the moments of discrete price changes can be

tested directly, without the need to resort to post-estimation simulations. Secondly, by switch-

ing from the multinomial logit link function of the ACM model to the ordered logit or ordered

probit link function, such as in the model of Hausman, Lo and MacKinlay (1992), substantial

gains in terms of model parsimony are realized. This reduces the computation time

needed to maximize the likelihood function, an important consideration in increasingly large

high-frequency datasets. Finally, having a simpler specification for discrete price changes allows

us to derive results pertaining to the stationary distribution of ∆pi.


2.3 The ADS model of Rydberg and Shephard (2003)

Rydberg and Shephard (2003) propose a decomposition model for the discrete price changes in
financial data, whereby $\Delta p_i$ is assumed to be the product of three random processes as follows:

$$
\Delta p_i = A_i D_i S_i ,
$$

where $\{A_i : i \in \mathbb{Z}\}$ is the binary process on $\{0, 1\}$ describing trading activity,
$\{D_i : i \in \mathbb{Z}\}$ is the binary process on $\{-1, 1\}$ modeling the direction of the price movement,
and $\{S_i : i \in \mathbb{Z}\}$ is the random process defined on the set of positive integers giving the magnitude
of the price change. All three processes are allowed to be interdependent with possible inclusion of other
exogenous explanatory variables and intra-day seasonality. In their empirical analysis Rydberg
and Shephard (2003) use the autologistic model for $\{A_i : i \in \mathbb{Z}\}$ and $\{D_i : i \in \mathbb{Z}\}$, and specify
$\{S_i : i \in \mathbb{Z}\}$ by the negative binomial GLARMA process.

The ADS decomposition model has certain advantages from the point of view of market
microstructure research and hypothesis testing. In many cases the market activity process
$\{A_i : i \in \mathbb{Z}\}$ can be the sole focus of research. The product of $A_i$ and $D_i$ represents the censored
model of price movement, where $\Delta p_i$ is allowed to change by at most one tick. Rydberg and
Shephard (2003) report a number of interesting results concerning the dynamics and effect of
exogenous variables on $\{A_i : i \in \mathbb{Z}\}$, $\{D_i : i \in \mathbb{Z}\}$ and $\{S_i : i \in \mathbb{Z}\}$.

However, as in the ACM model of Russell and Engle (1998), the effects of explanatory
variables in the ADS decomposition model are not directly tied to the moments of the discrete
price changes. Although $S_i$ is tightly related to the volatility of $\Delta p_i$, the activity indicator $A_i$
must also be accounted for in the implied second moment of the price changes distribution.
The model for high-frequency price changes presented in this paper makes the link between the
parameters and the moments of $\Delta p_i$ even more explicit.
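The role of the activity indicator in the second moment can be checked with a small simulation. The sketch below is a strong simplification of the ADS model: the three components are drawn as mutually independent i.i.d. sequences, with a geometric distribution for the size, whereas Rydberg and Shephard (2003) use autologistic and negative binomial GLARMA dynamics. The function name and parameter values are hypothetical. Under these assumptions the squared price change equals the activity indicator times the squared size, so its mean is the activity probability times the second moment of the size.

```python
import numpy as np

def simulate_ads(n, p_active=0.4, p_up=0.5, p_size=0.7, seed=0):
    """Draw dp_i = A_i * D_i * S_i with mutually independent i.i.d. parts:
    A ~ Bernoulli(p_active), D = +1 w.p. p_up else -1, S ~ Geometric(p_size)
    on {1, 2, ...}. Independence is a simplifying assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    A = (rng.random(n) < p_active).astype(int)
    D = np.where(rng.random(n) < p_up, 1, -1)
    S = rng.geometric(p_size, size=n)
    return A * D * S, A, D, S

if __name__ == "__main__":
    dp, A, D, S = simulate_ads(200_000)
    # E[S^2] for a geometric variable on {1, 2, ...} is (2 - p) / p^2
    implied = 0.4 * (2 - 0.7) / 0.7 ** 2
    print("sample E[dp^2]:", (dp ** 2).mean(), "implied:", implied)
    print("share of zero price changes:", (dp == 0).mean())
```

The second moment of the simulated price changes depends on both the size distribution and the activity probability, so a hypothesis about volatility involves parameters from more than one component of the decomposition.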

2.4 Other contributions

Among other important contributions to the econometric modeling of high-frequency data we

mention ACD-GARCH models for irregularly spaced financial data by Ghysels and Jasiak (1997),

UHF-GARCH model by Engle (2000) and dynamic model for discrete bid-ask quotes with

ARCH volatility by Hasbrouck (1999). The first two models are not designed to account for

data discreteness, whereas Hasbrouck (1999) models it in the framework of the so-called round-

ing models of discreteness surveyed in Campbell, Lo and MacKinlay (1997) pp. 114-122.

The ACD-GARCH model of Ghysels and Jasiak (1997) is based on the GARCH aggregation

results of Drost and Nijman (1993), where aggregation intervals are stochastic and driven by

the autoregressive conditional duration (ACD) model of Engle and Russell (1998). Ghysels

and Jasiak (1997) introduce a latent GARCH model that generates unobserved conditionally

heteroscedastic returns at the highest observed frequency (normally 1 second). Parameters of

this latent GARCH model are of the primary interest for the researcher. Observed irregularly

spaced returns come from the aggregation of the latent data, where the aggregation intervals


are stochastic and driven by the ACD model. This leads to the GARCH model with random

coefficients that depend on the expected duration parameter from the ACD part. The basic

model can be extended to include deterministic intra-day seasonality and effects of exogenous

explanatory variables. Ghysels and Jasiak (1997) present an application of the ACD-GARCH model

to the IBM transaction dataset. They find that the latent GARCH model features remarkably low

volatility persistence, which contrasts with many other results, including those reported

in Engle (2000). Ghysels and Jasiak (1997) interpret this result as a demonstration of the

important role of the persistence of inter-trade durations, which together with the high-frequency

returns create the evidence of substantial volatility persistence in high-frequency data.

The ACD-GARCH model of Ghysels and Jasiak (1997) is one of the few contributions

in the current literature attempting to link the volatility process of high-frequency returns

with the process driving the inter-trade durations through their joint modeling. Another such

attempt is made in Russell and Engle (1998), who also use the ACD model to fit the durations

data. However, as also pointed out in Ghysels (2000), both models are reduced to the two-step

framework, where the ACD model for durations data is estimated first under the assumption of

exogeneity from the process driving the high-frequency returns or price changes data. Ghysels

and Jasiak (1997) also attempt to test causality from the volatility of high-frequency returns

to the inter-trade durations and find some supporting evidence for it.

The UHF-GARCH model of Engle (2000) uses high-frequency returns scaled by the actual

inter-trade durations for modeling in the usual GARCH framework. Scaling of the returns by

the square root of durations is intuitively justified as the natural measure of the volatility per

unit of time. Like Russell and Engle (1998), Engle (2000) also studies the effect of actual and

predicted durations on the conditional second moment of scaled returns. However, in contrast to

Russell and Engle (1998), Engle (2000) finds that actual durations have a statistically significant

effect on the variance of scaled high-frequency returns.
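The duration scaling described above amounts to dividing each return by the square root of its inter-trade duration. A minimal sketch follows; the function name and the input values are hypothetical, and the code illustrates only the scaling step, not the UHF-GARCH model itself.

```python
import numpy as np

def duration_scaled_returns(returns, durations):
    """Scale each return by the square root of its inter-trade duration,
    so that the squared result measures variance per unit of time."""
    r = np.asarray(returns, dtype=float)
    x = np.asarray(durations, dtype=float)
    if r.shape != x.shape:
        raise ValueError("returns and durations must have the same shape")
    if np.any(x <= 0):
        raise ValueError("durations must be strictly positive")
    return r / np.sqrt(x)

if __name__ == "__main__":
    # Illustrative returns with durations of 4, 9 and 1 time units
    print(duration_scaled_returns([0.2, -0.3, 0.0], [4.0, 9.0, 1.0]))
```

If the variance of a return is proportional to the elapsed duration, the scaled returns have a common variance, which is what makes them suitable inputs for a standard GARCH recursion.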

Both the ACD-GARCH model of Ghysels and Jasiak (1997) and UHF-GARCH model of

Engle (2000) ignore the inherent discreteness of the high-frequency financial data and fit traditional

GARCH models to it. There have been no studies to date discussing outcomes of

this modeling decision on the performance of GARCH models. As documented in Hausman,

Lo and MacKinlay (1992), Campbell, Lo and MacKinlay (1997) pp. 107-114, Russell and En-

gle (1998), Rydberg and Shephard (2000) and many other studies, high-frequency transaction

data normally contains a large proportion of zero price changes and, therefore, zero returns.

In addition to that, minimum price change of one tick is usually sufficiently coarse compared

to the price level of the asset, leading to the bunching of high-frequency returns around the

points of support of the discrete price change distribution; see Szpiro (1998) and Crack and

Ledoit (1996) for more on this effect. For some distributional assumptions, such as GED in the

EGARCH model of Nelson (1991), the concentration of probability mass on the zero returns

may lead to numerical instabilities and failures to estimate the model; see Hasbrouck (1999).

Moreover, predictions from such models are likely to fail to generate a sufficiently large amount


of zero returns and to pick up the bunching.

Hasbrouck (1999) proposes a dynamic model for the discrete bid and ask quotes that is

largely motivated by the insights from the market microstructure theory. Apart from the

discreteness, his model features ARCH effects and incorporates costs of market making. Hasbrouck (1999)

approaches discreteness by the rounding of continuous data generated from the

latent time-series process with conditionally heteroscedastic innovations. The rounding is asym-

metric and is related to the cost of market making. This way of introducing discreteness into

the model goes back to the contributions of Gottlieb and Kalay (1985) and Ball (1988), who

use this setup to consider effects of discreteness on the estimator of variance of the continuous

underlying process. Inference in the model is complicated by the presence of several latent

components and is based on the recursive likelihood calculations, where the state variables

are integrated out using numerical methods. The empirical findings of Hasbrouck (1999) imply
a highly peaked distribution of the innovations to the unobserved price process together with a
relatively low degree of persistence of their variance.

3 IV-GARCH and IV-FIARCH models for high-frequency financial data

In this section we present a class of models for time-series of discrete conditionally heteroscedas-

tic price changes in high-frequency financial datasets. The models are motivated by Hausman,

Lo and MacKinlay (1992) ordered probit model for transactions data with the variance process

similar to the GARCH model of Bollerslev (1986). The models belong to the class of obser-

vation driven models in the sense of Cox (1981) and lead to the straightforward maximum

likelihood based inferential procedures. The basic specification can be also used to parametrize

conditional variance of discrete price changes in terms of a background driving unobserved

process.

3.1 Empirical regularities in the distribution of high-frequency price changes in financial data

Before we proceed with discussion of IV-GARCH and IV-FIARCH models in the following

subsections, we present several stylized facts pertaining to the statistical properties of high-

frequency price changes. For a much broader survey of general empirical regularities of high-

frequency financial data refer to Campbell, Lo and MacKinlay (1997) pp. 107-114, and Hautsch

and Pohlmeier (2002). Here we emphasize the following three most notable features of {∆pi : i ∈ Z}:

1. Figure 1 depicts the unconditional distribution of high-frequency price changes
in the IBM trades dataset. A description of the dataset is given in section 4. It is seen
that the distribution is nearly symmetric around ∆pi = 0, with a high concentration of
probability mass on the middle state and almost no mass for |∆pi| ≥ $1/2; see Campbell,
Lo and MacKinlay (1997) pp. 107-114 and Hautsch and Pohlmeier (2001) for similar
evidence in other high-frequency datasets;

2. Another notable regularity of the data is a significant negative first-order autocorrelation
of the {∆pi : i ∈ Z} series; see Figure 2. It follows that, conditional on the sign of the previous
observation, the distribution of ∆pi will be asymmetric; see Figure 3 for an illustration.
The negative autocorrelation is consistent with the bid-ask bounce model of Roll (1984).

3. Finally, as is the case for many financial series with continuous support, {∆pi : i ∈ Z}
appears to exhibit dynamic heteroscedasticity. For the IBM data this is illustrated
by the correlogram of |∆pi| in Figure 2. Figure 4 shows the probability mass function
of ∆pi on two different trading dates within our sample, where the difference
in the tail mass distribution is apparent.

Figure 1: Unconditional distribution of ∆pi in IBM transaction data. Tick size equals $1/8.

Figure 2: Correlograms of ∆pi (upper panel) and |∆pi| (lower panel) for IBM transaction data, lags 0 to 500.

Figure 3: Distribution of ∆pi conditional on ∆pi−1 ≥ 0 (upper panel) and on ∆pi−1 ≤ 0 (lower
panel) in IBM transaction data. Tick size equals $1/8.

Figure 4: Marginal distributions of ∆pi on 23rd of November 1990 (upper panel) and 17th of
January 1991 (lower panel) in IBM transaction data. Tick size equals $1/8.

In the next subsections we present a framework for econometric modeling of discrete price

changes, where the three properties outlined above are accounted for in a relatively parsimo-

nious and mathematically tractable way. The models can be easily extended in many other

directions, making them suitable for testing a range of market microstructure related theories.

In sections 4 and 5 we evaluate the fit of the models on the real-world IBM trades dataset.

3.2 Discrete distribution for high-frequency price changes

The discrete distribution for ∆pi in high-frequency financial series forms the basic building

block of the models proposed further in this section. While there is a large class of statistical
distributions with discrete support (a good overview of these can be found in Feller (1968)
and Johnson and Kotz (1969)), most are restricted to non-negative counts and therefore
are not suitable for modeling ∆pi. The few parametric discrete distributions that are defined

for the intervals of Z do not seem to have enough flexibility to accommodate a range of patterns

of ∆pi that was documented in the previous subsection. Below we introduce a parametrization

for the discrete distribution of high-frequency price changes that allows us to pick up changes

in the first and second moments of the discrete data using as few parameters as possible.

We model the set of probability atoms π = (π−K , . . . , πK) on a bounded interval of Z
symmetric around zero as a function of two parameters linked to the first two moments of
∆pi (see footnote 6). In the remainder of the paper the two parameters are denoted µ and σ2, where we allow
for a non-linear relationship between µ and E(∆pi|µ, σ2) and between σ2 and V(∆pi|µ, σ2). In addition,
the following assumptions are used throughout the paper:

A1. $0 < \pi_k(\mu, \sigma^2) < 1$ and $\sum_{k=-K}^{K} \pi_k(\mu, \sigma^2) = 1$ for all $k = -K, \ldots, K$, $\mu \in \mathbb{R}$ and $\sigma^2 \in \mathbb{R}_+$.

Footnote 6: Here and henceforth we use the same notation as in subsection 2.


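A2. The functions $\pi_k(\mu, \sigma^2)$, $-K \le k \le K$, have the following limits, for some $0 \le \delta < 1$:

\begin{align*}
\lim_{\sigma^2 \to 0} \pi_k(\mu, \sigma^2) &= \delta \quad \text{for } k = -K, \ldots, -1, 1, \ldots, K, \\
\lim_{\sigma^2 \to 0} \pi_0(\mu, \sigma^2) &= 1 - \delta, \\
\lim_{\sigma^2 \to \infty} \pi_k(\mu, \sigma^2) &< 1 .
\end{align*}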

A3. The functions $\sigma^2 \mapsto \pi_k(\mu, \sigma^2)$, $-K \le k \le K$, are Lipschitz and satisfy $|\pi_k(\mu, x) - \pi_k(\mu, y)| \le C_k |x - y|$ for all $k = -K, \ldots, K$ and some constants $\{C_k : -K \le k \le K\} \subseteq \mathbb{R}_+$.

A1 imposes a restriction on the vector π, ensuring that positive probability mass remains at all

points in the support of ∆pi, regardless of the values of µ and σ2. This restriction is common in

econometric models with absolutely continuous distributions and time-varying volatility, such

as GARCH and EGARCH models. It also plays an important role in ensuring dynamic stability

of the time-series models based on this discrete distribution.

A2 provides a number of restrictions on the behavior of π as a function of σ2. The first two

limits guarantee that the probability mass concentrates at the middle point of the support,

possibly leaving only some limited probability mass in the tails, as σ2 approaches zero. The

third limit ensures stability of the probability distribution as σ2 →∞ by requiring the functions

πk(µ, σ2) : −K ≤ k ≤ K to have a well-defined limit.

Finally, A3 imposes extra regularity requirements on the functions πk(µ, σ2) : −K ≤ k ≤ K.
In particular, it guarantees their smoothness with respect to σ2, a desirable property in
an econometric model.

The general framework presented above resembles the one outlined by Russell and Engle (1998),

but some important differences are present. Notably, as was mentioned in subsection 2.2, the

model for discrete distribution of ∆pi in this paper is designed to have identifiable groups of

parameters associated with the moments of high-frequency price changes. Because volatility

in finance plays a prominent role and many hypotheses are specifically linked to the second

moment of financial data, the structure of our model should provide a powerful and convenient

tool for the empirical research. Another important distinction of our approach is to make

πk : −K ≤ k ≤ K the functions of parameters of interest, rather than to model the evolu-

tion of π in the generalized VAR framework. As will become more clear below, we sacrifice

some flexibility of the generalized VAR framework of Russell and Engle (1998) in order to

obtain more parsimonious, easier to interpret structure of the discrete distribution of ∆pi. In

fact, for modeling price changes in high-frequency data, one hardly needs completely flexible

parametrization of π. As we saw in the previous subsection, the spectrum of distributions of
∆pi has clear common features, such as a pronounced concentration of probability mass on the zero
price change and thin tails. We make use of these facts to simplify our model and to gain in

its interpretability.

Once a suitable mapping (µ, σ2) ↦ π is established, a parametrization of µ and σ2 can

be selected. For example, in the spirit of observation driven models of Cox (1981), µ and

σ2 can be made dependent on the history of the process and a set of exogenous variables.


Another suggestion is to parametrize them in terms of latent state variables. Moreover, it

becomes possible to model µ and σ2 together with other endogenous variables of interest in the

high-frequency dataset. We will explore some of these possibilities later in the paper.

In the remainder of this subsection we describe a particular parametrization of functions

πk(µ, σ2) : −K ≤ k ≤ K such that A1–A3 are satisfied and the parameters µ and σ2 are

linked to the first two moments of ∆pi. Our choice is similar to the popular ordered logit

model, where logistic distribution function is used as a link function between µ and σ2 and the

probabilities π. This choice was originally motivated by the ordered probit model of Hausman,

Lo and MacKinlay (1992), but as we mentioned previously, our model is not in the class of

ordered logit models. We use the logistic link function for the parametrization of π because of
its convenience in describing the variety of forms of the probability mass distributions of ∆pi
observed in the data; refer to subsection 2.1 for the discussion (see footnote 7).

The logistic distribution function is a continuous bounded function with two parameters:
location parameter µ and scale parameter σ2. In addition to these, the probabilities {πk : −K ≤ k ≤ K}
are defined using an extra set of parameters α = (−αK−1, . . . , −α1, −1, 1, α1, . . . , αK−1)
in the following way:

\begin{align*}
\pi_{-K}(\mu, \sigma^2) &:= \frac{1}{1 + e^{-\frac{-\alpha_{K-1} - \mu}{\sigma}}} \\
&\;\;\vdots \\
\pi_0(\mu, \sigma^2) &:= \frac{1}{1 + e^{-\frac{1 - \mu}{\sigma}}} - \frac{1}{1 + e^{-\frac{-1 - \mu}{\sigma}}} \tag{2} \\
&\;\;\vdots \\
\pi_K(\mu, \sigma^2) &:= 1 - \frac{1}{1 + e^{-\frac{\alpha_{K-1} - \mu}{\sigma}}} .
\end{align*}

Together with the assumed structure of α, the support of the distribution of ∆pi is seen to be an interval of Z
symmetric around 0. This parametrization ensures that the probabilities {πk : −K ≤ k ≤ K}
sum to unity. Note that in the rest of this paper it will be assumed that the parameters
(α1, . . . , αK−1) are time-invariant and do not depend on any set of exogenous or predetermined
variables. The vector of parameters α can be thought of as defining the unconditional distribution
of {∆pi : i ∈ Z}, whilst the conditional distributions of price changes are modeled using
µ and σ2.

A symmetric discrete distribution of ∆pi around its middle state, normally ∆pi = 0, obtains
whenever the location parameter µ is zero. When µ ≠ 0, probability mass swings to the left
or to the right tail of the distribution, leading to a non-zero expected price change. This gives
a convenient way of picking up variations in the conditional first moment of ∆pi in real-world
data. This mechanism is seen from the expression for the first moment of ∆pi given by:

\[
E(\Delta p_i \mid \mu, \sigma^2) = \sum_{k=-K}^{K} k \cdot \pi_k(\mu, \sigma^2) ,
\]

where {πk : −K ≤ k ≤ K} are defined in equation (2) and the unit of measurement of E(∆pi)
is the number of ticks by which the price is expected to change. As is apparent from this
equation, whenever α has the structure given above, probability atoms in the left and in the
right tail of the distribution will be equal to each other when µ = 0 and different otherwise.

Footnote 7: Other models for heteroscedastic sequences of discrete random variables can be constructed. For example,
consider a partition of the unit interval [0, 1) into subintervals A1 = (0, α1], A2 = (α1, α2] and A3 = (α2, 1] such that
0 < α1 < α2 < 1. Assign πi = λ1(Ai) for i = 1, 2, 3, where λ1 is the Lebesgue measure on [0, 1). By parametrizing
α1 and α2, a variety of shapes of the trivariate distribution of ∆pi can be achieved. In particular, E(∆pi) changes
when A2 is moved around the unit interval and λ1(A2) stays constant, while V(∆pi) varies depending on λ1(A2).

Figure 5: Dependence of E(∆pi) on µ and σ2 (left panel) and V(∆pi) on µ and σ2 (right panel)
in the static model with trivariate distribution of ∆pi. Left panel: horizontal axis mu, curves for sigma = 1, 2, 3. Right panel: horizontal axis sigma, curves for mu = 0, 1, 2.

The second moment of ∆pi is given by the following expression:

\[
V(\Delta p_i \mid \mu, \sigma^2) = \sum_{k=-K}^{K} k^2 \cdot \pi_k(\mu, \sigma^2) - \left( E(\Delta p_i \mid \mu, \sigma^2) \right)^2 .
\]

The measurement unit of the variance is given by squared ticks. When µ = 0, the second part

of the expression drops out, and as follows from equation (2), the remaining tail probabilities

in the expression are scaled proportionally to the parameter σ2, linking it to V(∆pi).

However, just as in many parametric discrete distributions, there will be cross effect of µ,

respectively σ2, on V(∆pi), respectively E(∆pi), since πk(µ, σ2) : −K ≤ k ≤ K are functions

of both parameters. Figure 5 plots the first two moments of a simple trivariate model of ∆pi as

the functions of µ and σ2. For E(∆pi), the increase in σ2 leads to less pronounced dependence

of the first moment on µ, and similar effect is observed for the dependence of V(∆pi) on σ2

when µ increases. The same result holds for models with larger support of ∆pi.
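The static distribution and its first two moments can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names `state_probs` and `moments` are hypothetical, the positive thresholds are passed as (1, α1, . . . , αK−1) and assumed to be increasing, and the scale argument `sigma` is the quantity that divides the exponent in equation (2).

```python
import numpy as np

def state_probs(mu, sigma, alpha_pos):
    """Probabilities (pi_{-K}, ..., pi_K) built from logistic thresholds.

    alpha_pos = (1, alpha_1, ..., alpha_{K-1}) holds the positive thresholds,
    assumed increasing; the negative thresholds are their mirror images.
    """
    a = np.asarray(alpha_pos, dtype=float)
    cuts = np.concatenate((-a[::-1], a))                  # 2K ordered thresholds
    cdf = 1.0 / (1.0 + np.exp(-(cuts - mu) / sigma))      # logistic CDF at each threshold
    return np.diff(np.concatenate(([0.0], cdf, [1.0])))   # 2K+1 state probabilities

def moments(mu, sigma, alpha_pos):
    """Mean (in ticks) and variance (in squared ticks) of the price change."""
    p = state_probs(mu, sigma, alpha_pos)
    K = len(alpha_pos)
    k = np.arange(-K, K + 1)
    mean = float(np.sum(k * p))
    var = float(np.sum(k**2 * p) - mean**2)
    return mean, var

if __name__ == "__main__":
    # Three-state model (K = 1): mean as a function of mu for several scales.
    for sigma in (1.0, 2.0, 3.0):
        print(sigma, [round(moments(mu, sigma, [1.0])[0], 3) for mu in (-2.0, 0.0, 2.0)])
```

Evaluating `moments` on a grid of µ and σ values for the three-state case gives curves of the kind described for Figure 5, including the cross effects between the two parameters.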

Given some constant µ, σ2 and α, the static model presented in this subsection produces a

sequence of i.i.d. homoscedastic discrete random variables ∆pi : i ∈ Z. By specifying µ and

σ2 in terms of the set of exogenous parameters, the set of probability densities πi : i ∈ Z is

allowed to vary, permitting the discrete process ∆pi : i ∈ Z to exhibit a variety of dynamic


features. We wish to note that the sequence ∆pi : i ∈ Z is bounded by construction, hence

all moments of ∆pi will exist. In the following subsections we will show how to introduce

dependence into the time series of high-frequency price changes such that the stylized facts

presented in 3.1 can be modeled in an adequate way.

3.3 IV-GARCH model for heteroscedastic discrete ∆pi

Building upon the model for homoscedastic price changes introduced in the previous subsection,

we now show how to model sequences of heteroscedastic ∆pi in high-frequency data. As

documented in subsection 3.1, and as will be seen in section 4, there is a dynamic dependence

in the real world ∆pi series in both the first and the second moments. We pick this up by

borrowing ideas from GARCH models of Bollerslev (1986). In particular, we parametrize σ2 in

terms of its own lags and a set of exogenous variables, with disturbances given by a sequence of

martingale difference (henceforth MD) innovations. µ is assumed to be non-dynamic, possibly

depending on a set of exogenous variables.

The first model for dependent heteroscedastic sequence of price changes features short-

memory dynamic structure in the second moment and is defined as follows:

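\begin{align*}
\Delta p_i &\sim \{\pi_{-K}(\mu_i, \sigma_i^2; \alpha), \ldots, \pi_K(\mu_i, \sigma_i^2; \alpha)\} \\
\mu_i &= x_i' \beta \\
\sigma_i^2 &= \gamma_{0i} + \gamma_1 \sigma_{i-1}^2 + \gamma_2 \left[ \Delta p_{i-1}^2 - E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2) \right] , \tag{3}
\end{align*}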

where $\{\pi_{-K}(\mu_i, \sigma_i^2; \alpha), \ldots, \pi_K(\mu_i, \sigma_i^2; \alpha)\}$ denotes the discrete distribution introduced in the previous
subsection with parameters $\mu_i$, $\sigma_i^2$ and $\alpha$, and for simplicity we suppress the dependence of the conditional
expectation $E(\Delta p_i^2 \mid \sigma_i^2)$ on $\mu_i$. The vector $x_i$ collects a set of exogenous variables normalized
to have zero mean. The parameter $\gamma_{0i}$ is assumed to be non-negative for all $i \in \mathbb{Z}$, and can optionally
be parametrized as $z_i' \delta$ to include the effects of intra-day seasonality, news announcements
and additional exogenous variables. As follows from (3), the conditional volatility parameter
$\sigma_i^2$ has one autoregressive component and one driving martingale difference innovation. In the
remainder of this paper this model is referred to as IV-GARCH(1,1).
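To make the recursion in (3) concrete, the following Python sketch simulates a path from a simplified IV-GARCH(1,1) model. Several choices are assumptions made only for illustration: the names `state_probs` and `simulate_iv_garch` are hypothetical, µi is held constant, γ0i is a constant, the logistic scale is taken to be the square root of σ2i, the recursion is started at γ0/(1 − γ1), and the parameter values in the usage lines are arbitrary.

```python
import numpy as np

def state_probs(mu, sigma2, alpha_pos):
    """State probabilities from logistic thresholds (scale = sqrt(sigma2), an assumption)."""
    a = np.asarray(alpha_pos, dtype=float)
    cuts = np.concatenate((-a[::-1], a))
    cdf = 1.0 / (1.0 + np.exp(-(cuts - mu) / np.sqrt(sigma2)))
    return np.diff(np.concatenate(([0.0], cdf, [1.0])))

def simulate_iv_garch(n, gamma0, gamma1, gamma2, alpha_pos, mu=0.0, seed=0):
    """Simulate price changes dp (in ticks) and the volatility parameter sigma2."""
    rng = np.random.default_rng(seed)
    K = len(alpha_pos)
    k = np.arange(-K, K + 1)
    dp = np.empty(n, dtype=int)
    sigma2 = np.empty(n)
    sigma2[0] = gamma0 / (1.0 - gamma1)          # assumed start value
    for i in range(n):
        p = state_probs(mu, sigma2[i], alpha_pos)
        dp[i] = rng.choice(k, p=p)               # draw the discrete price change
        if i + 1 < n:
            surprise = dp[i] ** 2 - np.sum(k**2 * p)   # dp^2 minus its conditional mean
            sigma2[i + 1] = gamma0 + gamma1 * sigma2[i] + gamma2 * surprise
            if sigma2[i + 1] <= 0.0:
                raise ValueError("sigma2 became non-positive for these parameters")
    return dp, sigma2

if __name__ == "__main__":
    dp, sigma2 = simulate_iv_garch(2000, 0.2, 0.8, 0.1, [1.0])
    print(np.bincount(dp + 1) / len(dp), sigma2.mean())
```

With three states the conditional second moment is bounded by one, so the illustrative values γ0 = 0.2 and γ2 = 0.1 keep γ0 − γ2 E(∆p2|σ2) positive and the simulated σ2i stays above zero.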

The process $\{\sigma_i^2 : i \in \mathbb{Z}\}$ in the IV-GARCH(1,1) model above is driven by the innovation terms
$\Delta p_i^2 - E(\Delta p_i^2 \mid \sigma_i^2)$ (see footnote 8). They have a natural interpretation as volatility "surprises", i.e. unexpected
increases or decreases in the volatility of the discrete random variable $\Delta p_i$. The innovations
$\Delta p_i^2 - E(\Delta p_i^2 \mid \sigma_i^2)$ are by construction martingale differences, since
$E\left[ \Delta p_i^2 - E(\Delta p_i^2 \mid \sigma_i^2) \mid \mathcal{F}_{i-1} \right] = 0$
for all $i \in \mathbb{Z}$, where $\mathcal{F}_i$ stands for the information generated by the process up to time $i$.
The sequence of squared price changes $\{\Delta p_i^2 : i \in \mathbb{Z}\}$ in the volatility process is not i.i.d.
and therefore requires careful treatment in proofs of the distributional results and statistical
properties of $\{\sigma_i^2 : i \in \mathbb{Z}\}$.

Footnote 8: At this point the difference between our view of the model in equation (3) and the traditional ordered
probit/logit literature is most visible. Since we do not have any underlying latent variable with absolutely
continuous distribution in (3), the only random innovation in our model is the discrete $\Delta p_i$. The traditional approach
may call for the inclusion of the squared unobserved continuous $\Delta p_i^*$, which leads to complications due to its
latent character. Also, diagnostic procedures in traditional ordered probit/logit models require calculation of
the expected underlying innovation; see Hausman, Lo and MacKinlay (1992). We base our specification tests
directly on $\Delta p_i$.

Under assumptions Eσ2i <∞ and 0 ≤ γ1 < 1, the IV-GARCH(1,1) model has the following

representation, referred to as IV-ARCH(∞):

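\[
\sigma_i^2 = \sum_{j=0}^{\infty} \gamma_1^j \gamma_{0,i-j} + \gamma_2 \sum_{j=0}^{\infty} \gamma_1^j \left[ \Delta p_{i-j-1}^2 - E(\Delta p_{i-j-1}^2 \mid \sigma_{i-j-1}^2) \right] .
\]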

This representation highlights an important feature of model (3): the sequence of martingale
difference innovations is weighted by an exponentially decaying sequence of coefficients, leading
to fast dissipation of the effect of past shocks on the current conditional volatility parameter
$\sigma_i^2$. In particular, when the conditional volatility process has a stationary distribution and $\gamma_{0i}$
is constant, the autocovariance function of $\{\sigma_i^2 : i \in \mathbb{Z}\}$ will be absolutely summable, similarly
to the class of short memory linear ARMA models.

Below we give a sufficient condition for non-negativity of the conditional volatility process

σ2i : i ∈ Z in the IV-GARCH(1,1) model:

THEOREM 1. The conditional volatility process in the IV-GARCH(1,1) model is non-negative if
$\gamma_0 + \gamma_1 \sigma^2 - \gamma_2 E(\Delta p^2 \mid \sigma^2) > 0$ for all $\sigma^2 > 0$.

A choice of $\gamma_0$, $\gamma_1$ and $\gamma_2$ that satisfies the condition in Theorem 1 is always possible because,
by A1 and A3, the function $\sigma^2 \mapsto E(\Delta p^2 \mid \sigma^2)$ is continuous and bounded between 0 and $K^2$. In
addition, a simpler sufficient condition for positivity of the volatility process in the IV-GARCH(1,1)
model is given by $\gamma_0 - \gamma_2 E(\Delta p^2 \mid \sigma^2) > 0$, which is seen from (A.1).

Before showing conditions for stochastic stability of the IV-GARCH(1,1) model, we show

deterministic stability of the noise-free skeleton of the model as given in equation (A.1). We

also obtain the lower bound for the volatility process in the IV-GARCH(1,1) model. We need

the following extra regularity assumption:

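A4. The function $\sigma^2 \mapsto \gamma_0 + \gamma_1 \sigma^2 - \gamma_2 E(\Delta p^2 \mid \sigma^2)$ defines a contraction, that is

\[
\left| \gamma_1 [x - y] - \gamma_2 \left[ E(\Delta p^2 \mid x) - E(\Delta p^2 \mid y) \right] \right| \le \delta |x - y|
\]

holds for any $x, y \in \mathbb{R}_+$ and some $0 < \delta < 1$.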

THEOREM 2. Under A4, the deterministic part of the volatility process in the IV-GARCH(1,1)
model, defined by the recursion $\sigma_i^2 = \gamma_0 + \gamma_1 \sigma_{i-1}^2 - \gamma_2 E(\Delta p^2 \mid \sigma_{i-1}^2)$, is globally stable and has a
unique limit point given by the solution of $\sigma^2 = \gamma_0 + \gamma_1 \sigma^2 - \gamma_2 E(\Delta p^2 \mid \sigma^2)$.
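The deterministic stability claim can be checked numerically by iterating the noise-free recursion from very different starting values. The sketch below is an illustration only: the names `expected_sq` and `skeleton_path` are hypothetical, the three-state logistic parametrization with scale equal to the square root of σ2 and µ = 0 is assumed, and the parameter values are arbitrary choices for which the map is a contraction on the positive half-line.

```python
import numpy as np

def expected_sq(sigma2, alpha_pos, mu=0.0):
    """E(dp^2 | sigma2) under the logistic parametrization (scale = sqrt(sigma2), assumed)."""
    a = np.asarray(alpha_pos, dtype=float)
    cuts = np.concatenate((-a[::-1], a))
    cdf = 1.0 / (1.0 + np.exp(-(cuts - mu) / np.sqrt(sigma2)))
    p = np.diff(np.concatenate(([0.0], cdf, [1.0])))
    K = len(a)
    k = np.arange(-K, K + 1)
    return float(np.sum(k**2 * p))

def skeleton_path(s0, gamma0, gamma1, gamma2, alpha_pos, steps=300):
    """Iterate sigma2 -> gamma0 + gamma1*sigma2 - gamma2*E(dp^2 | sigma2)."""
    path = [s0]
    for _ in range(steps):
        s = path[-1]
        path.append(gamma0 + gamma1 * s - gamma2 * expected_sq(s, alpha_pos))
    return np.array(path)

if __name__ == "__main__":
    low = skeleton_path(0.01, 0.2, 0.8, 0.1, [1.0])
    high = skeleton_path(100.0, 0.2, 0.8, 0.1, [1.0])
    print(low[-1], high[-1])   # both paths settle at the same fixed point
```

Both paths converge to the same value, which solves the fixed-point equation stated in Theorem 2 for these illustrative parameters.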

It is also seen that the expected value of the volatility process in equation (3) is given by:

\[
E\sigma_i^2 = \frac{E\gamma_{0i}}{1 - \gamma_1} .
\]

Finally, existence and uniqueness of the stationary distribution of the IV-GARCH(1,1)
model draws on results from Markov chain theory in general state spaces. In particular, we
will follow the line of proofs given in Davis, Rydberg, Shephard and Streett (2001). The
IV-GARCH(1,1) model and the CBIN model introduced by these authors have similar probabilistic
structures, and the cited work has been very helpful in establishing stationarity results for our
model. Note that the conditional volatility part of the IV-GARCH(1,1) model defines a Markov
chain on the positive half-line:

\[
\sigma_i^2 = \gamma_0 + \gamma_1 \sigma_{i-1}^2 - \gamma_2 E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2) + \gamma_2 \Delta p_{i-1}^2 , \tag{4}
\]

where $\{\Delta p_i^2 : i \in \mathbb{Z}\}$ is the sequence of non-i.i.d. innovations. The following result is central in
this subsection:

THEOREM 3. The Markov chain $\{\sigma_i^2 : i \in \mathbb{Z}\}$ with transition function defined by (4) possesses a
unique stationary distribution under A1–A4.

3.4 IV-FIARCH model for heteroscedastic discrete ∆pi

The long-memory version of the IV-ARCH(∞) model, referred to as IV-FIARCH(∞) model,

is defined by:

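\begin{align*}
\Delta p_i &\sim \{\pi_{-K}(\mu_i, \sigma_i^2; \alpha), \ldots, \pi_K(\mu_i, \sigma_i^2; \alpha)\} \\
\mu_i &= x_i' \beta \\
\sigma_i^2 &= \gamma_{0i} + \gamma_2 (1 - L)^{-d} \left[ \Delta p_{i-1}^2 - E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2) \right] , \tag{5}
\end{align*}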

(5)

where $d$ is the coefficient of fractional integration, and $L$ denotes the lag operator (see footnote 9). The power
series expansion of $(1 - z)^{-d}$ around $z = 0$ gives a sequence of coefficients $\{\theta_j : j \ge 0\}$, with the
property $\sum_{j=0}^{\infty} \theta_j^{\delta} < \infty$ for all $\delta > (1 - d)^{-1}$ and $0 \le d < 1$. In particular, $\{\theta_j : j \ge 0\}$ is not
absolutely summable, but is square-summable, for $0 < d < 1/2$. This implies a non-summable
autocovariance function under the assumption of stationarity of the conditional volatility process
and constant $\gamma_{0i}$.
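The behavior of the coefficients θj can be illustrated numerically. The Python sketch below uses the standard recursion θ0 = 1, θj = θj−1 (j − 1 + d)/j for the expansion of (1 − z)^(−d); the function name `frac_coeffs` and the value d = 0.3 are assumptions chosen only for illustration.

```python
import math
import numpy as np

def frac_coeffs(d, n):
    """First n coefficients theta_0, ..., theta_{n-1} of the expansion of (1 - z)^(-d)."""
    theta = np.empty(n)
    theta[0] = 1.0
    for j in range(1, n):
        theta[j] = theta[j - 1] * (j - 1 + d) / j
    return theta

if __name__ == "__main__":
    d = 0.3
    theta = frac_coeffs(d, 20001)
    # Hyperbolic decay: theta_j behaves like j^(d-1) / Gamma(d) for large j.
    j = 20000
    print(theta[j], j ** (d - 1.0) / math.gamma(d))
    # Partial sums keep growing, partial sums of squares level off.
    for n in (200, 2000, 20000):
        print(n, theta[:n].sum(), np.sum(theta[:n] ** 2))
```

The partial sums of θj grow roughly like n^d, while the partial sums of squares settle down, in line with the summability statement above for 0 < d < 1/2.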

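Similarly to the IV-GARCH(1,1) model (3), the sequence $\{\Delta p_i^2 - E(\Delta p_i^2 \mid \sigma_i^2) : i \in \mathbb{Z}\}$ is
by construction a martingale difference sequence, where both $E|\Delta p_i^2 - E(\Delta p_i^2 \mid \sigma_i^2)| < \infty$ and
$E\left[ \Delta p_i^2 - E(\Delta p_i^2 \mid \sigma_i^2) \mid \mathcal{F}_{i-1} \right] = 0$ hold because the random variable $\Delta p_i^2$ is bounded between 0
and $K^2$. It follows that $E\sigma_i^2 = E\gamma_{0i}$.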

A sufficient non-negativity condition for the $\{\sigma_i^2 : i \in \mathbb{Z}\}$ process in the IV-FIARCH(∞)
model is given in Theorem 4:

THEOREM 4. The conditional volatility process in the IV-FIARCH(∞) model (5) is non-negative
if the map $\sigma^2 \mapsto \sigma^2 - \gamma_2 E(\Delta p^2 \mid \sigma^2)$ is non-negative.

An immediate consequence of Theorem 4 is that $\lim_{\sigma^2 \to 0} E(\Delta p^2 \mid \sigma^2) = 0$. This implies $\delta = 0$ in
A2 for the IV-FIARCH(∞) model.

Unlike in the IV-GARCH(1,1) case, only a deterministic stability result for the conditional
volatility process in model (5) is available, due to the non-Markovian structure of the
IV-FIARCH(∞) model:

THEOREM 5. Under the conditions of Theorem 4, the deterministic part of the volatility process
in the IV-FIARCH(∞) model, defined by $\sigma_i^2 = \gamma_0 - \gamma_2 \sum_{j=1}^{i} \psi_{j-1} E(\Delta p_{i-j}^2 \mid \sigma_{i-j}^2)$, is globally stable
and has a unique limit point $\sigma^2 = 0$.

Footnote 9: Baillie (1996) gives an overview of recent results on fractional integration in econometrics.

So far we introduced two simple dynamic heteroscedasticity models for high-frequency price

changes. As seen from equations (3) and (5), IV-GARCH(1,1) and IV-FIARCH(∞) models

have relatively restricted short- and long-memory dynamics. Work is under way to extend
the models to richer dynamics, including a short-memory part in the IV-FIARCH model.

3.5 Estimation and diagnostics for IV-GARCH and IV-FIARCH models

Statistical inference in IV-GARCH and IV-FIARCH models is based on the straightforward

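maximum-likelihood procedure. Given the set of observed data $\{\Delta p_i, x_i, z_i : 1 \le i \le N\}$, the
log-likelihood function is given by:

\[
L_N\left( \alpha, \beta, \gamma, \delta \mid \Delta p_i, x_i, z_i : 1 \le i \le N \right) = \sum_{i=1}^{N} \log\left( \pi_i' 1_{\Delta p_i} \right) .
\]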

In IV-FIARCH model the parameter d replaces γ1 in the log-likelihood function. In the em-

pirical part of the paper in section 5 we employ numerical procedures to calculate the gradient

and the Hessian matrix of the log-likelihood function. We use the BHHH method of Berndt, Hall,

Hall and Hausman (1974) for numerical maximization of the log-likelihood functions. Stan-

dard errors of the parameter estimates are computed from the diagonal elements of the inverted

negative Hessian matrix at the point of maximum of the log-likelihood function.
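As a rough illustration of how the likelihood can be evaluated, the following Python sketch computes the log-likelihood of a simplified IV-GARCH(1,1) model by running the volatility recursion and summing the log-probabilities of the observed states. It is not the estimation code used in this paper: the name `iv_garch_loglik` is hypothetical, µi is fixed at zero, γ0 is constant, the thresholds α are treated as known, the logistic scale is taken to be the square root of σ2i, and the recursion is started at γ0/(1 − γ1).

```python
import numpy as np

def iv_garch_loglik(params, dp, alpha_pos):
    """Log-likelihood of a simplified IV-GARCH(1,1); dp holds price changes in ticks."""
    gamma0, gamma1, gamma2 = params
    a = np.asarray(alpha_pos, dtype=float)
    cuts = np.concatenate((-a[::-1], a))          # mirrored logistic thresholds
    K = len(a)
    k = np.arange(-K, K + 1)
    sigma2 = gamma0 / (1.0 - gamma1)              # assumed start value
    loglik = 0.0
    for x in dp:
        if sigma2 <= 0.0:
            return -np.inf                        # outside the admissible region
        cdf = 1.0 / (1.0 + np.exp(-cuts / np.sqrt(sigma2)))
        p = np.diff(np.concatenate(([0.0], cdf, [1.0])))
        loglik += np.log(p[x + K])                # log-probability of the observed state
        sigma2 = gamma0 + gamma1 * sigma2 + gamma2 * (x**2 - np.sum(k**2 * p))
    return loglik

if __name__ == "__main__":
    dp = np.array([0, 1, -1, 0, 0, 0, 1, 0])
    print(iv_garch_loglik((0.2, 0.8, 0.1), dp, [1.0]))
```

The negative of this function could be passed to any numerical optimizer; the BHHH iterations and the Hessian-based standard errors described above are not reproduced here.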

Diagnostics in the IV-GARCH and IV-FIARCH models can be based on generalized resid-

uals defined as follows:

\[
u_i = \frac{\Delta p_i - E(\Delta p_i \mid \sigma_i^2)}{\sqrt{V(\Delta p_i \mid \sigma_i^2)}} . \tag{6}
\]

By construction, under the maintained hypothesis that $\{\Delta p_i : i \in \mathbb{Z}\}$ is generated from the IV-GARCH
or IV-FIARCH model, the sequence $\{u_i : 1 \le i \le N\}$ will be homoscedastic and uncorrelated.
A battery of standard test procedures can be applied to $\{u_i : 1 \le i \le N\}$ to test these
assumptions in empirical applications of the models.
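The generalized residuals in (6) and one standard check on them can be sketched in Python as follows. The function names `generalized_residuals` and `ljung_box` are hypothetical, the conditional state probabilities are assumed to be supplied as one row per observation, and the Ljung-Box statistic is only one example of the test procedures that could be applied.

```python
import numpy as np

def generalized_residuals(dp, probs):
    """u_i = (dp_i - E_i) / sqrt(V_i), with moments taken from each row of probs."""
    probs = np.asarray(probs, dtype=float)
    K = (probs.shape[1] - 1) // 2
    k = np.arange(-K, K + 1)
    mean = probs @ k                      # conditional mean in ticks
    var = probs @ k**2 - mean**2          # conditional variance in squared ticks
    return (np.asarray(dp) - mean) / np.sqrt(var)

def ljung_box(u, lags):
    """Ljung-Box Q statistic using sample autocorrelations up to the given lag."""
    u = np.asarray(u, dtype=float) - np.mean(u)
    n = len(u)
    denom = np.sum(u**2)
    r = np.array([np.sum(u[h:] * u[:-h]) / denom for h in range(1, lags + 1)])
    return n * (n + 2) * np.sum(r**2 / (n - np.arange(1, lags + 1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = np.array([0.2, 0.6, 0.2])                  # illustrative static probabilities
    dp = rng.choice([-1, 0, 1], size=2000, p=base)
    u = generalized_residuals(dp, np.tile(base, (2000, 1)))
    print(u.mean(), u.var(), ljung_box(u, 10), ljung_box(u**2, 10))
```

Applying `ljung_box` to the residuals and to their squares gives simple checks of the uncorrelatedness and homoscedasticity implied by a correctly specified model.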

4 Data and descriptive statistics

In this section we give an overview of the high-frequency transaction data used in the empirical

part of the paper in section 5. We present simple descriptive statistics of the dataset, discuss

evidence of long-range dependence in the volatility of high-frequency price changes and its

dependence on the tails of price changes distribution.

In section 5 we apply IV-GARCH and IV-FIARCH models to the high-frequency IBM trades

data. This dataset has been used in the series of recent papers by Engle and Russell (1998),

Russell and Engle (1998) and Engle (2000). The dataset originates from the TORQ database

and covers trades of the IBM stock on the weekdays during the three-month period from


November, 1990 through January, 1991. There are 60328 observations in the sample, and

besides the date and the timestamp the dataset also includes information on the traded volume,

transaction price and the bid-ask price of the stock at the time of the trade.

From this dataset we extract transaction prices and create price changes data ∆pi by

taking their first difference. The prices and price changes are discrete due to the institutional

structure of the NYSE where the stock is traded. ∆pi is a multiple of $1/8, and we delete

a few “keypunch” errors where this has not been the case. In addition to the high-frequency

price changes, we calculate inter-trade durations and construct trading hour indicator dummies

for picking up intraday seasonal effects. We also create buyer-seller indicators by comparing

the current transaction price with the mid-quote price coming at least 5 seconds before the

transaction; see Campbell, Lo and MacKinlay (1997) pp. 136-137 for the discussion of this

methodology.

It has become a common practice in the literature to filter out observations that have zero

inter-trade durations and zero price changes; see Russell and Engle (1998) and Jasiak (1998)

among others. This adjustment is believed to help reduce the influence of so-called split

trades, whereby larger orders are divided into a number of smaller ones traded at the same price.

Therefore, the filtered price changes can be attributed to the unique transactions leading to

the price movements due to the new information arrivals and/or position adjustments/liquidity

considerations. In the empirical application of IV-GARCH and IV-FIARCH models in section 5

we also follow this procedure for the IBM data.

Table 1: Probabilities of ∆pi.

s_k            Pr(∆pi = s_k)
≤ -0.50000     0.0042211
-0.37500       0.0027775
-0.25000       0.014217
-0.12500       0.16002
0.0000         0.63633
0.12500        0.16016
0.25000        0.014947
0.37500        0.0031064
≥ 0.50000      0.0042211

The distribution of the resulting ∆pi series is shown on figure 1. Table 1 shows the state
probabilities of high-frequency ∆pi, where we censor price changes higher than $1/2 in absolute
value. Although the distribution of the ∆pi series on figure 1 is notably symmetric around its
middle state, table 1 shows that the right half of the distribution has slightly higher probability
mass, reflecting the upward drift of the IBM stock price during the sample period. Nevertheless,
the assumption of a symmetric unconditional distribution of price changes in the IV-GARCH and
IV-FIARCH models made in section 3 seems warranted in view of the evidence in table 1 and
figure 1. Note that the parameter µi in IV-GARCH and IV-FIARCH models is designed to pick
up short-term fluctuations in the first moment of high-frequency price changes, and is likely to
have only limited success in describing its long term trend.

Descriptive statistics of the ∆pi series together with its transformations are given in table 2.
After all data adjustments there are 54725 observations in the sample. It is seen that the
sample average of the series is statistically indistinguishable from zero, and that the small variance of
∆pi reflects the dominating probability of zero price movement in the data. Symmetry of the
distribution of ∆pi is confirmed by the statistically insignificant skewness statistic. Portmanteau
statistics presented in the table clearly indicate the presence of dynamic structure in
both the first and second moments of high-frequency price changes. This is further confirmed
by the estimated autocorrelation functions on figure 2 (see footnote 10). It is seen that, while there is a
significant negative first-order autocorrelation in the ∆pi series, all higher-order autocorrelations
are statistically insignificant. This pattern corresponds very well to the bid-ask bounce model
of Roll (1984). In the empirical part of the paper in section 5 we include lags of the buyer-seller
indicator in our specification of µi to capture this effect.

Table 2: Sample statistics of high-frequency price changes.

               ∆p_i           |∆p_i|         ∆p_i^2
Mean           0.000342622    0.0558748*     0.0140304*
Variance       0.0140303      0.0109084      0.0244471
Skewness       -0.02079       8.3517*        61.4053*
Kurtosis       125.193*       187.434*       4555.33*
Q(500)         12773.6*       42829*         32837.3*
Maximum        3.625          3.625          13.1406
Minimum        -3.625         0              0
No. of obs.    54725          54725          54725

Notes: ∆p_i denotes the high-frequency price change for IBM transaction
data. Tick size equals $1/8. Skewness and kurtosis statistics
and their standard errors are according to Jarque and Bera (1987).
Q(500) denotes the Ljung-Box statistic with autocorrelations up to 500
lags; see Ljung and Box (1978). A star next to a test statistic indicates
significance at the 5% level for the appropriate distribution.

The lower panel of figure 2 reveals a substantial degree of persistence in the second moment of
high-frequency price changes in the IBM trades data, as shown by 500 lags of significant
autocorrelations in the |∆pi| series. A similar pattern in irregularly spaced stock market data has been

documented earlier in Rydberg and Shephard (2000), while Andersen and Bollerslev (1997a,

1997b, 1998) document long-range dependence in the volatility of five-minute foreign exchange

returns.

[Footnote 10: Under the i.i.d. normality assumption, the 95% confidence band for estimated autocorrelations is given by ±1.96/√T. Confidence intervals for the correlograms shown in figure 2 are ±0.0084 for both the ∆pi and |∆pi| series.]

[Figure 6: Correlograms of censored |∆pi| series: 7 states (upper panel), 5 states (middle panel) and 3 states (lower panel). Horizontal axes: lags 0 to 500.]

The number of parameters in IV-GARCH and IV-FIARCH models in section 3 depends on the number of support points in the distribution of high-frequency price changes. As shown

in table 1, the states of ∆pi far in the tails of the distributions have quite small probability.

Therefore, in section 5 of the paper we estimate IV-GARCH and IV-FIARCH models with

reduced number of states of ∆pi, saving on the estimation time by cutting the parameters that

are likely to be estimated inefficiently. However, there has been little evidence in the literature

on the effects of the censoring of large price changes in high-frequency data on the dynamics of

the volatility of ∆pi series. Figure 6 depicts correlograms of |∆pi| series censored to 7, 5 and

3 states (footnote 11). The very extreme states of the price change distribution evidently do not play a major role in the volatility dynamics of the series. When ∆pi is censored down to 3 states from the initial 37 states, which is equivalent to losing less than 5% of the information in the initial distribution, the autocorrelations of absolute price changes become remarkably low and are statistically insignificant after only 50 lags. With 7 states in the price change distribution, still far fewer than in the uncensored data, the dynamics of |∆pi| closely resemble those of the original series in figure 2.
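The censoring step can be illustrated with a short sketch. It assumes that censoring to n states means pooling every price change beyond the outermost retained tick into that outermost state, and that price changes are expressed in integer ticks; both the function names and the toy data are hypothetical.

```python
import numpy as np


def censor_ticks(dp_ticks, n_states):
    """Clip integer tick changes to n_states symmetric support points.

    Assumption: changes beyond +/-K ticks, K = (n_states - 1) / 2,
    are pooled into the outermost states -K and +K.
    """
    if n_states < 3 or n_states % 2 == 0:
        raise ValueError("n_states must be an odd number >= 3")
    k = (n_states - 1) // 2
    return np.clip(np.asarray(dp_ticks), -k, k)


def share_censored(dp_ticks, n_states):
    """Fraction of observations altered by the censoring."""
    k = (n_states - 1) // 2
    return float(np.mean(np.abs(np.asarray(dp_ticks)) > k))


# Toy example in ticks (not the IBM data).
ticks = np.array([-29, -2, -1, 0, 0, 1, 3, 5])
ticks_3 = censor_ticks(ticks, 3)
share_3 = share_censored(ticks, 3)
share_7 = share_censored(ticks, 7)
```

Comparing the correlogram of the absolute censored series with that of the original series then shows how much volatility dynamics is carried by the pooled tail states.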

[Footnote 11: The confidence band for autocorrelations in figure 6 is ±0.0084 for all three graphs.]

5 Application to IBM transaction data

In this section we report estimation results for IV-GARCH and IV-FIARCH models using the IBM transaction dataset introduced in section 4. We study the influence of several explanatory variables on the conditional volatility part of IV-GARCH and IV-FIARCH models, but our main goal is to gauge the overall success of these two models in explaining volatility dynamics of discrete high-frequency data. We present model diagnostics based on the generalized residuals (6) to assess the quality of the fit (footnote 12).

In the application of IV-GARCH and IV-FIARCH models to IBM data we censor high-frequency price changes to 7 states. This allows us to save on the number of estimated parameters while preserving the essential dynamic structure of the volatility in the data; see section 4. With 7 support points in the estimated distribution of ∆pi the dimensionality of α is 6, out of which 2 parameters are free.

The conditional mean parameter µi and the parameter γ0i in IV-GARCH and IV-FIARCH

models are parametrized using the following sets of predetermined and exogenous variables.

For the µi parameter we use:

- Ibsi−1 is lagged buyer-seller indicator constructed as outlined in section 4. We include

one lag of this variable to pick up the bid-ask bounce in the first moment of ∆pi series;

- ∆pi−1 is included to account for possible autoregressive dynamics of the first moment of

∆pi series. Hausman, Lo and MacKinlay (1992) found the lags of endogenous variable to

be significant in the conditional mean of their ordered probit model;

- Durati is the inter-trade duration variable for the current observation. We treat this

variable as exogenous with respect to the price changes in all models in this section,

although this assumption may not be entirely realistic; see Ghysels and Jasiak (1997) for

the evidence of why this may not be so.

The parameter γ0i in the conditional volatility part of the models includes the following vari-

ables:

- Negdpi−1 is the indicator variable showing occurrence of the previous negative price

change in the series. This variable is included to study possible leverage effect in the

high-frequency data, refer to Nelson (1991). Rydberg and Shephard (2003) found this

variable to be significant in their ADS decomposition model;

- Durati is the same variable that appears in the conditional mean part of the model. By

including inter-trade duration variable two times we study the effect of trading intensity

separately on the first and second moments of ∆pi. Hausman, Lo and MacKinlay (1992)

found this variable marginally significant in static conditional volatility part of their

model;

- Trade hour dummies pick up possible intraday seasonality in the second moment of high-

frequency price changes. Intraday seasonal patterns in the volatility of high-frequency

data are widely reported for the regularly spaced data; see Andersen, Bollerslev and

Cai (2000) for recent evidence.

[Footnote 12: All models in this paper are estimated using Ox version 2.20 for Linux 2.2.17; refer to Doornik (1998).]

As was mentioned in section 3, all variables in the conditional mean part of IV-GARCH and

IV-FIARCH models, with an exception of the constant term, should be normalized to have

zero means. This is done for all exogenous variables entering µi in both models. Signs of the

coefficients for the variables listed above were not restricted during the estimation procedure.

Therefore, the direction of the influence of included explanatory variables on the second moment

of high-frequency price changes coincides with the signs of the estimated parameters.

First, we discuss estimation results for the short memory IV-GARCH model. As seen

from table 3, the γ1 parameter of the conditional volatility process is highly significant, but

is firmly below unity. With the value of this parameter estimated at 0.93478, the half-life of a unit shock to the conditional variance process is only about 10 periods. This seems to contradict the evidence of long-range dependence in the volatility of the ∆pi series presented in section 4. Generalized residuals diagnostics presented in the bottom half of table 3 also hint at unexplained dynamics left in both the first and second moments of the ∆pi series. However, graphical examination of the correlograms of generalized residuals and absolute generalized residuals in figure 7 (footnote 13) reveals a dramatic reduction of the volatility dynamics of the residuals compared to the original series. In fact, as suggested by the lower right panel of figure 7, the significance of the Ljung-Box statistic of squared residuals stems from the unexplained first-order autocorrelation in the volatility of the ∆pi series, indicating possible omission of explanatory variables or the need for extra MA dynamics in σ2i. Most importantly, however,

the IV-GARCH model seems to do a good job in picking up the conditional heteroscedasticity

of high-frequency price changes.
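The half-life quoted above can be checked with a one-line calculation. The sketch assumes that a unit shock to the conditional variance decays geometrically at rate γ1, so that the half-life h solves γ1^h = 0.5; the function name is hypothetical.

```python
import math


def half_life(gamma1):
    """Number of periods after which a shock decaying as gamma1**h is halved."""
    if not 0.0 < gamma1 < 1.0:
        raise ValueError("gamma1 must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(gamma1)


# Point estimate of gamma1 for the IV-GARCH model in table 3.
hl_garch = half_life(0.93478)
```

For the estimate 0.93478 this gives a half-life of slightly more than 10 periods, in line with the figure reported in the text.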

Most coefficients of other explanatory variables in the IV-GARCH model in table 3 have

expected signs. The buyer-seller indicator Ibsi−1 has the expected negative influence on the conditional mean part of the model, indicating increased probability of a negative price change when the transaction is seller-initiated. However, as seen from the lower left panel of figure 7, generalized residuals still retain significant first-order negative autocorrelation. This may signal the failure of the Ibsi−1 variable to correctly classify all transactions in the dataset as either buyer-initiated or seller-initiated (footnote 14). The Durati variable is only significant in the conditional mean part, and also indicates right skew of the distribution of ∆pi for longer inter-trade durations. As

in the static model of Hausman, Lo and MacKinlay (1992), inter-trade durations have no sig-

nificant influence in the conditional volatility part of the model. A somewhat surprising result

shown in table 3 is the effect of the previous negative price change on the conditional volatility

of ∆pi. In contrast to the leverage hypothesis of Nelson (1991), conditional volatility of high-

frequency ∆pi in IBM data becomes lower after the preceding stock price drop, although the

statistical significance of the coefficient estimate of Negdpi−1 is high for the given sample size.

Effect of intraday seasonality dummies on σ2i is also insignificant.

[Footnote 13: The confidence band for autocorrelations in figure 7 is ±0.0084 for all three graphs.]

[Footnote 14: It is also necessary to note here that the methodology of creating the Ibsi−1 variable does not allow for full data classification. In our dataset around 21% of observations are left unclassified, which may also contribute to the remaining first-order autocorrelation in the generalized residuals.]

Table 3: Results of modeling high-frequency IBM trades series.

                      IV-GARCH                      IV-FIARCH

α parameters
  α1                  2.5553       (0.022118)       2.5555      (0.039110)
  α2                  3.4401       (0.041139)       3.3469      (0.064460)

Conditional mean parameters
  Const               0.0043945    (0.0068764)      0.0086375   (0.013443)
  Ibsi−1              -0.30770     (0.0084919)      -0.32839    (0.016626)
  ∆pi−1               -5.3752      (0.091196)       -5.4225     (0.16522)
  Durati              -0.055171    (0.0095503)      -0.10364    (0.030933)

Conditional variance parameters
  Const               0.030987     (0.0037965)      1.5471      (0.44586)
  Negdpi−1            -0.024142    (0.0071386)      0.083911    (0.021269)
  Durati              -0.0017309   (0.0015081)      -0.032359   (0.021301)
  9-10h               -0.0011467   (0.0016714)      -0.21259    (0.047269)
  11-12h              -0.0016986   (0.0016620)      -0.16512    (0.063266)
  13-14h              -0.0020903   (0.0017012)      -0.17244    (0.064233)
  γ1                  0.93478      (0.0086118)      —           —
  γ2                  0.069761     (0.0060288)      0.10259     (0.011412)
  d                   —            —                0.60528     (0.033268)

Generalized residuals diagnostics
  Skewness            -0.153967∗                    -0.132489∗
  Kurtosis            5.55161∗                      5.35271∗
  Q(500)              1777.92∗                      1081.83∗
  Q2(500)             729.472∗                      594.943∗

  Log lik.            -27929.287                    -8619.459
  No. of obs.         54725                         22000

Notes: Parameters of the models are denoted as detailed in the text. Asymptotic maximum-likelihood standard errors are given in parentheses. Generalized residuals are calculated according to equation (6). Skewness and kurtosis statistics and their standard errors are according to Jarque and Bera (1987). Q(500) denotes the Ljung-Box statistic of the generalized residuals and Q2(500) that of the squared generalized residuals, with autocorrelations up to 500 lags; see Ljung and Box (1978). A star next to a test statistic indicates significance at the 5% level for the appropriate distribution.
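For readers who wish to gauge individual coefficients in table 3, asymptotic t-ratios can be formed as the estimate divided by its standard error. The sketch below does this for the two conditional variance coefficients discussed in the text; the dictionary layout and function name are hypothetical, and the 1.96 threshold assumes asymptotic normality of the estimates.

```python
# (estimate, standard error) pairs copied from the conditional variance
# block of table 3; the dictionary structure itself is illustrative only.
VARIANCE_COEFS = {
    "Negdp": {"IV-GARCH": (-0.024142, 0.0071386),
              "IV-FIARCH": (0.083911, 0.021269)},
    "Durat": {"IV-GARCH": (-0.0017309, 0.0015081),
              "IV-FIARCH": (-0.032359, 0.021301)},
}


def t_ratio(estimate, std_err):
    """Asymptotic t-ratio of a coefficient estimate."""
    return estimate / std_err


t_ratios = {
    name: {model: t_ratio(*pair) for model, pair in by_model.items()}
    for name, by_model in VARIANCE_COEFS.items()
}

# Two-sided 5% critical value under asymptotic normality (assumption).
CRIT_5PCT = 1.96
significant = {
    name: {model: abs(t) > CRIT_5PCT for model, t in by_model.items()}
    for name, by_model in t_ratios.items()
}
```

Under these assumptions the duration coefficient in the conditional variance falls below the 5% threshold in both models.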

[Figure 7: Residuals diagnostics for IV-GARCH model. Correlograms of ui (lower left panel) and |ui| (lower right panel). Estimated density of ui (upper right panel) and correlogram of |∆pi| (upper left panel). Correlogram axes: lags 0 to 500.]

[Figure 8: Residuals diagnostics for IV-FIARCH model. Correlograms of ui (lower left panel) and |ui| (lower right panel). Estimated density of ui (upper right panel) and correlogram of |∆pi| (upper left panel). Correlogram axes: lags 0 to 500.]

Estimation results for the long memory IV-FIARCH model are given in the right half of table 3 (footnote 15). We use the same set of exogenous explanatory variables as in the IV-GARCH model. The estimate of the coefficient of fractional integration is above 0.5 and highly statistically significant. Figure 8 (footnote 16) shows high persistence of the volatility of ∆pi in the estimation sub-sample as well as a dramatic reduction of this persistence in the estimated generalized residuals, although, as in the IV-GARCH model, some significant low-order autocorrelation is still present. Effects of the included explanatory variables remain the same, except for Negdpi−1 in the conditional variance part of the model. Even though it now supports the leverage effect of Nelson (1991), the estimated standard error of the coefficient is relatively large for the given sample size.

Surprisingly, the IV-GARCH and IV-FIARCH models perform very similarly in terms of the residuals diagnostics shown in figures 7 and 8, even though the AR(1) structure of the volatility process of the IV-GARCH model is not able to pick up the high persistence exhibited

by the data. Relatively low point estimate of γ1 in the model reported in table 3 seems to

reaffirm the analogous results documented in Ghysels and Jasiak (1997) and Hasbrouck (1999);

see subsection 2.4. In summary, the findings call for further investigation into potential infre-

quent changes in the volatility regimes on the market that can create statistical illusion of the

long-range dependence in the data.

6 Conclusions

In this paper we present a class of models for time-series of discrete high-frequency price

changes. In contrast to the ACM model of Russell and Engle (1998) and the ADS decomposition

model of Rydberg and Shephard (2003), our models have separate set of parameters linked to

the first two moments of the conditional distribution of discrete price changes. We borrow the

idea of Hausman, Lo and MacKinlay (1992) and specify discrete distribution of ∆pi similarly to

the well-known ordered logit model. But unlike the latter study, our models are cast entirely in

terms of discrete random variables, including procedures for model diagnostics. We introduce

IV-GARCH and IV-FIARCH models for heteroscedastic sequences of discrete price changes,

where volatility parameter has the dynamic structure resembling the one in GARCH models

of Bollerslev (1986).

Separate sets of parameters for the moments of discrete price changes allow us to isolate

effects of exogenous variables on conditional mean and conditional variance of ∆pi. We present

application of IV-GARCH(1,1) and IV-FIARCH(∞) models to high-frequency IBM trades

data, where we study the effects of inter-trade durations, buyer-seller indicator and previous

negative price changes on the moments of high-frequency price changes.

[Footnote 15: Please note that the model is estimated on the reduced sample due to computational time considerations.]

[Footnote 16: The confidence band for autocorrelations in figure 8 is ±0.0132 for all three graphs.]

We find that both IV-GARCH and IV-FIARCH models explain dynamic heteroscedasticity of the ∆pi series quite well. In particular, both models succeed in explaining most of the long-range dependence observed in the absolute price changes series, although some low-order dependence remains. Unexpectedly, the short-memory IV-GARCH(1,1) model fits the dataset at least as well as the IV-FIARCH(∞) model in terms of residuals diagnostics. This may

indicate that observed long-range dependence in |∆pi| comes from the infrequent changes in

the volatility regimes of the stock market, rather than from the generic long-memory structure

in the second moment.

Current research efforts in the literature are directed to joint modeling of price changes and

durations in high-frequency financial data; see Gerhard and Pohlmeier (2000). In this paper

we estimate a conditional model of price changes volatility and find an insignificant effect of

the immediately preceding inter-trade duration on σ2i parameter both in IV-GARCH and IV-

FIARCH models. This finding is surprising and calls for further investigation of the interaction

between the two variables in the joint model. Combination of the ACD model of Engle and

Russell (1998) and IV-GARCH model introduced in this paper may provide a useful tool for

such analysis.

The interrelations between possible volatility regimes and long-range dependence in the

second moment of high-frequency price changes is another research issue raised by the findings

in this paper. The literature on potential links between structural breaks and long memory

in the volatility of financial data is numerous (see Hamilton and Susmel (1994), Lamoureux

and Lastrapes (1990) and Liu (2000) among others), but is mostly limited to lower frequency

financial data. The IV-FIARCH model offers an opportunity to study this issue in high-

frequency datasets.

7 Appendix

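PROOF OF THEOREM 1: The noise-free skeleton of the volatility process in (3) is given by the following expression, for an arbitrarily large $0 < N < \infty$:

\[
\sigma_i^2 = \sum_{j=1}^{N} \gamma_1^{j-1} \left[ \gamma_0 - \gamma_2 E(\Delta p_{i-j}^2 \mid \sigma_{i-j}^2) \right] + \gamma_1^{N} \sigma_{i-N}^2 . \tag{A.1}
\]

Assume that the process is started at $\sigma_{-N}^2$. Recursive substitution into the equation above and positivity of $\sigma^2 \mapsto \gamma_0 + \gamma_1 \sigma^2 - \gamma_2 E(\Delta p^2 \mid \sigma^2)$ shows that:

\begin{align*}
\sigma_{-N}^2 &= \gamma_0 \\
\sigma_{-N+1}^2 &= \gamma_0 - \gamma_2 E(\Delta p_{-N}^2 \mid \sigma_{-N}^2) + \gamma_1 \sigma_{-N}^2 > 0 \\
\sigma_{-N+2}^2 &= \left[ \gamma_0 - \gamma_2 E(\Delta p_{-N+1}^2 \mid \sigma_{-N+1}^2) \right] + \gamma_1 \left[ \gamma_0 - \gamma_2 E(\Delta p_{-N}^2 \mid \sigma_{-N}^2) \right] + \gamma_1^2 \sigma_{-N}^2 \\
&= \left[ \gamma_0 - \gamma_2 E(\Delta p_{-N+1}^2 \mid \sigma_{-N+1}^2) \right] + \gamma_1 \sigma_{-N+1}^2 > 0 \\
&\;\;\vdots
\end{align*}

By letting $N$ tend to infinity we establish the required result.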

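PROOF OF THEOREM 2: We utilize the invariance principle given in Theorem 2.9 of Tong (1990). We have to show that the mapping $\sigma^2 \mapsto \gamma_0 + \gamma_1 \sigma^2 - \gamma_2 E(\Delta p^2 \mid \sigma^2)$ is continuous and bounded, and that a Lyapunov function exists for the recursion $\sigma_i^2 = \gamma_0 + \gamma_1 \sigma_{i-1}^2 - \gamma_2 E(\Delta p^2 \mid \sigma_{i-1}^2)$.

1. Continuity of the mapping $\sigma^2 \mapsto \gamma_0 + \gamma_1 \sigma^2 - \gamma_2 E(\Delta p^2 \mid \sigma^2)$ follows from A3, whereby the function $\sigma^2 \mapsto E(\Delta p^2 \mid \sigma^2)$ is continuous. Boundedness is an immediate consequence of the parameter restrictions $\gamma_0, \gamma_2 > 0$, $0 < \gamma_1 < 1$ and non-negativity of the function $\sigma^2 \mapsto E(\Delta p^2 \mid \sigma^2)$.

2. Define the identity map $V(\sigma^2) \equiv \sigma^2$. In order for $V$ to be a Lyapunov function for the recursion $\sigma_i^2 = \gamma_0 + \gamma_1 \sigma_{i-1}^2 - \gamma_2 E(\Delta p^2 \mid \sigma_{i-1}^2)$, we have to check that:

\[
V\left( \gamma_0 + \gamma_1 \sigma^2 - \gamma_2 E(\Delta p^2 \mid \sigma^2) \right) - V(\sigma^2) = \gamma_0 + \gamma_1 \sigma^2 - \gamma_2 E(\Delta p^2 \mid \sigma^2) - \sigma^2 \le 0
\]

for $\sigma^2 \in G \subseteq \mathbb{R}_+$. Under A4 there exists a unique solution of the equation $\gamma_0 - (1 - \gamma_1)\sigma^2 - \gamma_2 E(\Delta p^2 \mid \sigma^2) = 0$, provided that $\lim_{\sigma^2 \downarrow 0} E(\Delta p^2 \mid \sigma^2) \le \gamma_0 / \gamma_2$. We denote this solution by $\bar{\sigma}^2$. Hence, $V$ is a Lyapunov function for the recursion $\sigma_i^2 = \gamma_0 + \gamma_1 \sigma_{i-1}^2 - \gamma_2 E(\Delta p^2 \mid \sigma_{i-1}^2)$ on $G = [\bar{\sigma}^2, \infty)$.

Global stability of the recursion $\sigma_i^2 = \gamma_0 + \gamma_1 \sigma_{i-1}^2 - \gamma_2 E(\Delta p^2 \mid \sigma_{i-1}^2)$ follows from the fact that it converges to $\bar{\sigma}^2$ from any $\sigma_0^2 \in G$.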
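The convergence of the noise-free skeleton to its fixed point can be illustrated numerically. In the sketch below the conditional second moment function E(∆p2|σ2) is replaced by the purely hypothetical choice m(s) = s, which is not the function implied by the model of section 3, and the parameter values are borrowed loosely from table 3 for illustration only.

```python
def skeleton_step(s, g0, g1, g2, m):
    """One step of the noise-free recursion s -> g0 + g1*s - g2*m(s)."""
    return g0 + g1 * s - g2 * m(s)


def iterate_skeleton(s0, g0, g1, g2, m, n_steps=500):
    """Iterate the skeleton from s0 and return the whole path."""
    path = [s0]
    for _ in range(n_steps):
        path.append(skeleton_step(path[-1], g0, g1, g2, m))
    return path


def m_linear(s):
    """Hypothetical stand-in for E(dp^2 | sigma^2); an assumption."""
    return s


# Illustrative parameter values (assumption: taken loosely from table 3).
G0, G1, G2 = 0.030987, 0.93478, 0.069761

# With m(s) = s the fixed point solves g0 - (1 - g1)*s - g2*s = 0.
fixed_point = G0 / (1.0 - G1 + G2)

# Start above the fixed point, i.e. inside G = [fixed_point, infinity).
path = iterate_skeleton(5.0, G0, G1, G2, m_linear)
```

Starting inside G, the path decreases monotonically towards the fixed point, which is the behaviour the identity Lyapunov function V encodes.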

LEMMA 1. Suppose that there exists a measurable function $V : X \mapsto [0, \infty)$ and a set $A \in \mathcal{B}(X)$ satisfying:

1. For some $b < \infty$,

\[
PV \le V - 1 + b \, 1_A .
\]

2.

\[
\lim_{i \to \infty} \sup_{a \in A} E\left( V(\sigma_i^2) \, 1(\tau_A > i) \,\middle|\, \sigma_0^2 = a \right) = 0 .
\]

3. For each $m \ge 1$, the family of probability measures $\left\{ \frac{1}{m} \sum_{k=1}^{m} P^k(a, \cdot) : a \in A \right\}$ is tight.

Then the chain is bounded in probability on average.

PROOF: The proof follows directly from Glynn and Meyn (1997).

LEMMA 2. If $\{\sigma_i^2 : i \in \mathbb{Z}\}$ is a weak Feller chain, then for each $m \ge 1$ and compact $A$, the family of probability measures $\left\{ \frac{1}{m} \sum_{k=1}^{m} P^k(a, \cdot) : a \in A \right\}$ is tight.

PROOF: The proof is given in Davis, Rydberg, Shephard and Streett (2001).

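PROOF OF THEOREM 3: To show existence and uniqueness of the stationary distribution of the IV-GARCH(1,1) model we show that the Markov chain $\{\sigma_i^2 : i \in \mathbb{Z}\}$ defined by equation (4) is bounded in probability on average, is an e-chain and possesses a reachable state.

1. To show that the chain $\{\sigma_i^2 : i \in \mathbb{Z}\}$ is bounded in probability on average we verify the three conditions given in Lemma 1 of Glynn and Meyn (1997).

(a) Let the function $V$ be given by the identity map $V(x) = x$, let the set $A$ be the interval $\left[ \bar{\sigma}^2, \frac{1 + \gamma_0}{1 - \gamma_1} \right]$, where $\bar{\sigma}^2 = \frac{\gamma_0 - \gamma_2 E(\Delta p^2 \mid \bar{\sigma}^2)}{1 - \gamma_1}$ is given in Theorem 2, and let $b = 1 + \gamma_2 E(\Delta p^2 \mid \bar{\sigma}^2)$. Recall that (4) is defined on $\mathbb{R}_+$. We are required to show that $PV - V \le -1 + b \, 1_A$. This follows from:

\begin{align*}
PV - V &\equiv E(\sigma_i^2 \mid \sigma_{i-1}^2 = x) - x \\
&= \gamma_0 + \gamma_1 x + \gamma_2 E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2) - \gamma_2 E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2) - x \\
&= \gamma_0 + x(\gamma_1 - 1) \\
&\le -1 + \left( 1 + \gamma_2 E(\Delta p^2 \mid \bar{\sigma}^2) \right) 1_A .
\end{align*}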

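(b) Let the set $A$ be as before and note that by the Cauchy-Schwarz inequality we have that:

\[
\lim_{i \to \infty} \sup_{a \in A} E\left( \sigma_i^2 \, 1(\tau_A > i) \,\middle|\, \sigma_0^2 = a \right) \le \lim_{i \to \infty} \sup_{a \in A} E^{\frac{1}{2}}\left( (\sigma_i^2)^2 \,\middle|\, \sigma_0^2 = a \right) P^{\frac{1}{2}}\left( \tau_A > i \mid \sigma_0^2 = a \right) . \tag{A.2}
\]

The first term in the inequality above can be written as:

\begin{align*}
E\left( (\sigma_i^2)^2 \,\middle|\, \sigma_0^2 = a \right)
&= E\left( \left( \gamma_0 + \gamma_1 \sigma_{i-1}^2 + \gamma_2 [\Delta p_{i-1}^2 - E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2)] \right)^2 \,\middle|\, \sigma_0^2 = a \right) \\
&= E\Big( E\Big( \gamma_0^2 + \gamma_1^2 (\sigma_{i-1}^2)^2 + \gamma_2^2 [\Delta p_{i-1}^2 - E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2)]^2 \\
&\qquad + 2 \gamma_0 \gamma_1 \sigma_{i-1}^2 + 2 \gamma_0 \gamma_2 [\Delta p_{i-1}^2 - E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2)] \\
&\qquad + 2 \gamma_1 \sigma_{i-1}^2 \gamma_2 [\Delta p_{i-1}^2 - E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2)] \,\Big|\, \sigma_{i-1}^2 \Big) \,\Big|\, \sigma_0^2 = a \Big) \\
&= \gamma_0^2 + \gamma_1^2 E\left( (\sigma_{i-1}^2)^2 \,\middle|\, \sigma_0^2 = a \right) + \gamma_2^2 V(\Delta p_{i-1}^2 \mid \sigma_0^2 = a) + 2 \gamma_0 \gamma_1 E(\sigma_{i-1}^2 \mid \sigma_0^2 = a) .
\end{align*}

By recursively substituting $E\left( (\sigma_i^2)^2 \mid \sigma_0^2 = a \right)$ into the expression above the following equation obtains:

\begin{align*}
E\left( (\sigma_i^2)^2 \,\middle|\, \sigma_0^2 = a \right)
&= \gamma_0^2 \sum_{j=0}^{i-1} \gamma_1^{2j} + \gamma_1^{2i} a^2 + 2 \gamma_0 \gamma_1 \sum_{j=0}^{i-1} \gamma_1^{2j} E(\sigma_{i-j-1}^2 \mid \sigma_0^2 = a) \\
&\quad + \gamma_2^2 \sum_{j=0}^{i-1} \gamma_1^{2j} V(\Delta p_{i-j-1}^2 \mid \sigma_0^2 = a) . \tag{A.3}
\end{align*}

In this expression, $E(\sigma_i^2 \mid \sigma_0^2 = a)$ can be written as follows:

\begin{align*}
E(\sigma_i^2 \mid \sigma_0^2 = a)
&= E\left( \gamma_0 + \gamma_1 \sigma_{i-1}^2 + \gamma_2 [\Delta p_{i-1}^2 - E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2)] \,\middle|\, \sigma_0^2 = a \right) \\
&= E\left( E\left( \gamma_0 + \gamma_1 \sigma_{i-1}^2 + \gamma_2 [\Delta p_{i-1}^2 - E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2)] \,\middle|\, \sigma_{i-1}^2 \right) \,\middle|\, \sigma_0^2 = a \right) \\
&= \gamma_0 + \gamma_1 E(\sigma_{i-1}^2 \mid \sigma_0^2 = a) \\
&= \ldots = \gamma_0 \sum_{j=0}^{i-1} \gamma_1^{j} + \gamma_1^{i} a .
\end{align*}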

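From this we have that:

\begin{align*}
2 \gamma_0 \gamma_1 \sum_{j=0}^{i-1} \gamma_1^{2j} E(\sigma_{i-j-1}^2 \mid \sigma_0^2 = a)
&= 2 \gamma_0 \gamma_1 \sum_{j=0}^{i-1} \gamma_1^{2j} \left( \gamma_0 \sum_{l=0}^{i-j-2} \gamma_1^{l} + \gamma_1^{i-j-1} a \right) \\
&= 2 \gamma_0 \gamma_1^{i} a \sum_{j=0}^{i-1} \gamma_1^{j} + 2 \gamma_0^2 \gamma_1 \sum_{j=0}^{i-1} \gamma_1^{2j} \sum_{l=0}^{i-j-2} \gamma_1^{l} ,
\end{align*}

from where, using the fact that $\sum_{l=0}^{i-j-2} \gamma_1^{l} \le \frac{1}{1 - \gamma_1}$, we arrive at the limit:

\[
\lim_{i \to \infty} 2 \gamma_0 \gamma_1 \sum_{j=0}^{i-1} \gamma_1^{2j} E(\sigma_{i-j-1}^2 \mid \sigma_0^2 = a) \le \frac{2 \gamma_0^2 \gamma_1}{(1 - \gamma_1)(1 - \gamma_1^2)} .
\]

Next, consider $V(\Delta p_i^2 \mid \sigma_0^2 = a)$ in equation (A.3). Because the support of the random variable $\Delta p_i^2$ is a finite subset of $\mathbb{N}_0$, its variance will always be bounded. Let the upper bound of $V(\Delta p_i^2 \mid \sigma_0^2 = a)$ be given by $\bar{V}$. Then we have the following limit:

\[
\lim_{i \to \infty} \gamma_2^2 \sum_{j=0}^{i-1} \gamma_1^{2j} V(\Delta p_{i-j-1}^2 \mid \sigma_0^2 = a) \le \lim_{i \to \infty} \gamma_2^2 \sum_{j=0}^{i-1} \gamma_1^{2j} \bar{V} = \frac{\gamma_2^2 \bar{V}}{1 - \gamma_1^2} .
\]

Combining these results for the first part of equation (A.2) we get the following inequality:

\[
E\left( (\sigma_i^2)^2 \,\middle|\, \sigma_0^2 = a \right) \le \frac{\gamma_0^2}{1 - \gamma_1^2} + a^2 + \frac{2 \gamma_0^2 \gamma_1}{(1 - \gamma_1)(1 - \gamma_1^2)} + \frac{\gamma_2^2 \bar{V}}{1 - \gamma_1^2} = c_1 < \infty .
\]

The second part of equation (A.2) follows from Theorem 11.3.4 of Meyn and Tweedie (1993), whereby:

\[
P(\tau_A > i \mid \sigma_0^2 = a) \le \frac{E(\tau_A \mid \sigma_0^2 = a)}{i + 1} \le \frac{V(a) + b \, 1_A(a)}{i + 1} \le \frac{a + 1 + \gamma_2 E(\Delta p^2 \mid \bar{\sigma}^2)}{i + 1} .
\]

The second condition of Lemma 1 then follows from:

\begin{align*}
\lim_{i \to \infty} \sup_{a \in A} E\left( \sigma_i^2 \, 1(\tau_A > i) \,\middle|\, \sigma_0^2 = a \right)
&\le c_1^{\frac{1}{2}} \lim_{i \to \infty} \sup_{a \in A} \left( \frac{a + 1 + \gamma_2 E(\Delta p^2 \mid \bar{\sigma}^2)}{i + 1} \right)^{\frac{1}{2}} \\
&\le c_1^{\frac{1}{2}} \lim_{i \to \infty} \left( \frac{\frac{1 + \gamma_0}{1 - \gamma_1} + 1 + \gamma_2 E(\Delta p^2 \mid \bar{\sigma}^2)}{i + 1} \right)^{\frac{1}{2}} = 0 .
\end{align*}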

(c) The third condition of Lemma 1 is a consequence of Lemma 2 if we show that the chain $\{\sigma_i^2 : i \in \mathbb{Z}\}$ is a weak Feller chain. Recall that a Markov chain is said to be weak Feller if its transition function $P(\cdot, O)$ is a lower semicontinuous function for any open set $O \in \mathcal{B}(X)$; refer to Tweedie (1998). Rewrite (4) as:

\[
\sigma_i^2 = \gamma_0 + \gamma_1 \sigma_{i-1}^2 - \gamma_2 E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2) + \gamma_2 \Delta p_{i-1}^2 .
\]

It follows that the Markov transition kernel $P(\cdot, O)$ for the chain defined by the equation above is given by:

\[
P(x, O) = \sum_{k=-K}^{K} 1_O\left( \gamma_0 + \gamma_1 x - \gamma_2 E(\Delta p^2 \mid x) + \gamma_2 k^2 \right) \pi_k(x) .
\]

Recall that the function $1_O$ is a lower semicontinuous function for an open set $O$, and that according to A3 the functions $x \mapsto E(\Delta p^2 \mid x)$ and $x \mapsto \pi_k(x)$, $k = -K, \ldots, K$, are both continuous in $x$. Hence, $x \mapsto P(x, O)$ is lower semicontinuous for the IV-GARCH(1,1) model.

This finishes the proof that the chain $\{\sigma_i^2 : i \in \mathbb{Z}\}$ is bounded in probability on average.

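2. Recall that a Markov chain is said to be an e-chain if the collection of Markov transition kernels $\{P^n f : n \ge 1\}$ is equicontinuous for each continuous function $f$ with compact support; refer to Meyn and Tweedie (1993). Therefore, it is necessary to show that for any $x, y$ in the state-space of the chain $\{\sigma_i^2 : i \in \mathbb{Z}\}$ and for a given $\varepsilon_1 > 0$ there is $\varepsilon_2 > 0$ s.t. $|P_x^n f - P_y^n f| < \varepsilon_1$ whenever $|x - y| < \varepsilon_2$ for all $n \ge 1$.

We start with one-step transition probabilities. Since by assumption $f$ will be uniformly continuous and bounded, assume without loss of generality that $|f| \le 1$. Observe that:

\begin{align*}
|P_x f - P_y f|
&= \left| \sum_{k=-K}^{K} f\left( \gamma_0 + \gamma_1 x - \gamma_2 E(\Delta p^2 \mid x) + \gamma_2 k^2 \right) \pi_k(x) - \sum_{k=-K}^{K} f\left( \gamma_0 + \gamma_1 y - \gamma_2 E(\Delta p^2 \mid y) + \gamma_2 k^2 \right) \pi_k(y) \right| \\
&= \left| \sum_{k=-K}^{K} \left( f\left( \gamma_0 + \gamma_1 x - \gamma_2 E(\Delta p^2 \mid x) + \gamma_2 k^2 \right) - f\left( \gamma_0 + \gamma_1 y - \gamma_2 E(\Delta p^2 \mid y) + \gamma_2 k^2 \right) \right) \pi_k(x) \right. \\
&\qquad \left. + \sum_{k=-K}^{K} f\left( \gamma_0 + \gamma_1 y - \gamma_2 E(\Delta p^2 \mid y) + \gamma_2 k^2 \right) \left( \pi_k(x) - \pi_k(y) \right) \right| \\
&\le \sum_{k=-K}^{K} \left| f\left( \gamma_0 + \gamma_1 x - \gamma_2 E(\Delta p^2 \mid x) + \gamma_2 k^2 \right) - f\left( \gamma_0 + \gamma_1 y - \gamma_2 E(\Delta p^2 \mid y) + \gamma_2 k^2 \right) \right| \pi_k(x) \\
&\qquad + \sum_{k=-K}^{K} \left| \pi_k(x) - \pi_k(y) \right| .
\end{align*}

Now, using uniform continuity of $f$, the differences $\left| f\left( \gamma_0 + \gamma_1 x - \gamma_2 E(\Delta p^2 \mid x) + \gamma_2 k^2 \right) - f\left( \gamma_0 + \gamma_1 y - \gamma_2 E(\Delta p^2 \mid y) + \gamma_2 k^2 \right) \right|$ can be made less than $\varepsilon' > 0$ whenever $|x - y| < \varepsilon_2$ for any $x, y$ in the state-space of the chain. Also recall that by A3 the functions $x \mapsto \pi_k(x)$ are Lipschitz for all $k = -K, \ldots, K$. Therefore we can select $C = \max\{C_{-K}, \ldots, C_K\}$ s.t. $|\pi_k(x) - \pi_k(y)| \le C |x - y|$ for all $x, y$ in the state-space of the chain. Hence, we arrive at the following inequality:

\[
|P_x f - P_y f| \le \varepsilon' + (2K + 1) C |x - y| .
\]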

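Next, consider the two-step transition probabilities. Using similar arguments we have:

\begin{align*}
|P_x^2 f - P_y^2 f| &= \left| P_x(P_{x'} f) - P_y(P_{y'} f) \right| \\
&\le \sum_{k=-K}^{K} |P_{x'} f - P_{y'} f| \, \pi_k(x) + \sum_{k=-K}^{K} \left| \pi_k(x) - \pi_k(y) \right| ,
\end{align*}

where $x' = \gamma_0 + \gamma_1 x - \gamma_2 E(\Delta p^2 \mid x) + \gamma_2 k^2$ and analogously for $y'$. Then $|x' - y'| = \left| \gamma_1 (x - y) - \gamma_2 \left( E(\Delta p^2 \mid x) - E(\Delta p^2 \mid y) \right) \right| \le \varphi |x - y|$ by A4, where $\varphi < 1$. Hence we have:

\[
|P_x^2 f - P_y^2 f| \le \varepsilon' + (2K + 1) C \varphi |x - y| + (2K + 1) C |x - y| .
\]

By induction,

\begin{align*}
|P_x^n f - P_y^n f| &\le \varepsilon' + (2K + 1) C |x - y| \sum_{j=0}^{n-1} \varphi^{j} \\
&\le \varepsilon' + \frac{(2K + 1) C}{1 - \varphi} |x - y| \\
&\le \varepsilon' + \frac{(2K + 1) C}{1 - \varphi} \varepsilon_2 \le \varepsilon_1 .
\end{align*}

Hence, the collection of Markov transition kernels $\{P^n f : n \ge 1\}$ is equicontinuous for the IV-GARCH(1,1) model.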

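3. Lastly, we show that the point $\bar{\sigma}^2$ is a reachable state of the chain $\{\sigma_i^2 : i \in \mathbb{Z}\}$. It is enough to show that for any open $O \in \mathcal{B}(X)$ containing $\bar{\sigma}^2$ there exists $1 \le n < \infty$ s.t. $P^n(x, O) > 0$ for any starting value $x$ in the state-space of the chain.

From equation (4) we see that $\sigma_i^2$ can be written as:

\[
\sigma_i^2 = \gamma_0 \sum_{j=0}^{i-1} \gamma_1^{j} + \gamma_1^{i} \sigma_0^2 + \gamma_2 \sum_{j=0}^{i-1} \gamma_1^{i-1-j} \Delta p_j^2 - \gamma_2 \sum_{j=0}^{i-1} \gamma_1^{i-1-j} E(\Delta p_j^2 \mid \sigma_j^2) .
\]

Consider the case when $\{\Delta p_i^2 : i \in \mathbb{Z}\}$ is a sequence of zero price innovations, where each zero price innovation has probability $\pi_0(\sigma_i^2)$, which by A2 is strictly greater than zero for all $\sigma_i^2$ in the state-space of the chain. By Proposition 2 the limit of $\sigma_i^2$ is given by:

\begin{align*}
\lim_{i \to \infty} \sigma_i^2 &= \lim_{i \to \infty} \gamma_0 \sum_{j=0}^{i-1} \gamma_1^{j} + \lim_{i \to \infty} \gamma_1^{i} \sigma_0^2 - \lim_{i \to \infty} \gamma_2 \sum_{j=0}^{i-1} \gamma_1^{i-1-j} E(\Delta p_j^2 \mid \sigma_j^2) \\
&= \bar{\sigma}^2 ,
\end{align*}

and by definition of the limit there exists $1 \le n < \infty$ s.t. $\sigma_n^2$ is arbitrarily close to $\bar{\sigma}^2$ with probability $\prod_{i=0}^{n} \pi_0(\sigma_i^2) > 0$.

This finishes the proof of Theorem 3.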

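PROOF OF THEOREM 4: Similarly to the IV-GARCH model, we write the noise-free skeleton of the conditional volatility process $\{\sigma_i^2 : i \in \mathbb{Z}\}$ in (5) as follows:

\[
\sigma_i^2 = \gamma_0 - \gamma_2 \sum_{j=1}^{N} \theta_{j-1} E(\Delta p_{i-j}^2 \mid \sigma_{i-j}^2) , \tag{A.4}
\]

where the sequence $\{\theta_j : j \ge 0\}$ is from the power series expansion of $(1 - z)^{-d}$, refer to Hosking (1981), and $0 < N < \infty$ is arbitrarily large. Assume that the process is started at $\sigma_{-N}^2$. The following recursion holds:

\begin{align*}
\sigma_{-N}^2 &= \gamma_0 \\
\sigma_{-N+1}^2 &= \sigma_{-N}^2 - \gamma_2 \theta_0 E(\Delta p_{-N}^2 \mid \sigma_{-N}^2) \\
\sigma_{-N+2}^2 &= \gamma_0 - \gamma_2 \theta_0 E(\Delta p_{-N}^2 \mid \sigma_{-N}^2) + \gamma_2 \theta_0 E(\Delta p_{-N}^2 \mid \sigma_{-N}^2) \\
&\quad - \gamma_2 \theta_1 E(\Delta p_{-N}^2 \mid \sigma_{-N}^2) - \gamma_2 \theta_0 E(\Delta p_{-N+1}^2 \mid \sigma_{-N+1}^2) \\
&= \sigma_{-N+1}^2 - \gamma_2 \theta_0 E(\Delta p_{-N+1}^2 \mid \sigma_{-N+1}^2) + \gamma_2 (\theta_0 - \theta_1) E(\Delta p_{-N}^2 \mid \sigma_{-N}^2) \\
\sigma_{-N+3}^2 &= \sigma_{-N+2}^2 - \gamma_2 \theta_0 E(\Delta p_{-N+2}^2 \mid \sigma_{-N+2}^2) + \gamma_2 (\theta_0 - \theta_1) E(\Delta p_{-N+1}^2 \mid \sigma_{-N+1}^2) \\
&\quad + \gamma_2 (\theta_1 - \theta_2) E(\Delta p_{-N}^2 \mid \sigma_{-N}^2) \\
&\;\;\vdots
\end{align*}

By induction we can write:

\[
\sigma_i^2 = \sigma_{i-1}^2 - \gamma_2 \theta_0 E(\Delta p_{i-1}^2 \mid \sigma_{i-1}^2) + \gamma_2 \sum_{j=2}^{N} (\theta_{j-2} - \theta_{j-1}) E(\Delta p_{i-j}^2 \mid \sigma_{i-j}^2) ,
\]

from where sufficiency of $\sigma^2 \mapsto \sigma^2 - \gamma_2 E(\Delta p^2 \mid \sigma^2) \ge 0$ follows by letting $N$ tend to infinity and from non-negativity of $\{\theta_{j-1} - \theta_j : j \ge 0\}$.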
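The weights θj and the sign of their successive differences can be checked numerically. The sketch below assumes the standard recursion for the coefficients of (1 − z)^(−d), namely θ0 = 1 and θj = θj−1 (j − 1 + d)/j, and uses the point estimate of d from table 3 purely as an illustration; the function names are hypothetical.

```python
import math


def fi_weights(d, n):
    """First n coefficients of the power series expansion of (1 - z)**(-d)."""
    theta = [1.0]
    for j in range(1, n):
        # Ratio of consecutive coefficients is (j - 1 + d) / j.
        theta.append(theta[-1] * (j - 1 + d) / j)
    return theta


def fi_weight_closed_form(j, d):
    """Closed form Gamma(j + d) / (Gamma(d) * Gamma(j + 1)), for moderate j."""
    return math.gamma(j + d) / (math.gamma(d) * math.gamma(j + 1))


# Point estimate of d for the IV-FIARCH model in table 3 (illustration only).
D_HAT = 0.60528
theta = fi_weights(D_HAT, 200)

# Successive differences theta_{j-1} - theta_j for j = 1, 2, ...
diffs = [theta[j - 1] - theta[j] for j in range(1, len(theta))]
```

For 0 < d < 1 each difference equals θj−1 (1 − d)/j and is therefore positive, which is the non-negativity property used at the end of the proof.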

PROOF OF THEOREM 5: It follows from Theorem 4 that, starting from any $\sigma_{-N}^2 \in \mathbb{R}_+$, the sequence $\{\sigma_i^2 : i \in \mathbb{Z}\}$ from the noise-free skeleton (A.4) of the volatility process in the IV-FIARCH(∞) model is bounded below by zero. At the same time, equation (A.4) implies that the sequence $\{\sigma_i^2 : i \in \mathbb{Z}\}$ is monotonically decreasing. Hence, the result follows from the convergence theorem for monotonic bounded sequences; see Theorem 3.14 in Rudin (1976).

References

Andersen, Torben G. and Tim Bollerslev (1998a) Deutsche Mark-Dollar volatility: intraday

activity patterns, macroeconomic announcements and longer run dependencies. Journal

of Finance, vol. 53, pp. 219-265.

Andersen, Torben G. and Tim Bollerslev (1997a) Intraday periodicity and volatility persistence

in financial markets. Journal of Empirical Finance, vol. 4, pp. 115-158.

Andersen, Torben G. and Tim Bollerslev (1997b) Heterogeneous information arrivals and re-

turns volatility dynamics: uncovering the long-run in high frequency returns. Journal of

Finance, vol. 52, pp. 975-1005.

Andersen, Torben G., Tim Bollerslev and Jun Cai (2000) Intraday and interday volatility in

the Japanese stock market. Journal of International Financial Markets, Institutions and

Money, vol. 10, pp. 107-130.


Baillie, Richard T. (1996) Long memory processes and fractional integration in econometrics.


Ball, C. (1988) Estimation bias induced by discrete security prices. Journal of Finance, vol. 43,

pp. 841-865.

Berndt, E., B. Hall, R. Hall and J. Hausman (1974) Estimation and inference in non-linear

structural models. Annals of Economic and Social Measurement, vol. 3, pp. 653-665.



Bollerslev, Tim, Ray F. Chou and Kenneth F. Kroner (1992) ARCH modeling in finance.


Campbell, John Y., Andrew W. Lo, and A. Craig MacKinlay (1997) The econometrics of

financial markets. Princeton, New Jersey: Princeton University Press.

Cox, D. R. (1981) Statistical analysis of time series: some recent developments. Scandinavian

Journal of Statistics, vol. 8, pp. 93-115.

Crack, Timothy Falcon and Olivier Ledoit (1996) Robust structure without predictability: the

“compass rose” pattern of the stock market. Journal of Finance, vol. 51, no. 2, pp. 751-762.

Davis, A. Richard, Tina Hviid Rydberg, Neil Shephard and Sarah B. Streett (2001) The CBIN

model for counts: testing for common features in the speed of trading, quote changes,

limit and market order arrivals. Preprint.

Doornik, Jurgen A. (1998) Object-Oriented Matrix Programming using Ox 2.0. London: Tim-

berlake Consultants Ltd and Oxford: www.nuff.ox.ac.uk/Users/~Doornik.

Drost, Feike C. and Theo E. Nijman (1993) Temporal aggregation of GARCH processes. Econo-

metrica, vol. 61, no. 4, pp. 909-927.

Dufour, Alfonso and Robert F. Engle (2000) Time and the price impact of a trade. Journal of

Finance, vol. 55, no. 6, pp. 2467-2498.

Engle, Robert F. (2000) The econometrics of ultra-high-frequency data. Econometrica, vol. 68,

no. 1, pp. 1-22.


variance of U.K. inflation. Econometrica, vol. 50, pp. 987-1008.

Engle, Robert F. and Jeffrey R. Russell (1998) Autoregressive conditional duration: a new

model for irregularly spaced transaction data. Econometrica, vol. 66, pp. 1127-1162.

Feller, William (1968) An Introduction to Probability Theory and its Applications, 3rd edition,

Wiley.

Gerhard, Frank and Winfried Pohlmeier (2000) On the Simultaneity of Components of the

Transaction Process. University of Konstanz. Preprint.

Ghysels, Eric (2000) Some econometric recipes for high-frequency data cooking. Journal of

Business and Economic Statistics, vol. 18, no. 2, pp. 154-163.

Ghysels, Eric and Joanna Jasiak (1997) GARCH for irregularly spaced financial data: the

ACD-GARCH model. Preprint.


Gottlieb, Gary and Avner Kalay (1985) Implications of the discreteness of observed stock prices.

Journal of Finance, vol. 40, pp. 135-153.

Glynn, Peter and Sean Meyn (1997) Tightness for non-irreducible Markov chains. Preprint.

Hamilton, James D. and Raul Susmel (1994) Autoregressive conditional heteroskedasticity and

changes in regime. Journal of Econometrics, vol. 64, pp. 307-333.

Hasbrouck, Joel (1999) The dynamics of discrete bid and ask quotes. Journal of Finance,

vol. 54, no. 6, pp. 2109-2142.

Hausman, Jerry A., Andrew W. Lo and A. Craig MacKinlay (1992) An ordered probit analysis

of transaction stock prices. Journal of Financial Economics, vol. 31, pp. 319-379.

Hautsch, Nikolas and Winfried Pohlmeier (2002) Econometric analysis of financial transaction

data: pitfalls and opportunities. Allgemeines Statistisches Archiv, vol. 86, pp. 5-30.


Jarque, C. M. and A. K. Bera (1987) A test for normality of observations and regression

residuals. International Statistical Review, vol. 55, pp. 163-172.


Johnson, Norman L. and Samuel Kotz (1969) Discrete distributions, Houghton Mifflin Company, Boston.

Lamoureux, Christopher G. and William D. Lastrapes (1990) Persistence in variance, structural

change and the GARCH model. Journal of Business and Economic Statistics, vol. 8,

pp. 225-234.

Liu, Ming (2000) Modeling long memory in stock market volatility. Journal of Econometrics,

vol. 99, pp. 139-171.

Ljung, G. M. and G. E. P. Box (1978) On a measure of lack of fit in time series models.

Biometrika, vol. 65, pp. 297-303.

Meyn, Sean and Richard L. Tweedie (1993) Markov chains and stochastic stability, Springer-

Verlag, London.

McFadden, Daniel L. (1984) Econometric analysis of qualitative response models. In Handbook

of Econometrics, vol. II, Elsevier Science, North-Holland.

Nelson, Daniel B. (1991) Conditional heteroskedasticity in asset returns: a new approach.

Econometrica, vol. 59, pp. 347-370.

Roll, R. (1984) A simple implicit measure of the effective bid-ask spread in an efficient market.

Journal of Finance, vol. 39, pp. 1127-1140.

Rudin, Walter (1976) Principles of mathematical analysis. Third Edition, McGraw-Hill Inter-

national.

Russell, Jeffrey R. and Robert F. Engle (1998) Econometric analysis of discrete-valued irregularly-

spaced financial transactions data using a new autoregressive conditional multinomial

model. Preprint.

Rydberg, Tina Hviid and Neil Shephard (2000) A modelling framework for the prices and times

of trades made on the New York stock exchange. Preprint.


Rydberg, Tina Hviid and Neil Shephard (2003) Dynamics of trade-by-trade price movements:

decomposition and models. Journal of Financial Econometrics, vol. 1, pp. 2-25.

Szpiro, George G. (1998) Tick size, the compass rose and market nanostructure. Journal of

Banking and Finance, vol. 22, pp. 1559-1569.

Tong, Howell (1990) Non-linear time series. A dynamic system approach. New York, Oxford

University Press.

Tweedie, Richard L. (1998) Markov chains: structure and applications. Preprint.
