

Journal of Statistical Planning and Inference 143 (2013) 1744–1752

Contents lists available at SciVerse ScienceDirect

journal homepage: www.elsevier.com/locate/jspi

Structural changes in autoregressive models for binary time series

Šárka Hudecová*

Charles University in Prague, Faculty of Mathematics and Physics, Department of Probability and Mathematical Statistics, Sokolovská 83, CZ-186 75 Prague 8, Czech Republic

* Tel.: +420221913342; fax: +420222323316. E-mail address: [email protected]

Article info

Article history:
Received 21 September 2012
Received in revised form 9 May 2013
Accepted 17 May 2013
Available online 25 May 2013

Keywords:
Binary time series
Change point analysis
Extreme value asymptotics
Autoregressive models for binary time series

0378-3758/$ - see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jspi.2013.05.009

Abstract

We study autoregressive models for binary time series with possible changes in their parameters. A procedure for detection and testing of a single change is suggested. The limiting behavior of the test statistic is derived. The performance of the test is analyzed under the null hypothesis as well as under different alternatives via a simulation study. Application of the method to a real data set on US recession is provided as an illustration.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Binary time series play an important role in many fields of application. They are typically observed when one is concerned with the occurrence of an event in a time period. For example, daily occurrences of precipitation can be modeled as binary time series (see Wilks and Wilby, 1999). In finance, one might be interested in recession indicators (see Kauppi and Saikkonen, 2008; Startz, 2008), or in a series of direction-of-change of stock returns.

In some situations, the system under consideration may change at some unknown time points (the so-called change points) while it remains stable between these points. The main objective is then to decide whether a change has occurred and, if this is the case, to estimate the change points. There is a broad statistical literature on the change point topic, see Csörgö and Horváth (1997) for an overview. In the time series setting, various results have been derived for ARMA models (e.g. Horváth, 1993; Davis et al., 1995; Hušková et al., 2007) and GARCH models (e.g. Kokoszka and Teyssière, 2002; Berkes et al., 2004). Ling (2007) deals with detection of changes in general time series models (which include ARMA, GARCH, and others as special cases) under the near epoch dependence assumption. Aue and Horváth (2013) provide an extensive overview of the recent work in this field.

Structural changes in time series of counts have been considered as well. Weiss and Testik (2009) investigate detection of changes in INAR processes. Franke et al. (2012) derive a cumulative sum (CUSUM) type test statistic for Poisson autoregression of order one. The change point problem for more general INGARCH processes is considered by Fokianos and Fried (2010). Darling–Erdős type limit theorems for general strong mixing sequences are provided in Schmitz (2011).


Detection of changes in the success probability of independent binary variables was studied by Pettitt (1980) using a CUSUM type test statistic. Changes in the success probability of independent binomial variables were further considered, e.g. by Worsley (1983), Horváth (1989), Ma (1997), and Serbinowska (1996).

In this paper we study the change point problem within the framework of autoregressive models for binary time series. The proposed test statistic is derived as a maximum type score test statistic for testing a change in the intercept. In applications, any change in the model is usually accompanied by a change in the intercept. The derived test statistic is a maximum of normalized sums of the estimated residuals, but the normalization is slightly more complicated than in the standard CUSUM type statistic. Simulations indicate that the power of the proposed test is higher than that of the CUSUM test, see Remark 2. Due to the form of the test statistic, the test is sensitive to any change in the model which leads to a change in the unconditional success probability, as illustrated by the simulations. Our procedure is closely related to the problem of detection of changes in generalized linear models studied by Antoch et al. (2004).

The paper is organized as follows. The considered model for binary time series is introduced in Section 2. The test statistic for testing a change is suggested in Section 3, and its limiting distribution is given therein. The proofs and preliminary results are provided in Section 4. Results of the simulation study are presented in Section 5, and the real data analysis is given in Section 6. Some concluding remarks are summarized in Section 7.

2. Autoregressive models for binary time series

Let $\{Y_t\}$ be a binary (0–1 valued) time series of interest, and let $\mathcal{F}_{t-1} = \sigma\{Y_s, s \le t-1\}$ be the $\sigma$-field generated by the past $\{Y_s, s \le t-1\}$. Assume that the conditional distribution of $Y_t$ given $\mathcal{F}_{t-1}$ is binary $B(\pi_t)$ such that the success probability $\pi_t$ depends on $p$ previous values of the series via the model

\[
g(\pi_t) = \beta_0 + \beta_1 Y_{t-1} + \cdots + \beta_p Y_{t-p}, \tag{1}
\]

where $g$ is a suitable link function (logit, probit) and $\beta = (\beta_0, \beta_1, \dots, \beta_p)'$ is a vector of unknown parameters. This model is referred to as a binary autoregressive model (BAR) (e.g. Wang and Li, 2011), or a binary dynamic response model (e.g. Kauppi and Saikkonen, 2008; de Jong and Woutersen, 2011). It is shown by Wang and Li (2011) that there always exists a stationary solution of (1).
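To make model (1) concrete, the following Python sketch draws a series from it with the logit link. It is only an illustration under stated assumptions: the function names `logistic` and `simulate_bar`, the zero starting values, and the burn-in length are not taken from the paper.

```python
import math
import random


def logistic(x):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_bar(beta, n, burn_in=200, seed=None):
    """Simulate n values from the logit BAR(p) model (1).

    beta = (beta_0, beta_1, ..., beta_p). The zero starting values and the
    burn-in length are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    p = len(beta) - 1
    y = [0] * p                                  # arbitrary initial values
    for _ in range(burn_in + n):
        lags = y[len(y) - p:][::-1]              # (Y_{t-1}, ..., Y_{t-p})
        eta = beta[0] + sum(b * v for b, v in zip(beta[1:], lags))
        y.append(1 if rng.random() < logistic(eta) else 0)
    return y[-n:]


series = simulate_bar((2.0, -2.0), n=500, seed=1)
print(sum(series) / len(series))   # empirical success frequency
```

Discarding a burn-in period is one simple way to start the retained values close to the stationary regime whose existence is stated above.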

The model (1) can be further extended by including explanatory variables or lagged values of $\pi_t$ in (1) (e.g. Kauppi and Saikkonen, 2008; Startz, 2008). All these models then belong to a wide class of models, the so-called time series following generalized linear models, see Kedem and Fokianos (2002). The vector of coefficients $\beta$ in (1) can be estimated by the partial maximum likelihood method. The estimator $\hat\beta$ is a maximizer of the partial likelihood

\[
PL(\beta) = \prod_{t=1}^{n} [\pi_t(\beta)]^{Y_t} [1 - \pi_t(\beta)]^{1 - Y_t},
\]

where $\pi_t(\beta) = \pi_t$ is given by (1). The estimator $\hat\beta$ is almost surely unique for all sufficiently large $n$, consistent, and asymptotically normal under some regularity conditions, see Kedem and Fokianos (2002, Chapter 1). This estimation approach is convenient in applications, because software tools available for generalized linear models can be directly used. Testing hypotheses about $\beta$ is based on the partial likelihood as well. The common tests are based on the log-partial likelihood ratio statistic, the Wald statistic, and the score statistic. The limiting distribution of all these test statistics under the null hypothesis is a $\chi^2$ distribution, as in the classical maximum likelihood inference.
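In the special case $p = 1$ with the logit link, the maximizer of the partial likelihood can be written down without any iterative fitting: the model has two parameters and the single regressor $Y_{t-1}$ takes only two values, so the fitted probabilities coincide with the empirical transition frequencies. The sketch below uses this fact. The function name `fit_bar1` and the toy series are assumptions made for illustration; for $p > 1$ one would use standard GLM software as described above.

```python
import math


def fit_bar1(y):
    """Partial ML estimates (beta_0, beta_1) for the logit BAR(1) model."""
    counts = [[0, 0], [0, 0]]        # counts[i][j] = #{t: Y_{t-1} = i, Y_t = j}
    for prev, cur in zip(y[:-1], y[1:]):
        counts[prev][cur] += 1
    probs = []
    for i in (0, 1):
        total = counts[i][0] + counts[i][1]
        if total == 0 or counts[i][1] in (0, total):
            raise ValueError("estimate on the boundary: no finite maximizer")
        probs.append(counts[i][1] / total)   # empirical P(Y_t = 1 | Y_{t-1} = i)

    def logit(q):
        return math.log(q / (1.0 - q))

    # logit(p_01) = beta_0 and logit(p_11) = beta_0 + beta_1
    return logit(probs[0]), logit(probs[1]) - logit(probs[0])


y = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0]   # toy data, not from the paper
b0, b1 = fit_bar1(y)
print(b0, b1)
```

In the toy series the chain moves from 0 to 1 in 3 of 5 cases and from 1 to 1 in 3 of 6 cases, so the estimates are $\operatorname{logit}(0.6)$ and $-\operatorname{logit}(0.6)$.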

In the following we consider the logit link function $g(x) = \operatorname{logit}(x) = \log[x/(1-x)]$ in model (1), because this is the canonical link for binary regression, and this fact simplifies some of the formulas. Results for a different link function $g : (0,1) \to \mathbb{R}$ could be derived in the same way, provided that $g$ satisfies some standard regularity conditions, see Kedem and Fokianos (2002, Chapter 3).

Let $Y_{-p}, \dots, Y_1, Y_2, \dots, Y_n$ be realizations of the model (1). This means that we have $n$ realizations of $(Y_t, Y_{t-1}, \dots, Y_{t-p})'$. We would like to decide whether a change has appeared in the data generating process. Hence, we introduce the model

\[
\operatorname{logit}(\pi_t) =
\begin{cases}
\beta_0 + \sum_{j=1}^{p} \beta_j Y_{t-j}, & t = 1, \dots, m, \\
\beta_0^* + \sum_{j=1}^{p} \beta_j^* Y_{t-j}, & t = m+1, \dots, n,
\end{cases} \tag{2}
\]

where $\beta \ne \beta^*$. Model (2) describes the situation where the first $m$ observations follow the model (1) with the parameters $\beta$, and the remaining $n - m$ observations follow the model (1) with the parameters $\beta^*$. The main objective is to test whether a change has occurred or not, i.e. to test

\[
H_0 : m = n \quad \text{against} \quad H_1 : m < n. \tag{3}
\]

In the next section, we derive the test statistic for $H_0$ against a simplified alternative of a change in the intercept only, that is for the case $\beta_j = \beta_j^*$ for $j = 1, \dots, p$ in (2). The resulting test statistic is then a maximum of normalized sums of the estimated residuals $Y_t - \hat\pi_t$. Clearly, the power of the test is largest against the alternative of a change in $\beta_0$. However, the test is sensitive to any change in $\beta$ which leads to a change in the unconditional success probability $\mathrm{E}\pi_t$, and thus, it can be applied even in the general case (2). Moreover, in many applications, it is usual that any change in $\beta$ is accompanied by a change in $\beta_0$.

3. Testing for a change

Assume first that the change point is known, say $m = k$ for some $k < n$. Consider the model

\[
\operatorname{logit}(\pi_t) =
\begin{cases}
\beta_0 + \delta_0 + \sum_{j=1}^{p} \beta_j Y_{t-j}, & t = 1, \dots, k, \\
\beta_0 + \sum_{j=1}^{p} \beta_j Y_{t-j}, & t = k+1, \dots, n,
\end{cases}
\]

and derive the score test statistic, denoted as $c_n^{(k)}$, for a test of $H_0^* : \delta_0 = 0$ against $H_1^* : \delta_0 \ne 0$ in this model. Let $\hat\pi_t$ and $\hat\sigma_t^2 = \hat\pi_t(1 - \hat\pi_t)$ be the estimated conditional mean and variance of $Y_t$ given $\mathcal{F}_{t-1}$ under the null hypothesis. After some computation, it follows that

\[
c_n^{(k)} = \frac{\left[ \sum_{t=1}^{k} (Y_t - \hat\pi_t) \right]^2}{V_k}, \tag{4}
\]

where

\[
V_k = \sum_{t=1}^{k} \hat\sigma_t^2 - \left[ \sum_{t=1}^{k} \hat\sigma_t^2 Z_{t-1} \right]' \left[ \sum_{t=1}^{n} \hat\sigma_t^2 Z_{t-1} Z_{t-1}' \right]^{-1} \left[ \sum_{t=1}^{k} \hat\sigma_t^2 Z_{t-1} \right], \tag{5}
\]

and $Z_{t-1} = (1, Y_{t-1}, \dots, Y_{t-p})'$.

If the (possible) change point $m$ is known then the test of $H_0$ of no change in (2) can be performed easily using the score statistic $c_n^{(k)}$ and its asymptotic $\chi_1^2$ distribution. In practice the time point of change is usually unknown and, thus, a natural idea is to base the test of $H_0$ on the maximum of $c_n^{(k)}$ over all possible $k$. Set $\hat S_k = \sum_{t=1}^{k} (Y_t - \hat\pi_t)$, and define

\[
T_n = \max_{k_0 \le k \le n - k_0} \sqrt{c_n^{(k)}} = \max_{k_0 \le k \le n - k_0} \frac{|\hat S_k|}{\sqrt{V_k}},
\]

where $k_0$ is such that $V_k$ is well defined and positive for all $k_0 < k < n - k_0$. Hence, $T_n$ is a maximum of normalized cumulative sums of residuals $Y_t - \hat\pi_t$. Statistics with slightly different normalizations are considered in Remark 2.
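As a rough illustration of how (4), (5) and $T_n$ fit together, the following Python sketch evaluates the statistic in the special case $p = 1$. It is not the author's implementation: the function name `max_score_statistic`, the trimming value `k0 = 5`, and the simulated series with its parameter values are assumptions made for this example. The sketch relies on the fact that for $p = 1$ the fitted probabilities under $H_0$ equal the empirical transition frequencies, and that the matrix in (5) is then $2 \times 2$ and can be inverted by hand.

```python
import math
import random


def max_score_statistic(y, k0=5):
    """Return (T_n, k_hat) for the logit BAR(1) model fitted to y under H0.

    y[0] serves as the initial value, so n = len(y) - 1 pairs (Y_t, Y_{t-1})
    are used. The trimming constant k0 is an illustrative assumption.
    """
    prev, cur = y[:-1], y[1:]
    n = len(cur)
    p_hat = []
    for i in (0, 1):
        nxt = [c for q, c in zip(prev, cur) if q == i]
        if not nxt:
            raise ValueError("state %d never occurs as a lagged value" % i)
        p_hat.append(sum(nxt) / len(nxt))     # fitted P(Y_t = 1 | Y_{t-1} = i)
    pi_hat = [p_hat[q] for q in prev]
    s2 = [q * (1.0 - q) for q in pi_hat]
    # Full-sample matrix sum_t s2_t Z_{t-1} Z_{t-1}' = [[a, b], [b, b]],
    # because Z_{t-1} = (1, Y_{t-1})' and Y_{t-1}^2 = Y_{t-1}.
    a = sum(s2)
    b = sum(s * q for s, q in zip(s2, prev))
    det = a * b - b * b
    if det <= 0.0:
        raise ValueError("information matrix is singular for this sample")
    t_n, k_hat = 0.0, None
    s_k = u0 = u1 = 0.0
    for k in range(1, n + 1):
        s_k += cur[k - 1] - pi_hat[k - 1]     # partial sum of residuals
        u0 += s2[k - 1]                       # sum_{t<=k} s2_t
        u1 += s2[k - 1] * prev[k - 1]         # sum_{t<=k} s2_t Y_{t-1}
        if k0 <= k <= n - k0:
            quad = (b * u0 * u0 - 2.0 * b * u0 * u1 + a * u1 * u1) / det
            v_k = u0 - quad                   # V_k from (5)
            if v_k > 1e-12 and abs(s_k) / math.sqrt(v_k) > t_n:
                t_n, k_hat = abs(s_k) / math.sqrt(v_k), k
    return t_n, k_hat


# Hypothetical example: the parameters switch from (2, -2) to (-2, 2) halfway.
rng = random.Random(0)
y = [0]
for t in range(600):
    b0, b1 = (2.0, -2.0) if t < 300 else (-2.0, 2.0)
    prob = 1.0 / (1.0 + math.exp(-(b0 + b1 * y[-1])))
    y.append(1 if rng.random() < prob else 0)
t_n, k_hat = max_score_statistic(y)
print(t_n, k_hat)
```

The index `k_hat` at which the maximum is attained is the natural candidate for the change point estimate. For general $p$ the quadratic form in (5) would require a $(p+1) \times (p+1)$ linear solve instead of the closed-form inverse used here.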

In order to derive the asymptotic distribution of $T_n$ under the null hypothesis we need to state the following assumptions:

(A1) The true parameter $\beta$ lies in an open subset of $\mathbb{R}^{p+1}$.
(A2) The series $\{Y_t\}$ is strictly stationary.

Theorem 1. Let assumptions (A1) and (A2) hold. Then under $H_0 : m = n$ we have

\[
P\left( T_n < \sqrt{2 \log\log n} + \frac{\log\log\log n}{2\sqrt{2 \log\log n}} + \frac{t - \frac{1}{2}\log \pi}{\sqrt{2 \log\log n}} \right) \to \exp\{-2e^{-t}\} \tag{6}
\]

as $n \to \infty$, $t \in \mathbb{R}$.

The test of the null hypothesis of no change $H_0 : m = n$ can be based on a comparison of $T_n$ with the asymptotic critical value, which can be easily computed from the limiting distribution (6). If $H_0$ is rejected then $\arg\max_{k_0 \le k \le n - k_0} |\hat S_k| [\sqrt{V_k}]^{-1}$ can be taken as an estimate of the unknown change point $m$.
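The computation of the asymptotic critical value and p-value from (6) can be sketched as follows. Setting $\exp\{-2e^{-t}\} = 1 - \alpha$ gives $t_\alpha = -\log(-\frac{1}{2}\log(1-\alpha))$, which is then plugged into the bound inside the probability in (6). The function names are assumptions made for this illustration; the inputs in the usage lines are the values reported for the US recession data in Section 6 ($n = 625$, $T_n = 4.7196$).

```python
import math


def _norming(n):
    """Return a(n) = sqrt(2 log log n) and b(n) as used in (6)."""
    lln = math.log(math.log(n))
    a_n = math.sqrt(2.0 * lln)
    b_n = 2.0 * lln + 0.5 * math.log(lln) - 0.5 * math.log(math.pi)
    return a_n, b_n


def critical_value(n, alpha=0.05):
    """Asymptotic critical value for T_n obtained by inverting the limit (6)."""
    a_n, b_n = _norming(n)
    t_alpha = -math.log(-0.5 * math.log(1.0 - alpha))  # exp(-2e^{-t}) = 1 - alpha
    return (t_alpha + b_n) / a_n


def asymptotic_p_value(t_obs, n):
    """Asymptotic p-value 1 - exp(-2 e^{-t}) with t = a(n) * T_n - b(n)."""
    a_n, b_n = _norming(n)
    t = a_n * t_obs - b_n
    return 1.0 - math.exp(-2.0 * math.exp(-t))


c = critical_value(625, alpha=0.05)
p_val = asymptotic_p_value(4.7196, 625)
print(round(c, 4), round(p_val, 4))
```

With these inputs the sketch gives a critical value of about 3.69 and a p-value of about 0.007, in line with the values quoted in Section 6.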

Remark 2. Note that it is possible to use simplified (and numerically easier) versions of the test statistic $T_n$ as well, namely

\[
U_n = \max_{k_0 < k < n - k_0} \left\{ \sqrt{\frac{\sum_{t=1}^{n} \hat\sigma_t^2}{\sum_{t=1}^{k} \hat\sigma_t^2 \, \sum_{t=k+1}^{n} \hat\sigma_t^2}} \; |\hat S_k| \right\} \tag{7}
\]

or

\[
W_n = \max_{k_0 < k < n - k_0} \left\{ \sqrt{\frac{n}{k(n-k)}} \; \frac{|\hat S_k|}{\sqrt{\frac{1}{n} \sum_{t=1}^{n} \hat\sigma_t^2}} \right\}. \tag{8}
\]

It follows from Lemma 3 together with the approximations given in the proof of Theorem 1 that the asymptotic distribution of $U_n$ and $W_n$ under $H_0$ is the same as the limiting distribution of $T_n$. According to our simulations, the test statistics $U_n$ and $W_n$ lead to slightly less powerful tests compared to the test based on $T_n$. Hence, we recommend using the test statistic $T_n$ in applications.
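The simplified statistics (7) and (8) need only running sums, as the following sketch shows. It assumes that the residuals $Y_t - \hat\pi_t$ and the variances $\hat\sigma_t^2$ have already been computed from a fit under $H_0$; the function name `simplified_statistics`, the default `k0`, and the toy inputs are illustrative assumptions.

```python
import math


def simplified_statistics(resid, s2, k0=1):
    """Return (U_n, W_n) from residuals Y_t - pi_hat_t and variances s2_t."""
    n = len(resid)
    total = sum(s2)                      # sum_{t=1}^{n} s2_t
    u_n = w_n = 0.0
    s_k = v_k = 0.0
    for k in range(1, n):
        s_k += resid[k - 1]              # partial sum of residuals
        v_k += s2[k - 1]                 # sum_{t<=k} s2_t
        if k0 < k < n - k0:
            u = math.sqrt(total / (v_k * (total - v_k))) * abs(s_k)
            w = math.sqrt(n / (k * (n - k))) * abs(s_k) / math.sqrt(total / n)
            u_n, w_n = max(u_n, u), max(w_n, w)
    return u_n, w_n


# Toy input with constant variances, for which (7) and (8) coincide.
u_n, w_n = simplified_statistics([0.5, 0.5, -0.5, -0.5], [0.25] * 4)
print(u_n, w_n)
```

When all $\hat\sigma_t^2$ are equal the two normalizations agree, which the toy input illustrates; they differ only through the way the variance is accumulated over time.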

The test statistic $T_n$ is based on normalized sums of the estimated residuals $Y_t - \hat\pi_t$. It is, therefore, sensitive to a change in the unconditional success probability $\mathrm{E}Y_t = \mathrm{E}\pi_t$. If $\beta \ne \beta^*$, but the corresponding success probabilities satisfy $\mathrm{E}\pi_t = \mathrm{E}\pi_t^*$, then the test is not able to detect the change. This problem of test statistics based on estimated residuals is well known in linear regression, see Hušková and Koubková (2005) and Horváth et al. (2004). The same situation occurs in the Poisson autoregression, see Franke et al. (2012).
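A small numerical example may help to see when such an undetectable change arises. For $p = 1$ the series is a two-state Markov chain, and its stationary success probability is $p_{01}/(p_{01} + 1 - p_{11})$, where $p_{i1} = P(Y_t = 1 \mid Y_{t-1} = i)$. The sketch below evaluates this for a few parameter pairs; the function name and the pairs $(0, 0)$ and $(1, -2)$ are hypothetical choices made for illustration.

```python
import math


def stationary_success_probability(beta0, beta1):
    """Unconditional P(Y_t = 1) for the stationary logit BAR(1) chain."""
    p01 = 1.0 / (1.0 + math.exp(-beta0))             # P(Y_t = 1 | Y_{t-1} = 0)
    p11 = 1.0 / (1.0 + math.exp(-(beta0 + beta1)))   # P(Y_t = 1 | Y_{t-1} = 1)
    return p01 / (p01 + 1.0 - p11)


before = stationary_success_probability(0.0, 0.0)    # independent fair coin
after = stationary_success_probability(1.0, -2.0)    # dependent, same mean
shifted = stationary_success_probability(2.0, -2.0)  # different mean
print(before, after, shifted)
```

The first two parameter vectors differ but both give an unconditional success probability of one half, so a switch between them would leave the mean of the residual sums unchanged. The third vector gives a clearly different probability, which is the kind of change the test responds to.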

4. Proofs and preliminary results

The following statements are derived under the assumptions of Theorem 1. Set $\sigma_t^2 = \pi_t(1 - \pi_t)$ and recall the notation $Z_{t-1} = (1, Y_{t-1}, \dots, Y_{t-p})'$.

Lemma 3. There exists a positive definite matrix $G$ such that with probability one

\[
\left\| \frac{1}{n} \sum_{t=1}^{n} \sigma_t^2 Z_{t-1} Z_{t-1}' - G \right\| = o(a_n) \quad \text{as } n \to \infty, \tag{9}
\]

where $a_n = n^{-(1/2 - \delta)}$ for some $0 < \delta < 1/2$. Furthermore, $\sqrt{n}(\hat\beta - \beta)$ is asymptotically normal with zero mean and covariance matrix $G^{-1}$.

Proof. Let $\{U_t\}$ be defined as $U_t = (Y_t, \dots, Y_{t-p+1})'$. Then $\{U_t\}$ is a finite-state positive recurrent Markov chain (see Wang and Li, 2011). Each element of the matrix $\sigma_t^2 Z_{t-1} Z_{t-1}'$ is a bounded function of $U_{t-1}$. Hence, the existence of the limiting matrix $G$ follows from the law of large numbers, Meyn and Tweedie (2009, Chapter 17), and $G = \mathrm{E}\,\sigma_1^2 Z_0 Z_0'$. The proof that $G$ is positive definite is straightforward. Furthermore, it can be shown that $\{U_t\}$ is $V$-uniformly ergodic for a suitable function $V$, see Wang and Li (2011), and thus, the law of the iterated logarithm applies to each element of $\sum_{t=1}^{n} (\sigma_t^2 Z_{t-1} Z_{t-1}' - G)$, see Meyn and Tweedie (2009, Chapter 17). This implies the rate of the convergence in (9). Finally, the assumptions of Theorem 1 together with (9) ensure that $\hat\beta$ exists and is asymptotically normal, see Kedem and Fokianos (2002, Chapter 1). □

Define $X_t = Y_t - \pi_t$ and set $S_k = \sum_{t=1}^{k} X_t$. Let $A$ denote the $(1,1)$ element of the limiting matrix $G$ from (9), that is $A = \lim n^{-1} \sum_{t=1}^{n} \sigma_t^2 = \mathrm{E}\,\sigma_0^2$. The next lemma shows that the normalized tied-down sum of the martingale differences $\{X_t\}$ has a Darling–Erdős type limit distribution.

Lemma 4. Define $a(n) = \sqrt{2 \log\log n}$ and $b(n) = 2 \log\log n + \frac{1}{2} \log\log\log n - \frac{1}{2} \log \pi$. Then under $H_0$

\[
\max_{1 \le k \le \log n} \sqrt{\frac{n}{k(n-k)}} \, \frac{\left| S_k - \frac{k}{n} S_n \right|}{\sqrt{A}} = O_P\left( \sqrt{\log\log\log n} \right), \tag{10}
\]

\[
\max_{n - \log n \le k \le n} \sqrt{\frac{n}{k(n-k)}} \, \frac{\left| S_k - \frac{k}{n} S_n \right|}{\sqrt{A}} = O_P\left( \sqrt{\log\log\log n} \right), \tag{11}
\]

and for $t \in \mathbb{R}$

\[
P\left( a(n) \max_{1 \le k \le n} \sqrt{\frac{n}{k(n-k)}} \, \frac{\left| S_k - \frac{k}{n} S_n \right|}{\sqrt{A}} - b(n) < t \right) \to \exp\{-2e^{-t}\} \tag{12}
\]

as $n \to \infty$.

Proof. The process $\{U_t\}$, where $U_t = (Y_t, \dots, Y_{t-p+1})'$, is a finite-state geometrically ergodic Markov chain with a unique stationary distribution, see Wang and Li (2011). It follows from the stationarity of $\{U_t\}$ that $\{U_t\}$ is strong mixing with mixing rates $\alpha(n)$ decaying exponentially fast (see Doukhan, 1994, Chapter 4.2). Consequently, the same holds for $\{X_t\}$, because $X_t$ is a measurable function of $U_t$ and $U_{t-1}$. Hence, we can define $\{X_k\}$ on another probability space together with a standard Wiener process $\{W(t), t > 0\}$ such that

\[
\frac{1}{\sqrt{n}} \left( \sum_{k=1}^{n} X_k - A \cdot W(n) \right) = o(n^{-\rho}) \quad \text{a.s.}
\]

for some $\rho > 0$, see Kuelbs and Philipp (1980, Theorem 4). This strong invariance principle implies the law of the iterated logarithm for $S_n$, see e.g. Kuelbs and Philipp (1980), from which we get (10). The equality (11) is obtained in a similar way from the fact that $\max_{n - \log n \le k \le n} (|S_n - S_k| / \sqrt{n-k}) = O_P(\sqrt{\log\log\log n})$. The latter follows from an approximation of $\sum_{t=k+1}^{n} X_t$ by a standard Wiener process, see Schmitz (2011, Lemma 2.1.4).

Furthermore, the strong invariance principle allows us to replace $X_k$ by iid normal variables without changing the asymptotic distribution of $\max_{1 \le k \le n} \sqrt{n/(k(n-k))} \, |S_k - (k/n) S_n|$, which leads to (12). For a detailed proof of the approximation see Schmitz (2011, Theorem 2.1.5). □

Finally, we show that we can replace $\hat S_k$ by $S_k - (k/n) S_n$ and $V_k$ by $A k (n-k) n^{-1}$ in the test statistic $T_n$ without changing its asymptotic distribution.

Lemma 5. Under the assumptions of Theorem 1 we have

\[
\max_{\log n \le k \le n - \log n} \left| \sqrt{\frac{k(n-k)}{n} \frac{1}{V_k}} - \frac{1}{\sqrt{A}} \right| = o_P(a_{\log n}) + O_P(1/\sqrt{n}). \tag{13}
\]

Proof. The proof follows from (9) and goes along the same lines as the proof of Lemma 3.2 in Antoch et al. (2004). □

Proof of Theorem 1. It follows from the asymptotic normality of $\hat\beta$ and the Taylor expansion that

\[
\hat\pi_t - \pi_t = \sigma_t^2 (\hat\beta - \beta)' Z_{t-1} + O_P\left( \frac{1}{n} \right), \qquad \hat\sigma_t^2 - \sigma_t^2 = O_P\left( \frac{1}{\sqrt{n}} \right) \tag{14}
\]

uniformly in $t$, because $Z_{t-1}$ is uniformly bounded. Since $\hat\beta$ is the maximizer of the partial likelihood, we have $\hat S_n = 0$. For $k \le n/2$ then

\[
\begin{aligned}
\sqrt{\frac{n}{k(n-k)}} \left| \hat S_k - \left( S_k - \frac{k}{n} S_n \right) \right|
&= \sqrt{\frac{n}{k(n-k)}} \left| \hat S_k - S_k - \frac{k}{n} \left( \hat S_n - S_n \right) \right| \\
&= \sqrt{\frac{n}{k(n-k)}} \left| \sum_{t=1}^{k} (\hat\pi_t - \pi_t) - \frac{k}{n} \sum_{t=1}^{n} (\hat\pi_t - \pi_t) \right| \\
&= \sqrt{\frac{nk}{n-k}} \left| (\hat\beta - \beta)' \left( \frac{1}{k} \sum_{t=1}^{k} \sigma_t^2 Z_{t-1} - \frac{1}{n} \sum_{t=1}^{n} \sigma_t^2 Z_{t-1} \right) + O_P\left( \frac{1}{n} \right) \right|.
\end{aligned}
\]

Similarly, for $k \ge n/2$ one gets

\[
\sqrt{\frac{n}{k(n-k)}} \left| \hat S_k - \left( S_k - \frac{k}{n} S_n \right) \right|
= \sqrt{\frac{n(n-k)}{k}} \left| (\hat\beta - \beta)' \left( \frac{1}{n-k} \sum_{t=k+1}^{n} \sigma_t^2 Z_{t-1} - \frac{1}{n} \sum_{t=1}^{n} \sigma_t^2 Z_{t-1} \right) + O_P\left( \frac{1}{n} \right) \right|.
\]

It then follows from (9) that

\[
\max_{1 \le k \le \log n} \sqrt{\frac{n}{k(n-k)}} \left| \hat S_k - \left( S_k - \frac{k}{n} S_n \right) \right| = O_P\left( \sqrt{\log n / n} \right), \tag{15}
\]

\[
\max_{n - \log n \le k < n} \sqrt{\frac{n}{k(n-k)}} \left| \hat S_k - \left( S_k - \frac{k}{n} S_n \right) \right| = O_P\left( \sqrt{\log n / n} \right), \tag{16}
\]

\[
\max_{\log n < k < n - \log n} \sqrt{\frac{n}{k(n-k)}} \left| \hat S_k - \left( S_k - \frac{k}{n} S_n \right) \right| = o_P(a_{\log n}) + O_P(1/\sqrt{n}). \tag{17}
\]

The latter approximations together with Lemma 4 imply that

\[
\max_{1 \le k \le \log n} \sqrt{\frac{n}{k(n-k)}} \, |\hat S_k| = O_P\left( \sqrt{\log\log\log n} \right), \tag{18}
\]

\[
\max_{n - \log n \le k \le n} \sqrt{\frac{n}{k(n-k)}} \, |\hat S_k| = O_P\left( \sqrt{\log\log\log n} \right), \tag{19}
\]

\[
\max_{1 \le k \le n} \sqrt{\frac{n}{k(n-k)}} \, |\hat S_k| = O_P\left( \sqrt{\log\log n} \right). \tag{20}
\]

Since $\max_{1 \le k \le n} ([k(n-k)]/[n \cdot V_k])^{1/2}$ is bounded in probability, due to (5), (14), and (9), it follows from (18) and (19) that

\[
\max_{1 \le k \le \log n} \frac{|\hat S_k|}{\sqrt{V_k}} = O_P\left( \sqrt{\log\log\log n} \right), \qquad
\max_{n - \log n \le k \le n} \frac{|\hat S_k|}{\sqrt{V_k}} = O_P\left( \sqrt{\log\log\log n} \right).
\]


Finally, combining (17) with Lemma 5 gives

\[
\max_{\log n \le k \le n - \log n} \left| \frac{\hat S_k}{\sqrt{V_k}} - \frac{S_k - \frac{k}{n} S_n}{\sqrt{\frac{k(n-k)}{n} A}} \right| = o_P\left( \frac{1}{\sqrt{\log\log n}} \right).
\]

Theorem 1 then follows directly from Lemma 4. □

5. Simulations

Various simulations were conducted in R (R Development Core Team, 2011) in order to investigate properties of the derived testing procedure. Its behavior was studied under the null hypothesis as well as under several different alternatives. In each simulation, the (pseudo) random number generator was initialized by set.seed(1234).

5.1. Behavior under H0

In order to investigate the behavior under the null hypothesis, $N = 1000$ replicas of a binary time series $\{y_t\}$ following the particular model (1) (without a change) were simulated. Namely, realizations of $y_t$ for $t = 1, \dots, n+p$ were generated, and $n$ realizations of $(y_t, y_{t-1}, \dots, y_{t-p})$, $t = p+1, \dots, n+p$, were used in the analysis. For each replica, the normalized version of the test statistic

\[
t_n = \left( T_n - \sqrt{2 \log\log n} - \frac{\log\log\log n}{2\sqrt{2 \log\log n}} \right) \sqrt{2 \log\log n} + \frac{1}{2} \log \pi
\]

was computed. Characteristics of its empirical distribution over the replicas, such as the distribution function and quantiles, were compared to their theoretical counterparts. The simulations were conducted for various choices of $n$ and model parameters $\beta$.

First, series $\{y_t\}_{t=1}^{n+1}$ were generated from the model (1) with $p = 1$. The empirical distribution functions of $t_n$ are compared to the limiting distribution function $G(t) = \exp(-2e^{-t})$ in Fig. 1 (left panel) for a particular choice of $\beta_0$ and $\beta_1$. The distribution of $t_n$ is approximated reasonably well by the limiting one even for small sample sizes ($n \doteq 100$). On the other hand, the approximation is not perfect even for $n = 1000$. This corresponds to the well-known fact that the rate of convergence to the extreme value distribution is low. This is also evident from the comparison of the sample quantiles of $t_n$ and the asymptotic ones, listed in Table 1. In particular, the 95% theoretical quantile is always larger than the sample ones, which implies that the type I error of the test is typically lower than the asymptotic value $\alpha$. Hence, it never exceeds the prescribed value $\alpha$, but the test is quite conservative. The results are comparable for other choices of $\beta_0$ and $\beta_1$.

Simulations were conducted for models (1) with $p > 1$ as well. The obtained results are analogous to those discussed above, and thus, they are not presented.

5.2. Behavior under H1

In order to study the behavior of the test under the alternative, $N = 500$ replicas of series $\{y_t\}_{t=1}^{n+1}$ from the model (2) were generated in several different settings. The (estimated) power of the test of no change was computed for various choices of $n$, $m$, $\beta$, and $\beta^*$.

Fig. 1. Left panel: comparison of the empirical distribution functions of $t_n$ for $n = 100, 200, 500, 1000$ with the limiting distribution $G(t) = \exp(-2e^{-t})$ in the model (1) with $p = 1$, $\beta_0 = 2$, and $\beta_1 = -2$. Right panel: estimated power of the test as a function of $n$ for $\lambda = 0.5$, $\lambda = 0.2$, and $\lambda = 0.7$ in the model (2) with $p = 1$, $\beta_0 = 2$, $\beta_1 = -2$, $\delta_0 = -1$, and $\delta_1 = 0$.

Table 1
Comparison of theoretical and sample quantiles for $t_n$ in the model (1) with $p = 1$, $\beta_0 = 2$, and $\beta_1 = -2$.

Level   Asymp.   n=100    n=200    n=500    n=1000
1%      −0.834   −0.926   −1.011   −1.052   −0.960
5%      −0.404   −0.519   −0.566   −0.668   −0.636
10%     −0.141   −0.279   −0.335   −0.449   −0.403
50%      1.060    0.775    0.721    0.751    0.771
90%      2.944    2.058    2.096    2.265    2.303
95%      3.663    2.514    2.528    2.793    2.857
99%      5.293    3.516    3.592    3.807    3.986
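The asymptotic column of Table 1 can be recomputed directly from the limit law: solving $G(t) = \exp(-2e^{-t}) = q$ for $t$ gives the quantile $t_q = -\log(-\frac{1}{2}\log q)$. The short sketch below does this; the function name `limit_quantile` is an assumption made for illustration.

```python
import math


def limit_quantile(level):
    """Quantile of the limit distribution G(t) = exp(-2 exp(-t))."""
    return -math.log(-0.5 * math.log(level))


for level in (0.01, 0.05, 0.10, 0.50, 0.90, 0.95, 0.99):
    print(f"{level:.0%}: {limit_quantile(level):.3f}")
```

The sample columns of the table, by contrast, can only be obtained by simulation.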

Fig. 2. Estimated power of the test of no change in the model (2) with $p = 1$, $n = 500$, $m = 250$, and $\beta_0 = 2$, $\beta_1 = -2$ (estimation based on $N = 200$ replicas). Left panel: power as a function of $\delta_0$. Right panel: power as a function of $\delta_1$.


A model with $p = 1$ was considered for different choices of $\beta_0$, $\beta_1$ and $\delta_0 = \beta_0^* - \beta_0$, $\delta_1 = \beta_1^* - \beta_1$. Note that if $\{Y_t\}$ follows the model (1) with $p = 1$ then it is a first-order Markov chain with the transition probabilities $p_{ij} = P(Y_t = j \mid Y_{t-1} = i)$ such that $p_{i1} = [1 + e^{-\beta_0 - \beta_1 i}]^{-1}$ and $p_{i0} = 1 - p_{i1}$, $i = 0, 1$. Hence, if a change occurs at time $m$, then the transition probabilities change from $p_{ij}$ to $p_{ij}^*$ such that $p_{i1}^* = [1 + e^{-\beta_0 - \delta_0 - (\beta_1 + \delta_1) i}]^{-1}$. This might help to judge the effect of the change in the examples given below.

As expected, the power of the test increases with an increasing sample size $n$, and it is largest if $\lambda = m/n = 1/2$. The dependence of the (estimated) power on $n$ is shown for several choices of $\lambda = m/n$ in Fig. 1. The results are provided for the model with $\beta_0 = 2$, $\beta_1 = -2$, $\delta_0 = -1$, and $\delta_1 = 0$. (The corresponding Markov transition probabilities are $p_{01} = 0.88$, $p_{11} = 0.50$, $p_{01}^* = 0.73$, $p_{11}^* = 0.27$.) If $\lambda = 1/2$ then the power is reasonable (greater than 0.8) for $n \ge 400$. If the size of the change is smaller then a larger $n$ is required for such a power. Similarly, if $\lambda$ is not close to 0.5, the sample size has to be larger.
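The transition probabilities quoted above follow from the formula $p_{i1} = [1 + e^{-\beta_0 - \beta_1 i}]^{-1}$, as the following few lines of Python confirm. The function name `transition_probabilities` is an assumption made for illustration.

```python
import math


def transition_probabilities(beta0, beta1):
    """Return (p01, p11) with p_i1 = P(Y_t = 1 | Y_{t-1} = i) for p = 1."""
    return tuple(1.0 / (1.0 + math.exp(-(beta0 + beta1 * i))) for i in (0, 1))


beta0, beta1, delta0, delta1 = 2.0, -2.0, -1.0, 0.0
before = transition_probabilities(beta0, beta1)
after = transition_probabilities(beta0 + delta0, beta1 + delta1)
print([round(v, 2) for v in before + after])   # [0.88, 0.5, 0.73, 0.27]
```

Expressing a change in $(\delta_0, \delta_1)$ through these four numbers is often easier to interpret than the shift on the logit scale.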

The dependence of the power on the size of the change, i.e. on $\delta_0$ and $\delta_1$, is shown in Fig. 2 for $\beta_0 = 2$, $\beta_1 = -2$. Here, the effect of a change in one parameter is shown while the second parameter is kept fixed. If $\delta_1$ is kept fixed then the power first decreases as $\delta_0$ increases and then it starts to increase again. The same holds vice versa. The minimum power seems to be close to $\delta_0 = -\delta_1$. This is caused by the fact that the transition probability $p_{11}$ depends only on the sum $\beta_0 + \beta_1$. Hence, if the change in $\beta_0$ and $\beta_1$ is the same in size but in the opposite direction then $p_{11}$ remains unchanged. However, if the change in $p_{01}$ (that is in $\beta_0$) is large enough then the power of the test can be reasonable. From Fig. 2 it might seem that the test does not reveal a change if $\delta_0 = -\delta_1$, but this is only due to the fact that for the case $\beta_0 = 2$ and $\beta_1 = -2$ discussed here, increasing $\delta_0$ beyond 1 leads to only a very small change in $p_{01}$, because $p_{01}$ is already close to 1 (the asymptotic value for $\delta_0 \to \infty$). Some selected numerical results are listed in Table 2. The transition probabilities $p_{01}^*$ and $p_{11}^*$ after the change are included in the table in order to provide a better understanding of the size of the change.

Even though the test was derived against the alternative of a change in the intercept, it has a reasonable power even against a more general alternative. Furthermore, it is important to notice that the test can reveal a change in $\beta_1$ even for $\delta_0 = 0$ provided that the size of the change is large enough, see the right panel of Fig. 2.

A number of simulations were also run for the model (2) with $p > 1$. The dependence of the power on the sample size $n$ and on the ratio $\lambda = m/n$ is analogous to that discussed above. Similar conclusions regarding the dependence of the power on the size of the change can be made as well.

6. Analysis of US recession data

The US recession data contain the quarterly recession indicators for the USA in the period 1855–2011 obtained from The National Bureau of Economic Research ($n = 628$ records). Here, $Y_t$ is coded as 1 if any month in the quarter is in a recession. The data have been previously analyzed, e.g. by Startz (2008) or Kauppi and Saikkonen (2008). A change in the nature of the series might be expected sometime after World War II. The analyzed binary sequence is plotted in Fig. 4. The change in the overall probability of recession, which happened around WWII, is clearly visible (the variables are "more likely" to be zero after this period).

Table 2
Estimated power in the model (2) with $p = 1$, $\beta_0 = 2$, $\beta_1 = -2$, $n = 500$, $m = 250$ based on $N = 500$ replicas.

β0*     β1*      p01*     p11*     Est. power
1.00    −2.00    0.731    0.269    0.940
1.00    −1.50    0.731    0.378    0.520
1.00    −1.00    0.731    0.500    0.030
1.50    −2.50    0.818    0.269    0.792
1.50    −2.00    0.818    0.378    0.220
1.50    −1.50    0.818    0.500    0.014
2.00    −2.50    0.881    0.378    0.120
2.00    −2.00    0.881    0.500    0.014
2.00    −1.50    0.881    0.622    0.154
2.50    −2.00    0.924    0.622    0.246
2.50    −1.50    0.924    0.731    0.882
2.50    −2.50    0.924    0.500    0.008
3.00    −2.50    0.953    0.622    0.356
3.00    −2.00    0.953    0.731    0.918

Fig. 3. The square root of the score test statistic, $\sqrt{c_n^{(k)}}$, against the possible change point $m = k$ in the model for US recession data. The levels $c$ and $T_n$ are labeled on the vertical axis.

For an illustration we consider the model (1) with p¼3, already considered by Startz, 2008, and we test whether achange in the model parameters has occurred. The sequence of the score test statistics fcðkÞn g was computed and its squareroot is plotted in Fig. 3. The maximal value Tn equals 4.7196 and the corresponding critical value on level α¼ 0:05 isc¼3.6926 (computed for n¼625, the number of available data ðyt ;…; yt−3Þ′). The null hypothesis of no change is rejectedwith the asymptotic p-value 0.0070. The maximal value is attained for the first quarter of 1933. However, one can see fromFig. 3 that the critical value c is crossed in the whole period 1927–1946, with some other minor crossings in 1915, 1921, 1924,1949, and 1950. Hence, even though the change point would be estimated as 1933, the results are in accordance with theexpectation that the change occurred around the time of World War II and the preceding events. The estimated model witha change is

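\[
\operatorname{logit}[\hat{\pi}_t] =
\begin{cases}
-1.52 + 21.16\,Y_{t-1} - 2.96 \times 10^{-9}\,Y_{t-2} - 17.99\,Y_{t-3} & \text{for } t \le 313,\\
-2.80 + 21.54\,Y_{t-1} + 1.32 \times 10^{-8}\,Y_{t-2} - 1.834\,Y_{t-3} & \text{for } t \ge 314.
\end{cases}
\]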

Here, the change point $m = 313$ corresponds to the first quarter of 1933. Note that a change in the structure of the model leads to a change in the autocorrelation structure of $\{Y_t\}$ (for instance, weaker dependence of $Y_t$ on $Y_{t-3}$ after the change), but also to a lower overall probability of a recession; see Fig. 4.
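The critical value and the asymptotic p-value quoted for the recession data can be checked numerically. The sketch below assumes that the limiting law is of the Darling–Erdős extreme value type, P(a(log n) T_n ≤ x + b(log n)) → exp(−2e^{−x}), with a(x) = √(2 log x) and b(x) = 2 log x + ½ log log x − ½ log π. These normalizing constants are an assumption here, since the limit theorem itself lies outside this passage, but they reproduce the reported figures for n = 625. The function names are hypothetical.

```python
import math


def a(x):
    """Scale constant a(x) = sqrt(2 log x) (assumed normalization)."""
    return math.sqrt(2.0 * math.log(x))


def b(x):
    """Location constant b(x) = 2 log x + 0.5 log log x - 0.5 log pi (assumed)."""
    return 2.0 * math.log(x) + 0.5 * math.log(math.log(x)) - 0.5 * math.log(math.pi)


def critical_value(n, alpha):
    """Solve exp(-2 exp(-x)) = 1 - alpha for x, then map x back to the T_n scale."""
    x = -math.log(-0.5 * math.log(1.0 - alpha))
    log_n = math.log(n)
    return (x + b(log_n)) / a(log_n)


def asymptotic_p_value(t_n, n):
    """Upper tail probability of the assumed limit law at the observed T_n."""
    log_n = math.log(n)
    x = a(log_n) * t_n - b(log_n)
    return 1.0 - math.exp(-2.0 * math.exp(-x))


c = critical_value(625, 0.05)
p = asymptotic_p_value(4.7196, 625)
print(f"c = {c:.4f}, p-value = {p:.4f}")  # approximately 3.6926 and 0.0070
```

Because a(log n) grows only like √(2 log log n), the critical value changes very little with n, which is in line with the slow convergence to the extreme value limit mentioned in the concluding remarks.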

7. Concluding remarks

We have suggested a procedure for testing a change in the autoregressive models for binary time series. The test statistic is a maximum of normalized sums of estimated residuals from the model, and thus it is sensitive to any change which leads to a change in the unconditional success probability. The performance of the proposed procedure was studied via a simulation study. We have shown that the test is rather conservative due to the slow rate of convergence to the extreme value distribution.

Fig. 4. The US recession data. The change point corresponding to the first quarter of 1933 is marked by a bold vertical line. The horizontal gray segments correspond to the overall probability of a recession before and after the change point.
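The link between power and the unconditional success probability can be illustrated with the alternatives from Table 2. The sketch below again assumes that model (2) is the first-order model logit(π_t) = β0 + β1 Y_{t−1}, so that {Y_t} is a two-state Markov chain whose stationary probability of a one is p01 / (1 + p01 − p11). The helper `stationary_prob` is a hypothetical name, and the power values are copied from Table 2.

```python
import math


def logistic(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def stationary_prob(beta0, beta1):
    """Stationary P(Y_t = 1) of the two-state chain with the given parameters."""
    p01 = logistic(beta0)          # P(Y_t = 1 | Y_{t-1} = 0)
    p11 = logistic(beta0 + beta1)  # P(Y_t = 1 | Y_{t-1} = 1)
    # Solve pi = (1 - pi) * p01 + pi * p11 for pi.
    return p01 / (1.0 + p01 - p11)


# Parameters before the change in Table 2: beta0 = 2, beta1 = -2.
baseline = stationary_prob(2.0, -2.0)

# (beta0*, beta1*) after the change and the estimated power from Table 2.
alternatives = [
    ((1.0, -2.0), 0.940),
    ((1.5, -1.5), 0.014),
    ((2.5, -2.5), 0.008),
    ((3.0, -2.0), 0.918),
]

for (b0, b1), power in alternatives:
    shift = stationary_prob(b0, b1) - baseline
    print(f"beta* = ({b0}, {b1}): shift in P(Y=1) = {shift:+.3f}, est. power = {power}")
```

Under these assumptions the two alternatives with power close to the nominal level shift the stationary probability by less than 0.02, whereas the two alternatives with power above 0.9 shift it by roughly 0.14.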

It would be possible to construct a maximum type score test statistic against the alternative of a change in any of the regression parameters, similar to the GLM setting of Antoch et al. (2004). However, we did not consider this type of statistic because of our experience with GLMs. According to our simulations, the type I error of such a test might considerably exceed the prescribed level $\alpha = 0.05$ in finite samples, which makes it unsuitable for practical applications.

Acknowledgments

The author thanks the anonymous reviewers for a careful inspection of the paper and for their useful comments. This paper was written with the support of the Czech Science Foundation project DYME "Dynamic Models in Economics", No. P402/12/G097.

References

Antoch, J., Gregoire, G., Jarušková, D., 2004. Detection of structural changes in generalized linear models. Statistics & Probability Letters 69, 315–332.
Aue, A., Horváth, L., 2013. Structural breaks in time series. Journal of Time Series Analysis 34, 1–16.
Berkes, I., Gombay, E., Horváth, L., Kokoszka, P., 2004. Sequential changepoint detection in GARCH(p,q) models. Econometric Theory 20, 1140–1167.
Csörgö, M., Horváth, L., 1997. Limit Theorems in Change-Point Analysis. Wiley, New York.
Davis, R., Huang, D., Yao, Y., 1995. Testing for a change in the parameter values and order of an autoregressive model. Annals of Statistics 23, 282–304.
de Jong, R.M., Woutersen, T., 2011. Dynamic time series binary choice. Econometric Theory 27, 673–702.
Doukhan, P., 1994. Mixing. Properties and examples. Lecture Notes in Statistics, vol. 85, Springer-Verlag, New York.
Fokianos, K., Fried, R., 2010. Interventions in INGARCH processes. Journal of Time Series Analysis 31, 210–225.
Franke, J., Kirch, C., Kamgaing, J.T., 2012. Changepoints in times series of counts. Journal of Time Series Analysis 33, 757–770.
Horváth, L., 1989. The limit distributions of likelihood ratio and cumulative sum tests for a change in a binomial probability. Journal of Multivariate Analysis 31, 148–159.
Horváth, L., 1993. Change in autoregressive processes. Stochastic Processes and their Applications 44, 221–242.
Horváth, L., Hušková, M., Kokoszka, P., Steinebach, J., 2004. Monitoring changes in linear models. Journal of Statistical Planning and Inference 126, 225–251.
Hušková, M., Koubková, A., 2005. Monitoring jump changes in linear models. Journal of Statistics and Operation Research 39, 51–70.
Hušková, M., Prášková, Z., Steinebach, J., 2007. On the detection of changes in autoregressive time series I. Asymptotics. Journal of Statistical Planning and Inference 137, 1243–1259.
Kauppi, H., Saikkonen, P., 2008. Predicting US recessions with dynamic binary response models. Review of Economics and Statistics 90, 777–791.
Kedem, B., Fokianos, K., 2002. Regression Models for Time Series Analysis. Wiley, New York.
Kokoszka, P., Teyssière, G., 2002. Change-point detection in GARCH models: asymptotic and bootstrap tests. Available online at ⟨http://ideas.repec.org/p/cor/louvco/2002065.html⟩.
Kuelbs, J., Philipp, W., 1980. Almost sure invariance principles for partial sums of mixing B-valued random variables. Annals of Probability 8, 1003–1036.
Ling, S., 2007. Testing for change points in time series models and limiting theorems for NED sequences. Annals of Statistics 35, 1213–1237.
Ma, L., 1997. The asymptotic distributions of maximum likelihood ratio test and maximally selected χ²-test in binomial observations. Journal of Statistical Planning and Inference 67, 17–43.
Meyn, S., Tweedie, R.L., 2009. Markov Chains and Stochastic Stability, 2nd ed. Cambridge University Press.
Pettitt, A.N., 1980. A simple cumulative sum type statistic for the changepoint problem with zero-one observations. Biometrika 67, 79–84.
R Development Core Team, 2011. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
Schmitz, A., 2011. Limit Theorems in Change-Point Analysis for Dependent Data. Ph.D. Thesis. University of Cologne, Germany. Available online at ⟨http://kups.ub.uni-koeln.de/4224/⟩.
Serbinowska, M., 1996. Consistency of an estimator of the number of changes in binomial observations. Statistics & Probability Letters 29, 337–344.
Startz, R., 2008. Binomial autoregressive moving average models with an application to US recession. Journal of Business & Economic Statistics 26, 1–8.
Wang, C., Li, W.K., 2011. On the autopersistence functions and the autopersistence graphs of binary autoregressive time series. Journal of Time Series Analysis 32, 639–646.
Weiss, C., Testik, M., 2009. CUSUM monitoring of first-order integer-valued autoregressive processes of Poisson counts. Journal of Quality Technology 41, 389–400.
Wilks, D., Wilby, R., 1999. The weather generation game: a review of stochastic weather models. Progress in Physical Geography 23, 329–357.
Worsley, K.J., 1983. The power of likelihood ratio and cumulative sum tests for a change in a binomial probability. Biometrika 70, 455–464.