Journal of Statistical Planning and Inference 126 (2004) 495–520

www.elsevier.com/locate/jspi

Hölder norm test statistics for epidemic change

Alfredas Račkauskas^a, Charles Suquet^b,∗

^a Department of Mathematics, Institute of Mathematics and Informatics, Vilnius University, Naugarduko 24, Lt-2006 Vilnius, Lithuania

^b Université des Sciences et Technologies de Lille, Mathématiques Appliquées, F.R.E. CNRS 2222, Bât. M2, U.F.R. de Mathématiques, F-59655 Villeneuve d'Ascq Cedex, France

Received 5 December 2002; accepted 4 September 2003

Abstract

To detect epidemic change in the mean of a sample of size n, we introduce new test statistics UI and DI based on weighted increments of partial sums. We obtain their limit distributions under the null hypothesis of no change in the mean. Under the alternative hypothesis our statistics can detect very short epidemics of length log^γ n, γ > 1. Using self-normalization and adaptiveness to modify UI and DI allows us to prove the same results under very relaxed moment assumptions. Trimmed versions of UI and DI are also studied.
© 2003 Elsevier B.V. All rights reserved.

MSC: 62E20; 62G10; 60F17

Keywords: Change point; Epidemic alternative; Functional central limit theorem; Hölder norm; Partial sums processes; Self-normalization

1. Introduction

An important question in the large area of change point problems involves testing the null hypothesis of no parameter change in a sample versus the alternative that parameter changes do take place at an unknown time. For a survey we refer to the books by Brodsky and Darkhovsky (1993) or Csörgő and Horváth (1997). In this paper we suggest a class of new statistics for testing change in the mean under the so-called epidemic alternative. More precisely, given a sample X_1, X_2, …, X_n, we want to test the

☆ Research supported by a cooperation agreement CNRS/LITHUANIA (4714).
∗ Corresponding author. Fax: +33-320-436774.
E-mail address: [email protected] (C. Suquet).

0378-3758/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/j.jspi.2003.09.004


standard null hypothesis of constant mean

(H₀): X_1, …, X_n all have the same mean, denoted by μ_0,

against the epidemic alternative

(H_A): there are integers 1 < k* < m* < n and a constant μ_1 ≠ μ_0 such that

  EX_i = μ_0 + (μ_1 − μ_0) 1{k* < i ≤ m*},  i = 1, 2, …, n.

Writing l* := m* − k* for the length of the epidemic, we assume throughout the paper that both l* and n − l* go to infinity with n.

In this paper we follow the classical methodology of building test statistics from continuous functionals of a partial sums process. Set

  S(0) = 0,  S(t) = ∑_{k≤t} X_k,  0 < t ≤ n.

When A is a set of integers, S(A) will denote ∑_{i∈A} X_i. For instance we suggest using, with 0 < α < 1/2,

  UI(n, α) := max_{1≤i<j≤n} |S(j) − S(i) − S(n)(j/n − i/n)| / [(j/n − i/n)(1 − (j − i)/n)]^α.
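For illustration only (this sketch is ours, not part of the original paper), UI(n, α) can be evaluated directly in O(n²) time:

```python
def ui_statistic(x, alpha):
    """Brute-force UI(n, alpha): max over 1 <= i < j <= n of
    |S(j) - S(i) - S(n)(j - i)/n| / [h(1 - h)]^alpha, with h = (j - i)/n."""
    n = len(x)
    s = [0.0]                      # partial sums S(0), ..., S(n)
    for v in x:
        s.append(s[-1] + v)
    best = 0.0
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            h = (j - i) / n        # h < 1 since j - i <= n - 1
            num = abs(s[j] - s[i] - s[n] * h)
            best = max(best, num / (h * (1.0 - h)) ** alpha)
    return best
```

For data containing a mean shift on a subinterval, the maximum tends to be attained by pairs (i, j) near the epidemic endpoints.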

This is a functional, continuous in some Hölder topology, of the classical Donsker–Prokhorov polygonal line process. The statistic UI(n, 0) was suggested by Levin and Kline (1985); see Csörgő and Horváth (1997).

To motivate the definition of such test statistics, let us assume just for a moment that the change times k* and m* are known. Suppose moreover under (H₀) that the (X_i, 1 ≤ i ≤ n) are independent identically distributed with finite variance σ². Under (H_A), suppose that the (X_i, i ∈ I_n) and the (X_i, i ∈ I_nᶜ) are separately independent identically distributed with the same finite variance σ², where

  I_n := {i ∈ {1, …, n} : k* < i ≤ m*},  I_nᶜ := {1, …, n} \ I_n.

Then we simply have a two sample problem with known variances. It is then natural to accept (H₀) for small values of the statistic |Q| and to reject it for large ones, where

  Q := [S(I_n) − l* S(n)/n] / (l*)^{1/2} − [S(I_nᶜ) − (n − l*) S(n)/n] / (n − l*)^{1/2}.

After some algebra, Q may be recast as

  Q = (n^{1/2} / (l*(n − l*))^{1/2}) (S(I_n) − (l*/n) S(n)) [(1 − l*/n)^{1/2} + (l*/n)^{1/2}].
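The omitted algebra is short: since S(I_nᶜ) = S(n) − S(I_n), the two bracketed numerators in Q are opposite. In LaTeX form:

```latex
S(I_n^c)-\frac{n-l^*}{n}S(n)
  = S(n)-S(I_n)-\frac{n-l^*}{n}S(n)
  = -\Bigl(S(I_n)-\frac{l^*}{n}S(n)\Bigr),
\qquad\text{hence}\qquad
Q=\Bigl(S(I_n)-\frac{l^*}{n}S(n)\Bigr)
  \Bigl(\frac{1}{\sqrt{l^*}}+\frac{1}{\sqrt{n-l^*}}\Bigr)
 =\frac{n^{1/2}}{(l^*(n-l^*))^{1/2}}
  \Bigl(S(I_n)-\frac{l^*}{n}S(n)\Bigr)
  \Bigl[\Bigl(1-\frac{l^*}{n}\Bigr)^{1/2}+\Bigl(\frac{l^*}{n}\Bigr)^{1/2}\Bigr].
```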


As the last factor in square brackets ranges between 1 and 2^{1/2}, we may drop it and so replace |Q| by the statistic

  R := (n^{1/2} / (l*(n − l*))^{1/2}) |S(I_n) − (l*/n) S(n)| = n^{−1/2} |S(I_n) − (l*/n) S(n)| / [(l*/n)(1 − l*/n)]^{1/2}.

Introducing now the notation

  t_k = t_{n,k} := k/n,  0 ≤ k ≤ n,

enables us to rewrite R as

  R = n^{−1/2} |S(m*) − S(k*) − S(n)(t_{m*} − t_{k*})| / [(t_{m*} − t_{k*})(1 − (t_{m*} − t_{k*}))]^{1/2}.

Now, in the more realistic situation where k* and m* are unknown, it is reasonable to replace R by taking the maximum over all possible indexes for k* and m*. This leads us to consider

  UI(n, 1/2) := max_{1≤i<j≤n} |S(j) − S(i) − S(n)(t_j − t_i)| / [(t_j − t_i)(1 − (t_j − t_i))]^{1/2}.

It is worth noting that the same statistic arises from likelihood arguments in the special case where the observations X_i are Gaussian; see Yao (1993). The asymptotic distribution of UI(n, 1/2) is unknown, due to difficulties caused by the denominator (for historical remarks see Csörgő and Horváth, 1997, p. 183).

In our setting, the X_i's are not supposed to be Gaussian. Moreover, it seems fair to pay something in terms of normalization when passing from R to n^{−1/2} UI(n, 1/2). Intuitively the cost should depend on the moment assumptions made about the X_i's. To discuss this, let us introduce the polygonal partial sums process ξ_n defined by linear interpolation between the points (t_k, S(k)). Then UI(n, 1/2) appears as the discretization through the grid (t_k, 0 ≤ k ≤ n) of the functional T_{1/2}(ξ_n), where

  T_{1/2}(x) := sup_{0<s<t<1} |x(t) − x(s) − (x(1) − x(0))(t − s)| / [(t − s)(1 − (t − s))]^{1/2}.   (1)

This functional is continuous in the Hölder space H^o_{1/2} of functions x : [0,1] → R such that |x(t + h) − x(t)| = o(h^{1/2}), uniformly in t. Obviously the finite dimensional distributions of n^{−1/2} σ^{−1} ξ_n converge to those of a standard Brownian motion W. However, Lévy's theorem on the modulus of uniform continuity of W implies that H^o_{1/2} has too strong a topology to support a version of W. So n^{−1/2} ξ_n cannot converge in distribution to σW in the space H^o_{1/2}. This forbids us to obtain the limiting distribution of T_{1/2}(n^{−1/2} ξ_n) by an invariance principle in H^o_{1/2} via continuous mapping. Fortunately, Hölderian invariance principles do exist for, roughly speaking, the whole scale of Hölder spaces H^o_ρ of functions x such that |x(t + h) − x(t)| = o(ρ(h)), uniformly in t, provided that the weight function ρ satisfies lim_{h↓0} ρ(h)(h |log h|)^{−1/2} = ∞. This type of invariance principle goes back to Lamperti (1962), who studied the case ρ(h) = h^α (0 < α < 1/2). A complete characterization in terms of moment assumptions on X_1 was obtained recently by the authors in the general case (see Theorem 11 below). This leads us to replace T_{1/2}(n^{−1/2} ξ_n) by T_α(n^{−1/2} ξ_n), obtained by substituting the denominator in (1) by (t − s)^α (1 − (t − s))^α. Going back to the discretization, we finally suggest the class UI (uniform increments) of statistics, which includes in particular UI(n, α) and similar statistics UI(n, ρ) built with a general weight ρ(h) instead of h^α. Together with UI we consider the class of DI (dyadic increments) statistics, which includes in particular

  DI(n, α) = max_{1≤j≤log₂ n} 2^{jα} max_{r∈D_j} |S(nr) − ½ S(nr + n2^{−j}) − ½ S(nr − n2^{−j})|,

where D_j is the set of dyadic numbers of level j and log₂ denotes the logarithm with base 2. DI(n, α) and UI(n, α) have similar asymptotic behaviors. Moreover, dyadic increments statistics are of particular interest since their limiting distributions are completely specified (see Theorem 10 below).

Due to the independence of the X_i's, it is easy to see that even stochastic boundedness of either of n^{−1/2} UI(n, α) or n^{−1/2} DI(n, α) yields that of n^{−1/2+α} max_{1≤i≤n} |X_i|. Hence, necessarily P(|X_1| > t) = O(t^{−p(α)}), where p(α) = (1/2 − α)^{−1}. So, the heavier the weight ρ(h), the stronger the moment assumptions required to obtain the convergence of n^{−1/2} UI(n, ρ). On the other hand, the interest of a heavy weight lies in the detection of short epidemics. Indeed (see Theorem 4 below), UI(n, 0) can detect only epidemics whose length l* is such that n^{1/2} = o(l*). For 0 < α < 1/2, UI(n, α) detects epidemics with n^δ = o(l*), where δ = (1 − 2α)/(2 − 2α). With the weight ρ(h) = h^{1/2} log^β(c/h), β > 1/2, UI(n, ρ) detects epidemics such that log^{2β} n = o(l*).

We consider two ways of preserving the sensitivity to short epidemics while relaxing

the moment assumptions. One is trimming, and the other is self-normalization with adaptive selection of the partial sums increments. Trimming leads to a class of statistics which includes

  DI(n, α, ϑ) = max_{1≤j≤ϑ log₂ n} 2^{jα} max_{r∈D_j} |S(nr) − ½ S(nr + n2^{−j}) − ½ S(nr − n2^{−j})|,

where 0 < ϑ < 1. The adaptive construction of the partial sums process and the corresponding functional central limit theorem proved in Račkauskas and Suquet (2001a) allow us to deal with a class of statistics which includes e.g. SUI(n, α), obtained by replacing in UI(n, α) the deterministic points t_k by the random points v_k := V_k²/V_n², where V_k² = X_1² + ⋯ + X_k², k = 1, …, n:

  SUI(n, α) = max_{1≤i<j≤n} |S(j) − S(i) − S(n)(v_j − v_i)| / [(v_j − v_i)(1 − (v_j − v_i))]^α.

When X_1 is symmetric, the convergence in distribution of V_n^{−1} SUI(n, α) requires only the membership of X_1 in the domain of attraction of the normal law; otherwise the existence of E|X_1|^{2+ε} for some ε > 0 is sufficient.

The paper is organized as follows. In Section 2, the definitions of the classes of statistics are presented and limiting distributions are given. All proofs are deferred to Section 3.


2. UI and DI statistics and their asymptotics

All the test statistics studied in this paper may be viewed as discretizations of some Hölder norms or semi-norms. The following subsection contains the relevant background.

2.1. The Hölderian framework

Let ρ : [0,1] → R⁺ be a weight function. Membership of a continuous function x : [0,1] → R in the Hölder space H^o_ρ means roughly that |x(t + h) − x(t)| = o(ρ(|h|)) uniformly in t. The classical scale of Hölder spaces uses ρ(h) = h^α, 0 < α < 1. Lévy's theorem on the modulus of continuity of the Brownian motion restricts the investigation of invariance principles in this scale to the range 0 < α < 1/2. The same result leads naturally to considering the Hölder spaces built with the functions ρ(h) = h^{1/2} log^β(c/h) for β > 1/2. We include these two cases of practical interest in the following rather general class R.

Definition 1. We denote by R the class of non-decreasing functions ρ : [0,1] → R⁺ satisfying:

(i) for some 0 < α ≤ 1/2 and some function L, positive on [1, ∞) and normalized slowly varying at infinity,

  ρ(h) = h^α L(1/h),  0 < h ≤ 1;

(ii) θ(t) := t^{1/2} ρ(1/t) is C¹ on [1, ∞);
(iii) there is a β > 1/2 and some a > 0 such that θ(t) log^{−β}(t) is non-decreasing on [a, ∞).

Let us recall that L is normalized slowly varying at infinity if and only if for every δ > 0, t^δ L(t) is ultimately increasing and t^{−δ} L(t) is ultimately decreasing (Bingham et al., 1987, Theorem 1.5.5). The main practical examples we have in mind may be parametrized by

  ρ(h) = ρ(h; α, β) := h^α log^β(c/h).
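These weights and the associated function θ(t) = t^{1/2} ρ(1/t) of Definition 1(ii) are elementary to code. The following is a sketch of ours; the default c = e, which makes log(c/h) ≥ 1 on (0, 1], is our choice:

```python
import math

def rho(h, alpha, beta, c=math.e):
    """Weight rho(h; alpha, beta) = h^alpha * log^beta(c/h), 0 < h <= 1."""
    return h ** alpha * math.log(c / h) ** beta

def theta(t, alpha, beta, c=math.e):
    """theta(t) = t^(1/2) * rho(1/t), t >= 1."""
    return math.sqrt(t) * rho(1.0 / t, alpha, beta, c)
```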

We write C[0,1] for the Banach space of continuous functions x : [0,1] → R endowed with the supremum norm ‖x‖_∞ := sup{|x(t)| : t ∈ [0,1]}.

Definition 2. Let ρ be a real valued non-decreasing function on [0,1], null and right continuous at 0. We denote by H^o_ρ the Hölder space

  H^o_ρ := {x ∈ C[0,1] : lim_{δ→0} ω_ρ(x, δ) = 0},


where

  ω_ρ(x, δ) := sup_{s,t∈[0,1], 0<t−s<δ} |x(t) − x(s)| / ρ(t − s).

H^o_ρ is a separable Banach space for its native norm

  ‖x‖_ρ := |x(0)| + ω_ρ(x, 1).

Under technical assumptions, satisfied by any ρ in R, the space H^o_ρ may be endowed with an equivalent norm ‖x‖_ρ^{seq} built on weighted dyadic increments of x. Let us denote by D_j the set of dyadic numbers in [0,1] of level j, i.e.

  D_0 = {0, 1},  D_j = {(2l − 1)2^{−j} : 1 ≤ l ≤ 2^{j−1}},  j ≥ 1.

We write D (resp. D*) for the set of dyadic numbers in [0,1] (resp. (0,1]),

  D := ⋃_{j≥0} D_j,  D* := D \ {0}.

Put, for r ∈ D_j, j ≥ 0,

  r⁻ := r − 2^{−j},  r⁺ := r + 2^{−j}.

For any function x : [0,1] → R, define its Schauder coefficients λ_r(x) by

  λ_r(x) := x(r) − (x(r⁺) + x(r⁻))/2,  r ∈ D_j, j ≥ 1,   (2)

and in the special case j = 0 by λ_0(x) := x(0), λ_1(x) := x(1). When ρ belongs to R, we have on the space H^o_ρ the equivalence of norms (see Račkauskas and Suquet, 2001b; Semadeni, 1982)

  ‖x‖_ρ ∼ ‖x‖_ρ^{seq} := sup_{j≥0} ρ(2^{−j})^{−1} max_{r∈D_j} |λ_r(x)|.
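As an illustration (our sketch, not part of the paper), the sequential norm truncated at a finite level jmax can be evaluated for a function given in closed form:

```python
def seq_norm(x, rho, jmax):
    """Approximate ||x||^seq_rho = sup_{j>=0} rho(2^-j)^{-1} max_{r in D_j} |lambda_r(x)|,
    truncating the supremum at level jmax; x and rho are callables."""
    # level j = 0: lambda_0(x) = x(0), lambda_1(x) = x(1), weight rho(2^0) = rho(1)
    best = max(abs(x(0.0)), abs(x(1.0))) / rho(1.0)
    for j in range(1, jmax + 1):
        step = 2.0 ** (-j)
        for l in range(1, 2 ** (j - 1) + 1):
            r = (2 * l - 1) * step          # D_j = {(2l-1)2^-j : 1 <= l <= 2^(j-1)}
            lam = x(r) - 0.5 * (x(r + step) + x(r - step))
            best = max(best, abs(lam) / rho(step))
    return best
```

For a linear function all Schauder coefficients of level j ≥ 1 vanish, so only the level-0 terms contribute.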

Let W = (W(t), t ∈ [0,1]) be a standard Wiener process and B = (B(t), t ∈ [0,1]) the corresponding Brownian bridge, B(t) = W(t) − tW(1), t ∈ [0,1]. Consider, for ρ in R, the following random variables:

  UI(ρ) := sup_{0<t−s<1} |B(t) − B(s)| / ρ((t − s)(1 − (t − s)))   (3)

and

  DI(ρ) := sup_{j≥1} ρ(2^{−j})^{−1} max_{r∈D_j} |W(r) − ½ W(r⁺) − ½ W(r⁻)| = ‖B‖_ρ^{seq}.   (4)

These variables serve as limits for the uniform increments (UI) and dyadic increments (DI) statistics respectively. No analytical form seems to be known for the distribution function of UI(ρ), whereas the distribution of DI(ρ) is completely specified by Theorem 10 below.

Remark 1. We do not treat in this paper the problem of the low regularity Hölder norms associated with the weights ρ(h) = L(1/h), with L slowly varying. In this case nothing seems known about the equivalence of the norms ‖·‖_ρ and ‖·‖_ρ^{seq}, which plays a key role in our proofs of Hölderian invariance principles. The extension of our results to this boundary case would require other techniques and seems to be of limited practical scope.

2.2. Statistics UI(n, ρ) and DI(n, ρ)

To simplify notation, put

  φ(h) := ρ(h(1 − h)),  0 ≤ h ≤ 1.

For ρ ∈ R, define (recalling that t_k = k/n, 0 ≤ k ≤ n)

  UI(n, ρ) = max_{1≤i<j≤n} |S(j) − S(i) − S(n)(t_j − t_i)| / φ(t_j − t_i)

and

  DI(n, ρ) = max_{1<2^j≤n} ρ(2^{−j})^{−1} max_{r∈D_j} |S(nr) − ½ S(nr⁺) − ½ S(nr⁻)|.

To obtain the limiting distributions of these statistics we shall work with a stronger null

hypothesis, namely

  (H₀′): X_1, …, X_n are independent identically distributed random variables with mean denoted μ_0.
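Before stating the limit theorems, here is an illustrative brute-force computation of DI(n, ρ) (our sketch, not the paper's; S(t) is read as the partial sum over k ≤ ⌊t⌋, and rho is passed as a function):

```python
import math

def di_statistic(x, rho):
    """DI(n, rho): max over levels j >= 1 with 2^j <= n of
    rho(2^-j)^{-1} max_{r in D_j} |S(nr) - S(nr+)/2 - S(nr-)/2|,
    where S(t) = sum of x_k over k <= t."""
    n = len(x)
    s = [0.0]
    for v in x:
        s.append(s[-1] + v)
    def S(t):
        return s[min(n, int(math.floor(t)))]
    best, j = 0.0, 1
    while 2 ** j <= n:
        step = 2.0 ** (-j)
        for l in range(1, 2 ** (j - 1) + 1):
            r = (2 * l - 1) * step
            inc = abs(S(n * r) - 0.5 * S(n * (r + step)) - 0.5 * S(n * (r - step)))
            best = max(best, inc / rho(step))
        j += 1
    return best
```

A constant sample gives 0 exactly, since the dyadic second differences of a linear partial sum path vanish.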

Theorem 3. Under (H₀′), assume that ρ ∈ R and that for every A > 0,

  lim_{t→∞} t P(|X_1| > A θ(t)) = 0.   (5)

Then

  σ^{−1} n^{−1/2} UI(n, ρ) → UI(ρ) in distribution as n → ∞   (6)

and

  σ^{−1} n^{−1/2} DI(n, ρ) → DI(ρ) in distribution as n → ∞,   (7)

where σ² = var(X_1) and UI(ρ), DI(ρ) are defined by (3) and (4) respectively.

When α < 1/2, it is enough to take A = 1 in (5). In the special case ρ(h) = ρ(h; α, 0) = h^α, (5) may be recast as

  P(|X_1| > t) = o(t^{−p})  with p := 1/(1/2 − α).

In the other special case ρ(h) = ρ(h; 1/2, β) = h^{1/2} log^β(c/h), where β > 1/2, it is not possible to drop the constant A in condition (5), which is easily seen to be equivalent to

  E exp{λ|X_1|^{1/β}} < ∞  for each λ > 0.   (8)

Let us note that either of (6) or (7) yields stochastic boundedness of the random variable max_{1≤k≤n} |X_k|/θ(n), which in turn gives (8).


Remark 2. In the case where the variance σ² is unknown, the results remain valid if σ² is replaced by its standard estimator σ̂².

Remark 3. Under (H₀), the statistic n^{−1/2} UI(n, ρ) keeps the same value when each X_i is replaced by X_i′ := X_i − EX_i. This property is only asymptotically true for n^{−1/2} DI(n, ρ); see the proof of Theorem 3. For practical use of DI(n, ρ), it is preferable to replace X_i by X_i′ if μ_0 is known, or else by X_i − X̄ where X̄ := n^{−1}(X_1 + ⋯ + X_n). This will avoid a bias term which may be of the order of |EX_1| ln^{−β} n in the worst cases.

To establish the consistency of tests rejecting the null hypothesis against the epidemic alternative (H_A) for large values of n^{−1/2} UI(n, ρ), we naturally assume that the numbers of observations k*, m* − k*, n − m* before, during and after the epidemic go to infinity with n. Recall that l* := m* − k* denotes the length of the epidemic.

Theorem 4. Let ρ ∈ R. Assume under (H_A) that the X_i's are independent and that σ₀² := sup_{k≥1} var(X_k) is finite. If

  lim_{n→∞} (n^{1/2} h_n / ρ(h_n)) |μ_1 − μ_0| = ∞,  where h_n := (l*/n)(1 − l*/n),   (9)

then

  n^{−1/2} UI(n, ρ) → ∞ and n^{−1/2} DI(n, ρ) → ∞ in probability as n → ∞.   (10)

In our setting μ_0 and μ_1 are constants, but the proof of Theorem 4 remains valid when μ_1 and μ_0 are allowed to depend on n. This explains the presence of the factor |μ_1 − μ_0| in (9). This dependence of μ_0 and μ_1 on n is discarded in the following examples.

When ρ(h) = h^α, (9) is equivalent to

  lim_{n→∞} h_n n^{1/(2−2α)} = ∞.   (11)

In this case one can detect short epidemics such that n^{(1−2α)/(2−2α)} = o(l*), as well as long epidemics such that n^{(1−2α)/(2−2α)} = o(n − l*).

When ρ(h) = h^{1/2} log^β(c/h) with β > 1/2, (9) is satisfied provided that h_n = n^{−1} log^γ n with γ > 2β. This leads to the detection of short epidemics such that log^γ n = o(l*), as well as of long ones verifying log^γ n = o(n − l*).

2.3. Statistics SUI and SDI

To introduce the adaptive self-normalized statistics, we shall restrict the null hypothesis by assuming that μ_0 = 0. Practically, the reduction to this case by centering requires the knowledge of μ_0. Although this seems a quite reasonable assumption, it is fair to point out that this knowledge was not required for the UI and DI statistics.


For any index set A ⊂ {1, …, n}, define

  V²(A) := ∑_{i∈A} X_i².

Then V_k² = V²({1, …, k}), V_0² = 0. To simplify notation we write v_k for the random points of [0,1]

  v_k := V_k²/V_n²,  k = 0, 1, …, n.

For ρ ∈ R, define

  SUI(n, ρ) = max_{1≤i<j≤n} |S(j) − S(i) − S(n)(v_j − v_i)| / φ(v_j − v_i).
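A brute-force evaluation of SUI(n, ρ) differs from that of UI(n, ρ) only through the random grid (our sketch; pairs where the weight ρ((v_j − v_i)(1 − (v_j − v_i))) would vanish are skipped):

```python
def sui_statistic(x, rho):
    """Self-normalized SUI(n, rho): the grid t_k = k/n of UI(n, rho) is replaced
    by v_k = V_k^2 / V_n^2 with V_k^2 = x_1^2 + ... + x_k^2."""
    n = len(x)
    s, v2 = [0.0], [0.0]
    for val in x:
        s.append(s[-1] + val)
        v2.append(v2[-1] + val * val)
    vn2 = v2[-1]
    best = 0.0
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            h = (v2[j] - v2[i]) / vn2
            if h <= 0.0 or h >= 1.0:
                continue               # degenerate pair: the weight would vanish
            num = abs(s[j] - s[i] - s[n] * h)
            best = max(best, num / rho(h * (1.0 - h)))
    return best
```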

Introduce, for any t ∈ [0,1],

  τ_n(t) := max{i ≤ n : V_i² ≤ tV_n²}

and, log₂ denoting as before the logarithm with base 2 (log₂(2^t) = t = log(e^t)),

  J_n := log₂(V_n²/X_{n:n}²),  where X_{n:n} := max_{1≤k≤n} |X_k|.

Now define

  SDI(n, ρ) := max_{1≤j≤J_n} ρ(2^{−j})^{−1} max_{r∈D_j} |S(τ_n(r)) − ½ S(τ_n(r⁺)) − ½ S(τ_n(r⁻))|.

Theorem 5. Assume that under (H₀′), μ_0 = 0 and the random variables X_1, …, X_n are either symmetric and belong to the domain of attraction of the normal law, or satisfy E|X_1|^{2+ε} < ∞ for some ε > 0. Then for every ρ ∈ R,

  V_n^{−1} SUI(n, ρ) → UI(ρ) in distribution as n → ∞   (12)

and

  V_n^{−1} SDI(n, ρ) → DI(ρ) in distribution as n → ∞.   (13)

Theorem 6. Let ρ ∈ R. Under (H_A) with μ_0 = 0, assume that the X_i's satisfy

  S(I_n) = l* μ_1 + O_P(l*^{1/2}),  S(I_nᶜ) = O_P((n − l*)^{1/2})   (14)

and

  V²(I_n)/l* → b_1,  V²(I_nᶜ)/(n − l*) → b_0 in probability as n → ∞,   (15)

for some finite constants b_0 and b_1. Assume that l*/n converges to a limit c ∈ [0,1] and, when c = 0 or 1, assume moreover that

  lim_{n→∞} n^{1/2} h_n/ρ(h_n) = ∞,  where h_n := (l*/n)(1 − l*/n).   (16)


Then

  V_n^{−1} SUI(n, ρ) → ∞ in probability as n → ∞.   (17)

Note that in Theorem 6, no assumption is made about the dependence structure of the X_i's. It is easy to verify the general hypotheses (14) and (15) in various situations of weak dependence like mixing or association. Under independence of the X_i's (without assuming identical distributions inside each block I_n and I_nᶜ), it is enough to have sup_{k≥1} E|X_k|^{2+ε} < ∞ and EV²(I_n)/l* → b_1, EV²(I_nᶜ)/(n − l*) → b_0. This follows easily from Lindeberg's condition for the central limit theorem (giving (14)) and from Theorem 5, p. 261 in Petrov (1975), giving the weak law of large numbers required for the X_i's.

Theorem 7. Let ρ ∈ R and set X_i′ := X_i − E(X_i). Under (H_A) with μ_0 = 0, assume that the random variables X_i′ are independent identically distributed and that E|X_1′|^{2+ε} is finite for some ε > 0. Then under (16),

  V_n^{−1} SDI(n, ρ) → ∞ in probability as n → ∞.   (18)

Remark 4. Theorems 5 and 7 are proved using the Hölderian FCLT for self-normalized partial sums processes (see Račkauskas and Suquet, 2001a). As far as we know, the removal of the assumption of a finite 2 + ε moment in the non-symmetric case for the Hölderian FCLT remains an open problem. Of course when there is no weight (ρ(h) = 1), we can apply the weak invariance principle in C[0,1] under self-normalization. In this special case, Theorems 5 and 7 are valid with X_1 (resp. X_1′) in the domain of attraction of the normal distribution.

2.4. Trimmed statistics

Let ρ ∈ R and let the sequence m_n → ∞ be such that m_n ≤ log₂ n. Define

  DI(n, ρ, m_n) := max_{1≤j≤m_n} ρ(2^{−j})^{−1} max_{r∈D_j} |S(nr) − ½(S(nr⁺) + S(nr⁻))|.

Theorem 8. If under (H₀′), E|X_1|^{2+δ} < ∞ for some 0 < δ ≤ 1 and

  lim_{n→∞} m_n^{1+δ} n^{−δ/2} ρ^{−2−δ}(2^{−m_n}) = 0,   (19)

then

  σ^{−1} n^{−1/2} DI(n, ρ, m_n) → DI(ρ) in distribution as n → ∞.

Let (ε_n) and (ε_n′) be two sequences of positive real numbers, both converging to 0. Define, with t_k = k/n,

  UI(n, ρ, ε_n, ε_n′) := max_{nε_n≤i<j≤n(1−ε_n′)} |S(j) − S(i) − S(n)(t_j − t_i)| / φ(t_j − t_i).


Theorem 9. If under (H₀′), E|X_1|^{2+δ} < ∞ for some 0 < δ ≤ 1 and

  lim_{n→∞} n^{−δ/2} ρ^{−2−δ}(ε_n ε_n′) log^{1+δ} n = 0,

then

  σ^{−1} n^{−1/2} UI(n, ρ, ε_n, ε_n′) → UI(ρ) in distribution as n → ∞.

2.5. The distribution of DI(ρ)

The distribution function of DI(ρ) may be conveniently expressed in terms of the error function

  erf x = (2/π^{1/2}) ∫₀^x exp(−s²) ds.

The following asymptotic expansion will be useful:

  erf x = 1 − (x π^{1/2})^{−1} exp(−x²)(1 + O(x^{−2})),  x → ∞.

Theorem 10. Let c = lim sup_{j→∞} j^{1/2}/θ(2^j).

(i) If c = ∞ then DI(ρ) = ∞ almost surely.
(ii) If 0 ≤ c < ∞, then DI(ρ) is almost surely finite and its distribution function is given by

  P(DI(ρ) ≤ x) = ∏_{j=1}^∞ {erf(θ(2^j)x)}^{2^{j−1}},  x ≥ 0.

The distribution function of DI(ρ) is continuous, with support [c(log 2)^{1/2}, ∞).
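Theorem 10(ii) makes the numerical evaluation of the distribution function of DI(ρ) straightforward by truncating the infinite product (our sketch; jmax is an arbitrary truncation level, and math.erf uses the same normalization as erf above):

```python
import math

def di_cdf(x, theta, jmax=60):
    """Truncated evaluation of P(DI(rho) <= x) = prod_{j>=1} erf(theta(2^j) x)^(2^(j-1)),
    with theta passed as a function of t."""
    if x <= 0.0:
        return 0.0
    log_p = 0.0
    for j in range(1, jmax + 1):
        e = math.erf(theta(2.0 ** j) * x)
        if e <= 0.0:
            return 0.0
        log_p += 2.0 ** (j - 1) * math.log(e)
    return math.exp(log_p)
```

For instance, for ρ(h) = h^{0.4} one has θ(t) = t^{0.1}, and an approximate critical value at a given level can then be obtained by bisection on x ↦ di_cdf(x, theta).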

Remark 5. The exact distribution of UI(ρ) = ‖B‖_ρ is not known. To obtain critical values for UI(n, ρ), one can use a general result on large deviations for Gaussian norms (see e.g. Ledoux and Talagrand, 1991), which reads

  P(UI(ρ) ≥ x) ≤ 4 exp(−x²/(8 E‖B‖_ρ²)),  x ≥ 0.   (20)

Another approach is to use the equivalence of the norms ‖B‖_ρ and ‖B‖_ρ^{seq}, which provides constants c_ρ and C_ρ such that

  c_ρ DI(ρ) ≤ UI(ρ) ≤ C_ρ DI(ρ).   (21)

Both methods give only rough estimates of the critical value for UI(n, ρ).


3. Proofs

3.1. Tools

The proofs of Theorems 3 and 5 are based on invariance principles in Hölder spaces for the partial sums processes ξ_n = (ξ_n(t), t ∈ [0,1]) and ζ_n = (ζ_n(t), t ∈ [0,1]) defined as follows:

  ξ_n is the polygonal line with vertices (k/n, S(k)), 0 ≤ k ≤ n;   (22)

  ζ_n is the polygonal line with vertices (V_k²/V_n², S(k)), 0 ≤ k ≤ n.   (23)

The relevant invariance principles are proved in Račkauskas and Suquet (2001b) and Račkauskas and Suquet (2001a) respectively, and may be stated as follows.

Theorem 11. Let ρ be in R. Then under (H₀′) with μ_0 = 0, the convergence

  σ^{−1} n^{−1/2} ξ_n → W in distribution

takes place in H^o_ρ if and only if condition (5) holds.

Theorem 12. Under the conditions of Theorem 5,

  V_n^{−1} ζ_n → W in distribution

in H^o_ρ for any ρ ∈ R.

The next two lemmas are useful to simplify and unify the proofs of Theorems 3 and 5.

Lemma 13. Let (Y_n)_{n≥1} be a tight sequence of random elements in the separable Banach space B and g_n, g be continuous functionals B → R. Assume that g_n converges pointwise to g on B and that (g_n)_{n≥1} is equicontinuous. Then

  g_n(Y_n) = g(Y_n) + o_P(1).

Proof. By the tightness assumption, there is for every ε > 0 a compact subset K of B such that for every n ≥ 1, P(Y_n ∉ K) < ε. Now, by a classical corollary of Ascoli's theorem, the equicontinuity and pointwise convergence of (g_n) give its uniform convergence to g on the compact K. Then for every δ > 0, there is some n_0 = n_0(δ, K) such that

  sup_{x∈K} |g_n(x) − g(x)| < δ,  n ≥ n_0.

Therefore we have, for n ≥ n_0,

  P(|g_n(Y_n) − g(Y_n)| ≥ δ) ≤ P(Y_n ∉ K) < ε,

which is the expected conclusion.


Lemma 14. Let (B, ‖·‖) be a normed vector space and q : B → R such that:

(a) q is subadditive: q(x + y) ≤ q(x) + q(y), x, y ∈ B;
(b) q is symmetric: q(−x) = q(x), x ∈ B;
(c) for some constant C, q(x) ≤ C‖x‖, x ∈ B.

Then q satisfies the Lipschitz condition

  |q(x + y) − q(x)| ≤ C‖y‖,  x, y ∈ B.   (24)

If F is any set of functionals q fulfilling (a), (b), (c) with the same constant C, then (a), (b), (c) are inherited by g(x) := sup{q(x) : q ∈ F}, which therefore satisfies (24).

The proof is elementary and will be omitted.

3.2. Proof of Theorem 3

Convergence of UI(n, ρ). Consider the functionals g_n, g defined on H^o_ρ by

  g_n(x) := max_{1≤i<j≤n} I(x; i/n, j/n),  g(x) := sup_{0<s<t<1} I(x; s, t),   (25)

where

  I(x; s, t) := |x(t) − x(s) − (t − s)x(1)| / φ(t − s),  0 < t − s < 1,   (26)

and φ(h) = ρ(h(1 − h)), h ∈ [0,1]. Observe that

  UI(n, ρ) = g_n(ξ_n),  UI(ρ) = g(W).   (27)

Clearly the functional q = I(·; s, t) satisfies conditions (a) and (b) of Lemma 14. Let us check condition (c). From Definition 1 it is clear that for ρ ∈ R, the ratio ρ(h)/ρ(h/2) is continuous on (0,1] and has the finite limit 2^α at zero. Hence it is bounded on [0,1] by a constant a = a(ρ) < ∞. Similarly, the ratio ρ̄(h) := h/ρ(h) vanishes at 0 and is bounded on [0,1] by some b = b(ρ) < ∞.

If 0 < t − s ≤ 1/2, then φ(t − s) ≥ ρ((t − s)/2) ≥ a^{−1} ρ(t − s) and

  I(x; s, t) ≤ a |x(t) − x(s)|/ρ(t − s) + a ((t − s)/ρ(t − s)) |x(1)|   (28)
            ≤ a(1 + b) ‖x‖_ρ.   (29)

If 1/2 < t − s < 1, then φ(t − s) ≥ ρ((1 − t + s)/2) ≥ a^{−1} ρ(1 − t + s) and ρ(1 − t + s) is bigger than both ρ(s) and ρ(1 − t). Assuming that x(0) = 0, we write x(t) − x(s) − (t − s)x(1) = x(t) − x(1) + x(0) − x(s) + (1 − t + s)x(1) and obtain

  I(x; s, t) ≤ a |x(1) − x(t)|/ρ(1 − t) + a |x(s) − x(0)|/ρ(s) + a ((1 − t + s)/ρ(1 − t + s)) |x(1)|   (30)
            ≤ a(2 + b) ‖x‖_ρ.   (31)


Introduce the closed subspace B := {x ∈ H^o_ρ : x(0) = 0}. From (29) and (31) we see that the functionals q = I(·; s, t) satisfy on B condition (c) of Lemma 14 with the same constant C = C(ρ) = a(2 + b). It follows by Lemma 14 that g_n as well as g are Lipschitz on B with the same constant C. In particular, the sequence (g_n)_{n≥2} is equicontinuous on B.

Now we shall apply Lemma 13 with Y_n := σ^{−1} n^{−1/2} ξ_n, where ξ_n is defined by (22). By Theorem 11, (Y_n)_{n≥2} is tight on B. To check the pointwise convergence on B of g_n to g, it is enough to show that for each x ∈ B, the function (s, t) ↦ I(x; s, t) can be extended by continuity to the compact set T = {(s, t) ∈ [0,1]² : 0 ≤ s ≤ t ≤ 1}. From (28) we get 0 ≤ I(x; s, t) ≤ a ω_ρ(x, t − s) + a|x(1)| ρ̄(t − s), which allows the continuous extension along the diagonal, putting I(x; s, s) := 0. From (30) we get 0 ≤ I(x; s, t) ≤ 2a ω_ρ(x, 1 − t + s) + a|x(1)| ρ̄(1 − t + s), which allows the continuous extension at the point (0, 1), putting I(x; 0, 1) := 0.

The pointwise convergence of (g_n) being now established, Lemma 13 gives

  g_n(σ^{−1} n^{−1/2} ξ_n) = g(σ^{−1} n^{−1/2} ξ_n) + o_P(1)   (32)

and the convergence of σ^{−1} n^{−1/2} UI(n, ρ) to UI(ρ) follows from Theorem 11, (27) and (32).

Convergence of DI(n, ρ). Let DI′(n, ρ) be defined like DI(n, ρ), replacing X_i by X_i′ = X_i − EX_i. As

  ∑_{nr⁻<i≤nr} X_i − ∑_{nr<i≤nr⁺} X_i = ∑_{nr⁻<i≤nr} X_i′ − ∑_{nr<i≤nr⁺} X_i′ + c(r) EX_1,

with |c(r)| ≤ 1, we clearly have n^{−1/2}(DI(n, ρ) − DI′(n, ρ)) = o_P(1), so we may and do assume without loss of generality that μ_0 = 0 in the proof. First we observe that

  DI(n, ρ) = max_{1<2^j≤n} ρ(2^{−j})^{−1} max_{r∈D_j} |λ_r(S(n·))|,   (33)

where the λ_r are defined by (2) and S(n·) is the discontinuous process (S(nt), 0 ≤ t ≤ 1). This process may also be written as

  S(nt) = ξ_n(t) − (nt − [nt]) X_{[nt]+1},  t ∈ [0,1],

from which we see that

  |λ_r(S(n·)) − λ_r(ξ_n)| ≤ 2 max_{1≤i≤n} |X_i|.

Plugging this estimate into (33) leads to the representation

  σ^{−1} n^{−1/2} DI(n, ρ) = g_n(σ^{−1} n^{−1/2} ξ_n) + Z_n,   (34)

where

  g_n(x) := max_{1<2^j≤n} max_{r∈D_j} |λ_r(x)| / ρ(2^{−j}),  x ∈ H^o_ρ,


and the random variable Z_n satisfies

  |Z_n| ≤ (2/(σ n^{1/2} ρ(1/n))) max_{1≤i≤n} |X_i| = (2/(σ θ(n))) max_{1≤i≤n} |X_i|.   (35)

Hypothesis (5) in Theorem 3 gives immediately the convergence in probability to zero of max_{1≤i≤n} |X_i|/θ(n). The functionals q_r(x) := |λ_r(x)|/ρ(r − r⁻) obviously satisfy the hypotheses of Lemma 14 with the same constant C = 1. The same holds for g_n = max_{1<2^j≤n} max_{r∈D_j} q_r and for

  g(x) := sup_{j≥1} max_{r∈D_j} q_r(x).

Accounting for (34) and (35), Lemma 13 leads to the representation

  σ^{−1} n^{−1/2} DI(n, ρ) = g(σ^{−1} n^{−1/2} ξ_n) + o_P(1),

from which the conclusion follows by Theorem 11 and continuous mapping.

3.3. Proof of Theorem 4

Divergence of UI(n, ρ). Define the random variables

  X_i′ := X_i − μ_0 if i ∈ I_nᶜ,  X_i′ := X_i − μ_1 if i ∈ I_n.

In order to find a lower bound for UI(n, ρ), let us consider the contribution to its numerator of the increment corresponding to the full length of the epidemic. It may be written as

  S(m*) − S(k*) − S(n)(t_{m*} − t_{k*}) = (1 − l*/n) S(I_n) − (l*/n) S(I_nᶜ)
    = l*(1 − l*/n)(μ_1 − μ_0) + R_n,   (36)

where

  R_n := −(l*/n) ∑_{i∈I_nᶜ} X_i′ + (1 − l*/n) ∑_{i∈I_n} X_i′.

It is easy to see that n^{−1/2} R_n = O_P(h_n^{1/2}). Indeed,

  var(n^{−1/2} R_n) ≤ (1/n)(l*/n)²(n − l*) σ₀² + (1/n)(1 − l*/n)² l* σ₀² = σ₀² h_n.

This estimate together with (36) leads to the lower bound

  n^{−1/2} UI(n, ρ) ≥ (n^{1/2} h_n/ρ(h_n)) |μ_1 − μ_0| − O_P(h_n^{1/2}/ρ(h_n)).   (37)

From Definition 1(iii) it is easily seen that lim_{h_n→0} h_n^{1/2}/ρ(h_n) = 0, so (10) follows from condition (9) via (37).


Divergence of DI(n, ρ). Let us first estimate λ_r(S(n·)) under the special configuration r⁻ ≤ t_{k*} < t_{m*} ≤ r (then necessarily l*/n ≤ 1/2). For notational simplification, define, for 0 ≤ s < t ≤ 1 and μ ∈ R,

  S_n(s, t) := ∑_{ns<k≤nt} X_k  and  S_n′(s, t; μ) := ∑_{ns<k≤nt} (X_k′ + μ).

Then we have

  2λ_r(S(n·)) = S_n(r⁻, r) − S_n(r, r⁺)
    = S_n′(r⁻, t_{k*}; μ_0) + S_n′(t_{k*}, t_{m*}; μ_1) + S_n′(t_{m*}, r; μ_0) − S_n′(r, r⁺; μ_0)
    = (μ_1 − μ_0) l* + O(1) + ∑_{nr⁻<k≤nr⁺} ε_k X_k′,

where O(1) and the ε_k's are deterministic and ε_k = ±1. If the level j of r (r ∈ D_j) is such that 2^{−j−1} < l*/n ≤ 2^{−j}, this gives for DI(n, ρ) the same lower bound as the right-hand side of (37). Clearly, the same result holds true when [t_{k*}, t_{m*}] is included in [r, r⁺]. Denoting by c* the midpoint of [t_{k*}, t_{m*}], we obtain the same lower bound (up to multiplicative constants) under the configurations where [c*, t_{m*}] ⊂ [r⁻, r] or [t_{k*}, c*] ⊂ [r, r⁺].

Assume still that l*/n ≤ 1/2 and fix the level j by 2^{−j−1} < l*/n ≤ 2^{−j}. There is a unique r_0 ∈ D_j such that r_0⁻ ≤ c* < r_0⁺. Then the divergence of n^{−1/2} DI(n, ρ) follows from hypothesis (9) by applying the above remarks in each of the following cases.

(a) r_0⁻ ≤ c* < r_0⁻ + 2^{−j−1}. Then c* + l*/(2n) ≤ r_0⁻ + 2^{−j−1} + 2^{−j−1} = r_0, so [c*, t_{m*}] ⊂ [r_0⁻, r_0].
(b) r_0 + 2^{−j−1} ≤ c* < r_0⁺. Then c* − l*/(2n) ≥ r_0, so [t_{k*}, c*] ⊂ [r_0, r_0⁺].
(c) r_0 − 2^{−j−1} ≤ c* < r_0 + 2^{−j−1}. Then r_0⁻ ≤ t_{k*} < t_{m*} ≤ r_0⁺. Only one of the two dyadics r_0⁻ and r_0⁺ has level j − 1. If r_0⁻ ∈ D_{j−1}, writing r_1 := r_0⁻ we have r_1⁺ = r_0⁺ and [t_{k*}, t_{m*}] ⊂ [r_1, r_1⁺]. Else r_0⁺ ∈ D_{j−1} and, with r_1 := r_0⁺, r_1⁻ = r_0⁻, so that [t_{k*}, t_{m*}] ⊂ [r_1⁻, r_1].

It remains to consider the case where l*/n > 1/2. Assume first that t_{k*} ≥ 1 − t_{m*}, so that t_{k*} ≥ (1 − l*/n)/2. Then there is a unique j such that 0 < 2^{−j−1} < t_{k*} ≤ 2^{−j} ≤ 1/2 < t_{m*}. Choosing r_0 := 2^{−j} ∈ D_j, we easily obtain

  2λ_{r_0}(S(n·)) = (μ_0 − μ_1)(n t_{k*}) + O(1) + ∑_{nr_0⁻<k≤nr_0⁺} ε_k X_k′.

Now the divergence of n^{−1/2} DI(n, ρ) follows from (9) in view of the lower bound

  |λ_{r_0}(S(n·))| ≥ |μ_0 − μ_1| (n − l*)/4 − O_P((n − l*)^{1/2}).

When t_{k*} < 1 − t_{m*}, fixing j by 1 − 2^{−j} ≤ t_{m*} < 1 − 2^{−j−1} and choosing r_0 := 1 − 2^{−j} ∈ D_j clearly leads to the same lower bound. The proof is complete.


3.4. Proof of Theorem 5

Convergence of $SUI(n,\rho)$. We use the straightforward representation
$$SUI(n,\rho) = \max_{1\le i<j\le n} I\Bigl(\xi_n;\ \frac{V_i^2}{V_n^2},\ \frac{V_j^2}{V_n^2}\Bigr), \tag{38}$$
where $\xi_n$ and $I$ are defined by (23) and (26) respectively. By Lemma 9 in Račkauskas and Suquet (2001a), if $X_1$ is in the domain of attraction of the normal distribution, then
$$\max_{0\le k\le n}\Bigl|\frac{V_k^2}{V_n^2} - \frac{k}{n}\Bigr| \xrightarrow[n\to\infty]{P} 0. \tag{39}$$
Theorem 12 and the same arguments as in the proof of Theorem 3 give
$$g_n(V_n^{-1}\xi_n) = \max_{1\le i<j\le n} I\Bigl(\xi_n;\ \frac{i}{n},\ \frac{j}{n}\Bigr) = g(V_n^{-1}\xi_n) + o_P(1), \tag{40}$$
with $g$ defined by (25). By continuous mapping, the right-hand side of (40) converges in distribution to $UI(\rho)$. Hence the convergence of $SUI(n,\rho)$ to $UI(\rho)$ will follow from the estimate
$$SUI(n,\rho) - g_n(V_n^{-1}\xi_n) = o_P(1).$$
Due to the tightness in $H^o_\rho$ of $(V_n^{-1}\xi_n)_{n\ge 1}$, this follows in turn from (38), (39) and the next, purely analytical, lemma.

Lemma 15. For any $x\in H^o_\rho$ such that $x(0) = 0$, denote by $I_x$ the continuous function
$$I_x : T\to\mathbb{R},\quad (s,t)\mapsto I(x;s,t),$$
where $T := \{(s,t)\in[0,1]^2;\ 0\le s\le t\le 1\}$ and $I$ is defined by (26) and its continuous extension to the boundary of $T$, putting $I(x;s,s) := 0$ and $I(x;0,1) := 0$. Let $K$ be a compact subset of $H^o_\rho$ of functions $x$ such that $x(0) = 0$. Then the set of functions $E_K := \{I_x;\ x\in K\}$ is equicontinuous.

Proof. Put for notational simplification $y(t) := x(t) - t\,x(1)$ and define
$$y_\theta(s,t) := \frac{y(t) - y(s)}{\theta(|t-s|)},\quad 0 < |t-s| < 1,$$
with $y_\theta(s,t) := 0$ when $|t-s| = 0$ or $1$. This reduces the problem to the proof of the equicontinuity of $E_{K'} := \{y_\theta : T\to\mathbb{R};\ y\in K'\}$, where $K'$ is a compact subset of $H^o_\rho$ of functions vanishing at $0$ and $1$.

It is convenient to estimate the increments of $y$ in terms of the function $\theta(h) := \rho(h(1-h))$. First, we always have, for $0\le s\le s+h\le 1$,
$$|y(s+h) - y(s)| \le \rho(h)\,\omega_\rho(y,h). \tag{41}$$


Because $y(0) = y(1) = 0$ and $\rho$ and $\omega_\rho(y,\cdot)$ are non-decreasing, we also have, for $0\le s\le s+h\le 1$,
$$|y(s+h) - y(s)| \le |y(s+h) - y(1)| + |y(s) - y(0)| \le \rho(1-s-h)\,\omega_\rho(y, 1-s-h) + \rho(s)\,\omega_\rho(y,s) \le 2\rho(1-h)\,\omega_\rho(y, 1-h). \tag{42}$$
Recalling the constant $a = a(\rho) = \sup\{\rho(h)/\rho(h/2);\ 0 < h\le 1\} < \infty$, we see that
$$\rho(h)\wedge\rho(1-h) \le a\,\rho(h(1-h)) = a\,\theta(h). \tag{43}$$
By (41)–(43) there is a constant $c = c(\rho)$ such that
$$|y(s+h) - y(s)| \le c\,\theta(h)\,\omega_\rho(y, h\wedge(1-h)),\quad 0\le s\le s+h\le 1.$$
Writing for simplicity
$$\bar\omega_\rho(y,h) := c\,\omega_\rho(y, h\wedge(1-h)),$$
the above estimate leads to
$$y(t) = y(s) + y_\theta(s,t)\,\theta(t-s),\quad (s,t)\in[0,1]^2, \tag{44}$$
where
$$|y_\theta(s,t)| \le \bar\omega_\rho(y, |t-s|). \tag{45}$$
Using expansion (44), we get, for any $(s,t)$ and $(s',t')$ in the interior of $T$ (writing $h' := t' - s'$ and $h := t - s$),
$$y_\theta(s',t') - y_\theta(s,t) = \frac{(y(t) - y(s))(\theta(h) - \theta(h'))}{\theta(h)\theta(h')} + \frac{y_\theta(t,t')\theta(|t'-t|) - y_\theta(s,s')\theta(|s'-s|)}{\theta(h')} = y_\theta(s,t)\,\frac{\theta(h) - \theta(h')}{\theta(h')} + \frac{y_\theta(t,t')\theta(|t'-t|) - y_\theta(s,s')\theta(|s'-s|)}{\theta(h')}.$$
Due to (45), this gives
$$|y_\theta(s',t') - y_\theta(s,t)| \le c\,\|y\|_\rho\,\frac{|\theta(h) - \theta(h')| + \theta(|t'-t|) + \theta(|s'-s|)}{\rho(h')}. \tag{46}$$
To deal with the points close to the border of $T$, we complete this estimate with
$$|y_\theta(s',t') - y_\theta(s,t)| \le c\,(\omega_\rho(y,h) + \omega_\rho(y,h')). \tag{47}$$
Now the equicontinuity of $E_{K'}$ follows easily from (46), (47), the continuity of $\theta$, the boundedness in $H^o_\rho$ of the compact $K'$ and the uniform convergence to zero over $K'$ of $\omega_\rho(y,\delta)$ as $\delta$ goes to zero. This last property is again an application of Ascoli's theorem or of Dini's theorem.


Convergence of $SDI(n,\rho)$. Recall that $\tau_n(t) = \max\{i\le n;\ V_i^2\le tV_n^2\}$, $t\in[0,1]$, and $J_n = \log_2(V_n^2/X_{n:n}^2)$, where $X_{n:n} := \max_{1\le k\le n}|X_k|$. Note also that $SDI(n,\rho)$ may be recast as
$$SDI(n,\rho) = \max_{1\le j\le J_n}\frac{1}{\rho(2^{-j})}\max_{r\in D_j}|\lambda_r(S\circ\tau_n)|.$$
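For illustration only (this code is not from the paper), the recast form above can be evaluated directly: build the self-normalized times $v_k = V_k^2/V_n^2$, the index $\tau_n$, and scan the dyadic levels $j\le J_n$; the modulus $\rho(h) = h^{0.4}$ is an arbitrary admissible choice.

```python
import numpy as np

def SDI(x, alpha=0.4):
    """Adaptive self-normalized dyadic-increment statistic (illustrative sketch)."""
    n = len(x)
    rho = lambda h: h**alpha                  # hypothetical modulus rho(h) = h^alpha
    S = np.concatenate(([0.0], np.cumsum(x)))
    V2 = np.concatenate(([0.0], np.cumsum(x**2)))
    v = V2/V2[-1]                             # v_k = V_k^2 / V_n^2 (non-decreasing)
    tau = lambda t: np.searchsorted(v, t, side='right') - 1   # max{i : v_i <= t}
    Jn = int(np.log2(V2[-1]/np.max(x**2)))    # J_n = log_2(V_n^2 / X_{n:n}^2)
    best = 0.0
    for j in range(1, Jn + 1):
        for k in range(1, 2**j, 2):           # r in D_j: odd multiples of 2^-j
            r = k*2.0**(-j)
            rm, rp = r - 2.0**(-j), r + 2.0**(-j)
            lam = S[tau(r)] - 0.5*(S[tau(rm)] + S[tau(rp)])   # lambda_r(S o tau_n)
            best = max(best, abs(lam)/rho(2.0**(-j)))
    return best

rng = np.random.default_rng(1)
x0 = rng.normal(size=512)
x1 = x0.copy()
x1[200:264] += 3.0                            # epidemic in the mean
print(SDI(x0)/np.sqrt((x0**2).sum()), SDI(x1)/np.sqrt((x1**2).sum()))
```

On the deterministic alternating sample $(1,1,-1,-1,1,1,-1,-1)$, only the level-2 dyadic increments are non-zero and the statistic equals $2/\rho(1/4)$, which gives a quick hand-checkable test of the implementation.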

By O'Brien (1980), $X_1$ is in the domain of attraction of the normal distribution if and only if
$$V_n^{-1}\max_{1\le k\le n}|X_k| = o_P(1). \tag{48}$$
The random polygonal line $\xi_n$ defined by (23) may be written as
$$\xi_n(t) = S(\tau_n(t)) + \frac{t - V_{\tau_n(t)}^2 V_n^{-2}}{X_{\tau_n(t)+1}^2 V_n^{-2}}\,X_{\tau_n(t)+1} =: S(\tau_n(t)) + \psi_n(t).$$
Noting that $|\psi_n(t)|\le X_{n:n}$, we have $|\lambda_r(\psi_n)|\le 2X_{n:n}$ for every dyadic $r$. This provides us the representation
$$V_n^{-1}SDI(n,\rho) = \max_{1\le j\le J_n}\frac{1}{\rho(2^{-j})}\max_{r\in D_j}\Bigl|\lambda_r\Bigl(\frac{\xi_n}{V_n}\Bigr)\Bigr| + Z'_n,$$
where
$$|Z'_n| \le \frac{2X_{n:n}}{V_n\,\rho(X_{n:n}^2 V_n^{-2})} = \frac{2}{\phi(V_n^2 X_{n:n}^{-2})},$$
with $\phi(t) = t^{1/2}\rho(1/t)$. Recalling that for $\rho\in\mathcal{R}$, $\lim_{t\to\infty}\phi(t) = \infty$, we see from (48) that $Z'_n = o_P(1)$. Introducing again the functionals $q_r(x) := |\lambda_r(x)|/\rho(r - r^-)$, $g_n = \max_{1\le j\le\log_2 n}\max_{r\in D_j} q_r$ and $g(x) := \sup_{j\ge 1}\max_{r\in D_j} q_r(x)$, which satisfy the hypotheses of Lemma 14 with the same constant $C = 1$, we have
$$V_n^{-1}SDI(n,\rho) = g_{J_n}(V_n^{-1}\xi_n) + Z'_n.$$
By (48), $J_n$ goes to infinity in probability, so an obvious extension of Lemma 13 leads to the representation
$$V_n^{-1}SDI(n,\rho) = g(V_n^{-1}\xi_n) + o_P(1)$$
and the convergence of $V_n^{-1}SDI(n,\rho)$ to $DI(\rho)$ follows from Theorem 12 by continuous mapping.

3.5. Proof of Theorem 6

Divergence of $SUI(n,\rho)$. We shall exploit the lower bound
$$SUI(n,\rho) \ge \frac{|S(I_n) - S(n)(v_{m^*} - v_{k^*})|}{\theta(v_{m^*} - v_{k^*})}. \tag{49}$$


Writing again $X'_i := X_i - E(X_i)$, we get
$$S(I_n) - S(n)(v_{m^*} - v_{k^*}) = \frac{V^2(I_n^c)}{V_n^2}\,l^*\mu_1 + R_n, \tag{50}$$
where
$$R_n := \frac{V^2(I_n^c)}{V_n^2}\sum_{i\in I_n} X'_i - \frac{V^2(I_n)}{V_n^2}\sum_{i\in I_n^c} X'_i.$$
From (15) it follows that
$$\frac{V_n^2}{n} = cb_1 + (1-c)b_0 + o_P(1), \tag{51}$$
so by (14) and (15) we easily obtain
$$n^{-1/2}R_n = O_P(h_n^{1/2}). \tag{52}$$
Rewriting the principal term in (50) as
$$\frac{V^2(I_n^c)}{V_n^2}\,l^*\mu_1 = \frac{(n-l^*)^{-1}V^2(I_n^c)}{n^{-1}V_n^2}\,\frac{l^*}{n}(n-l^*)\,\mu_1$$
leads by (15) to the equivalence in probability
$$n^{-1/2}\,\frac{V^2(I_n^c)}{V_n^2}\,l^*\mu_1 \overset{P}{\sim} \frac{b_1\mu_1}{cb_1 + (1-c)b_0}\,n^{1/2}h_n. \tag{53}$$
Finally, we also have from (14)
$$\frac{V^2(I_n)}{V_n^2}\,\frac{V^2(I_n^c)}{V_n^2} \overset{P}{\sim} \frac{b_0 b_1}{(cb_1 + (1-c)b_0)^2}\,h_n,$$
from which (recalling that $\rho(h) = h^\alpha L(1/h)$ with $L$ slowly varying) we deduce that, for some constant $d > 0$,
$$\theta(v_{m^*} - v_{k^*}) \le d\,\rho(h_n)(1 + o_P(1)). \tag{54}$$
Going back to (49) with the estimates (52), (53) and (54) provides the lower bound
$$n^{-1/2}SUI(n,\rho) \ge C\,\frac{n^{1/2}h_n}{\rho(h_n)} - o_P(1) - O_P\Bigl(\frac{h_n^{1/2}}{\rho(h_n)}\Bigr), \tag{55}$$
with a positive constant $C$ depending on $\mu_1$, $b_0$, $b_1$, $c$ and $\rho$. As $\rho\in\mathcal{R}$, $h_n^{1/2} = o(\rho(h_n))$ when $h_n$ goes to zero, so (55) gives the divergence of $n^{-1/2}SUI(n,\rho)$, taking Hypothesis (16) into account. Due to (51), the divergence of $V_n^{-1}SUI(n,\rho)$ follows.


3.6. Proof of Theorem 7

Divergence of $SDI(n,\rho)$. For $A\subset\{1,\dots,n\}$ and $1\le k\le n$, define
$$S'(A) := \sum_{i\in A} X'_i,\quad S'(0) := 0,\quad S'(k) := S'(\{1,\dots,k\})$$
and
$$V'^2(A) := \sum_{i\in A} X_i'^2,\quad V_k'^2 := V'^2(\{1,\dots,k\}),\quad v'_0 := 0,\quad v'_k := \frac{V_k'^2}{V_n'^2}.$$
Denote by $\xi'_n$ the polygonal line with vertices $(v'_k, S'(k))$, $0\le k\le n$.

As a preliminary step, we look for some control on the ratios $V_n/V'_n$ and $(v_{m^*} - v_{k^*})/(v'_{m^*} - v'_{k^*})$. Clearly
$$\frac{V_n^2}{V_n'^2} = 1 + \frac{l^*\mu_1^2}{V_n'^2} + \frac{2\mu_1 S'(I_n)}{V_n'^2}.$$
Under the assumptions made on the sequence $(X'_i)$, $V_n'^{-1}S'(I_n) = O_P(1)$ and $n^{-1}V_n'^2$ converges in probability to $\sigma^2 := E(X_1'^2)$, so
$$1 \le \liminf_{n\to+\infty}\frac{V_n^2}{V_n'^2} \le \limsup_{n\to+\infty}\frac{V_n^2}{V_n'^2} \le 1 + \frac{\mu_1^2}{\sigma^2}\quad\text{in probability}. \tag{56}$$
Similarly, we have
$$\frac{v_{m^*} - v_{k^*}}{v'_{m^*} - v'_{k^*}} = \frac{V_n'^2\,(V'^2(I_n) + l^*\mu_1^2 + 2\mu_1 S'(I_n))}{V_n^2\,V'^2(I_n)} = \frac{V_n'^2}{V_n^2}\Bigl\{1 + \frac{l^*\mu_1^2}{V'^2(I_n)}\Bigl(1 + \frac{2S'(I_n)}{\mu_1 l^*}\Bigr)\Bigr\}.$$
Applying the weak law of large numbers to $V'^2(I_n)/l^*$ and $S'(I_n)/l^*$, we see that, in probability,
$$\Bigl(1 + \frac{\mu_1^2}{\sigma^2}\Bigr)^{-1} \le \liminf_{n\to+\infty}\frac{v_{m^*} - v_{k^*}}{v'_{m^*} - v'_{k^*}} \le \limsup_{n\to+\infty}\frac{v_{m^*} - v_{k^*}}{v'_{m^*} - v'_{k^*}} \le 1 + \frac{\mu_1^2}{\sigma^2}. \tag{57}$$
The statistic $SDI(n,\rho)$ may be expressed as
$$SDI(n,\rho) = \max_{1\le j\le J_n}\frac{1}{2\rho(2^{-j})}\max_{r\in D_j}\Bigl|\sum_{r^- < v_k\le r} X_k - \sum_{r < v_k\le r^+} X_k\Bigr|.$$
Define the random index $J$ by $2^{-J} \ge v_{m^*} - v_{k^*} > 2^{-J-1}$ and consider first the configuration where $r^- \le v_{k^*} < v_{m^*} \le r$ with $r\in D_J$. Then
$$V_n^{-1}SDI(n,\rho) \ge \frac{l^*|\mu_1|}{2V_n\,\rho(2^{-J})} - \frac{1}{2}R_n,$$


where
$$R_n := \frac{1}{V_n\,\rho(2^{-J})}\Bigl|\sum_{r^- < v_k\le r} X'_k - \sum_{r < v_k\le r^+} X'_k\Bigr|.$$
Due to (56), (57), the definition of $J$ and the form of $\rho$, there are some positive random variables $K_i$, not depending on $n$, such that
$$\frac{l^*|\mu_1|}{V_n\,\rho(2^{-J})} \ge K_1\,\frac{l^*}{V_n\,\rho(v_{m^*} - v_{k^*})} \ge K_2\,\frac{(l^*/V_n'^2)\,V'_n}{\rho(V'^2(I_n)/V_n'^2)} \ge K_3\,\frac{n^{1/2}h_n}{\rho(h_n)}$$
and this lower bound goes to infinity in probability by Condition (16). So it remains to check that $R_n = O_P(1)$. To this end, we split $R_n$ into four blocks $R_{n,1},\dots,R_{n,4}$, indexed respectively by

$r^- < v_k$ with $k\le k^*$; $k\in I_n$; $v_k\le r$ with $k > m^*$; $r < v_k\le r^+$. Recall that $\tau_n(t) := \max\{i\le n;\ v_i\le t\}$. To bound $R_{n,1}$, note that if $\tau_n(r^-) \ge k^*$, then $R_{n,1} = 0$. Assuming now that $\tau_n(r^-) \le k^*$, we get
$$R_{n,1} = \frac{V'_n}{V_n}\,\frac{|S'((\tau_n(r^-), k^*])|}{V'_n\,\rho(2^{-J})} \le \frac{V'_n}{V_n}\,\Bigl\|\frac{\xi'_n}{V'_n}\Bigr\|_\rho\,\frac{\rho(v'_{k^*} - v'_{\tau_n(r^-)})}{\rho(2^{-J})} \le \frac{V'_n}{V_n}\,\Bigl\|\frac{\xi'_n}{V'_n}\Bigr\|_\rho\,\frac{\rho(v'_{k^*} - v'_{\tau_n(r^-)})}{\rho(v_{k^*} - v_{\tau_n(r^-)})}. \tag{58}$$
In the bound (58), $V'_n/V_n = O_P(1)$ by (56) and $\|V_n'^{-1}\xi'_n\|_\rho = O_P(1)$ by Theorem 12. To bound the last fraction, we observe that
$$\frac{v'_{k^*} - v'_{\tau_n(r^-)}}{v_{k^*} - v_{\tau_n(r^-)}} = \frac{V_n'^{-2}\,V'^2((\tau_n(r^-), k^*])}{V_n^{-2}\,V^2((\tau_n(r^-), k^*])} = \frac{V_n^2}{V_n'^2},$$
since $X'_i = X_i$ for $i\in(\tau_n(r^-), k^*]$. As $\rho$ is of the form $\rho(h) = h^\alpha L(1/h)$ with $L$ slowly varying, it easily follows from (56) that
$$\frac{\rho(v'_{k^*} - v'_{\tau_n(r^-)})}{\rho(v_{k^*} - v_{\tau_n(r^-)})} = O_P(1).$$
Similarly, the control of $R_{n,2}$ reduces to establishing the stochastic boundedness of the ratio $\rho(v'_{m^*} - v'_{k^*})/\rho(v_{m^*} - v_{k^*})$, which in turn follows from (57).

The control of $R_{n,3}$ and $R_{n,4}$ is similar to that of $R_{n,1}$. Finally, the proof is completed by a discussion of the configurations as in the proof of Theorem 4.


3.7. Proof of Theorem 8

As in the proof of the convergence of $DI(n,\rho)$ (Theorem 3), it is enough to consider the case $\mu_0 = 0$. We shall assume $\sigma^2 = 1$. Denoting
$$\alpha_{ni}(j,t) = \frac{1}{2\rho(2^{-j})\,n^{1/2}}\bigl[\mathbf 1\{nt^- < i\le nt\} - \mathbf 1\{nt < i\le nt^+\}\bigr],$$
where $j = 1,\dots,m_n$, $t\in D_j$ and $i = 1,\dots,n$ (the factor $\tfrac12$ coming from the dyadic increment $\lambda_r$ is absorbed into $\alpha_{ni}$), we easily find the representation
$$n^{-1/2}DI(n,\rho,m_n) = \max_{1\le j\le m_n}\max_{t\in D_j}\Bigl|\sum_{i=1}^n X_i\,\alpha_{ni}(j,t)\Bigr|.$$
Let $(\eta_i,\ i = 1,\dots,n)$ be a collection of independent standard normal random variables and consider the Gaussian counterpart
$$V_{n,\rho}(m_n) = \max_{1\le j\le m_n}\max_{t\in D_j}\Bigl|\sum_{i=1}^n \eta_i\,\alpha_{ni}(j,t)\Bigr|.$$
Set $k_n = 1 + 2 + \dots + 2^{m_n}$. For $x = (x_1,\dots,x_{k_n})\in\mathbb{R}^{k_n}$, write
$$\|x\|_\infty := \max_{1\le l\le k_n}|x_l|.$$
Consider, for each $\varepsilon > 0$ and $r\ge\varepsilon$, a function $f_{r,\varepsilon}:\mathbb{R}^{k_n}\to\mathbb{R}$ which possesses the following properties:

(i) for each $x\in\mathbb{R}^{k_n}$,
$$\mathbf 1\{\|x\|_\infty\le r-\varepsilon\} \le f_{r,\varepsilon}(x) \le \mathbf 1\{\|x\|_\infty\le r+\varepsilon\}; \tag{59}$$

(ii) the function $f_{r,\varepsilon}$ is infinitely differentiable and for each $\ell\ge 1$ there exists an absolute constant $C_\ell > 0$ such that
$$\|f^{(\ell)}_{r,\varepsilon}\| := \sup\{|f^{(\ell)}_{r,\varepsilon}(x)(h)^\ell|;\ x,h\in\mathbb{R}^{k_n},\ \|h\|_\infty\le 1\} \le C_\ell\,\varepsilon^{-\ell}\log^{\ell-1}k_n. \tag{60}$$

Here $f^{(\ell)}_{r,\varepsilon}$ denotes the $\ell$th derivative of the function $f_{r,\varepsilon}$ and $f^{(\ell)}_{r,\varepsilon}(x)(h)^\ell$ the corresponding differential. Such functions are constructed in Bentkus (1984). Denoting, for $i = 1,\dots,n$, $X_{ni} = (X_i\alpha_{ni}(j,t),\ j = 1,\dots,m_n,\ t\in D_j)$ and $Y_{ni} = (\eta_i\alpha_{ni}(j,t),\ j = 1,\dots,m_n,\ t\in D_j)$, we have
$$n^{-1/2}DI(n,\rho,m_n) = \Bigl\|\sum_{i=1}^n X_{ni}\Bigr\|_\infty,\qquad V_{n,\rho}(m_n) = \Bigl\|\sum_{i=1}^n Y_{ni}\Bigr\|_\infty.$$
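The smoothing functions $f_{r,\varepsilon}$ themselves are taken from Bentkus (1984) and are not reproduced here. Purely as a hedged illustration of how a smooth function can be squeezed between the two indicators in (59), one may compose a log-sum-exp smooth maximum (which overshoots $\|x\|_\infty$ by at most $\log(2k_n)/\beta$) with a $C^\infty$ step; note how a $\log k_n$ factor, reminiscent of the derivative bound (60), enters through the choice of $\beta$. This is our own sketch, not the construction used in the proof.

```python
import numpy as np

def smooth_step(u):
    """C-infinity step: equals 1 for u <= 0 and 0 for u >= 1."""
    g = lambda v: np.where(v > 0, np.exp(-1.0/np.maximum(v, 1e-300)), 0.0)
    return g(1.0 - u)/(g(u) + g(1.0 - u))

def f_r_eps(x, r, eps):
    """Smooth function with 1{||x|| <= r-eps} <= f(x) <= 1{||x|| <= r+eps}."""
    x = np.asarray(x, dtype=float)
    k = x.size
    beta = 2.0*np.log(2.0*k)/eps       # makes the log-sum-exp overshoot <= eps/2
    # smooth surrogate for ||x||_inf, lying in [||x||_inf, ||x||_inf + eps/2]
    m = np.logaddexp.reduce(np.concatenate([beta*x, -beta*x]))/beta
    return smooth_step((m - (r - eps/2.0))/(eps/2.0))

x = np.array([0.1, -0.2, 0.3])         # ||x||_inf = 0.3
print(f_r_eps(x, r=0.5, eps=0.1), f_r_eps(x, r=0.25, eps=0.04))
```

If $\|x\|_\infty\le r-\varepsilon$ the argument of the step is $\le 0$ and $f = 1$ exactly; if $f(x) > 0$ then the smooth maximum is below $r$, hence $\|x\|_\infty < r \le r+\varepsilon$, giving the squeeze (59).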

Applying property (59) to the functions $f_\varepsilon = f_{r+\varepsilon,\varepsilon}$ and $f_\varepsilon = f_{r-\varepsilon,\varepsilon}$, we derive, for $r\ge\varepsilon$,
$$\Delta_n(r) := \bigl|P(n^{-1/2}DI(n,\rho,m_n)\le r) - P(V_{n,\rho}(m_n)\le r)\bigr| \le \max\Bigl|Ef_\varepsilon\Bigl(\sum_{i=1}^n X_{ni}\Bigr) - Ef_\varepsilon\Bigl(\sum_{i=1}^n Y_{ni}\Bigr)\Bigr| + P(r-\varepsilon\le V_{n,\rho}(m_n)\le r+\varepsilon),$$
where the maximum extends over the two functions $f_\varepsilon = f_{r-\varepsilon,\varepsilon}$ and $f_\varepsilon = f_{r+\varepsilon,\varepsilon}$.


Writing $Z_{nk} = \sum_{i=1}^{k-1}X_{ni} + \sum_{i=k+1}^n Y_{ni}$ and noting that $Z_{nk} + X_{nk} = Z_{n,k+1} + Y_{n,k+1}$, we have the telescoping identity
$$I := Ef_\varepsilon\Bigl(\sum_{i=1}^n X_{ni}\Bigr) - Ef_\varepsilon\Bigl(\sum_{i=1}^n Y_{ni}\Bigr) = \sum_{k=1}^n\bigl[Ef_\varepsilon(Z_{nk} + X_{nk}) - Ef_\varepsilon(Z_{nk} + Y_{nk})\bigr].$$
Next we shall use Taylor's expansion $f_\varepsilon(x+h) = f_\varepsilon(x) + f'_\varepsilon(x)h + 2^{-1}f''_\varepsilon(x)h^2 + R$ with interpolated remainder $|R| \le 2^{1-\delta}\|f''_\varepsilon\|^{1-\delta}\|f'''_\varepsilon\|^{\delta}\|h\|_\infty^{2+\delta}$, where $0 < \delta\le 1$. Noting that for each $k = 1,\dots,n$
$$Ef'_\varepsilon(Z_{nk})(X_{nk}) = Ef'_\varepsilon(Z_{nk})(Y_{nk}) = 0$$
and
$$Ef''_\varepsilon(Z_{nk})(X_{nk})^2 = Ef''_\varepsilon(Z_{nk})(Y_{nk})^2,$$
we obtain
$$|I| \le 2^{1-\delta}\sum_{i=1}^n\bigl[E\|X_{ni}\|_\infty^{2+\delta} + E\|Y_{ni}\|_\infty^{2+\delta}\bigr]\,\|f''_\varepsilon\|^{1-\delta}\|f'''_\varepsilon\|^{\delta}.$$
Clearly
$$E\|X_{ni}\|_\infty^{2+\delta} \le E|X_1|^{2+\delta}\,n^{-(2+\delta)/2}\,\rho^{-2-\delta}(2^{-m_n})$$
and
$$E\|Y_{ni}\|_\infty^{2+\delta} \le c_\delta\,n^{-(2+\delta)/2}\,\rho^{-2-\delta}(2^{-m_n}),$$
where $c_\delta$ depends on $\delta$ only. Finally, taking (60) into account, we obtain
$$|I| \le c_\delta\,\varepsilon^{-3}\,n^{-\delta/2}\,\rho^{-2-\delta}(2^{-m_n})\,m_n^{1+\delta}.$$
Therefore, under Condition (19), we have for each $r\ge\varepsilon$
$$\limsup_{n\to\infty}\Delta_n(r) \le \lim_{n\to\infty}P(r-\varepsilon\le V_{n,\rho}(m_n)\le r+\varepsilon).$$
If $r < \varepsilon$ we have evidently
$$\Delta_n(r) \le \Delta_n(\varepsilon) + 2P(V_{n,\rho}(m_n)\le\varepsilon),$$
hence
$$\limsup_{n\to\infty}\Delta_n(r) \le 3\lim_{n\to\infty}P(r-\varepsilon\le V_{n,\rho}(m_n)\le r+\varepsilon) \tag{61}$$
for all $r > 0$. Since for each $a > 0$
$$\lim_{n\to\infty}P(V_{n,\rho}(m_n)\le a) = F^{(1)}_\rho(a) = P(DI(\rho)\le a)$$
and the distribution function of $DI(\rho)$ is continuous, we complete the proof by letting $\varepsilon\to 0$ in (61).

The proof of Theorem 9 is similar to that of Theorem 8. The only difference lies in the definition of the random vectors $X_{ni}$ (the modification being evident) and in the dimension $k_n$, which is now $O(n^2)$.


3.8. Proof of Theorem 10

The result relies essentially on the representation of a standard Brownian motion as a series of triangular functions. For $r\in D_j$, $j\ge 1$, let $H_r$ be the $L^2[0,1]$-normalized Haar function, defined on $[0,1]$ by
$$H_r(t) = \begin{cases} +(r^+ - r^-)^{-1/2} = +2^{(j-1)/2} & \text{if } t\in(r^-, r],\\ -(r^+ - r^-)^{-1/2} = -2^{(j-1)/2} & \text{if } t\in(r, r^+],\\ 0 & \text{else}.\end{cases}$$
Define moreover $H_1(t) := \mathbf 1_{[0,1]}(t)$. For $r\in D_j$, $j\ge 1$, the triangular Faber–Schauder functions $F_r$ are defined as
$$F_r(t) := \begin{cases} (t - r^-)/(r - r^-) = 2^j(t - r^-) & \text{if } t\in(r^-, r],\\ (r^+ - t)/(r^+ - r) = 2^j(r^+ - t) & \text{if } t\in(r, r^+],\\ 0 & \text{else}.\end{cases}$$
In the special case $j = 0$, we set $F_1(t) := t$. The $F_r$'s are linked to the $H_r$'s in the general case $r\in D_j$, $j\ge 1$, by
$$F_r(t) = 2(r^+ - r^-)^{-1/2}\int_0^t H_r(s)\,\mathrm{d}s = 2^{(j+1)/2}\int_0^t H_r(s)\,\mathrm{d}s \tag{62}$$
and in the special case $r = 1$ by
$$F_1(t) = \int_0^t H_1(s)\,\mathrm{d}s.$$
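As a quick numerical sanity check (our illustration, not the paper's), the definitions of $H_r$ and $F_r$ and the primitive relation (62) can be verified on a fine grid:

```python
import numpy as np

def haar(r, j, t):
    """L2[0,1]-normalized Haar function H_r, r an odd multiple of 2^-j."""
    rm, rp = r - 2.0**-j, r + 2.0**-j
    out = np.zeros_like(t, dtype=float)
    out[(t > rm) & (t <= r)] = +2.0**((j - 1)/2)
    out[(t > r) & (t <= rp)] = -2.0**((j - 1)/2)
    return out

def faber(r, j, t):
    """Triangular Faber-Schauder function F_r (peak 1 at t = r)."""
    rm, rp = r - 2.0**-j, r + 2.0**-j
    return np.maximum(0.0, 2.0**j*np.minimum(t - rm, rp - t))

t = np.linspace(0.0, 1.0, 2**14 + 1)
j, r = 3, 5/8
H, F = haar(r, j, t), faber(r, j, t)
# primitive of H by the trapezoidal rule, then relation (62)
prim = np.concatenate(([0.0], np.cumsum((H[1:] + H[:-1])/2*np.diff(t))))
print(np.max(np.abs(2.0**((j + 1)/2)*prim - F)))
```

The discrepancy is of the order of the grid step, coming only from the quadrature at the two jump points of $H_r$.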

It is well known that any function $x\in C[0,1]$ such that $x(0) = 0$ may be expanded in the uniformly convergent series
$$x = \lambda_1(x)F_1 + \sum_{j=1}^{\infty}\sum_{r\in D_j}\lambda_r(x)F_r,$$
where the $\lambda_r(x)$'s are given by (2).

Let us recall the Lévy–Kampé de Fériet representation of the Brownian motion, which was generalized by Shepp (1966) to any orthonormal basis. Let $\{Y_1, Y_r;\ r\in D_j,\ j\ge 1\}$ be a collection of independent standard normal random variables. Then the series
$$W(t) := Y_1 t + \sum_{j=1}^{\infty}\sum_{r\in D_j} Y_r\int_0^t H_r(s)\,\mathrm{d}s,\quad t\in[0,1], \tag{63}$$
converges almost surely uniformly on $[0,1]$, and $W$ is a Brownian motion started at $0$. Removing the first term in (63) gives a Brownian bridge
$$B(t) := W(t) - tW(1) = \sum_{j=1}^{\infty}\sum_{r\in D_j} Y_r\int_0^t H_r(s)\,\mathrm{d}s,\quad t\in[0,1]. \tag{64}$$


Now, from (62) to (64), it is clear that the sequential Hölder norm of $B$ may be written as
$$\|B\|^{\mathrm{seq}}_\rho = \sup_{j\ge 1}\frac{1}{2^{1/2}\phi(2^j)}\max_{r\in D_j}|Y_r|. \tag{65}$$
Recalling that $DI(\rho)$ has the same distribution as $\|B\|^{\mathrm{seq}}_\rho$, the explicit formula for its distribution function is easily derived from (65) using standard techniques for maxima of independent identically distributed Gaussian variables.
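Formula (65) also gives a direct way to simulate the null distribution of $DI(\rho)$: sample the $Y_r$'s level by level and truncate the supremum at a finite level $J$. The sketch below is our illustration, with the arbitrary choices $\rho(h) = h^{0.4}$, truncation $J = 12$ and 2000 replications; it approximates upper quantiles of $\|B\|^{\mathrm{seq}}_\rho$.

```python
import numpy as np

rng = np.random.default_rng(42)
alpha = 0.4                                   # hypothetical modulus rho(h) = h^alpha
phi = lambda u: np.sqrt(u)*(1.0/u)**alpha     # phi(t) = t^(1/2) rho(1/t)

def seq_norm(J, rng):
    """One realization of the truncated series (65): sup over levels j <= J."""
    s = 0.0
    for j in range(1, J + 1):
        Y = rng.normal(size=2**(j - 1))       # one standard normal Y_r per r in D_j
        s = max(s, np.max(np.abs(Y))/(np.sqrt(2.0)*phi(2.0**j)))
    return s

sample = np.array([seq_norm(12, rng) for _ in range(2000)])
print(np.quantile(sample, [0.90, 0.95, 0.99]))
```

Since $\phi(2^j) = 2^{j(1/2-\alpha)}$ grows geometrically for $\alpha < 1/2$, the truncation error is small; such simulated quantiles could serve as critical values for the DI-type tests.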

References

Bentkus, V.Yu., 1984. Differentiable functions defined in the spaces c0 and Rk. Lithuanian Math. J. 23 (2), 146–154.

Bingham, N.H., Goldie, C.M., Teugels, J.L., 1987. Regular Variation. In: Encyclopaedia of Mathematics and its Applications. Cambridge University Press, Cambridge.

Brodsky, B.E., Darkhovsky, B.S., 1993. Nonparametric Methods in Change Point Problems. Kluwer Academic Publishers, Dordrecht.

Csörgő, M., Horváth, L., 1997. Limit Theorems in Change-Point Analysis. Wiley, New York.

Lamperti, J., 1962. On convergence of stochastic processes. Trans. Amer. Math. Soc. 104, 430–435.

Ledoux, M., Talagrand, M., 1991. Probability in Banach Spaces. Springer, Berlin, Heidelberg.

O'Brien, G.L., 1980. A limit theorem for sample maxima and heavy branches in Galton–Watson trees. J. Appl. Probab. 17, 539–545.

Petrov, V.V., 1975. Sums of Independent Random Variables. Springer, Berlin.

Račkauskas, A., Suquet, Ch., 2001a. Invariance principles for adaptive self-normalized partial sums processes. Stochastic Process. Appl. 95, 63–81.

Račkauskas, A., Suquet, Ch., 2001b. Necessary and sufficient condition for the Hölderian functional central limit theorem. Pub. IRMA Lille 57-I, preprint.

Semadeni, Z., 1982. Schauder bases in Banach spaces of continuous functions. In: Lecture Notes in Mathematics, Vol. 918. Springer, Berlin, Heidelberg, New York.

Shepp, L.A., 1966. Radon–Nikodym derivatives of Gaussian measures. Ann. Math. Statist. 37, 321–354.

Yao, Q., 1993. Tests for change-points with epidemic alternatives. Biometrika 80, 179–191.