

Result. Math. 31 (1997) 189-194 0378-6218/97/020189-6 $ 1.50+0.20/0 © Birkhäuser Verlag, Basel, 1997 | Results in Mathematics

A Maximal Inequality and a Functional Central Limit Theorem for set-indexed empirical processes

Klaus Ziegler

Abstract

For the tail probabilities of a general set-indexed empirical process in an arbitrary sample space a maximal inequality is derived. In the case that the class of sets by which the process is indexed possesses a total ordering, the application of our inequality yields an elementary proof for a functional central limit theorem without involving such advanced techniques as symmetrization, stratification, chaining or Gaussian domination. Analogously, the inequality leads to a weak uniform law of large numbers (including convergence rate).

Key words: Maximal inequality, set-indexed empirical process, functional central limit theorem, uniform law of large numbers

AMS 1991 subject classification: 60E15, 60F17

1. Introduction and Results

Let $(X,\mathcal{X})$ be a measurable space and $(\eta_i)_{i\in\mathbb{N}}$ be a sequence of i.i.d. random elements in $(X,\mathcal{X})$ with common law $\mathcal{L}\{\eta_1\} = \nu$. Let further $\mathcal{C}\subset\mathcal{X}$ be a class of measurable subsets of $X$. Then the empirical process $\beta_n$ with index set $\mathcal{C}$ and sample size $n$ is defined by

$$\beta_n(C) := n^{-1/2}\sum_{i=1}^{n}\bigl(1_C(\eta_i) - \nu(C)\bigr), \qquad C\in\mathcal{C}.$$

Note that $\beta_n(C) = n^{1/2}\,(\nu_n(C) - \nu(C))$, where $\nu_n(C) := n^{-1}\sum_{i=1}^{n} 1_C(\eta_i)$ denotes the empirical measure based on the observations $\eta_1,\dots,\eta_n$. $\nu_n$ is a nonparametric estimator for the unknown underlying probability measure $\nu$. Since $\nu(C)$ is the expectation of $1_C(\eta_i)$, $\nu_n(C)$ is an unbiased, strongly consistent and asymptotically normal estimator for $\nu(C)$ for every fixed $C\in\mathcal{C}$.

For a general theory of empirical processes (which are of great importance in statistics, econometrics and ecology) see e.g. Gaenssler (1983), Dudley (1984), Giné and Zinn (1986) or the excellent recent monograph by van der Vaart and Wellner (1996). In the present paper we establish a maximal inequality for the tails of $\beta_n$ where the maximum runs only over a finite number of sets. Nevertheless, the inequality will enable us to verify asymptotic equicontinuity in certain situations. In the asymptotic equicontinuity condition (AEC) to be dealt with below, a supremum running over "small sets" (w.r.t. the measure $\nu$) will emerge. Hence, we consider the following assumption:

(1) Let $B_k\in\mathcal{X}$, $k=1,\dots,m$ ($m\in\mathbb{N}$ being a fixed integer), be measurable subsets of $X$ with $B_1\subset\dots\subset B_m$ and $\nu(B_k)\le\delta<1/4$ for every $k=1,\dots,m$. Let further $B\in\mathcal{X}$ with $\nu(B)=\delta+\alpha$ ($0<\alpha\le\delta$) and $B\supset B_m$.
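For orientation, assumption (1) can be instantiated in the simplest setting. The following is a minimal sketch, assuming $\nu$ is the uniform distribution on $[0,1]$, $B_k = [0,t_k]$ and $B = [0,b]$; the function names and the sample values are hypothetical and not part of the paper.

```python
import math

def beta_n(sample, t):
    """beta_n([0, t]) = n^(-1/2) * (#{i: eta_i <= t} - n*t), assuming nu is uniform on [0, 1]."""
    n = len(sample)
    count = sum(1 for x in sample if x <= t)
    return (count - n * t) / math.sqrt(n)

def check_assumption_1(endpoints, delta, alpha, b):
    """Check assumption (1) for B_k = [0, t_k] (t_k taken from `endpoints`) and B = [0, b]."""
    nested = all(s <= t for s, t in zip(endpoints, endpoints[1:]))    # B_1 c ... c B_m
    small = all(t <= delta for t in endpoints) and delta < 0.25       # nu(B_k) <= delta < 1/4
    margin = 0 < alpha <= delta and abs(b - (delta + alpha)) < 1e-12  # nu(B) = delta + alpha
    return nested and small and margin and b >= endpoints[-1]         # B contains B_m

def max_abs_beta(sample, endpoints):
    """max over k of |beta_n(B_k)|, the quantity controlled by Inequality 1.1."""
    return max(abs(beta_n(sample, t)) for t in endpoints)

print(check_assumption_1([0.05, 0.1, 0.2], delta=0.2, alpha=0.1, b=0.3))
print(max_abs_beta([0.1, 0.2, 0.3, 0.4], [0.25, 0.5]))
```

In this setting $\nu(B_k) = t_k$, so the conditions in (1) reduce to inequalities between the endpoints.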

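1.1 Inequality. Let (1) be fulfilled. Then for $n \ge 9216\,\alpha^{-3}$ and for every $\varepsilon>0$ it holds that

$$\mathbb{P}\Bigl(\max_{k=1,\dots,m}|\beta_n(B_k)| > \varepsilon\Bigr) \le \mathbb{P}\Bigl(\sum_{i=1}^{n} 1_B(\eta_i) > n/2\Bigr) + 4\,\mathbb{P}\bigl(|\beta_n(B)| > \varepsilon/2\bigr).$$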

Since the first term on the right hand side tends to zero as $n\to\infty$ by the weak law of large numbers (note that $E(1_B(\eta_i)) = \delta+\alpha < 1/2$), inequality 1.1 tells us that the asymptotic tail behaviour of the maximum is governed, up to the constants 4 and 1/2, by that of the single r.v. $\beta_n(B)$.

Note that neither $m$ nor $\delta$ nor $\alpha$ appears explicitly on the right hand side. Note further that inequality 1.1 does not change if the maximum runs over the sets $B_1,\dots,B_m$ and, additionally, $B$. A proof of inequality 1.1 will be presented in section 2.
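To get a feeling for the sample size condition and for the first term on the right hand side, the following minimal sketch computes the threshold $9216\,\alpha^{-3}$ exactly and bounds $\mathbb{P}(\sum_i 1_B(\eta_i) > n/2)$ by Hoeffding's inequality. Hoeffding's inequality is swapped in here for illustration only; the paper itself just invokes the weak law of large numbers. The function names are hypothetical.

```python
import math
from fractions import Fraction

def min_sample_size(alpha):
    """Smallest integer n with n >= 9216 * alpha**(-3); pass alpha as a Fraction for exact arithmetic."""
    return math.ceil(Fraction(9216) / Fraction(alpha) ** 3)

def hoeffding_first_term(n, delta, alpha):
    """Hoeffding bound exp(-2 n (1/2 - p)^2) for P(sum_i 1_B(eta_i) > n/2), with p = nu(B) = delta + alpha."""
    p = float(delta + alpha)
    if not p < 0.5:
        raise ValueError("assumption (1) implies delta + alpha < 1/2")
    return math.exp(-2.0 * n * (0.5 - p) ** 2)

alpha = delta = Fraction(1, 5)
n = min_sample_size(alpha)
print(n, hoeffding_first_term(n, delta, alpha))
```

The threshold grows like $\alpha^{-3}$, so for small $\alpha$ the inequality is an asymptotic tool rather than a small-sample bound.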

Next we consider convergence in law (in the sense of Hoffmann-Jørgensen) of the process $(\beta_n(C))_{C\in\mathcal{C}}$. For a definition of this extension of weak convergence avoiding measurability problems see e.g. Hoffmann-Jørgensen (1984) or Gaenssler (1992). For a deeper understanding of this subject the reader may consult chapter 1 in van der Vaart and Wellner (1996). From Theorem 3.10 in Gaenssler (1992) (see also Giné and Zinn (1986) or Pollard (1990), among others) we learn that the crucial point in proving a functional central limit theorem (FCLT) for the process $\beta_n$ is the verification of the asymptotic equicontinuity condition

(AEC) $$\lim_{\delta\downarrow 0}\,\limsup_{n\to\infty}\,\mathbb{P}^*\Bigl(\sup_{d(C,D)\le\delta}|\beta_n(C) - \beta_n(D)| > \varepsilon\Bigr) = 0 \quad\text{for each } \varepsilon>0,$$

where $d$ is a pseudometric on $\mathcal{C}$ (as is usual in this context, we take $d = d_\nu$ with $d_\nu(C,D) := \nu(C\,\Delta\,D)$) and $\mathbb{P}^*$ denotes outer probability.
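The supremum in the AEC can be approximated numerically in a simple case. The sketch below assumes $\nu$ uniform on $[0,1]$ and sets $[0,t]$, so that $d_\nu([0,s],[0,t]) = |t-s|$, and it replaces the supremum by a maximum over a finite grid. The grid restriction is an assumption of this illustration and yields only a lower bound for the true supremum; all names are hypothetical.

```python
import math

def beta_on_grid(sample, grid_size):
    """Values beta_n([0, j/M]) for j = 0..M, assuming nu is uniform on [0, 1]."""
    n = len(sample)
    values = []
    for j in range(grid_size + 1):
        t = j / grid_size
        count = sum(1 for x in sample if x <= t)
        values.append(math.sqrt(n) * (count / n - t))
    return values

def grid_modulus(sample, delta, grid_size):
    """max |beta_n([0,t]) - beta_n([0,s])| over grid points s, t with |t - s| <= delta."""
    values = beta_on_grid(sample, grid_size)
    window = int(delta * grid_size + 1e-9)  # number of grid steps within distance delta
    best = 0.0
    for i in range(len(values)):
        for j in range(i + 1, min(i + window, grid_size) + 1):
            best = max(best, abs(values[j] - values[i]))
    return best

print(grid_modulus([0.5], delta=0.1, grid_size=10))
```

With a single observation at 0.5 the largest increment occurs across the jump of the empirical measure, between the grid points 0.4 and 0.5.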

Now we assume that $\mathcal{C}$ is totally ordered w.r.t. inclusion. The simplest example for this is $\mathcal{C} = \{[0,t],\ t\in[0,1]\}$ which yields, together with $\nu = \lambda|_{[0,1]}$ (the uniform distribution on $[0,1]$), the well-known uniform empirical process on the unit interval, viewed as a set-indexed process; see also Remark 1.3 (i) below. A natural two-dimensional example arises if we observe forest disease spreading over a continuously increasing area. For technical convenience and lucidity of the proof of Theorem 1.2 below, we also assume:

(2) For every fixed $1/4 > \delta > 0$ there are $\emptyset =: D_0\subset D_1\subset\dots\subset D_{[1/\delta]}\subset D_{[1/\delta]+1} := X$ with $D_k\in\mathcal{C}$ and $\nu(D_k) = k\delta$ for every $k=0,\dots,[1/\delta]$.

Note that total boundedness of $(\mathcal{C},d_\nu)$ also trivially follows from (2).
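In the example of the unit interval, the chain required in (2) can be written down explicitly. The following minimal sketch assumes $\nu$ uniform on $[0,1]$ and $D_k = [0,k\delta]$; here $D_0 = [0,0]$ is a $\nu$-null set standing in for the empty set. The function name is hypothetical.

```python
from fractions import Fraction

def chain_endpoints(delta):
    """Right endpoints t_k of D_k = [0, t_k], k = 0..[1/delta]+1, for nu uniform on [0, 1]."""
    delta = Fraction(delta)
    if not 0 < delta < Fraction(1, 4):
        raise ValueError("assumption (2) is stated for 0 < delta < 1/4")
    top = int(1 / delta)                       # [1/delta], the integer part of 1/delta
    endpoints = [k * delta for k in range(top + 1)]
    endpoints.append(Fraction(1))              # D_{[1/delta]+1} := X = [0, 1]
    return endpoints

print(chain_endpoints(Fraction(2, 9)))
```

Since $\nu([0,t]) = t$, the endpoints are exactly the values $\nu(D_k) = k\delta$, and the sets are nested because the endpoints are nondecreasing.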

1.2 Theorem. Assume that $\mathcal{C}$ is countable (but see Remark 1.3 (ii) below) and possesses a total ordering. Assume further that (2) holds. Then $\beta_n$ converges weakly in $l^\infty(\mathcal{C}) := \{\text{all bounded functions } \mathcal{C}\to\mathbb{R}\}$ in the sense of Hoffmann-Jørgensen to a mean zero Gaussian process $G$ with covariance function $\operatorname{cov}(G(C),G(D)) = \nu(C\cap D) - \nu(C)\,\nu(D)$, $C,D\in\mathcal{C}$.

Furthermore, the process $n^{-1/2}\beta_n = \nu_n(C) - \nu(C)$ fulfills a weak uniform law of large numbers with convergence rate $n^{-1}$; more precisely, it holds that

$$\mathbb{P}\Bigl(\sup_{C\in\mathcal{C}}|n^{-1/2}\beta_n(C)| > \varepsilon\Bigr) \le A\,\varepsilon^{-2}\,n^{-1}$$

for some constant $A$ and $n$ large enough.
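Both objects in Theorem 1.2 are easy to evaluate in the unit-interval example. The sketch below assumes $\nu$ uniform on $[0,1]$ and $\mathcal{C} = \{[0,t]\}$, where the covariance becomes $\min(s,t) - st$ and the supremum in the uniform law of large numbers can be computed from the order statistics. The function names are hypothetical.

```python
def limit_covariance(s, t):
    """cov(G([0,s]), G([0,t])) = nu(C n D) - nu(C) nu(D) = min(s, t) - s*t for nu uniform on [0, 1]."""
    return min(s, t) - s * t

def sup_deviation(sample):
    """sup over t in [0, 1] of |nu_n([0,t]) - t|, evaluated at the jumps of the empirical measure."""
    xs = sorted(sample)
    n = len(xs)
    # At the i-th order statistic (0-based) nu_n jumps from i/n to (i+1)/n.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

print(limit_covariance(0.25, 0.5))
print(sup_deviation([0.25, 0.75]))
```

The quantity returned by `sup_deviation` is $\sup_{C\in\mathcal{C}}|n^{-1/2}\beta_n(C)|$ for this class, i.e. the random variable whose tail is bounded by $A\varepsilon^{-2}n^{-1}$ in the theorem.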

1.3 Remarks. (i) Theorem 1.2 can also be deduced from general theorems based on bracketing methods (see e.g. Ossiander (1987), Gaenssler and Ziegler (1994)) or random entropy conditions (see e.g. Giné and Zinn (1986), Alexander (1987), Pollard (1990) and Ziegler (1995)). Our method, however, is far more direct in the situation considered here, since it does not rely on covering or bracketing numbers, symmetrization and chaining techniques, subgaussian inequalities or other involved tools. In particular, it yields a new, elementary and lucid proof for the classical Donsker theorem for the empirical process on the real line (see e.g. Gaenssler and Stute (1977), Theorem 10.2.8). For the uniform empirical process $(\alpha_n(t))_{t\in[0,1]}$ (i.e. $\alpha_n(t) := n^{-1/2}\sum_{i=1}^{n}(1_{[0,t]}(\xi_i) - t)$ with $(\xi_i)_{i\in\mathbb{N}}$ a sequence of i.i.d. variables being uniformly distributed on the unit interval) our method of proof was carried out by Strobl (1990).

Apart from these considerations, inequality 1.1 is of independent interest and might be useful for various purposes.

(ii) Countability of $\mathcal{C}$ is not necessary and only assumed for technical convenience. It suffices to assume that suprema of the type emerging in the AEC can be replaced by suprema running over a countable subset (see e.g. Giné and Zinn (1986), see further Ossiander (1987)). This holds in particular in the above example of the uniform empirical process on the unit interval.

2. Proof of inequality 1.1.

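Step 1. We look for the first index $k$ such that the process exceeds $\varepsilon$ absolutely on $B_k$. Additionally, we must ensure that at most $n/2$ of the observations fall into $B_k$. Hence we split:

(3) $$\begin{aligned}
\mathbb{P}\Bigl(\max_{k=1,\dots,m}|\beta_n(B_k)| > \varepsilon\Bigr)
&\le \mathbb{P}\Bigl(\sum_{i=1}^{n} 1_B(\eta_i) > n/2\Bigr) \\
&\quad + \sum_{k=1}^{m}\mathbb{P}\Bigl(\beta_n(B_j)\le\varepsilon\ \forall j=1,\dots,k-1,\ \beta_n(B_k) > \varepsilon,\ \sum_{i=1}^{n} 1_{B_k}(\eta_i)\le n/2\Bigr) \\
&\quad + \sum_{k=1}^{m}\mathbb{P}\Bigl(\beta_n(B_j)\ge-\varepsilon\ \forall j=1,\dots,k-1,\ \beta_n(B_k) < -\varepsilon,\ \sum_{i=1}^{n} 1_{B_k}(\eta_i)\le n/2\Bigr) \\
&=: \mathbb{P}(A) + \sum_{k=1}^{m}\mathbb{P}(A_k') + \sum_{k=1}^{m}\mathbb{P}(A_k'').
\end{aligned}$$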
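The decomposition in (3) can be traced on a concrete sample. The sketch below assumes $\nu$ uniform on $[0,1]$, $B_k = [0,t_k]$ and $B = [0,b]$; it returns the first index with an upward exceedance and the first index with a downward exceedance, which are the indices $k$ for which the first two conditions of $A_k'$ and $A_k''$ hold (the count condition on $B_k$ has to be checked separately). All names and sample values are hypothetical.

```python
import math

def first_exceedances(sample, endpoints, eps):
    """Return (k_up, k_down) for B_k = [0, t_k] and nu uniform on [0, 1]:
    k_up is the first k with beta_n(B_k) > eps, k_down the first k with
    beta_n(B_k) < -eps (1-based indices; None if there is no such k)."""
    n = len(sample)
    k_up = k_down = None
    for k, t in enumerate(endpoints, start=1):
        count = sum(1 for x in sample if x <= t)
        beta = (count - n * t) / math.sqrt(n)
        if k_up is None and beta > eps:
            k_up = k
        if k_down is None and beta < -eps:
            k_down = k
    return k_up, k_down

def too_many_in_B(sample, b):
    """Indicator of the event A: more than n/2 observations fall into B = [0, b]."""
    return sum(1 for x in sample if x <= b) > len(sample) / 2

sample = [0.05, 0.06, 0.07, 0.08] + [0.9] * 12
print(first_exceedances(sample, [0.1, 0.2], eps=0.5), too_many_in_B(sample, 0.4))
```

Because each sample has at most one first upward and one first downward exceedance, the events $A_1',\dots,A_m'$ are pairwise disjoint, and likewise $A_1'',\dots,A_m''$.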

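Step 2. We show that for each $k=1,\dots,m$ and for $n \ge 9216\,\alpha^{-3}$

(4) $$\mathbb{P}(A_k') \le 4\,\mathbb{P}(E_k') \quad\text{with}\quad E_k' := A_k'\cap\Bigl\{\beta_n(B) \ge \beta_n(B_k)\,\frac{1-\delta-\alpha}{1-\nu(B_k)}\Bigr\}.$$

Writing $|r| := r_1+\dots+r_k$, we have

$$\mathbb{P}(E_k') = \sum_{r\in R}\mathbb{P}\Bigl(\sum_{i=1}^{n} 1_{B_j\setminus B_{j-1}}(\eta_i) = r_j,\ j=1,\dots,k,\ \sum_{i=1}^{n} 1_{B\setminus B_k}(\eta_i) \ge (n-|r|)\,\frac{\delta+\alpha-\nu(B_k)}{1-\nu(B_k)}\Bigr),$$

where $R := \{r = (r_1,\dots,r_k):\ n^{-1/2}(r_1+\dots+r_j - n\nu(B_j)) \le \varepsilon\ \forall j=1,\dots,k-1,\ n^{-1/2}(r_1+\dots+r_k - n\nu(B_k)) > \varepsilon,\ r_1+\dots+r_k \le n/2\}$ and $B_0 := \emptyset$.

An elementary multinomial (and binomial) calculation shows that for i.i.d. r.e.'s $\xi_{ik}$, $i\in\mathbb{N}$, in $(X,\mathcal{X})$ with law $\mathcal{L}\{\xi_{ik}\} = \nu(\,\cdot\setminus B_k)/(1-\nu(B_k))$ the above sum equals

$$\sum_{r\in R}\mathbb{P}\Bigl(\sum_{i=1}^{n} 1_{B_j\setminus B_{j-1}}(\eta_i) = r_j,\ j=1,\dots,k\Bigr)\,\mathbb{P}\Bigl(\sum_{i=1}^{n-|r|} 1_B(\xi_{ik}) \ge (n-|r|)\,\frac{\delta+\alpha-\nu(B_k)}{1-\nu(B_k)}\Bigr).$$

Since $E(1_B(\xi_{ik})) = \frac{\delta+\alpha-\nu(B_k)}{1-\nu(B_k)}$, it follows by the Berry-Esseen theorem (and noticing that $|r| \le n/2$) that

$$\mathbb{P}\Bigl(\sum_{i=1}^{n-|r|} 1_B(\xi_{ik}) \ge (n-|r|)\,\frac{\delta+\alpha-\nu(B_k)}{1-\nu(B_k)}\Bigr) \ge 1/4 \quad\text{for } n \text{ as indicated.}$$

Hence $\mathbb{P}(E_k') \ge \frac{1}{4}\sum_{r\in R}\mathbb{P}\bigl(\sum_{i=1}^{n} 1_{B_j\setminus B_{j-1}}(\eta_i) = r_j,\ j=1,\dots,k\bigr) = \frac{1}{4}\,\mathbb{P}(A_k')$ and (4) is established.

In a completely analogous manner it is shown that

(5) $$\mathbb{P}(A_k'') \le 4\,\mathbb{P}(E_k'') \quad\text{with}\quad E_k'' := A_k''\cap\Bigl\{\beta_n(B) \le \beta_n(B_k)\,\frac{1-\delta-\alpha}{1-\nu(B_k)}\Bigr\}.$$

Step 3. By (3), (4) and (5) together with the observation that $\frac{1-\delta-\alpha}{1-\nu(B_k)} \ge 1/2$ (so that $\beta_n(B) > \varepsilon/2$ on $E_k'$ and $\beta_n(B) < -\varepsilon/2$ on $E_k''$) we obtain

$$\begin{aligned}
\mathbb{P}\Bigl(\max_{k=1,\dots,m}|\beta_n(B_k)| > \varepsilon\Bigr)
&\le \mathbb{P}(A) + 4\sum_{k=1}^{m}\mathbb{P}(E_k') + 4\sum_{k=1}^{m}\mathbb{P}(E_k'') \\
&\le \mathbb{P}(A) + 4\,\mathbb{P}\bigl(|\beta_n(B)| > \varepsilon/2\bigr),
\end{aligned}$$

since the sets $E_k'$, $E_k''$, $k=1,\dots,m$, are disjoint. □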


3. Proof of Theorem 1.2.

Since total boundedness of $(\mathcal{C},d_\nu)$ is trivial and convergence of the finite-dimensional marginals is an easy consequence of the multivariate CLT, it suffices to show (see e.g. Gaenssler (1992), Theorem 3.10) that

$$\lim_{\delta\downarrow 0}\,\limsup_{n\to\infty}\,\mathbb{P}\bigl(w_{\beta_n}(\delta) > \varepsilon\bigr) = 0 \quad\text{for each } \varepsilon>0,$$

with $w_{\beta_n}(\delta) := \sup_{d(C,D)\le\delta}|\beta_n(C) - \beta_n(D)|$.

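Step 1. We replace the supremum by a finite maximum to permit the application of inequality 1.1.

First we observe that, by monotonicity of $w_{\beta_n}$, it suffices to consider the values $\delta = 1/l$, $l\in\mathbb{N}$. Let $l>4$ be fixed and denote $K := \{0,\dots,l-1\}$. By total ordering of $\mathcal{C}$ one has

$$\{w_{\beta_n}(\delta) > \varepsilon\} \subset \bigcup_{k\in K}\bigl\{\exists\, C\in\mathcal{C} \text{ with } D_k\subset C\subset D_{k+1} \text{ and } |\beta_n(C) - \beta_n(D_k)| > \varepsilon/3\bigr\}$$

and hence

$$\mathbb{P}\bigl(w_{\beta_n}(\delta) > \varepsilon\bigr) \le \sum_{k\in K}\mathbb{P}\Bigl(\sup_{C\in\mathcal{C}:\ D_k\subset C\subset D_{k+1}}|\beta_n(C\setminus D_k)| > \varepsilon/3\Bigr).$$

Choose now for every $k\in K$ sets $\mathcal{C}_{mk} = \{C_{m1k},\dots,C_{mmk}\}$ such that $C_{m1k}\subset\dots\subset C_{mmk}$ and $\mathcal{C}_{mk}\uparrow\{C\in\mathcal{C}:\ D_k\subset C\subset D_{k+1}\}$ as $m\to\infty$ (this is possible because $\mathcal{C}$ is countable and totally ordered), then

(6) $$\mathbb{P}\bigl(w_{\beta_n}(\delta) > \varepsilon\bigr) \le \sum_{k\in K}\lim_{m\to\infty}\mathbb{P}\Bigl(\max_{j=1,\dots,m}|\beta_n(C_{mjk}\setminus D_k)| > \varepsilon/3\Bigr).$$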

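Step 2. We show that $\mathbb{P}\bigl(\max_{j=1,\dots,m}|\beta_n(C_{mjk}\setminus D_k)| > \varepsilon/3\bigr)$ can be bounded uniformly in $m$ and $k$ as follows.

For every $m\in\mathbb{N}$ and $k\in K\setminus\{l-1\}$ inequality 1.1 (applied with $B_j = C_{mjk}\setminus D_k$, $B = D_{k+2}\setminus D_k$ and $\alpha = \delta$) yields for large $n$:

$$\mathbb{P}\Bigl(\max_{j=1,\dots,m}|\beta_n(C_{mjk}\setminus D_k)| > \varepsilon/3\Bigr) \le \mathbb{P}\Bigl(\sum_{i=1}^{n} 1_{D_{k+2}\setminus D_k}(\eta_i) > n/2\Bigr) + 4\,\mathbb{P}\bigl(|\beta_n(D_{k+2}\setminus D_k)| > \varepsilon/6\bigr).$$

For $k=l-1$ we take $X\setminus D_{k-1}$ instead of $D_{k+2}\setminus D_k$. By the weak law of large numbers and the CLT, respectively, we observe that the right hand side tends to $8\bigl(1 - \Phi\bigl(\frac{\varepsilon}{6\sqrt{2\delta(1-2\delta)}}\bigr)\bigr)$. Hence by (6)

(7) $$\limsup_{n\to\infty}\,\mathbb{P}\bigl(w_{\beta_n}(\delta) > \varepsilon\bigr) \le \frac{1}{\delta}\cdot 8\Bigl(1 - \Phi\Bigl(\frac{\varepsilon}{6\sqrt{2\delta(1-2\delta)}}\Bigr)\Bigr).$$

A well-known bound for the normal distribution function $\Phi$ (see e.g. Gaenssler and Stute (1977), 1.19.2) together with the elementary fact that $e^x \ge x$ yields that the right hand side in (7) converges to zero as $\delta\to 0$. This concludes the proof of the functional central limit theorem. The convergence rate in the weak uniform law of large numbers is established in an analogous manner using Chebyshev's inequality instead of the CLT. □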
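The convergence of the right hand side of (7) to zero can also be observed numerically. The following minimal sketch evaluates the bound for a few values of $\delta$, computing $1-\Phi$ via the complementary error function; the function names and the choice $\varepsilon = 1$ are assumptions of this illustration.

```python
import math

def normal_tail(z):
    """1 - Phi(z) for the standard normal distribution function Phi."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def bound_7(delta, eps):
    """Right hand side of (7): (1/delta) * 8 * (1 - Phi(eps / (6 * sqrt(2 delta (1 - 2 delta)))))."""
    sigma = math.sqrt(2.0 * delta * (1.0 - 2.0 * delta))
    return 8.0 / delta * normal_tail(eps / (6.0 * sigma))

for delta in (1e-2, 1e-3, 1e-4):
    print(delta, bound_7(delta, eps=1.0))
```

The factor $1/\delta$ grows only polynomially while the Gaussian tail decays like $\exp(-c\,\varepsilon^2/\delta)$, which is why the product vanishes as $\delta\to 0$ even though the bound can be large for moderate $\delta$.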

REFERENCES

[1] Alexander, K.S. (1987): Central limit theorems for stochastic processes under random entropy conditions. Probab. Th. Rel. Fields 75, 351-378.

[2] Dudley, R.M. (1984): A course on empirical processes. Lecture Notes in Mathematics 1097, 1 - 142. Springer, Berlin/Heidelberg/New York.

[3] Gaenssler, P. (1983): Empirical Processes. Hayward, California, Institute of Mathematical Statistics.

[4] Gaenssler, P. (1992): On weak convergence of certain processes indexed by pseudometric parameter spaces with Applications to empirical processes. Transactions of the 11th Prague Conference, 49 - 78.

[5] Gaenssler, P. and Stute, W. (1977): Wahrscheinlichkeitstheorie. Springer, Berlin/Heidelberg/New York.

[6] Gaenssler, P. and Ziegler, K. (1994): On function-indexed partial-sum processes. In: Prob. Theory and Math. Statist. (B. Grigelionis et al. (Eds.)), pp. 285 - 311. VSP/TEV 1994.

[7] Giné, E. and Zinn, J. (1986): Lectures on the central limit theorem for empirical processes. Lecture Notes in Mathematics 1221, 50 - 113. Springer, Berlin/Heidelberg/New York.

[8] Pollard, D. (1990): Empirical Processes: Theory and Applications. Institute of Mathematical Statistics, Hayward, California.

[9] Strobl, F. (1990): Zur Verteilungskonvergenz von Zufallsgrößen. Schriftliche Hausarbeit, University of Munich.

[10] van der Vaart, A. and Wellner, J.A. (1996): Weak convergence and empirical processes (with Applications to Statistics). Springer, Berlin/Heidelberg/New York.

[11] Ziegler, K. (1995): Functional Central Limit Theorems for Triangular Arrays of Function-indexed Processes under uniformly integrable entropy conditions. Submitted to Journal of Mult. Anal.

Klaus Ziegler Mathematisches Institut Universitaet Muenchen

Theresienstrasse 39 D-80333 Muenchen

email: [email protected]

Received 24 October 1996