ASYMPTOTIC NORMALITY OF STATISTICS BASED ON THE CONVEX MINORANTS OF EMPIRICAL DISTRIBUTION FUNCTIONS

BY

PIET GROENEBOOM

AND

RONALD PYKE

TECHNICAL REPORT NO. 5
JULY 1981

THIS RESEARCH IS BASED UPON WORK PARTIALLY SUPPORTED BY THE NATIONAL SCIENCE FOUNDATION UNDER GRANT MCS-78-09858

DEPARTMENT OF STATISTICS
UNIVERSITY OF WASHINGTON
SEATTLE, WASHINGTON 98195

DEPARTMENT STAT

ASYMPTOTIC NORMALITY OF STATISTICS BASED ON THE CONVEX MINORANTS

OF EMPIRICAL DISTRIBUTION FUNCTIONS

by

Piet Groeneboom1 and Ronald Pyke2

University of Washington

Abstract

Let $F_n$ be the Uniform empirical distribution function. Write $\hat F_n$ for the (least) concave majorant of $F_n$, and let $\hat f_n$ denote the corresponding density. It is shown that $n\int_0^1 (\hat f_n(t)-1)^2\,dt$ is asymptotically standard normal when centered at $\log n$ and normalized by $(3\log n)^{1/2}$. A similar result is obtained in the 2-sample case in which $\hat f_n$ is replaced by the slope of the convex minorant of $\bar F_m = F_m \circ H_N^{-1}$.

FOOTNOTES

1. This work was done while this author was on leave from the Mathematical

Centre, Amsterdam, as a Visiting Professor in the Departments of

Statistics and Mathematics at the University of Washington.

2. The research of this author was supported in part by the National

Science Foundation, Grant MCS-78-09858.

AMS 1970 Subject Classification. Primary: 62E20

Secondary: 62G99, 60J65

Key words and phrases: empirical distribution function, concave majorant,

convex minorant, limit theorems, spacings, Brownian bridge, two-sample

rank statistics.

1. Introduction

In 1956, Grenander introduced ideas of 'non-parametric' maximum likelihood estimation. In one of the examples, he found the maximum likelihood estimate (M.L.E.) within the class of all distribution functions that are concave over $[0,\infty)$, or equivalently, the class of all monotone decreasing densities supported on $[0,\infty)$. The M.L.E. in this example is the concave majorant of the ordinary empirical distribution function. For a formal definition of maximum likelihood which covers these 'non-parametric' cases, see Scholz (1980).

Statistics based on either concave majorants or convex minorants of

empirical distribution functions have arisen independently in at least two

other very different contexts. In 1975, Behnen proposed a 2-sample rank statistic defined as "the supremum of all standardized and centered simple linear rank statistics having non-decreasing scores". This statistic was shown to have comparable performance to an adaptive statistic proposed by Randles and Hogg (1973) when used against shift alternatives, and to have a much superior performance against stochastically ordered alternatives.

Although not mentioned in Behnen (1974, 1975), it was known independently

to both Behnen and Scholz (personal correspondence, June and July, 1975)

that this statistic is expressible in terms of the L2-norm of the density

function of the concave majorant of the usual 2-sample empirical distribution

function. The asymptotic distribution of the statistic was left as an open

question, although Behnen (1974) provided extensive simulations for selected

sample sizes up to m=n=100 which suggested to us the asymptotic normality

of the statistic.

In a completely different context, Scholz (1981) proposed a procedure

for the combination of p-values from independent tests of significance.

The procedure, utilizing Roy's union-intersection principle, results in a

statistic that is expressible similarly in terms of the L2-norm of the

density function of the concave majorant of the 1-sample Uniform empirical

distribution function. Exact distributions are obtained by Scholz for the

very small sample sizes which are important for this context. Simulations

were also carried out by Scholz (personal communication) for moderate sample

sizes to evaluate the feasibility of approximations using asymptotic theory.

The asymptotic distribution of the 1-sample statistic of Scholz is derived in Section 3 and that of the 2-sample statistic of Behnen

is obtained in Section 5. The method used in Section 3 utilizes a conditional

representation of the concave majorant of the Uniform empirical distribution

function in terms of a sequence of Poisson and gamma random variables. This

representation is detailed in Section 2. This method is an extension of that

used in Pyke (1965) and due originally to Le Cam (1958). The 2-sample case

is proved in Section 5 using a strong invariance principle together with

the asymptotic normality of the L2-norms of the slope processes of the

convex minorants of a sequence of truncated Brownian Bridges. The latter is

derived in Section 4.

As is pointed out in the remarks in Section 6, it is possible to prove

the 2-sample result by an analogue of the method presented in Sections 2 and

3. On the other hand, it is possible to prove the 1-sample result by a method analogous to that of Sections 4 and 5. This is done in Groeneboom (1981, Theorem 3.2) where a detailed study of the concave majorant of Brownian motion is presented.

2. The Representation Theorem

In this section, we describe the specific construction that is used

to provide a tractable representation for the concave majorant of the

uniform empirical process. To do this, the following notation is needed.

Let $F_n$ denote the Uniform empirical distribution function, and write $\hat F_n$ for its concave majorant, the function on [0,1] formed as it were by stretching a rubber band over the top of $F_n$. Let $\eta_n$ be the number of vertices of $\hat F_n$, including the end-points, (0,0) and (1,1). Let

$0 = \xi_{n,0} < \xi_{n,1} < \cdots < \xi_{n,\eta_n} = 1$

be the x-coordinates of these vertices. For $1 \le i \le \eta_n$ and $1 \le j \le n$, define

$D_{ni} = \xi_{n,i} - \xi_{n,i-1}$, $\quad J_{ni} = n\{F_n(\xi_{n,i}) - F_n(\xi_{n,i-1})\}$, $\quad Q_{n,j} = \#\{i: J_{n,i} = j\}$

to be respectively the horizontal "width" and vertical "number of steps" associated with each of the segments of $\hat F_n$, and the frequency of segments of a given number of steps. Notice that $Q_{n,0} = 1$ in view of the flat section of $\hat F_n$ that always occurs to the right of the largest order statistic. Set $D^{(n)} = (D_{n,1}, \ldots, D_{n,\eta_n})$, $J^{(n)} = (J_{n,1}, \ldots, J_{n,\eta_n})$ and $Q^{(n)} = (Q_{n,1}, \ldots, Q_{n,n})$. If $\hat f_n$ denotes the density (slope) of $\hat F_n$, then the statistic that motivated this paper is

$L_n = \int_0^1 \hat f_n^2(t)\,dt,$

which can be written in terms of the above notation as

(2.1a)  $L_n = n^{-2}\sum_{i=1}^{\eta_n} J_{ni}^2/D_{ni},$

in which $J_{ni}/nD_{ni}$ is the slope of the i-th segment of the concave majorant. The concave majorant of a sequence of partial sums of interchangeable r.v.'s was studied by Sparre Andersen (1954) who derived in particular the distribution of the number of vertices. Implicit in that paper is the following result for our problem where the interchangeable r.v.'s are the n+1 spacings formed from the n independent Uniform (0,1) observations. (For any r.v.'s X, Y, we write $f_X$ and $f_{X|Y}$ for marginal and conditional density functions when well defined.)
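The notation of this section can be made concrete with a short numerical sketch. The Python fragment below is an illustration only and is not part of the report: the function names `majorant_segments` and `l_statistic` are hypothetical, the sample is assumed to consist of distinct values in (0,1), and the statistic is computed as $n^{-2}\sum_i J_{ni}^2/D_{ni}$, the integral of the squared slope of the concave majorant.

```python
from collections import Counter


def majorant_segments(sample):
    """Return (D, J): widths and step counts of the concave majorant segments.

    Assumes distinct sample values in (0, 1). Heights are kept as integer
    step counts k = n * F_n, so J is a list of integers.
    """
    xs = sorted(sample)
    n = len(xs)
    # Upper corners of the empirical d.f., plus the end-points (0,0) and (1,1).
    pts = [(0.0, 0)] + [(x, k) for k, x in enumerate(xs, start=1)] + [(1.0, n)]
    hull = []
    for px, py in pts:
        # Upper convex hull scan: drop the middle point when it lies on or
        # below the chord joining its neighbours.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append((px, py))
    D = [b[0] - a[0] for a, b in zip(hull, hull[1:])]
    J = [b[1] - a[1] for a, b in zip(hull, hull[1:])]
    return D, J


def l_statistic(sample):
    """n^{-2} * sum_i J_i^2 / D_i; the flat final segment (J = 0) adds nothing."""
    D, J = majorant_segments(sample)
    n = len(sample)
    return sum(j * j / d for d, j in zip(D, J)) / n ** 2


D, J = majorant_segments([0.5, 0.125, 0.75])
Q = Counter(J)  # Q[j] = number of segments with exactly j steps
```

For the three-point sample in the last lines, the majorant has vertices at 0, 0.125, 0.75 and 1, one segment each with 1, 2 and 0 steps, so the counts satisfy $\sum_j jQ_{n,j} = n$ and the zero-step count is 1, as noted in the text.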

Lemma 2.1. For non-negative integers $q_1, \ldots, q_n$ with $\sum_{j=1}^{n} jq_j = n$,

(2.2)

and

$= \prod_{j=1}^{n} j^{-q_j}/q_j!$

Proof. Conditionally given the ordered set of Uniform spacings, all

permutations thereof are equally likely. With probability 1, all spacings

and partial sums thereof are distinct. Partition the spacings into $q_1$ subsets of size 1, $q_2$ subsets of size 2, and so on, with $\sum_{j=1}^{n} jq_j = n$. The remaining spacing forms a subset of its own. Within each subset of size j, the probability is 1/j of choosing a permutation whose partial sums lie below the line segment joining the end points (Spitzer's Lemma; Spitzer (1956), cf. Feller (1968), p. 423). The $q_j$ subsets of size j

can be permuted $q_j!$ times. Finally, the slopes of the line segments determined by each of the subsets can be ordered in exactly one way by decreasing slopes to form a concave majorant with the required $Q^{(n)} = q^{(n)}$. (2.3) is immediate. (Cf. the last paragraph of Hobby and Pyke (1964).) □

If we let $\{N_j: j \ge 1\}$ be independent Poisson r.v.'s with $E(N_j) = 1/j$, it is clear from the form of (2.2) that if $T_n = \sum_{j=1}^{n} jN_j$ and $N_0 = 1$ a.s.,

(2.4)

Note that

(2.5)

Also, it is well known that if $\{V_j: j \ge 1\}$ are independent Exp(1) r.v.'s then the conditional distribution of $n^{-1}(V_1, \ldots, V_{n+1})$ given $V_1 + \cdots + V_{n+1} = n$ is the same as that of the n+1 Uniform spacings. We now build upon this as follows to provide a suitable conditional representation for the concave majorant. Let $\{S_{ji}: i \ge 1, j \ge 0\}$ be independent r.v.'s with $S_{ji}$ being a $\Gamma(j,1)$ r.v. for $j \ge 1$ and $S_{0i}$ being a $\Gamma(1,1)$ or Exp(1) r.v. That is, each $S_{ji}$ with $j \ge 1$ is equal in law to a sum of j independent Exp(1) r.v.'s. Assume $\{N_j\}$ and $\{S_{ji}\}$ are independent.
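The link between the product form $\prod_j j^{-q_j}/q_j!$ of Lemma 2.1 and the Poisson variables $N_j$ can be checked by exact arithmetic. The sketch below is illustrative only (the helper names `multiplicity_vectors` and `product_weight` are hypothetical). Since $P[N_j = q_j] = e^{-1/j} j^{-q_j}/q_j!$, the joint Poisson probability of $(q_1,\ldots,q_n)$ is the product weight times the constant $\exp(-\sum_{j \le n} 1/j)$; if the product weights sum to 1 over all vectors with $\sum_j jq_j = n$, then conditioning on $T_n = n$ reproduces the product form exactly.

```python
from fractions import Fraction
from math import factorial


def multiplicity_vectors(n, max_part=None):
    """Yield dicts {j: q_j} with q_j >= 1 and sum_j j * q_j == n."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield {}
        return
    if max_part == 0:
        return
    for q in range(n // max_part, -1, -1):
        for rest in multiplicity_vectors(n - q * max_part, max_part - 1):
            if q:
                rest = dict(rest)
                rest[max_part] = q
            yield rest


def product_weight(q):
    """prod_j j^{-q_j} / q_j! as an exact fraction."""
    w = Fraction(1)
    for j, qj in q.items():
        w /= Fraction(j) ** qj * factorial(qj)
    return w


total = sum(product_weight(q) for q in multiplicity_vectors(6))
```

With n = 6 there are 11 admissible vectors, and `total` equals 1 exactly, so the weights form a probability distribution without any further normalization.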

Rewrite the spacings $\{D_{ni}\}$ of the concave majorant in a specific order as follows. First, write $D_{n0}$ for $D_{n\eta_n}$, the only zero-step spacing; then write all of the one-step spacings in the order of increasing magnitude (which is the same as the order of appearance in the concave majorant); then write all of the two-step spacings, and so on. In this order, denote them by

$\tilde D^{(n)} = (D_{n0};\ D_{n11}, \ldots, D_{n1Q_{n1}};\ D_{n21}, \ldots, D_{n2Q_{n2}};\ \ldots,\ D_{nnQ_{nn}}).$

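Analogously, write $\tilde S_{j1} \le \tilde S_{j2} \le \cdots \le \tilde S_{jN_j}$ for the ordered values of $S_{j1}, \ldots, S_{jN_j}$ and set

$\tilde S^{(n)} = (S_{01};\ \tilde S_{11}, \ldots, \tilde S_{1N_1};\ \tilde S_{21}, \ldots, \tilde S_{2N_2};\ \ldots;\ \tilde S_{n1}, \ldots, \tilde S_{nN_n}).$

Then the following representation holds, where $N^{(n)} = (N_1, \ldots, N_n)$.

Note: We delete the parameter n from the notation whenever it is unlikely to cause confusion.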

Theorem 2.1. For n > 1

where

(2.7)

Proof. Let $\{Y(t): t \ge 0\}$ denote a Poisson process with $EY(t) = t$. It is well known that conditional on the (n+1)-th jump occurring at t = n, $n^{-1}Y(n\,\cdot)$ on [0,1) is equal in law to $F_n$. The relationship given in (2.6) is closely related. First of all, by (2.4), the marginal distribution of $Q$ is the same as the conditional distribution of $N^{(n)}$ given $T_n = n$. Now $f_{Q,\tilde D}$ is obtainable by standard techniques starting from the Uniform distribution over $[0,1]^n$. The main thing to observe is that, as is implied in the proof of (2.4), the distribution of $Q$ is the same for almost all values of the original ordered Uniform spacings. The latter, when multiplied by n, can be represented as the ordered values of n+1 independent Exp(1) r.v.'s $Y_1, \ldots, Y_{n+1}$ given $Y_1 + \cdots + Y_{n+1} = n$. Thus the same techniques that will yield $f_{Q,n\tilde D}(q,d)$ from the joint density of ordered Uniform spacings will yield $f_{N^{(n)},\tilde S^{(n)}|T_n,S_n}(q,d,n,n)$ from ordered exponentials. □

3. The One-Sample Limit Theorem

The statistic $L_n$ is shown to be asymptotically normal by studying a related statistic suggested by Theorem 2.1. First of all, notice that

(3.1)  $L_n = n^{-1}\sum_{j=1}^{n} j^2 \sum_{i=1}^{Q_j} (1/nD_{nji}),$

which suggests that one might study the conditional limiting distribution of

$L_n^* = n^{-1}\sum_{j=1}^{n} j^2 \sum_{i=1}^{N_j} (1/S_{ji})$

under the conditions that $T_n = n$ and $S_n = n$. To this end, we introduce three suitably normalized r.v.'s, namely,

(3.2)

(3.3)

and

(3.4)  $W_n = n^{-1}\sum_{j=1}^{n} jN_j.$

Observe first of all that the conditions $T_n = n$ and $S_n = n$ are equivalent to $W_n = 1$ and $V_n = 0$. Secondly, observe that under these conditions, $U_n$ reduces to

(3.5)  $U_n = (3\log n)^{-1/2}\bigl(\sum_{j=3}^{n} j^2 \sum_{i=1}^{N_j}(1/S_{ji}) - n - \log n\bigr) = (3\log n)^{-1/2}(nL_n^* - n - \log n),$

which is conditionally equal in law to the same expression with $L_n^*$ replaced by $L_n$. The particular choice of $U_n$ in (3.2) was not easy to obtain. Its

specific combination of terms is essential in order to provide the desired asymptotic normality. The form given by (3.2) makes it easy to see the effect of the conditions $W_n = 1$ and $V_n = 0$. It may, however, be simplified as

(3.5a)  $U_n = (3\log n)^{-1/2}\bigl(\sum_{j=3}^{n}\sum_{i=1}^{N_j}(S_{ji}-j)^2/S_{ji} - \log n\bigr).$

One may also write

(3.5b)  $U_n = (3\log n)^{-1/2}\bigl(\sum_{j=3}^{n}\sum_{i=1}^{N_j}(S_{ji}-j)^2/j - \log n\bigr) - (3\log n)^{-1/2}\sum_{j=3}^{n}\sum_{i=1}^{N_j}(S_{ji}-j)^3/jS_{ji},$

in which the randomness has been removed from the denominator in the first term while the second term is of smaller order. This form is more tractable for computation of higher order expansion terms.
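The algebra connecting these forms of $U_n$ is easy to verify with exact rational arithmetic. The sketch below is only an illustration (the function names are hypothetical, and the numbers are arbitrary test values, not data from the report). It checks that $(s-j)^2/s = (s-j)^2/j - (s-j)^3/(js)$, and that $\sum (s-j)^2/s = \sum j^2/s - n$ whenever the values s and the indices j both sum to n, which is how the conditions $T_n = n$ and $S_n = n$ enter.

```python
from fractions import Fraction


def direct_form(s, j):
    """(s - j)^2 / s, the summand with the random quantity in the denominator."""
    return (s - j) ** 2 / s


def split_form(s, j):
    """(s - j)^2 / j - (s - j)^3 / (j s), the summand split into two terms."""
    return (s - j) ** 2 / Fraction(j) - (s - j) ** 3 / (j * s)


# Arbitrary illustrative values: indices and "gamma values" that both sum to n = 6.
js = [1, 2, 3]
ss = [Fraction(1, 2), Fraction(5, 2), Fraction(3)]
n = sum(js)

lhs = sum(direct_form(s, j) for s, j in zip(ss, js))
rhs = sum(Fraction(j * j) / s for s, j in zip(ss, js)) - n
```

Here `lhs` and `rhs` agree exactly, because $(s-j)^2/s = s - 2j + j^2/s$ and the first two terms sum to $n - 2n = -n$ under the two conditions.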

Lemma 3.1. $(U_n, V_n, W_n) \xrightarrow{L} (U,V,W)$ where U and (V,W) are independent, U is N(0,1) and (V,W) has the infinitely divisible characteristic function

$\varphi_{(V,W)}(s,t) = \exp\bigl\{\int_0^1 (e^{itu - s^2u/2} - 1)u^{-1}\,du\bigr\}.$

Equivalently, $V \stackrel{L}{=} Z\sqrt{W}$ where Z is a N(0,1) r.v. independent of W.

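Proof. Clearly $E(V_n) = 0$ and $E(W_n) = 1$. Also

$\mathrm{var}(V_n) = n^{-1}\sum_{j=1}^{n} E(N_j)\,\mathrm{var}(S_{ji}) = n^{-1}\sum_{j=1}^{n} (1/j)\,j = 1.$

Moreover, for j > 2, $E(S_{ji}^{-1}) = (j-1)^{-1}$, $E(S_{ji}^{-2}) = 1/(j-1)(j-2)$ and $\mathrm{var}(S_{ji}^{-1}) = (j-1)^{-2}(j-2)^{-1}$. Hence, as $n \to \infty$,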

and

and

(3.6) =-. (3-log tl,"'lEj=3{j3{j_1)""(j_2)o.l ... 1-25(j-1)"'1"'(j-1 )o.l}+ 0(1)

= (3 log n}-l E~_ {L!J-1}2+ 1~ - 1 - l/(j-l}} + o(l}J-3 (j-1) (j-2)

$= (3\log n)^{-1}\sum_{k=1}^{n-2}\{3/k + 4/k(k+1) + 1/k(k+1)^2\} + o(1) \to 1.$

The orders of the asymptotic variances are thereby established.

To determine the limiting distribution of $(U_n, V_n, W_n)$, it suffices to show that all linear combinations $aU_n + bV_n + cW_n$ converge in law and to specify the limiting distribution. Since the variances converge, it suffices to proceed as follows. In view of (3.5a), write $U_n = \sum_{j=3}^{n}(X_{nj} + Y_{nj}) + \epsilon_n$, where

(3.7)  $X_{nj} = (3\log n)^{-1/2}\sum_{i=1}^{N_j}\{(S_{ji}-j)^2/S_{ji} - j/(j-1)\}, \quad Y_{nj} = (3\log n)^{-1/2}(jN_j - 1)/(j-1),$

$\epsilon_n = (3\log n)^{-1/2}\bigl(\sum_{j=3}^{n}(j-1)^{-1} - \log n\bigr) = O((\log n)^{-1/2}).$

We note that $EX_{nj} = 0 = EY_{nj}$.

To establish the asymptotic normality of $U_n$, it suffices, since $\epsilon_n \to 0$ and all variances converge, to show that $\sum_j P[|X_{nj} + Y_{nj}| > \epsilon] \to 0$. (Cf. Loève (1963), p. 316.) For this it suffices to compute fourth moments and use Markov's inequality. To this end, we compute

where $A_j$ and $B_j$ are the fourth and second central moments of $(S_{j1}-j)^2/S_{j1}$, respectively. It is easy to show that both $A_j$ and $B_j$ are uniformly bounded in j > 4, since

and by the $c_r$-inequality (cf. Loève (1963), p. 155)

$E(S_{j-4,1} - j)^8 \le 2^7\{E[\sum_{i=1}^{j-4}(Y_i - 1)]^8 + 4^8\}$

where $Y_1, Y_2, \ldots$ are independent Exponential r.v.'s with mean 1. Straightforward computations show that $E(X_1 + \cdots + X_m)^8 \le Cm^4$ for any independent r.v.'s with means zero. Therefore,

$\sum_j P[|X_{nj}| > \epsilon] \le \epsilon^{-4}\sum_j E|X_{nj}|^4 \le C(\log n)^{-2}\sum_j j^{-1} \to 0.$

Similarly,

$E|Y_{nj}|^4 = (3\log n)^{-2} j^4 (j-1)^{-4} E(N_j - 1/j)^4 = (3\log n)^{-2} j^4 (j-1)^{-4}(j^{-1} + 3j^{-2}) \le C(\log n)^{-2} j^{-1}.$

Hence $\sum_j P[|Y_{nj}| > \epsilon] \to 0$.

We have thus established that $U_n \xrightarrow{L} U$, a N(0,1) r.v.

To determine the limiting joint distribution of (Vn, Wn), we compute

the characteristic function

$E(e^{isV_n + itW_n}) = E\bigl\{\prod_{j=1}^{n}\bigl([\varphi_{S_{j1}-j}(sn^{-1/2})]^{N_j} e^{it(j/n)N_j}\bigr)\bigr\}$

(3.9)

Now, as $j \to \infty$ while $j/n \to u \in (0,1)$, we have by the Central Limit Theorem for $\{j^{-1/2}(S_{j1}-j)\}$ that

$\varphi_{S_{j1}-j}(sn^{-1/2}) \to e^{-s^2u/2}.$

Recognizing a Riemann sum in the exponent of (3.9), the limit becomes

as desired. It is straightforwardly checked by direct integration that the

exponent in (3.10) may be written as

(3.11)

where $\varphi$ is the standard Normal density, so that the Lévy measure of the

2-dimensional infinitely divisible r.v. (V,W) is absolutely continuous with respect to Lebesgue measure on $((-\infty,0)\cup(0,\infty))\times(0,1)$ with 'density' $\varphi(v/\sqrt{w})\,w^{-3/2}$. There is therefore no Normal part to the distribution of (V,W). This fact completes the proof since it is easily checked that if $\{Y_{ni}\}$ and $\{Z_{ni}\}$ are two triangular arrays which are jointly in the domain of attraction of an infinitely divisible distribution, and if the marginal limiting law of one is Normal and the other has no Normal component, then they are asymptotically independent. □

Lemma 3.1 gives the limiting joint distribution of $(U_n, V_n, W_n)$, whereas the result being sought is the limiting conditional distribution of $U_n$ given $V_n = 0$ and $W_n = 1$. To obtain the conditional result from the joint, we follow an idea used originally by Le Cam (1958) to obtain limit laws for sums of a function of Uniform spacings. The method was used by Pyke (1965) to obtain limit laws of more general functions as well as the weak convergence of related processes. It can be shown that

$\varphi_{nm}(t) := E[e^{itU_m} \mid V_n = 0, W_n = 1] = E\{E[e^{itU_m} \mid V_m, W_m] \mid V_n = 0, W_n = 1\}$

(3.12)  $= E[e^{itU_m}\rho_{nm}(W_m)\,r_{nm}(V_m, W_m)]$

where

$\rho_{nm}(k/m) = P[W_m = k/m \mid W_n = 1]/P[W_m = k/m]$

and

$r_{nm}(v, k/m) = f_{V_m|W_m,V_n}(v, k/m, 0)/f_{V_m|W_m}(v, k/m)$

for k = 0, 1, ... and real v. To verify (3.12), write

$\varphi_{nm}(t) = \sum_{k=0}^{n} P[W_m = k/m \mid W_n = 1]\int_{-\infty}^{\infty} E[e^{itU_m} \mid W_m = k/m, V_m = v]\, f_{V_m|W_m,V_n}(v, k/m, 0)\,dv$

$= \sum_{k=0}^{n} \rho_{nm}(k/m)\,P[W_m = k/m]\int_{-\infty}^{\infty} E[e^{itU_m} \mid W_m = k/m, V_m = v]\, r_{nm}(v, k/m)\, f_{V_m|W_m}(v, k/m)\,dv$

which is then the unconditional expectation given in (3.12).

Since the $N_j$'s are independent,

(3.13)  $\rho_{nm}(k/m) = P[T_n - T_m = n-k]/P[T_n = n]$

where $T_n = \sum_{j=1}^{n} jN_j = nW_n$. Moreover, under the condition that $W_m = k/m$ (equivalently, $T_m = k$) and $V_n = 0$ (equivalently, $S_n := \sum_{j=1}^{n}\sum_{i=1}^{N_j} S_{ji} = n$), it is well known that $n^{-1}S_m$ is a Beta(k, n-k) r.v., while under the single condition $W_m = k/m$, $S_m$ is a Gamma(k,1) r.v. Therefore, for k = 1, 2, ..., n and $0 < (m^{1/2}v + k)/n < 1$,

$f_{V_m|W_m,V_n}(v, k/m, 0) = \frac{m^{1/2}\,\Gamma(n)}{n\,\Gamma(k)\Gamma(n-k)}\Bigl(\frac{m^{1/2}v+k}{n}\Bigr)^{k-1}\Bigl(1 - \frac{m^{1/2}v+k}{n}\Bigr)^{n-k-1}$

and

$f_{V_m|W_m}(v, k/m) = m^{1/2}(m^{1/2}v + k)^{k-1}e^{-m^{1/2}v - k}/\Gamma(k).$

Hence

Upon substituting (3.13) and (3.14) into (3.12), one obtains a tractable unconditional form for the conditional characteristic function of $U_m$ which can be used to prove the desired result. A significant step in the proof of this result will be the determination of the limiting behavior of the functions $\rho_{nm}$ and $r_{nm}$ in order to permit the use of the dominated convergence theorem in (3.12). To this end, we prove the following lemmas.

Lemma 3.2.

(3.15)  $P[T_n = k] = p_n := \exp\bigl(-\sum_{j=1}^{n} j^{-1}\bigr)$ for $0 \le k \le n$, and $P[T_n = k] \le p_n$ for k > n,

and $np_n \to e^{-\gamma} = .561\ldots$, where $\gamma = .5772\ldots$ is Euler's constant.

Proof. The generating function of $T_n$ is computed directly to be

(3.16)  $p_n \exp\bigl(\sum_{j=1}^{n} j^{-1}s^j\bigr).$

For $k \le n$, the coefficient of $s^k$ in (3.16) will be the same as the coefficient of $s^k$ in $p_n(1-s)^{-1}$ since the two functions differ in the exponent by powers of s higher than n. For k > n, the latter would be greater. This proves (3.15). The rest of the Lemma is clear. □
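The coefficient argument in this proof can be reproduced numerically. The sketch below is illustrative only (the names `exp_series_coeffs` and `p_n` are hypothetical). It uses the standard recurrence for the coefficients $a_k$ of $A(s) = \exp(B(s))$, namely $k a_k = \sum_j j b_j a_{k-j}$, with $b_j = 1/j$ for $j \le n$, so that $P[T_n = k] = p_n a_k$; exact fractions are used for the coefficients and floating point for $p_n$.

```python
from fractions import Fraction
from math import exp

EULER_GAMMA = 0.5772156649015329


def exp_series_coeffs(n, kmax):
    """Coefficients a_0..a_kmax of exp(sum_{j=1}^n s^j / j) as exact fractions."""
    a = [Fraction(1)]
    for k in range(1, kmax + 1):
        # From A'(s) = B'(s) A(s) with B'(s) = sum_{j=1}^n s^(j-1):
        #   k * a_k = sum_{j=1}^{min(k, n)} a_{k-j}
        a.append(sum(a[k - j] for j in range(1, min(k, n) + 1)) / k)
    return a


def p_n(n):
    """p_n = exp(-(1 + 1/2 + ... + 1/n)), the common value of P[T_n = k], k <= n."""
    return exp(-sum(1.0 / j for j in range(1, n + 1)))


coeffs = exp_series_coeffs(5, 12)   # a_k = 1 for k <= 5, and a_k <= 1 beyond
scaled = 1000 * p_n(1000)           # close to exp(-EULER_GAMMA)
```

With n = 5 the first six coefficients equal 1 exactly and the later ones are smaller (for instance $a_6 = 5/6$), in line with (3.15); for n = 1000 the product $np_n$ is within $10^{-3}$ of $e^{-\gamma}$.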

Lemma 3.3. For m = o(n), $\rho_{nm}(k/m) \to 1$, as $n \to \infty$, uniformly for k < cn for some c < 1.

Proof. By (3.13) and Lemma 3.2, it only remains to show that the numerator of $\rho_{nm}$ satisfies

(3.17)

By the Fourier inversion formula for the characteristic function of $T_n - T_m = \sum_{j=m+1}^{n} jN_j$,

(3.18)  $P[T_n - T_m = n-k] = (2\pi)^{-1}\int_0^{2\pi} f_{nm}(t)\,e^{-(n-k)it}\,dt$

where

(3.19)  $f_{nm}(t) = \exp\bigl\{-\sum_{j=m+1}^{n} j^{-1}(1 - e^{itj})\bigr\}$

is the characteristic function of $T_n - T_m$. Integration by parts shows that the right hand side of (3.18) is equal to

$(2(n-k)\pi)^{-1}\int_0^{2\pi}\bigl\{\sum_{j=1}^{n} e^{itj} - \sum_{j=1}^{m} e^{itj}\bigr\} f_{nm}(t)\,e^{-(n-k)it}\,dt.$

Since $p_n = P[T_n = n] = P[T_n = n-k]$ for all $k \le n$, then similarly one obtains

$p_n = (2(n-k)\pi)^{-1}\int_0^{2\pi}\sum_{j=1}^{n} e^{itj}\, g_n(t)\,e^{-(n-k)it}\,dt$

where

$g_n(t) = \exp\bigl\{-\sum_{j=1}^{n} j^{-1}(1 - e^{itj})\bigr\}$

is the characteristic function of $T_n$. Hence

_ (2~}-1 !2~ E~ ei t j f (t) -(n-k)it dto J=l nm e

$= (1-k/n)^{-1}\{P[T_n - T_m \le n-k-1] - P[T_n \le n-k-1] - P[n-m-k \le T_n - T_m \le n-k-1]\}$

$= (1-k/n)^{-1}\{P[T_n - T_m \le n-m-k-1] - P[T_n \le n-k-1]\}.$

Thus if m = o(n) and k = o(n), it follows from Lemma 3.1 that $W_n = n^{-1}T_n \xrightarrow{L} W$, a continuous r.v., and so the right hand side of (3.19), which is bounded for k bounded away from n, converges to 0 as $n \to \infty$. Thus

$\lim_{n\to\infty} nP[T_n - T_m = n-k] = \lim_{n\to\infty} np_n = e^{-\gamma}$

by Lemma 3.2. The convergence is uniform for k < cn when 0 < c < 1. This completes the proof. □

Lemma 3.4. For $m/n \to b \ge 0$ and $k/m \to x > 0$ with bx < 1,

uniformly in v over bounded intervals and k < cn for c < 1.

Proof. By (3.14) and Stirling's formula,

Now, when $0 < m^{1/2}v + k < n$,

$\log\Bigl\{\Bigl(1 - \frac{vm^{1/2}}{n-k}\Bigr)^{n-k} e^{m^{1/2}v}\Bigr\} = m^{1/2}v + (n-k)\log\Bigl(1 - \frac{vm^{1/2}}{n-k}\Bigr)$

$= m^{1/2}v + (n-k)\Bigl\{-\frac{vm^{1/2}}{n-k} - \frac{v^2m}{2(n-k)^2} - \cdots\Bigr\}$

$= -v^2m/2(n-k) - O(m^{3/2}(n-k)^{-2}),$

and the result follows.

This result, when k = m, was used in Le Cam (1958) and Pyke (1965, p. 410). In the present application, we have b = 0. In all cases, the result simply represents the known asymptotic behaviors of Beta and Gamma densities.

In both of the above results, Lemmas 3.3 and 3.4, the uniformity requires that k < cn for some c < 1. Since k is a sample value for $T_m$, the application of these results will require that for m = o(n),

$P[T_m > cn \mid T_n = n] \to 0$

as $n \to \infty$. Now

$P[T_m > cn \mid T_n = n] \le P[T_m > cn]/P[T_n = n].$

Since $nP[T_n = n] = np_n \to e^{-\gamma}$, it suffices to show that $nP[T_m > cn] \to 0$. But by Markov's inequality,

$nP[T_m > cn] \le nE(e^{tT_m})e^{-cnt} = ne^{-cnt}\exp\bigl\{\sum_{j=1}^{m} j^{-1}(e^{tj} - 1)\bigr\} \to 0$

if one chooses t = 1/m. Thus we have proved

Lemma 3.5. If m=o(n), then

(3.20)

We can now state and prove the main result.

Theorem 3.1. As $n \to \infty$,

(3.21)  $(3\log n)^{-1/2}(nL_n - n - \log n) \xrightarrow{L} U,$

a N(0,1) random variable.

Proof. In view of (3.5) and the discussion leading up to it, the statistic in (3.21) is equal in law to the conditional law of $U_n$ given $V_n = 0$ and $W_n = 1$. But for 0 < m < n,

$E[e^{itU_n} \mid V_n = 0, W_n = 1] = \varphi_{nm}(t) + R_{nm}(t)$

where

(3.22)  $R_{nm}(t) = E[e^{itU_m}(e^{it(U_n - U_m)} - 1) \mid V_n = 0, W_n = 1].$

The object of the proof will be to show that $\varphi_{nm}(t) \to \exp(-t^2/2)$ and $R_{nm}(t) \to 0$ as $n, m \to \infty$. We first study $\varphi_{nm}(t)$.

By the form for $\varphi_{nm}$ in (3.12), by the convergence in law of $(U_m, V_m, W_m)$ given in Lemma 3.1, and by the convergence of the ratios $\rho_{nm}$ and $r_{nm}$ given in Lemmas 3.3 and 3.4, it follows that if m = o(n), then

$\lim_{n\to\infty}\varphi_{nm}(t) = \lim_{n\to\infty} E[e^{itU_m}\rho_{nm}(W_m)\,r_{nm}(V_m, W_m)]$

(3.23)  $= E[e^{itU}\cdot 1\cdot 1] = e^{-t^2/2}.$

In this, the result of Lemma 3.5 is used to show that $E[e^{itU_m} I_{[T_m > cn]} \mid T_n = n, V_n = 0] \to 0$, while restricting the other computations to the event $[T_m < cn]$ on which the convergences in Lemmas 3.3 and 3.4 are uniform.

Consider now the second term $R_{nm}(t)$. In analogy to (3.12), the conditional defining relation in (3.22) is equivalent to

In view of the above derivation of (3.23), the convergence of $R_{nm}(t)$ to zero will be complete if we can show that $U_n - U_m \xrightarrow{P} 0$. For this, it will be necessary to be more specific about the choice of m. The only condition so far has been that m = o(n). In what follows we will need further to assume that $(\log m)/\log n \to 1$. (To see that this is possible, consider $\log m/\log n = 1 - 1/\log\log n$, for which m = o(n) and $\log m/\log n \to 1$.) To see that this suffices, set

$X_j = \sum_{i=1}^{N_j}(S_{ji} - j)^2/S_{ji}, \quad b_n = (3\log n)^{-1/2}$

so that by (3.5a)

$U_n - U_m = (b_n/b_m - 1)U_m + b_n\sum_{j=m+1}^{n} X_j - b_n\log(n/m).$

We have $EX_j = 1/(j-1)$. Therefore, write

$U_n - U_m = (b_n/b_m - 1)U_m + b_n\sum_{j=m+1}^{n}(X_j - (j-1)^{-1}) - b_n\bigl(\log(n/m) - \sum_{j=m}^{n-1} j^{-1}\bigr).$

Clearly the last term converges to 0. Since $U_m \xrightarrow{L} U$, the first term converges to 0 because $(b_n/b_m)^2 = (\log m)/\log n \to 1$. For the middle term, we use (3.6) to compute its variance to be

$b_n^2\sum_{k=m-1}^{n-2} 3/k + o(1) = (\log n)^{-1}\log(n/m) + o(1) = 1 - (\log m)/\log n + o(1) = o(1).$

This shows that $U_n - U_m \xrightarrow{P} 0$ as desired. The proof is complete. □
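The value $EX_j = 1/(j-1)$ used in the proof above can be verified directly. The short derivation below is a worked check, assuming only that $S_{ji}$ is $\Gamma(j,1)$ with $j \ge 2$, that $N_j$ is Poisson with mean $1/j$, and that $N_j$ is independent of the $S_{ji}$ (so the expectation of the random sum is the product of the two means).

```latex
\begin{align*}
E\!\left[\frac{(S_{ji}-j)^2}{S_{ji}}\right]
  &= E[S_{ji}] - 2j + j^2\,E[S_{ji}^{-1}]
   = j - 2j + \frac{j^2}{j-1}
   = \frac{j}{j-1}, \\
E X_j
  &= E(N_j)\,E\!\left[\frac{(S_{j1}-j)^2}{S_{j1}}\right]
   = \frac{1}{j}\cdot\frac{j}{j-1}
   = \frac{1}{j-1}.
\end{align*}
```

Summing over $j = m+1, \ldots, n$ gives $\sum_{j=m}^{n-1} j^{-1}$, which is the centering that appears in the last term of the decomposition of $U_n - U_m$.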

4. $L_2$-norm of slopes of convex minorants of truncated Brownian bridges.

We shall prove the following result.

Theorem 4.1. Let $B = \{B(t): t \in [0,1]\}$ be (standard) Brownian bridge on [0,1], let $B_{t,u} = B\cdot 1_{[t,u]}$, where $1_{[t,u]}$ is the indicator of the interval [t,u], and let $g_{t,u}$ be a version of the slope of the convex minorant of $B_{t,u}$ on (0,1). Then

(4.1)  $\bigl\{\int_0^1 g^2_{1/n,\,1-1/n}(u)\,du - \log n\bigr\}/\sqrt{3\log n} \xrightarrow{L} Z,$

where Z is a standard normal random variable.

Theorem 4.1 will be used in section 5 to derive the asymptotic

from Theorem 3.1 by using strong approximation of the empirical process by versions of Brownian bridges in Komlós et al. (1975).

The following class of functions will play a fundamental role in the sequel.

Definition 4.1. $\mathcal{M}$ is the set of right-continuous and nondecreasing step-functions $J: [0,1] \to \mathbb{R}$, which have only finitely many jumps and satisfy $\int_0^1 J(u)\,du = 0$ and $\int_0^1 J^2(u)\,du = 1$.

Notice that all functions $J \in \mathcal{M}$ satisfy the inequalities

(4.2)  $-u^{-1/2} \le J(u) \le (1-u)^{-1/2}, \quad u \in (0,1).$
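The bounds in (4.2) follow from monotonicity and the normalization $\int_0^1 J^2 = 1$ alone. The derivation below is a worked check of the upper bound under that reading; the lower bound is symmetric, using the interval $[0,u]$ instead of $[u,1]$.

```latex
% Upper bound: fix u in (0,1) and suppose J(u) > 0 (otherwise the bound is trivial).
% Since J is nondecreasing, J(v) >= J(u) > 0 for all v in [u,1], hence
\begin{align*}
1 = \int_0^1 J^2(v)\,dv \;\ge\; \int_u^1 J^2(v)\,dv \;\ge\; (1-u)\,J(u)^2
\quad\Longrightarrow\quad J(u) \le (1-u)^{-1/2}.
\end{align*}
% Lower bound: if J(u) < 0 then J(v) <= J(u) < 0 on [0,u], hence
\begin{align*}
1 \;\ge\; \int_0^u J^2(v)\,dv \;\ge\; u\,J(u)^2
\quad\Longrightarrow\quad J(u) \ge -u^{-1/2}.
\end{align*}
```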

The class $\mathcal{M}$ is also considered in Behnen (1975) and Scholz (1981) (with slight modifications). It can be used to give a convenient representation of the $L_2$-norm of the slope of the convex minorant of bounded real-valued functions on [0,1], which satisfy certain regularity conditions near the boundary of the interval [0,1]. The representation of the $L_2$-norm of the slope of the convex minorant by means of functions in $\mathcal{M}$ has been studied by F. Scholz, and the following lemma is a generalization of results in Scholz (1981).

Lemma 4.1. Let $G: [0,1] \to \mathbb{R}$ be a bounded function such that G(0) = G(1) = 0 and

(4.3)  $\lim_{t\downarrow 0} t^{-1/2-\delta}G(t) = \lim_{t\downarrow 0} t^{-1/2-\delta}G(1-t) = 0,$

for some $\delta > 0$. Then $\|\bar g\|_2 < \infty$ and

(4.4)  $\|\bar g\|_2 = -\inf_{J\in\mathcal{M}}\int_{(0,1)} G(u)\,dJ(u),$

where $\bar g$ is a version of the slope of the convex minorant $\bar G$ of G.

Proof. First suppose that G is a step-function which only has jumps at the points $t_1 < \cdots < t_n$, where $t_1 > 0$ and $t_n < 1$. It follows from (4.3) that in this case G(t) = 0, if $t < t_1$ or $t > t_n$. Since $\bar G \le G$, we have

(4.5)

Integration by parts and the Cauchy-Schwarz inequality give

(4.6)  $-\int_{(0,1)} \bar G\,dJ = \int_0^1 \bar g(u)J(u)\,du \le \|\bar g\|_2\,\|J\|_2 = \|\bar g\|_2.$

Since G(0) = G(1) = 0, we also have $\bar G(0) = \bar G(1) = 0$ and hence $\int_0^1 \bar g(u)\,du = 0$. Suppose $\|\bar g\|_2 > 0$. Without loss of generality we may take a right-continuous version $\bar g$ of the slope of $\bar G$ and in this case the function $\bar J = \bar g/\|\bar g\|_2$ belongs to $\mathcal{M}$. Hence the upper bound in (4.6) is attained for $J = \bar g/\|\bar g\|_2$. Combining (4.5) and (4.6) we get

(4.7)  $-\inf_{J\in\mathcal{M}}\int_{(0,1)} G\,dJ \le -\int_{(0,1)} \bar G\,d\bar J = \|\bar g\|_2.$

Let D be the set of discontinuity points of $\bar J$; then D is not empty, since otherwise $\bar G \equiv 0$ and hence $\|\bar g\|_2 = 0$. The set D is a subset of the set $\{t_1, \ldots, t_n\}$ of discontinuity points of G. Let $H: [0,1] \to \mathbb{R}$ be the function defined by

$H(t) = G(t-)\wedge G(t+)$, if $t \in D$ and $G(t) > G(t-)\wedge G(t+)$; $\quad H(t) = G(t)$, otherwise.

Then $H(t) = \bar G(t)$, if $t \in D$ and hence, since $\bar J$ is a step-function which only has jumps in D,

(4.8)  $\int_{(0,1)} H\,d\bar J = \int_{(0,1)} \bar G\,d\bar J.$

(4.9)

22..

It is clear that the integral 1(0,1) H dJ can be approximated arbitrarily

close by integrals 1(0,1) 6 dd, with JE M (move the points t, D, where

6(t) > 6(t-)" 6(t+) a bit to the right or left and consider functions J~ At

which have jumps of approximately the same height as J at the shifted

points instead of the original points). Relation (4.4) now follows from

(4.7) and (4.8).- - (If IIgI12=0, then 6=0, and hence 6~0. In this case 4.4) also holds,

since 1(0,1) 6 dJ = 0 for any function Jf'M such that J is constant on the

intervals [O,t) and [t,1), with tE(0,t1) .

Now consider an arbitrary bounded function 6:[0,1] +:R such that

G(t) = 0, if t~a or t~ l-a, where a E (O,~). Define for each n the intervals

[ - n ( ) - n) n [ - n ]. z"Ik,n by Ik,n= k2 , k+l 2 , k=0,1, .... ,2 -2, Ik,n= k2 ,1 ~ tf k= -1,

Gn(t) = inf E I G(u}, iftE Ik n ' k = 0,1, ... ,2n_l.

u k,n '

Fix $\varepsilon > 0$. Let $\mathcal{P}$ be the set of finitely discrete probability measures on $[0,1]$. Then, if $\tilde H$ is the convex minorant of a function $H:[0,1] \to \mathbb{R}$, we have for each $t \in [0,1]$,

$\tilde H(t) = \inf\{\int_{[0,1]} H(u)\,dP(u) : \int_{[0,1]} u\,dP(u) = t,\ P \in \mathcal{P}\}$

(see e.g. Rockafellar (1970), p. 36). Thus there exist positive constants $c_{1,n},\dots,c_{m(n),n}$ and points $t_{1,n},\dots,t_{m(n),n}$ belonging to $m(n)$ disjoint intervals $I_{k,n}$ such that, for each $n$ and fixed $t \in [0,1]$,

$\sum_{i=1}^{m(n)} c_{i,n} = 1$, $\sum_{i=1}^{m(n)} c_{i,n} t_{i,n} = t$ and $\tilde G_n(t) > \sum_{i=1}^{m(n)} c_{i,n} G_n(t_{i,n}) - \varepsilon$.

This implies that there are points $t'_{i,n}$, with $|t'_{i,n} - t_{i,n}| \le 2^{-n}$, such that

(4.9) $|t - \sum_{i=1}^{m(n)} c_{i,n} t'_{i,n}| \le 2^{-n}$ and $\tilde G_n(t) > \sum_{i=1}^{m(n)} c_{i,n} G(t'_{i,n}) - 2\varepsilon$

(let $t_{i,n}$ and $t'_{i,n}$ belong to the same interval $I_{k,n}$, and use the definition of $G_n$). The sequence $\{\tilde G_n\}$ is increasing and hence $\lim_{n\to\infty} \tilde G_n(t)$ exists (and is $\le 0$).

The convex minorant $\tilde G$ of $G$ is continuous on $[0,1]$, since $G$ is bounded on $[a,1-a]$ and zero outside this interval. Hence by (4.9), $\tilde G(t) \le \lim_{n\to\infty} \tilde G_n(t) + 2\varepsilon$.

We also have $\tilde G(t) \ge \tilde G_n(t)$, for all $n$, and thus $\lim_{n\to\infty} \tilde G_n(t) = \tilde G(t)$.

Since the sequence $\{\tilde G_n\}$ converges pointwise to $\tilde G$, the right-continuous slopes $\tilde g_n$ of $\tilde G_n$ converge to the right-continuous slope $\tilde g$ of $\tilde G$, except possibly at countably many points of $[0,1]$ (see e.g. Roberts & Varberg (1973), Problem C(9), p. 20). The functions $G_n$ and $G$ are uniformly bounded below on $(a,1-a)$ and zero outside this interval. This implies that the slopes $\tilde g_n$ and $\tilde g$ are uniformly bounded on $(0,1)$. Hence, by dominated convergence, $\lim_{n\to\infty} \|\tilde g_n - \tilde g\|_2 = 0$.

Choose $n_0$ such that $\|\tilde g_n\|_2 > \|\tilde g\|_2 - \varepsilon$, for $n \ge n_0$. By the first part of the proof there exists for each $n$ a step-function $J_n \in \mathcal{M}$, such that

$-\int_{(0,1)} G_n\,dJ_n > \|\tilde g_n\|_2 - \varepsilon$,

where the points of jump of $J_n$, say $u_{1,n},\dots,u_{p(n),n}$, belong to disjoint intervals $I_{k,n}$ and are contained in $[a,1-a]$. By the definition of $G_n$ there exist points $u'_{1,n},\dots,u'_{p(n),n}$ such that $G(u'_{i,n}) < G_n(u_{i,n}) + \varepsilon$ and $u_{i,n}$ and $u'_{i,n}$ belong to the same interval $I_{k,n}$. Furthermore, let $J'_n$ be the right-continuous step-function which has the same jumps as $J_n$, but at the points $u'_{i,n}$ instead of $u_{i,n}$ (note that in general $J'_n \notin \mathcal{M}$). Then, by (4.2), we have for $n \ge n_0$,

$-\int_{(0,1)} G\,dJ'_n > -\int_{(0,1)} G_n\,dJ_n - 2\varepsilon a^{-1/2} > \|\tilde g\|_2 - 2\varepsilon - 2\varepsilon a^{-1/2}$.

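It is also clear from (4.2) and the definition of the points $u'_{i,n}$ that

$|\int_0^1 J'_n(u)\,du| = |\int_0^1 J'_n(u)\,du - \int_0^1 J_n(u)\,du| \le 2^{-n+1} a^{-1/2}$

and

$|\int_0^1 (J'_n(u))^2\,du - 1| = |\int_0^1 (J'_n(u))^2\,du - \int_0^1 J_n^2(u)\,du| \le 2^{-n+1} a^{-1}$.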

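Thus, for $n$ sufficiently large we can find a $J''_n \in \mathcal{M}$, obtained from $J'_n$ by making slight adjustments of mass, which satisfies

$-\int_{(0,1)} G\,dJ''_n > \|\tilde g\|_2 - 3\varepsilon - 2\varepsilon a^{-1/2}$.

Therefore $-\inf_{J \in \mathcal{M}} \int_{(0,1)} G\,dJ \ge \|\tilde g\|_2$. Since $-\inf_{J \in \mathcal{M}} \int_{(0,1)} G\,dJ \le -\inf_{J \in \mathcal{M}} \int_{(0,1)} G_n\,dJ = \|\tilde g_n\|_2$, for each $n$, relation (4.4) now follows.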

Finally, let $G$ be an arbitrary bounded function, such that $G(0) = G(1) = 0$ and (4.3) is satisfied. By (4.3) and the boundedness of $G$ there exists a constant $c > 0$, such that

(4.10) $|G(t)| \le c \cdot \min\{t^{1/2+\delta}, (1-t)^{1/2+\delta}\}$, $t \in [0,1]$.

Thus, if $\tilde g$ is the right-continuous slope of the convex minorant $\tilde G$ of $G$, we have


(4.11)

This implies

(4.12)

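Define for each $t \in (0,\tfrac12)$ the function $G_t$ by

$G_t(u) = G(u)$ if $u \in [t,1-t]$, and $G_t(u) = 0$ otherwise.

By (4.10), (4.2) and integration by parts, we have for all $J \in \mathcal{M}$

(4.13) $|\int_{(0,1)} G\,dJ - \int_{(0,1)} G_t\,dJ| \le c\int_{(0,t)} u^{1/2+\delta}\,dJ(u) + c\int_{(1-t,1)} (1-u)^{1/2+\delta}\,dJ(u) \le c'\,t^{\delta}$.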

Let $H_n = G_{1/n}$ and let $\tilde H_n$ be the convex minorant of $H_n$. The sequence $\{H_n\}$ converges uniformly to $G$, as $n \to \infty$, and hence, by the same argument as used above, $\{\tilde H_n\}$ converges pointwise to $\tilde G$. By (4.11) and (4.12) the right-continuous slopes $\tilde h_n$ of $\tilde H_n$ are uniformly bounded (in absolute value) by an $L_2$-function $f$ of the form $f(u) = k \cdot \min\{u^{-1/2+\delta}, (1-u)^{-1/2+\delta}\}$, $u \in (0,1)$, where $k$ is some positive constant. Since a similar bound holds for $\tilde g$, we have by dominated convergence

(4.14) $\lim_{n\to\infty} \|\tilde h_n - \tilde g\|_2 = 0$.

Thus $\lim_{n\to\infty} \|\tilde h_n\|_2 = \|\tilde g\|_2$ and by (4.13),

(4.15) $\lim_{n\to\infty} \|\tilde h_n\|_2 = \lim_{n\to\infty} \{-\inf_{J \in \mathcal{M}} \int_{(0,1)} G_{1/n}\,dJ\} = -\inf_{J \in \mathcal{M}} \int_{(0,1)} G\,dJ$.

The result now follows from (4.14) and (4.15). □
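The identity (4.4) can be checked numerically in a discretized setting. The sketch below treats $G$ as known only at finitely many grid points with $G(0) = G(1) = 0$, computes the convex minorant as a lower convex hull, and evaluates $-\int G\,dJ$ for the step function $J = \tilde g/\|\tilde g\|_2$. It assumes that $\mathcal{M}$ consists of nondecreasing right-continuous $J$ with $\int_0^1 J = 0$ and $\int_0^1 J^2 = 1$ (the definition is not restated in this section), and all function names are hypothetical.

```python
import math

def lower_hull(points):
    """Vertices of the convex minorant of points with strictly increasing x."""
    hull = []
    for px, py in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop the middle vertex if it lies on or above the chord to the new point.
            if (y2 - y1) * (px - x1) >= (py - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append((px, py))
    return hull

def segment_slopes(hull):
    return [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(hull, hull[1:])]

def slope_l2_norm(hull):
    """L2 norm on [0,1] of the piecewise constant slope of the hull."""
    return math.sqrt(sum(s * s * (b[0] - a[0])
                         for s, a, b in zip(segment_slopes(hull), hull, hull[1:])))

def dual_value(hull):
    """-integral of G dJ for J = slope / ||slope||_2 (assumes the norm is positive).

    J jumps only at interior hull vertices, where G equals its convex minorant.
    """
    norm = slope_l2_norm(hull)
    slopes = segment_slopes(hull)
    return -sum(hull[j][1] * (slopes[j] - slopes[j - 1]) / norm
                for j in range(1, len(hull) - 1))
```

For grid values such as `[0, -0.3, 0.1, -0.5, -0.2, -0.6, 0.2, -0.1, 0]` on the points $i/8$, the two quantities `dual_value(hull)` and `slope_l2_norm(hull)` agree up to rounding, which is the discrete form of summation by parts behind (4.4).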


Remark 4.1. It is clear that condition (4.3) can be somewhat weakened; we chose (4.3) mainly for convenience.

Proof of Theorem 4.1. Let $U_n$ be the empirical process defined by $U_n(t) = \sqrt{n}(F_n(t) - t)$, $t \in [0,1]$, where $F_n$ is the empirical df of a sample of size $n$ from the uniform distribution on $[0,1]$. With probability one all observations are contained in the open interval $(0,1)$ and hence $U_n$ satisfies almost surely the conditions of Lemma 4.1. Let $\tilde u_n$ be a version of the slope of the convex minorant $\tilde U_n$ of $U_n$. Then, by Lemma 4.1,

$\|\tilde u_n\|_2 = -\inf_{J \in \mathcal{M}} \int_{(0,1)} U_n\,dJ$.

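Fix $\varepsilon > 0$ and let $a_n = (\log n)^4/n$, $b_n = 1-a_n$. There exist $\delta > 0$ and $M > 0$ such that $P[\sup_{t \in (0,1)} |U_n(t)|/\sqrt{t(1-t)} \ge M\sqrt{\log_2 n}] < \varepsilon$ for all $n \ge 3$, where $\log_2 n = \log\log n$ (this follows from the law of the iterated logarithm for the empirical process). If $|U_n(t)| \le M\sqrt{t\log_2 n}$ and $J \in \mathcal{M}$, we have

$\int_{[\delta/n,a_n]} |U_n|\,dJ \le M\sqrt{\log_2 n}\int_{[\delta/n,a_n]} \sqrt{t}\,dJ(t) \le M\{(\log_2 n)(\log(na_n/\delta))\}^{1/2} \le c \cdot \log_2 n$,

for some constant $c$ independent of $n$. A similar upper bound holds for

$\int_{[b_n,1-\delta/n]} |U_n|\,dJ$.

Since $\sup_{J \in \mathcal{M}} \int_{(0,\delta/n]} t\,dJ(t) \to 0$, and similarly $\sup_{J \in \mathcal{M}} \int_{[1-\delta/n,1)} (1-t)\,dJ(t) \to 0$, as $n \to \infty$, there exists a constant $k$ such that, for all large $n$, $P[\sup_{J \in \mathcal{M}} \int_{(0,a_n] \cup [b_n,1)} |U_n|\,dJ \ge k\log_2 n] < 2\varepsilon$.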

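By Theorem 3.1 and Lemma 4.1,

$(\{\inf_{J \in \mathcal{M}} \int_{(0,1)} U_n\,dJ\}^2 - \log n)/\sqrt{3\log n} \xrightarrow{L} Z$,

where $Z$ is a standard normal random variable. Furthermore, since

$\inf_{J \in \mathcal{M}} \int_{(0,1)} U_n\,dJ - \inf_{J \in \mathcal{M}} \int_{(a_n,1-a_n)} U_n\,dJ = O(\log_2 n)$

on a set of probability $> 1-2\varepsilon$, and since $\varepsilon > 0$ was arbitrarily chosen, we have

(4.16) $(\{\inf_{J \in \mathcal{M}} \int_{(a_n,b_n)} U_n\,dJ\}^2 - \log n)/\sqrt{3\log n} \xrightarrow{L} Z$.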


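By Komlós et al. (1975), there are versions of Brownian bridges $B_n$ such that $\sup_{t \in (0,1)} |U_n(t) - B_n(t)| = O((\log n)/\sqrt{n})$ with probability one. Hence, by (4.2),

$|\inf_{J \in \mathcal{M}} \int_{(a_n,b_n)} U_n\,dJ - \inf_{J \in \mathcal{M}} \int_{(a_n,b_n)} B_n\,dJ| = O(\sup_{J \in \mathcal{M}} \int_{(a_n,b_n)} n^{-1/2}\log n\,dJ) \le O(1/\log n)$,

almost surely. This implies

(4.17) $(\{\inf_{J \in \mathcal{M}} \int_{(a_n,b_n)} B_n\,dJ\}^2 - \log n)/\sqrt{3\log n} \xrightarrow{L} Z$.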

By the law of the iterated logarithm for Brownian bridge there exists a constant $k > 0$ such that

$\sup_{J \in \mathcal{M}} \int_{[1/n,a_n]} |B_n|\,dJ < k(\log_2 n)^{-1/2} \int_{[1/n,a_n]} t^{-3/2}|B_n(t)|\,dt = O(\log_2 n)$

and similarly $\sup_{J \in \mathcal{M}} \int_{[b_n,1-1/n]} |B_n|\,dJ = O(\log_2 n)$ on a set of probability $> 1-\varepsilon$.

Thus we can replace $a_n$ by $1/n$ and $b_n$ by $1-1/n$ in (4.17). By Lemma 4.1 we have $-\inf_{J \in \mathcal{M}} \int_{[1/n,1-1/n]} B_n\,dJ = \|g_{1/n,1-1/n}\|_2$, where $g_{1/n,1-1/n}$ is a version of the slope of the convex minorant of $B_n \cdot 1_{[1/n,1-1/n]}$. Since the distribution of $\|g_{1/n,1-1/n}\|_2$ will be the same for any version of the Brownian bridge $B_n$, the result now follows. □
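For a concrete sample the quantity $\|\tilde u_n\|_2^2$ can be computed directly. Since subtracting the linear function $t$ commutes with taking convex minorants, $\tilde u_n = \sqrt{n}(\tilde f_n - 1)$ with $\tilde f_n$ the slope of the convex minorant of $F_n$, so $\|\tilde u_n\|_2^2 = n\int_0^1 (\tilde f_n(t) - 1)^2\,dt$. The following is a minimal sketch under the assumption that the observations are distinct and lie in the open interval $(0,1)$; the function names are hypothetical.

```python
import math

def lower_hull(points):
    """Vertices of the convex minorant of points with strictly increasing x."""
    hull = []
    for px, py in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop the middle vertex if it lies on or above the chord to the new point.
            if (y2 - y1) * (px - x1) >= (py - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append((px, py))
    return hull

def one_sample_statistic(sample):
    """n * integral of (slope of convex minorant of F_n - 1)^2 over [0,1]."""
    n = len(sample)
    xs = sorted(sample)
    # F_n equals (i-1)/n just to the left of the i-th order statistic, so the
    # convex minorant is the lower hull of these points together with (0,0), (1,1).
    pts = [(0.0, 0.0)] + [(x, i / n) for i, x in enumerate(xs)] + [(1.0, 1.0)]
    hull = lower_hull(pts)
    total = 0.0
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        slope = (y2 - y1) / (x2 - x1)
        total += (slope - 1.0) ** 2 * (x2 - x1)
    return n * total

def standardize(stat, n):
    """Centre at log n and scale by sqrt(3 log n); requires n >= 2."""
    return (stat - math.log(n)) / math.sqrt(3.0 * math.log(n))
```

Applying `standardize(one_sample_statistic(sample), len(sample))` to repeated uniform samples gives a Monte Carlo view of the normal approximation discussed in the text.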

5. Asymptotic normality of a statistic proposed by Behnen.

Let $X_1,\dots,X_m$ and $Y_1,\dots,Y_n$ be two independent samples from a uniform distribution on $[0,1]$, let $F_m$ ($G_n$) be the empirical df of the first (second) sample and let $H_N$ be the empirical df of the combined sample. With probability one, all observations in the combined sample are different and contained in the open interval $(0,1)$. Thus, on a set of probability one, we can define the inverse $H_N^{-1}$ of $H_N$ as the right-continuous df such that $H_N(H_N^{-1}(k/N)) = k/N$ and

$H_N^{-1}(u) = H_N^{-1}(k/N)$, $k/(N+1) \le u < (k+1)/(N+1)$, $k = 0,\dots,N$.

In the sequel we will restrict our attention to the set where $H_N^{-1}$ is well-defined and we shall omit the expression "with probability one". We define the (random) dfs $\tilde F_m$ and $\tilde G_n$ by

$\tilde F_m = F_m \circ H_N^{-1}$ and $\tilde G_n = G_n \circ H_N^{-1}$.

Note that by our definition of $H_N^{-1}$ these dfs are right-continuous.


Behnen (1975) considered the statistic

(5.1) $T_N = \sup_{J \in \mathcal{M}} \int_{(0,1)} J(u)\,d\tilde F_m(u)$

(actually he considered slightly different versions, but this will make no difference for the limiting behavior). By integration by parts and Lemma 4.1 it is seen that

(5.2) $T_N = -\inf_{J \in \mathcal{M}} \int_{(0,1)} (\tilde F_m(u) - u)\,dJ(u) = \|\tilde f_{m,N} - 1\|_2$,

where $\tilde f_{m,N}$ is a version of the slope of the convex minorant of $\tilde F_m$. Let

(5.3) $L_N(t) = (1-\lambda_N)\{\lambda_N^{-1/2} U_m(H_N^{-1}(t)) - (1-\lambda_N)^{-1/2} V_n(H_N^{-1}(t))\}$,

for all $t \in [0,1]$, where $U_m(u) = \sqrt{m}(F_m(u) - u)$, $V_n(u) = \sqrt{n}(G_n(u) - u)$ and $\lambda_N = m/N$.

Then

$\inf_{J \in \mathcal{M}} \int_{(0,1)} \sqrt{N}(\tilde F_m(t) - t)\,dJ(t) = \inf_{J \in \mathcal{M}} \int_{(0,1)} L_N\,dJ + O(1)$,

since $|\sqrt{N}(\tilde F_m(t) - t) - L_N(t)| = \sqrt{N}\,|H_N(H_N^{-1}(t)) - t| \le N^{-1/2}$, $t \in [0,1]$,

(cf. Pyke & Shorack (1968), Lemma 3.1, p. 762, but note that our definition of $H_N^{-1}$ is different; in particular $|H_N(H_N^{-1}(t)) - t| = t \wedge (1-t)$, if $t \wedge (1-t) < (N+1)^{-1}$).

To obtain the limiting behavior of $\|\tilde f_{m,N} - 1\|_2$, with $\tilde f_{m,N}$ as in (5.2), we compare $L_N$ with the corresponding functional for Brownian bridges $B_m$ and $B'_n$:

(5.4) $\bar L_N(t) = (1-\lambda_N)\{\lambda_N^{-1/2} B_m(H_N^{-1}(t)) - (1-\lambda_N)^{-1/2} B'_n(H_N^{-1}(t))\}$.

Lemma 5.1. Let $a_N = (\log N)^4/N$, $b_N = 1-a_N$ and let $U_m$, $V_n$, $B_m$ and $B'_n$ be independent versions of empirical processes and Brownian bridges respectively, such that almost surely,

$\sup_{t \in (0,1)} |U_m(t) - B_m(t)| = O(\log m/\sqrt{m})$

and

$\sup_{t \in (0,1)} |V_n(t) - B'_n(t)| = O(\log n/\sqrt{n})$,

as $m,n \to \infty$. Then, if $\lambda_N$ is bounded away from 0 and 1, as $N \to \infty$, we have

(5.5) $\inf_{J \in \mathcal{M}} \int_{[a_N,b_N]} L_N\,dJ - \inf_{J \in \mathcal{M}} \int_{[a_N,b_N]} \bar L_N\,dJ = O(1/\log N)$,

with probability one, as $N \to \infty$.

Proof. Note that $\sup_{t \in (0,1)} |U_m(H_N^{-1}(t)) - B_m(H_N^{-1}(t))| \le \sup_{t \in (0,1)} |U_m(t) - B_m(t)|$, with a similar relation for $V_n \circ H_N^{-1} - B'_n \circ H_N^{-1}$. The rest of the proof follows exactly the same pattern as the argument in the proof of Theorem 4.1. □

Lemma 5.2. Let $a_N = (\log N)^4/N$ and $b_N = 1-a_N$. Then, if $B$ is a Brownian bridge on $[0,1]$, we have

$\sup_{J \in \mathcal{M}} \int_{[a_N,b_N]} |B(H_N^{-1}(t)) - B(t)|\,dJ(t) \to 0$,

in probability, as $N \to \infty$.

Proof. Fix $\varepsilon > 0$. There exist $\delta > 0$ and $M_1 > 0$, such that

$P[\sup_{0<s<t<1} |B(t) - B(s)|/\sqrt{2(t-s)\log(1/(t-s))} \ge M_1] < \varepsilon$.

(This follows from Itô and McKean (1974), p. 36, formula 1).) By the law of the iterated logarithm for the sample quantile process there exist $M_2 > 0$ and $N_0 = N_0(\varepsilon)$ such that

$P[\sup_{t \in (0,1)} |H_N^{-1}(t) - t|/\sqrt{t(1-t)} \ge M_2((\log_2 N)/N)^{1/2}] < \varepsilon$.

Thus there exists $M_3 > 0$, such that

$P[\sup_{t \in [a_N,b_N]} |B(H_N^{-1}(t)) - B(t)|/(t(1-t))^{1/4} \ge M_3(\log N)^{1/2}(N^{-1}\log_2 N)^{1/4}] < \varepsilon$.

By the Euler equation, applied on smooth $J$ such that $\int_0^1 J^2(u)\,du = 1$, it is seen that

$\sup_{J \in \mathcal{M}} \int_{[a_N,b_N]} (t(1-t))^{1/4}\,dJ(t) \le k_1 N^{-1/4}(\log N)\int_{a_N}^{b_N} (t(1-t))^{-3/2}\,dt \le k_2 N^{-1/4}(\log N)N^{1/2}/\log^2 N = k_2 N^{1/4}/\log N$,

for some positive constants $k_1$ and $k_2$. Thus there exists an $M_4 > 0$, such that

$P[\sup_{J \in \mathcal{M}} \int_{[a_N,b_N]} |B(H_N^{-1}(t)) - B(t)|\,dJ(t) \ge M_4(\log N)^{-1/2}(\log_2 N)^{1/4}] < 2\varepsilon$,

and the result follows. □

Now, using the same notation as in Lemma 5.1, we define

(5.6) $L_{N0}(t) = (1-\lambda_N)\{\lambda_N^{-1/2} B_m(t) - (1-\lambda_N)^{-1/2} B'_n(t)\}$.

By Theorem 4.1 we have

(5.7) $\{(\inf_{J \in \mathcal{M}} (\lambda_N/(1-\lambda_N))^{1/2} \int_{[a_N,b_N]} L_{N0}\,dJ)^2 - \log m\}/\sqrt{3\log m} \xrightarrow{L} Z$,

with $Z$ standard normal, if $\lambda_N$ stays bounded away from 0 and 1, as $N \to \infty$. To see this, note that $L_{N0}$ again represents a Brownian bridge (as a sum of two independent Brownian bridges), but that the variance is $(1-\lambda_N)/\lambda_N$ times the variance of the standard Brownian bridge on $[0,1]$. Furthermore, it was shown in the proof of Theorem 4.1 that replacing $[a_N,b_N]$ by $[1/N,1-1/N]$ leads to the same limiting (normal) distribution.

The asymptotic (standard) normality of the statistic

$\{(N\lambda_N/(1-\lambda_N))T_N^2 - \log m\}/\sqrt{3\log m}$,

with $T_N$ defined by (5.2) (or, equivalently, (5.1)), will now follow if we can show that

$\sup_{J \in \mathcal{M}} \int_{(0,a_N]} \sqrt{m}\,\tilde F_m\,dJ/\sqrt{\log m}$ and $\sup_{J \in \mathcal{M}} \int_{[b_N,1)} \sqrt{m}\,(1-\tilde F_m)\,dJ/\sqrt{\log m}$

tend to zero in probability (with a similar statement for the functional with $\tilde F_m$ replaced by $\tilde G_n$). First, by our definition of $H_N^{-1}$, we have $\tilde F_m(t) = 0$, if $t < (N+1)^{-1}$. Second, for fixed $\varepsilon > 0$, there exists $b = b(\varepsilon)$ such that

$P[\tilde F_m(t) \le F_m(bt), \text{ all } t \in [0,1]] \ge 1-\varepsilon$

(see Lemma 2.5, p. 761, Pyke & Shorack (1968); our interval for $t$ is $[0,1]$ rather than $[1/N,1]$, because of our definition of $H_N^{-1}$). There exists $M > 0$ such that $P[\sup_{t \in (0,1)} |U_m(bt)|/\sqrt{t} \ge M\sqrt{\log_2 m}] < \varepsilon$, for all large $m$. Thus,

$P[\sup_{J \in \mathcal{M}} \int_{[1/(N+1),a_N]} \sqrt{m}\,\tilde F_m\,dJ \ge k\log_2 m] < \varepsilon$, if $m$ is large,

for some constant $k > 0$ (see the proof of Theorem 4.1). Similar arguments hold for $\int_{[b_N,1)} \sqrt{m}\,(1-\tilde F_m)\,dJ$. We have proved

Theorem 5.1. Let $T_N = \sup_{J \in \mathcal{M}} \int_{(0,1)} J(u)\,d\tilde F_m(u)$. Then $T_N = \|\tilde f_{m,N} - 1\|_2$, where $\tilde f_{m,N}$ is a version of the slope of the convex minorant of $F_m \circ H_N^{-1}$, and the statistic $\{(N\lambda_N/(1-\lambda_N))T_N^2 - \log m\}/\sqrt{3\log m}$ tends in law to a standard normal distribution, if $\lambda_N$ stays bounded away from 0 and 1, as $N \to \infty$.
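The two-sample statistic of Theorem 5.1 can be computed from the ranks alone. The sketch below builds $\tilde F_m = F_m \circ H_N^{-1}$ with the convention of this section (constant on $[k/(N+1),(k+1)/(N+1))$), takes the lower convex hull of its left limits, and returns $T_N^2 = \|\tilde f_{m,N} - 1\|_2^2$ together with the standardized value. It assumes no ties in the pooled sample and $m \ge 2$; the function names are hypothetical.

```python
import math

def lower_hull(points):
    """Vertices of the convex minorant of points with strictly increasing x."""
    hull = []
    for px, py in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop the middle vertex if it lies on or above the chord to the new point.
            if (y2 - y1) * (px - x1) >= (py - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append((px, py))
    return hull

def two_sample_t_squared(xs, ys):
    """Squared L2 distance from 1 of the slope of the convex minorant of F_m o H_N^{-1}."""
    m, big_n = len(xs), len(xs) + len(ys)
    pooled = sorted([(x, 1) for x in xs] + [(y, 0) for y in ys])
    # ranks_df[k] = F_m at the k-th pooled order statistic (ranks_df[0] = 0).
    ranks_df, count = [0.0], 0
    for _, from_first_sample in pooled:
        count += from_first_sample
        ranks_df.append(count / m)
    # The left limit of the step function at k/(N+1) is ranks_df[k-1].
    pts = [(0.0, 0.0)] + [(k / (big_n + 1), ranks_df[k - 1])
                          for k in range(1, big_n + 2)]
    hull = lower_hull(pts)
    return sum(((y2 - y1) / (x2 - x1) - 1.0) ** 2 * (x2 - x1)
               for (x1, y1), (x2, y2) in zip(hull, hull[1:]))

def standardized_two_sample(xs, ys):
    """{(N lam/(1-lam)) T_N^2 - log m} / sqrt(3 log m), with lam = m/N and m >= 2."""
    m, big_n = len(xs), len(xs) + len(ys)
    lam = m / big_n
    scaled = big_n * lam / (1.0 - lam) * two_sample_t_squared(xs, ys)
    return (scaled - math.log(m)) / math.sqrt(3.0 * math.log(m))
```

Only the ordering of the pooled observations enters, so the result is invariant under any strictly increasing transformation applied to both samples.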


6. Concluding Remarks

Both limit theorems involve non-negative random variables, namely,

square L2-norms. As such, one possible guide to the rate of convergence

is the sample size required before zero is 3 standard deviations from the

mean under the approximating Normal distribution. In the one-sample case,

this requires $\log n = 3(3\log n)^{1/2}$ or $n > 5 \times 10^{11}$. For 2 standard

deviations, one requires $n \ge 162{,}755$. The results are similar for the

2-sample statistic. By this, one sees the extreme slowness of the convergence

of the squared norms. It is possible, however, to

find functions of the statistics for which the convergence is much improved.

Behnen (1974) used the L2-norm itself, that is, the square-root transformation,

for his Monte Carlo simulations. Here, the asymptotic variance is constant

and the corresponding sample sizes are 854 and 20, respectively.
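The sample sizes quoted in this section can be reproduced with elementary arithmetic. For the squared norm, zero lies $k$ standard deviations below the mean when $\log n = k\sqrt{3\log n}$, that is, $\log n = 3k^2$. For the square-root transformation the sketch assumes a delta-method approximation (our reading of "the asymptotic variance is constant", not a derivation given in the text): the standard deviation is about $\sqrt{3}/2$ around the mean $\sqrt{\log n}$, so the condition becomes $\log n = 3k^2/4$. The function names are hypothetical.

```python
import math

def n_required_squared_norm(k):
    """Solve log n = k * sqrt(3 log n), i.e. log n = 3 k^2."""
    return math.exp(3.0 * k * k)

def n_required_sqrt_transform(k):
    """Delta-method assumption: sd of sqrt(X) is sqrt(3)/2 when X ~ N(log n, 3 log n).

    Zero is k standard deviations from sqrt(log n) when log n = 3 k^2 / 4.
    """
    return math.exp(3.0 * k * k / 4.0)

if __name__ == "__main__":
    for k in (3, 2):
        print(k, n_required_squared_norm(k), n_required_sqrt_transform(k))
```

Under these assumptions $k = 3$ and $k = 2$ give $e^{27}$ and $e^{12}$ for the squared norm, and $e^{27/4}$ and $e^{3}$ for the square root, which round to the figures stated above.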

Monte Carlo simulations of sample sizes n = 4(1)10 (20,000 replications)

and 50 (5,000 replications) for the log transformation have been carried out

by Scholz (personal communication). They indicate tails that are still too

heavy for n =50. Behnen (1974) had earlier provided simulations for the

two-sample statistic for selected sample sizes up to m=n =100. Although

the convergence is slow, the fit was sufficiently close to suggest the

asymptotic normality of the statistic.

It is possible to generalize the representation approach used for

Theorem 3.1 to obtain an alternate proof of the two-sample result, Theorem

5.1. The only difficulty is in defining a suitable 'randomization' of the

coincidences that can now occur in order that the resultant distribution of

heights remain the same as in (2.2). The coincidences enter because $\tilde F_m$,

unlike $F_m$, has its jumps occurring at the equi-distant points $\{i/N\}$.

One approach is to affix small (continuous) random perturbations to these

points to prevent ties among the slopes of the segments of the concave

majorant without changing significantly the value of the statistic. Once

this is done, one uses Negative Binomial rather than Gamma random variables

for the $\{S_{j,i}\}$.


Acknowledgement

The authors are grateful to Dr. F.W. Scholz of Boeing Computer

Services for introducing us to the problem and some of the relevant

literature. We also appreciate the extensive computations which he

provided during our research.


REFERENCES

[1] Behnen, K. (1974). Güteeigenschaften von Rangtests unter Bindungen. Habilitationsschrift, University of Freiburg.

[2] Behnen, K. (1975). The Randles-Hogg test and an alternative proposal. Comm. Statist. 4, 203-238.

[3] Feller, W. (1968). An Introduction to Probability Theory and Its Applications. Third Edition. John Wiley and Sons, New York.

[4] Grenander, U. (1956). On the theory of mortality measurement. Part II. Skand. Akt. 39, 125-153.

[5] Groeneboom, P. (1981). The concave majorant of Brownian Motion. Tech. Report No. 6, Dept. of Statistics, University of Washington, Seattle.

[6] Hobby, C. and Pyke, R. (1963). Combinatorial results in fluctuation theory. Ann. Math. Statist. 34, 1233-1242.

[7] Itô, K. and McKean, H.P., Jr. (1974). Diffusion processes and their sample paths. 2nd Ed. Springer Verlag, Berlin.

[8] Komlós, J., Major, P. and Tusnády, G. (1975). An approximation of partial sums of independent r.v.'s and the sample d.f. Z. Wahr. v. Geb. 32, 111-131.

[9] LeCam, L. (1958). Un théorème sur la division d'un intervalle par des points pris au hasard. Pub. Inst. Statist. Univ. Paris 7, 7-16.

[10] Loève, M. (1963). Advanced Probability. Third Edition, Van Nostrand, New York.

[11] Pyke, R. (1965). Spacings. J. R. Statist. Soc. (B), 27, 395-449.

[12] Pyke, R. and Shorack, G.R. (1968). Weak convergence of a two-sample empirical process and a new approach to Chernoff-Savage theorems. Ann. Math. Statist. 39, 755-771.

[13] Randles, R.H. and Hogg, R.V. (1973). Adaptive distribution-free tests. Comm. Statist. 2, 337-356.

[14] Roberts, A.W. and Varberg, D.E. (1973). Convex Functions. Academic Press, New York.

[15] Rockafellar, R.T. (1970). Convex Analysis. Princeton University Press, Princeton.

[16] Scholz, F.W. (1980). Towards a unified definition of maximum likelihood. Can. J. Statist. 8, No. 2.

[17] Scholz, F.W. (1981). Combining independent p-values. In preparation.

[18] Sparre Andersen, E. (1954). On the fluctuations of sums of random variables II. Math. Scand. 2, 195-223.

[19] Spitzer, F. (1956). A combinatorial lemma and its application to probability theory. Trans. Amer. Math. Soc. 82, 323-339.
