
Asymptotic stochastics

Winter term 2016/2017

Norbert Henze, Institute of Stochastics

Contents

1. Basic facts from probability theory
2. A Poisson limit theorem for triangular arrays
3. The method of moments
4. A CLT for stationary m-dependent sequences
5. The multivariate normal distribution
6. Convergence in distribution and CLT in R^d
7. Empirical distribution functions
8. Limit theorems for U-statistics
9. Basic concepts of asymptotic estimation theory
10. Asymptotic properties of maximum likelihood estimators
11. Asymptotic (relative) efficiency of estimators
12. Asymptotic tests in parametric models
13. Probability measures on metric spaces

14. Weak convergence in metric spaces
15. Convergence in distribution
16. Relative compactness and tightness
17. Weak convergence and tightness in C
18. Wiener measure, Donsker's theorem
19. Brownian bridge, Wiener process on [0,∞)
20. The space D[0,1]
21. Empirical processes: applications to statistics
22. Gaussian distributions in separable Hilbert spaces
23. The central limit theorem in separable Hilbert spaces
24. Statistical applications: weighted L2-statistics

1 Basic facts from probability theory

Let $X, X_1, X_2, \ldots$ be real-valued r.v.'s on some probability space $(\Omega, \mathcal{A}, P)$.

1.1 Definition (Almost sure convergence)

$X_n \xrightarrow{a.s.} X \;:\Longleftrightarrow\; P\left( \{ \omega \in \Omega : \lim_{n\to\infty} X_n(\omega) = X(\omega) \} \right) = 1.$

1.2 Theorem (Characterization of almost sure convergence)

$X_n \xrightarrow{a.s.} X \;\Longleftrightarrow\; \lim_{n\to\infty} P\left( \sup_{k \ge n} |X_k - X| > \varepsilon \right) = 0 \quad \forall\, \varepsilon > 0.$

1.3 Definition (Convergence in probability)

$X_n \xrightarrow{P} X \;:\Longleftrightarrow\; \lim_{n\to\infty} P(|X_n - X| > \varepsilon) = 0 \quad \forall\, \varepsilon > 0.$

1.4 Theorem (Characterization of convergence in probability)

$X_n \xrightarrow{P} X \;\Longleftrightarrow\;$ each subsequence $(X_{n_k})$ contains a further subsequence $(X_{n'_k})$ such that $X_{n'_k} \xrightarrow{a.s.} X$ as $n'_k \to \infty$.

1.5 Generalization to random vectors

Let $X =: (X^{(1)}, \ldots, X^{(d)})$, $X_n = (X_n^{(1)}, \ldots, X_n^{(d)})$, $n \ge 1$, be $d$-dimensional random vectors. Then:

$X_n \xrightarrow{a.s.} X$ is defined without change,

$X_n \xrightarrow{P} X \;:\Longleftrightarrow\; \lim_{n\to\infty} P(\|X_n - X\| > \varepsilon) = 0 \;\forall\, \varepsilon > 0$, where $\|\cdot\|$ is any norm on $\mathbb{R}^d$ (why?),

$X_n \xrightarrow{a.s.} X \;\Longleftrightarrow\; X_n^{(j)} \xrightarrow{a.s.} X^{(j)}$ for each $j \in \{1, \ldots, d\}$ (why?),

$X_n \xrightarrow{P} X \;\Longleftrightarrow\; X_n^{(j)} \xrightarrow{P} X^{(j)}$ for each $j \in \{1, \ldots, d\}$. (why?)

1.6 Definition (Convergence in p-th mean)

Let $L^p := L^p(\Omega, \mathcal{A}, P) := \{ X : \Omega \to \mathbb{R} : E|X|^p < \infty \}$, $0 < p < \infty$. If $X, X_1, X_2, \ldots \in L^p$, one defines

$X_n \xrightarrow{L^p} X \;:\Longleftrightarrow\; E|X_n - X|^p \to 0.$

$p = 1$: convergence in the mean. $p = 2$: convergence in quadratic mean.

1.7 Remark

If $X_n \xrightarrow{a.s.} X$, then $X_n \xrightarrow{P} X$. (why?)

If $X_n \xrightarrow{L^p} X$, then $X_n \xrightarrow{P} X$. (why?)

In general, there are no further implications. (why?)

1.8 Theorem (Strong law of large numbers, SLLN)

Let $X_1, X_2, \ldots$ be independent, identically distributed (i.i.d.) random variables. Then the following are equivalent:

a) There is a random variable $X$ such that $\frac{1}{n} \sum_{j=1}^{n} X_j \xrightarrow{a.s.} X$.

b) $E|X_1| < \infty$.

If a) or b) holds, then $\frac{1}{n} \sum_{j=1}^{n} X_j \xrightarrow{a.s.} E X_1$.

Does this result generalize to random vectors?
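A minimal numerical illustration of the SLLN (Theorem 1.8) — a sketch assuming only NumPy; the Exp(1) distribution and the seed are arbitrary choices.

```python
# SLLN (1.8): the running mean of i.i.d. Exp(1) variables drifts to E X_1 = 1.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
X = rng.exponential(scale=1.0, size=n)          # i.i.d. with E X_1 = 1
running_mean = np.cumsum(X) / np.arange(1, n + 1)

for k in (10, 1_000, 100_000):
    print(k, running_mean[k - 1])               # approaches 1 as k grows
```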

1.9 Definition (Convergence in distribution)

Let $X, X_1, X_2, \ldots$ be random variables with distribution functions $F, F_1, F_2, \ldots$. Write $C(F)$ for the set of continuity points of $F$.

$X_n \xrightarrow{D} X \;:\Longleftrightarrow\; \lim_{n\to\infty} F_n(x) = F(x) \quad \forall\, x \in C(F).$

Equivalent notations: $F_n \xrightarrow{D} F$, $P^{X_n} \xrightarrow{D} P^X$, $X_n \xrightarrow{D} P^X$.

1.10 Remarks

a) $P^X$ is called the limit (asymptotic) distribution of $X_n$ (of $P^{X_n}$).

b) $X_n \xrightarrow{P} X \Longrightarrow X_n \xrightarrow{D} X$. The converse holds if $P^X$ is degenerate.

c) Suppose $F$ is continuous. We then have Polya's theorem:

$X_n \xrightarrow{D} X \;\Longleftrightarrow\; \lim_{n\to\infty} \sup_{x \in \mathbb{R}} |F_n(x) - F(x)| = 0.$

Let

$C_b := \{ h : \mathbb{R} \to \mathbb{R} : h \text{ bounded and continuous} \},$
$C^{(0)} := \{ h \in C_b : h \text{ uniformly continuous} \},$
$C^{(r)} := \{ h \in C^{(0)} : h \text{ is } r \text{ times differentiable, } h^{(j)} \in C^{(0)} \text{ for } j \in \{1, \ldots, r\} \}, \quad r \in \mathbb{N}.$

1.11 Theorem (Characterization of convergence in distribution)

Let $r \in \mathbb{N}_0$ be fixed. Then the following assertions are equivalent:

a) $X_n \xrightarrow{D} X$,

b) $\lim_{n\to\infty} E h(X_n) = E h(X)$ for each $h \in C_b$,

c) $\lim_{n\to\infty} E h(X_n) = E h(X)$ for each $h \in C^{(r)}$.

Notice that

$X_n \xrightarrow{D} X \;\Longleftrightarrow\; \lim_{n\to\infty} E\, 1_{(-\infty,x]}(X_n) = E\, 1_{(-\infty,x]}(X) \quad \forall\, x \in C(F).$

1.12 Definition (Tightness and relative compactness)

Let $\mathcal{Q} \neq \emptyset$ be a set of probability measures on $\mathcal{B}^1$. $\mathcal{Q}$ is said to be

a) tight $:\Longleftrightarrow$ for each $\varepsilon > 0$ there is a compact set $K \subset \mathbb{R}$ such that $Q(K) \ge 1 - \varepsilon$ for each $Q \in \mathcal{Q}$,

b) relatively compact $:\Longleftrightarrow$ for each sequence $(Q_n)$ in $\mathcal{Q}$ there are a subsequence $(Q_{n_k})$ and a probability measure $Q$ on $\mathcal{B}^1$ such that $Q_{n_k} \xrightarrow{D} Q$ as $k \to \infty$.

1.13 Theorem We have: $\mathcal{Q}$ tight $\Longleftrightarrow$ $\mathcal{Q}$ relatively compact.

1.14 Corollary

a) $X_n \xrightarrow{D} X \Longrightarrow \{ P^{X_n} : n \in \mathbb{N} \}$ is tight.

b) Let $\{ P^{X_n} : n \ge 1 \}$ be tight. Suppose there is a probability measure $Q$ such that $X_{n_k} \xrightarrow{D} Q$ as $k \to \infty$ for each subsequence $(X_{n_k})$ that converges in distribution at all. Then $X_n \xrightarrow{D} Q$.

1.15 Definition (Characteristic function)

Let $X$ be a random variable. The function

$\varphi := \varphi_X : \mathbb{R} \to \mathbb{C}, \quad t \mapsto \varphi(t) := E\left[ e^{itX} \right] = \int_{\mathbb{R}} e^{itx} \, P^X(dx)$

is called the characteristic function of $X$.

1.16 Theorem (Some properties of characteristic functions)

a) $\varphi_{aX+b}(t) = e^{itb} \varphi_X(at)$, $t \in \mathbb{R}$.

b) If $E|X|^k < \infty$, then $\varphi$ is $k$ times continuously differentiable, and

$\frac{d^k}{dt^k} \varphi(t) = E\left[ (iX)^k e^{itX} \right], \quad t \in \mathbb{R}.$

c) If $X, Y$ are independent, then $\varphi_{X+Y} = \varphi_X \cdot \varphi_Y$.

d) If $a, b \in C(F)$ and $a < b$, then

$F(b) - F(a) = \lim_{T\to\infty} \frac{1}{2\pi} \int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it} \, \varphi(t) \, dt.$

e) We have $P^X = P^Y$ ($\Longleftrightarrow: X \overset{D}{=} Y$) $\Longleftrightarrow \varphi_X = \varphi_Y$.

1.17 Theorem (Continuity theorem of Lévy-Cramér)

Let $X, X_1, X_2, \ldots$ be random variables with characteristic functions $\varphi, \varphi_1, \varphi_2, \ldots$. We then have

$X_n \xrightarrow{D} X \;\Longleftrightarrow\; \lim_{n\to\infty} \varphi_n(t) = \varphi(t) \quad \forall\, t \in \mathbb{R}.$

1.18 Definition (Characteristic function of a random vector)

Let $X = (X_1, \ldots, X_d)^\top$ be a $d$-dimensional random (column) vector.

$\varphi := \varphi_X : \mathbb{R}^d \to \mathbb{C}, \quad t \mapsto \varphi(t) := E\left[ e^{it^\top X} \right]$

is called the characteristic function of $X$.

1.19 Proposition Let $-\infty < a_j < b_j < \infty$, $j = 1, \ldots, d$, and put

$B := [a_1, b_1] \times \ldots \times [a_d, b_d].$

If $P(X_j \in \{a_j, b_j\}) = 0$ for each $j = 1, \ldots, d$, then

$P^X(B) = \lim_{T\to\infty} \frac{1}{(2\pi)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} \prod_{j=1}^{d} \frac{e^{-it_j a_j} - e^{-it_j b_j}}{it_j} \, \varphi_X(t) \, dt.$

Proof. Mimic the proof given for the case $d = 1$ (exercise!).

1.20 Corollary We have $X \overset{D}{=} Y \Longleftrightarrow \varphi_X = \varphi_Y$. (why?)

1.21 Theorem (Herglotz-Radon-Cramér-Wold)

We have

$X \overset{D}{=} Y \;\Longleftrightarrow\; t^\top X \overset{D}{=} t^\top Y \quad \forall\, t \in \mathbb{R}^d.$

Proof. Only "$\Longleftarrow$" is non-trivial. Fix $t \in \mathbb{R}^d$. Then

$\varphi_X(t) = E\left[ e^{it^\top X} \right] = E\left[ e^{i \cdot 1 \cdot t^\top X} \right] = \varphi_{t^\top X}(1) = \varphi_{t^\top Y}(1) = E\left[ e^{i \cdot 1 \cdot t^\top Y} \right] = E\left[ e^{it^\top Y} \right] = \varphi_Y(t).$

Corollary 1.20 $\Longrightarrow$ assertion.

1.22 Theorem (Central Limit Theorem, Lindeberg-Lévy)

Let $X_1, X_2, \ldots$ be i.i.d. random variables such that $E X_1^2 < \infty$. Put $a := E X_1$, $\sigma^2 := V(X_1)$, $S_n := X_1 + \ldots + X_n$, $n \ge 1$. If $\sigma^2 > 0$, then

$\frac{S_n - E(S_n)}{\sqrt{V(S_n)}} = \frac{S_n - na}{\sigma \sqrt{n}} \xrightarrow{D} N(0,1) \quad \text{as } n \to \infty.$

1.23 Theorem (Central Limit Theorem, Lindeberg-Feller)

For each $n \ge 2$, let $X_{n,1}, X_{n,2}, \ldots, X_{n,r_n}$ be independent random variables. Let $0 < \sigma_{n,j}^2 := V(X_{n,j}) < \infty$, $a_{n,j} := E X_{n,j}$, $\sigma_n^2 := \sigma_{n,1}^2 + \ldots + \sigma_{n,r_n}^2$, $S_n := X_{n,1} + \ldots + X_{n,r_n}$. For $\varepsilon > 0$, let

$L_n(\varepsilon) := \frac{1}{\sigma_n^2} \sum_{k=1}^{r_n} E\left[ (X_{n,k} - a_{n,k})^2 \, 1\{ |X_{n,k} - a_{n,k}| > \varepsilon \sigma_n \} \right].$

If $\lim_{n\to\infty} L_n(\varepsilon) = 0$ for each $\varepsilon > 0$ (the so-called Lindeberg condition), then

$\frac{S_n - E S_n}{\sqrt{V(S_n)}} \xrightarrow{D} N(0,1) \quad \text{as } n \to \infty.$
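A quick empirical sanity check of Theorem 1.22 — a sketch assuming only NumPy; Exp(1) summands (with $a = \sigma^2 = 1$) and the seed are arbitrary choices.

```python
# Lindeberg-Lévy CLT (1.22): the distribution of (S_n - n a)/(sigma sqrt(n))
# should be close to N(0,1) for large n.
import numpy as np

rng = np.random.default_rng(1)
n, reps = 1_000, 10_000
X = rng.exponential(size=(reps, n))        # a = 1, sigma^2 = 1
Z = (X.sum(axis=1) - n) / np.sqrt(n)       # normalized partial sums

# Compare with Phi(0) = 0.5 and Phi(1) ≈ 0.8413
for z in (0.0, 1.0):
    print(z, np.mean(Z <= z))
```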

1.24 Theorem (Central Limit Theorem, Ljapunov)

Suppose that in 1.23 there is some $\delta > 0$ such that

$\lim_{n\to\infty} \frac{1}{\sigma_n^{2+\delta}} \sum_{k=1}^{r_n} E\left| X_{n,k} - a_{n,k} \right|^{2+\delta} = 0.$

Then

$\frac{S_n - E S_n}{\sqrt{V(S_n)}} \xrightarrow{D} N(0,1) \quad \text{as } n \to \infty.$

1.25 Theorem (Continuous mapping theorem, CMT)

If $X_n \xrightarrow{D} X$ and $h : \mathbb{R} \to \mathbb{R}$ is continuous, then $h(X_n) \xrightarrow{D} h(X)$.

1.26 Theorem (Slutsky's Lemma)

Suppose that $X_n \xrightarrow{D} X$ and $Y_n \xrightarrow{P} a$, $a \in \mathbb{R}$. We then have:

a) $X_n + Y_n \xrightarrow{D} X + a$,

b) $X_n \cdot Y_n \xrightarrow{D} a \cdot X$.

1.27 Theorem (Skorokhod)

Let $X, X_1, X_2, \ldots$ be random variables on a probability space $(\Omega, \mathcal{A}, P)$ such that $X_n \xrightarrow{D} X$. Then there are a probability space $(\tilde\Omega, \tilde{\mathcal{A}}, \tilde{P})$ and random variables $Y, Y_1, Y_2, \ldots$ on $\tilde\Omega$ with the property

$\tilde{P}^{Y} = P^{X}, \quad \tilde{P}^{Y_n} = P^{X_n} \text{ for each } n \quad (\Longrightarrow Y_n \xrightarrow{D} Y)$

and

$\lim_{n\to\infty} Y_n = Y \quad \tilde{P}\text{-almost surely}.$

Proof. Take

$\tilde\Omega := (0,1), \quad \tilde{\mathcal{A}} := \mathcal{B}^1 \cap (0,1), \quad \tilde{P} := \lambda^1|_{(0,1)}, \quad Y_n := F_n^{-1}, \quad Y := F^{-1}.$
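The quantile construction in the proof can be made concrete; a minimal sketch assuming NumPy, where $X_n \sim \text{Exp}(\lambda_n)$ with $\lambda_n \to 1$ is chosen only because the quantile functions are explicit.

```python
# Skorokhod coupling (1.27): on (0,1) with Lebesgue measure, Y_n := F_n^{-1}
# evaluated at the same "omega" converges pointwise to Y := F^{-1}.
import numpy as np

rng = np.random.default_rng(2)
U = rng.uniform(size=100_000)               # omega in (0,1)

def quantile(u, lam):                        # F^{-1} for Exp(lam)
    return -np.log1p(-u) / lam

Y = quantile(U, 1.0)
for n in (1, 10, 1_000):
    Yn = quantile(U, 1.0 + 1.0 / n)
    print(n, np.max(np.abs(Yn - Y)))         # sup over omega -> 0
```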

1.28 Definition (Conditional expectation)

Let $(\Omega, \mathcal{A}, P)$ be a probability space, $X \in L^1(\Omega, \mathcal{A}, P)$ and $\mathcal{G}$ be a sub-$\sigma$-field of $\mathcal{A}$. A random variable $Y$ is called a conditional expectation of $X$ given $\mathcal{G}$, for short $E[X|\mathcal{G}] := Y$, if:

a) $E|Y| < \infty$,

b) $Y$ is $\mathcal{G}$-measurable,

c) $E(Y 1_A) = E(X 1_A)$ for each $A \in \mathcal{G}$.

1.29 Theorem (Existence and uniqueness of conditional expectations)

$E[X|\mathcal{G}]$ exists, and it is uniquely determined $P$-almost surely.

1.30 Theorem (Properties of conditional expectations)

Let $(\Omega, \mathcal{A}, P)$ be a probability space, $\mathcal{G}$ a sub-$\sigma$-field of $\mathcal{A}$ and $X, Y \in L^1(\Omega, \mathcal{A}, P)$. Then:

a) $E(E[X|\mathcal{G}]) = E X$,

b) If $X$ is $\mathcal{G}$-measurable, then $E[X|\mathcal{G}] = X$,

c) $E[aX + bY|\mathcal{G}] = a E[X|\mathcal{G}] + b E[Y|\mathcal{G}]$, $a, b \in \mathbb{R}$ (linearity),

d) If $X \le Y$ $P$-a.s., then $E[X|\mathcal{G}] \le E[Y|\mathcal{G}]$ (monotonicity),

e) If $E|XY| < \infty$, and if $Y$ is $\mathcal{G}$-measurable, then $E[XY|\mathcal{G}] = Y E[X|\mathcal{G}]$ (treat $\mathcal{G}$-measurable functions like constants),

f) If $\mathcal{F} \subset \mathcal{G}$ is a sub-$\sigma$-field of $\mathcal{G}$, then $E[X|\mathcal{F}] = E\big[ E[X|\mathcal{G}] \,\big|\, \mathcal{F} \big]$ (tower property),

g) Suppose $X$ and $\mathcal{G}$ are independent. Then $E[X|\mathcal{G}] = E(X)$,

h) Let $\mathcal{H} \subset \mathcal{A}$ be a sub-$\sigma$-field of $\mathcal{A}$. If $\mathcal{H}$ and $\sigma(\sigma(X) \cup \mathcal{G})$ are independent, then $E[X|\mathcal{G}] = E[X|\sigma(\mathcal{G} \cup \mathcal{H})]$.

Notice that g) follows from h). (why?)

1.31 Theorem (Factorization theorem)

Let $(\Omega', \mathcal{A}')$ be a measurable space. If $\mathcal{G} = \sigma(Z) = Z^{-1}(\mathcal{A}')$ for an $(\mathcal{A}, \mathcal{A}')$-measurable mapping $Z : \Omega \to \Omega'$, then

$E[X|\mathcal{G}] = E[X|\sigma(Z)] =: E[X|Z] = h(Z)$

for some measurable function $h : \Omega' \to \mathbb{R}$.

2 A Poisson limit theorem for triangular arrays

For $n \ge 2$, let $X_{n,1}, \ldots, X_{n,n}$ be independent $\mathbb{N}_0$-valued random variables.

Aim: Give a necessary and sufficient condition ("nasc") for

$X_{n,1} + \ldots + X_{n,n} \xrightarrow{D} \text{Po}(\lambda) \quad \text{as } n \to \infty \text{ for some } \lambda > 0.$

2.1 Definition (Null array)

Let $\Delta := (X_{n,j} : 1 \le j \le n)_{n \ge 1}$ be a triangular array of $\mathbb{R}$-valued random variables. $\Delta$ is said to be a null array if

$\lim_{n\to\infty} \max_{1 \le j \le n} P(|X_{n,j}| > \varepsilon) = 0 \quad \text{for each } \varepsilon > 0. \tag{2.1}$

Equivalent wording: $\Delta$ is a null array $\Longleftrightarrow$ $\Delta$ is uniformly asymptotically negligible.

In what follows, we use the notation $x \wedge y := \min(x, y)$ ($\wedge$ = "wedge").

2.2 Proposition We have

$\lim_{n\to\infty} \max_{1 \le j \le n} P(|X_{n,j}| > \varepsilon) = 0 \;\forall\, \varepsilon > 0 \;\Longleftrightarrow\; \lim_{n\to\infty} \max_{1 \le j \le n} E[|X_{n,j}| \wedge 1] = 0.$

Proof: "$\Longleftarrow$": It suffices to consider $0 < \varepsilon \le 1$ (why?). Then

$1\{ |X_{n,j}| > \varepsilon \} \le \frac{1}{\varepsilon} \left( |X_{n,j}| \wedge 1 \right).$

Take $E[\,\cdot\,]$ and the maximum over $j$ $\Longrightarrow$ assertion.

"$\Longrightarrow$": Fix $\varepsilon > 0$. We have

$E[|X_{n,j}| \wedge 1] = E[(|X_{n,j}| \wedge 1) 1\{|X_{n,j}| > \varepsilon\}] + E[(|X_{n,j}| \wedge 1) 1\{|X_{n,j}| \le \varepsilon\}] \le P(|X_{n,j}| > \varepsilon) + (\varepsilon \wedge 1),$

since the first integrand is bounded by $1$ and the second by $\varepsilon \wedge 1$, q.e.d.

2.3 Definition (Generating function)

Let $X$ be an $\mathbb{N}_0$-valued random variable.

$g_X : [-1, 1] \to \mathbb{R}, \quad s \mapsto g_X(s) := \sum_{k=0}^{\infty} P(X = k) s^k$

is called the (probability) generating function of $X$ (of $P^X$).

2.4 Example (Poisson distribution)

If $X \sim \text{Po}(\lambda)$, then

$g_X(s) = \sum_{k=0}^{\infty} e^{-\lambda} \frac{\lambda^k}{k!} s^k = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda s)^k}{k!} = e^{-\lambda} e^{\lambda s} = e^{\lambda(s-1)}.$

2.5 Remark

a) $g_X$ determines $P^X$, since $\frac{d^r}{ds^r} g_X(s) \Big|_{s=0} = r! \cdot P(X = r)$,

b) $g_X(s) = E[s^X]$,

c) $X, Y$ independent $\Longrightarrow$

$g_{X+Y}(s) = E[s^{X+Y}] = E[s^X s^Y] = E[s^X]\, E[s^Y] = g_X(s)\, g_Y(s).$

2.6 Theorem (Continuity theorem for generating functions)

Let $X_0, X_1, X_2, \ldots$ be $\mathbb{N}_0$-valued random variables with generating functions $g_0, g_1, g_2, \ldots$. Then the following are equivalent:

a) $X_n \xrightarrow{D} X_0$,

b) $\lim_{n\to\infty} P(X_n = k) = P(X_0 = k) \quad \forall\, k \in \mathbb{N}_0$,

c) $\lim_{n\to\infty} g_n(s) = g_0(s) \quad \forall\, s \in [0, 1]$.

Proof: "a) $\Longrightarrow$ b)": Let $F_n$ be the distribution function of $X_n$. Fix $k \in \mathbb{N}_0$. Since $k + 1/2$ is a point of continuity of $F_0$, we have

$\lim_{n\to\infty} P(X_n \le k) = \lim_{n\to\infty} F_n\left( k + \tfrac12 \right) = F_0\left( k + \tfrac12 \right) = P(X_0 \le k).$

"b) $\Longrightarrow$ a)": $\checkmark$ (why?)

"b) $\Longrightarrow$ c)": W.l.o.g. let $s < 1$. Put $\Delta_{n,k} := |P(X_n = k) - P(X_0 = k)|$. Let $m \in \mathbb{N}$. We have

$|g_n(s) - g_0(s)| \le \sum_{k=0}^{\infty} \Delta_{n,k} s^k \le \max_{0 \le k \le m} \Delta_{n,k} \cdot \sum_{k=0}^{m} s^k + \sum_{k=m+1}^{\infty} s^k = \max_{0 \le k \le m} \Delta_{n,k} \cdot \frac{1 - s^{m+1}}{1 - s} + \frac{s^{m+1}}{1 - s}.$

Memo: $\Delta_{n,k} := |P(X_n = k) - P(X_0 = k)|$

Memo: $|g_n(s) - g_0(s)| \le \max_{0 \le k \le m} \Delta_{n,k} \cdot \frac{1 - s^{m+1}}{1 - s} + \frac{s^{m+1}}{1 - s}$, $m \in \mathbb{N}$ fixed.

Fix $\varepsilon > 0$. Choose $m$ so large that $s^{m+1}/(1-s) \le \varepsilon$. From b), we have

$\lim_{n\to\infty} \max_{0 \le k \le m} \Delta_{n,k} = 0 \quad \Longrightarrow \quad \limsup_{n\to\infty} |g_n(s) - g_0(s)| \le \varepsilon, \quad \text{q.e.d.}$

"c) $\Longrightarrow$ b)": For each $k \ge 0$, $(P(X_n = k))_{n \ge 1}$ is a bounded sequence. Bolzano-Weierstraß and Cantor's diagonal argument $\Longrightarrow$ $\exists$ subsequence $(X_{n'})$ such that

$u_k := \lim_{n'\to\infty} P(X_{n'} = k) \quad \text{exists for each } k \in \mathbb{N}_0.$

Part "b) $\Longrightarrow$ c)" $\Longrightarrow$ $\lim_{n'\to\infty} g_{n'}(s) = \sum_{k=0}^{\infty} u_k s^k$, $0 \le s < 1$.

Assumption c) $\Longrightarrow$ $\lim_{n'\to\infty} g_{n'}(s) = \sum_{k=0}^{\infty} P(X_0 = k) s^k$, $0 \le s < 1$,

$\Longrightarrow u_k = P(X_0 = k)$, $k \in \mathbb{N}_0$ (why?) $\Longrightarrow$ b). $\checkmark$

2.7 Lemma For $n \ge 1$, let $X_{n,1}, \ldots, X_{n,n}$ be $\mathbb{N}_0$-valued random variables with generating functions $g_{n,1}, \ldots, g_{n,n}$. We then have:

a) $\{ X_{n,j} : n \ge 1, 1 \le j \le n \}$ is a null array

$\Longleftrightarrow$ b) $\lim_{n\to\infty} \max_{1 \le j \le n} (1 - g_{n,j}(s)) = 0 \quad \forall\, s \in [0, 1]$.

Proof: We have

a) $\Longleftrightarrow \lim_{n\to\infty} \max_{1 \le j \le n} P(|X_{n,j}| > \varepsilon) = 0 \;\forall\, \varepsilon > 0$
$\Longleftrightarrow X_{n,k_n} \xrightarrow{P} 0$ for every sequence $(k_n)$ such that $1 \le k_n \le n$
$\Longleftrightarrow X_{n,k_n} \xrightarrow{D} \delta_0$ for every such sequence
$\Longleftrightarrow g_{n,k_n}(s) \to 1$, $0 \le s \le 1$, for every such sequence
$\Longleftrightarrow 1 - g_{n,k_n}(s) \to 0$, $0 \le s \le 1$, for every such sequence
$\Longleftrightarrow$ b).

2.8 Remark If $X_{n,1}, \ldots, X_{n,n}$ are $\mathbb{R}$-valued, then b) in 2.7 takes the form

$\lim_{n\to\infty} \max_{1 \le j \le n} \left| 1 - \varphi_{n,j}(t) \right| = 0, \quad t \in \mathbb{R}.$

Here, $\varphi_{n,j}$ is the characteristic function of $X_{n,j}$.

2.9 Theorem (Poisson Limit Theorem)

Let $\{ X_{n,j} : n \ge 2, 1 \le j \le n \}$ be a null array of rowwise independent $\mathbb{N}_0$-valued random variables and $X \sim \text{Po}(\lambda)$. We then have:

$X_{n,1} + \ldots + X_{n,n} \xrightarrow{D} X \;\Longleftrightarrow\; \text{(i)} \;\sum_{j=1}^{n} P(X_{n,j} > 1) \to 0, \quad \text{(ii)} \;\sum_{j=1}^{n} P(X_{n,j} = 1) \to \lambda.$

Proof: "$\Longleftarrow$": In view of 2.6, 2.5 c) and 2.4, we have to show

$\lim_{n\to\infty} \prod_{j=1}^{n} g_{n,j}(s) = e^{\lambda(s-1)}, \quad 0 \le s \le 1, \tag{2.2}$

$\Longleftrightarrow \sum_{j=1}^{n} \log\left( 1 - (1 - g_{n,j}(s)) \right) \to \lambda(s-1), \quad 0 \le s \le 1,$

$\Longleftrightarrow \sum_{j=1}^{n} (1 - g_{n,j}(s)) \to \lambda(1-s), \quad 0 \le s \le 1. \tag{2.3}$

For the last equivalence, use 2.7 b) and $1 - 1/t \le \log t \le t - 1$, $t > 0$.

We have

$\sum_{j=1}^{n} (1 - g_{n,j}(s)) = \sum_{j=1}^{n} \left[ 1 - \sum_{k=0}^{1} s^k P(X_{n,j} = k) \right] - \sum_{j=1}^{n} \sum_{k=2}^{\infty} s^k P(X_{n,j} = k)$

$= \sum_{j=1}^{n} \left[ 1 - P(X_{n,j} = 0) - s P(X_{n,j} = 1) \right] + \sum_{j=1}^{n} \sum_{k=2}^{\infty} (s - s^k) P(X_{n,j} = k) - \sum_{j=1}^{n} s P(X_{n,j} \ge 2)$

$= \underbrace{(1-s) \sum_{j=1}^{n} P(X_{n,j} > 0)}_{=:\, T_{n,1}(s)} + \underbrace{\sum_{k=2}^{\infty} (s - s^k) \sum_{j=1}^{n} P(X_{n,j} = k)}_{=:\, T_{n,2}(s)}.$

For $k \ge 2$ we have $s(1-s) \le s(1 - s^{k-1}) = s - s^k \le s$, hence

$s(1-s) \sum_{j=1}^{n} P(X_{n,j} > 1) \le T_{n,2}(s) \le s \sum_{j=1}^{n} P(X_{n,j} > 1),$

and both bounds tend to $0$ by (i).

Memo: $\sum_{j=1}^{n} (1 - g_{n,j}(s)) = T_{n,1}(s) + T_{n,2}(s)$, $\lim_{n\to\infty} T_{n,2}(s) = 0$.

Memo: $T_{n,1}(s) = (1-s) \sum_{j=1}^{n} P(X_{n,j} > 0)$

Notice that

$\sum_{j=1}^{n} P(X_{n,j} > 0) = \underbrace{\sum_{j=1}^{n} P(X_{n,j} = 1)}_{\to\, \lambda, \text{ cf. (ii)}} + \underbrace{\sum_{j=1}^{n} P(X_{n,j} > 1)}_{\to\, 0, \text{ cf. (i)}}.$

It follows that

$\lim_{n\to\infty} T_{n,1}(s) = (1-s)\lambda, \quad \text{q.e.d.}$

"$\Longrightarrow$": Suppose that $X_{n,1} + \ldots + X_{n,n} \xrightarrow{D} \text{Po}(\lambda)$

$\Longrightarrow \prod_{j=1}^{n} g_{n,j}(s) \to e^{\lambda(s-1)}, \quad 0 \le s \le 1,$

$\Longrightarrow \sum_{j=1}^{n} (1 - g_{n,j}(s)) \to \lambda(1-s), \quad 0 \le s \le 1.$

Put $s = 0$. Then

$\sum_{j=1}^{n} P(X_{n,j} > 0) \to \lambda.$

Memo: $\sum_{j=1}^{n} (1 - g_{n,j}(s)) = (1-s) \sum_{j=1}^{n} P(X_{n,j} > 0) + T_{n,2}(s)$

Memo: $s(1-s) \sum_{j=1}^{n} P(X_{n,j} > 1) \le T_{n,2}(s) \le s \sum_{j=1}^{n} P(X_{n,j} > 1)$

It follows that $\sum_{j=1}^{n} P(X_{n,j} > 1) \to 0$ (which is (i)), and

$\sum_{j=1}^{n} P(X_{n,j} = 1) = \sum_{j=1}^{n} P(X_{n,j} > 0) - \sum_{j=1}^{n} P(X_{n,j} > 1) \to \lambda,$

which is (ii), q.e.d.
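A numerical illustration of Theorem 2.9 — a sketch assuming NumPy; the Bernoulli array $X_{n,j} \sim \text{Bin}(1, \lambda/n)$ trivially satisfies (i) and (ii), and $\lambda = 2$ is arbitrary.

```python
# Poisson limit theorem (2.9): row sums of a Bernoulli(lambda/n) null
# array are approximately Po(lambda).
from math import exp, factorial
import numpy as np

rng = np.random.default_rng(3)
lam, n, reps = 2.0, 500, 20_000
S = rng.binomial(1, lam / n, size=(reps, n)).sum(axis=1)

for k in range(5):   # empirical frequency vs Po(2) probability
    print(k, np.mean(S == k), exp(-lam) * lam**k / factorial(k))
```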

3 The method of moments

A warning! In general, $X_n \xrightarrow{D} X$ does not imply $E X_n \to E X$.

Notice that $X + Y_n \xrightarrow{D} X$ if $Y_n \xrightarrow{P} 0$ (Slutsky).

If $P(Y_n = n^2) = 1/n$, $P(Y_n = 0) = 1 - 1/n$, we have $Y_n \xrightarrow{P} 0$ and $E Y_n \to \infty$.

3.1 Theorem Suppose $X_n \xrightarrow{D} X$. Then $E|X| \le \liminf_{n\to\infty} E|X_n|$.

Proof: Take $(\tilde\Omega, \tilde{\mathcal{A}}, \tilde{P})$ and $Y, Y_1, Y_2, \ldots$ as in Skorokhod's theorem. Notice that $X \overset{D}{=} Y$ and $0 \le |Y_n| \to |Y|$ $\tilde{P}$-a.s. Fatou's lemma $\Longrightarrow$

$E|X| = E|Y| = \int |Y| \, d\tilde{P} \le \liminf_{n\to\infty} \int |Y_n| \, d\tilde{P} = \liminf_{n\to\infty} E|Y_n| = \liminf_{n\to\infty} E|X_n|, \quad \text{q.e.d.}$
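The warning example is easy to check numerically; a sketch assuming NumPy.

```python
# The example before 3.1: Y_n -> 0 in probability, yet E Y_n = n -> infinity.
import numpy as np

rng = np.random.default_rng(4)
for n in (10, 100, 1_000):
    Y = np.where(rng.uniform(size=200_000) < 1 / n, n**2, 0)
    # P(|Y_n| > eps) ≈ 1/n, while the sample mean ≈ n
    print(n, np.mean(np.abs(Y) > 0.5), Y.mean())
```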

3.2 Definition (Uniform integrability)

Let $X_1, X_2, \ldots$ be random variables on $(\Omega, \mathcal{A}, P)$. The sequence $(X_n)$ is said to be uniformly integrable (UI) if

$\lim_{a\to\infty} \sup_{n \ge 1} E\left[ |X_n| \, 1\{ |X_n| \ge a \} \right] = 0.$

3.3 Corollary

a) If $(X_n)_{n \ge 1}$ is UI, then $\sup_{n \ge 1} E|X_n| < \infty$.

b) If $\sup_{n \ge 1} E|X_n|^{1+\delta} < \infty$ for some $\delta > 0$, then $(X_n)$ is UI.

c) If $\sup_{n \ge 1} |X_n| \le C < \infty$ for some $C$, then $(X_n)$ is UI.

Proof: a) For fixed $a > 0$, we have

$E|X_n| = E[|X_n| 1\{|X_n| < a\}] + E[|X_n| 1\{|X_n| \ge a\}] \le a + \sup_{k \ge 1} E[|X_k| 1\{|X_k| \ge a\}]. \checkmark$

b) Fix $a > 0$. We have

$E[|X_n| 1\{|X_n| \ge a\}] \le E\left[ |X_n| \cdot \left( \frac{|X_n|}{a} \right)^{\delta} \right] = \frac{1}{a^{\delta}} \cdot E|X_n|^{1+\delta}. \checkmark$

c) follows from b).

3.4 Theorem If $X_n \xrightarrow{D} X$ and $(X_n)$ is UI, then:

a) $E|X| < \infty$,

b) $\lim_{n\to\infty} E X_n = E X$.

Proof: a) follows from Thm. 3.1 and Cor. 3.3 a).

b) Skorokhod's theorem $\Longrightarrow$ w.l.o.g. $X_n \xrightarrow{a.s.} X$. Fix $a > 0$.

$|E X_n - E X| \le E|X_n - X| = \underbrace{E[|X_n - X| 1\{|X_n - X| < a\}]}_{=:\, \Delta_{n,1}} + \underbrace{E[|X_n - X| 1\{|X_n - X| \ge a\}]}_{=:\, \Delta_{n,2}}$

Notice that $\lim_{n\to\infty} \Delta_{n,1} = 0$ by dominated convergence. Furthermore,

$\Delta_{n,2} \le 2 \cdot E\left[ \max(|X_n|, |X|) \, 1\left\{ \max(|X_n|, |X|) \ge \tfrac{a}{2} \right\} \right]$
$\le 2 \cdot E\left[ |X_n| 1\left\{ |X_n| \ge \tfrac{a}{2} \right\} \right] + 2 \cdot E\left[ |X| 1\left\{ |X| \ge \tfrac{a}{2} \right\} \right]$
$\le 2 \cdot \sup_{k \ge 1} E\left[ |X_k| 1\left\{ |X_k| \ge \tfrac{a}{2} \right\} \right] + 2 \cdot E\left[ |X| 1\left\{ |X| \ge \tfrac{a}{2} \right\} \right].$

Both terms tend to $0$ as $a \to \infty$: the first since $(X_n)$ is UI, the second since $E|X| < \infty$. $\checkmark$

3.5 Corollary Let $r \in \mathbb{N}$, $\varepsilon > 0$. If $X_n \xrightarrow{D} X$ and $\sup_{n \ge 1} E|X_n|^{r+\varepsilon} < \infty$, then:

a) $E|X|^r < \infty$,

b) $\lim_{n\to\infty} E(X_n^r) = E(X^r)$.

Proof: We have $|X_n|^{r+\varepsilon} = |X_n^r|^{1+\varepsilon/r}$. Put $\delta := \varepsilon/r$.

Memo: If $\sup_{n \ge 1} E|Y_n|^{1+\delta} < \infty$ for some $\delta > 0$, then $(Y_n)$ is UI.

It follows that $(X_n^r)_{n \ge 1}$ is UI. The continuous mapping theorem gives $X_n^r \xrightarrow{D} X^r$. The assertion now follows from Theorem 3.4, q.e.d.

3.6 Theorem (Method of moments)

Suppose $P^X$ is uniquely determined by the sequence $(E X^k)_{k \ge 1}$ of moments. If $\lim_{n\to\infty} E X_n^k = E X^k$ for each $k \ge 1$, then $X_n \xrightarrow{D} X$.

Proof: For $a > 0$, Markov's inequality yields

$P(|X_n| > a) \le \frac{E X_n^2}{a^2} \to \frac{E X^2}{a^2} \quad \text{as } n \to \infty.$

It follows that $\{ P^{X_n} : n \ge 1 \}$ is tight. (why?)

Thm. 1.13 $\Longrightarrow \exists$ subsequence $(X_{n_k})$ $\exists$ r.v. $Y$ such that $X_{n_k} \xrightarrow{D} Y$ as $k \to \infty$.

Cor. 3.5 $\Longrightarrow \lim_{k\to\infty} E X_{n_k}^r = E Y^r$, $r \in \mathbb{N}$. (why?)

Assumption $\Longrightarrow X \overset{D}{=} Y$, i.e., $X_{n_k} \xrightarrow{D} X$. Cor. 1.14 $\Longrightarrow$ assertion.

Problem: Prove the CLT of Lindeberg-Lévy for bounded r.v.'s via Thm. 3.6.

3.7 Theorem (Sufficient condition for "$(E X^k)_{k \ge 1}$ determines $P^X$")

Let $X$ be a random variable such that $E|X|^k < \infty$, $k \ge 1$. Suppose that

$\sum_{k=1}^{\infty} \frac{E X^k}{k!} t^k$

has a non-vanishing radius of convergence. Then $P^X$ is uniquely determined by the sequence $(E X^k)_{k \ge 1}$.

Proof: Let $\varphi(t) := E e^{itX}$, $b_k := E|X|^k$, $k \ge 1$. Induction over $n$ $\Longrightarrow$

$\left| e^{itX} \left( e^{ihX} - \sum_{k=0}^{n} \frac{(ihX)^k}{k!} \right) \right| \le \frac{|h|^{n+1} |X|^{n+1}}{(n+1)!}, \quad t, h \in \mathbb{R}, \; n \ge 0.$

Since $E|X|^k < \infty$ implies $\varphi^{(k)}(t) = E[e^{itX} (iX)^k]$, it follows that

$\left| \varphi(t+h) - \sum_{k=0}^{n} \frac{h^k}{k!} \varphi^{(k)}(t) \right| \le \frac{|h|^{n+1}}{(n+1)!} \, b_{n+1}.$

Put $m_k := E X^k$. The assumption $\Longrightarrow \exists\, t_0 > 0$ with $\sum_{k=0}^{\infty} |m_k| t_0^k / k! < \infty$.

Memo: $\left| \varphi(t+h) - \sum_{k=0}^{n} \frac{h^k}{k!} \varphi^{(k)}(t) \right| \le \frac{|h|^{n+1}}{(n+1)!} \, b_{n+1}.$

Memo: $\sum_{k=0}^{\infty} \frac{|m_k| t_0^k}{k!} < \infty$, $0 < t_0 < \infty$, $m_k = E X^k$.

Since $|X|^{2k-1} \le 1 + |X|^{2k}$, we have $b_{2k-1} \le 1 + m_{2k}$. Since $m_{2k} = b_{2k}$,

$\frac{b_{2k-1} h^{2k-1}}{(2k-1)!} \le \frac{h^{2k-1}}{(2k-1)!} + \frac{m_{2k} t_0^{2k}}{(2k)!} \cdot \frac{2k \, h^{2k-1}}{t_0^{2k}}$

shows that the left-hand side tends to $0$ as $k \to \infty$ if $h \in (0, t_0)$. It follows that

$\varphi(t+h) = \sum_{k=0}^{\infty} \frac{\varphi^{(k)}(t)}{k!} h^k, \quad t \in \mathbb{R}, \; |h| < t_0. \tag{3.1}$

Let $Y$ be a r.v. with $E Y^k = m_k$, $k \ge 1$, and CF $\psi(t) = E e^{itY}$. Proceeding as above, we obtain

$\psi(t+h) = \sum_{k=0}^{\infty} \frac{\psi^{(k)}(t)}{k!} h^k, \quad t \in \mathbb{R}, \; |h| < t_0. \tag{3.2}$

Put $t = 0$ in (3.1), (3.2). Since $\psi^{(k)}(0) = \varphi^{(k)}(0) = i^k m_k$, $k \ge 1$, we have $\psi(t) = \varphi(t)$, $|t| < t_0$. Putting $t = \pm t_0/2$, $t = \pm t_0, \ldots$ in (3.1), (3.2) gives $\psi = \varphi$ $\Longrightarrow X \overset{D}{=} Y$, q.e.d.

3.8 Examples

a) If $P(|X| \le M) = 1$ for some $M < \infty$, then $P^X$ is determined by the sequence $(E X^k)$.

b) If $X \sim N(0,1)$, then $P^X$ is determined by the sequence $(E X^k)$.

c) If $X$ has a lognormal distribution, then $P^X$ is not determined by the sequence $(E X^k)$.
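For c), the moments of the standard lognormal are $E X^k = e^{k^2/2}$; they grow so fast that the series in 3.7 has radius of convergence $0$, so the sufficient condition fails. A small Monte Carlo check of the moment formula, a sketch assuming NumPy (estimates for higher $k$ are noisy because $X^k$ has huge variance):

```python
# Moments of X = exp(Z), Z ~ N(0,1): E X^k = e^{k^2/2}, so
# E X^k t^k / k! -> infinity for every t > 0 (radius of convergence 0).
import numpy as np

rng = np.random.default_rng(5)
X = np.exp(rng.standard_normal(2_000_000))
for k in (1, 2, 3):
    print(k, (X**k).mean(), np.exp(k**2 / 2))   # Monte Carlo vs exact
```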

4 A CLT for stationary m-dependent sequences

Let $(Y_j)_{j \ge 1}$ be a sequence of random variables on some probability space $(\Omega, \mathcal{A}, P)$. Recall: for $T \subset \mathbb{N}$, $T \neq \emptyset$,

$\sigma(Y_t : t \in T) := \sigma\left( \bigcup_{t \in T} Y_t^{-1}(\mathcal{B}^1) \right).$

4.1 Definition (m-dependence and stationarity)

a) $(Y_j)_{j \ge 1}$ is called m-dependent $:\Longleftrightarrow$ for each $s \ge 1$: $\sigma(Y_1, \ldots, Y_s)$ and $\sigma(Y_{s+m+j} : j \ge 1)$ are independent.

b) $(Y_j)_{j \ge 1}$ is said to be stationary $:\Longleftrightarrow$ $\forall\, j \in \mathbb{N}$ $\forall\, k \in \mathbb{N}_0$: the distribution of $(Y_j, \ldots, Y_{j+k})$ does not depend on $j$ ("shift invariance of finite-dimensional distributions").

Notice that

$Y_1, Y_2, \ldots$ are independent $\Longleftrightarrow (Y_j)_{j \ge 1}$ is 0-dependent. (!)

$Y_1, Y_2, \ldots$ i.i.d. $\Longrightarrow (Y_j)_{j \ge 1}$ is stationary and m-dependent for every $m \ge 0$.

4.2 Examples (Functions of blocks of i.i.d. sequences)

a) Let $X_1, X_2, \ldots$ be i.i.d. random variables. Let $\ell \in \mathbb{N}$ and $f : \mathbb{R}^\ell \to \mathbb{R}$ be a measurable function. Put $Y_j := f(X_j, X_{j+1}, \ldots, X_{j+\ell-1})$, $j \ge 1$. Then $(Y_j)_{j \ge 1}$ is stationary. Since, for $s \ge 1$, we have

$\sigma(Y_1, \ldots, Y_s) \subset \sigma(X_1, \ldots, X_{s+\ell-1}), \quad \sigma(Y_{s+\ell-1+j} : j \ge 1) \subset \sigma(X_{s+\ell}, X_{s+\ell+1}, \ldots),$

$Y_1, Y_2, \ldots$ are $(\ell-1)$-dependent.

b) (Special case of a)) Let $X_0, X_1, X_2, \ldots$ be i.i.d. Put $Y_j := 1\{ X_{j-1} > X_j < X_{j+1} \}$, $j \ge 1$ (local minimum at time $j$). Then $(Y_j)_{j \ge 1}$ is stationary and 2-dependent.

c) (Special case of a)) Let $X_0, X_1, \ldots$ be i.i.d. $\sim \text{Bin}(1, p)$, $0 < p < 1$. Let

$Y_j := (1 - X_{j-1}) \, X_j X_{j+1} \cdots X_{j+r-1} \, (1 - X_{j+r})$

(a lucky streak of exact length $r$ starts at the $j$th trial). $(Y_j)_{j \ge 1}$ is $(r+1)$-dependent and stationary.

In what follows, we assume $E Y_1^2 < \infty$. Put

$\mu := E(Y_1) = E(Y_j) \;\forall\, j \ge 1,$
$\sigma_{00} := V(Y_1) = V(Y_j) \;\forall\, j \ge 1,$
$\sigma_{0j} := \text{Cov}(Y_1, Y_{1+j}) = \text{Cov}(Y_i, Y_{i+j}) \;\forall\, i \ge 1$ (by stationarity!).

Notice that $\sigma_{0j} = 0$ if $j > m$ (because of m-dependence).

Let $S_n := Y_1 + \ldots + Y_n$, $n \ge 1$. Then $E S_n = n\mu$, and, for $n \ge m$,

$V(S_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} \text{Cov}(Y_i, Y_j) = n \sigma_{00} + 2(n-1)\sigma_{01} + \ldots + 2(n-m)\sigma_{0m}.$

Notice that

$\lim_{n\to\infty} \frac{1}{n} V(S_n) = \sigma^2 := \sigma_{00} + 2 \sum_{j=1}^{m} \sigma_{0j}.$

$\sigma^2 = \sigma_{00} + 2 \sum_{j=1}^{m} \sigma_{0j}$ is called the long-run variance of $(Y_j)_{j \ge 1}$.

4.3 Lemma Let $Z_{n,k}, X_{n,k}$ ($n, k \in \mathbb{N}$) be random variables and

$T_n := Z_{n,k} + X_{n,k}, \quad n, k \ge 1 \quad (T_n \text{ does not depend on } k!).$

Suppose that

a) $\lim_{k\to\infty} \sup_{n \in \mathbb{N}} P(|X_{n,k}| \ge \delta) = 0 \;\forall\, \delta > 0$,

b) for each $k \ge 1$: $Z_{n,k} \xrightarrow{D} Z_k$ as $n \to \infty$ for some random variable $Z_k$,

c) $Z_k \xrightarrow{D} Z$ as $k \to \infty$ for some random variable $Z$.

Then $T_n \xrightarrow{D} Z$ as $n \to \infty$.

Proof: Let $F, F_1, F_2, \ldots$ be the distribution functions of $Z, Z_1, Z_2, \ldots$.

Fix $\varepsilon > 0$ and $z \in C(F)$. Since $\mathbb{R} \setminus C(F)$ is countable, there is some $\delta > 0$ with

$P(|Z - z| \le \delta) < \varepsilon \quad \text{and} \quad z + \delta, z - \delta \in C(F) \cap \bigcap_{k=1}^{\infty} C(F_k). \tag{4.1}$

From a), there is some $k_0$ such that

$P(|X_{n,k}| \ge \delta) < \varepsilon \quad \forall\, n \;\forall\, k \ge k_0. \tag{4.2}$

From c), there is some $k_1 \ge k_0$ with

$|P(Z_k \le z \pm \delta) - P(Z \le z \pm \delta)| < \varepsilon \quad \forall\, k \ge k_1. \tag{4.3}$

Memo: b) $\forall\, k \ge 1$: $Z_{n,k} \xrightarrow{D} Z_k$ as $n \to \infty$ for some random variable $Z_k$

Memo: $P(|Z - z| \le \delta) < \varepsilon$ and $z + \delta, z - \delta \in C(F) \cap \bigcap_{k=1}^{\infty} C(F_k)$ (4.1)

Memo: $P(|X_{n,k}| \ge \delta) < \varepsilon \;\forall\, n, \forall\, k \ge k_0$ (4.2)

Memo: $|P(Z_k \le z \pm \delta) - P(Z \le z \pm \delta)| < \varepsilon \;\forall\, k \ge k_1$ (4.3)

For $k \ge k_1$, we have

$P(T_n \le z) = P(Z_{n,k} + X_{n,k} \le z)$
$= P(Z_{n,k} + X_{n,k} \le z, |X_{n,k}| < \delta) + P(Z_{n,k} + X_{n,k} \le z, |X_{n,k}| \ge \delta)$
$\le P(Z_{n,k} \le z + \delta) + P(|X_{n,k}| \ge \delta)$
$\le P(Z_{n,k} \le z + \delta) + \varepsilon \quad \text{(by (4.2))}.$

From b), it follows that

$\limsup_{n\to\infty} P(T_n \le z) \le P(Z_k \le z + \delta) + \varepsilon \le P(Z \le z + \delta) + 2\varepsilon \le P(Z \le z) + 3\varepsilon,$

using b), (4.3) and (4.1) in turn. In the same way, using (4.3) with $z - \delta$, we obtain

$\liminf_{n\to\infty} P(T_n \le z) \ge P(Z \le z) - 3\varepsilon. \quad \varepsilon \downarrow 0 \Longrightarrow \text{assertion}.$

4.4 Theorem (CLT for stationary m-dependent sequences)

Let $(Y_j)_{j \ge 1}$ be a stationary m-dependent sequence satisfying $E Y_1^2 < \infty$ and $0 < \sigma^2$, where $\sigma^2$ is the long-run variance. For the sequence of partial sums $S_n = Y_1 + \ldots + Y_n$, we then have

$\frac{S_n - E S_n}{\sqrt{V(S_n)}} \xrightarrow{D} N(0,1) \quad \text{as } n \to \infty.$

Proof: W.l.o.g. let $\mu = E Y_1 = 0$. Idea: split $S_n$ into suitable blocks and use the CLT of Lindeberg-Lévy and Slutsky's lemma.

Fix $k > m$. Then $n =: s(k+m) + r$, where $0 \le r < k+m$. Put

$S_n =: S_{n,1} + S_{n,2} + R_n,$ where

$S_{n,1} := \sum_{j=0}^{s-1} V_{k,j}, \quad S_{n,2} := \sum_{j=0}^{s-1} W_{k,j},$

$V_{k,j} := \sum_{i=1}^{k} Y_{j(k+m)+i}, \quad W_{k,j} := \sum_{i=k+1}^{k+m} Y_{j(k+m)+i}, \quad R_n := \sum_{i=1}^{r} Y_{s(k+m)+i}.$

Notice that $V_{k,0}, \ldots, V_{k,s-1}$ are i.i.d.

The sequence $Y_1, \ldots, Y_n$ thus splits into alternating blocks $V_{k,0}, W_{k,0}, V_{k,1}, W_{k,1}, \ldots, V_{k,s-1}, W_{k,s-1}$, followed by the remainder $R_n$.

Put

$T_n := \frac{S_n}{\sqrt{n}}, \quad Z_{n,k} := \frac{S_{n,1} + R_n}{\sqrt{n}}, \quad X_{n,k} := \frac{S_{n,2}}{\sqrt{n}} \quad \Longrightarrow \quad T_n = \frac{S_n}{\sqrt{n}} = Z_{n,k} + X_{n,k}.$

We claim that a), b), c) of Lemma 4.3 hold.

a) To show: $\lim_{k\to\infty} \sup_{n \in \mathbb{N}} P(|X_{n,k}| \ge \delta) = 0 \;\forall\, \delta > 0$.

We have

$E X_{n,k} = 0, \quad V(X_{n,k}) = \frac{1}{n} V(S_{n,2}) = \frac{s}{n} V(S_m).$

Since $n = s(k+m) + r_n$, $0 \le r_n < k+m$, we have $\frac{s}{n} \le \frac{1}{k+m}$.

Tschebyshev $\Longrightarrow$ $\sup_{n \in \mathbb{N}} P(|X_{n,k}| \ge \delta) \le \frac{V(S_m)}{(k+m)\delta^2} \to 0$ as $k \to \infty$. $\checkmark$

Memo: b) $\forall\, k \ge 1$: $Z_{n,k} \xrightarrow{D} Z_k$ as $n \to \infty$ for some random variable $Z_k$

Memo: $Z_{n,k} = \frac{S_{n,1}}{\sqrt{n}} + \frac{R_n}{\sqrt{n}}$, $n = s_n(k+m) + r_n$, $0 \le r_n < k+m$.

Memo: $S_{n,1} = \sum_{j=0}^{s-1} V_{k,j}$, $V_{k,j} = \sum_{i=1}^{k} Y_{j(k+m)+i}$

b) We have: $V_{k,0}, \ldots, V_{k,s-1}$ i.i.d., $E(V_{k,j}) = 0$, $V(V_{k,j}) = V(S_k)$.

$\frac{S_{n,1}}{\sqrt{n}} = \sqrt{\frac{s_n}{n}} \cdot \frac{1}{\sqrt{s_n}} \sum_{j=0}^{s_n - 1} V_{k,j},$

where $\sqrt{s_n/n} \to 1/\sqrt{k+m}$ and, by Lindeberg-Lévy, $s_n^{-1/2} \sum_{j=0}^{s_n-1} V_{k,j} \xrightarrow{D} N(0, V(S_k))$. Hence

$\frac{S_{n,1}}{\sqrt{n}} \xrightarrow{D} Z_k \sim N\left( 0, \frac{V(S_k)}{k+m} \right).$

Furthermore,

$E\left[ \frac{R_n}{\sqrt{n}} \right] = 0, \quad V\left( \frac{R_n}{\sqrt{n}} \right) = \frac{V(S_r)}{n} \le \frac{(k+m)^2 \sigma_{00}}{n} \quad \text{(Cauchy-Schwarz)}$

$\Longrightarrow \frac{R_n}{\sqrt{n}} \xrightarrow{P} 0$. Slutsky $\Longrightarrow Z_{n,k} \xrightarrow{D} Z_k$.

Memo: c) $Z_k \xrightarrow{D} Z$ as $k \to \infty$ for some random variable $Z$.

Memo: $Z_k \sim N\left( 0, \frac{V(S_k)}{k+m} \right)$

Memo: $\lim_{n\to\infty} \frac{1}{n} V(S_n) = \sigma^2 := \sigma_{00} + 2 \sum_{j=1}^{m} \sigma_{0j}.$

Notice that

$\frac{V(S_k)}{k+m} = \frac{k}{k+m} \cdot \frac{V(S_k)}{k} \to 1 \cdot \sigma^2 = \sigma^2 \quad \text{as } k \to \infty.$

It follows that

$Z_k \xrightarrow{D} Z \sim N(0, \sigma^2) \quad \text{as } k \to \infty, \quad \text{q.e.d.}$

4.5 Examples

a) In Example 4.2 b), i.e. $Y_j = 1\{ X_{j-1} > X_j < X_{j+1} \}$, we have, provided that the distribution function $F$ of $X_1$ is continuous:

$\sqrt{n} \left( \frac{S_n}{n} - \frac{1}{3} \right) \xrightarrow{D} N\left( 0, \frac{2}{45} \right). \quad (!)$

Notice that $P(X_0 > X_1 < X_2) = P(X_1 = \min(X_0, X_1, X_2)) = 1/3$. A simulation sketch follows after this example.

b) In Example 4.2 c), i.e., $Y_j = (1 - X_{j-1}) X_j \cdots X_{j+r-1} (1 - X_{j+r})$, we have, writing $q = 1 - p$:

$E Y_1 = q^2 p^r = E Y_1^2 \Longrightarrow V(Y_1) = q^2 p^r - q^4 p^{2r} = \sigma_{00},$
$E(Y_1 Y_{1+r+1}) = q^3 p^{2r},$
$E(Y_j Y_{j+k}) = 0$ if $k \in \{1, \ldots, r\}$, hence $\sigma_{0k} = \text{Cov}(Y_j, Y_{j+k}) = -q^4 p^{2r}$ for $k \in \{1, \ldots, r\}$,
$\sigma_{0,r+1} = \text{Cov}(Y_1, Y_{1+r+1}) = q^3 p^{2r} - q^4 p^{2r},$
$\text{Cov}(Y_j, Y_\ell) = 0$ otherwise.

Thm. 4.4 $\Longrightarrow$

$\sqrt{n} \left( \frac{S_n}{n} - q^2 p^r \right) \xrightarrow{D} N(0, \sigma^2),$

where $\sigma^2 = \ldots = q^2 p^r + 2 q^3 p^{2r} - (2r+3) q^4 p^{2r}.$
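A simulation sketch for a), assuming NumPy; uniform $X_j$ are one choice of continuous $F$, and $2/45 \approx 0.0444$.

```python
# Example 4.5 a): proportion of local minima in an i.i.d. sequence.
# sqrt(n) (S_n/n - 1/3) is approximately N(0, 2/45).
import numpy as np

rng = np.random.default_rng(6)
n, reps = 2_000, 2_000
X = rng.uniform(size=(reps, n + 2))               # X_0, ..., X_{n+1}
Y = (X[:, :-2] > X[:, 1:-1]) & (X[:, 1:-1] < X[:, 2:])
T = np.sqrt(n) * (Y.mean(axis=1) - 1 / 3)

print(T.var(), 2 / 45)   # empirical variance vs long-run variance
```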

5 The multivariate normal distribution

Let $X = (X_1, \ldots, X_d)^\top$ be a $d$-dimensional random (column) vector.

5.1 Definition (Expectation vector, covariance matrix)

a) If $E|X_j| < \infty$, $j = 1, \ldots, d$, then

$E(X) := (E X_1, \ldots, E X_d)^\top$

is called the expectation (expectation vector) of $X$.

b) If $E X_j^2 < \infty$, $j = 1, \ldots, d$, the $(d \times d)$-matrix

$\Sigma(X) := \left( \text{Cov}(X_j, X_k) \right)_{1 \le j,k \le d}$

is called the covariance matrix of $X$.

More generally, if $Y := (Y_{j\ell})_{1 \le j \le k, 1 \le \ell \le m}$ is $\mathbb{R}^{km}$-valued (a random $(k \times m)$-matrix) and if $E|Y_{j\ell}| < \infty$ $\forall\, j, \ell$, then

$E(Y) := \left( E(Y_{j\ell}) \right)_{k \times m}.$

With this new notation, we have

$\Sigma(X) = E\left[ (X - EX)(X - EX)^\top \right] = E\left[ X X^\top \right] - EX \cdot (EX)^\top.$

5.2 Remark (Affine transformations)

If $A \in \mathbb{R}^{s \times d}$ and $b \in \mathbb{R}^s$, then

a) $E(AX + b) = A\, E(X) + b$,

b) $\Sigma(AX + b) = A\, \Sigma(X)\, A^\top$.

5.3 Theorem (Properties of covariance matrices)

a) $\Sigma(X)$ is symmetric and positive-semidefinite ($\Sigma \ge 0$),

b) $\Sigma(X)$ is singular (non-invertible) $\Longleftrightarrow$

$\exists\, c \in \mathbb{R}^d, c \neq 0, \;\exists\, \gamma \in \mathbb{R}$ such that $P(c^\top X = \gamma) = 1$

($\Longleftrightarrow$ there is a hyperplane $H \subset \mathbb{R}^d$ such that $P(X \in H) = 1$).

Proof: a) Since $\text{Cov}(U, V) = \text{Cov}(V, U)$, $\Sigma(X)$ is symmetric. Let $c := (c_1, \ldots, c_d)^\top \in \mathbb{R}^d$. We have

$0 \le V(c^\top X) = \text{Cov}\left( \sum_{j=1}^{d} c_j X_j, \sum_{k=1}^{d} c_k X_k \right) = \sum_{j=1}^{d} \sum_{k=1}^{d} c_j c_k \, \text{Cov}(X_j, X_k) = c^\top \Sigma(X) c.$

b) $\Sigma(X)$ singular $\Longleftrightarrow \exists\, c \neq 0 : V(c^\top X) = 0 \Longleftrightarrow \exists\, c \neq 0 \;\exists\, \gamma \in \mathbb{R} : P(c^\top X = \gamma) = 1$.

5.4 Definition (d-variate normal distribution)

$X = (X_1, \ldots, X_d)^\top$ has a d-variate normal distribution $:\Longleftrightarrow$

$\forall\, c = (c_1, \ldots, c_d)^\top \in \mathbb{R}^d : c^\top X = \sum_{j=1}^{d} c_j X_j$ has a normal distribution.

Here, $N(a, 0) := \delta_a$.

5.5 Corollary Suppose $X = (X_1, \ldots, X_d)^\top$ has a d-variate normal distribution. We then have:

a) Let $s \in \{1, \ldots, d\}$ and $1 \le i_1 < \ldots < i_s \le d$. Then $(X_{i_1}, \ldots, X_{i_s})^\top$ has an $s$-dimensional normal distribution.

b) $E(X)$ and $\Sigma(X)$ exist (i.e., $E X_j^2 < \infty$, $j = 1, \ldots, d$).

5.6 Remark and definition In the setting of 5.4, we have

$E(c^\top X) = c^\top E(X), \quad V(c^\top X) = c^\top \Sigma(X) c.$

Thm. 1.21 $\Longrightarrow P^X$ in 5.4 is uniquely determined by $a := E(X)$ and $\Sigma := \Sigma(X)$. Manner of speaking: $X$ has a d-variate normal distribution with expectation $a$ and covariance matrix $\Sigma$, for short: $X \sim N_d(a, \Sigma)$ or $P^X = N_d(a, \Sigma)$.

5.7 Corollary (Reproduction theorem for $N_d$)

If $X \sim N_d(a, \Sigma)$, $A \in \mathbb{R}^{s \times d}$, $b \in \mathbb{R}^s$, then

$AX + b \sim N_s\left( Aa + b, \; A \Sigma A^\top \right).$

Proof: $h \in \mathbb{R}^s \Longrightarrow h^\top(AX + b) = (A^\top h)^\top X + h^\top b$ has a univariate normal distribution, q.e.d.

5.8 Lemma If $\Sigma \ge 0$, there is a matrix $A$ with $\Sigma = AA^\top$.

Proof: $\Sigma$ has a complete system of orthonormal eigenvectors with nonnegative eigenvalues, i.e., we have

$\Sigma v_j = \lambda_j v_j, \quad v_j^\top v_k = \delta_{j,k} \quad (j, k = 1, \ldots, d).$

Put $V := (v_1 \cdots v_d) \Longrightarrow V^\top = V^{-1}$. Let $\Lambda := \text{diag}(\lambda_1, \ldots, \lambda_d)$. Then $\Sigma V = V \Lambda$ and $\Sigma = V \Lambda V^\top$. Put $A := V \Lambda^{1/2}$, where $\Lambda^{1/2} := \text{diag}(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_d})$. Then

$\Sigma = V \Lambda^{1/2} \Lambda^{1/2} V^\top = AA^\top.$

5.9 Theorem (Existence of $N_d(a, \Sigma)$)

For each $a \in \mathbb{R}^d$ and each symmetric positive-semidefinite $(d \times d)$-matrix $\Sigma$ there is a random vector $X$ such that $X \sim N_d(a, \Sigma)$.

Proof: Let $Y_1, \ldots, Y_d$ be i.i.d. $\sim N(0,1)$. From the addition theorem for the normal distribution, we have $Y := (Y_1, \ldots, Y_d)^\top \sim N_d(0, I_d)$, where $I_d$ is the unit matrix of order $d$. Let $A$ be a $(d \times d)$-matrix such that $\Sigma = AA^\top$. By the reproduction theorem 5.7, we have

$X := AY + a \sim N_d(a, \Sigma), \quad \text{q.e.d.}$

5.10 Theorem ($N_d(a, \Sigma)$ and independence)

Let $X := (X_1, \ldots, X_k)^\top$, $Y := (Y_1, \ldots, Y_\ell)^\top$. Suppose that $\binom{X}{Y}$ has a $(k+\ell)$-dimensional normal distribution. We then have:

$X, Y$ independent $\Longleftrightarrow \text{Cov}(X_i, Y_j) = 0 \;\forall\, i \in \{1, \ldots, k\}, \;\forall\, j \in \{1, \ldots, \ell\}.$

Proof: "$\Longrightarrow$" is obvious.
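The construction in 5.8/5.9 translates directly into a sampler; a sketch assuming NumPy, with an arbitrary $2 \times 2$ covariance matrix.

```python
# Sampling N_d(a, Sigma) as in 5.9: X = A Y + a with Sigma = A A^T,
# where A = V Lambda^{1/2} comes from the eigendecomposition of 5.8.
import numpy as np

a = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

lam, V = np.linalg.eigh(Sigma)           # Sigma = V diag(lam) V^T
A = V @ np.diag(np.sqrt(lam))            # then Sigma = A A^T

rng = np.random.default_rng(7)
Y = rng.standard_normal((100_000, 2))    # rows i.i.d. N(0, I_2)
X = Y @ A.T + a                          # rows ~ N_2(a, Sigma)

print(X.mean(axis=0))                    # ~ a
print(np.cov(X.T))                       # ~ Sigma
```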

"$\Longleftarrow$": Writing $0_{r \times s}$ for the zero matrix of order $r \times s$, we have by assumption

$\Sigma = \begin{pmatrix} \Sigma(X) & 0_{k \times \ell} \\ 0_{\ell \times k} & \Sigma(Y) \end{pmatrix}.$

From 5.8, there are matrices $A, B$ with $\Sigma(X) = AA^\top$, $\Sigma(Y) = BB^\top$. Let $Z_1, \ldots, Z_{k+\ell}$ be i.i.d. $\sim N(0,1)$. Then

$\begin{pmatrix} U_1 \\ \vdots \\ U_k \\ V_1 \\ \vdots \\ V_\ell \end{pmatrix} := \begin{pmatrix} A & 0_{k \times \ell} \\ 0_{\ell \times k} & B \end{pmatrix} \begin{pmatrix} Z_1 \\ \vdots \\ Z_k \\ Z_{k+1} \\ \vdots \\ Z_{k+\ell} \end{pmatrix} + \begin{pmatrix} EX \\ EY \end{pmatrix} \sim N_{k+\ell}\left( \begin{pmatrix} EX \\ EY \end{pmatrix}, \begin{pmatrix} AA^\top & 0_{k \times \ell} \\ 0_{\ell \times k} & BB^\top \end{pmatrix} \right).$

Put $U := (U_1, \ldots, U_k)^\top$, $V := (V_1, \ldots, V_\ell)^\top$. Then

$U = A (Z_1 \cdots Z_k)^\top + EX, \quad V = B (Z_{k+1} \cdots Z_{k+\ell})^\top + EY.$

Notice that $U$ and $V$ are independent (why?) and that $U \overset{D}{=} X$, $V \overset{D}{=} Y$. Since $\binom{X}{Y} \overset{D}{=} \binom{U}{V}$, the assertion follows. (why?)

5.11 Corollary Let $X = (X_1, \ldots, X_d)^\top \sim N_d(a, \Sigma)$. We then have:

$X_1, \ldots, X_d$ independent $\Longleftrightarrow \Sigma$ is a diagonal matrix.

5.12 Theorem (Addition theorem)

If $X \sim N_d(a, \Sigma)$, $Y \sim N_d(b, T)$ and $X$ and $Y$ are independent, then

$X + Y \sim N_d(a + b, \Sigma + T).$

Proof: Exercise!

5.13 Theorem (Density of a non-degenerate normal distribution)

The distribution $N_d(a, \Sigma)$ is called non-degenerate if $\det(\Sigma) > 0$, otherwise degenerate. If $\det(\Sigma) > 0$ and $X \sim N_d(a, \Sigma)$, then $X$ has the Lebesgue density

$f(x) = \frac{1}{(2\pi)^{d/2} \sqrt{\det(\Sigma)}} \exp\left( -\frac{1}{2} (x-a)^\top \Sigma^{-1} (x-a) \right), \quad x \in \mathbb{R}^d.$

Proof: Let $\Sigma = AA^\top$, $Z := (Z_1, \ldots, Z_d)^\top \sim N_d(0, I_d)$. We have $f_Z(z) = (2\pi)^{-d/2} \exp(-z^\top z / 2)$. Since $X \overset{D}{=} AZ + a$ and

$f_{AZ+a}(x) = f_Z(A^{-1}(x-a)) / |\det(A)|, \quad |\det(A)| = \sqrt{\det(\Sigma)},$

we are done.

5.14 Principal component decomposition

As in 5.8, let $\Sigma = V \Lambda V^\top$, $\Lambda = \text{diag}(\lambda_1, \ldots, \lambda_d)$, $X \sim N_d(a, \Sigma)$. Assume that $\Sigma$ is invertible. We then have

$X = \sum_{j=1}^{d} (v_j^\top X) v_j = \sum_{j=1}^{d} \left( v_j^\top (X - a) \right) v_j + \underbrace{\sum_{j=1}^{d} (v_j^\top a) v_j}_{=\, a} = \sum_{j=1}^{d} \sqrt{\lambda_j} \, Z_j \, v_j + a,$

where $Z_j := \lambda_j^{-1/2} v_j^\top (X - a)$, $j = 1, \ldots, d$.

Check that $Z_1, \ldots, Z_d$ are i.i.d. $N(0,1)$. (!)

If $\lambda_1 \ge \ldots \ge \lambda_d$, then $\sqrt{\lambda_j} Z_j v_j$ is called the $j$th principal component of $X$.

(Figure: a two-dimensional point cloud centred at $(a_1, a_2)$, with orthogonal directions $v_1, v_2$ and component lengths $\sqrt{\lambda_1} Z_1$, $\sqrt{\lambda_2} Z_2$ along them.)

5.15 Theorem If $X \sim N_d(a, \Sigma)$, $\det(\Sigma) > 0$, then

$(X - a)^\top \Sigma^{-1} (X - a) \sim \chi_d^2.$

Proof: Let $Q := (X - a)^\top \Sigma^{-1} (X - a)$. We have

$X \overset{D}{=} AZ + a, \quad \text{where } Z \sim N_d(0, I_d), \; \Sigma = AA^\top.$

Put $Z =: (Z_1, \ldots, Z_d)^\top$. We then have

$Q \overset{D}{=} (AZ)^\top \Sigma^{-1} (AZ) = Z^\top A^\top \left( AA^\top \right)^{-1} A Z = Z^\top Z = \sum_{j=1}^{d} Z_j^2 \sim \chi_d^2.$
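Continuing the sampler sketched after 5.10, Theorem 5.15 can be checked empirically; again only NumPy is assumed, and any $A$ with $\Sigma = AA^\top$ works (here a Cholesky factor instead of the eigendecomposition).

```python
# Empirical check of 5.15: the squared Mahalanobis distance of
# X ~ N_d(a, Sigma) is chi^2_d, with mean d and variance 2d.
import numpy as np

d = 3
rng = np.random.default_rng(8)
a = rng.normal(size=d)
M = rng.normal(size=(d, d))
Sigma = M @ M.T + d * np.eye(d)          # positive definite by construction

A = np.linalg.cholesky(Sigma)            # Sigma = A A^T
X = rng.standard_normal((200_000, d)) @ A.T + a

Sinv = np.linalg.inv(Sigma)
Q = np.einsum('ij,jk,ik->i', X - a, Sinv, X - a)
print(Q.mean(), Q.var())                 # ~ d and ~ 2d
```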

6 Convergence in distribution and CLT in R^d

Let $X = (X_1, \ldots, X_d)$ be a $d$-dimensional random vector on $(\Omega, \mathcal{A}, P)$.

6.1 Definition (Distribution function of a random vector)

The function $F : \mathbb{R}^d \to [0, 1]$, defined by

$F(x) := P(X_1 \le x_1, \ldots, X_d \le x_d), \quad x = (x_1, \ldots, x_d) \in \mathbb{R}^d,$

is called the distribution function of $X$.

In what follows, the notation $x^{(n)} \downarrow x$ for a sequence $(x^{(n)})$ in $\mathbb{R}^d$ and $x \in \mathbb{R}^d$ means

$x_j^{(n)} \downarrow x_j \quad \text{for each } j = 1, \ldots, d.$

Here, $x = (x_1, \ldots, x_d)$ and $x^{(n)} = (x_1^{(n)}, \ldots, x_d^{(n)})$.

6.2 Theorem (Properties of F) We have:

a) If $x, y \in \mathbb{R}^d$ and $x \le y$, then

$0 \le \Delta_x^y F := \sum_{(\varepsilon_1, \ldots, \varepsilon_d) \in \{0,1\}^d} (-1)^{d - \varepsilon_1 - \ldots - \varepsilon_d} F\left( y_1^{\varepsilon_1} x_1^{1-\varepsilon_1}, \ldots, y_d^{\varepsilon_d} x_d^{1-\varepsilon_d} \right)$

(generalized monotonicity),

b) $F$ is continuous from above, i.e., if $x^{(n)} \downarrow x$ then $\lim_{n\to\infty} F(x^{(n)}) = F(x)$,

c) If $x_j^{(n)} \to -\infty$ for some $j \in \{1, \ldots, d\}$, then $F(x^{(n)}) \to 0$; if $x_j^{(n)} \to \infty$ for each $j \in \{1, \ldots, d\}$, then $F(x^{(n)}) \to 1$.

Proof: Exercise! (Use that $P^X$ is continuous from above and from below, and that $\Delta_x^y F = P(X \in (x, y])$.)

6.3 Remarks

a) $P^X$ is uniquely determined by the values

$P^X(A), \quad A \in \mathcal{H}^d := \{ (x, y] : x, y \in \mathbb{R}^d, x \le y \}$

(uniqueness theorem for measures).

b) If a function $F : \mathbb{R}^d \to [0, 1]$ satisfies 6.2 a)-c), there is a unique probability measure $Q$ on $\mathcal{B}^d$ such that

$Q((x, y]) = \Delta_x^y F \quad \forall\, (x, y] \in \mathcal{H}^d$

(follows from Caratheodory's extension theorem, see e.g. Billingsley, P.: Probability and Measure, p. 177).

Notation: Let $\mathcal{O}^d$, $\mathcal{A}^d$ denote the class of open and closed sets in $\mathbb{R}^d$, respectively. For a set $B \subset \mathbb{R}^d$, let

$B^\circ := \bigcup \{ O \in \mathcal{O}^d : O \subset B \}$ (interior of $B$),
$\overline{B} := \bigcap \{ A \in \mathcal{A}^d : A \supset B \}$ (closure of $B$),
$\partial B := \overline{B} \setminus B^\circ$ (boundary of $B$).

Let $C_b := \{ f : \mathbb{R}^d \to \mathbb{R} : f \text{ bounded and continuous} \}$. Let $X, X_1, X_2, \ldots$ be $d$-dimensional random vectors on $(\Omega, \mathcal{A}, P)$. Put $Q := P^X$, $Q_n := P^{X_n}$, $F(x) := P(X \le x)$, $F_n(x) := P(X_n \le x)$.

6.4 Theorem (Portmanteau theorem)

The following assertions are equivalent:

a) $\lim_{n\to\infty} \int h \, dQ_n = \int h \, dQ \quad \forall\, h \in C_b$,

b) $\limsup_{n\to\infty} Q_n(A) \le Q(A) \quad \forall\, A \in \mathcal{A}^d$,

c) $\liminf_{n\to\infty} Q_n(O) \ge Q(O) \quad \forall\, O \in \mathcal{O}^d$,

d) $\lim_{n\to\infty} Q_n(B) = Q(B) \quad \forall\, B \in \mathcal{B}^d$ such that $Q(\partial B) = 0$,

e) $\lim_{n\to\infty} F_n(x) = F(x) \quad \forall\, x \in C(F)$ (the set of continuity points of $F$).

Notice that $\int h \, dQ_n = E h(X_n)$, $\int h \, dQ = E h(X)$, $Q_n(A) = P(X_n \in A)$ etc. Statements can be rephrased in terms of $X_n$ and $X$.

Proof: "a) $\Longrightarrow$ b)":

Memo: a) $\int h \, dQ_n \to \int h \, dQ \;\forall\, h \in C_b$, b) $\limsup_{n\to\infty} Q_n(A) \le Q(A) \;\forall\, A \in \mathcal{A}^d$

Let $\|\cdot\|$ be the Euclidean norm on $\mathbb{R}^d$. Fix $A \in \mathcal{A}^d$. Put

$h_j(x) := \max(0, 1 - j \, \|x - A\|), \quad j \ge 1,$

where $\|x - A\| := \inf\{ \|x - y\| : y \in A \}$. Then $h_j \in C_b$ and $h_j \ge 1_A$, $j \ge 1$. Moreover, $1 \ge h_j \downarrow 1_A$ as $j \to \infty$. (why?) From a), we have

$\lim_{n\to\infty} \int h_j \, dQ_n = \int h_j \, dQ, \quad j \ge 1.$

Furthermore,

$Q_n(A) = \int 1_A \, dQ_n \le \int h_j \, dQ_n, \quad n \ge 1, \; j \ge 1,$

and thus

$\limsup_{n\to\infty} Q_n(A) \le \int h_j \, dQ, \quad j \ge 1.$

Since $\int h_j \, dQ \downarrow \int 1_A \, dQ = Q(A)$ as $j \to \infty$ (why?), b) follows.

Memo: b) $\limsup_{n\to\infty} Q_n(A) \le Q(A) \;\forall\, A \in \mathcal{A}^d$, c) $\liminf_{n\to\infty} Q_n(O) \ge Q(O) \;\forall\, O \in \mathcal{O}^d$

Memo: d) $\lim_{n\to\infty} Q_n(B) = Q(B) \;\forall\, B \in \mathcal{B}^d$ such that $Q(\partial B) = 0$

"b) $\Longleftrightarrow$ c)": Take complements!

"b) + c) $\Longrightarrow$ d)": Let $B \in \mathcal{B}^d$. We have

$Q(B^\circ) \le \liminf_{n\to\infty} Q_n(B^\circ) \quad \text{(by c))}$
$\le \liminf_{n\to\infty} Q_n(B) \le \limsup_{n\to\infty} Q_n(B) \le \limsup_{n\to\infty} Q_n(\overline{B})$
$\le Q(\overline{B}) \quad \text{(by b))}$
$= Q(B^\circ) + Q(\partial B).$

If $Q(\partial B) = 0$, then $\lim_{n\to\infty} Q_n(B) = Q(B)$, q.e.d.

Memo: d) $Q_n(B) \to Q(B) \;\forall\, B \in \mathcal{B}^d$ s.th. $Q(\partial B) = 0$

Memo: a) $\int h \, dQ_n \to \int h \, dQ \;\forall\, h \in C_b$

"d) $\Longrightarrow$ a)": Approximate $h \in C_b$ by

$h_m := \sum_{j=1}^{m} \alpha_j 1_{B_j} \quad \text{such that } Q(\partial B_j) = 0 \;\forall\, j.$

To this end, let $K := \sup_{x \in \mathbb{R}^d} |h(x)| < \infty$. Fix $\varepsilon > 0$. Choose $\alpha_0 < \alpha_1 < \ldots < \alpha_m$ such that $\alpha_0 < -K$, $\alpha_m > K$ and $\alpha_j - \alpha_{j-1} \le \varepsilon$ for each $j = 1, \ldots, m$. If

$B_j := \{ x \in \mathbb{R}^d : \alpha_{j-1} < h(x) \le \alpha_j \} = \{ \alpha_{j-1} < h \le \alpha_j \},$

then $\|h - h_m\|_\infty \le \varepsilon$. Notice that $\partial B_j \subset \{ h = \alpha_{j-1} \} \cup \{ h = \alpha_j \}$. Hence if, in addition,

$P(h(X) \in \{ \alpha_0, \ldots, \alpha_m \}) = 0,$

then $Q(\partial B_j) = 0$ for each $j = 1, \ldots, m$.

Memo: d) $Q_n(B) \to Q(B) \;\forall\, B \in \mathcal{B}^d$ s.th. $Q(\partial B) = 0$

Memo: a) $\int h \, dQ_n \to \int h \, dQ \;\forall\, h \in C_b$

Memo: $h_m = \sum_{j=1}^{m} \alpha_j 1_{B_j}$, $Q(\partial B_j) = 0 \;\forall\, j$, $\|h - h_m\|_\infty \le \varepsilon$

Now,

$\left| \int h \, dQ_n - \int h \, dQ \right| \le \left| \int h \, dQ_n - \int h_m \, dQ_n \right| + \left| \int h_m \, dQ_n - \int h_m \, dQ \right| + \left| \int h_m \, dQ - \int h \, dQ \right|$

$\le \int |h - h_m| \, dQ_n + \left| \sum_{j=1}^{m} \alpha_j \left( Q_n(B_j) - Q(B_j) \right) \right| + \int |h_m - h| \, dQ$

$\le 2\varepsilon + \underbrace{\left| \sum_{j=1}^{m} \alpha_j \left( Q_n(B_j) - Q(B_j) \right) \right|}_{\to\, 0 \text{ by d)}}.$

Hence, $\limsup_{n\to\infty} \left| \int h \, dQ_n - \int h \, dQ \right| \le 2\varepsilon$, q.e.d. (since $\varepsilon > 0$ was arbitrary).

Memo: d) $Q_n(B) \to Q(B) \;\forall\, B \in \mathcal{B}^d$ s.th. $Q(\partial B) = 0$

Memo: e) $F_n(x) \to F(x) \;\forall\, x \in C(F)$

Memo: c) $\liminf_{n\to\infty} Q_n(O) \ge Q(O) \;\forall\, O \in \mathcal{O}^d$

"d) $\Longrightarrow$ e)": Let $B_x := (-\infty, x]$, $x \in \mathbb{R}^d$. Check that $x \in C(F) \Longleftrightarrow Q(\partial B_x) = 0$ (!), q.e.d.

"e) $\Longrightarrow$ c)": Let $D$ be a countable subset of $\mathbb{R}$ such that $\overline{D} = \mathbb{R}$ and

$Q\left( \{ (x_1, \ldots, x_d) \in \mathbb{R}^d : x_j = a \} \right) = 0 \quad \forall\, a \in D \;\forall\, j = 1, \ldots, d.$

Then $D^d \subset C(F)$. (!) Let

$\mathcal{M} := \{ (a_1, b_1] \times \cdots \times (a_d, b_d] : a_j, b_j \in D, \; a_j < b_j \text{ for } j \in \{1, \ldots, d\} \}.$

From e), we have

$Q_n\left( (a_1, b_1] \times \cdots \times (a_d, b_d] \right) = \Delta_a^b F_n \to \Delta_a^b F = Q\left( (a_1, b_1] \times \cdots \times (a_d, b_d] \right),$

i.e. $Q_n(B) \to Q(B)$ for each $B \in \mathcal{M}$.

Memo: $D^d \subset C(F)$

Memo: $\mathcal{M} = \{ (a_1, b_1] \times \cdots \times (a_d, b_d] : a_j, b_j \in D, a_j < b_j \}$

Memo: $Q_n(B) \to Q(B)$ for each $B \in \mathcal{M}$.

The system $\mathcal{M} \cup \{\emptyset\}$ is closed with respect to finite intersections. From the inclusion-exclusion formula, we thus obtain $Q_n(B) \to Q(B)$ if $B$ is a finite union of sets in $\mathcal{M}$.

Fix $O \in \mathcal{O}^d$, $O \neq \emptyset$. Since the system $\mathcal{M}$ is "sufficiently rich", there are $B_1, B_2, \ldots \in \mathcal{M}$ such that

$O = \bigcup_{j=1}^{\infty} B_j.$

For fixed $k \in \mathbb{N}$, we have

$Q\left( \bigcup_{j=1}^{k} B_j \right) = \lim_{n\to\infty} Q_n\left( \bigcup_{j=1}^{k} B_j \right) \le \liminf_{n\to\infty} Q_n(O).$

Since $Q$ is continuous from below, it follows that

$Q(O) \le \liminf_{n\to\infty} Q_n(O), \quad \text{q.e.d.}$

6.5 Definition (Convergence in distribution of random vectors)

Let $X, X_1, X_2, \ldots$ be $d$-dimensional random vectors on some probability space $(\Omega, \mathcal{A}, P)$.

$X_n \xrightarrow{D} X \;:\Longleftrightarrow\; \lim_{n\to\infty} E h(X_n) = E h(X) \quad \forall\, h \in C_b.$

By the Portmanteau theorem, there are the following equivalent statements:

$\limsup_{n\to\infty} P(X_n \in A) \le P(X \in A) \quad \forall\, A \in \mathcal{A}^d,$

$\liminf_{n\to\infty} P(X_n \in O) \ge P(X \in O) \quad \forall\, O \in \mathcal{O}^d,$

$\lim_{n\to\infty} P(X_n \in B) = P(X \in B) \quad \forall\, B \in \mathcal{B}^d$ such that $P(X \in \partial B) = 0$,

$\lim_{n\to\infty} F_n(x) = F(x) \quad \forall\, x \in C(F).$

Equivalent notations: $X_n \xrightarrow{D} X$, $X_n \xrightarrow{D} Q := P^X$, $F_n \xrightarrow{D} F$.

6.6 Theorem (Continuous Mapping Theorem, CMT)

Suppose $X_n \xrightarrow{D} X$. If $h : \mathbb{R}^d \to \mathbb{R}^s$ is measurable and $P(X \in C(h)) = 1$, then

$h(X_n) \xrightarrow{D} h(X).$

In terms of $Q_n := P^{X_n}$, $Q := P^X$, an equivalent statement is: if $Q_n \xrightarrow{D} Q$ and $Q(C(h)) = 1$, then $Q_n^h \xrightarrow{D} Q^h$.

Notice that the continuity of $h$ is a sufficient condition.

Proof: Fix $A \in \mathcal{A}^s$. To show: $\limsup_{n\to\infty} Q_n^h(A) \le Q^h(A)$. Notice that

$\overline{h^{-1}(A)} \subset \left( \mathbb{R}^d \setminus C(h) \right) \cup h^{-1}(A) \quad \text{(why?)} \Longrightarrow$

$\limsup_{n\to\infty} Q_n\left( h^{-1}(A) \right) \le \limsup_{n\to\infty} Q_n\left( \overline{h^{-1}(A)} \right) \le Q\left( \overline{h^{-1}(A)} \right) \le Q\left( \mathbb{R}^d \setminus C(h) \right) + Q\left( h^{-1}(A) \right) = 0 + Q^h(A), \quad \text{q.e.d.}$

6.7 Theorem (Slutsky's Lemma)

Let $X, X_1, X_2, \ldots; Y_1, Y_2, \ldots$ be $d$-dimensional random vectors. We then have:

$X_n \xrightarrow{D} X$ and $Y_n \xrightarrow{P} 0 \quad \Longrightarrow \quad X_n + Y_n \xrightarrow{D} X.$

Proof: Fix $A \in \mathcal{A}^d$ and $\varepsilon > 0$. The set

$A^\varepsilon := \{ x \in \mathbb{R}^d : \exists\, y \in A \text{ with } \|x - y\| \le \varepsilon \}$

is closed, and the triangle inequality yields

$\{ X_n + Y_n \in A \} \subset \{ X_n \in A^\varepsilon \} \cup \{ \|Y_n\| > \varepsilon \}. \quad (!)$

It follows that

$\limsup_{n\to\infty} P(X_n + Y_n \in A) \le \limsup_{n\to\infty} P(X_n \in A^\varepsilon) + 0 \le P(X \in A^\varepsilon)$

(by Portmanteau, since $X_n \xrightarrow{D} X$). Letting $\varepsilon \downarrow 0$ gives $A^\varepsilon \downarrow A$ because $A$ is closed, so $\limsup_{n\to\infty} P(X_n + Y_n \in A) \le P(X \in A)$, and the Portmanteau theorem yields the assertion.

6.8 Definition (Tightness and relative compactness)

Let $\mathcal{Q} \neq \emptyset$ be a set of probability measures on $\mathcal{B}^d$.

a) $\mathcal{Q}$ tight $:\Longleftrightarrow \forall\, \varepsilon > 0 \;\exists\, K \subset \mathbb{R}^d$, $K$ compact: $Q(K) \ge 1 - \varepsilon \;\forall\, Q \in \mathcal{Q}$.

b) $\mathcal{Q}$ relatively compact $:\Longleftrightarrow \forall\, (P_n) \in \mathcal{Q}^{\mathbb{N}} \;\exists$ subsequence $(P_{n_k})$ $\exists$ probability measure $Q$: $P_{n_k} \xrightarrow{D} Q$ as $k \to \infty$.

6.9 Theorem (Prokhorov) $\mathcal{Q}$ tight $\Longleftrightarrow \mathcal{Q}$ relatively compact.

Proof: "$\Longleftarrow$": Suppose that $\mathcal{Q}$ is not tight, i.e., $\exists\, \varepsilon > 0$ $\exists$ sequence $(Q_n)$ in $\mathcal{Q}$ such that $Q_n(A_n) < 1 - \varepsilon$ for each $n \ge 1$, where $A_n := [-n, n]^d$.

Assumption $\Longrightarrow \exists$ subsequence $(Q_{n_k})$ $\exists\, Q$ with $Q_{n_k} \xrightarrow{D} Q$.

Put $A := [-M, M]^d$, where $M > 0$ is chosen to have $Q(A) \ge 1 - \varepsilon/2$ and $Q(\partial A) = 0$. Then, as $k \to \infty$, $Q_{n_k}(A) \to Q(A) \ge 1 - \varepsilon/2$. Since $A_{n_k} \supset A$ for sufficiently large $k$, we have

$Q_{n_k}(A) \le Q_{n_k}(A_{n_k}) < 1 - \varepsilon$

for those $k$, a contradiction!

"$\Longrightarrow$": (for $d = 1$; for $d > 1$ see Billingsley, P.: Probability and Measure, p. 392).

Let $(Q_n)$ be an arbitrary sequence in $\mathcal{Q}$. Put $F_n(x) := Q_n((-\infty, x])$, $x \in \mathbb{R}$. Bolzano-Weierstraß and Cantor's diagonal procedure $\Longrightarrow \exists$ subsequence $(F_{n_k})$ $\exists\, G : \mathbb{Q} \to [0, 1]$ such that

$G(q) := \lim_{k\to\infty} F_{n_k}(q)$

exists for each $q \in \mathbb{Q}$. Put $F(x) := \inf\{ G(q) : q > x \}$, $x \in \mathbb{R}$. Then $F$ is nondecreasing and, by definition, for each $x \in \mathbb{R}$ and $\varepsilon > 0$ there is a $q \in \mathbb{Q}$ such that $x < q$ and $G(q) < F(x) + \varepsilon$. If $x \le y < q$, then $F(y) \le G(q) < F(x) + \varepsilon$. Hence $F$ is continuous from the right.

If $x \in C(F)$, choose $y < x$ so that $F(x) - \varepsilon < F(y)$. Now choose $r, s \in \mathbb{Q}$ so that $y < r < x < s$ and $G(s) < F(x) + \varepsilon$. Since

$F(x) - \varepsilon < G(r) \le G(s) < F(x) + \varepsilon$

and $F_n(r) \le F_n(x) \le F_n(s)$, $n \ge 1$, it follows that

$F(x) - \varepsilon < G(r) \le \liminf_{k\to\infty} F_{n_k}(x) \le \limsup_{k\to\infty} F_{n_k}(x) \le G(s) < F(x) + \varepsilon$

and thus $\lim_{k\to\infty} F_{n_k}(x) = F(x)$.

Memo: $(Q_n)$ sequence in $\mathcal{Q}$, $F_n(x) := Q_n((-\infty, x])$ $\Longrightarrow \exists\, F : \mathbb{R} \to [0, 1]$

Memo: $F$ nondecreasing, continuous from the right, $\exists\, (n_k) : F_{n_k}(x) \to F(x) \;\forall\, x \in C(F)$

In general, $F$ need not be a distribution function, i.e. $0 < F(-\infty)$ and/or $F(\infty) < 1$ are possible. But: since $\mathcal{Q}$ is tight, we have:

$\forall\, \varepsilon > 0 \;\exists\, a, b$ with $a < b$ and $Q_n((a, b]) = F_n(b) - F_n(a) \ge 1 - \varepsilon$, $n \ge 1$.

Let $a', b' \in C(F)$ with $a' < a$, $b' > b$. Then

$1 - \varepsilon \le Q_{n_k}((a, b]) \le Q_{n_k}((a', b']) = F_{n_k}(b') - F_{n_k}(a') \to F(b') - F(a').$

Hence, $F$ is a distribution function. Let $Q$ be the distribution associated with $F$, q.e.d.

6.10 Remarks

a) $(X_n)_{n \ge 1}$ tight $:\Longleftrightarrow \{ P^{X_n} : n \ge 1 \}$ tight.

b) $X_n \xrightarrow{D} X \Longrightarrow (X_n)_{n \ge 1}$ tight.

c) If $(X_n)$ is tight and there is a probability distribution $Q$ such that $X_{n_k} \xrightarrow{D} Q$ for each subsequence $(X_{n_k})$ that converges in distribution, then $X_n \xrightarrow{D} X$, where $P^X = Q$.

Page 74: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in distribution and CLT in Rd

6.11 Theorem (Continuity Theorem of Levy-Cramer)

LetX,X1, X2, . . . be d-dimensional random vectors with characteristic functionsϕ,ϕ1, ϕ2, . . .. We then have:

XnD−→ X ⇐⇒ lim

n→∞ϕn(t) = ϕ(t) ∀ t ∈ R

d.

Proof:”=⇒“: For fixed t ∈ Rd, put h1(x) := cos(t⊤x), h2(x) := sin(t⊤x),

and use the definition of XnD−→ X.

”⇐=“: Let Xn =: (X

(1)n , . . . , X

(d)n )⊤, X =: (X(1), . . . , X(d))⊤. Write

ej := (0, . . . , 0, 1, 0, . . . , 0)⊤ for the jth unit vector in Rd. Put t := αej , whereα ∈ R. Then

ϕX

(j)n

(α) = E

[exp

(iαX(j)

n

) ]= ϕn(αej) → ϕ(αej)

= E

[exp

(iαX(j)

)]= ϕX(j) (α).

Thm. 1.17 =⇒ X(j)n

D−→ X(j) for each j ∈ 1, . . . , d. It follows that

(X(j)n )n≥1 is tight for each j ∈ 1, . . . , d. Thus, (Xn)n≥1 is tight. (!) Thm.

6.9 =⇒ ∃ subsequence (Xnk) ∃ r.v. Y with Xnk

D−→ Y as k →∞. Part”=⇒“

and 1.21 a) =⇒ XD= Y , i.e., Xnk

D−→ X. 6.10 c) =⇒ assertion, q.e.d.

Norbert Henze, KIT 6.17

Page 75: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in distribution and CLT in Rd

6.12 Theorem (Cramer-Wold-Device)

Let X,X1, X2, . . . be d-dimensional random vectors. We then have:

XnD−→ X ⇐⇒ c⊤Xn

D−→ c⊤X ∀c ∈ Rd.

Proof:”=⇒“ : Put h(x) := c⊤x and use the Continuous Mapping Theorem.

”⇐=“ : We have

ϕXn(c) = E

[exp

(ic⊤Xn

)]= ϕc⊤Xn

(1)

ϕX(c) = E

[exp

(ic⊤X

)]= ϕc⊤X(1).

By the Continuity Theorem of Levy-Cramer in R, we haveϕc⊤Xn

(1)→ ϕc⊤X(1). Thus, ϕXn(c)→ ϕX(c) for each c ∈ Rd, and theassertion follows from Theorem 6.11.

Norbert Henze, KIT 6.18

Page 76: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in distribution and CLT in Rd

6.13 Theorem (Multivariate Central Limit Theorem)

Let X1, X2, . . . be i.i.d. d-dimensional random vectors such that E‖X1‖2 <∞.Putting a := EX1, Σ := Σ(X1), we have

1√n

(n∑

j=1

Xj − na)

D−→ Nd(0,Σ).

Proof: Let Zn := n−1/2(∑n

j=1Xj − na), Y ∼ Nd(0,Σ). To show:

c⊤ZnD−→ c⊤Y ∀c ∈ R

d.

Notice that

c⊤Zn =1√n

(n∑

j=1

c⊤Xj − nc⊤a).

We have E(c⊤Zn) = 0, V(c⊤Zn) = V(c⊤X1) = c⊤Σc, c⊤Y ∼ N(0, c⊤Σc)=⇒ w.l.o.g. c⊤Σc > 0. Thm. 1.22, applied to (c⊤Xj)j≥1, yields

c⊤Zn√c⊤Σc

=

∑nj=1 c

⊤Zn − nc⊤a√nc⊤Σc

D−→ N(0, 1).

The CMT yields c⊤ZnD−→√c⊤Σc N(0, 1) = N(0, c⊤Σc), q.e.d.

Norbert Henze, KIT 6.19

Page 77: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in distribution and CLT in Rd

6.14 Example (Chi-square-test)

Let X1, X2, . . . be i.i.d., P(X1 = ej) := pj , j = 1, . . . , s,0 < pj < 1 ∀j, p1 + . . .+ ps = 1, ej is jth unit vector in Rs. Then

∑nj=1Xj ∼ Mult(n; p1, . . . , ps) (multinomial distribution).

a := E(X1) = (p1, . . . , ps)⊤,

Σ := Σ(X1) = (pjδkj − pjpk)1≤j,k≤s (Σ is singular!)

6.13 =⇒ 1√n

(n∑

j=1

Xj − na)

D−→ Z := (Z1, . . . , Zs)⊤ ∼ Ns(0,Σ).

Let A := (pjδkj − pjpk)1≤j,k≤s−1 =⇒ A−1 = (δjkp−1k + p−1

s )1≤j,k≤s−1. Put∑n

j=1Xj =: (Nn,1, . . . , Nn,s−1, Nn,s)⊤, Vn := (Nn,1, . . . , Nn,s−1)

⊤.

CMT =⇒Wn :=1√n

(Vn−n(p1, . . . , ps−1)

⊤)

D−→ (Z1, . . . , Zs−1)⊤ ∼ Ns−1(0, A).

CMT =⇒W⊤n A

−1WnD−→ (Z1, . . . , Zs−1)A

−1(Z1, . . . , Zs−1)⊤ ∼ χ2

s−1.

We have W⊤n A

−1Wn =s∑

j=1

(Nn,j − npj)2npj

(use Nn,1 + . . .+Nn,s = n).

Norbert Henze, KIT 6.20

Page 78: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in distribution and CLT in Rd

6.15 Theorem (Delta Method)

Let (Tn) be a sequence of d-dimensional random vectors such that, for someϑ ∈ Rd, √

n (Tn − ϑ) D−→ X ∼ Nd(0,Σ). (6.1)

Suppose that the measurable function g : Rd → Rs is differentiable at ϑ with(s× d)-Jacobian matrix g′(ϑ). We then have

√n (g(Tn)− g(ϑ)) D−→ Ns

(0, g′(ϑ)Σg′(ϑ)⊤

).

Proof: We have (pointwise on the underlying probability space)√n (g(Tn)− g(ϑ)) = g′(ϑ)

√n(Tn − ϑ) + ‖

√n(Tn − ϑ)‖ r(Tn − ϑ),

where r(Tn − ϑ)→ 0 as Tn → ϑ. (6.1) =⇒ Tn − ϑ P−→ 0 (!). Invoking 1.4, it

follows that r(Tn − ϑ) P−→ 0. Furthermore, (6.1) and the CMT yield

‖√n(Tn − ϑ)‖ D−→ ‖X‖. From Slutsky’s lemma, we therefore have

‖√n(Tn − ϑ)‖ · r(Tn − ϑ) P−→ 0.

From (6.1) and the CMT, we obtain g′(ϑ)√n(Tn − ϑ) D−→ g′(ϑ)X. Now, 5.7

implies g′(ϑ)X ∼ Ns(0, g′(ϑΣg′(ϑ)⊤), and the assertion follows from 6.7.

Norbert Henze, KIT 6.21

Page 79: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in distribution and CLT in Rd

6.16 Stochastic Landau notations

Let X,X1, X2, . . . be d-dimensional random vectors and (an) a sequence ofpositive real numbers. The following notation is frequently encountered:

Xn = OP(1) :⇐⇒ (Xn)n≥1 tight,

Xn = OP(an) :⇐⇒(Xn

an

)

n≥1

tight,

Xn = oP(1) :⇐⇒ XnP−→ 0,

Xn = oP(an) :⇐⇒ Xn

an

P−→ 0,

Xn = X + oP(1) :⇐⇒ Xn −X P−→ 0.

Norbert Henze, KIT 6.22

Page 80: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in distribution and CLT in Rd

6.17 Theorem (Properties of OP and oP)

Let Xn, Yn(n ≥ 1) be d-dimensional random vectors and (Zn)n≥1 a sequenceof random variables. We then have:

a) Xn = OP(1), Yn = OP(1) =⇒ Xn + Yn = OP(1),

b) Xn = oP(1), Yn = oP(1) =⇒ Xn + Yn = oP(1),

c) Xn = OP(1), Zn = OP(1) =⇒ Xn · Zn = OP(1),

d) Xn = OP(1), Zn = oP(1) =⇒ Xn · Zn = oP(1),

e) Xn = OP(1), h : Rd → Rs continuous =⇒ h(Xn) = OP(1).

Proof: Exercise!

6.18 Corollary If XnD−→ X and Zn = a+ oP(1), then ZnXn

D−→ aX.

Proof: We haveZnXn = (Zn − a)Xn + aXn.

Since XnD−→ X implies Xn = OP(1), the first summand on the right-hand

side is oP(1) by 6.17 d). The second term converges to aX in distribution by

the CMT. Hence ZnXnD−→ aX by Slutsky’s lemma.

Norbert Henze, KIT 6.23

Page 81: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Empirical distribution functions

7 Empirical distribution functions

Let X1, X2, . . . be i.i.d. random variables on a probability space (Ω,A, P)having distribution function F (x) := P(X1 ≤ x), x ∈ R.

7.1 Definition (Empirical distribution function)

The function

Fn :

Ω× R→ [0, 1]

(ω, x) 7→ Fωn (x) :=

1

n

n∑

j=1

1Xj(ω) ≤ x

is called the empirical distribution function (EDF) of X1, . . . , Xn.

7.2 Remarks

a) For fixed ω ∈ Ω, Fωn is the distribution function of the discrete probability

measure n−1∑n

j=1 δXj(ω).

b) For fixed x ∈ R, Fn(x) := n−1∑nj=1 1Xj ≤ x is a random variable.

SLLN =⇒ Fn(x)a.s.−→ F (x) as n→∞.

Norbert Henze, KIT 7.1

Page 82: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Empirical distribution functions

1

x

Fω8 (x)

x6 x2 x7 x5 x1 x3 x8 x4

.5

••

••

••

••

Realization of an EDF corresponding to data xj = Xj(ω), j = 1, . . . , 8

Norbert Henze, KIT 7.2

Page 83: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Empirical distribution functions

7.3 Theorem (Glivenko-Cantelli, fundamental theorem of statistics)

We havelim

n→∞supx∈R

∣∣Fn(x)− F (x)∣∣ = 0 P-a.s.

Proof: Let

Dn := supx∈R

∣∣Fn(x)− F (x)∣∣(= ‖Fn − F‖∞

),

Dωn := sup

x∈R

∣∣Fωn (x)− F (x)

∣∣, ω ∈ Ω.

Notice that, by right continuity, Dn = supx∈Q

∣∣Fn(x)− F (x)∣∣. Hence, Dn is

measurable (!) and thus a random variable.

To show: ∃Ω0 ∈ A with P(Ω0) = 1 and

limn→∞

Dωn = 0 ∀ω ∈ Ω0.

A bit of notation: For H : R→ R, H ր, put H(x−) := limy↑x,y<xH(x).

Norbert Henze, KIT 7.3

Page 84: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Empirical distribution functions

From the strong law of large numbers, we have:

∀x ∈ R ∃Ax ∈ A : P(Ax) = 1 and Fωn (x)→ F (x) ∀ω ∈ Ax,

∀x ∈ R ∃Bx ∈ A : P(Bx) = 1 and Fωn (x−)→ F (x−) ∀ω ∈ Bx.

For 0 < p < 1, let F−1(p) := infx ∈ R : F (x) ≥ p(F−1 is the quantile function of F ). We have (!)

F(F−1(p)−

)≤ p ≤ F

(F−1(p)

). (7.1)

For m ≥ 2 and 1 ≤ k ≤ m− 1, let xm,k := F−1(k/m).Putting p = k/m and p = (k − 1)/m in (7.1), we have (!)

F (xm,k−) − F (xm,k−1) ≤ 1

mfor each k = 2, . . . ,m− 1. (7.2)

Moreover,

F (xm,1−) ≤ 1

m, F (xm,m−1) ≥ 1− 1

m. (7.3)

Putting u ∨ v := max(u, v), set

Dωm,n := max

1≤k≤m−1

∣∣Fωn (xm,k)−F (xm,k)

∣∣∨∣∣Fω

n (xm,k−)−F (xm,k−)∣∣. (7.4)

Norbert Henze, KIT 7.4

Page 85: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Empirical distribution functions

Memo: Dωn = supx∈R

∣∣Fωn (x)− F (x)

∣∣

Memo: Dωm,n = max

1≤k≤m−1

∣∣Fωn (xm,k)− F (xm,k)

∣∣ ∨∣∣Fω

n (xm,k−)− F (xm,k−)∣∣

We claim that

Dωn ≤

1

m+Dω

m,n (m ≥ 2, n ≥ 1, ω ∈ Ω). (7.5)

To this end, fix x ∈ R.

Case 1: ∃k ∈ 2, . . . ,m− 1 such that xm,k−1 ≤ x < xm,k.

Monotonicity arguments yield

Fωn (x) ≤ Fω

n (xm,k−) ≤ F (xm,k−) +Dωm,n

≤ F (xm,k−1) +1

m+Dω

m,n ≤ F (x) +1

m+Dω

m,n.

Analogously, Fωn (x) ≥ F (x)− 1

m−Dω

m,n. Hence

∣∣Fωn (x)− F (x)

∣∣ ≤ 1

m+Dω

m,n. (7.6)

Case 2: x < xm,1 or x ≥ xm,m−1 by complete analogy, using (7.3), q.e.d. (7.5).

Norbert Henze, KIT 7.5

Page 86: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Empirical distribution functions

Memo: Dωn = supx∈R

∣∣Fωn (x)− F (x)

∣∣

Memo: Dωm,n = max

1≤k≤m−1

∣∣Fω(xm,k)− F (xm,n)∣∣ ∨∣∣Fω(xm,k−)− F (xm,n−)

∣∣

Memo: Dωn ≤

1

m+Dω

m,n (m ≥ 2, n ≥ 1, ω ∈ Ω).

Memo: ∀x ∈ R ∃Ax ∈ A : P(Ax) = 1 and Fωn (x)→ F (x)∀ω ∈ Ax

Memo: ∀x ∈ R ∃Bx ∈ A : P(Bx) = 1 and Fωn (x−)→ F (x−) ∀ω ∈ Bx

Put

Ω0 :=∞⋂

m=2

m−1⋂

k=1

(Axm,k

∩Bxm,k

).

We have Ω0 ∈ A and P(Ω0) = 1. (why?)

ω ∈ Ω0 =⇒ limn→∞

Dωm,n = 0, m ≥ 2,

=⇒ lim supn→∞

Dωn ≤

1

m∀m ≥ 2, q.e.d.

Norbert Henze, KIT 7.6

Page 87: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Empirical distribution functions

7.4 Remarks

a) In statistical terms, Theorem 7.3, i.e., ‖Fn − F‖∞ a.s.−→ 0, means that

(Fn) is a strongly consistent sequence of estimators of F .

b) Let X1, X2, . . . be i.i.d. d-dimensional random vectors. Let B ∈ Bd,

Qωn(B) :=

1

n

n∑

j=1

1B(Xj(ω)) =1

n

n∑

j=1

δXj(ω)(B), ω ∈ Ω.

From the SLLN, there is a set AB ∈ A with P(AB) = 1 andQω

n(B)→ PX1(B) ∀ω ∈ AB .

Let C ⊂ Bd be a class of Borel sets. Do we have

limn→∞

supB∈C

∣∣Qn(B)− PX1(B)

∣∣ = 0P-almost surely ? (7.7)

Thm. 7.3 =⇒ (7.7) holds if d = 1 and C = (−∞, x] : x ∈ R.(7.7) holds for C = (−∞, x] : x ∈ Rd (

”multivariate Glivenko-Cantelli“).

If X1 has a Lebesgue density, then (7.7) holds forC = B ∈ Bd : B convex.

Norbert Henze, KIT 7.7

Page 88: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Empirical distribution functions

Memo: Dn = supx∈R

∣∣Fn(x)− F (x)∣∣

c) There is a constant C, 0 < C <∞, not dependent on F , such that

P(Dn > t) ≤ C exp(−2nt2

), t > 0, n ∈ N. (DKW)

(Dvoretsky, Kiefer, Wolfowitz 1956).

If (DKW) holds for X1 ∼ U(0, 1), then it holds for any F ! (Exercise!)

Notice that (DKW) entails

∞∑

n=1

P(Dn > ε) ≤ C∞∑

n=1

exp(−2nε2

)<∞ ∀ ε > 0.

The Borel–Cantelli Lemma then gives

P

(lim supn→∞

Dn > ε)

= 0 ∀ε > 0,

from which Theorem 7.3 follows.

Norbert Henze, KIT 7.8

Page 89: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Empirical distribution functions

7.5 Theorem Let X1, X2, . . . be i.i.d. random variables with distributionfunction F ,

Bn(x) :=√n(Fn(x)− F (x)

), x ∈ R.

For any k ≥ 1 and any choice of x1, . . . , xk ∈ R, we have

Bn(x1)

...Bn(xk)

D−→ Nk

0...0

, Σ

,

where Σ = (σij)1≤i,j≤k and

σij = F (min(xi, xj)) − F (xi)F (xj), 1 ≤ i, j ≤ k.

Proof: Exercise! (use the multivariate CLT)

Norbert Henze, KIT 7.9

Page 90: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8 Limit theorems for U -statistics

Let X1, X2, . . . be i.i.d. d-dimensional random vectors with distributionfunction F . For k ∈ N, let h : (Rd)k → R be measurable and symmetric.

8.1 Definition (U-statistic)

Un := Un(X1, . . . , Xn) :=1(nk

)∑

1≤i1<...<ik≤n

h (Xi1 , . . . , Xik )

is called U -statistic of order k with kernel h.

8.2 Remark In statistical applications, F is assumed to be unknown.

We assume that the second moment of h exists, i.e.,

EFh2 = EFh

2 (X1, . . . , Xk) <∞

and putϑ := ϑ(F ) := EF (Un) = EFh(X1, . . . , Xk).

Then Un is an unbiased estimator of ϑ.

Norbert Henze, KIT 8.1

Page 91: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

In the following examples, we have d = 1.

8.3 Examples

a) k = 1, Un =1

n

n∑

j=1

h(Xj), ϑ(F ) = EFh(X1).

b) k = 2, h(x1, x2) =1

2(x1 − x2)

2,

Un =1(n2

)∑

1≤i<j≤n

1

2(Xi −Xj)

2 = · · · = 1

n− 1

n∑

j=1

(Xj −Xn

)2(!)

ϑ(F ) = VF (X1), Un is the sample variance

c) k = 2, h(x1, x2) = 1x1 + x2 > 0,

Un =1(n2

)∑

1≤i<j≤n

1Xi +Xj > 0,

ϑ(F ) = PF (X1 +X2 > 0).

In the sequel, we often omit the index F and write E = EF , V = VF , P = PF

etc.

Norbert Henze, KIT 8.2

Page 92: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.4 Theorem (Variance of a U-statistic)

For c ∈ 1, 2, . . . , k, let

σ2c := Cov (h(X1, . . . , Xc, Xc+1, . . . , Xk), h(X1, . . . , Xc, Xk+1, . . . , X2k−c))

(”c common indices“). We then have

V(Un) =1(nk

)k∑

c=1

(k

c

)(n− kk − c

)σ2c .

Proof: Exercise! (use V(Un) = Cov(Un, Un) and the bilinearity of Cov(·, ·))

For c ∈ 1, . . . , k − 1, let

hc(x1, . . . , xc) := E [h(x1, . . . , xc, Xc+1, . . . , Xk)]

= E[h(X1, . . . , Xk)

∣∣X1 = x1, . . . , Xc = xc

].

Furthermore, put hk := h. Notice that

Ehc = E [hc(X1, . . . , Xc)] = ϑ = Eh.

Norbert Henze, KIT 8.3

Page 93: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo: σ2c := Cov (h(X1, . . . , Xk), h(X1, . . . , Xc, Xk+1, . . . , X2k−c))

Memo: hc(x1, . . . , xc) = E [h(x1, . . . , xc, Xc+1, . . . , Xk)] ; Ehc = ϑ

Memo: hc(X1, . . . , Xc) = E [h(X1, . . . , Xc, Xc+1, . . . , Xk)|X1, . . . , Xc]

8.5 Lemma We have σ2c = V (hc(X1, . . . , Xc)) .

Proof: We have

σ2c = E [h(X1, . . . , Xk) · h(X1, . . . , Xc, Xk+1, . . . , X2k−c)] − ϑ2

= E

[E

[h(X1, . . . , Xk)h(X1, . . . , Xc, Xk+1, . . . , X2k−c)

∣∣∣X1, . . . , Xc)] ]− ϑ2

︸ ︷︷ ︸= h2

c(X1, . . . , Xc)

= E(h2c

)− (Ehc)

2

= V (hc) .√

Norbert Henze, KIT 8.4

Page 94: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.6 Example (cf. Example 8.3 b))

h(x1, x2) =1

2(x1 − x2)

2

µ := EX1, µr := E [(X1 − µ)r] =⇒

h1(x1) = E

[1

2(x1 −X2)

2

]=

1

2E[(X2 − µ+ µ− x1)

2]

=1

2

(µ2 + (µ− x1)

2).

σ21 = V

(1

2

(µ2 + (X1 − µ)2

))=

1

4

(µ4 − µ2

2

),

σ22 = V

(1

2(X1 −X2)

2

)=

1

2

(µ4 + µ2

2

). (!)

Ex. 8.3 b), Thm. 8.4 =⇒

V

(1

n− 1

n∑

j=1

(Xj −Xn

)2)

=2

n(n− 1)

[(2

1

)(n− 2

2− 1

)σ21 +

(2

2

)(n− 2

2− 2

)σ22

]

=1

n

(µ4 − n− 3

n− 1µ22

). (!)

Norbert Henze, KIT 8.5

Page 95: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.7 Definition (Hajek Projection) Let

Un =1(nk

)∑

1≤i1<...<ik≤n

h (Xi1 , . . . , Xik )

be a U -statistic and ϑ = EUn. The random variable

Un :=n∑

j=1

E[Un|Xj ]− (n− 1)ϑ

is called the Hajek projection of Un.

Notice that EUn = ϑ, and that Un is a sum of independent random variables.

8.8 Lemma We have:

a) Un =k

n

n∑

j=1

(h1(Xj)− ϑ) + ϑ,

b) E(Un − Un)2 = σ2

1

k

(n−kk−1

)(nk

) − k2

n

+

1(nk

)k∑

c=2

(k

c

)(n− kk − c

)σ2c ,

c) E(Un − Un)2 = O(n−2) as n→∞.

Norbert Henze, KIT 8.6

Page 96: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

If A = a1, . . . , ak ⊂ 1, . . . , n, |A| = k, put h(XA) := h(Xa1 , . . . , Xak).

Proof: a) We have

E[Un|Xj ] =1(nk

)∑

A:|A|=k

E [h(XA)|Xj ] .

Now, E[h(XA)|Xj ] = ϑ if j /∈ A (why?) and E[h(XA)|Xj ] = h1(Xj), ifj ∈ A. Counting the respective cases gives

E[Un|Xj ] =1(nk

)[(

n− 1

k − 1

)h1(Xj) +

(n− 1

k

]=k

nh1(Xj) +

n− kn

ϑ.√

b) Since EUn = EUn we may assume w.l.o.g. ϑ = 0. Then

E(Un − Un)2 = V(Un) + V(Un)− 2E(UnUn)

=1(nk

)k∑

c=1

(k

c

)(n− kk − c

)σ2c +

k2

n2nσ2

1

−2 kn

1(nk

)∑

A:|A|=k

n∑

j=1

E [h(XA) · h1(Xj)] .︸ ︷︷ ︸

=?

Norbert Henze, KIT 8.7

Page 97: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

We have

E [h(XA) · h1(Xj)] =

0, if j /∈ A (since ϑ = 0),

σ21 , if j ∈ A. (*)

Proof of (*): By symmetry we have

E [h(XA)h1(Xj)] = E [h(X1, . . . , Xk)h1(X1)]

= E

[E [h(X1, . . . , Xk)h1(X1)|X1]

]

= E[h1(X1)E [h(X1, . . . , Xk)|X1]

]

= E [h1(X1) · h1(X1)]

= V(h1(X1)) = σ21 .√

Thus,

−2 kn

1(nk

)∑

A:|A|=k

n∑

j=1

E [h(XA) · h1(Xj)] = − 2k2

nσ21

︸ ︷︷ ︸= kσ2

1

and

E(Un − Un)2 =

1(nk

)k∑

c=1

(k

c

)(n− kk − c

)σ2c +

k2

n2n σ2

1 − 2k2

nσ21 .√

Norbert Henze, KIT 8.8

Page 98: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo: E(Un − Un)2 = σ2

1

k

(n−kk−1

)(nk

) − k2

n

+

1(nk

)k∑

c=2

(k

c

)(n− kk − c

)σ2c

c) The 2nd summand is of order O(n−2) since c ≥ 2.

The first summand equals

σ21k2

n

(n−kk−1

)(n−1k−1

) − 1

.

Check that the curly bracket is of order O(n−1) as n→∞, q.e.d.

Norbert Henze, KIT 8.9

Page 99: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo: Un − ϑ =k

n

n∑

j=1

(h1(Xj)− ϑ)

8.9 Theorem (CLT for nondegenerate U-statistics)

Let Un be a U -statistic. If σ21 > 0, Un is said to nondegenerate. We then have

√n(Un − ϑ) D−→ N(0, k2σ2

1).

Proof: We have

√n (Un − ϑ) =

√n(Un − ϑ

)+√n(Un − Un).︸ ︷︷ ︸=: Rn

Lemma 8.8 c) =⇒ E(R2n)→ 0 and thus Rn

P−→ 0.

Put Yj := k(h1(Xj)− ϑ). Notice that EYj = 0, V(Yj) = k2σ21 .

The CLT of Lindeberg–Levy gives

√n(Un − ϑ

)=

1√n

n∑

j=1

YjD−→ N(0, k2σ2

1), q.e.d.

Norbert Henze, KIT 8.10

Page 100: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.10 Example (Continuation of Example 8.3 c))

Leth(x1, x2) = 1x1 + x2 > 0.

We have

h1(x1) = E [1x1 +X2 > 0] = P(X2 > −x1) = 1− F (−x1)

and thereforeσ21 = V (1− F (−X1)) = V(F (−X1)).

If F is continuous and the distribution of X1 is symmetric around 0, i.e., if

X1D= −X1, then

F (−X1)D= F (X1)

D= U(0, 1)

and

σ21 = V(U(0, 1)) =

1

12.

The CLT now gives

√n

1(

n2

)∑

1≤i<j≤n

1Xi +Xj > 0 − 1

2

D−→ N

(0,

1

3

).

Norbert Henze, KIT 8.11

Page 101: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.11 Definition (Two-sample U-statistic)

Let X1, X2, . . . ;Y1, Y2, . . . be independent random variables, where X1, X2, . . .are identically distributed with distribution function F , and Y1, Y2, . . . are iden-tically distributed with distribution function G.

Furthermore, let h : Rk × Rℓ → R be a measurable function such thath(x1, . . . , xk, y1, . . . , yℓ) is symmetric in x1, . . . , xk and symmetric in y1, . . . , yℓ.Then

Um,n :=1(

mk

)(nℓ

)∑

1≤i1<...<ik≤m

1≤j1<...<jℓ≤n

h(Xi1 , . . . , Xik , Yj1 , . . . , Yjℓ)

is called a two-sample U -statistic of order (k, ℓ) with kernel h.

In a statistical context, F and G will be unknown.

We have

EF,G(Um,n) = EF,Gh(X1, . . . , Xk, Y1, . . . , Yℓ) =: ϑ(F,G) =: ϑ.

In what follows, we assume EF,Gh2 <∞.

Norbert Henze, KIT 8.12

Page 102: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.12 Example (Mann–Whitney-U-statistic)

Let k = ℓ = 1, h(x, y) = 1x ≤ y,

Um,n =1

mn

m∑

i=1

n∑

j=1

1Xi ≤ Yj,

ϑ(F,G) = EF,G[1X1 ≤ Y1] = PF,G(X1 ≤ Y1).

Notice that ϑ(F, F ) = 1/2 if F is continuous. (why?)

8.13 Theorem Let σ00 := 0 and, for c+ d ≥ 1,

σ2c,d := Cov

(h(XA1 , YB1), h(XA2 , YB2

)),

where A1, A2 ⊂ 1, . . . ,m, |A1∩A2| = c, B1, B2 ⊂ 1, . . . ,m, |B1∩B2| = d.Then

V(Um,n) =1(

mk

)(nℓ

)k∑

c=0

ℓ∑

d=0

(k

c

)(m− kk − c

)(ℓ

d

)(n− ℓℓ− d

)σ2c,d.

Proof: Exercise! (Use the bilinearity of Cov(·, ·) and symmetry).

Norbert Henze, KIT 8.13

Page 103: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.14 Remark For c ≤ k, d ≤ ℓ, let

hc,d(x1, . . . , xc, y1, . . . , yd)

:= E [h(x1, . . . , xc, Xc+1, . . . , Xk, y1, . . . , yd, Yd+1, . . . , Yℓ)]

Thenσ2c,d = V (hc,d(X1, . . . , Xc, Y1, . . . , Yd)) .

Proof: Exercise in conditional expectations, cf. 8.5 .

8.15 Definition (Hajek Projection)

Um,n :=

m∑

i=1

E[Um,n|Xi] +

n∑

j=1

E[Um,n|Yj ]− (m+ n− 1)ϑ

is called the Hajek projection of Um,n.

Notice that Um,n is a sum of independent random variables.

Norbert Henze, KIT 8.14

Page 104: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.16 Theorem We have:

a) Um,n =k

m

m∑

i=1

(h1,0(Xi)− ϑ) +ℓ

n

n∑

j=1

(h0,1(Yj)− ϑ) + ϑ,

b) Putting (a)j := a(a− 1) · . . . · (a− j + 1), we have

E(Um,n − Um,n)2 =

k2

m

(m− k)k−1

(m− 1)k−1

(n− ℓ)ℓ(n)ℓ

− 1

σ21,0

+ℓ2

n

(m− k)k(m)k

(n− ℓ)ℓ−1

(n)ℓ−1− 1

σ20,1

+1(

mk

)(nℓ

)k∑

c=0

ℓ∑

d=0

(k

c

)(m− kk − c

)(ℓ

d

)(n− ℓℓ− d

)σ2c,d

c+d≥2

Proof: Analogously to 8.8 a), b).

Norbert Henze, KIT 8.15

Page 105: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.17 Theorem (CLT for nondegenerate two-sample U-statistics)

If σ21,0 > 0, σ2

0,1 > 0 and m,n→∞ under the condition that

m

m+ n→ τ for some τ ∈ (0, 1) (8.1)

(so-called usual limiting regime in the two-sample case), then

√m+ n (Um,n − ϑ) D−→ N

(0,k2σ2

1,0

τ+ℓ2σ2

0,1

1− τ

).

Proof: We have

√m+ n (Um,n − ϑ) =

√m+ n

(Um,n − ϑ

)+Rm,n

where Rm,n =√m+ n(Um,n − Um,n). Notice that 8.16 b) implies

ER2m,n → 0 and thus Rm,n

P−→ 0.

(8.1) means m = ms, s ≥ 1, n = ns, s ≥ 1 and

lims→∞

ms

ms + ns= τ.

Norbert Henze, KIT 8.16

Page 106: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

From 8.16 a),

√ms + ns

(Um,n − ϑ

)=

ms+ns∑

i=1

Zs,i,

where

Zs,i =

√ms + ns

k

ms(h1,0(Xi)− ϑ) , if i ∈ 1, . . . ,ms,

√ms + ns

ns(h0,1(Yms−i)− ϑ) , if i ∈ ms + 1, . . . ,ms + ns.

(Zs,1, . . . , Zs,ms+ns)s≥1 is a triangular array of rowwise independent randomvariables. We have E(Zs,i) = 0 for each i and

V(Zs,i) =

k2ms + ns

m2s

σ21,0, if i ≤ ms,

ℓ2ms + ns

n2s

σ20,1, if i > ms.

Notice that, as s→∞,

σ2s :=

ms+ns∑

i=1

V(Zs,i) → σ2 :=k2σ2

1,0

τ+ℓ2σ2

0,1

1− τ .

Norbert Henze, KIT 8.17

Page 107: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

We now check the Lindeberg condition. We have

Ls(ε) :=1

σ2s

ms+ns∑

i=1

E[Z2

s,i1|Zs,i| > εσs]

=ms

σ2s

E[Z2

s,11|Zs,1| > εσs]+ns

σ2s

E[Z2

s,ms+11|Zs,ms+1| > εσs]

By definition of Zs,1 (= k(√ms + ns/ms)(h1,0(X1)− ϑ)) , we have

msE[Z2

s,11|Zs,1| > εσs]

= k2ms + ns

msE

[(h1,0(X1)− ϑ)21

|h1,0(X1)− ϑ| > εσsms

k√ms + ns

]

︸ ︷︷ ︸ ︸ ︷︷ ︸→ 1/τ →∞ as s→∞

Hence,lims→∞

ms

σ2s

E[Z2

s,11|Zs,1| > εσs]= 0. (why?)

In the same way,

lims→∞

ns

σ2s

E[Z2

s,ms+11|Zs,ms+1| > εσs]= 0.

This shows lims→∞ Ls(ε) = 0 ∀ε > 0.

Norbert Henze, KIT 8.18

Page 108: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

The CLT of Lindeberg–Feller yields

1

σs

ms+ns∑

i=1

Zs,iD−→ N(0, 1).

Since σ2s → σ2, it follows that

√ms + ns

(Um,n − ϑ

)=

ms+ns∑

i=1

Zs,iD−→ N(0, σ2),

where

σ2 =k2σ2

1,0

τ+ℓ2σ2

0,1

1− τ , q.e.d.

Norbert Henze, KIT 8.19

Page 109: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.18 Example (Mann–Whitney-U-statistic, cf. 8.12)

Let h(x, y) = 1x ≤ y, ϑ = P(X1 ≤ Y1). We have

σ21,0 = Cov(1X1 ≤ Y1, 1X1 ≤ Y2)

= P(X1 ≤ Y1, X1 ≤ Y2)− ϑ2,

σ20,1 = Cov(1X1 ≤ Y1, 1X2 ≤ Y1)

= P(X1 ≤ Y1, X2 ≤ Y1)− ϑ2.

If σ21,0 > 0 and σ2

0,1 > 0, then

√m+ n

(1

mn

m∑

i=1

n∑

j=1

1Xi ≤ Yj − ϑ)

D−→ N

(0,σ21,0

τ+

σ20,1

1− τ

).

Um,n is a widely used statistic for the testing problem H0 : F = G, where Fand G are assumed to be continuous.

If H0 holds, then ϑ = 1/2, σ21,0 = σ2

0,1 = 1/3− 1/4 (why?) = 1/12, and

√m+ n

(Um,n − 1

2

)D−→ N

(0,

1

12

(1

τ+

1

1− τ

)).

︸ ︷︷ ︸=

1

12τ (1− τ )

Norbert Henze, KIT 8.20

Page 110: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

In what follows, let

Un =1(nk

)∑

1≤i1<...<ik≤n

h(Xi1 , . . . , Xik )

as in 8.1. For σ2c (cf. 8.4, 8.5) we assume

0 = σ21 < σ2

2 (so-called first order degeneracy).

From 8.4 we have

V(Un) =1(nk

)(k

2

)(n− kk − 2

)σ22 +O

(1

n3

)

=2(k2

)2

n2σ22 +O

(1

n3

). (!)

Thus,

V (n(Un − ϑ)) → 2

(k

2

)2

σ22

and n(Un − ϑ) = OP(1). (why?)

Conjecture: n(Un − ϑ) has a non-degenerate limit distribution as n→∞.

Norbert Henze, KIT 8.21

Page 111: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.19 Example (a “warming up“)

Let

h(x1, x2) :=s∑

ν=1

λνϕν(x1)ϕν(x2),

where λ1, . . . , λs ∈ R \ 0, ϕ1, . . . , ϕs : R→ R measurable. Furthermore,

E[ϕν(X1)] = 0, E[ϕ2ν(X1)] = 1,

E[ϕµ(X1)ϕν(X1)] = δµ,ν (µ, ν ∈ 1, . . . , s).

Hence,

Un =1(n2

)∑

i<j

s∑

ν=1

λνϕν(Xi)ϕν(Xj) (=⇒ EUn = 0)

=

s∑

ν=1

λν1(n2

) 12

i6=j

ϕν(Xi)ϕν(Xj)

=

s∑

ν=1

λν1(n2

) 12

(n∑

j=1

ϕν(Xj)

)2 −

n∑

j=1

ϕ2ν(Xj)

Norbert Henze, KIT 8.22

Page 112: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo: Un =s∑

ν=1

λν1(n2

) 12

(n∑

j=1

ϕν(Xj)

)2

−n∑

j=1

ϕ2ν(Xj)

=⇒ nUn =s∑

ν=1

λνn

n− 1

(1√n

n∑

j=1

ϕν(Xj)

)2

− 1

n

n∑

j=1

ϕ2ν(Xj)

.

Notice that, by the SLLN,

1

n

n∑

j=1

ϕ2ν(Xj)

a.s.−→ E[ϕ2

ν(X1)]= 1.

Moreover, by the multivariate CLT

1√n

n∑

j=1

ϕ1(Xj)

...ϕs(Xj)

D−→ Ns

0...0

, Is

N1

...Ns

The continuous mapping theorem and Slutsky’s lemma now give

nUnD−→

s∑

ν=1

λν

(N2

ν − 1).

Norbert Henze, KIT 8.23

Page 113: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

General idea: Approximate kernel h by kernel of order 2 and the latter by akernel as in Example 8.19.

To be precise, put

Un :=∑

1≤i<j≤n

E [Un|Xi, Xj ] −(n

2

)ϑ+ ϑ

(cf. Hajek projection).

8.20 Lemma We have

a) Un − ϑ =

(k2

)(n2

)∑

1≤j<ℓ≤n

(h2(Xj , Xℓ)− ϑ),

b) E(Un − Un)2 = O

(1

n3

)as n→∞.

Proof: Recall h(XA) := h(Xi1 , . . . , Xik ), A = i1, . . . , ik

=⇒ E [Un|Xj , Xℓ] =1(nk

)∑

A:|A|=k

E [h(XA)|Xj , Xℓ] .

Norbert Henze, KIT 8.24

Page 114: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Now,

E [h(XA)|Xj , Xℓ] =

ϑ , if j, ℓ ∩A = ∅,h1(Xℓ) , if j /∈ A, ℓ ∈ A,h1(Xj) , if j ∈ A, ℓ /∈ A,

h2(Xj , Xℓ), if j, ℓ ⊂ A.

Since 0 = σ21 = V(h1(X1)) and ϑ = Eh1(X1) we have

E [h(XA)|Xj , Xℓ] =

ϑ , if |j, ℓ ∩ A| ≤ 1,

h2(Xj , Xℓ), otherwise,=⇒ a). (!)

b): W.l.o.g. let ϑ = 0 =⇒ E(Un − Un)2 = V(Un) + V(Un)− 2E

(UnUn

).

Memo: Un =1(nk

)∑

A:|A|=k

h(XA), Un =

(k2

)(n2

)∑

1≤j<ℓ≤n

h2(Xj , Xℓ)

Notice that

E [h2(X1, X2)h2(X1, X3)] = E[E [h2(X1, X2)h2(X1, X3)|X1]

]

= E[(E [h2(X1, X2)|X1])

2]

= E[(h1(X1))

2]

= V(h1(X1)) = σ21 = 0.

Proceed by analogy with Lemma 8.8 c), q.e.d.

Norbert Henze, KIT 8.25

Page 115: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Consequence:n(Un − ϑ) = n(Un − ϑ) + n(Un − Un). (8.2)

By Lemma 8.20 b),

E[(n(Un − Un)

)2 ]→ 0 =⇒ n(Un − Un)

P−→ 0.

Hence, n(Un − ϑ) and n(Un − ϑ) have the same limit distribution(if there is such a limit distribution).

Notice that

n(Un − ϑ

)=

(k

2

)2

n− 1

1≤j<ℓ≤n

h2(Xj , Xℓ), (8.3)

where h2(x, y) := h2(x, y)− ϑ and E(h2) = Eh2(X1, X2) = 0.

Let L2 := L2(R,B, dF ) be the separable Hilbert space of (equivalence classesof) square integrable functions with respect to dF (:= PX1).

〈f, g〉 :=

∫f(x)g(x) dF (x) =

∫fg dF,

‖g‖2 := 〈g, g〉 =

∫g2 dF.

Norbert Henze, KIT 8.26

Page 116: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

In what follows, each unspecified integral is over R.

8.21 Lemma For g ∈ L2, let

(Ag)(x) :=

∫h2(x, y)g(y)dF (y), x ∈ R.

We then have:

a) Ag ∈ L2,

b) A : L2 → L2 is a linear operator,

c) ‖Ag‖ ≤√

Eh22 ‖g‖ (=⇒ A is continuous) ,

d) 〈Af, g〉 = 〈f,Ag〉 (i.e., A is symmetric (self-adjoint)),

e) If ϕ1, ϕ2, . . . is a complete orthonormal set in L2, then

‖A‖2HS :=

∞∑

j=1

‖Aϕj‖2 =

∫ ∫h22(x, y) dF (x)dF (y) = Eh2

2 <∞

(i.e., A is a Hilbert–Schmidt operator and therefore compact)

(see, e.g. J.Weidmann: Linear operators in Hilbert spaces, Thm. 6.10)

Norbert Henze, KIT 8.27

Page 117: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo: (Ag)(x) :=∫h2(x, y)g(y) dF (y), x ∈ R.

Memo: a) Ag ∈ L2, b) A is a linear operator, c) ‖Ag‖ ≤√

Eh22 ‖g‖

Proof: a) By the Cauchy–Schwarz inequality, we have∫

(Ag)2 dF =

∫(Ag(x))2 dF (x)

=

∫ (∫h2(x, y)g(y)dF (y)

)2

dF (x)

≤∫ (∫

h22(x, y)dF (y)

∫g2(y)dF (y)

)dF (x)

︸ ︷︷ ︸= ‖g‖2

= ‖g‖2∫∫

h22(x, y)dF (x)dF (y)

= ‖g‖2 Eh22 <∞. √

b), c) follow from a).

Norbert Henze, KIT 8.28

Page 118: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo: (Ag)(x) :=∫h2(x, y)g(y) dF (y), x ∈ R.

Memo: d) 〈Af, g〉 = 〈f,Ag〉

Proof:

〈Af, g〉 =

∫(Af)(x)g(x)dF (x)

=

∫ (∫h2(x, y)f(y)dF (y)

)g(x) dF (x)

︸ ︷︷ ︸= h2(y, x)

=

∫f(y)

(∫h2(y, x)g(x)dF (x)

)dF (y) (Fubini)

︸ ︷︷ ︸= (Ag)(y)

= 〈f,Ag〉. √

Norbert Henze, KIT 8.29

Page 119: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo: (Ag)(x) :=∫h2(x, y)g(y) dF (y), x ∈ R.

Memo: e) ‖A‖2HS :=

∞∑

j=1

‖Aϕj‖2 =

∫∫h22(x, y) dF (x)dF (y) = Eh2

2

Proof: Put h2,x(y) := h2(x, y). Then

∞∑

j=1

‖Aϕj‖2 =∞∑

j=1

∫(Aϕj(x))

2 dF (x)

=

∞∑

j=1

∫ (∫h2(x, y)ϕj(y)dF (y)

)2

dF (x)

︸ ︷︷ ︸= 〈h2,x, ϕj〉2

=

∫ ∞∑

j=1

〈h2,x, ϕj〉2dF (x) (why?)

=

∫‖h2,x‖2 dF (x) (Parseval’s identity)

=

∫ ∫h22(x, y)dF (y) dF (x)

Norbert Henze, KIT 8.30

Page 120: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.22 Theorem (Expansion Thm. for compact self-adjoint linear operators)

There are λ1, λ2, . . . ∈ R with |λ1| ≥ |λ2| ≥ . . . > 0 and limn→∞ λn = 0 andϕ1, ϕ2, . . . ∈ L2 with 〈ϕi, ϕj〉 = δi,j ∀i, j, such that

Ag =∑

n≥1

λn〈g,ϕn〉ϕn, g ∈ L2.

If ψ1, ψ2, . . . is an orthonormal basis of g : Ag = 0, then ψ1, ψ2, . . . ∪ϕ1, ϕ2, . . . is an orthonormal basis of L2.

Proof: See, e.g. J.Weidmann: Linear operators in Hilbert spaces, Thm. 7.2.

Notice that Aϕk = λkϕk, k ≥ 1, i.e., λk is an eigenvalue of A associated withthe normalized eigenfunction ϕk.

From Thm. 8.21 e), we have

k≥1

λ2k =

k≥1

‖Aϕk‖2 < ∞. (8.4)

Norbert Henze, KIT 8.31

Page 121: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

In what follows, we put K(x, y) := h2(x, y).

For s ≥ 1, let Ks(x, y) :=

s∑

j=1

λjϕj(x)ϕj(y), (cf. Example 8.19).

8.23 Lemma We have

lims→∞

∫∫(K(x, y)−Ks(x, y))

2 dF (x)dF (y) = 0.

Proof: Let Kx(y) := K(x, y). Recall 〈Kx, g〉 = (Ag)(x).

Since∫∫

K2(x, y)dF (x)dF (y) <∞ we have∫K2(x, y)dF (y) <∞ for dF -almost all x

=⇒ Kx ∈ L2 for dF -almost all x.

Let ϕj , ψj as in Thm. 8.22. Then (Fourier expansion of Kx!)∫ (Kx(y)−

s∑

j=1

〈Kx, ψj〉ψj(y)−s∑

j=1

〈Kx, ϕj〉ϕj(y)

)2dF (y)→ 0

for dF -almost all x.

Norbert Henze, KIT 8.32

Page 122: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo:

∫ (Kx(y)−

s∑

j=1

〈Kx, ψj〉ψj(y)−s∑

j=1

〈Kx, ϕj〉ϕj(y)

)2

dF (y)→ 0 dF -a.s.

Notice that 〈Kx, ψj〉 = Aψj(x) = 0 (ψj : j ≥ 1 is ONB of g : Ag = 0).Since 〈Kx, ϕj〉 = λjϕj(x), this means

ρs(x) :=

∫(K(x, y)−Ks(x, y))

2 dF (y) → 0 for dF -almost all x.

We have |ρs(x)| ≤ 2

∫K2(x, y)dF (y) + 2

∫K2

s (x, y)dF (y). (why?) Since

∫K2

s (x, y)dF (y) =s∑

j,ℓ=1

λjλℓϕj(x)ϕℓ(x)

∫ϕj(y)ϕℓ(y)dF (y) ≤

∞∑

j=1

λ2jϕ

2j (x)

︸ ︷︷ ︸= δj,ℓ

we have |ρs(x)| ≤ ρ(x) := 2∫K2(x, y)dF (y) + 2

∑∞j=1 λ

2jϕ

2j (x). Since

∫ρ(x)dF (x) = 2

∫∫K2dF⊗dF+2

∞∑

j=1

λ2j <∞, DOM =⇒

∫ρs(x)dF (x)→ 0.

Norbert Henze, KIT 8.33

Page 123: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.24 Lemma We have Eϕj(X1) = 0, j ≥ 1.

Proof: Let h1 := h1 − ϑ. (Recall h1(x1) = Eh(x1, X2, . . . , Xk))

We have (!) h1(x) =∫h2(x, y) dF (y) =

∫K(x, y) dF (y) =⇒

∫ (h1(x)−

s∑

j=1

λjϕj(x)

∫ϕjdF

)2

dF (x)

︸ ︷︷ ︸= Eϕj(X1)

=

∫ (∫ [K(x, y)−

s∑

j=1

λjϕj(x)ϕj(y)]· 1 dF (y)

)2dF (x)

︸ ︷︷ ︸= Ks(x, y)

≤∫∫

(K −Ks)2dF ⊗ dF. (Cauchy–Schwarz inequality)

︸ ︷︷ ︸→ 0 as s→∞ (cf. Lemma 8.23)

Norbert Henze, KIT 8.34

Page 124: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo:

∫ (h1(x)−

s∑

j=1

λjϕj(x)

∫ϕjdF

)2

dF (x) → 0 as s→∞

Notice that

0 = σ21 = V(h1) = Eh2

1 =

∫h21(x)dF (x) =⇒ h1 = 0 dF -almost surely.

Memo =⇒

∆s :=

∫ ( s∑

j=1

λjϕj(x)

∫ϕjdF

)2 dF (x) → 0.

Now,

∆s =s∑

i,j=1

λiλj

∫ϕidF

∫ϕjdF

∫ϕi(x)ϕj(x)dF (x)

︸ ︷︷ ︸= δi,j

=s∑

j=1

λ2j

(∫ϕjdF

)2

.

It follows that∫ϕjdF = 0 = Eϕj(X1), j ≥ 1.

Norbert Henze, KIT 8.35

Page 125: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo: Ks(x, y) :=s∑

j=1

λjϕj(x)ϕj(y)

8.25 Lemma We have

∫∫(K −Ks)

2 dF ⊗ dF =

∫∫K2dF ⊗ dF −

s∑

j=1

λ2j =

∞∑

j=s+1

λ2j .

Proof: The last equality follows from Thm. 8.21 e). We have∫∫

(K−Ks)2dF⊗dF =

∫∫K2dF⊗dF

−2s∑

j=1

λj

∫ [∫K(x, y)ϕj(y)dF (y)

]ϕj(x)dF (x)

︸ ︷︷ ︸= λjϕj(x)

+s∑

j,ℓ=1

λjλℓ

∫ϕj(x)ϕℓ(x)dF (x)

∫ϕj(y)ϕℓ(y)dF (y).

︸ ︷︷ ︸ ︸ ︷︷ ︸= δj,ℓ = δj,ℓ

Norbert Henze, KIT 8.36

Page 126: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Put

Tn :=1

n

j 6=ℓ

K(Xj , Xℓ), (8.5)

Tn,s :=1

n

j 6=ℓ

Ks(Xj , Xℓ). (8.6)

8.26 Lemma We have E (Tn − Tn,s)2 ≤ 2

∞∑

j=s+1

λ2j .

Proof: We have

Tn − Tn,s = (n− 1)1(n2

)∑

j<ℓ

K(Xj , Xℓ)−Ks(Xj , Xℓ)

︸ ︷︷ ︸

=: Gs(Xj , Xℓ)︸ ︷︷ ︸=: ∆n (U -statistic !)

EGs(X1, X2) = Eh2(X1, X2)−s∑

j=1

λjEϕj(X1)Eϕj(X2) = 0.

By Lemma 8.25, EG2s(X1, X2) =

∑∞j=s+1 λ

2j .

Norbert Henze, KIT 8.37

Page 127: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo: Gs(Xj , Xℓ) = K(Xj , Xℓ)−Ks(Xj , Xℓ)

Memo: EGs(X1, X2) = 0, EG2s(X1, X2) =

∞∑

j=s+1

λ2j

Memo: Ks(Xj , Xℓ) =s∑

ν=1

λνϕν(Xj)ϕν(Xℓ)

Check that Lemma 8.24 implies

E [Gs(X1, X2)Gs(X1, X3)] = E

[h2(X1, X2)h2(X1, X3)

]

(recall K = h2). Now,

E

[h2(X1, X2)h2(X1, X3)

]=

∫Eh2(x,X2)Eh2(x,X3) dF (x)

=

∫h1(x)

2dF (x) = V(h1(X1)) = σ21 = 0.

We thus have

E(Tn − Tn,s)2 = V(Tn − Tn,s) = (n− 1)2V(∆n)

= (n− 1)21(n2

)EG2s(X1, X2) ≤ 2

∞∑

j=s+1

λ2j , q.e.d.

Norbert Henze, KIT 8.38

Page 128: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.27 Theorem (Limit distribution of singly-degenerate U-statistics)

Let Un be a U -statistic satisfying 0 = σ21 < σ2

2 , and let h2 := h2 − ϑ. Letλ1, λ2, . . . be the nonzero eigenvalues of the integral operator on L2(R,B,dF )

associated with h2, cf. 8.21. We then have

n(Un − ϑ) D−→(k

2

) ∞∑

j=1

λj

(N2

j − 1),

where N1, N2, . . . are i.i.d. standard normal random variables.

Proof: By (8.2), (8.3),

n(Un − ϑ) =

(k

2

)n

n− 1

1

n

j 6=ℓ

h2(Xj , Xℓ) + oP(1).

︸ ︷︷ ︸= Tn, cf. (8.5)

Let

Ys :=s∑

j=1

λj

(N2

j − 1).

Check that (Ys) is a Cauchy sequence in L2. Since L2 is complete, there is a

Y ∈ L2 such that YsL2

−→ Y . If Y =:∑∞

j=1 λj(N2j − 1), then Ys

D−→ Y .

Norbert Henze, KIT 8.39

Page 129: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

We provelim

n→∞EeitTn = EeitY , t ∈ R.

The continuity theorem of Levy–Cramer implies TnD−→ Y , q.e.d.

Let t ∈ R, t 6= 0, Tn,s as in (8.6). For fixed s ∈ N, we have∣∣EeitTn − EeitY

∣∣ ≤∣∣EeitTn − EeitTn,s

∣∣+∣∣EeitTn,s − EeitYs

∣∣

+∣∣EeitYs − EeitY

∣∣=: an,s + bn,s + cs.

Fix ε > 0. We have

an,s ≤ E

∣∣∣eitTn − eitTn,s

∣∣∣ = E

∣∣∣(eit(Tn−Tn,s) − 1

)eitTn,s

∣∣∣

= E

∣∣∣(eit(Tn−Tn,s) − 1

) ∣∣∣ ≤ |t| · E∣∣Tn − Tn,s

∣∣ ( |eitx − 1| ≤ |tx|)

≤ |t| ·(E(Tn − Tn,s)

2)1/2 (Cauchy–Schwarz inequality)

≤ |t| ·(2

∞∑

j=s+1

λ2j

)1/2

(by Lemma 8.26)

≤ ε, if s ≥ s1(ε, t), since∞∑

j=s+1

λ2j → 0 as s→∞.

Norbert Henze, KIT 8.40

Page 130: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo:∣∣EeitTn − EeitY

∣∣ ≤ an,s + bn,s + cs

Memo: an,s ≤ ε, if s ≥ s1(ε, t) Memo: cs =∣∣EeitYs − EeitY

∣∣

Memo: bn,s =∣∣EeitTn,s − EeitYs

∣∣

Since YsD−→ Y we have cs ≤ ε, if s ≥ s2 = s2(ε, t).

Put s0 := max(s1, s2). It follows that

lim supn→∞

∣∣∣EeiTn − EeiY∣∣∣ ≤ 2ε + lim sup

n→∞

∣∣∣EeiTn,s0 − EeiYs0

∣∣∣︸ ︷︷ ︸= 0, since Tn,so

D−→ Ys0 , cf. 8.19

q.e.d.

Norbert Henze, KIT 8.41

Page 131: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

8.28 Example (Cramer–von Mises statistic)

Let X1, X2, . . . be i.i.d. ∼ U(0, 1),

Fn(t) :=1

n

n∑

j=1

1Xj ≤ t, 0 ≤ t ≤ 1.

Let

ω2n :=

∫ 1

0

(√n(Fn(t)− t)

)2dt.

We have (Exercise!)

ω2n = (n− 1)

1(n2

)∑

1≤i<j≤n

h(Xi, Xj) +1

6+ oP(1),

where

h(x, y) =x2

2+y2

2−max(x, y) +

1

3.

We have Eh(X1, X2) = 0 = ϑ, h1(x) = Eh(x,X2) = 0 (!)=⇒ σ2

1 = V(h1(X1)) = 0.

Eh2(X1, X2) = V(h(X1, X2)) = σ22 = 1

90(!) > 0.

Norbert Henze, KIT 8.42

Page 132: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Theorem 8.27 =⇒ look for nonzero eigenvalues of the integral operator

Ag(x) =

∫ 1

0

h(x, y)g(y)dy (8.7)

on L2 = L2([0, 1],B ∩ [0, 1],U(0, 1)).

Notice: g := g0 ≡ 1 =⇒ Ag0(x) =∫ 1

0h(x, y)dy = h1(x) = 0 =⇒ g ≡ const

has eigenvalue 0. Suppose Ag = λg, λ 6= 0 =⇒∫ 1

0

g(x)dx = 〈g, 1〉 = 1

λ〈λg, 1〉 = 1

λ〈Ag,1〉 = 1

λ〈g,A1〉

=1

λ〈g, 0〉

= 0.

In our case, the integral equation (8.7), putting Ag = λg, takes the form

λg(x) =x2

2

∫ 1

0

g(y)dy +1

2

∫ 1

0

y2g(y)dy − x∫ x

0

g(y)dy −∫ 1

x

yg(y)dy +1

3

∫ 1

0

g(y)dy

︸ ︷︷ ︸ ︸ ︷︷ ︸= 0 = 0

Norbert Henze, KIT 8.43

Page 133: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Memo: λg(x) =1

2

∫ 1

0

y2g(y)dy − x∫ x

0

g(y)dy −∫ 1

x

yg(y)dy Approach:

Differentiate this equation twice =⇒

λg′(x) = −∫ x

0

g(y)dy − xg(x) + xg(x),

λg′′(x) = −g(x).Try g(x) = cos(ax) =⇒ g′′(x) = −a2g(x)

=⇒ − g(x) = 1

a2g′′(x) =⇒ λ =

1

a2.

Since

0 =

∫ 1

0

g(x)dx =1

asin(ax)

∣∣∣1

0=

1

asin a,

we have sin a = 0 and thus a ∈ kπ : k ∈ Z \ 0. Hence,

λk :=1

k2π2, k ≥ 1,

is an eigenvalue corresponding to the normalized eigenfunction

gk(x) =1√2cos(kπx), 0 ≤ x ≤ 1.

Norbert Henze, KIT 8.44

Page 134: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Limit theorems for U -statistics

Do we have obtained all solutions of the integral equation (8.7)?

8.21 e) =⇒∞∑

k=1

‖Aϕk‖2 = Eh2(X1, X2) =1

90

for any complete orthonormal system ϕ1, ϕ2, . . .. We have

∞∑

k=1

‖Agk‖2 =∞∑

k=1

λ2k =

1

π4

∞∑

k=1

1

k4=

1

π4

π4

90=

1

90.

From Thm. 8.27 we thus obtain

ω2n

D−→∞∑

k=1

1

π2k2(N2

k − 1)+

1

6∼ ω2,

where N1, N2, . . . are i.i.d. ∼ N(0, 1).

The distribution of ω2 is called Cramer–von Mises distribution.

ω2n is a suitable statistic for testing the hypothesis of a uniform distribution on

the unit interval.

Norbert Henze, KIT 8.45

Page 135: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Basic concepts of asymptotic estimation theory

9 Basic concepts of asymptotic estimation theory

9.1 Setting

Let X1, X2, . . . be i.i.d. random variables on some probability space (Ω,A, P)taking values in a measurable space (X0,B0). X0 is called the sample space.

Mostly, (X0,B0) = (Rd,Bd). Often, we will have d = 1.

LetM1 := Q : Q probability measure on B0.

Assumption: PX1 ∈ M1 is not completely known.

9.2 Definition (Parametric model)

A parametric model for PX1 is a subset P ⊂ M1 with the following property:There are an integer k, a set Θ ⊂ Rk, Θ 6= ∅, and a bijective mapping Θ ∋ ϑ 7→Qϑ from Θ onto P . We write

P = Qϑ : ϑ ∈ Θ.

We assume PX1 ∈ Qϑ : ϑ ∈ Θ and say ϑ is the true parameter, if PX1 = Qϑ.

Norbert Henze, KIT 9.1

Page 136: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Basic concepts of asymptotic estimation theory

9.3 Examples

a) Qϑ = Exp(ϑ), Θ = (0,∞).

b) Qϑ = N(µ, σ2), ϑ = (µ, σ2), Θ = R× R>0.

c) Qϑ = Bin(n, p), ϑ = p, Θ = [0, 1].

9.4 Canonical model

If not stated otherwise, we will adopt the so-called canonical model

Ω := X N0 , A := BN

0 , Pϑ = QNϑ,

i.e., the infinite product (Ω,A,Pϑ) := ⊗∞j=1(X0,B0, Qϑ). Moreover, given

ω = (xj)j≥1 ∈ Ω, we put Xj(ω) := xj . In other words, Xj is the jth coordinateprojection. Then X1, X2, . . . are i.i.d. random variables with distribution Qϑ.

Pϑ is the distribution of X := (Xj)j≥1.

(X ,B) :=(X N

0 ,BN0

)is the sample space of X.

(X ,B, Pϑ : ϑ ∈ Θ) is a suitable statistical space for asymptotic statistics.

If n ≥ 1, A ∈ B0 ⊗ . . .⊗B0 (n factors), then

(A××∞

j=n+1X0

)= Pϑ ((X1, . . . , Xn) ∈ A) .

Norbert Henze, KIT 9.2

Page 137: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Basic concepts of asymptotic estimation theory

In what follows, we stress the dependence of expecations, variances etc. on ϑby writing Eϑ, Vϑ etc.

9.5 Definition (Asymptotic properties of estimators)

A sequence (Sn) of estimators of ϑ is a sequence Sn : X → Rk (⊃ Θ) ofmeasurable mappings such that, for each n, Sn(x), x = (xj)j≥1, depends onlyon x1, . . . , xn.

Usual notation (canonical model): Sn = Sn(X1, . . . , Xn).

The sequence (Sn) is called

a) (asymptotically) unbiased (for ϑ) :⇔(lim

n→∞

)EϑSn = ϑ ∀ϑ ∈ Θ,

b) (weakly) consistent (for ϑ) :⇔ limn→∞

Pϑ(‖Sn−ϑ‖ > ε) = 0 ∀ε > 0∀ϑ ∈ Θ,

c) strongly consistent (for ϑ) :⇔ limn→∞

Sn = ϑ Pϑ-a.s. ∀ϑ ∈ Θ,

d)√n-consistent (for ϑ) :⇔ √n(Sn − ϑ) = OPϑ (1) ∀ϑ ∈ Θ.

Notice that a) requires Eϑ‖Sn‖ <∞ ∀ϑ ∈ Θ.

Notice that b) is equivalent to SnPϑ−→ ϑ for each ϑ ∈ Θ.

Norbert Henze, KIT 9.3

Page 138: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Basic concepts of asymptotic estimation theory

9.6 Remarks

a) Often there is interest only in γ(ϑ), where γ : Θ→ Rs, 1 ≤ s < k. Thenall definitions remain valid mutatis mutandis (Sn : X → Rs,limn→∞ EϑSn = γ(ϑ) ∀ϑ ∈ Θ etc.),

b) Let Sn =: (Sn1, . . . , Snk). If (Sn) is asymptotically unbiased andVϑ(Snj)→ 0 for each j ∈ 1, . . . , k, then (Sn) is consistent. (check!)

9.7 Example Let X1, X2, . . . be i.i.d. ∼ N(µ, σ2), ϑ := (µ, σ2), γ(ϑ) := σ2.Let

Sn :=1

n− 1

n∑

j=1

(Xj −Xn

)2, Xn :=

1

n

n∑

j=1

Xj .

We have EϑSn = σ2 = γ(ϑ) ∀ϑ ∈ Θ := R× R>0. Furthermore, by Ex. 8.6,

Vϑ(Sn) =1

n

(µ4 − n− 3

n− 1µ22

)→ 0.

Hence SnPϑ−→ γ(ϑ) ∀ϑ ∈ Θ. Since

√n(Sn− σ2)

Dϑ−→ N(0, 2σ4) (use 8.3b), 8.6,8.9), (Sn) is

√n-consistent. 8.3b), 8.6, 8.9 can be used to show asymptotic

normality even in greater generality: If X1, X2, . . . i.i.d. , E(X41 ) <∞ then√

n(Sn − σ2)D−→ N(0, µ4 − σ4), where µ4 = E(X1 − EX1)

4. (check!)

Norbert Henze, KIT 9.4

Page 139: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Basic concepts of asymptotic estimation theory

9.8 Definition (Asymptotic confidence region)

Let α ∈ (0, 1). An asymptotic confidence region for ϑ at level 1−α is a sequence(Cn), where Cn : X → P(Rk) and Cn(x), x = (xj)j≥1, is only dependent onx1, . . . , xn, such that

lim infn→∞

Pϑ (Cn(X1, . . . , Xn) ∋ ϑ) ≥ 1− α ∀ϑ ∈ Θ. (9.1)

9.9 Remarks

a) We must have x ∈ X : Cn(x1, . . . , xn) ∋ ϑ ∈ A ∀n ≥ 1, ∀ϑ ∈ Θ.

b) One often has more than (9.1), namely

limn→∞

Pϑ (Cn(X1, . . . , Xn) ∋ ϑ) = 1− α ∀ϑ ∈ Θ.

9.10 Example Let X1, X2, . . . i.i.d. ∼ Po(ϑ), ϑ ∈ Θ := (0,∞).

Sn := Xn → ϑ Pϑ-a.s. for each ϑ ∈ Θ. Since Vϑ(X1) = ϑ, the CLT gives√n(Xn − ϑ)√

ϑ

Dϑ−→ N(0, 1) ∀ϑ ∈ Θ.

Norbert Henze, KIT 9.5

Page 140: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Basic concepts of asymptotic estimation theory

Memo:

√n(Xn − ϑ)√

ϑ

Dϑ−→ N(0, 1) ∀ϑ ∈ Θ.

Slutsky’s lemma =⇒√n(Xn − ϑ)√

Xn

Dϑ−→ N(0, 1) ∀ϑ ∈ Θ

=⇒ limn→∞

(∣∣∣∣∣

√n(Xn − ϑ)√

Xn

∣∣∣∣∣ ≤ Φ−1(1− α

2

))= 1− α ∀ϑ ∈ Θ.

Thus,

Cn(X1, . . . , Xn) :=

[Xn −

Φ−1(1− α2)√

n

√Xn , Xn +

Φ−1(1− α2)√

n

√Xn

]

is an asymptotic confidence region for ϑ at level 1− α.

Norbert Henze, KIT 9.6

Page 141: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Asymptotic properties of maximum likelihood estimators

10 Asymptotic properties of maximum likelihood estimators

Memo: X1, X2, . . . i.i.d. on (Ω,A,P), Xj : Ω→ X0, (X0,B0) meas. space

Memo: Θ ⊂ Rk, PX1 ∈ Qϑ : ϑ ∈ Θ = P ⊂M1.

Suppose µ is a σ-finite measure on B0.If B0 = Bd, then either

µ = λd (Borel–Lebesgue measure)

or

µ is the counting measure on a countable subset D of Rd, i.e.,µ =

∑t∈D δt.

Suppose that in 9.2∀ϑ ∈ Θ : Qϑ = f(·, ϑ)µ,

i.e.,

Qϑ(B) =

B

f(x, ϑ)µ(dx), B ∈ B0.

In other words, Qϑ has a density f(·, ϑ) with respect to µ.

Norbert Henze, KIT 10.1

Page 142: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Asymptotic properties of maximum likelihood estimators

10.1 Definition (Likelihood function, maximum likelihood estimate)

a) For fixed x ∈ X0, the function Lx : Θ→ R, defined by

Lx(ϑ) = f(x, ϑ),

is called the likelihood function (pertaining to x).

b) Any ϑ(x) ∈ Θ satisfying

f(x, ϑ(x)

)= sup

ϑ∈Θf(x, ϑ) (10.1)

is called a maximum likelihood (ML) estimate of ϑ given x.

c) A measurable mapping ϑ : X0 → Θ satisfying (10.1) for each x ∈ X0 iscalled a maximum likelihood estimator (MLE) of ϑ.

10.2 Remark Suppose Θ ⊂ Rk and assume that ∂∂ϑj

f(x, ϑ) exists

(j = 1, . . . , k; ϑ = (ϑ1, . . . , ϑk)). Then one might try to find ϑ(x) in (10.1) bysolving the log-likelihood equations

∂ϑjlog f(x, ϑ) = 0, j = 1, . . . , k.

But: Be aware of relative maxima and (relative) maxima on the boundary of Θ.

Norbert Henze, KIT 10.2

Page 143: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Asymptotic properties of maximum likelihood estimators

10.3 Example (Binomial case)

Let X0 := 0, 1n, Θ := (0, 1), µ := counting measure on X0.

For x = (x1, . . . , xn) ∈ X0, let

f(x, ϑ) = ϑ∑

nj=1

xj (1− ϑ)n−∑

nj=1

xj (”binomial case“).

∂ϑlog f(x, ϑ) = 0 =⇒ ϑ(x) =

1

n

n∑

j=1

= xn.

Notice that ϑ(x) ∈ Θ⇐⇒ 0 <∑n

j=1 xj < n. (check!)

If∑n

j=1 xj = 0 then f(x, ϑ) = (1− ϑ)n = maxϑ! and ϑ(x) = 0.

If∑n

j=1 xj = n then f(x, ϑ) = ϑn = maxϑ! and ϑ(x) = 1.

If Θ is the closed interval [0, 1], then ϑ : X0 → [0, 1] is the MLE.

Norbert Henze, KIT 10.3

Page 144: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Asymptotic properties of maximum likelihood estimators

For asymptotic considerations, the following modification of 10.1 is convenient:

Assume the setting of 9.1 and 9.2. Let µ be some σ-finite measure on B0. Letf(·, ϑ) be the density of PX1

ϑ = Qϑ with respect to µ.

Then (X1, . . . , Xn) has the density

fn(x1, . . . , xn, ϑ) :=n∏

j=1

f(xj , ϑ)

with respect to the n-fold product measure

µn := µ⊗ µ⊗ · · · ⊗ µ (n factors). (why?)

For x = (xj)j≥1 ∈ X := X N0 , let

fn(x, ϑ) := fn(x1, . . . , xn, ϑ).

In what follows, we assume the canonical model of 9.4, , i.e.,

(Ω,A, Pϑ) := ⊗∞j=1(X0,B0, Qϑ), Xj(ω) := xj , where ω = (xj)j≥1.

Norbert Henze, KIT 10.4

Page 145: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Asymptotic properties of maximum likelihood estimators

10.4 Definition (Asymptotic maximum likelihood estimator)

Let X1, X2, . . . be i.i.d. with µ-density f(·, ϑ), ϑ ∈ Θ ⊂ Rk. For n ∈ N, let

Mn :=⋃

ϑ∈Θ

x ∈ X : fn(x, ϑ) = sup

t∈Θfn(x, t)

.

Suppose there is a set M ′n ⊂Mn with M ′

n ∈ B and Pϑ(M′n)→ 1 ∀ϑ ∈ Θ.

Then any sequence (ϑn) of measurable mappings ϑn : X → Θ satisfying

fn(x, ϑn(x)

)= sup

t∈Θfn(x, t) ∀x ∈M ′

n

is called an asymptotic maximum likelihood estimator.

Aim: Under certain regularity conditions, we have

√n(ϑn − ϑ

) Dϑ−→ Nk

(0, I1(ϑ)

−1)∀ϑ ∈ Θ.

Here, I1(ϑ) is the Fisher information matrix (to be defined).

Norbert Henze, KIT 10.5

Page 146: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Asymptotic properties of maximum likelihood estimators

10.5 Definition (Regularity conditions)

a) $\forall\, z\in\mathcal{X}_0\ \forall\, i,j\in\{1,\dots,k\}$: $\frac{\partial^2}{\partial\vartheta_i\partial\vartheta_j} f(z,\vartheta)$ exists and is continuous on $\Theta$,

b) $\forall\,\vartheta\in\Theta\ \forall\, i\in\{1,\dots,k\}$ we have

$$0 = E_\vartheta\Big[\frac{\partial}{\partial\vartheta_i}\log f(X_1,\vartheta)\Big] \ \bigg(= E_\vartheta\bigg[\frac{\frac{\partial}{\partial\vartheta_i} f(X_1,\vartheta)}{f(X_1,\vartheta)}\bigg]\bigg),$$

c) $\forall\,\vartheta\in\Theta\ \forall\, i,j\in\{1,\dots,k\}$ we have

$$0 = E_\vartheta\Big[\frac{1}{f(X_1,\vartheta)}\cdot\frac{\partial^2 f(X_1,\vartheta)}{\partial\vartheta_i\partial\vartheta_j}\Big],$$

d) $\forall\,\vartheta\in\Theta\ \exists\,\delta_\vartheta>0$ such that $U(\vartheta,\delta_\vartheta) := \{y\in\mathbb{R}^k : \|y-\vartheta\|<\delta_\vartheta\}\subset\Theta$, and $\exists$ a measurable function $M(\cdot,\vartheta)\ge0$ on $\mathcal{X}_0$ with $E_\vartheta M(X_1,\vartheta)<\infty$ and

$$\Big|\frac{\partial^2}{\partial\vartheta_i\partial\vartheta_j}\log f(\cdot,\vartheta')\Big| \le M(\cdot,\vartheta) \qquad \forall\,\vartheta'\in U(\vartheta,\delta_\vartheta)\ \forall\, i,j\in\{1,\dots,k\},$$

e) for each $\vartheta\in\Theta$, the so-called Fisher information matrix

$$I_1(\vartheta) := \Big(E_\vartheta\Big[\frac{\partial}{\partial\vartheta_i}\log f(X_1,\vartheta)\,\frac{\partial}{\partial\vartheta_j}\log f(X_1,\vartheta)\Big]\Big)_{1\le i,j\le k}$$

is invertible.


10.5 b), c) mean that we can interchange the order of integration and differentiation:

$$0 = E_\vartheta\Big[\frac{\partial}{\partial\vartheta_i}\log f(X_1,\vartheta)\Big]
= \int_{\mathcal{X}_0}\frac{\partial}{\partial\vartheta_i}\log f(z,\vartheta)\,f(z,\vartheta)\,\mu(dz)
= \int_{\mathcal{X}_0}\frac{\frac{\partial}{\partial\vartheta_i}f(z,\vartheta)}{f(z,\vartheta)}\,f(z,\vartheta)\,\mu(dz)$$

$$= \int_{\mathcal{X}_0}\frac{\partial}{\partial\vartheta_i}f(z,\vartheta)\,\mu(dz)
= \frac{\partial}{\partial\vartheta_i}\underbrace{\int_{\mathcal{X}_0}f(z,\vartheta)\,\mu(dz)}_{=\,1} = 0.$$


Put

$$\frac{d}{d\vartheta} := \Big(\frac{\partial}{\partial\vartheta_1},\dots,\frac{\partial}{\partial\vartheta_k}\Big)^{\!\top}.$$

10.6 Definition and Theorem (Score vector)

$$U_1(\vartheta) := \frac{d}{d\vartheta}\log f(X_1,\vartheta)$$

is called the score vector of $X_1$. Under the regularity conditions 10.5 we have:

a) $E_\vartheta(U_1(\vartheta)) = 0$ for each $\vartheta\in\Theta$,

b) $V_\vartheta(U_1(\vartheta)) = E_\vartheta\big[U_1(\vartheta)U_1(\vartheta)^\top\big] = I_1(\vartheta)$.

Proof: a) follows from 10.5 b), b) from 10.5 e).

10.7 Remarks

a) $I_1(\vartheta)$ is the covariance matrix of the score vector.

b) Let $U_n(\vartheta) := \frac{d}{d\vartheta}\log f_n(X_1,\dots,X_n,\vartheta)$ be the score vector of $(X_1,\dots,X_n)$. We then have $E_\vartheta(U_n(\vartheta)) = 0$ (why?) and

$$V_\vartheta(U_n(\vartheta)) = E_\vartheta\big(U_n(\vartheta)U_n(\vartheta)^\top\big) = \sum_{j,\ell=1}^n E_\vartheta\Big[\frac{d}{d\vartheta}\log f(X_j,\vartheta)\,\Big(\frac{d}{d\vartheta}\log f(X_\ell,\vartheta)\Big)^{\!\top}\Big] = \sum_{j,\ell=1}^n \delta_{j,\ell}\, I_1(\vartheta) = n\, I_1(\vartheta).$$
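The two identities in b) can be checked by simulation. Below is a minimal sketch (an illustration, assuming an $N(\mu,\sigma^2)$ model, for which $I_1(\vartheta) = \mathrm{diag}\big(1/\sigma^2,\ 1/(2\sigma^4)\big)$ with $\vartheta = (\mu,\sigma^2)$): it estimates mean and covariance of $U_n(\vartheta)$ from Monte Carlo replications and compares $V_\vartheta(U_n(\vartheta))/n$ with $I_1(\vartheta)$.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sig2, n, reps = 1.0, 2.0, 50, 20000

def score(x):
    # score vector of (X_1, ..., X_n) for theta = (mu, sigma^2)
    d_mu = np.sum(x - mu) / sig2
    d_s2 = -n / (2 * sig2) + np.sum((x - mu) ** 2) / (2 * sig2 ** 2)
    return np.array([d_mu, d_s2])

U = np.array([score(rng.normal(mu, np.sqrt(sig2), size=n)) for _ in range(reps)])
I1 = np.array([[1 / sig2, 0.0], [0.0, 1 / (2 * sig2 ** 2)]])
print(U.mean(axis=0))       # ~ (0, 0)
print(np.cov(U.T) / n)      # ~ I1(theta)
print(I1)
```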


10.8 Theorem (Main theorem of maximum likelihood estimation)

Assume the conditions in 10.5. If $(\hat\vartheta_n)$ is a consistent sequence of maximum likelihood estimators, then

$$\sqrt n\big(\hat\vartheta_n-\vartheta\big) \xrightarrow{\mathcal{D}_\vartheta} N_k\big(0, I_1(\vartheta)^{-1}\big) \qquad \forall\,\vartheta\in\Theta.$$

Proof: (Sketch) Fix $\vartheta\in\Theta$ and let $M_n'\subset\mathcal{X}$ be as in 10.4 ($P_\vartheta(M_n')\to1$).

Let $U(\vartheta,\delta_\vartheta)\subset\Theta$ be as in 10.5 d) and put

$$V_n := \{x\in\mathcal{X} : \hat\vartheta_n(x)\in U(\vartheta,\delta_\vartheta)\}.$$

Since $(\hat\vartheta_n)$ is consistent, we have $P_\vartheta(V_n)\to1$. Put

$$\tilde\vartheta_n(x) := \hat\vartheta_n(x)\,\mathbf{1}_{M_n'\cap V_n}(x) + \vartheta\,\mathbf{1}_{(M_n'\cap V_n)^c}(x), \qquad x\in\mathcal{X},$$

$\Longrightarrow \tilde\vartheta_n\xrightarrow{P_\vartheta}\vartheta$. Since $P_\vartheta(\tilde\vartheta_n\ne\hat\vartheta_n)\to0$ (why?), we have $\sqrt n(\tilde\vartheta_n-\hat\vartheta_n) = o_{P_\vartheta}(1)$ (!). By Slutsky's lemma, it suffices to show

$$\sqrt n\big(\tilde\vartheta_n-\vartheta\big) \xrightarrow{\mathcal{D}_\vartheta} N_k\big(0, I_1(\vartheta)^{-1}\big).$$


Let

$$U_n(t) := \sum_{j=1}^n \frac{d}{d\vartheta}\log f(X_j,\vartheta)\Big|_{\vartheta=t}, \qquad t\in\Theta.$$

On the set $M_n'\cap V_n$, $\tilde\vartheta_n$ satisfies the log-likelihood equations $0 = U_n(\tilde\vartheta_n)$.

Idea: Make a Taylor expansion of $U_n(t)$ at $t = \vartheta$. Let

$$W_n(\vartheta) := \frac{d}{d\vartheta^\top}U_n(\vartheta) = \Big(\sum_{\ell=1}^n\frac{\partial^2}{\partial\vartheta_i\partial\vartheta_j}\log f(X_\ell,\vartheta)\Big)_{1\le i,j\le k}.$$

We claim

$$E_\vartheta(W_n(\vartheta)) = -n\,I_1(\vartheta).$$

To prove this claim, notice that

$$\frac{\partial}{\partial\vartheta_j}\Big[\frac{\partial}{\partial\vartheta_i}\log f(X_\ell,\vartheta)\Big] = \frac{\partial}{\partial\vartheta_j}\,\frac{\frac{\partial}{\partial\vartheta_i}f(X_\ell,\vartheta)}{f(X_\ell,\vartheta)} = \underbrace{\frac{1}{f(X_\ell,\vartheta)}\,\frac{\partial^2}{\partial\vartheta_i\partial\vartheta_j}f(X_\ell,\vartheta)}_{E_\vartheta[\,\cdot\,]\,=\,0,\ \text{cf. 10.5 c)}} - \underbrace{\frac{\partial}{\partial\vartheta_i}\log f(X_\ell,\vartheta)\,\frac{\partial}{\partial\vartheta_j}\log f(X_\ell,\vartheta)}_{E_\vartheta[\,\cdot\,]\,=\,(i,j)\text{ entry of } I_1(\vartheta),\ \text{cf. 10.5 e)}}.$$


A Taylor expansion gives

$$0 = U_n(\tilde\vartheta_n) = U_n(\vartheta) + W_n(\vartheta)(\tilde\vartheta_n-\vartheta) + R_n(\vartheta,\tilde\vartheta_n-\vartheta).$$

Dividing this equation by $\sqrt n$ gives

$$0 = \frac{1}{\sqrt n}U_n(\vartheta) + \frac1n W_n(\vartheta)\cdot\sqrt n(\tilde\vartheta_n-\vartheta) + \underbrace{\frac{1}{\sqrt n}R_n(\vartheta,\tilde\vartheta_n-\vartheta)}_{=\,o_{P_\vartheta}(1),\ \text{cf. 10.5 d)}}$$

$$\Longrightarrow\quad \frac1n W_n(\vartheta)\cdot\sqrt n(\tilde\vartheta_n-\vartheta) = -\frac{1}{\sqrt n}U_n(\vartheta) + o_{P_\vartheta}(1).$$

The multivariate CLT, the CMT and Slutsky imply

$$-\frac{1}{\sqrt n}U_n(\vartheta) + o_{P_\vartheta}(1) \xrightarrow{\mathcal{D}_\vartheta} N_k(0, I_1(\vartheta)).$$

The CMT gives

$$(-I_1(\vartheta))^{-1}\,\frac1n W_n(\vartheta)\,\sqrt n(\tilde\vartheta_n-\vartheta) \xrightarrow{\mathcal{D}_\vartheta} N_k\big(0, I_1(\vartheta)^{-1}\big).$$

Since $\frac1n W_n(\vartheta)\to -I_1(\vartheta)$ $P_\vartheta$-a.s., the assertion follows.
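A Monte Carlo illustration of the theorem (a sketch under the assumption of the $\mathrm{Exp}(\vartheta)$ model, where the MLE is $\hat\vartheta_n = 1/\bar X_n$ and $I_1(\vartheta)^{-1} = \vartheta^2$):

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 1.5, 400, 10000

# Exp(theta): f(x, theta) = theta * exp(-theta x); MLE = 1 / sample mean
x = rng.exponential(scale=1 / theta, size=(reps, n))
mle = 1.0 / x.mean(axis=1)
z = np.sqrt(n) * (mle - theta)

print(z.mean(), z.var())   # ~ 0 and ~ theta^2 = I_1(theta)^{-1} = 2.25
```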


10.9 Corollary (Representation of the estimation error)

Under the standing assumptions, we have

$$\sqrt n\big(\hat\vartheta_n-\vartheta\big) = \frac{1}{\sqrt n}\sum_{j=1}^n \ell(X_j,\vartheta) + o_{P_\vartheta}(1) \quad \text{as } n\to\infty,$$

where $\ell(X_j,\vartheta) = I_1(\vartheta)^{-1}\,\frac{d}{d\vartheta}\log f(X_j,\vartheta)$.

Proof: From the proof of Theorem 10.8, we have

$$\sqrt n\big(\hat\vartheta_n-\vartheta\big) = I_1(\vartheta)^{-1}\,\frac{1}{\sqrt n}\,U_n(\vartheta) + o_{P_\vartheta}(1), \quad \text{q.e.d.}$$

Notice that $E_\vartheta(\ell(X_1,\vartheta)) = 0$ and $V_\vartheta(\ell(X_1,\vartheta)) = I_1(\vartheta)^{-1}$.

$\ell(\cdot,\vartheta) : \mathcal{X}_0\to\mathbb{R}^k$ is called the influence function.

10.10 Remark Under general conditions, we have $\hat\vartheta_n\to\vartheta$ $P_\vartheta$-almost surely.

Further reading: Witting, H., Müller-Funk, U.: Mathematische Statistik II. B. G. Teubner 1995, p. 168ff.

Ferguson, Th.: A Course in Large Sample Theory. Chapman & Hall 1996, Chapters 16–18.


11 Asymptotic (relative) efficiency of estimators

11.1 Definition (Loewner semiorder)

Let $A, B$ be symmetric $(k\times k)$-matrices. We write $A\ge0$ if $A$ is positive semidefinite, and put

$$A\ge B :\iff A-B\ge0.$$

11.2 Remark (Multivariate Cramér–Rao inequality)

Let $X_1, X_2,\dots$ be i.i.d. $\sim Q_\vartheta$, where $\vartheta\in\Theta\subset\mathbb{R}^k$. Furthermore, let $Q_\vartheta = f(\cdot,\vartheta)\,\mu$, where $\mu$ is some $\sigma$-finite measure on $\mathcal{B}_0$. Let $T_n = T_n(X_1,\dots,X_n)$ be an unbiased estimator of $\vartheta$, i.e., $E_\vartheta(T_n) = \vartheta$ for each $\vartheta\in\Theta$.

Under further regularity conditions, we then have

$$V_\vartheta(T_n) = E_\vartheta\big[(T_n-\vartheta)(T_n-\vartheta)^\top\big] \ge \frac1n\, I_1(\vartheta)^{-1}$$

(multivariate Cramér–Rao inequality).


Memo: $V_\vartheta(T_n) = E_\vartheta\big[(T_n-\vartheta)(T_n-\vartheta)^\top\big] \ge \frac1n I_1(\vartheta)^{-1}$

$$\Longrightarrow\quad V_\vartheta\big(\sqrt n(T_n-\vartheta)\big) \ge I_1(\vartheta)^{-1}.$$

Now, suppose that $\sqrt n(T_n-\vartheta)\xrightarrow{\mathcal{D}_\vartheta} N_k(0,\Sigma(\vartheta))$, $\vartheta\in\Theta$.

Do we have $\Sigma(\vartheta)\ge I_1(\vartheta)^{-1}$, $\vartheta\in\Theta$?

11.3 Theorem (Bahadur)

Suppose that the conditions of 10.5 hold. Let $(T_n)$ be a sequence of estimators of $\vartheta$ such that

$$\sqrt n(T_n-\vartheta)\xrightarrow{\mathcal{D}_\vartheta} N_k(0,\Sigma(\vartheta)), \qquad \vartheta\in\Theta.$$

Then

$$\Sigma(\vartheta)\ge I_1(\vartheta)^{-1} \qquad \forall\,\vartheta\in\Theta\cap N^c,$$

where $N$ is a set with $\lambda^k(N) = 0$.

Proof: See Bahadur, Ann. Math. Statist. 38 (1967), 303–324.


11.4 Definition (BAN estimator)

Under the conditions of 10.5, a sequence $(T_n)$ of estimators of $\vartheta$ is said to be asymptotically efficient if

$$\sqrt n(T_n-\vartheta)\xrightarrow{\mathcal{D}_\vartheta} N_k\big(0, I_1(\vartheta)^{-1}\big) \qquad \forall\,\vartheta\in\Theta.$$

In this case, $T_n$ is called a best asymptotically normal (BAN) estimator.

11.5 The moment estimator

Let $X_1, X_2,\dots$ be i.i.d. $\mathbb{R}$-valued with $EX_1^{2k}<\infty$. Let

$$m_\ell := EX_1^\ell, \quad \ell = 1,\dots,2k, \qquad \hat m_{\ell,n} := \frac1n\sum_{j=1}^n X_j^\ell \xrightarrow{\text{a.s.}} m_\ell \ \text{ as } n\to\infty. \quad \text{(SLLN)}$$

Suppose that

$$\vartheta = g(m_1,\dots,m_k)\in\Theta\subset\mathbb{R}^k$$

for some continuously differentiable bijective function $g : D\subset\mathbb{R}^k\to\Theta$. Then $\hat\vartheta_n := g(\hat m_{1,n},\dots,\hat m_{k,n})$ is called the moment estimator of $\vartheta$.


Memo: $m_\ell = EX_1^\ell$, $\hat m_{\ell,n} = \frac1n\sum_{j=1}^n X_j^\ell \xrightarrow{\text{a.s.}} m_\ell$

Memo: $\vartheta = g(m_1,\dots,m_k)$, $\hat\vartheta_n := g(\hat m_{1,n},\dots,\hat m_{k,n})$

Notice that $\hat\vartheta_n\xrightarrow{\text{a.s.}}\vartheta$, $\vartheta\in\Theta$. (why?) Let

$$Y_j := \begin{pmatrix} X_j\\ X_j^2\\ \vdots\\ X_j^k \end{pmatrix}, \qquad a := EY_1 = \begin{pmatrix} m_1\\ m_2\\ \vdots\\ m_k \end{pmatrix},$$

$$T := E\big[(Y_1-a)(Y_1-a)^\top\big] = \big(E[(X_1^i-m_i)(X_1^j-m_j)]\big)_{1\le i,j\le k} = (m_{i+j}-m_i m_j)_{1\le i,j\le k}.$$

CLT $\Longrightarrow$

$$\sqrt n\,(\overline Y_n - a) = \frac{1}{\sqrt n}\Big(\sum_{j=1}^n Y_j - na\Big) \xrightarrow{\mathcal{D}} N_k(0,T).$$

$\overline Y_n = (\hat m_{1,n},\dots,\hat m_{k,n})^\top$, $g(\overline Y_n) = \hat\vartheta_n$, $g(a) = \vartheta$.


By the δ-method we have the following result:

11.6 Theorem (Limit distribution of the moment estimator)

Let $X_1, X_2,\dots$ be i.i.d., $EX_1^{2k}<\infty$, $m_\ell = EX_1^\ell$, $\hat m_{\ell,n} = n^{-1}\sum_{j=1}^n X_j^\ell$, $\ell = 1,\dots,2k$. Let $\vartheta = g(m_1,\dots,m_k)\in\Theta\subset\mathbb{R}^k$ for some continuously differentiable function $g : D\subset\mathbb{R}^k\to\Theta$. Then

$$\sqrt n\big(\hat\vartheta_n-\vartheta\big)\xrightarrow{\mathcal{D}_\vartheta} N_k(0,\Sigma(\vartheta)),$$

where

$$\Sigma(\vartheta) = g'(a)\,T\,g'(a)^\top, \qquad a = \big(EX_1, EX_1^2,\dots,EX_1^k\big)^\top, \quad T = \big(EX_1^{i+j} - EX_1^i\,EX_1^j\big)_{1\le i,j\le k}.$$

In general, $(\hat\vartheta_n)$ is not BAN.


11.7 Scoring (making estimators BAN)

Assume the conditions of 10.5. Let $(\tilde\vartheta_n)$ be any sequence of estimators of $\vartheta$ with $\tilde\vartheta_n\xrightarrow{\text{a.s.}}\vartheta$, $\vartheta\in\Theta$, and

$$\sqrt n(\tilde\vartheta_n-\vartheta)\xrightarrow{\mathcal{D}_\vartheta} N_k(0,\Sigma(\vartheta)), \qquad \vartheta\in\Theta.$$

Put

$$U_n(\vartheta) := \sum_{j=1}^n\frac{d}{d\vartheta}\log f(X_j,\vartheta), \qquad W_n(\vartheta) := \frac{d}{d\vartheta^\top}U_n(\vartheta), \quad \text{cf. proof of 10.8},$$

$$\vartheta_n^{(1)} := \tilde\vartheta_n - W_n(\tilde\vartheta_n)^{-1}U_n(\tilde\vartheta_n), \qquad \vartheta_n^{(2)} := \tilde\vartheta_n + I_1(\tilde\vartheta_n)^{-1}\,\frac1n\,U_n(\tilde\vartheta_n).$$

Let $\hat\vartheta_n$ denote the maximum likelihood estimator. For $j\in\{1,2\}$, we then have

$$\sqrt n\big(\vartheta_n^{(j)}-\hat\vartheta_n\big)\xrightarrow{P_\vartheta} 0 \qquad \forall\,\vartheta\in\Theta. \qquad (11.1)$$

From Slutsky's lemma, we obtain

$$\sqrt n\big(\vartheta_n^{(j)}-\vartheta\big) = \sqrt n\big(\hat\vartheta_n-\vartheta\big) + \sqrt n\big(\vartheta_n^{(j)}-\hat\vartheta_n\big) \xrightarrow{\mathcal{D}_\vartheta} N_k\big(0, I_1(\vartheta)^{-1}\big).$$

Hence, $(\vartheta_n^{(1)})$ and $(\vartheta_n^{(2)})$ are asymptotically efficient (BAN).


Memo: $\vartheta_n^{(1)} = \tilde\vartheta_n - W_n(\tilde\vartheta_n)^{-1}U_n(\tilde\vartheta_n)$, $U_n(\vartheta) = \sum_{j=1}^n\frac{d}{d\vartheta}\log f(X_j,\vartheta)$

We give a sketch of the proof of (11.1) in the case $j=1$; the case $j=2$ follows similarly. A Taylor expansion of $U_n(\cdot)$ at $\hat\vartheta_n$ yields

$$U_n(\tilde\vartheta_n) = \underbrace{U_n(\hat\vartheta_n)}_{=\,0} + W_n(\hat\vartheta_n)(\tilde\vartheta_n-\hat\vartheta_n) + \dots$$

Hence,

$$\vartheta_n^{(1)} - \hat\vartheta_n = \tilde\vartheta_n - \hat\vartheta_n - W_n(\tilde\vartheta_n)^{-1}\underbrace{U_n(\tilde\vartheta_n)}_{\approx\,W_n(\hat\vartheta_n)(\tilde\vartheta_n-\hat\vartheta_n)} \approx \big[I_k - W_n(\tilde\vartheta_n)^{-1}W_n(\hat\vartheta_n)\big](\tilde\vartheta_n-\hat\vartheta_n)$$

$$\Longrightarrow\ \sqrt n\big(\vartheta_n^{(1)}-\hat\vartheta_n\big) \approx \Big[I_k - \underbrace{\Big(\frac1n W_n(\tilde\vartheta_n)\Big)^{-1}}_{\xrightarrow{\text{a.s.}}\,(-I_1(\vartheta))^{-1}}\underbrace{\frac1n W_n(\hat\vartheta_n)}_{\xrightarrow{\text{a.s.}}\,-I_1(\vartheta)}\Big]\Big(\underbrace{\sqrt n(\tilde\vartheta_n-\vartheta)}_{=\,O_{P_\vartheta}(1)} - \underbrace{\sqrt n(\hat\vartheta_n-\vartheta)}_{=\,O_{P_\vartheta}(1)}\Big) = o_{P_\vartheta}(1), \quad \text{q.e.d.}$$
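A sketch of one scoring step of type $\vartheta_n^{(2)}$ (an illustration assuming the Cauchy location model, where $I_1(\vartheta) = 1/2$ and the sample median serves as the consistent preliminary estimator $\tilde\vartheta_n$):

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n = 0.7, 500
x = rng.standard_cauchy(n) + theta      # Cauchy location model

def U_n(t):
    # score of the sample: sum_j d/dtheta log f(x_j, theta) at theta = t
    return np.sum(2 * (x - t) / (1 + (x - t) ** 2))

theta_tilde = np.median(x)              # preliminary estimator
I1 = 0.5                                # Fisher information of the Cauchy location model
theta_2 = theta_tilde + U_n(theta_tilde) / (n * I1)   # one scoring step

print(theta_tilde, theta_2)             # the one-step estimator is BAN
```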

Further reading: Ferguson, Th.: A Course in Large Sample Theory. Chapman & Hall 1996, p. 133ff.


11.8 Definition (Asymptotic relative Pitman efficiency)

Let $X_1, X_2,\dots$ be i.i.d. $\sim Q_\vartheta$ as in Chapter 9, and let $\mu : \Theta\to\mathbb{R}$. Let $S_n = S_n(X_1,\dots,X_n)$ and $T_n = T_n(X_1,\dots,X_n)$ be sequences of estimators of $\mu(\vartheta)$ satisfying

$$\sqrt n\,(S_n-\mu(\vartheta))\xrightarrow{\mathcal{D}_\vartheta} N(0,\sigma^2(\vartheta)), \qquad (11.2)$$

$$\sqrt n\,(T_n-\mu(\vartheta))\xrightarrow{\mathcal{D}_\vartheta} N(0,\tau^2(\vartheta)), \qquad (11.3)$$

and $0<\sigma^2(\vartheta),\tau^2(\vartheta)<\infty$, $\vartheta\in\Theta$. Then

$$\mathrm{ARE}_\vartheta\big((T_n):(S_n)\big) := \frac{\sigma^2(\vartheta)}{\tau^2(\vartheta)}$$

is called the asymptotic relative (Pitman) efficiency of $(T_n)$ with respect to $(S_n)$.

There is the alternative notation $\mathrm{ARE}_F\big((T_n):(S_n)\big)$ if $F$ is the distribution function of $X_1$.


Memo: $\mathrm{ARE}_\vartheta((T_n):(S_n)) := \sigma^2(\vartheta)/\tau^2(\vartheta)$, $\sqrt n(S_n-\mu(\vartheta))\xrightarrow{\mathcal{D}_\vartheta} N(0,\sigma^2(\vartheta))$

11.9 Interpretation of the ARE

Let $(m_n)_{n\ge1}$, $m_n = m_n(\vartheta)$, be a sequence of integers such that $m_n\to\infty$ and

$$\sqrt n\,\big(T_{m_n}-\mu(\vartheta)\big)\xrightarrow{\mathcal{D}_\vartheta} N(0,\sigma^2(\vartheta)).$$

A comparison with (11.2) shows that the estimator $T$ with sample size $m_n$ is asymptotically equivalent (with respect to asymptotic variance) to the estimator $S$ with sample size $n$. Since

$$\sqrt{\frac{n}{m_n}}\;\underbrace{\sqrt{m_n}\,\big(T_{m_n}-\mu(\vartheta)\big)}_{\xrightarrow{\mathcal{D}_\vartheta}\,N(0,\tau^2(\vartheta)),\ \text{cf. (11.3)}} \xrightarrow{\mathcal{D}_\vartheta} N(0,\sigma^2(\vartheta)) \iff \lim_{n\to\infty}\sqrt{\frac{n}{m_n}} = \frac{\sigma(\vartheta)}{\tau(\vartheta)},$$

we obtain

$$\mathrm{ARE}_\vartheta\big((T_n):(S_n)\big) = \lim_{n\to\infty}\frac{n}{m_n(\vartheta)}.$$


11.10 Example (Estimation of the center of a symmetric distribution)

Let $X_1, X_2,\dots$ be i.i.d., $EX_1^2<\infty$, $\vartheta = EX_1$. Suppose that the distribution of $X_1$ is symmetric around $\vartheta$, i.e., $X_1-\vartheta \stackrel{\mathcal{D}}{=} -(X_1-\vartheta) = \vartheta-X_1$.

Let $F$ be the distribution function of $X_1$. Assume that $f\big(F^{-1}(1/2)\big) := F'\big(F^{-1}(1/2)\big) > 0$. Then

$$\vartheta = F^{-1}\Big(\frac12\Big) = EX_1. \qquad \text{(expectation = median)}$$

Let $\sigma_F^2 := \sigma^2(F) := V_F(X_1)$. As a first estimator, take the sample mean

$$S_n := \overline X_n := \frac1n\sum_{j=1}^n X_j.$$

The CLT gives $\sqrt n(S_n-\vartheta)\xrightarrow{\mathcal{D}_F} N(0,\sigma^2(F))$.

For a second estimator, let $X_{(1)}\le X_{(2)}\le\dots\le X_{(n)}$ be the order statistics of $X_1,\dots,X_n$. Define the empirical median of $X_1,\dots,X_n$ as

$$T_n := \begin{cases} X_{(\frac{n+1}{2})}, & \text{if } n \text{ is odd},\\[2pt] \frac12\big(X_{(\frac n2)} + X_{(\frac n2+1)}\big), & \text{if } n \text{ is even}. \end{cases}$$


Memo: $T_n = X_{(\frac{n+1}{2})}$ ($n$ odd), $T_n = \frac12\big(X_{(\frac n2)} + X_{(\frac n2+1)}\big)$ ($n$ even).

Use

$$X_{(r)}\le t \iff \sum_{j=1}^n \mathbf{1}\{X_j\le t\} \ge r$$

and $\sum_{j=1}^n \mathbf{1}\{X_j\le t\} \sim \mathrm{Bin}(n, F(t))$ to show

$$\sqrt n(T_n-\vartheta)\xrightarrow{\mathcal{D}_F} N\Big(0, \frac{1}{4f^2(F^{-1}(1/2))}\Big)$$

(Exercise!). It follows that

$$\mathrm{ARE}_F\big((T_n):(S_n)\big) = 4f^2\big(F^{-1}(\tfrac12)\big)\,\sigma^2(F).$$

Special case: $X_1\sim t_s$ (Student's t-distribution with $s$ degrees of freedom), i.e.,

$$X_1 \sim \frac{N_0}{\sqrt{\frac1s\sum_{j=1}^s N_j^2}}, \qquad \text{where } N_0,\dots,N_s \text{ are i.i.d. } \sim N(0,1).$$


$X_1$ has the density

$$f_s(x) = \frac{\Gamma\big(\frac{s+1}{2}\big)}{\sqrt{\pi s}\;\Gamma\big(\frac s2\big)}\cdot\Big(1+\frac{x^2}{s}\Big)^{-(s+1)/2}, \qquad x\in\mathbb{R}.$$

Let $F_s(t) := \int_{-\infty}^t f_s(x)\,dx$, $t\in\mathbb{R}$, be the distribution function of $X_1$. We have (!)

$$\sigma^2(F_s) = \frac{s}{s-2} \ \text{ if } s\ge3, \qquad F_s^{-1}(1/2) = 0, \qquad E_{F_s}(X_1) = 0 \ \text{ if } s\ge2.$$

It follows that

$$a_s := \mathrm{ARE}_{F_s}\big((T_n):(S_n)\big) = \frac{4\,\Gamma^2\big(\frac{s+1}{2}\big)}{\pi s\,\Gamma^2\big(\frac s2\big)}\cdot\frac{s}{s-2}.$$

  s   :   3      4      5      6      ∞
  a_s :  1.621  1.125  0.961  0.879  0.637 (= 2/π)
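The table entries follow directly from the formula for $a_s$; a short computational sketch:

```python
from math import gamma, pi

def are_t(s):
    # a_s = 4 Gamma((s+1)/2)^2 / (pi s Gamma(s/2)^2) * s / (s - 2),  s >= 3
    return 4 * gamma((s + 1) / 2) ** 2 / (pi * s * gamma(s / 2) ** 2) * s / (s - 2)

for s in (3, 4, 5, 6):
    print(s, round(are_t(s), 3))   # 1.621, 1.125, 0.961, 0.879
print("inf", round(2 / pi, 3))     # limit 2/pi = 0.637
```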


12 Asymptotic tests in parametric models

12.1 The setting

Let $X_1, X_2,\dots$ be i.i.d., $\mathcal{X}_0$-valued, having density $f(\cdot,\vartheta)$, $\vartheta\in\Theta\subset\mathbb{R}^k$, with respect to some $\sigma$-finite measure $\mu$ on $\mathcal{B}_0$. Assume $\Theta$ to be open. Furthermore, assume the conditions in 10.8.

For testing a simple hypothesis $H_0 : \vartheta = \vartheta_0$ against a simple alternative $H_1 : \vartheta = \vartheta_1$ ($\vartheta_0,\vartheta_1\in\Theta$, $\vartheta_0\ne\vartheta_1$), there is an optimal (Neyman–Pearson) test. It uses the test statistic

$$T_n := \prod_{j=1}^n\frac{f(X_j,\vartheta_1)}{f(X_j,\vartheta_0)} \qquad \text{(likelihood ratio)};$$

$H_0$ is rejected for large values of $T_n$.

Now let $\Theta_0\subset\Theta$, $\Theta_0\ne\emptyset$, $\Theta\setminus\Theta_0\ne\emptyset$.

Hypothesis $H_0 : \vartheta\in\Theta_0$, Alternative $H_1 : \vartheta\in\Theta\setminus\Theta_0$.


12.2 Definition (Generalized likelihood ratio)

$$\Lambda_n := \Lambda_n(X_1,\dots,X_n) := \frac{\sup_{\vartheta\in\Theta_0}\prod_{j=1}^n f(X_j,\vartheta)}{\sup_{\vartheta\in\Theta}\prod_{j=1}^n f(X_j,\vartheta)} \quad (\le1)$$

is called the generalized likelihood ratio (GLR). The GLR test rejects $H_0$ for small values of $\Lambda_n$.

12.3 Theorem (Simple hypothesis)

Let $\Theta_0 = \{\vartheta_0\}$ and $M_n := -2\log\Lambda_n$. We then have

$$M_n \xrightarrow{\mathcal{D}_{\vartheta_0}} \chi^2_k \quad \text{as } n\to\infty.$$

Proof: Let

$$U_n(\vartheta) := \sum_{j=1}^n\frac{d}{d\vartheta}\log f(X_j,\vartheta), \qquad W_n(\vartheta) := \Big(\sum_{\ell=1}^n\frac{\partial^2}{\partial\vartheta_i\partial\vartheta_j}\log f(X_\ell,\vartheta)\Big)_{1\le i,j\le k} = \frac{d}{d\vartheta^\top}U_n(\vartheta).$$


Let $\hat\vartheta_n$ be the maximum likelihood estimator of $\vartheta$. We have

$$0 = U_n(\hat\vartheta_n) = U_n(\vartheta_0) + W_n(\vartheta_0)(\hat\vartheta_n-\vartheta_0) + o_{P_{\vartheta_0}}(\sqrt n). \qquad (12.1)$$

Notice that

$$\Lambda_n = \frac{\sup_{\vartheta\in\Theta_0}\prod_{j=1}^n f(X_j,\vartheta)}{\sup_{\vartheta\in\Theta}\prod_{j=1}^n f(X_j,\vartheta)} = \prod_{j=1}^n\frac{f(X_j,\vartheta_0)}{f(X_j,\hat\vartheta_n)}.$$

It follows that (recall: $M_n = -2\log\Lambda_n$)

$$M_n = 2\sum_{j=1}^n\big[\log f(X_j,\hat\vartheta_n) - \log f(X_j,\vartheta_0)\big] = 2\Big(\underbrace{U_n^\top(\vartheta_0)}_{=\,-(\hat\vartheta_n-\vartheta_0)^\top W_n(\vartheta_0)\,+\,o_{P_{\vartheta_0}}(\sqrt n)\ \text{(by (12.1))}}(\hat\vartheta_n-\vartheta_0) + \frac12(\hat\vartheta_n-\vartheta_0)^\top W_n(\vartheta_0)(\hat\vartheta_n-\vartheta_0) + o_{P_{\vartheta_0}}(1)\Big)$$

$$= \underbrace{\big(\sqrt n(\hat\vartheta_n-\vartheta_0)\big)^{\!\top}}_{\xrightarrow{\mathcal{D}_{\vartheta_0}}\,Z^\top,\ Z\sim N_k(0,\,I_1(\vartheta_0)^{-1})}\underbrace{\Big(-\frac1n W_n(\vartheta_0)\Big)}_{\xrightarrow{\text{a.s.}}\,I_1(\vartheta_0)}\underbrace{\sqrt n(\hat\vartheta_n-\vartheta_0)}_{\xrightarrow{\mathcal{D}_{\vartheta_0}}\,Z} + o_{P_{\vartheta_0}}(1).$$

CMT $\Longrightarrow M_n\xrightarrow{\mathcal{D}_{\vartheta_0}} Z^\top I_1(\vartheta_0)Z \sim \chi^2_k$, q.e.d.


12.4 Example (Multinomial distribution)

For $s\ge2$, let

$$\Theta := \{\vartheta := (p_1,\dots,p_{s-1}) : p_j>0\ \forall\, j,\ p_1+\dots+p_{s-1}<1\},$$

and put $p_s := 1-p_1-\dots-p_{s-1}$. Then $\Theta$ is an open subset of $\mathbb{R}^k$, where $k = s-1$.

Let $e_j$ be the $j$th canonical unit vector in $\mathbb{R}^s$, $j = 1,\dots,s$.

Let $X_1, X_2,\dots$ be i.i.d. $s$-dimensional random vectors, where

$$P_\vartheta(X_1 = e_j) = p_j, \qquad j = 1,\dots,s.$$

Notice that

$$\sum_{j=1}^n X_j =: (N_1,\dots,N_s) \sim \mathrm{Mult}(n; p_1,\dots,p_s).$$

Put

$$f(t,\vartheta) := \begin{cases} p_j, & \text{if } t = e_j\ (j = 1,\dots,s),\\ 0, & \text{otherwise}. \end{cases}$$

Then $X_1$ has density $f(\cdot,\vartheta)$ w.r.t. the counting measure $\mu$ on $\{e_1,\dots,e_s\}$.


The joint density of $(X_1,\dots,X_n)$ is

$$\prod_{j=1}^n f(X_j,\vartheta) = p_1^{N_1}p_2^{N_2}\cdots p_s^{N_s}.$$

Let $\Theta_0 := \{\vartheta_0\} = \{(q_1,\dots,q_{s-1})\}$, $q_s := 1-q_1-\dots-q_{s-1}$.

Hypothesis $H_0 : \vartheta\in\Theta_0$ (i.e., $p_j = q_j\ \forall\, j$), Alternative $H_1 : \vartheta\notin\Theta_0$.

The MLE of $\vartheta$ is $\hat\vartheta_n = (\hat p_1,\dots,\hat p_{s-1})$, $\hat p_j = \frac{N_j}{n}$ (!)

$$\Longrightarrow\ \Lambda_n = \frac{q_1^{N_1}\cdots q_s^{N_s}}{\hat p_1^{N_1}\cdots\hat p_s^{N_s}} = \prod_{i=1}^s\Big(\frac{q_i}{\hat p_i}\Big)^{N_i} \ \Longrightarrow\ M_n = 2\sum_{i=1}^s N_i\log\Big(\frac{N_i}{nq_i}\Big) \xrightarrow{\mathcal{D}_{\vartheta_0}} \chi^2_{s-1}.$$

Compare with the $\chi^2$-test statistic

$$T_n := \sum_{i=1}^s\frac{(N_i-nq_i)^2}{nq_i}.$$

We have (Exercise!) $T_n - M_n = o_{P_{\vartheta_0}}(1)$ (hence $T_n\xrightarrow{\mathcal{D}_{\vartheta_0}}\chi^2_{s-1}$).
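A small simulation (a sketch; the logarithm requires $N_i > 0$, which holds with overwhelming probability for the sample size chosen here) shows how close $M_n$ and $T_n$ are under $H_0$:

```python
import numpy as np

rng = np.random.default_rng(4)
q = np.array([0.2, 0.3, 0.5])      # hypothesis H0: p = q
n = 1000
N = rng.multinomial(n, q)          # (N_1, ..., N_s)

M_n = 2 * np.sum(N * np.log(N / (n * q)))    # GLR statistic (assumes N_i > 0)
T_n = np.sum((N - n * q) ** 2 / (n * q))     # chi-square statistic

print(M_n, T_n)   # close to each other; both approx chi^2 with s-1 = 2 df under H0
```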


12.5 Theorem (Composite hypothesis)

In 12.1, let $\Theta_0 = h(U)$, where $U\subset\mathbb{R}^\ell$ is open, $1\le\ell<k$, and $h : U\to\mathbb{R}^k$ is twice continuously differentiable and injective. Under further regularity conditions ($\to$ proof), we have

$$M_n = -2\log\Lambda_n \xrightarrow{\mathcal{D}_\vartheta} \chi^2_{k-\ell} \quad \text{for each } \vartheta\in\Theta_0.$$

Proof: (Sketch) Fix $\vartheta_0 = h(u_0)\in\Theta_0$, where $u_0\in U$.

Let $\hat\vartheta_n$ be a consistent MLE of $\vartheta_0$, and let $\hat u_n$ be a consistent MLE of $u_0$. Then $h(\hat u_n)$ is a consistent MLE of $\vartheta_0$ within (the submodel given by) $\Theta_0$.

We have

$$\Lambda_n = \prod_{j=1}^n\frac{f(X_j,h(\hat u_n))}{f(X_j,\hat\vartheta_n)}, \qquad M_n = 2\sum_{j=1}^n\Big\{\log f(X_j,\hat\vartheta_n) - \log f(X_j,h(\hat u_n))\Big\}. \qquad (12.2)$$

Let $h'(u)\in\mathbb{R}^{k\times\ell}$ be the Jacobian matrix of $h$ at $u$.


Memo: $M_n = 2\sum_{j=1}^n\big\{\log f(X_j,\hat\vartheta_n) - \log f(X_j,h(\hat u_n))\big\}$

The chain rule gives

$$\frac{d}{du}\log f(X_j,h(u)) = h'(u)^\top\,\frac{d}{d\vartheta}\log f(X_j,\vartheta)\Big|_{\vartheta=h(u)}. \qquad (12.3)$$

Differentiating once more and taking expectations (the additional term involving the second derivatives of $h$ multiplies the score and therefore vanishes in expectation, by 10.5 b)),

$$E_u\Big[\frac{d^2}{du\,du^\top}\log f(X_j,h(u))\Big] = h'(u)^\top\,\underbrace{E_u\Big[\frac{d^2}{d\vartheta\,d\vartheta^\top}\log f(X_j,\vartheta)\Big|_{\vartheta=h(u)}\Big]}_{=\,-I_1(h(u))}\,h'(u) =: -I_1(u).$$

Let

$$U_n(u) := \sum_{j=1}^n\frac{d}{du}\log f(X_j,h(u)), \qquad U_n(h(u)) = \sum_{j=1}^n\frac{d}{d\vartheta}\log f(X_j,\vartheta)\Big|_{\vartheta=h(u)}.$$

Then (12.3) implies

$$U_n(u) = h'(u)^\top U_n(h(u)). \qquad (12.4)$$


From 10.9 (representation of the estimation error), we have

$$\sqrt n(\hat\vartheta_n-\vartheta_0) = I_1(\vartheta_0)^{-1}\frac{1}{\sqrt n}U_n(\vartheta_0) + o_{P_{\vartheta_0}}(1), \qquad (12.5)$$

$$\sqrt n(\hat u_n-u_0) = I_1(u_0)^{-1}\frac{1}{\sqrt n}U_n(u_0) + o_{P_{u_0}}(1). \qquad (12.6)$$

(12.5) $\Longrightarrow$

$$\frac{1}{\sqrt n}U_n(\vartheta_0) = I_1(\vartheta_0)\sqrt n(\hat\vartheta_n-\vartheta_0) + o_{P_{\vartheta_0}}(1). \qquad (12.7)$$

(12.4) and (12.6) yield

$$\sqrt n(\hat u_n-u_0) = I_1(u_0)^{-1}h'(u_0)^\top\frac{1}{\sqrt n}U_n(\vartheta_0) + o_{P_{\vartheta_0}}(1) \qquad (12.8)$$

$$\stackrel{(12.7)}{=} \underbrace{I_1(u_0)^{-1}h'(u_0)^\top I_1(\vartheta_0)}_{=:\,A}\,\sqrt n(\hat\vartheta_n-\vartheta_0) + o_{P_{\vartheta_0}}(1).$$

Let

$$H_n(\vartheta_0) := \sum_{j=1}^n\frac{d^2}{d\vartheta\,d\vartheta^\top}\log f(X_j,\vartheta)\Big|_{\vartheta=\vartheta_0} \ \Longrightarrow\ \frac1n H_n(\vartheta_0)\to-I_1(\vartheta_0) \quad P_{\vartheta_0}\text{-a.s.}$$


Taylor expansions at $\vartheta_0$ resp. $u_0$ give

$$\sum_{j=1}^n\log f(X_j,\hat\vartheta_n) = \sum_{j=1}^n\log f(X_j,\vartheta_0) + U_n(\vartheta_0)^\top(\hat\vartheta_n-\vartheta_0) + \frac12(\hat\vartheta_n-\vartheta_0)^\top H_n(\vartheta_0)(\hat\vartheta_n-\vartheta_0) + o_{P_{\vartheta_0}}(1), \qquad (12.9)$$

$$\sum_{j=1}^n\log f(X_j,h(\hat u_n)) = \sum_{j=1}^n\log f(X_j,\underbrace{h(u_0)}_{=\,\vartheta_0}) + U_n(u_0)^\top(\hat u_n-u_0) + \frac12(\hat u_n-u_0)^\top\underbrace{H_n(u_0)}_{=\,h'(u_0)^\top H_n(\vartheta_0)h'(u_0)}(\hat u_n-u_0) + o_{P_{\vartheta_0}}(1). \qquad (12.10)$$

Memo: $M_n = 2\sum_{j=1}^n\big\{\log f(X_j,\hat\vartheta_n) - \log f(X_j,h(\hat u_n))\big\}$

Memo: $\sqrt n(\hat u_n-u_0) = A\sqrt n(\hat\vartheta_n-\vartheta_0) + o_{P_{\vartheta_0}}(1)$

Put $Z_n := \sqrt n(\hat\vartheta_n-\vartheta_0)$ and plug (12.7), (12.8) into (12.9), (12.10) $\Longrightarrow$


Memo: $Z_n := \sqrt n(\hat\vartheta_n-\vartheta_0)$

$$\Longrightarrow\ M_n = \dots = Z_n^\top\Big(I_1(\vartheta_0)\big[I_k - h'(u_0)A\big]\Big)Z_n + o_{P_{\vartheta_0}}(1). \quad \text{(check!)}$$

Now, from the main theorem of ML estimation,

$$Z_n\xrightarrow{\mathcal{D}_{\vartheta_0}} Z\sim N_k(0,\Sigma), \qquad \Sigma = I_1(\vartheta_0)^{-1}.$$

Let $B := I_1(\vartheta_0)\big[I_k - h'(u_0)A\big]$. The matrix $B$ has the following properties (check!):

$B$ is symmetric, $\quad B\Sigma = (B\Sigma)^2, \quad \mathrm{Rank}(B\Sigma) = k-\ell$.

By a general result (Exercise), it follows that $Z^\top BZ\sim\chi^2_{k-\ell}$.

By the CMT, $M_n\xrightarrow{\mathcal{D}_{\vartheta_0}} Z^\top BZ$, and the assertion follows.


12.6 Corollary Given $\alpha\in(0,1)$, let $\chi^2_{k-\ell;1-\alpha}$ be the $(1-\alpha)$-quantile of the $\chi^2_{k-\ell}$-distribution. Then, in the setting of 12.1 and 12.5, the sequence $(\varphi_n)$ of tests

$$\varphi_n := \mathbf{1}\big\{M_n \ge \chi^2_{k-\ell;1-\alpha}\big\}$$

(the so-called generalized likelihood ratio test) has asymptotic level $\alpha$, i.e.,

$$\lim_{n\to\infty}E_\vartheta(\varphi_n) = \alpha \qquad \forall\,\vartheta\in\Theta_0.$$

12.7 Theorem (Consistency of the generalized likelihood ratio test)

The generalized likelihood ratio test is consistent against each fixed alternative, i.e., we have

$$\lim_{n\to\infty}E_\vartheta(\varphi_n) = 1 \qquad \forall\,\vartheta\in\Theta\setminus\Theta_0.$$

Proof: We only consider the case $\ell = 0$, i.e., $\Theta_0 = \{\vartheta_0\}$. Fix $\vartheta_1\ne\vartheta_0$. We have

$$\Lambda_n = \prod_{j=1}^n\frac{f(X_j,\vartheta_0)}{f(X_j,\vartheta_1)}\cdot\prod_{j=1}^n\frac{f(X_j,\vartheta_1)}{f(X_j,\hat\vartheta_n)}.$$


Memo: $\Lambda_n = \prod_{j=1}^n\frac{f(X_j,\vartheta_0)}{f(X_j,\vartheta_1)}\cdot\prod_{j=1}^n\frac{f(X_j,\vartheta_1)}{f(X_j,\hat\vartheta_n)}$

It follows that

$$M_n = -2\log\Lambda_n = \underbrace{2\sum_{j=1}^n\big\{\log f(X_j,\hat\vartheta_n) - \log f(X_j,\vartheta_1)\big\}}_{=:\,Y_n} + 2n\cdot\underbrace{\frac1n\sum_{j=1}^n\log\frac{f(X_j,\vartheta_1)}{f(X_j,\vartheta_0)}}_{=:\,V_n}.$$

From Theorem 12.3 we have $Y_n\xrightarrow{\mathcal{D}_{\vartheta_1}}\chi^2_k$ as $n\to\infty$.

By the SLLN,

$$V_n\to E_{\vartheta_1}\Big[\log\frac{f(X_1,\vartheta_1)}{f(X_1,\vartheta_0)}\Big] \quad P_{\vartheta_1}\text{-a.s.}$$


Since $\log t \ge 1-\frac1t$ (with equality only for $t = 1$), we have

$$E_{\vartheta_1}\Big[\log\frac{f(X_1,\vartheta_1)}{f(X_1,\vartheta_0)}\Big] = \int\log\frac{f(z,\vartheta_1)}{f(z,\vartheta_0)}\,f(z,\vartheta_1)\,\mu(dz) > \int\Big(1-\frac{f(z,\vartheta_0)}{f(z,\vartheta_1)}\Big)f(z,\vartheta_1)\,\mu(dz) = \int f(z,\vartheta_1)\,\mu(dz) - \int f(z,\vartheta_0)\,\mu(dz) = 0.$$

Hence $M_n\xrightarrow{P}\infty$ under $P_{\vartheta_1}$, i.e., $\lim_{n\to\infty}P_{\vartheta_1}(M_n\ge c) = 1$ for each $c>0$, q.e.d.

12.8 Definition (Kullback–Leibler information)

Let $Q_{\vartheta_j} = f(\cdot,\vartheta_j)\,\mu$, $j = 0,1$, and assume $Q_{\vartheta_1}\ll Q_{\vartheta_0}$. Then

$$I_{KL}(Q_{\vartheta_1},Q_{\vartheta_0}) := E_{\vartheta_1}\Big[\log\frac{f(X_1,\vartheta_1)}{f(X_1,\vartheta_0)}\Big] = \int_{\mathcal{X}_0}\log\frac{f(z,\vartheta_1)}{f(z,\vartheta_0)}\,f(z,\vartheta_1)\,\mu(dz)$$

is called the Kullback–Leibler information of $Q_{\vartheta_1}$ w.r.t. $Q_{\vartheta_0}$.


12.9 Contingency tables

Let $(X,Y)$ be a discrete bivariate random vector, where

$$p_{i,j} := P(X = x_i, Y = y_j), \qquad 1\le i\le r,\ 1\le j\le s.$$

Let

$$\Theta := \Big\{\vartheta := (p_{1,1},\dots,p_{r,s-1})\in\mathbb{R}^{rs-1} : p_{i,j}>0\ \forall\, i,j,\ \sum_{(i,j)\ne(r,s)}p_{i,j}<1\Big\}.$$

$\Theta$ is an open subset of $\mathbb{R}^k$, where $k = rs-1$.

Let $(X_1,Y_1),\dots,(X_n,Y_n)$ be i.i.d. $\sim (X,Y)$. Put

$$N_{i,j} := \sum_{\nu=1}^n\mathbf{1}\{X_\nu = x_i, Y_\nu = y_j\} \ \Longrightarrow\ (N_{1,1},\dots,N_{r,s})\sim\mathrm{Mult}(n; p_{1,1},\dots,p_{r,s}).$$

[Table: the counts $N_{i,j}$ arranged in an $r\times s$ contingency table, with row sums $N_{i+} := \sum_{j=1}^s N_{i,j}$, column sums $N_{+j} := \sum_{i=1}^r N_{i,j}$, and overall total $n$.]


$H_0$: $X, Y$ independent $\iff p_{i,j} = p_i\,q_j\ \forall\, i,j$, where

$$p_i = P(X = x_i),\ i = 1,\dots,r, \qquad q_j = P(Y = y_j),\ j = 1,\dots,s.$$

$H_0$ corresponds to

$$\Theta_0 := \Big\{\vartheta\in\Theta : \exists\, p_1,\dots,p_{r-1}>0,\ \sum_{i=1}^{r-1}p_i<1,\ \exists\, q_1,\dots,q_{s-1}>0,\ \sum_{j=1}^{s-1}q_j<1, \text{ such that } p_{i,j} = p_iq_j\ \forall\,(i,j)\ne(r,s)\Big\}.$$

$\Theta_0$ is an $\ell$-dimensional submanifold of $\mathbb{R}^k$, where $\ell = r-1+s-1$.

The density $f(\cdot,\cdot,\vartheta)$ of $(X,Y)$ with respect to the counting measure on $\{x_1,\dots,x_r\}\times\{y_1,\dots,y_s\}$ is

$$f(x,y,\vartheta) := p_{i,j}, \quad \text{if } (x,y) = (x_i,y_j)$$

$$\Longrightarrow\ \prod_{\nu=1}^n f(X_\nu,Y_\nu,\vartheta) = \prod_{i=1}^r\prod_{j=1}^s p_{i,j}^{N_{i,j}}.$$


We have (in all products, $i$ runs from 1 to $r$ and $j$ from 1 to $s$)

$$\Lambda_n = \frac{\sup_{p_i,q_j}\prod_i\prod_j(p_iq_j)^{N_{i,j}}}{\prod_i\prod_j\big(\frac{N_{i,j}}{n}\big)^{N_{i,j}}} = \frac{\sup_{p_i}\prod_i p_i^{N_{i+}}\ \sup_{q_j}\prod_j q_j^{N_{+j}}}{\prod_i\prod_j\big(\frac{N_{i,j}}{n}\big)^{N_{i,j}}} = \frac{\prod_i\big(\frac{N_{i+}}{n}\big)^{N_{i+}}\prod_j\big(\frac{N_{+j}}{n}\big)^{N_{+j}}}{\prod_i\prod_j\big(\frac{N_{i,j}}{n}\big)^{N_{i,j}}}.$$

Generalized likelihood ratio test:

$$M_n = -2\log\Lambda_n = 2\Big\{\sum_{i,j}N_{i,j}\log\frac{N_{i,j}}{n} - \sum_i N_{i+}\log\frac{N_{i+}}{n} - \sum_j N_{+j}\log\frac{N_{+j}}{n}\Big\}$$

$$= 2n\sum_{i,j}\Big\{\frac{N_{i,j}}{n}\log\frac{N_{i,j}}{n} - \frac{N_{i+}}{n}\,\frac{N_{+j}}{n}\,\log\Big(\frac{N_{i+}}{n}\,\frac{N_{+j}}{n}\Big)\Big\}$$

(use $n = \sum_j N_{+j} = \sum_i N_{i+}$).


In this case we have $k-\ell = rs-1-(r-1+s-1) = (r-1)(s-1)$.

Thus, the generalized likelihood ratio test for independence in an $r\times s$ contingency table is:

Reject $H_0$ if $M_n \ge \chi^2_{(r-1)(s-1);1-\alpha}$.

Using

$$t\log t = t-1+\frac12(t-1)^2 + O\big((t-1)^3\big) \quad \text{as } t\to1,$$

an asymptotically equivalent test statistic is (Exercise!)

$$T_n := \sum_{i=1}^r\sum_{j=1}^s\frac{\big(N_{i,j} - n\,\frac{N_{i+}}{n}\frac{N_{+j}}{n}\big)^2}{n\,\frac{N_{i+}}{n}\,\frac{N_{+j}}{n}}.$$

This is more intuitive than $M_n$, since $\frac{N_{i+}}{n}$ estimates $p_i$ and $\frac{N_{+j}}{n}$ estimates $q_j$.
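A numerical sketch of both statistics for a simulated $2\times3$ table generated under independence (all cell counts are positive with overwhelming probability here):

```python
import numpy as np

rng = np.random.default_rng(5)
p, q = np.array([0.3, 0.7]), np.array([0.2, 0.3, 0.5])
n = 2000
cells = rng.multinomial(n, np.outer(p, q).ravel()).reshape(2, 3)  # H0 is true

Ni, Nj = cells.sum(axis=1), cells.sum(axis=0)   # row sums N_{i+}, column sums N_{+j}
E = np.outer(Ni, Nj) / n                        # expected counts n (N_{i+}/n)(N_{+j}/n)

M_n = 2 * np.sum(cells * np.log(cells / E))     # GLR statistic (assumes cells > 0)
T_n = np.sum((cells - E) ** 2 / E)              # chi-square statistic

print(M_n, T_n)   # both approx chi^2 with (r-1)(s-1) = 2 df under H0
```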


12.10 The parametric bootstrap

Let $X_1, X_2,\dots,X_n,\dots$ be i.i.d. $\mathcal{X}_0$-valued random variables with unknown distribution $P^{X_1}$.

We want to test the hypothesis

$$H_0 : P^{X_1}\in\{P_\vartheta : \vartheta\in\Theta_0\},$$

where $\Theta_0\subset\mathbb{R}^k$.

Let $T_n = T_n(X_1,\dots,X_n)$ be a sequence of test statistics for $H_0$ with upper rejection region, i.e., reject $H_0$ if $T_n>c$.

Here, $c$ is a suitable critical value that depends on the chosen level of significance $\alpha\in(0,1)$ and on the sample size $n$, i.e., $c = c(n,\alpha)$. But $c$ also depends on the underlying unknown distribution of $T_n$ under $H_0$ if $H_0$ is composite, i.e., if $|\Theta_0|\ge2$.

We want to have

$$P_\vartheta(T_n>c) = \alpha \quad \text{for each } \vartheta\in\Theta_0.$$

This goal cannot be achieved!


Basic idea of the parametric bootstrap:

Let $(\hat\vartheta_n)$ be a consistent sequence of estimators of $\vartheta$.

If (a realization of) $\hat\vartheta_n$ is near $\vartheta$, then $P_{\hat\vartheta_n}$ should be near $P_\vartheta$; for short:

$$\hat\vartheta_n\approx\vartheta \ \Longrightarrow\ P_{\hat\vartheta_n}\approx P_\vartheta.$$

If $X_1^*,\dots,X_n^*$ are i.i.d. $\sim P_{\hat\vartheta_n}$, then

$$P_\vartheta\big(T_n(X_1,\dots,X_n)>c\big) \approx P_{\hat\vartheta_n}\big(T_n(X_1^*,\dots,X_n^*)>c\big).$$

The right-hand side can be estimated by the parametric bootstrap (B. Efron 1977).

The bootstrap algorithm: Given $\hat\vartheta_n = \hat\vartheta_n(X_1,\dots,X_n)$ (i.e., conditionally on $X_1,\dots,X_n$):

(B1) Generate $X_1^*,\dots,X_n^*$ i.i.d. $\sim P_{\hat\vartheta_n}$, ($\to$ pseudorandom numbers)

(B2) Compute $T_n(X_1^*,\dots,X_n^*)$.

Carry out (B1), (B2) $b$ times ($b$ is the number of bootstrap replications).


$$X_{1,1}^*,\dots,X_{n,1}^* \ \to\ T_n(X_{1,1}^*,\dots,X_{n,1}^*) =: T_{n,1}^*,$$
$$X_{1,2}^*,\dots,X_{n,2}^* \ \to\ T_n(X_{1,2}^*,\dots,X_{n,2}^*) =: T_{n,2}^*,$$
$$\vdots$$
$$X_{1,b}^*,\dots,X_{n,b}^* \ \to\ T_n(X_{1,b}^*,\dots,X_{n,b}^*) =: T_{n,b}^*.$$

Here, $X_{i,j}^*$, $i = 1,\dots,n$; $j = 1,\dots,b$, are i.i.d. $\sim P_{\hat\vartheta_n}$.

Let

$$H_{n,b}^*(t) := \frac1b\sum_{j=1}^b\mathbf{1}\{T_{n,j}^*\le t\}$$

be the empirical distribution function of $T_{n,1}^*,\dots,T_{n,b}^*$.

Let $T_{(1)}^*\le\dots\le T_{(b)}^*$ be the order statistics of $T_{n,1}^*,\dots,T_{n,b}^*$. Put

$$c_{n,b}^*(\alpha) := H_{n,b}^{*-1}(1-\alpha) = \begin{cases} T^*_{(b(1-\alpha))}, & \text{if } b(1-\alpha)\in\mathbb{N},\\ T^*_{(\lfloor b(1-\alpha)+1\rfloor)}, & \text{otherwise}. \end{cases}$$

Reject $H_0$ if $T_n > c_{n,b}^*(\alpha)$.
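A minimal sketch of steps (B1)-(B2) (with illustrative choices not fixed by the lecture: an exponential null family and a Kolmogorov-Smirnov-type statistic $T_n$ with estimated parameter; the bootstrap critical value is computed with np.quantile rather than the explicit order statistic, which agrees up to interpolation):

```python
import numpy as np

rng = np.random.default_rng(6)

def T_n(x):
    # KS distance between the EDF and the fitted Exp distribution (illustrative)
    n = len(x)
    lam = 1.0 / x.mean()                       # estimator theta_hat_n
    F = 1.0 - np.exp(-lam * np.sort(x))
    d_plus = np.max(np.arange(1, n + 1) / n - F)
    d_minus = np.max(F - np.arange(0, n) / n)
    return np.sqrt(n) * max(d_plus, d_minus)

x = rng.exponential(scale=2.0, size=100)       # data; H0 (exponentiality) true here
t_obs = T_n(x)

b, alpha = 999, 0.05
lam_hat = 1.0 / x.mean()
t_star = np.array([T_n(rng.exponential(scale=1 / lam_hat, size=len(x)))
                   for _ in range(b)])         # (B1) + (B2), b replications
c_star = np.quantile(t_star, 1 - alpha)        # bootstrap critical value c*_{n,b}(alpha)

print(t_obs, c_star, t_obs > c_star)           # reject H0 iff True
```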


12.11 Theorem Let $H_{n,\vartheta}(t) := P_\vartheta(T_n\le t)$, $\vartheta\in\Theta$. Suppose that for each $\vartheta\in\Theta$ there is a continuous distribution function $H_\vartheta$ that is strictly increasing on $\{t\in\mathbb{R} : 0<H_\vartheta(t)<1\}$. Suppose further that for any sequence $(\vartheta_n)$ in $\Theta$ satisfying $\lim_{n\to\infty}\vartheta_n = \vartheta\in\Theta$ we have

$$\lim_{n\to\infty}\|H_{n,\vartheta_n}-H_\vartheta\|_\infty = 0. \qquad (12.11)$$

Finally, assume that

$$\hat\vartheta_n\xrightarrow{P_\vartheta}\vartheta, \qquad \vartheta\in\Theta. \qquad (12.12)$$

We then have for each $\vartheta\in\Theta$:

a) $\|H_{n,b}^*-H_\vartheta\|_\infty\xrightarrow{P_\vartheta}0$ as $n,b\to\infty$,

b) $c_{n,b}^*(\alpha)\xrightarrow{P_\vartheta}H_\vartheta^{-1}(1-\alpha)$ as $n,b\to\infty$,

c) $\lim_{n,b\to\infty}P_\vartheta\big(T_n>c_{n,b}^*(\alpha)\big) = \alpha$.

Proof: (12.11), (12.12) and the subsequence criterion for stochastic convergence yield

$$\|H_{n,\hat\vartheta_n}-H_\vartheta\|_\infty\xrightarrow{P_\vartheta}0 \quad \text{as } n\to\infty. \qquad (12.13)$$


Memo: To show: a) $\|H_{n,b}^*-H_\vartheta\|_\infty\xrightarrow{P_\vartheta}0$ as $n,b\to\infty$.

Memo: We know: $\|H_{n,\hat\vartheta_n}-H_\vartheta\|_\infty\xrightarrow{P_\vartheta}0$ as $n\to\infty$.

Let $U_1, U_2,\dots$ be i.i.d. with the uniform distribution $\mathrm{U}(0,1)$, independent of $X_1, X_2,\dots$. W.l.o.g., let

$$T_{n,j}^* := H_{n,\hat\vartheta_n}^{-1}(U_j) = \inf\{t : H_{n,\hat\vartheta_n}(t)\ge U_j\}.$$

Notice that, conditionally on $\hat\vartheta_n$, $T_{n,1}^*, T_{n,2}^*,\dots$ are i.i.d. with distribution function $H_{n,\hat\vartheta_n}$. It follows that

$$\|H_{n,b}^*-H_{n,\hat\vartheta_n}\|_\infty = \sup_{t\in\mathbb{R}}\Big|\frac1b\sum_{j=1}^b\underbrace{\mathbf{1}\{U_j\le H_{n,\hat\vartheta_n}(t)\}}_{\iff\ T_{n,j}^*\le t} - H_{n,\hat\vartheta_n}(t)\Big| \le \sup_{0\le u\le1}\Big|\frac1b\sum_{j=1}^b\mathbf{1}\{U_j\le u\} - u\Big| \to 0 \quad P_\vartheta\text{-a.s. as } b\to\infty.$$

(12.13) (see second memo) and the triangle inequality $\Longrightarrow$ a).


Memo: a) $\|H_{n,b}^*-H_\vartheta\|_\infty\xrightarrow{P_\vartheta}0$; b) $c_{n,b}^*(\alpha)\xrightarrow{P_\vartheta}H_\vartheta^{-1}(1-\alpha)$; c) $\lim_{n,b\to\infty}P_\vartheta(T_n>c_{n,b}^*(\alpha)) = \alpha$; $c_{n,b}^*(\alpha) := H_{n,b}^{*-1}(1-\alpha)$.

b) follows from a), together with the continuity and strict monotonicity of $H_\vartheta$.

c) follows from b), q.e.d.


13 Probability Measures on Metric Spaces

13.1 Motivation

Let $X_1, X_2,\dots$ be i.i.d. $\sim\mathrm{U}(0,1)$. Let

$$F_n(t) := \frac1n\sum_{j=1}^n\mathbf{1}\{X_j\le t\}, \qquad 0\le t\le1 \quad \text{(EDF of } X_1,\dots,X_n\text{)}.$$

$$\lim_{n\to\infty}\,\sup_{t\in[0,1]}\big|F_n(t)-F(t)\big| = 0 \quad P\text{-a.s.} \qquad \text{(Glivenko–Cantelli)}$$

Let $B_n(t) := \sqrt n\big(F_n(t)-F(t)\big)$, $0\le t\le1$ (uniform empirical process).

[Figure: realization of a uniform empirical process ($n = 25$).]


Memo: $B_n(t) = \sqrt n\big(F_n(t)-F(t)\big)$, $0\le t\le1$.

For any $k\ge1$ and any choice of $0\le t_1<t_2<\dots<t_k\le1$, we have:

$$(B_n(t_1),\dots,B_n(t_k)) \xrightarrow{\mathcal{D}} N_k\big(0, (t_i\wedge t_j - t_it_j)_{1\le i,j\le k}\big).$$

Question: Does $B_n(\cdot)$ converge in distribution as a random function in a suitable space of functions?

Answer: Yes! The "limit object" is called the Brownian bridge $(=: B(\cdot))$.

[Figure: 3 realizations of an (approximate) Brownian bridge.]


Memo: $B_n(t) = \sqrt n\big(F_n(t)-F(t)\big)$, $0\le t\le1$.

$B_n(\cdot)\xrightarrow{\mathcal{D}}B(\cdot)$ as $n\to\infty$ (in a sense to be defined).

Question: Do we have

$$T(B_n)\xrightarrow{\mathcal{D}}T(B)$$

for 'nice' real-valued functionals $T$ (defined on a suitable space of functions)?

Examples:

$$T_1(B_n) := \|B_n\|_\infty = \sqrt n\,\sup_{0\le t\le1}\big|F_n(t)-t\big|, \qquad T_2(B_n) := \int_0^1 B_n^2(t)\,dt = n\int_0^1\big(F_n(t)-t\big)^2dt.$$

$T_1(B_n)$ is called the Kolmogorov–Smirnov statistic, $T_2(B_n)$ the Cramér–von Mises statistic for testing the hypothesis of a uniform distribution on $[0,1]$.

Answer: Yes!
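Both statistics are easy to compute from the order statistics of a sample; a sketch using the standard computational formulas (for the Cramér–von Mises statistic, $T_2 = \frac{1}{12n}+\sum_{i=1}^n\big(x_{(i)}-\frac{2i-1}{2n}\big)^2$):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 25
x = np.sort(rng.uniform(size=n))
i = np.arange(1, n + 1)

# T1 = sqrt(n) sup_t |F_n(t) - t|   (Kolmogorov-Smirnov)
T1 = np.sqrt(n) * max(np.max(i / n - x), np.max(x - (i - 1) / n))

# T2 = n * integral of (F_n(t) - t)^2 dt   (Cramer-von Mises)
T2 = 1 / (12 * n) + np.sum((x - (2 * i - 1) / (2 * n)) ** 2)

print(T1, T2)
```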


13.2 Notation, basic concepts

Let $(S,\rho)$ be a metric space.

$B(x,\varepsilon) := B_\rho(x,\varepsilon) := \{y\in S : \rho(x,y)<\varepsilon\}$, $x\in S$, $\varepsilon>0$,

$O\subset S$ open $:\iff \forall\, x\in O\ \exists\,\varepsilon>0 : B(x,\varepsilon)\subset O$,

$\mathcal{O} := \{O\subset S : O \text{ open}\}$,

$A\subset S$ closed $:\iff A^c := S\setminus A$ open,

$\mathcal{A} := \{A\subset S : A \text{ closed}\}$,

$\mathcal{B} := \sigma(\mathcal{O})$ ($\sigma$-field of Borel sets),

$M^\circ := \bigcup\{O\in\mathcal{O} : O\subset M\}$ (interior of $M\subset S$),

$\overline M := \bigcap\{A\in\mathcal{A} : A\supset M\}$ (closure of $M$),

$\partial M := \overline M\setminus M^\circ$ (boundary of $M$).


For $x\in S$, $M\subset S$: $\rho(x,M) := \inf\{\rho(x,y) : y\in M\}$ (distance of $x$ to $M$),

$$|\rho(x,M)-\rho(z,M)|\le\rho(x,z), \qquad x,z\in S,\ M\subset S, \qquad (13.1)$$

$M^\varepsilon := \{x\in S : \rho(x,M)<\varepsilon\}$, $\varepsilon>0$ (parallel set of $M$ at distance $\varepsilon$),

$x_n\to x :\iff \rho(x_n,x)\to0$ (convergence in $S$),

$(x_n)$ Cauchy sequence $:\iff \lim_{m,n\to\infty}\rho(x_n,x_m) = 0$,

$(S,\rho)$ complete $:\iff$ each Cauchy sequence has a limit in $S$,

$M$ dense (in $S$) $:\iff \overline M = S$,

$(S,\rho)$ separable $:\iff \exists\, M\subset S$: $M$ countable and dense.


$\mathcal{O}_0\subset\mathcal{O}$ base of $\mathcal{O}$ $:\iff$ each $O\in\mathcal{O}$ is a union of sets of $\mathcal{O}_0$,

$\mathcal{U}\subset\mathcal{O}$ open cover of $M\subset S$ $:\iff M\subset\bigcup\{O : O\in\mathcal{U}\}$,

$M\subset S$ compact $:\iff$ each open cover of $M$ has a finite subcover,

$N\subset S$ $\varepsilon$-net for $M$ $:\iff \forall\, x\in M\ \exists\, y\in N : \rho(x,y)<\varepsilon$,

$M\subset S$ totally bounded $:\iff \forall\,\varepsilon>0$: $M$ has a finite $\varepsilon$-net.

13.3 Theorem (Separability, countable base, countable subcover)

The following assertions are equivalent:

a) $(S,\rho)$ is separable,

b) $S$ (i.e., $\mathcal{O}$) has a countable base,

c) each open cover of each subset of $S$ has a countable subcover.


13.4 Theorem (Relative compactness)

For a set $M\subset S$, the following assertions are equivalent:

a) $\overline M$ is compact ($:\iff M$ is relatively compact),

b) each sequence in $M$ has a convergent subsequence (limit possibly $\notin M$),

c) $M$ is totally bounded and $\overline M$ is complete.

$M\subset S$ nowhere dense $:\iff (\overline M)^\circ = \emptyset$.

13.5 Theorem (Baire)

Let $(S,\rho)$ be a complete metric space. If $S = \bigcup_{n=1}^\infty A_n$, then $(\overline{A_n})^\circ\ne\emptyset$ for at least one $n$.

I.e., $S$ cannot be a countable union of nowhere dense sets.


Let

$$C_b := \{f : S\to\mathbb{R} : f \text{ bounded and continuous}\}, \qquad C_{b0} := \{f\in C_b : f \text{ uniformly continuous}\}.$$

$f$ uniformly continuous $:\iff \forall\,\varepsilon>0\ \exists\,\delta>0\ \forall\, x,y\in S$: $\rho(x,y)<\delta \Longrightarrow |f(x)-f(y)|<\varepsilon$.

Let

$$\mathcal{P} := \mathcal{P}(\mathcal{B}) := \{P : \mathcal{B}\to[0,1] \mid P \text{ probability measure}\}.$$

13.6 Definition and Theorem (Separating class)

$\mathcal{M}\subset\mathcal{B}$ is a separating class for $\mathcal{P}$ if:

$$\forall\, P,Q\in\mathcal{P}: \text{ if } P(A) = Q(A)\ \forall\, A\in\mathcal{M}, \text{ then } P = Q.$$

The systems $\mathcal{O}$ and $\mathcal{A}$ of open resp. closed sets are separating classes.

Proof: Uniqueness theorem for measures.


13.7 Theorem (The integrals $\int f\,dP$, $f\in C_{b0}$, determine $P$)

Let $P,Q\in\mathcal{P}$. Then

$$P = Q \iff \int f\,dP = \int f\,dQ \quad \forall\, f\in C_{b0}.$$

Proof of "$\Longleftarrow$": Let $A\in\mathcal{A}$, $\varepsilon>0$, and

$$f_\varepsilon(x) := \max\Big(0,\ 1-\frac{\rho(x,A)}{\varepsilon}\Big), \qquad x\in S.$$

Then $0\le f_\varepsilon\le1$ and

$$|f_\varepsilon(x)-f_\varepsilon(y)|\le\frac{\rho(x,y)}{\varepsilon} \qquad \text{(use (13.1))}$$

$\Longrightarrow f_\varepsilon\in C_{b0}$. Furthermore, $\mathbf{1}_A\le f_\varepsilon\le\mathbf{1}_{A^\varepsilon}$. It follows that

$$P(A) = \int\mathbf{1}_A\,dP \le \int f_\varepsilon\,dP = \int f_\varepsilon\,dQ \le \int\mathbf{1}_{A^\varepsilon}\,dQ = Q(A^\varepsilon).$$

Letting $\varepsilon\downarrow0$ and using that $A$ is closed gives $A^\varepsilon\downarrow A$, whence $P(A)\le Q(A)$.

In the same way, $Q(A)\le P(A)$. √


13.8 Example (The space $C[0,1]$)

Let

$$S := C := C[0,1] := \{x : [0,1]\to\mathbb{R} : x \text{ continuous}\},$$

$$\|x\| := \sup_{0\le t\le1}|x(t)|,\ x\in C, \qquad \rho(x,y) := \|x-y\| = \max_{0\le t\le1}|x(t)-y(t)|.$$

$\rho(x_n,x)\to0 \iff$ uniform convergence of $x_n$ to $x$.

$(C,\rho)$ is separable, since the set

$$\Big\{[0,1]\ni t\mapsto\sum_{k=0}^n a_kt^k \ \Big|\ n\in\mathbb{N}_0,\ a_0,\dots,a_n\in\mathbb{Q}\Big\}$$

of polynomials with coefficients in $\mathbb{Q}$ is countable and dense in $C$

(Weierstraß' approximation theorem).


$(C,\rho)$ is complete:

Let $(x_n)$ be a Cauchy sequence in $C$. Then $\varepsilon_n := \sup_{m\ge n}\|x_n-x_m\|\to0$ as $n\to\infty$.

Thus, for fixed $t\in[0,1]$, $(x_n(t))$ is a Cauchy sequence in $\mathbb{R}$. Since $\mathbb{R}$ is complete,

$$x(t) := \lim_{n\to\infty}x_n(t)$$

exists. Notice that $|x_n(t)-x_m(t)|\le\varepsilon_n$ if $m\ge n$. Letting $m\to\infty$ gives

$$|x_n(t)-x(t)|\le\varepsilon_n \ \Longrightarrow\ \lim_{n\to\infty}\|x_n-x\| = 0.$$

It follows that $x\in C$. (why?)


Let $z_n(t) := nt\,\mathbf{1}_{[0,1/n]}(t) + (2-nt)\,\mathbf{1}_{(1/n,2/n]}(t)$, $0\le t\le1$.

[Figure: the tent function $z_n$, rising from 0 at $t=0$ to 1 at $t=1/n$ and back to 0 at $t=2/n$.]

Let $\varepsilon>0$. The sequence $(\varepsilon z_n)_{n\ge1}$ has no convergent subsequence, since $\rho(\varepsilon z_{n_k},z)\to0$ implies $z\equiv0$ (why?), but $\rho(\varepsilon z_{n_k},0) = \varepsilon$ for each $k$.

Consequence: $\overline{B(0,\varepsilon)}$ is not compact

$\Longrightarrow$ no closed ball $\overline{B(x,\varepsilon)}$ is compact (consider $x+\varepsilon z_n$)

$\Longrightarrow$ each compact set is nowhere dense

$\Longrightarrow$ $C$ is not $\sigma$-compact (a countable union of compact sets), by Baire's theorem 13.5.


For $k\in\mathbb{N}$ and $0\le t_1<\dots<t_k\le1$, let

$$\pi_{t_1,\dots,t_k} : C\to\mathbb{R}^k, \qquad x\mapsto\pi_{t_1,\dots,t_k}(x) := (x(t_1),\dots,x(t_k)).$$

The mappings $\pi_{t_1,\dots,t_k}$, $k\in\mathbb{N}$, $t_1,\dots,t_k\in[0,1]$, are called natural projections. Let

$$C_f := \big\{\pi_{t_1,\dots,t_k}^{-1}(H) \mid k\in\mathbb{N},\ 0\le t_1<\dots<t_k\le1,\ H\in\mathcal{B}^k\big\}$$

be the system of finite-dimensional sets.

We have $C_f\subset\mathcal{B}$. (why?)

Claim: $C_f$ is a field (algebra).

To show: (i) $S\in C_f$, (ii) $A\in C_f\Rightarrow A^c\in C_f$, (iii) $A,B\in C_f\Rightarrow A\cap B\in C_f$.

(i): $S = \pi_1^{-1}(\mathbb{R})\in C_f$. √

(ii): $C_f\ni A = \pi_{t_1,\dots,t_k}^{-1}(H)$, $H\in\mathcal{B}^k \Longrightarrow A^c = \pi_{t_1,\dots,t_k}^{-1}(\mathbb{R}^k\setminus H)\in C_f$. √

(iii): $C_f$ is a $\pi$-system (a bit tricky!).


(iii): For nonempty, finite $N\subset[0,1]$ and $n := |N|$, let

$$\pi_N : C\to\mathbb{R}^n, \qquad x\mapsto\pi_N(x) := (x(u_1),\dots,x(u_n)),$$

where $N = \{u_1,\dots,u_n\}$, $u_1<\dots<u_n$.

If $\emptyset\ne I\subset\{1,\dots,n\}$, $I = \{i_1,\dots,i_\ell\}$, $i_1<\dots<i_\ell$, then

$$\pi_I^n : \mathbb{R}^n\to\mathbb{R}^\ell, \qquad (v_1,\dots,v_n)\mapsto\pi_I^n(v_1,\dots,v_n) := (v_{i_1},\dots,v_{i_\ell})$$

$$\Longrightarrow\ \pi_I^n\circ\pi_N : C\to\mathbb{R}^\ell, \qquad x\mapsto\pi_I^n(x(u_1),\dots,x(u_n)) = (x(u_{i_1}),\dots,x(u_{i_\ell})).$$

Now, let $A := \pi_{t_1,\dots,t_k}^{-1}(K)$, $B := \pi_{s_1,\dots,s_\ell}^{-1}(L)\in C_f$ ($K\in\mathcal{B}^k$, $L\in\mathcal{B}^\ell$).

Put $M := \{t_1,\dots,t_k\}$, $N := \{s_1,\dots,s_\ell\}$, and let $n := |M\cup N|$, say $M\cup N = \{u_1,\dots,u_n\}$ with $u_1<\dots<u_n$. Choose $I\subset\{1,\dots,n\}$ with $M = \{u_i : i\in I\}$ and $J\subset\{1,\dots,n\}$ with $N = \{u_j : j\in J\}$. Then

$$\pi_{t_1,\dots,t_k} = \pi_I^n\circ\pi_{M\cup N}, \qquad \pi_{s_1,\dots,s_\ell} = \pi_J^n\circ\pi_{M\cup N}.$$


Memo: $A = \pi_{t_1,\dots,t_k}^{-1}(K)$, $B = \pi_{s_1,\dots,s_\ell}^{-1}(L)\in C_f$ ($K\in\mathcal{B}^k$, $L\in\mathcal{B}^\ell$).

Memo: $\pi_{t_1,\dots,t_k} = \pi_I^n\circ\pi_{M\cup N}$, $\pi_{s_1,\dots,s_\ell} = \pi_J^n\circ\pi_{M\cup N}$.

It follows that

$$A\cap B = \pi_{t_1,\dots,t_k}^{-1}(K)\cap\pi_{s_1,\dots,s_\ell}^{-1}(L) = (\pi_I^n\circ\pi_{M\cup N})^{-1}(K)\cap(\pi_J^n\circ\pi_{M\cup N})^{-1}(L)$$

$$= \pi_{M\cup N}^{-1}\big((\pi_I^n)^{-1}(K)\big)\cap\pi_{M\cup N}^{-1}\big((\pi_J^n)^{-1}(L)\big) = \pi_{M\cup N}^{-1}\Big(\underbrace{(\pi_I^n)^{-1}(K)\cap(\pi_J^n)^{-1}(L)}_{\in\,\mathcal{B}^n}\Big)\in C_f. \ \checkmark$$


Memo: $C_f = \{\pi_{t_1,\dots,t_k}^{-1}(H)\mid k\in\mathbb{N},\ 0\le t_1<\dots<t_k\le1,\ H\in\mathcal{B}^k\}$; $C_f\subset\mathcal{B}$; $C_f$ is a field.

Furthermore,

$$\overline{B(x,\varepsilon)} = \bigcap_{r\in\mathbb{Q}\cap[0,1]}\{y\in C : |y(r)-x(r)|\le\varepsilon\} = \bigcap_{r\in\mathbb{Q}\cap[0,1]}\pi_r^{-1}\big([x(r)-\varepsilon, x(r)+\varepsilon]\big)\in\sigma(C_f)$$

$$\Longrightarrow\ B(x,\varepsilon) = \bigcup_{n=1}^\infty\overline{B(x,\varepsilon-1/n)}\in\sigma(C_f).$$

If $M := \{x_1,x_2,\dots\}$ is dense in $C$, then

$$\sigma\big(\{B(x_j,\varepsilon) : x_j\in M,\ \varepsilon\in\mathbb{Q}_{>0}\}\big) = \mathcal{B}.$$

Therefore, $\mathcal{B} = \sigma(C_f)$, and $C_f$ is a separating class.


13.9 Example ($S = \mathbb{R}^\infty$)

Let $S := \mathbb{R}^\infty := \{x = (x_n)_{n\ge1} : x_n\in\mathbb{R}\ \forall\, n\ge1\}$,

$$\rho(x,y) := \sum_{k=1}^\infty\frac{\min(1,|x_k-y_k|)}{2^k}.$$

$(S,\rho)$ is a separable and complete metric space. For $x^n = (x_j^n)_{j\ge1}$,

$$\rho(x^n,x)\to0 \iff x_j^n\to x_j\ \forall\, j\ge1.$$

For $k\in\mathbb{N}$, let

$$\pi_k : S\to\mathbb{R}^k, \qquad x\mapsto\pi_k(x) := (x_1,\dots,x_k).$$

For $x\in S$, $\varepsilon>0$, $k\in\mathbb{N}$, let

$$N_{k,\varepsilon}(x) := \pi_k^{-1}\Big(\bigtimes_{j=1}^k(x_j-\varepsilon, x_j+\varepsilon)\Big) = \{y\in S : |y_j-x_j|<\varepsilon,\ j = 1,\dots,k\}\in\mathcal{O}.$$


The class

$$\mathcal{N} := \{N_{k,\varepsilon}(x) : x\in S,\ \varepsilon>0,\ k\in\mathbb{N}\}$$

is a base of $\mathcal{O}$. Let

$$\mathbb{R}^\infty_f := \{\pi_k^{-1}(H) : k\in\mathbb{N},\ H\in\mathcal{B}^k\}.$$

$\mathbb{R}^\infty_f$ is a field satisfying $\mathcal{B} = \sigma(\mathbb{R}^\infty_f) = \sigma(\mathcal{N})$.

$\mathbb{R}^\infty_f$ is a separating class.

$\mathbb{R}^\infty$ is not $\sigma$-compact (in contrast to $\mathbb{R}^d$).

We have

$$\overline A \text{ compact} \iff \forall\, k\in\mathbb{N}: \{x_k : x = (x_j)_{j\ge1}\in A\} \text{ is a bounded set in } \mathbb{R}.$$

(Exercise!)


14 Weak convergence in metric spaces

14.1 Definition (Weak convergence)

Let $(S,\rho)$ be a metric space, $P, P_1, P_2,\dots\in\mathcal{P}$.

$$P_n\xrightarrow{\mathcal{D}}P \text{ as } n\to\infty \ :\iff\ \lim_{n\to\infty}\int f\,dP_n = \int f\,dP \quad \forall\, f\in C_b.$$

Wording: $P_n$ converges weakly to $P$.

Notice that $P_n\xrightarrow{\mathcal{D}}P$ and $P_n\xrightarrow{\mathcal{D}}Q$ implies $P = Q$. (why?)

14.2 Example Let $\delta_z$ be the Dirac measure in $z\in S$, and let $x_0, x_1, x_2,\dots\in S$. We then have

$$\delta_{x_n}\xrightarrow{\mathcal{D}}\delta_{x_0} \iff x_n\to x_0.$$

Proof: "$\Longleftarrow$": Let $x_n\to x_0$. If $f\in C_b$ then

$$\int f\,d\delta_{x_n} = f(x_n)\to f(x_0) = \int f\,d\delta_{x_0}.$$


Memo: $\delta_{x_n}\xrightarrow{\mathcal{D}}\delta_{x_0} \iff x_n\to x_0$

"$\Longrightarrow$": Suppose $x_n\not\to x_0$. Then there is an $\varepsilon>0$ with $\rho(x_n,x_0)>\varepsilon$ for infinitely many $n$. Let

$$f_\varepsilon(x) := \max\Big(0,\ 1-\frac{\rho(x,x_0)}{\varepsilon}\Big), \qquad x\in S$$

(cf. proof of Thm. 13.7, putting $A = \{x_0\}$). Then

$f_\varepsilon\in C_b$, $\quad f_\varepsilon(x_0) = 1$, $\quad f_\varepsilon(x_n) = 0$ for infinitely many $n$.

$$\Longrightarrow\ \int f_\varepsilon\,d\delta_{x_n} = f_\varepsilon(x_n)\not\to f_\varepsilon(x_0) = \int f_\varepsilon\,d\delta_{x_0} \ \Longrightarrow\ \delta_{x_n}\stackrel{\mathcal{D}}{\not\longrightarrow}\delta_{x_0}.$$


Memo: $P_n\xrightarrow{\mathcal{D}}P :\iff \int f\,dP_n\to\int f\,dP\ \forall\, f\in C_b$

14.3 Theorem (Portmanteau)

The following assertions are equivalent:

a) $P_n\xrightarrow{\mathcal{D}}P$,

b) $\int f\,dP_n\to\int f\,dP\ \forall\, f\in C_{b0}$,

c) $\limsup_{n\to\infty}P_n(A)\le P(A)\ \forall\, A\in\mathcal{A}$,

d) $\liminf_{n\to\infty}P_n(O)\ge P(O)\ \forall\, O\in\mathcal{O}$,

e) $\lim_{n\to\infty}P_n(B) = P(B)\ \forall\, B\in\mathcal{C}(P) := \{C\in\mathcal{B} : P(\partial C) = 0\}$.

A set $B\in\mathcal{B}$ with the property $P(\partial B) = 0$ is called a $P$-continuity set.

Proof: (largely follows the proof in the case $S = \mathbb{R}^d$, cf. 6.4).

b) $\Longrightarrow$ c): Use $\mathbf{1}_A\le f_\varepsilon\le\mathbf{1}_{A^\varepsilon}$, where $f_\varepsilon$ was defined in the proof of 13.7.

e) $\Longrightarrow$ a): Let $f\in C_b$. If $|f|<L$ then $0<\big(\frac fL+1\big)\cdot\frac12<1$. By linearity of the integral, w.l.o.g. $0<f<1$.


Memo: To show: $\int f\,dP_n\to\int f\,dP$.

$$\int f\,dP_n = \int_0^1 t\,P_n^f(dt) \quad \text{(transformation formula)}$$

$$= \int_0^1\Big(\int_0^t 1\,\lambda^1(du)\Big)P_n^f(dt) = \int_0^1\Big(\int_u^1 P_n^f(dt)\Big)\lambda^1(du) \quad \text{(Tonelli's theorem)}$$

$$= \int_0^1 P_n(\{f>u\})\,du. \qquad (\{f>u\} = \{x\in S : f(x)>u\})$$

Likewise,

$$\int f\,dP = \int_0^1 P(\{f>u\})\,du.$$

$f$ continuous $\Longrightarrow \partial\{f>u\}\subset\{f=u\}$ (!) $\Longrightarrow P(\partial\{f>u\})\le P(\{f=u\})$.

$P(\{f=u\}) = 0$ with at most countably many exceptions. (why?)

e) $\Longrightarrow P_n(\{f>u\})\to P(\{f>u\})$ $\lambda^1$-almost everywhere.

Dominated convergence $\Longrightarrow$ a).


14.4 Theorem (Criterion for weak convergence I)

Let $P\in\mathcal{P}$ and let $\mathcal{M}_P\subset\mathcal{B}$ be a $\pi$-system (i.e., $\mathcal{M}_P$ is closed with respect to intersections). If each open set is a countable union of sets of $\mathcal{M}_P$, then:

$$P_n(A)\to P(A)\ \forall\, A\in\mathcal{M}_P \ \Longrightarrow\ P_n\xrightarrow{\mathcal{D}}P.$$

Proof: We show $P(O)\le\liminf_{n\to\infty}P_n(O)$ for all $O\in\mathcal{O}$; then 14.3 d) gives the assertion.

Let $A_1,\dots,A_k\in\mathcal{M}_P$. Since $\mathcal{M}_P$ is a $\pi$-system, the inclusion–exclusion formula yields

$$P_n\Big(\bigcup_{j=1}^k A_j\Big)\to P\Big(\bigcup_{j=1}^k A_j\Big). \qquad (\star)$$

Let $O\in\mathcal{O}$. By assumption, $O = \bigcup_{j=1}^\infty A_j$, where $A_1, A_2,\dots\in\mathcal{M}_P$. Fix $\varepsilon>0$ and choose $k$ such that

$$P(O)-\varepsilon\le P\Big(\bigcup_{j=1}^k A_j\Big).$$

$(\star)\Longrightarrow$

$$P(O)-\varepsilon\le\lim_{n\to\infty}P_n\Big(\bigcup_{j=1}^k A_j\Big)\le\liminf_{n\to\infty}P_n(O).$$

$\varepsilon\downarrow0\Longrightarrow$ assertion.


14.5 Theorem (Criterion for weak convergence II)

Let $(S,\rho)$ be separable, $P\in\mathcal{P}$, and let $\mathcal{M}_P\subset\mathcal{B}$ be a $\pi$-system. Suppose further that

$$\forall\, x\in S\ \forall\,\varepsilon>0\ \exists\, A\in\mathcal{M}_P:\ x\in A^\circ\subset A\subset B(x,\varepsilon). \qquad (\star)$$

If $P_n(A)\to P(A)$ for each $A\in\mathcal{M}_P$, then $P_n\xrightarrow{\mathcal{D}}P$.

Proof: Let $\emptyset\ne O\in\mathcal{O}$. The assumption gives

$$\forall\, x\in O\ \exists\, A_x\in\mathcal{M}_P:\ x\in A_x^\circ\subset A_x\subset O \ \Longrightarrow\ O = \bigcup_{x\in O}A_x^\circ \quad \text{(open cover!)}$$

Since $(S,\rho)$ is separable, Thm. 13.3 provides a countable subcover:

$$O = \bigcup_{j=1}^\infty A_{x_j}^\circ = \bigcup_{j=1}^\infty A_{x_j}.$$

14.4 $\Longrightarrow$ assertion.


Memo: $\mathcal{C}(P) = \{B\in\mathcal{B} : P(\partial B) = 0\}$.

Memo: $P_n\xrightarrow{\mathcal{D}}P \iff P_n(B)\to P(B)\ \forall\, B\in\mathcal{C}(P)$.

14.6 Definition (Convergence-determining class)

A system $\mathcal{M}\subset\mathcal{B}$ is called a convergence-determining class (CDC) $:\iff$

$$\forall\, P, P_1, P_2,\dots\in\mathcal{P}:\ P_n(A)\to P(A)\ \forall\, A\in\mathcal{M}\cap\mathcal{C}(P) \ \Longrightarrow\ P_n\xrightarrow{\mathcal{D}}P.$$

For $\mathcal{M}\subset\mathcal{B}$, $x\in S$, $\varepsilon>0$, put

$$\mathcal{M}_{x,\varepsilon} := \{A\in\mathcal{M} : x\in A^\circ\subset A\subset B(x,\varepsilon)\}, \qquad \partial\mathcal{M}_{x,\varepsilon} := \{\partial A : A\in\mathcal{M}_{x,\varepsilon}\}.$$


Memo: $\mathcal{M}$ CDC $:\iff \forall\, P,(P_n):\ P_n(A)\to P(A)\ \forall\, A\in\mathcal{M}\cap\mathcal{C}(P) \Rightarrow P_n\xrightarrow{\mathcal{D}}P$.

Memo: $\mathcal{M}_{x,\varepsilon} = \{A\in\mathcal{M} : x\in A^\circ\subset A\subset B(x,\varepsilon)\}$.

Memo: $(\star)$ $\forall\, x\in S\ \forall\,\varepsilon>0\ \exists\, A\in\mathcal{M}_P : x\in A^\circ\subset A\subset B(x,\varepsilon)$.

14.7 Theorem (Sufficient condition for a CDC)

Let $(S,\rho)$ be separable and let $\mathcal{M}\subset\mathcal{B}$ be a $\pi$-system satisfying

(i) $\forall\, x\in S\ \forall\,\varepsilon>0$: $\mathcal{M}_{x,\varepsilon}\ne\emptyset$,

(ii) $\forall\, x\in S\ \forall\,\varepsilon>0$: $\partial\mathcal{M}_{x,\varepsilon}$ contains $\emptyset$ or uncountably many disjoint sets.

Then $\mathcal{M}$ is a CDC.

Proof: Fix $P\in\mathcal{P}$ and put $\mathcal{M}_P := \mathcal{M}\cap\mathcal{C}(P)$. Since

$$\partial(A\cap B)\subset\partial A\cup\partial B, \quad (!)$$

$\mathcal{M}_P$ is a $\pi$-system. (i) and (ii) $\Longrightarrow \mathcal{M}_P$ satisfies condition $(\star)$ in 14.5.

Thus, $P_n(A)\to P(A)\ \forall\, A\in\mathcal{M}_P$ implies $P_n\xrightarrow{\mathcal{D}}P$, q.e.d.


Memo (14.7): $(S,\rho)$ separable, $\mathcal{M}$ a $\pi$-system; $\mathcal{M}$ is a CDC if (i) $\forall x\,\forall\varepsilon$: $\mathcal{M}_{x,\varepsilon}\ne\emptyset$ and (ii) $\forall x\,\forall\varepsilon$: $\partial\mathcal{M}_{x,\varepsilon}$ contains $\emptyset$ or uncountably many disjoint sets; $\mathcal{M}_{x,\varepsilon} = \{A\in\mathcal{M} : x\in A^\circ\subset A\subset B(x,\varepsilon)\}$.

14.8 Examples

a) Let $\mathcal{M}$ be the class of finite intersections of open balls. Since

$$\partial B(x,r)\subset\{y : \rho(x,y) = r\}, \quad (!)$$

$\mathcal{M}$ is a CDC by 14.7.

b) $S = \mathbb{R}^d$: $\mathcal{M} := \{(-\infty,x] : x\in\mathbb{R}^d\}$ is a CDC (cf. 6.4 e)).

c) $S = \mathbb{R}^\infty$: $\mathbb{R}^\infty_f = \{\pi_k^{-1}(H) : k\in\mathbb{N},\ H\in\mathcal{B}^k\}$, $\pi_k(x) := (x_1,\dots,x_k)$.

$\mathbb{R}^\infty_f$ is a separating class, cf. 13.9. $\mathbb{R}^\infty_f$ is also a CDC (Exercise!).

d) $S = C = C[0,1]$. By 13.8, a separating class is

$$C_f = \{\pi_{t_1,\dots,t_k}^{-1}(H) : k\in\mathbb{N},\ 0\le t_1<\dots<t_k\le1,\ H\in\mathcal{B}^k\}.$$


Claim: $C_f$ is not a CDC. Let $z_n\in C$ be the tent functions from Example 13.8.

Let $P_n := \delta_{z_n}$, $P := \delta_0$. We have $P_n\stackrel{\mathcal{D}}{\not\longrightarrow}P$, since $z_n\not\to0$ (cf. 14.2).

But: Let $k\in\mathbb{N}$ and $0\le t_1<\dots<t_k\le1$ be fixed. We have

$$\pi_{t_1,\dots,t_k}(z_n) = (0,0,\dots,0) = \pi_{t_1,\dots,t_k}(0) \quad \text{if } \frac2n < \begin{cases} t_1, & \text{if } t_1>0,\\ t_2, & \text{if } 0 = t_1<t_2. \end{cases}$$

I.e., $P_n(A)\to P(A)$ for each $A\in C_f$. (why?) Thus, $C_f$ is not a CDC.


14.9 Theorem (Subsequence criterion)

Let $P, P_1, P_2,\dots\in\mathcal{P}$. Then

$$P_n\xrightarrow{\mathcal{D}}P \iff \text{each subsequence } (P_{n_k}) \text{ contains a further subsequence } (P_{n_k'}) \text{ such that } P_{n_k'}\xrightarrow{\mathcal{D}}P.$$

Proof: "$\Longleftarrow$": Suppose $P_n\stackrel{\mathcal{D}}{\not\longrightarrow}P$. Then there exist $f\in C_b$, $\varepsilon>0$ and a subsequence $(P_{n_k})$ such that

$$\Big|\int f\,dP_{n_k} - \int f\,dP\Big| > \varepsilon \quad \text{for all } k.$$

No subsequence of $(P_{n_k})$ can then converge weakly to $P$, q.e.d.


Let $(S,\rho)$, $(S',\rho')$ be metric spaces and let $h : S\to S'$ be $(\mathcal{B},\mathcal{B}')$-measurable.

If $P\in\mathcal{P}$, then

$$P^h := P\circ h^{-1} =: Ph^{-1}, \qquad P^h(B') := P(h^{-1}(B')), \quad B'\in\mathcal{B}',$$

is a probability measure on $\mathcal{B}'$. Do we have $P_n\xrightarrow{\mathcal{D}}P \Longrightarrow P_n^h\xrightarrow{\mathcal{D}}P^h$?

14.10 Theorem (Continuous mapping theorem, CMT)

Let $C(h)$ be the set of points of continuity of $h$. We then have:

If $P_n\xrightarrow{\mathcal{D}}P$ and $P(C(h)) = 1$, then $P_n^h\xrightarrow{\mathcal{D}}P^h$.

Proof: Let $A'\in\mathcal{A}'$. We have (!, cf. proof of 6.6)

$$\overline{h^{-1}(A')}\subset(S\setminus C(h))\cup h^{-1}(A') \ \Longrightarrow$$

$$\limsup_{n\to\infty}P_n\big(h^{-1}(A')\big)\le\limsup_{n\to\infty}P_n\big(\overline{h^{-1}(A')}\big)\le P\big(\overline{h^{-1}(A')}\big)\le P(S\setminus C(h)) + P\big(h^{-1}(A')\big) = 0 + P\big(h^{-1}(A')\big).$$

14.3 c) $\Longrightarrow$ assertion.


Memo: $\mathbb{R}^\infty_f = \{\pi_k^{-1}(H) : k\in\mathbb{N},\ H\in\mathcal{B}^k\}$, $\pi_k(x) = (x_1,\dots,x_k)$

14.11 Example

Let $S = \mathbb{R}^\infty$. Then

$$P_n\xrightarrow{\mathcal{D}}P \iff \forall\, k\in\mathbb{N}:\ P_n\pi_k^{-1}\xrightarrow{\mathcal{D}}P\pi_k^{-1}.$$

Proof: "$\Longrightarrow$" follows from the CMT, since $\pi_k$ is continuous.

"$\Longleftarrow$": If $H\in\mathcal{B}^k$ then (!) $\partial\pi_k^{-1}(H) = \pi_k^{-1}(\partial H)$.

Suppose $A := \pi_k^{-1}(H)\in\mathcal{C}(P)$ $\Longrightarrow$

$$P\big(\pi_k^{-1}(\partial H)\big) = P\big(\partial\pi_k^{-1}(H)\big) = P(\partial A) = 0$$

$\Longrightarrow H\in\mathcal{C}(P\pi_k^{-1})$. The assumption gives

$$P_n(A)\to P(A) \quad \forall\, A\in\mathbb{R}^\infty_f\cap\mathcal{C}(P).$$

14.8 c) $\Longrightarrow$ assertion. ($\mathbb{R}^\infty_f$ is a CDC!)


14.12 Example Let $S = C[0,1]$. Then

$$P_n\xrightarrow{\mathcal{D}}P \ \Longrightarrow\ \forall\, k\ge1,\ \forall\, t_1,\dots,t_k:\ P_n\pi_{t_1,\dots,t_k}^{-1}\xrightarrow{\mathcal{D}}P\pi_{t_1,\dots,t_k}^{-1}.$$

Warning! The converse "$\Longleftarrow$" does not hold, cf. Example 14.8 d).


15 Convergence in distribution

15.1 Random elements

Let $(\Omega,\mathcal{A},\mathbb{P})$ be a probability space, $(S,\rho)$ a metric space, $\mathcal{B} := \sigma(\mathcal{O})$.

An $(\mathcal{A},\mathcal{B})$-measurable mapping $X : \Omega\to S$ is called a random element of $S$.

Manner of speaking:

$S = \mathbb{R}$: random variable,

$S = \mathbb{R}^d$: random vector,

$S = \mathbb{R}^\infty$: random sequence,

$S = C[0,1]$: random function.

The distribution of $X$ is the probability measure $\mathbb{P}^X = \mathbb{P}\circ X^{-1} = \mathbb{P}X^{-1}$.

Canonical construction, given a probability measure $P$ on $S$:

$$(\Omega,\mathcal{A}) := (S,\mathcal{B}), \qquad X := \mathrm{id}_\Omega, \qquad \mathbb{P} := P.$$


15.2 Notations for random functions

Let $(\Omega,\mathcal{A},\mathbb{P})$ be a probability space and $X : \Omega\to C = C[0,1]$ a random function.

For fixed $\omega\in\Omega$, $X(\omega)$ is a continuous function on $[0,1]$:

$$X(\omega)(t) =: X_t(\omega) =: X(t,\omega), \qquad 0\le t\le1.$$

$$X(t) := X_t : \Omega\to\mathbb{R}, \qquad \omega\mapsto X(t)(\omega) := X_t(\omega) := X(t,\omega); \qquad X(t) = \pi_t\circ X.$$

For $0\le t_1<\dots<t_k\le1$: $(X(t_1),\dots,X(t_k)) = \pi_{t_1,\dots,t_k}\circ X$.

The distributions of $(X(t_1),\dots,X(t_k))$, where $k\ge1$, $0\le t_1<\dots<t_k\le1$, are called the finite-dimensional distributions ("fidis") of $X$.

Notice that $X$ is a random function (i.e., $(\mathcal{A},\mathcal{B})$-measurable) if, and only if, $X(t)$ is a random variable (i.e., $(\mathcal{A},\mathcal{B}^1)$-measurable) for each $t\in[0,1]$ (Exercise!).

Norbert Henze, KIT 15.2

Page 222: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in Distribution

15.3 Example (The partial Sum Process)

Let Z1, Z2, . . . be i.i.d. R-valued random variables with E(Z21 ) = 1, E(Z1) = 0.

Let S0 := 0, Sk := Z1 + . . .+ Zk, k ≥ 1. For t ∈ [0, 1], let

Xn(t) :=S⌊nt⌋√n

+ (nt− ⌊nt⌋) · Z⌊nt⌋+1√n

.

The random function Xn is called n-th partial sum process of (Zn)n≥1.

0

1

2

−1

−2

0.5 1.0t

Xn(t)

Realizations of X100 (Here, P(Z1 = 1) = P(Z1 = −1) = 1/2)

Notice that Xn(1) =Sn√n

D−→ N(0, 1). (why?)

Norbert Henze, KIT 15.3

Page 223: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in Distribution

15.4 Definition (Convergence in Distribution)

Let X,X1, X2, . . . be random elements in S having distributions P = PX ,P1 = PX1 , P2 = PX2 , . . ..

XnD−→ X :⇐⇒ Pn

D−→ P

⇐⇒ Ef(Xn)→ Ef(X) ∀f ∈ Cb

Notice that only distributions matter.

Underlying probability space remains”offstage“.

A set A ∈ B is called an X-continuity set :⇐⇒ P(X ∈ ∂A) = 0.

Notice that A is an X-continuity set if, and only if, A ∈ C(PX) (= C(P )).

Norbert Henze, KIT 15.4

Page 224: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in Distribution

15.5 Theorem (Portmanteau)

The following assertions are equivalent:

a) XnD−→ X,

b) Ef(Xn)→ Ef(X) ∀f ∈ Cb0,c) lim supn→∞ P(Xn ∈ A) ≤ P(X ∈ A) ∀A ∈ A,

d) lim infn→∞ P(Xn ∈ O) ≥ P(X ∈ O) ∀O ∈ O,e) limn→∞ P(Xn ∈ B) = P(X ∈ B) for each X-continuity set B.

Remark: There are”hybrid“ notations, like Xn

D−→ P , PnD−→ X.

15.6 Theorem (Continuous mapping theorem, CMT)

Let (S′, ρ′) be a further metric space, h : S → S′ measurable. We then have:

If XnD−→ X and P(X ∈ C(h)) = 1, then h(Xn)

D−→ h(X).

Norbert Henze, KIT 15.5

Page 225: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in Distribution

15.7 Example (Random elements of C)

Let X,X1, X2, . . . be random elements of C[0, 1]. If XnD−→ X, then

∀k ≥ 1, ∀0 ≤ t1 < . . . < tk ≤ 1 : (Xn(t1), . . . , Xn(tk))D−→ (X(t1), . . . , X(tk)).

Proof: The function h = πt1,...,tk : C → Rk is continuous.

15.8 Definition (Convergence in Probability)

Let X,X1, X2, . . . be random elements of S, a ∈ S.

XnP−→ X :⇐⇒ lim

n→∞P(ρ(Xn, X) ≥ ε) = 0 ∀ε > 0,

XnP−→ a :⇐⇒ lim

n→∞P(ρ(Xn, a) ≥ ε) = 0 ∀ε > 0.

Attention! ρ(Xn, X) must be a random variable, i.e., measurable w.r.t. B ⊗ B.If (S, ρ) is separable, then (Xn, X) is a random element of S × S. (!!)

Since ρ : S × S → R is continuous, this condition holds.

Norbert Henze, KIT 15.6

Page 226: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Convergence in Distribution

15.9 Theorem (Slutzy’s Lemma)

Let (Xn, Yn), n ≥ 1, be random elements of S ×S, X a random element of S.

If XnD−→ X and ρ(Xn, Yn)

P−→ 0, then YnD−→ X.

Proof: Let A ∈ A, ε > 0, Aε := x : ρ(x,A) ≤ ε. We have

P(Yn ∈ A) = P(Yn ∈ A, ρ(Xn, Yn) ≥ ε) + P(Yn ∈ A,ρ(Xn, Yn) < ε)

≤ P(ρ(Xn, Yn) ≥ ε) + P(Xn ∈ Aε).

XnD−→ X, Aε ∈ A =⇒ lim supn→∞ P(Yn ∈ A) ≤ P(X ∈ Aε).

ε ↓ 0 =⇒ Aε ↓ A =⇒

lim supn→∞

P(Yn ∈ A) ≤ P(X ∈ A). 15.5 c) =⇒ assertion.

15.10 Corollary If XnP−→ X then Xn

D−→ X.

Proof: Put Xn := X, Yn := Xn, n ≥ 1, in 15.9.

Norbert Henze, KIT 15.7

Page 227: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Relative compactness and tightness

16 Relative compactness and tightness

Let Q ⊂ P be a nonempty set of probability measures on B.

16.1 Definition (Relative compactness, tightness)

a) Q relatively compact :⇐⇒ ∀(Pn) ∈ QN ∃ subsequence (Pnk) ∃Q ∈ P :

Pnk

D−→ Q as k →∞.

b) Q tight :⇐⇒ ∀ ε > 0 ∃K ⊂ S, K compact: Q(K) ≥ 1− ε ∀Q ∈ Q.

16.2 Remark (Relative compactness is necessary for weak convergence)

If PnD−→ P then Q := Pn : n ∈ N is relatively compact.

Proof: Use the subsequence criterion 14.9.

16.3 Remark If X1, X2, . . . are random elements in S, then Xn : n ∈ N isrelatively compact (tight) if PXn : n ∈ N is relatively compact (tight).

Norbert Henze, KIT 16.1

Page 228: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Relative compactness and tightness

16.4 Definition (Fidi convergence)

Let S = C[0, 1] and P, P1, P2, . . . ∈ P .

PnDfidi−→ P :⇐⇒ Pnπ

−1t1,...,tk

D−→ Pπ−1t1,...,tk

∀k ≥ 1, ∀ 0 ≤ t1 < . . . < tk ≤ 1

(weak convergence of all finite-dimensional distributions).

If X,X1, X2, . . . are random elements of C, then

XnDfidi−→ X :⇐⇒ (Xn(t1), . . . , Xn(tk))

D−→ (X(t1), . . . , X(tk))

∀k ≥ 1, ∀ 0 ≤ t1 < . . . < tk ≤ 1.

Warning: Fidi convergence is a necessary but not sufficient condition for

PnD−→ P or Xn

D−→ X, cf. Example 14.12 and 15.7.

Norbert Henze, KIT 16.2

Page 229: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Relative compactness and tightness

16.5 Theorem (Fidi conv. and relative compactness imply PnD−→ P in C)

Let S = C = C[0, 1] and P, P1, P2, . . . ∈ P . Suppose

PnDfidi−→ P, (16.1)

Pn : n ∈ N relatively compact. (16.2)

Then PnD−→ P .

Proof: (16.2) =⇒ each subsequence (Pni) contains a further subsequence

(Pn′

i) with Pn′

i

D−→ Q for some Q ∈ P . Let k ∈ N, 0 ≤ t1 < . . . < tk ≤ 1.

CMT =⇒ Pn′

iπ−1t1,...,tk

D−→ Qπ−1t1,...,tk

,

(16.1) =⇒ Pn′

iπ−1t1,...,tk

D−→ Pπ−1t1,...,tk

,

=⇒ P (B) = Q(B) ∀B ∈ Cf . Cf separating class =⇒ P = Q.

Subsequence criterion 14.9 =⇒ PnD−→ P , q.e.d.

Norbert Henze, KIT 16.3

Page 230: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Relative compactness and tightness

16.6 Theorem (Existence of probability measures on C[0, 1])

Let S = C[0, 1]. Let P1, P2, . . . ∈ P . Suppose

Pn : n ∈ N is relatively compact, (16.3)

∀k ≥ 1, ∀ 0 ≤ t1 < . . . < tk ≤ 1 ∃ probability measure (16.4)

µt1,...,tk on Bk such that Pnπ−1t1,...,tk

D−→ µt1,...,tk .

Then there is a P ∈ P with Pπ−1t1,...,tk

= µt1,...,tk ∀k ≥ 1, ∀t1, . . . tk.

Proof: (16.3) =⇒ ∃ subsequence (Pni) ∃P ∈ P such that Pni

D−→ P .

Fix k and t1, . . . , tk. The CMT implies

Pniπ−1t1,...,tk

D−→ Pπ−1t1,...,tk

.

From (16.4), we have

Pniπ−1t1,...,tk

D−→ µt1,...,tk

=⇒ Pπ−1t1,...,tk

= µt1,...,tk , q.e.d.

Norbert Henze, KIT 16.4

Page 231: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Relative compactness and tightness

16.7 Theorem (Prokhorov)

a) Q ⊂ P is tight =⇒ Q relatively compact,

b) “⇐=“ holds if (S, ρ) is separable and complete.

Proof of b): Let Q be relatively compact. Fix ε > 0. Let On ∈ O, n ≥ 1, withOn ↑ S. Claim: ∃n ∈ N with P (On) > 1− ε ∀P ∈ Q. Proof (by contradiction):

Suppose ∀n ∃Pn ∈ Q with Pn(On) ≤ 1− ε. Assumption =⇒ ∃ subsequence

(Pnk) ∃Q ∈ P with Pnk

D−→ Q. Portmanteau theorem =⇒

for fixed n : Q(On) ≤ lim infk→∞

Pnk(On) ≤ lim inf

k→∞Pnk

(Onk) ≤ 1− ε.

But Q(On) ↑ 1 (why?), a contradiction (=⇒ claim).

For k ≥ 1, let Bk,j , j ≥ 1, be open balls of radius 1/k that cover S(separability!)

Claim =⇒ ∃nk such that P

(nk⋃

j=1

Bk,j

)> 1− ε

2k∀P ∈ Q.

Let M := ∩k≥1 (∪j≤nkBk,j) =⇒ M totally bounded. K :=M is complete

(since S is complete). 13.4 =⇒ K compact, and P (K) > 1− ε ∀P ∈ Q.

Norbert Henze, KIT 16.5

Page 232: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Relative compactness and tightness

Proof of a) (sketch). Let (Pn) be a sequence in Q. To show: ∃ subsequence

(Pnk) ∃P ∈ P with Pnk

D−→ P as k →∞. Let K1,K2, . . . ⊂ S be compactsets with Kj ⊂ Kj+1, j ≥ 1 , and

Pn(Kj) > 1− 1

j∀j ≥ 1, ∀n ≥ 1.

For each m ≥ 1, Kj has a finite 1/m-net Nj,m. The set

N :=

∞⋃

j=1

∞⋃

m=1

Nj,m

is countable, and we have ∪∞j=1Kj ⊂ N , i.e., ∪∞

j=1Kj is separable. By a

general result, there is a countable system O ⊂ O of open sets such that

∀x ∈ S ∀O ∈ O : x ∈( ∞⋃

j=1

Kj

)∩O ⇒ ∃G ∈ O : x ∈ G ⊂ G ⊂ O.

Let H be the system of all finite unions of sets of the type G ∩Kj , whereG ∈ O and j ≥ 1, enlarged by ∅. The system H is countable, and byCantor’s diagonal procedure, there is a subsequence (Pnk

) such that

Norbert Henze, KIT 16.6

Page 233: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Relative compactness and tightness

α(H) := limk→∞

Pnk(H)

exists for each H ∈ H. The aim is to construct P ∈ P such that

P (O) = supα(H) : H ∈ H, H ⊂ O, O ∈ O. (16.5)

Suppose P exists. Then, for H ∋ H ⊂ O,

α(H) = limk→∞

Pnk(H) ≤ lim inf

k→∞Pnk

(O).

(16.5) implies P (O) ≤ lim infk→∞ Pnk(O).

The Portmanteau theorem then gives Pnk

D−→ P . To construct P , put

β(O) := supH∈H,H⊂O

α(H), γ(M) := infO∈O,O⊃M

β(O), M ⊂ S.

Then γ is an outer measure on the class of all subsets of S. By Caratheodory’slemma, the restriction of γ to the system A∗ of γ-measurable sets is ameasure, denoted by µ. Show that A ⊂ A∗, and that µ is a probabilitymeasure. This measure µ is the desired P .

Norbert Henze, KIT 16.7

Page 234: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Relative compactness and tightness

16.8 Corollary If (S, ρ) is separable, any finite set Q ⊂ P is tight.

Norbert Henze, KIT 16.8

Page 235: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Weak convergence and tightness in C

17 Weak convergence and tightness in C

Let S = C = C[0, 1], P, P1, P2 . . . ∈ P . From 16.5 and 16.7, we have:

17.1 Theorem (Fidi convergence and tightness imply PnD−→ P in C)

If PnDfidi−→ P and the sequence Pn : n ∈ N is tight then Pn

D−→ P .

How to prove tightness in C?

17.2 Definition (Modulus of continuity)For x ∈ C[0, 1], the function wx : (0, 1]→ R, defined by

wx(δ) := w(x, δ) := sup|s−t|≤δ

|x(s)− x(t)|, 0 < δ ≤ 1,

is called modulus of continuity of x.

Notice that x is uniformly continuous if, and only if, limδ→0 wx(δ) = 0.

Norbert Henze, KIT 17.1

Page 236: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Weak convergence and tightness in C

17.3 Remark For x, y ∈ C and 0 < δ ≤ 1, we have:

|wx(δ)− wy(δ)| ≤ 2‖x− y‖ =⇒ w(x, δ) continuous in x.

Proof. Let δ ∈ (0, 1] and x, y ∈ C. If s, t ∈ [0, 1] with |s− t| ≤ δ, then

|x(s)− x(t)| ≤ |x(s)− y(s)|+ |y(s)− y(t)|+ |y(t)− x(t)|≤ ‖x− y‖+ wy(δ) + ‖x− y‖≤ wy(δ) + 2‖x − y‖.

Therefore,wx(δ) ≤ wy(δ) + 2‖x − y‖,

q.e.d.

Norbert Henze, KIT 17.2

Page 237: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Weak convergence and tightness in C

Memo: A ⊂ S relatively compact :⇐⇒ A compact .

17.4 Theorem (Arzela–Ascoli)A set A ⊂ C[0, 1] is relatively compact if, and only if,

supx∈A|x(0)| <∞ (uniform boundedness at 0) (17.1)

andlimδ→0

supx∈A

wx(δ) = 0 (uniform equicontinuity) (17.2)

17.5 Example : Let zn ∈ C as in 13.8.

0

0.5

1.0

0 0.5 1.0

zn(t)

t1n

wzn(δ) = 1, δ ≥ 1/n

supx∈Awx(δ) = 1, δ > 0

A := zn : n ∈ N not relatively compact since (17.2) is violated.

Norbert Henze, KIT 17.3

Page 238: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Weak convergence and tightness in C

Memo: supx∈A |x(0)| <∞ (17.1), limδ→0 supx∈Awx(δ) = 0 (17.2)

Proof of”(17.1),(17.2) =⇒ A compact“:

Choose k large enough that supx∈Awx(1/k) <∞. Since

|x(t)| ≤ |x(0)|+k∑

j=1

∣∣∣∣x(jt

k

)− x

((j − 1)t

k

) ∣∣∣∣,︸ ︷︷ ︸

≤ wx(1/k)we have

α := sup0≤t≤1

supx∈A|x(t)| <∞. (17.3)

We now use (17.2) and (17.3) to show that A is totally bounded.

Since C is complete, it then follows that A is compact (cf. 13.4 c)).

Fix ε > 0. Choose a finite ε-net H in [−α, α] ⊂ R. Choose k large enough thatwx(1/k) < ε for all x ∈ A. Let B be the finite set of (polygonal) functions thatare linear on each interval Ikj := [(j − 1)/k, j/k], 1 ≤ j ≤ k, and take valuesin H at the endpoints. If x ∈ A, then |x(j/k)| ≤ α =⇒ ∃y ∈ B with|x(j/k) − y(j/k)| < ε for j = 0, 1, . . . , k. If t ∈ Ikj then

∣∣y(j/k)− x(t)∣∣ ≤

∣∣y(j/k)− x(j/k)∣∣+∣∣x(j/k)− x(t)

∣∣ < 2ε.

Norbert Henze, KIT 17.4

Page 239: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Weak convergence and tightness in C

Memo: |x(j/k) − y(j/k)| < ε for j = 0, 1, . . . , k.

Memo: wx(1/k) < ε, x ∈ A, Ikj := [(j − 1)/k, j/k]

Likewise, for t ∈ Ikj ,∣∣∣∣y(j − 1

k

)− x(t)

∣∣∣∣ ≤∣∣∣∣y(j − 1

k

)− x

(j − 1

k

) ∣∣∣∣+∣∣∣∣x(j − 1

k

)− x(t)

∣∣∣∣ < 2ε.

Since y(t) is a convex combination of y((j − 1)/k) and y(j/k), there is aβ = β(t) ∈ [0, 1] with

y(t) = β · y(j − 1

k

)+ (1− β) · y

(j

k

)=⇒

|y(t)− x(t)| =

∣∣∣∣β ·(y

(j − 1

k

)− x(t)

)+ (1− β) ·

(y

(j

k

)− x(t)

)∣∣∣∣

≤ β ·∣∣∣∣y(j − 1

k

)− x(t)

∣∣∣∣ + (1− β) ·∣∣∣∣y(j

k

)− x(t)

∣∣∣∣≤ 2ε.

Thus, B is a finite 2ε-net for A, q.e.d.

Norbert Henze, KIT 17.5

Page 240: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Weak convergence and tightness in C

A picture”tells it all“:

t1

α

−α

••••••••

•••••••••

H

2ε>

1k

j−1k

jk

Ikj

• y ∈ B

x(t)

• • •

• •

• • • •

Norbert Henze, KIT 17.6

Page 241: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Weak convergence and tightness in C

Memo: supx∈A |x(0)| <∞ (17.1), limδ→0 supx∈Awx(δ) = 0 (17.2)

The proof of”A compact =⇒ (17.1), (17.2)“ is easy:

Since π0 : C → R is continuous and A is compact, π0(A) ⊂ R is compact andthus bounded, which is (17.1).

Let fn(x) := w(x, 1/n).

fn is continuous, and we have fn(x) ↓ 0 as n→∞.

Fix ε > 0. Let On := x : fn(x) < ε ∈ O.Then On ⊂ On+1, n ≥ 1, and S = ∪∞

n=1On.

Since A is compact, we have A ⊂ On for some n, which gives (17.2).

Norbert Henze, KIT 17.7

Page 242: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Weak convergence and tightness in C

Memo: A compact ⇐⇒ supx∈A |x(0)| <∞, limδ→0 supx∈Awx(δ) = 0

17.6 Theorem (Characterization of tightness in C[0, 1])

Let (Pn)n≥1 be a sequence in P . We then have:

Pn : n ≥ 1 tight

⇐⇒ a) ∀η > 0 ∃a ∃n0 ∀n ≥ n0 : Pn(x : |x(0)| ≥ a) ≤ η,b) ∀ε > 0 ∀η > 0 ∃δ ∈ (0, 1) ∃n0 ∀n ≥ n0 : Pn(x : wx(δ) ≥ ε) ≤ η.

Proof:”=⇒“: Let Pn : n ≥ 1 be tight. Fix η > 0.

Choose a compact set K ⊂ C such that Pn(K) > 1− η for each n ≥ 1.

Thm. 17.4 (Arzela–Ascoli) =⇒ ∃a > 0: K ⊂ x : |x(0)| < a.

=⇒ Pn(x : |x(0)| < a) > 1− η ∀n ≥ 1.

Fix ε > 0. Thm. 17.4 =⇒ ∃δ ∈ (0, 1) such that K ⊂ x : wx(δ) < ε =⇒

Pn(x : wx(δ) < ε) > 1− η ∀n ≥ 1.

Norbert Henze, KIT 17.8

Page 243: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Weak convergence and tightness in C

Memo: A compact ⇐⇒ supx∈A |x(0)| <∞, limδ→0 supx∈Awx(δ) = 0

Pn : n ≥ 1 tight

⇐⇒ a)∀η > 0 ∃a ∃n0 ∀n ≥ n0 : Pn(x : |x(0)| ≥ a) ≤ η,b) ∀ε > 0 ∀η > 0 ∃δ ∈ (0, 1) ∃n0 ∀n ≥ n0 : Pn(x : wx(δ) ≥ ε) ≤ η.

Proof:”⇐=“: 16.8 and part

”=⇒“ imply w.l.o.g. n0 = 1.

Fix ε > 0. To show: There is a compact set K with Pn(K) ≥ 1− ε ∀n.In a), let η := ε/2. Choose a > 0 such that, putting B := x : |x(0)| ≤ a,

Pn(B) ≥ 1− ε

2, n ≥ 1.

For each k ∈ N, let ε := 1/k and η := ε/2k+1 in b). Choose δk > 0 such that,writing Bk := x : wx(δk) < 1/k,

Pn(Bk) ≥ 1− ε

2k+1, k ≥ 1, n ≥ 1.

Put K := B ∩ ∩∞k=1Bk. Then Pn(K) ≥ 1− ε for each n ≥ 1.

The set A := B ∩∩∞k=1Bk satisfies the conditions of 17.4. Thus, K is compact.

Norbert Henze, KIT 17.9

Page 244: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Weak convergence and tightness in C

17.7 Theorem Let 0 = t0 < t1 < . . . < tk = 1. If

min1<j<k

(tj − tj−1) ≥ δ, (17.4)

thenwx(δ) ≤ 3 max

1≤j≤ksup

tj−1≤s≤tj

|x(s)− x(tj−1)|, x ∈ C. (17.5)

Moreover, for P ∈ P and ε > 0,

P (x : wx(δ) ≥ 3ε) ≤k∑

j=1

P

(x : sup

tj−1≤s≤tj

|x(s)−x(tj−1)| ≥ ε)

(17.6)

Proof: Let m be the maximum in (17.5). If s, t ∈ Ij := [tj−1, tj ], then

|x(s)− x(t)| ≤ |x(s)− x(tj−1)|+ |x(tj−1)− x(t)| ≤ 2m.

If s ∈ Ij , t ∈ Ij+1, then

|x(s)− x(t)| ≤ |x(s)− x(tj−1)|+ |x(tj)− x(tj−1)|+ |x(tj)− x(t)| ≤ 3m.

If |s− t| ≤ δ, no further cases possible in view of (17.4). (17.6) follows from(17.5) and the subadditivity of P , q.e.d.

Norbert Henze, KIT 17.10

Page 245: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Weak convergence and tightness in C

17.8 Theorem Let X,X1, X2, . . . be random functions on (Ω,A,P).If, for each k ≥ 1 and 0 ≤ t1 ≤ . . . ≤ tk ≤ 1,

(Xn(t1), Xn(t2), . . . , Xn(tk))D−→ (X(t1), X(t2), . . . , X(tk)) (17.7)

andlimδ→0

lim supn→∞

P (w(Xn, δ) ≥ ε) = 0 ∀ε > 0, (17.8)

then XnD−→ X.

Proof: Let P := PX , Pn := PXn , n ≥ 1. (17.7) ⇐⇒ PnDfidi−→ P .

In view of Thm. 17.1, we have to show the tightness of Pn : n ≥ 1.

Since Xn(0)D−→ X(0), Pn π−1

0 : n ≥ 1 is tight =⇒ condition a) of Thm.17.6 holds. Now,

(17.8) ⇐⇒ limδ→0

lim supn→∞

Pn(x : w(x, δ) ≥ ε) = 0 ∀ε > 0

=⇒ condition b) of Thm. 17.6.

Norbert Henze, KIT 17.11

Page 246: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18 Wiener Measure, Donsker’s Theorem

In what follows, let X := idC , X(x) := x, x ∈ C, Xt := πt X : C → R

(canonical construction).

If P is a probability measure on B, then, for u ∈ R,

P (Xt ≤ u) = P (x ∈ C : Xt(x) ≤ u).

18.1 Definition (Wiener Measure)

A probability measure W on B is called Wiener measure :⇐⇒

a) W (Xt ≤ u) = 1√2πt

∫ u

−∞exp

(−z

2

2t

)dz, 0 < t ≤ 1, u ∈ R,

b) W (X0 = 0) = 1,

c) ∀k ≥ 2, ∀ 0 ≤ t0 ≤ t1 ≤ . . . ≤ tk ≤ 1:

Xt1 −Xt0 , Xt2 −Xt1 , . . . , Xtk −Xtk−1 are independent under W .

I.e., Xt ∼ N(0, t), 0 ≤ t ≤ 1, and (Xt : 0 ≤ t ≤ 1) has independent increments.

Norbert Henze, KIT 18.1

Page 247: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18.2 Corollary Under the Wiener measure W , we have:

a) Xt −Xs ∼ N(0, t− s), 0 ≤ s ≤ t ≤ 1,

b) Xt −Xs ∼ Xt−s, s ≤ t (”(Xt)0≤t≤1 has stationary increments“),

c) Cov(Xs, Xt) = min(s, t) =: s ∧ t, 0 ≤ s, t ≤ 1,

d) For each k ≥ 1, for each 0 ≤ t1 ≤ t2 ≤ . . . ≤ tk ≤ 1:

(Xt1 , Xt2 , . . . , Xtk )⊤ ∼ Nk(0,Σ),

where 0 = (0, 0, . . . , 0)⊤ ∈ Rd and Σ = (ti ∧ tj)1≤i,j≤k.

Proof: a) Let 0 ≤ s < t ≤ 1. We have

Xt = Xs + (Xt −Xs),

where, according to 18.1 c), Xs (= Xs −X0) and Xt −Xs are independent

=⇒ E

(eiuXt

)= E

(eiuXs

)· E(eiu(Xt−Xs)

), u ∈ R.

18.1a) =⇒ E

(eiuXt

)= exp

(− tu

2

2

), E

(eiuXs

)= exp

(−su

2

2

)

=⇒ E

(eiu(Xt−Xs)

)= exp

(− (t− s)u2

2

)=⇒ Xt−Xs ∼ N(0, t−s) =⇒ b).

Norbert Henze, KIT 18.2

Page 248: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

Memo: c) Cov(Xs, Xt) = min(s, t), 0 ≤ s, t ≤ 1

Proof of c): Let 0 ≤ s < t ≤ 1. We have XsXt = X2s +Xs(Xt −Xs)

=⇒ Cov(Xs, Xt) = E(XsXt) = E(X2

s

)+ E (Xs(Xt −Xs))

= E(X2

s

)+ 0 = s = min(s, t).

Proof of d): Notice that

Xt1

Xt2

Xt3

...

...Xtk

=

1 0 0 0 · · · 01 1 0 0 · · · 01 1 1 0 · · · 0...

...... 1 · · · 0

......

......

. . . 01 1 1 1 · · · 1

·

Xt1

Xt2 −Xt1

Xt3 −Xt2

...

...Xtk −Xtk−1

︸ ︷︷ ︸ ︸ ︷︷ ︸=: A ∼ Nk(0, D)

where D := diag(t1, t2 − t1, t3 − t2, . . . , tk − tk−1).

We have (!) ADA⊤ = Σ = (ti ∧ tj)1≤i,j≤k, q.e.d.

Norbert Henze, KIT 18.3

Page 249: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18.3 Construction of W

Let Z1, Z2, . . . be i.i.d. random variables on some probability space (Ω,A,P)such that E(Z1) = 0, 0 < σ2 := V(Z1) <∞. Put S0 := 0,Sn := Z1 + . . .+ Zn, n ≥ 1. Let, for ω ∈ Ω,

Xn(t)(ω) :=1

σ√nS⌊nt⌋(ω) + (nt− ⌊nt⌋) 1

σ√nZ⌊nt⌋+1(ω), (18.1)

n ≥ 1, 0 ≤ t ≤ 1. Notice that

Xn

(j

n

)=

1

σ√nSj , j ∈ 0, 1, . . . , n.

1n

2n

jn 1

•S1σ√n

•S2σ√n

•Sj

σ√n

Xn is the n-th partial sum process associated with (Zj)j≥1, cf. Ex. 16.3.

Norbert Henze, KIT 18.4

Page 250: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

Let

Rn(t) :=nt− ⌊nt⌋σ√n· Z⌊nt⌋+1.

Then

Xn(t) =1

σ√nS⌊nt⌋ +Rn(t).

Notice that Rn(t)P−→ 0 as n→∞. Thus, for t > 0 (and n ≥ 1/t),

Xn(t) =

√⌊nt⌋√n· S⌊nt⌋

σ√⌊nt⌋

+ Rn(t)

︸ ︷︷ ︸ ︸ ︷︷ ︸ ︸ ︷︷ ︸→√t

D−→ N(0, 1)P−→ 0

CLT of Lindeberg-Levy, CMT and Slutsky =⇒ Xn(t)D−→ N(0, t) ∼ Xt.

Notice that Xn(0) = 0D−→ δ0 = N(0, 0) ∼ X0.

Thus, Xn(t)D−→ Xt for each t ∈ [0, 1].

Norbert Henze, KIT 18.5

Page 251: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

Likewise, consider 0 ≤ s < t ≤ 1. Notice that S⌊ns⌋ and S⌊nt⌋ − S⌊ns⌋ areindependent. (why?) We have

(Xn(s)

Xn(t)−Xn(s)

)=

1

σ√n

(S⌊ns⌋

S⌊nt⌋ − S⌊ns⌋

)+

(Rn(s)

Rn(t)−Rn(s)

)

D−→ N2

((00

),

(s 00 t− s

))

CMT =⇒(Xn(s)Xn(t)

)=

(1 01 1

)·(

Xn(s)Xn(t)−Xn(s)

)

D−→ N2

((00

),

(1 01 1

) (s 00 t− s

) (1 10 1

))

= N2

((00

),

(s ss t

))

∼(Xs

Xt

).

Norbert Henze, KIT 18.6

Page 252: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

Likewise, for any k ≥ 3 and any t1, . . . , tk with 0 ≤ t1 < . . . < tk ≤ 1:

Xn(t1)

...Xn(tk)

D−→ Nk

0...0

, (ti ∧ tj)1≤i,j≤k

X(t1)

...X(tk)

(18.2)

Let Pn := PXn . Suppose W exists

=⇒ Pn π−1t1,...,tk

D−→ W π−1t1,...,tk

∀k ∀t1, . . . , tk, i.e., PnDfidi−→ W.

Suppose Pn : n ∈ N is tight (⇐⇒: Xn : n ∈ N is tight).Prokhorov’s Thm. =⇒ ∃ subsequence Pnj

: j ≥ 1 ∃ probability measure

(=:W ) on B such that Pnj

D−→W . From the CMT, we then have

Pnj

Dfidi−→ W . Now, from (18.2),

Pnj π−1

t1,...,tk

D−→ Nk

(0, (ti ∧ tj)1≤i,j≤k

)=⇒

W has the desired fidis (which determine PW ). Subsequence criterion

=⇒ PnD−→W . It remains to prove tightness of Xn : n ∈ N.

Norbert Henze, KIT 18.7

Page 253: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18.4 Lemma (Tightness of the PSP for stationary sequences)

Let Xn be the partial sum process of (18.1), where (Zj)j≥1 is a stationarysequence satisfying E(Z1) = 0, 0 < σ2 := V(Z1) <∞. If

limλ→∞

lim supn→∞

λ2P

(maxk≤n|Sk| ≥ λσ

√n

)= 0,

then Xn : n ∈ N is tight.

Proof: Since Xn(0) = 0, n ≥ 1, condition a) of Thm. 17.6 holds. From(17.8), it remains to prove

limδ→0

lim supn→∞

P (w(Xn, δ) ≥ ε) = 0 ∀ε > 0. (18.3)

If 0 = t0 < t1 < . . . < tk = 1 and min1<j<k(tj − tj−1) ≥ δ, (17.6) yields

P(w(Xn, δ) ≥ 3ε) ≤k∑

j=1

P

(sup

tj−1≤s≤tj

|Xn(s)−Xn(tj−1)| ≥ ε).

Choose

tj :=mj

n, where mj ∈ N0 and 0 = m0 < m1 < . . . < mk = n.

Norbert Henze, KIT 18.8

Page 254: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

Memo: P(w(Xn, δ) ≥ 3ε) ≤ ∑kj=1 P

(suptj−1≤s≤tj

|Xn(s)−Xn(tj−1)| ≥ ε).

If,for tj :=

mj

n,

mj

n− mj−1

n≥ δ, 1 < j < k, (18.4)

then (because of the polygonal character of Xn and stationarity!)

P (w(Xn, δ) ≥ 3ε) ≤k∑

j=1

P

(max

mj−1≤ℓ≤mj

∣∣∣∣Sℓ − Smj−1

σ√n

∣∣∣∣ ≥ ε)

=k∑

j=1

P

(max

ℓ≤mj−mj−1

|Sℓ| ≥ εσ√n

). (18.5)

Now, put m := ⌈nδ⌉ := minr ∈ N : r ≥ nδ, and mj := jm, 0 ≤ j < k,mk := n. Then the inequalities in (18.4) hold (!). Since we also need

mk−1 = (k − 1)m < n = mk ≤ km,

take k := ⌈n/m⌉. Then mk −mk−1 ≤ m. (18.5) and stationarity =⇒

P (w(Xn, δ) ≥ 3ε) ≤ k · P(maxℓ≤m|Sℓ| ≥ εσ

√n

).

Norbert Henze, KIT 18.9

Page 255: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

Memo: P (w(Xn, δ) ≥ 3ε) ≤ k · P (maxℓ≤m |Sℓ| ≥ εσ√n) ,

where

m = m(n, δ) =⌈nδ⌉, k = k(n, δ) =

⌈n

m

⌉.

Notice that k −→n→∞

1

δ<

2

δ,

n

m−→

n→∞1

δ>

1

2δ=⇒

P (w(Xn, δ) ≥ 3ε) ≤⌈n

m

⌉· P(maxℓ≤m|Sℓ| ≥ ε√

2δ· σ√m ·

√2δ

√n

m

)

︸ ︷︷ ︸→√2

≤ 2

δ· P(maxℓ≤m|Sℓ| ≥ ε√

2δ· σ√m

)

for sufficiently large n.

Put λ := ε√2δ. Then, for sufficiently large n,

P (w(Xn, δ) ≥ 3ε) ≤ 4λ2

ε2· P(maxℓ≤m|Sℓ| ≥ λσ

√m

).

Norbert Henze, KIT 18.10

Page 256: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

Memo: P (w(Xn, δ) ≥ 3ε) ≤ 4λ2

ε2· P(maxℓ≤m|Sℓ| ≥ λσ

√m

), n ≥ n0(δ).

Memo: Assumption: limλ→∞

lim supn→∞

λ2P

(maxk≤n|Sk| ≥ λσ

√n

)= 0.

Memo: To show: limδ→0

lim supn→∞

P (w(Xn, δ) ≥ ε) = 0 ∀ε > 0.

Assumption =⇒

∀ε > 0 ∀η > 0 ∃λ0 > 0 ∀λ ≥ λ0 :4λ2

ε2lim supm→∞

P

(maxℓ≤m|Sℓ| ≥ λσ

√m

)< η.

From this, the assertion follows (m goes to infinity along with n!).

How to bound P

(maxℓ≤m|Sℓ| ≥ β

)from above?

Norbert Henze, KIT 18.11

Page 257: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18.5 Lemma (Etemadi’s inequality)

Let Z1, . . . , Zn be independent random variables on (Ω,A, P), S0 := 0,

Sk :=∑k

j=1 Zj , 1 ≤ k ≤ n. Then

P

(maxk≤n|Sk| ≥ 3α

)≤ 3max

k≤nP (|Sk| ≥ α) , α > 0.

Proof: Put

A :=

maxk≤n|Sk| ≥ 3α

and, for k ∈ 1, . . . , n,

Bk := |Sk| ≥ 3α, |Sj | < 3α for j = 0, . . . , k − 1.

Then

A =n∑

k=1

Bk

and A = A ∩ |Sn| ≥ α+ A ∩ |Sn| < α.

Norbert Henze, KIT 18.12

Page 258: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

Memo: A = maxk≤n |Sk| ≥ 3α

Memo: Bk = |Sk| ≥ 3α, |Sj | < 3α for j = 0, . . . , k − 1

Memo: A = B1 + . . .+Bk = A ∩ |Sn| ≥ α+ A ∩ |Sn| < α.

P(A) ≤ P(|Sn| ≥ α) +n∑

k=1

P(Bk ∩ |Sn| < α) (last memo)

≤ P(|Sn| ≥ α) +n∑

k=1

P(Bk ∩ |Sn − Sk| > 2α) (triangle inequal.)

= P(|Sn| ≥ α) +n∑

k=1

P(Bk) · P(|Sn − Sk| > 2α) (independence)

≤ P(|Sn| ≥ α) + maxk≤n

P(|Sn − Sk| > 2α)

(n∑

k=1

P(Bk) ≤ 1

)

≤ P(|Sn| ≥ α) + maxk≤n

(P(|Sn| ≥ α) + P(|Sk| ≥ α)) (triangle inequal.)

≤ 3maxk≤n

P(|Sk| ≥ α), q.e.d.

Norbert Henze, KIT 18.13

Page 259: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

Proof of the existence of Wiener measure W :

Consider special partial sum process

Xn(t) :=1

σ√nS⌊nt⌋ + (nt− ⌊nt⌋) 1

σ√nZ⌊nt⌋+1,

where Z1, Z2, . . . are i.i.d. ∼ N(0, σ2)!! In this case, we have

N :=Sk

σ√k∼ N(0, 1), k ≥ 1, =⇒

P(|Sk| ≥ λσ√n) = P

(|N | ≥ λ ·

√n

k

)

≤ P(|N | ≥ λ) (if k ≤ n)

≤ E(N4) 1

λ4=

3

λ4.

It follows that limλ→∞

lim supn→∞

λ2 max1≤k≤n

P(|Sk| ≥ λσ√n) = 0.

In view of Lemma 18.4 and Etemadi’s inequality 18.5, this was to be shown.

Norbert Henze, KIT 18.14

Page 260: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18.6 Wiener Process on [0, 1]

In what follows, we also use the notation W for a random function on aprobability space (Ω,A,P) having distribution W . Then W : Ω→ C.

For fixed ω ∈ Ω, W (ω) ∈ C is called a path of W .

We put W (ω)(t) =:Wt(ω) =:W (ω, t) and suppress the dependence on ω bywriting

W (t) :=Wt (random variable on Ω).

Then (W (t),0 ≤ t ≤ 1) is a stochastic process (family of random variables)with the following properties:

a) P(W (0) = 0) = 1,

b) W (t) ∼ N(0, t), 0 ≤ t ≤ 1;

c) W has independent increments, i.e., ∀k ≥ 2,∀0 ≤ t0 < t1 < . . . < tk ≤ 1:W (t1)−W (t0), . . . ,W (tk)−W (tk−1) are independent.

d) (W (t), 0 ≤ t ≤ 1) is a Gaussian process, i.e., ∀k ≥ 1, ∀t1, . . . , tk,(W (t1), . . .W (tk))

⊤ has a k-variate normal distribution satisfyingEW (t) = 0 and Cov(W (s),W (t)) = min(s, t), 0 ≤ s, t ≤ 1.

(W (t), 0 ≤ t ≤ 1) is called the Wiener-Process or Brownian motion on [0, 1].

Norbert Henze, KIT 18.15

Page 261: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

The Wiener Process (Brownian motion) (W (t),0 ≤ t ≤ 1) is a fundamentalstochastic process.

It has continuous paths (W is a C[0, 1]-valued random element), but one canprove:

With probability one, the paths of W are nowhere differentiable,

With probability one, the paths of W are nowhere locally increasing ordecreasing,

With probability one, the paths of W have unbounded variation on everyinterval [s, t] with s < t.

→ Course”Brownian Motion“.

Norbert Henze, KIT 18.16

Page 262: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18.7 Theorem (Donsker’s Theorem (1951))

Let Z1, Z2, . . . be i.i.d. random variables, E(Z1) = 0, 0 < σ2 := V(Z1) < ∞.Let S0 := 0, Sn :=

∑nj=1 Zj , n ≥ 1, and put

Xn(t) :=1

σ√nS⌊nt⌋ + (nt− ⌊nt⌋) · Z⌊nt⌋+1√

n.

We then have XnD−→ W .

Proof: We have to show (cf. 18.4, 18.5)

limλ→∞

lim supn→∞

λ2 max1≤k≤n

P(|Sk| > λσ

√n)= 0. (TP)

LetMn(λ) := max

1≤k≤nP(|Sk| > λσ

√n).

Notice that

P(|Sk| > λσ√n) ≤ kσ2

λ2σ2n=

k

λ2n. (Tschebyschew’s inequality)

Norbert Henze, KIT 18.17

Page 263: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

Memo: P(|Sk| > λσ√n) ≤ k

λ2n, Mn(λ) := max

1≤k≤nP(|Sk| > λσ

√n).

Put Yk :=Sk

σ√k. Note that Yk

D−→ N ∼ N(0, 1) as k →∞.

P(|Sk| > λσ√n) = P

(|Yk| > λ

√n/k

)≤ P(|Yk| > λ) −→

k→∞P(|N | > λ).

↑k ≤ n

Markov’s inequality =⇒ P(|N | > λ) ≤ E(N4)

λ4=

3

λ4.

Given λ > 0, let k(λ) ∈ N such that

P(|Yk| > λ) ≤ 6

λ4∀ k > k(λ).

=⇒ Mn(λ) ≤ max

(k(λ)

λ2n,6

λ4

)

=⇒ lim supn→∞

λ2Mn(λ) ≤ 6

λ2=⇒ lim

λ→∞lim supn→∞

λ2Mn(λ) = 0, i.e., (TP).

Norbert Henze, KIT 18.18

Page 264: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

0

1

2

−1

−2

0.5 1.0t

Realizations of PSP X1000, P(Z1 = ±1) = 1/2

0

1

2

−1

−2

0.5 1.0t

Realizations of PSP X1000, Z1 − 1 ∼ Exp(1)

Norbert Henze, KIT 18.19

Page 265: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18.8 Corollary From Donsker’s Theorem, we have

Sn

σ√n

= Xn(1)D−→ W (1) ∼ N(0, 1) (CLT of Lindeberg-Levy)

18.9 Invariance Principle, functional Central Limit Theorem

a) Let Xn be a partial sum process as in Thm. 18.7. The limit process W in18.7 does not depend on the specific distribution of Z1 (we only needE(Z1) = 0, 0 < V(Z1) <∞). This fact is called the invariance principle.

b) Let h : C → Rk be measurable and W (C(h)) = 1. From XnD−→W and

the CMT, we have

h(Xn)D−→ h(W )

(so-called functional central limit theorem).

From the invariance principle, the limit distribution of h(Xn) does not dependon the special distribution of Z1.

Important consequence: If you can find the limit distribution of h(Xn) for asimple PSP (e.g., the simple symmetric random walk case P(Z1 = ±1) = 1/2),you know the distribution of h(W ).

Norbert Henze, KIT 18.20

Page 266: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18.10 Theorem (The distribution of max0≤t≤1W (t))

For the Wiener process W , we have

max0≤t≤1

W (t) ∼ |N |, where N ∼ N(0, 1),

i.e.,

P

(max0≤t≤1

W (t) ≤ u)

= 2Φ(u) − 1, u ≥ 0,

where Φ is the distribution function of the standard normal distribution.

Proof: Let Xn be the partial sum process associated with the i.i.d.-sequence(Zj)j≥1, where P(Z1 = 1) = P(Z1 = −1) = 1/2 (simple symmetric randomwalk). Exercise =⇒

max0≤t≤1

Xn(t) =1√n· maxk=0,...,n

SkD−→ |N |.

Since XnD−→W and the function h : C → R, h(x) := max0≤t≤1 h(t), is

continuous, (check!) the CMT yields

max0≤t≤1

Xn(t) = h(Xn)D−→ h(W ) = max

0≤t≤1W (t).

Norbert Henze, KIT 18.21

Page 267: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18.11 Corollary Let Z1, Z2, . . . be independent identically distributed randomvariables with E(Z1) = 0, 0 < σ2 := V(Z1) <∞. Then

1

σ√n

maxk=0,...,n

SkD−→ |N |, where N ∼ N(0, 1).

Consider the functionals

h+(x) := λ1 (t ∈ [0, 1] : x(t) > 0) , x ∈ C,

h0(x) := supt ∈ [0, 1] : x(t) = 0, x ∈ C.h+(x) is the time that x

”spends above the t-axis“,

h0(x) is the time of the last zero of x.

0

1

2

−11

x(t)

h0(x)

h+(x)t

Norbert Henze, KIT 18.22

Page 268: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

For the PSP Xn based on the symmetric simple random walk, we have (seeHenze,N. (2013): Irrfahrten und verwandte Zufalle, Spr. Spektrum, p.21, p.46):

limn→∞

P

(h0(Xn)

n≤ u)

= limn→∞

P

(h+(Xn)

n≤ u)

=2

πarcsin

√u, 0 ≤ u ≤ 1.

From this, we have the famous Arc Sine Law:

18.12 Theorem (Arc Sine Law for the Wiener process)

We have

P(h0(W ) ≤ u) = P(h+(W ) ≤ u) =2

πarcsin

√u, 0 ≤ u ≤ 1.

0 1u

1/(π√u(1− u))

0 1u

2πarcsin

√u

Density (left) and distribution function (right) of the Arc Sine distribution

Norbert Henze, KIT 18.23

Page 269: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18.13 Theorem (Fourier representation of W )

Let N1, N2, . . . be i.i.d. standard normal random variables. Put

W (t) :=∞∑

j=1

√2 sin

((j − 1

2

)t)

(j − 1

2

·Nj , 0 ≤ t ≤ 1.

The series converges in L2 := L2([0, 1],B ∩ [0, 1], λ1|[0,1]), and we have

WDfidi= W (equality of finite-dimensional distributions).

The proof uses Mercer’s Theorem:

18.14 Theorem (Mercer)

Let K : [0, 1]2 → R, K 6≡ 0, be a continuous, symmetric function satisfying

∫ 1

0

∫ 1

0g(s)K(s, t)g(t) ds dt ≥ 0 ∀ g ∈ L2. (K positive-semidefinite)

ThenK(s, t) =

∑∞j=1λjϕj(s)ϕj(t), 0 ≤ s, t ≤ 1, (18.6)

where λ1, λ2, . . . are the positive eigenvalues and ϕ1, ϕ2, . . . the correspondingnormalized eigenfunctions of the integral operator associated with the kernel K.The series in (18.6) converges both uniformly and absolutely.

Norbert Henze, KIT 18.24

Page 270: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

18.15 Theorem (Mercer’s theorem, applied to K(s, t) = s ∧ t)We have

s ∧ t =∞∑

j=1

λj ϕj(s)ϕj(t), 0 ≤ s, t ≤ 1,

where

λj =1

π2(j − 1

2

)2 , ϕj(t) =√2 sin

((j − 1

2

)πt

), j ≥ 1.

Proof: Exercise! (Differentiate λf(s) =∫ 1

0s ∧ t f(t) dt twice.)

Let

W (t) :=

∞∑

j=1

√λj ϕj(t)Nj (L2-limit).

Proof of Theorem 18.13. Fix k ≥ 1 and 0 ≤ t1 < . . . < tk ≤ 1.Claim:

(W (t1), . . . , W (tk)

)D= (W (t1), . . . ,W (tk))

D= Nk

(0, (ti ∧ tj)1≤i,j≤k

).

Norbert Henze, KIT 18.25

Page 271: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Wiener Measure, Donsker’s Theorem

Memo: Claim:(W (t1), . . . , W (tk)

)D= Nk

(0, (ti ∧ tj)1≤i,j≤k

).

Fix c1, . . . , ck ∈ R. To show:k∑

ℓ=1

cℓW (tℓ) ∼ N(0,∑k

ℓ,m=1cℓcm tℓ ∧ tm).

Let

Wn(t) :=n∑

j=1

√λj ϕj(t)Nj , n ≥ 1.

Notice that

k∑

ℓ=1

cℓWn(tℓ) =

k∑

ℓ=1

cℓ

(n∑

j=1

√λjϕj(tℓ)Nj

)=

n∑

j=1

√λj

(k∑

ℓ=1

cℓϕj(tℓ)

)Nj

∼ N(0,∑n

j=1λj

∑kℓ,m=1cℓcm ϕj(tℓ)ϕj(tm)

)

= N(0,∑k

ℓ,m=1cℓcm∑n

j=1λjϕj(tℓ)ϕj(tm))

︸ ︷︷ ︸→ tℓ ∧ tm.

Since∑k

ℓ=1 cℓWn(tℓ)L2

−→∑kℓ=1 cℓW (tℓ), the assertion follows. (why?)

Norbert Henze, KIT 18.26

Page 272: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

19 Brownian Bridge, Wiener process on [0,∞)

19.1 Definition (Brownian Bridge)

A C[0, 1]-valued random element B is called Brownian Bridge :⇐⇒a) P(B(0) = 0) = 1 = P(B(1) = 0),

b) For each k ≥ 1, for each t1, . . . , tk with 0 ≤ t1 < . . . < tk ≤ 1:

B(t1)

...B(tk)

∼ Nk

0...0

, (min(ti, tj)− titj)1≤i,j≤k

.

19.2 Theorem B exists.

Proof: Consider the mapping h : C → C, C ∋ x 7→ h(x), defined by

h(x)(t) := x(t)− t · x(1), 0 ≤ t ≤ 1.

Note that h is continuous, (why?) and that h(x)(1) = 0.

Moreover, x(0) = 0 =⇒ h(x)(0) = 0.

Norbert Henze, KIT 19.1

Page 273: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

0

1

−1

1

tx(1)x(t)− tx(1)

x(t)

t

Let (W (t),0 ≤ t ≤ 1), be a Wiener process. Put

B(t) := W (t)− tW (1) = h W (t), 0 ≤ t ≤ 1.

Then P(B(0) = 0) = 1 = P(B(1) = 0), i.e., 19.1 a) holds. Notice that

B(t1)B(t2)

...B(tk)

=

1 0 · · · 0 −t0 1 0 0 −t... 0

. . . 0...

0 0 · · · 1 −t

W (t1)...

W (tk)W (1)

∼ Nk

Norbert Henze, KIT 19.2

Page 274: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

Memo: B(t) =W (t)− tW (1).

We haveEB(t) = EW (t)− tEW (1) = 0,

and, for 0 ≤ s, t ≤ 1,

Cov(B(s),B(t)) = E [(W (s)− sW (1))(W (t)− tW (1))]

= E [W (s)W (t)] − sE [W (1)W (t)] − tE [W (s)W (1)]

+stE[W (1)2

]

= s ∧ t− st− ts+ st

= s ∧ t− st.

I.e., 19.1 b) holds, q.e.d.

Norbert Henze, KIT 19.3

Page 275: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

0

0.5

1.0

−0.5

t

3 realizations of an (approximate) Brownian bridge

Norbert Henze, KIT 19.4

Page 276: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

19.3 The Wiener process on [0,∞)

Let S := C[0,∞) := x : R≥0 → R | x continuous.For x, y ∈ C[0,∞), put

ρ(x, y) :=∞∑

j=1

1

2j· max0≤t≤j |x(t)− y(t)|1 + max0≤t≤j |x(t)− y(t)|

.

Then (C[0,∞), ρ) is a complete and separable metric space. (Exercise!)

ρ(xn, x)→ 0 ⇐⇒ maxt∈K|xn(t)− x(t)| → 0 for each compact set K ⊂ R≥0.

For j ∈ N, let

rj :

C[0,∞) → C[0, j],

x 7→ rj(x) := x|[0,j], (restriction of x to [0, j])

Let P, P1, P2, . . . be probability measures on B. Then

PnD−→ P ⇐⇒ Pnr

−1j

D−→ Pr−1j ∀ j ≥ 1.

Ref.: Whitt, W.: Ann. Mathem. Statist. 41, 1970, 939–944.

Norbert Henze, KIT 19.5

Page 277: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

Consider the mapping

h :

C[0, 1]→ C[0,∞),

x 7→ h(x), h(x)(t) := (1 + t) · x(

t

1 + t

), 0 ≤ t <∞.

The function h is continuous. (why?)

Let (B(t))0≤t≤1 be a Brownian bridge. Put

V (t) := h(B)(t)

= (1 + t) ·B(

t

1 + t

), t ≥ 0.

Then V is a random element of C[0,∞).

Notice that EV (t) = 0, t ≥ 0, and that, for 0 ≤ s ≤ t,

Cov(V (s), V (t)) = (1 + s)(1 + t)Cov

(B

(s

1 + s

), B

(t

1 + t

))

= (1 + s)(1 + t)

(min

(s

1 + s,

t

1 + t

)− s

1 + s

t

1 + t

)

= s(1 + t)− st = s

= min(s, t).

Norbert Henze, KIT 19.6

Page 278: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

Memo: V (t) = (1 + t) ·B(

t1+t

), EV (t) = 0, Cov(V (s), V (t)) = s ∧ t.

Notice that

a) P(V (0) = 0) = 1,

b) For each k ≥ 1, for each t1, . . . , tk with 0 ≤ t1 < . . . < tk <∞:

V (t1)

...V (tk)

∼ Nk

0...0

, (ti ∧ tj)1≤i,j≤k

.

From this property, it follows that V has independent increments.

Any random element V satisfying a) and b) is called Wiener process orBrownian motion on C[0,∞).

Norbert Henze, KIT 19.7

Page 279: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

19.4 Theorem Let W be a Wiener process on [0, 1]. For ε > 0, let

Pε(A) := P (W ∈ A|0 ≤W (1) ≤ ε), A ∈ B.

We then have PεD−→ B as ε ↓ 0, where B is a Brownian bridge.

Proof: Let W be defined on (Ω,A, P), and let B be defined as

B(t) := W (t)− tW (1), 0 ≤ t ≤ 1.

According to the Portmanteau Theorem, we have to show

lim supε↓0

P(W ∈ A|0 ≤W (1) ≤ ε) ≤ P(B ∈ A) ∀A ∈ A.

Notice that, for each k ≥ 1 and each choice of t1, . . . , tk,(W (1), B(t1), . . . , B(tk)) has a (k + 1)-variate normal distribution. Moreover,for each j ∈ 1, . . . , k,

E[W (1)B(tj)] = E[W (1)(W (tj)− tjW (1)

]= tj − tj = 0 =⇒

W (1) and (B(t1), . . . B(tk)) are independent ∀k ≥ 1,∀t1, . . . , tk =⇒

P(W (1) ∈ A,B ∈M) = P(W (1) ∈ A) · P(B ∈M) ∀M ∈ Cf ∀A ∈ B1.

Norbert Henze, KIT 19.8

Page 280: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

Memo: Cf :=π−1t1,...,tk

(H)∣∣k ∈ N, 0 ≤ t1 < . . . < tk ≤ 1,H ∈ Bk

Memo: P(W (1) ∈ A,B ∈M) = P(W (1) ∈ A) · P(B ∈M) ∀M ∈ Cf ∀A ∈ B1

Fix A ∈ B1. Put

DA := M ∈ B : P(W (1) ∈ A,B ∈M) = P(W (1) ∈ A) · P(B ∈M).

We have:

a) Cf ⊂ DA.

b) DA is a Dynkin system, i.e., we have:

C ∈ DA,

D,E ∈ DA and D ⊂ E =⇒ E \D ∈ DA,

E1, E2, . . . ∈ DA pairwise disjoint =⇒∑∞

n=1 En ∈ DA.

It follows that δ(Cf ) ⊂ DA, where δ(Cf ) is the smallest Dynkin system over Ccontaining Cf .

Cf π-system =⇒ B = σ(Cf ) = δ(Cf ) ⊂ DA.

Norbert Henze, KIT 19.9

Page 281: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

Hence,P(B ∈M |0 ≤W (1) ≤ ε) = P(B ∈M) ∀M ∈ B.

We have (recall: B(t) =W (t)− tW (1))

ρ(W,B) = sup0≤t≤1

|W (t)− (W (t)− tW (1))| = |W (1)|.

Now, fix A ∈ A and δ > 0.

If |W (1)| ≤ δ and W ∈ A, then B ∈ Aδ := x : ρ(x,A) ≤ δ.If 0 < ε < δ, then

P(W ∈ A|0 ≤W (1) ≤ ε) ≤ P(B ∈ Aδ|0 ≤W (1) ≤ ε)= P(B ∈ Aδ)

=⇒ lim supε↓0

P(W ∈ A|0 ≤W (1) ≤ ε) ≤ P(B ∈ Aδ).

δ ↓ 0 =⇒ lim supε↓0

P(W ∈ A|0 ≤W (1) ≤ ε) ≤ P(B ∈ A), q.e.d.

Norbert Henze, KIT 19.10

Page 282: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

19.5 Remark Loosely speaking, Thm. 19.4 reads

PB = P

W |W (1)=0.

(”Brownian bridge is tied down Brownian motion“)

0

0.5

1.0

−0.5

t

The Brownian bridge is tied down Brownian motion

Norbert Henze, KIT 19.11

Page 283: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

19.6 Some relations between processes (Exercise!)

a) Let W be a Wiener process (WP) on [0,∞) and r > 0. Then

W ∗(t) := ±√rW(t

r

), t ≥ 0, is a WP on [0,∞).

b) Let W be a WP on [0,∞) and r > 0. Then

W (t) :=W (t+ r)−W (r), t ≥ 0, is a WP on [0,∞).

c) Let W be a WP [0,∞). Then (use W (s)/sa.s.−→ 0 as s→∞)

W (t) := tW

(1

t

), t ≥ 0, is a WP on [0,∞).

d) Let W be a WP on [0,∞). Then (use W (s)/sa.s.−→ 0 as s→∞)

B(t) := (1− t)W(

t

1− t

), 0 ≤ t ≤ 1, is a Brownian bridge.

e) Let B be a Brownian bridge and Z ∼ N(0, 1), independent of B. Then

W (t) := B(t) + tZ, 0 ≤ t ≤ 1, is a WP on [0, 1].

Norbert Henze, KIT 19.12

Page 284: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

Brownian Bridge, Wiener process on [0,∞)

19.7 Theorem (Reproduction Theorem for B)

Let B1, B2 be independent Brownian bridges. If a1, a2 ∈ R and a21 + a22 = 1,then

B := a1B1 + a2B2

is a Brownian bridge.

Proof: Notice that P(B(0) = 0) = 1 = P(B(1) = 0).

For each k ≥ 1, for each t1, . . . , tk ∈ [0, 1]:

(B(t1), . . . , B(tk))⊤ ∼ Nk (addition theorem for Nk).

We have EB(t) = 0, 0 ≤ t ≤ 1.

Let K(s, t) := s ∧ t− st (covariance function of a Brownian bridge).

E [B(s)B(t)] = E [(a1B1(s) + a2B2(s))(a1B1(t) + a2B2(t))]

= a21K(s, t) + a1a2E [B1(s)B2(t)] + a2a1E [B2(s)B1(t)] + a22K(s, t)︸ ︷︷ ︸ ︸ ︷︷ ︸= 0 = 0

= (a21 + a22)K(s, t)

= K(s, t), q.e.d. Generalization to n indep. Brownian bridges?

Norbert Henze, KIT 19.13

Page 285: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

The Space D[0, 1]

20 The space D[0, 1]

Motivation: Let U1, U2, . . . be i.i.d. random variables, where U1 ∼ U[0, 1]. Let

Fn(t) :=1

n

n∑

j=1

1Uj ≤ t, 0 ≤ t ≤ 1,

be the empirical distribution function (EDF) of U1, . . . , Un, cf. Chapter 7. Wealready know that, for each k ≥ 1, for each 0 ≤ t1 < . . . < tk ≤ 1:

√n(Fn(t1)− t1

)

...√n(Fn(tk)− tk

)

D−→ Nk

0...0

, (K(ti, tj))1≤i,j≤k

,

whereK(s, t) = min(s, t)− st, 0 ≤ s, t ≤ 1,

is the covariance function of a Brownian bridge B.

Does

(√n(Fn(t)− t

)0≤t≤1

)(as a discontinuous random function) converge

in distribution to B?

Norbert Henze, KIT 20.1

Page 286: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

The Space D[0, 1]

For x : [0, 1]→ R, let

x(t+) := lims↓t

x(s), t ∈ [0, 1) (right-hand limit)

x(t−) := lims↑t

x(s), t ∈ (0, 1] (left-hand limit).

20.1 Definition (The cadlag space D[0, 1])

Let

D[0, 1] := x : [0, 1]→ R|x(t+) = x(t)∀ t ∈ [0, 1), x(t−) exists ∀t ∈ (0, 1]

be the space of real functions on [0, 1] that are right-continuous and have left-hand limits. D := D[0, 1] is called cadlag space.

(french: continue a droite, limites a gauche).

For x ∈ D and T ⊂ [0, 1], let

wx(T ) := w(x, T ) := sups,t∈T

|x(s)− x(t)|.

Notice thatwx(δ) = sup

|u−v|≤δ

|x(u)− x(v)| = sup0≤t≤1−δ

wx([t, t+ δ]).

Norbert Henze, KIT 20.2

Page 287: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

The Space D[0, 1]

20.2 Lemma For each x ∈ D and each ε > 0, there exist points t0, t1, . . . , tkso that

0 = t0 < t1 < . . . < tk = 1

andwx([ti−1, ti)) < ε, i = 1, 2, . . . , k. (20.1)

Proof: Let

t := supt ∈ [0, 1] : [0, t) can be decomposed into finitely many

intervals satisfying (20.1).

Since x(0) = x(0+) we have t > 0. Since x(t−) exists, [0, t) can itself be sodecomposed. t < 1 is impossible because of x(t) = x(t+) in that case.

20.3 Corollary

a) ∀ε > 0:∣∣t ∈ [0, 1] : |x(t)− x(t−)| ≥ ε

∣∣ <∞,

b) ‖x‖∞ := sup0≤t≤1 |x(t)| <∞,

c) x is measurable (uniform limit of simple functions constant over intervals).

x ∈ D can have at most countably many discontinuities. (why?)

Norbert Henze, KIT 20.3

Page 288: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

The Space D[0, 1]

x(t)

tt1 t2 t3 t4 t5 tk

Notice that |x(t)| ≤ max0≤j≤k |x(tj)|+ ε.

Norbert Henze, KIT 20.4

Page 289: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

The Space D[0, 1]

For 0 < δ < 1, let

w′x(δ) := inf

max1≤i≤k

wx[ti−1, ti)∣∣∣k ∈ N, 0 = t0 < t1 < . . . < tk = 1,

min1≤i≤k

(ti − ti−1) > δ

w′x(·) is called the cadlag modulus.

Notice that the infimum is taken over all so-called δ-sparse partitions of [0, 1].

Lemma 20.2 is equivalent to saying that for each x ∈ D, w′x(δ)→ 0 as δ → 0.

We have

w′x(δ) ≤ wx(2δ), if δ < 1

2.

Proof. δ < 1/2 =⇒ ∃ δ-sparse partition with ti − ti−1 ≤ 2δ ∀i.Let

j(x) := sup0<t≤1

|x(t)− x(t−)|

be the maximum absolute jump of x. We have (Exercise!):

wx(δ) ≤ 2w′x(δ) + j(x).

Norbert Henze, KIT 20.5

Page 290: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

The Space D[0, 1]

Since, for 0 ≤ u ≤ 1, xu := 1[u,1] ∈ D and

u 6= v =⇒ ρ(xu, xv) = max0≤t≤1

|xu(t)− xv(t)| = 1,

the metric space (D, ρ) is not separable.

Idea: xu and xv should have a small distance if u ≈ v =⇒ allow for

”deformations of the time scale“. Let

Λ := λ : [0, 1]→ [0, 1] : λ continuous, strictly increasing, bijective.

Λ is a group with respect to composition”“, and λ(0) = 0, λ(1) = 1.

Put I(t) := t, 0 ≤ t ≤ 1, and

dS(x, y) := infλ∈Λ

max (‖x λ− y‖∞, ‖λ− I‖∞) , x, y ∈ D.

≤ ρ(x, y) = ‖x− y‖∞ (put λ := I)

Notice that

dS(xn, x)→ 0 ⇐⇒ ∃λn ∈ Λ : max (‖xn λn − x‖∞, ‖λn − I‖∞) → 0,

‖xn − x‖∞ → 0 =⇒ dS(xn, x)→ 0. (take λn = I)

Norbert Henze, KIT 20.6

Page 291: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

The Space D[0, 1]

Memo: dS(xn, x)→ 0⇐⇒ ∃λn ∈ Λ : max (‖xn λn − x‖∞, ‖λn − I‖∞)→ 0.

⇐⇒ ∃λn ∈ Λ : max (‖xn − x λn‖∞, ‖λn − I‖∞) → 0

We have

|xn(t)− x(t)| ≤ |xn(t)− x (λnt) |+ |x (λnt)− x(t)|≤ ‖xn − x λn‖∞ + wx (‖λn − I‖∞) .

As a consequence, we have:

If dS(xn, x)→ 0, then

xn(t)→ x(t) for each point of continuity t of x,

xn(t)→ x(t) with at most countably many exceptional values t,

‖xn − x‖∞ → 0, if x is continuous.

20.4 Definition and Theorem

dS is a metric on D (so-called Skorokhod metric). (Exercise!)

The metric space (D, dS) is separable but not complete.

Norbert Henze, KIT 20.7

Page 292: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

The Space D[0, 1]

Memo: dS(x, y) = infλ∈Λ max (‖x λ− y‖∞, ‖λ − I‖∞)

20.5 Example ((D, dS) is not complete)

Let an := 1/2n, xn := 1[0,an), n ≥ 1.

0

1

0 1 t•

an

xn(t)

an+1

λn(t)

‖λn − I‖∞ = an+1

xn+1 = 1[0,an+1)

xn+1 λn = 1[0,an+1) λn

= 1[0,an) = xn

=⇒ ‖xn+1 λn − xn‖∞ = 0

=⇒ dS(xn, xn+1) ≤ an+1 = 2−(n+1)

=⇒ (xn) Cauchy sequence (!)

Notice that xn(t)→ 0 for each t > 0. Let x ≡ 0 (∈ D).

We have dS(xn, x) = 1 ∀n. Thus, (xn) has no limit in D.

Norbert Henze, KIT 20.8

Page 293: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

The Space D[0, 1]

Memo: dS(x, y) = infλ∈Λ max (‖x λ− y‖∞, ‖λ − I‖∞)

For λ ∈ Λ, put

‖λ‖ := sups<t

∣∣∣∣ logλ(t)− λ(s)

t− s

∣∣∣∣ (≤ ∞)

dS(x, y) := infλ∈Λ

max (‖λ‖, ‖x λ− y‖∞) .

20.6 Theorem

a) dS and dS are equivalent metrics on D (generate the same topology).

b) The space (D, dS) is separable and complete.

Notice that C = C[0, 1] ⊂ D = D[0, 1].

The Skorokhod topology relativized to C coincides with the uniform topologyon C.

The Borel σ-field in C is the trace B(D) ∩ C, where B(D) is the Borel σ-fieldin D.

Norbert Henze, KIT 20.9

Page 294: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

The Space D[0, 1]

How to characterize relative compactness in D?

w′x(δ)=inf

max1≤i≤k

wx[ti−1, ti)∣∣∣k≥1, 0= t0<t1<. . .<tk=1, min

1≤i≤k(ti−ti−1)>δ

20.7 Theorem (“Arzela–Ascoli in D[0, 1]“)A set A ⊂ D[0, 1] is relatively compact if, and only if,

supx∈A‖x‖∞ <∞, (20.2)

limδ→0

supx∈A

w′x(δ) = 0. (20.3)

0 1t

xn(t)

•n A single t with supx∈A |x(t)| <∞ does not suffice!

Consider A := xn : n ≥ 1, where xn = n1[0.5,1)

We have (20.3) and supn≥1 |xn(0.25)| <∞,

but A is not relatively compact.

Norbert Henze, KIT 20.10

Page 295: Winter term 2016/2017 Norbert Henze, Institute of Stochasticshenze/media/asymptotic-stochastics-ws-2… · Norbert Henze, KIT 0.2. Contents 14. Weak convergence in metric spaces 15

The Space D[0, 1]

Memo: dS(xn, x)→ 0⇐⇒ ∃λn ∈ Λ : max (‖xn λn − x‖∞, ‖λn − I‖∞) → 0.

Memo: λ ∈ Λ =⇒ λ(0) = 0, λ(1) = 1.

Let D be the Borel σ-field over D.

For t1, . . . , tk ∈ [0, 1], let πt1,...,tk : D → Rk, πt1,...,tk(x) = (x(t1), . . . , x(tk)).

20.8 Theorem

a) The projections π_0 and π_1 are continuous.

b) If 0 < t < 1, then: π_t is continuous at x ⇐⇒ t ∈ C(x).

c) π_{t_1,…,t_k} is (D, B^k)-measurable ∀ k ≥ 1, ∀ t_1, …, t_k ∈ [0, 1].

d) If T ⊂ [0, 1], 1 ∈ T and T is dense in [0, 1], then D = σ(π_t : t ∈ T).

For a probability measure P on D, put

T_P := {t ∈ [0, 1] : π_t is continuous P-almost surely}.

We have {0, 1} ⊂ T_P, and [0, 1] \ T_P is countable.


20.9 Theorem  Let P, P_1, P_2, … be probability measures on D. If {P_n : n ≥ 1} is tight and

P_n ∘ π^{−1}_{t_1,…,t_k} →_D P ∘ π^{−1}_{t_1,…,t_k}  ∀ k ≥ 1, ∀ t_1, …, t_k ∈ T_P,

then P_n →_D P.

Let X, X_1, X_2, … be D-valued random elements. Put T_X := T_{P_X}.

20.10 Theorem  Suppose that

a) (X_n(t_1), …, X_n(t_k)) →_D (X(t_1), …, X(t_k)) ∀ k ≥ 1, ∀ t_1, …, t_k ∈ T_X,

b) X(1) − X(1−δ) →_D δ_0 as δ ↓ 0 (⇐⇒ X(1) − X(1−δ) →_P 0).

Suppose further that, for some continuous increasing function H : [0, 1] → R and constants α > 0, β ≥ 0:

c) P( |X_n(s) − X_n(r)| ∧ |X_n(t) − X_n(s)| ≥ γ ) ≤ (1/γ^{4β}) (H(t) − H(r))^{2α}

∀ γ > 0, ∀ n ≥ 1, ∀ r, s, t ∈ [0, 1] such that r ≤ s ≤ t.

Then X_n →_D X.


Memo: c) P( |X_n(s) − X_n(r)| ∧ |X_n(t) − X_n(s)| ≥ γ ) ≤ (1/γ^{4β}) (H(t) − H(r))^{2α}

20.11 Remark  A sufficient condition for c) is

c′) E[ |X_n(s) − X_n(r)|^{2β} · |X_n(t) − X_n(s)|^{2β} ] ≤ (H(t) − H(r))^{2α}.

Proof: On the event {|U| ∧ |V| ≥ γ}, both |U| ≥ γ and |V| ≥ γ, so

1{|U| ∧ |V| ≥ γ} ≤ |U|^{2β} |V|^{2β} / γ^{4β};

now take U = X_n(s) − X_n(r), V = X_n(t) − X_n(s) and expectations.

Let

ι : C → D, x ↦ ι(x) := x

be the canonical injection of C into D.

20.12 Definition (Wiener measure on D)

Let 𝒲 be Wiener measure on the Borel σ-field of C. Then the image 𝒲 ∘ ι^{−1} of 𝒲 under ι is called Wiener measure on D. We shall write W := 𝒲 ∘ ι^{−1}.


20.13 Theorem (Donsker)

Let Z_1, Z_2, … be i.i.d. random variables, E(Z_1) = 0, 0 < σ² := V(Z_1) < ∞. Let S_0 := 0, S_n := ∑_{j=1}^n Z_j, n ≥ 1, and put

X_n(t) := S_⌊nt⌋ / (σ√n), 0 ≤ t ≤ 1.

We then have X_n →_D W in D[0, 1].

Proof: We use Thm. 20.10. To show:

a): (X_n(t_1), …, X_n(t_k)) →_D (W(t_1), …, W(t_k)) ∀ k ≥ 1, ∀ t_1, …, t_k ∈ T_W.

This follows from the multivariate CLT. Notice that T_W = [0, 1] since W(C) = 1.

b): W(1) − W(1−δ) →_D δ_0. This holds since W(1) − W(1−δ) ∼ N(0, δ).

c′): E[ |X_n(s) − X_n(r)|^{2β} · |X_n(t) − X_n(s)|^{2β} ] ≤ (H(t) − H(r))^{2α}.

Notice that X_n has independent increments ⟹

E[ |X_n(s) − X_n(r)|² · |X_n(t) − X_n(s)|² ] = ((⌊ns⌋ − ⌊nr⌋)/n) · ((⌊nt⌋ − ⌊ns⌋)/n).


Memo: X_n(t) = S_⌊nt⌋ / (σ√n)

Memo: E[ |X_n(s) − X_n(r)|² |X_n(t) − X_n(s)|² ] = ((⌊ns⌋ − ⌊nr⌋)/n) · ((⌊nt⌋ − ⌊ns⌋)/n), if 0 ≤ r ≤ s ≤ t.

⟹ E[ |X_n(s) − X_n(r)|² |X_n(t) − X_n(s)|² ] ≤ ( (⌊nt⌋ − ⌊nr⌋)/n )².

If t − r ≥ 1/n, the right-hand side is ≤ 4(t − r)², since ⌊nt⌋ − ⌊nr⌋ ≤ n(t − r) + 1 ≤ 2n(t − r). (!)

If t − r < 1/n, the left-hand side is 0, because then ⌊ns⌋ = ⌊nr⌋ or ⌊nt⌋ = ⌊ns⌋.

Thus, putting β := 1,

E[ |X_n(s) − X_n(r)|^{2β} |X_n(t) − X_n(s)|^{2β} ] ≤ (H(t) − H(r))^{2α},

where H(t) = 2t and α = 1, q.e.d.
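The invariance principle behind Donsker's theorem is easy to probe numerically: a continuous functional such as sup_{0≤t≤1} |X_n(t)| = max_{k≤n} |S_k|/(σ√n) should have (almost) the same distribution for any standardized increment law once n is large. A minimal Monte Carlo sketch in Python (numpy only; sample sizes, seed and the two increment laws are our own choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

def sup_abs_partial_sum(n, draw, reps):
    """Monte Carlo sample of sup_t |X_n(t)| = max_k |S_k| / (sigma*sqrt(n)),
    for standardized increments (sigma = 1) produced by draw(n)."""
    out = np.empty(reps)
    for i in range(reps):
        s = np.cumsum(draw(n))
        out[i] = np.abs(s).max() / np.sqrt(n)
    return out

n, reps = 2000, 4000
rademacher = lambda n: rng.choice([-1.0, 1.0], size=n)   # mean 0, variance 1
cent_exp = lambda n: rng.exponential(1.0, size=n) - 1.0  # mean 0, variance 1

a = sup_abs_partial_sum(n, rademacher, reps)
b = sup_abs_partial_sum(n, cent_exp, reps)
for q in (0.5, 0.9, 0.95):   # the quantiles should nearly agree
    print(q, round(np.quantile(a, q), 3), round(np.quantile(b, q), 3))
```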


21 Empirical Processes: Applications to Statistics

Let X_1, X_2, … be i.i.d. rv's on (Ω, A, P), P(0 ≤ X_1 ≤ 1) = 1.

Let F(t) := P(X_1 ≤ t), F_n(t) := n^{−1} ∑_{j=1}^n 1{X_j ≤ t}.

For fixed n ≥ 1, let

Y_n : Ω → D[0, 1], ω ↦ Y_n^ω,

where

Y_n^ω(t) := √n ( (1/n) ∑_{j=1}^n 1{X_j(ω) ≤ t} − F(t) ), 0 ≤ t ≤ 1.

Let

Y_n(t) := √n (F_n(t) − F(t)), 0 ≤ t ≤ 1.

Then Y_n := (Y_n(t), 0 ≤ t ≤ 1) is a random element of D = D[0, 1].

21.1 Definition (Empirical process)

Y_n is called the empirical process based on X_1, …, X_n.

If X_1 ∼ U(0, 1), then Y_n is called the uniform empirical process.


[Figure: realization of a uniform empirical process t ↦ √n(F_n(t) − t), n = 25]
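Such a path is straightforward to generate; a minimal sketch (numpy only; grid size and seed are arbitrary choices), whose output can be fed to any plotting routine to reproduce a figure like the one above:

```python
import numpy as np

rng = np.random.default_rng(1)

def uniform_empirical_process(n, grid):
    """One path t -> sqrt(n) * (F_n(t) - t) of the uniform empirical process."""
    x = np.sort(rng.uniform(size=n))
    fn = np.searchsorted(x, grid, side="right") / n  # F_n evaluated on the grid
    return np.sqrt(n) * (fn - grid)

t = np.linspace(0.0, 1.0, 501)
path = uniform_empirical_process(25, t)
print(path.min(), path.max())
```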

21.2 Definition (Gaussian random element, Gaussian process)

A random element Y of D is called Gaussian (a Gaussian process) if, for any k ≥ 1 and any t_1, …, t_k ∈ [0, 1], the random vector (Y(t_1), …, Y(t_k)) has a k-variate normal distribution.


21.3 Theorem (Weak convergence of the empirical process)

For the empirical process Y_n(·) = √n(F_n(·) − F(·)), we have

Y_n →_D Y in D,

where Y is a Gaussian random element of D satisfying E Y(t) = 0, 0 ≤ t ≤ 1, and

Cov(Y(s), Y(t)) = F(s) ∧ F(t) − F(s)F(t), 0 ≤ s, t ≤ 1.

Proof: a) Let X_1 ∼ U(0, 1), i.e., F(t) = t. In this case, Y = B, a Brownian bridge on D (the image, under ι, of the Brownian bridge on C). Fidi convergence Y_n →_{D,fidi} B has been shown in an exercise (multivariate CLT). Thus, condition a) of Thm. 20.10 holds. Now,

B(1) − B(1−δ) ∼ N(0, δ(1−δ)) →_D δ_0 as δ ↓ 0.

Hence, condition b) of Thm. 20.10 holds. We show

E[ (Y_n(s) − Y_n(r))² · (Y_n(t) − Y_n(s))² ] ≤ 6(t − r)², 0 ≤ r ≤ s ≤ t ≤ 1.

Thus, putting α = β = 1, H(t) = √6·t, condition c′) of 20.11 holds, and the assertion is true if X_1 ∼ U(0, 1).


Memo: To show: Δ_n(r, s, t) := E[ (Y_n(s) − Y_n(r))² (Y_n(t) − Y_n(s))² ] ≤ 6(t − r)², 0 ≤ r ≤ s ≤ t ≤ 1.

Proof. Let 0 ≤ u ≤ 1. We have

Y_n(u) = √n ( (1/n) ∑_{i=1}^n 1{X_i ≤ u} − u ) = (1/√n) ∑_{i=1}^n (1{X_i ≤ u} − u) ⟹

Y_n(s) − Y_n(r) = (1/√n) ∑_{i=1}^n α_i, where α_i := 1{r < X_i ≤ s} − (s − r),

Y_n(t) − Y_n(s) = (1/√n) ∑_{k=1}^n β_k, where β_k := 1{s < X_k ≤ t} − (t − s).

Δ_n(r, s, t) = (1/n²) ∑_{i,j=1}^n ∑_{k,ℓ=1}^n E[α_i α_j β_k β_ℓ]

            = (1/n²) ( n E[α_1² β_1²] + n(n−1) E[α_1²] E[β_2²] + 2n(n−1) E[α_1 β_1] E[α_2 β_2] )


It follows that

Δ_n(r, s, t) ≤ E[α_1² β_1²] + E[α_1²] E[β_2²] + 2 E[α_1 β_1] E[α_2 β_2].

Memo: α_i = 1{r < X_i ≤ s} − (s − r), β_j = 1{s < X_j ≤ t} − (t − s).

We have

E[α_1² β_1²] = E[ (1{r < X_1 ≤ s} − (s − r))² (1{s < X_1 ≤ t} − (t − s))² ]
             = (s − r)(1 − (s − r))²(t − s)² + (t − s)(s − r)²(1 − (t − s))² + (1 − (t − r))(s − r)²(t − s)²
             ≤ 3(s − r)(t − s) ≤ 3(t − r)².

Likewise,

E[α_1²] E[β_2²] = (s − r)(1 − (s − r))(t − s)(1 − (t − s)) ≤ (s − r)(t − s) ≤ (t − r)²,

E[α_1 β_1] E[α_2 β_2] = … = (−(s − r)(t − s))² ≤ (t − r)²

⟹ Δ_n(r, s, t) ≤ 6(t − r)², q.e.d.


b) For the general case, put X_j := F^{−1}(U_j), where U_1, U_2, … are i.i.d. ∼ U(0, 1). Let

G_n(t) := (1/n) ∑_{j=1}^n 1{U_j ≤ t}, 0 ≤ t ≤ 1,

Z_n(t) := √n (G_n(t) − t), 0 ≤ t ≤ 1.

a) ⟹ Z_n →_D B, where B is a Brownian bridge.

Notice that F_n(t) = G_n(F(t)) and thus Y_n(t) = Z_n(F(t)).

Consider the mapping

ψ : D → D, x ↦ ψx, ψx(t) := x(F(t)).

Recall: d_S(x_n, x) → 0 and x ∈ C ⟹ ‖x_n − x‖_∞ → 0

⟹ ‖ψx_n − ψx‖_∞ → 0

⟹ d_S(ψx_n, ψx) → 0.

a) ⟹ Z_n →_D B. CMT ⟹ Y_n = ψ(Z_n) →_D ψ(B) =: Y. The process Y has the desired properties (Exercise!).


21.4 Goodness-of-fit tests

Let X_1, X_2, … be i.i.d. random variables with unknown distribution function F, where F is assumed to be continuous.

Let F_0 be a known continuous distribution function.

Suppose we want to test the hypothesis

H_0 : F = F_0

against the alternative H_1 : F ≠ F_0. Let

F_n(x) := (1/n) ∑_{j=1}^n 1{X_j ≤ x}, x ∈ R,

be the empirical distribution function of X_1, …, X_n.

A reasonable test statistic is the Kolmogorov test statistic

K_n := sup_{x∈R} |F_n(x) − F_0(x)|.

Let X_(1) < … < X_(n) denote the order statistics of X_1, …, X_n.

Notice that P(X_i ≠ X_j ∀ i ≠ j) = 1 since F is continuous.


Memo: H_0 : F = F_0, K_n := sup_{x∈R} |F_n(x) − F_0(x)|

We have (notice that F_n(X_(j)) = j/n)

K_n = max_{j=1,…,n} max( |F_0(X_(j)) − j/n|, |F_0(X_(j)) − (j−1)/n| ).

Put U_j := F_0(X_j), 1 ≤ j ≤ n. Then, under H_0, U_1, …, U_n are i.i.d. ∼ U(0, 1) and thus

(F_0(X_(1)), …, F_0(X_(n))) ∼ (U_(1), …, U_(n)),

where U_(1), …, U_(n) are the order statistics of U_1, …, U_n.

Consequence: Under H_0, the distribution of K_n does not depend on F_0

⟹ w.l.o.g. X_j ∼ U(0, 1).

Let B_n denote the uniform empirical process based on U_1, …, U_n. Then

√n · K_n ∼ ‖B_n‖_∞ under H_0.
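The order-statistics formula above makes K_n cheap to compute exactly. A minimal sketch (numpy only; the uniform test case is our own choice):

```python
import numpy as np

def kolmogorov_statistic(x, F0):
    """K_n = max_j max(|F0(X_(j)) - j/n|, |F0(X_(j)) - (j-1)/n|)."""
    u = np.sort(F0(np.asarray(x, dtype=float)))   # F0(X_(1)) <= ... <= F0(X_(n))
    n = u.size
    j = np.arange(1, n + 1)
    return np.max(np.maximum(np.abs(u - j / n), np.abs(u - (j - 1) / n)))

rng = np.random.default_rng(2)
x = rng.uniform(size=100)
# sqrt(n) * K_n, to be compared with quantiles of ||B||_inf
print(np.sqrt(100) * kolmogorov_statistic(x, lambda t: t))
```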


We have B_n →_D B as n → ∞, where B is a Brownian bridge.

Furthermore, the mapping h : D → R, defined by

h(x) := ‖x‖_∞,

is almost everywhere continuous with respect to B.

The CMT yields

√n · K_n →_D ‖B‖_∞ under H_0.

21.5 Definition (Kolmogorov distribution)

The distribution of K := ‖B‖_∞ is called the Kolmogorov distribution. We have

P(K ≤ x) = 1 − 2 ∑_{j=1}^∞ (−1)^{j−1} exp(−2j²x²), 0 < x < ∞.
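The series converges very fast, so critical values are easy to tabulate; a minimal sketch (the truncation level is an arbitrary choice):

```python
import numpy as np

def kolmogorov_cdf(x, terms=100):
    """P(K <= x) = 1 - 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 * j^2 * x^2)."""
    j = np.arange(1, terms + 1)
    return 1.0 - 2.0 * np.sum((-1.0) ** (j - 1) * np.exp(-2.0 * j**2 * x**2))

print(kolmogorov_cdf(1.358))  # approx 0.95, so 1.358 is roughly the 5% critical value
```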


In Chapter 8, Example 8.28, we proved (using the theory of U-statistics)

∫_0^1 B_n²(t) dt →_D ω² := 1/6 + ∑_{k=1}^∞ (N_k² − 1)/(k²π²),

where N_1, N_2, … are i.i.d. ∼ N(0, 1).

Since the mapping h : D → R, defined by

h(x) := ∫_0^1 x²(t) dt,

is continuous almost everywhere with respect to B, it follows that

∫_0^1 B²(t) dt ∼ 1/6 + ∑_{k=1}^∞ (N_k² − 1)/(k²π²).

The distribution of ω² is called the Cramér–von Mises distribution.
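The series representation also yields a direct Monte Carlo approximation of the Cramér–von Mises distribution; a minimal sketch (truncation level, repetitions and seed are our own choices):

```python
import numpy as np

rng = np.random.default_rng(3)

def cramer_von_mises_limit(reps=10000, terms=200):
    """Samples of omega^2 = 1/6 + sum_k (N_k^2 - 1)/(k^2 pi^2), truncated at `terms`."""
    k = np.arange(1, terms + 1)
    w = 1.0 / (k * np.pi) ** 2
    n2 = rng.standard_normal((reps, terms)) ** 2
    return 1.0 / 6.0 + (n2 - 1.0) @ w

omega2 = cramer_von_mises_limit()
print(omega2.mean(), np.quantile(omega2, 0.95))  # mean approx 1/6; upper 5% point approx 0.46
```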


21.6 The nonparametric two-sample problem

Let X_1, X_2, …; Y_1, Y_2, … be independent random variables, where X_1, X_2, … are i.i.d. with df F and Y_1, Y_2, … are i.i.d. with df G. F and G are assumed to be continuous but otherwise unknown.

The problem is to test the hypothesis

H_0 : F = G

against the general alternative H_1 : F ≠ G.

A reasonable test statistic is the Kolmogorov–Smirnov test statistic

K_{m,n} := sup_{x∈R} |F_m(x) − G_n(x)|,

where

F_m(x) := (1/m) ∑_{i=1}^m 1{X_i ≤ x}, G_n(x) := (1/n) ∑_{j=1}^n 1{Y_j ≤ x}

are the empirical df's of X_1, …, X_m and Y_1, …, Y_n, respectively.

Under H_0, the distribution of K_{m,n} does not depend on F. (!)
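Since both empirical df's are step functions, the supremum is attained at a pooled data point, which gives an exact computation; a minimal sketch (numpy only; the normal test samples are our own choice):

```python
import numpy as np

def ks_two_sample(x, y):
    """K_{m,n} = sup_x |F_m(x) - G_n(x)|; the sup is attained at a data point."""
    x, y = np.sort(x), np.sort(y)
    m, n = x.size, y.size
    z = np.concatenate([x, y])
    fm = np.searchsorted(x, z, side="right") / m
    gn = np.searchsorted(y, z, side="right") / n
    return np.max(np.abs(fm - gn))

rng = np.random.default_rng(4)
x, y = rng.normal(size=60), rng.normal(size=80)
# scaled statistic, to be compared with ||B||_inf quantiles (see Thm. 21.7 below)
print(np.sqrt(60 * 80 / 140) * ks_two_sample(x, y))
```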


21.7 Theorem  Under H_0, we have

√(mn/(m+n)) · K_{m,n} →_D ‖B‖_∞ as m, n → ∞,

where B is a Brownian bridge.

Proof: W.l.o.g. let X_i, Y_j ∼ U(0, 1). Then, putting

a_{m,n} = √(n/(m+n)), c_{m,n} = −√(m/(m+n)) (⟹ a_{m,n}² + c_{m,n}² = 1),

we have under H_0

√(mn/(m+n)) · K_{m,n} ∼ sup_{0≤t≤1} | a_{m,n} √m (F_m(t) − t) + c_{m,n} √n (G_n(t) − t) | = ‖a_{m,n} A_m + c_{m,n} C_n‖_∞,

where A_m and C_n are independent uniform empirical processes.

By the independence of A_m and C_n, we have (A_m, C_n) →_D (A, C), where A and C are independent Brownian bridges.


Memo: √(mn/(m+n)) · K_{m,n} ∼ ‖a_{m,n} A_m + c_{m,n} C_n‖_∞, (A_m, C_n) →_D (A, C)

Memo: a_{m,n}² + c_{m,n}² = 1

CMT ⟹ aA_m + cC_n →_D aA + cC, a, c ∈ R.

If a² + c² = 1, then, by the reproduction Theorem 19.7, aA + cC ∼ B, where B is a Brownian bridge.

If a_{m,n} → a and c_{m,n} → c, then

‖a_{m,n} A_m + c_{m,n} C_n − (aA_m + cC_n)‖_∞ = ‖(a_{m,n} − a) A_m + (c_{m,n} − c) C_n‖_∞
                                             ≤ |a_{m,n} − a| · ‖A_m‖_∞ + |c_{m,n} − c| · ‖C_n‖_∞ = o_P(1).

The assertion now follows from the subsequence criterion, q.e.d.


21.8 Remark  In the case m = n, the limit distribution of √(mn/(m+n)) · K_{m,n}, i.e., the distribution of ‖B‖_∞, can be obtained by elementary methods (simple symmetric random walk and the reflection principle; see, e.g., Henze, N.: Irrfahrten und andere Zufälle, Springer Spektrum 2013, p. 152 ff.).

In the same way, one can derive the following result.

21.9 Theorem (The distribution of sup_{0≤t≤1} B(t))

We have

P( sup_{0≤t≤1} B(t) ≤ x ) = 1 − exp(−2x²), x ≥ 0.

Proof. Let X_1, …, X_n, …; Y_1, …, Y_n, … be i.i.d. ∼ U(0, 1),

F_n(t) := (1/n) ∑_{j=1}^n 1{X_j ≤ t}, G_n(t) := (1/n) ∑_{j=1}^n 1{Y_j ≤ t}.


Let

U_n(t) := √n (F_n(t) − t), V_n(t) := √n (G_n(t) − t).

By Donsker's Theorem, U_n →_D B_1, V_n →_D B_2, where B_1 and B_2 are independent Brownian bridges. Put

B_n(t) := (1/√2)(U_n(t) − V_n(t)) = √(n/2) (F_n(t) − G_n(t)).

By independence of B_1, B_2 and the CMT,

B_n →_D B := (1/√2)(B_1 − B_2).

Theorem 19.7 ⟹ B is a Brownian bridge.

Let Z_(1) < … < Z_(2n) denote the order statistics of X_1, …, X_n, Y_1, …, Y_n.

Notice that sup_{0≤t≤1} (F_n(t) − G_n(t)) only depends on whether, for each j ∈ {1, …, 2n}, Z_(j) belongs to the X- or the Y-sample

⟹ w.l.o.g. jumps of the EDF's at equidistant points.


[Figure: t ↦ F_n(t) − G_n(t), a step function with jumps ±1/n at the points of the ordered pooled sample]

W.l.o.g. abscissa values 0, 1, 2, …, 2n.

Choose n of the points 0, 1, …, 2n−1 as times for unit "up-steps".

The other points are the times for unit "down-steps".

All (2n choose n) ways of choosing "up-step times" are equiprobable.


Model: W_{2n} := {(a_1, …, a_{2n}) ∈ {−1, 1}^{2n} : a_1 + … + a_{2n} = 0}.

Let P be the uniform distribution on W_{2n}, i.e.,

P(V_1 = a_1, …, V_{2n} = a_{2n}) = 1/(2n choose n), if (a_1, …, a_{2n}) ∈ W_{2n},

and P(V_1 = a_1, …, V_{2n} = a_{2n}) = 0, otherwise.

V_j models the direction of the step (+1 or −1) at time j − 1.

Let S_0 := 0, S_k := V_1 + … + V_k, if 1 ≤ k ≤ 2n. Let

M_{2n} := max_{k=0,…,2n} S_k.

Then

sup_{0≤t≤1} B_n(t) = √(n/2) · sup_{0≤t≤1} (F_n(t) − G_n(t)) ∼ (1/√(2n)) · M_{2n}.


Claim: For each x > 0 we have

lim_{n→∞} P( M_{2n}/√(2n) ≤ x ) = 1 − exp(−2x²). (q.e.d.)

Proof. We first show

P(M_{2n} ≥ k) = (2n choose n+k) / (2n choose n), k = 0, 1, …, n.

There is a bijection between paths from W_{2n} having M_{2n} ≥ k and paths from (0, 0) to (2n, 2k)!

[Figure: reflection of a path in the level k after its first passage to k]


Memo: P(M_{2n} ≥ k) = (2n choose n+k) / (2n choose n), k = 0, 1, …, n.

Fix x > 0. Let k_n := ⌈x√(2n)⌉. We have

P( M_{2n}/√(2n) ≥ x ) = P(M_{2n} ≥ k_n) = (2n choose n+k_n) / (2n choose n) = ∏_{j=0}^{k_n−1} ( 1 − k_n/(n − j + k_n) ).

Use 1 − 1/t ≤ log t ≤ t − 1 to show

log P( M_{2n}/√(2n) ≥ x ) ≤ −k_n ∑_{j=0}^{k_n−1} 1/(n − j + k_n) ≤ −k_n²/(n + k_n),

log P( M_{2n}/√(2n) ≥ x ) ≥ −k_n ∑_{j=0}^{k_n−1} 1/(n − j) ≥ −k_n²/(n − k_n + 1).

Since

lim_{n→∞} k_n²/(n + k_n) = 2x² = lim_{n→∞} k_n²/(n − k_n + 1),

the assertion follows.
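A quick Monte Carlo check of the claim: draw uniform elements of W_{2n} as random permutations of n up-steps and n down-steps and compare the empirical distribution of M_{2n}/√(2n) with 1 − exp(−2x²). A minimal sketch (walk length, repetitions and seed are our own choices):

```python
import numpy as np

rng = np.random.default_rng(5)

def max_of_walk_bridge(n, reps):
    """Samples of M_2n / sqrt(2n) for the uniform random walk bridge of length 2n."""
    out = np.empty(reps)
    steps = np.concatenate([np.ones(n), -np.ones(n)])
    for i in range(reps):
        s = np.cumsum(rng.permutation(steps))   # uniform element of W_2n
        out[i] = max(0.0, s.max()) / np.sqrt(2 * n)  # include S_0 = 0 in the maximum
    return out

m = max_of_walk_bridge(n=500, reps=4000)
for x in (0.5, 1.0, 1.5):
    print(x, round(np.mean(m <= x), 3), round(1 - np.exp(-2 * x**2), 3))
```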


22 Gaussian distributions in separable Hilbert spaces

22.1 Hilbert spaces: Basic facts

Let H be a separable real Hilbert space with scalar (inner) product ⟨x, y⟩, x, y ∈ H, and norm ‖x‖ := √⟨x, x⟩.

If {e_1, e_2, …} is a complete orthonormal system of H, then, for x, y ∈ H:

x = ∑_{k=1}^∞ ⟨x, e_k⟩ e_k  ( :⇐⇒ lim_{n→∞} ‖x − ∑_{k=1}^n ⟨x, e_k⟩ e_k‖ = 0 ),

‖x‖² = ∑_{k=1}^∞ ⟨x, e_k⟩² (Parseval's equality),

⟨x, y⟩² ≤ ‖x‖² · ‖y‖² (Cauchy–Schwarz inequality),

⟨x, y⟩ = ∑_{k=1}^∞ ⟨x, e_k⟩ ⟨y, e_k⟩ (generalized Parseval equality).

The metric ρ(x, y) = ‖x − y‖ renders (H, ρ) a complete separable metric space.

As before, let O denote the system of open subsets of H and B := σ(O) the σ-field of Borel sets.


22.2 Examples

a) H := ℓ² := {x = (x_k)_{k≥1} ∈ R^N : ∑_{k=1}^∞ x_k² < ∞}, ⟨x, y⟩ := ∑_{k=1}^∞ x_k y_k.

b) Let (Ω, A, µ) be a σ-finite measure space, where A = σ(M) for a countable system M ⊂ P(Ω). Let H := L²(Ω, A, µ) be the set of (equivalence classes of) measurable functions f : Ω → R satisfying ∫_Ω f² dµ < ∞. Here,

⟨f, g⟩ = ∫_Ω f g dµ.

Notice that a) is a special case of b). (why?)

Each infinite-dimensional separable Hilbert space is isomorphic to ℓ², since

H ∋ x ⟷ (⟨x, e_k⟩)_{k≥1} ∈ ℓ².


22.3 Definition (Properties of operators)

An operator is a function T : H → H. T is called

linear, if T(ax + by) = aTx + bTy, x, y ∈ H, a, b ∈ R,

bounded, if ‖Tx‖ ≤ K · ‖x‖, x ∈ H, for some K ∈ [0, ∞),

compact, if T(M) is relatively compact whenever M ⊂ H is bounded,

symmetric, if ⟨Tx, y⟩ = ⟨x, Ty⟩, x, y ∈ H,

positive, if ⟨Tx, x⟩ ≥ 0, x ∈ H.

A compact linear operator is called of trace class, if

∑_{k=1}^∞ |⟨e_k, T e_k⟩| < ∞, (22.1)

where {e_1, e_2, …} is a complete orthonormal system (COS) of H.

If T is of trace class, then

tr(T) := ∑_{k=1}^∞ ⟨e_k, T e_k⟩

is called the trace of T.

Condition (22.1) and tr(T) do not depend on the special choice of a COS.


Memo: A linear mapping ℓ : H → R is called a linear functional.

Memo: ℓ is bounded if ‖ℓ‖ := sup{|ℓ(x)| : x ∈ H, ‖x‖ = 1} < ∞.

22.4 Theorem (Riesz's representation theorem)

If ℓ is a bounded linear functional, there is a unique z ∈ H with

ℓ(x) = ⟨z, x⟩, x ∈ H.

Moreover, ‖ℓ‖ = ‖z‖.


22.5 Finite-dimensional sets

Fix a complete orthonormal set {e_1, e_2, …} of H. For k ∈ N, let

π_k : H → R^k, x ↦ π_k(x) := π_k x := (⟨x, e_1⟩, …, ⟨x, e_k⟩).

M := ⋃_{k=1}^∞ π_k^{−1}(B^k) is called the system of finite-dimensional sets.

M is a π-system (!), and we have σ(M) = B.

Proof. Since π_k is continuous (!), we have σ(M) ⊂ B.

For x ∈ H, ε > 0, let B(x, ε) := {y ∈ H : ‖x − y‖ < ε}. For k ≥ 1, we have

C_k(x, ε) := {y ∈ H : ∑_{n=1}^k ⟨x − y, e_n⟩² ≤ ε²}
           = {y ∈ H : ‖π_k x − π_k y‖_2² ≤ ε²} (Euclidean norm in R^k)
           = π_k^{−1}( {z ∈ R^k : ‖z − π_k x‖_2 ≤ ε} ) ∈ M.

Parseval ⟹ ⋂_{k=1}^∞ C_k(x, ε) = {y ∈ H : ‖x − y‖ ≤ ε} ∈ σ(M).

It follows that B(x, ε) = ⋃_{m=1}^∞ {y ∈ H : ‖x − y‖ ≤ ε − 1/m} ∈ σ(M), q.e.d.


Memo: M := ⋃_{k=1}^∞ π_k^{−1}(B^k), M a π-system, σ(M) = B.

22.6 Corollary

a) Let P be a probability measure on B. Then P is uniquely determined by the distributions P ∘ π_k^{−1} on B^k, k ≥ 1, the so-called finite-dimensional distributions of P.

b) Let (Ω, A, P) be a probability space. Suppose X : Ω → H is a random element of H, i.e., an (A, B)-measurable mapping.

Then the distribution P_X = P ∘ X^{−1} of X is uniquely determined by the distributions of the k-dimensional random vectors

(⟨X, e_1⟩, …, ⟨X, e_k⟩), k ≥ 1.

Here, {e_1, e_2, …} is any complete orthonormal set of H.


22.7 Definition (Expectation of an H-valued random element)

Let X be an H-valued random element (on some probability space (Ω, A, P)) satisfying E|⟨X, x⟩| < ∞, x ∈ H. Suppose there is m ∈ H with

⟨m, x⟩ = E⟨X, x⟩ ∀ x ∈ H.

Then m is called the expectation of X, and we write EX = m.

We thus have

⟨EX, x⟩ = E⟨X, x⟩, x ∈ H.

X is called centered, if EX = 0 (the zero vector in H).

Convince yourself that (!)

EX is uniquely determined (if it exists),

if H = R^d, EX is the expectation of the d-dimensional random vector X, as given in Definition 5.1 a).


Memo: ⟨EX, x⟩ = E⟨X, x⟩, x ∈ H.

22.8 Theorem  If E‖X‖ < ∞, then EX exists.

Proof. Fix x ∈ H. We have

E|⟨X, x⟩| ≤ E(‖X‖ · ‖x‖) = ‖x‖ · E‖X‖ < ∞.

Hence, E⟨X, x⟩ exists for each x ∈ H. Put ℓ(x) := E⟨X, x⟩, x ∈ H.

Then ℓ : H → R is a well-defined linear functional on H.

Moreover, |ℓ(x)| ≤ E‖X‖ · ‖x‖, x ∈ H, shows that ℓ is bounded.

Riesz's representation theorem ⟹ there is a unique m ∈ H with

ℓ(x) = ⟨m, x⟩, x ∈ H, q.e.d.

22.9 Remark  We have ‖EX‖ ≤ E‖X‖. Can you prove this fact?


22.10 Theorem (Linearity of expectations)

Let (Ω, A, P) be a probability space, and let L¹ be the set of all H-valued random elements X : Ω → H satisfying E‖X‖ < ∞.

Then L¹ is a vector space (over R), and we have

E[aX + bY] = aEX + bEY, a, b ∈ R, X, Y ∈ L¹.

Proof. We first show that, if X and Y are random elements of H and a, b ∈ R, then aX + bY is a random element of H, i.e., (A, B)-measurable (!). Recall

M := ⋃_{k=1}^∞ π_k^{−1}(B^k), π_k x := (⟨x, e_1⟩, …, ⟨x, e_k⟩), σ(M) = B.

Fix M ∈ M ⟹ ∃ k ∃ B_k ∈ B^k : M = π_k^{−1}(B_k). Now,

(aX + bY)^{−1}(M) = (aX + bY)^{−1}(π_k^{−1}(B_k)) = (π_k(aX + bY))^{−1}(B_k),

and π_k(aX + bY) = a(⟨X, e_1⟩, …, ⟨X, e_k⟩) + b(⟨Y, e_1⟩, …, ⟨Y, e_k⟩) is (A, B^k)-measurable.

Thus, (aX + bY)^{−1}(M) ⊂ A. Since σ(M) = B, the assertion follows.

The rest of the proof is an exercise!


22.11 Theorem

Let X be an H-valued random element. If E‖X‖² < ∞, there is a unique symmetric positive linear operator T : H → H of trace class that satisfies

⟨Tx, y⟩ = E[⟨X, x⟩⟨X, y⟩], x, y ∈ H.

Moreover, we have tr(T) = E‖X‖².

Proof. Fix x, y ∈ H. Cauchy–Schwarz ⟹

E|⟨X, x⟩⟨X, y⟩| ≤ ‖x‖ · ‖y‖ · E‖X‖² < ∞.

Thus,

ℓ(x, y) := E[⟨X, x⟩⟨X, y⟩], x, y ∈ H,

defines a bilinear, symmetric and bounded functional ℓ : H × H → R.

Fix x ∈ H, and put ℓ_x(y) := ℓ(x, y), y ∈ H. Then ℓ_x is a bounded linear functional. Riesz ⟹ there is a unique element Tx := T(x) ∈ H with

E[⟨X, x⟩⟨X, y⟩] = ℓ(x, y) = ℓ_x(y) = ⟨Tx, y⟩, y ∈ H.

T : H → H is symmetric (√), positive (√) and linear (Exercise!).


Memo: ⟨Tx, y⟩ = E[⟨X, x⟩⟨X, y⟩]

Memo: T : H → H linear, symmetric and positive.

T is of trace class, since, for a fixed complete orthonormal set {e_1, e_2, …},

E‖X‖² = E[ ∑_{k=1}^∞ ⟨X, e_k⟩² ] (why?)
      = ∑_{k=1}^∞ E[⟨X, e_k⟩²] (why?)
      = ∑_{k=1}^∞ ⟨Te_k, e_k⟩ < ∞.

Notice that E‖X‖² < ∞ implies E‖X‖ < ∞ (why?).

In particular, EX exists.


22.12 Theorem and Definition (Covariance operator)

Suppose X is an H-valued random element and E‖X‖² < ∞. Then there is a unique positive symmetric linear operator Σ : H → H satisfying

⟨Σx, y⟩ = E[⟨X − EX, x⟩⟨X − EX, y⟩], x, y ∈ H. (22.2)

The operator Σ is called the covariance operator of (the distribution of) X.

Proof. Theorem 22.11 ⟹ ∃! operator T : H → H, T linear, symmetric, positive, of trace class, satisfying

⟨Tx, y⟩ = E[⟨X, x⟩⟨X, y⟩], x, y ∈ H.

Put Σx := Σ(x) := Tx − ⟨EX, x⟩EX, x ∈ H. Then Σ is linear, symmetric, positive, and (22.2) holds. (Exercise!)

22.13 Remark  The covariance operator is of trace class. (Exercise!)


Memo: ⟨Σx, y⟩ = E[⟨X − EX, x⟩⟨X − EX, y⟩], x, y ∈ H.

22.14 Corollary  If H = R^d and thus X is a d-dimensional random vector, the covariance operator Σ of X is equal to the covariance matrix of X, viewed as a linear operator acting on column vectors.

Proof. Let X = (X_1, …, X_d)^⊤, EX = (EX_1, …, EX_d)^⊤.

Putting x = (x_1, …, x_d)^⊤, y = (y_1, …, y_d)^⊤, Σ = (σ_ij)_{1≤i,j≤d}, the left-hand side of the memo becomes

⟨Σx, y⟩ = ∑_{i=1}^d ∑_{j=1}^d σ_ij x_i y_j.

Since

⟨X − EX, x⟩ · ⟨X − EX, y⟩ = ∑_{i=1}^d ∑_{j=1}^d (X_i − EX_i) x_i (X_j − EX_j) y_j,

the right-hand side is

∑_{i=1}^d ∑_{j=1}^d Cov(X_i, X_j) x_i y_j, q.e.d.


22.15 Theorem (Covariance operators and independence)

Let X and Y be independent random elements of H with E‖X‖² < ∞ and E‖Y‖² < ∞. Writing Σ(Z) for the covariance operator of a random element Z, we then have:

Σ(X + Y) = Σ(X) + Σ(Y).

Proof. Fix x, y ∈ H, and put X̃ = X − EX, Ỹ = Y − EY. We have to show

⟨Σ(X + Y)x, y⟩ = ⟨(Σ(X) + Σ(Y))x, y⟩.

Now, since E(X + Y) = EX + EY and

E⟨Ỹ, x⟩ = E⟨Y − EY, x⟩ = E⟨Y, x⟩ − ⟨EY, x⟩ = 0,

we have

⟨Σ(X + Y)x, y⟩ = E[⟨X + Y − E(X + Y), x⟩ · ⟨X + Y − E(X + Y), y⟩]
              = E[⟨X̃ + Ỹ, x⟩ · ⟨X̃ + Ỹ, y⟩]
              = E[⟨X̃, x⟩⟨X̃, y⟩] + E[⟨Ỹ, x⟩⟨X̃, y⟩] + E[⟨X̃, x⟩⟨Ỹ, y⟩] + E[⟨Ỹ, x⟩⟨Ỹ, y⟩]
              = ⟨Σ(X)x, y⟩ + ⟨Σ(Y)x, y⟩ = ⟨(Σ(X) + Σ(Y))x, y⟩,

since the two mixed terms vanish by independence and centering. √


22.16 Definition (Characteristic functional)

Suppose X is an H-valued random element. Then the function

φ_X : H → C, x ↦ φ_X(x) := E[e^{i⟨X, x⟩}] = ∫_H e^{i⟨y, x⟩} P_X(dy)

is called the characteristic functional of (the distribution of) X.

22.17 Theorem (Properties of φ_X)

The characteristic functional has the following properties:

a) φ_X(0) = 1 (0 is the zero vector in H),

b) φ_X is continuous, (why?)

c) φ_X is positive-semidefinite, i.e.,

∑_{k,ℓ=1}^n α_k ᾱ_ℓ φ_X(x_k − x_ℓ) ≥ 0 ∀ n ≥ 1, ∀ x_1, …, x_n ∈ H, ∀ α_1, …, α_n ∈ C,

d) If X and Y are independent, then φ_{X+Y} = φ_X · φ_Y, (why?)

e) φ_X = φ_Y ⇐⇒ X =_D Y.


Proof of c): Notice that

0 ≤ E| ∑_{k=1}^n α_k e^{i⟨X, x_k⟩} |² = E[ ∑_{k,ℓ=1}^n α_k ᾱ_ℓ e^{i⟨X, x_k − x_ℓ⟩} ] = ∑_{k,ℓ=1}^n α_k ᾱ_ℓ φ_X(x_k − x_ℓ).

Proof of e): Let {e_1, e_2, …} be some COS of H. Put

X_k := (⟨X, e_1⟩, …, ⟨X, e_k⟩)^⊤, Y_k := (⟨Y, e_1⟩, …, ⟨Y, e_k⟩)^⊤.

Put x = a_1 e_1 + … + a_k e_k, where a_1, …, a_k ∈ R. Then

E[exp( i ∑_{j=1}^k a_j ⟨X, e_j⟩ )] = E[exp( i ⟨X, ∑_{j=1}^k a_j e_j⟩ )] = φ_X( ∑_{j=1}^k a_j e_j ) = φ_Y( ∑_{j=1}^k a_j e_j )
= E[exp( i ⟨Y, ∑_{j=1}^k a_j e_j⟩ )] = E[exp( i ∑_{j=1}^k a_j ⟨Y, e_j⟩ )],

i.e., φ_{X_k}(a) = φ_{Y_k}(a) ∀ a = (a_1, …, a_k) ∈ R^k ⟹ X_k =_D Y_k. 22.6 ⟹ assertion.


22.18 Proposition (Characteristic function of N_d(m, Σ))

Let m ∈ R^d, Σ ∈ R^{d×d} symmetric, positive-semidefinite. We then have

X ∼ N_d(m, Σ) ⇐⇒ φ_X(t) = E[e^{it^⊤X}] = exp( it^⊤m − (1/2) t^⊤Σt ), t ∈ R^d.

Proof. "⇐" follows from the uniqueness theorem for characteristic functions.

"⟹": 5.8 ⟹ ∃ A : Σ = AA^⊤ and X =_D AY + m, Y ∼ N_d(0, I_d).

Fix t ∈ R^d, and put z := A^⊤t. Notice that ‖z‖² = t^⊤AA^⊤t = t^⊤Σt. Then

φ_X(t) = E[e^{it^⊤(AY + m)}] = e^{it^⊤m} · E[e^{i(A^⊤t)^⊤Y}] = e^{it^⊤m} · E[e^{iz^⊤Y}]
       = e^{it^⊤m} · E[exp( i ∑_{k=1}^d z_k Y_k )] = e^{it^⊤m} · E[ ∏_{k=1}^d exp(iz_k Y_k) ]
       = e^{it^⊤m} ∏_{k=1}^d E[exp(iz_k Y_k)] = e^{it^⊤m} ∏_{k=1}^d exp(−z_k²/2) = e^{it^⊤m} · e^{−‖z‖²/2}, q.e.d.


L⁺_tr(H) := {T : H → H | T linear, bounded, symmetric, positive, of trace class}.

22.19 Definition (Gaussian (normal) distribution in H)

An H-valued random element X has a Gaussian (normal) distribution :⇐⇒

∃ m ∈ H ∃ Σ ∈ L⁺_tr(H) : φ_X(h) = e^{i⟨m, h⟩} exp( −⟨Σh, h⟩/2 ), h ∈ H.

In this case, we write X ∼ N(m, Σ).

If {x ∈ H : Σx = 0} = {0}, the distribution N(m, Σ) is called non-degenerate.

If m = 0, the distribution N(m, Σ) is called centered.

22.20 Theorem (Existence of Gaussian distributions)

For each m ∈ H and Σ ∈ L⁺_tr(H), there is a Gaussian distribution N(m, Σ).

Proof. Σ symmetric and compact ⟹ ∃ COS {e_1, e_2, …} of H and λ_1, λ_2, … ≥ 0 with Σe_k = λ_k e_k, k ≥ 1. Notice that

tr(Σ) = ∑_{k=1}^∞ ⟨Σe_k, e_k⟩ = ∑_{k=1}^∞ λ_k < ∞.


Put m_k = ⟨m, e_k⟩, k ≥ 1. Let

P* := ⨂_{k=1}^∞ N_1(m_k, λ_k)

be the infinite product measure on the product Borel σ-field B^∞ of R^∞.

Notice that ℓ² is a Borel subset of R^∞ = {x = (x_j)_{j≥1} : x_j ∈ R ∀ j ≥ 1}. (!)

Claim 1: P* is concentrated on ℓ², i.e.,

P*( {x ∈ R^∞ : ‖x‖²_{ℓ²} < ∞} ) = 1.

Proof of Claim 1: We have

∫_{R^∞} ‖x‖²_{ℓ²} P*(dx) = ∫_{R^∞} ∑_{k=1}^∞ x_k² P*(dx) = ∑_{k=1}^∞ ∫_{R^∞} x_k² P*(dx) (why?)
= ∑_{k=1}^∞ ∫_R x_k² N_1(m_k, λ_k)(dx_k) = ∑_{k=1}^∞ (λ_k + m_k²)
= tr(Σ) + ‖m‖² < ∞, q.e.d. (why?)


Memo: Probability space (R^∞, B^∞, P*), P* = ⨂_{j=1}^∞ N_1(m_j, λ_j), P*(ℓ²) = 1.

Let P̃ be the restriction of P* to B(ℓ²). Consider the mapping

γ : ℓ² → H, x = (x_j)_{j≥1} ↦ γ(x) := ∑_{j=1}^∞ x_j e_j.

Notice that, for x = (x_j)_{j≥1}, y = (y_j)_{j≥1} ∈ ℓ²,

⟨γ(x), γ(y)⟩ = ⟨ ∑_{j=1}^∞ x_j e_j, ∑_{k=1}^∞ y_k e_k ⟩ = ∑_{j=1}^∞ ∑_{k=1}^∞ x_j y_k ⟨e_j, e_k⟩ = ∑_{j=1}^∞ x_j y_j = ⟨x, y⟩_{ℓ²}.

It follows that γ is an isometry between ℓ² and H. Let

P := γ(P̃) = P̃ ∘ γ^{−1}

be the image (probability) measure of P̃ under the (measurable) mapping γ.


Memo: γ(x) = ∑_{j=1}^∞ x_j e_j, x = (x_j)_{j≥1} ∈ ℓ²; γ^{−1}(h) = (⟨h, e_j⟩)_{j≥1}, h ∈ H.

Claim 2: P (= γ(P̃)) = N(m, Σ). Proof of Claim 2: We show

∫_H e^{i⟨y, h⟩} P(dy) = exp( i⟨m, h⟩ − (1/2)⟨Σh, h⟩ ), h ∈ H. (q.e.d.)

Fix h ∈ H. We have

∫_H e^{i⟨y, h⟩} P(dy) = ∫_{ℓ²} e^{i⟨γ(x), h⟩} P̃(dx) (transformation of integrals)
= ∫_{R^∞} e^{i⟨γ(x), h⟩} P*(dx) (P̃ = P*|_{ℓ²}, P*(ℓ²) = 1)
= ∫_{R^∞} exp( i⟨x, γ^{−1}(h)⟩_{ℓ²} ) P*(dx) (γ isometry)
= ∫_{R^∞} exp( i ∑_{k=1}^∞ x_k ⟨h, e_k⟩ ) P*(dx) (memo)
= lim_{n→∞} ∫_{R^∞} exp( i ∑_{k=1}^n x_k ⟨h, e_k⟩ ) P*(dx). (why?)


Memo: ∫_R e^{its} N_1(a, σ²)(dt) = exp( ias − σ²s²/2 ), Σe_k = λ_k e_k

∫_H e^{i⟨y, h⟩} P(dy) = lim_{n→∞} ∫_{R^∞} exp( i ∑_{k=1}^n x_k ⟨h, e_k⟩ ) P*(dx)
= lim_{n→∞} ∫_{R^∞} ∏_{k=1}^n e^{ix_k⟨h, e_k⟩} P*(dx)
= lim_{n→∞} ∏_{k=1}^n ∫_R e^{ix_k⟨h, e_k⟩} N_1(m_k, λ_k)(dx_k)
= lim_{n→∞} ∏_{k=1}^n exp( im_k⟨h, e_k⟩ − (1/2)λ_k⟨h, e_k⟩² )
= lim_{n→∞} exp( i ∑_{k=1}^n ⟨m, e_k⟩⟨h, e_k⟩ − (1/2) ∑_{k=1}^n λ_k⟨h, e_k⟩² ).

Now,

∑_{k=1}^n ⟨m, e_k⟩⟨h, e_k⟩ → ∑_{k=1}^∞ ⟨m, e_k⟩⟨e_k, h⟩ = ⟨m, h⟩ (generalized Parseval),

∑_{k=1}^n λ_k⟨h, e_k⟩² = ∑_{k=1}^n ⟨Σh, e_k⟩⟨h, e_k⟩ → ∑_{k=1}^∞ ⟨Σh, e_k⟩⟨e_k, h⟩ = ⟨Σh, h⟩.


Memo: X ∼ N(m, Σ) ⇐⇒ E[e^{i⟨X, h⟩}] = exp( i⟨m, h⟩ − ⟨Σh, h⟩/2 ), h ∈ H.

22.21 Theorem (Properties of Gaussian distributions)

Suppose X ∼ N(m, Σ). We then have:

a) ⟨X, h⟩ ∼ N_1(⟨m, h⟩, ⟨Σh, h⟩) ∀ h ∈ H,

b) ∀ k ≥ 1, ∀ h_1, …, h_k ∈ H: (⟨X, h_1⟩, …, ⟨X, h_k⟩)^⊤ has a k-variate normal distribution,

c) E‖X‖² < ∞,

d) EX = m,

e) Σ(X) = Σ.

Proof. a) In the memo, replace h with th, where t ∈ R. Then

φ_{⟨X,h⟩}(t) = E[e^{it⟨X, h⟩}] = E[e^{i⟨X, th⟩}] = exp( i⟨m, h⟩t − ⟨Σh, h⟩t²/2 ), q.e.d.

b) Let a_1, …, a_k ∈ R. In a), put h = a_1h_1 + … + a_kh_k. Then ∑_{j=1}^k a_j⟨X, h_j⟩ has a univariate normal distribution, q.e.d.


c) E‖X‖² < ∞: Since ⟨X, e_k⟩ ∼ N_1(⟨m, e_k⟩, λ_k), we have

E‖X‖² = E[ ∑_{k=1}^∞ ⟨X, e_k⟩² ] = ∑_{k=1}^∞ E[⟨X, e_k⟩²] = ∑_{k=1}^∞ (λ_k + ⟨m, e_k⟩²)
      = ∑_{k=1}^∞ λ_k + ∑_{k=1}^∞ ⟨m, e_k⟩² = tr(Σ) + ‖m‖² < ∞.

d) EX = m: From a) we have E⟨X, h⟩ = ⟨m, h⟩ ∀ h ∈ H, q.e.d.

e) Σ(X) = Σ: To show: ⟨Σx, y⟩ = E[⟨X − EX, x⟩⟨X − EX, y⟩], x, y ∈ H.

Notice that ⟨X, x + y⟩ ∼ N_1(⟨m, x + y⟩, ⟨Σ(x + y), x + y⟩) ⟹

V(⟨X, x⟩ + ⟨X, y⟩) = ⟨Σ(x + y), x + y⟩ = ⟨Σx, x⟩ + ⟨Σy, y⟩ + 2⟨Σx, y⟩
                   = V(⟨X, x⟩) + V(⟨X, y⟩) + 2 Cov(⟨X, x⟩, ⟨X, y⟩)

⟹ ⟨Σx, y⟩ = Cov(⟨X, x⟩, ⟨X, y⟩) = Cov(⟨X − EX, x⟩, ⟨X − EX, y⟩) = E[⟨X − EX, x⟩ · ⟨X − EX, y⟩], q.e.d.


22.22 Theorem (Characterization of Gaussian distributions)

The following assertions are equivalent:

a) X is a Gaussian random element of H,

b) ⟨X, h⟩ has a univariate normal distribution for each h ∈ H.

Notice that b) implies (⟨X, h_1⟩, …, ⟨X, h_k⟩)^⊤ ∼ N_k ∀ k ≥ 1, ∀ h_1, …, h_k ∈ H. (why?)

Corollary 22.6 b) ⟹ P_X is uniquely determined by property b).

Proof. "⟹" follows from Theorem 22.21 a).

"⇐": We have E‖X‖² < ∞ (without proof, not trivial!). Thm. 22.8 and Thm. 22.12 ⟹ m := EX and Σ := Σ(X) exist.

b) ⟹ ∀ h ∈ H ∃ m_h ∈ R, σ_h² ≥ 0 : ⟨X, h⟩ ∼ N_1(m_h, σ_h²) ⟹

φ_X(h) = E[e^{i⟨X, h⟩}] = φ_{⟨X,h⟩}(1) = exp( im_h − σ_h²/2 ).

Now, m_h = E⟨X, h⟩ = ⟨EX, h⟩ = ⟨m, h⟩. Furthermore,

σ_h² = V(⟨X, h⟩) = E[⟨X − EX, h⟩ · ⟨X − EX, h⟩] = ⟨Σh, h⟩, q.e.d.


22.23 Theorem (The distribution of ‖X − m‖²)

Suppose X ∼ N(m, Σ) is a Gaussian random element of H. Then

‖X − m‖² =_D ∑_{j=1}^∞ λ_j N_j²,

where λ_1, λ_2, … are the eigenvalues of the covariance operator Σ, and N_1, N_2, … are i.i.d. standard normal random variables.

Proof. W.l.o.g. let m = 0 and λ_j > 0 for each j.

Thm. 22.20 ⟹ ∃ COS {e_1, e_2, …} of H with Σe_j = λ_j e_j, j ≥ 1.

Let Ñ_j := ⟨X, e_j⟩, j ≥ 1. Fix k ≥ 1. Thm. 22.22 ⟹ (Ñ_1, …, Ñ_k) ∼ N_k.

Notice that E(Ñ_j) = 0, and that

E(Ñ_i Ñ_j) = E[⟨X, e_i⟩⟨X, e_j⟩] = ⟨Σe_i, e_j⟩ = λ_i ⟨e_i, e_j⟩.

Hence, Ñ_1, Ñ_2, … are independent random variables (!), and Ñ_j ∼ N(0, λ_j).

Put N_j := Ñ_j/√λ_j, j ≥ 1. Then N_j ∼ N(0, 1), and

‖X‖² = ∑_{j=1}^∞ ⟨X, e_j⟩² = ∑_{j=1}^∞ Ñ_j² = ∑_{j=1}^∞ λ_j N_j², q.e.d.
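This representation makes the limit laws of L²-type statistics easy to simulate once the eigenvalues are known; a minimal sketch (numpy only; the truncation level and the Brownian-bridge eigenvalues λ_k = 1/(k²π²) are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(6)

def norm_sq_gaussian(lam, reps=10000):
    """Samples of ||X - m||^2 = sum_j lam_j * N_j^2 for a (truncated) eigenvalue sequence."""
    lam = np.asarray(lam, dtype=float)
    return (rng.standard_normal((reps, lam.size)) ** 2) @ lam

k = np.arange(1, 201)
samples = norm_sq_gaussian(1.0 / (k * np.pi) ** 2)  # Brownian bridge in L^2[0,1]
print(samples.mean())  # approx sum lam_k = 1/6, the Cramer-von Mises mean from Chapter 21
```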


22.24 Gaussian processes and Gaussian random elements in L²(R^d, B^d, µ)

Let

H := L² := L²(R^d, B^d, µ), µ a σ-finite measure on B^d,

⟨g, h⟩ := ∫_{R^d} g(t)h(t) µ(dt), g, h ∈ L².

All random elements will be defined on a common probability space (Ω, A, P).

A Gaussian process Z on R^d is a family Z = (Z(t))_{t∈R^d} of random variables Z(·, t) : Ω → R such that, for each k ≥ 1 and each choice of t_1, …, t_k ∈ R^d, the random vector (Z(t_1), …, Z(t_k)) has some k-variate normal distribution.

The distribution of Z is characterized by

the mean function m(t) := EZ(t), t ∈ R^d, and

the covariance function C(s, t) := Cov(Z(s), Z(t)), s, t ∈ R^d.

We assume that Z : Ω × R^d → R is (A ⊗ B^d, B)-measurable.


Memo: m(t) = EZ(t), t ∈ R^d; C(s, t) = Cov(Z(s), Z(t)), s, t ∈ R^d.

Now, assume

∫_{R^d} m²(t) µ(dt) < ∞, ∫_{R^d} C(t, t) µ(dt) < ∞.

Then

∞ > ∫_{R^d} E[Z²(t)] µ(dt) = E[ ∫_{R^d} Z²(t) µ(dt) ] (why?).

It follows that the path t ↦ Z(ω, t) lies in L² for P-almost all ω.

Thus (perhaps after redefining paths on a null set), Ω ∋ ω ↦ Z^ω may be regarded as a random element of H = L²(R^d, B^d, µ), where

Z^ω(t) := Z(ω, t), t ∈ R^d.

Is Z = (Z(t))_{t∈R^d} a Gaussian random element?

According to Thm. 22.22, we have to show:

⟨Z, h⟩ = ∫_{R^d} Z(t)h(t) µ(dt) ∼ N_1 ∀ h ∈ H.

This follows, e.g., by adapting the reasoning given on p. 12 of I. A. Ibragimov/Y. A. Rozanov: Gaussian Random Processes, Springer 1978.


Memo: m(t) = EZ(t), t ∈ R^d; C(s, t) = Cov(Z(s), Z(t)), s, t ∈ R^d.

Memo: ⟨Z, h⟩ = ∫_{R^d} Z(t)h(t) µ(dt)

Notice that

E⟨Z, h⟩ = E[ ∫_{R^d} Z(t)h(t) µ(dt) ] = ∫_{R^d} E[Z(t)] h(t) µ(dt) = ⟨m, h⟩ ∀ h ∈ H

⟹ EZ = m (equality µ-a.e. as functions in L²).

Let Σ be the covariance operator of Z.

Claim: Σ is determined by the covariance function C(s, t).

W.l.o.g., let m = EZ = 0. For g, h ∈ H, we have

⟨Σg, h⟩ = E[⟨Z, g⟩⟨Z, h⟩] = E[ ∫_{R^d} Z(s)g(s) µ(ds) ∫_{R^d} Z(t)h(t) µ(dt) ]
        = E[ ∫_{R^d} ∫_{R^d} Z(s)Z(t) g(s)h(t) µ(ds) µ(dt) ]
        = ∫_{R^d} ∫_{R^d} E[Z(s)Z(t)] g(s)h(t) µ(ds) µ(dt), where E[Z(s)Z(t)] = C(s, t), q.e.d.


23 The central limit theorem in separable Hilbert spaces

Let H be a separable infinite-dimensional Hilbert space with scalar product ⟨x, y⟩, x, y ∈ H, and norm ‖x‖ = √⟨x, x⟩, x ∈ H. Furthermore, let {e_k : k ≥ 1} be a fixed complete orthonormal system of H. For ℓ ∈ N, let

Π_ℓ : H → H, x ↦ Π_ℓ(x) := ∑_{k=1}^ℓ ⟨x, e_k⟩ e_k

(the orthogonal projection onto the linear subspace spanned by e_1, …, e_ℓ).

All H-valued random elements will be defined on (Ω, A, P).

23.1 Theorem (Convergence in distribution of X_n to X)

Suppose X, X_1, X_2, … are H-valued random elements. If

Π_ℓ(X_n) →_D Π_ℓ(X) as n → ∞ for each fixed ℓ ≥ 1, (23.1)

lim_{ℓ→∞} limsup_{n→∞} P(‖X_n − Π_ℓ(X_n)‖ ≥ δ) = 0 for each δ > 0, (23.2)

then X_n →_D X.


Proof. Fix any uniformly continuous bounded f : H → R. To show:

Ef(X_n) → Ef(X) as n → ∞ (cf. Portmanteau theorem).

Fix ε > 0. Uniform continuity ⟹ there is some δ > 0 satisfying

∀ x, y ∈ H: if ‖x − y‖ < δ, then |f(x) − f(y)| < ε. (23.3)

For fixed ℓ ∈ N, we have

|Ef(X_n) − Ef(X)| ≤ |Ef(X_n) − Ef(Π_ℓ(X_n))| + |Ef(Π_ℓ(X_n)) − Ef(Π_ℓ(X))| + |Ef(Π_ℓ(X)) − Ef(X)|
                 =: u_{n,ℓ} + v_{n,ℓ} + w_ℓ.

Dominated convergence ⟹ ∃ ℓ_0 = ℓ_0(ε) so that w_ℓ ≤ ε for each ℓ ≥ ℓ_0.

Put K := sup_{x∈H} |f(x)| < ∞. Then, from (23.3),

u_{n,ℓ} ≤ |E[(f(X_n) − f(Π_ℓ(X_n))) · 1{‖X_n − Π_ℓ(X_n)‖ ≥ δ}]| + |E[(f(X_n) − f(Π_ℓ(X_n))) · 1{‖X_n − Π_ℓ(X_n)‖ < δ}]|
        ≤ 2K · P(‖X_n − Π_ℓ(X_n)‖ ≥ δ) + ε.


Memo 1: Π_ℓ(X_n) →_D Π_ℓ(X) as n → ∞ for each fixed ℓ ≥ 1.

Memo 2: lim_{ℓ→∞} limsup_{n→∞} P(‖X_n − Π_ℓ(X_n)‖ ≥ δ) = 0 for each δ > 0.

For each fixed ℓ ≥ ℓ_0, we thus have

|Ef(X_n) − Ef(X)| ≤ 2K · P(‖X_n − Π_ℓ(X_n)‖ ≥ δ) + 2ε + |Ef(Π_ℓ(X_n)) − Ef(Π_ℓ(X))|,

where the last term → 0 as n → ∞ (Memo 1). Hence

limsup_{n→∞} |Ef(X_n) − Ef(X)| ≤ 2K · limsup_{n→∞} P(‖X_n − Π_ℓ(X_n)‖ ≥ δ) + 2ε.

Memo 2 implies

limsup_{n→∞} |Ef(X_n) − Ef(X)| ≤ 2ε, q.e.d.


23.2 Theorem (Lindeberg–Lévy-type CLT in Hilbert spaces)

Let (Z_j)_{j≥1} be a sequence of i.i.d. H-valued random elements, where H is a separable Hilbert space. Assume E‖Z_1‖² < ∞, and put m := EZ_1, C := Σ(Z_1). Then there is a centered Gaussian element X ∼ N(0, C) of H, and we have

(1/√n) ∑_{j=1}^n (Z_j − m) →_D X as n → ∞.

Proof. W.l.o.g. let m = 0. Put X_n := n^{−1/2}(Z_1 + … + Z_n). Since the covariance operator Σ(·) satisfies Σ(aY) = a²Σ(Y), a ∈ R (!), Thm. 22.15 gives Σ(X_n) = C. From 22.11 and 22.12, we then have

⟨Cx, y⟩ = E[⟨X_n, x⟩⟨X_n, y⟩], n ≥ 1, x, y ∈ H.

Since C ∈ L⁺_tr(H) (see 22.12, 22.13), there is a Gaussian random element X ∼ N(0, C) of H.

Let {e_k : k ≥ 1} be the COS of H satisfying Ce_k = λ_k e_k, k ≥ 1. Recall

∑_{k=1}^∞ ⟨Ce_k, e_k⟩ = ∑_{k=1}^∞ λ_k < ∞.


Memo: ⟨Cx, y⟩ = E[⟨X_n, x⟩⟨X_n, y⟩], ∑_{k=1}^∞ ⟨Ce_k, e_k⟩ < ∞.

We first show (cf. Thm. 23.1)

lim_{ℓ→∞} limsup_{n→∞} P(‖X_n − Π_ℓ(X_n)‖ ≥ δ) = 0 for each δ > 0. (23.2)

(Recall Π_ℓ(x) = ∑_{k=1}^ℓ ⟨x, e_k⟩ e_k.) Fix δ > 0. We have

P(‖X_n − Π_ℓ(X_n)‖ ≥ δ) ≤ (1/δ²) · E[‖X_n − Π_ℓ(X_n)‖²] (why?)
                        = (1/δ²) · E[ ∑_{k=ℓ+1}^∞ ⟨X_n, e_k⟩² ] (why?)
                        = (1/δ²) · ∑_{k=ℓ+1}^∞ E[⟨X_n, e_k⟩⟨X_n, e_k⟩] = (1/δ²) · ∑_{k=ℓ+1}^∞ ⟨Ce_k, e_k⟩.

Hence

limsup_{n→∞} P(‖X_n − Π_ℓ(X_n)‖ ≥ δ) ≤ (1/δ²) · ∑_{k=ℓ+1}^∞ ⟨Ce_k, e_k⟩,

and (23.2) follows.


It remains to show

Π_ℓ(X_n) →_D Π_ℓ(X) as n → ∞ for each fixed ℓ ≥ 1. (23.1)

Memo: Π_ℓ(X_n) := ∑_{k=1}^ℓ ⟨X_n, e_k⟩ e_k, X_n = (1/√n) ∑_{j=1}^n Z_j.

Notice that

Y_n := (⟨X_n, e_1⟩, …, ⟨X_n, e_ℓ⟩)^⊤ = (1/√n) ∑_{j=1}^n (⟨Z_j, e_1⟩, …, ⟨Z_j, e_ℓ⟩)^⊤ =: (1/√n) ∑_{j=1}^n V_j.

V_1, V_2, … are i.i.d. ℓ-dimensional random vectors, with EV_1 = 0.

Since E[⟨Z_1, e_i⟩⟨Z_1, e_j⟩] = ⟨Ce_i, e_j⟩, the covariance matrix of V_1 is Σ̃ := (⟨Ce_i, e_j⟩)_{1≤i,j≤ℓ}. Multivariate CLT ⟹ Y_n →_D N_ℓ(0, Σ̃).

Since Y := (⟨X, e_1⟩, ⟨X, e_2⟩, …, ⟨X, e_ℓ⟩)^⊤ ∼ N_ℓ(0, Σ̃) (!), we have Y_n →_D Y.

Consider the mapping ψ : R^ℓ → H, ψ(x) := ∑_{k=1}^ℓ x_k e_k, x = (x_1, …, x_ℓ).

CMT ⟹ Π_ℓ(X_n) = ψ(Y_n) →_D ψ(Y) = Π_ℓ(X), q.e.d.
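The theorem can be illustrated in ℓ²: take Z_j with coordinates ⟨Z_j, e_k⟩ = ξ_{jk}/k for i.i.d. standardized, non-Gaussian ξ_{jk}, so that ⟨Ce_k, e_k⟩ = 1/k²; then ‖X_n‖² should approach the law ∑_k N_k²/k² of Thm. 22.23. A minimal truncated simulation sketch (all sizes, the uniform increment law and the seed are our own choices):

```python
import numpy as np

rng = np.random.default_rng(7)

def norm_sq_of_sums(n, dim=100, reps=2000):
    """||n^{-1/2} sum_j Z_j||^2 with <Z_j, e_k> = xi_{jk}/k, xi i.i.d., mean 0, var 1."""
    k = np.arange(1, dim + 1)
    out = np.empty(reps)
    for i in range(reps):
        xi = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n, dim))  # var 1
        x = xi.sum(axis=0) / (np.sqrt(n) * k)   # coordinates of X_n
        out[i] = np.sum(x**2)
    return out

sim = norm_sq_of_sums(n=100)
lam = 1.0 / np.arange(1, 101) ** 2
lim = (rng.standard_normal((2000, 100)) ** 2) @ lam  # law of ||X||^2 (Thm. 22.23)
print(np.quantile(sim, [0.5, 0.9]), np.quantile(lim, [0.5, 0.9]))
```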


24 Statistical applications: Weighted L2-statistics

Let X_1, X_2, … be i.i.d. d-variate random vectors on (Ω, A, P).

Let µ be a finite measure on M ∩ B^d, where M ∈ B^d.

For n ≥ 1, let z_n : (R^d)^n × M → R be a measurable function.

Put Z_n(t) := z_n(X_1, …, X_n, t), t ∈ M.

24.1 Definition (One-sample weighted L2-statistic)

The random variable

T_n = ∫_M Z_n²(t) µ(dt)

is called a (one-sample) weighted L2-statistic based on z_n and µ.

Often: µ(dt) = w(t) dt, where w : M → R_{≥0} is measurable.

Motivation: Test some hypothesis H_0 about the unknown distribution P^{X_1} of X_1.


24.2 Example  Testing for normality (Epps and Pulley, 1983)

H_0 : P^{X_1} ∈ {N(a, σ²) : a ∈ R, σ² > 0}. Fix β > 0.

Y_{n,j} := (X_j − X̄_n)/S_n, j = 1, …, n, S_n² = (1/n) ∑_{j=1}^n (X_j − X̄_n)²,

Ψ_n(t) := (1/n) ∑_{j=1}^n exp(itY_{n,j}) ≈ exp(−t²/2) under H_0.

T_{n,β} := n ∫_{−∞}^∞ | Ψ_n(t) − exp(−t²/2) |² w_β(t) dt, where w_β(t) := (1/(β√(2π))) exp(−t²/(2β²)),

so that

T_{n,β} = ∫_{−∞}^∞ Z_n²(t) w_β(t) dt,

where

Z_n(t) = (1/√n) ∑_{j=1}^n [ cos(tY_{n,j}) + sin(tY_{n,j}) − exp(−t²/2) ].

Rejection of H_0 is for large values of T_{n,β}.
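Because w_β is a normal density, the integral defining T_{n,β} can be evaluated in closed form: every term reduces to a Gaussian characteristic-function integral, ∫ cos(at) w_β(t) dt = exp(−β²a²/2). The reduction below is our own algebra, a sketch rather than a formula from the slides:

```python
import numpy as np

def epps_pulley(x, beta=1.0):
    """T_{n,beta} in closed form (our own Gaussian-integration reduction):
    (1/n) sum_{j,k} e^{-b^2 (Yj - Yk)^2 / 2}
      - 2/sqrt(1 + b^2) * sum_j e^{-b^2 Yj^2 / (2 (1 + b^2))}
      + n / sqrt(1 + 2 b^2)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    y = (x - x.mean()) / x.std()   # Y_{n,j}; np.std uses S_n^2 = (1/n) sum (X_j - mean)^2
    b2 = beta**2
    d2 = (y[:, None] - y[None, :]) ** 2
    t1 = np.exp(-b2 * d2 / 2.0).sum() / n
    t2 = 2.0 / np.sqrt(1.0 + b2) * np.exp(-b2 * y**2 / (2.0 * (1.0 + b2))).sum()
    return t1 - t2 + n / np.sqrt(1.0 + 2.0 * b2)

rng = np.random.default_rng(8)
print(epps_pulley(rng.normal(size=100)))       # small under H_0
print(epps_pulley(rng.exponential(size=100)))  # larger under a skewed alternative
```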


24.3 Example  Testing for exponentiality (Baringhaus and Henze, 1991)

H_0 : P^{X_1} ∈ {Exp(λ) : λ > 0}.

Motivation: P^{X_1} is determined by the Laplace transform

L(t) = E(e^{−tX_1}), t ≥ 0.

The Laplace transform L(t) = (1 + t)^{−1} of Exp(1) satisfies the differential equation

L(t) + (1 + t)L′(t) ≡ 0.

Put Y_{n,j} = X_j/X̄_n, j = 1, …, n. Under H_0, we should have

L_n(t) := (1/n) ∑_{j=1}^n e^{−tY_{n,j}} ≈ 1/(1 + t) ⟹ L_n(t) + (1 + t)L_n′(t) ≈ 0.

Z_n(t) = (1/√n) ∑_{j=1}^n e^{−tY_{n,j}} ( 1 − (1 + t)Y_{n,j} ), µ(dt) = e^{−βt} dt,

T_{n,β} = ∫_0^∞ Z_n²(t) e^{−βt} dt.

Rejection is for large values of T_{n,β}.
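T_{n,β} can be evaluated by simple quadrature, since the integrand decays exponentially; a minimal sketch (cutoff, grid and the test samples are our own choices):

```python
import numpy as np

def bh_statistic(x, beta=1.0, tmax=60.0, grid=4000):
    """T_{n,beta} = int_0^inf Z_n(t)^2 e^{-beta t} dt, approximated on [0, tmax]."""
    x = np.asarray(x, dtype=float)
    n = x.size
    y = x / x.mean()                  # Y_{n,j} = X_j / X_bar_n
    t = np.linspace(0.0, tmax, grid)
    # Z_n(t) = n^{-1/2} sum_j e^{-t Y_j} (1 - (1 + t) Y_j)
    z = (np.exp(-np.outer(t, y)) * (1.0 - (1.0 + t)[:, None] * y)).sum(axis=1) / np.sqrt(n)
    f = z**2 * np.exp(-beta * t)
    return np.sum((f[1:] + f[:-1]) / 2.0 * np.diff(t))  # trapezoidal rule

rng = np.random.default_rng(9)
print(bh_statistic(rng.exponential(size=100)))  # small under H_0
print(bh_statistic(rng.uniform(size=100)))      # larger under the alternative
```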


24.4 Example  Testing for reflected symmetry (Henze, Klar, Meintanis, 2003)

Let X_1, X_2, … be d-variate random vectors.

H_0 : X_1 − a =_D a − X_1 for some a ∈ R^d.

Notice that X_1 =_D −X_1 ⇐⇒ E[sin(t^⊤X_1)] = 0 ∀ t ∈ R^d.

Put

Y_{n,j} := S_n^{−1/2}(X_j − X̄_n), S_n := n^{−1} ∑_{j=1}^n (X_j − X̄_n)(X_j − X̄_n)^⊤,

Z_n(t) := (1/√n) ∑_{j=1}^n sin(t^⊤Y_{n,j}) ≈ 0 under H_0.

Put µ(dt) = exp(−β‖t‖²) dt, β > 0. Then

T_{n,β} := ∫_{R^d} Z_n²(t) exp(−β‖t‖²) dt
        = ( π^{d/2} / (2β^{d/2} n) ) ∑_{j,k=1}^n [ exp( −‖Y_{n,j} − Y_{n,k}‖²/(4β) ) − exp( −‖Y_{n,j} + Y_{n,k}‖²/(4β) ) ].
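The double-sum form makes T_{n,β} directly computable; a minimal sketch (numpy only; the eigendecomposition route to S_n^{−1/2} and the test samples are our own choices):

```python
import numpy as np

def symmetry_statistic(x, beta=1.0):
    """T_{n,beta} for reflected symmetry via the double-sum formula above."""
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    xc = x - x.mean(axis=0)
    w, v = np.linalg.eigh(xc.T @ xc / n)      # S_n = V diag(w) V^T
    y = xc @ (v @ np.diag(w**-0.5) @ v.T)     # rows Y_{n,j} = S_n^{-1/2}(X_j - X_bar)
    dm = ((y[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
    dp = ((y[:, None, :] + y[None, :, :]) ** 2).sum(axis=2)
    c = np.pi ** (d / 2) / (2.0 * beta ** (d / 2) * n)
    return c * (np.exp(-dm / (4.0 * beta)) - np.exp(-dp / (4.0 * beta))).sum()

rng = np.random.default_rng(10)
print(symmetry_statistic(rng.normal(size=(100, 2))))       # symmetric: small
print(symmetry_statistic(rng.exponential(size=(100, 2))))  # skewed: larger
```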


Further examples: Testing for

bivariate exponential distribution (Alba-Fernández, Jiménez-Gamero, 2015)

bivariate Poisson (Novoa-Muñoz, Jiménez-Gamero, 2014)

Gamma distribution (Ebner et al., 2012)

bivariate extreme value copulas (Genest et al., 2011)

skew normal distribution (Meintanis, 2010)

inverse Gaussian distribution (Fragiadakis et al., 2009)

Marshall–Olkin distribution (Meintanis, 2007)

Laplace distribution (Meintanis, 2004)

Cauchy distribution (Gürtler, Henze, 2000)

Poisson distribution (Rueda et al., 1991)

etc.


Memo: T_n = ∫_M Z_n²(t) µ(dt).

We assume that Z_n is a random element of H := L²(M, M ∩ B^d, µ). Then

T_n = ‖Z_n‖².

Suppose one can prove

Z_n →_D Z under H_0 in H,

where Z ∼ N(0, C). Then, by the CMT,

T_n →_D ‖Z‖² under H_0.

Thm. 22.23 ⟹ ‖Z‖² ∼ ∑_{j=1}^∞ λ_j N_j², where λ_1, λ_2, … are the positive eigenvalues of C, and N_1, N_2, … are i.i.d. standard normal random variables.

In each of the examples,

Z_n(t) = (1/√n) ∑_{j=1}^n f(X_j, t, ϑ̂_n),

where ϑ̂_n = ϑ̂_n(X_1, …, X_n) is a suitable estimator of ϑ, and

E_ϑ[f(X_1, t, ϑ)] = 0 ∀ ϑ ∈ Θ, ∀ t ∈ M.


Memo: T_n = ∫_M Z_n²(t) µ(dt), H = L²(M, M ∩ B^d, µ),

Memo: Z_n(t) = (1/√n) ∑_{j=1}^n f(X_j, t, ϑ̂_n).

Suppose H_0 does not hold. Then, typically, there is a z ∈ H, z ≠ 0, so that

(1/n) ∑_{j=1}^n f(X_j, ·, ϑ̂_n) →_P z(·) in H,

i.e.,

‖ (1/n) ∑_{j=1}^n f(X_j, ·, ϑ̂_n) − z(·) ‖ →_P 0.

It follows that

T_n/n = ‖ (1/n) ∑_{j=1}^n f(X_j, ·, ϑ̂_n) ‖² →_P Δ := ‖z‖² = ∫_M z²(t) µ(dt) > 0.

Hence, T_n →_P ∞ ⟹ consistency against each such alternative.


Memo: T_n = ‖Z_n‖², Z_n(t) = (1/√n) ∑_{j=1}^n f(X_j, t, ϑ̂_n), Z_n/√n →_P z ≠ 0.

Memo: T_n/n →_P Δ = ‖z‖² > 0.

Put Z̄_n := Z_n/√n. Notice that

√n (T_n/n − Δ) = √n (‖Z̄_n‖² − ‖z‖²)
              = √n ⟨Z̄_n − z, Z̄_n + z⟩
              = √n ⟨Z̄_n − z, 2z + Z̄_n − z⟩
              = 2⟨√n(Z̄_n − z), z⟩ + (1/√n) ‖√n(Z̄_n − z)‖²
              = 2⟨V_n, z⟩ + (1/√n) ‖V_n‖², where V_n := √n(Z̄_n − z).


Memo: √n (T_n/n − Δ) = 2⟨V_n, z⟩ + (1/√n) ‖V_n‖².

24.5 Theorem  If V_n →_D V for some centered Gaussian random element V of H, then

√n (T_n/n − Δ) →_D 2⟨V, z⟩ ∼ N(0, σ²),

where, putting K(s, t) = E[V(s)V(t)],

σ² = 4 E[⟨V, z⟩²] = 4 ∫_M ∫_M K(s, t) z(s)z(t) µ(ds) µ(dt).

24.6 Corollary  If σ̂_n² is a consistent estimator of σ² > 0, then

(√n/σ̂_n) (T_n/n − Δ) →_D N(0, 1).

Further reading: Baringhaus, L., Henze, N., Ebner, B.: The Limit Distribution of Weighted L2-Goodness-of-Fit Statistics under Fixed Alternatives, with Applications (2016). Ann. Inst. Statist. Math. doi:10.1007/s10463-016-0567-8
