
Topics in Probability Theory and Stochastic Processes

Steven R. Dunbar
Department of Mathematics
203 Avery Hall
University of Nebraska-Lincoln
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466

Law of the Iterated Logarithm

Rating

Mathematicians Only: prolonged scenes of intense rigor.


Section Starter Question

Key Concepts

1. The Law of the Iterated Logarithm tells very precisely how far the number of successes in a coin-tossing game will make excursions from the average value.

2. The Law of the Iterated Logarithm is a high point among increasingly precise limit theorems characterizing how far the number of successes in a coin-tossing game will make excursions from the average value. The sequence of theorems starts with the Strong Law of Large Numbers and the Central Limit Theorem, continues through Hausdorff's Estimate and the Hardy-Littlewood Estimate, and leads to the Law of the Iterated Logarithm.

3. Khinchin's Law of the Iterated Logarithm says: almost surely, for all $\epsilon > 0$, there exist infinitely many $n$ such that
$$S_n - np > (1-\epsilon)\sqrt{n}\sqrt{2p(1-p)\ln(\ln(n))},$$
and furthermore, almost surely, for all $\epsilon > 0$, for every $n$ larger than a threshold value $N$,
$$S_n - np < (1+\epsilon)\sqrt{n}\sqrt{2p(1-p)\ln(\ln(n))}.$$


Vocabulary

1. The limsup, an abbreviation for limit superior, is a refined and generalized notion of limit: it is the largest dependent-variable subsequence limit. That is, among all subsequences of independent-variable values tending to some independent-variable value, usually infinity, there are corresponding dependent-variable subsequences. Some of these dependent-variable subsequences have limits, and among all these, the largest is the limsup.

2. The liminf, an abbreviation for limit inferior, is analogous: it is the least of all dependent-variable subsequence limits.

3. Khinchin's Law of the Iterated Logarithm says: almost surely, for all $\epsilon > 0$, there exist infinitely many $n$ such that
$$S_n - np > (1-\epsilon)\sqrt{n}\sqrt{2p(1-p)\ln(\ln(n))},$$
and furthermore, almost surely, for all $\epsilon > 0$, for every $n$ larger than a threshold value $N$,
$$S_n - np < (1+\epsilon)\sqrt{n}\sqrt{2p(1-p)\ln(\ln(n))}.$$

Mathematical Ideas

Overview

We again consider the number of successes in a coin-tossing game. That is, we consider the sum $S_n$ where the independent, identically distributed random variables in the sum $S_n = X_1 + \cdots + X_n$ are the Bernoulli random variables $X_i = +1$ with probability $p$ and $X_i = 0$ with probability $q = 1-p$. Note that the mean is $\mu = p$ and the variance is $\sigma^2 = p(1-p)$ for each of the summands $X_i$.
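To fix ideas, here is a minimal R sketch (in the style of the script in the Scripts section below; the variable names are ours, chosen for this illustration) that simulates one realization of $S_n$:

## One realization of S_n = X_1 + ... + X_n with Bernoulli(p) summands
p <- 0.5
n <- 1000
X <- as.numeric(runif(n) <= p)   # X_i = 1 with probability p, 0 otherwise
S <- cumsum(X)                   # S[j] is S_j for j = 1, ..., n
c(expected = n*p, observed = S[n], sd = sqrt(n*p*(1-p)))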


The Strong Law of Large Numbers says that
$$\lim_{n\to\infty} \frac{S_n - np}{n} = 0$$
with probability 1 in the sample space of all possible coin flips. This says the denominator $n$ is "too strong": it "condenses out" all variation in the sum $S_n$.

The Central Limit Theorem applied to this sequence of coin flips says
$$\lim_{n\to\infty} \frac{S_n - np}{\sqrt{np(1-p)}} = Z,$$
where $Z \sim N(0,1)$ is a normal random variable and the limit is interpreted as convergence in distribution. In fact, this implies that for large $n$ about 68% of the points in the sample space of all possible coin flips satisfy
$$\left|\frac{S_n - np}{\sqrt{np(1-p)}}\right| \le 1,$$
and about 95% of the points in the sample space of all possible coin flips satisfy
$$\left|\frac{S_n - np}{\sqrt{np(1-p)}}\right| \le 2.$$
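These percentages are easy to check empirically; here is a small R sketch we add for illustration, using only base R:

## Empirical check of the 68% and 95% claims for n = 10000 fair coin flips
p <- 0.5; n <- 10000; m <- 5000
Sn <- rbinom(m, n, p)                    # m independent copies of S_n
Z <- (Sn - n*p)/sqrt(n*p*(1-p))          # Central Limit Theorem scaling
c(within1 = mean(abs(Z) <= 1), within2 = mean(abs(Z) <= 2))  # about 0.68 and 0.95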

This says the denominator $\sqrt{n}$ is "too weak": it does not condense out enough information. In fact, using the Kolmogorov zero-one law and the Central Limit Theorem, almost surely
$$\liminf_{n\to\infty} \frac{S_n - np}{\sqrt{np(1-p)}} = -\infty,$$
and almost surely
$$\limsup_{n\to\infty} \frac{S_n - np}{\sqrt{np(1-p)}} = +\infty.$$

The Strong Law and the Central Limit Theorem together suggest that "somewhere in between $n$ and $\sqrt{n}$" we might be able to make stronger statements about convergence and the variation in the sequence $S_n$.


In fact, Hausdorff's estimate tells us:
$$\lim_{n\to\infty} \left|\frac{S_n - np}{n^{1/2+\epsilon}}\right| = 0$$
with probability 1 in the sample space of all possible coin flips, for all values of $\epsilon > 0$. This says the denominator $n^{1/2+\epsilon}$ is "still too strong": it condenses out too much information.

Even better, Hardy and Littlewood's estimate tells us:
$$\limsup_{n\to\infty} \left|\frac{S_n - np}{\sqrt{n\ln n}}\right| \le \text{constant}$$
with probability 1 in the sample space of all possible coin flips. In a way, this says $\sqrt{n\ln n}$ is "still a little too strong": it condenses out most information.

Khinchin's Law of the Iterated Logarithm has a denominator that is "just right." It tells us very precisely how the deviations of the sums from the mean vary with $n$. Using a method due to Erdős, it is possible to refine the law even more, but for these notes a refinement is probably past the point of diminishing returns.

Like the Central Limit Theorem, the Law of the Iterated Logarithm illustrates in an astonishingly precise way that even completely random sequences obey precise mathematical laws.

Khinchin's Law of the Iterated Logarithm says that: almost surely, for all $\epsilon > 0$, there exist infinitely many $n$ such that
$$S_n - np > (1-\epsilon)\sqrt{np(1-p)}\sqrt{2\ln(\ln(n))},$$
and furthermore, almost surely, for all $\epsilon > 0$, for every $n$ larger than a threshold value $N$ depending on $\epsilon$,
$$S_n - np < (1+\epsilon)\sqrt{np(1-p)}\sqrt{2\ln(\ln(n))}.$$
These appear in a slightly non-standard way, with the additional factor $\sqrt{2\ln\ln n}$ times the standard deviation from the Central Limit Theorem, to emphasize the similarity to and the difference from the Central Limit Theorem.


Theorem 1 (Law of the Iterated Logarithm). With probability 1:
$$\limsup_{n\to\infty} \frac{S_n - np}{\sqrt{2np(1-p)\ln(\ln(n))}} = 1.$$

This means that with probability 1, for any $\epsilon > 0$, only finitely many of the events
$$S_n - np > (1+\epsilon)\sqrt{n}\sqrt{2p(1-p)\ln(\ln(n))}$$
occur; on the other hand, with probability 1,
$$S_n - np > (1-\epsilon)\sqrt{n}\sqrt{2p(1-p)\ln(\ln(n))}$$
occurs for infinitely many $n$.

For reasons of symmetry, for $\epsilon > 0$, the inequality
$$S_n - np < -(1+\epsilon)\sqrt{n}\sqrt{2p(1-p)\ln(\ln(n))}$$
can only occur for finitely many $n$; while
$$S_n - np < -(1-\epsilon)\sqrt{n}\sqrt{2p(1-p)\ln(\ln(n))}$$
must occur for infinitely many $n$. That is,
$$\liminf_{n\to\infty} \frac{S_n - np}{\sqrt{2np(1-p)\ln(\ln(n))}} = -1$$

with probability 1.

Compare the Law of the Iterated Logarithm to the Central Limit Theorem. The Central Limit Theorem says that $(S_n - np)/\sqrt{np(1-p)}$ is approximately distributed as a $N(0,1)$ random variable for large $n$. Therefore, for a large but fixed $n$, there is probability about $1/6$ that the value of $(S_n - np)/\sqrt{np(1-p)}$ exceeds one standard deviation, that is, $S_n - np > \sqrt{np(1-p)}$. For a fixed but large $n$, with probability about $0.025$, $(S_n - np)/\sqrt{np(1-p)}$ exceeds two standard deviations, that is, $S_n - np > 2\sqrt{np(1-p)}$. The Law of the Iterated Logarithm gives the more precise information that there are infinitely many $n$ such that
$$S_n - np > (1-\epsilon)\sqrt{2np(1-p)\ln(\ln(n))}$$


for any $\epsilon > 0$. The Law of the Iterated Logarithm does not tell us how long we will have to wait between such repeated crossings, however, and the wait can be very, very long indeed, although it must (with probability 1) eventually occur again. Moreover, the Law of the Iterated Logarithm tells us in addition that
$$S_n - np < -(1-\epsilon)\sqrt{2np(1-p)\ln(\ln(n))}$$

must occur for infinitely many $n$.

Khinchin's Law of the Iterated Logarithm also applies to the cumulative fortune in a coin-tossing game, or equivalently, the position in a random walk. Consider the independent Bernoulli random variables $Y_i = +1$ with probability $p$ and $Y_i = -1$ with probability $q = 1-p$. The mean is $\mu = 2p-1$ and the variance is $\sigma^2 = 4p(1-p)$ for each of the summands $Y_i$. Then consider the sum $T_n = Y_1 + \cdots + Y_n$ with mean $(2p-1)n$ and variance $4np(1-p)$. Since $Y_n = 2X_n - 1$, then $T_n = 2S_n - n$ and $S_n = \frac{1}{2}T_n + \frac{n}{2}$. Then applying the Law of the Iterated Logarithm says that with probability 1, for any $\epsilon > 0$, only finitely many of the events
$$|T_n - n(2p-1)| > (1+\epsilon)\,2\sqrt{n}\sqrt{2p(1-p)\ln(\ln(n))}$$
occur; on the other hand, with probability 1,
$$|T_n - n(2p-1)| > (1-\epsilon)\,2\sqrt{n}\sqrt{2p(1-p)\ln(\ln(n))}$$
occurs for infinitely many $n$. This means that the centered fortune must (with probability 1) oscillate back and forth across the net zero axis infinitely often, crossing the upper and lower boundaries
$$\pm(1-\epsilon)\,2\sqrt{2p(1-p)n\ln(\ln(n))}.$$
The statement puts some strength behind an understanding of the long-term swings back and forth in value of a random process. It also implies a form of recurrence: a symmetric random walk ($p = 1/2$) must visit every integer value.
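The following small R sketch (our schematic illustration in the spirit of Figure 2, not part of the original script) plots one fair-coin fortune path against the inner boundaries:

## One 2000-step fair random walk with boundaries +/-(1-eps)*2*sqrt(2p(1-p)n ln(ln n))
p <- 0.5; n <- 2000; eps <- 0.1
Y <- sample(c(-1, 1), n, replace=TRUE)
fortune <- cumsum(Y)
steps <- 3:n                     # start at 3 so that log(log(steps)) is defined
bound <- (1 - eps)*2*sqrt(2*p*(1-p)*steps*log(log(steps)))
plot(steps, fortune[steps], type="l", xlab="steps", ylab="fortune")
lines(steps, bound, col="blue")
lines(steps, -bound, col="blue")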

The Law of the Iterated Logarithm for Bernoulli trials stated here is a special case of an even more general theorem first formulated by Kolmogorov in 1929. It is also possible to formulate even stronger and more general theorems! The proof here uses the Large Deviations and Moderate Deviations results with the Borel-Cantelli Lemmas. In another direction, the Law of the Iterated Logarithm can be proved using invariance theorems, so it is distantly related to the Central Limit Theorem.

Figure 1 gives an impression of the growth of the function in the Law of the Iterated Logarithm compared to the square root function. Figure 2 gives an impression of the Law of the Iterated Logarithm by showing a piecewise linearly connected graph of 2000 steps of $S_n - np$ with $p = q = 1/2$. In this figure, the random walk must return again to cross the blue curve with $1-\epsilon = 0.9$ infinitely many times, but may only cross the red curve with $1+\epsilon = 1.1$ finitely many times. Of course, this is only a schematic impression, since a single random walk (possibly atypical, from the negligible set!) on the finite interval $0 \le n \le 2000$ can only suggest the almost sure infinitely many crossings of $(1-\epsilon)\alpha(x)$ for any $\epsilon > 0$.

Figure 3 gives a comparison of impressions of four of the limit theorems. The individual figures deliberately are "spaghetti graphs" to give an impression of the ensemble of sample paths. Each figure shows a different scaling of the same 15 sample paths for a sequence of 100,000 fair coin flips, each path a different color. Note that the steps axis has a logarithmic scale, meaning that the shape of the paths is distorted, although it still gives an impression of the random sums. The top left figure shows $S_n/n - p$ converging to 0 for all paths, in accord with the Strong Law of Large Numbers. The top right figure plots the scaling $(S_n - np)/\sqrt{2p(1-p)n}$. For large values of steps, the values over all paths form a distribution ranging from about $-2$ to $2$, consistent with the Central Limit Theorem. The lower left figure plots $(S_n - np)/n^{0.6}$ as an illustration of Hausdorff's Estimate with $\epsilon = 0.1$. It appears that the scaled paths are very slowly converging to 0; the range for $n = 100{,}000$ is within $[-0.5, 0.5]$. The lower right figure shows $(S_n - np)/\sqrt{2p(1-p)n\ln(\ln(n))}$ along with lines $\pm 1$ to suggest the conclusions of the Law of the Iterated Logarithm. It suggests that all paths are usually in the range $[-1, 1]$, but with each path making a few excursions outside the range.

Figure 1: The iterated logarithm function $\alpha(x) = \sqrt{2p(1-p)x\ln(\ln(x))}$ in green, along with the functions $(1+\epsilon)\alpha(x)$ in red and $(1-\epsilon)\alpha(x)$ in blue, with $\epsilon = 0.1$. For comparison, the square root function $\sqrt{2p(1-p)x}$ is in black.

Figure 2: An impression of the Law of the Iterated Logarithm using a piecewise linearly connected graph of 2000 steps of $S_n - np$ with $p = q = 1/2$, with the blue curve $(1-\epsilon)\alpha(x)$ and the red curve $(1+\epsilon)\alpha(x)$ for $\epsilon = 0.1$ and $\alpha(x) = \sqrt{2p(1-p)x\ln(\ln(x))}$.

Figure 3: A comparison of four of the limit theorems. The individual figures deliberately are "spaghetti graphs" to give an impression of the ensemble of sample paths. Each figure shows a different scaling of the same 15 sample paths for a sequence of 100,000 fair coin flips, each path a different color. Note that the steps axis has a logarithmic scale, meaning that the shape of the sample paths is distorted.

Hausdorff’s Estimate

Theorem 2 (Hausdorff's Estimate). Almost surely, for any $\epsilon > 0$,
$$S_n - np = o\left(n^{\epsilon + 1/2}\right) \quad \text{as } n \to \infty.$$

Proof. The proof resembles the proof of the Strong Law of Large Numbers for independent random variables with mean 0 and uniformly bounded 4th moments. That proof showed that, using the independence and identical distribution assumptions,
$$E\left[S_n^4\right] = nE\left[X_1^4\right] + 3n(n-1)E\left[X_1^2\right]^2 \le nC + 3n^2\sigma^4 \le C_1 n^2.$$

Then adapting the proof of the Markov and Chebyshev inequalities with fourth moments,
$$\frac{E\left[|S_n|^4\right]}{n^4} \le \frac{C_1 n^2}{n^4}.$$
Use the Corollary that if $\sum_{n=1}^\infty E[|X_n|]$ converges, then the sequence $(X_n)_{n\ge1}$ converges almost surely to 0. By comparison, $\sum_{n=1}^\infty \frac{E[|S_n|^4]}{n^4}$ converges, so that $S_n/n \to 0$ a.s. Using the same set of ideas,
$$\frac{E\left[|S_n|^4\right]}{n^{4\alpha}} \le \frac{C_1 n^2}{n^{4\alpha}},$$
which is the general term of a convergent series provided that $4\alpha - 2 > 1$, that is, $\alpha > 3/4$. Then using the same lemma, $S_n/n^\alpha \to 0$ for $\alpha > 3/4$, a simple version of Hausdorff's Estimate.

Now adapt this proof to higher moments. Let $k$ be a fixed positive integer. Recall the definition
$$R_n(\omega) = S_n(\omega) - np = \sum_{j=1}^n (X_j - p) = \sum_{j=1}^n X'_j, \quad\text{where } X'_j = X_j - p,$$
and consider $E\left[R_n^{2k}\right]$. Expanding the power $R_n^{2k}$ results in a sum of products of the individual random variables $X'_i$ of the form $X'_{i_1} X'_{i_2} \cdots X'_{i_{2k}}$. Each product $X'_{i_1} X'_{i_2} \cdots X'_{i_{2k}}$ results from a selection, or mapping, from the indices $\{1, 2, \ldots, 2k\}$ to the set $\{1, \ldots, n\}$. Note that if an index $j \in \{1, \ldots, n\}$ is selected only once, so that $X'_j$ appears only once in the product, then $E\left[X'_{i_1} X'_{i_2} \cdots X'_{i_{2k}}\right] = 0$ by independence. Further notice that for all sets of indices,
$$E\left[X'_{i_1} X'_{i_2} \cdots X'_{i_{2k}}\right] \le 1.$$

Thus
$$E\left[R_n^{2k}\right] = \sum_{1\le i_1,\ldots,i_{2k}\le n} E\left[X'_{i_1} X'_{i_2} \cdots X'_{i_{2k}}\right] \le N(k,n),$$
where $N(k,n)$ is the number of functions from $\{1,\ldots,2k\}$ to $\{1,\ldots,n\}$ that take each value at least twice. Let $M(k)$ be the number of partitions of $\{1,\ldots,2k\}$ into subsets each containing at least two elements. If $P$ is such a partition, then $P$ has at most $k$ elements, so the number of functions that are constant on each element of $P$ is at most $n^k$. Thus, $N(k,n) \le n^k M(k)$.
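For example, with $k = 2$, the partitions of $\{1,2,3,4\}$ into subsets each containing at least two elements are $\{1,2,3,4\}$ itself and the three pairings $\{1,2\},\{3,4\}$ and $\{1,3\},\{2,4\}$ and $\{1,4\},\{2,3\}$, so $M(2) = 4$ and $N(2,n) \le 4n^2$, consistent with the order-$n^2$ fourth-moment bound computed above.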

Now let $\epsilon > 0$ and consider
$$E\left[\left(n^{-\epsilon-1/2} R_n\right)^{2k}\right] \le n^{-2k\epsilon - k} N(k,n) \le n^{-2k\epsilon} M(k).$$
Choose $k > \frac{1}{2\epsilon}$. Then
$$\sum_{n\ge1} E\left[\left(n^{-\epsilon-1/2} R_n\right)^{2k}\right] < \infty.$$

Recall Corollary 2, appearing in the section on the Borel-Cantelli Lemma:

Let $(X_n)_{n\ge0}$ be a sequence of random variables. If $\sum_{n=1}^\infty E[|X_n|]$ converges, then $X_n$ converges to zero, almost surely.

By this corollary, the sequence of random variables $n^{-\epsilon-1/2} R_n \to 0$ almost surely as $n \to \infty$.

This means that for each $\epsilon > 0$, there is a negligible event (depending on $\epsilon$) outside of which $n^{-\epsilon-1/2} R_n$ converges to 0. Now consider a countable set of values of $\epsilon$ tending to 0. Since a countable union of negligible events is negligible, for each $\epsilon > 0$, $n^{-\epsilon-1/2} R_n$ converges to 0 almost surely.

Hardy-Littlewood Estimate

Theorem 3 (Hardy-Littlewood Estimate).
$$S_n - np = O\left(\sqrt{n\ln n}\right) \quad \text{a.s. as } n \to \infty.$$

Remark. The proof shows that $S_n - np \le \sqrt{n\ln n}$ almost surely for all but finitely many $n$.

Proof. The proof uses the Large Deviations Estimate as well as the Borel-Cantelli Lemma. Recall the Large Deviations Estimate says
$$P\left[\frac{S_n}{n} \ge p + \epsilon\right] \le e^{-n h_+(\epsilon)},$$
where
$$h_+(\epsilon) = (p+\epsilon)\ln\left(\frac{p+\epsilon}{p}\right) + (1-p-\epsilon)\ln\left(\frac{1-p-\epsilon}{1-p}\right).$$
Note that as $\epsilon \to 0$, $h_+(\epsilon) = \frac{\epsilon^2}{2p(1-p)} + O(\epsilon^3)$.

Note that
$$P\left[\frac{S_n}{n} \ge p + \epsilon\right] = P\left[S_n - np \ge n\epsilon\right],$$
so take $\epsilon = \sqrt{\frac{\ln n}{n}}$ and note that
$$P\left[S_n - np \ge \sqrt{n\ln n}\right] \le e^{-n h_+\left(\sqrt{\ln n / n}\right)}.$$

Then
$$h_+\left(\sqrt{\frac{\ln n}{n}}\right) = \frac{\ln n}{2p(1-p)n} + o\left(\frac{1}{n}\right),$$
since $O\left(\left(\frac{\ln n}{n}\right)^{3/2}\right) = o\left(\frac{1}{n}\right)$. Thus, the probability is less than or equal to the following:

$$\exp\left(-n h_+\left(\sqrt{\frac{\ln n}{n}}\right)\right) = \exp\left(-\frac{\ln n}{2p(1-p)} + o(1)\right) = \exp\left(-\frac{\ln n}{2p(1-p)}\right)\exp(o(1)) = n^{-\frac{1}{2p(1-p)}}\exp(o(1)).$$

Hence $\exp\left(-n h_+\left(\sqrt{\frac{\ln n}{n}}\right)\right) \sim n^{-\frac{1}{2p(1-p)}}$, and $\sum_{n\ge1} n^{-\frac{1}{2p(1-p)}}$ is convergent because of the following chain of inequalities:
$$p(1-p) \le \tfrac{1}{4}, \quad\text{so}\quad 2p(1-p) \le \tfrac{1}{2}, \quad\text{so}\quad \frac{1}{2p(1-p)} \ge 2, \quad\text{so}\quad -\frac{1}{2p(1-p)} \le -2, \quad\text{so}\quad n^{-\frac{1}{2p(1-p)}} \le n^{-2}.$$


Thus,
$$\sum_{n\ge1} P\left[S_n - np > \sqrt{n\ln n}\right] < \infty,$$
and so, by the first Borel-Cantelli Lemma, almost surely $S_n - np \le \sqrt{n\ln n}$ for all but finitely many $n$.

Proof of Khinchin's Law of the Iterated Logarithm

Theorem 4 (Khinchin's Law of the Iterated Logarithm). Almost surely,
$$\limsup_{n\to\infty} \frac{S_n - np}{\sqrt{2p(1-p)n\ln(\ln n)}} = 1,$$
and
$$\liminf_{n\to\infty} \frac{S_n - np}{\sqrt{2p(1-p)n\ln(\ln n)}} = -1.$$

First establish two lemmas, and for convenience, let
$$\alpha(n) = \sqrt{2p(1-p)n\ln(\ln n)}.$$

Lemma 5. For all positive $a$ and $\delta$ and large enough $n$,
$$(\ln n)^{-a^2(1+\delta)} < P\left[S_n - np > a\,\alpha(n)\right] < (\ln n)^{-a^2(1-\delta)}.$$

Proof of Lemma 5. Recall that the Large Deviations Estimate gives
$$P\left[R_n \ge a\,\alpha(n)\right] = P\left[S_n - np \ge a\,\alpha(n)\right] = P\left[\frac{S_n}{n} - p \ge \frac{a\,\alpha(n)}{n}\right] \le \exp\left(-n h_+\left(\frac{a\,\alpha(n)}{n}\right)\right).$$
Note that $\alpha(n)/n \to 0$ as $n \to \infty$. Then
$$h_+\left(\frac{a\,\alpha(n)}{n}\right) = \frac{a^2}{2p(1-p)}\left(\frac{\alpha(n)}{n}\right)^2 + O\left(\left(\frac{\alpha(n)}{n}\right)^3\right),$$


and so
$$n h_+\left(\frac{a\,\alpha(n)}{n}\right) = a^2 \ln(\ln n) + O\left(\frac{\alpha(n)^3}{n^2}\right) \ge a^2(1-\delta)\ln(\ln n)$$
for large enough $n$. This means that
$$P\left[\frac{S_n}{n} - p \ge \frac{a\,\alpha(n)}{n}\right] \le \exp\left(-a^2(1-\delta)\ln(\ln n)\right) = (\ln n)^{-a^2(1-\delta)}.$$

Since $\sqrt{\ln(\ln n)} = o\left(n^{1/6}\right)$, the results of the Moderate Deviations Theorem apply to give
$$P\left[\frac{S_n}{n} - p \ge \frac{a\,\alpha(n)}{n}\right] = P\left[\frac{S_n}{n} - p \ge \sqrt{\frac{p(1-p)}{n}}\, a\sqrt{2\ln(\ln n)}\right] \sim \frac{1}{\sqrt{2\pi}\, a\sqrt{2\ln(\ln n)}} \exp\left(-a^2\ln(\ln n)\right) = \frac{1}{2a\sqrt{\pi\ln(\ln n)}} (\ln n)^{-a^2}.$$
Since $\sqrt{\ln(\ln n)} = o\left((\ln n)^{a^2\delta}\right)$,
$$P\left[\frac{S_n}{n} - p \ge \frac{a\,\alpha(n)}{n}\right] \ge (\ln n)^{-a^2(1+\delta)}$$
for large enough $n$.

Lemma 6 (Kolmogorov Maximal Inequality; Lemma 12.5 of Lesigne [3]). Suppose $(Y_n)_{n\ge1}$ are independent random variables. Suppose further that $E[Y_n] = 0$ and $\mathrm{Var}(Y_n) = \sigma^2$. Define $T_n := Y_1 + \cdots + Y_n$. Then
$$P\left[\max_{1\le k\le n} T_k \ge b\right] \le \frac{4}{3}\, P\left[T_n \ge b - 2\sigma\sqrt{n}\right].$$

Remark. Lemma 6 is an example of a class of lemmas called maximal inequalities. Here are two more examples of maximal inequalities.

Lemma 7 (Karlin and Taylor, page 280). Let $(Y_n)_{n\ge1}$ be independent, identically distributed random variables with $E[Y_n] = 0$ and $\mathrm{Var}(Y_n) = \sigma^2 < \infty$. Define $T_n = \sum_{k=1}^n Y_k$. Then
$$\epsilon^2\, P\left[\max_{0\le k\le n} |T_k| > \epsilon\right] \le n\sigma^2.$$


Lemma 8 (Karlin and Taylor, page 280). Let $(X_n)_{n\ge0}$ be a submartingale for which $X_n \ge 0$. For $\lambda > 0$,
$$\lambda\, P\left[\max_{0\le k\le n} X_k > \lambda\right] \le E[X_n].$$

Proof of Lemma 6. Since the $Y_k$ are independent, $\mathrm{Var}(T_n - T_k) = (n-k)\sigma^2$ for $1 \le k \le n$. Chebyshev's Inequality tells us that
$$P\left[|T_n - T_k| \le 2\sigma\sqrt{n}\right] \ge 1 - \frac{\mathrm{Var}(T_n - T_k)}{4\sigma^2 n} = 1 - \frac{n-k}{4n} \ge \frac{3}{4}.$$
Note that
$$\begin{aligned}
P\left[\max_{1\le k\le n} T_k \ge b\right] &= \sum_{k=1}^n P\left[T_1 < b, \ldots, T_{k-1} < b, \text{ and } T_k \ge b\right] \\
&\le \sum_{k=1}^n P\left[T_1 < b, \ldots, T_{k-1} < b, \text{ and } T_k \ge b\right] \cdot \frac{4}{3} P\left[|T_n - T_k| \le 2\sigma\sqrt{n}\right] \\
&= \frac{4}{3} \sum_{k=1}^n P\left[T_1 < b, \ldots, T_{k-1} < b, \text{ and } T_k \ge b \text{ and } |T_n - T_k| \le 2\sigma\sqrt{n}\right] \\
&\le \frac{4}{3} \sum_{k=1}^n P\left[T_1 < b, \ldots, T_{k-1} < b, \text{ and } T_k \ge b \text{ and } T_n \ge b - 2\sigma\sqrt{n}\right] \\
&\le \frac{4}{3}\, P\left[T_n \ge b - 2\sigma\sqrt{n}\right].
\end{aligned}$$
(The equality in the third line uses the independence of the event $\{T_1 < b, \ldots, T_{k-1} < b, T_k \ge b\}$ from $T_n - T_k$.)
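A Monte Carlo sanity check of Lemma 6 is straightforward; the following R sketch (our illustration, with fair $\pm 1$ steps so that $\sigma = 1$) estimates both sides:

## Estimate both sides of Lemma 6: P[max T_k >= b] <= (4/3) P[T_n >= b - 2*sigma*sqrt(n)]
set.seed(7)
m <- 20000; n <- 400; sigma <- 1; b <- 30
Y <- matrix(sample(c(-1, 1), m*n, replace=TRUE), nrow=n)
W <- apply(Y, 2, cumsum)                   # each column is a path T_1, ..., T_n
lhs <- mean(apply(W, 2, max) >= b)
rhs <- (4/3)*mean(W[n, ] >= b - 2*sigma*sqrt(n))
c(lhs = lhs, rhs = rhs)                    # lhs should not exceed rhs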

Remark. Note that the second part of the Law of the Iterated Logarithm, $\liminf_{n\to\infty} \frac{S_n - np}{\alpha(n)} = -1$, follows by symmetry from the first part by replacing $p$ with $1-p$ and $S_n$ with $n - S_n$.

Remark. The proof of the Law of the Iterated Logarithm proceeds in two parts. First it suffices to show that, almost surely,
$$\limsup_{n\to\infty} \frac{S_n - np}{\alpha(n)} < 1 + \eta \quad \text{for } \eta > 0. \tag{1}$$
The second part of the proof is to establish that, almost surely,
$$\limsup_{n\to\infty} \frac{S_n - np}{\alpha(n)} > 1 - \eta \quad \text{for } \eta > 0. \tag{2}$$


It will only take a subsequence to prove (2). However, this will not be easy, because it will use the second Borel-Cantelli Lemma, which requires independence.

Remark. The following is a simplified proof giving a partial result for integer sequences with exponential growth. This simplified proof illustrates the basic ideas of the proof. Fix $\gamma > 1$ and let $n_k := \lfloor\gamma^k\rfloor$. Then
$$P\left[S_{n_k} - pn_k \ge (1+\eta)\alpha(n_k)\right] < (\ln n_k)^{-(1+\eta)^2(1-\delta)} = O\left(k^{-(1+\eta)^2(1-\delta)}\right).$$
Choose $\delta$ so that $(1+\eta)^2(1-\delta) > 1$. Then
$$\sum_{k\ge1} P\left[S_{n_k} - n_k p \ge (1+\eta)\alpha(n_k)\right] < \infty.$$
By the first Borel-Cantelli Lemma,
$$P\left[S_{n_k} - n_k p \ge (1+\eta)\alpha(n_k) \text{ i.o.}\right] = 0,$$
and so
$$P\left[\limsup_{k\to\infty} \frac{S_{n_k} - n_k p}{\alpha(n_k)} \le 1 + \eta\right] = 1,$$
or
$$\frac{S_{n_k} - n_k p}{\alpha(n_k)} \le 1 + \eta \quad \text{eventually, a.s.}$$

The full proof of the Law of the Iterated Logarithm takes more work tocomplete.

Proof of (1) in the Law of the Iterated Logarithm. Fix $\eta > 0$ and let $\gamma > 1$ be a constant chosen later. For positive integers $k$, let $n_k = \lfloor\gamma^k\rfloor$. The proof consists of showing that
$$\sum_{k\ge1} P\left[\max_{n\le n_{k+1}} (S_n - np) \ge (1+\eta)\alpha(n_k)\right] < \infty.$$

From Lemma 6,
$$P\left[\max_{n\le n_{k+1}} (S_n - np) \ge (1+\eta)\alpha(n_k)\right] \le \frac{4}{3}\, P\left[R_{n_{k+1}} \ge (1+\eta)\alpha(n_k) - 2\sqrt{n_{k+1}\,p(1-p)}\right], \tag{3}$$
where $R_n = S_n - np$.

Note that $\sqrt{n_{k+1}} = o(\alpha(n_k))$, because this is approximately $\sqrt{\gamma^{k+1}}$ compared to
$$c_1\sqrt{\gamma^k \ln(\ln \gamma^k)} = c_1 \gamma^{k/2}\sqrt{\ln(\ln\gamma) + \ln(k)},$$
which is the same as $\gamma^{1/2}$ compared to $c_1\sqrt{c_2 + \ln(k)}$.

Then $2\sqrt{n_{k+1}\,p(1-p)} < \frac{1}{2}\eta\,\alpha(n_k)$ for large enough $k$. Using this inequality in the right-hand side of Equation (3), we get
$$P\left[\max_{n\le n_{k+1}} (S_n - np) \ge (1+\eta)\alpha(n_k)\right] \le \frac{4}{3}\, P\left[S_{n_{k+1}} - n_{k+1}p \ge (1 + \eta/2)\alpha(n_k)\right].$$

Now, $\alpha(n_{k+1}) \sim \sqrt{\gamma}\,\alpha(n_k)$. Choose $\gamma$ (close enough to 1) so that $1 + \eta/2 > (1+\eta/4)\sqrt{\gamma}$. Then for large enough $k$, we have
$$(1+\eta/2)\alpha(n_k) > (1+\eta/4)\alpha(n_{k+1}).$$
Now
$$P\left[\max_{n\le n_{k+1}} (S_n - np) \ge (1+\eta)\alpha(n_k)\right] \le \frac{4}{3}\, P\left[S_{n_{k+1}} - n_{k+1}p \ge (1+\eta/4)\alpha(n_{k+1})\right].$$

Use Lemma 5 with $a = 1 + \eta/4$ and $\delta$ chosen so that $(1-\delta)^{-1} = 1 + \eta/4$, so that $a^2(1-\delta) = 1 + \eta/4$. Then we get
$$P\left[\max_{n\le n_{k+1}} (S_n - np) \ge (1+\eta)\alpha(n_k)\right] \le \frac{4}{3}(\ln n_{k+1})^{-(1+\eta/4)}$$

for $k$ large. Note that
$$(\ln n_{k+1})^{-(1+\eta/4)} \sim (\ln\gamma)^{-(1+\eta/4)}\, k^{-(1+\eta/4)},$$
which is the general term of a convergent series, so
$$\sum_{k\ge1} P\left[\max_{n\le n_{k+1}} R_n \ge (1+\eta)\alpha(n_k)\right] < \infty.$$


Then the first Borel-Cantelli Lemma implies that
$$\max_{n\le n_{k+1}} R_n \ge (1+\eta)\alpha(n_k) \quad \text{i.o. with probability } 0,$$
or equivalently, almost surely,
$$\max_{n\le n_{k+1}} (S_n - np) < (1+\eta)\alpha(n_k) \quad \text{for large enough } k.$$
Then in particular, almost surely,
$$\max_{n_k \le n < n_{k+1}} (S_n - np) < (1+\eta)\alpha(n_k) \quad \text{for large enough } k.$$
Since $\alpha$ is increasing, $\alpha(n_k) \le \alpha(n)$ for $n_k \le n < n_{k+1}$, so this in turn implies that almost surely
$$S_n - np < (1+\eta)\alpha(n)$$
for large enough $n$, which establishes (1).

Proof of (2) in the Law of the Iterated Logarithm. Continue with the proof of Equation (2). To prove the second part, it suffices to show that there exists a sequence $n_k$ so that $R_{n_k} \ge (1-\eta)\alpha(n_k)$ i.o. almost surely. Let $n_k = \gamma^k$ for an integer $\gamma$ sufficiently large, chosen later. The proof will show
$$\sum_{n\ge1} P\left[R_{\gamma^n} - R_{\gamma^{n-1}} \ge \left(1 - \frac{\eta}{2}\right)\alpha(\gamma^n)\right] = \infty, \tag{4}$$
and also
$$R_{\gamma^{n-1}} \ge -\frac{\eta}{2}\alpha(\gamma^n) \quad \text{a.s. for large enough } n. \tag{5}$$
Note that $R_{\gamma^n} - R_{\gamma^{n-1}} \stackrel{\text{dist.}}{=} R_{\gamma^n - \gamma^{n-1}}$, so it suffices to consider
$$P\left[R_{\gamma^n - \gamma^{n-1}} \ge \left(1 - \frac{\eta}{2}\right)\alpha(\gamma^n)\right].$$

Note that, writing $c = 2p(1-p)$,
$$\frac{\alpha(\gamma^n - \gamma^{n-1})}{\alpha(\gamma^n)} = \frac{\sqrt{c\,(\gamma^n - \gamma^{n-1})\ln(\ln(\gamma^n - \gamma^{n-1}))}}{\sqrt{c\,\gamma^n\ln(\ln\gamma^n)}} = \sqrt{\left(1 - \frac{1}{\gamma}\right)\frac{\ln\left(n\ln\gamma + \ln\left(1 - \frac{1}{\gamma}\right)\right)}{\ln(n\ln\gamma)}} \to \sqrt{1 - \frac{1}{\gamma}}.$$


Choose the integer $\gamma$ so that
$$\frac{1 - \frac{\eta}{2}}{1 - \frac{\eta}{4}} < \sqrt{1 - \frac{1}{\gamma}}.$$
Then note that we can choose $n$ large enough so that
$$\frac{1 - \frac{\eta}{2}}{1 - \frac{\eta}{4}} < \frac{\alpha(\gamma^n - \gamma^{n-1})}{\alpha(\gamma^n)},$$
or
$$\left(1 - \frac{\eta}{2}\right)\alpha(\gamma^n) < \left(1 - \frac{\eta}{4}\right)\alpha(\gamma^n - \gamma^{n-1}).$$

Now considering Equation (4) and the inequality above,
$$P\left[R_{\gamma^n} - R_{\gamma^{n-1}} \ge \left(1 - \frac{\eta}{2}\right)\alpha(\gamma^n)\right] \ge P\left[R_{\gamma^n - \gamma^{n-1}} \ge \left(1 - \frac{\eta}{4}\right)\alpha(\gamma^n - \gamma^{n-1})\right].$$

Now using Lemma 5 with $a = 1 - \eta/4$ and $\delta$ chosen so that $(1+\delta)^{-1} = 1 - \eta/4$, so that $a^2(1+\delta) = 1 - \eta/4$, we get
$$P\left[R_{\gamma^n} - R_{\gamma^{n-1}} \ge \left(1 - \frac{\eta}{2}\right)\alpha(\gamma^n)\right] \ge \left(\ln\left(\gamma^n - \gamma^{n-1}\right)\right)^{-(1-\eta/4)} = \left(n\ln\gamma + \ln\left(1 - \frac{1}{\gamma}\right)\right)^{-(1-\eta/4)}.$$
The series with this as its general term diverges, since the exponent $1 - \eta/4$ is less than 1. Thus Equation (4) has been proven.

Now notice that
$$\alpha(\gamma^n) = \sqrt{c\,\gamma^n\ln(\ln\gamma^n)} = \sqrt{c\,\gamma^n(\ln n + \ln\ln\gamma)},$$
and so
$$\alpha(\gamma^{n-1}) = \sqrt{c\,\gamma^{n-1}(\ln(n-1) + \ln\ln\gamma)},$$
which means that
$$\sqrt{\gamma}\,\alpha(\gamma^{n-1}) = \sqrt{c\,\gamma^n(\ln(n-1) + \ln\ln\gamma)}.$$
Thus, $\alpha(\gamma^n) \sim \sqrt{\gamma}\,\alpha(\gamma^{n-1})$. Now choose $\gamma$ so that also $\eta\sqrt{\gamma} > 4$. Then $\eta\,\alpha(\gamma^n) \sim \eta\sqrt{\gamma}\,\alpha(\gamma^{n-1}) > 4\alpha(\gamma^{n-1})$ for large enough $n$.


Thus, we have
$$\left[R_{\gamma^{n-1}} \le -\frac{\eta}{2}\alpha(\gamma^n)\right] \subseteq \left[R_{\gamma^{n-1}} \le -2\alpha(\gamma^{n-1})\right] = \left[-R_{\gamma^{n-1}} \ge 2\alpha(\gamma^{n-1})\right].$$
Now use (1) (applied to $-R_n$, by the symmetry remark above) to see that $-R_{\gamma^{n-1}} < 2\alpha(\gamma^{n-1})$ almost surely for large enough $n$, which establishes (5).

Now $R_{\gamma^n} - R_{\gamma^{n-1}}$, $n \ge 1$, is a sequence of independent random variables, and so with Equation (4) the second Borel-Cantelli Lemma says that almost surely
$$R_{\gamma^n} - R_{\gamma^{n-1}} > \left(1 - \frac{\eta}{2}\right)\alpha(\gamma^n) \quad \text{i.o.}$$
Adding this to Equation (5), we get
$$R_{\gamma^n} > (1-\eta)\alpha(\gamma^n) \quad \text{i.o.}$$
This is enough to show that
$$\limsup_{n\to\infty} \frac{S_n - np}{\alpha(n)} > 1 - \eta \quad \text{a.s.},$$
which is the only remaining part of Khinchin's Law of the Iterated Logarithm.

Sources

This section is adapted from: W. Feller, An Introduction to Probability Theory and Its Applications, Volume I, Chapter III and Chapter VIII; and also E. Lesigne, Heads or Tails: An Introduction to Limit Theorems in Probability, Chapter 12, American Mathematical Society, Student Mathematical Library, Volume 28, 2005. Some of the ideas in the proof of Hausdorff's Estimate are adapted from J. Lamperti, Probability: A Survey of the Mathematical Theory, Second Edition, Chapter 8. Figure 3 is a recreation of a figure in the Wikipedia article on the Law of the Iterated Logarithm.


Algorithms, Scripts, Simulations

Algorithm

Scripts

R script for comparison figures.

p <- 0.5
k <- 15          # number of sample paths
n <- 100000      # number of coin flips per path

## k sample paths of n Bernoulli(p) coin flips, cumulatively summed
coinFlips <- array(runif(n*k) <= p, dim=c(n,k))
S <- apply(coinFlips, 2, cumsum)
steps <- c(1:n)

## drop the first steps so that log(steps) and log(log(steps)) are defined
steps2 <- steps[2:n]
S2 <- S[2:n, ]
steps3 <- steps[3:n]
S3 <- S[3:n, ]

ones <- cbind(matrix(1, n-2, 1), matrix(-1, n-2, 1))

par(mfrow = c(2,2))

matplot((S - steps*p)/steps,
        log="x", type="l", lty=1, ylab="", main="Strong Law")
matplot((S - steps*p)/sqrt(2*p*(1-p)*steps),
        log="x", type="l", lty=1, ylab="", main="Central Limit Theorem")
matplot((S - steps*p)/(steps^(0.6)),
        log="x", type="l", lty=1, ylab="", main="Hausdorff's Estimate")
## matplot((S2 - steps2*p)/sqrt(steps2*log(steps2)), log="x",
##         xlim=c(1,n), type="l", lty=1)
matplot(steps3, (S3 - steps3*p)/sqrt(2*p*(1-p)*steps3*log(log(steps3))),
        log="x", xlim=c(1,n), type="l", lty=1, ylab="",
        main="Law of Iterated Logarithm")
matlines(steps3, ones, type="l", col="black")


Problems to Work for Understanding

1. The "multiplier" $\sqrt{\ln(\ln(n))}$ of the standard-deviation scale $\sqrt{2p(1-p)n}$ grows very slowly. To understand how very slowly, calculate a table with $n = 10^j$ and $\sqrt{2\ln(\ln(n))}$ for $j = 10, 20, 30, \ldots, 100$. (Remember, in mathematical work above calculus, $\log(x)$ is the natural logarithm, base $e$, often written $\ln(x)$ in calculus and below to distinguish it from the "common" or base-10 logarithm. Be careful: some software and technology cannot directly calculate with magnitudes this large; see the sketch after Problem 3.)

2. Consider the sequence
$$a_n = (-1)^{\lfloor n/2\rfloor} + \frac{(-1)^n}{n}$$
for $n = 1, 2, 3, \ldots$. Here $\lfloor x\rfloor$ is the "floor function," the greatest integer less than or equal to $x$, so $\lfloor 1\rfloor = 1$, $\lfloor 3/2\rfloor = 1$, $\lfloor 8/3\rfloor = 2$, $\lfloor -3/2\rfloor = -2$, etc. Find
$$\limsup_{n\to\infty} a_n \quad\text{and}\quad \liminf_{n\to\infty} a_n.$$
Does the sequence $a_n$ have a limit?

3. Show that the second part of the Law of the Iterated Logarithm, $\liminf_{n\to\infty} \frac{S_n - np}{\alpha(n)} = -1$, follows by symmetry from the first part by replacing $p$ with $1-p$ and $S_n$ with $n - S_n$.
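For Problem 1, one way around the magnitude issue is the identity $\ln(\ln(10^j)) = \ln(j\ln 10)$; a small R sketch (ours, not an official solution):

j <- seq(10, 100, by=10)
## ln(ln(10^j)) = ln(j*ln(10)), so no huge numbers are ever formed
multiplier <- sqrt(2*log(j*log(10)))
data.frame(n = paste0("10^", j), multiplier)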

Solutions to Problems


References

[1] William Feller. An Introduction to Probability Theory and Its Applications, Volume I. John Wiley and Sons, third edition, 1973. QA273 F3712.

[2] John W. Lamperti. Probability: A Survey of the Mathematical Theory. Wiley Series in Probability and Statistics. Wiley, second edition, 1996.

[3] Emmanuel Lesigne. Heads or Tails: An Introduction to Limit Theoremsin Probability, volume 28 of Student Mathematical Library. AmericanMathematical Society, 2005.



Steve Dunbar’s Home Page, http://www.math.unl.edu/~sdunbar1Email to Steve Dunbar, sdunbar1 at unl dot edu

Last modified: Processed from LATEX source on March 22, 2018
