

Non-homogeneous random walks

Lyapunov function methods for near-critical stochastic systems

Mikhail Menshikov, Serguei Popov, Andrew Wade

4th October 2015


Contents

Preface

Summary of notation

1 Introduction
  1.1 Random walks
  1.2 Simple random walk
  1.3 Lamperti’s problem
  1.4 General random walk
  1.5 Recurrence and transience
  1.6 Angular asymptotics
  1.7 Centrally biased random walks
  1.8 Outline of some motivation
  1.9 Outline of some mathematical applications
  Bibliographical notes

2 General results for hitting times
  2.1 Definitions
  2.2 An introductory example
  2.3 Fundamental semimartingale facts
  2.4 Displacement and exit estimates
  2.5 Recurrence/transience criteria
  2.6 Positive recurrence
  2.7 Moments of hitting times
  2.8 Growth bounds on trajectories
  Bibliographical notes

3 Lamperti’s problem
  3.1 Introduction
  3.2 Markovian case
  3.3 General case
  3.4 Lyapunov functions
  3.5 Recurrence classification
  3.6 Irreducibility and regeneration
  3.7 Moments and tails of passage times
  3.8 Excursion durations and maxima
  3.9 Almost-sure bounds on trajectories
  3.10 Transient theory in the critical case
  3.11 Nullity and weak limits
  3.12 Supercritical case
    3.12.1 Example: Birth-and-death chains
  3.13 Proofs for the Markovian case
  Bibliographical notes

4 Many-dimensional random walks
  4.1 Introduction
    4.1.1 Chapter overview: Anomalous recurrence
    4.1.2 Notation for many-dimensional random walks
    4.1.3 Beyond the Chung–Fuchs theorem
    4.1.4 Relation to Lamperti’s problem
    4.1.5 Sufficient conditions for recurrence and transience
  4.2 Elliptic random walks
    4.2.1 Recurrence classification
    4.2.2 Example: Increments supported on ellipsoids
  4.3 Controlled random walks with zero drift
    4.3.1 Transience in two dimensions
    4.3.2 Conditions for transience, and a recurrent example in higher dimensions
    4.3.3 Proofs
  4.4 Centrally biased random walks
    4.4.1 Model and notation
    4.4.2 Subcritical case
    4.4.3 Supercritical case
    4.4.4 Critical case: Angular asymptotics
    4.4.5 Critical case: Recurrence classification and excursions
  4.5 Reflecting random walks
  4.6 Range and local time
  Bibliographical notes


5 Heavy tails
  5.1 Chapter overview
  5.2 Directional transience
    5.2.1 Introduction
    5.2.2 A condition for directional transience
    5.2.3 Almost-sure bounds
    5.2.4 Moments of passage times
    5.2.5 Moments of last exit times
  5.3 Oscillating random walk
    5.3.1 Introduction
    5.3.2 Lyapunov functions
    5.3.3 Symmetric increments
    5.3.4 One-sided increments
    5.3.5 Complexes of half-lines
    5.3.6 Technical results
  Bibliographical notes

6 Further applications
  6.1 Random walk in random environment
    6.1.1 Introduction
    6.1.2 Recurrence classification
    6.1.3 Almost-sure upper bounds
    6.1.4 Almost-sure lower bounds
    6.1.5 Perturbation of Sinai’s regime
  6.2 Random strings in random environment
    6.2.1 Introduction
    6.2.2 Lyapunov exponents and products of random matrices
    6.2.3 Recurrence classification
    6.2.4 Proofs
  6.3 Stochastic billiards
    6.3.1 Introduction
    6.3.2 Reduction to Lamperti’s problem
    6.3.3 Increment estimates
  6.4 Exclusion and voter models
    6.4.1 Introduction
    6.4.2 Configurations and exclusion-voter dynamics
    6.4.3 Lyapunov functions
    6.4.4 Exclusion process
    6.4.5 Voter model
    6.4.6 Hybrid process
  Bibliographical notes

7 Markov chains in continuous time
  7.1 Introduction and notation
  7.2 Main results
    7.2.1 Existence or non-existence of moments of passage times
    7.2.2 Explosion
    7.2.3 Implosion
  7.3 Proof of the main theorems
    7.3.1 Proof of Theorems 7.2.1 and 7.2.3 on moments of passage times
    7.3.2 Proof of theorems on explosion and implosion
  7.4 Applications to some critical models
    7.4.1 Critical models on countably infinite and unbounded subsets of R+
    7.4.2 Simple random walk on Zd for d = 2 and d ≥ 3
    7.4.3 Random walk on Z2+ with reflecting boundaries
    7.4.4 Collection of one-dimensional complexes
    7.4.5 Random walk on a family of infinitely decorated lattices
  Bibliographical notes

Glossary of named assumptions

Bibliography

Index


Preface

What is this book about?

This book has two main goals:

• to give an up-to-date exposition of the ‘semimartingale’ or ‘Lyapunov function’ approach to the analysis of stochastic processes;

• to present applications of the methodology to fundamental models (classical and modern) in probability theory and related fields.

Our expository bridge between these two complementary aims is the d-dimensional non-homogeneous random walk, which as a model is simple to describe, closely resembling the classical homogeneous random walk, but displays many interesting and subtle phenomena alien to the classical model. Non-homogeneous random walks cannot be studied by the techniques generally used for homogeneous random walks: new methods (and, just as importantly, new intuitions) are required. Semimartingale and Lyapunov function ideas lead to a unified and powerful methodology in this context.

We emphasize that semimartingale methods are ‘robust’ in the sense that the underlying stochastic process need not satisfy simplifying assumptions such as the Markov property, reversibility, or time-homogeneity, and the state-space of the process need not be countable. In such a general setting, the semimartingale approach has few rivals. In particular, the methods presented work for non-reversible Markov chains. A general feeling is that, if a Markov chain is reversible, then things can be done in many possible ways: there are methods from electrical networks, spectral calculations, harmonic analysis, etc. On the other hand, the non-reversible case is usually much harder. Similarly, the Markovian setting is not essential to the methods. In the semimartingale approach, the Markov property is a side issue and non-Markovian processes can be treated equally well.


The Lyapunov function approach for the analysis of Markov processes originated with classical work of Foster, and the theory has since expanded greatly and proved very successful in the analysis of numerous Markov models. Aspects of Foster–Lyapunov theory are presented in the books [84, 5, 217] (the presentation in [84] being the closest to our perspective). However, almost all of these existing presentations are concerned with the situation in which the process under consideration is not too close to a phase boundary in terms of its asymptotic behaviour. This book deals with the analysis of near-critical systems, which exhibit fundamental phase transitions such as that between recurrence and transience.

Near-critical systems are exactly that: even if transient, they are not ballistic; even if positive recurrent, they do not exhibit geometric ergodicity; random quantities associated with the system typically have heavy (power-law) tails. Heavy tails have become increasingly prevalent in applications across many fields over the last few years, including queueing theory and finance; physicists associate heavy tails with the presence of “self-organized criticality”, a very fashionable topic at the moment. Naturally, the analysis of near-critical systems is more challenging and delicate than that for systems that are far from criticality.

The fundamental contributions to the near-critical situation come from classical work of Lamperti, and almost none of this has previously appeared in any book, despite its age and importance. Lamperti’s basic problem concerned the asymptotic analysis of a stochastic process on the half-line with mean drift at x of order 1/x: this is exactly the critical situation in respect of the recurrence classification of the process. Importantly, Lamperti’s techniques are based on semimartingale ideas, and so the Markov property is not essential. Building on Lamperti’s ideas, over the last 15 years much work has appeared in academic journals on semimartingale methods and near-critical probabilistic systems, and this is the theory that we present in this book.

We want to emphasize the importance of applications of the theory. The Lyapunov function ideology often enables one to reduce a question about a complex model arising in applications to a question about a simpler one-dimensional model, by considering a function of the original process whose image is one-dimensional. If the original model is interesting, in that its behaviour is near-critical in some sense, then the image process (for a suitably chosen Lyapunov function) will be near-critical. To give a concrete example, the classical and fundamental model of symmetric simple random walk on Zd (Xn, say) can be analysed through the Lyapunov function f(x) = ‖x‖ (the Euclidean norm). Then f(Xn) is a stochastic process on the half-line with


mean drift of order 1/x at x, i.e., precisely a critical Lamperti-type process. Moreover, f(Xn) is not a Markov process, demonstrating the importance of the generality of the semimartingale approach.
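This 1/x drift can be checked by direct computation. The following minimal sketch (ours, not from the book; the helper name norm_drift is hypothetical) evaluates the exact one-step mean drift of f(x) = ‖x‖ for simple random walk on Z² at a point at distance r from the origin:

```python
import math

def norm_drift(p):
    """Exact mean one-step drift of f(x) = ||x|| for simple random
    walk on Z^2 started at the lattice point p = (x, y)."""
    x, y = p
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # the 4 nearest neighbours
    r = math.hypot(x, y)
    return sum(math.hypot(x + dx, y + dy) - r for dx, dy in moves) / 4

for r in [10, 100, 1000]:
    d = norm_drift((r, 0))
    print(r, d, d * r)  # the product d * r settles near 1/4
```

The ratio drift × r tends to 1/4, consistent with the Taylor expansion f(x + ξ) − f(x) ≈ ξ·x̂ + (‖ξ‖² − (ξ·x̂)²)/(2‖x‖): taking expectations kills the first-order term and leaves (1 − 1/2)/(2r) = 1/(4r) for this walk.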

Much more complicated and non-classical examples can be studied by the same methods, and we present several such examples to give a flavour of the power and utility of the techniques. We mention, for example, models from queueing theory, interacting particle systems, random walks in random environments, and so on. As mentioned above, our canonical example will be the non-homogeneous random walk. This model is a natural generalization of the very classical and extremely well-studied homogeneous random walk, but its analysis requires entirely new methods. The semimartingale approach gives a systematic and intuitive way to analyse these processes.

So, to summarize: this book is about the analysis of Markov processes (such as random walks) via the method of Lyapunov functions; a correctly applied Lyapunov function of a process gives rise to a semimartingale. Our terminology here is neither particularly standard nor particularly precise; we briefly discuss our usage below.

What is a random walk?

Many random walks are sums of i.i.d. random variables (or vectors); this usage is too narrow for us. Many random walks take place on graphs or groups; combinatorial or algebraic considerations are not the focus of this book. Our random walks are (usually) Markov chains, on state-spaces that are embedded in Euclidean space, with transitions that are in some sense local (so that it is natural to speak of ‘jumps’ or ‘increments’). These are the models that are best suited to the probabilistic approach of this book, and they include broad classes of models of interest in applications, such as queueing theory or ecology.

This book is not just about random walks; we discuss other Markov processes, including interacting particle systems and a stochastic billiards model, for example, but random walks provide a rich set of models on which to demonstrate some aspects of the Lyapunov function method.

What is a Lyapunov function?

The phrase ‘Lyapunov function’ has a quite precise technical meaning in the theory of stability of differential equations. For us, it has a much looser meaning: a Lyapunov function earns the name if the image under the function of a stochastic process is a process satisfying some conditions that enable one to deduce some property of the original process. For instance, if (ξn, n ≥ 0) is a time-homogeneous Markov chain on Zd, and f : Rd → R is a judicious choice of function such that there is a set A ⊂ Zd for which

E[f(ξn+1)− f(ξn) | ξn = x] ≤ 0, for all x /∈ A, (?)

then one can deduce that ξn is recurrent, or transient, if f and A satisfy certain simple conditions. Results of this kind, which have their origins in work of F.G. Foster in the 1950s, are known as Foster–Lyapunov conditions.

In all but the simplest cases, one cannot compute the left-hand side of (?) exactly, but often one can estimate it using Taylor’s formula and coarse properties of the increments ξn+1 − ξn; usually one or two moments suffice. Truncation arguments may be needed to control unusually large increments, for which Taylor’s formula breaks down.

The language and tools of stopping times and martingale theory are close at hand. If τ = min{n ≥ 0 : ξn ∈ A} denotes the first hitting time of A, the drift condition (?) above can be interpreted as saying that f(ξn∧τ) is a supermartingale adapted to the natural filtration. Since recurrence and transience properties of ξn can be related to properties of the stopping time τ, it is natural to try to examine τ using the technology of martingale theory, such as the optional stopping or martingale convergence theorems. This is the basic ideology of the semimartingale method. This approach came after Foster; fundamental work was done by J. Lamperti in the 1960s.
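As a toy illustration (ours, not from the book; the model and the helper name hit_time are ours), take the walk on Z+ with a leftward bias: with f(x) = x the drift condition holds outside A = {0}, and simulation confirms that τ is finite with the mean that optional stopping predicts:

```python
import random

def hit_time(start, p=0.4, rng=None):
    """Walk on Z+: from x >= 1, step +1 w.p. p and -1 w.p. 1 - p.
    With f(x) = x, E[f(next) - f(x) | x] = 2p - 1 < 0 for x >= 1,
    so a Foster-type drift condition holds outside A = {0}.
    Returns tau = min{n >= 0 : x_n = 0}."""
    rng = rng or random.Random(1)
    x, n = start, 0
    while x > 0:
        x += 1 if rng.random() < p else -1
        n += 1
    return n

rng = random.Random(1)
times = [hit_time(20, rng=rng) for _ in range(2000)]
print(sum(times) / len(times))  # near 20 / (1 - 2*0.4) = 100
```

Here x_n + n(1 − 2p) is a martingale up to time τ, so optional stopping (justified by the negative drift) gives E[τ] = start/(1 − 2p) = 100, which the Monte Carlo average reproduces.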

What is a semimartingale?

The phrase ‘semimartingale’ has a quite precise technical meaning in stochastic analysis, as well as being an obsolete term for submartingale. For us, it has a much looser meaning: a semimartingale is a process that satisfies some drift condition like (?), typically only locally. Often, with the aid of a stopping time, one can convert such a process into a true supermartingale or submartingale, as in the case of (?), but we do not make this demand on all our semimartingales.

How to use this book

As mentioned above, we use the terminology Lyapunov function in a very general sense; in our usage the term does not presuppose any particular property (stability or otherwise) for the transformed process. It is also worth noting that usually there is no fixed rule about how to discover the ‘right’ Lyapunov function. For this reason, in this book we not only present the ‘right’ answers, but also do our best to explain the intuition behind the choice of Lyapunov function in the context of diverse applications and examples.

Finding a suitable Lyapunov function is not always easy, and there is no exact algorithm for doing so. Nevertheless, there are several intuitive rules and non-rigorous ideas that may help; in the text, we emphasize this kind of heuristic in the following way:

    just try a good-looking Lyapunov function, work hard to carry out all the computations correctly, and hope for the best

Overview of content and logical structure

The material is presented in logical order, but the book has several entry points for the reader. Chapter 1 serves as a gentle introduction to the main theme of the book (non-homogeneous random walks). Chapters 2 and 3 introduce the technical apparatus of the semimartingale approach and describe its application to near-critical processes on the half-line. Our intention is that these two chapters will serve as a useful reference for researchers who wish to use these tools, and so we have tried to give relatively strong versions of the results in some generality. As an antidote to the technical demands of these two chapters, we have included many examples, as our intention is also that Chapters 2 and 3 should prove instructive to the student who wishes to develop an intuition for the method. Chapters 4–6 present applications of the Lyapunov function method to some near-critical stochastic processes. Thus while these chapters frequently refer to results from Chapter 2 (and Chapter 4 also relies heavily on Chapter 3), our intention is that the reader who is so inclined can take the technical tools for granted and read each of these later chapters as a stand-alone exposition. The final chapter, Chapter 7, switches focus to continuous time; while the development parallels some of the ideas in Chapter 2, this chapter is essentially self-contained.

The section headings in the table of contents give an indication of the subject matter of each chapter; here we outline briefly what each chapter contains.


Chapter 1: Introduction.

This chapter motivates the developments that follow by way of a classical and fundamental model in probability theory: the d-dimensional random walk. We describe the transition from (classical) homogeneous random walks to spatially non-homogeneous random walks, and how the investigation of such models is motivated by theoretical questions arising from trying to go beyond the classical setting and to probe the recurrence/transience phase transition. Immediately, the relaxation of spatial homogeneity requires a significant re-adjustment of random walk intuition: one can readily construct two-dimensional, zero-drift, bounded-jump random walks that are transient, for example, provided spatial homogeneity is not enforced, completely contrary to classical behaviour.

Chapter 2: Semimartingale approach and Markov chains.

This chapter presents the basic technical apparatus that we rely on for the rest of the book. We review some basic martingale ideas (Doob’s inequality, martingale convergence, and the optional stopping theorem) and present a variety of semimartingale tools, including maximal inequalities, results on finiteness of hitting times, and existence and non-existence of moments of hitting times. These results include Foster–Lyapunov criteria for Markov chains, whereby a suitable Lyapunov function enables one to conclude that the process is transient, recurrent, positive recurrent, etc. We provide many examples of the application of these results.

Chapter 3: Lamperti’s problem.

This chapter presents applications of the semimartingale tools of Chapter 2 in the context of one-dimensional adapted processes with asymptotically vanishing drift, the so-called Lamperti’s problem. Lamperti’s problem serves as a first important example of a near-critical stochastic process, and is also motivated by its ubiquity, arising from the application of the Lyapunov function method to near-critical processes in higher dimensions. This chapter studies in turn various aspects of the asymptotic behaviour of Lamperti processes, including the recurrence classification for processes with asymptotically zero drifts, results on existence and non-existence of passage-time moments, Gamma-type weak convergence results, and almost-sure bounds on the trajectory of the process.


Chapter 4: Many-dimensional random walks.

This chapter presents applications of the results of Chapter 3 to many-dimensional random walks. The applications in this chapter (and later on) proceed by the Lyapunov function ideology: use a suitably chosen Lyapunov function of the many-dimensional Markov process to obtain a (probably non-Markov) stochastic process in one dimension, which fits into the framework of Chapter 3. We consider in detail the recurrence classification problem for non-homogeneous random walks, with emphasis on the possibility of anomalous recurrence behaviour. We also give results on angular asymptotics and on the range of many-dimensional martingales.

Chapter 5: Heavy tails.

The processes considered in Chapters 3 and 4 are all assumed to have at least one or two moments for their increments. This chapter turns to the heavy-tailed case, in which the first or second increment moment is infinite. We present results for real-valued Markov chains with heavy-tailed jumps, focusing on the recurrence classification; we demonstrate how the Lyapunov function approach is equally effective in this heavy-tailed setting.

Chapter 6: Further applications.

This chapter presents a selection of applications of the Lyapunov function method to some near-critical stochastic systems. We consider Markov processes in random environments, some models of interacting particle systems, and a stochastic billiards process. This chapter focuses on recurrence and transience results, obtained by applications of the Foster–Lyapunov criteria from Chapter 2.

Chapter 7: Markov chains in continuous time.

For the final chapter of the book, we switch from discrete to continuous time. This chapter presents recent developments on semimartingale techniques for continuous-time, discrete-space Markov chains that are non-homogeneous both in space and in jump rates. For example, the (embedded) jump process might be of Lamperti type, while the rates are not uniformly bounded away from 0 and ∞. This gives rise to additional phenomena over the discrete-time setting. We present conditions for the existence of moments of hitting times in this continuous-time setting, and give criteria for explosion and implosion for such processes, again using semimartingale techniques.


Acknowledgements

We are grateful to Nicholas Georgiou for providing Figures 4.1, 4.2, and 4.4. Parts of the manuscript were read by Marcelo Costa, Nicholas Georgiou, and Hugo Lo; we are grateful for their comments and suggestions.


Summary of notation

Miscellaneous

We write a := · · · to indicate the definition of a. Occasionally we also use · · · =: a. We use the standard abbreviation ‘a.s.’ for ‘almost surely’ (with probability 1).

Sets, probabilities and events

The set of integers is Z. The natural numbers are N := {1, 2, 3, . . .}. The non-negative integers are Z+ := {0} ∪ N. The real numbers are R. The non-negative half-line is R+ := [0, ∞). It is often convenient to extend these sets to include infinities, so we set Z̄+ := Z+ ∪ {+∞}, R̄ := R ∪ {±∞}, and R̄+ := R+ ∪ {+∞}. For a set S, we write #S for its cardinality (number of elements, when finite). For a measurable subset A of Rd, we write |A| for its d-dimensional Lebesgue measure (volume).

We will always assume an underlying probability space (Ω, F, P); expectation with respect to P will be denoted E. For A ⊆ Ω, we denote its complement by Ac = Ω \ A.

If (An, n ∈ Z+) is a sequence of events, we use the standard notation

{An i.o.} = {An infinitely often} = lim sup An = ∩m≥0 ∪n≥m An;

{An ev.} = {An eventually} = {An all but finitely often} = lim inf An = ∪m≥0 ∩n≥m An;

{An f.o.} = {An finitely often} = {An i.o.}c = {Acn ev.} = ∪m≥0 ∩n≥m Acn.


Conventions and empty evaluations

Unless otherwise stated, the following conventions are in force throughout. An empty sum is zero, and an empty product is one. Also inf ∅ := +∞, and sup ∅ := 0.

Real numbers, vectors, and matrices

For x a real number we set

x+ = max{0, x},    x− = −min{0, x},

so x = x+ − x− and |x| = x+ + x−. For real numbers x and y, we set

x ∧ y = min{x, y},    x ∨ y = max{x, y}.

For x ∈ R we write ⌊x⌋ for the largest integer not exceeding x, and ⌈x⌉ for the smallest integer no less than x; so ⌈x⌉ − ⌊x⌋ ∈ {0, 1}, and is 0 if and only if x ∈ Z.

For emphasis, we sometimes denote monotone convergence by ‘↑’ and ‘↓’; so for a sequence ax ∈ R, ax ↑ a as x → ∞ means that limx→∞ ax = a and ax ≤ ay for all x ≤ y.

We usually use boldface letters for vectors in Rd, and write, for example, x = (x1, . . . , xd)⊤ in Cartesian components; for definiteness, vectors x ∈ Rd are viewed as column vectors throughout. The origin in Rd is denoted by 0. We write ‖ · ‖ for the Euclidean norm on Rd. We write e1, . . . , ed for the standard orthonormal basis of Rd, and for vectors u, v ∈ Rd we denote their scalar product by u · v or 〈u, v〉. For a non-zero vector x ∈ Rd we write x̂ := x/‖x‖ for the corresponding unit vector, and we adopt the convention 0̂ := 0. The unit-radius sphere in Rd is

Sd−1 := {u ∈ Rd : ‖u‖ = 1}.

The (closed) Euclidean d-ball centred at x ∈ Rd with radius r ∈ R+ is

B(x; r) := {y ∈ Rd : ‖x − y‖ ≤ r}.

For a matrix M we write M⊤ for its transpose, λmax(M) for its maximum eigenvalue, and tr M for its trace.


A d-by-d matrix M with real-valued entries acts on column vectors x ∈ Rd as a linear map from Rd to Rd via x 7→ Mx. The associated matrix (operator) norm ‖ · ‖op induced by the Euclidean norm on Rd is

‖M‖op := sup{‖Mu‖ : u ∈ Sd−1}.

Using the variational characterization of the largest eigenvalue, λmax(M) = sup{u⊤Mu : u ∈ Sd−1}, we note that

sup{‖Mu‖² : u ∈ Sd−1} = sup{u⊤M⊤Mu : u ∈ Sd−1} = λmax(M⊤M),

so that an alternative expression for the operator norm is

‖M‖op = (λmax(M⊤M))^(1/2).
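The eigenvalue expression for the operator norm is easy to check numerically; a quick sketch (ours) with NumPy compares it against the spectral norm computed directly:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))  # a generic 4-by-4 real matrix

# ||M||_op via the identity: square root of the largest eigenvalue of M^T M
# (eigvalsh applies, since M^T M is symmetric).
op_via_eig = np.sqrt(np.linalg.eigvalsh(M.T @ M).max())

# ... which matches the operator (spectral) norm computed directly.
op_direct = np.linalg.norm(M, 2)
print(op_via_eig, op_direct)
```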

Functions

The natural logarithm of x is log x. For r ∈ R, we write log^r x for (log x)^r. We also write log_1 x := log x, and for k ≥ 2 set log_k x := log log_{k−1} x, so that log_k x is the k-fold iterated logarithm of x.
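In code, the k-fold iterated logarithm is just k applications of log; a minimal sketch (the helper name log_k is ours):

```python
import math

def log_k(k, x):
    """k-fold iterated logarithm: log_1 x = log x, log_k x = log(log_{k-1} x).
    Requires the intermediate values to stay positive."""
    for _ in range(k):
        x = math.log(x)
    return x

x = 1e6
print(log_k(1, x), log_k(2, x))  # log x and log log x
```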

Random variables

We use 1 for the indicator function of an event, indicated either in curly braces as 1{ · } or via a previously assigned symbol such as 1(A).

For a sub-σ-field G of F, G-measurable random variables X and Y, and an event A ∈ G, the statement “X = Y on A” is equivalent to “X1(A) = Y1(A)”; similarly for inequalities.

We use the standard notation for essential supremum and infimum: for a real-valued random variable X,

ess inf X = sup{x ∈ R : P[X ≥ x] = 1},    ess sup X = inf{x ∈ R : P[X > x] = 0}.

For Rd-valued random variables X, X1, X2, . . ., we denote convergence by Xn → X, qualified by the mode of convergence in the text; for example, Xn → X a.s. means that P[Xn → X] = 1. Sometimes for compactness we indicate the mode on the arrow, decorating ‘→’ with ‘a.s.’, ‘p’, or ‘d’ for convergence almost surely, in probability, and in distribution, respectively.


Asymptotics

We reserve unadorned Landau O( · ) and o( · ) symbols for the case where the implicit constants are non-random. Thus for a real-valued function f and an R+-valued function g, the expression f(x) = O(g(x)) means that there exist finite deterministic constants C and x0 such that |f(x)| ≤ Cg(x) for all x ≥ x0. Similarly, f(x) = o(g(x)) means that for any ε > 0 there exists a finite deterministic xε such that |f(x)| ≤ εg(x) for all x ≥ xε.

It is convenient to extend the O, o notation to permit random variables, but it is important to do this carefully to avoid ambiguous formulas. Given a σ-field F and F-measurable random variables X and Y, we write O^F_X(Y) to represent an F-measurable random variable such that there exist finite deterministic constants C and x0 with |O^F_X(Y)| ≤ CY on the event {X ≥ x0}. Although this notation is a little cumbersome, we feel the extra clarity is worthwhile, as an ambiguous O( · ) can hide a multitude of sins.


Chapter 1

Introduction

1.1 Random walks

Random walks are fundamental models in probability theory that exhibit deep mathematical properties and enjoy broad application across the sciences and beyond. Generally speaking, a random walk is a stochastic process modelling the random motion of a particle (or random walker) in space. The particle’s trajectory is described by a series of random increments or jumps at discrete instants in time. Fundamental questions for these models involve the long-time asymptotic behaviour of the walker.

Random walks have a rich history involving several disciplines. Classical one-dimensional random walks were first studied several hundred years ago as models for games of chance, such as the so-called “gambler’s ruin” problem. In his 1900 thesis [13], Louis Bachelier applied similar reasoning to his model of stock prices. Many-dimensional random walks were first studied at around the same time, arising from the work of pioneers of science in diverse applications such as acoustics (Lord Rayleigh’s theory of sound, developed from about 1880 [239]), biology (Karl Pearson’s 1906 theory of random migration of species [231]), and statistical physics (Einstein’s theory of Brownian motion, developed during 1905–08 [75]). The mathematical importance of the random walk problem became clear after Pólya’s work in the 1920s, and over the last 60 years or so there have emerged beautiful connections linking random walk theory and other influential areas of mathematics, such as harmonic analysis, potential theory, combinatorics, and spectral theory. Random walk models have continued to find new and important applications in many highly active domains of modern science: see for example the wide range of articles in [264]. Specific recent developments include modelling of microbe locomotion in microbiology [18], polymer conformation in molecular chemistry [14, 185], and financial systems in economics.

Spatially homogeneous random walks are the subject of a substantial literature, including numerous books, such as [270, 242, 178, 125]. In many modelling applications, the classical assumption of spatial homogeneity is not realistic: the behaviour of the random walker may depend on the present location in space. Applications thus motivate the study of non-homogeneous random walks. These models are also motivated naturally from a mathematical perspective: non-homogeneous random walks are the natural setting in which to probe near-critical behaviour and obtain a finer understanding of the phase transitions present in the classical random walk models.

The main theme of this book is the analysis of near-critical stochastic systems using the method of Lyapunov functions. The non-homogeneous random walk serves as a prototypical near-critical system; the Lyapunov function methodology is robust and powerful, and can be applied to many other near-critical models, including those with applications across modern probability and beyond, to areas such as queueing theory, interacting particle systems, and random media. In this chapter we give an informal introduction to non-homogeneous random walks and how their behaviour differs from that of classical random walks; we also describe some fundamental ideas of the Lyapunov function technique. We state some theorems, but we often omit technical details and generally omit proofs. All of the results that we mention will be stated more precisely (and proved) later in the book, and also applied to a wide variety of near-critical stochastic systems: the non-homogeneous random walk serves as an expository bridge between well-known classical results and the near-critical behaviour that is the subject of this book.

1.2 Simple random walk

The most intensively studied random walk model is symmetric simple random walk. Simple random walk is a discrete-time Markov process (Sn, n ≥ 0) on the d-dimensional integer lattice Zd: Sn can be thought of as the location (in the state-space Zd) of the random walker at time n (or after n steps). The stochastic evolution of the process is as follows. Given Sn in Zd, the next point Sn+1 is chosen uniformly at random from among the 2d lattice points adjacent to Sn, i.e., those points that differ from Sn by exactly ±1 in a single coordinate. In other words, the transition probabilities of the


Markov chain are given by
\[
P[S_{n+1} = y \mid S_n = x] =
\begin{cases}
\frac{1}{2d} & \text{if } y \sim x; \\
0 & \text{otherwise};
\end{cases}
\]
where y ∼ x denotes that y and x are adjacent in Zd.

For example, when d = 1, one-dimensional simple random walk jumps one unit to the left or right, each with probability 1/2, while when d = 2, two-dimensional simple random walk jumps to one of its four neighbours, with probability 1/4 of each.
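This transition rule is straightforward to simulate; the following is a minimal sketch (our own illustration, using only the Python standard library).

```python
import random

def srw_step(x):
    """One step of simple random walk on Z^d: pick a coordinate
    uniformly at random and move +1 or -1 along it."""
    d = len(x)
    i = random.randrange(d)
    y = list(x)
    y[i] += random.choice((-1, 1))
    return tuple(y)

def srw_path(d, n, seed=0):
    """A trajectory of n steps started from the origin of Z^d."""
    random.seed(seed)
    path = [(0,) * d]
    for _ in range(n):
        path.append(srw_step(path[-1]))
    return path
```

Each increment of a path produced this way changes exactly one coordinate by exactly one unit, matching the transition probabilities above.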

For definiteness, suppose that the walk starts at the origin of Zd: S0 = 0. By the Markov property and spatial homogeneity, the increments (or jumps) S_{n+1} − S_n of the walk are independent and identically distributed (i.i.d.) random vectors. If e1, . . . , ed is the standard orthonormal basis of Rd, write
\[
\mathcal{U}_d := \{ \pm e_1, \ldots, \pm e_d \}
\]
for the set of possible values of the increments of the walk. Then we can write S_n = \sum_{k=1}^n Z_k, where Z1, Z2, . . . are i.i.d. with
\[
P[Z_1 = e] = \frac{1}{2d}, \quad \text{for } e \in \mathcal{U}_d.
\]
(With the usual convention that an empty sum is zero, S0 = 0.) Thus we may represent the random walk Sn via a sequence of partial sums of the i.i.d. random increments Zn.

A fundamental question, addressed by Polya [235], concerns the recurrence or transience of the random walk: What is the probability that the walk eventually returns to 0? If we write τ_d := min{n ≥ 1 : S_n = 0} for the time of the first return to 0 (with the usual convention that min ∅ := ∞), the recurrence question concerns
\[
p_d := P[\tau_d < \infty].
\]
The random walk is recurrent if p_d = 1, in which case with probability one the random walk visits 0 infinitely often. On the other hand, if p_d < 1 the random walk is transient, and will, with probability one, visit 0 only finitely many times before eventually leaving, never to return.

The following fundamental result is due to Polya [235].

Theorem 1.2.1. Simple random walk is recurrent in 1 or 2 dimensions, but transient in 3 or more; i.e., p1 = p2 = 1 but pd < 1 for all d ≥ 3.
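Theorem 1.2.1 can be illustrated (though certainly not proved) by a crude Monte Carlo experiment; this sketch is our own illustration. Truncating at a finite time horizon only lower-bounds p_d, so the d = 1 estimate falls slightly short of 1, while the d = 3 estimate sits well below it.

```python
import random

def returned_within(d, n_steps, rng):
    """Does simple random walk on Z^d, started at 0, revisit 0
    within n_steps steps?"""
    x = [0] * d
    for _ in range(n_steps):
        i = rng.randrange(d)
        x[i] += rng.choice((-1, 1))
        if all(c == 0 for c in x):
            return True
    return False

def estimate_return_prob(d, n_steps=2000, trials=400, seed=0):
    """Monte Carlo lower-bound estimate of p_d = P[tau_d < infinity]."""
    rng = random.Random(seed)
    hits = sum(returned_within(d, n_steps, rng) for _ in range(trials))
    return hits / trials
```

With these parameters the d = 1 estimate is close to 1, while the d = 3 estimate is in the vicinity of p3 ≈ 0.34 (the closed-form value discussed in the bibliographical notes).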


The content of the theorem is nicely captured by an aphorism attributed to Shizuo Kakutani: “A drunk man will eventually find his way home, but a drunk bird may get lost forever” (see [72, p. 191]).

Polya’s theorem (Theorem 1.2.1) tells us that the walk returns to 0 eventually when d = 1 or d = 2. But how long might we have to wait? The answer is, potentially, a very long time, since τ_d has very heavy tails: as n → ∞,
\[
P[\tau_1 > n] \sim \frac{1}{\sqrt{\pi n}}, \quad \text{and} \quad P[\tau_2 > n] \sim \frac{\pi}{\log n};
\]
see e.g. [235, p. 159] for the first expression and [73, pp. 356–357] for the second. So in d = 1, E[τ_1^{1/2}] = ∞, while in d = 2, τ_2 has no moments at all.

According to Hughes [125, p. 42],

the failure of certain moments of distributions or densities to converge . . . is pregnant with physical meaning, and indicative of connections to scaling laws, renormalization group methods, and fractals.

The recurrence exhibited by the simple random walk for d ∈ {1, 2} is null-recurrence, meaning that E τ_d = ∞; more stable processes may exhibit positive-recurrence, meaning that the analogue of τ_d is integrable.

1.3 Lamperti’s problem

There are several proofs of Theorem 1.2.1 in the literature, the most popular being those that are largely combinatorial (such as Polya’s original argument [235]) and those based on potential theory and electrical networks (see e.g. [70]). A drawback of each of these approaches is that they rapidly break down when one tries to generalize Polya’s theorem to other random walks. In this section we describe a robust approach to proving Polya’s theorem, due to Lamperti, which enables very broad generalization. This approach is based on the methodology of Lyapunov functions.

Again let Sn be the symmetric simple random walk on Zd, starting at 0. In the context of Polya’s recurrence theorem, we are interested in the events {S_n = 0}. We can reduce this d-dimensional problem to a one-dimensional problem by considering a transformation of the process (a Lyapunov function) given by
\[
X_n := \| S_n \|, \qquad (1.1)
\]


Figure 1.1: Illustration of the non-Markovian nature of Xn in d = 2.

where ‖ · ‖ is the Euclidean norm on Rd: i.e., Xn is the distance from the origin of the walker at time n. The stochastic process (Xn, n ≥ 0) takes values in the countable set S = {‖x‖ : x ∈ Zd}, a subset of the half-line R+, and Xn = 0 if and only if Sn = 0. So we can study the recurrence or transience of Sn via the recurrence or transience of Xn, a one-dimensional process.

This reduction in dimensionality of the problem comes at a price: Xn is not in general a Markov process. For instance, when d = 2, given one of the two events {Sn = (5, 0)} and {Sn = (3, 4)} we have Xn = 5 in each case, but Xn+1 has two different distributions: Xn+1 can take the value 6 (with probability 1/4) in the first case, but this is impossible in the second case. See Figure 1.1. Thus the tools that we use to study Xn must not rely too heavily on the Markov property.

First we compute the expected increment of Xn given Sn = x, namely
\[
E[X_{n+1} - X_n \mid S_n = x] = \frac{1}{2d} \sum_{i=1}^d \left( \|x + e_i\| + \|x - e_i\| - 2\|x\| \right). \qquad (1.2)
\]

To proceed we apply Taylor’s theorem in an elementary way. Using the Taylor expansion
\[
(1 + y)^{1/2} = 1 + \frac{1}{2} y - \frac{1}{8} y^2 + O(y^3), \quad \text{as } y \to 0,
\]
we obtain that, for any e ∈ S^{d−1},
\[
\|x + e\| - \|x\| = \sqrt{(x + e) \cdot (x + e)} - \|x\|
= \|x\| \left[ \left( 1 + \frac{2 e \cdot x + 1}{\|x\|^2} \right)^{1/2} - 1 \right]
= \|x\| \left( \frac{2 e \cdot x + 1}{2\|x\|^2} - \frac{(e \cdot x)^2}{2\|x\|^4} + O(\|x\|^{-3}) \right). \qquad (1.3)
\]
It follows that
\[
\|x + e\| + \|x - e\| - 2\|x\| = \frac{1}{\|x\|} - \frac{(e \cdot x)^2}{\|x\|^3} + O(\|x\|^{-2}).
\]
Thus from (1.2), using the fact that \sum_{i=1}^d (e_i \cdot x)^2 = \|x\|^2, we obtain
\[
E[X_{n+1} - X_n \mid S_n = x] = \frac{1}{2d} \sum_{i=1}^d \left( \|x + e_i\| + \|x - e_i\| - 2\|x\| \right)
= \left( \frac{d-1}{2d} \right) \frac{1}{\|x\|} + O(\|x\|^{-2}). \qquad (1.4)
\]

Similarly,
\[
E[X_{n+1}^2 - X_n^2 \mid S_n = x] = \frac{1}{2d} \sum_{i=1}^d \left( \|x + e_i\|^2 + \|x - e_i\|^2 - 2\|x\|^2 \right) = 1.
\]
Then, since (X_{n+1} - X_n)^2 = X_{n+1}^2 - X_n^2 - 2 X_n (X_{n+1} - X_n), we obtain
\[
E[(X_{n+1} - X_n)^2 \mid S_n = x] = \frac{1}{d} + O(\|x\|^{-1}). \qquad (1.5)
\]
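These two expansions can be checked by exact enumeration of the 2d equally likely jumps at a distant site; the following sketch (our own illustration) computes the conditional moments with no asymptotic approximation at all.

```python
import math

def radial_increment_moments(x):
    """Exact conditional moments of X_{n+1} - X_n for simple random walk
    at site x of Z^d, where X_n = ||S_n||: average over the 2d equally
    likely unit jumps."""
    d = len(x)
    r = math.sqrt(sum(c * c for c in x))
    m1 = m2 = 0.0
    for i in range(d):
        for s in (1, -1):
            y = list(x)
            y[i] += s
            dr = math.sqrt(sum(c * c for c in y)) - r
            m1 += dr / (2 * d)
            m2 += dr * dr / (2 * d)
    return m1, m2
```

At, say, x = (100, 100, 100) in Z³, the first moment multiplied by ‖x‖ is close to (d − 1)/(2d) = 1/3, and the second moment is close to 1/d = 1/3, in line with (1.4) and (1.5).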

Informally speaking, (1.4) says that the mean increment of Xn at x ∈ S is (1/(2x))(1 − 1/d) + O(x^{−2}), and similarly (1.5) says that the second moment of the increment at x is 1/d + O(x^{−1}); however, to formalize these statements we need to be careful, since Xn is not a Markov process, and we have to clarify what we mean by the increment moments “at x”. We deal with these technicalities later, since they complicate the notation (although they will not complicate the proofs, which are based on martingale arguments). For now, for the purposes of exposition, we switch to the case where Xn is a Markov process.

Suppose now that (Xn, n ≥ 0) is a time-homogeneous Markov process on an unbounded subset S of R+. Consider the increment moment functions
\[
\mu_k(x) := E[(X_{n+1} - X_n)^k \mid X_n = x].
\]


For simplicity, suppose that Xn has uniformly bounded increments, so that
\[
P[\, |X_{n+1} - X_n| \leq B \,] = 1 \qquad (1.6)
\]
for some B ∈ R+; under condition (1.6), the µk are well-defined functions of x ∈ S. The first moment function, µ1(x), is the one-step mean drift of Xn at x.

Lamperti [173, 174, 175] investigated the extent to which the asymptotic behaviour of such a process is determined by the µk; essentially, µ1 and µ2 turn out to govern the asymptotic behaviour. For example, the following result is a version of Lamperti’s fundamental recurrence classification.

Theorem 1.3.1. Suppose that Xn is a Markov process on S satisfying (1.6). Under mild conditions, the following recurrence classification holds; here ε > 0 is fixed and each condition is assumed to hold for all sufficiently large x.

• If 2xµ1(x) + µ2(x) < −ε, then Xn is positive-recurrent.

• If 2x|µ1(x)| ≤ µ2(x) + O(x^{−ε}), then Xn is null-recurrent.

• If 2xµ1(x) − µ2(x) > ε, then Xn is transient.

The mild conditions mentioned in the statement of Theorem 1.3.1 are related to issues of irreducibility; the exact nature of the conditions required depends on the state-space S and on the notion of recurrence and transience desired. We leave the technical details until later in the book.
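As a concrete toy illustration (our own example, not one from the text): consider the nearest-neighbour chain on the positive integers that steps right from x with probability 1/2 + c/(4x), reflected at 1. Away from the boundary, µ1(x) = c/(2x) and µ2(x) = 1, so 2xµ1(x) = c, and Theorem 1.3.1 predicts transience for c > 1, null recurrence for |c| ≤ 1, and positive recurrence for c < −1. A sketch:

```python
import random

def lamperti_step(x, c, rng):
    """One step of a nearest-neighbour Lamperti-type chain on {1, 2, ...}:
    from x, step right with probability 1/2 + c/(4x) (clipped to [0, 1]),
    reflected at 1, so that mu_1(x) = c/(2x) and mu_2(x) = 1 for x > 1."""
    p = min(1.0, max(0.0, 0.5 + c / (4.0 * x)))
    return x + 1 if rng.random() < p else max(1, x - 1)

def empirical_drift(x, c, samples=200000, seed=0):
    """Monte Carlo estimate of mu_1(x) = E[X_{n+1} - X_n | X_n = x]."""
    rng = random.Random(seed)
    return sum(lamperti_step(x, c, rng) - x for _ in range(samples)) / samples
```

For instance, at x = 10 with c = 2 the empirical drift is close to c/(2x) = 0.1, placing the chain in the transient regime of the classification.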

A version of Theorem 1.3.1 applies to non-Markov processes Xn when formulated correctly, using appropriate versions of the µk. In particular, we can use such results to study the process Xn defined by (1.1): in this case, the analogue of 2xµ1(x) is, by (1.4),
\[
1 - \frac{1}{d} + O(x^{-1}),
\]
and the analogue of µ2(x) is, by (1.5),
\[
\frac{1}{d} + O(x^{-1}).
\]
An application of the generalized version of Theorem 1.3.1 shows that Xn is transient if and only if
\[
1 - \frac{1}{d} > \frac{1}{d},
\]
or, in other words, d > 2. This gives a very robust strategy for proving Polya’s theorem (Theorem 1.2.1) and its generalizations, based on computations of increment moments for Xn defined by (1.1). These computations use elementary Taylor’s theorem ideas, and do not rely at all on special structure of the original process Sn.


1.4 General random walk

Simple random walk is an attractive model and can be studied using combinatorial methods based on counting sample paths, for example; it is, however, a very specific model. Naturally, it is of interest to study a much broader class of random walks. In particular, for what class of models does a result similar to Theorem 1.2.1 hold? To put the question another way: What are the essential properties possessed by simple random walk that imply Theorem 1.2.1? To answer such questions, we start by describing a much more general model of a random walk.

By a random walk we mean a discrete-time Markov process (ξn, n ≥ 0) on an unbounded state-space Σ ⊆ Rd. We assume that the random walk is time-homogeneous: the distribution of ξn+1 given (ξ0, ξ1, . . . , ξn) depends only on ξn (and not on n). We also need some form of irreducibility to ensure that the random walk cannot get ‘trapped’ in some part of the state-space; it is simplest to take Σ to be a locally finite set (such as Zd) to avoid technical issues at this point.

We are thus using the term random walk in a rather general sense, requiring only that the Markov process inhabit Euclidean space. We also want to impose some regularity assumptions on the increments of the process, to rule out very long jumps for the random walk. In this chapter, for convenience, we often suppose that the increments of the walk are uniformly bounded, so that there exists B ∈ R+ for which, almost surely (a.s.),
\[
P[\, \|\xi_{n+1} - \xi_n\| \leq B \mid \xi_n = x \,] = 1, \quad \text{for all } x \in \Sigma. \qquad (1.7)
\]

In many cases this assumption can be replaced by a weaker assumption on the existence of higher moments for the increments, without producing fundamentally new behaviour. On the other hand, the case of genuinely heavy-tailed increments leads to different phenomena, as discussed in Chapter 5 of this book.

Under the assumption (1.7), the random vectors ξ_{n+1} − ξ_n have well-defined moments, which may depend on ξ_n. In particular, an important quantity is the one-step mean drift vector
\[
\mu(x) := E[\xi_{n+1} - \xi_n \mid \xi_n = x],
\]
which is the average change in position in a single step starting from x ∈ Σ. (Note that the definition via the conditional expectation on {ξ_n = x} is clear in the case of a countable state-space Σ, and makes sense when correctly interpreted for uncountable Σ as well.)


An important and well-studied sub-class of random walks are the spatially homogeneous random walks, for which the distribution of the increment ξ_{n+1} − ξ_n does not depend on the current location ξ_n. Writing θ_n := ξ_{n+1} − ξ_n, spatial homogeneity implies that θ_0, θ_1, . . . are i.i.d. random vectors. Then the representation
\[
\xi_n = \sum_{k=0}^{n-1} \theta_k \qquad (1.8)
\]
as a sum of i.i.d. random vectors enables classical tools of probability theory, such as Fourier methods, to be brought to bear in the analysis of the random walk in the spatially homogeneous case.

Example 1.4.1 (Simple random walk). Using e1, . . . , ed to denote the standard orthonormal basis vectors of Rd, simple random walk is spatially homogeneous with θn uniformly distributed on the set {±e1, . . . ,±ed}. Simple random walk was studied by Lord Rayleigh (see [27]), but the seminal early contribution was Polya’s 1921 paper [235].

Example 1.4.2 (Pearson–Rayleigh random walk). Take θn to be uniformly distributed on the unit-radius sphere S^{d−1} ⊂ Rd. The corresponding d-dimensional random walk, which proceeds via a sequence of unit-length steps, each in an independent and uniformly random direction, is sometimes called a Pearson–Rayleigh random walk, after the exchange in the letters pages of Nature between Karl Pearson and Lord Rayleigh in 1905 [230]:

A man starts from a point O and walks ℓ yards in a straight line; he then turns through any angle whatever and walks another ℓ yards in a second straight line. He then repeats this process n times.

Carazza speculates [27, p. 419] that Pearson’s enthusiasm for this problem may have originated in the game of golf, although Pearson’s academic interest in such problems was directed towards the migration of animal species, such as mosquitoes [231]; Rayleigh had earlier considered the acoustic ‘random walks’ in phase space produced by combinations of sound waves of the same amplitude and random phases.
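The Pearson–Rayleigh walk is easy to simulate: a uniform point on S^{d−1} can be obtained by normalizing a vector of independent standard Gaussians (a standard trick; the sketch below, our own illustration, uses only the standard library).

```python
import math
import random

def pr_step(rng, d=2):
    """One Pearson-Rayleigh increment: a unit vector distributed
    uniformly on S^{d-1}, obtained by normalizing a Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]

def pr_walk(n, d=2, seed=0):
    """A Pearson-Rayleigh trajectory of n unit-length steps from 0."""
    rng = random.Random(seed)
    x = [0.0] * d
    path = [list(x)]
    for _ in range(n):
        step = pr_step(rng, d)
        x = [a + b for a, b in zip(x, step)]
        path.append(list(x))
    return path
```

Every step of a trajectory produced this way has length one, as in Pearson’s formulation.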

Spatially homogeneous random walks have been extensively studied in classical probability theory. Non-homogeneous random walks, on the other hand, require new techniques and, just as importantly, new intuitions.


1.5 Recurrence and transience

For our general random walks, we need a more general definition of recurrence and transience. In the absence of any structural assumptions, our basic use of the terminology is as follows. Note that, in general, the two behaviours are not a priori exhaustive.

Definition 1.5.1. A stochastic process (ξn, n ≥ 0) taking values in Σ ⊆ Rd is transient if lim_{n→∞} ‖ξn‖ = ∞, a.s. The process is recurrent if, for some constant r0 ∈ R+, lim inf_{n→∞} ‖ξn‖ ≤ r0, a.s.

If ξn is an irreducible time-homogeneous Markov chain whose state-space is a locally finite subset of Rd (such as Zd), then recurrence and transience in the sense of Definition 1.5.1 coincide with the usual Markov chain definitions in terms of returns to any given state. Definition 1.5.1 allows more general processes, and is the most convenient definition in the context of Lamperti’s problem outlined in Section 1.3.

First we return to the classical spatially homogeneous random walk described in Section 1.4, in which case the increments θn in (1.8) are i.i.d. In this case, when defined, E[ξn+1 − ξn | ξn = x] = E θ0 = µ does not depend on x.

If µ ≠ 0, then the strong law of large numbers shows that the walk is transient, and lim_{n→∞} n^{−1} ξ_n = µ, a.s., so the walk escapes to infinity at positive speed. The most subtle case is that of zero drift, when µ = 0. Here, under mild conditions, Polya’s theorem (Theorem 1.2.1) for simple symmetric random walk extends to general spatially homogeneous random walks with zero drift.

Theorem 1.5.2. For a spatially homogeneous random walk ξn on Rd, suppose that E[‖θ0‖²] < ∞, E θ0 = 0, and E[θ0 θ0ᵀ] is positive-definite. Then ξn is recurrent for d ∈ {1, 2} and transient for d ≥ 3.

The positive-definite covariance condition ensures that the increments are not supported on a lower-dimensional subspace.

On the other hand, as we shall see later in the book, in the zero-drift case the walk is diffusive, and ‖ξn‖ is bounded above and below, for all but finitely many n, by √n times logarithmic factors.

How does the situation change if the walk is allowed to be spatially non-homogeneous? In the general, non-homogeneous case, µ(x) = E[θn | ξn = x] will depend on x. Even if µ(x) = 0 for all x, however, the non-homogeneous zero-drift walk can behave completely differently from the homogeneous zero-drift walk.


Theorem 1.5.3. Let ξn be a spatially non-homogeneous random walk on Rd with zero drift, so that µ(x) = 0 for all x.

• If d = 2, then we can exhibit such a walk that is transient.

• If d ≥ 3, then we can exhibit such a walk that is recurrent.

In all cases, we may take these examples to have uniformly bounded increments, as at (1.7).

We emphasize that, for example, in two dimensions, zero drift does not imply recurrence for a non-homogeneous random walk with bounded jumps. This fact is contrary to intuition built from homogeneous random walks, but should not be surprising to readers who have encountered random walks in random environments; for example, Zeitouni [288, pp. 90–91] discusses an example of a transient walk in d = 2 with symmetric increments; see also the examples in Chapter 4.

So non-homogeneous random walks can show anomalous (non-classical) recurrence behaviour. We can reassert some control by imposing additional regularity structure on the second moments of the increments θn = ξn+1 − ξn.

Assuming (1.7), if we view θn as a column vector, then the matrix function
\[
M(x) = E[\theta_n \theta_n^\top \mid \xi_n = x] \qquad (1.9)
\]
is well defined, since for any e ∈ S^{d−1}, |eᵀ θn θnᵀ e| = (e · θn)² ≤ ‖θn‖²; for each x, M(x) is a symmetric, nonnegative-definite matrix. We refer to M(x) defined at (1.9) as the increment covariance matrix; this is a slight abuse of terminology, which more accurately should refer to
\[
E[(\theta_n - \mu(x))(\theta_n - \mu(x))^\top \mid \xi_n = x] = M(x) - \mu(x) \mu(x)^\top,
\]
but in our calculations it is always M(x) that appears.

Suppose that M(x) = M is fixed, for a positive-definite matrix M. This condition is satisfied with M = σ²I for some σ² ∈ (0, ∞), where I is the d × d identity matrix, by a walk with sufficient symmetry in its increments, such as simple symmetric random walk or the Pearson–Rayleigh walk. With this additional regularity structure, the wild behaviour shown in Theorem 1.5.3 is ruled out, and we recapture the result of Theorem 1.5.2.

Theorem 1.5.4. Let ξn be a spatially non-homogeneous random walk on Rd for which (1.7) holds. Suppose that, for all x, µ(x) = 0 and M(x) = σ²I for some σ² ∈ (0, ∞). Then ξn is recurrent for d ∈ {1, 2} and transient for d ≥ 3.


We now describe the relevance of Lamperti’s problem to Theorem 1.5.4. The basic idea is the same as in Section 1.3, but we use Taylor’s theorem in a slightly more sophisticated way.

Consider a random walk ξn satisfying the conditions of Theorem 1.5.4. Let Xn := ‖ξn‖. Then
\[
E[X_{n+1} - X_n \mid \xi_n = x] = E[\, \|x + \theta_n\| - \|x\| \mid \xi_n = x \,],
\]
where, as usual, θn = ξ_{n+1} − ξ_n is the increment of the walk. We now use Taylor’s theorem to expand the expression inside the expectation. Note first that, by a similar argument to (1.3), for any y ∈ Rd with ‖y‖ ≤ B,
\[
\|x + y\| - \|x\| = \|x\| \left[ \left( 1 + \frac{2 y \cdot x + \|y\|^2}{\|x\|^2} \right)^{1/2} - 1 \right]
= \frac{y \cdot x}{\|x\|} + \frac{\|y\|^2}{2\|x\|} - \frac{(y \cdot x)^2}{2\|x\|^3} + O(\|x\|^{-2}),
\]

as ‖x‖ → ∞. We are going to apply this with y = θn. In other words, writing x̂ := x/‖x‖,
\[
\|x + \theta_n\| - \|x\| = \hat{x} \cdot \theta_n + \frac{1}{2\|x\|} \left( \|\theta_n\|^2 - (\hat{x} \cdot \theta_n)^2 \right) + O(\|x\|^{-2}).
\]

It is important to note that this is legitimate because of the uniform jumps bound (1.7), and that the implicit constant in the O( · ) term is non-random, i.e., uniform in θn; we use this notation strictly for such circumstances. Thus we may take expectations to obtain
\[
E[X_{n+1} - X_n \mid \xi_n = x]
= E[\hat{x} \cdot \theta_n \mid \xi_n = x] + \frac{1}{2\|x\|} E\left[ \|\theta_n\|^2 - (\hat{x} \cdot \theta_n)^2 \mid \xi_n = x \right] + O(\|x\|^{-2}).
\]

With the notation µ(x) = E[θn | ξn = x] and M(x) = E[θn θnᵀ | ξn = x], using linearity of expectation and a little linear algebra we can write this as
\[
E[X_{n+1} - X_n \mid \xi_n = x] = \hat{x} \cdot \mu(x) + \frac{1}{2\|x\|} \left( \operatorname{tr} M(x) - \hat{x}^\top M(x) \hat{x} \right) + O(\|x\|^{-2}).
\]
Under the conditions of Theorem 1.5.4, we have µ(x) = 0, tr M(x) = dσ², and eᵀ M(x) e = σ² for any e ∈ S^{d−1}, so that
\[
E[X_{n+1} - X_n \mid \xi_n = x] = \frac{(d-1)\sigma^2}{2\|x\|} + O(\|x\|^{-2}).
\]


For symmetric simple random walk, σ² = 1/d, and we recover (1.4), but now we have used only the first two moments of the increments. Similarly,
\[
E[(X_{n+1} - X_n)^2 \mid \xi_n = x] = \sigma^2 + O(\|x\|^{-1}),
\]
which generalizes (1.5). Again, Lamperti’s approach yields transience when (d − 1)σ² > σ², i.e., d > 2.

From this perspective, the proof of Theorem 1.5.4 is no more difficult than the proof of Theorem 1.5.2, at least if we assume the jumps bound (1.7); this robustness is a feature of the Lyapunov function method.
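For the Pearson–Rayleigh walk in d = 2 (so σ² = 1/2), the mean drift of ‖ξn‖ derived above is σ²(d − 1)/(2‖x‖) + O(‖x‖^{−2}) = 1/(4‖x‖) + O(‖x‖^{−2}). Since θn is uniform on the unit circle, this can be checked by direct one-dimensional quadrature; the sketch below (our own illustration) uses the rectangle rule, which for a smooth periodic integrand is extremely accurate.

```python
import math

def mean_radial_increment(r, m=200000):
    """Mean of ||x + theta|| - ||x|| when theta is uniform on the unit
    circle and ||x|| = r, computed by the periodic rectangle rule
    (equivalent to the trapezoidal rule for a periodic integrand)."""
    total = 0.0
    for k in range(m):
        phi = 2.0 * math.pi * k / m
        total += math.sqrt(r * r + 2.0 * r * math.cos(phi) + 1.0) - r
    return total / m
```

At r = 50 the result is very close to 1/(4 × 50) = 0.005, in agreement with the drift formula.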

Later in the book, we shall see how to relax the jumps bound (1.7) in favour of a moments condition. The issue is clear: Taylor’s theorem for ‖x + θn‖ is only useful if ‖θn‖ is small compared to ‖x‖. To overcome this obstacle, one uses a basic truncation idea: partition the increment as
\[
X_{n+1} - X_n = (X_{n+1} - X_n) \mathbf{1}\{ \|\theta_n\| \leq \|x\|^\gamma \} + (X_{n+1} - X_n) \mathbf{1}\{ \|\theta_n\| > \|x\|^\gamma \}
\]
for some constant γ ∈ (0, 1); then the first term can be expanded using Taylor’s theorem, while the second term can be bounded in expectation using a version of Markov’s inequality and a bound on the moments of ‖θn‖. We shall see these ideas carried through later in the book.
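The tail term is controlled by a Markov-type bound: on the event {V > t} one has V ≤ V^p/t^{p−1} for any p > 1, so E[V 1{V > t}] ≤ E[V^p]/t^{p−1}. A toy numerical check (our own illustration) with an exactly computable discrete step-length distribution:

```python
def tail_bound_check(values, probs, t, p):
    """For a discrete step-length distribution, compare the truncated
    tail expectation E[V 1{V > t}] with the Markov-type bound
    E[V^p] / t^(p-1), valid since V <= V^p / t^(p-1) on {V > t}."""
    tail = sum(v * q for v, q in zip(values, probs) if v > t)
    bound = sum(v ** p * q for v, q in zip(values, probs)) / t ** (p - 1)
    return tail, bound
```

For example, with step lengths 1, 2, 10, 100 taken with probabilities 0.5, 0.3, 0.15, 0.05, truncation level t = 5, and p = 2, the tail expectation is 6.5 while the bound is E[V²]/5 = 103.34: crude, but finite, which is all the argument needs.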

1.6 Angular asymptotics

The recurrence and transience properties described in the previous sectionare properties of the radial component of the process; it is also natural toinvestigate asymptotics of the angular component.

We say that a random walk ξn has a limiting direction if limn→∞ ξnexists in Sd−1. Here and elsewhere x denotes the unit vector x/‖x‖ whenx 6= 0; we adopt the convention 0 := 0. A walk with a limiting directioneventually remains in an arbitrarily narrow cone with that direction as itsaxis; when is this the case?

Here we review briefly the situation for classical random walks. Suppose that ξn is a spatially homogeneous random walk, so that the increments θn in (1.8) are i.i.d.

Theorem 1.6.1. For a spatially homogeneous random walk ξn on Rd, suppose that E ‖θ0‖ < ∞; let µ = E θ0 ∈ Rd.

(i) If µ ≠ 0, then P[lim_{n→∞} ξ̂n = µ̂] = 1.

(ii) If µ = 0 and P[θ0 = 0] < 1, then P[lim_{n→∞} ξ̂n exists] = 0.


Once spatial homogeneity is relaxed, the situation becomes much more interesting.

1.7 Centrally biased random walks

To probe more precisely the recurrence–transience phase transition, it is natural to consider the case in which µ(x) → 0 as ‖x‖ → ∞: this is known as the asymptotically zero drift regime.

For definiteness, consider a 2-dimensional non-homogeneous random walk whose increments have a fixed covariance structure. We have seen (Theorem 1.5.4) that if µ(x) = 0 for all x, the random walk is recurrent. On the other hand, if the walk has uniformly positive drift radially away from 0, i.e., µ(x) = ε x̂ for some ε > 0, then it can be shown that the walk is transient with a linear rate of escape.

The natural setting in which to look for finer behaviour is the case in which the magnitude of the drift tends to zero as the distance from the origin increases.

We focus on a concrete example of the asymptotically zero drift regime, in which the drift is always in the radial direction. Consider a non-homogeneous random walk on Rd. Let r0 be a finite positive constant. Suppose that, for ρ ∈ R,
\[
\mu(x) = \rho \frac{\hat{x}}{\|x\|}, \quad \text{for all } x \text{ with } \|x\| \geq r_0; \qquad (1.10)
\]
in other words, the mean drift is radially towards or away from 0, vanishing in magnitude as the distance from 0 increases. Such centrally biased random walks were studied by Lamperti; one can view these processes as perturbations of zero-drift random walks. It is natural to investigate what magnitude of perturbation is needed to change the asymptotic behaviour of the process.

For simplicity, we take the state space of our random walk to be Zd. We also need to impose an assumption to ensure that the walk cannot get trapped in a subset of the state space. For simplicity, at this point we suppose that the increments are uniformly elliptic: there exists ε > 0 such that
\[
\inf_{u \in \mathbb{S}^{d-1}} P[\, (\xi_{n+1} - \xi_n) \cdot u \geq \varepsilon \mid \xi_n = x \,] \geq \varepsilon, \quad \text{for all } x \in \mathbb{Z}^d. \qquad (1.11)
\]
We will see later in the book (Example 3.3.5) that uniform ellipticity (1.11) implies lim sup_{n→∞} ‖ξn‖ = ∞, a.s., so questions concerning the asymptotic behaviour of the walk are non-trivial.


The following result, due to Lamperti, can be viewed as a generalization of the one-dimensional Theorem 1.3.1. Recall the definition of the increment covariance matrix M(x) from (1.9).

Theorem 1.7.1. Let ξn be a spatially non-homogeneous random walk on Zd for which (1.7) and (1.11) hold, with asymptotically zero drift of the form (1.10). Suppose that M(x) = σ²I for some σ² ∈ (0, ∞). Then there exist constants c1 = c1(d, σ²) and c2 = c2(d, σ²), with −∞ < c1 < c2 < ∞, for which:

• If ρ < c1, then ξn is positive-recurrent.

• If c1 ≤ ρ ≤ c2, then ξn is null-recurrent.

• If ρ > c2, then ξn is transient.

In particular, for d ∈ {1, 2}, c2 ≥ 0, while for d ≥ 3, c2 < 0.

This result implies that by adding an appropriate drift of order 1/‖x‖ away from 0 to a zero-drift random walk in d ∈ {1, 2}, we can change the walk from recurrent to transient. On the other hand, by adding an appropriate drift of order 1/‖x‖ towards 0 to a zero-drift random walk in d ≥ 3, we can change the walk from transient to recurrent (even positive-recurrent).

Theorem 1.7.1 shows that the case where ‖µ(x)‖ = O(‖x‖^{−1}) is critical from the point of view of the recurrence classification of the non-homogeneous random walk. It also turns out that several other important characteristics of the process undergo phase transitions in this regime; in this sense, the centrally biased random walk is a near-critical stochastic system. We will see this near-critical behaviour exhibited by several quantitative characteristics later in the book, including passage-time distributions, stationary measures, and almost-sure scaling exponents; typical of near-criticality is that scaling is polynomial and distributions are heavy-tailed.

The regime ‖µ(x)‖ = O(‖x‖^{−1}) also turns out to be critical from the point of view of the angular asymptotics in the sense of Section 1.6. Here, although the walk may be transient or recurrent by Theorem 1.7.1, the following result shows that the walk has no limiting direction.

Theorem 1.7.2. Let ξn be a spatially non-homogeneous random walk on Zd, d ≥ 2, for which (1.7) and (1.11) hold. Suppose that M(x) = σ²I for some σ² ∈ (0, ∞) and that ‖µ(x)‖ = O(‖x‖^{−1}). Then P[lim_{n→∞} ξ̂n exists] = 0.

On the other hand, a radial drift of magnitude of larger order than ‖x‖^{−1} can lead to completely different behaviour, in which the random walk is transient with a limiting direction, which can (unlike in Theorem 1.6.1) be random.


In fact, we have the following strong law of large numbers:

Theorem 1.7.3. Let ξn be a spatially non-homogeneous random walk on Zd for which (1.7) and (1.11) hold, with drift of the form
\[
\mu(x) = \rho \frac{\hat{x}}{\|x\|^\beta}, \quad \text{for all } x \text{ with } \|x\| \geq r_0,
\]
for some β ∈ (0, 1), ρ > 0, and r0 ∈ R+. Suppose that M(x) = σ²I for some σ² ∈ (0, ∞). Then, as n → ∞,
\[
\frac{\xi_n}{n^{1/(1+\beta)}} \to v, \quad \text{a.s.},
\]
for a random vector v ∈ Rd of fixed (deterministic, non-zero) magnitude. In particular, lim_{n→∞} ξ̂n = v̂, a.s., a random limiting direction.

See Figure 1.2 for some simulations illustrating these results.
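A quick simulation in the spirit of Figure 1.2 can be run with a few lines of code. The construction below is our own illustrative variant, not the text’s exact model: each step adds a deterministic radial drift of magnitude ρ/‖x‖^β to a unit step in a uniformly random direction.

```python
import math
import random

def biased_walk(n, beta, rho=1.0, seed=0, r0=1.0):
    """A centrally biased walk in R^2 (illustrative construction):
    each increment is a radial drift of magnitude rho/||x||^beta
    (applied only when ||x|| >= r0) plus a unit-length step in a
    uniformly random direction.  Returns the final position."""
    rng = random.Random(seed)
    x, y = 1.0, 0.0  # start one unit away from the origin
    for _ in range(n):
        r = math.hypot(x, y)
        if r >= r0:
            drift = rho / r ** beta
            x += drift * x / r
            y += drift * y / r
        phi = rng.uniform(0.0, 2.0 * math.pi)
        x += math.cos(phi)
        y += math.sin(phi)
    return x, y
```

With β = 1/2 the outward drift dominates the diffusive fluctuations, and after n steps the walker is at distance of rough order n^{1/(1+β)} = n^{2/3} from the origin, consistent with the super-diffusive law of large numbers of Theorem 1.7.3.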

Figure 1.2: Two 2-dimensional centrally biased random walk simulations, each run for 4 × 10⁴ steps: with drift magnitude ϕ(r) = r^{−1} (left) and ϕ(r) = r^{−1/2} (right). Theory says that the walk on the left will visit every cone eventually, while the walk on the right has a limiting direction. Both walks are transient; the walk on the left is diffusive, while the walk on the right satisfies a super-diffusive law of large numbers.

1.8 Outline of some motivation

TO DO: lots to add. General applications of random walks. See alsosurveys [284]. . .


Models of animal behaviour going back to Pearson [231]. Macroscopic [117, 267] and microscopic.

Theory of sound e.g. [239] [130].

1.9 Outline of some mathematical applications

Maximal branching processes going back to Lamperti [176, 177]. Recently [11].

Bibliographical notes

Section 1.2

Theorem 1.2.1 is due to Polya. Versions of the usual proof, which is essentially Polya’s, can be found for example in Chapter 10 of [219]. Simple random walk is undoubtedly the most studied random walk model; some of the key results from the vast literature are covered in the books of Feller [86] and Révész [Rev13], for example.

This deceptively simple model shows deep and interesting phenomena. For example, there is a remarkable closed-form expression for p3, given by
\[
p_3 = 1 - \frac{1}{u_3} \approx 0.340537,
\]
where
\[
u_3 = \frac{3}{(2\pi)^3} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \frac{\mathrm{d}x \, \mathrm{d}y \, \mathrm{d}z}{3 - \cos x - \cos y - \cos z}
= \frac{\sqrt{6}}{32\pi^3} \, \Gamma\!\left(\tfrac{1}{24}\right) \Gamma\!\left(\tfrac{5}{24}\right) \Gamma\!\left(\tfrac{7}{24}\right) \Gamma\!\left(\tfrac{11}{24}\right);
\]
a version of the first equality for u3 was given by McCrea and Whipple [202]; the second is a result of Glasser and Zucker [105].

Section 1.3

The approach described in this section is due to Lamperti [173, 175], and is discussed at length in Chapter 3 of this book. The statement of the recurrence classification Theorem 1.3.1 is made precise in Theorem 3.2.3 (in the Markovian case) and in Theorems 3.5.1 and 3.5.2 (in the more general case where the Markov property does not hold).


The observation that ‖Sn‖ is not Markov is an important instance of a general phenomenon: if Yn is a Markov chain, then f(Yn) need not be Markov. In the particular case where the state-space is Zd and f(x) = ‖x‖, it is clear what goes ‘wrong’ for ‖Sn‖, as illustrated in Figure 1.1: the function x ↦ ‖x‖ is many-to-one in some places, with pre-images that contain ‘distinguishable’ points. In this case, one can rectify the issue by considering f(x) = ‖ζ + x‖ for some ζ ∈ Rd with irrational coordinates; then f is one-to-one over Zd, and ‖ζ + Sn‖ is Markov. So it is misleading to overplay the loss of the Markov property in the case of ‖Sn‖. However, ‖Sn‖ is certainly a more natural object than ‖ζ + Sn‖, and we will encounter other situations (more general state-spaces, more complicated Lyapunov functions) where it is less straightforward to rescue the Markov property in such a way.

The general question of when a function f applied to a Markov chain Yn yields a Markov chain f(Yn) is a topic that has received much interest over the years, under the heading of Markov functions. Here we mention, for example, [248, 179, 246, 143].

Section 1.5

Theorem 1.5.2 is a special case of the famous Chung–Fuchs theorem, which gives a criterion for recurrence/transience for general spatially homogeneous random walks on Rd; see [37] or Chapter 9 of [137]. In dimension d = 1, the second moments condition can be replaced by the weaker condition E|θ0| < ∞, while in dimension d ≥ 3, the moments condition can be dispensed with altogether; in these cases one must reformulate the covariance condition as a condition on the support of the increment distribution being of full dimension.

Theorem 1.5.3 exhibits the possible anomalous recurrence properties of non-homogeneous random walks; this issue is explored in detail in Chapter 4. The statement of Theorem 1.5.3 is exhibited by both the elliptic random walks of Theorem 4.2.1 (see also [101]) and the controlled random walks of Section 4.3 (see also [233]).

Although certainly appreciated by experts, Theorem 1.5.3 is perhaps not as widely known as it might be. Zeitouni (pp. 91–92 of [289]) describes an example of a transient zero-drift random walk on Z2, and states that the idea “goes back to Krylov (in the context of diffusions)”.

Theorem 1.5.4 shows that under suitable regularity conditions the Chung–Fuchs theorem extends to spatially non-homogeneous random walks with zero drift and a fixed increment covariance. The proof of Theorem 1.5.4 is given as Example 3.5.4. Assuming that M(x) = M for a positive definite matrix M, the assumption that M = σ²I is no real loss of generality in the statement of Theorem 1.5.4, since a suitable affine transformation of Rd reduces the one case to the other: again, see Example 3.5.4. A stronger version of Theorem 1.5.4 is Theorem 4.1.3, proved in Chapter 4, which relaxes the uniform bound on the increments of the walk to a moments condition.

Section 1.6

We do not claim that Theorem 1.6.1 is new, but we could not find a reference. Thus we present the proof here.

Proof of Theorem 1.6.1. Suppose first that µ ≠ 0. Then the strong law of large numbers shows that n^{−1}ξn → µ, a.s., and n^{−1}‖ξn‖ → ‖µ‖, a.s. The statement in part (i) now follows, since

lim_{n→∞} ξ̂n = lim_{n→∞} (n^{−1}ξn)/(n^{−1}‖ξn‖) = µ̂, a.s.

On the other hand, suppose µ = 0. Observe that, by exchangeability, the Hewitt–Savage zero-one law (see e.g. [137, p. 53]) shows that P[lim_{n→∞} ξ̂n exists] ∈ {0, 1}, and if the limit exists with probability 1, it is a.s. degenerate.

Suppose, for the purpose of deriving a contradiction, that

P[ lim_{n→∞} ξ̂n exists ] > 0,

in which case the probability must be equal to 1. Since ξ̂n ∈ S^{d−1} ∪ {0}, and the limit can only be 0 if ξn = 0 for all but finitely many n, which is impossible since P[θ0 = 0] < 1, it follows that P[lim_{n→∞} ξ̂n = e] = 1 for some deterministic e ∈ S^{d−1}. Then

lim_{n→∞} (e · ξn)/‖ξn‖ = ‖e‖² = 1, a.s.,

which implies that lim inf_{n→∞} e · ξn ≥ 0, a.s. But e · ξn = ∑_{k=0}^{n−1} e · θk is a one-dimensional random walk with mean increment E[e · θ0] = e · µ = 0, which means that lim inf_{n→∞} e · ξn = −∞, a.s. (see e.g. [137, p. 167]). This gives the desired contradiction, and yields the statement in part (ii).


Section 1.7

Centrally biased random walks are discussed in detail in Chapter 4; refer to the notes to that chapter for full bibliographic details. The recurrence classification, Theorem 1.7.1, is contained in Theorem 4.4.8. The result on angular asymptotics, Theorem 1.7.2, is contained in Theorem 4.4.5. The result on the limiting direction in the supercritical case, Theorem 1.7.3, is contained in Theorem 4.4.2.


Chapter 2

General results for hitting times

2.1 Definitions

We assume that the reader is familiar with the basic concepts of probability theory, including convergence of random variables and uniform integrability.

In this section we recall some basic definitions related to Markov processes with discrete time and (sub-, super-)martingales, and introduce some of our notational conventions.

Definition 2.1.1 (Basic concepts for discrete-time stochastic processes). In the following, all random variables are defined on a common probability space (Ω, F, P). Random variables usually take values in R or Rd, possibly extended to allow components ±∞ to handle limiting variables. We write E for expectation corresponding to P, which will be applied to real-valued random variables, or componentwise in the case of random vectors in Rd.

• We take as a state space a measurable space (Σ, E). For technical reasons (existence of Markov transition functions), one requires that (Σ, E) is a Borel space; we will always assume that any state space is Borel. In this book, typically Σ ⊆ Rd; often, Σ ⊆ Rd is countable. In these cases the σ-field E is the obvious one (i.e., the appropriate Borel sets B(Σ) = {A ∩ Σ : A ∈ B(Rd)}, which reduces to the power set 2^Σ in the countable case), and will usually not be mentioned explicitly; then we use the term ‘state space’ to refer simply to the set Σ.

• A discrete-time stochastic process with state space (Σ, E) is a sequence of random variables Xn : (Ω, F) → (Σ, E) indexed by n ∈ Z+. We write such sequences as (Xn, n ≥ 0), with the understanding that the time index n is always an integer.

• A filtration (of F) is a sequence of σ-fields (Fn, n ≥ 0) such that Fn ⊆ Fn+1 ⊆ F for all n ≥ 0. We set F∞ := σ(∪_{n≥0} Fn) ⊆ F.

• A stochastic process (Xn, n ≥ 0) is adapted to a filtration (Fn, n ≥ 0) if Xn is Fn-measurable for all n ∈ Z+.

• It is convenient, for definiteness, to introduce the notation X∞ := lim sup_{n→∞} Xn, computed componentwise if Xn ∈ Rd, so that X∞ has a meaning (as an extended random variable in Rd) even when the limit does not exist. Note that if (Xn, n ≥ 0) is adapted to (Fn, n ≥ 0), then X∞ is F∞-measurable.

• For a (possibly infinite) random variable τ ∈ Z+ ∪ {∞}, the random variable Xτ is (as the notation suggests) equal to Xn on {τ = n} for finite n ∈ Z+, and equal to X∞ := lim sup_{n→∞} Xn on {τ = ∞}.

• A (possibly infinite) random variable τ ∈ Z+ ∪ {∞} is a stopping time with respect to a filtration (Fn, n ≥ 0) if {τ = n} ∈ Fn for all n ≥ 0 (and n = ∞). If τ is a stopping time, then so are τ + 1, τ + 2, etc.

• If τ is a stopping time, the corresponding σ-field Fτ consists of all events A ∈ F∞ such that A ∩ {τ ≤ n} ∈ Fn for all n ≥ 0 (and n = ∞). Note that Fτ ⊆ F∞; events in Fτ include {τ = ∞}, as well as {Xτ ∈ B} for B ∈ E, and so on.

For any process (Xn, n ≥ 0), one can define a (minimal) filtration to which this process is adapted via Fn = σ(X0, X1, . . . , Xn), to give the so-called natural filtration.

If (Xn, n ≥ 0) is a process adapted to a filtration (Fn, n ≥ 0) of F and τ is a stopping time, then, from the definitions above, (Fn, n ≥ τ) is also a filtration of F, to which the sequence (Xn, n ≥ τ) is adapted, where on the event {τ = ∞} the latter notation stands for (X∞, X∞, . . .).

To keep the notation concise, we will frequently write Xn and Fn instead of (Xn, n ≥ 0) and (Fn, n ≥ 0) (and so on) when no confusion will arise.

Definition 2.1.2 (Martingales, submartingales, supermartingales). A real-valued stochastic process Xn adapted to a filtration Fn is a martingale (with respect to the given filtration) if, for all n ≥ 0,

(i) E |Xn| <∞, and


(ii) E[Xn+1 −Xn | Fn] = 0.

If in (ii) “=” is replaced by “≥” (respectively, “≤”), then Xn is called a submartingale (respectively, supermartingale).

We use the term semimartingale in a less precise way, to encompass not only martingales, submartingales and supermartingales, but also adapted processes satisfying ‘drift’ conditions of a similar kind to (ii) above; perhaps, for instance, only locally, i.e., on {Xn ∉ A} for some measurable set A.

Another measure to avoid notational clutter is that we will frequently drop ‘almost surely’ qualifiers in statements asserting equality or inequality of random variables, as in part (ii) above. Moreover, we will often not mention explicitly with respect to which filtration the process is a (sub-, super-)martingale, since normally there will be no possibility of ambiguity.

Consider an Fn-adapted stochastic process Xn taking values in a state space (Σ, E). For A ∈ E we define

τ_A := min{n ≥ 0 : Xn ∈ A}, and τ^+_A := min{n ≥ 1 : Xn ∈ A};

we may refer to either τ_A or τ^+_A as the hitting time of A (also called the passage time into A). Then τ^+_A and τ_A are Fn-stopping times, given that A ∈ E (see e.g. [137, p. 123]).

In the case where Σ ⊆ R we will also frequently use notation for the hitting times of half-lines; namely, for x ∈ R we set

ρx := τ_{Σ∩[x,∞)} = min{n ≥ 0 : Xn ≥ x},
λx := τ_{Σ∩(−∞,x]} = min{n ≥ 0 : Xn ≤ x},

for right and left passage times, and, for x ≥ 0,

σx := τ_{Σ\(−x,x)} = min{n ≥ 0 : |Xn| ≥ x} = λ_{−x} ∧ ρx.

In the case where Σ ⊆ R+, obviously ρx = σx for x ≥ 0; in this case our preference is to use the notation σx.
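To make these conventions concrete, here is a small illustrative sketch (ours, not the book's) computing the passage times along a finite sample path, with the convention that the minimum of an empty set is +∞:

```python
import math

def tau(path, A):
    """tau_A: first n >= 0 with path[n] in A; min of empty set is +infinity."""
    return next((n for n, v in enumerate(path) if v in A), math.inf)

def rho(path, x):
    """Right passage time: first n >= 0 with X_n >= x."""
    return next((n for n, v in enumerate(path) if v >= x), math.inf)

def lam(path, x):
    """Left passage time: first n >= 0 with X_n <= x."""
    return next((n for n, v in enumerate(path) if v <= x), math.inf)

def sigma(path, x):
    """First exit time from (-x, x); equals lam(-x) minimum rho(x)."""
    return min(lam(path, -x), rho(path, x))

path = [0, 1, 2, 1, 0, -1, -2, -1]
print(rho(path, 2), lam(path, -2), sigma(path, 2))  # 2 6 2
```

For a path confined to Z+ and x > 0, λ_{−x} is infinite, so ρx and σx coincide, as the text notes.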

Next we recall some fundamental definitions on Markov processes.

Definition 2.1.3 (Markov chains).

• An Fn-adapted process Xn taking values in a state space (Σ, E) is a Markov chain if, for any B ∈ E, any n ≥ 0, and any m ≥ 1,

P[Xn+m ∈ B | Fn] = P[Xn+m ∈ B | Xn], a.s. (2.1)

This is the Markov property.


• If there is no dependence on n in (2.1), the Markov chain is homogeneous in time (or time-homogeneous). Unless explicitly stated otherwise, all Markov chains considered in this book are assumed to be homogeneous in time. In this case, the Markov property (2.1) reads

P[Xn+m ∈ B | Fn] = Pm(Xn, B), a.s., (2.2)

where Pm : Σ× E → [0, 1] is the Markov transition function.

• We use the shorthand Px[ · ] = P[ · | X0 = x] and Ex[ · ] = E[ · | X0 = x] for probability and expectation for the time-homogeneous Markov chain starting from initial state x ∈ Σ.

• If the state space Σ is countable (i.e., finite, or countably infinite), we call Xn a countable Markov chain. For a countable state space Σ, we always suppose that E = 2^Σ, the power set of Σ.

• If the Markov chain is time-homogeneous and countable, we write p(x, y) := P[X1 = y | X0 = x] for the one-step transition probabilities of the Markov chain.

• A time-homogeneous, countable Markov chain is irreducible if for all x, y ∈ Σ there exists m = m(x, y) ∈ N such that Px[Xm = y] > 0. For an uncountable state space Σ, notions of irreducibility are more technical, and not discussed in this book. For our purposes, if we refer to an irreducible Markov chain, it is implicit that the Markov chain is time-homogeneous and countable.

• For an irreducible Markov chain, we define its period as the greatest common divisor of {n ∈ N : Px[Xn = x] > 0} (it is straightforward to show that it does not depend on the choice of x ∈ Σ). An irreducible Markov chain with period 1 is called aperiodic.
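As an illustration of the period (the chain and helper functions below are our own, not from the book), one can compute gcd{n : p^n(x, x) > 0} for a small chain by powering the transition matrix; simple random walk on an even cycle has period 2, while on an odd cycle it is aperiodic:

```python
from math import gcd

def cycle_walk(N):
    """Transition matrix of SRW on Z/NZ: step +1 or -1 with probability 1/2."""
    P = [[0.0] * N for _ in range(N)]
    for x in range(N):
        P[x][(x + 1) % N] = 0.5
        P[x][(x - 1) % N] = 0.5
    return P

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def period(P, x=0, nmax=20):
    """gcd of {n <= nmax : P^n(x, x) > 0}; nmax is ample for these small chains."""
    d, Q = 0, P
    for n in range(1, nmax + 1):
        if Q[x][x] > 0:
            d = gcd(d, n)
        Q = matmul(Q, P)
    return d

print(period(cycle_walk(4)), period(cycle_walk(5)))  # 2 1
```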

Suppose now that Xn is a countable Markov chain; recall the definitions of the hitting times τ^+_A and τ_A, A ⊆ Σ. For x ∈ Σ, we use the notation τ^+_x := τ^+_{{x}} and τ_x := τ_{{x}} for hitting times of one-point sets. Note that for any x ∈ A, Px[τ_A = 0] = 1, while τ^+_A ≥ 1 is then the return time to A. Also note that Px[τ_A = τ^+_A] = 1 for all x ∈ Σ \ A.

Definition 2.1.4. For a countable Markov chain Xn, a state x ∈ Σ is called

• recurrent if Px[τ^+_x < ∞] = 1;

• transient if Px[τ^+_x < ∞] < 1.


A recurrent state x is classified further as

• positive recurrent if Ex τ^+_x < ∞;

• null recurrent if Ex τ^+_x = ∞.

It is straightforward to show that the four properties in Definition 2.1.4 are class properties, which entails the following statement.

Proposition 2.1.5. For an irreducible Markov chain, if a state x ∈ Σ is recurrent (respectively, positive recurrent, null recurrent, transient) then all states in Σ are recurrent (respectively, positive recurrent, null recurrent, transient).

By the above fact, it is legitimate to call an irreducible Markov chain itself recurrent (positive recurrent, null recurrent, transient).

A basic fact that we need about positive recurrent Markov chains is the following.

Theorem 2.1.6. Suppose that Xn is an irreducible, positive recurrent Markov chain. Define the stationary measure (π(x), x ∈ Σ) by π(x) = (Ex τ^+_x)^{−1}.

(i) It holds that ∑_{x∈Σ} π(x) = 1 (so π defines a probability measure on Σ), π(x) > 0 for all x ∈ Σ, and

π(x) = ∑_{y∈Σ} π(y) p(y, x), for all x ∈ Σ.

(This implies that if X0 is distributed according to the stationary measure π, then so is Xn for all n.)

(ii) If the Markov chain is also aperiodic, then Px[Xn = y] → π(y) as n → ∞ for all x, y ∈ Σ. For periodic chains, convergence also takes place, but only in the Cesàro sense: n^{−1} Ex ∑_{k=0}^{n−1} 1{Xk = y} → π(y).
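The relation π(x) = (Ex τ^+_x)^{−1} can be checked by hand on a toy two-state chain (our own example, not from the book) with p(0, 1) = a and p(1, 0) = b: starting from 0, the return time is 1 with probability 1 − a, and otherwise 1 plus a Geometric(b) sojourn at 1, so E0 τ^+_0 = 1 + a/b = (a + b)/b, matching π(0) = b/(a + b):

```python
a, b = 0.3, 0.8  # p(0,1)=a, p(0,0)=1-a, p(1,0)=b, p(1,1)=1-b

# stationary distribution from the balance equation pi(0)*a = pi(1)*b
pi0, pi1 = b / (a + b), a / (a + b)

# stationarity: pi P = pi, coordinate by coordinate
assert abs(pi0 * (1 - a) + pi1 * b - pi0) < 1e-12
assert abs(pi0 * a + pi1 * (1 - b) - pi1) < 1e-12

# mean return times: E_0 tau_0^+ = 1 + a/b, and symmetrically E_1 tau_1^+ = 1 + b/a
E0, E1 = 1 + a / b, 1 + b / a
assert abs(pi0 - 1 / E0) < 1e-12 and abs(pi1 - 1 / E1) < 1e-12
print("pi =", (pi0, pi1))
```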

Example 2.1.7. Let p ∈ (0, 1), and assume that Z1, Z2, Z3, . . . are i.i.d. random variables with P[Z1 = 1] = 1 − P[Z1 = −1] = p. Let F0 = {∅, Ω}, the trivial σ-field, and Fn = σ(Z1, . . . , Zn) for n ≥ 1. We set S0 = 0 and Sn = Z1 + · · · + Zn for n ≥ 1; the Fn-adapted process Sn is a one-dimensional simple random walk (SRW) with parameter p. In the case p = 1/2, we say that Sn is (one-dimensional) symmetric SRW, or sometimes just SRW, without any adjectives.

The following observations are straightforward.


• Sn is a Markov chain with countable state space Z, and transition probabilities p(x, x + 1) = 1 − p(x, x − 1) = p for x ∈ Z;

• this Markov chain is irreducible and periodic with period 2;

• with respect to the filtration Fn, the process Sn is a supermartingale for p ≤ 1/2, a submartingale for p ≥ 1/2, and a martingale for p = 1/2;

• for the symmetric SRW, S_n^2 is a submartingale, and S_n^2 − n is a martingale (in fact, S_n^2 = (S_n^2 − n) + n is the Doob decomposition of S_n^2: see Theorem 2.3.1 below);

• (1 − p)((1 − p)/p)^{x−1} + p((1 − p)/p)^{x+1} = ((1 − p)/p)^x for all x ∈ Z, so the process ((1 − p)/p)^{Sn} is a martingale (this fact, of course, has non-trivial consequences only for p ≠ 1/2). ◁
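The one-step computations behind the last two bullets can be checked numerically (a sketch of ours, not from the book): the conditional mean of each candidate martingale is unchanged by a single step of the walk.

```python
def one_step_mean(f, x, p):
    """E[f(S_{n+1}) | S_n = x] for SRW with up-probability p."""
    return p * f(x + 1) + (1 - p) * f(x - 1)

for p in (0.3, 0.5, 0.8):
    r = (1 - p) / p
    for x in range(-3, 4):
        # ((1-p)/p)^{S_n} is a martingale for every p
        assert abs(one_step_mean(lambda y: r**y, x, p) - r**x) < 1e-9

# for symmetric SRW, S_n^2 - n is a martingale: E[S_{n+1}^2 | S_n = x] = x^2 + 1
for x in range(-3, 4):
    assert abs(one_step_mean(lambda y: y * y, x, 0.5) - (x * x + 1)) < 1e-12
print("martingale identities verified")
```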

Example 2.1.8. Now, consider a modification Sn of the process of the previous example, which lives on Σ = Z+ := {0, 1, 2, . . .}. Again, fix a parameter p ∈ (0, 1); for x > 0 we keep the same transition probabilities p(x, x + 1) = 1 − p(x, x − 1) = p, and we set p(0, 1) = 1; we will call this process the (reflected) simple random walk on the half-line. Then, for p ≥ 1/2 this process is still a submartingale, but one loses the supermartingale property for p ≤ 1/2 (as well as the martingale property for p = 1/2) because of the reflection at 0. However, these properties are readily recovered if we consider the stopped process S_{n∧τ0} instead of Sn itself. ◁

When the desired (sub-, super-)martingale property breaks down in some region, often it is possible (and useful) to fix it by considering a modification of the original process by means of a suitable stopping time.

The notions of recurrence and transience given in Definition 2.1.4 depend only on the structure of Σ as a set (i.e., are purely ‘topological’). In the context of this book, geometric consequences of these notions are important; we are usually concerned with a given embedding of Σ into Euclidean space Rd.

A set A ⊆ Rd is bounded if sup_{x∈A} ‖x‖ < ∞; otherwise it is unbounded (recall the convention sup ∅ = 0). A set A ⊆ Rd is locally finite if #(A ∩ B) < ∞ for all bounded B ⊂ Rd. Of course, a locally finite set is countable, and an unbounded locally finite set is countably infinite.

We state here some basic results for Markov chains, which follow from more general results whose proofs are presented later, in Section 3.6.


Theorem 2.1.9. Let Xn be an irreducible time-homogeneous Markov chain on an unbounded, countable state space Σ ⊂ Rd.

(i) If Xn is recurrent, then, a.s.,

lim inf_{n→∞} ‖Xn‖ = inf_{x∈Σ} ‖x‖, and lim sup_{n→∞} ‖Xn‖ = ∞.

(ii) If Σ is locally finite and Xn is transient, then limn→∞ ‖Xn‖ =∞, a.s.

Without local finiteness of Σ, part (ii) of Theorem 2.1.9 may fail. To see this, note that a countable set S ⊆ Rd is locally finite if and only if it contains no finite limit points. Indeed, it is easy to see that if S ⊆ Rd has a finite limit point, it is not locally finite. On the other hand, suppose that S ⊆ Rd is not locally finite. Then there is some ball B0 of radius r0 = 1 such that #(B0 ∩ S) = ∞. Cover B0 by a finite number of balls of radius r1 = 2^{−1} whose centres lie in B0. At least one of the balls in this covering contains infinitely many points of S; let B1 be one such ball. Iterating this procedure gives a decreasing sequence of balls Bn, each containing infinitely many points of S, in which Bn has radius 2^{−n}. Let xn ∈ Rd be the centre of Bn. Then ‖xn − xn+1‖ ≤ 2^{−n}, so lim_{n→∞} xn = x ∈ Rd exists. For any ε > 0, we can take n large enough so that ‖x − xn‖ < ε/2 and rn < ε/2, so Bn (and hence an infinite subset of S) is contained in a ball of radius ε centred at x. So x is a finite limit point of S.

Contained in Theorem 2.1.9 is the following basic fact.

Corollary 2.1.10. Let Xn be an irreducible time-homogeneous Markov chain on an unbounded, locally finite state space Σ ⊂ Rd. Then lim sup_{n→∞} ‖Xn‖ = ∞, a.s.

We end this section with a technical lemma, which the standard notation suggests is obvious, and will often be used without comment.

Lemma 2.1.11. Let Fn be a filtration and τ a stopping time.

(i) If X is an integrable random variable, then for any n ∈ Z+,

E[X | Fτ]1{τ = n} = E[X | Fn]1{τ = n} = E[X1{τ = n} | Fn].

(ii) If Xn is an Fn-adapted process such that lim_{n→∞} Xn = X∞ exists a.s., then lim_{n→∞} Xn∧τ = Xτ, a.s.


Proof. (i) Fix n ∈ Z+. Write ξ = E[X | Fn] and ζ = E[X | Fτ]. Note that ξ and ζ are integrable, and, by definition of Fτ, both ξ1{τ = n} and ζ1{τ = n} are Fτ-measurable and Fn-measurable. For any A ∈ Fτ, A ∩ {τ = n} is in Fτ and Fn, so that, by definition of conditional expectation,

E[ζ1(A ∩ {τ = n})] = E[X1(A ∩ {τ = n})] = E[ξ1(A ∩ {τ = n})].

Hence E[(ζ − ξ)1{τ = n}1(A)] = 0. Choosing A = {(ζ − ξ)1{τ = n} > 0} ∈ Fτ we deduce that E[(ζ − ξ)^+ 1{τ = n}] = 0. A similar argument gives E[(ζ − ξ)^− 1{τ = n}] = 0, and hence |ζ − ξ|1{τ = n} = 0, a.s. Exactly the same proof works when n = ∞, but it is worth reading through the argument again in that case.

(ii) Suppose lim_{n→∞} Xn = X∞. Then lim_{n→∞} Xn1{τ > n} = X∞1{τ = ∞}, while lim_{n→∞} Xτ1{τ ≤ n} = Xτ1{τ < ∞}. In other words,

lim_{n→∞} Xn∧τ = X∞1{τ = ∞} + Xτ1{τ < ∞} = Xτ,

as claimed.

The point of Lemma 2.1.11(i) will become clearer later on; it allows us to extend statements like E[f(Xn+1, . . . , Xn+k) | Fn] ≤ g(Xn) (for all n) to the ‘stopped’ version E[f(Xτ+1, . . . , Xτ+k) | Fτ] ≤ g(Xτ) for finite stopping times τ. Here is one example of an application.

Example 2.1.12. Let Xn be an Fn-adapted time-homogeneous Markov chain on state space (Σ, E). Let τ be a stopping time. Then, on {τ < ∞},

P[Xτ+m ∈ B | Fτ] = ∑_{n=0}^{∞} P[Xτ+m ∈ B | Fτ]1{τ = n}
  = ∑_{n=0}^{∞} P[Xn+m ∈ B | Fn]1{τ = n}
  = ∑_{n=0}^{∞} Pm(Xn, B)1{τ = n},

by Lemma 2.1.11(i) and then (2.2). The fact that the Markov property (2.2) extends to stopping times as

P[Xτ+m ∈ B | Fτ] = Pm(Xτ, B), on {τ < ∞},

is the strong Markov property. ◁


2.2 An introductory example

In this section we briefly describe a simple example that is rich enough to display a range of interesting behaviour. We will refer back to this example from time to time to illustrate the subsequent discussions in this chapter. This example, and many of the classical methods of its analysis, are likely to be familiar to the reader; we present some of the standard results, but will give semimartingale-style proofs here and throughout this chapter to illustrate our approach.

Our example is a nearest-neighbour (i.e., simple) random walk (Xn, n ≥ 0) on Z+. For x ≥ 1, we set

P[Xn+1 = x − 1 | Xn = x] = px,
P[Xn+1 = x + 1 | Xn = x] = 1 − px =: qx, (2.3)

and P[Xn+1 = 0 | Xn = 0] = p0, P[Xn+1 = 1 | Xn = 0] = 1 − p0 =: q0.

Assuming px ∈ (0, 1) for all x ∈ Z+, Xn is an irreducible, aperiodic Markov chain on the locally finite state space Z+; in particular, Corollary 2.1.10 shows that lim sup_{n→∞} Xn = +∞, a.s.

For x ∈ Z+ we make the following definitions, recalling that an empty sum is 0 and an empty product is 1.

dx := ∏_{y=0}^{x−1} (py/qy); (2.4)

ux := ∑_{y=0}^{x} (1/q_{x−y}) ∏_{z=x−y+1}^{x} (pz/qz) = 1/qx + px/(qx q_{x−1}) + · · · + (px p_{x−1} · · · p1)/(qx q_{x−1} · · · q1 q0); (2.5)

vx := ∑_{y=x}^{∞} (1/py) ∏_{z=x}^{y−1} (qz/pz) = 1/px + qx/(px p_{x+1}) + · · · ; (2.6)

in particular, d0 = 1 and u0 = 1/q0. Since px ∈ (0, 1) for all x, dx and ux are finite for all x. However, vx < ∞ if and only if

∑_{y=x}^{∞} (py dy)^{−1} < ∞ ⟺ ∑_{y=x}^{∞} (d_y^{−1} + d_{y+1}^{−1}) < ∞,

using the fact that (py dy)^{−1} = d_y^{−1} + d_{y+1}^{−1}. Hence vx < ∞ if and only if ∑_{x=1}^{∞} d_x^{−1} < ∞.


Now for x ∈ Z+ define

h(x) := ∑_{y=0}^{x} dy; (2.7)

then h : Z+ → R+ is increasing, so lim_{x→∞} h(x) = ∑_{x=0}^{∞} dx exists in R+ ∪ {+∞}. The function h has a classical interpretation in terms of hitting probabilities. Recall that τx := min{n ≥ 0 : Xn = x}; here x ∈ Z+. Note that Xτx = x a.s. and τx ≠ τy for any x ≠ y.

Lemma 2.2.1. Let a, b, x be integers with 0 ≤ a < b and a ≤ x ≤ b. Then

Px[τa < τb] = (h(x) − h(b)) / (h(a) − h(b)).

The classical proof of Lemma 2.2.1 goes via solving a difference equation. We will present a different argument in Example 2.4.2 below, based on the following fact.

Lemma 2.2.2. For any n ∈ Z+, and any x ∈ Z+,

E[h(Xn+1) − h(Xn) | Xn = x] = q0 1{x = 0}.

Proof. It suffices to take n = 0. For x ≥ 1,

Ex[h(X1) − h(X0)] = px h(x − 1) + qx h(x + 1) − h(x) = qx d_{x+1} − px dx = 0.

On the other hand, E0[h(X1) − h(X0)] = q0(h(1) − h(0)) = q0 d0 = q0.
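Lemmas 2.2.1 and 2.2.2 can be verified together for an arbitrary choice of the px (an illustrative sketch of ours; the particular probabilities are hypothetical): the function g(x) = (h(x) − h(b))/(h(a) − h(b)) has the right boundary values and satisfies the one-step equation g(x) = px g(x − 1) + qx g(x + 1) for a < x < b, which is the content of Lemma 2.2.2 away from the origin.

```python
# arbitrary down-probabilities p_x in (0, 1) (hypothetical values)
p = {x: 0.3 + 0.04 * (x % 5) for x in range(20)}
q = {x: 1 - p[x] for x in p}

# d_x = prod_{y=0}^{x-1} p_y/q_y (empty product: d_0 = 1), as in (2.4),
# and h(x) = sum_{y<=x} d_y, as in (2.7)
d = [1.0]
for y in range(19):
    d.append(d[-1] * p[y] / q[y])
h = [sum(d[: x + 1]) for x in range(20)]

a, b = 2, 15
g = lambda x: (h[x] - h[b]) / (h[a] - h[b])
assert g(a) == 1.0 and g(b) == 0.0          # boundary values of Lemma 2.2.1
for x in range(a + 1, b):
    # harmonicity: g(x) = p_x g(x-1) + q_x g(x+1)
    assert abs(g(x) - (p[x] * g(x - 1) + q[x] * g(x + 1))) < 1e-10
print("Lemma 2.2.1 formula is harmonic on (a, b)")
```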

For x ∈ Z+, set

t(x) := ∑_{y=0}^{x−1} uy, and r(x) := ∑_{y=1}^{x} vy, (2.8)

where uy and vy are given by (2.5) and (2.6). By the remarks above, t(x) < ∞ for all x, while r(x) < ∞ for x ≥ 1 if and only if ∑_{y=1}^{∞} d_y^{−1} < ∞.

Lemma 2.2.3. (i) For x ∈ Z+, E0 τx = t(x).

(ii) For x ∈ Z+, Ex τ0 = r(x).

We give a proof of Lemma 2.2.3 based on the following semimartingale properties of the functions r and t.


Lemma 2.2.4. (i) For any n ∈ Z+ and any x ∈ Z+,

E[t(Xn+1) − t(Xn) | Xn = x] = 1.

(ii) Suppose that ∑_{x=1}^{∞} d_x^{−1} < ∞. Then, for any n ∈ Z+ and any x ∈ Z+,

E[r(Xn+1) − r(Xn) | Xn = x] = −1, if x ≥ 1; and = q0 v1, if x = 0.

Proof. It suffices to take n = 0. For x ≥ 1, we have

Ex[t(X1) − t(X0)] = px(t(x − 1) − t(x)) + qx(t(x + 1) − t(x)) = qx ux − px u_{x−1} = 1,

by (2.5). Also,

E0[t(X1) − t(X0)] = q0 t(1) = 1,

since t(1) = u0 = 1/q0. This gives (i). Similarly, for x ≥ 1,

Ex[r(X1) − r(X0)] = px(r(x − 1) − r(x)) + qx(r(x + 1) − r(x)) = qx v_{x+1} − px vx = −1,

by (2.6). Also,

E0[r(X1) − r(X0)] = q0(r(1) − r(0)) = q0 v1,

which completes the proof of (ii).
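The two drift identities used in this proof can be checked numerically (our sketch, not from the book; the px values are hypothetical). We compute ux directly from the sum (2.5) and verify qx ux − px u_{x−1} = 1; for r we take constant px = p > 1/2, for which the series (2.6) sums to the closed form vx = 1/(p − q).

```python
p = {x: 0.3 + 0.04 * (x % 5) for x in range(16)}
q = {x: 1 - p[x] for x in p}

def u(x):
    """u_x computed directly from the explicit sum in (2.5)."""
    total = 0.0
    for y in range(x + 1):
        prod = 1.0
        for z in range(x - y + 1, x + 1):
            prod *= p[z] / q[z]
        total += prod / q[x - y]
    return total

# Lemma 2.2.4(i) boils down to q_x u_x - p_x u_{x-1} = 1 for x >= 1:
for x in range(1, 15):
    assert abs(q[x] * u(x) - p[x] * u(x - 1) - 1) < 1e-9

# For constant p_x = p > 1/2, (2.6) sums to v_x = 1/(p - q), so the
# identity q v_{x+1} - p v_x = -1 of Lemma 2.2.4(ii) is immediate:
pc, qc = 0.7, 0.3
v = 1 / (pc - qc)
assert abs(qc * v - pc * v + 1) < 1e-12
print("drift identities verified")
```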

Proof of Lemma 2.2.3. Let Fn = σ(X0, . . . , Xn). For part (i), let Yn = t(X_{n∧τx}). By Lemma 2.2.4(i), for m ≥ 0,

E[Ym+1 − Ym | Fm] = 1{τx > m}.

Taking expectations conditioned on X0 = 0, summing over m from 0 to n − 1, and then letting n → ∞ gives

E0 τx = lim_{n→∞} ∑_{m=0}^{n−1} P0[τx > m] = lim_{n→∞} E0 t(X_{n∧τx}),

using the fact that t(0) = 0. Here t(X_{n∧τx}) is uniformly bounded and converges a.s. to t(Xτx) = t(x), so by the bounded convergence theorem we get part (i).


For part (ii), first suppose that ∑_{x=1}^{∞} d_x^{−1} < ∞, so that r(x) ≤ r(∞) < ∞ for all x. Let Yn = r(X_{n∧τ0}). Similarly to the preceding argument, Lemma 2.2.4(ii) shows that Ex[n ∧ τ0] = Ex Y0 − Ex Yn. Here 0 ≤ Yn ≤ r(∞) < ∞, so Yn is uniformly bounded and converges to r(0) = 0, so again we may apply the bounded convergence theorem to deduce that Ex τ0 = r(x). The fact that Ex τ0 = ∞ when ∑_{x=1}^{∞} d_x^{−1} = ∞ is easiest to see from the classical difference equation argument.

Now we can state the basic classification result, in terms of the quantities dx defined at (2.4).

Theorem 2.2.5. (i) Xn is recurrent if and only if ∑_{x=1}^{∞} dx = ∞;

(ii) Xn is positive recurrent if and only if ∑_{x=1}^{∞} d_x^{−1} < ∞.

Proof. There are numerous ways to prove Theorem 2.2.5; we describe an instructive approach. Suppose X0 = 0. Since lim sup_{n→∞} Xn = ∞ a.s., we have τx < ∞ a.s. for all x ∈ Z+. Also, since |Xn+1 − Xn| ≤ 1, we have τ_{x+1} ≥ τx + 1. Hence τx is strictly increasing in x, with lim_{x→∞} τx = ∞. Hence τ^+_0 = ∞ if and only if τ^+_0 > τx for all x ∈ N. Since {τ0 > τx} ⊇ {τ0 > τ_{x+1}}, we have

P0[τ^+_0 = ∞] = q0 P1[∩_{x≥2} {τ0 > τx}] = q0 lim_{x→∞} P1[τx < τ0],

by continuity of probability along monotone limits. Hence by Lemma 2.2.1,

P0[τ^+_0 = ∞] = q0 lim_{x→∞} (h(1) − h(0)) / (h(x) − h(0)),

which is 0 if and only if lim_{x→∞} h(x) = ∞, giving part (i) of Theorem 2.2.5. Part (ii) of Theorem 2.2.5 can be derived by direct computation of the stationary measure π, using the reversibility of the random walk, or by the hitting time result Lemma 2.2.3, since positive recurrence is equivalent to finiteness of E0 τ^+_0 = E0 τ1 + E1 τ0.

After developing some of the necessary technology, we will give Lyapunov function proofs of parts of Theorem 2.2.5 later, in Examples 2.5.4 and 2.5.11.

2.3 Fundamental semimartingale facts

In this section we state (often, without proof) some basic facts related to (sub-, super-)martingales that we will need in this book. We also give a number of examples to illustrate in a simple way some of the basic ideas of applications of these results.

Recall that the underlying setting is a probability space (Ω, F, P) equipped with a filtration (Fn, n ≥ 0). Whenever we refer without further qualification to a martingale, submartingale, or supermartingale Xn, it is implicit that Xn takes values in R.

Theorem 2.3.1 (Doob decomposition). A submartingale Xn has a unique representation Xn = Mn + An, n ≥ 0, where Mn is a martingale, and An is a non-decreasing process such that A0 = 0 and An is F_{n−1}-measurable for n ≥ 1 (that is, An is previsible).

Theorem 2.3.2 (Orthogonality of martingale increments). Let Xn be a square-integrable martingale. If m ≤ n and Y is Fm-measurable with finite second moment, then E[(Xn − Xm)Y] = 0.

Observe that Theorem 2.3.2 implies that E[(Xm+1 − Xm)(Xk+1 − Xk)] = 0 for k ≠ m for any square-integrable martingale Xn (thus justifying its name).

Theorem 2.3.3 (Martingales and convexity). (i) If Xn is a martingale and g : R → R is a convex function such that E|g(Xn)| < ∞ for all n, then g(Xn) is a submartingale, with respect to the same filtration.

(ii) If Xn is a submartingale and g : R → R is a non-decreasing convex function such that E|g(Xn)| < ∞ for all n, then g(Xn) is a submartingale, with respect to the same filtration.

As an immediate corollary, we obtain that if Xn is a square-integrable martingale, then X_n^2 is a submartingale.

A fundamental result is the following martingale convergence theorem.

Theorem 2.3.4 (Martingale convergence theorem). Assume that Xn is a submartingale such that sup_n E[X_n^+] < ∞. Then there is an integrable random variable X such that Xn → X a.s. as n → ∞.

Remarks 2.3.5. (a) Under the hypotheses of Theorem 2.3.4, the sequence EXn is non-decreasing (by the submartingale property) and bounded above by sup_n E[X_n^+], so lim_{n→∞} EXn exists and is finite; however, it is not necessarily equal to EX.

(b) Note that X_n^− = X_n^+ − Xn, so that E[X_n^−] ≤ E[X_n^+] − EX0, by the submartingale property, so that the hypothesis sup_n E[X_n^+] < ∞ actually implies that sup_n E|Xn| < ∞.


An important corollary to Theorem 2.3.4 and Fatou’s lemma is as follows.

Theorem 2.3.6 (Convergence of non-negative supermartingales). Assume that Xn ≥ 0 is a supermartingale. Then there is an integrable random variable X such that Xn → X a.s. as n → ∞, and EX ≤ EX0.

Another fundamental result that we will use extensively is the following.

Theorem 2.3.7 (Optional stopping theorem). Suppose that σ ≤ τ are stopping times, and X_{τ∧n} is a uniformly integrable submartingale. Then EXσ ≤ EXτ < ∞ and Xσ ≤ E[Xτ | Fσ] a.s.

Remarks 2.3.8. (a) From the conclusion Xσ ≤ E[Xτ | Fσ] a.s., it follows on taking conditional expectations that E[Xσ | F0] ≤ E[Xτ | F0] a.s., which is another useful form of optional stopping.

(b) If Xn is a uniformly integrable submartingale and τ is any stopping time, then it can be shown that X_{τ∧n} is also uniformly integrable: see e.g. Section 5.7 of [72].

(c) Observe that two applications of Theorem 2.3.7, one with σ = 0 and one with τ = ∞, show that for any uniformly integrable submartingale Xn and any stopping time τ, EX0 ≤ EXτ ≤ EX∞ < ∞, where X∞ := lim sup_{n→∞} Xn = lim_{n→∞} Xn exists and is integrable, by Theorem 2.3.4.

Theorem 2.3.7 has the following corollary, obtained on setting σ = 0 and using well-known sufficient conditions for uniform integrability (see e.g. Sections 4.5 and 4.7 of [72]).

Corollary 2.3.9. Let Xn be a submartingale and τ a finite stopping time. For a constant c > 0, suppose that at least one of the following holds:

(i) τ ≤ c a.s.;

(ii) |Xn∧τ | ≤ c a.s. for all n ≥ 0;

(iii) E τ <∞ and E[|Xn+1 −Xn| | Fn] ≤ c a.s. for all n ≥ 0.

Then EXτ ≥ EX0. If Xn is a martingale and at least one of the above conditions (i)–(iii) holds, then EXτ = EX0.

Remarks 2.3.10. (a) Corollary 2.3.9(i) exhibits the important fact that for any stopping time τ, if Xn is a (sub-, super-)martingale, then so is X_{n∧τ}.

(b) Suppose that τ is a stopping time and Xn is a non-negative supermartingale, so that X∞ = lim_{n→∞} Xn exists by Theorem 2.3.6. Fatou’s lemma with Lemma 2.1.11(ii) shows that E[Xτ] ≤ lim inf_{n→∞} E[X_{n∧τ}] ≤ E[X0], which is a simple version of optional stopping, without the need for uniform integrability. A more general statement of this kind is the next result.


Theorem 2.3.11. Suppose that Xn is a non-negative supermartingale and σ ≤ τ are stopping times. Then EXτ ≤ EXσ < ∞ and E[Xτ | Fσ] ≤ Xσ, a.s.

Proof. Under the conditions of the theorem, Theorem 2.3.6 shows that lim_{n→∞} Xn = X∞ exists and is integrable. Hence by Lemma 2.1.11(ii) and the (conditional) Fatou lemma, for any σ-field G ⊆ F,

E[Xτ | G] ≤ lim inf_{n→∞} E[X_{n∧τ} | G].

In particular, for any fixed m ≥ 0,

E[Xτ | F_{m∧σ}] ≤ lim inf_{n→∞} E[X_{n∧τ} | F_{m∧σ}] ≤ X_{m∧σ},

by Theorem 2.3.7 applied to the uniformly integrable submartingale Yn = −X_{m∧n}, n ≥ 0. Now, by an application of Lemma 2.1.11(i),

E[Xτ | Fσ]1{σ < ∞} = lim_{m→∞} E[Xτ | Fσ]1{σ ≤ m}
  = lim_{m→∞} E[Xτ | F_{m∧σ}]1{σ ≤ m}
  ≤ lim_{m→∞} X_{m∧σ}1{σ < ∞} = Xσ1{σ < ∞}.

This gives the claim E[Xτ | Fσ] ≤ Xσ on {σ < ∞}. On {σ = ∞}, τ = ∞ too, and the claim is justified by the n = ∞ case of Lemma 2.1.11(i):

E[Xτ | Fσ]1{σ = ∞} = E[Xτ | F∞]1{σ = ∞} = X∞1{σ = ∞}.

Example 2.3.12. Let Sn be simple random walk on Z with parameter p ∈ (0, 1), and, for a fixed x > 0, let σ = σx = min{n ≥ 0 : |Sn| ≥ x}.

First suppose p = 1/2. Then the martingale S_{n∧σ} is uniformly bounded, so S_{n∧σ} converges, by Theorem 2.3.4. Also, S_{n∧σ}1{σ < ∞} converges (to Sσ1{σ < ∞}). Hence S_{n∧σ} − S_{n∧σ}1{σ < ∞} converges as well. But this last quantity is Sn1{σ = ∞}. Since |Sn+1 − Sn| = 1, the only way that this convergence can happen is if P[σ < ∞] = 1.

If p ≠ 1/2, a similar argument based on the martingale ((1 − p)/p)^{S_{n∧σ}} (see Example 2.1.7) gives the same conclusion.

This is an amusing way of showing that lim sup_{n→∞} |Sn| = +∞, a.s. ◁

Example 2.3.13. Let Xn be an Fn-adapted stochastic process on state space (Σ, E), and let f : Σ → R be measurable. Suppose that f(Xn) is integrable for all n. Then

Mn = f(Xn) − f(X0) − ∑_{m=0}^{n−1} E[f(Xm+1) − f(Xm) | Fm]

E[f(Xm+1)− f(Xm) | Fm]

Page 54: Non-homogeneous random walks - Unicamp

36 Chapter 2. General results for hitting times

is an Fn-adapted martingale. Hence, for any stopping time τ ,

E f(Xn∧τ ) = E f(X0) + E(n∧τ)−1∑

m=0

E[f(Xm+1)− f(Xm) | Fm]

, (2.9)

which is a version of Dynkin’s formula. Under additional conditions, optionalstopping (Corollary 2.3.9) permits n→∞ preserving equality in (2.9). 4
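Dynkin's formula (2.9) can be checked exactly on a small example by enumerating all sample paths. The following sketch (our toy choices, not from the book) takes a biased $\pm 1$ walk with $p = 0.6$, $f(x) = x^2$, $\tau$ the hitting time of $0$ started from $2$, and horizon $n = 10$:

```python
import itertools

# Exact verification of Dynkin's formula (2.9) by path enumeration.
# Assumed toy parameters: biased SRW on Z, p = 0.6, f(x) = x^2,
# tau = first hitting time of 0, X_0 = 2, horizon n = 10.
p, x0, n = 0.6, 2, 10
f = lambda x: x * x
# one-step conditional drift E[f(X_{m+1}) - f(X_m) | X_m = x]
drift = lambda x: p * f(x + 1) + (1 - p) * f(x - 1) - f(x)

lhs = rhs = 0.0
for steps in itertools.product([1, -1], repeat=n):
    prob, x, stopped, comp = 1.0, x0, False, 0.0
    for s in steps:
        prob *= p if s == 1 else 1 - p
        if not stopped:
            comp += drift(x)        # accumulate compensator up to n ∧ tau
            x += s
            if x == 0:
                stopped = True      # walk frozen after hitting 0
    lhs += prob * f(x)              # contributes to E f(X_{n ∧ tau})
    rhs += prob * comp
rhs += f(x0)                        # f(X_0) + E[sum of conditional drifts]
print(lhs, rhs)                     # the two sides of (2.9) agree
```

Since $n \wedge \tau$ is a bounded stopping time, the two sides agree exactly (up to floating-point error), with no uniform-integrability caveat needed.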

The following maximal inequality, named after Doob, is important.

Theorem 2.3.14 (Doob's inequality). Let $X_n$ be a submartingale. For any $\lambda > 0$ and any $n \ge 0$ we have
\[ P\Bigl[\max_{0 \le m \le n} X_m \ge \lambda \Bigm| \mathcal{F}_0\Bigr] \le \lambda^{-1} E[X_n^+ \mid \mathcal{F}_0]. \]

Proof. We give a (short) proof, since this version, conditional on $\mathcal{F}_0$, is rarely stated explicitly. Fix $\lambda > 0$. Let $\tau = \min\{n \ge 0 : X_n \ge \lambda\}$. Then
\[ X_{n\wedge\tau}^+ \ge X_\tau 1\{\tau \le n\} \ge \lambda 1\{\tau \le n\}. \]
Hence
\[ P[\tau \le n \mid \mathcal{F}_0] \le \lambda^{-1} E[X_{n\wedge\tau}^+ \mid \mathcal{F}_0]. \]
Since $x \mapsto x^+$ is non-decreasing and convex, and $E[X_n^+] \le E|X_n|$, the fact that $X_n$ is a submartingale implies that $X_n^+$ is also a submartingale, by Theorem 2.3.3(ii). Then optional stopping (Theorem 2.3.7) applied at the bounded stopping times $n\wedge\tau$ and $n$ shows that $E[X_{n\wedge\tau}^+ \mid \mathcal{F}_0] \le E[X_n^+ \mid \mathcal{F}_0]$; see Remark 2.3.8(a).

Theorems 2.3.2, 2.3.3, and 2.3.14 have the following corollary.

Corollary 2.3.15. For $X_n$ a square-integrable martingale, for any $\lambda > 0$,
\[ P\Bigl[\max_{0 \le m \le n} |X_m| \ge \lambda\Bigr] \le \lambda^{-2} E[X_n^2] = \lambda^{-2}\Bigl(E[X_0^2] + \sum_{m=0}^{n-1} E[(X_{m+1} - X_m)^2]\Bigr). \]

Example 2.3.16. Suppose that $X_n$ is a square-integrable martingale with $X_0 = 0$ such that $E[(X_{n+1} - X_n)^2 \mid \mathcal{F}_n] \le K$ for all $n \ge 0$, for some constant $K$. Recall $\sigma_x = \min\{n \ge 0 : |X_n| \ge x\}$. Then Corollary 2.3.15 implies that
\[ P[\sigma_n \le a n^2] = P\Bigl[\max_{0 \le m \le a n^2} |X_m| \ge n\Bigr] \le n^{-2} \cdot K a n^2 = aK. \]


This confirms the intuition that martingales with well-behaved increments usually move diffusively: the process is not likely to go out of the interval $(-n, n)$ before time $an^2$ if $a$ is small. We return to the question of quantifying the finite-time displacement of a process in Section 2.4. $\triangle$
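The diffusive bound of Example 2.3.16 is easy to probe by simulation; a minimal sketch for the $\pm 1$ walk (so $K = 1$), with our own choices $n = 20$ and $a = 0.25$:

```python
import random

# Monte Carlo illustration of Example 2.3.16 for the ±1 SRW (K = 1):
# estimate P[sigma_n <= a*n^2], the chance of leaving (-n, n) within
# a*n^2 steps, and compare with the bound aK from Corollary 2.3.15.
random.seed(0)
n, a, K, trials = 20, 0.25, 1, 2000
horizon = int(a * n * n)            # a * n^2 steps
exits = 0
for _ in range(trials):
    s = 0
    for _ in range(horizon):
        s += random.choice((-1, 1))
        if abs(s) >= n:             # walk has left (-n, n)
            exits += 1
            break
estimate = exits / trials
print(estimate, "<=", a * K)
```

The empirical frequency sits well below the bound $aK = 0.25$, reflecting that Chebyshev-type maximal inequalities are typically far from tight.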

Example 2.3.17. Continuing the theme of Example 2.3.16, let $S_n$ be symmetric simple random walk on $\mathbb{Z}$ with $S_0 = 0$. Then $S_n^2$ is a non-negative submartingale with $E[S_n^2] = n$, and Doob's inequality, Theorem 2.3.14, which in this case reduces to an instance of Kolmogorov's inequality, gives
\[ P\Bigl[\max_{0 \le m \le n} |S_m| \ge \lambda\Bigr] \le \lambda^{-2} n. \]
For $\varepsilon > 0$, let $u(n) = n^{1/2}(\log n)^{(1/2)+\varepsilon}$; then
\[ P\Bigl[\max_{0 \le m \le n} |S_m| \ge u(n)\Bigr] \le (\log n)^{-1-2\varepsilon}. \]
Although this seems a rather weak bound, we can still extract a reasonable result by considering the subsequence $n = 2^k$, $k \ge 0$. The Borel–Cantelli lemma then implies that, a.s., $\max_{0 \le m \le 2^k} |S_m| \le u(2^k)$ for all but finitely many $k$. Any $n \in \mathbb{N}$ has $2^{k_n} \le n \le 2^{k_n + 1}$ for some $k_n \in \mathbb{Z}_+$, where $k_n \to \infty$ as $n \to \infty$. Hence, a.s., for all but finitely many $n$,
\[ \max_{0 \le m \le n} |S_m| \le \max_{0 \le m \le 2^{k_n+1}} |S_m| \le u(2 \cdot 2^{k_n}) \le 2 u(n). \]
So we have shown that for any $\varepsilon > 0$, a.s., for all but finitely many $n$,
\[ \max_{0 \le m \le n} |S_m| \le n^{1/2}(\log n)^{(1/2)+\varepsilon}. \]
This result is certainly not as sharp as the ultimate 'lim sup' result for simple random walk, the law of the iterated logarithm, which states that
\[ \limsup_{n\to\infty} \frac{|S_n|}{\sqrt{2n\log\log n}} = 1, \quad \text{a.s.,} \]
but the proof here is also certainly much simpler, and readily generalizes. We return in some detail to these ideas in Section 2.8. $\triangle$

We collect some remarks on quadratic variation for square-integrable martingales, which will be useful occasionally later on. Let $X_n$ be a martingale with $X_0 = 0$, such that $E[X_n^2] < \infty$ for all $n$. Then the process $X_n^2$ is a submartingale, with Doob decomposition $X_n^2 = M_n + A_n$, where $M_n$ is a martingale with $M_0 = 0$ and
\[ A_n = \sum_{m=0}^{n-1} E[X_{m+1}^2 - X_m^2 \mid \mathcal{F}_m] = \sum_{m=0}^{n-1} E[(X_{m+1} - X_m)^2 \mid \mathcal{F}_m], \]
using the fact that $X_n$ is a martingale. The non-decreasing process $A_n$ is the quadratic variation process associated to $X_n$, and the notation $\langle X \rangle_n = A_n$ is used. Let $\tau$ be a finite stopping time. Then $A_{n\wedge\tau} \uparrow A_\tau$ as $n \to \infty$ and hence, by monotone convergence, $E A_{n\wedge\tau} \to E A_\tau$.

Since $M_n = X_n^2 - A_n$ is a martingale, so is $M_{n\wedge\tau}$, and hence $E[X_{n\wedge\tau}^2] = E A_{n\wedge\tau}$ because $M_0 = 0$. Then, by Fatou's lemma,
\[ E[X_\tau^2] \le \liminf_{n\to\infty} E A_{n\wedge\tau} = E A_\tau. \tag{2.10} \]
But $X_{n\wedge\tau}^2$ is a submartingale, so by optional stopping (Theorem 2.3.7), $E[X_\tau^2] \ge E[X_{n\wedge\tau}^2] = E A_{n\wedge\tau}$ provided $X_{n\wedge\tau}^2$ is uniformly integrable. Since $E A_{n\wedge\tau} \to E A_\tau$, we note the following result.

Lemma 2.3.18. Let $X_n$ be a square-integrable martingale with $X_0 = 0$, and let $\tau$ be a finite stopping time. Then, if $X_{n\wedge\tau}^2$ is uniformly integrable, $E[X_\tau^2] = E[\langle X \rangle_\tau]$.

We finish this section with a result that we will use several times, which is a refinement due to Lévy of the Borel–Cantelli lemma.

Theorem 2.3.19. Let $(\mathcal{F}_n, n \ge 0)$ be a filtration and $(E_n, n \ge 0)$ a sequence of events with $E_n \in \mathcal{F}_n$. Then, up to sets of probability zero,
\[ \{E_n \text{ i.o.}\} = \Bigl\{\sum_{n=1}^\infty P[E_n \mid \mathcal{F}_{n-1}] = \infty\Bigr\}. \]

This result is useful for, among other things, formalizing the intuition that an event that has positive probability of occurring soon, having not yet occurred, will eventually happen; see the next example.

Example 2.3.20. Let $X_n$ be an irreducible Markov chain on a countable state space $\Sigma$. Suppose that there is a finite non-empty set $A \subset \Sigma$ such that $P_y[\tau_A < \infty] = 1$ for all $y \in \Sigma \setminus A$. We deduce that $P_y[\tau_y^+ < \infty] = 1$ for all $y \in \Sigma \setminus A$, i.e., $X_n$ is recurrent (see Definition 2.1.4).

Note that for any $y \in \Sigma \setminus A$, irreducibility and the finiteness of $A$ imply that there exist $m = m(y) < \infty$ and $\delta = \delta(y) > 0$ such that $P[\tau_y < m \mid X_0 \in A] > \delta$. Fix one such $y \in \Sigma \setminus A$ (and corresponding $m$ and $\delta$) for the duration of this proof.

Let $\eta_A = \tau_{\Sigma \setminus A}$, the first exit time from $A$. Then, on $\{\eta_A > n\}$, $P[\eta_A < n + m \mid \mathcal{F}_n] > \delta$. In particular, setting $\mathcal{G}_n = \mathcal{F}_{mn}$ and $E_n = \{\eta_A < nm\}$, we have $E_n \in \mathcal{G}_n$ and $P[E_n \mid \mathcal{G}_{n-1}] > \delta$ on $E_{n-1}^c$. Also, $E_{n-1} \subseteq E_n$, so
\[ P[E_n \mid \mathcal{G}_{n-1}] \ge \delta 1(E_{n-1}^c) + 1(E_{n-1}) \ge \delta, \quad \text{a.s.} \]
Hence Theorem 2.3.19 implies that $E_n$ occurs infinitely often, so $P[\eta_A < \infty \mid X_0 \in A] = 1$.

Now take $X_0 = y \in \Sigma \setminus A$. Let $\kappa_0 = \tau_A$ and, for $k \ge 1$, $\kappa_k = \min\{n > \kappa_{k-1} + m : X_n \in A\}$, (a subsequence of) return times to $A$. It follows from the argument above and the hypothesis that $P_z[\tau_A < \infty] = 1$ for all $z \in \Sigma \setminus A$ that $\kappa_k < \infty$, a.s., for all $k$.

A similar argument now shows that $X_n$ will eventually return to $y$, having visited $A$. To spell this out, for $k \ge 1$ let $E_k = \{\tau_y^+ \le \kappa_k\}$ and now set $\mathcal{G}_k = \mathcal{F}_{\kappa_k}$; then $E_k \in \mathcal{G}_k$ and $E_{k-1} \subseteq E_k$, so that
\[ P[E_k \mid \mathcal{G}_{k-1}] \ge P[\tau_y^+ < \kappa_{k-1} + m \mid \mathcal{F}_{\kappa_{k-1}}] 1\{\tau_y^+ > \kappa_{k-1}\} + 1\{\tau_y^+ \le \kappa_{k-1}\} \ge \delta, \]
a.s. Thus Theorem 2.3.19 implies that $P_y[\tau_y^+ < \infty] = P_y[E_n \text{ i.o.}] = 1$. $\triangle$

2.4 Displacement and exit estimates

A basic ingredient of many of the arguments that we use later in this book is some estimate of how far a process can travel in a certain time, or the probability that a process exits an interval by one end point rather than the other. In this section we collect some examples and results in this vein, to illustrate some important applications of the semimartingale ideas from Section 2.3.

The main tool in this context is the optional stopping theorem. We give several examples showing how optional stopping may be used to deduce information on displacement. We will return to many of the ideas illustrated in these examples later in the book.

Example 2.4.1. For SRW $S_n$ on $\mathbb{Z}$ with parameter $p \in (0,1)$, $p \ne \frac12$, abbreviate $\beta = \frac{1-p}{p}$ and $X_n = \beta^{S_n}$; as observed in Example 2.1.7, $X_n$ is a martingale. Also, $X_{n\wedge\tau_{a,b}}$ is bounded and $\tau_{a,b}$ is finite, as observed in Example 2.3.12. So, with $a < x < b$, optional stopping (Corollary 2.3.9) gives
\[ \beta^x = E_x X_0 = E_x X_{\tau_{a,b}} = \beta^a P_x[\tau_a < \tau_b] + \beta^b P_x[\tau_b < \tau_a], \]


and so
\[ P_x[\tau_a < \tau_b] = \frac{\beta^x - \beta^b}{\beta^a - \beta^b}, \]
solving the classical gambler's ruin problem.

In the case $p = \frac12$ the process $S_n$ is itself a martingale, and the same calculation gives $P_x[\tau_a < \tau_b] = \frac{b-x}{b-a}$. It is elementary but instructive to note that one may recast this result in the form, say, for $x \ge 1$,
\[ P_0[\sigma_x < \tau_0^+] = P_0\Bigl[\max_{0 \le n < \tau_0^+} |S_n| \ge x\Bigr] = P_1[\tau_x < \tau_0] = \frac{1}{x}, \tag{2.11} \]
a statement about the excursion maximum of the walk. (Here $\sigma_x = \tau_{\{-x,x\}}$.) Hence the random variable $M := \max_{0 \le n < \tau_0^+} |S_n|$ is almost surely finite, and has $E_0[M^s] < \infty$ if and only if $s < 1$.

With a little extra work, this argument demonstrates the null recurrence of the random walk $S_n$. Indeed, the $\sigma_x$ are a.s. finite (by e.g. the argument in Example 2.3.12), and a similar argument to the proof of Theorem 2.2.5 shows that
\[ P_0[\tau_0^+ < \infty] = P_0\Bigl[\bigcup_{x \ge 1} \{\tau_0^+ < \sigma_x\}\Bigr] = \lim_{x\to\infty} P_0[\tau_0^+ < \sigma_x] = 1, \]
by (2.11), proving recurrence. Moreover, under $P_0$, $|S_n| \le n$, so $M \le \tau_0^+$, and
\[ E_0[\tau_0^+] \ge E_0[M] = \sum_{x \ge 1} P_0[M \ge x] = \infty, \quad \text{by (2.11).} \quad \triangle \]
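The gambler's ruin formula of Example 2.4.1 can be verified without any simulation: it is the unique solution of the first-step recursion with the given boundary values. A short sketch, using our own toy parameters $p = 0.6$, $a = 0$, $b = 10$:

```python
# Gambler's ruin check (toy parameters: p = 0.6, a = 0, b = 10): the
# martingale formula u(x) = (beta^x - beta^b)/(beta^a - beta^b) from
# Example 2.4.1 solves u(x) = p*u(x+1) + (1-p)*u(x-1) with
# boundary values u(a) = 1 and u(b) = 0.
p, a, b = 0.6, 0, 10
beta = (1 - p) / p
u = lambda x: (beta ** x - beta ** b) / (beta ** a - beta ** b)

residuals = [abs(u(x) - (p * u(x + 1) + (1 - p) * u(x - 1)))
             for x in range(a + 1, b)]
print(u(a), u(b), max(residuals))   # boundary values and worst recursion error
```

Conditioning on the first step gives the recursion; since $u$ satisfies it together with both boundary conditions, it is the hitting probability $P_x[\tau_a < \tau_b]$.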

The martingale solution to the gambler's ruin problem from Example 2.4.1 extends to give the proof of Lemma 2.2.1 from Section 2.2, as the next example shows.

Example 2.4.2. In the setting of the general nearest-neighbour random walk on $\mathbb{Z}_+$ from Section 2.2, let $a, b, x$ be integers with $0 \le a < b$ and $a \le x \le b$; let $X_0 = x$. Then $h(X_{n\wedge\tau_a\wedge\tau_b})$ is a bounded martingale, by Lemma 2.2.2, and $\tau_a \wedge \tau_b < \infty$ a.s., by Corollary 2.1.10. Then, by optional stopping (Corollary 2.3.9),
\[ h(x) = E_x h(X_0) = E_x h(X_{\tau_a\wedge\tau_b}) = h(a) P_x[\tau_a < \tau_b] + h(b) P_x[\tau_a > \tau_b]. \]
Re-arranging this gives Lemma 2.2.1. $\triangle$

The next example returns to simple symmetric random walk to illustrate another idea.


Example 2.4.3. Again consider $S_n$ simple symmetric random walk on $\mathbb{Z}$; write $\mathcal{F}_n = \sigma(S_0, \ldots, S_n)$. Example 2.4.1 showed that $E_0[\tau_0^+] = \infty$. In fact, $E_0[(\tau_0^+)^{1/2}] = \infty$ (and this is best possible).

To show this, we elaborate the idea of the previous example. Roughly speaking, we know that $S_n$ reaches $\pm x$ before returning to $0$ with probability $1/x$; but, if it does so, it should take time about $x^2$ to come back to the origin, by the idea of Example 2.3.16. Formally, let $x \in \mathbb{N}$ and set $E_x = \{\max_{0 \le n \le x^2} |2x - S_{\sigma_{2x}+n}| < 2x\}$. Then,
\begin{align*} P_0[\tau_0^+ \ge x^2] &\ge P_0[\{\sigma_{2x} < \tau_0^+\} \cap E_x] \\ &= E_0\bigl[1\{\sigma_{2x} < \tau_0^+\} P[E_x \mid \mathcal{F}_{\sigma_{2x}}]\bigr], \end{align*}
since $\{\sigma_{2x} < \tau_0^+\} \in \mathcal{F}_{\sigma_{2x}}$. Let $X_n = (S_{\sigma_{2x}+n} - 2x)^2$. Then $X_n \le n^2$, so $E X_n < \infty$, and $E[X_{n+1} - X_n \mid \mathcal{F}_{\sigma_{2x}+n}] = 1$. Hence $(X_n, n \ge 0)$ is a non-negative submartingale adapted to $(\mathcal{F}_{\sigma_{2x}+n}, n \ge 0)$ with $X_0 = 0$, a.s., and we may apply Doob's inequality (Theorem 2.3.14) to show that
\[ P\Bigl[\max_{0 \le n < x^2} X_n \ge 4x^2 \Bigm| \mathcal{F}_{\sigma_{2x}}\Bigr] \le \frac{E[X_{x^2} \mid \mathcal{F}_{\sigma_{2x}}]}{4x^2} = \frac14, \quad \text{a.s.,} \]
so that $P[E_x \mid \mathcal{F}_{\sigma_{2x}}] \ge 3/4$, a.s. Thus by (2.11),
\[ P_0[\tau_0^+ \ge x^2] \ge \frac34 P_0[\sigma_{2x} < \tau_0^+] = \frac{3}{8x}. \]
Hence, for any $n \in \mathbb{Z}_+$, $P_0[\tau_0^+ \ge n] \ge \frac38(1 + \sqrt{n})^{-1}$, which shows that $E_0[(\tau_0^+)^{1/2}] = \infty$. It is clear that the idea of this example can be generalized to martingales satisfying uniformly bounded increments, say; we return to this general theme in Section 2.7. $\triangle$

It is important to observe that the hitting probability calculation of Example 2.4.1 is "robust", in the sense that it gives non-trivial and often useful bounds on hitting probabilities with only partial information. The next example takes up this theme.

Example 2.4.4. Suppose that $X_n$ is a submartingale with $X_0 = x$ and $|X_{n+1} - X_n| \le K$ a.s. for all $n$. Recall also that $\lambda_a = \min\{n : X_n \le a\}$ and $\rho_b = \min\{n : X_n \ge b\}$. Suppose $a < x < b$. Suppose that $P[\lambda_a \wedge \rho_b < \infty] = 1$ (this will usually be an easy consequence of some form of non-degeneracy or irreducibility). Then, analogously to Example 2.4.1,
\[ x = E X_0 \le E X_{\lambda_a \wedge \rho_b} \le a P[\lambda_a < \rho_b] + (b + K) P[\rho_b < \lambda_a], \]
so
\[ P[\lambda_a < \rho_b] \le \frac{b + K - x}{b + K - a}. \]
One would need, however, a supermartingale in order to obtain a lower bound for $P[\lambda_a < \rho_b]$: see Corollary 2.4.6 below for a closely related result. $\triangle$

i For estimating hitting probabilities with a suitable Lyapunov function to hand, the optional stopping theorem is one's friend.

At this point, we present a simple maximal inequality for supermartingales, which is in some sense more elementary than Doob's inequality (Theorem 2.3.14) and which will be useful at several points later on.

Theorem 2.4.5. Suppose that $X_n$ is a non-negative supermartingale adapted to $\mathcal{F}_n$. Then, for any $x > 0$,
\[ P\Bigl[\max_{m \ge 0} X_m \ge x \Bigm| \mathcal{F}_0\Bigr] \le \frac{X_0}{x}, \quad \text{a.s.} \]
In particular, $P[\max_{m \ge 0} X_m \ge x] \le x^{-1} E X_0$.

Proof. Fix $x > 0$. Theorem 2.3.11 applied to the non-negative supermartingale $X_n$ at the stopping times $0$ and $\sigma_x = \min\{n \ge 0 : X_n \ge x\}$ yields
\[ X_0 \ge E[X_{\sigma_x} \mid \mathcal{F}_0] \ge E[X_{\sigma_x} 1\{\sigma_x < \infty\} \mid \mathcal{F}_0] \ge x P[\sigma_x < \infty \mid \mathcal{F}_0]. \]
But $\sigma_x < \infty$ if and only if $\max_{m \ge 0} X_m \ge x$.

Theorem 2.4.5 has a useful corollary on the same theme as the preceding examples, but which imposes no assumptions on the size of the increments of the process.

Corollary 2.4.6. Suppose that $X_n$ is an $\mathcal{F}_n$-adapted process taking values in $\mathbb{R}_+$. Suppose that there exists $y \in \mathbb{R}_+$ for which, a.s.,
\[ E[X_{n+1} - X_n \mid \mathcal{F}_n] \le 0, \quad \text{on } \{X_n > y\}. \]
Then, with $\lambda_y = \min\{n \ge 0 : X_n \le y\}$, for any $x > 0$,
\[ P\Bigl[\max_{m \ge 0} X_{m\wedge\lambda_y} \ge x\Bigr] \le \frac{E X_0}{x}. \]

Proof. Apply Theorem 2.4.5 to the supermartingale $X_{n\wedge\lambda_y} \in \mathbb{R}_+$.
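Theorem 2.4.5 can be compared with an exactly computable max-tail. A sketch under our own toy assumptions: a $\pm 1$ walk with downward drift ($p = 0.4$), started at $5$ and stopped on hitting $0$, is a non-negative supermartingale, and its maximum exceeds $x$ exactly when the walk reaches $x$ before $0$, a gambler's ruin probability:

```python
# Theorem 2.4.5 vs an exact computation (toy parameters, not from the
# book): downward-drifting ±1 walk, p = 0.4, started at x0 = 5, stopped
# at 0.  P[max_m X_m >= x] = P_{x0}[tau_x < tau_0], a gambler's ruin
# probability, which must sit below the maximal-inequality bound X_0/x.
p, x0 = 0.4, 5
beta = (1 - p) / p                  # beta = 1.5 > 1 since the drift is down
rows = []
for x in (10, 20, 40):
    exact = (beta ** x0 - 1) / (beta ** x - 1)   # exact max-tail
    bound = x0 / x                               # Theorem 2.4.5 bound
    rows.append((x, exact, bound))
    print(x, exact, "<=", bound)
```

For the driftless case ($p = \tfrac12$) the exact tail is $x_0/x$, so the inequality is then attained with equality; the drift here makes it strict.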

The next result includes another important maximal inequality, which shows how an upper bound on the expected increments of a non-negative process gives a probabilistic upper bound on its finite-time maximum. The inequality (2.14) is closely related to the submartingale inequality of Doob (Theorem 2.3.14) and also to the supermartingale inequality Theorem 2.4.5 (take $B = 0$).

Theorem 2.4.7. Let $X_n$ be an integrable $\mathcal{F}_n$-adapted process on $\mathbb{R}_+$. Suppose that for some $B \in \mathbb{R}_+$,
\[ E[X_{n+1} - X_n \mid \mathcal{F}_n] \le B, \quad \text{a.s.} \]
Then, for any stopping time $\nu$ with $E\nu < \infty$, and any $x > 0$, a.s.,
\[ P\Bigl[\max_{0 \le m \le \nu} X_m \ge x \Bigm| \mathcal{F}_0\Bigr] \le \frac{B E[\nu \mid \mathcal{F}_0] + X_0}{x}; \tag{2.12} \]
\[ \text{and} \quad E[\sigma_x \mid \mathcal{F}_0] \ge \frac{x - X_0}{B}. \tag{2.13} \]
In particular, for any $n \ge 0$ and any $x > 0$,
\[ P\Bigl[\max_{0 \le m \le n} X_m \ge x\Bigr] \le \frac{Bn + E X_0}{x}. \tag{2.14} \]

We deduce Theorem 2.4.7 from the following important result.

Theorem 2.4.8. Let $X_n$ be an integrable $\mathcal{F}_n$-adapted process on $\mathbb{R}$, and let $\tau$ be a stopping time. Suppose that for some $B \in \mathbb{R}$,
\[ E[X_{n+1} - X_n \mid \mathcal{F}_n] \le B, \quad \text{on } \{n < \tau\}. \tag{2.15} \]
Then, for any $n \ge 0$, a.s.,
\[ E[X_{n\wedge\tau} \mid \mathcal{F}_0] \le X_0 + B E[n \wedge \tau \mid \mathcal{F}_0]. \tag{2.16} \]

Proof. We may rewrite the condition (2.15) as
\[ E[X_{(m+1)\wedge\tau} - X_{m\wedge\tau} \mid \mathcal{F}_m] \le B 1\{\tau > m\}, \quad \text{for all } m \ge 0. \tag{2.17} \]
Indeed, (2.17) is clearly equivalent to (2.15) on $\{\tau > m\}$, since then $X_{m\wedge\tau} = X_m$ and $X_{(m+1)\wedge\tau} = X_{m+1}$, while on $\{\tau \le m\}$ the statement (2.17) is trivial. Taking conditional expectations given $\mathcal{F}_0$ in (2.17), we obtain
\[ E[X_{(m+1)\wedge\tau} \mid \mathcal{F}_0] - E[X_{m\wedge\tau} \mid \mathcal{F}_0] \le B P[\tau > m \mid \mathcal{F}_0]. \]
Then summing from $m = 0$ to $m = n - 1$ we obtain
\[ E[X_{n\wedge\tau} \mid \mathcal{F}_0] - X_0 \le B E\Bigl[\sum_{m=0}^{n-1} 1\{\tau > m\} \Bigm| \mathcal{F}_0\Bigr], \]
which gives (2.16) since $\sum_{m=0}^{n-1} 1\{\tau > m\} = n \wedge \tau$.

Remark 2.4.9. A variation of Dynkin's formula (see Example 2.3.13) gives
\[ E[X_{n\wedge\tau} \mid \mathcal{F}_0] = X_0 + E\Bigl[\sum_{m=0}^{(n\wedge\tau)-1} E[X_{m+1} - X_m \mid \mathcal{F}_m] \Bigm| \mathcal{F}_0\Bigr]. \]
Theorem 2.4.8 can be viewed as a consequence of this; the elementary proof presented above is still instructive, and demonstrates how weak a form of optional stopping is required.

Now we complete the proof of Theorem 2.4.7.

Proof of Theorem 2.4.7. Take $\tau = \nu \wedge \sigma_x$; then it follows from (2.16) that
\[ X_0 + B E[n \wedge \nu \wedge \sigma_x \mid \mathcal{F}_0] \ge E[X_{n\wedge\nu\wedge\sigma_x} \mid \mathcal{F}_0] \ge x P[\sigma_x \le n \wedge \nu \mid \mathcal{F}_0], \tag{2.18} \]
since $X_{n\wedge\nu\wedge\sigma_x} \ge x 1\{\sigma_x \le n \wedge \nu\}$, given that $X_n \ge 0$ for all $n$. In particular, since $\nu \ge n \wedge \nu \wedge \sigma_x$,
\[ X_0 + B E[\nu \mid \mathcal{F}_0] \ge x P[\sigma_x \le n \wedge \nu \mid \mathcal{F}_0], \]
which gives the maximal inequality (2.12) on taking $n \to \infty$. The simpler version (2.14) follows from (2.12) applied at the deterministic stopping time $\nu = n$.

To prove (2.13) it suffices to suppose that $P[\sigma_x < \infty] = 1$. An application of (2.18) with $\nu = n$ gives
\[ X_0 + B E[\sigma_x \mid \mathcal{F}_0] \ge x P[\sigma_x \le n \mid \mathcal{F}_0], \]
which on letting $n \to \infty$ yields (2.13).

i A Lyapunov function whose drift is uniformly bounded above gives control over the maximum of a process.
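The simplest instance of (2.14) is the Lyapunov function $X_n = S_n^2$ for symmetric SRW, where $E[X_{n+1} - X_n \mid \mathcal{F}_n] = 1$, i.e. $B = 1$ and $X_0 = 0$, giving $P[\max_{m \le n} S_m^2 \ge x] \le n/x$. A quick Monte Carlo sketch (our parameters):

```python
import random

# Monte Carlo illustration of (2.14) applied to X_n = S_n^2 for
# symmetric SRW: B = 1, X_0 = 0, so P[max_{m<=n} |S_m| >= lam] <= n/lam^2.
random.seed(1)
n, lam, trials = 100, 25, 4000
hits = 0
for _ in range(trials):
    s, running_max = 0, 0
    for _ in range(n):
        s += random.choice((-1, 1))
        running_max = max(running_max, abs(s))
    if running_max >= lam:
        hits += 1
estimate = hits / trials
bound = n / (lam * lam)     # (Bn + E X_0)/x with x = lam^2
print(estimate, "<=", bound)
```

Note this recovers Kolmogorov's inequality from the drift bound alone, without using the martingale property beyond the conditional increment estimate.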

Theorems 2.4.7 and 2.4.8 will form basic tools later on; for now we make an aside to give a corollary pertaining to the so-called finite time stochastic stability of Markov chains.


Corollary 2.4.10 (Kushner–Kalashnikov inequality). Let $X_n$ be a time-homogeneous Markov chain on state space $(\Sigma, \mathcal{E})$. Suppose that for measurable $f : \Sigma \to \mathbb{R}_+$, $A \in \mathcal{E}$, and $B \in \mathbb{R}_+$,
\[ E[f(X_{n+1}) - f(X_n) \mid X_n = x] \le B, \quad \text{whenever } x \in A. \]
Then, if $\sigma_A = \min\{n \ge 0 : X_n \notin A\}$, for $n \in \mathbb{Z}_+$ and $x \in A$,
\[ P[\sigma_A \le n \mid X_0 = x] \le \frac{Bn + f(x)}{\inf_{y \notin A} f(y)}. \]

Proof. Let $\mathcal{F}_n = \sigma(X_0, \ldots, X_n)$. Apply (2.12) to the process $f(X_n)$ at the stopping time $\nu = n \wedge \sigma_A$ to get
\[ P[\sigma_A \le n \mid \mathcal{F}_0] \le P\Bigl[\max_{0 \le m \le n\wedge\sigma_A} f(X_m) \ge \inf_{y \notin A} f(y) \Bigm| \mathcal{F}_0\Bigr] \le \frac{B E[n \wedge \sigma_A \mid \mathcal{F}_0] + f(X_0)}{\inf_{y \notin A} f(y)}, \]
which gives the result.

Here is a result in the opposite direction, giving an upper bound on the expected time to exit an interval. Again, for $x > 0$ we use the notation $\sigma_x = \min\{n \ge 0 : |X_n| \ge x\}$.

Theorem 2.4.11. Let $X_n$ be an $\mathcal{F}_n$-adapted process on $\mathbb{R}_+$ and let $\eta$ be a stopping time. Fix $x \ge 0$. Suppose that there exist a non-decreasing function $f : \mathbb{R}_+ \to \mathbb{R}_+$ and a constant $\varepsilon > 0$ for which $f(X_n)$ is integrable and
\[ E[f(X_{n+1}) - f(X_n) \mid \mathcal{F}_n] \ge \varepsilon, \quad \text{on } \{n < \eta \wedge \sigma_x\}. \tag{2.19} \]
Then
\[ E[\eta \wedge \sigma_x] \le \varepsilon^{-1} E f(X_{\sigma_x}). \tag{2.20} \]

Proof. Let $x \ge 0$, and write $Y_n = f(X_{n\wedge\eta\wedge\sigma_x})$. We may express (2.19) as
\[ E[Y_{m+1} - Y_m \mid \mathcal{F}_m] \ge \varepsilon 1\{\eta \wedge \sigma_x > m\}, \]
for any $m \ge 0$. We obtain $E Y_{m+1} - E Y_m \ge \varepsilon P[\eta \wedge \sigma_x > m]$ on taking expectations, and then summing for $m$ from $0$ up to $n$ we get
\[ E Y_{n+1} - E Y_0 \ge \varepsilon \sum_{m=0}^{n} P[\eta \wedge \sigma_x > m]. \]
Since $Y_0 = f(X_0) \ge 0$, it follows that
\[ E[\eta \wedge \sigma_x] = \lim_{n\to\infty} \sum_{m=0}^{n} P[\eta \wedge \sigma_x > m] \le \varepsilon^{-1} \limsup_{n\to\infty} E Y_{n+1}. \]
Since $f$ is non-decreasing, $Y_n = f(X_{n\wedge\eta\wedge\sigma_x}) \le f(X_{\sigma_x})$, a.s., for all $n \in \mathbb{Z}_+$. The result now follows.

i A Lyapunov function whose drift is uniformly bounded below shows that a process is unlikely to stay in a bounded region for too long.

Some refinements of Theorem 2.4.11 are given in Section 3.10 below. Although the proof of Theorem 2.4.11 is elementary, it is quite powerful. For example, an easy consequence is the following result, which shows that a martingale with sufficient variability will exit an interval at least 'diffusively'. In the particular case where $\eta = \infty$, Theorem 2.4.12 reduces to a semimartingale version of Kolmogorov's 'other' inequality.

Theorem 2.4.12. Suppose that $X_n$ is a square-integrable $\mathcal{F}_n$-adapted process on $\mathbb{R}$ with $X_0 = 0$. Suppose that there exist a stopping time $\eta$ and constants $v > 0$, $B \in \mathbb{R}_+$ such that, for all $n \ge 0$,
\[ E[(X_{n+1} - X_n)^2 \mid \mathcal{F}_n] \ge v 1\{n < \eta\}, \tag{2.21} \]
\[ P[|X_{n+1} - X_n| \le B \mid \mathcal{F}_n] = 1. \tag{2.22} \]
Suppose also that either (i) $X_n$ is a martingale; or (ii) $X_n$ is a non-negative submartingale. Then for any $n \ge 0$ and any $x > 0$,
\[ P\Bigl[\max_{0 \le m \le n} |X_m| \ge x\Bigr] \ge 1 - \frac{(x + B)^2}{vn} - P[\eta \le n]. \]

The lower bound on second moments in (2.21) is a form of uniform non-degeneracy. The proof of Theorem 2.4.12 uses the following elementary algebraic identity:
\[ a^2 - b^2 = (a - b)^2 + 2b(a - b), \quad (a, b \in \mathbb{R}). \tag{2.23} \]
Although trivial, (2.23) will be used repeatedly in this book, so we refer to it as the squared-difference identity.


Proof of Theorem 2.4.12. By (2.21), (2.23), and the (sub)martingale assumption, we have that
\[ E[X_{n+1}^2 - X_n^2 \mid \mathcal{F}_n] = E[(X_{n+1} - X_n)^2 \mid \mathcal{F}_n] + 2 X_n E[X_{n+1} - X_n \mid \mathcal{F}_n] \ge v, \]
on $\{n < \eta\}$, while $|X_{\sigma_x}| \le x + B$, a.s., by (2.22). So we may apply Theorem 2.4.11 to the process $X_n^2 \in \mathbb{R}_+$ with $f(x) = x$, to obtain $E[\eta \wedge \sigma_x] \le v^{-1}(x + B)^2$. Then, $P[\sigma_x \le n] \ge P[\eta \wedge \sigma_x \le n] - P[\eta \le n]$, which, with Markov's inequality, gives the desired result.

Some generalizations of Theorem 2.4.12 are given in Chapter 4.

Example 2.4.13. Let $X_n$ be an $\mathcal{F}_n$-adapted process on $\mathbb{R}_+$. Suppose that there exists an increasing $f : \mathbb{R}_+ \to \mathbb{R}_+$ such that $E f(X_0) < \infty$ and
\[ E[f(X_{n+1}) - f(X_n) \mid \mathcal{F}_n] = B \in (0, \infty), \quad \text{a.s.} \]
Since $f$ is increasing, $\sigma_x = \min\{n \ge 0 : f(X_n) \ge f(x)\}$. Then applications of (2.13) and Theorem 2.4.11 to the process $f(X_n)$ show that
\[ \frac{f(x) - E f(X_0)}{B} \le E\sigma_x \le \frac{E f(X_{\sigma_x})}{B}. \]
Assuming that $f(x) \to \infty$ as $x \to \infty$, and
\[ \lim_{x\to\infty} \frac{E f(X_{\sigma_x})}{f(x)} = 1, \tag{2.24} \]
it follows that
\[ \lim_{x\to\infty} \frac{E\sigma_x}{f(x)} = \frac{1}{B}, \]
which is sometimes known as a weak renewal theorem. Sufficient for (2.24) is that $X_{n+1} - X_n$ is uniformly bounded above and $f$ does not grow too fast; see also Section 3.10. Lemma 2.2.4(i) provides an example. $\triangle$
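The weak renewal conclusion of Example 2.4.13 is easy to observe numerically with $f(x) = x$. A sketch under our own toy assumptions: steps uniform on $\{0, 1, 2\}$, so $B = 1$, increments are bounded above, and $E\sigma_x / x \to 1$:

```python
import random

# Weak renewal check (Example 2.4.13) with f(x) = x.  Toy model:
# increments uniform on {0, 1, 2}, so B = 1; the two-sided bounds give
# x <= E[sigma_x] <= x + 1, hence E[sigma_x]/x -> 1/B = 1.
random.seed(2)
x, trials = 200, 500
total_time = 0
for _ in range(trials):
    pos, t = 0, 0
    while pos < x:                  # sigma_x: first time the process >= x
        pos += random.choice((0, 1, 2))
        t += 1
    total_time += t
mean_sigma = total_time / trials
print(mean_sigma / x)               # close to 1/B = 1
```

Here the bounded overshoot makes condition (2.24) automatic: $E f(X_{\sigma_x}) \le x + 1$, so both sides of the sandwich are $x(1 + o(1))$.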

Another very useful result for bounding probabilities of displacement, typically in the situation when the size of displacement is much bigger than one would normally expect, is the Azuma–Hoeffding inequality. Here we state the traditional inequality alongside its one-sided versions; since the latter are not easily found in the literature, we give the proof (which is essentially the standard one).


Theorem 2.4.14 (Azuma–Hoeffding inequalities). Suppose that $(c_k, k \ge 1)$ are finite positive constants. Suppose that $X_n$ is a supermartingale such that $|X_k - X_{k-1}| \le c_k$ a.s. for all $k \ge 1$. Then for all $a > 0$ and all $n \ge 0$,
\[ P[X_n - X_0 \ge a \mid \mathcal{F}_0] \le \exp\Bigl(-\frac{a^2}{2\sum_{k=1}^n c_k^2}\Bigr). \tag{2.25} \]
If $X_n$ is a submartingale with $|X_k - X_{k-1}| \le c_k$, then
\[ P[X_n - X_0 \le -a \mid \mathcal{F}_0] \le \exp\Bigl(-\frac{a^2}{2\sum_{k=1}^n c_k^2}\Bigr). \tag{2.26} \]
Finally, if $X_n$ is a martingale with $|X_k - X_{k-1}| \le c_k$, then
\[ P[|X_n - X_0| \ge a \mid \mathcal{F}_0] \le 2\exp\Bigl(-\frac{a^2}{2\sum_{k=1}^n c_k^2}\Bigr). \tag{2.27} \]

Proof. We prove only (2.25); then (2.26) follows by considering $(-X_n)$, and (2.27) by the union bound.

For any fixed $\lambda$, $g(x) = e^{\lambda x}$ is a convex function. Let $c > 0$. Since the line segment with endpoints $(-c, e^{-\lambda c})$ and $(c, e^{\lambda c})$ lies above the graph of $g$, we have that for all $x \in [-c, c]$,
\[ e^{\lambda x} \le \frac{e^{\lambda c} - e^{-\lambda c}}{2c} x + \frac{e^{\lambda c} + e^{-\lambda c}}{2}. \]
So, for $\lambda > 0$, since $X_n$ is a supermartingale, we obtain
\begin{align*} E[e^{\lambda(X_k - X_{k-1})} \mid \mathcal{F}_{k-1}] &\le E\Bigl[\frac{e^{\lambda c_k} - e^{-\lambda c_k}}{2c_k}(X_k - X_{k-1}) + \frac{e^{\lambda c_k} + e^{-\lambda c_k}}{2} \Bigm| \mathcal{F}_{k-1}\Bigr] \\ &\le \frac{e^{\lambda c_k} + e^{-\lambda c_k}}{2} \le e^{\lambda^2 c_k^2/2}, \tag{2.28} \end{align*}
where we used the elementary inequality
\[ \frac{e^y + e^{-y}}{2} = \sum_{k=0}^\infty \frac{y^{2k}}{(2k)!} \le \sum_{k=0}^\infty \frac{y^{2k}}{2^k k!} = e^{y^2/2}. \]
With (2.28), we obtain by induction
\[ E[e^{\lambda(X_n - X_0)} \mid \mathcal{F}_0] \le \exp\Bigl(\frac12 \lambda^2 \sum_{k=1}^n c_k^2\Bigr). \]
Now the (conditional) Markov inequality gives, for $a > 0$ and $\lambda > 0$,
\[ P[X_n - X_0 \ge a \mid \mathcal{F}_0] \le e^{-\lambda a} E[e^{\lambda(X_n - X_0)} \mid \mathcal{F}_0] \le \exp\Bigl(-\lambda a + \frac12 \lambda^2 \sum_{k=1}^n c_k^2\Bigr), \]
and taking $\lambda = a\bigl(\sum_{k=1}^n c_k^2\bigr)^{-1}$ we conclude the proof.
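For symmetric SRW ($c_k = 1$) the tail $P[S_n \ge a]$ is an exact binomial sum, so the bound (2.25) can be compared directly with the truth. A sketch with our own choices $n = 30$ and a few values of $a$:

```python
from math import comb, exp

# Azuma-Hoeffding sanity check for symmetric SRW (c_k = 1): compare the
# exact tail P[S_n >= a] with the bound exp(-a^2/(2n)) from (2.25).
# S_n = 2H - n where H ~ Binomial(n, 1/2), so S_n >= a iff H >= (n+a)/2.
n = 30
rows = []
for a in (6, 10, 14):               # even values keep (n + a)/2 an integer
    exact = sum(comb(n, k) for k in range((n + a) // 2, n + 1)) / 2 ** n
    bound = exp(-a * a / (2 * n))
    rows.append((a, exact, bound))
    print(a, exact, "<=", bound)
```

As usual for exponential inequalities, the bound captures the Gaussian order of decay but loses a polynomial prefactor.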

We finish this section with an example of particularly wide-ranging excursions.

Example 2.4.15. Let $S_n$ be symmetric SRW on $\mathbb{Z}^2$, and consider $f(\mathbf{x}) = (\log(1 + \|\mathbf{x}\|^2))^\gamma$ for $\gamma > 1$. Let $U_2 := \{\pm\mathbf{e}_1, \pm\mathbf{e}_2\}$ denote the possible jumps of the walk. Then for $\mathbf{e} \in U_2$, we have
\[ f(\mathbf{x} + \mathbf{e}) - f(\mathbf{x}) = f(\mathbf{x})\Biggl[\Biggl(1 + \frac{\log\bigl(1 + \frac{2\mathbf{e}\cdot\mathbf{x} + 1}{1 + \|\mathbf{x}\|^2}\bigr)}{\log(1 + \|\mathbf{x}\|^2)}\Biggr)^{\!\gamma} - 1\Biggr]. \]
Here, we have from Taylor's formula that
\[ \log\Bigl(1 + \frac{2\mathbf{e}\cdot\mathbf{x} + 1}{1 + \|\mathbf{x}\|^2}\Bigr) = \frac{2\mathbf{e}\cdot\mathbf{x} + 1}{\|\mathbf{x}\|^2} - \frac{2(\mathbf{e}\cdot\mathbf{x})^2}{\|\mathbf{x}\|^4} + O(\|\mathbf{x}\|^{-3}), \]
so that, with the Taylor formula for $(1 + x)^\gamma$, we obtain
\[ f(\mathbf{x} + \mathbf{e}) - f(\mathbf{x}) = \gamma\|\mathbf{x}\|^{-2}(\log(1 + \|\mathbf{x}\|^2))^{\gamma-1}\Bigl(2\mathbf{e}\cdot\mathbf{x} + 1 - \frac{2(\mathbf{e}\cdot\mathbf{x})^2}{\|\mathbf{x}\|^2} + \frac{2(\gamma - 1)(\mathbf{e}\cdot\mathbf{x})^2}{\|\mathbf{x}\|^2\log(1 + \|\mathbf{x}\|^2)} + O(\|\mathbf{x}\|^{-1})\Bigr). \]
Using the facts that $\sum_{\mathbf{e}\in U_2} 1 = 4$, $\sum_{\mathbf{e}\in U_2} \mathbf{e}\cdot\mathbf{x} = 0$, and $\sum_{\mathbf{e}\in U_2} (\mathbf{e}\cdot\mathbf{x})^2 = 2\|\mathbf{x}\|^2$, it follows that, as $\|\mathbf{x}\| \to \infty$,
\[ E[f(S_{n+1}) - f(S_n) \mid S_n = \mathbf{x}] = \frac14\sum_{\mathbf{e}\in U_2}\bigl(f(\mathbf{x} + \mathbf{e}) - f(\mathbf{x})\bigr) = \gamma(\gamma - 1)\|\mathbf{x}\|^{-2}(\log(1 + \|\mathbf{x}\|^2))^{\gamma-2}(1 + o(1)), \]
which, for $\gamma > 1$, is positive for all $\|\mathbf{x}\|$ sufficiently large.

Suppose $S_0 \ne \mathbf{0}$. Then, if $\tau = \min\{n \ge 0 : S_n = \mathbf{0}\}$ and $\sigma_x = \min\{n \ge 0 : \|S_n\| \ge x\}$, we have that $Y_n = f(S_{n\wedge\tau\wedge\sigma_x})$ is a non-negative submartingale, bounded above by $(\log(2 + x^2))^\gamma$ since $\|S_{n\wedge\sigma_x}\|^2 \le 1 + x^2$. Thus $Y_n$ is uniformly integrable, and by optional stopping (Theorem 2.3.7),
\[ (\log 2)^\gamma \le f(S_0) = Y_0 \le E Y_{\tau\wedge\sigma_x} \le P[\sigma_x < \tau](\log(2 + x^2))^\gamma, \]
since $Y_\tau = f(\mathbf{0}) = 0$. In other words, for any $\varepsilon > 0$, there exists $c > 0$ such that, for all $x \ge 2$,
\[ P\Bigl[\max_{0 \le m \le \tau} \|S_m\| \ge x\Bigr] \ge c(\log x)^{-1-\varepsilon}. \quad \triangle \]
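The positivity of the drift of $f$ far from the origin can be checked numerically without any Taylor expansion; a minimal sketch (points and $\gamma = 2$ are our choices):

```python
from math import log

# Numerical check of Example 2.4.15: for gamma > 1, the function
# f(x) = (log(1 + |x|^2))^gamma has positive one-step drift under
# symmetric SRW on Z^2, at points far from the origin.
def drift(x, y, gamma):
    f = lambda a, b: log(1 + a * a + b * b) ** gamma
    nbrs = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return sum(f(a, b) for a, b in nbrs) / 4 - f(x, y)

drifts = [drift(px, py, 2.0) for px, py in [(50, 0), (30, 40), (0, 100)]]
print(drifts)        # small but strictly positive values
```

The computed values are tiny, of order $\|\mathbf{x}\|^{-2}$, matching the asymptotic $\gamma(\gamma-1)\|\mathbf{x}\|^{-2}(\log(1+\|\mathbf{x}\|^2))^{\gamma-2}$ derived above for $\gamma = 2$.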

2.5 Recurrence and transience criteria for Markov chains

We start with a technical result that will allow us to convert results on hitting times of sets (such as we obtain by Lyapunov function arguments) to results on return times to points, and hence statements about recurrence and transience as defined at Definition 2.1.4.

Lemma 2.5.1. Let $X_n$ be an irreducible Markov chain on a countable state space $\Sigma$.

(i) If for some $x \in \Sigma$ and some non-empty $A \subseteq \Sigma$, $P_x[\tau_A < \infty] < 1$, then $X_n$ is transient.

(ii) If for some finite non-empty $A \subseteq \Sigma$ and all $x \in \Sigma \setminus A$, $P_x[\tau_A < \infty] = 1$, then $X_n$ is recurrent.

Proof. We prove part (i). If $A$ is non-empty and $P_x[\tau_A = \infty] > 0$, then there is some $y \in A$, $y \ne x$ (since $P_x[\tau_x = \infty] = 0$). But $\tau_y \ge \tau_A$, so $P_x[\tau_y = \infty] > 0$. By irreducibility, for any $x \ne y$, $P_y[\tau_x < \tau_y^+] > 0$; otherwise the strong Markov property would show that $P_y[\tau_x < \infty] = 0$, contradicting irreducibility. So, by the strong Markov property,
\[ P_y[\tau_y^+ = \infty] \ge P_y[\tau_x < \tau_y^+] P_x[\tau_y = \infty] > 0, \]
which shows transience by Definition 2.1.4.

Part (ii) is demonstrated in Example 2.3.20.

The two basic criteria for recurrence and transience of Markov chains that we present in this section have at their foundation the supermartingale convergence result Theorem 2.3.6. We first present the recurrence criterion.


Theorem 2.5.2 (Recurrence criterion). An irreducible Markov chain $X_n$ on a countably infinite state space $\Sigma$ is recurrent if and only if there exist a function $f : \Sigma \to \mathbb{R}_+$ and a finite non-empty set $A \subset \Sigma$ such that
\[ E[f(X_{n+1}) - f(X_n) \mid X_n = x] \le 0, \quad \text{for all } x \in \Sigma \setminus A, \tag{2.29} \]
and $f(x) \to \infty$ as $x \to \infty$.

We emphasize an elementary but important point about the notation here. The key to correct interpretation of the statement '$f(x) \to \infty$ as $x \to \infty$' is the fact that the domain of $f$ is precisely the countable set $\Sigma$: the first '$\infty$' in the statement is the usual infinity of $\mathbb{R}_+$, while the second refers to escape from finite subsets of $\Sigma$. Thus the statement '$f(x) \to \infty$ as $x \to \infty$' entails that for any enumeration $x_1, x_2, \ldots$ of the elements of $\Sigma$, $\lim_{n\to\infty} f(x_n) = \infty$. Indeed, given such an enumeration, $\liminf_{n\to\infty} f(x_n) = \infty$, so the set $\{x \in \Sigma : f(x) \le M\}$ is finite for any $M \in \mathbb{R}_+$; hence the choice of enumeration of $\Sigma$ does not matter.

If $\Sigma \subset \mathbb{R}^d$ is unbounded and locally finite, then any enumeration $x_1, x_2, \ldots$ of $\Sigma$ has $\lim_{n\to\infty}\|x_n\| = \infty$; thus we may restate an 'embedded' version of Theorem 2.5.2 as follows.

Corollary 2.5.3. An irreducible Markov chain $X_n$ on an unbounded and locally finite state space $\Sigma \subset \mathbb{R}^d$ is recurrent if and only if there exist a function $f : \mathbb{R}^d \to \mathbb{R}_+$ and a bounded non-empty set $A \subset \Sigma$ such that (2.29) holds and $f(x) \to \infty$ as $\|x\| \to \infty$.

Proof of Theorem 2.5.2. To prove that having a function satisfying (2.29) is sufficient for recurrence, take $X_0 = x \in \Sigma$. Set $Y_n = f(X_{n\wedge\tau_A})$ and observe that $Y_n$ is a non-negative supermartingale. Then, by Theorem 2.3.6, there exists a random variable $Y_\infty$ such that $Y_n \to Y_\infty$ a.s. and
\[ E_x Y_\infty \le E_x Y_0 = f(x), \tag{2.30} \]
for any $x \in \Sigma$. On the other hand, since $f(x) \to \infty$, the set $\{y \in \Sigma : f(y) \le M\}$ is finite for any $M \in \mathbb{R}_+$; so irreducibility implies that $\limsup_{n\to\infty} f(X_n) = +\infty$ a.s. on $\{\tau_A = \infty\}$. Hence, on $\{\tau_A = \infty\}$, we must have $Y_\infty = \lim_{n\to\infty} Y_n = +\infty$, a.s. This would contradict (2.30) if we assumed that $P_x[\tau_A = \infty] > 0$, because then $E_x[Y_\infty] \ge E_x[Y_\infty 1\{\tau_A = \infty\}] = \infty$. Hence $P_x[\tau_A = \infty] = 0$ for all $x \in \Sigma$, which means that the Markov chain is recurrent, by Lemma 2.5.1(ii).

For the 'only if' part, see the proof of Theorem 2.2.1 of [84].


Example 2.5.4. Consider the nearest-neighbour random walk on $\mathbb{Z}_+$ from Section 2.2, and take $f(x) = h(x)$ as defined at (2.7). Then Lemma 2.2.2 shows that
\[ E[f(X_{n+1}) - f(X_n) \mid X_n = x] = 0, \quad \text{for all } x \ne 0, \]
so Theorem 2.5.2 shows that $X_n$ is recurrent if $\lim_{x\to\infty} h(x) = \infty$. This is a Lyapunov function proof of the 'if' half of Theorem 2.2.5(i). $\triangle$

Example 2.5.5. Let $X_n$ be an irreducible Markov chain on a locally finite subset of $\mathbb{R}_+$. Suppose that for some bounded set $A$,
\[ E[X_{n+1} - X_n \mid X_n = x] = 0, \quad \text{for all } x \notin A, \tag{2.31} \]
i.e., the process has zero drift outside $A$. Corollary 2.5.3 (with $f(x) = x$) shows that $X_n$ is recurrent. However, without additional conditions the same result is not true if the state space is allowed to be a subset of $\mathbb{R}$ rather than just $\mathbb{R}_+$: see the notes at the end of this chapter. This contrasts with the situation when $X_n$ is a spatially homogeneous random walk on $\mathbb{R}$, i.e., a sum of i.i.d. random variables, in which case zero drift does imply recurrence; see e.g. [137, Ch. 9]. $\triangle$

To be able to conclude that zero drift implies recurrence for a Markov chain on $\mathbb{R}$, we need to impose additional moment assumptions on the increments, including a uniform non-degeneracy condition.

Theorem 2.5.6. Let $X_n$ be an irreducible Markov chain on a locally finite subset of $\mathbb{R}$. Suppose that for some bounded set $A$, (2.31) holds. Suppose that for some $p > 2$ and $v > 0$,
\[ \sup_x E[|X_{n+1} - X_n|^p \mid X_n = x] < \infty; \]
\[ \inf_x E[(X_{n+1} - X_n)^2 \mid X_n = x] \ge v. \]
Then $X_n$ is recurrent.

Proof. We will apply Corollary 2.5.3 with $f(x) = \log(1 + |x|)$. Write $\Delta = X_1 - X_0$ and let $E_x$ denote the event $\{|\Delta| < |x|\}$. We compute
\[ E[f(X_{n+1}) - f(X_n) \mid X_n = x] = E_x[(f(x + \Delta) - f(x))1(E_x)] + E_x[(f(x + \Delta) - f(x))1(E_x^c)]. \]
On $\{|\Delta| < |x|\}$ we have that $x$ and $x + \Delta$ have the same sign, so
\begin{align*} E_x[(f(x + \Delta) - f(x))1(E_x)] &= E_x\Bigl[\log\Bigl(\frac{1 + |x + \Delta|}{1 + |x|}\Bigr)1(E_x)\Bigr] = E_x\Bigl[\log\Bigl(1 + \frac{\Delta\operatorname{sgn}(x)}{1 + |x|}\Bigr)1(E_x)\Bigr] \\ &\le \frac{\operatorname{sgn}(x)}{1 + |x|} E_x[\Delta 1(E_x)] - \frac16(1 + |x|)^{-2} E_x[\Delta^2 1(E_x)], \end{align*}
using the inequality $\log(1 + y) \le y - \frac16 y^2$ for all $-1 < y \le 1$. Here, since $E_x\Delta = 0$ for $x \notin A$, as $|x| \to \infty$,
\[ |E_x[\Delta 1(E_x)]| \le E_x[|\Delta| 1(E_x^c)] \le E_x[|\Delta|^p |x|^{1-p}] = o(|x|^{-1}). \]
Similarly,
\[ E_x[\Delta^2 1(E_x)] \ge v - E_x[\Delta^2 1(E_x^c)] \ge v - o(1), \quad \text{as } |x| \to \infty. \]
Finally, we estimate the term
\[ |E_x[(f(x + \Delta) - f(x))1(E_x^c)]| \le E_x\bigl[(\log(1 + |\Delta|) + \log(1 + 2|\Delta|))1(E_x^c)\bigr]. \]
Here,
\[ \log(1 + 2|\Delta|)1(E_x^c) = \log(1 + 2|\Delta|)|\Delta|^{-p}|\Delta|^p 1(E_x^c) \le |x|^{-p}\log(1 + 2|x|)|\Delta|^p, \]
for all $x$ with $|x|$ greater than some $x_0$ sufficiently large, using the fact that $y \mapsto y^{-p}\log(1 + 2y)$ is eventually decreasing. It follows that
\[ |E_x[(f(x + \Delta) - f(x))1(E_x^c)]| \le 2|x|^{-p}\log(1 + 2|x|) E_x[|\Delta|^p] = o(|x|^{-2}). \]
Combining these calculations, we obtain
\[ E[f(X_{n+1}) - f(X_n) \mid X_n = x] \le -\frac{v}{6}(1 + |x|)^{-2} + o(|x|^{-2}), \]
which is negative for all $x$ with $|x|$ sufficiently large. Thus we may apply Corollary 2.5.3 to deduce recurrence.
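The mechanism in this proof is that $\log(1 + |x|)$ is concave, so a zero-drift, non-degenerate increment pushes it down on average. This is visible even in the simplest case, the symmetric $\pm 1$ walk (our toy instance, where the tail-truncation in the proof is unnecessary):

```python
from math import log

# Drift of the Lyapunov function f(x) = log(1 + |x|) from Theorem 2.5.6
# under the symmetric ±1 walk (zero drift, variance v = 1): concavity of
# log makes the one-step drift strictly negative, roughly of order x^{-2}.
f = lambda x: log(1 + abs(x))
drifts = []
for x in (10, 100, 1000):
    d = 0.5 * f(x + 1) + 0.5 * f(x - 1) - f(x)   # E[f(X_1) - f(X_0) | X_0 = x]
    drifts.append(d)
    print(x, d)
```

The constant $\tfrac16$ in the proof is just a convenient uniform bound coming from $\log(1+y) \le y - \tfrac16 y^2$; the true second-order coefficient for this walk is closer to $\tfrac12(1+|x|)^{-2}$.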

The idea of Theorem 2.5.6, and the interplay between first and second moments of increments for determining recurrence, will be picked up again in detail in Chapter 3. When second moments fail to exist, heavy-tailed phenomena come into play; some of these are explored in Chapter 5. In higher dimensions, the analogue of Theorem 2.5.6 fails, and either recurrence or transience is possible: see Chapter 4.

Next we formulate a criterion for transience.

Page 72: Non-homogeneous random walks - Unicamp

54 Chapter 2. General results for hitting times

Theorem 2.5.7 (Transience criterion). An irreducible Markov chain $X_n$ on a countable state space $\Sigma$ is transient if and only if there exist a function $f : \Sigma \to \mathbb{R}_+$ and a non-empty set $A \subset \Sigma$ such that
\[ E[f(X_{n+1}) - f(X_n) \mid X_n = x] \le 0, \quad \text{for all } x \in \Sigma \setminus A, \tag{2.32} \]
and
\[ f(y) < \inf_{x \in A} f(x), \quad \text{for at least one site } y \in \Sigma \setminus A. \tag{2.33} \]

Note that, unlike the recurrence criterion, here the set $A$ need not be finite. The central semimartingale idea of the proof of Theorem 2.5.7 is very simple, and will be useful in other contexts; thus we extract it as the following lemma.

Lemma 2.5.8. Let $X_n$ be an $\mathcal{F}_n$-adapted process with state space $(\Sigma, \mathcal{E})$. Suppose that $f : \Sigma \to \mathbb{R}_+$ is measurable, and $A \in \mathcal{E}$ is such that
\[ E[f(X_{n+1}) - f(X_n) \mid \mathcal{F}_n] \le 0, \quad \text{on } \{X_n \in \Sigma \setminus A\}, \tag{2.34} \]
for all $n \ge 0$. Then, a.s.,
\[ P[\tau_A < \infty \mid \mathcal{F}_0] \le \frac{f(X_0)}{\inf_{x \in A} f(x)}. \]

Proof. Define the process $Y_n = f(X_{n\wedge\tau_A})$. Then (2.34) implies that $Y_n$ is a supermartingale adapted to $\mathcal{F}_n$. Since $Y_n$ is also non-negative, Theorem 2.3.6 implies that there is a random variable $Y_\infty \in \mathbb{R}_+$ such that $\lim_{n\to\infty} Y_n = Y_\infty$ a.s. Also, by Theorem 2.3.11,
\[ Y_0 \ge E[Y_\infty \mid \mathcal{F}_0] \ge E[Y_\infty 1\{\tau_A < \infty\} \mid \mathcal{F}_0]. \]
Here we have that, a.s.,
\[ Y_\infty 1\{\tau_A < \infty\} = \lim_{n\to\infty} Y_n 1\{\tau_A < \infty\} = f(X_{\tau_A})1\{\tau_A < \infty\} \ge \inf_{x \in A} f(x) \, 1\{\tau_A < \infty\}. \]
So we obtain
\[ f(X_0) = Y_0 \ge P[\tau_A < \infty \mid \mathcal{F}_0] \inf_{x \in A} f(x), \]
which yields the result.


i A Lyapunov function that is a supermartingale outside a set gives an upper bound on the probability that the process, started outside that set, ever reaches it.

Proof of Theorem 2.5.7. For the ‘if’ part, we apply Lemma 2.5.8. In this case, with y ∈ Σ \ A as in (2.33), we obtain

Py[τA < ∞] ≤ f(y) / infx∈A f(x) < 1,

proving the transience of the Markov chain Xn, by Lemma 2.5.1(i).

For the ‘only if’ part, fix an arbitrary x0 ∈ Σ, set A = {x0}, and f(x) = Px[τx0 < ∞] for x ∈ Σ (so, in particular, f(x0) = 1). Then (2.32) holds with equality for all x ≠ x0, and, by transience, one can find y ∈ Σ such that f(y) < 1 = f(x0).

An inspection of the ‘only if’ part of the preceding proof shows that, if a Markov chain is transient, a function f will be an apt Lyapunov function if the value f(x) is given by the probability that the process reaches some fixed set starting from x. This suggests the following heuristic rule:

To find a Lyapunov function that proves transience, fix a well-chosen A ⊂ Σ, and set f(x) to be your reasonable guess for the probability to ever reach A starting from x.

Example 2.5.9. Consider the one-dimensional SRW Sn, as in Example 2.1.7.

• To prove recurrence for p = 1/2, take A = {0} and f(x) = |x|; then (2.29) holds with equality for all x ≠ 0.

• To prove transience of the SRW for p > 1/2, take A = (−∞, 0] and f(x) = ((1−p)/p)^x; then f(x) → 0 as x → +∞ and (2.32) holds with equality. For p < 1/2, take A = [0, +∞) and the same f. Or just take A = {0} in both cases, and choose any y > 0 in the first case and any y < 0 in the second one to have (2.33). ◁
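In the transient case the Lyapunov bound of Lemma 2.5.8 is in fact sharp: for the SRW with p > 1/2 started at y > 0 and A = {0}, the probability of ever hitting 0 equals f(y) = ((1−p)/p)^y. The following Python sketch checks this by Monte Carlo; the parameters p = 0.7, y = 3 and the escape level are our choices, not from the text.

```python
import random

def hit_zero_prob(p, start, trials=20000, ceiling=40, seed=1):
    """Estimate P[SRW with up-probability p, started at `start`, ever hits 0].
    A path that climbs `ceiling` above its start is counted as escaping:
    the chance of returning from there is ((1-p)/p)**ceiling, negligible."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = start
        while 0 < x < start + ceiling:
            x += 1 if rng.random() < p else -1
        if x == 0:
            hits += 1
    return hits / trials

p, y = 0.7, 3
estimate = hit_zero_prob(p, y)
lyapunov_value = ((1 - p) / p) ** y  # f(y) with f(x) = ((1-p)/p)**x; inf over A is f(0) = 1
```

With these parameters f(y) = (3/7)³ ≈ 0.079, and the empirical frequency agrees to within Monte Carlo accuracy.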

Theorem 2.5.7 has the following consequence, which is sometimes useful.

Corollary 2.5.10. An irreducible Markov chain Xn on a countable state space Σ is transient if there exist a non-constant function h : Σ → R and a state x0 ∈ Σ such that supx∈Σ |h(x)| < ∞ and

E[h(Xn+1)− h(Xn) | Xn = x] = 0, for all x 6= x0. (2.35)


Proof. By swapping h with −h, it suffices to consider the non-constant condition in the form h(x) < h(x0) for some x ≠ x0. Then taking f(x) = h(x) − infx∈Σ h(x) we have that f : Σ → R+ satisfies the conditions of Theorem 2.5.7 with A = {x0}, and transience follows.

Example 2.5.11. Consider the nearest-neighbour random walk on Z+ from Section 2.2. With h as defined at (2.7), we see that if supx h(x) < ∞, then Lemma 2.2.2 shows that the conditions of Corollary 2.5.10 are satisfied, so that Xn is transient. This is a Lyapunov function proof of the ‘only if’ half of Theorem 2.2.5(i). ◁

Next, we formulate a sufficient condition for transience, first for general processes, and then for Markov chains.

Theorem 2.5.12. For a real-valued Fn-adapted process Xn, define τa = min{n ≥ 1 : Xn − X0 ≤ −a}, a ≥ 0, and suppose that, for some ε > 0,

E[Xn+1 − Xn | Fn] ≥ ε, on {n < τa}, (2.36)

for all n ≥ 0. Suppose also there exists K > 0 such that, for all n ≥ 0,

|Xn+1 − Xn| ≤ K, on {n < τa}. (2.37)

Then there exist positive constants c1, c2, depending only on ε and K, such that

P[τa <∞ | F0] ≤ c1e−c2a, for all a ≥ 0. (2.38)

Also, we have

P[τ0 =∞ | F0] = P[Xn > X0 for all n ≥ 1 | F0] ≥ c3 > 0, (2.39)

for some constant c3 depending only on ε and K.

Proof. Define the process Yn = Xn − εn; (2.36) implies that Yn∧τa is a submartingale. By (2.37), it holds a.s. that |Y(n+1)∧τa − Yn∧τa| ≤ K + ε, so, using the one-sided Azuma–Hoeffding inequality (2.26) applied to the submartingale Yn∧τa, together with the union bound, we obtain

P[τa < ∞ | F0] = P[there exists n ≥ 1 such that Yn∧τa − Y0 ≤ −a − εn | F0]

≤ ∑_{n=1}^{∞} exp(−(a + εn)² / (2n(ε + K)²))

≤ ∑_{n=1}^{∞} exp(−(2aεn + (εn)²) / (2n(ε + K)²))

= exp(−εa/(ε + K)²) ∑_{n=1}^{∞} exp(−ε²n/(2(ε + K)²)),

proving (2.38) with c1 = ∑_{n=1}^{∞} exp(−ε²n/(2(ε + K)²)) and c2 = ε/(ε + K)².

To prove (2.39), observe first that, by (2.37), on {n < τa},

Xn+1 − Xn ≤ (ε/2) 1{Xn+1 − Xn ∈ (0, ε/2)} + K 1{Xn+1 − Xn ≥ ε/2},

so, taking expectation conditional on Fn and using (2.36), we obtain

ε 1{τa > n} ≤ ε/2 + K P[Xn+1 − Xn ≥ ε/2 | Fn],

and thus

P[Xn+1 − Xn ≥ ε/2 | Fn] ≥ (ε/2K) 1{τa > n}. (2.40)

Now, choose a0 large enough so that c1 e^{−c2 a} < 1/2 for all a ≥ a0. Denote n0 = ⌈a0/(ε/2)⌉. The idea now is that, by (2.40), the process can initially make at least n0 steps to the right of amount at least ε/2 with uniformly positive probability (so that its distance from the starting point is at least a0), and then by (2.38) the process will not fall below its initial position with probability at least 1/2. To make this argument rigorous, observe that, by (2.40),

P[Xk − Xk−1 ≥ ε/2 for all k = 1, . . . , n | F0]
= E[P[Xk − Xk−1 ≥ ε/2 for all k = 1, . . . , n | Fn−1] | F0]
= E[1{Xk − Xk−1 ≥ ε/2 for all k = 1, . . . , n − 1} P[Xn − Xn−1 ≥ ε/2 | Fn−1] | F0]
≥ (ε/2K) P[Xk − Xk−1 ≥ ε/2 for all k = 1, . . . , n − 1 | F0],

and so, by induction,

P[Xk − Xk−1 ≥ ε/2 for all k = 1, . . . , n | F0] ≥ (ε/2K)^n. (2.41)

Abbreviating X̄k = Xk+n0 and τ̄a = min{k ≥ 1 : X̄k − X̄0 ≤ −a}, we have

P[τ0 = ∞ | F0]
= E[E[1{τ0 = ∞} | Fn0] | F0]
≥ E[E[1{τ0 = ∞, Xk − Xk−1 ≥ ε/2 for all k = 1, . . . , n0} | Fn0] | F0]
≥ E[1{Xk − Xk−1 ≥ ε/2 for all k = 1, . . . , n0} P[τ̄a0 = ∞ | Fn0] | F0]
≥ (1/2) P[Xk − Xk−1 ≥ ε/2 for all k = 1, . . . , n0 | F0]
≥ (1/2)(ε/2K)^{n0},

proving (2.39).

A consequence of the above result is the following sufficient condition for the transience of Markov chains.

Theorem 2.5.13. Let Xn be an irreducible Markov chain on a countable state space Σ. Suppose that there exist a function f : Σ → R, some h ∈ R for which the set A := {x ∈ Σ : f(x) ≤ h} is neither ∅ nor Σ, and constants ε > 0 and K ∈ R+ such that

E[f(Xn+1)− f(Xn) | Xn = x] ≥ ε, for all x ∈ Σ \A; and (2.42)

P[|f(Xn+1)− f(Xn)| ≤ K | Xn = x] = 1, for all x ∈ Σ \A. (2.43)

Then, the Markov chain is transient.

Proof. Choose any x0 ∈ Σ \ A, so that f(x0) > h, and take X0 = x0. Let a = f(x0) − h > 0, and consider the process Yn = f(Xn). Then

τa = min{n ≥ 0 : Yn − Y0 ≤ −a} = min{n ≥ 0 : Xn ∈ A},

so that an application of Theorem 2.5.12 to the process Yn shows that with positive probability Xn started from X0 = x0 never reaches A, i.e., the process is transient.

Example 2.5.14. Again, consider the one-dimensional SRW Sn, as in Example 2.1.7. Assume that p > 1/2; to prove transience using Theorem 2.5.13, take f(x) = x and h = 0. Then (2.42) holds with ε = 2p − 1 > 0 and (2.43) holds with K = 1. ◁

Heuristically, when something has drift in a fixed direction, it should go in that direction in the limit, and Theorem 2.5.13 is a convenient tool to make this intuition rigorous; however, the restriction on the size of the jumps cannot be completely removed, as the following example shows.


Example 2.5.15. Now, consider the random walk on the half-line Sn of Example 2.1.8, with, say, p = 1/3 (so that it is positive recurrent). It is straightforward to check, however, that (2.42) still holds with uniformly positive ε for f(x) = 3^x and h = 1:

E[f(Xn+1) − f(Xn) | Xn = x] = (2/3)·3^{x−1} + (1/3)·3^{x+1} − 3^x = 2·3^{x−2} ≥ 2/3, for all x ≥ 1. ◁
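A quick simulation makes the point concrete: the chain itself returns to 0 quickly even though the Lyapunov function f(Xn) = 3^Xn has positive drift. The Python sketch below (we assume, as a boundary convention, that the walk at 0 jumps to 1 with probability 1; the precise boundary rule is fixed in Example 2.1.8) estimates the mean return time to 0 for p = 1/3, which under this convention equals 1 + 1/(1 − 2p) = 4.

```python
import random

def mean_return_time(p, trials=20000, seed=2):
    """Mean return time to 0 for the half-line walk: from x >= 1 move +1
    w.p. p and -1 w.p. 1-p; from 0 move to 1 (an assumed boundary rule)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x, t = 1, 1              # the first step 0 -> 1 is already taken
        while x > 0:
            x += 1 if rng.random() < p else -1
            t += 1
        total += t
    return total / trials

avg_return = mean_return_time(1 / 3)   # expected value: 1 + 1/(1 - 2/3) = 4
```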

Example 2.5.15 shows that one cannot omit the condition (2.43) from Theorem 2.5.13. However, if one is interested only in transience, rather than exponential estimates such as those in Theorem 2.5.12, then the uniform bounds on the increments in Theorems 2.5.12 and 2.5.13 can be relaxed and replaced by a moments assumption, as in the following result. The idea of the proof is elementary, combining truncation and a Taylor’s formula computation, but important, and previews the methods of Chapter 3.

Theorem 2.5.16. Let Xn be a stochastic process on R+ adapted to a filtration Fn. Suppose that there exist constants ε > 0, δ > 0, B < ∞, and x0 < ∞ such that, a.s.,

E[Xn+1 − Xn | Fn] ≥ ε, on {Xn ≥ x0}; (2.44)

E[|Xn+1 − Xn|^{1+δ} | Fn] ≤ B, on {Xn ≥ x0}. (2.45)

Then there exist α > 0 and x1 ∈ R+ such that, with λx1 = min{n ≥ 0 : Xn ≤ x1},

P[λx1 < ∞ | F0] ≤ ((1 + X0)/(1 + x1))^{−α}, a.s.

Proof. It suffices to take δ ∈ (0, 1). Let α > 0, to be specified later. The idea is to show that, for a suitable choice of α, we can find x1 ∈ R+ for which

E[(1 + Xn+1)^{−α} − (1 + Xn)^{−α} | Fn] ≤ 0, on {Xn ≥ x1}, (2.46)

and then use Lemma 2.5.8. To ease notation, write ∆n = Xn+1 − Xn for the duration of this proof. Let γ ∈ (0, 1), to be specified later, and let An = {|∆n| ≤ (1 + Xn)^γ}. Then, by Taylor’s formula,

((1 + Xn+1)^{−α} − (1 + Xn)^{−α}) 1(An)
= (1 + Xn)^{−α} [(1 + ∆n/(1 + Xn))^{−α} − 1] 1(An)
= −α(1 + Xn)^{−1−α} ∆n 1(An) + ζn,

where there is a constant C < ∞ (depending on γ) for which the error term ζn satisfies

|ζn| ≤ C(1 + Xn)^{−α−2} |∆n|² 1(An), a.s.

Note that, a.s.,

|E[∆n 1(An) | Fn] − E[∆n | Fn]| ≤ E[|∆n| 1(An^c) | Fn]
≤ (1 + Xn)^{−γδ} E[|∆n|^{1+δ} | Fn]
≤ B(1 + Xn)^{−γδ},

on {Xn ≥ x0}, by (2.45). Moreover, a.s.,

E[|ζn| | Fn] ≤ C(1 + Xn)^{−α−2} E[|∆n|² 1(An) | Fn]
≤ C(1 + Xn)^{−2−α+γ(1−δ)} E[|∆n|^{1+δ} | Fn]
≤ BC(1 + Xn)^{−2−α+γ(1−δ)},

on {Xn ≥ x0}. So we obtain, for some constant C′ < ∞, on {Xn ≥ x0},

E[((1 + Xn+1)^{−α} − (1 + Xn)^{−α}) 1(An) | Fn]
≤ −αε(1 + Xn)^{−1−α} + C′(1 + Xn)^{−1−α−γδ} + C′(1 + Xn)^{−2−α+γ(1−δ)}.

The first term on the right-hand side of the last display dominates as Xn → ∞ provided γ(1 − δ) < 1, which is indeed the case. Hence we can find x2 with x0 ≤ x2 < ∞ for which

E[((1 + Xn+1)^{−α} − (1 + Xn)^{−α}) 1(An) | Fn] ≤ −(αε/2)(1 + Xn)^{−1−α},

on {Xn ≥ x2}. On the other hand,

E[|(1 + Xn+1)^{−α} − (1 + Xn)^{−α}| 1(An^c) | Fn] ≤ P[|∆n|^{1+δ} ≥ (1 + Xn)^{γ(1+δ)} | Fn] ≤ B(1 + Xn)^{−γ(1+δ)},

on {Xn ≥ x0}, by Markov’s inequality and (2.45). The result (2.46) now follows provided γ(1 + δ) > 1 + α, i.e., γ > (1+α)/(1+δ), a choice of γ ∈ (0, 1) that we can achieve provided we take α ∈ (0, δ).


Theorem 2.5.16 has the following consequence for Markov chains.

Theorem 2.5.17. Assume that, for an irreducible Markov chain Xn on a countable state space Σ, one can find a function f : Σ → R+ and a ∈ (0, ∞), such that the set A := {x ∈ Σ : f(x) < a} is non-empty and a proper subset of Σ, and for some ε > 0, δ > 0, and B ∈ R+, (2.42) holds, and

E[|f(Xn+1)− f(Xn)|1+δ | Xn = x] ≤ B, for all x ∈ Σ \A. (2.47)

Then, the Markov chain is transient.

Proof. The idea is to set Yn = f(Xn), Fn = σ(X0, . . . , Xn), and apply Theorem 2.5.16 to the process Yn. Then (2.44) and (2.45) hold for Yn with x0 = a. With x1 the constant arising from the conclusion of Theorem 2.5.16, set A′ = {x ∈ Σ : f(x) ≤ x1}; then Theorem 2.5.16 yields

Px[τA′ < ∞] ≤ ((1 + f(x))/(1 + x1))^{−α} < 1,

for all x ∈ Σ \ A′. Then Lemma 2.5.1(i) gives transience.

A Lyapunov function that has strictly positive drift outside some finite set yields transience provided it satisfies a bounded moments condition.

2.6 Expectations of hitting times and positive recurrence

We start this section with a technical lemma for Markov chains that relates the moments of hitting times of finite sets to those of individual states.

Lemma 2.6.1. Let Xn be an irreducible Markov chain on a countable state space Σ. Let α > 0 and A be a finite non-empty subset of Σ such that Ex (τ+A)^α < ∞ for all x ∈ A. Then Ex (τ+x)^α < ∞ for all x ∈ Σ.

Proof. Define σ0 := 0 and, for k ≥ 0,

σk+1 = min{n > σk : Xn ∈ A},

so that σk is the k-th passage time of A. Fix an arbitrary x ∈ A and assume that X0 = x; then the process Yn = Xσn is a (finite) Markov chain on A starting from Y0 = x. It is also not hard to see that irreducibility of Yn follows from irreducibility of Xn. Define

w = min{n ≥ 1 : Yn = x}

to be the return time to x for the process Yn.

Now, since the state space of Yn is finite, there exist constants K1 > 0 and δ > 0 such that

Px[w = k] ≤ K1 e^{−δk}. (2.48)

Let q(y, z) be the one-step transition probabilities from y ∈ A to z ∈ A for the Markov chain Yn, and set

r1 := min{q(y, z) : y, z ∈ A, q(y, z) > 0},

which is positive since A is finite. Set uk := σk+1 − σk. By assumption,

E[uk^α | Yk = y] < ∞, for all y ∈ A. (2.49)

Also, we have for all y, z ∈ A with q(y, z) > 0,

E[uk^α | Yk = y] ≥ E[uk^α 1{Yk+1 = z} | Yk = y] = q(y, z) E[uk^α | Yk = y, Yk+1 = z],

and hence, since r1 > 0,

E[uk^α | Yk = y, Yk+1 = z] ≤ (1/r1) E[uk^α | Yk = y] < ∞.

So, since A is finite, there exists r2 ∈ R+ such that

E[uk^α | Yk, Yk+1] ≤ r2, a.s.

Then, using the simple inequality

(a1 + · · · + an)^α ≤ n^α (a1^α + · · · + an^α), for all α > 0 and ai ≥ 0, (2.50)

we have for all k,

E[σk^α | w = k] = E[(u0 + · · · + uk−1)^α | w = k]
≤ k^α E[u0^α + · · · + uk−1^α | Y1 ≠ x, . . . , Yk−1 ≠ x, Yk = x]
≤ k^{α+1} r2.


Since Ex (τ+x)^α = Ex σw^α and we have, by (2.48),

Ex σw^α = ∑_{k=1}^{∞} Px[w = k] E[σk^α | w = k] ≤ ∑_{k=1}^{∞} K1 e^{−δk} r2 k^{α+1} < ∞,

we obtain that Ex (τ+x)^α < ∞ for all x ∈ A.

Now, fix any x0 ∈ A, consider an arbitrary y ∈ Σ, and define A′ := {x0, y}. Since the Markov chain is irreducible, there exist ε > 0 and k ≥ 1 such that

Px0[Xk = y, Xm ∉ A′ for all m = 1, . . . , k − 1] ≥ ε.

Then, we have

Ex0 (τ+x0)^α ≥ Ex0[(τ+x0)^α; Xk = y, Xm ∉ A′ for all m = 1, . . . , k − 1] ≥ ε Ey[(τx0)^α],

so we must have Ey (τx0)^α < ∞. But this means that Ez (τ+A′)^α ≤ Ez (τ+x0)^α < ∞ for all z ∈ A′. Repeating the above argument with A′ in the place of A, we obtain that Ey (τ+y)^α < ∞, and this concludes the proof.

Next, we state a couple of simple results, which are, however, very useful for proving positive recurrence of (Markov) processes.

Theorem 2.6.2. Let Xn be a real-valued integrable Fn-adapted process, and let τ be a stopping time with respect to Fn; suppose also that Xn∧τ ≥ 0 a.s. for all n. Assume that for some ε > 0 and all n ≥ 0, on {τ > n},

E[Xn+1 −Xn | Fn] ≤ −ε, a.s. (2.51)

Then it holds that

E τ ≤ EX0 / ε < ∞. (2.52)

Proof. The result is a version of Theorem 2.4.8, but we recapitulate the (simple) proof. We may rewrite the condition (2.51) as

E[X(m+1)∧τ − Xm∧τ | Fm] ≤ −ε 1{τ > m}, a.s., for any m ≥ 0.

We take expectations

E[X(m+1)∧τ ]− E[Xm∧τ ] ≤ −εP[τ > m],


and then sum the above over m from 0 to n to obtain that

0 ≤ EX(n+1)∧τ ≤ −ε ∑_{m=0}^{n} P[τ > m] + EX0.

So, passing to the limit, we have

E τ = limn→∞ ∑_{m=0}^{n} P[τ > m] ≤ EX0 / ε < ∞,

which yields (2.52).

Remark 2.6.3. Under the conditions of Theorem 2.6.2, Dynkin’s formula (2.9) (with f the identity) gives E[n ∧ τ] ≤ ε^{−1} EX0, which first shows that P[τ = ∞] = 0 and then, with Fatou’s lemma, provides an alternative proof of the theorem.

A consequence of Theorem 2.6.2 is the following positive recurrence criterion for Markov chains.

Theorem 2.6.4 (Foster’s criterion). An irreducible Markov chain Xn on a countable state space Σ is positive recurrent if and only if there exist a positive function f : Σ → R+, a finite non-empty set A ⊂ Σ, and ε > 0 such that

E[f(Xn+1)− f(Xn) | Xn = x] ≤ −ε, for all x ∈ Σ \A, (2.53)

E[f(Xn+1) | Xn = x] <∞, for all x ∈ A. (2.54)

Proof. Let p(x, y) be the transition probabilities of the Markov chain. For y ∈ Σ define the stopping times

τy = min{n ≥ 0 : Xn = y},

and for B ⊂ Σ let

τB = min{n ≥ 0 : Xn ∈ B} = miny∈B τy.

To prove the “only if” part, it is enough to set A = {x0} (where x0 ∈ Σ is any fixed state), and then notice that for x ≠ x0,

Ex τx0 = ∑_{y∈Σ} p(x, y) Ey[1 + τx0],

and that

∑_{y∈Σ} p(x0, y) Ey[1 + τx0] = Ex0 τ+x0 < ∞,

so the function f(x) = Ex τx0 satisfies (2.53)–(2.54) with ε = 1.

To prove the “if” part, take X0 = x ∈ Σ. Then f(X0) is integrable, and f(Xn) is integrable by an induction based on (2.53) and (2.54). Now Theorem 2.6.2 applied to the non-negative, integrable process f(Xn) with the stopping time τA implies that

Ex τA ≤ f(x)/ε, for all x ∉ A. (2.55)

So, for any y ∈ A, by (2.54) and (2.55),

Ey τ+A = ∑_{z∈A} p(y, z) + ∑_{z∉A} p(y, z) Ez[1 + τA]
= 1 + ∑_{z∉A} p(y, z) Ez τA
≤ 1 + ε^{−1} ∑_{z∉A} p(y, z) f(z) < ∞.

Then the α = 1 case of Lemma 2.6.1 implies that the Markov chain Xn is positive recurrent, and this concludes the proof of Theorem 2.6.4.

Of course, a successful application of Theorem 2.6.4 for proving the positive recurrence of a Markov chain depends on one’s ability to find a Lyapunov function f that works. As observed in the above proof, such a function always exists and is given by the expected time to reach some fixed state. While it is not always possible to obtain an exact expression for this expected time, one may try to guess an approximation for it, and check if it works as a Lyapunov function in Theorem 2.6.4:

When looking for the Lyapunov function that proves positive recurrence, fix a state (or a group of states), and set f(x) to be proportional to your guess for the expected time (starting from x) to hit that fixed state.

Example 2.6.5. Consider Sn, the nearest-neighbour random walk on the half-line from Example 2.1.8, with p < 1/2. Intuitively speaking, the particle just drifts towards the origin with constant speed, so the time of reaching the origin from x should be proportional to x (i.e., the distance to the origin). So, consistently with the above heuristic rule, one can set A = {0} and f(x) = x; then (2.53) holds with ε = 1 − 2p > 0, thus proving the positive recurrence of Sn with p < 1/2. ◁

Theorem 2.6.2 has many uses beyond the case where τ is a hitting time of a finite set, as the next two examples illustrate.
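Before those, a quick numerical check of Example 2.6.5 (a Python sketch; p = 0.3 and the starting point x = 10 are our choices, not from the text). The bound (2.55) reads Ex τ0 ≤ x/(1 − 2p); for this nearest-neighbour walk the bound in fact holds with equality, so the simulated mean should sit right at f(x)/ε.

```python
import random

def mean_hitting_time(p, start, trials=20000, seed=3):
    """Mean time for the walk (up w.p. p, down w.p. 1-p) to hit 0 from `start`."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x, t = start, 0
        while x > 0:
            x += 1 if rng.random() < p else -1
            t += 1
        total += t
    return total / trials

p, x0 = 0.3, 10
avg_tau = mean_hitting_time(p, x0)
foster_bound = x0 / (1 - 2 * p)   # f(x0)/eps = 10/0.4 = 25
```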

Example 2.6.6. Let (ξn, n ≥ 0) be a Markov chain on Σ ⊆ R²+; denote its increments by θn := ξn+1 − ξn, as in Section 1.4. Write ξn = (ξn^(1), ξn^(2)) in Cartesian coordinates. Let τ = min{n ≥ 0 : ξn^(1) ξn^(2) = 0}, the first hitting time of the boundary of the quarter-plane, Σ0 = {(x, y) ∈ R²+ : xy = 0}. Suppose that for some B ∈ R+,

E[θn | ξn = x] = 0, for all x ∈ Σ \ Σ0;
E[‖θn‖² | ξn = x] ≤ B, for all x ∈ Σ \ Σ0.

Suppose also that the components of the increments have a constant covariance, i.e.,

E[(ξn+1^(1) − ξn^(1))(ξn+1^(2) − ξn^(2)) | ξn = x] = ρ, for all x ∈ Σ \ Σ0,

for a constant ρ; with the notation at (1.9), this means that the off-diagonal element of M(x) is equal to ρ. Then, for x ∈ Σ \ Σ0,

E[ξn+1^(1) ξn+1^(2) − ξn^(1) ξn^(2) | ξn = x] = ρ. (2.56)

Hence if ρ < 0 we may apply Theorem 2.6.2 with Xn = ξn^(1) ξn^(2) to deduce that E τ < ∞.

Remarkably, one may also explicitly compute E τ when it exists. Suppose that E τ < ∞. For each k ∈ {1, 2}, the process ξn∧τ^(k) is a non-negative martingale adapted to Fn = σ(ξ0, . . . , ξn), and converges to ξτ^(k). The associated quadratic variation process satisfies 〈ξ^(k)〉n∧τ ≤ B(n ∧ τ), so E[(ξn∧τ^(k))²] ≤ B E τ < ∞. Thus the martingales ξn∧τ^(k) are uniformly bounded in L², and hence converge in L² (see e.g. Theorem 5.4.5 of [72]). Hence ξn∧τ^(1) ξn∧τ^(2) converges in L¹, so

limn→∞ E[ξn∧τ^(1) ξn∧τ^(2)] = E[ξτ^(1) ξτ^(2)] = 0. (2.57)

Moreover, it follows from (2.56) that ξn^(1) ξn^(2) − ρn is a martingale, so

ξ0^(1) ξ0^(2) = E[ξn∧τ^(1) ξn∧τ^(2)] − ρ E[n ∧ τ].


Re-arranging, taking n → ∞ and using (2.57) and the monotone convergence theorem, we obtain

E τ = limn→∞ E[n ∧ τ] = ξ0^(1) ξ0^(2) / |ρ|, when ρ < 0.

A similar argument shows that the result E τ < ∞ when ρ < 0 is sharp: we show that if ρ ≥ 0 and ξ0 ∉ Σ0, then E τ = ∞. For the purposes of deriving a contradiction, suppose that E τ < ∞. Since ρ ≥ 0, (2.56) shows that ξn∧τ^(1) ξn∧τ^(2) is a submartingale, which converges in L¹ by (2.57). Hence 0 = E[ξτ^(1) ξτ^(2)] ≥ E[ξ0^(1) ξ0^(2)] > 0, since ξ0 ∉ Σ0, which is a contradiction. So E τ = ∞. ◁
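The identity E τ = ξ0^(1) ξ0^(2)/|ρ| is exact and easy to test by simulation. The Python sketch below uses one concrete increment law (our choice, not from the text): both coordinates move by ±1 at every step, with equal signs having total probability 0.1 and opposite signs probability 0.9, so each coordinate has zero drift and ρ = 0.1 − 0.9 = −0.8.

```python
import random

def mean_exit_time(start, q_same=0.05, trials=20000, seed=4):
    """Quarter-plane walk: at each step both coordinates move by +-1.
    Signs agree w.p. 2*q_same (split evenly between ++ and --), so each
    coordinate is a fair +-1 walk and the covariance is rho = 4*q_same - 1.
    Returns the average time until one of the coordinates hits 0."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        (a, b), t = start, 0
        while a > 0 and b > 0:
            u = rng.random()
            if u < q_same:                 # step (+1, +1)
                a += 1; b += 1
            elif u < 2 * q_same:           # step (-1, -1)
                a -= 1; b -= 1
            elif u < q_same + 0.5:         # step (+1, -1)
                a += 1; b -= 1
            else:                          # step (-1, +1)
                a -= 1; b += 1
            t += 1
        total += t
    return total / trials

avg_exit = mean_exit_time((5, 5))   # theory: 5*5/0.8 = 31.25 for rho = -0.8
```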

Example 2.6.7. Let Xn be a square-integrable martingale with X0 = 0, such that |Xn+1 − Xn| ≤ K, a.s., and E[(Xn+1 − Xn)² | Fn] = v², a.s., for some K, v ∈ (0, ∞). For a ∈ R+, define

κa := min{n ≥ 0 : |Xn| > a n^{1/2}}, (2.58)

the exit time from a square-root boundary. In this example we prove that

E κa < ∞ if and only if a < v. (2.59)

To obtain the ‘only if’ half of (2.59), it suffices to show that E κv = ∞. To this end, let Xn be a square-integrable martingale with E[(Xn+1 − Xn)² | Fn] = v² (for this part of the argument the uniform bound on the jumps is unnecessary).

Suppose, for the purpose of deriving a contradiction, that E κv < ∞. We make use of the discussion leading up to Lemma 2.3.18. In this case, 〈X〉n = nv², and 〈X〉κv = κv v². Then, from (2.10), E[X²κv] ≤ v² E κv < ∞. It is straightforward to show that a sufficient condition for X²n∧κv to be uniformly integrable is that

E[X²κv] < ∞, and X²n 1{κv > n} is uniformly integrable.

Moreover, by definition of κv,

X²n 1{κv > n} ≤ v²n 1{κv > n} ≤ v² κv,

which is integrable under the assumption E κv < ∞. So X²n∧κv is uniformly integrable, and we may apply Lemma 2.3.18 to conclude

E[X²κv] = E[〈X〉κv] = v² E κv.


But, by definition of κv, X²κv − v²κv > 0; this random variable then has expectation 0, and so must be 0 a.s., providing the desired contradiction. Thus E κv = ∞.

For the ‘if’ half of (2.59), we will actually prove the stronger result that E κa < ∞ for any a < v, but where we replace (2.58) by

κa := min{n ≥ 0 : |Xn| > a(n + c)^{1/2}}, for some c ∈ R+.

We permit c > 0 to rule out trivial cases where P[|X1| > a] = 1; but note that E[|X1|²] = v², so P[|X1| > v] < 1, and there can be no trivial contradiction to E κv = ∞.

Suppose a ∈ (0, v). Take b ∈ (a, v) and c′ > c. Define Yn = b²(n + c′) − X²n. Then, using the squared-difference identity (2.23),

Yn+1 − Yn = b² − 2Xn(Xn+1 − Xn) − (Xn+1 − Xn)²,

and hence, by our hypotheses on Xn, E[Yn+1 − Yn | Fn] = b² − v², which is uniformly (strictly) negative since b < v. Moreover, a.s.,

Yn∧κa ≥ b²(n ∧ κa + c′) − (a(n ∧ κa + c)^{1/2} + K)² ≥ 0, for all n ≥ 0,

provided c′ > c is chosen sufficiently large (depending on a, b, c and K). Then we can apply Theorem 2.6.2 to see that, if a ∈ (0, v), E κa < ∞. ◁
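A simulation is consistent with (2.59). For the symmetric ±1 walk (so v = K = 1), the Python sketch below (the shift c = 5, the truncation cap, and the two test values of a are our choices, not from the text) estimates E κa for a = 0.5 and a = 0.8: both come out finite and well below the cap, with the mean growing as a approaches v = 1.

```python
import random

def mean_kappa(a, c=5, trials=4000, cap=100_000, seed=5):
    """Estimate E[kappa_a] for the symmetric +-1 walk, where kappa_a is the
    first n with |X_n| > a*sqrt(n + c).  Paths still inside the boundary
    after `cap` steps are truncated, biasing the estimate downwards."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x = 0
        for n in range(1, cap + 1):
            x += 1 if rng.random() < 0.5 else -1
            if x * x > a * a * (n + c):
                total += n
                break
        else:
            total += cap
    return total / trials

avg_a_small = mean_kappa(0.5)
avg_a_large = mean_kappa(0.8)   # stochastically larger: the boundary is wider
```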

Foster’s original argument in [95] for his version of Theorem 2.6.4 is rather different from the proof given above; instead it is more similar in spirit to the following result, which gives a lower bound on expected occupation times. Note that the set A in this formulation need not be finite.

Lemma 2.6.8. Let Xn be an Fn-adapted process on state space (Σ, E). Suppose that there exist f : Σ → R+, a non-empty A ∈ E, and ε > 0 such that f(X0) is integrable, and

E[f(Xn+1) − f(Xn) | Fn] ≤ −ε, on {Xn ∈ Σ \ A}.

Suppose also that for some constant K <∞,

E[f(Xn+1) − f(Xn) | Fn] ≤ K, on {Xn ∈ A}.

Then it holds that

lim infn→∞ (1/n) E ∑_{m=0}^{n−1} 1{Xm ∈ A} ≥ ε/(ε + K).


Proof. Combining the conditions in the lemma, we have

E[f(Xn+1) − f(Xn) | Fn] ≤ −ε + (ε + K) 1{Xn ∈ A}.

In particular, it follows that f(Xn) is integrable for all n, given that f(X0) is integrable, and we obtain

E f(Xn+1) − E f(Xn) ≤ −ε + (ε + K) P[Xn ∈ A],

and then summing yields

E f(Xn) − E f(X0) ≤ −εn + (ε + K) ∑_{m=0}^{n−1} P[Xm ∈ A].

Re-arranging and using the non-negativity of f, we obtain

((ε + K)/n) ∑_{m=0}^{n−1} P[Xm ∈ A] ≥ ε − (1/n) E f(X0).

Taking n → ∞ yields the result.

Example 2.6.9. Suppose that Xn is an irreducible Markov chain on a countable state space Σ, and A ⊂ Σ is finite. It is a classical result that, for π a sub-probability measure on Σ,

limn→∞ (1/n) E ∑_{m=0}^{n−1} 1{Xm ∈ A} = π(A);

Xn is positive recurrent if and only if π(x) > 0 for some (hence all) x ∈ Σ, in which case π is a proper probability measure. Thus in this case the conclusion of Lemma 2.6.8 shows that π(A) > 0 and hence π(x) > 0 for some x ∈ A, which yields positive recurrence. So the “if” half of Theorem 2.6.4 can be deduced from Lemma 2.6.8. ◁

We end this section with a partial result in the opposite direction.

Theorem 2.6.10. Let Xn be an Fn-adapted process taking values in R+, and let τ be a stopping time. Suppose that there exists B ∈ R+ such that

E[Xn+1 − Xn | Fn] ≥ 0, on {n < τ};
E[(Xn+1 − Xn)+ | Fn] ≤ B, on {n < τ}.

Suppose that there is some c ≥ 0 such that P[Xτ ≤ c | τ < ∞] = 1 and EX0 > c. Then E τ = ∞.


Proof. Let Yn = Xn∧τ. Then, by hypothesis,

E[(Yn+1 − Yn)+ | Fn] ≤ B 1{n < τ}. (2.60)

Suppose, for the purpose of deriving a contradiction, that E τ < ∞; so P[τ < ∞] = 1. Now, taking expectations in (2.60) and summing from m = 0 to n − 1 we obtain

E ∑_{m=0}^{n−1} (Ym+1 − Ym)+ ≤ B ∑_{m=0}^{n−1} P[τ > m] ≤ B E τ.

Moreover, since Y0 = X0,

0 ≤ Yn = Y0 + ∑_{m=0}^{n−1} (Ym+1 − Ym) ≤ X0 + ∑_{m=0}^{∞} (Ym+1 − Ym)+ =: Z.

Here Z ≥ 0 is an integrable random variable, since, by Fatou’s lemma,

EZ ≤ EX0 + lim infn→∞ E ∑_{m=0}^{n−1} (Ym+1 − Ym)+ ≤ EX0 + B E τ < ∞.

Hence |Xn∧τ| is dominated by an integrable random variable, so the submartingale Xn∧τ is uniformly integrable. Then by optional stopping (Theorem 2.3.7),

EXτ ≥ EX0 > c,

by hypothesis. But, since P[τ < ∞] = 1, Xτ ≤ c, a.s., by hypothesis, giving the desired contradiction. So E τ = ∞.

This leads to a sufficient condition for a Markov chain to be null, i.e., null recurrent or transient.

Corollary 2.6.11. Let Xn be an irreducible Markov chain on a countable state space Σ. Suppose that there exist a non-empty A ⊂ Σ, a constant B ∈ R+, and a function f : Σ → R+ such that f(Xn) is integrable, and

E[f(Xn+1) − f(Xn) | Xn = x] ≥ 0, for all x ∈ Σ \ A;
E[(f(Xn+1) − f(Xn))+ | Xn = x] ≤ B, for all x ∈ Σ \ A;
f(y) > maxx∈A f(x), for some y ∈ Σ \ A.

Then Ex τ+x = ∞ for any x ∈ Σ, i.e., Xn is not positive recurrent.


Proof. Take X0 = y and let τ = τA. Then f(Xn) satisfies the hypotheses of Theorem 2.6.10 (with c = maxx∈A f(x)), which implies that Ey τA = ∞.

But τA = minx∈A τx, so that Ey τx = ∞ for any x ∈ A. Take one such x ∈ A. Then Px[τy < τ+x] > 0, by irreducibility. Hence, by the strong Markov property,

Ex τ+x ≥ Px[τy < τ+x] Ey τx = ∞.

But then we must have Ex τ+x = ∞ for all x ∈ Σ, since (by Proposition 2.1.5) the property Ey τ+y < ∞ is a class property.

Example 2.6.12. Let Sn be symmetric simple random walk on Z. The hypotheses of Corollary 2.6.11 are satisfied with Xn = Sn, f(x) = |x|, A = {0}, B = 1/2, and any y ≥ 1, so we conclude E1 τ0 = ∞. Hence

E0 τ+0 ≥ 1 + (1/2) E1 τ0 = ∞. ◁
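Null recurrence here is visible in the tails: P1[τ0 > n] ~ (2/(πn))^{1/2}, which is not summable enough to give a finite mean. The Python sketch below (truncating excursions at 400 steps; the checkpoints n = 100 and n = 400 are our choices) verifies the square-root decay by comparing survival frequencies, whose ratio should be near (400/100)^{1/2} = 2.

```python
import random

def survival_frequencies(trials=40000, seed=6):
    """For symmetric SRW started at 1, estimate P[tau_0 > 100] and P[tau_0 > 400]."""
    rng = random.Random(seed)
    beyond_100 = beyond_400 = 0
    for _ in range(trials):
        x, t = 1, 0
        while x > 0 and t < 400:
            x += 1 if rng.random() < 0.5 else -1
            t += 1
        if x > 0:                 # still away from 0 after 400 steps
            beyond_400 += 1
        if x > 0 or t > 100:      # either still alive at 400, or hit 0 after step 100
            beyond_100 += 1
    return beyond_100 / trials, beyond_400 / trials

p100, p400 = survival_frequencies()
ratio = p100 / p400   # should be close to sqrt(400/100) = 2
```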

2.7 Moments of hitting times

If a Markov chain is recurrent, it is of interest to quantify recurrence beyond the distinction between null and positive recurrent by estimating tails of the return time or determining which moments of return times exist. The question of existence and non-existence of moments of hitting times is also of interest beyond the Markov setting. This section gives some results in this direction.

Recall that λx = min{n ≥ 0 : Xn ≤ x}. The first result provides a sufficient condition for existence of moments for λx, which, by Markov’s inequality, yield upper tail bounds on λx.

Theorem 2.7.1. Let Xn be an Fn-adapted stochastic process, taking values in an unbounded subset of R+. Suppose that there exist δ > 0, x > 0, and s0 > 0 such that for any n ≥ 0, Xn^{2s0} is integrable and

E[Xn+1^{2s0} − Xn^{2s0} | Fn] ≤ −δ Xn^{2s0−2}, on {n < λx}. (2.61)

For any s ∈ [0, s0), there exists a positive constant c = c(δ, s, s0) such that

E[λx^s | F0] ≤ c X0^{2s0}.

Moreover, in the case when s0 ≥ 1, for any s ∈ [0, s0], there exists a positive constant c′ = c′(δ, s, s0) such that

E[λx^s | F0] ≤ c′ X0^{2s}.


Remark 2.7.2. It follows from (2.61) that Xn ≥ δ^{1/2} on {n < λx}. So implicit in the hypotheses of Theorem 2.7.1 is that x ≥ δ^{1/2} is not too small.

Before proving Theorem 2.7.1, we state a useful corollary, which can be seen as a direct generalization of Theorem 2.6.2 (the case γ = 0).

Corollary 2.7.3. Let Xn be an integrable Fn-adapted stochastic process, taking values in an unbounded subset of R+, with X0 = x0 fixed. Suppose that there exist δ > 0, x > 0, and γ < 1 such that for any n ≥ 0,

E[Xn+1 − Xn | Fn] ≤ −δ Xn^γ, on {n < λx}. (2.62)

Then, for any p ∈ [0, 1/(1 − γ)), E[λx^p] < ∞.

Proof. Let s0 = 1/(1 − γ) > 0, and Yn = X(1−γ)/2n . Then Xn = Y 2s0

n andXγn = Y 2s0−2

n , so (2.62) implies that (2.61) holds for Yn, and Theorem 2.7.1yields the result.

Proof of Theorem 2.7.1. It suffices to suppose that X0 > x. We treat separately the two cases: s0 ≥ 1 and s0 < 1.

Case 1. s0 ≥ 1. For s ∈ (0, s0], define Un^{(s)} for n ≥ 0 by

Un^{(s)} = (X²n∧λx + (δ/s0)(n ∧ λx))^s.

We show that Un^{(s0)} is a supermartingale. By (2.61), on {n < λx},

E[Xn+1^{2s0} | Fn] ≤ Xn^{2s0} − δXn^{2s0−2} = Xn^{2s0} (1 − δXn^{−2}).

The last quantity is non-positive if Xn ≤ (δ/s0)^{1/2}, while if Xn > (δ/s0)^{1/2} we may apply the simple inequality

1 − sy ≤ (1 − y)^s, for s ≥ 1 and 0 < y < 1,

with y = (δ/s0)Xn^{−2} ∈ (0, 1) and s = s0 to see that

1 − δXn^{−2} ≤ (1 − δXn^{−2}) 1{Xn > (δ/s0)^{1/2}} ≤ (1 − (δ/s0)Xn^{−2})^{s0},

on {n < λx}. It follows that

E[Xn+1^{2s0} | Fn] ≤ (X²n − δ/s0)^{s0}, on {n < λx}. (2.63)


Now, on {n < λx}, we apply the L^{s0} version of Minkowski’s inequality for conditional expectations to obtain

E[Un+1^{(s0)} | Fn] = E[(X²n+1 + (δ/s0)(n + 1))^{s0} | Fn]
≤ (E[Xn+1^{2s0} | Fn]^{1/s0} + (δ/s0)(n + 1))^{s0}, on {n < λx}.

Then using (2.63), we obtain, on {λx > n},

E[Un+1^{(s0)} | Fn] ≤ (X²n + (δ/s0)n)^{s0} = Un^{(s0)}.

Hence Un^{(s0)} is a non-negative supermartingale, and it follows from Theorem 2.3.3 that for s ∈ (0, s0], the process Un^{(s)} is also a non-negative supermartingale. Hence, by optional stopping (Theorem 2.3.11),

(δ/s0)^s E[(n ∧ λx)^s | F0] ≤ E[Un^{(s)} | F0] ≤ E[U0^{(s)} | F0] = X0^{2s},

and it follows from the (conditional) Fatou lemma that

E[λx^s | F0] ≤ (s0/δ)^s X0^{2s}, for any s ∈ (0, s0].

Case 2. s0 < 1. We fix some s ∈ (0, s0), and define, for n ≥ 0,

Vn^{(s)} = Xn∧λx^{2s0} + (δ/s0)(n ∧ λx)^s.

First we observe, on {λx > n},

E[Xn+1^{2s0} − Xn^{2s0} | Fn] = E[(Xn+1^{2s0} − Xn^{2s0}) 1{Xn ∈ (x, n^{1/2})} | Fn] + E[(Xn+1^{2s0} − Xn^{2s0}) 1{Xn ≥ n^{1/2}} | Fn];

note that for small n the event {Xn ∈ (x, n^{1/2})} may be empty. A simple consequence of (2.61) is

E[Xn+1^{2s0} − Xn^{2s0} | Fn] ≤ 0, on {n < λx}. (2.64)

So applying (2.64) and (2.61) we have, on {n < λx},

E[Xn+1^{2s0} − Xn^{2s0} | Fn] ≤ −δ Xn^{2s0−2} 1{Xn ∈ (x, n^{1/2})}.


Hence, on {n < λx},

E[Vn+1^{(s)} − Vn^{(s)} | Fn] ≤ −δXn^{2s0−2} 1{Xn ∈ (x, n^{1/2})} + (δ/s0)((n + 1)^s − n^s).

So, using the simple inequality

(n + 1)^s − n^s ≤ s n^{s−1}, for all n ≥ 1,

we obtain, for s < s0, for n ≥ 1,

E[Vn+1^{(s)} − Vn^{(s)} | Fn] ≤ δ(n^{s−1} − Xn^{2s0−2} 1{Xn ∈ (x, n^{1/2})}) 1{n < λx}
≤ δ(n^{s−1} − Xn∧λx^{2s0−2}) 1{Xn∧λx ∈ (x, n^{1/2})} + δ n^{s−1} 1{Xn∧λx ≥ n^{1/2}}
≤ δ(n^{s−1} − n^{s0−1}) + δ n^{s−1} 1{Xn∧λx ≥ n^{1/2}}
≤ δ n^{s−1} 1{Xn∧λx ≥ n^{1/2}}.

Now taking expectations conditioned on F0 and using Markov’s inequality, we have, for n ≥ 1,

E[Vn+1^{(s)} − Vn^{(s)} | F0] ≤ δ n^{s−s0−1} E[Xn∧λx^{2s0} | F0] ≤ δ n^{s−s0−1} X0^{2s0},

as follows from (2.64). Thus, for n ≥ 1,

E[Vn^{(s)} | F0] ≤ E[V1^{(s)} | F0] + δ X0^{2s0} ∑_{k=1}^{∞} k^{s−s0−1} ≤ E[V1^{(s)} | F0] + c1 X0^{2s0},

for c1 = c1(δ, s, s0) < ∞. Here, by (2.61),

E[V1^{(s)} | F0] = E[X1^{2s0} | F0] + δ/s0 ≤ X0^{2s0} (1 − δX0^{−2} + (δ/s0)X0^{−2s0}).

Since X0 > x ≥ δ^{1/2} (see Remark 2.7.2), the term in the last displayed bracket is bounded above by some c2 = c2(δ, s0) < ∞, so finally we obtain, for some c′ = c′(δ, s, s0) < ∞,

E[Vn^{(s)} | F0] ≤ c′ X0^{2s0}, on {X0 > x}, for all n ≥ 0.

The statement now follows similarly to the case s0 ≥ 1.


Now we turn to the opposite problem. The first result states sufficient conditions for non-existence of moments of hitting times λx = min{n ≥ 0 : Xn ≤ x}.

Theorem 2.7.4. Let Xn be an Fn-adapted stochastic process taking values in an unbounded subset of R+. Suppose that there exist constants x ∈ (0, ∞), B ∈ (0, ∞) and c ∈ R such that, for any n ≥ 0,

E[(Xn+1 − Xn)² | Fn] ≤ B, on {Xn ≥ x}; (2.65)

E[Xn+1 − Xn | Fn] ≥ −c/Xn, on {Xn ≥ x}. (2.66)

Suppose in addition that for some s0 > 0, the process Xn∧λx^{2s0} is a submartingale. Then, for any s > s0, we have E[λx^s | X0 = x0] = ∞ for any x0 > x.

A key role in the proof of this theorem is played by the following estimate, which gives conditions to ensure that a process is unlikely to return rapidly to a neighbourhood of the origin. The proof is based on an application of the maximal inequality from Theorem 2.4.8.

Lemma 2.7.5. Let X_n be an F_n-adapted stochastic process taking values in R_+. Let δ ∈ (0, 1). Suppose that there exist finite positive constants x, B, and c such that, for any n ≥ 0, (2.65) and (2.66) hold. Then there exists ε > 0 (depending only on B, c, and δ) such that, for any n,

P[ min_{n≤m≤n+εX_n^2} X_m ≥ (1/2) X_n | F_n ] ≥ 1 − δ, on {X_n > 2x}. (2.67)

In particular, on {X_{n∧λ_x} > 2x},

P[λ_x > n + εX_n^2 | F_n] ≥ 1 − δ. (2.68)

Proof. To ease notation, we prove (2.67) and (2.68) for n = 0; the argument in the general case is the same. Hence suppose X_0 > 2x. Write λ = λ_{X_0/2}, and note that λ_x ≥ λ provided X_0 > 2x.

Write Δ_k = X_{k+1} − X_k for the duration of this proof. Define the function w(z) := (X_0 − z)^2 1{z < X_0} for z ∈ R_+, and let W_k = w(X_k). Since w is non-increasing, on {X_k ≥ X_0}, W_{k+1} = w(X_k + Δ_k) ≤ w(X_0 + Δ_k) ≤ Δ_k^2. Hence, on {X_k ≥ X_0},

E[W_{k+1} − W_k | F_k] ≤ E[Δ_k^2 | F_k] ≤ B, a.s.,

by (2.65). On the other hand, on {X_k < X_0},

W_{k+1} − W_k ≤ (W_{k+1} − W_k) 1{X_{k+1} < X_0}


= (−2(X_0 − X_k)Δ_k + Δ_k^2) 1{Δ_k < X_0 − X_k}.

Here we have that

−2(X_0 − X_k)Δ_k 1{Δ_k < X_0 − X_k}
    = −2(X_0 − X_k)Δ_k + 2(X_0 − X_k)Δ_k 1{Δ_k ≥ X_0 − X_k}
    ≤ −2(X_0 − X_k)Δ_k + 2Δ_k^2 1{Δ_k ≥ X_0 − X_k}.

Combining these bounds we get, on {X_k < X_0},

W_{k+1} − W_k ≤ −2(X_0 − X_k)Δ_k + 2Δ_k^2.

Taking expectations we see that, on {X_0/2 ≤ X_k < X_0}, a.s.,

E[W_{k+1} − W_k | F_k] ≤ −2(X_0 − X_k) E[Δ_k | F_k] + 2 E[Δ_k^2 | F_k]
    ≤ 2c(X_0 − X_k)/X_k + 2B.

In particular, there exists C < ∞, depending only on B and c, such that, on {X_k > X_0/2}, E[W_{k+1} − W_k | F_k] ≤ C, a.s. Hence we conclude that, for any k ≥ 0, E[W_{(k+1)∧λ} − W_{k∧λ} | F_k] ≤ C, a.s. We apply the maximal inequality (2.12) from Theorem 2.4.8 to the process W_{k∧λ} at the deterministic stopping time ν = m to obtain

P[ max_{0≤k≤m} W_{k∧λ} ≥ y^2 | F_0 ] ≤ (Cm + W_0)/y^2 = Cm/y^2, a.s.,

on {X_0 > 2x}. With y^2 = X_0^2/4 and m = εX_0^2 we see that

P[ max_{0≤k≤εX_0^2} W_{k∧λ} < X_0^2/4 | F_0 ] ≥ 1 − 4Cε,

on {X_0 > 2x}. Now, W_k < y^2 implies X_k > X_0 − y, while W_λ = (X_0 − X_λ)^2 ≥ X_0^2/4, so the event in the last display implies min_{0≤k≤εX_0^2} X_k > X_0/2. Thus we obtain (2.67), on taking ε < δ/(4C), and this in turn implies (2.68).

Proof of Theorem 2.7.4. Let X_0 = x_0 > x. It suffices to suppose that λ_x < ∞ a.s.; otherwise E[λ_x^s] = ∞ is trivial. We proceed by contradiction. Suppose that there exists some s > s_0 such that E[λ_x^s] < ∞. Lemma 2.7.5 with δ = 1/2 yields the existence of a positive ε such that, for all n ≥ 0,

E[λ_x^s] ≥ E[λ_x^s 1{X_{n∧λ_x} > 2x}]
    ≥ (ε^s/2) E[X_{n∧λ_x}^{2s} 1{X_{n∧λ_x} > 2x}]
    ≥ (ε^s/2) E[X_{n∧λ_x}^{2s}] − (ε^s/2)(2x)^{2s}.

Under the hypothesis E[λ_x^s] < ∞, the last inequality shows that there exists K < ∞ such that E[X_{n∧λ_x}^{2s}] ≤ K for all n. Hence X_{n∧λ_x}^{2s_0} is uniformly integrable, and

lim_{n→∞} E[X_{n∧λ_x}^{2s_0}] = E[X_{λ_x}^{2s_0}] ≤ x^{2s_0},

by definition of λ_x. On the other hand, since X_{n∧λ_x}^{2s_0} is a submartingale, we have E[X_{n∧λ_x}^{2s_0}] ≥ E[X_0^{2s_0}] = x_0^{2s_0}. Since x_0 > x, this gives the desired contradiction.

Example 2.7.6. Let S_n be one-dimensional SRW. Take a positive α < 1/2, and fix c ∈ (0, α(1 − 2α)). We then have

E[X_{n+1}^{2α} − X_n^{2α} | X_n = x] = (1/2) x^{2α} ((1 + 1/x)^{2α} + (1 − 1/x)^{2α} − 2) ≤ −c x^{2α−2},

for all large enough x, so Theorem 2.7.1 implies that E_0[(τ_0^+)^α] < ∞. On the other hand, in order to apply Theorem 2.7.4, take X_n = |S_n|. The conditions (2.65) and (2.66) are easily verified, and X_n is itself a submartingale, so we may apply Theorem 2.7.4 with s_0 = 1/2 to deduce that E[(τ_0^+)^β] = ∞ for all β > 1/2. △
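The dichotomy in Example 2.7.6 is easy to see numerically. The following Monte Carlo sketch (our own illustration, not part of the text) estimates the tail P[τ_0^+ > n], which for SRW is of order n^{−1/2}; quadrupling n should therefore roughly halve the tail, which is exactly the boundary between finiteness of the α-th moment for α < 1/2 and infiniteness for β > 1/2.

```python
import random

def return_time(rng, cap):
    """First return time to 0 of symmetric SRW on Z, truncated at cap
    (returns cap + 1 if the walk has not come back by then)."""
    pos, t = 1, 1            # by symmetry, condition on the first step being +1
    while pos != 0 and t <= cap:
        pos += rng.choice((-1, 1))
        t += 1
    return t if pos == 0 else cap + 1

rng = random.Random(12345)
trials, cap = 20000, 1600
taus = [return_time(rng, cap) for _ in range(trials)]

# Empirical tails P[tau > n]; truncated samples still count correctly,
# since a truncated tau certainly exceeds both thresholds below.
tail = lambda n: sum(t > n for t in taus) / trials
print(tail(100), tail(400))
```

With an n^{−1/2} tail one expects tail(400)/tail(100) ≈ 1/2; the numbers 100 and 400 and the truncation level are arbitrary choices for the sketch.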

As Example 2.7.6 demonstrates when compared to Example 2.4.3, Theorem 2.7.4 is not quite sharp; in compensation, it is robust. In some situations it is possible to do better by using Lemma 2.7.5 in a slightly different way: the next result shows how tail information about the maximum of an excursion can be used to deduce a lower bound on the tail of λ_x. The idea is basically an extension of that in Example 2.4.3. In fact, the idea readily yields a lower bound on any non-decreasing additive functional of the excursion, so we state the result in that generality.

Lemma 2.7.7. Let X_n be an F_n-adapted stochastic process taking values in an unbounded subset of R_+. Suppose that there exist finite positive constants x, B, and c such that, for any n ≥ 0, (2.65) and (2.66) hold. Let α : R_+ → R_+ be a non-decreasing function. Then there exists ε > 0 (not depending on α or x) such that, for all y > x,

P[ Σ_{m=1}^{λ_x} α(X_m) ≥ 4εy^2 α(y) | F_0 ] ≥ (1/2) P[ max_{0≤m≤λ_x} X_m ≥ 2y | F_0 ], a.s.


In particular, there exists C < ∞ (not depending on α or x) such that, for all n sufficiently large,

P[λ_x ≥ n | F_0] ≥ (1/2) P[ max_{0≤m≤λ_x} X_m ≥ C n^{1/2} | F_0 ], a.s.

Example 2.7.8. Let y > x. Lemma 2.7.7 applied to the function α(x) = 1{x ≥ y} gives the estimate

P[ Σ_{m=1}^{λ_x} 1{X_m ≥ y} ≥ 4εy^2 ] ≥ (1/2) P[ max_{0≤m≤λ_x} X_m ≥ 2y ],

and hence, for the expected number of visits to [y, ∞) before reaching [0, x],

E[ Σ_{m=1}^{λ_x} 1{X_m ≥ y} ] ≥ 2εy^2 P[ max_{0≤m≤λ_x} X_m ≥ 2y ]. △

Proof of Lemma 2.7.7. Let y > x and define the stopping time κ = min{n ≥ 0 : X_n > 2y}. Define the event

E(y, ε) = { min_{κ≤m≤κ+εX_κ^2} X_m ≥ (1/2) X_κ }.

Then by (2.67) applied at the stopping time κ (if finite), which is justified by Lemma 2.1.11(i), there exists ε > 0 such that

P[E(y, ε) | F_κ] ≥ 1/2, on {κ < ∞}. (2.69)

Now, {κ < λ_x} ∩ E(y, ε) implies that λ_x > κ + εX_κ^2, so that

Σ_{m=1}^{λ_x} α(X_m) ≥ 1{κ < λ_x} 1(E(y, ε)) Σ_{κ≤m≤κ+εX_κ^2} α(X_m)
    ≥ εX_κ^2 α(X_κ/2) 1{κ < λ_x} 1(E(y, ε))
    ≥ 4εy^2 α(y) 1{κ < λ_x} 1(E(y, ε)), a.s.,

since α is non-decreasing. Hence, since {κ < λ_x} is F_κ-measurable,

P[ Σ_{m=1}^{λ_x} α(X_m) ≥ 4εy^2 α(y) | F_0 ] ≥ P[{κ < λ_x} ∩ E(y, ε) | F_0]
    = E[ 1{κ < λ_x} P[E(y, ε) | F_κ] | F_0 ]


≥ (1/2) P[κ < λ_x | F_0],

by (2.69). But κ < λ_x if and only if max_{0≤m≤λ_x} X_m ≥ 2y. The result now follows.

A general strategy to prove that certain moments of a hitting time do not exist is to bound below the probability that the process ends up a long way away from its destination, and then takes a while to come back. A local submartingale can help with the first part, and a maximal inequality with the second.

Example 2.7.9. As in Example 2.4.15, let S_n be symmetric SRW on Z^2. Let τ = min{n ≥ 1 : S_n = 0}. It is not hard to show that X_n = ‖S_n‖ satisfies the conditions of Lemma 2.7.7, so that with the excursion lower bound in Example 2.4.15, we obtain that for any ε > 0 there is a constant c > 0 such that, for all n ≥ 2,

P[τ ≥ n] ≥ c (log n)^{−1−ε}.

In particular, E[τ^s] = ∞ for all s > 0. △
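To see how slow this logarithmic decay is in practice, here is a small simulation of our own: even after 2000 steps, a substantial fraction of planar SRW excursions have not yet returned to the origin (the Erdős–Taylor asymptotic quoted in the notes suggests a fraction of roughly π/log n).

```python
import random

rng = random.Random(99)
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def no_return(n):
    """1 if the first return time tau to the origin exceeds n, else 0."""
    dx, dy = STEPS[rng.randrange(4)]
    x, y = dx, dy                      # position after step 1 (never the origin)
    for _ in range(n - 1):
        if x == 0 and y == 0:
            return 0                   # returned at some step <= n - 1
        dx, dy = STEPS[rng.randrange(4)]
        x += dx
        y += dy
    return 0 if (x == 0 and y == 0) else 1

trials, n = 2000, 2000
frac = sum(no_return(n) for _ in range(trials)) / trials
print(frac)
```

The trial count and horizon are arbitrary choices for the sketch; the point is only that the non-return fraction stays of order (log n)^{−1}, far from any polynomial decay.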

2.8 Growth bounds on trajectories

In this section we turn to the question of quantifying the 'speed' of an R_+-valued discrete-time stochastic process (X_n, n ≥ 0) adapted to a filtration (F_n, n ≥ 0); later we will apply these results to multidimensional models.

Under mild conditions, lim sup_{n→∞} X_n = ∞, i.e., max_{0≤m≤n} X_m tends to infinity. To quantify the 'speed' of the process, we present general semimartingale criteria for obtaining upper and lower almost-sure bounds on max_{0≤m≤n} X_m. For example, we give conditions for finding an increasing function h such that max_{0≤m≤n} X_m ≤ h(n) for all but finitely many n ∈ Z_+, a.s. (which implies lim sup_{n→∞} X_n/h(n) ≤ 1). Similarly, we present lower bounds that may be cast as 'lim inf' statements.

To start with, we consider almost-sure upper bounds. The statement involves a function a satisfying the following condition.

(A0) Suppose that for some n_a ∈ Z_+ the function a : [n_a, ∞) → (0, ∞) is such that (i) x ↦ a(x) is increasing on x ≥ n_a; (ii) lim_{x→∞} a(x) = ∞; and (iii) Σ_{n≥n_a} 1/a(n) < ∞.


Condition (A0) is satisfied, for example, by a(x) = x^{1+ε} (with n_a = 1) or a(x) = x(log x)^{1+ε} (with n_a = 2), where ε > 0.

Theorem 2.8.1. Let X_n be an F_n-adapted process on R_+. Suppose that there exist a non-decreasing function f : R_+ → R_+ and a constant B ∈ R_+ for which f(X_0) is integrable, and

E[f(X_{n+1}) − f(X_n) | F_n] ≤ B, a.s., (2.70)

for all n ≥ 0. Define the non-decreasing function f^{−1} for x > 0 by

f^{−1}(x) := sup{y ≥ 0 : f(y) < x}. (2.71)

Let a satisfy (A0). Then, a.s., for all but finitely many n ≥ 0,

max_{0≤m≤n} X_m ≤ f^{−1}(a(2n)). (2.72)
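The generalized inverse (2.71) deserves a word: f need not be injective, so f^{−1} is defined via a supremum. The following small sketch (our own toy choice of f, not from the text) checks the one property of f^{−1} that the proof actually uses, namely that z > f^{−1}(x) implies f(z) ≥ x.

```python
import math

def f(y):
    # A non-decreasing but non-injective f (our toy example): constant on
    # each interval [m, m + 1), so no ordinary inverse exists.
    return math.floor(y) ** 2

def f_inv(x):
    # (2.71): f_inv(x) = sup{y >= 0 : f(y) < x}.  For this f the set
    # {y : f(y) < x} is exactly [0, m) with m the smallest integer
    # satisfying m^2 >= x, so the supremum is m.
    m = 0
    while f(m) < x:
        m += 1
    return float(m)

print(f_inv(5.0))   # smallest integer m with m^2 >= 5, namely 3.0
# The property used in the proof: z > f_inv(x) implies f(z) >= x.
for x in (0.5, 5.0, 9.0, 10.0):
    assert f(f_inv(x) + 1e-9) >= x
```

Note that f(f^{−1}(x)) ≥ x can fail only in degenerate cases excluded by the supremum convention; what matters above is the implication for strict exceedances.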

Proof. Note that

E f(X_n) ≤ E f(X_0) + Σ_{m=0}^{n−1} E[f(X_{m+1}) − f(X_m)] ≤ E f(X_0) + Bn,

by (2.70), so that f(X_n) is integrable for all n provided f(X_0) is integrable. The maximal inequality (2.14) from Theorem 2.4.7 applied to the process f(X_n) shows that, for any n ≥ n_a,

P[ max_{0≤m≤n} f(X_m) ≥ a(n) ] ≤ (Bn + E f(X_0))/a(n). (2.73)

Also, since f is non-negative and non-decreasing, for n ≥ n_a,

P[ max_{0≤m≤n} f(X_m) ≥ a(n) ] = P[ f(max_{0≤m≤n} X_m) ≥ a(n) ]. (2.74)

With f^{−1} defined by (2.71), let E_n denote the event

E_n := { max_{0≤m≤n} X_m > f^{−1}(a(n)) }.

Then since z > f^{−1}(r) implies f(z) ≥ r, we obtain from (2.73) and (2.74) that, for all n ≥ n_a,

P[E_n] ≤ P[ f(max_{0≤m≤n} X_m) ≥ a(n) ] ≤ (Bn + E f(X_0))/a(n). (2.75)


Now for any ℓ_a ≥ log_2 n_a,

Σ_{ℓ=ℓ_a}^∞ 2^ℓ/a(2^ℓ) < ∞ ⟺ Σ_{ℓ=n_a}^∞ 1/a(ℓ) < ∞. (2.76)

By (2.75) and (2.76), along the subsequence n = 2^ℓ for ℓ ∈ Z_+, (A0) and the Borel–Cantelli lemma imply that, a.s., the event E_n occurs only finitely often, so, for all but finitely many ℓ,

max_{0≤m≤2^ℓ} X_m ≤ f^{−1}(a(2^ℓ)).

Every n ≥ 1 satisfies 2^{ℓ_n} ≤ n < 2^{ℓ_n+1} for some ℓ_n ≥ 0 with lim_{n→∞} ℓ_n = ∞; then

max_{0≤m≤n} X_m ≤ max_{0≤m≤2^{ℓ_n+1}} X_m ≤ f^{−1}(a(2^{ℓ_n+1})), a.s.,

for all but finitely many n. Now since 2^{ℓ_n+1} ≤ 2n and f^{−1} and a are eventually non-decreasing, (2.72) follows.

We give an application of the preceding result to a many-dimensional random walk with asymptotically vanishing drift.

Example 2.8.2. Let (ξ_n, n ≥ 0) be a Markov process with state space Σ ⊆ R^d and increments θ_n := ξ_{n+1} − ξ_n. Suppose that there exists C < ∞ such that, for all x ∈ Σ,

‖E[θ_n | ξ_n = x]‖ ≤ C(1 + ‖x‖)^{−1}, and E[‖θ_n‖^2 | ξ_n = x] ≤ C.

Set X_n = ‖ξ_n‖^2. Then, writing '·' for the scalar product on R^d,

X_{n+1} − X_n = ξ_{n+1} · ξ_{n+1} − ξ_n · ξ_n = 2ξ_n · (ξ_{n+1} − ξ_n) + ‖ξ_{n+1} − ξ_n‖^2.

It follows that

E[X_{n+1} − X_n | ξ_n = x] = 2x · E[θ_n | ξ_n = x] + E[‖θ_n‖^2 | ξ_n = x] ≤ 3C < ∞,

by hypothesis. In other words, if F_n = σ(ξ_0, ξ_1, …, ξ_n), we have that

E[X_{n+1} − X_n | F_n] ≤ 3C, a.s.

Hence an application of Theorem 2.8.1 with f(x) = x shows that for any ε > 0, a.s., for all but finitely many n,

max_{0≤m≤n} ‖ξ_m‖ ≤ n^{1/2} (log n)^{(1/2)+ε}. △
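A hypothetical instance of Example 2.8.2 can be simulated to watch the diffusive bound in action. The drift field and step law below are our own choices (outward drift of magnitude c/(1 + ‖x‖), plus ±1 noise per coordinate), chosen only so that both hypotheses of the example hold; with ε = 1/4 the trajectory maximum should sit comfortably below n^{1/2}(log n)^{3/4}.

```python
import math
import random

def step(rng, x, y, c=1.0):
    # Outward mean drift of magnitude c/(1 + |x|); the increment is the
    # drift plus independent +/-1 noise in each coordinate, so its
    # conditional second moment is bounded (by c**2 + 2 here).
    r = math.hypot(x, y)
    scale = c / ((1.0 + r) * r) if r > 0 else 0.0
    return (x + scale * x + rng.choice((-1, 1)),
            y + scale * y + rng.choice((-1, 1)))

rng = random.Random(7)
n = 20000
x = y = 0.0
running_max = 0.0
for _ in range(n):
    x, y = step(rng, x, y)
    running_max = max(running_max, math.hypot(x, y))

bound = math.sqrt(n) * math.log(n) ** 0.75   # n^{1/2} (log n)^{1/2 + 1/4}
print(running_max, bound)
```

The observed maximum is of diffusive order sqrt(n), well inside the almost-sure envelope; the theorem of course asserts nothing about any single finite run, so this is only a plausibility check.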


The next result is a lower bound for max_{0≤m≤n} X_m. In addition to the condition (A0), the statement involves the following condition.

(A1) Suppose that for some n_a ∈ Z_+ the function a : [n_a, ∞) → (0, ∞) is such that (i) x ↦ a(x) is increasing on x ≥ n_a; (ii) lim_{x→∞} a(x) = ∞; and (iii) Σ_{n≥n_a} 1/(n a(n)) < ∞.

Note that the summability condition in (A1) is equivalent to the condition that Σ_{n≥n_a} 1/a(r^n) < ∞ for some (hence all) r > 1. Condition (A1) is satisfied, for example, by a(x) = (log x)^{1+ε} or (log x)(log log x)^{1+ε}, ε > 0.

Theorem 2.8.3. Let X_n be an F_n-adapted process on R_+. Suppose that there exists b ∈ R_+ such that, for all n ≥ 0,

P[X_{n+1} − X_n ≤ b] = 1. (2.77)

Suppose that there exist a non-decreasing function f : R_+ → R_+ and some ε > 0 for which, for all n ≥ 0,

E[f(X_{n+1}) − f(X_n) | F_n] ≥ ε, a.s. (2.78)

For a function a : [n_a, ∞) → (0, ∞) with n_a ∈ Z_+, define r_a : R_+ → R_+ by

r_a(x) := inf{y ≥ n_a : ε^{−1} a(y) f(y + b) ≥ x}, (x ≥ 0). (2.79)

(i) Let a satisfy (A0). Then, a.s., for all but finitely many n ≥ 0,

max_{0≤m≤n} X_m ≥ r_a(n) − 1. (2.80)

(ii) Let a satisfy (A1). For any δ > 0, a.s., for all but finitely many n ≥ 0,

max_{0≤m≤n} X_m ≥ (1 − δ) r_a(n). (2.81)

Remark 2.8.4. The two parts of the theorem are complementary. If one expects the growth rate of max_{0≤m≤n} X_m with n to be polynomial, then one should try a polynomial f, and part (ii) of the theorem will usually provide the better bound; for slower growth, one should try a super-polynomial f, and part (i) will usually provide the better bound.

Proof of Theorem 2.8.3. Recall that, for X_n ∈ R_+, σ_x = min{n ≥ 0 : X_n ≥ x}. Note that under either (A0) or (A1), the fact that f is non-decreasing ensures that x ↦ r_a(x) is non-decreasing for all x ≥ 0, and r_a(x) → ∞ as x → ∞.

We now apply Theorem 2.4.11. Since f satisfies (2.78), we have that (2.19) holds for all x > 0 and all n ≥ 0, while (2.77) and the fact that f is non-decreasing imply that f(X_{σ_x}) ≤ f(x + b), a.s. Then Theorem 2.4.11 (with stopping time η = ∞) implies that E σ_x ≤ ε^{−1} f(x + b) for all x > 0.

In particular, for all x > 0, σ_x < ∞, and σ_x ≤ σ_y for x ≤ y. Moreover, the jumps bound (2.77) implies that, for all x ≥ X_0, σ_{x+b} ≥ 1 + σ_x a.s., so that lim_{x→∞} σ_x = ∞, a.s.

By Markov's inequality, for any x ≥ n_a,

P[σ_x > a(x) E σ_x] ≤ (a(x))^{−1}. (2.82)

To prove part (i), suppose that (A0) holds. Then (2.82) and the Borel–Cantelli lemma imply that, a.s., for all but finitely many ℓ ∈ Z_+,

σ_ℓ ≤ a(ℓ) E σ_ℓ ≤ ε^{−1} a(ℓ) f(ℓ + b).

Here, since σ_x → ∞ as x → ∞, σ_ℓ ≥ n_a for all but finitely many ℓ, so, with the definition of r_a at (2.79), a.s., for all but finitely many ℓ,

r_a(σ_ℓ) ≤ r_a(ε^{−1} a(ℓ) f(ℓ + b)) ≤ ℓ. (2.83)

For n ∈ Z_+, define ℓ_n = min{ℓ ≥ 0 : σ_ℓ > n}. Then ℓ_n → ∞ as n → ∞, and σ_{ℓ_n−1} ≤ n < σ_{ℓ_n}. Moreover, a.s.,

max_{0≤m≤n} X_m ≥ max_{0≤m≤σ_{ℓ_n−1}} X_m ≥ X_{σ_{ℓ_n−1}} ≥ ℓ_n − 1 ≥ r_a(σ_{ℓ_n}) − 1, (2.84)

for all but finitely many n, by (2.83). Since σ_{ℓ_n} > n, (2.80) follows.

The proof of part (ii) is similar. Fix δ > 0 and K > 1 such that K^{−1} > 1 − δ. Supposing that (A1) holds, (2.82) and the Borel–Cantelli lemma imply that, a.s., for all but finitely many ℓ ∈ Z_+,

σ_{K^ℓ} ≤ a(K^ℓ) E σ_{K^ℓ} ≤ ε^{−1} a(K^ℓ) f(K^ℓ + b).

Similarly to (2.83), we obtain r_a(σ_{K^ℓ}) ≤ K^ℓ, a.s., for all but finitely many ℓ. Now setting ℓ_n = min{ℓ ≥ 0 : σ_{K^ℓ} > n}, we have σ_{K^{ℓ_n−1}} ≤ n < σ_{K^{ℓ_n}}. Then, similarly to (2.84), a.s.,

max_{0≤m≤n} X_m ≥ X_{σ_{K^{ℓ_n−1}}} ≥ K^{ℓ_n−1} ≥ (1/K) r_a(σ_{K^{ℓ_n}}) ≥ (1/K) r_a(n),

for all n sufficiently large, which yields (2.81).


For the rest of this section we return to the nearest-neighbour random walk on Z_+ introduced in Section 2.2 in order to illustrate the results of this section. Consider the case p_x = p ∈ (0, 1) for all x ∈ Z_+. If p > 1/2, the random walk X_n is positive-recurrent; this fact is an easy consequence of Theorem 2.2.5, or can be seen with the argument in Example 2.6.5, which is for a model with a slightly different reflection rule at the origin. Quantitatively, we will show that

lim_{n→∞} ((log n)^{−1} max_{0≤m≤n} X_m) = ϱ, a.s., (2.85)

where ϱ ∈ (0, ∞) is given by

ϱ := (log(p/(1 − p)))^{−1}. (2.86)

We will prove the following more refined result, from which (2.85) is immediate. Recall that log_k x denotes the k-fold iterated logarithm.

Proposition 2.8.5. Let p_x = p ∈ (1/2, 1) for all x ∈ Z_+. Then with ϱ given by (2.86), for any integer k ≥ 3 and any ε > 0, a.s., for all but finitely many n ∈ Z_+, the following inequalities hold:

max_{0≤m≤n} X_m ≤ ϱ( log n + Σ_{j=2}^{k−1} log_j n + (1 + ε) log_k n ), (2.87)

max_{0≤m≤n} X_m ≥ ϱ( log n − Σ_{j=2}^{k−1} log_j n − (1 + ε) log_k n ). (2.88)

Proof. From (2.5) with p_x = p ∈ (1/2, 1) and q_x = q = 1 − p for all x ∈ Z_+,

u_x = q^{−1} Σ_{y=0}^{x} (p/q)^y = (1/(p − q))((p/q)^{x+1} − 1).

Hence from (2.8) we obtain, for x ∈ N,

t(x) = (1/(p − q)) Σ_{y=0}^{x−1} ((p/q)^{y+1} − 1) = (p/(p − q)^2)(p/q)^x − x/(p − q) − p/(p − q)^2.

Thus there is a constant C_0 ∈ (1, ∞) such that, for all x ≥ 1,

t_0(x) := C_0^{−1}(p/q)^x ≤ t(⌊x⌋) ≤ C_0(p/q)^x =: t_1(x). (2.89)
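As a quick numerical aside (ours), the closed form for t(x) just derived can be checked term-by-term against the defining sum:

```python
# Deterministic check (ours) that the closed form for t(x) above agrees
# with the defining sum t(x) = sum_{y=0}^{x-1} u_y, where
# u_y = ((p/q)^{y+1} - 1)/(p - q).
p = 0.7
q = 1 - p
for x in range(1, 12):
    lhs = sum(((p / q) ** (y + 1) - 1) / (p - q) for y in range(x))
    rhs = (p / (p - q) ** 2) * (p / q) ** x - x / (p - q) - p / (p - q) ** 2
    assert abs(lhs - rhs) < 1e-8 * max(1.0, abs(rhs))
print("t(x) closed form agrees for x = 1, ..., 11")
```

(The value p = 0.7 is an arbitrary choice; any p ∈ (1/2, 1) works.) In particular the geometric term p(p/q)^x/(p − q)^2 dominates, which is what (2.89) records.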


The functions t_0 and t_1 are continuous and monotone, and so can be inverted. Then, with ϱ given by (2.86), we have that, for a constant C_1 = ϱ log C_0 ∈ (0, ∞) and all x ≥ 1,

t_0^{−1}(x) = ϱ log x + C_1,  t_1^{−1}(x) = ϱ log x − C_1. (2.90)

We will apply Theorems 2.8.1 and 2.8.3 with X_n as given, with F_n = σ(X_0, …, X_n), and f : R_+ → R_+ the non-decreasing function given by f(x) = t(⌊x⌋). In particular, (2.70) holds with B = 1, (2.78) holds with ε = 1, and (2.77) holds with b = 1.

For the upper bound we apply Theorem 2.8.1. For k ≥ 2 and ε > 0, set

a(x) = x (log_k x)^{1+ε} Π_{j=1}^{k−1} log_j x.

Then a satisfies (A0). Define f^{−1} as at (2.71). Since f(x) = t(⌊x⌋) ≥ t_0(x), we have by continuity of t_0 that f^{−1}(x) ≤ t_0^{−1}(x). Then Theorem 2.8.1 and (2.90) imply that, a.s.,

max_{0≤m≤n} X_m ≤ t_0^{−1}(a(2n)) = ϱ log(a(2n)) + C_1, (2.91)

for all but finitely many n, and here

log(a(2n)) = Σ_{j=1}^{k} log_j n + (1 + ε) log_{k+1} n + O(1),

and so from (2.91) we obtain (2.87).

On the other hand, for the lower bound we apply Theorem 2.8.3(i). For k ≥ 3 and ε > 0, set

c(x) := ϱ( log x − Σ_{j=2}^{k} log_j x − (1 + ε) log_{k+1} x ).

Then, for some C_2 ∈ (0, ∞), for all x > 0 sufficiently large,

sup_{0≤y≤c(x)} a(y) ≤ a(c(x)) ≤ a(ϱ log x) ≤ C_2 (log x)(log_2 x)⋯(log_k x)(log_{k+1} x)^{1+ε}.

Also, from (2.89), for all y ≤ c(x),

t_1(y) ≤ t_1(c(x)) = C_0 x (log x)^{−1}(log_2 x)^{−1}⋯(log_{k−1} x)^{−1}(log_k x)^{−1−ε}.


Thus there exists C_3 ∈ (0, ∞) such that, for all y with y + 1 ≤ c(x),

a(y) f(y + 1) ≤ a(y) t_1(y + 1) ≤ C_3 x (log_k x)^{−ε}(log_{k+1} x)^{1+ε} ≤ x,

for all x sufficiently large. Hence

r_a(x) = inf{y ≥ n_a : a(y) f(y + 1) ≥ x} ≥ c(x) − 1,

for all x sufficiently large. Now (2.88) follows from Theorem 2.8.3(i).
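Proposition 2.8.5 is easy to test by simulation. The sketch below is our own (in particular, the reflection rule at 0 is our choice and does not affect the logarithmic asymptotics): it runs the positive-recurrent walk that steps towards 0 with probability p = 3/4 and compares max_{0≤m≤n} X_m with ϱ log n.

```python
import math
import random

# Simulation sketch (ours, not the book's): nearest-neighbour walk on Z_+
# stepping towards 0 with probability p = 3/4 (negative drift) and away
# from 0 with probability 1/4.  Proposition 2.8.5 predicts
# max_{0<=m<=n} X_m ~ rho * log n with rho = 1/log(p/(1-p)).
p = 0.75
rho = 1.0 / math.log(p / (1.0 - p))

rng = random.Random(2015)
n = 500_000
pos, running_max = 0, 0
for _ in range(n):
    if pos == 0:
        pos = 1 if rng.random() < 1.0 - p else 0   # reflection rule: our choice
    else:
        pos += -1 if rng.random() < p else 1
    running_max = max(running_max, pos)

print(running_max, rho * math.log(n))
```

With n of this size, ϱ log n ≈ 12, and the observed maximum should land within a few units of it, the O(log log n) corrections in (2.87)–(2.88) being invisible at this scale.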

Bibliographical notes

Sections 2.1 and 2.3

There are many books that provide the necessary background in probability theory, e.g. [19, 72, 240, 263], and there also one can find the systematic treatment of martingales and countable Markov chains. The books of Durrett [72] and Shiryaev [263] are excellent sources for many of the results quoted without proof in Section 2.3; martingale inequalities are treated thoroughly by Chow and Teicher [31]. Many of the fundamental results on discrete-time martingales in Section 2.3 are due to Doob, and can be found in his classic book [67].

Theorem 2.1.6 is a simple Markov chain ergodic theorem; similar results in a more general setting are given in Section 3.6, which also contains the proof of Theorem 2.1.9 and related results.

Section 2.2

The recurrence classification of nearest-neighbour random walks on Z_+ is classical, with early contributions being [113] and [115]. A standard presentation is given by Chung [35, §I.12].

Section 2.4

Of the numerous examples in Section 2.4, several are standard, and can also be found in [72], for instance; others are more specialized and foreshadow later developments in the present chapter and the rest of the book.

Theorem 2.4.7 is from a vein of results that can be traced back to Kushner [170, 171] and Spitzer [269]. The maximal inequality (2.14) is Lemma 3.1 in [211]. The expectation lower bound (2.13) is inspired by Spitzer [269]; a related result is Lemma 2.2 in [209]. Theorem 2.4.8 extends the result from [211]; inequality (2.12) is a slight generalization of Lemma 4.3 from [119].


Theorem 2.4.5 was used by Kushner [170], who showed that a supermartingale Lyapunov function provides an upper bound on the probability of eventual escape from a set, similar to the B = 0 case of Corollary 2.4.10. A version of Corollary 2.4.10 for Markov chains on R^d was given by Kushner [171], assuming continuity of f; Kalashnikov [132, p. 1406] remarked that this continuity assumption was unnecessary. Kalashnikov [133, 134] later provided some generalizations of the idea of Corollary 2.4.10 and gave applications to reliability theory and to queueing systems.

An alternative proof of (2.14) is to apply Theorem 2.4.5 to the non-negative supermartingale (X_{m∧n} + B(m − m∧n), n ≥ 0), where m is fixed; this is the method used by Kushner [171] in the Markov setting, but it is awkward to extend to admit the stopping time ν. The argument in [211, 119] is based on Doob's inequality. The proof of the more general Theorem 2.4.8 given here is entirely elementary.

The elementary approach to Theorem 2.4.8 is essentially the same idea used in the proofs of the close relatives Theorems 2.4.11 and 2.6.2; as described in Remark 2.4.9, all of these results can essentially be seen as consequences of Dynkin's formula. Theorem 2.4.11, which is a relative of Lemma 3.2 from [211], can be seen as a 'reverse' Foster-type result. Theorem 2.4.11 provides an unusually simple approach to Kolmogorov's 'other' inequality (Theorem 2.4.12); the standard approach via optional stopping can be found in [109, p. 502], for instance. Example 2.4.13 generalizes a result of Spitzer [269] for Markov chains on Z_+; the terminology 'weak renewal theorem' is Spitzer's.

The Azuma–Hoeffding martingale concentration inequality (2.27) in Theorem 2.4.14 is due to Hoeffding [116] and Azuma [12]. The submartingale and supermartingale versions given here are not often explicitly stated in the literature. A variety of related concentration inequalities can be found in e.g. [33].

Section 2.5

The basic criteria in Theorems 2.5.2 and 2.5.7 have their origin in work of F. G. Foster [95], who was a student of D. G. Kendall. Foster [95] proved a version of the 'if' half of the recurrence criterion Theorem 2.5.2, in the case where the exceptional set A is a single point. Extension to the case of a finite set A appears in Pakes [227]. The 'only if' part of Theorem 2.5.2 is due to Mertens et al. [216].

The transience criterion Theorem 2.5.7 is also due to Foster [95], again in the case where A is a singleton. The case of finite A is treated by Harris and Marlin [112] and Mertens et al. [216]; the condition that A be finite is not needed, as had already been realised in a different but analogous context by Kalashnikov [135], who proved a version of Theorem 2.5.7 with an inessential restriction on the form of f.

Theorem 2.5.13, which gives transience of a Markov chain under a positive drift condition, assuming a uniform jumps bound, can be found in [196], following earlier work of Malyshev [197]. Both Theorem 2.5.2 and Theorem 2.5.13 were usefully extended to multiple-step drift conditions by Menshikov and Malyshev [206, 196]; see also Chapter 2 of Fayolle et al. [84]. Theorem 2.5.17 replaces the jumps bound in Theorem 2.5.13 by a (much weaker) moments assumption.

As mentioned in the text, Theorems 2.5.6 and 2.5.16 prefigure the developments in Chapter 3, and the basic ideas behind these two results owe much to Lamperti [173]. Theorem 2.5.6 is contained in Theorem 3.1 of [173]. A counterexample showing that zero drift for a Markov chain on R does not imply recurrence (unlike on R_+, as discussed in Example 2.5.5) can be found in Rogozin and Foss [247], who give a particular example of Kemperman's [144] oscillating random walk in which the increment law is one of two distributions (with mean zero but infinite variance) depending on the walk's present sign. The oscillating random walk and related heavy-tailed phenomena are discussed in Chapter 5. The Markov chain transience condition, Theorem 2.5.17, is close to best possible for a one-step drift condition of this kind: the version here is given by Zachary [287], a (weaker) version requiring second moments is in Robert [245, p. 218], while some improvements are given by Denisov and Foss [93, 62].

The arguments of Foster, and other early authors, are simple enough, but are based on algebraic manipulations with Markov transition probabilities. The more general, and more probabilistic, approach via stopping times and adapted processes came later. The systematic use of martingale and stopping time ideas for investigating recurrence and transience seems to begin with Lamperti [173], who (while acknowledging a debt to ideas of Doob) treated more general processes than Markov chains. It took some time (see Mertens et al. [216] and Filonov [90]) for martingale ideas to become the norm in the proofs of the Markov chain versions of the classification theorems, although Blackwell [21] and Breiman [26], for example, had earlier made use of martingale ideas in proving structural results about Markov chains relating to recurrence.

The presentation in Section 2.5, and also Section 2.6, based on adapted processes, stopping times, and semimartingales, follows the spirit of Chapter 2 of [84], which contains additional results in a similar vein.

Page 107: Non-homogeneous random walks - Unicamp

Bibliographical notes 89

Section 2.6

Foster's positive-recurrence criterion for Markov chains, Theorem 2.6.4, was first proved by Foster in [95] in the case where A is a singleton and ε = 1 (the latter being no loss of generality here). In fact, Foster appeals to a result of Feller (from the 1950 first edition of [86]) for the 'only if' part. Foster himself was aware that his argument could be extended to the case of finite A (see his contribution to the discussion of [145, p. 175]), as was also remarked by Moustafa [220] and Kingman [155]; explicit proofs for finite A were given by Pakes [227] and Kalashnikov [135], although the finite A result is also a consequence of a condition of Mauldon [201] for a (not necessarily irreducible) Markov chain to be non-dissipative, which is equivalent to positive recurrence in the irreducible case. (The terminology was introduced by Foster [94].)

The formulation of a Foster-type condition for integrability of a stopping time for an adapted process, as in Theorem 2.6.2, can be found for example in Theorem 2.1.1 of [84]; a version for processes with bounded increments was given by Malyshev [197]. Theorems 2.6.2 and 2.6.4 may both be extended to a condition concerning increments over non-unit time intervals: see Theorems 2.1.2 and 2.2.4 of [84], which are based on earlier results of [206, 196, 90].

The results in Example 2.6.6 on the exit time of a zero-drift Markov chain from a quadrant were first announced in the case of nearest-neighbour jumps on Z^2_+ by Klein Haneveld [159], who used generating functions and an analytical method from unpublished work of J. Groeneveld; see [160] for a systematic account. In the generality of Example 2.6.6, the results are due to Klein Haneveld and Pittenger [161], whose proof is the basis for the argument given here, and Cohen [40], who used another elaborate generating function technique; see also [39, 274].

Example 2.6.7 on the exit time of a martingale from a square-root boundary gives a semimartingale approach to a result that dates back to Blackwell and Freedman [22], who showed for one-dimensional symmetric SRW that E κ_a < ∞ if and only if a < 1. The result of Blackwell and Freedman was generalized by Chow et al. [30] and Gundy and Siegmund [108]. The approach here to integrability, via Foster's criterion, is different from these original papers, while the approach to non-integrability, via quadratic variation, is closely related to the approach in [30].

As mentioned in the main text, Foster's original argument in [95], and indeed the arguments of Pakes [227] and Kalashnikov [135], are closer to that presented in the proof of Lemma 2.6.8, although all those results are stated only in the Markov chain case. Lemma 2.6.8 is the analogue of Foster's original argument for adapted processes.

The idea of proving that a stopping time has infinite expectation by exhibiting a contradiction with the optional stopping theorem, as in Theorem 2.6.10, goes back at least to Doob [67, p. 308], who considered a martingale with uniformly bounded increments. The submartingale result stated as Theorem 2.6.10 here is a variation on Theorem 2.4 of Fayolle et al. [83]. The Markov chain formulation, Corollary 2.6.11, is a version of a result of Tweedie [279]; an earlier version, assuming a uniform bound on the increments of f(X_n), was given by Malyshev in his doctoral thesis of 1973. Other results on 'non-ergodicity' of Markov chains include those of Sennott et al. [258], which make use of the less elementary Kaplan's condition [138].

Section 2.7

Results of the form of those in Section 2.7 on existence and non-existence of moments of passage times originate with fundamental work of Lamperti [175], who considered the case of E[λ_x^p] for positive integer p. Lamperti's ideas were extended in [9] to include the case of non-integer p, including the important case p ∈ (0, 1); the presentation here is based closely on [9]. In the Appendix of [9] it is shown that Theorems 2.7.1 and 2.7.4 of Section 2.7 improve the results of Lamperti (cf. Theorems 2.1, 2.2, 3.1, and 3.2 in [175]). General criteria in the vein of Theorem 2.7.1 for existence of generalized moments E g(τ) are given in [8].

Lemma 2.7.5, showing that the process takes quadratic time to return to a neighbourhood of zero from a distant point, is closely related to Lemma 2 of [9], which improved Lemma 3.1 of [175]. The proof in [9] is different from the proof of Lemma 2.7.5 given here, which is based on the maximal inequality of Theorem 2.4.8.

While existence-of-moments results are essentially equivalent, via Markov's inequality, to upper tail bounds on passage times, non-existence-of-moments results do not directly yield any lower tail bounds. However, both types of result are obtained by combining, with slightly different technical ideas, two elements, namely (i) Lemma 2.7.5 and (ii) a suitable submartingale. The first method, as in the proof of Theorem 2.7.4, uses the submartingale property and the quadratic time estimate to exhibit a contradiction with the optional stopping theorem if one assumes existence of sufficient moments, an argument similar in spirit to Theorem 2.6.10 and an idea that goes back to Doob [67, p. 308]. A second method uses the submartingale to instead provide a lower tail bound on the maximum of an excursion, which with the quadratic time estimate yields a lower tail bound on the return time via Lemma 2.7.7. A similar idea goes back to Lamperti, who obtained a lower tail bound in the course of the proof of his Theorem 3.2 [175]; a variation is used to obtain lower tail bounds in [6]. The explicit connection to excursion maxima, as in Lemma 2.7.7, follows a similar presentation in [49] and [121].

For an irreducible Markov chain X_n on a countable state space Σ, the property that E_x[τ_x^γ] < ∞ for x ∈ Σ was studied under the name 'γ-recurrence' (also translated as 'γ-reflexivity') by Kalashnikov [136] and Popov [236], both apparently unaware of Lamperti's work, who provided some conditions for E_x[τ_x^γ] < ∞, γ ≥ 1, in terms of space-time Lyapunov functions of the form f(X_n, n); these conditions seem to be much less applicable than the results presented here based on [175, 9]. Popov [236] went on to show that if X_n is γ-recurrent with γ > 1, then the rate of convergence of P[X_n = y | X_0 = x] to the stationary probability π_y is O(n^{−(γ−1)}).

Results on existence and non-existence of passage-time moments for Markov chains in general state spaces appear in [129]; there the connection between drift conditions and rates of convergence to stationarity is also explored.

The idea of Example 2.7.9 is clearly applicable much more widely than the case of SRW on Z^2, and the statement is close to sharp. For SRW, Erdős and Taylor [77], refining an earlier result of Dvoretzky and Erdős [73], showed that P[τ ≥ n] = π/log n + O((log n)^{−2}). See also [69].

Section 2.8

The results of Section 2.8 are based on [211]. The upper bound result, Theorem 2.8.1, is Theorem 3.2 in [211]. The lower bound result, Theorem 2.8.3, is based on Theorem 3.3 of [211]; the proof here is simpler, and the statement corrects the erroneous assertion from [211] that one may set δ = 0 in (2.81).

Example 2.8.2 gives a diffusive upper bound for many-dimensional random walks whose drift decays suitably rapidly with distance from the origin; such walks may be transient, null-recurrent, or positive-recurrent (in which case this upper bound is typically not sharp) depending on the details: see Chapter 4. The application of the results in this section to nearest-neighbour random walk on Z with negative drift (Proposition 2.8.5) may well be known, but we could find no reference in the literature.


Further remarks

Additional historical and bibliographic details related to this chapter may be found in [84] and [217].

It is worth emphasizing that a large and sophisticated body of work exists on Foster–Lyapunov results for Markov chains in general state spaces. Here the interaction between the process, the space, and the different possible notions of recurrence and transience leads to a rich theory, but one with rather different focus and methodology from this book. Early work in this direction, starting in the mid 1970s, includes that of Tweedie [277, 279]; a recent treatment, including thorough bibliographic details, is the book of Meyn and Tweedie [217], and see also [129] and [68].

It is also worth mentioning that there is a well-developed analogue of the Foster–Lyapunov theory for diffusion processes (and Itô processes more generally). Here early contributions include fundamental work of Khas'minskii, such as [153]; see also [285, 286]. Recent accounts of aspects of the theory are presented in the books of Pinsky [234] (for diffusions) and Mao [199] (for Itô processes).


Chapter 3

Lamperti’s problem

3.1 Introduction

The law of a one-dimensional time-homogeneous Markov chain is determined by the collection, over all points x in the state space, of laws of its increments at x, i.e., the one-step distributions for the chain started from x. A very broad question is to investigate the extent to which the asymptotic behaviour of the process is determined by coarser information, such as the first few moments of the increments at x, as functions of x: can some general form of ‘invariance principle’ be discerned? To obtain an intuitive grounding, it is sensible that the variable x should describe a natural scale for the process in some way. Formalizing such vague notions, and generalizing beyond the Markovian setting, one is led almost inevitably to what has become known as Lamperti's problem, after pioneering work of J. Lamperti published in the early 1960s (see the bibliographical notes at the end of this chapter for details).

Let us describe informally Lamperti's problem. Consider a discrete-time stochastic process (Xn, n ≥ 0) on R+. For now suppose that Xn is a time-homogeneous Markov process whose increment moment functions

µk(x) = E[(Xn+1 − Xn)^k | Xn = x] (3.1)

are well-defined for k ≥ 0; one way to ensure this is to impose a uniform bound on the increments. (We will relax all of these conditions shortly.) Lamperti's problem is to determine how the asymptotic behaviour of Xn depends upon µ1 and µ2.

It is possible to proceed more generally, but to establish some intuition we suppose for now that Xn is already on a natural scale, meaning that µ2(x) is bounded away from 0 and ∞ by finite constants. This is not too much of a restriction, since if this second moments condition does not hold for Xn, one can often choose a suitable f : R+ → R+ for which the condition does hold for the process Yn = f(Xn). If f is bijective, then the Markov property of Xn transfers to Yn. Even if Yn is not itself Markov, we may still proceed; indeed, there are other, more pressing reasons why we wish to relax the Markov assumption, as we describe below.

Assuming that µ2(x) is bounded away from 0 and ∞, the behaviour of Xn is rather standard when, outside some bounded set, µ1(x) ≡ 0 (the zero drift case) or µ1(x) is uniformly bounded to one side of 0. In the zero-drift case, the Markov chain is null-recurrent; recurrence follows, under mild conditions, from Theorem 2.5.6, and the null property by Corollary 2.6.11. In the case of uniformly negative drift, the chain is positive-recurrent by Theorem 2.6.4. In the case of uniformly positive drift, the chain is transient, by Theorem 2.5.17. Typically, these regimes are also far from critical; positive-recurrence is accompanied by exponential estimates on hitting times and tails of stationary distributions, and transience is accompanied by positive speed of escape.

This motivates the study of the asymptotically zero drift regime, in which µ1(x) → 0 as x → ∞, to investigate phase transitions. We shall examine in this chapter part of the rich spectrum of possible behaviours for Xn. It turns out that from the point of view of the recurrence classification of Xn, the case where |µ1(x)| is of order 1/x is critical.

These Lamperti processes serve as prototypical near-critical stochastic systems, and provide an excellent setting in which to demonstrate the application of the ideas from Chapter 2. There is, however, another reason, of at least equal importance, for studying these processes: they appear regularly in the analysis of near-critical stochastic processes via the Lyapunov function method. This context also provides the main motivation for the desire to work in some generality without imposing, for instance, assumptions of the Markov property, a countable state-space, or uniformly bounded increments. The following simple example demonstrates several important points.

Example 3.1.1. As in Example 2.8.2, let (ξn, n ≥ 0) be a time-homogeneous Markov chain on state space Σ ⊆ Rd with increments θn := ξn+1 − ξn. Suppose that there exists B ∈ R+ such that, for all x ∈ Σ,

E[‖θn‖² | ξn = x] ≤ B, and E[θn | ξn = x] = 0.

For example, ξn might be symmetric simple random walk on Zd; in this case, as is typically true in general, Xn is not Markov: see the discussion in Section 1.3 and the accompanying bibliographic notes in Chapter 1.

Consider Xn = ‖ξn‖. First note that, by the triangle inequality,

|‖ξn+1‖ − ‖ξn‖| ≤ ‖ξn+1 − ξn‖, a.s.,

so that E[(Xn+1 − Xn)² | ξn = x] is uniformly bounded. To get an estimate on the mean increment of Xn given ξn ≠ 0, we write

Xn+1 − Xn = (‖ξn+1‖² − ‖ξn‖²)/(2‖ξn‖) − (‖ξn+1‖ − ‖ξn‖)²/(2‖ξn‖).

Taking expectations, we see |E[Xn+1 − Xn | ξn = x]| is bounded above by

(1/(2‖x‖)) ( |E[‖ξn+1‖² − ‖ξn‖² | ξn = x]| + E[(‖ξn+1‖ − ‖ξn‖)² | ξn = x] ),

which is at most B‖x‖^{−1}, by a similar calculation to that in Example 2.8.2. So, roughly speaking, the process Xn has mean drift O(Xn^{−1}). To study, for instance, the recurrence properties of ξn, this Lyapunov function approach thus leads to the study of the R+-valued process Xn, which has asymptotically zero drift, and so is of Lamperti type. However, since Xn is typically not Markov, some care is needed to formulate this approach precisely. Given additional assumptions on ξn, we can estimate the drift of Xn more carefully, using Taylor's theorem: we return to this idea in Example 3.5.4, and in detail in Chapter 4. △
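The bound just derived is easy to check numerically. The following sketch (our own illustration, not from the text; the helper name radial_drift is ours) computes the exact one-step mean radial increment for simple random walk on Z², for which B = E[‖θn‖²] = 1, and verifies that it is non-negative and at most B/‖x‖:

```python
import math

# Exact one-step mean drift of X_n = ||xi_n|| for simple random walk on Z^2,
# compared with the bound B/||x|| from Example 3.1.1 (here B = E[||theta||^2] = 1).
def radial_drift(x):
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    r = math.hypot(*x)
    return sum(math.hypot(x[0] + dx, x[1] + dy) - r for dx, dy in moves) / 4

for x in [(10, 0), (7, 7), (30, 40)]:
    r = math.hypot(*x)
    d = radial_drift(x)
    assert 0 <= d <= 1.0 / r  # non-negative (Jensen) but O(1/||x||)
    print(f"x={x}, drift={d:.5f}, bound={1.0/r:.5f}")
```

The drift is non-negative by Jensen's inequality and is in fact close to the sharper estimate (1 − 1/d)/(2‖x‖) derived in Example 3.3.2 below.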

ℹ Natural Lyapunov functions for near-critical Markov processes often lead to Lamperti-type stochastic processes.

Relaxing the Markov assumption leads to a slight complication in defining the correct analogues of (3.1), but does not complicate the proofs, which are based on the semimartingale ideas of Chapter 2. For expository purposes, we start with the Markov case and then proceed to the more general setting. Throughout this chapter we use the notation

∆n := Xn+1 −Xn.

3.2 Markovian case

For orientation, we start with the Markov setting. We state in this section the Markov versions of some of the key results in this chapter; we do not give the proofs at this stage. The results presented in this section will follow from the more general results presented in the main body of this chapter; we defer the proofs until Section 3.13.

For this section, our basic assumptions on our process Xn and on the structure of its state space Σ are as follows.

(M0) Let Xn be an irreducible, time-homogeneous Markov chain on Σ, a locally finite, unbounded subset of R+, with 0 ∈ Σ.

We also make some assumptions on the increments of Xn. We will typically assume that for some p > 2 (at least),

sup_{x∈Σ} E[|∆n|^p | Xn = x] < ∞. (3.2)

(M1) Suppose that (3.2) holds for some p > 2.

Remark 3.2.1. As stated in Corollary 2.1.10, a consequence of (M0) is that lim sup_{n→∞} Xn = ∞, a.s.; see Section 3.6 below for a proof.

Given (M1), µk(x) as defined at (3.1) is finite for k ∈ {1, 2}.

Example 3.2.2. Although the proofs of the results in this section are deferred till later, we present, by way of an example, an application of Foster's criterion (Theorem 2.6.4) to demonstrate a condition for positive recurrence. Suppose that (M0) and (M1) hold, and there exist positive constants ε and x0 so that 2xµ1(x) + µ2(x) < −ε for all x ≥ x0. Consider the Lyapunov function f(x) = x². Then

E[f(Xn+1) − f(Xn) | Xn = x] = E[2Xn∆n + ∆n² | Xn = x] = 2xµ1(x) + µ2(x) ≤ −ε,

outside the set [0, x0] ∩ Σ, which is finite since Σ is locally finite. Thus we may apply Theorem 2.6.4 to deduce that Xn is positive recurrent. △
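As a concrete numerical check of this computation (our own illustration, not from the text), take the birth-and-death chain of Example 3.2.5 below with c = 2, so that µ1(x) = −c/x exactly and µ2(x) = 1; the drift of f(x) = x² then equals 2xµ1(x) + µ2(x) = 1 − 2c = −3 for every x:

```python
# Drift of the Lyapunov function f(x) = x^2 for the chain with down-step
# probability p(x, x-1) = p_x = (1 + c/x)/2 (exact, no o-term), x > c.
def drift_f(c, x):
    px = 0.5 * (1 + c / x)
    return px * (x - 1) ** 2 + (1 - px) * (x + 1) ** 2 - x ** 2

c = 2.0
for x in [4, 10, 100]:
    d = drift_f(c, x)
    assert abs(d - (1 - 2 * c)) < 1e-9  # equals 2*x*mu1(x) + mu2(x) = -3
    print(x, d)
```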

The basic recurrence classification is as follows; part (iii) is the statement from Example 3.2.2.

Theorem 3.2.3. Suppose that (M0) and (M1) hold. Then the following recurrence conditions are valid.

(i) Xn is transient if there exist ε, x0 ∈ (0,∞) such that

2xµ1(x)− µ2(x) > ε, for all x ≥ x0.


(ii) Xn is null-recurrent if there exist x0 ∈ (0,∞), θ > 0, and v > 0 such that for all x ≥ x0, µ2(x) ≥ v and

2x|µ1(x)| ≤ (1 + (1 − θ)/log x) µ2(x).

(iii) Xn is positive-recurrent if there exist ε, x0 ∈ (0,∞) such that

2xµ1(x) + µ2(x) < −ε, for all x ≥ x0.

Remark 3.2.4. More compactly, (i) says lim inf_{x→∞} (2xµ1(x) − µ2(x)) > 0, and similarly (iii) says lim sup_{x→∞} (2xµ1(x) + µ2(x)) < 0.

Example 3.2.5. As in Section 2.2, consider the simple random walk on Z+ with transition probabilities p(x, x − 1) = px = 1 − p(x, x + 1) for x ≥ 1 and p(0, 0) = p0 = 1 − p(0, 1), where px ∈ (0, 1) for all x ≥ 0. Suppose that, for some c ∈ R,

px = (1/2)(1 + c/x) + o(x^{−1}).

Then µ1(x) = 1 − 2px = −(c + o(1))x^{−1} and µ2(x) = 1, so Theorem 3.2.3 says that this random walk is

• transient if c < −1/2;

• null recurrent if |c| < 1/2;

• positive recurrent if c > 1/2.

The boundary cases c = ±1/2 can go either way depending on the o(x^{−1}) term in px; if this term is actually o((x log x)^{−1}), say, Theorem 3.2.3(ii) shows that the case |c| = 1/2 is null recurrent. △
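The classification in this example can be illustrated exactly, without simulation, via the standard gambler's-ruin formula for birth-and-death chains (our own sketch; the helper escape_prob is ours, and we drop the o-term in px). Starting from state 1, the probability of reaching N before 0 is 1/Σ_{k=0}^{N−1} ρk with ρk = Π_{j=1}^{k} pj/(1 − pj); the walk is transient exactly when this stays bounded away from 0 as N → ∞:

```python
# Escape probability from state 1 for the chain of Example 3.2.5 with
# p_x = (1 + c/x)/2 exactly (requires |c| < 1 so that p_x is in (0,1)).
def escape_prob(c, N):
    total, rho = 1.0, 1.0
    for j in range(1, N):
        pj = 0.5 * (1 + c / j)        # down-step probability p(j, j-1)
        rho *= pj / (1 - pj)
        total += rho
    return 1.0 / total

for N in [10**3, 10**4]:
    print(N, escape_prob(-0.8, N), escape_prob(0.8, N))
# For c = -0.8 (transient) the escape probability stays bounded away from 0
# as N grows; for c = 0.8 (recurrent) it tends to 0.
```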

Other results that we present in this chapter include existence of passage-time moments and almost-sure bounds on trajectories. To simplify the statements in the Markovian case, it is convenient to assume the following.

(M2) Suppose that there exist a ∈ R and b ∈ (0,∞) such that

lim_{x→∞} µ2(x) = b, and lim_{x→∞} xµ1(x) = a.

In the case where (M0), (M1) and (M2) all hold, Theorem 3.2.3 implies that Xn is

• transient if 2a > b;

• null recurrent if |2a| < b;

• positive recurrent if 2a < −b.

As in Example 3.2.5, classifying the boundary cases requires stronger assumptions. The next result gives existence and non-existence of moments of return times. Recall from Section 2.1 that τ+x = min{n ≥ 1 : Xn = x}.

Theorem 3.2.6. Suppose that (M0), (M1) and (M2) hold for p > max{2, 2s}, where s > 0.

(i) For any x ∈ Σ, Ex[(τ+x)^s] < ∞ if s < (b − 2a)/(2b).

(ii) For any x ∈ Σ, Ex[(τ+x)^s] = ∞ if s > (b − 2a)/(2b).

The next result gives growth bounds on trajectories. For simplicity, we state the result on the logarithmic scale.

Theorem 3.2.7. Suppose that (M0), (M1) and (M2) hold.

(i) If 2a > b, then, a.s.,

lim_{n→∞} (log Xn)/(log n) = 1/2.

(ii) If |2a| ≤ b, then, a.s.,

lim sup_{n→∞} (log Xn)/(log n) = 1/2.

(iii) If 2a < −b and (M1) holds with p > (b − 2a)/b, then, a.s.,

lim sup_{n→∞} (log Xn)/(log n) = b/(b − 2a) ∈ (0, 1/2).

Example 3.2.8. Continuing from Example 3.2.5, consider the simple random walk on Z+ with

p(x, x − 1) = px = 1 − p(x, x + 1) = (1/2)(1 + c/x) + o(x^{−1}).

Then µ1(x) = 1 − 2px = −(c + o(1))x^{−1} and µ2(x) = 1, so (M2) holds with a = −c and b = 1, and Theorem 3.2.7 gives corresponding almost-sure bounds. For example, in the case c > 1/2 (positive recurrence), a.s.,

lim sup_{n→∞} (log Xn)/(log n) = 1/(1 + 2c) ∈ (0, 1/2),

demonstrating the sub-diffusive behaviour of the walk's upper envelope. △
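A quick Monte Carlo illustration of this sub-diffusive behaviour (an informal sketch, not from the text; the cap on c/x and all parameter choices are ours). For c = 1, so a = −1 and b = 1, the predicted exponent of the upper envelope is b/(b − 2a) = 1/3; a finite run gives only a rough estimate:

```python
import math, random

def run(c, n, seed):
    # Simulate the walk of Example 3.2.8 and return log(max X_k)/log(n).
    rng = random.Random(seed)
    x, peak = 0, 1
    for _ in range(n):
        if x == 0:
            x = 1 if rng.random() < 0.5 else 0
        else:
            # down-step probability, capped so that px <= 0.95 < 1;
            # the cap only affects small x, not the asymptotics
            px = 0.5 * (1 + min(c / x, 0.9))
            x += -1 if rng.random() < px else 1
        peak = max(peak, x)
    return math.log(peak) / math.log(n)

est = run(1.0, 200_000, seed=1)
print(f"estimated exponent {est:.2f}; predicted 1/3, diffusive value 1/2")
```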


3.3 General case

We now describe the more general setting that will occupy most of this chapter. Let (Xn, n ≥ 0) be a discrete-time stochastic process adapted to a filtration (Fn, n ≥ 0) and taking values in a subset S of R that is bounded in one direction and unbounded in the other. Without loss of generality, we locate the origin such that S ⊆ R+ with inf S = 0 and sup S = +∞.

Central objects in this chapter will be the conditional increment moments E[∆n^k | Fn], specifically for k = 1 and k = 2. Many of the conditions in the theorems will suppose that an inequality hold involving the Fn-measurable random variables E[∆n | Fn], E[∆n² | Fn], and Xn; such inequalities will have to hold a.s. and in an appropriate asymptotic sense (for large values of Xn). It is convenient therefore to introduce some notation for upper and lower bounds on the increment moments E[∆n^k | Fn] as ‘functions’ of Xn. To this end, shortly we will define µ̲k : S → R and µ̄k : S → R such that

µ̲k(Xn) ≤ E[∆n^k | Fn] ≤ µ̄k(Xn), a.s., for all n ≥ 0. (3.3)

If Xn is Markovian, then E[∆n^k | Fn] = E[∆n^k | Xn], a.s., and we can take

µ̲k(x) = inf_{n≥0} E[∆n^k | Xn = x], and µ̄k(x) = sup_{n≥0} E[∆n^k | Xn = x];

if additionally Xn is time-homogeneous then µ̲k(x) ≡ µ̄k(x) ≡ µk(x) as defined at (3.1).

Loosely speaking, in the general case E[∆n^k | Fn] involves additional randomness in Fn, once Xn has been fixed. Thus µ̄k(x) should be the (essential) supremum over this additional randomness given Xn = x. For µ̲k the situation is analogous.

Let us now formally define µ̲k and µ̄k. Let k be a positive integer. Suppose that E[∆n^k | Fn] exists for all n ≥ 0. By standard theory of conditional expectations (see e.g. Section 9.1 of [36]), for each n ≥ 0 there exist a Borel-measurable function ϕk,n : S → R and an Fn-measurable random variable ψk,n such that, a.s., E[ψk,n | Xn] = 0 and

E[∆n^k | Fn] = E[∆n^k | Xn] + ψk,n = ϕk,n(Xn) + ψk,n. (3.4)

Here ϕk,n(x) = E[∆n^k | Xn = x]. Then for x ∈ S define

µ̲k(x) := inf_{n≥0} { ϕk,n(x) + ess inf(ψk,n 1{Xn ≥ x}) }; (3.5)

µ̄k(x) := sup_{n≥0} { ϕk,n(x) + ess sup(ψk,n 1{Xn ≥ x}) }. (3.6)


Provided that the expectations in question exist, (3.5) and (3.6) define µ̲k(x) and µ̄k(x) as R-valued functions of x ∈ S such that µ̲k(x) ≤ µ̄k(x) for all x ∈ S, and the property (3.3) is satisfied.

Example 3.3.1. With the notation at (3.4), we may write

Xn+1 = Xn + ϕ1,n(Xn) + ψ1,n + ζn+1, (3.7)

where ζn+1 = ∆n − E[∆n | Fn] is Fn+1-measurable with E[ζn+1 | Fn] = 0; i.e., ζn is a martingale difference sequence. △

If ϕk,n(x) = ϕk(x) does not depend on n, then

|µ̄k(x) − µ̲k(x)| ≤ 2 sup_{n≥0} ess sup(|ψk,n| 1{Xn ≥ x}),

which, in many situations of interest, tends to zero as x → ∞ (and does so sufficiently fast if µ̄k(x) and µ̲k(x) also tend to zero) so that µ̄k(x) ∼ µ̲k(x). The following simple example is one such case, and also serves to show that even in elementary situations, some technical machinery along the lines of that presented in this section is not easily avoided.

Example 3.3.2. Let Sn be symmetric SRW on Zd, d ∈ N, and set Xn = ‖Sn‖. Let Fn = σ(S0, . . . , Sn). We consider E[∆n | Fn] = E[∆n | Sn]. Recalling the notation from Section 1.2, for e1, . . . , ed the standard orthonormal basis vectors, let Ud = {±e1, . . . , ±ed} denote the possible jumps of the walk. Then for x ∈ Zd,

E[∆n | Sn = x] = (1/(2d)) Σ_{e∈Ud} (‖x + e‖ − ‖x‖).

Here, a Taylor's formula calculation (as in Section 1.3) shows that

‖x + e‖ − ‖x‖ = ‖x‖( (1 + (2e·x + 1)/‖x‖²)^{1/2} − 1 )
= (1/2)·(2e·x + 1)/‖x‖ − (1/2)·(e·x)²/‖x‖³ + O(‖x‖^{−2}), (3.8)

where the implicit constant in the O(·) term depends only on d. So

E[∆n | Sn = x] = (1/(2‖x‖))(1 − 1/d) + O(‖x‖^{−2});


see equation (1.4). It follows that

ϕ1,n(x) = (1/(2x))(1 − 1/d) + O(x^{−2}), and |ψ1,n| 1{Xn ≥ x} = O(x^{−2}),

where, again, implicit constants in the error terms depend only on d. Hence

lim_{x→∞} x µ̲1(x) = lim_{x→∞} x µ̄1(x) = (1/2)(1 − 1/d).

Note that if d ≥ 2 the difference µ̄1(x) − µ̲1(x) is genuinely of order x^{−2}: for instance, the third-order Taylor term in (3.8) contains ‖x‖^{−5} Σ_{e∈Ud} (e·x)³ = θ(x)‖x‖^{−2} for θ(x) whose range includes a positive interval for all x with ‖x‖ sufficiently large. △

ℹ If Yn is Markov and Xn = f(Yn), then although Xn is typically not Markov, if Yn has reasonably homogeneous behaviour around level curves of f, then one can expect that µ̄k(x) and µ̲k(x) have similar asymptotics.

Some further discussion of the definitions in (3.5) and (3.6), and some illustrative examples, can be found at the end of this section.

Our basic assumption is now as follows.

(L0) Let Xn be a stochastic process adapted to a filtration Fn and taking values in S ⊆ R+ with inf S = 0 and sup S = +∞.

We impose a moments condition analogous to, but somewhat weaker than, condition (M1).

(L1) Suppose that for some p > 2, δ ∈ (0, p − 2], and C ∈ R+,

E[|∆n|^p | Fn] ≤ C(1 + Xn)^{p−2−δ}, a.s., for all n ≥ 0. (3.9)

Note that a sufficient condition for (L1) is that E[|∆n|^p | Fn] ≤ C, a.s., for some p > 2, some C < ∞, and all n ≥ 0; i.e., the δ = p − 2 case of (3.9).

We also will typically assume the following non-confinement condition:

(L2) Suppose that lim sup_{n→∞} Xn = +∞, a.s.

Remarks 3.3.3. (a) Since Xn < ∞ for all n, lim sup_{n→∞} Xn = ∞ is equivalent to sup_{n≥0} Xn = ∞.


(b) Condition (L2) is generally necessary for our questions of interest to be non-trivial, and is usually straightforward to verify in a particular application, as we discuss in Section 3.6 below. For example, if Xn is an irreducible time-homogeneous Markov chain and S is locally finite, then (L2) holds automatically: see Corollary 2.1.10.

We state one simple but useful sufficient condition for (L2). This does not rely on irreducibility as such, but the following related local escape property, which says that the process exits any interval [0, x] in a finite number of steps with positive probability, depending only on x.

(L3) Suppose that for each x ∈ R+ there exist rx ∈ N and δx > 0 such that, for all n ≥ 0,

P[ max_{n≤m≤n+rx} Xm ≥ x | Fn ] ≥ δx, on {Xn ≤ x}. (3.10)

Recall that for Xn a process on R+, σx = min{n ≥ 0 : Xn ≥ x}.

Proposition 3.3.4. Suppose that (L0) and (L3) hold. Then (L2) holds. Moreover, there exist Kx ∈ R+ and cx > 0 such that, for all n ≥ 0,

P[σx ≥ n | F0] ≤ Kx e^{−cx·n}, on {X0 < x}. (3.11)

Finally, a sufficient condition for (L3) is that for each x ∈ R+ there exist mx ∈ N and εx > 0 such that, for all n ≥ 0,

P[X_{n+mx} − Xn ≥ εx | Fn] ≥ εx, on {Xn ≤ x}. (3.12)

Example 3.3.5. As in Example 3.1.1, let (ξn, n ≥ 0) be a time-homogeneous Markov chain on state space Σ ⊆ Rd with increments θn := ξn+1 − ξn. Suppose that there exists ε > 0 for which

inf_{u∈S^{d−1}} P[θn · u ≥ ε | ξn = x] ≥ ε, for all x ∈ Σ; (3.13)

the property (3.13) is uniform ellipticity. Then, for Xn = ‖ξn‖, writing x̂ = x/‖x‖ for x ≠ 0 (and taking x̂ to be an arbitrary unit vector if x = 0), we have

P[Xn+1 − Xn ≥ ε | ξn = x] ≥ P[x̂ · (x + θn) − ‖x‖ ≥ ε | ξn = x] ≥ P[x̂ · θn ≥ ε | ξn = x] ≥ ε,

by (3.13). Hence (3.12) holds for all x ∈ R+ with mx = 1 and εx = ε, and Proposition 3.3.4 shows that lim sup_{n→∞} ‖ξn‖ = ∞, a.s. △
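For concreteness, the uniform ellipticity condition (3.13) can be verified exhaustively for SRW on Z² with ε = 1/4 (a sketch under our own choice of ε; since every unit vector u has a coordinate of absolute value at least 1/√2, some move ±ei satisfies θ·u ≥ 1/√2 > 1/4, an event of probability 1/4):

```python
import math, random

# Check of (3.13) for SRW on Z^2 with eps = 1/4: for every unit vector u,
# the fraction of the four unit moves with dot product >= eps is >= eps.
def ellipticity_prob(u, eps):
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    hits = sum(1 for m in moves if m[0] * u[0] + m[1] * u[1] >= eps)
    return hits / 4

rng = random.Random(0)
eps = 0.25
for _ in range(1000):
    a = rng.uniform(0, 2 * math.pi)
    u = (math.cos(a), math.sin(a))
    assert ellipticity_prob(u, eps) >= eps
print("uniform ellipticity holds with eps =", eps)
```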


Proof of Proposition 3.3.4. Fix x ≥ 0. Write r = rx, δ = δx, and σ = σx for ease of notation. By (3.10), P[max_{n≤m≤n+r} Xm ≥ x | Fn] ≥ δ, on {n < σ}. Hence,

P[σ > n + r | Fn] ≤ 1 − δ, on {n < σ}. (3.14)

Now, the n = 0 case of (3.14) says that P[σ > r | F0] ≤ 1 − δ on {X0 < x}. Hence, conditioning on Fr and using the n = r case of (3.14),

P[σ > 2r | F0] = E[ 1{σ > r} P[σ > 2r | Fr] | F0 ] ≤ (1 − δ) P[σ > r | F0] ≤ (1 − δ)², on {X0 < x}.

Iterating this argument shows that, on {X0 < x}, P[σ > kr | F0] ≤ (1 − δ)^k for any integer k ≥ 0. Any integer n has kn·r < n ≤ (kn + 1)r for some integer kn, and then, on {X0 < x},

P[σ > n | F0] ≤ P[σ > kn·r | F0] ≤ (1 − δ)^{kn} ≤ (1 − δ)^{(n/r)−1},

which yields (3.11) for constants cx and Kx depending on δ = δx and r = rx.

Now we deduce (L2). Let λ0 = min{n ≥ 0 : Xn ≤ x}, and define recursively, for k ∈ N, λk = min{n ≥ λk−1 + r : Xn ≤ x}, with the usual convention min ∅ = ∞. Set L = inf{k ≥ 0 : λk = ∞}. Thus we have defined stopping times λk such that λk−1 < λk for all k ≤ L, and λk = ∞ for all k ≥ L. Also define for k ∈ N the event

Ek = { max_{λk−1 ≤ n ≤ λk−1+r} Xn ≥ x } ∪ {λk = ∞}.

If λk < ∞, then Xλk ≤ x and so, by (3.10), P[Ek+1 | Fλk] ≥ δ on {λk < ∞}. On the other hand, if λk = ∞ then λk+1 = ∞ as well, by definition, so that P[Ek+1 | Fλk] = 1 on {λk = ∞}. Hence we obtain

P[Ek+1 | Fλk] ≥ δ, a.s.

Since Ek ∈ Fλk, it follows from Lévy's version of the Borel–Cantelli lemma (Theorem 2.3.19) that {Ek i.o.} occurs a.s. Hence either L < ∞ or each of {Xn ≤ x} and {Xn ≥ x} occurs for infinitely many n. In either case, lim sup_{n→∞} Xn ≥ x, a.s. Since x ∈ R+ was arbitrary, lim sup_{n→∞} Xn = ∞, a.s.


It remains to show that (3.12) implies (3.10). Fix x ≥ 0. Write m = mx, ε = εx, and σ = σx. For ease of notation, with no loss of generality, we take n = 0. Define for k ∈ N the events

Ak = {X_{km} − X_{(k−1)m} ≥ ε}, and Bk = Ak ∪ {σ ≤ m(k − 1)}.

Then Bk ∈ F_{km} and

P[Bk+1 | F_{km}] ≥ P[Ak+1 | F_{km}] 1{km < σ} + P[σ ≤ mk | F_{km}] 1{km ≥ σ} ≥ ε, a.s.,

by (3.12), since on {km < σ} we have X_{km} ≤ x. Hence by a similar iterated conditioning argument to above, we obtain P[∩_{k=1}^r Bk | F0] ≥ ε^r. Taking rx = min{r ∈ N : rε > x} and δx = ε^{rx}, we have that with probability at least δx, ∩_{k=1}^{rx} Bk occurs, which implies that either ∩_{k=1}^{rx} Ak occurs, or σ < m·rx. In the former case, X_{m·rx} ≥ X0 + rx·ε > x. Hence P[max_{0≤k≤m·rx} Xk ≥ x | F0] ≥ δx, which implies (3.10).

To end this section, we briefly discuss further the definitions in (3.5) and (3.6), and give some examples for particular classes of process Xn.

Moments Markov property

Somewhat weaker than the full Markov property is the assumption that

E[∆n | Fn] = µ1(Xn), and E[∆n² | Fn] = µ2(Xn), a.s., (3.15)

for measurable functions µ1 and µ2. In the notation at (3.4), (3.15) asserts that ψk,n = 0 a.s. for k ∈ {1, 2}. This moments Markov property is the standing assumption in papers such as [142] motivated by growth models and phrased in terms of stochastic difference equations. Indeed, under (3.15) one may write, similarly to (3.7),

Xn+1 = Xn + µ1(Xn) + ζn+1,

where E[ζn+1 | Fn] = 0 and E[ζn+1² | Fn] = µ2(Xn) − µ1(Xn)², so ζn is a martingale difference sequence adapted to Fn.

History-dependent processes

Suppose that Fn = σ(X0, . . . , Xn) and the law of Xn+1 depends only upon (X0, . . . , Xn). For convenience, take S to be countable. Then we can write

E[∆n | Fn] = Σ_{x0,...,xn∈S} E[∆n | X0 = x0, . . . , Xn = xn] 1{X0 = x0, . . . , Xn = xn}.

In this case

µ̄1(x) = sup_{n∈Z+} sup E[∆n | X0 = x0, . . . , Xn−1 = xn−1, Xn = x],

where the second supremum is taken over x0, . . . , xn−1 ∈ S for which P[X0 = x0, . . . , Xn−1 = xn−1 | Xn = x] > 0. There is an analogous expression for µ̲1.

Functions of Markov processes

Suppose that Yn is a time-homogeneous Markov process on a general state-space Σ, and that for some measurable function f : Σ → R+, Xn = f(Yn). Set Fn = σ(Y0, . . . , Yn). Then Xn has state-space S = f(Σ). Now E[∆n | Fn] = E[∆n | Yn], a.s., and if Σ (hence S) is countable, we may write

E[∆n | Fn] = Σ_{x∈S} Σ_{y∈Σ: f(y)=x} E[∆n | Yn = y] 1{Yn = y, Xn = x}.

In this case,

µ̄1(x) = sup_{n∈Z+} sup_{y∈Σ: f(y)=x, P[Yn=y|Xn=x]>0} E[∆n | Yn = y],

and similarly for µ̲1. This situation often arises in applications, where f may be, for instance, a Lyapunov-type function applied to a multi-dimensional Markov process.
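A minimal illustration of this setup (our own, with the hypothetical helper radial_drift_at): take Yn to be SRW on Z and Xn = f(Yn) = |Yn|. For each x > 0 both preimages y = ±x give the same conditional mean increment, so here µ̲1 and µ̄1 coincide; the only irregular point is x = 0:

```python
# For SRW on Z and X_n = |Y_n|, compute E[|Y_{n+1}| - |Y_n| | Y_n = y] and
# take min/max over the preimages y in {x, -x} of each level x = |y|.
def radial_drift_at(y):
    return 0.5 * (abs(y + 1) - abs(y)) + 0.5 * (abs(y - 1) - abs(y))

for x in range(0, 5):
    drifts = {radial_drift_at(y) for y in (x, -x)}
    mu_lower, mu_upper = min(drifts), max(drifts)
    print(x, mu_lower, mu_upper)
# x = 0 gives drift 1; every x >= 1 gives drift 0 from both y = x and y = -x.
```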

3.4 Lyapunov functions

The results in this chapter will be obtained by application of the semimartingale results of Chapter 2 to suitably chosen Lyapunov functions of the process Xn. For γ ∈ R and ν ∈ R we define

fγ,ν(x) := x^γ log^ν x, if x ≥ e; fγ,ν(x) := e^γ, if x < e. (3.16)

The piecewise definition in (3.16) ensures that the function fγ,ν : R+ → (0,∞) is well defined for all γ and ν.

The analysis in this chapter will be built on the fact that fγ,ν(Xn) satisfies a submartingale or supermartingale condition, for Xn outside some bounded set, for appropriate γ and ν. The next result gives the necessary increment estimates that we will need throughout the chapter; we state the result in some generality to avoid repeating very similar computations several times.

Lemma 3.4.1. Let γ ∈ R and ν ∈ R. Suppose that (L0) and (L1) hold, with p > 2 and δ ∈ (0, p − 2] such that −δ < γ < p. Then there exists ε0 = ε0(p, δ, γ) ∈ (0, 1/2) such that, for any ε ∈ (0, ε0), a.s.,

E[fγ,ν(Xn+1) − fγ,ν(Xn) | Fn]
= γ( Xn E[∆n | Fn] + ((γ − 1)/2) E[∆n² | Fn] ) fγ−2,ν(Xn)
+ ν( Xn E[∆n | Fn] + ((2γ − 1)/2) E[∆n² | Fn] ) fγ−2,ν−1(Xn)
+ ( ν(ν − 1) + o_{Xn}^{Fn}(1) ) E[∆n² | Fn] fγ−2,ν−2(Xn) + O_{Xn}^{Fn}(Xn^{γ−2−ε}). (3.17)

Moreover, if γ ∈ (0, p) then, for any ε ∈ (0, ε0), a.s.,

E[Xn+1^γ − Xn^γ | Fn] = γ( Xn E[∆n | Fn] + ((γ − 1)/2) E[∆n² | Fn] ) Xn^{γ−2} + O_{Xn}^{Fn}(Xn^{γ−2−ε}). (3.18)

To prepare for the proof of Lemma 3.4.1, we first state a technical result. For ε ∈ (0, 1), define the event

Eε(n) := {|∆n| ≤ (1 + Xn)^{1−ε}}. (3.19)

Denote the complementary event by Eε^c(n).

Lemma 3.4.2. Suppose that (L0) and (L1) hold, with p > 2, δ ∈ (0, p − 2], and C ∈ R+. Then for any ε ∈ (0, 1), any q ∈ [0, p], and all n ≥ 0,

E[|∆n|^q 1(Eε^c(n)) | Fn] ≤ C(1 + Xn)^{q−2−δ+pε}, a.s. (3.20)

Moreover, for any ε ∈ (0, δ/(1 + p)) and any q ∈ [0, p], for all n ≥ 0,

E[|∆n|^q 1(Eε^c(n)) | Fn] ≤ C(1 + Xn)^{q−2−ε}, a.s. (3.21)

Proof. For q ∈ [0, p], by definition of Eε(n),

|∆n|^q 1(Eε^c(n)) = |∆n|^p |∆n|^{q−p} 1(Eε^c(n)) ≤ |∆n|^p (1 + Xn)^{(1−ε)(q−p)}.

Taking expectations conditioned on Fn and using (3.9) we obtain

E[|∆n|^q 1(Eε^c(n)) | Fn] ≤ C(1 + Xn)^{p−2−δ+(1−ε)(q−p)}, a.s.,

which implies (3.20), from which (3.21) also follows.


Proof of Lemma 3.4.1. Fix γ, ν ∈ R. Throughout the proof we write just f for fγ,ν. As we are interested in asymptotics when Xn is large, we may suppose throughout that Xn > 2e. For any ε ∈ (0, 1),

f(Xn+1) − f(Xn) = (f(Xn + ∆n) − f(Xn)) 1(Eε(n)) + (f(Xn + ∆n) − f(Xn)) 1(Eε^c(n)).

We estimate the expectation on Eε(n) here using a Taylor expansion, while we use Lemma 3.4.2 to control the expectation on Eε^c(n). Write

ε0 = ε0(p, δ, γ) = min{δ, γ + δ, 1 + p}/(2 + 2p), (3.22)

and suppose that ε ∈ (0, 2ε0); note that since −δ < γ < p, we have 2ε0 ∈ (0, 1). Differentiation of f with respect to x > e gives

f′(x) = ( γ + ν log^{−1} x ) x^{−1} f(x);
f′′(x) = ( γ(γ − 1) + ν(2γ − 1) log^{−1} x + ν(ν − 1) log^{−2} x ) x^{−2} f(x);

and f′′′(x) = O(x^{γ−3} log^ν x). Taylor's formula with Lagrange remainder says that for any x, h ∈ R with x > e and x + h > e,

f(x + h) − f(x) = h f′(x) + (h²/2) f′′(x) + (h³/6) f′′′(x + ϕh),

for some ϕ = ϕ(x, h) ∈ [0, 1]. Taking x = Xn and h = ∆n, we have on Eε(n) that Xn + ϕ∆n > Xn/2 > e, say, for all Xn sufficiently large. Hence we can find constants C and x1 so that the third-order term in the Taylor expansion of f(Xn + ∆n) − f(Xn) on Eε(n) satisfies

|∆n³ f′′′(Xn + ϕ∆n)| 1(Eε(n)) ≤ C |∆n|³ Xn^{γ−3} log^ν Xn 1(Eε(n))
≤ C ∆n² Xn^{γ−2−ε} log^ν Xn
≤ C ∆n² Xn^{γ−2−(ε/2)}, on {Xn > x1}.

Hence

(f(Xn + ∆n) − f(Xn)) 1(Eε(n)) = ( ∆n f′(Xn) + (∆n²/2) f′′(Xn) ) 1(Eε(n)) + ∆n² O_{Xn}^{Fn}(Xn^{γ−2−(ε/2)}).

Collecting terms, we obtain

(f(Xn + ∆n) − f(Xn)) 1(Eε(n))
= γ( Xn∆n + ((γ − 1)/2) ∆n² ) f(Xn) Xn^{−2} 1(Eε(n))
+ ν( Xn∆n + ((2γ − 1)/2) ∆n² ) f(Xn) Xn^{−2} log^{−1} Xn 1(Eε(n))
+ ( ν(ν − 1) + o_{Xn}^{Fn}(1) ) ∆n² f(Xn) Xn^{−2} log^{−2} Xn 1(Eε(n)),

where the o_{Xn}^{Fn}(1) error term encompasses the third-order Taylor term. To take (conditional) expectations here, we observe that, since (3.9) holds for some p > 2, and ε ∈ (0, 2ε0) satisfies ε < δ/(1 + p), we have from the q ∈ {1, 2} cases of Lemma 3.4.2 that, a.s.,

E[∆n 1(Eε(n)) | Fn] = E[∆n | Fn] + O_{Xn}^{Fn}(Xn^{−1−ε});
E[∆n² 1(Eε(n)) | Fn] = E[∆n² | Fn] + O_{Xn}^{Fn}(Xn^{−ε}).

It follows that

E[ (f(Xn + ∆n) − f(Xn)) 1(Eε(n)) | Fn ]
= γ( Xn E[∆n | Fn] + ((γ − 1)/2) E[∆n² | Fn] ) Xn^{γ−2} log^ν Xn
+ ν( Xn E[∆n | Fn] + ((2γ − 1)/2) E[∆n² | Fn] ) Xn^{γ−2} log^{ν−1} Xn
+ ( ν(ν − 1) + o_{Xn}^{Fn}(1) ) E[∆n² | Fn] Xn^{γ−2} log^{ν−2} Xn
+ O_{Xn}^{Fn}(Xn^{γ−2−(ε/2)}). (3.23)

On the other hand, we claim that for any ε ∈ (0, 2ε0), a.s.,

E[ |f(Xn + ∆n) − f(Xn)| 1(Eε^c(n)) | Fn ] = O_{Xn}^{Fn}(Xn^{γ−2−(ε/2)}). (3.24)

To verify (3.24), first suppose that γ < 0. Then fγ,ν is eventually decreasing, and there exists C ∈ R+ such that 0 ≤ fγ,ν(x) ≤ C for all x ∈ R+. Hence

E[ |f(Xn + ∆n) − f(Xn)| 1(Eε^c(n)) | Fn ] ≤ C P[Eε^c(n) | Fn],

which is bounded above by a constant times Xn^{−2−δ+pε}, by the q = 0 case of (3.20). By choice of ε ∈ (0, 2ε0), we have ε < (γ + δ)/(1 + p), so that −δ + pε < γ − ε; this establishes the claim (3.24) in the case γ ∈ (−δ, 0).

Next suppose that γ ≥ 0. Now, for any ε′ > 0, there exists C ∈ R+ such that 0 ≤ fγ,ν(x) ≤ C(1 + x)^{γ+ε′} for all x ∈ R+. Hence for any ε′ > 0 there exists C < ∞ such that

|f(Xn + ∆n) − f(Xn)| ≤ C(1 + Xn)^{γ+ε′} + C|∆n|^{γ+ε′}, a.s. (3.25)

To see (3.25), note f(Xn + ∆n) ≤ C(1 + (3/2)Xn)^{γ+ε′} on {|∆n| ≤ (1/2)Xn}, and

f(Xn + ∆n) ≤ C(1 + 3|∆n|)^{γ+ε′} ≤ C(4|∆n|)^{γ+ε′}, on {|∆n| > (1/2)Xn},

since Xn > 2. Then by (3.25), E[ |f(Xn + ∆n) − f(Xn)| 1(Eε^c(n)) | Fn ] is bounded above by

C(1 + Xn)^{γ+ε′} P[Eε^c(n) | Fn] + C E[ |∆n|^{γ+ε′} 1(Eε^c(n)) | Fn ]. (3.26)

Given that (3.9) holds for p > γ, we may choose ε′ > 0 small enough so that 0 ≤ γ + ε′ ≤ p. By choice of ε ∈ (0, 2ε0), we have ε < δ/(1 + p), so that both terms in (3.26) are bounded above by a constant times Xn^{γ+ε′−2−ε}, by the q = 0 and q = γ + ε′ cases of (3.21) respectively. Taking ε′ > 0 small enough (ε′ < ε/2, say) this establishes the claim (3.24) in the case γ ∈ [0, p).

Combining (3.24) with (3.23) we obtain (3.17), but with ε in place of ε/2, which lies in the interval (0, ε0). The expression in (3.18) is essentially the ν = 0 case of (3.17); the difference near zero between fγ,0 as defined at (3.16) and the function x^γ does not affect the Taylor estimate in the analogue of (3.23), and the analogue of (3.24) goes through provided γ > 0.

For some of our results we need to study the case where fγ,ν as defined at (3.16), with positive γ, satisfies a local submartingale property; application often requires uniform integrability. A convenient technical device is to work with a truncated version of the function. Specifically, for x > 0 define

fγ,ν^x(y) := min{fγ,ν(y), fγ,ν(2x)}. (3.27)

Lemma 3.4.3. Let γ ∈ (0,∞) and ν ∈ R+. Suppose that (L0) and (L1) hold, with p > max{2, γ}. Then there exists ε0 = ε0(p, δ, γ) > 0 such that, for any ε ∈ (0, ε0), there exists x0 = x0(ε) ∈ R+ such that, for any x ≥ x0, on {Xn ≤ x},

E[fγ,ν^x(Xn+1) − fγ,ν^x(Xn) | Fn]
≥ γ( Xn E[∆n | Fn] + ((γ − 1)/2) E[∆n² | Fn] ) fγ−2,ν(Xn)
+ ν( Xn E[∆n | Fn] + ((2γ − 1)/2) E[∆n² | Fn] ) fγ−2,ν−1(Xn)
+ ( ν(ν − 1) + o_{Xn}^{Fn}(1) ) E[∆n² | Fn] fγ−2,ν−2(Xn) + O_{Xn}^{Fn}(Xn^{γ−2−ε}), (3.28)

where the implicit constant in the error term does not depend on x.


Proof. Write f for fγ,ν and f^x for f^x_{γ,ν}; note that both f and f^x are non-decreasing, and they coincide on [0, 2x]. Let ε0 ∈ (0, 1/2) be the constant in the statement of Lemma 3.4.1, as given at (3.22). Fix ε ∈ (0, 2ε0). Then {Xn ≤ x} ∩ Eε(n) implies that {X_{n+1} ≤ 2x} provided x ≥ x0 for some x0 = x0(ε) sufficiently large, so that f^x(X_{n+1}) = f(X_{n+1}). Fix x ≥ x0. Then, on {Xn ≤ x},

f^x(X_{n+1}) − f^x(Xn) ≥ ( f(X_{n+1}) − f(Xn) ) 1(Eε(n)) − f(Xn) 1{∆n > (1 + Xn)^{1−ε}},

so that, on {Xn ≤ x},

E[f^x(X_{n+1}) − f^x(Xn) | Fn] ≥ E[( f(X_{n+1}) − f(Xn) ) 1(Eε(n)) | Fn] − f(Xn) P[E^c_ε(n) | Fn].

Since ε ∈ (0, 2ε0) satisfies ε < δ/(1 + p), the q = 0 case of (3.21) shows that

f(Xn) P[E^c_ε(n) | Fn] = O^{Fn}_{Xn}(Xn^{γ−2−(ε/2)}).

Combined with (3.23) this shows that, on {Xn ≤ x}, (3.28) holds, once ε/2 is replaced by ε ∈ (0, ε0).

3.5 Recurrence classification

We need to explain what we mean by ‘recurrence’ or ‘transience’ in the general setting introduced in Section 3.3. In this section we give conditions under which one or other of the following two behaviours (which are a priori not exhaustive) occurs (cf. Definition 1.5.1):

• lim_{n→∞} Xn = +∞, a.s., in which case we say that Xn is transient;

• lim inf_{n→∞} Xn ≤ r0, a.s., for some constant r0 ∈ R+, in which case we say that Xn is recurrent.

First we give a condition for transience.

Theorem 3.5.1. Suppose that (L0), (L1) and (L2) hold. Suppose that

lim sup_{x→∞} µ̄2(x) < ∞, and lim inf_{x→∞} ( 2x µ̲1(x) − µ̄2(x) ) > 0. (3.29)

Then lim_{n→∞} Xn = +∞, a.s.


The next result gives a condition for recurrence.

Theorem 3.5.2. Suppose that (L0), (L1) and (L2) hold. Suppose that lim inf_{x→∞} µ̲2(x) > 0, and that there exist x0 ∈ R+ and θ > 0 such that

2xµ̄1(x) ≤ ( 1 + (1 − θ)/log x ) µ̲2(x), for all x ≥ x0. (3.30)

Then there exists r0 ∈ R+ such that lim inf_{n→∞} Xn ≤ r0, a.s.

We give two important examples.

Example 3.5.3. Consider Sn, symmetric simple random walk on Z^d, and let Xn = ‖Sn‖ and Fn = σ(S0, . . . , Sn). Either Sn is transient and lim_{n→∞} Xn = ∞ a.s., or else Sn is recurrent and lim inf_{n→∞} Xn = 0 a.s.: see Section 3.6 for a more general discussion. Example 3.3.2 shows that

2xµ̲1(x) = 1 − (1/d) + O(x^{−2}), and 2xµ̄1(x) = 1 − (1/d) + O(x^{−2}).

Second moments can be computed similarly to Example 3.3.2 using Taylor's formula. Alternatively, we can use the squared-difference identity (2.23) in the form

E[∆n² | Sn] = E[X_{n+1}² − Xn² | Sn] − 2Xn E[∆n | Sn],

and the fact, obtained by expanding ‖Sn + (S_{n+1} − Sn)‖², that

X_{n+1}² − Xn² = 2Sn · (S_{n+1} − Sn) + ‖S_{n+1} − Sn‖², (3.31)

so E[X_{n+1}² − Xn² | Sn] = 1, to obtain

µ̲2(x) = (1/d) + O(x^{−1}), and µ̄2(x) = (1/d) + O(x^{−1}).

Thus (3.29) holds provided d ≥ 3, so Theorem 3.5.1 implies transience. On the other hand, (3.30) holds if d ∈ {1, 2}, and Theorem 3.5.2 implies recurrence. This is Lamperti's proof of Pólya's theorem on simple random walk, as advertised in Section 1.3. △

Example 3.5.4. We extend Pólya's theorem from Example 3.5.3 to a class of zero-drift random walks, as advertised in Section 1.5. Similarly to Example 3.1.1, let (ξn, n ≥ 0) be a time-homogeneous Markov chain with state space Σ ⊆ R^d with increments θn := ξ_{n+1} − ξn such that, for some B ∈ R+ and all x ∈ Σ,

P[‖θn‖ ≤ B | ξn = x] = 1, and E[θn | ξn = x] = 0.

Suppose in addition that the increment covariance matrix is constant, i.e., with θn as a column vector, for all x ∈ Σ,

E[θn θn^⊤ | ξn = x] = M,

for some positive definite, symmetric d by d matrix M. There exists a (unique) positive definite, symmetric matrix square-root M^{1/2}, with matrix inverse M^{−1/2}, such that M^{1/2} M^{1/2} = M. The matrix M^{−1/2} defines a linear transformation of R^d via x ↦ M^{−1/2} x (x ∈ R^d); set ζn := M^{−1/2} ξn. Then ϕn := ζ_{n+1} − ζn = M^{−1/2} θn, and, by linearity,

E[ϕn | ζn = x] = M^{−1/2} E[θn | ξn = M^{1/2} x] = 0, and

E[ϕn ϕn^⊤ | ζn = x] = M^{−1/2} E[θn θn^⊤ | ξn = M^{1/2} x] M^{−1/2},

which is the identity. Since recurrence and transience properties transfer between ξn and ζn, this transformation shows that it is sufficient to take M = I, the identity.
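A minimal numerical sketch of this whitening step (not from the book; the helper names `sqrtm2`, `inv2` and `matmul` are ours, and we use the classical 2×2 closed form M^{1/2} = (M + √(det M) I)/√(tr M + 2√(det M)), which follows from the Cayley–Hamilton theorem):

```python
import math

def sqrtm2(M):
    """Symmetric square root of a 2x2 positive definite symmetric matrix."""
    (a, b), (c, d) = M
    s = math.sqrt(a * d - b * c)      # sqrt(det M)
    t = math.sqrt(a + d + 2 * s)      # sqrt(tr M + 2 sqrt(det M))
    return [[(a + s) / t, b / t], [c / t, (d + s) / t]]

def inv2(A):
    """Inverse of a 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

M = [[2.0, 0.5], [0.5, 1.0]]            # an example covariance matrix
R = sqrtm2(M)                           # M^{1/2}
W = inv2(R)                             # M^{-1/2}
cov_whitened = matmul(matmul(W, M), W)  # M^{-1/2} M M^{-1/2} = identity
print(cov_whitened)
```

The whitened increments ϕn = M^{−1/2} θn then have identity covariance, as in the text.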

So suppose M = I. Let Xn = ‖ξn‖. Similarly to (3.31), we have

E[X_{n+1}² − Xn² | ξn] = 2 ξn · E[θn | ξn] + E[‖θn‖² | ξn] = 0 + tr M = d. (3.32)

Moreover, using the bound ‖θn‖ ≤ B and the Taylor expansion (1 + y)^{1/2} = 1 + (y/2) − (y²/8) + O(y³), similarly to the calculation in Section 1.5,

E[X_{n+1} − Xn | ξn] = ‖ξn‖ E[ ( 1 + (2 ξn · θn + ‖θn‖²)/‖ξn‖² )^{1/2} − 1 | ξn ]

= (ξn · E[θn | ξn])/‖ξn‖ + E[‖θn‖² | ξn]/(2‖ξn‖) − E[(ξn · θn)² | ξn]/(2‖ξn‖³) + O(‖ξn‖^{−2})

= (1/(2Xn)) ( tr M − ξ̂n^⊤ M ξ̂n ) + O(Xn^{−2}),

where ξ̂n := ξn/‖ξn‖, using the fact that E[θn | ξn] = 0. Hence, since M = I, we obtain

E[X_{n+1} − Xn | ξn] = (d − 1)/(2Xn) + O(Xn^{−2}). (3.33)


Thus with (3.32), (3.33), and the usual trick based on (2.23), we obtain

2xµ̲1(x) = d − 1 + O(x^{−1}) = 2xµ̄1(x), and µ̲2(x) = 1 + O(x^{−1}) = µ̄2(x).

Thus (3.29) holds provided d ≥ 3, so Theorem 3.5.1 implies transience, while (3.30) holds if d ∈ {1, 2} and Theorem 3.5.2 implies recurrence. This example is a preview of Chapter 4; there we relax the uniform increment bound, as well as considering walks with non-zero drift. △

Remarks 3.5.5. (a) The calculations in Examples 3.5.3 and 3.5.4 suggest informally that the recurrence phase transition for this class of zero-drift many-dimensional random walks occurs at d = 2 (rather than elsewhere in the interval [2, 3]). This transition is probed more precisely in Chapter 4.

(b) For condition (3.30), it is sufficient (take θ = 1) that 2xµ̄1(x) − µ̲2(x) ≤ 0 for all x large enough; however, we needed to take θ < 1 to settle the critical d = 2 case in Examples 3.5.3 and 3.5.4.

We now turn to the proofs of Theorems 3.5.1 and 3.5.2. We proceed under the minimal assumption (L0) for much of the argument, in part to highlight exactly where the non-confinement assumption (L2) enters, but also so that the results that follow can be applied to situations where (L2) fails (which is the case, for example, in the context of branching processes or other processes with absorption: see Example 3.5.9).

Consider the random variable X⋆ := sup_{n≥0} Xn ∈ [0, ∞]. In principle, there are three possibilities:

(a) ess sup X⋆ < ∞, i.e., X⋆ is uniformly bounded, a.s.;

(b) ess inf X⋆ < ∞ but ess sup X⋆ = ∞;

(c) ess inf X⋆ = ∞, i.e., X⋆ = ∞ a.s.

Here, (c) is the case when (L2) holds, case (a) is generally uninteresting, and case (b) pertains in, for instance, the case where there is some non-trivial probability that the process gets absorbed or trapped.

The following semimartingale result is a ‘transience’ result related to Theorem 2.5.7, but which applies in case (b) as well as case (c); we prove Theorem 3.5.1 by applying this result with a suitable Lyapunov function. Recall that σx = min{n ≥ 0 : Xn ≥ x}.

Theorem 3.5.6. Suppose that (L0) holds. Let f : R+ → R+ be such that sup_x f(x) < ∞, lim_{x→∞} f(x) = 0, and inf_{y≤x} f(y) > 0 for any x ∈ R+. Suppose also that there exists x0 ∈ R+ for which, for all n ≥ 0,

E[f(X_{n+1}) − f(Xn) | Fn] ≤ 0, on {Xn > x0}.


(i) If P[σx < ∞] > 0 for all x ∈ R+, then P[lim inf_{n→∞} Xn ≥ x] > 0 for all x ∈ R+.

(ii) If (L2) holds, then P[lim_{n→∞} Xn = ∞] = 1.

The core of Theorem 3.5.6 is a hitting probability estimate, which is a variation on Lemma 2.5.8.

Lemma 3.5.7. Let Xn be an Fn-adapted process taking values in R+. Let f : R+ → R+ be such that sup_x f(x) < ∞ and lim_{x→∞} f(x) = 0. Suppose that there exists x1 ∈ R+ for which inf_{y≤x1} f(y) > 0 and, for all n ≥ 0,

E[f(X_{n+1}) − f(Xn) | Fn] ≤ 0, on {Xn > x1}.

Then for any ε > 0 there exists x ∈ (x1, ∞) for which, for all n ≥ 0,

P[ inf_{m≥n} Xm ≥ x1 | Fn ] ≥ 1 − ε, on {Xn > x}.

Proof. Fix n ∈ Z+. For x1 ∈ R+ in the hypothesis of the lemma, write λ = min{m ≥ n : Xm ≤ x1} and set Ym = f(X_{m∧λ}). Then (Ym, m ≥ n) is an (Fm, m ≥ n)-adapted non-negative supermartingale, and so, by Theorem 2.3.6, converges a.s. as m → ∞ to some Y∞ ∈ R+. Moreover, by Theorem 2.3.11,

Yn ≥ E[Y∞ | Fn] ≥ E[Y∞ 1{λ < ∞} | Fn], a.s.

Here we have that, a.s.,

Y∞ 1{λ < ∞} = lim_{m→∞} Ym 1{λ < ∞} = f(Xλ) 1{λ < ∞} ≥ inf_{y≤x1} f(y) 1{λ < ∞}.

Combining these inequalities we obtain

inf_{y≤x1} f(y) P[λ < ∞ | Fn] ≤ Yn, a.s.

In particular, on {Xn > x ≥ x1}, we have Yn = f(Xn) and so

inf_{y≤x1} f(y) P[λ < ∞ | Fn] ≤ f(Xn) ≤ sup_{y>x} f(y).

Since lim_{y→∞} f(y) = 0 and inf_{y≤x1} f(y) > 0, given ε > 0 we can choose x > x1 large enough so that

sup_{y>x} f(y) / inf_{y≤x1} f(y) < ε;

the choice of x depends only on f, x1, and ε, and, in particular, does not depend on n. Then, on {Xn > x}, P[λ < ∞ | Fn] < ε, as claimed.
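The mechanism of Lemma 3.5.7 is visible already in the classical gambler's-ruin setting. The following sketch (not from the book; the helper name `hit_probability` and all parameter values are ours) uses the walk on Z with up-probability p > 1/2, for which f(y) = ((1−p)/p)^y is a non-negative martingale and the lemma's bound f(x)/inf_{y≤x1} f(y) = ((1−p)/p)^{x−x1} on the probability of ever reaching level x1 from x is in fact exact:

```python
import random

def hit_probability(p, x, x1, trials=5000, horizon=500, seed=1):
    """Monte Carlo estimate of P[walk started at x ever reaches level x1],
    for the walk on Z with up-probability p, truncated at `horizon` steps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        y = x
        for _ in range(horizon):
            y += 1 if rng.random() < p else -1
            if y <= x1:
                hits += 1
                break
    return hits / trials

p, x, x1 = 0.7, 10, 5
bound = ((1 - p) / p) ** (x - x1)   # supermartingale bound, here exact
est = hit_probability(p, x, x1)
print(est, bound)
```

The finite horizon only truncates a negligible part of the hitting probability, since unhit trajectories drift rapidly upwards.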


We now use the local result Lemma 3.5.7 to deduce the sample-path result Theorem 3.5.6.

Proof of Theorem 3.5.6. Under the conditions of Theorem 3.5.6, the hypotheses of Lemma 3.5.7 are satisfied for any x1 > x0. So for any x1 > x0 and any ε > 0, there exists x ∈ R+, x > x1, such that, on {Xn ≥ x},

P[ inf_{m≥n} Xm > x1 | Fn ] ≥ 1 − ε, a.s.

As usual, let σx = min{n ≥ 0 : Xn ≥ x}. Then, on {σx < ∞},

P[ inf_{m≥σx} Xm > x1 | Fσx ] ≥ 1 − ε, a.s.

But on {σx < ∞} ∩ {inf_{m≥σx} Xm > x1} we have lim inf_{m→∞} Xm ≥ x1, so

P[ lim inf_{m→∞} Xm ≥ x1 ] ≥ E[ P[ inf_{m≥σx} Xm > x1 | Fσx ] 1{σx < ∞} ] ≥ (1 − ε) P[σx < ∞].

Under the conditions of part (i) of the theorem, P[σx < ∞] > 0, and so the claim in part (i) follows. On the other hand, under the conditions of part (ii) of the theorem, P[σx < ∞] = 1, and so, since ε > 0 was arbitrary, lim inf_{m→∞} Xm ≥ x1, a.s. Then since x1 > x0 was arbitrary, the result follows.

Proof of Theorem 3.5.1. We apply Theorem 3.5.6 with Lyapunov function f = f_{0,ν} for ν < 0 as defined at (3.16). Taking γ = 0 in (3.17) and using the fact that lim sup_{x→∞} µ̄2(x) < ∞, we see that

E[f(X_{n+1}) − f(Xn) | Fn] ≤ (ν/2) ( 2Xn E[∆n | Fn] − E[∆n² | Fn] ) Xn^{−2} log^{ν−1} Xn + o^{Fn}_{Xn}(Xn^{−2} log^{ν−1} Xn). (3.34)

The assumption (3.29) says that there exist x1 ∈ R+ and ε0 > 0 such that 2Xn µ̲1(Xn) − µ̄2(Xn) ≥ ε0 on {Xn > x1}. Hence, by definition of µ̲1 and µ̄2,

2Xn E[∆n | Fn] − E[∆n² | Fn] ≥ 2Xn µ̲1(Xn) − µ̄2(Xn) ≥ ε0,

for all Xn sufficiently large. Since ν < 0, it follows that the right-hand side of (3.34) is negative for all Xn sufficiently large. Thus the conditions of Theorem 3.5.6 are satisfied, and hence lim_{n→∞} Xn = ∞, a.s.


Now we move on to recurrence. First we state a semimartingale relative of Theorem 2.5.2.

Theorem 3.5.8. Suppose that (L0) holds. Let f : R+ → R+ be such that f(x) → ∞ as x → ∞ and E f(X0) < ∞. Suppose that there exist x0 ∈ R+ and C < ∞ for which, for all n ≥ 0,

E[f(X_{n+1}) − f(Xn) | Fn] ≤ 0, on {Xn > x0}, a.s.;

E[f(X_{n+1}) − f(Xn) | Fn] ≤ C, on {Xn ≤ x0}, a.s.

Then

P[ {lim sup_{n→∞} Xn < ∞} ∪ {lim inf_{n→∞} Xn ≤ x0} ] = 1.

In particular, if (L2) also holds, we have lim inf_{n→∞} Xn ≤ x0, a.s.

Proof. First note that, by hypothesis, E f(X1) ≤ E f(X0) + C < ∞, and, iterating this argument, it follows that E f(Xn) < ∞ for all n ≥ 0.

Fix n ∈ Z+. For x0 ∈ R+ in the hypothesis of the theorem, write λ = min{m ≥ n : Xm ≤ x0}. Let Ym = f(X_{m∧λ}). Then (Ym, m ≥ n) is a non-negative supermartingale. Hence, by Theorem 2.3.6, there exists a finite random variable Y∞ ∈ R+ such that lim_{m→∞} Ym = Y∞, a.s. In particular, this means that

lim sup_{m→∞} f(Xm) ≤ Y∞, on {λ = ∞}.

Setting ζ = sup{x ≥ 0 : f(x) ≤ Y∞}, which satisfies ζ < ∞ a.s. since lim_{x→∞} f(x) = ∞, it follows that lim sup_{m→∞} Xm ≤ ζ on {λ = ∞}. Hence

P[ {lim sup_{n→∞} Xn < ∞} ∪ {inf_{m≥n} Xm ≤ x0} ] = 1.

Since n ∈ Z+ was arbitrary, it follows that

P[ {lim sup_{n→∞} Xn < ∞} ∪ ∩_{n≥0} {inf_{m≥n} Xm ≤ x0} ] = 1,

which gives the result.

We can now complete the proof of Theorem 3.5.2; again we use a Lyapunov function of the form fγ,ν.


Proof of Theorem 3.5.2. We apply Theorem 3.5.8 with Lyapunov function f = f_{0,ν} for ν ∈ (0, 1) as defined at (3.16). Taking γ = 0 in (3.17), we have

E[f(X_{n+1}) − f(Xn) | Fn] ≤ (ν/2) ( 2Xn E[∆n | Fn] − E[∆n² | Fn] ) Xn^{−2} log^{ν−1} Xn
  + ( ν(ν − 1) + o^{Fn}_{Xn}(1) ) E[∆n² | Fn] Xn^{−2} log^{ν−2} Xn
  + o^{Fn}_{Xn}(Xn^{−2} log^{ν−2} Xn). (3.35)

Here, since ν > 0 and ν − 1 < 0, we have that, for x1 ∈ R+ sufficiently large,

ν(ν − 1) + o^{Fn}_{Xn}(1) ≤ −(1/2) ν(1 − ν), on {Xn ≥ x1}.

Hence, by definition of µ̄1 and µ̲2,

E[f(X_{n+1}) − f(Xn) | Fn] ≤ (ν/2) ( 2Xn µ̄1(Xn) − µ̲2(Xn) ) Xn^{−2} log^{ν−1} Xn
  − (1/2) ν(1 − ν) µ̲2(Xn) Xn^{−2} log^{ν−2} Xn + o^{Fn}_{Xn}(Xn^{−2} log^{ν−2} Xn).

Hence, by (3.30),

E[f(X_{n+1}) − f(Xn) | Fn] ≤ (ν/2) Xn^{−2} (log^{ν−2} Xn) ( (ν − θ) µ̲2(Xn) + o^{Fn}_{Xn}(1) ).

Choosing ν ∈ (0, θ) (and ν < 1), the fact that lim inf_{x→∞} µ̲2(x) > 0 shows that the last display is negative for all Xn sufficiently large, i.e., for some r0 ∈ R+,

E[f(X_{n+1}) − f(Xn) | Fn] ≤ 0, on {Xn ≥ r0}.

Hence we may apply Theorem 3.5.8 with Lyapunov function f to conclude that lim inf_{n→∞} Xn ≤ r0, a.s.

We end this section with a branching process example, which is instructive in two respects: how the above ideas may be applied when condition (L2) fails, and how a Lamperti-type problem can often be obtained via an appropriate scaling.

Example 3.5.9. Consider the following state-dependent branching process. Let ζz, z ∈ N, be a family of random variables, representing the offspring distribution of an individual in a population of size z. The total population size (Zn, n ≥ 0) is a Markov process on Z+ constructed as follows. Let Z0 = z0 ∈ N. Given Zn = z ∈ Z+, Z_{n+1} is distributed as the sum of z independent copies of ζz, as generation n is succeeded by generation n + 1.

For simplicity, suppose that P[ζz ≤ B] = 1 for some B ∈ R+. Suppose also, to avoid degenerate cases, that P[ζz = 1] < 1. The Markov chain Zn has an absorbing state at 0 (extinction), and there are two possibilities for the long-term behaviour: either Zn → 0, a.s., or else P[Zn → ∞] > 0 and the process survives with positive probability.

The classical Galton–Watson process has ζz = ζ independent of z, and a.s.-extinction occurs if and only if m = E ζ ≤ 1. In the state-dependent setting, the case where E ζz → 1 is clearly of particular interest. So from now on we suppose that the offspring distribution is given by

ζz = 1 + mz + θz,

where, for a deterministic mz and a random θz, mz ∼ c/z, E θz = 0 and E[θz²] → d, for constants c ∈ R and d > 0.

In this case, given Zn = z we have

Dn := Z_{n+1} − Zn = Σ_{k=1}^{z} ( mz + θ_z^{n,k} ) = z mz + Σ_{k=1}^{z} θ_z^{n,k},

where the θ_z^{n,k} are independent copies of θz, representing the offspring of individual k in generation n. We compute directly:

E[Dn | Zn = z] = c + o(1);

E[Dn² | Zn = z] = z d + o(z);

E[Dn⁴ | Zn = z] = O(z²).

A useful change of scale is provided by f(z) = z^{1/2}. Then Xn = f(Zn) is a Markov chain, and, given Xn = x (so Zn = x²), by Taylor's formula,

X_{n+1} − Xn = Dn/(2x) − Dn²/(8x³) + O(|Dn|³ x^{−5}).

Here, by Lyapunov's inequality,

E[|Dn|³ | Zn = x²] ≤ ( E[Dn⁴ | Zn = x²] )^{3/4} = O(x³).

It follows that, with the notation at (3.1),

2xµ1(x) = c − (d/4) + o(1);

µ2(x) = (d/4) + o(1).

In this example the condition (L2) does not hold, so we cannot apply Theorems 3.5.1 and 3.5.2 directly, but it is not hard to adapt the proofs to this setting (Theorems 3.5.6 and 3.5.8 are sufficient), and we see that

• if c > d/2, then P[Xn → ∞] > 0;

• if c < d/2, then P[Xn → 0] = 1.

The boundary case cannot be classified without sharper assumptions on the rates of convergence of mz and E[θz²]. △

3.6 Irreducibility and regeneration

To translate results from the general semimartingale setting into the Markovian setting, one needs to convert between the notions of recurrence and transience described in Section 3.5 and the usual recurrence and transience for Markov chains. In the Markov chain context, sufficient structure is provided by the (strong) Markov property and irreducibility, although even here there are some subtleties over the necessary nature of the state space. The most convenient general framework is in fact not to assume a full Markov structure but a weaker notion of regeneration.

For most of this section, Xn will be an Fn-adapted process on a countable state-space S, not assumed to be a subset of R+. The form of irreducibility that we will use is as follows.

(I) Suppose that for each x, y ∈ S there exist constants m(x, y) ∈ N and ϕ(x, y) > 0, and a collection of Fn-measurable random variables Mn(y) ∈ Z+, such that, for all n ≥ 0, Mn(y) ≤ m(Xn, y), a.s., and

P[X_{n+Mn(y)} = y | Fn] ≥ ϕ(Xn, y), a.s. (3.36)

In words, assumption (I) says that for every y, given Fn, the process visits y with probability bounded below in a number of steps bounded above, where the bounds depend on Xn and y. A variant of (I) is the following.

(I′) Suppose that for each x, y ∈ S there exist constants m(x, y) ∈ N and ϕ(x, y) > 0 such that, for all n ≥ 0,

P[ ∪_{k=0}^{m(Xn,y)} {X_{n+k} = y} | Fn ] ≥ ϕ(Xn, y), a.s.


Proposition 3.6.1. Conditions (I) and (I′) are equivalent.

Proof. Assuming (I), we have that

P[ ∪_{k=0}^{m(Xn,y)} {X_{n+k} = y} | Fn ] ≥ P[ X_{n+Mn(y)} = y, Mn(y) ≤ m(Xn, y) | Fn ],

which is bounded below by ϕ(Xn, y), giving (I′). On the other hand, assuming (I′), take

Mn(y) = arg max_{0≤k≤m(Xn,y)} P[X_{n+k} = y | Fn],

which is Fn-measurable and has Mn(y) ≤ m(Xn, y), a.s. Then

P[X_{n+Mn(y)} = y | Fn] = max_{0≤k≤m(Xn,y)} P[X_{n+k} = y | Fn]

≥ (1/(1 + m(Xn, y))) P[ ∪_{k=0}^{m(Xn,y)} {X_{n+k} = y} | Fn ]

≥ ϕ(Xn, y)/(1 + m(Xn, y)) =: ϕ′(Xn, y),

with ϕ′(x, y) > 0 for all x, y, which gives (I) after a relabelling.

If Xn is a time-homogeneous Markov process on a countable state-space S, then (3.36) reduces to the usual sense of irreducibility: for any x, y ∈ S, there exists m(x, y) ∈ N such that P[X_{m(x,y)} = y | X0 = x] > 0. The assumption (3.36) allows us to work with more general processes, such as functions of a Markov process, as described in the next example.

Example 3.6.2. Suppose that ξn is an irreducible Markov chain on a countable state-space Σ, and that for some measurable function f : Σ → R+, Xn = f(ξn). Set Fn = σ(ξ0, . . . , ξn). Then Xn is Fn-adapted and has the countable state-space S = f(Σ). By irreducibility, given u, v ∈ Σ there exist k(u, v) ∈ N and ω(u, v) > 0 such that P[ξ_{n+k(u,v)} = v | ξn = u] ≥ ω(u, v). Fix y ∈ S; choose and fix an arbitrary v ∈ f⁻¹(y) := {u ∈ Σ : f(u) = y}. For x ∈ S, let

ϕ(x, y) = inf_{u∈Σ : f(u)=x} ω(u, v), and Mn(y) = k(ξn, v).

Then Mn(y) is Fn-measurable, and

P[X_{n+Mn(y)} = y | ξn] ≥ P[ξ_{n+k(ξn,v)} = v | ξn] ≥ ω(ξn, v) ≥ ϕ(Xn, y),


which is (3.36). Here Mn(y) ≤ m(Xn, y), where

m(x, y) = sup_{u∈f⁻¹(x), v∈f⁻¹(y)} k(u, v).

Assuming that f⁻¹(x) is finite for each x, we have m(x, y) < ∞ and ϕ(x, y) > 0. Hence (I) is satisfied for Xn and this countable S ⊆ R+.

In particular, if Σ ⊆ R^d is locally finite and f(x) = ‖x‖ for x ∈ Σ, then for x ∈ S = f(Σ) the set f⁻¹(x) = {u ∈ Σ : ‖u‖ = x} is finite, and so Xn = ‖ξn‖ satisfies (I). △

We first note the following basic result. Here ‘i.o.’ and ‘f.o.’ stand for ‘infinitely often’ and ‘finitely often’, respectively.

Lemma 3.6.3. Suppose that (I) holds for a countable S.

(i) Let R ⊂ S be finite and non-empty. Then, up to sets of probability 0,

{Xn ∈ R i.o.} ⊆ ∩_{S′⊆S, 0<#S′} {Xn ∈ S′ i.o.}.

(ii) Let R ⊂ S be non-empty. Then, up to sets of probability 0,

{Xn ∈ R f.o.} ⊆ ∩_{S′⊆S, 0<#S′<∞} {Xn ∈ S′ f.o.}.

Hence, a.s., either Xn ∈ R i.o. for all finite non-empty R ⊂ S, or for none.

Proof. First we prove part (i). Let R ⊂ S be finite and non-empty, suppose that Xn ∈ R i.o., and let y ∈ S. Since R is finite, there is some x ∈ R with Xn = x i.o.; by (3.36) there exist stopping times n1 < n2 < · · · with n_{i+1} > n_i + m(x, y) such that X_{n_i} = x and, for all i,

P[X_{n_i + M_{n_i}(y)} = y | F_{n_i}] ≥ ϕ(x, y) > 0, and {X_{n_i + M_{n_i}(y)} = y} ∈ F_{n_{i+1}},

since M_{n_i}(y) ≤ m(x, y). Then Lévy's extension of the Borel–Cantelli lemma (Theorem 2.3.19) shows that Xn = y i.o., a.s. So Xn = y i.o. for all y in the countable set S. In particular, Xn ∈ S′ i.o. for any non-empty S′ ⊆ S, as claimed.

Next we prove part (ii). Let R ⊂ S be non-empty, and suppose Xn ∈ R f.o. If there were a finite non-empty S′ ⊆ S for which Xn ∈ S′ i.o., then part (i) would say that Xn ∈ R i.o., a.s., giving a contradiction. So part (ii) follows.

The next result returns to the case S ⊆ R+, and relates the present notion of irreducibility to the earlier local escape condition for non-confinement given in Section 3.3, assuming that S is locally finite.


Lemma 3.6.4. Suppose that (L0) and (I) hold for a locally finite S ⊂ R+. Then (L3) holds.

Proof. Since S is locally finite, 0 = inf S ∈ S. Fix x ∈ R+; then, since S is locally finite and contains 0, S ∩ [0, x] is finite and non-empty. Choose y ∈ S ∩ (x, ∞). Take δx = min_{z∈S∩[0,x]} ϕ(z, y) > 0 and rx = max_{z∈S∩[0,x]} m(z, y) ∈ N, with m and ϕ as given by (3.36). Then, by (3.36), on {Xn ∈ S ∩ [0, x]},

P[ max_{n≤k≤n+rx} Xk ≥ x | Fn ] ≥ P[X_{n+Mn(y)} = y | Fn] ≥ ϕ(Xn, y) ≥ δx,

as required for (L3).

In particular, under the conditions of Lemma 3.6.4, Proposition 3.3.4 applies, so that the non-confinement condition (L2) holds. In this section we explore neighbouring results, including a direct proof of this implication, included in the next lemma.

Lemma 3.6.5. Suppose that (L0) and (I) hold for a countable S ⊂ R+. Then for any (hence every) finite, non-empty R ⊂ S, the following equality holds up to sets of probability 0:

{Xn ∈ R i.o.} = { lim inf_{n→∞} Xn = 0, lim sup_{n→∞} Xn = ∞ }. (3.37)

If, in addition, S is locally finite, then for any (hence every) finite, non-empty R ⊂ S, the following equality holds up to sets of probability 0:

{Xn ∈ R f.o.} = { lim_{n→∞} Xn = ∞ }. (3.38)

In particular, if S is locally finite, then

P[ {Xn → ∞} ∪ { lim inf_{n→∞} Xn = 0, lim sup_{n→∞} Xn = ∞ } ] = 1.

Proof. Suppose that S is countable. By Lemma 3.6.3, there are two possibilities, up to events of probability 0: either (i) Xn ∈ R i.o. for any non-empty R ⊂ S; or (ii) Xn ∈ R f.o. for any finite non-empty R ⊂ S.

First suppose case (i) occurs. Then, for any x ∈ S, Xn = x i.o., so that lim inf_{n→∞} Xn ≤ x ≤ lim sup_{n→∞} Xn; since x ∈ S was arbitrary, together with the obvious bound Xn ≥ 0, we obtain

lim inf_{n→∞} Xn = inf S = 0, and lim sup_{n→∞} Xn = sup S = ∞.


This establishes (3.37).

On the other hand, suppose case (ii) occurs, and that S is locally finite; in particular inf S = 0 ∈ S in this case. Now, for any K ∈ R+, RK = S ∩ [0, K] is finite and non-empty, and so Xn ∈ RK f.o. Hence

lim inf_{n→∞} Xn ≥ K.

Since K was arbitrary, it follows that lim_{n→∞} Xn = ∞. This establishes (3.38). Combining (3.37) and (3.38) we deduce the final statement in the lemma.

The next ingredient will be a notion of regeneration, for which we need to introduce notation for the excursions of Xn. Consider a general countable state-space S, with an identified state 0 ∈ S. For convenience, we will assume X0 = 0; we consider excursions away from state 0.

Set ν0 := 0 and for k ∈ N define

νk := min{n > ν_{k−1} : Xn = 0},

with the usual convention that min ∅ = ∞. That is, ν0, ν1, ν2, . . . are the successive times of visits to 0 by Xn. Let N := min{k ∈ N : νk = ∞}; if N = ∞ then all the νk are finite and Xn visits 0 infinitely often. Provided ν_{k−1} < ∞, we denote the kth excursion (k ∈ N) by E_k := (Xn)_{ν_{k−1} ≤ n < νk}.

For k ∈ N, let κk := νk − ν_{k−1} if ν_{k−1} < ∞, and let κk = ∞ if ν_{k−1} = ∞; when ν_{k−1} is finite, κk is the duration of the kth excursion.

(Ra) Suppose that, for all k ∈ N, the distribution of E_k on {ν_{k−1} < ∞} is the same as the distribution of E_1.

(Rb) Suppose that, on {N = ∞}, (E_k)_{k∈N} is an i.i.d. sequence.

(R) Suppose that X0 = 0 and both (Ra) and (Rb) hold.

Example 3.6.6. In the setting of Example 3.6.2, suppose that S = f(Σ) has 0 ∈ S. If f⁻¹(0) is a singleton, then the strong Markov property for the underlying Markov chain ξn yields the regenerative structure required for (R). △

Our irreducibility and regenerative assumptions have the following basic consequence, which includes the recurrence/transience dichotomy in this setting.


Lemma 3.6.7. Suppose that (I) and (Ra) hold for a countable S with 0 ∈ S. Then either:

(i) P[κ1 < ∞] < 1 and P[Xn ∈ R f.o.] = 1 for every finite, non-empty R ⊂ S; or

(ii) P[κ1 < ∞] = 1 and P[Xn ∈ R i.o.] = 1 for every non-empty R ⊂ S.

Proof. A consequence of (Ra) is that

P[κk = n | ν_{k−1} < ∞] = P[κ1 = n], for all n ∈ Z+. (3.39)

It follows from (3.39), with a repeated conditioning argument, that

P[N > k] = P[κk < ∞, ν_{k−1} < ∞, . . . , ν1 < ∞] = ∏_{j=1}^{k} P[κj < ∞ | ν_{j−1} < ∞] = (P[κ1 < ∞])^k.

If P[κ1 < ∞] < 1, this implies that N < ∞ a.s., so that Xn = 0 f.o., a.s. The statement in part (i) follows from Lemma 3.6.3(ii). On the other hand, if P[κ1 < ∞] = 1 we have that P[N > k] = 1 for any k, so N = ∞ a.s. and hence Xn = 0 i.o., and the statement in part (ii) follows from Lemma 3.6.3(i).

In the case where S ⊆ R+ is locally finite, the recurrence/transience dichotomy reads as follows.

Theorem 3.6.8. Suppose that (L0) and (I) hold for a locally finite S ⊂ R+. Then lim sup_{n→∞} Xn = +∞, a.s. If (Ra) also holds, then either:

(i) (transience) P[κ1 < ∞] < 1 and lim_{n→∞} Xn = +∞, a.s.; or

(ii) (recurrence) P[κ1 < ∞] = 1 and lim inf_{n→∞} Xn = 0, a.s.

Proof. Lemma 3.6.5 shows that lim sup_{n→∞} Xn = +∞, and also shows that the statements (i) and (ii) in Lemma 3.6.7 translate into statements (i) and (ii) in the theorem, in the case where S ⊂ R+ is locally finite.

Now we can also complete the proof of Theorem 2.1.9 (a close relative of Theorem 3.6.8) that was deferred from Chapter 2.


Proof of Theorem 2.1.9. Let ξn be an irreducible time-homogeneous Markov chain on Σ ⊂ R^d, unbounded and countable. Write r = inf_{y∈Σ} ‖y‖ and set Xn = ‖ξn‖ − r and Fn = σ(ξ0, ξ1, . . . , ξn). Then (L0) and (I) hold, with S = {‖y‖ − r : y ∈ Σ} countable. The result now follows from Lemma 3.6.5, since recurrence of the Markov chain ξn corresponds to (3.37) holding with probability 1, while transience of ξn corresponds to (3.38) holding with probability 1.

The concepts of irreducibility and regeneration also play an important role in the discussion of positive- and null-recurrence. For a countable state-space S, define for x ∈ S the occupation time (also called local time)

Ln(x) := Σ_{m=0}^{n} 1{Xm = x}, for n ∈ N.

Also define the occupation time during the kth excursion, for x ∈ S, by

ℓk(x) := Σ_{m=ν_{k−1}}^{νk − 1} 1{Xm = x}, for k ∈ N.

The following ergodic theorem shows that our regeneration and irreducibility assumptions imply that normalized occupation times converge; if the limit is zero, the process is null, otherwise it is positive.

Theorem 3.6.9. Suppose that (I) and (R) hold for a countable S with 0 ∈ S. Then, for any x ∈ S,

lim_{n→∞} (1/n) Ln(x) = πx := E ℓ1(x)/E κ1 ∈ (0, 1) if E κ1 < ∞, and πx := 0 if E κ1 = ∞,

the limit statement holding a.s. and in L^q for any q ≥ 1. In the case E κ1 < ∞, we have Σ_{x∈S} πx = 1.

If Xn is a positive-recurrent, irreducible, time-homogeneous Markov process, then the πx in Theorem 3.6.9 is the usual (unique) stationary distribution; Theorem 3.6.9 thus extends (the Cesàro version of) Theorem 2.1.6.
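The excursion formula πx = E ℓ1(x)/E κ1 can be illustrated on a toy chain. The following sketch (not from the book; the chain and the helper name `occupation_fraction` are ours) uses the two-state chain 0 → 1 with probability 1 and 1 → 0 with probability p: excursions from 0 last κ = 1 + Geometric(p) steps, so E κ1 = 1 + 1/p, and state 1 is occupied E ℓ1(1) = 1/p times per excursion, giving π1 = (1/p)/(1 + 1/p):

```python
import random

def occupation_fraction(p, n, seed=0):
    """Fraction of the first n steps that the two-state chain spends
    in state 1 (transitions: 0 -> 1 surely; 1 -> 0 with probability p)."""
    rng = random.Random(seed)
    state, visits_to_1 = 0, 0
    for _ in range(n):
        if state == 1:
            visits_to_1 += 1
        state = 1 if state == 0 else (0 if rng.random() < p else 1)
    return visits_to_1 / n

p = 0.5
empirical = occupation_fraction(p, 200_000)
theoretical = (1 / p) / (1 + 1 / p)   # = E ell_1(1) / E kappa_1
print(empirical, theoretical)
```

Here the excursion ratio coincides with the stationary probability 1/(1 + p) of state 1, in line with the remark above.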

The proof of Theorem 3.6.9 will be accomplished by a sequence of lemmas. The first simple lemma will also be used frequently later on. Recall that τy = min{n ≥ 0 : Xn = y}.

Lemma 3.6.10. Suppose that (I) and (Ra) hold for a countable S with 0 ∈ S. Then P[κ1 > 1] > 0, and, for any y ∈ S \ {0}, there exists c(y) > 0 such that

P[τy < κ1 | F0] = c(y), a.s.


Proof. First suppose, for the purpose of deriving a contradiction, that P[κ1 = 1] = 1. Then it follows by induction from (3.39) that P[κk = 1] = 1 for all k, which implies that Xn = 0 for all n, a.s. But this contradicts Lemma 3.6.3. Hence P[κ1 > 1] > 0.

Next, the irreducibility assumption (3.36) implies that for any k,

P[X_{νk + M_{νk}(y)} = y | F_{νk}] ≥ ϕ(0, y), on {νk < ∞}. (3.40)

By the regenerative assumption (Ra), the probability P[hit y before returning to 0 | F_{νk}] is constant on {νk < ∞}; call this probability c(y). Then, an upper bound on the probability that Xn ever visits y is

P[ ∪_{k=0}^{∞} ( {νk < ∞} ∩ {hit y between νk and ν_{k+1}} ) ] ≤ Σ_{k=0}^{∞} c(y).

Thus if c(y) = 0, the probability of eventually hitting y is also 0, which contradicts (3.40). Hence c(y) > 0.

Lemma 3.6.10 shows that any x ∈ S \ {0} is visited at least once during an excursion with positive probability. The next key result shows that the expected number of visits is finite.

Lemma 3.6.11. Suppose that (I) and (R) hold for a countable S with 0 ∈ S. Suppose that P[κ1 < ∞] = 1. Then, for any x ∈ S, the ℓk(x), k ≥ 1, are i.i.d. with E ℓ1(x) ∈ (0, ∞).

Proof. Assuming P[κ1 < ∞] = 1, the regenerative assumption (R) shows that Xn = 0 i.o., and hence Lemma 3.6.3 shows that Xn = x i.o. also, for any x ∈ S. Also, (R) shows that the ℓk(x) are i.i.d.

The fact that E ℓ1(x) < ∞ will follow from irreducibility. Indeed, fix x ∈ S. Write m = m(x, 0) and ϕ = ϕ(x, 0), the constants given by (I). Set τ1 = min{n ≥ 0 : Xn = x} and for k ≥ 2 set τk = min{n > τ_{k−1} + m : Xn = x}, so that τ1, τ2, . . . are a subsequence of the times of successive visits to x. (Note that τk < ∞ for all k, since Xn = x i.o.) Set

Jx = min{k ≥ 1 : X_{τk + M_{τk}(0)} = 0}.

Recall that M_{τk}(0) ≤ m, a.s. Then Xn must visit 0 by time τ_{Jx} + m, and by that time Xn has visited x at most (2m + 1) Jx times, since for each visit to x recorded as some τk, other (unrecorded) visits may occur anywhere in the interval [τk − m, τk + m]. Hence ℓ1(x) ≤ (2m + 1) Jx, a.s. However, by (I) applied at time τ1,

P[Jx ≥ 2 | F_{τ1}] ≤ P[X_{τ1 + M_{τ1}(0)} ≠ 0 | F_{τ1}] ≤ 1 − ϕ, a.s.,

and, since {Jx ≥ 2} ∈ F_{τ2} by the fact that τ1 + M_{τ1}(0) ≤ τ2, we have

P[Jx ≥ 3 | F_{τ1}] = E[ 1{Jx ≥ 2} P[Jx ≥ 3 | F_{τ2}] | F_{τ1} ]
 = E[ 1{Jx ≥ 2} P[X_{τ2 + M_{τ2}(0)} ≠ 0 | F_{τ2}] | F_{τ1} ] ≤ (1 − ϕ)²,

by (I) applied at time τ2. Iterating this argument gives P[Jx > k] ≤ (1 − ϕ)^k for any k ≥ 0, so in particular E ℓ1(x) ≤ (2m + 1) E Jx < ∞.

Finally, we have E ℓ1(x) > 0 since ℓ1(0) ≥ 1, a.s., and, for x ∈ S \ {0}, E ℓ1(x) ≥ P[τx < κ1], which is positive by Lemma 3.6.10.

Let Nn := max{k ≥ 1 : νk ≤ n} denote the number of completed excursions by time n ≥ 0.

Lemma 3.6.12. Suppose that (I) and (R) hold for a countable S with 0 ∈ S. Then, a.s.,

lim_{n→∞} (1/n) Nn = 1/E κ1 ∈ (0, 1) if E κ1 < ∞, and lim_{n→∞} (1/n) Nn = 0 if E κ1 = ∞.

Proof. If P[κ1 = ∞] > 0, then lim_{n→∞} Nn = N − 1 < ∞, a.s., and n^{−1} Nn → 0, trivially. So suppose that P[κ1 < ∞] = 1.

The result is essentially an inversion of the strong law of large numbers for νk = νk − ν0 = Σ_{ℓ=1}^{k} κℓ, where κ1, κ2, . . . are i.i.d., by (R).

Suppose first that E κ1 < ∞; then, since P[κ1 > 1] > 0 (see Lemma 3.6.10), this means E κ1 ∈ (1, ∞). Then the strong law of large numbers shows that k^{−1} νk → E κ1, a.s. Hence for any ε ∈ (0, E κ1), there exists an a.s.-finite Kε such that νk ≥ k(E κ1 − ε) for all k ≥ Kε. Then if k ∈ N is such that k(E κ1 − ε) > n for an integer n with n > Kε(E κ1 − ε), we have νk > n; hence Nn ≤ n/(E κ1 − ε) for all n sufficiently large. Since ε > 0 was arbitrary, we obtain

lim sup_{n→∞} n^{−1} Nn ≤ 1/E κ1 ∈ (0, 1).

A similar argument in the other direction gives the corresponding lim inf statement.

On the other hand, suppose that E κ1 = ∞; now the strong law implies that k^{−1} νk → ∞, a.s. So, for any ε > 0, νk > k/ε for all k sufficiently large, which yields lim sup_{n→∞} n^{−1} Nn ≤ ε by a similar argument to the above. Since ε > 0 was arbitrary, the result follows.

Now we can complete the proof of Theorem 3.6.9.


Proof of Theorem 3.6.9. Suppose that P[κ1 =∞] > 0. Then limn→∞ Ln(x) =∑∞m=0 1Xm = x <∞, a.s., by Lemma 3.6.7, so that n−1Ln(x)→ 0.

So suppose that P[κ1 < ∞] = 1. Lemma 3.6.11 and the strong law oflarge numbers shows that

limm→∞

1

m

m∑k=1

`k(x) = E `1(x), a.s. (3.41)

First note that, by definition of Nn and ℓk(x), a.s.,

  ∑_{k=1}^{Nn} ℓk(x) ≤ Ln(x) = ∑_{k=1}^{Nn} ℓk(x) + ∑_{m=ν_{Nn}}^{n} 1{Xm = x} ≤ ∑_{k=1}^{Nn+1} ℓk(x), (3.42)

since ν_{Nn} ≤ n < ν_{Nn+1}. In particular, the upper bound in (3.42) gives

  (1/n) Ln(x) ≤ ((Nn + 1)/n) · (1/(Nn + 1)) ∑_{k=1}^{Nn+1} ℓk(x).

Hence, since Nn → ∞ a.s., it follows from (3.41) that

  lim sup_{n→∞} (1/n) Ln(x) ≤ E ℓ1(x) · lim sup_{n→∞} (Nn + 1)/n, a.s. (3.43)

If E κ1 = ∞, Lemma 3.6.12 shows that the limit on the right-hand side of (3.43) is zero, so that n^{−1} Ln(x) → 0, a.s. Suppose now that E κ1 < ∞. In this case, (3.43) with Lemma 3.6.12 shows that

  lim sup_{n→∞} (1/n) Ln(x) ≤ E ℓ1(x) / E κ1, a.s.

In the other direction, the lower bound in (3.42) gives

  (1/n) Ln(x) ≥ (Nn/n) · (1/Nn) ∑_{k=1}^{Nn} ℓk(x),

so that by Lemma 3.6.12 and (3.41),

  lim inf_{n→∞} (1/n) Ln(x) ≥ E ℓ1(x) / E κ1, a.s.

This completes the proof.


To end this section, we state a technical lemma on almost-sure bounds for sums and maxima of i.i.d. random variables; under the regenerative assumption, these results will be useful for our analysis of excursions of Lamperti-type processes presented later in this chapter.

Lemma 3.6.13. Let ζ1, ζ2, . . . be i.i.d. R+-valued random variables.

(i) Suppose that, for some θ ∈ (0, 1) and ϕ ∈ R,

  lim sup_{x→∞} ( x^θ (log x)^{−ϕ} P[ζ1 ≥ x] ) < ∞. (3.44)

Then, for any ε > 0, a.s., for all but finitely many n,

  ∑_{i=1}^{n} ζi ≤ n^{1/θ} (log n)^{(ϕ+1)/θ + ε}.

(ii) Suppose that, for some θ ∈ (0, ∞) and ϕ ∈ R,

  lim inf_{x→∞} ( x^θ (log x)^{−ϕ} P[ζ1 ≥ x] ) > 0. (3.45)

Then, for any ε > 0, a.s., for all but finitely many n,

  max_{1≤i≤n} ζi ≥ n^{1/θ} (log n)^{(ϕ−1)/θ − ε}.

Proof. Part (i) belongs to a family of classical results related to the strong laws of large numbers of Marcinkiewicz and Zygmund (see e.g. [137, p. 73]): it follows from Theorem 2 of Feller [85] (see also [183, p. 253] for a more general result).

For part (ii), we have from (3.45) that for some c > 0 and all x large enough, P[ζ1 ≥ x] ≥ c x^{−θ} (log x)^{ϕ}, so that

  P[ max_{1≤i≤n} ζi < x ] ≤ ( 1 − c x^{−θ} (log x)^{ϕ} )^n.

Taking x = n^{1/θ} (log n)^q we obtain

  P[ max_{1≤i≤n} ζi < n^{1/θ} (log n)^q ] ≤ ( 1 − c n^{−1} (log n)^{ϕ−θq} (1 + o(1)) )^n
                                         = O( exp( −c (log n)^{ϕ−θq} (1 + o(1)) ) ),

which is summable over n ≥ 2 if q < (ϕ − 1)/θ; now use the Borel–Cantelli lemma.


3.7 Moments and tails of passage times

Recall that for x ≥ 0, λx = min{n ≥ 0 : Xn ≤ x}. In the recurrent cases identified in Section 3.5, we have that λx < ∞ a.s., at least for all x sufficiently large. To quantify recurrence, it is natural to ask for more information about these hitting times, such as tail or moment bounds. In this section we use the general results from Section 2.7 to investigate this question. We start with a result on existence of moments.

Theorem 3.7.1. Suppose that (L0) holds. Suppose that, for some α > 0, (L1) holds with p > max{2, 2α}, and that lim sup_{x→∞} μ2(x) < ∞. Suppose also that one of the following two conditions holds.

(i) We have α ∈ (0, 1/2] and lim sup_{x→∞} ( 2x μ1(x) − (1 − 2α) μ2(x) ) < 0.

(ii) We have α ∈ (1/2, ∞) and lim sup_{x→∞} ( 2x μ1(x) + (2α − 1) μ2(x) ) < 0.

Then there exist γ0 ∈ (2α, p) and x0 ∈ R+ for which, for any γ ∈ (2α, γ0), any s ∈ (0, γ/2), and any x ≥ x0, E[λx^s | F0] ≤ C X0^γ, where C = C(γ, s) < ∞. In particular, for any x ≥ x0, E[λx^α] < ∞.

Now we turn to non-existence of moments. The next result also gives a lower tail bound.

Theorem 3.7.2. Suppose that (L0) holds. Suppose that, for some β > 0, (L1) holds with p > max{2, 2β}, and that lim sup_{x→∞} μ2(x) < ∞. Suppose also that one of the following two conditions holds.

(i) We have β ∈ (0, 1/2] and lim inf_{x→∞} ( 2x μ1(x) − (1 − 2β) μ2(x) ) > 0.

(ii) We have β ∈ (1/2, ∞) and lim inf_{x→∞} ( 2x μ1(x) + (2β − 1) μ2(x) ) > 0.

Then there exists x0 ∈ R+ such that for any x ≥ x0, E[λx^β | F0] = ∞ on {X0 > x}. Moreover, for any x1 > x0 and any x < x1, there is a constant c = c(x0, x1, x, β) > 0 such that, for all n sufficiently large,

  P[λx ≥ n | F0] ≥ c n^{−β}, on {X0 = x1}.

The following more specific result gives a sufficient condition to ensure that E λx = ∞ which is sharper than the β = 1 case of Theorem 3.7.2.


Theorem 3.7.3. Suppose that (L0) holds. Suppose that (L1) holds with p > 2, and that lim sup_{x→∞} μ2(x) < ∞. Suppose also that there exist x0 ∈ (0, ∞), v ∈ (0, ∞), and θ ∈ (0, 1) such that for all x ≥ x0, μ2(x) ≥ v and

  2x μ1(x) + ( 1 + (1 − θ)/log x ) μ2(x) ≥ 0.

Then there exists x0 ∈ R+ such that, for any x ≤ x0, E[λx | F0] = ∞ on {X0 > x}. Moreover, for any ν > 1 − θ there exists x0 ∈ R+ such that, for any x1 > x0 and any x < x1, there is a constant c = c(x0, x1, x, ν, θ) > 0 such that, for all n sufficiently large,

  P[λx ≥ n | F0] ≥ c n^{−1} (log n)^{−ν}, on {X0 = x1}.

Note that Theorem 3.7.3 is a considerably stronger result than one can obtain from Theorem 2.6.10; one would like to apply the latter result to Xn², satisfying the conditions of Theorem 3.7.3, but the moments condition in Theorem 2.6.10 will not be satisfied.

Example 3.7.4. As in Example 3.5.4, let (ξn, n ≥ 0) be a time-homogeneous Markov chain on state space Σ ⊆ R2 with increments θn := ξ_{n+1} − ξn such that ‖θn‖ ≤ B, a.s., for some B ∈ R+, and, for all x ∈ Σ,

  E[θn | ξn = x] = 0, and E[θn θn^⊤ | ξn = x] = I,

the 2 × 2 identity matrix. In Example 3.5.4, we saw that ξn is recurrent. Moreover, for Xn = ‖ξn‖,

  lim inf_{x→∞} ( 2x μ1(x) − (1 − 2β) μ2(x) ) = 1 − (1 − 2β) > 0,

for any β > 0, so Theorem 3.7.2 shows that, for suitable x1 > 0, E[λx^β] = ∞ for any ‖ξ0‖ > x1 > x. Example 2.7.9 gave a version of this conclusion for symmetric SRW on Z2. ◁
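A Monte Carlo sketch (our illustration; the radii, cap, and trial count are arbitrary hypothetical choices) of the phenomenon behind Example 3.7.4: for planar SRW started at radius 10, the time to return to the disc of radius 5 has so heavy a tail that a visible fraction of trials remain outside even after thousands of steps — consistent with E[λ^β] = ∞ for every β > 0.

```python
import random

def return_time_2d(r_start: float, r_target: float, cap: int, rng) -> int:
    """Steps for SRW on Z^2 from (r_start, 0) until the norm is <= r_target (capped)."""
    x, y = int(r_start), 0
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for n in range(1, cap + 1):
        dx, dy = rng.choice(steps)
        x, y = x + dx, y + dy
        if x * x + y * y <= r_target * r_target:
            return n
    return cap  # censored: the walk stayed outside the disc for `cap` steps

rng = random.Random(3)
times = [return_time_2d(10, 5, cap=2000, rng=rng) for _ in range(1000)]
tail = sum(t >= 2000 for t in times) / len(times)
print(tail)  # a sizable fraction of excursions are still out at the cap
```

Increasing the cap shrinks this fraction only very slowly (logarithmically, for planar SRW), which is why every positive moment of the return time diverges.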

The rest of this section is devoted to the proofs of the preceding theorems. The first step is to apply the results of Section 2.7 with suitable Lyapunov functions. First we work towards a proof of Theorem 3.7.1.

Lemma 3.7.5. Suppose that (L0) holds. Suppose that, for some α > 0, (L1) holds with p > max{2, 2α}, and that lim sup_{x→∞} μ2(x) < ∞. Suppose also that one of the following two conditions holds.

(i) We have α ∈ (0, 1/2] and lim sup_{x→∞} ( 2x μ1(x) − (1 − 2α) μ2(x) ) < 0.


(ii) We have α ∈ (1/2, ∞) and lim sup_{x→∞} ( 2x μ1(x) + (2α − 1) μ2(x) ) < 0.

There exist γ0 ∈ (2α, p), x0 ∈ R+, and ε0 > 0 such that, for any γ ∈ (2α, γ0),

  E[X_{n+1}^γ − Xn^γ | Fn] ≤ −ε0 Xn^{γ−2}, on {Xn ≥ x0}.

Proof. In this case, for any γ ∈ (0, p), (3.18) gives

  E[X_{n+1}^γ − Xn^γ | Fn] = (γ/2) ( 2Xn E[∆n | Fn] + (γ − 1) E[∆n² | Fn] ) Xn^{γ−2} + o_{Fn,Xn}(Xn^{γ−2}),

using the fact that lim sup_{x→∞} μ2(x) < ∞. First suppose condition (i) holds, so that for any ε1 > 0 sufficiently small there exists x1 ∈ R+ such that 2x μ1(x) − (1 − 2α) μ2(x) < −ε1 for all x ≥ x1. Take γ0 = 2α + ε2 ∈ (2α, p).

Then, for γ ∈ (2α, γ0),

  2Xn E[∆n | Fn] + (γ − 1) E[∆n² | Fn] ≤ 2Xn μ1(Xn) − (1 − 2α) μ2(Xn) + ε2 μ2(Xn)
                                        ≤ −ε1 + ε2 μ2(Xn),

on {Xn ≥ x1}. Since lim sup_{x→∞} μ2(x) < ∞, we may choose ε2 > 0 small enough so that this last display is less than −ε1/2, say, on {Xn ≥ x1}, uniformly for γ ∈ (2α, γ0). Hence

  E[X_{n+1}^γ − Xn^γ | Fn] ≤ −(ε1 γ/4) Xn^{γ−2} + o_{Fn,Xn}(Xn^{γ−2}),

on {Xn ≥ x1}. Since γ is bounded uniformly away from 0, there exist ε0 > 0 and x0 < ∞, not depending on γ ∈ (2α, γ0), such that

  E[X_{n+1}^γ − Xn^γ | Fn] ≤ −ε0 Xn^{γ−2}, on {Xn ≥ x0}.

A similar argument applies if condition (ii) holds.

Proof of Theorem 3.7.1. Let γ0, x0, and ε0 be the constants in Lemma 3.7.5, and let γ ∈ (2α, γ0). Then Lemma 3.7.5 shows that we may apply Theorem 2.7.1 with s0 = γ/2 to conclude that E[λx^s | F0] ≤ C X0^γ for any s < γ/2 and any x ≥ x0. In particular, since γ > 2α, we may take s = α to conclude that E[λx^α] < ∞.

Next we turn to the non-existence of moments results. We will obtain Theorems 3.7.2 and 3.7.3 from applications of Lemma 2.7.7. Thus we need lower bounds on the extent of an excursion away from a bounded set. The following result is required for Theorem 3.7.2, and will also be useful in Section 3.8 below.


Lemma 3.7.6. Suppose that (L0) holds. Suppose that, for some β > 0, (L1) holds with p > max{2, 2β}, and that lim sup_{x→∞} μ2(x) < ∞. Suppose also that one of the following two conditions holds.

(i) We have β ∈ (0, 1/2] and lim inf_{x→∞} ( 2x μ1(x) − (1 − 2β) μ2(x) ) > 0.

(ii) We have β ∈ (1/2, ∞) and lim inf_{x→∞} ( 2x μ1(x) + (2β − 1) μ2(x) ) > 0.

Then there exist x0 ∈ R+ and a constant c = c(β) > 0 such that, for any x ≥ x0, for all y > x,

  P[ max_{0≤m≤λx} Xm ≥ y | F0 ] ≥ c ( X0^{2β} − x^{2β} ) y^{−2β}, on {x < X0 < y}.

Proof. Write f = f_{2β,0} and f^y = f^y_{2β,0}, where f_{2β,0} is the Lyapunov function defined at (3.16) and f^y_{2β,0} is the truncated version defined at (3.27). Lemma 3.4.3 shows that, for any y > e,

  E[f^y(X_{n+1}) − f^y(Xn) | Fn] ≥ β ( 2Xn E[∆n | Fn] + (2β − 1) E[∆n² | Fn] ) Xn^{2β−2} + o_{Fn,Xn}(Xn^{2β−2}),

on {e ≤ Xn ≤ y}, where the implicit constant in the error term does not depend on y. Hence under either condition (i) or condition (ii) in the lemma, we have that there exists x1 ∈ R+ such that, for any y > x1,

  E[f^y(X_{n+1}) − f^y(Xn) | Fn] ≥ 0, on {x1 ≤ Xn ≤ y}.

Let x1 ≤ x < y, and set Yn = f^y(X_{n∧λx∧σy}). Then Yn is a non-negative submartingale, which is uniformly bounded by f(2y), and hence uniformly integrable. Hence, by optional stopping (Theorem 2.3.7) we have, on {x < X0 < y},

  f(X0) = Y0 ≤ E[Y_{λx∧σy} | F0] ≤ f(2y) P[σy < λx | F0] + f(x) P[λx < σy | F0],

since f^y is non-decreasing and bounded above by f(2y). Hence

  P[σy < λx | F0] ≥ ( f(X0) − f(x) ) / ( f(2y) − f(x) ) ≥ ( f(X0) − f(x) ) / f(2y),

which yields the statement in the lemma.


Proof of Theorem 3.7.2. Let x0 be the constant from Lemma 3.7.6. We apply Lemma 2.7.7 with the excursion lower bound from Lemma 3.7.6 to obtain, for any x ≥ x0,

  P[λx ≥ n | F0] ≥ (1/2) P[ max_{0≤m≤λx} Xm ≥ C n^{1/2} | F0 ] ≥ c ( X0^{2β} − x^{2β} ) n^{−β} 1{x < X0 < C n^{1/2}},

for constants c > 0 (depending on β) and C < ∞. It follows that, for some c > 0 depending on β,

  E[λx^β | F0] ≥ c ( X0^{2β} − x^{2β} ) ∑_{n > (X0/C)²} n^{−1} = ∞, on {X0 > x}.

Moreover, for any x1 > x0, for any x < x1, on {X0 = x1},

  P[λx ≥ n | F0] ≥ P[λ_{x∨x0} ≥ n | F0] ≥ c ( x1^{2β} − (x ∨ x0)^{2β} ) n^{−β},

for all n ≥ n0 sufficiently large. This yields the result.

Now we work towards a proof of Theorem 3.7.3. First we need an excursion bound analogous to Lemma 3.7.6.

Lemma 3.7.7. Suppose that (L0) holds. Suppose that (L1) holds with p > 2, and that lim sup_{x→∞} μ2(x) < ∞. Suppose also that there exist x0 ∈ (0, ∞), v ∈ (0, ∞), and θ ∈ (0, 1) such that for all x ≥ x0, μ2(x) ≥ v and

  2x μ1(x) + ( 1 + (1 − θ)/log x ) μ2(x) ≥ 0.

Then for any ν > 1 − θ there exist x1 ∈ R+ and a constant c = c(ν, θ) > 0 such that, for any x ≥ x1, for all y > x, on {x < X0 < y},

  P[ max_{0≤m≤λx} Xm ≥ y | F0 ] ≥ c ( X0² (log X0)^ν − x² (log x)^ν ) y^{−2} (log y)^{−ν}.

Proof. Write f = f_{2,ν} and f^y = f^y_{2,ν}, where f_{2,ν} and f^y_{2,ν} are defined at (3.16) and (3.27) respectively. Take ν > 1 − θ. Lemma 3.4.3 shows that, for any y > 2e,

  E[f^y(X_{n+1}) − f^y(Xn) | Fn] ≥ ( 2Xn E[∆n | Fn] + E[∆n² | Fn] ) (log Xn)^ν
                                   + (ν/2) ( 2Xn E[∆n | Fn] + 3 E[∆n² | Fn] ) (log Xn)^{ν−1}
                                   + o_{Fn,Xn}((log Xn)^{ν−1}),

on {2e ≤ Xn ≤ y}, where the implicit constant in the error term does not depend on y. Using the fact that, on {Xn ≥ x0},

  2Xn E[∆n | Fn] + E[∆n² | Fn] ≥ − ((1 − θ)/log Xn) μ2(Xn),

we thus obtain

  E[f^y(X_{n+1}) − f^y(Xn) | Fn] ≥ ( ν − (1 − θ) + o_{Fn,Xn}(1) ) μ2(Xn) (log Xn)^{ν−1},

which is non-negative for all Xn sufficiently large, by the choice of ν > 1 − θ and the fact that μ2 is eventually positive. In other words, there exists x1 ∈ R+ such that, for any y > x1,

  E[f^y(X_{n+1}) − f^y(Xn) | Fn] ≥ 0, on {x1 ≤ Xn ≤ y}.

The remainder of the proof now follows the same lines as the proof of Lemma 3.7.6.

Proof of Theorem 3.7.3. Choose ν ∈ (1 − θ, 1), and let x0 be as in the statement of Lemma 3.7.7. We apply Lemma 2.7.7 with the excursion lower bound from Lemma 3.7.7 to obtain, for any x ≥ x0, on {x < X0 < C n^{1/2}},

  P[λx ≥ n | F0] ≥ (1/2) P[ max_{0≤m≤λx} Xm ≥ C n^{1/2} | F0 ] ≥ c ( X0² (log X0)^ν − x² (log x)^ν ) n^{−1} (log n)^{−ν},

for constants c > 0 (depending on ν and θ) and C < ∞. It follows that, for some c > 0, on {X0 > x},

  E[λx | F0] ≥ c ( X0² (log X0)^ν − x² (log x)^ν ) ∑_{n > (X0/C)²} n^{−1} (log n)^{−ν} = ∞,

as claimed. Moreover, for any x1 > x0, on {X0 = x1},

  P[λx ≥ n | F0] ≥ P[λ_{x∨x0} ≥ n | F0] ≥ c n^{−1} (log n)^{−ν},

for some constant c = c(x0, x1, x, θ, ν) > 0 and all n ≥ n0 sufficiently large.


3.8 Excursion durations and maxima

The results in Section 3.7 deal with bounds for passage times into bounded intervals, and the extent of trajectories up until such a passage time, for general Lamperti processes started far enough away from the origin. In this section we impose the additional structure provided by the regeneration and irreducibility assumptions of Section 3.6, and obtain analogous results for full excursions in that setting.

First we consider tail bounds for the excursion duration κ1. The first result is an analogue of the upper tail bound result for passage times into bounded intervals, Theorem 3.7.1.

Theorem 3.8.1. Suppose that (L0), (I), and (R) hold for a locally finite S ⊂ R+. Suppose that, for some α > 0, (L1) holds with p > max{2, 2α}, and that lim sup_{x→∞} μ2(x) < ∞. Suppose also that one of the following two conditions holds.

(i) We have α ∈ (0, 1/2] and lim sup_{x→∞} ( 2x μ1(x) − (1 − 2α) μ2(x) ) < 0.

(ii) We have α ∈ (1/2, ∞) and lim sup_{x→∞} ( 2x μ1(x) + (2α − 1) μ2(x) ) < 0.

Then there exists ε > 0 for which P[κ1 ≥ n] = O(n^{−α−ε}). In particular, E[κ1^α] < ∞.

Note that the case α = 1 of Theorem 3.8.1 generalizes the positive-recurrence result Theorem 3.2.3(iii). The analogue of the lower tail bound result for passage times into bounded intervals, Theorem 3.7.2, is the following.

Theorem 3.8.2. Suppose that (L0), (I), and (R) hold for a locally finite S ⊂ R+. Suppose that, for some β > 0, (L1) holds with p > max{2, 2β}, and that lim sup_{x→∞} μ2(x) < ∞. Suppose also that one of the following two conditions holds.

(i) We have β ∈ (0, 1/2] and lim inf_{x→∞} ( 2x μ1(x) − (1 − 2β) μ2(x) ) > 0.

(ii) We have β ∈ (1/2, ∞) and lim inf_{x→∞} ( 2x μ1(x) + (2β − 1) μ2(x) ) > 0.

Then there exists c = c(β) > 0 such that P[κ1 ≥ n] ≥ c n^{−β} for all n ≥ 1. In particular, E[κ1^β] = ∞.

We also present the following analogue of Theorem 3.7.3, which sharpens the β = 1 case of Theorem 3.8.2.


Theorem 3.8.3. Suppose that (L0), (I), and (R) hold for a locally finite S ⊂ R+. Suppose that (L1) holds with p > 2, and that lim sup_{x→∞} μ2(x) < ∞. Suppose also that there exist x0 ∈ (0, ∞) and θ ∈ (0, 1) such that for all x ≥ x0,

  2x μ1(x) + ( 1 + (1 − θ)/log x ) μ2(x) ≥ 0.

Then for any ν > 1 − θ, there exists c = c(ν) > 0 such that P[κ1 ≥ n] ≥ c n^{−1} (log n)^{−ν} for all n ≥ 2. In particular, E κ1 = ∞.

In this section we also present an almost-sure lower bound on the maximum of the process after a large number of excursions; the basis for this result is Lemma 3.7.6.

Theorem 3.8.4. Suppose that (L0), (I), and (R) hold for a locally finite S ⊂ R+. Suppose that, for some β > 0, (L1) holds with p > max{2, 2β}, and that lim sup_{x→∞} μ2(x) < ∞. Suppose also that one of the following two conditions holds.

(i) We have β ∈ (0, 1/2] and lim inf_{x→∞} ( 2x μ1(x) − (1 − 2β) μ2(x) ) > 0.

(ii) We have β ∈ (1/2, ∞) and lim inf_{x→∞} ( 2x μ1(x) + (2β − 1) μ2(x) ) > 0.

Then for any ε > 0, a.s., for all but finitely many k,

  max_{0≤m≤νk} Xm ≥ k^{1/(2β)} (log k)^{−1/(2β) − ε}.

The proofs of the preceding theorems occupy the rest of this section. We first need a technical result that says that, with uniformly positive probability, the process will reach the origin before exceeding a fixed level, provided it is not too close to the starting point.

Lemma 3.8.5. Suppose that (L0), (I), and (R) hold for a locally finite S ⊂ R+. Suppose that E[∆n | Fn] ≤ B, a.s., for some B ∈ R+. Then for any x ∈ R+ there exist y ∈ (x, ∞) and δ > 0 such that, for all n ≥ 0,

  P[ max_{n≤m≤κ1} Xm < y | Fn ] ≥ δ, on {Xn ≤ x}.

Proof. Fix x ∈ R+. Since S is locally finite and inf S = 0, S ∩ [0, x] is finite and non-empty; by (I), m := max_{y∈S∩[0,x]} m(y, 0) and ϕ := min_{y∈S∩[0,x]} ϕ(y, 0) satisfy m < ∞ and ϕ > 0. Hence P[κ1 ≤ n + m | Fn] ≥ ϕ on {Xn ≤ x}. In addition, the first moment bound in the lemma with the maximal inequality in Theorem 2.4.7 gives

  P[ max_{n≤k≤n+m} Xk ≥ y | Fn ] ≤ (Bm + x)/y,

on {Xn ≤ x}. Choosing y = 2(Bm + x)/ϕ it follows that

  P[ {κ1 ≤ n + m} ∩ { max_{n≤k≤n+m} Xk < y } | Fn ] ≥ ϕ − (Bm + x)/y ≥ ϕ/2, on {Xn ≤ x},

which gives the result.

We can now give the proof of Theorem 3.8.1.

Proof of Theorem 3.8.1. Lemma 3.7.5 shows that there exist γ0 ∈ (2α, p), C ∈ R+, and x0 < ∞ such that, for any γ ∈ (2α, γ0),

  E[X_{n+1}^γ − Xn^γ | Fn] ≤ C 1{Xn < x0}.

It follows from Theorem 2.4.8 that, for any x ∈ R+,

  E[X_{n∧σx}^γ | F0] ≤ X0^γ + C E[σx | F0].

Moreover, Lemma 3.6.4 shows that (L0), (I), and the fact that S is locally finite imply that (L3) holds; hence it follows from Proposition 3.3.4 that E[σx | F0] ≤ Kx, a.s., for some Kx < ∞ depending on x. Hence

  E[X_{n∧σx}^γ | F0] ≤ X0^γ + Kx, a.s., (3.46)

for some Kx < ∞ depending on x.

Now take x1 ∈ (x0, ∞). Define stopping times λk and ρk recursively by ρ0 = 1 and, for k ∈ N,

  λk = min{n ≥ ρ_{k−1} : Xn ≤ x0}, and ρk = min{n ≥ λk : Xn ≥ x1}.

Lemma 3.8.5 shows that we may choose x1 sufficiently large so that, for a constant δ > 0,

  P[κ1 < ρk | F_{λk}] ≥ δ, a.s. (3.47)

An application of Theorem 3.7.1 shows that, for all k ≥ 0,

  E[(λ_{k+1} − ρk)^s | F_{ρk}] ≤ C X_{ρk}^γ, a.s.,


for any s ∈ (0, γ/2). Here, it follows from (3.46) and Fatou's lemma that, for all k ≥ 0,

  E[X_{ρk}^γ | F_{λk}] ≤ x0^γ + K_{x1}, a.s.,

so that, for all k ≥ 0, E[(λ_{k+1} − ρk)^s] ≤ C for some C < ∞ depending on s, γ, x0, and x1.

Another application of Proposition 3.3.4 shows that, for all k ≥ 1, E[(ρk − λk)^s] ≤ C for some C < ∞ depending on γ and x1. Hence, by Minkowski's inequality, for all k ≥ 1, E[(λ_{k+1} − λk)^s] ≤ C for some C < ∞ depending on s, γ, x0, and x1. Then, by Markov's inequality, for all n ≥ 0 and all k ≥ 1,

  P[λ_{k+1} − λk ≥ n] ≤ C n^{−s}.

Let N = min{k ≥ 0 : ρk > κ1}. Then P[N > 1] = E[P[κ1 > ρ1 | F_{λ1}]] ≤ 1 − δ, by (3.47). Similarly, two applications of (3.47) show that

  P[N > 2] = E[ P[κ1 > ρ2 | F_{λ2}] 1{κ1 > ρ1} ] ≤ (1 − δ)²;

iterating this argument gives P[N > k] ≤ (1 − δ)^k ≤ e^{−δk} for all k ≥ 0. Moreover, by definition of N, we have κ1 < ρN ≤ λ_{N+1}, so that κ1 ≤ ∑_{k=1}^{N} (λ_{k+1} − λk). Hence

  P[κ1 ≥ n] ≤ P[N ≥ n^ε] + P[ ∑_{k=1}^{⌊n^ε⌋} (λ_{k+1} − λk) ≥ n ]
            ≤ e^{−δ n^ε} + n^ε sup_k P[λ_{k+1} − λk ≥ n^{1−ε}].

Thus for any ε > 0 and any s ∈ (0, γ/2), P[κ1 ≥ n] = O(n^{ε−(1−ε)s}). Since γ > 2α, we may choose s > α and then choose ε > 0 small enough to see that P[κ1 ≥ n] = O(n^{−α−ε}).

The proofs of Theorems 3.8.2 and 3.8.3 are accomplished by converting lower bounds on the return time to a bounded interval starting far enough away from the origin (given in Theorems 3.7.2 and 3.7.3 respectively) into lower bounds for the duration of excursions away from the origin, with the aid of Lemma 3.6.10.

Proof of Theorem 3.8.2. Recall that τy = min{n ≥ 0 : Xn = y}. Theorem 3.7.2 shows that there exists x0 ∈ R+ such that, for any x1 > x0, there is a constant c = c(x0, x1, β) > 0 such that, for all n ≥ n0 sufficiently large,

  P[λ_{x0} ≥ n | F0] ≥ c n^{−β}, on {X0 = x1}.


It follows that, for n ≥ n0,

  P[κ1 ≥ n] ≥ E[ P[κ1 ≥ n | F_{τ_{x1}}] 1{τ_{x1} < κ1} ] ≥ c n^{−β} P[τ_{x1} < κ1]; (3.48)

the result now follows from Lemma 3.6.10.

Proof of Theorem 3.8.3. Let ν > 1 − θ. Theorem 3.7.3 shows that there exists x0 ∈ R+ such that, for any x1 > x0, there is a constant c > 0 such that, for all n sufficiently large,

  P[λ_{x0} ≥ n | F0] ≥ c n^{−1} (log n)^{−ν}, on {X0 = x1}.

Another application of (3.48) and Lemma 3.6.10 now completes the proof.

Finally, we can complete the proof of Theorem 3.8.4.

Proof of Theorem 3.8.4. Under the stated conditions, Lemma 3.7.6 applies; let x0 ∈ R+ be the constant in the statement of Lemma 3.7.6. Recall that, assuming (R), we have X0 = 0. Define, for k ∈ Z+,

  Mk := max_{νk ≤ m < ν_{k+1}} Xm,

the maximum extent of the kth excursion. Then choose (and fix) x = inf S ∩ (x0, ∞); since S is locally finite, x > x0.

  P[M0 ≥ y] ≥ P[ {τx < κ1} ∩ { max_{τx≤m≤κ1} Xm ≥ y } ] = E[ 1{τx < κ1} P[ max_{τx≤m≤κ1} Xm ≥ y | F_{τx} ] ].

Here, by Lemma 3.7.6, for all y > x,

  P[ max_{τx≤m≤κ1} Xm ≥ y | F_{τx} ] ≥ c1 y^{−2β}, a.s.,

for some constant c1 > 0 depending only on β and x, but not on y. Hence, P[M0 ≥ y] ≥ c1 y^{−2β} P[τx < κ1]. Also, Lemma 3.6.10 shows that P[τx < κ1] = c2 for a constant c2 > 0 depending only on x. Hence there exists c = c(β, x) > 0 such that

  P[M0 ≥ y] ≥ c y^{−2β}, for all y > x. (3.49)


By the regenerative assumption (R), the random variables M0, M1, M2, . . . are i.i.d. and satisfy the tail bound (3.49). Hence an application of the θ = 2β, ϕ = 0 case of Lemma 3.6.13(ii) shows that, for any ε > 0, a.s., for all but finitely many k,

  max_{0≤ℓ≤k} Mℓ ≥ k^{1/(2β)} (log k)^{−1/(2β) − ε}.

The result now follows, since max_{0≤m≤νk} Xm ≥ max_{0≤ℓ≤k−1} Mℓ.

3.9 Almost-sure bounds on trajectories

In this section we present almost-sure upper and lower bounds on max_{0≤m≤n} Xm for a Lamperti process Xn. We start with the upper bounds, which we deduce from the results of Section 2.8. The first result gives a general diffusive upper bound.

Theorem 3.9.1. Suppose that (L0) holds and that for some constant C ∈ R+, E[∆n² | Fn] ≤ C, a.s. Suppose also that lim sup_{x→∞} (x μ1(x)) < ∞. Then for any ε > 0, a.s., for all but finitely many n ≥ 0,

  max_{0≤m≤n} Xm ≤ n^{1/2} (log n)^{1/2 + ε}.

Proof. Given that E[∆n² | Fn] ≤ C, μ2(x) and μ1(x) are well defined and uniformly bounded. We will apply Theorem 2.8.1 with f(x) = x². Here

  E[f(X_{n+1}) − f(Xn) | Fn] = 2Xn E[∆n | Fn] + E[∆n² | Fn] ≤ 2Xn μ1(Xn) + μ2(Xn), a.s.,

which is bounded above by a finite constant, a.s., by assumption. Then Theorem 2.8.1 yields the result.

The bound in Theorem 3.9.1 is not always sharp. In some cases, a stronger sub-diffusive upper bound holds.

Theorem 3.9.2. Let α > 1. Suppose that (L0) holds and that (L1) holds with p > 2α. Suppose also that lim sup_{x→∞} μ2(x) < ∞ and

  lim sup_{x→∞} ( 2x μ1(x) + (2α − 1) μ2(x) ) < 0.

There exists ε > 0 such that, a.s., for all but finitely many n ≥ 0,

  max_{0≤m≤n} Xm ≤ n^{1/(2α) − ε}.


Remark 3.9.3. In the Markov chain setting, Theorem 3.9.2 applies in the case of positive recurrence. Indeed, the condition for positive recurrence given in Theorem 3.2.3(iii) is

  lim sup_{x→∞} ( 2x μ1(x) + μ2(x) ) < 0;

since lim sup_{x→∞} μ2(x) < ∞, it follows that for some α > 1,

  lim sup_{x→∞} ( 2x μ1(x) + (2α − 1) μ2(x) ) < 0.

Proof of Theorem 3.9.2. Let f(x) = x^γ. Under the assumptions of the theorem, Lemma 3.7.5 shows that there exists γ > 2α for which f(Xn) is integrable and

  E[f(X_{n+1}) − f(Xn) | Fn] < 0, on {Xn ≥ x0},

for some x0 ∈ R+. Thus we may apply Theorem 2.8.1 to give the result.

With additional regularity assumptions on the increment moment functions, a consequence of the preceding results is the following corollary, which uses log-log scaling to identify (upper bounds on) scaling exponents.

Corollary 3.9.4. Suppose that (L0) holds, and that (L1) holds with p > 2. Suppose also that, for some a, b ∈ R with b > 0,

  lim_{x→∞} μ2(x) = b; lim_{x→∞} x μ1(x) = a.

(i) If 2a + b ≥ 0, then, a.s.,

  lim sup_{n→∞} (log Xn)/(log n) ≤ 1/2.

(ii) If 2a + b < 0 and (L1) holds with p > (b − 2a)/b, then, a.s.,

  lim sup_{n→∞} (log Xn)/(log n) ≤ b/(b − 2a) ∈ (0, 1/2).

Proof. First of all, for all a and b, Theorem 3.9.1 implies that

  max_{0≤m≤n} log Xm ≤ (1/2) log n + O(log log n), a.s.,


which gives part (i). For part (ii), suppose that 2a + b < 0. Then, since b > 0, we have a < 0 and b − 2a > 2b > 0. Moreover, for α > 0,

  lim sup_{x→∞} ( 2x μ1(x) + (2α − 1) μ2(x) ) = 2a + (2α − 1) b < 0,

provided α < (b − 2a)/(2b). Hence Theorem 3.9.2 shows that, a.s., for all but finitely many n, max_{0≤m≤n} Xm ≤ n^{1/(2α)}, so that

  lim sup_{n→∞} (log Xn)/(log n) ≤ 1/(2α).

Since α ∈ (0, (b − 2a)/(2b)) was arbitrary, we obtain part (ii).

Obtaining lower bounds requires stronger assumptions, and it is most convenient to use the concepts of irreducibility and regeneration introduced in Section 3.6. Under these assumptions, we can complement the upper bounds in Corollary 3.9.4 by corresponding lower bounds, to fully identify scaling exponents.

Theorem 3.9.5. Suppose that (L0), (I), and (R) hold for a locally finite S ⊂ R+, and that (L1) holds with p > 2. Suppose that, for some a, b ∈ R with b > 0,

  lim_{x→∞} μ2(x) = b; lim_{x→∞} x μ1(x) = a.

(i) If 2a + b ≥ 0, then, a.s.,

  lim sup_{n→∞} (log Xn)/(log n) = 1/2.

(ii) If 2a + b < 0 and (L1) holds with p > (b − 2a)/b, then, a.s.,

  lim sup_{n→∞} (log Xn)/(log n) = b/(b − 2a) ∈ (0, 1/2).

Proof. To prove the equalities in parts (i) and (ii), it suffices to prove only the corresponding '≥' statements, since the '≤' statements are given in Corollary 3.9.4 (under weaker conditions).

For part (i), suppose that 2a + b ≥ 0. Then, for α > 0,

  lim sup_{x→∞} ( 2x μ1(x) + (2α − 1) μ2(x) ) = 2a + (2α − 1) b < 0,

provided α ∈ (0, (b − 2a)/(2b)); fix such an α (so α < 1, since 2a + b ≥ 0). Theorem 3.8.1 applies, and shows that P[κ1 ≥ n] = O(n^{−α−ε}). An application of Lemma 3.6.13(i) shows that νk ≤ k^{1/α}, a.s., for all but finitely many k. In other words, ν_{⌊k^α⌋} ≤ k for all but finitely many k.

On the other hand, for β > 0,

  lim inf_{x→∞} ( 2x μ1(x) + (2β − 1) μ2(x) ) = 2a + (2β − 1) b > 0,

provided β > (b − 2a)/(2b); fix such a β. Then, by Theorem 3.8.4(i), for any ε > 0, a.s., for all but finitely many n,

  max_{0≤m≤n} Xm ≥ max_{0≤m≤ν_{⌊n^{α−ε}⌋}} Xm ≥ n^{(α−2ε)/(2β)}.

It follows that, for any ε > 0, a.s.,

  lim sup_{n→∞} (log Xn)/(log n) ≥ α/(2β) − ε,

which yields the '≥' half of part (i), since ε > 0, α < (b − 2a)/(2b), and β > (b − 2a)/(2b) were arbitrary.

For part (ii), suppose that 2a + b < 0. Then, for α > 0,

  lim sup_{x→∞} ( 2x μ1(x) + (2α − 1) μ2(x) ) = 2a + (2α − 1) b < 0,

provided α < (b − 2a)/(2b). Since 2a + b < 0, we may choose α ∈ (1, (b − 2a)/(2b)); fix such an α. Theorem 3.8.1 applies, and shows that E κ1 < ∞. Then the strong law of large numbers shows that k^{−1} νk → E κ1, a.s. Choosing δ > 0 so that δ E κ1 < 1, we have that ν_{⌊δk⌋} ≤ k, a.s., for all but finitely many k.

On the other hand, for β > 0,

  lim inf_{x→∞} ( 2x μ1(x) + (2β − 1) μ2(x) ) = 2a + (2β − 1) b > 0,

provided β > (b − 2a)/(2b); fix such a β (necessarily, β > α > 1). Then, by Theorem 3.8.4(ii), for any ε > 0, a.s., for all but finitely many n,

  max_{0≤m≤n} Xm ≥ max_{0≤m≤ν_{⌊δn⌋}} Xm ≥ n^{1/(2β) − ε}.

It follows that, a.s.,

  lim sup_{n→∞} (log Xn)/(log n) ≥ 1/(2β) − ε,

giving the '≥' half of part (ii), since ε > 0 and β > (b − 2a)/(2b) were arbitrary.


3.10 Transient theory in the critical case

The main result of this section is the following diffusive lower bound on the rate of escape in the transient case.

Theorem 3.10.1. Suppose that (L0) and (L3) hold, and that (L1) holds with p > 2 and δ ∈ (0, p − 2]. Suppose also that there exists θ ∈ (0, δ) such that

  lim sup_{x→∞} μ2(x) < ∞, and lim inf_{x→∞} ( 2x μ1(x) − (1 + θ) μ2(x) ) > 0. (3.50)

Then for any ε > 0, a.s., for all but finitely many n ≥ 0,

  Xn ≥ n^{1/2} (log n)^{−1/2 − 1/θ − ε}.

Remark 3.10.2. The conditions in Theorem 3.10.1 are the same as those in the transience classification result, Theorem 3.5.1, except that the non-confinement condition (L2) is replaced by the stronger local escape condition (L3), which ensures that the process does not spend too much time near zero: see also Remark 3.10.9 below. Note that if the Lamperti condition (3.29) for transience from Theorem 3.5.1 holds, then (3.50) holds for some θ > 0.

We give two examples.

Example 3.10.3. Consider again Sn symmetric SRW on Zd, and let Xn = ‖Sn‖ and Fn = σ(S0, . . . , Sn). Take d ≥ 3 (the transient case). The calculations in Example 3.5.3 show that

  lim_{x→∞} ( 2x μ1(x) − (1 + θ) μ2(x) ) = 1 − (2 + θ)/d,

which is strictly positive if θ ∈ (0, d − 2). Since in this case (L1) holds with δ = p − 2 for all p > 2, Theorem 3.10.1 shows that, for any ε > 0, a.s., all but finitely often,

  ‖Sn‖ ≥ n^{1/2} (log n)^{−1/2 − 1/(d−2) − ε}.

This result is not quite sharp. A theorem of Dvoretzky and Erdős [73] says that the sharp bound is, for any ε > 0, a.s., all but finitely often,

  ‖Sn‖ ≥ n^{1/2} (log n)^{−1/(d−2) − ε},

and this inequality fails infinitely often for ε = 0. ◁


Example 3.10.4. Continuing from Example 3.2.5, consider the simple random walk on Z+ with

  p(x, x − 1) = px = 1 − p(x, x + 1) = (1/2)(1 + c/x) + o(x^{−1}).

Then μ1(x) = 1 − 2px = −(c + o(1)) x^{−1} and μ2(x) = 1. In the transient case c < −1/2, we thus have that (3.50) holds for any θ ∈ (0, −1 − 2c), and hence Theorem 3.10.1 shows that, for any ε > 0, a.s., all but finitely often,

  Xn ≥ n^{1/2} (log n)^{−1/2 + 1/(1+2c) − ε}.

Again, this result is not quite sharp: a result of Csáki et al. [54] shows that the −1/2 in the exponent of the logarithm can be dropped. ◁

The rest of this section is devoted to the proof of Theorem 3.10.1. The starting point is the following. Recall that σy = min{n ≥ 0 : Xn ≥ y} is the first passage time into [y, ∞). To quantify the transience of the process Xn, we study

  ηx := max{n ≥ 0 : Xn ≤ x}, (3.51)

the last exit time from [0, x]. Note that ηx is not a stopping time. For y > x, let λ_{y,x} := min{n ≥ σy : Xn ≤ x} be the return time to [0, x] having reached [y, ∞). Then we have, for any y > x,

  P[ηx > n] = P[ηx > n, σy ≤ n] + P[ηx > n, σy > n].

Here

  P[ηx > n, σy ≤ n] ≤ P[ηx > σy, σy ≤ n] ≤ P[σy ≤ n, λ_{y,x} < ∞].

Hence we obtain the simple upper bound

  P[ηx > n] ≤ P[λ_{y,x} < ∞] + P[σy > n]. (3.52)

Theorem 3.10.1 will follow from an analysis of the last exit estimate (3.52).

A decreasing Lyapunov function that gives a supermartingale outside a set provides a bound on the probability of returning to that set having reached a more distant region. An increasing Lyapunov function that gives a strict submartingale outside a bounded set provides an upper bound on the hitting time of the distant region. Together these estimates can be used to estimate the rate of escape for a transient process.


To bound P[σy > n] we use an extension of the idea of the 'reverse Foster' Theorem 2.4.11. The main shortcoming with Theorem 2.4.11 is the fact that (2.19) does not permit an exceptional set. Also, to get a suitable bound on the right-hand side of (2.20) we previously imposed a uniform upper bound on the increments (see Theorem 2.8.3). In this section we address these points; in fact, the second necessitates dealing with the first, which is the purpose of the next result.

Lemma 3.10.5. Suppose that (L0) holds. Suppose that there exist a non-decreasing function f : R+ → R+, and constants x0 ≥ 0 and ε > 0, for which, for all n ≥ 0,

  E[f(X_{n+1}) − f(Xn) | Fn] ≥ ε, on {Xn ≥ x0}. (3.53)

Then for any x > 0,

  E σx ≤ ε^{−1} E f(X_{σx}) + (1 + ε^{−1} f(x0)) E ∑_{n=0}^{σx} 1{Xn ≤ x0}.

Proof. Let Yn = X_{n∧σx}. Note first that, on {Xm ≤ x0} ∩ {m < σx},

  E[f(Y_{m+1}) − f(Ym) | Fm] ≥ −f(Ym) = −f(Xm) ≥ −f(x0),

since f is non-negative and non-decreasing. So, with K = ε + f(x0), we may express (3.53) as

  E[f(Y_{m+1}) − f(Ym) | Fm] ≥ 1{m < σx} ( ε − K 1{Xm ≤ x0} ). (3.54)

The rest of the proof now follows closely the proof of Theorem 2.4.11; taking expectations in (3.54) and summing for m from 0 up to n − 1 we get

  E f(Yn) ≥ ε ∑_{m=0}^{n−1} P[σx > m] − K E ∑_{m=0}^{∞} 1{m < σx, Xm ≤ x0}.

Rearranging this inequality and using the fact that f(Yn) ≤ f(X_{σx}) we get

  ∑_{m=0}^{n−1} P[σx > m] ≤ (1/ε) E f(X_{σx}) + (K/ε) E ∑_{m=0}^{σx} 1{Xm ≤ x0}.

The statement in the lemma follows on taking n→∞.


148 Chapter 3. Lamperti’s problem

To apply Lemma 3.10.5, one must also bound E f(X_{σ_x}). This may be done by assuming an upper bound on jumps as in Theorem 2.8.3, but such an assumption is often too restrictive. The next result imposes a weaker ‘uniform f-integrability’ condition on the increments instead.

Lemma 3.10.6. Suppose that (L0) holds. Suppose that there exist a non-decreasing function f : R_+ → R_+, and constants x_0 ≥ 0 and ε > 0, for which (3.53) holds for all n ≥ 0. Let δ > 0. Suppose that for any ν > 0 there exists y ∈ R_+ such that, for all n ∈ Z_+,

E[f((1 + δ^{−1})Δ_n^+) 1{Δ_n^+ > y} | F_n] ≤ ν, a.s. (3.55)

Then there is a constant x_1 ≥ x_0 such that for all x > x_1,

E σ_x ≤ 2ε^{−1} f((1 + δ)x) + 2(1 + ε^{−1}) f(x_1) E ∑_{n=0}^{σ_x} 1{X_n ≤ x_1}.

Remark 3.10.7. For example, if f(x) = x^p for some p > 0, then a sufficient condition for (3.55) is that there exist q > p and C < ∞ for which

E[(Δ_n^+)^q | F_n] ≤ C, a.s., (3.56)

for all n ≥ 0. Indeed, in this case,

f((1 + δ^{−1})Δ_n^+) 1{Δ_n^+ > y} = (1 + δ^{−1})^p (Δ_n^+)^q (Δ_n^+)^{p−q} 1{Δ_n^+ > y}
≤ (1 + δ^{−1})^p (Δ_n^+)^q y^{p−q},

since p − q < 0, so that

E[f((1 + δ^{−1})Δ_n^+) 1{Δ_n^+ > y} | F_n] ≤ C(1 + δ^{−1})^p y^{p−q},

by (3.56). Since p − q < 0, the right-hand side tends to 0 as y → ∞, and hence (3.55) is satisfied.
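The sufficiency argument above can be sanity-checked numerically for a concrete increment law; the Pareto distribution below is a hypothetical illustration (not from the text), chosen because its truncated moments have closed forms.

```python
# Check of Remark 3.10.7 for f(x) = x^p: if E[(Δ⁺)^q] ≤ C with q > p, then
# E[f((1+δ⁻¹)Δ⁺) 1{Δ⁺ > y}] ≤ C (1+δ⁻¹)^p y^(p−q) → 0 as y → ∞.
# Hypothetical example: Δ⁺ ~ Pareto(α) on [1, ∞), density α t^(−α−1), with α > q.

alpha, p, q = 5.0, 1.5, 3.0            # α > q > p

def trunc_moment(r, y):
    """E[(Δ⁺)^r 1{Δ⁺ > y}] = α y^(r−α) / (α − r) for the Pareto(α) law (y ≥ 1)."""
    y = max(y, 1.0)
    return alpha * y ** (r - alpha) / (alpha - r)

C = trunc_moment(q, 1.0)               # E[(Δ⁺)^q] = α / (α − q)
delta = 0.5
for y in [1.0, 2.0, 10.0, 100.0]:
    lhs = (1 + 1 / delta) ** p * trunc_moment(p, y)   # E[f((1+δ⁻¹)Δ⁺) 1{Δ⁺ > y}]
    rhs = C * (1 + 1 / delta) ** p * y ** (p - q)     # the bound from (3.56)
    assert lhs <= rhs
print("bound (3.55) holds for the Pareto example")
```

Since the right-hand side decays like y^{p−q}, any target ν in (3.55) is met by taking y large enough.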

Proof of Lemma 3.10.6. We use a truncation argument. Let

f_x(y) := min{f(y), f((1 + δ)x)},

where δ > 0 is as in the statement of the lemma. We will show that a version of (3.53) holds with f_x instead of f, up to modification of the constants ε and x_0. Specifically, we claim that there is some x_1 > x_0 such that

E[f_x(X_{n+1}) − f_x(X_n) | F_n] ≥ ε/2, on {x_1 ≤ X_n < x}, (3.57)

for all x > x_1 and all n ≥ 0.

Assuming the claim (3.57), we may apply Lemma 3.10.5, since (3.57) implies that (3.54) now holds with f replaced by f_x, x_0 replaced by x_1, and ε replaced by ε/2. Then we conclude from Lemma 3.10.5 that, for all x > x_1,

E σ_x ≤ 2ε^{−1} E f_x(X_{σ_x}) + 2(1 + ε^{−1}) f_x(x_1) E ∑_{n=0}^{σ_x} 1{X_n ≤ x_1}.

The statement of the lemma now follows since, by definition of f_x, f_x(X_{σ_x}) ≤ f((1 + δ)x) and f_x(x_1) = f(x_1) for x > x_1.

It remains to verify the claim (3.57). Note first that, on {X_n < x},

f_x(X_{n+1}) − f_x(X_n) = f(X_{n+1}) − f(X_n) + (f((1 + δ)x) − f(X_{n+1})) 1{X_{n+1} > (1 + δ)x}
≥ f(X_{n+1}) − f(X_n) − f(x + Δ_n) 1{Δ_n > δx},

using the fact that {X_n < x} ∩ {X_{n+1} > (1 + δ)x} implies that Δ_n > δx and f(X_n + Δ_n) ≤ f(x + Δ_n). Hence, again using the fact that f is non-decreasing, on {X_n < x},

f_x(X_{n+1}) − f_x(X_n) ≥ f(X_{n+1}) − f(X_n) − f((1 + δ^{−1})Δ_n) 1{Δ_n > δx}.

Here we have E[f(X_{n+1}) − f(X_n) | F_n] ≥ ε on {X_n ≥ x_0}. Also, by (3.55),

E[f((1 + δ^{−1})Δ_n) 1{Δ_n > δx} | F_n] ≤ ε/2,

for all x sufficiently large. So we may choose x_1 > x_0 such that (3.57) holds, as claimed.

Denote the renewal function associated to X_n by

H(x) := E ∑_{n=0}^{∞} 1{X_n ≤ x} = ∑_{n=0}^{∞} P[X_n ≤ x]. (3.58)

Note that H(x) ≤ 1 + E η_x. It is important for applications of Lemma 3.10.5 or Lemma 3.10.6 to show that H(x_1) < ∞. The next result establishes this in the present setting.

Lemma 3.10.8. Suppose that (L0) and (L3) hold, and that (L1) holds with p > 2. Suppose also that (3.29) holds. Then, for any x ∈ R_+, E η_x < ∞.


Remark 3.10.9. This is the only point in the proof of Theorem 3.10.1 where the local escape condition (L3) (rather than just the fact that lim sup_{n→∞} X_n = ∞) is used; presumably this condition is stronger than is necessary, but it is not too restrictive.

Proof of Lemma 3.10.8. Fix x ∈ R_+. Define E(n) = {η_x ≤ n}, the event that the process never visits [0, x] after time n. For the duration of this proof, write f = f_{0,ν} as defined at (3.16). It was shown in the proof of Theorem 3.5.1 that (L0), the p > 2 case of (L1), and (3.29) imply that for some ν < 0 and some x_1 ∈ R_+,

E[f(e + X_{n+1}) − f(e + X_n) | F_n] ≤ 0, on {X_n ≥ x_1}.

An application of Lemma 3.5.7 shows that there exists y > x such that

P[E(n) | F_n] ≥ 1/2, on {X_n ≥ y}.

For this choice of y, write r = r_y and δ = δ_y for the constants in (L3). Then, on {X_n < y}, if κ_{n,y} = min{k ≥ n : X_k ≥ y},

P[E(n + r) | F_n] ≥ P[{κ_{n,y} ≤ n + r} ∩ E(κ_{n,y}) | F_n]
= E[1{κ_{n,y} ≤ n + r} P[E(κ_{n,y}) | F_{κ_{n,y}}] | F_n]
≥ (1/2) P[κ_{n,y} ≤ n + r | F_n] ≥ δ/2, a.s.,

by (L3). On the other hand, on {X_n ≥ y}, P[E(n + r) | F_n] ≥ P[E(n) | F_n] ≥ 1/2. It follows that, for all n ≥ 0,

P[E(n + r) | F_n] ≥ δ/2, a.s. (3.59)

For n ∈ N, let A(n) = ∪_{2r(n−1)≤k≤2rn} {X_k ≤ x} and G_n = F_{2rn}. Then A(n) ∈ G_n. Let τ_1 = min{n ≥ 1 : A(n) occurs}, and for k ≥ 2 set τ_k = min{n > τ_{k−1} : A(n) occurs}. Then, on {τ_k < ∞},

P[τ_{k+2} = ∞ | G_{τ_k}] ≥ P[E(2rτ_k + r) | F_{2rτ_k}] ≥ δ/2 =: q,

by (3.59), while τ_k = ∞ implies that τ_{k+2} = ∞. Hence

P[τ_{k+2} < ∞] = E[P[τ_{k+2} < ∞ | G_{τ_k}] 1{τ_k < ∞}] ≤ (1 − q) P[τ_k < ∞].


In other words, if N = max{k ≥ 1 : τ_k < ∞} denotes the number of times A(n) occurs, we have, setting τ_0 = 0,

P[N ≥ k + 2 | N ≥ k] = P[τ_{k+2} < ∞ | τ_k < ∞] ≤ 1 − q < 1,

for all k ≥ 0. It follows that P[N ≥ k] ≤ (1 − q)^{⌊k/2⌋} for all k ≥ 0, so in particular E η_x ≤ 2r E N < ∞.

Now we can complete the proof of Theorem 3.10.1.

Proof of Theorem 3.10.1. Let f(x) = x². Then

E[f(X_{n+1}) − f(X_n) | F_n] = 2X_n E[Δ_n | F_n] + E[Δ_n² | F_n]
≥ 2X_n µ̲_1(X_n) ≥ ε, on {X_n ≥ x_0},

for some constants ε > 0 and x_0 ∈ R_+, by assumption. Thus we can apply Lemma 3.10.6 with f(x) = x² and Lemma 3.10.8 to see that there exists a constant C ∈ R_+ such that E σ_y ≤ C(1 + y)², for all y ≥ 0. Hence, by Markov's inequality,

P[σ_y > n] ≤ C(1 + y)²/n, for all y > 0.

Now we turn to the other term on the right-hand side of (3.52). Let f = f_{−θ,0}. Provided θ ∈ (0, δ), we may apply the γ = −θ case of Lemma 3.4.1 to obtain, using the fact that lim sup_{x→∞} µ_2(x) < ∞,

E[f(X_{n+1}) − f(X_n) | F_n] ≤ −(θ/2)(2X_n µ̲_1(X_n) − (1 + θ)µ_2(X_n)) X_n^{−2−θ} + o_{X_n}^{F_n}(X_n^{−2−θ}).

By the hypothesis of the theorem, there exist ε > 0 and x_0 ∈ R_+ such that, for x ≥ x_0, µ̲_1(x) ≥ 0 and 2xµ̲_1(x) − (1 + θ)µ_2(x) > ε. Hence

E[f(X_{n+1}) − f(X_n) | F_n] ≤ −(θ/2)ε X_n^{−2−θ} + o_{X_n}^{F_n}(X_n^{−2−θ}).

So there exists x_1 ∈ R_+ such that

E[f(X_{n+1}) − f(X_n) | F_n] ≤ 0, on {X_n ≥ x_1}.

Then an application of Lemma 2.5.8 shows that, for any x > x_1,

P[λ_{y,x} < ∞ | F_{σ_y}] ≤ f(X_{σ_y}) / inf_{z≤x} f(z) ≤ f(y)/f(x),


since z ↦ f(z) is non-increasing.

Thus we have from (3.52) that, for all x sufficiently large,

P[η_x > n] ≤ (x/y)^θ + C(1 + y)²/n. (3.60)

Fix ε > 0. Choose

y = y(x) = x(log x)^{1/θ + ε}, and n = n(x) = x²(log x)^{1 + 2/θ + 4ε}.

Then, for all x sufficiently large,

P[η_x > n(x)] ≤ (log x)^{−1−θε} + (log x)^{−1−ε}.

So ∑_{k∈Z_+} P[η_{2^k} > n(2^k)] < ∞. Hence the Borel–Cantelli lemma shows that, a.s., η_{2^k} ≤ n(2^k) for all but finitely many k ∈ Z_+.

Any x ∈ R_+ has 2^{k(x)} ≤ x < 2^{k(x)+1} for some k(x) ∈ Z_+ with k(x) → ∞ as x → ∞. Then for all x sufficiently large, a.s.,

η_x ≤ η_{2^{k(x)+1}} ≤ n(2 · 2^{k(x)}) ≤ n(2x) ≤ 5n(x),

since n(·) is eventually increasing. So we get that for any ε > 0, a.s., for all x sufficiently large,

η_x ≤ x²(log x)^{1 + 2/θ + ε}.

In other words, since (by Theorem 3.5.1) X_n → ∞, we have that, a.s., for all but finitely many n,

n ≤ η_{X_n} ≤ X_n²(log X_n)^{1 + 2/θ + ε}.

The statement of the theorem follows.

3.11 Nullity and weak limits

For this section, we restrict attention to the Markov case as described in Section 3.2. We are interested here in the null case, when, roughly speaking, the probability mass of the process escapes to infinity. The following result gives a sufficient condition for the null property, formalized by recalling from Section 3.6 the notation for occupation times L_n(x) = ∑_{m=0}^{n} 1{X_m = x}.


Theorem 3.11.1. Suppose that (M0) and (M1) hold. Suppose that 0 < lim inf_{x→∞} µ_2(x) ≤ lim sup_{x→∞} µ_2(x) < ∞, and, for some constants θ > 0 and x_0 ∈ R_+, for all x ≥ x_0,

2xµ_1(x) + (1 + (1 − θ)/log x) µ_2(x) ≥ 0.

Then for any x ∈ S, lim_{n→∞} n^{−1}L_n(x) = 0 a.s. and in L^q for any q ≥ 1.

Proof. Under the conditions of the theorem, Theorem 3.7.3 applies, and shows that there exists x ∈ S such that E[λ_x | F_0] = ∞ on {X_0 > x}. Choose y ∈ S with y > x; then, on {τ_y < τ_0^+}, we have

E[τ_0^+ | F_{τ_y}] ≥ E[min{n ≥ 0 : X_{τ_y + n} ≤ x} | F_{τ_y}] = ∞.

Moreover, by Lemma 3.6.10, P_0[τ_y < τ_0^+] > 0, and so

E_0[τ_0^+] ≥ E_0[E[τ_0^+ | F_{τ_y}] 1{τ_y < τ_0^+}] = ∞.

In the notation of Section 3.6, we have shown that E κ_1 = ∞, and then the statement follows from Theorem 3.6.9.

To proceed further, we impose the additional assumption (M2); Theorem 3.11.1 applies when b > 0 and 2a + b ≥ 0. To obtain a non-trivial distributional limit theorem, we must scale X_n. The diffusive bounds given in Theorem 3.2.7 suggest n^{−1/2}X_n as the most likely candidate for a non-trivial limit; we will see that this is indeed the case. The limit statement will involve the distribution function F_{α,θ} defined for parameters α > 0 and θ > 0 as the (normalized) lower incomplete Gamma function,

F_{α,θ}(x) = γ(α, x²/θ)/Γ(α) = (1/Γ(α)) ∫_0^{x²/θ} z^{α−1} e^{−z} dz, (x ≥ 0). (3.61)

Note that, if Z ∼ Γ(α, θ) is a Gamma random variable with shape parameter α > 0 and scale parameter θ > 0, then P[√Z ≤ x] = F_{α,θ}(x). (In the special case with α = 1/2 and θ = 2, F_{α,θ} is the distribution function of the square root of a χ² random variable with one degree of freedom, i.e., of the absolute value of a standard normal random variable.)
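As a numerical sanity check of this special case (a sketch, not from the text): after the substitution z = w² the integrand in (3.61) is smooth, so a midpoint rule evaluates F_{α,θ} accurately; for α = 1/2, θ = 2 it should reproduce the half-normal distribution function erf(x/√2), and for α = 1, θ = 1 the law of the square root of an exponential variable.

```python
import math

def F(alpha, theta, x, steps=2000):
    """F_{α,θ}(x) = γ(α, x²/θ)/Γ(α), computed after substituting z = w²,
    which turns the integrand into the smooth function 2 w^(2α−1) e^(−w²)."""
    w_max = math.sqrt(x * x / theta)
    h = w_max / steps
    total = sum(
        2 * (k * h + h / 2) ** (2 * alpha - 1) * math.exp(-((k * h + h / 2) ** 2))
        for k in range(steps)
    )
    return total * h / math.gamma(alpha)

# α = 1/2, θ = 2: the distribution function of |Z| for Z standard normal
for x in [0.5, 1.0, 2.0]:
    assert abs(F(0.5, 2.0, x) - math.erf(x / math.sqrt(2))) < 1e-5
# α = 1, θ = 1: P[√E ≤ x] = 1 − e^(−x²) for E exponential(1)
assert abs(F(1.0, 1.0, 1.0) - (1 - math.exp(-1))) < 1e-5
print("F_{α,θ} matches the known special cases")
```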

Theorem 3.11.2. Suppose that (M0) and (M1) hold, and that (M2) holds with b > 0 and a > −b/2. Set

α = (b + 2a)/(2b), and θ = 2b. (3.62)


Then, for any x ≥ 0,

lim_{n→∞} P[n^{−1/2}X_n ≤ x] = F_{α,θ}(x).

The remainder of this section is devoted to the proof of Theorem 3.11.2. Let m ∈ N. We define an auxiliary process (X̃_{m,n}, n ≥ 0) coupled to (X_n, n ≥ 0) via truncation of increments. As usual, we write Δ_n = X_{n+1} − X_n; also write Δ̃_{m,n} = X̃_{m,n+1} − X̃_{m,n}. For some γ ∈ (0, 1) to be specified later, let

b_m(x) := (max{x, m^{1/2}})^γ.

We will define our truncation via the function

T_m(x, h) := h if |h| ≤ b_m(x); b_m(x) if h > b_m(x); −b_m(x) if h < −b_m(x).
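The truncation mechanism can be transcribed directly; this is a minimal sketch, and the value of γ below is an arbitrary illustration (the proof later fixes γ ∈ (2/p, 1)).

```python
import math

GAMMA = 0.8  # illustrative γ ∈ (0, 1) only; the proof will require γ ∈ (2/p, 1)

def b_m(m, x):
    """b_m(x) = (max{x, m^(1/2)})^γ: the truncation level."""
    return max(x, math.sqrt(m)) ** GAMMA

def T_m(m, x, h):
    """T_m(x, h): the increment h clipped to the interval [−b_m(x), b_m(x)]."""
    b = b_m(m, x)
    return max(-b, min(h, b))

m, x = 100, 4.0
assert b_m(m, x) == 10.0 ** GAMMA        # m^(1/2) = 10 dominates x = 4
assert T_m(m, x, 1.0) == 1.0             # |h| ≤ b_m(x): passed through unchanged
assert T_m(m, x, 50.0) == b_m(m, x)      # clipped from above
assert T_m(m, x, -50.0) == -b_m(m, x)    # clipped from below
```

In particular |T_m(x, h)| ≤ b_m(x) always, which is the property |Δ̃_{m,n}| ≤ b_m(X̃_{m,n}) used below.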

The pair (X_n, X̃_{m,n}), n ≥ 0, will be a time-homogeneous Markov process on R_+². We take X̃_{m,0} = X_0 ∈ R_+. The transition law of (X_n, X̃_{m,n}) is as follows. Given X_n = X̃_{m,n} ∈ R_+, we take

X_{n+1} = X_n + Δ_n, and X̃_{m,n+1} = X_n + T_m(X_n, Δ_n).

Given X_n ≠ X̃_{m,n}, we take Δ̃_{m,n} to be independent of Δ_n but with the appropriate truncated distribution, i.e., for any Borel subsets B_1, B_2 of R,

P[(Δ_n, Δ̃_{m,n}) ∈ B_1 × B_2 | (X_n, X̃_{m,n}) = (x, y)]
= P[Δ_n ∈ B_1 | X_n = x] P[T_m(y, Δ_n) ∈ B_2 | X_n = y], for all x ≠ y.

With this coupling, the process (X_n, n ≥ 0) is our original Markov process, and (X̃_{m,n}, n ≥ 0) is also time-homogeneous and Markov. Note that |Δ̃_{m,n}| ≤ b_m(X̃_{m,n}), a.s., by construction. We write

µ̃_k(m, x) = E[Δ̃_{m,n}^k | X̃_{m,n} = x],

for the moments of the increments of X̃_{m,n}. By construction of the coupled Markov processes,

P[X̃_{m,n+1} ≠ X_{n+1} | X_n = X̃_{m,n}] ≤ P[|Δ_n| > b_m(X_n) | X_n = X̃_{m,n}]
≤ P[|Δ_n| > m^{γ/2} | X_n]
≤ Cm^{−pγ/2},


by Markov's inequality and (M1). From this point onwards, we choose (and fix) γ ∈ (2/p, 1), which is possible since p > 2. Then

P[∪_{0≤n≤m} {X_n ≠ X̃_{m,n}}] ≤ C ∑_{n=0}^{m} m^{−pγ/2} = o(1),

as m → ∞. Hence, for any ε > 0,

lim_{m→∞} P[|X_m − X̃_{m,m}| > ε] = 0.

Thus to complete the proof of the theorem it suffices to show that

lim_{m→∞} P[m^{−1/2} X̃_{m,m} ≤ x] = F_{α,θ}(x).

By definition of Δ̃_{m,n}, we have that

|µ̃_k(m, x) − µ_k(x)| ≤ E[|Δ_n|^k 1{|Δ_n| > x^γ} | X_n = x],

so that for k ∈ {1, 2}, by (M1) and a similar argument to Lemma 3.4.2,

|µ̃_k(m, x) − µ_k(x)| ≤ Cx^{γ(k−p)} = o_x(x^{k−2}), (3.63)

since γ > 2/p > 1/(p−1), where the notation o_x emphasizes that the error term is uniform in m as x → ∞. Moreover, for integer k ≥ 3,

E[|Δ̃_{m,n}|^k | X̃_{m,n} = x] = E[|Δ̃_{m,n}|² |Δ̃_{m,n}|^{k−2} | X̃_{m,n} = x]
≤ b_m(x)^{k−2} E[Δ̃_{m,n}² | X̃_{m,n} = x]
≤ C b_m(x)^{k−2}. (3.64)

The distributional convergence of X̃_{m,m} will be deduced by the method of moments, based on the following result.

Lemma 3.11.3. For any j ∈ N,

lim_{m→∞} m^{−j} sup_{0≤n≤m} | E[X̃_{m,n}^{2j}] − n^j ∏_{i=1}^{j} (2a + (2i − 1)b) | = 0. (3.65)

Proof. We use induction on j. First, for j = 1, for ℓ ∈ Z_+,

E[X̃_{m,ℓ+1}² − X̃_{m,ℓ}² | X̃_{m,ℓ} = x] = E[(x + Δ̃_{m,ℓ})² − x² | X̃_{m,ℓ} = x]
= 2xµ̃_1(m, x) + µ̃_2(m, x). (3.66)


By (3.63) and our assumptions on µ_1 and µ_2,

lim_{x→∞} sup_m |xµ̃_1(m, x) − a| = 0, and lim_{x→∞} sup_m |µ̃_2(m, x) − b| = 0. (3.67)

For compactness, we write these last expressions as xµ̃_1(m, x) = a + o_x(1) and µ̃_2(m, x) = b + o_x(1), where again the subscript x on the error term indicates that it is uniform in m as x → ∞. Hence for any ε > 0 there exist x ∈ R_+ and C ∈ R_+ such that

| E[X̃_{m,ℓ+1}² − X̃_{m,ℓ}² | X̃_{m,ℓ}] − (2a + b) | ≤ ε + C 1{X̃_{m,ℓ} ≤ x}.

Hence, taking expectations,

| E[X̃_{m,ℓ+1}²] − E[X̃_{m,ℓ}²] − (2a + b) | ≤ ε + C P[X̃_{m,ℓ} ≤ x].

Summing from ℓ = 0 to ℓ = n − 1 we obtain

sup_{0≤n≤m} | E[X̃_{m,n}²] − E[X̃_{m,0}²] − n(2a + b) | ≤ mε + C ∑_{ℓ=0}^{m−1} P[X̃_{m,ℓ} ≤ x] ≤ 2mε,

for all m ≥ m_0 sufficiently large, by the null property (Theorem 3.11.1) together with the coupling of X̃_{m,ℓ} to X_ℓ. In other words,

m^{−1} sup_{0≤n≤m} | E[X̃_{m,n}²] − n(2a + b) | → 0,

as m → ∞, which is the j = 1 case of (3.65).

For the inductive step, suppose that (3.65) holds for all j ∈ {1, 2, . . . , k − 1} for some k ≥ 2. Consider

E[X̃_{m,ℓ+1}^{2k} − X̃_{m,ℓ}^{2k} | X̃_{m,ℓ} = x] = E[(x + Δ̃_{m,ℓ})^{2k} − x^{2k} | X̃_{m,ℓ} = x]
= ∑_{i=1}^{2k} \binom{2k}{i} x^{2k−i} µ̃_i(m, x),

by the binomial formula. Hence, using (3.67) to estimate the i = 1 and i = 2 terms in the sum, we obtain

E[X̃_{m,ℓ+1}^{2k} − X̃_{m,ℓ}^{2k} | X̃_{m,ℓ} = x]
= k(2a + (2k − 1)b + o_x(1)) x^{2k−2} + ∑_{i=3}^{2k} \binom{2k}{i} x^{2k−i} µ̃_i(m, x). (3.68)


Also, we have from (3.64) that, for all i ≥ 3,

|µ̃_i(m, x)| ≤ Cx^{γ(i−2)} + Cm^{γ(i−2)/2} 1{x² < m}
≤ Cx^{γ(i−2)} + Cm^{(i−2)/2} m^{−(1−γ)/2} 1{x² < m}
≤ Cx^{γ(i−2)} + Cm^{(i−2)/2} x^{−(1−γ)/4}.

Here, since γ < 1,

∑_{i=3}^{2k} \binom{2k}{i} x^{2k−i} x^{γ(i−2)} = O_x(x^{2k+γ−3}) = o_x(x^{2k−2}),

so we have from (3.68) that

| E[X̃_{m,ℓ+1}^{2k} − X̃_{m,ℓ}^{2k} | X̃_{m,ℓ} = x] − k(2a + (2k − 1)b + o_x(1)) x^{2k−2} |
≤ o_x(1) ∑_{i=3}^{2k} \binom{2k}{i} x^{2k−i} m^{(i−2)/2}.

In other words, for any ε > 0 there exist constants x ∈ R_+ and C ∈ R_+ (depending on k but not on ℓ or m) so that

| E[X̃_{m,ℓ+1}^{2k} − X̃_{m,ℓ}^{2k} | X̃_{m,ℓ}] − k(2a + (2k − 1)b) X̃_{m,ℓ}^{2k−2} |
≤ ε ∑_{i=2}^{2k} \binom{2k}{i} X̃_{m,ℓ}^{2k−i} m^{(i−2)/2} + Cm^{k−1} 1{X̃_{m,ℓ} ≤ x}.

By the inductive hypothesis (3.65) and Lyapunov's inequality, we have that for all i ∈ {2, 3, . . . , 2k} and all ℓ ≤ m,

E[X̃_{m,ℓ}^{2k−i}] ≤ (E[X̃_{m,ℓ}^{2k−2}])^{(2k−i)/(2k−2)} ≤ Cℓ^{k−(i/2)} ≤ Cm^{k−(i/2)},

for some C ∈ R_+ (depending on k but not on m). Thus, for all ℓ ≤ m,

∑_{i=2}^{2k} \binom{2k}{i} E[X̃_{m,ℓ}^{2k−i}] m^{(i−2)/2} ≤ Cm^{k−1},

where, again, C ∈ R_+ depends on k but not on m. Together with the j = k − 1 case of (3.65), it follows that for any ε > 0, for all 0 ≤ ℓ ≤ m with m ≥ m_0 sufficiently large,

| E[X̃_{m,ℓ+1}^{2k}] − E[X̃_{m,ℓ}^{2k}] − kℓ^{k−1} ∏_{i=1}^{k} (2a + (2i − 1)b) | (3.69)
≤ εm^{k−1} + Cm^{k−1} P[X̃_{m,ℓ} ≤ x]. (3.70)

Also, for any ε > 0 and x ∈ R_+, we can choose m_0 sufficiently large so that, for m ≥ m_0,

Cm^{k−1} ∑_{ℓ=0}^{m−1} P[X̃_{m,ℓ} ≤ x] ≤ εm^k.

Hence summing from ℓ = 0 to ℓ = n − 1 in (3.69) we see that, for any ε > 0, for all m sufficiently large,

sup_{0≤n≤m} | E[X̃_{m,n}^{2k}] − E[X̃_{m,0}^{2k}] − n^k ∏_{i=1}^{k} (2a + (2i − 1)b) | ≤ εm^k,

which verifies the j = k case of (3.65), and completes the induction.

We can now complete the proof of Theorem 3.11.2.

Proof of Theorem 3.11.2. By Lemma 3.11.3, for each j ∈ N, E[(m^{−1} X̃_{m,m}²)^j] → ∏_{i=1}^{j} (2a + (2i − 1)b) as m → ∞. These are exactly the moments of a Γ(α, θ) random variable Z with α and θ as at (3.62), since E[Z^j] = θ^j Γ(α + j)/Γ(α) = ∏_{i=1}^{j} (θα + θ(i − 1)) = ∏_{i=1}^{j} (2a + (2i − 1)b). The Gamma distribution is determined by its moments, so the method of moments gives m^{−1} X̃_{m,m}² →d Z, hence m^{−1/2} X̃_{m,m} →d √Z, which has distribution function F_{α,θ}. The coupling estimate lim_{m→∞} P[|X_m − X̃_{m,m}| > ε] = 0 then transfers this convergence to n^{−1/2} X_n.
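The identity underpinning the method of moments here can be checked numerically: the limits in Lemma 3.11.3 coincide with the moments E[Z^j] = θ^j Γ(α + j)/Γ(α) of a Γ(α, θ) random variable with α, θ as in (3.62). A short check, assuming only these stated formulas:

```python
import math

def limit_moment(a, b, j):
    """The scaled 2j-th moment limit from Lemma 3.11.3: ∏_{i=1}^{j} (2a + (2i−1)b)."""
    prod = 1.0
    for i in range(1, j + 1):
        prod *= 2 * a + (2 * i - 1) * b
    return prod

def gamma_moment(a, b, j):
    """E[Z^j] for Z ~ Γ(α, θ) with α = (b + 2a)/(2b), θ = 2b: θ^j Γ(α+j)/Γ(α)."""
    alpha, theta = (b + 2 * a) / (2 * b), 2 * b
    return theta ** j * math.gamma(alpha + j) / math.gamma(alpha)

for a, b in [(0.3, 1.0), (1.0, 0.5), (-0.2, 1.0)]:   # need b > 0 and 2a + b > 0
    for j in range(1, 6):
        assert abs(limit_moment(a, b, j) - gamma_moment(a, b, j)) < 1e-9 * gamma_moment(a, b, j)
print("moment limits match the Γ(α, θ) moments")
```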

3.12 Supercritical case

In this section we turn to the supercritical case where µ̲_1(x) and µ̄_1(x) are positive and of order x^{−β} for some β ∈ (0, 1), in the general setting of Section 3.3. In this case, under mild conditions, transience is assured: our primary interest is to quantify this transience by studying the rate of escape and the accompanying second-order behaviour.

As before, we need to assume a moments condition on the increments Δ_n = X_{n+1} − X_n.

(L5) Suppose that for some C ∈ R_+, E[|Δ_n|^p | F_n] ≤ C, a.s.

If (L5) holds for p ≥ 1, then µ̲_1 and µ̄_1 given by (3.5) and (3.6) exist as R-valued functions. The next result gives transience of X_n.

Theorem 3.12.1. Suppose that (L0) and (L2) hold, and that there exists β ∈ [0, 1) such that (L5) holds for some p > 1 + β, and

lim inf_{x→∞} (x^β µ̲_1(x)) > 0.

Then X_n → ∞ a.s. as n → ∞.


The main topic of this section is the quantitative asymptotic behaviour of X_n. Our results involve the constants λ(a, β) defined for a ∈ (0, ∞), β ∈ (0, 1) by

λ(a, β) := (a(1 + β))^{1/(1+β)}. (3.71)

The next result gives sharp almost-sure bounds on X_n, and shows that for β ∈ (0, 1) the transience given in Theorem 3.12.1 is super-diffusive but sub-ballistic, since the exponent 1/(1+β) is greater than 1/2 but less than 1.
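The constant λ(a, β) can be anticipated from a deterministic heuristic: a particle obeying dx/dt = a x^{−β} satisfies x(t) = (a(1 + β)t)^{1/(1+β)} = λ(a, β) t^{1/(1+β)} asymptotically. A crude Euler-scheme check of this, with illustrative parameter values only:

```python
# Deterministic heuristic behind (3.71): solve dx/dt = a x^(−β) by Euler's method
# and compare with λ(a, β) t^(1/(1+β)).  The parameter values are arbitrary.

a, beta = 2.0, 0.5
lam = (a * (1 + beta)) ** (1 / (1 + beta))   # λ(a, β) = (a(1+β))^(1/(1+β)) = 3^(2/3)

x, dt, n = 1.0, 1e-3, 1_000_000              # integrate up to t = 1000
for _ in range(n):
    x += dt * a * x ** (-beta)
t = n * dt

ratio = x / (lam * t ** (1 / (1 + beta)))
assert abs(ratio - 1) < 1e-2                 # the initial condition is forgotten
print(f"x(t) / (λ(a,β) t^(1/(1+β))) = {ratio:.5f}")
```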

Theorem 3.12.2. Suppose that (L0) and (L2) hold. Suppose that there exist β ∈ [0, 1) and a, A with 0 < a ≤ A < ∞ such that

a = lim inf_{x→∞} (x^β µ̲_1(x)) ≤ lim sup_{x→∞} (x^β µ̄_1(x)) = A, (3.72)

and (L5) holds for some p > 2 + 2β. Then, a.s.,

λ(a, β) ≤ lim inf_{n→∞} n^{−1/(1+β)} X_n ≤ lim sup_{n→∞} n^{−1/(1+β)} X_n ≤ λ(A, β).

Remark 3.12.3. The proof of the upper bound on X_n given by Theorem 3.12.2 only uses the condition on µ̄_1 in (3.72), and not the condition on µ̲_1 there.

A corollary of Theorem 3.12.2, obtained on taking a = A = ρ, is the following strong law of large numbers.

Theorem 3.12.4. Suppose that (L0) and (L2) hold. Suppose that there exists β ∈ [0, 1) such that (L5) holds for some p > 2 + 2β, and

lim_{x→∞} x^β µ̲_1(x) = lim_{x→∞} x^β µ̄_1(x) = ρ ∈ (0, ∞). (3.73)

Then

lim_{n→∞} n^{−1/(1+β)} X_n = λ(ρ, β), a.s. (3.74)

The next result shows that there is a central limit theorem to accompany the law of large numbers in Theorem 3.12.4, provided that we impose a somewhat stronger version of (3.73) and an asymptotic stability condition on the second moments of the increments. Unlike the preceding results in this section, the case β = 0 is excluded from the following theorem.

Theorem 3.12.5. Suppose that (L0) and (L2) hold. Suppose that there exist β ∈ (0, 1) and ρ ∈ (0, ∞) such that (L5) holds for some p > 2 + 2β, and

µ̲_1(x) = ρx^{−β} + o(x^{−β−(1−β)/2}); µ̄_1(x) = ρx^{−β} + o(x^{−β−(1−β)/2}). (3.75)


Suppose also that there exists σ² ∈ (0, ∞) such that

E[Δ_n² | F_n] → σ², a.s. (3.76)

Then, as n → ∞,

n^{−1/2}(X_n − λ(ρ, β) n^{1/(1+β)}) →d Zσ√((1 + β)/(1 + 3β)),

where Z is a standard normal random variable.

In the rest of this section we give the proofs of these theorems. We start with a technical result. Again we use the notation E_ε(n) = {|Δ_n| ≤ (1 + X_n)^{1−ε}} as at (3.19). The next truncation result is a relative of Lemma 3.4.2.

Lemma 3.12.6. Suppose that (L0) holds and that (L5) holds for p > 0. For any q ∈ [0, p] and ε ∈ (0, 1) there exists C ∈ R_+ for which, for all n ≥ 0,

E[|Δ_n|^q 1(E_ε^c(n)) | F_n] ≤ C(1 + X_n)^{−(p−q)(1−ε)}, a.s.

Proof. For any n ≥ 0,

|Δ_n|^q 1(E_ε^c(n)) = |Δ_n|^p |Δ_n|^{q−p} 1(E_ε^c(n)) ≤ |Δ_n|^p (1 + X_n)^{(q−p)(1−ε)}.

Taking expectations and using (L5) gives the result.

Proof of Theorem 3.12.1. We use the Lyapunov function f(x) = (1 + x)^{−γ}, γ > 0. We claim that there exist γ > 0 and x_0 ∈ R_+ such that for all n ≥ 0,

E[f(X_{n+1}) − f(X_n) | F_n] ≤ 0, on {X_n ≥ x_0}. (3.77)

Note that our assumption on µ̲_1 implies that for some c > 0 and x_1 ∈ R_+, E[Δ_n | F_n] ≥ cX_n^{−β} on {X_n ≥ x_1}. Let ε ∈ (0, 1) be such that pε < p − 1 − β. Since f is decreasing,

f(X_{n+1}) − f(X_n) ≤ (f(X_{n+1}) − f(X_n)) 1(E_ε(n)) + (f(X_{n+1}) − f(X_n)) 1{Δ_n ≤ −(1 + X_n)^{1−ε}}. (3.78)

On E_ε(n), Taylor's formula with Lagrange remainder shows that

(f(X_{n+1}) − f(X_n)) 1(E_ε(n)) = −γ(1 + X_n)^{−γ−1} (1 + O_{X_n}^{F_n}(X_n^{−ε})) Δ_n 1(E_ε(n)). (3.79)


Since (L5) holds for p > 1 + β ≥ 1, we may apply the q = 1 case of Lemma 3.12.6 to obtain

E[Δ_n 1(E_ε(n)) | F_n] = E[Δ_n | F_n] + O_{X_n}^{F_n}(X_n^{−(p−1)+pε}), (3.80)

and this last error term is o_{X_n}^{F_n}(X_n^{−β}) by choice of ε. Hence taking expectations in (3.79) and using (3.80) we obtain

E[(f(X_{n+1}) − f(X_n)) 1(E_ε(n)) | F_n] ≤ −γX_n^{−γ−1−β}(c + o_{X_n}^{F_n}(1)). (3.81)

On the other hand, since f(x) ∈ [0, 1], the q = 0 case of Lemma 3.12.6 yields

E[|f(X_{n+1}) − f(X_n)| 1{Δ_n ≤ −(1 + X_n)^{1−ε}} | F_n] ≤ P[E_ε^c(n) | F_n] = O_{X_n}^{F_n}(X_n^{−p(1−ε)}) = o_{X_n}^{F_n}(X_n^{−γ−1−β}), (3.82)

provided we take γ > 0 with γ < p − 1 − β − pε, which is possible by choice of ε. From (3.78) with (3.81) and (3.82), we therefore conclude that, for γ > 0 sufficiently small,

E[f(X_{n+1}) − f(X_n) | F_n] ≤ −(cγ + o_{X_n}^{F_n}(1)) X_n^{−γ−1−β},

which yields the claim (3.77). Transience now follows from an application of Theorem 3.5.6.

The next technical result will allow us to work with the process X_n^{1+β}.

Lemma 3.12.7. Suppose that (L0) holds, and that there exists β ∈ [0, 1) such that (L5) holds for some p > 1 + β.

(i) If lim sup_{x→∞} (x^β µ̄_1(x)) < ∞, then there exists C ∈ R_+ such that

E[X_{n+1}^{1+β} − X_n^{1+β} | F_n] ≤ C, a.s. (3.83)

(ii) Suppose that for some A ∈ (0, ∞), lim sup_{x→∞} (x^β µ̄_1(x)) ≤ A. Then for any ε > 0 there exists x ∈ R_+ such that for all n ≥ 0,

E[X_{n+1}^{1+β} − X_n^{1+β} | F_n] ≤ A(1 + β) + ε, on {X_n ≥ x}. (3.84)

(iii) Suppose that for some a ∈ (0, ∞), lim inf_{x→∞} (x^β µ̲_1(x)) ≥ a. Then for any ε > 0 there exists x ∈ R_+ such that for all n ≥ 0,

E[X_{n+1}^{1+β} − X_n^{1+β} | F_n] ≥ a(1 + β) − ε, on {X_n ≥ x}. (3.85)


(iv) Let r ∈ [1, p/(1+β)). Then there exists C ∈ R_+ such that for all n ≥ 0,

E[|X_{n+1}^{1+β} − X_n^{1+β}|^r | F_n] ≤ C(1 + X_n^{βr}), a.s. (3.86)

Proof. Let ε ∈ (0, 1) be such that pε < p − 1 − β. By a Taylor's formula argument that should now be familiar, we have that

X_{n+1}^{1+β} − X_n^{1+β} = (1 + β + O_{X_n}^{F_n}(X_n^{−ε})) X_n^β Δ_n 1(E_ε(n)) + R_n, (3.87)

where R_n = (X_{n+1}^{1+β} − X_n^{1+β}) 1(E_ε^c(n)). Since on E_ε^c(n) we have X_n ≤ |Δ_n|^{1/(1−ε)}, it follows that, by choice of ε,

|R_n| ≤ C|Δ_n|^{1+β+pε} 1(E_ε^c(n)), a.s., (3.88)

for some absolute constant C ∈ R_+. Then taking expectations in (3.88) and using the q = 1 + β + pε < p case of Lemma 3.12.6 we have that for n ≥ 0,

E[|R_n| | F_n] ≤ C(1 + X_n)^{−(p−1−β−pε)(1−ε)} = o_{X_n}^{F_n}(1),

by choice of ε. Also, as at (3.80), we have that

E[Δ_n 1(E_ε(n)) | F_n] = E[Δ_n | F_n] + o_{X_n}^{F_n}(X_n^{−β}),

by choice of ε. It follows from (3.87) that, a.s., for all n ≥ 0,

E[X_{n+1}^{1+β} − X_n^{1+β} | F_n] = (1 + β + o_{X_n}^{F_n}(1)) X_n^β E[Δ_n | F_n] + o_{X_n}^{F_n}(1).

Under the conditions of part (ii) of the lemma we have that E[Δ_n | F_n] ≤ (A + o_{X_n}^{F_n}(1)) X_n^{−β}, and (3.84) follows. Similarly (3.85) follows under the conditions of part (iii) of the lemma.

Next we prove part (iv) of the lemma. From (3.87) and (3.88), for any ε > 0 small enough,

|X_{n+1}^{1+β} − X_n^{1+β}|^r ≤ CX_n^{βr}|Δ_n|^r + C|Δ_n|^{(1+β+pε)r}, a.s.,

for a constant C ∈ R_+. Given r < p/(1+β), we may choose ε > 0 small enough so that (1 + β + pε)r ≤ p. Then (L5) shows that both E[|Δ_n|^r | F_n] and E[|Δ_n|^{(1+β+pε)r} | F_n] are uniformly bounded above. Thus (3.86) follows.

Finally, part (i) of the lemma follows on combining part (ii) with the r = 1 case of part (iv).

Under conditions of comparable strength to those in Theorem 3.12.1, we have the following upper bound, which, while relatively crude, will be useful in the proofs of the sharper results of this section.


Lemma 3.12.8. Suppose that (L0) and (L2) hold, and that there exists β ∈ [0, 1) such that (L5) holds for some p > 1 + β, and

lim sup_{x→∞} (x^β µ̄_1(x)) < ∞.

Then for any ε > 0, a.s., for all but finitely many n ≥ 0,

max_{0≤k≤n} X_k ≤ n^{1/(1+β)} (log n)^{1/(1+β)+ε}. (3.89)

Proof. Lemma 3.12.7(i) shows that under the conditions of the lemma, we may apply Theorem 2.8.1 with f(x) = x^{1+β} and a(x) = x(log x)^{1+ε}, which yields the result.

We now work towards a proof of Theorem 3.12.2. For notational convenience, for the rest of this section we write

Y_n := X_n^{1+β}, and D_n := E[Y_{n+1} − Y_n | F_n].

The Doob decomposition for Y_n (see Theorem 2.3.1) is

Y_n = M_n + A_n, (3.90)

where A_0 := 0, A_n := ∑_{k=0}^{n−1} D_k for n ≥ 1, and M_n is an F_n-adapted martingale. The next result gives conditions for A_n to be a good approximation for X_n^{1+β}.

Lemma 3.12.9. Suppose that (L0) holds, and that there exists β ∈ [0, 1) such that (L5) holds for some p > 2 + 2β. If lim sup_{x→∞} (x^β µ̄_1(x)) < ∞, then

lim_{n→∞} n^{−1} |X_n^{1+β} − A_n| = 0, a.s.

Proof. Since M_n = Y_n − ∑_{k=0}^{n−1} D_k is a martingale, we have that

E[M_{n+1}² − M_n² | F_n] = E[(M_{n+1} − M_n)² | F_n]
= E[(Y_{n+1} − Y_n − D_n)² | F_n]
= E[(Y_{n+1} − Y_n)² | F_n] − D_n², (3.91)

where we have expanded the term (Y_{n+1} − Y_n − D_n)² and used the fact that D_n is F_n-measurable. Now by (3.91) and the r = 2 case of (3.86) (which is valid since p > 2(1 + β)) we have that for all n ≥ 0,

E[M_{n+1}² − M_n² | F_0] = E[E[M_{n+1}² − M_n² | F_n] | F_0] ≤ C + C E[X_n^{2β} | F_0].


Now since β ∈ [0, 1), the (conditional) Jensen inequality implies that

E[X_n^{2β} | F_0] ≤ (E[X_n^{1+β} | F_0])^{2β/(1+β)} ≤ (X_0^{1+β} + Cn)^{2β/(1+β)},

by (3.83). Thus M_n² is a nonnegative submartingale with

E[M_n² | F_0] ≤ Y_0² + ∑_{k=0}^{n−1} E[M_{k+1}² − M_k² | F_0] ≤ X_0^{2+2β} + n(X_0^{1+β} + Cn)^{2β/(1+β)}.

Doob's inequality (Theorem 2.3.14) then implies that for any ε > 0,

P[max_{0≤k≤n} M_k² > n^{(1+3β)/(1+β)+ε}] ≤ n^{−(1+3β)/(1+β)−ε} E[M_n²] = O(n^{−ε}).

Hence the Borel–Cantelli lemma implies that for any ε > 0, a.s.,

max_{0≤k≤2^m} |M_k| ≤ (2^m)^{(1+3β)/(2+2β)+ε},

for all but finitely many m ∈ Z_+. Since for any n ∈ N we have 2^{m(n)} ≤ n < 2^{m(n)+1} for some m(n) ∈ Z_+ with m(n) → ∞ as n → ∞, we have that for any ε > 0, a.s., for all but finitely many n ≥ 0,

max_{0≤k≤n} |M_k| ≤ max_{0≤k≤2^{m(n)+1}} |M_k| ≤ (2^{m(n)+1})^{(1+3β)/(2+2β)+ε} ≤ Cn^{(1+3β)/(2+2β)+ε},

for some C ∈ R_+ not depending on n. Since β ∈ [0, 1), we may take ε small enough so that (1+3β)/(2+2β) + ε ≤ 1 − ε. Then we have that |A_n − Y_n| = O(n^{1−ε}) as n → ∞, a.s.

Proof of Theorem 3.12.2. Under the conditions of the theorem, we have from Lemma 3.12.7 that (3.84) and (3.85) hold. Moreover, we know from Theorem 3.12.1 that X_t → ∞ as t → ∞, a.s., so that the ε terms in (3.84) and (3.85) may be taken to be arbitrarily small for all t large enough. Hence with A_t given in (3.90) we have that for any ε > 0, a.s.,

a(1 + β) − ε ≤ t^{−1}A_t ≤ A(1 + β) + ε,

for all but finitely many t ∈ Z_+. Now from Lemma 3.12.9 we have that X_t^{1+β} = A_t + o(t), so that for any ε > 0, a.s.,

a(1 + β) − ε ≤ t^{−1}X_t^{1+β} ≤ A(1 + β) + ε,

for all but finitely many t ∈ Z_+. This proves the theorem.


The basic ingredients of the proof of Theorem 3.12.5 are already in place in the decomposition used in the proof of Lemma 3.12.9, but we need to revisit some of our earlier estimates and obtain sharper bounds under the conditions of Theorem 3.12.5. First we have the following refinement of Lemma 3.12.7 in this case.

Lemma 3.12.10. Suppose that (L0) holds and that (3.75) holds for β ∈ (0, 1) and ρ ∈ (0, ∞). Suppose that (L5) holds for some p > 2 + 2β. Then, as t → ∞,

E[X_{t+1}^{1+β} − X_t^{1+β} | F_t] = ρ(1 + β) + o(t^{−(1−β)/(2+2β)}), a.s. (3.92)

If in addition (3.76) holds for σ² ∈ (0, ∞), then as t → ∞,

E[(X_{t+1}^{1+β} − X_t^{1+β})² | F_t] = σ²(1 + β)² λ(ρ, β)^{2β} t^{2β/(1+β)} (1 + o(1)), a.s. (3.93)

Proof. We need to obtain better estimates for the error terms in (3.87) than we did in the proof of Lemma 3.12.7. For this reason the truncation exponent, which we now call δ, will not be arbitrarily small, so we cannot use (3.88). Instead, from the display before (3.88) we have that

|R_t| ≤ C|Δ_t|^{1+β/(1−δ)} 1(E_δ^c(t)),

so that from Lemma 3.12.6,

E[|R_t| | F_t] ≤ C(1 + X_t)^{(1+β/(1−δ)−p)(1−δ)} = C(1 + X_t)^{1−δ+β−(1−δ)p}.

Now take δ = (1−β)/2 + ε for ε ∈ (0, 1/2). It follows that provided p > 2 and ε is small enough, E[|R_t| | F_t] = o(X_t^{−(1−β)/2}). Moreover, we have that, again taking ε small enough and using the fact that p > 2,

E[Δ_t 1(E_δ(t)) | F_t] = E[Δ_t | F_t] + o(X_t^{−β−(1−β)/2}) = ρX_t^{−β} + o(X_t^{−β−(1−β)/2}), a.s.,

by (3.75). With these sharper bounds, from (3.87) and the present choice of δ we obtain

E[X_{t+1}^{1+β} − X_t^{1+β} | F_t] = (1 + β)ρ + o(X_t^{−(1−β)/2}), a.s.

Under the conditions of the lemma, Theorem 3.12.4 applies, so that X_t ∼ λ(ρ, β) t^{1/(1+β)}, a.s. Thus we obtain (3.92). The argument for (3.93) is similar, starting by squaring (3.87) and then taking δ > 0 small enough, so we omit the details.


Lemma 3.12.11. Suppose that (L0) holds and that (3.75) holds for β ∈ (0, 1) and ρ ∈ (0, ∞). Suppose that (L5) holds for some p > 2 + 2β and that (3.76) holds for σ² ∈ (0, ∞). Then with M_t as defined at (3.90), we have that as t → ∞,

t^{−(1+3β)/(2+2β)} M_t →d Zσλ(ρ, β)^β √((1 + β)³/(1 + 3β)),

where Z is a standard normal random variable.

Proof. We will apply a standard martingale central limit theorem. Set M_{t,s} = t^{−(1+3β)/(2+2β)} (M_s − M_{s−1}) for 1 ≤ s ≤ t. For fixed t, (M_{t,s})_s is a martingale difference sequence with E[M_{t,s} | F_{s−1}] = 0. Moreover,

∑_{s=1}^{t} E[M_{t,s}² | F_{s−1}] = t^{−(1+3β)/(1+β)} ∑_{s=1}^{t} E[(M_s − M_{s−1})² | F_{s−1}]
= t^{−(1+3β)/(1+β)} ∑_{s=1}^{t} (E[(X_s^{1+β} − X_{s−1}^{1+β})² | F_{s−1}] + O(1)), a.s.,

by (3.91), using the fact that |D_t| = O(1) by (3.92). Now applying (3.93) we obtain

∑_{s=1}^{t} E[M_{t,s}² | F_{s−1}] = t^{−(1+3β)/(1+β)} σ²(1 + β)² λ(ρ, β)^{2β} (1 + o(1)) ∑_{s=1}^{t} s^{2β/(1+β)}
= σ² ((1 + β)³/(1 + 3β)) λ(ρ, β)^{2β} + o(1), a.s. (3.94)

We also need to verify a form of the conditional Lindeberg condition. We claim that

∑_{s=1}^{t} E[M_{t,s}² 1{|M_{t,s}| > ε} | F_{s−1}] = o(1), a.s., (3.95)

for any ε > 0. To see this, take q ∈ (2, p/(1+β)). By the elementary inequality |M_{t,s}|² 1{|M_{t,s}| > ε} ≤ ε^{2−q} |M_{t,s}|^q we have that, for any ε > 0,

E[M_{t,s}² 1{|M_{t,s}| > ε} | F_{s−1}] = O(E[|M_{t,s}|^q | F_{s−1}]).

Then we have that

E[|M_{t,s}|^q | F_{s−1}] = t^{−(1+3β)q/(2+2β)} E[|M_s − M_{s−1}|^q | F_{s−1}]
= t^{−(1+3β)q/(2+2β)} E[|X_s^{1+β} − X_{s−1}^{1+β} − D_{s−1}|^q | F_{s−1}],

where by (3.92), |D_{s−1}| = O(1), and by the r = q case of (3.86),

E[|X_s^{1+β} − X_{s−1}^{1+β}|^q | F_{s−1}] ≤ CX_{s−1}^{βq} ≤ Cs^{βq/(1+β)}, a.s.,

by Theorem 3.12.4. Hence by Minkowski's inequality,

E[|M_{t,s}|^q | F_{s−1}] ≤ Ct^{−(1+3β)q/(2+2β)} s^{βq/(1+β)}, a.s.,

for some C ∈ (0, ∞). Thus we obtain

∑_{s=1}^{t} E[M_{t,s}² 1{|M_{t,s}| > ε} | F_{s−1}] ≤ Ct^{1 + βq/(1+β) − (1+3β)q/(2+2β)}, a.s.

Since the exponent here equals 1 − q/2 < 0 for q > 2, this last bound verifies (3.95). Given (3.94) and (3.95), we can apply a standard central limit theorem for martingale differences (e.g. [19, Theorem 35.12, p. 476]) to complete the proof.

Proof of Theorem 3.12.5. Recall the decomposition at (3.90). Under the conditions of the theorem, Theorem 3.12.4 and the proof of Lemma 3.12.9 imply that M_t = o(A_t), a.s., so that

X_t = (A_t + M_t)^{1/(1+β)} = A_t^{1/(1+β)} + (1/(1+β)) M_t A_t^{−β/(1+β)} (1 + o(1)).

Here we have from (3.92) that, a.s., A_t = ρ(1 + β)t + o(t^{(1+3β)/(2+2β)}). It follows that, a.s.,

X_t = λ(ρ, β) t^{1/(1+β)} + o(t^{1/2}) + (1/(1+β)) λ(ρ, β)^{−β} t^{−β/(1+β)} M_t (1 + o(1)).

Rearranging, we obtain, a.s.,

(X_t − λ(ρ, β) t^{1/(1+β)}) / t^{1/2} = (1/(1+β)) λ(ρ, β)^{−β} t^{−(1+3β)/(2+2β)} M_t (1 + o(1)) + o(1).

Now on letting t → ∞, Lemma 3.12.11 completes the proof.


3.12.1 Example: Birth-and-death chains

Suppose that X_n is an irreducible time-homogeneous Markov chain supported on the countable set S = Z_+ with jumps of size at most 1. Specifically, suppose that there exist sequences a_x, b_x, c_x (x ∈ N := {1, 2, 3, . . .}) with a_x > 0, b_x ≥ 0, c_x > 0 and a_x + b_x + c_x = 1 for all x ∈ N. Define the transition law of X_n for t ∈ Z_+ as follows: for x ∈ N,

P[X_{t+1} = x + 1 | X_t = x] = a_x,
P[X_{t+1} = x | X_t = x] = b_x,
P[X_{t+1} = x − 1 | X_t = x] = c_x,

and with reflection from 0 governed by P[X_{t+1} = 1 | X_t = 0] = 1. Of course in this setting X_n has uniformly bounded increments.

For x ∈ N, with µ_1(x) and µ_2(x) defined by (3.1) we have that

µ_1(x) = a_x − c_x, µ_2(x) = 1 − b_x > 0.

The asymptotically zero drift case is the case where lim_{x→∞}(a_x − c_x) = 0. There is an extensive literature concerned with various special cases where |xµ_1(x)| = O(1). For recent work, we refer to [56, 54, 53]; the papers of Csáki, Földes and Révész cited there include references to some of the older literature. We are in the supercritical case if, for β ∈ (0, 1),

lim_{x→∞} x^β (a_x − c_x) = ρ ∈ (0, ∞). (3.96)

In this case the following law of large numbers is due to Voit [282, Theorem 2.11] (in fact Voit works in a more general setting of random walks on polynomial hypergroups, which does not concern us here). Note that there is a misprint in the limiting constant in the statement of Theorem 2.11 of [282] (the proof there does yield the correct constant): the 1/(1 + α) power should be applied to the entire limiting expression, not just the µ there; this typo persists into [54, Theorem D].

Proposition 3.12.12 ([282, Theorem 2.11]). Suppose that Xn is a birth-and-death chain specified by ax, bx, cx as described above. Suppose that (3.96) holds for β ∈ (0, 1) and ρ ∈ (0, ∞). Suppose also that the two limits
$$\lim_{x\to\infty} a_x \quad \text{and} \quad \lim_{x\to\infty} c_x \quad \text{exist in } (0,1), \tag{3.97}$$
(in which case they must take the same value). Then (3.74) holds.


Proposition 3.12.12 is a special case of our Theorem 3.12.4, under the additional assumption (3.97). Theorem 3.12.4 shows that the assumption (3.97) is not necessary for the result: only the mean is important, not the absolute probabilities of going left or right.

Our Theorem 3.12.5 above has the following immediate (and apparently new) consequence in the birth-and-death chain case. Under the assumption that limx→∞ bx = b, (3.97) holds (with limit $\frac{1-b}{2}$ for both ax and cx), so this central limit theorem can be seen as the natural companion to Voit's law of large numbers [282, Theorem 2.11].

Theorem 3.12.13. Suppose that Xn is a birth-and-death chain specified by ax, bx, cx as described above. Suppose that for some β ∈ (0, 1) and ρ ∈ (0, ∞),
$$a_x - c_x = \rho x^{-\beta} + o\big(x^{-\beta - \frac{1-\beta}{2}}\big); \qquad \lim_{x\to\infty} b_x = b \in [0, 1).$$
Then as t → ∞, for a standard normal random variable Z,
$$\frac{X_t - \lambda(\rho,\beta)\, t^{1/(1+\beta)}}{t^{1/2}} \xrightarrow{d} Z \sqrt{\frac{(1-b)(1+\beta)}{1+3\beta}}. \tag{3.98}$$

We make some final remarks on the case, of secondary interest to us here, where β = 0. Our Theorem 3.12.4 applies to the case β = 0, i.e., where ax − cx → ρ ∈ (0, 1] as x → ∞, in which case our result says that $t^{-1}X_t \to \rho$ a.s. as t → ∞. This particular result has been previously obtained by Pakes [228, Proposition 4], under some more restrictive conditions, including ρ = 1 and bx ≡ 0, and also, for general ρ but again under conditions more restrictive than ours, in a result of Voit [281, Corollary 2.6]. In the case β = 0, the second-order behaviour of Xn is somewhat different (our Theorem 3.12.13 does not apply). See for instance [228, Theorem 7] and [281, Theorems 2.7–2.10].
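The law of large numbers (3.74) and the constant λ(ρ, β) are easy to probe numerically. The Python sketch below is illustrative only: the parameter values are invented, and the closed form λ(ρ, β) = ((1+β)ρ)^{1/(1+β)} is an assumption read off from the proof sketch earlier in this section ($A_t \approx \rho(1+\beta)t$ together with $X_t \approx A_t^{1/(1+\beta)}$), not a formula restated here.

```python
import random

def bd_chain(beta, rho, b, steps, seed):
    """Birth-and-death chain with a_x - c_x = rho * x**(-beta), holding
    probability b_x = b, and reflection P[X_{t+1} = 1 | X_t = 0] = 1."""
    rng = random.Random(seed)
    x = 0
    for _ in range(steps):
        if x == 0:
            x = 1
            continue
        drift = rho * x ** (-beta)      # a_x - c_x; valid as long as rho <= 1 - b
        a_x = (1.0 - b + drift) / 2.0   # probability of a step up
        u = rng.random()
        if u < a_x:
            x += 1
        elif u >= a_x + b:              # step down, probability c_x = a_x - drift
            x -= 1
    return x

beta, rho, b, T = 0.5, 0.4, 0.2, 100_000
lam = ((1 + beta) * rho) ** (1 / (1 + beta))  # assumed form of lambda(rho, beta)
avg = sum(bd_chain(beta, rho, b, T, seed) for seed in range(5)) / (5 * T ** (1 / (1 + beta)))
print(avg, lam)  # avg should settle near lam as T grows
```

Averaging a handful of independent runs damps the $t^{1/2}$-scale fluctuations that Theorem 3.12.13 quantifies.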

3.13 Proofs for the Markovian case

We now record the proofs of the results stated in Section 3.2; this is mostly simply collecting results from the previous sections. We start with the recurrence classification.

Proof of Theorem 3.2.3. The transience condition in part (i) is a consequence of Theorem 3.5.1; Theorem 3.6.8 demonstrates the equivalence of the two


notions of transience. The positive-recurrence condition in part (iii) is a consequence of the α = 1 case of Theorem 3.8.1. In part (ii), recurrence follows from Theorem 3.5.2; Theorem 3.6.8 demonstrates the equivalence of the two notions of recurrence. Finally, the null property in part (ii) follows from Theorem 3.7.3.

Next we give the proof of the result on moments of return times.

Proof of Theorem 3.2.6. The existence of moments statement in part (i) follows from Theorem 3.8.1, while the non-existence of moments in part (ii) follows from Theorem 3.8.2.

Finally, the almost-sure bounds on trajectories.

Proof of Theorem 3.2.7. This is a combination of the bounds from Theorem 3.9.5 with the lower bound from Theorem 3.10.1 in the transient case for the ‘lim inf’ half of part (i).

Bibliographical notes

Section 3.1

The foundations for the material in this chapter were laid in the early 1960s by J. Lamperti in a seminal series of papers [173, 174, 175]. Lamperti systematically studied conditions under which the asymptotic behaviour of a non-negative real-valued discrete-time stochastic process is governed by the (first two) moment functions of its increments. These three papers have provided a rich source of material for subsequent theoretical work, as well as providing key evidence of the power of the semimartingale approach for studying multidimensional processes via Lyapunov functions. Some of the chapters later in this book describe examples of the application of this method to near-critical processes, often in cases where it is hard to find any serious methodological competitor.

Example 3.1.1 gives a hint of the ubiquity of Lamperti-type situations. As remarked in the text, if ξn is Markov then typically Xn = f(ξn) is not: see Section 1.3 and the accompanying bibliographical notes for Chapter 1.

Many-dimensional zero-drift random walks satisfying the conditions of Example 3.1.1 may be either recurrent or transient, as we will see in Chapter 4; this contrasts with the one-dimensional case where recurrence is assured under similar conditions, as shown in Theorem 2.5.6.


Section 3.2

The Markovian recurrence classification, Theorem 3.2.3, improves a little on Lamperti’s original results from [173, 175] in the Markov case. Theorem 3.2.3 is usually sufficiently sharp for any application, but can be sharpened near the phase boundaries; an ultimate refinement is given by Menshikov et al. [208]. Theorem 3.2.3 parts (i) and (ii) are contained in Theorem 3.1 of [173] and Theorem 2.1 of [175] respectively; part (iii) is sharper than Lamperti’s results, and is contained in the results of [208].

Nearest-neighbour walks in the vein of Example 3.2.5 have been extensively studied for a long time; these models also go under the name birth-and-death chains. In this context, the observation that the regime where xµ1(x) = O(1) is critical from the point of view of the recurrence classification goes back at least to Harris [113]; another early contribution is [115]. In certain cases, nearest-neighbour models are amenable to explicit calculation, facilitated by reversibility and associated algebraic structure. An intricate apparatus was introduced by Karlin and McGregor [139, 140], using orthogonal polynomials to provide a spectral representation, to give a deep analysis of models such as

$$P[X_{n+1} = X_n \pm 1 \mid X_n = x] = \frac{1}{2} \mp \frac{\delta}{4x + 2\delta},$$

for x > 0 and a parameter δ, which is an instance of Example 3.2.5 with c = −δ/2. This and closely related models were considered by many subsequent authors, including for instance [249, 80, 81, 82, 275, 99, 281] and, more recently, [54, 53, 56, 126, 2, 165]. Alexander [2] calls such nearest-neighbour random walks with drift $O(x^{-1})$ at x ‘Bessel-like’. See e.g. [48] for a survey of methods based on orthogonal polynomials.
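As a quick check of the claim that this walk is an instance of Example 3.2.5 with c = −δ/2: its one-step mean drift at x is $a_x - c_x = -2\delta/(4x+2\delta) = -\delta/(2x+\delta) \sim -\delta/(2x)$. A minimal numerical confirmation (the function name and the parameter value are ours, purely for illustration):

```python
def kmg_drift(x, delta):
    """One-step mean drift of the Karlin-McGregor walk at site x."""
    up = 0.5 - delta / (4 * x + 2 * delta)
    down = 0.5 + delta / (4 * x + 2 * delta)
    return up - down  # equals -delta/(2x + delta)

for x in (10, 100, 1000):
    print(x, x * kmg_drift(x, 1.0))  # tends to -delta/2 = -0.5 as x grows
```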

Theorem 3.2.6 on moments of return times in the Markov case is a consequence of the more general results in Section 3.7; see the notes below on that section for bibliographical details. Likewise, Theorem 3.2.7 is a consequence of the more general results on almost-sure bounds in Sections 3.9 and 3.10; see the notes below on the relevant sections.

One possible extension to the models of this section is to move from the half-line R+ to a Markov process on a half strip A × Z+, where A is a finite set, in which the process has different Lamperti-type characteristics on each copy of Z+, and is close to being Markov on A: see [102].


Section 3.3

For all the reasons outlined in the text, the importance of a setting more general than the Markovian one of Section 3.2 was emphasized by Lamperti [173]. The formulation (3.5) and (3.6) builds on the original formulation of [173], and rectifies an omission in a similar earlier version in [215].

The representation (3.7) is a very general stochastic difference equation; stochastic recursions of this form have received much attention in various contexts over the years, particularly when ψ1,n ≡ 0, including stochastic growth models (see e.g. [142]) and stochastic approximation (see e.g. [232]).

The natural non-confinement condition (L2) was used by Lamperti [173], and avoids technicalities involving irreducibility in general spaces. The uniform ellipticity assumption discussed in Example 3.3.5 is common in the literature on multidimensional random walks.

Section 3.4

The Lyapunov function computations presented in this section are variations on a well-established theme. The closest presentation in the literature is that of [121]. Instead of the truncation near zero in defining $f_{\gamma,\nu}$ at (3.16), an alternative is to take, say, $(e+x)^{\gamma}\log^{\nu}(e+x)$, but this makes some of the computations a little more cumbersome.

Section 3.5

The transience criterion, Theorem 3.5.1, is the transience half of Theorem 3.2 of [173]. The recurrence criterion, Theorem 3.5.2, improves on the recurrence half of Theorem 3.2 of [173], which admits θ = 1 but not θ ∈ (0, 1). Formally, Theorem 3.5.2 is not contained in the results of [208], which are stated in the Markovian case, although it is likely that these sharper results can be extended to the more general setting.

The fact that symmetric SRW on $\mathbb{Z}^d$ is recurrent if and only if d ≤ 2 is a seminal result of Pólya [235]; the fact that the statement extends to non-degenerate homogeneous random walks whose increments have finite second moments is a consequence of the equally famous Chung–Fuchs theorem (see [37] or Chapter 9 of [137]). Pólya’s original proof is based on path-counting and generating functions, and is still the most widely-presented proof of the SRW result. The proof of the Chung–Fuchs theorem is by Fourier analysis. The approach via Lyapunov functions, as presented in Examples 3.5.3 and 3.5.4, is due to Lamperti [173], and represents a significant breakthrough, not only in clarifying the probabilistic intuition behind Pólya’s result, but also


in providing a powerful methodology that is applicable in greater generalitythan combinatorial or analytical approaches.

The fundamental semimartingale results Theorems 3.5.6 and 3.5.8 build on ideas of [173], as well as being relatives of the results of Foster presented in Chapter 2. Lamperti [173] worked under the non-confinement assumption (L2), and his formulations are somewhat different. The formulation here, which permits cases where (L2) does not hold, is closely related to that contained in Kersting [146].

The analysis of branching processes via Lamperti-type methods, as in Example 3.5.9, goes back to Lamperti himself in a different context [176]. A reference for the classical theory of Galton–Watson processes is [10].

Section 3.6

The material on irreducibility and regeneration and their consequences on countable state-spaces is standard, although no single reference covers the presentation in this section. A useful reference for regenerative processes is Chapter VI of [5].

The ergodic theorem for occupation times (Theorem 3.6.9) is a standard result, and the regenerative structure and irreducibility combine to give an easy and natural proof. The statement of the result given here is not always apparent in the standard references, which often focus on the (deeper) result $\lim_{n\to\infty} P[X_n = x] = \pi_x$, which requires aperiodicity (cf. Theorem 2.1.6). Under suitable conditions, there is a central limit theorem to accompany Theorem 3.6.9, namely,
$$\frac{L_n(x) - n\pi_x}{\sqrt{n}}$$
converges in distribution to a centred normal distribution with finite positive variance: see e.g. §6.9 of [229] or §VI.3 of [5] for further details.

More generally than Theorem 3.6.9, similar ergodicity results hold for time-averages of other functionals $f : S \to \mathbb{R}$, provided they are integrable with respect to the limiting measure $\pi_x$. Specifically, if (I) and (R) hold for a countable S with 0 ∈ S, and if $E\kappa_1 < \infty$, then, a.s.,
$$\lim_{n\to\infty} \frac{1}{n} \sum_{m=0}^{n-1} f(X_m) = \lim_{n\to\infty} \frac{1}{n} \sum_{x\in S} L_n(x) f(x) = \sum_{x\in S} \pi_x f(x),$$
provided $\sum_{x\in S} \pi_x |f(x)| < \infty$; the proof is similar to that given for Theorem 3.6.9. Compare e.g. [5, Theorem VI.3.1, p. 178].
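The displayed ergodic identity is easy to illustrate on a toy chain whose stationary distribution can be computed by hand. All the numerical choices below are ours, purely for illustration (and this chain has uniformly negative drift, so it is not a Lamperti-type example): take ax = 0.3, bx = 0.1, cx = 0.6 for x ≥ 1, with reflection at 0. Detailed balance gives π0 = 3/13 and πx = (5/13)(1/2)^{x−1} for x ≥ 1, so that for f(x) = x one gets Σx πx f(x) = 20/13.

```python
import random

def run(steps, seed):
    """Simulate the toy reflected birth-and-death chain; return the
    time-average of f(x) = x and the occupation frequency of state 0."""
    rng = random.Random(seed)
    x, total, visits0 = 0, 0, 0
    for _ in range(steps):
        total += x
        visits0 += (x == 0)
        if x == 0:
            x = 1        # reflection: P[X_{t+1} = 1 | X_t = 0] = 1
            continue
        u = rng.random()
        if u < 0.3:      # a_x = 0.3: step up
            x += 1
        elif u < 0.9:    # c_x = 0.6: step down
            x -= 1
        # else stay put, b_x = 0.1
    return total / steps, visits0 / steps

avg_f, occ0 = run(400_000, seed=2)
print(avg_f, occ0)  # near 20/13 ≈ 1.5385 and pi_0 = 3/13 ≈ 0.2308
```

Both time-averages approach the corresponding stationary averages, as the identity above predicts.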


Much of the discussion in this section can be extended to more general state-spaces, at the expense of technical complications: in that context, detailed accounts of the relevant concepts may be found in the books [225, 243, 224, 217]. Roughly speaking, to obtain analogous results to those presented here, one needs a concept that fulfils not only the role of irreducibility, but also that of the local finiteness of the state-space.

Although we do not discuss it in this book, it is an interesting problem to determine the x-asymptotics of the stationary distribution πx for a positive-recurrent Lamperti process on a locally finite subset of R+, for which the conditions (M0), (M1) and (M2) hold with 2a < −b, say. In the case of a Markov process with uniformly bounded increments, Menshikov and Popov [209] showed that
$$\pi(x) = x^{(2a/b)+o(1)}, \quad \text{as } x \to \infty;$$

related results appear in [7]. Denisov et al. [61] obtain sharper asymptotics for the tail $\sum_{y \geq x} \pi_y$ under weaker conditions; see also [164]. The results of Menshikov and Popov [209] also address the rate of convergence to stationarity; see also [165] in the case of a nearest-neighbour random walk. This is one sense in which Lamperti processes exhibit only polynomial ergodicity rather than geometrical ergodicity.

Sections 3.7 and 3.8

As described in the bibliographical notes for Section 2.7, results on the existence and non-existence of moments of passage times originate with Lamperti [175] (for positive integer moments) and were extended in [9] and subsequently in [6, 8].

Theorem 3.7.1 is closely related to Proposition 1 of [9], which was stated in the Markovian case. This extended Theorem 2.2 of [175], which applied only to integer moments, and was again stated in the Markovian case, although Lamperti indicated how it held more generally.

Theorems 3.7.2 and 3.7.3 on lower tail bounds and non-existence of moments extend Proposition 2 of [9] and Theorem 3.2 of [175], which were again stated in the Markovian case. The results of [9] did not provide a tail bound, while [175] applied only to integer moments. The tail bound in Theorem 3.7.2 can be derived, under more restrictive assumptions (including uniformly bounded increments), from Corollary 1 of [6]. The proofs of the lower tail bounds given here, which use Lemma 2.7.7 and lower tail bounds on excursions, are related to the arguments in [121], where related excursion bounds are given under slightly different conditions.


The fact that tails of passage times for Lamperti processes satisfy upper and lower bounds of polynomial order is typical of near-critical systems, which exhibit polynomial behaviour; in the case of a positive-recurrent Markov chain, this is polynomial ergodicity in another sense (closely related to that described above for stationary distributions).

In the special case of nearest neighbour random walks, Fal’ [80] givesasymptotics for excursion times and the number of excursions.

Section 3.9

Related almost-sure bounds in a general Lamperti-type setting are given in [211, Section 4]; there a uniform bound on the increments was imposed. The upper bounds in Theorems 3.9.1 and 3.9.2 are variations on Theorems 4.1(i) and 4.3(i) from [211]; see also Section 6 of [42].

In [211], lower bounds were also given, using a version of Theorem 2.8.3, but this required a somewhat unnatural ejection condition. The approach to lower bounds given here, building on the excursion developments of Sections 3.7 and 3.8, follows similar ideas to [121], where slightly sharper bounds are given under somewhat different assumptions.

In various special cases of certain nearest-neighbour random walks on Z+, various results on almost-sure bounds have been obtained; often, using methods restricted to the nearest-neighbour case, sharper results are available. The diffusive case (cf. Theorem 3.9.1) has received most attention, and iterated-logarithm results are available [249, 82, 275, 99, 281, 126]; of these, only [126] treats the sub-diffusive case (cf. Theorem 3.9.2).

Section 3.10

Theorem 3.10.1 is a substantial improvement of Theorem 4.2 of [211], which gave the first general almost-sure lower bounds for the transient Lamperti problem. Theorem 4.2 of [211] gave a lower bound $X_n \geq n^{1/2}(\log n)^{-D}$, for some constant D not given explicitly, under much more stringent conditions, including the Markov assumption, a uniform bound on the increments, and an unnatural ‘re-entry’ condition to overcome the technicality addressed here by Lemmas 3.10.5 and 3.10.8.

The proof of Theorem 3.10.1 given here improves the proof in [211]; the connection to last exit times via the inequality (3.52) simplifies the argument. Lemmas 3.10.5 and 3.10.6 extend the idea of Lemma 3.2 of [211] to permit an exceptional set in the Lyapunov function condition, and to replace a uniform increments bound by a moments condition. Related ideas


appear in Section 3 of [61].

As mentioned in the text of Example 3.10.3, the rate of escape for transient SRW on $\mathbb{Z}^d$ (d ≥ 3) was studied by Dvoretzky and Erdős [73]. Essentially the same result holds for spatially homogeneous, zero-drift random walks under suitable moments assumptions: see e.g. [107, 237] and references therein.

As mentioned in the text of Example 3.10.4, in the much more restricted setting of nearest-neighbour random walks, a sharp version of Theorem 3.10.1 is given by Csáki et al. [53]. Specifically, under the conditions of Example 3.10.4, Theorem 6.1 of [53] says that for any ε > 0, a.s., for all but finitely many n,

$$X_n \geq n^{1/2} (\log n)^{\frac{1}{1+2c} - \varepsilon},$$

and that this result is sharp in that it fails with ε = 0. The proof in [53] proceeds via strong approximation to an appropriate transient Bessel process, and then appeals to results on the latter.

Another way to quantify transience is via the rate of growth of the renewal function H(x) defined at (3.58). Under conditions similar to those in Theorem 3.10.1, Denisov et al. [61] show that H(x) is of order $x^2$ as x → ∞.

Section 3.11

The weak convergence result Theorem 3.11.2 was first obtained by Lamperti [174] under stronger conditions. Although already implicit in Lamperti’s work, the fact that his results could be extended to processes with different scaling behaviour was not appreciated more widely until much later: Klebaner [157] and Kersting [147] provided extended versions of Theorem 3.11.2, although both [157] and [147] admit only the transient case, where Xn → ∞. The version of Theorem 3.11.2 presented here follows [61].

In [174], Lamperti extended his version of the marginal weak limit result Theorem 3.11.2 to a full invariance principle, i.e., weak convergence on the appropriate sample-path space of the scaled Markov chain to a time-homogeneous diffusion process on R+. The diffusions that arise in the limit are, up to a deterministic scale factor, Bessel processes. The parameter (‘dimension’) of the process is given by $1 + \frac{2a}{b}$, where a and b are the constants in the statement of Theorem 3.11.2.

Section 3.12

The supercritical case was studied by Lamperti [174].


Theorem 3.12.1 proves transience under weaker conditions than we have seen previously published; for instance Lamperti [173, Theorem 3.2] (see also [217, Section 9.5.3]) assumed (L5) with γ > 2 and also that $E[(X_{t+1} - X_t)^2 \mid \mathcal{F}_t] \geq v$ a.s. for some v > 0; Lamperti [173] was mainly concerned with the critical case (β = 1), where such stronger conditions are natural, but they are not necessary here, as Theorem 3.12.1 shows.

The strong law of large numbers Theorem 3.12.4 and accompanying central limit theorem (Theorem 3.12.5) are both from [215]. A version of Theorem 3.12.4 in the case where the “Markov moments” property (3.15) holds is contained in Keller et al. [142], among results that permit more general growth rates. The first result in this direction was a weak law of large numbers obtained by Lamperti [174]. Specifically, if Xn is a time-homogeneous Markov process with $\lim_{x\to\infty} x^{\beta} \mu_1(x) = \rho$ and $\sup_x |\mu_k(x)| < \infty$ for all k, where µk is given by (3.1), then [174, Theorem 7.1] says that (3.74) holds with convergence in probability. Theorem 3.12.4 also generalizes a result of Voit [282] in the birth-and-death chain case: see Section 3.12.1 above. The question of a central limit theorem in this context was raised by Lamperti [174, p. 768], and seems to have remained open until [215] even for the case of a birth-and-death chain. The paper [142] (a reference unknown to the authors of [215] when that paper was written) builds on earlier work of Kuster [172], and includes certain second order (distributional limit) results, but none pertinent to Theorem 3.12.5.

Further remarks

In recent years, significant interest in processes with asymptotically zero drifts has come from a community of probabilists and statistical physicists from the point of view of modelling random polymers and interfaces, their structure, and their interactions with a medium or boundary. In the context of random polymers, the path of the process models the physical polymer chain; the asymptotically zero drift indicates the presence of long-range interaction with a boundary, which can be either attractive or repulsive. For a random interface, the walk models the behaviour of a liquid interface on a solid substrate (including wetting and pinning phenomena); in this context the drift may represent affinity for the boundary. We refer to [59, 103, 280] for recent surveys.

TO DO. Lots to add here. Including a whole section of the literatureinspired by growth processes and framing everything in terms of ‘stochasticdifference equations’ [157] [146] [142] [172]

In the context of diffusion processes, Bessel processes can be viewed as a


certain analogue of Lamperti-type processes; the analogy is rather narrow, as special methods are available for Bessel processes, comparable to the case of nearest-neighbour random walks. A Bessel process is an R+-valued diffusion (Xt, t ∈ R+) determined by the stochastic differential equation
$$\mathrm{d}X_t = \frac{a}{X_t}\,\mathrm{d}t + b\,\mathrm{d}B_t,$$
where Bt is standard Brownian motion on R. Similarly one can consider supercritical cases, where the drift is of order $X_t^{-\beta}$, β ∈ (0, 1), or cases where the drift is $o(1/X_t)$: see e.g. [57].
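The SDE above is easy to explore by an Euler–Maruyama discretization; the sketch below uses invented parameter values (a = 2, b = 1, step size dt = 0.01) purely for illustration. By Itô's formula, $E[X_t^2] = X_0^2 + (2a + b^2)t$ for this SDE, which sets the scale the simulated endpoint should roughly match.

```python
import math
import random

def euler_bessel(a, b, x0, dt, n_steps, seed):
    """Euler-Maruyama discretization of dX_t = (a/X_t) dt + b dB_t."""
    rng = random.Random(seed)
    x = x0
    sqdt = math.sqrt(dt)
    for _ in range(n_steps):
        x = x + (a / x) * dt + b * sqdt * rng.gauss(0.0, 1.0)
        x = abs(x)  # crude guard: reflect if a discrete step overshoots zero
    return x

x_end = euler_bessel(a=2.0, b=1.0, x0=5.0, dt=0.01, n_steps=100_000, seed=3)
print(x_end)  # typically of order sqrt((2*2 + 1) * 1000), i.e. around 70, at t = 1000
```

With a positive drift constant of this size the discretized process drifts off to infinity, in line with the transience of the comparable Bessel processes.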


Chapter 4

Many-dimensional random walks

4.1 Introduction

4.1.1 Chapter overview: Anomalous recurrence

Famously, a d-dimensional, spatially homogeneous random walk whose increments are non-degenerate, have finite second moments, and have zero mean is recurrent if d ∈ {1, 2} but transient if d ≥ 3: this is a consequence of the classical Chung–Fuchs theorem (see Theorem 1.5.2). Once spatial homogeneity is relaxed, this is no longer true, as discussed in Section 1.5: there are zero-drift spatially non-homogeneous random walks which, in any ambient dimension d ≥ 2, can be either transient or recurrent depending on the details of the jump distributions. Such potential anomalous recurrence behaviour is the first focus of this chapter. As described in Theorem 1.5.4, anomalous recurrence behaviour can be excluded under mild assumptions, provided we impose a fixed covariance structure on the increments of the walk. In that case, by analogy with the Lamperti problem described in Chapter 3, it is natural to subsequently relax the assumption of zero drift to asymptotically zero drift, to probe phase transitions in recurrence behaviour. The second focus of this chapter is the asymptotically zero drift case, and in particular the family of examples known as centrally biased random walks.

In this chapter we explore recurrence/transience phase transitions for spatially non-homogeneous random walks in $\mathbb{R}^d$, and give proofs of the results discussed in Sections 1.5 and 1.7. Now we give an informal overview of the models and results that we will present, and an outline of the rest of


this chapter.

In the zero drift case, natural examples of anomalous recurrence behaviour are provided by random walks whose increments are supported on ellipsoids that are symmetric about the ray from the origin through the walk’s current position. These elliptic random walks generalize the classical homogeneous Pearson–Rayleigh random walk (PRRW): recall from Example 1.4.2 that a PRRW proceeds via a sequence of unit-length steps, each in an independent and uniformly random direction in $\mathbb{R}^d$. The PRRW can be represented via partial sums of sequences of i.i.d. random vectors that are uniformly distributed on the unit sphere $\mathbb{S}^{d-1}$ in $\mathbb{R}^d$. Clearly the PRRW has zero drift. It is well known (cf. Theorem 1.5.2) that the PRRW is recurrent for d ∈ {1, 2} and transient if d ≥ 3.

Suppose that we replace the spherically symmetric increments of the PRRW by increments that instead have some elliptical structure, while retaining the zero drift. For example, one could take the increments to be uniformly distributed on the surface of an ellipsoid of fixed shape and orientation, as represented by the picture on the right of Figure 4.1. More generally, one should view the ellipses in Figure 4.1 as representing the covariance structure of the increments of the walk (we will give a concrete example later; the uniform distribution on the ellipse is actually not the most convenient for calculations).

Figure 4.1: Pictorial representation of spatially homogeneous random walks with increments distributed on a fixed circle (left) and a fixed ellipse (right).

A little thought shows that the walk represented by the picture on the right of Figure 4.1 is essentially no different to the PRRW: an affine transformation of $\mathbb{R}^d$ will map the walk back to a walk whose increments have the same covariance structure as the PRRW. To obtain genuinely different


behaviour, it is necessary to abandon spatial homogeneity.

In Section 4.2 we consider a family of spatially non-homogeneous elliptic random walks with zero drift. These include generalizations of the PRRW in which the increments are not i.i.d. but have a distribution supported on an ellipsoid of fixed size and shape but whose orientation depends upon the current position of the walk. Figure 4.2 gives representations of two important types of example, in which the ellipsoid is aligned so that its principal axes are parallel or perpendicular to the vector of the current position of the walk, which sits at the centre of the ellipse.

Figure 4.2: Pictorial representation of spatially non-homogeneous random walks with increments distributed on a radially-aligned ellipse with major axis aligned in the radial sense (left) and in the transverse sense (right).

The random walks represented by Figure 4.2 are no longer sums of i.i.d. variables. These modified walks can behave very differently to the PRRW. For instance, one of the two-dimensional random walks represented in Figure 4.2 is transient while the other (as in the classical case) is recurrent. The reader who has not seen this kind of example before may take a moment to identify which is which.

The elliptic random walk models represented by Figure 4.2 require uncountably many different increment distributions in order to generate their anomalous recurrence behaviour. A natural combinatorial question concerns the minimal number of different increment distributions required; this is the subject of Section 4.3. One can view this problem as one of control: the controller is able to implement a number of different increment distributions, and wishes to minimize this number while achieving the desired recurrence properties; thus we address this question in the context of controlled random walks with zero drift.


Section 4.4 moves beyond the zero drift setting to discuss walks with asymptotically zero drift. Analogously to the one-dimensional Lamperti problem (Chapter 3), this is the critical regime in which to investigate various phase transitions in asymptotic behaviour for the process. In higher dimensions, one natural model is the centrally biased random walk, in which the mean drift field has magnitude that tends to zero with the distance from the origin, and direction either away from or towards the origin: see Figure 4.3.

Figure 4.3: Pictorial representation of centrally biased random walks with asymptotically zero drift aligned in the radial sense, outward (left) and inward (right).

The final section of this chapter, Section 4.6, addresses the different (but related) topic of the range of a many-dimensional random walk, which means, roughly speaking, the volume of the state-space visited by the walk up to a given time.

4.1.2 Notation for many-dimensional random walks

Our notation in this chapter coincides, whenever possible, with that in Sections 1.4 and 1.5. We work in $\mathbb{R}^d$, d ≥ 1. Our main interest is in d ≥ 2, as we shall explain in the following sections. Write $\mathbf{e}_1, \ldots, \mathbf{e}_d$ for the standard orthonormal basis vectors in $\mathbb{R}^d$. Write $\mathbf{0}$ for the origin and $\mathbb{S}^{d-1} = \{u \in \mathbb{R}^d : \|u\| = 1\}$ for the unit sphere in $\mathbb{R}^d$, and let $\|\cdot\|$ denote the Euclidean norm and $\langle\,\cdot\,,\,\cdot\,\rangle$ the Euclidean inner product on $\mathbb{R}^d$. For $x \in \mathbb{R}^d \setminus \{\mathbf{0}\}$, $\hat{x} = x/\|x\|$ is the unit vector parallel to x; we use the convention $\hat{\mathbf{0}} := \mathbf{0}$. Vectors $x \in \mathbb{R}^d$ are viewed as column vectors throughout.


In most (but not all) of this chapter, we will consider the following basic setting for our many-dimensional non-homogeneous random walk.

(MD0) Let d ≥ 1. Suppose that (ξn, n ≥ 0) is a discrete-time, time-homogeneous Markov process on an unbounded subset Σ of $\mathbb{R}^d$, with $\mathbf{0} \in \Sigma$.

For the increments of the random walk, we use the notation

θn := ξn+1 − ξn, n ∈ Z+.

By assumption (MD0), given ξ0, . . . , ξn, the distribution of θn depends only on the position ξn ∈ Σ and not on n.

We use the shorthand $P_x[\,\cdot\,] = P[\,\cdot \mid \xi_0 = x]$ for probabilities when the walk is started from x ∈ Σ; similarly we use $E_x$ for the corresponding expectations.

We will typically impose a moments assumption on θn of the form
$$\sup_{x \in \Sigma} E_x[\|\theta_0\|^p] < \infty, \tag{4.1}$$

for appropriate p > 0. If (4.1) holds for p ≥ 1, then the one-step mean drift vector is well defined, and will be denoted
$$\mu(x) := E_x[\theta_0], \quad x \in \Sigma. \tag{4.2}$$

If (4.1) holds for p ≥ 2, then the increment covariance matrix is also well defined, and will be denoted
$$M(x) := E_x[\theta_0 \theta_0^\top], \quad x \in \Sigma. \tag{4.3}$$

A little linear algebra shows that M(x) encodes various information about second moments, including, for $y \in \mathbb{R}^d$,
$$E_x[\langle y, \theta_0 \rangle^2] = E_x[y^\top \theta_0 \theta_0^\top y] = y^\top M(x) y, \tag{4.4}$$
$$\text{and} \quad E_x[\|\theta_0\|^2] = \operatorname{tr} M(x). \tag{4.5}$$
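The identities (4.4) and (4.5) can be checked by Monte Carlo for any concrete increment law. In the sketch below (all numerical choices are ours, for illustration only), the increment is θ = Ag for a standard Gaussian pair g, so that M = E[θθ⊤] = AA⊤.

```python
import random

A = [[1.0, 0.5],
     [0.0, 2.0]]
# M = A A^T, computed by hand for this 2x2 example
M = [[1.25, 1.0],
     [1.0, 4.0]]
y = [0.6, -0.8]

rng = random.Random(4)
n = 200_000
s_proj2 = s_norm2 = 0.0
for _ in range(n):
    g0, g1 = rng.gauss(0, 1), rng.gauss(0, 1)
    t0 = A[0][0] * g0 + A[0][1] * g1   # theta = A g
    t1 = A[1][0] * g0 + A[1][1] * g1
    s_proj2 += (y[0] * t0 + y[1] * t1) ** 2
    s_norm2 += t0 ** 2 + t1 ** 2

yMy = sum(y[i] * M[i][j] * y[j] for i in range(2) for j in range(2))
trM = M[0][0] + M[1][1]
print(s_proj2 / n, yMy)  # (4.4): E[<y, theta>^2] = y^T M y
print(s_norm2 / n, trM)  # (4.5): E[||theta||^2] = tr M
```

The two empirical averages agree with the quadratic form and the trace up to Monte Carlo error.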

4.1.3 Beyond the Chung–Fuchs theorem

The starting point for much of the discussion in this chapter is the question raised in Section 1.5: to what extent does the recurrence classification for spatially homogeneous random walks extend to the spatially non-homogeneous case? Our main interest is in the case where the increments


of the walk have uniformly bounded second moments, and then it is a consequence of the Chung–Fuchs theorem that the conclusion of Pólya’s theorem (Theorem 1.2.1) holds for any zero-drift, spatially homogeneous random walk: see Theorem 1.5.2. We now start to consider in detail the non-homogeneous case.

We use the notation of Section 4.1.2 and consider a non-homogeneous random walk ξn on $\Sigma \subseteq \mathbb{R}^d$, d ≥ 1. We suppose for this section that the increment moments condition (4.1) holds for some p > 2, so that the mean drift vector µ(x) given by (4.2) is well defined, and that the random walk has zero drift:

(MD1) Suppose that µ(x) = 0 for all x ∈ Σ.

Our moments assumption also ensures that the increment covariance matrix M(x) given by (4.3) is well defined. To rule out pathological cases, we impose the following non-degeneracy condition on the increments.

(MD2) There exists v > 0 such that trM(x) ≥ v for all x ∈ Σ.

Note that assumption (MD2) is weaker than uniform ellipticity such as (3.13).

First, we state the following basic non-confinement result.

Lemma 4.1.1. Suppose that (MD0), (MD1), and (MD2) hold, and that (4.1) holds for some p > 2. Then

lim sup_{n→∞} ‖ξn‖ = +∞, a.s. (4.6)

Lemma 4.1.1, which we prove at the end of this section, ensures that questions of the escape of trajectories to infinity are non-trivial. We are then interested in classifying ξn as either recurrent or transient, in the sense of Definition 1.5.1. If ξn is an irreducible time-homogeneous Markov chain on a locally finite state-space, these definitions reduce to the usual notions of transience and recurrence.

In dimension d = 1, it is a consequence of the Chung–Fuchs theorem (see Chapter 9 of [137]) that a spatially homogeneous random walk with zero drift is necessarily recurrent. For a spatially non-homogeneous random walk this is not true in general, and there is a counterexample with increments of infinite variance that is transient [247]; see the bibliographical notes to Section 2.5. Our conditions exclude such heavy-tailed phenomena, so that in d = 1 recurrence is assured in our setting. The following result is a variant of Theorem 2.5.6, which was stated for a locally finite state space.


Theorem 4.1.2. Suppose that (MD0), (MD1), and (MD2) hold, and that (4.1) holds for some p > 2. Suppose that d = 1. Then ξn is recurrent.

In dimensions d ≥ 2, we first seek an analogue of Theorem 1.5.2. The next result shows that one recovers the classical recurrence classification for zero-drift random walk as soon as one imposes a fixed increment covariance structure (in this case, diagonal) across all of space.

Theorem 4.1.3. Suppose that (MD0) and (MD1) hold, and that (4.1) holds for some p > 2. Suppose that there exists σ² ∈ (0,∞) for which

M(x) = σ²I, for all x ∈ Σ.

Then ξn is recurrent if d ≤ 2 and transient if d ≥ 3.
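The dichotomy of Theorem 4.1.3 is already visible in simulation for the homogeneous simple random walk, a special case with M(x) = (1/d)I and zero drift. The horizon, number of walks, and random seed below are illustrative choices of ours, not parameters from the text.

```python
import numpy as np

# Empirical illustration of Theorem 4.1.3: the simple random walk revisits
# the origin far more often in d = 2 (recurrent) than in d = 3 (transient).
rng = np.random.default_rng(1)

def mean_returns(d, n_steps=20_000, walks=100):
    total = 0
    for _ in range(walks):
        # each step changes one uniformly chosen coordinate by +-1
        steps = np.zeros((n_steps, d), dtype=int)
        steps[np.arange(n_steps), rng.integers(0, d, size=n_steps)] = \
            rng.choice([-1, 1], size=n_steps)
        pos = np.cumsum(steps, axis=0)
        total += int(np.sum(np.all(pos == 0, axis=1)))
    return total / walks

r2, r3 = mean_returns(2), mean_returns(3)
assert r2 > r3  # recurrence vs transience is visible already at this horizon
print(f"mean returns to the origin: d=2: {r2:.2f}, d=3: {r3:.2f}")
```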

Remark 4.1.4. Similarly to Example 3.5.4, the assumption that M(x) = σ²I can be replaced by M(x) = M for any fixed positive-definite increment covariance matrix M.

A weaker version of Theorem 4.1.3, assuming a uniform bound on the increments, was stated as Theorem 1.5.4 and established in Example 3.5.4. Importantly, the zero drift assumption can be relaxed: for example, for the conclusion of Theorem 4.1.3 one may replace µ(x) = 0 by (essentially) ‖µ(x)‖ = o(‖x‖^{−1}), as we shall see later. This fact leads naturally to the question of the extent to which one can 'perturb' a zero-drift walk and preserve its recurrence or transience. Lamperti showed that the critical scale for perturbations is O(‖x‖^{−1}). Roughly speaking, for any walk satisfying (4.1) for p > 2, regardless of its covariance structure, one can arrange µ with ‖µ(x)‖ = O(‖x‖^{−1}) for which the walk is recurrent or transient, as desired: see Section 4.1.5 for details.

We prove Theorems 4.1.2 and 4.1.3 in Section 4.1.5, after presenting a more general result (Theorem 4.1.8). The rest of this section is devoted to the proof of Lemma 4.1.1. We first present a general result for martingales on Rd satisfying a non-degeneracy condition; the result can be viewed as a d-dimensional martingale version of Kolmogorov's other inequality (Theorem 2.4.12). Theorem 4.1.5 extends the one-dimensional martingale result Theorem 2.4.12 to higher dimensions, and also replaces the uniformly bounded jumps assumption by a moments condition. The proof here uses a different idea from that of Theorem 2.4.12.

Theorem 4.1.5. Let d ∈ N. Suppose that (Yn, n ≥ 0) is an Rd-valued process adapted to a filtration (Fn, n ≥ 0), with P[Y0 = 0 | F0] = 1. Suppose that there exist a stopping time η and constants p > 2, v > 0, B < ∞ such that for all n ≥ 0, a.s.,

E[‖Yn+1 − Yn‖^p | Fn] ≤ B; (4.7)
E[‖Yn+1 − Yn‖² | Fn] ≥ v 1{n < η}; (4.8)
E[Yn+1 − Yn | Fn] = 0. (4.9)

Then there exists D < ∞, depending only on B, p, and v, such that for all n ∈ Z+ and all x ∈ R+,

P[ max_{0≤m≤n} ‖Ym‖ ≥ x | F0 ] ≥ 1 − D(1 + x)²/n − P[η ≤ n | F0], a.s.

Proof. Let x ∈ R+ and set σ = min{n ≥ 0 : ‖Yn‖ ≥ x}. For the increments of the process write Dn = Yn+1 − Yn, and let

Wn = Yn if ‖Yn‖ ≤ A(1 + x), and Wn = Yn−1 + Dn−1 (A − 1)(1 + x)/‖Dn−1‖ if ‖Yn‖ > A(1 + x),

where A > 1 is a constant to be specified later. Note that Wn is Fn-measurable. Now, on {‖Yn‖ ≤ x}, Wn = Yn and

E[Wn+1 − Wn | Fn] = E[Dn | Fn] + E[Dn ((A − 1)(1 + x) − ‖Dn‖) ‖Dn‖^{−1} 1{‖Yn+1‖ > A(1 + x)} | Fn].

But {‖Yn+1‖ > A(1 + x)} ∩ {‖Yn‖ ≤ x} implies that ‖Dn‖ > (A − 1)(1 + x), and by (4.9), E[Dn | Fn] = 0. Hence, on {‖Yn‖ ≤ x},

‖E[Wn+1 − Wn | Fn]‖ ≤ E[‖Dn‖ 1{‖Dn‖ > (A − 1)(1 + x)} | Fn]
 ≤ (A − 1)^{−1}(1 + x)^{−1} E[‖Dn‖² | Fn]
 ≤ B′(A − 1)^{−1}(1 + x)^{−1}, a.s.,

where, by (4.7) and Lyapunov's inequality, B′ < ∞ depends only on B and p. Hence we can choose A ≥ A0 for some A0 = A0(B, p, v) large enough so that

‖E[Wn+1 − Wn | Fn]‖ ≤ (v/8)(1 + x)^{−1}, on {‖Yn‖ ≤ x}. (4.10)

Also, on {‖Yn‖ ≤ x}, by a similar argument, E[‖Wn+1 − Wn‖² | Fn] equals

E[‖Dn‖² | Fn] + E[((A − 1)²(1 + x)² − ‖Dn‖²) 1{‖Yn+1‖ > A(1 + x)} | Fn]
 ≥ E[‖Dn‖² | Fn] − E[‖Dn‖² 1{‖Dn‖ > (A − 1)(1 + x)} | Fn]
 ≥ v 1{n < η} − (A − 1)^{2−p}(1 + x)^{2−p} E[‖Dn‖^p | Fn]
 ≥ (v/2) 1{n < η}, (4.11)

for all x ≥ 0 and A ≥ A1 for some sufficiently large A1 = A1(B, p, v), using (4.7) and (4.8).

Now, set Zn = ‖W_{n∧σ∧η}‖². Then, on {n < σ ∧ η}, by (4.10) and (4.11),

E[Zn+1 − Zn | Fn] = E[‖Wn+1‖² − ‖Wn‖² | Fn]
 = E[‖Wn+1 − Wn‖² | Fn] + 2〈Wn, E[Wn+1 − Wn | Fn]〉
 ≥ v/2 − 2‖Wn‖ · v/(8(1 + x)) ≥ v/2 − vx/(4(1 + x)) ≥ v/4.

Hence Zn − Σ_{k=0}^{n−1} vk is an Fn-adapted submartingale, where

vk = (v/4) 1{k < σ ∧ η} ≥ (v/4) 1{n < σ ∧ η}, for 0 ≤ k < n.

By construction, 0 ≤ Zn ≤ A²(1 + x)², so

0 = E[Z0 | F0] ≤ E[Zn | F0] − Σ_{k=0}^{n−1} E[vk | F0] ≤ A²(1 + x)² − n (v/4) P[n < σ ∧ η | F0],

which implies that n(v/4) P[n < σ ∧ η | F0] ≤ A²(1 + x)². Hence, since

P[ max_{0≤m≤n} ‖Ym‖ < x | F0 ] = P[n < σ | F0]
 = P[n < σ, n < η | F0] + P[n < σ, n ≥ η | F0]
 ≤ P[n < σ ∧ η | F0] + P[η ≤ n | F0],

the statement of the theorem follows.
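The content of Theorem 4.1.5 can be illustrated numerically for the simple ±1 random walk on Z, a martingale with v = 1, bounded jumps, and η = ∞: when n is much larger than x², the lower bound 1 − D(1 + x)²/n forces escape from [−x, x] to be overwhelmingly likely. The seed and sample sizes below are our own choices.

```python
import numpy as np

# Monte Carlo estimate of P[max_{m <= n} |Y_m| >= x] for the +-1 walk.
rng = np.random.default_rng(0)
n, x, trials = 10_000, 10, 200
exits = 0
for _ in range(trials):
    path = np.cumsum(rng.choice([-1, 1], size=n))
    if np.max(np.abs(path)) >= x:
        exits += 1
est = exits / trials
assert est > 0.95  # at this scale essentially every trial leaves [-x, x]
print(f"estimated escape probability: {est:.3f}")
```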

Now we can give the proof of Lemma 4.1.1.

Proof of Lemma 4.1.1. Let Xn = ‖ξn‖ and Fn = σ(ξ0, . . . , ξn). Then the assumption (MD0) on ξn shows that Xn satisfies (L0) with S = {‖x‖ : x ∈ Σ} ⊆ R+; note that inf S = 0 since 0 ∈ Σ and sup S = ∞ since Σ is unbounded.


To prove the non-confinement statement, we show that Xn satisfies (L3) and then apply Proposition 3.3.4. Thus we must verify (3.10). By the Markov property and time-homogeneity, it suffices to check that for any x ∈ R+ there exist rx ∈ N and δx > 0 such that

P[ max_{0≤m≤rx} Xm ≥ x | F0 ] ≥ δx, on {X0 ≤ x}. (4.12)

To verify (4.12) we apply Theorem 4.1.5 with Yn = ξn − ξ0; the assumptions (4.1), (MD2), and (MD1) imply that the conditions (4.7), (4.8), and (4.9) are satisfied (with η = ∞). Hence the triangle inequality and Theorem 4.1.5 show that, on {‖ξ0‖ ≤ x},

P[ max_{0≤m≤n} Xm ≥ x | F0 ] ≥ P[ max_{0≤m≤n} ‖ξm − ξ0‖ ≥ 2x | F0 ] ≥ 1 − D(1 + 2x)²/n, a.s.

Thus for any x we may choose n = rx big enough so that this last probability is at least 1/2, say, which verifies (4.12) and completes the proof.

4.1.4 Relation to Lamperti’s problem

The basic idea behind the proof of Theorem 4.1.3 is the same as that described in Section 1.5 and carried through in Example 3.5.4, namely to consider the one-dimensional process Xn = ‖ξn‖, which can be studied by the methods of Chapter 3. The additional technical difficulty introduced by assuming (4.1) rather than uniformly bounded increments is overcome using truncation ideas similar to those employed in Theorems 2.5.6 and 2.5.16, and repeatedly in Chapter 3.

Let Fn = σ(ξ0, . . . , ξn). The assumption (MD0) on ξn shows that Xn satisfies (L0) with S = {‖x‖ : x ∈ Σ} ⊆ R+; note that inf S = 0 since 0 ∈ Σ and sup S = ∞ since Σ is unbounded. As in Chapter 3, we write ∆n = Xn+1 − Xn. To apply the results of Chapter 3, we must study the increments ∆n given ξn = x ∈ Σ; in general, Xn is not itself a Markov process.

We make an important comment on notation. When we write O(‖x‖^{−1−δ}), and similar expressions, these are understood to be uniform in x. That is, if f : Rd → R and g : R+ → R+, we write f(x) = O(g(‖x‖)) to mean that there exist C ∈ R+ and r ∈ R+ such that

|f(x)| ≤ Cg(‖x‖) for all x ∈ Σ with ‖x‖ ≥ r. (4.13)


Lemma 4.1.6. Suppose that (MD0) holds, and that (4.1) holds for some p > 0. Then, for Xn := ‖ξn‖, we have

sup_{x∈Σ} E[|∆n|^p | ξn = x] < ∞. (4.14)

Moreover, writing x̂ := x/‖x‖, the radial increment moment functions satisfy the following.

(i) If p > 1, then, for any r > 0 with r < (p ∧ 2) − 1, as ‖x‖ → ∞,

E[∆n | ξn = x] = 〈x̂, µ(x)〉 + O(‖x‖^{−r}). (4.15)

(ii) If p > 2, then, for some δ = δ(p) > 0, as ‖x‖ → ∞,

E[∆n | ξn = x] = 〈x̂, µ(x)〉 + (tr M(x) − x̂⊤M(x)x̂)/(2‖x‖) + O(‖x‖^{−1−δ}); (4.16)

E[∆n² | ξn = x] = x̂⊤M(x)x̂ + O(‖x‖^{−δ}). (4.17)
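The expansion (4.16) can be checked exactly in a simple case. For the simple random walk on Z2 the increments have µ(x) = 0 and M(x) = (1/2)I, so at the point x = (R, 0) the radial drift should be (tr M − x̂⊤M x̂)/(2R) = (1 − 1/2)/(2R) = 1/(4R); the choice of R is illustrative.

```python
import math

# E[Delta | xi_n = (R, 0)] for the simple random walk on Z^2, by exact
# enumeration over the four equally likely unit steps.
R = 100.0
support = [(1, 0), (-1, 0), (0, 1), (0, -1)]
drift = sum(0.25 * (math.hypot(R + t0, t1) - R) for t0, t1 in support)

predicted = 1.0 / (4.0 * R)
# agreement up to the O(||x||^{-1-delta}) error of (4.16)
assert abs(drift - predicted) < 1e-6
print(drift, predicted)
```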

Lemma 4.1.6 is central to our analysis of the recurrence and transience of ξn, by applying the results of Chapter 3 to Xn = ‖ξn‖. For example, if µ(x) = 0 then (4.16) shows that the drift of Xn is of order 1/Xn, the critical regime for Lamperti processes: see Section 4.1.5.

The rest of this section is devoted to the proof of Lemma 4.1.6. The following notation will be useful. Given x ≠ 0 and y ∈ Rd, write

yx := 〈x, y〉/‖x‖ = 〈x̂, y〉, (4.18)

so that yx is the component of y in the x direction, and y − yx x̂ is a vector orthogonal to x. For convenience, we write simply θ for θ0 = ξ1 − ξ0, and let θx be the radial component of θ at ξ0 = x, in accordance with the notation defined at (4.18); no confusion should arise with our notation θn defined previously.

Given ξ0 = x, the increment ∆0 = X1 − X0 is given in terms of x and the increment θ = ξ1 − ξ0 by

‖x + θ‖ − ‖x‖ = √〈x + θ, x + θ〉 − ‖x‖ = ‖x‖[ (1 + 2θx/‖x‖ + ‖θ‖²/‖x‖²)^{1/2} − 1 ]. (4.19)


We would like to use Taylor's theorem here, as in Example 3.5.4, but this is only legitimate provided θ is not too big compared to x. Thus let

Ax := {‖θ‖ ≤ ‖x‖^γ}, (4.20)

for some γ ∈ (0, 1) to be chosen appropriately. On the event Ax we approximate (4.19) using Taylor's formula for (1 + y)^{1/2}, and on the event Ax^c we bound (4.19) using (4.1). For various truncation estimates, we need the following analogue of Lemma 3.4.2.

Lemma 4.1.7. Suppose that (4.1) holds for some p > 0, and Ax is as defined at (4.20) for some γ ∈ (0, 1). Then, for any q ∈ [0, p],

Ex[‖θ‖^q 1(Ax^c)] = O(‖x‖^{−γ(p−q)}).

Proof. By definition of Ax, we have for q ∈ [0, p],

‖θ‖^q 1(Ax^c) = ‖θ‖^p ‖θ‖^{q−p} 1(Ax^c) ≤ ‖θ‖^p ‖x‖^{γ(q−p)},

and the result follows on taking expectations, by (4.1).

Now we can complete the proof of Lemma 4.1.6 by carrying through the scheme outlined above.

Proof of Lemma 4.1.6. By time-homogeneity, it suffices to consider the case n = 0. By the triangle inequality, |X1 − X0| = |‖ξ0 + θ‖ − ‖ξ0‖| ≤ ‖θ‖, so that (4.14) follows from (4.1). Define the event Ax as at (4.20), for some γ ∈ (0, 1) to be specified later.

We prove part (ii); so suppose that (4.1) holds for p > 2. For all y > −1, Taylor's formula with Lagrange remainder shows that

(1 + y)^{1/2} = 1 + (1/2)y − (1/8)y²(1 + ϕy)^{−3/2},

for some ϕ = ϕ(y) ∈ [0, 1], so on the event Ax, ‖x + θ‖ − ‖x‖ is equal to

‖x‖( θx/‖x‖ + ‖θ‖²/(2‖x‖²) − (1/8)(2θx/‖x‖ + ‖θ‖²/‖x‖²)² (1 + O(‖x‖^{γ−1})) ),

where θx = 〈x̂, θ〉 as defined at (4.18). Here, on the event Ax,

(2θx/‖x‖ + ‖θ‖²/‖x‖²)² = 4θx²/‖x‖² + ‖θ‖² O(‖x‖^{γ−3}),

using the fact that |θx| ≤ ‖θ‖ ≤ ‖x‖^γ for γ < 1. Hence, multiplying through by ‖x‖, on the event Ax,

‖x + θ‖ − ‖x‖ = θx + (‖θ‖² − θx²)/(2‖x‖) + ‖θ‖² O(‖x‖^{γ−2}). (4.21)

It follows from (4.21) and the triangle inequality that

| (‖x + θ‖ − ‖x‖) 1(Ax) − (θx + (‖θ‖² − θx²)/(2‖x‖)) |
 ≤ | θx + (‖θ‖² − θx²)/(2‖x‖) | 1(Ax^c) + ‖θ‖² O(‖x‖^{γ−2})
 ≤ ‖θ‖ 1(Ax^c) + (‖θ‖²/(2‖x‖)) 1(Ax^c) + ‖θ‖² O(‖x‖^{γ−2}). (4.22)

Here, by the q = 1 and q = 2 cases of Lemma 4.1.7, we have

Ex[‖θ‖ 1(Ax^c)] = O(‖x‖^{−γ(p−1)}), and Ex[‖θ‖² 1(Ax^c)] = O(‖x‖^{−γ(p−2)}),

so that, taking expectations in (4.22), we obtain

Ex | (‖x + θ‖ − ‖x‖) 1(Ax) − (θx + (‖θ‖² − θx²)/(2‖x‖)) | = O(‖x‖^{−γ(p−1)}) + O(‖x‖^{−γ(p−2)−1}) + O(‖x‖^{γ−2}).

Let δ ∈ (0, 1) with δ < p − 2, taking δ small enough that (1 + δ)/(p − 1) < 1 − δ. Choose γ such that (1 + δ)/(p − 1) < γ < 1 − δ. Then −γ(p − 2) − 1 < −γ(p − 1) < −1 − δ and γ − 2 < −1 − δ, so it follows that

Ex[ (‖x + θ‖ − ‖x‖) 1(Ax) ] = Ex θx + Ex[‖θ‖² − θx²]/(2‖x‖) + O(‖x‖^{−1−δ})
 = 〈x̂, µ(x)〉 + (tr M(x) − x̂⊤M(x)x̂)/(2‖x‖) + O(‖x‖^{−1−δ}), (4.23)

using (4.4), (4.5), and the observation that, by linearity,

Ex θx = Ex〈x̂, θ〉 = 〈x̂, Ex θ〉 = 〈x̂, µ(x)〉.

On the other hand, by the triangle inequality,

Ex[ |‖x + θ‖ − ‖x‖| 1(Ax^c) ] ≤ Ex[‖θ‖ 1(Ax^c)] = O(‖x‖^{−γ(p−1)}), (4.24)

by the q = 1 case of Lemma 4.1.7; again, by choice of γ, we have that the right-hand side of (4.24) is O(‖x‖^{−1−δ}). Since

‖x + θ‖ − ‖x‖ = (‖x + θ‖ − ‖x‖) 1(Ax) + (‖x + θ‖ − ‖x‖) 1(Ax^c),

we can combine (4.23) and (4.24) to obtain (4.16). For the second moment, we use the identity

(‖x + θ‖ − ‖x‖)² = ‖x + θ‖² − ‖x‖² − 2‖x‖(‖x + θ‖ − ‖x‖) = 2‖x‖θx + ‖θ‖² − 2‖x‖(‖x + θ‖ − ‖x‖),

so that

Ex[∆0²] = 2‖x‖ Ex θx + Ex[‖θ‖²] − 2‖x‖ Ex ∆0 = Ex[θx²] + O(‖x‖^{−δ}),

by (4.16), which gives (4.17). This completes the proof of part (ii).

The proof of part (i) is similar, now supposing that (4.1) holds for p > 1 only. In this case, Taylor's theorem shows that the analogue of (4.21) is

(‖x + θ‖ − ‖x‖) 1(Ax) = 〈x̂, θ〉 1(Ax) + ‖θ‖² 1(Ax) O(‖x‖^{−1}). (4.25)

Here, by definition of Ax,

Ex[‖θ‖² 1(Ax)] ≤ Ex[‖θ‖^p ‖θ‖^{(2−p)+} 1(Ax)] ≤ ‖x‖^{γ(2−p)+} Ex[‖θ‖^p],

which is O(‖x‖^{γ(2−p)+}) by (4.1), and, as shown above, Ex[‖θ‖ 1(Ax^c)] = O(‖x‖^{−γ(p−1)}), so

Ex[〈x̂, θ〉 1(Ax)] = 〈x̂, Ex θ〉 + O(‖x‖^{−γ(p−1)}).

Thus taking expectations in (4.25) we obtain

Ex[ (‖x + θ‖ − ‖x‖) 1(Ax) ] = 〈x̂, Ex θ〉 + O(‖x‖^{−γ(p−1)}) + O(‖x‖^{γ(2−p)+ − 1}).

On the other hand, (4.24) still applies, which contributes another O(‖x‖^{−γ(p−1)}) term, so that

Ex[‖x + θ‖ − ‖x‖] = 〈x̂, µ(x)〉 + O(‖x‖^{γ(2−p)+ − γ}),

since −γ(p − 1) = γ(2 − p) − γ ≤ γ(2 − p)+ − γ. Since γ ∈ (0, 1) was arbitrary, and (2 − p)+ − 1 = 1 − (p ∧ 2) < 0, we obtain (4.15).

4.1.5 Sufficient conditions for recurrence and transience

In this section we present sufficient conditions for recurrence and transience of the non-homogeneous random walk ξn, assuming (MD0) and that (4.1) holds for some p > 2. In the zero drift case, the main result of this section (Theorem 4.1.8) will allow us to deduce Theorems 4.1.2 and 4.1.3, but the result is also valid for the non-zero drift case, and will later enable us to deduce results on elliptic random walks (Section 4.2) and centrally biased random walks (Section 4.4).

For this section, since we do not impose a zero drift condition, we cannot rely on Lemma 4.1.1; instead we must assume a non-confinement condition.

(MD3) Suppose that lim sup_{n→∞} ‖ξn‖ = +∞, a.s.

In the case where Σ is locally finite and ξn is irreducible, (MD3) is satisfied, as shown in Corollary 2.1.10.

For some of our results, we also need to assume a non-degeneracy condition on the increment covariance matrix M(x) as defined at (4.3), to ensure that the walk is genuinely d-dimensional.

(MD4) Suppose that there exists v0 > 0 such that

inf_{e∈Sd−1} (e⊤M(x)e) ≥ v0, for all x ∈ Σ. (4.26)

The assumption (MD4) is stronger than (MD2), in that it imposes a uniform lower bound not just on the trace of M(x) but on all its eigenvalues, but weaker than uniform ellipticity, since (3.13) implies (4.26).

Theorem 4.1.8. Suppose that (MD0) and (MD3) hold, and that (4.1) holds for some p > 2.

(i) A sufficient condition for ξn to be transient is

lim_{r→∞} inf_{x∈Σ: ‖x‖≥r} ( 2‖x‖〈x̂, µ(x)〉 + tr M(x) − 2x̂⊤M(x)x̂ ) > 0. (4.27)

(ii) If (MD4) also holds, then a sufficient condition for ξn to be recurrent is, as r → ∞,

sup_{x∈Σ: ‖x‖≥r} ( 2‖x‖〈x̂, µ(x)〉 + tr M(x) − 2x̂⊤M(x)x̂ ) ≤ o(log^{−1} r). (4.28)

(iii) Let λr = min{n ≥ 0 : ‖ξn‖ ≤ r}, and suppose that

lim_{r→∞} sup_{x∈Σ: ‖x‖≥r} ( 2‖x‖〈x̂, µ(x)〉 + tr M(x) ) < 0. (4.29)

Then there exists r0 ∈ R+ such that, for any r > r0, E λr < ∞.
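The quantity appearing in the criteria above is straightforward to evaluate numerically for a given model. The helper below is our own naming, a sketch for user-supplied drift and covariance fields µ and M.

```python
import numpy as np

# Evaluate 2||x|| <xhat, mu(x)> + tr M(x) - 2 xhat^T M(x) xhat, the quantity
# governing transience/recurrence in (4.27)-(4.29).
def criterion(x, mu, M):
    x = np.asarray(x, dtype=float)
    xhat = x / np.linalg.norm(x)
    Mx = np.asarray(M(x))
    return (2 * np.linalg.norm(x) * xhat @ np.asarray(mu(x))
            + np.trace(Mx) - 2 * xhat @ Mx @ xhat)

# Zero drift with M(x) = sigma^2 I reproduces the value (d - 2) sigma^2
# computed in the proof of Theorem 4.1.3: zero in d = 2, positive in d = 3.
sigma2 = 1.7
for d in (2, 3):
    val = criterion(np.ones(d),
                    lambda x: np.zeros_like(x),
                    lambda x: sigma2 * np.eye(len(x)))
    assert abs(val - (d - 2) * sigma2) < 1e-12
```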


One interpretation of Theorem 4.1.8 is as follows: for fixed dimension d ∈ N and increment covariance matrix function M(x), there is a constant ctra such that a sufficient condition to ensure that ξn is transient is

lim inf_{‖x‖→∞} ( ‖x‖〈µ(x), x̂〉 ) > ctra;

specifically, one may take

ctra = lim_{r→∞} sup_{x∈Σ: ‖x‖≥r} ( x̂⊤M(x)x̂ − (1/2) tr M(x) ).

Analogously, given (MD4), a sufficient condition for recurrence is

lim sup_{‖x‖→∞} ( ‖x‖〈µ(x), x̂〉 ) < crec, where crec = lim_{r→∞} inf_{x∈Σ: ‖x‖≥r} ( x̂⊤M(x)x̂ − (1/2) tr M(x) ).

In other words, given M(x) one can always arrange a drift field µ(x), satisfying ‖µ(x)‖ = O(‖x‖^{−1}), such that the walk is transient; or recurrent; or positive-recurrent. We give some concrete examples in Section 4.4. In Section 4.2 we will apply Theorem 4.1.8 to show that in any dimension d ≥ 2, even if µ(x) = 0 everywhere, by suitably adjusting M(x) one can achieve transience or recurrence as desired (but not positive-recurrence).

One useful corollary to Theorem 4.1.8 is the following sufficient condition for transience in the zero drift setting. Recall that λmax(M) denotes the maximum eigenvalue of the matrix M.

Corollary 4.1.9. Suppose that (MD0), (MD1) and (MD3) hold, and that (4.1) holds for some p > 2. A sufficient condition for ξn to be transient is

lim_{r→∞} inf_{x∈Σ: ‖x‖≥r} ( tr M(x) − 2λmax(M(x)) ) > 0. (4.30)

Proof. We apply Theorem 4.1.8(i). We have from (MD1) that

2‖x‖〈x̂, µ(x)〉 + tr M(x) − 2x̂⊤M(x)x̂ ≥ tr M(x) − 2 sup_{u∈Sd−1} (u⊤M(x)u) = tr M(x) − 2λmax(M(x)),

by the variational characterization of the maximal eigenvalue. Under condition (4.30), we thus have that (4.27) holds, and then Theorem 4.1.8(i) completes the proof.


Here is a different example of the application of Theorem 4.1.8.

Example 4.1.10. Suppose that ξn satisfying (MD0) is an irreducible Markov chain on Zd, d ≥ 1, with transition probabilities given for x ≠ 0 by

Px[θ0 = ei] = (1/(2d))(1 − ε/〈x, ei〉) and Px[θ0 = −ei] = (1/(2d))(1 + ε/〈x, ei〉),

for all i ∈ {1, . . . , d}, where ε ∈ (−1, 1) is a fixed parameter, and

P0[θ0 = ei] = P0[θ0 = −ei] = 1/(2d).

Then we calculate M(x) = (1/d)I and, for x ≠ 0,

µ(x) = −(ε/d) Σ_{i=1}^{d} ei/〈x, ei〉.

Thus for x ≠ 0,

2‖x‖〈x̂, µ(x)〉 + tr M(x) − 2x̂⊤M(x)x̂ = −2ε + 1 − (2/d),

which is strictly positive if and only if ε < (d − 2)/(2d). Thus Theorem 4.1.8 parts (i) and (ii) show that this random walk is transient if ε < (d − 2)/(2d) and recurrent if ε ≥ (d − 2)/(2d). Similarly, part (iii) shows that the walk is positive-recurrent if ε > 1/2.
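The computations in Example 4.1.10 can be verified by direct enumeration at a point whose coordinates are all non-zero (the transition law divides by 〈x, ei〉); the particular x, d, and ε below are our own illustrative choices.

```python
import numpy as np

d, eps = 3, 0.1
x = np.array([3.0, -2.0, 5.0])
xhat = x / np.linalg.norm(x)

# build the transition law of Example 4.1.10 at the point x
steps, probs = [], []
for i in range(d):
    e = np.zeros(d)
    e[i] = 1.0
    steps += [e, -e]
    probs += [(1 - eps / x[i]) / (2 * d), (1 + eps / x[i]) / (2 * d)]
assert abs(sum(probs) - 1.0) < 1e-12

mu = sum(p * s for p, s in zip(probs, steps))
M = sum(p * np.outer(s, s) for p, s in zip(probs, steps))
assert np.allclose(M, np.eye(d) / d)       # M(x) = (1/d) I
assert np.allclose(mu, -(eps / d) / x)     # mu(x) = -(eps/d) sum_i e_i/<x, e_i>

val = 2 * np.linalg.norm(x) * xhat @ mu + np.trace(M) - 2 * xhat @ M @ xhat
assert abs(val - (-2 * eps + 1 - 2 / d)) < 1e-12
```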

Proof of Theorem 4.1.8. We consider the process of norms Xn = ‖ξn‖ and use Lemma 4.1.6 to apply the results of Chapter 3. The condition (MD3) shows that Xn = ‖ξn‖ satisfies (L2). Given that (4.1) holds for some p > 2, Lemma 4.1.6 shows that Xn = ‖ξn‖ satisfies (L1) for the same p > 2. In the notation of Section 3.3, we have for k ∈ {1, 2} that

µ̲k(x) ≥ inf_{x∈Σ: ‖x‖≥x} E[∆n^k | ξn = x],

and similarly for µ̄k(x). Lemma 4.1.6(ii) shows that

2‖x‖ E[∆n | ξn = x] − E[∆n² | ξn = x] = 2‖x‖〈x̂, µ(x)〉 + tr M(x) − 2x̂⊤M(x)x̂ + O(‖x‖^{−δ}).

Suppose that (4.27) holds. Then

lim inf_{x→∞} ( 2x µ̲1(x) − µ̄2(x) ) > 0,

and Theorem 3.5.1 yields transience, proving part (i). Now consider part (ii). It follows from (MD4) and (4.17) that

lim inf_{x→∞} µ̲2(x) ≥ lim_{r→∞} inf_{x∈Σ: ‖x‖≥r} ( x̂⊤M(x)x̂ ) ≥ v0. (4.31)

Moreover, it follows from (4.16) and (4.17) that

2x µ̄1(x) − µ̲2(x) ≤ sup_{x∈Σ: ‖x‖≥x} ( 2‖x‖〈x̂, µ(x)〉 + tr M(x) − 2x̂⊤M(x)x̂ ) + O(x^{−δ}),

which is o(log^{−1} x) by (4.28). Together with (4.31), this shows that

2x µ̄1(x) − (1 + (1 − ψ)/log x) µ̲2(x) ≤ −((1 − ψ)/log x) µ̲2(x) + o(log^{−1} x) ≤ −((1 − ψ)v0 + o(1)) log^{−1} x,

which is negative for ψ ∈ (0, 1) and all x sufficiently large, so Theorem 3.5.2 yields recurrence. This proves part (ii).

Finally, a similar argument shows that (4.29) implies that

lim sup_{x→∞} ( 2x µ̄1(x) + µ̄2(x) ) < 0,

from which the α = 1 case of Theorem 3.7.1 gives part (iii).

To conclude this section, we apply Theorem 4.1.8 to give the proofs of Theorems 4.1.2 and 4.1.3.

Proof of Theorem 4.1.2. Suppose that d = 1. In one dimension, M(x) is the scalar Ex[θ0²], and the condition (MD2) is equivalent to (MD4). Lemma 4.1.1 shows that (MD3) holds. Moreover, the zero drift condition (MD1) then shows that

2‖x‖〈x̂, µ(x)〉 + tr M(x) − 2x̂⊤M(x)x̂ = −Ex[θ0²] ≤ −v,

by (MD2). Hence the conditions of Theorem 4.1.8(ii) are satisfied, and this establishes recurrence.

Proof of Theorem 4.1.3. Again, Lemma 4.1.1 shows that (MD3) holds. Under the conditions of the theorem, M(x) = σ²I, so tr M(x) = dσ² and, for any u ∈ Sd−1, u⊤M(x)u = σ²; thus (MD4) holds. Now, we have

2‖x‖〈x̂, µ(x)〉 + tr M(x) − 2x̂⊤M(x)x̂ = (d − 2)σ²,

which is strictly positive if d ≥ 3, in which case transience follows from Theorem 4.1.8(i), and non-positive if d ≤ 2, in which case recurrence follows from Theorem 4.1.8(ii).


4.2 Elliptic random walks

4.2.1 Recurrence classification

Theorem 4.1.2 shows that in d = 1, under mild conditions, the classical Chung–Fuchs recurrence classification for homogeneous zero-drift random walks extends to zero-drift non-homogeneous random walks. This extension fails for dimension d ≥ 2; in this section we demonstrate this, and hence prove Theorem 1.5.3. To do so, we introduce a family of non-homogeneous random walks that we call elliptic random walks, that demonstrate the possible 'anomalous' behaviour of non-homogeneous random walks when compared to classical homogeneous random walks.

The following assumption on the asymptotic stability of the covariance structure of the process along rays is central; recall that ‖ · ‖op denotes the matrix (operator) norm.

(E1) Suppose that there exists a positive-definite matrix function σ² with domain Sd−1 such that, as r → ∞,

ε(r) := sup_{x∈Σ: ‖x‖≥r} ‖M(x) − σ²(x̂)‖op → 0.

A little informally, (E1) says that M(x) → σ²(x̂) as ‖x‖ → ∞; in what follows, we will often make similar statements, formal versions of which may be cast as in (E1).

Note that (MD2) and (E1) together imply that tr σ²(u) ≥ v > 0; next we impose a key assumption on the form of σ² that is considerably stronger. To describe this, it is convenient to introduce the notation 〈 · , · 〉u that defines, for each u ∈ Sd−1, an inner product on Rd via

〈y, z〉u := y⊤σ²(u)z = 〈y, σ²(u)z〉, for y, z ∈ Rd.

(E2) Suppose that there exist constants U and V with 0 < U ≤ V < ∞ such that, for all u ∈ Sd−1,

〈u, u〉u = U, and tr σ²(u) = V.

Informally, V quantifies the total variance of the increments, while U quantifies the variance in the radial direction; necessarily U ≤ V. The assumption that U > 0 excludes some degenerate cases. As we will see, one possible way to satisfy condition (E2) is to suppose that the eigenvectors of σ²(u) are all parallel or perpendicular to the vector u, and that the corresponding eigenvalues are all constant as u varies; the level sets of the corresponding quadratic forms qu(x) := 〈x, x〉u for u ∈ Sd−1 are then ellipsoids like those depicted in Figure 4.2.

Our main result on elliptic random walks is the following, which shows that both transience and recurrence are possible for any d ≥ 2, depending on parameter choices; as seen in Theorem 4.1.2, this possibility of anomalous recurrence behaviour is a genuinely multidimensional phenomenon under our regularity conditions.

Theorem 4.2.1. Suppose that (MD0), (MD1), and (MD2) hold, and that (4.1) holds for some p > 2. Suppose also that (E1) and (E2) hold, with constants 0 < U ≤ V < ∞ as defined in (E2). Then the following recurrence classification is valid.

(i) If 2U < V , then ξn is transient.

(ii) If 2U > V , then ξn is recurrent.

(iii) If 2U = V and (E1) holds with ε(r) = o(log^{−1} r), then ξn is recurrent.

An explicit family of examples satisfying conditions (E1) and (E2) is presented in Section 4.2.2. We deduce Theorem 4.2.1 by another application of Theorem 4.1.8.

Proof of Theorem 4.2.1. Lemma 4.1.1 shows that (MD3) holds. By definition of ε(r) at (E1) we have ‖M(x) − σ²(x̂)‖op = O(ε(‖x‖)) as ‖x‖ → ∞. Then (E2) implies that

tr M(x) = tr σ²(x̂) + O(ε(‖x‖)) = V + O(ε(‖x‖)); and
x̂⊤M(x)x̂ = 〈x̂, σ²(x̂)x̂〉 + O(ε(‖x‖)) = U + O(ε(‖x‖)).

These expressions, together with (MD1), show that

sup_{x∈Σ: ‖x‖≥r} | 2‖x‖〈x̂, µ(x)〉 + tr M(x) − 2x̂⊤M(x)x̂ − (V − 2U) | = O(ε(r)).

Given that ε(r) → 0, if V − 2U > 0, then Theorem 4.1.8(i) yields transience, while if V − 2U < 0, then Theorem 4.1.8(ii) yields recurrence. In the boundary case V − 2U = 0, Theorem 4.1.8(ii) again yields recurrence, provided that ε(r) = o(log^{−1} r).


4.2.2 Example: Increments supported on ellipsoids

Let d ≥ 2. We describe a specific model on Σ = Rd where the jump distribution at x ∈ Rd is supported on an ellipsoid having one distinguished axis aligned with the vector x. The model is specified by two constants a, b > 0. Construct θ as follows. Let Qx be an orthogonal matrix representing a transformation of Rd mapping e1 to x̂, and write D = √d · diag(a, b, . . . , b). Given ξ0 = x, take ζ uniform on Sd−1 and set

θ = QxDζ if x ≠ 0; θ = ζ if x = 0. (4.32)

See Figure 4.4 for an illustration of this construction.

Figure 4.4: Definition of θ = QxDζ.

Thus θ is a random point on an ellipsoid that has one distinguished semi-axis, of length a√d, aligned in the x direction, and all other semi-axes of length b√d. Note that the law of θ is well defined, since the distribution of QxDζ is the same for any orthogonal matrix Qx satisfying Qxe1 = x̂. Concretely, observe that for x ≠ 0, θ can be rewritten as

θ = QxDQx⊤Qxζ = QxDQx⊤ζ̃,

where ζ̃ = Qxζ is also uniform on Sd−1, by the spherical symmetry of the uniform distribution. Moreover, the symmetric matrix Hx := QxDQx⊤ is determined explicitly in terms of x via

Hx = Qx( b√d I + (a − b)√d e1e1⊤ )Qx⊤ = b√d I + (a − b)√d x̂x̂⊤.


Thus a more concrete specification of θ is

θ = Hxζ̃ = b√d ζ̃ + (a − b)√d x̂〈x̂, ζ̃〉,

with ζ̃ taken to be uniform on Sd−1.

As a corollary to Theorem 4.2.1, we get the following recurrence classification for this example.

Corollary 4.2.2. Let d ≥ 1 and a, b ∈ (0,∞). Let ξn be a random walk on Rd with increment distribution determined by (4.32) for parameters a, b. Then ξn is transient if a² < (d − 1)b² and recurrent if a² ≥ (d − 1)b².

Proof. Since ‖θ‖ is bounded above by √d max{a, b}, assumption (4.1) holds for all p > 2. Moreover, by (4.32), we compute

M(x) = Ex[θθ⊤] = E[QxDζζ⊤DQx⊤] = QxD E[ζζ⊤] DQx⊤ = (1/d) QxD²Qx⊤,

by linearity of expectation, and using the fact that E[ζζ⊤] = (1/d)I for ζ uniformly distributed on Sd−1. Hence (E1) holds with

σ²(u) = (1/d) QuD²Qu⊤, for u ∈ Sd−1,

and with ε(r) identically zero. Also, by a similar calculation (or a symmetry argument) we have µ(x) = 0 for all x ∈ Rd, since E[ζ] = 0; thus (MD1) holds. Finally, to verify that (MD2) and (E2) hold, note that the matrix σ²(u) represented in coordinates for the orthonormal basis Que1 = u, Que2, . . . , Qued is diagonal with entries a², b², . . . , b². Indeed,

σ²(u) = Qu[ b²I + (a² − b²)e1e1⊤ ]Qu⊤ = a²uu⊤ + b²(I − uu⊤),

and therefore 〈u, u〉u = 〈u, σ²(u)u〉 = a² > 0 for all u ∈ Sd−1, and tr(M(x)) = tr(σ²(x̂)) = a² + (d − 1)b² > 0 for all x ∈ Rd. Thus we verify all the assumptions of Theorem 4.2.1, with U = a² and V = a² + (d − 1)b², and the claimed recurrence classification follows.

In two dimensions we can explicitly describe the random walk as follows. For x ∈ R2, x ≠ 0, with x = (x1, x2) in Cartesian components, set x⊥ := (−x2, x1). Fix a, b ∈ (0,∞). Let Ex(a, b) denote the ellipse with centre x and principal axes aligned in the x, x⊥ directions, with lengths 2√2 a, 2√2 b respectively, given in parametrized form by

Ex(a, b) := { x + √2 a (x/‖x‖) cos ϕ + √2 b (x⊥/‖x‖) sin ϕ : ϕ ∈ (−π, π] }, (4.33)

and for x = 0 set

E0(a, b) := { e1 cos ϕ + e2 sin ϕ : ϕ ∈ (−π, π] },

the unit circle. The parameter ϕ in the parametrization (4.33) should be interpreted with caution: it is not, in general, the central angle of the parametrized point on the ellipse.

Given ξn = x ∈ R2, ξn+1 is taken to be distributed on Ex(a, b), 'uniformly' with respect to the parametrization (4.33). Precisely, let ϕ0, ϕ1, . . . be a sequence of independent random variables uniformly distributed on (−π, π]. Then, on {ξn ≠ 0},

ξn+1 = ξn + √2 a (ξn/‖ξn‖) cos ϕn + √2 b (ξn⊥/‖ξn‖) sin ϕn,

while, on {ξn = 0}, ξn+1 = (cos ϕn, sin ϕn).

Figure 4.5 shows two simulated sample paths of this two-dimensional elliptic random walk, for different choices of a and b, one recurrent and the other transient, corresponding roughly to the two case pictures in Figure 4.2.

Figure 4.5: Simulation of 10^5 steps of the elliptic random walk in R2 for the recurrent case a > b (left) and the transient case a < b (right).
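A simulation like those of Figure 4.5 follows directly from the recipe above; here we use 10^4 steps rather than 10^5 to keep the run quick, and the seed and the choice a > b (the recurrent side of Corollary 4.2.2) are ours.

```python
import numpy as np

rng = np.random.default_rng(3)
a, b, n = 2.0, 1.0, 10_000

xi = np.zeros(2)
path = [xi.copy()]
for phi in rng.uniform(-np.pi, np.pi, size=n):
    if np.allclose(xi, 0.0):
        # jump from the origin: uniform on the unit circle
        xi = np.array([np.cos(phi), np.sin(phi)])
    else:
        r = np.linalg.norm(xi)
        perp = np.array([-xi[1], xi[0]])  # xi_perp = (-x2, x1)
        xi = xi + np.sqrt(2) * a * (xi / r) * np.cos(phi) \
                + np.sqrt(2) * b * (perp / r) * np.sin(phi)
    path.append(xi.copy())
path = np.array(path)

# away from the origin each squared jump length is 2 a^2 cos^2 + 2 b^2 sin^2,
# so it lies between 2 min(a,b)^2 and 2 max(a,b)^2
sq = np.sum(np.diff(path, axis=0) ** 2, axis=1)
assert np.all(sq[1:] <= 2 * max(a, b) ** 2 + 1e-9)
assert np.all(sq[1:] >= 2 * min(a, b) ** 2 - 1e-9)
```

Plotting `path` reproduces the qualitative pictures of Figure 4.5.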

4.3 Controlled random walks with zero drift

In this section we study transience and recurrence for walks in dimensions 3 and above that are generated by a finite collection of step distributions. We now give the precise definition of the walks we will be considering.

Definition 4.3.1. Let π1, . . . , πk be k probability measures in Rd and let (δ_n^j, j = 1, . . . , k, n = 1, 2, 3, . . .) be independent random variables with δ_n^j ∼ πj for all n and all j = 1, . . . , k. Define an adapted rule ℓ = (ℓ(n), n ≥ 0) with respect to a filtration (Fn, n ≥ 0) to be a process such that, for all n ∈ Z+, ℓ(n) ∈ {1, . . . , k} is Fn-measurable. We say that the walk (ξn, n ≥ 0) with ξ0 = 0 is generated by the measures π1, . . . , πk and the rule ℓ, if ξn is Fn-measurable and

ξn+1 = ξn + δ_{n+1}^{ℓ(n)}.

That is, informally speaking, on each step the law of the jump of the process is chosen from k possibilities, according to some rule that depends on the history of the process. In particular, any Markov chain with at most k different jump distributions fits into this framework.

For a measure π in Rd, let Z ∼ π. We say π has mean 0 if EZ = 0, i.e., if ∫_{Rd} x π(dx) = 0. We say that π has β moments if E[‖Z‖^β] < ∞. We define the covariance matrix of π as Cov(π) = E[ZZ⊤].

Note that if π is a measure in Rd, then it has an invertible covariance matrix if and only if its support contains d linearly independent vectors of Rd. We will call such measures d-dimensional.

We say that the walk ξn is transient if ‖ξn‖ → ∞ a.s. as n → ∞, and recurrent if there exists a compact set that is visited by ξn infinitely many times a.s. (observe that, since the walks we are considering are not necessarily Markovian, in principle one can construct examples that are neither transient nor recurrent).

In this section we address the following two questions:

• Let π1, . . . , πk be mean 0 probability measures in Rd. What are the conditions on the measures so that for every adapted rule ℓ the resulting walk is transient?

• For a given dimension d, how do we construct examples of recurrent walks generated by k d-dimensional mean 0 measures? How small can this number k be made?

4.3.1 Transience in two dimensions

In dimension 2 we are able to construct a transient walk with just twomeasures, in the following way. Let us consider the following two measures:

π1 =1

6(δ(0,1) + δ(1,0) + δ(1,1) + δ(0,−1) + δ(−1,0) + δ(−1,−1)),

Page 221: Non-homogeneous random walks - Unicamp

4.3. Controlled random walks with zero drift 203

0

Figure 4.6: A transient random walk in Z2 with two measures (all non-zerotransition probabilities equal 1

6)

π2 =1

6(δ(0,1) + δ(1,0) + δ(−1,1) + δ(0,−1) + δ(−1,0) + δ(1,−1)),

and two sets

A1 = {(x, y) ∈ Z2 : x = 0, or x > 0 and y < 0, or x < 0 and y > 0},
A2 = Z2 \ A1.

Then, we define the Markov chain (ξn) by

ξ_{n+1} − ξ_n ∼ π1 1{ξn ∈ A1} + π2 1{ξn ∈ A2},

see Figure 4.6. To prove that this Markov chain is transient, consider the process (Y^{(1)}_n, Y^{(2)}_n) = (|ξ^{(1)}_n|, |ξ^{(2)}_n|). This new process is a random walk in the quarter-plane, with the jumps distributed as π2 inside and the drift orthogonal to the boundary on the axes. Theorem ????? [cite some result from the book] then implies that (Yn) is transient (since .....), yielding also the transience of (ξn).

4.3.2 Conditions for transience, and a recurrent example in higher dimensions

Theorem 4.3.2. Let π1, π2 be d-dimensional measures in Rd, d ≥ 3, with zero mean and 2 + β moments, for some β > 0. If ξn is a random walk generated by these measures and an arbitrary adapted rule ℓ, then ξn is transient.

The following result will be used in the proof of Theorem 4.3.2 but is also of independent interest, since it gives a sufficient condition on the covariance matrices of the measures used in order to generate a transient random walk ξn for an arbitrary adapted rule ℓ.

Recall that for a matrix A we write Aᵀ for its transpose, λmax(A) for its maximum eigenvalue, and tr A for its trace.

Theorem 4.3.3. Let π1, . . . , πk be mean 0 measures in Rd, d ≥ 3, with 2 + β moments, for some β > 0. Suppose that there exists a matrix A such that for all i we have

tr(A Mi Aᵀ) > 2 λmax(A Mi Aᵀ), (4.34)

where Mi is the covariance matrix of the measure πi. If ξn is a random walk generated by these measures and an arbitrary adapted rule ℓ, then ξn is transient.

We will refer to (4.34) as the trace condition.
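As a quick numerical illustration (a sketch assuming NumPy, not part of the text), one can test the trace condition (4.34) for a candidate matrix A and a family of covariance matrices:

```python
import numpy as np

def trace_condition(A, covs):
    """True iff tr(A M A^T) > 2*lambda_max(A M A^T) for every M in covs."""
    for M in covs:
        S = A @ M @ A.T
        if np.trace(S) <= 2 * np.linalg.eigvalsh(S).max():
            return False
    return True

# isotropic covariance in d = 3 satisfies it: tr = 3 > 2 = 2*lambda_max
print(trace_condition(np.eye(3), [np.eye(3)]))                   # True
# a very elongated covariance fails with A = I: tr = 12 <= 2*10
print(trace_condition(np.eye(3), [np.diag([10.0, 1.0, 1.0])]))   # False
```

Intuitively, (4.34) rules out covariances so elongated that, after the linear change of coordinates A, most of the variance lies along a single direction.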

Now, we discuss the situation where the walker may be recurrent in dimension d ≥ 3.

Theorem 4.3.4. There exist d mean-zero d-dimensional measures of bounded support and an adapted rule such that the corresponding walk is recurrent.

To prove this result, it is enough to construct a particular example of a recurrent random walk in d dimensions, which is generated by d mean 0 d-dimensional measures; we now describe this example. We consider a random walk (ξn, n = 0, 1, 2, . . .) on Zd, d ≥ 3, defined in the following way. Fix a parameter γ > 0, and for x = (x0, . . . , x_{d−1}) ∈ Zd define ϱ(x) = min{k : |x_k| = max_{j=0,...,d−1} |x_j|}. Then

ξ_{n+1} = ξ_n + δ_{n+1},

where δ_{n+1} = ±e_{ϱ(ξn)} with probabilities γ/(2(γ + d − 1)) and δ_{n+1} = ±e_k for k ≠ ϱ(ξn) with probabilities 1/(2(γ + d − 1)). In words, we choose the maximal (in absolute value) coordinate of ξn with weight γ and all the other coordinates with weight 1, and then add 1 or −1 to the chosen coordinate with equal probabilities.


4.3.3 Proofs

First, we give the proof of Theorem 4.3.3.

Proof of Theorem 4.3.3. Consider the process ζn = Aξn; then ζn is a walk generated by the measures Aπ1, . . . , Aπk given by (Aπi)(R) = πi(A^{−1}R) for R ⊆ Rd. In particular, the increment covariances for ζn are

Mn := E[(ζ_{n+1} − ζ_n)(ζ_{n+1} − ζ_n)ᵀ | Fn] = E[A θn θnᵀ Aᵀ | Fn]
= A E[δ^{ℓ(n)}_{n+1} (δ^{ℓ(n)}_{n+1})ᵀ | Fn] Aᵀ
= A M_{ℓ(n)} Aᵀ.

By assumption, the trace condition (4.34) holds for all possible ℓ(n), so that, for all n,

ess inf ( tr Mn − 2 λmax(Mn) ) > 0.

This condition is the analogue of the Markov chain condition (4.30) for adapted processes, and can be used to establish transience by extending slightly the proofs of Theorem 4.1.8(i) and Corollary 4.1.9.

Now, we turn to the proof of Theorem 4.3.2.

Proposition 4.3.5. Let M1, M2 be 3 × 3 invertible positive definite matrices. Then there exists a 3 × 3 matrix A such that

tr(A Mi Aᵀ) > 2 λmax(A Mi Aᵀ) for i = 1, 2.

Proof. We prove Proposition 4.3.5 by constructing the matrix A of Theorem 4.3.3 directly.

Let π1, π2 have covariance matrices M1 and M2 respectively and δi ∼ πi for i = 1, 2. Since M1 is positive definite, there exists an orthogonal matrix U such that U M1 Uᵀ is diagonal, i.e.,

U M1 Uᵀ = diag(a, b, c),

where a, b, c > 0 are the eigenvalues of M1. If we now multiply the vector Uδ1 by the matrix D given by

D = diag(1/√a, 1/√b, 1/√c),


then Cov(DUδ1) = I, where I stands for the 3 × 3 identity matrix.

So far we have applied the matrix DU to the vector δ1, and we have to apply the same transformation to the vector δ2. The vector DUδ2 will have covariance matrix M̃2 := DU M2 (DU)ᵀ. Since it is positive definite, it can be diagonalized, so there exists an orthogonal matrix V such that

V M̃2 Vᵀ = diag(λ1, λ2, λ3),

where λ1 ≥ λ2 ≥ λ3 > 0 are the eigenvalues in decreasing order. Applying the same transformation to DUδ1 is not going to change its identity covariance matrix, since V is orthogonal.

The condition we want to satisfy is

λ1 + λ2 + λ3 > 2λi, for all i = 1, 2, 3.

Since the eigenvalues are in decreasing order, it is clear that this inequality is always satisfied for i = 2, 3. Suppose that λ2 + λ3 ≤ λ1. Multiplying V DUδ2 by the matrix

B = diag(√(λ2/λ1), 1, 1)

will give us a random vector with covariance matrix diag(λ2, λ2, λ3), which clearly satisfies the trace condition (4.34). Multiplying V DUδ1 by the same matrix will give us a vector with covariance matrix diag(λ2/λ1, 1, 1), which satisfies the trace condition (4.34), since λ2 ≤ λ1.
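The proof is constructive and easy to check numerically. The sketch below (assuming NumPy; the use of `eigh` and the explicit test matrices are ours) builds the composite matrix A of the proof and verifies the trace condition for both covariances:

```python
import numpy as np

def whiten(M):
    # orthogonal U with U M U^T diagonal, then D = diag(eigenvalues^{-1/2})
    w, V = np.linalg.eigh(M)
    return np.diag(w ** -0.5) @ V.T        # W M W^T = I

def build_A(M1, M2):
    W = whiten(M1)                         # step 1: DU whitens M1
    lam, V = np.linalg.eigh(W @ M2 @ W.T)  # step 2: diagonalise transformed M2
    lam, V = lam[::-1], V[:, ::-1]         # decreasing: lam1 >= lam2 >= lam3
    B = np.eye(3)
    if lam[0] >= lam[1] + lam[2]:          # step 3: shrink the top direction
        B[0, 0] = np.sqrt(lam[1] / lam[0])
    return B @ V.T @ W

def trace_ok(S):
    return np.trace(S) > 2 * np.linalg.eigvalsh(S).max()

M1 = np.diag([4.0, 1.0, 0.25])
M2 = np.diag([100.0, 1.0, 0.5])            # violates (4.34) on its own
A = build_A(M1, M2)
print(trace_ok(A @ M1 @ A.T), trace_ok(A @ M2 @ A.T))  # True True
```

The shrinking step only fires when the top eigenvalue dominates the other two, exactly as in the case distinction of the proof.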

Proof of Theorem 4.3.2. By projection to the first three coordinates, it is clear that it suffices to prove the theorem in 3 dimensions. In d = 3, the statement of the theorem follows from Theorem 4.3.3 and Proposition 4.3.5.


Remark 4.3.6. It can be seen from the proof of Proposition 4.3.5 that if the measures π1 and π2 are supported on any 3-dimensional subspaces of Rd, then a walk X generated by these measures and an arbitrary adapted rule is transient.

We will now give the proof of Theorem 4.3.4.

Proof of Theorem 4.3.4. By Theorem 2.5.2, to prove recurrence it is enough to find a nonnegative function f such that f(x) → ∞ as ‖x‖ → ∞, and

E[f(ξ_{n+1}) − f(ξ_n) | ξ_n = x] ≤ 0 for all large enough x. (4.35)

Before presenting the explicit construction of such a function, let us informally explain the intuition behind this construction. First of all, recall from (1.4) and (1.5) that if Sn is a simple random walk in Zd, then [this computation probably appeared somewhere in this book already...]

E[‖S_{n+1}‖ − ‖S_n‖ | S_n = x] = ((d − 1)/(2d)) (1/‖x‖) + O(‖x‖^{−2});
E[(‖S_{n+1}‖ − ‖S_n‖)² | S_n = x] = 1/d + O(‖x‖^{−1}).

One can observe that the ratio of the drift to the second moment behaves as (d − 1)/(2‖x‖); combined with the well-known fact that the SRW is recurrent for d = 2 and transient for d ≥ 3, this suggests that, to obtain recurrence, the constant in this ratio should not be too large (in fact, at most 1/2). Then, the second moment depends essentially on the dimension, and thus it is crucial to look at the drift. So, consider a (smooth in Rd \ {0}) function g(x) = Θ(‖x‖); we shall try to figure out how the level sets of g should be so that the "drift outside" with respect to g "behaves well" (i.e., the drift multiplied by ‖x‖ is uniformly bounded above by a not-so-large constant). For that, let us look at Figure 4.7: level sets of g are indicated by solid lines, vectors' sizes correspond to transition probabilities. Then, it is intuitively clear that the case of "moderate" drift corresponds to the following:

• the "preferred" direction is radial and the curvature of level lines is large, or

• the "preferred" direction is transversal and the curvature of level lines is small;

Figure 4.7: Looking at the level sets: how large is the drift? We have very small drift in case 1, very large drift in case 2, and moderate drifts in cases 3 and 4.

Figure 4.8: How the level sets of f should look.

also, it is clear that "very flat" level lines always generate small drift. However, one cannot hope to make the level lines very flat everywhere, as they should go around the origin. So, the idea is to find in which places one can afford "more curved" level lines.

Observe that, for the random walk we are considering now, the preferred direction near the axes is the radial one, while in the "diagonal" regions it is in some intermediate position between transversal and radial. This indicates that the level sets of the Lyapunov function should look as depicted in Figure 4.8: more curved near the axes, and flatter off the axes.

We are going to use the Lyapunov function

f(x) = ϕ(x/‖x‖) ‖x‖^α,

where α is a positive constant and ϕ : Sd−1 → R is a positive continuous function, symmetric in the sense that for any (u0, . . . , u_{d−1}) ∈ Sd−1 we have ϕ(u0, . . . , u_{d−1}) = ϕ(τ0 u_{σ(0)}, . . . , τ_{d−1} u_{σ(d−1)}) for any permutation σ and any τ ∈ {−1, 1}^d. By the previous discussion, to have the level sets as in Figure 4.8, we are aiming at constructing ϕ with values close to 1 near the "diagonals" and less than 1 near the axes.


By symmetry, it is enough to define the function ϕ for u ∈ Sd−1 such that u0 ≥ uj ≥ 0 for j = 1, . . . , d − 1 (clearly, it then holds that u0 > 0), and, again by symmetry, it is enough to prove (4.35) for all large enough x ∈ Zd of the same kind. For such u ∈ Sd−1 abbreviate sj = uj/u0, j = 1, . . . , d − 1; observe that, if u = x/‖x‖, then sj = xj/x0. We are going to look for the function (for u as above) ϕ(u) = 1 − αψ(s1, . . . , s_{d−1}), where ψ is a function with continuous third partial derivatives on [0, 1]^{d−1} (in fact, it will become clear that the function ψ, extended by means of symmetry to [−1, 1]^{d−1}, has continuous third derivatives on [−1, 1]^{d−1}; this will imply that the o(·)-terms in the computations below are uniform).

Next, we proceed in the following way: we do calculations in order to figure out which conditions the function ψ should satisfy in order to guarantee that (4.35) holds, and then try to construct a concrete example of ψ that satisfies these conditions.

First of all, a straightforward calculation shows that for any e ∈ Zd with ‖e‖ = 1 we have [did it appear in the book?]

‖x + e‖^α = ‖x‖^α ( 1 + α (x · e)/‖x‖² + α/(2‖x‖²) − (1/2) α(2 − α) ((x · e)²/‖x‖²) · (1/‖x‖²) + o(‖x‖^{−2}) ), (4.36)

as x → ∞.

In the computations below, we will use the abbreviations

ψ′_j := ∂ψ(s1, . . . , s_{d−1})/∂s_j, j = 1, . . . , d − 1,
ψ″_{ij} := ∂²ψ(s1, . . . , s_{d−1})/(∂s_i ∂s_j), i, j = 1, . . . , d − 1.

Let us now consider x ∈ Zd. From now on we will refer to the situation when x0 > xj ≥ 0 for j = 1, . . . , d − 1 as the "non-boundary case", and x0 = x1 = · · · = xm > x_{m+1} ≥ · · · ≥ x_{d−1} ≥ 0 for some m ≥ 1 as the "boundary case". Observe that in the boundary case the corresponding s will be of the form s = (1, . . . , (1)_m, s_{m+1}, . . . , s_{d−1}); here and in the sequel we indicate the position of a symbol in a row by placing parentheses and putting a subscript. Also, in the situation when only one coordinate of the vector s changes, we use notation of the form ψ((s)_j) for ψ(s1, . . . , s_{j−1}, s, s_{j+1}, . . . , s_{d−1}), possibly omitting the parentheses and the subscript when the position is clear.

First we deal with the non-boundary case. Let us consider x ∈ Zd such that x0 > xj ≥ 0 for j = 1, . . . , d − 1. Again using (4.36) and observing that (recall sj = xj/x0)

xj/(x0 − 1) = sj(1 + x0^{−1} + x0^{−2} + o(‖x‖^{−2})) and xj/(x0 + 1) = sj(1 − x0^{−1} + x0^{−2} + o(‖x‖^{−2})),

we write

E[f(ξ_{n+1}) − f(ξ_n) | ξ_n = x]
= −(1 − αψ(s))‖x‖^α
+ [γ/(2(γ + d − 1))] [ (1 − αψ(x1/(x0 − 1), . . . , x_{d−1}/(x0 − 1))) ‖x − e0‖^α
+ (1 − αψ(x1/(x0 + 1), . . . , x_{d−1}/(x0 + 1))) ‖x + e0‖^α ]
+ [1/(2(γ + d − 1))] Σ_{j=1}^{d−1} [ (1 − αψ((xj − 1)/x0)) ‖x − ej‖^α + (1 − αψ((xj + 1)/x0)) ‖x + ej‖^α ]

= ‖x‖^α { [γ/(2(γ + d − 1))] [ (1 − αψ(s) − α Σ_{j=1}^{d−1} (sj/x0 + sj/x0²) ψ′_j − (α/2) Σ_{i,j=1}^{d−1} (s_i s_j/x0²) ψ″_{ij} + o(‖x‖^{−2}))
× (1 − α x0/‖x‖² + α/(2‖x‖²) − (1/2) α(2 − α) (x0²/‖x‖²)(1/‖x‖²) + o(‖x‖^{−2}))
+ (1 − αψ(s) − α Σ_{j=1}^{d−1} (−sj/x0 + sj/x0²) ψ′_j − (α/2) Σ_{i,j=1}^{d−1} (s_i s_j/x0²) ψ″_{ij} + o(‖x‖^{−2}))
× (1 + α x0/‖x‖² + α/(2‖x‖²) − (1/2) α(2 − α) (x0²/‖x‖²)(1/‖x‖²) + o(‖x‖^{−2}))
− 2(1 − αψ(s)) ]
+ [1/(2(γ + d − 1))] Σ_{j=1}^{d−1} [ −2(1 − αψ(s))
+ (1 − αψ(s) + α x0^{−1} ψ′_j − (α/(2x0²)) ψ″_{jj}) × (1 − α xj/‖x‖² + α/(2‖x‖²) − (1/2) α(2 − α) (xj²/‖x‖²)(1/‖x‖²) + o(‖x‖^{−2}))
+ (1 − αψ(s) − α x0^{−1} ψ′_j − (α/(2x0²)) ψ″_{jj}) × (1 + α xj/‖x‖² + α/(2‖x‖²) − (1/2) α(2 − α) (xj²/‖x‖²)(1/‖x‖²) + o(‖x‖^{−2})) ] }

= α‖x‖^α { [γ/(γ + d − 1)] [ (1 − αψ(s))/(2‖x‖²) − ((2 − α)(1 − αψ(s))/(2‖x‖²)) · (x0²/‖x‖²)
− Σ_{j=1}^{d−1} (sj/x0² − α sj/‖x‖²) ψ′_j − (1/2) Σ_{i,j=1}^{d−1} (s_i s_j/x0²) ψ″_{ij} + o(‖x‖^{−2}) ]
+ [1/(γ + d − 1)] [ ((d − 1)(1 − αψ(s)))/(2‖x‖²) + ((2 − α)(1 − αψ(s))/(2‖x‖²)) · (x0²/‖x‖²) − ((2 − α)(1 − αψ(s)))/(2‖x‖²)
− Σ_{j=1}^{d−1} (α sj/‖x‖²) ψ′_j − (1/2) Σ_{j=1}^{d−1} (1/x0²) ψ″_{jj} + o(‖x‖^{−2}) ] }

= −α‖x‖^{α−2} Φ(x, ψ) + α‖x‖^{α−2} ( γ^{−1} Φ1(x, ψ, γ, α) + α Φ2(x, ψ, γ, α) ), (4.37)

where Φ1 and Φ2 are uniformly bounded for large enough x, and

Φ(x, ψ) = x0²/‖x‖² − 1/2 + (‖x‖²/x0²) ( Σ_{j=1}^{d−1} sj ψ′_j + (1/2) Σ_{i,j=1}^{d−1} s_i s_j ψ″_{ij} ). (4.38)

The idea is then to prove that, with a suitable choice of ψ, the quantity Φ(x, ψ) will be uniformly positive for all large enough x; then the second term on the right-hand side of (4.37) can be controlled by choosing large γ and small α. This will make (4.37) negative for all large x.

Now, in order to obtain a simplified form for (4.38), we pass to the (hyper)spherical coordinates:

s1 = r cos θ1,
s2 = r sin θ1 cos θ2,
. . .
s_{d−2} = r sin θ1 · · · sin θ_{d−3} cos θ_{d−2},
s_{d−1} = r sin θ1 · · · sin θ_{d−3} sin θ_{d−2}.

Since ‖x‖²/x0² = 1 + r², and (abbreviating ψ′_r = ∂ψ/∂r and ψ″_{rr} = ∂²ψ/∂r²)

ψ′_r = (1/r) Σ_{j=1}^{d−1} sj ψ′_j, ψ″_{rr} = (1/r²) Σ_{i,j=1}^{d−1} s_i s_j ψ″_{ij},

we have

Φ(x, ψ) = 1/(1 + r²) − 1/2 + (1 + r²) ( r ψ′_r + (r²/2) ψ″_{rr} )
= ((1 + r²)/2) ( (1 − r²)/(1 + r²)² + (r² ψ′_r)′_r ). (4.39)

Figure 4.9: On the construction of h.

Now, we define the function ψ (it will depend on r only, not on θ1, . . . , θ_{d−2}) in the following way. First, clearly, we need to define ψ(r) for r ∈ [0, √(d − 1)]. Then, observe that

∫_0^{√(d−1)} (1 − r²)/(1 + r²)² dr = √(d − 1)/d > 0, (4.40)

so, for a suitable (small enough) ε0 we can construct a smooth function h with the following properties (on the Cartesian plane with coordinates (r, y), think of going from the origin along y = r²/(4ε0²) until it intersects with y = (1 − r²)/(1 + r²)², and then modify the curve a little bit around the intersection point to make it smooth, see Figure 4.9):

(i) 0 ≤ h(r) ≤ (1 − r²)/(1 + r²)² for all r < 2ε0, and h(r) = (1 − r²)/(1 + r²)² for r ≥ 2ε0;

(ii) h(0) = 0 and h(r) ∼ r²/(4ε0²) as r → 0;

(iii) (1 − r²)/(1 + r²)² − h(r) > 1/2 for r ≤ ε0;

(iv) b := ∫_0^{√(d−1)} h(r) dr > 0 (by (4.40) it holds in fact that b ∈ (0, 1));

(v) ∫_0^r h(u) du > b r³/(3(d − 1)^{3/2}) for all r ∈ (0, √(d − 1)].


Denote H(r) = ∫_0^r h(u) du, so that we have H(√(d − 1)) = b. Then, define for r ∈ [0, √(d − 1)]

ψ(r) = ∫_r^{√(d−1)} ( H(v)/v² − b v/(3(d − 1)^{3/2}) ) dv. (4.41)

For the function ψ defined in this way, we have r² ψ′(r) = b r³/(3(d − 1)^{3/2}) − H(r), so h(r) + (r² ψ′(r))′ = b (d − 1)^{−3/2} r². By construction, it then holds that

inf_{r∈[0,√(d−1)]} ( (1 − r²)/(1 + r²)² + (r² ψ′(r))′ ) ≥ b (d − 1)^{−3/2} ε0² ∧ 1/2, (4.42)

and this (recall (4.38) and (4.39)) shows that, if γ is large enough and α is small enough, then the right-hand side of (4.37) is negative for all large enough x ∈ Zd.
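One can sanity-check this construction numerically. The sketch below uses a crude non-smooth stand-in h(r) = min(r²/(4ε0²), (1 − r²)/(1 + r²)²) (the h of the proof is a smoothed version of exactly this curve) and checks property (iv) together with the positivity of the quantity bounded in (4.42):

```python
import numpy as np

d, eps0 = 4, 0.3
r = np.linspace(1e-6, np.sqrt(d - 1), 20001)
g = (1 - r**2) / (1 + r**2) ** 2                  # the curve (1-r^2)/(1+r^2)^2
h = np.minimum(r**2 / (4 * eps0**2), g)           # non-smooth stand-in for h

# b = int_0^{sqrt(d-1)} h(r) dr, by the trapezoidal rule
b = float(np.sum((h[1:] + h[:-1]) / 2 * np.diff(r)))

# since h(r) + (r^2 psi'(r))' = b (d-1)^{-3/2} r^2, the quantity in (4.42)
# equals g(r) - h(r) + b (d-1)^{-3/2} r^2, which should stay positive
expr = g - h + b * (d - 1) ** -1.5 * r**2
print(0 < b < 1, expr.min() > 0)
```

Near the origin the first term of `expr` dominates (it is close to 1 there), while away from the origin the strictly positive r²-term takes over — the same two regimes that (4.42) captures with the minimum b(d − 1)^{−3/2}ε0² ∧ 1/2.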

To complete the proof of the theorem, it remains to deal with the boundary case. Let x0 = x1 = · · · = xm > x_{m+1} ≥ · · · ≥ x_{d−1} ≥ 0 for some m ≥ 1. Using (4.36) (up to the term of order ‖x‖^{−1} in the parentheses), using the fact that ϕ is invariant under permutations, and observing that x0 and ‖x‖ are of the same order, we have

E[f(ξ_{n+1}) − f(ξ_n) | ξ_n = x]
= −(1 − αψ(s))‖x‖^α
+ [(γ + m)/(2(γ + d − 1))] [ (1 − αψ(((x0 − 1)/x0)_m)) ‖x − e0‖^α
+ (1 − αψ(x0/(x0 + 1), . . . , (x0/(x0 + 1))_m, x_{m+1}/(x0 + 1), . . . , x_{d−1}/(x0 + 1))) ‖x + e0‖^α ]
+ [1/(2(γ + d − 1))] Σ_{j=m+1}^{d−1} [ (1 − αψ((xj − 1)/x0)) ‖x − ej‖^α + (1 − αψ((xj + 1)/x0)) ‖x + ej‖^α ]

= ‖x‖^α { [(γ + m)/(2(γ + d − 1))] [ (1 − α(ψ(s) − ψ′_m/x0 + o(‖x‖^{−1}))) (1 − α x0/‖x‖² + o(‖x‖^{−1}))
+ (1 − α(ψ(s) − Σ_{k=1}^{m} ψ′_k/x0 − Σ_{k=m+1}^{d−1} s_k ψ′_k/x0 + o(‖x‖^{−1}))) (1 + α x0/‖x‖² + o(‖x‖^{−1}))
− 2(1 − αψ(s)) ]
+ [1/(2(γ + d − 1))] Σ_{j=m+1}^{d−1} [ (1 − α(ψ(s) − ψ′_j/x0 + o(‖x‖^{−1}))) (1 − α xj/‖x‖² + o(‖x‖^{−1}))
+ (1 − α(ψ(s) + ψ′_j/x0 + o(‖x‖^{−1}))) (1 + α xj/‖x‖² + o(‖x‖^{−1}))
− 2(1 − αψ(s)) ] }

= α‖x‖^α [(γ + m)/(2(γ + d − 1))] [ (1/x0) ( Σ_{k=1}^{m−1} ψ′_k + 2ψ′_m + Σ_{k=m+1}^{d−1} s_k ψ′_k ) + o(‖x‖^{−1}) ] (4.43)

(observe that in the above calculation all the terms of order ‖x‖^{α−1} that correspond to the choice of coordinates m + 1, . . . , d − 1 of x cancel).

Now simply note that, by property (v), we have ψ′(r) < 0 for all r ∈ (0, √(d − 1)]. Observe also that for some positive constant δ0 it holds that ψ′(r) ≤ −δ0 for all r ∈ [1, √(d − 1)]. Then (recall that in the boundary case s1 = 1 and sj ≥ 0 for all j = 2, . . . , d − 1) we have

ψ′_j = (sj/r) ψ′_r ≤ 0 for all j = 1, . . . , d − 1, and ψ′_1(s) ≤ −δ0/√(d − 1).

This implies that the right-hand side of (4.43) is negative for all large enough x ∈ Zd, and thus concludes the proof of Theorem 4.3.4.

4.4 Centrally biased random walks

4.4.1 Model and notation

The results of Sections 4.2 and 4.3 demonstrate that in the spatially non-homogeneous setting, even assuming zero drift is no guarantee of predictable behaviour for a many-dimensional random walk, at least in terms of recurrence versus transience. However, we saw in Theorem 4.1.3 that one regains some control given suitable assumptions on the covariance structure of the increments of the walk: then the classical recurrence theory for homogeneous random walk has a natural analogue for non-homogeneous random walk. In this setting, it turns out that the zero drift assumption can be relaxed to an assumption of asymptotically zero drift, which also turns out to be the correct setting in which to probe the recurrence/transience phase transition, in much the same way as in the one-dimensional setting of Chapter 3. The most natural framework in which to investigate these questions, and more,


is that of the centrally biased random walk, which is the focus for this part of the present chapter.

Once more we use the notation of Section 4.1.2 and consider a non-homogeneous random walk ξn on Σ ⊆ Rd, d ≥ 1. In this section, our basic assumption will again be (MD0).

As mentioned in Section 4.1.3, under mild conditions, a non-homogeneous random walk with zero drift and fixed increment covariance matrix is recurrent if and only if d ≤ 2. We will see in Section 4.4.2 that the assumption µ(x) = 0 may be relaxed to (essentially) ‖µ(x)‖ = o(‖x‖^{−1}). On the other hand, it is possible to achieve either transience or recurrence (even positive recurrence) in any dimension under the condition ‖µ(x)‖ = O(‖x‖^{−1}), as described in Section 4.1.5. In view of these phenomena, and others that we will see later, we call the case where ‖µ(x)‖ = O(‖x‖^{−1}) the critical case. Before considering that case in detail, we examine in turn the subcritical case in which ‖x‖‖µ(x)‖ → 0 and the supercritical case in which ‖x‖‖µ(x)‖ → ∞.

Let us comment briefly on our use of the terminology 'centrally biased' random walk (see the bibliographical notes at the end of this chapter for historical remarks): in its broadest use we mean any random walk whose drift field µ is perturbed around the origin, so, typically, µ(x) is asymptotically zero as ‖x‖ → ∞. In some, but certainly not all, of the results in this section, the 'central bias' takes on the additional sense that the dominant component of the drift field is in the radial direction, towards or away from the origin.

4.4.2 Subcritical case

The following result generalizes Theorem 4.1.3.

Theorem 4.4.1. Suppose that (MD0), (MD3), and (MD4) hold, and that (4.1) holds for some p > 2. Suppose that there exists σ² ∈ (0, ∞) for which

‖µ(x)‖ = o(‖x‖^{−1}) and ‖M(x) − σ²I‖_op → 0, as ‖x‖ → ∞. (4.44)

Then ξn is recurrent if d = 1 and transient if d ≥ 3.

If d = 2 and, as ‖x‖ → ∞,

‖µ(x)‖ = o(‖x‖^{−1} log^{−1} ‖x‖) and ‖M(x) − σ²I‖_op = o(log^{−1} ‖x‖), (4.45)

then ξn is recurrent.


Proof. We again apply Theorem 4.1.8. First suppose that (4.44) holds. Then tr M(x) = dσ² + o(1) and, for any u ∈ Sd−1, uᵀM(x)u = σ² + o(1); thus, writing x̂ = x/‖x‖,

2〈x, µ(x)〉 + tr M(x) − 2 x̂ᵀM(x)x̂ = (d − 2)σ² + o(1),

which, for ‖x‖ sufficiently large, is strictly positive if d ≥ 3, in which case transience follows from Theorem 4.1.8(i), and negative if d = 1, in which case recurrence follows from Theorem 4.1.8(ii).

Finally, suppose that d = 2 and (4.45) holds. Then, similarly,

2〈x, µ(x)〉 + tr M(x) − 2 x̂ᵀM(x)x̂ = o(log^{−1} ‖x‖),

which with another application of Theorem 4.1.8(ii) yields recurrence.

4.4.3 Supercritical case

In this section we suppose that the moments condition (4.1) holds for p > 1 (at least), so that µ as given by (4.2) is well defined, and we will study models in which ‖x‖‖µ(x)‖ is unbounded. The model that we focus on in this section has dominant drift in the (outwards) radial direction. In this case, to draw conclusions about asymptotic behaviour one typically needs no assumptions on the covariance structure of the walk; indeed, without further assumptions the covariance matrix (4.3) may not exist.

Specifically, we suppose that for some ρ ∈ R+ and β ≥ 0,

µ(x) = ρ‖x‖^{−β} x̂ + O(‖x‖^{−β} log^{−2} ‖x‖), (4.46)

as ‖x‖ → ∞, where x̂ := x/‖x‖. In equation (4.46) the notational convention described at (4.13) is in force, i.e., (4.46) is to be understood as

sup_{x∈Σ: ‖x‖≥r} ‖µ(x) − ρ‖x‖^{−β} x̂‖ = O(r^{−β} log^{−2} r), as r → ∞,

in the usual scalar sense of O(·). We will assume the following.

(CB1) Suppose that for some β ∈ [0, 1), ρ ∈ R+, and p > 1 + β, the conditions (4.1) and (4.46) hold.

Our main result on the supercritical case is the following, which shows that ξn satisfies a strong law of large numbers with a super-diffusive rate of escape and has a limiting direction; this result implies Theorem 1.7.3.


Theorem 4.4.2. Suppose that (MD0), (MD3), and (CB1) hold. Then there exists a random u ∈ Sd−1 such that, as n → ∞,

n^{−1/(1+β)} ξ_n → (ρ(1 + β))^{1/(1+β)} u, a.s.

In particular, lim_{n→∞} ξ̂_n = u, a.s., where ξ̂_n := ξ_n/‖ξ_n‖.
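The predicted escape rate can be observed numerically. The simulation sketch below uses our own choice of increment law (Gaussian noise plus an exact radial drift ρ‖x‖^{−β}x̂, with a guard near the origin) rather than any construction from the text:

```python
import numpy as np

def simulate(rho=1.0, beta=0.5, d=2, n=60_000, seed=3):
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    x[0] = 10.0                            # start away from the origin
    for _ in range(n):
        nrm = max(np.linalg.norm(x), 1.0)  # guard against the origin
        x = x + rho * nrm ** (-beta) * (x / nrm) + rng.standard_normal(d)
    return np.linalg.norm(x)

rho, beta, n = 1.0, 0.5, 60_000
predicted = (rho * (1 + beta) * n) ** (1 / (1 + beta))
ratio = simulate(rho, beta, n=n) / predicted
print(round(ratio, 2))   # close to 1 for large n
```

At this n the fluctuations of ‖ξn‖ are still of diffusive order √n, so the ratio is near 1 only up to roughly a ten-percent error.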

The first step in the proof of Theorem 4.4.2 is to appeal to Lemma 4.1.6(i), which shows that the process Xn = ‖ξn‖ is of supercritical Lamperti type, so that we may apply the results of Section 3.12. In particular, we establish the following law of large numbers for Xn.

Lemma 4.4.3. Suppose that (MD0), (MD3), and (CB1) hold. Then,

n^{−1/(1+β)} ‖ξ_n‖ → (ρ(1 + β))^{1/(1+β)}, a.s., as n → ∞.

Proof. We take Xn = ‖ξn‖. It suffices to suppose that p ∈ (1 + β, 2]. Then (4.15) shows that, for some δ > 0,

E[∆n | ξn = x] = 〈x̂, µ(x)〉 + O(‖x‖^{−β−δ}) = ρ‖x‖^{−β} + O(‖x‖^{−β} log^{−2} ‖x‖),

by (4.46).

[TO DO: apply the 1d LLN]

The second step in the proof of Theorem 4.4.2 is to show that the process ξn has a limiting direction. To this end, the next result concerns the angular process ξ̂n on Sd−1 ∪ {0}.

Lemma 4.4.4. Suppose that (MD0) holds. Let β ∈ [0, 1). Suppose that (4.1) holds for some p > 1 + β, and that, as ‖x‖ → ∞,

sup_{y∈Sd−1: 〈y,x〉=0} |〈y, µ(x)〉| = O(‖x‖^{−β} log^{−2} ‖x‖). (4.47)

Then, as ‖x‖ → ∞,

E[ξ̂_{n+1} − ξ̂_n | ξ_n = x] = O(‖x‖^{−1−β} log^{−2} ‖x‖); (4.48)
E[‖ξ̂_{n+1} − ξ̂_n‖² | ξ_n = x] = O(‖x‖^{−1−β} log^{−2} ‖x‖). (4.49)

Proof. Without loss of generality, suppose that p ∈ (1 + β, 2]. For the duration of this proof, condition on ξn = x for x ∈ Σ \ {0}, and write x̂ = x/‖x‖. Recall the definition of Ax from (4.20), and let γ ∈ (0, 1) with γ > (1 + β)/p, which is possible since p > 1 + β. Also, write

θ⊥_n := θn − 〈θn, x̂〉x̂;

note that 〈θ⊥_n, x̂〉 = 0 and ‖θ⊥_n‖² = ‖θn‖² − 〈θn, x̂〉².

Now, a computation shows, given ξn = x with ‖x‖ > 0, for ‖θn‖ < ‖x‖,

ξ̂_{n+1} − ξ̂_n = θn/‖x + θn‖ − x̂ ( (‖x + θn‖ − ‖x‖)/‖x + θn‖ )
= θ⊥_n/‖x + θn‖ − (x̂/‖x + θn‖) ( (2〈θn, x〉 + ‖θn‖²)/(‖x‖ + ‖x + θn‖) − 〈θn, x〉/‖x‖ ).

Note that, here,

2〈θn, x〉/(‖x‖ + ‖x + θn‖) − 〈θn, x〉/‖x‖ = 〈θn, x〉 ( (‖x‖ − ‖x + θn‖)/(‖x‖(‖x‖ + ‖x + θn‖)) ),

and, by the triangle inequality,

| (‖x‖ − ‖x + θn‖)/(‖x‖(‖x‖ + ‖x + θn‖)) | ≤ ‖θn‖/‖x‖².

Hence

| (2〈θn, x〉 + ‖θn‖²)/(‖x‖ + ‖x + θn‖) − 〈θn, x〉/‖x‖ | ≤ 2‖θn‖²/‖x‖.

By definition of Ax, we may choose ‖x‖ large enough so that ‖θn‖ < ‖x‖/2 on Ax, and then

‖ (x̂/‖x + θn‖) ( (2〈θn, x〉 + ‖θn‖²)/(‖x‖ + ‖x + θn‖) − 〈θn, x〉/‖x‖ ) ‖ 1(Ax) ≤ 4‖θn‖² ‖x‖^{−2} 1(Ax) ≤ 4‖θn‖^p ‖x‖^{γ(2−p)−2},

which has an expectation of O(‖x‖^{γ(2−p)−2}), by (4.1). In particular,

E[(ξ̂_{n+1} − ξ̂_n) 1(Ax) | ξn = x] = ‖x‖^{−1} (1 + o(1)) E[θ⊥_n 1(Ax) | ξn = x] + O(‖x‖^{γ(2−p)−2}). (4.50)

Expressed in an orthonormal basis containing x̂, the (magnitudes of the) non-zero components of θ⊥_n are of the form 〈y, θ⊥_n〉 for y with 〈y, x〉 = 0, where

E[〈y, θ⊥_n〉 | ξn = x] = E[〈y, θn〉 | ξn = x] = 〈y, µ(x)〉 = O(‖x‖^{−β} log^{−2} ‖x‖),


by assumption (4.47). Hence ‖E[θ⊥_n | ξn = x]‖ = O(‖x‖^{−β} log^{−2} ‖x‖). Also, by the q = 1 case of Lemma 4.1.7,

‖E[θ⊥_n 1(Ac_x) | ξn = x]‖ ≤ E[‖θn‖ 1(Ac_x) | ξn = x] = O(‖x‖^{−γ(p−1)}).

Hence, using the fact that γ > (1 + β)/p > β/(p − 1),

E[θ⊥_n 1(Ax) | ξn = x] = O(‖x‖^{−β} log^{−2} ‖x‖). (4.51)

We also have that 2(γ − 1) − pγ < −pγ < −1 − β, since γ > (1 + β)/p, so we obtain from (4.50) and (4.51) that

E[(ξ̂_{n+1} − ξ̂_n) 1(Ax) | ξn = x] = O(‖x‖^{−1−β} log^{−2} ‖x‖).

On the other hand,

E[‖ξ̂_{n+1} − ξ̂_n‖ 1(Ac_x) | ξn = x] ≤ 2 P[‖θn‖^p > ‖x‖^{pγ} | ξn = x] = O(‖x‖^{−pγ}),

by Markov's inequality and (4.1). Since γ > (1 + β)/p, we obtain (4.48).

Now note that, given ξn = x with ‖x‖ > 0, for ‖θn‖ < ‖x‖,

‖ξ̂_{n+1} − ξ̂_n‖ = ‖ ( x(‖x‖ − ‖x + θn‖) + θn‖x‖ ) / ( ‖x‖ ‖x + θn‖ ) ‖ ≤ 2‖θn‖/‖x + θn‖,

by the triangle inequality. Hence, for all ‖x‖ sufficiently large,

E[‖ξ̂_{n+1} − ξ̂_n‖² 1(Ax) | ξn = x] ≤ 8‖x‖^{−2} E[‖θn‖² 1(Ax) | ξn = x] ≤ 8‖x‖^{γ(2−p)−2} E[‖θn‖^p | ξn = x],

which is O(‖x‖^{−1−β} log^{−2} ‖x‖), by (4.1). On the other hand,

E[‖ξ̂_{n+1} − ξ̂_n‖² 1(Ac_x) | ξn = x] ≤ 4 P[Ac_x | ξn = x] = O(‖x‖^{−pγ}),

by Markov's inequality and (4.1). Since γ > (1 + β)/p, we obtain (4.49).

Now we can complete the proof of Theorem 4.4.2.

Proof of Theorem 4.4.2. Set Fn := σ(ξ0, . . . , ξn). Then (ξ̂n, n ≥ 0) is an (Fn, n ≥ 0)-adapted process taking values in Sd−1 ∪ {0}, and using the vector-valued version of Doob's decomposition we may write

ξ̂_n = A_n + M_n, (4.52)

where M0 = ξ̂0, Mn is a d-dimensional martingale, and An is the previsible sequence defined by A0 = 0 and An = Σ_{m=0}^{n−1} E[ξ̂_{m+1} − ξ̂_m | Fm] for n ≥ 1.

To show that ξ̂n converges, we show that both An and Mn converge. We have from (4.48) that, for finite positive constants C1 and C2,

‖A_{n+1} − A_n‖ ≤ C1 ‖ξn‖^{−1−β} log^{−2} ‖ξn‖, on {‖ξn‖ ≥ C2}.

By Lemma 4.4.3, there is a positive constant c such that ‖ξn‖ ∼ c n^{1/(1+β)}, a.s. It follows that

lim sup_{n→∞} ‖A_{n+1} − A_n‖ / (n^{−1} (log n)^{−2}) < ∞, a.s.,

so Σ_{n≥0} ‖A_{n+1} − A_n‖ < ∞, a.s., implying that An converges almost surely to some limit A∞ ∈ Rd.

Taking conditional expectations in the vector identity ‖M_{n+1} − M_n‖² = ‖M_{n+1}‖² − ‖M_n‖² − 2 M_n · (M_{n+1} − M_n) and using the martingale property, we have

E[‖M_{n+1}‖² − ‖M_n‖² | Fn] = E[‖M_{n+1} − M_n‖² | Fn] = E[ ‖ξ̂_{n+1} − ξ̂_n − E[ξ̂_{n+1} − ξ̂_n | Fn]‖² | Fn ].

Expanding out the expression in the latter expectation, we obtain

E[‖M_{n+1}‖² − ‖M_n‖² | Fn] = E[‖ξ̂_{n+1} − ξ̂_n‖² | Fn] − ‖E[ξ̂_{n+1} − ξ̂_n | Fn]‖² ≤ E[‖ξ̂_{n+1} − ξ̂_n‖² | Fn].

Then by (4.49) and the fact that ‖ξn‖ ∼ c n^{1/(1+β)}, by Lemma 4.4.3, it follows that

lim sup_{n→∞} E[‖M_{n+1}‖² − ‖M_n‖² | Fn] / (n^{−1} (log n)^{−2}) < ∞, a.s.

Hence Σ_{n≥0} E[‖M_{n+1}‖² − ‖M_n‖² | Fn] < ∞, a.s., which implies that Mn has an almost-sure limit M∞ ∈ Rd, by e.g. the d-dimensional version of [72, Theorem 5.4.9, p. 254].

Taking n → ∞ in (4.52), we conclude that, a.s., ξ̂n → A∞ + M∞ =: u, for some random vector u ∈ Sd−1 ∪ {0}. Finally, note that (MD3) implies P[u = 0] = 0: any sequence in Sd−1 ∪ {0} with 0 as a limit must be eventually equal to 0, so u = 0 if and only if ξn = 0 for all but finitely many n, which contradicts (MD3); hence we may assume that u ∈ Sd−1. Combining the convergence of ξ̂n with the asymptotics for ‖ξn‖ given in Lemma 4.4.3 completes the proof of the theorem.


4.4.4 Critical case: Angular asymptotics

Throughout this section, we assume d ≥ 2. The main result of the previous section, Theorem 4.4.2, showed that the angular process ξ̂n has a limit in the supercritical case, where ‖x‖‖µ(x)‖ → ∞. We will see that the case in which ‖x‖‖µ(x)‖ remains bounded is fundamentally different: under mild assumptions, ξ̂n does not converge. This angular wandering is a classical property of zero-drift homogeneous random walk in dimensions d ≥ 2 (see Theorem 1.7.2), and in this section we show the extent to which this property carries across to more general models. We emphasize that the phenomena exhibited in this section are genuinely many-dimensional (d ≥ 2 is essential), as we explain below.

In addition to taking d ≥ 2, for this section we will make the following assumption on the moments and drift of the increments.

(CB2) Suppose that the moments condition (4.1) holds for some p > 2, and that there exists C < ∞ such that

‖µ(x)‖ ≤ C(1 + ‖x‖)−1, for all x ∈ Σ. (4.53)

The main result of this section is the following.

Theorem 4.4.5. Suppose that, for d ≥ 2, (MD0), (MD3), (CB2), and (MD4) hold. Then P[lim_{n→∞} ξ̂_n exists] = 0.

The remainder of this section is devoted to the proof of Theorem 4.4.5. First we state an important consequence of Theorem 2.4.8.

Lemma 4.4.6. Suppose that (4.1) holds for some p ≥ 2 and that (4.53) holds. Then for any δ > 0 there exists ε > 0 such that

P[ max_{0≤n≤⌊ε‖ξ0‖²⌋} ‖ξn − ξ0‖ ≤ (1/2)‖ξ0‖ ] ≥ 1 − δ.

Proof. Again let Fn = σ(ξ0, . . . , ξn). Set Xn = ‖ξn − ξ0‖², so X0 = 0 and Xn ≥ 0. Let

η = min{n ≥ 0 : ‖ξn − ξ0‖ ≥ (1/2)‖ξ0‖} = min{n ≥ 0 : Xn ≥ (1/4)‖ξ0‖²}.

Then X_{n+1} − X_n = ‖θn‖² + 2〈θn, ξn − ξ0〉, so that

|E[X_{n+1} − X_n | Fn]| ≤ E[‖θn‖² | Fn] + 2‖ξn − ξ0‖ ‖µ(ξn)‖ ≤ B + 2C‖ξn − ξ0‖/(1 + ‖ξn‖),


for some constants B < ∞ and C < ∞, by the p = 2 case of (4.1) and (4.53). On {n < η}, we have ‖ξn‖ ≥ ‖ξ0‖ − ‖ξn − ξ0‖ ≥ (1/2)‖ξ0‖, so there is some constant B′ < ∞ for which

E[X_{n+1} − X_n | Fn] ≤ B + C‖ξ0‖/(1 + (1/2)‖ξ0‖) ≤ B′, on {n < η}.

Hence we may apply Theorem 2.4.8 with the stopping time τ = η = σx, where x = (1/4)‖ξ0‖² and, as usual, σx = min{n ≥ 0 : Xn ≥ x}, to obtain

x P[σx ≤ n] ≤ E[X_{n∧σx}] ≤ B′ n.

In other words,

P[ max_{0≤m≤n} ‖ξm − ξ0‖ ≤ (1/2)‖ξ0‖ ] = P[ max_{0≤m≤n} Xm ≤ (1/4)‖ξ0‖² ] ≥ 1 − 4B′n/‖ξ0‖².

Taking n = ⌊ε‖ξ0‖²⌋ gives the result, on choosing ε > 0 small enough so that 4B′ε < δ.

The next ingredient is the following estimate for the probability that the walk makes a significant deviation in an orthogonal direction.

Lemma 4.4.7. Suppose that (CB2) and (MD4) hold. There exists y0 < ∞ such that, for any δ > 0, we can choose ε > 0 and h > 0 such that, for all x ∈ Σ with ‖x‖ ≥ y0,

inf_{e∈Sd−1: 〈e,x〉=0} P[ max_{0≤n≤⌊ε‖x‖²⌋} |〈ξn, e〉| ≥ h‖x‖ | ξ0 = x ] ≥ 1 − δ.

Proof. Fix δ > 0. Suppose that ξ0 = x ∈ Σ with ‖x‖ = y > 0. Fix e ∈ Sd−1 such that 〈e, x〉 = 0. Denote the Doob decomposition of 〈ξn, e〉 as

〈ξn, e〉 = 〈ξ0, e〉 + Mn + An = Mn + An,

where Mn is a martingale with M0 = 0, and

An = Σ_{m=0}^{n−1} E[〈ξ_{m+1} − ξ_m, e〉 | Fm] = Σ_{m=0}^{n−1} 〈e, µ(ξm)〉.

Again let η = min{n ≥ 0 : ‖ξn − ξ0‖ ≥ (1/2)‖ξ0‖}. By Lemma 4.4.6, there exists ε1 = ε1(δ) > 0 such that, for all ε ∈ (0, ε1), for any y > 0,

P[η ≥ εy² | ξ0 = x] ≥ 1 − δ.


Moreover, for ξ_0 = x and n < η we have ‖ξ_n‖ ≥ y/2 so that, on {n < η},

    |〈e, µ(ξ_n)〉| ≤ ‖µ(ξ_n)‖ ≤ C/(1 + (y/2)) ≤ 2C/y, a.s.,

by (4.53). Hence |A_{n∧η}| ≤ 2Cn/y, a.s. In particular,

    P[ max_{0 ≤ n ≤ ⌊εy²⌋} |A_n| ≥ 4Cεy | ξ_0 = x ] ≤ P[η ≤ εy² | ξ_0 = x] ≤ δ.   (4.54)

Next we show that the martingale M_n satisfies the hypotheses of the d = 1 case of Theorem 4.1.5. Indeed,

    M_{n+1} − M_n = 〈θ_n − µ(ξ_n), e〉,   (4.55)

so that, a.s.,

    E[(M_{n+1} − M_n)² | F_n] = E[〈e, θ_n〉² | F_n] − 〈e, µ(ξ_n)〉² ≥ v_0 − ‖µ(ξ_n)‖²,

for v_0 > 0 as given by (MD4). Hence, on {n < η}, a.s., by (4.53),

    E[(M_{n+1} − M_n)² | F_n] ≥ v_0 − (2C/y)² ≥ v_0/2,

for all y ≥ y_0 with y_0 sufficiently large; note that y_0 depends only on C and v_0, and not on δ. Hence (4.8) is satisfied for Y_n = M_n for this choice of η and v = v_0/2. Also, (4.55) shows that

    E[|M_{n+1} − M_n|^p | F_n] ≤ E[‖θ_n − µ(ξ_n)‖^p | F_n].

Here, by (4.1), E[‖θ_n‖^p | F_n] ≤ B a.s. for some finite constant B, while

    ‖µ(ξ_n)‖^p ≤ sup_{x∈Σ} ‖µ(x)‖^p ≤ sup_{x∈Σ} E[‖θ_n‖^p | ξ_n = x],

by Jensen's inequality. Hence, by Minkowski's inequality, (4.7) is also satisfied for Y_n = M_n. So applying Theorem 4.1.5 we obtain, for some constant D < ∞ depending on v_0 and C, but not on ε or δ,

    P[ max_{0 ≤ n ≤ ⌊εy²⌋} |M_n| ≥ 2hy | ξ_0 = x ] ≥ 1 − D(2hy)²/(εy²) − P[η ≤ εy² | ξ_0 = x]
                                                   ≥ 1 − 4Dh²/ε − δ,


for ε ∈ (0, ε_1). Now take h = h(ε) = ½√(δε/D), so that

    P[ max_{0 ≤ n ≤ ⌊εy²⌋} |M_n| ≥ 2hy | ξ_0 = x ] ≥ 1 − 2δ,   (4.56)

for all y ≥ y_0. Thus, from (4.54) and (4.56), we see that with probability at least 1 − 3δ, the event

    { max_{0 ≤ n ≤ ⌊εy²⌋} |M_n| ≥ 2hy } ∩ { max_{0 ≤ n ≤ ⌊εy²⌋} |A_n| ≤ 4Cεy }   (4.57)

occurs. Since h/ε → ∞ as ε ↓ 0, there exists a constant ε_0 = ε_0(δ) ∈ (0, ε_1) such that h ≥ 4Cε for all ε ∈ (0, ε_0). Then, for such ε, on the event (4.57),

    max_{0 ≤ n ≤ ⌊εy²⌋} |〈ξ_n, e〉| ≥ max_{0 ≤ n ≤ ⌊εy²⌋} |M_n| − max_{0 ≤ n ≤ ⌊εy²⌋} |A_n| ≥ 2hy − 4Cεy ≥ hy,

for all y ≥ y_0. This completes the proof, after a relabelling of the arbitrary positive constant δ.

Now we can complete the proof of Theorem 4.4.5.

Proof of Theorem 4.4.5. Again let F_n = σ(ξ_0, …, ξ_n), and write ξ̂_n := ξ_n/‖ξ_n‖. Suppose that ξ_0 = x with ‖x‖ ≥ y_0, the constant in the statement of Lemma 4.4.7, and take e ∈ S^{d−1} with 〈x, e〉 = 0. Note that

    ‖ξ̂_n − ξ̂_0‖ ≥ |〈ξ̂_n − ξ̂_0, e〉| = |〈ξ̂_n, e〉| = |〈ξ_n, e〉|/‖ξ_n‖,   (4.58)

since 〈ξ̂_0, e〉 = 0. By the δ = 1/4 cases of Lemmas 4.4.6 and 4.4.7, we can choose ε > 0 and h > 0 such that, with probability at least 1 − 2δ = 1/2, the event

    { max_{0 ≤ n ≤ ⌊ε‖x‖²⌋} ‖ξ_n − ξ_0‖ ≤ ½‖x‖ } ∩ { max_{0 ≤ n ≤ ⌊ε‖x‖²⌋} |〈ξ_n, e〉| ≥ h‖x‖ }

occurs. On this event, ‖ξ_n‖ ≤ (3/2)‖x‖ for the relevant n, so we have from (4.58) that

    max_{0 ≤ n ≤ ⌊ε‖x‖²⌋} ‖ξ̂_n − ξ̂_0‖ ≥ h‖x‖/((3/2)‖x‖) = 2h/3.

With the strong Markov property, we have thus shown that, for any finite stopping time τ, P[E(τ) | F_τ] ≥ ½ 1{‖ξ_τ‖ ≥ y_0}, where

    E(τ) := { max_{0 ≤ n ≤ ⌊ε‖ξ_τ‖²⌋} ‖ξ̂_{τ+n} − ξ̂_τ‖ ≥ 2h/3 }.


Define τ_0 = min{n ≥ 0 : ‖ξ_n‖ ≥ y_0}, and recursively, for k ∈ Z⁺,

    τ_{k+1} = min{n > τ_k + ε‖ξ_{τ_k}‖² : ‖ξ_n‖ ≥ y_0}.

Then, by (MD3), the stopping times (τ_k, k ≥ 0) are a.s. finite, and, by construction,

    E(τ_k) ∈ F_{τ_{k+1}}, and P[E(τ_k) | F_{τ_k}] ≥ ½.

Hence, by Lévy's extension of the Borel–Cantelli lemma (Theorem 2.3.19), the events E(τ_k) occur for infinitely many k, a.s. But this implies that

    lim sup_{m→∞} sup_{n≥0} ‖ξ̂_{m+n} − ξ̂_m‖ > 0, a.s.,

and so ξ̂_n does not converge to a limit.

4.4.5 Critical case: Recurrence classification and excursions

Having established some angular properties in Section 4.4.4, we now return to the question of (compact set) recurrence. Again, we use the same basic notation as in Section 4.4.1; in particular, recall the definitions of the increment mean drift µ(x) and increment covariance matrix M(x) from (4.2) and (4.3) respectively.

The sufficient conditions in Theorem 4.1.8 are not, in general, sharp. For the rest of this section we impose additional regularity in order to obtain a spectrum of fine results on the behaviour of the walk. We will impose the following many-dimensional analogue of the assumption (M2) for the one-dimensional Lamperti problem.

(CB3) Suppose that there exist ρ ∈ R and σ² ∈ (0, ∞) for which, as ‖x‖ → ∞,

    µ(x) = ρ x ‖x‖⁻² + o(‖x‖⁻¹ log⁻¹ ‖x‖); and
    ‖M(x) − σ²I‖_op = o(log⁻¹ ‖x‖).

For convenience, the following results also assume that the state space Σ is locally finite and that ξ_n is irreducible; versions of all of the following results still hold in the case of more general state spaces, under appropriate conditions. First we have the following recurrence classification.

Theorem 4.4.8. Suppose that (MD0) holds for Σ ⊆ R^d locally finite with 0 ∈ Σ, and that ξ_n is irreducible. Suppose also that (4.1) holds for some p > 2, and that (CB3) holds. Then ξ_n is


(i) transient if 2ρ/σ2 > 2− d;

(ii) null-recurrent if −d ≤ 2ρ/σ2 ≤ 2− d;

(iii) positive-recurrent if 2ρ/σ2 < −d.
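The trichotomy can be explored numerically. The following sketch (our own illustration; the lattice model, parameter choices, and function names are not from the book) biases a nearest-neighbour walk on Z² so that its mean drift is approximately ρ x/‖x‖², the drift prescribed by (CB3). Each coordinate moves with variance ≈ 1/2 per step, so σ² = 1/2 and d = 2, and Theorem 4.4.8 then predicts transience for ρ > 0, null-recurrence for −1/2 ≤ ρ ≤ 0, and positive-recurrence for ρ < −1/2.

```python
import numpy as np

rng = np.random.default_rng(0)
STEPS = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])

def step_probs(x, rho):
    """Probabilities of the steps +e1, -e1, +e2, -e2 for a walk on Z^2
    whose mean drift is rho * x / |x|^2 (centrally biased, as in (CB3)).
    Valid probabilities require |rho| < 1/2 with the guard below."""
    r2 = max(float(x @ x), 1.0)                    # guard near the origin
    c = rho * np.asarray(x, float) / (2.0 * r2)    # half-bias per coordinate
    return np.array([0.25 + c[0], 0.25 - c[0], 0.25 + c[1], 0.25 - c[1]])

def walk_norms(n, rho, x0=(10, 0)):
    """Simulate n steps and return the process of norms ||xi_k||."""
    x = np.array(x0)
    norms = [np.hypot(*x)]
    for _ in range(n):
        x = x + STEPS[rng.choice(4, p=step_probs(x, rho))]
        norms.append(np.hypot(*x))
    return np.array(norms)

# Sanity check: at x = (10, 0) the mean drift is rho * x / |x|^2 = (rho/10, 0).
drift = STEPS.T @ step_probs(np.array([10, 0]), rho=0.3)
norms = walk_norms(2_000, rho=0.3)
```

Plotting `walk_norms` for several values of ρ over long runs makes the three regimes visible, although distinguishing null- from positive-recurrence by simulation alone is, of course, delicate.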

Next we give a result on existence of moments of return times.

Theorem 4.4.9. Suppose that (MD0) holds for Σ ⊆ R^d locally finite with 0 ∈ Σ, and that ξ_n is irreducible. Suppose also that (CB3) holds, and that (4.1) holds with p > 2. Let s ∈ (0, p/2) and write s_0 := 1 − (d/2) − (ρ/σ²). Then E[(τ_0⁺)^s] < ∞ if s < s_0 and E[(τ_0⁺)^s] = ∞ if s > s_0.

The next result gives almost-sure scaling behaviour for ‖ξ_n‖.

Theorem 4.4.10. Suppose that (MD0) holds for Σ ⊆ R^d locally finite with 0 ∈ Σ, and that ξ_n is irreducible. Suppose also that (CB3) holds.

(i) Suppose that 2ρ/σ² > 2 − d and (4.1) holds with p > 2. Then

    lim_{n→∞} log ‖ξ_n‖ / log n = 1/2, a.s.

(ii) Suppose that −d ≤ 2ρ/σ² ≤ 2 − d and (4.1) holds with p > 2. Then

    lim sup_{n→∞} log ‖ξ_n‖ / log n = 1/2, a.s.

(iii) Suppose that 2ρ/σ² < −d and (4.1) holds with p > 2 − d − (2ρ/σ²). Then

    lim sup_{n→∞} log ‖ξ_n‖ / log n = 1/(2 − d − (2ρ/σ²)), a.s.
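Case (ii) can be illustrated by zero-drift simple random walk on Z² (our own example, not from the book: here ρ = 0 and σ² = 1/2, so 2ρ/σ² = 0 lies in [−d, 2 − d] for d = 2, and the predicted exponent is 1/2):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000

# Zero-drift nearest-neighbour walk on Z^2; track log||xi_k|| / log k.
steps = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])[rng.integers(0, 4, n)]
xi = np.cumsum(steps, axis=0)
norms = np.hypot(xi[:, 0], xi[:, 1])
# ratio[k] = log(max(||xi_k||, 1)) / log(k+2); always in [0, 1] since
# ||xi_k|| <= k+1, and it hovers near 1/2 for a diffusive walk.
ratio = np.log(np.maximum(norms, 1.0)) / np.log(np.arange(2, n + 2))
```

The terminal values of `ratio` fluctuate around 1/2; note, though, that the theorem is a statement about the limsup along the whole trajectory, which a finite simulation can only suggest.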

The rest of this section is devoted to the proofs of the preceding theorems. The following consequence of Lemma 4.1.6 shows the relation to Lamperti's problem in this case.

Lemma 4.4.11. Suppose that (MD0) holds for Σ ⊆ R^d locally finite with 0 ∈ Σ, and that ξ_n is irreducible. Suppose also that (4.1) holds for some p > 2, and that (CB3) holds. Then, for X_n := ‖ξ_n‖, we have that (4.14) holds for the given p > 2, and

    2x µ̄₁(x) = 2ρ + σ²(d − 1) + o(log⁻¹ x) = 2x µ̲₁(x);
    µ̄₂(x) = σ² + o(log⁻¹ x) = µ̲₂(x).


Proof. It follows from (CB3) that

    〈x̂, µ(x)〉 = ρ‖x‖⁻¹ + o(‖x‖⁻¹ log⁻¹ ‖x‖);
    tr M(x) = dσ² + o(log⁻¹ ‖x‖);
    sup_{e ∈ S^{d−1}} |e⊤M(x)e − σ²| = o(log⁻¹ ‖x‖),

where x̂ := x/‖x‖. Lemma 4.1.6(ii) then yields the result.

Now we can complete the proofs of the theorems in this section.

Proof of Theorem 4.4.8. As in the proof of Lemma 4.4.11, (CB3) shows that

    2‖x‖〈x̂, µ(x)〉 + tr M(x) − 2 x̂⊤M(x) x̂ = 2ρ + (d − 2)σ² + o(log⁻¹ ‖x‖).

It follows from Theorem 4.1.8(i) that ξ_n is transient if 2ρ/σ² > 2 − d, and it follows from Theorem 4.1.8(ii) that ξ_n is recurrent if 2ρ/σ² ≤ 2 − d.

It remains to distinguish between positive- and null-recurrence; we apply the results of Section 3.8 to the process X_n = ‖ξ_n‖ and use Lemma 4.4.11. Then, for the process X_n,

    lim sup_{x→∞} ( 2x µ̄₁(x) + (2α − 1) µ̄₂(x) ) ≤ 2ρ + σ²(d + 2α − 2).   (4.59)

If α = 1 and 2ρ/σ² < −d, then (4.59) is strictly negative, and so Theorem 3.8.1(ii) yields positive-recurrence. On the other hand, for any θ ∈ (0, 1),

    2x µ̲₁(x) + (1 + (1 − θ)/log x) µ̲₂(x) ≥ 2ρ + σ²d + ((1 − θ)/log x)(σ² + o(1)),

which is positive for all x sufficiently large if σ² > 0 and 2ρ/σ² ≥ −d. In this case Theorem 3.8.3 shows that E[τ_0⁺] = ∞, and hence the random walk, if recurrent, is null-recurrent.

Proof of Theorem 4.4.9.

Proof of Theorem 4.4.10. Consider X_n = ‖ξ_n‖. Lemma 4.4.11 shows that we may apply Theorem 3.10.1 to obtain part (i) of the theorem and Theorem 3.9.5 to obtain parts (ii) and (iii) of the theorem.

4.5 Reflecting random walks

Mainly the stuff from [9].


4.6 Range and local time of many-dimensional martingales

Throughout this section we assume that d ≥ 2 and consider a discrete-time process (ξ_n, n ≥ 0) with values in Σ ⊆ R^d, adapted to a filtration (F_n, n ≥ 0). We suppose that Σ is a set of infinite cardinality; the elements of Σ will be called sites. Without restriction of generality, we assume that 0 ∈ Σ. We suppose that the process ξ_n is uniformly elliptic (it can advance in any given direction with uniformly positive probability), has uniformly bounded jumps, and is a martingale in each coordinate (see Definition 4.6.1 below for the precise meaning). In particular, we do not assume homogeneity in space or time, or even that the process is Markovian.

In this section we study two related questions:

• How many different sites can be visited by the process ξn by time n?

• How large can the number of visits to a given site be?

Of course, in the absence of space/time homogeneity one cannot hope to be able to characterize the precise behaviour of the quantities of interest; in this section we content ourselves with proving that, with probability at least 1 − exp(−n^ε), the number of visits to any fixed site by time n is less than n^{(1/2)−δ} for some δ > 0. This in turn implies that, with very high probability, the number of different sites visited by the process by time n is at least n^{(1/2)+δ}.

Although it is not important for the formulation of our results, while reading this section one may always assume that Σ is the vertex set of the integer lattice Z^d, the vertex set of some other mosaic, or just any locally finite (in particular, countable) set. This, of course, is justified by the questions in which we are interested: if the law of the jump of the process is (in some sense) continuous, then such questions typically do not arise (every site is visited at most once, and the process visits n different sites by time n).

Now we write formal definitions and state our results. Let

    L_n(x) := Σ_{k=0}^{n} 1{ξ_k = x}

be the local time of the process at x ∈ Σ up to time n, and we denote by

    Ξ_n := {ξ_0, ξ_1, …, ξ_n}


the set of visited sites up to time n. The range is #Ξ_n, the number of visited sites up to time n. Define the random variable D_n ∈ R^d to be the (conditional) drift of the process at time n:

    D_n := E[ξ_{n+1} − ξ_n | F_n] = E[ξ_{n+1} | F_n] − ξ_n.
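As a concrete illustration of these definitions (our own example, not part of the text), the local time at the origin and the range of a simulated simple random walk on Z² can be computed directly:

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def local_time_and_range(n):
    """Run n steps of simple random walk on Z^2 from 0 and return
    (L_n(0), #Xi_n): the local time at the origin and the range."""
    x = (0, 0)
    visits = Counter({x: 1})              # the sum defining L_n starts at k = 0
    for _ in range(n):
        dx, dy = STEPS[rng.integers(4)]
        x = (x[0] + dx, x[1] + dy)
        visits[x] += 1
    return visits[(0, 0)], len(visits)    # L_n(0) and #Xi_n

L0, R = local_time_and_range(10_000)
```

For d = 2, L_n(0) is typically of order log n while #Ξ_n is of order n/log n, in line with the classical facts recalled in the bibliographical notes.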

Definition 4.6.1. We say that the F_n-adapted process ξ_n in Σ ⊆ R^d, d ≥ 2,

(a) has uniformly bounded jumps if there exists K > 0 such that ‖ξ_{n+1} − ξ_n‖ ≤ K a.s. for all n (we assume without restriction of generality that K ≥ 1);

(b) is uniformly elliptic if there exist h, r > 0 such that for all e ∈ S^{d−1} we have P[(ξ_{n+1} − ξ_n) · e > r | F_n] > h a.s. (we assume without restriction of generality that r ≤ 1);

(c) is a d-dimensional martingale if D_n = 0 a.s. for all n.

The main result of this section is the following.

Theorem 4.6.2. Let d ≥ 2. Suppose that ξ_n is a uniformly elliptic, d-dimensional martingale with uniformly bounded jumps. Then there exist γ ∈ (0, ½) and C_1, C_2, δ > 0 such that, for any x ∈ Σ, for all n ≥ 0,

    P[L_n(x) > n^γ] ≤ C_1 exp{−C_2 n^δ};
    and P[#Ξ_n < n^{1−γ}] ≤ C_1 n exp{−C_2 n^δ}.

Proof. So, suppose that ξ_n is a martingale in dimension d ≥ 2, with bounded jumps and uniform ellipticity. To begin, we show that there exist b ∈ (0, 1) close enough to 1 and γ′ > 0 (depending only on the constants K, h, r in Definition 4.6.1(a) and (b)) such that

    E[‖ξ_{n+1}‖^b | F_n] ≥ ‖ξ_n‖^b 1{‖ξ_n‖ > γ′}.

To see that this inequality holds, first observe that for a fixed y ∈ R^d we have

    ‖x + y‖^b = (‖x‖² + 2x·y + ‖y‖²)^{b/2}
              = ‖x‖^b ( 1 + b(x·y)/‖x‖² + b‖y‖²/(2‖x‖²) − ½ b(2 − b)(x·y)²/‖x‖⁴ + o(‖x‖⁻²) ),

as ‖x‖ → ∞. Taking x = ξ_n and y = θ_n := ξ_{n+1} − ξ_n, denoting by φ_n the angle between ξ_n and θ_n, and using the martingale property,

    E[‖ξ_{n+1}‖^b − ‖ξ_n‖^b | F_n]


      = (b/2) ‖ξ_n‖^{b−2} ( E[‖θ_n‖² | F_n] − (2 − b) E[‖θ_n‖² cos² φ_n | F_n] + o(1) ),   (4.60)

where the o(1) term is uniform in F_n as ‖ξ_n‖ → ∞.

Using the uniform ellipticity and the boundedness of jumps, one can obtain that

    E[‖θ_n‖² cos² φ_n | F_n] < (1 − ε′) E[‖θ_n‖² | F_n]

for some ε′ > 0. It follows that if b < 1 is close enough to 1, the right-hand side of (4.60) is positive for all large enough ‖ξ_n‖, since E[‖θ_n‖² | F_n] ≥ hr² > 0 by uniform ellipticity. (More details can be found in the proof of Lemma 5.2 in [204].)
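The second-order expansion of ‖x + y‖^b used above is easy to sanity-check numerically (an illustration only; the test point below is arbitrary):

```python
import numpy as np

def norm_power_expansion(x, y, b):
    """Second-order approximation to |x + y|^b for large |x|, following
    the expansion in the text; the error is o(|x|^{b-2})."""
    nx2 = float(x @ x)
    xy = float(x @ y)
    return nx2 ** (b / 2) * (1.0 + b * xy / nx2
                             + b * float(y @ y) / (2.0 * nx2)
                             - 0.5 * b * (2.0 - b) * xy ** 2 / nx2 ** 2)

x = np.array([100.0, 0.0])     # |x| large
y = np.array([0.6, 0.8])       # |y| = 1, a bounded jump
b = 0.9
exact = np.linalg.norm(x + y) ** b
approx = norm_power_expansion(x, y, b)
```

Here the approximation error is of order 10⁻⁴ against an exact value near 63.4, consistent with the o(‖x‖^{b−2}) remainder.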

Recall that B(x; s) is the closed ball of radius s centred at x. The proof of Theorem 4.6.2 is now quite straightforward; we only sketch it, in the following way:

• The optional stopping theorem implies that, starting from x_0 ∈ R^d, the process will reach R^d \ B(x_0; α) without coming back to x_0 with probability at least of order α^{−b} (to apply the optional stopping theorem, one has first to force the process a bit away from x_0, which happens with positive probability by uniform ellipticity).

• Doob's inequality implies that from any place in R^d \ B(x_0; α), with probability bounded away from 0, the process will not return to x_0 during an additional c_1 α² steps, where c_1 > 0 is a small enough constant.

• Then, consider α = c_2 n^{1/2} for large enough c_2. By the previous discussion, each time the process is at x_0, independently of the past it has probability at least of order n^{−b/2} of not returning to x_0 during the next n steps.

• Take ε > 0 such that b + ε < 1. Then, by an obvious coin-tossing argument, the number of visits to x_0 by time n will not exceed n^{(b+ε)/2} with probability at least 1 − c_3 e^{−c_4 n^{ε/2}}.

• So, with probability at least 1 − c_3 n e^{−c_4 n^{ε/2}}, the process will have to visit at least n^{1−(b+ε)/2} different sites.
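The coin-tossing step admits a one-line quantitative form (our own illustrative parameters): if each visit to x_0 independently has probability at least p = c n^{−b/2} of being the last within the next n steps, then exceeding m = n^{(b+ε)/2} visits requires m consecutive failures, an event of probability at most (1 − p)^m ≤ exp(−pm) = exp(−c n^{ε/2}).

```python
import math

def visit_tail_bound(n, b=0.9, eps=0.05, c=1.0):
    """Coin-tossing bound from the sketch:
    P[more than n^{(b+eps)/2} visits] <= (1 - p)^m <= exp(-c n^{eps/2}),
    with p = c n^{-b/2} the per-visit escape probability and
    m = n^{(b+eps)/2} the allowed number of visits."""
    m = n ** ((b + eps) / 2)
    p = min(1.0, c * n ** (-b / 2))
    return (1.0 - p) ** m, math.exp(-c * n ** (eps / 2))

bound, stretched_exp = visit_tail_bound(10 ** 6)
```

Note how the exponents conspire: pm = c n^{−b/2} · n^{(b+ε)/2} = c n^{ε/2}, which is exactly the stretched-exponential rate claimed in Theorem 4.6.2.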

Theorem 4.6.2 can be extended to a more general class of processes, namely the so-called strongly directed submartingales. To give a formal


[Figure 4.10: On the definition of H^u_{ℓ,L}; observe that H^u_{ℓ,L} and H^{−u}_{−ℓ,L} "complement" each other (i.e., their union is R^d and they intersect on a set of measure 0).]

definition, let us denote by P_L the operator of projection on the linear subspace L ⊂ R^d. Assuming that L is a two-dimensional subspace of R^d, ℓ ∈ S^{d−1} ∩ L and u ∈ R, define

    H^u_{ℓ,L} = { x ∈ R^d : P_L x = 0 or (P_L x · ℓ)/‖P_L x‖ ≥ u },

see Figure 4.10.
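The definition can be made concrete with a small membership test (an illustration; the basis, parameters, and function name are our own, not from the book):

```python
import numpy as np

def in_H(x, u, ell, L_basis):
    """Test whether x lies in H^u_{ell,L}: L_basis has orthonormal rows
    spanning the two-dimensional subspace L, and ell is a unit vector in L."""
    x = np.asarray(x, dtype=float)
    pL = L_basis.T @ (L_basis @ x)      # orthogonal projection P_L x
    n = np.linalg.norm(pL)
    # Points projecting to 0 belong by definition; otherwise compare the
    # cosine of the angle (within L) between P_L x and ell with u.
    return bool(n == 0.0 or (pL @ ell) / n >= u)

# Example in R^3: L = span(e1, e2), ell = e1, u = 1/2, so H^u_{ell,L}
# consists of points whose projection on L makes an angle of at most
# 60 degrees with e1, together with the points projecting to 0.
L_basis = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
ell = np.array([1.0, 0.0, 0.0])
```

The short-circuit on n == 0 mirrors the "P_L x = 0 or …" clause of the definition and avoids dividing by zero.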

Definition 4.6.3. We say that the F_n-adapted process ξ_n is a (u, ℓ, L)-strongly directed submartingale if u > 0 and P[D_n ∈ H^u_{ℓ,L} | F_n] = 1 a.s.

Informally, the drift of a strongly directed submartingale should always belong to the 'interior of the open book', as in Figure 4.10. Observe that for a (u, ℓ, L)-strongly directed submartingale the projection of the (conditional on the history) drift on ℓ is always nonnegative (so that it is what one would naturally call a 'submartingale in direction ℓ').

Then, we have the following generalization of Theorem 4.6.2:


Theorem 4.6.4. Let d ≥ 2. Suppose that ξ_n is a (u, ℓ, L)-strongly directed submartingale which is uniformly elliptic and has uniformly bounded jumps. Then there exist constants γ ∈ (0, ½) and C_1, C_2, δ > 0 (apart from the dimension, depending only on u and on K, r, h from Definition 4.6.1(a)–(b)) such that, for any x ∈ Σ, for all n ≥ 0,

    P[L_n(x) > n^γ] ≤ C_1 exp{−C_2 n^δ};
    and P[#Ξ_n < n^{1−γ}] ≤ C_1 n exp{−C_2 n^δ}.

We do not give the proof of this theorem here, since it is quite technical, does not rely on Lyapunov functions, and is not as instructive as the proof of Theorem 4.6.2.

Bibliographical notes

Section 4.1

The non-confinement result for zero-drift random walks, Lemma 4.1.1, can be found as Proposition 2.1 in [101]; the d-dimensional martingale version of Kolmogorov's other inequality, Theorem 4.1.5, on which the non-confinement result rests, is a variation of Lemma 4.1 of [101].

As mentioned in the text, Theorem 4.1.2, which states that a one-dimensional zero-drift random walk is recurrent under mild second moment conditions, is a relative of Theorem 2.5.6: see the bibliographical notes to Section 2.5 for further discussion.

The recurrence classification for a zero-drift random walk with fixed increment covariance matrix, Theorem 4.1.3, is contained in Theorem 4.1 of Lamperti [173], under a uniform bound on the increments of the walk. The version stated here is contained, for example, in Theorem 2.2 of [101].

The analysis of many-dimensional random walks by considering the process of norms, as in Section 4.1.4, was an important contribution of Lamperti's seminal work [173, 175]. Early versions of the computations for Lemma 4.1.6, assuming uniformly bounded increments, can be found for example in [173, §4]. The sufficient conditions for recurrence and transience given in Theorem 4.1.8 extend Theorem 4.1 of Lamperti [173], which, as well as assuming uniformly bounded increments, applies only to the case where the increment covariance matrix M(x) is (asymptotically) constant. The approach here unifies the proofs not only of Theorems 4.1.2 and 4.1.3, but also of Theorems 4.2.1, 4.4.2, and 4.4.8.


The model in Example 4.1.10 is due to Gillis [104], who calls it a 'centrally biased discrete random walk'. Gillis established recurrence for (only) ε > (d − 2)/(2d) and transience for ε < (d − 2)/(2d); his method used generating functions. Building on Gillis's argument, Flatto [91] established recurrence for ε = (d − 2)/(2d).

Section 4.2

The material on elliptic random walks is based on [101], although the germ of the model goes back to earlier work of two of the authors of the present book (see the Acknowledgements in [101]).

Consider the example with increments supported on ellipsoids discussed in Section 4.2.2. It follows from (4.32) that

    ‖ξ_1‖² = ‖ξ_0‖² + 2‖ξ_0‖〈ξ̂_0, θ〉 + ‖θ‖²
           = ‖ξ_0‖² + 2‖ξ_0‖〈e_1, Dζ〉 + 〈ζ, D²ζ〉
           = ‖ξ_0‖² + 2a√d ‖ξ_0‖〈e_1, ζ〉 + (a² − b²)d〈e_1, ζ〉² + b²d.   (4.61)

In particular, for this family of models (‖ξ_n‖, n ≥ 0) is itself a Markov process, since the distribution of ‖ξ_{n+1}‖ depends only on ‖ξ_n‖ and not on ξ_n itself; however, in the general setting of Section 4.2.1, this need not be the case.

One-dimensional processes with evolutions reminiscent of that given by (4.61) have been studied previously by Kingman [156] and Bingham [20]. Those processes can be viewed, respectively, as the distance from its start point of a random walk in Euclidean space, and the geodesic distance from its start point of a random walk on the surface of a sphere; but in both cases the increments of the random walk have the property that the distribution of the jump vector is a product of the independent marginal distributions of the length and direction of the jump vector. In contrast, for the elliptic random walk the laws of ‖θ_n‖ and 〈ξ_n, θ_n〉 are not independent (except when a = b).

Section 4.3

The results in this section are from [233], where the following question was raised.

Open problem 4.6.5. Prove or refute the conjecture that (for d ≥ 4) any d − 1 measures yield a transient walk.

Some partial results in this direction can be found in [233], but the conjecture remains open.


Section 4.4

Many-dimensional Markov processes with asymptotically zero drift, such as those in this section, were studied by Lamperti [173, 175] under the name centrally biased random walks, due to the nature of the drift field; the name had been used earlier by Gillis [104] for the model in Example 4.1.10.

The results on the supercritical case given in Section 4.4.3 are adapted from [215, 194, 42]. In (4.46) the exponent −2 on the logarithm is chosen for simplicity; it could be replaced with any exponent strictly less than −1. The law of large numbers for the process of norms, Lemma 4.4.3, is a variation on Theorem 3.2 of [215]. The limiting direction part of Theorem 4.4.2 is a variation on Theorem 2.2 of [194]. The method of proof of the limiting direction, via Lemma 4.4.4, is different from (and more transparent than) the method in [194], and extends the approach of [42, §6.2], which (in a slightly different context) was restricted to two dimensions and assumed a uniform bound on the increments.

The conclusion of Theorem 4.4.5, that ξ̂_n does not have a limit, can be strengthened in many circumstances. For example, a stronger property is angular recurrence, which says that ξ̂_n visits every neighbourhood on S^{d−1} infinitely often; see [194, 195] for some approaches to studying angular recurrence for non-homogeneous random walks. Stronger still is angular ergodicity, such as

    lim inf_{n→∞} n⁻¹ Σ_{m=0}^{n−1} 1{‖u − ξ̂_m‖ < ε} > 0.

The proof of Theorem 4.4.5 given here uses some ideas similar to those in [194, 195], but is considerably more transparent.

The recurrence classification in Section 4.4.5 is similar to that given in [121], and builds on work going back to Lamperti. Specifically, Theorem 4.4.8 extends results of Lamperti [173, 175], who assumed uniformly bounded increments and a stronger version of (CB3) with the error term log⁻¹ ‖x‖ replaced by ‖x‖⁻δ for δ > 0: see Theorem 4.1 of [173, p. 324] and Theorem 5.1 of [175, p. 142]; the latter result is not sharp enough to decide between null- or positive-recurrence at the boundary case 2ρ/σ² = −d. The result on moments of return times, Theorem 4.4.9, also extends Theorem 5.1 of [175]; Lamperti's result only covers integer s.

The almost-sure bounds in Theorem 4.4.10 are a variation on similar results from [121]. Upper bounds similar to those in Theorem 4.4.10 can be derived from [211, §3]: see Theorem 2.4 of [42] for a similar application of such results, albeit under more restrictive assumptions.


Section 4.6

The range (i.e., the cardinality of the set of visited sites, or sometimes this set itself) and the local time (i.e., the number of visits to a given site) for space-homogeneous discrete-time random walks have been extensively studied in the literature. It is a classical result that the expected range of the simple random walk is O(n/log n) for d = 2 and O(n) for d ≥ 3; see e.g. Section 6.1 of [125]. It is not difficult to obtain from this fact (using an independence argument as e.g. in Lemma 3.1 of [3]) that with very high probability the walk visits at least n^{1−δ} distinct sites by time n. Finer results for the range of homogeneous random walks can be found in a number of papers; see e.g. [17, 66, 110] and references therein. For non-homogeneous random walks these questions are, of course, more difficult; we mention [238], which contains results on the range of simple random walk on a supercritical percolation cluster.

The behaviour of the local time (i.e., the number of visits) at a fixed site, or of the field of local times at all sites, has been much studied in the literature as well. It is quite elementary to obtain that the expected number of visits to the origin by time n for the simple random walk is O(log n) for d = 2 and O(1) for d ≥ 3. Also, one can easily obtain for the simple random walk in dimension 2 (using e.g. E1 of Section III.16 of [270]) that the number of visits to the origin exceeds n^δ, for any fixed δ > 0, only with stretched-exponentially small probability. Of course, finer results (for more general random walks as well) are available; see e.g. [28, 51, 52, 55, 200].

Theorem 4.6.2 is taken from [204], although the exposition basically follows [cite Menshikov-Popov-JTP]; the last paper also contains the proof of the more general Theorem 4.6.4.

Theorem 4.6.2 shows that the range of a many-dimensional martingale is of order at least n^{(1/2)+ε} with high probability. As discussed above, for simple random walk the range is in fact of order at least n^{1−ε}. A natural question is thus:

Open problem 4.6.6. Under the conditions of Theorem 4.6.2, either prove that #Ξ_n is at least n^{1−ε} with high probability for any ε > 0, or produce a counterexample.

In general state spaces Σ ⊆ R^d, one natural definition of the range is

    R_n := | ∪_{x∈Ξ_n} B(x; r) |, for fixed r > 0,   (4.62)

the volume of the r-radius 'sausage' around the path. If Σ is locally finite, then, as r ↓ 0,

    r^{−d} R_n → v_d #Ξ_n,


where v_d = |B(0; 1)| is the volume of the unit-radius d-ball. So for locally finite Σ, it is usual to work simply with #Ξ_n, the number of sites visited, and also to call this the range of the walk. This is the approach taken in Section 4.6.

Open problem 4.6.7. Study the behaviour of R_n as defined by (4.62) for the elliptic random walks of Section 4.2.

Further remarks

We make some brief comments on motivation for some of the models considered in this chapter.

The PRRW has received some renewed interest recently as a model for the motion of microscopic organisms, such as certain bacteria, on a surface. The PRRW is more natural than lattice-based models, and experiment suggests that the locomotion of several kinds of cells consists of roughly straight line segments linked by discrete changes in direction: see e.g. Chapter 6 of [18] and [222, 223]. The generalization to elliptically-distributed increments studied in Section 4.2.2 represents movement on a surface on which either radial or transverse motion is inhibited. In chemistry and physics, the trajectory of a finite-step PRRW (also called a 'random chain') is an idealized model of the growth of weakly interacting polymer molecules: see, e.g., Chapter 15 of [162] and Section 2.6 of [125]. The modification to ellipsoid-supported jumps represents polymer growth in a biased medium.


Chapter 5

Heavy tails

5.1 Chapter overview

So far, the processes that we have considered in this book have typically had increments whose first (and second) moments have been uniformly bounded. The focus of this chapter is on the case where first (or second) moments do not exist, i.e., the increments are heavy-tailed. We consider Markov processes on R, and study questions of asymptotic behaviour, such as recurrence or transience.

This chapter deals with two different types of process. First, in Section 5.2, we study processes whose increments have, in some sense, heavier tails in one direction than the other. We investigate sufficient conditions for transience in the direction of the heavier tail, and quantify the transience by deriving results on the rate of escape and on moments of first passage and last exit times.

Second, in Section 5.3, we study processes whose increments are of two different types, depending on whether the current position is to the left or to the right of the origin. Such processes are known as oscillating random walks, and their study (via classical methods) goes back to Kemperman [144]. Different versions of the model have increment distributions that are symmetric or signed (i.e., jumps only towards the origin are allowed). For these models we study recurrence and transience.

In keeping with the theme of this book, the methods of this chapter are based on the semimartingale ideas of Chapter 2: for appropriate choices of Lyapunov function, we verify Foster–Lyapunov style drift conditions. Verification of drift conditions usually entails some Taylor's formula expansions (as in Chapter 3) as well as some careful truncation ideas to deal with the


heavy tails. The resulting proofs are relatively short, and based on some intuitively appealing ideas.

5.2 Directional transience

5.2.1 Introduction

In this section we assume the following.

(H0) Let X_n be a time-homogeneous Markov chain on Σ ⊆ R with 0 ∈ Σ, inf Σ = −∞ and sup Σ = +∞. Suppose that X_0 = 0.

For n ∈ Z⁺, we write ∆_n := X_{n+1} − X_n for the increments of X_n. In most of our results, we impose 'heavy tail' conditions on either ∆_n⁺ (positive jumps) or ∆_n⁻ (negative jumps) or both; typically these conditions are one-sided (i.e., inequalities). Two of the most common assumptions are as follows.

(H1) There exist α > 0, c > 0, and y_0 < ∞ for which, for all x ∈ Σ and all y ≥ y_0, P[∆_n⁺ > y | X_n = x] ≥ cy^{−α}.

(H2) There exist α ∈ (0, 1), c > 0, and y_0 < ∞ for which, for all x ∈ Σ and all y ≥ y_0, E[∆_n⁺ 1{∆_n⁺ ≤ y} | X_n = x] ≥ cy^{1−α}.
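For intuition (our own illustration, not from the book): a Pareto jump with tail P[∆_n⁺ > y] = y^{−α} for y ≥ 1 satisfies (H1) with c = 1, and a direct computation gives the truncated mean E[∆_n⁺ 1{∆_n⁺ ≤ y}] = α(y^{1−α} − 1)/(1 − α), so (H2) holds as well for α ∈ (0, 1). A Monte Carlo check:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = 0.5

# Inverse-transform sampling: if U ~ Uniform(0,1), then U^{-1/alpha} has
# tail P[Delta > y] = y^{-alpha} for y >= 1.
delta = rng.random(200_000) ** (-1.0 / alpha)

y = 100.0
emp_tail = (delta > y).mean()               # (H1): should be close to y^{-alpha}
emp_trunc = np.mean(delta * (delta <= y))   # (H2): empirical truncated mean
thy_trunc = alpha * (y ** (1 - alpha) - 1) / (1 - alpha)   # exact value, = 9 here
```

With α = 1/2 the jump has infinite mean, yet the truncated mean grows like √y, which is exactly the growth that (H2) records.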

The following non-confinement result shows that, under the conditions of most of the results in this section, the process X_n has non-trivial asymptotic behaviour.

Proposition 5.2.1. Suppose that (H0) holds. Suppose that either (H1) or (H2) holds, or either holds with ∆_n⁻ instead of ∆_n⁺. Then

    lim sup_{n→∞} |X_n| = ∞, a.s.   (5.1)

Proof. We claim that under any of the conditions in the proposition, it is the case that for any y ≥ 0 sufficiently large, there exists δ(y) > 0 for which

    P[|∆_n| > y | X_n = x] ≥ δ(y), for all x ∈ Σ.   (5.2)

Consider the events A_n := {|X_n| > B} and let F_n = σ(X_0, …, X_n). Given (5.2), for any B < ∞ large enough,

    P[A_{n+1} ∪ A_n | F_n] = P[A_{n+1} | F_n] 1(A_n^c) + 1(A_n)


      ≥ P[A_{n+1} | F_n] 1(A_n^c) ≥ δ(2B) > 0,

since on A_n^c we have |X_n| ≤ B, so a jump with |∆_n| > 2B forces |X_{n+1}| > B. Thus Σ_{n≥0} P[A_{n+1} ∪ A_n | F_n] = ∞ a.s., and A_{n+1} ∪ A_n ∈ F_{n+1}, so Lévy's extension of the Borel–Cantelli lemma (Theorem 2.3.19) shows that lim sup_{n→∞} |X_n| ≥ B, a.s. Since B was arbitrarily large, (5.1) follows.

It remains to verify (5.2). Since |∆_n| = ∆_n⁺ + ∆_n⁻, it suffices to verify (5.2) with one of ∆_n⁺ or ∆_n⁻ in place of |∆_n|. In the case where (H1) holds, the claim is immediate. If (H2) holds, then, for any y ≥ y_0, for z > y,

    E[∆_n⁺ 1{y ≤ ∆_n⁺ ≤ z} | X_n = x] ≥ E[∆_n⁺ 1{∆_n⁺ ≤ z} | X_n = x] − y > 1,

provided z > y_0 + ((1 + y)/c)^{1/(1−α)}, say. Then,

    1 < E[∆_n⁺ 1{y ≤ ∆_n⁺ ≤ z} | X_n = x] ≤ z P[y ≤ ∆_n⁺ ≤ z | X_n = x] ≤ z P[∆_n⁺ ≥ y | X_n = x],

which implies (5.2) in this case also.

5.2.2 A condition for directional transience

Our first main result, which gives conditions under which lim_{n→∞} X_n = +∞, imposes a moments condition on the negative jumps of the process:

(H3) There exist β > 0 and C < ∞ for which E[(∆_n⁻)^β | X_n = x] ≤ C for all x ∈ Σ.

Theorem 5.2.2. Suppose that (H0) holds, and that (H2) and (H3) hold for α ∈ (0, 1) and β > α. Then X_n → +∞ a.s. as n → ∞.
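A hedged illustration of Theorem 5.2.2 (model and parameters our own, not from the book): take Pareto(α = 1/2) positive jumps, so (H2) holds, and Exp(1) negative jumps, so (H3) holds for every β > 0; the theorem then gives X_n → +∞ a.s.

```python
import numpy as np

rng = np.random.default_rng(3)

def heavy_right_walk(n, alpha=0.5):
    """X_{k+1} = X_k + up_k - down_k with heavy-tailed up-jumps
    (P[up > y] = y^{-alpha} for y >= 1, infinite mean when alpha < 1)
    and Exp(1) down-jumps (all moments finite)."""
    up = rng.random(n) ** (-1.0 / alpha)
    down = rng.exponential(1.0, size=n)
    return np.concatenate([[0.0], np.cumsum(up - down)])

X = heavy_right_walk(50_000)
```

Since the up-jumps have infinite mean for α < 1, the walk escapes to +∞ at a superlinear rate: the heavy right tail overwhelms the light-tailed left jumps, which is the mechanism the theorem makes precise.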

To prove Theorem 5.2.2, we use the Lyapunov function f_{z,δ} : R → [0, 1] defined for z ∈ R and δ > 0 by

    f_{z,δ}(y) := { 1 if y ≤ z;  (1 + y − z)^{−δ} if y > z.   (5.3)

Note that f_{z,δ} is non-increasing on R.
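A direct implementation of (5.3) confirms its basic properties (an illustration only; the function name is ours):

```python
import numpy as np

def f(y, z=0.0, delta=0.5):
    """Lyapunov function f_{z,delta} of (5.3): equal to 1 on (-inf, z],
    decaying like (1 + y - z)^{-delta} on (z, inf); non-increasing and
    taking values in (0, 1]."""
    y = np.asarray(y, dtype=float)
    return np.where(y <= z, 1.0, (1.0 + np.maximum(y - z, 0.0)) ** -delta)

ys = np.linspace(-5.0, 50.0, 1001)
vals = f(ys)
```

The function rewards rightward motion (its value decreases as y grows past z), so a supermartingale property for f_{z,δ}(X_n), as in Lemma 5.2.3 below, expresses the walk's reluctance to fall back below level z.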

Lemma 5.2.3. Suppose that (H0) holds, and that (H2) and (H3) hold for α ∈ (0, 1) and β > α. Then for any δ ∈ (0, β − α) and some A > 0 sufficiently large, for any z ∈ R,

    E[f_{z,δ}(X_{n+1}) − f_{z,δ}(X_n) | X_n = x] ≤ 0, for all x > z + A.


Proof. It suffices to suppose that n = 0 and z = 1. Let δ > 0, and let f_δ := f_{1,δ} be as defined at (5.3). Let γ ∈ (0, 1); we will specify δ and γ later. Since f_δ is non-increasing and [0, 1]-valued, we have for x > 1 that

    f_δ(x + ∆_0) − f_δ(x) ≤ [ (x + ∆_0⁺)^{−δ} − x^{−δ} ] 1{∆_0⁺ ≤ x^γ}
       + [ (x − ∆_0⁻)^{−δ} − x^{−δ} ] 1{∆_0⁻ ≤ x^γ} + 1{∆_0⁻ > x^γ}.   (5.4)

We will take expectations on both sides of (5.4), conditioning on X_0 = x. The final term in (5.4) then becomes, by Markov's inequality and (H3),

    P_x[∆_0⁻ > x^γ] = P_x[(∆_0⁻)^β > x^{γβ}] ≤ Cx^{−γβ}.   (5.5)

For the second term on the right-hand side of (5.4), since γ < 1, Taylor's formula implies that, as x → ∞,

    [ (x − ∆_0⁻)^{−δ} − x^{−δ} ] 1{∆_0⁻ ≤ x^γ} = δ(1 + o(1)) x^{−1−δ} ∆_0⁻ 1{∆_0⁻ ≤ x^γ},   (5.6)

where the o(1) term is non-random. Here we have for the product of the final two terms in (5.6) that

    ∆_0⁻ 1{∆_0⁻ ≤ x^γ} = (∆_0⁻)^{β∧1} (∆_0⁻)^{(1−β)⁺} 1{∆_0⁻ ≤ x^γ} ≤ (∆_0⁻)^{β∧1} x^{γ(1−β)⁺}.   (5.7)

Combining (5.6) and (5.7), and using (H3), we obtain

    E_x[ [ (x − ∆_0⁻)^{−δ} − x^{−δ} ] 1{∆_0⁻ ≤ x^γ} ] = O(x^{−1−δ+γ(1−β)⁺}).   (5.8)

For the first term on the right-hand side of (5.4), another application of Taylor's formula implies that, as x → ∞,

    [ (x + ∆_0⁺)^{−δ} − x^{−δ} ] 1{∆_0⁺ ≤ x^γ} = −δ(1 + o(1)) x^{−1−δ} ∆_0⁺ 1{∆_0⁺ ≤ x^γ}.

Taking expectations and using (H2) we obtain, for x sufficiently large,

    E_x[ [ (x + ∆_0⁺)^{−δ} − x^{−δ} ] 1{∆_0⁺ ≤ x^γ} ] ≤ −(cδ/2) x^{−1−δ+γ(1−α)}.   (5.9)

Thus from (5.4), using the estimates (5.5), (5.8) and (5.9), we verify thatEx[fδ(X1)− fδ(X0)] ≤ 0, for x > A with some A sufficiently large, providedthat the negative term arising from (5.9) dominates, i.e.,

−1− δ+ γ(1−α) > −γβ, and − 1− δ+ γ(1−α) > −1− δ+ γ(1− β)+.

The second inequality holds since α < β ∧ 1. The first inequality holdsprovided we choose δ ∈ (0, β − α), which we may do since α < β, and thenchoose γ ∈ ( 1+δ

1+β−α , 1).


Now we can give the proof of Theorem 5.2.2.

Proof of Theorem 5.2.2. The argument is similar to that for Theorem 3.5.6. First we show that, under the conditions of the theorem,
\[
P\Bigl[\liminf_{n\to\infty} X_n = -\infty\Bigr] = 0. \tag{5.10}
\]

Let $a > 0$, to be chosen later. For $x \in \mathbb{R}$, set
\[
\nu_x := \min\{n \ge 0 : X_n > x + a\}; \qquad \eta_x := \min\{n \ge \nu_x : X_n \le x\}.
\]
In particular, since $X_0 = 0$, we have that $\nu_x = 0$ for all $x < -a$.

Let $\delta \in (0, \beta - \alpha)$ and write $\mathcal{F}_n = \sigma(X_0, \ldots, X_n)$. Then Lemma 5.2.3 shows that, on $\{\nu_x < \infty\}$, $(f_{x-A,\delta}(X_{n\wedge\eta_x}), n \ge \nu_x)$ is a non-negative supermartingale, and so converges a.s. to a finite limit, $L_x$, say. On $\{\nu_x < \infty\}$, we have by the supermartingale property that
\[
E[f_{x-A,\delta}(X_{n\wedge\eta_x}) \mid \mathcal{F}_{\nu_x}] \le f_{x-A,\delta}(X_{\nu_x}) \le (1 + A + a)^{-\delta}, \quad \text{a.s.},
\]
while by Fatou's lemma, also on $\{\nu_x < \infty\}$,
\[
\lim_{n\to\infty} E[f_{x-A,\delta}(X_{n\wedge\eta_x}) \mid \mathcal{F}_{\nu_x}]
\ge E[L_x \mid \mathcal{F}_{\nu_x}]
\ge E[L_x \mathbf{1}\{\eta_x < \infty\} \mid \mathcal{F}_{\nu_x}]
\ge (1+A)^{-\delta}\, P[\eta_x < \infty \mid \mathcal{F}_{\nu_x}],
\]
since, on $\{\eta_x < \infty\}$, $X_{n\wedge\eta_x} \le x$ for all $n$ sufficiently large. So on $\{\nu_x < \infty\}$ we have, a.s.,
\[
P[\eta_x < \infty \mid \mathcal{F}_{\nu_x}] \le \Bigl(\frac{1+A+a}{1+A}\Bigr)^{-\delta}.
\]

Let $\varepsilon > 0$. Then we can take $a$ sufficiently large so that $P[\eta_x = \infty \mid \mathcal{F}_{\nu_x}] \ge 1 - \varepsilon$, a.s., on $\{\nu_x < \infty\}$. For such a choice of $a$, suppose that $x < -a$; then $\nu_x < \infty$ a.s. (indeed, since $X_0 = 0$, $\nu_x = 0$ a.s.). Hence for such an $x$,
\[
P\Bigl[\liminf_{n\to\infty} X_n > x\Bigr] = P[\eta_x = \infty]
\ge E\bigl[P[\eta_x = \infty \mid \mathcal{F}_{\nu_x}]\mathbf{1}\{\nu_x < \infty\}\bigr] \ge 1 - \varepsilon. \tag{5.11}
\]
It follows from (5.11) that
\[
P\Bigl[\liminf_{n\to\infty} X_n = -\infty\Bigr] \le P\Bigl[\liminf_{n\to\infty} X_n \le -a-1\Bigr] \le \varepsilon.
\]
Since $\varepsilon > 0$ was arbitrary, (5.10) follows.

Proposition 5.2.1 applies under condition (H2). Hence, together with (5.1), (5.10) implies that, a.s., $\limsup_{n\to\infty} X_n = \infty$; in other words, for any $a > 0$ and any $x \in \mathbb{R}$, $\nu_x < \infty$ a.s. Hence the argument for (5.11) extends to any $x \in \mathbb{R}$, which implies that for any $x \in \mathbb{R}$, a.s., $\liminf_{n\to\infty} X_n > x$, so $X_n \to \infty$ a.s.
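Theorem 5.2.2 is easy to illustrate numerically. The following minimal simulation is our own construction, not an example from the text: positive jumps are Pareto with tail exponent $\alpha = 1/2$ (so a condition of type (H2) holds with that $\alpha$), while negative jumps are bounded (so (H3) holds for every $\beta$), and with a fixed seed the trajectory drifts off to $+\infty$.

```python
import random

def simulate(n_steps, alpha=0.5, seed=1):
    """Toy chain for Theorem 5.2.2: with prob. 1/2 a Pareto(alpha) jump up
    (P[jump > y] = y^{-alpha} for y >= 1), else a jump down uniform on [0, 1].
    Heavy upward tail with alpha < 1, light downward tail: X_n -> +infinity."""
    rng = random.Random(seed)
    x, traj = 0.0, [0.0]
    for _ in range(n_steps):
        if rng.random() < 0.5:
            x += (1.0 - rng.random()) ** (-1.0 / alpha)  # Pareto(alpha) up-jump
        else:
            x -= rng.random()                            # bounded down-jump
        traj.append(x)
    return traj

traj = simulate(10_000)
assert traj[-1] > traj[0]  # the walk ends far to the right of its start
```

With $\alpha = 1/2$ the up-jumps have infinite mean, so escape is very fast, in line with the rate $n^{1/\alpha}$ of Theorem 5.2.5 below.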

5.2.3 Almost-sure bounds

Our next two results deal with the growth rate of $X_n$. First we have the following upper bounds.

Theorem 5.2.4. Suppose that there exist $\theta \in (0,1]$, $\varphi \in \mathbb{R}$, $y_0 < \infty$ and $C < \infty$ such that, for all $y \ge y_0$,
\[
P[\Delta_n^+ \ge y \mid X_n = x] \le C y^{-\theta}(\log y)^{\varphi}, \quad \text{for all } x \in \Sigma. \tag{5.12}
\]

(i) If $\theta \in (0,1)$, then, for any $\varepsilon > 0$, a.s., for all but finitely many $n \ge 0$,
\[
X_n \le n^{1/\theta}(\log n)^{\frac{\varphi+2}{\theta}+\varepsilon}.
\]

(ii) If $\theta = 1$, then, for any $\varepsilon > 0$, a.s., for all but finitely many $n \ge 0$,
\[
X_n \le n(\log n)^{(1+\varphi)^+ + 1 + \varepsilon}.
\]

The next result shows that if we impose condition (H1) instead of condition (H2) in Theorem 5.2.2, not only does $X_n \to +\infty$ a.s., but it does so at a particular rate of escape.

Theorem 5.2.5. Suppose that (H0) holds, and that (H1) and (H3) hold for $\alpha \in (0,1)$ and $\beta > \alpha$. Then for any $\varepsilon > 0$, a.s., for all but finitely many $n \ge 0$,
\[
X_n \ge n^{1/\alpha}(\log n)^{-(1/\alpha)-\varepsilon}.
\]

Theorems 5.2.4 and 5.2.5 have the following corollary.

Corollary 5.2.6. Suppose that (H0) holds, and that (H3) holds for $\alpha \in (0,1)$ and $\beta > \alpha$. Suppose also that
\[
\lim_{y\to\infty} \sup_{x\in\Sigma} \biggl| \frac{\log P[\Delta_n^+ > y \mid X_n = x]}{\log y} + \alpha \biggr| = 0.
\]
Then
\[
\lim_{n\to\infty} \frac{\log X_n}{\log n} = \frac{1}{\alpha}, \quad \text{a.s.}
\]


Proof. The condition in the corollary ensures that for any $\varepsilon > 0$ there exists $y_0 < \infty$ such that, for all $y \ge y_0$ and all $x \in \Sigma$,
\[
y^{-\alpha-\varepsilon} \le P[\Delta_n^+ > y \mid X_n = x] \le y^{-\alpha+\varepsilon}, \quad \text{a.s.}
\]
Theorem 5.2.4 with the upper bound in the last display and (H3) then shows that for any $\varepsilon > 0$, a.s., $X_n \le n^{(1/\alpha)+\varepsilon}$ for all but finitely many $n$. On the other hand, Theorem 5.2.5 with the lower bound in the last display and (H3) shows that for any $\varepsilon > 0$, a.s., $X_n \ge n^{(1/\alpha)-\varepsilon}$ for all but finitely many $n$. Since $\varepsilon > 0$ was arbitrary, the result follows.

The rest of this section is devoted to the proofs of Theorems 5.2.4 and 5.2.5. First we give a general condition for obtaining almost-sure upper bounds.

Lemma 5.2.7. Let $h : \mathbb{R}_+ \to \mathbb{R}_+$ be increasing and concave. Suppose that there exists $C < \infty$ such that $E[h(\Delta_n^+) \mid X_n = x] \le C$ for all $x \in \Sigma$. Then for any $\varepsilon > 0$, a.s., for all but finitely many $n \ge 0$,
\[
X_n \le \sum_{m=0}^{n-1} \Delta_m^+ \le h^{-1}\bigl(n(\log n)^{1+\varepsilon}\bigr).
\]

Proof. Set $Y_0 := 0$ and for $n \in \mathbb{N}$ let $Y_n := \sum_{m=0}^{n-1} \Delta_m^+$. Then $Y_n \ge 0$ is non-decreasing and $X_n \le X_0 + Y_n = Y_n$, since $X_0 = 0$. Since $h$ is non-negative and concave, it is subadditive, i.e., $h(a+b) \le h(a) + h(b)$ for $a, b \in \mathbb{R}_+$. Hence
\[
E[h(Y_n + \Delta_n^+) - h(Y_n) \mid X_n = x] \le E[h(\Delta_n^+) \mid X_n = x] \le C, \tag{5.13}
\]
by hypothesis. The almost-sure upper bound in Theorem 2.8.1 applied to $h(Y_n)$ implies that, for any $\varepsilon > 0$, a.s., $h(Y_n) \le n(\log n)^{1+\varepsilon}$ for all but finitely many $n$. Since $h$ is increasing and $X_n \le Y_n$, it follows that for any $\varepsilon > 0$, a.s., $h(X_n) \le n(\log n)^{1+\varepsilon}$.
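The subadditivity step used in the proof, $h(a+b) \le h(a) + h(b)$ for non-negative concave $h$, can be spot-checked numerically; here is a small sketch (our own, not from the text) with $h(x) = x^\theta$, a typical choice later in this section.

```python
def h(x, theta=0.7):
    """A non-negative, increasing, concave function on R_+: h(x) = x**theta
    with theta in (0, 1]."""
    return x ** theta

# Subadditivity h(a + b) <= h(a) + h(b) on a grid of non-negative points.
grid = [0.0, 0.1, 0.5, 1.0, 3.0, 10.0, 100.0]
assert all(h(a + b) <= h(a) + h(b) + 1e-12 for a in grid for b in grid)
```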

Next we need a result on the maxima of the increments of $X_n$.

Lemma 5.2.8. Suppose that (H1) holds for $\alpha \in (0,\infty)$. Then for any $\varepsilon > 0$, a.s., for all but finitely many $n \ge 0$,
\[
\max_{0\le m\le n} \Delta_m^+ \ge n^{1/\alpha}(\log n)^{-(1/\alpha)-\varepsilon}.
\]

Proof. By repeated applications of the Markov property and (H1),
\[
P\Bigl[\max_{0\le m\le n} \Delta_m^+ < y\Bigr] \le \prod_{m=0}^{n}(1 - c y^{-\alpha}) \le (1 - c y^{-\alpha})^n, \tag{5.14}
\]
for $y \ge y_0$. Taking $y = n^{1/\alpha}(\log n)^q$ in (5.14) we obtain, for $n$ sufficiently large,
\[
P\Bigl[\max_{0\le m\le n} \Delta_m^+ < n^{1/\alpha}(\log n)^q\Bigr]
\le \bigl(1 - c n^{-1}(\log n)^{-\alpha q}\bigr)^n
= O\bigl(\exp\bigl(-c(\log n)^{-\alpha q}\bigr)\bigr),
\]
which is summable over sufficiently large $n$ provided $q < -1/\alpha$. Hence the Borel–Cantelli lemma completes the proof.
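The passage from (5.14) to the exponential estimate rests on the elementary bound $(1-p)^n \le e^{-np}$ for $p \in [0,1]$; a quick numerical confirmation (our own sketch, with the parameter $c = 0.3$ chosen arbitrarily):

```python
import math

def survival_bound(n, c, y, alpha):
    """Right-hand side of (5.14), (1 - c*y**(-alpha))**n, paired with the
    exponential bound exp(-n*c*y**(-alpha)) used to deduce summability."""
    p = c * y ** (-alpha)
    return (1.0 - p) ** n, math.exp(-n * p)

for n in [10, 100, 10_000]:
    y = float(n) ** 2.0  # e.g. y = n^{1/alpha} with alpha = 1/2
    exact, bound = survival_bound(n, c=0.3, y=y, alpha=0.5)
    assert 0.0 <= exact <= bound  # (1 - p)^n <= exp(-n*p)
```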

Now we can complete the proofs of Theorems 5.2.4 and 5.2.5.

Proof of Theorem 5.2.4. First we prove part (i), so let $\theta \in (0,1)$. For $\varepsilon > 0$, take $h(x) = (K+x)^\theta(\log(K+x))^{-\varphi-1-\varepsilon}$. For a large enough choice of $K \ge 1$, $h$ is non-negative, increasing, and concave. Moreover, $E[h(\Delta_n^+) \mid X_n = x]$ is uniformly bounded provided $\sum_{k=1}^\infty h'(k)\, P[\Delta_n^+ > k \mid X_n = x]$ is uniformly bounded; see e.g. [109, p. 76]. This is indeed the case under the hypothesis of the theorem, by (5.12), since $h'(x) = O(x^{\theta-1}(\log x)^{-\varphi-1-\varepsilon})$. Now (i) follows from Lemma 5.2.7, noting that $h^{-1}(x) = O\bigl(x^{1/\theta}(\log x)^{\frac{\varphi+1}{\theta}+\varepsilon}\bigr)$. The proof of (ii) is similar, this time taking $h(x) = (K+x)(\log(K+x))^{-(\varphi+1)^+-\varepsilon}$.

Proof of Theorem 5.2.5. Let $\varepsilon > 0$. Lemma 5.2.7 applied to $-X_n$ with $h(x) = x^{\beta\wedge 1}$, using (H3), shows that for all but finitely many $n$,
\[
\sum_{m=0}^{n-1} \Delta_m^- \le n^{1/(\beta\wedge 1)}(\log n)^{(1/(\beta\wedge 1))+\varepsilon}, \quad \text{a.s.}
\]
On the other hand, Lemma 5.2.8 implies that for all but finitely many $n$,
\[
\sum_{m=0}^{n-1} \Delta_m^+ \ge \max_{0\le m\le n-1} \Delta_m^+ \ge n^{1/\alpha}(\log n)^{-(1/\alpha)-\varepsilon}, \quad \text{a.s.}
\]
Combining these bounds and using the fact that $\alpha < \beta \wedge 1$ completes the proof.


5.2.4 Moments of passage times

Recall that $\rho_x = \min\{n \ge 0 : X_n \ge x\}$ denotes the first passage time into the half-line $[x,\infty)$. Under the conditions of Theorem 5.2.2, $X_n \to +\infty$ a.s., so that $\rho_x < \infty$ a.s. for all $x \in \mathbb{R}$. It is natural to study the tails or moments of the random variable $\rho_x$ in order to quantify the transience.

Theorem 5.2.9. Suppose that (H0) holds, and that (H2) and (H3) hold for $\alpha \in (0,1)$ and $\beta > \alpha$. Then for any $x \in \mathbb{R}$ and any $s \in [0, \beta/\alpha)$, $E[\rho_x^s] < \infty$.

Theorem 5.2.10. Let $\alpha \in (0,1]$ and $\beta > 0$. Suppose that, for some $C < \infty$, $E[(\Delta_n^+)^\alpha \mid X_n = x] \le C$ for all $x \in \Sigma$, and $E[(\Delta_n^-)^\beta \mid X_n = x] = \infty$ for all $x \in \Sigma$. Then, for any $x > 0$, $E[\rho_x^{\beta/\alpha}] = \infty$.

Note that in Theorem 5.2.9, $\beta/\alpha > 1$, so in particular $E\rho_x < \infty$ for any $x \in \mathbb{R}$. The proof of Theorem 5.2.9 will use another Lyapunov function, described in the next result.

Lemma 5.2.11. Suppose that (H0) holds, and that (H2) and (H3) hold for $\alpha \in (0,1)$ and $\beta > \alpha$. For $\gamma \in (\alpha,\beta)$ and $y \in \mathbb{R}$ let $W_n := (y - X_n)^\gamma \mathbf{1}\{X_n < y\}$. Then the following hold.

(i) Take $\gamma = \beta - \varepsilon$. Then for any $\varepsilon \in \bigl(0, \frac{\beta(\beta-\alpha)}{1+\beta-\alpha}\bigr)$ there exists a finite constant $K$ such that, for all $n$,
\[
E[W_{n+1} - W_n \mid X_n = x] \le K, \quad \text{for all } x \in \Sigma.
\]

(ii) For any $\eta \in (0, 1 - (\alpha/\beta))$, we can choose $z < y$ and $\gamma \in (\alpha,\beta)$ such that, for some $c > 0$, for all $x < z$,
\[
E[W_{n+1} - W_n \mid X_n = x] \le -c W_n^\eta.
\]

Proof. Fix $y \in \mathbb{R}$. Also take $\gamma \in (\alpha,\beta)$ and $\theta \in (0,1)$; we will make more restrictive specifications for these parameters later. It suffices to suppose that $n = 0$. If $X_0 = x < y - 1$, we have $(y - X_0)^\theta < y - X_0$ and so
\[
\begin{aligned}
W_1 - W_0 &= (y - x - \Delta_0)^\gamma \mathbf{1}\{X_1 < y\} - (y-x)^\gamma \\
&\le \bigl[(y - x - \Delta_0^+)^\gamma - (y-x)^\gamma\bigr]\mathbf{1}\{\Delta_0^+ \le (y-x)^\theta\} \\
&\quad + \bigl[(y - x + \Delta_0^-)^\gamma - (y-x)^\gamma\bigr]\mathbf{1}\{\Delta_0^- \le (y-x)^\theta\} \\
&\quad + (y - x + \Delta_0^-)^\gamma \mathbf{1}\{\Delta_0^- \ge (y-x)^\theta\}. \tag{5.15}
\end{aligned}
\]

We bound the three terms on the right-hand side of (5.15) in turn. For the first term, we have from Taylor's formula that
\[
\bigl[(y-x-\Delta_0^+)^\gamma - (y-x)^\gamma\bigr]\mathbf{1}\{\Delta_0^+ \le (y-x)^\theta\}
= -\gamma \Delta_0^+ (y-x)^{\gamma-1}(1+o(1))\,\mathbf{1}\{\Delta_0^+ \le (y-x)^\theta\},
\]
where the $o(1)$ is non-random and as $y - x \to \infty$. Hence, taking expectations and using (H2), it follows that for fixed $y$ and any $x$ for which $y - x$ is large enough,
\[
E_x\bigl[\bigl[(y-x-\Delta_0^+)^\gamma - (y-x)^\gamma\bigr]\mathbf{1}\{\Delta_0^+ \le (y-x)^\theta\}\bigr]
\le -(c\gamma/2)(y-x)^{\gamma-1+\theta(1-\alpha)}.
\]

For the second term on the right-hand side of (5.15), a similar application of Taylor's formula yields, for $y - x$ sufficiently large,
\[
\begin{aligned}
\bigl[(y-x+\Delta_0^-)^\gamma - (y-x)^\gamma\bigr]\mathbf{1}\{\Delta_0^- \le (y-x)^\theta\}
&\le 2\gamma (y-x)^{\gamma-1}(\Delta_0^-)^{\beta\wedge 1}(\Delta_0^-)^{(1-\beta)^+}\mathbf{1}\{\Delta_0^- \le (y-x)^\theta\} \\
&\le 2\gamma (y-x)^{\gamma-1+\theta(1-\beta)^+}(\Delta_0^-)^{\beta\wedge 1}.
\end{aligned}
\]
Taking expectations and using (H3), we obtain
\[
E_x\bigl[\bigl[(y-x+\Delta_0^-)^\gamma - (y-x)^\gamma\bigr]\mathbf{1}\{\Delta_0^- \le (y-x)^\theta\}\bigr]
\le K(y-x)^{\gamma-1+\theta(1-\beta)^+},
\]
for a constant $K < \infty$. For the final term in (5.15),
\[
(y-x+\Delta_0^-)^\gamma \mathbf{1}\{\Delta_0^- \ge (y-x)^\theta\}
\le \bigl((\Delta_0^-)^{1/\theta} + \Delta_0^-\bigr)^\gamma \le 2^\gamma (\Delta_0^-)^{\gamma/\theta}.
\]
Taking $\gamma = \theta\beta$, which requires $\theta \in (\alpha/\beta, 1)$, and using (H3), we see that
\[
E_x\bigl[(y-x+\Delta_0^-)^\gamma \mathbf{1}\{\Delta_0^- \ge (y-x)^\theta\}\bigr] \le 2^\gamma C.
\]

Combining these estimates and taking expectations in (5.15), we see that the negative term dominates asymptotically provided
\[
\gamma - 1 + \theta(1-\alpha) > 0 \quad\text{and}\quad \gamma - 1 + \theta(1-\alpha) > \gamma - 1 + \theta(1-\beta)^+.
\]
The first inequality requires $\theta > 1/(1+\beta-\alpha)$, which is a stronger condition than the $\theta > \alpha/\beta$ that we had already imposed, but which can be achieved with $\theta \in (\alpha/\beta, 1)$ since $\alpha < \beta$. The second inequality reduces to $1 - \alpha > (1-\beta)^+$, which is satisfied since $\alpha < \beta \wedge 1$. Part (i) follows. Moreover, for $\gamma = \theta\beta$ with $1/(1+\beta-\alpha) < \theta < 1$, we can take $y - x$ large enough so that, for some $\varepsilon > 0$,
\[
E_x[W_1 - W_0] \le -\varepsilon(y-x)^{\theta\beta - 1 + \theta(1-\alpha)} = -\varepsilon W_0^\eta,
\]
where $\eta = (\theta\beta - 1 + \theta(1-\alpha))/(\theta\beta)$ can be anywhere in $(0, 1-(\alpha/\beta))$ by appropriate choice of $\theta$, which proves part (ii).


Lemma 5.2.11 has as a consequence the following tail bound.

Lemma 5.2.12. Suppose that (H0) holds, and that (H2) and (H3) hold for $\alpha \in (0,1)$ and $\beta > \alpha$. Then for any $\varphi > 0$ and any $\varepsilon > 0$, as $n \to \infty$,
\[
P\Bigl[\min_{0\le m\le n} X_m \le -n^{\varphi}\Bigr] = O\bigl(n^{1-\beta\varphi+\varepsilon}\bigr).
\]

Proof. As in Lemma 5.2.11, choosing $y = 0$ there, let $W_n = (-X_n)^\gamma \mathbf{1}\{X_n < 0\}$. Take $\gamma = \beta - \varepsilon$ for $\varepsilon \in \bigl(0, \frac{\beta(\beta-\alpha)}{1+\beta-\alpha}\bigr)$. For $n > 0$,
\[
P\Bigl[\min_{0\le m\le n} X_m \le -n^{\varphi}\Bigr]
\le P\Bigl[\max_{0\le m\le n} W_m \ge n^{\varphi\gamma}\Bigr] = O\bigl(n^{1-\varphi\gamma}\bigr),
\]
by Lemma 5.2.11(i) and Theorem 2.4.8. The result follows.

Now we can give the proof of our result on existence of moments for $\rho_x$.

Proof of Theorem 5.2.9. Define $W_n = (y - X_n)^\gamma \mathbf{1}\{X_n < y\}$ as in Lemma 5.2.11. For $z > 0$, let $\tau_z = \min\{n \ge 0 : W_n \le z\}$. Since $\{W_n \le z\} = \{X_n \ge y - z^{1/\gamma}\}$, we have that $\rho_x = \tau_{(y-x)^\gamma}$ for $x \le y$. Now fix $x \in \mathbb{R}$. Under the conditions of the theorem, Lemma 5.2.11(ii) implies that, for any $\eta \in (0, 1-(\alpha/\beta))$, for $y > x$ sufficiently large, on $\{n < \tau_{(y-x)^\gamma}\}$, a.s.,
\[
E[W_{n+1} - W_n \mid \mathcal{F}_n] \le -\varepsilon W_n^\eta.
\]
Then Corollary 2.7.3 shows that for any $x \in \mathbb{R}$, $E[\rho_x^s] = E[\tau_{(y-x)^\gamma}^s] < \infty$ for any $s < \beta/\alpha$.

Next we prove our non-existence of moments result for $\rho_x$. We use an idea based on Theorem 2.4.8: roughly speaking, we show that with good probability $X_n$ travels a long way in the negative direction with a single heavy-tailed jump, and then must take a long time to come back.

Proof of Theorem 5.2.10. Fix $x > 0$ and let $y < x$. Let $W'_n = (X_n - y)^\alpha \mathbf{1}\{X_n > y\}$. Then on $\{X_n \le y\}$, $W'_{n+1} - W'_n \le (\Delta_n^+)^\alpha$. On the other hand, on $\{X_n > y\}$,
\[
W'_{n+1} - W'_n \le (X_n + \Delta_n^+ - y)^\alpha - (X_n - y)^\alpha \le (\Delta_n^+)^\alpha,
\]
by concavity, since $\alpha \in (0,1]$. Hence for some $C < \infty$ (not depending on $y$), $E[W'_{n+1} - W'_n \mid \mathcal{F}_n] \le C$, a.s., so the maximal inequality of Theorem 2.4.8 implies that, for any $y < x$,
\[
P\Bigl[\max_{0\le k\le m} W'_{n+k} \ge (x-y)^\alpha \,\Big|\, \mathcal{F}_n\Bigr]
\le \frac{Cm + W'_n}{(x-y)^\alpha}, \quad \text{a.s.}
\]
In particular, on $\{X_n \le y\}$, $W'_n = 0$ and so
\[
P\Bigl[\max_{0\le k\le m} X_{n+k} \ge x \,\Big|\, \mathcal{F}_n\Bigr]
\le P\Bigl[\max_{0\le k\le m} W'_{n+k} \ge (x-y)^\alpha \,\Big|\, \mathcal{F}_n\Bigr]
\le \frac{Cm}{(x-y)^\alpha}, \quad \text{a.s.}
\]
Setting $m = (x-y)^\alpha/(2C)$ in the last display, we obtain that for some $\varepsilon > 0$ (not depending on $x$ or $y$), on $\{n < \rho_x\} \cap \{X_n \le y\}$, for any $y < x$,
\[
P[\rho_x \ge \varepsilon(x-y)^\alpha \mid \mathcal{F}_n] \ge 1/2, \quad \text{a.s.} \tag{5.16}
\]

Since $X_0 = 0$ and $x > 0$, we have that $\Delta_0^- > y^-$ implies $\rho_x > 1$ and $X_1 \le y$. So applying (5.16) at $n = 1$ we have that
\[
P[\rho_x \ge \varepsilon(x-y)^\alpha]
\ge E\bigl[\mathbf{1}\{\Delta_0^- > y^-\}\, P[\rho_x \ge \varepsilon(x-y)^\alpha \mid \mathcal{F}_1]\bigr]
\ge \tfrac{1}{2} P[\Delta_0^- > y^-].
\]
Taking $y = -\varepsilon^{-1/\alpha} z^{1/\alpha} < 0$, we have that for any $z > 0$,
\[
P[\rho_x \ge z] \ge P[\rho_x \ge \varepsilon(x-y)^\alpha] \ge \tfrac{1}{2} P[\Delta_0^- > \varepsilon^{-1/\alpha} z^{1/\alpha}].
\]
Hence for any $\gamma > 0$,
\[
E[\rho_x^\gamma] = \int_0^\infty P[\rho_x > z^{1/\gamma}]\,dz
\ge \tfrac{1}{2}\int_0^\infty P[\Delta_0^- > \varepsilon^{-1/\alpha} z^{1/(\alpha\gamma)}]\,dz.
\]
Using the substitution $w = \varepsilon^{-\gamma} z$ we obtain
\[
E[\rho_x^\gamma] \ge \tfrac{1}{2}\varepsilon^\gamma \int_0^\infty P[\Delta_0^- > w^{1/(\alpha\gamma)}]\,dw
= \tfrac{1}{2}\varepsilon^\gamma E[(\Delta_0^-)^{\alpha\gamma}],
\]
which is infinite provided $\alpha\gamma \ge \beta$, i.e., $\gamma \ge \beta/\alpha$.

5.2.5 Moments of last exit times

Similarly to (3.51), for $x \in \mathbb{R}$ we denote the last exit time from $(-\infty, x]$ by
\[
\eta_x := \max\{n \ge 0 : X_n \le x\}.
\]
If $X_n \to +\infty$ a.s. (as under the conditions of Theorem 5.2.2), then $\eta_x < \infty$ a.s. for all $x \in \mathbb{R}$, and the moments of the random variables $\eta_x$ provide a quantitative characterization of the transience.

Theorem 5.2.13. Suppose that (H0) holds, and that (H2) and (H3) hold for $\alpha \in (0,1)$ and $\beta > \alpha$. Then for any $x \in \mathbb{R}$ and any $s \in [0, (\beta/\alpha) - 1)$, $E[\eta_x^s] < \infty$.

Theorem 5.2.14. Suppose that (H0) holds. Let $\alpha \in (0,1]$ and $\beta > \alpha$. Suppose that there exist $c > 0$, $C < \infty$, and $y_0 < \infty$ such that $E[(\Delta_n^+)^\alpha \mid X_n = x] \le C$ for all $x \in \Sigma$, and $P[\Delta_n^- > y \mid X_n = x] \ge c y^{-\beta}$ for all $y \ge y_0$ and all $x \in \Sigma$. Then for any $x \in \mathbb{R}$ and any $s > (\beta/\alpha) - 1$, $E[\eta_x^s] = \infty$.

We will again use the Lyapunov function $f_{z,\delta}$ defined at (5.3).

Lemma 5.2.15. Let $\alpha \in (0,1)$ and $\beta > \alpha$. Suppose that there exist $C < \infty$, $c > 0$, and $y_0 < \infty$ for which $E[(\Delta_n^+)^\alpha \mid X_n = x] \le C$ for all $x \in \Sigma$ and $P[\Delta_n^- \ge y \mid X_n = x] \ge c y^{-\beta}$ for all $y \ge y_0$ and all $x \in \Sigma$. Then for any $\delta > \beta - \alpha$ there exists $A > 0$ sufficiently large such that, for any $z \in \mathbb{R}$,
\[
E[f_{z,\delta}(X_{n+1}) - f_{z,\delta}(X_n) \mid X_n = x] \ge 0, \quad \text{for all } x > z + A.
\]

Proof. As in the proof of Lemma 5.2.3, it suffices to take $n = 0$ and $z = 1$. Let $\delta > 0$, and let $f_\delta := f_{1,\delta}$ be as defined at (5.3). Let $\gamma \in (0,1)$; we will specify $\delta$ and $\gamma$ later. For $x > 1$ we have
\[
f_\delta(x + \Delta_0) - f_\delta(x)
\ge \bigl[(x+\Delta_0^+)^{-\delta} - x^{-\delta}\bigr]\mathbf{1}\{\Delta_0^+ \le x^\gamma\}
+ (1 - x^{-\delta})\mathbf{1}\{\Delta_0^- \ge x\}
- x^{-\delta}\mathbf{1}\{\Delta_0^+ > x^\gamma\}. \tag{5.17}
\]

We bound the three terms on the right-hand side of (5.17). For the first term, we have by Taylor's formula, as $x \to \infty$, since $\gamma < 1$,
\[
\bigl[(x+\Delta_0^+)^{-\delta} - x^{-\delta}\bigr]\mathbf{1}\{\Delta_0^+ \le x^\gamma\}
= -\delta(1+o(1))\, x^{-1-\delta}\Delta_0^+\mathbf{1}\{\Delta_0^+ \le x^\gamma\}, \quad \text{a.s.},
\]
where, as usual, the $o(1)$ term is non-random. Similarly to (5.7), we have $\Delta_0^+ \mathbf{1}\{\Delta_0^+ \le x^\gamma\} \le (\Delta_0^+)^\alpha x^{(1-\alpha)\gamma}$, so that
\[
\bigl|(x+\Delta_0^+)^{-\delta} - x^{-\delta}\bigr|\mathbf{1}\{\Delta_0^+ \le x^\gamma\}
= O\bigl((\Delta_0^+)^\alpha x^{(1-\alpha)\gamma - 1 - \delta}\bigr).
\]
It follows that
\[
E_x\bigl[\bigl|(x+\Delta_0^+)^{-\delta} - x^{-\delta}\bigr|\mathbf{1}\{\Delta_0^+ \le x^\gamma\}\bigr]
= O\bigl(x^{(1-\alpha)\gamma-1-\delta}\, E_x[(\Delta_0^+)^\alpha]\bigr)
= O\bigl(x^{(1-\alpha)\gamma-1-\delta}\bigr). \tag{5.18}
\]

For the second term on the right-hand side of (5.17), we have that for some $A > 1$ sufficiently large and $x > A$,
\[
E_x\bigl[(1-x^{-\delta})\mathbf{1}\{\Delta_0^- \ge x\}\bigr] \ge (1/2)\, P_x[\Delta_0^- \ge x] \ge (c/2) x^{-\beta}. \tag{5.19}
\]

For the third term on the right-hand side of (5.17), we have that, by Markov's inequality,
\[
E_x\bigl[x^{-\delta}\mathbf{1}\{\Delta_0^+ > x^\gamma\}\bigr]
\le x^{-\delta} x^{-\alpha\gamma}\, E_x[(\Delta_0^+)^\alpha] = O\bigl(x^{-\delta-\alpha\gamma}\bigr). \tag{5.20}
\]

Combining (5.17) with (5.18), (5.19) and (5.20), we have that
\[
E_x[f_\delta(X_1) - f_\delta(X_0)] \ge (c/2) x^{-\beta} + O\bigl(x^{-\delta-\alpha\gamma}\bigr) + O\bigl(x^{(1-\alpha)\gamma - 1 - \delta}\bigr).
\]
The positive $x^{-\beta}$ term here dominates for $A$ large enough provided that
\[
-\beta > -\delta - \alpha\gamma \quad\text{and}\quad -\beta > (1-\alpha)\gamma - 1 - \delta.
\]
For any $\delta > \beta - \alpha$, the second inequality holds since $\alpha \in (0,1)$ and $\gamma < 1$. Given any such $\delta$, the first inequality holds provided we choose $\gamma \in \bigl(\frac{\beta-\delta}{\alpha}, 1\bigr)$.

Finally, we can complete the proofs of our results on last exit times.

Proof of Theorem 5.2.13. Fix $x \in \mathbb{R}$ and let $y > x$, to be specified later. For this proof, define the stopping time $\lambda_{y,x} := \min\{n \ge \rho_y : X_n \le x\}$, the time of reaching $(-\infty, x]$ after having first reached $[y,\infty)$. To prove our result on finiteness of moments for $\eta_x$, we prove an upper tail bound for $\eta_x$. For $y > x$, $\{\rho_y \le n\} \cap \{\lambda_{y,x} = \infty\}$ implies $\{\eta_x \le n\}$, so
\[
P[\eta_x > n] \le P[\lambda_{y,x} < \infty] + P[\rho_y > n]. \tag{5.21}
\]

We obtain an upper bound for $P[\lambda_{y,x} < \infty]$ using our usual supermartingale argument. Under the conditions of the theorem, Lemma 5.2.3 applies. It follows that for $\delta \in (0, \beta-\alpha)$, on $\{\rho_y < \infty\}$, $(f_{x-A,\delta}(X_{n\wedge\lambda_{y,x}}), n \ge \rho_y)$ is a non-negative supermartingale adapted to $(\mathcal{F}_n, n \ge \rho_y)$, and hence converges a.s. as $n \to \infty$ to a limit, $L_{y,x}$, say. Then, on $\{\rho_y < \infty\}$, by Fatou's lemma,
\[
f_{x-A,\delta}(X_{\rho_y}) \ge E[L_{y,x} \mid \mathcal{F}_{\rho_y}]
\ge E[f_{x-A,\delta}(X_{\lambda_{y,x}})\mathbf{1}\{\lambda_{y,x} < \infty\} \mid \mathcal{F}_{\rho_y}]
\ge (1+A)^{-\delta}\, P[\lambda_{y,x} < \infty \mid \mathcal{F}_{\rho_y}].
\]
By definition, on $\{\rho_y < \infty\}$, $X_{\rho_y} \ge y$, so $f_{x-A,\delta}(X_{\rho_y}) \le (1 + A + y - x)^{-\delta}$. Hence,
\[
P[\lambda_{y,x} < \infty] = E\bigl[P[\lambda_{y,x} < \infty \mid \mathcal{F}_{\rho_y}]\mathbf{1}\{\rho_y < \infty\}\bigr] = O(y^{-\delta}). \tag{5.22}
\]

For the final term in (5.21), for $y > 0$, $P[\rho_y > n] = P[\max_{0\le m\le n} X_m < y]$, where
\[
\begin{aligned}
P\Bigl[\max_{0\le m\le n} X_m < y\Bigr]
&\le P\Bigl[\max_{0\le m\le n} X_m \le y,\ \min_{0\le m\le n} X_m \ge -y\Bigr]
+ P\Bigl[\min_{0\le m\le n} X_m \le -y\Bigr] \\
&\le P\Bigl[\max_{0\le m\le n-1} \Delta_m^+ \le 2y\Bigr]
+ P\Bigl[\min_{0\le m\le n} X_m \le -y\Bigr]. \tag{5.23}
\end{aligned}
\]
We choose $y = n^{(1/\alpha)-\varepsilon}$, for $\varepsilon \in (0, 1/\alpha)$. Then we have from (5.14) that, for some $c' > 0$,
\[
P\Bigl[\max_{0\le m\le n-1} \Delta_m^+ \le 2n^{(1/\alpha)-\varepsilon}\Bigr]
= O\bigl(\exp\{-c' n^{\alpha\varepsilon}\}\bigr). \tag{5.24}
\]
On the other hand, the $\varphi = (1/\alpha) - \varepsilon$ case of Lemma 5.2.12 implies that
\[
P\Bigl[\min_{0\le m\le n} X_m \le -n^{(1/\alpha)-\varepsilon}\Bigr]
= O\bigl(n^{1-(\beta/\alpha)+(\beta+1)\varepsilon}\bigr). \tag{5.25}
\]
Using the bounds (5.24) and (5.25) in the $y = n^{(1/\alpha)-\varepsilon}$ case of (5.23), we obtain
\[
P\Bigl[\max_{0\le m\le n} X_m < n^{(1/\alpha)-\varepsilon}\Bigr]
= O\bigl(n^{1-(\beta/\alpha)+(\beta+1)\varepsilon}\bigr). \tag{5.26}
\]
Thus taking $y = n^{(1/\alpha)-\varepsilon}$ in (5.21) and $\delta$ as close as we wish to $\beta - \alpha$, and combining (5.22) with (5.26), we conclude that, for any $\varepsilon > 0$, $P[\eta_x > n] = O\bigl(n^{1-(\beta/\alpha)+\varepsilon}\bigr)$, which yields the claimed moment bounds.

Proof of Theorem 5.2.14. Fix $x \in \mathbb{R}$ and let $y > x$. For this proof, define $\lambda_{n,x} := \min\{m \ge n : X_m \le x\}$, the first time of reaching $(-\infty, x]$ after time $n$. Similarly, set $\rho_{n,y} := \min\{m \ge n : X_m \ge y\}$. We have that, for $r > 0$,
\[
P[\eta_x > n] \ge E\bigl[\mathbf{1}\{X_n \le r\}\, P[\lambda_{n,x} < \infty \mid \mathcal{F}_n]\bigr]. \tag{5.27}
\]

Under the conditions of the theorem, Lemma 5.2.15 applies. It follows that for $\delta > \beta - \alpha$, $(f_{x-A,\delta}(X_{m\wedge\lambda_{n,x}\wedge\rho_{n,y}}), m \ge n)$ is a non-negative submartingale adapted to $(\mathcal{F}_m, m \ge n)$; moreover, it is uniformly bounded and so converges a.s. and in $L^1$, as $m \to \infty$, to the limit $f_{x-A,\delta}(X_{\lambda_{n,x}\wedge\rho_{n,y}})$, since $\lambda_{n,x} \wedge \rho_{n,y} < \infty$ a.s., by (5.1), which is available since Proposition 5.2.1 applies under the conditions of the theorem. Hence, a.s.,
\[
f_{x-A,\delta}(X_n) \le E[f_{x-A,\delta}(X_{\lambda_{n,x}\wedge\rho_{n,y}}) \mid \mathcal{F}_n]
\le P[\lambda_{n,x} < \infty \mid \mathcal{F}_n] + f_{x-A,\delta}(y).
\]
Since $y$ was arbitrary, and $f_{x-A,\delta}(y) \to 0$ as $y \to \infty$, it follows that, a.s.,
\[
P[\lambda_{n,x} < \infty \mid \mathcal{F}_n] \ge f_{x-A,\delta}(X_n) \ge f_{x-A,\delta}(r),
\]
on $\{X_n \le r\}$. Hence from (5.27) we obtain, for $r \ge x$,
\[
P[\eta_x > n] \ge f_{x-A,\delta}(r)\, P[X_n \le r] \ge (1 + A + r - x)^{-\delta}\, P[X_n \le r]. \tag{5.28}
\]

It remains to obtain a lower bound for $P[X_n \le r]$, for a suitable choice of $r$. Let $Y_n = \sum_{m=0}^{n-1} \Delta_m^+$. Following the argument for (5.13), with $h(y) = y^\alpha$, $\alpha \in (0,1]$, we may apply Theorem 2.4.8 to $Z_n = Y_n^\alpha$ to obtain
\[
P\Bigl[\max_{0\le m\le n} Y_m^\alpha \ge x\Bigr] = P[Y_n \ge x^{1/\alpha}] \le C n x^{-1},
\]
for some $C < \infty$ and all $n \ge 0$, $x > 0$, which implies that
\[
P\bigl[X_n \le (2Cn)^{1/\alpha}\bigr] \ge P\bigl[Y_n \le (2Cn)^{1/\alpha}\bigr] \ge 1/2,
\]
since $X_n \le X_0 + Y_n = Y_n$. Thus, taking $r = (2Cn)^{1/\alpha}$, we have $P[X_n \le r] \ge 1/2$, and with this choice of $r$ in (5.28) we obtain $P[\eta_x > n] \ge \varepsilon n^{-\delta/\alpha}$ for some $\varepsilon > 0$ and all $n$ sufficiently large. Since $\delta > \beta - \alpha$ was arbitrary, the result follows.

5.3 Oscillating random walk

5.3.1 Introduction

We consider $(X_n; n \in \mathbb{Z}_+)$, a discrete-time, time-homogeneous Markov process on $\mathbb{R}$, with transition kernel $p(x, \mathrm{d}z) = w_x(z-x)\,\mathrm{d}z$, so that
\[
P[X_{n+1} \in A \mid X_n = x] = \int_A p(x, \mathrm{d}z) = \int_A w_x(z-x)\,\mathrm{d}z, \tag{5.29}
\]
for all Borel sets $A \subseteq \mathbb{R}$. Here the local transition densities $w_x : \mathbb{R} \to \mathbb{R}_+$ are Borel functions. We will assume one of several 'heavy-tailed' conditions for the $w_x$. We call the process $X_n$ a random walk; since $w_x$ depends on $x$, the random walk is not, in general, spatially homogeneous. We are interested in the recurrence classification of $X_n$, where we say that $X_n$ is

• recurrent if $\liminf_{n\to\infty} |X_n| = 0$, a.s.;

• transient if $\lim_{n\to\infty} |X_n| = \infty$, a.s.

To describe the tail conditions on the increments of the random walk, we define
\[
v_\alpha(y) :=
\begin{cases}
c_\alpha(y)\, y^{-1-\alpha} & \text{if } y > 0; \\
0 & \text{if } y \le 0.
\end{cases}
\tag{5.30}
\]
For now, we will assume the following.


(C0) Let $v_\alpha$ be a probability density function given by (5.30) for $\alpha > 0$, where $c_\alpha(y) = c_\alpha \in (0,\infty)$ is constant.

Eventually we may want to relax this to the following, although this will probably not be possible in the critical case.

(C1) Let $v_\alpha$ be a probability density function given by (5.30) for $\alpha > 0$, where $c_\alpha : (0,\infty) \to (0,\infty)$ is a Borel function with $\lim_{y\to\infty} c_\alpha(y) = c_\alpha \in (0,\infty)$.

The most classical case is the following.

(M0) For each $x \in \mathbb{R}$, suppose that $w_x(y) = \frac{1}{2}v_\alpha(|y|)$ for $y \in \mathbb{R}$, a symmetric density function.

In this case, $X_n$ describes a random walk with i.i.d. symmetric increments. The recurrence classification is classical: the following result is an easy special case of Theorem 5 of Shepp [260], for example.

Theorem 5.3.1. Suppose that (C0) and (M0) hold. Then the symmetric random walk is transient if and only if $\alpha < 1$.

The classical proof of Theorem 5.3.1 takes as its starting point a characteristic function approach and the sophisticated theorem of Chung and Fuchs [37] on arbitrary sums of i.i.d. random variables. It is of interest to develop alternative methods that will generalize beyond the classical i.i.d. setting.

Our main interest concerns models in which $w_x$ depends on $x$, typically only through $\operatorname{sgn} x$, the sign of $x$. Such models are known as oscillating random walks, and were studied by Kemperman [144], to whom the model was suggested in 1960 by Anatole Joffe and Peter Ney (see [144, p. 29]).

The next example, following Kemperman [144], is a continuous-space one-sided oscillating random walk:

(M1) Suppose that $\alpha, \beta > 0$. For $x, y \in \mathbb{R}$, let
\[
w_x(y) :=
\begin{cases}
v_\alpha(-y) & \text{if } x \ge 0; \\
v_\beta(y) & \text{if } x < 0.
\end{cases}
\]

In other words, the walk always jumps in the direction of (and possibly over) the origin, with tail exponent $\alpha$ from the positive half-line and exponent $\beta$ from the negative half-line. The following recurrence classification applies.


Theorem 5.3.2. Suppose that (C0) and (M1) hold. Then the one-sided oscillating random walk is transient if and only if $\alpha + \beta < 1$.

Theorem 5.3.2 was obtained in the discrete-space case by Kemperman [144, p. 21].

A special case with $\alpha = \beta$ is the antisymmetric case, in which
\[
w_x(y) = v_\alpha(-y \operatorname{sgn}(x)), \quad (x, y \in \mathbb{R}).
\]
In this case, Theorem 5.3.2 shows that the walk is transient for $\alpha < 1/2$ and recurrent for $\alpha \ge 1/2$. Rogozin and Foss [247] give some additional results on the oscillating random walk.
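The one-sided dynamics of (M1) are easy to simulate. In the sketch below (our own illustration) we replace $v_\alpha$ by the normalised Pareto density $\alpha y^{-1-\alpha}$ on $(1,\infty)$ — an assumption, since (C0) does not pin down the behaviour of $v_\alpha$ near $0$ — and check the defining property of the model: every jump is directed towards (and possibly over) the origin.

```python
import random

def pareto_sample(rng, tail_exponent):
    """Sample from the density tail_exponent * y**(-1-tail_exponent) on (1, inf),
    a heavy-tailed stand-in for v_alpha of (5.30)."""
    return (1.0 - rng.random()) ** (-1.0 / tail_exponent)

def one_sided_walk(n_steps, alpha, beta, seed=0):
    """Model (M1): from x >= 0 the increment has density v_alpha(-y) (jump left);
    from x < 0 it has density v_beta(y) (jump right)."""
    rng = random.Random(seed)
    x, traj = 0.0, [0.0]
    for _ in range(n_steps):
        if x >= 0:
            x -= pareto_sample(rng, alpha)  # jump towards the origin from the right
        else:
            x += pareto_sample(rng, beta)   # jump towards the origin from the left
        traj.append(x)
    return traj

traj = one_sided_walk(1000, alpha=0.3, beta=0.3)
# Every jump points towards the origin.
assert all((b < a) if a >= 0 else (b > a) for a, b in zip(traj, traj[1:]))
```

With $\alpha = \beta = 0.3$ here, $\alpha + \beta < 1$, so Theorem 5.3.2 predicts transience for the corresponding model.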

Kemperman [144] and Rogozin and Foss [247] obtain their results using some sophisticated classical methods for the analysis of random walks, primarily Wiener–Hopf theory together with an appeal to various deep results from renewal theory. Again, we proceed via more elementary methods that do not rely so heavily on the special structure of the classical random walk.

We describe another model in the vein of Kemperman [144], a two-sided oscillating random walk:

(M2) Suppose that $\alpha, \beta > 0$. For $x, y \in \mathbb{R}$, let
\[
w_x(y) :=
\begin{cases}
\tfrac12 v_\alpha(|y|) & \text{if } x \ge 0; \\
\tfrac12 v_\beta(|y|) & \text{if } x < 0.
\end{cases}
\]

Now the jumps of the walk are symmetric, as under (M0), but with a tail exponent depending upon which side of the origin the walk is currently on, as under (M1).

The most general recurrence classification result for the model (M2) is due to Sandrić [253]. A somewhat less general, discrete-space version was obtained by Rogozin and Foss (Theorem 2 of [247, p. 159]), building on work of Kemperman [144]. Analogous results in continuous time were given by Franke [96] and Böttcher [23].

Theorem 5.3.3. Suppose that (C0) and (M2) hold. Then the two-sided oscillating random walk is transient if and only if $\alpha + \beta < 2$.


A final model is another oscillating walk that mixes the one- and two-sided models:

(M3) Suppose that $\alpha, \beta > 0$. For $x, y \in \mathbb{R}$, let
\[
w_x(y) :=
\begin{cases}
\tfrac12 v_\alpha(|y|) & \text{if } x \ge 0; \\
v_\beta(y) & \text{if } x < 0.
\end{cases}
\]

In the discrete-space case, Theorem 2 of Rogozin and Foss [247, p. 159] gives the recurrence classification.

Theorem 5.3.4. Suppose that (C0) and (M3) hold. Then the random walk is transient if and only if $\alpha + 2\beta < 2$.

Other variations on these models, within the general setting of (5.29), have also been studied in the literature. Sandrić [252, 253] supposes that the $w_x$ satisfy, for each $x \in \mathbb{R}$, $w_x(y) \sim c(x)|y|^{-1-\alpha(x)}$ as $|y| \to \infty$, for some measurable functions $c$ and $\alpha$; he refers to this as a stable-like Markov chain. Under a uniformity condition on the $w_x$, and other mild technical conditions, Sandrić [252] obtained, via Foster–Lyapunov methods similar in spirit to those of the present work, sufficient conditions for recurrence and transience: essentially, $\liminf_{x\to\infty} \alpha(x) > 1$ is sufficient for recurrence and $\limsup_{x\to\infty} \alpha(x) < 1$ is sufficient for transience. These results can be seen as a generalization of Theorem 5.3.1. Some related results for models in continuous time (Lévy processes) are given by Schilling and Wang [255] and Sandrić [251]. Further results and an overview of the literature are provided in Sandrić's PhD thesis [250].

5.3.2 Lyapunov functions

Our proofs are based on Lyapunov functions. Our first function is roughly of the form $x \mapsto |x|^\nu$, with two modifications: one is a minor adjustment near $x = 0$ to control the case $\nu < 0$, and the other is the introduction of an extra parameter, $\lambda$, to give different weights to the two sides of the origin, which is a crucial technical tool. Precisely, for parameters $\nu \in \mathbb{R}$ and $\lambda \in \mathbb{R}_+$, we consider the function $f_\nu(\,\cdot\,;\lambda) : \mathbb{R} \to \mathbb{R}_+$ given by
\[
f_\nu(x;\lambda) :=
\begin{cases}
|x|^\nu & \text{if } x \ge 1; \\
\lambda + (1-\lambda)\frac{1+x}{2} & \text{if } |x| < 1; \\
\lambda|x|^\nu & \text{if } x \le -1.
\end{cases}
\tag{5.31}
\]
Then $x \mapsto f_\nu(x;\lambda)$ is continuous. The form given at (5.31) ensures that $f_\nu$ is bounded as a function of $x$ for $\nu < 0$. We write $f_\nu(x) := f_\nu(x;1)$ for the case $\lambda = 1$.


Note that for $\lambda > 0$ and any $x \in \mathbb{R}$,
\[
f_\nu(x;\lambda) = \lambda f_\nu(-x; 1/\lambda). \tag{5.32}
\]

We consider the process $f_\nu(X_n;\lambda)$ for appropriately chosen $\nu$ and $\lambda$. By (5.29), the one-step mean increment of $f_\nu(X_n;\lambda)$ is given by
\[
E[f_\nu(X_{n+1};\lambda) - f_\nu(X_n;\lambda) \mid X_n = x]
= \int_{-\infty}^{\infty} \bigl(f_\nu(x+y;\lambda) - f_\nu(x;\lambda)\bigr) w_x(y)\,dy =: F_\nu(x;\lambda),
\]
say.

We will also consider the function $g(\,\cdot\,;\lambda) : \mathbb{R} \to \mathbb{R}_+$ given for $\lambda \in \mathbb{R}_+$ by
\[
g(x;\lambda) := \log(1 + |x|) + \lambda\mathbf{1}\{x < 0\}. \tag{5.33}
\]
We write $g(x) := g(x;0)$ for the case $\lambda = 0$. Note that for any $x \in \mathbb{R}$,
\[
g(x;\lambda) = g(-x;\lambda) - \lambda \operatorname{sgn}(x). \tag{5.34}
\]
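Both Lyapunov functions are elementary to code; the following sketch (our own, with arbitrary test values of $\nu$ and $\lambda$) checks continuity of (5.31) at $x = \pm 1$ and the identities (5.32) and (5.34) on a grid.

```python
import math

def f_nu(x, nu, lam):
    """The weighted power function of (5.31)."""
    if x >= 1:
        return abs(x) ** nu
    if x <= -1:
        return lam * abs(x) ** nu
    return lam + (1.0 - lam) * (1.0 + x) / 2.0

def g(x, lam):
    """The weighted logarithmic function of (5.33)."""
    return math.log(1.0 + abs(x)) + (lam if x < 0 else 0.0)

nu, lam = -0.2, 0.7
# Continuity of f_nu at x = 1 and x = -1.
assert abs(f_nu(1.0, nu, lam) - f_nu(1.0 - 1e-9, nu, lam)) < 1e-6
assert abs(f_nu(-1.0, nu, lam) - f_nu(-1.0 + 1e-9, nu, lam)) < 1e-6
# Identity (5.32): f_nu(x; lam) = lam * f_nu(-x; 1/lam), and
# identity (5.34): g(x; lam) = g(-x; lam) - lam * sgn(x).
for x in [-3.0, -1.0, -0.4, 0.0, 0.4, 1.0, 3.0]:
    assert abs(f_nu(x, nu, lam) - lam * f_nu(-x, nu, 1.0 / lam)) < 1e-12
    sgn = (x > 0) - (x < 0)
    assert abs(g(x, lam) - (g(-x, lam) - lam * sgn)) < 1e-12
```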

Now, by (5.29),
\[
E[g(X_{n+1};\lambda) - g(X_n;\lambda) \mid X_n = x]
= \int_{-\infty}^{\infty} \bigl(g(x+y;\lambda) - g(x;\lambda)\bigr) w_x(y)\,dy =: G(x;\lambda),
\]
say.

5.3.3 Symmetric increments

Suppose that (M0) holds. Then, for $x \in \mathbb{R}$,
\[
F_\nu(x;\lambda) = \frac{1}{2}\int_{-\infty}^{\infty} \bigl(f_\nu(x+y;\lambda) - f_\nu(x;\lambda)\bigr) v_\alpha(|y|)\,dy
=: F^{\mathrm{sym}}_{\nu,\alpha}(x;\lambda),
\]
say. Moreover, some manipulation using (5.32) shows that
\[
F_\nu(-x;\lambda) = \frac{\lambda}{2}\int_{-\infty}^{\infty} \bigl(f_\nu(x+y;1/\lambda) - f_\nu(x;1/\lambda)\bigr) v_\alpha(|y|)\,dy.
\]
That is, (M0) implies that
\[
F_\nu(x;\lambda) =
\begin{cases}
F^{\mathrm{sym}}_{\nu,\alpha}(|x|;\lambda) & \text{if } x > 0; \\
\lambda F^{\mathrm{sym}}_{\nu,\alpha}(|x|;1/\lambda) & \text{if } x < 0.
\end{cases}
\tag{5.35}
\]


If, on the other hand, (M2) holds, then, analogously to (5.35),
\[
F_\nu(x;\lambda) =
\begin{cases}
F^{\mathrm{sym}}_{\nu,\alpha}(|x|;\lambda) & \text{if } x > 0; \\
\lambda F^{\mathrm{sym}}_{\nu,\beta}(|x|;1/\lambda) & \text{if } x < 0.
\end{cases}
\tag{5.36}
\]
Thus we study $F^{\mathrm{sym}}_{\nu,\alpha}(x;\lambda)$ for $x > 0$.

Lemma 5.3.5. Let $\alpha \in (0,1)$ and $\nu \in (-1,\alpha)$. Then, as $x \to \infty$,
\[
F^{\mathrm{sym}}_{\nu,\alpha}(x;\lambda) = c_\alpha x^{\nu-\alpha} S_{\nu,\alpha}(\lambda) + o(x^{\nu-\alpha}),
\]
where
\[
S_{\nu,\alpha}(\lambda) = \lambda\frac{\Gamma(1+\nu)\Gamma(\alpha-\nu)}{\Gamma(1+\alpha)}
- \frac{\Gamma(1-\alpha)\Gamma(\alpha-\nu)}{\alpha\Gamma(-\nu)}
- \frac{\Gamma(1+\nu)\Gamma(1-\alpha)}{\alpha\Gamma(1-\alpha+\nu)}. \tag{5.37}
\]
Moreover, suppose that for $\theta \in \mathbb{R}$, $\lambda(\theta,\nu) > 0$ is such that
\[
\lim_{\nu\to 0} \frac{1-\lambda(\theta,\nu)}{\nu} = \theta.
\]
Then, as $\nu \to 0$,
\[
S_{\nu,\alpha}(\lambda(\theta,\nu)) = -\frac{\nu}{\alpha} s(\theta,\alpha) + o(\nu), \tag{5.38}
\]
where
\[
s(\theta,\alpha) = \theta - \pi\cot\Bigl(\frac{\pi\alpha}{2}\Bigr).
\]
In particular, $s(0,\alpha) < 0$ for $\alpha \in (0,1)$.

Remark 5.3.6. To do: this should probably extend in some form to $\alpha \in (0,2)$, at least for small $\nu$. (But not any further, because $x^{\nu-\alpha}$ is then the wrong scale.)

Proof of Lemma 5.3.5. Throughout the proof we suppose that $x > 1$. By (5.31), we have that
\[
\begin{aligned}
F^{\mathrm{sym}}_{\nu,\alpha}(x;\lambda) &= \int_0^\infty \bigl((x+y)^\nu - x^\nu\bigr) v_\alpha(y)\,dy \\
&\quad + \int_0^{x-1} \bigl((x-y)^\nu - x^\nu\bigr) v_\alpha(y)\,dy \\
&\quad + \int_{x-1}^{x+1} \Bigl(\lambda + (1-\lambda)\frac{x-y+1}{2} - x^\nu\Bigr) v_\alpha(y)\,dy \\
&\quad + \int_{x+1}^{\infty} \bigl(\lambda(y-x)^\nu - x^\nu\bigr) v_\alpha(y)\,dy. \tag{5.39}
\end{aligned}
\]
Here
\[
\Bigl|\int_{x-1}^{x+1}\Bigl(\lambda + (1-\lambda)\frac{x-y+1}{2} - x^\nu\Bigr)v_\alpha(y)\,dy\Bigr|
\le \int_{x-1}^{x+1}\bigl(|\lambda| + |1-\lambda| + x^\nu\bigr) c_\alpha (x-1)^{-1-\alpha}\,dy
= O(x^{\nu-1-\alpha}) = o(x^{\nu-\alpha}).
\]
Also, after the substitution $u = 1 - (y/x)$, we see that
\[
\int_{x-1}^{x} \bigl((x-y)^\nu - x^\nu\bigr) v_\alpha(y)\,dy
= c_\alpha x^{\nu-\alpha}\int_0^{1/x} (u^\nu - 1)(1-u)^{-1-\alpha}\,du.
\]
Here, if $\nu > 0$,
\[
\Bigl|\int_0^{1/x}(u^\nu - 1)(1-u)^{-1-\alpha}\,du\Bigr| \le \int_0^{1/x} \,du \cdot O(1) = O(1/x),
\]
while if $\nu \in (-1,0)$,
\[
\Bigl|\int_0^{1/x}(u^\nu - 1)(1-u)^{-1-\alpha}\,du\Bigr| \le \int_0^{1/x} u^\nu\,du \cdot O(1) = O(x^{-1-\nu}).
\]
It follows that, for any $\nu > -1$,
\[
\Bigl|\int_{x-1}^{x}\bigl((x-y)^\nu - x^\nu\bigr)v_\alpha(y)\,dy\Bigr| = o(x^{\nu-\alpha}).
\]
Similarly,
\[
\Bigl|\int_{x}^{x+1}\bigl(\lambda(y-x)^\nu - x^\nu\bigr)v_\alpha(y)\,dy\Bigr| = o(x^{\nu-\alpha}).
\]
Hence, with $I_1$, $I_2$, and $I_3$ as defined by (5.63), (5.64), and (5.65) respectively,
\[
F^{\mathrm{sym}}_{\nu,\alpha}(x;\lambda)
= c_\alpha I^{\nu,\alpha}_1(x) + c_\alpha I^{\nu,\alpha}_2(x) + c_\alpha I^{\nu,\alpha}_3(x;\lambda) + o(x^{\nu-\alpha}),
\]
from which the first part of the lemma follows, with (5.67), (5.68), and (5.66).


Applying the asymptotic formulae (5.56) and (5.57) to the $\nu$-dependent terms in (5.37), we obtain, as $\nu \to 0$, uniformly for bounded $\lambda$,
\[
S_{\nu,\alpha}(\lambda) = \frac{1}{\alpha}\Bigl(\lambda - 1 - \nu\bigl(\lambda\psi(\alpha) - \psi(1-\alpha) - \Gamma(\alpha)\Gamma(1-\alpha)\bigr) + O(\nu^2)\Bigr).
\]
In particular, if $\lambda = \lambda(\theta,\nu) = 1 - \theta\nu + o(\nu)$ we obtain
\[
S_{\nu,\alpha}(\lambda(\theta,\nu)) = -\frac{\nu}{\alpha}\bigl(\theta + \psi(\alpha) - \psi(1-\alpha) - \Gamma(\alpha)\Gamma(1-\alpha)\bigr) + o(\nu).
\]
Then, applying the two reflection formulae (5.55) and (5.58), we obtain the given form for (5.38), where
\[
s(\theta,\alpha) = \theta - \pi\bigl(\cot\pi\alpha + \operatorname{cosec}\pi\alpha\bigr),
\]
which, with the half-angle formula $\cot(\theta/2) = \cot\theta + \operatorname{cosec}\theta$, gives the stated form for $s(\theta,\alpha)$.
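The half-angle identity used in the last step, $\cot(\theta/2) = \cot\theta + \operatorname{cosec}\theta$, together with the sign claim $s(0,\alpha) < 0$ for $\alpha \in (0,1)$, is readily checked numerically (a sketch of our own, not part of the proof):

```python
import math

def cot(t):
    return math.cos(t) / math.sin(t)

# Half-angle identity cot(t/2) = cot(t) + cosec(t) for t in (0, pi).
for t in [0.1, 0.5, 1.0, 2.0, 3.0]:
    assert abs(cot(t / 2.0) - (cot(t) + 1.0 / math.sin(t))) < 1e-10

def s(theta, alpha):
    """s(theta, alpha) = theta - pi * cot(pi*alpha/2), as in Lemma 5.3.5."""
    return theta - math.pi * cot(math.pi * alpha / 2.0)

# s(0, alpha) < 0 for alpha in (0, 1), as stated in the lemma.
assert all(s(0.0, a) < 0.0 for a in [0.1, 0.3, 0.5, 0.7, 0.9])
```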

Now, under (M0), we have
\[
G(x;\lambda) = \frac{1}{2}\int_{-\infty}^{\infty} \bigl(g(x+y;\lambda) - g(x;\lambda)\bigr) v_\alpha(|y|)\,dy =: G^{\mathrm{sym}}_\alpha(x;\lambda).
\]
We use the notation $G^{\mathrm{sym}}_\alpha(x) := G^{\mathrm{sym}}_\alpha(x;0)$. Then, for $x \in \mathbb{R}$,
\[
G^{\mathrm{sym}}_\alpha(x;\lambda) = G^{\mathrm{sym}}_\alpha(x) - \lambda\mathbf{1}\{x < 0\} + \frac{\lambda}{2}\int_x^\infty v_\alpha(|y|)\,dy.
\]
It follows that (M0) implies
\[
G(x;\lambda) =
\begin{cases}
G^{\mathrm{sym}}_\alpha(|x|) + \frac{\lambda}{2}T_\alpha(|x|) & \text{if } x > 0; \\
G^{\mathrm{sym}}_\alpha(|x|) - \frac{\lambda}{2}T_\alpha(|x|) & \text{if } x < 0,
\end{cases}
\tag{5.40}
\]
where, for $x > 0$, $T_\alpha(x) := \int_x^\infty v_\alpha(y)\,dy$.

Similarly, under (M2),
\[
G(x;\lambda) =
\begin{cases}
G^{\mathrm{sym}}_\alpha(|x|) + \frac{\lambda}{2}T_\alpha(|x|) & \text{if } x > 0; \\
G^{\mathrm{sym}}_\beta(|x|) - \frac{\lambda}{2}T_\beta(|x|) & \text{if } x < 0.
\end{cases}
\tag{5.41}
\]
So we consider $G^{\mathrm{sym}}_\alpha(x)$ for $x > 0$.

Lemma 5.3.7. Suppose that $\alpha \in (0,2)$. Then, as $x \to \infty$,
\[
G^{\mathrm{sym}}_\alpha(x) = c_\alpha x^{-\alpha}\,\frac{\pi}{2\alpha}\cot\Bigl(\frac{\pi\alpha}{2}\Bigr) + O\bigl(x^{-1-\alpha}\log x\bigr).
\]


Proof. A calculation shows that, for x > 0,

Gsymα (x) =

1

2

∫ ∞x

log

(1 + y − x

1 + x

)vα(y)dy

+1

2

∫ ∞x

log

(1 + x+ y

1 + x

)vα(y)dy

+1

2

∫ x

0

[log

(1 + x+ y

1 + x

)+ log

(1 + x− y

1 + x

)]vα(y)dy. (5.42)

The first integral on the right-hand side of (5.42) is

∫ ∞x

log

(2 + y

1 + x− 1

)y−1−αdy = cα(1+x)

∫ ∞2+x1+x

((1+x)u+2)−1−α log(u−1)du,

after the substitution u = y/(1 + x). First note that∣∣∣∣∣∫ ∞

2+x1+x

((1 + x)u+ 2)−1−α log(u− 1)du−∫ ∞

1((1 + x)u+ 2)−1−α log(u− 1)du

∣∣∣∣∣≤ (3 + x)−1−α

∫ 11+x

0log udu

= O(x−2−α log x).

So we have

∫ ∞x

log

(2 + y

1 + x− 1

)y−1−αdy = cα(1 + x)

∫ ∞1

((1 + x)u+ 2)−1−α log(u− 1)du

+O(x−1−α log x).

In the last integrand in the previous display, u ≥ 1, so (1 +x)u ≤ (1 +x)u+2 ≤ (3 +x)u, so bounding the given integral above and below, for α > 0, wehave∫ ∞

1((1 + x)u+ 2)−1−α log(u− 1)du = x−1−α

∫ ∞1

u−1−α log(u− 1)du+O(x−2−α)

= −x−1−α 1

α(γ + ψ(α)) +O(x−2−α),

by the Jα1 integral from Lemma 5.3.17.The second integral on the right-hand side of (5.42) is

\[
c_\alpha \int_x^{\infty} \log\Bigl(1+\frac{y}{1+x}\Bigr) y^{-1-\alpha}\,dy = c_\alpha(1+x)^{-\alpha}\int_{x/(1+x)}^{\infty} u^{-1-\alpha}\log(1+u)\,du,
\]
after the substitution u = y/(1 + x). Here, for α > 0, similarly to above,
\[
\int_{x/(1+x)}^{\infty} u^{-1-\alpha}\log(1+u)\,du = \int_1^{\infty} u^{-1-\alpha}\log(1+u)\,du + O(x^{-1}) = \frac{1}{\alpha}\Bigl(\psi(\alpha) - \psi\bigl(\tfrac{\alpha}{2}\bigr)\Bigr) + O(x^{-1}),
\]

by the J^α_2 integral from Lemma 5.3.17.

Finally, some algebra shows that the third integral on the right-hand side of (5.42) is
\[
c_\alpha \int_0^{x} \log\Bigl(1-\frac{y^2}{(1+x)^2}\Bigr) y^{-1-\alpha}\,dy = \frac{1}{2}\,c_\alpha(1+x)^{-\alpha}\int_0^{x^2/(1+x)^2} u^{-1-\alpha/2}\log(1-u)\,du,
\]
after the substitution u = y²/(1+x)². Here, for α ∈ (0, 2),
\[
\int_0^{x^2/(1+x)^2} u^{-1-\alpha/2}\log(1-u)\,du = \int_0^{1} u^{-1-\alpha/2}\log(1-u)\,du + O(x^{-1}\log x) = \frac{2}{\alpha}\Bigl(\gamma + \psi\bigl(1-\tfrac{\alpha}{2}\bigr)\Bigr) + O(x^{-1}\log x),
\]

by the J^{α/2}_4 integral from Lemma 5.3.17.

Combining these computations and summing in (5.42), we conclude that for α ∈ (0, 2) and x > 0,
\[
G^{\mathrm{sym}}_{\alpha}(x) = c_\alpha x^{-\alpha}\,\frac{1}{2\alpha}\Bigl(\psi\bigl(1-\tfrac{\alpha}{2}\bigr) - \psi\bigl(\tfrac{\alpha}{2}\bigr)\Bigr) + O(x^{-1-\alpha}\log x),
\]
and the statement in the lemma follows by the reflection formula (5.58).

Theorem 5.3.8. Suppose that (M2) holds for α, β ∈ (0, 2). Then Xn is recurrent if α + β > 2.

Proof. Lemma 5.3.7, together with (5.41), shows that, as x → ∞,
\[
G(x;\lambda) = \frac{c_\alpha}{2\alpha}\Bigl(\pi\cot\Bigl(\frac{\pi\alpha}{2}\Bigr) + \lambda\Bigr)|x|^{-\alpha} + o(|x|^{-\alpha});
\]
\[
G(-x;\lambda) = \frac{c_\beta}{2\beta}\Bigl(\pi\cot\Bigl(\frac{\pi\beta}{2}\Bigr) - \lambda\Bigr)|x|^{-\beta} + o(|x|^{-\beta}).
\]
We aim to choose λ > 0 such that G(x;λ) < 0 for all x outside some bounded set, which requires
\[
\pi\cot\Bigl(\frac{\pi\alpha}{2}\Bigr) + \lambda < 0, \quad\text{and}\quad \pi\cot\Bigl(\frac{\pi\beta}{2}\Bigr) - \lambda < 0.
\]
Suppose α + β > 2. Then at least one of α, β exceeds 1. Without loss of generality, take α > 1. Then the desired λ > 0 exists provided cot(πα/2) + cot(πβ/2) < 0, or, equivalently, α + β > 2, and then recurrence follows.


5.3.4 One-sided increments

Suppose that (M1) holds. Let x > 0. Then some manipulation using (5.32) shows that
\[
F_\nu(x;\lambda) = \int_0^{\infty} \bigl( f_\nu(x-y;\lambda) - f_\nu(x;\lambda) \bigr) v_\alpha(y)\,dy =: F^{\mathrm{one}}_{\nu,\alpha}(x;\lambda), \quad\text{and}
\]
\[
F_\nu(-x;\lambda) = \lambda \int_0^{\infty} \bigl( f_\nu(x-y;1/\lambda) - f_\nu(x;1/\lambda) \bigr) v_\beta(y)\,dy.
\]
In other words, (M1) implies that
\[
F_\nu(x;\lambda) = \begin{cases} F^{\mathrm{one}}_{\nu,\alpha}(|x|;\lambda) & \text{if } x > 0, \\[2pt] \lambda F^{\mathrm{one}}_{\nu,\beta}(|x|;1/\lambda) & \text{if } x < 0. \end{cases} \tag{5.43}
\]
Thus we study F^one_{ν,α}(x;λ) for x > 0.

Lemma 5.3.9. Let α ∈ (0, 1), ν ∈ (−1, α), and λ ∈ R+. Then, as x → +∞,
\[
F^{\mathrm{one}}_{\nu,\alpha}(x;\lambda) = c_\alpha x^{\nu-\alpha} R_{\nu,\alpha}(\lambda) + o(x^{\nu-\alpha}),
\]
where
\[
R_{\nu,\alpha}(\lambda) = \lambda\,\frac{\Gamma(1+\nu)\Gamma(\alpha-\nu)}{\Gamma(1+\alpha)} - \frac{\Gamma(1+\nu)\Gamma(1-\alpha)}{\alpha\,\Gamma(1-\alpha+\nu)}. \tag{5.44}
\]
Moreover, suppose that for θ ∈ R, λ(θ, ν) > 0 is such that
\[
\lim_{\nu\to 0} \frac{1 - \lambda(\theta,\nu)}{\nu} = \theta.
\]
Then, as ν → 0,
\[
R_{\nu,\alpha}(\lambda(\theta,\nu)) = -\frac{\nu}{\alpha}\, r(\theta,\alpha) + o(\nu), \tag{5.45}
\]
where
\[
r(\theta,\alpha) = \theta - \pi\cot\pi\alpha.
\]
In particular, r(0, α) < 0 for α ∈ (0, 1/2) but r(0, α) > 0 for α ∈ (1/2, 1).

Proof. The proof is similar to that of Lemma 5.3.5, but with the first term on the right-hand side of (5.39) absent in the analogous equation for F^one_{ν,α}. In this case we obtain
\[
F^{\mathrm{one}}_{\nu,\alpha}(x;\lambda) = c_\alpha I^{\nu,\alpha}_2(x) + c_\alpha I^{\nu,\alpha}_3(x;\lambda) + o(x^{\nu-\alpha}),
\]
from which the first part of the lemma follows, with (5.68) and (5.66).

Applying the asymptotic formula (5.57) to the ν-dependent terms in (5.44), we obtain
\[
R_{\nu,\alpha}(\lambda) = \frac{1}{\alpha}\Bigl( \lambda - 1 - \nu\bigl(\lambda\psi(\alpha) - \psi(1-\alpha)\bigr) + O(\nu^2) \Bigr),
\]
as ν → 0, uniformly for |λ| bounded away from infinity. In particular, if λ = λ(θ, ν) = 1 − θν + o(ν) we obtain
\[
R_{\nu,\alpha}(\lambda(\theta,\nu)) = -\frac{\nu}{\alpha}\bigl( \theta + \psi(\alpha) - \psi(1-\alpha) \bigr) + o(\nu),
\]
and then (5.45) follows from the digamma reflection formula (5.58).

The next result will be the key to proving recurrence or transience for this model (apart from the critical boundary case, which we deal with later using a logarithmic function).

Lemma 5.3.10. Suppose that α, β ∈ (0, 1).

(i) If α + β < 1, then there exist ν0 < 0 and λ0 > 0 such that
\[
R_{\nu_0,\alpha}(\lambda_0) < 0, \quad\text{and}\quad \lambda_0 R_{\nu_0,\beta}(1/\lambda_0) < 0.
\]
(ii) If α + β > 1, then there exist ν0 > 0 and λ0 > 0 such that
\[
R_{\nu_0,\alpha}(\lambda_0) < 0, \quad\text{and}\quad \lambda_0 R_{\nu_0,\beta}(1/\lambda_0) < 0.
\]

Proof. First we prove part (i). We take λ = λ(θ, ν) = 1 − θν, so that 1/λ = 1 + θν + o(ν). Then we have from (5.45) that, as ν → 0,
\[
R_{\nu,\alpha}(\lambda) \sim -\frac{\nu}{\alpha}\bigl( \theta - \pi\cot\pi\alpha \bigr); \qquad \lambda R_{\nu,\beta}(1/\lambda) \sim -\frac{\nu}{\beta}\bigl( -\theta - \pi\cot\pi\beta \bigr).
\]
We can thus choose some ν0 < 0 so that the claim in the lemma will follow provided we can choose θ (and hence λ0 = λ(θ, ν0) > 0) for which
\[
\theta - \pi\cot\pi\alpha < 0, \quad\text{and}\quad -\theta - \pi\cot\pi\beta < 0.
\]
Such a θ exists provided cot πα + cot πβ > 0. Some algebra shows that
\[
\cot\pi\alpha + \cot\pi\beta = \frac{\sin\pi(\alpha+\beta)}{\sin\pi\alpha \cdot \sin\pi\beta};
\]
both terms in the denominator here are strictly positive, since α, β ∈ (0, 1), so the sign is determined by the numerator, sin π(α + β), which is strictly positive provided α + β < 1. This gives (i). The argument for (ii) is analogous.

The next result will be used to study the case when the random walk has a drift towards the origin on at least one side. For n ∈ Z+, let ∆n = X_{n+1} − Xn.

Lemma 5.3.11. Suppose that there exists p > 1 for which
\[
\limsup_{x\to\infty} \mathbb{E}\bigl[ |\Delta_n|^p \mid X_n = x \bigr] < \infty. \tag{5.46}
\]
Suppose also that
\[
\limsup_{x\to\infty} \mathbb{E}\bigl[ \Delta_n \mid X_n = x \bigr] < 0. \tag{5.47}
\]
Then for any ν ∈ (0, 1) and any λ > 0, there is x0 < ∞ for which
\[
\mathbb{E}\bigl[ f_\nu(X_{n+1};\lambda) - f_\nu(X_n;\lambda) \mid X_n = x \bigr] < 0, \tag{5.48}
\]
for all x ≥ x0.

Proof. Fix ν ∈ (0, 1) and λ > 0. Let γ ∈ (0, 1). For any x > 0, we have
\[
\mathbb{E}[ f_\nu(X_{n+1};\lambda) - f_\nu(X_n;\lambda) \mid X_n = x ] = \mathbb{E}\bigl[ ( f_\nu(x+\Delta_n;\lambda) - f_\nu(x;\lambda) ) \mathbf{1}\{ |\Delta_n| > x^\gamma \} \mid X_n = x \bigr] + \mathbb{E}\bigl[ ( f_\nu(x+\Delta_n;\lambda) - f_\nu(x;\lambda) ) \mathbf{1}\{ |\Delta_n| \leq x^\gamma \} \mid X_n = x \bigr].
\]
Here we have from the triangle inequality that, for any x > 0,
\[
\bigl| \mathbb{E}[ ( f_\nu(x+\Delta_n;\lambda) - f_\nu(x;\lambda) ) \mathbf{1}\{ |\Delta_n| > x^\gamma \} \mid X_n = x ] \bigr| \leq C(\nu,\lambda)\, \mathbb{E}\bigl[ ( |\Delta_n|^\nu + x^\nu ) \mathbf{1}\{ |\Delta_n| > x^\gamma \} \mid X_n = x \bigr],
\]
for some finite constant C(ν, λ) depending only on ν and λ. The last expression is in turn bounded for all x sufficiently large by a constant times
\[
\mathbb{E}\bigl[ |\Delta_n|^{\nu/\gamma} \mathbf{1}\{ |\Delta_n| > x^\gamma \} \mid X_n = x \bigr] \leq \mathbb{E}\bigl[ |\Delta_n|^p\, |\Delta_n|^{(\nu/\gamma)-p} \mathbf{1}\{ |\Delta_n| > x^\gamma \} \mid X_n = x \bigr] \leq \mathbb{E}[ |\Delta_n|^p \mid X_n = x ]\, x^{\nu-\gamma p},
\]
provided ν < γp. In particular, since (5.46) holds for p > 1, we may take γ ∈ (1/p, 1) (and we fix this choice of γ for the remainder of the proof). Then, since ν < 1, as x → ∞,
\[
\bigl| \mathbb{E}[ ( f_\nu(x+\Delta_n;\lambda) - f_\nu(x;\lambda) ) \mathbf{1}\{ |\Delta_n| > x^\gamma \} \mid X_n = x ] \bigr| = o(x^{\nu-1}).
\]
On the other hand, if Xn = x > 1 the event {|∆n| ≤ x^γ} implies that X_{n+1} > 0, and then
\[
\mathbb{E}\bigl[ ( f_\nu(x+\Delta_n;\lambda) - f_\nu(x;\lambda) ) \mathbf{1}\{ |\Delta_n| \leq x^\gamma \} \mid X_n = x \bigr] = \mathbb{E}\bigl[ x^\nu \bigl( (1 + x^{-1}\Delta_n)^\nu - 1 \bigr) \mathbf{1}\{ |\Delta_n| \leq x^\gamma \} \mid X_n = x \bigr] = \nu x^{\nu-1} (1+o(1))\, \mathbb{E}\bigl[ \Delta_n \mathbf{1}\{ |\Delta_n| \leq x^\gamma \} \mid X_n = x \bigr],
\]
by an application of Taylor's formula. But
\[
\bigl| \mathbb{E}[ \Delta_n \mathbf{1}\{ |\Delta_n| > x^\gamma \} \mid X_n = x ] \bigr| \leq \mathbb{E}\bigl[ |\Delta_n|^p\, |\Delta_n|^{1-p} \mathbf{1}\{ |\Delta_n| > x^\gamma \} \mid X_n = x \bigr] \leq \mathbb{E}[ |\Delta_n|^p \mid X_n = x ]\, x^{-\gamma(p-1)} = o(1),
\]
since p > 1. So combining these results we obtain
\[
\mathbb{E}\bigl[ f_\nu(X_{n+1};\lambda) - f_\nu(X_n;\lambda) \mid X_n = x \bigr] = \nu x^{\nu-1}\bigl( (1+o(1))\, \mathbb{E}[\Delta_n \mid X_n = x] + o(1) \bigr).
\]
The right-hand side of the last display is negative for all x large enough, by the assumption (5.47).

Theorem 5.3.12. (a) Suppose that α, β ∈ (0, 1). Then the random walk is transient if α + β < 1 and recurrent if α + β > 1.

(b) Suppose that there exists p > 1 for which (5.46) holds. Suppose also that at least one of the conditions
\[
\limsup_{x\to\infty} \mathbb{E}[\Delta_n \mid X_n = x] < 0, \quad\text{or}\quad \liminf_{x\to-\infty} \mathbb{E}[\Delta_n \mid X_n = x] > 0,
\]
holds. Then the random walk is recurrent.

Proof. First we prove part (a). We use Lemma 5.3.9 with Lemma 5.3.10. If α + β < 1, these results show that we can choose ν0 < 0 and λ0 > 0 such that, for all sufficiently large x,
\[
F^{\mathrm{one}}_{\nu_0,\alpha}(x;\lambda_0) < 0, \quad\text{and}\quad \lambda_0 F^{\mathrm{one}}_{\nu_0,\beta}(x;1/\lambda_0) < 0.
\]
Hence, by (5.43), F_{ν0}(x;λ0) < 0 for all x outside some bounded set. Then transience follows from an appropriate Foster–Lyapunov criterion. Similarly, if α + β > 1, we can choose ν0 > 0 and λ0 > 0 such that F_{ν0}(x;λ0) < 0 for all x outside some bounded set, and then recurrence follows.

Next we prove part (b). By replacing Xn with −Xn, it suffices to suppose that (5.47) holds. We choose the Lyapunov function f_ν( · ;λ), where ν > 0. Lemma 5.3.11 shows that (5.48) holds for all x > 0 large enough. Moreover, for x < 0, the relevant expression is
\[
F_\nu(x;\lambda) = \lambda F^{\mathrm{one}}_{\nu,\beta}(|x|;1/\lambda) = c_\beta |x|^{\nu-\beta}\Bigl( \frac{\Gamma(1+\nu)\Gamma(\beta-\nu)}{\Gamma(1+\beta)} - \lambda\,\frac{\Gamma(1+\nu)\Gamma(1-\beta)}{\beta\,\Gamma(1-\beta+\nu)} \Bigr) + o(|x|^{\nu-\beta}),
\]
which is negative for all x with |x| sufficiently large, having chosen λ > 0 sufficiently large. Hence we may choose λ such that (5.48) holds for all x outside some bounded set. Recurrence follows.

Now, under (M1), we have, for x > 0,
\[
G(x;\lambda) = \int_0^{\infty} \bigl( g(x-y;\lambda) - g(x;\lambda) \bigr) v_\alpha(y)\,dy =: G^{\mathrm{one}}_{\alpha}(x;\lambda).
\]
We use the notation G^one_α(x) := G^one_α(x; 0). Then, for x > 0,
\[
G^{\mathrm{one}}_{\alpha}(x;\lambda) = G^{\mathrm{one}}_{\alpha}(x) + \lambda \int_x^{\infty} v_\alpha(y)\,dy.
\]
Also, for x > 0, using (5.34),
\[
G(-x;\lambda) = \int_0^{\infty} \bigl( g(-x+y;\lambda) - g(-x;\lambda) \bigr) v_\beta(y)\,dy = G^{\mathrm{one}}_{\beta}(x) - \lambda \int_x^{\infty} v_\beta(y)\,dy.
\]
It follows that (M1) implies
\[
G(x;\lambda) = \begin{cases} G^{\mathrm{one}}_{\alpha}(|x|) + \lambda\, T_\alpha(|x|) & \text{if } x > 0, \\[2pt] G^{\mathrm{one}}_{\beta}(|x|) - \lambda\, T_\beta(|x|) & \text{if } x < 0. \end{cases} \tag{5.49}
\]
So we consider G^one_α(x) for x > 0.


Lemma 5.3.13. Suppose that α ∈ (0, 1). Then, as x → ∞,
\[
G^{\mathrm{one}}_{\alpha}(x) = c_\alpha x^{-\alpha}\,\frac{\pi}{\alpha}\cot(\pi\alpha) + O(x^{-1-\alpha}\log x).
\]

Proof. We have that
\[
G^{\mathrm{one}}_{\alpha}(x) = \int_0^{x} \log\Bigl(\frac{1+x-y}{1+x}\Bigr) v_\alpha(y)\,dy + \int_x^{\infty} \log\Bigl(\frac{1+y-x}{1+x}\Bigr) v_\alpha(y)\,dy. \tag{5.50}
\]
The final integral in (5.50) was evaluated in the proof of Lemma 5.3.7. The first integral in (5.50), after the substitution u = y/(1 + x), becomes
\[
c_\alpha(1+x)^{-\alpha} \int_0^{x/(1+x)} u^{-1-\alpha}\log(1-u)\,du = c_\alpha x^{-\alpha} \int_0^{1} u^{-1-\alpha}\log(1-u)\,du + O(x^{-1-\alpha}\log x) = c_\alpha x^{-\alpha}\,\frac{1}{\alpha}\bigl(\gamma + \psi(1-\alpha)\bigr) + O(x^{-1-\alpha}\log x),
\]
provided α ∈ (0, 1), by the J^α_4 integral from Lemma 5.3.17. Combining the two evaluations and using (5.58) gives the result.

5.3.5 Complexes of half-lines

We will now define a Markov chain (Xn, ξn) ∈ R+ × S, where S := S^sym ∪ S^one for finite disjoint sets S^sym and S^one. As parameters we have an irreducible stochastic matrix labelled by S, P = (p(i, j); i, j ∈ S), and a list of exponents (α_k; k ∈ S), where α_k > 0 for all k.

For k ∈ S^sym, we write θ ∼ Θ(k) to mean that θ is a random variable distributed symmetrically on R with the density (1/2) v_{α_k}(|y|), y ∈ R.

For k ∈ S^one, we write θ ∼ Θ(k) to mean that θ is a random variable distributed on (−∞, 0] with the density v_{α_k}(|y|), y ≤ 0.

For k ∈ S, we write ζ ∼ p(k, · ) to mean that ζ is an S-valued random variable with P[ζ = j] = p(k, j).

Now we can define the Markov chain. Given (Xn, ξn) = (x, k), let θ_{n+1} ∼ Θ(k) and ζ_{n+1} ∼ p(k, · ). Then:

• If x + θ_{n+1} ≥ 0, set (X_{n+1}, ξ_{n+1}) = (x + θ_{n+1}, k).

• If x + θ_{n+1} < 0, set (X_{n+1}, ξ_{n+1}) = (|x + θ_{n+1}|, ζ_{n+1}).

In words, the walk takes a Θ(ξn)-distributed step. If this step would take the walk below zero, it reflects, but switches onto 'line' ζ_{n+1}.
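The dynamics just described are easy to simulate. The sketch below (Python; all names are ours, and the Pareto step magnitude is only an illustrative stand-in for the densities v_{αk}, whose exact form is fixed earlier in the section) implements one trajectory of (Xn, ξn).

```python
import random

def step(alpha, one_sided, rng):
    mag = (1.0 - rng.random()) ** (-1.0 / alpha)   # Pareto(alpha) magnitude on [1, inf)
    if one_sided:
        return -mag                                # Theta(k) on (-inf, 0] for k in S^one
    return mag if rng.random() < 0.5 else -mag     # symmetric Theta(k) for k in S^sym

def simulate(n, P, alphas, one_sided, rng):
    lines = list(P.keys())
    x, k = 1.0, lines[0]
    for _ in range(n):
        y = x + step(alphas[k], one_sided[k], rng)
        if y >= 0:
            x = y                                  # stay on the current half-line
        else:
            x = -y                                 # reflect at the origin ...
            r = rng.random()
            for j in lines:                        # ... and switch line with prob p(k, j)
                r -= P[k][j]
                if r < 0:
                    k = j
                    break
            else:
                k = lines[-1]                      # guard against round-off in the row sum

    return x, k

# two lines: 's' symmetric, 'o' one-sided
P = {'s': {'s': 0.7, 'o': 0.3}, 'o': {'s': 1.0, 'o': 0.0}}
x, k = simulate(500, P, {'s': 1.2, 'o': 0.6}, {'s': False, 'o': True}, random.Random(11))
```

Note that the line label changes only at reflection steps, which is why the switching probability T_k(x) enters the drift computation below.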


We now describe our Lyapunov functions. For a vector of non-negative real numbers λ = (λ_k; k ∈ S), x ∈ R+, and k ∈ S, set
\[
g(x, k; \lambda) := \log(1+x) + \lambda_k.
\]
Also write g(x) := log(1 + x), and set χ_k = 1 if k ∈ S^one and χ_k = 1/2 if k ∈ S^sym.

Lemma 5.3.14. For k ∈ S, as x → ∞,
\[
\mathbb{E}\bigl[ g(X_{n+1},\xi_{n+1};\lambda) - g(X_n,\xi_n;\lambda) \mid (X_n,\xi_n) = (x,k) \bigr] = \frac{c_{\alpha_k}}{\alpha_k}\,\chi_k \sum_{j\in S} p(k,j)\Bigl( (\lambda_j - \lambda_k) + \pi\cot(\chi_k \pi \alpha_k) \Bigr) x^{-\alpha_k} + o(x^{-\alpha_k}). \tag{5.51}
\]

Proof. Now
\[
\mathbb{E}\bigl[ g(X_{n+1},\xi_{n+1};\lambda) - g(X_n,\xi_n;\lambda) \mid (X_n,\xi_n) = (x,k) \bigr] = \sum_{j\in S} p(k,j)(\lambda_j - \lambda_k)\, T_k(x) + \mathbb{E}\bigl[ g(|x+\theta_{n+1}|) - g(x) \mid \xi_n = k \bigr],
\]
where
\[
T_k(x) = \mathbb{P}[\theta_{n+1} < -x \mid \xi_n = k] = \chi_k \int_x^{\infty} v_{\alpha_k}(y)\,dy.
\]
In particular, if k ∈ S^one,
\[
\mathbb{E}\bigl[ g(X_{n+1},\xi_{n+1};\lambda) - g(X_n,\xi_n;\lambda) \mid (X_n,\xi_n) = (x,k) \bigr] = \sum_{j\in S} p(k,j)(\lambda_j - \lambda_k)\, T_k(x) + G^{\mathrm{one}}_{\alpha_k}(x),
\]
while, if k ∈ S^sym,
\[
\mathbb{E}\bigl[ g(X_{n+1},\xi_{n+1};\lambda) - g(X_n,\xi_n;\lambda) \mid (X_n,\xi_n) = (x,k) \bigr] = \sum_{j\in S} p(k,j)(\lambda_j - \lambda_k)\, T_k(x) + G^{\mathrm{sym}}_{\alpha_k}(x).
\]
Then (5.51) follows, by Lemmas 5.3.7 and 5.3.13.

Theorem 5.3.15. Let P be an irreducible stochastic matrix with associated stationary probability distribution (μ_k; k ∈ S). The Markov chain is recurrent if
\[
\sum_{k\in S} \mu_k \cot(\chi_k \pi \alpha_k) < 0.
\]


Proof. By Lemma 5.3.14 we will have
\[
\mathbb{E}\bigl[ g(X_{n+1},\xi_{n+1};\lambda) - g(X_n,\xi_n;\lambda) \mid (X_n,\xi_n) = (x,k) \bigr] < 0, \quad\text{for all } x \text{ sufficiently large}, \tag{5.52}
\]
for all k, if we can choose λ such that
\[
\sum_{j\in S} p(k,j)\lambda_j - \lambda_k + a_k = -\varepsilon < 0,
\]
for all k, where a_k = π cot(χ_k π α_k). In other words, we require that
\[
\sum_{j\in S} p(k,j)\lambda_j - \lambda_k + b_k = 0, \tag{5.53}
\]
where b_k = a_k + ε for all k.

The irreducible stochastic matrix p(i, j) is associated to a positive invariant probability distribution (μ_k; k ∈ S) satisfying
\[
\sum_{j\in S} \mu_j\, p(j,k) - \mu_k = 0, \quad\text{for all } k, \tag{5.54}
\]
which is the homogeneous system adjoint to (5.53). Then a standard result from linear algebra (cf. pp. 34–35 of [84]) shows that (5.53) admits a solution if and only if the vector (b_k; k ∈ S) is orthogonal to every solution of (5.54), namely every scalar multiple of (μ_k; k ∈ S); in other words, a solution λ exists when Σ_{j∈S} μ_j b_j = 0, or, equivalently, Σ_{j∈S} μ_j a_j = −ε < 0. Since (5.53) is translation invariant in λ, we may suppose that λ_k > 0 for all k ∈ S.

Thus, under the condition of the theorem, we can choose λ so that the function g( · , · ; λ) : R+ × S → R+ is such that min_k g(x, k; λ) → ∞ as x → ∞ and (5.52) holds, and recurrence follows.

5.3.6 Technical results

We collect some facts from [1]. The Euler gamma function Γ satisfies the functional equation zΓ(z) = Γ(z + 1), as well as the following 'reflection formula', equation 6.1.17 from [1, p. 256]:
\[
\Gamma(z)\Gamma(1-z) = -z\,\Gamma(-z)\Gamma(z) = \pi\operatorname{cosec}\pi z, \tag{5.55}
\]
valid for z ∈ (0, 1); one consequence of the functional equation is that
\[
\Gamma(z) \sim 1/z, \quad\text{as } z \downarrow 0. \tag{5.56}
\]


Moreover, for x > 0, by Taylor's formula,
\[
\Gamma(x+h) = \Gamma(x)\bigl( 1 + h\psi(x) + O(h^2) \bigr), \tag{5.57}
\]
as h → 0, where ψ(z) = (d/dz) log Γ(z) = Γ′(z)/Γ(z) is the digamma function, and ψ(1) = −γ, where γ ≈ 0.5772 is Euler's constant. The digamma function also satisfies a reflection formula, namely equation 6.3.7 from [1, p. 259]:
\[
\psi(1-z) - \psi(z) = \pi\cot\pi z. \tag{5.58}
\]
The beta function is given for a > 0, b > 0 by
\[
B(a,b) = \int_0^1 u^{a-1}(1-u)^{b-1}\,du = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}. \tag{5.59}
\]
The Gauss hypergeometric function is defined via the series
\[
{}_2F_1(a,b;c;z) = \sum_{n=0}^{\infty} \frac{\Gamma(a+n)\Gamma(b+n)\Gamma(c)}{\Gamma(a)\Gamma(b)\Gamma(c+n)}\,\frac{z^n}{n!}, \tag{5.60}
\]
which is convergent for |z| < 1. A consequence of (5.60) is that {}_2F_1(a,b;c;z) = 1 + O(z) as z → 0. For c > b > 0 there is the integral representation
\[
{}_2F_1(a,b;c;z) = \frac{\Gamma(c)}{\Gamma(b)\Gamma(c-b)} \int_0^1 u^{b-1}(1-u)^{c-b-1}(1-zu)^{-a}\,du. \tag{5.61}
\]
We will need the following linear transformation formula, equation 15.3.6 from [1, p. 559]:
\[
{}_2F_1(a,b;c;z) = \frac{\Gamma(c)\Gamma(c-a-b)}{\Gamma(c-a)\Gamma(c-b)}\, {}_2F_1(a,b;a+b-c+1;1-z) + (1-z)^{c-a-b}\,\frac{\Gamma(c)\Gamma(a+b-c)}{\Gamma(a)\Gamma(b)}\, {}_2F_1(c-a,c-b;c-a-b+1;1-z). \tag{5.62}
\]

We define the following integrals:
\[
I^{\nu,\alpha}_1(x) := \int_0^{\infty} \bigl( (x+y)^\nu - x^\nu \bigr) y^{-1-\alpha}\,dy; \tag{5.63}
\]
\[
I^{\nu,\alpha}_2(x) := \int_0^{x} \bigl( (x-y)^\nu - x^\nu \bigr) y^{-1-\alpha}\,dy; \tag{5.64}
\]
\[
I^{\nu,\alpha}_3(x;\lambda) := \int_x^{\infty} \bigl( \lambda(y-x)^\nu - x^\nu \bigr) y^{-1-\alpha}\,dy. \tag{5.65}
\]


Lemma 5.3.16. Let α > 0, ν ∈ (−1, α), and λ ∈ R. Then, for any x > 0,
\[
I^{\nu,\alpha}_3(x;\lambda) = x^{\nu-\alpha}\Bigl( \lambda\,\frac{\Gamma(1+\nu)\Gamma(\alpha-\nu)}{\Gamma(1+\alpha)} - \frac{1}{\alpha} \Bigr). \tag{5.66}
\]
If, in addition, α ∈ (0, 1), then, for any x > 0,
\[
I^{\nu,\alpha}_1(x) = -x^{\nu-\alpha}\,\frac{\Gamma(1-\alpha)\Gamma(\alpha-\nu)}{\alpha\,\Gamma(-\nu)}; \tag{5.67}
\]
\[
I^{\nu,\alpha}_2(x) = x^{\nu-\alpha}\,\frac{1}{\alpha}\Bigl( 1 - \frac{\Gamma(1+\nu)\Gamma(1-\alpha)}{\Gamma(1-\alpha+\nu)} \Bigr). \tag{5.68}
\]

Proof. First consider I1. With the substitution u = y/x we obtain
\[
I^{\nu,\alpha}_1(x) = x^{\nu-\alpha} \int_0^{\infty} \bigl( (1+u)^\nu - 1 \bigr) u^{-1-\alpha}\,du.
\]
Integrating by parts,
\[
\int_0^{\infty} \bigl( (1+u)^\nu - 1 \bigr) u^{-1-\alpha}\,du = \Bigl[ -\tfrac{1}{\alpha}\bigl( (1+u)^\nu - 1 \bigr) u^{-\alpha} \Bigr]_0^{\infty} + \frac{\nu}{\alpha} \int_0^{\infty} (1+u)^{\nu-1} u^{-\alpha}\,du.
\]
Here lim_{u→∞}((1 + u)^ν − 1)u^{−α} = 0 provided α > 0 and α − ν > 0, while lim_{u→0}((1 + u)^ν − 1)u^{−α} = lim_{u→0} νu^{1−α} = 0 provided α < 1. So, for α ∈ (0, 1) and ν < α,
\[
\int_0^{\infty} \bigl( (1+u)^\nu - 1 \bigr) u^{-1-\alpha}\,du = \frac{\nu}{\alpha} \int_0^{\infty} (1+u)^{\nu-1} u^{-\alpha}\,du.
\]
Now use the substitution z = u/(1+u), so that u = z/(1−z), 1 + u = 1/(1−z), and du = dz/(1−z)², to get
\[
\int_0^{\infty} (1+u)^{\nu-1} u^{-\alpha}\,du = \int_0^1 z^{-\alpha}(1-z)^{\alpha-\nu-1}\,dz = \frac{\Gamma(1-\alpha)\Gamma(\alpha-\nu)}{\Gamma(1-\nu)},
\]
by (5.59), provided ν < α and α ∈ (0, 1). Combining our computations, and using Γ(1 − ν) = −νΓ(−ν), we obtain (5.67).

Next consider I2. This time, the substitution u = y/x gives
\[
I^{\nu,\alpha}_2(x) = x^{\nu-\alpha} \int_0^1 \bigl( (1-u)^\nu - 1 \bigr) u^{-1-\alpha}\,du.
\]
Due to the singularity at u = 0, we evaluate the integral here as
\[
\int_0^1 \bigl( (1-u)^\nu - 1 \bigr) u^{-1-\alpha}\,du = \lim_{t\downarrow 0} \int_t^1 \bigl( (1-u)^\nu - 1 \bigr) u^{-1-\alpha}\,du.
\]
For the last integral, we get, for α > 0 and t > 0,
\[
\int_t^1 \bigl( (1-u)^\nu - 1 \bigr) u^{-1-\alpha}\,du = \int_t^1 (1-u)^\nu u^{-1-\alpha}\,du + \frac{1 - t^{-\alpha}}{\alpha}.
\]
With the substitution v = (1−u)/(1−t), the last integral becomes
\[
\int_t^1 (1-u)^\nu u^{-1-\alpha}\,du = (1-t)^{1+\nu} \int_0^1 v^\nu \bigl( 1 - (1-t)v \bigr)^{-\alpha-1}\,dv = \frac{(1-t)^{1+\nu}}{1+\nu}\, {}_2F_1(1+\alpha, 1+\nu; 2+\nu; 1-t),
\]
provided ν > −1, by (5.61). Now, by (5.62),
\[
{}_2F_1(1+\alpha, 1+\nu; 2+\nu; 1-t) = \frac{\Gamma(2+\nu)\Gamma(-\alpha)}{\Gamma(1-\alpha+\nu)}\, {}_2F_1(1+\alpha, 1+\nu; 1+\alpha; t) + t^{-\alpha}\,\frac{1+\nu}{\alpha}\, {}_2F_1(1-\alpha+\nu, 1; 1-\alpha; t) = \frac{\Gamma(2+\nu)\Gamma(-\alpha)}{\Gamma(1-\alpha+\nu)} + \frac{1+\nu}{\alpha}\, t^{-\alpha} + O(t^{1-\alpha}),
\]
as t ↓ 0, provided α ∈ (0, 1). Combining these results we get
\[
\int_t^1 \bigl( (1-u)^\nu - 1 \bigr) u^{-1-\alpha}\,du = \frac{(1-t)^{1+\nu}}{1+\nu}\Bigl[ \frac{\Gamma(2+\nu)\Gamma(-\alpha)}{\Gamma(1-\alpha+\nu)} + \frac{1+\nu}{\alpha}\, t^{-\alpha} + O(t^{1-\alpha}) \Bigr] + \frac{1 - t^{-\alpha}}{\alpha} = \frac{1}{\alpha} + \frac{\Gamma(1+\nu)\Gamma(-\alpha)}{\Gamma(1-\alpha+\nu)} + O(t^{1-\alpha}).
\]
Hence
\[
\int_0^1 \bigl( (1-u)^\nu - 1 \bigr) u^{-1-\alpha}\,du = \frac{1}{\alpha} + \frac{\Gamma(1+\nu)\Gamma(-\alpha)}{\Gamma(1-\alpha+\nu)} = \frac{1}{\alpha}\Bigl( 1 - \frac{\Gamma(1+\nu)\Gamma(1-\alpha)}{\Gamma(1-\alpha+\nu)} \Bigr),
\]

using (5.55). Thus we obtain (5.68).

Finally consider I3; this turns out to be the simplest of the three. Indeed, after the substitution u = y/x, we have
\[
I^{\nu,\alpha}_3(x;\lambda) = x^{\nu-\alpha} \int_1^{\infty} \bigl( \lambda(u-1)^\nu - 1 \bigr) u^{-1-\alpha}\,du = x^{\nu-\alpha}\Bigl( \lambda \int_1^{\infty} (u-1)^\nu u^{-1-\alpha}\,du - \frac{1}{\alpha} \Bigr),
\]
for α > 0. Now using the substitution v = 1/u gives
\[
\int_1^{\infty} (u-1)^\nu u^{-1-\alpha}\,du = \int_0^1 (1-v)^\nu v^{\alpha-\nu-1}\,dv = \frac{\Gamma(1+\nu)\Gamma(\alpha-\nu)}{\Gamma(1+\alpha)},
\]
by (5.59), provided ν > −1 and α − ν > 0. Hence we obtain (5.66).

We also need the following integrals involving logarithms. Recall that γ denotes Euler's constant and ψ is the digamma function. Define the following:
\[
J^{\alpha}_1 := \int_1^{\infty} u^{-1-\alpha} \log(u-1)\,du; \tag{5.69}
\]
\[
J^{\alpha}_2 := \int_1^{\infty} u^{-1-\alpha} \log(1+u)\,du; \tag{5.70}
\]
\[
J^{\alpha}_3 := \int_0^{\infty} u^{-1-\alpha} \log(1+u)\,du; \tag{5.71}
\]
\[
J^{\alpha}_4 := \int_0^1 u^{-1-\alpha} \log(1-u)\,du. \tag{5.72}
\]

Lemma 5.3.17. For any α > 0,
\[
J^{\alpha}_1 = -\frac{1}{\alpha}\bigl( \gamma + \psi(\alpha) \bigr); \qquad J^{\alpha}_2 = \frac{1}{\alpha}\bigl( \psi(\alpha) - \psi(\alpha/2) \bigr).
\]
Also, for any α ∈ (0, 1),
\[
J^{\alpha}_3 = \frac{\pi}{\alpha}\operatorname{cosec}\pi\alpha; \qquad J^{\alpha}_4 = \frac{1}{\alpha}\bigl( \gamma + \psi(1-\alpha) \bigr);
\]
and
\[
J^{\alpha}_1 + J^{\alpha}_4 = \frac{\pi}{\alpha}\cot(\pi\alpha).
\]

Proof. We appeal to tables of standard integrals (Mellin transforms) from Section 6.4 of [76]. In particular, the given formulae for J^α_1, J^α_2, J^α_3, and J^α_4 follow from, respectively, equations 6.4.20, 6.4.17, 6.4.15, and 6.4.19 of [76, pp. 315–316]; also used are the reflection formula (5.58), the fact that ψ(1) = −γ, and, for J^α_2, the duplication formula for ψ given by equation 6.3.8 of [1, p. 259]. The formula for J^α_1 + J^α_4 follows either from summing the individual formulae given and using (5.58), or directly from equation 6.4.18 of [76, p. 315].
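These closed forms are easy to test numerically. For α = 1/2 the formula for J^α_3 predicts ∫_0^∞ u^{−3/2} log(1 + u) du = (π/α) cosec(πα) = 2π. The sketch below (Python; `midpoint` is our own quadrature helper) checks this after the substitutions u = s² on (0, 1) and u = 1/s² on (1, ∞), which remove the endpoint singularities; the elementary piece −2 ∫_0^1 2 log s ds = 4 is inserted exactly.

```python
import math

def midpoint(f, a, b, n=200_000):
    # composite midpoint rule on [a, b]
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# J3 at alpha = 1/2: predicted value 2*pi
part1 = midpoint(lambda s: 2.0 * math.log1p(s * s) / (s * s), 0.0, 1.0)  # int_0^1 piece
part2 = midpoint(lambda s: 2.0 * math.log1p(s * s), 0.0, 1.0)            # smooth part of int_1^inf
numeric = part1 + part2 + 4.0
assert abs(numeric - 2.0 * math.pi) < 1e-6
```

Both transformed integrands are smooth on [0, 1], so the plain midpoint rule converges quickly.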


Bibliographical notes

Section 5.2

There is an extensive and rich theory of sums of independent, identically distributed (i.i.d.) random variables (classical 'random walks'): see for instance the books of Kallenberg [137, Chapter 9], Loève [184, §26.2], or Stout [271, §3.2]. When the summands are integrable, the (first-order) asymptotic behaviour is governed by the mean. Completely different phenomena occur when the mean does not exist: see for example the book of Foss et al. [92].

To compare with the Markov chain setting of Section 5.2, it is helpful to keep in mind the classical independent-increments case, where Sn := Σ_{m=1}^{n} ζm for a sequence of independent (often, i.i.d.) R-valued random variables ζ1, ζ2, . . .. Thus we give a brief summary of some known results in that setting. Many of the results that we discuss for random walks have analogues for suitable Lévy processes: see e.g. the book of Sato [254], particularly Sections 37 and 48.

A classical result of Kesten [149, Corollary 3] states that if ζ1, ζ2, . . . are i.i.d. random variables with E|ζ1| = ∞, then as n → ∞, n^{−1}Sn either: (i) tends to +∞ a.s.; (ii) tends to −∞ a.s.; or (iii) satisfies
\[
-\infty = \liminf_{n\to\infty} n^{-1} S_n < \limsup_{n\to\infty} n^{-1} S_n = +\infty, \quad\text{a.s.} \tag{5.73}
\]
Erickson [78] gives criteria for classifying such behaviour. Note that Kesten's result implies the non-confinement result (5.1) in the i.i.d. case. Other classical results deal with the growth rate of the upper envelope of Sn, i.e., determining sequences an for which |Sn| ≥ an infinitely often (or not), or Sn ≥ an infinitely often; here we mention the work of Feller [85], as well as results related to the Marcinkiewicz–Zygmund strong law of large numbers (see e.g. [152, Theorem 1]). The lower envelope behaviour, i.e., when |Sn| ≥ an all but finitely often, is considered by Griffin [107] (particularly Theorem 3.5); see also [237].

Note that (5.73) can hold and Sn be transient (with respect to bounded sets); Loève [184, §26.2] gives the example of a symmetric stable random walk without a mean. The general criterion for deciding between transience and recurrence is due to Chung and Fuchs (see e.g. [137, Theorem 9.4] or [184, §26.2]), and is rather subtle: Shepp showed [261] that there exist distributions for ζ1 with arbitrarily heavy tails but for which Sn is still recurrent. By assuming additional regularity for the distribution of ζ1, one can obtain more tractable criteria for recurrence; Shepp gives a criterion when the distribution of ζ1 is symmetric [260, Theorem 5].


The results of Section 5.2 are based on [119], which considered the more general setting where Xn is adapted to a filtration Fn, and not necessarily Markov (i.e., the analogue of the general Lamperti setting of Chapter 3).

Theorem 5.2.2 is contained in Theorem 2.2 of [119]. In Theorem 5.2.2, condition (H2) is natural. For γ ≤ 1, (∆n⁺)^γ ≥ x^{γ−1} ∆n⁺ 1{∆n⁺ ≤ x} for any x > 0, so (H2) implies that E[(∆n⁺)^γ | Xn = x] = ∞ for any γ > α. A counterexample due to K.L. Chung (see the Mathematical Reviews entry for [63]; also Baum [15]) shows that (H2) cannot be replaced by a condition on the moments of the increments, even in the case of a sum of i.i.d. random variables. Chung's example has, for α ∈ (0, 1) and β > α, E[(ζ1⁻)^β] < ∞ and E[(ζ1⁺)^α] = ∞, but E[ζ1⁺ 1{ζ1⁺ ≤ x}] = o(x^{1−α}) along a subsequence, so (H2) does not hold. For Xn = Sn as in Chung's example, lim inf_{n→∞} Xn = −∞, a.s.

Theorems 5.2.4 and 5.2.5 are contained in Theorems 2.4 and 2.6 of [119], respectively. In the case of a sum of independent random variables, Theorem 5.2.4 is slightly weaker than optimal. Suppose that ζ1, ζ2, . . . are independent, and that for some θ ∈ (0, 1) and ϕ ∈ R,
\[
\sup_{k\in\mathbb{N}} \limsup_{x\to\infty} \bigl( x^\theta (\log x)^{-\varphi}\, \mathbb{P}[|\zeta_k| \geq x] \bigr) < \infty.
\]
Then, with Sn = Σ_{k=1}^{n} ζk, for any ε > 0, a.s., for all but finitely many n ≥ 0,
\[
|S_n| \leq n^{1/\theta} (\log n)^{\frac{\varphi+1}{\theta} + \varepsilon}. \tag{5.74}
\]
The bound (5.74) belongs to a family of classical results with a long history; the case ϕ = 0 is due to Lévy and Marcinkiewicz (quoted by Feller [85, p. 257]), and the general case of (5.74) follows for example from a result of Loève [184, p. 253]. Under the additional condition that the summands are identically distributed, sharp results are given by Feller [85, Theorem 2]; for a recent reference, see [168]. Related results in the i.i.d. case are also given by Chow and Zhang [32] (see also [152, Theorem 2]).

Theorems 5.2.2 and 5.2.5 together can be viewed as an analogue of Erickson's [78] result in the case of a sum of i.i.d. random variables; in the i.i.d. case the conclusion of Theorem 5.2.2 follows from [78, Corollary 1]. The results of [78] show that the conditions in Theorem 5.2.2 are close to optimal.

Note that (H1) implies that, a.s.,
\[
\mathbb{E}\bigl[ (\Delta_n^+)^\alpha \mid X_n = x \bigr] = \int_0^{\infty} \mathbb{P}\bigl[ \Delta_n^+ > y^{1/\alpha} \mid X_n = x \bigr]\,dy \geq c \int_{x_0}^{\infty} y^{-1}\,dy = \infty.
\]


Conditions (H2) and (H1) are closely related, but neither implies the other. However, if one replaces the inequalities by equalities, the former implies the latter. In the case where Xn = Sn is a sum of i.i.d. random variables, a weaker version of Theorem 5.2.5 was obtained by Derman and Robbins [63] and stated in a stronger form by Stout [271, Theorem 3.2.6]; although Stout's statement is still weaker than our Theorem 5.2.5, his proof gives essentially the same result (in the i.i.d. case). Also relevant in the i.i.d. case is a result of Chow and Zhang [32, Theorem 1]. Chung's counterexample (see above) shows that the condition (H1) cannot be replaced by a moments condition, for instance.

The results on first passage times, Theorems 5.2.9 and 5.2.10, are contained in Theorems 2.9 and 2.10 of [119], respectively. The conditions in Theorems 5.2.9 and 5.2.10 are not far from optimal; in the i.i.d. case, sharp results on the existence or non-existence of moments for ρx are given by Kesten and Maller [151, Theorem 2.1]; see [151] for references to earlier work. Lemma 5.2.12 is essentially a large deviations result of the same kind as (but more general than) those obtained in [123] for the case Xn = Sn, a sum of i.i.d. nonnegative random variables; indeed, the results in [123] show that Lemma 5.2.12 is close to best possible.

The results on last exit times, Theorems 5.2.13 and 5.2.14, are contained in Theorems 2.11 and 2.12 of [119], respectively. Again, in the i.i.d. case sharp results are given by Kesten and Maller [151, Theorem 2.1].


Chapter 6

Further applications

6.1 Random walk in random environment

6.1.1 Introduction

The first two sections of this chapter give some applications to processes in random environments. The law of a time-homogeneous Markov chain is determined by its collection of one-step transition probabilities, which describe how the process moves around the state space. Viewing the process as describing the dynamics of a particle, the collection of transition probabilities can be seen as describing the environment in which the particle moves. Motivated by dynamics in disordered systems, the environment can itself be random, i.e., the transition probabilities are chosen according to some probabilistic rule, before the process starts. The prototypical model of this type is the random walk in random environment, which we discuss in this section.

Given an infinite sequence ω = (p0, p1, p2, . . .) such that px ∈ (0, 1) for all x ∈ Z+, we consider (Xn, n ≥ 0), a nearest-neighbour random walk on Z+ defined as follows. Set X0 = 0, and for x ≥ 1,
\[
\mathbb{P}_\omega[X_{n+1} = x-1 \mid X_n = x] = p_x, \qquad \mathbb{P}_\omega[X_{n+1} = x+1 \mid X_n = x] = 1 - p_x =: q_x, \tag{6.1}
\]
and P_ω[X_{n+1} = 0 | X_n = 0] = p_0, P_ω[X_{n+1} = 1 | X_n = 0] = 1 − p_0 =: q_0. The assumption that each px be in (0, 1), with the given form for the reflection at the origin, ensures that Xn is an irreducible, aperiodic Markov chain on Z+ under the quenched law P_ω.

The sequence of jump probabilities ω specifies the environment for the random walk. We take ω itself to be random. Specifically, p0, p1, . . . will be a sequence of i.i.d. (0, 1)-valued random variables on a probability space (Ω, F, P). We use E to denote expectation with respect to the (environment) probability measure P, and E_ω to denote expectation with respect to the (quenched) probability measure P_ω.

In Section 6.2 we consider random strings in random environment. The random strings model includes the random walk in random environment model given here as a special case, and the proofs in Sections 6.1 and 6.2 have several common strands. As we shall see, studying recurrence of the random walk requires analysis of products of independent random variables (compare Section 2.2), while in the case of strings one must study products of random matrices. Via a simple transformation, one can study products of random variables by instead studying sums, and so a wealth of classical theory is available; for matrices, we must appeal to some more sophisticated theory. For these reasons, it is recommended to read Section 6.1 before Section 6.2.

6.1.2 Recurrence classification

For any realization of the environment ω, the irreducible Markov chain Xn is either recurrent or transient under P_ω. The following recurrence classification, which is essentially due to Solomon [268], shows that the crucial quantity is ζx := log(px/qx).

Theorem 6.1.1. Suppose that E |ζ0| <∞.

(i) If E ζ0 < 0, then Xn is transient for P-a.e. ω.

(ii) If E ζ0 = 0, then Xn is null-recurrent for P-a.e. ω.

(iii) If E ζ0 > 0, then Xn is positive recurrent for P-a.e. ω.

For fixed ω, Xn is the nearest-neighbour random walk of Section 2.2. Similarly to that section, we use the notation
\[
d_x := d_x(\omega) := \prod_{y=0}^{x-1} \frac{p_y}{q_y} = \exp\Bigl\{ \sum_{y=0}^{x-1} \zeta_y \Bigr\}. \tag{6.2}
\]

Proof of Theorem 6.1.1. For fixed ω, Theorem 2.2.5 shows that Xn is recurrent if and only if Σ_{x=1}^{∞} d_x = ∞. If E ζ0 < 0, then the strong law of large numbers shows that Σ_{y=0}^{x−1} ζy → −∞ as x → ∞, and hence Σ_{x=1}^{∞} d_x < ∞ for P-a.e. ω, proving part (i).


For part (iii), Theorem 2.2.5 shows that Xn is positive recurrent if and only if Σ_{x=1}^{∞} d_x^{−1} < ∞. If E ζ0 > 0, then the strong law of large numbers shows that Σ_{y=0}^{x−1} ζy → +∞ as x → ∞, and hence
\[
\sum_{x=1}^{\infty} d_x^{-1} = \sum_{x=1}^{\infty} \exp\Bigl\{ -\sum_{y=0}^{x-1} \zeta_y \Bigr\} < \infty
\]
for P-a.e. ω, proving part (iii).

Finally, for part (ii), suppose that E ζ0 = 0. Then either P[ζ0 = 0] = 1 (the degenerate case of symmetric simple random walk with reflection at 0) and d_x = d_x^{−1} = 1 for all x, or else P[ζ0 = 0] < 1 and then
\[
\limsup_{x\to\infty} \sum_{y=0}^{x-1} \zeta_y = \limsup_{x\to\infty} \sum_{y=0}^{x-1} (-\zeta_y) = +\infty,
\]
for P-a.e. ω (see e.g. [137, p. 167]). In particular, in either case we have d_x ≥ 1 for infinitely many x for P-a.e. ω, and we obtain recurrence; also d_x^{−1} ≥ 1 for infinitely many x for P-a.e. ω, so the walk is not positive recurrent. This proves part (ii).
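The quantities dx are easy to compute for a sampled environment, which makes the trichotomy of Theorem 6.1.1 visible numerically. The helper below (Python; the two environment laws are our own illustrative choices) shows dx collapsing when E ζ0 < 0 and exploding when E ζ0 > 0.

```python
import math, random

def d(env, x):
    # d_x = prod_{y < x} p_y / q_y = exp(sum_{y < x} zeta_y), as in (6.2)
    return math.exp(sum(math.log(p / (1.0 - p)) for p in env[:x]))

rng = random.Random(3)
env_right = [0.1 + 0.6 * rng.random() for _ in range(400)]  # E[zeta] < 0: transient regime
env_left  = [0.3 + 0.6 * rng.random() for _ in range(400)]  # E[zeta] > 0: positive recurrent
```

With 400 sites, the partial sums of ζ are far from zero in both cases, so Σ dx (resp. Σ dx⁻¹) visibly converges.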

6.1.3 Almost-sure upper bounds

The null-recurrent case in which E[ζ0²] ∈ (0, ∞) and E ζ0 = 0 is known as Sinai's regime; it is this case that we study in detail for the rest of this section. First we give an almost-sure upper bound which demonstrates the very slow movement of the random walk.

Theorem 6.1.2. Suppose that E[ζ0²] ∈ (0, ∞) and E ζ0 = 0. Then for any ε > 0, for P-a.e. ω, P_ω-a.s., for all but finitely many n ≥ 0,
\[
X_n \leq (\log n)^2 (\log\log n)^{2+\varepsilon}.
\]

We prove this result using the Lyapunov function t(x) as defined at (2.8), which for the present purposes is most usefully expressed as
\[
t(x) := t(x,\omega) := \sum_{y=0}^{x-1} \sum_{z=0}^{y} q_{y-z}^{-1} \exp\Bigl\{ \sum_{w=y-z+1}^{y} \zeta_w \Bigr\}. \tag{6.3}
\]

We will use the following lower bound on t(x).

Lemma 6.1.3. Let t be as defined at (6.3). For any ω and any x ∈ Z+,
\[
t(x) \geq \exp\Bigl\{ \max_{0\leq y\leq x} \sum_{z=0}^{y-1} \zeta_z \Bigr\}. \tag{6.4}
\]


Proof. Since q_{y−z}^{−1} = 1 + (p_{y−z}/q_{y−z}), we have from (6.3) that
\[
t(x) = \sum_{y=0}^{x-1} \sum_{z=0}^{y} \exp\Bigl\{ \sum_{w=y-z+1}^{y} \zeta_w \Bigr\} + \sum_{y=0}^{x-1} \sum_{z=0}^{y} \exp\Bigl\{ \sum_{w=y-z}^{y} \zeta_w \Bigr\}. \tag{6.5}
\]
For the lower bound, (6.5) shows that
\[
t(x) \geq \max_{0\leq y\leq x-1}\, \max_{0\leq z\leq y} \exp\Bigl\{ \sum_{w=y-z}^{y} \zeta_w \Bigr\} \geq \max_{0\leq y\leq x-1} \exp\Bigl\{ \sum_{w=0}^{y} \zeta_w \Bigr\},
\]
which yields the result in (6.4).

The next result, which is a consequence of Theorem 2.8.1, shows how to deduce from lower bounds for t(x) an upper bound for Xn.

Lemma 6.1.4. Let w : Z+ → R+ be increasing. Suppose that for P-a.e. ω, t(x) ≥ w(x) for all but finitely many x ∈ Z+. Then for any ε > 0, for P-a.e. ω, P_ω-a.s., for all but finitely many n ≥ 0,
\[
X_n \leq w^{-1}(n^{1+\varepsilon}).
\]
Proof. Let ε > 0. For a fixed ω, Lemma 2.2.4 shows that we may apply Theorem 2.8.1 with f = t (and B = 1) to obtain, P_ω-a.s., for all but finitely many n ≥ 0, Xn ≤ t^{−1}((2n)^{1+ε}).

Under the assumptions of the lemma, for P-a.e. ω there exists x_ω ∈ Z+ such that t(x) ≥ w(x) for all x ≥ x_ω, so that t^{−1}(y) ≤ w^{−1}(y) for all y ≥ y_ω, where y_ω = t(x_ω). For such ω, P_ω-a.s.,
\[
X_n \leq t^{-1}\bigl( (2n)^{1+\varepsilon} \bigr) \leq w^{-1}\bigl( (2n)^{1+\varepsilon} \bigr),
\]
for all n sufficiently large, and the result follows.

The final ingredient is to obtain lower bounds for the right-hand side of (6.4). We will use the following result, due to Csáki [50], which extended a result of Hirsch [114].

Lemma 6.1.5. Suppose that ζ0, ζ1, . . . are i.i.d. with E[ζ0²] ∈ (0, ∞) and E ζ0 = 0. Then, for any ε > 0, P-a.s., for all but finitely many n ≥ 0,
\[
\max_{0\leq i\leq n} \sum_{j=0}^{i} \zeta_j \geq n^{1/2} (\log n)^{-1-\varepsilon}.
\]


Now we can complete the proof of Theorem 6.1.2.

Proof of Theorem 6.1.2. Let ε > 0. We have from (6.4) and Lemma 6.1.5 that for P-a.e. ω, for all but finitely many x ∈ Z+,
\[
t(x) \geq \exp\bigl\{ x^{1/2} (\log x)^{-1-\varepsilon} \bigr\}.
\]
Thus we may apply Lemma 6.1.4 with
\[
w(x) = \exp\bigl\{ x^{1/2} (\log x)^{-1-\varepsilon} \bigr\}, \quad\text{so that}\quad w^{-1}(y) \leq (\log y)^2 (\log\log y)^{2+3\varepsilon},
\]
for y sufficiently large, to obtain that for P-a.e. ω, P_ω-a.s.,
\[
X_n \leq w^{-1}(n^{1+\varepsilon}) \leq (\log n)^2 (\log\log n)^{2+4\varepsilon},
\]
for all but finitely many n ≥ 0, which yields the result.

6.1.4 Almost-sure lower bounds

Next we give an almost-sure lower bound that is attained infinitely often.

Theorem 6.1.6. Suppose that E[ζ_0^2] = σ^2 ∈ (0,∞) and E ζ_0 = 0. Then for any ε > 0, for P-a.e. ω, Pω-a.s., for infinitely many n ≥ 0,
\[
X_n \ge (1-\varepsilon) \, \frac{2}{\pi^2 \sigma^2} \, (\log n)^2 \log\log\log n.
\]

The following upper bound on t(x) is a companion to Lemma 6.1.3.

Lemma 6.1.7. Let t be as defined at (6.3). For any ω and any x ∈ Z_+,
\[
t(x) \le 2x^2 \exp\Bigl\{ 2 \max_{0 \le y \le x-1} \Bigl| \sum_{z=0}^{y} \zeta_z \Bigr| \Bigr\}. \tag{6.6}
\]

Proof. For the first term on the right-hand side of (6.5) we have
\[
\sum_{y=0}^{x-1} \sum_{z=0}^{y} \exp\Bigl\{ \sum_{w=y-z+1}^{y} \zeta_w \Bigr\}
\le \sum_{y=0}^{x-1} (y+1) \max_{0 \le z \le y} \exp\Bigl\{ \sum_{w=y-z+1}^{y} \zeta_w \Bigr\}
\le x^2 \exp\Bigl\{ \max_{0 \le y \le x-1} \max_{0 \le z \le y} \sum_{w=y-z+1}^{y} \zeta_w \Bigr\}.
\]


Here we have that
\[
\max_{0 \le y \le x-1} \max_{0 \le z \le y} \sum_{w=y-z+1}^{y} \zeta_w
= \max_{0 \le y \le x-1} \Bigl( \sum_{w=0}^{y} \zeta_w + \max_{0 \le z \le y} \sum_{w=0}^{y-z} (-\zeta_w) \Bigr)
\le \max_{0 \le y \le x-1} \sum_{w=0}^{y} \zeta_w + \max_{0 \le y \le x-1} \sum_{w=0}^{y} (-\zeta_w).
\]

Hence
\[
\sum_{y=0}^{x-1} \sum_{z=0}^{y} \exp\Bigl\{ \sum_{w=y-z+1}^{y} \zeta_w \Bigr\}
\le x^2 \exp\Bigl\{ \max_{0 \le y \le x-1} \sum_{z=0}^{y} \zeta_z + \max_{0 \le y \le x-1} \sum_{z=0}^{y} (-\zeta_z) \Bigr\}.
\]
A similar calculation yields the same upper bound for the second term on the right-hand side of (6.5), and (6.6) follows.

The key to the proof of Theorem 6.1.6 is the following technical result.

Lemma 6.1.8. Suppose that there are non-negative, increasing functions u and v such that

(i) for P-a.e. ω, t(x) ≤ u(x) for all but finitely many x ∈ N;

(ii) for P-a.e. ω, t(x) ≤ v(x) for infinitely many x ∈ N.

Suppose also that
\[
\sum_{x=1}^{\infty} \frac{u(x)}{v(x^{3/2})} < \infty, \quad \text{and} \quad \lim_{x\to\infty} \frac{v(x^{3/4})}{v(x)} = 0. \tag{6.7}
\]
Then for any ε > 0, for P-a.e. ω, Pω-a.s., for infinitely many n ≥ 0,
\[
X_n \ge v^{-1}((1-\varepsilon)n).
\]

Proof. Recall that τ_x = min{n ≥ 0 : X_n = x}, and, from Lemma 2.2.3, that E_ω τ_x = t(x) given that X_0 = 0. First we obtain a rough upper bound on τ_x (equation (6.8) below). Condition (i) and Markov's inequality yield, for P-a.e. ω,
\[
P_\omega[\tau_x > v(x^{3/2})] \le \frac{t(x)}{v(x^{3/2})} \le \frac{u(x)}{v(x^{3/2})},
\]
for all x ≥ x_0, where x_0 := x_0(ω) < ∞. Hence, for P-a.e. ω,
\[
\sum_{x \ge 1} P_\omega[\tau_x > v(x^{3/2})] \le x_0 + \sum_{x \ge 1} \frac{u(x)}{v(x^{3/2})} < \infty,
\]


by the first condition in (6.7). By the Borel–Cantelli lemma, for P-a.e. ω,
\[
\tau_x \le v(x^{3/2}), \tag{6.8}
\]
for all x ≥ N, where N := N(ω) (a random variable for each ω) has P_ω[N(ω) < ∞] = 1 for P-a.e. ω.

Now we use (6.8) and condition (ii) to show that τ_x is in fact much smaller than the bound in (6.8) for infinitely many x. Let ε > 0. Condition (ii) implies that for P-a.e. ω there exist x_i := x_i(ω), i ∈ N, such that x_{i+1} > x_i^2 for all i and t(x_i) ≤ v(x_i) for all i (the x_i are a subsequence of that specified by (ii), chosen so as to have very large spacings). By Markov's inequality and the fact that E_ω τ_{x_{i+1}} = t(x_{i+1}) ≤ v(x_{i+1}),
\[
P_\omega[\tau_{x_{i+1}} - \tau_{x_i} > (1+\varepsilon) v(x_{i+1})] \le P_\omega[\tau_{x_{i+1}} > (1+\varepsilon) v(x_{i+1})] \le (1+\varepsilon)^{-1}.
\]

It follows that, for all i ∈ N,
\[
P_\omega[\tau_{x_{i+1}} - \tau_{x_i} \le (1+\varepsilon) v(x_{i+1})] \ge \varepsilon', \tag{6.9}
\]
where ε′ > 0 depends only on ε. So by (6.9), for P-a.e. ω, for any ε > 0,
\[
\sum_{i \in \mathbb{N}} P_\omega[\tau_{x_{i+1}} - \tau_{x_i} \le (1+\varepsilon) v(x_{i+1})] = \infty. \tag{6.10}
\]

Under P_ω, the random variables τ_{x_{i+1}} − τ_{x_i}, i ∈ N, are independent, by the strong Markov property. Hence (6.10) and the Borel–Cantelli lemma imply that, for P-a.e. ω, Pω-a.s., for infinitely many i,
\[
\tau_{x_{i+1}} - \tau_{x_i} \le (1+\varepsilon) v(x_{i+1}).
\]
With (6.8) this implies that, for P-a.e. ω, Pω-a.s., for infinitely many i,
\[
\tau_{x_{i+1}} \le (1+\varepsilon) v(x_{i+1}) + v(x_i^{3/2}) \le (1+\varepsilon) v(x_{i+1}) + v(x_{i+1}^{3/4}),
\]
since x_i < x_{i+1}^{1/2} and v is increasing. Hence by the second condition in (6.7), we obtain that, for any ε > 0, for P-a.e. ω, Pω-a.s., for infinitely many x,
\[
\tau_x \le (1+2\varepsilon) v(x) = (1+2\varepsilon) v(X_{\tau_x}).
\]
Since τ_x ≠ τ_y for any x ≠ y, it follows that, for any ε > 0, for P-a.e. ω, Pω-a.s., n ≤ (1+2ε)v(X_n) for infinitely many n, which gives the result.

We will use the so-called 'other' law of the iterated logarithm, due to Chung [34] (under a third-moment condition) and Jain and Pruitt [128]:


Lemma 6.1.9. Suppose that ζ_0, ζ_1, ... are i.i.d. with E[ζ_0^2] = σ^2 ∈ (0,∞) and E ζ_0 = 0. Then, P-a.s.,
\[
\liminf_{n\to\infty} \Bigl( n^{-1/2} (\log\log n)^{1/2} \max_{0 \le i \le n} \Bigl| \sum_{j=0}^{i} \zeta_j \Bigr| \Bigr) = \frac{\pi \sigma}{\sqrt{8}}.
\]

Proof of Theorem 6.1.6. For ε ∈ (0, 1/4) set
\[
u(x) = 2x^2 \exp\bigl\{ 2 x^{(1/2)+\varepsilon} \bigr\}, \quad \text{and} \quad
v(x) = 2x^2 \exp\bigl\{ (1+\varepsilon) 2^{-1/2} \pi \sigma x^{1/2} (\log\log x)^{-1/2} \bigr\}.
\]
Consider the upper bound for t(x) given in (6.6). Then for P-a.e. ω, t(x) ≤ u(x) for all but finitely many x, as follows easily from the law of the iterated logarithm, or from Theorem 2.8.1. Moreover, Lemma 6.1.9 shows that, for P-a.e. ω, t(x) ≤ v(x) for infinitely many x.

It is straightforward to verify that u and v also satisfy (6.7). Thus we may apply Lemma 6.1.8 with this choice of u and v to obtain the result.

6.1.5 Perturbation of Sinai’s regime

We finish this section by studying a non-i.i.d. environment. Again suppose that X_n has quenched transition probabilities as defined at (6.1), but now suppose that there exist constants x_0 ∈ N, a ∈ R, and β > 0 such that p_x ∈ (0,1) for x < x_0 and
\[
p_x = \xi_x + a x^{-\beta}, \quad \text{and} \quad q_x = 1 - \xi_x - a x^{-\beta}, \quad \text{for all } x \ge x_0, \tag{6.11}
\]
where ξ_0, ξ_1, ... are i.i.d. random variables on a probability space (Ω, F, P). We assume that there exists δ > 0 for which
\[
P[\delta \le \xi_0 \le 1 - \delta] = 1; \tag{6.12}
\]

this is a uniform ellipticity condition. In this section we define
\[
\zeta_x := \log\Bigl( \frac{\xi_x}{1 - \xi_x} \Bigr).
\]
We assume that E[ζ_0^2] > 0 and E ζ_0 = 0. In the case a = 0, this model reduces to the random walk in i.i.d. random environment in Sinai's regime, as considered earlier in this section, and in that case the present definition of ζ_x reduces to the usage in Section 6.1.2. For a ≠ 0, this model represents a perturbation of Sinai's regime, in which the p_x are independent but not i.i.d.

This perturbation of the homogeneous (i.i.d.) environment is motivated by analogy with the problems studied in Chapters 3 and 4, in which the homogeneous (zero-drift) regime is perturbed by an asymptotically zero drift. Our first result for this model is the following recurrence classification; whereas in the context of asymptotically zero-drift random walks a perturbation of order x^{-1} is required to effect a phase transition, the random environment is seen to be more stable to perturbations, with x^{-1/2} being the critical order.

Theorem 6.1.10. Suppose that for x_0 ∈ N, a ∈ R, δ > 0 and β > 0 the conditions (6.11) and (6.12) hold, and that E[ζ_0^2] > 0 and E ζ_0 = 0. Then:

(i) If a < 0 and β ∈ (0, 1/2), then for P-a.e. ω, Xn is transient.

(ii) If a > 0 and β ∈ (0, 1/2), then for P-a.e. ω, Xn is positive-recurrent.

(iii) If β ≥ 1/2, then for P-a.e. ω, Xn is null-recurrent.
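The mechanism behind Theorem 6.1.10 can be checked numerically: by (6.14), the perturbation contributes a term of order |a| x^{1−β} to log d_x, which for β < 1/2 swamps the order-x^{1/2} fluctuations of the ζ-sum. A minimal sketch (the law of ξ_0 and all parameter values are arbitrary illustrative choices):

```python
import math
import random

def log_d(x_max, a, beta, seed=1):
    """Accumulate the exponent in (6.14): the sum of zeta_y plus the
    perturbation term a * y^{-beta} / (xi_y (1 - xi_y))."""
    rng = random.Random(seed)
    total = 0.0
    for y in range(1, x_max):
        xi = rng.uniform(0.3, 0.7)          # uniformly elliptic, E zeta_0 = 0
        zeta = math.log(xi / (1.0 - xi))
        total += zeta + a * y ** (-beta) / (xi * (1.0 - xi))
    return total

# For beta < 1/2 the drift term of order |a| x^{1-beta} dominates the
# order-x^{1/2} fluctuations, so log d_x takes the sign of a for large x.
print(log_d(20_000, a=-0.5, beta=0.25), log_d(20_000, a=0.5, beta=0.25))
```

With the same seed the ζ-sums in the two runs coincide, so the outputs differ only through the sign of the perturbation term.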

We also present almost-sure bounds giving the rate of escape in the transient case.

Theorem 6.1.11. Suppose that for x_0 ∈ N, a < 0, δ > 0 and β ∈ (0, 1/2) the conditions (6.11) and (6.12) hold, and that E[ζ_0^2] > 0 and E ζ_0 = 0. Then for any ε > 0, for P-a.e. ω, Pω-a.s., for all but finitely many n ≥ 0,
\[
(\log\log n)^{-(1/\beta)-\varepsilon} \le \frac{X_n}{(\log n)^{1/\beta}} \le (\log\log n)^{(2/\beta)+\varepsilon}.
\]

The rest of this section is devoted to the proofs of these two theorems. First we use a similar idea to the proof of Theorem 6.1.1 to give the proof of Theorem 6.1.10.

Proof of Theorem 6.1.10. From Taylor's theorem for log(1+x), we have that
\[
\log p_y = \log \xi_y + \log(1 + a y^{-\beta} \xi_y^{-1})
= \log \xi_y + a y^{-\beta} \xi_y^{-1} + O(y^{-2\beta}), \quad \text{as } y \to \infty,
\]
where we have used (6.12) to obtain the non-random error term. Similarly,
\[
\log q_y = \log(1 - \xi_y) - a y^{-\beta} (1 - \xi_y)^{-1} + O(y^{-2\beta}).
\]
Hence, as y → ∞,
\[
\log(p_y/q_y) = \zeta_y + a y^{-\beta} \xi_y^{-1} (1 - \xi_y)^{-1} + O(y^{-2\beta}). \tag{6.13}
\]


It follows from (6.13) that, with the definition of d_x = d_x(ω) from (6.2),
\[
d_x = \exp\Bigl\{ \sum_{y=1}^{x-1} \bigl( \zeta_y + a y^{-\beta} \xi_y^{-1} (1 - \xi_y)^{-1} + O(y^{-2\beta}) \bigr) \Bigr\}. \tag{6.14}
\]

For fixed ω, Theorem 2.2.5 shows that X_n is recurrent if and only if Σ_{x=1}^∞ d_x = ∞. Suppose that a < 0 and β ∈ (0, 1/2). Then, from (6.14),
\[
\log d_x \le \sum_{y=1}^{x-1} \zeta_y - C |a| x^{1-\beta} + O(x^{1-2\beta}),
\]
where C = C(δ, β) is a positive constant that does not depend on ω. Here, by the law of the iterated logarithm (or Theorem 2.8.1), P-a.s., Σ_{y=1}^{x−1} ζ_y ≤ x^{1/2} log x for all but finitely many x. Thus, since 1 − β > 1/2, we have log d_x → −∞, P-a.s., and hence X_n is transient, proving part (i).

For part (ii), Theorem 2.2.5 shows that X_n is positive recurrent if and only if Σ_{x=1}^∞ d_x^{-1} < ∞. Suppose that a > 0 and β ∈ (0, 1/2). Then, from (6.14),
\[
\log d_x^{-1} \le \sum_{y=1}^{x-1} (-\zeta_y) - C a x^{1-\beta} + O(x^{1-2\beta}),
\]
where C = C(δ, β) is a positive constant that does not depend on ω. Hence, similarly to part (i), log d_x^{-1} → −∞, P-a.s., and X_n is positive recurrent, proving part (ii).

Finally, for part (iii), suppose that β ≥ 1/2. Then, from (6.14),
\[
\log d_x \ge \sum_{y=1}^{x-1} \zeta_y - C |a| x^{1/2} + O(\log x),
\]
and, by the law of the iterated logarithm, there is a constant c > 0 depending only on E[ζ_0^2] such that, P-a.s., Σ_{y=1}^{x−1} ζ_y ≥ c x^{1/2}(log log x)^{1/2} for infinitely many x. It follows that lim sup_{x→∞} log d_x = +∞ for P-a.e. ω, so that X_n is recurrent. A similar argument shows that lim sup_{x→∞} log d_x^{-1} = +∞ for P-a.e. ω, so that X_n is not positive recurrent.

Now we turn to the proof of Theorem 6.1.11. As in Section 6.1.3, we will use the Lyapunov function t defined at (6.3).

Lemma 6.1.12. Suppose that for x_0 ∈ N, a < 0, δ > 0 and β ∈ (0, 1/2) the conditions (6.11) and (6.12) hold, and that E[ζ_0^2] > 0 and E ζ_0 = 0. For any ε > 0, for P-a.e. ω, for all but finitely many x,
\[
\exp\bigl( x^{\beta} (\log x)^{-2-\varepsilon} \bigr) \le t(x) \le \exp\bigl( x^{\beta} (\log x)^{1+\varepsilon} \bigr). \tag{6.15}
\]

Page 307: Non-homogeneous random walks - Unicamp

6.1. Random walk in random environment 289

To obtain the bounds in Lemma 6.1.12 we need two technical results.

Lemma 6.1.13. Let Y_0, Y_1, ... be independent, uniformly bounded random variables with E Y_n = 0 and E[Y_n^2] = σ^2 ∈ (0,∞) for all n.

(i) There exists C ∈ R_+ such that, a.s., for all but finitely many i,
\[
\Bigl| \sum_{k=i-j+1}^{i} Y_k \Bigr| \le C j^{1/2} (\log i)^{1/2}, \quad \text{for all } j \in \{1, 2, \ldots, i\}.
\]

(ii) Let b : [1,∞) → R_+ be a non-decreasing function such that for some γ > 0 and x_0 ∈ R_+, x^γ ≤ b(x) ≤ x for all x ≥ x_0. Then for any ε > 0, a.s., for all but finitely many n,
\[
\max_{1 \le i \le n} \max_{1 \le j \le b(i)} \sum_{k=i-j+1}^{i} Y_k \ge (b(n/2))^{1/2} (\log n)^{-1-\varepsilon}.
\]

Proof. First we prove part (i). Set Z_j^i = Σ_{k=i−j+1}^{i} Y_k. Then Z_{j+1}^i − Z_j^i = Y_{i−j}; so, for fixed i, Z_j^i is a martingale over j ∈ {1, 2, ..., i} with uniformly bounded increments. Hence the Azuma–Hoeffding inequality (Theorem 2.4.14) implies that for some c ∈ R_+, for all j ∈ {1, ..., i}, for t > 0,
\[
P[|Z_j^i| \ge t] \le 2 \exp(-c j^{-1} t^2).
\]
Thus for a suitable C ∈ R_+, for j ≤ i, P[|Z_j^i| ≥ C j^{1/2}(log i)^{1/2}] ≤ i^{-3}. Then
\[
\sum_{i=1}^{\infty} \sum_{j=1}^{i} P[|Z_j^i| \ge C j^{1/2} (\log i)^{1/2}] \le \sum_{i=1}^{\infty} i^{-2} < \infty,
\]
and the Borel–Cantelli lemma implies that, a.s., there are only finitely many pairs (i, j) (with j ≤ i) for which |Z_j^i| ≥ C j^{1/2}(log i)^{1/2}. This proves (i).

Next we prove part (ii). For fixed i, note that
\[
\max_{1 \le j \le b(i)} \sum_{k=i-j+1}^{i} Y_k \overset{d}{=} \max_{1 \le j \le b(i)} \sum_{k=1}^{j} W_k,
\]
where W_1, W_2, ... are independent random variables with W_k equal in distribution to Y_{i+1−k} for each k. Fix ε > 0. Define the event
\[
E_i := \Bigl\{ \max_{1 \le j \le b(i)} \sum_{k=i-j+1}^{i} Y_k \le (b(i))^{1/2} (\log b(i))^{-1-\varepsilon} \Bigr\}.
\]


Then Corollary 1 of Hirsch [114] implies that there are absolute constants C, C′ ∈ (0,∞) such that for all i ≥ x_0,
\[
P[E_i] \le C (\log b(i))^{-1-\varepsilon} \le C' (\log i)^{-1-\varepsilon},
\]
since b(i) ≥ i^γ. Consider the subsequence i = 2^m for m = 1, 2, .... Then
\[
\sum_{m=1}^{\infty} P[E_{2^m}] \le C \sum_{m=1}^{\infty} m^{-1-\varepsilon} < \infty.
\]

Hence by the Borel–Cantelli lemma, E_{2^m} occurs for only finitely many m, a.s. Each n ≥ 2 satisfies 2^{m(n)} ≤ n < 2^{m(n)+1}, with m(n) ∈ N such that m(n) → ∞ as n → ∞. Then,
\[
\max_{1 \le i \le n} \max_{1 \le j \le b(i)} \sum_{k=i-j+1}^{i} Y_k
\ge \max_{1 \le i \le 2^{m(n)}} \max_{1 \le j \le b(i)} \sum_{k=i-j+1}^{i} Y_k
\ge \max_{1 \le j \le b(2^{m(n)})} \sum_{k=2^{m(n)}-j+1}^{2^{m(n)}} Y_k
\ge \bigl( b(2^{m(n)}) \bigr)^{1/2} \bigl( \log(2^{m(n)}) \bigr)^{-1-\varepsilon},
\]
a.s., for all n large enough. Since n ≥ 2^{m(n)} > n/2, part (ii) follows.

Proof of Lemma 6.1.12. First we prove the upper bound in (6.15). Throughout this proof, C and y_0 will denote finite positive constants that depend only on δ and β, and whose value may change from line to line. Since a < 0 and E ζ_0 = 0, we have from (6.13) with (6.12) that
\[
\mathrm{E} \log(p_y/q_y) \le -C y^{-\beta}, \quad \text{for all } y \ge y_0.
\]

Using the simple inequality
\[
\sum_{w=y-z+1}^{y} w^{-\beta} \ge z \min_{y-z+1 \le w \le y} w^{-\beta} = z y^{-\beta}, \tag{6.16}
\]
we obtain that for all y, z with y − z ≥ y_0,
\[
\mathrm{E} \sum_{w=y-z+1}^{y} \log(p_w/q_w) \le -C z y^{-\beta}. \tag{6.17}
\]
By Lemma 6.1.13(i) we have that, for some C ∈ R_+, for P-a.e. ω, all but finitely many y, and all z ∈ {1, 2, ..., y},
\[
\sum_{w=y-z+1}^{y} \bigl( \log(p_w/q_w) - \mathrm{E} \log(p_w/q_w) \bigr) \le C z^{1/2} (\log y)^{1/2}. \tag{6.18}
\]


For ε > 0 set r(y) := ⌈y^{2β}(log y)^{1+ε}⌉. Then from (6.18) with (6.17), for P-a.e. ω, for all but finitely many y ∈ N and all z with r(y) ≤ z ≤ y,
\[
\sum_{w=y-z+1}^{y} \log(p_w/q_w) \le -C z y^{-\beta} + C z^{1/2} (\log y)^{1/2} \le -(C/2) z y^{-\beta}. \tag{6.19}
\]
On the other hand, for all but finitely many y ∈ N and all z with z ≤ r(y),
\[
\sum_{w=y-z+1}^{y} \log(p_w/q_w) \le C z^{1/2} (\log y)^{1/2}. \tag{6.20}
\]

So from (6.3) with (6.19) and (6.20) we obtain, for P-a.e. ω, for any ε > 0, for all but finitely many x,
\[
t(x) \le \sum_{y=0}^{x-1} \sum_{z=0}^{r(y)} \exp\bigl( C z^{1/2} (\log y)^{1/2} \bigr) + \sum_{y=0}^{x-1} \sum_{z=r(y)}^{y} \exp\bigl( -C z y^{-\beta} \bigr)
\le \sum_{y=0}^{x-1} \exp\bigl( C y^{\beta} (\log y)^{1+\varepsilon} \bigr),
\]
which gives the upper bound for t(x) in (6.15).

We now prove the lower bound in (6.15). For ε > 0 set s(y) := ⌊y^{2β}(log y)^{−2−ε}⌋; since β < 1/2 we have s(y) < y. Then,
\[
t(x) \ge \max_{1 \le y \le x-1} \max_{1 \le z \le s(y)} \exp\Bigl\{ \sum_{w=y-z+1}^{y} \log(p_w/q_w) \Bigr\}. \tag{6.21}
\]

Now we have from (6.13) with (6.12) that, for P-a.e. ω,
\[
\log(p_y/q_y) \ge \zeta_y - C y^{-\beta}, \quad \text{for all } y \ge y_0.
\]
Hence, for y − z ≥ y_0,
\[
\sum_{w=y-z+1}^{y} \log(p_w/q_w) \ge \sum_{w=y-z+1}^{y} \zeta_w - C \bigl( y^{1-\beta} - (y-z)^{1-\beta} \bigr). \tag{6.22}
\]
Taylor's theorem implies that for β ∈ (0, 1),
\[
y^{1-\beta} - (y-z)^{1-\beta} \le C z y^{-\beta}, \quad \text{for all } y - z \ge y_0. \tag{6.23}
\]


It follows from (6.21) with (6.22) and (6.23) that for all x ≥ 1,
\[
t(x) \ge \exp\Bigl\{ \max_{1 \le y \le x-1} \max_{1 \le z \le s(y)} \sum_{w=y-z+1}^{y} \zeta_w - C s(x) x^{-\beta} \Bigr\}, \tag{6.24}
\]
using the fact that y^{−β} s(y) is eventually increasing for β > 0. By Lemma 6.1.13(ii), we have that for any ε > 0, for P-a.e. ω,
\[
\max_{1 \le y \le x-1} \max_{1 \le z \le s(y)} \sum_{w=y-z+1}^{y} \zeta_w \ge (s(x/2))^{1/2} (\log x)^{-1-(\varepsilon/4)}
\ge C x^{\beta} (\log x)^{-2-(3\varepsilon/4)},
\]
for all but finitely many x, while s(x) x^{−β} ≤ x^{β}(log x)^{−2−ε}. Hence (6.24) implies the lower bound in (6.15).

Now we can complete the proof of Theorem 6.1.11. For the upper bound in the theorem, we once more use Lemma 6.1.4. The lower bound needs an additional idea. The idea of Lemma 6.1.8 was to obtain an upper bound on hitting times, which could be translated to a lower bound that was valid infinitely often. In order to extend this technique to the transient case, and obtain a lower bound valid all but finitely often, we show in addition that (roughly speaking), in the present case, the time of the last visit of the random walk to a site is not too much greater than the first hitting time; a similar approach was used in Theorem 3.10.1 for the transient Lamperti problem.

Proof of Theorem 6.1.11. The lower bound in (6.15) implies that for any ε > 0, for P-a.e. ω, for all x sufficiently large,
\[
t(x) \ge \exp\bigl( x^{\beta} (\log x)^{-2-\varepsilon} \bigr) =: w(x).
\]
Hence Lemma 6.1.4 implies that for any ε > 0, for P-a.e. ω, Pω-a.s.,
\[
X_n \le w^{-1}(n^{1+\varepsilon}) \le (\log n)^{1/\beta} (\log\log n)^{(2+2\varepsilon)/\beta},
\]
for all but finitely many n. This proves the claimed upper bound.

It remains to prove the lower bound in the theorem. For fixed ω, let a_x denote the probability that the random walk X_n hits x in finite time, given that it starts at 2x. For x ≥ 1 define
\[
M_x := 1 + \sum_{y=1}^{\infty} \prod_{z=1}^{y} \frac{p_{x+z}}{q_{x+z}}
= 1 + \sum_{y=1}^{\infty} \exp\Bigl\{ \sum_{z=1}^{y} \log\Bigl( \frac{p_{x+z}}{q_{x+z}} \Bigr) \Bigr\}. \tag{6.25}
\]


Standard hitting probability arguments yield a_0 = 1, and, if M_x < ∞,
\[
a_x = M_x^{-1} \sum_{y=x}^{\infty} \prod_{z=1}^{y} \frac{p_{x+z}}{q_{x+z}}
= M_x^{-1} \sum_{y=x}^{\infty} \exp\Bigl\{ \sum_{z=1}^{y} \log\Bigl( \frac{p_{x+z}}{q_{x+z}} \Bigr) \Bigr\}, \quad x \ge 1. \tag{6.26}
\]
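For a concrete feel for (6.25)–(6.26), the sketch below computes truncated versions of M_x and a_x for a toy environment in which the quenched log-ratios carry a strictly negative drift, so the series converge rapidly. The environment law, the drift constant, and the truncation level are all arbitrary choices made for this illustration.

```python
import math
import random

random.seed(5)

beta = 0.25
# Toy quenched log-ratios log(p_y/q_y): mean-zero noise plus a negative
# drift of order y^{-beta}, mimicking (6.13) with a < 0.
log_ratio = [random.uniform(-0.3, 0.3) - 0.5 * (y + 1) ** (-beta)
             for y in range(200_000)]

def M_and_a(x, y_max=100_000):
    """Truncated versions of M_x from (6.25) and a_x from (6.26)."""
    partial, M, tail = 0.0, 1.0, 0.0
    for y in range(1, y_max):
        partial += log_ratio[x + y]      # sum_{z=1}^{y} log(p_{x+z}/q_{x+z})
        term = math.exp(min(partial, 700.0))   # overflow guard for pathological draws
        M += term
        if y >= x:
            tail += term
    return M, tail / M

for x in (10, 100, 1000):
    M, a_x = M_and_a(x)
    assert 0.0 <= a_x < 1.0              # a_x is a probability
    print(x, M, a_x)
```

As the proof below establishes under (6.27)–(6.28), the hitting probabilities a_x decay rapidly in x once the drift term dominates.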

Since a < 0 and E ζ_0 = 0, we have from (6.13) with (6.12) that
\[
\mathrm{E} \sum_{z=1}^{y} \log(p_{x+z}/q_{x+z}) \le -c \sum_{z=1}^{y} (x+z)^{-\beta} \le -c' y^{1-\beta}, \quad \text{for all } y \ge x \ge x_0, \tag{6.27}
\]

for positive constants c, c′, and x_0. Also, by the Azuma–Hoeffding inequality (Theorem 2.4.14) and an argument similar to Lemma 6.1.13, we have that, for P-a.e. ω,
\[
\sum_{z=1}^{y} \bigl( \log(p_{x+z}/q_{x+z}) - \mathrm{E}[\log(p_{x+z}/q_{x+z})] \bigr) \le C y^{1/2} (\log y)^{1/2},
\]
for all but finitely many (x, y) with y ≥ x. Together with (6.27) this shows that for P-a.e. ω, for all but finitely many (x, y) with y ≥ x,
\[
\sum_{z=1}^{y} \log(p_{x+z}/q_{x+z}) \le -C y^{1-\beta} + C y^{1/2} (\log y)^{1/2} \le -(C/2) y^{1-\beta}, \tag{6.28}
\]

since β ∈ (0, 1/2). Hence, for P-a.e. ω, from (6.25) with (6.12) and (6.28), for x ≥ 1 and a constant C_δ depending only on δ,
\[
M_x \le 1 + C_\delta^{x} + \sum_{y=x}^{\infty} \exp(-C y^{1-\beta}) < \infty.
\]
Further, since M_x ≥ 1 for all x, (6.26) and (6.28) imply, for P-a.e. ω, for all but finitely many x,
\[
a_x \le \sum_{y=x}^{\infty} \exp(-C y^{1-\beta}) \le \exp(-C' x^{1-\beta}),
\]
for some C′ ∈ (0,∞). Thus, for P-a.e. ω, Σ_x a_x < ∞.

The Borel–Cantelli lemma then implies that, for P-a.e. ω, Pω-a.s., for only finitely many sites x does X_n return to x after visiting 2x. Denoting by η_x the time of the last visit of X_n to x, we then have that η_x ≤ τ_{2x}, Pω-a.s., for all but finitely many x. Write u(x) := exp(x^β(log x)^{1+ε}) for the upper bound in (6.15), so that t(x) ≤ u(x) for all but finitely many x. Then, since E_ω τ_x = t(x), Markov's inequality gives Σ_{x≥1} P_ω[τ_x ≥ x^2 t(x)] ≤ Σ_{x≥1} x^{−2} < ∞, so another application of the Borel–Cantelli lemma shows that for P-a.e. ω, Pω-a.s., for all x ≥ x_0 with a finite (random) x_0,
\[
\eta_x \le \tau_{2x} \le 4 x^2 u(2x). \tag{6.29}
\]

Moreover, since, for P-a.e. ω, X_n is transient, we have that X_n ≥ x_0 for all n sufficiently large. Hence from (6.29), using the fact that η_{X_n} ≥ n for all n, we have that for P-a.e. ω, Pω-a.s., for all but finitely many n,
\[
n \le \eta_{X_n} \le 4 X_n^2 u(2X_n).
\]
Then, with the upper bound in (6.15), and using the upper bound on X_n already established to control the logarithmic term (so that log X_n = O(log log n)), we obtain, for any ε > 0, for P-a.e. ω, Pω-a.s., n < exp{X_n^β (log log n)^{1+ε}}, for all but finitely many n. Rearranging gives the claimed lower bound.

6.2 Random strings in random environment

6.2.1 Introduction

Consider a finite alphabet A = {1, ..., d}. A string is a finite sequence of symbols from A. We write |s| for the length of the string s, i.e., if s = s_1 ... s_ℓ, with each s_i ∈ A, then |s| = ℓ.

In this section we define a time-homogeneous Markov chain (X_n, n ≥ 0) whose state space is the set of all finite strings over A. The transitions of this Markov chain are as follows. If the current state is the string X_n = s = s_1 ... s_ℓ, whose rightmost symbol is s_ℓ = i ∈ A, then a transition to X_{n+1} is of one of the following three kinds:

• erase the rightmost symbol of s, with probability r_i^ℓ;

• substitute the rightmost symbol i by j, with probability q_{ij}^ℓ;

• append the symbol j to the right end of the string, with probability p_{ij}^ℓ.

Of course we assume that
\[
r_i^\ell \ge 0, \quad q_{ij}^\ell \ge 0, \quad p_{ij}^\ell \ge 0, \quad \text{for all } i, j \in A \text{ and all } \ell \in \mathbb{N}; \tag{6.30}
\]
\[
r_i^\ell + \sum_{j \in A} q_{ij}^\ell + \sum_{j \in A} p_{ij}^\ell = 1, \quad \text{for all } i \in A \text{ and all } \ell \in \mathbb{N}. \tag{6.31}
\]


The specification of the Markov chain is completed by describing the possible transitions from the empty string ∅ (the string of length zero): for definiteness, we suppose that the empty string can transition only to a string of length 1, with probabilities given by p_j^0, j ∈ A.

The asymptotic behaviour of the string will then be determined by the collection of parameters r_i^ℓ, q_{ij}^ℓ, p_{ij}^ℓ, over ℓ ∈ N, i ∈ A, j ∈ A. These parameters constitute the environment for the string dynamics.

The environment will itself be random. Let P_A denote the set of all admissible (r_i^ℓ, q_{ij}^ℓ, p_{ij}^ℓ)_{i,j}, i.e., for fixed ℓ the set of all parameter choices satisfying (6.30) and (6.31). On a probability space (Ω, F, P), let ξ_1, ξ_2, ... be i.i.d. random elements of P_A. Then for each ℓ, ξ_ℓ specifies the parameters r_i^ℓ, q_{ij}^ℓ, p_{ij}^ℓ for all i, j ∈ A; we will denote the realization of the environment by ω = (ξ_1, ξ_2, ...). This defines the Markov chain describing the evolution of the string, for each fixed environment ω.

Similarly to Section 6.1, we will study this model by choosing a suitable Lyapunov function f(s) = f(s, ω) of the state s and the environment ω.

Remark 6.2.1. The strings model includes the random walk in random environment model of Section 6.1 as a special case. Indeed, if d = 1, then the process tracking the length of the string |X_n| becomes a nearest-neighbour random walk on Z_+ with quenched transition probabilities, for ℓ ∈ N,
\[
P_\omega[\,|X_{n+1}| = \ell + 1 \mid |X_n| = \ell\,] = p_{11}^\ell; \quad
P_\omega[\,|X_{n+1}| = \ell \mid |X_n| = \ell\,] = q_{11}^\ell; \quad
P_\omega[\,|X_{n+1}| = \ell - 1 \mid |X_n| = \ell\,] = r_1^\ell.
\]
Note that for general d ≥ 2, the process |X_n| is not Markov under P_ω.
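To make the dynamics concrete, here is a minimal simulation sketch for d = 2 with an ℓ-independent (hence deterministic) environment. The particular probabilities are arbitrary choices that merely satisfy (6.30)–(6.31), and the uniform choice of the p_j^0 at the empty string is likewise an assumption of this sketch.

```python
import random

random.seed(3)

A = (1, 2)                       # alphabet, d = 2
r = {1: 0.3, 2: 0.2}             # erase probabilities r_i
q = {(1, 1): 0.0, (1, 2): 0.1, (2, 1): 0.2, (2, 2): 0.0}   # substitute q_ij
p = {(1, 1): 0.3, (1, 2): 0.3, (2, 1): 0.3, (2, 2): 0.3}   # append p_ij

for i in A:                      # sanity check of (6.31)
    assert abs(r[i] + sum(q[i, j] + p[i, j] for j in A) - 1.0) < 1e-12

def step(s):
    """One transition of the string chain; from the empty string, append a
    uniform symbol (a particular choice of the p_j^0 of the text)."""
    if not s:
        return [random.choice(A)]
    i, u = s[-1], random.random()
    if u < r[i]:
        return s[:-1]                            # erase
    u -= r[i]
    for j in A:
        if u < q[i, j]:
            return s[:-1] + [j]                  # substitute
        u -= q[i, j]
    for j in A:
        if u < p[i, j]:
            return s + [j]                       # append
        u -= p[i, j]
    return s + [A[-1]]                           # numerical slack

s, lengths = [], []
for _ in range(5000):
    s = step(s)
    lengths.append(len(s))
print(max(lengths), len(s))
```

With these parameters the total append probability exceeds the erase probability, so the string tends to grow; the recurrence classification below makes this precise via the top Lyapunov exponent.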

6.2.2 Lyapunov exponents and products of random matrices

In this section we review some properties of products of random matrices that we need below. Recall that for a matrix A we denote its transpose by A^⊤ and its largest eigenvalue by λ_max(A), and that for a d × d real matrix A, its operator norm is
\[
\|A\|_{\mathrm{op}} = \sup_{u \in \mathbb{S}^{d-1}} \|Au\| = \bigl( \lambda_{\max}(A^\top A) \bigr)^{1/2}.
\]
On a probability space (Ω, F, P), we say that a d × d random real matrix A satisfies condition (N) if:

(N) E log^+ ‖A‖_op < ∞, where log^+ x = max{log x, 0}.


Let A_1, A_2, ... be a sequence of i.i.d. matrices satisfying condition (N). Let a_1(n) ≥ a_2(n) ≥ ··· ≥ a_d(n) ≥ 0 be the square roots of the (random) eigenvalues of (A_n ··· A_1)^⊤(A_n ··· A_1). Then there exist constants γ_j(A), j ∈ {1, ..., d}, with
\[
-\infty \le \gamma_d(A) \le \cdots \le \gamma_1(A) < \infty,
\]
depending on the distribution of A, such that for each j ∈ {1, ..., d},
\[
P\Bigl[ \lim_{n\to\infty} \frac{1}{n} \log a_j(n) = \gamma_j(A) \Bigr] = 1; \tag{6.32}
\]
see Proposition 5.6 of [25] (note that A does not need to be invertible). The numbers γ_j(A) are called the Lyapunov exponents of the sequence of random matrices A_n. In particular,
\[
\gamma_1(A) = \lim_{n\to\infty} \frac{1}{n} \log a_1(n) = \lim_{n\to\infty} \frac{1}{n} \log \|A_n \cdots A_1\|_{\mathrm{op}}, \quad \text{P-a.s.,} \tag{6.33}
\]
is the top Lyapunov exponent.
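The limit (6.33) suggests a standard numerical estimator for γ_1: repeatedly multiply a test vector by the A_n, renormalising at each step and accumulating log-norms (for a typical starting vector this recovers the top exponent). The sketch below is an illustration; the matrix law is an arbitrary test case, chosen so that γ_1 = log c is known exactly, since each A_n is c times a rotation and products of such matrices have operator norm exactly c^n.

```python
import math
import random

def mat_vec(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def top_lyapunov(sample_matrix, n, seed=0):
    """Estimate gamma_1 = lim (1/n) log ||A_n ... A_1 v|| as in (6.33) by
    driving a test vector and renormalising it to avoid overflow."""
    rng = random.Random(seed)
    v, total = (1.0, 0.0), 0.0
    for _ in range(n):
        v = mat_vec(sample_matrix(rng), v)
        norm = math.hypot(v[0], v[1])
        total += math.log(norm)
        v = (v[0] / norm, v[1] / norm)
    return total / n

c = 1.5
def scaled_rotation(rng):
    # A = c * R(theta): products of these have operator norm exactly c^n,
    # so gamma_1 = log c, a convenient known test case.
    t = rng.uniform(0.0, 2.0 * math.pi)
    return ((c * math.cos(t), -c * math.sin(t)),
            (c * math.sin(t), c * math.cos(t)))

print(top_lyapunov(scaled_rotation, 2000), math.log(c))
```

For general matrix laws the estimate converges only at rate O(1/n) with random fluctuations, so longer products are needed in practice.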

6.2.3 Recurrence classification

To describe our results on recurrence and transience, we need some more notation. With p_{ij}^ℓ, q_{ij}^ℓ and r_i^ℓ as defined in Section 6.2.1, we write P_ℓ := (p_{ij}^ℓ)_{i,j∈A} and D_ℓ := (d_{ij}^ℓ)_{i,j∈A}, where d_{ij}^ℓ = −q_{ij}^ℓ for i ≠ j and
\[
d_{ii}^\ell = r_i^\ell + \sum_{j \in A \setminus \{i\}} q_{ij}^\ell.
\]
Thus P_1, P_2, ... and D_1, D_2, ... are i.i.d. sequences of random d × d matrices on (Ω, F, P). We will impose the following assumption.

(D) E log(1/r_i^ℓ) < ∞, for all i ∈ A.

We will show shortly (in Proposition 6.2.2) that condition (D) ensures that D_ℓ is P-a.s. invertible; then we define the sequence of i.i.d. random matrices A_ℓ, given by
\[
A_\ell := D_\ell^{-1} P_\ell, \tag{6.34}
\]
and we denote the associated Lyapunov exponents by γ_1, ..., γ_d. The relevance of condition (D) is displayed by the following result.


Proposition 6.2.2. If (D) holds, then the matrices D_ℓ are P-a.s. invertible, with E log^+ ‖D_ℓ^{-1}‖_op < ∞, and the associated matrices A_ℓ defined by (6.34) satisfy condition (N).

Proof. Suppose that (D) holds. Then, P-a.s., r_i^ℓ > 0 for all i ∈ A, and then any non-zero vector v ∈ R^d has a non-zero image D_ℓ v; thus the kernel of the map v ↦ D_ℓ v is trivial, and hence D_ℓ^{-1} exists and the linear map is bijective. Indeed, let v_i denote the component of v with greatest absolute value; without loss of generality, suppose v_i > 0. Then we have
\[
(D_\ell v)_i = r_i^\ell v_i + \sum_{j \ne i} q_{ij}^\ell (v_i - v_j) \ge r_i^\ell v_i > 0.
\]
Moreover,
\[
\|D_\ell v\| \ge (D_\ell v)_i \ge r_i^\ell v_i \ge d^{-1/2} r_i^\ell \|v\|.
\]
Thus if u = D_ℓ v we have ‖D_ℓ^{-1} u‖ ≤ d^{1/2}(1/r_i^ℓ)‖u‖, and hence ‖D_ℓ^{-1}‖_op ≤ d^{1/2} max_{i∈A}(1/r_i^ℓ). Therefore
\[
\mathrm{E} \log^+ \|D_\ell^{-1}\|_{\mathrm{op}} \le \tfrac{1}{2} \log d + \sum_{i \in A} \mathrm{E} \log(1/r_i^\ell) < \infty.
\]
The final claim in the proposition follows from this, together with the fact that ‖A_ℓ‖_op = ‖D_ℓ^{-1} P_ℓ‖_op ≤ ‖D_ℓ^{-1}‖_op ‖P_ℓ‖_op and the fact that the elements of P_ℓ are uniformly bounded.

Now we are ready to formulate our results. Roughly speaking, the top Lyapunov exponent γ_1 determines the recurrence or transience of the process. First we give a sufficient condition for transience. Note that (by Proposition 6.2.2), under the hypotheses of Theorem 6.2.3 both D_ℓ and P_ℓ are P-a.s. invertible, and hence so is A_ℓ as defined by (6.34).

Theorem 6.2.3. Suppose that (D) holds. Suppose also that P_ℓ is P-a.s. invertible, and that A_ℓ^{-1} satisfies condition (N). If γ_1 > 0, then for P-a.e. ω the Markov chain X_n is transient.

Next we give a sufficient condition for positive-recurrence.

Theorem 6.2.4. Suppose that (D) holds. If γ_1 < 0, then for P-a.e. ω the Markov chain X_n is positive-recurrent.

The final result covers the case γ1 = 0.

Theorem 6.2.5. Suppose that (D) holds. Suppose also that A_ℓ is P-a.s. invertible, and that no finite union of proper subspaces of R^d is a.s. stabilized by A_ℓ. If γ_1 = 0, then for P-a.e. ω the Markov chain X_n is recurrent.


6.2.4 Proofs

We will need the following simple lemma, which establishes relations between the Lyapunov exponents of A and A^{-1}.

Lemma 6.2.6. Suppose that A is P-a.s. invertible, and that both A and A^{-1} satisfy condition (N). Then for j ∈ {1, ..., d},
\[
\gamma_j(A^{-1}) = -\gamma_{d-j+1}(A). \tag{6.35}
\]

Proof. Let b_1(n) ≥ ··· ≥ b_d(n) be the square roots of the eigenvalues of
\[
(A_n^{-1} \cdots A_1^{-1})^\top (A_n^{-1} \cdots A_1^{-1}) = \bigl( (A_1 \cdots A_n)(A_1 \cdots A_n)^\top \bigr)^{-1}.
\]
Since UV has the same eigenvalues as VU, it follows that (b_1(n))^{-1}, ..., (b_d(n))^{-1} are the square roots of the eigenvalues of (A_1 ··· A_n)^⊤(A_1 ··· A_n). But the product A_1 ··· A_n has the same law as the product A_n ··· A_1. So for each j ∈ {1, ..., d}, (b_j(n))^{-1} has the same law as a_{d−j+1}(n), and consequently γ_j(A^{-1}) = −γ_{d−j+1}(A).

We will also need the following classical theorem [226].

Theorem 6.2.7 (Oseledets multiplicative ergodic theorem). Let A_1, A_2, ... be a sequence of i.i.d. copies of a matrix A satisfying condition (N). Let γ_1 ≥ γ_2 ≥ ··· ≥ γ_d be the Lyapunov exponents of the sequence A_n. Consider the strictly increasing non-random sequence of integers 1 = i_1 < i_2 < ··· < i_p < i_{p+1} = d + 1 such that γ_{i_{q-1}} > γ_{i_q} for q = 2, 3, ..., p, and γ_i = γ_j if i_q ≤ i, j < i_{q+1} (the i_q mark the points of decrease of the γ_i). Then, P-a.s.:

(i) for every v ∈ R^d,
\[
\lim_{n\to\infty} \frac{1}{n} \log \|A_n \cdots A_1 v\|
\]
exists or is −∞;

(ii) for q ≤ p,
\[
V(q, \omega) = \Bigl\{ v \in \mathbb{R}^d : \lim_{n\to\infty} \frac{1}{n} \log \|A_n \cdots A_1 v\| \le \gamma_{i_q} \Bigr\}
\]
is a random linear subspace of R^d with dimension d − i_q + 1 (we have V(1, ω) = R^d);


(iii) for q ≥ 1, v ∈ V(q, ω) \ V(q + 1, ω) implies that
\[
\lim_{n\to\infty} \frac{1}{n} \log \|A_n \cdots A_1 v\| = \gamma_{i_q}.
\]

Now we can give the proof of our transience result.

Proof of Theorem 6.2.3. Suppose that (D) holds, that P_ℓ is P-a.s. invertible, and that A_ℓ^{-1} satisfies condition (N). We construct a Lyapunov function h(s) and apply Corollary 2.5.10 to deduce transience. We show that there exists a sequence of column vectors v^ℓ = (v_1^ℓ, ..., v_d^ℓ)^⊤, depending on the environment ω, such that the function h(s) of the string s = s_1 ... s_{|s|} can be defined as
\[
h(s) = \sum_{j=1}^{|s|} v_{s_j}^{j}. \tag{6.36}
\]

Indeed, for h(s) defined by (6.36) we have, for s with |s| = ℓ and s_ℓ = i ∈ A,
\[
\mathrm{E}_\omega[h(X_{n+1}) - h(X_n) \mid X_n = s]
= -r_i^\ell v_i^\ell + \sum_{j \in A} q_{ij}^\ell (-v_i^\ell + v_j^\ell) + \sum_{j \in A} p_{ij}^\ell v_j^{\ell+1}
= \bigl( -D_\ell v^\ell + P_\ell v^{\ell+1} \bigr)_i, \tag{6.37}
\]
where the matrices P_ℓ and D_ℓ are as introduced in Section 6.2.3.

If h(s) is to satisfy the conditions of Corollary 2.5.10, we require the right-hand side of (6.37) to vanish for all non-empty strings s and all i. Thus we require the following relation between v^ℓ and v^{ℓ+1}:
\[
v^{\ell+1} = P_\ell^{-1} D_\ell v^\ell = A_\ell^{-1} v^\ell. \tag{6.38}
\]
From (6.38) it follows that we may choose
\[
v^{\ell+1} = A_\ell^{-1} \cdots A_1^{-1} v^1, \quad \text{for all } \ell \ge 1,
\]
for some vector v^1.

According to the q = p case of Theorem 6.2.7 applied to the sequence A_1^{-1}, A_2^{-1}, ..., for P-a.e. environment ω there exists a non-zero random vector v^1 = v^1(ω) such that
\[
\lim_{\ell\to\infty} \frac{1}{\ell} \log \|v^{\ell+1}\| = \lim_{\ell\to\infty} \frac{1}{\ell} \log \|A_\ell^{-1} \cdots A_1^{-1} v^1\| = \gamma_d(A^{-1}) = -\gamma_1, \tag{6.39}
\]


using Lemma 6.2.6 for the final equality. From (6.39), if γ_1 > 0 it follows that there exists a positive constant C_1 = C_1(ω) such that ‖v^ℓ‖ ≤ C_1 exp{−γ_1 ℓ/2} for all ℓ ≥ 1. Thus from (6.36), we find that
\[
\sup_s |h(s)| \le \sum_{j=1}^{\infty} \|v^j\| \le \frac{C_1 \exp\{-\gamma_1/2\}}{1 - \exp\{-\gamma_1/2\}} < \infty.
\]
So this function satisfies all the conditions of Corollary 2.5.10, which yields transience.

Remark 6.2.8. The string Lyapunov function h defined at (6.36) bears a close analogy to the nearest-neighbour random walk function of Section 2.2 as defined at (2.7). The string function is a sum of products of the matrices A_n^{-1}, while the walk function is a sum of products of the scalars p_n/q_n.

To prove Theorem 6.2.4, we need one supplementary fact.

Lemma 6.2.9. P-a.s., all entries of the matrix D_1^{-1} are non-negative.

Proof. It suffices to prove that any vector v = (v_1, ..., v_d)^⊤ with at least one negative component has an image D_1 v with at least one negative component. Assume first that
\[
d_{ii}^1 > -\sum_{j \in A \setminus \{i\}} d_{ij}^1, \quad \text{for all } i \in A. \tag{6.40}
\]
In this case the statement is easily checked: (D_1 v)_i < 0 if v_i = min_j v_j < 0. In the general case, D_1 is a limit of matrices satisfying (6.40); their inverses are non-negative, and hence so is D_1^{-1}, by continuity.
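Lemma 6.2.9 can be sanity-checked numerically. The sketch below draws random admissible parameters (the parameter law is an arbitrary choice with all r_i bounded away from 0), forms D, and computes the columns of D^{-1} by solving D x = e_j with Gaussian elimination; no pivoting is needed since D is strictly diagonally dominant when every r_i > 0. The columns come out non-negative, as the lemma asserts.

```python
import random

random.seed(11)
d = 4

def random_environment():
    """Draw r_i > 0 and q_ij >= 0 with r_i + sum_j q_ij + sum_j p_ij = 1,
    cf. (6.30)-(6.31); the append-probabilities p_ij take the leftover mass
    and are not needed to form D."""
    r, q = [], []
    for i in range(d):
        w = [random.uniform(0.1, 1.0) for _ in range(2 * d + 1)]
        total = sum(w)
        r.append(w[0] / total)
        q.append([0.0 if j == i else w[1 + j] / total for j in range(d)])
    return r, q

def solve(M, b):
    """Gaussian elimination without pivoting; safe here because D is
    strictly diagonally dominant when every r_i > 0."""
    M = [row[:] for row in M]
    b = b[:]
    for k in range(d):
        for i in range(k + 1, d):
            f = M[i][k] / M[k][k]
            for j in range(k, d):
                M[i][j] -= f * M[k][j]
            b[i] -= f * b[k]
    x = [0.0] * d
    for i in reversed(range(d)):
        x[i] = (b[i] - sum(M[i][j] * x[j] for j in range(i + 1, d))) / M[i][i]
    return x

r, q = random_environment()
D = [[r[i] + sum(q[i]) if i == j else -q[i][j] for j in range(d)] for i in range(d)]

# columns of D^{-1}: solve D x = e_j for each j
inv_columns = [solve(D, [1.0 if i == j else 0.0 for i in range(d)]) for j in range(d)]
print(min(min(col) for col in inv_columns))   # non-negative, as the lemma asserts
```

This is exactly the M-matrix structure exploited in the proof: the off-diagonal entries of D are non-positive and the diagonal dominates.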

Proof of Theorem 6.2.4. In contrast to the proof of Theorem 6.2.3, where h(s) was constructed using ingredients provided only implicitly by Theorem 6.2.7, here we explicitly construct a Lyapunov function t(s), which is an analogue of the random walk Lyapunov function given by (6.3).

Recall that I_d denotes the d × d identity matrix, and denote by 1 = (1, ..., 1)^⊤ the column vector with all d components equal to 1. Similarly to (6.36), we define
\[
t(s) = \sum_{j=1}^{|s|} v_{s_j}^{j},
\]
but now we define v^ℓ = (v_1^ℓ, ..., v_d^ℓ)^⊤ by
\[
v^\ell = D_\ell^{-1} \mathbf{1} + \sum_{m \ge \ell} A_\ell \cdots A_m D_{m+1}^{-1} \mathbf{1}
= \bigl( D_\ell^{-1} + A_\ell D_{\ell+1}^{-1} + A_\ell A_{\ell+1} D_{\ell+2}^{-1} + \cdots \bigr) \mathbf{1}.
\]

We need to show that v^ℓ is finite, and hence that t(s) is well-defined. First, since for any x ∈ R_+ we have x ≥ Σ_{m=1}^∞ 1{x ≥ m}, the assumption E log^+ ‖D_1^{-1}‖_op < ∞ shows that, for any C ∈ (0,∞),
\[
\sum_{m=1}^{\infty} P\bigl( C^{-1} \log \|D_m^{-1}\|_{\mathrm{op}} > m \bigr) < \infty.
\]
Thus, by the Borel–Cantelli lemma, P-a.s., for all but finitely many m, ‖D_m^{-1}‖_op < exp(Cm). Since γ_1 < 0, we may choose C ∈ (0, −γ_1). Then we obtain, using (6.33), P-a.s.,
\[
\|v^\ell\| \le \bigl( \|D_\ell^{-1}\|_{\mathrm{op}} + \|A_\ell\|_{\mathrm{op}} \|D_{\ell+1}^{-1}\|_{\mathrm{op}} + \|A_\ell A_{\ell+1}\|_{\mathrm{op}} \|D_{\ell+2}^{-1}\|_{\mathrm{op}} + \cdots \bigr) \|\mathbf{1}\| < \infty.
\]

In addition, we have that v^ℓ has all components non-negative, by Lemma 6.2.9. Analogously to (6.37), we have for s with |s| = ℓ and s_ℓ = i ∈ A,
\[
\mathrm{E}_\omega[t(X_{n+1}) - t(X_n) \mid X_n = s] = \bigl( -D_\ell v^\ell + P_\ell v^{\ell+1} \bigr)_i.
\]

With the present choice of v^ℓ, the vector whose ith component appears on the right-hand side of the last display is equal to
\[
\bigl( -I_d - D_\ell A_\ell D_{\ell+1}^{-1} - D_\ell A_\ell A_{\ell+1} D_{\ell+2}^{-1} - \cdots \bigr) \mathbf{1}
+ \bigl( P_\ell D_{\ell+1}^{-1} + P_\ell A_{\ell+1} D_{\ell+2}^{-1} + P_\ell A_{\ell+1} A_{\ell+2} D_{\ell+3}^{-1} + \cdots \bigr) \mathbf{1}
\]
\[
= -\mathbf{1} - D_\ell \bigl( A_\ell D_{\ell+1}^{-1} + A_\ell A_{\ell+1} D_{\ell+2}^{-1} + \cdots \bigr) \mathbf{1}
+ D_\ell \bigl( A_\ell D_{\ell+1}^{-1} + A_\ell A_{\ell+1} D_{\ell+2}^{-1} + \cdots \bigr) \mathbf{1} = -\mathbf{1},
\]
recalling from (6.34) that A_ℓ = D_ℓ^{-1} P_ℓ, so that P_ℓ = D_ℓ A_ℓ. Hence
\[
\mathrm{E}_\omega[t(X_{n+1}) - t(X_n) \mid X_n = s] = -1, \quad \text{for all } s \ne \varnothing.
\]

Therefore the function t(s) satisfies all the conditions of Theorem 2.6.4, and positive-recurrence follows.

Finally, we complete the proof of Theorem 6.2.5.

Proof of Theorem 6.2.5. By a theorem of [24], the assumptions imply that
\[
\liminf_{\ell\to\infty} \|A_1 \cdots A_\ell\|_{\mathrm{op}} = 0, \quad \text{P-a.s.,}
\]
and therefore
\[
P_1 A_2 \cdots A_\ell \mathbf{1} \le D_1 \mathbf{1}, \quad \text{for infinitely many } \ell. \tag{6.41}
\]

Define recursively the random times τ_0 = 1 and, for k ≥ 0,
\[
\tau_{k+1} = \min\bigl\{ n > \tau_k : -D_{\tau_k} \mathbf{1} + P_{\tau_k} A_{\tau_k+1} \cdots A_n \mathbf{1} \le \mathbf{0} \bigr\}.
\]

From (6.41) the times τ_k are P-a.s. finite. Similarly to (6.36), we define
\[
f(s) = \sum_{j=1}^{|s|} v_{s_j}^{j},
\]
but where now v^{τ_k} = 1 and
\[
v^\ell = A_\ell A_{\ell+1} \cdots A_{\tau_{k+1}} \mathbf{1}, \quad \text{for } \tau_k < \ell \le \tau_{k+1}. \tag{6.42}
\]
Hence we have v^ℓ = A_ℓ v^{ℓ+1} when τ_k < ℓ < τ_{k+1}, k ≥ 0, and
\[
-D_{\tau_k} v^{\tau_k} + P_{\tau_k} v^{\tau_k+1} = -D_{\tau_k} \mathbf{1} + P_{\tau_k} A_{\tau_k+1} \cdots A_{\tau_{k+1}} \mathbf{1} \le \mathbf{0},
\]
by definition of τ_{k+1}.

by definition of τk+1. Similarly to (6.37), this means that f(Xn) is a super-martingale except at s = ∅. From Lemma 6.2.9 and from the definition (6.42)the vector v` is non-negative, and therefore f is also non-negative. Finally,since the τk are a.s. finite and vτk = 1, f(s)→∞ as |s| → ∞. Theorem 2.5.2applies, and the random string is recurrent.

6.3 Stochastic billiards

6.3.1 Introduction

The model is defined in a planar domain
\[
D := D(\gamma, A) := \{ (x, y) \in \mathbb{R}^2 : x \ge A,\ |y| \le x^{\gamma} \}, \tag{6.43}
\]
where γ < 1 is a fixed parameter and A > 0 is a constant to be fixed later. The dynamics are described roughly as follows. A particle moves at unit speed in D; while in the interior D \ ∂D, its direction remains fixed. When the particle hits the boundary ∂D, it is (independently) reflected at a random angle to the inwards-pointing normal vector at the boundary. The law of the process is determined by specifying the distribution of the random reflections. See Figure 6.1 for some simulations of the process.


Suppose that the random variable α for the angle of reflection satisfies, for some α₀ ∈ (0, π/2),

P[0 < |α| < α₀] = 1 and E tan α = 0. (6.44)

Thus the distribution of α is bounded strictly away from ±π/2 and does not have an atom at 0; a sufficient condition for E tan α = 0 is that the distribution of α be symmetric about 0.

The process that we have described informally above is a continuous-time process on D. It is more convenient to work with a discrete-time process, informally obtained by recording the locations of the successive hits of the particle on the boundary ∂D. This is a Markov chain with state-space ∂D that we denote by (ξ_n, n ≥ 0), where we write in coordinates ξ_n = (ξ_n^(1), ξ_n^(2)). Thus when ξ_n^(1) > A, ξ_n^(2) = ±(ξ_n^(1))^γ. We call ξ_n the collisions process.

We construct ξ_n formally as follows. Suppose that ξ_0 ∈ ∂D. We perform a step of our process as follows.

• Take an independent draw of an angle α satisfying (6.44). The realization of α specifies a ray Γ starting at ξ_n with angle α to the interior normal to ∂D at ξ_n. We adopt the convention that positive values of the angle correspond to the right of the normal; negative values correspond to the left.

• Let ξ_{n+1} be the first point of intersection of the ray Γ with ∂D \ {ξ_n}.
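For a point far from the left wall, the step above reduces to crossing to the opposite boundary branch, so the horizontal increment D solves the chord relation D = (x^γ + (x + D)^γ) tan(α + θ), where θ is the boundary slope angle (this relation reappears in Section 6.3.3). The sketch below is our own numerical illustration, not the book's: the helper name next_collision is ours, the left wall x = A is ignored, and the branch-crossing assumption is taken for granted (it holds once x is large).

```python
import math

def next_collision(x, sign, alpha, gamma):
    """One step of the collisions process from the boundary point (x, sign * x**gamma).

    Solves the chord relation D = (x**gamma + (x + D)**gamma) * tan(alpha + theta)
    for the horizontal increment D by bisection; assumes the reflected ray
    crosses to the opposite boundary branch (valid for large x), and ignores
    the left wall x = A.
    """
    if sign < 0:                                  # by symmetry, flip to the upper branch
        alpha = -alpha
    theta = math.atan(gamma * x ** (gamma - 1.0))  # boundary slope angle at x
    t = math.tan(alpha + theta)
    f = lambda d: d - (x ** gamma + (x + d) ** gamma) * t
    if t >= 0:                                    # particle advances: root in [0, hi]
        lo, hi = 0.0, 1.0
        while f(hi) < 0:
            hi *= 2.0
    else:                                         # particle retreats: root in [-x/2, 0]
        lo, hi = -x / 2.0, 0.0
    for _ in range(200):                          # plain bisection
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return x + 0.5 * (lo + hi), -sign             # new abscissa, opposite branch
```

For example, starting from (1000, 1000^{1/3}) with γ = 1/3 and α = 0.3, the next collision abscissa comes out at about 1006.3; iterating with random α simulates the collisions process.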

It is not a priori clear that ξ_n is well-defined, as the ray Γ might never intersect the boundary; however, this does not occur assuming that the A in (6.43) is chosen sufficiently large, as the following result shows.

Lemma 6.3.1. Suppose that α satisfies (6.44) for some α₀ ∈ (0, π/2), and that γ < 1. Then there exists A₀ ∈ R₊ (depending on the distribution of α as well as on α₀ and γ) such that for any A ≥ A₀ the process ξ_n is well-defined on ∂D, where D = D(γ, A) is given by (6.43). Moreover, lim sup_{n→∞} ξ_n^(1) = ∞, a.s.

Thus our standing assumption for this section will be the following.

(SB) Suppose that α satisfies (6.44) for some α₀ ∈ (0, π/2), and that D is given by (6.43) for some γ < 1 and A ≥ A₀ as in Lemma 6.3.1.

Our first result covers the recurrence/transience classification for ξ_n. It turns out that a key quantity is E[tan² α]; note that under (6.44) we have E[tan² α] ≤ tan² α₀ < ∞, and E[tan² α] > 0 by the fact that α does not degenerate to 0. Define

γ_c := E[tan² α] / (1 + 2 E[tan² α]), (6.45)

so that γ_c ∈ (0, 1/2) under (6.44).

Theorem 6.3.2. Suppose that (SB) holds. Then:

(i) If γ > γ_c the process is transient, i.e., lim_{n→∞} ξ_n^(1) = ∞, a.s.;

(ii) If γ ≤ γ_c the process is recurrent, i.e., lim inf_{n→∞} ξ_n^(1) < ∞, a.s.

Figure 6.1: Simulations of part of the trajectory of the billiards process with α uniform on [−π/4, π/4] in domains with γ = 1/3 (left) and γ = 2/3 (right). Here E[tan² α] = 4/π and γ_c = 4/(8 + π) ≈ 0.36, so the first case is recurrent and the second transient.

6.3.2 Reduction to Lamperti’s problem

The key to our analysis of the stochastic billiards process is to consider a rescaled version of the process to obtain an instance of the Lamperti problem; the idea is to find a scale on which the increments of the process have uniformly bounded second moments. As at (3.1), we write μ_k(x) = E[(X_{n+1} − X_n)^k | X_n = x] for the increment moment functions of X_n.


Theorem 6.3.3. Suppose that (SB) holds. Set X_n := (ξ_n^(1))^{1−γ}. Then X_n is a time-homogeneous Markov process on R₊ satisfying, for some B ∈ R₊, P[|X_{n+1} − X_n| ≤ B] = 1, and, as x → ∞,

μ₁(x) = 2γ(1 − γ)(1 + E[tan² α]) / x + O(x^{−2}); (6.46)

μ₂(x) = 4(1 − γ)² E[tan² α] + O(x^{−1}). (6.47)

We defer the proof of Theorem 6.3.3 to Section 6.3.3; here we use the theorem and the results of Chapter 3 to establish our recurrence classification.

Proof of Theorem 6.3.2. Consider the process X_n given in Theorem 6.3.3; then X_n is recurrent if and only if ξ_n^(1) is recurrent. From (6.46) and (6.47) we have that

2xμ₁(x) − μ₂(x) = 4(1 − γ)(γ(1 + 2E[tan² α]) − E[tan² α] + O(x^{−1}))
= 4(1 − γ)((γ − γ_c)(1 + 2E[tan² α]) + O(x^{−1})),

where γ_c is given by (6.45). Thus if γ > γ_c we have lim inf_{x→∞}(2xμ₁(x) − μ₂(x)) > 0, which yields transience by Theorem 3.5.1.

Similarly, for θ ∈ (0, 1), by (6.46) and (6.47),

2xμ₁(x) − (1 + (1 − θ)/log x) μ₂(x)
= 4(1 − γ)(γ − γ_c)(1 + 2E[tan² α]) − ((1 − θ + o(1))/log x) · 4(1 − γ)² E[tan² α],

which is strictly negative for all x sufficiently large provided γ ≤ γ_c; then Theorem 3.5.2 establishes recurrence.
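The algebra behind this sign dichotomy is easy to check numerically. The sketch below is ours (the function names gamma_c and drift_criterion are not from the book): it evaluates γ_c from (6.45) and the leading term of 2xμ₁(x) − μ₂(x) from (6.46)–(6.47), confirming the sign change at γ = γ_c for the value E[tan² α] = 4/π quoted in Figure 6.1.

```python
import math

def gamma_c(e_tan2):
    """Critical exponent (6.45)."""
    return e_tan2 / (1.0 + 2.0 * e_tan2)

def drift_criterion(gamma, e_tan2):
    """Leading term of 2*x*mu1(x) - mu2(x), from (6.46)-(6.47)."""
    return 4.0 * (1.0 - gamma) * (gamma * (1.0 + 2.0 * e_tan2) - e_tan2)

e = 4.0 / math.pi                       # value quoted in Figure 6.1
gc = gamma_c(e)                         # equals 4/(8 + pi), about 0.359
# identity used in the proof: gamma*(1+2E) - E = (gamma - gamma_c)*(1+2E)
for g in (1.0 / 3, 2.0 / 3):
    lhs = drift_criterion(g, e)
    rhs = 4.0 * (1.0 - g) * (g - gc) * (1.0 + 2.0 * e)
    assert abs(lhs - rhs) < 1e-12
# gamma = 1/3 < gamma_c gives negative drift criterion (recurrent),
# gamma = 2/3 > gamma_c gives positive (transient), as in Figure 6.1
assert drift_criterion(1.0 / 3, e) < 0 < drift_criterion(2.0 / 3, e)
```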

6.3.3 Increment estimates

Write g(x) := x^γ. Suppose that at time n = 0 we have ξ_0 = (ξ_0^(1), ξ_0^(2)) = (x, ±g(x)) for x > A, and then ξ_1 is obtained after reflecting at angle α to the normal. Denote D(x, α) := ξ_1^(1) − ξ_0^(1), the resulting increment of the horizontal component of the process. Also set θ := arctan g′(x), so tan θ = g′(x) = γx^{γ−1}.

We now proceed to obtain estimates for D(x, α) and its moments. The next lemma gives an upper bound on D(x, α) that follows from the fact that for large enough x our domain is almost flat, while α is bounded strictly away from ±π/2.


Lemma 6.3.4. Suppose that (SB) holds. Then there exist x₀ ∈ R₊ and C ∈ R₊, both depending only on α₀ and γ, such that for all x ≥ x₀ and all α ∈ (−α₀, α₀),

|D(x, α)| ≤ C x^γ.

Proof. We use a geometrical argument depicted in Figure 6.2. Since γ < 1, we have g′(x) → 0 and θ → 0 as x → ∞. Thus we may choose x₀ large enough so that |θ| < min{α₀, (π/2) − α₀} for all x ≥ x₀, and then tan(α₀ + θ) ≤ c₀ for some c₀ < ∞; suppose that x₀ is also chosen sufficiently large that |g′(x)| < 1/c₀ for all x ≥ x₀.

By symmetry, it suffices to suppose that we start on the positive half of the curve, i.e., at ξ_n = (x, g(x)). First suppose that γ ≥ 0. Then g is non-decreasing, so θ ≥ 0. Consider D(x, α), α ≥ 0, as represented in Figure 6.2.

In the case α ≥ 0, |D(x, α)| ≤ |D(x, α₀)|. The reflected ray at angle α₀ to the normal from (x, g(x)) has equation in (u, v) given by

u − x = −(v − g(x)) tan(α₀ + θ), u ≥ x. (6.48)

Let a = 1/c₀; by choice of x₀, we have |g′(u)| < a for all u ≥ x ≥ x₀. Consider the line

v + g(x) = −a(u − x), u ≥ x. (6.49)

This line intersects ∂D at (x, −g(x)) and it intersects the reflected ray at angle α₀, whose equation is (6.48), at (u, v) with

u − x = 2g(x) tan(α₀ + θ) / (1 − a tan(α₀ + θ)). (6.50)

Since |g′(u)| < a for u ≥ x, the curve v = −g(u) remains above the line (6.49) for all u ≥ x. Thus |D(x, α₀)| is bounded by u − x as given by (6.50); this is D′ in Figure 6.2.

Thus, for some C ∈ R₊, |D(x, α₀)| ≤ C g(x), for all x ≥ x₀ large enough. A similar argument applies when α < 0, where we use the bound |D(x, α)| ≤ |D(x, −α₀)|. With minor modifications, the same argument also works for γ < 0.

Proof of Lemma 6.3.1. Lemma 6.3.4 shows that we may choose x₀ sufficiently large, depending on α₀ and γ, so that for all x ≥ x₀, ξ_n ∈ ∂D with ξ_n = (x, ±x^γ) implies that ξ_{n+1} ∈ ∂D, a.s. If ξ_n^(1) = A, then since α has no atom at 0 it is also not hard to see that ξ_{n+1} ∈ ∂D, a.s. Thus if we take A > x₀, we ensure that ξ_n ∈ ∂D is well defined for all n ≥ 0.


Figure 6.2: The increment D(x, α) = ξ_{n+1}^(1) − ξ_n^(1) when the particle at ξ_n = (x, x^γ) reflects at angle α to the inwards normal (indicated by 'N' in the diagram). Here tan θ = γx^{γ−1}. The quantity D′ is an upper bound for D(x, α), obtained by extending a line of slope −a, for a > γx^{γ−1}, from (x, −x^γ).

To show the non-confinement result lim sup_{n→∞} ξ_n^(1) = ∞, a.s., we apply Proposition 3.3.4. It is sufficient to show that for any admissible distribution for α, for all y ≥ A, there exists ε_y > 0 for which

P[ξ_{n+1}^(1) − ξ_n^(1) > ε_y | ξ_n^(1) = x] > ε_y, for all x ∈ [A, y]. (6.51)

If x = A, the result is clear. So suppose x > A. Since α is not identically 0 and E tan α = 0, there exists ν > 0 for which P[α > ν] > ν. Thus with positive probability D(x, α) ≥ D(x, ν). Then (see Figure 6.2), D(x, ν) ≥ D₁, where D₁ = g(x) tan(θ + ν). If γ ≥ 0, then θ ≥ 0 and (6.51) follows easily, since then for x ∈ [A, y] we have D(x, α) ≥ g(A) tan ν with probability at least ν. (So in the case γ ≥ 0, ε_y need not depend on y.)

If γ < 0, we may choose A big enough so that the angle θ to the normal satisfies |θ| < ν for all x ≥ A, and so tan(θ + ν) is still strictly positive, so that for x ∈ [A, y] we have D(x, α) ≥ g(y) tan(θ + ν) with probability at least ν, and we verify (6.51) in that case too. Note that the choice of A for this last argument depended on ν and hence on the distribution of α, as well as on α₀ and γ.

The next lemma gives crucial estimates for the first two moments of D(x, α).

Lemma 6.3.5. Suppose that (SB) holds. Then, as x → ∞,

E[D(x, α)] = 2γx^{2γ−1}(1 + 2E[tan² α]) + O(x^{3γ−2}); (6.52)
and E[D(x, α)²] = 4x^{2γ} E[tan² α] + O(x^{3γ−1}). (6.53)

Proof. To ease notation we write D = D(x, α). Recall that x ≥ A is large enough so that if ξ_n = (x, x^γ) is on the positive part of ∂D then the next collision ξ_{n+1} = (x + D, −(x + D)^γ) is on the negative part. See Figure 6.2.

We have, repeatedly using the facts that |tan(α + θ)| and |tan α| are uniformly bounded, and |D| = O(x^γ) by Lemma 6.3.4,

D = (g(x) + g(x + D)) tan(α + θ)
= x^γ(1 + (1 + (D/x))^γ) tan(α + θ)
= x^γ(2 + Dγx^{−1}) tan(α + θ) + O(x^{2γ−2})
= x^γ(2 + Dγx^{−1}) (tan α + tan θ)/(1 − tan α tan θ) + O(x^{2γ−2})
= x^γ(2 + Dγx^{−1})(tan α + γx^{γ−1})(1 + γx^{γ−1} tan α) + O(x^{3γ−2}),

where, as always, the implicit constants in the O terms are non-random. Collecting terms we have

D = 2γx^{2γ−1} + (tan α)(x^γ(2 + Dγx^{−1}) + 2γx^{2γ−1}) + (tan² α)(2γx^{2γ−1}) + O(x^{3γ−2}).

Re-arranging, we obtain

D = [2γx^{2γ−1} + (2x^γ + 2γx^{2γ−1}) tan α + 2γx^{2γ−1} tan² α] / (1 − γx^{γ−1} tan α) + O(x^{3γ−2})
= 2γx^{2γ−1} + (2x^γ + 2γx^{2γ−1}) tan α + 4γx^{2γ−1} tan² α + O(x^{3γ−2}). (6.54)

Taking expectations in (6.54) and using the fact that E tan α = 0 we obtain (6.52). Similarly, squaring both sides of (6.54) gives

D² = 4x^{2γ} tan² α + O(x^{3γ−1}),

which on taking expectations yields (6.53).

Now we can complete the proof of Theorem 6.3.3.

Proof of Theorem 6.3.3. Set X_n = (ξ_n^(1))^{1−γ}. Then X_n is a Markov chain and, by Lemma 6.3.4,

|X_{n+1} − X_n| ≤ max{|(x + Cx^γ)^{1−γ} − x^{1−γ}|, |(x − Cx^γ)^{1−γ} − x^{1−γ}|}.

Here

(x ± Cx^γ)^{1−γ} − x^{1−γ} = x^{1−γ}((1 ± Cx^{γ−1})^{1−γ} − 1),

which is uniformly bounded. Thus |X_{n+1} − X_n| ≤ B, a.s., for some B ∈ R₊.

Given ξ_n^(1) = x we have X_n = x^{1−γ}, and, by Taylor's formula,

X_{n+1} − X_n = (x + D(x, α))^{1−γ} − x^{1−γ}
= (1 − γ)x^{−γ}D(x, α) − (γ(1 − γ)/2) x^{−γ−1}D(x, α)² + O(x^{2γ−2}), (6.55)

using Lemma 6.3.4 for the error term. Taking expectations in (6.55) and using (6.52) and (6.53) we obtain

E[X_{n+1} − X_n | X_n = x^{1−γ}] = 2(1 − γ)γ(1 + E[tan² α]) x^{γ−1} + O(x^{2γ−2}),

which yields (6.46) after a change of variable. Similarly we obtain (6.47) after squaring both sides of (6.55).

6.4 Exclusion and voter models

6.4.1 Introduction

In this section, we apply the method of Lyapunov functions to some models of interacting particle systems, in which instead of a single particle on a countable state space, there are many particles each of which can influence the others' dynamics.


The processes that we will consider are Markov processes on configurations of particles on Z. Thus we take as our state space {0, 1}^Z; we call s ∈ {0, 1}^Z a configuration, and we interpret a coordinate value s(x) = 1 as the presence of a particle at the site x ∈ Z in the configuration s, and s(x) = 0 as the absence of a particle at x.

The two processes that we will consider are the exclusion process and the voter model, and we will also study a more general process that is a mixture of the two. The dynamics of these processes may be interpreted in terms of particles that hop on Z (the case of the exclusion process) or appear and disappear at the sites of Z (the case of the voter model). In either case, the dynamics are driven by the presence of discrepancies 01 or 10 in the configuration. In order to obtain well-defined processes, we consider dynamics on configurations with finitely many discrepancies.

At each time step, the exclusion process selects uniformly at random from among all discrepancies. If the chosen discrepancy is 01, it flips to 10 with probability p (else there is no change); if the pair is 10, it flips to 01 with probability 1 − p. On the other hand, at each time step the voter model selects uniformly at random from all discrepancies and then flips the chosen pair to either 00 or 11, with equal chance of each. The exclusion-voter process is a hybrid of these two processes, whereby at each time step we determine independently at random whether to perform a voter-type move (with probability β) or an exclusion-type move (probability 1 − β).

Individually, the exclusion process and voter model exhibit very different behaviour. For instance, in the exclusion process there is local conservation of 1s: the number of 1s in a bounded interval can change only through the boundary. There is no such conservation in the voter model. In the hybrid process, voter moves and exclusion moves interact in a highly non-trivial way.

Consider the Heaviside configuration defined by 1{x ≤ 0}, which consists of a single pair 10 abutted by infinite strings of 1s and 0s to the left and right, respectively:

. . . 11110000 . . .

If the voter model starts from the Heaviside configuration, then at any future time it is a random translate of the same configuration. Indeed, the position of the rightmost particle performs a symmetric simple random walk. If the voter model starts from a perturbation of the Heaviside configuration, it is natural to study the time it takes to reach a translate of the Heaviside configuration. This example motivates the following notation.

Let S′ ⊂ {0, 1}^Z denote the set of configurations with a finite number of 0s to the left of the origin and 1s to the right, i.e., s′ ∈ {0, 1}^Z for which there exist ℓ, r ∈ Z with ℓ < r such that s′(x) = 1 for all x ≤ ℓ and s′(x) = 0 for all x ≥ r. In other words, S′ contains those configurations of {0, 1}^Z in which there is only a finite number of discrepancies, and the number of discrepancies of type 10 minus the number of discrepancies of type 01 is equal to 1; note that S′ is countable.

Let '∼' denote the equivalence relation on S′ such that for s′₁, s′₂ ∈ S′, s′₁ ∼ s′₂ if and only if s′₁ and s′₂ are translates of each other, i.e., there exists y ∈ Z such that s′₁(x) = s′₂(x + y) for all x ∈ Z. Then set S := S′/∼. In other words, S is the set of configurations of the form infinite string of 1s—finite number of 0s and 1s—infinite string of 0s, modulo translations. For example, one s ∈ S is

s = . . . 1110000000011100001001001000000001111000 . . . (6.56)

We now formally define our processes. Fix β ∈ [0, 1] (the mixing parameter) and p ∈ [0, 1] (the exclusion parameter). The exclusion-voter process (ξ_n, n ≥ 0) with parameters (β, p) is a time-homogeneous Markov chain on the countable state-space S. The one-step transition probabilities are determined by the following mechanism.

• At each time step we decide independently at random whether to perform a voter move or an exclusion move. We choose a voter move with probability β and an exclusion move with probability 1 − β.

• Having decided this, choose a discrepancy (i.e. 01 or 10) uniformly at random from the finite number of available discrepancies.

• Then execute the chosen move. The voter move is such that the chosen pair (01 or 10) flips to 00 or 11, each with probability 1/2. The exclusion move is such that a chosen pair 01 flips to 10 with probability p (otherwise no move) and a chosen pair 10 flips to 01 with probability q := 1 − p (otherwise no move).
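The mechanism above is straightforward to simulate on the finite core of a configuration (the portion strictly between the infinite 1-block and the infinite 0-block). The sketch below is our own helper code, not the book's: the core is padded with one 1 on the left and one 0 on the right so that the boundary discrepancies are visible, and re-canonicalised (translation representative) after each move.

```python
import random

def canonical(cells):
    """Strip to the translation-invariant core: drop leading 1s and trailing 0s."""
    s = list(cells)
    while s and s[0] == 1:
        s.pop(0)
    while s and s[-1] == 0:
        s.pop()
    return s

def step(core, beta, p, rng):
    """One move of HP(beta, p) on the finite core (empty core represents s_H)."""
    pad = [1] + core + [0]                       # expose the boundary discrepancies
    disc = [i for i in range(len(pad) - 1) if pad[i] != pad[i + 1]]
    i = rng.choice(disc)                         # uniform over the 2N+1 discrepancies
    if rng.random() < beta:                      # voter move: pair -> 00 or 11
        v = rng.choice([0, 1])
        pad[i] = pad[i + 1] = v
    else:                                        # exclusion move: 01 -> 10 w.p. p,
        flip = p if (pad[i], pad[i + 1]) == (0, 1) else 1 - p   # 10 -> 01 w.p. 1-p
        if rng.random() < flip:
            pad[i], pad[i + 1] = pad[i + 1], pad[i]
    return canonical(pad)

# run a few hundred moves, checking the defining invariant of S:
# padded with 1...0, there is always one more 10 discrepancy than 01
rng = random.Random(1)
core = canonical([0, 0, 1, 0, 1, 1])
for _ in range(500):
    padded = [1] + core + [0]
    n10 = sum(1 for a, b in zip(padded, padded[1:]) if (a, b) == (1, 0))
    n01 = sum(1 for a, b in zip(padded, padded[1:]) if (a, b) == (0, 1))
    assert n10 - n01 == 1
    core = step(core, beta=0.5, p=0.5, rng=rng)
```

With beta = 1 the chain reduces to VM, and with beta = 0 to EP(p); starting from the empty core one can observe directly that s_H is absorbing under the pure voter dynamics.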

The case β = 0 is the (pure) exclusion process with parameter p, which we abbreviate as EP(p); the case β = 1 is the (pure) voter model, VM; and the general case is the hybrid process, HP(β, p). We use P^e_p, P^v, and P^h_{β,p} to denote probabilities for EP(p), VM, and HP(β, p) respectively; we use E^e_p, E^v, and E^h_{β,p} for the corresponding expectations.

The following basic result gives some elementary properties of the state-space S under P^h_{β,p}. In particular, Proposition 6.4.1 says that for (β, p) ∈ (0, 1)², ξ_n is irreducible and aperiodic. Let s_H ∈ S denote the equivalence class of the Heaviside configuration 1{x ≤ 0}.


Proposition 6.4.1. For any β ∈ [0, 1], s_H is an absorbing state under P^h_{β,1}. Suppose β ≠ 1 and (β, p) ∉ {(0, 0), (0, 1)}. Then all states in S \ {s_H} communicate under P^h_{β,p}. Suppose β ≠ 1, p < 1, and (β, p) ≠ (0, 0). Then all states in S communicate under P^h_{β,p}, and ξ_n is irreducible and aperiodic.

Define the relaxation time for the process ξ_n as

τ := min{n ≥ 0 : ξ_n = s_H}.

We introduce some convenient terminology. If P^h_{β,p}[τ = ∞ | ξ_0 = s] > 0 for s ∈ S, we say that ξ_n is transient started from s; if P^h_{β,p}[τ < ∞ | ξ_0 = s] = 1 for s ∈ S, we say that ξ is recurrent started from s. In the latter case, if in addition E^h_{β,p}[τ | ξ_0 = s] < ∞ for s ∈ S, we say that ξ is positive recurrent started from s. When ξ is irreducible (see Proposition 6.4.1), this terminology coincides with the standard usage for countable Markov chains.

In the remaining part of this section, we prove a number of results. Namely, we prove:

• The exclusion process is positive recurrent for p > 1/2 and transientfor p ≤ 1/2 (in particular, there is no null-recurrence regime).

• The voter model is positive recurrent; moreover, all moments of τ of order below 3/2 exist, while all moments of order above 3/2 do not exist.

• The exclusion-voter process is positive recurrent for p ≥ 1/2 and any β, or for β ≥ β₀ and any p, where β₀ is a constant sufficiently close to 1. In particular, note that any proportion of voter moves (β > 0) added to the transient symmetric exclusion process (p = 1/2) makes it positive recurrent.

6.4.2 Configurations and exclusion-voter dynamics

We introduce some more notation and terminology. A 1-block (0-block) is a maximal string of consecutive 1s (0s). Configurations in S consist of a finite number of such blocks. (From this point onwards we describe elements of S simply as 'configurations' rather than 'equivalence classes of configurations'.)

For s ∈ S, let N = N(s) ≥ 0 denote the number of 1-blocks, not including the infinite 1-block to the left (this is the same as the number of 0-blocks, not including the infinite 0-block to the right). Enumerating left to right, let n_i = n_i(s) denote the size of the i-th 0-block, and m_i = m_i(s) the size of the i-th 1-block: for example, for the configuration from (6.56),

s = . . . 111 00000000 111 0000 1 00 1 00 1 00000000 1111 000 . . . ,

where the successive finite blocks have sizes n₁ = 8, m₁ = 3, n₂ = 4, m₂ = 1, n₃ = 2, m₃ = 1, n₄ = 2, m₄ = 1, n₅ = 8, m₅ = 4.


We may thus represent a configuration s ∈ S \ {s_H} by the vector of block sizes (n₁, m₁, . . . , n_N, m_N). For example, the configuration s of (6.56), which has N(s) = 5, has the representation (8, 3, 4, 1, 2, 1, 2, 1, 8, 4).
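The block-size vector can be extracted mechanically from the core of the configuration. The following sketch is our own helper (using itertools.groupby, not anything from the book); it recovers the representation just quoted from the core of (6.56):

```python
from itertools import groupby

def block_sizes(core):
    """Vector (n1, m1, ..., nN, mN) of a core string that starts with 0 and ends with 1."""
    return [len(list(run)) for _, run in groupby(core)]

# core of configuration (6.56), between the infinite 1s and the infinite 0s
core = "00000000" + "111" + "0000" + "1" + "00" + "1" + "00" + "1" + "00000000" + "1111"
assert block_sizes(core) == [8, 3, 4, 1, 2, 1, 2, 1, 8, 4]
assert sum(block_sizes(core)) == len(core)   # the block sizes sum to |s| = 34
```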

For s ∈ S \ {s_H} and i ∈ {1, . . . , N} let

R_i := R_i(s) := Σ_{j=1}^{i} n_j, and T_i := T_i(s) := Σ_{j=i}^{N} m_j;

we adopt the convention R₀ = T_{N+1} = 0. Set |s_H| := 0 and for s ∈ S \ {s_H} let |s| := Σ_{i=1}^{N}(n_i + m_i) = R_N + T₁, the length of the string of 0s and 1s between the infinite string of 1s to the left and the infinite string of 0s to the right.

It is convenient to represent a configuration s ∈ S \ {s_H} diagrammatically as a right-down path in the quarter-lattice Z²₊: starting from (0, T₁), construct a path by reading left-to-right the configuration s and for each 0 (1) taking a unit step in the right (down) direction. Thus the path starts with a step to the right, and ends at (R_N, 0) after |s| steps. See Figure 6.3 for a representation of the staircase for s given by (6.56).

Figure 6.3: An example staircase configuration.

The lattice squares of Z²₊ bounded by the right-down path determined by s constitute a polygonal region in the plane that we call the staircase corresponding to s. With this representation of the configuration-space, the exclusion-voter model can be viewed as a growth/depletion process on staircases. For instance, exclusion moves are particularly simple in this context, corresponding to adding or removing a square at a corner.

We now introduce notation for the changes in configuration brought about by voter and exclusion moves. Given the staircase of s, there are 2N + 1 'corners' representing 10s and 01s alternately, of which N + 1 are 10s and N are 01s. In the staircase representation, these corners have coordinates (R_i, T_{i+1}), i ∈ {0, . . . , N} (for 10s) and (R_i, T_i), i ∈ {1, . . . , N} (for 01s), where R₀ = T_{N+1} = 0. Enumerate the 10s left-to-right in the configuration s by 0, 1, . . . , N, and similarly the 01s by 1, . . . , N.

Given a configuration s ∈ S, we define the configurations to which it can transition under either a voter move or an exclusion move, denoted s→_k, s←_k, s^{+r}_k, s^{+ℓ}_k, s^{−r}_k, s^{−ℓ}_k, as follows:

• s→_k is the configuration obtained from s by moving the rightmost 1 of the k-th 1-block by 1 unit to the right, k ∈ {0, . . . , N};

• s←_k is the configuration obtained from s by moving the leftmost 1 of the k-th 1-block by 1 unit to the left, k ∈ {1, . . . , N};

• s^{+r}_k is the configuration obtained from s by adding an extra 1 to the right of the k-th 1-block, k ∈ {0, . . . , N};

• s^{+ℓ}_k is the configuration obtained from s by adding an extra 1 to the left of the k-th 1-block, k ∈ {1, . . . , N};

• s^{−r}_k is the configuration obtained from s by removing the rightmost 1 from the k-th 1-block, k ∈ {0, . . . , N};

• s^{−ℓ}_k is the configuration obtained from s by removing the leftmost 1 from the k-th 1-block, k ∈ {1, . . . , N}.

With this notation, EP(p) can transform s to s→_k or s←_k, while using VM we can get s^{±r}_k or s^{±ℓ}_k.

To conclude this section, we sketch the (elementary) proof of Proposition 6.4.1. As well as s_H, we introduce special notation for one more configuration. Set

s_D := . . . 11101000 . . . ,

the single-discrepancy configuration with N(s_D) = 1 and vector representation (1, 1).

Proof of Proposition 6.4.1. It is not hard to see that s_H is an absorbing state for the pure voter model (β = 1) and for the left-moving totally asymmetric exclusion process (β = 0, p = 1), and hence also for the mixture model under P^h_{β,1} for any β ∈ [0, 1].

To show that all states within S communicate, it suffices to show that P^h_{β,p}[ξ_n = s₁ | ξ_0 = s₀] > 0 for some n ∈ N for each of the following:


(i) s₀ = s_H, s₁ = s_D;

(ii) s₀ = s_D, s₁ = s_H;

(iii) any s₀ with |s₀| ≥ 2 and some s₁ with |s₁| = |s₀| + 1;

(iv) any s₀ with |s₀| ≥ 3 and some s₁ with |s₁| ≤ |s₀| − 1;

(v) any s₀ with |s₀| ≥ 3 and any s₁ where s₁ is identical to s₀ apart from in a single position j ∈ {2, 3, . . . , |s₀| − 1}.

In other words, given that moves of types (i)–(v) can occur, it is possible (with positive probability) to step, in a finite number of moves, between any two configurations in S by first adjusting the length of the configuration via moves of types (i)–(iv) and then flipping the states in the interior of the configuration via moves of type (v). Similarly, to show that all states in S \ {s_H} communicate, it suffices to show that all moves of types (iii)–(v) have positive probability.

It is not hard to see that voter moves can perform moves of types (ii), (iii) and (iv) in a single step (i.e., with n = 1). Similarly, exclusion moves with p < 1 can perform moves of types (i) and (iii) in one step, while exclusion moves with p > 0 can perform moves of types (ii) and (iv), possibly needing multiple steps. We claim that moves of type (v) can be performed provided: (a) β ∈ (0, 1); or (b) β = 0 and p ∈ (0, 1).

In case (a), suppose we need to replace a 0 by a 1 in the interior of a given configuration. If p < 1, we may perform a voter move on the first 10 to the left of the position to be changed, and then, if necessary, perform successive 10 ↦ 01 exclusion moves to 'step' the 1 into the desired position. If p > 0, an analogous procedure works, starting from the first 01 to the right. On the other hand, if we need to replace a 1 by a 0, a similar argument applies.

In case (b) we cannot use voter moves, but both types of exclusion move are permitted, so we can 'bring in' any 0 (1) from outside the disordered region, rearrange as necessary, and 'take out' the excess 1 (0) to the other boundary.

It follows that moves of types (ii)–(v) are possible provided β ≠ 1 and (β, p) ∉ {(0, 0), (0, 1)}, and all of (i)–(v) are possible if we additionally impose p < 1.

To complete the proof we need to demonstrate aperiodicity in the case where β ≠ 1, p < 1 and (β, p) ≠ (0, 0), where all states communicate. Since β ≠ 1, exclusion moves may occur. Moreover, every configuration other than s_H contains at least one pair of each type (01 and 10). Hence there is a positive probability that a configuration other than s_H remains unchanged at a given step (when a proposed exclusion move fails to occur). Thus, since all states communicate, we have aperiodicity.

6.4.3 Lyapunov functions

We define two Lyapunov functions, f₁ and f₂, from S to R₊ that will play a crucial role in our arguments. Set f₁(s_H) = f₂(s_H) = 0, and for s ∈ S \ {s_H} define

f₁(s) := (1/2)(Σ_{i=1}^{N} m_i R_i + Σ_{i=1}^{N} n_i T_i) = Σ_{i=1}^{N} m_i R_i = Σ_{i=1}^{N} n_i T_i;

and f₂(s) := (1/2)(Σ_{i=1}^{N} m_i R_i² + Σ_{i=1}^{N} n_i T_i²).

Note that f₁ is the area under the 'staircase' representation of the configuration (see Figure 6.3).

The value f₁(s) is equal exactly to the number of nearest-neighbour transpositions needed to pass from s to s_H; that is, f₁(s) is in some sense the 'distance' from s to the trivial configuration. Unfortunately, as we will see later, the function f₁ does not work well for some configurations s (namely, for s such that N(s) is small with respect to |s|). The function f₂ is the result of our attempts to modify f₁ in order to eliminate this disadvantage; we cannot give any intuitive meaning of f₂(s).

The next result gives some basic relations between |s|, f₁(s), and f₂(s).

Lemma 6.4.2. For any s ∈ S we have:

(i) (1/2)|s| ≤ f₁(s) ≤ (1/4)|s|²;

(ii) (1/4)|s|² ≤ f₂(s) ≤ (1/8)|s|³;

(iii) f₁(s) ≤ (f₂(s))^{3/4}.

Proof. For part (i), we have

f₁(s) = (1/2) Σ_{i=1}^{N} m_i R_i + (1/2) Σ_{i=1}^{N} n_i T_i ≥ (1/2)(R_N + T₁) = |s|/2; and

f₁(s) = Σ_{i=1}^{N} m_i R_i ≤ R_N Σ_{i=1}^{N} m_i = R_N T₁ ≤ (R_N + T₁)²/4 = |s|²/4.

Similarly, for part (ii), we have

f₂(s) ≥ (1/2)(R_N² + T₁²) ≥ (1/4)(R_N + T₁)² = |s|²/4; and

f₂(s) ≤ (1/2) R_N² Σ_{i=1}^{N} m_i + (1/2) T₁² Σ_{i=1}^{N} n_i = (1/2) R_N T₁ (R_N + T₁) ≤ |s|³/8.

Finally we prove part (iii). We shall make use of the following simple consequence of Jensen's inequality: if we have n positive numbers γ₁, . . . , γ_n such that Σ_{i=1}^{n} γ_i = 1, then for any x₁, . . . , x_n,

γ₁x₁ + · · · + γ_n x_n ≤ (γ₁x₁² + · · · + γ_n x_n²)^{1/2}. (6.57)

Denote α_i = m_i/|s|, β_i = n_i/|s|, so Σ_{i=1}^{N}(α_i + β_i) = 1. Using (6.57) and part (ii), we get

f₁(s) = (1/2) Σ_{i=1}^{N} (m_i R_i + n_i T_i) = (|s|/2) Σ_{i=1}^{N} (α_i R_i + β_i T_i)
≤ (|s|/2) (Σ_{i=1}^{N} (α_i R_i² + β_i T_i²))^{1/2}
= (√|s|/√2) (f₂(s))^{1/2}
≤ (√2 (f₂(s))^{1/4}/√2) (f₂(s))^{1/2} = (f₂(s))^{3/4},

thus completing the proof of Lemma 6.4.2.
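As a concrete check of Lemma 6.4.2, the sketch below (helper names are ours) computes R_i, T_i, f₁ and f₂ from the block sizes of configuration (6.56) and verifies the three inequalities, together with the equality of the two forms of f₁.

```python
def lyapunov(n, m):
    """f1 and f2 computed from the block sizes (n1..nN), (m1..mN)."""
    N = len(n)
    R = [sum(n[:i + 1]) for i in range(N)]            # R_i = n_1 + ... + n_i
    T = [sum(m[i:]) for i in range(N)]                # T_i = m_i + ... + m_N
    f1 = sum(mi * Ri for mi, Ri in zip(m, R))
    assert f1 == sum(ni * Ti for ni, Ti in zip(n, T))  # two forms of f1 agree
    f2 = 0.5 * (sum(mi * Ri ** 2 for mi, Ri in zip(m, R))
                + sum(ni * Ti ** 2 for ni, Ti in zip(n, T)))
    return f1, f2

n, m = [8, 4, 2, 2, 8], [3, 1, 1, 1, 4]               # configuration (6.56)
f1, f2 = lyapunov(n, m)
size = sum(n) + sum(m)                                # |s| = 34
assert (f1, f2) == (162, 2169.0)
assert size / 2 <= f1 <= size ** 2 / 4                # Lemma 6.4.2(i)
assert size ** 2 / 4 <= f2 <= size ** 3 / 8           # Lemma 6.4.2(ii)
assert f1 <= f2 ** 0.75                               # Lemma 6.4.2(iii)
```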

We will need expressions for the expected increments of our Lyapunov functions f₁(ξ_n) and f₂(ξ_n). First we deal with f₁.

Lemma 6.4.3. Let β, p ∈ [0, 1] and s ∈ S \ {s_H}. Then

E^h_{β,p}[f₁(ξ_{n+1}) − f₁(ξ_n) | ξ_n = s] = (1 − β)(1 − p) − N/(2N + 1).

Proof. Let s ∈ S \ {s_H}. As noted in Section 6.4.2, EP can transform a configuration s only either to s→_k or to s←_k, and we have

f₁(s←_k) − f₁(s) = −1, and f₁(s→_k) − f₁(s) = 1. (6.58)


Thus, summing over the discrepancies where an exclusion move can occur,

E^e_p[f₁(ξ_{n+1}) − f₁(ξ_n) | ξ_n = s]
= (p/(2N + 1)) Σ_{k=1}^{N} (f₁(s←_k) − f₁(s)) + (q/(2N + 1)) Σ_{k=0}^{N} (f₁(s→_k) − f₁(s))
= (N(q − p) + q)/(2N + 1). (6.59)

On the other hand, VM can transform s to one of s^{±ℓ}_k or s^{±r}_k. Direct computations yield

f₁(s^{+ℓ}_k) − f₁(s) = R_k − T_k − 1,
f₁(s^{−ℓ}_k) − f₁(s) = −R_k + T_k − 1,
f₁(s^{+r}_k) − f₁(s) = R_k − T_{k+1},
f₁(s^{−r}_k) − f₁(s) = −R_k + T_{k+1}.

Pairing off the terms s^{±ℓ}_k and the terms s^{±r}_k, it follows that

E^v[f₁(ξ_{n+1}) − f₁(ξ_n) | ξ_n = s] = (1/(2N + 1)) Σ_{k=1}^{N} (1/2)(−2) + (1/(2N + 1)) Σ_{k=0}^{N} (1/2)(0)
= −N/(2N + 1). (6.60)

In the hybrid process, a voter transition occurs with probability β, and an exclusion transition occurs with probability 1 − β; then combining (6.59) and (6.60) we obtain the result after some algebra.
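That algebra can also be checked by brute force. The sketch below is our own code (exact Fraction arithmetic, helper names ours): it enumerates every possible move from a given configuration, weights it by its probability under HP(β, p), and compares the exact expected increment of f₁ with the formula of Lemma 6.4.3.

```python
from fractions import Fraction
from itertools import groupby

def f1(core):
    """f1 = sum_i m_i R_i, computed directly from a canonical core tuple."""
    R, total = 0, 0
    for v, g in groupby(core):
        length = len(list(g))
        if v == 0:
            R += length          # extend R_i by the 0-block size
        else:
            total += length * R  # add m_i * R_i for the 1-block
    return total

def canonical(cells):
    """Translation representative: strip leading 1s and trailing 0s."""
    s = list(cells)
    while s and s[0] == 1:
        s.pop(0)
    while s and s[-1] == 0:
        s.pop()
    return tuple(s)

def expected_df1(core, beta, p):
    """Exact one-step expected increment of f1 under HP(beta, p)."""
    pad = [1] + list(core) + [0]
    base = f1(canonical(pad))
    disc = [i for i in range(len(pad) - 1) if pad[i] != pad[i + 1]]
    total = Fraction(0)
    for i in disc:
        swap = pad[:]; swap[i], swap[i + 1] = swap[i + 1], swap[i]
        pswap = (1 - p) if (pad[i], pad[i + 1]) == (1, 0) else p
        excl = pswap * (f1(canonical(swap)) - base)          # failed move changes nothing
        to0 = pad[:]; to0[i] = to0[i + 1] = 0
        to1 = pad[:]; to1[i] = to1[i + 1] = 1
        vot = Fraction(f1(canonical(to0)) + f1(canonical(to1)) - 2 * base, 2)
        total += (1 - beta) * excl + beta * vot
    return total / len(disc)                                  # uniform discrepancy choice

# configuration (6.56), with N = 5 finite 1-blocks
core = (0,) * 8 + (1,) * 3 + (0,) * 4 + (1,) + (0,) * 2 + (1,) + (0,) * 2 + (1,) + (0,) * 8 + (1,) * 4
beta, p, N = Fraction(1, 3), Fraction(1, 4), 5
assert expected_df1(core, beta, p) == (1 - beta) * (1 - p) - Fraction(N, 2 * N + 1) == Fraction(1, 22)
```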

Next we deal with f2.

Lemma 6.4.4. Let β, p ∈ [0, 1] and s ∈ S \ {s_H}. Then

E^h_{β,p}[f₂(ξ_{n+1}) − f₂(ξ_n) | ξ_n = s] = (1 − β)((N + q)/(2N + 1) − ((2p − 1)/(2N + 1)) Σ_{i=1}^{N} (R_i + T_i)).

Proof. Let s ∈ S \ {s_H}. The possible changes in f₂ due to the EP are

f₂(s→_k) − f₂(s) = (1/2)((R_k + 1)² − R_k² + (T_{k+1} + 1)² − T_{k+1}²) = 1 + R_k + T_{k+1}, (6.61)


and f₂(s←_k) − f₂(s) = (1/2)((R_k − 1)² − R_k² + (T_k − 1)² − T_k²) = 1 − R_k − T_k. (6.62)

Combining (6.61) and (6.62), we obtain

E^e_p[f₂(ξ_{n+1}) − f₂(ξ_n) | ξ_n = s]
= (p/(2N + 1)) Σ_{k=1}^{N} (f₂(s←_k) − f₂(s)) + (q/(2N + 1)) Σ_{k=0}^{N} (f₂(s→_k) − f₂(s))
= (N + q)/(2N + 1) − ((p − q)/(2N + 1)) Σ_{i=1}^{N} (R_i + T_i). (6.63)

On the other hand, the possible changes in f₂ due to the VM are

f₂(s^{+r}_k) − f₂(s) = (1/2)(R_k + T_{k+1} + R_k² − T_{k+1}²) − Σ_{i=k+1}^{N} m_i R_i + Σ_{i=1}^{k} n_i T_i; (6.64)

f₂(s^{−r}_k) − f₂(s) = (1/2)(R_k + T_{k+1} − R_k² + T_{k+1}²) + Σ_{i=k+1}^{N} m_i R_i − Σ_{i=1}^{k} n_i T_i; (6.65)

for k ∈ {0, . . . , N}, and, for k ∈ {1, . . . , N},

f₂(s^{+ℓ}_k) − f₂(s) = (1/2)(−R_k − T_k + R_k² − T_k²) − Σ_{i=k}^{N} m_i R_i + Σ_{i=1}^{k} n_i T_i; (6.66)

f₂(s^{−ℓ}_k) − f₂(s) = (1/2)(−R_k − T_k − R_k² + T_k²) + Σ_{i=k}^{N} m_i R_i − Σ_{i=1}^{k} n_i T_i. (6.67)

Summing up the contributions from (6.64)–(6.67) we obtain

E^v[f₂(ξ_{n+1}) − f₂(ξ_n) | ξ_n = s] = 0. (6.68)

Combining (6.63) and (6.68) gives the result.

6.4.4 Exclusion process

Our first result on the exclusion process is the following.

Theorem 6.4.5. If p > 1/2, then EP(p) is positive recurrent.


Proof. We use the Lyapunov function f₂. Since R_N + T₁ = |s|, R_i ≥ i and T_i ≥ N − i + 1, it is straightforward to get that

Σ_{i=1}^{N} (R_i + T_i) ≥ max{|s|, N(N + 1)}.

Using this fact, we get from (6.63) that

E^e_p[f₂(ξ_{n+1}) − f₂(ξ_n) | ξ_n = s] ≤ (N + 1)/(2N + 1) − (2p − 1) max{|s|, N(N + 1)}/(2N + 1).

Let p > 1/2. If (2p − 1)|s| > 2N + 2, we have

E^e_p[f₂(ξ_{n+1}) − f₂(ξ_n) | ξ_n = s] ≤ −(N + 1)/(2N + 1) ≤ −1/2,

and for (2p − 1)N > 3, we have

E^e_p[f₂(ξ_{n+1}) − f₂(ξ_n) | ξ_n = s] ≤ 1 − 3(N + 1)/(2N + 1) ≤ −1/2.

The set of s for which both (2p − 1)N ≤ 3 and (2p − 1)|s| ≤ 2N + 2 is finite, so we have shown that

E^e_p[f₂(ξ_{n+1}) − f₂(ξ_n) | ξ_n = s] ≤ −1/2,

for all but finitely many s ∈ S, and hence Theorem 2.6.4 shows that EP(p) is positive recurrent when p > 1/2.
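The inequality Σ(R_i + T_i) ≥ max{|s|, N(N + 1)} used at the start of the proof can be spot-checked on a concrete configuration; the helper below is ours.

```python
def sum_R_plus_T(n, m):
    """Compute sum_{i=1}^N (R_i + T_i) from the block sizes."""
    N = len(n)
    R = [sum(n[:i + 1]) for i in range(N)]   # partial sums of 0-block sizes
    T = [sum(m[i:]) for i in range(N)]       # tail sums of 1-block sizes
    return sum(R) + sum(T)

n, m = [8, 4, 2, 2, 8], [3, 1, 1, 1, 4]      # configuration (6.56), N = 5
total = sum_R_plus_T(n, m)                   # R = 8,12,14,16,24; T = 10,7,6,5,4
size = sum(n) + sum(m)                       # |s| = 34
assert total == 106
assert total >= max(size, len(n) * (len(n) + 1))
```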

Theorem 6.4.6. If p ≤ 1/2, then EP(p) is transient.

Proof. First we consider the case p < 1/2, where we will use the Lyapunov function f₁. In this case, we have from (6.58) that |f₁(ξ_{n+1}) − f₁(ξ_n)| ≤ 1, a.s., and, from (6.59),

E^e_p[f₁(ξ_{n+1}) − f₁(ξ_n) | ξ_n = s] ≥ N(1 − 2p)/(2N + 1) ≥ ε,

for some ε = ε(p) > 0 and all s ∈ S \ {s_H}. Then by Theorem 2.5.12, the process ξ_n is transient.

We turn now to the more delicate case where p = q = 1/2. In this case we have from (6.59) that

E^e_{1/2}[f₁(ξ_{n+1}) − f₁(ξ_n) | ξ_n = s] = 1/(2(2N + 1)), (6.69)


and so, since the right-hand side of (6.69) can be arbitrarily small, we cannot apply Theorem 2.5.12. As observed in Section 2.5, this theorem provides only a sufficient condition for transience, so there is no guarantee that it should be applicable in all situations. Therefore, to deal with the case p = 1/2, we seek to apply the more general result, Theorem 2.5.7.

We fix an arbitrary $\alpha > 0$ and define the function $\psi : S \setminus \{s_H\} \to \mathbb{R}_+$ by
\[ \psi(s) := \big(f_1(s)\big)^{-\alpha}. \]
Note that the definition is correct because $f_1(s) > 0$ for $s \ne s_H$. (Actually, for our proof of Theorem 6.4.6 it is sufficient to take $\alpha = 1$, but, since we will need analogous calculations for the hybrid process in Section 6.4.6, at this point we prefer to do the calculations for arbitrary $\alpha > 0$.)

For any $C > 0$, define the set $A_C \subset S$ by
\[ A_C := \{s \in S : f_1(s) \le C N(s)\}. \]
We claim that $A_C$ is finite for any $C > 0$. Indeed, $R_i \ge i$ and $m_i \ge 1$, so $f_1(s) \ge N(N+1)/2$. Thus, for a configuration to belong to $A_C$, it is necessary that the number of 1-blocks be at most $2C - 1$, so
\[ A_C \subseteq \{s \in S : f_1(s) \le C(2C-1)\}, \]
which is clearly finite (by e.g. Lemma 6.4.2(i)).

Now we have from (6.58) that

\[ \mathbb{E}^e_{1/2}[(f_1(\xi_{n+1}) - f_1(\xi_n))^2 \mid \xi_n = s] = \frac{1/2}{2N+1}\bigg( \sum_{k=1}^{N} \big(f_1(s^{\leftarrow}_k) - f_1(s)\big)^2 + \sum_{k=0}^{N} \big(f_1(s^{\rightarrow}_k) - f_1(s)\big)^2 \bigg) = \frac{1}{2}. \tag{6.70} \]

By elementary calculations, we get that for any $\alpha > 0$ there exist two positive numbers $C_1 = C_1(\alpha)$, $C_2 = C_2(\alpha)$ such that for all $|x| < C_2$,
\[ (1+x)^{-\alpha} - 1 \le -\alpha x + C_1 x^2. \]

Hence, using (6.69) and (6.70), we get
\begin{align*}
\mathbb{E}^e_{1/2}[\psi(\xi_{n+1}) - \psi(\xi_n) \mid \xi_n = s]
&= f_1^{-\alpha}(s)\, \mathbb{E}^e_{1/2}\bigg[ \Big(1 + \frac{f_1(\xi_{n+1}) - f_1(\xi_n)}{f_1(\xi_n)}\Big)^{-\alpha} - 1 \,\bigg|\, \xi_n = s \bigg] \\
&\le f_1^{-\alpha}(s)\bigg( -\frac{\alpha}{f_1(s)} \cdot \frac{1}{2(2N+1)} + \frac{C_1}{2 f_1^2(s)} \bigg) \\
&= f_1^{-\alpha-2}(s)\bigg( -\frac{\alpha f_1(s)}{2(2N+1)} + \frac{C_1}{2} \bigg) < 0, \tag{6.71}
\end{align*}
for all $s$ with $f_1(s) > 1/C_2$ and $f_1(s) > C_1(2N(s)+1)/\alpha$. The complementary set, with $f_1(s) \le 1/C_2$ or $f_1(s) \le C_1(2N(s)+1)/\alpha$, is the union of two finite sets (using the fact that $A_C$ is finite). Thus (6.71) holds for all but finitely many $s \in S$. Applying Theorem 2.5.7, we finish the proof.

6.4.5 Voter model

Our first result on the voter model is the following.

Theorem 6.4.7. The VM is positive recurrent. Moreover, for any $\varepsilon > 0$ and any initial configuration $s \in S$,
\[ \mathbb{E}^v[\tau^{(3/2)-\varepsilon} \mid \xi_0 = s] < \infty. \tag{6.72} \]

Proof. In the terminology introduced in Section 6.4.1, positive recurrence is equivalent to the finiteness of $\mathbb{E}\,\tau$; thus we shall turn directly to the proof of (6.72). The idea is to apply Theorem 2.7.1 to the process $(f_2(\xi_n))^\alpha$ for some $\alpha < 1$.

First, elementary calculus gives us that for $\alpha \in (0,1)$ there exists a positive constant $c_1$ such that, for all $|x| \le 1$,
\[ (1+x)^\alpha - 1 \le \alpha x - c_1 x^2. \tag{6.73} \]
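The constant $c_1$ in (6.73) can be found numerically for a given $\alpha$; the following sketch (our illustration, with an arbitrary grid) computes the best constant for $\alpha = 1/2$ and confirms it is positive:

```python
# g(x) = 1 + a*x - (1+x)^a is >= 0 on [-1, 1] by concavity; the best
# constant in (6.73) is c1 = min over x != 0 of g(x) / x^2.
a = 0.5
xs = [i / 1000 for i in range(-1000, 1001) if i != 0]
c1 = min((1 + a * x - (1 + x) ** a) / x ** 2 for x in xs)
print(c1 > 0)   # True: (6.73) holds on [-1, 1] with this c1
```

For $\alpha = 1/2$ the minimum is attained at the endpoint $x = 1$, where it equals $3/2 - \sqrt{2}$.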

Hence using (6.73) we obtain
\begin{align*}
\mathbb{E}^v[(f_2(\xi_{n+1}))^\alpha - (f_2(\xi_n))^\alpha \mid \xi_n = s]
&= (f_2(s))^\alpha\, \mathbb{E}^v\bigg[ \Big(1 + \frac{f_2(\xi_{n+1}) - f_2(\xi_n)}{f_2(\xi_n)}\Big)^\alpha - 1 \,\bigg|\, \xi_n = s \bigg] \\
&\le -c_1 (f_2(s))^{\alpha-2}\, \mathbb{E}^v[(f_2(\xi_{n+1}) - f_2(\xi_n))^2 \mid \xi_n = s], \tag{6.74}
\end{align*}

by the fact from (6.68) that $f_2(\xi_n)$ is a martingale under $\mathbb{P}^v$.

To obtain a lower bound for the final expectation in (6.74), observe from (6.64) and (6.65) that
\[ |f_2(s^{-r}_0) - f_2(s)| \ge \frac{T_1^2}{2}, \qquad\text{and}\qquad |f_2(s^{+r}_N) - f_2(s)| \ge \frac{R_N^2}{2}. \]

It follows from these two inequalities that
\begin{align*}
\mathbb{E}^v[(f_2(\xi_{n+1}) - f_2(\xi_n))^2 \mid \xi_n = s]
&\ge \frac{1}{4N+2}\big(f_2(s^{-r}_0) - f_2(s)\big)^2 + \frac{1}{4N+2}\big(f_2(s^{+r}_N) - f_2(s)\big)^2 \\
&\ge \frac{T_1^4 + R_N^4}{16N+8} \ge \frac{c|s|^4}{N}, \tag{6.75}
\end{align*}
for all $s \in S$, where $c > 0$ is a constant. Now, a very important observation is that the VM does not increase the number of blocks $N(\xi_n)$. So we have

\[ \mathbb{E}^v[(f_2(\xi_{n+1}) - f_2(\xi_n))^2 \mid \xi_n = s] \ge c_0 |s|^4, \quad\text{for all } s \in S, \tag{6.76} \]
with $c_0 = c_0(\xi_0) = c/N(\xi_0)$. Using (6.76) in (6.74), we obtain
\[ \mathbb{E}^v[(f_2(\xi_{n+1}))^\alpha - (f_2(\xi_n))^\alpha \mid \xi_n = s] \le -c_0 c_1 (f_2(s))^{\alpha-2} |s|^4. \tag{6.77} \]
Applying Lemma 6.4.2(ii), it follows from (6.77) that
\[ \mathbb{E}^v[(f_2(\xi_{n+1}))^\alpha - (f_2(\xi_n))^\alpha \mid \xi_n = s] \le -16 c_0 c_1 (f_2(s))^{\alpha - 2/3}. \]
We apply Theorem 2.7.1 to the process $X_n = (f_2(\xi_n))^{1/3}$, taking $\alpha$ to be close to 1, to finish the proof of (6.72).

The following theorem complements Theorem 6.4.7 by providing a result in the other direction.

Theorem 6.4.8. For any $\varepsilon > 0$ and any $s \in S \setminus \{s_H\}$,
\[ \mathbb{E}^v[\tau^{(3/2)+\varepsilon} \mid \xi_0 = s] = \infty. \tag{6.78} \]

For any initial configuration $\xi_0 = s \in S \setminus \{s_H\}$, since $N(\xi_{n+1}) - N(\xi_n)$ is either 0 or $-1$ under the VM dynamics, to reach $s_H$ the process must pass through a configuration $s_0$ with $N(s_0) = 1$. Thus it suffices to prove (6.78) for initial configurations of this type. Representing the process started with $N(\xi_0) = 1$ by the vector of block sizes, we obtain a random walk on $\mathbb{Z}_+^2$, for which $\tau$ is the time to reach the boundary of the quarter-plane. We must establish (6.78) for this random walk.

The transition probabilities of the random walk of interest are as follows: from the state $(n,m)$ with $nm \ne 0$, the transition can occur to each of the six states $(n+1,m)$, $(n-1,m)$, $(n,m+1)$, $(n,m-1)$, $(n+1,m-1)$ and $(n-1,m+1)$ with probability $1/6$.
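This walk is easy to simulate; the following hedged sketch (ours) runs the six-move dynamics and checks that, from a point at distance $n$ from the boundary, a positive fraction of trajectories survives for a time of order $n^2$, in the spirit of Lemma 6.4.9 below:

```python
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def survives(n, m, steps):
    """Run the six-move quarter-plane walk from (n, m); return True if it
    stays strictly inside the quadrant (nm != 0) for `steps` steps."""
    for _ in range(steps):
        dn, dm = random.choice(MOVES)
        n, m = n + dn, m + dm
        if n == 0 or m == 0:
            return False
    return True

random.seed(2)
n = 20
frac = sum(survives(n, n, n * n) for _ in range(500)) / 500
print(frac)  # a uniformly positive fraction survives for ~ n^2 steps
```

The simulation only illustrates the order of magnitude; the constants $\delta$ and $C$ of the lemma are not estimated here.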

Denote by $\tau_{n,m}$ the moment of hitting $D_0$ (i.e. the boundary) provided that the starting point was $(n,m)$. To proceed, we need the following

Lemma 6.4.9. There exist two positive constants $\delta$, $C$, such that for any $n$, $m$,
\[ \mathbb{P}\{\tau_{n,m} > \delta n^2\} \ge \frac{Cm}{m+n} \tag{6.79} \]
and
\[ \mathbb{P}\{\tau_{n,m} > \delta m^2\} \ge \frac{Cn}{m+n}. \tag{6.80} \]

Proof. Without loss of generality we can suppose that $n \le m$. Then, to prove (6.79), we will prove a stronger fact:
\[ \mathbb{P}\{\tau_{n,m} > \delta n^2\} \ge C_0 \tag{6.81} \]
for some $C_0$. In fact, it is a classical result that, with some uniformly positive probability, a homogeneous random walk in $\mathbb{Z}_+^2$ with bounded jumps and zero drift in the interior does not deviate by the distance $n$ from its initial position during the time $n^2$. To show how it can be proved formally, we denote by $\rho\big((n_1,m_1),(n_2,m_2)\big)$ the Euclidean distance between the points $(n_1,m_1)$ and $(n_2,m_2)$. Let the process start from $(n,m)$, and denote $Y_t = \rho\big(\xi_t,(n,m)\big)$. Then it is straightforward to get that the process $Y_t$ satisfies the hypothesis of Lemma 2 from Aspandiiarov et al. (1996), so applying it, we finish the proof of (6.79).

To prove (6.80), we need some additional notation. Denote
\begin{align*}
W^i(m) &= \{(n',m') : \rho\big((n',m'),(m,m)\big) \le m/2 + \sqrt{2}\},\\
V^i(m) &= \{(n',m') : m/2 < \rho\big((n',m'),(m,m)\big) \le m/2 + \sqrt{2}\},\\
W^e(m) &= \{(n',m') : m/2 + \sqrt{2} < \rho\big((n',m'),(m,m)\big) \le m\},\\
V^e(m) &= \{(n',m') : m - \sqrt{2} < \rho\big((n',m'),(m,m)\big) \le m\}.
\end{align*}
Clearly, the set $V^i(m)$ is the boundary of $W^i(m)$, and the set $V^e(m)$ is the external boundary of $W^e(m)$.

We consider the two possible cases:

a) $(n,m) \in W^i(m)$;

b) $(n,m) \in W^e(m)$.

Case a): first, we denote $Y_t = \rho\big(\xi_t,(m,m)\big)$. Then, we apply Lemma 2 from Aspandiiarov et al. (1996) to get that $\mathbb{P}\{\tau_{n,m} > \delta n^2\} \ge C_1$ for some $C_1$, and thus (6.80).

Case b): we keep the notation $Y_t$ from the previous paragraph. Denote by $p_{n,m}$ the probability of hitting the set $V^i(m)$ before the set $V^e(m)$, provided that the starting point is $(n,m)$. Our goal is to estimate this probability from below.

For $C > 0$ consider the process $Z^C_t$, $t = 0, 1, 2, \ldots$, defined in the following way:
\[ Z^C_t = \exp\Big\{C\Big(1 - \frac{Y_t}{m}\Big)\Big\} = \exp\Big\{C\Big(1 - \frac{\rho\big(\xi_t,(m,m)\big)}{m}\Big)\Big\}, \]
so that $Z^C_0 = \exp\{Cn/m\}$. One can prove the following technical fact: there exists a constant $C$ (not depending on $m$) such that

\[ \mathbb{E}(Z^C_{t+1} - Z^C_t \mid \xi_t = (n',m')) \ge 0 \tag{6.82} \]

for any point $(n',m') \in W^e(m)$, if $m$ is large enough. Indeed, using the fact that there exist two positive constants $C_1$, $C_2$ such that
\[ e^{-x} - 1 \ge -x + C_1 x^2 \]
on $|x| < C_2$, we write
\begin{align*}
\mathbb{E}(Z^C_{t+1} - Z^C_t \mid \xi_t = (n',m'))
&= \exp\Big\{C\Big(1 - \frac{|n'-m|}{m}\Big)\Big\}\, \mathbb{E}\Big( \exp\Big\{-\frac{C}{m}(Y_{t+1} - Y_t)\Big\} - 1 \,\Big|\, \xi_t = (n',m') \Big) \\
&\ge \frac{C}{m} \exp\Big\{C\Big(1 - \frac{|n'-m|}{m}\Big)\Big\}\, \mathbb{E}\Big( -(Y_{t+1} - Y_t) + \frac{C_1 C}{m}(Y_{t+1} - Y_t)^2 \,\Big|\, \xi_t = (n',m') \Big).
\end{align*}

Then, using properties of the process Yt, one can complete the proof of (6.82).

Now, to estimate $p_{n,m}$, we make the sets $V^i(m)$ and $V^e(m)$ absorbing. Using the fact that our random walk cannot jump over these sets, the process $Z^C_t$ converges as $t \to \infty$ to $Z^C_\infty$, so

\[ p_{n,m} \exp\Big\{\frac{C}{2}\Big\} + (1 - p_{n,m}) \ge \mathbb{E} Z^C_\infty \ge \mathbb{E} Z^C_0 = \exp\Big\{\frac{Cn}{m}\Big\}, \]
and thus
\[ p_{n,m} \ge \frac{\exp\{Cn/m\} - 1}{\exp\{C/2\} - 1} \ge \frac{C}{\exp\{C/2\} - 1} \cdot \frac{n}{m} \ge \frac{C}{\exp\{C/2\} - 1} \cdot \frac{n}{m+n}. \tag{6.83} \]


So, starting from the point $(n,m)$, with probability at least (6.83) the random walk hits the set $V^i(m)$. Then, from case a) it follows that with uniformly positive probability it will take at least $\delta m^2$ steps to reach the external boundary $V^e(m)$, so we complete the proof of (6.80) and thus of Lemma 6.4.9.

Proof of Theorem 6.4.8. Now, supposing that (6.78) does not hold, we have (denoting $\tau := \tau(S_0)$ and $a \wedge b := \min\{a,b\}$)
\begin{align*}
\mathbb{E}\,\tau^{3/2+\varepsilon}
&\ge \mathbb{E}\big(\tau^{3/2+\varepsilon}\, \mathbf{1}\{\tau \ge t\}\big)
= \mathbb{E}\big((t + \tau_{\xi_t})^{3/2+\varepsilon}\, \mathbf{1}\{\xi_s \notin D_0 \text{ for all } s \le t\}\big) \\
&\ge \frac{1}{2}\,\mathbb{E}\Big( (t + \delta n_t^2)^{3/2+\varepsilon}\, \frac{C m_t}{m_t + n_t}\, \mathbf{1}\{\xi_s \notin D_0 \text{ for all } s \le t\} \Big) \\
&\quad + \frac{1}{2}\,\mathbb{E}\Big( (t + \delta m_t^2)^{3/2+\varepsilon}\, \frac{C n_t}{m_t + n_t}\, \mathbf{1}\{\xi_s \notin D_0 \text{ for all } s \le t\} \Big) \\
&\ge \delta' C'\, \mathbb{E}\Big( \big(n_t^{2+\varepsilon'} m_t + m_t^{2+\varepsilon'} n_t\big)\, \mathbf{1}\{\xi_s \notin D_0 \text{ for all } s \le t\} \Big) \\
&= \delta' C'\, \mathbb{E}\big( n_{t\wedge\tau}^{2+\varepsilon'} m_{t\wedge\tau} + m_{t\wedge\tau}^{2+\varepsilon'} n_{t\wedge\tau} \big) \\
&\ge C''\, \mathbb{E}\big(f_2(\xi_{t\wedge\tau})\big)^{1+\varepsilon''}, \tag{6.84}
\end{align*}

for some constants $\delta'$, $\varepsilon'$, $C'$, $\varepsilon''$ and $C''$.

From (6.84) we get that the family $\{f_2(\xi_{t\wedge\tau})\}$ is uniformly integrable as $t \to \infty$, so $\mathbb{E}\,f_2(\xi_{t\wedge\tau}) \to \mathbb{E}\,f_2(\xi_\tau) = 0$. But this obviously contradicts Lemma ??.

6.4.6 Hybrid process

Finally, we turn to the hybrid process that allows both exclusion and voter moves. In this case, the complete recurrence classification is an open problem (see the bibliographical notes at the end of this chapter). The next result shows that if the voter model is sufficiently prominent, one has positive recurrence: in particular, HP($\beta$, $p$) is positive recurrent for $\beta > 2/3$ and any $p$.

Theorem 6.4.10. If $\beta, p \in [0,1]$ are such that $(1-p)(1-\beta) < 1/3$, then HP($\beta$, $p$) is positive recurrent.

Proof. We have from Lemma 6.4.3 that for all $s \in S \setminus \{s_H\}$,
\[ \mathbb{E}^h_{\beta,p}[f_1(\xi_{n+1}) - f_1(\xi_n) \mid \xi_n = s] \le (1-\beta)(1-p) - \frac{1}{3}; \]
provided $(1-p)(1-\beta) < 1/3$ this last expression is strictly negative, and so applying Theorem 2.6.4 we finish the proof.


The next result is not surprising when $p > 1/2$, since EP($p$) and VM are (separately) positive recurrent in that case, but recall that EP($1/2$) is transient.

Theorem 6.4.11. For any $\beta > 0$ and $p \ge 1/2$ the process HP($\beta$, $p$) is positive recurrent.

To prove Theorem 6.4.11, we will apply Theorem 2.6.4 with the function $(f_2(s))^\alpha$ for some $\alpha < 1$. This entails a computation similar to that in the proof of Theorem 6.4.7 on the positive recurrence of the VM. In the proof of Theorem 6.4.7 we used the fact that the number of blocks cannot increase under the VM dynamics to obtain from (6.75) the bound (6.76). For the hybrid process, this latter bound is not valid; instead, we have the following result.

Lemma 6.4.12. Let $\beta > 0$. Then there exists $c > 0$ such that for all $s \in S$,
\[ \mathbb{E}^h_{\beta,p}[(f_2(\xi_{n+1}) - f_2(\xi_n))^2 \mid \xi_n = s] \ge c|s|^{16/5}. \tag{6.85} \]

Proof. Since $\beta > 0$, there is positive probability of executing a voter move. Similarly to (6.75), writing $\Delta_k = f_2(s^{+r}_k) - f_2(s)$, we thus have
\[ \mathbb{E}^h_{\beta,p}[(f_2(\xi_{n+1}) - f_2(\xi_n))^2 \mid \xi_n = s] \ge \frac{\beta}{4N+2} \sum_{k=1}^{N} \Delta_k^2. \tag{6.86} \]

By simple algebraic calculations, one gets from (6.64) that
\[ \Delta_{k+1} - \Delta_k = f_2(s^{+r}_{k+1}) - f_2(s^{+r}_k) \ge m_{k+1} R_{k+1} + n_{k+1} T_{k+1} \ge N, \tag{6.87} \]
for $k \in \{0, \ldots, N-1\}$. From (6.64) one gets also that $\Delta_0 < 0$ and $\Delta_N > 0$, so denoting $L = \min\{k \ge 0 : \Delta_k \ge 0\}$ we have $1 \le L \le N$, $\Delta_k < 0$ for $k < L$, and $\Delta_k \ge 0$ for $k \ge L$. Then using (6.87), we get

\[ \sum_{k=1}^{N} \Delta_k^2 \ge \sum_{k=0}^{L-1} \big(N(k-L+1)\big)^2 + \sum_{k=L}^{N} \big(N(k-L)\big)^2 = N^2 \sum_{k=0}^{L-1} k^2 + N^2 \sum_{k=0}^{N-L} k^2 \ge c_1 N^5, \]
for some $c_1 > 0$. It then follows from (6.86) that
\[ \mathbb{E}^h_{\beta,p}[(f_2(\xi_{n+1}) - f_2(\xi_n))^2 \mid \xi_n = s] \ge c_2 N^4, \]

for some $c_2 > 0$. Combining this with (6.75), we get
\[ \mathbb{E}^h_{\beta,p}[(f_2(\xi_{n+1}) - f_2(\xi_n))^2 \mid \xi_n = s] \ge c_3 \max\Big\{N^4, \frac{|s|^4}{N}\Big\} \ge c_3 |s|^{16/5}, \]
thus proving the result.

Remark 6.4.13. The exponent $16/5$ in Lemma 6.4.12 is the best possible; to see this, one may take a configuration $s$ with $n_1 = m_N = N^{5/4}$ and $n_2 = \cdots = n_N = m_1 = \cdots = m_{N-1} = 1$ and compute the left-hand side of (6.85).

Proof of Theorem 6.4.11. It suffices to take $\beta \in (0,1)$, since we already know from Theorem 6.4.7 that VM is positive recurrent. First suppose that $p > 1/2$. Then, by Lemma 6.4.4,
\[ \mathbb{E}^h_{\beta,p}[f_2(\xi_{n+1}) - f_2(\xi_n) \mid \xi_n = s] \le (1-\beta)\, \mathbb{E}^e_p[f_2(\xi_{n+1}) - f_2(\xi_n) \mid \xi_n = s], \]
which is bounded above by $-(1-\beta)/2$ for all but finitely many $s \in S$, as shown in the proof of Theorem 6.4.5. Thus positive recurrence follows from Theorem 2.6.4.

Finally, we consider the much more delicate case where $p = 1/2$ and $\beta \in (0,1)$. This time, by the case $p = q = 1/2$ of Lemma 6.4.4,
\[ \mathbb{E}^h_{\beta,1/2}[f_2(\xi_{n+1}) - f_2(\xi_n) \mid \xi_n = s] = \frac{1-\beta}{2}. \tag{6.88} \]

Let $\alpha \in (0,1)$. Similarly to (6.74) we obtain from (6.73) that
\begin{align*}
\mathbb{E}^h_{\beta,1/2}[(f_2(\xi_{n+1}))^\alpha - (f_2(\xi_n))^\alpha \mid \xi_n = s]
&= (f_2(s))^\alpha\, \mathbb{E}^h_{\beta,1/2}\bigg[ \Big(1 + \frac{f_2(\xi_{n+1}) - f_2(\xi_n)}{f_2(\xi_n)}\Big)^\alpha - 1 \,\bigg|\, \xi_n = s \bigg] \\
&\le \frac{\alpha(1-\beta)}{2}(f_2(s))^{\alpha-1} - c(f_2(s))^{\alpha-2}|s|^{16/5},
\end{align*}
where we have used (6.88) and (6.85). Now we use the fact that $|s|^{16/5} \ge 2^{16/5}(f_2(s))^{16/15}$ by Lemma 6.4.2(ii), to obtain, for $14/15 < \alpha < 1$,
\begin{align*}
\mathbb{E}^h_{\beta,1/2}[(f_2(\xi_{n+1}))^\alpha - (f_2(\xi_n))^\alpha \mid \xi_n = s]
&\le \frac{\alpha(1-\beta)}{2}(f_2(s))^{\alpha-1} - 2^{16/5} c\,(f_2(s))^{\alpha-(14/15)} \\
&= -(f_2(s))^{\alpha-(14/15)}\Big[ 2^{16/5} c - \frac{\alpha(1-\beta)}{2}(f_2(s))^{-1/15} \Big] < -1,
\end{align*}
for all but finitely many $s \in S$, as follows from Lemma 6.4.2(ii). Applying Theorem 2.6.4, we finish the proof of Theorem 6.4.11.


Bibliographical notes

Section 6.1

Random walk in random environment (RWRE) was first considered by M.V. Kozlov [166] (following up on "a hypothesis by A.N. Kolmogorov") and F. Solomon [268] (a student of F.L. Spitzer); subsequently these prototypical models of random processes in random media have been investigated by many authors, see e.g. [242, 288, 289, 276].

The recurrence classification, Theorem 6.1.1, is due to Solomon [268] in the case of RWRE on $\mathbb{Z}$. Many of the papers on one-dimensional RWRE work on $\mathbb{Z}$ rather than $\mathbb{Z}_+$. Obvious differences include the fact that on $\mathbb{Z}$ one may have $X_n \to -\infty$, while in the i.i.d. case this regime corresponds to the positive-recurrent regime for $X_n$ on $\mathbb{Z}_+$. Taking such differences into account, for many questions of interest the distinction is inessential, and the analysis is more-or-less unchanged.

The null-recurrent case in which $\mathbb{E}[\zeta_0^2] < \infty$ and $\mathbb{E}\,\zeta_0 = 0$ is known as Sinai's regime after Sinai's remarkable result [265] that $(\log n)^{-2} X_n$ converges weakly to a non-degenerate limit under the annealed probability measure $\mathbb{P}$ given by
\[ \mathbb{P}[\,\cdot\,] = \int_\Omega P_\omega[\,\cdot\,]\, \mathsf{P}[\mathrm{d}\omega]. \]

Golosov [106] and Kesten [150] identified the limiting distribution in Sinai's result.

The almost-sure upper bound in Theorem 6.1.2 is due to Deheuvels and Révész (Theorem 4 of [58]), with a different proof. The (shorter) approach presented here follows [47, 214]. The almost-sure lower bound in Theorem 6.1.6 uses the method of [122], which improved on the method of [214]; the key Lemma 6.1.8 is Lemma 4.3 in [122].

Sharp results on the almost-sure behaviour of the RWRE in Sinai's regime were provided by Hu and Shi [124], making use of some delicate technical estimates involving the potential associated with the random environment:
\[ \limsup_{n\to\infty} \frac{X_n}{(\log n)^2 \log\log\log n} = \frac{8}{\pi^2 \sigma^2}, \quad P_\omega\text{-a.s. for } \mathsf{P}\text{-a.e. } \omega. \]

Thus the lower bound in Theorem 6.1.6 is close to sharp, while the upper bound in Theorem 6.1.2 is rougher. The Lyapunov function methods presented here are not as sharp as those, say, of [124], but they are simpler and more robust, and can be employed in more general settings where the random environment is not i.i.d., such as the model of Section 6.1.5.

The model and results of Section 6.1.5 are based on [213, 214]. The recurrence classification for the perturbation of Sinai's regime, Theorem 6.1.10, is contained in Theorems 6 and 7 of [213], which dealt with more general perturbations as well as a random environment constructed by a random perturbation of symmetric simple random walk. The almost-sure bounds in Theorem 6.1.11 are from Theorem 3 of [214]. In the setting of Theorem 6.1.11, using a more delicate analysis, [100] obtained the precise result that for some constant $c$ depending on $a$, $\beta$, and $\mathbb{E}[\zeta_0^2]$,
\[ \lim_{n\to\infty} \frac{X_n}{(\log n)^{1/\beta} (\log\log n)^{-1/\beta}} = c, \quad P_\omega\text{-a.s. for } \mathsf{P}\text{-a.e. } \omega. \]
Thus it is the lower bound in Theorem 6.1.11 that is closest to being sharp.

Different behaviour is observed when $\mathbb{E}[\zeta_0^2] = \infty$. The natural model in this heavy-tailed setting takes $\zeta_0$ to be in the domain of attraction of a stable law. Singh [266] gave analogues of the almost-sure results of Hu and Shi [124] in the stable law setting. Analogues of Sinai's weak convergence result were obtained by Schumacher [256, 257] and Kawazu et al. [141]: now $(\log n)^{-\alpha} X_n$ has an annealed weak limit, where $\alpha \in (0,2]$ is the index of the stable law. In the heavy-tailed setting, Lyapunov function methods were used to study some perturbations of the i.i.d. environment in [122].

Several techniques are available for studying nearest-neighbour RWRE on $\mathbb{Z}$ or $\mathbb{Z}_+$; since it is reversible, explicit calculations are often possible. In this setting the Lyapunov function method often boils down to performing the 'classical' calculations. However, as emphasised elsewhere in this book, the Lyapunov function method is particularly useful in non-reversible situations, such as in the context of random strings in random environment discussed in Section 6.2, or for many-dimensional random walks with rare defects [210].

Section 6.2

The presentation and results in this section follow [47]. Gajrat, Malyshev, Menshikov and Pelikh in [97] studied a version of the model in which the quantities $r^\ell_i$, $q^\ell_{ij}$ and $p^\ell_{ij}$ do not depend on $\ell$, but the maximal jump size for the string length may be greater than one. All these models may be interpreted in terms of LIFO (last in, first out) queueing systems, or as random walks on trees: see [97]. We consider here only the one-sided evolution of the string; two-sided homogeneous strings were studied by Gajrat, Malyshev and Zamyatin in [98].

In addition to Theorem 6.2.3, [47] gave a second sufficient condition for transience: if (N) holds for $A_n$, and $\mathbb{E}\log(1/p_{ii}) < \infty$ for all $i \in A$, then $\gamma_1 > 0$ implies transience.

Section 6.3

Billiards models are dynamical systems describing the motion of a particle in a region with reflection rules at the boundaries. Physical motivation originates with dynamics of an ideal gas in the low-density (Knudsen) regime [163]. The random reflection rule in stochastic billiards models is motivated by the fact that the particle is small and the surface off which it reflects has a complicated (rough) microscopic structure. Stochastic billiards models were studied in [45, 79, 211, 46]; we refer to those papers for further references.

The results of this section on stochastic billiards in unbounded planar domains are based on [211]. In [211], in addition to recurrence and transience, almost-sure bounds on the trajectories of the collisions process and the associated continuous-time trajectory were obtained; in [211] it was assumed that the distribution of $\alpha$ be symmetric around 0, which is stronger than our condition $\mathbb{E}\tan\alpha = 0$.

As mentioned in [211], it would be of interest to relax the assumption that $\alpha$ is bounded away from $\pm\pi/2$; there are several natural distributions that are supported on the full interval [45, 79, 163]. We expect that if $\alpha$ has distribution on $(-\pi/2, \pi/2)$ with sufficiently light tails at the endpoints (such that in particular $\mathbb{E}[\tan^2\alpha] < \infty$, say), then the results of Section 6.3 should carry through (technically, one would obtain a Lamperti process whose increments were not uniformly bounded but satisfied a moments condition that should still admit the methods of Chapter 3).

Open problem 6.4.14. How does the stochastic billiards model behave when $\mathbb{E}[\tan^2\alpha] = \infty$? This is the case, for example, when $\alpha$ has the natural Lambertian or cosine law of reflection, with a density proportional to $\cos\alpha$ on $(-\pi/2, \pi/2)$. We conjecture that the process is transient if and only if $\gamma > 1/2$ in this case.

Section 6.4

The results of Section 6.4 are based on [16], while some of the exposition is based on [193]. For background on the exclusion process and the voter model, and interacting particle systems in general, see the books [181, 182].


Liggett [180] described the set of invariant measures for the exclusion process on $\mathbb{Z}$. If $p = 1/2$, the invariant measures are convex combinations of the translation invariant product measures parametrized by the density of particles. If $p > 1/2$, the set of invariant measures contains also measures with support in the set of configurations with a finite number of empty sites to the left of the origin and a finite number of particles to the right of it; this fact can be used to establish positive recurrence when $p > 1/2$ (Theorem 6.4.5). Such measures are sometimes called blocking measures because, due to the exclusion rule and the accumulation of particles to the left of the origin, the flux of particles is null. There are also blocking measures for $p < 1/2$; they are obtained by the reflection of those mentioned above. When an asymmetric exclusion process ($p \ne 1/2$) is considered from a random position determined by a so-called second-class particle, a new set of invariant measures arises. They are called shock measures; they have support on configurations with different asymptotic densities to the left and right of the origin; see [89, 87, 64]. See the review paper [88] and the book [182] for an account of properties of these measures and the asymptotic behaviour of the second-class particle.

As mentioned above, Theorem 6.4.5 can be deduced from results of Liggett [180]; for $p \le 1/2$ those results say only that EP($p$) is not positive recurrent. The transience result, Theorem 6.4.6, is from [16].

In the voter model, only one site changes its value at any given time, so the voter model is an example of a spin-flip model. There are only two invariant measures for the voter model on $\mathbb{Z}$: one supported on the 'all-0s' configuration and the other supported on the 'all-1s' configuration. A basic tool to prove such results is duality between the voter model and a dual process obtained when one runs the voter model backwards in time. There are two dual processes for the voter model: coalescing random walks and annihilating random walks. See [71] and Chapter V of [181] for accounts of these and many other properties of the voter model. Theorems 6.4.7 and 6.4.8 are from [16]; it was subsequently shown in [169] that the boundary case not covered by these two results has $\mathbb{E}^v[\tau^{3/2}] = \infty$.

The hybrid exclusion-voter process was introduced in [16], and Theorems 6.4.10 and 6.4.11 are taken from that paper; some further developments can be found in [131, 273, 193, 169]. The process can also be viewed as a particular case of a model of random grammars [198]. The continuous-time exclusion-voter model can be defined via its infinitesimal generator and constructed via a Harris-type graphical construction. The discrete-time process studied in Section 6.4 is naturally embedded in the continuous-time process. We rather study these discrete versions in depth (after all, this book is mainly about discrete-time Markov chains), and then only note that the corresponding results for the continuous-time versions of the processes are either straightforward or may be obtained in an elementary way, as shown in Section 8 of [16].

The voter model has been used to model the spread of opinion through a static population via nearest-neighbour interactions; see, for example, [118]. The hybrid model studied here is a natural extension of this model whereby individuals do not have to remain static, but may move by switching places. Alternative motivations, such as from the point of view of competition of species (see e.g. [38]), can also be adapted to the hybrid process. Models consisting of a mixture of a spin-flip dynamics and exclusion dynamics come under the title of reaction-diffusion processes; see e.g. [16] for more discussion of such processes.

The main open problem for the exclusion-voter process is:

Open problem 6.4.15. Give the complete recurrence classification for HP($\beta$, $p$) for $\beta, p \in [0,1]$.

It is known from Theorems 6.4.10 and 6.4.11, and Theorem 4 of [193] (which is obtained using another Lyapunov function), that the process is recurrent if one of the following holds:

• β > 0 and p ≥ 1/2;

• (1− p)(1− β) < 1/3;

• p < 1/2 and β ≥ 4/7.

On the other hand, transience is established only for $\beta = 0$ and $p \le 1/2$ (Theorem 6.4.6). See [193] for some other open problems related to this process.

Further remarks

To conclude this chapter we give bibliographic details for some other applications of Lyapunov function ideas that did not make it into this book.

Queueing theory. The process that tracks the lengths of a system of $d$ queues often gives rise to a random walk in a positive orthant $\mathbb{Z}_+^d$. Depending on the details of the queueing model, the random walk may possess varying degrees of spatial homogeneity. In the book [84] basic applications of the Lyapunov function method to non-critical queueing systems were given. Subsequently, near-critical cases have been treated; for example, polling systems are studied in [205, 191]. There are also applications to some more exotic queueing models, such as the supermarket model [190] or polling systems with randomized service-regime regenerations [192, 189].

Branching random walks. A particle performs a random walk while at the same time producing offspring as in a classical branching process. The natural asymptotic question regards local survival and extinction, i.e., is the origin visited infinitely often by some member of the population? We refer to [212, 186, 207] for approaches to this question via Lyapunov functions. The model can be extended to consider a branching random walk in random environment, and this too has received attention [41, 60, 187, 188, 43, 44].

Self-interacting random walks. There are several models currently popular concerning random walks whose dynamics depends on their past trajectory in some way. For example, reinforced random walks or excited random walks, where transition probabilities are functions of nearby occupation times for the process (see e.g. [232]); in this context ideas related to Lamperti's problem were recently applied in [167]. Other possibilities are non-local interactions with the occupation time distribution, such as interaction mediated by the centre of mass of the previous trajectory [42].

Growth and deposition models. These include models inspired by biology and physics; see [146, 142, 4, 120, 259].


Chapter 7

Markov chains in continuous time

7.1 Introduction and notation

In this chapter we establish general theorems quantifying the notion of recurrence (by studying which moments of the passage time exist) for irreducible continuous-time Markov chains $\xi = (\xi_t)_{t\in[0,\infty)}$ on a countable space $X$ in critical regimes. The aforementioned criticality means that even the discrete-time chain, sampled at the moments of jumps, already has non-trivial behaviour.

Continuous-time Markov chains have an additional feature compared to discrete-time ones, namely, in each visited state they spend a random holding time (exponentially distributed) defined as the difference between successive jump times. We consider the space-inhomogeneous situation where the parameters $\gamma_x \in \mathbb{R}_+$ of the exponential holding times (the inverses of their expectations) are unbounded, i.e. $\sup_{x\in X} \gamma_x = +\infty$. In such situations, the phenomenon of explosion can occur for transient chains, informally meaning that the chain makes an infinite number of steps in finite time.

In the discrete-time case, many results on recurrence/transience, i.e. the estimation of the number of times the Markov chain returns to a given state, can be obtained through estimating the moments of the time needed to return to this state. The reason is that (discrete) time flows homogeneously; each step of the chain takes a unit of time to be performed because the internal clock of the chain ticks at constant pace. In the continuous-time case, the connection between the number of steps and the time needed to perform them is more subtle, because the internal clock of the chain ticks at different pace when the process visits different states. The second aim of this chapter is to show that the phenomenon of explosion can also be sharply studied by the use of Lyapunov functions, and to establish locally verifiable conditions for explosion/non-explosion for Markov chains on arbitrary graphs. This method is applied to models that even without explosion are difficult to study. More fundamentally, the development of the semimartingale method has been largely inspired by having these specific critical models in mind (such as the cascade of $k$-critically transient Lamperti models or of reflected random walks on quarter planes) that seem refractory to known methods.

Also, we discuss a new phenomenon, which we term implosion (see Definition 7.1.2 below), reminiscent of Doeblin's condition for general Markov chains [65], occurring in the case $\sup_{x\in X} \gamma_x = \infty$. We show that this phenomenon can also be explored with the help of Lyapunov functions.

Throughout this chapter, $X$ denotes the state space of our Markov chains; it denotes an abstract countably infinite set, equipped with its full $\sigma$-algebra $\mathcal{X} = \mathcal{P}(X)$. It is worth stressing here that, generally, this space is not naturally partially ordered. The graph whose edges are the ones induced by the stochastic matrix, when equipped with the natural graph metric on $X$, need not be isometrically embeddable into $\mathbb{Z}^d$ for some $d$. Since the definition of a continuous-time Markov chain on a denumerable set is standard (see [35], for instance), we introduce below its usual equivalent description in terms of holding times and embedded Markov chain merely for the purpose of establishing our notation.

Denote by $\Gamma = (\Gamma_{xy})_{x,y\in X}$ the generator of the continuous-time Markov chain, namely the matrix satisfying: $\Gamma_{xy} \ge 0$ if $y \ne x$ and $\Gamma_{xx} = -\gamma_x$, where $\gamma_x = \sum_{y\in X\setminus\{x\}} \Gamma_{xy}$. We assume that for all $x \in X$, we have $\gamma_x < \infty$.

We construct a stochastic Markovian matrix $P = (P_{xy})_{x,y\in X}$ out of $\Gamma$ by defining, for $y \ne x$,
\[ P_{xy} = \begin{cases} \Gamma_{xy}/\gamma_x, & \text{if } \gamma_x \ne 0,\\ 0, & \text{if } \gamma_x = 0, \end{cases} \qquad\text{and}\qquad P_{xx} = \begin{cases} 0, & \text{if } \gamma_x \ne 0,\\ 1, & \text{if } \gamma_x = 0. \end{cases} \]

The kernel $P$ defines a discrete-time $(X, P)$-Markov chain $\xi = (\xi_n)_{n\in\mathbb{N}}$, termed the Markov chain embedded at the moments of jumps. To avoid irrelevant complications, we always assume that this Markov chain is irreducible.

Define a sequence $\sigma = (\sigma_n)_{n\ge1}$ of random holding times distributed, conditionally on $\xi$, according to an exponential law. More precisely, consider
\[ \mathbb{P}[\sigma_n \in \mathrm{d}s \mid \xi] = \gamma_{\xi_{n-1}} \exp(-s\gamma_{\xi_{n-1}})\, \mathbf{1}\{s \in \mathbb{R}_+\}\, \mathrm{d}s, \qquad n \ge 1, \]


so that $\mathbb{E}[\sigma_n \mid \xi] = 1/\gamma_{\xi_{n-1}}$. The sequence $J = (J_n)_{n\in\mathbb{N}}$ of random jump times is defined accordingly by $J_0 = 0$ and, for $n \ge 1$, by $J_n = \sum_{k=1}^{n} \sigma_k$. The life time is denoted $\zeta = \lim_{n\to\infty} J_n$, and we say that the (not yet defined continuous-time) Markov chain explodes on $\{\zeta < \infty\}$, while it does not explode (or is regular, or conservative) on $\{\zeta = \infty\}$.

Remark 7.1.1. The parameter $\gamma_x$ must be interpreted as the proper frequency of the internal clock of the Markov chain, multiplicatively modulating the local speed of the chain. We always assume that for all $x \in X$, $0 < \gamma_x < \infty$. The case $0 < \underline{\gamma} := \inf_{x\in X} \gamma_x \le \sup_{x\in X} \gamma_x =: \overline{\gamma} < \infty$ is elementary, because the chain can be stochastically controlled by two Markov chains whose internal clocks tick respectively at constant pace $\underline{\gamma}$ and $\overline{\gamma}$. Therefore the sole interesting cases are

• $\sup_x \gamma_x = \infty$: the internal clock ticks unboundedly fast (leading to an unbounded local speed of the chain);

• $\inf_x \gamma_x = 0$: the internal clock ticks arbitrarily slowly (leading to a local speed that can be arbitrarily close to 0).
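To see explosion concretely, consider a toy pure birth chain (ours, not a model from the text) that always jumps $n \to n+1$ and speeds up geometrically, $\gamma_n = 2^n$: the life time $\zeta = \sum_n \sigma_n$ has mean $\sum_n 2^{-n} = 2$, so the chain a.s. makes infinitely many jumps in finite time, whereas a constant clock gives $\zeta = \infty$. A hedged simulation sketch:

```python
import random

def time_for_jumps(rate, n_jumps=200):
    """Total time for a pure birth chain (state n, jumping to n+1 at rate
    rate(n)) to make n_jumps jumps: a sum of independent exponentials."""
    return sum(random.expovariate(rate(n)) for n in range(n_jumps))

random.seed(3)
fast = sum(time_for_jumps(lambda n: 2.0 ** n) for _ in range(200)) / 200
slow = sum(time_for_jumps(lambda n: 1.0) for _ in range(200)) / 200
print(fast)  # stays near 2 however many jumps are taken: explosion
print(slow)  # grows like the number of jumps (about 200): no explosion
```

The dichotomy matches the classical criterion for pure birth chains: explosion occurs a.s. if and only if $\sum_n 1/\gamma_n < \infty$.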

To have a unified description of both explosive and non-explosive processes, we can extend the state space into X̂ = X ∪ {∂} by adjoining a special absorbing state ∂. The continuous-time Markov chain is then the càdlàg process ξ̂ = (ξ̂_t)_{t∈[0,∞)} defined by ξ̂_0 = ξ_0 and

ξ̂_t = ∑_{n∈N} ξ_n 1{t ∈ [J_n, J_{n+1})}, for 0 < t < ζ, and ξ̂_t = ∂, for t ≥ ζ.
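The construction above (jump chain plus conditionally exponential holding times) can be sketched in code. This is our own illustrative sketch, not from the text; the names `simulate_ctmc`, `jump`, `gamma` and the truncation at `t_max` are ours, and we assume 0 < γ_x < ∞ as in the text:

```python
import random

def simulate_ctmc(jump, gamma, x0, t_max, rng):
    """Simulate a continuous-time Markov chain up to time t_max from its
    jump kernel and rates: jump(x, rng) samples the next state of the
    embedded chain, gamma(x) > 0 is the rate of leaving x.  Returns the
    trajectory as a list of (jump_time, state) pairs."""
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        t += rng.expovariate(gamma(x))  # holding time ~ Exp(gamma(x))
        if t >= t_max:                  # stop before t_max; guards against explosion
            return path
        x = jump(x, rng)
        path.append((t, x))

rng = random.Random(1)
# Sanity check: a rate-1 Poisson counting process, jumping n -> n+1.
path = simulate_ctmc(lambda n, r: n + 1, lambda n: 1.0, 0, 100.0, rng)
print(len(path) - 1)  # number of jumps before time 100 (about 100 on average)
```

For an explosive chain one would additionally cap the number of jumps, since the while-loop above could otherwise run forever within a finite time horizon only by accumulating infinitely many jumps.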

Note that although X is merely a set (i.e. no internal composition rule is defined on it), the above “sum” is well defined since for every fixed t only one term survives. We refer the reader to standard texts (for instance [35, 221]) for the proof of the equivalence between ξ̂ and (ξ, J). The natural right-continuous filtration (F_t)_{t∈[0,+∞)} is defined as usual through F_t = σ(ξ̂_s : s ≤ t); similarly F_{t−} = σ(ξ̂_s : s < t), and F_n = σ(ξ_k : k ≤ n) for n ∈ N. For an arbitrary (F_t)-stopping time τ, we denote as usual its past σ-algebra F_τ = {A ∈ F_∞ : A ∩ {τ ≤ t} ∈ F_t for all t ≥ 0} and its strict past σ-algebra F_{τ−} = σ({A ∩ {t < τ} : t ≥ 0, A ∈ F_t}) ∨ F_0. Since it is immediate to show that τ is F_{τ−}-measurable, we conclude that the only information contained in F_{J_{n+1}} but not in F_{J_{n+1}−} is conveyed by the random variable ξ_{n+1}, i.e. the position to which the chain jumps at the moment J_{n+1}.


If A ∈ 𝒳, we denote by τ_A = inf{t ≥ 0 : ξ_t ∈ A} the (F_t)-stopping time of reaching A.

A dual notion to explosion is that of implosion:

Definition 7.1.2. Let (ξ_t)_{t∈[0,∞)} be a continuous-time Markov chain on X and let A ⊂ X be a proper subset of X. We say that the Markov chain implodes towards A if there exists K > 0 such that E_x τ_A ≤ K for all x ∈ A^c.

Remark 7.1.3. It will be shown in Proposition 7.3.13 that when the set A is finite and the chain is irreducible, implosion towards A means implosion towards any state. In this situation, we speak about implosion of the chain.

It is worth noticing that other definitions of implosion can be introduced; all convey the same idea of reaching a finite set from an arbitrary initial point within a random time that can be bounded, uniformly in the initial point, in some appropriate stochastic sense. We stick to the form introduced in the previous definition because it is easier to establish necessary and sufficient conditions for its occurrence, and easier to illustrate on specific problems (see Section 7.4).

We use the notational conventions of [244] to denote measurable functions, namely mX = {f : X → R, f is 𝒳-measurable}, with all possible decorations: bX to denote bounded measurable functions, mX+ to denote non-negative measurable functions, etc. For f ∈ mX+ and α > 0, we denote by S_α(f) the sublevel set of f of height α, defined by

S_α(f) := {x ∈ X : f(x) ≤ α}.

We recall that a function f ∈ mX+ is unbounded if sup_{x∈X} f(x) = +∞, while it tends to infinity (f → ∞) when for every n ∈ N the sublevel set S_n(f) is finite. Measurable functions f defined on X can be extended to functions f̂ defined on X̂ by f̂(x) = f(x) for all x ∈ X and f̂(∂) := 0 (with the obvious extension of the σ-algebra).

We denote by Dom(Γ) = {f ∈ mX : ∑_{y∈X\{x}} Γ_{xy}|f(y)| < +∞ for all x ∈ X} the domain of the generator, and by Dom+(Γ) the set of non-negative functions in the domain. The action of the generator Γ on f ∈ Dom(Γ) then reads Γf(x) := ∑_{y∈X} Γ_{xy}f(y). Similarly we could have defined Dom(P) and Dom+(P). Nevertheless, it is immediate to see that whenever the inequalities 0 < γ_x < ∞ hold for all x ∈ X, we have Dom(Γ) = Dom(P) and Dom+(Γ) = Dom+(P).


7.2 Main results

We recall once more that throughout this chapter we make the following assumption: the chain embedded at the moments of jumps is irreducible, and 0 < γ_x < ∞ for all x ∈ X.

We are now in a position to state our main results concerning the use of Lyapunov functions to obtain, through semimartingale theorems, precise and locally verifiable conditions on the parameters of the chain allowing us to establish existence or non-existence of moments of passage times, and explosion or implosion phenomena. The proofs of these results are given in Section 7.3; Section 7.4 treats some critical models that are difficult to study even in discrete time, thus illustrating the power of our methods and giving specific examples of how to use them.

7.2.1 Existence or non-existence of moments of passage times

Theorem 7.2.1. Let f ∈ Dom+(Γ) be such that f → ∞.

(i) If there exist constants a > 0, c > 0 and p > 0 such that f^p ∈ Dom+(Γ) and

Γf^p(x) ≤ −c f^{p−2}(x), for all x ∉ S_a(f),

then E_x[τ_{S_a(f)}^q] < +∞ for all q < p/2 and all x ∈ X.

(ii) Let g ∈ mX+. If there exist

(a) a constant b > 0 such that f ≤ bg,

(b) constants a > 0 and c_1 > 0 such that Γg(x) ≥ −c_1 for x ∉ S_a(g),

(c) constants c_2 > 0 and r > 1 such that g^r ∈ Dom(Γ) and Γg^r(x) ≤ c_2 g^{r−1}(x) for x ∉ S_a(g), and

(d) a constant p > 0 such that f^p ∈ Dom(Γ) and Γf^p(x) ≥ 0 for x ∉ S_{ab}(f),

then E_x[τ_{S_a(f)}^q] = +∞ for all q > p and all x ∉ S_a(f).

Remark 7.2.2. The condition f → ∞ (together with the majorisation on Γf^p(x)) in the first statement of Theorem 7.2.1 guarantees recurrence of the chain. If we assume recurrence of the chain and can find a bounded function f, we can prove existence of exponential moments of the time of reaching a finite set. This phenomenon will be discussed later (see Proposition 7.2.12 below).


In many cases, the function g, whose existence is assumed in statement (ii) of the above theorem, can be chosen as g = f (with, obviously, b = 1). In such situations we have to check Γf^r ≤ c_2 f^{r−1} for some r > 1 and find a p > 0 such that Γf^p ≥ 0 on the appropriate sets. However, for the problem studied in Section 7.4.3, for instance, the full-fledged version of the previous theorem is needed.

Note that the conditions f ∈ Dom+(Γ) and f^p ∈ Dom+(Γ) for some p > 0, holding simultaneously, imply that f^q ∈ Dom+(Γ) for all q in the interval with end points 1 and p. When τ_A is integrable, the chain is positive recurrent. In the null recurrent situation, however, τ_A is almost surely finite but not integrable; nevertheless, some fractional moments E τ_A^q with q < 1 can exist. Similarly, in the positive recurrent case, some higher moments E τ_A^q with q > 1 may fail to exist.

When p = 2, the first statement of Theorem 7.2.1 simplifies to the following: if Γf(x) ≤ −ε for some ε > 0 and for x outside a finite set F, then the passage time satisfies E_x τ_F^q < ∞ for all x ∈ X and all q < 1. As a matter of fact, in this situation we have a stronger result, expressed in the form of the following

Theorem 7.2.3. Suppose that the chain is recurrent. The following are equivalent:

(i) The chain is positive recurrent.

(ii) There exists a triple (ε, F, f), with ε > 0, F a finite non-empty subset of X, and f a function in Dom+(Γ) verifying Γf(x) ≤ −ε for all x ∉ F.

Obviously, positive recurrence implies a fortiori that E_x τ_F < ∞.
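As a concrete illustration (our own example, not from the text): consider an M/M/1-type birth-and-death chain on {0, 1, 2, …} with birth rate `lam` everywhere and death rate `mu` from every x ≥ 1. Taking f(x) = x, the identity Γf(x) = γ_x m_f(x) gives Γf(x) = lam − mu for x ≥ 1, so when lam < mu the triple (ε, F, f) = (mu − lam, {0}, f) witnesses condition (ii):

```python
# Hedged sketch of Theorem 7.2.3(ii) on an M/M/1-type chain on {0,1,2,...}:
# rates lam up and mu down (the latter only from x >= 1), f(x) = x.
lam, mu = 1.0, 2.0

def Gamma_f(x):
    # Gamma f(x) = sum over neighbours y of Gamma_{xy} * (f(y) - f(x)).
    if x == 0:
        return lam * 1          # only an upward jump from 0
    return lam * 1 + mu * (-1)  # net drift of f(x) = x under the rates

eps = mu - lam
assert all(Gamma_f(x) <= -eps for x in range(1, 1000))
print("Foster condition holds outside F = {0} with eps =", eps)
```

The point of the sketch is only that the Lyapunov condition is locally verifiable: it involves the rates at a single state x, not any global quantity of the chain.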

Remark 7.2.4. It is obvious that the triple (ε, F, f) in Theorem 7.2.3 is not uniquely determined. Mostly, it will be possible to choose a function f → ∞ and F as the sublevel set of f at height a, for some a > 0. Sometimes it will be possible to choose the function f uniformly bounded; this case will be further considered in Theorem 7.2.11 and leads to implosion. It is also immediate that if f verifies the condition Γf(x) ≤ −ε for x ∈ F^c, then the modified function f + c, where c is an arbitrary positive constant, also verifies the same condition. Further, if a function f verifies this condition, the function g = f 1{· ∈ F^c} verifies a fortiori the same condition.

If γ_x is bounded away from 0 and ∞, then, since the Markov chain can be stochastically controlled by two Markov chains with constant rates, respectively γ_x = γ and γ_x = γ̄ for all x, the previous result is the straightforward generalisation of Theorems 1 and 2 of [9], established there in the case of discrete time; as a matter of fact, in the case of constant γ_x, the complete behaviour of the continuous-time process is encoded solely in the jump chain, and since the results in [9] were optimal, the present theorem introduces no improvement. Only the interesting cases sup_{x∈X} γ_x = ∞ or inf_{x∈X} γ_x = 0 are studied in the sequel; the models studied in Section 7.4 illustrate how the theorem can be used in critical cases to obtain locally verifiable conditions for existence/non-existence of moments of reaching times. The process X_t = f(ξ_t), the image of the Markov chain through the Lyapunov function f, can be shown to be a semimartingale; therefore, the semimartingale approach will prove instrumental, as was the case for discrete-time chains.

7.2.2 Explosion

The next results concern explosion, obtained again using Lyapunov functions. It is worth noting that although explosion can only occur in the transient case, the next result is strongly reminiscent of Foster's criterion (Theorem 2.6.4) for positive recurrence!

Theorem 7.2.5. The following are equivalent:

(i) There exist a strictly positive f ∈ Dom+(Γ) and ε > 0 such that Γf(x) ≤ −ε for all x ∈ X.

(ii) The explosion time ζ satisfies E_x ζ < +∞ for all x ∈ X.

Remark 7.2.6. A comparison of statement (i) of Theorem 7.2.1 with statement (i) of Theorem 7.2.5 calls for some comment. The conditions of Theorem 7.2.1 imply that S_a(f) is a finite, and necessarily non-empty, set. For p = 2 and F = f^p, the condition reads ΓF(x) ≤ −ε outside some finite set, and this implies recurrence. In Theorem 7.2.5 the condition Γf ≤ −ε must hold everywhere, and this implies transience.

The one-sided implication (Γf(x) ≤ −ε for all x) ⇒ (P_x[ζ < +∞] = 1 for all x) is already established, for f ≥ 0, in the second part of Theorem 4.3.6 of [272]. Here, modulo the (seemingly) slightly more stringent requirement f > 0, and relying on the powerful ideas developed within the proof of the aforementioned theorem of [272], we strengthen the result from almost sure finiteness to integrability and prove equivalence instead of mere implication.

Proposition 7.2.7. Let f ∈ Dom+(Γ) be a strictly positive bounded function and denote b = sup_{x∈X} f(x); assume there exists an increasing (not necessarily strictly) function g : R_+ \ {0} → R_+ \ {0} such that its inverse has an integrable singularity at 0, i.e.

∫_0^b dy/g(y) < ∞.

If we have Γf(x) ≤ −g(f(x)) for all x ∈ X, then E_x ζ < ∞ for all x.

The previous proposition, although stating the conditions on Γf quite differently than Theorem 7.2.5 does, will be shown to follow from that theorem. This proposition is interesting only when inf_{x∈X} g(f(x)) = 0, because then the condition required in Proposition 7.2.7 is weaker than the uniform requirement Γf(x) ≤ −ε for all x of Theorem 7.2.5.
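A sketch of an explosive example (ours, not from the text): the pure birth chain on N with rates γ_n = 2^n has E_n ζ = ∑_{k≥n} 2^{−k} = 2^{1−n}, and the bounded, strictly positive function f(n) = 2^{1−n} satisfies Γf(n) = γ_n (f(n+1) − f(n)) = −1, i.e. the condition of Theorem 7.2.5(i) with ε = 1:

```python
import random

rates = lambda n: 2.0 ** n    # pure birth chain: n -> n+1 at rate 2^n
f = lambda n: 2.0 ** (1 - n)  # candidate Lyapunov function, f(n) = E_n[zeta]

# Supermartingale condition: Gamma f(n) = gamma_n * (f(n+1) - f(n)) = -1.
assert all(abs(rates(n) * (f(n + 1) - f(n)) + 1.0) < 1e-9 for n in range(50))

# Monte Carlo estimate of E_0[zeta] = sum_n 2^{-n} = 2 (levels truncated
# at 200, which changes the sum by a negligible ~2^{-199}).
rng = random.Random(0)
est = sum(sum(rng.expovariate(rates(n)) for n in range(200))
          for _ in range(4000)) / 4000
print(est)  # close to 2
```

The simulation never "runs forever" despite explosion, precisely because the total time spent over all levels is almost surely finite.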

If explosion occurs for some x ∈ X (i.e. P_x[ζ < +∞] > 0), irreducibility of the chain implies that the process remains explosive for all starting points x ∈ X. However, since the phenomenon of explosion can only occur in the transient case, examples (see Section 7.4.4) can be constructed with non-trivial tail boundary, so that for some initial x ∈ X we have 0 < P_x[ζ < +∞] < 1. Additionally, the previous theorems established conditions that guarantee E_x ζ < ∞ (implying explosion). However, examples can be constructed where P_x[ζ = ∞] = 0 (so that explosion occurs almost surely) while E_x ζ = ∞. It is therefore important to have results on conditional explosion.

Theorem 7.2.8. Suppose that there exists a triple (ε, A, f) with ε > 0, A a proper (finite or infinite) subset of X such that X \ A is infinite, and f ∈ Dom+(Γ), such that

• there exists x_0 ∉ A with f(x_0) < inf_{x∈A} f(x),

• Γf(x) ≤ −ε on A^c.

Then E_{x_0}[ζ | τ_A = ∞] < ∞.

Remark 7.2.9. It can be shown that the conditioning event appearing in Theorem 7.2.8 is not negligible, hence the statement about the conditional expectation is not trivial. Additionally, if A is finite, then E_x ζ < ∞. We therefore have a much more practical criterion for explosion than the one provided by Theorem 7.2.5; as a matter of fact, except for a small number of elementary examples, it is almost impossible to find a function f that transforms the Markov chain into a strict supermartingale everywhere, as was required in Theorem 7.2.5.

The previous results (Theorems 7.2.5 and 7.2.8), through unconditional or conditional integrability of the explosion time ζ, give conditions establishing explosion. For Theorem 7.2.5 these conditions are even necessary and sufficient. It is nevertheless extremely difficult in general to prove that a function satisfying the conditions of the theorems does not exist. We therefore need a more manageable criterion guaranteeing non-explosion. Such a result is provided by the following

Theorem 7.2.10. Let f ∈ Dom+(Γ). If

• f → ∞,

• there exists an increasing (not necessarily strictly) function g : R_+ → R_+ whose inverse is locally integrable but has non-integrable tail, i.e. G(z) := ∫_0^z dy/g(y) < +∞ for all z ∈ R_+ but lim_{z→∞} G(z) = ∞, and

• Γf(x) ≤ g(f(x)) for all x ∈ X,

then P_x[ζ = +∞] = 1 for all x ∈ X.
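In the opposite direction (again our own example): the pure birth chain with linear rates γ_n = n + 1 satisfies the conditions of Theorem 7.2.10 with f(n) = n and g(y) = y + 1, since Γf(n) = n + 1 ≤ g(f(n)) and G(z) = log(1 + z) is finite for every z yet unbounded; hence this chain does not explode:

```python
import math

gamma = lambda n: n + 1.0  # linear birth rates: n -> n+1 at rate n+1
f = lambda n: float(n)     # Lyapunov function tending to infinity
g = lambda y: y + 1.0      # increasing bound on the drift

# Drift condition: Gamma f(n) = gamma_n * (f(n+1) - f(n)) = n+1 <= g(f(n)).
assert all(gamma(n) * (f(n + 1) - f(n)) <= g(f(n)) for n in range(10_000))

# G(z) = int_0^z dy / (y+1) = log(1+z): finite for each z, unbounded in z.
G = lambda z: math.log1p(z)
print(G(10.0) < math.inf and G(1e12) > 27)  # True
```

Contrast this with the rates 2^n of the explosive example: there ∑ 1/γ_n converges, here it diverges, and the integral test on 1/g captures exactly that dichotomy.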

7.2.3 Implosion

Finally, we state results about implosion.

Theorem 7.2.11. Suppose that the chain is recurrent.

(i) The following are equivalent:

(i1) There exists a triple (ε, F, f) with ε > 0, F a finite set and f ∈ Dom+(Γ) such that sup_{x∈X} f(x) = b < ∞ and

x ∉ F ⇒ Γf(x) ≤ −ε.

(i2) For every finite A ⊂ X, there exists a constant C := C_A ∈ (0,∞) such that

x ∉ A ⇒ E_x τ_A ≤ C

(hence there is implosion towards A, and subsequently the chain is implosive).

(ii) Let f ∈ Dom+(Γ) be such that f → ∞ and assume there exist constants a > 0, c > 0, ε > 0, and r > 1 such that f^r ∈ Dom+(Γ). If further

(ii1) Γf(x) ≥ −ε for all x ∉ S_a(f), and

(ii2) Γf^r(x) ≤ c f^{r−1}(x) for all x ∉ S_a(f),

then the chain does not implode towards S_a(f).


Proposition 7.2.12. Suppose that the chain is implosive. Then there exists α > 0 such that for every finite set F and every x ∈ F^c, we have E_x[exp(α τ_F)] < ∞.

Consequently, implosion implies the existence of all moments, and even of exponential moments, of the passage time.

We conclude this section with the following

Proposition 7.2.13. Suppose that the chain is recurrent. Let f ∈ Dom+(Γ) be strictly positive and such that sup_{x∈X} f(x) = b < ∞; assume further that for any a with 0 < a < b, the sublevel set S_a(f) is finite. Let g : [0, b] → R_+ be an increasing function such that

B := ∫_0^b dy/g(y) < ∞.

If Γf(x) ≤ −g(f(x)) for all x ∉ S_a(f), then E_x τ_{S_a(f)} ≤ B for all x ∉ S_a(f), i.e. the chain implodes towards S_a(f).

In some applications, it is quite difficult to immediately guess the form of a function f satisfying the uniform condition Γf(x) ≤ −ε required for the first statement of Theorem 7.2.11 to apply. It is sometimes more convenient to check merely that Γf(x) ≤ −g(f(x)) for some function g vanishing at 0 in some controlled way. Proposition 7.2.13, although it does not improve the already optimal statement (i) of Theorem 7.2.11, provides us with a convenient alternative condition to check.
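To make the implosion phenomenon concrete (our own example, in the spirit of Proposition 7.2.13): the pure death chain on N in which state n ≥ 1 jumps to n − 1 at rate n² reaches 0 in expected time E_n τ_0 = ∑_{k=1}^n k^{−2} ≤ π²/6, a bound uniform in the starting point, so the chain implodes towards {0}:

```python
import math, random

# Pure death chain: n -> n-1 at rate n^2, so E_n[tau_0] = sum_{k<=n} 1/k^2.
exact = lambda n: sum(1.0 / k**2 for k in range(1, n + 1))
assert all(exact(n) <= math.pi**2 / 6 for n in (1, 10, 100, 10_000))

# Monte Carlo check from a far-away start: the bound is uniform in n.
rng = random.Random(0)
def tau0(n):
    # hitting time of 0: sum of independent Exp(k^2) holding times
    return sum(rng.expovariate(k**2) for k in range(n, 0, -1))
est = sum(tau0(10_000) for _ in range(500)) / 500
print(est)  # close to pi^2/6, even though the chain starts 10,000 steps away
```

The speedup of the clock at high states (γ_n = n² → ∞) is exactly the "arbitrarily fast internal clock" regime singled out in Remark 7.1.1, and it is what makes the hitting time uniformly bounded.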

7.3 Proof of the main theorems

We have already introduced the notion of Dom(Γ). A related notion is that of locally p-integrable functions, defined, for p > 0, as

ℓ^p(Γ) = {f ∈ mX : ∑_{y∈X} Γ_{xy}|f(y) − f(x)|^p < +∞ for all x ∈ X}.

Obviously ℓ^1(Γ) = Dom(Γ). In accordance with our notational convention on decorations, ℓ^p_+(Γ) will denote non-negative functions in ℓ^p(Γ). For f ∈ ℓ^1(Γ), we define the local f-drift of the embedded Markov chain as the random variable

Δf_{n+1} := Δf(ξ_{n+1}) := f(ξ_{n+1}) − f(ξ_n),

and the local mean f-drift by

m_f(x) := E[Δf_{n+1} | ξ_n = x] = ∑_{y∈X} P_{xy}(f(y) − f(x)) = E_x Δf_1,


and, for ρ ≥ 1 and f ∈ ℓ^ρ(Γ), the ρ-moment of the local f-drift by

v_f^{(ρ)}(x) := E[|Δf_{n+1}|^ρ | ξ_n = x] = ∑_{y∈X} P_{xy}|f(y) − f(x)|^ρ = E_x |Δf_1|^ρ.

We always write shortly v_f(x) := v_f^{(2)}(x). The action of the generator Γ on f reads

Γf(x) := ∑_{y∈X} Γ_{xy}f(y) = γ_x m_f(x).
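The identity Γf(x) = γ_x m_f(x) is easy to check numerically on a small chain (recall Γ_{xx} = −γ_x). The three-state rate matrix below is our own toy data:

```python
# Toy 3-state chain: off-diagonal rates Gamma[x][y]; Gamma[x][x] = -gamma_x.
rates = {0: {1: 2.0, 2: 1.0}, 1: {0: 0.5, 2: 0.5}, 2: {0: 3.0, 1: 1.0}}
f = {0: 1.0, 1: 4.0, 2: 9.0}  # an arbitrary test function

for x, row in rates.items():
    gamma_x = sum(row.values())                              # rate of leaving x
    P = {y: r / gamma_x for y, r in row.items()}             # jump-chain kernel
    m_f = sum(P[y] * (f[y] - f[x]) for y in P)               # local mean f-drift
    Gf = sum(r * f[y] for y, r in row.items()) - gamma_x * f[x]  # sum_y Gamma_xy f(y)
    assert abs(Gf - gamma_x * m_f) < 1e-12
print("Gamma f(x) == gamma_x * m_f(x) on all states")
```

The check makes the role of γ_x transparent: the generator drift is the embedded-chain drift rescaled by the local clock speed.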

Since (ξ_t)_t is a pure jump process, the process (X_t)_t transformed by f ∈ Dom(Γ), i.e. X_t = f(ξ_t), is also a pure jump process, reading, for t < ζ,

X_t = f(ξ_t) = ∑_{k=0}^∞ f(ξ_k) 1{t ∈ [J_k, J_{k+1})} = X_0 + ∑_{k=0}^∞ Δf_{k+1} 1{J_{k+1} ∈ (0, t]}.

If there is no explosion, the process (X_t) is an (F_t)-semimartingale admitting the decomposition X_t = X_0 + M_t + A_t, where M_t is a martingale vanishing at 0 and the predictable compensator reads [158]

A_t = ∫_{(0,t]} Γf(ξ_{s−}) ds = ∫_{[0,t)} Γf(ξ_s) ds.

Notice that, although not explicitly marked, (X_t), (M_t), and (A_t) depend on f. We also use in the sequel the infinitesimal form of the above decomposition. For any admissible f we have dX_t = dM_t + dA_t = dM_t + Γf(ξ_{t−}) dt; in particular, since (M_t) is an (F_t)-martingale, E[dX_t | F_{t−}] = E[dA_t | F_{t−}] = Γf(ξ_{t−}) dt represents the conditional increment of X as an ordinary differential multiplied by a previsible random factor.

7.3.1 Proof of Theorems 7.2.1 and 7.2.3 on moments of passage times

We start with Theorem 7.2.3.

Lemma 7.3.1. Let (Ω, G, (G_t), P) be a filtered probability space and (Y_t) a (G_t)-adapted process taking values in [0,∞). Let c ≥ 0 and denote T = inf{t ≥ 0 : Y_t ≤ c}. Suppose that

(i) for some y ∈ (c,∞), we have P_y[T < ∞] = 1;

(ii) there exists ε > 0 such that E[dY_t | G_{t−}] ≤ −ε dt on the event {T ≥ t}.


Then E_y T ≤ y/ε.

Proof. Obviously E_y T = E_y[T 1{T = ∞}] + E_y[T 1{T < ∞}], and if P_y[T = ∞] > 0 then of course E_y T = ∞. But this possibility is excluded by the hypothesis of the lemma. It remains thus to study what happens when T < ∞ almost surely. Since {T ≥ t} ∈ G_{t−}, the hypothesis of the lemma reads E[dY_{t∧T} | G_{t−}] ≤ −ε 1{T ≥ t} dt. Taking expectations and integrating over time yields, for every t ∈ R_+,

0 ≤ E_y Y_{t∧T} ≤ y − ε ∫_0^t P_y[T ≥ s] ds,

which implies ε E_y[T ∧ t] ≤ y. Now, since the event {T < ∞} has probability 1, taking the limit t → ∞ in the previous formula yields E_y T ≤ y/ε, by Fatou's lemma.
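A quick numeric sanity check of Lemma 7.3.1 (our own example): for the chain on N jumping n → n − 1 at rate 1, the process Y_t given by the current state satisfies E[dY_t | G_{t−}] = −dt, so ε = 1 and the lemma gives E_y T ≤ y; here T is a sum of y independent Exp(1) holding times, so in fact E_y T = y exactly:

```python
import random

rng = random.Random(0)
y = 20  # start the rate-1 pure death chain at state y; T = hitting time of 0
# T is the sum of y independent Exp(1) holding times.
samples = [sum(rng.expovariate(1.0) for _ in range(y)) for _ in range(20_000)]
est = sum(samples) / len(samples)
print(est)  # close to y = 20, saturating the bound E_y[T] <= y / eps with eps = 1
```

The example also shows the bound of the lemma is sharp: any strict improvement would have to use more than the drift alone.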

Proof of Theorem 7.2.3. First, we prove that (ii) implies (i). Without loss of generality, using Remark 7.2.4, possibly after modifying f, we can always assume that inf_{x∈F^c} f(x) > 0 and f(x) = 0 for x ∈ F. Choose then an arbitrary c ∈ (0, inf_{x∈F^c} f(x)). Let X_t = f(ξ_t) and T = inf{t ≥ 0 : X_t ≤ c}. Obviously T = τ_F := inf{t ≥ 0 : ξ_t ∈ F}, and the assumed recurrence of the chain guarantees that P_x[T < ∞] = 1. The condition in (ii) then reads E[dX_{t∧T} | F_{t−}] ≤ −ε 1{T ≥ t} dt. Hence, in accordance with Lemma 7.3.1, we get E_x τ_F = E_x T ≤ f(x)/ε for every x ∉ F. Now, let x ∈ F. Then

E_x τ_F = E_x[τ_F | ξ_1 ∈ F] P_x[ξ_1 ∈ F] + E_x[τ_F | ξ_1 ∉ F] P_x[ξ_1 ∉ F]
≤ sup_{x∈F} (1/γ_x + ∑_{y∈X} P_{xy} f(y)/ε) < ∞,

the finiteness of the last expression being guaranteed by the conditions f ∈ Dom+(Γ) and the finiteness of the set F.

Now, let us prove that (i) implies (ii). Let F = {z} for some fixed z ∈ X; positive recurrence of the chain implies that E_x τ_F < ∞ for all x ∈ X. Define

f(x) = E_x τ_F if x ∉ F, and f(x) = 0 otherwise.

Then m_f(x) = ∑_{y≠z} P_{xy} E_y τ_F − E_x τ_F = E_x[τ_F − σ_1] − E_x τ_F = −E_x σ_1 = −1/γ_x for all x ∉ F. It follows that Γf(x) = γ_x m_f(x) = −1 outside F. By adding the constant 1 to the function f determined above (see Remark 7.2.4), we see that f meets all the requirements.

The proof of Theorem 7.2.1 is quite technical and will be split into several steps, formulated as independent lemmata and propositions on semimartingales that may be of interest in their own right. As a matter of fact, we use these intermediate results to prove various results of very different nature.


Lemma 7.3.2. Let f ∈ Dom+(Γ) tend to infinity, let p ≥ 2 and a > 0, and use the abbreviation A := S_a(f). Denote X_t = f(ξ_t) and assume further that f^p ∈ Dom+(Γ). If there exists c > 0 such that

Γf^p(x) ≤ −c f^{p−2}(x), for all x ∉ A,

then the process defined by Z_t = (X_{τ_A∧t}^2 + (c/(p/2)) (τ_A ∧ t))^{p/2} is a non-negative supermartingale.

Proof. Introducing the predictable decomposition 1 = 1{τ_A < t} + 1{τ_A ≥ t}, we get

E[dZ_t | F_{t−}] = E[d(X_t^2 + (c/(p/2)) t)^{p/2} | F_{t−}] 1{τ_A ≥ t}.

Now, (X_t) is a pure jump process; hence, by applying Itô's formula (reading, for any g ∈ C^2 and (S_t) a semimartingale, dg(S_t) = g′(S_{t−}) dS_t^c + Δg(S_t), where (S_t^c) denotes the continuous component of (S_t)), we get

d(X_t^2 + (c/(p/2)) t)^{p/2} = c (X_{t−}^2 + (c/(p/2)) t)^{p/2−1} dt + (X_t^2 + (c/(p/2)) t)^{p/2} − (X_{t−}^2 + (c/(p/2)) t)^{p/2}.

Writing the semimartingale decomposition for the process (X_t^p), we remark that the hypothesis of the lemma implies that

E[dX_t^p | F_{t−}] = Γf^p(ξ_{t−}) dt ≤ −c X_{t−}^{p−2} dt on the event {τ_A ≥ t}.

Applying the conditional Minkowski inequality and the supermartingale hypothesis, we get, on the set {τ_A ≥ t},

E[(X_t^2 + (c/(p/2)) t)^{p/2} | F_{t−}]
≤ ((E[X_t^p | F_{t−}])^{2/p} + (c/(p/2)) t)^{p/2}
= ((X_{t−}^p + E[dX_t^p | F_{t−}])^{2/p} + (c/(p/2)) t)^{p/2}
≤ ((X_{t−}^p − c X_{t−}^{p−2} dt)^{2/p} + (c/(p/2)) t)^{p/2}
= (X_{t−}^2 (1 − (c/X_{t−}^2) dt)^{2/p} + (c/(p/2)) t)^{p/2}
≤ (X_{t−}^2 (1 − (c/(p/2)) (1/X_{t−}^2) dt) + (c/(p/2)) t)^{p/2}
= (X_{t−}^2 − (c/(p/2)) dt + (c/(p/2)) t)^{p/2}.

Hence, on the event {τ_A ≥ t}, we have the estimate

E[dZ_t | F_{t−}] ≤ c (X_{t−}^2 + (c/(p/2)) t)^{p/2−1} dt + (X_{t−}^2 + (c/(p/2)) t − (c/(p/2)) dt)^{p/2} − (X_{t−}^2 + (c/(p/2)) t)^{p/2}.


Simple expansion of the remaining differential forms (containing now only F_{t−}-measurable random variables) yields E[dZ_t | F_{t−}] ≤ 0.

Corollary 7.3.3. Let f ∈ Dom+(Γ) tend to infinity, let p ≥ 2 and a > 0, and use the abbreviation A := S_a(f). Denote X_t = f(ξ_t) and assume further that f^p ∈ Dom+(Γ). If there exists c > 0 such that

Γf^p(x) ≤ −c f^{p−2}(x), for all x ∉ A,

then there exists c′ > 0 such that

E_x τ_A^q ≤ c′ f(x)^{2q} for all q ≤ p/2 and all x ∈ X.

Proof. Without loss of generality we can assume that x ∈ A^c, since otherwise the corollary holds trivially. Denoting Y_t = X_{t∧τ_A}^2 + (c/(p/2)) (t ∧ τ_A), we observe that Z_t = Y_t^{p/2} is a non-negative supermartingale by virtue of Lemma 7.3.2. Since the function R_+ ∋ w ↦ w^{2q/p} ∈ R_+ is increasing and concave for q ≤ p/2, it follows that Y_t^q is also a supermartingale. Hence,

(c/(p/2)) E_x[(t ∧ τ_A)^q] ≤ E_x Y_t^q ≤ E_x Y_0^q = f(x)^{2q}.

We conclude by the dominated convergence theorem, on identifying c′ = p/(2c).

Proposition 7.3.4. Let f ∈ Dom+(Γ) tend to infinity, let 0 < p ≤ 2 and a > 0, and use the abbreviation A := S_a(f). Denote X_t = f(ξ_t) and assume further that f^p ∈ Dom+(Γ). If there exists c > 0 such that

Γf^p(x) ≤ −c f^{p−2}(x), for all x ∉ A,

then the process defined by Z_t = X_{τ_A∧t}^p + (c/q)(τ_A ∧ t)^q satisfies

E_x Z_t ≤ c″ f(x)^p, for all q ∈ (0, p/2].

Proof. Since d(X_t^p + (c/q) t^q) = dX_t^p + c t^{q−1} dt, we have

E[dZ_t | F_{t−}] = E[dZ_t | F_{t−}] 1{τ_A ≥ t} ≤ c dt 1{τ_A ≥ t}(−X_{t−}^{p−2} + t^{q−1}).

Now, q/p ≤ 1/2 ≤ (1−q)/(2−p). Choosing v ∈ (q/p, (1−q)/(2−p)), we write

E[dZ_t | F_{t−}] ≤ c dt 1{τ_A ≥ t}(−X_{t−}^{p−2} + t^{q−1}) 1{X_{t−} ∈ (a, t^v]}
+ c dt 1{τ_A ≥ t}(−X_{t−}^{p−2} + t^{q−1}) 1{X_{t−} ∈ (t^v, +∞)}.


For X_{t−} ≤ t^v, the first term on the right-hand side of the previous inequality is non-positive; as for the second, the condition X_{t−} > t^v implies that −X_{t−}^{p−2} + t^{q−1} ≤ t^{q−1}. Combining the latter with Markov's inequality guarantees that, on the set {X_{t−} > t^v}, we have

E_x[dZ_t] ≤ c dt 1{τ_A ≥ t} (f(x)^p / t^{vp}) t^{q−1}.

Integrating this differential inequality yields E_x Z_t ≤ c f^p(x) ∫_{a^{1/v}}^∞ dt / t^{vp+1−q}; the condition v > q/p ensures the finiteness of the last integral, thus proving the proposition with c″ = c ∫_{a^{1/v}}^∞ dt / t^{vp+1−q}.

Corollary 7.3.5. Under the same conditions as in Proposition 7.3.4, there exists c‴ > 0 such that E_x τ_A^q ≤ c‴ f(x)^p for all q ∈ (0, p/2].

Proof. Since X_t is non-negative, (q/c) Z_t ≥ (t ∧ τ_A)^q. By the previous proposition, E_x[(t ∧ τ_A)^q] ≤ (q/c) E_x Z_t ≤ c″ (q/c) f(x)^p. We conclude by the dominated convergence theorem, on identifying c‴ = c″ q/c.

Remark 7.3.6. All the propositions, lemmata, and corollaries shown so far allow us to prove statement (i) of Theorem 7.2.1. The subsequent propositions are needed for statement (ii) of this theorem. Notice also that the following Proposition 7.3.7 is very important and tricky. It provides us with a generalisation of Theorem 3.1 of Lamperti [175] and serves twice in this chapter: first to establish conditions for some moments of the passage time to be infinite (statement (ii) of Theorem 7.2.1), and once more in a very different context, namely for finding conditions for the chain not to implode (statement (ii) of Theorem 7.2.11).

Proposition 7.3.7. Let (Ω, G, (G_t)_t, P) be a filtered probability space and (Y_t) a (G_t)-adapted process taking values in an unbounded subset of R_+. Let a > 0 and T_a = inf{t ≥ 0 : Y_t ≤ a}. Suppose that there exist constants c_1 > 0 and c_2 > 0 such that

(i) E[dY_t | G_{t−}] ≥ −c_1 dt on the event {T_a > t}, and

(ii) there exists r > 1 such that E[dY_t^r | G_{t−}] ≤ c_2 Y_{t−}^{r−1} dt on the event {T_a > t}.

Then, for all α ∈ (0, 1), there exist ε > 0 and δ > 0 such that, for all t > 0,

P[T_a > t + ε Y_{t∧T_a} | G_t] ≥ 1 − α, on the event {T_a > t; Y_t > a(1 + δ)}.

Remark 7.3.8. The meaning of Proposition 7.3.7 is the following. If the process (Y_t) has average increments bounded from below by a constant −c_1, it is intuitively appealing to suppose that the average time of reaching 0 is of the same order of magnitude as Y_0. However, this intuition proves false if the increments are wildly unbounded, since then 0 can be reached in one step. Condition (ii), by imposing some control on the r-moments of the increments with r > 1, prevents this from occurring. It is in fact established that if T_a > t, the remaining time T_a − t to reach A_a := [0, a] exceeds ε Y_t with probability 1 − α; more precisely, for every α we can choose ε such that P[T_a − t > ε Y_t | G_t] ≥ 1 − α.

Proof of Proposition 7.3.7. Let σ = (T_a − t) 1{T_a ≥ t}; then for all s > 0 we have {σ < s} = {T_a ≥ t} ∩ {T_a < t + s} ∈ G_{t+s−}. To prove the proposition, it is enough to establish P[σ > ε Y_t | G_t] ≥ 1 − α on the set {σ > 0; Y_t > a(1 + δ)}. On this latter set, P[σ > ε Y_t | G_t] = P[Y_{t+(εY_t)∧σ} > a | G_t], because once the process Y_{t+(εY_t)∧σ} enters A_a it remains there forever, due to the stopping by σ. On defining U := Y_{t+(εY_t)∧σ}, one has

E[U | G_t] = E[U 1{U ≤ a} | G_t] + E[U 1{U > a} | G_t] ≤ a + (E[U^r | G_t])^{1/r} (P[U > a | G_t])^{1−1/r};

therefore

P[U > a | G_t] ≥ ((E[U | G_t] − a)_+ / (E[U^r | G_t])^{1/r})^{r/(r−1)}.

To minorise the numerator, we observe that

E[U | G_t] = E[Y_{t+εY_t} − Y_t | G_t] + Y_t = ∫_t^{t+εY_t} E[dY_s | G_t] + Y_t ≥ −c_1 ε Y_t + Y_t.

To majorise the denominator E[U^r | G_t] = E[Y_{t+(εY_t)∧σ}^r | G_t], we must be able to majorise E[Y_{t+s∧σ}^r | G_t] for arbitrary s > 0. Let t > 0 be arbitrary and S be a G_t-optional random variable, S > 0. For c_3 = c_2/r and any s ∈ (0, S], define

F_S(s) = E[(Y_{t+s∧σ} + c_3 S − c_3 (s ∧ σ))^r | G_t].

We shall show that F_S(s) ≤ F_S(s−) for all s ∈ (0, S]. It is enough to show this inequality on {σ > s}, since otherwise F_S(s) = F_S(s−) and there is nothing to prove. To show that F_S is decreasing on (0, S], it is enough to show that E[dΞ_s | G_{t+s−}] ≤ 0 for all s ∈ (0, S], where Ξ_s = (Y_{t+s∧σ} + c_3 S − c_3 (s ∧ σ))^r. Now, on {σ > s}, by use of Itô's formula, we get

dΞ_s = −r c_3 (Y_{t+s−} + c_3 S − c_3 s)^{r−1} ds + (Y_{t+s} + c_3 S − c_3 s)^r − (Y_{t+s−} + c_3 S − c_3 s)^r.

Moreover, using the Minkowski inequality, we get

E[(Y_{t+s} + c_3 S − c_3 s)^r | G_{t+s−}] ≤ ((E[Y_{t+s}^r | G_{t+s−}])^{1/r} + c_3 S − c_3 s)^r,


and by use of the hypothesis

E[Y_t^r | G_{t−}] ≤ Y_{t−}^r + c_2 Y_{t−}^{r−1} 1{T_a > t} dt = Y_{t−}^r (1 + (c_2/Y_{t−}) 1{T_a > t} dt) ≤ (Y_{t−} + c_3 1{T_a > t} dt)^r.

Therefore, E[dΞ_s | G_{t+s−}] ≤ 0 for all s ∈ (0, S]. Subsequently, for all S > 0,

E[Y_{t+S∧σ}^r | G_t] ≤ F_S(S) ≤ lim_{s→0+} F_S(s) = (Y_t + c_3 S)^r.

Since S is an arbitrary G_t-optional random variable, on choosing S = ε Y_t we finally get

(E[Y_{t+(εY_t)∧σ}^r | G_t])^{1/r} ≤ Y_t + c_3 ε Y_t.

Substituting yields that, for any α ∈ (0, 1), the parameters ε > 0 and δ > 0 can be chosen so that the following inequality holds:

P[U > a | G_t] ≥ (((1 − c_1 ε) − a/Y_t)_+ / (1 + c_3 ε))^{r/(r−1)} ≥ ((1 − c_1 ε − 1/(1 + δ))_+)^{r/(r−1)} (1 + c_3 ε)^{−r/(r−1)} ≥ 1 − α.

Lemma 7.3.9. Let (Ω, G, (G_t), P) be a filtered probability space, and (Y_t) and (Z_t) two R_+-valued, (G_t)-adapted processes on it. For a > 0, we denote S_a = inf{t ≥ 0 : Y_t ≤ a} and T_a = inf{t ≥ 0 : Z_t ≤ a}. Suppose that there exist positive constants a, b, p, K_1 such that

• Y_t ≤ b Z_t almost surely, for all t, and

• E[Z_{t∧T_a}^p] ≤ K_1.

Then there exists K_2 > 0 such that E[Y_{t∧S_{ab}}^p] ≤ K_2.

Proof. For arbitrary s > 0, the condition Z_s < a implies Y_s ≤ b Z_s < ab almost surely. Hence {S_{ab} ≥ t} ⊆ {T_a ≥ t}. Then

E[Y_{t∧S_{ab}}^p] = E[Y_{t∧S_{ab}}^p (1{S_{ab} ≥ t} + 1{S_{ab} < t})] ≤ E[Y_t^p 1{S_{ab} ≥ t}] + (ab)^p ≤ b^p K_1 + (ab)^p =: K_2.


Proposition 7.3.10. Let (Ω, G, (G_t), P) be a filtered probability space, and (Y_t) and (Z_t) two R_+-valued, (G_t)-adapted processes on it. For a > 0, we denote S_a = inf{t ≥ 0 : Y_t ≤ a} and T_a = inf{t ≥ 0 : Z_t ≤ a}. Suppose that

(i) there exist positive constants a, c_1, c_2, r such that

• E[dZ_t | G_{t−}] ≥ −c_1 dt on the event {T_a ≥ t},

• E[dZ_t^r | G_{t−}] ≤ c_2 Z_{t−}^{r−1} dt on the event {T_a ≥ t},

• Z_0 = z > a;

(ii) Y_0 = y and there exists a constant b > 0 such that

• ab < y < bz, and

• Y_t ≤ b Z_t almost surely for all t.

If for some p > 0 the process (Y_{t∧S_{ab}}^p) is a submartingale, then E T_a^q = ∞ for all q > p.

Proof. We can, without loss of generality, examine solely the case P[S_{ab} < ∞] = 1, since otherwise E T_a^q = ∞ holds trivially for all q > 0. Assume then that for some q > p we have E T_a^q < ∞. Hypothesis (i) allows applying Proposition 7.3.7; for α = 1/2, we can thus determine positive constants ε and δ such that P[T_a > t + ε Z_{t∧T_a} | G_t] ≥ 1/2 on the event {Z_{t∧T_a} > a(1 + δ)}. Hence

E T_a^q ≥ E[T_a^q 1{Z_{t∧T_a} > a(1 + δ)}] ≥ (1/2) E[(t + ε Z_{t∧T_a})^q 1{Z_{t∧T_a} > a(1 + δ)}] ≥ (ε^q/2) E[Z_{t∧T_a}^q] − (ε^q/2) a^q (1 + δ)^q.

Now, finiteness of the q-th moment of T_a implies the existence of a constant K_1 > 0 such that E[Z_{t∧T_a}^q] ≤ K_1. From the previous Lemma 7.3.9 it follows that there exists some K_2 > 0 such that E[Y_{t∧S_{ab}}^q] ≤ K_2. Since the time S_{ab} is assumed almost surely finite, lim_{t→∞} E[Y_{t∧S_{ab}}^q] = E[Y_{S_{ab}}^q] ≤ (ab)^q. On the other hand, (Y_{t∧S_{ab}}^p) is a submartingale, hence so, a fortiori, is (Y_{t∧S_{ab}}^q). Therefore E[Y_{t∧S_{ab}}^q] ≥ E Y_0^q = y^q, leading to a contradiction since y > ab.

Corollary 7.3.11. Let (Ω, G, (G_t), P) be a filtered probability space and (X_t) an R_+-valued, (G_t)-adapted process on it. For a > 0, we denote T_a = inf{t ≥ 0 : X_t ≤ a}. Suppose that there exist positive constants a, c_1, c_2, p, r such that

• X_0 = x > a,

• E[dX_t | G_{t−}] ≥ −c_1 dt on the event {T_a ≥ t},

• E[dX_t^r | G_{t−}] ≤ c_2 X_{t−}^{r−1} dt on the event {T_a ≥ t},

• (X_{t∧T_a}^p) is a submartingale.

Then E T_a^q = ∞ for all q > p.

After all this preparatory work, the proof of Theorem 7.2.1 is now immediate.

Proof of Theorem 7.2.1. Write X_t = f(ξ_t) and use the abbreviation A := S_a(f); notice moreover that τ_A = T_a. Since f → ∞, the set A is finite. The integrability of the passage time follows from Corollaries 7.3.3 and 7.3.5.

Then, on identifying Z_t and Y_t in Proposition 7.3.10 with g(ξ_t) and f(ξ_t) respectively, we see that the conditions of the theorem imply the hypotheses of the proposition. The non-existence of moments follows immediately.

7.3.2 Proof of theorems on explosion and implosion

As stated in the introduction, the result concerning integrability of the explosion time is reminiscent of Foster's criterion for positive recurrence! The reason lies in Lemma 7.3.1.

Since we wish to treat explosion, we work with a Markov chain evolving on the augmented state space X̂ = X ∪ {∂} and with augmented functions f̂, as explained in the introduction. Note that if f > 0 then the restriction of f̂ to X equals f > 0, while f̂(∂) = 0. We also naturally extend X_t = f̂(ξ̂_t) (and, stricto sensu, also the generator Γ̂). We use the “hat” notation ·̂ to denote quantities referring to the extended chain.

Proposition 7.3.12. Let f̂ : X̂ → [0,∞) belong to Dom+(Γ̂) and verify f̂(x) > 0 for all x ∈ X (it vanishes only at ∂). We denote by f the restriction of f̂ to X. The following are equivalent:

(i) There exists ε > 0 such that Γf(x) ≤ −ε for all x ∈ X.

(ii) E_x τ_∂ < ∞ for all x ∈ X.

Proof. Let Y_t = f̂(ξ_t), for t ≥ 0. Then (Y_t) is a [0, ∞)-valued, (F_t)-adapted process. Note also that τ_∂ = T_0 := inf{t ≥ 0 : Y_t = 0}.


The condition Γ̂f̂(x) ≤ −ε for all x ∈ X is equivalent to the condition E[dY_{t∧T_0} | G_{t−}] ≤ −ε 1{T_0 ≥ t} dt. That (i) implies (ii) follows from a direct application of Lemma 7.3.1.

To show that (ii) implies (i), define f̂(x) = ε E_x[τ_∂]. Obviously f̂(∂) = 0 while 0 < f̂(x) < ∞ for all x ∈ X. Conditioning on the first move of the embedded Markov chain we get:

m_{f̂}(x) = E[f̂(ξ̂_1) − f̂(x) | ξ̂_0 = x] = E[f̂(ξ_{J_1}) − f̂(ξ_0) | ξ_0 = x]
         = ε ∑_{y∈X̂\{x}} P_{xy} E_y[τ_∂] − ε E_x[τ_∂] = ε(E_x[τ_∂] − E_x[σ_1]) − ε E_x[τ_∂]
         = −ε E_x[σ_1] = −ε/γ_x.

Hence Γ̂f̂(x) ≤ −ε for all x ∈ X.

Proof of Theorem 7.2.5. First, we prove that (i) implies (ii). The condition Γf(x) ≤ −ε implies that m_f(x) < 0. Since this holds for all x ∈ X, the function f cannot be constant. Therefore i := inf_{x∈X} f(x) < sup_{x∈X} f(x) =: s. On choosing a ∈ (i, s) and defining A^c = {x ∈ X : f(x) ≤ a}, we are in the situation described by Theorem 2.5.7, hence transience follows from that theorem. Let (F_n)_{n∈N} be an arbitrary nested increasing sequence of finite sets exhausting X. Since the chain is transient, it leaves any finite set almost surely in finite time, hence P_x[τ_{F_n^c} < ∞] = 1 for every x ∈ F_n. Since the condition Γf(x) ≤ −ε holds everywhere for a strictly positive f, using Remark 7.2.4 we can change the function f into f_n = f 1{· ∈ F_n}, which vanishes outside F_n and still verifies the condition Γf_n(x) ≤ −ε for x ∈ F_n. Additionally, since the set F_n is finite, we have min_{x∈F_n} f(x) > 0. Consider now the process Y_t^{(n)} = f_n(ξ_t); the condition Γf_n(x) ≤ −ε implies that this process is a strong supermartingale on F_n. Choose an arbitrary c ∈ (0, min_{x∈F_n} f(x)) and observe that T_n = inf{t ≥ 0 : Y_t^{(n)} ≤ c} = τ_{F_n^c}. Lemma 7.3.1 guarantees that ε E_x[τ_{F_n^c}] ≤ f_n(x) = f(x) on F_n. But F_n ↑ X and τ_{F_n^c} ↑ ζ; therefore, by Fatou's lemma, E_x[ζ] ≤ f(x)/ε on X.

Let us prove that (ii) implies (i). Let f(x) := E_x[ζ] ∈ (0, ∞). Compute then

m_f(x) = ∑_{y∈X} P_{xy} f(y) − f(x) = ∑_{y∈X} P_{xy} E_y[ζ] − E_x[ζ] = E_x[ζ − σ_1] − E_x[ζ] = −1/γ_x

(the penultimate equality holding because the kernel P is stochastic on X, hence it is impossible to reach the boundary ∂ in one step). It follows that Γf(x) = −1 for all x ∈ X.

Proof of Proposition 7.2.7. Let G(z) = ∫_0^z dy/g(y). Then G is differentiable, with G′(z) = 1/g(z) > 0, hence an increasing function of z ∈ [0, b]. Since g is increasing, G′ is decreasing and hence G is concave, satisfying lim_{z→0} G(z) = 0 and lim_{z→∞} G(z) < ∞. Additionally, boundedness of G implies that G∘f ∈ ℓ^1_+(Γ). Due to differentiability and concavity of G, we have:

ΓG∘f(x) = γ_x E[G(f(ξ_n) + Δ^f_{n+1}) − G(f(ξ_n)) | ξ_n = x]
        ≤ γ_x G′(f(x)) E[Δ^f_{n+1} | ξ_n = x]
        = γ_x (1/g(f(x))) m_f(x) = Γf(x)/g(f(x)) ≤ −c;

we conclude by Theorem 7.2.5 because G∘f is strictly positive and bounded.

Proof of Theorem 7.2.8. The conditions, thanks to Theorem 2.5.7, imply transience of the chain. Additionally, it can be shown that P_{x_0}[τ_A = ∞] > 0, analogously to the proof of Theorem 2.5.7. In fact, let Y_t = f(ξ_{t∧τ_A}). The condition Γf(x) ≤ −ε on A^c implies that (Y_t) is a non-negative supermartingale, converging almost surely towards Y_∞. For every x ∈ A^c, Fatou's lemma implies E_x[Y_∞] ≤ lim_{t→∞} E_x[Y_t] ≤ f(x). Suppose now that P_x[τ_A < ∞] = 1. In that case lim_{t→∞} Y_t = Y_{τ_A} almost surely. We then get, by the monotone convergence theorem, inf_{y∈A} f(y) ≤ E_x[f(ξ_{τ_A})] ≤ f(x), in contradiction with the hypothesis that there exists x ∈ A^c such that f(x) < inf_{y∈A} f(y). Hence P_x[τ_A = ∞] > 0.

Let (F_n)_{n∈N} be a nested increasing sequence of finite sets exhausting X \ A. Since every F_n is finite and the chain is transient, P_{x_0}[τ_{F_n^c} < ∞] = 1; the very same arguments used in the proof of Theorem 7.2.5 and Lemma 7.3.1 imply here that E_{x_0}[τ_{F_n^c}] ≤ f(x_0)/ε. Now τ_{F_n^c} ↑ ζ ∧ τ_A; hence, by Fatou's lemma, E_{x_0}[ζ ∧ τ_A] ≤ f(x_0)/ε. But E_{x_0}[ζ ∧ τ_A] ≥ E_{x_0}[ζ ∧ τ_A | τ_A = ∞] P_{x_0}[τ_A = ∞]. We conclude that

E_{x_0}[ζ ∧ τ_A | τ_A = ∞] ≤ f(x_0) / (ε P_{x_0}[τ_A = ∞]).

Proof of Theorem 7.2.10. Let G(z) = ∫_0^z dy/g(y); this function is differentiable with G′(z) = 1/g(z) > 0, hence increasing. Since g is increasing, G′ is decreasing, hence G is concave. Non-integrability of the infinite tail means that G increases unboundedly towards ∞ as z → ∞, hence G∘f → ∞. Concavity and differentiability of G imply that G(f(x) + Δ) − G(f(x)) ≤ Δ G′(f(x)); integrability of f then guarantees

0 ≤ E[G(f(ξ_n) + Δ^f_{n+1}) | ξ_n = x] ≤ G(f(x)) + m_f(x)/g(f(x)) < ∞,


so that F := G∘f ∈ ℓ^1_+(Γ) as well. Now

ΓG∘f(x) = γ_x E[G(f(ξ_n) + Δ^f_{n+1}) − G(f(ξ_n)) | ξ_n = x]
        = γ_x E[∫_{f(x)}^{f(x)+Δ^f_{n+1}} dy/g(y) | ξ_n = x]
        ≤ γ_x E[(1/g(f(x))) Δ^f_{n+1} 1{Δ^f_{n+1} ≥ 0} − (1/g(f(x))) |Δ^f_{n+1}| 1{Δ^f_{n+1} < 0} | ξ_n = x]
        = γ_x (1/g(f(x))) m_f(x) ≤ c.

Let X_t = F(ξ_t) be the process obtained from the Markov chain after transformation by F. Using the semimartingale decomposition X_t = X_0 + M_t + ∫_{(0,t]} ΓF(ξ_{s−}) ds, it becomes obvious that E_x[X_t] ≤ F(x) + ct, showing that for every finite t, P_x[X_t = ∞] = 0. But since F → ∞, the process ξ_t itself cannot explode.

Proposition 7.3.13. Suppose the chain is recurrent and there exist a finite set A ⊂ X and a constant C = C_A > 0 such that, for all x ∈ X, the uniform bound E_x[τ_A] ≤ C holds. Then the chain implodes towards any state z ∈ X.

Proof. We only sketch the proof, since it relies on the same ideas used to prove that an irreducible Markov chain that is positive recurrent on a finite set is positive recurrent everywhere. First remark that obviously σ_0 ≤ τ_A, where σ_0 is the holding time at the initial state, i.e. 1/γ_x = E_x[σ_0] ≤ E_x[τ_A] ≤ C. Hence γ := inf_{x∈X} γ_x > 0. Let a be an arbitrary state in A and z an arbitrary fixed state in X. Irreducibility means that there exists a path of finite length, say k = k_a, satisfying a ≡ x_0, …, x_k ≡ z and δ = δ_a = ∏_{i=0}^{k−1} P_{x_i,x_{i+1}} > 0. It is then straightforward to show that the probability that, starting from A, we return n times to A before reaching z decays exponentially in n; therefore, once we have reached A, with probability 1 we eventually reach z. Standard arguments then allow us to show that sup_x E_x[τ_z] ≤ K_{A,z} for some constant K_{A,z} < ∞, and this guarantees implosion towards any z ∈ X. In detail: for K′ = K′_A > (2/γ) sup_{a∈A} k_a, we estimate

P_a[τ_z ≤ K′] ≥ P_a[∑_{i=0}^{k−1} σ_i ≤ K′ | ξ_1 = x_1, …, ξ_k = x_k = z] × P_a[ξ_1 = x_1, …, ξ_k = x_k = z].

Then Markov's inequality yields

P_a[τ_z ≤ K′_A] ≥ (1 − k_a/(K′_A γ)) δ_a ≥ (1/2) min_{a∈A} δ_a =: δ′_A > 0.

Now, the set A being finite and the chain irreducible, by repeating exactly the same arguments as above we know that z′ := ξ_{τ_{A^c}} is accessible from a by following a finite path of length, say l_a, whose probability is minorised by some δ″_a > 0. Define δ″ := δ″_A = min_{a∈A} δ″_a and K″ := K″_A > (2/γ) sup_{a∈A} l_a; obviously δ″ > 0 and K″ < ∞. Denote K = K_A = max(K′, K″) < ∞, δ = δ_A = min(δ′, δ″) > 0, and r = r_A = sup_{a∈A} E_a[τ_z].

The hypothesis of the proposition and the strong Markov property imply

E_x[τ_z] = E_x[τ_z | τ_z ≤ τ_A] P_x[τ_z ≤ τ_A] + E_x[τ_z | τ_z > τ_A] P_x[τ_z > τ_A]
        ≤ C_A + E_x[τ_z | τ_z > τ_A] P_x[τ_z > τ_A].

Now

E_x[τ_z | τ_z > τ_A] ≤ E_x[τ_A] / P_x[τ_z > τ_A] + E_x[E[τ_z − τ_A | F_{τ_A}] | τ_z > τ_A]
                   ≤ E_x[τ_A] / P_x[τ_z > τ_A] + sup_{a′∈A} E_{a′}[τ_z].

We thus obtain E_x[τ_z] ≤ 2C_A + r_A. On the other hand,

E_a[τ_z] = E_a[τ_z | τ_z ≤ K] P_a[τ_z ≤ K] + E_a[τ_z | τ_z > K] P_a[τ_z > K]
        ≤ K + P_a[τ_z > K] E_a[E_{X_K}[τ_z] | τ_z > K].

Now, on the event {τ_z > K}, X_K can be any a′ ∈ A or any x′ ∈ A^c \ {z}; by the previously obtained majorisations, E_{X_K}[τ_z] ≤ 2C + r in either case, while P_a[τ_z > K] ≤ 1 − δ. Hence E_a[τ_z] ≤ K + (1 − δ)(2C + r). Taking the supremum over a ∈ A of the left-hand side gives rδ ≤ K + 2C(1 − δ), and thus finally sup_x E_x[τ_z] ≤ 2C + (K + 2C)/δ < ∞, which guarantees implosion towards any z ∈ X.


Proof of Theorem 7.2.11. We first show implosion. Since f is bounded, the condition Γf(x) = γ_x m_f(x) ≤ −ε guarantees that γ = inf_x γ_x > 0, hence the chain always moves at a strictly positive speed.

Let us prove that (i1) implies (i2). Since F is finite and the chain is recurrent, it follows that P_x[τ_F < ∞] = 1 for all x ∉ F; combined with the condition Γf(x) ≤ −ε for x ∉ F, these properties guarantee — by virtue of Lemma 7.3.1 — that E_x[τ_F] ≤ f(x)/ε ≤ sup_{z∈X} f(z)/ε ≤ b/ε. Now, due to recurrence and irreducibility, if there exists a constant C′ = C′_F such that x ∉ F ⇒ E_x[τ_F] < C′, then, for every z ∈ X, there exists a constant C such that x ≠ z ⇒ E_x[τ_z] < C, by virtue of the previous Proposition 7.3.13.

Now, we show that (i2) implies (i1). Suppose that for a finite A ⊂ X there exists a constant C such that E_x[τ_A] ≤ C. Consequently 1/γ_x ≤ E_x[τ_A] ≤ C, leading necessarily to the lower bound γ > 0. Define

f(x) = 0 if x ∈ A, and f(x) = E_x[τ_A] if x ∉ A.

Then it is immediate to show that for x ∉ A we have Γf(x) ≤ −1.

To show non-implosion, use Lemma 7.3.1 to guarantee that if at some time t the process ξ_t is at some point x_0, then the time needed for the process X_t = f(ξ_t) to reach S_a(f) exceeds εf(x_0) with substantial probability. More precisely, P_{x_0}[τ_{S_a(f)} − t > εf(x_0)] ≥ 1 − α for some α ∈ (0, 1). Therefore E_{x_0}[τ_{S_a(f)}] ≥ (1 − α)εf(x_0), and since f → ∞ this expectation cannot be bounded uniformly in x_0.

Proof of Proposition 7.2.12. Since the chain is implosive, for every finite set F there exists a constant C = C_F such that, for every x ∈ F^c, we have E_x[τ_F] ≤ C. By Markov's inequality, we get that P_x[τ_F ≥ 2C] ≤ 1/2. Now, using the strong Markov property, we show that P_x[τ_F ≥ 2kC] ≤ 2^{−k} for all k ≥ 1, from which the existence of exponential moments E_x[exp(ατ_F)] < ∞ follows, for sufficiently small α > 0.
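The last step can be written out explicitly; it is a standard computation (the spacing 2C and the geometric tail bound are exactly those obtained above):

```latex
\mathbb{E}_x\big[e^{\alpha\tau_F}\big]
  \le \sum_{k\ge 0} e^{2(k+1)C\alpha}\,
      \mathbb{P}_x\big[2kC \le \tau_F < 2(k+1)C\big]
  \le \sum_{k\ge 0} e^{2(k+1)C\alpha}\,2^{-k}
  = e^{2C\alpha}\sum_{k\ge 0}\Big(\frac{e^{2C\alpha}}{2}\Big)^{k},
```

which is finite as soon as e^{2Cα} < 2, i.e. for every α < (ln 2)/(2C).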

Proof of Proposition 7.2.13. Let G(z) = ∫_0^z dy/g(y). Since G′(z) = 1/g(z) > 0, the function G is increasing, with G(0) = 0 and G(b) = B. Since g is increasing, G′ = 1/g is decreasing, hence the function G is concave. Concavity then leads to the majorisation

m_{G∘f}(x) ≤ G′(f(x)) m_f(x) = m_f(x)/g(f(x)).

The condition imposed in the statement of the proposition implies that ΓG∘f(x) ≤ −1. We conclude from Lemma 7.3.1.


7.4 Applications to some critical models

This section treats some problems that, even without the explosion phenomenon, exhibit critical behaviour, thus illustrating how our methods can be applied and showing their power.

We first need some technical conditions. Let f ∈ ℓ^{2+δ_0}_+(Γ) for some δ_0 > 0. For every C^3 function g : R_+ → R_+, we get

m_{g∘f}(x) = E[g(f(ξ_{n+1})) − g(f(ξ_n)) | ξ_n = x] = g′(f(x)) m_f(x) + (1/2) g″(f(x)) v_f(x) + R_g(x),

where R_g(x) is the conditional expectation of the remainder occurring in the Taylor expansion¹.

Definition 7.4.1. Let F = (F_n)_{n∈N} be an exhaustive nested sequence of sets F_n ↑ X, f ∈ ℓ^{2+δ}_+(Γ), and g ∈ C^3(R_+; R_+). We say that the chain (or its jumps) satisfies the remainder condition (or shortly condition R) for F, f, and g, if

lim_{n→∞} sup_{x∈F_n^c} R_g(x) (g′(f(x)) m_f(x) + (1/2) g″(f(x)) v_f(x))^{−1} = 0.

The quantity D_g(f, x) := g′(f(x)) m_f(x) + (1/2) g″(f(x)) v_f(x) in the expression above is the effective average drift at the scale defined by the function g. If the function f is non-trivial, there exists a natural exhaustive nested sequence determined by the sub-level sets of f. When we omit to specify the nested sequence, we shall always consider the natural one.

Introduce the notation ln_{(0)} s = s and, recursively for all integers k ≥ 1, ln_{(k)} s = ln(ln_{(k−1)} s); denote L_k(s) = ∏_{i=0}^{k} ln_{(i)} s for k ≥ 0, and L_k(s) := 1 for k < 0. Equivalently, we define exp_{(k)} as the inverse function of ln_{(k)}. On most occasions we shall use (Lyapunov) functions g(s) := ln^η_{(l)} s, with some integer l ≥ 0 and some real η ≠ 0, defined for s sufficiently large, say s ≥ s_0 := exp_{(l)}(2). It is cumbersome but straightforward to show that for f ∈ ℓ^{2+δ_0}_+(Γ) with some δ_0 > 0, condition R is then satisfied. It will be shown in this section that condition R and Lyapunov functions of the aforementioned form play a crucial role in the study of models where the effective average drift D_g(f, x) tends to 0 in some controlled way with n when x ∈ F_n^c, where (F_n) is an exhaustive nested sequence; models of this type lie in the deeply critical regime between recurrence and transience.
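The iterated logarithms ln_{(k)} and the products L_k are easy to mistype, so we record a minimal computational sketch of the definitions above (the function names are ours, chosen for illustration only):

```python
import math

def iter_ln(k: int, s: float) -> float:
    """Iterated logarithm: ln_(0) s = s and ln_(k) s = ln(ln_(k-1) s)."""
    for _ in range(k):
        s = math.log(s)
    return s

def exp_iter(k: int, s: float) -> float:
    """Inverse of iter_ln(k, .): the k-fold iterated exponential exp_(k)."""
    for _ in range(k):
        s = math.exp(s)
    return s

def L(k: int, s: float) -> float:
    """L_k(s) = prod_{i=0}^{k} ln_(i) s for k >= 0, and L_k(s) = 1 for k < 0."""
    if k < 0:
        return 1.0
    prod = 1.0
    for i in range(k + 1):
        prod *= iter_ln(i, s)
    return prod
```

For instance, L(1, s) returns s·ln s, and requiring s ≥ exp_iter(l, 2.0) guarantees ln_{(l)} s ≥ 2, so every factor of L_l(s) is positive.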

¹The most convenient form of the remainder is the Roche–Schlömilch one (see Appendix A, Table 9.IV, p. 1754 of [127] for instance).


7.4.1 Critical models on countably infinite and unbounded subsets of R_+

Consider a discrete-time irreducible Markov chain (ξ_n) on a countably infinite and unbounded subset X of R_+. Since X now inherits the composition properties stemming from the field structure of R, we can define directly m(x) := m_id(x) and v(x) := v_id(x), where id is the identity function on X. The model is termed critical when the drift m tends to 0, in some precise manner, as x → ∞.

In [173], Markov chains on X = {0, 1, 2, …} were considered; it was established that the chain is recurrent if m(x) ≤ v(x)/(2x), while it is transient if m(x) > θv(x)/(2x) for some θ > 1. In particular, if m(x) = O(1/x) and v(x) = O(1), the model is in the critical regime.

The case with an arbitrary degree of criticality,

m(x) = ∑_{i=0}^{k} α_i/L_i(x) + o(1/L_k(x)) and v(x) = ∑_{i=0}^{k} xβ_i/L_i(x) + o(x/L_k(x)),

with α_i, β_i constants, has been settled in [208] by using Lyapunov functions. Under the additional conditions

lim sup_{n→∞} ξ_n = ∞ and lim inf_{x∈X} v(x) > 0

and some technical moment conditions — guaranteeing condition R for this model — that can straightforwardly be shown to hold if id ∈ ℓ^{2+δ_0}(Γ) for some δ_0 > 2, it has been shown in [208] (Theorem 4) that

• if 2α_0 < β_0 the chain is recurrent, while if 2α_0 > β_0 the chain is transient;

• if 2α_0 = β_0, 2α_i − β_i − β_0 = 0 for all i with 0 ≤ i < k, and there exists i with 0 ≤ i < k such that β_i > 0, then

  – if 2α_k − β_k − β_0 ≤ 0 the chain is recurrent,

  – if 2α_k − β_k − β_0 > 0 the chain is transient.

We assume throughout this section that the moment condition id ∈ ℓ^{2+δ_0}(Γ) — guaranteeing condition R for this model — is satisfied.

Let (ξ_n) be a Markov chain on X = {0, 1, …} satisfying, for some k ≥ 0,

m(x) = ∑_{i=0}^{k−1} β_0/(2L_i(x)) + α_k/L_k(x) + o(1/L_k(x)) and v(x) = β_0 + o(1),


with 2α_k > β_0. The aforementioned result guarantees the transience of the chain; we term such a chain k-critically transient. When 2α_k < β_0, the previous result guarantees the recurrence of the chain; we term such a chain k-critically recurrent.

In spite of its seemingly idiosyncratic character, this model proves universal, as the Lyapunov functions used in the study of many general models in critical regimes map those models to k-critical models on countably infinite unbounded subsets of R_+.

Moments of passage times

Proposition 7.4.2. Let (ξ_t) be a continuous-time Markov chain on X and A the finite set A := [0, x_0] ∩ X for some sufficiently large x_0. Suppose that, for some integer k ≥ 0, its embedded chain is k-critically recurrent, i.e.

m(x) = ∑_{i=0}^{k−1} β_0/(2L_i(x)) + α_k/L_k(x) + o(1/L_k(x)) and v(x) = β_0 + o(1),

with 2α_k < β_0. Denote C = α_k/β_0 (hence C < 1/2) and assume there exists a constant κ > 0 such that² γ_x = O(L_k^2(x)/ln^κ_{(k)}(x)) for large x. Define p_0 := p_0(C, κ) = (1 − 2C)/κ.

(i) Assume that v^{(ρ)}(x) = O(1) with ρ = max(2, 1 − 2C) + δ_0. If q < p_0 then E_x[τ_A^q] < ∞.

(ii) Assume that v^{(ρ)}(x) = O(1) with ρ = max(2, p_0) + δ_0. If q ≥ p_0 then E_x[τ_A^q] = ∞.

Remark 7.4.3. We remark that when κ ↓ 0 (for fixed k and C), p_0 ↑ ∞, implying that more and more moments exist.

Proof of Proposition 7.4.2. For the function f(x) = ln^η_{(k)} x we determine

m_f(x) = (1/2) β_0 η (2C + η − 1 + o(1)) ln^η_{(k)}(x) / L_k^2(x).

Concerning part (i), for p > 0, we remark that

(0 < pη < 1 − 2C and η ≥ κ/2) ⇒ (Γf^p(x) ≤ −c f^{p−2}(x)),

² i.e. there exists a constant c_3 > 0 such that 1/c_3 ≤ γ_x ln^κ_{(k)}(x)/L_k^2(x) ≤ c_3 for x ≥ x_0.


for some constant³ c > 0. Hence p < 2(1 − 2C)/κ, and statement 1 of Theorem 7.2.1 implies that for any q ∈ (0, (1 − 2C)/κ) the q-th moment of the passage time exists. Optimising over the accessible values of q, we get that E_x[τ^q] < ∞ for all q < p_0.

As for part (ii), assume first that 1 − 2C < κ. We verify that, on choosing η ∈ (1 − 2C, κ] and p > (1 − 2C)/κ, the three conditions of statement 2 of Theorem 7.2.1 (with f = g) hold outside some finite set A, namely Γf ≥ −c_1, Γf^r ≤ c_2 f^{r−1} for some r > 1, and Γf^p ≥ 0.

Suppose then that κ ≤ 1 − 2C. In this situation, choosing the parameters η ∈ (0, κ] and p > (1 − 2C)/κ implies simultaneous verification of the three conditions of statement 2 of Theorem 7.2.1. In both situations, we conclude that for all q > (1 − 2C)/κ the corresponding moment does not exist. Optimising over the accessible values of q, we get that E_x[τ^q] = ∞ for all q > p_0.

Explosion and implosion

Proposition 7.4.4. Let (ξ_t) be a continuous-time Markov chain. Suppose that, for some integer k ≥ 0, its embedded chain is k-critically transient.

(i) If there exist a constant d_1 > 0, an integer l > k, and a real κ > 0 (arbitrarily small) such that

γ_x ≥ d_1 L_k(x)L_l(x)(ln_{(l)} x)^κ, x ≥ x_0,

then P_y[ζ < ∞] = 1 for all y ∈ X.

(ii) If there exist a constant d_2 > 0 and an integer l > k such that

γ_x ≤ d_2 L_k(x)L_l(x), x ≥ x_0,

then the continuous-time chain is conservative.

Proof. Part (i): for a k-critically transient chain, choose a function f behaving at large x as f(x) = 1/ln^η_{(l)}(x). Denote C = α_k/β_0 (hence C > 1/2, for the chain to be transient). We then estimate

m_f(x) = −2β_0η(2C − 1) · 1/(L_k(x)L_l(x) ln^η_{(l)} x) + o(1/(L_k(x)L_l(x) ln^η_{(l)} x)).

We conclude by Theorem 7.2.5.

³The constant c can be chosen with c ≥ (1/2)β_0η(1 − 2C)c_3.


Part (ii): choose as Lyapunov function the identity function f(x) = x and estimate

m_{ln_{(l+1)}∘f}(x) = 2β_0(2C − 1) ln_{(l+1)} x/(L_k(x)L_{l+1}(x)) = 2β_0(2C − 1) · 1/(L_k(x)L_l(x)).

We conclude by Theorem 7.2.10, choosing g(s) = L_l(s).

Proposition 7.4.5. Let (ξ_t) be a continuous-time Markov chain. Suppose that, for some integer k ≥ 0, its embedded chain is k-critically recurrent. Denote C = α_k/β_0 (hence C < 1/2, for the chain to be recurrent). Let A be the finite set A := [0, x_0] ∩ X for some sufficiently large x_0.

(i) If there exist a constant d_1 > 0, an integer l > k, and an arbitrarily small real κ > 0 such that

γ_x ≥ d_1 L_k(x)L_l(x)(ln_{(l)} x)^κ, x ≥ x_0,

then there exists a constant B such that E_y[τ_A] ≤ B uniformly in y ∈ A^c, i.e. the chain implodes.

(ii) If there exist a constant d_2 > 0 and an integer l > k such that

γ_x ≤ d_2 L_k(x)L_l(x), x ≥ x_0,

then the continuous-time chain does not implode.

Proof. Part (i): use the function f defined for sufficiently large x by the formula f(x) = 1 − 1/ln^η_{(l)} x, for some l > k and η > 0. We then estimate

m_f(x) = 2β_0η(2C − 1) · 1/(L_k(x)L_l(x) ln^η_{(l)} x) + o(1/(L_k(x)L_l(x) ln^η_{(l)} x)).

We conclude by statement 1 of Theorem 7.2.11.

Part (ii): use the function f defined for sufficiently large x by the formula f(x) = ln^η_{(l+1)} x, for some l ≥ k. We then estimate

m_f(x) = 2β_0η(2C − 1) ln^η_{(l+1)} x/(L_k(x)L_{l+1}(x)) + o(ln^η_{(l+1)} x/(L_k(x)L_{l+1}(x))).

If γ_x ≤ d_2 L_k(x)L_l(x) for large x, then, using the above estimate for the case η = 1 and the case η = r for some small r > 1, we observe that the conditions Γf ≥ −ε and Γf^r ≤ f^{r−1} are simultaneously verified. We conclude by statement 2 of Theorem 7.2.11.


7.4.2 Simple random walk on Z^d for d = 2 and d ≥ 3

Here the state space is X = Z^d and the embedded chain is a simple random walk on X. Since in dimension 2 the simple random walk is null recurrent while in dimension d ≥ 3 it is transient, a different treatment is required in the two cases.

Dimension d ≥ 3

For the Lyapunov function f defined by Z^d ∋ x ↦ f(x) := ‖x‖, we can show that there exist constants α_0 > 0 and β_0 > 0 such that lim_{‖x‖→∞} ‖x‖ m_f(x) = α_0 and lim_{‖x‖→∞} v_f(x) = β_0, with C = α_0/β_0 > 1/2. Therefore the one-dimensional process X_t = f(ξ_t) has 0-critically transient Lamperti behaviour.

We therefore get that (ξ_t)_{t∈R_+} is (quite unsurprisingly) a transient process, and that if there exist a constant a > 0 and

• a constant d_1 > 0, an integer l > 0, and a real κ > 0 (arbitrarily small) such that

γ_x ≥ d_1‖x‖L_l(‖x‖)(ln_{(l)} ‖x‖)^κ, ‖x‖ ≥ a,

then P_y[ζ < ∞] = 1 for all y ∈ X;

• a constant d_2 > 0 and an integer l > 0 such that

γ_x ≤ d_2‖x‖L_l(‖x‖), ‖x‖ ≥ a,

then the continuous-time chain is conservative.

Dimension 2

Using again the Lyapunov function f(x) = ‖x‖, we show that the one-dimensional process X_t = f(ξ_t) is of the 1-critically recurrent Lamperti type. Hence, again using the results obtained in Section 7.4.1, we get that if there exist a constant a > 0 and

• a constant d_1 > 0, an integer l > 1, and an arbitrarily small real κ > 0 such that

γ_x ≥ d_1 L_1(‖x‖)L_l(‖x‖)(ln_{(l)} ‖x‖)^κ, ‖x‖ ≥ a,

then there exists a constant C such that E_y[τ_A] ≤ C uniformly in y with ‖y‖ ≥ a, i.e. the chain implodes;

• a constant d_2 > 0 and an integer l > 1 such that

γ_x ≤ d_2 L_1(‖x‖)L_l(‖x‖), ‖x‖ ≥ a,

then the continuous-time chain does not implode.


7.4.3 Random walk on Z²_+ with reflecting boundaries

The model in discrete time

Here X = Z²_+. We denote by X̊ = {x ∈ Z²_+ : x_1 > 0, x_2 > 0} the interior of the wedge and by ∂_1X = {x ∈ Z²_+ : x_2 = 0} (and similarly for ∂_2X) its boundaries. Since X is a subset of a vector space, we can define directly the increment vector D := ξ_{n+1} − ξ_n and the average conditional drift m(x) := m_id(x) = E[D | ξ_n = x] ∈ R². We assume that m(x) = 0 for all x ∈ X̊, so that we are in a critical regime. For x ∈ ∂_♭X, with ♭ = 1, 2, the drift m_♭(x) does not vanish but is a constant vector m_♭ forming an angle φ_♭ with the normal to ∂_♭X. For x ∈ X̊, the conditional covariance matrix C(x) := (C(x)_{ij}), with C(x)_{ij} = E[D_iD_j | ξ_n = x], is a constant 2 × 2 matrix C, reading

C := Cov(D, D) = ( s_1²  λ ; λ  s_2² ).

There exists an isomorphism Φ of R² such that Cov(ΦD, ΦD) = ΦCΦᵗ = I; it is elementary to show that

Φ = ( s_2/d  −λ/(s_2 d) ; 0  1/s_2 ),

where d = √(det C), is a solution of the aforementioned isomorphism equation. This isomorphism maps the quadrant R²_+ onto a squeezed wedge Φ(R²_+) having an angle ψ at its summit, with tan ψ = −d/λ (the branch being chosen in (0, π)). Obviously ψ = π/2 if λ = 0, while ψ ∈ (0, π/2) if λ < 0 and ψ ∈ (π/2, π) if λ > 0. We denote by Y = Φ(X) the squeezed image of the lattice. The isomorphism Φ transforms the average drifts at the boundaries into n_♭ = Φm_♭, forming new angles ψ_♭ with the normals to the boundaries of Φ(R²_+).

The discrete-time model has been exhaustively treated in [9], and its extension to the case of excitable boundaries carrying internal states in [203]. Here we recall the main results of [9] under some simplifying assumptions that allow us to present them without redefining the model completely or considering all the technicalities. The assumptions we need are that the jumps

• are bounded from below, i.e. there exists a constant K > 0 such that D_1 ≥ −K and D_2 ≥ −K,

• satisfy a sufficient integrability condition, for instance E[‖D‖^{2+δ_0}] < ∞ for some δ_0 > 0,

• are such that their covariance matrix is non-degenerate.
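The algebra behind the normalising isomorphism Φ introduced above is easy to check numerically; the following sketch (with arbitrary illustrative values for s_1, s_2, λ, which are our choices, not values from the text) verifies ΦCΦᵗ = I and the branch of the wedge angle ψ:

```python
import math
import numpy as np

# Illustrative values (any s1, s2 > 0 and lam with det C > 0 would do).
s1, s2, lam = 1.3, 0.7, 0.4
C = np.array([[s1**2, lam],
              [lam, s2**2]])
d = math.sqrt(np.linalg.det(C))   # d = sqrt(det C) > 0

# The squeezing isomorphism from the text.
Phi = np.array([[s2 / d, -lam / (s2 * d)],
                [0.0,     1.0 / s2]])

# Phi C Phi^t = I, i.e. Phi D has identity covariance.
assert np.allclose(Phi @ C @ Phi.T, np.eye(2))

# The wedge angle psi is the angle between the images of the two
# coordinate axes; atan2 selects the branch in (0, pi), consistent with
# tan(psi) = -d/lam and psi in (pi/2, pi) when lam > 0.
e1, e2 = Phi @ np.array([1.0, 0.0]), Phi @ np.array([0.0, 1.0])
psi = math.acos(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2)))
assert math.isclose(psi, math.atan2(d, -lam))
```

The check confirms both the normalisation and that for λ > 0 the image wedge is obtuse, as stated in the text.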

Under these assumptions we can state the following simplified version of the results in [9].

Denote by χ = (ψ_1 + ψ_2)/ψ and A = {x ∈ X : ‖x‖ ≤ a}.

1. If χ ≥ 0 the chain is recurrent.

2. If χ < 0 the chain is transient.

3. If 0 < χ < 2 + δ_0, then for every p < χ/2 and every x ∉ A, E_x[τ_A^p] < ∞.

4. If 0 < χ < 2 + δ_0, then for every p > χ/2 and every x ∉ A, E_x[τ_A^p] = ∞.

Let f : R²_+ → R_+ be a C² function and define f̃(y) = f(Φ^{−1}y). Although the Hessian operator does not in general transform as a tensor, the linearity of Φ allows us to write Hess_f(x) = Φᵗ Hess_{f̃}(Φx) Φ. For every x ∈ X̊ we then establish⁴ the identity

E[⟨D, Hess_f(x)D⟩ | ξ_n = x] = ∑_{i,j} Cov((ΦD)_i, (ΦD)_j | ξ_n = x) (Hess_{f̃}(Φx))_{ij} = tr Hess_{f̃}(Φx) = Lap_{f∘Φ^{−1}}(Φ(x)),

so E[⟨D, Hess_f(x)D⟩ | ξ_n = x] = Lap_{f̃}(Φ(x)).

We denote h_{β,β_1}(x) = ‖x‖^β cos(β arctan(x_2/x_1) − β_1). This function is harmonic, i.e. Lap h_{β,β_1} = 0. We are interested in harmonic functions that are positive on Φ(R²_+); positivity and geometry then impose conditions on β and β_1. In fact, sgn(β)β_1 is the angle of ∇h_{β,β_1}(x) at x ∈ ∂_1X with the normal to ∂_1X. Similarly, if β_2 = βψ − β_1, then sgn(β)β_2 is the angle of the gradient with the normal to ∂_2X. Now, it becomes evident that β_i, i = 1, 2, must lie in the interval (−π/2, π/2), and subsequently β = (β_1 + β_2)/ψ. Notice also that the datum of two admissible angles β_1 and β_2 uniquely determines the harmonic function whose gradient at the boundaries forms the angles above. Hence ⟨∇h_{β,β_1}(y), n_♭⟩ = ‖y‖^{β−1} β sin(ψ_♭ − β_♭), for y ∈ ∂_♭Y and ♭ = 1, 2.

Let now g : R_+ → R_+ be a C³ function and h = h_{β,β_1} a harmonic function that remains positive on Φ(R²_+). Denoting y = Φx, abbreviating Ξ := g(h(Φξ_{n+1})) − g(h(Φξ_n)), and using the fact that h is harmonic, we get

E[Ξ | ξ_n = x] = g′(h(y)) E[⟨∇h(y), ΦD⟩ | ξ_n = x]
              + (g″(h(y))/2) E[⟨∇h(y), ΦD⟩² | ξ_n = x]
              + (g′(h(y))/2) E[⟨ΦD, Hess_h(y)ΦD⟩ | ξ_n = x] + R_3
              = g′(h(y))⟨∇h(y), n(y)⟩ + (g″(h(y))/2)‖∇h(y)‖² + R_3(y),

where R_3 is the remainder of the Taylor expansion. The value of the conditional increment depends on the position of x. If x ∈ ∂_♭X, the dominant term of the right-hand side is g′(h(y))⟨∇h(y), n_♭⟩, while in the interior of the space that term vanishes identically because there n(y) = 0; hence the dominant term becomes (g″(h(y))/2)‖∇h(y)‖².

⁴Since we have used the symbol Δ to denote the jumps of the process, we introduce the symbol Lap to denote the Laplacian.

The model in continuous-time

Proposition 7.4.6. Let 0 < χ = (ψ_1 + ψ_2)/ψ (hence the chain is recurrent), let A := A_a = {x ∈ X : ‖x‖ ≤ a} for some a > 0, and let γ_x = O(‖x‖^{2−κ}); denote p_0 = χ/κ. Suppose further that id ∈ ℓ^ρ(Γ) for some ρ > 2.

(i) If q < p_0, then E_x[τ_A^q] < ∞.

(ii) If q > p_0, then E_x[τ_A^q] = ∞.

Proof. Consider the Lyapunov function f(x) = h_{β,β_1}(x)^η. We start with part (i). If 0 < pη < 1 then m_{f^p}(x) < 0. The condition Γf^p ≤ −cf^{p−2} then reads

γ_x ≥ C h_β^{pη−2η}(x) / (h_β^{pη−2}(x)‖∇h_β‖²) = C′ ‖x‖^{2β−2βη}/‖x‖^{2β−2} = C′‖x‖^{2−2βη},

from which it follows that 2βη > κ. This inequality, combined with 0 < pη < 1, yields that for all q < β/κ < χ/κ we have E_x[τ_A^q] < ∞. Hence, on optimising over the accessible values of q, we obtain the value of p_0.

As for part (ii), we proceed similarly; we need, however, the full-fledged version of statement (ii) of Theorem 7.2.1, with both the function f and g = h_{χ,ψ_1}^η. Then f(x) ≤ Cg(x) and

(ηβ < κ and ηp > 1) ⇒ (Γg ≥ −ε, Γg^r ≤ cg^{r−1} for r > 1, and Γf^p ≥ 0).

Simultaneous verification of these inequalities yields p_0 = χ/κ.

Using again Lyapunov functions of the form f = h_β^η, we can show that the drift of the chain in the transient case can be controlled by two 0-critically transient Lamperti processes in the variable ‖x‖ that are uniformly comparable. Using the methods developed in Section 7.4.1, we can thus show the following.


Proposition 7.4.7. Let χ < 0 (hence the chain is transient).

(i) If there exist a constant d_1 > 0 and an arbitrary integer l > 0 such that γ_x ≥ d_1‖x‖L_l(‖x‖) ln^κ_{(l)} ‖x‖, for some arbitrarily small κ > 0, then the chain explodes.

(ii) If there exist a constant d_2 > 0 and an arbitrary integer l > 0 such that γ_x ≤ d_2‖x‖L_l(‖x‖), then the chain does not explode.

With the help of similar arguments we can show the following

Proposition 7.4.8. Let 0 < χ (hence the chain is recurrent).

(i) If there exist a constant d_1 > 0 and an arbitrary integer l > 0 such that γ_x ≥ d_1‖x‖L_l(‖x‖) ln^κ_{(l)} ‖x‖, for some arbitrarily small κ > 0, then the chain implodes.

(ii) If there exist a constant d_2 > 0 and an arbitrary integer l > 0 such that γ_x ≤ d_2‖x‖L_l(‖x‖), then the chain does not implode.

7.4.4 Collection of one-dimensional complexes

We introduce some simple models to illustrate two phenomena:

• it is possible to have 0 < Px[ζ <∞] < 1,

• it is possible to have Px[ζ =∞] = 0 and Ex ζ =∞.

The simplest situation corresponds to a continuous-time Markov chain whose embedded chain is a simple transient random walk on X = Z with non-trivial tail boundary. For instance, choose some p ∈ (1/2, 1) and transition matrix

P_{xy} = 1/2, if x = 0, y = x ± 1;
       = p, if x > 0, y = x + 1, or x < 0, y = x − 1;
       = 1 − p, if x > 0, y = x − 1, or x < 0, y = x + 1;
       = 0, otherwise.

Then, for every x ≠ 0, 0 < P_x[τ_0 = ∞] < 1. Suppose now that γ_x = c for x < 0, while there exist a sufficiently large integer l ≥ 0 and an arbitrarily small δ > 0 such that γ_x ≥ cL_l(x) ln^δ_{(l)} x for x ≥ x_0. Then, using Theorem 7.2.8, we establish that E_x[ζ | τ_{Z_−} > ζ] < ∞ for all x > 0, while P_x[ζ = ∞ | τ_{Z_+} = ∞] = 1 for all x < 0. This result, combined with irreducibility of the chain, leads to the conclusion: 0 < P_x[ζ < ∞] < 1 for all x ∈ X.
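The fact that 0 < P_x[τ_0 = ∞] < 1 is the classical gambler's-ruin computation: for x > 0 the walk behaves, until it hits 0, like the p-biased walk on the positive integers, whose escape probability is 1 − ((1−p)/p)^x. A minimal sketch in exact rational arithmetic (the function name is ours):

```python
from fractions import Fraction

def escape_prob(x: int, p: Fraction) -> Fraction:
    """P_x[tau_0 = infinity] for the walk on the positive integers with
    up-probability p > 1/2: the gambler's-ruin formula 1 - ((1-p)/p)^x."""
    rho = (1 - p) / p
    return 1 - rho**x

p = Fraction(2, 3)
# The formula solves the harmonic equation u(x) = p*u(x+1) + (1-p)*u(x-1)
# with boundary value u(0) = 0, as a direct check confirms:
for x in range(1, 10):
    assert escape_prob(x, p) == p * escape_prob(x + 1, p) + (1 - p) * escape_prob(x - 1, p)
```

For p = 2/3 this gives escape_prob(1, p) = 1/2, strictly between 0 and 1, as claimed.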


It is worth noting that bending the axis Z at 0 allows one to consider the state space as the gluing of two one-dimensional complexes, X_2 = {0} ∪ N × {−, +}; every point x ∈ Z \ {0} is now represented as x = (|x|, sgn(x)). This construction can be generalised by gluing a countably infinite family of one-dimensional complexes through a particular point o, introducing the state space X_∞ = {o} ∪ N × N; every point x ∈ X_∞ \ {o} can be represented as x = (x_1, x_2) with x_1, x_2 ∈ N.

Let (ξ_t)_{t∈[0,∞)} be a continuous-time Markov chain evolving on the state space X_∞. Its embedded (at the moments of jumps) chain (ξ̂_n)_{n∈N} has transition matrix given by

P_{xy} = π_{y_2}, if x = o, y = (1, y_2), y_2 ∈ N;
       = p, if x = (x_1, x_2), y = (x_1 + 1, x_2), x_1 ≥ 1, x_2 ∈ N;
       = 1 − p, if x = (x_1, x_2), y = (x_1 − 1, x_2), x_1 > 1, x_2 ∈ N;
       = 1 − p, if x = (1, x_2), x_2 ∈ N, y = o;
       = 0, otherwise,

where 1/2 < p < 1 and π = (π_n)_{n∈N} is a probability vector on N satisfying π_n > 0 for all n. The chain is obviously irreducible and transient.

The space X_∞ must be thought of as a "mock tree" since, for transient Markov chains, it has a sufficiently rich boundary structure without any of the complications of the homogeneous tree (the study of full-fledged trees is postponed to a subsequent publication). Suppose that for every n ∈ N there exist an integer l_n ≥ 0, a real δ_n > 0, and a K_n > 0 such that, for x = (x_1, x_2) ∈ N × N with x_1 large enough, γ_x satisfies γ_{(x_1,x_2)} = K_{x_2} O(L_{l_{x_2}}(x_1) ln^{δ_{x_2}}_{(l_{x_2})} x_1).

By applying Theorem 7.2.8, we establish that Z_{x_2} := E_{(x_1,x_2)}[ζ | τ_o = ∞] < ∞ for all x_1 > 0 and all x_2 ∈ N, hence P_{(x_1,x_2)}[ζ = ∞ | τ_o = ∞] = 0. Irreducibility then implies that P_o[ζ = ∞] = 0. However,

E_o[ζ] ≥ ∑_{x_2∈N} π_{x_2} E_{(1,x_2)}[ζ | τ_o = ∞] P_{(1,x_2)}[τ_o = ∞] = ((2p − 1)/p) ∑_{x_2∈N} π_{x_2} Z_{x_2}.

Since the sequences (ln)n, (δn)n, and (Kn)n are totally arbitrary, while the positive sequence (πn)n must solely satisfy the probability constraint ∑_{n∈N} πn = 1, all possible behaviours of Eo ζ can occur. In particular, we can choose ln = 0 and δn = 1 for all n ∈ N; this choice gives γ_{(x1,n)} = Kn O(1/x1²) for every n and all large x1, leading to the estimate Zn ≥ C Kn for all n. Choosing now, for instance, πn = O(1/n²) and Kn = O(n) for large n, we get

Po[ζ = ∞] = 0 and Eo ζ = ∞.
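The divergence in the last choice can be made explicit; the constants c1, c2, C′ and the threshold n0 below are supplied here for illustration and are not part of the original argument. With πn ≥ c1 n^{−2} and Kn ≥ c2 n for all n ≥ n0, the bound Zn ≥ C Kn gives

```latex
\operatorname{E}_o \zeta \;\ge\; \frac{2p-1}{p} \sum_{n \in \mathbb{N}} \pi_n Z_n
\;\ge\; C' \sum_{n \ge n_0} n^{-2} \cdot n
\;=\; C' \sum_{n \ge n_0} \frac{1}{n}
\;=\; \infty,
```

where C′ = (2p − 1) c1 c2 C / p, the harmonic tail being divergent.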


370 Chapter 7. Markov chains in continuous time

This remark leads naturally to the question of whether, for transient exploding chains with non-trivial tail boundary, there exists some critical q > 0 such that E ζ^p < ∞ for p < q while E ζ^p = ∞ for p > q. Such models include continuous-time random walks on the homogeneous tree and, more generally, on non-amenable groups. These questions are currently under investigation and will be addressed in a subsequent publication.

7.4.5 Random walk on a family of infinitely decorated lattices

We now introduce a model where implosion occurs in the sense of Definition 7.1.2, and where the expectation Ex τA is finite without being uniformly bounded. The model is constructed by decorating every point of N with a truncated version of the space X∞ introduced in Section 7.4.4.

For any M ∈ N, consider YM = {o} ∪ ({1, . . . , M} × N) (the set YM is obtained from X∞ by “haircutting” all one-dimensional complexes at size M). We denote by ∂YM = {M} × N ≅ N the set of endpoints of YM. Since in Section 7.4.4 we showed that Ex[ζ | τo = ∞] ≤ Eo[ζ | τo = ∞] < ∞, it follows that if we keep the same transition probabilities and γ inside YM, we shall have E[τ_{∂YM} | τo = ∞] < ∞.

We now construct the state space of the Markov chain as X = {⋆} ∪ ⊔_{k∈N} Yk. At every point k ∈ N we have placed a decoration Yk; the gluing points of the Yk are therefore distinguished by attaching to them a label ok, k ∈ N. In the interior of every decoration Yk we keep the same transition matrix as in Section 7.4.4; when the chain reaches a point of ∂Yk, it jumps with probability 1 to ⋆. More precisely,

P(x, y) =
    1,           if x = ⋆, y = o1,
    1/2 − ε/2,   if x = ok, y = o_{k±1}, k ∈ N,
    ε π_{y2},    if x = ok, y = (k; 1, y2), k ∈ N, y2 ∈ N,
    p,           if x = (k; x1, x2), y = (k; x1 + 1, x2), 1 ≤ x1 < k, x2 ∈ N,
    1 − p,       if x = (k; x1, x2), y = (k; x1 − 1, x2), 1 < x1 < k, x2 ∈ N,
    1 − p,       if x = (k; 1, x2), x2 ∈ N, y = ok,
    1,           if x = (k; k, x2), x2 ∈ N, y = ⋆,
    0,           otherwise,

with p and (πn) as in Section 7.4.4 and 0 < ε < 1. It now becomes evident that for every δ > 0 there exists K > 0 such that for all x ∈ X we have Px[τ⋆ > K] ≤ δ: the time needed to reach ⋆ from any x is necessarily less than the time needed for the chain of the previous subsection (in X∞) to explode, hence the uniformity of the bound. On the other hand,



for x = (k; l, n) with 1 ≤ l ≤ k, we have Ex[τ⋆ | τ_{ok} = ∞] < ∞, hence Ex τ⋆ < ∞.
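To illustrate that Ex τ⋆ is finite for every decoration but not uniformly bounded in k, one can simulate a simplified radial version of the chain. Everything here is an illustrative assumption rather than the model itself: the branch label x2 is collapsed (it does not affect the radial dynamics), the excursion through ok is replaced by an immediate re-entry at level 1, and the neighbour-hopping among the ok is ignored.

```python
import random

def tau_star_embedded(k, x1=1, p=0.75, max_steps=10**6):
    """Embedded-chain steps to reach the cemetery point ⋆, started at
    radial level x1 inside decoration Y_k (simplified radial sketch)."""
    steps = 0
    while steps < max_steps:
        steps += 1
        if x1 == k:
            return steps            # from level k the chain jumps to ⋆
        if random.random() < p:
            x1 += 1                 # outward step, probability p
        else:
            x1 = max(1, x1 - 1)     # inward step; from level 1, via o_k back in
    return steps

# mean hitting time grows roughly linearly in the decoration size k:
# finite for every k, but not uniformly bounded over k
random.seed(2)
means = [sum(tau_star_embedded(k) for _ in range(200)) / 200 for k in (5, 20, 80)]
```

With drift 2p − 1 > 0 the mean number of steps from level 1 to level k is of order k/(2p − 1), matching the "finite but not uniformly bounded" behaviour described above.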

The set X again plays the role of a “mock graph”, reminiscent of the situation on a homogeneous tree with the points of a given level wired to the neutral element.

Bibliographical notes

The results of this chapter were taken from [MenPet13].

Chung [35] established that the condition ∑_{n=1}^∞ 1/γ_{ξn} < +∞, where ξn is the position of the chain immediately after the n-th jump has occurred, is equivalent to explosion. However, this condition is very difficult to check, since it is global, i.e., it requires knowledge of the entire trajectory of the embedded Markov chain. Later, sufficient conditions for explosion, whose validity can be verified by local estimates, were introduced. Sufficiently sharp conditions for explosion and non-explosion, applicable only to Markov chains on countably infinite subsets of the non-negative reals, are given in [111, 148], while Lyapunov functions are used in [29] for the study of Markov chains with state space Z and time-dependent holding times. In [272], a sufficient condition for explosion is established for Markov chains on general countable sets; similar sufficient conditions for explosion are established in [283] for Markov chains on locally compact separable metric spaces.
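In the special case of a pure-birth chain the embedded trajectory is deterministic (ξn = n), so Chung's global condition becomes a simple series test that can be checked numerically. The function name and the rate choices γ(n) = n² versus γ(n) = n below are illustrative assumptions.

```python
import math

def expected_total_holding_time(gamma, n_terms=10**6):
    """Partial sum of sum_{n>=1} 1/gamma(n): the expected total time spent
    by a pure-birth chain jumping n -> n+1 with rate gamma(n).

    For such a chain, Chung's criterion reads: explosion occurs iff the
    series sum_n 1/gamma(n) converges.
    """
    return sum(1.0 / gamma(n) for n in range(1, n_terms + 1))

# gamma(n) = n^2: the series converges (to pi^2/6), hence explosion
s_fast = expected_total_holding_time(lambda n: n * n)
# gamma(n) = n: the harmonic series diverges, hence no explosion
s_slow = expected_total_holding_time(lambda n: n)
```

For a genuinely random embedded trajectory this shortcut is unavailable, which is precisely why local (Lyapunov-function) conditions are valuable.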

The first generalisation of Foster's criteria to the continuous-time case was given in the unpublished technical report [218]. Notice, however, that the method of that paper is subject to the same important restriction as the original paper of Foster [95], namely that the semimartingale condition must be verified everywhere except at one point.

The idea of the proof of Theorem 7.2.10 relies on the intuitive observation that if Γf(x) ≤ g(f(x)) for all x, then E f(Xt) cannot grow very fast with time, and since f → ∞ the process itself cannot grow fast either. The same idea was used in [74] to prove non-explosion for Markov chains on separable metric spaces. The proof here also relies on the powerful ideas developed in the proofs of Theorems 1 and 2 of [148] and of Theorem 4.1 of [74], but improves the original results in several respects. In the first place, the present result is valid on arbitrary countably infinite state spaces X (not necessarily subsets of R); in particular, it can cope with models on higher-dimensional lattices (like random walks in Z^d or reflected random walks in quadrants). Additionally, even for processes on countably infinite subsets



of R, Theorem 7.2.10 covers critical regimes such as those exhibited by the Lamperti model (see Section 7.4.1), a “crash test” model that is recalcitrant to the methods of [148].

We also mention [154] (non-ergodicity criteria), [241, 278] (analogues of Foster's theorems), and [262] (the relation between the existence of moments and rates of convergence to stationarity).


Glossary of named assumptions

Section 2.8

(A0) Suppose that a(x) → ∞ as x → ∞, there exists na ∈ N such that x ↦ a(x) is increasing for all x ≥ na, and ∑_{n=1}^∞ 1/a(n) < ∞.

(A1) Suppose that v(x) → ∞ as x → ∞, there exists xv ∈ R+ such that x ↦ v(x) is increasing for all x ≥ xv, and ∑_{n=1}^∞ 1/(n v(n)) < ∞.

Section 3.2

(M0) Let Xn be an irreducible, time-homogeneous Markov chain on Σ, a locally finite, unbounded subset of R+, with 0 ∈ Σ.

(M1) Suppose that for some p > 2,

sup_{x∈Σ} E[|∆n|^p | Xn = x] < ∞.

(M2) Suppose that there exist a, b ∈ R with b > 0 such that

lim_{x→∞} µ2(x) = b, and lim_{x→∞} x µ1(x) = a.

Section 3.3

(L0) Let Xn be a stochastic process adapted to a filtration Fn and taking values in S ⊆ R+ with inf S = 0 and sup S = +∞.

(L1) Suppose that for some p > 2, δ ∈ (0, p − 2], and C ∈ R+,

E[|∆n|^p | Fn] ≤ C(1 + Xn)^{p−2−δ}, a.s., for all n ≥ 0.


(L2) Suppose that lim sup_{n→∞} Xn = +∞, a.s.

(L3) Suppose that for each x ∈ R+ there exist rx ∈ N and δx > 0 such that, for all n ≥ 0,

P[max_{n≤m≤n+rx} Xm ≥ x | Fn] ≥ δx, on {Xn ≤ x}.

Section 3.6

(I) Suppose that for each x, y ∈ S there exist m(x, y) ∈ N and ϕ(x, y) > 0 such that

P[X_{n+m(Xn,y)} = y | Fn] ≥ ϕ(Xn, y), a.s., for all n ≥ 0.

(Ra) Suppose that, for all k ∈ N, the distribution of Ek on {νk−1 < ∞} is the same as the distribution of E1.

(Rb) Suppose that, on {N = ∞}, (Ek)k∈N is an i.i.d. sequence.

(R) Suppose that X0 = 0 and both (Ra) and (Rb) hold.

Section 3.12

(L5) Suppose that for some C ∈ R+, E[|∆n|^p | Fn] ≤ C, a.s.


Bibliography

[1] M. Abramowitz and I. A. Stegun, Handbook of mathematical functions with formulas, graphs, and mathematical tables, National Bureau of Standards Applied Mathematics Series, vol. 55, For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C., 1964.

[2] K. S. Alexander, Excursions and local limit theorems for Bessel-like random walks, Electron. J. Probab. 16 (2011), no. 1, 1–44.

[3] O. S. M. Alves, F. P. Machado, and S. Yu. Popov, The shape theorem for the frog model, Ann. Appl. Probab. 12 (2002), no. 2, 533–546.

[4] E. D. Andjel, M. V. Menshikov, and V. V. Sisko, Positive recurrence of processes associated to crystal growth models, Ann. Appl. Probab. 16 (2006), no. 3, 1059–1085.

[5] S. Asmussen, Applied probability and queues, second ed., Applications of Mathematics (New York), vol. 51, Springer-Verlag, New York, 2003, Stochastic Modelling and Applied Probability.

[6] S. Aspandiiarov and R. Iasnogorodski, Tails of passage-times and an application to stochastic processes with boundary reflection in wedges, Stochastic Process. Appl. 66 (1997), no. 1, 115–145.

[7] , Asymptotic behaviour of stationary distributions for countable Markov chains, with some applications, Bernoulli 5 (1999), no. 3, 535–569.

[8] , General criteria of integrability of functions of passage-times for nonnegative stochastic processes and their applications, Theory Probab. Appl. 43 (1999), 343–369, translated from Teor. Veroyatnost. i Primenen. 43 (1998) 509–539 (in Russian).


[9] S. Aspandiiarov, R. Iasnogorodski, and M. Menshikov, Passage-time moments for nonnegative stochastic processes and an application to reflected random walks in a quadrant, Ann. Probab. 24 (1996), no. 2, 932–960.

[10] K. B. Athreya and P. E. Ney, Branching processes, Dover Publications, Inc., Mineola, NY, 2004, reprint of the 1972 original.

[11] O. Aydogmus, A. P. Ghosh, S. Ghosh, and A. Roitershtein, Colored maximal branching processes, Theory Probab. Appl. 59 (2015), no. 4, translated from Teor. Veroyatnost. i Primenen. 59 (2014) 790–800 (in Russian).

[12] K. Azuma, Weighted sums of certain dependent random variables, Tohoku Math. J. (2) 19 (1967), 357–367.

[13] L. Bachelier, Théorie de la spéculation, Ann. Sci. École Norm. Sup. (3) 17 (1900), 21–86.

[14] M. N. Barber and B. W. Ninham, Random and restricted walks: theory and applications, Gordon and Breach, New York, 1970.

[15] L. E. Baum, On convergence to +∞ in the law of large numbers, Ann. Math. Statist. 34 (1963), 219–222.

[16] V. Belitsky, P. A. Ferrari, M. V. Menshikov, and S. Yu. Popov, A mixture of the exclusion process and the voter model, Bernoulli 7 (2001), no. 1, 119–144.

[17] I. Benjamini, R. Izkovsky, and H. Kesten, On the range of the simple random walk bridge on groups, Electron. J. Probab. 12 (2007), no. 20, 591–612.

[18] H. C. Berg, Random walks in biology, expanded ed., Princeton University Press, New Jersey, 1993.

[19] P. Billingsley, Probability and measure, third ed., Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons Inc., New York, 1995.

[20] N. H. Bingham, Random walk on spheres, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 22 (1972), 169–192.


[21] D. Blackwell, On transient Markov processes with a countable number of states and stationary transition probabilities, Ann. Math. Statist. 26 (1955), 654–658.

[22] D. Blackwell and D. Freedman, A remark on the coin tossing game, Ann. Math. Statist. 35 (1964), 1345–1347.

[23] B. Böttcher, An overshoot approach to recurrence and transience of Markov processes, Stochastic Process. Appl. 121 (2011), no. 9, 1962–1981.

[24] P. Bougerol, Oscillation des produits de matrices aléatoires dont l'exposant de Lyapounov est nul, Lyapunov exponents (Bremen, 1984), Lecture Notes in Math., vol. 1186, Springer, Berlin, 1986, pp. 27–36.

[25] P. Bougerol and J. Lacroix, Products of random matrices with applications to Schrödinger operators, Progress in Probability and Statistics, vol. 8, Birkhäuser Boston Inc., Boston, MA, 1985.

[26] L. Breiman, Transient atomic Markov chains with a denumerable number of states, Ann. Math. Statist. 29 (1958), 212–218.

[27] B. Carazza, The history of the random-walk problem: considerations on the interdisciplinarity in modern physics, Riv. Nuovo Cimento (2) 7 (1977), no. 3, 419–427.

[28] J. Černý, Moments and distribution of the local time of a two-dimensional random walk, Stochastic Process. Appl. 117 (2007), no. 2, 262–270.

[29] P.-L. Chou and R. Z. Khas′minskii, The method of Lyapunov function for the analysis of the absorption and explosion of Markov chains, Problemy Peredachi Informatsii 47 (2011), no. 3, 19–38.

[30] Y. S. Chow, H. Robbins, and H. Teicher, Moments of randomly stopped sums, Ann. Math. Statist. 36 (1965), 789–799.

[31] Y. S. Chow and H. Teicher, Probability theory: Independence, interchangeability, martingales, third ed., Springer Texts in Statistics, Springer-Verlag, New York, 1997.

[32] Y. S. Chow and C.-H. Zhang, A note on Feller's strong law of large numbers, Ann. Probab. 14 (1986), no. 3, 1088–1094.


[33] F. Chung and L. Lu, Complex graphs and networks, CBMS Regional Conference Series in Mathematics, vol. 107, Published for the Conference Board of the Mathematical Sciences, Washington, DC; by the American Mathematical Society, Providence, RI, 2006.

[34] K. L. Chung, On the maximum partial sums of sequences of independent random variables, Trans. Amer. Math. Soc. 64 (1948), 205–233.

[35] , Markov chains with stationary transition probabilities, second edition, Die Grundlehren der mathematischen Wissenschaften, Band 104, Springer-Verlag New York, Inc., New York, 1967.

[36] , A course in probability theory, third ed., Academic Press Inc., San Diego, CA, 2001.

[37] K. L. Chung and W. H. J. Fuchs, On the distribution of values of sums of random variables, Mem. Amer. Math. Soc. 1951 (1951), no. 6, 12.

[38] P. Clifford and A. Sudbury, A model for spatial conflict, Biometrika 60 (1973), 581–588.

[39] J. W. Cohen, Analysis of random walks, Studies in Probability, Optimization and Statistics, vol. 2, IOS Press, Amsterdam, 1992.

[40] , On the random walk with zero drifts in the first quadrant of R^2, Comm. Statist. Stochastic Models 8 (1992), no. 3, 359–374.

[41] F. Comets, M. V. Menshikov, and S. Yu. Popov, One-dimensional branching random walk in a random environment: a classification, Markov Process. Related Fields 4 (1998), no. 4, 465–477, I Brazilian School in Probability (Rio de Janeiro, 1997).

[42] F. Comets, M. V. Menshikov, S. Volkov, and A. R. Wade, Random walk with barycentric self-interaction, J. Stat. Phys. 143 (2011), no. 5, 855–888.

[43] F. Comets and S. Popov, On multidimensional branching random walks in random environment, Ann. Probab. 35 (2007), no. 1, 68–114.

[44] , Shape and local growth for multidimensional branching random walks in random environment, ALEA Lat. Am. J. Probab. Math. Stat. 3 (2007), 273–299.


[45] F. Comets, S. Popov, G. M. Schütz, and M. Vachkovskaia, Billiards in a general domain with random reflections, Arch. Ration. Mech. Anal. 191 (2009), no. 3, 497–537.

[46] , Knudsen gas in a finite random tube: transport diffusion and first passage properties, J. Stat. Phys. 140 (2010), no. 5, 948–984.

[47] F. Comets, M. Menshikov, and S. Popov, Lyapunov functions for random walks and strings in random environment, Ann. Probab. 26 (1998), no. 4, 1433–1445.

[48] P. Coolen-Schrijner and E. A. van Doorn, Analysis of random walks using orthogonal polynomials, J. Comput. Appl. Math. 99 (1998), no. 1-2, 387–399.

[49] E. Crane, N. Georgiou, S. Volkov, A. R. Wade, and R. J. Waters, The simple harmonic urn, Ann. Probab. 39 (2011), no. 6, 2119–2177.

[50] E. Csáki, On the lower limits of maxima and minima of Wiener process and partial sums, Z. Wahrsch. Verw. Gebiete 43 (1978), no. 3, 205–221.

[51] E. Csáki, M. Csörgő, A. Földes, and P. Révész, On the local time of random walk on the 2-dimensional comb, Stochastic Process. Appl. 121 (2011), no. 6, 1290–1314.

[52] E. Csáki, A. Földes, and P. Révész, Joint asymptotic behavior of local and occupation times of random walk in higher dimension, Studia Sci. Math. Hungar. 44 (2007), no. 4, 535–563.

[53] , Transient nearest neighbor random walk and Bessel process, J. Theoret. Probab. 22 (2009), no. 4, 992–1009.

[54] , Transient nearest neighbor random walk on the line, J. Theoret. Probab. 22 (2009), no. 1, 100–122.

[55] E. Csáki, P. Révész, and J. Rosen, Functional laws of the iterated logarithm for local times of recurrent random walks on Z^2, Ann. Inst. H. Poincaré Probab. Statist. 34 (1998), no. 4, 545–563.

[56] J. De Coninck, F. Dunlop, and T. Huillet, Random walk weakly attracted to a wall, J. Stat. Phys. 133 (2008), no. 2, 271–280.


[57] D. DeBlassie and R. Smits, The influence of a power law drift on the exit time of Brownian motion from a half-line, Stochastic Process. Appl. 117 (2007), no. 5, 629–654.

[58] P. Deheuvels and P. Révész, Simple random walk on the line in random environment, Probab. Theory Relat. Fields 72 (1986), no. 2, 215–230.

[59] F. den Hollander, Random polymers, Lecture Notes in Mathematics, vol. 1974, Springer-Verlag, Berlin, 2009, Lectures from the 37th Probability Summer School held in Saint-Flour, 2007.

[60] F. den Hollander, M. V. Menshikov, and S. Yu. Popov, A note on transience versus recurrence for a branching random walk in random environment, J. Statist. Phys. 95 (1999), no. 3-4, 587–614.

[61] D. Denisov, D. Korshunov, and V. Wachtel, Potential analysis for positive recurrent Markov chains with asymptotically zero drift: power-type asymptotics, Stochastic Process. Appl. 123 (2013), no. 8, 3027–3051.

[62] D. E. Denisov and S. G. Foss, On transience conditions for Markov chains and random walks, Siberian Math. J. 44 (2003), no. 1, 44–57, translated from Sibirsk. Mat. Zh. 44 (2003) 53–68 (in Russian).

[63] C. Derman and H. Robbins, The strong law of large numbers when the first moment does not exist, Proc. Nat. Acad. Sci. U.S.A. 41 (1955), 586–587.

[64] B. Derrida, S. Goldstein, J. L. Lebowitz, and E. R. Speer, Shift equivalence of measures and the intrinsic structure of shocks in the asymmetric simple exclusion process, J. Statist. Phys. 93 (1998), no. 3-4, 547–571.

[65] W. Doeblin, Éléments d'une théorie générale des chaînes simples constantes de Markoff, Ann. Sci. École Norm. Sup. (3) 57 (1940), 61–111.

[66] M. D. Donsker and S. R. S. Varadhan, On the number of distinct sites visited by a random walk, Comm. Pure Appl. Math. 32 (1979), no. 6, 721–747.

[67] J. L. Doob, Stochastic processes, John Wiley & Sons Inc., New York, 1953.


[68] R. Douc, G. Fort, E. Moulines, and P. Soulier, Practical drift conditions for subgeometric rates of convergence, Ann. Appl. Probab. 14 (2004), no. 3, 1353–1377.

[69] D. Y. Downham and S. B. Fotopoulos, The transient behaviour of the simple random walk in the plane, J. Appl. Probab. 25 (1988), no. 1, 58–69.

[70] P. G. Doyle and J. L. Snell, Random walks and electric networks, Carus Mathematical Monographs, vol. 22, Mathematical Association of America, Washington, DC, 1984.

[71] R. Durrett, Ten lectures on particle systems, Lectures on probability theory (Saint-Flour, 1993), Lecture Notes in Math., vol. 1608, Springer, Berlin, 1995, pp. 97–201.

[72] , Probability: theory and examples, fourth ed., Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge, 2010.

[73] A. Dvoretzky and P. Erdős, Some problems on random walk in space, Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, 1950 (Berkeley and Los Angeles), University of California Press, 1951, pp. 353–367.

[74] A. Eibeck and W. Wagner, Stochastic interacting particle systems and nonlinear kinetic equations, Ann. Appl. Probab. 13 (2003), no. 3, 845–889.

[75] A. Einstein, Investigations on the theory of the Brownian movement, Dover Publications Inc., New York, 1956, edited with notes by R. Fürth, translated by A. D. Cowper.

[76] A. Erdélyi, W. Magnus, F. Oberhettinger, and F. G. Tricomi, Tables of integral transforms. Vol. I, McGraw-Hill Book Company, Inc., New York-Toronto-London, 1954, based, in part, on notes left by Harry Bateman.

[77] P. Erdős and S. J. Taylor, Some problems concerning the structure of random walk paths, Acta Math. Acad. Sci. Hungar. 11 (1960), 137–162.

[78] K. B. Erickson, The strong law of large numbers when the mean is undefined, Trans. Amer. Math. Soc. 185 (1973), 371–381 (1974).


[79] S. N. Evans, Stochastic billiards on general tables, Ann. Appl. Probab. 11 (2001), no. 2, 419–437.

[80] A. M. Fal′, Recurrence times for certain Markov random walks, Ukrainian Math. J. 23 (1971), 676–681, translated from Ukrain. Mat. Zh. 23 (1971) 824–830 (in Russian).

[81] , Generalization of the results obtained by R. L. Dobrushin for additive functionals of a Markov random walk, Theory Probab. Appl. 22 (1978), 569–571, translated from Teor. Veroyatnost. i Primenen. 22 (1977) 582–584 (in Russian).

[82] , Certain limit theorems for an elementary Markov random walk, Ukrainian Math. J. 33 (1981), 433–435, translated from Ukrain. Mat. Zh. 33 (1981) 564–566 (in Russian).

[83] G. Fayolle, V. A. Malyshev, and M. V. Men′shikov, Random walks in a quarter plane with zero drifts. I. Ergodicity and null recurrence, Ann. Inst. H. Poincaré Probab. Statist. 28 (1992), no. 2, 179–194.

[84] G. Fayolle, V. A. Malyshev, and M. V. Menshikov, Topics in the constructive theory of countable Markov chains, Cambridge University Press, Cambridge, 1995.

[85] W. Feller, A limit theorem for random variables with infinite moments, Amer. J. Math. 68 (1946), 257–262.

[86] , An introduction to probability theory and its applications. Vol. I, third edition, John Wiley & Sons Inc., New York, 1968.

[87] P. A. Ferrari, Shock fluctuations in asymmetric simple exclusion, Probab. Theory Related Fields 91 (1992), no. 1, 81–101.

[88] , Shocks in one-dimensional processes with drift, Probability and phase transition (Cambridge, 1993), NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., vol. 420, Kluwer Acad. Publ., Dordrecht, 1994, pp. 35–48.

[89] P. A. Ferrari, C. Kipnis, and E. Saada, Microscopic structure of travelling waves in the asymmetric simple exclusion process, Ann. Probab. 19 (1991), no. 1, 226–244.

[90] Yu. P. Filonov, Criterion for ergodicity of homogeneous discrete Markov chains, Ukrain. Math. J. 41 (1989), 1223–1225, translated from Ukrain. Mat. Zh. 41 (1989) 1421–1422.


[91] L. Flatto, A problem on random walk, Quart. J. Math. Oxford Ser. (2) 9 (1958), 299–300.

[92] S. Foss, D. Korshunov, and S. Zachary, An introduction to heavy-tailed and subexponential distributions, second ed., Springer Series in Operations Research and Financial Engineering, Springer, New York, 2013.

[93] S. G. Foss and D. E. Denisov, On transience conditions for Markov chains, Siberian Math. J. 42 (2001), no. 2, 364–371, translated from Sibirsk. Mat. Zh. 42 (2001) 425–433 (in Russian).

[94] F. G. Foster, Markoff chains with an enumerable number of states and a class of cascade processes, Math. Proc. Cambridge Philos. Soc. 47 (1951), 77–85.

[95] , On the stochastic matrices associated with certain queuing processes, Ann. Math. Statistics 24 (1953), 355–360.

[96] B. Franke, The scaling limit behaviour of periodic stable-like processes, Bernoulli 12 (2006), no. 3, 551–570.

[97] A. S. Gaĭrat, V. A. Malyshev, M. V. Men′shikov, and K. D. Pelikh, Classification of Markov chains that describe the evolution of random strings, Uspekhi Mat. Nauk 50 (1995), no. 2(302), 5–24.

[98] A. S. Gajrat, V. A. Malyshev, and A. A. Zamyatin, Two-sided evolution of a random chain, Markov Process. Related Fields 1 (1995), no. 2, 281–316.

[99] L. Gallardo, Comportement asymptotique des marches aléatoires associées aux polynômes de Gegenbauer et applications, Adv. in Appl. Probab. 16 (1984), no. 2, 293–323.

[100] C. Gallesco, S. Popov, and G. M. Schütz, Localization for a random walk in slowly decreasing random potential, J. Stat. Phys. 150 (2013), no. 2, 285–298.

[101] N. Georgiou, M. V. Menshikov, A. Mijatović, and A. R. Wade, Anomalous recurrence properties of many-dimensional zero-drift random walks, Adv. in Appl. Probab. (2016), to appear.

[102] N. Georgiou and A. R. Wade, Non-homogeneous random walks on a semi-infinite strip, Stochastic Process. Appl. 124 (2014), no. 10, 3179–3205.


[103] G. Giacomin, Random polymer models, Imperial College Press, London, 2007.

[104] J. Gillis, Centrally biased discrete random walk, Quart. J. Math. Oxford Ser. (2) 7 (1956), 144–152.

[105] M. L. Glasser and I. J. Zucker, Extended Watson integrals for the cubic lattices, Proc. Nat. Acad. Sci. U.S.A. 74 (1977), no. 5, 1800–1801.

[106] A. O. Golosov, Limit distributions for random walks in a random environment, Dokl. Akad. Nauk SSSR 271 (1983), no. 1, 25–29.

[107] P. S. Griffin, An integral test for the rate of escape of d-dimensional random walk, Ann. Probab. 11 (1983), no. 4, 953–961.

[108] R. F. Gundy and D. Siegmund, On a stopping rule and the central limit theorem, Ann. Math. Statist. 38 (1967), 1915–1917.

[109] A. Gut, Probability: A graduate course, Springer, 2005.

[110] Y. Hamana, An almost sure invariance principle for the range of random walks, Stochastic Process. Appl. 78 (1998), no. 2, 131–143.

[111] K. Hamza and F. C. Klebaner, Conditions for integrability of Markov chains, J. Appl. Probab. 32 (1995), no. 2, 541–547.

[112] C. M. Harris and P. G. Marlin, A note on feedback queues with bulk service, J. Assoc. Comput. Mach. 19 (1972), 727–733.

[113] T. E. Harris, First passage and recurrence distributions, Trans. Amer. Math. Soc. 73 (1952), 471–486.

[114] W. M. Hirsch, A strong law for the maximum cumulative sum of independent random variables, Comm. Pure Appl. Math. 18 (1965), 109–127.

[115] J. L. Hodges, Jr. and M. Rosenblatt, Recurrence-time moments in random walks, Pacific J. Math. 3 (1953), 127–136.

[116] W. Hoeffding, Probability inequalities for sums of bounded random variables, J. Amer. Statist. Assoc. 58 (1963), 13–30.

[117] P. Holgate, Random walk models for animal behavior, Statistical Ecology: vol. 2 (G. Patil, E. Pielou, and W. Walters, eds.), Penn State University Press, University Park, PA, 1971, pp. 1–12.


[118] R. A. Holley and T. M. Liggett, Ergodic theorems for weakly interacting infinite systems and the voter model, Ann. Probability 3 (1975), no. 4, 643–663.

[119] O. Hryniv, I. M. MacPhee, M. V. Menshikov, and A. R. Wade, Non-homogeneous random walks with non-integrable increments and heavy-tailed random walks on strips, Electron. J. Probab. 17 (2012), article 59.

[120] O. Hryniv and M. Menshikov, Long-time behaviour in a model of microtubule growth, Adv. in Appl. Probab. 42 (2010), no. 1, 268–291.

[121] O. Hryniv, M. V. Menshikov, and A. R. Wade, Excursions and path functionals for stochastic processes with asymptotically zero drifts, Stochastic Process. Appl. 123 (2013), no. 6, 1891–1921.

[122] , Random walk in mixed random environment without uniform ellipticity, Proc. Steklov Inst. Math. 282 (2013), 106–123.

[123] Y. Hu and H. Nyrhinen, Large deviations view points for heavy-tailed random walks, J. Theoret. Probab. 17 (2004), no. 3, 761–768.

[124] Y. Hu and Z. Shi, The limits of Sinai's simple random walk in random environment, Ann. Probab. 26 (1998), no. 4, 1477–1521.

[125] B. D. Hughes, Random walks and random environments. Vol. 1, Oxford Science Publications, The Clarendon Press, Oxford University Press, New York, 1995, Random walks.

[126] T. Huillet, Random walk with long-range interaction with a barrier and its dual: exact results, J. Comput. Appl. Math. 233 (2010), no. 10, 2449–2467.

[127] K. Itô (ed.), Encyclopedic dictionary of mathematics. Vol. I–IV, second ed., MIT Press, Cambridge, MA, 1987, translated from the Japanese.

[128] N. C. Jain and W. E. Pruitt, Maxima of partial sums of independent random variables, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 27 (1973), 141–151.

[129] S. F. Jarner and G. O. Roberts, Polynomial convergence rates of Markov chains, Ann. Appl. Probab. 12 (2002), no. 1, 224–247.


[130] R. C. Jones, On the theory of fluctuations in the decay of sound, J. Acoust. Soc. Amer. 11 (1940), 324–332.

[131] P. Jung, The noisy voter-exclusion process, Stochastic Process. Appl. 115 (2005), no. 12, 1979–2005.

[132] V. V. Kalashnikov, Practical stability of difference equations, Automation and Remote Control 9 (1967), 1404–1407, translated from Avtomatika i Telemekhanika 9 (1967) 172–175 (in Russian).

[133] , The use of Lyapunov's method in the solution of queueing theory problems, Engrg. Cybernetics (1968), no. 5, 77–84, translated from Izv. Akad. Nauk SSSR Tehn. Kibernet. 1968 89–95 (in Russian).

[134] , Reliability analysis by Lyapunov's method, Engrg. Cybernetics (1970), no. 2, 257–269, translated from Izv. Akad. Nauk SSSR Tehn. Kibernet. 1970 65–76 (in Russian).

[135] , Analysis of ergodicity of queueing systems by means of the direct method of Lyapunov, Automation and Remote Control 32 (1971), 559–566, translated from Avtomatika i Telemekhanika 32 (1971) 46–54 (in Russian).

[136] , The property of γ-reflexivity for Markov sequences, Soviet Math. Dokl. 14 (1973), 1869–1873, translated from Dokl. Akad. Nauk SSSR 213 (1973) 1243–1246 (in Russian).

[137] O. Kallenberg, Foundations of modern probability, second ed., Probability and its Applications (New York), Springer-Verlag, New York, 2002.

[138] M. Kaplan, A sufficient condition for nonergodicity of a Markov chain, IEEE Trans. Inform. Theory 25 (1979), no. 4, 470–471.

[139] S. Karlin and J. McGregor, Representation of a class of stochastic processes, Proc. Nat. Acad. Sci. U.S.A. 41 (1955), 387–391.

[140] , Random walks, Illinois J. Math. 3 (1959), 66–81.

[141] K. Kawazu, Y. Tamura, and H. Tanaka, Limit theorems for one-dimensional diffusions and random walks in random environments, Probab. Theory Related Fields 80 (1989), no. 4, 501–541.


[142] G. Keller, G. Kersting, and U. Rösler, On the asymptotic behaviour of discrete time stochastic growth processes, Ann. Probab. 15 (1987), no. 1, 305–343.

[143] F. P. Kelly, Markovian functions of a Markov chain, Sankhyā Ser. A 44 (1982), no. 3, 372–379.

[144] J. H. B. Kemperman, The oscillating random walk, Stochastic Processes Appl. 2 (1974), 1–29.

[145] D. G. Kendall, Some problems in the theory of queues, J. Roy. Statist. Soc. Ser. B. 13 (1951), 151–173; discussion: 173–185.

[146] G. Kersting, On recurrence and transience of growth models, J. Appl. Probab. 23 (1986), no. 3, 614–625.

[147] , Asymptotic Γ-distribution for stochastic difference equations, Stochastic Process. Appl. 40 (1992), no. 1, 15–28.

[148] G. Kersting and F. C. Klebaner, Sharp conditions for nonexplosions and explosions in Markov jump processes, Ann. Probab. 23 (1995), no. 1, 268–272.

[149] H. Kesten, The limit points of a normalized random walk, Ann. Math. Statist. 41 (1970), 1173–1205.

[150] , The limit distribution of Sinai's random walk in random environment, Phys. A 138 (1986), no. 1-2, 299–309.

[151] H. Kesten and R. A. Maller, Two renewal theorems for general random walks tending to infinity, Probab. Theory Related Fields 106 (1996), no. 1, 1–38.

[152] , Random walks crossing power law boundaries, Studia Sci. Math. Hungar. 34 (1998), no. 1-3, 219–252.

[153] R. Z. Khas'minskii, On the stability of the trajectory of Markov processes, J. Appl. Math. Mech. 26 (1962), 1554–1565.

[154] B. Kim and I. Lee, Tests for nonergodicity of denumerable continuous time Markov processes, Comput. Math. Appl. 55 (2008), no. 6, 1310–1321.

[155] J. F. C. Kingman, The ergodic behaviour of random walks, Biometrika 48 (1961), 391–396.


[156] , Random walks with spherical symmetry, Acta Math. 109 (1963), 11–53.

[157] F. C. Klebaner, Stochastic difference equations and generalized gamma distributions, Ann. Probab. 17 (1989), no. 1, 178–188.

[158] , Introduction to stochastic calculus with applications, second ed., Imperial College Press, London, 2005.

[159] L. A. Klein Haneveld, Random walk on the quadrant, Stochastic Process. Appl. 26 (1987), 228.

[160] , Random walk in the quadrant, Ph.D. thesis, University of Amsterdam, 1996.

[161] L. A. Klein Haneveld and A. O. Pittenger, Escape time for a random walk from an orthant, Stochastic Process. Appl. 35 (1990), no. 1, 1–9.

[162] H. Kleinert, Path integrals in quantum mechanics, statistics, polymer physics, and financial markets, fourth ed., World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2006.

[163] M. Knudsen, Kinetic theory of gases: some modern aspects, Methuen's Monographs on Physical Subjects, Methuen, London, 1952.

[164] D. A. Korshunov, Moments of stationary Markov chains with asymptotically zero drift, Sibirsk. Mat. Zh. 52 (2011), no. 4, 829–840.

[165] Y. Kovchegov and N. Michalowski, A class of Markov chains with no spectral gap, Proc. Amer. Math. Soc. 141 (2013), no. 12, 4317–4326.

[166] M. V. Kozlov, Random walk in a one-dimensional random medium, Teor. Verojatnost. i Primenen. 18 (1973), 406–408.

[167] G. Kozma, T. Orenshtein, and I. Shinkar, Excited random walk with periodic cookies, preprint.

[168] V. M. Kruglov, A strong law of large numbers for pairwise independent identically distributed random variables with infinite means, Statist. Probab. Lett. 78 (2008), no. 7, 890–895.

[169] I. Kurkova and K. Raschel, Passage time from four to two blocks of opinions in the voter model and walks in the quarter plane, Queueing Syst. 74 (2013), no. 2-3, 219–234.

[170] H. J. Kushner, On the stability of stochastic dynamical systems, Proc. Nat. Acad. Sci. U.S.A. 53 (1965), 8–12.

[171] , Finite time stochastic stability and the analysis of tracking systems, IEEE Trans. Automatic Control AC-11 (1966), 219–227.

[172] P. Kuster, Asymptotic growth of controlled Galton-Watson processes, Ann. Probab. 13 (1985), no. 4, 1157–1178.

[173] J. Lamperti, Criteria for the recurrence or transience of stochastic process. I., J. Math. Anal. Appl. 1 (1960), 314–330.

[174] , A new class of probability limit theorems, J. Math. Mech. 11 (1962), 749–772.

[175] , Criteria for stochastic processes. II. Passage-time moments, J. Math. Anal. Appl. 7 (1963), 127–145.

[176] , Maximal branching processes and 'long-range percolation', J. Appl. Probability 7 (1970), 89–98.

[177] , Remarks on maximal branching processes, Teor. Verojatnost. i Primenen. 17 (1972), 46–54.

[178] G. F. Lawler and V. Limic, Random walk: a modern introduction, Cambridge Studies in Advanced Mathematics, vol. 123, Cambridge University Press, Cambridge, 2010.

[179] F. W. Leysieffer, Functions of finite Markov chains, Ann. Math. Statist. 38 (1967), 206–212.

[180] T. M. Liggett, Coupling the simple exclusion process, Ann. Probability 4 (1976), no. 3, 339–356.

[181] , Interacting particle systems, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 276, Springer-Verlag, New York, 1985.

[182] , Stochastic interacting systems: contact, voter and exclusion processes, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 324, Springer-Verlag, Berlin, 1999.

[183] M. Loève, Probability theory. I, fourth ed., Springer-Verlag, New York, 1977, Graduate Texts in Mathematics, Vol. 45.

[184] , Probability theory. II, fourth ed., Springer-Verlag, New York, 1978, Graduate Texts in Mathematics, Vol. 46.

[185] G. G. Lowry (ed.), Markov chains and Monte Carlo calculations in polymer science, Marcel Dekker, New York, 1970.

[186] F. P. Machado, M. V. Menshikov, and S. Yu. Popov, Recurrence and transience of multitype branching random walks, Stochastic Process. Appl. 91 (2001), no. 1, 21–37.

[187] F. P. Machado and S. Yu. Popov, One-dimensional branching random walks in a Markovian random environment, J. Appl. Probab. 37 (2000), no. 4, 1157–1163.

[188] , Branching random walk in random environment on trees, Stochastic Process. Appl. 106 (2003), no. 1, 95–106.

[189] I. MacPhee, M. Menshikov, D. Petritis, and S. Popov, A Markov chain model of a polling system with parameter regeneration, Ann. Appl. Probab. 17 (2007), no. 5-6, 1447–1473.

[190] I. MacPhee, M. V. Menshikov, and M. Vachkovskaia, Dynamics of the non-homogeneous supermarket model, Stoch. Models 28 (2012), no. 4, 533–556.

[191] I. M. MacPhee and M. V. Menshikov, Critical random walks on two-dimensional complexes with applications to polling systems, Ann. Appl. Probab. 13 (2003), no. 4, 1399–1422.

[192] I. M. MacPhee, M. V. Menshikov, S. Popov, and S. Volkov, Periodicity in the transient regime of exhaustive polling systems, Ann. Appl. Probab. 16 (2006), no. 4, 1816–1850.

[193] I. M. MacPhee, M. V. Menshikov, S. Volkov, and A. R. Wade, Passage-time moments and hybrid zones for the exclusion-voter model, Bernoulli 16 (2010), no. 4, 1312–1342.

[194] I. M. MacPhee, M. V. Menshikov, and A. R. Wade, Angular asymptotics for multi-dimensional non-homogeneous random walks with asymptotically zero drift, Markov Process. Related Fields 16 (2010), no. 2, 351–388.

[195] , Moments of exit times from wedges for non-homogeneous random walks with asymptotically zero drifts, J. Theoret. Probab. 26 (2013), 1–30.

[196] V. A. Malysev and M. V. Men′sikov, Ergodicity, continuity, and analyticity of countable Markov chains, Trans. Moscow Math. Soc. 39 (1979), 1–48, translated from Trudy Moskov. Mat. Obshch. 39 (1979) 3–48 (in Russian).

[197] V. A. Malyshev, Classification of two-dimensional positive random walks and almost linear semimartingales, Soviet Math. Dokl. 13 (1972), 136–139, translated from Dokl. Akad. Nauk SSSR 202 (1972) 526–528 (in Russian).

[198] , Random grammars, Uspekhi Mat. Nauk 53 (1998), no. 2 (320), 107–134.

[199] X. Mao, Stochastic differential equations and applications, second ed., Horwood Publishing Limited, Chichester, 2008.

[200] M. B. Marcus and J. Rosen, Logarithmic averages for the local times of recurrent random walks and Lévy processes, Stochastic Process. Appl. 59 (1995), no. 2, 175–184.

[201] J. G. Mauldon, On non-dissipative Markov chains, Proc. Cambridge Philos. Soc. 53 (1957), 825–835.

[202] W. H. McCrea and F. J. W. Whipple, Random paths in two and three dimensions, Proc. Roy. Soc. Edinburgh 60 (1940), 281–298.

[203] M. Menshikov and D. Petritis, Markov chains in a wedge with excitable boundaries, Analytic methods in applied probability, Amer. Math. Soc. Transl. Ser. 2, vol. 207, Amer. Math. Soc., Providence, RI, 2002, pp. 141–164.

[204] M. Menshikov, S. Popov, A. Ramírez, and M. Vachkovskaia, On a general many-dimensional excited random walk, Ann. Probab. 40 (2012), no. 5, 2106–2130.

[205] M. Menshikov and S. Zuyev, Polling systems in the critical regime, Stochastic Process. Appl. 92 (2001), no. 2, 201–218.

[206] M. V. Menshikov, Ergodicity and transience conditions for random walks in the positive octant of space, Soviet Math. Dokl. 15 (1974), 1118–1121, translated from Dokl. Akad. Nauk SSSR 217 (1974) 755–758 (in Russian).

[207] , Martingale approach for Markov processes in random environment and branching Markov chains, Resenhas 3 (1997), no. 2, 159–171.

[208] M. V. Menshikov, I. M. Asymont, and R. Iasnogorodskii, Markov processes with asymptotically zero drifts, Probl. Inf. Transm. 31 (1995), 248–261, translated from Problemy Peredachi Informatsii 31 (1995) 60–75 (in Russian).

[209] M. V. Menshikov and S. Yu. Popov, Exact power estimates for countable Markov chains, Markov Process. Related Fields 1 (1995), no. 1, 57–78.

[210] M. V. Menshikov, S. Yu. Popov, V. Sisko, and M. Vachkovskaia, On a many-dimensional random walk in a rarefied random environment, Markov Process. Related Fields 10 (2004), no. 1, 137–160.

[211] M. V. Menshikov, M. Vachkovskaia, and A. R. Wade, Asymptotic behaviour of randomly reflecting billiards in unbounded tubular domains, J. Stat. Phys. 132 (2008), no. 6, 1097–1133.

[212] M. V. Menshikov and S. E. Volkov, Branching Markov chains: qualitative characteristics, Markov Process. Related Fields 3 (1997), no. 2, 225–241.

[213] M. V. Menshikov and A. R. Wade, Random walk in random environment with asymptotically zero perturbation, J. Eur. Math. Soc. (JEMS) 8 (2006), no. 3, 491–513.

[214] , Logarithmic speeds for one-dimensional perturbed random walks in random environments, Stochastic Process. Appl. 118 (2008), no. 3, 389–416.

[215] , Rate of escape and central limit theorem for the supercritical Lamperti problem, Stochastic Process. Appl. 120 (2010), no. 10, 2078–2099.

[216] J.-F. Mertens, E. Samuel-Cahn, and S. Zamir, Necessary and sufficient conditions for recurrence and transience of Markov chains, in terms of inequalities, J. Appl. Probab. 15 (1978), no. 4, 848–851.

[217] S. Meyn and R. L. Tweedie, Markov chains and stochastic stability, second ed., Cambridge University Press, Cambridge, 2009, With a prologue by Peter W. Glynn.

[218] R. G. Miller, Jr., Foster's Markov chain theorems in continuous time, Tech. Report 88, Applied Mathematics and Statistics Laboratory, Stanford University, 1963.

[219] P. A. P. Moran, An introduction to probability theory, Oxford Science Publications, The Clarendon Press, Oxford University Press, New York, 1984, Corrected reprint of the 1968 original.

[220] M. D. Moustafa, Input-output Markov processes, Proc. Konin. Neder. Akad. Wetens. A. Math. Sci. 19 (1957), 112–118.

[221] J. R. Norris, Markov chains, Cambridge Series in Statistical and Probabilistic Mathematics, vol. 2, Cambridge University Press, Cambridge, 1998.

[222] R. Nossal, Stochastic aspects of biological locomotion, J. Stat. Phys. 30 (1983), 391–400.

[223] R. J. Nossal and G. H. Weiss, A generalized Pearson random walk allowing for bias, J. Stat. Phys. 10 (1974), 245–253.

[224] E. Nummelin, General irreducible Markov chains and nonnegative operators, Cambridge Tracts in Mathematics, vol. 83, Cambridge University Press, Cambridge, 1984.

[225] S. Orey, Lecture notes on limit theorems for Markov chain transition probabilities, Van Nostrand Reinhold Co., London-New York-Toronto, Ont., 1971, Van Nostrand Reinhold Mathematical Studies, No. 34.

[226] V. I. Oseledec, A multiplicative ergodic theorem. Characteristic Ljapunov exponents of dynamical systems, Trudy Moskov. Mat. Obsc. 19 (1968), 179–210.

[227] A. G. Pakes, Some conditions for ergodicity and recurrence of Markov chains, Operations Res. 17 (1969), 1058–1061.

[228] , Some remarks on a one-dimensional skip-free process with repulsion, J. Austral. Math. Soc. Ser. A 30 (1980/81), no. 1, 107–128.

[229] E. Parzen, Stochastic processes, Classics in Applied Mathematics, vol. 24, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1999, Reprint of the 1962 original.

[230] K. Pearson, The problem of the random walk, Nature 72 (1905), 342.

[231] K. Pearson and J. Blakeman, A mathematical theory of random migration, Drapers' Company Research Memoirs Biometric Series, Dulau and Co., London, 1906.

[232] R. Pemantle, A survey of random processes with reinforcement, Probab. Surv. 4 (2007), 1–79.

[233] Y. Peres, S. Popov, and P. Sousi, On recurrence and transience of self-interacting random walks, Bull. Braz. Math. Soc. 44 (2013), no. 4, 1–27.

[234] R. G. Pinsky, Positive harmonic functions and diffusion, Cambridge Studies in Advanced Mathematics, vol. 45, Cambridge University Press, Cambridge, 1995.

[235] G. Pólya, Über eine Aufgabe der Wahrscheinlichkeitsrechnung betreffend die Irrfahrt im Straßennetz, Math. Ann. 84 (1921), no. 1-2, 149–160.

[236] N. N. Popov, The rate of convergence for countable Markov chains, Theor. Probability Appl. 24 (1979), no. 2, 401–405, translated from Teor. Verojatnost. i Primenen. 24 (1979) 395–399 (in Russian).

[237] W. E. Pruitt, The rate of escape of random walk, Ann. Probab. 18 (1990), no. 4, 1417–1461.

[238] C. Rau, Sur le nombre de points visités par une marche aléatoire sur un amas infini de percolation, Bull. Soc. Math. France 135 (2007), no. 1, 135–169.

[239] Lord Rayleigh, On the resultant of a large number of vibrations of the same pitch and of arbitrary phase, Phil. Mag. Ser. 5 10 (1880), 73–78.

[240] S. I. Resnick, A probability path, Birkhäuser Boston Inc., Boston, MA, 1999.

[241] G. E. H. Reuter, Competition processes, Proc. 4th Berkeley Sympos. Math. Statist. and Prob., Vol. II, Univ. California Press, Berkeley, Calif., 1961, pp. 421–430.

[242] Pál Révész, Random walk in random and non-random environments, third ed., World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2013.

[243] D. Revuz, Markov chains, North-Holland Publishing Co., Amsterdam-Oxford; American Elsevier Publishing Co., Inc., New York, 1975, North-Holland Mathematical Library, Vol. 11.

[244] , Markov chains, second ed., North-Holland Mathematical Library, vol. 11, North-Holland Publishing Co., Amsterdam, 1984.

[245] P. Robert, Stochastic networks and queues, Applications of Mathematics, vol. 52, Springer-Verlag, Berlin, 2003.

[246] L. C. G. Rogers and J. W. Pitman, Markov functions, Ann. Probab. 9 (1981), no. 4, 573–582.

[247] B. A. Rogozin and S. G. Foss, The recurrence of an oscillating random walk, Theor. Probability Appl. 23 (1978), no. 1, 155–162, translated from Teor. Verojatnost. i Primenen. 23 (1978) 161–169 (in Russian).

[248] M. Rosenblatt, Functions of a Markov process that are Markovian, J. Math. Mech. 8 (1959), 585–596.

[249] W. A. Rosenkrantz, A local limit theorem for a certain class of random walks, Ann. Math. Statist. 37 (1966), 855–859.

[250] N. Sandric, Recurrence and transience property of some Markov chains, Ph.D. thesis, University of Zagreb, 2012.

[251] N. Sandric, Long-time behavior of stable-like processes, Stochastic Process. Appl. 123 (2013), no. 4, 1276–1300.

[252] , Recurrence and transience property for a class of Markov chains, Bernoulli 19 (2013), no. 5B, 2167–2199.

[253] , Recurrence and transience criteria for two cases of stable-like Markov chains, J. Theoret. Probab. 27 (2014), 754–788.

[254] K.-I. Sato, Lévy processes and infinitely divisible distributions, Cambridge Studies in Advanced Mathematics, vol. 68, Cambridge University Press, Cambridge, 1999, translated from the 1990 Japanese original, Revised by the author.

[255] R. L. Schilling and J. Wang, Some theorems on Feller processes: transience, local times and ultracontractivity, Trans. Amer. Math. Soc. 365 (2013), no. 6, 3255–3286.

[256] S. Schumacher, Diffusions with random coefficients, Ph.D. thesis, University of California, Los Angeles, 1984.

[257] , Diffusions with random coefficients, Particle systems, random media and large deviations (Brunswick, Maine, 1984), Contemp. Math., vol. 41, Amer. Math. Soc., Providence, RI, 1985, pp. 351–356.

[258] L. I. Sennott, P. A. Humblet, and R. L. Tweedie, Mean drifts and the nonergodicity of Markov chains, Oper. Res. 31 (1983), no. 4, 783–789.

[259] V. Shcherbakov and S. Volkov, Stability of a growth process generated by monomer filling with nearest-neighbour cooperative effects, Stochastic Process. Appl. 120 (2010), no. 6, 926–948.

[260] L. A. Shepp, Symmetric random walk, Trans. Amer. Math. Soc. 104 (1962), 144–153.

[261] , Recurrent random walks with arbitrarily large steps, Bull. Amer. Math. Soc. 70 (1964), 540–542.

[262] T. Shiga, A. Shimizu, and T. Soshi, Passage-time moments for positively recurrent Markov chains, Nagoya Math. J. 162 (2001), 169–185.

[263] A. N. Shiryaev, Probability, second ed., Graduate Texts in Mathematics, vol. 95, Springer-Verlag, New York, 1996, translated from the first (1980) Russian edition by R. P. Boas.

[264] M. F. Shlesinger and B. J. West (eds.), Random walks and their applications in the physical and biological sciences, American Institute of Physics, New York, 1984.

[265] Ya. G. Sinaĭ, The limiting behavior of a one-dimensional random walk in a random medium, Theor. Probability Appl. 27 (1983), 256–268, translated from Teor. Veroyatnost. i Primenen. 27 (1982) 247–258 (in Russian).

[266] A. Singh, A slow transient diffusion in a drifted stable potential, J. Theoret. Probab. 20 (2007), no. 2, 153–166.

[267] P. E. Smouse, S. Focardi, P. R. Moorcroft, J. G. Kie, J. D. Forester, and J. M. Morales, Stochastic modelling of animal movement, Phil. Trans. Roy. Soc. Ser. B Biol. Sci. 365 (2010), 2201–2211.

[268] F. Solomon, Random walks in a random environment, Ann. Probability 3 (1975), 1–31.

[269] F. Spitzer, Renewal theorems for Markov chains, Proc. 5th Berkeley Sympos. Math. Statist. and Prob., Vol. II, Univ. California Press, Berkeley, Calif., 1967, pp. 311–320.

[270] , Principles of random walk, second ed., Springer-Verlag, New York, 1976, Graduate Texts in Mathematics, Vol. 34.

[271] W. F. Stout, Almost sure convergence, Academic Press, 1974, Probability and Mathematical Statistics, Vol. 24.

[272] D. W. Stroock, An introduction to Markov processes, Graduate Texts in Mathematics, vol. 230, Springer-Verlag, Berlin, 2005.

[273] A. Sturm and J. M. Swart, Tightness of voter model interfaces, Electron. Commun. Probab. 13 (2008), 165–174.

[274] R. Syski, Exit time from a positive quadrant for a two-dimensional Markov chain, Comm. Statist. Stochastic Models 8 (1992), no. 3, 375–395.

[275] G. J. Székely, On the asymptotic properties of diffusion processes, Ann. Univ. Sci. Budapest. Eötvös Sect. Math. 17 (1974), 69–71 (1975).

[276] A.-S. Sznitman, Topics in random walks in random environment, School and Conference on Probability Theory, ICTP Lect. Notes, XVII, Abdus Salam Int. Cent. Theoret. Phys., Trieste, 2004, pp. 203–266 (electronic).

[277] R. L. Tweedie, Sufficient conditions for ergodicity and recurrence of Markov chains on a general state space, Stochastic Processes Appl. 3 (1975), no. 4, 385–403.

[278] , Sufficient conditions for regularity, recurrence and ergodicity of Markov processes, Math. Proc. Cambridge Philos. Soc. 78 (1975), 125–136.

[279] , Criteria for classifying general Markov chains, Advances in Appl. Probability 8 (1976), no. 4, 737–771.

[280] Y. Velenik, Localization and delocalization of random interfaces, Probab. Surv. 3 (2006), 112–169.

[281] M. Voit, Central limit theorems for random walks on N0 that are associated with orthogonal polynomials, J. Multivariate Anal. 34 (1990), no. 2, 290–322.

[282] , Strong laws of large numbers for random walks associated with a class of one-dimensional convolution structures, Monatsh. Math. 113 (1992), no. 1, 59–74.

[283] W. Wagner, Explosion phenomena in stochastic coagulation-fragmentation models, Ann. Appl. Probab. 15 (2005), no. 3, 2081–2112.

[284] G. H. Weiss and R. J. Rubin, Random walks: Theory and selected applications, Adv. Chem. Phys. 52 (1983), 363–505.

[285] W. M. Wonham, Liapunov criteria for weak stochastic stability, J. Differential Equations 2 (1966), 195–207.

[286] , A Liapunov method for the estimation of statistical averages, J. Differential Equations 2 (1966), 365–377.

[287] S. Zachary, On two-dimensional Markov chains in the positive quadrant with partial spatial homogeneity, Markov Process. Related Fields 1 (1995), no. 2, 267–280.

[288] O. Zeitouni, Random walks in random environment, Lectures on probability theory and statistics, Lecture Notes in Math., vol. 1837, Springer, Berlin, 2004, pp. 189–312.

[289] , Random walks in random environments, J. Phys. A 39 (2006), no. 40, R433–R464.

Index

adapted, 22
almost-sure bounds, 80, 82, 98, 227, 244, 281, 283, 287
angular asymptotics, xiii, 13, 15, 217, 222
annealed, 329
anomalous recurrence, xiii, 11, 18, 179
asymptotically zero drift, 14, 81, 94, 179, 182, 215, 226, 287
Azuma–Hoeffding inequalities, 48, 87, 289, 293

Bessel process, 176, 178
Borel sets, 21
Borel–Cantelli lemma, 38
branching process, 113, 117, 173, 334
branching random walk, 334

central limit theorem, 160, 177
Chung–Fuchs theorem, 10, 18, 172, 179, 184
convexity, 33

diffusion, 92, 176
diffusive, 141, 145, 175
Doob decomposition, 33
Doob's inequality, 36
drift condition, x
Dvoretzky–Erdős theorem, 145
Dynkin's formula, 36, 44, 64, 87

ergodic theorem, 86, 125, 173, 298
exclusion process, 310, 319, 331
exclusion-voter process, 310, 326, 332
excursion, 123, 127
excursion maximum, 40, 140

filtration, 22
Foster's criterion, 64

gambler's ruin, 1, 40
Gamma distribution, 153
growth model, 334

Heaviside configuration, 310
heavy tails, xiii, 4, 15, 239
hitting time, 23, 24
    defective, 54
    existence of moments, 71, 98, 130, 227
    integrable, 63
    non-existence of moments, 4, 75, 98, 130
    non-integrable, 69, 130

increment covariance matrix, 11, 66, 112, 131, 183–185, 193
increment moment function, 6, 8, 93, 99, 100, 189, 304
inequality
    Azuma–Hoeffding inequalities, 48, 87, 289, 293
    maximal inequality, 42, 43, 45, 46, 86
        for submartingales, 36
        for supermartingales, 42
    Kolmogorov's inequalities, 37, 46, 87, 185, 233
    Kushner–Kalashnikov inequality, 45
invariance principle, 176
irreducibility, 24, 120

Kolmogorov's inequalities, 37, 46, 87, 185, 233

Lamperti problem, 7, 93, 188, 304
    supercritical, 158, 218
Lamperti's problem, xii
last exit time, 146, 175, 239, 250
law of large numbers, 10, 16, 19, 127–129, 159, 177, 217, 218
law of the iterated logarithm, 37, 285, 288
limiting direction, 13, 15, 217
local escape, 102, 121, 145, 150
local time, see occupation time
locally finite, 26, 121, 124
Lyapunov exponent, 296
Lyapunov function, ix, 4
Lyapunov function criterion
    for positive recurrence, 64
    for recurrence, 51
    for transience, 54

Marcinkiewicz–Zygmund strong law, 129
Markov chain
    aperiodic, 24
    countable, 24
    criterion for positive recurrence, 64
    criterion for recurrence, 51
    criterion for transience, 54
    discrete time, 23
    irreducible, 24
    null recurrence, 25, 70, 97
    period, 24
    positive recurrence, 25, 64, 69, 97
    recurrence, 24, 50, 51
    stationary measure, 25, 69
    time-homogeneous, 24
    transience, 24, 50, 54, 61, 96
    transition function, 24
    transition probabilities, 24
Markov property, 18, 23
    strong, 28
martingale, 22
    convergence theorem
        for submartingales, 33
        for supermartingales, 34
    difference, 100, 104
    optional stopping theorem, see optional stopping theorem
    orthogonality of increments, 33
    semimartingale, 23
    submartingale, 22
    supermartingale, 22
maximal inequality, see inequality
mean drift, 7, 8, 183, 184
moments Markov property, 104

natural filtration, 22
non-confinement, 101, 121, 122, 172, 184, 193, 233, 240, 276, 303
non-degeneracy, 46, 52, 184, 185, 193
non-dissipative, 89
normal distribution, 153, 160
nullity, 4, 70, 125, 152

occupation time, 68, 125, 229, 236
open problem, 234, 236, 237, 331, 333
optional stopping theorem
    for submartingales, 34
    for supermartingales, 35
Oseledets multiplicative ergodic theorem, 298

Pólya's theorem, 3, 111, 184
partial sums, 3, 9
passage time, see hitting time
polling system, 334
polynomial ergodicity, 174, 175
power set, 21, 24
previsible, 33

quadratic variation, 38
quenched, 279
queue, 333

random environment, 279
random walk, ix, 1, 8
    branching, 334
    centrally biased, 14, 179, 182, 216, 226, 235
    centrally biased, discrete, 195, 234
    controlled, 18, 181
    elliptic, 18, 181, 197
    in random environment, 279, 329
    multidimensional, 81, 94, 102, 111, 131, 145, 179, 183
    one dimensional, 25, 29, 84, 97, 98, 146
    oscillating, 88, 239
    Pearson–Rayleigh, 9, 180
    simple, 2, 9, 25, 29, 84, 94, 100, 111, 145
range, 182, 230, 236
rate of escape, 10, 145, 158, 217, 227, 239, 244, 287
reaction-diffusion process, 333
recurrence, 3, 7, 10, 25, 51, 110, 123, 185, 193, 198, 204, 216, 227, 280, 287, 297, 304, 319, 322, 326, 327
regeneration, 123
renewal function, 149, 176
reversible, 330

semimartingale, x
Sinai's regime, 281, 286, 329
Solomon's theorem, 280
spatially homogeneous, 9, 13, 179
square-root boundary, 67, 89
squared-difference identity, 46
staircase, 313
state space, 21
stochastic billiards, 302, 331
stochastic difference equation, 100, 104, 172
stochastic process, 21
stopping time, 22
string, 294
strong law of large numbers, see law of large numbers
strong Markov property, see Markov property
sub-diffusive, 98, 141, 175
super-diffusive, 159, 217
supermarket model, 334

transience, 3, 7, 10, 25, 54–56, 110, 123, 185, 193, 194, 198, 202, 204, 216, 227, 241, 280, 287, 297, 304, 320

uniform ellipticity, 14, 102, 172, 184, 193, 230, 286

voter model, 310, 322, 331

weak renewal theorem, 47, 87

zero drift, 10, 52, 88, 94, 112, 131, 170, 179, 180, 184, 194, 233
zero-one law, 19