
An Introduction to Lévy and Feller Processes

– Advanced Courses in Mathematics - CRM Barcelona 2014 –

René L. Schilling
TU Dresden, Institut für Mathematische Stochastik, 01062 Dresden, Germany
[email protected]
http://www.math.tu-dresden.de/sto/schilling

These course notes will be published, together with Davar Khoshnevisan’s notes on Invariance and Comparison Principles for Parabolic Stochastic Partial Differential Equations, as From Lévy-Type Processes to Parabolic SPDEs by the CRM, Barcelona and Birkhäuser, Cham 2017 (ISBN: 978-3-319-34119-4). The arXiv version and the published version may differ in layout, pagination and wording, but not in content.



Contents

Preface 3

Symbols and notation 5

1. Orientation 7

2. Lévy processes 12

3. Examples 16

4. On the Markov property 24

5. A digression: semigroups 30

6. The generator of a Lévy process 36

7. Construction of Lévy processes 44

8. Two special Lévy processes 49

9. Random measures 55

10. A digression: stochastic integrals 64

11. From Lévy to Feller processes 75

12. Symbols and semimartingales 84

13. Dénouement 93

A. Some classical results 97

Bibliography 104


Preface

These lecture notes are an extended version of my lectures on Lévy and Lévy-type processes given at the Second Barcelona Summer School on Stochastic Analysis organized by the Centre de Recerca Matemàtica (CRM). The lectures are aimed at advanced graduate and PhD students. In order to read these notes, one should have sound knowledge of measure theoretic probability theory and some background in stochastic processes, as it is covered in my books Measures, Integrals and Martingales [54] and Brownian Motion [56].

My purpose in these lectures is to give an introduction to Lévy processes, and to show how one can extend this approach to space inhomogeneous processes which behave locally like Lévy processes. After a brief overview (Chapter 1) I introduce Lévy processes, explain how to characterize them (Chapter 2) and discuss the quintessential examples of Lévy processes (Chapter 3). The Markov (loss of memory) property of Lévy processes is studied in Chapter 4. A short analytic interlude (Chapter 5) gives an introduction to operator semigroups, resolvents and their generators from a probabilistic perspective. Chapter 6 brings us back to generators of Lévy processes, which are identified as pseudo differential operators whose symbol is the characteristic exponent of the Lévy process. As a by-product we obtain the Lévy–Khintchine formula.

Continuing this line, we arrive at the first construction of Lévy processes in Chapter 7. Chapter 8 is devoted to two very special Lévy processes: (compound) Poisson processes and Brownian motion. We give elementary constructions of both processes and show how and why they are, indeed, special Lévy processes. This is also the basis for the next chapter (Chapter 9) where we construct a random measure from the jumps of a Lévy process. This can be used to provide a further construction of Lévy processes, culminating in the famous Lévy–Itô decomposition and yet another proof of the Lévy–Khintchine formula.

A second interlude (Chapter 10) embeds these random measures into the larger theory of random orthogonal measures. We show how we can use random orthogonal measures to develop an extension of Itô’s theory of stochastic integrals for square-integrable (not necessarily continuous) martingales, but we restrict ourselves to the bare bones, i.e. the L2-theory. In Chapter 11 we introduce Feller processes as the proper spatially inhomogeneous brethren of Lévy processes, and we show how our proof of the Lévy–Khintchine formula carries over to this setting. We will see, in particular, that Feller processes have a symbol which is the state-space dependent analogue of the characteristic exponent of a Lévy process. The symbol describes the process and its generator. A probabilistic way to calculate the symbol and some first consequences (in particular the semimartingale decomposition of Feller processes) is discussed in Chapter 12; we also show that



the symbol contains information on global properties of the process, such as conservativeness. In the final Chapter 13, we summarize (mostly without proofs) how other path properties of a Feller process can be obtained via the symbol. In order to make these notes self-contained, we collect in the appendix some material which is not always included in standard graduate probability courses.

It is now about time to thank the many individuals who helped to get this enterprise under way. I am grateful to the scientific board and the organizing committee for the kind invitation to deliver these lectures at the Centre de Recerca Matemàtica in Barcelona. The CRM is a wonderful place to teach and to do research, and I am very happy to acknowledge their support and hospitality. I would like to thank the students who participated in the CRM course as well as all students and readers who were exposed to earlier (temporally & spatially inhomogeneous. . . ) versions of my lectures; without your input these notes would look different!

I am greatly indebted to Ms. Franziska Kühn for her interest in this topic; her valuable comments pinpointed many mistakes and helped to make the presentation much clearer.

And, last and most, I thank my wife for her love, support and forbearance while these notes were being prepared.

Dresden, September 2015 René L. Schilling

Symbols and notation

This index is intended to aid cross-referencing, so notation that is specific to a single chapter is generally not listed. Some symbols are used locally, without ambiguity, in senses other than those given below; numbers following an entry are page numbers.

Unless otherwise stated, functions are real-valued and binary operations between functions such as f ± g, f · g, f ∧ g, f ∨ g, comparisons f ≤ g, f < g or limiting relations f_n → f (as n → ∞), lim_n f_n, liminf_n f_n, limsup_n f_n, sup_n f_n or inf_n f_n are understood pointwise.

General notation: analysis

positive        always in the sense ≥ 0
negative        always in the sense ≤ 0
N               1, 2, 3, . . .
inf ∅           inf ∅ = +∞
a ∨ b           maximum of a and b
a ∧ b           minimum of a and b
⌊x⌋             largest integer n ≤ x
|x|             norm in R^d: |x|² = x_1² + · · · + x_d²
x · y           scalar product in R^d: Σ_{j=1}^d x_j y_j
1_A             indicator function: 1_A(x) = 1 if x ∈ A, 0 if x ∉ A
δ_x             point mass at x
D               domain
∆               Laplace operator
∂_j             partial derivative ∂/∂x_j
∇, ∇_x          gradient (∂/∂x_1, . . . , ∂/∂x_d)^⊤
F f, f̂          Fourier transform (2π)^{−d} ∫ e^{−ix·ξ} f(x) dx
F^{−1} f, f̌     inverse Fourier transform ∫ e^{ix·ξ} f(x) dx
e_ξ(x)          e^{−ix·ξ}

General notation: probability

∼               ‘is distributed as’
⊥⊥              ‘is stochastically independent’
a.s.            almost surely (w.r.t. P)
iid             independent and identically distributed
N, Exp, Poi     normal, exponential, Poisson distribution
P, E            probability, expectation
V, Cov          variance, covariance
(L0)–(L3)       definition of a Lévy process, 7
(L2′)           12

Sets and σ-algebras

A^c             complement of the set A
Ā               closure of the set A
A ∪· B          disjoint union, i.e. A ∪ B for disjoint sets A ∩ B = ∅
B_r(x)          open ball, centre x, radius r
supp f          support, closure of {f ≠ 0}



B(E)            Borel sets of E
F_t^X           canonical filtration σ(X_s : s ≤ t)
F_∞             σ(⋃_{t≥0} F_t)
F_τ             75
F_{τ+}          29
P               predictable σ-algebra, 101

Stochastic processes

P^x, E^x        law and mean of a Markov process starting at x, 24
X_{t−}          left limit lim_{s↑t} X_s
∆X_t            jump at time t: X_t − X_{t−}
σ, τ            stopping times: {σ ≤ t} ∈ F_t, t ≥ 0
τ_r^x, τ_r      inf{t ≥ 0 : |X_t − X_0| ≥ r}, first exit time from the open ball B_r(x) centred at x = X_0
càdlàg          right continuous on [0,∞) with finite left limits on (0,∞)

Spaces of functions

B(E)            Borel functions on E
B_b(E)          – – , bounded
C(E)            continuous functions on E
C_b(E)          – – , bounded
C_∞(E)          – – , lim_{|x|→∞} f(x) = 0
C_c(E)          – – , compact support
C^n(E)          n times continuously differentiable functions on E
C_b^n(E)        – – , bounded (with all derivatives)
C_∞^n(E)        – – , 0 at infinity (with all derivatives)
C_c^n(E)        – – , compact support
L^p(E,µ), L^p(µ), L^p(E)   L^p space w.r.t. the measure space (E, A, µ)
S(R^d)          rapidly decreasing smooth functions on R^d, 36

1. Orientation

Stochastic processes with stationary and independent increments are classical examples of Markov processes. Their importance both in theory and in applications justifies studying these processes and their history.

The origins of processes with independent increments reach back to the late 1920s, and they are closely connected with the notion of infinite divisibility and the genesis of the Lévy–Khintchine formula. Around this time, the limiting behaviour of sums of independent random variables

X_0 := 0  and  X_n := ξ_1 + ξ_2 + · · · + ξ_n,  n ∈ N,

was well understood through the contributions of Borel, Markov, Cantelli, Lindeberg, Feller, de Finetti, Khintchine, Kolmogorov and, of course, Lévy; two new developments emerged: on the one hand, the study of dependent random variables and, on the other, the study of continuous-time analogues of sums of independent random variables. In order to pass from n ∈ N to a continuous parameter t ∈ [0,∞) we need to replace the steps ξ_k by increments X_t − X_s. It is not hard to see that X_t, t ∈ N, with iid (independent and identically distributed) steps ξ_k enjoys the following properties:

X_0 = 0 a.s.  (L0)

stationary increments:  X_t − X_s ∼ X_{t−s} − X_0  for all s ≤ t  (L1)

independent increments:  X_t − X_s ⊥⊥ σ(X_r, r ≤ s)  for all s ≤ t  (L2)

where ‘∼’ stands for ‘same distribution’ and ‘⊥⊥’ for stochastic independence. In the non-discrete setting we will also require a mild regularity condition,

continuity in probability:  lim_{t→0} P(|X_t − X_0| > ε) = 0  for all ε > 0  (L3)

which rules out fixed discontinuities of the path t ↦ X_t. Under (L0)–(L2) one has that

X_t = Σ_{k=1}^n ξ_{k,n}(t)  where the  ξ_{k,n}(t) = X_{kt/n} − X_{(k−1)t/n}  are iid  (1.1)

for every n ∈ N. Letting n → ∞ shows that X_t arises as a suitable limit of (a triangular array of) iid random variables, which transforms the problem into a question of limit theorems and infinite divisibility.
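The identity (1.1) can be checked numerically for a concrete Lévy process. The following Python sketch (mine, not part of the notes) does this for Brownian motion, where X_t ∼ N(0, t): the characteristic function of X_t must equal the n-th power of that of a single increment X_{t/n}.

```python
import math

def cf_gaussian(xi: float, t: float) -> float:
    # characteristic function of X_t ~ N(0, t): E e^{i xi X_t} = exp(-t xi^2 / 2)
    return math.exp(-t * xi * xi / 2)

t, n = 2.0, 16
for xi in (0.3, 1.0, 2.5):
    # (1.1): X_t is the sum of n iid copies of X_{t/n}, so the characteristic
    # functions satisfy cf(xi, t) = cf(xi, t/n) ** n
    assert abs(cf_gaussian(xi, t) - cf_gaussian(xi, t / n) ** n) < 1e-12
```

The same identity holds for every Lévy process, since by (2.5) below the characteristic function of X_t is always the t-th power of that of X_1.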



This was first observed in 1929 by de Finetti [15], who introduces (without naming it; the name is due to Bawly [6] and Khintchine [29]) the concept of infinite divisibility of a random variable X:

∀ n ∃ iid random variables ξ_{1,n}, . . . , ξ_{n,n} :  X ∼ Σ_{i=1}^n ξ_{i,n}  (1.2)

and asks for the general structure of infinitely divisible random variables. His paper contains two remarkable results on the characteristic function χ(ξ) = E e^{iξ·X} of an infinitely divisible random variable (taken from [39]):

De Finetti’s first theorem. A random variable X is infinitely divisible if, and only if, its characteristic function is of the form χ(ξ) = lim_{n→∞} exp[−p_n(1 − φ_n(ξ))], where p_n ≥ 0 and φ_n is a characteristic function.

De Finetti’s second theorem. The characteristic function of an infinitely divisible random variable X is the limit of finite products of Poissonian characteristic functions

χ_n(ξ) = exp[−p_n(1 − e^{i h_n ξ})],

and the converse is also true. In particular, all infinitely divisible laws are limits of convolutions of Poisson distributions.

Because of (1.1), X_t is infinitely divisible, and as such one can construct, in principle, all independent-increment processes X_t as limits of sums of Poisson random variables. The contributions of Kolmogorov [31], Lévy [37] and Khintchine [28] show the exact form of the characteristic function of an infinitely divisible random variable:

− log E e^{iξ·X} = − i l·ξ + ½ ξ·Qξ + ∫_{y≠0} (1 − e^{iy·ξ} + i ξ·y 1_{(0,1)}(|y|)) ν(dy)  (1.3)

where l ∈ R^d, Q ∈ R^{d×d} is a positive semidefinite symmetric matrix, and ν is a measure on R^d \ {0} such that ∫_{y≠0} min{1, |y|²} ν(dy) < ∞. This is the famous Lévy–Khintchine formula.

The exact knowledge of (1.3) makes it possible to find the approximating Poisson variables in de Finetti’s theorem explicitly, thus leading to a construction of X_t.
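To make (1.3) concrete, here is a small Python sketch (not from the notes; the function name and the discrete representation of ν are my choices) that evaluates the exponent for a one-dimensional triplet (l, Q, ν) with a finite, discrete Lévy measure. For the triplet (0, 0, 2δ₁) it reproduces the Poisson exponent 2(1 − e^{iξ}).

```python
import cmath

def lk_exponent(xi, l=0.0, q=0.0, nu=()):
    """-log E e^{i xi X} for a 1-d triplet (l, q, nu);
    nu is a finite discrete Levy measure given as pairs (y, mass)."""
    psi = -1j * l * xi + 0.5 * q * xi * xi
    for y, mass in nu:
        # compensator term i*xi*y only for jump sizes with 0 < |y| < 1, cf. (1.3)
        cutoff = y if 0 < abs(y) < 1 else 0.0
        psi += (1 - cmath.exp(1j * y * xi) + 1j * xi * cutoff) * mass
    return psi

# sanity check: triplet (0, 0, 2*delta_1) gives the Poisson exponent 2(1 - e^{i xi})
xi = 0.7
assert abs(lk_exponent(xi, nu=[(1.0, 2.0)]) - 2 * (1 - cmath.exp(1j * xi))) < 1e-12
```

For an infinite Lévy measure one would instead truncate the small jumps at some ε > 0, which is exactly the compensation scheme described above.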

A little later, and without knowledge of de Finetti’s results, Lévy came up, in his seminal paper [37] (see also [38, Chap. VII]), with a decomposition of X_t into four independent components: a deterministic drift, a Gaussian part, the compensated small jumps and the large jumps ∆X_s := X_s − X_{s−}. This is now known as the Lévy–Itô decomposition:

X_t = t l + √Q W_t + lim_{ε→0} ( Σ_{0<s≤t, ε≤|∆X_s|<1} ∆X_s − t ∫_{ε≤|y|<1} y ν(dy) ) + Σ_{0<s≤t, |∆X_s|≥1} ∆X_s  (1.4)

    = t l + √Q W_t + ∫∫_{(0,t]×B_1(0)} y (N(ds,dy) − ds ν(dy)) + ∫∫_{(0,t]×B_1(0)^c} y N(ds,dy).  (1.5)


Lévy uses results from the convergence of random series, notably Kolmogorov’s three series theorem, in order to explain the convergence of the series appearing in (1.4). A rigorous proof based on the representation (1.5) is due to Itô [23], who completed Lévy’s programme to construct X_t. The coefficients l, Q, ν are the same as in (1.3), W is a d-dimensional standard Brownian motion, and N_ω((0,t]×B) is the random measure #{s ∈ (0,t] : X_s(ω) − X_{s−}(ω) ∈ B} counting the jumps of X; it is a Poisson random variable with intensity E N((0,t]×B) = t ν(B) for all Borel sets B ⊂ R^d \ {0} such that 0 ∉ B.

Nowadays there are at least six possible approaches to constructing processes with (stationary and) independent increments X = (X_t)_{t≥0}.

The de Finetti–Lévy(–Kolmogorov–Khintchine) construction. The starting point is the observation that each X_t satisfies (1.1) and is, therefore, infinitely divisible. Thus, the characteristic exponent log E e^{iξ·X_t} is given by the Lévy–Khintchine formula (1.3), and using the triplet (l, Q, ν) one can construct a drift lt, a Brownian motion √Q W_t and compound Poisson processes, i.e. Poisson processes whose intensities y ∈ R^d are mixed with respect to the finite measure ν_ε(dy) := 1_{[ε,∞)}(|y|) ν(dy). Using a suitable compensation (in the spirit of Kolmogorov’s three series theorem) of the small jumps, it is possible to show that the limit ε → 0 exists locally uniformly in t. A very clear presentation of this approach can be found in Breiman [10, Chapter 14.7–8], see also Chapter 7.

The Lévy–Itô construction. This is currently the most popular approach to independent-increment processes, see e.g. Applebaum [2, Chapter 2.3–4] or Kyprianou [36, Chapter 2]. Originally the idea is due to Lévy [37], but Itô [23] gave the first rigorous construction. It is based on the observation that the jumps of a process with stationary and independent increments define a Poisson random measure N_ω([0,t]×B), and this can be used to obtain the Lévy–Itô decomposition (1.5). The Lévy–Khintchine formula is then a corollary of the pathwise decomposition. Some of the best presentations can be found in Gikhman–Skorokhod [18, Chapter VI], Itô [24, Chapter 4.3] and Bretagnolle [11]. A proof based on additive functionals and martingale stochastic integrals is due to Kunita & Watanabe [35, Section 7]. We follow this approach in Chapter 9.

Variants of the Lévy–Itô construction. The Lévy–Itô decomposition (1.5) is, in fact, the semimartingale decomposition of a process with stationary and independent increments. Using the general theory of semimartingales – which heavily relies on general random measures – we can identify processes with independent increments as those semimartingales whose semimartingale characteristics are deterministic, cf. Jacod & Shiryaev [27, Chapter II.4c]. A further interesting derivation of the Lévy–Itô decomposition is based on stochastic integrals driven by martingales. The key is Itô’s formula and, again, the fact that the jumps of a process with stationary and independent increments define a Poisson point process which can be used as a good stochastic


integrator; this unique approach1 can be found in Kunita [34, Chapter 2].

Kolmogorov’s construction. This is the classic construction of stochastic processes starting from the finite-dimensional distributions. For a process with stationary and independent increments these are given as iterated convolutions of the form

E f(X_{t_0}, . . . , X_{t_n}) = ∫· · ·∫ f(y_0, y_0 + y_1, . . . , y_0 + · · · + y_n) p_{t_0}(dy_0) p_{t_1−t_0}(dy_1) . . . p_{t_n−t_{n−1}}(dy_n)

with p_t(dy) = P(X_t ∈ dy) or ∫ e^{iξ·y} p_t(dy) = exp[−t ψ(ξ)], where ψ is the characteristic exponent (1.3). Particularly nice presentations are those of Sato [51, Chapter 2.10–11] and Bauer [5, Chapter 37].

The invariance principle. Just as for Brownian motion, it is possible to construct Lévy processes as limits of (suitably interpolated) random walks. For finite-dimensional distributions this is done in Gikhman & Skorokhod [18, Chapter IX.6]; for the whole trajectory, i.e. in the space of càdlàg2 functions D[0,1] equipped with the Skorokhod topology, the proper references are Prokhorov [42] and Grimvall [19].
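A minimal illustration of this principle, assuming symmetric ±1 steps (a one-dimensional sketch of mine, not the full Skorokhod-space statement): the scaled random walk S_{⌊nt⌋}/√n approaches a standard Brownian motion, so its variance at t = 1 should approach Var(W_1) = 1.

```python
import random

def scaled_walk(n: int, t: float, rng: random.Random) -> float:
    # value at time t of the scaled random walk S_{⌊nt⌋} / sqrt(n) with ±1 steps
    steps = int(n * t)
    return sum(rng.choice((-1.0, 1.0)) for _ in range(steps)) / n ** 0.5

rng = random.Random(42)
samples = [scaled_walk(400, 1.0, rng) for _ in range(4000)]
var = sum(x * x for x in samples) / len(samples)
# for large n the variance at t = 1 approaches Var(W_1) = 1 (Monte Carlo, loose bound)
assert abs(var - 1.0) < 0.15
```

For a Lévy process with jumps one would instead use heavy-tailed steps in the domain of attraction of a stable law; the scaling √n then changes accordingly.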

Random series constructions. A series representation of an independent-increment process (X_t)_{t∈[0,1]} is an expression of the form

X_t = lim_{n→∞} Σ_{k=1}^n (J_k 1_{[0,t]}(U_k) − t c_k)  a.s.

The random variables J_k represent the jumps, the U_k are iid uniform random variables, and the c_k are suitable deterministic centering terms. Compared with the Lévy–Itô decomposition (1.4), the main difference is the fact that the jumps are summed over a deterministic index set {1, 2, . . . , n}, while the summation in (1.4) extends over the random set {s : |∆X_s| > 1/n}. In order to construct a process with characteristic exponent (1.3) where l = 0 and Q = 0, one considers a disintegration

ν(dy) = ∫_0^∞ σ(r, dy) dr.

It is possible, cf. Rosinski [47], to choose σ(r, dy) = P(H(r, V_k) ∈ dy), where V = (V_k)_{k∈N} is any sequence of d-dimensional iid random variables and H : (0,∞)×R^d → R^d is measurable. Now let Γ = (Γ_k)_{k∈N} be a sequence of partial sums of iid standard exponential random variables and U = (U_k)_{k∈N} iid uniform random variables on [0,1] such that U, V, Γ are independent. Then

J_k := H(Γ_k, V_k)  and  c_k = ∫_{k−1}^k ∫_{|y|<1} y σ(r, dy) dr

1 It is reminiscent of the elegant use of Itô’s formula in Kunita and Watanabe’s proof of Lévy’s characterization of Brownian motion, see e.g. Schilling & Partzsch [56, Chapter 18.2].

2 A French acronym meaning ‘right-continuous and finite limits from the left’.


is the sought-for series representation, cf. Rosinski [47] and [46]. This approach is important if one wants to simulate independent-increment processes. Moreover, it still holds for Banach space valued random variables.
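When the Lévy measure λµ is finite, the series needs no centering terms: placing N ∼ Poi(λT) jumps at iid uniform times U_k on [0, T] gives a compound Poisson process X_t = Σ_k J_k 1_{[0,t]}(U_k). A hedged Python sketch (all names are mine, not from the notes):

```python
import math
import random

def poisson_sample(mean, rng):
    # Poisson sampling by inversion / sequential search; fine for moderate means
    x, p = 0, math.exp(-mean)
    c, u = p, rng.random()
    while u > c:
        x += 1
        p *= mean / x
        c += p
    return x

def series_path(lam, T, jump, rng):
    """One sample of t -> X_t on [0, T] via the series X_t = sum_k J_k 1_{[0,t]}(U_k);
    no centering terms c_k are needed because the Levy measure lam*mu is finite."""
    n = poisson_sample(lam * T, rng)                 # total number of jumps on [0, T]
    times = sorted(rng.uniform(0.0, T) for _ in range(n))
    jumps = [jump(rng) for _ in times]
    return lambda t: sum(j for u, j in zip(times, jumps) if u <= t)

rng = random.Random(7)
mean_x1 = sum(series_path(3.0, 1.0, lambda r: r.uniform(0.0, 2.0), rng)(1.0)
              for _ in range(3000)) / 3000
# E X_1 = lam * E[J] = 3 * 1 = 3 (Wald's identity); Monte Carlo, so the bound is loose
assert abs(mean_x1 - 3.0) < 0.3
```

For an infinite Lévy measure the centering terms c_k and the r-ordering of the jumps via Γ_k become essential, which is exactly what Rosinski’s construction above provides.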

2. Lévy processes

Throughout this chapter, (Ω, A, P) is a fixed probability space, t_0 = 0 ≤ t_1 ≤ . . . ≤ t_n and 0 ≤ s < t are positive real numbers, and ξ_k, η_k, k = 1, . . . , n, denote vectors from R^d; we write ξ·η for the Euclidean scalar product.

Definition 2.1. A Lévy process X = (X_t)_{t≥0} is a stochastic process X_t : Ω → R^d satisfying (L0)–(L3); this is to say that X starts at zero, has stationary and independent increments, and is continuous in probability.

One should understand Lévy processes as continuous-time versions of sums of iid random variables. This can easily be seen from the telescopic sum

X_t − X_s = Σ_{k=1}^n (X_{t_k} − X_{t_{k−1}}),  s < t, n ∈ N,  (2.1)

where t_k = s + (k/n)(t − s). Since the increments X_{t_k} − X_{t_{k−1}} are iid random variables, we see that all X_t of a Lévy process are infinitely divisible, i.e. (1.2) holds. Many properties of a Lévy process will, therefore, resemble those of sums of iid random variables.

Let us briefly discuss the conditions (L0)–(L3).

Remark 2.2. We have stated (L2) using the canonical filtration F_t^X := σ(X_r, r ≤ t) of the process X. Often this condition is written in the following way:

X_{t_n} − X_{t_{n−1}}, . . . , X_{t_1} − X_{t_0} are independent random variables for all n ∈ N, t_0 = 0 < t_1 < · · · < t_n.  (L2′)

It is easy to see that this is actually equivalent to (L2): from the bi-measurable correspondence

(X_{t_1}, . . . , X_{t_n}) ←→ (X_{t_1} − X_{t_0}, . . . , X_{t_n} − X_{t_{n−1}})

it follows that

F_t^X = σ((X_{t_1}, . . . , X_{t_n}), 0 ≤ t_1 ≤ . . . ≤ t_n ≤ t)
      = σ((X_{t_1} − X_{t_0}, . . . , X_{t_n} − X_{t_{n−1}}), 0 = t_0 ≤ t_1 ≤ . . . ≤ t_n ≤ t)
      = σ(X_u − X_v, 0 ≤ v ≤ u ≤ t),  (2.2)

and we conclude that (L2) and (L2′) are indeed equivalent.

The condition (L3) is equivalent to either of the following:



• ‘t ↦ X_t is continuous in probability’;

• ‘t ↦ X_t is a.s. càdlàg’1 (up to a modification of the process).

The equivalence with the first claim, and the direction ‘⇐’ of the second claim, are easy:

lim_{u→t} P(|X_u − X_t| > ε) = lim_{|t−u|→0} P(|X_{|t−u|}| > ε) ≤ lim_{h→0} (1/ε) E(|X_h| ∧ ε),  (2.3)

but it takes more effort to show that continuity in probability (L3) guarantees that almost all paths are càdlàg.2 Usually, this is proved by controlling the oscillations of the paths of a Lévy process, cf. Sato [51, Theorem 11.1], or by the fundamental regularization theorem for submartingales, see Revuz & Yor [44, Theorem II.(2.5)] and Remark 11.2; in contrast to the general martingale setting [44, Theorem II.(2.9)], we do not need to augment the natural filtration because of (L1) and (L3). Since our construction of Lévy processes directly gives a càdlàg version, we do not go into further detail.

The condition (L3) has another consequence. Recall that the Cauchy–Abel functional equations have unique solutions if, say, φ, ψ and θ are (right-)continuous:

φ(s+t) = φ(s)·φ(t)                 =⇒  φ(t) = φ(1)^t,
ψ(s+t) = ψ(s)+ψ(t)   (s, t ≥ 0)    =⇒  ψ(t) = ψ(1)·t,   (2.4)
θ(st) = θ(s)·θ(t)                  =⇒  θ(t) = t^c,  c ≥ 0.

The first equation is treated in Theorem A.1 in the appendix. For a thorough discussion of conditions ensuring uniqueness we refer to Aczel [1, Chapter 2.1].

Proposition 2.3. Let (X_t)_{t≥0} be a Lévy process in R^d. Then

E e^{iξ·X_t} = [E e^{iξ·X_1}]^t,  t ≥ 0, ξ ∈ R^d.  (2.5)

Proof. Fix s, t ≥ 0. We get

E e^{iξ·(X_{t+s}−X_s)+iξ·X_s} = E e^{iξ·(X_{t+s}−X_s)} E e^{iξ·X_s} = E e^{iξ·X_t} E e^{iξ·X_s},

using (L2) for the first and (L1) for the second equality; that is, φ(t+s) = φ(t)·φ(s) if we write φ(t) = E e^{iξ·X_t}. Since x ↦ e^{iξ·x} is continuous, there is for every ε > 0 some δ > 0 such that

|φ(t) − φ(s)| ≤ E|e^{iξ·(X_t−X_s)} − 1| ≤ ε + 2P(|X_t − X_s| > δ) = ε + 2P(|X_{|t−s|}| > δ).

Thus, (L3) guarantees that t ↦ φ(t) is continuous, and the claim follows from (2.4).
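Identity (2.5) can be verified numerically for a concrete process. The sketch below (mine, not from the notes) does this for a Poisson process: E e^{iuN_t} is computed directly from the Poisson weights e^{−λt}(λt)^k/k! and compared with [E e^{iuN_1}]^t.

```python
import cmath
import math

def cf_poisson(u: float, t: float, lam: float, terms: int = 80) -> complex:
    # E e^{iu N_t} computed by summing the Poisson series e^{-lam t}(lam t)^k / k!
    return sum(cmath.exp(1j * u * k) * math.exp(-lam * t) * (lam * t) ** k / math.factorial(k)
               for k in range(terms))

u, lam = 0.9, 2.0
for t in (0.5, 1.7, 3.0):
    # (2.5): E e^{iu N_t} = [E e^{iu N_1}]^t -- both equal exp(-lam t (1 - e^{iu}))
    assert abs(cf_poisson(u, t, lam) - cf_poisson(u, 1.0, lam) ** t) < 1e-10
```

The closed form exp(−λt(1 − e^{iu})) is derived in Example 3.2.c) below.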

Notice that any solution f(t) of (2.4) also satisfies (L0)–(L2); by Proposition 2.3, X_t + f(t) is a Lévy process if, and only if, f(t) is continuous. On the other hand, Hamel, cf. [1, p. 35], constructed discontinuous (non-measurable and locally unbounded) solutions to (2.4). Thus, (L3) means that t ↦ X_t has no fixed discontinuities, i.e. all jumps occur at random times.

1 ‘Right-continuous and finite limits from the left’.
2 More precisely: there exists a modification of X which has almost surely càdlàg paths.


Corollary 2.4. The finite-dimensional distributions P(X_{t_1} ∈ dx_1, . . . , X_{t_n} ∈ dx_n) of a Lévy process are uniquely determined by

E exp(i Σ_{k=1}^n ξ_k·X_{t_k}) = Π_{k=1}^n [E exp(i(ξ_k + · · · + ξ_n)·X_1)]^{t_k − t_{k−1}}  (2.6)

for all ξ_1, . . . , ξ_n ∈ R^d, n ∈ N and 0 = t_0 ≤ t_1 ≤ . . . ≤ t_n.

Proof. The left-hand side of (2.6) is just the characteristic function of (X_{t_1}, . . . , X_{t_n}), and the characteristic function determines the distribution uniquely; hence it suffices to verify (2.6). Using Proposition 2.3, we have

E exp(i Σ_{k=1}^n ξ_k·X_{t_k})
  = E exp(i Σ_{k=1}^{n−2} ξ_k·X_{t_k} + i(ξ_n + ξ_{n−1})·X_{t_{n−1}} + i ξ_n·(X_{t_n} − X_{t_{n−1}}))
  = E exp(i Σ_{k=1}^{n−2} ξ_k·X_{t_k} + i(ξ_n + ξ_{n−1})·X_{t_{n−1}}) (E e^{iξ_n·X_1})^{t_n − t_{n−1}},

using (L1) and (L2) in the last step. Since the first factor on the right-hand side has the same structure as the original expression, we can iterate this calculation and obtain (2.6).

It is not hard to invert the Fourier transform in (2.6). Writing p_t(dx) := P(X_t ∈ dx) we get

P(X_{t_1} ∈ B_1, . . . , X_{t_n} ∈ B_n) = ∫· · ·∫ Π_{k=1}^n 1_{B_k}(x_1 + · · · + x_k) p_{t_k − t_{k−1}}(dx_k)  (2.7)

                                       = ∫· · ·∫ Π_{k=1}^n 1_{B_k}(y_k) p_{t_k − t_{k−1}}(dy_k − y_{k−1}).  (2.8)
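The convolution structure (2.7) translates directly into a sampling scheme: draw independent increments from p_{t_k − t_{k−1}} and add them up. A Python sketch (mine), assuming Gaussian increments p_t = N(0, t), i.e. Brownian motion:

```python
import random

def sample_fdd(times, rng):
    """Sample (X_{t_1}, ..., X_{t_n}) via (2.7): independent increments
    X_{t_k} - X_{t_{k-1}} ~ p_{t_k - t_{k-1}} = N(0, t_k - t_{k-1})."""
    x, prev, out = 0.0, 0.0, []
    for t in times:
        x += rng.gauss(0.0, (t - prev) ** 0.5)
        prev = t
        out.append(x)
    return out

rng = random.Random(1)
paths = [sample_fdd([0.5, 1.0], rng) for _ in range(5000)]
cov = sum(a * b for a, b in paths) / len(paths)
# E[X_{0.5} X_{1.0}] = min(0.5, 1.0) = 0.5 for Brownian motion (Monte Carlo, loose bound)
assert abs(cov - 0.5) < 0.08
```

Any other Lévy process works the same way once one can sample from p_t; this is essentially Kolmogorov's construction from Chapter 1 in executable form.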

Let us discuss the structure of the characteristic function χ(ξ) = E e^{iξ·X_1} of X_1. From (2.1) we see that each random variable X_t of a Lévy process is infinitely divisible. Clearly, |χ(ξ)|² is the (real-valued) characteristic function of the symmetrization X̃_1 = X_1 − X_1′ (X_1′ is an independent copy of X_1), and X̃_1 is again infinitely divisible:

X̃_1 = Σ_{k=1}^n (X̃_{k/n} − X̃_{(k−1)/n}) = Σ_{k=1}^n [(X_{k/n} − X_{(k−1)/n}) − (X_{k/n}′ − X_{(k−1)/n}′)].

In particular, |χ|² = |χ_{1/n}|^{2n}, where χ_{1/n} is the characteristic function of X_{1/n}. Since everything is real and |χ(ξ)| ≤ 1, we get

θ(ξ) := lim_{n→∞} |χ_{1/n}(ξ)|² = lim_{n→∞} |χ(ξ)|^{2/n},  ξ ∈ R^d,

which is 0 or 1 depending on whether |χ(ξ)| = 0 or |χ(ξ)| > 0, respectively. As χ(ξ) is continuous at ξ = 0 with χ(0) = 1, we have θ ≡ 1 in a neighbourhood B_r(0) of 0. Now we can use Lévy’s continuity theorem (Theorem A.5) and conclude that the limiting function θ(ξ) is continuous everywhere, hence θ ≡ 1. In particular, χ(ξ) has no zeroes.


Corollary 2.5. Let (X_t)_{t≥0} be a Lévy process in R^d. There exists a unique continuous function ψ : R^d → C such that

E exp(iξ·X_t) = e^{−tψ(ξ)},  t ≥ 0, ξ ∈ R^d.

The function ψ is called the characteristic exponent.

Proof. In view of Proposition 2.3 it is enough to consider t = 1. Set χ(ξ) := E exp(iξ·X_1). An obvious candidate for the exponent is ψ(ξ) = −log χ(ξ), but with complex logarithms there is always the question which branch of the logarithm one should take. Let us begin with the uniqueness:

e^{−ψ} = e^{−φ}  =⇒  e^{−(ψ−φ)} = 1  =⇒  ψ(ξ) − φ(ξ) = 2πi k_ξ

for some integer k_ξ ∈ Z. Since φ, ψ are continuous and φ(0) = ψ(0) = 0, we get k_ξ ≡ 0.

To prove the existence of the logarithm, it is not sufficient to take the principal branch. As we have seen above, χ(ξ) is continuous and has no zeroes, i.e. inf_{|ξ|≤r} |χ(ξ)| > 0 for any r > 0; therefore, there is a ‘distinguished’, continuous3 version of the argument arg χ(ξ) such that arg χ(0) = 0. This allows us to take a continuous version of log χ(ξ) = log|χ(ξ)| + i arg χ(ξ).
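The ‘distinguished’ continuous version of the argument can be computed numerically by following χ along the segment from 0 to ξ and undoing the ±2π jumps of the principal branch. A pure-Python sketch (mine); for a Poisson characteristic function with λ = 5 the continuous argument leaves (−π, π], so the principal branch alone would give the wrong exponent:

```python
import cmath
import math

def continuous_log(chi, xi, steps=2000):
    """log chi(xi) = log|chi(xi)| + i arg chi(xi) with the continuous version of the
    argument along [0, xi], normalised by arg chi(0) = 0 (chi must have no zeroes)."""
    arg = 0.0
    prev = cmath.phase(chi(0.0))
    for k in range(1, steps + 1):
        cur = cmath.phase(chi(xi * k / steps))
        d = cur - prev
        # undo the +-2*pi jumps of the principal-branch phase
        if d > math.pi:
            d -= 2 * math.pi
        elif d < -math.pi:
            d += 2 * math.pi
        arg += d
        prev = cur
    return math.log(abs(chi(xi))) + 1j * arg

lam = 5.0
chi = lambda x: cmath.exp(-lam * (1 - cmath.exp(1j * x)))   # Poisson(lam) char. function
xi = 7.0
# the recovered exponent psi = -log chi must match lam*(1 - e^{i xi}) even where
# the principal branch of the logarithm would jump
assert abs(-continuous_log(chi, xi) - lam * (1 - cmath.exp(1j * xi))) < 1e-6
```

This is the numerical counterpart of the path-lifting argument sketched in the proof.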

Corollary 2.6. Let Y be an infinitely divisible random variable. Then there exists at most one4 Lévy process (X_t)_{t≥0} such that X_1 ∼ Y.

Proof. Since X_1 ∼ Y, infinite divisibility is a necessary requirement for Y. On the other hand, Proposition 2.3 and Corollary 2.4 show how to construct the finite-dimensional distributions of a Lévy process, hence the process, from X_1.

So far, we have seen the following one-to-one correspondences

(X_t)_{t≥0} Lévy process  ←1:1→  E e^{iξ·X_1}  ←1:1→  ψ(ξ) = −log E e^{iξ·X_1}

and the next step is to find all possible characteristic exponents. This will lead us to the Lévy–Khintchine formula.

3 A very detailed argument is given in Sato [51, Lemma 7.6]; a completely different proof can be found in Dieudonné [16, Chapter IX, Appendix 2].

4 We will see in Chapter 7 how to construct this process. It is unique in the sense that its finite-dimensional distributions are uniquely determined by Y.

3. Examples

We begin with a useful alternative characterisation of Lévy processes.

Theorem 3.1. Let X = (X_t)_{t≥0} be a stochastic process with values in R^d, P(X_0 = 0) = 1 and F_t = F_t^X = σ(X_r, r ≤ t). The process X is a Lévy process if, and only if, there exists an exponent ψ : R^d → C such that

E(e^{iξ·(X_t−X_s)} | F_s) = e^{−(t−s)ψ(ξ)}  for all s < t, ξ ∈ R^d.  (3.1)

Proof. If X is a Lévy process, we get

E(e^{iξ·(X_t−X_s)} | F_s) = E e^{iξ·(X_t−X_s)} = E e^{iξ·X_{t−s}} = e^{−(t−s)ψ(ξ)},

using (L2), (L1) and Corollary 2.5, respectively.

Conversely, assume that X_0 = 0 a.s. and (3.1) holds. Then

E e^{iξ·(X_t−X_s)} = e^{−(t−s)ψ(ξ)} = E e^{iξ·(X_{t−s}−X_0)},

which shows X_t − X_s ∼ X_{t−s} − X_0 = X_{t−s}, i.e. (L1).

For any F ∈ F_s we find from the tower property of conditional expectation

E(1_F · e^{iξ·(X_t−X_s)}) = E(1_F E[e^{iξ·(X_t−X_s)} | F_s]) = E 1_F · e^{−(t−s)ψ(ξ)}.  (3.2)

Observe that e^{iu1_F} = 1_{F^c} + e^{iu} 1_F for any u ∈ R; since both F and F^c are in F_s, we get

E(e^{iu1_F} e^{iξ·(X_t−X_s)}) = E(1_{F^c} e^{iξ·(X_t−X_s)}) + E(1_F e^{iu} e^{iξ·(X_t−X_s)})
  = E(1_{F^c} + e^{iu} 1_F) e^{−(t−s)ψ(ξ)}     [by (3.2)]
  = E e^{iu1_F} E e^{iξ·(X_t−X_s)}.            [by (3.2)]

Thus, 1_F ⊥⊥ (X_t − X_s) for any F ∈ F_s, and (L2) follows.

Finally, lim_{t→0} E e^{iξ·X_t} = lim_{t→0} e^{−tψ(ξ)} = 1 proves that X_t → 0 in distribution, hence in probability. This gives (L3).

Theorem 3.1 allows us to give concrete examples of Lévy processes.

Example 3.2. The following processes are Lévy processes.

a) Drift in direction l/|l|, l ∈ R^d, with speed |l|: X_t = t l and ψ(ξ) = −i l·ξ.



b) Brownian motion with (positive semi-definite) covariance matrix Q ∈ R^{d×d}: Let (W_t)_{t≥0} be a standard Wiener process on R^d and set X_t := √Q W_t. Then ψ(ξ) = ½ ξ·Qξ and, if Q is invertible, P(X_t ∈ dy) = (2πt)^{−d/2}(det Q)^{−1/2} exp(−y·Q^{−1}y/(2t)) dy.

c) Poisson process in R with jump height 1 and intensity λ. This is an integer-valued counting process (N_t)_{t≥0} which increases by 1 after independent exponential waiting times with mean 1/λ. Thus,

N_t = Σ_{k=1}^∞ 1_{[0,t]}(τ_k),  τ_k = σ_1 + · · · + σ_k,  σ_k ∼ Exp(λ) iid.

Using this definition, it is a bit messy to show that N is indeed a Lévy process (see e.g. Çinlar [12, Chapter 4]). We will give a different proof in Theorem 3.4 below. Usually, the first step is to show that its law is a Poisson distribution

P(N_t = k) = e^{−λt} (λt)^k / k!,  k = 0, 1, 2, . . .

(thus the name!) and from this one can calculate the characteristic exponent

E e^{iuN_t} = Σ_{k=0}^∞ e^{iuk} e^{−λt} (λt)^k / k! = e^{−λt} exp[λt e^{iu}] = exp[−λt(1 − e^{iu})],

i.e. ψ(u) = λ(1 − e^{iu}). Mind that this is strictly weaker than (3.1) and does not prove that N is a Lévy process.

d) Compound Poisson process in R^d with jump distribution µ and intensity λ. Let N = (N_t)_{t≥0} be a Poisson process with intensity λ and replace the jumps of size 1 by iid jumps of random height H_1, H_2, . . . with values in R^d and H_1 ∼ µ. This is a compound Poisson process:

C_t = Σ_{k=1}^{N_t} H_k,  H_k ∼ µ iid and independent of (N_t)_{t≥0}.

We will see in Theorem 3.4 that compound Poisson processes are Lévy processes.
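The definitions in c) and d) translate directly into a simulation: the jump times are the partial sums of Exp(λ) waiting times, and C_t adds iid heights H_k at those times. A hedged Python sketch (function names are mine, not from the notes):

```python
import random

def jump_times(lam, T, rng):
    # tau_k = sigma_1 + ... + sigma_k with sigma_k ~ Exp(lam) iid, kept while tau_k <= T
    out, t = [], 0.0
    while True:
        t += rng.expovariate(lam)
        if t > T:
            return out
        out.append(t)

def compound_poisson_path(lam, T, jump, rng):
    """One sample of t -> C_t = H_1 + ... + H_{N_t} on [0, T]."""
    taus = jump_times(lam, T, rng)
    heights = [jump(rng) for _ in taus]
    return lambda t: sum(h for tau, h in zip(taus, heights) if tau <= t)

rng = random.Random(2024)
n1 = [len(jump_times(4.0, 1.0, rng)) for _ in range(4000)]
# N_1 ~ Poi(4), so the empirical mean should be close to 4 (Monte Carlo, loose bound)
assert abs(sum(n1) / len(n1) - 4.0) < 0.2
```

With jump ≡ 1 this reduces to the Poisson process of part c), matching the remark after Theorem 3.4.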

Let us show that the Poisson and compound Poisson processes are Lévy processes. For this we need the following auxiliary result. Since t ↦ C_t is a step function, the Riemann–Stieltjes integral ∫_0^∞ f(u) dC_u is well-defined.

Lemma 3.3 (Campbell’s formula). Let C_t = H_1 + · · · + H_{N_t} be a compound Poisson process as in Example 3.2.d) with iid jumps H_k ∼ µ and an independent Poisson process (N_t)_{t≥0} with intensity λ. Then

E exp(i ∫_0^∞ f(t+s) dC_t) = exp[λ ∫_0^∞ ∫_{y≠0} (e^{i y·f(s+t)} − 1) µ(dy) dt]  (3.3)

holds for all s ≥ 0 and bounded measurable functions f : [0,∞) → R^d with compact support.


Proof. Set τ_k = σ_1 + · · · + σ_k, where the σ_k ∼ Exp(λ) are iid. Conditioning on σ_1 = x and using the iid structure of the σ_k and H_k gives

φ(s) := E exp(i ∫_0^∞ f(s+t) dC_t) = E exp(i Σ_{k=1}^∞ f(s + σ_1 + · · · + σ_k)·H_k)
  = ∫_0^∞ E exp(i Σ_{k=2}^∞ f(s + x + σ_2 + · · · + σ_k)·H_k) · E exp(i f(s+x)·H_1) · P(σ_1 ∈ dx)
  = ∫_0^∞ φ(s+x) γ(s+x) λe^{−λx} dx
  = λ e^{λs} ∫_s^∞ γ(t) φ(t) e^{−λt} dt,

where γ(t) := E exp(i f(t)·H_1) and P(σ_1 ∈ dx) = λe^{−λx} dx. This is equivalent to

e^{−λs} φ(s) = λ ∫_s^∞ (φ(t) e^{−λt}) γ(t) dt

and φ(∞) = 1 since f has compact support. This integral equation has a unique solution; it is now a routine exercise to verify that the right-hand side of (3.3) is indeed a solution.

Theorem 3.4. Let C_t = H_1 + · · · + H_{N_t} be a compound Poisson process as in Example 3.2.d) with iid jumps H_k ∼ µ and an independent Poisson process (N_t)_{t≥0} with intensity λ. Then (C_t)_{t≥0} (and also (N_t)_{t≥0}) is a d-dimensional Lévy process with characteristic exponent

ψ(ξ) = λ ∫_{y≠0} (1 − e^{iy·ξ}) µ(dy).  (3.4)

Proof. Since the trajectories of t ↦ C_t are càdlàg step functions with C_0 = 0, the properties (L0) and (L3), see (2.3), are satisfied. We will show (L1) and (L2). Let ξ_k ∈ R^d, 0 = t_0 ≤ . . . ≤ t_n, and a < b. Then the Riemann–Stieltjes integral

∫_0^∞ 1_{(a,b]}(t) dC_t = Σ_{k=1}^∞ 1_{(a,b]}(τ_k) H_k = C_b − C_a

exists. We apply Campbell’s formula (3.3) to the function

f(t) := Σ_{k=1}^n ξ_k 1_{(t_{k−1},t_k]}(t)

and with s = 0. Then the left-hand side of (3.3) becomes the characteristic function of the increments,

E exp(i Σ_{k=1}^n ξ_k·(C_{t_k} − C_{t_{k−1}})),

while the right-hand side is equal to

exp[λ ∫_{y≠0} Σ_{k=1}^n ∫_{t_{k−1}}^{t_k} (e^{iξ_k·y} − 1) dt µ(dy)]
  = Π_{k=1}^n exp[λ(t_k − t_{k−1}) ∫_{y≠0} (e^{iξ_k·y} − 1) µ(dy)]
  = Π_{k=1}^n E exp[i ξ_k·C_{t_k − t_{k−1}}]

Chapter 3: Examples 19

(use Campbell’s formula with n = 1 for the last equality). This shows that the increments areindependent, i.e. (L2′) holds, as well as (L1): Ctk −Ctk−1 ∼Ctk−tk−1 .

If d = 1 and Hk ∼ δ1, Ct is a Poisson process.

Denote by µ^{*k} the k-fold convolution of the measure µ; as usual, µ^{*0} := δ_0.

Corollary 3.5. Let (N_t)_{t≥0} be a Poisson process with intensity λ and C_t = H_1 + ... + H_{N_t} a compound Poisson process with iid jumps H_k ∼ µ. Then, for all t ≥ 0,

  P(N_t = k) = e^{−λt} (λt)^k / k!,  k = 0,1,2,...   (3.5)

  P(C_t ∈ B) = e^{−λt} ∑_{k=0}^∞ ((λt)^k / k!) µ^{*k}(B),  B ⊂ R^d Borel.   (3.6)

Proof. If we use Theorem 3.4 for d = 1 and µ = δ_1, we see that the characteristic function of N_t is χ_t(u) = exp[−λt(1 − e^{iu})]. Since this is also the characteristic function of the Poisson distribution (i.e. the r.h.s. of (3.5)), we get N_t ∼ Poi(λt).

Since (H_k)_{k∈N} ⊥⊥ (N_t)_{t≥0}, we have for any Borel set B

  P(C_t ∈ B) = ∑_{k=0}^∞ P(C_t ∈ B, N_t = k)
    = δ_0(B) P(N_t = 0) + ∑_{k=1}^∞ P(H_1 + ... + H_k ∈ B) P(N_t = k)
    = e^{−λt} ∑_{k=0}^∞ ((λt)^k / k!) µ^{*k}(B).
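The construction C_t = H_1 + ... + H_{N_t} is straightforward to simulate. The following sketch (Python; not part of the notes — the jump law µ = N(0,1) and all parameters are arbitrary illustrative choices) draws samples of C_t and compares the empirical characteristic function with the exponent (3.4):

```python
import numpy as np

rng = np.random.default_rng(1)

def compound_poisson_sample(t, lam, jump_sampler, size):
    """Draw `size` samples of C_t = H_1 + ... + H_{N_t} with N_t ~ Poi(lam*t)."""
    counts = rng.poisson(lam * t, size=size)          # number of jumps per sample
    return np.array([jump_sampler(k).sum() for k in counts])

t, lam = 2.0, 3.0
C = compound_poisson_sample(t, lam, lambda k: rng.standard_normal(k), 50_000)

# empirical characteristic function vs. exp(-t*psi(xi)) with psi as in (3.4):
xi = 0.7
emp = np.exp(1j * xi * C).mean()
psi = lam * (1 - np.exp(-xi**2 / 2))                  # E e^{i xi H} = e^{-xi^2/2} for H ~ N(0,1)
print(abs(emp - np.exp(-t * psi)))                    # small sampling error
```

The agreement illustrates Theorem 3.4: for Gaussian jumps, ψ(ξ) = λ∫(1 − e^{iyξ}) µ(dy) = λ(1 − e^{−ξ²/2}).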

Example 3.2 contains the basic Lévy processes which will also be the building blocks for all Lévy processes. In order to define more specialized Lévy processes, we need further assumptions on the distributions of the random variables X_t.

Definition 3.6. Let (X_t)_{t≥0} be a stochastically continuous process in R^d. It is called self-similar if

  ∀a > 0 ∃b = b(a) : (X_{at})_{t≥0} ∼ (bX_t)_{t≥0}   (3.7)

in the sense that both sides have the same finite-dimensional distributions.

Lemma 3.7 (Lamperti). If (X_t)_{t≥0} is self-similar and non-degenerate, then there exists a unique index of self-similarity H > 0 such that b(a) = a^H. If (X_t)_{t≥0} is a self-similar Lévy process, then H ≥ 1/2.

Proof. Since (X_t)_{t≥0} is self-similar, we find for a, a′ > 0 and each t ≥ 0

  b(aa′)X_t ∼ X_{aa′t} ∼ b(a)X_{a′t} ∼ b(a)b(a′)X_t,

and so b(aa′) = b(a)b(a′) as X_t is non-degenerate.¹ By the convergence of types theorem (Theorem A.6) and the continuity in probability of t ↦ X_t we see that a ↦ b(a) is continuous. Thus, the Cauchy functional equation b(aa′) = b(a)b(a′) has the unique continuous solution b(a) = a^H for some H > 0.

Assume now that (X_t)_{t≥0} is a Lévy process. We are going to show that H ≥ 1/2. Using self-similarity and the properties (L1), (L2) we get (primes always denote iid copies of the respective random variables)

  (n+m)^H X_1 ∼ X_{n+m} = (X_{n+m} − X_m) + X_m ∼ X″_n + X′_m ∼ n^H X″_1 + m^H X′_1.   (3.8)

Any standard normal random variable X_1 satisfies (3.8) with H = 1/2. On the other hand, if X_1 has a second moment, we get (n+m) V X_1 = V X_{n+m} = V X″_n + V X′_m = n V X″_1 + m V X′_1 by Bienaymé's identity for variances, i.e. (3.8) can only hold with H = 1/2. Thus, any self-similar X_1 with finite second moment has to satisfy (3.8) with H = 1/2. If we can show that H < 1/2 implies the existence of a second moment, we have reached a contradiction.

If X_n is symmetric and H < 1/2, we find because of X_n ∼ n^H X_1 some u > 0 such that

  P(|X_n| > u n^H) = P(|X_1| > u) < 1/4.

By the symmetrization inequality (Theorem A.7),

  (1/2)(1 − exp[−n P(|X_1| > u n^H)]) ≤ P(|X_n| > u n^H) < 1/4,

which means that n P(|X_1| > u n^H) ≤ c for all n ∈ N. Thus, P(|X_1| > x) ≤ c′ x^{−1/H} for all x ≥ u+1, and so

  E|X_1|² = 2 ∫_0^∞ x P(|X_1| > x) dx ≤ (u+1)² + 2c′ ∫_{u+1}^∞ x^{1−1/H} dx < ∞

as H < 1/2. If X_n is not symmetric, we use its symmetrization X_n − X′_n where the X′_n are iid copies of X_n.

Definition 3.8. A random variable X is called stable if

  ∀n ∈ N ∃ b_n > 0, c_n ∈ R^d : X′_1 + ... + X′_n ∼ b_n X + c_n   (3.9)

where X′_1, ..., X′_n are iid copies of X. If (3.9) holds with c_n = 0, the random variable is called strictly stable. A Lévy process (X_t)_{t≥0} is (strictly) stable if X_1 is a (strictly) stable random variable.

¹We use here that bX ∼ cX ⟹ b = c if X is non-degenerate. To see this, set χ(ξ) = E e^{iξ·X} and notice

  |χ(ξ)| = |χ((b/c)ξ)| = ... = |χ((b/c)^n ξ)|.

If b < c, the right-hand side converges for n → ∞ to χ(0) = 1, hence |χ| ≡ 1, contradicting the fact that X is non-degenerate. Since b, c play symmetric roles, we conclude that b = c.


Note that the symmetrization X − X′ of a stable random variable is strictly stable. Setting χ(ξ) = E e^{iξ·X} it is easy to see that (3.9) is equivalent to

  ∀n ∈ N ∃ b_n > 0, c_n ∈ R^d : χ(ξ)^n = χ(b_n ξ) e^{i c_n·ξ}.   (3.9′)

Example 3.9. a) Stable processes. By definition, any stable random variable is infinitely divisible, and for every stable X there is a unique Lévy process on R^d such that X_1 ∼ X, cf. Corollary 2.6.

A Lévy process (X_t)_{t≥0} is stable if, and only if, all random variables X_t are stable. This follows at once from (3.9′) if we use χ_t(ξ) := E e^{iξ·X_t}: by (2.5) and (3.9′),

  χ_t(ξ)^n = (χ_1(ξ)^n)^t = χ_1(b_nξ)^t e^{i(tc_n)·ξ} = χ_t(b_nξ) e^{i(tc_n)·ξ}.

It is possible to determine the characteristic exponent of a stable process, cf. Sato [51, Theorem 14.10] and (3.10) further down.

b) Self-similar processes. Assume that (X_t)_{t≥0} is a self-similar Lévy process. Then

  ∀n ∈ N : b(n)X_1 ∼ X_n = ∑_{k=1}^n (X_k − X_{k−1}) ∼ X′_{1,n} + ... + X′_{n,n}

where the X′_{k,n} are iid copies of X_1. This shows that X_1, hence (X_t)_{t≥0}, is strictly stable. In fact, the converse is also true:

c) A strictly stable Lévy process is self-similar. We have already seen in b) that self-similar Lévy processes are strictly stable. Assume now that (X_t)_{t≥0} is strictly stable. Since X_{nt} ∼ b_n X_t we get

  e^{−ntψ(ξ)} = E e^{iξ·X_{nt}} = E e^{i b_nξ·X_t} = e^{−tψ(b_nξ)}.

Taking n = m, t ⇝ t/m and ξ ⇝ b_m^{−1}ξ we see

  e^{−(t/m)ψ(ξ)} = e^{−tψ(b_m^{−1}ξ)}.

From these equalities we obtain for q = n/m ∈ Q⁺ and b(q) := b_n/b_m

  e^{−qtψ(ξ)} = e^{−tψ(b(q)ξ)} ⟹ X_{qt} ∼ b(q)X_t ⟹ X_{at} ∼ b(a)X_t

for all a, t > 0 because of the continuity in probability of (X_t)_{t≥0}. Since, by Corollary 2.4, the finite-dimensional distributions are determined by the one-dimensional distributions, we conclude that (3.7) holds.

This means, in particular, that strictly stable Lévy processes have an index of self-similarity H ≥ 1/2. It is common to call α = 1/H ∈ (0,2] the index of stability of (X_t)_{t≥0}, and we have X_{nt} ∼ n^{1/α} X_t.

If X is 'only' stable, its symmetrization is strictly stable and, thus, every stable Lévy process has an index α ∈ (0,2]. It plays an important role for the characteristic exponent. For a general stable process the characteristic exponent is of the form

  ψ(ξ) = ∫_{S^d} |z·ξ|^α (1 − i sgn(z·ξ) tan(απ/2)) σ(dz) − i µ·ξ,   (α ≠ 1),

  ψ(ξ) = ∫_{S^d} |z·ξ| (1 + (2/π) i sgn(z·ξ) log|z·ξ|) σ(dz) − i µ·ξ,   (α = 1),   (3.10)

where σ is a finite measure on S^d and µ ∈ R^d. The strictly stable exponents have µ = 0 (if α ≠ 1) and ∫_{S^d} z_k σ(dz) = 0, k = 1,...,d (if α = 1). These formulae can be derived from the general Lévy–Khintchine formula; a good reference is the monograph by Samorodnitsky & Taqqu [48, Chapters 2.3–4].

If X is strictly stable such that the distribution of X_t is rotationally invariant, it is clear that ψ(ξ) = c|ξ|^α. If X_t is symmetric, i.e. X_t ∼ −X_t, then ψ(ξ) = ∫_{S^d} |z·ξ|^α σ(dz) for some finite, symmetric measure σ on the unit sphere S^d ⊂ R^d.
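The rotationally invariant case ψ(ξ) = |ξ|^α is easy to test numerically. The sketch below (Python; not from the notes — the Chambers–Mallows–Stuck sampler is a standard simulation method standing in for anything constructive in the text, and all parameters are arbitrary) draws symmetric α-stable variables and compares the empirical characteristic function with e^{−|ξ|^α}:

```python
import numpy as np

rng = np.random.default_rng(7)

def sym_stable(alpha, size):
    """Chambers-Mallows-Stuck sampler for the standard symmetric
    alpha-stable law with characteristic function exp(-|xi|^alpha), alpha != 1."""
    U = rng.uniform(-np.pi / 2, np.pi / 2, size)   # uniform angle
    W = rng.exponential(1.0, size)                 # independent Exp(1)
    return (np.sin(alpha * U) / np.cos(U) ** (1 / alpha)
            * (np.cos((1 - alpha) * U) / W) ** ((1 - alpha) / alpha))

alpha, xi = 1.5, 1.2
X = sym_stable(alpha, 200_000)
emp = np.cos(xi * X).mean()                        # symmetric law: E e^{i xi X} is real
print(emp, np.exp(-abs(xi) ** alpha))              # the two values nearly agree
```

Note that although X has no second moment for α < 2, the empirical characteristic function is an average of bounded terms, so the Monte Carlo error is well controlled.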

Let us finally show Kolmogorov's proof of the Lévy–Khintchine formula for one-dimensional Lévy processes admitting second moments. We need the following auxiliary result.

Lemma 3.10. Let (X_t)_{t≥0} be a Lévy process on R. If V X_1 < ∞, then V X_t < ∞ for all t ≥ 0 and

  E X_t = t E X_1 =: tµ and V X_t = t V X_1 =: tσ².

Proof. If V X_1 < ∞, then E|X_1| < ∞. With Bienaymé's identity, we get

  V X_m = ∑_{k=1}^m V(X_k − X_{k−1}) = m V X_1 and V X_1 = n V X_{1/n}.

In particular, V X_m, V X_{1/n} < ∞. This, and a similar argument for the expectation, show

  V X_q = q V X_1 and E X_q = q E X_1 for all q ∈ Q⁺.

Moreover, V(X_q − X_r) = V X_{q−r} = (q−r) V X_1 for all rational numbers r ≤ q, and this shows that X_q − E X_q = X_q − qµ converges in L² as q → t. Since t ↦ X_t is continuous in probability, we can identify the limit and find X_q − qµ → X_t − tµ. Consequently, V X_t = tσ² and E X_t = tµ.

We have seen in Proposition 2.3 that the characteristic function of a Lévy process is of the form

  χ_t(ξ) = E e^{iξX_t} = [E e^{iξX_1}]^t = χ_1(ξ)^t.

Let us assume that X is real-valued and has finite (first and) second moments V X_1 = σ² and E X_1 = µ. By Taylor's formula

  E e^{iξ(X_t−tµ)} = E[ 1 + iξ(X_t − tµ) − ξ²(X_t − tµ)² ∫_0^1 (1−θ) e^{iθξ(X_t−tµ)} dθ ]
    = 1 − E[ ξ²(X_t − tµ)² ∫_0^1 (1−θ) e^{iθξ(X_t−tµ)} dθ ].

Since

  | ∫_0^1 (1−θ) e^{iθξ(X_t−tµ)} dθ | ≤ ∫_0^1 (1−θ) dθ = 1/2,

we get

  |E e^{iξX_t}| = |E e^{iξ(X_t−tµ)}| ≥ 1 − (ξ²/2) tσ².

Thus, χ_{1/n}(ξ) ≠ 0 if n ≥ N(ξ) ∈ N is large, hence χ_1(ξ) = χ_{1/n}(ξ)^n ≠ 0. For ξ ∈ R we find (using a suitable branch of the complex logarithm)

  ψ(ξ) := −log χ_1(ξ) = −(∂/∂t)[χ_1(ξ)^t]|_{t=0} = lim_{t→0} (1 − E e^{iξX_t})/t
    = lim_{t→0} [ (1/t) ∫_{−∞}^∞ (1 − e^{iyξ} + iyξ) p_t(dy) − iξµ ]
    = lim_{t→0} [ ∫_{−∞}^∞ ((1 − e^{iyξ} + iyξ)/y²) π_t(dy) − iξµ ]   (3.11)

where p_t(dy) = P(X_t ∈ dy) and π_t(dy) := y² t^{−1} p_t(dy). Yet another application of Taylor's theorem shows that the integrand in the above integral is bounded, vanishes at infinity, and admits a continuous extension onto the whole real line if we choose the value ξ²/2 at y = 0. The family (π_t)_{t∈(0,1]} is uniformly bounded,

  (1/t) ∫ y² p_t(dy) = (1/t) E(X_t²) = (1/t)( V X_t + [E X_t]² ) = σ² + tµ² → σ² as t → 0,

hence sequentially vaguely relatively compact (see Theorem A.3). We conclude that every sequence (π_{t(n)})_{n∈N} ⊂ (π_t)_{t∈(0,1]} with t(n) → 0 as n → ∞ has a vaguely convergent subsequence. But since the limit (3.11) exists, all subsequential limits coincide which means² that π_t converges vaguely to a finite measure π on R. This proves that

  ψ(ξ) = −log χ_1(ξ) = ∫_{−∞}^∞ ((1 − e^{iyξ} + iyξ)/y²) π(dy) − iξµ

for some finite measure π on (−∞,∞) with total mass π(R) = σ². This is sometimes called the de Finetti–Kolmogorov formula. If we set ν(dy) := y^{−2} 1_{y≠0} π(dy) and σ_0² := π{0}, we obtain the Lévy–Khintchine formula

  ψ(ξ) = −iµξ + (1/2) σ_0² ξ² + ∫_{y≠0} (1 − e^{iyξ} + iyξ) ν(dy)

where σ² = σ_0² + ∫_{y≠0} y² ν(dy).

²Note that e^{iyξ} = ∂²_ξ (1 − e^{iyξ} + iyξ)/y², i.e. the kernel appearing in (3.11) is indeed measure-determining.

4. On the Markov property

Let (Ω, A, P) be a probability space with some filtration (F_t)_{t≥0} and a d-dimensional adapted stochastic process X = (X_t)_{t≥0}, i.e. each X_t is F_t measurable. We write B(R^d) for the Borel sets and set F_∞ := σ(⋃_{t≥0} F_t).

The process X is said to be a simple Markov process if

  P(X_t ∈ B | F_s) = P(X_t ∈ B | X_s),  s ≤ t, B ∈ B(R^d),   (4.1)

holds true. This is pretty much the most general definition of a Markov process, but it is usually too general to work with. It is more convenient to consider Markov families.

Definition 4.1. A (temporally homogeneous) Markov transition function is a measure kernel p_t(x,B), t ≥ 0, x ∈ R^d, B ∈ B(R^d), such that

a) B ↦ p_s(x,B) is a probability measure for every s ≥ 0 and x ∈ R^d;

b) (s,x) ↦ p_s(x,B) is a Borel measurable function for every B ∈ B(R^d);

c) the Chapman–Kolmogorov equations hold:

  p_{s+t}(x,B) = ∫ p_t(y,B) p_s(x,dy) for all s, t ≥ 0, x ∈ R^d, B ∈ B(R^d).   (4.2)

Definition 4.2. A stochastic process (X_t)_{t≥0} is called a (temporally homogeneous) Markov process with transition function if there exists a Markov transition function p_t(x,B) such that

  P(X_t ∈ B | F_s) = p_{t−s}(X_s, B) a.s. for all s ≤ t, B ∈ B(R^d).   (4.3)

Conditioning w.r.t. σ(X_s) and using the tower property of conditional expectation shows that (4.3) implies the simple Markov property (4.1). Nowadays the following definition of a Markov process is commonly used.

Definition 4.3. A (universal) Markov process is a tuple (Ω, A, F_t, X_t, t ≥ 0, P^x, x ∈ R^d) such that p_t(x,B) = P^x(X_t ∈ B) is a Markov transition function and (X_t)_{t≥0} is for each P^x a Markov process in the sense of Definition 4.2 such that P^x(X_0 = x) = 1. In particular,

  P^x(X_t ∈ B | F_s) = P^{X_s}(X_{t−s} ∈ B) P^x-a.s. for all s ≤ t, B ∈ B(R^d).   (4.4)


We are going to show that a Lévy process is a (universal) Markov process. Assume that (X_t)_{t≥0} is a Lévy process and set F_t := F_t^X = σ(X_r, r ≤ t). Define probability measures

  P^x(X_• ∈ Γ) := P(X_• + x ∈ Γ),  x ∈ R^d,

where Γ is a Borel set of the path space (R^d)^{[0,∞)} = {w | w : [0,∞) → R^d}.¹ We set E^x := ∫ ... dP^x. By construction, P = P⁰ and E = E⁰.

Note that X_t^x := X_t + x satisfies the conditions (L1)–(L3), and it is common to call (X_t^x)_{t≥0} a Lévy process starting from x.

Lemma 4.4. Let (X_t)_{t≥0} be a Lévy process on R^d. Then

  p_t(x,B) := P^x(X_t ∈ B) := P(X_t + x ∈ B),  t ≥ 0, x ∈ R^d, B ∈ B(R^d),

is a Markov transition function.

Proof. Since p_t(x,B) = E 1_B(X_t + x), (the proof of) Fubini's theorem shows that x ↦ p_t(x,B) is a measurable function and B ↦ p_t(x,B) is a probability measure. The Chapman–Kolmogorov equations follow from

  p_{s+t}(x,B) = P(X_{s+t} + x ∈ B) = P((X_{s+t} − X_t) + x + X_t ∈ B)
    = ∫_{R^d} P(y + X_t ∈ B) P((X_{s+t} − X_t) + x ∈ dy)   (by (L2))
    = ∫_{R^d} P(y + X_t ∈ B) P(X_s + x ∈ dy)   (by (L1))
    = ∫_{R^d} p_t(y,B) p_s(x,dy).

Remark 4.5. The proof of Lemma 4.4 shows a bit more: From

  p_t(x,B) = ∫ 1_B(x+y) P(X_t ∈ dy) = ∫ 1_{B−x}(y) P(X_t ∈ dy) = p_t(0, B−x)

we see that the kernels p_t(x,B) are invariant under shifts in R^d (translation invariant). In slight abuse of notation we write p_t(x,B) = p_t(B−x). From this it becomes clear that the Chapman–Kolmogorov equations are convolution identities p_{t+s}(B) = p_t * p_s(B), and (p_t)_{t≥0} is a convolution semigroup of probability measures; because of (L3), this semigroup is weakly continuous at t = 0, i.e. p_t → δ_0 as t → 0, cf. Theorem A.3 et seq. for the weak convergence of measures.
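For Brownian motion the convolution identity p_s * p_t = p_{s+t} can be checked by direct numerical integration. A minimal sketch (Python; the grid and the parameters are arbitrary choices, not from the notes):

```python
import numpy as np

def gauss(t, x):
    """Transition density p_t(x) of standard Brownian motion."""
    return np.exp(-x**2 / (2 * t)) / np.sqrt(2 * np.pi * t)

s, t, z = 0.7, 1.3, 0.8
y = np.linspace(-30.0, 30.0, 20_001)   # integration grid
dy = y[1] - y[0]

# Chapman–Kolmogorov as a convolution: integral of p_s(y) p_t(z - y) dy = p_{s+t}(z)
conv = np.sum(gauss(s, y) * gauss(t, z - y)) * dy
print(conv, gauss(s + t, z))           # the two values agree to several digits
```

The rapid decay of the Gaussian densities makes the simple Riemann sum extremely accurate here.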

Lévy processes enjoy an even stronger version of the above Markov property.

Theorem 4.6 (Markov property for Lévy processes). Let X be a d-dimensional Lévy process and set Y := (X_{t+a} − X_a)_{t≥0} for some fixed a ≥ 0. Then Y is again a Lévy process satisfying

a) Y ⊥⊥ (X_r)_{r≤a}, i.e. F_∞^Y ⊥⊥ F_a^X.

¹Recall that B((R^d)^{[0,∞)}) is the smallest σ-algebra containing the cylinder sets Z = ×_{t≥0} B_t where B_t ∈ B(R^d) and only finitely many B_t ≠ R^d.


b) Y ∼ X, i.e. X and Y have the same finite-dimensional distributions.

Proof. Observe that F_s^Y = σ(X_{r+a} − X_a, r ≤ s) ⊂ F_{s+a}^X. Using Theorem 3.1 and the tower property of conditional expectation yields for all s ≤ t

  E( e^{iξ·(Y_t−Y_s)} | F_s^Y ) = E[ E( e^{iξ·(X_{t+a}−X_{s+a})} | F_{s+a}^X ) | F_s^Y ] = e^{−(t−s)ψ(ξ)}.

Thus, (Y_t)_{t≥0} is a Lévy process with the same characteristic function as (X_t)_{t≥0}. The property (L2′) for X gives

  X_{t_n+a} − X_{t_{n−1}+a}, X_{t_{n−1}+a} − X_{t_{n−2}+a}, ..., X_{t_1+a} − X_a ⊥⊥ F_a^X.

As σ(Y_{t_1},...,Y_{t_n}) = σ(Y_{t_n} − Y_{t_{n−1}}, ..., Y_{t_1} − Y_{t_0}) ⊥⊥ F_a^X for all t_0 = 0 < t_1 < ... < t_n, we get

  F_∞^Y = σ( ⋃_{t_1<...<t_n, n∈N} σ(Y_{t_1},...,Y_{t_n}) ) ⊥⊥ F_a^X.

Using the Markov transition function p_t(x,B) we can define a linear operator on the bounded Borel measurable functions f : R^d → R:

  P_t f(x) := ∫ f(y) p_t(x,dy) = E^x f(X_t),  f ∈ B_b(R^d), t ≥ 0, x ∈ R^d.   (4.5)

For a Lévy process, cf. Remark 4.5, we have p_t(x,B) = p_t(B−x) and the operators P_t are actually convolution operators:

  P_t f(x) = E f(X_t + x) = ∫ f(y+x) p_t(dy) = f * p̃_t(x) where p̃_t(B) := p_t(−B).   (4.6)

Definition 4.7. Let P_t, t ≥ 0, be defined by (4.5). The operators are said to be

a) acting on B_b(R^d), if P_t : B_b(R^d) → B_b(R^d).

b) an operator semigroup, if P_{t+s} = P_t ∘ P_s for all s, t ≥ 0 and P_0 = id.

c) sub-Markovian, if 0 ≤ f ≤ 1 ⟹ 0 ≤ P_t f ≤ 1.

d) contractive, if ‖P_t f‖_∞ ≤ ‖f‖_∞ for all f ∈ B_b(R^d).

e) conservative, if P_t 1 = 1.

f) Feller operators, if P_t : C_∞(R^d) → C_∞(R^d).²

g) strongly continuous on C_∞(R^d), if lim_{t→0} ‖P_t f − f‖_∞ = 0 for all f ∈ C_∞(R^d).

h) strong Feller operators, if P_t : B_b(R^d) → C_b(R^d).

²C_∞(R^d) denotes the space of continuous functions vanishing at infinity. It is a Banach space when equipped with the uniform norm ‖f‖_∞ = sup_{x∈R^d} |f(x)|.


Lemma 4.8. Let (P_t)_{t≥0} be defined by (4.5). The properties 4.7.a)–e) hold for any Markov process, 4.7.a)–g) hold for any Lévy process, and 4.7.a)–h) hold for any Lévy process such that all transition probabilities p_t(dy) = P(X_t ∈ dy), t > 0, are absolutely continuous w.r.t. Lebesgue measure.

Proof. We only show the assertions about Lévy processes (X_t)_{t≥0}.

a) Since P_t f(x) = E f(X_t + x), the boundedness of P_t f is obvious, and the measurability in x follows from (the proof of) Fubini's theorem.

b) By the tower property of conditional expectation, we get for s, t ≥ 0

  P_{t+s} f(x) = E^x f(X_{t+s}) = E^x( E^x[ f(X_{t+s}) | F_s ] ) = E^x( E^{X_s} f(X_t) ) = P_s P_t f(x),

using (4.4) in the penultimate equality. For the Markov transition functions this is the Chapman–Kolmogorov identity (4.2).

c), d), e) follow directly from the fact that B ↦ p_t(x,B) is a probability measure.

f) Let f ∈ C_∞(R^d). Since x ↦ f(x + X_t) is continuous and bounded, the claim follows from dominated convergence as P_t f(x) = E f(x + X_t).

g) f ∈ C_∞ is uniformly continuous, i.e. for every ε > 0 there is some δ > 0 such that

  |x − y| ≤ δ ⟹ |f(x) − f(y)| ≤ ε.

Hence,

  ‖P_t f − f‖_∞ ≤ sup_{x∈R^d} ∫ |f(X_t) − f(x)| dP^x
    = sup_{x∈R^d} ( ∫_{|X_t−x|≤δ} |f(X_t) − f(x)| dP^x + ∫_{|X_t−x|>δ} |f(X_t) − f(x)| dP^x )
    ≤ ε + 2‖f‖_∞ sup_{x∈R^d} P^x(|X_t − x| > δ)
    = ε + 2‖f‖_∞ P(|X_t| > δ) → ε as t → 0, by (L3).

Since ε > 0 is arbitrary, the claim follows. Note that this proof shows that uniform continuity in probability is responsible for the strong continuity of the semigroup.

h) see Lemma 4.9.

Lemma 4.9 (Hawkes). Let X = (X_t)_{t≥0} be a Lévy process on R^d. Then the operators P_t defined by (4.5) are strong Feller if, and only if, X_t ∼ p_t(y) dy for all t > 0.

Proof. '⇐': Let X_t ∼ p_t(y) dy. Since p_t ∈ L¹ and since convolutions have a smoothing property (e.g. [54, Theorem 14.8] or [55, Satz 18.9]), we get with p̃_t(y) = p_t(−y)

  P_t f = f * p̃_t ∈ L^∞ * L¹ ⊂ C_b(R^d).

'⇒': We show that p_t(dy) ≪ dy. Let N ∈ B(R^d) be a Lebesgue null set, λ^d(N) = 0, and g ∈ B_b(R^d). Then, by the Fubini–Tonelli theorem,

  ∫ g(x) P_t 1_N(x) dx = ∫∫ g(x) 1_N(x+y) p_t(dy) dx = ∫ [ ∫ g(x) 1_N(x+y) dx ] p_t(dy) = 0,

since the inner integral vanishes. Take g = P_t 1_N; then the above calculation shows

  ∫ (P_t 1_N(x))² dx = 0.

Hence, P_t 1_N = 0 Lebesgue-a.e. By the strong Feller property, P_t 1_N is continuous, and so P_t 1_N ≡ 0, hence

  p_t(N) = P_t 1_N(0) = 0.

Remark 4.10. The existence and smoothness of densities for a Lévy process are time-dependent properties, cf. Sato [51, Chapter V.23]. The typical example is the Gamma process. This is a (one-dimensional) Lévy process with characteristic exponent

  ψ(ξ) = (1/2) log(1 + |ξ|²) − i arctan ξ,  ξ ∈ R,

and this process has the transition density

  p_t(x) = (1/Γ(t)) x^{t−1} e^{−x} 1_{(0,∞)}(x),  t > 0.

The factor x^{t−1} gives a time-dependent condition for the property p_t ∈ L^p(dx). One can show, cf. [30], that

  lim_{|ξ|→∞} Re ψ(ξ) / log(1 + |ξ|²) = ∞ ⟹ ∀t > 0 ∃ p_t ∈ C_∞(R^d).

The converse direction remains true if ψ(ξ) is rotationally invariant or if it is replaced by its symmetric rearrangement.

Remark 4.11. If (P_t)_{t≥0} is a Feller semigroup, i.e. a semigroup satisfying the conditions 4.7.a)–g), then there exists a unique stochastic process (a Feller process) with (P_t)_{t≥0} as transition semigroup. The idea is to use Kolmogorov's consistency theorem for the following family of finite-dimensional distributions

  p^x_{t_1,...,t_n}(B_1 × ... × B_n) = P_{t_1}( 1_{B_1} P_{t_2−t_1}( 1_{B_2} P_{t_3−t_2}( ... P_{t_n−t_{n−1}}(1_{B_n}) ) ) )(x).

Here X_{t_0} = X_0 = x a.s. Note: it is not enough to have a semigroup on L^p as we need pointwise evaluations.

If the operators P_t are not a priori given on B_b(R^d) but only on C_∞(R^d), one can still use the Riesz representation theorem to construct Markov kernels p_t(x,B) representing and extending P_t onto B_b(R^d), cf. Lemma 5.2.


Recall that a stopping time is a random time τ : Ω → [0,∞] such that {τ ≤ t} ∈ F_t for all t ≥ 0. It is not hard to see that τ_n := (⌊2^n τ⌋ + 1) 2^{−n}, n ∈ N, is a sequence of stopping times with values k2^{−n}, k = 1,2,..., such that

  τ_1 ≥ τ_2 ≥ ... ≥ τ_n ↓ τ = inf_{n∈N} τ_n.

This approximation is the key ingredient to extend the Markov property (Theorem 4.6) to random times.
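The dyadic discretization τ_n = (⌊2ⁿτ⌋ + 1) 2^{−n} is easy to check in code; the sketch below (Python; the fixed value of τ is an arbitrary choice) confirms that the τ_n decrease to τ from strictly above through dyadic values:

```python
import math

tau = 0.7231                      # a fixed realization of the stopping time

def tau_n(n):
    """Dyadic upper approximation (floor(2^n * tau) + 1) * 2^(-n)."""
    return (math.floor(2**n * tau) + 1) / 2**n

approx = [tau_n(n) for n in range(1, 21)]
print(approx[:4])                                   # decreasing dyadic values > tau
print(all(a >= b > tau for a, b in zip(approx, approx[1:])))
```

Since τ_n − τ ≤ 2^{−n}, the approximation error halves at each step.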

Theorem 4.12 (Strong Markov property for Lévy processes). Let X be a Lévy process on R^d and set Y := (X_{t+τ} − X_τ)_{t≥0} for some a.s. finite stopping time τ. Then Y is again a Lévy process satisfying

a) Y ⊥⊥ (X_r)_{r≤τ}, i.e. F_∞^Y ⊥⊥ F_{τ+}^X := { F ∈ F_∞^X : F ∩ {τ < t} ∈ F_t^X ∀t ≥ 0 }.

b) Y ∼ X, i.e. X and Y have the same finite-dimensional distributions.

Proof. Let τ_n := (⌊2^n τ⌋ + 1) 2^{−n}. For all 0 ≤ s < t, ξ ∈ R^d and F ∈ F_{τ+}^X we find by the right-continuity of the sample paths (or by the continuity in probability (L3))

  E[ e^{iξ·(X_{t+τ}−X_{s+τ})} 1_F ]
    = lim_{n→∞} E[ e^{iξ·(X_{t+τ_n}−X_{s+τ_n})} 1_F ]
    = lim_{n→∞} ∑_{k=1}^∞ E[ e^{iξ·(X_{t+k2^{−n}}−X_{s+k2^{−n}})} 1_{{τ_n=k2^{−n}}} 1_F ]
    = lim_{n→∞} ∑_{k=1}^∞ E[ e^{iξ·(X_{t+k2^{−n}}−X_{s+k2^{−n}})} 1_{{(k−1)2^{−n}≤τ<k2^{−n}}} 1_F ]
    = lim_{n→∞} ∑_{k=1}^∞ E[ e^{iξ·X_{t−s}} ] P( {(k−1)2^{−n} ≤ τ < k2^{−n}} ∩ F )
    = E[ e^{iξ·X_{t−s}} ] P(F);

here e^{iξ·(X_{t+k2^{−n}}−X_{s+k2^{−n}})} ⊥⊥ F_{k2^{−n}}^X by (L2), while 1_{{(k−1)2^{−n}≤τ<k2^{−n}}} 1_F ∈ F_{k2^{−n}}^X as F ∈ F_{τ+}^X. In the last equality we use ⋃_{k=1}^∞ {(k−1)2^{−n} ≤ τ < k2^{−n}} = {τ < ∞} for all n ≥ 1.

The same calculation applies to finitely many increments. Let F ∈ F_{τ+}^X, t_0 = 0 < t_1 < ... < t_n and ξ_1,...,ξ_n ∈ R^d. Then

  E[ e^{i ∑_{k=1}^n ξ_k·(X_{t_k+τ}−X_{t_{k−1}+τ})} 1_F ] = ∏_{k=1}^n E[ e^{iξ_k·X_{t_k−t_{k−1}}} ] P(F).

This shows that the increments X_{t_k+τ} − X_{t_{k−1}+τ} are independent and distributed like X_{t_k−t_{k−1}}. Moreover, all increments are independent of F ∈ F_{τ+}^X.

Therefore, all random vectors of the form (X_{t_1+τ} − X_τ, ..., X_{t_n+τ} − X_{t_{n−1}+τ}) are independent of F_{τ+}^X, and we conclude that F_∞^Y = σ(X_{t+τ} − X_τ, t ≥ 0) ⊥⊥ F_{τ+}^X.

5. A digression: semigroups

We have seen that the Markov kernel p_t(x,B) of a Lévy or Markov process induces a semigroup of linear operators (P_t)_{t≥0}. In this chapter we collect a few tools from functional analysis for the study of operator semigroups. By B_b(R^d) we denote the bounded Borel functions f : R^d → R, and C_∞(R^d) are the continuous functions vanishing at infinity, i.e. lim_{|x|→∞} f(x) = 0; when equipped with the uniform norm ‖·‖_∞ both sets become Banach spaces.

Definition 5.1. A Feller semigroup is a family of linear operators P_t : B_b(R^d) → B_b(R^d) satisfying the properties a)–g) of Definition 4.7: (P_t)_{t≥0} is a semigroup of conservative, sub-Markovian operators which enjoy the Feller property P_t(C_∞(R^d)) ⊂ C_∞(R^d) and which are strongly continuous on C_∞(R^d).

Notice that (t,x) ↦ P_t f(x) is continuous for every f ∈ C_∞(R^d). This follows from

  |P_t f(x) − P_s f(y)| ≤ |P_t f(x) − P_t f(y)| + |P_t f(y) − P_s f(y)|
    ≤ |P_t f(x) − P_t f(y)| + ‖P_{|t−s|} f − f‖_∞,

the Feller property 4.7.f) and the strong continuity 4.7.g).

Lemma 5.2. If (P_t)_{t≥0} is a Feller semigroup, then there is a Markov transition function p_t(x,dy) (Definition 4.1) such that P_t f(x) = ∫ f(y) p_t(x,dy).

Proof. By the Riesz representation theorem we see that the operators P_t are of the form

  P_t f(x) = ∫ f(y) p_t(x,dy)

where p_t(x,dy) is a Markov kernel. The tricky part is to show the joint measurability of the transition function (t,x) ↦ p_t(x,B) and the Chapman–Kolmogorov identities (4.2).

For every compact set K ⊂ R^d the functions defined by

  f_n(x) := d(x, U_n^c) / ( d(x,K) + d(x, U_n^c) ),  d(x,A) := inf_{a∈A} |x−a|,  U_n := {y : d(y,K) < 1/n},

are in C_∞(R^d) and f_n ↓ 1_K. By monotone convergence, p_t(x,K) = inf_{n∈N} P_t f_n(x), which proves the joint measurability in (t,x) for all compact sets.

By the same argument, the semigroup property P_{t+s} f_n = P_s P_t f_n entails the Chapman–Kolmogorov identities for compact sets: p_{t+s}(x,K) = ∫ p_t(y,K) p_s(x,dy). Since

  D := { B ∈ B(R^d) | (t,x) ↦ p_t(x,B) is measurable & p_{t+s}(x,B) = ∫ p_t(y,B) p_s(x,dy) }

is a Dynkin system containing the compact sets, we have D = B(R^d).

To get an intuition for semigroups it is a good idea to view the semigroup property

  P_{t+s} = P_s P_t and P_0 = id

as an operator-valued Cauchy functional equation. If t ↦ P_t is continuous (in a suitable sense), the unique solution will be of the form P_t = e^{tA} for some operator A. This can easily be made rigorous for matrices A, P_t ∈ R^{n×n} since the matrix exponential is well defined by the uniformly convergent series

  P_t = exp(tA) := ∑_{k=0}^∞ t^k A^k / k! and A = (d/dt) P_t |_{t=0}

with A⁰ := id and A^k = A ∘ A ∘ ... ∘ A (k times). With a bit more care, this can be made to work also in general settings.
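In the matrix case the semigroup property and the derivative at t = 0 can be verified directly. The sketch below (Python with NumPy; the matrix A is an arbitrary example, and the truncated power series stands in for the full sum) checks exp((t+s)A) = exp(tA) exp(sA) and recovers A from the difference quotient:

```python
import numpy as np

def expm(M, terms=60):
    """Matrix exponential via the (truncated) power series sum of M^k / k!."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

A = np.array([[-1.0, 0.5], [0.3, -0.7]])
s, t = 0.4, 1.1

P_s, P_t, P_st = expm(s * A), expm(t * A), expm((t + s) * A)
print(np.allclose(P_st, P_t @ P_s))          # semigroup property: True

h = 1e-6
print(np.allclose((expm(h * A) - np.eye(2)) / h, A, atol=1e-5))  # generator: True
```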

Definition 5.3. Let (P_t)_{t≥0} be a Feller semigroup. The (infinitesimal) generator is a linear operator defined by

  D(A) := { f ∈ C_∞(R^d) | ∃g ∈ C_∞(R^d) : lim_{t→0} ‖ (P_t f − f)/t − g ‖_∞ = 0 },   (5.1)

  A f := lim_{t→0} (P_t f − f)/t,  f ∈ D(A).   (5.2)

The following lemma is the rigorous version of the symbolic notation 'P_t = e^{tA}'.

Lemma 5.4. Let (P_t)_{t≥0} be a Feller semigroup with infinitesimal generator (A, D(A)). Then P_t(D(A)) ⊂ D(A) and

  (d/dt) P_t f = A P_t f = P_t A f for all f ∈ D(A), t ≥ 0.   (5.3)

Moreover, ∫_0^t P_s f ds ∈ D(A) for any f ∈ C_∞(R^d), and

  P_t f − f = A ∫_0^t P_s f ds,  f ∈ C_∞(R^d), t ≥ 0,   (5.4)

  P_t f − f = ∫_0^t P_s A f ds,  f ∈ D(A), t ≥ 0.   (5.5)

Proof. Let 0 < ε < t and f ∈ D(A). The semigroup and contraction properties give

  ‖ (P_t f − P_{t−ε} f)/ε − P_t A f ‖_∞
    ≤ ‖ P_{t−ε}( (P_ε f − f)/ε ) − P_{t−ε} A f ‖_∞ + ‖ P_{t−ε} A f − P_{t−ε} P_ε A f ‖_∞
    ≤ ‖ (P_ε f − f)/ε − A f ‖_∞ + ‖ A f − P_ε A f ‖_∞ → 0 as ε → 0,

where we use the strong continuity in the last step. This shows (d⁻/dt) P_t f = A P_t f = P_t A f; a similar (but simpler) calculation proves this also for (d⁺/dt) P_t f.


Let f ∈ C_∞(R^d) and t, ε > 0. By Fubini's theorem and the representation of P_t with a Markov transition function (Lemma 5.2) we get

  P_ε ∫_0^t P_s f(x) ds = ∫_0^t P_ε P_s f(x) ds,

and so,

  (P_ε − id)/ε ∫_0^t P_s f(x) ds = (1/ε) ∫_0^t ( P_{s+ε} f(x) − P_s f(x) ) ds
    = (1/ε) ∫_t^{t+ε} P_s f(x) ds − (1/ε) ∫_0^ε P_s f(x) ds.

Since t ↦ P_t f(x) is continuous, the fundamental theorem of calculus applies, and we get

  lim_{ε→0} (1/ε) ∫_r^{r+ε} P_s f(x) ds = P_r f(x)

for r ≥ 0. This shows that ∫_0^t P_s f ds ∈ D(A) as well as (5.4). If f ∈ D(A), then we deduce (5.5) from

  ∫_0^t P_s A f(x) ds = ∫_0^t (d/ds) P_s f(x) ds = P_t f(x) − f(x) = A ∫_0^t P_s f(x) ds,

using (5.3) in the first and (5.4) in the last equality.

Remark 5.5 (Consequences of Lemma 5.4). Write C_∞ := C_∞(R^d).

a) (5.4) shows that D(A) is dense in C_∞, since D(A) ∋ t^{−1} ∫_0^t P_s f ds → f as t → 0 for any f ∈ C_∞.

b) (5.5) shows that A is a closed operator, i.e.

  f_n ∈ D(A), (f_n, A f_n) → (f,g) ∈ C_∞ × C_∞ uniformly ⟹ f ∈ D(A) & A f = g.

c) (5.3) means that A determines (P_t)_{t≥0} uniquely.

Let us now consider the Laplace transform of (Pt)t>0.

Definition 5.6. Let (P_t)_{t≥0} be a Feller semigroup. The resolvent is a linear operator on B_b(R^d) given by

  R_λ f(x) := ∫_0^∞ e^{−λt} P_t f(x) dt,  f ∈ B_b(R^d), x ∈ R^d, λ > 0.   (5.6)

The following formal calculation can easily be made rigorous. Let (λ − A) := (λ id − A) for λ > 0 and f ∈ D(A). Then, by (5.4) and (5.5),

  (λ − A) R_λ f = (λ − A) ∫_0^∞ e^{−λt} P_t f dt = ∫_0^∞ e^{−λt} (λ − A) P_t f dt
    = λ ∫_0^∞ e^{−λt} P_t f dt − ∫_0^∞ e^{−λt} ( (d/dt) P_t f ) dt
    = λ ∫_0^∞ e^{−λt} P_t f dt − λ ∫_0^∞ e^{−λt} P_t f dt − [ e^{−λt} P_t f ]_{t=0}^∞ = f,

where the penultimate equality is an integration by parts. A similar calculation for R_λ(λ − A) gives


Theorem 5.7. Let (A, D(A)) and (R_λ)_{λ>0} be the generator and the resolvent of a Feller semigroup. Then

  R_λ = (λ − A)^{−1} for all λ > 0.
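For a matrix standing in for the generator, the identity R_λ = (λ − A)^{−1} can be checked numerically: the sketch below (Python with NumPy; the matrix A, λ and the integration grid are arbitrary choices, not from the notes) approximates the Laplace transform ∫_0^∞ e^{−λt} e^{tA} dt by a Riemann sum and compares it with (λ id − A)^{−1}:

```python
import numpy as np

A = np.array([[-2.0, 1.0], [0.5, -1.5]])          # stands in for the generator
lam = 1.3

w, V = np.linalg.eig(A)                           # A = V diag(w) V^{-1}
Vinv = np.linalg.inv(V)

def semigroup(t):
    """P_t = e^{tA} via the eigendecomposition of A."""
    return (V * np.exp(t * w)) @ Vinv

# Riemann-sum approximation of R_lam = integral of e^{-lam*t} P_t dt over [0, T]
dt, T = 0.01, 40.0
R = sum(np.exp(-lam * t) * semigroup(t) * dt for t in np.arange(0.0, T, dt))

print(np.allclose(R, np.linalg.inv(lam * np.eye(2) - A), atol=0.02))  # True
```

The eigenvalues of this A are negative, so the integrand decays and the truncation at T is harmless.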

Since R_λ is the Laplace transform of (P_t)_{t≥0}, the properties of (R_λ)_{λ>0} can be found from (P_t)_{t≥0} and vice versa. With some effort one can even invert the (operator-valued) Laplace transform, which leads to the familiar expression for e^x:

  ( (n/t) R_{n/t} )^n = ( id − (t/n) A )^{−n} → e^{tA} = P_t strongly as n → ∞   (5.7)

(the notation e^{tA} = P_t is, for unbounded operators A, formal), see Pazy [41, Chapter 1.8].
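For a matrix generator, (5.7) is just the operator version of e^x = lim_{n→∞} (1 − x/n)^{−n} and can be tested directly (Python with NumPy; the matrix A and t are arbitrary examples):

```python
import numpy as np

A = np.array([[-1.0, 0.8], [0.2, -0.6]])
t = 1.5

w, V = np.linalg.eig(A)
P_t = (V * np.exp(t * w)) @ np.linalg.inv(V)       # reference value e^{tA}

errors = []
for n in (10, 100, 1000):
    # (id - (t/n)A)^{-n}, the n-fold implicit Euler step
    approx = np.linalg.matrix_power(np.linalg.inv(np.eye(2) - (t / n) * A), n)
    errors.append(np.abs(approx - P_t).max())
print(errors)                                      # errors shrink roughly like 1/n
```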

Lemma 5.8. Let (R_λ)_{λ>0} be the resolvent of a Feller¹ semigroup (P_t)_{t≥0}. Then

  (d^n/dλ^n) R_λ = n! (−1)^n R_λ^{n+1},  n ∈ N_0.   (5.8)

Proof. Using a symmetry argument we see

  t^n = ∫_0^t ... ∫_0^t dt_1 ... dt_n = n! ∫_0^t ∫_0^{t_n} ... ∫_0^{t_2} dt_1 ... dt_n.

Let f ∈ C_∞(R^d) and x ∈ R^d. Then

  (−1)^n (d^n/dλ^n) R_λ f(x) = ∫_0^∞ (−1)^n (d^n/dλ^n) e^{−λt} P_t f(x) dt = ∫_0^∞ t^n e^{−λt} P_t f(x) dt
    = n! ∫_0^∞ ∫_0^t ∫_0^{t_n} ... ∫_0^{t_2} e^{−λt} P_t f(x) dt_1 ... dt_n dt
    = n! ∫_0^∞ ∫_{t_1}^∞ ... ∫_{t_{n−1}}^∞ ∫_{t_n}^∞ e^{−λt} P_t f(x) dt dt_n ... dt_1
    = n! ∫_0^∞ ∫_0^∞ ... ∫_0^∞ e^{−λ(t+t_1+...+t_n)} P_{t+t_1+...+t_n} f(x) dt dt_1 ... dt_n
    = n! R_λ^{n+1} f(x);

in the penultimate step we shift the variables of integration, and in the last step we use the semigroup property P_{t+t_1+...+t_n} = P_t P_{t_1} ... P_{t_n} together with the definition (5.6) of the resolvent.

The key result identifying the generators of Feller semigroups is the following theorem due to Hille, Yosida and Ray; a proof can be found in Pazy [41, Chapter 1.4] or Ethier & Kurtz [17, Chapter 4.2]; a probabilistic approach is due to Itô [25].

Theorem 5.9 (Hille–Yosida–Ray). A linear operator (A, D(A)) on C_∞(R^d) generates a Feller semigroup (P_t)_{t≥0} if, and only if,

a) D(A) ⊂ C_∞(R^d) is dense.

b) A is dissipative, i.e. ‖λf − Af‖_∞ ≥ λ‖f‖_∞ for some (or all) λ > 0.

c) (λ − A)(D(A)) = C_∞(R^d) for some (or all) λ > 0.

¹This lemma only needs that the operators P_t are strongly continuous and contractive, Definition 4.7.g), d).


d) A satisfies the positive maximum principle:

  f ∈ D(A), f(x_0) = sup_{x∈R^d} f(x) ≥ 0 ⟹ A f(x_0) ≤ 0.   (PMP)

This variant of the Hille–Yosida theorem is not the standard version from functional analysis since we are interested in positivity preserving (sub-Markov) semigroups. Let us briefly discuss the role of the positive maximum principle.

Remark 5.10. Let (P_t)_{t≥0} be a strongly continuous contraction semigroup on C_∞(R^d), i.e.

  ‖P_t f‖_∞ ≤ ‖f‖_∞ and lim_{t→0} ‖P_t f − f‖_∞ = 0,

cf. Definition 4.7.d), g).²

1° Sub-Markov ⟹ (PMP). Assume that f ∈ D(A) is such that f(x_0) = sup f ≥ 0. Then, using f ≤ f⁺ and the sub-Markov property,

  P_t f(x_0) − f(x_0) ≤ P_t f⁺(x_0) − f⁺(x_0) ≤ ‖f⁺‖_∞ − f⁺(x_0) = 0

  ⟹ A f(x_0) = lim_{t→0} ( P_t f(x_0) − f(x_0) ) / t ≤ 0.

Thus, (PMP) holds.

2° (PMP) ⟹ dissipativity. Assume that (PMP) holds and let f ∈ D(A). Since f ∈ C_∞(R^d), we may assume that f(x_0) = |f(x_0)| = sup|f| (otherwise f ⇝ −f). Then, as A f(x_0) ≤ 0,

  ‖λf − Af‖_∞ ≥ λ f(x_0) − A f(x_0) ≥ λ f(x_0) = λ‖f‖_∞.

3° (PMP) ⟹ sub-Markov. Since P_t is contractive, we have P_t f(x) ≤ ‖P_t f‖_∞ ≤ ‖f‖_∞ ≤ 1 for all f ∈ C_∞(R^d) such that |f| ≤ 1. In order to see positivity, let f ∈ C_∞(R^d) be non-negative. We distinguish between two cases:

Case 1: R_λ f does not attain its infimum. Since R_λ f ∈ C_∞(R^d) vanishes at infinity, we have necessarily R_λ f ≥ 0.

Case 2: ∃x_0 : R_λ f(x_0) = inf R_λ f. Because of the (PMP) we find

  λ R_λ f(x_0) − f(x_0) = A R_λ f(x_0) ≥ 0

  ⟹ λ R_λ f(x) ≥ inf λ R_λ f = λ R_λ f(x_0) ≥ f(x_0) ≥ 0.

This proves that f ≥ 0 ⟹ λ R_λ f ≥ 0. From (5.8) we see that λ ↦ R_λ f(x) is completely monotone, hence it is the Laplace transform of a positive measure. Since R_λ f(x) has the integral representation (5.6), we conclude that P_t f(x) ≥ 0 (for all t ≥ 0 as t ↦ P_t f is continuous).

Using the Riesz representation theorem (as in Lemma 5.2) we can extend P_t as a sub-Markov operator onto B_b(R^d).

2These properties are essential for the existence of a generator and the resolvent on C∞(Rd).


In order to determine the domain D(A) of the generator, the following 'maximal dissipativity' result is handy.

Lemma 5.11 (Dynkin, Reuter). Assume that (A, D(A)) generates a Feller semigroup and that (Ã, D(Ã)) extends A, i.e. D(A) ⊂ D(Ã) and Ã|_{D(A)} = A. If

  u ∈ D(Ã), u − Ãu = 0 ⟹ u = 0,   (5.9)

then (A, D(A)) = (Ã, D(Ã)).

Proof. Since A is a generator, (id − A) : D(A) → C_∞(R^d) is bijective. On the other hand, the relation (5.9) means that (id − Ã) is injective; but a bijective map cannot have a proper injective extension, and so D(Ã) = D(A).

Theorem 5.12. Let (P_t)_{t≥0} be a Feller semigroup with generator (A, D(A)). Then

  D(A) = { f ∈ C_∞(R^d) | ∃g ∈ C_∞(R^d) ∀x : lim_{t→0} (P_t f(x) − f(x))/t = g(x) }.   (5.10)

Proof. Denote by D(Ã) the right-hand side of (5.10) and define

  Ã f(x) := lim_{t→0} (P_t f(x) − f(x))/t for all f ∈ D(Ã), x ∈ R^d.

Obviously, (Ã, D(Ã)) is a linear operator which extends (A, D(A)). Since (PMP) is, essentially, a pointwise assertion (see Remark 5.10, 1°), Ã inherits (PMP); in particular, Ã is dissipative (see Remark 5.10, 2°):

  ‖Ã f − λ f‖_∞ ≥ λ‖f‖_∞.

This implies (5.9), and the claim follows from Lemma 5.11.

6. The generator of a Lévy process

We want to study the structure of the generator of (the semigroup corresponding to) a Lévy process X = (X_t)_{t≥0}. This will also lead to a proof of the Lévy–Khintchine formula.

Our approach uses some Fourier analysis. We denote by C_c^∞(R^d) and S(R^d) the smooth, compactly supported functions and the smooth, rapidly decreasing 'Schwartz functions'.¹ The Fourier transform is denoted by

  f̂(ξ) = F f(ξ) := (2π)^{−d} ∫_{R^d} f(x) e^{−iξ·x} dx,  f ∈ L¹(dx).

Observe that F f is chosen in such a way that the characteristic function becomes the inverse Fourier transform.

We have seen in Proposition 2.3 and its Corollaries 2.4 and 2.5 that X is completely characterized by the characteristic exponent ψ: R^d → C,

E e^{iξ·X_t} = [E e^{iξ·X_1}]^t = e^{−tψ(ξ)},  t ≥ 0, ξ ∈ R^d.

We need a few more properties of ψ which result from the fact that χ(ξ) = e^{−ψ(ξ)} is a characteristic function.

Lemma 6.1. Let χ(ξ) be any characteristic function of a probability measure µ. Then

|χ(ξ+η) − χ(ξ)χ(η)|² ≤ (1 − |χ(ξ)|²)(1 − |χ(η)|²),  ξ, η ∈ R^d. (6.1)

Proof. Since µ is a probability measure, we find from the definition of χ

χ(ξ+η) − χ(ξ)χ(η) = ∫∫ (e^{ix·ξ} e^{ix·η} − e^{ix·ξ} e^{iy·η}) µ(dx) µ(dy)
  = ½ ∫∫ (e^{ix·ξ} − e^{iy·ξ})(e^{ix·η} − e^{iy·η}) µ(dx) µ(dy).

In the last equality we use that the integrand is symmetric in x and y, which allows us to interchange the variables. Using the elementary formula |e^{ia} − e^{ib}|² = 2 − 2cos(b − a) and the

¹To be precise, f ∈ S(R^d) if f ∈ C^∞(R^d) and sup_{x∈R^d} (1 + |x|^N)|∂^α f(x)| ≤ c_{N,α} for all N ∈ N₀ and all multi-indices α ∈ N₀^d.



Cauchy–Schwarz inequality yield

|χ(ξ+η) − χ(ξ)χ(η)|
  ≤ ½ ∫∫ |e^{ix·ξ} − e^{iy·ξ}| · |e^{ix·η} − e^{iy·η}| µ(dx) µ(dy)
  = ∫∫ √(1 − cos((y−x)·ξ)) √(1 − cos((y−x)·η)) µ(dx) µ(dy)
  ≤ √(∫∫ (1 − cos((y−x)·ξ)) µ(dx) µ(dy)) · √(∫∫ (1 − cos((y−x)·η)) µ(dx) µ(dy)).

This finishes the proof, as

∫∫ cos((y−x)·ξ) µ(dx) µ(dy) = Re[∫ e^{iy·ξ} µ(dy) ∫ e^{−ix·ξ} µ(dx)] = |χ(ξ)|².

Theorem 6.2. Let ψ: R^d → C be the characteristic exponent of a Lévy process. Then the function ξ ↦ √|ψ(ξ)| is subadditive and

|ψ(ξ)| ≤ c_ψ (1 + |ξ|²),  ξ ∈ R^d. (6.2)

Proof. We use (6.1) with χ = e^{−tψ}, divide by t > 0 and let t → 0. Since |χ| = e^{−t Re ψ}, this gives

|ψ(ξ+η) − ψ(ξ) − ψ(η)|² ≤ 4 Re ψ(ξ) Re ψ(η) ≤ 4|ψ(ξ)| · |ψ(η)|.

By the lower triangle inequality,

|ψ(ξ+η)| − |ψ(ξ)| − |ψ(η)| ≤ 2 √|ψ(ξ)| √|ψ(η)|,

and this is the same as subadditivity: √|ψ(ξ+η)| ≤ √|ψ(ξ)| + √|ψ(η)|.

In particular, |ψ(2ξ)| ≤ 4|ψ(ξ)|. For any ξ ≠ 0 there is some integer n = n(ξ) ∈ Z such that 2^{n−1} ≤ |ξ| ≤ 2^n, so

|ψ(ξ)| = |ψ(2^n 2^{−n}ξ)| ≤ max{1, 2^{2n}} sup_{|η|≤1} |ψ(η)| ≤ 2 sup_{|η|≤1} |ψ(η)| (1 + |ξ|²).
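Both conclusions of Theorem 6.2 can be tried out numerically on a concrete exponent. The sketch below uses ψ(ξ) = |ξ|^α, the exponent of a one-dimensional rotationally symmetric α-stable process; the value of α is an arbitrary choice in (0, 2].

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 1.5                            # stability index, arbitrary choice in (0, 2]
psi = lambda x: np.abs(x) ** alpha     # exponent of a symmetric alpha-stable process

xi = rng.normal(scale=5.0, size=10_000)
eta = rng.normal(scale=5.0, size=10_000)

# Subadditivity: sqrt(|psi(xi+eta)|) <= sqrt(|psi(xi)|) + sqrt(|psi(eta)|).
sub_ok = np.all(np.sqrt(psi(xi + eta)) <= np.sqrt(psi(xi)) + np.sqrt(psi(eta)) + 1e-12)

# Growth bound (6.2) with c_psi = 2 sup_{|eta| <= 1} |psi(eta)| = 2.
growth_ok = np.all(psi(xi) <= 2 * (1 + xi ** 2))
```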

Lemma 6.3. Let (X_t)_{t≥0} be a Lévy process and denote by (A, D(A)) its infinitesimal generator. Then C_c^∞(R^d) ⊂ D(A).

Proof. Let f ∈ C_c^∞(R^d). By definition, P_t f(x) = E f(X_t + x). Using the differentiation lemma for parameter-dependent integrals (e.g. [54, Theorem 11.5] or [55, 12.2]) it is not hard to see that P_t: S(R^d) → S(R^d). Obviously,

e^{−tψ(ξ)} = E e^{iξ·X_t} = E^x e^{iξ·(X_t−x)} = e_{−ξ}(x) P_t e_ξ(x) (6.3)

for e_ξ(x) := e^{iξ·x}. Recall that the Fourier transform of f ∈ S(R^d) is again in S(R^d). From

P_t f = P_t ∫ f̂(ξ) e_ξ(·) dξ = ∫ f̂(ξ) P_t e_ξ(·) dξ  (6.4)
     = ∫ f̂(ξ) e_ξ(·) e^{−tψ(ξ)} dξ  (by (6.3))


we conclude that F(P_t f) = f̂ e^{−tψ}. Hence,

P_t f = F^{−1}(f̂ e^{−tψ}). (6.5)

Consequently,

(F(P_t f) − f̂)/t = ((e^{−tψ} − 1)/t) f̂ −→ −ψ f̂ as t → 0  in S(R^d),

and since f ∈ S(R^d), this yields for every x

(P_t f(x) − f(x))/t −→ g(x) := F^{−1}(−ψ f̂)(x) as t → 0.

Since ψ grows at most polynomially (Theorem 6.2) and f̂ ∈ S(R^d), we see ψ f̂ ∈ L¹(dx) and, by the Riemann–Lebesgue lemma, g ∈ C_∞(R^d). Using Theorem 5.12, it follows that f ∈ D(A).

Definition 6.4. Let L: C_b²(R^d) → C_b(R^d) be a linear operator. Then

L(x,ξ) := e_{−ξ}(x) L_x e_ξ(x) (6.6)

is the symbol of the operator L = L_x, where e_ξ(x) := e^{iξ·x}.

The proof of Lemma 6.3 actually shows that we can recover an operator L from its symbol L(x,ξ) if, say, L: C_b²(R^d) → C_b(R^d) is continuous:² Indeed, for all u ∈ C_c^∞(R^d)

L u(x) = L ∫ û(ξ) e_ξ(x) dξ = ∫ û(ξ) L_x e_ξ(x) dξ = ∫ û(ξ) L(x,ξ) e_ξ(x) dξ = F^{−1}(L(x,·) F u(·))(x).

Example 6.5. A typical example would be the Laplace operator (i.e. the generator of a Brownian motion)

½ Δ f(x) = −½ ((1/i) ∂_x)² f(x) = ∫ f̂(ξ) (−½|ξ|²) e^{iξ·x} dξ,  i.e. L(x,ξ) = −½|ξ|²,

or the fractional Laplacian of order ½α ∈ (0,1), which generates a rotationally symmetric α-stable Lévy process,

−(−Δ)^{α/2} f(x) = ∫ f̂(ξ) (−|ξ|^α) e^{iξ·x} dξ,  i.e. L(x,ξ) = −|ξ|^α.

More generally, if P(x,ξ) is a polynomial in ξ, then the corresponding operator is obtained by replacing ξ by (1/i) ∇_x and formally expanding the powers.
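The Fourier-multiplier formula can be tested numerically. The following sketch (a periodic FFT discretization with ad hoc grid parameters) applies the symbol L(ξ) = −½|ξ|² in Fourier space to a Gaussian and compares the result with the exact value of ½Δf.

```python
import numpy as np

n, L = 1024, 40.0                       # grid size and period (ad hoc choices)
x = (np.arange(n) - n // 2) * (L / n)   # grid on [-L/2, L/2); the Gaussian below
f = np.exp(-x ** 2 / 2)                 # decays so fast that periodization is harmless

# Continuous frequencies attached to the DFT modes.
xi = 2 * np.pi * np.fft.fftfreq(n, d=L / n)

# Apply the symbol -|xi|^2/2 as a Fourier multiplier: F^{-1}(L(xi) F f).
Lf = np.fft.ifft(-0.5 * xi ** 2 * np.fft.fft(f)).real

# Exact value: (1/2) Delta f(x) = ((x^2 - 1)/2) e^{-x^2/2}.
exact = 0.5 * (x ** 2 - 1) * np.exp(-x ** 2 / 2)
err = np.max(np.abs(Lf - exact))
```

Spectral differentiation of a rapidly decaying function is accurate to almost machine precision, so `err` is tiny.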

Definition 6.6. An operator of the form

L(x,D) f(x) = ∫ f̂(ξ) L(x,ξ) e^{ix·ξ} dξ,  f ∈ S(R^d), (6.7)

is called (if defined) a pseudo-differential operator with (non-classical) symbol L(x,ξ).

²As usual, C_b²(R^d) is endowed with the norm ‖u‖_(2) = ∑_{0≤|α|≤2} ‖∂^α u‖_∞.


Remark 6.7. The symbol of a Lévy process does not depend on x, i.e. L(x,ξ) = L(ξ). This is a consequence of the spatial homogeneity of the process, which is encoded in the translation invariance of the semigroup (cf. (4.6) and Lemma 4.4):

P_t f(x) = E f(X_t + x)  ⟹  P_t f(x) = ϑ_x(P_t f)(0) = P_t(ϑ_x f)(0),

where ϑ_x u(y) = u(y + x) is the shift operator. This property is obviously inherited by the generator, i.e.

A f(x) = ϑ_x(A f)(0) = A(ϑ_x f)(0),  f ∈ D(A).

As a matter of fact, the converse is also true: if L: C_c^∞(R^d) → C(R^d) is a linear operator satisfying ϑ_x(L f) = L(ϑ_x f), then L f = f ∗ λ, where λ is a distribution, i.e. a continuous linear functional λ: C_c^∞(R^d) → R, cf. Theorem A.10.

Theorem 6.8. Let (X_t)_{t≥0} be a Lévy process with generator A. Then

A f(x) = l·∇f(x) + ½ ∇·Q∇f(x) + ∫_{y≠0} [f(x+y) − f(x) − ∇f(x)·y 1_{(0,1)}(|y|)] ν(dy) (6.8)

for any f ∈ C_c^∞(R^d), where l ∈ R^d, Q ∈ R^{d×d} is a positive semidefinite matrix, and ν is a measure on R^d \ {0} such that ∫_{y≠0} min{1, |y|²} ν(dy) < ∞.

Equivalently, A is a pseudo-differential operator

A u(x) = −ψ(D) u(x) = −∫ û(ξ) ψ(ξ) e^{ix·ξ} dξ,  u ∈ C_c^∞(R^d), (6.9)

whose symbol is −ψ, the negative of the characteristic exponent of the Lévy process. It is given by the Lévy–Khintchine formula

ψ(ξ) = −i l·ξ + ½ ξ·Qξ + ∫_{y≠0} (1 − e^{iy·ξ} + iξ·y 1_{(0,1)}(|y|)) ν(dy), (6.10)

where the triplet (l, Q, ν) is as above.

I learned the following proof from Francis Hirsch; it is based on arguments by Courrège [13] and Herz [20]. The presentation below follows the version in Böttcher, Schilling & Wang [9, Section 2.3].

Proof. The proof is divided into several steps.

Step 1. We have seen in Lemma 6.3 that C_c^∞(R^d) ⊂ D(A).

Step 2. Set A₀ f := (A f)(0) for f ∈ C_c^∞(R^d). This is a linear functional on C_c^∞(R^d). Observe that

f ∈ C_c^∞(R^d), f ≥ 0, f(0) = 0  ⟹ (by (PMP))  A₀ f ≥ 0.


Step 3. By Step 2, f ↦ A₀₀ f := A₀(|·|² · f) is a positive linear functional on C_c^∞(R^d). Therefore it is bounded. Indeed, let f ∈ C_c^∞(K) for a compact set K ⊂ R^d and let φ ∈ C_c^∞(R^d) be a cut-off function such that 1_K ≤ φ ≤ 1. Then

‖f‖_∞ φ ± f ≥ 0.

By linearity and positivity, ‖f‖_∞ A₀₀φ ± A₀₀ f ≥ 0, which shows |A₀₀ f| ≤ C_K ‖f‖_∞ with the constant C_K = A₀₀φ.

By Riesz' representation theorem, there exists a Radon measure³ µ such that

A₀(|·|² f) = ∫ f(y) µ(dy) = ∫ |y|² f(y) µ(dy)/|y|² = ∫ |y|² f(y) ν(dy),  where ν(dy) := µ(dy)/|y|².

This implies that

A₀ f₀ = ∫_{y≠0} f₀(y) ν(dy)  for all f₀ ∈ C_c^∞(R^d \ {0});

since any compact subset of R^d \ {0} is contained in an annulus B_R(0) \ B_ε(0), we have supp f₀ ∩ B_ε(0) = ∅ for some sufficiently small ε > 0. The measure ν is a Radon measure on R^d \ {0}.

Step 4. Let f, g ∈ C_c^∞(R^d), 0 ≤ f, g ≤ 1, supp f ⊂ B₁(0), supp g ⊂ B₁(0)^c and f(0) = 1. From

sup_{y∈R^d} (‖g‖_∞ f(y) + g(y)) = ‖g‖_∞ = ‖g‖_∞ f(0) + g(0)

and (PMP), it follows that A₀(‖g‖_∞ f + g) ≤ 0. Consequently,

A₀ g ≤ −‖g‖_∞ A₀ f.

If g ↑ 1 − 1_{B₁(0)}, then this shows

∫_{|y|≥1} ν(dy) ≤ −A₀ f < ∞.

Hence, ∫_{y≠0} (|y|² ∧ 1) ν(dy) < ∞.

Step 5. Let f ∈ C_c^∞(R^d) and φ(y) = 1_{(0,1)}(|y|). Define

S₀ f := ∫_{y≠0} [f(y) − f(0) − y·∇f(0) φ(y)] ν(dy). (6.11)

By Taylor's formula, there is some θ ∈ (0,1) such that, for |y| < 1,

f(y) − f(0) − y·∇f(0) φ(y) = ½ ∑_{k,l=1}^d (∂² f(θy)/∂x_k∂x_l) y_k y_l.

³A Radon measure on a topological space E is a Borel measure which is finite on compact subsets of E and regular: for all open sets, µ(U) = sup_{K⊂U} µ(K), and for all Borel sets, µ(B) = inf_{U⊃B} µ(U) (K, U are generic compact and open sets, respectively).


Using the elementary inequality 2 y_k y_l ≤ y_k² + y_l² ≤ |y|², we obtain

|f(y) − f(0) − y·∇f(0) φ(y)| ≤ ¼ ∑_{k,l=1}^d ‖∂² f/∂x_k∂x_l‖_∞ |y|²  if |y| < 1,
|f(y) − f(0) − y·∇f(0) φ(y)| ≤ 2‖f‖_∞  if |y| ≥ 1,

and in both cases the right-hand side is at most 2‖f‖_(2) (|y|² ∧ 1). This means that S₀ defines a distribution (generalized function) of order 2.

Step 6. Set L₀ := A₀ − S₀. Steps 2 and 5 show that A₀ is a distribution of order 2. Moreover,

L₀ f₀ = ∫_{y≠0} [f₀(0) + y·∇f₀(0) φ(y)] ν(dy) = 0

for any f₀ ∈ C_c^∞(R^d) with f₀|_{B_ε(0)} = 0 for some ε > 0. Hence, supp(L₀) ⊂ {0}.

Let us show that L₀ is almost positive (also: 'fast positiv', 'presque positif'):

f₀ ∈ C_c^∞(R^d), f₀(0) = 0, f₀ ≥ 0  ⟹  L₀ f₀ ≥ 0. (PP)

Indeed: pick 0 ≤ φ_n ∈ C_c^∞(R^d \ {0}), φ_n ↑ 1_{R^d\{0}}, and let f₀ be as in (PP). Then, using supp L₀ ⊂ {0},

L₀ f₀ = L₀[(1 − φ_n) f₀]
  = A₀[(1 − φ_n) f₀] − S₀[(1 − φ_n) f₀]
  = A₀[(1 − φ_n) f₀] − ∫_{y≠0} (1 − φ_n(y)) f₀(y) ν(dy)   (since f₀(0) = 0, ∇f₀(0) = 0)
  ≥ −∫ (1 − φ_n(y)) f₀(y) ν(dy)   (by Step 2 and (PMP))
  −→ 0 as n → ∞

by the monotone convergence theorem.

Step 7. As in Step 5 we find with Taylor's formula, for f ∈ C_c^∞(R^d) with supp f ⊂ K and φ ∈ C_c^∞(R^d) satisfying 1_K ≤ φ ≤ 1,

(f(y) − f(0) − ∇f(0)·y) φ(y) ≤ 2‖f‖_(2) |y|² φ(y).

(As usual, ‖f‖_(2) = ∑_{0≤|α|≤2} ‖∂^α f‖_∞.) Therefore,

2‖f‖_(2) |y|² φ(y) + f(0) φ(y) + ∇f(0)·y φ(y) − f(y) ≥ 0,

and (PP) implies

L₀ f ≤ f(0) L₀φ + |∇f(0)| L₀(|·|φ) + 2‖f‖_(2) L₀(|·|²φ) ≤ C_K ‖f‖_(2).

Step 8. We have seen in Step 6 that L₀ is of order 2 and supp L₀ ⊂ {0}. Therefore,

L₀ f = ½ ∑_{k,l=1}^d q_{kl} ∂²f(0)/∂x_k∂x_l + ∑_{k=1}^d l_k ∂f(0)/∂x_k − c f(0). (6.12)

We will show that (q_{kl})_{k,l} is positive semidefinite. Set g(y) := (y·ξ)² f(y), where f ∈ C_c^∞(R^d) is such that 1_{B₁(0)} ≤ f ≤ 1. By (PP), L₀ g ≥ 0. It is not difficult to see that this implies

∑_{k,l=1}^d q_{kl} ξ_k ξ_l ≥ 0  for all ξ = (ξ₁, …, ξ_d) ∈ R^d.

Step 9. Since Lévy processes and their semigroups are invariant under translations (cf. Remark 6.7), we get A f(x) = A₀[f(x + ·)]. If we replace f by f(x + ·), we get

A f(x) = −c f(x) + l·∇f(x) + ½ ∇·Q∇f(x) + ∫_{y≠0} [f(x+y) − f(x) − y·∇f(x) 1_{(0,1)}(|y|)] ν(dy). (6.8′)

We will show in the next step that c = 0.

Step 10. So far, we have seen in Steps 5, 7 and 9 that

‖A f‖_∞ ≤ C ‖f‖_(2) = C ∑_{|α|≤2} ‖∂^α f‖_∞,  f ∈ C_c^∞(R^d),

which means that A (has an extension which) is continuous as an operator from C_b²(R^d) to C_b(R^d). Therefore, (A, C_c^∞(R^d)) is a pseudo-differential operator with symbol

−ψ(ξ) = e_{−ξ}(x) A_x e_ξ(x),  e_ξ(x) = e^{iξ·x}.

Inserting e_ξ into (6.8′) proves (6.10) and, as ψ(0) = 0, c = 0.

Remark 6.9. In Step 8 of the proof of Theorem 6.8 one can use (PMP) to show that the coefficient c appearing in (6.8′) is positive. For this, let (f_n)_{n∈N} ⊂ C_c^∞(R^d), f_n ↑ 1 and f_n|_{B₁(0)} = 1. By (PMP), A₀ f_n ≤ 0. Moreover, ∇f_n(0) = 0 and, therefore,

S₀ f_n = −∫ (1 − f_n(y)) ν(dy) −→ 0 as n → ∞.

Consequently,

limsup_{n→∞} L₀ f_n = limsup_{n→∞} (A₀ f_n − S₀ f_n) ≤ 0  ⟹  c ≥ 0.

For Lévy processes we have c = ψ(0) = 0, and this is a consequence of the infinite life-time of the process:

P(X_t ∈ R^d) = P_t 1 = 1  for all t ≥ 0,

and we can use the formula P_t f − f = A ∫₀ᵗ P_s f ds (cf. Lemma 5.4) for f ≡ 1 to show that

c = A1 = 0 ⟺ P_t 1 = 1.

Definition 6.10. A Lévy measure is a Radon measure ν on R^d \ {0} such that ∫_{y≠0} (|y|² ∧ 1) ν(dy) < ∞. A Lévy triplet is a triplet (l, Q, ν) consisting of a vector l ∈ R^d, a positive semidefinite matrix Q ∈ R^{d×d} and a Lévy measure ν.


The proof of Theorem 6.8 incidentally shows that the Lévy triplet defining the exponent (6.10) or the generator (6.8) is unique. The following corollary can easily be checked using the representation (6.8).

Corollary 6.11. Let A be the generator and (P_t)_{t≥0} the semigroup of a Lévy process. Then the Lévy triplet is given by

∫ f₀ dν = A f₀(0) = lim_{t→0} P_t f₀(0)/t  for all f₀ ∈ C_c^∞(R^d \ {0}),

l_k = A φ_k(0) − ∫ y_k [φ(y) − 1_{(0,1)}(|y|)] ν(dy),  k = 1, …, d,

q_{kl} = A(φ_k φ_l)(0) − ∫_{y≠0} φ_k(y) φ_l(y) ν(dy),  k, l = 1, …, d,  (6.13)

where φ ∈ C_c^∞(R^d) satisfies 1_{B₁(0)} ≤ φ ≤ 1 and φ_k(y) := y_k φ(y). In particular, (l, Q, ν) is uniquely determined by A or the characteristic exponent ψ.

We will see an alternative uniqueness proof in Chapter 7.

Remark 6.12. Setting p_t(dy) = P(X_t ∈ dy), we can recast the formula for the Lévy measure as

ν(dy) = lim_{t→0} p_t(dy)/t  (vague limit of measures on the set R^d \ {0}).

Moreover, a direct calculation using the Lévy–Khintchine formula (6.10) gives the following alternative representation for the q_{kl}:

½ ξ·Qξ = lim_{n→∞} ψ(nξ)/n²,  ξ ∈ R^d.
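The vague-limit formula for ν can be illustrated by simulation. In the sketch below, X is a compound Poisson process with intensity λ and jump distribution N(0,1), so that ν = λ·N(0,1); for a small t and a set B bounded away from the origin, the empirical value of p_t(B)/t should be close to ν(B). All parameters are arbitrary choices.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(42)
lam, t, n_samples = 3.0, 0.01, 2_000_000   # intensity, small time, Monte Carlo size

# Simulate X_t = sum of N_t iid N(0,1) jumps, N_t ~ Poisson(lam*t), many times.
counts = rng.poisson(lam * t, size=n_samples)
jumps = rng.normal(size=counts.sum())
owner = np.repeat(np.arange(n_samples), counts)
X = np.bincount(owner, weights=jumps, minlength=n_samples)

# Empirical p_t(B)/t for B = [1, oo) versus nu(B) = lam * P(N(0,1) >= 1).
est = np.mean(X >= 1.0) / t
nu_B = lam * 0.5 * (1 - erf(1 / sqrt(2)))
```

The estimate carries both a Monte Carlo error and an O(t) bias from multi-jump events, so only rough agreement is expected.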

7. Construction of Lévy processes

Our starting point is now the Lévy–Khintchine formula for the characteristic exponent ψ of a Lévy process,

ψ(ξ) = −i l·ξ + ½ ξ·Qξ + ∫_{y≠0} [1 − e^{iy·ξ} + iξ·y 1_{(0,1)}(|y|)] ν(dy), (7.1)

where (l, Q, ν) is a Lévy triplet in the sense of Definition 6.10; a proof of (7.1) is contained in Theorem 6.8, but the exposition below is independent of this proof, see however Remark 7.7 at the end of this chapter.

What will be needed is that a compound Poisson process is a Lévy process with càdlàg paths and characteristic exponent of the form

φ(ξ) = ∫_{y≠0} [1 − e^{iy·ξ}] ρ(dy) (7.2)

(ρ is any finite measure), see Example 3.2.d), where ρ(dy) = λ·µ(dy).

Let ν be a Lévy measure and denote by A_{a,b} = {y : a ≤ |y| < b} an annulus. Since we have ∫_{y≠0} [|y|² ∧ 1] ν(dy) < ∞, the measure ρ(B) := ν(B ∩ A_{a,b}) is a finite measure, and there is a corresponding compound Poisson process. Adding a drift with l = −∫ y ρ(dy) shows that for every exponent

ψ^{a,b}(ξ) = ∫_{a≤|y|<b} [1 − e^{iy·ξ} + iy·ξ] ν(dy) (7.3)

there is some Lévy process X^{a,b} = (X^{a,b}_t)_{t≥0}. In fact,

there is some Lévy process Xa,b = (Xa,bt )t>0. In fact,

Lemma 7.1. Let 0 < a < b 6 ∞ and ψa,b given by (7.3). Then the corresponding Lévy processXa,b is an L2(P)-martingale with càdlàg paths such that

E[Xa,b

t]= 0 and E

[Xa,b

t · (Xa,bt )>

]=

(t∫

a6|y|<bykyl ν(dy)

)k,l.

Proof. Set X_t := X^{a,b}_t, ψ := ψ^{a,b} and F_t := σ(X_r, r ≤ t). Using the differentiation lemma for parameter-dependent integrals we see that ψ is twice continuously differentiable and

∂ψ(0)/∂ξ_k = 0  and  ∂²ψ(0)/∂ξ_k∂ξ_l = ∫_{a≤|y|<b} y_k y_l ν(dy).

Since the characteristic function e^{−tψ(ξ)} is twice continuously differentiable, X_t has first and second moments, cf. Theorem A.2, and these can be obtained by differentiation:

E X_t^{(k)} = (1/i) ∂_{ξ_k} E e^{iξ·X_t} |_{ξ=0} = i t ∂ψ(0)/∂ξ_k = 0



and

E(X_t^{(k)} X_t^{(l)}) = −∂²/∂ξ_k∂ξ_l E e^{iξ·X_t} |_{ξ=0} = t ∂²ψ(0)/∂ξ_k∂ξ_l − t ∂ψ(0)/∂ξ_k · t ∂ψ(0)/∂ξ_l = t ∫_{a≤|y|<b} y_k y_l ν(dy).

The martingale property now follows from the independence of the increments: let s ≤ t; then

E(X_t | F_s) = E(X_t − X_s + X_s | F_s) = E(X_t − X_s | F_s) + X_s = E(X_{t−s}) + X_s = X_s,

using (L2) and (L1).

We will use the processes from Lemma 7.1 as the main building blocks of the Lévy process. For this we need some preparations.

Lemma 7.2. Let (X_t¹)_{t≥0} and (X_t²)_{t≥0} be Lévy processes with characteristic exponents ψ₁ and ψ₂. If X¹ ⊥⊥ X², then X := X¹ + X² is a Lévy process with characteristic exponent ψ = ψ₁ + ψ₂.

Proof. Set F_t^k := σ(X_s^k, s ≤ t) and F_t := σ(F_t¹, F_t²). Since X¹ ⊥⊥ X², we get for F = F₁ ∩ F₂, F_k ∈ F_s^k,

E(e^{iξ·(X_t−X_s)} 1_F) = E(e^{iξ·(X_t¹−X_s¹)} 1_{F₁} · e^{iξ·(X_t²−X_s²)} 1_{F₂})
  = E(e^{iξ·(X_t¹−X_s¹)} 1_{F₁}) E(e^{iξ·(X_t²−X_s²)} 1_{F₂})
  = e^{−(t−s)ψ₁(ξ)} P(F₁) · e^{−(t−s)ψ₂(ξ)} P(F₂)   (by (L2))
  = e^{−(t−s)(ψ₁(ξ)+ψ₂(ξ))} P(F).

As {F₁ ∩ F₂ : F_k ∈ F_s^k} is a ∩-stable generator of F_s, we find

E(e^{iξ·(X_t−X_s)} | F_s) = e^{−(t−s)ψ(ξ)}.

Observe that F_s could be larger than the canonical filtration F_s^X. Therefore, we first condition with respect to F_s^X, i.e. take E(⋯ | F_s^X), and then use Theorem 3.1 to see that X is a Lévy process with characteristic exponent ψ = ψ₁ + ψ₂.

Lemma 7.3. Let (X^n)_{n∈N} be a sequence of Lévy processes with characteristic exponents ψ_n. Assume that X_t^n → X_t in probability for every t ≥ 0. If

either: the convergence is uniform in probability, i.e.

∀ε > 0 ∀t > 0 : lim_{n→∞} P( sup_{s≤t} |X_s^n − X_s| > ε ) = 0,

or: the limiting process X has càdlàg paths,

then X is a Lévy process with characteristic exponent ψ := lim_{n→∞} ψ_n.


Proof. Let 0 = t₀ < t₁ < ⋯ < t_m and ξ₁, …, ξ_m ∈ R^d. Since the X^n are Lévy processes, (L2) and (L1) give

E exp[ i ∑_{k=1}^m ξ_k·(X^n_{t_k} − X^n_{t_{k−1}}) ] = ∏_{k=1}^m E exp[ i ξ_k·X^n_{t_k−t_{k−1}} ]

and, because of convergence in probability, this equality is inherited by the limiting process X. This proves that X has independent (L2′) and stationary (L1) increments.

The condition (L3) follows either from the uniformity of the convergence in probability or from the càdlàg property. Thus, X is a Lévy process. From

lim_{n→∞} E e^{iξ·X_1^n} = E e^{iξ·X_1}

we get that the limit lim_{n→∞} ψ_n = ψ exists.

Lemma 7.4 (Notation of Lemma 7.1). Let (a_n)_{n∈N} be a sequence a₁ > a₂ > ⋯ decreasing to zero and assume that the processes (X^{a_{n+1},a_n})_{n∈N} are independent Lévy processes with characteristic exponents ψ^{a_{n+1},a_n}. Then X := ∑_{n=1}^∞ X^{a_{n+1},a_n} is a Lévy process with exponent ψ := ∑_{n=1}^∞ ψ^{a_{n+1},a_n} and càdlàg paths. Moreover, X is an L²(P)-martingale.

Proof. Lemmas 7.1 and 7.2 show that X^{a_{n+m},a_n} = ∑_{k=1}^m X^{a_{n+k},a_{n+k−1}} is a Lévy process with characteristic exponent ψ^{a_{n+m},a_n} = ∑_{k=1}^m ψ^{a_{n+k},a_{n+k−1}}, and X^{a_{n+m},a_n} is an L²(P)-martingale. By Doob's inequality and Lemma 7.1,

E( sup_{s≤t} |X_s^{a_{n+m},a_n}|² ) ≤ 4 E( |X_t^{a_{n+m},a_n}|² ) = 4t ∫_{a_{n+m}≤|y|<a_n} |y|² ν(dy) −→ 0 as m, n → ∞

by dominated convergence. Hence, the limit X = lim_{n→∞} X^{a_n,a_1} exists (uniformly in t) in L², i.e. X is an L²(P)-martingale; since the convergence is also uniform in probability, Lemma 7.3 shows that X is a Lévy process with exponent ψ = ∑_{n=1}^∞ ψ^{a_{n+1},a_n}. Taking a uniformly convergent subsequence, we also see that the limit inherits the càdlàg property from the approximating Lévy processes X^{a_n,a_1}.

We can now prove the main result of this chapter.

Theorem 7.5. Let (l, Q, ν) be a Lévy triplet and let ψ be given by (7.1). Then there exists a Lévy process X with càdlàg paths and characteristic exponent ψ.

Proof. Because of Lemmas 7.1, 7.2 and 7.4 we can construct X piece by piece.

Step 1. Let (W_t)_{t≥0} be a Brownian motion and set

X_t^c := tl + √Q W_t  and  ψ^c(ξ) := −i l·ξ + ½ ξ·Qξ.

Step 2. Write R^d \ {0} = ⋃_{n=0}^∞ A_n (disjoint union) with A₀ := {|y| ≥ 1} and A_n := {1/(n+1) ≤ |y| < 1/n}; set λ_n := ν(A_n) and µ_n := ν(· ∩ A_n)/ν(A_n).


Step 3. Construct, as in Example 3.2.d), a compound Poisson process comprising the large jumps,

X_t⁰ := X_t^{1,∞}  and  ψ₀(ξ) := ∫_{1≤|y|<∞} [1 − e^{iy·ξ}] ν(dy),

and compensated compound Poisson processes taking account of all small jumps,

X_t^n := X_t^{a_{n+1},a_n},  a_n := 1/n,  and  ψ_n(ξ) := ∫_{A_n} [1 − e^{iy·ξ} + iy·ξ] ν(dy).

We can construct the processes X^n stochastically independent (just choose independent jump time processes and independent iid jump heights when constructing the compound Poisson processes) and independent of the Wiener process W.

Step 4. Setting ψ = ψ₀ + ψ^c + ∑_{n=1}^∞ ψ_n, Lemmas 7.2 and 7.4 prove the theorem. Since all approximating processes have càdlàg paths, this property is inherited by the sums and the limit (Lemma 7.4).
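For a finite Lévy measure no compensation is needed, and the independent pieces of the construction can be sampled directly. The following sketch (all parameters are arbitrary choices) assembles drift, Brownian part and compound Poisson jumps on a time grid, truncating the small-jump limit.

```python
import numpy as np

rng = np.random.default_rng(1)
l, sigma, lam = 0.5, 1.0, 2.0      # drift, sqrt(Q), jump intensity (arbitrary)
T, n = 1.0, 1000
dt = T / n
t_grid = np.linspace(0.0, T, n + 1)

# Continuous part X^c_t = t*l + sigma*W_t, sampled on the grid.
W = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))
Xc = l * t_grid + sigma * W

# Jump part: compound Poisson with rate lam and N(0,1) jump heights
# (here nu = lam * N(0,1) is a finite Levy measure, so no compensation).
n_jumps = rng.poisson(lam * T)
jump_times = np.sort(rng.uniform(0.0, T, n_jumps))
jump_sizes = rng.normal(size=n_jumps)
J = np.array([jump_sizes[jump_times <= s].sum() for s in t_grid])

# By Lemma 7.2, the sum of the independent pieces is again a Levy process.
X = Xc + J
```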

The proof of Theorem 7.5 also implies the following pathwise decomposition of a Lévy process. We write ΔX_t := X_t − X_{t−} for the jump at time t. From the construction we know that

(large jumps)  J_t^{[1,∞)} = ∑_{s≤t} ΔX_s 1_{[1,∞)}(|ΔX_s|)  (7.4)

(small jumps)  J_t^{[1/n,1)} = ∑_{s≤t} ΔX_s 1_{[1/n,1)}(|ΔX_s|)  (7.5)

(compensated small jumps)  J̃_t^{[1/n,1)} = J_t^{[1/n,1)} − E J_t^{[1/n,1)} = ∑_{s≤t} ΔX_s 1_{[1/n,1)}(|ΔX_s|) − t ∫_{1/n≤|y|<1} y ν(dy)  (7.6)

are Lévy processes and J^{[1,∞)} ⊥⊥ J̃^{[1/n,1)}.

Corollary 7.6. Let ψ be a characteristic exponent given by (7.1) and let X be the Lévy process constructed in Theorem 7.5. Then

X_t = √Q W_t + lim_{n→∞} ( ∑_{s≤t} ΔX_s 1_{[1/n,1)}(|ΔX_s|) − t ∫_{1/n≤|y|<1} y ν(dy) ) + tl + ∑_{s≤t} ΔX_s 1_{[1,∞)}(|ΔX_s|),

where √Q W_t is the continuous Gaussian part, the sum of √Q W_t and the compensated small-jump limit is an L²-martingale M_t, the last sum is the pure jump part, and A_t := tl + ∑_{s≤t} ΔX_s 1_{[1,∞)}(|ΔX_s|) is of bounded variation; all appearing processes are independent.

Proof. The decomposition follows directly from the construction in Theorem 7.5. By Lemma 7.4, lim_{n→∞} X^{1/n,1} is an L²(P)-martingale, and since the (independent!) Wiener process W is also an L²(P)-martingale, so is their sum M.


The paths t ↦ A_t(ω) are a.s. of bounded variation since, by construction, on any time interval [0, t] there are N_t⁰(ω) jumps of size ≥ 1. Since N_t⁰(ω) < ∞ a.s., the total variation of A_t(ω) is less than or equal to |l|t + ∑_{s≤t} |ΔX_s| 1_{[1,∞)}(|ΔX_s|) < ∞ a.s.

Remark 7.7. A word of caution: Theorem 7.5 associates with any ψ given by the Lévy–Khintchine formula (7.1) a Lévy process. Unless we know that all characteristic exponents are of this form (this was proved in Theorem 6.8), it does not follow that we have constructed all Lévy processes.

On the other hand, Theorem 7.5 shows that the Lévy triplet determining ψ is unique. Indeed, assume that (l, Q, ν) and (l′, Q′, ν′) are two Lévy triplets which yield the same exponent ψ. Using Theorem 7.5, we can associate with each triplet a Lévy process, X and X′, such that

E e^{iξ·X_t} = e^{−tψ(ξ)} = E e^{iξ·X′_t}.

Thus, X ∼ X′, and so these processes have (in law) the same pathwise decomposition, i.e. the same drift, diffusion and jump behaviour. This, however, means that (l, Q, ν) = (l′, Q′, ν′).

8. Two special Lévy processes

We will now study the structure of the paths of a Lévy process. We begin with two extreme cases: Lévy processes which only grow by jumps of size 1, and Lévy processes with continuous paths.

Throughout this chapter we assume that all paths [0,∞) ∋ t ↦ X_t(ω) are right-continuous with finite left-hand limits (càdlàg). This is a bit stronger than (L3), but it is always possible to construct a càdlàg version of a Lévy process (see the discussion on page 13). This allows us to consider the jumps of the process X,

ΔX_t := X_t − X_{t−} = X_t − lim_{s↑t} X_s.

Theorem 8.1. Let X be a one-dimensional Lévy process which moves only by jumps of size 1. Then X is a Poisson process.

Proof. Set F_t^X := σ(X_s, s ≤ t) and let T₁ = inf{t ≥ 0 : ΔX_t = 1} be the time of the first jump. Since {T₁ > t} = {X_t = 0} ∈ F_t^X, T₁ is a stopping time.

Let T₀ = 0 and T_k = inf{t > T_{k−1} : ΔX_t = 1} be the time of the k-th jump; this is also a stopping time. By the Markov property ((4.4) and Lemma 4.4),

P(T₁ > s + t) = P(T₁ > s, T₁ > s + t)
  = E[ 1_{{T₁>s}} P^{X_s}(T₁ > t) ]
  = E[ 1_{{T₁>s}} P⁰(T₁ > t) ]
  = P(T₁ > s) P(T₁ > t),

where we use that X_s = 0 if T₁ > s (the process hasn't yet moved!) and P = P⁰.

Since t ↦ P(T₁ > t) is right-continuous, P(T₁ > t) = exp[t log P(T₁ > 1)] is the unique solution of this functional equation (Theorem A.1). Thus, the sequence of inter-jump times σ_k := T_k − T_{k−1}, k ∈ N, is an iid sequence of exponential times. This follows immediately from the strong Markov property (Theorem 4.12) for Lévy processes and the observation that

T_{k+1} − T_k = T₁^Y  where  Y_t := X_{t+T_k} − X_{T_k}, t ≥ 0,

and T₁^Y is the first jump time of the process Y.

Obviously, X_t = ∑_{k=1}^∞ 1_{[0,t]}(T_k), T_k = σ₁ + ⋯ + σ_k, and Example 3.2.c) (and Theorem 3.4) show that X is a Poisson process.
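The representation X_t = ∑_k 1_{[0,t]}(T_k) with iid exponential inter-jump times translates directly into a simulation; the sketch below (parameters arbitrary) checks that the resulting count has mean and variance close to λt, as a Poisson random variable should.

```python
import numpy as np

rng = np.random.default_rng(7)
lam, t, n_paths = 4.0, 2.0, 100_000

# Inter-jump times sigma_k ~ Exp(lam); jump times T_k = sigma_1 + ... + sigma_k.
# 40 waiting times per path suffice: P(more than 40 jumps by t) is negligible here.
sigmas = rng.exponential(1.0 / lam, size=(n_paths, 40))
T = np.cumsum(sigmas, axis=1)

# X_t = number of jump times T_k in [0, t].
N_t = (T <= t).sum(axis=1)
mean, var = N_t.mean(), N_t.var()
```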

A Lévy process with uniformly bounded jumps admits moments of all orders.



Lemma 8.2. Let (X_t)_{t≥0} be a Lévy process such that |ΔX_t(ω)| ≤ c for all t ≥ 0 and some constant c > 0. Then E(|X_t|^p) < ∞ for all p > 0.

Proof. Let F_t^X := σ(X_s, s ≤ t) and define the stopping times

τ₀ := 0,  τ_n := inf{ t > τ_{n−1} : |X_t − X_{τ_{n−1}}| ≥ c }.

Since X has càdlàg paths, τ₀ < τ₁ < τ₂ < ⋯. Let us show that τ₁ < ∞ a.s. For fixed t > 0 and n ∈ N we have

P(τ₁ = ∞) ≤ P(τ₁ > nt) ≤ P( |X_{kt} − X_{(k−1)t}| ≤ 2c ∀k = 1, …, n ) = ∏_{k=1}^n P( |X_{kt} − X_{(k−1)t}| ≤ 2c ) = P(|X_t| ≤ 2c)^n,

using (L2), (L2′) and (L1). Letting n → ∞ we see that P(τ₁ = ∞) = 0 if P(|X_t| ≤ 2c) < 1 for some t > 0. (In the alternative case, we have P(|X_t| ≤ 2c) = 1 for all t ≥ 0, which makes the lemma trivial.)

By the strong Markov property (Theorem 4.12), τ_n − τ_{n−1} ∼ τ₁ and τ_n − τ_{n−1} ⊥⊥ F^X_{τ_{n−1}}, i.e. (τ_n − τ_{n−1})_{n∈N} is an iid sequence. Therefore,

E e^{−τ_n} = (E e^{−τ₁})^n = q^n

for some q ∈ [0,1). From the very definition of the stopping times we infer

|X_{t∧τ_n}| ≤ ∑_{k=1}^n |X_{τ_k} − X_{τ_{k−1}}| ≤ ∑_{k=1}^n ( |ΔX_{τ_k}| + |X_{τ_k−} − X_{τ_{k−1}}| ) ≤ 2nc,

since both |ΔX_{τ_k}| ≤ c and |X_{τ_k−} − X_{τ_{k−1}}| ≤ c. Thus, |X_t| > 2nc implies that τ_n < t, and by Markov's inequality

P(|X_t| > 2nc) ≤ P(τ_n < t) ≤ e^t E e^{−τ_n} = e^t q^n.

Finally,

E(|X_t|^p) = ∑_{n=0}^∞ E( |X_t|^p 1_{{2nc<|X_t|≤2(n+1)c}} ) ≤ (2c)^p ∑_{n=0}^∞ (n+1)^p P(|X_t| > 2nc) ≤ (2c)^p e^t ∑_{n=0}^∞ (n+1)^p q^n < ∞.

Recall that a Brownian motion (W_t)_{t≥0} on R^d is a Lévy process such that W_t is a normal random variable with mean 0 and covariance matrix t·id. We will need Paul Lévy's characterization of Brownian motion, which we state without proof. An elementary proof can be found in [56, Chapter 9.4].

Theorem 8.3 (Lévy). Let M = (M_t, F_t)_{t≥0}, M₀ = 0, be a one-dimensional martingale with continuous sample paths such that (M_t² − t, F_t)_{t≥0} is also a martingale. Then M is a one-dimensional standard Brownian motion.


Theorem 8.4. Let (X_t)_{t≥0} be a Lévy process in R^d whose sample paths are a.s. continuous. Then X_t ∼ tl + √Q W_t, where l ∈ R^d, Q is a positive semidefinite symmetric matrix, and W is a standard Brownian motion in R^d.

We will give two proofs of this result.

Proof (using Theorem 8.3). By Lemma 8.2, the process (X_t)_{t≥0} has moments of all orders. Therefore, M_t := M_t^ξ := ξ·(X_t − E X_t) exists for any ξ ∈ R^d and is a martingale for the canonical filtration F_t := σ(X_s, s ≤ t). Indeed, for all s ≤ t,

E(M_t | F_s) = E(M_t − M_s | F_s) + M_s = E M_{t−s} + M_s = M_s,

using (L2) and (L1). Moreover,

E(M_t² − M_s² | F_s) = E( (M_t − M_s)² + 2M_s(M_t − M_s) | F_s )
  = E((M_t − M_s)²) + 2M_s E(M_t − M_s)   (by (L2))
  = (t − s) E M₁² = (t − s) V M₁   (by Lemma 3.10),

and so (M_t² − t V M₁)_{t≥0} and (M_t)_{t≥0} are martingales with continuous paths.

Now we can use Theorem 8.3 and deduce that ξ·(X_t − E X_t) is a one-dimensional Brownian motion with variance ξ·Qξ, where tQ is the covariance matrix of X_t (cf. the proof of Lemma 7.1 or Lemma 3.10). Thus, X_t − E X_t = √Q W_t, where W is a d-dimensional standard Brownian motion. Finally, E X_t = t E X₁ =: tl.

Standard proof (using the CLT). Fix ξ ∈ R^d and set M(t) := ξ·(X_t − E X_t). Since X has moments of all orders, M is well-defined and it is again a Lévy process. Moreover,

E M(t) = 0  and  tσ² := V M(t) = E[(ξ·(X_t − E X_t))²] = t ξ·Qξ,

where tQ is the covariance matrix of X_t, cf. the proof of Lemma 7.1. We proceed as in the proof of the CLT: using a Taylor expansion we get

E e^{iM(t)} = E e^{i ∑_{k=1}^n [M(tk/n) − M(t(k−1)/n)]} = (E e^{iM(t/n)})^n = (1 − ½ E M²(t/n) + R_n)^n,

using (L2′) and (L1). The remainder term R_n is estimated by (1/6) E|M³(t/n)|. If we can show that |R_n| ≤ ε t/n for large n = n(ε) and any ε > 0, we get, because of E M²(t/n) = (t/n)σ²,

E e^{iM(t)} = lim_{n→∞} (1 − ½(σ² + 2ε) t/n)^n = e^{−½(σ²+2ε)t} −→ e^{−½σ²t} as ε → 0.

This shows that ξ·(X_t − E X_t) is a centered Gaussian random variable with variance σ²t. Since E X_t = t E X₁, we conclude that X_t is Gaussian with mean tl and covariance matrix tQ.


We will now estimate E|M³(t/n)|. For every ε > 0 we can use the uniform continuity of s ↦ M(s) on [0, t] to get

lim_{n→∞} max_{1≤k≤n} |M(kt/n) − M((k−1)t/n)| = 0.

Thus, we have for all ε > 0

1 = lim_{n→∞} P( max_{1≤k≤n} |M(kt/n) − M((k−1)t/n)| ≤ ε )
  = lim_{n→∞} P( ⋂_{k=1}^n { |M(kt/n) − M((k−1)t/n)| ≤ ε } )
  = lim_{n→∞} ∏_{k=1}^n P( |M(t/n)| ≤ ε )   (by (L2), (L1))
  = lim_{n→∞} [1 − P(|M(t/n)| > ε)]^n
  ≤ lim_{n→∞} e^{−n P(|M(t/n)|>ε)} ≤ 1,

where we use the inequality 1 + x ≤ e^x. This proves lim_{n→∞} n P(|M(t/n)| > ε) = 0. Therefore,

E|M³(t/n)| ≤ ε E M²(t/n) + ∫_{{|M(t/n)|>ε}} |M³(t/n)| dP ≤ ε (t/n) σ² + √(P(|M(t/n)| > ε)) √(E M⁶(t/n)).

It is not hard to see that E M⁶(s) = a₁s + ⋯ + a₆s⁶ (differentiate E e^{iuM(s)} = e^{−sψ(u)} six times at u = 0), and so

E|M³(t/n)| ≤ ε (t/n) σ² + c (t/n) √( P(|M(t/n)| > ε) / (t/n) ) = (t/n)(εσ² + o(1))  as n → ∞,

since n P(|M(t/n)| > ε) → 0; this is the required bound for R_n.

We close this chapter with Paul Lévy's construction of a standard Brownian motion (W_t)_{t≥0}. Since W is a Lévy process which has the Markov property, it is enough to construct a Brownian motion W(t) for t ∈ [0,1] only, then produce independent copies (W^n(t))_{t∈[0,1]}, n = 0, 1, 2, …, and join them continuously:

W_t := W⁰(t)  for t ∈ [0,1),
W_t := W⁰(1) + ⋯ + W^{n−1}(1) + W^n(t − n)  for t ∈ [n, n+1).

Since each W_t is normally distributed with mean 0 and variance t, we will get a Lévy process with characteristic exponent ½ξ², ξ ∈ R. In the same vein we get a d-dimensional Brownian motion by making a vector (W_t^{(1)}, …, W_t^{(d)})_{t≥0} of d independent copies of (W_t)_{t≥0}. This yields a Lévy process with exponent ½(ξ₁² + ⋯ + ξ_d²), ξ₁, …, ξ_d ∈ R.

Denote by N(m, σ²) a one-dimensional normal distribution with mean m and variance σ². The motivation for the construction is the observation that a Brownian motion satisfies the following mid-point law (cf. [56, Chapter 3.4]):

P( W_{(s+t)/2} ∈ • | W_s = x, W_t = y ) = N( ½(x+y), ¼(t−s) ),  s ≤ t, x, y ∈ R.

Figure 8.1.: Interpolation of order four in Lévy's construction of Brownian motion.

This can be turned into the following construction method:

Algorithm. Set W(0) = 0 and let W(1) ∼ N(0,1). Let n ≥ 1 and assume that the random variables W(k2^{−n}), k = 1, …, 2^n − 1, have already been constructed. Then set

W(l2^{−n−1}) := W(k2^{−n})  if l = 2k,
W(l2^{−n−1}) := ½( W(k2^{−n}) + W((k+1)2^{−n}) ) + Γ_{2^n+k}  if l = 2k+1,

where Γ_{2^n+k} is an independent (of everything else) N(0, 2^{−n}/4) Gaussian random variable, cf. Figure 8.1. In-between the nodes we use piecewise linear interpolation:

W_{2^n}(t, ω) := linear interpolation of (W(k2^{−n}, ω), k = 0, 1, …, 2^n),  n ≥ 1.

At the dyadic points t = k2^{−j} we get the 'true' value of W(t, ω), while the linear interpolation is an approximation, see Figure 8.1.
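The algorithm translates almost literally into code. Here is a minimal sketch (the number of refinement levels is an arbitrary choice) that refines down to dyadic resolution 2^{−12} and checks the Gaussian scaling of the increments on one sample path.

```python
import numpy as np

rng = np.random.default_rng(3)
levels = 12                               # refine down to resolution 2**-levels

# Start with W(0) = 0, W(1) ~ N(0,1); then repeatedly insert midpoints
# W((2k+1) 2^{-n-1}) = (W(k 2^{-n}) + W((k+1) 2^{-n}))/2 + Gamma,
# with independent Gamma ~ N(0, 2^{-n}/4).
W = np.array([0.0, rng.normal()])
for n in range(levels):
    mids = 0.5 * (W[:-1] + W[1:]) + rng.normal(0.0, np.sqrt(2.0 ** -(n + 2)), len(W) - 1)
    refined = np.empty(2 * len(W) - 1)
    refined[0::2], refined[1::2] = W, mids
    W = refined

# At resolution h = 2**-levels the increments are iid N(0, h).
h = 2.0 ** -levels
inc = np.diff(W)
```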

Theorem 8.5 (Lévy 1940). The series

W(t, ω) := ∑_{n=0}^∞ ( W_{2^{n+1}}(t, ω) − W_{2^n}(t, ω) ) + W₁(t, ω),  t ∈ [0,1],

converges a.s. uniformly. In particular, (W(t))_{t∈[0,1]} is a one-dimensional Brownian motion.

Proof. Set Δ_n(t, ω) := W_{2^{n+1}}(t, ω) − W_{2^n}(t, ω). By construction,

Δ_n((2k−1)2^{−n−1}, ω) = Γ_{2^n+(k−1)}(ω),  k = 1, 2, …, 2^n,

are iid N(0, 2^{−(n+2)}) distributed random variables. Therefore,

P( max_{1≤k≤2^n} |Δ_n((2k−1)2^{−n−1})| > x_n/√(2^{n+2}) ) ≤ 2^n P( |√(2^{n+2}) Δ_n(2^{−n−1})| > x_n ),


and the right-hand side equals

(2·2^n/√(2π)) ∫_{x_n}^∞ e^{−r²/2} dr ≤ (2^{n+1}/√(2π)) ∫_{x_n}^∞ (r/x_n) e^{−r²/2} dr = (2^{n+1}/(x_n√(2π))) e^{−x_n²/2}.

Choose c > 1 and x_n := c√(2n log 2). Then

∑_{n=1}^∞ P( max_{1≤k≤2^n} |Δ_n((2k−1)2^{−n−1})| > x_n/√(2^{n+2}) ) ≤ ∑_{n=1}^∞ (2^{n+1}/(c√(2π))) e^{−c² n log 2} = (2/(c√(2π))) ∑_{n=1}^∞ 2^{−(c²−1)n} < ∞.

Using the Borel–Cantelli lemma we find a set Ω₀ ⊂ Ω with P(Ω₀) = 1 such that for every ω ∈ Ω₀ there is some N(ω) ≥ 1 with

max_{1≤k≤2^n} |Δ_n((2k−1)2^{−n−1})| ≤ c √(n log 2 / 2^{n+1})  for all n ≥ N(ω).

Δ_n(t) is the distance between the polygonal arcs W_{2^{n+1}}(t) and W_{2^n}(t); the maximum is attained at one of the midpoints of the intervals [(k−1)2^{−n}, k2^{−n}], k = 1, …, 2^n, see Figure 8.1. Thus,

sup_{0≤t≤1} |W_{2^{n+1}}(t, ω) − W_{2^n}(t, ω)| ≤ max_{1≤k≤2^n} |Δ_n((2k−1)2^{−n−1}, ω)| ≤ c √(n log 2 / 2^{n+1})

for all n ≥ N(ω), which means that the limit

W(t, ω) := lim_{N→∞} W_{2^N}(t, ω) = ∑_{n=0}^∞ ( W_{2^{n+1}}(t, ω) − W_{2^n}(t, ω) ) + W₁(t, ω)

exists for all ω ∈ Ω₀ uniformly in t ∈ [0,1]. Therefore, t ↦ W(t, ω), ω ∈ Ω₀, inherits the continuity of the polygonal arcs t ↦ W_{2^n}(t, ω). Set

W(t, ω) := W(t, ω) 1_{Ω₀}(ω).

W (t,ω) :=W (t,ω)1Ω0(ω).

By construction, we find for all 06 k 6 l 6 2n

W (l2−n)−W (k2−n) =W2n(l2−n)−W2n(k2−n)

=l

∑l=k+1

(W2n(l2−n)−W2n((l−1)2−n)

)iid∼ N(0,(l− k)2−n).

Since t 7→ W (t) is continuous and the dyadic numbers are dense in [0, t], we conclude that theincrements W (tk)−W (tk−1), 0 = t0 < t1 < · · ·< tN 6 1 are independent N(0, tk− tk−1) distributedrandom variables. This shows that (W (t))t∈[0,1] is a Brownian motion.

9. Random measures

We continue our investigations of the paths of càdlàg Lévy processes. Independently of Chapters 5 and 6 we will show in Theorem 9.12 that the processes constructed in Theorem 7.5 are indeed all Lévy processes; this also gives a new proof of the Lévy–Khintchine formula, cf. Corollary 9.13. As before, we denote the jumps of (X_t)_{t≥0} by

ΔX_t := X_t − X_{t−} = X_t − lim_{s↑t} X_s.

Definition 9.1. Let X be a Lévy process. The counting measure

N_t(B, ω) := #{ s ∈ (0, t] : ΔX_s(ω) ∈ B },  B ∈ B(R^d \ {0}), (9.1)

is called the jump measure of the process X.

Since a càdlàg function x: [0,∞) → R^d has on any compact interval [a,b] at most finitely many jumps |Δx_t| > ε exceeding a fixed size,¹ we see that

N_t(B, ω) < ∞  for all t ≥ 0 and all B ∈ B(R^d) such that 0 ∉ B̄.

Notice that 0 ∉ B̄ is equivalent to B_ε(0) ∩ B = ∅ for some ε > 0. Thus, B ↦ N_t(B, ω) is for every ω a locally finite Borel measure on R^d \ {0}.

Definition 9.2. Let $N_t(B)$ be the jump measure of the Lévy process $X$. For every Borel function $f:\mathbb{R}^d\to\mathbb{R}$ with $0\notin\operatorname{supp}f$ we define

$$N_t(f,\omega) := \int f(y)\,N_t(dy,\omega). \tag{9.2}$$

Since $0\notin\operatorname{supp}f$, it is clear that $N_t(\operatorname{supp}f,\omega)<\infty$, and for every $\omega$

$$N_t(f,\omega) = \sum_{0<s\le t}f(\Delta X_s(\omega)) = \sum_{n=1}^{\infty}f(\Delta X_{\tau_n}(\omega))\,\mathbf{1}_{(0,t]}(\tau_n(\omega)). \tag{9.2$'$}$$

Both sums are finite sums, extending only over those $s$ where $\Delta X_s(\omega)\ne 0$. This is obvious in the second sum, where $\tau_1(\omega),\tau_2(\omega),\tau_3(\omega),\dots$ are the jump times of $X$.
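Formula (9.2$'$) is easy to check by brute force for a process whose jumps are known explicitly. The following sketch (assuming NumPy; the intensity `lam`, horizon `t_max` and helper `N_t_f` are our own illustrative choices, not from the text) simulates the jump times and heights of a compound Poisson process and evaluates $N_t(f)$ as the finite sum over jumps:

```python
import numpy as np

rng = np.random.default_rng(7)
lam, t_max = 2.0, 10.0                      # illustrative jump intensity and horizon

# jump times tau_1 < tau_2 < ... of a Poisson(lam) process on (0, t_max]
n_jumps = rng.poisson(lam * t_max)
tau = np.sort(rng.uniform(0.0, t_max, n_jumps))
heights = rng.normal(0.0, 1.0, n_jumps)     # iid jump heights Delta X_{tau_n}

def N_t_f(f, t):
    """N_t(f) = sum_{0<s<=t} f(Delta X_s), cf. (9.2')."""
    return sum(f(dx) for dx, s in zip(heights, tau) if s <= t)

# N_t(B) for B = {|y| > 1}: just count the jumps landing in B
count_B = N_t_f(lambda y: float(abs(y) > 1.0), t_max)
# f(y) = y recovers the pure-jump path itself at t_max
total = N_t_f(lambda y: y, t_max)
```

Here `count_B` is exactly the counting measure $N_{t_{\max}}(B)$ of (9.1) for $B=\{|y|>1\}$.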

Lemma 9.3. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$, $f\in C_c(\mathbb{R}^d\setminus\{0\},\mathbb{R}^m)$ (i.e. $f$ takes values in $\mathbb{R}^m$), $s<t$ and $t_{k,n} := s+\frac{k}{n}(t-s)$. Then

$$N_t(f,\omega)-N_s(f,\omega) = \sum_{s<u\le t}f\big(\Delta X_u(\omega)\big) = \lim_{n\to\infty}\sum_{k=0}^{n-1}f\big(X_{t_{k+1,n}}(\omega)-X_{t_{k,n}}(\omega)\big). \tag{9.3}$$

¹Otherwise we would get an accumulation point of jumps within $[a,b]$, and $x$ would be unbounded on $[a,b]$.


56 R. L. Schilling: An Introduction to Lévy and Feller Processes

Proof. Throughout the proof $\omega$ is fixed and we will omit it in our notation. Since $0\notin\operatorname{supp}f$, there is some $\varepsilon>0$ such that $B_\varepsilon(0)\cap\operatorname{supp}f=\emptyset$; therefore, we need only consider jumps of size $|\Delta X_t|>\varepsilon$. Denote by $J=\{\tau_1,\dots,\tau_N\}$ those jump times. For sufficiently large $n$ we can achieve that

• $\#\big(J\cap(t_{k,n},t_{k+1,n}]\big)\le 1$ for all $k=0,\dots,n-1$;

• $|X_{t_{\kappa+1,n}}-X_{t_{\kappa,n}}|<\varepsilon$ if $\kappa$ is such that $J\cap(t_{\kappa,n},t_{\kappa+1,n}]=\emptyset$.

Indeed: assume this is not the case; then we could find sequences $s<s_k<t_k\le t$ such that $t_k-s_k\to 0$, $J\cap(s_k,t_k]=\emptyset$ and $|X_{t_k}-X_{s_k}|\ge\varepsilon$. Without loss of generality we may assume that $s_k\uparrow u$ and $t_k\downarrow u$ for some $u\in(s,t]$; $u=s$ can be ruled out because of right-continuity. By the càdlàg property of the paths, $|\Delta X_u|\ge\varepsilon$, i.e. $u\in J$, which is a contradiction.

Since we have $f(X_{t_{\kappa+1,n}}-X_{t_{\kappa,n}})=0$ for intervals of the 'second kind', only the intervals containing some jump contribute to the (finite!) sum (9.3), and the claim follows.

Lemma 9.4. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$.

a) $(N_t(f))_{t\ge 0}$ is a Lévy process on $\mathbb{R}^m$ for all $f\in C_c(\mathbb{R}^d\setminus\{0\},\mathbb{R}^m)$.

b) $(N_t(B))_{t\ge 0}$ is a Poisson process for all $B\in\mathscr{B}(\mathbb{R}^d)$ such that $0\notin\overline{B}$.

c) $\nu(B) := \mathbb{E}N_1(B)$ is a locally finite measure on $\mathbb{R}^d\setminus\{0\}$.

Proof. Set $\mathscr{F}_t := \sigma(X_s,\ s\le t)$.

a) Let $f\in C_c(\mathbb{R}^d\setminus\{0\},\mathbb{R}^m)$. From Lemma 9.3 and (L2$'$) we see that $N_t(f)$ is $\mathscr{F}_t$ measurable and $N_t(f)-N_s(f)\perp\!\!\!\perp\mathscr{F}_s$, $s\le t$. Moreover, if $N_t^Y(\cdot)$ denotes the jump measure of the Lévy process $Y=(X_{t+s}-X_s)_{t\ge 0}$, we see that $N_t(f)-N_s(f)=N_{t-s}^Y(f)$. By the Markov property (Theorem 4.6), $X\sim Y$, and we get $N_{t-s}^Y(f)\sim N_{t-s}(f)$. Since $t\mapsto N_t(f)$ is càdlàg, $(N_t(f))_{t\ge 0}$ is a Lévy process.

b) By definition, $N_0(B)=0$ and $t\mapsto N_t(B)$ is càdlàg. Since $X$ is a Lévy process, we see as in the proof of Theorem 8.1 that the jump times

$$\tau_0 := 0, \quad \tau_k := \inf\{t>\tau_{k-1} : \Delta X_t\in B\}, \quad k\in\mathbb{N},$$

satisfy $\tau_1\sim\mathrm{Exp}(\nu(B))$, and the inter-jump times $(\tau_k-\tau_{k-1})_{k\in\mathbb{N}}$ form an iid sequence. The condition $0\notin\overline{B}$ ensures that $N_t(B)<\infty$ a.s., which means that the intensity $\nu(B)$ is finite. Indeed, we have

$$1-e^{-t\nu(B)} = P(\tau_1\le t) = P(N_t(B)>0) \xrightarrow[t\to 0]{} 0;$$

this shows that $\nu(B)<\infty$. Thus,

$$N_t(B) = \sum_{k=1}^{\infty}\mathbf{1}_{(0,t]}(\tau_k)$$

is a Poisson process (Example 3.2) and, in particular, a Lévy process (Theorem 3.4).

c) The intensity of $(N_t(B))_{t\ge 0}$ is $\nu(B)=\mathbb{E}N_1(B)$. By Fubini's theorem it is clear that $\nu$ is a measure.


Definition 9.5. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$. The intensity measure is the measure $\nu(B) := \mathbb{E}N_1(B)$ from Lemma 9.4.

We will see in Corollary 9.13 that $\nu$ is the Lévy measure of $(X_t)_{t\ge 0}$ appearing in the Lévy–Khintchine formula.

Lemma 9.6. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$ and $\nu$ the intensity measure. For every $f\in L^1(\nu)$, $f:\mathbb{R}^d\to\mathbb{R}^m$, the random variable $N_t(f):=\int f(y)\,N_t(dy)$ exists as an $L^1$-limit of integrals of simple functions and satisfies

$$\mathbb{E}N_t(f) = \mathbb{E}\int f(y)\,N_t(dy) = t\int_{y\ne 0}f(y)\,\nu(dy). \tag{9.4}$$

Proof. For any step function $f$ of the form $f(y)=\sum_{k=1}^{M}\phi_k\mathbf{1}_{B_k}(y)$ with $0\notin\overline{B_k}$, the formula (9.4) follows from $\mathbb{E}N_t(B_k)=t\nu(B_k)$ and the linearity of the integral.

Since $\nu$ is defined on $\mathbb{R}^d\setminus\{0\}$, any $f\in L^1(\nu)$ can be approximated by a sequence of step functions $(f_n)_{n\in\mathbb{N}}$ in the $L^1(\nu)$-sense, and we get

$$\mathbb{E}|N_t(f_n)-N_t(f_m)| \le t\int|f_n-f_m|\,d\nu \xrightarrow[m,n\to\infty]{} 0.$$

Because of the completeness of $L^1(P)$, the limit $\lim_{n\to\infty}N_t(f_n)$ exists, and with a routine argument we see that it is independent of the approximating sequence $f_n\to f\in L^1(\nu)$. This allows us to define $N_t(f)$ for $f\in L^1(\nu)$ as the $L^1(P)$-limit of stochastic integrals of simple functions; obviously, (9.4) is preserved under this limiting procedure.

Theorem 9.7. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$ and $\nu$ the intensity measure.

a) $N_t(f) := \int f(y)\,N_t(dy)$ is a Lévy process for every $f\in L^1(\nu)$, $f:\mathbb{R}^d\to\mathbb{R}^m$. In particular, $(N_t(B))_{t\ge 0}$ is a Poisson process for every $B\in\mathscr{B}(\mathbb{R}^d)$ such that $0\notin\overline{B}$.

b) $X_t^B := N_t(y\mathbf{1}_B(y))$ and $X_t-X_t^B$ are for every $B\in\mathscr{B}(\mathbb{R}^d)$, $0\notin\overline{B}$, Lévy processes.

Proof. a) Note that $\nu$ is a locally finite measure on $\mathbb{R}^d\setminus\{0\}$. This means that, by standard density results from integration theory, the family $C_c(\mathbb{R}^d\setminus\{0\})$ is dense in $L^1(\nu)$. Fix $f\in L^1(\nu)$ and choose $f_n\in C_c(\mathbb{R}^d\setminus\{0\})$ such that $f_n\to f$ in $L^1(\nu)$. Then, as in Lemma 9.6,

$$\mathbb{E}|N_t(f)-N_t(f_n)| \le t\int|f-f_n|\,d\nu \xrightarrow[n\to\infty]{} 0.$$

Since $P(|N_t(f)|>\varepsilon)\le\frac{t}{\varepsilon}\int|f|\,d\nu\to 0$ for every $\varepsilon>0$ as $t\to 0$, the process $N_t(f)$ is continuous in probability. Moreover, it is the limit (in $L^1$, hence in probability) of the Lévy processes $N_t(f_n)$ (Lemma 9.4); therefore it is itself a Lévy process, see Lemma 7.3.

In view of Lemma 9.4.c), the indicator function $\mathbf{1}_B\in L^1(\nu)$ whenever $B$ is a Borel set satisfying $0\notin\overline{B}$. Thus, $N_t(B)=N_t(\mathbf{1}_B)$ is a Lévy process which has only jumps of unit size, i.e. it is by Theorem 8.1 a Poisson process.


b) Set $f(y) := y\mathbf{1}_B(y)$ and $B_n := B\cap B_n(0)$. Then $f_n(y)=y\mathbf{1}_{B_n}(y)$ is bounded and $0\notin\operatorname{supp}f_n$, hence $f_n\in L^1(\nu)$. This means that $N_t(f_n)$ is for every $n\in\mathbb{N}$ a Lévy process. Moreover,

$$N_t(f_n) = \int_{B_n}y\,N_t(dy) \xrightarrow[n\to\infty]{} \int_B y\,N_t(dy) = N_t(f) \quad\text{a.s.}$$

Since $N_t(f)$ changes its value only by jumps,

$$P(|N_t(f)|>\varepsilon) \le P\big(X\text{ has at least one jump with height in }B\text{ in }[0,t]\big) = P(N_t(B)>0) = 1-e^{-t\nu(B)},$$

which proves that the process $N_t(f)$ is continuous in probability. Lemma 7.3 shows that $N_t(f)$ is a Lévy process.

Finally, approximate $f(y):=y\mathbf{1}_B(y)$ by a sequence $\phi_l\in C_c(\mathbb{R}^d\setminus\{0\},\mathbb{R}^d)$. Now we can use Lemma 9.3 to get

$$X_t-N_t(\phi_l) = \lim_{n\to\infty}\sum_{k=0}^{n-1}\big[(X_{t_{k+1,n}}-X_{t_{k,n}})-\phi_l(X_{t_{k+1,n}}-X_{t_{k,n}})\big].$$

The increments of $X$ are stationary and independent, and so we conclude from the above formula that $X-N(\phi_l)$ also has stationary and independent increments. Since both $X$ and $N(\phi_l)$ are continuous in probability, so is their difference, i.e. $X-N(\phi_l)$ is a Lévy process. Finally,

$$N_t(\phi_l)\xrightarrow[l\to\infty]{}N_t(f) \quad\text{and}\quad X_t-N_t(\phi_l)\xrightarrow[l\to\infty]{}X_t-N_t(f),$$

and since $X$ and $N(f)$ are continuous in probability, Lemma 7.3 tells us that $X-N(f)$ is a Lévy process.

We will now show that Lévy processes with 'disjoint jump heights' are independent. For this we need the following immediate consequence of Theorem 3.1:

Lemma 9.8 (Exponential martingale). Let $(X_t)_{t\ge 0}$ be a Lévy process. Then

$$M_t := \frac{e^{i\xi\cdot X_t}}{\mathbb{E}e^{i\xi\cdot X_t}} = e^{i\xi\cdot X_t}\,e^{t\psi(\xi)}, \quad t\ge 0,$$

is a martingale for the filtration $\mathscr{F}_t^X=\sigma(X_s,\ s\le t)$ such that $\sup_{s\le t}|M_s|\le e^{t\,\mathrm{Re}\,\psi(\xi)}$.

Theorem 9.9. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$ and $U,V\in\mathscr{B}(\mathbb{R}^d)$, $0\notin\overline{U}$, $0\notin\overline{V}$ and $U\cap V=\emptyset$. Then the processes

$$X_t^U := N_t(y\mathbf{1}_U(y)), \quad X_t^V := N_t(y\mathbf{1}_V(y)), \quad X_t-X_t^{U\cup V}$$

are independent Lévy processes in $\mathbb{R}^d$.


Proof. Set $W := U\cup V$. By Theorem 9.7, $X^U$, $X^V$ and $X-X^W$ are Lévy processes. In fact, a slight variation of that argument even shows that $(X^U,X^V,X-X^W)$ is a Lévy process in $\mathbb{R}^{3d}$.

In order to see their independence, fix $s>0$ and define for $t\ge s$ and $\xi,\eta,\theta\in\mathbb{R}^d$ the processes

$$C_t := \frac{e^{i\xi\cdot(X_t^U-X_s^U)}}{\mathbb{E}\big[e^{i\xi\cdot(X_t^U-X_s^U)}\big]}-1, \quad D_t := \frac{e^{i\eta\cdot(X_t^V-X_s^V)}}{\mathbb{E}\big[e^{i\eta\cdot(X_t^V-X_s^V)}\big]}-1, \quad E_t := \frac{e^{i\theta\cdot(X_t-X_t^W-X_s+X_s^W)}}{\mathbb{E}\big[e^{i\theta\cdot(X_t-X_t^W-X_s+X_s^W)}\big]}-1.$$

By Lemma 9.8, these processes are bounded martingales satisfying $\mathbb{E}C_t=\mathbb{E}D_t=\mathbb{E}E_t=0$. Set $t_{k,n}=s+\frac{k}{n}(t-s)$. Observe that

$$\mathbb{E}(C_tD_tE_t) = \mathbb{E}\left(\sum_{k,l,m=0}^{n-1}(C_{t_{k+1,n}}-C_{t_{k,n}})(D_{t_{l+1,n}}-D_{t_{l,n}})(E_{t_{m+1,n}}-E_{t_{m,n}})\right) = \mathbb{E}\left(\sum_{k=0}^{n-1}(C_{t_{k+1,n}}-C_{t_{k,n}})(D_{t_{k+1,n}}-D_{t_{k,n}})(E_{t_{k+1,n}}-E_{t_{k,n}})\right).$$

In the second equality we use that the martingale increments $C_t-C_s$, $D_t-D_s$, $E_t-E_s$ are independent of $\mathscr{F}_s^X$, and by the tower property

$$\mathbb{E}\big[(C_{t_{k+1,n}}-C_{t_{k,n}})(D_{t_{l+1,n}}-D_{t_{l,n}})(E_{t_{m+1,n}}-E_{t_{m,n}})\big]=0 \quad\text{unless } k=l=m.$$

An argument along the lines of Lemma 9.3 gives

$$\mathbb{E}(C_tD_tE_t) = \mathbb{E}\bigg(\sum_{s<u\le t}\underbrace{\Delta C_u\,\Delta D_u\,\Delta E_u}_{=0}\bigg) = 0,$$

as $X_t^U$, $X_t^V$ and $Y_t := X_t-X_t^W$ cannot jump simultaneously, since $U$, $V$ and $\mathbb{R}^d\setminus W$ are mutually disjoint. Thus,

$$\mathbb{E}\big[e^{i\xi\cdot(X_t^U-X_s^U)}e^{i\eta\cdot(X_t^V-X_s^V)}e^{i\theta\cdot(Y_t-Y_s)}\big] = \mathbb{E}\big[e^{i\xi\cdot(X_t^U-X_s^U)}\big]\cdot\mathbb{E}\big[e^{i\eta\cdot(X_t^V-X_s^V)}\big]\cdot\mathbb{E}\big[e^{i\theta\cdot(Y_t-Y_s)}\big]. \tag{9.5}$$

Since all processes are Lévy processes, (9.5) already proves the independence of $X^U$, $X^V$ and $Y=X-X^W$. Indeed, we find for $0=t_0<t_1<\dots<t_m=t$ and $\xi_k,\eta_k,\theta_k\in\mathbb{R}^d$

$$\begin{aligned}
\mathbb{E}\Big(e^{i\sum_k\xi_k\cdot(X_{t_{k+1}}^U-X_{t_k}^U)}e^{i\sum_k\eta_k\cdot(X_{t_{k+1}}^V-X_{t_k}^V)}e^{i\sum_k\theta_k\cdot(Y_{t_{k+1}}-Y_{t_k})}\Big)
&= \mathbb{E}\Big(\prod_k e^{i\xi_k\cdot(X_{t_{k+1}}^U-X_{t_k}^U)}e^{i\eta_k\cdot(X_{t_{k+1}}^V-X_{t_k}^V)}e^{i\theta_k\cdot(Y_{t_{k+1}}-Y_{t_k})}\Big)\\
&\overset{(L2')}{=} \prod_k\mathbb{E}\Big(e^{i\xi_k\cdot(X_{t_{k+1}}^U-X_{t_k}^U)}e^{i\eta_k\cdot(X_{t_{k+1}}^V-X_{t_k}^V)}e^{i\theta_k\cdot(Y_{t_{k+1}}-Y_{t_k})}\Big)\\
&\overset{(9.5)}{=} \prod_k\mathbb{E}\Big(e^{i\xi_k\cdot(X_{t_{k+1}}^U-X_{t_k}^U)}\Big)\mathbb{E}\Big(e^{i\eta_k\cdot(X_{t_{k+1}}^V-X_{t_k}^V)}\Big)\mathbb{E}\Big(e^{i\theta_k\cdot(Y_{t_{k+1}}-Y_{t_k})}\Big).
\end{aligned}$$

The last equality follows from (9.5); the second equality uses (L2$'$) for the Lévy process $(X_t^U,X_t^V,X_t-X_t^W)$.


This shows that the families $(X_{t_{k+1}}^U-X_{t_k}^U)_k$, $(X_{t_{k+1}}^V-X_{t_k}^V)_k$ and $(Y_{t_{k+1}}-Y_{t_k})_k$ are independent; hence the $\sigma$-algebras $\sigma(X_t^U,\ t\ge 0)$, $\sigma(X_t^V,\ t\ge 0)$ and $\sigma(X_t-X_t^W,\ t\ge 0)$ are independent.

Corollary 9.10. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$ and $\nu$ the intensity measure.

a) $(N_t(U))_{t\ge 0}\perp\!\!\!\perp(N_t(V))_{t\ge 0}$ for $U,V\in\mathscr{B}(\mathbb{R}^d)$, $0\notin\overline{U}$, $0\notin\overline{V}$, $U\cap V=\emptyset$.

b) For all measurable $f:\mathbb{R}^d\to\mathbb{R}^m$ satisfying $f(0)=0$ and $f\in L^1(\nu)$,

$$\mathbb{E}\big(e^{i\xi\cdot N_t(f)}\big) = \mathbb{E}\big(e^{i\int\xi\cdot f(y)\,N_t(dy)}\big) = e^{-t\int_{y\ne 0}[1-e^{i\xi\cdot f(y)}]\,\nu(dy)}. \tag{9.6}$$

c) $\displaystyle\int_{y\ne 0}\big(|y|^2\wedge 1\big)\,\nu(dy)<\infty$.

Proof. a) Since $(N_t(U))_{t\ge 0}$ and $(N_t(V))_{t\ge 0}$ are completely determined by the independent processes $(X_t^U)_{t\ge 0}$ and $(X_t^V)_{t\ge 0}$, cf. Theorem 9.9, the independence is clear.

b) Let us first prove (9.6) for step functions $f(x)=\sum_{k=1}^n\phi_k\mathbf{1}_{U_k}(x)$ with $\phi_k\in\mathbb{R}^m$ and disjoint sets $U_1,\dots,U_n\in\mathscr{B}(\mathbb{R}^d)$ such that $0\notin\overline{U_k}$. Then

$$\begin{aligned}
\mathbb{E}\exp\Big[i\int\xi\cdot f(y)\,N_t(dy)\Big] &= \mathbb{E}\exp\Big[i\sum_{k=1}^n\int\xi\cdot\phi_k\mathbf{1}_{U_k}(y)\,N_t(dy)\Big]
\overset{\text{a)}}{=} \prod_{k=1}^n\mathbb{E}\exp\big[i\xi\cdot\phi_k\,N_t(U_k)\big]\\
&\overset{\text{9.7.a)}}{=} \prod_{k=1}^n\exp\Big[t\nu(U_k)\big[e^{i\xi\cdot\phi_k}-1\big]\Big]
= \exp\Big[t\sum_{k=1}^n\big[e^{i\xi\cdot\phi_k}-1\big]\nu(U_k)\Big] = \exp\Big[-t\int\big[1-e^{i\xi\cdot f(y)}\big]\,\nu(dy)\Big].
\end{aligned}$$

For any $f\in L^1(\nu)$ the integral on the right-hand side of (9.6) exists. Indeed, the elementary inequality $|1-e^{iu}|\le|u|\wedge 2$ and $\nu\{|y|\ge 1\}<\infty$ (Lemma 9.4.c)) yield

$$\left|\int_{y\ne 0}\big[1-e^{i\xi\cdot f(y)}\big]\,\nu(dy)\right| \le |\xi|\int_{0<|y|<1}|f(y)|\,\nu(dy) + 2\int_{|y|\ge 1}\nu(dy) < \infty.$$

Therefore, (9.6) follows with a standard approximation argument and dominated convergence.

c) We have already seen in Lemma 9.4.c) that $\nu\{|y|\ge 1\}<\infty$. Let us show that $\int_{0<|y|<1}|y|^2\,\nu(dy)<\infty$. For this we take $U=\{\delta<|y|<1\}$. Again by Theorem 9.9, the processes $X_t^U$ and $X_t-X_t^U$ are independent, and we get

$$0 < \big|\mathbb{E}e^{i\xi\cdot X_t}\big| = \big|\mathbb{E}e^{i\xi\cdot X_t^U}\big|\cdot\big|\mathbb{E}e^{i\xi\cdot(X_t-X_t^U)}\big| \le \big|\mathbb{E}e^{i\xi\cdot X_t^U}\big|.$$

Since $X_t^U$ is a compound Poisson process—use part b) with $f(y)=y\mathbf{1}_U(y)$—we get for all $|\xi|\le 1$

$$0 < \big|\mathbb{E}e^{i\xi\cdot X_t}\big| \le \big|\mathbb{E}e^{i\xi\cdot X_t^U}\big| = e^{-t\int_U(1-\cos\xi\cdot y)\,\nu(dy)} \le e^{-t\int_{\delta<|y|\le 1}\frac14(\xi\cdot y)^2\,\nu(dy)}.$$


For the equality we use $|e^z|=e^{\mathrm{Re}\,z}$; the inequality follows from $\frac14u^2\le 1-\cos u$ for $|u|\le 1$. Letting $\delta\to 0$ we see that $\int_{0<|y|<1}|y|^2\,\nu(dy)<\infty$.
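For a compound Poisson process, formula (9.6) can be verified numerically: taking $\nu=\lambda\cdot N(0,1)$ and $f(y)=y^2$, the right-hand side is $\exp\big(-t\lambda(1-\mathbb{E}e^{i\xi Z^2})\big)$ with the classical Gaussian identity $\mathbb{E}e^{i\xi Z^2}=(1-2i\xi)^{-1/2}$. A Monte Carlo sketch (assuming NumPy; $\lambda$, $t$, $\xi$ and the sample sizes are our own illustrative choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, t, xi = 1.5, 2.0, 0.7                 # illustrative intensity, time, Fourier variable
paths = 200_000

# exact side of (9.6) for nu = lam * N(0,1) and f(y) = y^2:
# E e^{i xi Z^2} = (1 - 2i xi)^{-1/2} for Z ~ N(0,1)
exact = np.exp(-t * lam * (1 - (1 - 2j * xi) ** (-0.5)))

# Monte Carlo of E exp(i*xi*N_t(f)): simulate the jumps of each path
n = rng.poisson(lam * t, size=paths)       # number of jumps per path
jumps = rng.normal(size=n.sum())           # all jump heights, concatenated
idx = np.repeat(np.arange(paths), n)       # which path each jump belongs to
sums = np.bincount(idx, weights=jumps**2, minlength=paths)  # N_t(f) per path
mc = np.exp(1j * xi * sums).mean()
```

With the sample size above, `mc` agrees with `exact` to a couple of decimal places.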

Corollary 9.11. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$ and $\nu$ the intensity measure. For all $f:\mathbb{R}^d\to\mathbb{R}^m$ satisfying $f(0)=0$ and $f\in L^2(\nu)$ we have²

$$\mathbb{E}\left(\left|\int f(y)\big[N_t(dy)-t\nu(dy)\big]\right|^2\right) = t\int_{y\ne 0}|f(y)|^2\,\nu(dy). \tag{9.7}$$

Proof. It is clearly enough to show (9.7) for step functions of the form

$$f(x) = \sum_{k=1}^n\phi_k\mathbf{1}_{B_k}(x), \quad B_k \text{ disjoint}, \quad 0\notin\overline{B_k}, \quad \phi_k\in\mathbb{R}^m,$$

and then use an approximation argument.

Since the processes $N_t(B_k,\cdot)$ are independent Poisson processes with mean $\mathbb{E}N_t(B_k)=t\nu(B_k)$ and variance $\mathbb{V}N_t(B_k)=t\nu(B_k)$, we find

$$\mathbb{E}\big[(N_t(B_k)-t\nu(B_k))(N_t(B_l)-t\nu(B_l))\big] = \begin{cases}0, & \text{if } B_k\cap B_l=\emptyset, \text{ i.e. } k\ne l,\\ \mathbb{V}N_t(B_k)=t\nu(B_k), & \text{if } k=l,\end{cases} \;=\; t\nu(B_k\cap B_l).$$

Therefore,

$$\begin{aligned}
\mathbb{E}\left(\left|\int f(y)\big(N_t(dy)-t\nu(dy)\big)\right|^2\right) &= \mathbb{E}\left(\iint f(y)f(z)\big(N_t(dy)-t\nu(dy)\big)\big(N_t(dz)-t\nu(dz)\big)\right)\\
&= \sum_{k,l=1}^n\phi_k\phi_l\underbrace{\mathbb{E}\big(\big(N_t(B_k)-t\nu(B_k)\big)\big(N_t(B_l)-t\nu(B_l)\big)\big)}_{=\,t\nu(B_k\cap B_l)}\\
&= t\sum_{k=1}^n|\phi_k|^2\,\nu(B_k) = t\int|f(y)|^2\,\nu(dy).
\end{aligned}$$
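The isometry (9.7) can be checked by simulation in the compound Poisson case: for $\nu=\lambda\cdot N(0,1)$ and $f(y)=y$, the compensator $t\lambda\,\mathbb{E}[J]$ vanishes, and (9.7) predicts that the compensated integral has variance $t\int f^2\,d\nu = t\lambda\,\mathbb{E}[J^2]=t\lambda$. A sketch (assuming NumPy; all parameters are our own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
lam, t, paths = 1.5, 2.0, 200_000          # illustrative intensity, time, sample size

n = rng.poisson(lam * t, size=paths)       # jump counts per path
jumps = rng.normal(size=n.sum())           # iid N(0,1) jump heights
idx = np.repeat(np.arange(paths), n)
# compensated integral int f(y)[N_t(dy) - t nu(dy)] with f(y) = y;
# here E f(J) = 0, so the compensator t*lam*E[J] vanishes
comp = np.bincount(idx, weights=jumps, minlength=paths)

lhs = np.mean(comp**2)                     # E | int f d(N - t nu) |^2
rhs = t * lam * 1.0                        # t * int f^2 dnu = t*lam*E[J^2]
```

With this sample size, `lhs` matches `rhs = 3.0` to within a few percent.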

In contrast to Corollary 7.6, the following theorem does not need (but constructs) the Lévy triplet $(l,Q,\nu)$.

Theorem 9.12 (Lévy–Itô decomposition). Let $X$ be a Lévy process and denote by $N_t(\cdot)$ and $\nu$ the jump and intensity measures. Then

$$X_t = \underbrace{tl+\sqrt{Q}\,W_t}_{\text{continuous Gaussian}} + \underbrace{\int_{0<|y|<1}y\big(N_t(dy)-t\nu(dy)\big)}_{=:M_t,\ L^2\text{-martingale}} + \underbrace{\int_{|y|\ge 1}y\,N_t(dy)}_{=:A_t,\ \text{bdd. variation, pure jump}} \tag{9.8}$$

where $l\in\mathbb{R}^d$, $Q\in\mathbb{R}^{d\times d}$ is a positive semidefinite symmetric matrix and $W$ is a standard Brownian motion in $\mathbb{R}^d$. The processes on the right-hand side of (9.8) are independent Lévy processes.

²This is a special case of an Itô isometry, cf. (10.9) in the following chapter.

Proof. 1 Set $U_n := \{\frac1n<|y|<1\}$, $V=\{|y|\ge 1\}$, $W_n := U_n\,\dot\cup\,V$ and define

$$X_t^V := \int_V y\,N_t(dy) \quad\text{and}\quad \tilde X_t^{U_n} := \int_{U_n}y\,N_t(dy) - t\int_{U_n}y\,\nu(dy).$$

By Theorem 9.9, $(X_t^V)_{t\ge 0}$, $(\tilde X_t^{U_n})_{t\ge 0}$ and $\big(X_t-X_t^{W_n}+t\int_{U_n}y\,\nu(dy)\big)_{t\ge 0}$ are independent Lévy processes. Since

$$X = (X-\tilde X^{U_n}-X^V) + \tilde X^{U_n} + X^V,$$

the theorem follows if we can show that the three terms on the right-hand side converge separately as $n\to\infty$.

2 Lemma 9.6 shows $\mathbb{E}\tilde X_t^{U_n}=0$; since $\tilde X^{U_n}$ is a Lévy process, it is a martingale: for $s\le t$

$$\mathbb{E}\big(\tilde X_t^{U_n}\,\big|\,\mathscr{F}_s\big) = \mathbb{E}\big(\tilde X_t^{U_n}-\tilde X_s^{U_n}\,\big|\,\mathscr{F}_s\big) + \tilde X_s^{U_n} \overset{\text{(L2),(L1)}}{=} \mathbb{E}\big(\tilde X_{t-s}^{U_n}\big) + \tilde X_s^{U_n} = \tilde X_s^{U_n}.$$

($\mathscr{F}_s$ can be taken as the natural filtration of $\tilde X^{U_n}$ or $X$.) By Doob's $L^2$ martingale inequality we find for any $t>0$ and $m<n$

$$\mathbb{E}\left(\sup_{s\le t}\big|\tilde X_s^{U_n}-\tilde X_s^{U_m}\big|^2\right) \le 4\,\mathbb{E}\left(\big|\tilde X_t^{U_n}-\tilde X_t^{U_m}\big|^2\right) = 4t\int_{\frac1n<|y|\le\frac1m}|y|^2\,\nu(dy) \xrightarrow[m,n\to\infty]{} 0.$$

Therefore, the limit $\int_{0<|y|<1}y\big(N_t(dy)-t\nu(dy)\big) = L^2\text{-}\lim_{n\to\infty}\tilde X_t^{U_n}$ exists locally uniformly (in $t$). The limit is still an $L^2$ martingale with càdlàg paths (take a locally uniformly a.s. convergent subsequence) and, by Lemma 7.3, also a Lévy process.

3 Observe that

$$(X-\tilde X^{U_n}-X^V) - (X-\tilde X^{U_m}-X^V) = \tilde X^{U_m}-\tilde X^{U_n},$$


and so $X_t^c := L^2\text{-}\lim_{n\to\infty}(X_t-\tilde X_t^{U_n}-X_t^V)$ exists locally uniformly (in $t$). Since, by construction, $|\Delta(X_t-\tilde X_t^{U_n}-X_t^V)|\le\frac1n$, it is clear that $X^c$ has a.s. continuous sample paths. By Lemma 7.3 it is a Lévy process. From Theorem 8.4 we know that all Lévy processes with continuous sample paths are of the form $tl+\sqrt{Q}\,W_t$, where $W$ is a Brownian motion, $Q\in\mathbb{R}^{d\times d}$ a symmetric positive semidefinite matrix and $l\in\mathbb{R}^d$.

4 Since independence is preserved under $L^2$-limits, the decomposition (9.8) follows. Finally,

$$\int_{|y|\ge 1}y\,N_t(dy,\omega) = \sum_{0<s\le t}\Delta X_s(\omega)\,\mathbf{1}_{\{\Delta X_s(\omega)\ge 1\}} - \sum_{0<s\le t}|\Delta X_s(\omega)|\,\mathbf{1}_{\{\Delta X_s(\omega)\le -1\}}$$

is the difference of two increasing processes, i.e. it is of bounded variation.
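For a Lévy process whose jumps can be enumerated, (9.8) is an exact pathwise identity. The sketch below (assuming NumPy; the triplet $l$, $Q=\sigma^2$, and the finite jump measure $\nu=\lambda\cdot\mathrm{Exp}(1)$ are our own illustrative choices, not from the text) builds such a process on a grid and verifies that the drift+Gaussian part, the compensated small-jump martingale and the big-jump part add back up to the path:

```python
import numpy as np

rng = np.random.default_rng(5)
l, sigma, lam, T, m = 0.3, 0.8, 2.0, 5.0, 1000   # illustrative triplet, horizon, grid

t = np.linspace(0.0, T, m + 1)
# continuous part: l*t + sigma*W_t
W = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(T / m), m))])

# finite jump measure nu = lam * Exp(1): jump times and (positive) heights
n_jumps = rng.poisson(lam * T)
tau = rng.uniform(0.0, T, n_jumps)
y = rng.exponential(1.0, n_jumps)

def jump_sum(mask):
    """s -> sum of the selected jumps up to time s, evaluated on the grid."""
    return np.array([y[mask & (tau <= s)].sum() for s in t])

X = l * t + sigma * W + jump_sum(np.ones(n_jumps, bool))

# Levy-Ito pieces; the compensator b = int_{0<|y|<1} y nu(dy) = lam*(1 - 2/e)
b = lam * (1.0 - 2.0 / np.e)
M = jump_sum(y < 1.0) - b * t                    # compensated small jumps, L2-martingale
A = jump_sum(y >= 1.0)                           # big jumps, bounded variation
recon = (l + b) * t + sigma * W + M + A          # right-hand side of (9.8)
```

Since the compensator $b\,t$ cancels between the drift and the martingale part, `recon` reproduces `X` exactly (up to floating-point rounding).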

Corollary 9.13 (Lévy–Khintchine formula). Let $X$ be a Lévy process. Then the characteristic exponent $\psi$ is given by

$$\psi(\xi) = -il\cdot\xi + \frac12\,\xi\cdot Q\xi + \int_{y\ne 0}\big[1-e^{iy\cdot\xi}+i\xi\cdot y\,\mathbf{1}_{(0,1)}(|y|)\big]\,\nu(dy) \tag{9.9}$$

where $\nu$ is the intensity measure, $l\in\mathbb{R}^d$ and $Q\in\mathbb{R}^{d\times d}$ is symmetric and positive semidefinite.

Proof. Since the processes appearing in the Lévy–Itô decomposition (9.8) are independent, we see

$$e^{-\psi(\xi)} = \mathbb{E}e^{i\xi\cdot X_1} = \mathbb{E}e^{i\xi\cdot(l+\sqrt{Q}W_1)}\cdot\mathbb{E}e^{i\int_{0<|y|<1}\xi\cdot y(N_1(dy)-\nu(dy))}\cdot\mathbb{E}e^{i\int_{|y|\ge 1}\xi\cdot y\,N_1(dy)}.$$

Since $W$ is a standard Brownian motion,

$$\mathbb{E}e^{i\xi\cdot(l+\sqrt{Q}W_1)} = e^{il\cdot\xi-\frac12\xi\cdot Q\xi}.$$

Using (9.6) with $f(y)=y\mathbf{1}_{U_n}(y)$, $U_n=\{\frac1n<|y|<1\}$, subtracting $\int_{U_n}y\,\nu(dy)$ and letting $n\to\infty$ we get

$$\mathbb{E}\exp\Big[i\int_{0<|y|<1}\xi\cdot y\,(N_1(dy)-\nu(dy))\Big] = \exp\Big[-\int_{0<|y|<1}\big[1-e^{iy\cdot\xi}+i\xi\cdot y\big]\,\nu(dy)\Big];$$

finally, (9.6) with $f(y)=y\mathbf{1}_V(y)$, $V=\{|y|\ge 1\}$, once again yields

$$\mathbb{E}\exp\Big[i\int_{|y|\ge 1}\xi\cdot y\,N_1(dy)\Big] = \exp\Big[-\int_{|y|\ge 1}\big[1-e^{iy\cdot\xi}\big]\,\nu(dy)\Big],$$

finishing the proof.

10. A digression: stochastic integrals

In this chapter we explain how one can integrate with respect to (a certain class of) random measures. Our approach is based on the notion of random orthogonal measures, and it will include the classical Itô integral with respect to square-integrable martingales. Throughout this chapter, $(\Omega,\mathscr{A},P)$ is a probability space, $(\mathscr{F}_t)_{t\ge 0}$ some filtration, $(E,\mathscr{E})$ is a measurable space and $\mu$ is a (positive) measure on $(E,\mathscr{E})$. Moreover, $\mathscr{R}\subset\mathscr{E}$ is a semiring, i.e. a family of sets such that $\emptyset\in\mathscr{R}$; for all $R,S\in\mathscr{R}$ we have $R\cap S\in\mathscr{R}$, and $R\setminus S$ can be represented as a finite union of disjoint sets from $\mathscr{R}$, cf. [54, Chapter 6] or [55, Definition 5.1]. It is not difficult to check that $\mathscr{R}_0 := \{R\in\mathscr{R} : \mu(R)<\infty\}$ is again a semiring.

Definition 10.1. Let $\mathscr{R}$ be a semiring on the measure space $(E,\mathscr{E},\mu)$. A random orthogonal measure with control measure $\mu$ is a family of random variables $N(\omega,R)\in\mathbb{R}$, $R\in\mathscr{R}_0$, such that

$$\mathbb{E}\big[|N(\cdot,R)|^2\big] < \infty \quad \forall R\in\mathscr{R}_0, \tag{10.1}$$

$$\mathbb{E}\big[N(\cdot,R)N(\cdot,S)\big] = \mu(R\cap S) \quad \forall R,S\in\mathscr{R}_0. \tag{10.2}$$

The following lemma explains why $N(R)=N(\omega,R)$ is called a (random) measure.

Lemma 10.2. The random set function $R\mapsto N(R):=N(\omega,R)$, $R\in\mathscr{R}_0$, is countably additive in $L^2$, i.e.

$$N\Big(\dot\bigcup_{n=1}^{\infty}R_n\Big) = L^2\text{-}\lim_{n\to\infty}\sum_{k=1}^n N(R_k) \quad\text{a.s.} \tag{10.3}$$

for every sequence $(R_n)_{n\in\mathbb{N}}\subset\mathscr{R}_0$ of mutually disjoint sets such that $R:=\dot\bigcup_{n=1}^{\infty}R_n\in\mathscr{R}_0$. In particular, $N(R\,\dot\cup\,S)=N(R)+N(S)$ a.s. for disjoint $R,S\in\mathscr{R}_0$ such that $R\,\dot\cup\,S\in\mathscr{R}_0$, and $N(\emptyset)=0$ a.s. (notice that the exceptional set may depend on the sets $R,S$).

Proof. From $R=S=\emptyset$ and $\mathbb{E}[N(\emptyset)^2]=\mu(\emptyset)=0$ we get $N(\emptyset)=0$ a.s. It is enough to prove (10.3), as finite additivity follows if we take $(R_1,R_2,R_3,R_4,\dots)=(R,S,\emptyset,\emptyset,\dots)$. If $R_n\in\mathscr{R}_0$ are mutually disjoint sets such that $R:=\dot\bigcup_{n=1}^{\infty}R_n\in\mathscr{R}_0$, then

$$\begin{aligned}
\mathbb{E}\left[\Big(N(R)-\sum_{k=1}^n N(R_k)\Big)^2\right] &= \mathbb{E}N^2(R) + \sum_{k=1}^n\mathbb{E}N^2(R_k) - 2\sum_{k=1}^n\mathbb{E}\big[N(R)N(R_k)\big] + \sum_{\substack{j,k=1\\ j\ne k}}^n\mathbb{E}\big[N(R_j)N(R_k)\big]\\
&\overset{(10.2)}{=} \mu(R)-\sum_{k=1}^n\mu(R_k) \xrightarrow[n\to\infty]{} 0,
\end{aligned}$$

where we use the $\sigma$-additivity of the measure $\mu$.


Example 10.3. a) (White noise) Let $\mathscr{R}=\{(s,t] : 0\le s<t<\infty\}$ and $\mu=\lambda$ be Lebesgue measure on $(0,\infty)$. Clearly, $\mathscr{R}=\mathscr{R}_0$ is a semiring. Let $W=(W_t)_{t\ge 0}$ be a one-dimensional standard Brownian motion. The random set function

$$N(\omega,(s,t]) := W_t(\omega)-W_s(\omega), \quad 0\le s<t<\infty,$$

is a random orthogonal measure with control measure $\lambda$. This follows at once from

$$\mathbb{E}\big[(W_t-W_s)(W_v-W_u)\big] = t\wedge v - s\vee u = \lambda\big[(s,t]\cap(u,v]\big]$$

for all $0\le s<t<\infty$ and $0\le u<v<\infty$.

Mind, however, that $N$ is not $\sigma$-additive. To see this, take $R_n := (1/(n+1),1/n]$, $n\in\mathbb{N}$, and observe that $\dot\bigcup_n R_n=(0,1]$. Since $W$ has stationary and independent increments, and scales like $W_t\sim\sqrt{t}\,W_1$, we have

$$\mathbb{E}\exp\Big[-\sum_{n=1}^{\infty}|N(R_n)|\Big] = \mathbb{E}\exp\Big[-\sum_{n=1}^{\infty}|W_{1/(n+1)}-W_{1/n}|\Big] = \prod_{n=1}^{\infty}\mathbb{E}\exp\big[-(n(n+1))^{-1/2}|W_1|\big] \overset{\text{Jensen}}{\le} \prod_{n=1}^{\infty}\alpha^{(n(n+1))^{-1/2}}, \quad \alpha := \mathbb{E}e^{-|W_1|}\in(0,1).$$

As the series $\sum_{n=1}^{\infty}(n(n+1))^{-1/2}$ diverges, we get $\mathbb{E}\exp\big[-\sum_{n=1}^{\infty}|N(R_n)|\big]=0$, which means that $\sum_{n=1}^{\infty}|N(\omega,R_n)|=\infty$ for almost all $\omega$. This shows that $N(\cdot)$ cannot be countably additive. Indeed, countable additivity implies that the series

$$N\Big(\omega,\bigcup_{n=1}^{\infty}R_n\Big) = \sum_{n=1}^{\infty}N(\omega,R_n)$$

converges. The left-hand side, hence the summation, does not change under rearrangements. This, however, entails absolute convergence of the series $\sum_{n=1}^{\infty}|N(\omega,R_n)|<\infty$, which does not hold, as we have seen above.
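The defining relation $\mathbb{E}[N(R)N(S)]=\lambda(R\cap S)$ of white noise is easy to test by Monte Carlo. A sketch (assuming NumPy; the two intervals and the sample size are our own illustrative choices, not from the text) samples Brownian increments over two overlapping intervals and compares the empirical product moment with the length of the overlap:

```python
import numpy as np

rng = np.random.default_rng(2)
paths = 400_000
s, t_, u, v = 0.2, 0.8, 0.5, 1.3           # (s,t] and (u,v] overlap in (0.5, 0.8]

# sample W at the sorted grid 0 < 0.2 < 0.5 < 0.8 < 1.3 via independent increments
grid = np.array([0.0, s, u, t_, v])
inc = rng.normal(0.0, np.sqrt(np.diff(grid))[:, None], (4, paths))
Wg = np.vstack([np.zeros(paths), np.cumsum(inc, axis=0)])   # rows = W at grid points

N_R = Wg[3] - Wg[1]                        # N((s,t]) = W_t - W_s
N_S = Wg[4] - Wg[2]                        # N((u,v]) = W_v - W_u
est = np.mean(N_R * N_S)                   # should be lambda((s,t] cap (u,v]) = 0.3
```

With the sample size above, `est` is close to the overlap length $0.3$; for disjoint intervals the same estimate is close to $0$.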

b) (2nd order orthogonal noise) Let $X=(X_t)_{t\in T}$ be a complex-valued stochastic process defined on a bounded or unbounded interval $T\subset\mathbb{R}$. We assume that $X$ has a.s. càdlàg paths. If $\mathbb{E}(|X_t|^2)<\infty$, we call $X$ a second-order process; many properties of $X$ are characterized by the correlation function $K(s,t)=\mathbb{E}\big(X_s\overline{X}_t\big)$, $s,t\in T$.

If $\mathbb{E}\big[(X_t-X_s)(\overline{X}_v-\overline{X}_u)\big]=0$ for all $s\le t\le u\le v$, $s,t,u,v\in T$, then $X$ is said to have orthogonal increments. Fix $t_0\in T$ and define for all $t\in T$

$$F(t) := \begin{cases}\mathbb{E}(|X_t-X_{t_0}|^2), & \text{if } t\ge t_0,\\ -\mathbb{E}(|X_{t_0}-X_t|^2), & \text{if } t\le t_0.\end{cases}$$

Clearly, $F$ is increasing and, since $t\mapsto X_t$ is a.s. right-continuous, it is also right-continuous. Moreover,

$$F(t)-F(s) = \mathbb{E}(|X_t-X_s|^2) \quad\text{for all } s\le t,\ s,t\in T. \tag{10.4}$$

To see this, we assume without loss of generality that $t_0\le s\le t$. We have

$$F(t)-F(s) = \mathbb{E}(|X_t-X_{t_0}|^2) - \mathbb{E}(|X_s-X_{t_0}|^2) = \mathbb{E}\big(|(X_t-X_s)+(X_s-X_{t_0})|^2\big) - \mathbb{E}(|X_s-X_{t_0}|^2) \overset{\text{orth. incr.}}{=} \mathbb{E}(|X_t-X_s|^2).$$

This shows that $\mu(s,t] := F(t)-F(s)$ defines a measure on $\mathscr{R}=\mathscr{R}_0=\{(s,t] : -\infty<s<t<\infty,\ s,t\in T\}$, which is the control measure of $N(\omega,(s,t]) := X_t(\omega)-X_s(\omega)$. In fact, for $s<t$, $u<v$, $s,t,u,v\in T$, we have

$$X_t-X_s = \big(X_t-X_{t\wedge v}\big)+\big(X_{t\wedge v}-X_{s\vee u}\big)+\big(X_{s\vee u}-X_s\big),$$
$$\overline{X}_v-\overline{X}_u = \big(\overline{X}_v-\overline{X}_{t\wedge v}\big)+\big(\overline{X}_{t\wedge v}-\overline{X}_{s\vee u}\big)+\big(\overline{X}_{s\vee u}-\overline{X}_u\big).$$

Using the orthogonality of the increments we get

$$\mathbb{E}\big[(X_t-X_s)(\overline{X}_v-\overline{X}_u)\big] = \mathbb{E}\big[\big(X_{t\wedge v}-X_{s\vee u}\big)\big(\overline{X}_{t\wedge v}-\overline{X}_{s\vee u}\big)\big] = F(t\wedge v)-F(s\vee u) = \mu\big((s,t]\cap(u,v]\big),$$

i.e. $N(\omega,\bullet)$ is a random orthogonal measure.

c) (Martingale noise) Let $M=(M_t)_{t\ge 0}$ be a square-integrable martingale with respect to the filtration $(\mathscr{F}_t)_{t\ge 0}$, $M_0=0$, and with càdlàg paths. Denote by $\langle M\rangle$ the predictable quadratic variation, i.e. the unique increasing predictable process with $\langle M\rangle_0=0$ such that $M^2-\langle M\rangle$ is a martingale. The random set function

$$N(\omega,(s,t]) := M_t(\omega)-M_s(\omega), \quad s\le t,$$

is a random orthogonal measure on $\mathscr{R}=\{(s,t] : 0\le s<t<\infty\}$ with control measure $\mu(s,t]=\mathbb{E}\big(\langle M\rangle_t-\langle M\rangle_s\big)$. This follows immediately from the tower property of conditional expectation

$$\mathbb{E}[M_tM_v] \overset{\text{tower}}{=} \mathbb{E}[M_t\,\mathbb{E}(M_v\,|\,\mathscr{F}_t)] = \mathbb{E}[M_t^2] = \mathbb{E}\langle M\rangle_t \quad\text{if } t\le v,$$

which, in turn, gives for all $0\le s<t$ and $0\le u<v$

$$\mathbb{E}\big[(M_t-M_s)(M_v-M_u)\big] = \mathbb{E}\langle M\rangle_{t\wedge v}-\mathbb{E}\langle M\rangle_{s\wedge v}-\mathbb{E}\langle M\rangle_{t\wedge u}+\mathbb{E}\langle M\rangle_{s\wedge u} = \mu\big((s,t]\cap(0,v]\big)-\mu\big((s,t]\cap(0,u]\big) = \mu\big((s,t]\cap(u,v]\big).$$

d) (Poisson random measure) Let $X$ be a $d$-dimensional Lévy process,

$$\mathscr{S} := \big\{B\in\mathscr{B}(\mathbb{R}^d) : 0\notin\overline{B}\big\}, \quad \mathscr{R} := \big\{(s,t]\times B : 0\le s<t<\infty,\ B\in\mathscr{S}\big\},$$

and $N_t(B)$ the jump measure (Definition 9.1). The random set function

$$\tilde N\big(\omega,(s,t]\times B\big) := \big[N_t(\omega,B)-t\nu(B)\big] - \big[N_s(\omega,B)-s\nu(B)\big], \quad R=(s,t]\times B\in\mathscr{R},$$

is a random orthogonal measure with control measure $\lambda\times\nu$, where $\lambda$ is Lebesgue measure on $(0,\infty)$ and $\nu$ is the Lévy measure of $X$. Indeed, by definition $\mathscr{R}=\mathscr{R}_0$, and it is not hard to see that $\mathscr{R}$ is a semiring.¹

Set $\tilde N_t(B) := \tilde N((0,t]\times B)$ and let $B,C\in\mathscr{S}$, $t,v\ge 0$. As in the proof of Corollary 9.11 we have

$$\mathbb{E}\big[\tilde N_t(B)\tilde N_t(C)\big] = t\nu(B\cap C).$$

Since $\mathscr{S}$ is a semiring, we get $B=(B\cap C)\,\dot\cup\,(B\setminus C)=(B\cap C)\,\dot\cup\,B_1\,\dot\cup\cdots\dot\cup\,B_n$ with finitely many mutually disjoint $B_k\in\mathscr{S}$ such that $B_k\subset B\setminus C$.

The processes $\tilde N(B_k)$ and $\tilde N(C)$ are independent (Corollary 9.10) and centered. Therefore we have for $t\le v$

$$\mathbb{E}\big[\tilde N_t(B)\tilde N_v(C)\big] = \mathbb{E}\big[\tilde N_t(B\cap C)\tilde N_v(C)\big] + \sum_{k=1}^n\mathbb{E}\big[\tilde N_t(B_k)\tilde N_v(C)\big] = \mathbb{E}\big[\tilde N_t(B\cap C)\tilde N_v(C)\big] + \sum_{k=1}^n\mathbb{E}\tilde N_t(B_k)\cdot\mathbb{E}\tilde N_v(C) = \mathbb{E}\big[\tilde N_t(B\cap C)\tilde N_v(C)\big].$$

Use the same argument over again, as well as the fact that $\tilde N_t(B\cap C)$ has independent and centered increments (Lemma 9.4), to get

$$\begin{aligned}
\mathbb{E}\big[\tilde N_t(B)\tilde N_v(C)\big] &= \mathbb{E}\big[\tilde N_t(B\cap C)\tilde N_v(B\cap C)\big]\\
&= \mathbb{E}\big[\tilde N_t(B\cap C)\tilde N_t(B\cap C)\big] + \overbrace{\mathbb{E}\big[\tilde N_t(B\cap C)\big(\tilde N_v(B\cap C)-\tilde N_t(B\cap C)\big)\big]}^{=\,\mathbb{E}\tilde N_t(B\cap C)\cdot\mathbb{E}[\tilde N_v(B\cap C)-\tilde N_t(B\cap C)]\,=\,0}\\
&= \mathbb{E}\big[\tilde N_t(B\cap C)\tilde N_t(B\cap C)\big] = t\nu(B\cap C) = \lambda((0,t]\cap(0,v])\,\nu(B\cap C).
\end{aligned} \tag{10.5}$$

For $s\le t$, $u\le v$ and $B,C\in\mathscr{S}$ a lengthy, but otherwise completely elementary, calculation based on (10.5) shows

$$\mathbb{E}\big[\tilde N((s,t]\times B)\,\tilde N((u,v]\times C)\big] = \lambda((s,t]\cap(u,v])\,\nu(B\cap C).$$

e) (Space-time white noise) Let $\mathscr{R} := \big\{(0,t]\times B : t\ge 0,\ B\in\mathscr{B}(\mathbb{R}^d)\big\}$ and $\mu=\lambda$ Lebesgue measure on the half-space $\mathbb{H}^+ := [0,\infty)\times\mathbb{R}^d$. Consider the mean-zero, real-valued Gaussian process $(\mathbb{W}(R))_{R\in\mathscr{B}(\mathbb{H}^+)}$ whose covariance function is given by $\mathrm{Cov}(\mathbb{W}(R),\mathbb{W}(S))=\lambda(R\cap S)$.² By its very definition, $\mathbb{W}(R)$ is a random orthogonal measure on $\mathscr{R}_0$ with control measure $\lambda$.

¹Both $\mathscr{S}$ and $\mathscr{I} := \{(s,t] : 0\le s<t<\infty\}$ are semirings, and so is their Cartesian product $\mathscr{R}=\mathscr{I}\times\mathscr{S}$; see [54, Lemma 13.1] or [55, Lemma 15.1] for the straightforward proof.

²The map $(R,S)\mapsto\lambda(R\cap S)$ is positive semidefinite, i.e. for $R_1,\dots,R_n\in\mathscr{B}(\mathbb{H}^+)$ and $\xi_1,\dots,\xi_n\in\mathbb{R}$

$$\sum_{j,k=1}^n\xi_j\xi_k\,\lambda(R_j\cap R_k) = \sum_{j,k=1}^n\int\xi_j\mathbf{1}_{R_j}(x)\,\xi_k\mathbf{1}_{R_k}(x)\,\lambda(dx) = \int\Big(\sum_{k=1}^n\xi_k\mathbf{1}_{R_k}(x)\Big)^2\lambda(dx) \ge 0.$$


We will now define a stochastic integral in the spirit of Itô’s original construction.

Definition 10.4. Let $\mathscr{R}$ be a semiring and $\mathscr{R}_0=\{R\in\mathscr{R} : \mu(R)<\infty\}$. A simple function is a deterministic function of the form

$$f(x) = \sum_{k=1}^n c_k\mathbf{1}_{R_k}(x), \quad n\in\mathbb{N},\ c_k\in\mathbb{R},\ R_k\in\mathscr{R}_0. \tag{10.6}$$

Intuitively, $I^N(f)=\sum_{k=1}^n c_kN(R_k)$ should be the stochastic integral of a simple function $f$. The only problem is well-definedness. Since a random orthogonal measure is a.s. finitely additive, the following lemma has exactly the same proof as the usual well-definedness result for the Lebesgue integral of a step function, see e.g. Schilling [54, Lemma 9.1] or [55, Lemma 8.1]; note that finite unions of null sets are again null sets.

Lemma 10.5. Let $f$ be a simple function and assume that $f=\sum_{k=1}^n c_k\mathbf{1}_{R_k}=\sum_{j=1}^m b_j\mathbf{1}_{S_j}$ has two representations as a step function. Then

$$\sum_{k=1}^n c_kN(R_k) = \sum_{j=1}^m b_jN(S_j) \quad\text{a.s.}$$

Definition 10.6. Let $N(R)$, $R\in\mathscr{R}_0$, be a random orthogonal measure with control measure $\mu$. The stochastic integral of a simple function $f$ given by (10.6) is the random variable

$$I^N(\omega,f) := \sum_{k=1}^n c_kN(\omega,R_k). \tag{10.7}$$

The following properties of the stochastic integral are more or less immediate from the definition.

Lemma 10.7. Let $N(R)$, $R\in\mathscr{R}_0$, be a random orthogonal measure with control measure $\mu$, $f,g$ simple functions, and $\alpha,\beta\in\mathbb{R}$.

a) $I^N(\mathbf{1}_R)=N(R)$ for all $R\in\mathscr{R}_0$;

b) $S\mapsto I^N(\mathbf{1}_S)$ extends $N$ uniquely to $S\in\rho(\mathscr{R}_0)$, the ring generated by $\mathscr{R}_0$;³

c) $I^N(\alpha f+\beta g)=\alpha I^N(f)+\beta I^N(g)$; (linearity)

d) $\mathbb{E}\big[I^N(f)^2\big]=\int f^2\,d\mu$. (Itô's isometry)

Proof. The properties a) and c) are clear. For b) we note that $\rho(\mathscr{R}_0)$ can be constructed from $\mathscr{R}_0$ by adding all possible finite unions of (disjoint) sets (see e.g. [54, Proof of Theorem 6.1, Step 2]).

³A ring is a family of sets which contains $\emptyset$ and which is stable under unions and differences of finitely many sets. Since $R\cap S=R\setminus(R\setminus S)$, it is automatically stable under finite intersections. The ring generated by $\mathscr{R}_0$ is the smallest ring containing $\mathscr{R}_0$.


In order to see d), we use (10.6) and the orthogonality relation $\mathbb{E}[N(R_j)N(R_k)]=\mu(R_j\cap R_k)$ to get

$$\mathbb{E}\big[I^N(f)^2\big] = \sum_{j,k=1}^n c_jc_k\,\mathbb{E}[N(R_j)N(R_k)] = \sum_{j,k=1}^n c_jc_k\,\mu(R_j\cap R_k) = \int\sum_{j,k=1}^n c_j\mathbf{1}_{R_j}(x)\,c_k\mathbf{1}_{R_k}(x)\,\mu(dx) = \int f^2(x)\,\mu(dx).$$
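Itô's isometry for simple functions can be checked directly against white noise (Example 10.3.a)). A sketch (assuming NumPy; the step function and sample size are our own illustrative choices, not from the text): for $f=2\cdot\mathbf{1}_{(0,1]}-1\cdot\mathbf{1}_{(1,3]}$ we have $\int f^2\,d\lambda=4\cdot 1+1\cdot 2=6$ and $I^N(f)=2(W_1-W_0)-(W_3-W_1)$:

```python
import numpy as np

rng = np.random.default_rng(11)
paths = 400_000

# f = 2*1_{(0,1]} - 1*1_{(1,3]};  int f^2 dlambda = 4*1 + 1*2 = 6
c = np.array([2.0, -1.0])
lengths = np.array([1.0, 2.0])             # lambda(R_1), lambda(R_2), disjoint intervals

# N(R_k) = Brownian increment over R_k: independent N(0, lambda(R_k))
N = rng.normal(0.0, np.sqrt(lengths)[:, None], (2, paths))
I = c @ N                                  # I^N(f) = sum_k c_k N(R_k), one value per path

lhs = np.mean(I**2)                        # empirical E[I^N(f)^2]
rhs = float(np.sum(c**2 * lengths))        # int f^2 dmu = 6
```

The empirical second moment `lhs` agrees with `rhs = 6` up to Monte Carlo error, exactly as d) predicts.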

Itô's isometry now allows us to extend the stochastic integral to the $L^2(\mu)$-closure of the simple functions: $L^2(E,\sigma(\mathscr{R}),\mu)$. For this take $f\in L^2(E,\sigma(\mathscr{R}),\mu)$ and any approximating sequence $(f_n)_{n\in\mathbb{N}}$ of simple functions, i.e.

$$\lim_{n\to\infty}\int|f-f_n|^2\,d\mu = 0.$$

In particular, $(f_n)_{n\in\mathbb{N}}$ is an $L^2(\mu)$ Cauchy sequence, and Itô's isometry shows that the random variables $(I^N(f_n))_{n\in\mathbb{N}}$ form a Cauchy sequence in $L^2(P)$:

$$\mathbb{E}\big[(I^N(f_n)-I^N(f_m))^2\big] = \mathbb{E}\big[I^N(f_n-f_m)^2\big] = \int(f_n-f_m)^2\,d\mu \xrightarrow[m,n\to\infty]{} 0.$$

Because of the completeness of $L^2(P)$, the limit $\lim_{n\to\infty}I^N(f_n)$ exists and, by a standard argument, it does not depend on the approximating sequence.

Definition 10.8. Let $N(R)$, $R\in\mathscr{R}_0$, be a random orthogonal measure with control measure $\mu$. The stochastic integral of a function $f\in L^2(E,\sigma(\mathscr{R}),\mu)$ is the random variable

$$\int f(x)\,N(\omega,dx) := L^2(P)\text{-}\lim_{n\to\infty}I^N(\omega,f_n) \tag{10.8}$$

where $(f_n)_{n\in\mathbb{N}}$ is any sequence of simple functions which approximates $f$ in $L^2(\mu)$.

It is immediate from the definition of the stochastic integral that $f\mapsto\int f\,dN$ is linear and enjoys Itô's isometry

$$\mathbb{E}\left[\left(\int f(x)\,N(dx)\right)^2\right] = \int f^2(x)\,\mu(dx). \tag{10.9}$$

Remark 10.9. Assume that the random orthogonal measure $N$ is of space-time type, i.e. $E=(0,\infty)\times X$ where $(X,\mathscr{X})$ is some measurable space, and $\mathscr{R}=\{(0,t]\times B : B\in\mathscr{S}\}$ where $\mathscr{S}$ is a semiring in $\mathscr{X}$. If for $B\in\mathscr{S}$ the stochastic process $N_t(B) := N((0,t]\times B)$ is a martingale w.r.t. the filtration $\mathscr{F}_t := \sigma(N((0,s]\times B),\ s\le t,\ B\in\mathscr{S})$, then

$$N_t(f) := \iint\mathbf{1}_{(0,t]}(s)f(x)\,N(ds,dx), \quad \mathbf{1}_{(0,t]}\otimes f\in L^2(\mu),\ t\ge 0,$$

is again a(n $L^2$-)martingale. For simple functions $f$ this follows immediately from the fact that sums and differences of finitely many martingales (with a common filtration) are again martingales. Since $L^2(P)$-limits preserve the martingale property, the claim follows.


At first sight, the stochastic integral defined in Definition 10.8 looks rather restrictive, since we can only integrate deterministic functions $f$. As all randomness can be put into the random orthogonal measure, we have considerable flexibility, and the following construction shows that Definition 10.8 covers pretty much the most general stochastic integrals.

From now on we assume that

• the random measure $N(dt,dx)$ on $(E,\mathscr{E})=\big((0,\infty)\times X,\ \mathscr{B}((0,\infty))\otimes\mathscr{X}\big)$ is of space-time type, cf. Remark 10.9, with control measure $\mu(dt,dx)$;

• $(\mathscr{F}_t)_{t\ge 0}$ is some filtration in $(\Omega,\mathscr{A},P)$.

Let $\tau$ be a stopping time; the set $\rrbracket 0,\tau\rrbracket := \{(\omega,t) : 0<t\le\tau(\omega)\}$ is called a stochastic interval. We define

• $\overline{E} := \Omega\times(0,\infty)\times X$;

• $\overline{\mathscr{E}} := \mathscr{P}\otimes\mathscr{X}$, where $\mathscr{P}$ is the predictable $\sigma$-algebra in $\Omega\times(0,\infty)$, see Definition A.8 in the appendix;

• $\overline{\mathscr{R}} := \{\rrbracket 0,\tau\rrbracket\times B : \tau\text{ bounded stopping time},\ B\in\mathscr{S}\}$;

• $\overline{\mu}(d\omega,dt,dx) := P(d\omega)\,\mu(dt,dx)$ as control measure;

• $\overline{N}\big(\omega,\rrbracket 0,\tau\rrbracket\times B\big) := N\big(\omega,(0,\tau(\omega)]\times B\big)$ as random orthogonal measure.⁴

Lemma 10.10. Let $\overline{N}$, $\overline{\mathscr{R}}_0$ and $\overline{\mu}$ be as above. The $\overline{\mathscr{R}}_0$-simple processes

$$f(\omega,t,x) := \sum_{k=1}^n c_k\mathbf{1}_{\rrbracket 0,\tau_k\rrbracket}(\omega,t)\mathbf{1}_{B_k}(x), \quad c_k\in\mathbb{R},\ \rrbracket 0,\tau_k\rrbracket\times B_k\in\overline{\mathscr{R}}_0,$$

are $L^2(\overline{\mu})$-dense in $L^2(\overline{E},\mathscr{P}\otimes\sigma(\mathscr{S}),\overline{\mu})$.

Proof. This follows from standard arguments from measure and integration theory; notice that the predictable $\sigma$-algebra $\mathscr{P}\otimes\sigma(\mathscr{S})$ is generated by sets of the form $\rrbracket 0,\tau\rrbracket\times B$ where $\tau$ is a bounded stopping time and $B\in\mathscr{S}$, cf. Theorem A.9 in the appendix.

Observe that for the simple processes appearing in Lemma 10.10

$$\iint f(\omega,t,x)\,N(\omega,dt,dx) := \sum_{k=1}^n c_k\,\overline{N}(\omega,\rrbracket 0,\tau_k\rrbracket\times B_k) = \sum_{k=1}^n c_k\,N(\omega,(0,\tau_k(\omega)]\times B_k)$$

is a stochastic integral which satisfies

$$\mathbb{E}\left[\left(\iint f(\cdot,t,x)\,N(\cdot,dt,dx)\right)^2\right] = \sum_{k=1}^n c_k^2\,\overline{\mu}\big(\rrbracket 0,\tau_k\rrbracket\times B_k\big) = \sum_{k=1}^n c_k^2\,\mathbb{E}\mu\big((0,\tau_k]\times B_k\big).$$

Just as above we can now extend the stochastic integral to $L^2(\overline{E},\mathscr{P}\otimes\sigma(\mathscr{S}),\overline{\mu})$.

⁴To see that it is indeed a random orthogonal measure, use a discrete approximation of $\tau$.


Corollary 10.11. Let $N(\omega,dt,dx)$ be a random orthogonal measure on $E$ of space-time type (cf. Remark 10.9) with control measure $\mu(dt,dx)$ and let $f:\Omega\times(0,\infty)\times X\to\mathbb{R}$ be an element of $L^2(\overline{E},\mathscr{P}\otimes\sigma(\mathscr{S}),\overline{\mu})$. Then the stochastic integral

$$\iint f(\omega,t,x)\,N(\omega,dt,dx)$$

exists and satisfies the following Itô isometry:

$$\mathbb{E}\left[\left(\iint f(\cdot,t,x)\,N(\cdot,dt,dx)\right)^2\right] = \iint\mathbb{E}f^2(\cdot,t,x)\,\mu(dt,dx). \tag{10.10}$$

Let us show that the stochastic integral w.r.t. a space-time random orthogonal measure extends the usual Itô integral. To do so we need the following auxiliary result.

Lemma 10.12. Let $N(\omega,dt,dx)$ be a random orthogonal measure on $E$ of space-time type (cf. Remark 10.9) with control measure $\mu(dt,dx)$ and $\tau$ a stopping time. Then

$$\iint\phi(\omega)\mathbf{1}_{\rrbracket\tau,\infty\llbracket}(\omega,t)f(\omega,t,x)\,N(\omega,dt,dx) = \phi(\omega)\iint\mathbf{1}_{\rrbracket\tau,\infty\llbracket}(\omega,t)f(\omega,t,x)\,N(\omega,dt,dx) \tag{10.11}$$

for all $\phi\in L^\infty(\mathscr{F}_\tau)$ and $f\in L^2(\overline{E},\mathscr{P}\otimes\sigma(\mathscr{S}),\overline{\mu})$.

Proof. Since $t\mapsto\mathbf{1}_{\rrbracket\tau,\infty\llbracket}(\omega,t)$ is adapted to the filtration $(\mathscr{F}_t)_{t\ge 0}$ and left-continuous, the integrands appearing in (10.11) are predictable; hence all stochastic integrals are well-defined.

1 Assume that $\phi(\omega)=\mathbf{1}_F(\omega)$ for some $F\in\mathscr{F}_\tau$ and $f(\omega,t,x)=\mathbf{1}_{\rrbracket 0,\sigma\rrbracket}(\omega,t)\mathbf{1}_B(x)$ for some bounded stopping time $\sigma$ and $\rrbracket 0,\sigma\rrbracket\times B\in\overline{\mathscr{R}}_0$. Define

$$N_\sigma(\omega,B) := N\big(\omega,(0,\sigma(\omega)]\times B\big) = \overline{N}\big(\omega,\rrbracket 0,\sigma\rrbracket\times B\big)$$

for any $\rrbracket 0,\sigma\rrbracket\times B\in\overline{\mathscr{R}}_0$. The random time $\tau_F := \tau\mathbf{1}_F+\infty\mathbf{1}_{F^c}$ is a stopping time,⁵ and we have

$$\phi\,\mathbf{1}_{\rrbracket\tau,\infty\llbracket}\,f = \mathbf{1}_F\mathbf{1}_{\rrbracket\tau,\infty\llbracket}\mathbf{1}_{\rrbracket 0,\sigma\rrbracket}\mathbf{1}_B = \mathbf{1}_{\rrbracket\tau_F,\infty\llbracket}\mathbf{1}_{\rrbracket 0,\sigma\rrbracket}\mathbf{1}_B = \mathbf{1}_{\rrbracket\tau_F\wedge\sigma,\sigma\rrbracket}\mathbf{1}_B.$$

From this we get (10.11) for our choice of $\phi$ and $f$:

$$\begin{aligned}
\iint\phi\,\mathbf{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x)\,N(dt,dx) &= \iint\mathbf{1}_{\rrbracket\tau_F\wedge\sigma,\sigma\rrbracket}(t)\mathbf{1}_B(x)\,N(dt,dx)\\
&= N_\sigma(B)-N_{\tau_F\wedge\sigma}(B)\\
&= \mathbf{1}_F\cdot\big(N_\sigma(B)-N_{\tau\wedge\sigma}(B)\big)\\
&= \mathbf{1}_F\iint\mathbf{1}_{\rrbracket\tau\wedge\sigma,\sigma\rrbracket}(t)\mathbf{1}_B(x)\,N(dt,dx)\\
&= \mathbf{1}_F\iint\mathbf{1}_{\rrbracket\tau,\infty\llbracket}(t)\mathbf{1}_{\rrbracket 0,\sigma\rrbracket}(t)\mathbf{1}_B(x)\,N(dt,dx)\\
&= \phi\iint\mathbf{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x)\,N(dt,dx).
\end{aligned}$$

⁵Indeed, $\{\tau_F\le t\}=\{\tau\le t\}\cap F\in\mathscr{F}_t$ for all $t\ge 0$, because $F\in\mathscr{F}_\tau$.


2 If φ = 1F for some F ∈Fτ and f is a simple process, then (10.11) follows from 1 because ofthe linearity of the stochastic integral.

3 If φ = 1F for some F ∈Fτ and f ∈ L2(µ), then (10.11) follows from 2 and Itô’s isometry:Let fn be a sequence of simple processes which approximate f . Then

E

[(∫∫ (φ1Kτ,∞J(t) fn(t,x)−φ1Kτ,∞J(t) f (t,x)

)N(dt,dx)

)2]

=∫∫E[(

φ1Kτ,∞J(t) fn(t,x)−φ1Kτ,∞J(t) f (t,x))2]

µ(dt,dx)

6∫∫E[(

fn(t,x)− f (t,x))2]

µ(dt,dx)−−−→n→∞

0.

4 If φ is an Fτ measurable step-function and f ∈ L2(µ), then (10.11) follows from 3 becauseof the linearity of the stochastic integral.

5 Since we can approximate φ ∈ L^∞(F_τ) uniformly by F_τ measurable step functions φ_n, (10.11) follows from 4 and Itô's isometry because of the following inequality:
\[
\begin{aligned}
\mathbb{E}\bigg[\Big(\iint \big[\varphi_n\,\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x) - \varphi\,\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x)\big]\,N(dt,dx)\Big)^2\bigg]
&= \iint \mathbb{E}\Big(\big[\varphi_n\,\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x) - \varphi\,\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x)\big]^2\Big)\,\mu(dt,dx)\\
&\le \|\varphi_n-\varphi\|_{L^\infty(\mathbb{P})}^2 \iint \mathbb{E}\big[f^2(t,x)\big]\,\mu(dt,dx).
\end{aligned}
\]

We will now consider 'martingale noise' random orthogonal measures, see Example 10.3.c), which are given by (the predictable quadratic variation of) a square-integrable martingale M. For these random measures our definition of the stochastic integral coincides with Itô's definition. Recall that the Itô integral driven by M is first defined for simple, left-continuous processes of the form
\[
f(\omega,t) := \sum_{k=1}^n \varphi_k(\omega)\,\mathbb{1}_{\rrbracket\tau_k,\tau_{k+1}\rrbracket}(\omega,t), \quad t\ge 0, \tag{10.12}
\]
where 0 ≤ τ₁ ≤ τ₂ ≤ … ≤ τ_{n+1} are bounded stopping times and the φ_k are bounded F_{τ_k} measurable random variables. The Itô integral for such simple processes is
\[
\int f(\omega,t)\,dM_t(\omega) := \sum_{k=1}^n \varphi_k(\omega)\big(M_{\tau_{k+1}}(\omega) - M_{\tau_k}(\omega)\big),
\]
and it is extended by Itô's isometry to all integrands from L²(Ω×(0,∞), P, dP⊗d⟨M⟩_t). For details we refer to any standard text on Itô integration, e.g. Protter [43, Chapter II] or Revuz & Yor [44, Chapter IV].
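The simple-process sum above is easy to test numerically. The following sketch is our own illustration (not from the text; all names are ours): it evaluates ∑_k φ_k(M_{τ_{k+1}} − M_{τ_k}) along simulated paths of a symmetric random walk — a square-integrable martingale — with deterministic times τ_k for simplicity, and checks that the resulting integral has mean zero, as a martingale transform must.

```python
import numpy as np

rng = np.random.default_rng(1)

def ito_simple(phi, tau, M):
    """Ito integral of the simple process sum_k phi_k 1_((tau_k, tau_{k+1}]]
    against a martingale path M, i.e. sum_k phi_k (M_{tau_{k+1}} - M_{tau_k})."""
    return sum(p * (M[t1] - M[t0]) for p, t0, t1 in zip(phi, tau[:-1], tau[1:]))

n_paths, n_steps = 20_000, 100
tau = [0, 20, 50, 100]                    # here: deterministic 'stopping times'
vals = np.empty(n_paths)
for i in range(n_paths):
    # symmetric random walk: a square-integrable martingale
    M = np.concatenate([[0.0], np.cumsum(rng.choice([-1.0, 1.0], n_steps))])
    phi = [1.0, float(M[20] > 0), -0.5]   # phi_k is measurable at time tau_k
    vals[i] = ito_simple(phi, tau, M)

print(vals.mean())                        # martingale transform: mean ~ 0
```

Note that φ₂ may depend on the path up to time τ₂ (here, the sign of M at time 20) without destroying the zero-mean property — exactly the measurability requirement in (10.12).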

We will now use Lemma 10.12 in the particular situation where the space component dx is not present.

Theorem 10.13. Let N(dt) be a 'martingale noise' random orthogonal measure induced by the square-integrable martingale M (Example 10.3). The stochastic integral w.r.t. the random orthogonal measure N(dt) and Itô's stochastic integral w.r.t. M coincide.


Proof. Let 0 ≤ τ₁ ≤ τ₂ ≤ … ≤ τ_{n+1} be bounded stopping times, φ_k ∈ L^∞(F_{τ_k}) bounded random variables, and let f(ω,t) be a simple stochastic process of the form (10.12). From Lemma 10.12 we get
\[
\begin{aligned}
\int f(t)\,N(dt)
&= \int \sum_{k=1}^n \varphi_k\,\mathbb{1}_{\rrbracket\tau_k,\tau_{k+1}\rrbracket}(t)\,N(dt)
= \sum_{k=1}^n \int \varphi_k\,\mathbb{1}_{\rrbracket\tau_k,\infty\llbracket}(t)\,\mathbb{1}_{\rrbracket 0,\tau_{k+1}\rrbracket}(t)\,N(dt)\\
&= \sum_{k=1}^n \varphi_k \int \mathbb{1}_{\rrbracket\tau_k,\infty\llbracket}(t)\,\mathbb{1}_{\rrbracket 0,\tau_{k+1}\rrbracket}(t)\,N(dt)
= \sum_{k=1}^n \varphi_k\big(M_{\tau_{k+1}} - M_{\tau_k}\big).
\end{aligned}
\]
This means that both stochastic integrals coincide on the simple stochastic processes. Since both integrals are extended by Itô's isometry, the assertion follows.

Example 10.14. Using random orthogonal measures we can re-state the Lévy–Itô decomposition appearing in Theorem 9.12. For this, let N(dt,dx) be the Poisson random orthogonal measure (Example 10.3.d) on E = (0,∞)×(ℝ^d \ {0}) with control measure dt×ν(dx) (ν is a Lévy measure). Additionally, we define for all deterministic functions h: (0,∞)×ℝ^d → ℝ
\[
\iint h(s,x)\,N(\omega,ds,dx) := \sum_{0<s<\infty} h(s,\Delta X_s(\omega)) \quad \forall\,\omega\in\Omega,
\]
provided that the sum ∑_{0<s<∞} |h(s,ΔX_s(ω))| < ∞ for each ω.⁶ If X is a Lévy process with characteristic exponent ψ and Lévy triplet (l,Q,ν), then
\[
X_t
= \underbrace{\sqrt{Q}\,W_t + \iint \mathbb{1}_{(0,t]}(s)\,y\,\mathbb{1}_{(0,1)}(|y|)\,\widetilde N(ds,dy)}_{=:\,M_t,\ L^2\text{-martingale}}
\;+\; \underbrace{t\,l + \iint \mathbb{1}_{(0,t]}(s)\,y\,\mathbb{1}_{\{|y|\ge 1\}}\,N(ds,dy)}_{=:\,A_t,\ \text{bdd. variation}},
\]
where √Q W_t is the continuous Gaussian part, the integral against the compensated measure Ñ(ds,dy) collects the small jumps, and the last integral is the pure jump part.

Example 10.14 is quite particular in the sense that N(·,dt,dx) is a bona fide positive measure, and the control measure µ(dt,dx) is also the compensator, i.e. a measure such that
\[
\widetilde N((0,t]\times B) = N((0,t]\times B) - \mu((0,t]\times B)
\]
is a square-integrable martingale.

⁶This is essentially an ω-wise Riemann–Stieltjes integral. A sufficient condition for the absolute convergence is, e.g., that h is continuous and h(t,·) vanishes, uniformly in t, in some neighbourhood of x = 0. The reason for this is the fact that N_t(ω, B_ε^c(0)) = N(ω,(0,t]×B_ε^c(0)) < ∞, i.e. there are at most finitely many jumps of size exceeding ε > 0.
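The decomposition of Example 10.14 can be simulated directly. The sketch below is ours, with an ad-hoc toy triplet: drift l, diffusion Q, and the finite Lévy measure ν = λ·Uniform(1,2) carried by {|y| ≥ 1}, so that the compensated small-jump integral vanishes; it samples X_t = tl + √Q W_t + ∑(big jumps) and checks the resulting moments against l + λE[Y] and Q + λE[Y²].

```python
import numpy as np

rng = np.random.default_rng(7)

def levy_increment(t, l, Q, lam, n):
    """Sample n copies of X_t = t*l + sqrt(Q)*W_t + (sum of 'big' jumps) for a
    toy triplet: drift l, diffusion Q, Levy measure nu = lam * Uniform(1,2).
    All jumps have size >= 1, so no compensation is needed."""
    gauss = np.sqrt(Q * t) * rng.standard_normal(n)        # sqrt(Q) W_t
    n_jumps = rng.poisson(lam * t, size=n)                 # N((0,t] x {|y|>=1})
    jumps = np.array([rng.uniform(1.0, 2.0, k).sum() for k in n_jumps])
    return t * l + gauss + jumps

x = levy_increment(1.0, l=0.5, Q=1.0, lam=2.0, n=50_000)
print(x.mean(), x.var())   # compare: l + lam*E[Y] = 3.5,  Q + lam*E[Y^2] = 17/3
```

With an infinite-activity Lévy measure one would additionally have to simulate the compensated small-jump integral; the truncation at |y| ≥ 1 above sidesteps this on purpose.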


Following Ikeda & Watanabe [22, Chapter II.4] we can generalize the set-up of Example 10.14 in the following way: let N(ω,dt,dx) be, for each ω, a positive measure of space-time type. Since t ↦ N(ω,(0,t]×B) is increasing, there is a unique compensator N̂(ω,dt,dx) such that for all B with 𝔼N((0,t]×B) < ∞
\[
\widetilde N(\omega,(0,t]\times B) := N(\omega,(0,t]\times B) - \widehat N(\omega,(0,t]\times B), \quad t\ge 0,
\]
is a square-integrable martingale. If t ↦ N̂((0,t]×B) is continuous and B ↦ N̂((0,t]×B) a σ-finite measure, then one can show that the angle bracket satisfies
\[
\big\langle \widetilde N((0,\cdot\,]\times B),\; \widetilde N((0,\cdot\,]\times C)\big\rangle_t = \widehat N\big((0,t]\times(B\cap C)\big).
\]
This means, in particular, that Ñ(ω,dt,dx) is a random orthogonal measure with control measure µ((0,t]×B) = 𝔼N((0,t]×B), and we are back in the theory which we have developed in the first part of this chapter.

It is possible to develop a fully-fledged stochastic calculus for this kind of random measure.

Definition 10.15. Let (Ω, A, P) be a probability space with a filtration (F_t)_{t≥0}. A semimartingale is a stochastic process X of the form
\[
X_t = X_0 + A_t + M_t + \int_0^t\!\!\int f(\cdot,s,x)\,\widetilde N(ds,dx) + \int_0^t\!\!\int g(\cdot,s,x)\,N(ds,dx)
\]
(with ∫₀ᵗ := ∫_{(0,t]}) where

• X₀ is an F₀ measurable random variable,

• M is a continuous square-integrable local martingale (w.r.t. F_t),

• A is a continuous F_t adapted process of bounded variation,

• N(ds,dx), N̂(ds,dx) and Ñ(ds,dx) are as described above,

• f 1_{⟧0,τ_n⟧} ∈ L²(Ω×(0,∞)×X, P⊗𝒳, µ) for some increasing sequence τ_n ↑ ∞ of bounded stopping times,

• g is such that ∫₀ᵗ∫ g(ω,s,x) N(ω,ds,dx) exists as an ω-wise integral,

• f(·,s,x) g(·,s,x) ≡ 0.

In this case we even have Itô's formula, see [22, Chapter II.5], for any F ∈ C²(ℝ,ℝ):
\[
\begin{aligned}
F(X_t) - F(X_0)
&= \int_0^t F'(X_{s-})\,dA_s + \int_0^t F'(X_{s-})\,dM_s + \frac12\int_0^t F''(X_{s-})\,d\langle M\rangle_s\\
&\quad + \int_0^t\!\!\int \big[F(X_{s-} + f(s,x)) - F(X_{s-})\big]\,\widetilde N(ds,dx)\\
&\quad + \int_0^t\!\!\int \big[F(X_{s-} + g(s,x)) - F(X_{s-})\big]\,N(ds,dx)\\
&\quad + \int_0^t\!\!\int \big[F(X_s + f(s,x)) - F(X_s) - f(s,x)F'(X_s)\big]\,\widehat N(ds,dx),
\end{aligned}
\]
where we use again the convention that ∫₀ᵗ := ∫_{(0,t]}.
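Itô's formula can be checked pathwise in the simplest continuous setting. The sketch below is ours: no jumps (f = g ≡ 0), A ≡ 0, M = Brownian motion, F(x) = x². Then the formula reduces to F(W_T) − F(W_0) = ∫2W_{s−}dW_s + ½∫2 ds, and the discrepancy of the discretized sums is exactly ∑(ΔW)² − T, which vanishes as the grid is refined.

```python
import numpy as np

rng = np.random.default_rng(3)

# One Brownian path on a fine grid.
n, T = 200_000, 1.0
dt = T / n
dW = np.sqrt(dt) * rng.standard_normal(n)
W = np.concatenate([[0.0], np.cumsum(dW)])

# Ito's formula for F(x) = x^2, continuous case (A = 0, no jumps):
#   F(W_T) - F(W_0) = int_0^T F'(W_{s-}) dW_s + (1/2) int_0^T F''(W_{s-}) ds
stoch_int = np.sum(2.0 * W[:-1] * dW)   # left endpoints -> predictable integrand
drift_int = 0.5 * 2.0 * T               # F'' == 2, so (1/2) int F'' ds = T
lhs = W[-1] ** 2
print(lhs - (stoch_int + drift_int))    # small Riemann-sum error
```

Using the left endpoint W_{s−} in the stochastic sum is essential: the right-endpoint sum would converge to the Stratonovich integral instead and the identity would fail by T.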

11. From Lévy to Feller processes

We have seen in Lemma 4.8 that the semigroup P_t f(x) := E^x f(X_t) = E f(X_t + x) of a Lévy process (X_t)_{t≥0} is a Feller semigroup. Moreover, the convolution structure of the transition semigroup E f(X_t + x) = ∫ f(x+y) P(X_t ∈ dy) is a consequence of the spatial homogeneity (translation invariance) of the Lévy process, see Remark 4.5 and the characterization of translation invariant linear functionals (Theorem A.10). Lemma 4.4 shows that the translation invariance of a Lévy process is due to the assumptions (L1) and (L2).

It is, therefore, a natural question to ask what we get if we consider stochastic processes whose semigroups are Feller semigroups which are not translation invariant. Since every Feller semigroup admits a Markov transition kernel (Lemma 5.2), we can use Kolmogorov's construction to obtain a Markov process. Thus, the following definition makes sense.

Definition 11.1. A Feller process is a càdlàg Markov process (X_t)_{t≥0}, X_t: Ω → ℝ^d, t ≥ 0, whose transition semigroup P_t f(x) = E^x f(X_t) is a Feller semigroup.

Remark 11.2. It is no restriction to require that a Feller process has càdlàg paths. By a fundamental result in the theory of stochastic processes we can construct such modifications. Usually one argues like this: it is enough to study the coordinate processes, i.e. d = 1. Rather than looking at t ↦ X_t we consider a (countable, point-separating) family of functions u: ℝ → ℝ and show that each t ↦ u(X_t) has a càdlàg modification. One way of achieving this is to use martingale regularization techniques (e.g. Revuz & Yor [44, Chapter II.2]), which means that we should pick u in such a way that u(X_t) is a supermartingale. The usual candidate for this is the resolvent e^{−λt}R_λ f(X_t) for some f ∈ C_∞^+(ℝ). Indeed, if F_t = σ(X_s, s ≤ t) is the natural filtration, f ≥ 0 and s ≤ t, then
\[
\begin{aligned}
\mathbb{E}^x\big[R_\lambda f(X_t)\,\big|\,\mathscr{F}_s\big]
&= \mathbb{E}^{X_s}\int_0^\infty e^{-\lambda r} P_r f(X_{t-s})\,dr
= \int_0^\infty e^{-\lambda r} P_r P_{t-s} f(X_s)\,dr\\
&= e^{\lambda(t-s)}\int_{t-s}^\infty e^{-\lambda u} P_u f(X_s)\,du
\le e^{\lambda(t-s)}\int_0^\infty e^{-\lambda u} P_u f(X_s)\,du
= e^{\lambda(t-s)} R_\lambda f(X_s).
\end{aligned}
\]

Let F_t = F_t^X := σ(X_s, s ≤ t) be the canonical filtration.

Lemma 11.3. Every Feller process (X_t)_{t≥0} is a strong Markov process, i.e.
\[
\mathbb{E}^x\big[f(X_{t+\tau})\,\big|\,\mathscr{F}_\tau\big] = \mathbb{E}^{X_\tau} f(X_t) \quad \mathbb{P}^x\text{-a.s.\ on }\{\tau<\infty\},\ t\ge 0, \tag{11.1}
\]
holds for any stopping time τ, F_τ := {F ∈ F_∞ : F ∩ {τ ≤ t} ∈ F_t ∀t ≥ 0} and f ∈ C_∞(ℝ^d).


A routine approximation argument shows that (11.1) extends to f(y) = 1_K(y) (where K is a compact set) and then, by a Dynkin-class argument, to any f(y) = 1_B(y) where B ∈ B(ℝ^d).

Proof. To prove (11.1), approximate τ from above by the discrete stopping times τ_n = (⌊2ⁿτ⌋+1)2^{−n} and observe that for F ∈ F_τ with F ⊂ {τ < ∞}
\[
\mathbb{E}^x[\mathbb{1}_F f(X_{t+\tau})]
\overset{\text{(i)}}{=} \lim_{n\to\infty}\mathbb{E}^x[\mathbb{1}_F f(X_{t+\tau_n})]
\overset{\text{(ii)}}{=} \lim_{n\to\infty}\mathbb{E}^x\big[\mathbb{1}_F\,\mathbb{E}^{X_{\tau_n}} f(X_t)\big]
\overset{\text{(iii)}}{=} \mathbb{E}^x\big[\mathbb{1}_F\,\mathbb{E}^{X_\tau} f(X_t)\big].
\]
Here we use that t ↦ X_t is right-continuous, plus (i) dominated convergence and (iii) the Feller continuity 4.7.f); (ii) is the strong Markov property for discrete stopping times, which follows directly from the Markov property: since {τ_n < ∞} = {τ < ∞}, we get
\[
\mathbb{E}^x[\mathbb{1}_F f(X_{t+\tau_n})]
= \sum_{k=1}^\infty \mathbb{E}^x\big[\mathbb{1}_{F\cap\{\tau_n=k2^{-n}\}} f(X_{t+k2^{-n}})\big]
= \sum_{k=1}^\infty \mathbb{E}^x\big[\mathbb{1}_{F\cap\{\tau_n=k2^{-n}\}}\,\mathbb{E}^{X_{k2^{-n}}} f(X_t)\big]
= \mathbb{E}^x\big[\mathbb{1}_F\,\mathbb{E}^{X_{\tau_n}} f(X_t)\big].
\]
In the last calculation we use that F ∩ {τ_n = k2^{−n}} ∈ F_{k2^{−n}} for all F ∈ F_τ.

Once we know the generator of a Feller process, we can construct many important martingales with respect to the canonical filtration of the process.

Corollary 11.4. Let (X_t)_{t≥0} be a Feller process with generator (A, D(A)) and semigroup (P_t)_{t≥0}. For every f ∈ D(A) the process
\[
M_t^{[f]} := f(X_t) - \int_0^t A f(X_r)\,dr, \quad t\ge 0, \tag{11.2}
\]
is a martingale for the canonical filtration F_t^X := σ(X_s, s ≤ t) and any P^x, x ∈ ℝ^d.

Proof. Let s ≤ t, f ∈ D(A) and write, for short, F_s := F_s^X and M_t := M_t^{[f]}. By the Markov property
\[
\mathbb{E}^x[M_t - M_s \mid \mathscr{F}_s]
= \mathbb{E}^x\Big[f(X_t) - f(X_s) - \int_s^t A f(X_r)\,dr \,\Big|\, \mathscr{F}_s\Big]
= \mathbb{E}^{X_s} f(X_{t-s}) - f(X_s) - \int_0^{t-s}\mathbb{E}^{X_s} A f(X_u)\,du.
\]
On the other hand, we get from the semigroup identity (5.5)
\[
\int_0^{t-s}\mathbb{E}^{X_s} A f(X_u)\,du = \int_0^{t-s} P_u A f(X_s)\,du = P_{t-s} f(X_s) - f(X_s) = \mathbb{E}^{X_s} f(X_{t-s}) - f(X_s),
\]
which shows that E^x[M_t − M_s | F_s] = 0.
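For Brownian motion — a Feller process with generator Af = ½f″ — the martingale property of (11.2) can be verified by simulation (our own sketch, not from the text). With f = cos we have Af = −½cos, so E⁰M_t^{[f]} must stay at f(0) = 1 for all t:

```python
import numpy as np

rng = np.random.default_rng(5)

n_paths, n_steps, t = 20_000, 200, 1.0
dt = t / n_steps
incs = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(incs, axis=1)], axis=1)

f = np.cos
Af = lambda x: -0.5 * np.cos(x)   # generator of Brownian motion: A f = f''/2

# M_t = f(X_t) - int_0^t A f(X_r) dr ;  E^0[M_t] should equal f(0) = 1
M = f(W[:, -1]) - np.sum(Af(W[:, :-1]) * dt, axis=1)
print(M.mean())
```

A quick analytic cross-check: E cos W_t = e^{−t/2} and −∫₀ᵗ E(−½cos W_s)ds = 1 − e^{−t/2}, which indeed sum to 1.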

Our approach from Chapter 6 to prove the structure of a Lévy generator 'only' uses the positive maximum principle. Therefore, it can be adapted to Feller processes, provided that the domain D(A) is rich in the sense that C_c^∞(ℝ^d) ⊂ D(A). All we have to do is to take into account that Feller processes are no longer invariant under translations. The following theorem is due to Courrège [14] and von Waldenfels [61, 62].


Theorem 11.5 (von Waldenfels, Courrège). Let (A, D(A)) be the generator of a Feller process such that C_c^∞(ℝ^d) ⊂ D(A). Then A|_{C_c^∞(ℝ^d)} is a pseudo-differential operator
\[
A u(x) = -q(x,D)u(x) := -\int q(x,\xi)\,\widehat u(\xi)\,e^{ix\cdot\xi}\,d\xi \tag{11.3}
\]
whose symbol q: ℝ^d × ℝ^d → ℂ is a measurable function of the form
\[
q(x,\xi) = \underbrace{q(x,0)}_{\ge 0} - i\,l(x)\cdot\xi + \frac12\,\xi\cdot Q(x)\xi + \int_{y\ne 0}\big[1 - e^{iy\cdot\xi} + i\,y\cdot\xi\,\mathbb{1}_{(0,1)}(|y|)\big]\,\nu(x,dy) \tag{11.4}
\]
and (l(x), Q(x), ν(x,dy)) is a Lévy triplet¹ for every fixed x ∈ ℝ^d.

If we insert (11.4) into (11.3) and invert the Fourier transform, we obtain the following integro-differential representation of the Feller generator A:
\[
A f(x) = l(x)\cdot\nabla f(x) + \frac12\,\nabla\cdot Q(x)\nabla f(x) + \int_{y\ne 0}\big[f(x+y) - f(x) - \nabla f(x)\cdot y\,\mathbb{1}_{(0,1)}(|y|)\big]\,\nu(x,dy). \tag{11.5}
\]
This formula obviously extends to all functions f ∈ C_b²(ℝ^d). In particular, we may use the function f(x) = e_ξ(x) = e^{ix·ξ}, and get²
\[
e_{-\xi}(x)\,A e_\xi(x) = -q(x,\xi). \tag{11.6}
\]

Proof of Theorem 11.5 (sketch). For a worked-out version see [9, Chapter 2.3]. In the proof of Theorem 6.8 use, instead of A⁰ and A⁰⁰,
\[
A^0 f \rightsquigarrow A^x f := (A f)(x) \quad\text{and}\quad A^{00} f \rightsquigarrow A^{xx} f := A^x\big(|\cdot - x|^2 f\big)
\]
for every x ∈ ℝ^d. This is needed since P_t and A are no longer translation invariant, i.e. we cannot shift A⁰f to get A^x f. Then follow the steps 1–4 to get ν(dy) ⇝ ν(x,dy) and 6–9 for (l(x), Q(x)). Remark 6.9 shows that the term q(x,0) is non-negative.

The key observation is, as in the proof of Theorem 6.8, that we can use in steps 3 and 7 the positive maximum principle³ to make sure that A^x f is a distribution of order 2, i.e.
\[
|A^x f| = |L^x f + S^x f| \le C_K\,\|f\|_{(2)} \quad\text{for all } f\in C_c^\infty(K) \text{ and all compact sets } K\subset\mathbb{R}^d.
\]
Here L^x is the local part with support in {x}, accounting for (q(x,0), l(x), Q(x)), and S^x is the non-local part supported in ℝ^d \ {x}, giving ν(x,dy).

With some abstract functional analysis we can show some (local) boundedness properties of x ↦ A f(x) and (x,ξ) ↦ q(x,ξ).

¹Cf. Definition 6.10.
²This should be compared with Definition 6.4 and the subsequent comments.
³To be precise: its weakened form (PP), cf. page 41.


Corollary 11.6. In the situation of Theorem 11.5, the condition (PP) shows that
\[
\sup_{|x|\le r}|A f(x)| \le C_r\,\|f\|_{(2)} \quad\text{for all } f\in C_c^\infty(B_r(0)), \tag{11.7}
\]
and the positive maximum principle (PMP) gives
\[
\sup_{|x|\le r}|A f(x)| \le C_{r,A}\,\|f\|_{(2)} \quad\text{for all } f\in C_c^\infty(\mathbb{R}^d),\ r>0. \tag{11.8}
\]

Proof. In the above sketched analogue of the proof of Theorem 6.8 we have seen that the family of linear functionals
\[
\big\{C_c^\infty(B_r(0))\ni f\mapsto A^x f \,:\, x\in B_r(0)\big\}, \qquad A^x f := (A f)(x),
\]
satisfies
\[
|A^x f| \le c_{r,x}\,\|f\|_{(2)}, \quad f\in C_c^\infty(B_r(0)),
\]
i.e. A^x: (C_b²(B_r(0)), ‖·‖_{(2)}) → (ℝ, |·|) is bounded. By the Banach–Steinhaus theorem (uniform boundedness principle),
\[
\sup_{|x|\le r}|A^x f| \le C_r\,\|f\|_{(2)}.
\]
Since A also satisfies the positive maximum principle (PMP), we know from step 4 of the (suitably adapted) proof of Theorem 6.8 that
\[
\int_{|y|\ge 1}\nu(x,dy) \le A\varphi_0(x) \quad\text{for some } \varphi_0\in C_c(B_1(0)).
\]
Let r ≥ 1 and pick χ = χ_r ∈ C_c^∞(ℝ^d) such that 1_{B_{2r}(0)} ≤ χ ≤ 1_{B_{3r}(0)}. We get for |x| ≤ r
\[
A f(x) = A[\chi f](x) + A[(1-\chi)f](x)
= A[\chi f](x) + \int_{|y|\ge r}\Big[(1-\chi(x+y))f(x+y) - \underbrace{(1-\chi(x))f(x)}_{=0}\Big]\,\nu(x,dy),
\]
and so
\[
\sup_{|x|\le r}|A f(x)| \le C_r\,\|\chi f\|_{(2)} + \|f\|_\infty\,\|A\varphi_0\|_\infty \le C_{r,A}\,\|f\|_{(2)}.
\]

Corollary 11.7. In the situation of Theorem 11.5 there exists a locally bounded nonnegative function γ: ℝ^d → [0,∞) such that
\[
|q(x,\xi)| \le \gamma(x)\,(1+|\xi|^2), \quad x,\xi\in\mathbb{R}^d. \tag{11.9}
\]
Proof. Using (11.8) we can extend A by continuity to C_b²(ℝ^d) and, therefore,
\[
-q(x,\xi) = e_{-\xi}(x)\,A e_\xi(x), \qquad e_\xi(x) = e^{ix\cdot\xi},
\]
makes sense. Moreover, we have sup_{|x|≤r} |Ae_ξ(x)| ≤ C_{r,A}‖e_ξ‖_{(2)} for any r ≥ 1; since ‖e_ξ‖_{(2)} is a polynomial of order 2 in the variable ξ, the claim follows.


For a Lévy process we have ψ(0) = 0 since P(X_t ∈ ℝ^d) = 1 for all t ≥ 0, i.e. the process X_t does not explode in finite time. For Feller processes the situation is more complicated. We need the following technical lemmas.

Lemma 11.8. Let q(x,ξ) be the symbol of (the generator of) a Feller process as in Theorem 11.5 and let F ⊂ ℝ^d be a closed set. Then the following assertions are equivalent.

a) |q(x,ξ)| ≤ C(1+|ξ|²) for all x ∈ F and ξ ∈ ℝ^d, where C = 2 sup_{|ξ|≤1} sup_{x∈F} |q(x,ξ)|.

b) \(\displaystyle \sup_{x\in F} q(x,0) + \sup_{x\in F}|l(x)| + \sup_{x\in F}\|Q(x)\| + \sup_{x\in F}\int_{y\ne 0}\frac{|y|^2}{1+|y|^2}\,\nu(x,dy) < \infty.\)

If F = Rd , then the equivalent properties of Lemma 11.8 are often referred to as ‘the symbolhas bounded coefficients’.

Outline of the proof (see [57, Appendix] for a complete proof). The direction b)⇒a) is proved as Theorem 6.2. Observe that ξ ↦ q(x,ξ) is, for fixed x, the characteristic exponent of a Lévy process. The finiteness of the constant C follows from the assumption b) and the Lévy–Khintchine formula (11.4).

For the converse a)⇒b) we note that the integrand appearing in (11.4) can be estimated by c·|y|²/(1+|y|²), which is itself a Lévy exponent:
\[
\frac{|y|^2}{1+|y|^2} = \int\big[1-\cos(y\cdot\xi)\big]\,g(\xi)\,d\xi, \qquad
g(\xi) = \frac12\int_0^\infty (2\pi\lambda)^{-d/2}\,e^{-|\xi|^2/(2\lambda)}\,e^{-\lambda/2}\,d\lambda.
\]
Therefore, by Tonelli's theorem,
\[
\int_{y\ne 0}\frac{|y|^2}{1+|y|^2}\,\nu(x,dy)
= \iint_{y\ne 0}\big[1-\cos(y\cdot\xi)\big]\,\nu(x,dy)\,g(\xi)\,d\xi
= \int g(\xi)\Big(\operatorname{Re} q(x,\xi) - \frac12\,\xi\cdot Q(x)\xi - q(x,0)\Big)\,d\xi.
\]
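The identity for |y|²/(1+|y|²) can be verified by direct quadrature in dimension d = 1 (a numerical sketch of ours; the grids and truncations are ad hoc). We compute g by the λ-integral, using the substitution λ = u² to remove the 1/√λ singularity, and then evaluate the ξ-integral for a few values of y.

```python
import numpy as np

# d = 1: check  y^2/(1+y^2) = int (1 - cos(y*xi)) g(xi) dxi  by quadrature, with
# g(xi) = (1/2) int_0^oo (2 pi lam)^(-1/2) e^{-xi^2/(2 lam)} e^{-lam/2} dlam.
xi = np.linspace(-40.0, 40.0, 2001)
u = np.linspace(1e-9, 12.0, 2000)   # substitution lam = u^2, dlam = 2u du

# After substitution: g(xi) = (2 pi)^{-1/2} int_0^oo exp(-xi^2/(2u^2) - u^2/2) du
vals = np.exp(-xi[:, None] ** 2 / (2 * u[None, :] ** 2) - u[None, :] ** 2 / 2)
g = vals.sum(axis=1) * (u[1] - u[0]) / np.sqrt(2 * np.pi)

errs = []
for y in (0.5, 1.0, 3.0):
    lhs = y ** 2 / (1 + y ** 2)
    rhs = np.sum((1 - np.cos(y * xi)) * g) * (xi[1] - xi[0])
    errs.append(abs(lhs - rhs))
print(errs)   # all small
```

In d = 1 the λ-integral even has a closed form, g(ξ) = ½e^{−|ξ|}, which makes the identity an elementary exercise; the quadrature above mimics the argument in arbitrary dimension.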

Lemma 11.9. Let A be the generator of a Feller process, assume that C_c^∞(ℝ^d) ⊂ D(A), and denote by q(x,ξ) the symbol of A. For any cut-off function χ ∈ C_c^∞(ℝ^d) satisfying 1_{B₁(0)} ≤ χ ≤ 1_{B₂(0)} and χ_r(x) := χ(x/r) one has
\[
\big|q(x,D)(\chi_r e_\xi)(x)\big| \le 4\sup_{|\eta|\le 1}|q(x,\eta)| \int_{\mathbb{R}^d}\big[1 + r^{-2}|\rho|^2 + |\xi|^2\big]\,\big|\widehat\chi(\rho)\big|\,d\rho, \tag{11.10}
\]
\[
\lim_{r\to\infty} A(\chi_r e_\xi)(x) = -e_\xi(x)\,q(x,\xi) \quad\text{for all } x,\xi\in\mathbb{R}^d. \tag{11.11}
\]

Proof. Observe that \(\widehat{\chi_r e_\xi}(\eta) = r^d\,\widehat\chi(r(\eta-\xi))\) and
\[
\begin{aligned}
q(x,D)(\chi_r e_\xi)(x)
&= \int q(x,\eta)\,e^{ix\cdot\eta}\,\widehat{\chi_r e_\xi}(\eta)\,d\eta
= \int q(x,\eta)\,e^{ix\cdot\eta}\,r^d\,\widehat\chi(r(\eta-\xi))\,d\eta\\
&= \int q(x,\xi + r^{-1}\rho)\,e^{ix\cdot(\xi+\rho/r)}\,\widehat\chi(\rho)\,d\rho.
\end{aligned}
\tag{11.12}
\]


Therefore we can use the estimate (11.9) with the optimal constant γ(x) = 2 sup_{|η|≤1} |q(x,η)| and the elementary estimate (a+b)² ≤ 2(a²+b²) to obtain
\[
\big|q(x,D)(\chi_r e_\xi)(x)\big|
\le \int \big|q(x,\xi+r^{-1}\rho)\big|\,\big|\widehat\chi(\rho)\big|\,d\rho
\le 4\sup_{|\eta|\le 1}|q(x,\eta)| \int\big[1 + r^{-2}|\rho|^2 + |\xi|^2\big]\,\big|\widehat\chi(\rho)\big|\,d\rho.
\]
This proves (11.10); it also allows us to use dominated convergence in (11.12) to get (11.11). Just observe that χ̂ ∈ S(ℝ^d) and ∫ χ̂(ρ) dρ = χ(0) = 1.

Lemma 11.10. Let q(x,ξ) be the symbol of (the generator of) a Feller process. Then the following assertions are equivalent:

a) x ↦ q(x,ξ) is continuous for all ξ.

b) x ↦ q(x,0) is continuous.

c) Tightness: lim_{r→∞} sup_{x∈K} ν(x, ℝ^d \ B_r(0)) = 0 for all compact sets K ⊂ ℝ^d.

d) Uniform continuity at the origin: lim_{|ξ|→0} sup_{x∈K} |q(x,ξ) − q(x,0)| = 0 for all compact sets K ⊂ ℝ^d.

Proof. Let χ: [0,∞) → [0,1] be a decreasing C^∞-function satisfying 1_{[0,1)} ≤ χ ≤ 1_{[0,4)}. The functions χ_n(x) := χ(|x|²/n²), x ∈ ℝ^d and n ∈ ℕ, are radially symmetric, smooth functions with 1_{B_n(0)} ≤ χ_n ≤ 1_{B_{2n}(0)}. Fix any compact set K ⊂ ℝ^d and some sufficiently large n₀ such that K ⊂ B_{n₀}(0).

a)⇒b) is obvious.

b)⇒c) For m ≥ n ≥ 2n₀ the positive maximum principle implies
\[
\mathbb{1}_K(x)\,A(\chi_n - \chi_m)(x) \ge 0.
\]
Therefore, −1_K(x) q(x,0) = lim_{n→∞} 1_K(x) Aχ_{n+n₀}(x) is a decreasing limit of continuous functions. Since the limit function q(x,0) is continuous, Dini's theorem implies that the limit is uniform on the set K. From the integro-differential representation (11.5) of the generator we get
\[
\mathbb{1}_K(x)\,|A\chi_m(x) - A\chi_n(x)| = \mathbb{1}_K(x)\int_{n-n_0\le|y|\le 2m+n_0}\big(\chi_m(x+y) - \chi_n(x+y)\big)\,\nu(x,dy).
\]
Letting m → ∞ yields
\[
\begin{aligned}
\mathbb{1}_K(x)\,|q(x,0) + A\chi_n(x)|
&= \mathbb{1}_K(x)\int_{|y|\ge n-n_0}\big(1 - \chi_n(x+y)\big)\,\nu(x,dy)\\
&\ge \mathbb{1}_K(x)\int_{|y|\ge n-n_0}\big(1 - \mathbb{1}_{B_{2n+n_0}(0)}(y)\big)\,\nu(x,dy)
\ge \mathbb{1}_K(x)\,\nu\big(x, B^c_{2n+n_0}(0)\big),
\end{aligned}
\]


where we use that K ⊂ B_{n₀}(0) and
\[
\chi_n(x+y) \le \mathbb{1}_{B_{2n}(0)}(x+y) = \mathbb{1}_{B_{2n}(0)-x}(y) \le \mathbb{1}_{B_{2n+n_0}(0)}(y).
\]
Since the left-hand side converges uniformly to 0 as n → ∞, c) follows.

c)⇒d) Since the function x ↦ q(x,ξ) is locally bounded, we conclude from Lemma 11.8 that sup_{x∈K}|l(x)| + sup_{x∈K}‖Q(x)‖ < ∞. Thus lim_{|ξ|→0} sup_{x∈K}(|l(x)·ξ| + |ξ·Q(x)ξ|) = 0, and we may safely assume that l ≡ 0 and Q ≡ 0. If |ξ| ≤ 1 we find, using (11.4) and Taylor's formula for the integrand,
\[
\begin{aligned}
|q(x,\xi) - q(x,0)|
&= \bigg|\int_{y\ne 0}\big[1 - e^{iy\cdot\xi} + i\,y\cdot\xi\,\mathbb{1}_{(0,1)}(|y|)\big]\,\nu(x,dy)\bigg|\\
&\le \int_{0<|y|^2<1}\frac12|y|^2|\xi|^2\,\nu(x,dy) + \int_{1\le|y|^2<1/|\xi|}|y|\,|\xi|\,\nu(x,dy) + \int_{|y|^2\ge 1/|\xi|}2\,\nu(x,dy)\\
&\le \int_{0<|y|^2<1/|\xi|}\frac{|y|^2}{1+|y|^2}\,\nu(x,dy)\,\big(1+|\xi|^{-1/2}\big)|\xi| + 2\,\nu\big(x,\{y:|y|^2\ge 1/|\xi|\}\big).
\end{aligned}
\]
Since this estimate is uniform for x ∈ K, we get d) as |ξ| → 0.

d)⇒c) As before, we may assume that l ≡ 0 and Q ≡ 0. For every r > 0
\[
\begin{aligned}
\frac12\,\nu(x,B_r^c(0))
&\le \int_{|y|\ge r}\frac{|y/r|^2}{1+|y/r|^2}\,\nu(x,dy)
= \int_{|y|\ge r}\int_{\mathbb{R}^d}\Big(1-\cos\frac{\eta\cdot y}{r}\Big)\,g(\eta)\,d\eta\,\nu(x,dy)\\
&\le \int_{\mathbb{R}^d}\Big[\operatorname{Re} q\Big(x,\frac{\eta}{r}\Big) - q(x,0)\Big]\,g(\eta)\,d\eta,
\end{aligned}
\]
where g(η) is as in the proof of Lemma 11.8. Since ∫(1+|η|²) g(η) dη < ∞, we can use (11.9) and find
\[
\nu(x,B_r^c(0)) \le c_g \sup_{|\eta|\le 1/r}\big|\operatorname{Re} q(x,\eta) - q(x,0)\big|.
\]
Taking the supremum over all x ∈ K and letting r → ∞ proves c).

c)⇒a) From Lemma 11.9 we know that lim_{n→∞} e_{−ξ}(x) A[χ_n e_ξ](x) = −q(x,ξ). Let us show that this convergence is uniform for x ∈ K. Let m ≥ n ≥ 2n₀. For x ∈ K
\[
\begin{aligned}
\big|e_{-\xi}(x)A[e_\xi\chi_n](x) - e_{-\xi}(x)A[e_\xi\chi_m](x)\big|
&= \bigg|\int_{y\ne 0}\big[e_\xi(y)\chi_n(x+y) - e_\xi(y)\chi_m(x+y)\big]\,\nu(x,dy)\bigg|\\
&\le \int_{y\ne 0}\big[\chi_m(x+y) - \chi_n(x+y)\big]\,\nu(x,dy)\\
&\le \int_{n-n_0\le|y|\le 2m+n_0}\big[\chi_m(x+y) - \chi_n(x+y)\big]\,\nu(x,dy)
\le \nu\big(x,B_{n-n_0}^c(0)\big).
\end{aligned}
\]


In the penultimate step we use that, because of the definition of the functions χ_n,
\[
\operatorname{supp}\big(\chi_m(x+\cdot) - \chi_n(x+\cdot)\big) \subset B_{2m}(x)\setminus B_n(x) \subset B_{2m+n_0}(0)\setminus B_{n-n_0}(0)
\]
for all x ∈ K ⊂ B_{n₀}(0). The right-hand side tends to 0 uniformly for x ∈ K as n (and hence m) → ∞.

Remark 11.11. The argument used in the first three lines of the step b)⇒c) in the proof of Lemma 11.10 shows, incidentally, that x ↦ q(x,ξ) is always upper semicontinuous, since it is (locally) a decreasing limit of continuous functions.

Remark 11.12. Let A be the generator of a Feller process, and assume that C_c^∞(ℝ^d) ⊂ D(A); although A maps D(A) into C_∞(ℝ^d), this is not enough to guarantee that the symbol q(x,ξ) is continuous in the variable x. On the other hand, if the Feller process X has only bounded jumps, i.e. if the support of the Lévy measure ν(x,·) is uniformly bounded, then q(·,ξ) is continuous. This is, in particular, true for diffusions.

This follows immediately from Lemma 11.10.c), which holds if ν(x, B_r^c(0)) = 0 for some r > 0 and all x ∈ ℝ^d.

We can also give a direct argument: pick χ ∈ C_c^∞(ℝ^d) satisfying 1_{B_{3r}(0)} ≤ χ ≤ 1. From the representation (11.5) it is not hard to see that
\[
A f(x) = A[\chi f](x) \quad\text{for all } f\in C_b^2(\mathbb{R}^d) \text{ and } x\in B_r(0);
\]
in particular, A[χf] is continuous. If we take f(x) := e_ξ(x), then, by (11.6), −q(x,ξ) = e_{−ξ}(x)Ae_ξ(x) = e_{−ξ}(x)A[χe_ξ](x), which proves that x ↦ q(x,ξ) is continuous on every ball B_r(0), hence everywhere.

We can now discuss the role of q(x,ξ ) for the conservativeness of a Feller process.

Theorem 11.13. Let (X_t)_{t≥0} be a Feller process with infinitesimal generator (A,D(A)) such that C_c^∞(ℝ^d) ⊂ D(A), with symbol q(x,ξ) and semigroup (P_t)_{t≥0}.

a) If x ↦ q(x,ξ) is continuous for all ξ ∈ ℝ^d and P_t 1 = 1, then q(x,0) = 0.

b) If q(x,ξ) has bounded coefficients and q(x,0) = 0, then x ↦ q(x,ξ) is continuous for all ξ ∈ ℝ^d and P_t 1 = 1.

Proof. Let χ and χ_r, r ∈ ℕ, be as in Lemma 11.9. Then e_ξ χ_r ∈ D(A) and, by Corollary 11.4,
\[
M_t := e_\xi\chi_r(X_t) - e_\xi\chi_r(x) - \int_{[0,t)} A(e_\xi\chi_r)(X_s)\,ds, \quad t\ge 0,
\]
is a martingale. Using optional stopping for the stopping time
\[
\tau := \tau_R^x := \inf\{s>0 : |X_s - x| \ge R\}, \quad x\in\mathbb{R}^d,\ R>0,
\]


the stopped process (M_{t∧τ})_{t≥0} is still a martingale. Since E^x M_{t∧τ} = 0, we get
\[
\mathbb{E}^x(\chi_r e_\xi)(X_{t\wedge\tau}) - \chi_r e_\xi(x) = \mathbb{E}^x\int_{[0,t\wedge\tau)} A(\chi_r e_\xi)(X_s)\,ds.
\]
Note that the integrand is evaluated only for times s < t∧τ, where |X_s| ≤ R + |x|. Since A(e_ξ χ_r)(x) is locally bounded, we can use dominated convergence and Lemma 11.9, and we find, as r → ∞,
\[
\mathbb{E}^x e_\xi(X_{t\wedge\tau}) - e_\xi(x) = -\mathbb{E}^x\int_{[0,t\wedge\tau)} e_\xi(X_s)\,q(X_s,\xi)\,ds.
\]

a) Set ξ = 0 and observe that P_t1 = 1 implies that τ = τ_R^x → ∞ a.s. as R → ∞. Therefore,
\[
\mathbb{P}^x(X_{t\wedge\tau}\in\mathbb{R}^d) - 1 = -\mathbb{E}^x\int_{[0,\tau\wedge t)} q(X_s,0)\,ds,
\]
and with Fatou's lemma we can let R → ∞ to get
\[
0 = \liminf_{R\to\infty}\mathbb{E}^x\int_{[0,\tau\wedge t)} q(X_s,0)\,ds
\ge \mathbb{E}^x\Big[\liminf_{R\to\infty}\int_{[0,\tau\wedge t)} q(X_s,0)\,ds\Big]
= \mathbb{E}^x\int_0^t q(X_s,0)\,ds.
\]
Since x ↦ q(x,0) is continuous and q(x,0) is non-negative, we conclude with Tonelli's theorem that
\[
q(x,0) = \lim_{t\to 0}\frac1t\int_0^t \mathbb{E}^x q(X_s,0)\,ds = 0.
\]

b) Set ξ = 0 and observe that the boundedness of the coefficients implies that τ = τ_R^x → ∞ as R → ∞, so that
\[
\mathbb{P}^x(X_t\in\mathbb{R}^d) - 1 = -\mathbb{E}^x\int_{[0,t)} q(X_s,0)\,ds.
\]
Since the right-hand side is 0, we get P_t 1 = P^x(X_t ∈ ℝ^d) = 1.

Remark 11.14. The boundedness of the coefficients in Theorem 11.13.b) is important. If the coefficients of q(x,ξ) grow too rapidly, we may observe explosion in finite time even if q(x,0) = 0. A typical example in dimension 3 is the diffusion process given by the generator
\[
L f(x) = \frac12\,a(x)\,\Delta f(x),
\]
where a(x) is continuous and rotationally symmetric, a(x) = α(|x|), for a suitable function α(r) satisfying ∫₁^∞ dr/α(√r) < ∞, see Stroock & Varadhan [60, p. 260, 10.3.3]; the corresponding symbol is q(x,ξ) = ½ a(x)|ξ|². This process explodes in finite time. Since this is essentially a time-changed Brownian motion (see Böttcher, Schilling & Wang [9, Chapter 4.1]), this example works only if Brownian motion is transient, i.e. in dimensions d = 3 and higher. A sufficient criterion for conservativeness in terms of the symbol is
\[
\liminf_{r\to\infty}\ \sup_{|y-x|\le 2r}\ \sup_{|\eta|\le 1/r}\ |q(y,\eta)| < \infty \quad\text{for all } x\in\mathbb{R}^d,
\]
see [9, Theorem 2.34].

12. Symbols and semimartingales

So far, we have been treating the symbol q(x,ξ) of (the generator of) a Feller process X as an analytic object. On the other hand, Theorem 11.13 indicates that there should be some probabilistic consequences. In this chapter we want to follow this lead, show a probabilistic method to calculate the symbol, and link it to the semimartingale characteristics of a Feller process. The blueprint for this is the relation of the Lévy–Itô decomposition (which is the semimartingale decomposition of a Lévy process, cf. Theorem 9.12) with the Lévy–Khintchine formula for the characteristic exponent (which coincides with the symbol of a Lévy process, cf. Corollary 9.13).

For a Lévy process X_t with semigroup P_t f(x) = E^x f(X_t) = E f(X_t + x) the symbol can be calculated in the following way:
\[
\lim_{t\to 0}\frac{e_{-\xi}(x)P_t e_\xi(x) - 1}{t}
= \lim_{t\to 0}\frac{\mathbb{E}^x e^{i\xi\cdot(X_t-x)} - 1}{t}
= \lim_{t\to 0}\frac{e^{-t\psi(\xi)} - 1}{t}
= -\psi(\xi). \tag{12.1}
\]

For a Feller process a similar formula is true.

Theorem 12.1. Let X = (X_t)_{t≥0} be a Feller process with transition semigroup (P_t)_{t≥0} and generator (A,D(A)) such that C_c^∞(ℝ^d) ⊂ D(A). If x ↦ q(x,ξ) is continuous and q has bounded coefficients (Lemma 11.8 with F = ℝ^d), then
\[
-q(x,\xi) = \lim_{t\to 0}\frac{\mathbb{E}^x e^{i\xi\cdot(X_t-x)} - 1}{t}. \tag{12.2}
\]

Proof. Pick χ ∈ C_c^∞(ℝ^d), 1_{B₁(0)} ≤ χ ≤ 1_{B₂(0)}, and set χ_n(x) := χ(x/n). Obviously χ_n → 1 as n → ∞. By Lemma 5.4,
\[
\begin{aligned}
P_t[\chi_n e_\xi](x) - \chi_n(x)e_\xi(x)
&= \int_0^t A P_s[\chi_n e_\xi](x)\,ds
= \mathbb{E}^x\int_0^t A[\chi_n e_\xi](X_s)\,ds\\
&= -\int_0^t\int_{\mathbb{R}^d}\mathbb{E}^x\Big(e_\eta(X_s)\,q(X_s,\eta)\,\underbrace{\widehat{\chi_n e_\xi}(\eta)}_{=\,n^d\widehat\chi(n(\eta-\xi))}\Big)\,d\eta\,ds\\
&\xrightarrow[n\to\infty]{} -\int_0^t P_s\big[e_\xi\,q(\cdot,\xi)\big](x)\,ds.
\end{aligned}
\]
Since x ↦ q(x,ξ) is continuous, we can divide by t and let t → 0; this yields
\[
\lim_{t\to 0}\frac1t\big(P_t e_\xi(x) - e_\xi(x)\big) = -e_\xi(x)\,q(x,\xi).
\]


Theorem 12.1 is a relatively simple probabilistic formula to calculate the symbol. We want to relax the boundedness and continuity assumptions. Here Dynkin's characteristic operator becomes useful.

Lemma 12.2 (Dynkin's formula). Let (X_t)_{t≥0} be a Feller process with semigroup (P_t)_{t≥0} and generator (A,D(A)). For every stopping time σ with E^x σ < ∞ one has
\[
\mathbb{E}^x f(X_\sigma) - f(x) = \mathbb{E}^x\int_{[0,\sigma)} A f(X_s)\,ds, \quad f\in D(A). \tag{12.3}
\]
Proof. From Corollary 11.4 we know that M_t^{[f]} := f(X_t) − f(X_0) − ∫₀ᵗ Af(X_s) ds is a martingale; thus (12.3) follows from the optional stopping theorem.
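Dynkin's formula has an exact discrete counterpart which is easy to simulate (our own sketch; the random walk is only a discrete analogue, not a Feller process from the text). For the simple symmetric random walk the discrete generator is Af(x) = ½(f(x+1)+f(x−1)) − f(x); with f(x) = x² this gives Af ≡ 1, so the analogue of (12.3) with σ = first exit time from (−a,a) reads Ef(X_σ) − f(0) = Eσ, i.e. Eσ = a².

```python
import numpy as np

rng = np.random.default_rng(13)

a, n_paths = 5, 5000
taus = np.empty(n_paths)
for i in range(n_paths):
    x, steps = 0, 0
    while abs(x) < a:                 # sigma = first exit time from (-a, a)
        x += rng.choice((-1, 1))
        steps += 1
    taus[i] = steps

print(taus.mean())   # should be close to a^2 = 25 by the discrete Dynkin formula
```

Note that f(X_σ) = a² holds pathwise here, so the identity is a pure statement about Eσ — the same mechanism that drives the proof of Lemma 12.4 below.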

Definition 12.3. Let (X_t)_{t≥0} be a Feller process. A point a ∈ ℝ^d is an absorbing point if
\[
\mathbb{P}^a(X_t = a\ \forall t\ge 0) = 1.
\]
Denote by τ_r := inf{t > 0 : |X_t − X_0| ≥ r} the first exit time from the ball B_r(x) centered at the starting position x = X₀.

Lemma 12.4. Let (X_t)_{t≥0} be a Feller process and assume that b ∈ ℝ^d is not absorbing. Then there exists some r > 0 such that E^b τ_r < ∞.

Proof. 1 If b is not absorbing, then there is some f ∈ D(A) such that Af(b) ≠ 0. Assume the contrary, i.e. Af(b) = 0 for all f ∈ D(A). By Lemma 5.4, P_s f ∈ D(A) for all s ≥ 0, and
\[
P_t f(b) - f(b) = \int_0^t A(P_s f)(b)\,ds = 0.
\]
So P_t f(b) = f(b) for all f ∈ D(A). Since the domain D(A) is dense in C_∞(ℝ^d) (Remark 5.5), we get P_t f(b) = f(b) for all f ∈ C_∞(ℝ^d), hence P^b(X_t = b) = 1 for any t ≥ 0. Therefore,
\[
\mathbb{P}^b(X_q = b\ \forall q\in\mathbb{Q},\,q\ge 0) = 1 \quad\text{and}\quad \mathbb{P}^b(X_t = b\ \forall t\ge 0) = 1,
\]
because of the right-continuity of the sample paths. This means that b is an absorbing point, contradicting our assumption.

2 Pick f ∈ D(A) such that Af(b) > 0. Since Af is continuous, there exist ε > 0 and r > 0 such that Af|_{B_r(b)} ≥ ε > 0. From Dynkin's formula (12.3) with σ = τ_r ∧ n, n ≥ 1, we deduce
\[
\varepsilon\,\mathbb{E}^b(\tau_r\wedge n) \le \mathbb{E}^b\int_{[0,\tau_r\wedge n)} A f(X_s)\,ds = \mathbb{E}^b f(X_{\tau_r\wedge n}) - f(b) \le 2\|f\|_\infty.
\]
Finally, monotone convergence shows that E^b τ_r ≤ 2‖f‖_∞/ε < ∞.


Definition 12.5 (Dynkin's operator). Let X be a Feller process. The linear operator (𝔄, D(𝔄)) defined by
\[
\mathfrak{A} f(x) :=
\begin{cases}
\displaystyle\lim_{r\to 0}\frac{\mathbb{E}^x f(X_{\tau_r}) - f(x)}{\mathbb{E}^x\tau_r}, & \text{if $x$ is not absorbing,}\\[2ex]
0, & \text{otherwise,}
\end{cases}
\]
\[
D(\mathfrak{A}) := \big\{f\in C_\infty(\mathbb{R}^d) : \text{the above limit exists pointwise}\big\},
\]
is called Dynkin's (characteristic) operator.

Lemma 12.6. Let (X_t)_{t≥0} be a Feller process with generator (A,D(A)) and characteristic operator (𝔄,D(𝔄)).

a) 𝔄 is an extension of A, i.e. D(A) ⊂ D(𝔄) and 𝔄|_{D(A)} = A.

b) (𝔄,𝔇) = (A,D(A)) if 𝔇 = {f ∈ D(𝔄) : f, 𝔄f ∈ C_∞(ℝ^d)}.

Proof. a) Let f ∈ D(A) and assume that x ∈ ℝ^d is not absorbing. By Lemma 12.4 there is some r = r(x) > 0 with E^x τ_r < ∞. Since we have Af ∈ C_∞(ℝ^d), there exists for every ε > 0 some δ > 0 such that
\[
|A f(y) - A f(x)| < \varepsilon \quad\text{for all } y\in B_\delta(x).
\]
Without loss of generality let δ < r. Using Dynkin's formula (12.3) with σ = τ_δ, we see
\[
\big|\mathbb{E}^x f(X_{\tau_\delta}) - f(x) - A f(x)\,\mathbb{E}^x\tau_\delta\big|
\le \mathbb{E}^x\int_{[0,\tau_\delta)} \underbrace{|A f(X_s) - A f(x)|}_{<\varepsilon}\,ds
\le \varepsilon\,\mathbb{E}^x\tau_\delta.
\]
Thus, lim_{r→0}(E^x f(X_{τ_r}) − f(x))/E^x τ_r = Af(x).

If x is absorbing and f ∈ D(A), then Af(x) = 0, and so 𝔄f(x) = Af(x).

b) Since (𝔄,𝔇) satisfies the (PMP), the claim follows from Lemma 5.11.

Theorem 12.7. Let (X_t)_{t≥0} be a Feller process with infinitesimal generator (A,D(A)) such that C_c^∞(ℝ^d) ⊂ D(A) and x ↦ q(x,ξ) is continuous¹. Then
\[
-q(x,\xi) = \lim_{r\to 0}\frac{\mathbb{E}^x e^{i(X_{\tau_r}-x)\cdot\xi} - 1}{\mathbb{E}^x\tau_r} \tag{12.4}
\]
for all x ∈ ℝ^d (as usual, 1/∞ := 0). In particular, q(a,ξ) = 0 for all absorbing states a ∈ ℝ^d.

Proof. Let χ_n ∈ C_c^∞(ℝ^d) be such that 1_{B_n(0)} ≤ χ_n ≤ 1. By Dynkin's formula (12.3)
\[
e_{-\xi}(x)\,\mathbb{E}^x\big[\chi_n(X_{\tau_r\wedge t})\,e_\xi(X_{\tau_r\wedge t})\big] - \chi_n(x)
= \mathbb{E}^x\int_{[0,\tau_r\wedge t)} e_{-\xi}(x)\,A[\chi_n e_\xi](X_s)\,ds.
\]
¹For instance, if X has bounded jumps, see Lemma 11.10 and Remark 11.12. Our proof will show that it is actually enough to assume that s ↦ q(X_s,ξ) is right-continuous.


Observe that A[χ_n e_ξ](X_s) is bounded if s < τ_r, see Corollary 11.6. Using the dominated convergence theorem, we can let n → ∞ to get
\[
e_{-\xi}(x)\,\mathbb{E}^x e_\xi(X_{\tau_r\wedge t}) - 1
= \mathbb{E}^x\int_{[0,\tau_r\wedge t)} e_{-\xi}(x)\,A e_\xi(X_s)\,ds
\overset{(11.6)}{=} -\mathbb{E}^x\int_{[0,\tau_r\wedge t)} e_\xi(X_s - x)\,q(X_s,\xi)\,ds. \tag{12.5}
\]
If x is absorbing, we have q(x,ξ) = 0, and (12.4) holds trivially. For non-absorbing x, we pass to the limit t → ∞ and get, using E^x τ_r < ∞ (see Lemma 12.4),
\[
\frac{e_{-\xi}(x)\,\mathbb{E}^x e_\xi(X_{\tau_r}) - 1}{\mathbb{E}^x\tau_r}
= -\frac{1}{\mathbb{E}^x\tau_r}\,\mathbb{E}^x\int_{[0,\tau_r)} e_\xi(X_s - x)\,q(X_s,\xi)\,ds.
\]
Since s ↦ q(X_s,ξ) is right-continuous at s = 0, the limit r → 0 exists, and (12.4) follows.

A small variation of the above proof yields

Corollary 12.8 (Schilling, Schnurr [57]). Let X be a Feller process with generator (A,D(A)) such that C_c^∞(ℝ^d) ⊂ D(A) and x ↦ q(x,ξ) is continuous. Then
\[
-q(x,\xi) = \lim_{t\to 0}\frac{\mathbb{E}^x e^{i(X_{t\wedge\tau_r}-x)\cdot\xi} - 1}{t} \tag{12.6}
\]
for all x ∈ ℝ^d and r > 0.

Proof. We follow the proof of Theorem 12.7 up to (12.5). This relation can be rewritten as
\[
\frac{\mathbb{E}^x e^{i(X_{t\wedge\tau_r}-x)\cdot\xi} - 1}{t}
= -\frac1t\,\mathbb{E}^x\int_0^t e_\xi(X_s - x)\,q(X_s,\xi)\,\mathbb{1}_{[0,\tau_r)}(s)\,ds.
\]
Observe that X_s is bounded if s < τ_r and that s ↦ q(X_s,ξ) is right-continuous. Therefore, the limit t → 0 exists and yields (12.6).

Every Feller process (X_t)_{t≥0} such that C_c^∞(ℝ^d) ⊂ D(A) is a semimartingale. Moreover, the semimartingale characteristics can be expressed in terms of the Lévy triplet (l(x),Q(x),ν(x,dy)) of the symbol. Recall that a (d-dimensional) semimartingale is a stochastic process of the form
\[
X_t = X_0 + X_t^c + \int_0^t\!\!\int y\,\mathbb{1}_{(0,1)}(|y|)\,\big[\mu^X(\cdot,ds,dy) - \nu(\cdot,ds,dy)\big] + \sum_{s\le t}\mathbb{1}_{[1,\infty)}(|\Delta X_s|)\,\Delta X_s + B_t,
\]
where X^c is the continuous martingale part, B is a previsible process with paths of finite variation (on compact time intervals), and the jump measure
\[
\mu^X(\omega,ds,dy) = \sum_{s:\,\Delta X_s(\omega)\ne 0}\delta_{(s,\Delta X_s(\omega))}(ds,dy)
\]
has compensator ν(ω,ds,dy). The triplet (B,C,ν) with the (predictable) quadratic variation C = [X^c,X^c] of X^c is called the semimartingale characteristics.


Theorem 12.9 (Schilling [53], Schnurr [59]). Let (X_t)_{t≥0} be a Feller process with infinitesimal generator (A,D(A)) such that C_c^∞(ℝ^d) ⊂ D(A) and symbol q(x,ξ) given by (11.4). If q(x,0) = 0,² then X is a semimartingale whose semimartingale characteristics can be expressed by the Lévy triplet (l(x),Q(x),ν(x,dy)):
\[
B_t = \int_0^t l(X_s)\,ds, \qquad C_t = \int_0^t Q(X_s)\,ds, \qquad \nu(\cdot,ds,dy) = \nu(X_s(\cdot),dy)\,ds.
\]

Proof. 1 Combining Corollary 11.4 with the integro-differential representation (11.5) of the generator shows that
\[
\begin{aligned}
M_t^{[f]} &= f(X_t) - \int_{[0,t)} A f(X_s)\,ds\\
&= f(X_t) - \int_{[0,t)} l(X_s)\cdot\nabla f(X_s)\,ds - \frac12\int_{[0,t)}\nabla\cdot Q(X_s)\nabla f(X_s)\,ds\\
&\quad - \int_{[0,t)}\int_{y\ne 0}\big[f(X_s+y) - f(X_s) - \nabla f(X_s)\cdot y\,\mathbb{1}_{(0,1)}(|y|)\big]\,\nu(X_s,dy)\,ds
\end{aligned}
\]
is a martingale for all f ∈ C_b²(ℝ^d) ∩ D(A).

2 We claim that C_c²(ℝ^d) ⊂ D(A). Indeed, let f ∈ C_c²(ℝ^d) with supp f ⊂ B_r(0) for some r > 0 and pick a sequence f_n ∈ C_c^∞(B_{2r}(0)) such that lim_{n→∞}‖f − f_n‖_{(2)} = 0. Using (11.7) we get
\[
\sup_{|x|\le 3r}|A f_n(x) - A f_m(x)| \le c_{3r}\,\|f_n - f_m\|_{(2)} \xrightarrow[m,n\to\infty]{} 0.
\]
Since supp f_n ⊂ B_{2r}(0) and f_n → f uniformly, there is some u ∈ C_c^∞(B_{3r}(0)) with |f_n(x)| ≤ u(x). Therefore, we get for |x| ≥ 2r
\[
|A f_n(x) - A f_m(x)| \le \int_{y\ne 0}|f_n(x+y) - f_m(x+y)|\,\nu(x,dy)
\le 2\int_{y\ne 0} u(x+y)\,\nu(x,dy) = 2 A u(x) \xrightarrow[|x|\to\infty]{} 0
\]
uniformly for all m,n. This shows that (Af_n)_{n∈ℕ} is a Cauchy sequence in C_∞(ℝ^d). By the closedness of (A,D(A)) we gather that f ∈ D(A) and Af = lim_{n→∞} Af_n.

3 Fix r ≥ 1, pick χ_r ∈ C_c^∞(ℝ^d) such that 1_{B_{3r}(0)} ≤ χ_r ≤ 1, and set
\[
\sigma = \sigma_r := \inf\{t>0 : |X_t - X_0| \ge r\} \wedge \inf\{t>0 : |\Delta X_t| \ge r\}.
\]
Since (X_t)_{t≥0} has càdlàg paths and infinite life-time, (σ_r)_{r>0} is a family of stopping times with σ_r ↑ ∞.

²A sufficient condition is, for example, that X has infinite life-time and x ↦ q(x,ξ) is continuous (either for all ξ or just for ξ = 0), cf. Theorem 11.13 and Lemma 11.10.


For any $f \in C^2(\mathbb{R}^d) \cap C_b(\mathbb{R}^d)$ and $x \in \mathbb{R}^d$ we set $f_r := \chi_r f$, $f^x := f(\cdot - x)$, $f_r^x := f_r(\cdot - x)$, and consider
$$\begin{aligned}
M_{t\wedge\sigma}^{[f^x]} &= f^x(X_{t\wedge\sigma}) - \int_{[0,t\wedge\sigma)} l(X_s)\cdot\nabla f^x(X_s)\,ds - \frac 12 \int_{[0,t\wedge\sigma)} \nabla\cdot Q(X_s)\nabla f^x(X_s)\,ds \\
&\quad - \int_{[0,t\wedge\sigma)} \int_{y\neq 0} \bigl[f^x(X_s + y) - f^x(X_s) - \nabla f^x(X_s)\cdot y\,\mathbb{1}_{(0,1)}(|y|)\bigr]\,\nu(X_s,dy)\,ds \\
&= f_r^x(X_{t\wedge\sigma}) - \int_{[0,t\wedge\sigma)} l(X_s)\cdot\nabla f_r^x(X_s)\,ds - \frac 12 \int_{[0,t\wedge\sigma)} \nabla\cdot Q(X_s)\nabla f_r^x(X_s)\,ds \\
&\quad - \int_{[0,t\wedge\sigma)} \int_{0<|y|<r} \bigl[f_r^x(X_s + y) - f_r^x(X_s) - \nabla f_r^x(X_s)\cdot y\,\mathbb{1}_{(0,1)}(|y|)\bigr]\,\nu(X_s,dy)\,ds \\
&= M_{t\wedge\sigma}^{[f_r^x]}.
\end{aligned}$$
Since $f_r \in C_c^2(\mathbb{R}^d) \subset D(A)$, we see that $M_t^{[f^x]}$ is a local martingale (with reducing sequence $\sigma_r$, $r > 1$), and by a general result of Jacod & Shiryaev [27, Theorem II.2.42] it follows that $X$ is a semimartingale with the characteristics mentioned in the theorem.

We close this chapter with the discussion of a very important example: Lévy-driven stochastic differential equations. From now on we assume that
$$\Phi : \mathbb{R}^d \to \mathbb{R}^{d\times n} \ \text{is a matrix-valued, measurable function,}$$
$$L = (L_t)_{t\geq 0} \ \text{is an } n\text{-dimensional Lévy process with exponent } \psi(\xi),$$
and consider the following Itô stochastic differential equation (SDE)
$$dX_t = \Phi(X_{t-})\,dL_t, \quad X_0 = x. \tag{12.7}$$
If $\Phi$ is globally Lipschitz continuous, then the SDE (12.7) has a unique solution which is a strong Markov process, see Protter [43, Chapter V, Theorem 32]³. If we write $X_t^x$ for the solution of (12.7) with initial condition $X_0 = x = X_0^x$, then the flow $x \mapsto X_t^x$ is continuous [43, Chapter V, Theorem 38].

If we use $L_t = (t, W_t, J_t)^\top$ as driving Lévy process, where $W$ is a Brownian motion and $J$ is a pure-jump Lévy process (we assume⁴ that $W \perp\!\!\!\perp J$), and if $\Phi$ is a block matrix, then we see that (12.7) also covers SDEs of the form
$$dX_t = f(X_{t-})\,dt + F(X_{t-})\,dW_t + G(X_{t-})\,dJ_t.$$
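As a numerical illustration of this form of (12.7), here is a hedged sketch of an Euler scheme for the scalar case $d = n = 1$, with the driver split into drift, Brownian and compound Poisson parts; the coefficients $f$, $F$, $G$ below are arbitrary example choices, not taken from the text.

```python
import math
import random

random.seed(7)

def euler_levy_sde(x0, f, F, G, T=1.0, n=1000, jump_rate=1.0):
    """Euler scheme for dX_t = f(X_{t-})dt + F(X_{t-})dW_t + G(X_{t-})dJ_t,
    where J is a compound Poisson process with standard normal jump sizes.
    All coefficients are user-supplied scalar functions (d = n = 1)."""
    dt = T / n
    x, path = x0, [x0]
    for _ in range(n):
        dW = random.gauss(0.0, math.sqrt(dt))
        # number of jumps of J in (t, t + dt]; dt is small, so usually 0 or 1
        n_jumps, s = 0, random.expovariate(jump_rate)
        while s <= dt:
            n_jumps += 1
            s += random.expovariate(jump_rate)
        dJ = sum(random.gauss(0.0, 1.0) for _ in range(n_jumps))
        x = x + f(x) * dt + F(x) * dW + G(x) * dJ   # coefficients frozen at X_{t-}
        path.append(x)
    return path

# Example with a bounded, Lipschitz coefficient set (illustrative choice):
path = euler_levy_sde(x0=0.0,
                      f=lambda x: -x,
                      F=lambda x: 1.0,
                      G=lambda x: 0.5 * math.cos(x))
```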

Lemma 12.10. Let $\Phi$ be bounded and Lipschitz, $X$ the unique solution of the SDE (12.7), and denote by $A$ the generator of $X$. Then $C_c^\infty(\mathbb{R}^d) \subset D(A)$.

³Protter requires that $L$ has independent coordinate processes, but this is not needed in his proof. For existence and uniqueness the local Lipschitz and linear growth conditions are enough; the strong Lipschitz condition is used for the Markov nature of the solution.

⁴This assumption can be relaxed under suitable assumptions on the (joint) filtration, see for example Ikeda & Watanabe [22, Theorem II.6.3].


Proof. Because of Theorem 5.12 (this theorem does not only hold for Feller semigroups, but for any strongly continuous contraction semigroup satisfying the positive maximum principle), it suffices to show
$$\lim_{t\to 0} \frac 1t \bigl(\mathbb{E}^x f(X_t) - f(x)\bigr) = g(x) \quad\text{and}\quad g \in C_\infty(\mathbb{R}^d)$$
for any $f \in C_c^\infty(\mathbb{R}^d)$. For this we use, as in the proof of the following theorem, Itô's formula to get
$$\mathbb{E}^x f(X_t) - f(x) = \mathbb{E}^x \int_0^t A f(X_s)\,ds,$$
and a calculation similar to the one in the proof of the next theorem. A complete proof can be found in Schilling & Schnurr [57, Theorem 3.5].

Theorem 12.11. Let $(L_t)_{t\geq 0}$ be an $n$-dimensional Lévy process with exponent $\psi$ and assume that $\Phi$ is Lipschitz continuous. Then the unique Markov solution $X$ of the SDE (12.7) admits a generalized symbol in the sense that
$$\lim_{t\to 0} \frac{\mathbb{E}^x e^{i(X_{t\wedge\tau_r} - x)\cdot\xi} - 1}{t} = -\psi(\Phi^\top(x)\xi), \quad r > 0.$$
If $\Phi \in C_b^1(\mathbb{R}^d)$, then $X$ is Feller, $C_c^\infty(\mathbb{R}^d) \subset D(A)$, and $q(x,\xi) = \psi(\Phi^\top(x)\xi)$ is the symbol of the generator.

Theorem 12.11 indicates that the formulae (12.4) and (12.6) may be used to associate symbols not only with Feller processes but also with more general Markov semimartingales. This has been investigated by Schnurr [58, 59], who shows that the class of Itô processes with jumps is essentially the largest class that can be described by symbols; see also [9, Chapters 2.4–5]. This opens up the way to analyze rather general semimartingales using the symbol. Let us also point out that the boundedness of $\Phi$ is only needed to ensure that $X$ is a Feller process.
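A quick numerical illustration of the symbol formula $q(x,\xi) = \psi(\Phi^\top(x)\xi)$ from Theorem 12.11, for a symmetric $\alpha$-stable exponent $\psi$ and a bounded Lipschitz coefficient $\Phi$ (both are assumed example choices, not from the text):

```python
import math

# Characteristic exponent of a symmetric alpha-stable Levy process (d = n = 1)
# and an illustrative bounded Lipschitz coefficient Phi.
alpha = 1.5

def psi(xi):
    return abs(xi) ** alpha

def Phi(x):
    return 1.0 + 0.5 * math.sin(x)

# Symbol of the solution of dX_t = Phi(X_{t-}) dL_t, by Theorem 12.11:
def q(x, xi):
    return psi(Phi(x) * xi)

# Sanity checks mirroring the general theory: q(x, 0) = 0, and for frozen x
# the map xi -> q(x, xi) inherits the symmetry and scaling of psi.
values = [q(0.3, xi) for xi in (-2.0, -1.0, 0.0, 1.0, 2.0)]
```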

Proof. Let $\tau = \tau_r$ be the first exit time of the process $X$ from the ball $B_r(x)$ centered at $x = X_0$. We use the Lévy–Itô decomposition (9.8) of $L$. From Itô's formula for jump processes (see, e.g., Protter [43, Chapter II, Theorem 33]) we get
$$\begin{aligned}
\frac 1t\, \mathbb{E}^x\bigl(e_\xi(X_{t\wedge\tau} - x) - 1\bigr)
&= \frac 1t\, \mathbb{E}^x\biggl[\int_0^{t\wedge\tau} i\, e_\xi(X_{s-} - x)\,\xi\, dX_s
- \frac 12 \int_0^{t\wedge\tau} e_\xi(X_{s-} - x)\,\xi\cdot d[X,X]_s^c\,\xi \\
&\qquad + e_{-\xi}(x) \sum_{s\leq\tau\wedge t} \bigl(e_\xi(X_s) - e_\xi(X_{s-}) - i\, e_\xi(X_{s-})\,\xi\cdot\Delta X_s\bigr)\biggr]
=: I_1 + I_2 + I_3.
\end{aligned}$$


We consider the three terms separately.
$$\begin{aligned}
I_1 &= \frac 1t\, \mathbb{E}^x \int_0^{t\wedge\tau} i\, e_\xi(X_{s-} - x)\,\xi\cdot dX_s
= \frac 1t\, \mathbb{E}^x \int_0^{t\wedge\tau} i\, e_\xi(X_{s-} - x)\,\xi\cdot \Phi(X_{s-})\,dL_s \\
&= \frac 1t\, \mathbb{E}^x \int_0^{t\wedge\tau} i\, e_\xi(X_{s-} - x)\,\xi\cdot \Phi(X_{s-})\Bigl[l\,ds + \int_{|y|\geq 1} y\, N(ds,dy)\Bigr] =: I_{11} + I_{12},
\end{aligned}$$
where we use that the diffusion part and the compensated small jumps of a Lévy process are martingales, cf. Theorem 9.12. Further,

$$\begin{aligned}
I_3 + I_{12}
&= \frac 1t\, \mathbb{E}^x \int_0^{t\wedge\tau} \int_{y\neq 0} e_\xi(X_{s-} - x)\bigl[e_\xi(\Phi(X_{s-})y) - 1 - i\,\xi\cdot\Phi(X_{s-})y\,\mathbb{1}_{(0,1)}(|y|)\bigr]\,N(ds,dy) \\
&= \frac 1t\, \mathbb{E}^x \int_0^{t\wedge\tau} \int_{y\neq 0} e_\xi(X_{s-} - x)\bigl[e_\xi(\Phi(X_{s-})y) - 1 - i\,\xi\cdot\Phi(X_{s-})y\,\mathbb{1}_{(0,1)}(|y|)\bigr]\,\nu(dy)\,ds \\
&\xrightarrow[t\to 0]{} \int_{y\neq 0} \bigl[e^{i\xi\cdot\Phi(x)y} - 1 - i\,\xi\cdot\Phi(x)y\,\mathbb{1}_{(0,1)}(|y|)\bigr]\,\nu(dy).
\end{aligned}$$
Here we use that $\nu(dy)\,ds$ is the compensator of $N(ds,dy)$, see Lemma 9.4. This requires that the integrand is $\nu$-integrable, but this is ensured by the local boundedness of $\Phi(\cdot)$ and the fact that $\int_{y\neq 0} \min\{|y|^2, 1\}\,\nu(dy) < \infty$. Moreover,

$$I_{11} = \frac 1t\, \mathbb{E}^x \int_0^{t\wedge\tau} i\, e_\xi(X_{s-} - x)\,\xi\cdot\Phi(X_{s-})\,l\,ds \xrightarrow[t\to 0]{} i\,\xi\cdot\Phi(x)\,l,$$

and, finally, we observe that
$$[X,X]^c = \Bigl[\int \Phi(X_{s-})\,dL_s,\ \int \Phi(X_{s-})\,dL_s\Bigr]^c = \int \Phi(X_{s-})\,d[L,L]_s^c\,\Phi^\top(X_{s-}) = \int \Phi(X_{s-})\,Q\,\Phi^\top(X_{s-})\,ds \ \in \mathbb{R}^{d\times d},$$

which gives
$$I_2 = -\frac{1}{2t}\, \mathbb{E}^x \int_0^{t\wedge\tau} e_\xi(X_{s-} - x)\,\xi\cdot d[X,X]_s^c\,\xi = -\frac{1}{2t}\, \mathbb{E}^x \int_0^{t\wedge\tau} e_\xi(X_{s-} - x)\,\xi\cdot\Phi(X_{s-})\,Q\,\Phi^\top(X_{s-})\,\xi\,ds \xrightarrow[t\to 0]{} -\frac 12\,\xi\cdot\Phi(x)\,Q\,\Phi^\top(x)\,\xi.$$

This proves
$$q(x,\xi) = -i\,l\cdot\Phi^\top(x)\xi + \frac 12\,\xi\cdot\Phi(x)\,Q\,\Phi^\top(x)\,\xi + \int_{y\neq 0} \bigl[1 - e^{iy\cdot\Phi^\top(x)\xi} + i\,y\cdot\Phi^\top(x)\xi\,\mathbb{1}_{(0,1)}(|y|)\bigr]\,\nu(dy) = \psi(\Phi^\top(x)\xi).$$


For the second part of the theorem we use Lemma 12.10 to see that $C_c^\infty(\mathbb{R}^d) \subset D(A)$. The continuity of the flow $x \mapsto X_t^x$ (Protter [43, Chapter V, Theorem 38]; recall that $X^x$ is the unique solution of the SDE with initial condition $X_0^x = x$) ensures that the semigroup $P_t f(x) := \mathbb{E}^x f(X_t) = \mathbb{E} f(X_t^x)$ maps $f \in C_\infty(\mathbb{R}^d)$ to $C_b(\mathbb{R}^d)$. In order to see that $P_t f \in C_\infty(\mathbb{R}^d)$ we need that $\lim_{|x|\to\infty} |X_t^x| = \infty$ a.s. This requires some longer calculations, see e.g. Schnurr [58, Theorem 2.49] or Kunita [34, proof of Theorem 3.5, p. 353, line 13 from below] (for Brownian motion this argument is fully worked out in Schilling & Partzsch [56, Corollary 19.31]).

13. Dénouement

It is well known that the characteristic exponent $\psi(\xi)$ of a Lévy process $L = (L_t)_{t\geq 0}$ can be used to describe many probabilistic properties of the process. The key is the formula
$$\mathbb{E}^x e^{i\xi\cdot(L_t - x)} = \mathbb{E} e^{i\xi\cdot L_t} = e^{-t\psi(\xi)} \tag{13.1}$$
which gives access to the Fourier transform of the transition function $\mathbb{P}^x(L_t \in dy) = \mathbb{P}(L_t + x \in dy)$. Although it is no longer true that the symbol $q(x,\xi)$ of a Feller process $X = (X_t)_{t\geq 0}$ is the characteristic exponent, we may interpret formulae like (12.4),
$$-q(x,\xi) = \lim_{r\to 0} \frac{\mathbb{E}^x e^{i(X_{\tau_r} - x)\cdot\xi} - 1}{\mathbb{E}^x \tau_r},$$
as infinitesimal versions of the relation (13.1). What is more, both $\psi(\xi)$ and $q(x,\xi)$ are the Fourier symbols of the generators of the processes. We have already used these facts to discuss the conservativeness of Feller processes (Theorem 11.13) and the semimartingale decomposition of Feller processes (Theorem 12.9).

It is indeed possible to investigate further path properties of a Feller process using its symbol $q(x,\xi)$. Below we will, mostly without proof, give some examples which are taken from Böttcher, Schilling & Wang [9]. Let us point out the two guiding principles.

I. For sample path properties, the symbol $q(x,\xi)$ of a Feller process assumes the same role as the characteristic exponent $\psi(\xi)$ of a Lévy process.

II. A Feller process is 'locally Lévy', i.e. for short-time path properties the Feller process, started at $x_0$, behaves like the Lévy process $(L_t + x_0)_{t\geq 0}$ with characteristic exponent $\psi(\xi) := q(x_0,\xi)$.

The latter property is the reason why such Feller processes are often called Lévy-type processes. The model case is the stable-like process whose symbol is given by $q(x,\xi) = |\xi|^{\alpha(x)}$, where $\alpha : \mathbb{R}^d \to (0,2)$ is sufficiently smooth¹. This process behaves locally, and for short times $t \ll 1$, like an $\alpha(x)$-stable process, only that the index now depends on the starting point $X_0 = x$.
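Guiding principle II suggests a crude simulation scheme for the stable-like model case: freeze the index $\alpha(x)$ at the current position and move, over each short time step, like an $\alpha(x)$-stable Lévy process. A hedged sketch (the sampler and the index function are illustrative choices, not from the text):

```python
import math
import random

random.seed(3)

def sym_stable(alpha):
    """Chambers-Mallows-Stuck sampler for a standard symmetric
    alpha-stable random variable (0 < alpha <= 2, alpha != 1)."""
    U = random.uniform(-math.pi / 2, math.pi / 2)
    E = random.expovariate(1.0)
    return (math.sin(alpha * U) / math.cos(U) ** (1 / alpha)
            * (math.cos((1 - alpha) * U) / E) ** ((1 - alpha) / alpha))

def stable_like_path(alpha_fn, x0=0.0, T=1.0, n=2000):
    """'Locally Levy' Euler scheme for the stable-like symbol
    q(x, xi) = |xi|^alpha(x): over each small step the index is frozen
    at the current position (illustrative scheme)."""
    dt = T / n
    x, path = x0, [x0]
    for _ in range(n):
        a = alpha_fn(x)
        x += dt ** (1.0 / a) * sym_stable(a)   # self-similar scaling t^{1/alpha}
        path.append(x)
    return path

# Example index function with values in (0, 2), bounded away from 1:
path = stable_like_path(lambda x: 1.2 + 0.5 * math.sin(x) ** 2)
```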

The key to many path properties are the following maximal estimates, which were first proved in [53]. The present proof is taken from [9]; the observation that we may use a random time $\tau$ instead of a fixed time $t$ is due to F. Kühn.

¹In dimension $d = 1$, Lipschitz or even Dini continuity is enough (see Bass [4]); in higher dimensions we need something like $C^{5d+3}$-smoothness, cf. Hoh [21]. Meanwhile, Kühn [33] established the existence for $d > 1$ with $\alpha(x)$ satisfying any Hölder condition.



Theorem 13.1. Let $(X_t)_{t\geq 0}$ be a Feller process with generator $A$, $C_c^\infty(\mathbb{R}^d) \subset D(A)$ and symbol $q(x,\xi)$. If $\tau$ is an integrable stopping time, then
$$\mathbb{P}^x\Bigl(\sup_{s\leq\tau} |X_s - x| \geq r\Bigr) \leq c\, \mathbb{E}^x\tau \sup_{|y-x|\leq r}\ \sup_{|\xi|\leq r^{-1}} |q(y,\xi)|. \tag{13.2}$$

|q(y,ξ )|. (13.2)

Proof. Denote by $\sigma_r = \sigma_r^x$ the first exit time from the closed ball $\overline{B_r}(x)$. Clearly,
$$\{\sigma_r^x < \tau\} \subset \Bigl\{\sup_{s\leq\tau} |X_s - x| \geq r\Bigr\} \subset \{\sigma_r^x \leq \tau\}.$$
Pick $u \in C_c^\infty(\mathbb{R}^d)$, $0 \leq u \leq 1$, $u(0) = 1$, $\operatorname{supp} u \subset B_1(0)$, and set
$$u_r^x(y) := u\Bigl(\frac{y - x}{r}\Bigr).$$
In particular, $u_r^x|_{B_r(x)^c} = 0$. Hence,
$$\mathbb{1}_{\{\sigma_r^x \leq \tau\}} \leq 1 - u_r^x(X_{\tau\wedge\sigma_r^x}).$$
Now we can use (5.5) or (12.3) to get
$$\begin{aligned}
\mathbb{P}^x(\sigma_r^x \leq \tau)
&\leq \mathbb{E}^x\bigl[1 - u_r^x(X_{\tau\wedge\sigma_r^x})\bigr]
= \mathbb{E}^x \int_{[0,\tau\wedge\sigma_r^x)} q(X_s, D)\,u_r^x(X_s)\,ds \\
&= \mathbb{E}^x \int_{[0,\tau\wedge\sigma_r^x)} \int \mathbb{1}_{\overline{B_r}(x)}(X_s)\, e_\xi(X_s)\, q(X_s,\xi)\,\widehat{u_r^x}(\xi)\,d\xi\,ds \\
&\leq \mathbb{E}^x \int_{[0,\tau\wedge\sigma_r^x)} \int \sup_{|y-x|\leq r} |q(y,\xi)|\,|\widehat{u_r^x}(\xi)|\,d\xi\,ds
= \mathbb{E}^x[\tau\wedge\sigma_r^x] \int \sup_{|y-x|\leq r} |q(y,\xi)|\,|\widehat{u_r^x}(\xi)|\,d\xi \\
&\leq c\, \mathbb{E}^x\tau \sup_{|y-x|\leq r}\ \sup_{|\xi|\leq r^{-1}} |q(y,\xi)| \int (1 + |\xi|^2)\,|\hat u(\xi)|\,d\xi,
\end{aligned}$$
where the last estimate follows from (11.9) and Lemma 11.8.

There is also a probability estimate for $\sup_{s\leq t} |X_s - x| \leq r$, but this requires some sector condition for the symbol, that is, an estimate of the form
$$|\operatorname{Im} q(x,\xi)| \leq \kappa\, \operatorname{Re} q(x,\xi), \quad x,\xi \in \mathbb{R}^d. \tag{13.3}$$
One consequence of (13.3) is that the drift (which is contained in the imaginary part of the symbol) is not dominant. This is a familiar assumption in the study of path properties of Lévy processes, see e.g. Blumenthal & Getoor [8]; a typical example where this condition is violated are (Lévy) symbols of the form $\psi(\xi) = i\xi + |\xi|^{1/2}$. For a Lévy process the sector condition on $\psi$ coincides with the sector condition for the generator and the associated non-symmetric Dirichlet form, see Jacob [26, Volume 1, 4.7.32–33].
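The failure of (13.3) for $\psi(\xi) = i\xi + |\xi|^{1/2}$ can be checked directly: the ratio $|\operatorname{Im}\psi|/\operatorname{Re}\psi = |\xi|^{1/2}$ is unbounded, so no constant $\kappa$ can work. A one-line numerical confirmation:

```python
# For psi(xi) = i*xi + |xi|^{1/2} we have Im psi(xi) = xi and
# Re psi(xi) = |xi|^{1/2}, so |Im psi| / Re psi = |xi|^{1/2} -> infinity:
# the drift dominates for large |xi| and the sector condition (13.3) fails.

def re_psi(xi):
    return abs(xi) ** 0.5

def im_psi(xi):
    return xi

ratios = [abs(im_psi(xi)) / re_psi(xi) for xi in (1.0, 100.0, 10000.0)]
# ratios grow like |xi|^{1/2}
```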


With some more effort (see [9, Theorem 5.5]), we may replace the sector condition by imposing conditions on the expression
$$\sup_{|y-x|\leq r} \frac{\operatorname{Re} q(x,\xi)}{|\xi|\,|\operatorname{Im} q(y,\xi)|} \quad \text{as } r \to \infty.$$

Theorem 13.2 (see [9, pp. 117–119]). Let $(X_t)_{t\geq 0}$ be a Feller process with infinitesimal generator $A$, $C_c^\infty(\mathbb{R}^d) \subset D(A)$ and symbol $q(x,\xi)$ satisfying the sector condition (13.3). Then
$$\mathbb{P}^x\Bigl(\sup_{s\leq t} |X_s - x| < r\Bigr) \leq \frac{c_{\kappa,d}}{t\, \sup_{|\xi|\leq r^{-1}} \inf_{|y-x|\leq 2r} |q(y,\xi)|}. \tag{13.4}$$

The maximal estimates (13.2) and (13.4) are quite useful tools. With them we can estimate the mean exit time from balls ($X$ and $q(x,\xi)$ are as in Theorem 13.2):
$$\frac{c}{\sup_{|\xi|\leq 1/r} \inf_{|y-x|\leq r} |q(y,\xi)|} \leq \mathbb{E}^x\sigma_r^x \leq \frac{c'}{\sup_{|\xi|\leq k^*/r} \inf_{|y-x|\leq r} |q(y,\xi)|}$$
for all $x \in \mathbb{R}^d$ and $r > 0$, and with $k^* = \arccos\sqrt{2/3} \approx 0.615$.
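For the model symbol $q(x,\xi) = |\xi|^\alpha$ the two-sided bound becomes explicit, and both sides scale like $r^\alpha$; a small illustrative computation ($\alpha$ is an example choice):

```python
# For q(x, xi) = |xi|^alpha the inner inf over y is trivial, and
# sup_{|xi| <= k/r} |xi|^alpha = (k/r)^alpha, so both sides of the
# mean-exit-time bound are proportional to r^alpha.
alpha = 1.5
k_star = 0.615   # approx arccos(sqrt(2/3))

def sup_inf_q(r, k=1.0):
    return (k / r) ** alpha

radii = (0.5, 1.0, 2.0)
lower = [1.0 / sup_inf_q(r) for r in radii]          # ~ c  * r^alpha
upper = [1.0 / sup_inf_q(r, k_star) for r in radii]  # ~ c' * r^alpha
```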

Recently, Kühn [32] studied the existence of and estimates for generalized moments; a typical result is contained in the following theorem.

Theorem 13.3. Let $X = (X_t)_{t\geq 0}$ be a Feller process with infinitesimal generator $(A, D(A))$ and $C_c^\infty(\mathbb{R}^d) \subset D(A)$. Assume that the symbol $q(x,\xi)$, given by (11.4), satisfies $q(x,0) = 0$ and has $(l(x), Q(x), \nu(x,dy))$ as $x$-dependent Lévy triplet. If $f : \mathbb{R}^d \to [0,\infty)$ is (comparable to) a twice continuously differentiable submultiplicative function such that
$$\sup_{x\in K} \int f(y)\,\nu(x,dy) < \infty \quad \text{for a compact set } K \subset \mathbb{R}^d,$$
then the generalized moment $\sup_{x\in K} \sup_{s\leq t} \mathbb{E}^x f(X_{s\wedge\tau_K} - x) < \infty$ exists and
$$\mathbb{E}^x f(X_{t\wedge\tau_K}) \leq c\, f(x)\, e^{C(M_1 + M_2)t};$$
here $\tau_K = \inf\{t > 0 : X_t \notin K\}$ is the first exit time from $K$, $C = C_K$ is some absolute constant, and
$$M_1 = \sup_{x\in K}\Bigl(|l(x)| + |Q(x)| + \int_{y\neq 0} (|y|^2 \wedge 1)\,\nu(x,dy)\Bigr), \qquad M_2 = \sup_{x\in K} \int_{|y|\geq 1} f(y)\,\nu(x,dy).$$
If $X$ has bounded coefficients (Lemma 11.8), then $K = \mathbb{R}^d$ is admissible.

There are also counterparts for the Blumenthal–Getoor and Pruitt indices. Below we give two representatives; for a full discussion we refer to [9].

Definition 13.4. Let $q(x,\xi)$ be the symbol of a Feller process. Then
$$\beta_\infty^x := \inf\Bigl\{\lambda > 0 : \lim_{|\xi|\to\infty} \frac{\sup_{|y-x|\leq 1/|\xi|}\,\sup_{|\eta|\leq|\xi|} |q(y,\eta)|}{|\xi|^\lambda} = 0\Bigr\}, \tag{13.5}$$
$$\delta_\infty^x := \sup\Bigl\{\lambda > 0 : \lim_{|\xi|\to\infty} \frac{\inf_{|y-x|\leq 1/|\xi|}\,\sup_{|\eta|\leq|\xi|} |q(y,\eta)|}{|\xi|^\lambda} = \infty\Bigr\}. \tag{13.6}$$


By definition, $0 \leq \delta_\infty^x \leq \beta_\infty^x \leq 2$. For example, if $q(x,\xi) = |\xi|^{\alpha(x)}$ with a smooth exponent function $\alpha(x)$, then $\beta_\infty^x = \delta_\infty^x = \alpha(x)$; in general, however, we cannot expect that the two indices coincide.

As for Lévy processes, these indices can be used to describe the path behaviour.
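The claim $\beta_\infty^x = \delta_\infty^x = \alpha(x)$ for the stable-like symbol can be checked numerically by evaluating the quantity inside the limit of Definition 13.4 on a grid; a crude sketch with an illustrative index function $\alpha$:

```python
import math

# For q(y, eta) = |eta|^alpha(y), the ratio log H(xi) / log xi should
# converge to alpha(x) as xi -> infinity, where H is the sup appearing
# in the definition of beta_infty^x (grids are deliberately crude).
def alpha(y):
    return 1.2 + 0.5 * math.sin(y) ** 2

x = 0.7

def H(xi, m=200):
    """sup_{|y-x| <= 1/xi} sup_{|eta| <= xi} |eta|^alpha(y) on an m-point
    y-grid; since alpha(y) > 0, the sup over eta is attained at |eta| = xi."""
    best = 0.0
    for i in range(m + 1):
        y = x - 1.0 / xi + (2.0 / xi) * i / m
        best = max(best, xi ** alpha(y))
    return best

estimates = [math.log(H(xi)) / math.log(xi) for xi in (1e2, 1e4, 1e6)]
# estimates -> alpha(x), i.e. beta_infty^x = alpha(x) for this symbol
```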

Theorem 13.5. Let $(X_t)_{t\geq 0}$ be a $d$-dimensional Feller process with generator $A$ such that $C_c^\infty(\mathbb{R}^d) \subset D(A)$ and symbol $q(x,\xi)$. For every bounded analytic set $E \subset [0,\infty)$, the Hausdorff dimension satisfies
$$\dim\{X_t : t \in E\} \leq \min\Bigl\{d,\ \sup_{x\in\mathbb{R}^d} \beta_\infty^x \cdot \dim E\Bigr\}. \tag{13.7}$$
A proof can be found in [9, Theorem 5.15]. It is instructive to observe that we have to take the supremum w.r.t. the space variable $x$, as we do not know how the process $X$ moves while we observe it during $t \in E$. This shows that we can only expect to get 'exact' results if $t \to 0$. Here is such an example.

Theorem 13.6. Let $(X_t)_{t\geq 0}$ be a $d$-dimensional Feller process with symbol $q(x,\xi)$ satisfying the sector condition. Then, $\mathbb{P}^x$-a.s.,
$$\lim_{t\to 0} \frac{\sup_{0\leq s\leq t} |X_s - x|}{t^{1/\lambda}} = 0 \quad \forall\,\lambda > \beta_\infty^x, \tag{13.8}$$
$$\lim_{t\to 0} \frac{\sup_{0\leq s\leq t} |X_s - x|}{t^{1/\lambda}} = \infty \quad \forall\,\lambda < \delta_\infty^x. \tag{13.9}$$
As one would expect, these results are proved using the maximal estimates (13.2) and (13.4) in conjunction with the Borel–Cantelli lemma, see [9, Theorem 5.16].

If we are interested in the long-term behaviour, one could introduce indices 'at zero', where we replace in Definition 13.4 the limit $|\xi| \to \infty$ by $|\xi| \to 0$; but we will always have to pay the price that we lose the influence of the starting point $X_0 = x$, i.e. we will have to take the supremum or infimum over all $x \in \mathbb{R}^d$.

With the machinery we have developed here, one can also study further path properties, such as invariant measures, ergodicity, transience and recurrence, etc. For this we refer to the monograph [9] as well as to recent developments by Behme & Schnurr [7] and Sandrić [49, 50].

A. Some classical results

In this appendix we collect some classical results from (or needed for) probability theory which are not always contained in a standard course.

The Cauchy–Abel functional equation

Below we reproduce the standard proof for continuous functions which, however, also works for right-continuous (or monotone) functions.

Theorem A.1. Let $\phi : [0,\infty) \to \mathbb{C}$ be a right-continuous function satisfying the functional equation $\phi(s+t) = \phi(s)\phi(t)$. Then $\phi(t) = \phi(1)^t$.

Proof. Assume that $\phi(a) = 0$ for some $a \geq 0$. Then we find for all $t \geq 0$
$$\phi(a+t) = \phi(a)\phi(t) = 0 \implies \phi|_{[a,\infty)} \equiv 0.$$
To the left of $a$ we find for all $n \in \mathbb{N}$
$$0 = \phi(a) = \bigl[\phi\bigl(\tfrac an\bigr)\bigr]^n \implies \phi\bigl(\tfrac an\bigr) = 0.$$
Since $\phi$ is right-continuous, we infer that $\phi|_{[0,\infty)} \equiv 0$, and $\phi(t) = \phi(1)^t$ holds.

Now assume that $\phi(1) \neq 0$. Setting $f(t) := \phi(t)\phi(1)^{-t}$ we get
$$f(s+t) = \phi(s+t)\phi(1)^{-(s+t)} = \phi(s)\phi(1)^{-s}\,\phi(t)\phi(1)^{-t} = f(s)f(t)$$
as well as $f(1) = 1$. Applying the functional equation $k$ times we conclude that
$$f\bigl(\tfrac kn\bigr) = \bigl[f\bigl(\tfrac 1n\bigr)\bigr]^k \quad \text{for all } k,n \in \mathbb{N}.$$
The same calculation done backwards yields
$$\bigl[f\bigl(\tfrac 1n\bigr)\bigr]^k = \bigl[f\bigl(\tfrac 1n\bigr)\bigr]^{n\frac kn} = \bigl[f\bigl(\tfrac nn\bigr)\bigr]^{\frac kn} = \bigl[f(1)\bigr]^{\frac kn} = 1.$$
Hence, $f|_{\mathbb{Q}^+} \equiv 1$. Since $\phi$, hence $f$, is right-continuous, we see that $f \equiv 1$ or, equivalently, $\phi(t) = [\phi(1)]^t$ for all $t \geq 0$.
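A quick numerical sanity check of Theorem A.1 and of the iteration used in the proof; $\phi(t) = a^t$ is, of course, an example solution of the functional equation:

```python
# Theorem A.1: a right-continuous solution of phi(s + t) = phi(s)*phi(t)
# is phi(t) = phi(1)^t.  We verify both the functional equation and the
# proof's iteration f(k/n) = f(1/n)^k for the example phi(t) = a^t.
a = 2.5

def phi(t):
    return a ** t

# the functional equation holds ...
lhs = phi(0.3 + 1.4)
rhs = phi(0.3) * phi(1.4)

# ... and knowing phi(1) reconstructs phi at every rational k/n:
def reconstruct(k, n):
    return (phi(1.0) ** (1.0 / n)) ** k   # = phi(1)^{k/n}

val = reconstruct(3, 4)   # should equal phi(3/4)
```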



Characteristic functions and moments

Theorem A.2 (Even moments and characteristic functions). Let $Y = (Y^{(1)}, \dots, Y^{(d)})$ be a random variable in $\mathbb{R}^d$ and let $\chi(\xi) = \mathbb{E} e^{i\xi\cdot Y}$ be its characteristic function. Then $\mathbb{E}(|Y|^2)$ exists if, and only if, the second derivatives $\frac{\partial^2}{\partial\xi_k^2}\chi(0)$, $k = 1,\dots,d$, exist and are finite. In this case all mixed second derivatives exist and
$$\mathbb{E} Y^{(k)} = \frac 1i \frac{\partial\chi(0)}{\partial\xi_k} \quad\text{and}\quad \mathbb{E}\bigl(Y^{(k)}Y^{(l)}\bigr) = -\frac{\partial^2\chi(0)}{\partial\xi_k\,\partial\xi_l}. \tag{A.1}$$

Proof. In order to keep the notation simple, we consider only $d = 1$. If $\mathbb{E}(Y^2) < \infty$, then the formulae (A.1) are routine applications of the differentiation lemma for parameter-dependent integrals, see e.g. [54, Theorem 11.5] or [55, Satz 12.2]. Moreover, $\chi$ is twice continuously differentiable.

Let us prove that $\mathbb{E}(Y^2) \leq -\chi''(0)$. An application of l'Hospital's rule gives
$$\begin{aligned}
\chi''(0) &= \lim_{h\to 0} \frac 12 \Bigl(\frac{\chi'(2h) - \chi'(0)}{2h} + \frac{\chi'(0) - \chi'(-2h)}{2h}\Bigr) = \lim_{h\to 0} \frac{\chi'(2h) - \chi'(-2h)}{4h} \\
&= \lim_{h\to 0} \frac{\chi(2h) - 2\chi(0) + \chi(-2h)}{4h^2} = \lim_{h\to 0} \mathbb{E}\biggl[\Bigl(\frac{e^{ihY} - e^{-ihY}}{2h}\Bigr)^2\biggr] = -\lim_{h\to 0} \mathbb{E}\biggl[\Bigl(\frac{\sin hY}{h}\Bigr)^2\biggr].
\end{aligned}$$
From Fatou's lemma we get
$$\chi''(0) \leq -\mathbb{E}\biggl[\lim_{h\to 0}\Bigl(\frac{\sin hY}{h}\Bigr)^2\biggr] = -\mathbb{E}\bigl[Y^2\bigr].$$
In the multivariate case observe that $\mathbb{E}|Y^{(k)}Y^{(l)}| \leq \mathbb{E}\bigl[(Y^{(k)})^2\bigr] + \mathbb{E}\bigl[(Y^{(l)})^2\bigr]$.
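The second-difference quotient appearing in the proof can be checked numerically; for $Y$ uniform on $\{-1, +1\}$ we have $\chi(\xi) = \cos\xi$, so $-\chi''(0) = \mathbb{E}(Y^2) = 1$:

```python
import math

# Finite-difference check of Theorem A.2 for Y uniform on {-1, +1}:
# chi(xi) = cos(xi), and the proof's second difference
# (chi(2h) - 2*chi(0) + chi(-2h)) / (4h^2) converges to chi''(0) = -1.
def chi(xi):
    return math.cos(xi)   # real-valued because Y is symmetric

def second_difference(h):
    return (chi(2 * h) - 2 * chi(0.0) + chi(-2 * h)) / (4 * h ** 2)

approx = [-second_difference(h) for h in (0.1, 0.01, 0.001)]
# approx -> E(Y^2) = 1 as h -> 0
```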

Vague and weak convergence of measures

A sequence of locally finite¹ Borel measures $(\mu_n)_{n\in\mathbb{N}}$ on $\mathbb{R}^d$ converges vaguely to a locally finite measure $\mu$ if
$$\lim_{n\to\infty} \int \phi\,d\mu_n = \int \phi\,d\mu \quad \text{for all } \phi \in C_c(\mathbb{R}^d). \tag{A.2}$$
Since the compactly supported continuous functions $C_c(\mathbb{R}^d)$ are dense in the space of continuous functions vanishing at infinity, $C_\infty(\mathbb{R}^d) = \{\phi \in C(\mathbb{R}^d) : \lim_{|x|\to\infty} \phi(x) = 0\}$, we can replace in (A.2) the set $C_c(\mathbb{R}^d)$ with $C_\infty(\mathbb{R}^d)$. The following theorem guarantees that a family of Borel measures is sequentially relatively compact² for the vague convergence.

¹I.e. every compact set $K$ has finite measure.
²Note that compactness and sequential compactness need not coincide!


Theorem A.3. Let $(\mu_t)_{t>0}$ be a family of measures on $\mathbb{R}^d$ which is uniformly bounded, in the sense that $\sup_{t>0} \mu_t(\mathbb{R}^d) < \infty$. Then every sequence $(\mu_{t_n})_{n\in\mathbb{N}}$ has a vaguely convergent subsequence.

If we test in (A.2) against all bounded continuous functions $\phi \in C_b(\mathbb{R}^d)$, we get weak convergence of the sequence $\mu_n \to \mu$. One has

Theorem A.4. A sequence of measures $(\mu_n)_{n\in\mathbb{N}}$ converges weakly to $\mu$ if, and only if, $\mu_n$ converges vaguely to $\mu$ and $\lim_{n\to\infty} \mu_n(\mathbb{R}^d) = \mu(\mathbb{R}^d)$ (preservation of mass). In particular, weak and vague convergence coincide for sequences of probability measures.
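The classical example illustrating Theorem A.4 is $\mu_n = \delta_n$: the mass escapes to infinity, so $\mu_n \to 0$ vaguely but not weakly. A minimal sketch:

```python
# mu_n = delta_n converges vaguely to the zero measure (every compactly
# supported test function eventually integrates to 0), but not weakly:
# mu_n(R) = 1 does not converge to 0, so mass is not preserved.
def integrate_delta(n, phi):
    return phi(float(n))   # integral of phi against the Dirac mass at n

# test function with compact support (vanishes outside [-1, 1]):
def phi_c(x):
    return max(0.0, 1.0 - abs(x))

# bounded continuous test function, not vanishing at infinity:
def phi_b(x):
    return 1.0

vague = [integrate_delta(n, phi_c) for n in (2, 10, 100)]   # -> 0
mass = [integrate_delta(n, phi_b) for n in (2, 10, 100)]    # stays 1
```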

Proofs and a full discussion of vague and weak convergence can be found in Malliavin [40, Chapter III.6] or Schilling [55, Chapter 25].

For any finite measure $\mu$ on $\mathbb{R}^d$ we denote by $\hat\mu(\xi) := \int_{\mathbb{R}^d} e^{i\xi\cdot y}\,\mu(dy)$ its characteristic function.

Theorem A.5 (Lévy’s continuity theorem). Let (µn)n∈N be a sequence of finite measures on Rd .If µn→ µ weakly, then the characteristic functions qµn(ξ ) converge locally uniformly to qµ(ξ ).

Conversely, if the limit limn→∞ qµn(ξ ) = χ(ξ ) exists for all ξ ∈ Rd and defines a function χ

which is continuous at ξ = 0, then there exists a finite measure µ such that qµ(ξ ) = χ(ξ ) andµn→ µ weakly.

A proof in one dimension is contained in the monograph by Breiman [10], d-dimensional ver-sions can be found in Bauer [5, Chapter 23] and Malliavin [40, Chapter IV.4].

Convergence in distribution

By $\xrightarrow{d}$ we denote convergence in distribution.

Theorem A.6 (Convergence of types). Let $(Y_n)_{n\in\mathbb{N}}$, $Y$ and $Y'$ be random variables and suppose that there are constants $a_n > 0$, $c_n \in \mathbb{R}$ such that
$$Y_n \xrightarrow[n\to\infty]{d} Y \quad\text{and}\quad a_n Y_n + c_n \xrightarrow[n\to\infty]{d} Y'.$$
If $Y$ and $Y'$ are non-degenerate, then the limits $a = \lim_{n\to\infty} a_n$ and $c = \lim_{n\to\infty} c_n$ exist and $Y' \sim aY + c$.

Proof. Write $\chi_Z(\xi) := \mathbb{E} e^{i\xi Z}$ for the characteristic function of the random variable $Z$.

1. By Lévy's continuity theorem (Theorem A.5), convergence in distribution ensures that
$$\chi_{a_nY_n + c_n}(\xi) = e^{ic_n\cdot\xi}\,\chi_{Y_n}(a_n\xi) \xrightarrow[n\to\infty]{\text{locally unif.}} \chi_{Y'}(\xi) \quad\text{and}\quad \chi_{Y_n}(\xi) \xrightarrow[n\to\infty]{\text{locally unif.}} \chi_Y(\xi).$$
Take some subsequence $(a_{n(k)})_{k\in\mathbb{N}} \subset (a_n)_{n\in\mathbb{N}}$ such that $\lim_{k\to\infty} a_{n(k)} = a \in [0,\infty]$.

2. Claim: $a > 0$. Assume, on the contrary, that $a = 0$. Then
$$|\chi_{a_{n(k)}Y_{n(k)} + c_{n(k)}}(\xi)| = |\chi_{Y_{n(k)}}(a_{n(k)}\xi)| \xrightarrow[k\to\infty]{} |\chi_Y(0)| = 1.$$
Thus, $|\chi_{Y'}| \equiv 1$, which means that $Y'$ would be degenerate, contradicting our assumption.


3. Claim: $a < \infty$. If $a = \infty$, we write $Y_n = a_n^{-1}\bigl((a_nY_n + c_n) - c_n\bigr)$ and apply the argument from step 2 along the subsequence $(a_{n(k)}^{-1})_{k\in\mathbb{N}}$, which tends to $0$; this would make $Y$ degenerate, a contradiction. Hence $a < \infty$.

4. Claim: There exists a unique $a \in [0,\infty)$ such that $\lim_{n\to\infty} a_n = a$. Assume that there were two different subsequential limits $a_{n(k)} \to a$, $a_{m(k)} \to a'$ and $a \neq a'$. Then
$$|\chi_{a_{n(k)}Y_{n(k)} + c_{n(k)}}(\xi)| = |\chi_{Y_{n(k)}}(a_{n(k)}\xi)| \xrightarrow[k\to\infty]{} |\chi_Y(a\xi)|,$$
$$|\chi_{a_{m(k)}Y_{m(k)} + c_{m(k)}}(\xi)| = |\chi_{Y_{m(k)}}(a_{m(k)}\xi)| \xrightarrow[k\to\infty]{} |\chi_Y(a'\xi)|.$$
On the other hand, $|\chi_Y(a\xi)| = |\chi_Y(a'\xi)| = |\chi_{Y'}(\xi)|$. If $a' < a$, we get by iteration
$$|\chi_Y(\xi)| = \bigl|\chi_Y\bigl(\tfrac{a'}{a}\xi\bigr)\bigr| = \cdots = \bigl|\chi_Y\bigl(\bigl(\tfrac{a'}{a}\bigr)^N\xi\bigr)\bigr| \xrightarrow[N\to\infty]{} |\chi_Y(0)| = 1.$$
Thus, $|\chi_Y| \equiv 1$ and $Y$ is a.s. constant. Since $a, a'$ can be interchanged, we conclude that $a = a'$.

5. We have
$$e^{ic_n\cdot\xi} = \frac{\chi_{a_nY_n + c_n}(\xi)}{\chi_{a_nY_n}(\xi)} = \frac{\chi_{a_nY_n + c_n}(\xi)}{\chi_{Y_n}(a_n\xi)} \xrightarrow[n\to\infty]{} \frac{\chi_{Y'}(\xi)}{\chi_Y(a\xi)}.$$
Since $\chi_Y$ is continuous and $\chi_Y(0) = 1$, the limit $\lim_{n\to\infty} e^{ic_n\cdot\xi}$ exists for all $|\xi| \leq \delta$ and some small $\delta > 0$. For $\xi = t\xi_0$ with $|\xi_0| = 1$, we get
$$0 < \biggl|\int_0^\delta \frac{\chi_{Y'}(t\xi_0)}{\chi_Y(ta\xi_0)}\,dt\biggr| = \biggl|\lim_{n\to\infty} \int_0^\delta e^{itc_n\cdot\xi_0}\,dt\biggr| = \biggl|\lim_{n\to\infty} \frac{e^{i\delta c_n\cdot\xi_0} - 1}{ic_n\cdot\xi_0}\biggr| \leq \liminf_{n\to\infty} \frac{2}{|c_n\cdot\xi_0|},$$
and so $\limsup_{n\to\infty} |c_n| < \infty$; if there were two limit points $c \neq c'$, then $e^{ic\cdot\xi} = e^{ic'\cdot\xi}$ for all $|\xi| \leq \delta$. This gives $c = c'$, and we finally see that $c_n \to c$, $e^{ic_n\cdot\xi} \to e^{ic\cdot\xi}$, as well as
$$\chi_{Y'}(\xi) = e^{ic\cdot\xi}\,\chi_Y(a\xi).$$

A random variable is called symmetric if $Y \sim -Y$.

Theorem A.7 (Symmetrization inequality). Let $Y_1, \dots, Y_n$ be independent symmetric random variables. Then the partial sum $S_n = Y_1 + \cdots + Y_n$ is again symmetric and
$$\mathbb{P}(|Y_1 + \cdots + Y_n| \geq u) \geq \frac 12\, \mathbb{P}\Bigl(\max_{1\leq k\leq n} |Y_k| \geq u\Bigr). \tag{A.3}$$
If the $Y_k$ are iid with $Y_1 \sim \mu$, then
$$\mathbb{P}(|Y_1 + \cdots + Y_n| \geq u) \geq \frac 12\bigl(1 - e^{-n\mathbb{P}(|Y_1| > u)}\bigr). \tag{A.4}$$

Proof. By independence, $S_n = Y_1 + \cdots + Y_n \sim -Y_1 - \cdots - Y_n = -S_n$.

Let $\tau = \min\{1 \leq k \leq n : |Y_k| = \max_{1\leq l\leq n} |Y_l|\}$ and set $Y_{n,\tau} = S_n - Y_\tau$. Then the four (counting all possible $\pm$ combinations) random variables $(\pm Y_\tau, \pm Y_{n,\tau})$ have the same law. Moreover,
$$\mathbb{P}(Y_\tau \geq u) \leq \mathbb{P}(Y_\tau \geq u,\ Y_{n,\tau} \geq 0) + \mathbb{P}(Y_\tau \geq u,\ Y_{n,\tau} \leq 0) = 2\,\mathbb{P}(Y_\tau \geq u,\ Y_{n,\tau} \geq 0),$$
and so
$$\mathbb{P}(S_n \geq u) = \mathbb{P}(Y_\tau + Y_{n,\tau} \geq u) \geq \mathbb{P}(Y_\tau \geq u,\ Y_{n,\tau} \geq 0) \geq \frac 12\,\mathbb{P}(Y_\tau \geq u).$$
By symmetry, this implies (A.3). In order to see (A.4), we use that the $Y_k$ are iid, hence
$$\mathbb{P}\Bigl(\max_{1\leq k\leq n} |Y_k| \leq u\Bigr) = \mathbb{P}(|Y_1| \leq u)^n \leq e^{-n\mathbb{P}(|Y_1| > u)},$$
along with the elementary inequality $1 - p \leq e^{-p}$ for $0 \leq p \leq 1$. This proves (A.4).
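A Monte Carlo sanity check of (A.3) and (A.4) for an illustrative symmetric distribution ($Y_k$ uniform on $\{-3, -1, 1, 3\}$; all parameters are example choices):

```python
import math
import random

random.seed(5)

# Estimate P(|S_n| >= u) and P(max_k |Y_k| >= u) by simulation and
# compare with the lower bounds (A.3) and (A.4).
n, u, trials = 10, 2.5, 20000
hits_sum = hits_max = 0
for _ in range(trials):
    ys = [random.choice((-3, -1, 1, 3)) for _ in range(n)]
    hits_sum += abs(sum(ys)) >= u
    hits_max += max(abs(y) for y in ys) >= u

p_sum = hits_sum / trials            # P(|Y_1 + ... + Y_n| >= u)
p_max = hits_max / trials            # P(max_k |Y_k| >= u)
p_tail = 0.5                         # P(|Y_1| > u) = P(|Y_1| = 3)
bound_a3 = 0.5 * p_max
bound_a4 = 0.5 * (1.0 - math.exp(-n * p_tail))
```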

The predictable σ -algebra

Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space and $(\mathcal{F}_t)_{t\geq 0}$ some filtration. A stochastic process $(X_t)_{t\geq 0}$ is called adapted if for every $t \geq 0$ the random variable $X_t$ is $\mathcal{F}_t$ measurable.

Definition A.8. The predictable $\sigma$-algebra $\mathcal{P}$ is the smallest $\sigma$-algebra on $\Omega\times(0,\infty)$ such that all left-continuous adapted stochastic processes $(\omega, t) \mapsto X_t(\omega)$ are measurable. A $\mathcal{P}$ measurable process $X$ is called a predictable process.

For a stopping time $\tau$ we denote by
$$\rrbracket 0, \tau\rrbracket := \{(\omega, t) : 0 < t \leq \tau(\omega)\} \quad\text{and}\quad \rrbracket\tau, \infty\llbracket\ := \{(\omega, t) : t > \tau(\omega)\} \tag{A.5}$$
the left-open stochastic intervals. The following characterization of the predictable $\sigma$-algebra is essentially from Jacod & Shiryaev [27, Theorem I.2.2].

Theorem A.9. The predictable $\sigma$-algebra $\mathcal{P}$ is generated by any one of the following families of random sets:

a) $\rrbracket 0, \tau\rrbracket$ where $\tau$ is any bounded stopping time;

b) $F_s \times (s, t]$ where $F_s \in \mathcal{F}_s$ and $0 \leq s < t$.

Proof. We write $\mathcal{P}_a$ and $\mathcal{P}_b$ for the $\sigma$-algebras generated by the families listed in a) and b), respectively.

1. Pick $0 \leq s < t$, $F = F_s \in \mathcal{F}_s$ and let $n > t$. Observe that $s_F := s\mathbb{1}_F + n\mathbb{1}_{F^c}$ is a bounded stopping time³ (and $t_F$ is defined analogously) and $F\times(s, t] = \rrbracket s_F, t_F\rrbracket$. Therefore, $\rrbracket s_F, t_F\rrbracket = \rrbracket 0, t_F\rrbracket \setminus \rrbracket 0, s_F\rrbracket \in \mathcal{P}_a$, and we conclude that $\mathcal{P}_b \subset \mathcal{P}_a$.

2. Let $\tau$ be a bounded stopping time. Since $t \mapsto \mathbb{1}_{\rrbracket 0,\tau\rrbracket}(\omega, t)$ is adapted and left-continuous, we have $\mathcal{P}_a \subset \mathcal{P}$.

3. Let $X$ be an adapted and left-continuous process and define for every $n \in \mathbb{N}$
$$X_t^n := \sum_{k=0}^\infty X_{k2^{-n}}\,\mathbb{1}_{\rrbracket k2^{-n}, (k+1)2^{-n}\rrbracket}(t).$$
Obviously, $X^n = (X_t^n)_{t\geq 0}$ is $\mathcal{P}_b$ measurable; because of the left-continuity of $t \mapsto X_t$, the limit $\lim_{n\to\infty} X_t^n = X_t$ exists, and we conclude that $X$ is $\mathcal{P}_b$ measurable; consequently, $\mathcal{P} \subset \mathcal{P}_b$.

³Indeed, for $t < n$ we have $\{s_F \leq t\} = \{s \leq t\} \cap F$, which is $\emptyset$ if $s > t$ and $F$ if $s \leq t$; for $t \geq n$ we have $\{s_F \leq t\} = \Omega$. In each case $\{s_F \leq t\} \in \mathcal{F}_t$.
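Step 3's dyadic approximation can also be checked numerically: for a continuous path the piecewise constant predictable approximations $X^n$ converge pointwise. A small sketch with the example path $X_t = t^2$:

```python
# X^n_t = X_{k 2^{-n}} for t in (k 2^{-n}, (k+1) 2^{-n}]: the path is
# sampled at the left endpoint of the dyadic interval containing t,
# which uses only information strictly before t (predictability).
def X(t):
    return t * t   # a (left-)continuous example path

def X_n(t, n):
    if t == 0.0:
        return X(0.0)
    h = 2.0 ** (-n)
    k = int((t - 1e-12) / h)   # index k with t in (k*h, (k+1)*h]
    return X(k * h)

# the dyadic approximations converge pointwise (here: at t = 0.7):
errors = [abs(X_n(0.7, n) - X(0.7)) for n in (2, 5, 10, 15)]
```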


The structure of translation invariant operators

Let $\vartheta_x f(y) := f(y + x)$ be the translation operator and $\tilde f(x) := f(-x)$ the reflection. A linear operator $L : C_c^\infty(\mathbb{R}^d) \to C(\mathbb{R}^d)$ is called translation invariant if
$$\vartheta_x(Lf) = L(\vartheta_x f). \tag{A.6}$$
A distribution $\lambda$ is an element of the topological dual $(C_c^\infty(\mathbb{R}^d))'$, i.e. a continuous linear functional $\lambda : C_c^\infty(\mathbb{R}^d) \to \mathbb{R}$. The convolution of a distribution with a function $f \in C_c^\infty(\mathbb{R}^d)$ is defined as
$$f * \lambda(x) := \lambda(\vartheta_{-x}\tilde f), \qquad \lambda \in (C_c^\infty(\mathbb{R}^d))',\ f \in C_c^\infty(\mathbb{R}^d),\ x \in \mathbb{R}^d.$$

If $\lambda = \mu$ is a measure, this formula generalizes the 'usual' convolution
$$f * \mu(x) = \int f(x - y)\,\mu(dy) = \int \tilde f(y - x)\,\mu(dy) = \mu(\vartheta_{-x}\tilde f).$$

Theorem A.10. If $\lambda \in (C_c^\infty(\mathbb{R}^d))'$ is a distribution, then $Lf(x) := f * \lambda(x)$ defines a translation invariant continuous linear map $L : C_c^\infty(\mathbb{R}^d) \to C(\mathbb{R}^d)$.

Conversely, every translation invariant continuous linear map $L : C_c^\infty(\mathbb{R}^d) \to C(\mathbb{R}^d)$ is of the form $Lf = f * \lambda$ for some unique distribution $\lambda \in (C_c^\infty(\mathbb{R}^d))'$.

Proof. Let $Lf = f * \lambda$. From the very definition of the convolution we get, using $\widetilde{\vartheta_x f} = \vartheta_{-x}\tilde f$,
$$(\vartheta_x f) * \lambda(y) = \lambda\bigl(\vartheta_{-y}\widetilde{\vartheta_x f}\bigr) = \lambda(\vartheta_{-y}\vartheta_{-x}\tilde f) = \lambda(\vartheta_{-(x+y)}\tilde f) = (f * \lambda)(x + y) = \vartheta_x(f * \lambda)(y).$$
For any sequence $x_n \to x$ in $\mathbb{R}^d$ and $f \in C_c^\infty(\mathbb{R}^d)$ we know that $\vartheta_{-x_n}\tilde f \to \vartheta_{-x}\tilde f$ in $C_c^\infty(\mathbb{R}^d)$. Since $\lambda$ is a continuous linear functional on $C_c^\infty(\mathbb{R}^d)$, we conclude that
$$\lim_{n\to\infty} Lf(x_n) = \lim_{n\to\infty} \lambda(\vartheta_{-x_n}\tilde f) = \lambda(\vartheta_{-x}\tilde f) = Lf(x),$$
which shows that $Lf \in C(\mathbb{R}^d)$. (With a similar argument based on difference quotients we could even show that $Lf \in C^\infty(\mathbb{R}^d)$.)

In order to prove the continuity of $L$, it is enough to show that $L : C_c^\infty(K) \to C(\mathbb{R}^d)$ is continuous for every compact set $K \subset \mathbb{R}^d$ (this is because of the definition of the topology in $C_c^\infty(\mathbb{R}^d)$). We will use the closed graph theorem: assume that $f_n \to f$ in $C_c^\infty(\mathbb{R}^d)$ and $f_n * \lambda \to g$ in $C(\mathbb{R}^d)$; then we have to show that $g = f * \lambda$.

For every $x \in \mathbb{R}^d$ we have $\vartheta_{-x}\tilde f_n \to \vartheta_{-x}\tilde f$ in $C_c^\infty(\mathbb{R}^d)$, and so
$$g(x) = \lim_{n\to\infty} (f_n * \lambda)(x) = \lim_{n\to\infty} \lambda(\vartheta_{-x}\tilde f_n) = \lambda(\vartheta_{-x}\tilde f) = (f * \lambda)(x).$$

Assume now that $L$ is translation invariant and continuous. Define $\lambda(f) := (L\tilde f)(0)$. Since $L$ is linear and continuous, and $f \mapsto \tilde f$ and the evaluation at $x = 0$ are continuous operations, $\lambda$ is a continuous linear map on $C_c^\infty(\mathbb{R}^d)$. Because of the translation invariance of $L$ we get
$$(Lf)(x) = (\vartheta_x Lf)(0) = L(\vartheta_x f)(0) = \lambda\bigl(\widetilde{\vartheta_x f}\bigr) = \lambda(\vartheta_{-x}\tilde f) = (f * \lambda)(x).$$


If $\mu$ is a further distribution with $Lf(0) = f * \mu(0)$, we see
$$(\mu - \lambda)(\tilde f) = f * (\mu - \lambda)(0) = f * \mu(0) - f * \lambda(0) = 0 \quad \text{for all } f \in C_c^\infty(\mathbb{R}^d),$$
which proves $\mu = \lambda$.
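The forward direction of Theorem A.10 can be sanity-checked in a discrete setting, where $\mu$ is a finitely supported measure and the convolution is a finite sum ($f$ and $\mu$ below are illustrative choices):

```python
# Convolution with a finitely supported measure commutes with translations:
# (theta_x f) * mu = theta_x (f * mu), checked pointwise on a few points.
mu = [(-1.0, 0.25), (0.0, 0.5), (2.0, 0.25)]   # atoms (y, weight)

def f(z):
    return max(0.0, 1.0 - abs(z))   # an example test function

def convolve(g, x):
    # g * mu (x) = integral of g(x - y) mu(dy), here a finite sum
    return sum(w * g(x - y) for y, w in mu)

def translate(g, x):
    return lambda z: g(z + x)       # theta_x g

x0, pts = 0.8, (-1.5, -0.3, 0.0, 0.4, 2.2)
lhs = [convolve(translate(f, x0), p) for p in pts]   # (theta_x f) * mu
rhs = [convolve(f, p + x0) for p in pts]             # theta_x (f * mu)
```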

Bibliography

[1] Aczél, J.: Lectures on Functional Equations and Their Applications. Academic Press, NewYork (NY) 1966.

[2] Applebaum, D.: Lévy Processes and Stochastic Calculus. Cambridge University Press, Cam-bridge 2009 (2nd ed).

[3] Barndorff-Nielsen, O.E. et al.: Lévy Processes. Theory and Applications. Birkhäuser, Boston(MA) 2001.

[4] Bass, R.: Uniqueness in law for pure-jump Markov processes. Probab. Theor. Relat. Fields79 (1988) 271–287.

[5] Bauer, H.: Probability Theory. De Gruyter, Berlin 1996.

[6] Bawly, G.M.: Über einige Verallgemeinerungen der Grenzwertsätze der Wahrscheinlich-keitsrechnung. Mat. Sbornik (Réc. Math. Moscou, N.S.) 1 (1936) 917–930.

[7] Behme, A., Schnurr, A.: A criterion for invariant measures of Itô processes based on thesymbol. Bernoulli 21 (2015) 1697–1718.

[8] Blumenthal, R.M., Getoor, R.K.: Sample functions of stochastic processes with stationaryindependent increments. J. Math. Mech. 10 (1961) 493–516.

[9] Böttcher, B., Schilling, R.L., Wang, J.: Lévy-type Processes: Construction, Approxima-tion and Sample Path Properties. Lecture Notes in Mathematics 2099 (Lévy Matters III),Springer, Cham 2014.

[10] Breiman, L.: Probability. Addison–Wesley, Reading (MA) 1968 (several reprints by SIAM,Philadelphia).

[11] Bretagnolle, J.L.: Processus à accroissements independants. In: Bretagnolle, J.L. et al.:École d’Été de Probabilités: Processus Stochastiques. Springer, Lecture Notes in Mathe-matics 307, Berlin 1973, pp. 1–26.

[12] Çinlar, E.: Introduction to Stochastic Processes. Prentice–Hall, Englewood Cliffs (NJ) 1975(reprinted by Dover, Mineola).

104

Bibliography 105

[13] Courrège, P.: Générateur infinitésimal d’un semi-groupe de convolution sur Rn, et formulede Lévy–Khintchine. Bull. Sci. Math. 88 (1964) 3–30.

[14] Courrège, P.: Sur la forme intégro différentielle des opérateurs de C∞k dans C satisfaisant

au principe du maximum. In: Séminaire Brelot–Choquet–Deny. Théorie du potentiel 10(1965/66) exposé 2, pp. 1–38.

[15] de Finetti, B.: Sulle funzioni ad incremento aleatorio. Rend. Accad. Lincei Ser. VI 10 (1929)163–168.

[16] Dieudonné, J.: Foundations of Modern Analysis. Academic Press, New York (NY) 1969.

[17] Ethier, S.N., Kurtz, T.G.: Markov Processes. Characterization and Convergence. John Wiley& Sons, New York (NY) 1986.

[18] Gikhman, I.I., Skorokhod, A.V.: Introduction to the Theory of Random Processes. W.B.Saunders, Philadelphia (PA) 1969 (reprinted by Dover, Mineola 1996).

[19] Grimvall, A.: A theorem on convergence to a Lévy process. Math. Scand. 30 (1972) 339–349.

[20] Herz, C.S.: Théorie élémentaire des distributions de Beurling. Publ. Math. Orsay no. 5, 2èmeannée 1962/63, Paris 1964.

[21] Hoh, W.: Pseudo differential operators with negative definite symbols of variable order. Rev.Mat. Iberoamericana 16 (2000) 219–241.

[22] Ikeda, N., Watanabe S.: Stochastic Differential Equations and Diffusion Processes. North-Holland Publishing Co./Kodansha, Amsterdam and Tokyo 1989 (2nd ed).

[23] Itô, K.: On stochastic processes. I. (Infinitely divisible laws of probability). Japanese J. Math.XVIII (1942) 261–302.

[24] Itô, K.: Lectures on Stochastic Processes. Tata Institute of Fundamental Research, Bombay1961 (reprinted by Springer, Berlin 1984).http://www.math.tifr.res.in/~publ/ln/tifr24.pdf

[25] Itô, K.: Semigroups in probability theory. In: Functional Analysis and Related Topics, Pro-ceedings in Memory of K. Yosida (Kyoto 1991). Springer, Lecture Notes in Mathemtics1540, Berlin 1993, 69–83.

[26] Jacob, N.: Pseudo Differential Operators and Markov Processes (3 volumes). Imperial Col-lege Press, London 2001–2005.

[27] Jacod, J., Shiryaev, A.N.: Limit Theorems for Stochastic Processes. Springer, Berlin 1987(2nd ed Springer, Berlin 2003).

106 R. L. Schilling: An Introduction to Lévy and Feller Processes

[28] Khintchine, A.Ya.: A new derivation of a formula by P. Lévy. Bull. Moscow State Univ. 1(1937) 1–5 (Russian; an English translation is contained in Appendix 2.1, pp. 44–49 of [45]).

[29] Khintchine, A.Ya.: Zur Theorie der unbeschränkt teilbaren Verteilungsgesetze. Mat. Sbornik(Réc. Math. Moscou, N.S.) 2 (1937) 79–119 (German; an English translation is contained inAppendix 3.3, pp. 79–125 of [45]).

[30] Knopova, V., Schilling, R.L.: A note on the existence of transition probability densities forLévy processes. Forum Math. 25 (2013) 125–149.

[31] Kolmogorov, A.N.: Sulla forma generale di un processo stocastico omogeneo (Un problemadie Bruno de Finetti). Rend. Accad. Lincei Ser. VI 15 (1932) 805–808 and 866–869 (an En-glish translation is contained in: Shiryayev, A.N. (ed.): Selected Works of A.N. Kolmogorov.Kluwer, Dordrecht 1992, vol. 2, pp.121–127).

[32] Kühn, F.: Existence and estimates of moments for Lévy-type processes. Stoch. Proc. Appl.(in press). DOI: 10.1016/j.spa.2016.07.008

[33] Kühn, F.: Probability and Heat Kernel Estimates for Lévy(-Type) Processes. PhD Thesis,Technische Universität Dresden 2016.

[34] Kunita, H.: Stochastic differential equations based on Lévy processes and stochastic flowsof diffeomorphisms. In: Rao, M.M. (ed.): Real and Stochastic Analysis. Birkhäuser, Boston(MA) 2004, pp. 305–373.

[35] Kunita, H., Watanabe, S.: On square integrable martingales. Nagoya Math. J. 30 (1967)209–245.

[36] Kyprianou, A.: Introductory Lectures on Fluctuations and Lévy Processes with Applications.Springer, Berlin 2006.

[37] Lévy, P.: Sur les intégrales dont les élements sont des variables aléatoires indépendantes.Ann. Sc. Norm. Sup. Pisa 3 (1934) 337–366.

[38] Lévy, P.: Théorie de l’addition des variables aléatoires. Gauthier–Villars, Paris 1937.

[39] Mainardi, F., Rogosin, S.V.: The origin of infinitely divisible distributions: from de Finetti's problem to Lévy–Khintchine formula. Math. Methods Economics Finance 1 (2006) 37–55.

[40] Malliavin, P.: Integration and Probability. Springer, New York (NY) 1995.

[41] Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations.Springer, New York (NY) 1983.

[42] Prokhorov, Yu.V.: Convergence of random processes and limit theorems in probability theory. Theor. Probab. Appl. 1 (1956) 157–214.


[43] Protter, P.E.: Stochastic Integration and Differential Equations. Springer, Berlin 2004 (2nd ed).

[44] Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion. Springer, Berlin 2005 (3rd printing of the 3rd ed).

[45] Rogosin, S.V., Mainardi, F.: The Legacy of A.Ya. Khintchine's Work in Probability Theory. Cambridge Scientific Publishers, Cambridge 2010.

[46] Rosinski, J.: On series representations of infinitely divisible random vectors. Ann. Probab. 18 (1990) 405–430.

[47] Rosinski, J.: Series representations of Lévy processes from the perspective of point processes. In: Barndorff-Nielsen et al. [3] (2001) 401–415.

[48] Samorodnitsky, G., Taqqu, M.S.: Stable Non-Gaussian Random Processes. Chapman & Hall, New York (NY) 1994.

[49] Sandric, N.: On recurrence and transience of two-dimensional Lévy and Lévy-type processes. Stoch. Proc. Appl. 126 (2016) 414–438.

[50] Sandric, N.: Long-time behavior for a class of Feller processes. Trans. Am. Math. Soc. 368 (2016) 1871–1910.

[51] Sato, K.: Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press, Cambridge 1999 (2nd ed 2013).

[52] Schilling, R.L.: Conservativeness and extensions of Feller semigroups. Positivity 2 (1998) 239–256.

[53] Schilling, R.L.: Growth and Hölder conditions for the sample paths of a Feller process. Probab. Theor. Relat. Fields 112 (1998) 565–611.

[54] Schilling, R.L.: Measures, Integrals and Martingales. Cambridge University Press, Cambridge 2011 (3rd printing).

[55] Schilling, R.L.: Maß und Integral. De Gruyter, Berlin 2015.

[56] Schilling, R.L., Partzsch, L.: Brownian Motion. An Introduction to Stochastic Processes. De Gruyter, Berlin 2014 (2nd ed).

[57] Schilling, R.L., Schnurr, A.: The symbol associated with the solution of a stochastic differential equation. El. J. Probab. 15 (2010) 1369–1393.

[58] Schnurr, A.: The Symbol of a Markov Semimartingale. PhD Thesis, Technische Universität Dresden 2009. Shaker-Verlag, Aachen 2009.


[59] Schnurr, A.: On the semimartingale nature of Feller processes with killing. Stoch. Proc. Appl. 122 (2012) 2758–2780.

[60] Stroock, D.W., Varadhan, S.R.S.: Multidimensional Stochastic Processes. Springer, Berlin 1997 (2nd corrected printing).

[61] von Waldenfels, W.: Eine Klasse stationärer Markowprozesse. Kernforschungsanlage Jülich, Institut für Plasmaphysik, Jülich 1961.

[62] von Waldenfels, W.: Fast positive Operatoren. Z. Wahrscheinlichkeitstheorie verw. Geb. 4 (1965) 159–174.

Index

absorbing point, 85
adapted, 24, 101
almost positive (PP), 41

Blumenthal–Getoor–Pruitt index, 95
bounded coefficients, 79
Brownian motion, 17
  characterization among LP, 51
  construction, 52–54
  generator, 38
  martingale characterization, 50
  white noise, 65

Campbell's formula, 17
canonical filtration, 12
Cauchy–Abel functional equation, 13, 97
Chapman–Kolmogorov equations, 24
characteristic exponent, 15
  is polynomially bounded, 37
  of a stable process, 22
  is subadditive, 37
characteristic operator, 86
closed operator, 32
compound Poisson process, 17, 44
  building block of LP, 47, 60
  distribution, 19
  is a Lévy process, 18
continuous in probability, 7, 12
control measure, 64
convergence of types theorem, 99
convolution semigroup, 25, 26
correlation function, 65
Courrège–von Waldenfels theorem, 77

de Finetti's theorem, 8
dissipative operator, 33
Dynkin's formula, 85
Dynkin's operator, 86
Dynkin–Reuter lemma, 35

exponential martingale, 58

Feller process, 28, 75
  asymptotics as t → 0, 96
  càdlàg modification, 75
  conservativeness, 82
  generator, 31, 35, 77
  Hausdorff dimension, 96
  maximal estimate, 94, 95
  moments, 95
  semimartingale characteristics, 88
  strong Markov property, 75
Feller property, 26, 30
Feller semigroup, 28, 30
  extension to Bb(Rd), 30
Feller semigroup, see also Feller process
Fourier transform, 36

Gamma process, 28
generator, 31
  of Brownian motion, 38
  domain, 35, 37
  Dynkin's operator, 86
  of a Feller process, 77
  Hille–Yosida–Ray theorem, 33
  of a Lévy process, 39
  no proper extension, 35
  positive maximum principle, 34

Hille–Yosida–Ray theorem, 33

independent increments (L2), 7
index of self-similarity, 19
index of stability, 21
infinitely divisible, 8
infinitesimal generator, see generator
intensity measure, 57
Itô isometry, 69, 71
Itô's formula, 74

jump measure, 55

(L0), (L1), (L2), (L3), 7
(L2′), 12
Lévy measure, 42
Lévy process, 12
  càdlàg modification, 13, 75
  characteristic exponent, 15
  characterization, 16
  finite-dimensional distributions, 14
  generator, 39
  independent components, 58
  infinite divisibility, 7, 12, 15
  infinite life-time, 42
  intensity measure, 57
  jump measure, 55
  limit of LPs, 45
  Markov property, 25
  moments, 22, 50
  no fixed discontinuities, 13
  stable process, 21
  starting from x, 25
  strong Markov property, 29
  sum of two LP, 45
  symbol, 39
  translation invariance, 25, 39
Lévy triplet, 42
Lévy's continuity theorem, 99
Lévy's theorem, 50
Lévy–Itô decomposition, 8, 47, 61, 73
Lévy–Khintchine formula, 8, 23, 39, 63
Lévy-type process, 93

Markov process, 24
  strong Markov property, 29, 75
  universal Markov process, 24
martingale noise, 66

orthogonal increments, 65

PMP (positive maximum principle), 34
Poisson process, 17
  characterization among LP, 49
  is a Lévy process, 18
  jump measure of LP, 56, 60
Poisson process, see also compound Poisson process
Poisson random measure, 66
  compensator, 74
positive maximum principle (PMP), 34
PP (almost positive), 41
predictable, 101
pseudo differential operator, 38, 39, 77

random orthogonal measure, 64
  stochastic integral, 68, 69
resolvent, 32

second-order process, 65
self-similar process, 21
semigroup, 26
  conservative, 26
  contractive, 26, 34
  Feller property, 26, 30
  strong Feller property, 26
  strongly continuous, 26, 34
  sub-Markovian, 26
semimartingale, 74, 87

space-time noise, 67, 69
stable process, 21
stable random variable, 20
stable-like process, 93
stationary increments (L1), 7
stochastic integral, 68, 69
stochastic interval, 70, 101
strong continuity, 26, 34
strong Feller property, 26
  of a Lévy process, 27
strong Markov property
  of a Feller process, 75
  of a Lévy process, 29
symbol, 38
  bounded coefficients, 79
  continuous in x, 80, 82
  of a Feller process, 77
  of a Lévy process, 39
  of an SDE, 90
  probabilistic formula, 84, 86, 87
  sector condition, 94
  upper semicontinuous in x, 82

transition function, 24
transition semigroup, see semigroup
translation invariant, 25, 39, 102

vague convergence, 98

white noise, 65, 67
  is not σ-additive, 65