[www.gfxmad.me] 9814335479 probabilit

Upload: sound05

Post on 14-Apr-2018



TOPICS IN PROBABILITY

Narahari Prabhu
Cornell University, USA

World Scientific

New Jersey • London • Singapore • Beijing • Shanghai • Hong Kong • Taipei • Chennai


Published by

World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

TOPICS IN PROBABILITY

Copyright © 2011 by World Scientific Publishing Co. Pte. Ltd.

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN-13 978-981-4335-47-8
ISBN-10 981-4335-47-9

Typeset by Stallion Press
Email: [email protected]

Printed in Singapore.

Now I've understood
Time's magic play:
Beating his drum he rolls out the show,
Shows different images
And then gathers them in again

Kabir (1450–1518)


CONTENTS

Preface

Abbreviations

1. Probability Distributions
   1.1. Elementary Properties
   1.2. Convolutions
   1.3. Moments
   1.4. Convergence Properties

2. Characteristic Functions
   2.1. Regularity Properties
   2.2. Uniqueness and Inversion
   2.3. Convergence Properties
        2.3.1. Convergence of types
   2.4. A Criterion for c.f.s
   2.5. Problems for Solution

3. Analytic Characteristic Functions
   3.1. Definition and Properties
   3.2. Moments
   3.3. The Moment Problem
   3.4. Problems for Solution

4. Infinitely Divisible Distributions
   4.1. Elementary Properties
   4.2. Feller Measures
   4.3. Characterization of Infinitely Divisible Distributions
   4.4. Special Cases of Infinitely Divisible Distributions
   4.5. Lévy Processes
   4.6. Stable Distributions
   4.7. Problems for Solution

5. Self-Decomposable Distributions; Triangular Arrays
   5.1. Self-Decomposable Distributions
   5.2. Triangular Arrays
   5.3. Problems for Solution

Bibliography

Index


PREFACE

In this monograph we treat some topics that have been of importance and interest in probability theory. These include, in particular, analytic characteristic functions, the moment problem, and infinitely divisible and self-decomposable distributions.

We begin with a review of the measure-theoretical foundations of probability distributions (Chapter 1) and characteristic functions (Chapter 2).

In many important special cases the domain of characteristic functions can be extended to a strip surrounding the imaginary axis of the complex plane, leading to analytic characteristic functions. It turns out that distributions that have analytic characteristic functions are uniquely determined by their moments. This is the essence of the moment problem. The pioneering work in this area is due to C. C. Heyde. This is treated in Chapter 3.

Infinitely divisible distributions are investigated in Chapter 4. The final Chapter 5 is concerned with self-decomposable distributions and triangular arrays. The coverage of these topics as given by Feller in his 1971 book is comparatively modern (as opposed to classical) but is still somewhat diffuse. We give a more compact treatment.

N. U. Prabhu
Ithaca, New York
January 2010


ABBREVIATIONS

Term                                    Abbreviation
characteristic function                 c.f.
distribution function                   d.f.
if and only if                          iff
Laplace transform                       L.T.
probability generating function         p.g.f.
random variable                         r.v.

Terminology: We write x =d y (equality in distribution) if the r.v.s x, y have the same distribution.


    Chapter 1

    Probability Distributions

    1.1. Elementary Properties

    A function F on the real line is called a probability distribution

    function if it satisfies the following conditions:

(i) F is non-decreasing: F(x + h) ≥ F(x) for h > 0;
(ii) F is right-continuous: F(x+) = F(x);
(iii) F(−∞) = 0, F(∞) ≤ 1.

We shall say that F is proper if F(∞) = 1, and F is defective otherwise.

Every probability distribution induces an assignment of probabilities to all Borel sets on the real line, thus yielding a probability measure P. In particular, for an interval I = (a, b] we have P{I} = F(b) − F(a). We shall use the same letter F both for the point function and the corresponding set function, and write F{I} instead of P{I}. In particular

F(x) = F{(−∞, x]}.

We shall refer to F as a probability distribution, or simply a distribution.

A point x is an atom if it carries positive probability (weight). It is a point of increase iff F{I} > 0 for every open interval I containing x.


A distribution F is concentrated on the set A if F(Aᶜ) = 0, where Aᶜ is the complement of A. It is atomic if it is concentrated on the set of its atoms. A distribution without atoms is continuous.

As a special case of the atomic distribution we have the arithmetic distribution, which is concentrated on the set {kλ (k = 0, ±1, ±2, . . .)} for some λ > 0. The largest λ with this property is called the span of F.

A distribution is singular if it is concentrated on a set of Lebesgue measure zero. Theorem 1.1 (below) shows that an atomic distribution is singular, but there exist singular distributions which are continuous.

A distribution F is absolutely continuous if there exists a function f such that

F(A) = ∫_A f(x) dx.

If there exists a second function g with the above property, then it is clear that f = g almost everywhere, that is, except possibly on a set of Lebesgue measure zero. We have F′(x) = f(x) almost everywhere; f is called the density of F.

Theorem 1.1. A probability distribution has at most countably many atoms.

Proof. Suppose F has n atoms x1, x2, . . . , xn in I = (a, b] with a < x1 < x2 < · · · < xn ≤ b and weights p(xk) = F{xk}. Then

∑_{k=1}^{n} p(xk) ≤ F{I} ≤ 1.

This shows that the number of atoms with weights > 1/n is at most equal to n. Let

Dn = {x : p(x) > 1/n};

then the set Dn has at most n points. Therefore the set D = ∪n Dn is at most countable. ∎


Theorem 1.2 (Jordan decomposition). A probability distribution F can be represented in the form

F = pFa + qFc (1.1)

where p ≥ 0, q ≥ 0, p + q = 1, and Fa, Fc are both distributions, Fa being atomic and Fc continuous.

Proof. Let {xn, n ≥ 1} be the atoms and p = ∑ p(xn), q = 1 − p. If p = 0 or if p = 1, the theorem is trivially true. Let us assume that 0 < p < 1 and for −∞ < x < ∞ define the two functions

Fa(x) = (1/p) ∑_{xn ≤ x} p(xn),   Fc(x) = (1/q)[F(x) − pFa(x)]. (1.2)

Here Fa is a distribution because it satisfies the conditions (i)–(iii) above. For Fc we find that for h > 0

q[Fc(x + h) − Fc(x)] = F(x + h) − F(x) − ∑_{x < xn ≤ x+h} p(xn) ≥ 0,

so that Fc is non-decreasing; letting h → 0 shows that Fc carries no atoms, and since Fc also satisfies (i)–(iii) it is a continuous distribution. ∎


Proof. By the Lebesgue decomposition theorem on measures we can express F as

F = aFs + bFac, (1.5)

where a ≥ 0, b ≥ 0, a + b = 1, Fs is a singular distribution and Fac is an absolutely continuous distribution. Applying Theorem 1.2 to Fs we find that Fs = p1Fa + q1Fsc, where p1 ≥ 0, q1 ≥ 0, p1 + q1 = 1. Writing p = ap1, q = aq1, r = b we arrive at the desired result (1.4). ∎

Remark. Although it is possible to study distribution functions and measures without reference to random variables (r.v.s) as we have done above, it is convenient to start with the definition

F(x) = P{X ≤ x},

where X is a random variable defined on an appropriate sample space.

1.2. Convolutions

Let F1, F2 be distributions and F be defined by

F(x) = ∫ F1(x − y) dF2(y) (1.6)

where the integral obviously exists. We call F the convolution of F1 and F2 and write F = F1 ⋆ F2. Clearly F1 ⋆ F2 = F2 ⋆ F1.

Theorem 1.4. The function F is a distribution.

Proof. For h > 0 we have

F(x + h) − F(x) = ∫ [F1(x − y + h) − F1(x − y)] dF2(y) ≥ 0 (1.7)

so that F is non-decreasing. As h → 0,

F1(x − y + h) − F1(x − y) → F1(x − y+) − F1(x − y) = 0;


since

|F1(x − y + h) − F1(x − y)| ≤ 2,   ∫ 2 dF2(y) = 2,

the right side of (1.7) tends to 0 by the dominated convergence theorem. Therefore F(x+) − F(x) = 0, so that F is right-continuous. Since F1(∞) = 1 the dominated convergence theorem gives F(∞) = 1. Similarly F(−∞) = 0. Therefore F is a distribution. ∎

Theorem 1.5. If F1 is continuous, so is F. If F1 is absolutely continuous, so is F.

Proof. We have seen in Theorem 1.4 that the right-continuity of F1 implies the right-continuity of F. Similarly the left-continuity of F1 implies that of F. It follows that if F1 is continuous, so is F.

Next let F1 be absolutely continuous, so there exists a function f1 such that

F1(x) = ∫_{−∞}^{x} f1(u) du.

Then

F(x) = ∫ dF2(y) ∫_{−∞}^{x} f1(u − y) du = ∫_{−∞}^{x} [∫ f1(u − y) dF2(y)] du

so that F is absolutely continuous, with density

f(x) = ∫ f1(x − y) dF2(y). (1.8)  ∎

Remarks.

1. If X1, X2 are independent random variables with distributions F1, F2, then the convolution F = F1 ⋆ F2 is the distribution of their sum X1 + X2. For

F(z) = P{X1 + X2 ≤ z} = ∬_{x+y≤z} dF1(x) dF2(y) = ∫ dF2(y) ∫_{−∞}^{z−y} dF1(x) = ∫ F1(z − y) dF2(y).

However, it should be noted that dependent random variables X1, X2 may also have the property that the distribution of their sum is given by the convolution of their distributions.

2. The converse of Theorem 1.5 is false. In fact two singular distributions may have a convolution which is absolutely continuous.

3. The conjugate of any distribution F is defined as the distribution F̄, where

F̄(x) = 1 − F(−x).

If F is the distribution of the random variable X, then F̄ is the distribution of −X. The distribution F is symmetric if F̄ = F.

4. Given any distribution F, we can symmetrize it by defining the distribution °F, where

°F = F ⋆ F̄.

It is seen that °F is a symmetric distribution. It is the distribution of the difference X1 − X2, where X1, X2 are independent variables with the same distribution F.
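Remark 1 can be checked numerically for discrete distributions. The sketch below (Python, standard library only; the two-dice example is an illustrative assumption, not from the text) compares the discrete analogue of the convolution (1.6) with the distribution of the sum obtained by direct enumeration:

```python
from itertools import product

# two independent fair dice X1, X2 on {1, ..., 6}
p1 = {k: 1.0 / 6.0 for k in range(1, 7)}
p2 = {k: 1.0 / 6.0 for k in range(1, 7)}

# discrete analogue of (1.6): p(z) = sum_y p1(z - y) p2(y)
conv = {}
for x, y in product(p1, p2):
    conv[x + y] = conv.get(x + y, 0.0) + p1[x] * p2[y]

# distribution of X1 + X2 by direct enumeration of the sample space
direct = {}
for x, y in product(range(1, 7), range(1, 7)):
    direct[x + y] = direct.get(x + y, 0.0) + 1.0 / 36.0

# the two assignments of probability agree on every point
assert all(abs(conv[z] - direct[z]) < 1e-12 for z in conv)
```

The agreement here relies on independence; as Remark 1 notes, for dependent variables the distribution of the sum need not be the convolution.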

1.3. Moments

The moment of order α > 0 of a distribution F is defined by

μ_α = ∫ x^α dF(x)

provided that the integral converges absolutely, that is,

ν_α = ∫ |x|^α dF(x) < ∞;

ν_α is called the absolute moment of order α. Let 0 < α′ < α. Then for |x| ≤ 1 we have |x|^{α′} ≤ 1, while for |x| > 1 we have |x|^{α′} ≤ |x|^α.


Thus we can write |x|^{α′} ≤ |x|^α + 1 for all x and so

∫ |x|^{α′} dF(x) ≤ ∫ (1 + |x|^α) dF(x) = 1 + ∫ |x|^α dF(x).

This shows that the existence of the moment of order α implies the existence of all moments of order α′ < α.

Theorem 1.6. The moment μ_α of a distribution F exists iff

x^{α−1}[1 − F(x) + F(−x)] (1.9)

is integrable over (0, ∞).

Proof. For t > 0 an integration by parts yields the relation

∫_{−t}^{t} |x|^α dF(x) = −t^α[1 − F(t) + F(−t)] + α ∫_{0}^{t} x^{α−1}[1 − F(x) + F(−x)] dx. (1.10)

From this we find that

∫_{−t}^{t} |x|^α dF(x) ≤ α ∫_{0}^{t} x^{α−1}[1 − F(x) + F(−x)] dx

so that if (1.9) is integrable over (0, ∞), ν_α (and therefore μ_α) exists. Conversely, if ν_α exists, then since

∫_{|x|>t} |x|^α dF(x) ≥ t^α[1 − F(t) + F(−t)],

the first term on the right side of (1.10) vanishes as t → ∞ and the integral there converges as t → ∞. ∎
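The tail criterion of Theorem 1.6 can be illustrated numerically. The following sketch (Python, standard library only; the choice of the Exp(1) distribution and the quadrature parameters are illustrative assumptions, not from the text) evaluates α ∫₀^∞ x^{α−1}[1 − F(x) + F(−x)] dx for α = 2, which should recover E[X²] = 2:

```python
import math

# Exponential(1): F(x) = 1 - e^{-x} for x >= 0 and F(x) = 0 for x < 0,
# so 1 - F(x) + F(-x) = e^{-x} for x > 0.
def tail(x):
    return math.exp(-x)

alpha = 2.0   # order of the moment; E[X^2] = 2 for Exp(1)

# mu_alpha = alpha * int_0^inf x^{alpha-1} [1 - F(x) + F(-x)] dx, cf. (1.10),
# approximated by the midpoint rule on [0, 50]
n, upper = 50000, 50.0
h = upper / n
integral = sum(((i + 0.5) * h) ** (alpha - 1.0) * tail((i + 0.5) * h)
               for i in range(n)) * h
moment = alpha * integral   # approximately 2
```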

Theorem 1.7. Let

β(t) = ∫ |x|^t dF(x) < ∞

for t in some interval I. Then log β(t) is a convex function of t ∈ I.


Proof. Let a ≥ 0, b ≥ 0, a + b = 1. Then for two functions ψ1, ψ2 we have the Hölder inequality

∫ |ψ1(x)ψ2(x)| dF(x) ≤ [∫ |ψ1(x)|^{1/a} dF(x)]^a [∫ |ψ2(x)|^{1/b} dF(x)]^b

provided that the integrals exist. In this put ψ1(x) = |x|^{at1}, ψ2(x) = |x|^{bt2}, where t1, t2 ∈ I. Then

β(at1 + bt2) ≤ β(t1)^a β(t2)^b (1.11)

or, taking logarithms,

log β(at1 + bt2) ≤ a log β(t1) + b log β(t2),

which establishes the convexity property of log β. ∎

Corollary 1.1 (Lyapunov's inequality). Under the hypothesis of Theorem 1.7, β_t^{1/t} is non-decreasing for t ∈ I.

Proof. Let α, β ∈ I with α < β and choose a = α/β, t1 = β, b = 1 − a, t2 = 0. Then (1.11) reduces to

β_α ≤ (β_β)^{α/β},

where we have written β_t = β(t). Taking the (1/α)-th power of both sides gives β_α^{1/α} ≤ β_β^{1/β}. ∎
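Lyapunov's inequality can be observed on an empirical distribution, which is itself a distribution and therefore satisfies the inequality exactly. A minimal sketch (Python, standard library only; the Uniform(−1, 1) sample is an illustrative assumption, not from the text):

```python
import random
random.seed(1)

# sample from Uniform(-1, 1); for this distribution beta(t) = E|X|^t = 1/(t+1)
xs = [random.uniform(-1.0, 1.0) for _ in range(20000)]

def beta(t):
    # empirical absolute moment of order t
    return sum(abs(x) ** t for x in xs) / len(xs)

ts = [0.5, 1.0, 2.0, 3.0, 4.0]
norms = [beta(t) ** (1.0 / t) for t in ts]

# Lyapunov: beta_t^{1/t} is non-decreasing in t; this holds exactly for the
# empirical measure, so the assertion is deterministic, not statistical
assert all(norms[i] <= norms[i + 1] + 1e-12 for i in range(len(norms) - 1))
```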

1.4. Convergence Properties

We say that I is an interval of continuity of a distribution F if I is open and its end points are not atoms of F. The whole line (−∞, ∞) is considered to be an interval of continuity.

Let {Fn, n ≥ 1} be a sequence of proper distributions. We say that the sequence converges to F if

Fn{I} → F{I} (1.12)


for every bounded interval of continuity of F. If (1.12) holds for every (bounded or unbounded) interval of continuity of F, then the convergence is said to be proper, and otherwise improper. Proper convergence implies in particular that F(∞) − F(−∞) = 1.

Examples

1. Let Fn be uniform in (−n, n). Then for every bounded interval I contained in (−n, n) we have

Fn{I} = ∫_I dx/2n = |I|/2n → 0 as n → ∞,

where |I| is the length of I. This shows that the convergence is improper.

2. Let Fn be concentrated on {1/n, n} with weight 1/2 at each atom. Then for every bounded interval I we have

Fn{I} → 0 or 1/2

according as I does not or does contain the origin. Therefore the limit F is such that it has an atom at the origin, with weight 1/2. Clearly F is not a proper distribution.

3. Let Fn be the convolution of a proper distribution F with the normal distribution with mean zero and variance n^{−2}. Thus

Fn(x) = ∫ F(x − y) (n/√(2π)) e^{−(1/2)n²y²} dy = ∫ F(x − y/n) (1/√(2π)) e^{−(1/2)y²} dy.

For finite a, b we have

∫_a^b dFn(x) = ∫ [F(b − y/n) − F(a − y/n)] (1/√(2π)) e^{−(1/2)y²} dy → F(b) − F(a) as n → ∞

by the dominated convergence theorem. If a, b are points of continuity of F we can write

Fn{(a, b)} → F{(a, b)} (1.13)

so that the sequence {Fn} converges properly to F.
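Example 3 can be sketched numerically. Below (Python, standard library only; the choice F = Exp(1), the evaluation point x = 1, and the quadrature parameters are illustrative assumptions, not from the text) we compute Fn(1) = ∫ F(1 − y/n) φ(y) dy by the midpoint rule and watch it approach F(1):

```python
import math

def F(x):
    # a proper distribution: Exp(1)
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def norm_pdf(y):
    # standard normal density
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def Fn(x, n, grid=4000, lim=8.0):
    # F_n(x) = int F(x - y/n) norm_pdf(y) dy, midpoint rule on [-lim, lim]
    h = 2.0 * lim / grid
    return sum(F(x - (-lim + (i + 0.5) * h) / n) * norm_pdf(-lim + (i + 0.5) * h)
               for i in range(grid)) * h

vals = [Fn(1.0, n) for n in (1, 5, 50)]
target = F(1.0)                      # = 1 - e^{-1}, a continuity point of F
errors = [abs(v - target) for v in vals]
assert errors[0] > errors[1] > errors[2]   # smoothing error shrinks with n
```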


If X is a random variable with the distribution F and Yn is an independent variable with the above normal distribution, then we know that Fn is the distribution of the sum X + Yn. As n → ∞, it is obvious that the distribution of this sum converges to that of X. This justifies the definition of convergence which requires (1.13) to hold only for points of continuity a, b.

Theorem 1.8 (Selection theorem). Every sequence {Fn} of distributions contains a subsequence {Fnk, k ≥ 1} which converges (properly or improperly) to a limit F.

Theorem 1.9. A sequence {Fn} of proper distributions converges to F iff

∫ u(x) dFn(x) → ∫ u(x) dF(x) (1.14)

for every function u which is bounded, continuous and vanishing at ±∞. If the convergence is proper, then (1.14) holds for every bounded continuous function u.

The proofs of these two theorems are omitted.


Chapter 2

Characteristic Functions

2.1. Regularity Properties

Let F be a probability distribution. Then its characteristic function (c.f.) is defined by

φ(ω) = ∫ e^{iωx} dF(x) (2.1)

where i = √(−1) and ω is real. This integral exists, since

∫ |e^{iωx}| dF(x) = ∫ dF(x) = 1. (2.2)

Theorem 2.1. A c.f. has the following properties:

(a) φ(0) = 1 and |φ(ω)| ≤ 1 for all ω.
(b) φ(−ω) = φ̄(ω), and φ̄ is also a c.f.
(c) Re φ is also a c.f.

Proof. (a) We have

φ(0) = ∫ dF(x) = 1,   |φ(ω)| ≤ ∫ |e^{iωx}| dF(x) = 1.


(b) φ(−ω) = ∫ e^{−iωx} F{dx} = φ̄(ω). Moreover, let F̄(x) = 1 − F(−x). Then

∫ e^{iωx} F̄{dx} = ∫ e^{−iωx} F{dx} = φ̄(ω).

Thus φ̄(ω) is the c.f. of F̄, which is a distribution.

(c) Re φ = (1/2)φ + (1/2)φ̄ = c.f. of (1/2)F + (1/2)F̄, which is a distribution. ∎

Theorem 2.2. If φ1, φ2 are c.f.s, so is their product φ1φ2.

Proof. Let φ1, φ2 be the c.f.s of F1, F2 respectively and consider the convolution

F(x) = ∫ F1(x − y) dF2(y).

We know that F is a distribution. Its c.f. is given by

φ(ω) = ∫ e^{iωx} dF(x) = ∬ e^{iωx} dF1(x − y) dF2(y)
     = ∫ e^{iωy} dF2(y) ∫ e^{iω(x−y)} dF1(x − y) = φ1(ω)φ2(ω).

Thus the product φ1φ2 is the c.f. of the convolution F1 ⋆ F2. ∎

Corollary 2.1. If φ is a c.f., so is |φ|².

Proof. We can write |φ|² = φφ̄, where φ̄ is a c.f. by Theorem 2.1(b). ∎

Theorem 2.3. A distribution F is arithmetic iff there exists a real ω0 ≠ 0 such that φ(ω0) = 1.

Proof. (i) Suppose that the distribution is concentrated on {kλ, λ > 0, k = 0, ±1, ±2, . . .} with the weight pk at kλ. Then the c.f. is given by

φ(ω) = ∑ pk e^{ikλω}.

Clearly φ(2π/λ) = 1.


(ii) Conversely, let φ(ω0) = 1 for some ω0 ≠ 0. This gives

∫ (1 − e^{iω0x}) dF(x) = 0.

Therefore

∫ (1 − cos ω0x) dF(x) = 0,

which shows that the points of increase of F are among 2πk/ω0 (k = 0, ±1, ±2, . . .). Thus the distribution is arithmetic. ∎

Corollary 2.2. If φ(ω) = 1 for all ω, then the distribution is concentrated at the origin.

Remarks.

1. If F is the distribution of a random variable X, then we can write

φ(ω) = E(e^{iωX})

so that the c.f. is the expected value of e^{iωX}. We have φ̄(ω) = E(e^{−iωX}), so that φ̄(ω) is the c.f. of the random variable −X. This is Theorem 2.1(b).

2. If X1, X2 are two independent random variables with c.f.s φ1, φ2, then

φ1(ω)φ2(ω) = E[e^{iω(X1+X2)}]

so that the product φ1φ2 is the c.f. of the sum X1 + X2. This is only a special case of Theorem 2.2, since the convolution F1 ⋆ F2 is defined for any two distributions, with no reference to independent random variables.

3. If φ is the c.f. of the random variable X, then |φ|² is the c.f. of the symmetrized variable X1 − X2, where X1, X2 are independent variables with the same distribution as X.
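Remark 3 is an exact identity that is easy to verify for a discrete distribution. A minimal sketch (Python, standard library only; the particular three-point pmf is an illustrative assumption, not from the text):

```python
import cmath

# an arbitrary, illustrative discrete distribution
pmf = {0: 0.2, 1: 0.5, 3: 0.3}

def cf(p, w):
    # characteristic function of a discrete pmf at the point w
    return sum(prob * cmath.exp(1j * w * x) for x, prob in p.items())

# pmf of the symmetrized variable X1 - X2, with X1, X2 i.i.d. ~ pmf
diff = {}
for x1, q1 in pmf.items():
    for x2, q2 in pmf.items():
        diff[x1 - x2] = diff.get(x1 - x2, 0.0) + q1 * q2

for w in (0.3, 1.1, 2.7):
    # |phi|^2 equals the c.f. of X1 - X2 ...
    assert abs(abs(cf(pmf, w)) ** 2 - cf(diff, w).real) < 1e-12
    # ... which is real, since the symmetrized distribution is symmetric
    assert abs(cf(diff, w).imag) < 1e-12
```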

Theorem 2.4. (a) φ is uniformly continuous.

(b) If the n-th moment μn exists, then the n-th derivative φ^{(n)} exists and is a continuous function given by

φ^{(n)}(ω) = ∫ e^{iωx} (ix)^n dF(x). (2.3)


(c) If the n-th moment μn exists, then φ admits the expansion

φ(ω) = 1 + ∑_{k=1}^{n} μk (iω)^k / k! + o(ω^n)   (ω → 0). (2.4)

Proof. (a) We have

φ(ω + h) − φ(ω) = ∫ e^{iωx} (e^{ihx} − 1) dF(x) (2.5)

so that

|φ(ω + h) − φ(ω)| ≤ ∫ |e^{ihx} − 1| dF(x) = 2 ∫ |sin(hx/2)| dF(x).

Now

∫_{x ≤ −A} |sin(hx/2)| dF(x) + ∫_{x ≥ B} |sin(hx/2)| dF(x) ≤ ∫_{x ≤ −A} dF(x) + ∫_{x ≥ B} dF(x) < ε

by taking A, B large, while

∫_{−A}^{B} |sin(hx/2)| dF(x) ≤ ε ∫_{−A}^{B} dF(x) < ε

since |sin(hx/2)| < ε for h small. Therefore |φ(ω + h) − φ(ω)| → 0 as h → 0, which proves uniform continuity.

(b) We shall prove (2.3) for n = 1, the proof being similar for n > 1. We can write (2.5) as

[φ(ω + h) − φ(ω)]/h = ∫ e^{iωx} (e^{ihx} − 1)/h dF(x). (2.5′)

Here

|e^{iωx} (e^{ihx} − 1)/h| = |(e^{ihx} − 1)/h| ≤ |x|

and ∫ |x| dF(x) < ∞ by hypothesis. Moreover (e^{ihx} − 1)/h → ix as h → 0. Therefore letting h → 0 in (2.5′) we obtain by the dominated convergence


theorem that

[φ(ω + h) − φ(ω)]/h → ∫ ix e^{iωx} dF(x)

as required. Clearly, this limit is continuous.

(c) We have

e^{iωx} = ∑_{k=0}^{n} (iωx)^k / k! + o(ω^n x^n)   (ω → 0)

so that

∫ e^{iωx} dF(x) = 1 + ∑_{k=1}^{n} (iω)^k μk / k! + ∫ o(ω^n x^n) dF(x),

where the last term on the right side is seen to be o(ω^n). ∎

Remark. The converse of (b) is not always true: thus φ′(ω) may exist, but the mean may not. A partial converse is the following: Suppose that φ^{(n)}(0) exists. If n is even, then the first n moments exist, while if n is odd, the first n − 1 moments exist.

2.2. Uniqueness and Inversion

Theorem 2.5 (uniqueness). Distinct distributions have distinct c.f.s.

Proof. Let F have the c.f. φ, so that

φ(ω) = ∫ e^{iωx} dF(x).

We have for a > 0

(a/√(2π)) ∫ e^{−(1/2)a²ω² − iωy} φ(ω) dω = (a/√(2π)) ∫ e^{−(1/2)a²ω² − iωy} dω ∫ e^{iωx} dF(x)
  = ∫ dF(x) ∫ e^{iω(x−y)} (a/√(2π)) e^{−(1/2)a²ω²} dω,

the inversion of integrals being clearly justified. The last integral is the c.f. (evaluated at x − y) of the normal distribution with mean 0


and variance a^{−2}, and therefore equals e^{−(x−y)²/2a²}. We therefore obtain the identity

(1/2π) ∫ e^{−(1/2)a²ω² − iωy} φ(ω) dω = (1/(a√(2π))) ∫ e^{−(y−x)²/2a²} dF(x) (2.6)

for all a > 0. We note that the right side of (2.6) is the density of the convolution F ⋆ Na, where Na is the normal distribution with mean 0 and variance a². Now if G is a second distribution with the same c.f. φ, it follows from (2.6) that F ⋆ Na = G ⋆ Na. Letting a → 0+ we find that F ≡ G, as required. ∎

Theorem 2.6 (inversion). (a) If the distribution F has c.f. φ and |φ(ω)/ω| is integrable, then for h > 0

F(x + h) − F(x) = (1/2π) ∫ e^{−iωx} (1 − e^{−iωh})/(iω) φ(ω) dω. (2.7)

(b) If |φ| is integrable, then F has a bounded continuous density f given by

f(x) = (1/2π) ∫ e^{−iωx} φ(ω) dω. (2.8)

Proof. (b) From (2.6) we find that the density fa of Fa = F ⋆ Na is given by

fa(x) = (1/2π) ∫ e^{−(1/2)a²ω² − iωx} φ(ω) dω. (2.9)

Here the integrand is bounded by |φ(ω)|, which is integrable by hypothesis. Moreover, as a → 0+, the integrand → e^{−iωx}φ(ω). Therefore by the dominated convergence theorem as a → 0+,

fa(x) → (1/2π) ∫ e^{−iωx} φ(ω) dω = f(x) (say).

Clearly, f is bounded and continuous. Now for every bounded interval I we have

Fa{I} = ∫_I fa(x) dx.


Letting a → 0+ in this we obtain

F{I} = ∫_I f(x) dx

if I is an interval of continuity of F. This shows that f is the density of F, as required.

(a) Consider the uniform distribution with density

uh(x) = 1/h for −h < x < 0, and = 0 elsewhere.

Its convolution with F has the density

fh(x) = ∫ uh(x − y) dF(y) = ∫_{x}^{x+h} (1/h) dF(y) = [F(x + h) − F(x)]/h

and c.f.

φh(ω) = φ(ω) ∫ e^{iωx} uh(x) dx = φ(ω) (1 − e^{−iωh})/(iωh).

By (b) we therefore obtain

[F(x + h) − F(x)]/h = (1/2π) ∫ e^{−iωx} φ(ω) (1 − e^{−iωh})/(iωh) dω

provided that |φ(ω)(1 − e^{−iωh})/iω| is integrable. This condition reduces to the condition that |φ(ω)/ω| is integrable. ∎
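The inversion formula (2.8) can be checked numerically when φ is integrable. The sketch below (Python, standard library only; the standard normal c.f. and the quadrature parameters are illustrative assumptions, not from the text) recovers the normal density from its c.f. by the midpoint rule:

```python
import math
import cmath

def phi(w):
    # c.f. of the standard normal: exp(-w^2/2)
    return math.exp(-0.5 * w * w)

def density(x, lim=12.0, grid=20000):
    # f(x) = (1/2 pi) int e^{-i w x} phi(w) dw, formula (2.8),
    # approximated by the midpoint rule on [-lim, lim]
    h = 2.0 * lim / grid
    s = sum(cmath.exp(-1j * (-lim + (k + 0.5) * h) * x) * phi(-lim + (k + 0.5) * h)
            for k in range(grid))
    return (s * h).real / (2.0 * math.pi)

# the recovered values match the standard normal density
for x in (0.0, 1.0, -2.0):
    exact = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    assert abs(density(x) - exact) < 1e-5
```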

2.3. Convergence Properties

Theorem 2.7 (continuity theorem). A sequence {Fn} of distributions converges properly to a distribution F iff the sequence {φn} of their c.f.s converges to a limit φ which is continuous at the origin. In this case φ is the c.f. of F.


Proof. (i) If {Fn} converges properly to F, then

∫ u(x) dFn(x) → ∫ u(x) dF(x)

for every continuous and bounded function u. For u(x) = e^{iωx} it follows that φn(ω) → φ(ω), where φ is the c.f. of F. From Theorem 2.4(a) we know that φ is uniformly continuous.

(ii) Conversely suppose that φn(ω) → φ(ω), where φ is continuous at the origin. By the selection theorem there exists a subsequence {Fnk, k ≥ 1} which converges to F, a possibly defective distribution. Using (2.6) we have

(a/√(2π)) ∫ e^{−iωy − (1/2)a²ω²} φnk(ω) dω = ∫ e^{−(1/2a²)(y−x)²} dFnk(x).

Letting k → ∞ in this we obtain

(a/√(2π)) ∫ e^{−iωy − (1/2)a²ω²} φ(ω) dω = ∫ e^{−(1/2a²)(y−x)²} dF(x) ≤ F(∞) − F(−∞). (2.10)

Writing the first expression in (2.10) as

(1/√(2π)) ∫ e^{−i(τ/a)y − (1/2)τ²} φ(τ/a) dτ (2.11)

and applying the dominated convergence theorem we find that (2.11) converges to φ(0) = 1 as a → ∞. By (2.10) it follows that F(∞) − F(−∞) ≥ 1, which gives F(−∞) = 0, F(∞) = 1, so that F is proper. By (i), φ is the c.f. of F, and by the uniqueness theorem F is unique. Thus every subsequence {Fnk} converges to F. ∎

Theorem 2.8 (weak law of large numbers). Let {Xn, n ≥ 1} be a sequence of independent random variables with a common distribution and finite mean μ. Let Sn = X1 + X2 + · · · + Xn (n ≥ 1). Then as n → ∞, Sn/n → μ in probability.


Proof. Let φ be the c.f. of Xn. The c.f. of Sn/n is then

E(e^{iω(Sn/n)}) = φ(ω/n)^n = [1 + iμ(ω/n) + o(1/n)]^n → e^{iμω}

as n → ∞. Here e^{iμω} is the c.f. of a distribution concentrated at the point μ. By the continuity theorem it follows that the distribution of Sn/n converges to this degenerate distribution. ∎

Theorem 2.9 (central limit theorem). Let {Xn, n ≥ 1} be a sequence of independent random variables with a common distribution and

E(Xn) = μ,  Var(Xn) = σ²

(both being finite). Let Sn = X1 + X2 + · · · + Xn (n ≥ 1). Then as n → ∞, the distribution of (Sn − nμ)/σ√n converges to the standard normal.

Proof. The random variables (Xn − μ)/σ have mean zero and variance unity. Let their common c.f. be φ. Then the c.f. of (Sn − nμ)/σ√n is

φ(ω/√n)^n = [1 − ω²/2n + o(1/n)]^n → e^{−(1/2)ω²},

where the limit is the c.f. of the standard normal distribution. The desired result follows by the continuity theorem. ∎

Remark. In Theorem 2.7 the convergence φn → φ is uniform with respect to ω in every finite interval [−Ω, Ω].
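The c.f. computation in the proof of the central limit theorem can be watched numerically. A minimal sketch (Python, standard library only; the standardized uniform distribution and the chosen values of ω and n are illustrative assumptions, not from the text):

```python
import math

def cf_unif(w):
    # c.f. of Uniform(-sqrt(3), sqrt(3)), which has mean 0 and variance 1:
    # sin(sqrt(3) w) / (sqrt(3) w)
    a = math.sqrt(3.0)
    return 1.0 if w == 0.0 else math.sin(a * w) / (a * w)

w = 1.5
target = math.exp(-0.5 * w * w)     # c.f. of the standard normal at w

# phi(w / sqrt(n))^n should approach exp(-w^2/2) as n grows
errs = [abs(cf_unif(w / math.sqrt(n)) ** n - target) for n in (10, 100, 1000)]
assert errs[0] > errs[1] > errs[2]
```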

2.3.1. Convergence of types

Two distributions F and G are said to be of the same type if

G(x) = F(ax + b) (2.12)

with a > 0, b real.


Theorem 2.10. If for a sequence {Fn} of distributions we have

Fn(αn x + βn) → G(x),  Fn(an x + bn) → H(x) (2.13)

for all points of continuity, with αn > 0, an > 0, and G and H non-degenerate distributions, then

αn/an → a,  (βn − bn)/an → b  and  G(x) = H(ax + b) (2.14)

(0 < a < ∞, |b| < ∞).

Proof. Let Hn(x) = Fn(an x + bn). Then we are given that Hn(x) → H(x) and also Hn(ρn x + σn) = Fn(αn x + βn) → G(x), where

ρn = αn/an,  σn = (βn − bn)/an. (2.15)

With the obvious notations we are given that

ψn(ω) → ψ(ω),  γn(ω) = e^{−iωσn/ρn} ψn(ω/ρn) → γ(ω)

uniformly in every finite interval, where ψ, γ are the c.f.s of H, G. Let {ρnk} be a subsequence of {ρn} such that ρnk → a (0 ≤ a ≤ ∞). If a = ∞, then

|γ(ω)| = lim |ψnk(ω/ρnk)| = |ψ(0)| = 1,

so that γ is degenerate, which is not true. If a = 0, then

|ψ(ω)| = lim |ψnk(ω)| = lim |γnk(ρnk ω)| = |γ(0)| = 1,

so that ψ is degenerate, which is not true. So 0 < a < ∞. Now

e^{iω(σnk/ρnk)} = ψnk(ω/ρnk) / γnk(ω) → ψ(ω/a) / γ(ω),

so that σnk/ρnk converges to a limit b/a (say). Also

γ(ω) = e^{−iω(b/a)} ψ(ω/a). (2.16)

It remains to prove the uniqueness of the limit a. Suppose there are two subsequences of {ρn} converging to a′ and a″, and assume that


a′ < a″. Then the corresponding subsequences of {σn/ρn} converge to limits b′, b″ (say). From (2.16) we obtain

e^{−iω(b′/a′)} ψ(ω/a′) = e^{−iω(b″/a″)} ψ(ω/a″)

and hence |ψ(ω/a′)| = |ψ(ω/a″)|, or

|ψ(ω)| = |ψ(ωa′/a″)| = |ψ(ωa′²/a″²)| = · · · = |ψ(ωa′ⁿ/a″ⁿ)| → |ψ(0)| = 1.

This means that ψ is degenerate, which is not true. So a′ ≥ a″. Similarly a″ ≥ a′. Therefore a′ = a″, as required. Since we have proved (2.16), the theorem is completely proved. ∎

2.4. A Criterion for c.f.s

A function f of a real variable is said to be non-negative definite on (−∞, ∞) if for all real numbers ω1, ω2, . . . , ωn and complex numbers a1, a2, . . . , an

∑_{r,s=1}^{n} f(ωr − ωs) ar ās ≥ 0. (2.17)

For such a function the following properties hold.

(a) f(0) ≥ 0. If in (2.17) we put n = 2, ω1 = ω, ω2 = 0, a1 = a, a2 = 1 we obtain

f(0)(1 + |a|²) + f(ω)ā + f(−ω)a ≥ 0. (2.18)

When ω = 0 and a = 1 this reduces to f(0) ≥ 0.

(b) f(−ω) = f̄(ω). We see from (2.18) that f(ω)ā + f(−ω)a is real. This gives f(−ω) = f̄(ω).

(c) |f(ω)| ≤ f(0). In (2.18) let us choose a = ρf(ω) where ρ is real. Then

f(0) + 2ρ|f(ω)|² + ρ²|f(ω)|²f(0) ≥ 0.

This is true for all real ρ, so |f(ω)|⁴ ≤ |f(ω)|²[f(0)]², or |f(ω)| ≤ f(0), as required.
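The quadratic form (2.17) can be evaluated directly for a known c.f. The sketch below (Python, standard library only; the Gaussian c.f., the grid of points ω_r and the random trial vectors are illustrative assumptions, not from the text) checks that the form is real and non-negative:

```python
import math
import random
random.seed(7)

def f(w):
    # c.f. of the standard normal; as a c.f. it must be non-negative definite
    return math.exp(-0.5 * w * w)

omegas = [-2.0, -0.3, 0.0, 1.1, 2.5]
for _ in range(200):
    # random complex coefficients a_1, ..., a_n
    a = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in omegas]
    # the quadratic form (2.17): sum over r, s of f(w_r - w_s) a_r conj(a_s)
    q = sum(f(wr - ws) * ar * az.conjugate()
            for wr, ar in zip(omegas, a)
            for ws, az in zip(omegas, a))
    assert abs(q.imag) < 1e-10   # the form is real ...
    assert q.real >= -1e-10      # ... and non-negative, up to rounding
```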

Theorem 2.11. A function φ of a real variable is the c.f. of a distribution iff it is continuous and non-negative definite.


Proof. (i) Suppose φ is a c.f.; that is,

φ(ω) = ∫ e^{iωx} dF(x)

where F is a distribution. By Theorem 2.4(a), φ is continuous. Moreover,

∑_{r,s=1}^{n} φ(ωr − ωs) ar ās = ∑_{r,s=1}^{n} ar ās ∫ e^{i(ωr−ωs)x} dF(x)
  = ∫ (∑_{r=1}^{n} ar e^{iωr x}) (∑_{s=1}^{n} as e^{iωs x})‾ dF(x)
  = ∫ |∑_{r=1}^{n} ar e^{iωr x}|² dF(x) ≥ 0,

which shows that φ is non-negative definite.

(ii) Conversely, let φ be continuous and non-negative definite. Then considering the integral as the limit of a sum we find that

∫_0^τ ∫_0^τ e^{−i(ω−ω′)x} φ(ω − ω′) dω dω′ ≥ 0 (2.19)

for τ > 0. Now consider

Pτ(x) = (1/τ) ∫_0^τ ∫_0^τ e^{−i(ω−ω′)x} φ(ω − ω′) dω dω′ = ∫_{−τ}^{τ} e^{−isx} φτ(s) ds (2.20)

where

φτ(t) = (1 − |t|/τ) φ(t) for |t| ≤ τ,  = 0 for |t| ≥ τ.


From (2.20) we obtain
$$\int_{-\sigma}^{\sigma}\left(1-\frac{|\theta|}{\sigma}\right)e^{it\theta}P_\tau(\theta)\,d\theta = \frac{1}{2\pi}\int_{-\infty}^{\infty}\phi_\tau(s)\,ds\int_{-\sigma}^{\sigma}\left(1-\frac{|\theta|}{\sigma}\right)e^{i(t-s)\theta}\,d\theta = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{4\sin^2\frac{\sigma}{2}(s-t)}{\sigma(s-t)^2}\,\phi_\tau(s)\,ds \to \phi_\tau(t) \quad \text{as } \sigma\to\infty.$$
On account of (2.19), $P_\tau(x) \ge 0$, so the left side above is (after normalization) the c.f. of a distribution, and $\phi_\tau$ is continuous at

the origin. By the continuity theorem $\phi_\tau$ is a c.f. Again

$\phi_\tau(t) \to \phi(t)$ as $\tau \to \infty$, and since $\phi$ is continuous at the origin it follows that $\phi$ is a c.f.,

as was to be proved.

    Remark. This last result is essentially a theorem due to S. Bochner.

Remark on Theorem 2.7. If a sequence $\{F_n\}$ of distributions converges properly to a distribution $F$, then the sequence $\{\phi_n\}$ of their c.f.s converges to $\phi$, the c.f. of $F$, and the convergence is

uniform in every finite interval.

Proof. Let $A < 0$, $B > 0$ be points of continuity of $F$. We have
$$\phi_n(\theta)-\phi(\theta) = \int_{-\infty}^{\infty}e^{i\theta x}F_n\{dx\} - \int_{-\infty}^{\infty}e^{i\theta x}F\{dx\}$$
$$= \int_{x<A,\,x>B}e^{i\theta x}F_n\{dx\} - \int_{x<A,\,x>B}e^{i\theta x}F\{dx\} + \left(\int_A^B e^{i\theta x}F_n\{dx\} - \int_A^B e^{i\theta x}F\{dx\}\right)$$
$$= I_1 + I_2 + I_3 \quad \text{(say)}.$$
We have
$$I_3 = \int_A^B e^{i\theta x}F_n\{dx\} - \int_A^B e^{i\theta x}F\{dx\} = \left\{e^{i\theta x}[F_n(x)-F(x)]\right\}_A^B - i\theta\int_A^B e^{i\theta x}[F_n(x)-F(x)]\,dx$$


and so
$$|I_3| \le |F_n(B)-F(B)| + |F_n(A)-F(A)| + |\theta|\int_A^B|F_n(x)-F(x)|\,dx.$$
Given $\varepsilon > 0$ we can make
$$|F_n(B)-F(B)| < \varepsilon/9, \qquad |F_n(A)-F(A)| < \varepsilon/9$$
for $n$ sufficiently large. Also, since $|F_n(x)-F(x)| \le 2$ and $F_n(x) \to F(x)$ at points of continuity of $F$, we have for $|\theta| \le \Theta$
$$|\theta|\int_A^B|F_n(x)-F(x)|\,dx \le \Theta\int_A^B|F_n(x)-F(x)|\,dx < \varepsilon/9.$$
Thus
$$|I_3| < \varepsilon/3.$$
Also, for $A$, $B$ sufficiently large,
$$|I_1| \le \int_{x<A,\,x>B}F_n\{dx\} = 1 - F_n(B) + F_n(A) < \tfrac13\varepsilon, \qquad |I_2| \le \int_{x<A,\,x>B}F\{dx\} = 1 - F(B) + F(A) < \tfrac13\varepsilon.$$
The results follow from the last three inequalities.
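The remark can be illustrated numerically (an illustrative sketch, with Binomial(n, λ/n) converging to Poisson(λ) as an assumed example): the sup-distance between the c.f.s over a fixed finite interval shrinks as n grows.

```python
import numpy as np

# c.f. of Poisson(lam) and of Binomial(n, lam/n) on a finite theta-grid;
# proper convergence of the distributions forces uniform convergence
# of the c.f.s on [-10, 10].
lam = 2.0
theta = np.linspace(-10, 10, 2001)
poisson_cf = np.exp(lam * (np.exp(1j * theta) - 1))

def binom_cf(n):
    p = lam / n
    return (1 - p + p * np.exp(1j * theta)) ** n

sup_dist = [np.abs(binom_cf(n) - poisson_cf).max() for n in (10, 100, 1000)]
print(sup_dist[0] > sup_dist[1] > sup_dist[2])  # True: the sup-distance decreases
```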

    2.5. Problems for Solution

1. Consider the family of distributions with densities $f_a$ ($-1 \le a \le 1$) given by
$$f_a(x) = f(x)[1 + a\sin(2\pi\log x)]$$
where $f(x)$ is the log-normal density
$$f(x) = \frac{1}{\sqrt{2\pi}}\,x^{-1}e^{-\frac12(\log x)^2} \quad \text{for } x > 0, \qquad = 0 \quad \text{for } x \le 0.$$


Show that $f_a$ has exactly the same moments as $f$. (Thus the log-normal distribution is not uniquely determined by its moments.)

2. Let $\{p_k, k \ge 0\}$ be a probability distribution, and $\{F_n, n \ge 0\}$ a sequence of distributions. Show that
$$\sum_{n=0}^{\infty}p_nF_n(x)$$
is also a distribution.

3. Show that $\phi(\theta) = e^{\lambda(e^{-|\theta|}-1)}$ is a c.f., and find the corresponding density.

4. A distribution is concentrated on $\{\pm2, \pm3, \ldots\}$ with weights
$$p_k = \frac{c}{k^2\log|k|} \quad (k = \pm2, \pm3, \ldots)$$
where $c$ is such that the distribution is proper. Find its c.f. $\phi$ and show that $\phi'$ exists but the mean does not.

5. Show that the function $\phi(\theta) = e^{-|\theta|^\alpha}$ ($\alpha > 2$) is not a c.f.

6. If a c.f. $\phi$ is such that $\phi(\theta)^2 = \phi(c\theta)$ for some constant $c$, and the variance is finite, show that $\phi$ is the c.f. of the normal distribution.

7. A degenerate c.f. $\phi$ is factorized in the form $\phi = \phi_1\phi_2$, where $\phi_1$ and $\phi_2$ are c.f.s. Show that $\phi_1$ and $\phi_2$ are both degenerate.

8. If the sequence of c.f.s $\{\phi_n\}$ converges to a c.f. $\phi$ and $\theta_n \to 0$, show that $\phi_n(\theta_n) \to \phi(0)$.

9. If $\{\phi_n\}$ is a sequence of c.f.s such that $\phi_n(\theta) \to 1$ for $-\delta < \theta < \delta$, then $\phi_n(\theta) \to 1$ for all $\theta$.

10. A sequence of distributions $\{F_n\}$ converges properly to a non-degenerate distribution $F$. Prove that the sequence $\{F_n(a_nx + b_n)\}$ converges to a distribution degenerate at the origin iff $a_n \to \infty$ and $b_n = o(a_n)$.


    Chapter 3

Analytic Characteristic Functions

    3.1. Definition and Properties

Let $F$ be a probability distribution and consider the transform
$$\phi(\zeta) = \int_{-\infty}^{\infty} e^{\zeta x}\,dF(x) \qquad (3.1)$$
for $\zeta = \sigma + i\theta$, where $\sigma, \theta$ are real and $i = \sqrt{-1}$. This certainly

exists for $\zeta = i\theta$. Since
$$\left|\int_A^B e^{\zeta x}\,dF(x)\right| \le \int_A^B e^{\sigma x}\,dF(x), \qquad (3.2)$$
$\phi(\zeta)$ exists if $\int_{-\infty}^{\infty} e^{\sigma x}\,dF(x)$ is finite. Clearly, the integrals
$$\int_0^{\infty} e^{\sigma x}\,dF(x), \qquad \int_{-\infty}^{0} e^{\sigma x}\,dF(x) \qquad (3.3)$$
converge for $\sigma < 0$, $\sigma > 0$ respectively. Suppose there exist numbers

$\alpha, \beta$ ($0 < \alpha, \beta \le \infty$) such that the first integral in (3.3) converges for $\sigma < \alpha$ and the second for $\sigma > -\beta$; then
$$\int_{-\infty}^{\infty} e^{\sigma x}\,dF(x) < \infty \quad \text{for } -\beta < \sigma < \alpha. \qquad (3.4)$$
In this case $\phi(\zeta)$ converges in the strip $-\beta < \sigma < \alpha$ of the complex plane, and we say (in view of Theorem 3.1 below) that $F$ has an

analytic c.f. $\phi$. If $\alpha = \beta = \infty$ the c.f. is said to be entire (analytic on the whole complex plane).


    The following examples show that a distribution need not have

    an analytic c.f. and also that there are distributions with entire

    c.f.s. The conditions under which an analytic c.f. exists are statedin Theorem 3.5.

Examples

Distribution; c.f.; region of existence:

Binomial: $f(k) = \binom{n}{k}p^kq^{n-k}$; c.f. $(q + pe^{\zeta})^n$; whole plane.

Normal: $f(x) = \frac{1}{\sqrt{2\pi}}e^{-\frac12 x^2}$; c.f. $e^{\frac12\zeta^2}$; whole plane.

Cauchy: $f(x) = \frac{1}{\pi}\,\frac{1}{1+x^2}$; c.f. $e^{-|\theta|}$; $\sigma = 0$.

Gamma: $f(x) = \frac{e^{-\lambda x}\lambda^{\alpha}x^{\alpha-1}}{\Gamma(\alpha)}$; c.f. $\left(1 - \frac{\zeta}{\lambda}\right)^{-\alpha}$; $\sigma < \lambda$.

Laplace: $f(x) = \frac12 e^{-|x|}$; c.f. $(1 - \zeta^2)^{-1}$; $-1 < \sigma < 1$.

Poisson: $f(k) = \frac{e^{-\lambda}\lambda^k}{k!}$; c.f. $e^{\lambda(e^{\zeta}-1)}$; whole plane.

Theorem 3.1. The c.f. $\phi$ is analytic in the interior of the strip of

its convergence.

Proof. Let
$$I = \frac{\phi(\zeta+h)-\phi(\zeta)}{h} - \int_{-\infty}^{\infty}xe^{\zeta x}\,dF(x)$$
where the integral converges in the interior of the strip of conver-

gence, since for $\delta > 0$
$$\left|\int_{-\infty}^{\infty}xe^{\zeta x}\,dF(x)\right| \le \int_{-\infty}^{\infty}|x|e^{\sigma x}\,dF(x) \le \frac{1}{\delta}\int_{-\infty}^{\infty}e^{\delta|x|+\sigma x}\,dF(x)$$
and the last integral is finite for $-\beta+\delta < \sigma < \alpha-\delta$. We have
$$I = \int_{-\infty}^{\infty}e^{\zeta x}\,\frac{e^{hx}-1-hx}{h}\,dF(x) = \int_{-\infty}^{\infty}e^{\zeta x}\left(h\frac{x^2}{2!} + h^2\frac{x^3}{3!} + \cdots\right)dF(x).$$


Therefore
$$|I| \le \int_{-\infty}^{\infty}e^{\sigma x}|h||x|^2\left(1 + \frac{|hx|}{1!} + \frac{|hx|^2}{2!} + \cdots\right)dF(x) \le c|h|\int_{-\infty}^{\infty}e^{\sigma x+\delta|x|+|h||x|}\,dF(x) < \infty$$
(using $|x|^2 \le ce^{\delta|x|}$) in the interior of the strip of convergence. As $|h| \to 0$ the last expression tends to zero, so
$$\frac{\phi(\zeta+h)-\phi(\zeta)}{h} \to \int_{-\infty}^{\infty}xe^{\zeta x}\,dF(x).$$
Thus $\phi'(\zeta)$ exists for $\zeta$ in the interior of the strip, which means that $\phi(\zeta)$ is analytic there.

Theorem 3.2. The c.f. $\phi$ is uniformly continuous along vertical

lines that belong to the strip of convergence.

Proof. We have
$$|\phi(\sigma+i\theta_1) - \phi(\sigma+i\theta_2)| = \left|\int_{-\infty}^{\infty}e^{\sigma x}(e^{i\theta_1 x} - e^{i\theta_2 x})\,dF(x)\right| \le \int_{-\infty}^{\infty}e^{\sigma x}|e^{i(\theta_1-\theta_2)x} - 1|\,dF(x) = 2\int_{-\infty}^{\infty}e^{\sigma x}\left|\sin(\theta_1-\theta_2)\frac{x}{2}\right|dF(x).$$
Since the integrand is dominated by $e^{\sigma x}$ and approaches 0

as $\theta_1 - \theta_2 \to 0$, uniform continuity follows.

    Theorem 3.3. An analytic c.f. is uniquely determined by its values

    on the imaginary axis.

Proof. $\phi(i\theta)$ is the c.f. discussed in Chapter 2, and the result follows by the uniqueness theorem proved there.

Theorem 3.4. The function $\log\phi(\sigma)$ is convex in the interior of the

    strip of convergence.


Proof. We have
$$\frac{d^2}{d\sigma^2}\log\phi(\sigma) = \frac{\phi''(\sigma)\phi(\sigma) - \phi'(\sigma)^2}{\phi(\sigma)^2}$$
and by the Schwarz inequality
$$\phi'(\sigma)^2 = \left(\int_{-\infty}^{\infty}xe^{\sigma x}\,dF(x)\right)^2 = \left(\int_{-\infty}^{\infty}e^{\frac12\sigma x}\cdot xe^{\frac12\sigma x}\,dF(x)\right)^2 \le \int_{-\infty}^{\infty}e^{\sigma x}\,dF(x)\int_{-\infty}^{\infty}x^2e^{\sigma x}\,dF(x) = \phi(\sigma)\phi''(\sigma).$$
Therefore $\frac{d^2}{d\sigma^2}\log\phi(\sigma) \ge 0$, which shows that $\log\phi(\sigma)$ is convex.

Corollary 3.1. If $F$ has an analytic c.f. and $\phi'(0) = 0$, then $\phi(\sigma)$ is minimal at $\sigma = 0$. If $\phi$ is an entire function, then $\phi(\sigma) \to \infty$ as $\sigma \to \pm\infty$, unless $F$ is degenerate.

    3.2. Moments

Recall that
$$\mu_n = \int_{-\infty}^{\infty}x^n\,dF(x), \qquad \nu_n = \int_{-\infty}^{\infty}|x|^n\,dF(x)$$
have been defined as the ordinary moment and absolute moment of

order $n$ respectively. If $F$ has an analytic c.f. $\phi$, then $\mu_n = \phi^{(n)}(0)$,

and
$$\phi(\zeta) = \sum_{0}^{\infty}\frac{\mu_n\zeta^n}{n!},$$
the series being convergent in $|\zeta| < \rho = \min(\alpha, \beta)$. The converse is stated in the following theorem.
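As an illustrative sketch (the standard normal is an assumed example), the moment series on the real axis reproduces $\phi(\sigma) = e^{\sigma^2/2}$, since $\mu_{2k} = (2k-1)!!$ and the odd moments vanish:

```python
import math

# mu_n for the standard normal: (n-1)!! for even n, 0 for odd n.
def mu(n):
    return 0 if n % 2 else math.prod(range(1, n, 2))

sigma = 0.7
series = sum(mu(n) * sigma**n / math.factorial(n) for n in range(40))
print(abs(series - math.exp(sigma**2 / 2)) < 1e-12)  # True
```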

Theorem 3.5. If all moments $\mu_n$ of $F$ exist and the series $\sum\frac{\mu_n\zeta^n}{n!}$ has

a nonzero radius of convergence $\rho$, then $\phi$ exists in $|\sigma| < \rho$, and

inside the circle $|\zeta| < \rho$,
$$\phi(\zeta) = \sum_{0}^{\infty}\frac{\mu_n\zeta^n}{n!}.$$


Proof. We first consider the series $\sum\frac{\nu_n\zeta^n}{n!}$ and show that it also

converges in $|\zeta| < \rho$. From Lyapunov's inequality
$$\nu_n^{1/n} \le \nu_{n+1}^{1/(n+1)}$$
we obtain
$$\limsup\frac{\nu_n^{1/n}}{n} = \limsup\frac{\nu_{2n}^{1/2n}}{2n} = \limsup\frac{|\mu_{2n}|^{1/2n}}{2n} \le \limsup\frac{|\mu_n|^{1/n}}{n}.$$
Also, since $|\mu_n| \le \nu_n$ we have
$$\limsup\frac{|\mu_n|^{1/n}}{n} \le \limsup\frac{\nu_n^{1/n}}{n}.$$
Therefore
$$\limsup\frac{|\mu_n|^{1/n}}{n} = \limsup\frac{\nu_n^{1/n}}{n},$$
which shows that the series $\sum\frac{\nu_n\zeta^n}{n!}$ has radius of convergence $\rho$. For

arbitrary $A > 0$ we have
$$\infty > \sum_{0}^{\infty}\frac{\nu_n|\zeta|^n}{n!} \ge \sum_{0}^{\infty}\frac{|\zeta|^n}{n!}\int_{-A}^{A}|x|^n\,dF(x) = \int_{-A}^{A}e^{|\zeta||x|}\,dF(x)$$
for $|\zeta| < \rho$. So
$$\int_{-A}^{A}e^{\sigma x}\,dF(x) \le \int_{-A}^{A}e^{|\sigma||x|}\,dF(x) < \infty$$
for $|\sigma| < \rho$. Since $A$ is arbitrary, this implies that $\phi(\zeta)$ converges in the strip $|\sigma| < \rho$.

    3.3. The Moment Problem

The family of distributions given by
$$F_\alpha(x) = k\int_{-\infty}^{x}e^{-|y|^\lambda}\{1 + \alpha\sin(|y|^\lambda\tan\lambda\pi)\}\,dy$$
for $-1 \le \alpha \le 1$, $0 < \lambda < \frac12$, has the same moments of all orders. This

raises the question: under what conditions is a distribution uniquely

determined by its moments?
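A numerical companion (a sketch, using the log-normal family of Problem 1, Chapter 2, which exhibits the same phenomenon): substituting $x = e^y$, the perturbation integrates to zero against every power $x^n$, because after the shift $y \to y+n$ the integrand is an odd function times a Gaussian.

```python
import numpy as np
from scipy.integrate import quad

# integral of x^n * f(x) * sin(2*pi*log x) dx over (0, inf), with f the
# standard log-normal density, reduces after x = e^y to the integral below;
# it vanishes for every integer n, so all perturbed densities share moments.
def perturbation_moment(n):
    g = lambda y: np.exp(n * y - y**2 / 2) * np.sin(2 * np.pi * y) / np.sqrt(2 * np.pi)
    return quad(g, -np.inf, np.inf)[0]

print(all(abs(perturbation_moment(n)) < 1e-6 for n in range(4)))  # True
```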


    Theorem 3.6. If F has an analytic c.f. then it is uniquely deter-

    mined by its moments.

Proof. If $F$ has an analytic c.f. $\phi$, then the series $\sum\frac{\mu_n\zeta^n}{n!}$ converges

in $|\zeta| < \rho = \min(\alpha, \beta)$ and $\phi(\zeta)$ is given by this series there. If

there is a second d.f. $G$ with the same moments $\mu_n$, then by Theorem 3.5, $G$ has an analytic c.f. $\psi(\zeta)$, and $\psi(\zeta)$ is also given by that

series in $|\zeta| < \rho$. Therefore $\phi(\zeta) = \psi(\zeta)$ in the strip $|\sigma| < \rho$ and hence $F = G$.

    The cumulant generating function

The principal value of $\log\phi(\zeta)$ is called the cumulant generating function $K(\zeta)$. It exists at least on the imaginary axis between $\theta = 0$ and

the first zero of $\phi(i\theta)$. The cumulant of order $r$ is defined by
$$K_r = i^{-r}\left[\frac{d^r}{d\theta^r}\log\phi(i\theta)\right]_{\theta=0}.$$
This exists if, and only if, $\mu_r$ exists; $K_r$ can be expressed in terms of $\mu_1, \mu_2, \ldots, \mu_r$. We have
$$K(i\theta) = \sum_{0}^{\infty}\frac{K_r(i\theta)^r}{r!}$$
whenever the series converges.

Theorem 3.7. Let $\phi(\zeta) = \phi_1(\zeta)\phi_2(\zeta)$, where $\phi(\zeta)$, $\phi_1(\zeta)$, $\phi_2(\zeta)$ are c.f.s. If $\phi(\zeta)$ is analytic in $-\beta < \sigma < \alpha$, so are $\phi_1(\zeta)$ and $\phi_2(\zeta)$.

Proof. We have (with the obvious notations)
$$\int_{-\infty}^{\infty}e^{\sigma x}\,dF(x) = \int_{-\infty}^{\infty}e^{\sigma x}\,dF_1(x)\cdot\int_{-\infty}^{\infty}e^{\sigma x}\,dF_2(x),$$
and since $\phi(\sigma)$ is convergent, so are $\phi_1(\sigma)$ and $\phi_2(\sigma)$.

    Theorem 3.8 (Cramer). If X1 and X2 are independent r.v. such

    that their sum X = X1 + X2 has a normal distribution, then X1,

    X2 have normal distributions (including the degenerate case of the

    normal with zero variance).


Proof. Assume without loss of generality that $E(X_1) = E(X_2) = 0$.

Then $E(X) = 0$. Assume further that $E(X^2) = 1$. Let $\phi_1(\zeta)$, $\phi_2(\zeta)$

be the c.f.s of $X_1$ and $X_2$. Then we have
$$\phi_1(\zeta)\phi_2(\zeta) = e^{\frac12\zeta^2}. \qquad (3.5)$$
Since the right side of (3.5) is an entire function without zeros, so are

$\phi_1(\zeta)$ and $\phi_2(\zeta)$. By the convexity property (Theorem 3.4) we have

$\phi_1(\sigma) \ge 1$, $\phi_2(\sigma) \ge 1$ as $\sigma$ moves away from zero. Then (3.5) gives
$$e^{\frac12\sigma^2} = \phi_1(\sigma)\phi_2(\sigma) \ge \phi_1(\sigma) \ge |\phi_1(\zeta)|. \qquad (3.6)$$
Similarly $|\phi_2(\zeta)| \le e^{\frac12\sigma^2}$. Therefore
$$e^{\frac12\sigma^2}|\phi_1(\zeta)| \ge |\phi_1(\zeta)\phi_2(\zeta)| = e^{\frac12\mathrm{Re}(\zeta^2)} = e^{\frac12(\sigma^2-\theta^2)},$$
so that
$$|\phi_1(\zeta)| \ge e^{-\frac12\theta^2}. \qquad (3.7)$$
From (3.6) and (3.7) we obtain
$$-\frac12|\zeta|^2 \le -\frac12\theta^2 \le \log|\phi_1(\zeta)| \le \frac12\sigma^2 \le \frac12|\zeta|^2,$$
or, setting $K_1(\zeta) = \log\phi_1(\zeta)$,
$$|\mathrm{Re}\,K_1(\zeta)| \le \frac12|\zeta|^2. \qquad (3.8)$$
From a strengthened version of Liouville's theorem (see Lemma 3.1)

it follows that $K_1(\zeta) = a_1\zeta + a_2\zeta^2$. Similarly $K_2(\zeta) = b_1\zeta + b_2\zeta^2$.

Theorem 3.9 (Raikov). If $X_1$ and $X_2$ are independent r.v. such

that their sum $X = X_1 + X_2$ has a Poisson distribution, then $X_1$,

$X_2$ also have Poisson distributions.

Proof. The points of increase of $X$ are $k = 0, 1, 2, \ldots$, so all points

of increase $x_1$ and $x_2$ of $X_1$ and $X_2$ are such that $x_1 + x_2 =$ some $k$,

and moreover the first points of increase of $X_1$ and $X_2$ are $\alpha$ and $-\alpha$, where $\alpha$ is some finite number. Without loss of generality we take


$\alpha = 0$, so that $X_1$ and $X_2$ have $k = 0, 1, 2, \ldots$ as the only possible points of increase. Their c.f.s are then of the form
$$\phi_1(\zeta) = \sum_{0}^{\infty}a_ke^{k\zeta}, \qquad \phi_2(\zeta) = \sum_{0}^{\infty}b_ke^{k\zeta} \qquad (3.9)$$
with $a_0, b_0 > 0$, $a_k, b_k \ge 0$ ($k \ge 1$) and $\sum a_k = \sum b_k = 1$. Let $z = e^{\zeta}$ and $\phi_1(\zeta) = f_1(z)$, $\phi_2(\zeta) = f_2(z)$. We have
$$f_1(z)f_2(z) = e^{\lambda(z-1)}. \qquad (3.10)$$
Therefore
$$a_0b_k + a_1b_{k-1} + \cdots + a_kb_0 = \frac{e^{-\lambda}\lambda^k}{k!} \quad (k = 0, 1, \ldots), \qquad (3.11)$$
which gives
$$a_k \le \frac{1}{b_0}\,\frac{e^{-\lambda}\lambda^k}{k!}, \qquad |f_1(z)| \le \frac{1}{b_0}\,e^{\lambda(|z|-1)}. \qquad (3.12)$$
Similarly $|f_2(z)| \le \frac{1}{a_0}e^{\lambda(|z|-1)}$. Hence
$$\frac{1}{a_0}\,e^{\lambda(|z|-1)}|f_1(z)| \ge |f_1(z)f_2(z)| = e^{\lambda(u-1)}$$
where $u = \mathrm{Re}(z)$. This gives
$$|f_1(z)| \ge a_0e^{-\lambda(|z|-u)} \ge a_0e^{-2\lambda|z|}. \qquad (3.13)$$
From (3.12) and (3.13), noting that $a_0b_0 = e^{-\lambda}$, we find that
$$-2\lambda|z| \le \log|f_1(z)| - \log a_0 \le 2\lambda|z|,$$
or, setting $K_1(z) = \log f_1(z)$ and $\log a_0 = -\lambda_1 < 0$,
$$|\mathrm{Re}\,K_1(z) + \lambda_1| \le 2\lambda|z|. \qquad (3.14)$$
Proceeding as in the proof of Theorem 3.8, we obtain $\lambda_1 + K_1(z) = cz$,

where $c$ is a constant. Since $f_1(1) = 1$, $K_1(1) = 0$, so $c = \lambda_1$ and $f_1(z) = e^{\lambda_1(z-1)}$, which is the transform of the Poisson distribution.

Theorem 3.10 (Marcinkiewicz). Suppose a distribution has a

c.f. $\phi$ such that $\phi(i\theta) = e^{P(i\theta)}$, where $P$ is a polynomial. Then

(i) $\phi(\zeta) = e^{P(\zeta)}$ in the whole plane, and (ii) $\phi$ is the c.f. of a normal

distribution (so that $P(\zeta) = a_1\zeta + a_2\zeta^2$ with $a_2 \ge 0$).


Proof. Part (i) is obvious. For Part (ii) let
$$P(\zeta) = \sum_{1}^{n}a_k\zeta^k, \quad n \text{ finite}, \quad a_k \text{ real (cumulants)}.$$
From $|\phi(\zeta)| \le \phi(\sigma)$ we obtain $|e^{P(\zeta)}| \le e^{P(\sigma)}$ or $e^{\mathrm{Re}\,P(\zeta)} \le e^{P(\sigma)}$.

Therefore $\mathrm{Re}\,P(\zeta) \le P(\sigma)$. Put $\zeta = re^{i\omega}$, so that $\sigma = r\cos\omega$, $\theta = r\sin\omega$. Then
$$a_nr^n\cos n\omega + a_{n-1}r^{n-1}\cos(n-1)\omega + \cdots \le a_nr^n\cos^n\omega + a_{n-1}r^{n-1}\cos^{n-1}\omega + \cdots.$$
Suppose $a_n \ne 0$. Dividing both sides of this inequality by $r^n$ and

letting $r \to \infty$ we obtain $a_n\cos n\omega \le a_n\cos^n\omega$. Putting $\omega = \frac{2\pi}{n}$ (so that $\cos n\omega = 1$) we obtain
$$a_n \le a_n\cos^n\frac{2\pi}{n},$$
and since $\cos^n\frac{2\pi}{n} < 1$ for $n > 2$ this gives $a_n \le 0$. Similarly, putting $\omega = \frac{\pi}{n}$ (so that $\cos n\omega = -1$) we find that
$$-a_n \le a_n\cos^n\frac{\pi}{n},$$
which gives $a_n \ge 0$. Therefore

$a_n = 0$ for $n > 2$, $P(\zeta) = a_1\zeta + a_2\zeta^2$, and $\phi(\zeta)$ is the c.f. of a normal distribution, the case $a_2 = 0$ being the degenerate case of zero

variance.

Theorem 3.11 (Bernstein). Let $X_1$ and $X_2$ be independent r.v.

with unit variances. Then if
$$Y_1 = X_1 + X_2, \qquad Y_2 = X_1 - X_2 \qquad (3.15)$$
are independent, all four r.v. $X_1$, $X_2$, $Y_1$, $Y_2$ are normal.

This is a special case of the next theorem (with $n = 2$, $a_1 = b_1 = a_2 = 1$, $b_2 = -1$). For a more general result see [Feller (1971), pp. 77–80, 525–526]. He considers the linear transformation $Y_1 = a_{11}X_1 + a_{12}X_2$, $Y_2 = a_{21}X_1 + a_{22}X_2$ with $|\Delta| \ne 0$, where $\Delta$ is the


determinant
$$\Delta = \begin{vmatrix}a_{11} & a_{12}\\ a_{21} & a_{22}\end{vmatrix}.$$
If $a_{11}a_{21} + a_{12}a_{22} = 0$ then the transformation represents a rotation.

Thus (3.15) is a rotation.

Theorem 3.12 (Skitovic). Let $X_1, X_2, \ldots, X_n$ be $n$ independent

r.v. such that the linear forms
$$L_1 = a_1X_1 + a_2X_2 + \cdots + a_nX_n, \qquad L_2 = b_1X_1 + b_2X_2 + \cdots + b_nX_n \qquad (a_i \ne 0,\ b_i \ne 0),$$
are independent. Then all the $(n+2)$ r.v. are normal.

Proof. We shall first assume that (i) the ratios $c_i = a_i/b_i$ are all distinct,

and (ii) all moments of $X_1, X_2, \ldots, X_n$ exist. Then for $\alpha$, $\beta$ real we

have (with obvious notations)
$$\phi^{(\alpha L_1+\beta L_2)}(\theta) = \phi^{(L_1)}(\alpha\theta)\,\phi^{(L_2)}(\beta\theta)$$
so that
$$\prod_{i=1}^{n}\phi^{(X_i)}((\alpha a_i+\beta b_i)\theta) = \prod_{i=1}^{n}\phi^{(X_i)}(\alpha a_i\theta)\prod_{i=1}^{n}\phi^{(X_i)}(\beta b_i\theta).$$
Taking logarithms of both sides and expanding in powers of $\theta$ we

obtain
$$\sum_{i=1}^{n}K_r^{((\alpha a_i+\beta b_i)X_i)} = \sum_{i=1}^{n}K_r^{(\alpha a_iX_i)} + \sum_{i=1}^{n}K_r^{(\beta b_iX_i)}$$
or
$$\sum_{i=1}^{n}K_r^{(X_i)}\{(\alpha a_i+\beta b_i)^r - (\alpha a_i)^r - (\beta b_i)^r\} = 0$$
for all $r \ge 1$. This can be written as
$$\sum_{i=1}^{n}K_r^{(X_i)}\sum_{s=1}^{r-1}\binom{r}{s}(\alpha a_i)^s(\beta b_i)^{r-s} = 0$$


for all $r \ge 1$ and all $\alpha$, $\beta$. Hence
$$\sum_{i=1}^{n}a_i^sb_i^{r-s}K_r^{(X_i)} = 0 \quad (s = 1, 2, \ldots, r-1;\ r \ge 1).$$
Let $r \ge n+1$. Then for $s = 1, 2, \ldots, n$, $i = 1, 2, \ldots, n$ we can write

the above equations as
$$A_r\kappa_r = 0 \qquad (3.16)$$
where $A_r = (a_i^sb_i^{r-s})$, $1 \le s, i \le n$, and $\kappa_r$ is the column vector with

elements $K_r^{(X_1)}, K_r^{(X_2)}, \ldots, K_r^{(X_n)}$. Since
$$|A_r| = (a_1a_2\cdots a_n)(b_1b_2\cdots b_n)^{r-1}\prod_{j>i}(c_j - c_i) \ne 0,$$
the only solution of (3.16) is $\kappa_r = 0$. Therefore
$$K_r^{(X_i)} = 0 \quad \text{for } r \ge n+1,\ i = 1, 2, \ldots, n. \qquad (3.17)$$
Thus all cumulants of $X_i$ of order $\ge n+1$ vanish, and $K^{(X_i)}(\zeta)$ reduces to a polynomial of degree at most $n$. By the theorem of

Marcinkiewicz, each $X_i$ has a normal distribution. Hence $L_1$ and $L_2$ have normal distributions.

Next suppose that some of the $a_i/b_i$ are the same. For example,

let $a_1/b_1 = a_2/b_2$, and let $Y_1 = a_1X_1 + a_2X_2$. Then
$$L_1 = Y_1 + a_3X_3 + \cdots + a_nX_n, \qquad L_2 = \frac{b_1}{a_1}Y_1 + b_3X_3 + \cdots + b_nX_n.$$
Repeat this process till all the $a_i/b_i$ are distinct. Then by what has

just been proved, the $Y_i$ are normal. By Cramér's theorem the $X_i$ are

normal.

Finally it remains to prove that the moments of the $X_i$ exist. This follows from the fact that $L_1$ and $L_2$ have finite moments of all orders.

To prove this, we note that since $a_i \ne 0$, $b_i \ne 0$ we can take $c > 0$ such that $|a_i| \ge c$, $|b_i| \ge c > 0$. Also, let us standardize the $a_i$ and $b_i$ so that $|a_i| \le 1$, $|b_i| \le 1$. Now if $|L_1| = |a_1X_1 + a_2X_2 + \cdots + a_nX_n| \ge nM$,


then at least one $|X_i| \ge M$. Therefore
$$P\{|L_1| \ge nM\} \le \sum_{i=1}^{n}P\{|X_i| \ge M\}. \qquad (3.18)$$
Further, if $c|X_i| \ge nM$ and $|X_j| < M$ for all $j \ne i$, then $|L_1| \ge M$, $|L_2| \ge M$. Thus
$$P\{|L_1| \ge M, |L_2| \ge M\} \ge P\left\{|X_i| \ge \frac{nM}{c}\right\}\prod_{j\ne i}P\{|X_j| < M\} \ge P\left\{|X_i| \ge \frac{nM}{c}\right\}\prod_{j=1}^{n}P\{|X_j| < M\}.$$
Summing this over $i = 1, 2, \ldots, n$ we obtain, using (3.18),
$$n\,P\{|L_1| \ge M, |L_2| \ge M\} \ge P\left\{|L_1| \ge \frac{n^2M}{c}\right\}\prod_{j=1}^{n}P\{|X_j| < M\}.$$
Since $L_1$ and $L_2$ are independent, this gives
$$\frac{P\{|L_1| \ge n^2M/c\}}{P\{|L_1| \ge M\}} \le \frac{n\,P\{|L_2| \ge M\}}{\prod_{j=1}^{n}P\{|X_j| < M\}} \to 0 \qquad (3.19)$$
as $M \to \infty$. We can write (3.19) as follows. Choose $n^2/c = t > 1$. Then
$$\frac{P\{|L_1| \ge tM\}}{P\{|L_1| \ge M\}} \to 0 \quad \text{as } M \to \infty. \qquad (3.20)$$
By a known result (Lemma 3.2), $L_1$, and similarly $L_2$, has finite

moments of all orders.

Lemma 3.1 (see [Hille (1962)]). If $f(\zeta)$ is an entire function and

$|\mathrm{Re}\,f(\zeta)| \le c|\zeta|^2$, then $f(\zeta) = a_1\zeta + a_2\zeta^2$.

Proof. We have $f(\zeta) = \sum_{0}^{\infty}a_n\zeta^n$, the series being convergent on

the whole plane. Here
$$a_n = \frac{1}{2\pi i}\int_{|\zeta|=r}\frac{f(\zeta)}{\zeta^{n+1}}\,d\zeta \quad (n = 0, 1, 2, \ldots). \qquad (3.21)$$


Also, since there are no negative powers,
$$0 = \frac{1}{2\pi i}\int_{|\zeta|=r}f(\zeta)\,\zeta^{n-1}\,d\zeta \quad (n = 1, 2, \ldots). \qquad (3.22)$$
From (3.21) we obtain, putting $\zeta = re^{i\omega}$,
$$a_nr^n = \frac{1}{2\pi}\int_0^{2\pi}f(re^{i\omega})\,e^{-in\omega}\,d\omega \quad (n = 0, 1, \ldots). \qquad (3.23)$$
Similarly from (3.22) we obtain
$$0 = \frac{1}{2\pi}\int_0^{2\pi}f(re^{i\omega})\,e^{in\omega}\,d\omega, \quad \text{and, taking conjugates,} \quad 0 = \frac{1}{2\pi}\int_0^{2\pi}\overline{f(re^{i\omega})}\,e^{-in\omega}\,d\omega \quad (n = 1, 2, \ldots). \qquad (3.24)$$
From (3.23) and (3.24) we obtain
$$a_nr^n = \frac{1}{\pi}\int_0^{2\pi}\mathrm{Re}\,f(re^{i\omega})\,e^{-in\omega}\,d\omega \quad (n \ge 1).$$
Therefore
$$|a_n|r^n \le \frac{1}{\pi}\int_0^{2\pi}cr^2\,d\omega = 2cr^2$$
or
$$|a_n| \le 2cr^{2-n} \to 0 \quad \text{as } r \to \infty \text{ for } n > 2.$$
This gives $f(\zeta) = a_0 + a_1\zeta + a_2\zeta^2$ (with $\mathrm{Re}\,a_0 = 0$, since $|\mathrm{Re}\,f(0)| \le 0$).

Lemma 3.2 (see [Loeve (1963)]). If for some $t > 1$
$$\frac{1 - F(tx) + F(-tx)}{1 - F(x) + F(-x)} \to 0 \quad \text{as } x \to \infty,$$
then $F$ has moments of all orders.


Proof. Given $\varepsilon > 0$ choose $A$ so large that for $x > A$
$$\frac{1 - F(tx) + F(-tx)}{1 - F(x) + F(-x)} < \varepsilon \quad \text{and} \quad 1 - F(A) + F(-A) < \varepsilon.$$
Then for any positive integer $r$,
$$\frac{1 - F(t^rA) + F(-t^rA)}{1 - F(A) + F(-A)} = \prod_{s=1}^{r}\frac{1 - F(t^sA) + F(-t^sA)}{1 - F(t^{s-1}A) + F(-t^{s-1}A)} < \varepsilon^r$$
so that
$$1 - F(t^rA) + F(-t^rA) < \varepsilon^{r+1}.$$
Therefore
$$1 - F(x) + F(-x) < \varepsilon^{r+1} \quad \text{for } x > t^rA.$$
Now
$$\int_A^{\infty}nx^{n-1}[1 - F(x) + F(-x)]\,dx = \sum_{r=0}^{\infty}\int_{t^rA}^{t^{r+1}A}nx^{n-1}[1 - F(x) + F(-x)]\,dx \le \sum_{r=0}^{\infty}\varepsilon^{r+1}(t^{r+1}A)^n < \infty$$
if $\varepsilon$ is chosen so that $\varepsilon t^n < 1$. Hence the moment of order $n$ exists for every $n$.

3.4. Problems for Solution

1. If … $\to 0$, show that $F$ is uniquely determined by its moments.

2. Show that the distribution whose density is given by
$$f(x) = \tfrac12 e^{-\sqrt{x}} \quad \text{for } x > 0, \qquad = 0 \quad \text{for } x \le 0,$$
does not have an analytic c.f.


3. Proof of Bernstein's theorem. Introduce a change of scale so

that $Y_1 = \frac{1}{\sqrt2}(X_1 + X_2)$, $Y_2 = \frac{1}{\sqrt2}(X_1 - X_2)$. Then prove that
$$K_s^{(Y_1)} = \left(\frac{1}{\sqrt2}\right)^s\left[K_s^{(X_1)} + K_s^{(X_2)}\right], \qquad K_s^{(Y_2)} = \left(\frac{1}{\sqrt2}\right)^s\left[K_s^{(X_1)} + (-1)^sK_s^{(X_2)}\right],$$
and similarly for $K_s^{(X_1)}$, $K_s^{(X_2)}$ in terms of $K_s^{(Y_1)}$, $K_s^{(Y_2)}$. Hence

show that
$$\left|K_s^{(X_i)}\right| \le \frac{1}{2^s}\left[2\left|K_s^{(X_1)}\right| + 2\left|K_s^{(X_2)}\right|\right] \quad (i = 1, 2).$$
This gives $K_s^{(X_i)} = 0$ for $s > 2$, $i = 1, 2$.

4. If $X_1$, $X_2$ are independent and there exists one rotation $(X_1, X_2) \to (Y_1, Y_2)$ such that $Y_1$, $Y_2$ are also independent, then show that $Y_1$, $Y_2$ are independent for every rotation.


    Chapter 4

Infinitely Divisible Distributions

    4.1. Elementary Properties

A distribution and its c.f. $\phi$ are called infinitely divisible if for each

positive integer $n$ there exists a c.f. $\phi_n$ such that
$$\phi(\theta) = [\phi_n(\theta)]^n. \qquad (4.1)$$
It is proved below (Corollary 4.1) that if $\phi$ is infinitely divisible, then

$\phi(\theta) \ne 0$. Defining $\phi^{1/n}$ as the principal branch of the $n$-th root,

we see that the above definition implies that $\phi^{1/n}$ is a c.f. for every

$n \ge 1$.

Examples

(1) A distribution concentrated at a single point is infinitely divisible, since for it we have
$$\phi(\theta) = e^{i\theta a} = (e^{i\theta a/n})^n$$
where $a$ is a real constant.

(2) The Cauchy density $f(x) = \frac{a}{\pi}[a^2 + (x-\mu)^2]^{-1}$ ($a > 0$) has $\phi(\theta) = e^{i\mu\theta - a|\theta|}$. The relation (4.1) holds with $\phi_n(\theta) = e^{i\mu\theta/n - a|\theta|/n}$. Therefore the Cauchy density is infinitely divisible.

(3) The normal density with mean $m$ and variance $\sigma^2$ has c.f. $\phi(\theta) = e^{im\theta - \frac12\theta^2\sigma^2} = \left(e^{im\theta/n - \frac12\theta^2\frac{\sigma^2}{n}}\right)^n$. Thus the normal distribution is

infinitely divisible.


(4) The gamma distribution (including the exponential) is infinitely

divisible, since its c.f. is
$$\phi(\theta) = (1 - i\theta/\lambda)^{-\alpha} = \left[(1 - i\theta/\lambda)^{-\alpha/n}\right]^n.$$
The discrete counterparts, the negative binomial and geometric

distributions, are also infinitely divisible.

(5) Let $N$ be a random variable with the (simple) Poisson distribution $e^{-\lambda}\lambda^k/k!$ ($k = 0, 1, 2, \ldots$). Its c.f. is given by
$$\phi(\theta) = e^{\lambda(e^{i\theta}-1)},$$
which is clearly infinitely divisible. Now let $\{X_k\}$ be a sequence of independent random variables with a common c.f. $\psi$ and let

these be independent of $N$. Then the sum $X_1 + X_2 + \cdots + X_N + b$ has the c.f.
$$\phi(\theta) = e^{ib\theta + \lambda[\psi(\theta)-1]},$$
which is the compound Poisson c.f. Clearly, this is also infinitely

divisible.

Lemma 4.1. Let $\{\phi_n\}$ be a sequence of c.f.s. Then $\phi_n^n \to \phi$ continuous iff $n(\phi_n - 1) \to \psi$ with $\psi$ continuous. In this case $\phi = e^{\psi}$.

Theorem 4.1. A c.f. $\phi$ is infinitely divisible iff there exists a

sequence $\{\phi_n\}$ of c.f.s such that $\phi_n^n \to \phi$.

Proof. If $\phi$ is infinitely divisible, then by definition there exists a c.f. $\phi_n$ such that

$\phi_n^n = \phi$ ($n \ge 1$). Therefore the condition is necessary.

Conversely, let $\phi_n^n \to \phi$. Then by Lemma 4.1, $n[\phi_n(\theta) - 1] \to \psi(\theta) = \log\phi(\theta)$. Now for $t > 0$,
$$e^{nt[\phi_n(\theta)-1]} \to e^{t\psi(\theta)} \quad \text{as } n \to \infty.$$
Here the expression on the left side is the c.f. of the compound Poisson

distribution and the right side is a continuous function. Therefore for each $t > 0$, $e^{t\psi}$ is a c.f. and
$$\phi = e^{\psi} = (e^{\psi/n})^n,$$
which shows that $\phi$ is infinitely divisible.
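A minimal numerical sketch of the definition for the Poisson case (an assumed example, not from the text): the $n$-th root factor is itself the Poisson$(\lambda/n)$ c.f., and its $n$-th power recovers $\phi$ exactly.

```python
import numpy as np

# phi(theta) = exp(lam (e^{i theta} - 1)) and its n-th root,
# which is the c.f. of a Poisson(lam/n) distribution.
lam, n = 3.0, 7
theta = np.linspace(-20, 20, 501)
phi = np.exp(lam * (np.exp(1j * theta) - 1))
phi_n = np.exp((lam / n) * (np.exp(1j * theta) - 1))

print(np.allclose(phi_n ** n, phi))  # True
```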


Corollary 4.1. If $\phi$ is infinitely divisible, $\phi \ne 0$.

This was proved in the course of the proof of Theorem 4.1.

Corollary 4.2. If $\phi$ is infinitely divisible, so is $\phi^a$ for each $a > 0$.

Proof. We have $\phi^a = e^{a\psi} = (e^{a\psi/n})^n$.

Proof of Lemma 4.1. (i) Suppose $n(\phi_n - 1) \to \psi$, which is continuous. Then $\phi_n \to 1$ and the convergence is uniform in $[-\Theta, \Theta]$.

Therefore $|1 - \phi_n(\theta)| < \frac12$ for $\theta \in [-\Theta, \Theta]$ and $n > N$. Thus $\log\phi_n$ exists for $\theta \in [-\Theta, \Theta]$ and $n > N$, and is continuous and bounded.

Now
$$\log\phi_n = \log[1 + (\phi_n - 1)] = (\phi_n - 1) - \frac12(\phi_n - 1)^2 + \frac13(\phi_n - 1)^3 - \cdots = (\phi_n - 1)[1 + o(1)]$$
and therefore
$$n\log\phi_n = n(\phi_n - 1)[1 + o(1)] \to \psi, \quad \text{or} \quad \phi_n^n \to e^{\psi}.$$

(ii) Suppose $\phi_n^n \to \phi$. We shall first prove that $\phi$ has no zeros. It suffices to prove that $|\phi_n|^{2n} \to |\phi|^2$ implies $|\phi|^2 > 0$. Assume that this

symmetrization has been carried out, so that $\phi_n^n \to \phi$ with $\phi_n \ge 0$, $\phi \ge 0$. Since $\phi$ is continuous with $\phi(0) = 1$, there exists an interval

$[-\Theta, \Theta]$ in which $\phi$ does not vanish and therefore $\log\phi$ exists and is

bounded. Therefore $\log\phi_n$ exists and is bounded for $\theta \in [-\Theta, \Theta]$ and

$n > N$, so $n\log\phi_n \to \log\phi$. Thus $\log\phi_n \to 0$, or $\phi_n \to 1$. As in (i),

$n(\phi_n - 1) \to \log\phi = \psi$.

Theorem 4.2. If $\{\phi^{(n)}\}$ is a sequence of infinitely divisible c.f.s and

$\phi^{(n)} \to \phi$, which is continuous, then $\phi$ is an infinitely divisible c.f.

Proof. Since $\phi^{(n)}$ is infinitely divisible, $(\phi^{(n)})^{1/n}$ is a c.f. Since
$$\left[(\phi^{(n)})^{1/n}\right]^n = \phi^{(n)} \to \phi \text{ continuous},$$
$\phi$ is an infinitely divisible c.f. by Theorem 4.1.

Theorem 4.3 (De Finetti). A distribution is infinitely divisible iff

it is the limit of compound Poisson distributions.


Proof. If $\phi_n$ is the c.f. of a compound Poisson distribution, and

$\phi_n \to \phi$, which is continuous, then by Theorem 4.2, $\phi$ is an infinitely divisible c.f. Conversely, let $\phi$ be an infinitely divisible c.f. Then by

Theorem 4.1 there exists a sequence $\{\phi_n\}$ of c.f.s such that $\phi_n^n \to \phi$.

By Lemma 4.1
$$e^{n[\phi_n(\theta)-1]} \to e^{\psi} = \phi.$$
Here $e^{n[\phi_n(\theta)-1]}$ is the c.f. of a compound Poisson distribution.

    4.2. Feller Measures

A measure $M$ is said to be a Feller measure if $M\{I\} < \infty$ for every

finite interval $I$, and the integrals
$$M^+(x) = \int_x^{\infty}\frac{1}{y^2}\,M\{dy\}, \qquad M^-(x) = \int_{-\infty}^{-x}\frac{1}{y^2}\,M\{dy\} \qquad (4.2)$$
converge for all $x > 0$.

Examples

(1) A finite measure $M$ is a Feller measure, since
$$\int_{|y|>x}\frac{1}{y^2}\,M\{dy\} \le \frac{1}{x^2}\left[M\{(-\infty, -x)\} + M\{(x, \infty)\}\right].$$

(2) The Lebesgue measure is a Feller measure, since
$$\int_{|y|>x}\frac{1}{y^2}\,dy = \frac{2}{x} \quad (x > 0).$$

(3) Let $F$ be a distribution measure and $M\{dx\} = x^2F\{dx\}$. Then $M$ is a Feller measure with
$$M^+(x) = 1 - F(x), \qquad M^-(x) = F(-x+).$$

Theorem 4.4. Let $M$ be a Feller measure, $b$ a real constant and
$$\psi(\theta) = ib\theta + \int_{-\infty}^{\infty}\frac{e^{i\theta x} - 1 - i\theta\sin x}{x^2}\,M\{dx\} \qquad (4.3)$$


(the integral being convergent). Then corresponding to a given $\psi$ there

is only one measure $M$ and one constant $b$.

Proof. Consider
$$\bar\psi(\theta) = \psi(\theta) - \frac{1}{2h}\int_{-h}^{h}\psi(\theta+s)\,ds \quad (h > 0). \qquad (4.4)$$
We have
$$\bar\psi(\theta) = \int_{-\infty}^{\infty}e^{i\theta x}\,\Lambda\{dx\} \qquad (4.5)$$
where
$$\Lambda\{dx\} = \left(1 - \frac{\sin hx}{hx}\right)\frac{1}{x^2}\,M\{dx\} \qquad (4.6)$$
and it is easily verified that $\Lambda$ is a finite measure. Therefore $\bar\psi$ determines $\Lambda$ uniquely, and so $M$ uniquely. Once $M$ is known, $ib = \psi(1) - \int_{-\infty}^{\infty}\frac{e^{ix}-1-i\sin x}{x^2}\,M\{dx\}$, so the constant $b$ is uniquely determined.

Convergence of Feller measures. Let $\{M_n\}$ be a sequence of

Feller measures. We say that $M_n$ converges properly to a Feller measure $M$ if $M_n\{I\} \to M\{I\}$ for all finite intervals $I$ of continuity of

$M$, and
$$M_n^+(x) \to M^+(x), \qquad M_n^-(x) \to M^-(x) \qquad (4.7)$$
at all points $x$ of continuity of $M$. In this case we write $M_n \to M$.

Examples

(1) Let $M_n\{dx\} = nx^2F_n\{dx\}$ where $F_n$ is a distribution measure with weights $\frac12$ at each of the points $\pm\frac{1}{\sqrt n}$. Then
$$M_n\{I\} = \int_I nx^2F_n\{dx\} = n\left[\frac{1}{n}\cdot\frac12 + \frac{1}{n}\cdot\frac12\right] = 1$$
if $\left\{-\frac{1}{\sqrt n}, \frac{1}{\sqrt n}\right\} \subset I$. Also $M_n^+(x) = M_n^-(x) = 0$ for $x > \frac{1}{\sqrt n}$.

Therefore $M_n \to M$ where $M$ is a distribution measure concentrated at the origin. Clearly, $M$ is a Feller measure.


(2) Let $F_n$ be a distribution measure with Cauchy density $\frac{1}{\pi}\,\frac{n}{1+n^2x^2}$

and consider $M_n\{dx\} = \pi nx^2F_n\{dx\}$. We have
$$M_n\{(a,b)\} = \int_a^b\frac{n^2x^2}{1+n^2x^2}\,dx \to |b - a|,$$
$$M_n^+(x) = \int_x^{\infty}\frac{n^2}{1+n^2y^2}\,dy \to \int_x^{\infty}\frac{dy}{y^2}, \qquad M_n^-(x) = \int_{-\infty}^{-x}\frac{n^2}{1+n^2y^2}\,dy \to \int_x^{\infty}\frac{dy}{y^2}.$$
Therefore $M_n \to M$ where $M$ is the Lebesgue measure.

Theorem 4.5. Let $\{M_n\}$ be a sequence of Feller measures, $\{b_n\}$ a

sequence of real constants and
$$\psi_n(\theta) = ib_n\theta + \int_{-\infty}^{\infty}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,M_n\{dx\}. \qquad (4.8)$$
Then $\psi_n \to \psi$ continuous iff there exists a Feller measure $M$ and a

real constant $b$ such that $M_n \to M$ and $b_n \to b$. In this case
$$\psi(\theta) = ib\theta + \int_{-\infty}^{\infty}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,M\{dx\}. \qquad (4.9)$$

Proof. As suggested by (4.4)–(4.6) let
$$\Lambda_n\{dx\} = K(x)M_n\{dx\}, \quad \text{where } K(x) = x^{-2}\left(1 - \frac{\sin hx}{hx}\right), \qquad (4.10)$$
$$\lambda_n = \Lambda_n\{(-\infty, \infty)\} < \infty. \qquad (4.11)$$
Then
$$M_n^*\{dx\} = \frac{1}{\lambda_n}\Lambda_n\{dx\} \qquad (4.12)$$
is a distribution measure. We can write
$$\psi_n(\theta) = ib_n\theta + \lambda_n\int_{-\infty}^{\infty}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,K(x)^{-1}M_n^*\{dx\}. \qquad (4.13)$$


(i) Let $M_n \to M$ and $b_n \to b$. Then
$$\lambda_n \to \lambda = \int_{-\infty}^{\infty}K(x)M\{dx\} > 0$$
and
$$M_n^* \to M^*, \quad \text{where } M^*\{dx\} = \frac{1}{\lambda}K(x)M\{dx\}.$$
Therefore from (4.13) we find that
$$\psi_n(\theta) \to ib\theta + \lambda\int_{-\infty}^{\infty}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,K(x)^{-1}M^*\{dx\} = \psi(\theta).$$

(ii) Conversely, let $\psi_n(\theta) \to \psi(\theta)$ continuous. Then with $\bar\psi_n(\theta)$,

$\bar\psi(\theta)$ defined as in (4.4), $\bar\psi_n(\theta) \to \bar\psi(\theta)$; that is,
$$\int_{-\infty}^{\infty}e^{i\theta x}\,\Lambda_n\{dx\} \to \bar\psi(\theta). \qquad (4.14)$$
In particular
$$\lambda_n = \Lambda_n\{(-\infty, \infty)\} \to \bar\psi(0).$$
If $\bar\psi(0) = 0$, then $\Lambda_n\{I\}$ and $M_n\{I\}$ tend to 0 for every finite interval

$I$, and by (i) $\psi(\theta) = ib\theta$ with $b = \lim b_n$. We have thus proved the

required results in this case. Let $\lambda = \bar\psi(0) > 0$. Then (4.14) can be

written as
$$\lambda_n\int_{-\infty}^{\infty}e^{i\theta x}\,M_n^*\{dx\} \to \bar\psi(\theta).$$
Therefore $M_n^* \to M^*$, where $M^*$ is the distribution measure corresponding to the c.f. $\bar\psi(\theta)/\bar\psi(0)$. Thus
$$\lambda_n\int_{-\infty}^{\infty}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,K(x)^{-1}M_n^*\{dx\} \to \lambda\int_{-\infty}^{\infty}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,K(x)^{-1}M^*\{dx\}$$
(the integrand being a bounded continuous function), and $b_n \to b$.

Clearly,
$$M\{dx\} = \lambda K(x)^{-1}M^*\{dx\}$$


is a Feller measure and
$$\psi(\theta) = ib\theta + \int_{-\infty}^{\infty}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,M\{dx\}$$
as required.

    4.3. Characterization of Infinitely Divisible

    Distributions

Theorem 4.6. A distribution is infinitely divisible iff its c.f. $\phi$ is of

the form $\phi = e^{\psi}$, with
$$\psi(\theta) = ib\theta + \int_{-\infty}^{\infty}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,M\{dx\}, \qquad (4.15)$$
$M$ being a Feller measure, and $b$ a real constant.

Proof. (i) Let $\phi = e^{\psi}$ with $\psi$ given by (4.15). We can write
$$\psi(\theta) = ib\theta - \frac12\theta^2M\{0\} + \lim_{\epsilon\to 0+}\psi_\epsilon(\theta) \qquad (4.16)$$
where
$$\psi_\epsilon(\theta) = \int_{|x|>\epsilon}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,M\{dx\} = -i\theta\beta_\epsilon + c_\epsilon\int_{|x|>\epsilon}(e^{i\theta x}-1)\,G_\epsilon\{dx\}$$
with
$$c_\epsilon x^2G_\epsilon\{dx\} = M\{dx\} \text{ for } |x| > \epsilon, \quad \text{and} \quad \beta_\epsilon = \int_{|x|>\epsilon}\sin x\,\frac{M\{dx\}}{x^2},$$
$c_\epsilon$ being determined so that $G_\epsilon$ is a distribution measure. Let $\omega_\epsilon$

denote the c.f. of $G_\epsilon$; then
$$e^{\psi_\epsilon(\theta)} = e^{-i\theta\beta_\epsilon + c_\epsilon[\omega_\epsilon(\theta)-1]}$$


is the c.f. of a (translated) compound Poisson distribution. As $\epsilon \to 0$, $\psi_\epsilon \to \psi_0$, where
$$\psi_0(\theta) = \int_{|x|>0}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,M\{dx\}$$
is clearly a continuous function. By Theorem 4.3, $e^{\psi_0}$ is an

infinitely divisible c.f. Now we can write
$$e^{\psi(\theta)} = e^{ib\theta - \frac12\theta^2M\{0\}}\cdot e^{\psi_0(\theta)},$$
so that $\phi$ is the product of $e^{\psi_0(\theta)}$ and the c.f. of a normal distribution. Therefore $\phi$ is infinitely divisible.

(ii) Conversely, let $\phi$ be an infinitely divisible c.f. Then by Theorem 4.3, $\phi$ is the limit of a sequence of compound Poisson c.f.s.

That is,
$$e^{c_n[\phi_n(\theta)-1]-i\theta\gamma_n} \to \phi(\theta)$$
or
$$c_n\int_{-\infty}^{\infty}(e^{i\theta x}-1)\,F_n\{dx\} - i\theta\gamma_n \to \log\phi(\theta)$$
where $c_n > 0$, $\gamma_n$ is real and $F_n$ is the distribution measure corresponding to the c.f. $\phi_n$. We can write this as
$$\int_{-\infty}^{\infty}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,M_n\{dx\} + i\theta\left[c_n\int_{-\infty}^{\infty}\sin x\,F_n\{dx\} - \gamma_n\right] \to \log\phi(\theta)$$
where $M_n\{dx\} = c_nx^2F_n\{dx\}$. Clearly, $M_n$ is a Feller measure.

By Theorem 4.5 it follows that
$$M_n \to M \quad \text{and} \quad c_n\int_{-\infty}^{\infty}\sin x\,F_n\{dx\} - \gamma_n \to b$$
where $M$ is a Feller measure, $b$ a real constant and
$$\log\phi(\theta) = ib\theta + \int_{-\infty}^{\infty}\frac{e^{i\theta x}-1-i\theta\sin x}{x^2}\,M\{dx\}.$$
This proves that $\phi = e^{\psi}$, with $\psi$ given by (4.15).


    Remarks.

(a) The centering function $\sin x$ is such that

\[
\int\frac{e^{i\omega x}-1-i\omega\sin x}{x^2}\,M\{dx\}
\]

is real. Other possible centering functions are

(i) $\tau(x) = \dfrac{x}{1+x^2}$

and

(ii) $\tau(x) = \begin{cases} -a & \text{for } x < -a,\\ x & \text{for } -a \le x \le a,\\ a & \text{for } x > a, \end{cases}$ with $a > 0$.

(b) The measure $\mu$ (the L&eacute;vy measure) is defined as follows: $\mu\{0\} = 0$ and $\mu\{dx\} = x^{-2}M\{dx\}$ for $x \ne 0$. We have

\[
\int \min(1, x^2)\,\mu\{dx\} < \infty,
\]

as can be easily verified. The measure $K\{dx\} = (1+x^2)^{-1}M\{dx\}$ is seen to be a finite measure. This was used by Khintchine.

(c) The spectral function $H$ is defined as follows:

\[
H(x) = \begin{cases} -\displaystyle\int_x^\infty \frac{M\{dy\}}{y^2} & \text{for } x > 0\\[2mm] \displaystyle\int_{-\infty}^{x+} \frac{M\{dy\}}{y^2} & \text{for } x < 0, \end{cases}
\]

$H$ being undefined at $x = 0$. We can then write

\[
\psi(\omega) = i\omega b - \frac{1}{2}\sigma^2\omega^2 + \int_{0+}^{\infty}[e^{i\omega x} - 1 - i\omega\tau(x)]\,dH(x) + \int_{-\infty}^{0-}[e^{i\omega x} - 1 - i\omega\tau(x)]\,dH(x),
\]

where the centering function is usually $\tau(x) = x(1+x^2)^{-1}$. This is the so-called L&eacute;vy&ndash;Khintchine representation. Here $H$ is non-decreasing in $(-\infty, 0)$ and $(0, \infty)$, with $H(-\infty) = 0$, $H(\infty) = 0$.

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    62/90

    Infinitely Divisible Distributions 53

Also, for each $\epsilon > 0$,

\[
\int_{0 < |x| < \epsilon} x^2\,dH(x) < \infty.
\]

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    63/90

    54 Topics in Probability

or

\[
P(s) = (q_{0n} + q_{1n}s + q_{2n}s^2 + \cdots)^n.
\]

In particular $q_{0n}^n = P(0) = p_0$. If $p_0 = 0$, then $q_{0n} = 0$ and $P(s) = s^n(q_{1n} + q_{2n}s + q_{3n}s^2 + \cdots)^n$. This implies that $p_0 = p_1 = p_2 = \cdots = p_{n-1} = 0$ for each $n \ge 1$, which is absurd. Therefore $p_0 > 0$. It follows that $P(s) > 0$ and therefore $P(s)^{1/n} \to 1$ for $0 \le s \le 1$. Now

\[
\frac{\log P(s) - \log P(0)}{-\log P(0)}
= \frac{\log\sqrt[n]{P(s)/P(0)}}{\log\sqrt[n]{1/P(0)}}
\sim \frac{\sqrt[n]{P(s)/P(0)} - 1}{\sqrt[n]{1/P(0)} - 1}
= \frac{\sqrt[n]{P(s)} - \sqrt[n]{P(0)}}{1 - \sqrt[n]{P(0)}}
= \frac{Q_n(s) - Q_n(0)}{1 - Q_n(0)}.
\]

Thus

\[
\frac{Q_n(s) - Q_n(0)}{1 - Q_n(0)} \to \frac{\log P(s) - \log P(0)}{-\log P(0)}.
\]

Here the left side is seen to be a p.g.f. By the continuity theorem the limit is the generating function of a non-negative sequence $\{f_j\}$. Thus

\[
\frac{\log P(s) - \log P(0)}{-\log P(0)} = \sum_{1}^{\infty} f_j s^j = F(s) \quad\text{(say)}.
\]

Putting $s = 1$ we find that $F(1) = 1$. Putting $\lambda = -\log P(0) > 0$ we obtain

\[
P(s) = e^{-\lambda[1 - F(s)]}
\]

which is equivalent to (4.18).
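As a numerical sketch of this representation (the negative binomial p.g.f. and the parameters $p$, $r$ below are illustrative assumptions, not from the text): for $P(s) = ((1-p)/(1-ps))^r$, which is infinitely divisible, the construction above yields $\lambda = -\log P(0)$ and the logarithmic series $f_j = p^j/(-j\log(1-p))$, and one can check $P(s) = e^{-\lambda[1-F(s)]}$ directly.

```python
import math

p, r = 0.3, 2.5                                  # illustrative parameters
P = lambda s: ((1 - p) / (1 - p * s)) ** r       # negative binomial p.g.f.

lam = -math.log(P(0.0))                          # lambda = -log P(0)
c = -math.log(1 - p)
f = [p ** j / (j * c) for j in range(1, 201)]    # logarithmic series f_j

assert all(fj >= 0 for fj in f)                  # {f_j} is non-negative
assert abs(sum(f) - 1.0) < 1e-12                 # F(1) = 1 (up to truncation)

def F(s):
    return sum(fj * s ** j for j, fj in enumerate(f, start=1))

# P(s) = exp(-lam * (1 - F(s))) on a grid of s-values in [0, 1]
for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    assert abs(P(s) - math.exp(-lam * (1 - F(s)))) < 1e-10
```

So the negative binomial is a compound Poisson distribution whose compounded law is logarithmic.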

4.4. Special Cases of Infinitely Divisible Distributions

(A) Let the measure $M$ be concentrated at the origin, with weight $\sigma^2 > 0$. Then (4.15) gives $\psi(\omega) = i\omega b - \frac{1}{2}\sigma^2\omega^2$, which is the c.f. of the normal distribution.

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    64/90

    Infinitely Divisible Distributions 55

(B) Let $M$ be concentrated at $h\,(\ne 0)$ with weight $\mu h^2$. Then

\[
\phi(\omega) = e^{i\omega r + \mu(e^{i\omega h} - 1)}, \qquad r = b - \mu\sin h.
\]

Thus $\phi$ is the c.f. of the random variable $hN + r$, where $N$ has the (simple) Poisson distribution $e^{-\mu}\mu^k/k!$ $(k = 0, 1, 2, \ldots)$.

(C) Let $M\{dx\} = c\,x^2 G\{dx\}$ where $G$ is the distribution measure with the c.f. $\gamma$. Clearly, $M$ is a Feller measure and

\[
\psi(\omega) = i\omega\beta + c[\gamma(\omega) - 1], \qquad \beta = b - c\int\sin x\, G\{dx\}.
\]

We thus obtain the c.f. of a compound Poisson distribution.
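A small Monte Carlo sketch of case (C), with $\gamma$ taken to be the standard normal c.f. (the parameters $c$ and $\omega$ below are illustrative assumptions): if $S = Y_1 + \cdots + Y_N$ with $N \sim \text{Poisson}(c)$ and $Y_k$ i.i.d. $N(0,1)$, the sample average of $e^{i\omega S}$ should match $e^{c[\gamma(\omega)-1]}$.

```python
import numpy as np

rng = np.random.default_rng(0)
c, w = 2.0, 0.7                       # illustrative rate and frequency
n_samples = 200_000

N = rng.poisson(c, size=n_samples)
# given N = k, the sum S of k standard normals is N(0, k)
S = np.sqrt(N) * rng.standard_normal(n_samples)

phi_mc = np.exp(1j * w * S).mean()
phi_exact = np.exp(c * (np.exp(-w ** 2 / 2) - 1.0))   # gamma(w) = exp(-w^2/2)

assert abs(phi_mc - phi_exact) < 0.02
```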

(D) Let $M$ be concentrated on $(0, \infty)$ with density $\alpha x e^{-\lambda x}$ $(x > 0)$. It is easily verified that $M$ is a Feller measure. We have

\[
\int_0^\infty \frac{e^{i\omega x} - 1}{x^2}\,M\{dx\} = \int_0^\infty \left[e^{(i\omega - \lambda)x} - e^{-\lambda x}\right]\frac{\alpha}{x}\,dx
= \alpha\log\frac{\lambda}{\lambda - i\omega} = -\alpha\log\left(1 - \frac{i\omega}{\lambda}\right).
\]

Choosing

\[
b = \int_0^\infty \frac{\sin x}{x}\,\alpha e^{-\lambda x}\,dx < \infty
\]

we find that

\[
\phi(\omega) = \left(1 - \frac{i\omega}{\lambda}\right)^{-\alpha}.
\]

This is the c.f. of the gamma density $e^{-\lambda x}\lambda^\alpha x^{\alpha - 1}/\Gamma(\alpha)$.
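As a numerical sketch (with illustrative parameters, not from the text), the closed form $(1 - i\omega/\lambda)^{-\alpha}$ can be checked against direct integration of $e^{i\omega x}$ times the gamma density:

```python
import numpy as np
from math import gamma

a, lam, w = 2.5, 1.5, 0.8            # illustrative shape, rate, frequency

x = np.linspace(1e-8, 60.0, 600_001)
dens = lam ** a * x ** (a - 1) * np.exp(-lam * x) / gamma(a)
integrand = np.exp(1j * w * x) * dens

# trapezoidal rule for the c.f. integral
h = x[1] - x[0]
phi_num = h * (0.5 * integrand[0] + integrand[1:-1].sum() + 0.5 * integrand[-1])
phi_exact = (1 - 1j * w / lam) ** (-a)

assert abs(phi_num - phi_exact) < 1e-5
```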

(E) Stable distributions. These are characterized by the measure $M$, where

\[
M\{(-y, x)\} = C(px^{2-\alpha} + qy^{2-\alpha}) \qquad (x > 0,\ y > 0)
\]

where $C > 0$, $p \ge 0$, $q \ge 0$, $p + q = 1$, $0 < \alpha \le 2$. If $\alpha = 2$, $M$ is concentrated at the origin, and the distribution is the normal, as discussed in (A). Let $0 < \alpha < 2$, and denote by $\psi_\alpha$ the corresponding expression $\psi$. In evaluating it we choose

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    65/90

    56 Topics in Probability

an appropriate centering function $\tau_\alpha(x)$ depending on $\alpha$. This changes the constant $b$ and we obtain

\[
\psi_\alpha(\omega) = i\omega\gamma + \int\frac{e^{i\omega x} - 1 - i\omega\tau_\alpha(x)}{x^2}\,M\{dx\}
\]

where

\[
\gamma = b + \int\frac{\tau_\alpha(x) - \sin x}{x^2}\,M\{dx\} \qquad (|\gamma| < \infty)
\]

and

\[
\tau_\alpha(x) = \begin{cases} \sin x & \text{if } \alpha = 1\\ 0 & \text{if } 0 < \alpha < 1\\ x & \text{if } 1 < \alpha < 2. \end{cases}
\]

Substituting for $M$ we find that

\[
\psi_\alpha(\omega) = i\omega\gamma + C(2 - \alpha)[p I_\alpha(\omega) + q I_\alpha(-\omega)]
\]

where

\[
I_\alpha(\omega) = \int_0^\infty \frac{e^{i\omega x} - 1 - i\omega\tau_\alpha(x)}{x^{\alpha + 1}}\,dx.
\]

Evaluating the integral $I_\alpha$ we find that

\[
\psi_\alpha(\omega) = i\omega\gamma - c|\omega|^\alpha\left[1 + i\beta\,\frac{\omega}{|\omega|}\,\varpi(|\omega|, \alpha)\right]
\]

where $c > 0$, $|\beta| \le 1$ and

\[
\varpi(|\omega|, \alpha) = \begin{cases} \tan\dfrac{\pi\alpha}{2} & \text{if } \alpha \ne 1\\[2mm] \dfrac{2}{\pi}\log|\omega| & \text{if } \alpha = 1. \end{cases}
\]

In Sec. 4.6 we shall discuss the detailed properties of stable distributions. We note that when $\beta = 0$ and $\alpha = 1$ we obtain $\psi(\omega) = i\omega\gamma - c|\omega|$, so that $\phi$ is the c.f. of the Cauchy distribution.
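A minimal check of the Cauchy case (the values of $\gamma$ and $c$ below are illustrative): $\phi(\omega) = e^{i\gamma\omega - c|\omega|}$ satisfies $\phi(\omega)^n = \phi(n\omega)$, the stability property discussed in Sec. 4.6.

```python
import cmath

g, c = 0.4, 1.3                      # illustrative location and scale
phi = lambda w: cmath.exp(1j * g * w - c * abs(w))

# phi(w)**n = phi(n*w) for positive and negative frequencies
for n in (2, 5, 11):
    for w in (0.7, -1.9, 3.2):
        assert abs(phi(w) ** n - phi(n * w)) < 1e-12
```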

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    66/90

    Infinitely Divisible Distributions 57

4.5. L&eacute;vy Processes

We say a stochastic process $\{X(t), t \ge 0\}$ has stationary independent increments if it satisfies the following properties:

(i) For $0 \le t_1 < t_2 < \cdots < t_n$ $(n \ge 2)$ the random variables

\[
X(t_1),\ X(t_2) - X(t_1),\ X(t_3) - X(t_2),\ \ldots,\ X(t_n) - X(t_{n-1})
\]

are independent.

(ii) The distribution of the increment $X(t_p) - X(t_{p-1})$ depends only on the difference $t_p - t_{p-1}$.

For such a process we can take $X(0) \equiv 0$ without loss of generality. For if $X(0) \not\equiv 0$, then the process $Y(t) = X(t) - X(0)$ has stationary independent increments, and $Y(0) = 0$.

If we write

\[
X(t) = \sum_{k=1}^{n}\left[X\!\left(\frac{k}{n}t\right) - X\!\left(\frac{k-1}{n}t\right)\right] \tag{4.19}
\]

then $X(t)$ is seen to be the sum of $n$ independent random variables, all of which are distributed as $X(t/n)$. Thus a process with stationary independent increments is the generalization to continuous time of sums of independent and identically distributed random variables.

A L&eacute;vy process is a process with stationary independent increments that satisfies the following additional conditions:

(iii) $X(t)$ is continuous in probability. That is, for each $\epsilon > 0$,

\[
P\{|X(t)| > \epsilon\} \to 0 \quad\text{as } t \to 0. \tag{4.20}
\]

(iv) There exist left and right limits $X(t-)$ and $X(t+)$, and we assume that $X(t)$ is right-continuous: that is, $X(t+) = X(t)$.

Theorem 4.9. The c.f. of a L&eacute;vy process is given by $E[e^{i\omega X(t)}] = e^{t\psi(\omega)}$, where $\psi$ is given by Theorem 4.6.

Proof. Let $\phi_t(\omega) = E[e^{i\omega X(t)}]$. From (4.19) we find that $\phi_t(\omega) = [\phi_{t/n}(\omega)]^n$, so for each $t > 0$, $\phi_t$ is infinitely divisible and $\phi_t = e^{\psi_t}$. Also from the relation $X(t+s) \stackrel{d}{=} X(t) + X(s)$ we obtain the functional equation $\psi_{t+s} = \psi_t + \psi_s$. On account of (4.20), $\psi_t \to 0$ as $t \to 0$, so

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    67/90

    58 Topics in Probability

we must have $\psi_t(\omega) = t\psi_1(\omega)$. Thus $\phi_t(\omega) = e^{t\psi(\omega)}$ with $\psi = \psi_1$ in the required form.

Special cases: Each of the special cases of infinitely divisible distributions discussed in Sec. 4.4 leads to a L&eacute;vy process with c.f. $\phi_t(\omega) = e^{t\psi(\omega)}$, with $\psi$ in the prescribed form. Thus for appropriate choices of the measure $M$ we obtain the Brownian motion, the simple and compound Poisson processes, the gamma process and stable processes (including the Cauchy process).

A L&eacute;vy process with non-decreasing sample functions is called a subordinator. Thus the simple Poisson process and the gamma process are subordinators.
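A simulation sketch of the gamma subordinator (parameters below are illustrative): increments over disjoint intervals of length $dt$ are independent $\text{Gamma}(\alpha\,dt, 1/\lambda)$ variables, so every sample path is non-decreasing and $E[X(t)] = \alpha t/\lambda$.

```python
import numpy as np

rng = np.random.default_rng(1)
a, lam = 2.0, 1.5                            # illustrative shape and rate
n_steps, n_paths, dt = 100, 20_000, 0.01     # paths on [0, 1]

increments = rng.gamma(shape=a * dt, scale=1.0 / lam, size=(n_paths, n_steps))
paths = increments.cumsum(axis=1)

assert (increments >= 0).all()               # non-decreasing sample paths
x1 = paths[:, -1]                            # X(1) ~ Gamma(a, 1/lam)
assert abs(x1.mean() - a / lam) < 0.05       # E[X(1)] = a/lam
```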

4.6. Stable Distributions

A distribution and its c.f. $\phi$ are called stable if for every positive integer $n$ there exist real numbers $c_n > 0$, $d_n$ such that

\[
\phi(\omega)^n = \phi(c_n\omega)e^{i\omega d_n}. \tag{4.21}
\]

If $X, X_1, X_2, \ldots$ are independent random variables with the c.f. $\phi$, then the above definition is equivalent to

\[
X_1 + X_2 + \cdots + X_n \stackrel{d}{=} c_n X + d_n. \tag{4.22}
\]

Examples

(A) If $X$ has a distribution concentrated at a single point, then (4.22) is satisfied with $c_n = n$, $d_n = 0$. Thus a degenerate distribution is (trivially) stable. We shall exclude this case from our consideration.

(B) If $X$ has the Cauchy density $f(x) = \frac{a}{\pi}[a^2 + (x - r)^2]^{-1}$ $(a > 0)$, then $\phi(\omega) = e^{ir\omega - a|\omega|}$. The relation (4.21) holds with $c_n = n$, $d_n = 0$. Thus the Cauchy distribution is stable.

(C) If $X$ has a normal density with mean $m$ and variance $\sigma^2$, then (4.22) holds with $c_n = \sqrt{n}$ and $d_n = m(n - \sqrt{n})$. Thus the normal distribution is stable.

The concept of stable distributions is due to L&eacute;vy (1924), who gave a second definition (see Problem 11).
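Example (C) can be checked numerically (the values of $m$ and $\sigma$ below are illustrative): with $\phi(\omega) = e^{im\omega - \sigma^2\omega^2/2}$, the identity $\phi(\omega)^n = \phi(\sqrt{n}\,\omega)\,e^{i\omega m(n-\sqrt{n})}$ is (4.21) with $c_n = \sqrt{n}$, $d_n = m(n-\sqrt{n})$.

```python
import cmath, math

m, sig = 0.7, 1.2                    # illustrative mean and std deviation
phi = lambda w: cmath.exp(1j * m * w - 0.5 * (sig * w) ** 2)

for n in (2, 3, 10):
    cn, dn = math.sqrt(n), m * (n - math.sqrt(n))
    for w in (0.5, -1.3):
        assert abs(phi(w) ** n - phi(cn * w) * cmath.exp(1j * w * dn)) < 1e-12
```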

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    68/90

    Infinitely Divisible Distributions 59

Theorem 4.10. Stable distributions are infinitely divisible.

Proof. The relation (4.21) can be written as

\[
\phi(\omega) = \left[\phi\!\left(\frac{\omega}{c_n}\right)e^{-i\omega d_n/(nc_n)}\right]^n = \phi_n(\omega)^n
\]

where $\phi_n$ is clearly a c.f. By definition $\phi$ is infinitely divisible.

Domains of attraction. Let $\{X_k, k \ge 1\}$ be a sequence of independent random variables with a common distribution $F$, and $S_n = X_1 + X_2 + \cdots + X_n$ $(n \ge 1)$. We say that $F$ belongs to the domain of attraction of a distribution $G$ if there exist real constants $a_n > 0$, $b_n$ such that the normed sum $(S_n - b_n)/a_n$ converges in distribution to $G$.

It is clear that a stable distribution $G$ belongs to its own domain of attraction, with $a_n = c_n$, $b_n = d_n$. Conversely, we shall prove below that the only non-empty domains of attraction are those of stable distributions.

Theorem 4.11. If the normed sum $(S_n - b_n)/a_n$ converges in distribution to a limit, then

(i) as $n \to \infty$, $a_n \to \infty$, $a_{n+1}/a_n \to 1$ and $(b_{n+1} - b_n)/a_n \to b$ with $|b| < \infty$, and

(ii) the limit distribution is stable.

Proof. (i) With the obvious notation we are given that

\[
[\phi(\omega/a_n)e^{-i\omega b_n/(na_n)}]^n \to \phi_0(\omega) \tag{4.23}
\]

uniformly in $[-\Omega, \Omega]$. By Lemma 4.1 we conclude that

\[
n[\phi(\omega/a_n)e^{-i\omega b_n/(na_n)} - 1] \to \psi_0(\omega)
\]

where $\phi_0 = e^{\psi_0}$. Therefore

\[
\phi_n(\omega) \equiv \phi(\omega/a_n)e^{-i\omega b_n/(na_n)} \to 1.
\]

Let $\{a_{n_k}\}$ be a subsequence of $\{a_n\}$ such that $a_{n_k} \to a$ $(0 \le a \le \infty)$.

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    69/90

    60 Topics in Probability

If $0 < a < \infty$, then

\[
1 = \lim|\phi(\omega/a_{n_k})| = |\phi(\omega/a)|,
\]

while if $a = 0$, then

\[
1 = \lim|\phi_{n_k}(a_{n_k}\omega)| = |\phi(\omega)|.
\]

Both implications here would mean that $\phi$ is degenerate, which is not true. Hence $a = \infty$ and $a_n \to \infty$. From (4.23) we have

\[
\left[\phi(\omega/a_{n+1})\right]^{n+1} e^{-i\omega b_{n+1}/a_{n+1}} \to \phi_0(\omega),
\]

which can be written as

\[
\left[\phi(\omega/a_{n+1})\right]^{n} e^{-i\omega b_{n+1}/a_{n+1}} \to \phi_0(\omega), \tag{4.24}
\]

since $\phi(\omega/a_{n+1}) \to 1$. By Theorem 2.10 it follows from (4.23) and (4.24) that $a_{n+1}/a_n \to 1$ and $(b_{n+1} - b_n)/a_n \to b$.

(ii) For fixed $m \ge 1$ we have

\[
\left[\phi(\omega/a_n)\right]^{mn} e^{-i\omega m b_n/a_n} = \left[\phi(\omega/a_n)^n e^{-i\omega b_n/a_n}\right]^m \to \phi_0(\omega)^m.
\]

Again by Theorem 2.10 it follows that $a_{mn}/a_n \to c_m$, $(b_{mn} - mb_n)/a_n \to d_m$, where $c_m > 0$ and $d_m$ is real, while

\[
\phi_0(\omega) = \phi_0\!\left(\frac{\omega}{c_m}\right)^{m} e^{-i\omega d_m/c_m}
\]

or

\[
\phi_0(\omega)^m = \phi_0(c_m\omega)e^{i\omega d_m}.
\]

This shows that $\phi_0$ is stable.

Theorem 4.12. A c.f. $\phi$ is stable iff $\phi = e^\psi$, with

\[
\psi(\omega) = i\omega\gamma - c|\omega|^\alpha\left[1 + i\beta\,\frac{\omega}{|\omega|}\,\varpi(|\omega|, \alpha)\right] \tag{4.25}
\]

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    70/90

    Infinitely Divisible Distributions 61

where $\gamma$ is real, $c > 0$, $0 < \alpha \le 2$, $|\beta| \le 1$ and

\[
\varpi(|\omega|, \alpha) = \begin{cases} \tan\dfrac{\pi\alpha}{2} & \text{if } \alpha \ne 1\\[2mm] \dfrac{2}{\pi}\log|\omega| & \text{if } \alpha = 1. \end{cases} \tag{4.26}
\]

Here $\alpha$ is called the characteristic exponent of $\phi$.

Proof. (i) Suppose $\psi$ is given by (4.25) and (4.26). Then for $a > 0$ we have

\[
a\psi(\omega) - \psi(a^{1/\alpha}\omega) = i\omega\gamma(a - a^{1/\alpha}) - ac|\omega|^\alpha\, i\beta\,\frac{\omega}{|\omega|}\left[\varpi(|\omega|, \alpha) - \varpi(a^{1/\alpha}|\omega|, \alpha)\right]
\]
\[
= \begin{cases} i\omega\gamma(a - a^{1/\alpha}) & \text{if } \alpha \ne 1\\[1mm] i\omega\,\dfrac{2c\beta}{\pi}\,a\log a & \text{if } \alpha = 1. \end{cases}
\]

This shows that $\phi$ is stable.

(ii) Conversely, let $\phi_0$ be stable. Then by Theorem 4.11 it possesses a domain of attraction; that is, there exist a c.f. $\phi$ and real constants $a_n > 0$, $b_n$ such that as $n \to \infty$

\[
[\phi(\omega/a_n)e^{-i\omega b_n}]^n \to \phi_0(\omega).
\]

Therefore by Lemma 4.1,

\[
n[\phi(\omega/a_n)e^{-i\omega b_n} - 1] \to \psi(\omega)
\]

where $\phi_0 = e^\psi$. Let $F$ be the distribution corresponding to $\phi$. We first consider the case where $F$ is symmetric; then $b_n = 0$. Let $M_n\{dx\} = nx^2 F\{a_n\,dx\}$. Then by Theorem 4.5 it follows that there exists a Feller measure $M$ and a constant $b$ such that

\[
\psi(\omega) = i\omega b + \int\frac{e^{i\omega x} - 1 - i\omega\sin x}{x^2}\,M\{dx\}. \tag{4.27}
\]

Let

\[
U(x) = \int_{-x}^{x} y^2\,F\{dy\} \qquad (x > 0). \tag{4.28}
\]

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    71/90

    62 Topics in Probability

Then

\[
M_n\{(-x, x)\} = \frac{n}{a_n^2}\,U(a_n x) \to M\{(-x, x)\} \tag{4.29a}
\]
\[
n[1 - F(a_n x)] = \int_x^\infty y^{-2}\,M_n\{dy\} \to M^+(x) \tag{4.29b}
\]
\[
nF(-a_n x) = \int_{-\infty}^{-x+} y^{-2}\,M_n\{dy\} \to M^-(x). \tag{4.29c}
\]

By Theorem 4.11 we know that $a_n \to \infty$, $a_{n+1}/a_n \to 1$. Therefore $U(x)$ varies regularly at infinity and $M\{(-x, x)\} = Cx^{2-\alpha}$ where $C > 0$, $0 < \alpha \le 2$. If $\alpha = 2$ the measure $M$ is concentrated at the origin. If $0 < \alpha < 2$ the measure $M$ is absolutely continuous.

In the case where $F$ is unsymmetric we have

\[
n[1 - F(a_n x + a_n b_n)] \to M^+(x), \qquad nF(-a_n x + a_n b_n) \to M^-(x)
\]

and an analogous modification of (4.29a). However, it is easily seen that $b_n \to 0$, and so these results are fully equivalent to (4.29). Considering (4.29b) we see that either $M^+(x) \equiv 0$ or $1 - F(x)$ varies regularly at infinity and $M^+(x) = Ax^{-\alpha}$. Similarly $F(-x)$ and $1 - F(x) + F(-x)$ vary regularly at infinity, and the exponent $\alpha$ is the same for both $M^+$ and $M^-$. Clearly $0 < \alpha \le 2$.

If $M^+$ and $M^-$ vanish identically, then clearly $M$ is concentrated at the origin. Conversely, if $M$ has an atom at the origin, then a symmetrization argument shows that $M$ is concentrated at the origin, and $M^+$, $M^-$ vanish identically. Accordingly, when $\alpha < 2$ the measure $M$ is uniquely determined by its density, which is proportional to $|x|^{1-\alpha}$. For each interval $(-y, x)$ containing the origin we therefore obtain

\[
M\{(-y, x)\} = C(px^{2-\alpha} + qy^{2-\alpha}) \tag{4.30}
\]

where $p + q = 1$. For $\alpha = 2$, $M$ is concentrated at the origin. For $0 < \alpha < 2$ we have already shown in Sec. 4.4 that the measure (4.30) yields the required expression (4.25) for $\psi$.

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    72/90

    Infinitely Divisible Distributions 63

Corollary 4.3. If $G$ is the stable distribution with the characteristic exponent $\alpha$, then as $x \to \infty$

\[
x^\alpha[1 - G(x)] \to \frac{Cp(2 - \alpha)}{\alpha}, \qquad x^\alpha G(-x) \to \frac{Cq(2 - \alpha)}{\alpha}. \tag{4.31}
\]

Proof. Clearly, $G$ belongs to its own domain of attraction, with the norming constants $a_n = n^{1/\alpha}$. For $0 < \alpha < 2$, choosing $n^{1/\alpha}x = t$ in (4.29b) we find that $t^\alpha[1 - G(t)] \to Cp(2 - \alpha)/\alpha$ as $t \to \infty$. For $\alpha = 2$, $G$ is the normal distribution, and for it we have a stronger result, namely, $x^2[1 - G(x)] \to 0$ as $x \to \infty$.
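As a concrete illustration of (4.31) (using the closed form $G(x) = \operatorname{erfc}(1/\sqrt{2x})$ for the one-sided stable distribution with $\alpha = 1/2$, an assumed standard fact not stated in the text): here $x^{1/2}[1 - G(x)] \to \sqrt{2/\pi}$.

```python
import math

# assumed closed form for the one-sided alpha = 1/2 stable ("Levy") law:
# 1 - G(x) = erf(1/sqrt(2x))
tail = lambda x: math.erf(1.0 / math.sqrt(2.0 * x))

limit = math.sqrt(2.0 / math.pi)
vals = [math.sqrt(x) * tail(x) for x in (1e2, 1e4, 1e6)]

assert abs(vals[-1] - limit) < 1e-3
assert abs(vals[-1] - limit) < abs(vals[0] - limit)   # approach to the limit
```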

Theorem 4.13. (i) All stable distributions are absolutely continuous.

(ii) Let $0 < \alpha < 2$. Then moments of order $\delta < \alpha$ exist, while moments of order $\delta > \alpha$ do not.

Proof. (i) We have $|\phi(\omega)| = e^{-c|\omega|^\alpha}$, with $c > 0$. Since this function is integrable over $(-\infty, \infty)$, the result (i) follows by Theorem 2.6(b).

(ii) For $t > 0$ an integration by parts gives

\[
\int_{-t}^{t} |x|^\delta\,F\{dx\} = -t^\delta[1 - F(t) + F(-t)] + \delta\int_0^t x^{\delta - 1}[1 - F(x) + F(-x)]\,dx
\le \delta\int_0^t x^{\delta - 1}[1 - F(x) + F(-x)]\,dx.
\]

If $\delta < \alpha$, this last integral converges as $t \to \infty$, since by Corollary 4.3 we have $x^\alpha[1 - F(x) + F(-x)] \le M$ for $x > t_0$ where $t_0$ is large. It follows that the absolute moment (and therefore the ordinary moment) of order $\delta < \alpha$ is finite. Conversely, if the absolute moment of order $\delta > \alpha$ exists, then for $\epsilon > 0$ we have

\[
\epsilon > \int_{|x| > t} |x|^\delta\,F\{dx\} > t^\delta[1 - F(t) + F(-t)]
\]

for $t$ large, or $t^\alpha[1 - F(t) + F(-t)] < \epsilon t^{\alpha - \delta} \to 0$ as $t \to \infty$, which is a contradiction. Therefore absolute moments of order $\delta > \alpha$ do not exist.

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    73/90

    64 Topics in Probability

Remarks

(1) From the proof of Theorem 4.12 it is clear that

\[
\phi(\omega)^a = \phi(c_a\omega)e^{i\omega d_a}
\]

for all $a > 0$, and the functions $c_a$ and $d_a$ are given by

(i) $c_a = a^{1/\alpha}$ with $0 < \alpha \le 2$, and

(ii) $d_a = \begin{cases} \gamma(a - a^{1/\alpha}) & \text{if } \alpha \ne 1\\ (2c\beta/\pi)\,a\log a & \text{if } \alpha = 1. \end{cases}$

(2) If in the definition (4.21), $d_n \equiv 0$, then the distribution is called strictly stable. However, the distinction between strict and weak stability matters only when $\alpha = 1$, because when $\alpha \ne 1$ we can take $d_n = 0$ without loss of generality. To prove this we note that $d_n = \gamma(n - n^{1/\alpha})$ for $\alpha \ne 1$, and consider the c.f.

\[
\phi^*(\omega) = \phi(\omega)e^{-i\omega\gamma}.
\]

We have

\[
\phi^*(\omega)^n = \phi(\omega)^n e^{-in\omega\gamma} = \phi(c_n\omega)e^{i\omega(d_n - n\gamma)}
= \phi^*(c_n\omega)e^{i\omega(c_n\gamma + d_n - n\gamma)} = \phi^*(c_n\omega),
\]

since $c_n\gamma + d_n - n\gamma = 0$, which shows that $\phi^*$ is strictly stable.
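Remark (2) can be verified numerically (the parameters $\gamma$, $c$, $\alpha$, $\beta$ below are illustrative): with $\psi$ from (4.25), the centered c.f. $\phi^*(\omega) = e^{\psi(\omega) - i\gamma\omega}$ satisfies $\phi^*(\omega)^n = \phi^*(n^{1/\alpha}\omega)$.

```python
import cmath, math

g, c, al, b = 0.6, 1.1, 1.5, -0.4    # illustrative parameters, alpha != 1

def psi(w):
    s = 1.0 if w > 0 else -1.0
    return (1j * g * w
            - c * abs(w) ** al * (1 + 1j * b * s * math.tan(math.pi * al / 2)))

phi_star = lambda w: cmath.exp(psi(w) - 1j * g * w)

for n in (2, 3, 7):
    cn = n ** (1.0 / al)             # c_n = n**(1/alpha)
    for w in (0.8, -1.4):
        assert abs(phi_star(w) ** n - phi_star(cn * w)) < 1e-10
```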

(3) Let $\alpha \ne 1$ and assume that $\gamma = 0$. Then we can write

\[
\psi(\omega) = -a|\omega|^\alpha \ \text{for } \omega > 0, \quad\text{and}\quad -\bar{a}|\omega|^\alpha \ \text{for } \omega < 0, \tag{4.32}
\]

where $a$ is a complex constant. Choosing a scale so that $|a| = 1$ we can write $a = e^{i\pi\delta/2}$, where $\tan\frac{\pi\delta}{2} = \beta\tan\frac{\pi\alpha}{2}$. Since $|\beta| \le 1$ it follows that

\[
|\delta| \le \alpha \ \text{if } 0 < \alpha < 1, \quad\text{and}\quad |\delta| \le 2 - \alpha \ \text{if } 1 < \alpha < 2. \tag{4.33}
\]

Theorem 4.14. Let $\alpha \ne 1$ and let the c.f. of a stable distribution be expressed in the form

\[
\phi(\omega) = e^{-|\omega|^\alpha e^{\mp i\pi\delta/2}} \tag{4.34}
\]

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    74/90

    Infinitely Divisible Distributions 65

where in $\mp$ the upper sign prevails for $\omega > 0$ and the lower sign for $\omega < 0$. Let the corresponding density be denoted by $f(x; \alpha, \delta)$. Then

\[
f(-x; \alpha, \delta) = f(x; \alpha, -\delta) \quad\text{for } x > 0. \tag{4.35}
\]

For $x > 0$ and $0 < \alpha < 1$,

\[
f(x; \alpha, \delta) = \frac{1}{\pi x}\sum_{k=1}^{\infty}\frac{\Gamma(k\alpha + 1)}{k!}(-x^{-\alpha})^k \sin\frac{k\pi}{2}(\delta - \alpha) \tag{4.36}
\]

and for $x > 0$ and $1 < \alpha < 2$,

\[
f(x; \alpha, \delta) = \frac{1}{\pi x}\sum_{k=1}^{\infty}\frac{\Gamma(k\alpha^{-1} + 1)}{k!}(-x)^k \sin\frac{k\pi}{2\alpha}(\delta - \alpha). \tag{4.37}
\]

Corollary 4.4. A stable distribution is concentrated on $(0, \infty)$ if $0 < \alpha < 1$, $\delta = -\alpha$, and on $(-\infty, 0)$ if $0 < \alpha < 1$, $\delta = \alpha$.

Proofs are omitted.
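The series (4.36) can be checked in a case with a known closed form (the identification assumed here, not stated in the text, is that $\alpha = 1/2$, $\delta = -\alpha$ corresponds to the one-sided density $x^{-3/2}e^{-1/(4x)}/(2\sqrt{\pi})$, whose Laplace transform is $e^{-\sqrt{s}}$):

```python
import math

al, de = 0.5, -0.5                   # alpha = 1/2, delta = -alpha

def f_series(x, K=80):
    # partial sum of the series (4.36)
    s = 0.0
    for k in range(1, K + 1):
        s += (math.gamma(k * al + 1) / math.factorial(k)
              * (-x ** (-al)) ** k
              * math.sin(0.5 * k * math.pi * (de - al)))
    return s / (math.pi * x)

f_closed = lambda x: x ** (-1.5) * math.exp(-1.0 / (4.0 * x)) / (2.0 * math.sqrt(math.pi))

for x in (0.8, 2.0, 5.0):
    assert abs(f_series(x) - f_closed(x)) < 1e-8
```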

Theorem 4.15. (a) A distribution $F$ belongs to the domain of attraction of the normal distribution iff

\[
U(x) = \int_{-x}^{x} y^2\,F\{dy\} \tag{4.38}
\]

varies slowly.

(b) A distribution $F$ belongs to the domain of attraction of a stable distribution with characteristic exponent $\alpha < 2$ iff

\[
1 - F(x) + F(-x) \sim x^{-\alpha}L(x) \qquad (x \to \infty) \tag{4.39}
\]

and

\[
\frac{1 - F(x)}{1 - F(x) + F(-x)} \to p, \qquad \frac{F(-x)}{1 - F(x) + F(-x)} \to q \tag{4.40}
\]

where $p \ge 0$, $q \ge 0$ and $p + q = 1$. Here $L$ is a slowly varying function on $(0, \infty)$; that is, for each $x > 0$,

\[
\frac{L(tx)}{L(t)} \to 1 \quad\text{as } t \to \infty. \tag{4.41}
\]

The proof is omitted.
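A sketch of conditions (4.39)&ndash;(4.41) for a hypothetical two-sided tail (the choices $L(x) = \log x$, $p = 0.6$, $q = 0.4$, $\alpha = 1.2$ are illustrative assumptions):

```python
import math

# hypothetical tails: 1 - F(x) ~ 0.6 x**(-1.2) log x, F(-x) ~ 0.4 x**(-1.2) log x
L = math.log
right = lambda x: 0.6 * x ** (-1.2) * L(x)
left = lambda x: 0.4 * x ** (-1.2) * L(x)

# (4.41): L is slowly varying, L(t*x)/L(t) -> 1 for each fixed x
for xfac in (2.0, 10.0):
    assert abs(L(1e120 * xfac) / L(1e120) - 1.0) < 0.01

# (4.40): the balance conditions give p = 0.6, q = 0.4
t = 1e6
total = right(t) + left(t)
assert abs(right(t) / total - 0.6) < 1e-12
assert abs(left(t) / total - 0.4) < 1e-12
```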

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    75/90

    66 Topics in Probability

Theorem 4.16. Let $F$ be a proper distribution concentrated on $(0, \infty)$ and $F^{n*}$ the $n$-fold convolution of $F$ with itself. If $F^{n*}(a_n x) \to G(x)$, where $G$ is a non-degenerate distribution, then $G = G_\alpha$, the stable distribution concentrated on $(0, \infty)$, with exponent $\alpha$ $(0 < \alpha < 1)$; this is the case iff $1 - F(t) \sim t^{-\alpha}L(t)/\Gamma(1 - \alpha)$ $(t \to \infty)$ with $L$ slowly varying.

Proof. (i) Let $\omega(s)$ denote the Laplace transform (L.T.) of $F$; if $F^{n*}(a_n x) \to G(x)$, then $\omega(s/a_n)^n \to \gamma(s)$, the L.T. of $G$. It follows as in Theorem 4.12 that $\log\gamma(s) = -cs^\alpha$ $(s > 0)$, or $\gamma(s) = e^{-cs^\alpha}$, so that $G$ is the stable distribution with exponent $\alpha$. Here $0 < \alpha < 1$ since $G$ is non-degenerate.

(ii) Conversely, let $1 - F(t) \sim t^{-\alpha}L(t)/\Gamma(1 - \alpha)$ $(t \to \infty)$. This gives $1 - \omega(s) \sim s^\alpha L(1/s)$ $(s \to 0+)$. Let us choose constants $a_n$ so that $n[1 - F(a_n)] \to c/\Gamma(1 - \alpha)$ for $0 < c < \infty$. Then as $n \to \infty$,

\[
n a_n^{-\alpha} L(a_n) = \frac{a_n^{-\alpha}L(a_n)}{[1 - F(a_n)]\Gamma(1 - \alpha)} \cdot n[1 - F(a_n)]\Gamma(1 - \alpha) \to c
\]

and also

\[
n a_n^{-\alpha} L(a_n/s) = n a_n^{-\alpha} L(a_n)\,\frac{L(a_n/s)}{L(a_n)} \to c.
\]

Therefore $1 - \omega(s/a_n) \sim cs^\alpha/n$ and

\[
\omega(s/a_n)^n = [1 - cs^\alpha/n + o(1/n)]^n \to e^{-cs^\alpha}.
\]

This shows that $F^{n*}(a_n x) \to G_\alpha(x)$.
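The key asymptotic step $1 - \omega(s) \sim s^\alpha L(1/s)$ can be verified exactly for the hypothetical Pareto-type case $1 - F(t) = t^{-1/2}$ on $(1, \infty)$, so that $\alpha = 1/2$ and $L \equiv \Gamma(1/2) = \sqrt{\pi}$ (this example is an illustration, not from the text); integration by parts gives $1 - \omega(s) = (1 - e^{-s}) + \sqrt{s}\,\Gamma(1/2, s)$ with $\Gamma(1/2, s)$ the upper incomplete gamma function.

```python
import math

def upper_gamma_half(s):
    # Gamma(1/2, s) = sqrt(pi) - lower incomplete part (series, small s)
    lower = sum((-1) ** n * s ** (n + 0.5) / (math.factorial(n) * (n + 0.5))
                for n in range(10))
    return math.sqrt(math.pi) - lower

def one_minus_w(s):
    # exact L.T. tail for the density (1/2) x**(-3/2) on (1, oo)
    return (1.0 - math.exp(-s)) + math.sqrt(s) * upper_gamma_half(s)

# 1 - w(s) ~ s**alpha * L(1/s) = sqrt(pi*s) as s -> 0+
s0 = 1e-8
assert abs(one_minus_w(s0) / math.sqrt(math.pi * s0) - 1.0) < 1e-3

# with a_n = n**2, w(s/a_n)**n -> exp(-c*s**alpha) with c = sqrt(pi)
n, s = 10_000, 1.0
w_scaled = (1.0 - one_minus_w(s / n ** 2)) ** n
assert abs(w_scaled - math.exp(-math.sqrt(math.pi * s))) < 1e-3
```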

4.7. Problems for Solution

1. Show that if $F$ and $G$ are infinitely divisible distributions, so is their convolution $F * G$.

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    76/90

    Infinitely Divisible Distributions 67

2. If $\phi$ is an infinitely divisible c.f., prove that $|\phi|$ is also an infinitely divisible c.f.

3. Show that the uniform distribution is not infinitely divisible. More generally, a distribution concentrated on a finite interval is not infinitely divisible, unless it is concentrated at a point.

4. Let $0 < r_j < 1$ and $\sum r_j < \infty$. Prove that for arbitrary $a_j$ the infinite product

\[
\phi(\omega) = \prod_{j=1}^{\infty}\frac{1 - r_j}{1 - r_j e^{i\omega a_j}}
\]

converges, and represents an infinitely divisible c.f.

5. Let $X = \sum_{1}^{\infty} X_k/k$ where the random variables $X_k$ are independent and have the common density $\frac{1}{2}e^{-|x|}$. Show that $X$ is infinitely divisible, and find the associated Feller measure.

6. Let $P$ be an infinitely divisible p.g.f. and $\phi$ the c.f. of an arbitrary distribution. Show that $P(\phi)$ is an infinitely divisible c.f.

7. If $0 \le a < b < 1$ and $\phi$ is a c.f., then show that

\[
\frac{1 - b}{1 - a}\cdot\frac{1 - a\phi(\omega)}{1 - b\phi(\omega)}
\]

is an infinitely divisible c.f.

8. Prove that a probability distribution with a completely monotone density is infinitely divisible.

9. Mixtures of exponential (geometric) distributions. Let

\[
f(x) = \sum_{k=1}^{n} p_k\lambda_k e^{-\lambda_k x}
\]

where $p_k > 0$, $\sum p_k = 1$ and for definiteness $0 < \lambda_1 < \lambda_2 < \cdots < \lambda_n$. Show that the density $f(x)$ is infinitely divisible. (Similarly a mixture of geometric distributions is infinitely divisible.) By a limit argument prove that the density

\[
f(x) = \int_0^\infty \lambda e^{-\lambda x}\,G\{d\lambda\},
\]

where $G$ is a distribution concentrated on $(0, \infty)$, is infinitely divisible.

  • 7/29/2019 [Www.gfxmad.me] 9814335479 Probabilit

    77/90

    68 Topics in Probability

10. If $X, Y$ are two independent random variables such that $X > 0$ and $Y$ has an exponential density, then prove that $XY$ is infinitely divisible.

11. Show that a c.f. $\phi$ is stable if and only if given $c' > 0$, $c'' > 0$ there exist constants $c > 0$, $d$ such that

\[
\phi(c'\omega)\phi(c''\omega) = \phi(c\omega)e^{i\omega d}.
\]

12. Let the c.f. $\phi$ be given by

\[
\log\phi(\omega) = \sum_{k=-\infty}^{\infty} 2^{-k}(\cos 2^k\omega - 1).
\]

Show that $\phi(\omega)^n = \phi(n\omega)$ for $n = 2, 4, 8, \ldots$, and that $\phi(\omega)$ is infinitely divisible, but not stable.

13. If $\phi(\omega)^2 = \phi(c\omega)$ and the variance is finite, show that $\phi(\omega)$ is stable (in fact normal).

14. If $\phi(\omega)^2 = \phi(a\omega)$ and $\phi(\omega)^3 = \phi(b\omega)$ with $a > 0$, $b > 0$, show that $\phi(\omega)$ is stable.

15. If $F$ and $G$ are stable with the same exponent $\alpha$, so is their convolution $F * G$.

16. If $X, Y$ are independent random variables such that $X$ is stable with exponent $\alpha$, while $Y$ is positive and stable with exponent $\beta$ $(\beta < 1)$, show that $XY^{1/\alpha}$ is stable with exponent $\alpha\beta$.

17. The Holtsmark distribution. Suppose that $n$ stars are distributed in the interval $(-n, n)$ on the real line, their locations $d_i$ $(i = 1, 2, \ldots, n)$ being independent r.v. with a