Math 6710 lecture notes

    Nate Eldredge

    November 29, 2012

Caution! These lecture notes are very rough. They are mainly intended for my own use during lecture. They almost surely contain errors and typos, and tend to be written in a stream-of-consciousness style. In many cases details, precise statements, and proofs are left to Durrett's text, homework or presentations. But perhaps these notes will be useful as a reminder of what was done in lecture. If I do something that is substantially different from Durrett I will put it in here.

    Thursday, August 23

Overview: Probability: study of randomness. Questions whose answer is not "yes" or "no" but a number indicating probability of "yes". Basic courses: divide into discrete and continuous, and are mainly restricted to finite or short-term problems - involving a finite number of events or random variables. To escape these bounds: measure theory (introduced to probability by Kolmogorov). Unify discrete/continuous, and enable the study of long-term or limiting properties. Ironically, many theorems will be about how randomness disappears or is constrained in the limit.

    1 Intro

1. Basic objects of probability: events and their probabilities, combining events with logical operations, random variables: numerical quantities, statements about them are events. Expected values.

2. Table of measure theory objects: measure space, measurable functions, almost everywhere, integral. Recall definitions. Ex: Borel/Lebesgue measure on R^n.

3. What's the correspondence? Sample space Ω: all possible outcomes of an experiment, or "states of the world". Events: set of outcomes corresponding to "event happened". σ-field F: all "reasonable" events ↔ measurable sets. Logic operations correspond to set operations. Random variables: measurable functions, so a statement about one, i.e. the pullback of a Borel set, is an event.

    4. 2 coin flip example.

    5. Fill in other half of the table. Notation matchup.

    6. Suppressing the sample space
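The two-coin-flip example above can be sketched concretely (a minimal illustration; the names Omega, P, and X are my own, not notation from the notes):

```python
from itertools import product

# A concrete sketch of the two-coin-flip example: finite sample space,
# power-set sigma-field (implicit), uniform probability measure.
Omega = list(product(["H", "T"], repeat=2))   # sample space: 4 outcomes

def P(event):
    # Probability of an event (a subset of Omega) under the uniform
    # measure: each of the 4 outcomes has probability 1/4.
    return len(event) / len(Omega)

# A random variable is a measurable function on Omega, e.g. number of heads.
def X(omega):
    return omega.count("H")

# The statement "X = 1" is the event X^{-1}({1}), a subset of Omega.
event_one_head = [w for w in Omega if X(w) == 1]
assert P(event_one_head) == 0.5   # HT and TH: 2 of the 4 outcomes
```

Here the logical statement "exactly one head" literally is a set of outcomes, matching the events ↔ measurable sets correspondence in the table.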

    Tuesday, August 28


2 Random variables

    2.1 Random vectors

Definition 2.1. An n-dimensional random vector is a measurable map X : Ω → R^n. (As usual R^n is equipped with its Borel σ-field.)

Fact: If X has components X = (X1, . . . , Xn) then X is a random vector iff the Xi are random variables.

More generally we can consider measurable functions X from Ω to any other set S equipped with a σ-field 𝒮 (a measurable space). Such an X could be called an S-valued random variable. Examples:

Set (group) of permutations of some set. For instance when we shuffle a deck of cards, we get a random permutation, which could be considered an S_52-valued random variable.

A manifold with its Borel σ-field. Picking random points on the manifold.

Function spaces. E.g. C([0, 1]) with its Borel σ-field. Brownian motion is a random continuous path which could be viewed as a C([0, 1])-valued random variable.
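The card-shuffling example can be simulated directly (an illustrative sketch; the function name and seed are my own choices):

```python
import random

# Sketch of an S_52-valued random variable: shuffling a deck produces a
# uniformly random element of the permutation group of {0, ..., 51}.
rng = random.Random(0)  # fixed seed so the example is reproducible

def random_permutation(n=52):
    # Sample a uniformly random permutation of range(n), i.e. an element
    # of the symmetric group S_n, via Fisher-Yates (random.shuffle).
    deck = list(range(n))
    rng.shuffle(deck)
    return tuple(deck)

sigma = random_permutation()
# sigma is a bijection of {0, ..., 51}: same elements, reordered.
assert sorted(sigma) == list(range(52))
```

The "distribution" of this random variable is a probability measure on the finite set S_52 rather than on R, which is why nothing about arithmetic or ordering on R is needed.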

Many results that don't use the structures of R^n (e.g. arithmetic, ordering, topology) in any obvious way will extend to any S-valued random variable. In some cases problems can arise if the σ-field 𝒮 is bad. However, good examples are Polish spaces (complete separable metric spaces) with their Borel σ-fields. It turns out that these measurable spaces behave just like R, and so statements proved for R (as a measurable space) work for them as well.

    2.2 Sequences of random variables

    If X1, X2, . . . is a sequence of random variables then:

lim sup_n Xn and lim inf_n Xn are random variables. (To prove, use the fact that X is a random variable, i.e. measurable, iff {X < a} = X^{-1}((−∞, a)) ∈ F for all a ∈ R. This is because the sets {(−∞, a)} generate B_R.)

{lim Xn exists} is an event. (It is the event {lim sup Xn = lim inf Xn}. Or argue directly.) If this event has probability 1 we say {Xn} converges almost surely, or Xn → X a.s. (In this case the limit X is a random variable because it equals the lim sup.)
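A numerical sketch of lim sup and lim inf along one sample path (my own illustration, not from the notes; the specific sequence Xn = Un/n is an assumption):

```python
import random

# Approximate lim sup Xn and lim inf Xn along one sample path by the
# sup and inf of a far-out tail of the sequence.
rng = random.Random(0)

# Xn = Un / n with Un uniform on [0, 1]: Xn -> 0 for every outcome, so
# lim sup Xn = lim inf Xn = 0, i.e. the event {lim Xn exists} occurs.
path = [rng.random() / n for n in range(1, 100001)]

tail = path[50000:]                       # terms with n > 50000
tail_sup, tail_inf = max(tail), min(tail)
assert tail_sup < 1e-4 and tail_inf < 1e-4   # both near the limit 0
```

Of course this only inspects one ω; the point of the measurability facts above is that statements like {lim sup Xn = lim inf Xn} are legitimate events with well-defined probabilities.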

    2.3 Distributions

Most important questions about X are: what values does it take on, with what probabilities? Here's an object that encodes that information.

Definition 2.2. The distribution (or law) of X is a probability measure μ on (R, B_R) defined by μ(B) = P(X ∈ B). In other notation μ = P ∘ X^{-1}. It is the pushforward of P onto R by the map X. We write X ∼ μ. Easy to check that μ is in fact a probability measure.

Note μ tells us the probability of every event that is a "question about X".

Note: Every probability measure μ on R arises as the distribution of some random variable on some probability space. Specifically: take Ω = R, F = B_R, P = μ, X(ω) = ω.


Example 2.3. μ = δ_c, a point mass at c (i.e. μ(A) = 1 if c ∈ A and 0 otherwise). Corresponds to a constant r.v. X = c a.s.

Example 2.4. Integer-valued (discrete) distributions: μ(B) = Σ_{n ∈ B ∩ Z} p(n) for some probability mass function p : Z → [0, 1] with Σ_n p(n) = 1. Such X takes on only integer values (almost surely) and has P(X = n) = p(n). Other notation: μ = Σ_n p(n) δ_n.

Bernoulli: p(1) = p, p(0) = 1 − p.

    Binomial, Poisson, geometric, etc.
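The pmf-to-measure recipe above can be checked empirically (a sketch; the specific pmf p and the use of random.choices are my own illustration):

```python
import random
from collections import Counter

# A probability mass function p on Z defining mu(B) = sum_{n in B} p(n).
p = {0: 0.5, 1: 0.3, 2: 0.2}
assert abs(sum(p.values()) - 1.0) < 1e-12   # p sums to 1

# Sample X with P(X = n) = p(n) and compare empirical frequencies to p.
rng = random.Random(42)
samples = rng.choices(list(p), weights=list(p.values()), k=100000)
freq = Counter(samples)
for n, pn in p.items():
    assert abs(freq[n] / len(samples) - pn) < 0.01   # empirical ~ p(n)
```

Each sampled value lands on an integer, so the resulting measure is a weighted sum of point masses δ_n exactly as in the "other notation" above.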

Example 2.5. Continuous distributions: μ(B) = ∫_B f dm for some f ≥ 0 with ∫_R f dm = 1. f is called the density of the measure μ. Radon–Nikodym theorem: μ is of this form iff m(B) = 0 implies μ(B) = 0.

Uniform distribution U(a, b): f = (1/(b − a)) 1_[a,b]

Normal distribution N(μ, σ²) (note: different μ): f(x) = (1/√(2πσ²)) e^{−(x−μ)²/(2σ²)}


    Exponential, gamma, chi-square, etc.

    2.4 Distribution function (CDF)

Definition 2.6. Associated to each probability measure μ on R (and hence each random variable X) is the (cumulative) distribution function F(x) := μ((−∞, x]). We write F_X for F where X ∼ μ; then F_X(x) = P(X ≤ x).

Fact: F is monotone increasing and right continuous; F(−∞) = 0, F(+∞) = 1. (*)

Proposition 2.7. F uniquely determines the measure μ.

We'll use the proof of this as an excuse to introduce a very useful measure-theoretic tool: Dynkin's π-λ lemma. We need two ad-hoc definitions.

Definition 2.8. Let Ω be any set. A collection P ⊆ 2^Ω is a π-system if for all A, B ∈ P we have A ∩ B ∈ P (i.e. closed under intersection). A collection L ⊆ 2^Ω is a λ-system if:

1. Ω ∈ L;

2. If A, B ∈ L with A ⊆ B then B \ A ∈ L (closed under "subset subtraction");

3. If An ∈ L where A1 ⊆ A2 ⊆ A3 ⊆ . . . and A = ∪ An (we write An ↑ A) then A ∈ L (closed under increasing unions).

Theorem 2.9. If P is a π-system, L is a λ-system, and P ⊆ L, then σ(P) ⊆ L.

Proof: See Durrett A.1.4. It's just some set manipulations and not too instructive. Here's how we'll prove our proposition:

Proof. Suppose μ, ν both have distribution function F. Let

P = {(−∞, a] : a ∈ R}
L = {B ∈ B_R : μ(B) = ν(B)}.

Notice that P is clearly a π-system and σ(P) = B_R. Also P ⊆ L by assumption, since μ((−∞, a]) = F(a) = ν((−∞, a]). So by Dynkin's lemma, if we can show L is a λ-system we are done.


1. Since μ, ν are both probability measures we have μ(R) = ν(R) = 1, so R ∈ L.

2. If A, B ∈ L with A ⊆ B, we have μ(B \ A) = μ(B) − μ(A) by additivity (B is the disjoint union of A and B \ A), and the same for ν. But since A, B ∈ L we have μ(B) = ν(B), μ(A) = ν(A). Thus μ(B \ A) = ν(B \ A).

3. If An ↑ A, then it follows from countable additivity that μ(A) = lim μ(An). (This is called "continuity from below" and can be seen by writing Bn = An \ An−1, so that A is the disjoint union of the Bn.) Likewise ν(A) = lim ν(An). But μ(An) = ν(An), so we must have μ(A) = ν(A).

This is a typical application. We want to show some property Q holds for all sets in some σ-field F. So we show that the collection of all sets with the property Q (whatever it may be) is a λ-system. Then we find a collection of sets P for which we know that Q holds, and show that it's a π-system which generates F.

A similar, simpler technique that you have probably used before: show that the collection of all sets with property Q is a σ-field. Then find a collection of sets for which Q holds and show that it generates F. However this won't work for this problem. In general, if μ, ν are two probability measures, the collection of all B with μ(B) = ν(B) is always a λ-system but need not be a σ-field.

Fact: Any F satisfying the conditions (*) above is in fact the cdf of some probability measure μ on R (which as we have shown is unique). Proof: Use the Carathéodory extension theorem (see appendix). Given F it's obvious how to define μ on a half-open interval (a, b] (as F(b) − F(a)) and hence on any finite disjoint union of intervals. Call this latter collection A; it's an algebra. Check that μ is countably additive on A. Carathéodory says that μ extends to a countably additive measure on σ(A), which is of course B_R. (The extension is constructed as an outer measure: μ(B) = inf{μ(A) : A ∈ A, B ⊆ A}.)
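One standard way to exhibit a random variable with a given cdf (a sketch, not spelled out in the notes) is the inverse transform: if U is uniform on (0, 1), then X = F^{-1}(U) has cdf F, and the resulting measure satisfies μ((a, b]) = F(b) − F(a). The exponential cdf used below is my own choice of example:

```python
import math
import random

# For the exponential cdf F(x) = 1 - exp(-x) on [0, infinity), the
# inverse is F^{-1}(u) = -log(1 - u); then X = F^{-1}(U) has cdf F.
F = lambda x: 1 - math.exp(-x) if x >= 0 else 0.0
F_inv = lambda u: -math.log(1 - u)

rng = random.Random(1)
samples = [F_inv(rng.random()) for _ in range(100000)]

# The distribution mu of X assigns mu((a, b]) = F(b) - F(a).
a, b = 0.5, 1.5
empirical = sum(a < x <= b for x in samples) / len(samples)
assert abs(empirical - (F(b) - F(a))) < 0.01   # matches F(b) - F(a)
```

This is the same idea as the canonical construction Ω = R, P = μ, X(ω) = ω, just realized on Ω = (0, 1) with Lebesgue measure instead.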

Notation: If two random variables X, Y have the same distribution, we say they are identically distributed and write X =^d Y.

    Thursday, August 30

    2.5 Joint distribution

If X is an n-dimensional random vector, its distribution is a probability measure μ on R^n defined in the same way. (It's possible to define multi-dimensional cdfs, but it's messier and I am going to avoid it.)

Given random variables X1, . . . , Xn, their joint distribution is the probability measure μ on R^n which is the distribution of the random vector (X1, . . . , Xn). This is not determined solely by the distributions of the Xi! Dumb example: X = ±1 a coin flip, Y = −X. Then both X, Y have the distribution μ = (1/2)δ_{−1} + (1/2)δ_1. But (X, X) does not have the same joint distribution as (X, Y). For instance, P((X, X) = (1, 1)) = 1/2 but P((X, Y) = (1, 1)) = 0.

Moral: The distribution of X will let you answer any question you have that is only about X itself. If you want to know how it interacts with another random variable then you need to know their joint distribution.
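The "dumb example" above can be run directly (an illustrative simulation; the sample size and seed are my own choices):

```python
import random

# X is a fair +/-1 coin flip and Y = -X: same marginal distribution
# mu = (1/2) delta_{-1} + (1/2) delta_1, but different joint behavior.
rng = random.Random(7)
N = 100000
xs = [rng.choice([-1, 1]) for _ in range(N)]
ys = [-x for x in xs]

# Same marginals: P(X = 1) = P(Y = 1) = 1/2 (up to sampling error).
assert abs(sum(x == 1 for x in xs) / N - 0.5) < 0.01
assert abs(sum(y == 1 for y in ys) / N - 0.5) < 0.01

# Different joints: P((X, X) = (1, 1)) = 1/2, but (X, Y) = (1, 1) never.
assert abs(sum((x, x) == (1, 1) for x in xs) / N - 0.5) < 0.01
assert sum((x, y) == (1, 1) for x, y in zip(xs, ys)) == 0
```

Knowing each marginal alone cannot distinguish the pairs (X, X) and (X, Y); only the joint distribution on R² does.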

    3 Expectation

    3.1 Definition, integrability

Expectation EX or E[X]. Defining: