probability theory presentation 11
TRANSCRIPT
8/8/2019 Probability Theory Presentation 11
BST 401 Probability Theory
Xing Qiu, Ha Youn Lee
Department of Biostatistics and Computational Biology
University of Rochester
October 12, 2009
Qiu, Lee BST 401
Outline
1 Convergence of Sequence of Measurable Functions
Random Variables Review
I'll start with the simplest non-trivial probability space: $(\Omega_1, 2^{\Omega_1}, \mu_1)$, where $\Omega_1 = \{H, T\}$ and $\mu_1(\{H\}) = \mu_1(\{T\}) = \frac{1}{2}$.
I can define a sequence of random variables $X_1, X_2, \ldots$ on this probability space in this way:
$$X_n(\omega) = \begin{cases} 0, & \omega = H, \\ \frac{1}{n}, & \omega = T. \end{cases}$$
My point: different random variables are only different ways to assign numbers to outcomes. They change neither the underlying space nor the probability measure.
Bernoulli random variable review: $X \sim \mathrm{Bernoulli}(p)$ means $P(X = 1) = p$ and $P(X = 0) = 1 - p$. So it is more than just a coin-tossing distribution: it must send the two possible outcomes to the numbers 0 and 1.
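A minimal Python sketch of the construction above, assuming exact fractions and a dictionary for the measure. The names OMEGA, MU, X and law are illustrative and do not come from the slides. It shows that every X_n lives on the same two-point space and only the assigned numbers change.

```python
from fractions import Fraction

# The two-point space from the slide: outcomes H and T, each with mass 1/2.
OMEGA = ("H", "T")
MU = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

def X(n):
    """Return the random variable X_n as a plain function on OMEGA."""
    def X_n(omega):
        return Fraction(0) if omega == "H" else Fraction(1, n)
    return X_n

def law(rv):
    """Distribution of rv: total mass that MU sends to each value."""
    dist = {}
    for omega in OMEGA:
        value = rv(omega)
        dist[value] = dist.get(value, Fraction(0)) + MU[omega]
    return dist

# The space and the measure never change; only the values do.
for n in (1, 2, 10):
    print(n, law(X(n)))
```

For every fixed outcome the sequence X_1(omega), X_2(omega), ... is an ordinary sequence of real numbers, here converging to 0.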
Random Variables on the Same Space
Let $Y$ be the casino r.v.: $P(Y = 1) = q$, $P(Y = -1) = 1 - q$. $Y$ is not a Bernoulli r.v.!
The following examples show that $X$ and $Y$ can be defined on the same probability space: a) $Y = 2X - 1$ ($q = p$); b) $Y = 1 - 2X$ ($q = 1 - p$); c) $Y \equiv 1$ ($q = 1$); d) $Y \equiv -1$ ($q = 0$).
If $X_1 \sim \mathrm{Bernoulli}(\frac{1}{2})$ and $X_2 \sim \mathrm{Bernoulli}(\frac{1}{3})$, they cannot both be defined on the same two-point probability space!
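The four examples can be checked numerically. The sketch below assumes a two-point space with mass p on T, the value p = 1/3, and the convention X(T) = 1. All of these are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

p = Fraction(1, 3)                 # assumed value, for illustration only
MU = {"H": 1 - p, "T": p}          # two-point space with mu({T}) = p

def X(omega):
    """X ~ Bernoulli(p): sends T to 1 and H to 0 (assumed convention)."""
    return 1 if omega == "T" else 0

# The four casino variables a) to d), all functions on the same space as X.
examples = {
    "a": lambda w: 2 * X(w) - 1,
    "b": lambda w: 1 - 2 * X(w),
    "c": lambda w: 1,
    "d": lambda w: -1,
}

def prob(rv, value):
    """Mass of the event {rv = value} under MU."""
    return sum((mass for w, mass in MU.items() if rv(w) == value), Fraction(0))

# q = P(Y = 1) for each example.
q = {name: prob(Y, 1) for name, Y in examples.items()}
print(q)
```

With these choices q comes out as p, 1 - p, 1 and 0 for a) to d), matching the values listed on the slide.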
Random Variables and the Product Space (I)
Let $X_1, X_2$ be two separate [1] yet identical Bernoulli r.v.s defined on $\Omega_1$ and $\Omega_2$, where $\Omega_2$ is just a copy of $\Omega_1$.
My point is, though $\Omega_2$ is a copy of $\Omega_1$ and $X_2$ assigns the same numbers to the same outcomes, $X_1 \neq X_2$ because they can take different values.
In probability theory, $X_1 = X_2$ is very strict. It means that a) $X_1$ and $X_2$ are defined on the same probability space; b) $X_1(\omega) = X_2(\omega)$ for all $\omega$.
In the same spirit, $X_n \xrightarrow{a.s.} X$ means that a) the $X_n$ are defined on the same probability space; b) they converge to $X$ almost surely.
[1] I cannot use the word "independent" here because I haven't defined it yet.
Random Variables and the Product Space (II)
The product space/measure is a way to connect the
otherwise separate probability spaces/random variables.
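Given $X_1$ on $\Omega_1$ and $X_2$ on $\Omega_2$, we may consider them as $\tilde{X}_1$ and $\tilde{X}_2$ on $\Omega_1 \times \Omega_2$ in this way:
$$\tilde{X}_1 : \Omega_1 \times \Omega_2 \to \mathbb{R}, \quad \tilde{X}_1(\omega_1, \omega_2) = X_1(\omega_1).$$
$$\tilde{X}_2 : \Omega_1 \times \Omega_2 \to \mathbb{R}, \quad \tilde{X}_2(\omega_1, \omega_2) = X_2(\omega_2).$$
We can do this for an infinite sequence of r.v.s. Let $X_1, X_2, \ldots$ be a sequence of r.v.s defined on separate probability spaces $\Omega_1, \Omega_2, \ldots$. The product space contains outcomes such as $(H, H, T, H, T, T, \ldots)$.
$$\tilde{X}_n : \prod_n \Omega_n \to \mathbb{R}, \quad \tilde{X}_n(\omega_1, \omega_2, \ldots) = X_n(\omega_n).$$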
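The lifting to the product space can be sketched in Python for a finite number of factors (the infinite product cannot be enumerated). The helper names lift and product_space, the fair coin, and the convention X(T) = 1 are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

OMEGA = ("H", "T")
MU = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

def X(omega):
    """A Bernoulli(1/2) r.v. on a single copy of the coin space."""
    return 1 if omega == "T" else 0

def lift(rv, k):
    """Lift rv from the k-th factor (0-based) to the product space."""
    return lambda point: rv(point[k])

def product_space(n):
    """All points of the n-fold product with their product-measure mass."""
    space = {}
    for point in product(OMEGA, repeat=n):
        mass = Fraction(1)
        for omega in point:
            mass *= MU[omega]
        space[point] = mass
    return space

space = product_space(2)
X1, X2 = lift(X, 0), lift(X, 1)

# Mass of the set where the two lifted variables disagree.
p_differ = sum((m for pt, m in space.items() if X1(pt) != X2(pt)), Fraction(0))
print(p_differ)
```

Once both variables are functions on the same product space, the set where they differ is a well-defined event, and here it has positive mass, so the lifted variables are not equal.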
Random Variables and the Product Space (III)
The point: $\tilde{X}_n$ is just $X_n$ defined on the product space, so we don't need to make any distinction in practice.
The $X_n$'s are defined on the same probability space now, so they can be compared.
Apparently $X_n \neq X_m$ in general. There is only one exception: $X_n(\omega_n) = X_m(\omega_m) = \text{const.}$ for all $\omega_n \in \Omega_n$ and $\omega_m \in \Omega_m$. This turns out to be the case for the strong law of large numbers (SLLN).
SLLN (without proof, just stating the conclusion for a special case): let $Z_n = \frac{1}{n} \sum_{i=1}^{n} X_i$. Demonstrate the behavior of $Z_n$ up to $n = 3$.
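The behavior of Z_n for small n can be worked out exactly by enumerating the product space. This is only a sketch: it assumes the X_i are fair-coin Bernoulli variables under the product measure, which the slide does not spell out, and the function name law_of_mean is hypothetical.

```python
from fractions import Fraction
from itertools import product

def law_of_mean(n, p=Fraction(1, 2)):
    """Exact distribution of Z_n = (X_1 + ... + X_n) / n for Bernoulli(p) flips."""
    dist = {}
    for point in product((0, 1), repeat=n):   # one point of the n-fold product
        mass = Fraction(1)
        for x in point:
            mass *= p if x == 1 else 1 - p    # product measure
        z = Fraction(sum(point), n)
        dist[z] = dist.get(z, Fraction(0)) + mass
    return dist

for n in (1, 2, 3):
    print(n, sorted(law_of_mean(n).items()))
```

For n = 1 the mass sits on 0 and 1; by n = 3 it already spreads over 0, 1/3, 2/3 and 1, with most of the mass on the two middle values.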
About the Homework
Back to homework #6, problem 1. It asks you to prove a.s. convergence. Without any handy theorems/tools, you must start from scratch, that is, prove that for almost every $\omega$, $X_1(\omega), X_2(\omega), \ldots$ converges as a sequence of real numbers.
Homework #7, problem 2. Limits are defined for . You must use countably many set operations on rectangles (essentially finite-dimensional rectangles) to define those sets.
Convergence in measure/probability
$f_n \xrightarrow{\mu} f$ iff $\forall \epsilon > 0$, $\mu(\{\omega : |f_n(\omega) - f(\omega)| \geq \epsilon\}) \to 0$.
Convergence in measure says that the measure of the set of points where $f_n$ is still far from $f$ shrinks to zero. Or in probability theory: the probability of seeing an outlier (an $\omega$ such that $|f_n(\omega) - f(\omega)| > \epsilon$) decreases to zero.
It looks awfully like a.e. convergence! Counterexample: shrinking but bouncy indicators.
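One common way to realize "shrinking but bouncy indicators" is the so-called typewriter sequence on [0, 1) with Lebesgue measure. The sketch below assumes that this is the intended counterexample; the indexing scheme and function names are illustrative.

```python
from fractions import Fraction

def interval(n):
    """n-th interval of the typewriter sequence on [0, 1), for n >= 1.

    Block k (k = 0, 1, 2, ...) holds the 2**k dyadic intervals of length 2**-k.
    """
    k = n.bit_length() - 1          # block index: 2**k <= n < 2**(k + 1)
    j = n - 2 ** k                  # position inside the block
    width = Fraction(1, 2 ** k)
    return j * width, (j + 1) * width

def f(n, omega):
    """Indicator of the n-th interval, evaluated at omega in [0, 1)."""
    left, right = interval(n)
    return 1 if left <= omega < right else 0

def measure_of_bad_set(n):
    """Lebesgue measure of {omega : |f_n(omega) - 0| >= eps}, for 0 < eps <= 1."""
    left, right = interval(n)
    return right - left

omega = Fraction(1, 3)
hits = [n for n in range(1, 2 ** 10) if f(n, omega) == 1]
print(measure_of_bad_set(2 ** 10), hits)
```

The measure of the bad set shrinks like 2**-k, so f_n converges to 0 in measure. Yet each block of intervals covers [0, 1) once, so every omega is hit once per block and f_n(omega) equals 1 infinitely often: the sequence converges at no point.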
Weak Convergence of Measures
All the convergences we have defined so far are convergences of measurable functions/random variables w.r.t. a fixed probability measure.
In the laws of large numbers, we need convergence in probability (weak law) or even a.e. convergence (strong law). But in the CLT, we are satisfied by knowing that the resulting distribution is normal; we don't really care about pointwise convergence.
This leads us to consider a totally different kind of convergence: a convergence of distributions/measures instead of a convergence of random variables.
Definition
Let $P_1, P_2, \ldots$ and $P$ be probability measures on $\mathbb{R}$, with distribution functions $F_1, F_2, \ldots$ and $F$. $P_n \xrightarrow{w} P$ iff any one of the following equivalent conditions holds:
$F_n(x) \to F(x)$ for all continuity points $x$ of $F$ (including $\pm\infty$). (Durrett book definition)
$P_n(A) \to P(A)$ for all continuity sets $A$ of $P$, which are sets such that $P(\partial A) = 0$.
$\int f \, dP_n \to \int f \, dP$ for all bounded, continuous functions $f$.
Several other criteria. See Thm 2.8.1 in Ash's book.
We say a sequence of r.v.s $X_1, X_2, \ldots$ converges weakly (converges in distribution) to $X$ if the distribution functions $F_n$ associated with $X_n$ converge weakly to that of $X$.
This topic will be re-studied in the CLT chapter.
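A small example shows why the first condition is restricted to continuity points. It is not from the slides: it assumes X_n is the constant 1/n and X is the constant 0, so P_n is the point mass at 1/n and P is the point mass at 0.

```python
import math

def F_n(n, x):
    """Distribution function of the point mass at 1/n."""
    return 1.0 if x >= 1.0 / n else 0.0

def F(x):
    """Distribution function of the point mass at 0."""
    return 1.0 if x >= 0 else 0.0

def agrees_for_large_n(x, n=10**6):
    """Check F_n(x) == F(x) at one large n (a spot check, not a proof)."""
    return F_n(n, x) == F(x)

def integral_against_P_n(f, n):
    """Integral of f with respect to the point mass at 1/n, i.e. f(1/n)."""
    return f(1.0 / n)

print(agrees_for_large_n(0.01), agrees_for_large_n(-0.01), agrees_for_large_n(0.0))
print(integral_against_P_n(math.cos, 10**6), math.cos(0.0))
```

F_n(0) stays at 0 for every n while F(0) = 1, but 0 is the only discontinuity of F, so the sequence still converges weakly. The bounded continuous test function cos illustrates the third condition for the same example.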
Relations Between Different Convergences
$L^p$ convergence implies convergence in measure.
If $\mu$ is a probability measure, a.e. convergence implies convergence in measure.
For finite measures (probabilities), $L^\infty$ convergence implies $L^p$ convergence; $L^p$ convergence implies $L^{p'}$ convergence if $p > p'$ (a homework problem).
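The first and third statements follow from two standard one-line bounds, sketched here under the assumption $1 \le p < \infty$. The first is the Markov (Chebyshev) inequality, the second uses only that $\mu(\Omega)$ is finite. The step from $L^p$ to $L^{p'}$ is left out since it is the homework problem.

```latex
% L^p convergence implies convergence in measure (Markov/Chebyshev):
\mu\bigl(\{\omega : |f_n(\omega) - f(\omega)| \ge \epsilon\}\bigr)
  \le \frac{1}{\epsilon^{p}} \int |f_n - f|^{p} \, d\mu
  = \frac{\|f_n - f\|_{p}^{p}}{\epsilon^{p}} \longrightarrow 0 .

% L^\infty convergence implies L^p convergence when \mu(\Omega) < \infty:
\|f_n - f\|_{p}
  = \Bigl( \int |f_n - f|^{p} \, d\mu \Bigr)^{1/p}
  \le \mu(\Omega)^{1/p} \, \|f_n - f\|_{\infty} \longrightarrow 0 .
```

The second bound fails for infinite measures, which is why the slide restricts that statement to finite measures.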