

Discrete Random Variables
Chapter 3 – Lecture 12

Yiren Ding

Shanghai Qibao Dwight High School

April 4, 2016

Yiren Ding Discrete Random Variables 1 / 14


Outline

1. Independence of Random Variables
   - Cumulative Distribution Function
   - Definition of Independence

2. Alternative Definition
   - Corollaries

3. Expected Value and Variance
   - Independence and Expected Value
   - Independence and Variance
   - Square-root Law

4. Convolution Rule
   - Example


Independence of Random Variables – Cumulative Distribution Function

Definition 1 (Cumulative Distribution Function).

The cumulative distribution function (CDF) of a discrete random variable X is defined by

F_X(x) = P(X ≤ x) = ∑_{t ≤ x} P(X = t).

The function F_X(x) tells you the probability that the random variable X takes on values at most as extreme as x.

For example, the median of a random variable X can be defined more precisely as the value m = (x_1 + x_2)/2, where

x_1 = max{x : F_X(x) ≤ 0.5},  x_2 = min{x : F_X(x) ≥ 0.5}.
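As a small sanity check, here is a sketch of Definition 1 and the median formula applied to a fair six-sided die (the die and the exact-fraction style are my choices, not from the lecture). Since F_X is a right-continuous step function, max{x : F_X(x) ≤ 0.5} over the real line is read here as a supremum, attained at the first support point where F_X jumps strictly above 0.5.

```python
from fractions import Fraction

# PMF of a fair six-sided die (illustrative choice, not from the lecture)
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
support = sorted(pmf)

def cdf(x):
    """F_X(x) = P(X <= x) = sum of P(X = t) over t <= x (Definition 1)."""
    return sum(p for t, p in pmf.items() if t <= x)

half = Fraction(1, 2)
# F_X is a step function, so sup{x : F_X(x) <= 1/2} over the reals is
# the first support point where F_X jumps strictly above 1/2.
x1 = min(x for x in support if cdf(x) > half)
x2 = min(x for x in support if cdf(x) >= half)
median = Fraction(x1 + x2, 2)
print(median)  # 7/2, i.e. the familiar 3.5 for a fair die
```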


Independence of Random Variables – Definition of Independence

Definition 2 (Independence of Random Variables).

Let X and Y be two random variables (discrete or continuous) defined on the same sample space with probability measure P, and let F_X and F_Y denote their CDFs, respectively. The random variables X and Y are said to be independent if

P(X ≤ x, Y ≤ y) = F_X(x) F_Y(y)

for any x and y, where P(X ≤ x, Y ≤ y) denotes the probability that both events {X ≤ x} and {Y ≤ y} occur.

In words, the random variables X and Y are independent if the event that X takes on a value at most x and the event that Y takes on a value at most y are independent for all x and y.

It is worth noting that this definition works for any random variable, not just discrete ones.
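Definition 2 can be checked by brute force in a small discrete model. The sketch below (my example, not from the lecture) takes two fair dice rolled independently and verifies that the joint CDF factors for every pair (x, y):

```python
from fractions import Fraction
from itertools import product

# Joint model: two fair dice rolled independently (illustrative example).
# Each outcome (a, b) has probability 1/36; X = first die, Y = second die.
outcomes = {(a, b): Fraction(1, 36) for a, b in product(range(1, 7), repeat=2)}

def F_X(x):
    return sum(p for (a, _), p in outcomes.items() if a <= x)

def F_Y(y):
    return sum(p for (_, b), p in outcomes.items() if b <= y)

def joint(x, y):
    """P(X <= x, Y <= y), the joint CDF."""
    return sum(p for (a, b), p in outcomes.items() if a <= x and b <= y)

# Definition 2: independence means the joint CDF factors for all x, y.
assert all(joint(x, y) == F_X(x) * F_Y(y)
           for x, y in product(range(1, 7), repeat=2))
print("X and Y are independent")
```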


Alternative Definition

Theorem 1 (Alternative Definition).

For any two sets A, B ⊆ ℝ, Definition 2 is equivalent to

P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B).

Proof. We first prove this for the special case where

A = {r : a_m ≤ r ≤ a_M} and B = {r : b_m ≤ r ≤ b_M}.

For convenience, let a X b denote the event a ≤ X ≤ b (an omitted bound means that side is unrestricted), and let P = P(X ∈ A, Y ∈ B). Then

P = P(a_m X a_M, b_m Y b_M) = P(X a_M, b_m Y b_M) − P(X a_m, b_m Y b_M)
  = P(X a_M, Y b_M) − P(X a_M, Y b_m) − P(X a_m, Y b_M) + P(X a_m, Y b_m)
  = F_X(a_M)F_Y(b_M) − F_X(a_M)F_Y(b_m) − F_X(a_m)F_Y(b_M) + F_X(a_m)F_Y(b_m)
  = (F_X(a_M) − F_X(a_m))(F_Y(b_M) − F_Y(b_m)) = P(a_m X a_M) P(b_m Y b_M)
  = P(X ∈ A) P(Y ∈ B).


Alternative Definition

Proof of Theorem 1 (continued)

Now, for arbitrary sets A and B, we can always write

A = ⋃_i A_i and B = ⋃_j B_j,

where the A_i and the B_j are pairwise disjoint subsets of the form {r : α ≤ r ≤ β}, for some α, β ∈ ℝ.

Therefore, by the axioms of probability theory (countable additivity over the disjoint pieces), we have

P = P(X ∈ ⋃_i A_i, Y ∈ ⋃_j B_j) = ∑_i P(X ∈ A_i, Y ∈ ⋃_j B_j)
  = ∑_i ∑_j P(X ∈ A_i, Y ∈ B_j) = ∑_i ∑_j P(X ∈ A_i) P(Y ∈ B_j)
  = ∑_j P(X ∈ A) P(Y ∈ B_j) = P(X ∈ A) P(Y ∈ B).


Alternative Definition – Corollaries

Corollary 1.

If X and Y are independent random variables, then the random variables f(X) and g(Y) are independent for any two functions f and g.

Proof. Suppose that X and Y are independent. Let A′ and B′ be any two subsets of ℝ, and let A = f⁻¹(A′) and B = g⁻¹(B′) denote their preimages, so that {f(X) ∈ A′} = {X ∈ A} and {g(Y) ∈ B′} = {Y ∈ B}.

Since X and Y are independent, by Theorem 1:

P(f(X) ∈ A′, g(Y) ∈ B′) = P(X ∈ A, Y ∈ B)
                        = P(X ∈ A) P(Y ∈ B)
                        = P(f(X) ∈ A′) P(g(Y) ∈ B′).

Hence f(X) and g(Y) are also independent by Theorem 1.
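Corollary 1 can also be verified numerically in a finite model. Below, X and Y are two independent fair dice, and f(x) = x² and g(y) = y mod 2 are my own arbitrary choices of functions; every joint event {f(X) = a, g(Y) = b} factors:

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 36)
pairs = list(product(range(1, 7), repeat=2))  # independent fair dice (X, Y)

def prob(event):
    """Probability of the set of (x, y) outcomes satisfying `event`."""
    return sum(p for x, y in pairs if event(x, y))

# Check P(f(X) = a, g(Y) = b) = P(f(X) = a) P(g(Y) = b) for all a, b,
# with f(x) = x**2 and g(y) = y % 2 (my illustrative choices).
for a in sorted({x * x for x in range(1, 7)}):
    for b in (0, 1):
        lhs = prob(lambda x, y: x * x == a and y % 2 == b)
        rhs = prob(lambda x, y: x * x == a) * prob(lambda x, y: y % 2 == b)
        assert lhs == rhs
print("f(X) and g(Y) are independent")
```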


Alternative Definition – Corollaries

Corollary 2 (Independence of Discrete Random Variables).

Discrete random variables X and Y are independent if and only if

P(X = x, Y = y) = P(X = x) P(Y = y) for all x, y.

Proof. This is the easiest proof in the world.

Simply let A = {x} and B = {y} in Theorem 1 to get the forward direction; conversely, if the point probabilities factor, summing them over t ≤ x and s ≤ y recovers Definition 2, and we're done.


Expected Value and Variance – Independence and Expected Value

Theorem 2.

If the random variables X and Y are independent, then

E(XY) = E(X) E(Y),

assuming that E(X) and E(Y) exist and are finite.

Proof. Let I and J denote the ranges of X and Y, respectively, and define the random variable Z by Z = XY.

Then we have

E(Z) = ∑_z z P(Z = z) = ∑_z z ∑_{xy=z} P(X = x, Y = y)
     = ∑_z ∑_{xy=z} xy P(X = x, Y = y).


Expected Value and Variance – Independence and Expected Value

Proof of Theorem 2 (continued)

Since X and Y are independent, by Corollary 2,

E(Z) = ∑_{x,y} xy P(X = x, Y = y) = ∑_{x,y} xy P(X = x) P(Y = y)
     = ∑_{x∈I} x P(X = x) ∑_{y∈J} y P(Y = y) = E(X) E(Y).

Note that the converse of this theorem is not true!

For example, suppose two fair dice are tossed. Denote by the random variable V1 the number appearing on the first die and by V2 the number appearing on the second die.

Let X = V1 + V2 and Y = V1 − V2. It is obvious that X and Y are not independent. (Why?) Verify that E(XY) = E(X)E(Y).
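The suggested verification can be carried out by enumerating all 36 equally likely outcomes; this sketch also exhibits one pair (x, y) where the pmf fails to factor, confirming non-independence via Corollary 2:

```python
from fractions import Fraction
from itertools import product

# Two fair dice V1, V2; X = V1 + V2 and Y = V1 - V2, as in the example
p = Fraction(1, 36)  # each of the 36 outcomes is equally likely
outcomes = list(product(range(1, 7), repeat=2))

E_X = sum((v1 + v2) * p for v1, v2 in outcomes)
E_Y = sum((v1 - v2) * p for v1, v2 in outcomes)
E_XY = sum((v1 + v2) * (v1 - v2) * p for v1, v2 in outcomes)

# E(XY) = E(X)E(Y) holds: E(XY) = E(V1^2 - V2^2) = 0 and E(Y) = 0 ...
assert E_XY == E_X * E_Y
print(E_X, E_Y, E_XY)  # 7 0 0

# ... yet X and Y are not independent: the pmf fails to factor (Corollary 2),
# e.g. P(X = 12, Y = 0) = 1/36 while P(X = 12) P(Y = 0) = (1/36)(1/6).
P_both = sum(p for v1, v2 in outcomes if v1 + v2 == 12 and v1 - v2 == 0)
P_X12 = sum(p for v1, v2 in outcomes if v1 + v2 == 12)
P_Y0 = sum(p for v1, v2 in outcomes if v1 - v2 == 0)
assert P_both != P_X12 * P_Y0
```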


Expected Value and Variance – Independence and Variance

Theorem 3.

If the random variables X and Y are independent, then

var(X + Y) = var(X) + var(Y).

Proof. Let µ_X = E(X) and µ_Y = E(Y). We have

var(X + Y) = E((X + Y)²) − (µ_X + µ_Y)²
           = E(X² + 2XY + Y²) − µ_X² − 2µ_X µ_Y − µ_Y²
           = E(X²) + 2µ_X µ_Y + E(Y²) − µ_X² − 2µ_X µ_Y − µ_Y²
           = var(X) + var(Y),

where E(2XY) = 2µ_X µ_Y by Theorem 2, so the cross terms cancel.

More generally, if X1, X2, ..., Xn are independent,

var(X1 + X2 + · · · + Xn) = var(X1) + var(X2) + · · · + var(Xn).
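Theorem 3 can be confirmed exactly for two independent fair dice (my example, in exact fractions) by building the pmf of X + Y and comparing variances:

```python
from fractions import Fraction
from itertools import product

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # one fair die

def var(p):
    """Variance of a pmf given as {value: probability}."""
    mu = sum(x * q for x, q in p.items())
    return sum((x - mu) ** 2 * q for x, q in p.items())

# Exact pmf of X + Y for two independent dice
sum_pmf = {}
for (a, pa), (b, pb) in product(pmf.items(), repeat=2):
    sum_pmf[a + b] = sum_pmf.get(a + b, 0) + pa * pb

assert var(sum_pmf) == var(pmf) + var(pmf)  # Theorem 3
print(var(pmf))  # 35/12
```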


Expected Value and Variance – Square-root Law

Corollary 3.

If the random variables X1, X2, ..., Xn are i.i.d. (independent and identically distributed) with standard deviation σ, then the standard deviation of the sum X1 + X2 + · · · + Xn is given by

σ(X1 + X2 + · · · + Xn) = σ√n.

This is one of the most important results used in statistics. It is generally stated as the famous square-root law:

σ((X1 + X2 + · · · + Xn)/n) = σ/√n.

The term (X1 + X2 + · · · + Xn)/n is the sample mean of n observations, and is itself a random variable. The Central Limit Theorem is closely associated with this random variable.
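A quick Monte Carlo check of the square-root law (the die example, sample size, and trial count are my own choices): the standard deviation of the sample mean of n = 100 die rolls should be close to σ/√n = σ/10.

```python
import random
import statistics

random.seed(0)  # reproducibility
n, trials = 100, 20_000

sigma = statistics.pstdev(range(1, 7))  # sd of one fair die, sqrt(35/12)

# Empirical sd of the sample mean of n i.i.d. rolls, over many trials
means = [statistics.fmean(random.randint(1, 6) for _ in range(n))
         for _ in range(trials)]
sd_of_mean = statistics.pstdev(means)

print(round(sd_of_mean, 3), round(sigma / n ** 0.5, 3))  # both close to 0.171
```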


Convolution Rule

Theorem 4 (Convolution Rule).

If discrete random variables X and Y have the set of nonnegative integers as their range of possible values, and are independent, then

P(X + Y = k) = ∑_{j=0}^{k} P(X = j) P(Y = k − j), for k = 0, 1, ....

Proof. By the definition of X + Y and independence,

P(X + Y = k) = ∑_{x+y=k} P(X = x, Y = y)
             = ∑_{j=0}^{k} P(X = j, Y = k − j)
             = ∑_{j=0}^{k} P(X = j) P(Y = k − j).
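As a concrete check of Theorem 4, this sketch convolves the pmfs of two independent fair dice (my example; the dice take values 1..6 rather than 0, 1, ..., but the out-of-range terms simply contribute zero) and compares against the well-known triangular distribution of the sum:

```python
from fractions import Fraction

p = {j: Fraction(1, 6) for j in range(1, 7)}  # pmf of one fair die

def conv(k):
    """P(X + Y = k) via the convolution sum of Theorem 4."""
    return sum(p.get(j, 0) * p.get(k - j, 0) for j in range(k + 1))

# The sum of two fair dice has the triangular pmf (6 - |k - 7|)/36
assert all(conv(k) == Fraction(6 - abs(k - 7), 36) for k in range(2, 13))
print(conv(7))  # 1/6
```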


Convolution Rule – Example

Example 1.

Suppose the random variables X and Y are independent and have Poisson distributions with respective means λ and µ. What is the probability distribution of X + Y?

By the convolution rule and the binomial theorem,

P(X + Y = k) = ∑_{j=0}^{k} (e^{−λ} λ^j / j!) (e^{−µ} µ^{k−j} / (k−j)!)
             = (e^{−(λ+µ)} / k!) ∑_{j=0}^{k} (k choose j) λ^j µ^{k−j}
             = e^{−(λ+µ)} (λ + µ)^k / k!, for k = 0, 1, ....

Hence, X + Y is also Poisson distributed with mean λ+ µ.
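The closed form can be double-checked numerically (λ = 2 and µ = 3 are my illustrative values): the convolution of the two Poisson pmfs should match the Poisson(λ + µ) pmf term by term.

```python
import math

def poisson_pmf(mean, k):
    """P(N = k) for N ~ Poisson(mean)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

lam, mu = 2.0, 3.0  # illustrative means (my choice)

# Convolution rule applied to the two Poisson pmfs reproduces Poisson(lam + mu)
for k in range(20):
    conv = sum(poisson_pmf(lam, j) * poisson_pmf(mu, k - j) for j in range(k + 1))
    assert math.isclose(conv, poisson_pmf(lam + mu, k))
print("X + Y ~ Poisson(lambda + mu)")
```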
