

SDS 321: Introduction to Probability and Statistics

Lecture 21: Continuous random variables: max of two independent r.v.'s, iterated expectation

Purnamrita Sarkar
Department of Statistics and Data Science

The University of Texas at Austin

www.cs.cmu.edu/∼psarkar/teaching


Roadmap

- Function of two independent random variables
- Conditional expectation as a random variable, law of iterated expectations
- Useful inequalities and the normal approximation to the Binomial

Functions of two random variables

- If X and Y are both random variables, then Z = g(X, Y) is also a random variable.
- In the discrete case, we can easily find the PMF of the new random variable:

  p_Z(z) = Σ_{(x,y): g(x,y) = z} p_{X,Y}(x, y)

- For example, if I roll two fair dice, what is the probability that the sum is 6?
- Each possible ordered pair has probability 1/36.
- The pairs that sum to 6 are (1, 5), (2, 4), (3, 3), (4, 2), (5, 1), or in other words (k, 6 − k) for k = 1, ..., 5.
- So, p_Z(6) = 5/36. (A quick simulation below confirms this.)
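Here is a minimal Monte Carlo sketch (Python; not part of the original slides) estimating p_Z(6) for the two-dice example:

```python
import random

# Estimate P(X + Y = 6) for two fair dice; exact value is 5/36 ≈ 0.1389.
trials = 1_000_000
hits = sum(random.randint(1, 6) + random.randint(1, 6) == 6
           for _ in range(trials))
print(hits / trials)  # ≈ 0.139
```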


Functions of two independent continuous random variables

- I know two guests will both arrive in the next hour. The arrival times are independent random variables, each uniformly distributed over the hour. What is the PDF of the arrival time of the last guest to arrive?
- X ∼ Uniform([0, 1]) and Y ∼ Uniform([0, 1]).
- In other words, what is the PDF of Z = max(X, Y)?
- Let's first think about the CDF:

  P(Z ≤ z) = P({X ≤ z} ∩ {Y ≤ z}) = P(X ≤ z) P(Y ≤ z)
           = 0 for z < 0,  z² for 0 ≤ z ≤ 1,  1 for z > 1.

- Differentiating, f_Z(z) = 2z for 0 ≤ z ≤ 1, and 0 otherwise. (A simulation sketch follows below.)
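As a quick check, a small Monte Carlo sketch (Python; not from the slides) comparing the empirical CDF of Z = max(X, Y) with z²:

```python
import numpy as np

# Z = max(X, Y) for independent Uniform(0,1) X and Y
# should have CDF P(Z <= z) = z^2 on [0, 1].
rng = np.random.default_rng(0)
n = 1_000_000
z = np.maximum(rng.uniform(size=n), rng.uniform(size=n))
for t in (0.25, 0.5, 0.9):
    print(t, (z <= t).mean(), t ** 2)  # empirical CDF vs. z^2
```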


Functions of two random variables: Summary

- If Y = g(X), to get the PDF of Y we first looked at the CDF, P(Y ≤ y) = P(g(X) ≤ y), and then differentiated with respect to y.
- For functions Z = g(X, Y) of two random variables, the same general idea applies:
- First we look at the CDF, P(Z ≤ z) = P(g(X, Y) ≤ z).
- Then we differentiate with respect to z.
- We looked at a special case: the maximum of two independent r.v.'s.

Conditional expectation

- Recall that the conditional expectation of a discrete random variable is given by

  E[X | Y = y] = Σ_x x P(X = x | Y = y)

- Recall that the conditional expectation of a continuous random variable is given by

  E[X | Y = y] = ∫_{−∞}^{∞} x f_{X|Y}(x | y) dx

- For a given value of y, E[X | Y = y] gives a numerical value.
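To make the discrete formula concrete, here is a small sketch (Python; the joint PMF is made up for illustration, not from the slides):

```python
# Compute E[X | Y = y] = sum_x x * P(X = x | Y = y) for a discrete pair,
# using a hypothetical joint PMF p_{X,Y}(x, y).
pmf = {(0, 0): 0.1, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.4}

def cond_expectation(pmf, y):
    p_y = sum(p for (x, yy), p in pmf.items() if yy == y)  # marginal P(Y = y)
    return sum(x * p / p_y for (x, yy), p in pmf.items() if yy == y)

print(cond_expectation(pmf, 0))  # 0.3 / 0.4 = 0.75
print(cond_expectation(pmf, 1))  # 0.4 / 0.6 ≈ 0.667
```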

Example 1

A miner is trapped in a mine containing 3 doors. The first door leads to a tunnel that will take him to safety after 3 hours of travel. The second door leads to a tunnel that will return him to the mine after 5 hours of travel. The third door leads to a tunnel that will return him to the mine after 7 hours. If we assume that the miner is at all times equally likely to choose any one of the doors, what is the expected length of time until he reaches safety?

- Let X denote the amount of time (in hours) until the miner reaches safety, and let Y denote the door he initially chooses. Then

  E[X] = E[X | Y = 1] P(Y = 1) + E[X | Y = 2] P(Y = 2) + E[X | Y = 3] P(Y = 3)
       = (1/3) (3 + (5 + E[X]) + (7 + E[X]))
       = (1/3) (15 + 2 E[X]),

  which gives E[X] = 15.
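A minimal simulation sketch (Python; not from the slides) confirming E[X] = 15:

```python
import random

# Simulate the trapped-miner problem: door 1 leads to safety in 3 hours,
# doors 2 and 3 return him to the mine after 5 and 7 hours respectively.
def escape_time():
    t = 0.0
    while True:
        door = random.randint(1, 3)
        if door == 1:
            return t + 3
        t += 5 if door == 2 else 7

trials = 200_000
print(sum(escape_time() for _ in range(trials)) / trials)  # ≈ 15
```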


Example 2

- John's tank holds 15 gallons of gas, and he always refills his tank when he gets down to 5 gallons. Given Y = y gallons of gas in John's tank, the total number of miles he will drive is X ∼ N(30y, 1).
- Y ∼ Uniform([5, 15]).
- What's E[X]?
- E[X | Y = y] = 30y.

Conditional expectation as a random variable

- In the last example, we saw that E[X | Y = y] = 30y.
- For any number y, E[X | Y = y] is also a number.
- As y varies, so does E[X | Y = y], so we can view E[X | Y = y] as a function of the value of Y.
- Since a function of a random variable is itself a random variable, E[X | Y] is a random variable.
- Concretely, we define E[X | Y] as the function of Y such that

  E[X | Y] = E[X | Y = y]  when Y = y.

- In the last example, E[X | Y] = 30Y.
- Fun fact: E[X h(Y) | Y] = E[X | Y] h(Y). Why? (Conditioned on Y, h(Y) is a known constant and can be pulled out of the expectation.)

Law of iterated expectations

Since E[X | Y] is a random variable, its expectation can be calculated as E[E[X | Y]].

- E[E[X | Y]] = Σ_y E[X | Y = y] P(Y = y) if Y is discrete,
  and ∫ E[X | Y = y] f_Y(y) dy if Y is continuous.
- But we know the RHS of the above, don't we?
- Total expectation theorem! So E[E[X | Y]] = E[X].

For the last problem, where X | Y = y ∼ N(30y, 1), find E[X]:

- We could do it by calculating f_X(x), or we can just use the iterated expectation theorem.
- We already saw that Y ∼ Uniform([5, 15]).
- E[X] = E[E[X | Y]] = 30 E[Y] = 30 × 10 = 300. (A simulation sketch follows below.)
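A quick simulation sketch (Python; not from the slides) of the gas-tank example, checking E[X] = 300:

```python
import numpy as np

# Y ~ Uniform(5, 15), and X | Y = y ~ N(30y, 1).
# Iterated expectation predicts E[X] = 30 * E[Y] = 300.
rng = np.random.default_rng(1)
n = 1_000_000
y = rng.uniform(5, 15, size=n)
x = rng.normal(loc=30 * y, scale=1.0)
print(x.mean())  # ≈ 300
```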


Stick breaking

We break a stick of length ℓ at a point chosen uniformly over its length, keep the piece that contains the left end, and then repeat the process with the piece we kept. What is the expected length of the remaining stick?

- Let Y be the length of the stick after we break it for the first time, and X be the length after the second break.
- What is E[X | Y = y]? For that we need f_{X|Y}(x | y): given Y = y, X is Uniform on [0, y].
- So E[X | Y = y] = y/2, and E[X | Y] = Y/2.
- So E[X] is E[E[X | Y]] = E[Y/2].
- But Y is itself Uniform on the interval [0, ℓ].
- So E[Y] = ℓ/2, and E[X] = E[Y]/2 = ℓ/4. (See the sketch below.)
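A minimal simulation sketch (Python; not from the slides) of the stick-breaking process, with ℓ = 1:

```python
import numpy as np

# Break a stick of length L = 1 twice, uniformly each time,
# keeping the left piece; check E[X] = L/4.
rng = np.random.default_rng(2)
n = 1_000_000
y = rng.uniform(0, 1.0, size=n)  # length after the first break
x = rng.uniform(0, y)            # second break, uniform on [0, y]
print(x.mean())                  # ≈ 0.25
```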


Some useful inequalities: Markov

So far we have looked at expectations and variances of sums of independent random variables. Today we will also look at their behavior as the number of random variables grows.

- For a nonnegative random variable X and t > 0,

  P(X ≥ t) ≤ E[X]/t.

- How do we show this?

  E[X] = E[X | X ≥ t] P(X ≥ t) + E[X | X < t] P(X < t)
       ≥ E[X | X ≥ t] P(X ≥ t) ≥ t P(X ≥ t),

  so P(X ≥ t) ≤ E[X]/t.

- All this comes in handy to show that a random variable cannot be too far from its expectation if the variance is small.
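A quick numerical sketch (Python; not from the slides) of Markov's inequality, using X ∼ Exponential(1) as an illustrative choice, so E[X] = 1:

```python
import numpy as np

# Compare empirical P(X >= t) with the Markov bound E[X]/t = 1/t.
rng = np.random.default_rng(3)
x = rng.exponential(scale=1.0, size=1_000_000)
for t in (1.0, 2.0, 5.0):
    print(t, (x >= t).mean(), 1.0 / t)  # tail probability vs. bound
```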

Some useful inequalities

So far we have looked at expectations and variances of sums of independent random variables. Today we will also look at their behavior as the number of random variables grows.

- Remember Markov's inequality? For a nonnegative random variable X and some t > 0, we said that P(X ≥ t) ≤ E[X]/t.
- We can use this to bound P(|X − E[X]| ≥ c). Writing μ = E[X],

  P(|X − μ| ≥ c) = P((X − μ)² ≥ c²) ≤ E[(X − μ)²]/c² = var(X)/c².

- This is the famous Chebyshev inequality.
- All this comes in handy to show that a random variable cannot be too far from its expectation if the variance is small. (A numerical sketch follows below.)
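A short sketch (Python; not from the slides) checking the Chebyshev bound for X ∼ Uniform(0, 1), where μ = 0.5 and var(X) = 1/12:

```python
import numpy as np

# Compare empirical P(|X - mu| >= c) with the bound var(X)/c^2.
rng = np.random.default_rng(4)
x = rng.uniform(size=1_000_000)
mu, var = 0.5, 1 / 12
for c in (0.25, 0.4, 0.49):
    print(c, (np.abs(x - mu) >= c).mean(), var / c ** 2)
```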

Weak law of large numbers

The WLLN basically states that the sample mean of a large number of random variables is very close to the true mean with high probability.

- Consider a sequence of i.i.d. random variables X1, ..., Xn with mean μ and variance σ².
- Let Mn = (X1 + ··· + Xn)/n.
- E[Mn] = (E[X1] + ··· + E[Xn])/n = μ.
- var(Mn) = (var(X1) + ··· + var(Xn))/n² = σ²/n.
- So, by Chebyshev, P(|Mn − μ| ≥ ε) ≤ σ²/(nε²).
- For large n this probability is small. (A simulation sketch follows below.)
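A simulation sketch (Python; not from the slides) of the WLLN for i.i.d. Uniform(0, 1) variables, where μ = 0.5 and σ² = 1/12:

```python
import numpy as np

# Empirical P(|Mn - mu| >= eps) vs. the Chebyshev bound sigma^2/(n eps^2).
rng = np.random.default_rng(5)
eps, reps = 0.05, 10_000
for n in (10, 100, 1_000):
    m = rng.uniform(size=(reps, n)).mean(axis=1)  # reps sample means of size n
    print(n, (np.abs(m - 0.5) >= eps).mean(), (1 / 12) / (n * eps ** 2))
```

Note that the Chebyshev bound is loose (it can even exceed 1 for small n), but both the bound and the empirical probability shrink as n grows.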


Illustration

Consider the mean of n independent Poisson(1) random variables. For each n, we plot the distribution of the average. [Figure not reproduced in this transcript; a sketch to regenerate it follows.]
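A sketch (Python; the panel choices of n are my own, not from the slides) that regenerates this kind of illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

# Histograms of the mean of n independent Poisson(1) variables for several n.
rng = np.random.default_rng(6)
fig, axes = plt.subplots(1, 4, figsize=(12, 3), sharex=True)
for ax, n in zip(axes, (1, 5, 25, 100)):
    means = rng.poisson(1.0, size=(10_000, n)).mean(axis=1)
    ax.hist(means, bins=40, density=True)
    ax.set_title(f"n = {n}")
plt.tight_layout()
plt.show()
```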

Can we say more? Central Limit Theorem

It turns out that not only can you say that the sample mean is close to the true mean, you can actually predict its distribution using the famous Central Limit Theorem.

- Consider a sequence of i.i.d. random variables X1, ..., Xn with mean μ and variance σ².
- Let X̄n = (X1 + ··· + Xn)/n. Remember E[X̄n] = μ and var(X̄n) = σ²/n.
- Standardize X̄n to get (X̄n − μ)/(σ/√n).
- As n gets bigger, (X̄n − μ)/(σ/√n) behaves more and more like a Normal(0, 1) random variable:

  P((X̄n − μ)/(σ/√n) < z) ≈ Φ(z).

(A simulation sketch follows below.)
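A simulation sketch (Python; Exponential(1) is my illustrative choice, so μ = σ = 1; not from the slides) comparing the standardized sample mean with Φ:

```python
import numpy as np
from scipy.stats import norm

# Standardized means of n Exponential(1) variables vs. the standard normal CDF.
rng = np.random.default_rng(7)
n, reps = 100, 100_000
xbar = rng.exponential(size=(reps, n)).mean(axis=1)
zscores = (xbar - 1.0) / (1.0 / np.sqrt(n))
for z in (-1.0, 0.0, 1.0, 1.96):
    print(z, (zscores < z).mean(), norm.cdf(z))  # empirical vs. Phi(z)
```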

Can we say more? Central Limit Theorem

Figure (courtesy: Tamara Broderick): You bet! [Figure not reproduced in this transcript.]

Example

An astronomer is interested in measuring the distance, in light-years, from his observatory to a distant star. Although the astronomer has a measuring technique, he knows that, because of changing atmospheric conditions and normal error, each time a measurement is made it will not yield the exact distance, but merely an estimate. As a result, the astronomer plans to make a series of measurements and then use the average value of these measurements as his estimated value of the actual distance. If the astronomer believes that the values of the measurements are independent and identically distributed random variables having a common mean d (the actual distance) and a common variance of 4 (light-years²), how many measurements does he need to make to be 95% sure that his estimated distance is accurate to within ±0.5 light-years?

- Let X̄n be the mean of the measurements.
- How large does n have to be so that P(|X̄n − d| ≤ 0.5) = 0.95?
- By the CLT,

  P(|X̄n − d|/(2/√n) ≤ 0.25√n) ≈ P(|Z| ≤ 0.25√n) = 1 − 2P(Z ≤ −0.25√n) = 0.95,

  so P(Z ≤ −0.25√n) = 0.025, which gives −0.25√n = −1.96, i.e. √n = 7.84 and n ≈ 62.
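The arithmetic of the last step, as a short sketch (Python; not from the slides):

```python
from math import ceil
from scipy.stats import norm

# Need 0.25 * sqrt(n) >= z_{0.975}, i.e. sqrt(n) >= 1.96 / 0.25 = 7.84.
z = norm.ppf(0.975)         # ≈ 1.96
n = ceil((z / 0.25) ** 2)   # sqrt(n) = 7.84 -> n ≈ 61.5, round up
print(n)                    # 62
```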


Normal Approximation to Binomial

The probability of selling an umbrella is 0.5 on a rainy day. If there are 400 umbrellas in the store, what's the probability that the owner will sell at least 180?

- Let X be the total number of umbrellas sold.
- X ∼ Binomial(400, 0.5).
- We want P(X ≥ 180). Crazy calculations.
- But can we approximate the distribution of X/n?
- X/n = (Σᵢ Yᵢ)/n, where the Yᵢ are i.i.d. Bernoulli(0.5), so E[Yᵢ] = 0.5 and var(Yᵢ) = 0.25.
- Sure! The CLT tells us that for large n, (X/400 − 0.5)/√(0.25/400) is approximately N(0, 1).
- So P(X ≥ 180) = P((X − 200)/√100 ≥ −2) ≈ P(Z ≥ −2) = 1 − Φ(−2) ≈ 0.977.
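A quick sketch (Python; not from the slides) comparing the exact binomial tail with the normal approximation:

```python
from scipy.stats import binom, norm

# P(X >= 180) for X ~ Binomial(400, 0.5): exact vs. normal approximation.
exact = 1 - binom.cdf(179, 400, 0.5)      # P(X >= 180)
approx = 1 - norm.cdf((180 - 200) / 10)   # P(Z >= -2)
print(exact, approx)                      # ≈ 0.980 vs. ≈ 0.977
```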


Summary

- We looked at conditional expectation as a random variable.
- For two r.v.'s X and Y, E[X | Y] is a random variable which is a function of Y.
- Law of iterated expectations: E[E[X | Y]] = E[X].
- We also saw Chebyshev's inequality, the WLLN, and the CLT.
