
Chapter 3

Random Variables and Probability Distributions

3.1 Random Variables §3.1∗

Different experiments yield different outcomes, and we are interested in some numerical aspects of the random outcome:

• No. of people voting for a candidate;

• # of times that the ball in a roulette lands in even-numbered pockets.

∗ This is the section number in the textbook.

Random variable: X is a function on Ω. Formally,

it should be written as X(ω), but the outcome ω is often suppressed.

Example 3.1 Random dialing

Consider a random number dialer which picks a telephone number at

random in a certain area. Let Y be 1 if the call is picked up and 0

otherwise. Then the sample space is all allowable phone numbers and

Y is a binary random variable, called a Bernoulli r.v.

> sample(9999999,10) #draw a random sample of size 10 from 1:9999999

2097071 4378927 374022 4330670 9301962 4432222 9323759 5897322 835193 9509962

Example 3.2 Toss a coin 3 times: the sample space is Ω = {H,T} × {H,T} × {H,T}.


Let X be the number of heads. Then,

Outcomes   HHH  HTH  THH  HHT  HTT  THT  TTH  TTT
Notation   ω1   ω2   ω3   ω4   ω5   ω6   ω7   ω8
X(ω)        3    2    2    2    1    1    1    0

X is called a binomial r.v. with parameters n = 3 (no. of trials) and p = 0.5 (prob. of success). It counts the # of successes (S) in n Bernoulli trials.

> rbinom(30,3,0.5) #draw 30 times from Binomial dist with n=3, p=0.5

[1] 1 2 3 2 3 0 1 2 1 1 1 1 1 2 1 1 1 2 1 1 1 1 0 1 1 3 2 1 1 3

Example 3.3 Rare disease

For a rare disease, a commonly used method is to sample until getting a certain # of cases. Let X be the number of samples required to obtain the first case. Then, the sample space is Ω = {S, FS, FFS, · · · }


and X(ω) is simply the number of letters in ω.

Example 3.4 Spatial data

Let X be the current temperature at a random location (defined by

latitude and longitude). Then, the sample space is [0, 180] × [0, 360]

and X(ω) = current temperature at that location.

Range of a random variable is the set of all its possible values. When the range is countable, the random variable is discrete. When the range is an interval on the number line, it is continuous.

Examples: In Ex 3.1 – 3.3, the random variables are discrete; while

in Ex 3.4, the random variable is continuous.


3.2 Probability Distributions §3.2

Prob dist says how the total probability of 1 is distributed among possible values of a r.v. X. For a discrete X, it is given by

p(x) = P(X = x) = P({ω : X(ω) = x}), for all x in the range,

also called the probability mass function (pmf).

Example 3.2 (continued). The range of X is {0, 1, 2, 3} and

p(0) = P(X = 0) = 1/8,   p(1) = 3/8,   p(2) = 3/8,   p(3) = 1/8.

It can easily be visualized by the line diagram (graph). This distribution is called a binomial dist with no. of trials n = 3 and prob. of success p = 0.5.
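As a quick check, the pmf above can be reproduced with base R's dbinom (a minimal sketch; the plot call just redraws the line diagram):

x = 0:3
p = dbinom(x, size=3, prob=0.5)     #pmf of Binomial(3, 0.5): 1/8, 3/8, 3/8, 1/8
sum(p)                              #total probability is 1
plot(x, p, type="h", lwd=2, xlab="x", ylab="p(x)")   #type="h" draws the line diagram
title("Ex 3.2: pmf of Binomial(3, 0.5)")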


Figure 3.1: The line diagrams for the pmf in Examples 3.2 and 3.3. It is equivalent to the histogram in this case. A probability histogram represents probability by area.

Example 3.3 (continued). The range of X is {1, 2, · · · }. Let p be the prevalence probability of the disease and q = 1 − p. Then,

p(1) = P(X = 1) = p,   p(2) = qp,   p(3) = q²p,   · · ·

and in general p(x) = P(X = x) = q^(x−1) p, corresponding to the outcome FF· · ·FS with x letters (x − 1 F's followed by one S).

It is referred to as a geometric distribution with parameter p.
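A minimal R sketch of this pmf (note that R's dgeom counts the number of failures before the first success, so P(X = x) corresponds to dgeom(x − 1, p); the value p = 0.1 below is just an assumed prevalence for illustration):

p = 0.1; q = 1 - p              #assumed prevalence, for illustration only
x = 1:5
dgeom(x - 1, prob=p)            #P(X = x); dgeom counts failures before the first success
q^(x - 1) * p                   #same values from the formula p(x) = q^(x-1) p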


Example 3.5 Sampling inspection

In 100 products, 3 of them are defective. Suppose that we pick 4

products at random. Let X be the number of defective products.

Find the distribution of X .

First of all, the range of X is {0, 1, 2, 3}. Now,

p(0) = P(X = 0) = C(97,4)/C(100,4) = [97×96×95×94/(4×3×2×1)] / [100×99×98×97/(4×3×2×1)] = (96×95×94)/(100×99×98) = 88.36%.

Similarly,

p(1) = P(X = 1) = C(97,3)×C(3,1)/C(100,4) = 11.28%,

p(2) = P(X = 2) = C(97,2)×C(3,2)/C(100,4) = 0.36%


and

p(3) = P(X = 3) = C(97,1)×C(3,3)/C(100,4) ≈ 0%.

This distribution is called a hypergeometric distribution.
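These four probabilities can be checked with base R's dhyper (a minimal sketch; in R's parameterization m = 3 defectives, n = 97 good items, and k = 4 items drawn):

round(dhyper(0:3, m=3, n=97, k=4), 4)   #P(X = 0..3): 0.8836, 0.1128, 0.0036, ~0
choose(97,4)/choose(100,4)              #p(0) from the counting formula, approx 0.8836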

Cumulative distribution function (cdf) is defined as

F(x) = P(X ≤ x) = ∑_{y: y ≤ x} p(y),

the probability that the observed value of X is at most x.

Example 3.2 (continued). The cdf of X is

F(0) = 1/8,   F(1) = 1/8 + 3/8 = 1/2,   F(2) = F(1) + p(2) = 7/8,   F(3) = 1.

Example 3.3 (continued). For any given integer x,

F(x) = ∑_{y=1}^{x} p q^(y−1) = p(1 − q^x)/(1 − q) = 1 − q^x.


For noninteger x, replace x above by its integer part [x].

x = seq(0, 40, 0.02)     #create x values for calculating the CDF of Ex 3.3
p = 0.3; q = 1 - p       #set the parameters
cdf = 1 - q^floor(x)     #calculate the CDF
plot(x, cdf, col="red", type="l", xlab="x", ylab="F(x)")   #ylab fixed: the curve is a CDF, not a density
p = 0.1; q = 1 - p       #second set of parameters
cdf = 1 - q^floor(x)
lines(x, cdf, col="blue")
title("Ex 3.3: CDF of geometric distribution with p = 0.3 and 0.1")


Figure 3.2: The cumulative distribution functions of Examples 3.1 and 3.3 (with parameters p = 0.3 and 0.1), which are step functions.


3.3 Expected Values §3.3

Used to summarize the outcome of a r.v.

Example 3.6 Playing a roulette

If you bet $1 on an even number, the chance to win is 18/38. What

is the expected payoff?

Imagine that you play 38,000 times: you would expect to win 18,000 times and lose 20,000 times. Thus,

Expected payoff per game = [18000 × ($1) + 20000 × (−$1)] / 38000
= (18/38) × ($1) + (20/38) × (−$1)      (∑ value × its probability)
= −$0.05263


That is, you expect to lose 5 cents per game.

Definition: E(X) = µ_X = ∑_j x_j × P(X = x_j).

Interpretation: It is the long-run average. It is not necessarily a value that one is likely (or "expects") to get.

> x = rbinom(100000,1,18/38) #play 100K (long-run) games with Bernoulli trials

> x=2*x-1 #convert into $1 and -$1

> x[1:30] #outcomes of first 30 games

[1] -1 1 1 1 -1 -1 1 -1 1 -1 1 1 1 1 1 -1 -1 -1 -1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1

> mean(x); sd(x) #average and SD of the outcomes of the game

[1] -0.05442 [1] 0.9985231

Example 3.7 Rolling a die.

What is the expected number of spots X on the top?

EX = 1 × (1/6) + 2 × (1/6) + · · · + 6 × (1/6) = 3.5.
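A one-line check of this calculation, plus a long-run simulation (a minimal sketch using base R's sample):

sum((1:6) * (1/6))                         #definition: sum of value times probability = 3.5
mean(sample(1:6, 100000, replace=TRUE))    #average of 100K simulated rolls, approx 3.5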


Example 3.3 (continued). The expected value of the geometric distribution is

E(X) = ∑_{x=1}^{∞} x q^(x−1) p = p · (d/dq) ∑_{n=1}^{∞} q^n = p · (d/dq)[q/(1 − q)] = p/(1 − q)² = 1/p.
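A minimal simulation check (R's rgeom returns the number of failures before the first success, so adding 1 gives the number of trials X; p = 0.1 is an assumed value for illustration):

p = 0.1                           #assumed prevalence, for illustration only
x = rgeom(100000, prob=p) + 1     #number of trials until the first case
mean(x)                           #approx 1/p = 10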

Example 3.8 Group testing.

For a rare disease with 1% prevalence rate, the following group testing

is used. Pool the blood samples of 10 people together. If the result is

negative, all of them are negative. If the result is positive, test them

individually. If each test costs $1, what is the expected cost?


P(none of 10 have the disease) = 0.99¹⁰ = 0.9044. Thus,

Expected cost = $1 × 0.9044 + $11 × 0.0956 = $1.956,

compared with $10 for the naive method. Q: How to simulate?
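One way to simulate the group-testing cost (a minimal sketch; the 1% prevalence, group size 10, and $1/$11 costs are those of the example):

ngroups = 100000                              #number of simulated groups of 10 people
npos = rbinom(ngroups, size=10, prob=0.01)    #number of diseased people in each group
cost = ifelse(npos == 0, 1, 11)               #$1 if the pooled test is negative, $11 otherwise
mean(cost)                                    #approx $1.956 per group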

Function of a r.v.: Y = g(X) is a r.v. defined as Y (ω) =

g(X(ω)). The pmf is given by

P(Y = y) = ∑_{x_j : g(x_j) = y} P(X = x_j).

Its expected value is given by E(Y) = ∑_i y_i P(Y = y_i) and satisfies

♠ E g(X) = ∑_i g(x_i) P(X = x_i).

♠ E(aX + b) = a E(X) + b.


Example 3.2 (Continued). Let X = No. of heads in 3 tosses. Find

the distribution of Y = (X − 2)² and its expected value.

Note that

X = 0 1 2 3
Y = 4 1 0 1

The range of Y is {0, 1, 4}, with

P(Y = 0) = P(X = 2) = 3/8,
P(Y = 1) = P(X = 1) + P(X = 3) = 1/2,
P(Y = 4) = P(X = 0) = 1/8.

Method 1: (using p(y))  EY = 0 × 3/8 + 1 × 1/2 + 4 × 1/8 = 1.
Method 2: (using p(x))  E(X − 2)² = 4 × 1/8 + 1 × 3/8 + 0 × 3/8 + 1 × 1/8 = 1.
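The same two methods can be carried out in R (a minimal sketch; tapply simply groups the probabilities of X by the value of Y):

px = dbinom(0:3, 3, 0.5)        #pmf of X: 1/8, 3/8, 3/8, 1/8
y = (0:3 - 2)^2                 #values of Y = (X - 2)^2, namely 4, 1, 0, 1
tapply(px, y, sum)              #Method 1: pmf of Y, giving 3/8, 1/2, 1/8 at y = 0, 1, 4
sum(y * px)                     #Method 2: E(X - 2)^2 = 1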

Variance: var(X) = σ² = E(X − µ)². SD(X) = σ = √var(X). It shows the typical size of the deviation of the r.v. X from µ.


• E(X) and SD(X) are the long-run average and SD when X is drawn repeatedly.

Shortcut formula: var(X) = E(X²) − (EX)².

Properties: • var(aX + b) = a² var(X)
• SD(aX + b) = |a| SD(X)

Example 3.8: (generalized, binary outcomes).

Let Y be the cost. Assume that P(Y = a) = p and P(Y = b) = q. Then, EY = ap + bq. Similarly, EY² = a²p + b²q. Thus,

var(Y) = a²p + b²q − (ap + bq)² = (a − b)²pq.

Hence, SD(Y) = |b − a|√(pq).

—For the roulette game in Ex 3.6, SD of gain = 2√(18/38 × 20/38) ≈ $1.
—For the data in Ex 3.8, we have SD(Y) = 10√(0.9044 × 0.0956) ≈ $2.94.


3.4 Probability density functions §4.1 & 4.2

Continuous rv's: • Income of a randomly drawn tax-payer;
• Amount of precipitation per year at a randomly selected location.


Figure 3.3: Probabilities of a continuous rv are given by the area under the density curve y = f(x): the area between a and b is P(a ≤ X ≤ b). Hence P(X = x) = 0 and P(X ≤ x) = P(X < x).

Definition: The probability density function (pdf) of a continuous rv X is the function such that

P(a ≤ X ≤ b) = ∫_a^b f(x) dx,

denoted by X ∼ f.

The density function should satisfy

♠ f (x) ≥ 0;


♠ ∫_{−∞}^{+∞} f(x) dx = 1 ⇐⇒ total area is one.


Example 3.9 Uniform distribution.

Assume that measurements of gene expressions using a microarray

technique are recorded to the 2nd decimal point. Let X be

the rounding error, which is assumed to be uniformly distributed on

[−0.005, 0.005). The density of X is

f(x) = 100 if −0.005 ≤ x < 0.005, and f(x) = 0 otherwise.

Denoted by X ∼ unif[−0.005, 0.005).

The definition can easily be generalized to interval (a, b).
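A minimal check of this density in R (dunif and punif are the built-in uniform density and cdf):

dunif(0, min=-0.005, max=0.005)                             #density is 1/(b - a) = 100 on the interval
punif(0.002, -0.005, 0.005) - punif(-0.002, -0.005, 0.005)  #P(-0.002 <= X <= 0.002) = 0.4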


Figure 3.4: Density functions of X ∼ unif(a, b) and X ∼ exp(λ).

Example 3.10 Modeling of life time: products, firms.

Suppose that the life time of a product X follows the exponential

distribution with parameter λ:

f (x) = λ exp(−λx), for x ≥ 0

It is easy to check that f is a pdf:

∫_{−∞}^{+∞} f(x) dx = λ ∫_0^∞ exp(−λx) dx = ∫_0^∞ exp(−y) dy = 1.

> rexp(5,2) #draw 5 data from exponential dist with lambda=2

[1] 0.2414199 1.4790871 0.1378886 0.8888352 0.3890791


> mean(rexp(1000,2)) #draw 1000 data and compute its average

[1] 0.5202178 #it is approx. 1/lambda
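The claim that the density integrates to 1 can also be verified numerically (a minimal sketch using base R's integrate, with λ = 2 as in the code above):

integrate(dexp, lower=0, upper=Inf, rate=2)   #total area under the exponential(2) density: 1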

Interpretation of density:

♠ P(X = x) = 0 ≠ f(x).

♠ P(X ∈ x ± ∆) ≈ 2∆ f(x) for small ∆; this shows how likely X is to be near x, per unit length.

♠ How can one observe a birth weight of 8.43 lbs? Think of X = 8.43 as X ∈ [8.425, 8.435), due to the accuracy of the weighing device.
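A minimal numerical illustration of the second point, for an exponential density with an assumed λ = 2 and x = 1:

lambda = 2; x = 1; delta = 0.01                      #assumed values, for illustration only
pexp(x + delta, lambda) - pexp(x - delta, lambda)    #exact P(X in x +/- delta)
2 * delta * dexp(x, lambda)                          #the approximation 2*delta*f(x); nearly identical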

Cumulative distribution function: F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(y) dy. Hence, F′(x) = f(x).

Ex 3.10 (Cont): F(x) = ∫_0^x λ exp(−λy) dy = 1 − exp(−λx), for x ≥ 0.

Percentile: The p-th quantile η_p is the point such that

p = F(η_p) = ∫_{−∞}^{η_p} f(y) dy.


Figure 3.5: The cdf and percentile of a continuous distribution: panel (a) shows the density and panel (b) the distribution function, with F(2) and F⁻¹(0.8) marked.

η_{0.5} is called the median (half-life). You can write η_p = F⁻¹(p).

Ex 3.10 (Cont): p = F(η_p) = 1 − exp(−λη_p). Thus, η_p = −log(1 − p)/λ. Half-life = log(2)/λ, by taking p = 0.5.
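A minimal check of the quantile formula against R's built-in qexp, with an assumed λ = 2:

lambda = 2; p = c(0.5, 0.8)      #assumed rate; p = 0.5 gives the half-life
qexp(p, rate=lambda)             #built-in quantiles of the exponential distribution
-log(1 - p) / lambda             #same values from eta_p = -log(1 - p)/lambda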


3.5 Expected value and variance of continuous RV’s

Expected value: E g(X) = ∫_{−∞}^{+∞} g(x) f(x) dx.

Compared with E g(X) = ∑_x g(x) P(X = x) for the discrete case, the pdf plays the same role as the pmf P(X = x).

Variance: var(X) = E(X − µ)² = EX² − (EX)², where µ = EX.

Example 3.11 Expected value and variance of uniform dist

Suppose that X ∼ unif(0, 1). Then,

EX = ∫_0^1 x dx = 1/2,      EX² = ∫_0^1 x² dx = 1/3.


Hence, var(X) = 1/3 − (1/2)² = 1/12 and SD(X) = √(1/12) ≈ 0.29.

Now, if Y = a + (b − a)X, then Y ∼ unif(a, b). Thus,

EY = a + (b − a)EX = (a + b)/2,
var(Y) = (b − a)² var(X) = (b − a)²/12.
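A minimal Monte Carlo check of these two formulas, with assumed endpoints a = 2 and b = 5:

a = 2; b = 5                     #assumed endpoints, for illustration only
y = runif(100000, a, b)          #draw from unif(a, b)
mean(y); (a + b)/2               #both approx 3.5
var(y); (b - a)^2/12             #both approx 0.75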