Chapter 3
Random Variables and Probability Distributions
3.1 Random Variables §3.1∗
Different experiments yield different outcomes and we are interested in
some numerical aspects of the random outcome.
•No. of people voting for a candidate;
•# of times that the ball in a roulette lands in even-numbered pockets.
∗ This is the section number in the textbook.
ORF 245: Random Variables – J.Fan 55
Random variable: X is a function on Ω. Formally,
it should be written as X(ω), but the outcome ω is often suppressed.
Example 3.1 Random dialing
Consider a random number dialer which picks a telephone number at
random in a certain area. Let Y be 1 if the call is picked up and 0
otherwise. Then the sample space is all allowable phone numbers and
Y is a binary random variable, called a Bernoulli r.v.
> sample(9999999,10) #draw a random sample of size 10 from 1:9999999
2097071 4378927 374022 4330670 9301962 4432222 9323759 5897322 835193 9509962
Example 3.2 Toss a coin 3 times: the sample space is Ω = {H,T} × {H,T} × {H,T}.
Let X be the number of heads. Then,
Outcomes HHH HTH THH HHT HTT THT TTH TTT
Notation ω1 ω2 ω3 ω4 ω5 ω6 ω7 ω8
X(ω) = 3 2 2 2 1 1 1 0
X is called a binomial r.v. w/ parameters n = 3 (no. of trials)
and p = 0.5 (prob. of success). It counts the # of successes in n Bernoulli trials.
> rbinom(30,3,0.5) #draw 30 times from Binomial dist with n=3, p=0.5
[1] 1 2 3 2 3 0 1 2 1 1 1 1 1 2 1 1 1 2 1 1 1 1 0 1 1 3 2 1 1 3
Example 3.3 Rare disease
For a rare disease, a commonly used method is to sample until getting
a certain # of cases. Let X be the number of samples required to obtain
the first case. Then, the sample space is Ω = {S, FS, FFS, · · · }
and X(ω) is simply the number of letters in ω.
Example 3.4 Spatial data
Let X be the current temperature at a random location (defined by
latitude and longitude). Then, the sample space is [0, 180] × [0, 360]
and X(ω) = current temperature at that location.
The range of a random variable is the set of all its possible values. When the
range is countable, the random variable is discrete. When the range
is an interval on the number line, it is continuous.
Examples: In Ex 3.1 – 3.3, the random variables are discrete; while
in Ex 3.4, the random variable is continuous.
3.2 Probability Distributions §3.2
Prob dist says how the total probability of 1 is distributed among
possible values of a r.v. X . For a discrete X, it is given by
p(x) = P(X = x) = P({ω : X(ω) = x}), for all x in the range,
also called probability mass function (pmf).
Example 3.2 (continued). The range of X is {0, 1, 2, 3} and
p(0) = P(X = 0) = 1/8, p(1) = 3/8, p(2) = 3/8, p(3) = 1/8.
It can easily be visualized by the line diagram (graph); this is a binomial
dist with no. of trials n = 3 and prob. of success p = 0.5.
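The pmf can be checked by brute-force enumeration of the 8 equally likely outcomes. A quick sketch (in Python rather than the course's R, purely for illustration):

```python
from itertools import product
from collections import Counter

# Enumerate all 8 equally likely outcomes of 3 coin tosses
outcomes = list(product("HT", repeat=3))
counts = Counter(w.count("H") for w in outcomes)

# pmf: p(x) = (# outcomes with x heads) / 8
pmf = {x: counts[x] / len(outcomes) for x in sorted(counts)}
print(pmf)   # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
```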
Figure 3.1: The line diagrams for the pmf in Examples 3.2 and 3.3. A line diagram is equivalent to the histogram in this case; a probability histogram represents probability by area.
Example 3.3 (continued). The range of X is {1, 2, · · · }. Let p be
the prevalence probability of the disease and q = 1 − p. Then,
p(1) = P(X = 1) = p, p(2) = qp, p(3) = q²p, and in general
p(x) = P(X = x) = q^{x−1} p (the outcome F· · ·FS with x − 1 F's followed by one S).
It is referred to as a geometric distribution with parameter p.
Example 3.5 Sampling inspection
In 100 products, 3 of them are defective. Suppose that we pick 4
products at random. Let X be the number of defective products.
Find the distribution of X .
First of all, the range of X is {0, 1, 2, 3}. Now,
p(0) = P(X = 0) = C(97,4) / C(100,4)
= [97×96×95×94/(4×3×2×1)] / [100×99×98×97/(4×3×2×1)]
= (96×95×94)/(100×99×98) = 88.36%.
Similarly,
p(1) = P(X = 1) = C(97,3)C(3,1) / C(100,4) = 11.28%,
p(2) = P(X = 2) = C(97,2)C(3,2) / C(100,4) = 0.36%
and
p(3) = P(X = 3) = C(97,1)C(3,3) / C(100,4) ≈ 0%.
This is called a hypergeometric distribution.
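These probabilities can be computed directly from the binomial coefficients. A short Python sketch (illustrative translation; the course's examples are in R):

```python
from math import comb

# Hypergeometric pmf for Ex 3.5: 3 defectives among 100, sample 4 at random
def p(x):
    return comb(3, x) * comb(97, 4 - x) / comb(100, 4)

probs = [p(x) for x in range(4)]
print([round(q, 4) for q in probs])   # [0.8836, 0.1128, 0.0036, 0.0]
assert abs(sum(probs) - 1) < 1e-12    # total probability is 1
```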
Cumulative distribution function (cdf) is defined as
F(x) = P(X ≤ x) = ∑_{y: y ≤ x} p(y),
the probability that the observed value of X is at most x.
Example 3.2 (continued). The cdf of X is
F(0) = 1/8, F(1) = 1/8 + 3/8 = 1/2, F(2) = F(1) + p(2) = 7/8, F(3) = 1.
Example 3.3 (continued). For any given integer x,
F(x) = ∑_{y=1}^{x} p q^{y−1} = p (1 − q^x)/(1 − q) = 1 − q^x.
For noninteger x, replace x above by its integer part [x].
x = seq(0, 40, 0.02)   #create x values for calculating CDF of Ex 3.3
p = 0.3; q = 1 - p     #setting parameters
cdf = 1 - q^floor(x)   #calculate CDF
plot(x, cdf, col="red", type="l", xlab="x", ylab="F(x)")
p = 0.1; q = 1 - p     #setting parameters
cdf = 1 - q^floor(x)
lines(x, cdf, col="blue")
title("Ex 3.3: CDF of geometric distribution with p = 0.3 and 0.1")
[Figure: left panel "Ex 3.1: A binomial Distribution"; right panel "Ex 3.3: CDF of geometric distribution with p = 0.3 and 0.1".]
Figure 3.2: The cumulative distributions of Examples 3.1 and 3.3 (with parameters p = 0.3 and 0.1), which are step functions.
3.3 Expected Values §3.3
Used to summarize the outcome of a r.v.
Example 3.6 Playing a roulette
If you bet $1 on an even number, the chance to win is 18/38. What
is the expected payoff?
Imagine that you play 38000 times; you would expect to win 18000
times and lose 20000 times. Thus,
Expected payoff per game = [18000 × ($1) + 20000 × (−$1)] / 38000
= (18/38) × ($1) + (20/38) × (−$1)   (∑ value × its probability)
= −$0.05263
That is, you expect to lose 5 cents per game.
Definition: E(X) = µ_X = ∑_j x_j × P(X = x_j).
Interpretation: It is the long-run average. It is not necessarily a value
that one is likely to observe.
> x = rbinom(100000,1,18/38) #play 100K (long-run) games with Bernoulli trials
> x=2*x-1 #convert into $1 and -$1
> x[1:30] #outcomes of first 30 games
[1] -1 1 1 1 -1 -1 1 -1 1 -1 1 1 1 1 1 -1 -1 -1 -1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1
> mean(x); sd(x) #average and SD of the outcomes of the game
[1] -0.05442 [1] 0.9985231
Example 3.7 Rolling a die.
What is the expected number of spots X on the top?
EX = 1 × (1/6) + 2 × (1/6) + · · · + 6 × (1/6) = 3.5.
Example 3.3 (continued). The expected value of the geometric dis-
tribution is
E(X) = ∑_{x=1}^{∞} x q^{x−1} p = p · d/dq (∑_{n=1}^{∞} q^n) = p · d/dq [q/(1 − q)] = p/(1 − q)² = 1/p.
Example 3.8 Group testing.
For a rare disease with 1% prevalence rate, the following group testing
is used. Pull the blood sample of 10 people together. If the result is
negative, all of them are negative. If the result is positive, test them
individually. If each test costs $1, what is the expected cost?
P{none of 10 have the disease} = 0.99¹⁰ = 0.9044. Thus,
Expected cost = $1 × 0.9044 + $11 × 0.0956 = $1.956,
compared with $10 for the naive method. Q: How to simulate?
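One way to answer the simulation question: draw each person's status as a Bernoulli(0.01) and tally the cost per group of 10. A hedged Python sketch (the course uses R; this is an illustrative translation, and the replication count is an arbitrary choice):

```python
import random

# Monte Carlo estimate of the group-testing cost, prevalence 1%, groups of 10
random.seed(1)

def cost_one_group(n=10, prev=0.01):
    # 1 pooled test; if anyone in the pool is positive, add n individual tests
    any_positive = any(random.random() < prev for _ in range(n))
    return 1 + n if any_positive else 1

avg = sum(cost_one_group() for _ in range(200_000)) / 200_000
print(avg)   # ≈ 1.956, the expected cost computed above
```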
Function of a r.v.: Y = g(X) is a r.v. defined as Y (ω) =
g(X(ω)). The pmf is given by
P(Y = y) = ∑_{x_j : g(x_j) = y} P(X = x_j).
Its expected value is given by E(Y) = ∑_i y_i P(Y = y_i) and satisfies
♠ E g(X) = ∑_i g(x_i) P(X = x_i).
♠ E(aX + b) = aE(X) + b.
Example 3.2 (Continued). Let X = No. of heads in 3 tosses. Find
the distribution of Y = (X − 2)2 and its expected value.
Note that
X = 0 1 2 3
Y = 4 1 0 1.
The range of Y is {0, 1, 4} with
P(Y = 0) = P(X = 2) = 3/8,
P(Y = 1) = P(X = 1) + P(X = 3) = 1/2,
P(Y = 4) = P(X = 0) = 1/8.
Method 1: (via p(y)) EY = 0 × (3/8) + 1 × (1/2) + 4 × (1/8) = 1.
Method 2: (via p(x)) E(X − 2)² = 4 × (1/8) + 1 × (3/8) + 0 × (3/8) + 1 × (1/8) = 1.
Variance: var(X) = σ² = E(X − µ)². SD(X) = σ = √var(X). It
shows the typical size of the deviation of the r.v. X from µ.
♠ E(X) and SD(X) are the long-run average and SD if X is drawn repeatedly.
Shortcut formula: var(X) = E(X2)− (EX)2.
Properties: ♠ var(aX + b) = a² var(X)
♠ SD(aX + b) = |a| SD(X)
Example 3.8: (generalized, binary outcomes).
Let Y be the cost. Assume that P (Y = a) = p and P (Y = b) = q.
Then, EY = ap + bq. Similarly, EY² = a²p + b²q. Thus,
var(Y) = a²p + b²q − (ap + bq)² = (a − b)²pq.
Hence, SD(Y) = |b − a|√(pq).
— For the roulette game in Ex 3.6, SD of gain = 2√((18/38) × (20/38)) ≈ $1.
— For the data in Ex 3.8, we have SD(Y) = 10√(0.9044 × 0.0956) ≈ $2.94.
3.4 Probability density functions §4.1 & 4.2
Continuous rv’s:
• Income of a randomly drawn tax-payer;
• Amount of precipitation per year at a randomly selected location.
[Figure: curve y = f(x), with the shaded area between a and b representing P(a ≤ X ≤ b).]
Figure 3.3: Probabilities of a continuous rv are given by the area under the density function. Hence P(X = x) = 0 and P(X ≤ x) = P(X < x).
Definition: The probability density function (pdf) of a continuous rv
X is the function such that
P(a ≤ X ≤ b) = ∫_a^b f(x) dx,
denoted by X ∼ f.
The density function should satisfy
♠ f(x) ≥ 0;
♠ ∫_{−∞}^{+∞} f(x) dx = 1 ⇐⇒ Total area is one.
Example 3.9 Uniform distribution.
Assume that measurements of gene expressions using a microarray
technique are recorded to the 2nd decimal point. Let X be
the rounding error, which is assumed to be uniformly distributed on
[−0.005, 0.005). The density of X is
f(x) = 100, if −0.005 ≤ x < 0.005, and 0 otherwise.
Denoted by X ∼ unif[−0.005, 0.005).
The definition can easily be generalized to interval (a, b).
Figure 3.4: Density functions of X ∼ unif(a, b) and X ∼ exp(λ).
Example 3.10 Modeling of life time: products, firms.
Suppose that the life time of a product X follows the exponential
distribution with parameter λ:
f (x) = λ exp(−λx), for x ≥ 0
It is easy to check that f is a pdf:
∫_{−∞}^{+∞} f(x) dx = λ ∫_0^∞ exp(−λx) dx = ∫_0^∞ exp(−y) dy = 1.
> rexp(5,2) #draw 5 data from exponential dist with lambda=2
[1] 0.2414199 1.4790871 0.1378886 0.8888352 0.3890791
> mean(rexp(1000,2)) #draw 1000 data and compute its average
[1] 0.5202178 #it is approx. 1/lambda
Interpretation of density:
♠ P(X = x) = 0 ≠ f(x).
♠ P(X ∈ x ± ∆) ≈ 2∆ f(x) for small ∆ — showing how likely X is to fall in a small interval near x.
♠ How can one observe a birth weight of 8.43 lbs? Think of X = 8.43 as X ∈ [8.425, 8.435)
due to the accuracy of the weighing device.
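The approximation P(X ∈ x ± ∆) ≈ 2∆ f(x) is easy to verify numerically. An illustrative Python sketch, using the exponential density with λ = 2 (matching the earlier rexp examples; x and ∆ are arbitrary choices):

```python
import math

# For X ~ exp(lam): compare P(X in x ± d) with 2*d*f(x) for small d
lam, x, d = 2.0, 0.5, 1e-4
F = lambda t: 1 - math.exp(-lam * t)     # cdf of the exponential dist
f = lambda t: lam * math.exp(-lam * t)   # pdf of the exponential dist

prob = F(x + d) - F(x - d)
print(prob, 2 * d * f(x))   # the two numbers agree to many decimal places
```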
Cumulative distribution function:
F(x) = P(X ≤ x) = ∫_{−∞}^x f(y) dy. Hence, F′(x) = f(x).
Ex 3.10 (Cont): F(x) = ∫_0^x λ exp(−λy) dy = 1 − exp(−λx).
Percentile: The p-th quantile η_p is the point such that
p = F(η_p) = ∫_{−∞}^{η_p} f(y) dy.
[Figure: panel (a) density and panel (b) distribution function, marking F(2) and F⁻¹(0.8).]
Figure 3.5: cdf and percentile of a continuous distribution.
η_{0.5} is called the median (half-life). You can write η_p = F⁻¹(p).
Ex 3.10 (Cont): p = F(η_p) = 1 − exp(−λη_p). Thus, η_p = −log(1 − p)/λ.
Half-life = log(2)/λ, by taking p = 0.5.
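The closed-form quantile can be cross-checked against the empirical median of simulated draws. An illustrative Python sketch (λ = 2 chosen to match the earlier rexp examples):

```python
import math
import random

# Quantile of exp(lam): eta_p = -log(1 - p)/lam; half-life = log(2)/lam
lam = 2.0
eta = lambda p: -math.log(1 - p) / lam
print(eta(0.5), math.log(2) / lam)   # both ≈ 0.3466

# Sanity check against the empirical median of simulated draws
random.seed(0)
draws = sorted(random.expovariate(lam) for _ in range(100_001))
print(draws[50_000])                 # close to 0.3466
```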
3.5 Expected value and variance of continuous RV’s
Expected value: E g(X) = ∫_{−∞}^{+∞} g(x) f(x) dx.
Compared with E g(X) = ∑_x g(x) P(X = x) for the discrete case,
the pdf plays the same role as the pmf P(X = x).
Variance: var(X) = E(X−µ)2 = EX2− (EX)2, where µ = EX .
Example 3.11 Expected value and variance of uniform dist
Suppose that X ∼ unif(0, 1). Then,
EX = ∫_0^1 x dx = 1/2,
EX² = ∫_0^1 x² dx = 1/3.
Hence, var(X) = 1/3 − (1/2)² = 1/12 and SD(X) = √(1/12) ≈ 0.29.
Now, if Y = a + (b − a)X, then Y ∼ unif(a, b). Thus,
EY = a + (b − a)EX = (b + a)/2,
var(Y) = (b − a)² var(X) = (b − a)²/12.
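These formulas can be checked by simulation, in the spirit of the earlier long-run-average examples. An illustrative Python sketch (a = 2, b = 5 are arbitrary choices):

```python
import random
import statistics

# Check EY = (a + b)/2 and var(Y) = (b - a)^2/12 for Y ~ unif(a, b) by simulation
a, b = 2.0, 5.0
random.seed(0)
y = [random.uniform(a, b) for _ in range(200_000)]

print(statistics.fmean(y))      # ≈ (a + b)/2 = 3.5
print(statistics.pvariance(y))  # ≈ (b - a)^2/12 = 0.75
```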