Chap 5 Sums of Random Variables and Long-Term Averages
• Many problems involve counting the number of occurrences of events or computing arithmetic averages over a series of measurements.
• These problems reduce to finding the distribution of a random variable that is the sum of n i.i.d. random variables.
5.1 Sums of Random Variables
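Let \(X_1, X_2, \ldots, X_n\) be a sequence of random variables, and let
\[ S_n = X_1 + X_2 + \cdots + X_n. \]
The mean of the sum is the sum of the means, regardless of statistical dependence:
\[ E[S_n] = E[X_1] + E[X_2] + \cdots + E[X_n]. \]
The variance of the sum is
\[ \mathrm{VAR}(S_n) = E\big[(S_n - E[S_n])^2\big] = E\Big[\sum_{j=1}^{n}(X_j - E[X_j]) \sum_{k=1}^{n}(X_k - E[X_k])\Big] \]
\[ = \sum_{j=1}^{n}\sum_{k=1}^{n} E\big[(X_j - E[X_j])(X_k - E[X_k])\big] = \sum_{k=1}^{n}\mathrm{VAR}(X_k) + \sum_{j=1}^{n}\sum_{k=1,\,k\neq j}^{n}\mathrm{COV}(X_j, X_k). \]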
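In general, \(\mathrm{COV}(X_j, X_k) \neq 0\), so \(\mathrm{VAR}\big(\sum_{i=1}^{n} X_i\big) \neq \sum_{i=1}^{n}\mathrm{VAR}(X_i)\).
If \(X_1, X_2, \ldots, X_n\) are indep., \(\mathrm{COV}(X_j, X_k) = 0\) for \(j \neq k\), and
\[ \mathrm{VAR}(S_n) = \mathrm{VAR}\Big(\sum_{i=1}^{n} X_i\Big) = \sum_{i=1}^{n}\mathrm{VAR}(X_i). \]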
Ex. 5.2
Find the mean and variance of the sum of n independent, identically distributed (iid) random variables, each with mean \(\mu\) and variance \(\sigma^2\).
\[ E[S_n] = n\mu, \qquad \mathrm{VAR}[S_n] = n\sigma^2. \]
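As a quick check of Ex. 5.2, the sketch below builds the exact pmf of a sum of n iid r.v.s by repeated convolution and compares its mean and variance with \(n\mu\) and \(n\sigma^2\). The fair die, the choice n = 5, and the helper names `convolve` and `mean_var` are illustrative assumptions, not part of the example.

```python
from fractions import Fraction

def convolve(p, q):
    """PMF of the sum of two independent integer-valued r.v.s (dict: value -> prob)."""
    out = {}
    for a, pa in p.items():
        for b, pb in q.items():
            out[a + b] = out.get(a + b, 0) + pa * pb
    return out

def mean_var(pmf):
    """Return (mean, variance) of a pmf given as a dict."""
    m = sum(x * px for x, px in pmf.items())
    v = sum((x - m) ** 2 * px for x, px in pmf.items())
    return m, v

# Illustrative X: a fair six-sided die (assumption, not from the slides).
die = {k: Fraction(1, 6) for k in range(1, 7)}
mu, sigma2 = mean_var(die)

n = 5
s_n = die
for _ in range(n - 1):          # pmf of S_n = X_1 + ... + X_n
    s_n = convolve(s_n, die)

mean_sn, var_sn = mean_var(s_n)
print(mean_sn == n * mu, var_sn == n * sigma2)
```

Exact fractions are used so that the comparison is an equality rather than a floating-point approximation.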
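The characteristic function of \(S_n\)
Let \(X_1, X_2, \ldots, X_n\) be independent random variables, and \(S_n = X_1 + X_2 + \cdots + X_n\). Then
\[ \Phi_{S_n}(\omega) = E\big[e^{j\omega S_n}\big] = E\big[e^{j\omega(X_1 + X_2 + \cdots + X_n)}\big] = E\big[e^{j\omega X_1}\big]\cdots E\big[e^{j\omega X_n}\big] = \Phi_{X_1}(\omega)\cdots\Phi_{X_n}(\omega). \]
The pdf of \(S_n\) can be found by
\[ f_{S_n}(s) = \mathcal{F}^{-1}\{\Phi_{X_1}(\omega)\cdots\Phi_{X_n}(\omega)\}. \]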
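Ex. 5.3 Sum of n independent Gaussian r.v.s with parameters \(m_i\) and \(\sigma_i^2\).
\[ \Phi_{X_i}(\omega) = e^{j\omega m_i - \omega^2\sigma_i^2/2} \]
\[ \Phi_{S_n}(\omega) = \prod_{i=1}^{n}\Phi_{X_i}(\omega) = \exp\Big\{j\omega\sum_{i=1}^{n} m_i - \frac{\omega^2}{2}\sum_{i=1}^{n}\sigma_i^2\Big\} \]
So \(S_n\) is Gaussian with mean \(\sum_i m_i\) and variance \(\sum_i \sigma_i^2\).
Ex. 5.5 Sum of n iid exponential r.v.s with parameter \(\alpha\).
\[ \Phi_X(\omega) = \frac{\alpha}{\alpha - j\omega}, \qquad \Phi_{S_n}(\omega) = \Big(\frac{\alpha}{\alpha - j\omega}\Big)^n \]
This is the characteristic function of an n-Erlang r.v. (p.102).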
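If the \(X_i\)'s are integer-valued r.v.s, it is preferable to use the prob. generating function (z-transform):
\[ G_N(z) = E[z^N]. \]
If \(N = X_1 + \cdots + X_n\) with \(X_1, \ldots, X_n\) independent,
\[ G_N(z) = E\big[z^{X_1 + \cdots + X_n}\big] = E\big[z^{X_1}\big]\cdots E\big[z^{X_n}\big] = G_{X_1}(z)\cdots G_{X_n}(z). \]
Ex. Find the generating function for a sum of n iid geometrically distributed r.v.s.
\[ G_X(z) = \frac{pz}{1 - qz}, \qquad G_N(z) = \Big(\frac{pz}{1 - qz}\Big)^n \]
This is the generating function of a negative binomial r.v. (p.100).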
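The negative binomial claim can be checked numerically. The sketch below convolves n geometric pmfs and compares the result with the negative binomial pmf \(P[N = k] = \binom{k-1}{n-1} p^n q^{k-n}\), \(k \geq n\). It assumes the geometric r.v. takes values 1, 2, ... (the convention matching \(G_X(z) = pz/(1 - qz)\)); the values p = 0.3, n = 4 and the truncation point `kmax` are arbitrary illustrative choices.

```python
from math import comb

def geometric_pmf(p, kmax):
    """P[X = k] = p * q**(k-1) for k = 1..kmax; index 0 holds P[X = 0] = 0."""
    q = 1.0 - p
    return [0.0] + [p * q ** (k - 1) for k in range(1, kmax + 1)]

def convolve(a, b, kmax):
    """PMF of the sum of two independent nonnegative integer r.v.s, kept up to kmax."""
    out = [0.0] * (kmax + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= kmax:
                out[i + j] += ai * bj
    return out

def negative_binomial_pmf(n, p, k):
    """P[N = k] = C(k-1, n-1) p**n q**(k-n) for k >= n, else 0."""
    if k < n:
        return 0.0
    return comb(k - 1, n - 1) * p ** n * (1.0 - p) ** (k - n)

p, n, kmax = 0.3, 4, 40
pmf_sum = geometric_pmf(p, kmax)
for _ in range(n - 1):
    pmf_sum = convolve(pmf_sum, geometric_pmf(p, kmax), kmax)

max_err = max(abs(pmf_sum[k] - negative_binomial_pmf(n, p, k))
              for k in range(kmax + 1))
print(max_err)
```

Truncating at `kmax` does not affect the compared values, because the pmf of a sum of nonnegative r.v.s at k depends only on terms with index at most k.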
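Sum of a random number of random variables
\[ S_N = \sum_{k=1}^{N} X_k, \qquad X_k \text{ i.i.d.} \]
N is a r.v., independent of the \(X_k\)'s.
Mean:
\[ E[S_N \mid N = n] = E[X_1 + \cdots + X_n] = nE[X] \]
\[ E[S_N] = E\big[E[S_N \mid N]\big] = E\big[N\,E[X]\big] = E[N]\,E[X] \]
Characteristic function:
\[ E\big[e^{j\omega S_N} \mid N = n\big] = E\big[e^{j\omega(X_1 + \cdots + X_n)}\big] = \Phi_X(\omega)^n \]
\[ E\big[e^{j\omega S_N} \mid N\big] = \Phi_X(\omega)^N \]
\[ \Phi_{S_N}(\omega) = E\big[E[e^{j\omega S_N} \mid N]\big] = E\big[\Phi_X(\omega)^N\big] = E[z^N]\big|_{z = \Phi_X(\omega)} = G_N(\Phi_X(\omega)) \]
(See Ex. 5.7.)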
Ex. 5.7 The number of jobs N submitted to a computer in an hour is a geometric random variable with parameter p, and the job execution times are independent exponentially distributed random variables with mean \(1/\alpha\). Find the pdf of the sum of the execution times of the jobs submitted in an hour.
Find, in order: \(G_N(z)\), \(\Phi_X(\omega)\), \(\Phi_{S_N}(\omega) = G_N(\Phi_X(\omega))\), and \(f_{S_N}(x)\).
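A simulation can sanity-check the mean of the random sum in Ex. 5.7 against \(E[S_N] = E[N]E[X]\). The sketch below assumes the geometric convention \(P[N = k] = p(1-p)^{k-1}\), \(k = 1, 2, \ldots\) (so \(E[N] = 1/p\)); the slides do not say which convention is intended, and a convention starting at k = 0 would change the numbers. The values of p, \(\alpha\), the seed, and the function name `random_sum` are illustrative.

```python
import random

def random_sum(p, alpha, rng):
    """Draw one S_N = X_1 + ... + X_N with N geometric and X_k exponential."""
    # Assumed convention: P[N = k] = p * (1 - p)**(k - 1) for k = 1, 2, ...
    n = 1
    while rng.random() >= p:      # a "failure" occurs with probability 1 - p
        n += 1
    # random.expovariate(alpha) has mean 1/alpha
    return sum(rng.expovariate(alpha) for _ in range(n))

rng = random.Random(12345)        # fixed seed so the run is repeatable
p, alpha, trials = 0.25, 2.0, 200_000
samples = [random_sum(p, alpha, rng) for _ in range(trials)]
sample_mean = sum(samples) / trials

# E[S_N] = E[N] E[X] = (1/p) * (1/alpha) under the assumed convention
expected_mean = 1.0 / (p * alpha)
print(sample_mean, expected_mean)
```

The sample mean should land close to `expected_mean`; a histogram of `samples` can then be compared with whatever pdf is derived from \(G_N(\Phi_X(\omega))\).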
5.2 Sample Mean and Laws of Large Numbers
Let X be a random variable for which the mean, \(E[X] = \mu\), is unknown.
Let \(X_1, X_2, \ldots, X_n\) denote n independent, repeated measurements of X; i.e., the \(X_j\)'s are iid random variables.
The sample mean of the sequence,
\[ M_n = \frac{1}{n}(X_1 + X_2 + \cdots + X_n), \]
can be used to estimate \(E[X]\).
\(M_n\) is itself a r.v.:
\[ E[M_n] = E\Big[\frac{1}{n}\sum_{j=1}^{n} X_j\Big] = \frac{1}{n}\sum_{j=1}^{n} E[X_j] = \frac{1}{n}\,n\mu = \mu. \]
\(M_n\) is an unbiased estimator for \(\mu\).
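\[ E\big[(M_n - \mu)^2\big] = E\big[(M_n - E[M_n])^2\big] \]
Mean square error of \(M_n\) = variance of \(M_n\).
Since \(M_n = \frac{1}{n}S_n\),
\[ \mathrm{VAR}(M_n) = \frac{1}{n^2}\mathrm{VAR}(S_n) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} \quad (\to 0 \text{ as } n \to \infty). \]
Using the Chebyshev inequality,
\[ P\big[|M_n - E[M_n]| \geq \varepsilon\big] \leq \frac{\mathrm{VAR}(M_n)}{\varepsilon^2} \]
\[ P\big[|M_n - \mu| \geq \varepsilon\big] \leq \frac{\sigma^2}{n\varepsilon^2} \]
or
\[ P\big[|M_n - \mu| < \varepsilon\big] \geq 1 - \frac{\sigma^2}{n\varepsilon^2}. \]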
Ex.5.9 Voltage measurement \(X_j = v + N_j\), where \(v\) is the desired voltage and \(N_j\) is the noise voltage with mean zero and standard deviation 1 V. Assume that the noise voltages are independent random variables. How many measurements are required so that the probability that \(M_n\) is within \(\varepsilon = 1\) V of the true mean is at least 0.99?
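One way to work Ex. 5.9 is with the Chebyshev bound for the sample mean. The following is a sketch using \(\sigma = 1\) V and \(\varepsilon = 1\) V as stated, with \(E[M_n] = v\) because the noise has zero mean:

```latex
\[
P\big[|M_n - v| < \varepsilon\big] \;\geq\; 1 - \frac{\sigma^2}{n\varepsilon^2}
  \;=\; 1 - \frac{1}{n}
\]
\[
1 - \frac{1}{n} \geq 0.99 \quad\Longleftrightarrow\quad n \geq 100
\]
```

Because the Chebyshev bound is conservative, n = 100 measurements are sufficient under this argument, though fewer may be enough when the noise distribution is known.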
Weak Law of Large Numbers
Let \(X_1, X_2, \ldots\) be a sequence of iid random variables with finite mean \(E[X] = \mu\). Then for \(\varepsilon > 0\),
\[ \lim_{n\to\infty} P\big[|M_n - \mu| < \varepsilon\big] = 1. \]
The sample mean will be close to the true mean with high probability.
Strong Law of Large Numbers
Let \(X_1, X_2, \ldots\) be a sequence of iid random variables with finite mean \(E[X] = \mu\) and finite variance. Then
\[ P\big[\lim_{n\to\infty} M_n = \mu\big] = 1. \]
With probability 1, every sequence of sample mean calculations will eventually approach and stay close to \(E[X]\).
Fig. 5.1 (\(M_n\) versus n)
Ex.5.10 In order to estimate the probability of an event A, a sequence of Bernoulli trials is carried out and the relative frequency of A is observed. How large should n be in order to have a 0.95 probability that the relative frequency is within 0.01 of \(p = P[A]\)?
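Ex. 5.10 can be approached with the Chebyshev bound \(P[|M_n - p| < \varepsilon] \geq 1 - \sigma^2/(n\varepsilon^2)\), using the fact that a Bernoulli variance \(p(1-p)\) never exceeds 1/4. The sketch below solves for the smallest n that makes the bound reach the target; the function name `chebyshev_sample_size` and its parameters are illustrative, not from the slides.

```python
from fractions import Fraction
from math import ceil

def chebyshev_sample_size(var_bound, eps, confidence):
    """Smallest n with 1 - var_bound / (n * eps**2) >= confidence."""
    # Exact rational arithmetic avoids floating-point round-off at the boundary.
    var_bound, eps, confidence = (
        Fraction(str(v)) for v in (var_bound, eps, confidence)
    )
    return ceil(var_bound / (eps ** 2 * (1 - confidence)))

# Bernoulli trials: VAR = p(1 - p) <= 1/4 whatever the unknown p is.
n_required = chebyshev_sample_size(var_bound=0.25, eps=0.01, confidence=0.95)
print(n_required)
```

The result is an upper estimate of the required n, since Chebyshev holds for any distribution with the given variance.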
5.3 The Central Limit Theorem
Let \(X_1, X_2, \ldots\) be a sequence of iid random variables with finite mean \(\mu\) and finite variance \(\sigma^2\), and let
\[ S_n = X_1 + X_2 + \cdots + X_n. \]
In Sec. 5.1, we learned how to find the exact pdf of \(S_n\).
CLT: as n becomes large, the cdf of \(S_n\) approaches that of a Gaussian.
Let \(S_n\) be the sum of n iid r.v.s with finite mean \(E[X] = \mu\) and finite variance \(\sigma^2\). Let \(Z_n\) be the zero-mean, unit-variance r.v. defined by
\[ Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}}, \]
then
\[ \lim_{n\to\infty} P[Z_n \leq z] = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-x^2/2}\,dx. \]
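pf:
\[ Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}} = \frac{1}{\sigma\sqrt{n}}\sum_{k=1}^{n}(X_k - \mu) \]
\[ \Phi_{Z_n}(\omega) = E\big[e^{j\omega Z_n}\big] = E\Big[\exp\Big\{\frac{j\omega}{\sigma\sqrt{n}}\sum_{k=1}^{n}(X_k - \mu)\Big\}\Big] \]
\[ = E\Big[\prod_{k=1}^{n} e^{j\omega(X_k - \mu)/(\sigma\sqrt{n})}\Big] = \prod_{k=1}^{n} E\big[e^{j\omega(X_k - \mu)/(\sigma\sqrt{n})}\big] = \Big\{E\big[e^{j\omega(X - \mu)/(\sigma\sqrt{n})}\big]\Big\}^n \]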
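\[ E\big[e^{j\omega(X - \mu)/(\sigma\sqrt{n})}\big] = E\Big[1 + \frac{j\omega}{\sigma\sqrt{n}}(X - \mu) + \frac{(j\omega)^2}{2!\,n\sigma^2}(X - \mu)^2 + R(\omega)\Big] \]
\[ = 1 + \frac{j\omega}{\sigma\sqrt{n}}E[X - \mu] + \frac{(j\omega)^2}{2!\,n\sigma^2}E\big[(X - \mu)^2\big] + E[R(\omega)] \]
\[ = 1 - \frac{\omega^2}{2n} + E[R(\omega)] \]
As \(n \to \infty\), \(E[R(\omega)]\) can be neglected relative to \(\omega^2/2n\), so
\[ \lim_{n\to\infty}\Phi_{Z_n}(\omega) = \lim_{n\to\infty}\Big\{1 - \frac{\omega^2}{2n}\Big\}^n = e^{-\omega^2/2}, \]
the characteristic function of a zero-mean, unit-variance Gaussian r.v.
Fig 5.2-5.4 show the approximation.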
Ex.5.11 Suppose that orders at a restaurant are iid random variables with mean $8 and standard deviation $2. Estimate the probability that the first 100 customers spend a total of more than $840.
Ex.5.12 In Ex. 5.11, after how many orders can we be 90% sure that the total spending by all customers is more than $1000?
Both can be answered using the Gaussian approximation.
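A sketch of how Ex. 5.11 and Ex. 5.12 could be evaluated numerically under the CLT approximation, treating \(S_n\) as Gaussian with mean \(n\mu\) and variance \(n\sigma^2\). It relies on `statistics.NormalDist` from the Python standard library (3.8+); the variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

std_normal = NormalDist()      # zero-mean, unit-variance Gaussian
mu, sigma = 8.0, 2.0           # per-order mean and standard deviation, in dollars

# Ex. 5.11: P[S_100 > 840] ~= Q(z) with z = (840 - n*mu) / (sigma*sqrt(n))
n = 100
z = (840 - n * mu) / (sigma * sqrt(n))
p_exceed = 1.0 - std_normal.cdf(z)

# Ex. 5.12: smallest n with P[S_n > 1000] >= 0.90.
# This needs (1000 - n*mu) / (sigma*sqrt(n)) <= -z90, where z90 is the
# 90th percentile of the standard Gaussian.
z90 = std_normal.inv_cdf(0.90)
# With x = sqrt(n): mu*x**2 - z90*sigma*x - 1000 >= 0; take the positive root.
x = (z90 * sigma + sqrt((z90 * sigma) ** 2 + 4 * mu * 1000)) / (2 * mu)
n_orders = ceil(x ** 2)

print(round(p_exceed, 4), n_orders)
```

Under these assumptions the first probability is Q(2), and the second answer is the root of the quadratic rounded up to a whole number of orders.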
Ex.5.14 In order to estimate the probability of an event A, a sequence of Bernoulli trials is carried out and the relative frequency of A is observed. How large should n be in order to have a 0.95 probability that the relative frequency is within 0.01 of \(p = P[A]\)? (Use the Gaussian approximation for the binomial.)
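For Ex. 5.14, the Gaussian approximation gives \(P[|M_n - p| < \varepsilon] \approx 1 - 2Q(\varepsilon\sqrt{n}/\sigma)\), and \(\sigma = \sqrt{p(1-p)} \leq 1/2\) for any p. The sketch below solves for n under that worst-case \(\sigma\); the function name `clt_sample_size` and the default `sigma_max` are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist

def clt_sample_size(eps, confidence, sigma_max=0.5):
    """Smallest n with 1 - 2Q(eps*sqrt(n)/sigma_max) >= confidence."""
    alpha = 1.0 - confidence
    # z such that 2Q(z) = alpha, i.e. the (1 - alpha/2) Gaussian percentile
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return ceil((z * sigma_max / eps) ** 2)

n_required = clt_sample_size(eps=0.01, confidence=0.95)
print(n_required)
```

The required n is far smaller than the figure the Chebyshev bound gives for the same problem (Ex. 5.10), which illustrates how conservative that bound is.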
5.4 Confidence Intervals
The sample mean estimator
\[ M_n = \frac{1}{n}\sum_{j=1}^{n} X_j \]
provides a single numerical value for the estimate of \(E[X] = \mu\).
In order to know how good the estimate provided by \(M_n\) is, we can compute the sample variance, which is the average dispersion about \(M_n\):
\[ V_n^2 = \frac{1}{n-1}\sum_{j=1}^{n}(X_j - M_n)^2, \qquad E[V_n^2] = \sigma^2. \]
If \(V_n^2\) is small, the \(X_j\)'s are tightly clustered about \(M_n\), and we can be confident that \(M_n\) is close to \(E[X]\).
Another way of specifying accuracy and confidence of an estimate:
Find an interval \((l(\mathbf{X}), u(\mathbf{X}))\) such that
\[ P[l(\mathbf{X}) \leq \mu \leq u(\mathbf{X})] = 1 - \alpha. \]
Such an interval is a \((1-\alpha) \times 100\%\) confidence interval. \(1-\alpha\) is called the confidence level.
The probability \(1-\alpha\) is a measure of degree of confidence.
The width of the confidence interval is a measure of accuracy.
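Case 1. \(X_j\)'s Gaussian with unknown mean \(\mu\) and known variance \(\sigma^2\)
\(M_n\) is Gaussian with mean \(\mu\) and variance \(\sigma^2/n\), so
\[ 1 - 2Q(z) = P\Big[-z \leq \frac{M_n - \mu}{\sigma/\sqrt{n}} \leq z\Big] = P\Big[M_n - \frac{z\sigma}{\sqrt{n}} \leq \mu \leq M_n + \frac{z\sigma}{\sqrt{n}}\Big]. \]
Choose \(z_{\alpha/2}\) such that \(\alpha = 2Q(z_{\alpha/2})\); then
\[ \Big(M_n - \frac{z_{\alpha/2}\,\sigma}{\sqrt{n}},\; M_n + \frac{z_{\alpha/2}\,\sigma}{\sqrt{n}}\Big) \]
is a \((1-\alpha) \times 100\%\) confidence interval for \(\mu\). The lower endpoint is \(l(\mathbf{X})\) and the upper endpoint is \(u(\mathbf{X})\).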
Ex.5.15 A voltage X is given by \(X = v + N\), where \(v\) is an unknown constant voltage and \(N\) is a random noise voltage that has a Gaussian pdf with zero mean and variance 1 V. Find the 95% confidence interval for \(v\) if the voltage X is measured 100 independent times and the sample mean is found to be 5.25 V.
Table 5.1
\(1-\alpha\)        0.90     0.95     0.99
\(z_{\alpha/2}\)    1.645    1.960    2.576
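The Case 1 interval \(M_n \pm z_{\alpha/2}\,\sigma/\sqrt{n}\) can be evaluated for Ex. 5.15 with the values of Table 5.1. This is a minimal sketch; the names `Z_TABLE` and `gaussian_ci` are illustrative.

```python
from math import sqrt

# z_{alpha/2} values from Table 5.1, keyed by confidence level 1 - alpha
Z_TABLE = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}

def gaussian_ci(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean when the variance sigma**2 is known."""
    half_width = Z_TABLE[confidence] * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Ex. 5.15: sigma = 1, n = 100 measurements, sample mean 5.25
lo, hi = gaussian_ci(sample_mean=5.25, sigma=1.0, n=100, confidence=0.95)
print(round(lo, 3), round(hi, 3))
```

The half-width scales as \(1/\sqrt{n}\), so halving the interval width requires four times as many measurements.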
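Case 2. \(X_j\)'s Gaussian; mean and variance unknown
Use the sample variance as a replacement for the variance; the confidence interval becomes
\[ \Big(M_n - \frac{zV_n}{\sqrt{n}},\; M_n + \frac{zV_n}{\sqrt{n}}\Big), \]
with
\[ P\Big[-z \leq \frac{M_n - \mu}{V_n/\sqrt{n}} \leq z\Big] = P\Big[M_n - \frac{zV_n}{\sqrt{n}} \leq \mu \leq M_n + \frac{zV_n}{\sqrt{n}}\Big]. \]
\[ W = \frac{M_n - \mu}{V_n/\sqrt{n}} = \frac{(M_n - \mu)/(\sigma/\sqrt{n})}{\sqrt{\{(n-1)V_n^2/\sigma^2\}/(n-1)}} \]
The numerator is a zero-mean, unit-variance Gaussian r.v.; \((n-1)V_n^2/\sigma^2\) is a chi-square r.v. with n-1 degrees of freedom; the two are indep.
W has a Student's t-distribution with n-1 degrees of freedom.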
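The pdf of W (Ex. 4.38):
\[ f_{n-1}(y) = \frac{\Gamma(n/2)}{\Gamma((n-1)/2)\sqrt{\pi(n-1)}}\Big(1 + \frac{y^2}{n-1}\Big)^{-n/2} \]
\[ P\Big[M_n - \frac{zV_n}{\sqrt{n}} \leq \mu \leq M_n + \frac{zV_n}{\sqrt{n}}\Big] = \int_{-z}^{z} f_{n-1}(y)\,dy = 1 - 2F_{n-1}(-z) \]
Choose \(z_{\alpha/2,n-1}\) such that \(\alpha = 2F_{n-1}(-z_{\alpha/2,n-1})\); then
\[ \Big(M_n - z_{\alpha/2,n-1}\frac{V_n}{\sqrt{n}},\; M_n + z_{\alpha/2,n-1}\frac{V_n}{\sqrt{n}}\Big) \]
is a \((1-\alpha) \times 100\%\) confidence interval for \(\mu\).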
Ex.5.16 The lifetime of a certain device is assumed to have a Gaussian distribution. Eight devices are tested and the sample mean and sample variance for the lifetime obtained are 10 days and 4 days\(^2\). Find the 99% confidence interval for the mean lifetime.
Table 5.2 Values of \(z_{\alpha/2,n-1}\)
n-1    \(1-\alpha\) = .90    .95       .99
1      6.314               12.706    63.657
2      2.920               4.303     9.925
3      2.353               3.182     5.841
4      2.132               2.776     4.604
5      2.015               2.571     4.032
6      1.943               2.447     3.707
7      1.895               2.365     3.499
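Ex. 5.16 can be worked directly from the Case 2 interval. In this sketch n = 8, so the row n-1 = 7 of Table 5.2 applies, and \(V_8\) is the square root of the stated sample variance:

```latex
\[
n = 8, \qquad M_8 = 10, \qquad V_8 = \sqrt{4} = 2, \qquad z_{\alpha/2,\,7} = 3.499
\]
\[
M_8 \pm z_{\alpha/2,\,7}\,\frac{V_8}{\sqrt{8}}
  = 10 \pm \frac{3.499 \times 2}{\sqrt{8}}
  \approx 10 \pm 2.47
  \quad\Longrightarrow\quad (7.53,\ 12.47)\ \text{days}
\]
```

With so few samples, the Student's t value 3.499 is noticeably larger than the Gaussian value 2.576 from Table 5.1, which widens the interval.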
Case 3. \(X_j\)'s non-Gaussian; mean and variance unknown. Use the method of batch means:
perform a series of M independent experiments, in each of which a sample mean (from a large number of observations) is computed.
Ex.5.17 A computer simulation program generates exponentially distributed random variables of unknown mean. Two hundred samples of these random variables are generated and grouped into 10 batches of 20 samples each. The sample means of the 10 batches are:
1.04 0.64 0.80 0.75 1.12
1.30 0.98 0.64 1.39 1.26
Find the 90% confidence interval for the mean of the r.v.
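The method of batch means treats the batch sample means as approximately Gaussian (by the CLT) and applies the Student's t interval of Case 2 to them. The sketch below does this for the data of Ex. 5.17. The t value 1.833 for 9 degrees of freedom at the 90% level is taken from a standard t table and is an assumption here, since the excerpt of Table 5.2 above stops at n-1 = 7.

```python
from math import sqrt
from statistics import mean, variance

# Batch sample means from Ex. 5.17 (10 batches of 20 samples each)
batch_means = [1.04, 0.64, 0.80, 0.75, 1.12,
               1.30, 0.98, 0.64, 1.39, 1.26]
m = len(batch_means)

# Assumed value: Student's t, 9 degrees of freedom, 90% two-sided interval
T_90_9DOF = 1.833

grand_mean = mean(batch_means)
batch_std = sqrt(variance(batch_means))   # sample variance divides by m - 1
half_width = T_90_9DOF * batch_std / sqrt(m)
ci = (grand_mean - half_width, grand_mean + half_width)

print(round(grand_mean, 3), round(ci[0], 3), round(ci[1], 3))
```

The approach needs batches large enough for each batch mean to be close to Gaussian; with 20 exponential samples per batch that is only approximately true.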
5.5 Convergence of Sequences of Random Variables
A sequence of random variables \(\mathbf{X}\) is a function that assigns a countably infinite number of real values to each outcome \(\zeta\) from some sample space S:
\[ \mathbf{X}(\zeta) = (X_1(\zeta), X_2(\zeta), \ldots, X_n(\zeta), \ldots). \]
- We sometimes use \(\{X_n\}\) or \(\{X_n(\zeta)\}\) to denote \(\mathbf{X}(\zeta)\): a sequence of functions of \(\zeta\).
In Section 5.2, we discussed the convergence of the sequence of arithmetic averages \(M_n\) of iid random variables to the expected value \(\mu\):
\[ M_n \to \mu \quad \text{as } n \to \infty. \]
In this section we consider the more general situation where a sequence of random variables (usually not iid) \(X_1, X_2, \ldots\) converges to some random variable X:
\[ X_n \to X \quad \text{as } n \to \infty. \]
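Ex.5.18
\[ V_n(\zeta) = \zeta\Big(1 - \frac{1}{n}\Big), \qquad \zeta \in S = [0, 1]. \]
\(V_n(\zeta)\) is a sequence of functions of \(\zeta\), and a sequence of real numbers for a given \(\zeta\).
[Figure: \(V_1(\zeta), V_2(\zeta), V_3(\zeta)\) plotted against \(\zeta\); and \(V_n(\zeta)\) plotted against n = 1, ..., 5 for a fixed \(\zeta\), taking the values 0, 1/2, 2/3, 3/4, 4/5.]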
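The sequence \(x_n\) converges to \(x\) if, given any \(\varepsilon > 0\), we can specify an integer N such that for all values of n beyond N we can guarantee that \(|x_n - x| < \varepsilon\).
If the limit x is unknown, we can use the Cauchy criterion:
The sequence \(x_n\) converges if and only if, given \(\varepsilon > 0\), we can specify an integer N' such that for m, n greater than N', \(|x_n - x_m| < \varepsilon\).
[Figure: \(x_n\) versus n; beyond N the sequence stays inside a band of width \(2\varepsilon\) about x.]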
Sure Convergence: The sequence of random variables \(\{X_n(\zeta)\}\) converges surely to the random variable \(X(\zeta)\) if the sequence of functions \(X_n(\zeta)\) converges to the function \(X(\zeta)\) as \(n \to \infty\) for all \(\zeta\) in S:
\[ X_n(\zeta) \to X(\zeta) \quad \text{as } n \to \infty \text{ for all } \zeta \text{ in } S. \]
Almost-Sure Convergence: \(X_n(\zeta) \to X(\zeta)\) as \(n \to \infty\) for all \(\zeta\) in S, except possibly on a set of probability zero; that is,
\[ P[\zeta : X_n(\zeta) \to X(\zeta) \text{ as } n \to \infty] = 1. \]
Ex: Strong Law of Large Numbers
Ex. 5.20 Let \(\zeta\) be selected at random from the interval S = [0,1], where we assume that the probability that \(\zeta\) is in a subinterval of S is equal to the length of the subinterval. Define the following five sequences of random variables:
\[ U_n(\zeta) = \frac{\zeta}{n} \]
\[ V_n(\zeta) = \zeta\Big(1 - \frac{1}{n}\Big) \]
\[ W_n(\zeta) = \zeta e^{n} \]
\[ Y_n(\zeta) = \cos 2\pi n\zeta \]
\[ Z_n(\zeta) = e^{-n(n\zeta - 1)} \]
Which of these sequences converge surely? almost surely?
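Evaluating the sequences of Ex. 5.20 at a few outcomes \(\zeta\) gives a feel for their behavior before any proof is attempted. The sketch below assumes the definitions \(U_n = \zeta/n\), \(V_n = \zeta(1 - 1/n)\), \(W_n = \zeta e^n\), \(Y_n = \cos 2\pi n\zeta\), \(Z_n = e^{-n(n\zeta - 1)}\); the dictionary `SEQUENCES` and the helper `tail` are illustrative names.

```python
from math import cos, exp, pi

# The five sequences, as functions of the outcome zeta and the index n
SEQUENCES = {
    "U": lambda z, n: z / n,
    "V": lambda z, n: z * (1 - 1 / n),
    "W": lambda z, n: z * exp(n),
    "Y": lambda z, n: cos(2 * pi * n * z),
    "Z": lambda z, n: exp(-n * (n * z - 1)),
}

def tail(name, zeta, ns=(1, 5, 10, 20)):
    """Values of the named sequence at the given indices for a fixed outcome zeta."""
    return [SEQUENCES[name](zeta, n) for n in ns]

for zeta in (0.0, 0.3, 0.5):
    for name in SEQUENCES:
        print(name, zeta, tail(name, zeta))
```

Numerical values cannot prove convergence, but they indicate what to look for: for example, \(Z_n(\zeta)\) shrinks rapidly for \(\zeta > 0\) yet grows without bound at the single point \(\zeta = 0\), an outcome set of probability zero.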
Ex. 5.21 Let the sequence of random variables \(X_n(\zeta)\) consist of independent equiprobable Bernoulli random variables,
\[ P[X_n(\zeta) = 0] = P[X_n(\zeta) = 1] = \frac{1}{2}. \]
Does this sequence of random variables converge?
Ex. 5.22 An urn contains 2 black balls and 2 white balls. At time n a ball is selected at random from the urn, and the color is noted. If the number of balls of this color is greater than the number of balls of the other color, then the ball is put back in the urn; otherwise, the ball is left out. Let \(X_n(\zeta)\) be the number of black balls in the urn after the nth draw. Does this sequence of random variables converge?
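The urn of Ex. 5.22 is easy to simulate, which helps in guessing the answer before arguing it formally. The sketch below assumes the color counts are compared before the drawn ball is removed; the function name `urn_path`, the seed, and the number of draws are illustrative.

```python
import random

def urn_path(n_draws, rng):
    """Return [X_1, ..., X_n]: the number of black balls after each draw."""
    black, white = 2, 2
    path = []
    for _ in range(n_draws):
        drew_black = rng.random() < black / (black + white)
        if drew_black:
            if black <= white:   # not the majority color: the ball is left out
                black -= 1
        else:
            if white <= black:
                white -= 1
        path.append(black)
    return path

rng = random.Random(2024)        # fixed seed so the run is repeatable
paths = [urn_path(200, rng) for _ in range(1000)]
final_values = sorted({p[-1] for p in paths})
print(final_values)
```

In each simulated path the count of black balls settles at a fixed value after finitely many draws, which is the behavior to check against the definition of almost-sure convergence.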
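Mean-Square Convergence
\[ E\big[(X_n(\zeta) - X(\zeta))^2\big] \to 0 \quad \text{as } n \to \infty \]
Ex. 5.23 Does the sequence \(V_n(\zeta)\) converge in the mean square sense?
\[ V_n(\zeta) = \zeta\Big(1 - \frac{1}{n}\Big) \]
Convergence in Probability
For any \(\varepsilon > 0\),
\[ P\big[|X_n(\zeta) - X(\zeta)| > \varepsilon\big] \to 0 \quad \text{as } n \to \infty \]
Ex: weak law of large numbers.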
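For Ex. 5.23, a short computation settles the question. It is a sketch that takes \(V_n(\zeta) = \zeta(1 - 1/n)\) with \(\zeta\) uniform on [0,1], and tests the candidate limit \(V(\zeta) = \zeta\):

```latex
\[
E\big[(V_n(\zeta) - \zeta)^2\big]
  = E\Big[\Big(\frac{\zeta}{n}\Big)^2\Big]
  = \frac{1}{n^2}\int_0^1 \zeta^2\,d\zeta
  = \frac{1}{3n^2}
  \;\longrightarrow\; 0 \quad \text{as } n \to \infty
\]
```

So under these assumptions \(V_n(\zeta)\) converges to \(\zeta\) in the mean square sense.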
Convergence in Distribution: The sequence of random variables \(\{X_n\}\) with cumulative distribution functions \(\{F_n(x)\}\) converges in distribution to the random variable X with cumulative distribution \(F(x)\) if
\[ F_n(x) \to F(x) \quad \text{as } n \to \infty \]
for all x at which \(F(x)\) is continuous.
Ex. Central limit theorem
Ex. 5.21: Bernoulli iid sequence
Ex. 5.24 Does \(Z_n(\zeta) = e^{-n(n\zeta - 1)}\) converge in the mean square sense?
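[Figure: relations among the modes of convergence. Sure (s) implies almost sure (a.s.), which implies convergence in probability (prob), which implies convergence in distribution (dist); mean square (m.s.) also implies convergence in probability.]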