moment generating function and statistical distributions
MOMENT GENERATING FUNCTION AND
STATISTICAL DISTRIBUTIONS
MOMENT GENERATING FUNCTION
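The m.g.f. of a random variable X is defined as
$$M_X(t) = E(e^{tX}) = \begin{cases} \sum_{\text{all } x} e^{tx} f(x), & \text{if } X \text{ is discrete} \\ \int_{\text{all } x} e^{tx} f(x)\,dx, & \text{if } X \text{ is continuous} \end{cases}$$
for t ∈ (-h,h) for some h>0.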
Properties of m.g.f.
• M(0)=E[1]=1
• If a r.v. X has m.g.f. M(t), then Y=aX+b has m.g.f. $M_Y(t) = e^{bt} M(at)$.
• $E(X^k) = M^{(k)}(0)$, where $M^{(k)}$ is the k-th derivative of M.
• The m.g.f. does not always exist (e.g. Cauchy distribution).
Example
• Suppose that X has the following p.d.f.
$$f(x) = x e^{-x} \quad \text{for } x > 0$$
Find the m.g.f., expectation and variance.
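A hand computation for this example can be checked numerically. The sketch below assumes the density is f(x) = x e^{-x} for x > 0 and that the integral works out to M(t) = (1-t)^{-2} for t < 1; it compares that candidate against a midpoint-rule approximation of E(e^{tX}) and then reads the first two moments off M by finite differences. The function names, step sizes and truncation point are illustrative choices, not part of the slides.

```python
import math

def pdf(x):
    """Example density f(x) = x * exp(-x) for x > 0 (assumed form)."""
    return x * math.exp(-x)

def mgf_numeric(t, upper=80.0, steps=80_000):
    """Midpoint-rule approximation of E[exp(tX)]; meant for t well below 1."""
    h = upper / steps
    return sum(math.exp(t * x) * pdf(x) * h
               for x in (h * (i + 0.5) for i in range(steps)))

def mgf_closed(t):
    """Candidate closed form (1 - t)^(-2), valid for t < 1."""
    return (1.0 - t) ** -2

def moments_from_mgf(M, h=1e-3):
    """Central finite differences approximating M'(0) and M''(0)."""
    m1 = (M(h) - M(-h)) / (2 * h)
    m2 = (M(h) - 2 * M(0.0) + M(-h)) / h ** 2
    return m1, m2

mean, second = moments_from_mgf(mgf_closed)
variance = second - mean ** 2
print(mgf_numeric(0.5), mgf_closed(0.5))  # the two values should nearly agree
print(mean, variance)                     # both should be close to 2
```

The finite-difference step is only a stand-in for differentiating M(t) by hand; the property being used is E(X^k) = M^{(k)}(0).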
CHARACTERISTIC FUNCTION
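The c.h.f. of a random variable X is defined as
$$\phi_X(t) = E(e^{itX}) = \begin{cases} \sum_{\text{all } x} e^{itx} f(x), & \text{if } X \text{ is discrete} \\ \int_{\text{all } x} e^{itx} f(x)\,dx, & \text{if } X \text{ is continuous} \end{cases}$$
for all real numbers t, where $i^2 = -1$, $i = \sqrt{-1}$.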
C.h.f. always exists.
Uniqueness Theorem:
1. If two r.v.s have m.g.f.s that exist and are equal, then they have the same distribution.
2. If two r.v.s have the same distribution, then they have the same m.g.f. (if it exists).
Similar statements are true for the c.h.f.
Problem
• It is sometimes the case that exact values of random variables (Y1, Y2, …) cannot be observed, but we can observe whether they are at least some fixed value. Let Y1, Y2, … be i.i.d. r.v.s. Let a be a fixed number on the real line. For i=1,2,…, define
$$X_i = \begin{cases} 1, & \text{if } Y_i \ge a \\ 0, & \text{if } Y_i < a \end{cases}$$
Problem, cont.
For example, if a manufacturing process produces parts with strength Yi that are tested to see if they can withstand stress a, then Xi denotes whether the strength is at least a or less than a. In such a case, we cannot directly observe the strength Yi of the ith part, but we can observe whether it breaks in the stress test.
Problem, cont.
• Define p=P(Y1≥a) and q=1-p, Sn=X1+X2+…+Xn. Note that Sn is the number of Y1, …, Yn that are at least a.
i) Define the characteristic function, say $\phi_X(t)$, of a r.v. X.
ii) Find $\phi_{X_1}(t)$.
iii) Find $\phi_{S_n}(t)$.
iv) Find P(Sn=j).
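One way to sanity-check answers to parts (ii) to (iv) is to compare two routes to the c.h.f. of Sn. The sketch below assumes the indicator X1 has c.h.f. q + p e^{it}, that independence gives the n-th power for Sn, and that Sn then has the Binomial(n, p) p.m.f.; the function names and the values of n, p and t are made up for illustration.

```python
import cmath
from math import comb

def chf_bernoulli(t, p):
    """Assumed c.h.f. of one indicator: q + p*exp(it), with q = 1 - p."""
    return (1 - p) + p * cmath.exp(1j * t)

def chf_sum(t, n, p):
    """C.h.f. of S_n as the n-th power of the single-indicator c.h.f."""
    return chf_bernoulli(t, p) ** n

def chf_from_pmf(t, n, p):
    """E[exp(it S_n)] computed directly from the Binomial(n, p) p.m.f."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) * cmath.exp(1j * t * j)
               for j in range(n + 1))

n, p, t = 10, 0.3, 0.7  # illustrative values only
gap = abs(chf_sum(t, n, p) - chf_from_pmf(t, n, p))
print(gap)  # should be at the level of floating-point rounding
```

If the two routes agree for several values of t, that is consistent with (though not a proof of) P(Sn=j) being the binomial probability.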
Other generating functions
• log M(t) is called the cumulant generating function.
• $G_X(t) = E(t^X)$ is the factorial moment generating function.
• Note: there is a simple relation between the m.g.f. and the f.m.g.f.:
$$G_X(t) = E(t^X) = E(e^{\ln t^X}) = E(e^{X \ln t}) = M_X(\ln t)$$
Other generating functions
$$\frac{d}{dt} G_X(t)\Big|_{t=1} = E(X t^{X-1})\Big|_{t=1} = E(X)$$
$$\frac{d^2}{dt^2} G_X(t)\Big|_{t=1} = E(X(X-1))$$
$$\vdots$$
$$\frac{d^k}{dt^k} G_X(t)\Big|_{t=1} = E(X(X-1)\cdots(X-k+1))$$
Example
• Suppose X has the following p.m.f.
$$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots; \quad \lambda > 0$$
Find the expectation and variance of X.
• Solution: Let's use the factorial m.g.f.
Example
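$$G_X(t) = E(t^X) = \sum_{x=0}^{\infty} t^x \frac{e^{-\lambda}\lambda^x}{x!} = e^{-\lambda}\sum_{x=0}^{\infty} \frac{(\lambda t)^x}{x!} = e^{-\lambda} e^{\lambda t} = e^{\lambda(t-1)}$$
$$\frac{d}{dt} G_X(t)\Big|_{t=1} = \lambda e^{\lambda(t-1)}\Big|_{t=1} = \lambda = E(X)$$
$$\frac{d^2}{dt^2} G_X(t)\Big|_{t=1} = \lambda^2 e^{\lambda(t-1)}\Big|_{t=1} = \lambda^2 = E(X(X-1))$$
$$E(X(X-1)) = E(X^2) - E(X) \;\Rightarrow\; E(X^2) = \lambda^2 + \lambda$$
$$Var(X) = E(X^2) - (E(X))^2 = \lambda^2 + \lambda - \lambda^2 = \lambda$$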
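The factorial moments in this example can also be checked by brute force. The following sketch sums x(x-1)...(x-k+1) against the Poisson p.m.f. over a truncated range; the value of λ, the truncation at 100 terms and the function names are arbitrary choices for illustration.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson(lam) random variable."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def factorial_moment(k, lam, terms=100):
    """E[X(X-1)...(X-k+1)] by truncated summation over x = 0..terms-1."""
    total = 0.0
    for x in range(terms):
        falling = 1
        for j in range(k):
            falling *= (x - j)
        total += falling * poisson_pmf(x, lam)
    return total

lam = 3.5                          # illustrative value
m1 = factorial_moment(1, lam)      # E(X), expected to be close to lam
m2 = factorial_moment(2, lam)      # E(X(X-1)), expected to be close to lam**2
variance = m2 + m1 - m1 ** 2       # Var(X) = E(X(X-1)) + E(X) - (E(X))^2
print(m1, m2, variance)
```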
STATISTICAL DISTRIBUTIONS
Recall
• Random variable: A function defined on the sample space S that associates a real number with each outcome in S.
Example
• Toss three coins
• Sample space S={s1=HHH, s2=HHT, …, s6=THT, s7=TTH, s8=TTT}
• Define X=number of heads: X(s1)=3, X(s6)=1, X(s8)=0
• Define Y=number of tails before first head: Y(s1)=0, Y(s6)=1, Y(s8)=3
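Since a random variable is just a function on the sample space, the coin example can be written out directly. The sketch below enumerates the eight outcomes and evaluates X and Y on each; the p.m.f.s at the end additionally assume fair coins, which the slide does not state.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space in the slide's order: HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
outcomes = ["".join(s) for s in product("HT", repeat=3)]

def X(s):
    """Number of heads in outcome s."""
    return s.count("H")

def Y(s):
    """Number of tails before the first head (3 if there is no head)."""
    return s.index("H") if "H" in s else len(s)

# Assumption: fair coins, so each outcome has probability 1/8.
pmf_X = {k: Fraction(v, len(outcomes)) for k, v in Counter(map(X, outcomes)).items()}
pmf_Y = {k: Fraction(v, len(outcomes)) for k, v in Counter(map(Y, outcomes)).items()}

for s in outcomes:
    print(s, X(s), Y(s))
print(pmf_X)
print(pmf_Y)
```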
Random variables
• A random variable is continuous if its CDF, F(x)=P(X≤x), is continuous.
• A random variable is discrete if its CDF, F(x)=P(X≤x), is a step function.
• It is possible for a CDF to have continuous pieces and steps, but we will mostly concentrate on the previous two bullets in this course.
SOME DISCRETE PROBABILITY DISTRIBUTIONS
Degenerate, Uniform, Bernoulli, Binomial, Poisson, Negative Binomial, Geometric, Hypergeometric
DEGENERATE DISTRIBUTION
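• An rv X is degenerate at point k if
$$P(X = x) = \begin{cases} 1, & x = k \\ 0, & \text{o.w.} \end{cases}$$
The cdf:
$$F(x) = P(X \le x) = \begin{cases} 0, & x < k \\ 1, & x \ge k \end{cases}$$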
UNIFORM DISTRIBUTION
• A finite number of equally spaced values are equally likely to be observed.
• Example: throw a fair die. P(X=1)=…=P(X=6)=1/6
$$P(X = x) = \frac{1}{N}; \quad x = 1, 2, \ldots, N; \quad N = 1, 2, \ldots$$
$$E(X) = \frac{N+1}{2}; \quad Var(X) = \frac{(N+1)(N-1)}{12}$$
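The mean and variance formulas for the discrete uniform distribution on 1, ..., N can be confirmed with exact arithmetic. The sketch below uses the fair-die case N = 6 from the example; the function name is illustrative.

```python
from fractions import Fraction

def discrete_uniform_moments(N):
    """Exact mean and variance of the uniform distribution on 1..N."""
    values = range(1, N + 1)
    mean = sum(Fraction(x, N) for x in values)
    second = sum(Fraction(x * x, N) for x in values)
    return mean, second - mean ** 2

N = 6
mean, var = discrete_uniform_moments(N)
print(mean, var)                                     # 7/2 35/12
print(Fraction(N + 1, 2), Fraction((N + 1) * (N - 1), 12))
```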
BERNOULLI DISTRIBUTION
• A Bernoulli trial is an experiment with only two outcomes. An r.v. X has Bernoulli(p) distribution if
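$$X = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } 1-p \end{cases}, \quad 0 \le p \le 1$$
$$P(X = x) = p^x (1-p)^{1-x} \quad \text{for } x = 0, 1; \text{ and } 0 < p < 1$$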
BERNOULLI DISTRIBUTION
• P(X=0)=1-p and P(X=1)=p
• E(X)=p
• $Var(X) = E(X^2) - (E(X))^2 = (0^2)(1-p) + (1^2)(p) - p^2 = p(1-p)$
BINOMIAL DISTRIBUTION
• Define an rv Y by
Y = total number of successes in n Bernoulli trials.
1. There are n trials (n is finite and fixed).
2. Each trial can result in a success or a failure.
3. The probability p of success is the same for all the trials.
4. All the trials of the experiment are independent.
$$Y = \sum_{i=1}^{n} X_i \sim Bin(n, p), \quad \text{where } X_i \sim Ber(p).$$
Let $X_i \sim Bin(n_i, p)$, independent. Then
$$\sum_{i=1}^{k} X_i \sim Bin(n_1 + n_2 + \cdots + n_k,\; p).$$
BINOMIAL DISTRIBUTION
• Example:
• There are black and white balls in a box. Select and record the color of the ball. Put it back and re-pick (sampling with replacement).
• n: number of independent and identical trials
• p: probability of success (e.g. probability of picking a black ball)
• X: number of successes in n trials
BINOMIAL THEOREM
• For any real numbers x and y and integer n>0
$$(x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i}$$
BINOMIAL DISTRIBUTION
• If Y~Bin(n,p), then
$$P(Y = y) = \binom{n}{y} p^y (1-p)^{n-y}, \quad y = 0, 1, \ldots, n; \quad 0 < p < 1$$
$$E(Y) = np$$
$$Var(Y) = np(1-p)$$
$$M_Y(t) = [pe^t + (1-p)]^n$$
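The binomial formulas above can be verified directly from the p.m.f. The sketch below checks that the probabilities sum to 1 (the binomial theorem with x = p, y = 1-p) and compares the mean, variance and m.g.f. with np, np(1-p) and [pe^t + (1-p)]^n. The values of n, p and t are arbitrary illustrations.

```python
from math import comb, exp

def binom_pmf(y, n, p):
    """P(Y = y) for Y ~ Bin(n, p)."""
    return comb(n, y) * p ** y * (1 - p) ** (n - y)

n, p, t = 12, 0.3, 0.4  # illustrative values only
pmf = [binom_pmf(y, n, p) for y in range(n + 1)]

total = sum(pmf)
mean = sum(y * f for y, f in enumerate(pmf))
var = sum(y * y * f for y, f in enumerate(pmf)) - mean ** 2
mgf_by_sum = sum(exp(t * y) * f for y, f in enumerate(pmf))
mgf_formula = (p * exp(t) + (1 - p)) ** n

print(total, mean, var)
print(mgf_by_sum, mgf_formula)
```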
POISSON DISTRIBUTION
• The number of occurrences in a given time interval can be modeled by the Poisson distribution.
• e.g. number of customers arriving at a bank between 13:00 and 13:30.
• Another application is in spatial distributions.
• e.g. modeling the distribution of bomb hits in an area or the distribution of fish in a lake.
POISSON DISTRIBUTION
• If X~Poi(λ), then
$$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots; \quad \lambda > 0$$
• E(X)=Var(X)=λ
• $M_X(t) = \exp\{\lambda[\exp(t) - 1]\}$
Relationship between Binomial and Poisson
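$X \sim Bin(n, p)$ with mgf $M_X(t) = (pe^t + 1 - p)^n$.
Let λ=np.
$$\lim_{n\to\infty} M_X(t) = \lim_{n\to\infty} (pe^t + 1 - p)^n = \lim_{n\to\infty} \left(1 + \frac{\lambda(e^t - 1)}{n}\right)^n = e^{\lambda(e^t - 1)} = M_Y(t),$$
which is the mgf of Poisson(λ).
The limiting distribution of a Binomial rv, as n→∞ with λ=np held fixed, is the Poisson distribution.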
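The limit can be seen numerically by fixing λ and letting n grow with p = λ/n. The sketch below reports the largest pointwise gap between the Bin(n, λ/n) and Poisson(λ) p.m.f.s over x = 0, ..., 20; λ = 2, the cutoff 20 and the helper names are illustrative assumptions.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(Y = x) for Y ~ Poisson(lam)."""
    return exp(-lam) * lam ** x / factorial(x)

def max_gap(n, lam, upto=20):
    """Largest |binomial - Poisson| p.m.f. difference for x = 0..min(n, upto)."""
    p = lam / n
    return max(abs(binom_pmf(x, n, p) - poisson_pmf(x, lam))
               for x in range(min(n, upto) + 1))

lam = 2.0
gaps = {n: max_gap(n, lam) for n in (10, 100, 1000)}
for n, g in gaps.items():
    print(n, g)  # the gap should shrink as n grows
```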
NEGATIVE BINOMIAL DISTRIBUTION (PASCAL OR WAITING TIME DISTRIBUTION)
• X: number of failures before the r th success in a sequence of Bernoulli trials; or, alternatively,
• Y: number of Bernoulli trials required to get a fixed number of successes, such as r successes.
NEGATIVE BINOMIAL DISTRIBUTION (PASCAL OR WAITING TIME DISTRIBUTION)
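X~NB(r,p)
$$P(X = x) = \binom{r+x-1}{x} p^r (1-p)^x, \quad x = 0, 1, \ldots; \quad 0 < p < 1$$
$$E(X) = \frac{r(1-p)}{p}$$
$$Var(X) = \frac{r(1-p)}{p^2}$$
$$M_X(t) = p^r [1 - (1-p)e^t]^{-r}$$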
NEGATIVE BINOMIAL DISTRIBUTION
• An alternative form of the pdf:
$$P(Y = y) = \binom{y-1}{r-1} p^r (1-p)^{y-r}, \quad y = r, r+1, \ldots; \quad 0 < p < 1$$
Note: Y=X+r
$$E(Y) = E(X) + r = \frac{r}{p}$$
$$Var(Y) = Var(X) = \frac{r(1-p)}{p^2}$$
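The two parameterizations can be checked against each other numerically. The sketch below takes X as the number of failures before the r-th success, sums its p.m.f. over a long truncated range, and compares the results with r(1-p)/p and r(1-p)/p^2; shifting by r then gives the mean of Y = X + r. The values r = 3 and p = 0.4 and the truncation at 400 terms are arbitrary.

```python
from math import comb

def nb_pmf(x, r, p):
    """P(X = x): x failures before the r-th success."""
    return comb(r + x - 1, x) * p ** r * (1 - p) ** x

r, p = 3, 0.4                      # illustrative values only
xs = range(400)                    # tail beyond 400 is negligible for p = 0.4
total = sum(nb_pmf(x, r, p) for x in xs)
mean_X = sum(x * nb_pmf(x, r, p) for x in xs)
var_X = sum(x * x * nb_pmf(x, r, p) for x in xs) - mean_X ** 2
mean_Y = mean_X + r                # Y = X + r counts trials instead of failures

print(total, mean_X, var_X, mean_Y)
```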
GEOMETRIC DISTRIBUTION
• Distribution of the number of Bernoulli trials required to get the first success.
• It is the special case of the Negative Binomial Distribution with r=1.
X~Geometric(p)
$$P(X = x) = p(1-p)^{x-1}, \quad x = 1, 2, \ldots$$
$$E(X) = \frac{1}{p}$$
$$Var(X) = \frac{1-p}{p^2}$$
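GEOMETRIC DISTRIBUTION
• Example: If the probability is 0.001 that a light bulb will fail on any given day, then what is the probability that it will last at least 30 days?
• Solution: Let X be the day on which the bulb fails, so that X~Geometric(0.001).
$$P(X > 30) = \sum_{x=31}^{\infty} 0.001\,(1 - 0.001)^{x-1} = (0.999)^{30} \approx 0.97$$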
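The light bulb calculation can be reproduced in a few lines. The sketch below assumes X is the day of failure with X~Geometric(0.001) and evaluates P(X > 30) both through the closed form (1-p)^30 and through a truncated version of the infinite sum; the truncation point is an arbitrary choice.

```python
p = 0.001  # daily failure probability from the example

# Closed form: the bulb survives each of the first 30 days.
direct = (1 - p) ** 30

# Truncated series sum_{x=31}^{49999} p * (1 - p)^(x - 1); the omitted tail is negligible.
series = sum(p * (1 - p) ** (x - 1) for x in range(31, 50_000))

print(round(direct, 4))  # 0.9704
print(series)
```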
HYPERGEOMETRIC DISTRIBUTION
• A box contains N marbles. Of these, M are red. Suppose that n marbles are drawn randomly from the box without replacement. The distribution of the number of red marbles, X, is
$$P(X = x) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}, \quad x = 0, 1, \ldots, n$$
It deals with a finite population.
X~Hypergeometric(N,M,n)
HYPERGEOMETRIC DISTRIBUTION
• As N →∞ with M/N → p, hypergeometric → binomial.
• In that case, sampling with or without replacement does not make much difference (especially if n/N is small).
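The convergence to the binomial can be illustrated by holding the red fraction M/N and the sample size n fixed while the population grows. The sketch below is a minimal comparison under those assumptions; the 30% red fraction, n = 5 and the helper names are made up for the illustration.

```python
from math import comb

def hypergeom_pmf(x, N, M, n):
    """P(X = x) red marbles when n are drawn without replacement from N (M red)."""
    if x < 0 or x > n or x > M or n - x > N - M:
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p), i.e. sampling with replacement."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def max_gap(N, n=5):
    """Largest p.m.f. difference when 30% of the N marbles are red."""
    M = 3 * N // 10
    return max(abs(hypergeom_pmf(x, N, M, n) - binom_pmf(x, n, M / N))
               for x in range(n + 1))

gaps = {N: max_gap(N) for N in (20, 200, 2000)}
for N, g in gaps.items():
    print(N, g)  # the gap should shrink as N grows
```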
MULTIVARIATE
DISTRIBUTIONS
EXTENDED HYPERGEOMETRIC DISTRIBUTION
• Suppose that a collection consists of a finite number of items, N, and that there are k+1 different types: M1 of type 1, M2 of type 2, and so on. Select n items at random without replacement, and let Xi be the number of items of type i that are selected. The vector X=(X1, X2,…,Xk) has an extended hypergeometric distribution and the joint pdf is
$$f(x_1, x_2, \ldots, x_k) = \frac{\binom{M_1}{x_1}\binom{M_2}{x_2}\cdots\binom{M_k}{x_k}\binom{M_{k+1}}{x_{k+1}}}{\binom{N}{n}}, \quad x_i \in \{0, 1, \ldots, M_i\},$$
where $M_{k+1} = N - \sum_{i=1}^{k} M_i$ and $x_{k+1} = n - \sum_{i=1}^{k} x_i$.
MULTINOMIAL DISTRIBUTION
• Let E1,E2,...,Ek,Ek+1 be k+1 mutually exclusive and exhaustive events which can occur on any trial of an experiment with P(Ei)=pi,i=1,2,…,k+1. On n independent trials of the experiment, let Xi be the number of occurrences of the event Ei. Then, the vector X=(X1, X2,…,Xk) has a multinomial distribution with joint pdf
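$$f(x_1, x_2, \ldots, x_k) = \frac{n!}{x_1!\,x_2!\cdots x_{k+1}!}\; p_1^{x_1} p_2^{x_2} \cdots p_{k+1}^{x_{k+1}}, \quad x_i \in \{0, 1, \ldots, n\},$$
where $x_{k+1} = n - \sum_{i=1}^{k} x_i$ and $p_{k+1} = 1 - \sum_{i=1}^{k} p_i$.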
• Experiment involves drawing with replacement.
• Binomial is a special case of multinomial with k+1=2
MULTINOMIAL DISTRIBUTION
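• Consider the trinomial case for simplicity.
$$f(x_1, x_2) = \frac{n!}{x_1!\,x_2!\,(n - x_1 - x_2)!}\; p_1^{x_1} p_2^{x_2} (1 - p_1 - p_2)^{n - x_1 - x_2}, \quad x_i \in \{0, 1, \ldots, n\}$$
$$M(t_1, t_2) = E(e^{t_1 X_1 + t_2 X_2}) = \sum_{x_1=0}^{n} \sum_{x_2=0}^{n - x_1} \frac{n!}{x_1!\,x_2!\,(n - x_1 - x_2)!}\,(p_1 e^{t_1})^{x_1} (p_2 e^{t_2})^{x_2} (1 - p_1 - p_2)^{n - x_1 - x_2}$$
$$= (p_1 e^{t_1} + p_2 e^{t_2} + 1 - p_1 - p_2)^n$$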
MULTINOMIAL DISTRIBUTION
• M.g.f. of X1:
$$M_{X_1}(t_1) = M(t_1, 0) = E(e^{t_1 X_1 + 0 \cdot X_2}) = (p_1 e^{t_1} + 1 - p_1)^n$$
X1~Bin(n,p1)
Similarly, X2~Bin(n,p2)
But, Cov(X1,X2)≠0!
Cov(X1,X2)=?
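The covariance question can be explored numerically before deriving it. The sketch below enumerates the trinomial p.m.f. for one illustrative choice of n, p1 and p2, computes Cov(X1, X2) = E(X1 X2) - E(X1)E(X2), and prints it next to -n p1 p2, the standard closed form for multinomial counts. The function names and parameter values are assumptions made for the example.

```python
from math import factorial

def trinomial_pmf(x1, x2, n, p1, p2):
    """Joint p.m.f. f(x1, x2) of the trinomial distribution."""
    x3 = n - x1 - x2
    coef = factorial(n) // (factorial(x1) * factorial(x2) * factorial(x3))
    return coef * p1 ** x1 * p2 ** x2 * (1 - p1 - p2) ** x3

def covariance(n, p1, p2):
    """Cov(X1, X2) by direct summation over the support."""
    e1 = e2 = e12 = 0.0
    for x1 in range(n + 1):
        for x2 in range(n - x1 + 1):
            f = trinomial_pmf(x1, x2, n, p1, p2)
            e1 += x1 * f
            e2 += x2 * f
            e12 += x1 * x2 * f
    return e12 - e1 * e2

n, p1, p2 = 8, 0.2, 0.5  # illustrative values only
cov = covariance(n, p1, p2)
print(cov, -n * p1 * p2)
```

The negative sign matches intuition: with n fixed, more outcomes of type 1 leave fewer trials for type 2.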
• Example: Suppose we have a bowl with 10 marbles - 2 red marbles, 3 green marbles, and 5 blue marbles. We randomly select 4 marbles from the bowl, with replacement. What is the probability of selecting 2 green marbles and 2 blue marbles?
• n = 4, k+1 = 3, n_red = 0, n_green = 2, n_blue = 2
• p_red = 0.2, p_green = 0.3, p_blue = 0.5
• $P = \frac{n!}{n_1!\,n_2!\cdots n_{k+1}!}\; p_1^{n_1} p_2^{n_2} \cdots p_{k+1}^{n_{k+1}}$
$P = \frac{4!}{0!\,2!\,2!}\,(0.2)^0 (0.3)^2 (0.5)^2$
P = 0.135
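The marble calculation can be reproduced with a small helper that evaluates the multinomial p.m.f. for a list of counts and a matching list of probabilities. The function name `multinomial_pmf` is an illustrative choice, not something defined in the slides.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """n! / (n1! n2! ...) * p1^n1 * p2^n2 * ... for counts summing to n."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)
    return coef * prod(p ** c for p, c in zip(probs, counts))

# Order: red, green, blue. Four draws with replacement, 2 green and 2 blue.
prob = multinomial_pmf([0, 2, 2], [0.2, 0.3, 0.5])
print(round(prob, 3))  # 0.135
```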
Problem
1. a) Does a distribution exist for which the m.g.f. is $M_X(t) = \frac{t}{1-t}$? If yes, find it. If no, prove it.
b) Does a distribution exist for which the m.g.f. is $M_X(t) = e^{t}$? If yes, find it. If no, prove it.
Problem
2. An appliance store receives a shipment of 30 microwave ovens, 5 of which are (unknown to the manager) defective. The store manager selects 4 ovens at random, without replacement, and tests to see if they are defective. Let X=number of defectives found. Calculate the pmf and cdf of X.
Problem
3. Let X denote the number of “do loops” in a Fortran program and Y the number of runs needed for a novice to debug the program. Assume that the joint density for (X,Y) is given in the following table.
x/y   1      2      3      4
0     0.059  0.1    0.05   0.001
1     0.093  0.12   0.082  0.003
2     0.065  0.102  0.1    0.01
3     0.05   0.075  0.07   0.02
a) Find the probability that a randomly selected program contains at most one “do loop” and requires at least two runs to debug the program.
b) Find E[XY].
c) Find the marginal densities for X and Y. Find the mean and variance for both X and Y.
d) Find the probability that a randomly selected program requires at least two runs to debug given that it contains exactly one “do loop”.
e) Find Cov(X,Y). Find the correlation between X and Y. Based on the observed value of the correlation, can you claim that X and Y are not independent? Why?
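Hand computations for this problem can be cross-checked with a short script. The sketch below stores the joint table as a dictionary of rows and computes the marginals, E[XY], the covariance, and the probabilities asked for in parts (a) and (d). It assumes the last entry of the x = 3 row is 0.02, which is the value that makes the table sum to 1; the variable names are illustrative.

```python
# Joint p.m.f.: rows are x = 0..3 (number of do loops), columns are y = 1..4 (runs).
# Assumption: the (x=3, y=4) entry is 0.02 so that all probabilities sum to 1.
joint = {
    0: [0.059, 0.100, 0.050, 0.001],
    1: [0.093, 0.120, 0.082, 0.003],
    2: [0.065, 0.102, 0.100, 0.010],
    3: [0.050, 0.075, 0.070, 0.020],
}
ys = [1, 2, 3, 4]

total = sum(sum(row) for row in joint.values())
px = {x: sum(row) for x, row in joint.items()}                        # marginal of X
py = {y: sum(joint[x][j] for x in joint) for j, y in enumerate(ys)}   # marginal of Y

ex = sum(x * p for x, p in px.items())
ey = sum(y * p for y, p in py.items())
exy = sum(x * y * joint[x][j] for x in joint for j, y in enumerate(ys))
cov = exy - ex * ey

# (a) P(X <= 1, Y >= 2) and (d) P(Y >= 2 | X = 1)
p_a = sum(joint[x][j] for x in (0, 1) for j, y in enumerate(ys) if y >= 2)
p_d = sum(joint[1][j] for j, y in enumerate(ys) if y >= 2) / px[1]

print(total, ex, ey, exy, cov)
print(p_a, p_d)
```

A nonzero covariance is enough to rule out independence, since independent variables always have Cov(X,Y)=0; the converse does not hold.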