Ang and Tang: Probability Concepts in Engineering (2nd Ed, 2004)
Chapter 3: Analytical Models of Random Phenomena
Cheng-Liang Chen
PSE LABORATORY
Department of Chemical Engineering
National TAIWAN University
Chen CL 1
Random Events and Random Variables
➢ In engineering and the physical sciences, many random phenomena of interest are associated with the numerical outcomes of some physical quantities.
☞ The number of bulldozers that remain operating after 6 months,
☞ The time required to complete a project,
☞ The flood level (in meters) of a river above mean flow level.
➢ Sometimes, the possible outcomes are not in numerical terms,
☞ Failure or survival of a chain,
☞ Incompletion or completion of a project,
☞ Closing and opening of highway routes.
➢ These events may also be identified in numerical terms by artificially assigning numerical values to each of the possible outcomes,
☞ Assigning a numerical value of 1 to survival of a chain
☞ Assigning a numerical value of 1 to completion of a project
☞ Assigning a numerical value of 1 to opening of a highway route
➢ Therefore, the possible outcomes of a random phenomenon can be represented by numerical values, either naturally or assigned artificially.
➢ In any case, an outcome or event may then be identified through the value or range of values of a function, which is called a random variable X.
☞ If the values of X represent floods above the mean flow level, then X > 2 m stands for the occurrence of a flood higher than 2 m;
☞ If X represents the possible states of a chain (failure or survival), then X = 0 means failure of the chain.
➢ A random variable is a mathematical device for identifying events in numerical terms. In terms of the random variable X, we can speak of an event as (X = a), or (X > a), or (X ≤ a), or (a < X ≤ b).
➢ A random variable may be considered as a mathematical function or rule that maps (or transforms) events in a sample space into the number system (i.e., the real line).
➢ The advantages and purpose of identifying events in numerical terms:
☞ To conveniently represent events analytically,
☞ To graphically display events and their respective probabilities.
For example:
☞ Mutually exclusive events are mapped into nonoverlapping intervals on the real line, whereas
☞ Intersecting events are represented by the respective overlapping intervals on the real line.
➢ In Fig. 3.1, the events E1 and E2 are mapped into the real line through the random variable X, and thus can be identified, respectively, as indicated below (a < c < b < d):
$$E_1 = (a < X \le b), \qquad E_2 = (c < X \le d)$$
$$E_1 E_2 = (c < X \le b), \qquad E_1 \cup E_2 = (a < X \le d), \qquad \overline{E_1 \cup E_2} = (X \le a) \cup (X > d)$$
Probability Distribution of a Random Variable
As the values or ranges of values of a random variable represent events, the numerical values of the random variable are associated with specific probabilities or probability measures. These probability measures may be assigned according to prescribed rules that are called probability distributions or probability laws.
➢ If X is a random variable, its probability distribution can always be described by its cumulative distribution function (CDF),
$$F_X(x) = P(X \le x) \quad \text{for all } x$$
☞ X is a discrete RV if only discrete values of x have positive probabilities.
☞ X is a continuous RV if probability measures are defined for all values of x.
➢ Probability distribution for a discrete random variable X:
Probability mass function (PMF), $p_X(x)$:
$$p_X(x) \equiv P(X = x)$$
Cumulative distribution function (CDF):
$$F_X(x) = \sum_{\forall x_i \le x} P(X = x_i) = \sum_{\forall x_i \le x} p_X(x_i)$$
➢ Probability distribution for a continuous random variable X:
Probability density function (PDF), $f_X(x)$:
$$P(a < X \le b) \equiv \int_a^b f_X(x)\,dx$$
Cumulative distribution function (CDF):
$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(\tau)\,d\tau$$
$$f_X(x) = \frac{dF_X(x)}{dx}, \qquad f_X(x)\,dx = P(x < X \le x + dx)$$
➢ Any function used to represent the probability distribution of a random variable must necessarily satisfy the axioms of probability theory. If $F_X(x)$ is the CDF of X, then it must satisfy the following conditions:
(i) $F_X(-\infty) = 0$ and $F_X(\infty) = 1.0$.
(ii) $F_X(x) \ge 0$ for all values of x, and is nondecreasing with x.
(iii) $F_X(x)$ is continuous to the right with x.
➢ Some observations:
$$P(a < X \le b) = \int_{-\infty}^{b} f_X(x)\,dx - \int_{-\infty}^{a} f_X(x)\,dx$$
$$P(a < X \le b) = \sum_{\forall x_i \le b} p_X(x_i) - \sum_{\forall x_i \le a} p_X(x_i)$$
$$P(a < X \le b) = F_X(b) - F_X(a)$$
Ex: Mapping Events into Real Line
Consider Example 2.1 again, which involves a discrete random variable.
Using X as the random variable whose values represent the number of operating bulldozers after 6 months, the events of interest are mapped into the real line as shown in Fig. E3.1a. Thus, (X = 0), (X = 1), (X = 2), and (X = 3) now represent the corresponding events of interest.
Assuming again that each of the three bulldozers is equally likely to be operating or nonoperating after 6 months, i.e., the probability of operating is 0.5, and that the conditions between bulldozers are statistically independent, the PMF and CDF of X are shown in Figs. E3.1b and E3.1c.
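Since the figures are not reproduced in this transcript, the PMF and CDF of the bulldozer example can be recovered by enumerating the eight equally likely outcomes. The following is a minimal Python sketch under the stated assumptions (three independent machines, probability of operating 0.5); the function names are illustrative and do not come from the slides.

```python
from fractions import Fraction
from itertools import product

def operating_pmf(n=3, p=Fraction(1, 2)):
    """PMF of the number of operating machines, by enumerating all outcomes."""
    pmf = {k: Fraction(0) for k in range(n + 1)}
    for outcome in product((0, 1), repeat=n):  # 1 = operating, 0 = not operating
        k = sum(outcome)
        pmf[k] += p**k * (1 - p) ** (n - k)    # independence: multiply per machine
    return pmf

def cdf_from_pmf(pmf):
    """Running sum of the PMF: F_X(x) = sum of p_X(x_i) for x_i <= x."""
    total, cdf = Fraction(0), {}
    for k in sorted(pmf):
        total += pmf[k]
        cdf[k] = total
    return cdf

pmf = operating_pmf()
cdf = cdf_from_pmf(pmf)
for k in sorted(pmf):
    print(f"x = {k}: p_X = {pmf[k]}, F_X = {cdf[k]}")
```

Exact fractions are used so that the values can be compared directly with the 1/8 and 3/8 weights that appear in the later mean and variance calculation.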
Ex: Load on A Beam
For a continuous random variable, consider the 100-kg load in Example 2.5. If the load is equally likely to be placed anywhere along the span of the beam of 10 m, then the load position X is uniformly distributed in 0 < x ≤ 10, with PDF and CDF
$$f_X(x) = \begin{cases} c = \dfrac{1}{10} & \text{for } 0 < x \le 10 \\ 0 & \text{otherwise} \end{cases}$$
$$F_X(x) = \begin{cases} 0 & \text{for } x \le 0 \\ \displaystyle\int_0^x c\,dx = \dfrac{x}{10} & \text{for } 0 < x \le 10 \\ 1 & \text{for } x > 10 \end{cases}$$
$$P(2 < X \le 5) = \int_2^5 \frac{1}{10}\,dx = 0.30$$
$$P(2 < X \le 5) = F_X(5) - F_X(2) = \frac{5}{10} - \frac{2}{10} = 0.30$$
Ex: Useful Life of Welding Machines
The useful life, T (in hours), of welding machines is not completely predictable, but may be described by the exponential distribution, with the following PDF and CDF (λ is a constant):
$$f_T(t) = \begin{cases} \lambda e^{-\lambda t} & \text{for } t \ge 0 \\ 0 & \text{for } t < 0 \end{cases}$$
$$F_T(t) = \begin{cases} \displaystyle\int_0^t \lambda e^{-\lambda \tau}\,d\tau = 1 - e^{-\lambda t} & \text{for } t \ge 0 \\ 0 & \text{for } t < 0 \end{cases}$$
Main Descriptors of a Random Variable
Central Values
The main descriptors contain information on the properties of the random variable that are of first importance in many practical applications.
➢ Mean Value or Expected Value, E(X), of a random variable X
$$\mu_X = E(X) = \begin{cases} \displaystyle\sum_{\forall x_i} x_i\, p_X(x_i) & \text{if } X \text{ is a discrete RV} \\ \displaystyle\int_{-\infty}^{\infty} x f_X(x)\,dx & \text{if } X \text{ is a continuous RV} \end{cases}$$
➢ The median, $x_m$, of a random variable X is the value at which the CDF is 50% ($F_X(x_m) = 0.50$), and thus larger and smaller values are equally probable.
➢ The mode, $\tilde{x}$, is the most probable value of a random variable X; i.e., it is the value of the random variable with the largest probability or the highest probability density.
Mathematical Expectation
➢ Given a function g(X), its expected value E[g(X)] can be obtained as a generalization of the expression for the mean value E(X).
➢ E[g(X)] is known as the mathematical expectation of g(X) and is the weighted average of the function g(X).
$$E[g(X)] = \begin{cases} \displaystyle\sum_{\forall x_i} g(x_i)\, p_X(x_i) & \text{if } X \text{ is a discrete RV} \\ \displaystyle\int_{-\infty}^{\infty} g(x) f_X(x)\,dx & \text{if } X \text{ is a continuous RV} \end{cases}$$
Measures of Dispersion
➢ A measure of dispersion indicates how widely or narrowly the values of a random variable are dispersed.
➢ Of special interest is a quantity that gives a measure of how closely or widely the values of the variate are clustered around a central value. Taking $g(X) = (X - \mu_X)^2$ in the mathematical expectation gives the variance:
$$\mathrm{Var}(X) = \begin{cases} \displaystyle\sum_{\forall x_i} (x_i - \mu_X)^2\, p_X(x_i) & \text{if } X \text{ is a discrete RV} \\ \displaystyle\int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx & \text{if } X \text{ is a continuous RV} \end{cases}$$
$$\begin{aligned}
\mathrm{Var}(X) &= \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx \\
&= \int_{-\infty}^{\infty} (x^2 - 2\mu_X x + \mu_X^2) f_X(x)\,dx \\
&= E(X^2) - 2\mu_X E(X) + \mu_X^2 \\
&= E(X^2) - \mu_X^2
\end{aligned}$$
A more convenient measure of dispersion is the square root of the variance (the standard deviation), $\sigma_X$:
$$\sigma_X = \sqrt{\mathrm{Var}(X)}$$
The coefficient of variation (c.o.v.) is a nondimensional measure of dispersion relative to the central value:
$$\delta_X = \frac{\sigma_X}{\mu_X}$$
Ex: Operating Bulldozers
➢ The PMF of the number of operating bulldozers after 6 months is shown in Fig. E3.1b. On this basis, we obtain the expected number of operating bulldozers after 6 months as
$$\mu_X = E(X) = 0\left(\tfrac{1}{8}\right) + 1\left(\tfrac{3}{8}\right) + 2\left(\tfrac{3}{8}\right) + 3\left(\tfrac{1}{8}\right) = 1.50$$
As the random variable is discrete, the mean value of 1.5 is not necessarily a possible value; in this case, we may only conclude that the mean number of operating bulldozers is between 1 and 2 at the end of 6 months. The corresponding variance is
$$\mathrm{Var}(X) = (0 - 1.5)^2\left(\tfrac{1}{8}\right) + (1 - 1.5)^2\left(\tfrac{3}{8}\right) + (2 - 1.5)^2\left(\tfrac{3}{8}\right) + (3 - 1.5)^2\left(\tfrac{1}{8}\right)$$
or, equivalently,
$$\mathrm{Var}(X) = \left[0^2\left(\tfrac{1}{8}\right) + 1^2\left(\tfrac{3}{8}\right) + 2^2\left(\tfrac{3}{8}\right) + 3^2\left(\tfrac{1}{8}\right)\right] - (1.5)^2 = 0.75$$
$$\sigma_X = \sqrt{0.75} = 0.866, \qquad \delta_X = \frac{0.866}{1.50} = 0.577$$
which means that the degree of dispersion is over 50% of the mean value, a relatively large dispersion.
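The arithmetic of this example can be checked with a few lines of Python. This is only a sketch: the PMF values are the ones used above, and the helper name `describe` is an assumption introduced for illustration.

```python
import math

def describe(pmf):
    """Mean, variance, standard deviation and c.o.v. of a discrete RV."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    std = math.sqrt(var)
    return mean, var, std, std / mean

# PMF of the number of operating bulldozers (values from the example)
pmf = {0: 1 / 8, 1: 3 / 8, 2: 3 / 8, 3: 1 / 8}
mean, var, std, cov = describe(pmf)

# Shortcut form Var(X) = E(X^2) - mu^2 should give the same variance
var_shortcut = sum(x**2 * p for x, p in pmf.items()) - mean**2

print(f"mean={mean:.2f} var={var:.2f} std={std:.3f} cov={cov:.3f}")
```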
Ex: Useful Life Time
➢ The useful life, T, of welding machines is a random variable with an exponential probability distribution; the PDF and CDF are
$$f_T(t) = \lambda e^{-\lambda t} \quad \text{and} \quad F_T(t) = 1 - e^{-\lambda t}; \qquad t \ge 0$$
➢ The mean life of the welding machines is
$$\mu_T = E(T) = \int_0^{\infty} t\,\lambda e^{-\lambda t}\,dt = \frac{1}{\lambda}$$
The parameter λ of the exponential distribution is the reciprocal of the mean value, λ = 1/E(T). The mode is zero, whereas the median life $t_m$, variance, standard deviation, and coefficient of variation are
$$0.50 = \int_0^{t_m} \lambda e^{-\lambda t}\,dt = \left. -e^{-\lambda t} \right|_0^{t_m} = -e^{-\lambda t_m} + 1$$
$$t_m = \frac{-\ln(0.50)}{\lambda} = \frac{0.693}{\lambda} = 0.693\,\mu_T$$
$$\mathrm{Var}(T) = \int_0^{\infty} (t - 1/\lambda)^2\,\lambda e^{-\lambda t}\,dt = \frac{1}{\lambda^2}$$
$$\sigma_T = \frac{1}{\lambda} = \mu_T \;\Rightarrow\; \delta_T = 1.0$$
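These closed-form results can be checked numerically. The sketch below uses a simple midpoint rule from the standard library only; the value λ = 0.002 per hour is an assumed number chosen for illustration (the slides leave λ unspecified), and the integration is truncated at 40/λ, where the remaining tail is negligible.

```python
import math

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

lam = 0.002        # assumed rate (1/hr), for illustration only
upper = 40 / lam   # truncation point; exp(-40) is negligible

def pdf(t):
    return lam * math.exp(-lam * t)

mean_life = integrate(lambda t: t * pdf(t), 0.0, upper)
var_life = integrate(lambda t: (t - 1 / lam) ** 2 * pdf(t), 0.0, upper)
median_life = -math.log(0.50) / lam
cov_life = math.sqrt(var_life) / mean_life

print(mean_life, var_life, median_life, cov_life)
```

With this assumed λ, the mean should come out close to 1/λ = 500 hr, the median close to 0.693 times the mean, and the c.o.v. close to 1.0, in line with the derivation.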
Measures of Skewness
➢ A measure of asymmetry or skewness is the third central moment:
$$E(X - \mu_X)^3 = \begin{cases} \displaystyle\sum_{\forall x_i} (x_i - \mu_X)^3\, p_X(x_i) & \text{if } X \text{ is a discrete RV} \\ \displaystyle\int_{-\infty}^{\infty} (x - \mu_X)^3 f_X(x)\,dx & \text{if } X \text{ is a continuous RV} \end{cases}$$
$$\theta = \frac{E(X - \mu_X)^3}{\sigma_X^3} \qquad \text{(dimensionless skewness coefficient)}$$
➢ The third central moment is positive if the values of X above $\mu_X$ are more widely dispersed than the values below $\mu_X$, and negative in the opposite case.
Useful Probability Distributions
There are a number of both discrete and continuous distribution
functions that are especially useful because of one or more of the
following reasons:
➢ The function is the result of an underlying physical process and can
be derived on the basis of certain physically reasonable assumptions.
➢ The function is the result of some limiting process.
➢ It is widely known, and the necessary probability and statistical
information (including probability tables) are widely available.
Gaussian (Normal) Distribution
➢ Gaussian (Normal) Distribution, N(µ, σ)
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right], \qquad -\infty < x < \infty$$
➢ Standard Normal Distribution, N(0, 1)
$$f_S(s) = \frac{1}{\sqrt{2\pi}}\, e^{-(1/2)s^2}, \qquad -\infty < s < \infty$$
CDF: $\Phi(s) \equiv F_S(s)$; the value of s at cumulative probability p is $s_p = \Phi^{-1}(p)$.
By symmetry, $\Phi(-s) = 1 - \Phi(s)$, so that $s_p = -\Phi^{-1}(1 - p)$.
z    .00   .01   .02   .03   .04   .05   .06   .07   .08   .09
0.0  .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1  .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
0.2  .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
0.3  .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
0.4  .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879
0.5  .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
0.6  .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7518 .7549
0.7  .7580 .7612 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
0.8  .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
0.9  .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
1.0  .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
1.1  .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
1.2  .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
1.3  .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
1.4  .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319
1.5  .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6  .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7  .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8  .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9  .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
2.0  .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
2.1  .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
2.2  .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
2.3  .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
2.4  .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936
2.5  .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
2.6  .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7  .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8  .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
2.9  .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986
3.0  .9986 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990
3.1  .9990 .9991 .9991 .9991 .9992 .9992 .9992 .9992 .9993 .9993
3.2  .9993 .9993 .9994 .9994 .9994 .9994 .9994 .9995 .9995 .9995

3.3  .99952    3.5  .99977    3.7  .99989    3.9  .99995    4.5  1.0000
3.4  .99966    3.6  .99984    3.8  .99993    4.0  .99997
➢ The areas (or probabilities) covered within ±1, ±2, and ±3 standard deviations about the mean µ = 0 of the standard normal distribution are, respectively, equal to 68.3%, 95.4%, and 99.7%:
$$\Phi(1) - \Phi(-1) = \Phi(1) - [1 - \Phi(1)] = 0.683$$
$$\Phi(2) - \Phi(-2) = 0.954$$
$$\Phi(3) - \Phi(-3) = 0.997$$
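As a check on these coverage values, Φ(s) can be evaluated without a table through the error function, using the identity Φ(s) = ½[1 + erf(s/√2)]. The sketch below relies only on `math.erf` from the Python standard library; the function name `phi` is an illustrative choice.

```python
import math

def phi(s):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))

# Probability within +/- k standard deviations of the mean
coverage = {k: phi(k) - phi(-k) for k in (1, 2, 3)}

for k, prob in coverage.items():
    print(f"within +/-{k} sigma: {prob:.4f}")
```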
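➢ Evaluate probabilities of any normal distribution by using Φ(s):
$$P(a < X \le b) = \frac{1}{\sigma\sqrt{2\pi}} \int_a^b \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right] dx$$
With the substitution $s \equiv (x - \mu)/\sigma$, $dx = \sigma\,ds$:
$$\begin{aligned}
P(a < X \le b) &= \frac{1}{\sigma\sqrt{2\pi}} \int_{(a-\mu)/\sigma}^{(b-\mu)/\sigma} e^{-(1/2)s^2}\,\sigma\,ds \\
&= \frac{1}{\sqrt{2\pi}} \int_{(a-\mu)/\sigma}^{(b-\mu)/\sigma} e^{-(1/2)s^2}\,ds \\
&= \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right)
\end{aligned}$$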
Drainage from A Community
➢ The drainage from a community during a storm is a normal random variable estimated to have a mean of 1.2 million gallons per day (mgd) and a standard deviation of 0.4 mgd; i.e., N(1.2, 0.4) mgd. If the storm drain system is designed with a maximum drainage capacity of 1.5 mgd, what is the underlying probability of flooding during a storm that is assumed in the design of the drainage system?
Flooding in the community will occur when the drainage load exceeds the capacity of the drainage system; the probability of flooding is
$$\begin{aligned}
P(X > 1.5) &= 1 - P(X \le 1.5) = 1 - \Phi\left(\frac{1.5 - 1.2}{0.4}\right) \\
&= 1 - \Phi(0.75) = 1 - 0.7734 = 0.227
\end{aligned}$$
➢ The probability that the drainage during a storm will be between 1.0 mgd and 1.6 mgd is
$$\begin{aligned}
P(1.0 < X \le 1.6) &= \Phi\left(\frac{1.6 - 1.2}{0.4}\right) - \Phi\left(\frac{1.0 - 1.2}{0.4}\right) = \Phi(1.0) - \Phi(-0.5) \\
&= \Phi(1.0) - [1 - \Phi(0.5)] = 0.8413 - [1 - 0.6915] = 0.533
\end{aligned}$$
➢ The 90-percentile drainage load from the community during a storm. This is the value of the random variable at which the cumulative probability is 0.90:
$$P(X \le x_{0.90}) = \Phi\left(\frac{x_{0.90} - 1.2}{0.40}\right) = 0.90$$
$$\frac{x_{0.90} - 1.2}{0.40} = \Phi^{-1}(0.90) = 1.28 \quad \text{(Table A.1)}$$
$$x_{0.90} = 1.28(0.40) + 1.2 = 1.71 \text{ mgd}$$
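The three drainage results can be reproduced without table lookups. This is a sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative, and small differences from the slide values come from the table's rounding of Φ.

```python
from statistics import NormalDist

# Storm drainage modeled as N(1.2, 0.4) mgd, as in the example
drainage = NormalDist(mu=1.2, sigma=0.4)

p_flood = 1.0 - drainage.cdf(1.5)                  # P(X > 1.5)
p_between = drainage.cdf(1.6) - drainage.cdf(1.0)  # P(1.0 < X <= 1.6)
x90 = drainage.inv_cdf(0.90)                       # 90-percentile load

print(f"P(flooding)        = {p_flood:.3f}")
print(f"P(1.0 < X <= 1.6)  = {p_between:.3f}")
print(f"90-percentile load = {x90:.2f} mgd")
```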
Lognormal Distribution
➢ Logarithmic Normal (Lognormal) Distribution
If a random variable X has a lognormal distribution, its PDF is
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,(\zeta x)} \exp\left[-\frac{1}{2}\left(\frac{\ln(x) - \lambda}{\zeta}\right)^2\right], \qquad x \ge 0$$
where
$$\lambda = E(\ln(X)), \qquad \zeta = \sqrt{\mathrm{Var}(\ln(X))}$$
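➢ If X is lognormal with parameters λ and ζ, then ln(X) is normal with mean λ and standard deviation ζ; i.e., ln(X) ∼ N(λ, ζ).
$$P(a < X \le b) = \int_a^b \frac{1}{(\zeta x)\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln(x) - \lambda}{\zeta}\right)^2\right] dx$$
With the substitution $s = (\ln(x) - \lambda)/\zeta$, $dx = \zeta x\,ds$:
$$\begin{aligned}
P(a < X \le b) &= \frac{1}{\sqrt{2\pi}} \int_{(\ln(a)-\lambda)/\zeta}^{(\ln(b)-\lambda)/\zeta} e^{-(1/2)s^2}\,ds \\
&= \frac{1}{\sqrt{2\pi}} \int_0^{(\ln(b)-\lambda)/\zeta} e^{-(1/2)s^2}\,ds - \frac{1}{\sqrt{2\pi}} \int_0^{(\ln(a)-\lambda)/\zeta} e^{-(1/2)s^2}\,ds \\
&= \Phi\left(\frac{\ln(b) - \lambda}{\zeta}\right) - \Phi\left(\frac{\ln(a) - \lambda}{\zeta}\right)
\end{aligned}$$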
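$$\mu_X = E(X) = \int_0^{\infty} x\,\frac{1}{(\zeta x)\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln(x) - \lambda}{\zeta}\right)^2\right] dx$$
With the substitution $y = \ln(x)$, $dx = x\,dy = e^y\,dy$:
$$\begin{aligned}
\mu_X &= \frac{1}{\sqrt{2\pi}\,\zeta} \int_{-\infty}^{\infty} e^y \exp\left[-\frac{1}{2}\left(\frac{y - \lambda}{\zeta}\right)^2\right] dy \\
&= \frac{1}{\sqrt{2\pi}\,\zeta} \int_{-\infty}^{\infty} \exp\left[y - \frac{1}{2}\left(\frac{y - \lambda}{\zeta}\right)^2\right] dy \\
&= \underbrace{\left\{\frac{1}{\sqrt{2\pi}\,\zeta} \int_{-\infty}^{\infty} \exp\left[-\frac{1}{2}\left(\frac{y - (\lambda + \zeta^2)}{\zeta}\right)^2\right] dy\right\}}_{\text{area under } N(\lambda + \zeta^2,\ \zeta)\ =\ 1.0} \exp\left(\lambda + \frac{1}{2}\zeta^2\right) \\
&= \exp\left(\lambda + \frac{1}{2}\zeta^2\right)
\end{aligned}$$
$$\Rightarrow\; \lambda = \ln(\mu_X) - \frac{1}{2}\zeta^2$$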
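The step from the second to the third line of the derivation of E(X) is a completion of the square in the exponent. Written out (this intermediate identity is filled in here and is not shown on the slide):

$$y - \frac{(y - \lambda)^2}{2\zeta^2} = -\frac{\left[y - (\lambda + \zeta^2)\right]^2}{2\zeta^2} + \lambda + \frac{1}{2}\zeta^2$$

Expanding the square on the right-hand side gives $-\frac{(y-\lambda)^2}{2\zeta^2} + (y - \lambda) - \frac{1}{2}\zeta^2$, and adding $\lambda + \frac{1}{2}\zeta^2$ recovers the left-hand side. The constant term $\lambda + \frac{1}{2}\zeta^2$ is what leaves the integral as the factor $\exp(\lambda + \frac{1}{2}\zeta^2)$.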
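$$\begin{aligned}
E(X^2) &= \frac{1}{\sqrt{2\pi}\,\zeta} \int_{-\infty}^{\infty} e^{2y} \exp\left[-\frac{1}{2}\left(\frac{y - \lambda}{\zeta}\right)^2\right] dy \\
&= \frac{1}{\sqrt{2\pi}\,\zeta} \int_{-\infty}^{\infty} \exp\left[-\frac{1}{2\zeta^2}\left(y^2 - 2(\lambda + 2\zeta^2)\,y + \lambda^2\right)\right] dy \\
&= \underbrace{\left\{\frac{1}{\sqrt{2\pi}\,\zeta} \int_{-\infty}^{\infty} \exp\left[-\frac{1}{2}\left(\frac{y - (\lambda + 2\zeta^2)}{\zeta}\right)^2\right] dy\right\}}_{\text{area under } N(\lambda + 2\zeta^2,\ \zeta)\ =\ 1.0} \exp\left(2(\lambda + \zeta^2)\right) \\
&= \exp\left(2(\lambda + \zeta^2)\right) = \exp\left(2\left(\lambda + \frac{1}{2}\zeta^2\right) + \zeta^2\right) = \mu_X^2\, e^{\zeta^2}
\end{aligned}$$
$$\mathrm{Var}(X) = E(X^2) - \mu_X^2 = \mu_X^2\left(e^{\zeta^2} - 1\right)$$
$$\zeta^2 = \ln\left[1 + \left(\frac{\sigma_X}{\mu_X}\right)^2\right] = \ln\left(1 + \delta_X^2\right) \approx \delta_X^2 \qquad (\zeta \approx \delta_X \text{ if } \delta_X \le 0.3)$$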
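The median, instead of the mean, is often used to designate the central value of a lognormal random variable:
$$0.5 = P(X \le x_m) = \Phi\left(\frac{\ln(x_m) - \lambda}{\zeta}\right)$$
$$\frac{\ln(x_m) - \lambda}{\zeta} = \Phi^{-1}(0.50) = 0 \;\Rightarrow\; \lambda = \ln(x_m), \quad x_m = e^{\lambda}$$
Combining this with the expression for the mean,
$$\mu_X = \exp\left(\lambda + \frac{1}{2}\zeta^2\right) = \underbrace{\exp(\lambda)}_{x_m} \exp\left(\frac{1}{2}\ln\left(1 + \delta_X^2\right)\right)$$
$$\Rightarrow\; \mu_X = x_m\sqrt{1 + \delta_X^2} > x_m$$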
Drainage from A Community
In Example 3.9, if the distribution of storm drainage from the community is a lognormal random variable instead of normal, with the same mean and standard deviation, the probability of flooding during a storm would be evaluated as follows.
First, we obtain the parameters λ and ζ of the lognormal distribution:
$$\zeta^2 = \ln\left[1 + \left(\frac{0.4}{1.2}\right)^2\right] = \ln(1.111) = 0.105, \qquad \zeta = 0.324$$
$$\lambda = \ln(1.20) - \frac{1}{2}(0.324)^2 = 0.130$$
$$\begin{aligned}
P(X > 1.50) &= 1 - P(X \le 1.50) = 1 - \Phi\left(\frac{\ln(1.5) - 0.130}{0.324}\right) \\
&= 1 - \Phi(0.85) = 1 - 0.8023 = 0.198
\end{aligned}$$
which may be compared with the probability of 0.227 from Example 3.9, illustrating the fact that the result depends on the underlying distribution of the random variable.
Also, with the lognormal distribution, we obtain the probability that the drainage will be between 1.0 mgd and 1.6 mgd:
$$\begin{aligned}
P(1.0 < X \le 1.6) &= \Phi\left(\frac{\ln(1.6) - 0.130}{0.324}\right) - \Phi\left(\frac{\ln(1.0) - 0.130}{0.324}\right) = \Phi(1.049) - \Phi(-0.401) \\
&= \Phi(1.049) - [1 - \Phi(0.401)] = 0.8531 - [1 - 0.6554] = 0.509
\end{aligned}$$
(compared with 0.533 for the normal distribution).
The 90% value of the drainage load from the community:
$$P(X \le x_{0.90}) = \Phi\left(\frac{\ln(x_{0.90}) - 0.130}{0.324}\right) = 0.90$$
$$\frac{\ln(x_{0.90}) - 0.130}{0.324} = \Phi^{-1}(0.90) = 1.28 \quad \text{(Table A.1)}$$
$$x_{0.90} = e^{(0.324)(1.28) + 0.130} = 1.72 \text{ mgd}$$
(compared with 1.71 mgd for the normal distribution).
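The lognormal version of the drainage example can be sketched in Python by working with ln(X), which is normal with mean λ and standard deviation ζ. The helper `lognormal_params` is a hypothetical name introduced here; it applies the relations ζ² = ln(1 + δ²) and λ = ln(µ) − ½ζ² derived earlier. Because the code carries full precision while the slides round ζ, λ, and the table values, the last digit of some results may differ slightly.

```python
import math
from statistics import NormalDist

def lognormal_params(mean, std):
    """Return (lambda, zeta) of a lognormal RV from its mean and std."""
    cov = std / mean
    zeta = math.sqrt(math.log(1.0 + cov**2))
    lam = math.log(mean) - 0.5 * zeta**2
    return lam, zeta

lam, zeta = lognormal_params(mean=1.2, std=0.4)
ln_x = NormalDist(mu=lam, sigma=zeta)  # distribution of ln(X)

p_flood = 1.0 - ln_x.cdf(math.log(1.5))
p_between = ln_x.cdf(math.log(1.6)) - ln_x.cdf(math.log(1.0))
x90 = math.exp(ln_x.inv_cdf(0.90))

print(f"lambda={lam:.3f} zeta={zeta:.3f}")
print(f"P(X > 1.5)={p_flood:.3f}  P(1.0 < X <= 1.6)={p_between:.3f}  x90={x90:.2f}")
```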
Bernoulli Sequence and Binomial Distribution
➢ In many engineering applications, there are often problems involving the occurrence or recurrence of an event, which is unpredictable, in a sequence of discrete trials. For example,
☞ In allocating a fleet of construction equipment for a project, the anticipated conditions of every piece of equipment in the fleet over the duration of the project would have some bearing on the determination of the required fleet size.
☞ In planning the flood control system for a river basin, the annual maximum flow of the river over a sequence of years would be important in determining the design flood level.
In these cases, the operational conditions of every piece of equipment, and the annual maximum flow of the river relative to a specified flood level, constitute the respective trials.
➢ In these problems, there are only two possible outcomes in each trial: occurrence and nonoccurrence of an event.
☞ Each piece of equipment may or may not malfunction over the duration of the project;
☞ In each year, the maximum flow of the river may or may not exceed some specified flood level.
➢ Problems of the type described above may be modeled by a Bernoulli sequence, which is based on the following assumptions:
1. In each trial, there are only two possibilities: the occurrence and nonoccurrence of an event.
2. The probability of occurrence of the event in each trial is constant.
3. The trials are statistically independent.
➢ In the two examples introduced above, we may model each of the problems as a Bernoulli sequence as follows:
☞ If, over the duration of the project, the operational conditions between equipment are statistically independent, and the probability of malfunction for every piece of equipment is the same, then the conditions of the entire fleet of equipment constitute a Bernoulli sequence.
☞ If the annual maximum floods between any 2 years are statistically independent and in each year the probability of the flood's exceeding some specified level is constant, then the annual maximum floods over a series of years can be modeled as a Bernoulli sequence.
➢ In a Bernoulli sequence, if X is the random number of occurrences of an event among n trials, in which the probability of occurrence of the event in each trial is p and the corresponding probability of nonoccurrence is (1 − p), then the probability of exactly x occurrences among the n trials is governed by the binomial PMF,
$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n-x} \equiv b(x; n, p) = \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n-x}, \qquad x = 0, 1, \ldots, n$$
$$P(X \le x) = \sum_{k=0}^{x} \binom{n}{k} p^k (1 - p)^{n-k} \equiv B(x; n, p)$$
Cumulative Values for the Binomial Probability Distribution
$$B(x; n, p) = P[X \le x] = \sum_{k=0}^{x} b(k; n, p)$$
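Entries of a cumulative binomial table can be regenerated directly from the two definitions above. The following is a minimal sketch using `math.comb` (Python 3.8 or later); the function names and the helper `cdf_row`, which evaluates one table row over the tabulated values of p, are illustrative assumptions.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x occurrences in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """B(x; n, p): probability of at most x occurrences in n trials."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

def cdf_row(x, n, probs=(0.01, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50)):
    """One row of the cumulative table, rounded to four decimals."""
    return [round(binom_cdf(x, n, p), 4) for p in probs]

row = cdf_row(x=2, n=5)
print(row)
```

Direct summation is adequate for the table sizes shown here (n up to 100).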
         x   p=.01   p=.05   p=.10   p=.20   p=.30   p=.40   p=.50
n=1      0  0.9900  0.9500  0.9000  0.8000  0.7000  0.6000  0.5000
         1  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
n=2      0  0.9801  0.9025  0.8100  0.6400  0.4900  0.3600  0.2500
         1  0.9999  0.9975  0.9900  0.9600  0.9100  0.8400  0.7500
         2  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
n=3      0  0.9703  0.8574  0.7290  0.5120  0.3430  0.2160  0.1250
         1  0.9997  0.9927  0.9720  0.8960  0.7840  0.6480  0.5000
         2  1.0000  0.9999  0.9990  0.9920  0.9730  0.9360  0.8750
         3  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
n=4      0  0.9606  0.8145  0.6561  0.4096  0.2401  0.1296  0.0625
         1  0.9994  0.9860  0.9477  0.8192  0.6517  0.4752  0.3125
         2  1.0000  0.9995  0.9963  0.9728  0.9163  0.8208  0.6875
         3  1.0000  1.0000  0.9999  0.9984  0.9919  0.9744  0.9375
         4  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
n=5      0  0.9510  0.7738  0.5905  0.3277  0.1681  0.0778  0.0313
         1  0.9990  0.9774  0.9185  0.7373  0.5282  0.3370  0.1875
         2  1.0000  0.9988  0.9914  0.9421  0.8369  0.6826  0.5000
         3  1.0000  1.0000  0.9995  0.9933  0.9692  0.9130  0.8125
         4  1.0000  1.0000  1.0000  0.9997  0.9976  0.9898  0.9688
         5                          1.0000  1.0000  1.0000  1.0000
n=10     0  0.9044  0.5987  0.3487  0.1074  0.0282  0.0060  0.0010
         1  0.9957  0.9139  0.7361  0.3758  0.1493  0.0464  0.0107
         2  0.9999  0.9885  0.9298  0.6778  0.3828  0.1673  0.0547
         3  1.0000  0.9990  0.9872  0.8791  0.6496  0.3823  0.1719
         4  1.0000  0.9999  0.9984  0.9672  0.8497  0.6331  0.3770
         5  1.0000  1.0000  0.9999  0.9936  0.9526  0.8338  0.6230
         6  1.0000  1.0000  1.0000  0.9991  0.9894  0.9452  0.8281
         7                          0.9999  0.9999  0.9877  0.9453
         8                          1.0000  1.0000  0.9983  0.9893
         9                                          0.9999  0.9990
        10                                          1.0000  1.0000
         x   p=.01   p=.05   p=.10   p=.20   p=.30   p=.40   p=.50
n=20     0  0.8179  0.3585  0.1216  0.0115  0.0008  0.0000  0.0000
         1  0.9831  0.7358  0.3917  0.0692  0.0076  0.0005  0.0000
         2  0.9990  0.9245  0.6769  0.2061  0.0355  0.0036  0.0002
         3  1.0000  0.9841  0.8670  0.4114  0.1071  0.0160  0.0013
         4  1.0000  0.9974  0.9568  0.6296  0.2375  0.0510  0.0059
         5  1.0000  0.9997  0.9887  0.8042  0.4164  0.1256  0.0207
         6  1.0000  1.0000  0.9976  0.9133  0.6080  0.2500  0.0577
         7  1.0000  1.0000  0.9996  0.9679  0.7723  0.4159  0.1316
         8  1.0000  1.0000  0.9999  0.9900  0.8867  0.5956  0.2517
         9  1.0000  1.0000  1.0000  0.9974  0.9520  0.7553  0.4119
        10                          0.9994  0.9829  0.8725  0.5881
        11                          0.9999  0.9949  0.9435  0.7483
        12                          1.0000  0.9987  0.9790  0.8684
        13                                  0.9997  0.9935  0.9423
        14                                  1.0000  0.9984  0.9793
        15                                          0.9997  0.9941
        16                                          1.0000  0.9987
        17                                                  0.9998
        18                                                  1.0000
n=50     0  0.6050  0.0769  0.0052  0.0000  0.0000  0.0000  0.0000
         1  0.9106  0.2794  0.0338  0.0002  0.0000  0.0000  0.0000
         2  0.9862  0.5405  0.1117  0.0013  0.0000  0.0000  0.0000
         3  0.9984  0.7604  0.2503  0.0057  0.0000  0.0000  0.0000
         4  0.9999  0.8964  0.4312  0.0185  0.0002  0.0000  0.0000
         5  1.0000  0.9622  0.6161  0.0480  0.0007  0.0000  0.0000
         6  1.0000  0.9882  0.7702  0.1034  0.0025  0.0000  0.0000
         7  1.0000  0.9968  0.8779  0.1904  0.0073  0.0001  0.0000
         8  1.0000  0.9992  0.9421  0.3073  0.0183  0.0002  0.0000
         9  1.0000  0.9998  0.9755  0.4437  0.0402  0.0008  0.0000
        10  1.0000  1.0000  0.9906  0.5836  0.0789  0.0022  0.0000
         x   p=.01   p=.05   p=.10   p=.20   p=.30   p=.40   p=.50
n=50    11  1.0000  1.0000  0.9968  0.7107  0.1390  0.0057  0.0000
        12  1.0000  1.0000  0.9990  0.8139  0.2229  0.0133  0.0002
        13  1.0000  1.0000  0.9997  0.8894  0.3279  0.0280  0.0005
        14  1.0000  1.0000  0.9999  0.9393  0.4468  0.0540  0.0013
        15  1.0000  1.0000  1.0000  0.9692  0.5692  0.0955  0.0033
        16                          0.9856  0.6839  0.1561  0.0077
        17                          0.9937  0.7822  0.2369  0.0164
        18                          0.9975  0.8594  0.3356  0.0325
        19                          0.9991  0.9152  0.4465  0.0595
        20                          0.9997  0.9522  0.5610  0.1013
        21                          0.9999  0.9749  0.6701  0.1611
        22                          1.0000  0.9877  0.7660  0.2399
        23                                  0.9944  0.8438  0.3359
        24                                  0.9976  0.9022  0.4439
        25                                  0.9991  0.9427  0.5561
        26                                  0.9997  0.9686  0.6641
        27                                  0.9999  0.9840  0.7601
        28                                  1.0000  0.9924  0.8389
        29                                          0.9966  0.8987
        30                                          0.9986  0.9405
        31                                          0.9995  0.9675
        32                                          0.9998  0.9836
        33                                          0.9999  0.9923
        34                                          1.0000  0.9967
        35                                                  0.9987
        36                                                  0.9995
        37                                                  0.9998
        38                                                  1.0000
         x   p=.01   p=.05   p=.10   p=.20   p=.30   p=.40   p=.50
n=100    0  0.3660  0.0059  0.0000  0.0000  0.0000  0.0000  0.0000
         1  0.7358  0.0371  0.0003  0.0000  0.0000  0.0000  0.0000
         2  0.9206  0.1183  0.0019  0.0000  0.0000  0.0000  0.0000
         3  0.9816  0.2578  0.0078  0.0000  0.0000  0.0000  0.0000
         4  0.9966  0.4360  0.0237  0.0000  0.0000  0.0000  0.0000
         5  0.9995  0.6160  0.0576  0.0000  0.0000  0.0000  0.0000
         6  0.9999  0.7660  0.1172  0.0001  0.0000  0.0000  0.0000
         7  1.0000  0.8720  0.2061  0.0003  0.0000  0.0000  0.0000
         8  1.0000  0.9369  0.3209  0.0009  0.0000  0.0000  0.0000
         9  1.0000  0.9718  0.4513  0.0023  0.0000  0.0000  0.0000
        10  1.0000  0.9885  0.5832  0.0057  0.0000  0.0000  0.0000
        11  1.0000  0.9957  0.7030  0.0126  0.0000  0.0000  0.0000
        12  1.0000  0.9985  0.8018  0.0253  0.0000  0.0000  0.0000
        13  1.0000  0.9995  0.8761  0.0469  0.0001  0.0000  0.0000
        14  1.0000  0.9999  0.9274  0.0804  0.0002  0.0000  0.0000
        15  1.0000  1.0000  0.9601  0.1285  0.0004  0.0000  0.0000
        16  1.0000  1.0000  0.9794  0.1923  0.0010  0.0000  0.0000
        17  1.0000  1.0000  0.9900  0.2712  0.0022  0.0000  0.0000
        18  1.0000  1.0000  0.9954  0.3621  0.0045  0.0000  0.0000
        19  1.0000  1.0000  0.9980  0.4602  0.0089  0.0000  0.0000
        20  1.0000  1.0000  0.9992  0.5595  0.0165  0.0000  0.0000
        21  1.0000  1.0000  0.9997  0.6540  0.0288  0.0000  0.0000
        22  1.0000  1.0000  0.9999  0.7389  0.0479  0.0001  0.0000
        23  1.0000  1.0000  1.0000  0.8109  0.0755  0.0003  0.0000
        24                          0.8686  0.1136  0.0006  0.0000
        25                          0.9125  0.1631  0.0012  0.0000
         x   p=.01   p=.05   p=.10   p=.20   p=.30   p=.40   p=.50
n=100   26                          0.9442  0.2244  0.0024  0.0000
        27                          0.9658  0.2964  0.0046  0.0000
        28                          0.9800  0.3768  0.0084  0.0000
        29                          0.9888  0.4623  0.0148  0.0000
        30                          0.9939  0.5491  0.0248  0.0000
        31                          0.9969  0.6331  0.0398  0.0001
        32                          0.9984  0.7107  0.0615  0.0002
        33                          0.9993  0.7793  0.0913  0.0004
        34                          0.9997  0.8371  0.1303  0.0009
        35                          0.9999  0.8839  0.1795  0.0018
        36                          0.9999  0.9201  0.2386  0.0033
        37                          1.0000  0.9470  0.3068  0.0060
        38                                  0.9660  0.3822  0.0105
        39                                  0.9790  0.4621  0.0176
        40                                  0.9875  0.5433  0.0284
        41                                  0.9928  0.6225  0.0443
        42                                  0.9960  0.6967  0.0666
        43                                  0.9979  0.7635  0.0967
        44                                  0.9989  0.8211  0.1356
        45                                  0.9995  0.8689  0.1841
        46                                  0.9997  0.9070  0.2421
        47                                  0.9999  0.9362  0.3086
        48                                  0.9999  0.9577  0.3822
        49                                  1.0000  0.9729  0.4602
        50                                          0.9832  0.5398
         x   p=.01   p=.05   p=.10   p=.20   p=.30   p=.40   p=.50
n=100   51                                          0.9900  0.6178
        52                                          0.9942  0.6914
        53                                          0.9968  0.7579
        54                                          0.9983  0.8159
        55                                          0.9991  0.8644
        56                                          0.9996  0.9033
        57                                          0.9998  0.9334
        58                                          0.9999  0.9557
        59                                          1.0000  0.9716
        60                                                  0.9824
        61                                                  0.9895
        62                                                  0.9940
        63                                                  0.9967
        64                                                  0.9982
        65                                                  0.9991
        66                                                  0.9996
        67                                                  0.9998
        68                                                  0.9999
        69                                                  1.0000
Road Graders of A Highway Project
➢ Five road graders are used in the construction of a highway project. The operational life T of each grader is a lognormal random variable with a mean life of $\mu_T = 1500$ hr and a c.o.v. of 30% ($\delta_T = \sigma_T/\mu_T = 0.3$; see Fig. E3.14).
➢ Assuming statistical independence among the conditions of the machines, the probability that two of the five machines will malfunction in less than 900 hr of operation can be evaluated. The parameters of the lognormal distribution are
$$\zeta^2 = \ln\left[1 + 0.3^2\right] = 0.086 \;\Rightarrow\; \zeta \approx 0.30; \qquad \lambda = \ln(1500) - \frac{1}{2}(0.3)^2 = 7.27$$
Then, the probability that a machine will malfunction within 900 hr is
$$p = P(T \le 900) = \Phi\left(\frac{\ln(900) - 7.27}{0.30}\right) = \Phi(-1.56) = 0.0594$$
➢ For the five machines taken collectively, the actual operational lives of the different machines may conceivably be as shown in Fig. E3.14; i.e., machines No. 1 and 4 have operational lives less than 900 hr, whereas machines No. 2, 3, and 5 have operational lives longer than 900 hr. The corresponding probability of this exact sequence is $p^2(1 - p)^3$. But the two malfunctioning machines may be any two of the five machines; therefore, the number of sequences with two malfunctioning machines among the five is $\frac{5!}{2!\,3!} = 10$. Consequently, if X is the number of road graders malfunctioning in 900 hr,
$$P(X = 2) = 10(0.0594)^2(1 - 0.0594)^3 = 0.0294$$
➢ The probability of malfunction among the five graders (i.e., there will be malfunctions in one or more machines) would be
$$P(X \ge 1) = 1 - P(X = 0) = 1 - (1 - 0.0594)^5 = 0.2638$$
➢ The probability that there will be no more than two machines malfunctioning within 900 hr is
$$\begin{aligned}
P(X \le 2) &= \sum_{k=0}^{2} \binom{5}{k} (0.0594)^k (1 - 0.0594)^{5-k} \\
&= (0.9406)^5 + 5(0.0594)(0.9406)^4 + 10(0.0594)^2(0.9406)^3 \\
&= 0.7362 + 0.2325 + 0.0294 = 0.9981
\end{aligned}$$
➢ This last result involves the CDF of the binomial distribution, which is tabulated in Table A.2 for limited values of the parameters. Using Table A.2 with n = 5, x = r = 2, and p = 0.05, we obtain a value of 0.9988 from this table.
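The binomial part of the road grader example can be reproduced as follows. This sketch takes the per-machine malfunction probability p = 0.0594 as given by the slide (it is not recomputed from the lognormal parameters, since the slide rounds ζ and λ before evaluating Φ); the function name is an illustrative choice.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x malfunctions among n independent machines."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n = 5        # number of road graders
p = 0.0594   # P(T <= 900 hr) per machine, value taken from the slide

p_exactly_two = binom_pmf(2, n, p)
p_at_least_one = 1.0 - binom_pmf(0, n, p)
p_at_most_two = sum(binom_pmf(k, n, p) for k in range(3))

print(f"P(X = 2)  = {p_exactly_two:.4f}")
print(f"P(X >= 1) = {p_at_least_one:.4f}")
print(f"P(X <= 2) = {p_at_most_two:.4f}")
```

The table value 0.9988 quoted above differs from the computed P(X ≤ 2) because the table is entered at p = 0.05 rather than 0.0594.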
➢ In modeling problems with the Bernoulli sequence, the individual trials must be discrete and statistically independent.
➢ Certain continuous problems may be modeled (approximately at least) with the Bernoulli sequence.
➢ For example, time and space problems, which are generally continuous, may be modeled with the Bernoulli sequence by discretizing time (or space) into appropriate intervals and admitting only two possibilities within each interval; what happens in each time (or space) interval then constitutes a trial, and the series of a finite number of intervals is then a Bernoulli sequence.
Rationing based on Annual Rainfall
➢ The annual rainfall (accumulated generally during the winter and spring) of each year in Orange County, California, is a Gaussian random variable with a mean of 15 in. and a standard deviation of 4 in.; i.e., N(15, 4).
➢ Suppose the current water policy of the county is such that if the annual rainfall is less than 7 in. for a given year, water rationing will be required during the summer and fall of that year.
➢ Assuming X is the annual rainfall, the probability of water rationing in Orange County in any given year is then
$$P(X < 7) = \Phi\left(\frac{7 - 15}{4}\right) = \Phi(-2.0) = 1 - \Phi(2.0) = 1 - 0.9772 = 0.0228$$
Chen CL 49
Useful Probability DistributionsBernoulli Sequence and Binomial Distribution
Rationing based on Annual Rainfall
➢ If the county wishes to reduce the probability of water rationing to half that of the current policy, the annual rainfall below which rationing has to be imposed would be determined as follows:
$$P(X < x_r) = \Phi\left(\frac{x_r - 15}{4}\right) = \frac{1}{2}(0.0228) = 0.0114$$
$$\frac{x_r - 15}{4} = \Phi^{-1}(0.0114) = -\Phi^{-1}(0.9886) = -2.28 \quad\Rightarrow\quad x_r = 15 + (4)(-2.28) = 5.88 \text{ in.}$$
➢ Under the current water policy, and assuming that the annual rainfalls between years are statistically independent, the probability that in the next 5 years there will be at least 1 year in which water rationing will be necessary would be determined as follows. Denoting N as the number of years when rationing would be imposed in the next 5 years, the probability would be
$$P(N \ge 1) = 1 - P(N = 0) = 1 - \binom{5}{0}(0.0228)^0(0.9772)^5 = 0.109$$
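The rainfall calculations can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names (`rain`, `p_ration`, `x_half`) are illustrative and not part of the original example, and the unrounded normal CDF gives slightly different last digits than the table lookup.

```python
from math import comb
from statistics import NormalDist

# Annual rainfall modeled as N(15, 4), in inches (values from the example).
rain = NormalDist(mu=15.0, sigma=4.0)

# Probability of rationing in a given year: P(X < 7).
p_ration = rain.cdf(7.0)

# Rainfall threshold that halves the rationing probability.
x_half = rain.inv_cdf(p_ration / 2)

# Probability of at least one rationing year in the next 5 years
# (binomial with independent years).
p_at_least_one = 1 - comb(5, 0) * p_ration**0 * (1 - p_ration) ** 5

print(f"P(rationing in a year)      = {p_ration:.4f}")
print(f"threshold for half the risk = {x_half:.2f} in.")
print(f"P(at least one in 5 years)  = {p_at_least_one:.3f}")
```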
➢ Whenever the annual rainfall is less than 7 in. in any given year, the probability of damage to the agricultural crops in the county is 30%. Assuming that crop damages between dry years (i.e., with rainfall less than 7 in.) are statistically independent, the probability of crop damage (denoted D) in the next 3 years may be of interest. In this case, the probability of crop damage would depend on the number of years (between 0 and 3) that the annual rainfall will be less than 7 in.; therefore, the solution requires the theorem of total probability, as follows:
$$\begin{aligned}
P(D) &= 1 - P(\bar{D}) \\
&= 1 - \left[1.00(0.9772)^3 + (0.70)^1(1.00)^2\binom{3}{1}(0.0228)(0.9772)^2 + (0.70)^2(1.00)^1\binom{3}{2}(0.0228)^2(0.9772) + (0.70)^3(1.00)^0\binom{3}{3}(0.0228)^3\right] \\
&= 1 - [0.9331 + 0.0457 + 0.0007 + 0] = 1 - 0.9795 = 0.020
\end{aligned}$$
The probability of crop damage in the next 3 years is only 2%.
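As a hedged check of the total-probability calculation, the sketch below sums over the number of dry years and compares the result with the equivalent per-year shortcut (each year causes damage with probability 0.0228 × 0.30). The names `p_dry` and `p_damage_given_dry` are illustrative.

```python
from math import comb

p_dry = 0.0228              # P(annual rainfall < 7 in.)
p_damage_given_dry = 0.30   # P(crop damage | dry year)
n_years = 3

# Theorem of total probability: condition on k dry years out of n_years.
# No damage requires every one of the k dry years to cause no damage.
p_no_damage = sum(
    comb(n_years, k) * p_dry**k * (1 - p_dry) ** (n_years - k)
    * (1 - p_damage_given_dry) ** k
    for k in range(n_years + 1)
)
p_damage = 1 - p_no_damage

# Equivalent shortcut: each year independently causes damage with
# probability p_dry * p_damage_given_dry.
p_damage_shortcut = 1 - (1 - p_dry * p_damage_given_dry) ** n_years

print(f"P(D) by total probability = {p_damage:.4f}")
print(f"P(D) by shortcut          = {p_damage_shortcut:.4f}")
```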
Useful Probability Distributions: Geometric Distribution
➢ In a Bernoulli sequence, the number of trials until a specified event occurs for the first time is governed by the geometric distribution. We might observe that if the event occurs for the first time on the nth trial, there must be no occurrence of this event in any of the prior (n − 1) trials.
➢ Geometric Distribution: If N is the random variable representing the number of trials until the occurrence of the event, then
$$P(N = n) = p\,q^{n-1}, \qquad n = 1, 2, \ldots; \quad (q = 1 - p)$$
Recurrence Time and Return Period
➢ In a time (or space) problem that is appropriately discretized into time (or space) intervals and can be modeled as a Bernoulli sequence, the number of time intervals until the first occurrence of an event, T = N, is called the first occurrence time.
➢ If the discretized time intervals in the sequence are statistically independent, the time until the first occurrence of an event must have the same distribution as the time between any two consecutive occurrences of the same event; thus the probability distribution of the recurrence time is equal to that of the first occurrence time.
➢ The recurrence time in a Bernoulli sequence is also governed by the geometric distribution.
➢ The mean recurrence time, which is popularly known in engineering as the (average) return period, is
$$\bar{T} = E(T) = \sum_{t=1}^{\infty} t \cdot p\,q^{t-1} = p(1 + 2q + 3q^2 + \cdots) = p\,\frac{1}{(1 - q)^2} = \frac{1}{p}$$
Recurrence Time and Return Period, Ex: Building Design
➢ Suppose that the building code for the design of buildings in a coastal region specifies the 50-yr wind as the “design wind.” That is, a wind velocity with a return period of 50 years; or on the average, the design wind may be expected to occur once every 50 yr.
➢ In this case, the probability of encountering the 50-yr wind velocity in any 1 yr is p = 1/50 = 0.02.
➢ The probability that a newly completed building in the region will be subjected to the design wind velocity for the first time on the fifth year after its completion is
$$P(T = 5) = (0.02)(0.98)^4 = 0.018$$
➢ The probability that the first such wind velocity will occur within 5 yr after completion of the building would be
$$P(T \le 5) = \sum_{t=1}^{5}(0.02)(0.98)^{t-1} = 0.02 + 0.0196 + 0.0192 + 0.0188 + 0.0184 = 0.096$$
➢ This latter event (the first occurrence of the wind velocity within 5 yr) is the same as the event of at least one 50-yr wind in 5 yr, which is also the complement of no 50-yr wind in 5 years. The desired probability may also be calculated as $1 - (0.98)^5 = 0.096$.
➢ The above is quite different from the event of experiencing exactly one 50-yr wind in 5 yr; the probability in this case is given by the binomial probability, which would be
$$\binom{5}{1}(0.02)(0.98)^4 = 0.092$$
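The three building-design probabilities, and the return period 1/p, can be reproduced with a few lines of Python. This is a sketch only; the helper name `geometric_pmf` is an assumption for illustration, and the mean is approximated by a truncated series.

```python
from math import comb

p = 0.02      # annual probability of the 50-yr wind (1 / return period)
q = 1 - p

def geometric_pmf(t: int, p: float) -> float:
    """P(T = t): the event occurs for the first time on trial t."""
    return p * (1 - p) ** (t - 1)

first_in_year_5 = geometric_pmf(5, p)
within_5_years = sum(geometric_pmf(t, p) for t in range(1, 6))
exactly_one_in_5 = comb(5, 1) * p * q**4

# Truncated series for E(T); the tail beyond 5000 terms is negligible here.
mean_recurrence = sum(t * geometric_pmf(t, p) for t in range(1, 5001))

print(f"P(T = 5)               = {first_in_year_5:.4f}")
print(f"P(T <= 5)              = {within_5_years:.4f}")
print(f"P(exactly one in 5 yr) = {exactly_one_in_5:.4f}")
print(f"E(T)                   = {mean_recurrence:.2f} yr")
```

Note that "first occurrence within 5 years" and "exactly one occurrence in 5 years" are different events, which is why the geometric sum and the binomial term do not agree.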
Recurrence Time and Return Period, Ex: Offshore Platform
➢ A fixed offshore platform is designed for a wave height of 8 m above the mean sea level. This wave height corresponds to a 5% probability of being exceeded per year. The return period of the design wave height is
$$\bar{T} = \frac{1}{0.05} = 20 \text{ yr}$$
➢ The probability that the platform will be subjected to the design wave height within the return period is
$$P(H > 8, \text{ in 20 yr}) = 1 - P(H \le 8, \text{ in 20 yr}) = 1 - (0.95)^{20} = 1 - 0.3585 = 0.6415$$
➢ The probability that the first exceedance of the design wave height will occur after the third year is, by the geometric distribution,
$$\begin{aligned}
P(T > 3) &= 1 - P(T \le 3) \\
&= 1 - [0.05(0.95)^{1-1} + 0.05(0.95)^{2-1} + 0.05(0.95)^{3-1}] \\
&= 1 - [0.05 + 0.0475 + 0.0451] = 1 - 0.1426 = 0.8574
\end{aligned}$$
➢ If the first exceedance of the design wave height should occur after the third year as stipulated above, the probability that such a first exceedance will occur in the fifth year is then
$$P(T = 5 \mid T > 3) = \frac{P(T = 5 \cap T > 3)}{P(T > 3)} = \frac{P(T = 5)}{P(T > 3)} = \frac{0.05(0.95)^4}{0.8574} = 0.0475$$
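The conditional result equals P(T = 2), which illustrates the memoryless property of the geometric distribution: given no exceedance in the first 3 years, the clock effectively restarts. A small sketch (function names are illustrative):

```python
p = 0.05  # annual probability of exceeding the design wave height

def geom_pmf(t: int) -> float:
    """P(T = t) for the geometric distribution."""
    return p * (1 - p) ** (t - 1)

def geom_sf(t: int) -> float:
    """P(T > t): no occurrence in the first t trials."""
    return (1 - p) ** t

p_after_3 = geom_sf(3)
p_5_given_after_3 = geom_pmf(5) / geom_sf(3)

print(f"P(T > 3)         = {p_after_3:.4f}")
print(f"P(T = 5 | T > 3) = {p_5_given_after_3:.4f}")
print(f"P(T = 2)         = {geom_pmf(2):.4f}")  # same value: memoryless
```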
Recurrence Time and Return Period
➢ The probability of no event occurring within its return period $\bar{T}$ is
$$\begin{aligned}
P(\text{no occurrence in } \bar{T}) &= (1 - p)^{\bar{T}} \\
&= 1 - \bar{T}p + \frac{\bar{T}(\bar{T} - 1)}{2!}p^2 - \frac{\bar{T}(\bar{T} - 1)(\bar{T} - 2)}{3!}p^3 + \cdots \\
&\approx e^{-\bar{T}p} = e^{-1} = 0.3679 \qquad (\bar{T} = 1/p, \text{ for large } \bar{T})
\end{aligned}$$
$$P(\text{occurrence in } \bar{T}) \approx 1 - 0.3679 = 0.6321$$
➢ For a rare event, defined as one with a long return period $\bar{T}$, the probability of the event occurring within its return period is therefore approximately 0.632.
➢ This result is a useful approximation even for return periods that are not very long; for instance, for $\bar{T} = 20$ time intervals, such as in Example 3.16, the probability is (p = 1/20; q = 1 − 1/20)
$$P(\text{occurrence in } \bar{T}) = 1 - \left(1 - \frac{1}{20}\right)^{20} = 1 - 0.3585 = 0.6415$$
which shows that the error in the above exponential approximation is less than 1.5% ((0.6415 − 0.6321)/0.6321 = 0.01487).
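To see how quickly the exact Bernoulli result approaches 1 − e^{-1}, the following sketch evaluates 1 − (1 − 1/T)^T for several return periods. The function name and the list of return periods are arbitrary choices for illustration.

```python
from math import exp

def prob_within_return_period(T: int) -> float:
    """Exact Bernoulli-sequence probability of at least one occurrence in T trials, p = 1/T."""
    return 1 - (1 - 1 / T) ** T

limit = 1 - exp(-1)  # exponential approximation for long return periods

for T in (5, 10, 20, 50, 100, 1000):
    exact = prob_within_return_period(T)
    rel_error = (exact - limit) / limit
    print(f"T = {T:>4}: exact = {exact:.4f}, relative error = {rel_error:.4%}")
```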
Useful Probability Distributions: Negative Binomial Distribution
➢ The geometric PMF is the probability law governing the number of trials, or discrete time units, until the first occurrence of an event in a Bernoulli sequence.
➢ The number of time units (or trials) until a subsequent occurrence of the same event is governed by the negative binomial distribution.
➢ If $T_k$ is the number of time units until the kth occurrence of the event in a series of Bernoulli trials, then
$$P(T_k = n) = \begin{cases} \dbinom{n-1}{k-1} p^k q^{n-k} & \text{for } n = k, k+1, \ldots \\ 0 & \text{for } n < k \end{cases}$$
➢ If the kth occurrence of an event is realized at the nth trial, there must be exactly (k − 1) occurrences of the event in the prior (n − 1) trials, and at the nth trial the event also occurs. Thus, from the binomial law, we obtain the probability
$$P(T_k = n) = \binom{n-1}{k-1} p^{k-1} q^{n-k} \cdot p$$
Ex: Building Design
➢ In the previous building example, the probability that the building in the region will be subjected to the design wind for the third time on the tenth year is
$$P(T_3 = 10) = \binom{10-1}{3-1}(0.02)^3(0.98)^{10-3} = 36(0.000008)(0.8681) = 0.00025$$
➢ The probability that the third design wind will occur within 5 years would be
$$\begin{aligned}
P(T_3 \le 5) &= \sum_{n=3}^{5}\binom{n-1}{3-1}(0.02)^3(0.98)^{n-3} \\
&= \binom{2}{2}(0.02)^3(0.98)^0 + \binom{3}{2}(0.02)^3(0.98)^1 + \binom{4}{2}(0.02)^3(0.98)^2 \\
&= (0.000008) + 3(0.000008)(0.98) + 6(0.000008)(0.9604) = 0.000078
\end{aligned}$$
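A short numerical sketch of the negative binomial PMF follows. It also cross-checks the "third occurrence within 5 years" sum against the equivalent binomial event (at least three occurrences in five trials). The function name `negbin_pmf` is an illustrative assumption.

```python
from math import comb

def negbin_pmf(n: int, k: int, p: float) -> float:
    """P(T_k = n): the k-th occurrence happens on trial n."""
    if n < k:
        return 0.0
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)

p = 0.02  # annual probability of the design wind

p_third_on_tenth = negbin_pmf(10, 3, p)
p_third_within_5 = sum(negbin_pmf(n, 3, p) for n in range(3, 6))

# Equivalent binomial event: at least 3 occurrences in 5 trials.
binom_tail = sum(comb(5, x) * p**x * (1 - p) ** (5 - x) for x in range(3, 6))

print(f"P(T3 = 10)               = {p_third_on_tenth:.6f}")
print(f"P(T3 <= 5)               = {p_third_within_5:.6f}")
print(f"P(at least 3 in 5 years) = {binom_tail:.6f}")
```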
Ex: A Steel Cable Problem
➢ A steel cable is built up of a number of independent wires as shown in Fig. E3.19. Occasionally, the cable is subjected to high overloads; on such occasions the probability of fracture of one of the wires is 0.05, and the failure of two or more wires during a single overload is unlikely.
➢ If the cable must be replaced when the third wire fails, the probability that the cable can withstand at least five overloads can be determined as follows. First, we observe that the third wire failure must occur at or after the sixth overloading. Hence, the required probability is
$$\begin{aligned}
P(T_3 \ge 6) &= 1 - P(T_3 < 6) = 1 - \sum_{n=3}^{5} P(T_3 = n) \\
&= 1 - \binom{3-1}{3-1}(0.05)^3(0.95)^0 - \binom{4-1}{3-1}(0.05)^3(0.95)^1 - \binom{5-1}{3-1}(0.05)^3(0.95)^2 \\
&= 1 - 0.00116 = 0.9988
\end{aligned}$$
Useful Probability Distributions: Poisson Process and Poisson Distribution
➢ Many physical problems of interest to engineers and scientists involve the possible occurrences of events at any point in time and/or space.
☞ Earthquakes could strike at any time and anywhere over a seismically active region in the world;
☞ Fatigue cracks may occur anywhere along a continuous weld; and
☞ Traffic accidents could happen at any time on a given highway.
➢ Conceivably, such space-time problems may be modeled also with the Bernoulli sequence, by dividing the time or space into appropriate small intervals, and assuming that an event will either occur or not occur (only two possibilities) within each interval, thus constituting a Bernoulli trial.
➢ However, if the event can randomly occur at any instant of time (or at any point in space), it may occur more than once in any given time or space interval. In such cases, the occurrences of the event may be more appropriately modeled with a Poisson process or Poisson sequence.
➢ Formally, the Poisson process is based on the following assumptions:
1. An event can occur at random and at any instant of time or any point in space.
2. The occurrence(s) of an event in a given time (or space) interval is statistically independent of that in any other nonoverlapping interval.
3. The probability of occurrence of an event in a small interval ∆t is proportional to ∆t, and can be given by ν∆t, where ν is the mean occurrence rate of the event (assumed to be constant).
4. The probability of two or more occurrences in ∆t is negligible (of higher orders of ∆t).
➢ If $X_t$ is the number of occurrences in a time (or space) interval (0, t), then the number of statistically independent occurrences of an event in t (time or space) is governed by the Poisson PMF:
$$P(X_t = x) = \frac{(\nu t)^x}{x!}e^{-\nu t}, \qquad x = 0, 1, 2, \ldots$$
where ν is the mean occurrence rate, i.e., the average number of occurrences of the event per unit time (or space) interval; $E(X_t) = \nu t = \mathrm{Var}(X_t)$.
➢ The Bernoulli sequence approaches the Poisson process as the time (or space) interval is decreased.
➢ From previous statistical data of traffic counts, an average of 60 cars per hour was observed to make left turns at a given intersection. Then, suppose we are interested in the probability of exactly 10 cars making left turns at the intersection in a 10-min interval.
As an approximation, we may first divide the 1-hr duration into 120 intervals of 30 sec each (1 hour ⇒ 120 intervals; 10 min ⇒ 20 intervals), such that the probability of a left turn (L.T.) in any 30-sec interval would be p = 60/120 = 0.5 (120 × 0.5 = 60). Then, allowing no more than one L.T. in any 30-sec interval, the problem is reduced to the binomial probability of the occurrence of 10 L.T. among the maximum possible of 20 L.T. in the 10-min interval, in which the probability of an L.T. in each 30-sec interval is 0.5. Thus,
$$P(10 \text{ L.T. in 10 min}) = \binom{20}{10}(0.5)^{10}(0.5)^{20-10} = 0.1762$$
The above solution is grossly approximate because it assumes that no more than one car will be making an L.T. in a 30-sec interval; obviously, two or more L.T.s are possible.
The solution would be improved if we selected a shorter time interval, say, a 10-sec interval (1 hour ⇒ 360 intervals; 10 min ⇒ 60 intervals). Then, the probability of an L.T. in each interval is p = 60/360 = 0.1667 (360 × 0.1667 = 60), and
$$P(10 \text{ L.T. in 10 min}) = \binom{60}{10}(0.1667)^{10}(0.8333)^{60-10} = 0.1370$$
Further improvements can be made by subdividing time into shorter intervals. If the time t is subdivided into n equal intervals, then the binomial PMF would give
$$P(x \text{ occurrences in } t) = \binom{n}{x}\left(\frac{\lambda}{n}\right)^x\left(1 - \frac{\lambda}{n}\right)^{n-x}$$
where λ is the average number of occurrences of the event in time t.
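λ = νt: average number of events in t (min); ν: average number of events per unit time (min)
⇒ divide t into n → ∞ intervals with p = λ/n = νt/n (event/trial)
(Assume: no more than one event in an interval)
$$\begin{aligned}
P(x \text{ occurrences in } t) &= \lim_{n\to\infty}\binom{n}{x}\left(\frac{\lambda}{n}\right)^x\left(1 - \frac{\lambda}{n}\right)^{n-x} \\
&= \lim_{n\to\infty}\frac{n!}{x!\,(n-x)!}\left(\frac{\lambda}{n}\right)^x\left(1 - \frac{\lambda}{n}\right)^{n-x} \\
&= \lim_{n\to\infty}\underbrace{\frac{n}{n}\cdot\frac{(n-1)}{n}\cdots\frac{(n-x+1)}{n}}_{\to 1\ (n \gg x)}\cdot\frac{\lambda^x}{x!}\underbrace{\left(1 - \frac{\lambda}{n}\right)^{n}}_{\to e^{-\lambda}}\underbrace{\left(1 - \frac{\lambda}{n}\right)^{-x}}_{\to 1}
\end{aligned}$$
$$\lim_{n\to\infty}\left(1 - \frac{\lambda}{n}\right)^n = 1 - \lambda + \frac{\lambda^2}{2!} - \frac{\lambda^3}{3!} + \cdots = e^{-\lambda}$$
$$P(x \text{ occurrences in } t) = \frac{\lambda^x}{x!}e^{-\lambda} = \frac{(\nu t)^x}{x!}e^{-\nu t}$$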
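➢ On this basis, with ν = 1 L.T. per minute, the probability of x = 10 L.T. in t = 10 min is then
$$P(X_{10} = 10) = \frac{(\nu t)^x}{x!}e^{-\nu t} = \frac{(1 \times 10)^{10}}{10!}e^{-1\times 10} = 0.1251$$
This is the limit approached by the earlier binomial approximations: 0.1762 (30-sec intervals) and 0.1370 (10-sec intervals).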
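The convergence of the binomial approximation to the Poisson value can be illustrated numerically. This is a sketch under the example's assumptions (an average of 10 left turns in 10 min); the finer subdivisions n = 600 and n = 6000 are added here for illustration and do not appear in the original example.

```python
from math import comb, exp, factorial

def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability of x occurrences in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability of x occurrences with mean lam."""
    return lam**x / factorial(x) * exp(-lam)

lam = 10.0  # mean number of left turns in 10 minutes
x = 10

for n in (20, 60, 600, 6000):  # number of intervals in 10 minutes
    print(f"n = {n:>4}: binomial = {binom_pmf(x, n, lam / n):.4f}")
print(f"Poisson limit     = {poisson_pmf(x, lam):.4f}")
```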
Ex: Severe Rainstorms
➢ Historical records of severe rainstorms in a town over the last 20 years indicated that there had been an average number of four rainstorms per year. Assuming that the occurrences of rainstorms may be modeled with the Poisson process, the probability that there would not be any rainstorms next year is
$$P(X_t = 0) = \frac{(4 \times 1)^0}{0!}e^{-(4\times 1)} = 0.018$$
➢ The probability of four rainstorms next year is
$$P(X_t = 4) = \frac{(4 \times 1)^4}{4!}e^{-(4\times 1)} = 0.195$$
➢ The PMF shows the different probabilities of the number of rainstorms in a year ($X_t = 0, 1, \ldots$).
➢ Although the average number of rainstorms per year is four, the probability of actually experiencing four rainstorms in a year is less than 20%. The probability of two or more rainstorms (x ≥ 2) in the next year is
$$\begin{aligned}
P(X_1 \ge 2) &= 1 - P(X_1 \le 1) \\
&= 1 - P(X_1 = 0) - P(X_1 = 1) \\
&= 1 - \sum_{x=0}^{1}\frac{(4 \times 1)^x}{x!}e^{-(4\times 1)} \\
&= 1 - 0.0183 - 0.0733 = 0.908
\end{aligned}$$
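The rainstorm probabilities follow directly from the Poisson PMF. A minimal sketch, with an illustrative helper name:

```python
from math import exp, factorial

def poisson_pmf(x: int, mean: float) -> float:
    """P(X_t = x) for a Poisson process with mean = nu * t."""
    return mean**x / factorial(x) * exp(-mean)

nu, t = 4.0, 1.0      # four rainstorms per year, one-year window
mean = nu * t

p_none = poisson_pmf(0, mean)
p_four = poisson_pmf(4, mean)
p_two_or_more = 1 - poisson_pmf(0, mean) - poisson_pmf(1, mean)

print(f"P(no rainstorm)   = {p_none:.3f}")
print(f"P(exactly four)   = {p_four:.3f}")
print(f"P(two or more)    = {p_two_or_more:.3f}")
```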
Cumulative Values for the Poisson Probability Distribution
$$P(x; \lambda, t) = P[X \le x] = \sum_{k=0}^{x} p(k; \lambda, t)$$
Columns: νt; rows: x
  x     1.0     2.0     3.0     4.0     5.0     6.0     7.0     8.0     9.0    10.0
  0  0.3679  0.1353  0.0498  0.0183  0.0067  0.0025  0.0009  0.0003  0.0001  0.0000
  1  0.7358  0.4060  0.1991  0.0916  0.0404  0.0174  0.0073  0.0030  0.0012  0.0005
  2  0.9197  0.6767  0.4232  0.2381  0.1247  0.0620  0.0296  0.0138  0.0062  0.0028
  3  0.9810  0.8571  0.6472  0.4335  0.2650  0.1512  0.0818  0.0424  0.0212  0.0103
  4  0.9963  0.9473  0.8153  0.6288  0.4405  0.2851  0.1730  0.0990  0.0550  0.0293
  5  0.9994  0.9834  0.9161  0.7851  0.6160  0.4457  0.3007  0.1912  0.1157  0.0671
  6  0.9999  0.9955  0.9665  0.8893  0.7622  0.6063  0.4497  0.3134  0.2068  0.1301
  7  1.0000  0.9989  0.9881  0.9489  0.8666  0.7440  0.5987  0.4530  0.3239  0.2202
  8          0.9998  0.9962  0.9786  0.9319  0.8472  0.7291  0.5926  0.4557  0.3328
  9          1.0000  0.9989  0.9919  0.9682  0.9161  0.8305  0.7166  0.5874  0.4579
 10                  0.9997  0.9972  0.9863  0.9574  0.9015  0.8159  0.7060  0.5830
 11                  0.9999  0.9991  0.9945  0.9799  0.9466  0.8881  0.8030  0.6968
 12                  1.0000  0.9997  0.9980  0.9912  0.9730  0.9362  0.8758  0.7916
 13                          0.9999  0.9993  0.9964  0.9872  0.9658  0.9262  0.8645
 14                          1.0000  0.9998  0.9986  0.9943  0.9827  0.9585  0.9165
 15                                  0.9999  0.9995  0.9976  0.9918  0.9780  0.9513
 16                                  1.0000  0.9998  0.9990  0.9963  0.9889  0.9730
 17                                          0.9999  0.9996  0.9984  0.9947  0.9857
 18                                          1.0000  0.9999  0.9993  0.9976  0.9928
 19                                                  0.9999  0.9997  0.9989  0.9965
 20                                                  1.0000  0.9999  0.9996  0.9984
 21                                                          1.0000  0.9998  0.9993
 22                                                                  0.9999  0.9997
 23                                                                  1.0000  0.9999
 24                                                                          0.9999
 25                                                                          1.0000
Columns: νt; rows: x
  x    11.0    12.0    13.0    14.0    15.0    16.0    17.0    18.0    19.0    20.0
  0  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
  1  0.0002  0.0001  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
  2  0.0012  0.0005  0.0002  0.0001  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
  3  0.0049  0.0023  0.0011  0.0005  0.0002  0.0001  0.0000  0.0000  0.0000  0.0000
  4  0.0151  0.0076  0.0037  0.0018  0.0009  0.0004  0.0002  0.0001  0.0000  0.0000
  5  0.0375  0.0203  0.0107  0.0055  0.0028  0.0014  0.0007  0.0003  0.0002  0.0001
  6  0.0786  0.0458  0.0259  0.0142  0.0076  0.0040  0.0021  0.0010  0.0005  0.0003
  7  0.1432  0.0895  0.0540  0.0316  0.0180  0.0100  0.0054  0.0029  0.0015  0.0008
  8  0.2320  0.1550  0.0998  0.0621  0.0374  0.0220  0.0126  0.0071  0.0039  0.0021
  9  0.3405  0.2424  0.1658  0.1094  0.0699  0.0433  0.0261  0.0154  0.0089  0.0050
 10  0.4599  0.3472  0.2517  0.1757  0.1185  0.0774  0.0491  0.0304  0.0183  0.0108
 11  0.5793  0.4616  0.3532  0.2600  0.1847  0.1270  0.0847  0.0549  0.0347  0.0214
 12  0.6887  0.5760  0.4631  0.3585  0.2676  0.1931  0.1350  0.0917  0.0606  0.0390
 13  0.7813  0.6815  0.5730  0.4644  0.3632  0.2745  0.2009  0.1426  0.0984  0.0661
 14  0.8540  0.7720  0.6751  0.5704  0.4656  0.3675  0.2808  0.2081  0.1497  0.1049
 15  0.9074  0.8444  0.7636  0.6694  0.5681  0.4667  0.3714  0.2866  0.2148  0.1565
 16  0.9441  0.8987  0.8355  0.7559  0.6641  0.5660  0.4677  0.3750  0.2920  0.2211
 17  0.9678  0.9370  0.8905  0.8272  0.7489  0.6593  0.5640  0.4686  0.3784  0.2970
 18  0.9823  0.9626  0.9302  0.8826  0.8195  0.7423  0.6549  0.5622  0.4695  0.3814
 19  0.9907  0.9787  0.9573  0.9235  0.8752  0.8122  0.7363  0.6509  0.5606  0.4703
 20  0.9953  0.9884  0.9750  0.9521  0.9170  0.8682  0.8055  0.7307  0.6472  0.5591
 21  0.9977  0.9939  0.9859  0.9711  0.9469  0.9108  0.8615  0.7991  0.7255  0.6437
 22  0.9989  0.9969  0.9924  0.9833  0.9672  0.9418  0.9047  0.8551  0.7931  0.7206
 23  0.9995  0.9985  0.9960  0.9907  0.9805  0.9633  0.9367  0.8989  0.8490  0.7875
 24  0.9998  0.9993  0.9980  0.9950  0.9888  0.9777  0.9593  0.9317  0.8933  0.8432
Columns: νt; rows: x (continued)
  x    11.0    12.0    13.0    14.0    15.0    16.0    17.0    18.0    19.0    20.0
 25  0.9999  0.9997  0.9990  0.9974  0.9938  0.9869  0.9747  0.9554  0.9269  0.8878
 26  1.0000  0.9999  0.9995  0.9987  0.9967  0.9925  0.9848  0.9718  0.9514  0.9221
 27          0.9999  0.9998  0.9994  0.9983  0.9959  0.9912  0.9827  0.9687  0.9475
 28          1.0000  0.9999  0.9997  0.9991  0.9978  0.9950  0.9897  0.9805  0.9657
 29                  1.0000  0.9999  0.9996  0.9989  0.9973  0.9940  0.9881  0.9782
 30                          0.9999  0.9998  0.9994  0.9985  0.9967  0.9930  0.9865
 31                          1.0000  0.9999  0.9997  0.9992  0.9982  0.9960  0.9919
 32                                  0.9999  0.9999  0.9996  0.9990  0.9978  0.9953
 33                                  1.0000  0.9999  0.9998  0.9995  0.9988  0.9973
 34                                          1.0000  0.9999  0.9997  0.9994  0.9985
 35                                                  0.9999  0.9999  0.9997  0.9992
 36                                                  1.0000  0.9999  0.9998  0.9996
 37                                                          1.0000  0.9999  0.9998
 38                                                                  1.0000  0.9999
 39                                                                          0.9999
 40                                                                          1.0000
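Entries of the cumulative Poisson table can be regenerated rather than looked up. The sketch below uses a term-by-term recurrence, which avoids large factorials; the function name `poisson_cdf` and the choice of rows printed are illustrative assumptions.

```python
from math import exp

def poisson_cdf(x: int, mean: float) -> float:
    """P(X <= x) for a Poisson variate with the given mean (nu * t)."""
    term = exp(-mean)        # P(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= mean / k     # P(X = k) from P(X = k - 1)
        total += term
    return total

means = [float(m) for m in range(1, 11)]
print("  x " + " ".join(f"{m:>6.1f}" for m in means))
for x in range(0, 8):
    row = " ".join(f"{poisson_cdf(x, m):.4f}" for m in means)
    print(f"{x:>3} {row}")
```

For example, `poisson_cdf(4, 15.0)` gives the cumulative probability needed in the pipeline example that follows later in this chapter.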
Ex: Left-turn Bay Design
➢ In designing the left-turn bay at a state highway intersection, the vehicles making left turns at the intersection may be modeled as a Poisson process.
➢ If the cycle time of the traffic light for left turns is 1 min, and the design criterion requires a left-turn lane that will be sufficient 96% of the time (which may be the criterion in some states in the United States), the lane distance, in terms of car lengths, to allow for an average of 100 left turns per hour may be determined as follows.
➢ The mean rate of left turns at the intersection is ν = 100/60 per minute. Suppose the design length of the left-turn lane is x car lengths. Then, during a 1-min cycle of the traffic light, the design criterion requires that the probability of no more than x cars waiting for left turns must be at least 96%:
$$P(X_{t=1} \le x) = \sum_{k=0}^{x}\frac{1}{k!}\left(\frac{100}{60}\times 1\right)^k e^{-(100/60)\cdot 1} \ge 0.960$$
$$\text{If } x = 3: \quad P(X_{t=1} \le 3) = \sum_{k=0}^{3}\frac{1}{k!}\left(\frac{100}{60}\times 1\right)^k e^{-(100/60)\cdot 1} = 0.912$$
$$\text{If } x = 4: \quad P(X_{t=1} \le 4) = \sum_{k=0}^{4}\frac{1}{k!}\left(\frac{100}{60}\times 1\right)^k e^{-(100/60)\cdot 1} = 0.972$$
➢ A left-turn bay of four car lengths at the intersection is sufficient to satisfy the design requirement.
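The design search amounts to finding the smallest x whose cumulative Poisson probability meets the criterion. A minimal sketch, assuming the helper names `poisson_cdf` and `required_bay_length` (neither is from the source):

```python
from math import exp

def poisson_cdf(x: int, mean: float) -> float:
    """P(X <= x) for a Poisson variate with the given mean."""
    term = exp(-mean)
    total = term
    for k in range(1, x + 1):
        term *= mean / k
        total += term
    return total

def required_bay_length(mean: float, reliability: float) -> int:
    """Smallest number of car lengths x with P(X <= x) >= reliability."""
    x = 0
    while poisson_cdf(x, mean) < reliability:
        x += 1
    return x

nu = 100 / 60       # left turns per minute
cycle = 1.0         # traffic-light cycle, minutes
bay = required_bay_length(nu * cycle, 0.96)

for x in range(0, bay + 1):
    print(f"x = {x}: P(X <= x) = {poisson_cdf(x, nu * cycle):.3f}")
print(f"required bay length = {bay} car lengths")
```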
Ex: Traffic Control at A School Crosswalk
➢ The street width at a school crosswalk is D ft, and a child crossing the street walks at a speed of 3.5 ft/sec. In other words, it takes a child $t = D/3.5$ sec to cross the street.
➢ Suppose 60 free intervals (t seconds each) in an hour, on the average, are desired at this crossing; how much average traffic volume can be allowed at this crosswalk before crossing controls will be necessary? Assume that cars crossing the crosswalk constitute a Poisson process.
➢ The number of t-sec intervals in an hour is $3600/t$, whereas in an interval of t sec the probability of no cars passing through the crosswalk is $P(X = 0) = \frac{(\nu t)^0}{0!}e^{-\nu t} = e^{-\nu t}$, if ν is the average vehicular traffic per second.
Therefore the maximum average traffic volume that can be allowed is such that the mean number of free intervals equals 60; that is,
$$60 \le \left(\frac{3600}{t}\right)e^{-\nu t} = \left(\frac{3600}{D/3.5}\right)e^{-\nu(D/3.5)} \quad\Rightarrow\quad \nu \le \frac{3.5}{D}\ln\left(\frac{(3600)(3.5)}{(60)(D)}\right)$$
For D = 25 ft:
$$\nu \le \frac{3.5}{25}\ln\left(\frac{(3600)(3.5)}{(60)(25)}\right) = 0.298 \text{ vehicles/sec} = 1073 \text{ vehicles/hr}$$
➢ For various street widths D, the maximum traffic flow that can be allowed before pedestrian crossing controls should be installed:
D (ft)          25    40    60    75
ν (veh/hr) ≤  1073   522   263   173
➢ The above method has been adopted by the Joint Committee of the Institute of Traffic Engineers and the International Association of Chiefs of Police.
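The table of allowable traffic volumes can be regenerated from the closed-form bound on ν. The sketch below is illustrative; the function name and its default arguments are assumptions that simply encode the example's numbers (3.5 ft/sec walking speed, 60 free intervals per hour).

```python
from math import log

def max_traffic_per_hour(width_ft: float,
                         walk_speed: float = 3.5,
                         free_intervals: float = 60.0) -> float:
    """Largest mean traffic volume (veh/hr) giving the desired free intervals per hour."""
    t = width_ft / walk_speed                    # crossing time, sec
    intervals_per_hour = 3600.0 / t
    # Solve free_intervals = intervals_per_hour * exp(-nu * t) for nu (veh/sec).
    nu_per_sec = log(intervals_per_hour / free_intervals) / t
    return nu_per_sec * 3600.0

for width in (25, 40, 60, 75):
    print(f"D = {width:>2} ft: nu <= {max_traffic_per_hour(width):.0f} veh/hr")
```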
Ex: A Steel Pipeline Problem
➢ A major steel pipeline is used to transport crude oil from an oil production platform to a refinery over a distance of 100 km. Even though the entire pipeline is inspected once a year and repaired as necessary, the steel material is subject to damaging corrosion.
➢ Assume that from past inspection records, the mean occurrence rate of such corrosion sites is determined to be 0.15 per km. In this case, if the occurrence of corrosion along the pipeline is modeled as a Poisson process with a mean occurrence rate of ν = 0.15/km, the probability that there will be 10 locations of damaging corrosion between inspections is
$$P(X_{100} = 10) = \frac{(0.15 \times 100)^{10}}{10!}e^{-0.15\times 100} = 0.049$$
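➢ The probability of at least five corrosion sites between inspections (100 km) is
$$\begin{aligned}
P(X_{100} \ge 5) &= 1 - P(X_{100} \le 4) = 1 - \sum_{n=0}^{4}\frac{(0.15 \times 100)^n}{n!}e^{-0.15\times 100} \\
&= 1 - \left[\frac{(15)^0}{0!}e^{-15} + \frac{(15)^1}{1!}e^{-15} + \frac{(15)^2}{2!}e^{-15} + \frac{(15)^3}{3!}e^{-15} + \frac{(15)^4}{4!}e^{-15}\right] \\
&= 1 - \underbrace{e^{-15}}_{3\times 10^{-7}}\underbrace{\left[1 + 15 + 112.5 + 562.5 + 2109.4\right]}_{=2800.4} \\
&= 1 - 0.0009 = 0.9991
\end{aligned}$$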
➢ In any one of the corrosion sites, there may be one or more cracks that could initiate fracture failure. If the probability of this event occurring at a corrosion site is 0.001, the probability of fracture failure along the entire 100-km pipeline between inspection and repair would be (denoting F for fracture failure)
$$\begin{aligned}
P(F) &= 1 - P(\bar{F}) = 1 - P(\bar{F} \cap X_{100} \ge 0) \\
&= 1 - \sum_{n=0}^{\infty} P(\bar{F} \mid X_{100} = n)\,P(X_{100} = n) \\
&= 1 - \sum_{n=0}^{\infty}(1 - 0.001)^n\,\frac{(0.15 \times 100)^n}{n!}e^{-(0.15\times 100)} \\
&= 1 - e^{-15}\left[1 + (0.999)\frac{15}{1!} + (0.999)^2\frac{15^2}{2!} + (0.999)^3\frac{15^3}{3!} + \cdots\right] \\
&= 1 - e^{-15}e^{0.999\times 15} = 1 - e^{(0.999 - 1)\times 15} = 1 - 0.985 = 0.015
\end{aligned}$$
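The three pipeline results can be checked with a truncated Poisson PMF. The sketch below builds the PMF by recurrence and compares the infinite-series result for fracture with its closed form 1 − exp(−0.001 × 15). The truncation at 200 terms and all variable names are illustrative assumptions.

```python
from math import exp

nu, length, p_fracture = 0.15, 100.0, 0.001
mean_sites = nu * length   # expected number of corrosion sites (15)

def poisson_pmf_list(mean: float, n_max: int) -> list[float]:
    """[P(X = 0), ..., P(X = n_max)] built by recurrence."""
    pmf = [exp(-mean)]
    for n in range(1, n_max + 1):
        pmf.append(pmf[-1] * mean / n)
    return pmf

pmf = poisson_pmf_list(mean_sites, 200)   # tail beyond 200 is negligible

p_ten_sites = pmf[10]
p_at_least_five = 1 - sum(pmf[:5])

# Total probability over the number of sites: no site fractures.
p_no_fracture = sum((1 - p_fracture) ** n * pn for n, pn in enumerate(pmf))
p_fracture_total = 1 - p_no_fracture
closed_form = 1 - exp(-p_fracture * mean_sites)

print(f"P(X = 10)   = {p_ten_sites:.3f}")
print(f"P(X >= 5)   = {p_at_least_five:.4f}")
print(f"P(fracture) = {p_fracture_total:.4f} (closed form {closed_form:.4f})")
```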
Ex: Large Earthquakes
➢ In the last 50 years, suppose that there were two large earthquakes (with magnitudes M ≥ 6) in Southern California. If we model the occurrences of such large earthquakes as a Bernoulli sequence, the probability of such large earthquakes in Southern California in the next 15 years would be evaluated as follows. First, the annual probability of occurrence of large earthquakes is p = 2/50 = 0.04. Then
$$P(X \ge 1) = 1 - P(X = 0) = 1 - \binom{15}{0}(0.04)^0(0.96)^{15} = 0.458$$
➢ If the occurrences of large earthquakes in Southern California were modeled as a Poisson process, we would first determine the mean occurrence rate as ν = 2/50 = 0.04 per year, and the probability of such large earthquakes in the next 15 years then becomes
$$P(X_{15} \ge 1) = 1 - P(X_{15} = 0) = 1 - \frac{(0.04 \times 15)^0}{0!}e^{-(0.04\times 15)} = 0.451$$
➢ Suppose that during an earthquake of M ≥ 6, the ground shaking intensity Y at a particular building site has a lognormal distribution with a median of 0.20g and a c.o.v. of 0.25 (i.e., $\lambda = \ln(x_m) = \ln(0.2g)$; $\zeta \approx \delta_X = 0.25$, valid if the c.o.v. is less than 0.3). If the seismic capacity of a building is 0.30g, the probability that the building will suffer damage during an earthquake of magnitude M ≥ 6 would be
$$\begin{aligned}
P(D \mid M \ge 6) &= P(Y > 0.30g) = 1 - P(Y \le 0.30g) \\
&= 1 - \Phi\left(\frac{\ln(0.3g) - \lambda}{\zeta}\right) = 1 - \Phi\left(\frac{\ln(0.3g) - \ln(0.2g)}{0.25}\right) \\
&= 1 - \Phi\left(\frac{\ln(1.5)}{0.25}\right) = 1 - 0.947 = 0.053
\end{aligned}$$
$$P(\bar{D} \mid M \ge 6) = 1 - P(D \mid M \ge 6) = 1 - 0.053 = 0.947$$
➢ In the next 20 years, the probability that the building will not suffer damage from large earthquakes (assuming the Poisson process for occurrences of large earthquakes) would be (given ν = 0.04; t = 20; $P(\bar{D} \mid M \ge 6) = 0.947$)
$$\begin{aligned}
P(\bar{D} \text{ in 20 years}) &= \sum_{n=0}^{\infty}(0.947)^n\,\frac{(0.04 \times 20)^n}{n!}e^{-0.04\times 20} \\
&= e^{-0.80}\left[1 + (0.947)^1\frac{(0.80)^1}{1!} + (0.947)^2\frac{(0.80)^2}{2!} + (0.947)^3\frac{(0.80)^3}{3!} + \cdots\right] \\
&= e^{-0.80}e^{0.947\times 0.80} = e^{-0.0424} = 0.958
\end{aligned}$$
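The earthquake damage calculation combines a lognormal exceedance probability with a thinned Poisson process. The sketch below follows the example's approximation ζ ≈ c.o.v.; the variable names are illustrative, and because the normal CDF is evaluated without table rounding the last digit may differ slightly from the slide values.

```python
from math import exp, log
from statistics import NormalDist

median_g, cov, capacity_g = 0.20, 0.25, 0.30

lam = log(median_g)     # lognormal parameter lambda = ln(median)
zeta = cov              # approximation used in the example (c.o.v. < 0.3)

# P(damage | earthquake) = P(Y > capacity)
z = (log(capacity_g) - lam) / zeta
p_damage = 1 - NormalDist().cdf(z)

# Damaging earthquakes form a Poisson process with rate nu * p_damage,
# so P(no damage in t years) = exp(-nu * t * p_damage).
nu, t = 0.04, 20.0
p_no_damage_20yr = exp(-nu * t * p_damage)

print(f"P(D | M >= 6)           = {p_damage:.3f}")
print(f"P(no damage in 20 yr)   = {p_no_damage_20yr:.3f}")
```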
Further Notes
➢ In both the Bernoulli sequence and the Poisson process, the
occurrences of an event between trials (in the case of the
Bernoulli model) and between intervals (in the Poisson model)
are statistically independent.
➢ More generally, the occurrence of a given event in one trial (or
interval) may affect the occurrence or nonoccurrence of the same
event in subsequent trials (or intervals).
In other words, the probability of occurrence of an event in a given
trial may depend on earlier trials, and thus could involve conditional
probabilities.
➢ If this conditional probability depends on the immediately preceding
trial (or interval), the resulting model is a Markov chain (or Markov
process).
Useful Probability Distributions: The Exponential Distribution
➢ In the case of a Bernoulli sequence, the probability of the recurrence time between events is described by the geometric distribution. If the occurrences of an event constitute a Poisson process, the recurrence time would be described by the exponential distribution.
➢ In the case of a Poisson process, if $T_1$ is the time till the first occurrence of an event, then ($T_1 > t$) means that there is no occurrence of the event in (0, t):
$$P(T_1 > t) = P(X_t = 0) = \frac{(\nu t)^0}{0!}e^{-\nu t} = e^{-\nu t}$$
➢ Because the occurrences of an event in nonoverlapping intervals are statistically independent, $T_1$ is also the recurrence time between two consecutive occurrences of the same event. The CDF (and PDF) of $T_1$, therefore, is the exponential distribution,
$$F_{T_1}(t) = P(T_1 \le t) = 1 - e^{-\nu t}, \qquad f_{T_1}(t) = \frac{dF}{dt} = \nu e^{-\nu t}$$
➢ If the mean occurrence rate, ν, is constant, the mean recurrence time, $E(T_1)$, or return period for a Poisson process can be shown to be
$$E(T_1) = \frac{1}{\nu}$$
This may be compared with the corresponding return period of 1/p for the Bernoulli sequence. However, for events with small occurrence rate ν, 1/ν ≈ 1/p.
Observation: in a Poisson process with occurrence rate ν, the probability of an event occurring in a unit time interval (i.e., t = 1) is
$$p = P(X_1 = 1) = \nu e^{-\nu} = \nu\left(1 - \nu + \frac{1}{2}\nu^2 - \cdots\right) \approx \nu \quad \text{for small } \nu$$
➢ For rare events, i.e., events with small mean occurrence rates or long return periods, the Bernoulli and the Poisson models should give approximately the same results.
Ex: Earthquakes in San Francisco
➢ According to Benjamin (1968), the historical record of earthquakes in San Francisco from 1836 to 1961 shows that there were 16 earthquakes with ground motion intensity in MM-scale of VI or higher. If the occurrence of such high-intensity earthquakes in the San Francisco Bay Area can be assumed to constitute a Poisson process, the probability that the next high-intensity earthquake will occur within the next 2 years would be evaluated as follows.
The mean occurrence rate of high-intensity earthquakes in the region is
$$\nu = \frac{16}{125} = 0.128 \text{ quake per year} \quad\Rightarrow\quad P(T_1 \le 2) = 1 - e^{-0.128\times 2} = 0.226$$
➢ The above is equivalent to the probability of the occurrence of such high-intensity earthquakes (one or more) in the next two years. With the Poisson model, this latter probability would be
$$P(X_2 \ge 1) = 1 - P(X_2 \le 0) = 1 - P(X_2 = 0) = 1 - \frac{(0.128 \times 2)^0}{0!}e^{-0.128\times 2} = 1 - e^{-0.128\times 2} = 0.226$$
➢ The probability that no earthquakes of this high intensity will occur in the next 10 years is, by the exponential distribution or equivalently by the Poisson distribution,
$$P(T_1 > 10) = e^{-0.128\times 10} = 0.278$$
$$P(X_{10} = 0) = \frac{(0.128 \times 10)^0}{0!}e^{-0.128\times 10} = 0.278$$
➢ The return period of an intensity VI earthquake in San Francisco is
$$\bar{T}_1 = \frac{1}{0.128} = 7.8 \text{ years}$$
➢ The probability of occurrence of large earthquakes within a given time t is given by the CDF of $T_1$:
$$P(T_1 \le t) = 1 - e^{-0.128t}, \qquad t = 5, 10, \ldots$$
➢ The probability of high-intensity earthquakes occurring within the return period of 7.8 years in the San Francisco area would be
$$P(T_1 \le 7.8) = 1 - e^{-0.128\times 7.8} = 1 - e^{-1.0} = 0.632$$
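The San Francisco figures follow from the exponential CDF with ν = 16/125. A minimal sketch, with illustrative function names:

```python
from math import exp

nu = 16 / 125   # high-intensity earthquakes per year (1836 to 1961)

def recurrence_cdf(t: float) -> float:
    """P(T1 <= t) for the exponential recurrence time."""
    return 1 - exp(-nu * t)

p_within_2 = recurrence_cdf(2.0)
p_none_in_10 = 1 - recurrence_cdf(10.0)   # same as P(X_10 = 0)
return_period = 1 / nu
p_within_return_period = recurrence_cdf(return_period)

print(f"P(T1 <= 2)          = {p_within_2:.3f}")
print(f"P(T1 > 10)          = {p_none_in_10:.3f}")
print(f"return period       = {return_period:.1f} yr")
print(f"P(T1 <= return per) = {p_within_return_period:.3f}")
```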
➢ For a Poisson process the probability of an event occurring (once or more) within its return period is always equal to $1 - e^{-1} = 0.632$. This may be compared with the probability of events with long return periods of the Bernoulli model.
➢ The exponential distribution is also useful as a general-purpose probability function. The PDF, CDF, mean, and variance of the exponential distribution are
$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x \ge 0 \\ 0 & \text{for } x < 0 \end{cases} \qquad F_X(x) = \begin{cases} 1 - e^{-\lambda x} & \text{for } x \ge 0 \\ 0 & \text{for } x < 0 \end{cases}$$
$$\mu_X = \frac{1}{\lambda}, \qquad \sigma_X^2 = \frac{1}{\lambda^2}$$
Useful Probability Distributions: The Shifted Exponential Distribution
➢ The PDF and CDF of the exponential distribution start at x = 0.
➢ In general, the distribution can start at any positive value of x; the resulting distribution may be called the shifted exponential distribution.
➢ The corresponding PDF and CDF starting at a are
$$f_X(x) = \begin{cases} \lambda e^{-\lambda(x - a)} & \text{for } x \ge a \\ 0 & \text{for } x < a \end{cases} \qquad F_X(x) = \begin{cases} 1 - e^{-\lambda(x - a)} & \text{for } x \ge a \\ 0 & \text{for } x < a \end{cases}$$
➢ The exponential distribution is appropriate for modeling the distribution of the operational life, or time-to-failure, of systems under “chance” or constant failure rate condition.
➢ In this regard, the parameter λ is related to the mean life or mean time-to-failure E(T) as
$$\lambda = \frac{1}{E(T)}$$
➢ For a random variable X with the shifted exponential distribution starting at x = a, the mean value of X would be
$$E(X) = a + \frac{1}{\lambda}, \qquad E(X - a) = \frac{1}{\lambda}, \qquad \sigma_X = \frac{1}{\lambda}$$
Ex: Diesel Engines to Generate Backup Electrical Power
➢ Suppose that four identical diesel engines are used to generate backup electrical power for the emergency control system of a nuclear power plant. Assume that at least two of the diesel-powered units are required to supply the needed emergency power; in other words, at least two of the four engines must start automatically during sudden loss of outside electrical power.
➢ The operational life T of each diesel engine may be modeled with the shifted exponential distribution, with a rated mean operational life of 15 years and a guaranteed minimum life of 2 years.
➢ In this case, the reliability of the emergency backup system would clearly be of interest. For example, the probability that at least two of the four diesel engines will start automatically during an emergency within the first 4 years of the life of the system can be determined as follows.
Useful Probability Distributions: The Shifted Exponential Distribution
Ex: Diesel Engines to Generate Backup Electrical Power
➢ First, the probability that any one of the engines will start without any problem within 4 years is (λ = 1/(15 − 2); x = 4; a = 2)
$$P(T > 4) = 1 - P(T \le 4) = 1 - \left(1 - e^{-\left(\frac{1}{15-2}\right)(4-2)}\right) = 0.8574$$
➢ Then, denoting N as the number of engines starting during an emergency, the reliability of the backup system within 4 years is
$$\begin{aligned} P(N \ge 2) &= \sum_{n=2}^{4} \binom{4}{n} (0.8574)^n (0.1426)^{4-n} = 1 - \sum_{n=0}^{1} \binom{4}{n} (0.8574)^n (0.1426)^{4-n} \\ &= 1 - \binom{4}{0}(0.8574)^0 (0.1426)^{4-0} - \binom{4}{1}(0.8574)^1 (0.1426)^{4-1} \\ &= 1 - 0.0004 - 0.0099 = 0.990 \end{aligned}$$
➢ Therefore, the reliability of the backup system within 4 years is 99%, even though the reliability of each engine is only about 86%.
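The two steps of this example (single-engine survival from the shifted exponential, then a binomial count over four engines) can be sketched in a few lines of Python. The function names and argument names below are illustrative choices, not part of the original slides, and the engines are assumed to be statistically independent, as the binomial step requires.

```python
import math

def shifted_exp_survival(t, mean_life, min_life):
    """P(T > t) for a shifted exponential with the given mean and minimum life."""
    if t <= min_life:
        return 1.0  # no failures before the guaranteed minimum life
    lam = 1.0 / (mean_life - min_life)  # rate of the exponential part
    return math.exp(-lam * (t - min_life))

def at_least_k_of_n(p, k, n):
    """P(N >= k) when N ~ Binomial(n, p), i.e. independent units."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

p_engine = shifted_exp_survival(4, mean_life=15, min_life=2)
p_system = at_least_k_of_n(p_engine, k=2, n=4)
print(f"engine: {p_engine:.4f}, system: {p_system:.4f}")
```

Changing `k` or `n` shows how the redundancy level drives the system reliability for the same single-engine value.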
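Useful Probability Distributions: The Gamma Distribution
➢ The PDF, mean, and variance of the gamma distribution for a random variable X (ν and k are the parameters of the distribution) are
$$f_X(x) = \begin{cases} \dfrac{\nu (\nu x)^{k-1}}{\Gamma(k)}\, e^{-\nu x} & \text{for } x \ge 0 \\ 0 & \text{for } x < 0 \end{cases}$$
$$\mu_X = \frac{k}{\nu}, \qquad \sigma_X^2 = \frac{k}{\nu^2}$$
$$\Gamma(k) = \int_0^{\infty} x^{k-1} e^{-x}\,dx \quad \text{where } k > 1.0$$
$$\Gamma(k) = (k-1)\,\Gamma(k-1) = (k-1)(k-2)\cdots(k-i)\,\Gamma(k-i)$$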
Useful Probability Distributions: The Gamma Distribution
➢ Calculation of probabilities involving the gamma distribution can be performed using tables of the incomplete gamma function, which are usually given for the ratio (e.g., Harter, 1963):
$$I(u, k) = \frac{\int_0^u y^{k-1} e^{-y}\,dy}{\Gamma(k)}$$
$$P(a < X \le b) = \frac{\nu^k}{\Gamma(k)} \int_a^b x^{k-1} e^{-\nu x}\,dx$$
Letting y = νx,
$$P(a < X \le b) = \frac{1}{\Gamma(k)} \left[ \int_0^{\nu b} y^{k-1} e^{-y}\,dy - \int_0^{\nu a} y^{k-1} e^{-y}\,dy \right] = I(\nu b, k) - I(\nu a, k)$$
➢ Therefore, in effect, the incomplete gamma function ratio is also the CDF of the gamma distribution.
Useful Probability Distributions: The Gamma Distribution
Ex: Load on Buildings
➢ The gamma distribution may be used to represent the distribution of the equivalent uniformly distributed load (EUDL) on buildings. For a particular building, if the mean EUDL is 15 psf (pounds per square foot) and the c.o.v. is 25%, the parameters of the appropriate gamma distribution are
$$\delta = \frac{\sigma}{\mu} = \frac{\sqrt{k}/\nu}{k/\nu} = \frac{1}{\sqrt{k}} \;\Rightarrow\; k = \frac{1}{\delta^2} = \frac{1}{(0.25)^2} = 16$$
$$\nu = \frac{k}{\mu} = \frac{16}{15} = 1.067$$
➢ The design live load is generally specified (conservatively) to be on the high side. For instance, if the design EUDL is specified to be 25 psf, the probability that this design load will be exceeded is
$$P(L > 25) = 1 - P(L \le 25) = 1 - I(1.067 \times 25,\, 16) = 1 - I(26.67,\, 16) = 1 - 0.990 = 0.010$$
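When tables of the incomplete gamma function ratio are not at hand, the ratio can be evaluated numerically. The sketch below uses a standard power-series expansion of I(u, k) in place of the table lookup; the function name `gamma_cdf_ratio`, the tolerance, and the term limit are illustrative assumptions rather than anything specified in the slides.

```python
import math

def gamma_cdf_ratio(u, k, tol=1e-12, max_terms=10_000):
    """Incomplete gamma function ratio I(u, k) from the series
    I(u, k) = e^{-u} u^k * sum_{n>=0} u^n / Gamma(k + n + 1)."""
    if u <= 0:
        return 0.0
    # First term e^{-u} u^k / Gamma(k + 1), built in log space to avoid overflow.
    term = math.exp(-u + k * math.log(u) - math.lgamma(k + 1))
    total = term
    for n in range(1, max_terms):
        term *= u / (k + n)  # ratio of consecutive series terms
        total += term
        if term < tol * total:
            break
    return total

mean, cov = 15.0, 0.25      # EUDL mean (psf) and coefficient of variation
k = 1.0 / cov**2            # shape parameter from the c.o.v.
nu = k / mean               # rate parameter from the mean
p_exceed = 1.0 - gamma_cdf_ratio(nu * 25.0, k)
print(f"k = {k:.0f}, nu = {nu:.3f}, P(L > 25) = {p_exceed:.4f}")
```

Because every term of the series is positive, the sum does not suffer from cancellation, which makes it a reasonable hand-rolled substitute for a library routine in a small check like this one.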
Useful Probability Distributions: Gamma Distribution and Poisson Process
➢ If the occurrences of an event constitute a Poisson process in time, then the time until the kth occurrence of the event is governed by the gamma distribution. Earlier, in Sect. 3.2.7, we saw that the time until the first occurrence of the event is governed by the exponential distribution.
➢ Let $T_k$ denote the time until the kth occurrence of an event; then $(T_k \le t)$ means that there were k or more occurrences of the event in time t.
➢ Hence, on the basis of Eq. 3.34, we obtain the CDF of $T_k$ as
$$F_{T_k}(t) = \sum_{x=k}^{\infty} P(X_t = x) = 1 - \sum_{x=0}^{k-1} \frac{(\nu t)^x}{x!}\, e^{-\nu t} = 1 - \left[ 1 + \frac{\nu t}{1!} + \frac{(\nu t)^2}{2!} + \cdots + \frac{(\nu t)^{k-1}}{(k-1)!} \right] e^{-\nu t}$$
$$f_{T_k}(t) = \frac{dF_{T_k}(t)}{dt} = \frac{\nu (\nu t)^{k-1}}{(k-1)!}\, e^{-\nu t} \quad \text{for } t \ge 0 \qquad \text{(Eq. 3.44)}$$
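The differentiation that leads to Eq. 3.44 can be checked term by term. The following is a sketch of the standard telescoping argument, which the slides leave to the reader: the product rule gives two sums, and after shifting the index of the second sum all terms cancel except the last one.

```latex
\begin{aligned}
\frac{dF_{T_k}(t)}{dt}
&= -\frac{d}{dt}\left[\sum_{x=0}^{k-1}\frac{(\nu t)^x}{x!}\,e^{-\nu t}\right] \\
&= \nu e^{-\nu t}\sum_{x=0}^{k-1}\frac{(\nu t)^x}{x!}
   \;-\; \nu e^{-\nu t}\sum_{x=1}^{k-1}\frac{(\nu t)^{x-1}}{(x-1)!} \\
&= \nu e^{-\nu t}\left[\sum_{x=0}^{k-1}\frac{(\nu t)^x}{x!}
   - \sum_{x=0}^{k-2}\frac{(\nu t)^x}{x!}\right] \\
&= \frac{\nu(\nu t)^{k-1}}{(k-1)!}\,e^{-\nu t}
\end{aligned}
```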
Useful Probability Distributions: Gamma Distribution and Poisson Process
➢ For k = 1, i.e., for the time until the first occurrence of an event, Eq. 3.44 is reduced to the exponential distribution of Eq. 3.36.
$$f_{T_k}(t) = \frac{\nu (\nu t)^{k-1}}{(k-1)!}\, e^{-\nu t} \quad \text{(Gamma distribution)}$$
$$f_{T_1}(t) = \nu e^{-\nu t} \quad \text{(Exponential distribution)}$$
➢ The above gamma distribution with integer k is also known as the Erlang distribution. In this case, the mean time until the kth occurrence of an event and its variance are
$$E(T_k) = \frac{k}{\nu}, \qquad \mathrm{Var}(T_k) = \frac{k}{\nu^2}$$
Useful Probability Distributions: Gamma Distribution and Poisson Process
Ex: Fatal Accidents on A Particular Highway
➢ Suppose that fatal accidents on a particular highway occur on the average about once every 6 months. If we can assume that the occurrences of accidents on this highway constitute a Poisson process, with a mean occurrence rate of ν = 1/6 per month, the time until the occurrence of the first accident (or between two consecutive accidents) would be described by the exponential distribution, specifically with the following PDF:
$$f_{T_1}(t) = \frac{\frac{1}{6}(t/6)^{1-1}}{(1-1)!}\, e^{-t/6}$$
➢ The time until the occurrence of the second accident (or the time between every other accident) on the same highway is described by the gamma distribution, with the PDF
$$f_{T_2}(t) = \frac{\frac{1}{6}(t/6)^{2-1}}{(2-1)!}\, e^{-t/6}$$
Useful Probability Distributions: Gamma Distribution and Poisson Process
Ex: Fatal Accidents on A Particular Highway
➢ The time until the occurrence of the third accident would also be gamma distributed, with the PDF
$$f_{T_3}(t) = \frac{\frac{1}{6}(t/6)^{3-1}}{(3-1)!}\, e^{-t/6}$$
The above PDFs are illustrated graphically in Fig. E3.28, and the corresponding mean occurrence times of $T_1$, $T_2$, and $T_3$ are, respectively, 6, 12, and 18 months.
➢ Note: We might recognize that the exponential and gamma distributions are the continuous analogues, respectively, of the geometric and negative binomial distributions; that is, the geometric and negative binomial distributions govern the first and kth occurrence times of a Bernoulli sequence, whereas the exponential and gamma distributions govern the corresponding occurrence times of a Poisson process.
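The accident example can be explored numerically with the Erlang PDF of Eq. 3.44 and the corresponding CDF written as a Poisson sum. This is a minimal sketch; the function names are illustrative, and the 12-month horizon used in the loop is a hypothetical query that does not appear in the slides.

```python
import math

def erlang_pdf(t, k, nu):
    """PDF of the time to the k-th event of a Poisson process (Eq. 3.44 form)."""
    if t < 0:
        return 0.0
    return nu * (nu * t) ** (k - 1) / math.factorial(k - 1) * math.exp(-nu * t)

def erlang_cdf(t, k, nu):
    """CDF of T_k: one minus the Poisson probability of fewer than k events in t."""
    if t <= 0:
        return 0.0
    partial = sum((nu * t) ** x / math.factorial(x) for x in range(k))
    return 1.0 - partial * math.exp(-nu * t)

nu = 1 / 6  # mean occurrence rate, accidents per month
for k in (1, 2, 3):
    mean_time = k / nu                # E(T_k) = k / nu
    p_12 = erlang_cdf(12, k, nu)      # hypothetical horizon: within 12 months
    print(f"k = {k}: mean = {mean_time:.0f} months, P(T_k <= 12) = {p_12:.3f}")
```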
Useful Probability Distributions: Shifted Gamma Distribution
➢ Most probability distributions are described with two parameters, or even with one parameter, such as the exponential distribution. The shifted gamma distribution is one of the few exceptions with three parameters.
➢ A three-parameter distribution may be useful for fitting statistical data in which the skewness (involving the third moment) in the data is significant; in particular, the third parameter would be necessary in order to explicitly include the skewness in the observed data.
➢ As an extension of Eq. 3.42, the PDF of the three-parameter shifted gamma distribution for a random variable X may be expressed as (ν, k ≥ 1.0)
$$f_X(x) = \frac{\nu\,[\nu (x - \gamma)]^{k-1}}{\Gamma(k)}\, e^{-\nu (x - \gamma)}, \qquad x \ge \gamma$$
$$\mu_X = \gamma + \frac{k}{\nu}, \qquad \sigma_X^2 = \frac{k}{\nu^2}$$
Useful Probability Distributions: Shifted Gamma Distribution
Ex: Residual Stresses in Flanges of Steel H-Section
➢ The three-parameter gamma distribution can be shown to give a better fit with statistical data when there is significant skewness in the observed data. For instance, shown below in Fig. E3.29 is the histogram of measured residual stresses in the flanges of steel H-sections. The mean, standard deviation, and skewness coefficient of the measured ratios of residual stress/yield stress are, respectively, 0.3561, 0.1927, and 0.8230.
➢ Clearly, because the data show significant skewness, a three-parameter distribution is necessary in order to include the skewness for adequately fitting the histogram of the measured residual stresses. As shown in Fig. E3.29, the three-parameter gamma PDF (solid curve) that includes the skewness of 0.8230 has a much closer fit to the histogram than the normal or lognormal distributions, which are, of course, two-parameter distributions. This is further verified later in Example 7.10 with the K-S goodness-of-fit test.
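One way to obtain the three parameters from the quoted statistics is to match moments. The slides do not state which fitting method produced the curve in Fig. E3.29, so the sketch below is an assumption: it uses the standard relations for a shifted gamma distribution (mean γ + k/ν, variance k/ν², skewness coefficient 2/√k), and the function name is illustrative.

```python
import math

def shifted_gamma_from_moments(mean, std, skew):
    """Method-of-moments parameters (k, nu, gamma) of a shifted gamma distribution."""
    k = 4.0 / skew**2              # from skewness = 2 / sqrt(k)
    nu = math.sqrt(k) / std        # from variance = k / nu^2
    gamma_shift = mean - k / nu    # from mean = gamma + k / nu
    return k, nu, gamma_shift

# Measured residual stress / yield stress statistics quoted in the example.
k, nu, shift = shifted_gamma_from_moments(mean=0.3561, std=0.1927, skew=0.8230)
print(f"k = {k:.3f}, nu = {nu:.3f}, gamma = {shift:.4f}")
```

A negative shift simply means the fitted PDF starts slightly below zero; whether that is physically acceptable depends on how the stress ratio is defined.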
Useful Probability Distributions: Hypergeometric Distribution
➢ The hypergeometric distribution arises when samples from a finite population, consisting of two types of elements (e.g., “good” and “bad”), are being examined. It is the basic distribution underlying many sampling plans used in connection with acceptance sampling and quality control.
➢ Consider a lot of N items, among which m are defective and the remaining (N − m) items are good. If a sample of n items is taken at random from this lot, the probability that there will be x defective items in the sample is given by the hypergeometric distribution as follows:
$$P(X = x) = \frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}}, \qquad x = 0, 1, 2, \ldots, m$$
Useful Probability Distributions: Hypergeometric Distribution
➢ The above distribution is based on the following: in the lot of N items, the number of samples of size n is $\binom{N}{n}$; among these, the number of samples with x defectives is $\binom{m}{x}\binom{N-m}{n-x}$.
Therefore, assuming that the samples are equally likely to be selected, we obtain the hypergeometric distribution.
Useful Probability Distributions: Hypergeometric Distribution
Ex: Detection of Strain Gages
➢ In a box of 100 strain gages, suppose we suspect that there may be four gages that are defective. If six of the gages from the box were used in an experiment, the probabilities that one defective gage and that no defective gage was used in the experiment are evaluated as follows (in this case, we have N = 100, m = 4, and n = 6):
$$P(X = 1) = \frac{\binom{4}{1}\binom{100-4}{6-1}}{\binom{100}{6}} = 0.205$$
$$P(X = 0) = \frac{\binom{4}{0}\binom{100-4}{6-0}}{\binom{100}{6}} = 0.778$$
➢ The probability that at least one defective gage was used in the experiment is
$$P(X \ge 1) = 1 - P(X = 0) = 1 - 0.778 = 0.222$$
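The strain gage probabilities follow directly from binomial coefficients, so they are easy to reproduce. The sketch below uses Python's `math.comb`; the function name `hypergeom_pmf` and its argument order are illustrative choices.

```python
import math

def hypergeom_pmf(x, N, m, n):
    """P(X = x): x defectives in a sample of n drawn from N items, m of them defective."""
    if x < 0 or x > m or n - x < 0 or n - x > N - m:
        return 0.0  # outside the possible range of x
    return math.comb(m, x) * math.comb(N - m, n - x) / math.comb(N, n)

p0 = hypergeom_pmf(0, N=100, m=4, n=6)
p1 = hypergeom_pmf(1, N=100, m=4, n=6)
print(f"P(X=0) = {p0:.3f}, P(X=1) = {p1:.3f}, P(X>=1) = {1 - p0:.3f}")
```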
Useful Probability Distributions: Hypergeometric Distribution
Ex: A Large Reinforced Concrete Construction Project
➢ In a large reinforced concrete construction project, 100 concrete cylinders are to be collected from the daily concrete mixes delivered to the construction site. Furthermore, to ensure material quality, the acceptance/rejection criterion requires that ten of these cylinders (selected at random) must be tested for crushing strength after curing for 1 week, and at least nine of the ten cylinders tested must have a required minimum strength.
➢ Q: Is the acceptance/rejection criterion stringent enough?
➢ Whether the acceptance/rejection criterion is too stringent, or not stringent enough, depends on whether it is difficult or easy for poor-quality concrete mixes to go undetected.
Useful Probability Distributions: Hypergeometric Distribution
Ex: A Large Reinforced Concrete Construction Project
➢ For example, if a fraction d of the concrete is defective, then on the basis of the specified acceptance/rejection criterion, the probability of rejection of the daily concrete mixes would be (denoting X as the number of defective cylinders in the test)
$$P(X > 1) = 1 - P(X \le 1) = 1 - \left[ \frac{\binom{100d}{0}\binom{100(1-d)}{10}}{\binom{100}{10}} + \frac{\binom{100d}{1}\binom{100(1-d)}{9}}{\binom{100}{10}} \right]$$
Useful Probability Distributions: Hypergeometric Distribution
Ex: A Large Reinforced Concrete Construction Project
➢ For example, if there are 5% (or 2%) defectives in the daily concrete mixes, i.e., d = 5% (or 2%),
$$d = 5\%:\quad P(\text{rejection}) = 1 - \left[ \frac{\binom{5}{0}\binom{95}{10}}{\binom{100}{10}} + \frac{\binom{5}{1}\binom{95}{9}}{\binom{100}{10}} \right] = 1 - [0.5837 + 0.3394] = 0.077$$
$$d = 2\%:\quad P(\text{rejection}) = 1 - \left[ \frac{\binom{2}{0}\binom{98}{10}}{\binom{100}{10}} + \frac{\binom{2}{1}\binom{98}{9}}{\binom{100}{10}} \right] = 1 - [0.8091 + 0.1818] = 0.009$$
➢ Therefore, if 5% of the concrete mixes were defective, the probability that the defective material will be discovered with the proposed acceptance/rejection criterion is only about 8%, whereas if 2% of the concrete mixes were defective, the likelihood of the daily mixes being rejected is very low (0.009 probability).
➢ Hence, if the contract requires concrete with less than 2% defectives, then the proposed acceptance/rejection criterion is not stringent enough; on the other hand, if material with 5% defectives is acceptable, then the proposed criterion may be satisfactory.
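The rejection probability can be tabulated for several defect levels to judge how stringent the criterion is. This is a minimal sketch with an illustrative function name and defaults matching the example (lot of 100, sample of 10, at most one defective allowed); the 10% and 20% defect levels in the loop are hypothetical additions for comparison and are not in the slides.

```python
import math

def rejection_probability(defective, lot=100, sample=10, allowed=1):
    """P(more than `allowed` defectives in the sample) under the hypergeometric model."""
    accept = sum(
        math.comb(defective, x) * math.comb(lot - defective, sample - x)
        for x in range(allowed + 1)
    ) / math.comb(lot, sample)
    return 1.0 - accept

for defective in (2, 5, 10, 20):  # number of defective cylinders in the lot of 100
    print(f"d = {defective}%: P(rejection) = {rejection_probability(defective):.3f}")
```

Setting `allowed=0` in the same function shows how much a stricter criterion (all ten cylinders must pass) would raise the rejection probability.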
Useful Probability Distributions: Beta Distribution
➢ Most probability distributions are for random variables whose range of values is unlimited in either one or both directions.
➢ In some engineering applications, there may be problems in which there are finite lower and upper bound values of the random variables; in these cases, probability distributions with finite lower and upper limits would be appropriate.
➢ The beta distribution is one of the few distributions appropriate for a random variable whose range of possible values is bounded, say between a and b. Its PDF is given by
$$f_X(x) = \begin{cases} \dfrac{1}{B(q, r)} \dfrac{(x-a)^{q-1}(b-x)^{r-1}}{(b-a)^{q+r-1}} & \text{for } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$
$$f_X(x) = \begin{cases} \dfrac{1}{B(q, r)}\, x^{q-1}(1-x)^{r-1} & \text{for } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \quad \text{(standard beta distribution)}$$
$$B(q, r) = \int_0^1 x^{q-1}(1-x)^{r-1}\,dx = \frac{\Gamma(q)\,\Gamma(r)}{\Gamma(q+r)} \quad \text{(beta function)}$$
Useful Probability Distributions: Beta Distribution
➢ The probability associated with a beta distribution can be evaluated in terms of the incomplete beta function, and values of the incomplete beta function ratio $B_x(q, r)/B(q, r)$ have been tabulated.
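Useful Probability Distributions: Beta Distribution
$$B_x(q, r) = \int_0^x y^{q-1}(1-y)^{r-1}\,dy, \qquad 0 < x < 1.0$$
$$F_X(x) = \frac{1}{B(q, r)} \int_0^x y^{q-1}(1-y)^{r-1}\,dy = \frac{B_x(q, r)}{B(q, r)} \equiv \beta(x \mid q, r) = 1 - \beta(1 - x \mid r, q)$$
$$P(x_1 < X \le x_2) = \frac{1}{B(q, r)} \int_{x_1}^{x_2} \frac{(x-a)^{q-1}(b-x)^{r-1}}{(b-a)^{q+r-1}}\,dx$$
Let $y = \dfrac{x-a}{b-a}$, so that $1 - y = \dfrac{b-x}{b-a}$ and $dy = \dfrac{dx}{b-a}$:
$$P(x_1 < X \le x_2) = \frac{1}{B(q, r)} \left[ \int_0^{(x_2-a)/(b-a)} y^{q-1}(1-y)^{r-1}\,dy - \int_0^{(x_1-a)/(b-a)} y^{q-1}(1-y)^{r-1}\,dy \right]$$
Let $u = \dfrac{x_2-a}{b-a}$ and $v = \dfrac{x_1-a}{b-a}$:
$$P(x_1 < X \le x_2) = \beta(u \mid q, r) - \beta(v \mid q, r)$$
$$\mu_X = a + \frac{q}{q+r}(b-a), \qquad \sigma_X^2 = \frac{qr}{(q+r)^2(q+r+1)}(b-a)^2$$
$$\theta_X = \frac{2(r-q)}{(q+r)(q+r+2)} \cdot \frac{b-a}{\sigma_X}, \qquad \tilde{x} = a + \frac{1-q}{2-q-r}(b-a) \quad \text{(mode of } X\text{)}$$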
Useful Probability Distributions: Beta Distribution
Ex: Duration Required to Complete An Activity
➢ The duration required to complete an activity in a construction project has been estimated by the subcontractor to be as follows:
Minimum duration = 5 days
Maximum duration = 10 days
Expected duration = 7 days
➢ The coefficient of variation of the required duration is estimated to be 10%. In this case, the beta distribution may be appropriate with a = 5 days and b = 10 days. The parameters of the distribution would be determined as follows:
$$5 + \frac{q}{q+r}(10-5) = 7 \;\Rightarrow\; q = \frac{2}{3}\,r$$
$$\frac{qr}{(q+r)^2(q+r+1)}(10-5)^2 = (0.1 \times 7)^2 \;\Rightarrow\; q = 4.50, \quad r = 6.75$$
➢ The probability that the activity will be completed within 9 days is
$$P(T \le 9) = \beta_u(4.50, 6.75), \qquad u = \frac{9-5}{10-5} = 0.8$$
➢ Evaluating the incomplete beta function ratio (from tables with suitable interpolation, or numerically), we obtain
$$P(T \le 9) = \beta_{0.8}(4.50, 6.75) = 0.998$$
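Both steps of this example, solving the two moment equations for q and r and evaluating the incomplete beta function ratio, can be checked numerically. The sketch below solves the moment equations in closed form and integrates the standard beta density with composite Simpson's rule as a stand-in for the tables; the function names, the use of Simpson's rule, and the number of intervals are assumptions made for illustration.

```python
import math

def beta_params_from_moments(a, b, mean, std):
    """Solve the mean and variance equations of the beta distribution for (q, r)."""
    m = (mean - a) / (b - a)                         # standardized mean, q / (q + r)
    s = m * (1 - m) * (b - a) ** 2 / std ** 2 - 1    # s = q + r from the variance
    return m * s, (1 - m) * s

def beta_cdf_ratio(u, q, r, n=2000):
    """Incomplete beta function ratio by composite Simpson's rule (assumes q, r > 1)."""
    h = u / n  # n must be even
    f = lambda y: y ** (q - 1) * (1 - y) ** (r - 1)
    odd = sum(f((2 * i - 1) * h) for i in range(1, n // 2 + 1))
    even = sum(f(2 * i * h) for i in range(1, n // 2))
    integral = h / 3 * (f(0.0) + 4 * odd + 2 * even + f(u))
    beta_fn = math.gamma(q) * math.gamma(r) / math.gamma(q + r)
    return integral / beta_fn

q, r = beta_params_from_moments(a=5, b=10, mean=7, std=0.1 * 7)
p = beta_cdf_ratio((9 - 5) / (10 - 5), q, r)
print(f"q = {q:.2f}, r = {r:.2f}, P(T <= 9) = {p:.3f}")
```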
Multiple Random Variables
Multiple Random Variables
Ex: Rainfall Intensity and Temperature
➢ Rainfall intensity at a gauge station: ⇒ RV X
➢ Temperature for run-off of a river: ⇒ RV Y
⇒ (X = x, Y = y), or [(X = x) ∩ (Y = y)]: a joint event defined by values of the RVs in XY-space
➢ Joint Probability Mass Function:
$$p_{X,Y}(x, y) = P[X = x \text{ and } Y = y]$$
➢ Joint Probability Distribution Function:
$$F_{X,Y}(x, y) = P[X \le x \text{ and } Y \le y] = \sum_{x_i \le x} \sum_{y_j \le y} p_{X,Y}(x_i, y_j)$$
Multiple Random Variables
Ex: May Temperature and Rainfall of US City
$$p_{X,Y}(65, 4) = \frac{14}{50} = 0.28$$
$$F_{X,Y}(55, 1) = \sum_{x_i \le 55} \sum_{y_j \le 1} p_{X,Y}(x_i, y_j) = 0 + 0 + 0.02 + 0.02 + 0.02 + 0.04 = 0.10$$
Multiple Random Variables: Joint Probability Distribution
➢ The cumulative probability of the joint occurrence of the events defined by X ≤ x, Y ≤ y:
$$F_{X,Y}(x, y) \equiv P[X \le x,\; Y \le y]$$
➢ Axioms of Probability:
$$F_{X,Y}(-\infty, -\infty) = 0, \qquad F_{X,Y}(\infty, \infty) = 1$$
$$F_{X,Y}(-\infty, y) = 0, \qquad F_{X,Y}(\infty, y) = F_Y(y) = P[Y \le y]$$
$$F_{X,Y}(x, -\infty) = 0, \qquad F_{X,Y}(x, \infty) = F_X(x) = P[X \le x]$$
➢ Note: $F_{X,Y}(x, y)$ is nonnegative and nondecreasing
Multiple Random Variables: X, Y are Discrete RVs
➢ Probability Mass Function:
$$F_{X,Y}(x, y) \equiv P[X \le x,\; Y \le y] = \sum_{x_i \le x,\, y_j \le y} P[X = x_i,\; Y = y_j] = \sum_{x_i \le x,\, y_j \le y} p_{X,Y}(x_i, y_j)$$
➢ Conditional PMF:
$$p_{X|Y}(x \mid y) = P[X = x \mid Y = y] = \frac{p_{X,Y}(x, y)}{p_Y(y)}$$
$$p_{Y|X}(y \mid x) = P[Y = y \mid X = x] = \frac{p_{X,Y}(x, y)}{p_X(x)}$$
Multiple Random Variables: X, Y are Discrete RVs
➢ Marginal PMF:
$$p_X(x) = P[X = x] = \sum_{\forall y_j} P[X = x \mid Y = y_j]\,P[Y = y_j] = \sum_{\forall y_j} P[X = x,\; Y = y_j] = \sum_{\forall y_j} p_{X,Y}(x, y_j)$$
$$p_Y(y) = P[Y = y] = \sum_{\forall x_i} P[Y = y \mid X = x_i]\,P[X = x_i] = \sum_{\forall x_i} p_{X,Y}(x_i, y)$$
➢ Statistical Independence:
$$p_{X|Y}(x \mid y) = p_X(x), \qquad p_{Y|X}(y \mid x) = p_Y(y), \qquad p_{X,Y}(x, y) = p_X(x)\,p_Y(y)$$
Multiple Random Variables: X, Y are Discrete RVs
Ex: Construction Labor Survey
➢ From a survey of construction labor:
work duration (x = 6, 8, 10, 12 hrs) and
productivity (y = 50%, 70%, 90%)
➢ Joint PMF $p_{X,Y}(x, y)$:
(x, y)     # of obs.   Relative frequency
6, 50          2          0.014
6, 70          5          0.036
6, 90         10          0.072
8, 50          5          0.036
8, 70         30          0.216
8, 90         25          0.180
10, 50         8          0.058
10, 70        25          0.180
10, 90        11          0.079
12, 50        10          0.072
12, 70         6          0.043
12, 90         2          0.014
Total        139          1.000
Multiple Random Variables: X, Y are Discrete RVs
Ex: Construction Labor Survey
➢ Marginal PMF:
$$p_X(x) = \sum_{y_j = 50, 70, 90} p_{X,Y}(x, y_j)$$
$$p_X(8) = 0.036 + 0.216 + 0.180 = 0.432$$
➢ Conditional probability:
$$p_{Y|X}(90\% \mid 8) = \frac{p_{X,Y}(8, 90\%)}{p_X(8)} = \frac{0.180}{0.432} = 0.417$$
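The marginal and conditional PMFs of the labor survey can be computed directly from the observation counts, which avoids the rounding in the tabulated relative frequencies. This is a small sketch; the dictionary layout and function name are illustrative choices.

```python
from collections import defaultdict

# Observation counts keyed by (work duration in hours, productivity in %).
counts = {
    (6, 50): 2, (6, 70): 5, (6, 90): 10,
    (8, 50): 5, (8, 70): 30, (8, 90): 25,
    (10, 50): 8, (10, 70): 25, (10, 90): 11,
    (12, 50): 10, (12, 70): 6, (12, 90): 2,
}
total = sum(counts.values())
joint = {xy: c / total for xy, c in counts.items()}  # joint PMF p_{X,Y}(x, y)

marginal_x = defaultdict(float)                      # marginal PMF p_X(x)
for (x, y), p in joint.items():
    marginal_x[x] += p

def conditional_y_given_x(y, x):
    """p_{Y|X}(y | x) = p_{X,Y}(x, y) / p_X(x)."""
    return joint[(x, y)] / marginal_x[x]

print(f"p_X(8) = {marginal_x[8]:.3f}, p_Y|X(90 | 8) = {conditional_y_given_x(90, 8):.3f}")
```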
Multiple Random Variables: X, Y are Continuous RVs
➢ Probability Density Function:
$$f_{X,Y}(x, y)\,dx\,dy \equiv P[x < X \le x + dx,\; y < Y \le y + dy]$$
$$F_{X,Y}(x, y) \equiv \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(u, v)\,dv\,du$$
➢ Note:
$$(1)\quad f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\,\partial y}$$
$$(2)\quad P[a < X \le b,\; c < Y \le d] = \int_a^b \int_c^d f_{X,Y}(u, v)\,dv\,du$$
Multiple Random Variables: X, Y are Continuous RVs
➢ Conditional PDF:
$$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}, \qquad f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}$$
➢ Marginal PDF:
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx$$
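Multiple Random Variables: Bivariate Normal Distribution
➢ Probability density function:
$$f_{X,Y}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\left[ -\frac{1}{2(1-\rho^2)} \left( \frac{(x - \mu_X)^2}{\sigma_X^2} - \frac{2\rho (x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} \right) \right]$$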
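Multiple Random Variables: Bivariate Normal Distribution
Ex: Radar Network for Tracking Satellite
➢ Tracking a satellite using a radar network
➢ Forecast errors in azimuth (X) and elevation (Y): $\mu_X = \mu_Y = 0$, $\sigma_X = 5$ sec, $\sigma_Y = 2$ sec, $\rho = 0$
➢ Bivariate normal density function:
$$f_{X,Y}(x, y) = \frac{1}{20\pi} \exp\left[ -\frac{1}{2} \left( \frac{x^2}{5^2} + \frac{y^2}{2^2} \right) \right]$$
➢ Bivariate normal distribution function:
$$F_{X,Y}(x, y) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,(5)} \exp\left( -\frac{v^2}{2(25)} \right) \left[ \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi}\,(2)} \exp\left( -\frac{u^2}{2(4)} \right) du \right] dv = \Phi\left( \frac{x}{5} \right) \Phi\left( \frac{y}{2} \right)$$
➢ The probability that the forecast azimuth and elevation errors each do not exceed +3 seconds:
$$P[X \le 3 \text{ and } Y \le 3] = \Phi\left( \frac{3}{5} \right) \Phi\left( \frac{3}{2} \right) = (0.7257)(0.9332) = 0.677$$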
Covariance
$$\sigma_{XY} \equiv \mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - E(X)E(Y)$$
➢ If X, Y are independent: Cov(X, Y) = 0, because
$$f_{X,Y}(x, y) = f_X(x)\,f_Y(y)$$
$$E[XY] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f_{X,Y}(x, y)\,dy\,dx = \int_{-\infty}^{\infty} x f_X(x)\,dx \int_{-\infty}^{\infty} y f_Y(y)\,dy = E[X]\,E[Y]$$
➢ If Cov(X, Y) is large positive: values of X and Y tend to be both large or both small relative to their respective means
Covariance
$$\sigma_{XY} \equiv \mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - E(X)E(Y)$$
➢ If Cov(X, Y) is large negative: X large ⟶ Y small; X small ⟶ Y large
➢ If Cov(X, Y) is small: little linear relationship
➢ Cov(X, Y): a measure of the degree of linear interrelationship between variates X, Y
➢ Problem: Cov(X, Y) is scaling dependent!
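Correlation
$$\rho \equiv \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} \in [-1, 1] \quad \text{(scaling independent)}$$
[Figure: scatter plots illustrating ρ = +1.0, ρ = −1.0, 0 < ρ < 1.0, and three different patterns with ρ = 0]
Note: ρ is a measure of linear relationship; it does NOT imply a causal effect between variables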
Covariance and Correlation
Ex: A Cantilever Beam
➢ $S_1$, $S_2$: independent random loads with $(\mu_1, \sigma_1)$, $(\mu_2, \sigma_2)$
➢ Shear force: $Q = S_1 + S_2$; bending moment: $M = aS_1 + 2aS_2$
$$\mu_Q = \mu_1 + \mu_2, \qquad \sigma_Q^2 = \sigma_1^2 + \sigma_2^2$$
$$\mu_M = a\mu_1 + 2a\mu_2, \qquad \sigma_M^2 = a^2(\sigma_1^2 + 4\sigma_2^2)$$
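Covariance and Correlation
Ex: A Cantilever Beam
$$\begin{aligned} E[QM] &= E[(S_1 + S_2)(aS_1 + 2aS_2)] \\ &= aE[S_1^2] + 3a\,\underbrace{E[S_1 S_2]}_{=E[S_1]E[S_2]} + 2aE[S_2^2] \\ &= a(\sigma_1^2 + \mu_1^2) + 3a\mu_1\mu_2 + 2a(\sigma_2^2 + \mu_2^2) \\ &= a(\sigma_1^2 + 2\sigma_2^2) + a(\mu_1^2 + 2\mu_2^2 + 3\mu_1\mu_2) \\ &= a(\sigma_1^2 + 2\sigma_2^2) + \mu_Q \mu_M \end{aligned}$$
$$\Rightarrow\; \mathrm{Cov}(Q, M) = E[QM] - \mu_Q \mu_M = a(\sigma_1^2 + 2\sigma_2^2)$$
$$\rho_{QM} = \frac{\mathrm{Cov}(Q, M)}{\sigma_Q \sigma_M} = \frac{a(\sigma_1^2 + 2\sigma_2^2)}{\sqrt{\sigma_1^2 + \sigma_2^2}\,\sqrt{a^2(\sigma_1^2 + 4\sigma_2^2)}}$$
➢ If $\sigma_1 = \sigma_2$ ⇒ $\rho_{QM} = \dfrac{3}{\sqrt{10}} = 0.949$
➢ Q, M: strong (linear) correlation at the support
Q, M: NO causal relation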
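The correlation between shear and moment can be evaluated for any pair of load standard deviations. The sketch below builds the covariance from the bilinearity of covariance for independent loads, Cov(S1 + S2, aS1 + 2aS2) = a·Var(S1) + 2a·Var(S2); the function name and the default a = 1 are illustrative assumptions.

```python
import math

def shear_moment_correlation(sigma1, sigma2, a=1.0):
    """rho between Q = S1 + S2 and M = a*S1 + 2a*S2 for independent S1, S2."""
    cov_qm = a * (sigma1**2 + 2 * sigma2**2)            # cross terms vanish by independence
    sd_q = math.sqrt(sigma1**2 + sigma2**2)
    sd_m = abs(a) * math.sqrt(sigma1**2 + 4 * sigma2**2)
    return cov_qm / (sd_q * sd_m)

rho_equal = shear_moment_correlation(1.0, 1.0)
print(f"rho_QM for sigma1 = sigma2: {rho_equal:.3f}")
```

The result does not depend on the lever arm a (for a > 0) or on the common scale of the two standard deviations, which reflects the scaling independence of ρ.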