topic 2 random variables. discrete random variables continuous random variables
TRANSCRIPT
TOPIC 2
Random Variables
Discrete Random Variables
Continuous Random Variables
Discrete Random Variables
• Random Variable:
A numerical outcome of an experiment
Example: the sum of two fair dice, or the number of tails from tossing 2 coins
• Discrete Random Variable:
Takes whole-number values (0, 1, 2, 3, etc.)
Obtained by counting
Usually has a finite number of values; the Poisson random variable is an exception (its values run to ∞)
Probability Mass Function (PMF)
• A set of probability values
• List of all possible [x, P (x)] pairs
x = value of random variable X (outcome)
P (x) = probability associated with value
• Mutually exclusive (no overlap)
• Collectively exhaustive (nothing left out)
• 0 ≤ P (x) ≤ 1 for all x
• $\sum_x P(x) = 1$
• Also referred to as a discrete probability distribution
PMF Example
Experiment: Toss 2 coins. Count the number of tails (the random variable X).
Discrete Probability Distribution (all possible occurrences)
| Random variable, x | Probability, P(x) |
|---|---|
| 0 | 1/4 = 0.25 |
| 1 | 2/4 = 0.50 |
| 2 | 1/4 = 0.25 |
Visualization of PMF Example
Listing
{ (0, .25), (1, .50), (2, .25) }
Table
| # Tails, x | Frequency, f(x) | P(x) |
|---|---|---|
| 0 | 1 | .25 |
| 1 | 2 | .50 |
| 2 | 1 | .25 |
Formula
$$P(x) = \frac{n!}{x!\,(n - x)!}\; p^x (1 - p)^{n - x}$$
Note: p = trial probability = ½, n = sample size = 2
Graph
[Graph: P(x) plotted against x = 0, 1, 2, with heights .25, .50, .25]
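The formula above can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `binomial_pmf` is a hypothetical helper chosen for illustration.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Hypothetical helper: P(x) = n! / (x! (n - x)!) * p**x * (1 - p)**(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Two coin tosses: n = 2 trials, p = 1/2 chance of a tail on each toss.
n, p = 2, 0.5
pmf = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

for x, prob in pmf.items():
    print(f"P({x}) = {prob:.2f}")
```

The printed values should agree with the listing and the table for the two-coin experiment.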
Cumulative Distribution Functions
• An alternative way of specifying the probabilistic properties of a random variable X is through the function
$$F(x) = P(X \le x)$$
• This function is known as the cumulative distribution function of a discrete random variable
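Example
• Suppose X has a probability mass function given by
$$P(X = 1) = \tfrac{1}{2}, \quad P(X = 2) = \tfrac{1}{3}, \quad P(X = 3) = \tfrac{1}{6}$$
• then the cumulative distribution function F of X is given by
$$F(1) = P(X \le 1) = P(1) = \tfrac{1}{2}$$
$$F(2) = P(X \le 2) = P(1) + P(2) = \tfrac{1}{2} + \tfrac{1}{3} = \tfrac{5}{6}$$
$$F(3) = P(X \le 3) = P(1) + P(2) + P(3) = \tfrac{1}{2} + \tfrac{1}{3} + \tfrac{1}{6} = 1$$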
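A cumulative distribution function of this kind is just a running sum of the PMF. Here is a short Python sketch of that idea, offered as an illustration rather than as part of the slides; exact fractions are used so the results can be compared with the values above.

```python
from fractions import Fraction
from itertools import accumulate

# PMF from the example: P(X=1) = 1/2, P(X=2) = 1/3, P(X=3) = 1/6.
pmf = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

# F(x) = P(X <= x): accumulate the probabilities in increasing order of x.
values = sorted(pmf)
cdf = dict(zip(values, accumulate(pmf[x] for x in values)))

for x in values:
    print(f"F({x}) = {cdf[x]}")
```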
Summary Measures
• Expected value or expectation of a discrete random variable (mean of the probability distribution): the average value of the random variable (over all possible values)
$$\mu = E(X) = \sum_i x_i P(x_i)$$
• Variance: measures the spread or variability in the values taken by the random variable
$$\sigma^2 = \mathrm{Var}(X) = E\big[(X - E(X))^2\big] = \sum_i (x_i - \mu)^2 P(x_i) = E(X^2) - (E(X))^2$$
• Standard Deviation: the positive square root of the variance
$$\sigma = \sqrt{\sigma^2}$$
Example
Experiment: You toss 2 coins. You’re interested in the number of tails. What are the expected value, variance, and standard deviation of this random variable, the number of tails?
| x | P(x) | x P(x) | x − μ | (x − μ)² | (x − μ)² P(x) |
|---|---|---|---|---|---|
| 0 | .25 | 0 | −1.00 | 1.00 | .25 |
| 1 | .50 | .50 | 0 | 0 | 0 |
| 2 | .25 | .50 | 1.00 | 1.00 | .25 |
| Sum | | μ = 1.0 | | | σ² = .50 |
σ = √.50 = .71
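The same summary measures can be computed directly from the PMF. The snippet below is a minimal Python sketch of that calculation for the two-coin example; the variable names are illustrative choices, not from the slides.

```python
from math import sqrt

# PMF of the number of tails in two coin tosses.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

# Mean: sum of x * P(x); variance: sum of (x - mean)^2 * P(x).
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
std_dev = sqrt(variance)

print(f"mean = {mean}, variance = {variance}, std dev = {std_dev:.2f}")
```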
Exercises
1) An office has four copying machines, and the random variable X measures how many of them are in use at a particular moment in time. Suppose that P(X=0) = 0.08, P(X=1) = 0.11, P(X=2) = 0.27, and P(X=3) = 0.33.
a) What is P(X=4)?
b) What is P(X≤2)? (Cumulative probability distribution)
2) Four cards are labeled $1, $2, $3, and $6. A player pays $4, selects two cards at random, and then receives the sum of the winnings indicated on the two cards. Calculate the probability mass function and the cumulative distribution function of the net winnings.
3) A consultant has six appointment times that are open, three on Monday and three on Tuesday. Suppose that when making an appointment a client randomly chooses one of the remaining open times, with each of those open times equally likely to be chosen. Let the random variable X be the total number of appointments that have already been made over both days at the moment when Monday’s schedule has just been completely filled.
a) What is the state space of the random variable X?
b) Calculate the probability mass function of X.
c) What are the expected value and standard deviation of the total number of appointments that have already been made over both days at the moment when Monday’s schedule has just been completely filled?
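Answers to Exercises
1) a)
$$\sum_i P(x_i) = P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4) = 1$$
$$P(X=4) = 1 - (0.08 + 0.11 + 0.27 + 0.33) = 0.21$$
b)
$$P(X \le 2) = P(X=0) + P(X=1) + P(X=2) = 0.08 + 0.11 + 0.27 = 0.46$$
2) Number of ways to select two of the four cards:
$$C^4_2 = \frac{4!}{2!\,(4-2)!} = \frac{4 \times 3}{2 \times 1} = 6$$
Calculation of the net winnings X (sum of the two cards minus the payment of 4):
$$X = 1 + 2 - 4 = -1, \quad X = 1 + 3 - 4 = 0, \quad \text{etc.}$$
Probability mass function:
| Cards selected | Net winnings, x | P(X = x) |
|---|---|---|
| $1, $2 | −1 | 1/6 |
| $1, $3 | 0 | 1/6 |
| $2, $3 | 1 | 1/6 |
| $1, $6 | 3 | 1/6 |
| $2, $6 | 4 | 1/6 |
| $3, $6 | 5 | 1/6 |
Cumulative distribution function:
$$F(-1) = P(X \le -1) = 1/6$$
$$F(0) = P(X \le 0) = 2/6$$
$$F(1) = P(X \le 1) = 3/6$$
$$F(3) = P(X \le 3) = 4/6$$
$$F(4) = P(X \le 4) = 5/6$$
$$F(5) = P(X \le 5) = 1$$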
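The answer to Exercise 2 can be reproduced by enumerating all equally likely pairs of cards. This is a hedged Python sketch of that enumeration; the names `cards` and `stake` are illustrative and do not come from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, combinations

cards = [1, 2, 3, 6]   # card values in dollars
stake = 4              # amount the player pays to play

# Every unordered pair of cards is equally likely: C(4, 2) = 6 pairs.
pairs = list(combinations(cards, 2))
counts = Counter(a + b - stake for a, b in pairs)

# PMF of the net winnings, then the CDF as a running sum of the PMF.
pmf = {x: Fraction(c, len(pairs)) for x, c in sorted(counts.items())}
cdf = dict(zip(pmf, accumulate(pmf.values())))

for x in pmf:
    print(f"x = {x:2d}  P(X = x) = {pmf[x]}  F(x) = {cdf[x]}")
```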
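Answers to Exercises
3) a) State space of X: {3, 4, 5, 6}
b) Number of ways to place the three Monday times among the six appointments:
$$C^6_3 = \frac{6!}{3!\,(6-3)!} = \frac{6 \times 5 \times 4}{3 \times 2 \times 1} = 20$$
Probability Mass Function
$$P(X=3) = \frac{C^3_3}{C^6_3} = \frac{1}{20} = 0.05$$
$$P(X=4) = \frac{C^4_3 - C^3_3}{C^6_3} = \frac{4 - 1}{20} = \frac{3}{20} = 0.15$$
$$P(X=5) = \frac{C^5_3 - C^4_3}{C^6_3} = \frac{10 - 4}{20} = \frac{6}{20} = 0.30$$
$$P(X=6) = \frac{C^6_3 - C^5_3}{C^6_3} = \frac{20 - 10}{20} = \frac{10}{20} = 0.50$$
Cumulative Distribution Function
$$P(X \le 3) = 0.05$$
$$P(X \le 4) = 0.05 + 0.15 = 0.20$$
$$P(X \le 5) = 0.05 + 0.15 + 0.30 = 0.50$$
$$P(X \le 6) = 0.05 + 0.15 + 0.30 + 0.50 = 1$$
c)
$$E(X) = \sum_i x_i P(x_i) = 3(0.05) + 4(0.15) + 5(0.3) + 6(0.5) = 5.25$$
$$\mathrm{Var}(X) = \sum_i (x_i - \mu)^2 P(x_i) = (3 - 5.25)^2(0.05) + (4 - 5.25)^2(0.15) + (5 - 5.25)^2(0.3) + (6 - 5.25)^2(0.5) = 0.7875$$
$$\sigma = \sqrt{0.7875} = 0.8874$$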
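One way to sanity-check the answer to Exercise 3 is by enumeration. The sketch below assumes that, because each remaining time is equally likely to be chosen, the positions in the booking order at which the three Monday times are taken form an equally likely 3-element subset of {1, ..., 6}; X is then the largest of those positions. This modelling step and all variable names are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import sqrt

# Each outcome: the booking positions (1..6) occupied by the three Monday times.
outcomes = list(combinations(range(1, 7), 3))

# X = number of appointments made when Monday has just filled = largest Monday position.
counts = Counter(max(positions) for positions in outcomes)
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
std_dev = sqrt(variance)

print(pmf)
print(float(mean), float(variance), round(std_dev, 4))
```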
Continuous Random Variables
• Continuous Random Variable
A numerical outcome of an experiment
Whole or fractional number
Obtained by measuring
Example: weight of a student (e.g., 115, 156.8, etc.)
Infinite number of values in an interval
Too many to list, unlike a discrete random variable
Example
| Experiment | Random Variable | Possible Values |
|---|---|---|
| Weigh 100 People | Weight | 45.1, 78, ... |
| Measure Part Life | Hours | 900, 875.9, ... |
| Amount spent on food | $ amount | 54.12, 42, ... |
| Measure Time Between Arrivals | Inter-Arrival Time | 0, 1.3, 2.78, ... |
Probability Density Functions
• Defines the probabilistic properties of a continuous random variable
• Shows all values, x, and frequencies, f(x)
f(x) is a probability density function (not a probability)
• Properties
$$\int_{\text{all } x} f(x)\,dx = 1 \quad \text{(area under curve)}$$
$$f(x) \ge 0, \quad a \le x \le b$$
[Graph: frequency f(x) against value x over the interval from a to b; each point on the curve is a (value, frequency) pair]
Continuous Random Variable Probability
Probability is Area Under Curve!
$$P(a \le x \le b) = \int_a^b f(x)\,dx$$
$$P(\text{all } x) = \int_{\text{all } x} f(x)\,dx = 1$$
[Graph: f(x) against x, with the area under the curve between a and b shaded]
Continuous Random Variable Probability
$$P(X = a) = \int_a^a f(x)\,dx = 0$$
[Graph: f(x) against x, with the single point x = a marked]
• This is in contrast to discrete random variables, which can have nonzero probabilities of taking specific values.
• Continuous random variables can have nonzero probabilities of falling within a certain continuous region (e.g. a ≤ x ≤ b)
Cumulative Distribution Function
• The cumulative distribution function of a continuous random variable X is defined as
$$F(x) = P(X \le x) = \int_{\text{lower endpoint}}^{x} f(y)\,dy$$
• The cumulative distribution function F(x) is a continuous non-decreasing function that takes the value 0 prior to and at the beginning of the state space and increases to a value of 1 at the end
Summary Measures
• Expected Value or expectation (mean of the random variable): weighted average of all possible values
$$\mu = E(X) = \int_{\text{all } x} x\, f(x)\,dx$$
If the probability density function f(x) is symmetric, then the expectation of the random variable X is equal to the point of symmetry
• Variance: weighted average of the squared deviation about the mean
$$\sigma^2 = E\big[(x - \mu)^2\big] = \int_{\text{all } x} (x - \mu)^2 f(x)\,dx$$
• Standard Deviation:
$$\sigma = \sqrt{\sigma^2}$$
• Median: the value of x for which
$$F(x) = 0.5$$
Variances/Standard Deviations
• Variance shows the spread or variability in the values taken by the random variable
• Standard deviation is often used in place of the variance to describe the spread of the distribution
Example
Suppose that the diameter of a metal cylinder has a probability density function
$$f(x) = 1.5 - 6(x - 50)^2 \quad \text{for } 49.5 \le x \le 50.5$$
• Is this a valid probability density function?
• Is the probability density function symmetric? What is the point of symmetry?
• What is the probability that the metal cylinder has a diameter between 49.8 and 50.1 mm?
• What is the cumulative distribution function of the metal cylinder diameter?
• What is the expected diameter of the metal cylinder?
• What are the variance and standard deviation of the metal cylinder diameters?
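Answer to the Example
• Is this a valid probability density function? Yes.
$$\int_{\text{all } x} f(x)\,dx = \int_{49.5}^{50.5} \big(1.5 - 6(x - 50)^2\big)\,dx = \big[1.5x - 2(x - 50)^3\big]_{49.5}^{50.5}$$
$$= \big[1.5(50.5) - 2(50.5 - 50)^3\big] - \big[1.5(49.5) - 2(49.5 - 50)^3\big] = 75.5 - 74.5 = 1$$
• What is the probability that the metal cylinder has a diameter between 49.8 and 50.1 mm?
$$P(49.8 \le x \le 50.1) = \int_{49.8}^{50.1} \big(1.5 - 6(x - 50)^2\big)\,dx = \big[1.5x - 2(x - 50)^3\big]_{49.8}^{50.1} = 75.148 - 74.716 = 0.432$$
• What is the cumulative distribution function of the metal cylinder diameter?
$$F(x) = P(X \le x) = \int_{49.5}^{x} \big(1.5 - 6(y - 50)^2\big)\,dy = \big[1.5y - 2(y - 50)^3\big]_{49.5}^{x} = 1.5x - 2(x - 50)^3 - 74.5$$
• What is the expected diameter of the metal cylinder?
$$E(X) = \int_{\text{all } x} x\, f(x)\,dx = \int_{49.5}^{50.5} x\big(1.5 - 6(x - 50)^2\big)\,dx$$
$$= \big[x\big(1.5x - 2(x - 50)^3\big)\big]_{49.5}^{50.5} - \big[0.75x^2 - 0.5(x - 50)^4\big]_{49.5}^{50.5}$$
$$= 50.5(75.5) - 49.5(74.5) - (1912.65625 - 1837.65625) = 50$$
• Is the probability density function symmetric? What is the point of symmetry? Yes. μ = 50 is the point of symmetry, since
$$f(\mu + x) = 1.5 - 6x^2 \quad \text{and} \quad f(\mu - x) = 1.5 - 6x^2$$
• What are the variance and standard deviation of the metal cylinder diameters?
$$\mathrm{Var}(X) = \int_{\text{all } x} (x - \mu)^2 f(x)\,dx = \int_{49.5}^{50.5} (x - 50)^2\big(1.5 - 6(x - 50)^2\big)\,dx$$
$$= \big[0.5(x - 50)^3 - 1.2(x - 50)^5\big]_{49.5}^{50.5} = 0.025 - (-0.025) = 0.05$$
$$\sigma = \sqrt{0.05} = 0.224$$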
Answer to the Example
• Graphs of the example
[Graph: probability density function f(x) = 1.5 − 6(x − 50)² for 49.5 ≤ x ≤ 50.5, with the mean μ = E(X) = 50 and σ = 0.224 marked]
[Graph: cumulative distribution function F(x) = 1.5x − 2(x − 50)³ − 74.5, rising from 0 at x = 49.5 to 1 at x = 50.5; F(x) = 0.5 at the median x = 50, which coincides with the mean]
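The integrals in this example can also be approximated numerically. The following Python sketch uses a simple midpoint rule; the helper `integrate` and the step count are illustrative assumptions, and the results are approximations rather than exact values.

```python
from math import sqrt

def f(x: float) -> float:
    """Density from the example, valid for 49.5 <= x <= 50.5."""
    return 1.5 - 6.0 * (x - 50.0) ** 2

def integrate(g, a: float, b: float, steps: int = 20_000) -> float:
    """Hypothetical helper: midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

lo, hi = 49.5, 50.5
total = integrate(f, lo, hi)                       # should be close to 1
prob = integrate(f, 49.8, 50.1)                    # P(49.8 <= X <= 50.1)
mean = integrate(lambda x: x * f(x), lo, hi)       # E(X)
variance = integrate(lambda x: (x - mean) ** 2 * f(x), lo, hi)
std_dev = sqrt(variance)

print(round(total, 6), round(prob, 6), round(mean, 6), round(variance, 6), round(std_dev, 3))
```

A midpoint rule is used here only because it needs no external libraries; any standard quadrature routine would serve the same purpose.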
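Chebyshev’s Inequality
• If a random variable has a mean µ and a variance σ², then
$$P(\mu - c\sigma \le X \le \mu + c\sigma) \ge 1 - \frac{1}{c^2} \quad \text{for } c \ge 1$$
• For example, taking c = 2 and 3 gives
$$P(\mu - 2\sigma \le X \le \mu + 2\sigma) \ge 1 - \frac{1}{2^2} = 0.75 = 75\%$$
$$P(\mu - 3\sigma \le X \le \mu + 3\sigma) \ge 1 - \frac{1}{3^2} = 0.89 = 89\%$$
[Graph: distribution centered at E(x) = µ, with multiples of σ marked on either side of the mean]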
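As a small illustration of the inequality, the Python sketch below computes the lower bound for a given c and compares it with the actual probability for the two-coin distribution used earlier in these slides. The function names are hypothetical helpers, not part of the original material.

```python
from math import sqrt

def chebyshev_lower_bound(c: float) -> float:
    """Lower bound 1 - 1/c^2 on P(mu - c*sigma <= X <= mu + c*sigma), for c >= 1."""
    if c < 1:
        raise ValueError("c must be at least 1")
    return 1.0 - 1.0 / c**2

def prob_within(pmf: dict, c: float) -> float:
    """Actual probability that a discrete X lies within c standard deviations of its mean."""
    mean = sum(x * p for x, p in pmf.items())
    sd = sqrt(sum((x - mean) ** 2 * p for x, p in pmf.items()))
    return sum(p for x, p in pmf.items() if abs(x - mean) <= c * sd)

pmf = {0: 0.25, 1: 0.50, 2: 0.25}  # number of tails in two coin tosses
for c in (2, 3):
    print(c, round(chebyshev_lower_bound(c), 2), prob_within(pmf, c))
```

The bound holds for any distribution with a finite variance, so the actual probability is often much larger than the bound, as it is here.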
Quantiles of Random Variables
• Alternative ways of describing the spread of data include determining the location of values that divide a set of observations into equal parts.
• The pth quantile or 100pth percentile of a random variable X with a cumulative distribution function F(x) is defined to be the value of x for which
$$F(x) = p$$
p = 0.25 is called the 25th percentile or lower quartile (Q1)
p = 0.50 is called the 50th percentile or median (Q2)
p = 0.75 is called the 75th percentile or upper quartile (Q3)
• Interquartile Range (IQR) = Q3 – Q1
[Graph: f(x) with the lower quartile, median, and upper quartile marked; the quartiles divide the area under the curve into parts of area 0.25, and the interquartile range spans from the lower to the upper quartile]
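Example
A random variable X has a probability density function
$$f(x) = \frac{A}{\sqrt{x}} \quad \text{for } 3 \le x \le 4$$
a) What is the value of A?
b) What is the median of X?
c) What is the lower quartile of X?
d) What is the upper quartile of X?
e) What is the interquartile range?
Answer to the Example
a)
$$\int_{\text{all } x} f(x)\,dx = \int_3^4 \frac{A}{\sqrt{x}}\,dx = \big[2A\sqrt{x}\big]_3^4 = 2A\big(\sqrt{4} - \sqrt{3}\big) = 1 \;\Rightarrow\; A = 1.866$$
b)
$$F(x) = \int_3^x \frac{1.866}{\sqrt{y}}\,dy = 0.5 \;\Rightarrow\; \big[3.732\sqrt{y}\big]_3^x = 3.732\big(\sqrt{x} - \sqrt{3}\big) = 0.5 \;\Rightarrow\; x = 3.48$$
c)
$$F(x) = \int_3^x \frac{1.866}{\sqrt{y}}\,dy = 0.25 \;\Rightarrow\; 3.732\big(\sqrt{x} - \sqrt{3}\big) = 0.25 \;\Rightarrow\; x = 3.24$$
d)
$$F(x) = \int_3^x \frac{1.866}{\sqrt{y}}\,dy = 0.75 \;\Rightarrow\; 3.732\big(\sqrt{x} - \sqrt{3}\big) = 0.75 \;\Rightarrow\; x = 3.74$$
e)
$$\mathrm{IQR} = 3.74 - 3.24 = 0.5$$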
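Because F(x) has a closed form in this example, each quantile can be obtained by solving F(x) = p for x. The Python sketch below does this under the assumption that the density is f(x) = A/√x on [3, 4], as in the example; the function names are illustrative.

```python
from math import sqrt

# Normalising constant: 2A(sqrt(4) - sqrt(3)) = 1.
A = 1.0 / (2.0 * (sqrt(4.0) - sqrt(3.0)))

def cdf(x: float) -> float:
    """F(x) = 2A(sqrt(x) - sqrt(3)) for 3 <= x <= 4."""
    return 2.0 * A * (sqrt(x) - sqrt(3.0))

def quantile(p: float) -> float:
    """Solve F(x) = p for x: sqrt(x) = sqrt(3) + p / (2A)."""
    return (sqrt(3.0) + p / (2.0 * A)) ** 2

q1, median, q3 = quantile(0.25), quantile(0.50), quantile(0.75)
iqr = q3 - q1

print(round(A, 3), round(q1, 2), round(median, 2), round(q3, 2), round(iqr, 2))
```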
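Jointly Discrete Random Variables
• Joint Probability Distribution
| X random values \ Y random values | Y1 | Y2 | … | Yn |
|---|---|---|---|---|
| X1 | p1,1 | p1,2 | … | p1,n |
| X2 | p2,1 | p2,2 | … | p2,n |
| … | … | … | … | … |
| Xm | pm,1 | pm,2 | … | pm,n |
$$\sum_{j=1}^{n} \sum_{i=1}^{m} P(X_i, Y_j) = 1 \quad \text{or} \quad \sum_{j=1}^{n} \sum_{i=1}^{m} p_{ij} = 1$$
• Joint Cumulative Distribution Function
$$F(x, y) = P(X \le x, Y \le y) = \sum_{i:\, x_i \le x}\; \sum_{j:\, y_j \le y} p_{ij}$$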
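Marginal Probability Distribution
| X random values \ Y random values | Y1 | Y2 | … | Yn |
|---|---|---|---|---|
| X1 | p1,1 | p1,2 | … | p1,n |
| X2 | p2,1 | p2,2 | … | p2,n |
| … | … | … | … | … |
| Xm | pm,1 | pm,2 | … | pm,n |
Marginal distribution of x (row sums):
$$P(x_1) = \sum_{j=1}^{n} p_{1j}, \quad P(x_2) = \sum_{j=1}^{n} p_{2j}, \quad \ldots, \quad P(x_m) = \sum_{j=1}^{n} p_{mj}$$
Marginal distribution of y (column sums):
$$P(y_1) = \sum_{i=1}^{m} p_{i1}, \quad P(y_2) = \sum_{i=1}^{m} p_{i2}, \quad \ldots, \quad P(y_n) = \sum_{i=1}^{m} p_{in}$$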
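Marginal Probability Distribution
Expectation (Mean):
$$E(X) = \sum_{i=1}^{m} x_i P(x_i), \quad E(Y) = \sum_{j=1}^{n} y_j P(y_j)$$
$$E(XY) = \sum_{j=1}^{n} \sum_{i=1}^{m} x_i\, y_j\, p_{ij}$$
Variance:
$$\sigma_X^2 = \sum_{i=1}^{m} \big(x_i - E(X)\big)^2 P(x_i), \quad \sigma_Y^2 = \sum_{j=1}^{n} \big(y_j - E(Y)\big)^2 P(y_j)$$
Standard Deviation:
$$\sigma_X = \sqrt{\sigma_X^2}, \quad \sigma_Y = \sqrt{\sigma_Y^2}$$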
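Conditional Probability Distribution
If two discrete random variables X and Y are jointly distributed, then the conditional distribution of random variable X conditional on the event Y = yj consists of the probability values
$$p_{i|Y=y_j} = P(X = x_i \mid Y = y_j) = \frac{P(X = x_i, Y = y_j)}{P(Y = y_j)} = \frac{p_{ij}}{\sum_{i=1}^{m} p_{ij}}$$
What is this next equation about? By symmetry, it gives the conditional distribution of Y conditional on the event X = xi:
$$p_{j|X=x_i} = P(Y = y_j \mid X = x_i) = \frac{P(X = x_i, Y = y_j)}{P(X = x_i)} = \frac{p_{ij}}{\sum_{j=1}^{n} p_{ij}}$$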
Covariance and Correlation
Covariance: indicates the strength of the dependence of two random variables
$$\mathrm{Cov}(X, Y) = E\big[(X - E(X))(Y - E(Y))\big] = E(XY) - E(X)E(Y)$$
In practice, the most convenient way to assess the strength of the dependence between two random variables is through their correlation
$$\mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$
The correlation takes values between -1 and 1, and the discrete random variables X and Y are
• uncorrelated if Corr(X,Y) = 0 [or Cov(X,Y) = 0]; independent random variables always have zero correlation, but zero correlation alone does not guarantee independence
• perfectly linearly dependent if Corr(X,Y) = -1 or 1 (negatively or positively)
Example
• A company that services air conditioner (AC) units in residences and office blocks is interested in how to schedule its technicians in the most efficient manner. Specifically, the company is interested in how long a technician takes on a visit to a particular location, and the company recognizes that this mainly depends on the number of AC units at the location that need to be serviced.
| Number of AC units \ Service time (hours) | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | 0.12 | 0.08 | 0.07 | 0.05 |
| 2 | 0.08 | 0.15 | 0.21 | 0.13 |
| 3 | 0.01 | 0.01 | 0.02 | 0.07 |
• Check that the joint probabilities sum to 1:
$$\sum_{j=1}^{n} \sum_{i=1}^{m} p_{ij} = 0.12 + 0.08 + 0.07 + 0.05 + 0.08 + 0.15 + 0.21 + 0.13 + 0.01 + 0.01 + 0.02 + 0.07 = 1$$
Example
• What is the probability that a location has no more than two AC units that take no more than 2 hours to service? (Joint cumulative probability function)
$$F(2, 2) = p_{11} + p_{12} + p_{21} + p_{22} = 0.12 + 0.08 + 0.08 + 0.15 = 0.43$$
• What are the expected value, variance and standard deviation of the number of AC units? Of the service time?
| Number of AC units \ Service time (hours) | 1 | 2 | 3 | 4 | sum |
|---|---|---|---|---|---|
| 1 | 0.12 | 0.08 | 0.07 | 0.05 | 0.32 |
| 2 | 0.08 | 0.15 | 0.21 | 0.13 | 0.57 |
| 3 | 0.01 | 0.01 | 0.02 | 0.07 | 0.11 |
| sum | 0.21 | 0.24 | 0.30 | 0.25 | |
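Example
$$E(X) = 1(0.32) + 2(0.57) + 3(0.11) = 1.79$$
$$\sigma_X^2 = (1 - 1.79)^2(0.32) + (2 - 1.79)^2(0.57) + (3 - 1.79)^2(0.11) = 0.386$$
$$\sigma_X = \sqrt{0.386} = 0.62$$
$$E(Y) = 1(0.21) + 2(0.24) + 3(0.30) + 4(0.25) = 2.59$$
$$\sigma_Y^2 = (1 - 2.59)^2(0.21) + (2 - 2.59)^2(0.24) + (3 - 2.59)^2(0.30) + (4 - 2.59)^2(0.25) = 1.162$$
$$\sigma_Y = \sqrt{1.162} = 1.08$$
• Is there any correlation between the number of ACs and the service hours?
$$E(XY) = \sum_{j=1}^{n} \sum_{i=1}^{m} x_i\, y_j\, p_{ij} = (1 \times 1 \times 0.12) + (1 \times 2 \times 0.08) + \cdots + (3 \times 4 \times 0.07) = 4.86$$
$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = 4.86 - (1.79)(2.59) = 0.224$$
$$\mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{0.224}{(0.62)(1.08)} \approx 0.33$$
Positively correlated!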
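The marginal distributions, means, variances, covariance, and correlation for the AC example can be recomputed from the joint table. The Python sketch below is an illustration only; the helper names `mean` and `variance` and the variable names are assumptions, and unrounded intermediate values are used throughout.

```python
from math import sqrt

x_values = [1, 2, 3]        # number of AC units
y_values = [1, 2, 3, 4]     # service time in hours
joint = [                   # joint[i][j] = P(X = x_values[i], Y = y_values[j])
    [0.12, 0.08, 0.07, 0.05],
    [0.08, 0.15, 0.21, 0.13],
    [0.01, 0.01, 0.02, 0.07],
]

# Marginals: row sums for X, column sums for Y.
p_x = [sum(row) for row in joint]
p_y = [sum(col) for col in zip(*joint)]

def mean(values, probs):
    return sum(v * p for v, p in zip(values, probs))

def variance(values, probs):
    m = mean(values, probs)
    return sum((v - m) ** 2 * p for v, p in zip(values, probs))

e_x, e_y = mean(x_values, p_x), mean(y_values, p_y)
var_x, var_y = variance(x_values, p_x), variance(y_values, p_y)
e_xy = sum(x * y * joint[i][j]
           for i, x in enumerate(x_values)
           for j, y in enumerate(y_values))
cov = e_xy - e_x * e_y
corr = cov / sqrt(var_x * var_y)

print(round(e_x, 2), round(e_y, 2), round(var_x, 3), round(var_y, 3))
print(round(e_xy, 2), round(cov, 3), round(corr, 3))
```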
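Example
• Suppose that a technician is visiting a location that is known to have three air conditioner units. What is the probability that the service time is four hours?
$$p_{j|X=x_i} = P(Y = y_j \mid X = x_i) = \frac{P(X = x_i, Y = y_j)}{P(X = x_i)} = \frac{p_{ij}}{\sum_{j=1}^{n} p_{ij}}$$
$$p_{4|X=3} = P(Y = 4 \mid X = 3) = \frac{p_{34}}{\sum_{j=1}^{4} p_{3j}} = \frac{0.07}{0.01 + 0.01 + 0.02 + 0.07} = \frac{0.07}{0.11} = 0.64$$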
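The conditional probability above amounts to dividing one row of the joint table by its row sum. Here is a minimal Python sketch of that step; `conditional_y_given_x` is a hypothetical helper written for illustration.

```python
y_values = [1, 2, 3, 4]  # service time in hours
joint = {                # joint[x][j] = P(X = x, Y = y_values[j]), x = number of AC units
    1: [0.12, 0.08, 0.07, 0.05],
    2: [0.08, 0.15, 0.21, 0.13],
    3: [0.01, 0.01, 0.02, 0.07],
}

def conditional_y_given_x(x: int) -> dict:
    """Conditional distribution of Y given X = x: each p_ij divided by the row sum."""
    row = joint[x]
    total = sum(row)
    return {y: p / total for y, p in zip(y_values, row)}

cond = conditional_y_given_x(3)
print({y: round(p, 2) for y, p in cond.items()})
```

The entry for a service time of 4 hours should match the value computed above, and the conditional probabilities should sum to 1.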
Linear Function of a Random Variable
If X and Y are two random variables, and
$$Y = aX + b$$
for some real numbers a and b, then the expectation, the variance and the standard deviation of the random variable Y are
$$E(Y) = aE(X) + b$$
$$\mathrm{Var}(Y) = a^2\, \mathrm{Var}(X)$$
$$\sigma_Y = |a|\, \sigma_X$$
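Linear Combination of Random Variables
If X1 and X2 are two random variables, then the expectation of their sum is
$$E(X_1 + X_2) = E(X_1) + E(X_2)$$
and the variance
$$\sigma^2_{X_1 + X_2} = \sigma^2_{X_1} + \sigma^2_{X_2} + 2\,\mathrm{Cov}(X_1, X_2)$$
If X1 and X2 are independent random variables so that Cov(X1,X2) = 0, then
$$\sigma^2_{X_1 + X_2} = \sigma^2_{X_1} + \sigma^2_{X_2}$$
The standard deviation
$$\sigma_{X_1 + X_2} = \sqrt{\sigma^2_{X_1 + X_2}}$$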
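Example
1) Use the answers of the previous example (E(X), E(Y), E(XY), Var(X) and Var(Y)) and assume that X and Y are independent variables. Find the expectation and variance of the following random variables
• 2X + 6Y
• 5X − 9Y + 8
From the previous example:
$$E(X) = 1.79, \quad E(Y) = 2.59, \quad \sigma_X^2 = 0.386, \quad \sigma_Y^2 = 1.162$$
Then,
$$E(2X + 6Y) = 2E(X) + 6E(Y) = 2(1.79) + 6(2.59) = 19.12$$
$$E(5X - 9Y + 8) = 5E(X) - 9E(Y) + 8 = 5(1.79) - 9(2.59) + 8 = -6.36$$
$$\mathrm{Var}(2X + 6Y) = 2^2\,\mathrm{Var}(X) + 6^2\,\mathrm{Var}(Y) = 4(0.386) + 36(1.162) = 43.376$$
$$\mathrm{Var}(5X - 9Y + 8) = 5^2\,\mathrm{Var}(X) + 9^2\,\mathrm{Var}(Y) = 25(0.386) + 81(1.162) = 103.77$$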
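The rules used in this example can be wrapped in a small function. The Python sketch below is illustrative only; `linear_combination` is a hypothetical helper, and the optional covariance argument defaults to 0 to reflect the independence assumption stated in the example.

```python
def linear_combination(a, b, c, e_x, e_y, var_x, var_y, cov_xy=0.0):
    """Mean and variance of aX + bY + c.

    E(aX + bY + c) = a E(X) + b E(Y) + c
    Var(aX + bY + c) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y)
    """
    mean = a * e_x + b * e_y + c
    variance = a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy
    return mean, variance

# Rounded values carried over from the AC example, with X and Y assumed independent.
e_x, e_y, var_x, var_y = 1.79, 2.59, 0.386, 1.162

mean_1, var_1 = linear_combination(2, 6, 0, e_x, e_y, var_x, var_y)
mean_2, var_2 = linear_combination(5, -9, 8, e_x, e_y, var_x, var_y)

print(round(mean_1, 2), round(var_1, 3))
print(round(mean_2, 2), round(var_2, 2))
```

Note that the constant c shifts the mean but has no effect on the variance.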
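Averaging Independent Random Variables
Suppose that X1, X2, ..., Xn is a sequence of independent random variables each with an expectation μ and a variance σ², and with an average
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Then
$$E(\bar{X}) = \frac{E(X_1) + E(X_2) + \cdots + E(X_n)}{n} = \frac{n\mu}{n} = \mu$$
$$\mathrm{Var}(\bar{X}) = \frac{\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n)}{n^2} = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$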
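Example
1) The weight of a certain type of brick has an expectation of 1.12 kg with a standard deviation of 0.03 kg.
a) What are the expectation and variance of the average weight of 25 bricks randomly selected?
b) How many bricks need to be selected so that their average weight has a standard deviation of no more than 0.005 kg?
a) Since the variables are independent, with
$$E(X_1) = E(X_2) = \cdots = E(X_{25}) = 1.12$$
then
$$E(\bar{X}) = \frac{E(X_1) + E(X_2) + \cdots + E(X_{25})}{25} = 1.12$$
$$\mathrm{Var}(\bar{X}) = \frac{\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n)}{n^2} = \frac{\sigma^2}{n} = \frac{0.03^2}{25} = 0.000036$$
b)
$$\sigma_{\bar{X}} = \sqrt{\mathrm{Var}(\bar{X})} = \frac{0.03}{\sqrt{n}} \le 0.005 \;\Rightarrow\; \sqrt{n} \ge \frac{0.03}{0.005} = 6 \;\Rightarrow\; n \ge 36$$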
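The brick example can be reproduced with a few lines of Python. The sketch below uses exact fractions so that the sample-size calculation is not affected by floating-point rounding; both function names are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction
from math import ceil

mu = Fraction(112, 100)    # expected brick weight, 1.12 kg
sigma = Fraction(3, 100)   # standard deviation of brick weight, 0.03 kg

def mean_and_variance_of_average(mu, sigma, n):
    """For n independent variables: E(average) = mu, Var(average) = sigma^2 / n."""
    return mu, sigma**2 / n

def min_sample_size(sigma, target_sd):
    """Smallest n with sigma / sqrt(n) <= target_sd, i.e. n >= (sigma / target_sd)^2."""
    return ceil((sigma / target_sd) ** 2)

e_avg, var_avg = mean_and_variance_of_average(mu, sigma, 25)
n_needed = min_sample_size(sigma, Fraction(5, 1000))

print(float(e_avg), float(var_avg), n_needed)
```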