Chapter 3 Properties of Random Variables
Moments and Expectation
Review
Experiment, Random Variable, Observation, Realization, Parameter, Sample, Statistic
Moments
One way to quantify the location and some measures of the shape of the pdf.
First moment about the origin
\[ d\mu'_1 = x\,dA \]
\[ \mu'_1 = \int_A x\,dA \]
\[ dA = p_X(x)\,dx \]
\[ \mu'_1 = \int_{-\infty}^{\infty} x\,p_X(x)\,dx \]
ith moment about the origin
\[ \mu'_i = \int_{-\infty}^{\infty} x^i\,p_X(x)\,dx \qquad \text{(continuous random variable)} \]
\[ \mu'_i = \sum_j x_j^i\,f_X(x_j) \qquad \text{(discrete random variable)} \]
ith central moment about the mean, $\mu$
\[ \mu_i = \int_{-\infty}^{\infty} (x - \mu)^i\,p_X(x)\,dx \]
Expected value of a random variable X
\[ E(X) = \int_{-\infty}^{\infty} x\,p_X(x)\,dx \qquad \text{(X continuous)} \]
\[ E(X) = \sum_j x_j\,f_X(x_j) \qquad \text{(X discrete)} \]
Expected value of a function of X, g(X)
\[ E[g(X)] = \int_{-\infty}^{\infty} g(x)\,p_X(x)\,dx \qquad \text{(X continuous)} \]
\[ E[g(X)] = \sum_j g(x_j)\,f_X(x_j) \qquad \text{(X discrete)} \]
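The discrete definitions of moments and expectation can be checked numerically. The following is a minimal sketch, assuming a fair six-sided die as the random variable; the die and the helper names `expect`, `raw_moment`, and `central_moment` are illustrative assumptions, not part of the slides.

```python
# Hypothetical example: a fair six-sided die (not from the slides).
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6  # f_X(x_j) for each outcome


def expect(g, values, probs):
    """E[g(X)] = sum over j of g(x_j) * f_X(x_j) for a discrete X."""
    return sum(g(x) * p for x, p in zip(values, probs))


def raw_moment(i, values, probs):
    """i-th moment about the origin."""
    return expect(lambda x: x ** i, values, probs)


def central_moment(i, values, probs):
    """i-th moment about the mean."""
    mu = raw_moment(1, values, probs)
    return expect(lambda x: (x - mu) ** i, values, probs)


mean = raw_moment(1, values, probs)               # first moment about the origin
second_central = central_moment(2, values, probs)  # second moment about the mean
```

Here `raw_moment(1, ...)` and `expect(lambda x: x, ...)` return the same number, which is the point made on the next slide.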
Expected value and the first moment about the origin
Comparing
\[ \mu'_i = \int_{-\infty}^{\infty} x^i\,p_X(x)\,dx \]
to
\[ E(X) = \int_{-\infty}^{\infty} x\,p_X(x)\,dx \]
you can see that the expected value of the random variable X is the first moment about the origin (the case i = 1).
Rules for finding expected values
\[ E(c) = c \]
\[ E[c\,g(X)] = c\,E[g(X)] \]
\[ E[g_1(X) + g_2(X)] = E[g_1(X)] + E[g_2(X)] \]
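A quick numerical check of the expectation rules for a constant, a scaled function, and a sum of functions. This is only a sketch: the three-point distribution and the constant `c` are made-up values chosen so the arithmetic is easy to follow.

```python
# Hypothetical discrete distribution (values and probabilities are assumptions).
values = [0, 1, 2]
probs = [0.25, 0.5, 0.25]


def expect(g):
    """E[g(X)] for the discrete distribution defined above."""
    return sum(g(x) * p for x, p in zip(values, probs))


c = 3.0

expect_const = expect(lambda x: c)              # E(c)
lhs_scale = expect(lambda x: c * x ** 2)        # E[c g(X)] with g(x) = x^2
rhs_scale = c * expect(lambda x: x ** 2)        # c E[g(X)]
lhs_sum = expect(lambda x: x + x ** 2)          # E[g1(X) + g2(X)]
rhs_sum = expect(lambda x: x) + expect(lambda x: x ** 2)
```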
Measures of central tendency
Arithmetic mean
Geometric mean
Median
Mode
Weighted mean
Mean, $\mu_X$, or average value
Mean of a r.v. X is its expected value:
\[ \mu_X = E(X) = \mu'_1 \]
Sample estimate is the arithmetic average:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
Arithmetic mean of grouped data:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{k} n_i x_i \]
where k is the number of groups, n is the total number of observations, $n_i$ is the number of observations in group i, and $x_i$ is the class mark of the ith group.
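The two sample formulas for the arithmetic mean can be sketched as follows. The data, class marks, and counts are invented for illustration, and the function names are hypothetical.

```python
def arithmetic_mean(xs):
    """Sample mean: sum of the observations divided by n."""
    return sum(xs) / len(xs)


def grouped_mean(class_marks, counts):
    """Mean of grouped data: sum of n_i * x_i divided by n = sum of n_i."""
    n = sum(counts)
    return sum(n_i * x_i for x_i, n_i in zip(class_marks, counts)) / n


# Hypothetical raw observations.
data = [2, 4, 4, 6, 9]
xbar = arithmetic_mean(data)

# Hypothetical grouped data: k = 3 groups with class marks 5, 15, 25.
class_marks = [5, 15, 25]
counts = [2, 5, 3]
xbar_grouped = grouped_mean(class_marks, counts)
```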
Geometric mean
Used when the ratio of two consecutive observations is either constant or nearly constant.
\[ \bar{X}_G = \left( \prod_{i=1}^{n} x_i \right)^{1/n} \]
\[ \prod_{i=1}^{n} x_i = x_1 x_2 x_3 x_4 \cdots x_n \]
\[ \log(\bar{X}_G) = \frac{1}{n} \sum_{i=1}^{n} \log x_i \]
The logarithm of the population geometric mean would be the expected value of the logarithm of X.
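As a sketch of the geometric mean, the product form and the logarithm form can be compared on a sample whose consecutive ratios are constant. The data are an assumption chosen for illustration; `math.prod` requires Python 3.8 or later.

```python
import math

# Hypothetical observations with a constant ratio of 2 between neighbors.
data = [1.0, 2.0, 4.0, 8.0]
n = len(data)

# Product form: n-th root of the product of the observations.
gm_product = math.prod(data) ** (1 / n)

# Log form: the log of the geometric mean is the mean of the logs.
mean_log = sum(math.log(x) for x in data) / n
gm_log = math.exp(mean_log)
```

The log form is usually the safer choice for long series, because the running product can overflow or underflow.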
Median, $X_{md}$
The observation such that half of the values in the sample lie on either side of $X_{md}$. The median may not exist.
Population median, $\mu_{md}$, would be the value satisfying:
\[ \int_{-\infty}^{\mu_{md}} p_X(x)\,dx = 0.5 \qquad \text{(X continuous)} \]
\[ \sum_{i=1}^{p} f_X(x_i) = 0.5 \qquad \text{(X discrete)} \]
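A small sketch of the sample and population median. The sample values are invented. For an even sample size, `statistics.median` returns the average of the two middle values, which is a convention of that function rather than something stated on the slide. The continuous check assumes an exponential distribution with rate `lam`, chosen only because its CDF has a simple closed form.

```python
import math
import statistics

# Hypothetical samples.
even_sample = [7, 1, 5, 3]
odd_sample = [7, 1, 5]

even_median = statistics.median(even_sample)  # average of the two middle values
odd_median = statistics.median(odd_sample)    # the single middle value

# Continuous case (assumed exponential distribution): F(x) = 1 - exp(-lam * x).
lam = 2.0
population_median = math.log(2) / lam
cdf_at_median = 1 - math.exp(-lam * population_median)  # should be close to 0.5
```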
Mode, $\mu_{mo}$
Most frequently occurring value. The sample or population may have none, one, or more than one mode.
Population mode is the value of X maximizing $p_X(x)$.
\[ \frac{d\,p_X(x)}{dx} = 0, \qquad \frac{d^2 p_X(x)}{dx^2} < 0 \qquad \text{(X continuous)} \]
\[ \max_{1 \le i \le n} f_X(x_i) \qquad \text{(X discrete)} \]
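The mode of a sample and of a discrete population can be sketched as below. The sample and the probability mass function `pmf` are assumptions for illustration; `statistics.multimode` (Python 3.8 or later) returns every value tied for the highest count, which makes the "more than one mode" case visible.

```python
import statistics

# Hypothetical sample with two modes.
sample = [1, 2, 2, 3, 3, 4]
sample_modes = statistics.multimode(sample)

# Hypothetical discrete population: value -> f_X(value).
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
population_mode = max(pmf, key=pmf.get)  # value with the largest probability
```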
Weighted mean
Used for describing the central tendency of grouped data.
\[ \bar{x}_w = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i} \]
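A minimal sketch of the weighted mean. The function name and the numbers are hypothetical. Using the group counts as weights reproduces the grouped-data arithmetic mean.

```python
def weighted_mean(xs, ws):
    """Sum of w_i * x_i divided by the sum of the weights."""
    if len(xs) != len(ws):
        raise ValueError("xs and ws must have the same length")
    total_weight = sum(ws)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(xs, ws)) / total_weight


# Hypothetical values and weights.
result = weighted_mean([10.0, 20.0, 30.0], [1, 2, 1])

# With counts as weights, this is the mean of grouped data.
grouped = weighted_mean([5, 15, 25], [2, 5, 3])
```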
Measures of Dispersion
Measures of the spread of the data:
Range
Variance
Range
Difference between the largest and smallest sample values. For a population this interval often ranges from $-\infty$ to $\infty$ or from 0 to $\infty$. The sample range is a function of only 2 of the sample values, but does convey some idea of the spread of the data.
Disadvantage of range: does not reflect frequency or magnitude of values that deviate from the mean.
Occasionally use the relative range:
\[ \text{Relative range} = \frac{x_u - x_l}{\bar{x}} \]
where $x_u$ and $x_l$ are the largest and smallest sample values.
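The range and relative range take only a few lines to compute. The data below are invented for illustration.

```python
# Hypothetical sample.
data = [2, 4, 4, 6, 9]

x_u = max(data)  # largest sample value
x_l = min(data)  # smallest sample value
sample_range = x_u - x_l

xbar = sum(data) / len(data)
relative_range = sample_range / xbar  # dimensionless
```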
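Variance, $\sigma^2$
Defined as the second moment about the mean.
\[ \sigma_X^2 = \mathrm{Var}(X) = E[(X - \mu_X)^2] = E(X^2) - [E(X)]^2 \]
The average squared deviation from the mean. For a discrete population of size n:
\[ \sigma_X^2 = \frac{\sum_i (x_i - \mu_X)^2}{n} \]
Sample estimate of $\sigma_X^2$ is $s_X^2$:
\[ s_X^2 = \frac{\sum_i (x_i - \bar{x})^2}{n - 1} = \frac{\sum_i x_i^2 - n\bar{x}^2}{n - 1} = \frac{n \sum_i x_i^2 - \left( \sum_i x_i \right)^2}{n(n - 1)} \]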
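The three algebraically equivalent forms of the sample variance can be compared numerically, with `statistics.variance` and `statistics.pvariance` as a cross-check. The data are an assumption for illustration.

```python
import statistics

# Hypothetical sample.
data = [2, 4, 4, 6, 9]
n = len(data)
xbar = sum(data) / n
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)

# Form 1: squared deviations from the sample mean.
s2_dev = sum((x - xbar) ** 2 for x in data) / (n - 1)
# Form 2: sum of squares minus n times the squared mean.
s2_short = (sum_x2 - n * xbar ** 2) / (n - 1)
# Form 3: computed directly from the two running sums.
s2_sums = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

# Population formula (divide by n) for comparison.
sigma2 = sum((x - xbar) ** 2 for x in data) / n
```

Forms 2 and 3 are convenient for hand calculation, but they subtract two large numbers, so form 1 tends to be more accurate in floating point when the mean is large relative to the spread.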
Variance
Two basic differences between population and sample variance:
$\bar{x}$ is used instead of $\mu$.
n - 1 is used as the denominator rather than n to avoid a biased estimate of $\sigma_X^2$.
Variance of grouped data
\[ s_X^2 = \frac{\sum_{i=1}^{k} n_i (x_i - \bar{x})^2}{n - 1} \]
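A sketch of the grouped-data variance, using the same kind of class marks and counts as a grouped mean would. The numbers and the function name are hypothetical.

```python
def grouped_variance(class_marks, counts):
    """Sample variance of grouped data with n - 1 in the denominator."""
    n = sum(counts)
    xbar = sum(n_i * x_i for x_i, n_i in zip(class_marks, counts)) / n
    squared_dev = sum(n_i * (x_i - xbar) ** 2
                      for x_i, n_i in zip(class_marks, counts))
    return squared_dev / (n - 1)


# Hypothetical grouped data: k = 3 groups, n = 10 observations.
class_marks = [5, 15, 25]
counts = [2, 5, 3]
s2_grouped = grouped_variance(class_marks, counts)
```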
Rules for finding the Variance
\[ \mathrm{Var}(c) = 0 \]
\[ \mathrm{Var}(cX) = c^2\,\mathrm{Var}(X) \]
\[ \mathrm{Var}(a + bX) = b^2\,\mathrm{Var}(X) \]
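The variance rules can be verified on a small discrete distribution. This is a sketch under assumed values: the distribution and the constants `a`, `b`, and `c` are made up.

```python
def variance(values, probs):
    """Var(X) = E[(X - mu)^2] for a discrete distribution."""
    mu = sum(x * p for x, p in zip(values, probs))
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))


# Hypothetical discrete distribution.
values = [0, 1, 2]
probs = [0.25, 0.5, 0.25]
a, b, c = 4.0, -3.0, 7.0

var_x = variance(values, probs)
var_const = variance([c for _ in values], probs)            # Var(c)
var_scaled = variance([c * x for x in values], probs)       # Var(cX)
var_affine = variance([a + b * x for x in values], probs)   # Var(a + bX)
```

Note that the shift `a` has no effect and the sign of `b` disappears because it is squared.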
Units of Variance
Units of the variance are the same as units on $X^2$. Units on its positive square root, the standard deviation, $\sigma_X$, are the same as the units of the random variable, X.
A dimensionless measure of dispersion is the coefficient of variation, $C_v$.
\[ C_v = \frac{s_X}{\bar{x}} \]
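To illustrate that the coefficient of variation is dimensionless, the sketch below computes it for a sample and for the same sample expressed in different units. The data and the unit conversion factor are assumptions.

```python
import statistics


def coefficient_of_variation(xs):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(xs) / statistics.mean(xs)


# Hypothetical measurements, for example lengths in meters.
data_m = [2.0, 4.0, 4.0, 6.0, 9.0]
# The same measurements in millimeters.
data_mm = [1000.0 * x for x in data_m]

cv_m = coefficient_of_variation(data_m)
cv_mm = coefficient_of_variation(data_mm)  # unchanged by the change of units
```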
Measures of Symmetry
Many distributions are not symmetrical.
Tailing off to the right or the left is skewing the distribution.
Tailing to the right: positively skewed.
Tailing to the left: negatively skewed.
Skewness
3rd moment about the mean:
\[ \text{skewness} = \mu_3 = \int_{-\infty}^{\infty} (x - \mu)^3\,p_X(x)\,dx \]
Practical measurements of skewness
One measure of absolute skew is the difference between the mean and the mode.
Not meaningful for comparison's sake because it is dependent on units of measure.
Pearson’s first coefficient of skewness
Relative measure of skewness, more useful for comparison.
Population skewness:
\[ \frac{\mu_X - \mu_{mo}}{\sigma_X} \]
Sample skewness:
\[ \frac{\bar{x} - x_{mo}}{s_X} \]
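A sketch of the sample form of Pearson's first coefficient of skewness, computed as mean minus mode, divided by the standard deviation. The data are invented and have a single mode with a tail to the right, so the coefficient should come out positive.

```python
import statistics

# Hypothetical sample with a long right tail.
data = [1, 2, 2, 2, 3, 4, 7]

xbar = statistics.mean(data)
x_mo = statistics.mode(data)   # single most frequent value
s_x = statistics.stdev(data)   # sample standard deviation (n - 1 denominator)

pearson_skew = (xbar - x_mo) / s_x
```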
Measures of Peakedness (Flatness)
Kurtosis refers to the extent of peakedness of a probability distribution in comparison to the normal distribution.
Kurtosis is the 4th moment about the mean.
\[ \text{Kurtosis} = \mu_4 = \int_{-\infty}^{\infty} (x - \mu)^4\,p_X(x)\,dx \]
Calculate the coefficient of kurtosis, k:
\[ k = \frac{\mu_4}{\sigma_X^4} \]
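A sketch of the coefficient of kurtosis for a discrete distribution, assuming the common definition of the fourth central moment divided by the squared variance. The fair die is a hypothetical example. For reference, the normal distribution has a coefficient of 3 under this definition, so a smaller value indicates a flatter distribution.

```python
# Hypothetical example: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6


def central_moment(i):
    """i-th moment about the mean for the distribution above."""
    mu = sum(x * p for x, p in zip(values, probs))
    return sum((x - mu) ** i * p for x, p in zip(values, probs))


mu_2 = central_moment(2)  # variance
mu_4 = central_moment(4)  # kurtosis (4th central moment)

# Assumed definition: k = mu_4 / sigma^4 = mu_4 / mu_2^2.
kurtosis_coefficient = mu_4 / mu_2 ** 2
```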
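Covariance
Measure of the linear relationship between two jointly distributed random variables, X and Y.
Covariance is the 1,1 central moment:
\[ \mu_{1,1} = \sigma_{X,Y} = \mathrm{Cov}(X,Y) = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - E(X)E(Y) \]
\[ \mathrm{Cov}(X,Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\,p_{X,Y}(x,y)\,dx\,dy \]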
Covariance
If X and Y are independent:
\[ \mathrm{Cov}(X,Y) = \sigma_{X,Y} = 0 \]
Sample statistic:
\[ s_{X,Y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1} \]
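The sample covariance can be sketched directly from its definition. The paired data and the function name are assumptions for illustration.

```python
def sample_covariance(xs, ys):
    """Sum of (x_i - xbar)(y_i - ybar) divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
s_xy = sample_covariance(xs, ys)
```

The covariance of a variable with itself is its sample variance, which is a handy sanity check.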
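Correlation Coefficient
Normalized covariance:
\[ \rho_{X,Y} = \frac{\sigma_{X,Y}}{\sigma_X \sigma_Y}, \qquad -1 \le \rho_{X,Y} \le 1 \]
If X and Y are independent:
\[ \rho_{X,Y} = 0 \]
Sample estimate:
\[ r_{X,Y} = \frac{s_{X,Y}}{s_X s_Y}, \qquad -1 \le r_{X,Y} \le 1 \]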
Correlation Coefficient
Measure of how two variables vary together.
A value of r equal to positive one implies that X and Y are perfectly related by Y = a + bX with b > 0.
Positive values indicate large (small) values of X tend to be paired with large (small) values of Y.
Negative values indicate large (small) values of X tend to be paired with small (large) values of Y.
Two variables are uncorrelated ONLY IF $r_{X,Y} = 0$.
Correlation does NOT equal cause and effect.
Correlation coefficient: Linear dependence and functional dependence
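The difference between linear dependence and functional dependence can be illustrated with the sample correlation coefficient. In this sketch the data are invented: one series is an exact line in x, one is an exact line with negative slope, and one is the exact function y = x^2 on values symmetric about zero. The last case is fully determined by x yet has zero correlation.

```python
import math


def sample_correlation(xs, ys):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    s_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - ybar) ** 2 for y in ys) / (n - 1))
    return s_xy / (s_x * s_y)


# Hypothetical x values, symmetric about zero.
xs = [-2, -1, 0, 1, 2]

r_line_up = sample_correlation(xs, [3 + 2 * x for x in xs])    # Y = a + bX, b > 0
r_line_down = sample_correlation(xs, [3 - 2 * x for x in xs])  # Y = a + bX, b < 0
r_parabola = sample_correlation(xs, [x ** 2 for x in xs])      # Y = X^2
```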
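Other Properties of Moments
For Z = aX + bY:
\[ E(Z) = E(aX + bY) = aE(X) + bE(Y) \]
\[ \mathrm{Var}(Z) = \mathrm{Var}(aX + bY) = E[(aX + bY)^2] - [E(aX + bY)]^2 \]
\[ \mathrm{Var}(Z) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X,Y) \]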
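The mean and variance of a linear combination Z = aX + bY can be checked on sample data, since the same identities hold for the sample mean, sample variance, and sample covariance. The paired data and the constants `a` and `b` are assumptions for illustration.

```python
import statistics

# Hypothetical paired observations and coefficients.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2, -1

n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n
cov_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)

# Direct computation from the combined series z_i = a*x_i + b*y_i.
zs = [a * x + b * y for x, y in zip(xs, ys)]
mean_direct = statistics.mean(zs)
var_direct = statistics.variance(zs)

# Computation from the rules for E(Z) and Var(Z).
mean_rule = a * xbar + b * ybar
var_rule = (a ** 2 * statistics.variance(xs)
            + b ** 2 * statistics.variance(ys)
            + 2 * a * b * cov_xy)
```

With a negative `b` and a positive covariance, the cross term reduces the variance of Z below the sum of the two scaled variances.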