ECE 600-03 Statistical Signal Processing
Aly A. Farag, University of Louisville
Spring 2010, www.cvip.uofl.edu
Lecture # 2 – On Random Variables
Reference – 1) My handwritten notes posted on blackboard and 2) Various resources on the web; too many to list – all are basic stuff from Textbooks.
Basic Concepts in Probability
(from the web)
Statistical Experiment
A statistical experiment E is described in terms of the triple {S, σ, P}
S – sample space containing all elementary outcomes
σ – sigma algebra containing all measurable events (collections of outcomes)
P – probability measure; weight/scale given to every event in σ
Sample Space - Events
• Sample Point
– The outcome of a random experiment
• Sample Space S
– The set of all possible outcomes
– Discrete and Continuous
• Events
– A set of outcomes, thus a subset of S
– Certain, Impossible and Elementary
[Figure: Venn diagram of events A and B inside the sample space S]
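Set Operations
• Union: $A \cup B$
• Intersection: $A \cap B$
• Complement: $A^C$
• Properties
– Commutation: $A \cup B = B \cup A$
– Associativity: $A \cup (B \cup C) = (A \cup B) \cup C$
– Distribution: $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$
– De Morgan’s Rule: $(A \cap B)^C = A^C \cup B^C$
[Figure: Venn diagrams of $A \cup B$, $A \cap B$ and $A^C$ inside the sample space S]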
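Axioms and Corollaries
• Axioms
• $0 \le P(A)$
• $P(S) = 1$
• If $A \cap B = \emptyset$ then: $P(A \cup B) = P(A) + P(B)$
• If A1, A2, … are pairwise exclusive, then: $P\left(\bigcup_{k=1}^{\infty} A_k\right) = \sum_{k=1}^{\infty} P(A_k)$
• Corollaries
• $P(A^C) = 1 - P(A)$
• $P(A) \le 1$
• $P(\emptyset) = 0$
• $P(A \cup B) = P(A) + P(B) - P(A \cap B)$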
Computing Probabilities Using Counting Methods
• Sampling With Replacement and With Ordering
– $n^k$
• Sampling Without Replacement and With Ordering
– $n(n-1)\cdots(n-k+1)$
• Permutations of n Distinct Objects
– $n!$
• Sampling Without Replacement and Without Ordering
– $\binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \binom{n}{n-k}$
• Sampling With Replacement and Without Ordering
– $\binom{n-1+k}{k} = \binom{n-1+k}{n-1}$
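The four counting rules can be checked numerically. The following is a minimal sketch using Python's standard `math` module; the function names and the values n = 5, k = 3 are illustrative choices, not part of the lecture.

```python
import math

def ordered_with_replacement(n, k):
    # k draws, each with n choices, order matters: n^k
    return n ** k

def ordered_without_replacement(n, k):
    # n (n-1) ... (n-k+1)
    return math.perm(n, k)

def unordered_without_replacement(n, k):
    # binomial coefficient "n choose k"
    return math.comb(n, k)

def unordered_with_replacement(n, k):
    # "n-1+k choose k"
    return math.comb(n - 1 + k, k)

n, k = 5, 3  # illustrative values
print(ordered_with_replacement(n, k))       # 125
print(ordered_without_replacement(n, k))    # 60
print(math.factorial(n))                    # 120 permutations of n objects
print(unordered_without_replacement(n, k))  # 10
print(unordered_with_replacement(n, k))     # 35
```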
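Conditional Probability
• Conditional Probability of event A given that event B has occurred: $P(A|B) = \frac{P(A \cap B)}{P(B)}$, for $P(B) > 0$
[Figure: Venn diagram of A, B and $A \cap B$ inside the sample space S]
• If B1, B2,…,Bn is a partition of S, then $P(A) = P(A|B_1)P(B_1) + \cdots + P(A|B_n)P(B_n) = \sum_{j=1}^{n} P(A|B_j)P(B_j)$
(Law of Total Probability)
[Figure: event A overlapping the partition B1, B2, B3 of S]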
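Bayes’ Rule
• If B1, …, Bn is a partition of S then
$P(B_j|A) = \frac{P(A \cap B_j)}{P(A)} = \frac{P(A|B_j)P(B_j)}{\sum_{k=1}^{n} P(A|B_k)P(B_k)}$
• In words: posterior = likelihood × prior / evidence
Example (Binary communication channel)
[Figure: binary channel. Input 0 is sent with probability 1−p and input 1 with probability p; each input is received correctly with probability 1−ε and flipped with probability ε.]
Which input is more probable if the output is 1? A priori, both input symbols are equally likely.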
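The binary channel question can be answered by applying Bayes' rule directly. The sketch below assumes a symmetric channel with crossover probability ε; the function name `posterior_inputs` and the value ε = 0.1 are illustrative assumptions, not values given in the lecture.

```python
def posterior_inputs(p_one, eps, output):
    """Posterior P(input | output) for a binary symmetric channel.

    p_one : prior probability that the input is 1
    eps   : probability that the channel flips a symbol (assumed value)
    """
    prior = {0: 1.0 - p_one, 1: p_one}
    # likelihood P(output | input): correct with prob 1-eps, flipped with prob eps
    likelihood = {i: (1.0 - eps) if i == output else eps for i in (0, 1)}
    # evidence P(output) by the law of total probability
    evidence = sum(likelihood[i] * prior[i] for i in (0, 1))
    return {i: likelihood[i] * prior[i] / evidence for i in (0, 1)}

# Equally likely inputs, hypothetical eps = 0.1, observed output 1
post = posterior_inputs(p_one=0.5, eps=0.1, output=1)
print(post)
```

With equally likely inputs the posterior of input 1 reduces to 1−ε, so input 1 is the more probable one whenever ε < ½.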
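Event Independence
• Events A and B are independent if $P(A \cap B) = P(A)P(B)$
• If two events have non-zero probability and are mutually exclusive, then they cannot be independent, since $P(A \cap B) = 0$ while $P(A)P(B) > 0$
• Pairwise independence does not imply independence of three events. In the pictured example $P(A \cap B) = P(A)P(B)$, $P(B \cap C) = P(B)P(C)$ and $P(A \cap C) = P(A)P(C)$, yet $P(A \cap B \cap C) = P(\emptyset) = 0 \neq P(A)P(B)P(C)$
[Figure: events A, B and C drawn as regions of the unit square, with tick marks at ½ and 1]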
Sequential Experiments
• Sequences of Independent Experiments
– E1, E2, …, En experiments
– A1, A2, …, An respective events
– Independent if $P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)P(A_2)\cdots P(A_n)$
• Bernoulli Trials
– Test whether an event A occurs (success – failure)
– What is the probability of k successes in n independent repetitions of a Bernoulli trial? $p_n(k) = \binom{n}{k} p^k (1-p)^{n-k}$, where $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$
– Transmission over a channel with ε = 10^{-3} and with 3-bit majority vote
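The majority-vote example can be worked out with the binomial law. This sketch assumes that each bit is sent three times over independent channel uses and decoded by majority, so a decoding error needs at least two flips out of three; the function name `binomial_pmf` is illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of k successes in n independent Bernoulli trials."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

eps = 1e-3  # bit error probability from the slide
# The majority decoder fails when 2 or 3 of the 3 transmitted copies are flipped.
p_error = binomial_pmf(2, 3, eps) + binomial_pmf(3, 3, eps)
print(p_error)  # about 2.998e-06, i.e. 3*eps^2*(1-eps) + eps^3
```

Under these assumptions the error rate drops from 10^{-3} to roughly 3 × 10^{-6}, at the cost of tripling the number of transmitted bits.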
Random Variables
(from the web)
Random Variables
• The Notion of a Random Variable
– The outcome is not always a number
– Assign a numerical value to the outcome of the experiment
• Definition
– A function X which assigns a real number X(ζ) to each outcome ζ in the sample space of a random experiment
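[Figure: the random variable X maps each outcome ζ in the sample space S to a point X(ζ) = x on the real line; the set of values taken is S_X]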
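Cumulative Distribution Function
• Defined as the probability of the event {X≤x}: $F_X(x) = P(X \le x)$
• Properties
– $0 \le F_X(x) \le 1$
– $\lim_{x \to \infty} F_X(x) = 1$
– $\lim_{x \to -\infty} F_X(x) = 0$
– if $a < b$ then $F_X(a) \le F_X(b)$
– $P(a < X \le b) = F_X(b) - F_X(a)$
– $P(X > x) = 1 - F_X(x)$
[Figures: example plots of F_X(x) versus x, including a staircase CDF with levels ¼, ½, ¾ and 1]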
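Types of Random Variables
• Continuous
– Probability Density Function: $f_X(x) = \frac{dF_X(x)}{dx}$, so that $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
• Discrete
– Probability Mass Function: $P_X(x_k) = P(X = x_k)$, so that $F_X(x) = \sum_k P_X(x_k)\,u(x - x_k)$, where u is the unit step function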
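Probability Density Function
• The pdf is computed from $f_X(x) = \frac{dF_X(x)}{dx}$
• Properties
– $P(a \le X \le b) = \int_a^b f_X(x)\,dx$
– $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
– $1 = \int_{-\infty}^{\infty} f_X(t)\,dt$
– $P(x < X \le x + dx) \approx f_X(x)\,dx$
[Figure: a pdf f_X(x) with a thin strip of width dx at x]
• For discrete r.v.: $f_X(x) = \sum_k P_X(x_k)\,\delta(x - x_k)$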
Conditional Distribution
• The conditional distribution function of X given the event B: $F_X(x|B) = \frac{P(\{X \le x\} \cap B)}{P(B)}$
• The conditional pdf is $f_X(x|B) = \frac{dF_X(x|B)}{dx}$
• The distribution function can be written as a weighted sum of conditional distribution functions: $F_X(x) = \sum_{i=1}^{n} F_X(x|A_i)\,P(A_i)$
where the $A_i$ are mutually exclusive and exhaustive events
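Expected Value and Variance
• The expected value or mean of X is $E[X] = \int_{-\infty}^{\infty} t\,f_X(t)\,dt$ (continuous) or $E[X] = \sum_k x_k\,P_X(x_k)$ (discrete)
• Properties
– $E[c] = c$
– $E[cX] = c\,E[X]$
– $E[X + c] = E[X] + c$
• The variance of X is $Var[X] = E[(X - E[X])^2] = E[X^2] - E[X]^2$
• The standard deviation of X is $Std[X] = \sqrt{Var[X]}$
• Properties
– $Var[c] = 0$
– $Var[cX] = c^2\,Var[X]$
– $Var[X + c] = Var[X]$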
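The mean and variance properties can be verified exactly on a small pmf. This is a minimal sketch using a fair six-sided die as a hypothetical example; `mean`, `variance` and `affine` are illustrative helper names.

```python
from fractions import Fraction

def mean(pmf):
    """E[X] for a pmf given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var[X] = E[(X - E[X])^2]."""
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

def affine(pmf, c, d):
    """pmf of cX + d, for c != 0 (so that distinct values stay distinct)."""
    return {c * x + d: p for x, p in pmf.items()}

die = {x: Fraction(1, 6) for x in range(1, 7)}  # hypothetical example pmf
print(mean(die), variance(die))                 # 7/2 35/12

scaled = affine(die, 2, 3)                      # Y = 2X + 3
print(mean(scaled), variance(scaled))           # 10 35/3
```

The output reflects E[cX + d] = cE[X] + d and Var[cX + d] = c²Var[X]: the shift by 3 moves the mean but leaves the variance unchanged.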
More on Mean and Variance
• Physical Meaning
– If pmf is a set of point masses, then the expected value μ is the center of mass, and the standard deviation σ is a measure of how far values of x are likely to depart from μ
• Markov’s Inequality: $P(X \ge a) \le \frac{E[X]}{a}$, for nonnegative X and $a > 0$
• Chebyshev’s Inequality: $P(|X - \mu| \ge a) \le \frac{\sigma^2}{a^2}$, or equivalently $P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$
• Both provide crude upper bounds for certain r.v.’s but might be useful when little is known for the r.v.
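To see how loose the two bounds can be, the exact tail probabilities can be compared with the Markov and Chebyshev bounds. The sketch below again uses a fair die as a hypothetical example; the thresholds a = 5 and b = 5/2 are arbitrary illustrative choices.

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}   # hypothetical example pmf
mu = sum(x * p for x, p in die.items())          # 7/2
var = sum((x - mu) ** 2 * p for x, p in die.items())  # 35/12

def tail(pmf, predicate):
    """Exact probability of the event {x : predicate(x)}."""
    return sum(p for x, p in pmf.items() if predicate(x))

a = 5
markov_exact = tail(die, lambda x: x >= a)       # P(X >= 5) = 1/3
markov_bound = mu / a                            # E[X]/a = 7/10

b = Fraction(5, 2)
cheb_exact = tail(die, lambda x: abs(x - mu) >= b)  # P(|X - mu| >= 5/2) = 1/3
cheb_bound = var / b ** 2                        # sigma^2 / b^2 = 7/15

print(markov_exact, markov_bound)
print(cheb_exact, cheb_bound)
```

Both bounds hold, and both overshoot the exact value of 1/3, which illustrates why they are called crude.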
Joint Distributions
• Joint Probability Mass Function of X, Y: $p_{XY}(x_j, y_k) = P(\{X = x_j\} \cap \{Y = y_k\}) = P(X = x_j, Y = y_k)$
• Probability of event A: $P((X, Y) \in A) = \sum\sum_{(x_j, y_k) \in A} p_{XY}(x_j, y_k)$
• Marginal PMFs (events involving each rv in isolation): $p_X(x_j) = P(X = x_j) = \sum_{k=1}^{\infty} p_{XY}(x_j, y_k)$
• Joint CDF of X, Y: $F_{XY}(x_1, y_1) = P(X \le x_1, Y \le y_1)$
• Marginal CDFs: $F_X(x) = F_{XY}(x, \infty) = P(X \le x)$ and $F_Y(y) = F_{XY}(\infty, y) = P(Y \le y)$
Conditional Probability and Expectation
• The conditional CDF of Y given the event {X=x} is $F_Y(y|x) = \frac{\int_{-\infty}^{y} f_{XY}(x, y')\,dy'}{f_X(x)}$
• The conditional PDF of Y given the event {X=x} is $f_Y(y|x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{f_X(x|y)\,f_Y(y)}{f_X(x)}$
• The conditional expectation of Y given X=x is $E[Y|x] = \int_{-\infty}^{\infty} y\,f_Y(y|x)\,dy$
Independence of two Random Variables
• X and Y are independent if {X ≤ x} and {Y ≤ y} are independent for every combination of x, y: $F_{XY}(x, y) = F_X(x)\,F_Y(y)$ and $f_{XY}(x, y) = f_X(x)\,f_Y(y)$
• Conditional Probability of independent R.V.s: $f_Y(y|x) = f_Y(y)$ and $f_X(x|y) = f_X(x)$
Probability Theory
• Primary references:
– Any Probability and Statistics text book (Papoulis)
– Appendix A.4 in “Pattern Classification” by Duda et al
The principles of probability theory, describing the behavior of systems with random characteristics, are of fundamental importance to pattern recognition.
Esther Levin, Dept of Computer Science, CCNY
Example 1 (Wikipedia)
• Two bowls full of cookies.
• Bowl #1 has 10 chocolate chip cookies and 30 plain cookies,
• bowl #2 has 20 of each.
• Fred picks a bowl at random, and then picks a cookie at random.
• The cookie turns out to be a plain one.
• How probable is it that Fred picked it out of bowl #1? That is, what’s the probability that Fred picked bowl #1, given that he has a plain cookie?
• Event A is that Fred picked bowl #1,
• event B is that Fred picked a plain cookie.
• Pr(A|B) ?
Example 1 - continued
Tables of occurrences and relative frequencies
It is often helpful when calculating conditional probabilities to create a simple table containing the number of occurrences of each outcome, or the relative frequencies of each outcome, for each of the independent variables. The tables below illustrate the use of this method for the cookies.

Number of cookies in each bowl by type of cookie

                   Bowl #1   Bowl #2   Totals
Chocolate Chip        10        20       30
Plain                 30        20       50
Total                 40        40       80

Relative frequency of cookies in each bowl by type of cookie

                   Bowl #1   Bowl #2   Totals
Chocolate Chip      0.125     0.250    0.375
Plain               0.375     0.250    0.625
Total               0.500     0.500    1.000

The second table is derived from the first by dividing each entry by the total number of cookies under consideration, or 80 cookies.
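The answer to the cookie question can be read off the tables. The following sketch computes it from the counts; it relies on the fact that both bowls are equally likely and hold the same number of cookies, so every one of the 80 cookies is equally likely to be picked. The dictionary layout and variable names are illustrative.

```python
from fractions import Fraction

# Cookie counts from the table: (bowl, type) -> number of cookies
counts = {
    ("bowl1", "chip"): 10, ("bowl1", "plain"): 30,
    ("bowl2", "chip"): 20, ("bowl2", "plain"): 20,
}
total = sum(counts.values())  # 80

# A = bowl #1 was picked, B = the cookie is plain
p_a_and_b = Fraction(counts[("bowl1", "plain")], total)   # 0.375
p_b = Fraction(
    sum(n for (bowl, kind), n in counts.items() if kind == "plain"), total
)                                                         # 0.625
p_a_given_b = p_a_and_b / p_b

print(p_a_given_b, float(p_a_given_b))  # 3/5 0.6
```

So Pr(A|B) = 0.375/0.625 = 0.6, up from the prior Pr(A) = 0.5.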
Example 2
• 1. Power Plant Operation.
– The variables X, Y, Z describe the state of 3 power plants (X=0 means plant X is idle).
– Denote by A the event that plant X is idle, and by B the event that at least 2 out of the three plants are working.
– What are P(A) and P(A|B), the probability that X is idle given that at least 2 out of three are working?
X Y Z P(x,y,z)
0 0 0 0.07
0 0 1 0.04
0 1 0 0.03
0 1 1 0.18
1 0 0 0.16
1 0 1 0.18
1 1 0 0.21
1 1 1 0.13
• P(A) = P(0,0,0) + P(0,0,1) + P(0,1,0) + P(0, 1, 1) = 0.07+0.04 +0.03 +0.18 =0.32
• P(B) = P(0,1,1) +P(1,0,1) + P(1,1,0)+ P(1,1,1)= 0.18+ 0.18+0.21+0.13=0.7
• P(A and B) = P(0,1,1) = 0.18
• P(A|B) = P(A and B)/P(B) = 0.18/0.7 =0.257
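The same numbers can be obtained by summing entries of the joint table. This is a small sketch; the dictionary `joint` simply transcribes the table above, and the variable names are illustrative.

```python
# Joint pmf P(x, y, z) from the table; 0 = idle, 1 = working
joint = {
    (0, 0, 0): 0.07, (0, 0, 1): 0.04, (0, 1, 0): 0.03, (0, 1, 1): 0.18,
    (1, 0, 0): 0.16, (1, 0, 1): 0.18, (1, 1, 0): 0.21, (1, 1, 1): 0.13,
}

# A: plant X is idle.  B: at least two of the three plants are working.
p_a = sum(p for state, p in joint.items() if state[0] == 0)
p_b = sum(p for state, p in joint.items() if sum(state) >= 2)
p_a_and_b = sum(
    p for state, p in joint.items() if state[0] == 0 and sum(state) >= 2
)
p_a_given_b = p_a_and_b / p_b

print(round(p_a, 2), round(p_b, 2), round(p_a_given_b, 3))  # 0.32 0.7 0.257
```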
2. Cars are assembled in four possible locations. Plant I supplies 20% of the cars; plant II, 24%; plant III, 25%; and plant IV, 31%. There is a 1-year warranty on every car.
The company collected data that shows
P(claim|plant I) = 0.05; P(claim|plant II) = 0.11;
P(claim|plant III) = 0.03; P(claim|plant IV) = 0.08;
Cars are sold at random.
An owner just submitted a claim for her car. What are the posterior probabilities that this car was made in plant I, II, III and IV?
• P(claim) = P(claim|plant I)P(plant I) +
P(claim|plant II)P(plant II) +
P(claim|plant III)P(plant III) +
P(claim|plant IV)P(plant IV) = 0.05*0.20 + 0.11*0.24 + 0.03*0.25 + 0.08*0.31 = 0.0687
• P(plant I|claim) =
= P(claim|plant I) * P(plant I)/P(claim) = 0.146
• P(plant II|claim) =
= P(claim|plant II) * P(plant II)/P(claim) = 0.384
• P(plant III|claim) =
= P(claim|plant III) * P(plant III)/P(claim) = 0.109
• P(plant IV|claim) =
= P(claim|plant IV) * P(plant IV)/P(claim) = 0.361
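The posterior computation can be reproduced in a few lines. The sketch below assumes P(claim|plant IV) = 0.08, which is the value consistent with the stated total P(claim) = 0.0687 and with the four posteriors; the dictionary names are illustrative.

```python
# Share of production per plant (priors)
prior = {"I": 0.20, "II": 0.24, "III": 0.25, "IV": 0.31}
# P(claim | plant); the plant IV value 0.08 is an assumption chosen to
# match the stated result P(claim) = 0.0687
p_claim_given = {"I": 0.05, "II": 0.11, "III": 0.03, "IV": 0.08}

# Law of total probability
p_claim = sum(p_claim_given[k] * prior[k] for k in prior)

# Bayes' rule for each plant
posterior = {k: p_claim_given[k] * prior[k] / p_claim for k in prior}

print(round(p_claim, 4))
for plant, p in posterior.items():
    print(plant, round(p, 3))
```

Plant II is the most likely origin of the car, even though plant IV supplies more cars, because its claim rate is higher.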
Example 3
3. It is known that 1% of the population suffers from a particular disease. A blood test has a 97% chance of identifying the disease in a diseased individual, but also has a 6% chance of falsely indicating that a healthy person has the disease.
a. What is the probability that a random person has a positive blood test?
b. If a blood test is positive, what’s the probability that the person has the disease?
c. If a blood test is negative, what’s the probability that the person does not have the disease?
• A is the event that a person has a disease. P(A) = 0.01; P(A’) = 0.99.
• B is the event that the test result is positive.
– P(B|A) = 0.97; P(B’|A) = 0.03;
– P(B|A’) = 0.06; P(B’|A’) = 0.94;
• (a) P(B) = P(A) P(B|A) + P(A’)P(B|A’) = 0.01*0.97 +0.99 * 0.06 = 0.0691
• (b) P(A|B) = P(B|A)*P(A)/P(B) = 0.97*0.01/0.0691 = 0.1404
• (c) P(A’|B’) = P(B’|A’)P(A’)/P(B’)= P(B’|A’)P(A’)/(1-P(B))= 0.94*0.99/(1-.0691)=0.9997
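The three answers follow from the law of total probability and Bayes' rule. This is a minimal sketch; the variable names (`sensitivity`, `false_positive_rate`, and so on) are illustrative labels for the quantities given in the problem.

```python
p_disease = 0.01            # P(A)
sensitivity = 0.97          # P(B | A)
false_positive_rate = 0.06  # P(B | A')

# (a) probability of a positive test
p_positive = (sensitivity * p_disease
              + false_positive_rate * (1.0 - p_disease))

# (b) probability of disease given a positive test
p_disease_given_pos = sensitivity * p_disease / p_positive

# (c) probability of no disease given a negative test
p_healthy_given_neg = ((1.0 - false_positive_rate) * (1.0 - p_disease)
                       / (1.0 - p_positive))

print(round(p_positive, 4))
print(round(p_disease_given_pos, 4))
print(round(p_healthy_given_neg, 4))
```

Because the disease is rare, most positive tests come from the much larger healthy population, which is why the answer to (b) is only about 14%.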
Sums of Random Variables
• z = x + y
• Var(z) = Var(x) + Var(y) + 2Cov(x,y)
Special Case: x and y are independent r.v.
• If x,y independent: Var(z) = Var(x) + Var(y)
• Distribution of z: $p_z(z) = (p_x * p_y)(z) = \int_{-\infty}^{\infty} p_x(x)\,p_y(z - x)\,dx$
Examples:
• x and y are uniform on [0,1]
– Find p(z=x+y), E(z), Var(z);
• x is uniform on [-1,1], and P(y)= 0.5 for y =0, y=10; and 0 elsewhere.
– Find p(z=x+y), E(z), Var(z);
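Both exercises can be checked with a short script. The sketch below assumes x and y are independent, uses the standard mean (a+b)/2 and variance (b−a)²/12 of a uniform variable on [a, b], and approximates the convolution integral for the first exercise with a midpoint rule; all function names and the step count are illustrative.

```python
from fractions import Fraction

def uniform_moments(a, b):
    """(mean, variance) of a uniform r.v. on [a, b]."""
    a, b = Fraction(a), Fraction(b)
    return (a + b) / 2, (b - a) ** 2 / 12

def discrete_moments(pmf):
    """(mean, variance) of a discrete r.v. given as {value: probability}."""
    m = sum(Fraction(y) * p for y, p in pmf.items())
    v = sum((Fraction(y) - m) ** 2 * p for y, p in pmf.items())
    return m, v

def sum_moments(mx_vx, my_vy):
    """Moments of z = x + y for independent x, y: means and variances add."""
    return mx_vx[0] + my_vy[0], mx_vx[1] + my_vy[1]

def pz_two_uniforms(z, steps=1000):
    """Midpoint-rule approximation of the convolution of two U[0,1] pdfs."""
    def p(u):
        return 1.0 if 0.0 <= u <= 1.0 else 0.0
    h = 1.0 / steps
    # p_x(x) = 1 on [0, 1], so only p_y(z - x) remains inside the integral
    return sum(p(z - (i + 0.5) * h) for i in range(steps)) * h

ex1 = sum_moments(uniform_moments(0, 1), uniform_moments(0, 1))
ex2 = sum_moments(
    uniform_moments(-1, 1),
    discrete_moments({0: Fraction(1, 2), 10: Fraction(1, 2)}),
)
print(ex1)  # mean 1, variance 1/6
print(ex2)  # mean 5, variance 76/3
print(pz_two_uniforms(0.5), pz_two_uniforms(1.0), pz_two_uniforms(1.5))
```

For the first exercise the density of z is triangular on [0, 2], rising as z up to z = 1 and falling as 2 − z afterwards. For the second, z is an equal mixture of uniforms on [−1, 1] and [9, 11], so its density is 0.25 on each interval.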
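Normal Distributions
• Gaussian distribution: $p(x) = N(\mu_x, \sigma_x^2) = \frac{1}{\sqrt{2\pi}\,\sigma_x}\, e^{-(x - \mu_x)^2 / (2\sigma_x^2)}$
• Mean: $E(x) = \mu_x$
• Variance: $E[(x - \mu_x)^2] = \sigma_x^2$
• Central Limit Theorem says that suitably normalized sums of many independent random variables tend toward a Normal distribution.
• Mahalanobis Distance: $r = \frac{|x - \mu_x|}{\sigma_x}$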
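Multivariate Normal Density
• x is a vector of d Gaussian variables
– $p(x) = N(\mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\, e^{-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)}$
– $\mu = E[x] = \int x\,p(x)\,dx$
– $\Sigma = E[(x - \mu)(x - \mu)^T] = \int (x - \mu)(x - \mu)^T p(x)\,dx$
• Mahalanobis Distance: $r^2 = (x - \mu)^T \Sigma^{-1} (x - \mu)$
• All conditionals and marginals are also Gaussian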
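For d = 2 the density and the Mahalanobis distance can be evaluated without any linear algebra library, since a 2×2 covariance matrix has a closed-form inverse. This is a minimal sketch restricted to the bivariate case; the function names and the example covariance are illustrative assumptions.

```python
import math

def mahalanobis_sq_2d(x, mu, cov):
    """Squared Mahalanobis distance (x-mu)^T cov^{-1} (x-mu) for d = 2."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # closed-form inverse of a 2x2 matrix
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mu[0], x[1] - mu[1])
    return sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))

def normal_pdf_2d(x, mu, cov):
    """Bivariate normal density; (2*pi)^(d/2) equals 2*pi when d = 2."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    r2 = mahalanobis_sq_2d(x, mu, cov)
    return math.exp(-0.5 * r2) / (2.0 * math.pi * math.sqrt(det))

mu = (0.0, 0.0)
cov = ((1.0, 0.0), (0.0, 4.0))   # hypothetical covariance: variances 1 and 4
print(mahalanobis_sq_2d((1.0, 2.0), mu, cov))  # one sigma along each axis
print(normal_pdf_2d(mu, mu, cov))              # peak value 1 / (2*pi*2)
```

Points with the same Mahalanobis distance lie on the same level curve of the density, which is why the level curves are ellipses.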
Bivariate Normal Densities
• Level curves are ellipses.
– The x and y widths are determined by the variances, and the eccentricity by the correlation coefficient
– Principal axes are the eigenvectors of the covariance matrix, and the width in each of these directions is the square root of the corresponding eigenvalue.
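Linear algebra
• Matrix A: $A = [a_{ij}]_{m \times n} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$
• Matrix Transpose: $B = A^T = [b_{ij}]_{n \times m}$, with $b_{ij} = a_{ji}$; $1 \le i \le n$, $1 \le j \le m$
• Vector a: $a = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}$; $a^T = [a_1, \ldots, a_n]$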
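Matrix and vector multiplication
• Matrix multiplication: $A = [a_{ij}]_{m \times p}$; $B = [b_{ij}]_{p \times n}$; $AB = C = [c_{ij}]_{m \times n}$, where $c_{ij} = row_i(A) \cdot col_j(B)$
• Outer vector product: $a = A = [a_{ij}]_{m \times 1}$; $b^T = B = [b_{ij}]_{1 \times n}$; $c = a\,b^T = AB$, an m × n matrix
• Vector-matrix product: $A = [a_{ij}]_{m \times n}$; $b = B = [b_{ij}]_{n \times 1}$; $C = Ab$, an m × 1 matrix, i.e. a vector of length m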
Inner Product
• Inner (dot) product: $a^T b = \sum_{i=1}^{n} a_i b_i$
• Length (Euclidean norm) of a vector: $\|a\| = \sqrt{a^T a} = \sqrt{\sum_{i=1}^{n} a_i^2}$
• a is normalized iff ||a|| = 1
• The angle between two n-dimensional vectors: $\cos\theta = \frac{a^T b}{\|a\|\,\|b\|}$
• An inner product is a measure of collinearity:
– a and b are orthogonal iff $a^T b = 0$
– a and b are collinear iff $|a^T b| = \|a\|\,\|b\|$
• A set of vectors is linearly independent if no vector is a linear combination of other vectors.
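Determinant and Trace
• Determinant: for $A = [a_{ij}]_{n \times n}$, $\det(A) = \sum_{j=1}^{n} a_{ij} A_{ij}$ for any $i = 1, \ldots, n$, where $A_{ij} = (-1)^{i+j} \det(M_{ij})$ and $M_{ij}$ is A with row i and column j removed
• det(AB)= det(A)det(B)
• Trace: for $A = [a_{ij}]_{n \times n}$, $tr[A] = \sum_{j=1}^{n} a_{jj}$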
Matrix Inversion
• A (n x n) is nonsingular if there exists B such that $AB = BA = I_n$; then $B = A^{-1}$
• A=[2 3; 2 2], B=[-1 3/2; 1 -1]
• A is nonsingular iff $\det(A) \neq 0$
• Pseudo-inverse for a non square matrix, provided $A^T A$ is not singular: $A^{\#} = [A^T A]^{-1} A^T$, so that $A^{\#} A = I$
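The 2 × 2 example on this slide can be verified by multiplying the two matrices in both orders. The sketch below uses exact fractions and a small hand-written `matmul` helper (an illustrative name, not a library function).

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

A = [[Fraction(2), Fraction(3)], [Fraction(2), Fraction(2)]]
B = [[Fraction(-1), Fraction(3, 2)], [Fraction(1), Fraction(-1)]]
I2 = [[1, 0], [0, 1]]

# A is nonsingular because its determinant is non-zero
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]

print(matmul(A, B) == I2, matmul(B, A) == I2, det_A)
```

Both products equal the identity, so B is the inverse of A, and det(A) = −2 confirms that A is nonsingular.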
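Eigenvectors and Eigenvalues
• $A e_j = \lambda_j e_j$, $j = 1, \ldots, n$; $\|e_j\| = 1$
• Characteristic equation: $\det[A - \lambda I_n] = 0$, an n-th order polynomial with n roots.
• $tr[A] = \sum_{j=1}^{n} \lambda_j$
• $\det[A] = \prod_{j=1}^{n} \lambda_j$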
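For a 2 × 2 matrix the characteristic equation is the quadratic λ² − tr[A] λ + det[A] = 0, so the eigenvalues and unit eigenvectors can be written in closed form. The sketch below assumes a real matrix with real eigenvalues and a non-zero off-diagonal entry A[0][1]; the function names and the example matrix are illustrative.

```python
import math

def eigen_2x2(A):
    """Eigenvalues and unit eigenvectors of a real 2x2 matrix.

    Assumes real eigenvalues and A[0][1] != 0.
    """
    a, b = A[0]
    c, d = A[1]
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4.0 * det        # discriminant of the characteristic equation
    if disc < 0:
        raise ValueError("complex eigenvalues")
    pairs = []
    for lam in ((tr + math.sqrt(disc)) / 2.0, (tr - math.sqrt(disc)) / 2.0):
        # (A - lam I) e = 0 is solved by e proportional to (b, lam - a)
        v = (b, lam - a)
        norm = math.hypot(v[0], v[1])
        pairs.append((lam, (v[0] / norm, v[1] / norm)))
    return pairs

A = [[2.0, 1.0], [1.0, 2.0]]          # hypothetical symmetric example
(lam1, e1), (lam2, e2) = eigen_2x2(A)
print(lam1, lam2)                     # 3.0 1.0
print(lam1 + lam2, lam1 * lam2)       # trace 4.0 and determinant 3.0
```

The eigenvectors come out as (1, 1)/√2 and (1, −1)/√2, which are orthogonal, as expected for a symmetric matrix.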