
06/19/14 EC3410.SuFY14/MPF - Section II 1

II - Random Processes - Applications to Signal & Information Processing

• [p. 3] Random signal/sequence definition
• [p. 6] Signal mean, variance, autocorrelation & autocovariance sequence, normalized cross-correlation sequence
• [p. 16] Statistical characterization of random signals
 – I.I.D. random process
 – Stationarity
 – Wide-sense stationarity (wss)
 – Jointly wide-sense stationarity (jointly wss)
 – Correlation & cross-correlation for stationary RPs
 – Signal average
 – Ergodicity
 – Concept of white noise, colored noise, Bernoulli process, random walk
• [p. 51] Application: MA processes: definitions and pdf properties
• [p. 57] Random process properties
• [p. 66] Multiple random processes: joint properties
• [p. 72] Application to data analysis – how to assess signal stationarity
• [p. 76] Application to data analysis – how to check the IID assumption
 – Autocorrelation and lag plot
• [p. 89] Application: target range detection
• [p. 91] Introduction to the spectrogram
• [p. 95] Application: gas furnace reaction time
• [p. 98] Application: evaluating correlation between random signals
• [p. 101] Application: evaluating correlation status between random signals
• [p. 102] Application: detection of the periodicity of stationary signals in noisy environments
• [p. 104] Correlation matrix properties for a stationary process
• [p. 108] How to estimate correlation lags; biased/unbiased estimator issues
• [p. 116] Frequency domain description for a stationary process
 – Power spectral density (PSD) definition & properties
• [p. 127] Principal Component Analysis (PCA, DKLT)
 – Applications to biometrics (face recognition)
 – Applications to network traffic flow anomaly detection
• [p. 158] Appendices
• [p. 184] References

06/19/14 EC3410.SuFY14/MPF - Section II 2

Examples
• [p. 7] Example 1
• [p. 9] Example 2
• [p. 11] Example 3
• [p. 28] Example 4
• [p. 33] Example 5
• [p. 35] Example 6
• [p. 37] Example 7
• [p. 39] Example 8
• [p. 50] Example 9
• [p. 58] Example 10
• [p. 60] Example 11
• [p. 70] Example 12
• [p. 75] Example 13; Pack2Data1
• [p. 87] Example 14
• [p. 88] Example 15; Pack2Data3
• [p. 94] Example 16; Pack2Data2
• [p. 98] Example 17
• [p. 101] Example 18; Pack2Data6
• [p. 102] Example 19
• [p. 103] Example 20; Pack2Data4
• [p. 107] Example 21
• [p. 114] Example 22
• [p. 117] Example 23
• [p. 119] Example 24
• [p. 120] Example 25
• [p. 123] Example 26

06/19/14 EC3410.SuFY14/MPF - Section II 3

Random Signal/Sequence - definitions

A random process (RP) is a mapping that assigns a function x(t) = x(t, ξ) (continuous signal case) or x(n) = x(nTs, ξ) (discrete signal case) to each outcome ξ of the random experiment.

[Figure: realizations of a continuous random signal/process, x(t, ξ1), x(t, ξ2), x(t, ξ3) versus t, and of a discrete random signal/process, x(n, ξ1), x(n, ξ2), x(n, ξ3) versus n; one waveform per experiment outcome ξi.]

06/19/14 EC3410.SuFY14/MPF - Section II 4

Random Signal/Sequence - definitions, cont'

• Consider the sequence x(n) = x(n, ξ): for a fixed n, x(n) is a random variable (RV).
• x(n) can be infinite dimensional.
• x(n, ξ) for a fixed outcome ξ is called a realization (or trial) of the random process.


06/19/14 EC3410.SuFY14/MPF - Section II 5

[Figure: realizations x(n, ξ1), x(n, ξ2), x(n, ξ3) of the discrete random signal versus n.]

Example (discrete random signal): x(n, ξ) = ξ cos(πn/10), where ξ ~ U[0, 1].

06/19/14 EC3410.SuFY14/MPF - Section II 6

Signal mean value (ensemble average): $m_x(n) = E\{x(n)\}$

Signal variance: $\sigma_x^2(n) = E\{|x(n) - m_x(n)|^2\} = E\{|x(n)|^2\} - |m_x(n)|^2$

Discrete signal case: samples x[n] at times $n_1$ and $n_2$; lag $k = n_2 - n_1$. Note: the lag is dimensionless!

Continuous signal case: x(t) at times $t_1$ and $t_2$; time lag $\tau = t_2 - t_1$ (in seconds).

06/19/14 EC3410.SuFY14/MPF - Section II 7

Example 1: x(n, ξ) = ξ cos(πn/5), where ξ ~ U[0, 1]. Compute the process mean and variance.

06/19/14 EC3410.SuFY14/MPF - Section II 8

Signal autocorrelation sequence:

$R_x(n_1, n_2) = R_{xx}(n_1, n_2) = E\{x(n_1)\, x^*(n_2)\}$

It measures the dependency between values of the process at two different times, and allows one to evaluate: 1) how quickly a random signal changes with respect to time, 2) the amount of memory a signal may have, 3) whether the process has a periodic component and what the expected frequency might be, etc.

06/19/14 EC3410.SuFY14/MPF - Section II 9

Example 2: Let x(n) be a real-valued process defined as x(n, ξ) = ξ, where ξ is a RV with mean 0 and variance $\sigma_x^2$. Compute $R_x(k, n)$.

06/19/14 EC3410.SuFY14/MPF - Section II 10

06/19/14 EC3410.SuFY14/MPF - Section II 11

Example 3: x(n, ξ) = cos(πn/5 + ξ), where ξ ~ U[0, 2π]. Compute $R_x(n_1, n_2)$.

06/19/14 EC3410.SuFY14/MPF - Section II 12

Signal autocovariance function (removes the impact of the process mean):

$C_x(n_1, n_2) = E\{(x(n_1) - m_x(n_1))\,(x(n_2) - m_x(n_2))^*\} = R_x(n_1, n_2) - m_x(n_1)\, m_x^*(n_2)$

Signal normalized correlation function (removes the impact of the process mean and normalizes the maximum value to 1):

$\rho_x(n_1, n_2) = \frac{C_x(n_1, n_2)}{\sigma_x(n_1)\, \sigma_x(n_2)}, \qquad |\rho_x(n_1, n_2)| \le 1$

06/19/14 EC3410.SuFY14/MPF - Section II 13

Signal cross-correlation function:

$R_{xy}(n_1, n_2) = E\{x(n_1)\, y^*(n_2)\}$

• Measures the dependency between values of two processes at two different times.
• Allows one to evaluate whether two processes are related in some linear fashion, or how well their dependence can be approximated by a linear relationship.
• Will NOT capture nonlinear dependence (as with the correlation coefficient defined earlier for random variables).
• Warning: correlation does NOT imply causation.

06/19/14 EC3410.SuFY14/MPF - Section II 14

Signal cross-covariance function:

$C_{xy}(n_1, n_2) = R_{xy}(n_1, n_2) - m_x(n_1)\, m_y^*(n_2)$

• Similar to the cross-correlation function: it measures the dependency between values of two processes at two different times, but it also
• removes the impact of the mean values.

Note: unless there is a good reason to keep the signal means, it is best to remove them or to use covariance-based expressions!

06/19/14 EC3410.SuFY14/MPF - Section II 15

Normalized cross-correlation function:

$\rho_{xy}(n_1, n_2) = \frac{C_{xy}(n_1, n_2)}{\sigma_x(n_1)\, \sigma_y(n_2)}, \qquad |\rho_{xy}(n_1, n_2)| \le 1$

06/19/14 EC3410.SuFY14/MPF - Section II 16

Statistical Characterization of Random Signals

• Random signals are characterized by the joint distribution (or density) of their samples:

$F_x(x_1, x_2, \ldots, x_k; n_1, \ldots, n_k) = \Pr[x(n_1) \le x_1, \ldots, x(n_k) \le x_k]$

• F(·) is highly complex to compute, and is difficult or impossible to obtain in practice.

06/19/14 EC3410.SuFY14/MPF - Section II 17

Independent, Identically Distributed (I.I.D.) Random Process:

A random process is said to be:

• an independent process (i.e., independent of itself at earlier and/or later times) if for any time indices $n_k$:

$f_x(x_1, x_2, \ldots, x_k; n_1, \ldots, n_k) = f_1(x_1; n_1) \cdots f_k(x_k; n_k)$

• an IID process if, in addition, the RVs obtained for all time indices have the same pdf $f_x(x)$.

Note: I.I.D. processes have no memory (no future value depends on past values); they can be viewed as building blocks for more realistic random signals.

• Mean of an I.I.D. process: $m_x(n) = E\{x(n)\} =$

06/19/14 EC3410.SuFY14/MPF - Section II 18

Independent, Identically Distributed (I.I.D.) RP, cont'

Autocovariance of an IID process:

$C_x(n_1, n_2) = E\{(x(n_1) - m_x(n_1))\,(x(n_2) - m_x(n_2))^*\} = \begin{cases} E\{x(n_1) - m_x(n_1)\}\, E\{(x(n_2) - m_x(n_2))^*\} = 0, & n_1 \ne n_2 \\ E\{|x(n_1) - m_x(n_1)|^2\} = \sigma_x^2, & n_1 = n_2 \end{cases}$

i.e., $C_x(n_1, n_2) = \sigma_x^2\, \delta(n_1 - n_2)$.

Autocorrelation of an IID process:

$R_x(n_1, n_2) =$

06/19/14 EC3410.SuFY14/MPF - Section II 19

$x(n, \xi) = \xi + 0.05\,n + w(n), \qquad \xi \sim N(0,1),\; w(n) \sim N(0,1)$

I.I.D. process?

[Figure: $E[x(n, \xi)]$ plotted versus time, n = 0 to 50.]

06/19/14 EC3410.SuFY14/MPF - Section II 20

$x(n, \xi) = \xi + 0.05\,n + w(n), \qquad \xi \sim N(0,1),\; w(n) \sim N(0,1)$

I.I.D. process?

[Figure: four trials of the RP x(n) = ξ + 0.05 n + w(n), each plotted versus time, n = 0 to 50.]

06/19/14 EC3410.SuFY14/MPF - Section II 21

Data Analysis Application – What does the I.I.D. assumption mean when talking about a finite-time trial of the random signal? From [7, p. 17]

• IID is a property of the RP, not of a single trial.
• Saying that a signal is IID means that we can consider the collected signal set $\{x_i\}_{i=1,\ldots,N}$ as obtained from a sequence of random variables $\{X_i\}_{i=1,\ldots,N}$, where the RVs are independent and have the same pdf.
• We will see later how this assumption can be verified for data.
• Do we need the I.I.D. assumption? No, but it is very convenient and greatly simplifies CI derivations. Non-I.I.D. examples are discussed in [7, Sect. 3].

06/19/14 EC3410.SuFY14/MPF - Section II 22

Stationarity Concept:

Definition: a RP is said to be stationary of order N if any joint density or distribution function depends only on the spacing between samples, not on where in the sequence the samples occur:

$f_x(x_1, \ldots, x_N; n_1, \ldots, n_N) = f_x(x_1, \ldots, x_N; n_1 + k, \ldots, n_N + k)$ for any k and any joint pdf.

• If x(n) is stationary for order N = 1: $f_x(x; n) = f_x(x; n + k)$, i.e., the pdf is identical for all time indices n.
• Stationary up to order 2 → called wide-sense stationary (WSS).
• If x(n) is stationary for all orders N = 1, 2, …, x(n) is said to be strict-sense stationary.

06/19/14 EC3410.SuFY14/MPF - Section II 23

Stationarity of order N = 1 - Physical interpretation

[Figure: P realizations x(n, ξ1), …, x(n, ξP), with a threshold level x1 marked at times n1 and n2.] [11]

The experiment is performed P times, which leads to P time sequences. How to compute $F_x(x_1; n_1) = \Pr[x(n_1) \le x_1]$ (the probability that the functions x(n, ξ) do not exceed $x_1$ at time $n_1$):
• Select values for $x_1$ and $n_1$.
• Count the number of trials K for which $x(n_1) \le x_1$; then $F_x(x_1; n_1) = \Pr[x(n_1) \le x_1] = K/P$.

Stationarity of order 1 means $F_x(x_1; n_1) = F_x(x_1; n_2) = K/P$, i.e., $\Pr[x(n_1) \le x_1] = \Pr[x(n_2) \le x_1]$.

06/19/14 EC3410.SuFY14/MPF - Section II 24

Stationarity of order N = 2 - Physical interpretation

[Figure: P realizations x(n, ξ1), …, x(n, ξP), with threshold levels x1 at time n1 and x2 at time n2.] [11]

The experiment is performed P times, which leads to P time sequences. How to compute $F_x(x_1, x_2; n_1, n_2) = \Pr[x(n_1) \le x_1, x(n_2) \le x_2]$ (the probability that the functions x(n, ξ) do not exceed $x_1$ at time $n_1$ and $x_2$ at time $n_2$):
• Select values for $x_1$, $x_2$, $n_1$, $n_2$.
• Count the number of trials K for which $x(n_1) \le x_1$ and $x(n_2) \le x_2$; then $F_x(x_1, x_2; n_1, n_2) = K/P$.

Stationarity of order 2 means $F_x(x_1, x_2; n_1, n_2) = F_x(x_1, x_2; n_1 + N, n_2 + N)$, i.e., $\Pr[x(n_1) \le x_1, x(n_2) \le x_2] = \Pr[x(n_1 + N) \le x_1, x(n_2 + N) \le x_2]$.

06/19/14 EC3410.SuFY14/MPF - Section II 25

Wide-Sense Stationarity Concept

Definition: a random signal x(n) is called wide-sense stationary (WSS) if

(1) the mean is a constant independent of n: $E\{x(n)\} = m_x(n) = m_x$

(2) the autocorrelation depends only on the distance $k = n_1 - n_2$:

$R_x(n_1, n_2) = R_x(n_1 - n_2) = R_x(k), \qquad R_x(k) = E\{x(n)\, x^*(n-k)\}$

Consequences:

(1) The correlation sequence is defined with one index only: $R_x(k)$, which measures the amount of "predictability" of the RP (linked to the memory present in the process).

(2) The variance is a constant independent of n:

$\sigma_x^2(n) = E\{|x(n) - m_x(n)|^2\} = E\{|x(n) - m_x|^2\} = R_x(0) - |m_x|^2 = \sigma_x^2$

06/19/14 EC3410.SuFY14/MPF - Section II 26

Wide-Sense Stationarity Concept, cont'

(3) The autocovariance also depends only on the time lag $k = n_1 - n_2$:

$C_x(n_1, n_2) = C_x(n_1 - n_2) = C_x(k)$

$C_x(k) = E\{(x(n) - m_x)\,(x(n-k) - m_x)^*\} = R_x(k) - |m_x|^2$

06/19/14 EC3410.SuFY14/MPF - Section II 27

Correlation/Covariance Function Properties for a wss x(n)

(1) Conjugate symmetry: $R_x(k) = R_x^*(-k)$, $C_x(k) = C_x^*(-k)$

(2) Positive semi-definite property: for any N and any vector $\{a(n)\}_{n=1}^{N}$ we have

$\sum_{n_1=1}^{N} \sum_{n_0=1}^{N} a(n_1)\, R_x(n_1 - n_0)\, a^*(n_0) \ge 0$

(a useful consequence)

(3) $R_x(k)$ is maximum at k = 0 and $R_x(0) > 0$. (Can we have $R_x(0) = 0$? Does (3) hold for $C_x(k)$?)

06/19/14 EC3410.SuFY14/MPF - Section II 28

06/19/14 EC3410.SuFY14/MPF - Section II 29

Example 4: A RP consists of 4 possible sample functions occurring with equal likelihood:

$x_1(n) = 1, \quad x_2(n) = -1, \quad x_3(n) = \cos(0.2\pi n), \quad x_4(n) = \sin(0.2\pi n)$

1) Find the mean and correlation function. 2) Is the RP wss?

06/19/14 EC3410.SuFY14/MPF - Section II 30

06/19/14 EC3410.SuFY14/MPF - Section II 31

Wide-Sense Stationarity, cont'

• The coherence (normalized covariance, also called the normalized correlation coefficient) function for a wss process is defined as

$\rho_x(k) = \frac{C_x(k)}{\sigma_x^2}, \qquad |\rho_x(k)| \le 1$

It measures the predictability of a RP (easier to judge than by using $R_x(k)$) since it is a bounded quantity.

06/19/14 EC3410.SuFY14/MPF - Section II 32

Wide-Sense Stationarity, cont'

Definition: x(n) and y(n) are said to be wide-sense jointly stationary if:

1) x(n) and y(n) are each wss

2) $R_{xy}(n_1, n_0) = R_{xy}(n_1 - n_0)$

Consequence: when x(n) and y(n) are w.s. jointly stationary:

$R_{xy}(n_1, n_0) = E\{x(n_1)\, y^*(n_0)\} = R_{xy}(k)\big|_{k = n_1 - n_0}, \qquad R_{xy}(k) = E\{x(n)\, y^*(n-k)\}$

$C_{xy}(n_1, n_0) = C_{xy}(n_1 - n_0) = C_{xy}(k) = E\{(x(n) - m_x)(y(n-k) - m_y)^*\} = R_{xy}(k) - m_x\, m_y^*$

06/19/14 EC3410.SuFY14/MPF - Section II 33

Wide-Sense Stationarity, cont'

• Cross-correlation/covariance properties:

Rxy(k) =

Cxy(k) =

• The normalized cross-covariance is defined as

$\rho_{xy}(k) = \frac{C_{xy}(k)}{\sigma_x\, \sigma_y}, \qquad |\rho_{xy}(k)| \le 1$

It measures the amount of common information between two RPs delayed relative to each other by time lag k. This concept is used as the basis for radar detection schemes (more later…).

06/19/14 EC3410.SuFY14/MPF - Section II 34

Example 5: x(n, ξ) = exp[j(πn/5 + ξ)], where ξ ~ U[0, 2π];
y(n, ξ') = exp[j(πn/5 + ξ')], where ξ' ~ U[0, π].

1) Compute $R_{xy}(n_1, n_2)$, assuming ξ and ξ' are independent. 2) Are the processes jointly wss?

06/19/14 EC3410.SuFY14/MPF - Section II 35

06/19/14 EC3410.SuFY14/MPF - Section II 36

Example 6

Assume you are given the zero-mean wss random processes x(n) and y(n) defined as: y(n)=x(n-D)+w(n), where w(n) is zero mean and wss and independent of x(n). Compute Rxy(k)

06/19/14 EC3410.SuFY14/MPF - Section II 37

Ergodicity:

• In many applications only one realization of a RP is available.
• In general, one single member doesn't provide information about the statistics of the process, except when the process is stationary and ergodic: then statistical information CAN be derived from one realization of the RP, i.e., from time averages.

Signal (time) average: $\langle x(n) \rangle = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{n=-N}^{N} x(n)$

Def: a RP is called ergodic if all ensemble averages equal all corresponding time averages.

Def: a RP is said to be ergodic in the mean if: $E[x(n, \xi)] = \langle x(n, \xi) \rangle$

Def: a wss RP is said to be ergodic in correlation at lag k if:

$R_x(k) = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{n=-N}^{N} x(n, \xi)\, x^*(n-k, \xi)$
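As a numerical illustration of ergodicity in the mean, here is a minimal MATLAB sketch (trial count and record length are assumed values) comparing the ensemble average across trials with the time average of a single trial for zero-mean white Gaussian noise:

% Minimal sketch, assumed parameters: compare ensemble vs. time averages
% for zero-mean white Gaussian noise, which is ergodic in the mean.
P = 500;                    % number of realizations (trials), assumed
N = 1000;                   % samples per realization, assumed
X = randn(P, N);            % each row is one realization x(n, xi_p)
ensembleMean = mean(X, 1);  % estimate of E[x(n)], one value per time n
timeMean = mean(X, 2);      % time average <x(n)>, one value per trial
fprintf('ensemble mean at n = 100: %.3f\n', ensembleMean(100));
fprintf('time mean of trial 1    : %.3f\n', timeMean(1));
% Both estimates are close to the true mean (0). For the process of
% Example 8 below (x(t) = +K or -K for all t), the time average of one
% trial (+K or -K) would NOT match the ensemble mean (0): that process
% is stationary but not ergodic.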

06/19/14 EC3410.SuFY14/MPF - Section II 38

Ergodicity, cont'

A process can be stationary and NOT ergodic.

Example 7: Assume a RP x(n, ξ) that is a dc voltage waveform, where the pdf of the voltage is U[0, 5].

1) Plot several possible trials of the RP. 2) Is the process wss? 3) Is the process ergodic in the mean?

06/19/14 EC3410.SuFY14/MPF - Section II 39

06/19/14 EC3410.SuFY14/MPF - Section II 40

Example 8: Consider the RP x(t) shown below. Check whether the process is 1) ergodic in the mean, 2) ergodic in correlation, 3) wss.

$x_1(t) = K$ with probability $P_1 = 1/2$; $x_2(t) = -K$ with probability $P_2 = 1/2$.

06/19/14 EC3410.SuFY14/MPF - Section II 41

06/19/14 EC3410.SuFY14/MPF - Section II 42

RP Example - White noise

Definition: a random sequence w(n) is called a white noise process with mean 0 and variance $\sigma_w^2$ iff

$E\{w(n)\} = 0 \quad \text{and} \quad R_w(k) = \sigma_w^2\, \delta(k)$

Notes:
1) All frequencies contribute the same amount (as in the case of white light, hence the name "white noise").
2) There is NO constraint on the pdf. If the pdf of w(n) is Gaussian, the process is called "white Gaussian noise".
3) In communication systems applications, thermal noise at a receiver is modeled as a process w(n) with autocorrelation $R_w(k) = (N_0/2)\, \delta(k)$, where $N_0 = KT$, $K = 1.38 \times 10^{-23}$ Joules/kelvin (Boltzmann's constant), and T is the receiver noise temperature in kelvin.

06/19/14 EC3410.SuFY14/MPF - Section II 43

x = randn(1200,1);                 % white Gaussian noise
[rx,lags] = xcorr(x,50,'biased');  % biased correlation estimate
figure
subplot(211), plot(x), title('White Gaussian noise')
xlabel('Sample number')
subplot(212), plot(lags,rx)        % correlation sequence vs. lag

Assume the process is ergodic. Why do we need ergodicity?

06/19/14 EC3410.SuFY14/MPF - Section II 44

RP Example - Colored noise

Definition: a non-periodic random noise sequence w(n) is called a colored noise process if $R_w(k)$ is not zero for some $k \ne 0$ and $\lim_{k \to \infty} R_w(k) = 0$.

Notes:
1) All frequencies do NOT contribute the same amount (as was the case for white noise).
2) There is NO constraint on the pdf. If the pdf of w(n) is Gaussian, the process is called "colored Gaussian noise".
3) Colored noise can easily be generated by passing white noise through a filter.

06/19/14 EC3410.SuFY14/MPF - Section II 45

x = randn(1200,1);
h = (1/30)*ones(30,1);
y = filter(h,1,x);                 % basic averaging filter colors the noise
[ry,lags] = xcorr(y,50,'biased');
subplot(211), plot(y), title('Colored Gaussian noise')
xlabel('Sample number')
subplot(212), plot(lags,ry)
title('Correlation sequence'), xlabel('Lag number')

Assume the process is ergodic. How can we check whether the correlation plot makes sense?

06/19/14 EC3410.SuFY14/MPF - Section II 46

06/19/14 EC3410.SuFY14/MPF - Section II 47

RP Example - Bernoulli Process (a binary sequence with independent samples)

x[n] = +1 with probability P, and x[n] = −1 with probability (1 − P); for P = 1/2 the process is called binary white noise.

[Figure: a sample Bernoulli sequence x[n].]

• Probabilistic description: Pr(x(0) = 1, x(1) = 1, x(2) = −1) =

Mean = ? Variance = ?

06/19/14 EC3410.SuFY14/MPF - Section II 48

Random Walk Random Process

• Consider a sequence of I.I.D. RVs $\{X_i\}$.

• Define

$S(n) = \sum_{k=0}^{n} X(k), \quad n = 0, 1, \ldots$

$S(n) = S(n-1) + X(n)$ ← sum process

$M(n) = (1/n)\, S(n)$ ← mean process

• The process S(n) is called a simple random walk when $X_i = \pm 1$ (Bernoulli RVs).

• When P = 1/2 and $X_i = \pm 1$ (i.e., for a Bernoulli process): discrete Wiener process. It turns out that $E[x(n)] = 0$ and $\mathrm{var}(x(n)) = n + 1$.

Is this a wss process?

06/19/14 EC3410.SuFY14/MPF - Section II 49

RP Example - Random Walk, cont'

Sequence of I.I.D. RVs $\{X_i\}$ and $S(n) = X_1 + X_2 + \cdots + X_n$, n = 1, 2, …

• Property: S(n) has independent increments over non-overlapping time intervals. For $n_1 < n_2 < n_3$:

$S(n_2) - S(n_1) = X_{n_1+1} + \cdots + X_{n_2}$

$S(n_3) - S(n_2) = X_{n_2+1} + \cdots + X_{n_3}$

and these increments are independent.

06/19/14 EC3410.SuFY14/MPF - Section II 50

Random Walk, cont' - General Character

• Tends to have long runs of positive and negative values.
• The length of runs increases with increasing time; the local behavior remains the same.

s = rand(1,10000);               % uniform draws
r = cumsum(((s > 0.5)*2) - 1);   % map to +/-1 steps and accumulate

Random walk applications are found in economics (to model share prices), physics (to model the random movement of molecules in liquids and gases), vision science (to describe eye movements), and psychology (to explain the relation between the time needed to make a decision and the probability that a certain decision will be made).

06/19/14 EC3410.SuFY14/MPF - Section II 51

Example 9: You are given the simple random walk process S(n). Compute P[S(n) = +1] after 3 steps.

06/19/14 EC3410.SuFY14/MPF - Section II 52

RP Example – Moving Average (MA) Random Process

$x(n) = \sum_{p=0}^{N} a_p\, s(n-p)$, where s(n) is zero-mean white noise and ergodic.

• Compute $m_x(k)$ and $R_x(n_0, n_1)$. • Is x(n) wss? (A simulation sketch follows below.)
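As a numerical check on these questions, the following minimal MATLAB sketch (the MA coefficients a_p are chosen arbitrarily for illustration) generates an MA process and estimates its mean and correlation sequence:

% Minimal sketch, assumed coefficients: generate x(n) = sum_p a_p s(n-p)
% from zero-mean white noise and estimate its mean and correlation.
a = [1 0.5 0.3];                     % hypothetical MA coefficients a_0..a_2
s = randn(1e5, 1);                   % zero-mean white Gaussian noise s(n)
x = filter(a, 1, s);                 % the MA filter implements the sum
fprintf('estimated mean: %.4f\n', mean(x));   % near 0
[rx, lags] = xcorr(x, 10, 'biased'); % correlation estimate, lags -10..10
stem(lags, rx), xlabel('Lag k'), ylabel('R_x(k) estimate')
% The estimated R_x is nonzero only for |k| <= 2 (the MA order) and
% depends on the lag only, consistent with x(n) being wss.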

06/19/14 EC3410.SuFY14/MPF - Section II 53

06/19/14 EC3410.SuFY14/MPF - Section II 54

Data Analysis Application – pdf properties of MA Random Processes

$x(n) = \sum_{p=0}^{N} a_p\, s(n-p)$, where s(n) is zero-mean white noise.

Can you say anything about the pdf of x(n)?

06/19/14 EC3410.SuFY14/MPF - Section II 55 55

Recall Lindeberg's Central Limit Theorem (CLT) - NotePack 1

Describes the limiting behavior of the distribution function of a sum of independent random variables with finite mean and variance (Feller's condition); note that "identically distributed" is no longer required.

$s_n = \sum_{i=1}^{N} x_i, \quad \text{with} \quad m_{s_n} = E[s_n] = \sum_{i=1}^{N} m_{x_i}, \quad \sigma_{s_n}^2 = \mathrm{var}[s_n] = \sum_{i=1}^{N} \sigma_{x_i}^2$

Provided $\max_{k=1,\ldots,N} \dfrac{\sigma_{x_k}^2}{\sigma_{s_n}^2} \to 0$ as $n \to \infty$, then $s_n \sim N(m_{s_n}, \sigma_{s_n}^2)$ when n is large. [N(a, b) denotes mean a, variance b.]

These results can be applied to filter outputs…

06/19/14 EC3410.SuFY14/MPF - Section II 56

06/19/14 EC3410.SuFY14/MPF - Section II 57

06/19/14 EC3410.SuFY14/MPF - Section II 58

Random Process Properties

• If x(n) is periodic: $x(n) = x(n + N)$.

• Mean: $m_x(n) = E\{x(n)\} = E\{x(n + pN)\} = m_x(n + pN)$

• Correlation/covariance for a wss RP: $R_x(n_1, n_2) = R_x(n_1 - n_2)$, with

$R_x(k) = R_x(k + pN), \qquad C_x(k) = C_x(k + pN)$

The mean and correlation/covariance functions of a periodic process are also periodic, with the same period.

06/19/14 EC3410.SuFY14/MPF - Section II 59

Example 10:
1) x(n) = A exp(j(ωn + θ)), θ ~ U[0, 2π]. Compute $R_x(k)$ and $m_x(n)$.
2) x(n) = A cos(ωn + θ), θ ~ U[0, 2π]. Compute $R_x(k)$ and $m_x(n)$.

06/19/14 EC3410.SuFY14/MPF - Section II 60

06/19/14 EC3410.SuFY14/MPF - Section II 61

Example 11 y(n)=s(n)+w(n), where s(n)=A cos(ωn + θ), θ ~ U [0,2π], w(n) zero-mean white wss noise, w(n) & s(n) are independent. Compute Ry(k) and my(n)

06/19/14 EC3410.SuFY14/MPF - Section II 62

06/19/14 EC3410.SuFY14/MPF - Section II 63

Uncorrelated Random Process

A RP is said to be uncorrelated if

$C_x(n_1, n_2) = E\{(x(n_1) - m_x(n_1))\,(x(n_2) - m_x(n_2))^*\} = \begin{cases} E\{|x(n_1) - m_x(n_1)|^2\} = \sigma_x^2(n_1), & n_1 = n_2 \\ 0, & n_1 \ne n_2 \end{cases}$

i.e., $C_x(n_1, n_2) = \sigma_x^2(n_1)\, \delta(n_1 - n_2)$, or equivalently if

$R_x(n_1, n_2) = C_x(n_1, n_2) + m_x(n_1)\, m_x^*(n_2) = \sigma_x^2(n_1)\, \delta(n_1 - n_2) + m_x(n_1)\, m_x^*(n_2)$

06/19/14 EC3410.SuFY14/MPF - Section II 64

Uncorrelated wss Random Process

A wss RP is said to be uncorrelated if

$C_x(k) = E\{(x(n) - m_x)\,(x(n-k) - m_x)^*\} = \begin{cases} E\{|x(n) - m_x|^2\} = \sigma_x^2, & k = 0 \\ 0, & k \ne 0 \end{cases}$

i.e., $C_x(k) = \sigma_x^2\, \delta(k)$, or equivalently

$R_x(k) = C_x(k) + |m_x|^2 = \sigma_x^2\, \delta(k) + |m_x|^2$

06/19/14 EC3410.SuFY14/MPF - Section II 65

Cyclostationary Process

A RP is said to be wide-sense (w.s.) cyclostationary if ∃ N such that

$m_x(n) = m_x(n + N)\ \forall n, \qquad R_x(n_1, n_2) = R_x(n_1 + N, n_2 + N)$

Signal statistics vary periodically with time, which leads to correlation between areas of the signal spectrum. Note: the signal itself is NOT necessarily periodic.

Examples of w.s. cyclostationary processes:
* DSB-AM signal: $x(n) = A(n) \cos(\omega_0 n)$, where A(n) is a stationary RP and $\omega_0$ is a constant
* OFDM signals

06/19/14 EC3410.SuFY14/MPF - Section II 66

Cyclostationary properties, cont'

• The cyclostationary property is taken advantage of in cognitive radio (CR) detection applications:
 - pilot symbols used in OFDM applications exhibit periodic behavior, resulting in cyclostationary signal behavior
 - noise usually doesn't exhibit periodic behavior
 - the difference between signal and noise behavior is exploited to extract OFDM signal characteristics

06/19/14 EC3410.SuFY14/MPF - Section II 67

Multiple Random Processes Joint Properties

• Two RPs x(n) and y(n) are said to be statistically independent (of each other) if for all time indices $n_1$ and $n_2$:

$f_{xy}(x, y; n_1, n_2) = f_x(x; n_1)\, f_y(y; n_2)$

or, equivalently, for each choice of $n_1$ and $n_2$, the RVs $x(n_1)$ and $y(n_2)$ are independent.

• Two RPs x(n) and y(n) are said to be uncorrelated (of each other) if for all values $n_1$ and $n_2$:

$R_{xy}(n_1, n_2) = E\{x(n_1)\, y^*(n_2)\} = E\{x(n_1)\}\, E\{y^*(n_2)\}$

which is equivalent to: $C_{xy}(n_1, n_2) = E\{(x(n_1) - m_x(n_1))(y(n_2) - m_y(n_2))^*\} = 0$

• Two RPs x(n) and y(n) independent of each other ⇒ uncorrelated.

06/19/14 EC3410.SuFY14/MPF - Section II 68

Multiple Random Processes Joint Properties, cont'

• Two RPs x(n) and y(n) are said to be jointly Gaussian RPs if for any choice of $n_i$ and $m_i$, the random vectors $[x(n_1), \ldots, x(n_n)]$ and $[y(m_1), \ldots, y(m_n)]$ are jointly Gaussian.

• If x(n) and y(n) are jointly Gaussian and uncorrelated RPs ⇒ independent.

• Two RPs x(n) and y(n) are said to be orthogonal if for all values $n_1$ and $n_2$:

$R_{xy}(n_1, n_2) = E\{x(n_1)\, y^*(n_2)\} = 0$

06/19/14 EC3410.SuFY14/MPF - Section II 69

Multiple wss Random Processes Joint Properties

• Two wss RPs x(n) and y(n) are said to be statistically independent (of each other) if for all time lags k and times n:

$f_{xy}(x, y; n, n+k) = f_x(x; n)\, f_y(y; n+k)$

or, equivalently, for each choice of n and k, the RVs $x(n)$ and $y(n+k)$ are independent.

• Two wss RPs x(n) and y(n) are said to be uncorrelated (of each other) if for all time lags k:

$R_{xy}(k) = E\{x(n)\, y^*(n-k)\} = E\{x(n)\}\, E\{y^*(n-k)\} = m_x\, m_y^*$

which is equivalent to: $C_{xy}(k) = E\{(x(n) - m_x)(y(n-k) - m_y)^*\} = 0$

• Two wss RPs x(n) and y(n) are said to be orthogonal if for all time lags k:

$R_{xy}(k) = E\{x(n)\, y^*(n-k)\} = 0$

06/19/14 EC3410.SuFY14/MPF - Section II 70

Multiple wss Random Processes Joint Properties, cont'

Recall that:
• Two wss RPs x(n) and y(n) are uncorrelated (of each other) if, for all time lags k, $R_{xy}(k) = m_x m_y^*$ or $C_{xy}(k) = 0$.
• Two wss RPs x(n) and y(n) are orthogonal if, for all time lags k, $R_{xy}(k) = 0$.

Consequences:
• Two wss RPs independent of each other ⇒ uncorrelated; the converse does not hold in general (unless both RPs are Gaussian).
• Two wss RPs that are orthogonal, with at least one RP zero mean ⇒ uncorrelated.

06/19/14 EC3410.SuFY14/MPF - Section II 71

Example 12: let x(n) and y(n) be RPs generated as:

x(n) = αn, y(n)=α2n, α~N(0,1)

1) find the mean mx(n), my(n)

2) are x(n), y(n) wss?

3) are x(n), y(n) uncorrelated RPs?, uncorrelated of each other?

4) are x(n), y(n) independent RPs?, independent of each other?

06/19/14 EC3410.SuFY14/MPF - Section II 72

06/19/14 EC3410.SuFY14/MPF - Section II 73

Data Analysis Application – How do we assess whether data is stationary?

1) Consider the environment that produced it.
2) Check whether basic properties of the signal change with time or not: compute and track changes in $m_x(t)$ and $\mathrm{var}_x(t)$.

Investigate changes in mean or variance: if changes occur, the process is not wss. How do we decide there is a change (visually or via statistical tests)?
- Visually.
- Statistical tests: two-sample tests for equal means and equal variances over small non-overlapping data blocks (independence between samples is required for the tests).

06/19/14 EC3410.SuFY14/MPF - Section II 74

[Figures: a wss data example and a non-wss data example.]

06/19/14 EC3410.SuFY14/MPF - Section II 75

- The tests can be implemented over short-time windows in MATLAB using:
 • ttest2.m (uses the t-distribution)
 • vartest2.m (uses the F-distribution)
- They require the selection of a level of significance α (usually picked around 5 to 10%).
- The tests are sensitive to block lengths (useful only when the data set is large enough…). A minimal sketch follows below.
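A minimal MATLAB sketch of the block-test idea (the data and the block length are assumed values, for illustration only):

% Minimal sketch, assumed block length: compare the first block against
% each later non-overlapping block with two-sample tests for equal mean
% (ttest2) and equal variance (vartest2).
x = randn(2000,1) + 0.002*(1:2000)';   % hypothetical data with a mean drift
L = 250;                               % assumed block length
nBlocks = floor(length(x)/L);
ref = x(1:L);                          % reference block
for b = 2:nBlocks
    blk = x((b-1)*L+1 : b*L);
    hMean = ttest2(ref, blk, 'Alpha', 0.05);   % 1 if means differ
    hVar  = vartest2(ref, blk, 'Alpha', 0.05); % 1 if variances differ
    fprintf('block %d: mean change = %d, var change = %d\n', b, hMean, hVar);
end
% Rejections (h = 1) in later blocks suggest the data is not wss.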

06/19/14 EC3410.SuFY14/MPF - Section II 76

Example 13: You collected data from two thermal sensors, X and Y. The data collected for each is contained in the matrix DATA = [X, Y]. Can each data set be considered wss? (Pack2Data1.mat) MATLAB hint: the function Pack2Example13Template.m provides a shell code to compute short-term statistics over overlapping data segments. Use it if you find it useful.

06/19/14 EC3410.SuFY14/MPF - Section II 77

Data Analysis Application – How do we know if the I.I.D. assumption is valid?

1) Inspect the normalized correlation plot.

EXAMPLE [Ref 7, Ex. 2.18]: CPU DATA. Execution times for n = 7632 consecutive requests are measured and displayed in the upper left panel. Initial testing indicates the data appears stationary and roughly normal, so the autocorrelation function can be used to test independence.

• The plot in the lower left panel shows a strong correlation ⇒ the data is not independent.

• Assume you are interested in extracting an IID sequence out of this data. How would you do so? Try sub-sampling.

06/19/14 EC3410.SuFY14/MPF - Section II 78

• Random sub-sampling example, [Ref. 7]

[Figure: the data x(n) and its normalized correlation $\rho_x(k)$.]

06/19/14 EC3410.SuFY14/MPF - Section II 79

Data Analysis Application – How do we know if the I.I.D. assumption is valid?, cont'

EXAMPLE [Ref 7, Ex. 2.18], cont': the CPU DATA example is not IID.
• How is sub-sampling implemented?
- Basic N-level sub-sampling may be implemented by picking every Nth sample. However, this may result in aliasing in some cases (why is that? hint: think about what decimating does to the signal in the frequency domain; see the plots on the next page).
- A better approach introduces randomness in the picking task. The sub-sampled data is obtained with the following random sub-sampling scheme: for every index i = 1…n, decide with probability p = 1/2 whether the point is kept. This gives the second plot in the figure. Then repeat the process, which gives sub-sampled data with p = 1/2 down to $1/2^7 = 1/128$. A minimal implementation sketch follows below.
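A minimal MATLAB sketch of this random sub-sampling scheme (the correlated data is generated here only for illustration):

% Minimal sketch: each pass keeps every sample independently with
% probability 1/2, so after m passes the keep probability is (1/2)^m.
x = filter([1 0.5 0.3], 1, randn(1e4,1));  % hypothetical correlated data
y = x - mean(x);
nPasses = 2;                               % keep probability (1/2)^2 = 1/4
for m = 1:nPasses
    y = y(rand(size(y)) < 0.5);            % random keep/drop decisions
end
[rho0, lags] = xcorr(x - mean(x), 10, 'coeff');  % before sub-sampling
[rho1, ~]    = xcorr(y, 10, 'coeff');            % after sub-sampling
subplot(211), stem(lags, rho0), title('\rho_x(k), original')
subplot(212), stem(lags, rho1), title('\rho_x(k), randomly sub-sampled')
% The correlation at nonzero lags is visibly reduced by the random
% sub-sampling.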

06/19/14 EC3410.SuFY14/MPF - Section II 80

• Comparisons between deterministic/random sub-sampling: y(n) = x(n) + 0.5x(n−1) + 0.3x(n−2)

- Original signal spectrum and normalized correlation: correlation between samples shows up to lag 2.
- Spectrum of the signal down-sampled by picking every other sample, and resulting normalized correlation: correlation between samples shows at lag 1.
- Spectrum of the signal down-sampled by randomly picking every other sample on average, and resulting normalized correlation: correlation between samples shows at lag 1.
- Spectrum of the signal down-sampled by randomly picking every 4th sample on average, and resulting normalized correlation: very weak correlation between samples shows at lag 1.

Conclusion: the degree of correlation between samples has decreased by picking every other sample. However, this may not always be the case; see the next example.

06/19/14 EC3410.SuFY14/MPF - Section II 81

• Comparisons between deterministic/random sub-sampling, cont':

- Original signal spectrum and normalized correlation: long-term correlation between samples shows.
- Spectrum of the signal down-sampled by picking every other sample, and resulting normalized correlation: correlation between samples still shows.
- Spectrum of the signal down-sampled by randomly picking every other sample on average, and resulting normalized correlation: decreased correlation between samples shows.
- Spectrum of the signal down-sampled by randomly picking every 4th sample on average, and resulting normalized correlation: the correlation between samples has significantly decreased.

Conclusion: the degree of correlation between samples decreases when samples are picked in a random fashion, while it does NOT when samples are picked in a regular fashion.

06/19/14 EC3410.SuFY14/MPF - Section II 82

• Comparisons between deterministic/random sub-sampling, cont’

Conclusion: If you wish to extract an IID sequence out of a correlated sequence, the best (safest) approach is to sub-sample in a random fashion to avoid potential aliasing effects.

06/19/14 EC3410.SuFY14/MPF - Section II 83

• Random sub-sampling example, [Ref. 7]

[Figure: the sub-sampled data x(n) and its normalized correlation $\rho_x(k)$; all $\rho_x(k)$ are within the CI bounds!]

06/19/14 EC3410.SuFY14/MPF - Section II 84

- Recall that the correlation sequence of a white sequence x(n) is $R_x(k) = \sigma_x^2\, \delta(k)$, i.e., the normalized correlation is $\rho_x(k) = \delta(k)$.

Question: when can $\hat{\rho}_x(k)$ be considered equal to 0? → use the CI concept.

• Result: when x(n) is IID with pdf ~ N(0,1), $\hat{\rho}_x(k)$ for $k \ne 0$ is distributed as $N(0, 1/N)$.

• For a 95% CI, $z_{\alpha/2} = 1.96$, so declare $\hat{\rho}_x(k) \approx 0$ when $|\hat{\rho}_x(k)| < 1.96/\sqrt{N}$.

• How to evaluate whether the transformed sequence is IID? (See the sketch below.)
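A minimal MATLAB sketch of this CI check (the candidate sequence here is a stand-in for whatever transformed data is being tested):

% Minimal sketch: check whether rho_x(k), k ~= 0, stays within the 95%
% CI bound +/-1.96/sqrt(N) expected for an IID sequence.
x = randn(1000,1);                        % hypothetical candidate sequence
N = length(x);
[rho, lags] = xcorr(x - mean(x), 20, 'coeff');  % normalized correlation
bound = 1.96/sqrt(N);                     % 95% CI bound for IID data
nOut = sum(abs(rho(lags ~= 0)) > bound);  % lags exceeding the bound
fprintf('%d of %d nonzero lags outside +/-%.3f\n', nOut, sum(lags ~= 0), bound);
stem(lags, rho), hold on
plot(lags([1 end]),  bound*[1 1], 'r--')
plot(lags([1 end]), -bound*[1 1], 'r--')
% For IID data, about 95 percent of the nonzero lags fall inside the bounds.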

06/19/14 EC3410.SuFY14/MPF - Section II 85

EXAMPLE cont' [Ref 7, Ex. 2.18]: The previous figure shows that the data loses correlation when the sampling probability is p = 1/64. The turning point test is applied to the sub-sampled data with p = 1/64: the sub-sampled data has 114 points, and the 95% CI obtained for the estimated mean of the sub-sampled data is [65.5, 71.7]. The 95% confidence interval that would be obtained if we (wrongly) assumed the original data to be IID is [69.2, 69.9]. The IID assumption grossly underestimates the CI because the data is correlated.

06/19/14 EC3410.SuFY14/MPF - Section II 86

Is sub-sampling always the solution to removing correlation? Unfortunately, not always! [Ex 2.19, Ref 7] shows the number of bytes transferred over an Ethernet LAN (360,000 points), illustrating long-range dependent data.

[Figure: the correlation remains above the CI upper limit.]

06/19/14 EC3410.SuFY14/MPF - Section II 87

How do we know if the IID assumption is valid?, cont'

2) Inspect the "lag plot".

Def: plot x(n) versus x(n + lag) for different values of "lag". The lag plot checks whether a data set or time series is independent or not. Random data does not exhibit any identifiable structure in the lag plot; non-random structure in the lag plot indicates that the underlying data may be correlated in some fashion. [Ref 8] (A minimal sketch follows below.)
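A minimal MATLAB lag-plot sketch (the random-walk data and the lag value are assumed for illustration; compare with Example 14 below):

% Minimal lag-plot sketch: scatter x(n) against x(n+lag). Structure in
% the plot suggests dependence; a shapeless cloud suggests independence.
x = cumsum(randn(2000,1));          % hypothetical data (a random walk)
lag = 4;                            % assumed lag value
plot(x(1:end-lag), x(1+lag:end), '.')
xlabel('x(n)'), ylabel(sprintf('x(n+%d)', lag))
title(sprintf('Lag plot, lag = %d', lag))
% A random walk clusters along the diagonal (strong correlation); white
% noise would show an unstructured cloud.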

06/19/14 EC3410.SuFY14/MPF - Section II 88

Example 14: Evaluate the lag plot obtained for a random walk sequence

$S_n = \sum_{k=1}^{n} X_k = S_{n-1} + X_n$

06/19/14 EC3410.SuFY14/MPF - Section II 89

Example 15: You are given measurements collected by sensors x, y, and z (Pack2Data3.mat).

1) Using the correlation function xcorr.m, evaluate whether the measurements obtained for each sensor are correlated or not. Hints: a) use [xcor,lags] = xcorr(x,maxlag,'coeff'); this ensures you can plot the correlation coefficients for a specified range of lags from −maxlag to +maxlag, with the correlation normalized so that Rx(0) = 1; b) use a relatively small number of lags, around 20 or less to start, so that you can see what happens around lag 0.
2) Using the MATLAB function lagplot.m, plot lag plots at user-specified lags for the sequences x, y, and z, and evaluate whether the measurements obtained for each sensor are correlated. You can start with lags 2, 4, 8, and higher.
3) Estimate the maximum lag at which the data is correlated for each sensor.
4) Assume you want to generate an uncorrelated sequence out of y. Explain how random sub-sampling can be used; implement random sub-sampling of y with factors equal to 2, 4, and 8. Explain how you check that the data extracted out of y is uncorrelated.
5) Repeat 4) for sequence z.

MATLAB note: the function Pack2Example15Template.m provides a shell code to compute a random sub-sampled sequence for various sub-sampling amounts. Use it if you find it useful.

06/19/14 EC3410.SuFY14/MPF - Section II 90

Application - Radar Target Detection (a cross-correlation application)

[Figure: a radar transmits x(n); the return from the target is y(n).]

Assume y(n) = x(n − N).

$R_{yx}(k) = E\{y(n)\, x^*(n-k)\} =$

06/19/14 EC3410.SuFY14/MPF - Section II 91

Assume y(n) = x(n − N).

$R_{xy}(k) = E\{x(n)\, y^*(n-k)\} =$
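A minimal MATLAB sketch of the idea (the delay and noise level are assumed values): the cross-correlation between the received and transmitted signals peaks at the round-trip delay N.

% Minimal sketch, assumed parameters: estimate the target delay N from
% the peak of the cross-correlation between receive and transmit.
N = 40;                                  % hypothetical true delay (samples)
x = randn(1000,1);                       % transmitted waveform
y = [zeros(N,1); x(1:end-N)] + 0.5*randn(1000,1);  % delayed, noisy return
[rxy, lags] = xcorr(y, x);               % correlate received with sent
[~, iMax] = max(abs(rxy));
fprintf('estimated delay: %d samples\n', lags(iMax));  % close to N = 40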

06/19/14 EC3410.SuFY14/MPF - Section II 92

Brief introduction to the Sliding-Window FT (spectrogram)

[Figure: a window w[m] slides along the signal; an FT is computed for each window position, building a time-frequency map.]

The window is usually incremented by a fraction of its length (25 to 75% overlap). A minimal usage sketch follows below.
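A minimal MATLAB usage sketch (the chirp is hypothetical; the window/overlap/nfft values simply reuse the starting values suggested later in Example 16):

% Minimal sketch: spectrogram of a linear chirp, using window length 32,
% overlap 16, nfft = 2048, Fs = 1 (the starting values of Example 16).
n = 0:4095;
x = chirp(n, 0, 4095, 0.4);              % frequency sweeps 0 to 0.4 (Fs = 1)
spectrogram(x, 32, 16, 2048, 1, 'yaxis') % time on x-axis, freq on y-axis
% The chirp appears as a line of linearly increasing frequency vs. time.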

06/19/14 EC3410.SuFY14/MPF - Section II 93

Spectrogram, cont'

[Figures: a linear chirp in the time domain, and its spectrogram (normalized frequency, fs = 2).]

06/19/14 EC3410.SuFY14/MPF - Section II 94

[Figures: spectrograms at a low noise level and at a high noise level.]

06/19/14 EC3410.SuFY14/MPF - Section II 95

Example 16: Assume you send the chirp signal x(t). You turn your receiver on at the time you send x(t) and leave it on until you receive y(t). Assume the sampling frequency is equal to 1 Hz.

You have two scenarios to investigate: high and low SNR received signals obtained by sending x(t). The received signal in the high SNR case is yhigh(t); in the low SNR case it is ylow(t). Plot the spectrograms of x(t), yhigh(t), and ylow(t) using the MATLAB function spectrogram.m. • A good set of starting values for the spectrogram: window length 32, overlap 16, nfft = 2048. • Use Fs = 1 and the 'yaxis' option so that the spectrogram is plotted with time on the x-axis. Estimate the target distance in number of samples for both cases. (x, ylow, and yhigh are in Pack2Data2.mat)

06/19/14 EC3410.SuFY14/MPF - Section II 96

Application: Gas furnace reaction time – cross-covariance/cross-correlation function application

Example: x1(t) represents the input gas feed rate for a gas furnace; x2(t) represents the % of CO2 in the outlet gas. Goal: evaluate how fast the furnace responds to changes in the gas feed rate.

[Box-Jenkins data]

06/19/14 EC3410.SuFY14/MPF - Section II 97

Compute: $\rho_{x_2 x_1}(k) = \dfrac{C_{x_2 x_1}(k)}{\sigma_{x_2}\, \sigma_{x_1}}$

[Figure: the normalized cross-covariance has a minimum at lag = 5.]

Question: what is the significance of the minimum?

06/19/14 EC3410.SuFY14/MPF - Section II 98

Question: What is the impact of the data mean?

06/19/14 EC3410.SuFY14/MPF - Section II 99

Example 17: You are given two independent ergodic random signals $s_1(n)$ and $s_2(n)$, generated as RVs with pdfs ~ U[−0.5, 0.5], and the following pairs of ergodic random signals derived as:

$\begin{bmatrix} y_1(n) \\ y_2(n) \end{bmatrix} = \begin{bmatrix} 8 & 10 \\ 10 & 2 \end{bmatrix} \begin{bmatrix} s_1(n) \\ s_2(n) \end{bmatrix}, \quad \begin{bmatrix} z_1(n) \\ z_2(n) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} s_1(n) \\ s_2(n) \end{bmatrix}, \quad \begin{bmatrix} v_1(n) \\ v_2(n) \end{bmatrix} = \begin{bmatrix} 8 & 10 \\ 10 & 2 \end{bmatrix} \begin{bmatrix} s_1(n-10) \\ s_2(n-5) \end{bmatrix}$

Assume you generate 10000 data values of $s_1$ and $s_2$. Scatter plots and histogram information are shown on the next page.
1) Comment on the correlation status between $y_1(n)$ & $y_2(n)$, $z_1(n)$ & $z_2(n)$, $v_1(n)$ & $v_2(n)$, $v_1(n)$ & $y_1(n)$, $v_2(n)$ & $y_1(n)$.
2) Provide justification for the difference between the pdf behaviors of the pairs $y_1(n)$ & $y_2(n)$ and $z_1(n)$ & $z_2(n)$.

06/19/14 EC3410.SuFY14/MPF - Section II 100

06/19/14 EC3410.SuFY14/MPF - Section II 101

[Figures: cross-correlation sequences $R_{y_1,y_2}$, $R_{z_1,z_2}$, $R_{v_1,v_2}$, $R_{v_1,y_1}$, and $R_{v_2,y_1}$ plotted versus lag number (−20 to 20).]

06/19/14 EC3410.SuFY14/MPF - Section II 102

Example 18 You are given the following ergodic random signals s1(n), s2(n),s3(n). Extract 1) information regarding their density characteristics, and 2) whether and how they may be related to each other. Pack2Data6.mat

06/19/14 EC3410.SuFY14/MPF - Section II 103

Application: Detection of signal periodicity in noisy environments

Property: if the process x(n) is aperiodic and zero-mean, then $\lim_{k \to \infty} R_x(k) = 0$.

Example 19: Assume we have a sinusoidal signal x(n) with uniform random phase φ embedded in wss zero-mean white noise w(n) with variance σ² (signal and noise uncorrelated). The correlation sequence may be used to get information on the properties of the periodic signal.

[Figure: correlation sequence of the noisy sinusoid; period N = ?]

06/19/14 EC3410.SuFY14/MPF - Section II 104

Example 20: You are given measurements collected from two underwater sensors, y1 and y2, which contain tone(s) embedded in noise (Pack2Data4.mat). Assume the sampling frequency is equal to 1 Hz.

1) Evaluate whether you can extract periodicity information on y1 and y2 using correlation information.
2) Evaluate whether the noise is white or not by computing the spectral estimates $|Y_1(e^{j\omega})|^2$ and $|Y_2(e^{j\omega})|^2$. Note: the spectral estimates may be computed using:

h = spectrum.welch;        % select the frequency estimation scheme
hpsd = psd(h, y, 'Fs', 1); % calculate the frequency information, sampling freq = 1 Hz
plot(hpsd)                 % plot the frequency information from 0 to fs/2

3) Compute the frequency information for y1 and y2 from the spectral estimates computed in 2), and derive periodicity information on y1 and y2.
4) Compare the information obtained with both approaches. List the advantages/limitations of each approach.

06/19/14 EC3410.SuFY14/MPF - Section II 105

Correlation Matrix Properties for a Stationary Process

Recall: assume you have the 2-dimensional random vector $x = [x(0), x(1)]^T$:

$R_x = E\{x x^H\} = E\left\{ \begin{bmatrix} x(0) \\ x(1) \end{bmatrix} [x^*(0),\ x^*(1)] \right\}$

Correlation matrix for a stationary process: x(n) stationary ⇒ $R_x(n_1, n_0) = R_x(n_1 - n_0)$

$R_x =$

06/19/14 EC3410.SuFY14/MPF - Section II 106

Correlation Matrix for a Periodic RP

Assume $x = [x(0), x(1)]^T$, with $R_x(n) = R_x(n + N)$, N: period.

06/19/14 EC3410.SuFY14/MPF - Section II 107

Correlation Matrix Properties

Assume $x = [x(0), \ldots, x(N-1)]^T$.

(1) $R_x$ is Hermitian.
(2) $R_x$ is positive semi-definite, i.e., $\lambda(R_x) \ge 0$.
(3) $R_x$ has an eigendecomposition of the form $R_x = U \Lambda U^H$, where U is a unitary eigenvector matrix ($UU^H = U^H U = I$) and Λ is a diagonal eigenvalue matrix.
(4) The eigenvectors are orthogonal to each other.
(5) $\mathrm{tr}(R_x) = \sum_i \lambda_i$.

06/19/14 EC3410.SuFY14/MPF - Section II 108

Example 21: Is $R_x = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}$ a valid correlation matrix for the wss process x(n)?

06/19/14 EC3410.SuFY14/MPF - Section II 109

How to compute correlation estimates

• For discrete and real data $x = [x(0), \ldots, x(N-1)]^T$ (assume x(t) is known from t = 0 to $t = T_0$, and ergodic; why?):

$\hat{R}_x(k) = \hat{R}_x(kT_s) = \frac{1}{N-k} \sum_{i=0}^{N-k-1} x(i)\, x(i+k)$

($kT_s$ is in seconds; the lag k is dimensionless)

• Quality of the estimate? → find the mean and variance of $\hat{R}_x(k)$:

$(1)\quad E\{\hat{R}_x(k)\} = \frac{1}{N-k} \sum_{i=0}^{N-k-1} E\{x(i)\, x(i+k)\}$

06/19/14 EC3410.SuFY14/MPF - Section II 110

How to compute correlation estimates, cont'

$(2)\quad \mathrm{Var}[\hat{R}_x(k)] \approx \frac{N}{(N-k)^2} \sum_{i=-\infty}^{\infty} \left[ R_x^2(i) + R_x(i+k)\, R_x(i-k) \right], \qquad \text{when } N \gg k$

06/19/14 EC3410.SuFY14/MPF - Section II 111

How to compute correlation estimates, cont'

Alternate estimator: the biased estimator

$\bar{R}_x(k) = \bar{R}_x(kT_s) = \frac{1}{N} \sum_{i=0}^{N-k-1} x(i)\, x(i+k)$

Quality of the estimate:

$(1)\quad E\{\bar{R}_x(k)\} =$

$(2)\quad \mathrm{Var}[\bar{R}_x(k)] \approx \frac{1}{N} \sum_{i=-\infty}^{\infty} \left[ R_x^2(i) + R_x(i+k)\, R_x(i-k) \right], \qquad k \ge 0$

06/19/14 EC3410.SuFY14/MPF - Section II 112

Biased/unbiased discrete correlation estimator summary

Unbiased estimator: $\hat{R}_x(k) = \frac{1}{N-k} \sum_{i=0}^{N-k-1} x(i)\, x(i+k)$
• $E\{\hat{R}_x(k)\} = R_x(k)$: unbiased.
• $\mathrm{Var}[\hat{R}_x(k)]$ does not vanish as $k \to N$ (the $1/(N-k)$ factor blows up), so the estimate becomes unreliable at large lags.

Biased estimator: $\bar{R}_x(k) = \frac{1}{N} \sum_{i=0}^{N-k-1} x(i)\, x(i+k)$
• $E\{\bar{R}_x(k)\} = \frac{N-k}{N} R_x(k) \ne R_x(k)$: biased, with bias $-\frac{k}{N} R_x(k)$; however $E\{\bar{R}_x(k)\} \to R_x(k)$ as $N \to +\infty$ (asymptotically unbiased).
• $\mathrm{Var}[\bar{R}_x(k)] \to 0$ as $N \to +\infty$ and remains well behaved at all lags.

(A comparison sketch follows below.)
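A minimal MATLAB sketch contrasting the two estimators on white noise (the record length is chosen short on purpose so that the large-lag behavior is visible):

% Minimal sketch: biased vs. unbiased correlation estimates for white
% Gaussian noise (true R_x(k) = sigma^2 * delta(k)).
x = randn(200,1);                     % short record exposes the effect
[rb, lags] = xcorr(x, 190, 'biased'); % divides by N
ru = xcorr(x, 190, 'unbiased');       % divides by N - |k|
subplot(211), plot(lags, rb), title('Biased estimate')
subplot(212), plot(lags, ru), title('Unbiased estimate'), xlabel('Lag k')
% The unbiased estimate becomes erratic near |k| = N because of the
% 1/(N - |k|) factor; the biased one stays small but shrinks R_x(k)
% toward 0 at large lags.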

06/19/14 EC3410.SuFY14/MPF - Section II 113


06/19/14 EC3410.SuFY14/MPF - Section II 114

Example 22: Comparing theoretical and estimated correlation sequences

Assume that you are given a wss ergodic RP generated as s(n) = a s(n−1) + v(n), where v(n) is Gaussian zero-mean white noise ~N(0,1) and |a| < 1.
1) Compute the theoretical correlation expression $R_s(k)$.
The figure on the next page plots: 1) the theoretical correlation values, 2) the estimated biased correlation values assuming N = 50 data points are available for s(n), and 3) the estimated biased correlation values assuming N = 10000 data points are available for s(n), with a = 0.8.
2) Comment on the differences.

06/19/14 EC3410.SuFY14/MPF - Section II 115

06/19/14 EC3410.SuFY14/MPF - Section II 116

06/19/14 EC3410.SuFY14/MPF - Section II 117

Frequency Domain Description of Stationary Processes

Power spectral density (PSD):

$S_x(e^{j\omega}) = \mathcal{F}[R_x(k)] = \sum_{k=-\infty}^{\infty} R_x(k)\, e^{-j\omega k}$

$R_x(k) = \mathcal{F}^{-1}[S_x(e^{j\omega})] = \frac{1}{2\pi} \int_{2\pi} S_x(e^{j\omega})\, e^{j\omega k}\, d\omega$

The digital frequency ω is defined over a 2π-wide range, e.g., [0, 2π] or [−π, π]. (A minimal estimation sketch follows below.)
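A minimal MATLAB sketch of the definition (the colored process is a hypothetical example): estimate $R_x(k)$, then evaluate its transform on a dense frequency grid via the FFT.

% Minimal sketch: PSD estimate as the transform of the estimated
% correlation sequence of a hypothetical colored process.
x = filter([1 0.5 0.3], 1, randn(1e4,1)); % colored noise example
[rx, ~] = xcorr(x, 50, 'biased');         % correlation estimate, lags -50..50
nfft = 1024;
Sx = abs(fftshift(fft(rx, nfft)));        % |.| removes the linear phase
w = linspace(-pi, pi, nfft);              % caused by the lag offset
plot(w, Sx), xlabel('\omega (rad/sample)'), ylabel('S_x(e^{j\omega}) estimate')
% For white noise the PSD is flat; here it is shaped by the filter,
% consistent with Examples 23 and 24.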

06/19/14 EC3410.SuFY14/MPF - Section II 118

Example 23: Find the PSD of the zero-mean w.s.s. process x(n) with $R_x(k) = a^{|k|}$, $|a| < 1$.

06/19/14 EC3410.SuFY14/MPF - Section II 119

The PSD has three key properties:

(1) P1: the PSD $S_x(e^{j\omega})$ is real-valued and periodic with period 2π for any x(n) (see details in Appendix C); if x(n) is real, then $S_x(e^{j\omega})$ is even.

(2) P2: the PSD is non-negative, i.e., $S_x(e^{j\omega}) \ge 0$ → see page 159 of the text [Therrien].

(3) P3: the area under $S_x(e^{j\omega})$ is non-negative and equals the average power of x(n).

06/19/14 EC3410.SuFY14/MPF - Section II 120

Example 24: White noise. Find the PSD of white noise x(n) with $R_x(k) = \sigma^2\, \delta(k)$.

Application to communication systems: $R_x(k) = (N_0/2)\, \delta(k)$.

06/19/14 EC3410.SuFY14/MPF - Section II 121

Example 25: Harmonic Process

• Definition: a harmonic process is defined as

$x(n) = \sum_{k=1}^{M} A_k \cos(\omega_k n + \phi_k), \qquad \omega_k \ne 0$

where M, $\{A_k\}$, $\{\omega_k\}$ are constants and $\{\phi_k\}$ are pairwise independent RVs uniformly distributed over [0, 2π].

• Compute $E\{x(n)\}$, $R_x(k)$, $S_x(e^{j\omega})$.

06/19/14 EC3410.SuFY14/MPF - Section II 122

06/19/14 EC3410.SuFY14/MPF - Section II 123

06/19/14 EC3410.SuFY14/MPF - Section II 124

Example 26: $x(n) = \cos(0.1\pi n + \theta_1) + 2\sin(1.5\pi n + \theta_2)$, with $\theta_1, \theta_2 \sim U[0, 2\pi]$ and independent.

Compute $R_x(k)$, $S_x(e^{j\omega})$.

06/19/14 EC3410.SuFY14/MPF - Section II 125

06/19/14 EC3410.SuFY14/MPF - Section II 126

Summary of Properties for Stationary x(n)

Definitions:
• Mean: $m_x = E\{x(n)\}$
• Correlation: $R_x(k) = E\{x(n)\, x^*(n-k)\}$
• Covariance: $C_x(k) = E\{(x(n) - m_x)(x(n-k) - m_x)^*\}$
• Cross-correlation: $R_{xy}(k) = E\{x(n)\, y^*(n-k)\}$
• Cross-covariance: $C_{xy}(k) = E\{(x(n) - m_x)(y(n-k) - m_y)^*\}$
• PSD / cross-PSD: $S_x(e^{j\omega}) = \sum_k R_x(k)\, e^{-j\omega k}$, $S_{xy}(e^{j\omega}) = \sum_k R_{xy}(k)\, e^{-j\omega k}$

Inter-relations:
• $C_x(k) = R_x(k) - |m_x|^2$
• $C_{xy}(k) = R_{xy}(k) - m_x\, m_y^*$

Properties (autocorrelation):
• $R_x(k) = R_x^*(-k)$
• $R_x$ is NND
• $R_x(0) \ge |R_x(k)|$ for all k

Properties (PSD):
• $S_x(e^{j\omega}) \ge 0$
• $S_x(e^{j\omega}) = S_x(e^{-j\omega})$ for real x(n)

06/19/14 EC3410.SuFY14/MPF - Section II 127

Properties: cross-correlation and cross-PSD

• $R_{xy}(k) = R_{yx}^*(-k)$
• $|R_{xy}(k)|^2 \le R_x(0)\, R_y(0)$
• $\rho_{xy}(k) = \dfrac{C_{xy}(k)}{\sigma_x\, \sigma_y}$, with $|\rho_{xy}(k)| \le 1$
• $S_{xy}(e^{j\omega}) = FT[R_{xy}(k)]$

06/19/14 EC3410.SuFY14/MPF - Section II 128

Principal Component Analysis (PCA)

( Discrete Karhunen-Loeve Transform (DKLT))

• In many practical applications, it is beneficial to represent a random sequence x with a linearly equivalent sequence w consisting of uncorrelated components (such a sequence w is called the innovation representation).

• In such cases, each component of the uncorrelated sequence w can be viewed as adding new information to the previous components.

• Applications exist in compression, classification, etc.

The PCA transformation may be used to perform dimension reduction while preserving as much variance from the original space as possible.

06/19/14 EC3410.SuFY14/MPF - Section II 129

How to transform x into the innovation representation ?

Assume x=[x(0), … x(N − 1)]T is zero-mean. If x is not zero-mean, remove the mean before proceeding further. Questions: (1) What does it mean for w to be uncorrelated ? (2) What does “represent a random sequence x with a linearly equivalent sequence w” mean ?

06/19/14 EC3410.SuFY14/MPF - Section II 130

Define the linear transformation as $A = U^H$, where U comes from the eigendecomposition of the covariance matrix:

$C_x = U \Lambda U^H = \begin{bmatrix} u_1 & \cdots & u_N \end{bmatrix} \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_N \end{bmatrix} \begin{bmatrix} u_1^H \\ \vdots \\ u_N^H \end{bmatrix}$

(U: eigenvector matrix; Λ: eigenvalue matrix)

$y = U^H x \;\Rightarrow\; C_y = \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_N) \;\Rightarrow\;$ the components of y are uncorrelated: the $U^H$ transformation diagonalizes the covariance matrix.

x can be recovered from y by x = U y, i.e., x can be rewritten as

$x = \begin{bmatrix} u_1 & \cdots & u_N \end{bmatrix} \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix} = \sum_{i=1}^{N} y_i\, u_i$

See the derivation in Appendix E. (A minimal numerical sketch follows below.)
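A minimal MATLAB sketch of the transformation (the correlated 2-D data is a hypothetical example): diagonalize the sample covariance and verify that the transformed components are uncorrelated.

% Minimal sketch: decorrelate data via the eigendecomposition of its
% covariance matrix (PCA / DKLT) and verify that C_y is diagonal.
rng(0)
X = randn(5000, 2) * [2 1; 0 0.5];   % hypothetical correlated 2-D data (rows)
X = X - mean(X, 1);                  % remove the mean first
Cx = cov(X);                         % sample covariance matrix
[U, Lambda] = eig(Cx);               % Cx = U * Lambda * U'
Y = X * U;                           % y = U^H x for each data vector
disp(cov(Y))                         % approximately diagonal (= Lambda)
% The diagonal entries of cov(Y) match the eigenvalues and the
% off-diagonal entries are near zero: the components of y are
% uncorrelated.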

06/19/14 EC3410.SuFY14/MPF - Section II 131

What if I want to compress the information?

$\hat{x} = \sum_{i=1}^{M} y_i\, u_i, \qquad M \le N$

Recall: $x = \sum_{i=1}^{N} y_i u_i = \underbrace{\sum_{i=1}^{M} y_i u_i}_{\hat{x}} + \underbrace{\sum_{i=M+1}^{N} y_i u_i}_{e_M}$

We want $e_M$ to be as small in norm as possible. How can I pick $\{u_i\}_{i=1}^{M}$ so that the error between x and $\hat{x}$ is minimum?

$E_M = E\{e_M^H e_M\} = \sum_{i=M+1}^{N} E\{|y_i|^2\} = \sum_{i=M+1}^{N} \lambda_i$

(Derivation shown in Appendix F.)

To minimize the loss $E_M$: put into $e_M$ the eigenvectors associated with the smallest eigenvalues, and put into $\hat{x}$ the eigenvectors associated with the largest eigenvalues. (See the compression sketch below.)
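Continuing the sketch above, keeping only the eigenvector with the largest eigenvalue (M = 1) shows that the average reconstruction loss matches the sum of the discarded eigenvalues (data again hypothetical):

% Minimal sketch: PCA dimension reduction, keep the eigenvector with the
% largest eigenvalue and measure the reconstruction error E_M.
rng(0)
X = randn(5000, 2) * [2 1; 0 0.5];      % hypothetical 2-D data (rows)
X = X - mean(X, 1);
[U, Lambda] = eig(cov(X));
[lam, order] = sort(diag(Lambda), 'descend');
U = U(:, order);                        % eigenvectors, largest first
M = 1;                                  % keep M = 1 principal component
Xhat = (X * U(:, 1:M)) * U(:, 1:M)';    % project, then reconstruct
EM = mean(sum((X - Xhat).^2, 2));       % average squared error per vector
fprintf('E_M = %.4f, discarded eigenvalue sum = %.4f\n', EM, sum(lam(M+1:end)));
% The two numbers agree (up to sampling error): the loss equals the sum
% of the discarded eigenvalues, as derived above.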

06/19/14 EC3410.SuFY14/MPF - Section II 133

PCA Applications

PCA is used in:
- speech/image coding, communications, networking, data compression, and electronic warfare, because it allows for a lower-dimensional representation of the data
- classification, because it works!

Ref: [4]

06/19/14 EC3410.SuFY14/MPF - Section II 134

PCA Applications, cont' [Ref 13]

[Figure: several sensors (a, b, c) each record (x, y) measurements of the same phenomenon; the measurements are stacked into a single vector, e.g., $[x_a, y_a, x_b, y_b, x_c, y_c]^T$.]

- Redundancy in the collected information?
- How do we reduce the dimension?

06/19/14 EC3410.SuFY14/MPF - Section II 135

PCA Applications, cont' [Ref 13]

[Figure: common information/redundancy between sensors.]

06/19/14 EC3410.SuFY14/MPF - Section II 136

PCA Applications, cont'

Example: assume we have N samples of 2-dimensional data of the form x = [x1, x2]. Ref: [4]

[Diagram: x(n) → PCA → reduced basis selection scheme → "inverse" PCA → x̂(n).]

06/19/14 EC3410.SuFY14/MPF - Section II 137

PCA Applications, cont' [Ref 13]

Caution: it is possible for PCA to fail….

06/19/14 EC3410.SuFY14/MPF - Section II 138

Application to Biometrics [Ref. 5]

[Diagram: Enrollment subsystem: biometric reader → feature extractor → template (1010010…) stored in a database. Authentication subsystem: biometric reader → feature extractor → template (1010010…) → biometric matcher → match or no match.]

06/19/14 EC3410.SuFY14/MPF - Section II 139

Identification vs. Verification [Ref. 5]

[Diagram: Identification (1:N): biometric reader → biometric matcher searches the database → "This person is Emily Dawson". Verification (1:1): claimed ID ("I am Emily Dawson") + biometric reader → biometric matcher checks the database entry → match.]

06/19/14 EC3410.SuFY14/MPF - Section II 140

Face recognition overall procedure

Training (design) stage: image captured → face extracted → discriminating feature parameters extracted → feature parameters combined to characterize each individual + select the classifier type.

Testing stage: test the algorithm with new data (the testing set).

06/19/14 EC3410.SuFY14/MPF - Section II 141

Application to Face Recognition: the eigenface (PCA) implementation

Re-organize each image into a vector (several options):
- An image with 64×64 pixels results in a feature vector of length 4096!
- Need to reduce the dimension to simplify the problem.

06/19/14 EC3410.SuFY14/MPF - Section II 142

[Diagram: Training phase: database collection (N people): camera → cropped image files → dimension reduction & feature extraction → subject-specific feature generation → create the feature space. Testing phase: test subject: camera → feature extraction → compare and classify → display the decision.]

06/19/14 EC3410.SuFY14/MPF - Section II 143

PCA Recognizer overview

[Diagram: First phase: training images (each line is an N-dim vector representing a face) → subtract the average image vector → zero-mean training images → PCA: $C_X = U \Lambda U^H$ → eigenfaces (each column is an eigenvector of length N; select M < N columns) → M principal components in "face space". Second phase: class-specific centroid calculation → class-specific references. Third phase: testing images, with the average image vector subtracted, projected onto the M principal components in "face space" → classification.]

06/19/14 EC3410.SuFY14/MPF - Section II 144

• First Phase: Training – extract relevant features & select the classifier

[Diagram: database collection (N people): camera → cropped image files → dimension reduction & feature extraction → class-specific feature generation → create the feature space → define "class centroids" as class-specific references.]

Goals: (1) project the images into a smaller-dimensional space to reduce the computational load, and (2) keep the discriminating class information.

06/19/14 EC3410.SuFY14/MPF - Section II 145

• Second Phase: Training – identify class features

• Create the projection matrix A from the covariance matrix.
• Project the data onto the smaller-dimensional feature space.

[Figure: feature space with class clusters C1, C2, C3.]

06/19/14 EC3410.SuFY14/MPF - Section II 146

• Third Phase: Testing – project the test data onto the feature space and compare against the class centroids

[Figure: feature space with clusters C1, C2, C3 and a test point "?"; decision: class C2.]

06/19/14 EC3410.SuFY14/MPF - Section II 147

[Figure: original faces expressed as weighted sums of eigenfaces, face = K1·(eigenface 1) + K2·(eigenface 2) + … + KN·(eigenface N); a truncated expansion (e.g., K1 and K2 only) gives an approximate reconstruction.] [Ref. 6]

06/19/14 EC3410.SuFY14/MPF - Section II 148

[Figure: the average face and the 1st eigenface.] [Ref. 6]

06/19/14 EC3410.SuFY14/MPF - Section II 149

PCA - Application to Traffic Monitoring & Network Anomaly Detection [Ref. 9, 10]

We are interested in finding out whether:
• the network is under attack
• there is a sudden change in traffic patterns
• there is an equipment outage
• there is something never seen before

In general, unsupervised methods for reliably detecting and classifying such events may be preferred, as they do not require as much a-priori information as supervised schemes do (flip side: they sometimes may not perform as well…).

06/19/14 EC3410.SuFY14/MPF - Section II 150

• Use Origin-Destination (OD) network-wide traffic flow data.
• Tested on the Abilene academic network (precedes Internet2, before 2007):
 – network connecting 200 US universities
 – 11 points of presence (PoPs)
 – spanned the continental US
 – upgraded in 2007 (Internet2)
• OD flow: measures the IP-level flow entering and exiting the network at a given PoP.

[Map: Abilene PoPs: Seattle, Sunnyvale, Los Angeles, Denver, Houston, Chicago, New York, Washington, Atlanta, Kansas City, Indianapolis.] [Ref. 12]

06/19/14 EC3410.SuFY14/MPF - Section II 151

[Ref. 10]

• Represent the overall network behavior by the set of all OD traffic flows.
• Information carried by the various OD flows may be related.
• High-dimensional problem → how can we reduce the dimensionality?
• Can we use OD flows to detect traffic anomalies?

06/19/14 EC3410.SuFY14/MPF - Section II 152

Examples of OD flows

[Ref. 10]

06/19/14 EC3410.SuFY14/MPF - Section II 153

• Collect the OD traffic flow obtained for each possible combination of origin and destination (5-minute increments over one week).
• Combine them into the OD traffic flow matrix X (n × m).
• Use PCA (computed via the SVD decomposition) to decompose the information into a set of eigenvectors (called "eigenflows") and associated eigenvalues: U is the eigenvector matrix (n × n) and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_m)$ is the eigenvalue matrix.

06/19/14 EC3410.SuFY14/MPF - Section II 154

Three main types of eigenflows [Ref. 10]:
• Deterministic components (eigenflows associated with the largest eigenvalues)
• Spiky components (eigenflows associated with unusual events)
• Noisy components (eigenflows associated with the smallest eigenvalues)


Only a few large eigenvalues ⇒ the overall traffic may be modeled with only a few dimensions

[Ref. 10]


OD flow reconstructed in terms of three types of eigenflows

[Ref. 10]


• Identify unusual traffic behavior by separating the traffic into two components:

(1) Usual traffic: represented by the eigenflows associated with the k largest eigenvalues, where most of the energy resides (k is small, usually less than 10); represented by the space S1 spanned by the top k eigenflows. Define P_S1: the projection onto S1.
(2) Unusual traffic: represented by the eigenflows not taken into account in the usual traffic; represented by the space S2 spanned by the remaining n−k eigenflows.

• Project the OD flow traffic y onto S1 and S2 for all OD flows:
normal traffic = P_S1 y,  residual traffic = y − P_S1 y
• A traffic anomaly is detected when there is a sudden change in the residual traffic


$$ \text{total traffic}(t) = \sum_{i=1}^{n_{\mathrm{OD}}} \text{ODflow}(i,t), \qquad \text{residual traffic}(t) = \text{total traffic}(t) - \text{usual traffic}(t) $$

where $n_{\mathrm{OD}}$ is the number of OD flows.

[Ref. 10]
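A minimal sketch of the subspace split, continuing the previous snippet; the choice k = 5 and the 3-sigma alarm rule are illustrative assumptions, not values taken from [Ref. 10].

k = 5;                                 % dimension of the "usual traffic" space S1
Uk = U(:, 1:k);                        % top-k eigenvectors
PS1 = Uk * Uk';                        % projection matrix onto S1
y = X;                                 % columns = network-wide traffic snapshots
normal   = PS1 * y;                    % usual traffic    P_S1 * y
residual = y - normal;                 % residual traffic y - P_S1 * y
r = sum(residual.^2, 1);               % residual energy per time bin
alarm = find(r > mean(r) + 3*std(r));  % flag sudden changes in the residual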


Section II – Random Processes

Appendices


Section II – Random Processes

Appendix A

How to compute correlation matrix estimates


How to compute correlation matrix estimates

Discrete data: x = [x(0), …, x(N−1)]^T. Compute the correlation matrix based on the N data points. Maximum correlation matrix dimension? Define the matrix X, whose columns are successively delayed, zero-padded copies of the data:

$$ X = \begin{pmatrix}
x(0) & 0 & \cdots & 0 \\
x(1) & x(0) & & \vdots \\
\vdots & x(1) & \ddots & 0 \\
x(N-1) & \vdots & & x(0) \\
0 & x(N-1) & & x(1) \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & x(N-1)
\end{pmatrix} $$


Multiplying $X^H$ (whose rows are the conjugated, delayed copies of the data) by $X$ gives, for example,

$$ (X^H X)(1,1) = x^*(0)x(0) + x^*(1)x(1) + \cdots + x^*(N-1)x(N-1) = \sum_{n=0}^{N-1} |x(n)|^2 $$

and, more generally,

$$ (X^H X)(i,j) = \sum_{n} x^*(n-i+1)\, x(n-j+1) = N\, \hat R_x(i-j), $$

so $X^H X / N$ is the (Hermitian, Toeplitz) biased correlation matrix estimate. Since $\hat R_x(k) = 0$ for $|k| \ge N$, the maximum useful correlation matrix dimension is N.
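This construction can be checked numerically; in the sketch below, the data vector and the matrix size p + 1 = 5 are illustrative placeholders.

% Build the zero-padded data matrix X and the biased correlation estimate X'X/N
x = randn(64,1); x = x - mean(x);     % placeholder zero-mean data record
N = length(x); p = 4;                 % correlation matrix will be (p+1)x(p+1)
X = zeros(N+p, p+1);
for j = 0:p
    X(j+1:j+N, j+1) = x;              % column j+1 = x delayed by j samples
end
Rhat = (X'*X)/N;                      % Hermitian Toeplitz, biased estimate
% Cross-check against xcov (lag 0 sits at index p+1):
rc = xcov(x, p, 'biased');
Rref = toeplitz(rc(p+1:end));         % should equal Rhat to machine precision
max(abs(Rhat(:) - Rref(:)))           % ~ 0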


Section II – Random Processes

Appendix B

How to assess data stationarity


T-test for equality of means
[Ref: http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm]

Assume the sample means and variances of the two segments, computed from $N_1$ and $N_2$ samples, are $\bar m_1, \bar m_2, \hat\sigma_1^2, \hat\sigma_2^2$.

• The statistical test is defined as:

$$ H_0: m_1 = m_2, \qquad H_1: m_1 \neq m_2 $$

• where the test statistic is:

$$ T = \frac{\bar m_1 - \bar m_2}{\sqrt{\hat\sigma_1^2/N_1 + \hat\sigma_2^2/N_2}} $$

• If equal variances are assumed, T reduces to:

$$ T = \frac{\bar m_1 - \bar m_2}{\sqrt{\dfrac{(N_1 - 1)\hat\sigma_1^2 + (N_2 - 1)\hat\sigma_2^2}{N_1 + N_2 - 2}}\; \sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}} $$

• It turns out that T has a t distribution with ν degrees of freedom, where:

$$ \nu = \frac{\left(\hat\sigma_1^2/N_1 + \hat\sigma_2^2/N_2\right)^2}{\dfrac{(\hat\sigma_1^2/N_1)^2}{N_1 - 1} + \dfrac{(\hat\sigma_2^2/N_2)^2}{N_2 - 1}} $$

If equal variances are assumed, then ν = N_1 + N_2 − 2.

• Reject the hypothesis that the two means are equal with (1 − α) confidence if:

$$ T < t_{\nu, 1-\alpha/2} = -t_{\nu, \alpha/2} \quad \text{or} \quad T > t_{\nu, \alpha/2} $$

• The test is implemented in MATLAB using ttest2.m


F-test for equality of variance
[Ref: http://www.itl.nist.gov/div898/handbook/eda/section3/eda359.htm]

• The statistical test is defined as:

$$ H_0: \sigma_1 = \sigma_2, \qquad H_1: \sigma_1 \neq \sigma_2 $$

• Similar derivation as for the mean, requiring the use of the F distribution (the test statistic is the variance ratio $F = \hat\sigma_1^2/\hat\sigma_2^2$, compared against percentiles of the F distribution with $(N_1 - 1, N_2 - 1)$ degrees of freedom)
• Hence the name: "Two-sample F-test for equal variances"
• The test is implemented in MATLAB using vartest2.m
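A minimal sketch of how both tests might be used to assess stationarity, assuming x is the data record under test; the two-segment split and the default 5% significance level are illustrative choices.

x = randn(1000,1);                    % placeholder data record
N = length(x);
x1 = x(1:floor(N/2));                 % first segment
x2 = x(floor(N/2)+1:end);             % second segment
[hm, pm] = ttest2(x1, x2);            % t-test:  H0: equal means
[hv, pv] = vartest2(x1, x2);          % F-test:  H0: equal variances
fprintf('means: h=%d (p=%.3g); variances: h=%d (p=%.3g)\n', hm, pm, hv, pv)
% h = 0 for both tests => no evidence against (wide-sense) stationarity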


Section II – Random Processes
Appendix C

Detection of the number of periodic tones embedded in a noisy signal using the correlation matrix information


Detection of number of stationary tones – noise-free case

$$ x(n) = A\, e^{j(\omega n + \theta)}, \qquad \theta \sim U[0, 2\pi] $$

Compute the 2-dimensional correlation matrix of x and its rank. Linear algebra results:

(1) Rank of a Hermitian matrix = number of non-zero eigenvalues.
(2) Given the correlation matrix

$$ R_x^{(3)} = \begin{pmatrix} R_x(0) & R_x(1) & R_x(2) \\ R_x^*(1) & R_x(0) & R_x(1) \\ R_x^*(2) & R_x^*(1) & R_x(0) \end{pmatrix} $$

and the ranks of $R_x^{(2)}$ and $R_x^{(3)}$:
• maximum rank of $R_x^{(2)}$ = 2
• if rank$(R_x^{(2)})$ = 1 (a single tone), then rank$(R_x^{(3)})$ = 1, and rank$(R_x^{(P)})$ = 1 for all P > 1
• if rank$(R_x^{(2)})$ = 2, rank$(R_x^{(3)})$ can be 2 or 3 (two tones or more)
• a matrix whose rank is smaller than its dimension is called "rank deficient"


$$ R_x^{(2)} = \begin{pmatrix} R_x(0) & R_x(1) \\ R_x^*(1) & R_x(0) \end{pmatrix}, \qquad R_x(k) = E\!\left[x(n)\, x^*(n-k)\right] $$
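A quick numerical sanity check of the single-tone case (the amplitude and frequency below are arbitrary):

A = 2; w0 = 2*pi*0.15;                        % arbitrary tone parameters
R2 = A^2 * [1, exp(-1j*w0); exp(1j*w0), 1];   % since R_x(k) = A^2 exp(j*w0*k)
rank(R2)                                      % = 1: rank deficient
eig(R2)                                       % eigenvalues 2*A^2 and 0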


Detection of number of stationary tones – noisy case

y(n) = A exp(j(ωn + θ)) + w(n), θ ~ U[0, 2π], w(n) white noise with pdf ~ N(0, K), θ & w(n) independent

• Compute the 2-dimensional correlation matrix of y and its rank.
• How can this information be used to detect the number of tones?


(Figure: eigenvalues of the signal covariance matrix, noise-free vs. noisy cases.)

Limitations when tones are close


Example C1: You are given the set of noise-free measurements contained in xa, the noisy measurements computed from xa contained in ya, and additional noisy data contained in yb (data in Pack2Data5.mat).
1) For data xa and ya: pick the covariance matrix dimension N = 10, 30, then 100, i.e., set maxlag, as defined below, equal to 10, 30, and 100, and plot the eigenvalues. Explain how and why the specific selection of N affects the ability to detect the number of tones. Estimate the number of complex tones.
2) Estimate the number of complex tones contained in yb.

The eigenvalues of the N-dim covariance matrix of the zero-meaned data x (N is set via maxlag below; the value is problem dependent and to be selected by the user) may be computed and plotted as follows:

maxlag = 20;
[xc, xlags] = xcov(x, maxlag, 'biased'); % lag 0 is at index maxlag+1
xc1 = xc(maxlag+1:end);                  % keep lags 0..maxlag
XC = toeplitz(xc1);                      % Hermitian Toeplitz covariance matrix
[v, lam] = eig(XC);
lamda = flipud(diag(lam));               % eigenvalues in decreasing order
subplot(211), stem(lamda)                % plot eigenvalues by decreasing value
subplot(212), periodogram(x,[],[],1), ylabel('dB'), title('Periodogram') % plot PSD
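To experiment with this recipe before loading the course data, a hypothetical two-tone test signal may be synthesized as follows (frequencies, amplitudes, and noise level are arbitrary):

n = (0:999).';
x = exp(1j*(2*pi*0.10*n + 2*pi*rand)) ...
    + 0.8*exp(1j*(2*pi*0.25*n + 2*pi*rand)) ...
    + 0.1*(randn(1000,1) + 1j*randn(1000,1))/sqrt(2);  % two tones + white noise
% The snippet above applied to this x should show two eigenvalues well above
% the noise floor, i.e., two complex tones.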


Section II – Random Processes
Appendix D

Proof that the PSD is periodic with a period equal to 2π


Property

Let x[n] be a stationary RP; then S_x(e^{jω}) is periodic with period 2π.

Proof – simplest case:

$$ R_x(k) = e^{j\omega_0 k} \;\longleftrightarrow\; S_x(e^{j\omega}) = \sum_{l} 2\pi\, \delta(\omega - \omega_0 - 2\pi l) $$

We can prove $S_x(e^{j\omega}) = F_T\{R_x(k)\}$ by proving $R_x(k) = \mathrm{IFT}\{S_x(e^{j\omega})\}$.


$$
\begin{aligned}
\mathrm{IFT}\{S_x(e^{j\omega})\}
&= \frac{1}{2\pi}\int_{2\pi} S_x(e^{j\omega})\, e^{j\omega k}\, d\omega
 = \frac{1}{2\pi}\int_{2\pi} \sum_{l} 2\pi\, \delta(\omega - \omega_0 - 2\pi l)\, e^{j\omega k}\, d\omega \\
&= \sum_{l} \int_{2\pi} \delta(\omega - \omega_0 - 2\pi l)\, e^{j\omega k}\, d\omega
 = e^{j(\omega_0 + 2\pi l)k} = e^{j\omega_0 k} = R_x(k),
\end{aligned}
$$

since inside any interval of length 2π only one of the impulses (the one for a single value of l) is present, and $e^{j2\pi lk} = 1$ for integer k and l.


If $R_x(k) = \sum_i a_i\, e^{j\omega_i k}$, then

$$ S_x(e^{j\omega}) = \sum_i a_i \left[ \sum_{l} 2\pi\, \delta(\omega - \omega_i - 2\pi l) \right], $$

sometimes written as $2\pi \sum_i a_i\, \delta(\omega - \omega_i)$ with the impulse understood to repeat with period 2π.
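The property is easy to check numerically: for any correlation sequence, the DTFT evaluated at ω and at ω + 2π agrees to machine precision (the toy correlation values below are arbitrary):

k = -2:2;                          % correlation lags
r = [0.25 0.5 1 0.5 0.25];         % toy r_x(k), even sequence
S = @(w) sum(r .* exp(-1j*w*k));   % S_x(e^{jw}) = sum_k r_x(k) e^{-jwk}
w0 = 0.7;
abs(S(w0) - S(w0 + 2*pi))          % ~ 1e-16: period 2*pi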


Section II – Random Processes
Appendix E

PCA derivation


To prove that

$$ x = \begin{pmatrix} | & & | \\ u_1 & \cdots & u_N \\ | & & | \end{pmatrix} y $$

may be rewritten as

$$ x = \sum_{i=1}^{N} y_i\, u_i $$


$$
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix}
= \begin{pmatrix} | & & | \\ u_1 & \cdots & u_N \\ | & & | \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}
\;\Longrightarrow\;
\begin{cases}
x_1 = u_{11} y_1 + u_{12} y_2 + \cdots + u_{1N} y_N \\
x_2 = u_{21} y_1 + u_{22} y_2 + \cdots + u_{2N} y_N \\
\;\;\vdots \\
x_N = u_{N1} y_1 + u_{N2} y_2 + \cdots + u_{NN} y_N
\end{cases}
$$

so that

$$ x = y_1 u_1 + y_2 u_2 + \cdots + y_N u_N = \sum_{i=1}^{N} y_i\, u_i $$
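A short numerical check of the identity (the dimension and the random orthonormal U and y below are arbitrary):

N = 4;
U = orth(randn(N));                % random orthonormal basis u_1 ... u_N
y = randn(N, 1);
x1 = U * y;                        % matrix-vector form
x2 = zeros(N, 1);
for i = 1:N
    x2 = x2 + y(i) * U(:, i);      % x = sum_i y_i u_i
end
max(abs(x1 - x2))                  % ~ 0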


Section II – Random Processes
Appendix F

PCA derivation – How to minimize the error quantity using the eigendecomposition of the covariance matrix


Face Recognition Application for 64×64 images – Summary

• Database: S images, each 64×64 = 4096 pixels. The images get reshaped as column vectors x_1, …, x_S, each of dimension N×1 = 4096×1.
• Compute C_x = E[(x − m_x)(x − m_x)^H]; C_x has dimension N×N = 4096×4096.
• PCA: find the N eigenvectors/eigenvalues [U, Λ].
• Compression step: keep the P "top" eigenvectors, P << N = 4096.
• Projection step: project {x_i} to get {y_i}, i = 1, …, S:

$$ y_i = \begin{pmatrix} -\!-\; u_1^H \;-\!- \\ \vdots \\ -\!-\; u_P^H \;-\!- \end{pmatrix} x_i, \qquad i = 1, \dots, S $$

• Identify each subject's characteristics via the subject-specific {y_i}'s (select class centroid & spread information).
• Store in the database:
– Centroid information for each subject (of dimension P)
– The P eigenvectors (P×4096)
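A compact MATLAB sketch of this training summary; the synthetic Xtrain, the labels vector, and P = 20 are placeholder assumptions to be replaced by the actual cropped-image database.

Xtrain = rand(4096, 40);              % placeholder: one 64x64 image per column
labels = kron(1:8, ones(1,5));        % placeholder: 8 subjects, 5 images each
[N, S] = size(Xtrain);
mx = mean(Xtrain, 2);                 % average face
Xc = Xtrain - repmat(mx, 1, S);       % zero-mean images
[U, Sv] = svd(Xc, 'econ');            % PCA via SVD; diag(Sv).^2/S = eig of C_x
P = 20;                               % compression: keep P << N eigenfaces
Up = U(:, 1:P);                       % N x P eigenface matrix
Y = Up' * Xc;                         % projection: P x S feature vectors
centroids = zeros(P, max(labels));
for c = 1:max(labels)
    centroids(:, c) = mean(Y(:, labels == c), 2);  % class centroid per subject
end
% Store {mx, Up, centroids}; classify a test image by projecting (x - mx)
% onto Up and picking the nearest centroid.

In practice, for N = 4096 the eigenvectors would typically be obtained from the much smaller S×S matrix Xc'Xc (or, as above, from the economy-size SVD of Xc) rather than by forming the 4096×4096 covariance matrix explicitly.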


Section II – Random Processes
References

[1] C. Therrien, Discrete Random Signals and Statistical Signal Processing, Prentice Hall, 1992.
[2] D. Manolakis, V. Ingle & S. Kogon, Statistical and Adaptive Signal Processing, Artech House, 2005.
[3] R. Gutierrez-Osuna, course notes for CPSC 689: Statistical Classification and Clustering, http://courses.cs.tamu.edu/rgutier/cpsc689_f05/
[4] M. Á. Carreira-Perpiñán, Continuous Latent Variable Models for Dimensionality Reduction and Sequential Data Reconstruction, PhD thesis, University of Sheffield, UK, 2001, http://www.cse.ogi.edu/~miguel/papers.html
[5] D. Chow & M. Samuel, "Biometrics: Faces and Identity Verification in a Networked World," presentation for CSI7163/ELG5121.
[6] A. Drygajlo, Biometrics, Speech Processing and Biometrics Group, Signal Processing Institute, Ecole Polytechnique Fédérale de Lausanne (EPFL), http://scgwww.epfl.ch/courses
[7] J.-Y. Le Boudec, Performance Evaluation of Computer and Communication Systems, EPFL, http://perfevalepfl.ch/lectureNotes.htm
[8] NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/eda
[9] E. D. Kolaczyk, "'Whole-Network' Methods for Traffic Analysis and Anomaly Detection," www.sytacom.mcgill.ca/eng/15_MITACS/KolaczykMITACS08.pdf
[10] N. Feamster, lecture notes for CS7260 (Internetworking Architectures and Protocols), http://www.cc.gatech.edu/classes/AY2006/cs7260_spring/syllabus.html#Schedule
[11] W. Cham, Foundation Course on Probability, Random Variable and Random Processes.
[12] http://www.internet2.edu/2004AR/abilene_map_large.cfm
[13] J. Shlens, "A Tutorial on Principal Component Analysis," ver. 3.0.1, http://www.snl.salk.edu/~shlens/