ENSEMBLE EMPIRICAL MODE DECOMPOSITION
Noise Assisted Signal Analysis (NASA)

Part I: Preliminary

Zhaohua Wu and N. E. Huang: Ensemble Empirical Mode Decomposition: A Noise Assisted Data Analysis Method. Advances in Adaptive Data Analysis, 1, 1-41, 2009.



Theoretical Foundations

• The intermittency test, though it ameliorates mode mixing, destroys the adaptive nature of EMD.

• The EMD study of white noise guarantees a uniform frame of scales.

• White noise cancels out given a sufficient number of ensemble members.

Theoretical Background I

Intermittency

Sifting with Intermittence Test

• To avoid mode mixing, we have to institute a special criterion to separate oscillations of different time scales into different IMF components.

• The criterion is to select a time scale so that oscillations with time scales longer than this pre-selected value are not included in the IMF.

Observations

• Intermittency test ameliorates the mode mixing considerably.

• Intermittency test requires a set of subjective criteria.

• EMD with intermittency is no longer totally adaptive.

• For complicated data, the subjective criteria are hard, or impossible, to determine.

Effects of EMD (Sifting)

• To separate data into components of similar scale.

• To eliminate riding waves.

• To make the results symmetric with respect to the x-axis and the amplitude more even.

– Note: The first two are necessary for a valid IMF; the last effect actually causes the IMF to lose its intrinsic properties.

Theoretical Background II

A Study of White Noise

Wu, Zhaohua and N. E. Huang, 2004:

A Study of the Characteristics of White Noise Using the Empirical Mode Decomposition Method. Proceedings of the Royal Society of London A, 460, 1597-1611.

Methodology

• Based on observations from Monte Carlo numerical experiments on 1 million white noise data points.

• All IMFs generated by 10 siftings.

• Fourier spectra based on 200 realizations of 4,000-data-point sections.

• Probability density based on 50,000-data-point sections.
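As a minimal sketch of this Monte Carlo setup (pure Python, no EMD library; counting local maxima of the raw noise stands in for the sifting used in the actual study, since the finest IMF tracks the extrema of the data):

```python
import random

def mean_period_from_peaks(x):
    """Estimate the mean period as (series length) / (number of local maxima)."""
    peaks = sum(1 for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1])
    return len(x) / peaks

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(100_000)]

# For iid Gaussian noise a sample is a local maximum with probability 1/3,
# so the mean period comes out near 3 samples -- close to the 2.881
# reported for IMF 1 in the period statistics.
print(mean_period_from_peaks(noise))
```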

IMF Period Statistics

IMF               1        2        3        4        5        6        7        8        9
Number of peaks   347042   168176   83456    41632    20877    10471    5290     2658     1348
Mean period       2.881    5.946    11.98    24.02    47.90    95.50    189.0    376.2    741.8
Period in years   0.240    0.496    0.998    2.000    3.992    7.958    15.75    31.35    61.75
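The dyadic behaviour can be checked directly from the tabulated mean periods — successive ratios hover around 2, which is the filter-bank property discussed later:

```python
# Mean periods of IMFs 1-9 from the table above (1,000,000 white-noise points).
mean_periods = [2.881, 5.946, 11.98, 24.02, 47.90, 95.50, 189.0, 376.2, 741.8]

# Successive ratios hover around 2: EMD acting on white noise behaves as a
# dyadic filter bank, each IMF carrying roughly double the period of the last.
ratios = [b / a for a, b in zip(mean_periods, mean_periods[1:])]
print([round(r, 2) for r in ratios])
```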

Fourier Spectra of IMFs

[Figure: Fourier spectra of the IMFs, spectrum (10^-3) vs. ln T.]

Shifted Fourier Spectra of IMFs

[Figure: shifted Fourier spectra of the IMFs.]

Empirical Observations I: Mean Energy

$$\bar{E}_n = \frac{1}{N}\sum_{j=1}^{N} c_n^2(j)$$

Empirical Observations II: Normalized spectral area is constant

$$\int S_{\ln T,\,n}\, d\ln T = \text{const}$$

Empirical Observations III: Normalized spectral area is constant

$$N\bar{E}_n = \int S_{\ln T,\,n}\, d\ln T$$

is the total energy of the n-th IMF component.

Empirical Observations IV: Computation of mean period

$$N\bar{E}_n = \int S_{\ln T,\,n}\, d\ln T = \int S_{T,\,n}\, dT, \qquad S_{T,\,n} = \frac{S_{\ln T,\,n}}{T},$$

so the mean period is

$$\bar{T}_n = \frac{\displaystyle\int S_{\ln T,\,n}\, d\ln T}{\displaystyle\int S_{\ln T,\,n}\, \frac{d\ln T}{T}}.$$

Empirical Observations V: The product of the mean energy and mean period is constant

$$\bar{E}_n\,\bar{T}_n = \text{const}, \qquad \ln\bar{E}_n + \ln\bar{T}_n = \text{const}$$

Monte Carlo Result : IMF Energy vs. Period

Empirical Observation: Histograms of IMFs

By the Central Limit Theorem, the IMFs should be normally distributed.

[Figure: histograms of IMF amplitudes for modes 2-9.]
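The Central Limit Theorem argument can be illustrated with stdlib Python alone: a value assembled from many independent contributions (here sums of centred uniforms, a stand-in for an IMF sample) is close to Gaussian:

```python
import random
import statistics

random.seed(1)

# Each sample is a sum of 50 centred uniforms -- a crude stand-in for an IMF
# value built from many independent contributions of comparable scale.
samples = [sum(random.uniform(-1, 1) for _ in range(50)) for _ in range(20_000)]

mu = statistics.fmean(samples)
sigma = statistics.stdev(samples)
# Theory: mean 0, variance 50 * (1/3), so sigma is about sqrt(50/3) ~ 4.08.
inside = sum(1 for s in samples if abs(s - mu) < sigma) / len(samples)
print(mu, sigma, inside)  # 'inside' should be near the Gaussian 0.683
```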

Fundamental Theorem of Probability

• If we know the density function of a random variable, x, then we can express the density function of any random variable, y, for a given y=g(x). The procedure is as follows:

Solve for the roots $x_1, \ldots, x_n, \ldots$ of $y = g(x)$; then

$$f_y(y) = \frac{f_x(x_1)}{|g'(x_1)|} + \cdots + \frac{f_x(x_n)}{|g'(x_n)|} + \cdots$$

because $dy = g'(x)\,dx$; therefore, $dx = \dfrac{dy}{g'(x)}$.

Fundamental Theorem of Probability

• If we know that the density function of a random variable x is normal, then x² has the density

$$f(y) = \frac{1}{\sqrt{2\pi y}}\, e^{-y/2}\, U(y),$$

where U(y) is the unit step function.

See: A. Papoulis, Probability, Random Variables, and Stochastic Processes, 1984, pp. 97-98.
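A quick stdlib cross-check of this density (a sketch, not part of the original study): integrate the formula numerically over (0, 1) and compare with a Monte Carlo estimate of P(x² < 1), which should both come out near 2Φ(1) − 1 ≈ 0.6827:

```python
import math
import random

def f(y):
    """Density of y = x**2 for x ~ N(0, 1): exp(-y/2) / sqrt(2*pi*y), y > 0."""
    return math.exp(-y / 2) / math.sqrt(2 * math.pi * y)

# Numerical integral of the density over (0, 1) via the midpoint rule.
n = 10_000
analytic = sum(f((i + 0.5) / n) for i in range(n)) / n

# Monte Carlo estimate of P(y < 1) from squared Gaussian draws.
random.seed(2)
empirical = sum(1 for _ in range(200_000) if random.gauss(0, 1) ** 2 < 1) / 200_000

print(analytic, empirical)  # both near 0.6827
```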

Chi and Chi-Square Statistics

Given n identical, independent normal random variables with density

$$f(x_1, \ldots, x_n) = \frac{1}{(2\pi)^{n/2}}\, \exp\!\left[-\left(x_1^2 + \cdots + x_n^2\right)/2\right],$$

we have the RVs $\chi = \left(x_1^2 + \cdots + x_n^2\right)^{1/2}$ and $y = \chi^2$; then the density for y with n degrees of freedom is

$$f(y) = a\, y^{\,n/2 - 1}\, \exp\!\left(-\frac{y}{2}\right) U(y), \qquad a = \frac{1}{2^{n/2}\,\Gamma(n/2)}.$$

See: A. Papoulis, Probability, Random Variables, and Stochastic Processes, 1984, pp. 187-188.
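The two defining moments of this distribution (mean n, variance 2n) are easy to verify by direct simulation with the stdlib — a sketch, assuming nothing beyond the definition above:

```python
import random
import statistics

random.seed(3)
n_dof = 5  # illustrative number of degrees of freedom

# y = sum of n squared standard Gaussians is chi-square with n dof.
samples = [sum(random.gauss(0, 1) ** 2 for _ in range(n_dof)) for _ in range(50_000)]

# A chi-square variable with n degrees of freedom has mean n and variance 2n.
print(statistics.fmean(samples), statistics.variance(samples))
```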

DEGREE OF FREEDOM

• A random sample of length N contains N degrees of freedom.

• Each Fourier component contains one degree of freedom.

• For EMD, each IMF's share of the DOF is proportional to its share of energy; therefore, the degrees of freedom for each IMF are given as

$$f_i = N\,\bar{E}_i.$$
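A toy numerical example of this rule (hypothetical energy fractions, not values from the study): because the normalized mean energies sum to 1, the per-IMF degrees of freedom sum back to N.

```python
N = 1000
# Hypothetical normalized mean energies for 4 IMFs; they sum to 1.
energies = [0.5, 0.25, 0.15, 0.10]

dof = [N * e for e in energies]  # f_i = N * E_i
print(dof, sum(dof))             # the DOF shares add back up to N
```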

Chi-Squared Energy Density Distributions

Since $N E_n$ is chi-square distributed with $N\bar{E}_n$ degrees of freedom,

$$\rho(E_n) = N a\,(N E_n)^{N\bar{E}_n/2 - 1}\, e^{-N E_n/2}.$$

Let $E_n = e^{\,y}$; then

$$\rho(y) = N a\,(N e^{\,y})^{N\bar{E}_n/2 - 1}\, e^{-N e^{\,y}/2}\, e^{\,y} = C \exp\!\left[\frac{N\bar{E}_n}{2}\, y - \frac{N}{2}\, e^{\,y}\right], \qquad C = a\, N^{N\bar{E}_n/2}.$$

Histograms: IMF Energy Density

[Figure: histograms of IMF energy density for modes 2-9.]

By the Central Limit Theorem, the IMFs should be normally distributed; therefore, their energy should be chi-square distributed.

Chi-Squared Energy Density Distributions

$$\rho(E_n) = N a\,(N E_n)^{N\bar{E}_n/2 - 1}\, e^{-N E_n/2}$$

By the Central Limit Theorem, the IMFs should be normally distributed; therefore, their energy should be chi-square distributed.

Formula of Confidence Limit for IMF Distributions I

Introducing the new variable $y = \ln E_n$, so that $E_n = e^{\,y}$, it follows:

$$\rho(y) = C \exp\!\left[\frac{N\bar{E}_n}{2}\left(y - \frac{e^{\,y}}{\bar{E}_n}\right)\right] = C \exp\!\left[\frac{N\bar{E}_n}{2}\left(y - e^{\,y-\bar{y}}\right)\right],$$

where $\bar{y} = \ln\bar{E}_n$, $C = a\,N^{N\bar{E}_n/2}$, and

$$e^{\,y-\bar{y}} = 1 + (y-\bar{y}) + \frac{(y-\bar{y})^2}{2!} + \frac{(y-\bar{y})^3}{3!} + \cdots$$

Formula of Confidence Limit for IMF Distributions II

With the new variable $y = \ln E_n$ and $E_n = e^{\,y}$, it follows:

$$\rho(y) = C' \exp\!\left[-\frac{N\bar{E}_n}{2}\left(\frac{(y-\bar{y})^2}{2!} + \frac{(y-\bar{y})^3}{3!} + \cdots\right)\right],$$

with

$$C' = C \exp\!\left[-\frac{N\bar{E}_n}{2}\left(1 - \bar{y}\right)\right].$$

Formula of Confidence Limit for IMF Distributions III

When $|y - \bar{y}| \ll 1$, we can neglect the higher-power terms:

$$\rho(y) \approx C' \exp\!\left[-\frac{N\bar{E}_n}{2}\,\frac{(y-\bar{y})^2}{2!}\right], \qquad C' = C \exp\!\left[-\frac{N\bar{E}_n}{2}\left(1 - \bar{y}\right)\right].$$

Formula of Confidence Limit for IMF Distributions IV

For a given confidence limit α, the corresponding variable $y_\alpha$ should satisfy

$$\alpha = \frac{\displaystyle\int_{-\infty}^{y_\alpha} \rho(y)\, dy}{\displaystyle\int_{-\infty}^{\infty} \rho(y)\, dy}.$$

For a Gaussian distribution, α is often related to the standard deviation σ; i.e., the α confidence level corresponds to kσ, where k varies with α: k takes the values -2.326, -0.675, 0, 0.675, and 2.326 at the 1st, 25th, 50th, 75th, and 99th percentiles (α = 0.01, 0.25, 0.5, 0.75, 0.99), respectively.
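These k values can be reproduced (to rounding) with the standard library's `statistics.NormalDist`, whose `inv_cdf` returns the Gaussian quantile for a given α:

```python
from statistics import NormalDist

nd = NormalDist()  # standard Gaussian, mean 0, sigma 1

# k (in units of sigma) for each quoted confidence level alpha.
for alpha in (0.01, 0.25, 0.5, 0.75, 0.99):
    print(alpha, round(nd.inv_cdf(alpha), 3))
```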

Formula of Confidence Limit for IMF Distributions V

When $|y - \bar{y}| \ll 1$, the distribution of $E_n$ is approximately Gaussian, with

$$\sigma^2 = \frac{2}{N\bar{E}_n} = \frac{2\bar{T}_n}{N}.$$

Therefore, for any given α, in terms of k we have

$$y - \bar{y} = \pm k\sigma = \pm k\sqrt{\frac{2\bar{T}_n}{N}}.$$

Formula of Confidence Limit for IMF Distributions VI

Given $y - \bar{y} = \pm k\sqrt{2\bar{T}_n/N}$ and $\ln\bar{E}_n + \ln\bar{T}_n = 0$.

If we write $x = \ln\bar{T}$ and $y = \ln\bar{E}$, as defined before, then $\bar{y} = -x$; therefore, a pair of upper and lower bounds will be

$$y = -x \pm k\,\sqrt{\frac{2}{N}}\; e^{\,x/2}.$$
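A small sketch of evaluating these bounds (illustrative N and k, not values from the study); note the spread grows as $e^{x/2}$, so longer-period IMFs, which carry fewer degrees of freedom, get wider bounds:

```python
import math

def confidence_bounds(x, N, k):
    """Upper/lower bounds y = -x +/- k * sqrt(2/N) * exp(x/2) around the
    white-noise line y = -x, where x = ln(T) and y = ln(E)."""
    spread = k * math.sqrt(2.0 / N) * math.exp(x / 2)
    return -x + spread, -x - spread

# Illustrative values: N = 1000 data points, k = 2.326 (99th percentile).
for lnT in (1.0, 3.0, 5.0):
    upper, lower = confidence_bounds(lnT, 1000, 2.326)
    print(lnT, round(upper, 3), round(lower, 3))
```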

Confidence Limit for IMF Distributions

[Figure: confidence limits overlaid on the IMF energy-period distribution.]

Data and IMFs: SOI

[Figure: the raw SOI series (1930-2000) with its IMF components C1-C9 and the residual R.]

Statistical Significance for SOI IMFs

[Figure: SOI IMF energies vs. mean period, from 1 month to 100 years.]

IMFs 4, 5, 6, and 7 are statistically significant signals at the 99% level.

Summary

• Not all IMFs have the same statistical significance.

• Based on the white noise study, we have established a method to determine the statistically significant components.

• References:

• Wu, Zhaohua and N. E. Huang, 2004: A Study of the Characteristics of White Noise Using the Empirical Mode Decomposition Method. Proceedings of the Royal Society of London A, 460, 1597-1611.

• Flandrin, P., G. Rilling, and P. Gonçalvès, 2004: Empirical Mode Decomposition as a Filter Bank. IEEE Signal Processing Letters, 11(2), 112-114.

Observations

The white noise signal consists of signals of all scales.

EMD separates the scales dyadically.

White noise provides a uniformly distributed frame of scales through EMD.

Flandrin, P., G. Rilling, and P. Gonçalvès, 2004: Empirical Mode Decomposition as a Filter Bank. IEEE Signal Processing Letters, 11, 112-114.

Flandrin, P., P. Gonçalvès, and G. Rilling, 2005: EMD Equivalent Filter Banks, from Interpretation to Applications. In: Introduction to Hilbert-Huang Transform and Its Applications, Ed. N. E. Huang and S. S. P. Shen, pp. 57-74. World Scientific, New Jersey.

Different approaches, but they reach the same end.

Fractional Gaussian Noise (aka Fractional Brownian Motion)

A continuous-time Gaussian process $x_H(t)$ is a fractional noise if it starts at zero, has zero mean, and has the correlation function

$$R(t,s) = E\!\left[x_H(t)\, x_H(s)\right] = \frac{\sigma^2}{2}\left(|t|^{2H} + |s|^{2H} - |t-s|^{2H}\right),$$

where H is a parameter known as the Hurst index, with value in (0, 1), and σ is the rms value of $x_H(t)$.

If H = 1/2, the process is Gaussian, or regular Brownian motion.

If H > 1/2, the process is positively correlated, or more red.

If H < 1/2, the process is negatively correlated, or more blue.
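The correlation function is simple enough to evaluate directly. For H = 1/2 it reduces to $\sigma^2 \min(t, s)$, the covariance of ordinary Brownian motion, since $\tfrac{1}{2}(t + s - |t - s|) = \min(t, s)$:

```python
def fgn_correlation(t, s, H, sigma=1.0):
    """R(t, s) = (sigma^2 / 2) * (|t|^(2H) + |s|^(2H) - |t - s|^(2H))."""
    return 0.5 * sigma ** 2 * (abs(t) ** (2 * H) + abs(s) ** (2 * H)
                               - abs(t - s) ** (2 * H))

print(fgn_correlation(2.0, 3.0, 0.5))  # -> 2.0, i.e. sigma^2 * min(t, s)
print(fgn_correlation(2.0, 3.0, 0.8))  # > 2.0: positively correlated, "redder"
```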

Examples

Flandrin’s results


Flandrin’s results : Delta Function


Theoretical Background III

Effects of adding White Noise

Some Preliminary

• Robert John Gledhill, 2003: Methods for Investigating Conformational Change in Biomolecular Simulations. Ph.D. thesis, University of Southampton, Department of Chemistry.

• He investigated the effect of added noise as a tool for checking the stability of EMD.

Some Preliminary

• His basic assumption is that the correct result is the one without noise:

$$\text{Discrepancy} = \frac{1}{M}\sum_{j=1}^{M}\left[\frac{1}{N}\sum_{t=1}^{N}\left(c_j^{\,p}(t) - c_j^{\,r}(t)\right)^2\right]^{1/2},$$

where $c_j^{\,p}(t)$ is the IMF from the perturbed signal (signal + noise) and $c_j^{\,r}(t)$ is the IMF from the original signal without noise.
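A minimal sketch of this discrepancy measure (the toy IMF arrays are hypothetical numbers, just to exercise the formula):

```python
def discrepancy(perturbed, reference):
    """Gledhill-style discrepancy: mean over the M IMFs of the RMS difference
    between the perturbed and reference components (each of length N)."""
    M = len(perturbed)
    total = 0.0
    for cp, cr in zip(perturbed, reference):
        N = len(cp)
        total += (sum((a - b) ** 2 for a, b in zip(cp, cr)) / N) ** 0.5
    return total / M

# Toy IMF sets: two components of four samples each (hypothetical values).
ref = [[0.0, 1.0, 0.0, -1.0], [1.0, 0.5, -0.5, -1.0]]
pert = [[0.1, 1.0, -0.1, -1.0], [1.0, 0.6, -0.5, -1.1]]
print(discrepancy(pert, ref))
```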

Test results. Top: whole data perturbed; bottom: only 10% perturbed.

Test results

Observations

• They made the critical assumption that the unperturbed signal gives the correct results.

• When the amplitude of the added perturbing noise is small, the discrepancy is small.

• When the amplitude of the added perturbing noise is large, the discrepancy becomes bi-modal.