Page 1 of 14
Dynamical Invariants of an Attractor and potential applications for speech data
Saurabh Prasad
Intelligent Electronic Systems
Human and Systems Engineering
Department of Electrical and Computer Engineering
Estimating Kolmogorov Entropy from Acoustic Attractors from a Recognition Perspective
Estimating the correlation integral from a time series

Correlation integral of an attractor's trajectory: the correlation sum C(\varepsilon) of a system's attractor quantifies the average number of neighbors within a neighborhood of radius \varepsilon along the trajectory.

At a given embedding dimension m (m \geq 2D + 1), we have:

C(\varepsilon) = \frac{2}{N(N-1)} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \Theta\left(\varepsilon - \lVert x_i - x_j \rVert\right)

where x_i represents the i'th point on the trajectory, \lVert \cdot \rVert is a valid norm, and \Theta(\cdot) is the Heaviside unit step function (serving as a count function here).

The correlation dimension follows from the scaling C(\varepsilon) \sim \varepsilon^{D_C}:

D_C = \lim_{\varepsilon \to 0} \lim_{N \to \infty} \frac{\ln C(\varepsilon)}{\ln \varepsilon}

As \varepsilon \to 0 (the attractor's resolution), C(\varepsilon) \to 0; as \varepsilon approaches the attractor's radius, C(\varepsilon) \to 1. D_C ~ fractal dimension of the attractor.
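The correlation sum defined above can be sketched directly in code from a delay-embedded scalar series. This is a minimal illustration, not the authors' implementation: the embedding helper, the max-norm choice, and all parameter names are assumptions.

```python
import numpy as np

def delay_embed(x, m, tau=1):
    """Build m-dimensional delay vectors [x[t], x[t+tau], ..., x[t+(m-1)tau]]."""
    x = np.asarray(x, float)
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(m)])

def correlation_sum(x, m, eps, tau=1):
    """C(eps) = 2/(N(N-1)) * #{(i < j) : ||x_i - x_j|| < eps}, max norm."""
    X = delay_embed(x, m, tau)
    N = len(X)
    # Pairwise max-norm distances; O(N^2) memory, fine for a short sketch.
    d = np.max(np.abs(X[:, None, :] - X[None, :, :]), axis=-1)
    iu = np.triu_indices(N, k=1)
    return 2.0 * np.count_nonzero(d[iu] < eps) / (N * (N - 1))
```

By construction C(\varepsilon) is non-decreasing in \varepsilon and saturates at 1 once \varepsilon exceeds the attractor's radius, matching the limits stated above.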
order-q Renyi entropy and K2-Entropy

Divide the state space into disjoint boxes of size \varepsilon. Suppose the evolution of the state space that generated the observable is sampled at times \Delta t, 2\Delta t, \ldots, d\,\Delta t.

K_q = -\lim_{\Delta t \to 0} \lim_{\varepsilon \to 0} \lim_{d \to \infty} \frac{1}{d\,\Delta t} \, \frac{1}{q-1} \ln \sum_{i_1, \ldots, i_d} p^q(i_1, \ldots, i_d)

where p(i_1, \ldots, i_d) represents the joint probability that x(\Delta t) lies in box i_1, x(2\Delta t) lies in box i_2, and so on.

Numerically, the Kolmogorov entropy can be estimated as the second-order Renyi entropy (K2). For small \varepsilon the correlation sum at embedding dimension d scales as

C_d(\varepsilon) \sim \varepsilon^{D_C} \exp(-d\,\Delta t\, K_2)

so that

K_2 = \lim_{\Delta t \to 0} \lim_{\varepsilon \to 0} \lim_{d \to \infty} \frac{1}{\Delta t} \ln \frac{C_d(\varepsilon)}{C_{d+1}(\varepsilon)}

K_2 = 0 for an ordered system; 0 < K_2 < \infty for a chaotic system; K_2 \to \infty for a stochastic system.
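The ratio estimate of K2 above lends itself to a direct numerical sketch: compute the correlation sum at embedding dimensions d and d+1 and take the log-ratio. This is an illustration under assumptions (function names, unit sampling interval, and the max norm are mine, not from the slides).

```python
import numpy as np

def correlation_sum(x, m, eps, tau=1):
    """C_m(eps): fraction of pairs of m-dim delay vectors within eps (max norm)."""
    x = np.asarray(x, float)
    n = len(x) - (m - 1) * tau
    X = np.column_stack([x[i * tau : i * tau + n] for i in range(m)])
    d = np.max(np.abs(X[:, None, :] - X[None, :, :]), axis=-1)
    iu = np.triu_indices(n, k=1)
    return 2.0 * np.count_nonzero(d[iu] < eps) / (n * (n - 1))

def k2_estimate(x, m, eps, tau=1, dt=1.0):
    """K2 ~ (1/dt) * ln[ C_m(eps) / C_{m+1}(eps) ] at fixed m and eps."""
    return np.log(correlation_sum(x, m, eps, tau)
                  / correlation_sum(x, m + 1, eps, tau)) / dt
```

With the max norm the estimate is non-negative, since adding a coordinate can only shrink the neighbor count; in practice one inspects a plateau of the estimate over m and \varepsilon, as the slides that follow do.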
Second Order Kolmogorov Entropy Estimation of speech data
• Speech data, sampled at 22.5 kHz
– Sustained Phones (/aa/, /ae/, /eh/, /sh/, /z/, /f/, /m/, /n/)
•Output – Second order Kolmogorov Entropy
•We wish to analyze:
– The presence or absence of chaos in each time series.
– The discrimination characteristics of these estimates across attractors from different sound units (for classification)
The analysis setup
• Currently, this analysis includes estimates of K2 for different embedding dimensions
• The variation of the entropy estimates with the neighborhood radius ε was studied
• The variation of the entropy estimates with the SNR of the signal was studied
• So far, the analysis has been performed on 3 vowels, 2 nasals and 2 fricatives
• Results show that vowels and nasals have much smaller entropy than fricatives
• K2 consistently decreases with embedding dimension for vowels and nasals, while for fricatives it consistently increases
The analysis setup (in progress / coming soon)…
• Data size (length of the time series):
– This is crucial for our purpose, since we wish to extract information from short time series (sample data from utterances).
• Speaker variation:
– We wish to study variations in the Kolmogorov entropy of phone- or word-level attractors
• across different speakers.
• across different phones/words
• across different broad phone classes
[Figure slides, pages 7-9: Correlation Entropy vs. Embedding Dimension, for various values of ε]
[Figure slide, page 10: Correlation Entropy vs. Embedding Dimension, for various SNRs]
[Figure slide, page 11: Correlation Entropy vs. Embedding Dimension, for various data lengths]
Measuring Discrimination Information in K2 based features
Kullback-Leibler (KL) divergence: provides an information-theoretic distance measure between two statistical models
Average discriminating information between class i and class j:

J(i, j) = \int p_i(x) \ln \frac{p_i(x)}{p_j(x)} \, dx + \int p_j(x) \ln \frac{p_j(x)}{p_i(x)} \, dx = I(i, j) + I(j, i)

(The first integral is the likelihood of i vs. j; the second, of j vs. i.)

For normal densities with means \mu_i, \mu_j and covariances C_i, C_j:

I(i, j) = \frac{1}{2} \ln \frac{|C_j|}{|C_i|} + \frac{1}{2} \operatorname{tr}\left[ C_j^{-1} C_i - I \right] + \frac{1}{2} (\mu_i - \mu_j)^T C_j^{-1} (\mu_i - \mu_j)
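The normal-density case has a closed form, so the divergence between two Gaussian models of K2 features can be sketched in a few lines. The function names and inputs here are illustrative assumptions; the formula is the standard KL divergence between two multivariate normals.

```python
import numpy as np

def kl_gauss(mu_i, C_i, mu_j, C_j):
    """Directed divergence I(i, j) between N(mu_i, C_i) and N(mu_j, C_j)."""
    mu_i, mu_j = np.asarray(mu_i, float), np.asarray(mu_j, float)
    C_i, C_j = np.asarray(C_i, float), np.asarray(C_j, float)
    d = len(mu_i)
    Cj_inv = np.linalg.inv(C_j)
    dm = mu_i - mu_j
    return 0.5 * (np.log(np.linalg.det(C_j) / np.linalg.det(C_i))
                  + np.trace(Cj_inv @ C_i) - d      # tr[C_j^{-1} C_i - I]
                  + dm @ Cj_inv @ dm)               # Mahalanobis term

def j_divergence(mu_i, C_i, mu_j, C_j):
    """Symmetric divergence J(i, j) = I(i, j) + I(j, i)."""
    return kl_gauss(mu_i, C_i, mu_j, C_j) + kl_gauss(mu_j, C_j, mu_i, C_i)
```

For identical models J = 0; for two unit-variance 1-D Gaussians whose means differ by \delta, J = \delta^2, which gives a quick sanity check on any implementation.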
Measuring Discrimination Information in K2 based features
Statistics of entropy estimates over several frames, for various phones
Measuring Discrimination Information in K2 based features
[Two bar charts: KL-Divergence Measure between K2 features from various phonemes for two speakers; phone pairs compared: /aa/ vs. /f/, /ae/ vs. /sh/, /eh/ vs. /z/, /aa/ vs. /m/, /aa/ vs. /n/, /ae/ vs. /n/, /m/ vs. /f/, /n/ vs. /z/, /m/ vs. /sh/; vertical axes span 0-500 and 0-3000]
Plans
• Finish studying the use of K2 entropy as a feature characterizing phone-level attractors
– We will be performing a similar analysis on Lyapunov Exponents and Correlation Dimension estimates
• Measure speaker dependence in this invariant
• Use this setup on a meaningful recognition task
• Noise robustness, parameter tuning, and integration of these features with MFCCs
• Statistical Modeling…