Page 1 of 14
Dynamical Invariants of an Attractor and potential applications for speech data
Saurabh Prasad
Intelligent Electronic Systems
Human and Systems Engineering
Department of Electrical and Computer Engineering
Estimating Kolmogorov Entropy from Acoustic Attractors from a Recognition Perspective
Estimating the correlation integral from a time series

Correlation integral of an attractor's trajectory: the correlation sum C(\varepsilon) of a system's attractor quantifies the average number of neighbors within a neighborhood of radius \varepsilon along the trajectory.

At a given embedding dimension m (m \geq 2D + 1), we have:

C(\varepsilon) = \frac{2}{N(N-1)} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \Theta\left(\varepsilon - \lVert x_i - x_j \rVert\right)

where x_i represents the i'th point on the trajectory, \lVert \cdot \rVert is a valid norm, and \Theta(\cdot) is the Heaviside unit step function (serving as a count function here).

The correlation dimension follows from the scaling C(\varepsilon) \sim \varepsilon^{D_C}:

D_C = \lim_{\varepsilon \to 0} \lim_{N \to \infty} \frac{\ln C(\varepsilon)}{\ln \varepsilon}

As \varepsilon \to 0 (the attractor's resolution), C(\varepsilon) \to 0; as \varepsilon approaches the attractor's radius, C(\varepsilon) \to 1. D_C ~ fractal dimension of the attractor.
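The correlation sum defined above can be sketched directly in code from a delay-embedded scalar series. This is a minimal illustration, not the authors' implementation: the embedding helper, the max-norm choice, and all parameter names are assumptions.

```python
import numpy as np

def delay_embed(x, m, tau=1):
    """Build m-dimensional delay vectors [x[t], x[t+tau], ..., x[t+(m-1)tau]]."""
    x = np.asarray(x, float)
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(m)])

def correlation_sum(x, m, eps, tau=1):
    """C(eps) = 2/(N(N-1)) * #{(i < j) : ||x_i - x_j|| < eps}, max norm."""
    X = delay_embed(x, m, tau)
    N = len(X)
    # Pairwise max-norm distances; O(N^2) memory, fine for a short sketch.
    d = np.max(np.abs(X[:, None, :] - X[None, :, :]), axis=-1)
    iu = np.triu_indices(N, k=1)
    return 2.0 * np.count_nonzero(d[iu] < eps) / (N * (N - 1))
```

By construction C(\varepsilon) is non-decreasing in \varepsilon and saturates at 1 once \varepsilon exceeds the attractor's radius, matching the limits stated above.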
order-q Renyi entropy and K2-Entropy

Divide the state space into disjoint boxes of size \varepsilon. Suppose the evolution of the state space that generated the observable is sampled at times \Delta t, 2\Delta t, \ldots, d\,\Delta t.

K_q = -\lim_{\Delta t \to 0} \lim_{\varepsilon \to 0} \lim_{d \to \infty} \frac{1}{d\,\Delta t} \, \frac{1}{q-1} \ln \sum_{i_1, \ldots, i_d} p^q(i_1, \ldots, i_d)

where p(i_1, \ldots, i_d) represents the joint probability that x(\Delta t) lies in box i_1, x(2\Delta t) lies in box i_2, and so on.

Numerically, the Kolmogorov entropy can be estimated as the second-order Renyi entropy (K2). For small \varepsilon the correlation sum at embedding dimension d scales as

C_d(\varepsilon) \sim \varepsilon^{D_C} \exp(-d\,\Delta t\, K_2)

so that

K_2 = \lim_{\Delta t \to 0} \lim_{\varepsilon \to 0} \lim_{d \to \infty} \frac{1}{\Delta t} \ln \frac{C_d(\varepsilon)}{C_{d+1}(\varepsilon)}

K_2 = 0 for an ordered system; 0 < K_2 < \infty for a chaotic system; K_2 \to \infty for a stochastic system.
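The ratio estimate of K2 above lends itself to a direct numerical sketch: compute the correlation sum at embedding dimensions d and d+1 and take the log-ratio. This is an illustration under assumptions (function names, unit sampling interval, and the max norm are mine, not from the slides).

```python
import numpy as np

def correlation_sum(x, m, eps, tau=1):
    """C_m(eps): fraction of pairs of m-dim delay vectors within eps (max norm)."""
    x = np.asarray(x, float)
    n = len(x) - (m - 1) * tau
    X = np.column_stack([x[i * tau : i * tau + n] for i in range(m)])
    d = np.max(np.abs(X[:, None, :] - X[None, :, :]), axis=-1)
    iu = np.triu_indices(n, k=1)
    return 2.0 * np.count_nonzero(d[iu] < eps) / (n * (n - 1))

def k2_estimate(x, m, eps, tau=1, dt=1.0):
    """K2 ~ (1/dt) * ln[ C_m(eps) / C_{m+1}(eps) ] at fixed m and eps."""
    return np.log(correlation_sum(x, m, eps, tau)
                  / correlation_sum(x, m + 1, eps, tau)) / dt
```

With the max norm the estimate is non-negative, since adding a coordinate can only shrink the neighbor count; in practice one inspects a plateau of the estimate over m and \varepsilon, as the slides that follow do.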
Second Order Kolmogorov Entropy Estimation of speech data
• Speech data, sampled at 22.5 kHz
– Sustained Phones (/aa/, /ae/, /eh/, /sh/, /z/, /f/, /m/, /n/)
•Output – Second order Kolmogorov Entropy
•We wish to analyze:
– The presence or absence of chaos in each time series.
– The discrimination characteristics of these estimates across attractors from different sound units (for classification)
The analysis setup
• Currently, this analysis includes estimates of K2 for different embedding dimensions
• The variation of the entropy estimates with the neighborhood radius ε was studied
• The variation of the entropy estimates with the SNR of the signal was studied
• So far, the analysis has been performed on 3 vowels, 2 nasals and 2 fricatives
• Results show that vowels and nasals have much smaller entropy than fricatives
• K2 consistently decreases with embedding dimension for vowels and nasals, while for fricatives it consistently increases
The analysis setup (in progress / coming soon)…
• Data size (length of the time series):
– This is crucial for our purpose, since we wish to extract information from short time series (sample data from utterances).
• Speaker variation:
– We wish to study variations in the Kolmogorov entropy of phone- or word-level attractors
• across different speakers.
• across different phones/words
• across different broad phone classes
[Figure slides, pages 7-9: Correlation Entropy vs. Embedding Dimension, for various values of ε]
[Figure slide, page 10: Correlation Entropy vs. Embedding Dimension, for various SNRs]
[Figure slide, page 11: Correlation Entropy vs. Embedding Dimension, for various data lengths]
Measuring Discrimination Information in K2 based features
Kullback-Leibler (KL) divergence: provides an information-theoretic distance measure between two statistical models
Average discriminating information between class i and class j:

J(i, j) = \int p_i(x) \ln \frac{p_i(x)}{p_j(x)} \, dx + \int p_j(x) \ln \frac{p_j(x)}{p_i(x)} \, dx = I(i, j) + I(j, i)

(The first integral is the likelihood of i vs. j; the second, of j vs. i.)

For normal densities with means \mu_i, \mu_j and covariances C_i, C_j:

I(i, j) = \frac{1}{2} \ln \frac{|C_j|}{|C_i|} + \frac{1}{2} \operatorname{tr}\left[ C_j^{-1} C_i - I \right] + \frac{1}{2} (\mu_i - \mu_j)^T C_j^{-1} (\mu_i - \mu_j)
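The normal-density case has a closed form, so the divergence between two Gaussian models of K2 features can be sketched in a few lines. The function names and inputs here are illustrative assumptions; the formula is the standard KL divergence between two multivariate normals.

```python
import numpy as np

def kl_gauss(mu_i, C_i, mu_j, C_j):
    """Directed divergence I(i, j) between N(mu_i, C_i) and N(mu_j, C_j)."""
    mu_i, mu_j = np.asarray(mu_i, float), np.asarray(mu_j, float)
    C_i, C_j = np.asarray(C_i, float), np.asarray(C_j, float)
    d = len(mu_i)
    Cj_inv = np.linalg.inv(C_j)
    dm = mu_i - mu_j
    return 0.5 * (np.log(np.linalg.det(C_j) / np.linalg.det(C_i))
                  + np.trace(Cj_inv @ C_i) - d      # tr[C_j^{-1} C_i - I]
                  + dm @ Cj_inv @ dm)               # Mahalanobis term

def j_divergence(mu_i, C_i, mu_j, C_j):
    """Symmetric divergence J(i, j) = I(i, j) + I(j, i)."""
    return kl_gauss(mu_i, C_i, mu_j, C_j) + kl_gauss(mu_j, C_j, mu_i, C_i)
```

For identical models J = 0; for two unit-variance 1-D Gaussians whose means differ by \delta, J = \delta^2, which gives a quick sanity check on any implementation.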
Measuring Discrimination Information in K2 based features
Statistics of entropy estimates over several frames, for various phones
Measuring Discrimination Information in K2 based features
[Two bar charts: KL-Divergence Measure between K2 features from various phonemes for two speakers; phone pairs compared: /aa/ vs. /f/, /ae/ vs. /sh/, /eh/ vs. /z/, /aa/ vs. /m/, /aa/ vs. /n/, /ae/ vs. /n/, /m/ vs. /f/, /n/ vs. /z/, /m/ vs. /sh/; vertical axes span 0-500 and 0-3000]
Plans
• Finish studying the use of K2 entropy as a feature characterizing phone-level attractors
– We will be performing a similar analysis on Lyapunov Exponents and Correlation Dimension estimates
• Measure speaker dependence in this invariant
• Use this setup on a meaningful recognition task
• Noise robustness, parameter tuning, and integration of these features with MFCCs
• Statistical Modeling…