ADAPTIVE SYSTEM IDENTIFICATION USING
TIME-VARYING FOURIER TRANSFORM
Hadas Benisty, Yekutiel Avargel and Israel Cohen
Presented by: Idan Igra
Technion, July 2013
Motivation
Long vs. short window
• In some processing schemes:
  • Long-window analysis is more accurate in terms of steady-state error.
  • Short-window analysis yields faster convergence.
Long vs. short window
• Varying the window length could resolve this tradeoff,
  • assuming we can identify the points of interest.
Mathematics
Time Varying STFT
• The time-varying STFT (TV-STFT) deals with the same issues as the STFT:
  • Transformation
  • Decimation factor
  • Overlap
  • Inverse transformation
  • Completeness condition
  • Analysis windows
Time Varying STFT
• Time-varying Fourier transform:
Time Varying STFT vs. STFT
• Recall the original STFT; in the time-varying version the window length N becomes N(t).
• N(t) is a piece-wise constant function:
  N(t) = N_v,  t_{v-1} < t < t_v,  v = 1, 2, …, V
Decimation Factor
• The decimation factor (L_v) is also time-varying.
  • Piece-wise constant, like N_v.
Overlap
• And what about the overlap?
  • Still constant:
    N_v / L_v = const
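A minimal sketch of how a piece-wise constant window length with a constant ratio N_v / L_v could translate into a frame schedule. The function name `frame_schedule`, its arguments and the example values are assumptions made for illustration; the handling of transient frames from the paper is not modeled.

```python
def frame_schedule(breakpoints, window_lengths, overlap_ratio=2):
    """Return (start, N_v, L_v) for every analysis frame.

    breakpoints    : segment end times t_1 < ... < t_V, in samples
    window_lengths : N_1, ..., N_V, one window length per segment
    overlap_ratio  : the constant N_v / L_v (2 corresponds to 50% overlap)
    """
    frames = []
    start = 0
    for t_end, n_v in zip(breakpoints, window_lengths):
        l_v = n_v // overlap_ratio  # decimation factor scales with the window
        while start < t_end:
            frames.append((start, n_v, l_v))
            start += l_v
    return frames


# Hypothetical example: N = 64 up to sample 256, then N = 256 up to sample 1024.
frames = frame_schedule(breakpoints=[256, 1024], window_lengths=[64, 256])
print(len(frames), frames[0], frames[-1])
```

Each frame keeps the same N_v / L_v, so the relative overlap does not change when the window length does.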
Inverse TV-STFT
• The inverse transform:
Inverse TV-STFT
• Completeness condition:
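The completeness condition of the TV-STFT itself is not reproduced here. As a related illustration only, the sketch below checks numerically the familiar constant-overlap-add property for a fixed window: periodic Hamming windows shifted by half their length sum to a constant. The helper `overlap_sum` and the chosen sizes are assumptions, not part of the presentation.

```python
import numpy as np


def overlap_sum(window, hop, n_frames=8):
    """Sum shifted copies of `window` and return the fully overlapped middle part."""
    n = len(window)
    total = np.zeros(hop * (n_frames - 1) + n)
    for p in range(n_frames):
        total[p * hop : p * hop + n] += window
    return total[n:-n]  # drop the edges, where fewer windows overlap


N = 64
n = np.arange(N)
hamming = 0.54 - 0.46 * np.cos(2 * np.pi * n / N)  # periodic Hamming window
middle = overlap_sum(hamming, hop=N // 2)
print(middle.min(), middle.max())
```

A constant sum means the overlapping frames weight every sample equally, which is what makes simple overlap-add reconstruction possible in the fixed-window case.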
Analysis window
• Analysis window:
  • Should preserve continuity (why?)
  • And constant overlap…
  • Solution: interlacing windows
Analysis window
• Interlaced Hamming windows:
Applications
System Identification
• y(n) = Measured signal
• x(n) = Input signal (to estimate)
• ξ(n) = Additive noise (unknown)
• h(n) = Unknown LTI system
• Based on the NLMS approximation.
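For orientation, a minimal time-domain NLMS sketch for the model y(n) = (h * x)(n) + ξ(n). The presentation works in the STFT domain, so this is a simplified stand-in rather than the authors' algorithm; the function name `nlms_identify`, the step size `mu` and the test signal are assumptions.

```python
import numpy as np


def nlms_identify(x, y, num_taps, mu=0.5, eps=1e-8):
    """Estimate an FIR system from input x and measured output y with NLMS."""
    h_hat = np.zeros(num_taps)
    err = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        x_vec = x[n - num_taps + 1 : n + 1][::-1]  # x(n), x(n-1), ..., x(n-M+1)
        err[n] = y[n] - h_hat @ x_vec              # a-priori estimation error
        h_hat += mu * err[n] * x_vec / (x_vec @ x_vec + eps)  # normalized update
    return h_hat, err


# Hypothetical test: a 16-tap decaying random system driven by white noise.
rng = np.random.default_rng(0)
h_true = rng.standard_normal(16) * np.exp(-0.03 * np.arange(16))
x = rng.standard_normal(5000)
y = np.convolve(x, h_true)[: len(x)] + 0.01 * rng.standard_normal(len(x))
h_hat, err = nlms_identify(x, y, num_taps=16)
print(np.linalg.norm(h_hat - h_true))
```

The normalization by the input energy is what distinguishes NLMS from plain LMS and keeps the step size meaningful when the input level changes.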
System Identification
• For the simulations:
  • ξ(n) ~ N(0, σ_ξ^2), where SNR = 30 dB
  • h(n) = w(n) β(n) e^{-0.03n}
  • β(n) ~ N(0, σ_β^2)
  • w(n) is a rectangular window of length N_h
  • STFT overlap was 50%.
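The simulation setup above could be generated roughly as follows. This is a sketch under assumptions: the SNR is taken relative to the noise-free system output, σ_β = 1, and the helper names `make_system` and `add_noise` are hypothetical.

```python
import numpy as np


def make_system(n_h, rng, sigma_beta=1.0, decay=0.03):
    """h(n) = w(n) * beta(n) * exp(-decay * n); w(n) is rectangular of length n_h."""
    n = np.arange(n_h)  # the rectangular window simply limits the support
    beta = rng.normal(0.0, sigma_beta, n_h)
    return beta * np.exp(-decay * n)


def add_noise(d, snr_db, rng):
    """Add white Gaussian noise xi(n) so that the SNR relative to d is snr_db."""
    noise_var = np.mean(d ** 2) / 10 ** (snr_db / 10)
    return d + rng.normal(0.0, np.sqrt(noise_var), len(d)), noise_var


rng = np.random.default_rng(1)
h = make_system(n_h=16, rng=rng)
x = rng.normal(0.0, 1.0, 20000)        # white Gaussian input, x(n) ~ N(0, 1)
d = np.convolve(x, h)[: len(x)]        # noise-free system output
y, noise_var = add_noise(d, snr_db=30, rng=rng)
measured_snr = 10 * np.log10(np.mean(d ** 2) / np.mean((y - d) ** 2))
print(round(measured_snr, 2))
```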
System Identification
• Updating the estimated system coefficients at a transient frame: cubic interpolation.
• Zero-padding the DFT produced similar results.
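A sketch of the zero-padding alternative mentioned above: frequency-domain coefficients estimated with one window length are mapped to another length by going back to the time domain and taking a longer (zero-padded) DFT. The function name `resample_coefficients` is hypothetical, and the authors' cubic interpolation is not shown.

```python
import numpy as np


def resample_coefficients(h_freq, new_len):
    """Map coefficients from len(h_freq) bins to new_len bins via the time domain."""
    h_time = np.fft.ifft(h_freq)
    # np.fft.fft zero-pads when n > len(h_time) and truncates when n is smaller.
    return np.fft.fft(h_time, n=new_len)


# Hypothetical check: a 16-tap system observed with 64 bins, moved to 128 bins.
rng = np.random.default_rng(2)
h = rng.standard_normal(16)
h_64 = np.fft.fft(h, n=64)
h_128 = resample_coefficients(h_64, 128)
print(np.allclose(h_128, np.fft.fft(h, n=128)))
```

The mapping is exact here because the impulse response is shorter than the smaller window; for longer responses it is only an approximation.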
System Identification: Noise
• System identification with white Gaussian noise as the input signal:
  • x(n) ~ N(0, 1)
  • 0 < n < 9,200 [samples]: N_h = 16 [samples]
  • n > 9,200 [samples]: N_h ≠ 16 [samples]
• Prior knowledge: the change time.
• The window length was changed at the beginning and after 9,200 samples, for fast convergence.
System Identification: Noise
• White Gaussian noise: time-varying window length
System Identification: Noise
• White Gaussian noise: smoothed error
System Identification: AEC
• System identification for Acoustic Echo Cancellation (AEC).
• x(n) is a speech signal.
  • Sample rate = 16 kHz.
• Room echo path: h(n)
  • t < 2 [sec]: N_h = 16 [samples].
  • Changed after 2 seconds.
• Again, prior knowledge about the change.
System Identification: AEC
• AEC: time-varying window length
System Identification: AEC
• AEC: results.
  • (a) Far-end signal
  • (b) Near-end signal
  • (c)-(f) Error signals for window lengths 128, 512, 1024 and time-varying
Similar work
Reducing computational cost
• Adapting the time-frequency resolution over time: AR-STFT [Qaisar et al., 2008].
  • For reducing computational cost.
  • Controlling the A/D sampling rate.
Optimize processing quality
• Choose the window length that maximizes a measure of short-time time-frequency concentration.
• Also investigates transformations other than the STFT: wavelet and cone-kernel.
• By Jones and Baraniuk, 1994.
Optimize processing quality
• (a) Short, (b) medium, (c) long STFT.
Overcome impulse noise
• A varying window length can be used for reducing impulse noise [Wei, Bi, 2003].
• By optimizing the window length according to signal characteristics:
  • Rotation direction.
  • Chirp rate.
Similar work
• Many more uses and manipulations of a varying window length can be found online.
My twist…
Phoneme adaptation
• Best window length for a varying speech signal
• The system identification applications adapted the window length only to the changing system.
• Why not also adapt to the changing input signal?
• For example: adapting to different phonemes.
Phoneme adaptation
• The experiment:
  • Gaussian noise with a given variance.
  • SNR = 10 dB.
  • Time-varying Wiener filtering.
  • Offline processing.
  • Known phoneme division over time (for example, by preprocessing).
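A rough sketch of Wiener filtering of a single frame when the noise variance is known, as in the experiment above. The clean-speech power is estimated by simple spectral subtraction and the gain floor of 0.05 is an arbitrary choice; both are assumptions, as are the function names. Musical noise reduction is not modeled.

```python
import numpy as np


def wiener_gain(noisy_power, noise_var, floor=0.05):
    """Per-bin Wiener gain from a spectral-subtraction estimate of the clean power."""
    clean_power = np.maximum(noisy_power - noise_var, 0.0)
    gain = clean_power / (clean_power + noise_var)
    return np.maximum(gain, floor)


def wiener_filter_frame(frame, window, noise_var_time):
    """Filter one windowed frame; noise_var_time is the per-sample noise variance."""
    spectrum = np.fft.rfft(frame * window)
    # Expected noise power per DFT bin for white noise seen through the window.
    noise_var_bin = noise_var_time * np.sum(window ** 2)
    gain = wiener_gain(np.abs(spectrum) ** 2, noise_var_bin)
    return np.fft.irfft(gain * spectrum, n=len(frame))


# Hypothetical frame: a tone in white noise, with a rectangular window.
rng = np.random.default_rng(0)
N = 256
clean = np.sin(2 * np.pi * 8 * np.arange(N) / N)
noisy = clean + rng.normal(0.0, 0.3, N)
out = wiener_filter_frame(noisy, np.ones(N), noise_var_time=0.09)
print(np.sum((noisy - clean) ** 2), np.sum((out - clean) ** 2))
```

In a time-varying scheme the same per-frame filter would be applied with a frame length that changes along the signal.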
Phoneme adaptation
[Block diagram. Blocks: speech input (Hebrew utterance: "Red... Blue... Fell!"), added Gaussian noise, phoneme recognition (given), time divisor, time-varying Wiener filtering.]
Phoneme recognition
• The phoneme recognizer returns one of four phoneme types (changing over time):
  • Silent,
  • White noise (ssss, fff, etc.),
  • Vowel (aaa, eee…),
  • Or impulse (d, t, …).
• Recognized manually in advance for the purpose of the experiment.
Phoneme recognition
[Figure: original signal with its phoneme-type segmentation (Silent, Impulse, Vowel, White); amplitude vs. time (s), 0 to 3.5 s.]
Time divisor
• The time divisor decides the window length.
  • May change over time.
  • Depends on the phoneme type.
• Constant length: simple Wiener filtering.
Time divisor
• Time divisors tested:
  • Per-phoneme window length.
  • Short-convergence time divisor:
    • Short window length right after a phoneme type change.
    • Long window length afterwards, until the next phoneme type change.
• Motivation:
  • In an empirical experiment, the error depends on:
    • Phoneme type
    • Window length
  • → Optimizing the window length for each phoneme type may result in better performance.
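The two time divisors described above can be sketched as functions that turn a known phoneme segmentation into a window-length plan. The segment times, the per-type lengths, the settling time and the function names are all hypothetical values chosen for illustration.

```python
def per_phoneme_lengths(segments, length_by_type):
    """One window length (ms) per segment, looked up by phoneme type."""
    return [(start, end, length_by_type[kind]) for start, end, kind in segments]


def short_convergence_lengths(segments, short_ms, long_ms, settle_s):
    """Short window right after each phoneme-type change, long window afterwards."""
    plan = []
    for start, end, _kind in segments:
        switch = min(start + settle_s, end)
        plan.append((start, switch, short_ms))
        if switch < end:  # the segment outlasts the settling period
            plan.append((switch, end, long_ms))
    return plan


# Hypothetical segmentation (seconds) and window lengths (ms).
segments = [(0.0, 0.4, "silent"), (0.4, 0.45, "impulse"), (0.45, 1.0, "vowel")]
lengths = {"silent": 10, "impulse": 2, "vowel": 8, "white": 4}
per_phoneme = per_phoneme_lengths(segments, lengths)
short_conv = short_convergence_lengths(segments, short_ms=2, long_ms=10, settle_s=0.1)
print(per_phoneme)
print(short_conv)
```

A constant-length divisor is the special case where every segment maps to the same window length, which reduces to ordinary Wiener filtering.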
Per phoneme window length
• Motivation:
[Figure: average error for each phoneme type; average squared error (log scale, 10^-4 to 10^-2) vs. window length (2 to 10 ms); curves: Overall, Silent, Impulse, Vowel, White.]
Per phoneme window length
• Motivation:
  • Similar to the adaptive system identification idea.
  • Really? Wiener filtering vs. NLMS.
  • But the adaptation follows the signal instead of the system.
Short convergence time divisor
Phoneme adaptation
[Figure: error per phoneme (with musical noise reduction); bars for Silent, Impulse, Vowel, White and Overall; methods: optimal length per phoneme, uniform optimal length, short STFT for convergence.]
Phoneme adaptation
[Figure: four panels, amplitude vs. time (s): original signal with phoneme types (Silent, Impulse, Vowel, White); error for optimal length per phoneme; error for uniform optimal length; error for short STFT for convergence.]
Phoneme adaptation disadvantages
• Large error at window-length replacement.
• Tried to improve by:
  • Very small alpha (0.5) at length replacement.
  • Using the old filter for a while after replacement.
    • The old filter is not optimal for the new size (why this happens needs further investigation).
• Besides the large mathematical error, there are unpleasant audible artifacts at length replacement.
• We didn't discuss the computational cost…
Thanks!
Thanks for listening!
References
• Hadas Benisty, Yekutiel Avargel and Israel Cohen. Adaptive System Identification using Time-Varying Fourier Transform. Department of Electrical Engineering, Technion - Israel Institute of Technology.
• Saeed Mian Qaisar, Laurent Fesquet and Marc Renaudin. An Adaptive Resolution Computationally Efficient Short-Time Fourier Transform. Proceedings of World Academy of Science, Engineering and Technology, Vol. 31, July 2008, ISSN 1307-6884.
• Douglas L. Jones and Richard G. Baraniuk. A Simple Scheme for Adapting Time-Frequency Representations. IEEE Transactions on Signal Processing, Vol. 42, No. 12, Dec. 1994.
• Y. M. Wei and G. A. Bi. Robust STFT with Adaptive Window Length and Rotation Direction. 4th International Conference on Information, Communications and Signal Processing, ICIS-PCM 2003, pp. 827-829.
Musical signal
[Figure: four panels, amplitude vs. time (s), 0 to 2 s: original musical signal; error for optimal length per phoneme; error for uniform optimal length; error for short STFT for convergence.]
Applications
• Application #3 – Per phoneme length.
[Figure: average error for each phoneme type; average squared error (log scale, 10^-3 to 10^-1) vs. window length (2 to 10 ms); curves: Overall, Silent, Impulse, Vowel, White.]
Applications
• Application #3 – Per phoneme length.
[Figure: error per phoneme (with musical noise reduction); bars for Silent, Impulse, Vowel, White and Overall; methods: optimal length per phoneme, uniform optimal length.]
Applications
• Application #3 – Per phoneme length.
[Figure: error per phoneme (no musical noise reduction); bars for Silent, Impulse, Vowel, White and Overall; methods: optimal length per phoneme, uniform optimal length.]
Applications
• Application #3 – Per phoneme length.
[Figure: three panels, amplitude vs. time (s): original signal with phoneme types (Silent, Impulse, Vowel, White); error for optimal length per phoneme; error for uniform optimal length.]