hadas benisty, yekutiel avargel and israel cohen · hadas benisty, yekutiel avargel and israel...

ADAPTIVE SYSTEM IDENTIFICATION USING

TIME-VARYING FOURIER TRANSFORM

Hadas Benisty, Yekutiel Avargel and Israel Cohen

Presented by: Idan Igra

Technion, July 2013

Motivation

Long vs. short window

• In some processing schemes:
  • Long-window analysis is more accurate in terms of steady-state error
  • Short-window analysis yields faster convergence

Long vs. short window

• Varying the window length could resolve this tradeoff
• Assuming we can identify the points of interest

Mathematics

Time Varying STFT

• Like the STFT, the time-varying STFT (TV-STFT) has to address the following issues:
  • Transformation
  • Decimation factor
  • Overlap
  • Inverse transformation
  • Completeness condition
  • Analysis windows

Time Varying STFT

• Time-varying Fourier transform:

Time Varying STFT vs. STFT

• Recall the original STFT:
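For reference, one common discrete-time form of the STFT is written out below. The notation is an assumption made for this write-up rather than a copy of the slide's formula: $x(m)$ is the signal, $\tilde{\psi}(n)$ an analysis window of length $N$, $L$ the decimation factor, $p$ the frame index and $k$ the frequency-bin index.

```latex
X_{p,k} = \sum_{m} x(m)\, \tilde{\psi}^{*}(m - pL)\, e^{-j \frac{2\pi}{N} k (m - pL)},
\qquad k = 0, 1, \ldots, N - 1 .
```

In this form both $N$ and $L$ are fixed for the whole signal, which is the restriction the time-varying transform relaxes.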

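• In the TV-STFT, the fixed window length N is replaced by a time-varying length N(t)
• N(t) is a piecewise-constant function:

$N(t) = N_v, \quad t_{v-1} < t < t_v, \quad v = 1, 2, \ldots, V$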
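A piecewise-constant window length of this kind can be sketched as a lookup over the segment boundaries. This is a minimal illustration, not the authors' implementation: the function name `window_length`, the choice to assign a boundary sample to the later segment, and the numbers in the usage lines are all assumptions.

```python
import numpy as np

def window_length(t, boundaries, lengths):
    """Return N(t) for a piecewise-constant window-length schedule.

    boundaries: increasing change times t_1 < ... < t_{V-1} (hypothetical values)
    lengths:    N_1, ..., N_V, one window length per segment
    """
    boundaries = np.asarray(boundaries)
    lengths = np.asarray(lengths)
    if len(lengths) != len(boundaries) + 1:
        raise ValueError("need exactly one more length than boundaries")
    # searchsorted gives the index of the segment containing t;
    # side="right" puts a sample that sits on a boundary into the later segment.
    return lengths[np.searchsorted(boundaries, t, side="right")]

# Hypothetical schedule: short windows at the start and right after a change.
schedule_times = [100, 9200]
schedule_lengths = [128, 1024, 128]
n_at_5000 = window_length(5000, schedule_times, schedule_lengths)
```

The function also accepts an array of times, which makes it easy to plot N(t) against the signal.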

Decimation Factor

• The decimation factor $L_v$ is also time-varying
• Piecewise constant, like $N_v$.

Overlap

• And what about the overlap?
• It stays constant:

$N_v / L_v = \text{const}$
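Keeping $N_v / L_v$ constant means the decimation factor follows directly from the window length of each segment. The sketch below builds a frame schedule under that rule. The segment layout, the function names, and the decision to drop frames that would cross a segment end are assumptions for illustration; in particular, the transition between segments is not handled here.

```python
def decimation_factor(window_len, overlap_ratio=2):
    """L_v such that N_v / L_v equals the fixed ratio (2 corresponds to 50% overlap)."""
    if window_len % overlap_ratio != 0:
        raise ValueError("window length must be divisible by the overlap ratio")
    return window_len // overlap_ratio

def frame_starts(segments, overlap_ratio=2):
    """Return a list of (start_sample, N_v) pairs, one per frame.

    segments: list of (segment_start, segment_end, N_v) in samples (hypothetical layout).
    Frames that would extend past the end of their segment are not generated.
    """
    frames = []
    for seg_start, seg_end, n_v in segments:
        hop = decimation_factor(n_v, overlap_ratio)
        start = seg_start
        while start + n_v <= seg_end:
            frames.append((start, n_v))
            start += hop
    return frames

# Hypothetical example: a short-window segment followed by a long-window segment.
frames = frame_starts([(0, 512, 128), (512, 2560, 1024)])
```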

Inverse TV-STFT

• The inverse transform:

Inverse TV-STFT

• Completeness condition:
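For a fixed window length, a completeness (perfect-reconstruction) condition of this kind requires the shifted products of the analysis and synthesis windows to sum to a constant. The sketch below checks this numerically for one assumed window pair: a square-root periodic Hann window used for both analysis and synthesis, with 50% overlap. The slide's exact condition and normalization are not reproduced; the names and sizes are illustrative.

```python
import numpy as np

def overlap_add_profile(analysis, synthesis, hop, num_frames):
    """Sum the shifted analysis*synthesis window products over num_frames frames."""
    n = len(analysis)
    total = np.zeros(hop * (num_frames - 1) + n)
    for p in range(num_frames):
        total[p * hop : p * hop + n] += analysis * synthesis
    return total

def sqrt_hann(n):
    """Square root of a periodic Hann window of length n."""
    return np.sqrt(0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n) / n))

N, L = 128, 64                        # assumed window length and decimation factor
w = sqrt_hann(N)
profile = overlap_add_profile(w, w, L, num_frames=8)
interior = profile[N - L : -(N - L)]  # samples covered by two overlapping frames
is_complete = np.allclose(interior, 1.0)
```

Only the interior is checked because the first and last half-frames are covered by a single window, so the sum cannot reach the constant there.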

Analysis window

• Analysis window:
  • Should preserve continuity (why?)
  • And a constant overlap…
  • Solution: interlacing windows

Analysis window

• Interlaced Hamming windows:

Applications

System Identification

• y(n) = measured signal
• x(n) = input signal
• ξ(n) = additive noise (unknown)
• h(n) = unknown LTI system (to be estimated)
• Based on the NLMS (normalized least-mean-squares) algorithm.
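The identification loop can be illustrated with a plain time-domain NLMS filter. This is a simplified sketch and not the STFT-domain scheme of the paper: it assumes the usual model y(n) = (h * x)(n) + ξ(n), and the function name `nlms_identify`, the step size `mu` and the regularizer `eps` are choices made for the example.

```python
import numpy as np

def nlms_identify(x, y, num_taps, mu=0.5, eps=1e-8):
    """Estimate an FIR system with num_taps coefficients from input x and output y."""
    # Zero-pad the input so the first regressor vectors are well defined.
    padded = np.concatenate([np.zeros(num_taps - 1), x])
    h_hat = np.zeros(num_taps)
    err = np.zeros(len(x))
    for n in range(len(x)):
        x_vec = padded[n:n + num_taps][::-1]   # x(n), x(n-1), ..., x(n-num_taps+1)
        err[n] = y[n] - h_hat @ x_vec          # a priori estimation error
        # Normalized update: the step is scaled by the regressor energy.
        h_hat += mu * err[n] * x_vec / (x_vec @ x_vec + eps)
    return h_hat, err
```

The returned error signal is the quantity usually plotted (after smoothing) to compare convergence speed and steady-state level.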


• For the simulations:
  • $\xi(n) \sim \mathcal{N}(0, \sigma_\xi^2)$, with SNR = 30 dB
  • $h(n) = w(n)\,\beta(n)\,e^{-0.03n}$
  • $\beta(n) \sim \mathcal{N}(0, \sigma_\beta^2)$
  • $w(n)$ is a rectangular window of length $N_h$
  • The STFT overlap was 50%.
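The simulated system and noise can be generated along the following lines. This is a sketch under assumptions: $\sigma_\beta$ is set to 1 because the slides do not give its value, the noise is scaled so that the sample SNR matches the target exactly, and the function names are hypothetical.

```python
import numpy as np

def make_system(num_taps, decay=0.03, beta_std=1.0, rng=None):
    """h(n) = w(n) * beta(n) * exp(-decay * n), w(n) rectangular of length num_taps."""
    rng = np.random.default_rng() if rng is None else rng
    n = np.arange(num_taps)
    beta = rng.normal(0.0, beta_std, num_taps)
    # The rectangular window simply limits the response to num_taps samples.
    return beta * np.exp(-decay * n)

def add_noise(clean, snr_db, rng=None):
    """Add white Gaussian noise scaled so that the sample SNR equals snr_db."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(len(clean))
    target_power = np.mean(clean ** 2) / 10 ** (snr_db / 10)
    noise *= np.sqrt(target_power / np.mean(noise ** 2))
    return clean + noise, noise
```

A measured signal for the white-noise experiment would then be obtained by convolving a Gaussian input with `make_system(16)` and passing the result through `add_noise(..., 30)`.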


• Updating the estimated system coefficients at a transient frame: cubic interpolation.
• Zero-padding the DFT produced similar results.

System Identification: Noise

• System identification with white Gaussian noise as the input signal:
  • $x(n) \sim \mathcal{N}(0, 1)$
  • 0 < n < 9,200 samples: $N_h$ = 16 samples
  • n > 9,200 samples: $N_h$ ≠ 16 samples
  • Prior knowledge: the time of the change.
  • The window length was changed at the beginning and again after 9,200 samples, for fast convergence


• White Gaussian noise: time-varying window length

• White Gaussian noise: smoothed error

System Identification: AEC

• System identification for acoustic echo cancellation (AEC).
  • x(n) is a speech signal
  • Sample rate: 16 kHz.
  • Room echo path: h(n)
  • t < 2 s: $N_h$ = 16 samples.
  • Changed after 2 seconds.
  • Again, the change is known in advance.


• AEC: time-varying window length

• AEC: results.
  • (a) Far-end signal
  • (b) Near-end signal
  • (c)-(f) Error signals for window lengths 128, 512, 1024 and time-varying

Similar work

Reducing computational cost

• Adapting the time-frequency resolution over time: AR-STFT [Qaisar et al., 2008].
  • Aimed at reducing computational cost.
  • Controls the A/D sampling rate.

Optimize processing quality

• Choose the window length that maximizes a measure of short-time time-frequency concentration [Jones et al., 1994].
• Transforms other than the STFT were also investigated: wavelet and cone-kernel.

Optimize processing quality

• (a) short, (b) medium, (c) long STFT.

Overcome impulse noise

• A varying window length can be used to reduce impulse noise [Wei, Bi, 2003].
• By optimizing the window length with respect to certain signal characteristics:
  • Rotation direction.
  • Chirp rate.

Overcome impulse noise

Similar work

Many more uses and variations of a varying window length can be found online

My twist…

Phoneme adaptation

• Best window length for a varying speech signal
• The system identification applications adapted the window length only to the changing system
• Why not also adapt it to the changing input signal?
• For example: adapting to different phonemes

Phoneme adaptation

• The experiment:
  • Gaussian noise with a given variance.
  • SNR = 10 dB.
  • Time-varying Wiener filtering.
  • Offline processing.
  • Known phoneme division over time (for example, from preprocessing).
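The filtering step of the experiment can be illustrated with a basic STFT-domain Wiener filter for a single, fixed window length. This is a textbook-style sketch and not necessarily the presenter's implementation: it assumes white noise of known variance, a square-root Hann window with 50% overlap, and a gain of the form ξ/(1+ξ) with ξ estimated per bin as max(|Y|²/P_noise − 1, 0). A time-varying version would apply the same routine with a different `frame_len` in each segment.

```python
import numpy as np

def wiener_denoise(noisy, noise_var, frame_len=512, gain_floor=0.0):
    """Denoise a signal with a per-bin Wiener gain, using overlap-add at 50% overlap."""
    hop = frame_len // 2
    n = np.arange(frame_len)
    # Square-root periodic Hann: analysis * synthesis windows overlap-add to 1.
    window = np.sqrt(0.5 - 0.5 * np.cos(2 * np.pi * n / frame_len))
    # Expected squared magnitude of a windowed white-noise DFT bin.
    noise_power = noise_var * np.sum(window ** 2)
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame_len + 1, hop):
        spec = np.fft.rfft(window * noisy[start:start + frame_len])
        power = np.abs(spec) ** 2
        # xi / (1 + xi) with xi = max(power / noise_power - 1, 0)
        gain = np.maximum(1.0 - noise_power / np.maximum(power, 1e-12), gain_floor)
        out[start:start + frame_len] += window * np.fft.irfft(gain * spec, n=frame_len)
    return out
```

Raising `gain_floor` above zero is one simple way to trade residual noise against musical-noise artifacts; samples near the two ends of the signal are covered by a single frame and are therefore attenuated.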

Phoneme adaptation

[Block diagram. Blocks: speech input (example Hebrew words: "red ...", "blue ...", "fell!"), additive Gaussian noise, phoneme recognition (given), time divisor, time-varying Wiener filtering]

Phoneme recognition

• The phoneme recognizer returns one of four phoneme types (the type changes over time):
  • Silent,
  • White noise (ssss, fff, etc.),
  • Vowel (aaa, eee…),
  • Or impulse (d, t, …).
• Recognized manually in advance for the purposes of the experiment.

Phoneme recognition

[Figure: original speech waveform, amplitude vs. time (s), 0 to 3.5 s, with segments labeled by phoneme type: Silent, Impulse, Vowel, White]

Time divisor

• The time divisor decides the window length
  • May change over time.
  • Depends on the phoneme type.
  • Constant length: simple Wiener filtering.

Time divisor

• Time divisors tested:
  • Per-phoneme window length.
  • Short-convergence time divisor:
    • Short window length right after a phoneme-type change.
    • Long window length afterwards, until the next phoneme-type change.
• Motivation:
  • According to an empirical experiment, the error depends on:
    • Phoneme type
    • Window length
  • ⇒ Optimizing the window length for each phoneme type may result in better performance.
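The two time divisors described above can be sketched as functions that turn a phoneme segmentation into a window-length schedule. Everything concrete here is hypothetical: the segment format, the function names, the settle time, and the window lengths in the usage example are illustrative values, not the ones used in the experiment.

```python
def per_phoneme_lengths(segments, length_by_type):
    """Assign one window length to each phoneme segment.

    segments: list of (start_ms, end_ms, phoneme_type), types changing between segments
    length_by_type: mapping from phoneme type to window length in ms
    """
    return [(start, end, length_by_type[ptype]) for start, end, ptype in segments]

def short_convergence_lengths(segments, short_ms, long_ms, settle_ms):
    """Use a short window right after each phoneme-type change, then a long one."""
    schedule = []
    for start, end, _ptype in segments:
        split = min(start + settle_ms, end)
        schedule.append((start, split, short_ms))
        if split < end:
            # Long window for the rest of the segment, until the next change.
            schedule.append((split, end, long_ms))
    return schedule

# Hypothetical segmentation and window lengths (ms).
segments = [(0, 500, "silent"), (500, 600, "impulse"), (600, 1500, "vowel")]
lengths = {"silent": 10, "impulse": 2, "vowel": 8, "white": 4}
per_phoneme = per_phoneme_lengths(segments, lengths)
short_then_long = short_convergence_lengths(segments, short_ms=2, long_ms=10, settle_ms=250)
```

A constant-length schedule, i.e. simple Wiener filtering, is the special case where every phoneme type maps to the same length.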

Per phoneme window length

• Motivation:

[Figure: "Average error for each phoneme type": average squared error (log scale) vs. window length (ms); curves: Overall, Silent, Impulse, Vowel, White]

Per phoneme window length

• Motivation:
  • Similar to the adaptive system identification idea.
  • Really? Wiener filtering vs. NLMS.
  • But the adaptation follows the signal instead of the system.

Short convergence time divisor

Phoneme adaptation

[Figure: "Error per phoneme (with musical noise reduction)": error for Silent, Impulse, Vowel, White and Overall; compared methods: optimal length per phoneme, uniform optimal length, short STFT for convergence]

Phoneme adaptation

[Figure: four panels, amplitude vs. time (s), 0 to 3.5 s: original signal with phoneme-type labels (Silent, Impulse, Vowel, White); "Optimal length per phoneme - error"; "Uniform optimal length - error"; "Short STFT for convergence - error"]

Phoneme adaptation disadvantages

• Large error when the window length is replaced
• Attempts to improve this:
  • A very small alpha (0.5) at length replacement
  • Using the old filter for a while after the replacement
    • The old filter is not optimal for the new size (why this is so needs further investigation).
• Besides the large numerical error, length replacement also produces unpleasant audible artifacts.
• We didn't discuss the computational cost…

Thanks!

Thanks for listening!

References

• Hadas Benisty, Yekutiel Avargel and Israel Cohen. Adaptive System Identification Using Time-Varying Fourier Transform. Department of Electrical Engineering, Technion - Israel Institute of Technology.

• Saeed Mian Qaisar, Laurent Fesquet and Marc Renaudin. An Adaptive Resolution Computationally Efficient Short-Time Fourier Transform. Proceedings of World Academy of Science, Engineering and Technology, Vol. 31, July 2008. ISSN 1307-6884.

• Douglas L. Jones and Richard G. Baraniuk. A Simple Scheme for Adapting Time-Frequency Representations. IEEE Transactions on Signal Processing, Vol. 42, No. 12, Dec. 1994.

• Y. M. Wei and G. A. Bi. Robust STFT with Adaptive Window Length and Rotation Direction. 4th International Conference on Information, Communications and Signal Processing (ICIS-PCM 2003), pp. 827-829.

Musical signal

[Figure: four panels, amplitude vs. time (s), 0 to 2 s: original signal and three error signals; only the last panel title survives: "Short STFT for convergence - error"]

Applications

• Application #3: per-phoneme length.

[Figure: "Average error for each phoneme type": average squared error (log scale) vs. window length (ms); curves: Overall, Silent, Impulse, Vowel, White]

Applications



[Figure: "Error per phoneme (with musical noise reduction)"]



Applications



[Figure: "Error per phoneme (no musical noise reduction)"]



Applications


[Figure: three panels, amplitude vs. time (s), 0 to 3.5 s: original signal with phoneme-type labels (Silent, Impulse, Vowel, White) and two further panels whose titles did not survive]
