Communications & Multimedia Signal Processing

Formant Track Restoration in Train Noisy Speech

Qin Yan

Communication & Multimedia Signal Processing Group, Dept of Electronic & Computer Engineering, Brunel University

25 May, 2004




Main Progress

• Restore the formant tracks from the noisy speech.

• Initial progress of the speech enhancement system


Formant Tracking by 2D HMM in Noise Conditions

SNR (dB)    F1      F2      F3      F4      F5
 0          51.3    12.5    6.3     3.7     2.6
 5          42      9.7     4.6     2.7     1.8
10          32.3    7.4     3.4     2       1.4
15          23.1    5.8     2.6     1.5     1.1
20          15.6    4.6     2.1     1.2     1

Table: Average errors (%) of formant tracks in train noisy speech by 2D HMM at different SNR conditions

• The 2D HMM formant tracker is not robust in noise: the average F1 error rises from 15.6% at 20 dB SNR to 51.3% at 0 dB.


LP Based Formant Tracking

Figure: Procedure of LP formant tracking. Blocks in the diagram: Noisy Speech, VAD, Noise Model, LP-based Spectral Subtraction, LP Pole Analysis, Formant Candidates Selection, Kalman Filter based Formant Tracker, Reclassifier, Formant tracks.

• A high LP order is used to over-model the spectrum, so that the poles due to formants are separated from the poles due to noise.

• Formant candidate selection rejects spurious candidates.

• The Kalman filter smooths the formant tracks.

• The formant tracks are fed back to the reclassifier, which reclassifies the candidates according to their distance from the initial tracks.
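The LP pole analysis and candidate selection steps can be illustrated with a minimal sketch. It assumes the autocorrelation method with a Levinson-Durbin recursion and the usual pole-to-formant conversion (frequency from the pole angle, bandwidth from the pole radius). The function names and the thresholds `max_bw_hz` and `min_freq_hz` are hypothetical choices for illustration, not values taken from the slides.

```python
import numpy as np

def lpc_autocorr(frame, order):
    """Return LP coefficients [1, a1, ..., ap] (autocorrelation method)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    # Levinson-Durbin recursion
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a

def formant_candidates(a, fs, max_bw_hz=400.0, min_freq_hz=90.0):
    """Convert LP poles to (frequency, bandwidth) candidates in Hz.

    Poles with a wide bandwidth or a very low frequency are rejected as
    spurious; both thresholds are illustrative assumptions.
    """
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]           # one pole per conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi
    keep = (bws < max_bw_hz) & (freqs > min_freq_hz)
    order = np.argsort(freqs[keep])
    return freqs[keep][order], bws[keep][order]
```

In practice each frame would be windowed first, and a higher `order` than the number of expected formants would be passed in so that noise poles get their own roots and can be discarded by the selection step.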


LP Spectral Subtraction

• Noise is modelled with a low LP order, whereas speech is modelled with a high LP order.

• Computationally efficient.

• Disadvantages:

    • The noise variance is not taken into account.

    • A hard decision is needed to stop the subtracted values from falling below the noise floor.

    • The spectral trajectory across time is not modelled or used in the denoising process.

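\[ \hat{X}_{LPSS}(f) = W_{SS}(f)\, Y_{LP}(f) \]

\[ W_{SS}(f) = \begin{cases} 1 - \alpha(f)\, \hat{N}(f) / Y_{LP}(f) & \text{if } SNR(f) > SNR_{Thresh} \\ \exp(\cdots) & \text{otherwise} \end{cases} \]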
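A minimal sketch of a thresholded spectral subtraction gain of this kind is given below. It follows the subtraction branch of the rule (gain = 1 - alpha * noise / noisy when the per-bin SNR exceeds a threshold). The behaviour below the threshold is replaced here by a constant floor gain, which is an assumption made for illustration and not the exponential rule of the slide. The names `lpss_gain`, `alpha`, `snr_thresh_db` and `floor_gain` are hypothetical, and the per-bin SNR is taken as the noisy-to-noise magnitude ratio in dB.

```python
import numpy as np

def lpss_gain(noisy_mag, noise_mag, alpha=1.0, snr_thresh_db=0.0, floor_gain=0.1):
    """Per-bin gain for magnitude spectral subtraction with a hard threshold.

    noisy_mag : magnitude spectrum of the noisy speech (e.g. an LP spectrum)
    noise_mag : magnitude spectrum of the noise model
    """
    noisy_mag = np.asarray(noisy_mag, dtype=float)
    noise_mag = np.asarray(noise_mag, dtype=float)
    eps = 1e-12  # guards against division by zero and log of zero
    snr_db = 20.0 * np.log10((noisy_mag + eps) / (noise_mag + eps))
    gain = 1.0 - alpha * noise_mag / (noisy_mag + eps)
    # Hard decision: below the threshold, fall back to the floor (assumed rule)
    gain = np.where(snr_db > snr_thresh_db, gain, floor_gain)
    return np.clip(gain, floor_gain, 1.0)

# Usage: the cleaned spectrum is the gain times the noisy spectrum
noisy = np.array([10.0, 1.0])
noise = np.array([1.0, 1.0])
cleaned = lpss_gain(noisy, noise) * noisy
```

The hard switch between the two branches is exactly the disadvantage listed on the slide: bins near the threshold can flip between subtraction and flooring from frame to frame.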


Performance of LP Spectral Subtraction

Figure: Improvement by LP spectral subtraction. x-axis: Global SNR (dB), 0 to 20; y-axis: SNR improvement (dB), -2 to 9; series: LPSS Improvement.

Note: Improvement is calculated between average frame SNRs as:

\[ SNR_{frame} = 10 \log_{10} \frac{\sum_f |X(f)|^2}{\sum_f |X(f) - \hat{X}(f)|^2} \]
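The improvement measure can be sketched as follows, assuming that spectra are stored as arrays of shape (frames, bins) and that the frame SNR sums signal and error energy over the frequency bins of each frame. The function names are hypothetical.

```python
import numpy as np

def frame_snr_db(clean, estimate):
    """Frame SNR in dB: clean energy over error energy, one value per frame."""
    clean = np.asarray(clean)
    estimate = np.asarray(estimate)
    sig = np.sum(np.abs(clean) ** 2, axis=-1)
    err = np.sum(np.abs(clean - estimate) ** 2, axis=-1)
    return 10.0 * np.log10(sig / np.maximum(err, 1e-12))

def snr_improvement_db(clean, noisy, enhanced):
    """Difference between the average frame SNR after and before enhancement."""
    after = np.mean(frame_snr_db(clean, enhanced))
    before = np.mean(frame_snr_db(clean, noisy))
    return after - before
```

Averaging the per-frame values in dB, as done here, weights quiet and loud frames equally, which is why this figure can differ from an improvement computed on the global SNR.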


Performance I

Figure: LPC spectrogram of speech in train noise (SNR = 0 dB)

Figure: LPC spectrogram of speech in train noise after spectral subtraction


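Kalman Filter

Time Update Equations (“PREDICT”)

\[ \hat{F}_{k|k-1} = \sum_{i=1}^{P} c_{ki}\, \hat{F}_{k-i} \]

\[ \hat{F}_k^- = \hat{F}_{k|k-1} \]

\[ P_k^- = P_{k-1} + Q \]

Measurement Update Equations (“CORRECT”)

\[ K_k = P_k^- (P_k^- + R)^{-1} \]

\[ \hat{F}_k = \hat{F}_k^- + K_k (Z_k - \hat{F}_k^-) \]

\[ P_k = (I - K_k) P_k^- \]

• R is the measurement noise covariance matrix, updated from the variance of the differences between the noisy observations and the estimated tracks.

• The process noise covariance Q is set to 0.16, a value chosen experimentally.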
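A scalar predict/correct loop of this kind can be sketched for a single formant track. The sketch assumes a random-walk state model (the prediction is simply the previous estimate) and a fixed measurement variance `r`, whereas the slide updates R from the observed differences. Only `q=0.16` comes from the slide; the function name and the other defaults are hypothetical.

```python
import numpy as np

def kalman_smooth_track(z, q=0.16, r=1.0, p0=1.0):
    """Smooth one formant track z (one observation per frame)."""
    z = np.asarray(z, dtype=float)
    f_hat = np.empty_like(z)
    f_est, p_est = z[0], p0          # initialise from the first observation
    f_hat[0] = f_est
    for k in range(1, len(z)):
        # PREDICT: random-walk model (assumed), error variance grows by q
        f_pred = f_est
        p_pred = p_est + q
        # CORRECT: blend prediction and observation with the Kalman gain
        gain = p_pred / (p_pred + r)
        f_est = f_pred + gain * (z[k] - f_pred)
        p_est = (1.0 - gain) * p_pred
        f_hat[k] = f_est
    return f_hat
```

With a small `q` relative to `r` the gain stays low, so an isolated outlier candidate moves the track only slightly; raising `q` lets the track follow fast formant transitions at the cost of passing more noise.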


Performance II

Figure: Comparison of clean formant tracks (solid), cleaned formant tracks (dash-dot) and noisy formant tracks (dotted).

Formant    Noisy (SNR = 0)    Cleaned
F1         51.3               18.1
F2         12.5               11.8
F3         6.3                6.2
F4         3.7                2.7
F5         2.6                2.5

Table: Average errors (%) of formant tracks in train noisy speech and cleaned speech.


Initial Speech Enhancement System

Figure: Block diagram of the initial speech enhancement system. Blocks in the diagram: Noisy Speech, VAD, Noise Model, LP-based Spectral Subtraction, LP Pole Analysis, Formant Candidates Selection, Kalman Filter based Formant Tracker, Reclassifier, Formant tracks, Wiener Filter, Speech Reconstruction, Enhanced Speech.

Figure: SNR improvement (dB), -2 to 10, against Global SNR (dB), 0 to 20. Series: Overall Improvement, LPSS Improvement, Wiener Improvement.


Future Work

Speech enhancement with restored formant trajectories

Figure: Block diagram of the initial speech enhancement system, extended with Pitch Track Restoration and Residual blocks. Other blocks in the diagram: Noisy Speech, VAD, Noise Model, LP-based Spectral Subtraction, LP Pole Analysis, Formant Candidates Selection, Kalman Filter based Formant Tracker, Reclassifier, Formant tracks, Wiener Filter, Speech Reconstruction, Enhanced Speech.


Future Work

Speech enhancement with restored formant trajectories

Figure: The block diagram of the previous slide, with its blocks grouped into two labelled subsystems: Formant Tracks Restoration System and Speech Enhancement System.


The End