Speech Enhancement

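Wiener Filtering: A linear estimation of the clean signal from the noisy signal, using the MMSE criterion.

For additive noise:

$$z_t = y_t + v_t$$

$y_t$: Clean speech ($K \times 1$)

$v_t$: Noise ($K \times 1$)

$z_t$: Noisy speech ($K \times 1$)

$$\hat{y}_t = E\{y_t \mid z_t\} = a\,z_t = a\,(y_t + v_t)$$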

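Projection Theorem: The mean square error

$$E\{\|y_t - \hat{y}_t\|^2\}$$

is minimum if $a$ is selected such that the error $y_t - \hat{y}_t$ is orthogonal to the noisy signal, i.e.:

$$y_t - \hat{y}_t = y_t - a\,(y_t + v_t)$$

$$\big(y_t - a\,(y_t + v_t)\big) \perp (y_t + v_t)$$

$$E\{y_t\,(y_t + v_t)^H\} = a\,E\{(y_t + v_t)(y_t + v_t)^H\}$$

$H$: Hermitian transposition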

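Assuming $y_t$ and $v_t$ to be zero-mean and uncorrelated, i.e.,

$$E\{y_t v_t^H\} = E\{v_t y_t^H\} = 0$$

then we'll have:

$$E\{y_t y_t^H\} = a\,E\{y_t y_t^H + v_t v_t^H\}$$

$$a = E\{y_t y_t^H\}\,\big(E\{y_t y_t^H + v_t v_t^H\}\big)^{-1}$$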

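Since $y$ and $v$ are zero mean, their covariance matrices are

$$\Sigma_y = E\{y_t y_t^H\}, \qquad \Sigma_v = E\{v_t v_t^H\}$$

and

$$a = \Sigma_y\,(\Sigma_y + \Sigma_v)^{-1}$$

This is called the time-domain Wiener filter.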
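As a small numerical illustration of the time-domain Wiener filter, the sketch below computes the filter matrix as the clean-speech covariance times the inverse of the summed speech and noise covariances, for frames of length K = 2. It is a minimal sketch in plain Python; the function names and the covariance values are hypothetical and chosen only so the result is easy to check by hand.

```python
def wiener_matrix_2x2(cov_y, cov_v):
    """Return a = cov_y * inverse(cov_y + cov_v) for real 2x2 covariance matrices."""
    # Covariance of the noisy signal z = y + v (y and v assumed uncorrelated).
    s = [[cov_y[i][j] + cov_v[i][j] for j in range(2)] for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if det == 0:
        raise ValueError("cov_y + cov_v is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    # Matrix product cov_y * inv.
    return [[sum(cov_y[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply_filter(a, z):
    """Estimate the clean frame as y_hat = a z."""
    return [sum(a[i][k] * z[k] for k in range(2)) for i in range(2)]


cov_y = [[4.0, 0.0], [0.0, 1.0]]  # hypothetical clean-speech covariance
cov_v = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical noise covariance
a = wiener_matrix_2x2(cov_y, cov_v)
y_hat = apply_filter(a, [5.0, 2.0])  # filter one noisy frame
```

With these diagonal covariances each component is scaled by its own signal-to-(signal plus noise) ratio, so the component with the higher SNR is attenuated less.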

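We are looking for a frequency-domain Wiener filter, called the non-causal Wiener filter, such that:

$$\hat{y}(t) = \int_{-\infty}^{\infty} h(\tau)\,z(t - \tau)\,d\tau$$

According to the projection theorem, for the error

$$E\{|y(t) - \hat{y}(t)|^2\}$$

to be minimum, the difference

$$y(t) - \hat{y}(t)$$

has to be orthogonal to the noisy input:

$$E\{\big(y(t) - \hat{y}(t)\big)\,z(t')\} = 0 \quad \text{for all } t'$$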

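$$E\Big\{\Big(y(t) - \int h(\tau)\,z(t - \tau)\,d\tau\Big)\,z(t')\Big\} = 0$$

or

$$E\{y(t)\,z(t')\} = \int h(\tau)\,E\{z(t - \tau)\,z(t')\}\,d\tau$$

or, with $R_{yz}(t - t') = E\{y(t)\,z(t')\}$ and $R_{zz}(t - t') = E\{z(t)\,z(t')\}$:

$$R_{yz}(t - t') = \int h(\tau)\,R_{zz}(t - t' - \tau)\,d\tau$$

Writing $t$ for the lag $t - t'$:

$$R_{yz}(t) = \int h(\tau)\,R_{zz}(t - \tau)\,d\tau$$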

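$$R_{yz}(t) = h(t) * R_{zz}(t) \quad \text{(Convolution)}$$

$$S_{yz}(\omega) = H(j\omega)\,S_{zz}(\omega)$$

$$H(j\omega) = \frac{S_{yz}(\omega)}{S_{zz}(\omega)}$$

$S_{zz}$: Spectrum of $z(t)$

$S_{yz}$: Cross spectrum between $y(t)$ and $z(t)$

$$S_{yz} = S_{yy}$$

since:

$$R_{yz}(t - t') = E\{y(t)\,\big(y(t') + v(t')\big)\} = E\{y(t)\,y(t')\} + E\{y(t)\,v(t')\} = R_{yy}(t - t')$$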

$$S_{zz} = S_{yy} + S_{vv}$$

$$H(j\omega) = \frac{S_{yy}}{S_{yy} + S_{vv}}$$

Popular form of Wiener filter:

$$H(j\omega) = \frac{S_{zz} - S_{vv}}{S_{zz}}$$
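The popular form of the Wiener filter can be evaluated bin by bin once the noisy-signal spectrum and a noise spectrum estimate are available. The following is a minimal sketch in plain Python; the function name `wiener_gain` and the clamping of the gain to zero when the noise estimate exceeds the noisy power are assumptions added for illustration, not part of the slides.

```python
def wiener_gain(s_zz, s_vv):
    """Per-bin gain H = (S_zz - S_vv) / S_zz, clamped at zero (assumption)."""
    gains = []
    for zz, vv in zip(s_zz, s_vv):
        if zz <= 0.0:
            # No noisy-signal power in this bin: nothing to pass through.
            gains.append(0.0)
        else:
            # Clamp negative values, which occur when the noise estimate
            # is larger than the observed noisy power.
            gains.append(max(0.0, (zz - vv) / zz))
    return gains


# Hypothetical power spectra for three frequency bins.
gains = wiener_gain([4.0, 2.0, 1.0], [1.0, 1.0, 2.0])
```

Bins dominated by speech keep a gain close to one, while bins where the noise estimate matches or exceeds the observed power are suppressed.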

Spectral Subtraction

$$z_t = y_t + v_t$$

$$Z_t = Y_t + V_t$$

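$$|\hat{Y}_t|^2 = |Z_t|^2 - |\hat{V}_t|^2$$

$$\hat{Y}_t = \big(|Z_t|^2 - |\hat{V}_t|^2\big)^{1/2}\,e^{j\angle Z_t}$$

$$\hat{Y}_t = H_t \cdot Z_t, \qquad H_t = \left(\frac{|Z_t|^2 - |\hat{V}_t|^2}{|Z_t|^2}\right)^{1/2}$$

The noise estimate is updated as a running average over $n$ frames:

$$|\hat{V}_t|^2 = \Big(1 - \frac{1}{n}\Big)\,|\hat{V}_{t,\text{old}}|^2 + \frac{1}{n}\,|Z_t|^2$$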
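Spectral subtraction can be sketched for a single frame as follows: the estimated noise power is subtracted from the noisy power in each bin, and the phase of the noisy spectrum is reused. This is a minimal sketch in plain Python; the function name `spectral_subtract` and the flooring of negative differences at zero (half-wave rectification) are assumptions made for illustration.

```python
import cmath
import math


def spectral_subtract(Z, noise_power):
    """Return the enhanced spectrum for one frame.

    Z           -- list of complex noisy spectrum values, one per bin
    noise_power -- list of estimated noise powers |V|^2, one per bin
    """
    enhanced = []
    for z, nv in zip(Z, noise_power):
        # Subtract the noise power; floor at zero (assumption) so the
        # square root stays real.
        clean_power = max(abs(z) ** 2 - nv, 0.0)
        # Keep the phase of the noisy bin.
        enhanced.append(math.sqrt(clean_power) * cmath.exp(1j * cmath.phase(z)))
    return enhanced


# Hypothetical two-bin frame.
Y_hat = spectral_subtract([3 + 4j, 1 + 0j], [9.0, 4.0])
```

In the first bin the noisy power 25 drops to 16, so the magnitude goes from 5 to 4 with unchanged phase; in the second bin the noise estimate exceeds the observed power and the bin is zeroed.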

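With $y = \{y_t,\ t = 0, 1, \ldots, T-1\}$ and $z = \{z_t,\ t = 0, 1, \ldots, T-1\}$:

1) Maximum Likelihood (ML): $P(\text{Observations} \mid \text{Parameters})$, i.e. $P(z \mid y)$

2) Maximum A Posteriori (MAP): $P(\text{Parameters} \mid \text{Observations})$, i.e. $P(y \mid z)$

3) MMSE: $E\{\text{Parameters} \mid \text{Observations}\}$, i.e. $E\{y \mid z\}$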

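MAP Speech Enhancement

$$y = \{y_t,\ t = 0, 1, \ldots, T-1\}, \quad y_t \in R^K$$

$$v = \{v_t,\ t = 0, 1, \ldots, T-1\}, \quad v_t \in R^K$$

$$z = \{z_t,\ t = 0, 1, \ldots, T-1\}, \quad z_t \in R^K$$

$$s = \{s_t,\ t = 0, 1, \ldots, T-1\}, \quad s_t \in \{1, \ldots, M\}$$

$$m = \{m_t,\ t = 0, 1, \ldots, T-1\}, \quad m_t \in \{1, \ldots, L\}$$

The MAP estimate maximizes the log posterior, either over $y$ alone or jointly over the state sequence $s$, the mixture sequence $m$ and $y$:

$$\max_{y}\ \ln P_{\lambda_y,\lambda_v}(y \mid z), \qquad \max_{s,m,y}\ \ln P_{\lambda_y,\lambda_v}(s, m, y \mid z)$$

where $\lambda_y$ and $\lambda_v$ are the speech and noise model parameters.

The maximization is iterative. With the estimate at iteration $k$ written as

$$y(k) = \{y_t(k),\ t = 0, 1, \ldots, T-1\}, \quad y_t(k) \in R^K$$

the weight sequence is

$$q(y(k)) = \{q_t(s, m \mid y(k)),\ t = 0, 1, \ldots, T-1\}, \qquad q_t(s, m \mid y(k)) = P(s_t = s,\ m_t = m \mid y(k))$$

and the next estimate is obtained by filtering the noisy signal with a filter $H_t$ that depends on these weights:

$$y_t(k+1) = H_t\big(q(y(k))\big)\,z_t, \qquad t = 0, 1, \ldots, T-1$$

Each iteration does not decrease the posterior:

$$P(y(k+1) \mid z) \ge P(y(k) \mid z)$$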

MMSE Speech Enhancement

We try to optimize the function:

$$\hat{g}(y_t) = E\{g(y_t) \mid z_0^t\}$$

$g(\cdot)$ is a function on $R^K$ and

$$z_0^t = \{z_0, z_1, \ldots, z_t\}$$

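$$\hat{g}(y_t) = \sum_{s_t=1}^{M}\sum_{m_t=1}^{L}\sum_{n_t=1}^{N}\sum_{p_t=1}^{P} W_t(s_t, m_t, n_t, p_t \mid z_0^t)\;E\{g(y_t) \mid z_t, s_t, m_t, n_t, p_t\} \qquad \text{(Eqn1)}$$

$$W_t(s_t, m_t, n_t, p_t \mid z_0^t) = P(s_t, m_t, n_t, p_t \mid z_0^t) = \frac{G_t(s_t, m_t, n_t, p_t, z_0^t)}{\sum_{s_t=1}^{M}\sum_{m_t=1}^{L}\sum_{n_t=1}^{N}\sum_{p_t=1}^{P} G_t(s_t, m_t, n_t, p_t, z_0^t)}$$

Here $(s_t, m_t)$ index the state and mixture of the speech model and $(n_t, p_t)$ the state and mixture of the noise model.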

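$$G_t(s_t, m_t, n_t, p_t, z_0^t) = p(s_t, m_t, n_t, p_t, z_0^t)$$

$$s_t = 1, \ldots, M; \quad m_t = 1, \ldots, L; \quad n_t = 1, \ldots, N; \quad p_t = 1, \ldots, P$$

For $t = 0$:

$$G_0(s_0, m_0, n_0, p_0, z_0) = a_{s_0}\,c_{m_0|s_0}\,a_{n_0}\,c_{p_0|n_0}\,b(z_0 \mid s_0, m_0, n_0, p_0)$$

where $a$ denotes state probabilities, $c$ mixture weights, and $b$ the density of the noisy signal given the states and mixtures.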

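$$b(z_t \mid s_t, m_t, n_t, p_t) = \frac{1}{(2\pi)^{K/2}\,\det^{1/2}\big(\Sigma_{y|s_t,m_t} + \Sigma_{v|n_t,p_t}\big)}\,\exp\Big\{-\frac{1}{2}\,z_t^{T}\big(\Sigma_{y|s_t,m_t} + \Sigma_{v|n_t,p_t}\big)^{-1} z_t\Big\}$$

$$E\{g(y_t) \mid z_t, s_t, m_t, n_t, p_t\} = \int g(y_t)\,p(y_t \mid z_t, s_t, m_t, n_t, p_t)\,dy_t$$

The computation of Eqn1 is generally difficult. For some specific functions, Eqn1 has been derived. For instance, when $g(\cdot)$ is defined to be:

$$g(y_t) = Y_t(k), \qquad k = 0, 1, \ldots, K-1$$

where $Y_t(k)$ is the $k$th coefficient of the DFT of $y_t$, Eqn1 is equivalent to the popular Wiener filter.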

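Recursive Formula For G:

$$G_t(s_t, m_t, n_t, p_t, z_0^t) = \sum_{s_{t-1}=1}^{M}\sum_{m_{t-1}=1}^{L}\sum_{n_{t-1}=1}^{N}\sum_{p_{t-1}=1}^{P} G_{t-1}(s_{t-1}, m_{t-1}, n_{t-1}, p_{t-1}, z_0^{t-1})\;a_{s_{t-1}s_t}\,a_{n_{t-1}n_t}\,c_{m_t|s_t}\,c_{p_t|n_t}\,b(z_t \mid s_t, m_t, n_t, p_t)$$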
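The recursion for G has the shape of an HMM forward pass over the combined speech and noise states and mixtures. The sketch below shows one time step in plain Python, under the assumption that G is stored as a dictionary keyed by `(s, m, n, p)` with zero-based indices; the names `forward_step`, `posterior_weights` and all numeric values are hypothetical.

```python
from itertools import product


def forward_step(G_prev, a_y, a_v, c_y, c_v, b_t):
    """One step of the G recursion.

    G_prev -- dict {(s, m, n, p): G_{t-1}}
    a_y    -- speech transition matrix, a_y[s_prev][s]
    a_v    -- noise transition matrix, a_v[n_prev][n]
    c_y    -- speech mixture weights, c_y[s][m]
    c_v    -- noise mixture weights, c_v[n][p]
    b_t    -- dict {(s, m, n, p): b(z_t | s, m, n, p)}
    """
    M, L = len(c_y), len(c_y[0])
    N, P = len(c_v), len(c_v[0])
    # The transition terms depend only on the previous states, so the
    # previous mixtures can be summed out first.
    prev = [[0.0] * N for _ in range(M)]
    for (s0, _m0, n0, _p0), g in G_prev.items():
        prev[s0][n0] += g
    G = {}
    for s, m, n, p in product(range(M), range(L), range(N), range(P)):
        reach = sum(prev[s0][n0] * a_y[s0][s] * a_v[n0][n]
                    for s0 in range(M) for n0 in range(N))
        G[(s, m, n, p)] = reach * c_y[s][m] * c_v[n][p] * b_t[(s, m, n, p)]
    return G


def posterior_weights(G):
    """Normalize G over all (s, m, n, p) to obtain the weights W."""
    total = sum(G.values())
    return {key: value / total for key, value in G.items()}


# Hypothetical model: 2 speech states, 1 mixture, 1 noise state, 1 mixture.
G_prev = {(0, 0, 0, 0): 0.5, (1, 0, 0, 0): 0.5}
a_y = [[0.9, 0.1], [0.2, 0.8]]
a_v = [[1.0]]
c_y = [[1.0], [1.0]]
c_v = [[1.0]]
b_t = {(0, 0, 0, 0): 0.2, (1, 0, 0, 0): 0.4}
G_t = forward_step(G_prev, a_y, a_v, c_y, c_v, b_t)
W_t = posterior_weights(G_t)
```

In practice the values of G shrink quickly with t, so an implementation would normally rescale G at every frame or work in the log domain; the sketch omits this for brevity.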

Automatic Noise Type Selection

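Nonstationary State HMM

For state $i$ and mixture $m$:

$$y_t = g_t + N_t, \qquad y_t \in R^K$$

$g_t$: Deterministic function

$N_t$: Stationary residual (assumed to be an iid zero-mean Gaussian source)

$\Sigma_{i,m}$: Covariance of $N_t$

NS-HMM parameters: the transition probabilities $a_{ij}$, the mixture weights $c_{i,m}$, the parameters of $g_t$, and $\Sigma_{i,m}$, for $i = 1, 2, \ldots, M$ and $m = 1, 2, \ldots, L$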

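Nonstationary-State HMM

For example, if the deterministic function is assumed to be polynomial,

$$y_t = \sum_{r=0}^{R} B_{i,m}(r)\,h_r(t - \tau_i) + N_t, \qquad i = 1, 2, \ldots, M, \quad m = 1, 2, \ldots, L$$

$h_r(\cdot)$: an $r$th order polynomial (usually orthogonal)

$\tau_i$: The starting time to visit the $i$th state

$$b_{j,m}(y_t \mid d) = \frac{1}{(2\pi)^{K/2}\,|\Sigma_{j,m}|^{1/2}}\,\exp\Big\{-\frac{1}{2}\Big(y_t - \sum_{r=0}^{R} B_{j,m}(r)\,h_r(d)\Big)^{T} \Sigma_{j,m}^{-1} \Big(y_t - \sum_{r=0}^{R} B_{j,m}(r)\,h_r(d)\Big)\Big\}$$

where $d = t - \tau_j$ is the time already spent in state $j$.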
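To make the polynomial deterministic function concrete, the sketch below evaluates the trend of one state and mixture as a function of the time spent in the state, and the log density of an observation around that trend. It is a minimal sketch in plain Python for scalar observations (K = 1); using plain monomials d^r for the polynomials is an assumption made for brevity (the slides note that orthogonal polynomials are usual), and the names and coefficient values are hypothetical.

```python
import math


def trend_value(B, d):
    """Evaluate sum_r B[r] * d**r, the trend after d frames in the state."""
    return sum(b * d ** r for r, b in enumerate(B))


def log_density(y, B, d, var):
    """Log of a scalar Gaussian density of y around the trend, variance var."""
    resid = y - trend_value(B, d)
    return -0.5 * (math.log(2.0 * math.pi * var) + resid * resid / var)


B = [1.0, 2.0, 0.5]            # hypothetical polynomial coefficients, R = 2
mean_at_2 = trend_value(B, 2)  # 1 + 2*2 + 0.5*4
score = log_density(7.0, B, 2, 1.0)
```

Because the mean depends on d, the same observation is scored differently depending on how long the model has stayed in the state, which is what distinguishes this model from a stationary-state HMM.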

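Segmentation Algorithm in NS-HMM

$s_0, s_1, \ldots, s_{T-1}$: state sequence

$y_0, y_1, \ldots, y_{T-1}$: observation sequence

$d_0, d_1, \ldots, d_{T-1}$: duration sequence

$$\delta_t(j, m, d) = \max_{s_0, s_1, \ldots, s_{t-1}} p(s_0, s_1, \ldots, s_t = j,\ m_t = m,\ d_t = d,\ y_0, y_1, \ldots, y_t)$$

$$\psi_t(j, m, d) = \arg\max_{s_0, s_1, \ldots, s_{t-1}} p(s_0, s_1, \ldots, s_t = j,\ m_t = m,\ d_t = d,\ y_0, y_1, \ldots, y_t)$$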

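1- Initialization:

$$\delta_0(j, m, 0) = \pi_j\,c_{j,m}\,b_{j,m}(y_0 \mid 0), \qquad 1 \le j \le M, \quad 1 \le m \le L$$

2- Recursion for $d = 0$ (entering a new Markov state):

$$\delta_t(j, m, 0) = \max_{i \ne j}\ \max_{1 \le m' \le L}\ \max_{0 \le d' < t}\ \big[\delta_{t-1}(i, m', d')\,a_{ij}\big]\;c_{j,m}\,b_{j,m}(y_t \mid 0)$$

$$\psi_t(j, m, 0) = \arg\max_{i \ne j,\ m',\ d'}\ \big[\delta_{t-1}(i, m', d')\,a_{ij}\big]$$

for $0 < t \le T-1$, $1 \le j \le M$, $1 \le m \le L$.

3- Recursion step for $d > 0$ (self looping):

$$\delta_t(j, m, d) = \delta_{t-1}(j, m, d-1)\,a_{jj}\,c_{j,m}\,b_{j,m}(y_t \mid d)$$

$$\psi_t(j, m, d) = (j, m, d-1)$$

for $0 < t \le T-1$, $1 \le j \le M$, $1 \le m \le L$, $0 < d \le t$ (assuming the mixture is not changed within a state).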

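4- Termination:

$$P^* = \max_{1 \le i \le M}\ \max_{1 \le m \le L}\ \max_{0 \le d \le T-1}\ \delta_{T-1}(i, m, d)$$

$$(s^*_{T-1}, m^*_{T-1}, d^*_{T-1}) = \arg\max_{1 \le i \le M}\ \max_{1 \le m \le L}\ \max_{0 \le d \le T-1}\ \delta_{T-1}(i, m, d)$$

5- Backtracking:

$$(s^*_t, m^*_t, d^*_t) = \psi_{t+1}(s^*_{t+1}, m^*_{t+1}, d^*_{t+1}), \qquad \text{for } t = T-2, T-3, \ldots, 0$$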
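The five steps of the segmentation algorithm can be sketched as a Viterbi search over (state, mixture, duration) triples. The code below is a minimal sketch in plain Python, working in the log domain for scalar observations; the function names, the `mean_fn(j, m, d)` callback that supplies the trend value, and the toy model are all hypothetical, and all probabilities are assumed strictly positive so that the logarithms exist.

```python
import math


def log_gauss(y, mean, var):
    """Log of a scalar Gaussian density."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)


def segment(obs, pi, a, c, mean_fn, var):
    """Return the best sequence of (state, mixture, duration) triples."""
    T, M, L = len(obs), len(pi), len(c[0])
    delta = [dict() for _ in range(T)]
    psi = [dict() for _ in range(T)]

    def emit(t, j, m, d):
        # Mixture weight times duration-dependent output density (in logs).
        return math.log(c[j][m]) + log_gauss(obs[t], mean_fn(j, m, d), var[j][m])

    # 1- Initialization: every path starts with duration 0.
    for j in range(M):
        for m in range(L):
            delta[0][(j, m, 0)] = math.log(pi[j]) + emit(0, j, m, 0)

    for t in range(1, T):
        for j in range(M):
            for m in range(L):
                # 2- d = 0: entering state j from a different state.
                cands = [(score + math.log(a[k[0]][j]), k)
                         for k, score in delta[t - 1].items() if k[0] != j]
                if cands:
                    best, arg = max(cands)
                    delta[t][(j, m, 0)] = best + emit(t, j, m, 0)
                    psi[t][(j, m, 0)] = arg
                # 3- d > 0: self loop, mixture unchanged within the state.
                for d in range(1, t + 1):
                    prev = (j, m, d - 1)
                    if prev in delta[t - 1]:
                        delta[t][(j, m, d)] = (delta[t - 1][prev]
                                               + math.log(a[j][j])
                                               + emit(t, j, m, d))
                        psi[t][(j, m, d)] = prev

    # 4- Termination: best final triple.
    last = max(delta[T - 1], key=delta[T - 1].get)
    # 5- Backtracking.
    path = [last]
    for t in range(T - 1, 0, -1):
        last = psi[t][last]
        path.append(last)
    path.reverse()
    return path


# Toy model: two states with constant trends 0 and 5, one mixture each.
path = segment(
    obs=[0.1, -0.2, 5.1, 4.9],
    pi=[0.5, 0.5],
    a=[[0.5, 0.5], [0.5, 0.5]],
    c=[[1.0], [1.0]],
    mean_fn=lambda j, m, d: [0.0, 5.0][j],
    var=[[1.0], [1.0]],
)
```

For this toy input the search stays in the first state for two frames and then enters the second state, so the duration counter restarts at zero at the third frame.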

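Now we generalize the MMSE formulae for NS-HMM:

$$E\{g(y_t) \mid z_0^t\} = \sum_{s_t=1}^{M}\sum_{m_t=1}^{L}\sum_{n_t=1}^{N}\sum_{p_t=1}^{P}\sum_{d_t=0}^{t} W_t(s_t, m_t, n_t, p_t, d_t \mid z_0^t)\;E\{g(y_t) \mid z_t, s_t, m_t, n_t, p_t, d_t\}$$

$$W_t(s_t, m_t, n_t, p_t, d_t \mid z_0^t) = \frac{G_t(s_t, m_t, n_t, p_t, d_t, z_0^t)}{\sum_{s_t=1}^{M}\sum_{m_t=1}^{L}\sum_{n_t=1}^{N}\sum_{p_t=1}^{P}\sum_{d_t=0}^{t} G_t(s_t, m_t, n_t, p_t, d_t, z_0^t)}$$

$$E\{g(y_t) \mid z_t, s_t, m_t, n_t, p_t, d_t\} = \int g(y_t)\,f(y_t \mid z_t, s_t, m_t, n_t, p_t, d_t)\,dy_t$$

for $1 \le s_t \le M$, $1 \le m_t \le L$, $1 \le n_t \le N$, $1 \le p_t \le P$, $0 \le d_t \le t$.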

For the calculation of $E\{g(y_t) \mid \ldots\}$, $g(y_t)$ has to be specified. It has been shown that for:

$$g(y_t) = Y_t(k), \qquad k = 0, \ldots, K-1$$

($Y_t(k)$: the $k$th component of the DFT of $y_t$)

the computation cost is less than for other functions.

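A linear estimation using the MMSE criterion has shown that the conditional density of the $k$th component of $g$ is Gaussian, i.e.,

$$f\big(Y_t(k) \mid z_t, s_t, m_t, n_t, p_t, d_t\big)$$

is Gaussian with mean:

$$\tilde{Y}_t(k) = H_{s_t,m_t,n_t,p_t,d_t}(k)\,Z_t(k)$$

where $Z_t(k)$ is the $k$th component of the DFT of $z_t$ and $H_{s_t,m_t,n_t,p_t,d_t}(k)$ is the $k$th component of the Wiener filter for the corresponding state, mixture and duration of speech and noise. Therefore:

$$E\{g_k \mid z_t, s_t, m_t, n_t, p_t, d_t\} = \int Y_t(k)\,f\big(Y_t(k) \mid z_t, s_t, m_t, n_t, p_t, d_t\big)\,dY_t(k) = H_{s_t,m_t,n_t,p_t,d_t}(k)\,Z_t(k)$$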
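Since each conditional mean is a Wiener filter applied to the noisy spectrum, the MMSE estimate reduces to a weighted sum of Wiener filters. The sketch below illustrates this in plain Python; the flat list of weights and filters (one entry per combination of state, mixture and duration), the function name `mmse_spectrum`, and the numbers are hypothetical.

```python
def mmse_spectrum(weights, filters, Z):
    """Combine per-hypothesis Wiener filters into one enhanced spectrum.

    weights -- posterior weights W, one per hypothesis, summing to 1
    filters -- per-hypothesis Wiener gains, filters[i][k] for bin k
    Z       -- noisy spectrum, one complex value per bin
    """
    enhanced = []
    for k in range(len(Z)):
        # Posterior-weighted average gain for bin k.
        gain = sum(w * H[k] for w, H in zip(weights, filters))
        enhanced.append(gain * Z[k])
    return enhanced


# Two hypothetical hypotheses and a two-bin noisy spectrum.
Y_hat = mmse_spectrum(
    weights=[0.25, 0.75],
    filters=[[1.0, 0.0], [0.5, 0.5]],
    Z=[2 + 0j, 4j],
)
```

The weights change from frame to frame, so the effective filter adapts to whichever speech and noise hypotheses currently explain the observations best.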

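Recursive calculation of G, with duration constraints:

For entering a new state:

$$G_t(s_t, m_t, n_t, p_t, 0, z_0^t) = \sum_{s_{t-1}=1}^{M}\sum_{m_{t-1}=1}^{L}\sum_{n_{t-1}=1}^{N}\sum_{p_{t-1}=1}^{P}\sum_{d_{t-1}=0}^{t-1} G_{t-1}(s_{t-1}, m_{t-1}, n_{t-1}, p_{t-1}, d_{t-1}, z_0^{t-1})\;a_{s_{t-1}s_t}\,a_{n_{t-1}n_t}\,c_{m_t|s_t}\,c_{p_t|n_t}\,b(z_t \mid s_t, m_t, n_t, p_t, 0)$$

For staying in the old state:

$$G_t(s_t, m_t, n_t, p_t, d, z_0^t) = \sum_{n_{t-1}=1}^{N}\sum_{p_{t-1}=1}^{P} G_{t-1}(s_t, m_t, n_{t-1}, p_{t-1}, d-1, z_0^{t-1})\;a_{s_t s_t}\,a_{n_{t-1}n_t}\,c_{m_t|s_t}\,c_{p_t|n_t}\,b(z_t \mid s_t, m_t, n_t, p_t, d), \qquad 1 \le d \le t$$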
