speech enhancement


Upload: octavia-stevens

Post on 31-Dec-2015


DESCRIPTION

Speech Enhancement. Wiener filtering: a linear estimate of the clean signal from the noisy signal, using the MMSE criterion. With y and v zero-mean, the result is the time-domain Wiener filter.

TRANSCRIPT

Page 1: Speech Enhancement

Speech Enhancement

Page 2: Speech Enhancement

Page 3: Speech Enhancement

Page 4: Speech Enhancement

Page 5: Speech Enhancement

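Wiener Filtering: a linear estimate of the clean signal from the noisy signal, using the MMSE criterion.

$y_t$: clean speech ($K \times 1$)
$v_t$: noise ($K \times 1$)
$z_t$: noisy speech ($K \times 1$)

For additive noise:

$$z_t = y_t + v_t$$

$$\hat{y}_t = E\{y_t \mid z_t\} = a\, z_t = a\,(y_t + v_t)$$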

Page 6: Speech Enhancement

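Projection Theorem: the mean square error

$$E\{\lVert y_t - \hat{y}_t \rVert^2\}$$

is minimum if $a$ is selected such that the error $y_t - \hat{y}_t$ is orthogonal to the noisy signal, i.e.:

$$E\{(y_t - \hat{y}_t)\, z_t^H\} = 0$$

$$E\{(y_t - a\,(y_t + v_t))\,(y_t + v_t)^H\} = 0$$

$$E\{y_t\,(y_t + v_t)^H\} = a\, E\{(y_t + v_t)(y_t + v_t)^H\}$$

$H$: Hermitian transposition.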

Page 7: Speech Enhancement

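Assuming $y_t$ and $v_t$ to be zero-mean and uncorrelated, i.e.,

$$E\{y_t v_t^H\} = E\{v_t y_t^H\} = 0$$

we then have:

$$E\{y_t y_t^H\} = a\, E\{y_t y_t^H + v_t v_t^H\}$$

$$a = E\{y_t y_t^H\}\,\left[E\{y_t y_t^H + v_t v_t^H\}\right]^{-1}$$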

Page 8: Speech Enhancement

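Since y and v are zero mean, their correlation matrices are their covariance matrices:

$$\Sigma_y = E\{y_t y_t^H\}, \qquad \Sigma_v = E\{v_t v_t^H\}$$

$$a = \Sigma_y\,(\Sigma_y + \Sigma_v)^{-1}$$

This is called the time-domain Wiener filter.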
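As a rough illustration, the time-domain Wiener filter (clean-speech covariance times the inverse of the summed speech and noise covariances) can be sketched for K = 2 as below. The helper names (`matmul`, `inv2`, `wiener_matrix`) and all numbers are assumptions made for this example, and the covariances are taken as known.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inv2(m):
    """Invert a 2x2 matrix (enough for this K = 2 illustration)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def wiener_matrix(cov_y, cov_v):
    """Return a = cov_y (cov_y + cov_v)^-1 for 2x2 covariance matrices."""
    total = [[cov_y[i][j] + cov_v[i][j] for j in range(2)] for i in range(2)]
    return matmul(cov_y, inv2(total))


# Hypothetical covariances, chosen only for illustration.
cov_y = [[2.0, 0.5], [0.5, 1.0]]   # clean speech
cov_v = [[0.5, 0.0], [0.0, 0.5]]   # white noise
a = wiener_matrix(cov_y, cov_v)

z = [[1.0], [-1.0]]                # one noisy frame as a column vector
y_hat = matmul(a, z)               # enhanced frame: y_hat = a z
```

Multiplying `a` by the noisy-signal covariance should give back `cov_y`, which is the orthogonality condition of the projection theorem in matrix form.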

Page 9: Speech Enhancement

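We are looking for a frequency-domain Wiener filter, called the non-causal Wiener filter, such that:

$$\hat{y}(t) = z(t) * h(t) = \int_{-\infty}^{\infty} h(\tau)\, z(t - \tau)\, d\tau$$

According to the projection theorem, for the error

$$E\{[y(t) - \hat{y}(t)]^2\}$$

to be minimum, the difference $y(t) - \hat{y}(t)$ has to be orthogonal to the noisy input:

$$E\{[y(t) - \hat{y}(t)]\, z(t - \lambda)\} = 0 \quad \text{for all } \lambda$$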

Page 10: Speech Enhancement

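$$E\left\{\left[y(t) - \int_{-\infty}^{\infty} h(\tau)\, z(t - \tau)\, d\tau\right] z(t - \lambda)\right\} = 0$$

or

$$E\{y(t)\, z(t - \lambda)\} = \int_{-\infty}^{\infty} h(\tau)\, E\{z(t - \tau)\, z(t - \lambda)\}\, d\tau$$

or

$$R_{yz}(\lambda) = \int_{-\infty}^{\infty} h(\tau)\, R_{zz}(\lambda - \tau)\, d\tau \quad \text{for all } \lambda$$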

Page 11: Speech Enhancement

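$$R_{yz}(\lambda) = h(\lambda) * R_{zz}(\lambda) \quad \text{(convolution)}$$

$$S_{yz}(\omega) = H(j\omega)\, S_{zz}(\omega)$$

$$H(j\omega) = \frac{S_{yz}(\omega)}{S_{zz}(\omega)}$$

$S_{zz}$: spectrum of $z(t)$
$S_{yz}$: cross spectrum between $y(t)$ and $z(t)$

$$S_{yz} = S_{yy}$$

since

$$R_{yz} = E\{y_t\,(y_t + v_t)\} = E\{y_t y_t\} + E\{y_t v_t\} = R_{yy}$$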

Page 12: Speech Enhancement

Popular form of the Wiener filter:

$$S_{zz} = S_{yy} + S_{vv}$$

$$H(j\omega) = \frac{S_{yy}}{S_{yy} + S_{vv}}$$

$$H(j\omega) = \frac{S_{zz} - S_{vv}}{S_{zz}}$$
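The two equivalent forms of the frequency-domain Wiener gain can be checked with a small sketch. It assumes the power spectra are already available per frequency bin as plain lists; the function names and the flooring at zero in the second form are choices made for this example, not something stated on the slides.

```python
def wiener_gain(s_yy, s_vv):
    """H = S_yy / (S_yy + S_vv), evaluated per frequency bin."""
    return [sy / (sy + sv) if sy + sv > 0 else 0.0
            for sy, sv in zip(s_yy, s_vv)]


def wiener_gain_from_noisy(s_zz, s_vv):
    """H = (S_zz - S_vv) / S_zz, floored at 0 when the noise estimate
    exceeds the noisy power (an assumption of this sketch)."""
    return [max(sz - sv, 0.0) / sz if sz > 0 else 0.0
            for sz, sv in zip(s_zz, s_vv)]


# Hypothetical power spectra over three bins.
s_yy = [4.0, 1.0, 0.0]                        # clean speech
s_vv = [1.0, 1.0, 1.0]                        # noise
s_zz = [sy + sv for sy, sv in zip(s_yy, s_vv)]

gain_a = wiener_gain(s_yy, s_vv)
gain_b = wiener_gain_from_noisy(s_zz, s_vv)
```

The second form is the one usually applied in practice, because only the noisy spectrum and a noise estimate are observable.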

Page 13: Speech Enhancement


Page 14: Speech Enhancement

Spectral Subtraction

$$z_t = y_t + v_t$$

$$Z_t = Y_t + V_t$$

(capital letters denote the spectra of the corresponding frames)

Page 15: Speech Enhancement


Page 16: Speech Enhancement

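$$|\hat{V}_t|^2 = n\,|\hat{V}_{t,\mathrm{old}}|^2 + (1 - n)\,|Z_t|^2$$

$$|\hat{Y}_t|^2 = |Z_t|^2 - |\hat{V}_t|^2$$

$$\hat{Y}_t = \left(|Z_t|^2 - |\hat{V}_t|^2\right)^{1/2} e^{j \angle Z_t}$$

$$H_t = \left(\frac{|Z_t|^2 - |\hat{V}_t|^2}{|Z_t|^2}\right)^{1/2}$$

$$\hat{Y}_t = H_t\, Z_t$$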
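A minimal sketch of power spectral subtraction on one frame: the noise power is subtracted from the noisy power, the result is floored at zero, and the noisy phase is reused. The recursive noise update with smoothing factor `n` and both function names are assumptions of this example; in practice the update would only be run on frames judged to contain no speech.

```python
import cmath
import math


def update_noise_psd(noise_psd_old, z_frame, n=0.9):
    """Smooth the noise power estimate: n * old + (1 - n) * |Z|^2."""
    return [n * v_old + (1.0 - n) * abs(z) ** 2
            for v_old, z in zip(noise_psd_old, z_frame)]


def spectral_subtract(z_frame, noise_psd):
    """Return sqrt(max(|Z|^2 - |V|^2, 0)) * exp(j * phase(Z)) per bin."""
    out = []
    for z, v in zip(z_frame, noise_psd):
        power = max(abs(z) ** 2 - v, 0.0)      # floor negative power at zero
        out.append(math.sqrt(power) * cmath.exp(1j * cmath.phase(z)))
    return out


# Hypothetical two-bin spectrum of a noisy frame and a noise power estimate.
z_frame = [3 + 4j, 1 + 0j]
noise_psd = [9.0, 4.0]
y_frame = spectral_subtract(z_frame, noise_psd)
```

In the first bin the magnitude drops from 5 to 4 while the phase is unchanged; in the second bin the noise estimate exceeds the observed power, so the output is zero.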

Page 17: Speech Enhancement

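$$z = \{z_t,\ t = 0, 1, \dots, T-1\}, \qquad y = \{y_t,\ t = 0, 1, \dots, T-1\}$$

1) ML (Maximum Likelihood): $P(\text{Observations} \mid \text{Parameters})$, i.e. $P(z \mid y)$
2) MAP (Maximum a Posteriori): $P(\text{Parameters} \mid \text{Observations})$, i.e. $P(y \mid z)$
3) MMSE: $E(\text{Parameters} \mid \text{Observations})$, i.e. $E(y \mid z)$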
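To make the three criteria concrete, the sketch below evaluates them numerically for a scalar toy model that is an assumption of this example, not part of the slides: z = y + v with y and v independent zero-mean Gaussians of known variance. In this case the ML estimate equals z, while the MAP and MMSE estimates coincide and shrink z toward zero.

```python
import math


def gauss(x, var):
    """Zero-mean Gaussian density with variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def estimates(z, var_y, var_v, lo=-10.0, hi=10.0, steps=20001):
    """Grid-based ML, MAP and MMSE estimates of y from z = y + v."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    likelihood = [gauss(z - y, var_v) for y in grid]                     # P(z | y)
    posterior = [l * gauss(y, var_y) for l, y in zip(likelihood, grid)]  # unnormalised P(y | z)
    ml = grid[max(range(steps), key=likelihood.__getitem__)]
    map_ = grid[max(range(steps), key=posterior.__getitem__)]
    mmse = sum(y * p for y, p in zip(grid, posterior)) / sum(posterior)  # E(y | z)
    return ml, map_, mmse


ml, map_, mmse = estimates(z=2.0, var_y=3.0, var_v=1.0)
```

With these variances the closed-form MAP and MMSE estimate is 0.75 z, which the grid search reproduces to within the grid spacing.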

Page 18: Speech Enhancement


Page 19: Speech Enhancement

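MAP Speech Enhancement

$$y = \{y_t,\ t = 0, 1, \dots, T-1\}, \quad y_t \in R^K$$

$$v = \{v_t,\ t = 0, 1, \dots, T-1\}, \quad v_t \in R^K$$

$$z = \{z_t,\ t = 0, 1, \dots, T-1\}, \quad z_t \in R^K$$

$$s = \{s_t,\ t = 0, 1, \dots, T-1\}, \quad s_t \in \{1, \dots, M\}$$

$$m = \{m_t,\ t = 0, 1, \dots, T-1\}, \quad m_t \in \{1, \dots, L\}$$

Weight sequence:

$$q(y(k)) = \{q_t(y(k)),\ t = 0, \dots, T-1\}$$

Criteria:

$$\max_{y} \ln P_{\lambda_y, \lambda_v}(y \mid z), \qquad \max_{s, m, y} \ln P_{\lambda_y, \lambda_v}(s, m, y \mid z)$$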

Page 20: Speech Enhancement


,

,

1

1,

,

,

,

0.,1

ln1ln

1

1,,ln

,,1

,1,,1,0,

lnmax

v

ttt

ms

ktt

y

H

TtzHkyqky

zkyPzkyP

ky

zkymsP

kymsPky

RkyTtkyky

zyP

vy

vy

vy

Page 21: Speech Enhancement

Page 22: Speech Enhancement

zymsp

zymsp

vyyms

vyyms

,,lnmax

,,,lnmax

,,

,,

zymsp vy

ms,,,lnmax

,

Page 23: Speech Enhancement

MMSE Speech Enhancement

We try to optimize the function:

$$\hat{g}(y_t) = E\{g(y_t) \mid z_0^t\}$$

where $g(\cdot)$ is a function on $R^K$ and

$$z_0^t = \{z_0, z_1, \dots, z_t\}$$

Page 24: Speech Enhancement

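$$\hat{g}(y_t) = \sum_{s_t=1}^{M} \sum_{m_t=1}^{L} \sum_{n_t=1}^{N} \sum_{p_t=1}^{P} W_t(s_t, m_t, n_t, p_t \mid z_0^t)\; E\{g(y_t) \mid s_t, m_t, n_t, p_t, z_t\}$$

$$W_t(s_t, m_t, n_t, p_t \mid z_0^t) = P(s_t, m_t, n_t, p_t \mid z_0^t) = \frac{G_t(s_t, m_t, n_t, p_t, z_0^t)}{\sum_{s_t=1}^{M} \sum_{m_t=1}^{L} \sum_{n_t=1}^{N} \sum_{p_t=1}^{P} G_t(s_t, m_t, n_t, p_t, z_0^t)}$$

Here $s_t$ and $m_t$ index the speech state and mixture, and $n_t$ and $p_t$ the noise state and mixture.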

Page 25: Speech Enhancement


cpcncmcsczbcncp

ccn

csnacscm

ccs

tsa

t

tsts tmtm tntn tptp

tztG

zbcczG

,,,..1

..10

:10 :1

0 :10 :1

0

0,,,,

,,,0||0,,,,0

Page 26: Speech Enhancement

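$$b(z_t \mid s_t, m_t, n_t, p_t) = \frac{\exp\{-\tfrac{1}{2}\, z_t^{Tr}\, \Sigma_t^{-1}\, z_t\}}{(2\pi)^{K/2}\, \det^{1/2}(\Sigma_t)}, \qquad \Sigma_t = \Sigma_{y, s_t, m_t} + \Sigma_{v, n_t, p_t}$$

$$E\{g(y_t) \mid s_t, m_t, n_t, p_t, z_t\} = \int g(y_t)\, p(y_t \mid s_t, m_t, n_t, p_t, z_t)\, dy_t$$

The computation of Eqn1 is generally difficult. For some specific functions, Eqn1 has been derived. For instance, when g(.) is defined to be:

$$g(y_t) = Y_t(k), \quad k = 0, 1, \dots, K-1$$

where $Y_t(k)$ is the kth coefficient of the DFT of $y_t$, Eqn1 is equivalent to the popular Wiener filter.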
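When g(.) picks a DFT coefficient, each conditional expectation reduces to a Wiener filter, so the MMSE estimate becomes a posterior-weighted sum of Wiener-filtered spectra. The sketch below assumes the composite states (speech state and mixture, noise state and mixture) are flattened into a single list, and that the posterior weights and the per-state power spectra are given; the function name and all numbers are hypothetical.

```python
def mmse_dft_estimate(z_spec, weights, speech_psd, noise_psd):
    """Weighted sum of Wiener filters applied to the noisy spectrum.

    weights[c]    posterior probability of composite state c (sums to 1)
    speech_psd[c] speech power spectrum of state c, one value per bin
    noise_psd[c]  noise power spectrum of state c, one value per bin
    """
    est = [0j] * len(z_spec)
    for w, s_y, s_v in zip(weights, speech_psd, noise_psd):
        for k in range(len(z_spec)):
            h = s_y[k] / (s_y[k] + s_v[k])     # Wiener gain of this state
            est[k] += w * h * z_spec[k]
    return est


# Two hypothetical composite states, two frequency bins.
z_spec = [2 + 0j, 4j]
weights = [0.25, 0.75]
speech_psd = [[1.0, 3.0], [3.0, 1.0]]
noise_psd = [[1.0, 1.0], [1.0, 1.0]]
y_spec = mmse_dft_estimate(z_spec, weights, speech_psd, noise_psd)
```

The effective gain in each bin is the weight-averaged Wiener gain, so it moves smoothly between the state filters as the posterior weights change from frame to frame.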

Page 27: Speech Enhancement


Page 28: Speech Enhancement

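Recursive Formula For G:

$$G_t(s_t, m_t, n_t, p_t, z_0^t) = \sum_{s_{t-1}=1}^{M} \sum_{m_{t-1}=1}^{L} \sum_{n_{t-1}=1}^{N} \sum_{p_{t-1}=1}^{P} G_{t-1}(s_{t-1}, m_{t-1}, n_{t-1}, p_{t-1}, z_0^{t-1})\; a_{s_{t-1} s_t}\, a_{n_{t-1} n_t}\, c_{m_t | s_t}\, c_{p_t | n_t}\; b(z_t \mid s_t, m_t, n_t, p_t)$$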
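One step of such a forward recursion can be sketched as follows, under the assumption that the speech and noise chains have separate transition matrices and mixture weights, and that the density of the current noisy frame has already been evaluated for every composite state. The dictionary layout and the function names are illustrative choices, not the authors' implementation.

```python
from itertools import product


def forward_step(g_prev, a_y, a_v, c_y, c_v, b_t):
    """One step of the recursion for G_t(s, m, n, p, z_0^t).

    g_prev[(s, m, n, p)] holds G_{t-1}; a_y / a_v are the speech / noise
    transition matrices; c_y[s][m] and c_v[n][p] are mixture weights;
    b_t[(s, m, n, p)] is the density of the current frame z_t.
    """
    M, L, N, P = len(c_y), len(c_y[0]), len(c_v), len(c_v[0])
    # Summing G_{t-1} over the previous mixtures leaves a table over (s', n').
    prev = [[sum(g_prev[(i, m, j, p)] for m in range(L) for p in range(P))
             for j in range(N)] for i in range(M)]
    g = {}
    for s, m, n, p in product(range(M), range(L), range(N), range(P)):
        reach = sum(prev[i][j] * a_y[i][s] * a_v[j][n]
                    for i in range(M) for j in range(N))
        g[(s, m, n, p)] = reach * c_y[s][m] * c_v[n][p] * b_t[(s, m, n, p)]
    return g


def posterior_weights(g):
    """W_t = G_t / sum(G_t): posterior probability of each composite state."""
    total = sum(g.values())
    return {key: value / total for key, value in g.items()}


# Toy model: two speech states, one mixture each, a single noise state.
a_y = [[0.9, 0.1], [0.2, 0.8]]
a_v = [[1.0]]
c_y = [[1.0], [1.0]]
c_v = [[1.0]]
g_prev = {(0, 0, 0, 0): 0.5, (1, 0, 0, 0): 0.5}
b_t = {(0, 0, 0, 0): 1.0, (1, 0, 0, 0): 3.0}   # hypothetical frame densities
g_t = forward_step(g_prev, a_y, a_v, c_y, c_v, b_t)
w_t = posterior_weights(g_t)
```

For long utterances the values of G shrink toward zero, so a practical version would rescale G at every frame or work in the log domain.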

Page 29: Speech Enhancement

Page 30: Speech Enhancement

Page 31: Speech Enhancement

Page 32: Speech Enhancement

Page 33: Speech Enhancement

Page 34: Speech Enhancement

Page 35: Speech Enhancement

Page 36: Speech Enhancement

Page 37: Speech Enhancement

Page 38: Speech Enhancement

Page 39: Speech Enhancement

Page 40: Speech Enhancement

Automatic Noise Type Selection

Page 41: Speech Enhancement

Page 42: Speech Enhancement

Page 43: Speech Enhancement

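Nonstationary State HMM

$$y_t = g_{i,m}(t) + N_t$$

$g_{i,m}(t)$: deterministic function
$N_t$: stationary residual (assumed to be an iid zero-mean Gaussian source)

HMM-NS parameters include the transition probabilities $a$, the mixture weights $c$, the functions $g_{i,m}$ and the covariance of $N_t$, for $i = 1, 2, \dots, M$ and $m = 1, 2, \dots, L$.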

Page 44: Speech Enhancement

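Nonstationary-State HMM

For example, if the deterministic function is assumed to be polynomial:

$$y_t = \sum_{r=0}^{R} B_{i,m}(r)\, h_r(t - \tau_i) + N_t$$

$\tau_i$: the starting time of the visit to the ith state
$h_r(\cdot)$: an rth-order polynomial (usually orthogonal)

$$b_{j,m}(y_t \mid d) = \frac{1}{(2\pi)^{K/2}\, |\Sigma_{j,m}|^{1/2}} \exp\left\{-\frac{1}{2} \left(y_t - \sum_{r=0}^{R} B_{j,m}(r)\, h_r(d)\right)^{Tr} \Sigma_{j,m}^{-1} \left(y_t - \sum_{r=0}^{R} B_{j,m}(r)\, h_r(d)\right)\right\}$$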
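The duration-dependent output density can be sketched as a Gaussian whose mean follows a polynomial trend in the time spent in the state. Two simplifications are assumed here and are not from the slides: plain monomials d**r stand in for the (usually orthogonal) polynomials, and the residual covariance is diagonal. The names `trend_mean` and `log_b` are hypothetical.

```python
import math


def trend_mean(B, d):
    """Mean of y_t after d frames in the state: sum_r B[r] * d**r.

    B[r] is the K-dimensional coefficient vector of order r.
    """
    K = len(B[0])
    return [sum(B[r][k] * d ** r for r in range(len(B))) for k in range(K)]


def log_b(y, B, var, d):
    """Log density of frame y for a diagonal-covariance Gaussian residual."""
    mean = trend_mean(B, d)
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (yk - mk) ** 2 / v)
               for yk, mk, v in zip(y, mean, var))


# Hypothetical first-order trend (R = 1) for K = 2 features.
B = [[1.0, 0.0],     # order 0: offset
     [0.5, -1.0]]    # order 1: slope per frame
var = [1.0, 1.0]
mean_at_2 = trend_mean(B, 2)                 # [2.0, -2.0]
best_case = log_b([2.0, -2.0], B, var, 2)    # frame lying exactly on the trend
```

With R = 0 the trend is constant and the model falls back to an ordinary stationary-state Gaussian HMM output density.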

Page 45: Speech Enhancement

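Segmentation Algorithm in NS-HMM

$s_0, s_1, \dots, s_{T-1}$: state sequence
$y_0, y_1, \dots, y_{T-1}$: observation sequence
$d_0, d_1, \dots, d_{T-1}$: duration sequence

$$\delta_t(j, m, d) = \max_{s_0, s_1, \dots, s_{t-1}} p(s_0, s_1, \dots, s_{t-1}, s_t = j, m_t = m, d_t = d, y_0, y_1, \dots, y_t)$$

$$\psi_t(j, m, d) = \arg\max_{s_0, s_1, \dots, s_{t-1}} p(s_0, s_1, \dots, s_{t-1}, s_t = j, m_t = m, d_t = d, y_0, y_1, \dots, y_t)$$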

Page 46: Speech Enhancement

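Segmentation Algorithm in NS-HMM

1- Initialization:

$$\delta_0(j, m, 0) = \pi_j\, c_{m|j}\, b_{j,m}(y_0 \mid 0), \qquad \psi_0(j, m, 0) = 0, \qquad 1 \le j \le M,\ 1 \le m \le L$$

2- Recursion for d = 0 (entering a new Markov state):

$$\delta_t(j, m, 0) = \max_{i} \max_{\nu} \max_{\tau} \left[\delta_{t-1}(i, \nu, \tau)\, a_{ij}\right] c_{m|j}\, b_{j,m}(y_t \mid 0)$$

$$\psi_t(j, m, 0) = \arg\max_{i, \nu, \tau} \left[\delta_{t-1}(i, \nu, \tau)\, a_{ij}\right]$$

for $0 < t < T$, $1 \le j \le M$, $1 \le m \le L$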

Page 47: Speech Enhancement

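3- Recursion step for d > 0 (self looping):

$$\delta_t(j, m, d) = \delta_{t-1}(j, m, d-1)\, a_{jj}\, c_{m|j}\, b_{j,m}(y_t \mid d)$$

$$\psi_t(j, m, d) = (j, m, d-1)$$

for $0 < t < T$, $1 \le j \le M$, $1 \le m \le L$, $0 < d \le t$

(assuming the mixture is not changed within a state)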

Page 48: Speech Enhancement

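4- Termination:

$$p^* = \max_{1 \le i \le M}\ \max_{1 \le m \le L}\ \max_{0 \le d \le T-1} \delta_{T-1}(i, m, d)$$

$$(s^*_{T-1}, m^*_{T-1}, d^*_{T-1}) = \arg\max_{1 \le i \le M}\ \max_{1 \le m \le L}\ \max_{0 \le d \le T-1} \delta_{T-1}(i, m, d)$$

5- Backtracking:

$$(s^*_t, m^*_t, d^*_t) = \psi_{t+1}(s^*_{t+1}, m^*_{t+1}, d^*_{t+1}) \quad \text{for } t = T-2, T-3, \dots, 0$$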
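The five steps (initialization, entering a new state with d = 0, self looping with d > 0, termination, backtracking) can be sketched in the log domain as below. This is an illustration under stated assumptions rather than the authors' implementation: `log_b(j, m, d, t)` is a hypothetical callback returning the log output density of frame t, entering a state is only allowed from a different state, and the mixture weight is applied at every frame.

```python
import math


def ns_hmm_segment(T, M, L, log_pi, log_a, log_c, log_b):
    """Best (state, mixture, duration) path and its log score."""
    delta = [dict() for _ in range(T)]   # delta[t][(j, m, d)]: best log score
    psi = [dict() for _ in range(T)]     # psi[t][(j, m, d)]: best predecessor
    # 1- Initialization: every state starts with duration 0.
    for j in range(M):
        for m in range(L):
            delta[0][(j, m, 0)] = log_pi[j] + log_c[j][m] + log_b(j, m, 0, 0)
            psi[0][(j, m, 0)] = None
    for t in range(1, T):
        for j in range(M):
            # 2- Entering state j (d = 0): best predecessor over all other
            #    states, mixtures and durations (i != j is an assumption).
            best, arg = -math.inf, None
            for (i, mp, dp), val in delta[t - 1].items():
                if i != j and val + log_a[i][j] > best:
                    best, arg = val + log_a[i][j], (i, mp, dp)
            for m in range(L):
                if arg is not None:
                    delta[t][(j, m, 0)] = best + log_c[j][m] + log_b(j, m, 0, t)
                    psi[t][(j, m, 0)] = arg
                # 3- Self loop (d > 0): same state and mixture, duration + 1.
                for d in range(1, t + 1):
                    prev = delta[t - 1].get((j, m, d - 1))
                    if prev is not None:
                        delta[t][(j, m, d)] = (prev + log_a[j][j] + log_c[j][m]
                                               + log_b(j, m, d, t))
                        psi[t][(j, m, d)] = (j, m, d - 1)
    # 4- Termination and 5- Backtracking.
    last = max(delta[T - 1], key=delta[T - 1].get)
    path = [last]
    for t in range(T - 1, 0, -1):
        path.append(psi[t][path[-1]])
    path.reverse()
    return delta[T - 1][last], path


def toy_log_b(j, m, d, t):
    """Toy score: state 0 fits frames 0-1, state 1 fits frames 2-3."""
    return 0.0 if j == (0 if t < 2 else 1) else -5.0


half = math.log(0.5)
score, path = ns_hmm_segment(T=4, M=2, L=1, log_pi=[half, half],
                             log_a=[[half, half], [half, half]],
                             log_c=[[0.0], [0.0]], log_b=toy_log_b)
```

In the toy run the recovered path stays in state 0 for two frames and then enters state 1, with the duration index restarting from 0 at the switch.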

Page 49: Speech Enhancement

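Now we generalize the MMSE formulae for NS-HMM:

$$\hat{g}(y_t) = E\{g(y_t) \mid z_0^t\} = \sum_{s_t=1}^{M} \sum_{m_t=1}^{L} \sum_{n_t=1}^{N} \sum_{p_t=1}^{P} \sum_{d_t=0}^{t} W_t(s_t, m_t, n_t, p_t, d_t \mid z_0^t)\; E\{g(y_t) \mid s_t, m_t, n_t, p_t, d_t, z_t\}$$

$$E\{g(y_t) \mid s_t, m_t, n_t, p_t, d_t, z_t\} = \int g(y_t)\, f_y(y_t \mid s_t, m_t, n_t, p_t, d_t, z_t)\, dy_t$$

$$W_t(s_t, m_t, n_t, p_t, d_t \mid z_0^t) = \frac{G_t(s_t, m_t, n_t, p_t, d_t, z_0^t)}{\sum_{s_t=1}^{M} \sum_{m_t=1}^{L} \sum_{n_t=1}^{N} \sum_{p_t=1}^{P} \sum_{d_t=0}^{t} G_t(s_t, m_t, n_t, p_t, d_t, z_0^t)}$$

for $1 \le s_t \le M$, $1 \le m_t \le L$, $1 \le n_t \le N$, $1 \le p_t \le P$, $0 \le d_t \le t$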

Page 50: Speech Enhancement

For the calculation of $E\{g(y_t) \mid \dots\}$, $g(y_t)$ has to be specified. It has been shown that for:

$$g(y_t) = Y_t(k), \quad k = 0, \dots, K-1$$

($Y_t(k)$: the kth component of the DFT of $y_t$), the computation cost is less than for other functions.

Page 51: Speech Enhancement

A linear estimation using the MMSE criterion has shown that, for the kth component of g, the conditional density inside the expectation is Gaussian, i.e., in

$$E\{g(k) \mid z_t, s_t, m_t, n_t, p_t, d_t\} = \int Y_t(k)\, f(Y_t(k) \mid z_t, s_t, m_t, n_t, p_t, d_t)\, dY_t(k)$$

$f$ is Gaussian with mean:

$$\tilde{H}_{s_t, m_t, n_t, p_t, d_t}(k)\, Z_t(k)$$

where $Z_t(k)$ is the kth component of the DFT of $z_t$ and $\tilde{H}_{s_t, m_t, n_t, p_t, d_t}(k)$ is the kth component of the Wiener filter for the corresponding state, mixture and duration of speech and noise.

Page 52: Speech Enhancement

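Recursive calculation of G, with duration constraints:

For entering a new state:

$$G_t(s_t, m_t, n_t, p_t, 0, z_0^t) = \sum_{s_{t-1}=1}^{M} \sum_{m_{t-1}=1}^{L} \sum_{n_{t-1}=1}^{N} \sum_{p_{t-1}=1}^{P} \sum_{d_{t-1}=0}^{t-1} G_{t-1}(s_{t-1}, m_{t-1}, n_{t-1}, p_{t-1}, d_{t-1}, z_0^{t-1})\; a_{s_{t-1} s_t}\, a_{n_{t-1} n_t}\, c_{m_t | s_t}\, c_{p_t | n_t}\; b(z_t \mid s_t, m_t, n_t, p_t, 0)$$

For staying in the old state:

$$G_t(s_t, m_t, n_t, p_t, d, z_0^t) = \sum_{n_{t-1}=1}^{N} \sum_{p_{t-1}=1}^{P} G_{t-1}(s_t, m_t, n_{t-1}, p_{t-1}, d-1, z_0^{t-1})\; a_{s_t s_t}\, a_{n_{t-1} n_t}\, c_{m_t | s_t}\, c_{p_t | n_t}\; b(z_t \mid s_t, m_t, n_t, p_t, d), \quad 1 \le d \le t$$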

Page 53: Speech Enhancement

Page 54: Speech Enhancement

Page 55: Speech Enhancement

Page 56: Speech Enhancement

Page 57: Speech Enhancement

Page 58: Speech Enhancement

Page 59: Speech Enhancement

Page 60: Speech Enhancement