Page 1 of 43 Signal Subspace Speech Enhancement

Upload: allen-scott

Post on 21-Jan-2016


Page 1 of 43

Signal Subspace Speech Enhancement


Page 2 of 47

Presentation Outline

Introduction

Principles

Orthogonal Transforms (KLT Overview)

Papers Review


Page 3 of 47

Introduction

Two major classes of speech enhancement:

– Based on modeling of noise/speech, e.g. HMM. Drawback: highly dependent on speech signal syntax and noise characteristics

– Based on transformation, e.g. Spectral Subtraction. Drawback: musical noise

Signal Subspace belongs to the second class (nonparametric)


Page 4 of 47

Schematic Diagram

Noisy signal (time domain) → Orthogonal Transform → Modifying Coefficients → Inverse Transform → Estimated Clean Signal


Page 5 of 47

Schematic Diagram

Noisy signal (time domain) → Framing (overlapping) → Orthogonal Transform → Estimating Dimensions of Subspaces, producing two orthogonal subspaces:

– Signal+Noise subspace → Gs (estimating clean signal from Signal+Noise subspace)

– Noise subspace → Gn

→ Inverse Transform → Clean Signal


Page 6 of 47

Principles

Procedure

– Estimate the dimension of the signal+noise subspace in each frame

– Estimate the clean signal from the (S+N) subspace by considering some criteria (main part): energy of the residual noise, energy of the signal distortion

– Null the coefficients related to the noise subspace
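The procedure above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the frames are real-valued and non-overlapping, the noise is white with a known variance, and the subspace dimension is chosen with a simple threshold (`tol` is a hypothetical parameter, not something the slides specify). The estimation step is reduced to plain nulling of the noise subspace.

```python
import numpy as np

def enhance_by_nulling(frames, noise_var, tol=0.3):
    """frames: (T, K) array, one noisy K-dimensional vector per row."""
    T, K = frames.shape
    R_z = frames.T @ frames / T                 # sample covariance (zero-mean data assumed)
    lam, U = np.linalg.eigh(R_z)                # KLT basis; eigenvalues in ascending order
    keep = lam > noise_var * (1.0 + tol)        # components that rise above the noise floor
    U1 = U[:, keep]                             # basis of the estimated signal+noise subspace
    enhanced = frames @ U1 @ U1.T               # null the noise subspace, transform back
    return enhanced, int(keep.sum())

# Synthetic test signal: rank-M "speech" plus white noise of variance 1.
rng = np.random.default_rng(0)
K, M, T = 16, 2, 4000
V = rng.standard_normal((K, M))
clean = (V @ rng.standard_normal((M, T))).T
noisy = clean + rng.standard_normal((T, K))

enhanced, M_hat = enhance_by_nulling(noisy, noise_var=1.0)
err_in = np.mean((noisy - clean) ** 2)          # noise energy per sample before enhancement
err_out = np.mean((enhanced - clean) ** 2)      # residual error after nulling
```

Nulling alone leaves the noise that falls inside the signal+noise subspace untouched, which is why the estimators discussed later in the deck apply a gain there as well.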


Page 7 of 47

Principles

Assumptions

– Noise & speech are uncorrelated

– Noise is additive & white (whitened)

– Covariance matrix of the noise in each frame is positive definite and close to a Toeplitz matrix

– Signal is more statistically structured than noise process


Page 8 of 47

Principles

Key Factor in Signal Subspace method

– Covariance matrices of the clean signal have some zero eigenvalues.

The improvement in SNR is proportional to the number of those zeros.

Nulling the coefficients of the noise subspace corresponds to nulling the weak spectral components in spectral subtraction.


Page 9 of 47

Orthogonal Transforms

Signal Subspace decomposition can be achieved by applying:

– KLT
  via Eigenvalue Decomposition (ED) of the signal covariance matrix
  via Singular Value Decomposition (SVD) of the data matrix
  SVD approximation by recursive methods

– DCT as a good approximation to the KLT

– Walsh, Haar, Sine, Fourier, …


Page 10 of 47

Orthogonal Transforms: Karhunen-Loeve Transform (KLT)

Also known as the “Hotelling”, “Principal Component” or “Eigenvector” Transform

Decorrelates the input vector perfectly

– Processing of one component has no effect on the others

Applications

– Compression, Pattern Recognition, Classification, Image Restoration, Speech Recognition, Speaker Recognition, …


Page 11 of 47

KLT Overview

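Let $R$ be the $N \times N$ correlation matrix of a random complex sequence $x = (x_1, x_2, \dots, x_N)^T$; then

$$R = E\{x x^H\} = E\left\{ \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix} \begin{bmatrix} x_1^* & x_2^* & \cdots & x_N^* \end{bmatrix} \right\}$$

where $E$ is the expectation operator and $R$ is a Hermitian matrix.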


Page 12 of 47

KLT Overview

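Let $\Phi$ be the $N \times N$ unitary matrix ($\Phi^H = \Phi^{-1}$) which diagonalizes $R$:

$$\Phi^H R \Phi = \Lambda = \mathrm{Diag}(\lambda_1, \lambda_2, \dots, \lambda_N)$$

$\lambda_i,\ i = 1, 2, \dots, N$ are the eigenvalues of $R$.

$\Phi^H$ is called the KLT matrix.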


Page 13 of 47

KLT Overview

Property of $\Phi^H$:

• Consider the following transform: $y = \Phi^H x$

The sequence $y$ is uncorrelated because:

$$E\{y y^H\} = E\{\Phi^H x x^H \Phi\} = \Phi^H E\{x x^H\} \Phi = \Phi^H R \Phi = \Lambda$$

$y$ has no cross-correlation
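The decorrelation property is easy to check numerically. The sketch below assumes real-valued data and uses a sample correlation matrix in place of the true expectation; the names `Phi`, `lam` and the synthetic mixing matrix `A` are illustrative choices, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_samples = 4, 10000

# Correlated zero-mean data: white noise passed through a random mixing matrix.
A = rng.standard_normal((N, N))
x = A @ rng.standard_normal((N, n_samples))     # each column is one realization of x

R = (x @ x.conj().T) / n_samples                # sample correlation matrix (Hermitian)
lam, Phi = np.linalg.eigh(R)                    # R = Phi diag(lam) Phi^H

y = Phi.conj().T @ x                            # KLT of every realization
R_yy = (y @ y.conj().T) / n_samples             # correlation matrix of the transformed data
off_diag = R_yy - np.diag(np.diag(R_yy))        # cross-correlations between components of y
```

The off-diagonal entries vanish up to round-off, and the diagonal of `R_yy` reproduces the eigenvalues `lam`.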


Page 14 of 47

KLT Overview

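What is $\Phi$?

$$\Phi^H R \Phi = \Lambda \;\Rightarrow\; R \Phi = \Phi \Lambda$$

where $\Phi = [\phi_1\ \phi_2\ \cdots\ \phi_N]$, the $\phi_i$'s are the $i$th columns of $\Phi$, and

$$R \phi_i = \lambda_i \phi_i, \quad i = 1, 2, \dots, N$$

Thus the $\phi_i$'s are the eigenvectors corresponding to the $\lambda_i$'s.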


Page 15 of 47

KLT Overview

Comments

– The arrangement of the auto-correlations of y is the same as that of the $\lambda_i$'s

– KLT can be based on the covariance matrix

– Using the largest eigenvalues to reconstruct the sequence with negligible error

– KLT is optimal
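The last two comments can be illustrated numerically: keeping only the M components with the largest eigenvalues gives a mean squared reconstruction error equal to the sum of the discarded eigenvalues, and no other M-dimensional orthonormal basis does better. The sketch assumes real-valued data and a sample correlation matrix; `M`, `Q` and the synthetic data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n_samples, M = 6, 5000, 3
x = rng.standard_normal((N, N)) @ rng.standard_normal((N, n_samples))

R = x @ x.T / n_samples                         # sample correlation matrix
lam, Phi = np.linalg.eigh(R)                    # eigenvalues in ascending order
Phi_M = Phi[:, -M:]                             # eigenvectors of the M largest eigenvalues

x_hat = Phi_M @ (Phi_M.T @ x)                   # keep M KLT coefficients, then invert
mse_klt = np.mean(np.sum((x - x_hat) ** 2, axis=0))
discarded = lam[:-M].sum()                      # energy carried by the dropped components

# Any other orthonormal M-dimensional basis (here a random one) cannot beat the KLT.
Q, _ = np.linalg.qr(rng.standard_normal((N, M)))
x_q = Q @ (Q.T @ x)
mse_other = np.mean(np.sum((x - x_q) ** 2, axis=0))
```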


Page 16 of 47

KLT Overview

Difficulties

– Computational complexity (no fast algorithm)

– Dependency on the statistics of the current frame

– Makes the components uncorrelated, not independent

The KLT is used as a benchmark for evaluating the performance of the other transforms.


Page 17 of 47

Papers Review

1. A Signal Subspace Approach for S.E. [Ephraim 95]

2. On S.E. Algorithms based on Signal Subspace Methods [Hansen]

3. Extension of the Signal Subspace S.E. Approach to Colored Noise [Ephraim]

4. An Adaptive KLT Approach for S.E. [Gazor]

5. Incorporating the Human Hearing Properties in Signal Subspace Approach for S.E. [Jabloun]

6. An Energy-Constrained Signal Subspace Method for S.E. [Huang]

7. S.E. Based on the Subspace Method [Asano]


Page 18 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

Principle

– Decompose the input vector of the noisy signal into a signal+noise subspace and a noise subspace by applying the KLT

Enhancement Procedure

– Removing the noise subspace

– Estimating the clean signal from the S+N subspace

– Two linear estimators, by considering: signal distortion, residual noise energy


Page 19 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

Notes

– Keeping the residual noise below some threshold to avoid producing musical noise

– Since the DFT & KLT are related, spectral subtraction (SS) is a particular case of this method

– If the number of basis vectors (for the linear combination of a vector) is less than the dimension of the vector, then its correlation matrix has some zero eigenvalues


Page 20 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

Basics

– Noisy speech signal: $z = y + w$, K-dimensional

– $y = \sum_{m=1}^{M} s_m V_m = Vs, \quad M \le K$

– $s_1, \dots, s_M$ are zero mean complex variables

– If M=K, the representation is always possible.

– Else the “damped complex sinusoid model” can be used.

– Span(V): produces all vectors y


Page 21 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

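When M<K, all vectors $y = Vs$ lie in a subspace of $R^K$ spanned by the columns of V: the SIGNAL+NOISE SUBSPACE

Covariance matrix of the clean signal y:

$$R_y = E\{y y^{\#}\} = V R_s V^{\#}$$

with $R_y$: $K \times K$; $V$: $K \times M$; $R_s$: $M \times M$; $V^{\#}$: $M \times K$

$\mathrm{Rank}(R_y) = M$, so $R_y$ has $K - M$ zero eigenvalues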


Page 22 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

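Covariance matrix of noise w (K-dim):

$$R_w = E\{w w^{\#}\} = \sigma_w^2 I, \qquad \mathrm{Rank}(R_w) = K$$

– White noise vectors fill the entire Euclidean space $R^K$

– Thus the noise exists in both the S+N subspace and the complementary subspace: the NOISE SUBSPACE

[Diagram: $R^K$ with regions labeled n (noise only) and S+n (signal plus noise)]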


Page 23 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

The discussion indicates that the Euclidean space of the noisy signal is composed of a signal subspace and a complementary noise subspace

This decomposition can be performed by applying the KLT to the noisy signal:

Let $z = Vs + w$. The covariance matrix of z is:

$$R_z = E\{z z^{\#}\} = V R_s V^{\#} + R_w$$


Page 24 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

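Noise is additive:

$$R_z = R_y + R_w$$

Let $R_z = U \Lambda_z U^{\#}$ be the eigendecomposition of $R_z$, where $U = [u_1, \dots, u_K]$ are the eigenvectors of $R_z$ and $\Lambda_z = \mathrm{diag}(\lambda_z(1), \dots, \lambda_z(K))$

The eigenvalues of $R_w$ are all $\sigma_w^2$, so

$$\lambda_z(k) = \begin{cases} \lambda_y(k) + \sigma_w^2 & \text{if } k = 1, \dots, M \\ \sigma_w^2 & \text{if } k = M+1, \dots, K \end{cases}$$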
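A small numerical check of this eigenvalue structure, assuming real-valued quantities and exactly known covariances (in practice $R_z$ is estimated from data). The values of `K`, `M`, `sigma2` and the diagonal `R_s` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
K, M, sigma2 = 8, 3, 0.5

V = rng.standard_normal((K, M))                 # K x M model matrix, M < K
R_s = np.diag([3.0, 2.0, 1.0])                  # covariance of the coefficients s
R_y = V @ R_s @ V.T                             # clean-signal covariance, rank M
R_z = R_y + sigma2 * np.eye(K)                  # additive white noise

lam_y = np.linalg.eigvalsh(R_y)                 # ascending order
lam_z = np.linalg.eigvalsh(R_z)
n_zero = int(np.sum(np.abs(lam_y) < 1e-9))      # number of (numerically) zero eigenvalues
```

`n_zero` comes out as K - M, and every eigenvalue of $R_z$ is the matching eigenvalue of $R_y$ shifted up by the noise variance, so the K - M smallest ones sit exactly at the noise floor.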


Page 25 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

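Estimating Dimensions of Signal Subspace M

Let $U = [U_1, U_2]$ with $U = [u_1, \dots, u_K]$ and

$$U_1 = \{u_k : \lambda_z(k) > \sigma_w^2\} \quad \text{: principal eigenvectors}$$

Because $\mathrm{span}(U_1) = \mathrm{span}(V)$, $U_1 U_1^{\#}$ is the orthogonal projector onto the S+N subspace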


Page 26 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

Since $U U^{\#} = I$, we have $U_1 U_1^{\#} + U_2 U_2^{\#} = I$. Thus a vector z of the noisy signal can be decomposed as

$$z = U_1 U_1^{\#} z + U_2 U_2^{\#} z$$

$U^{\#}$ is the Karhunen-Loeve Transform matrix.

The vector $U_2 U_2^{\#} z$ does not contain signal information and can be nulled when estimating the clean signal.

However, M (the dimension of the S+N subspace) must be calculated precisely
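The decomposition can be verified with a short sketch. It assumes real-valued data, an exactly known $R_z$ and $R_s = I$ for brevity; the threshold offset `1e-8` used to pick the principal eigenvectors is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(3)
K, M, sigma2 = 8, 3, 0.5

V = rng.standard_normal((K, M))
R_z = V @ V.T + sigma2 * np.eye(K)              # R_s = I assumed for brevity

lam_z, U = np.linalg.eigh(R_z)
principal = lam_z > sigma2 + 1e-8               # eigenvalues strictly above the noise floor
U1, U2 = U[:, principal], U[:, ~principal]
M_hat = U1.shape[1]                             # estimated dimension of the S+N subspace

P1 = U1 @ U1.T                                  # orthogonal projector onto the S+N subspace
P2 = U2 @ U2.T                                  # orthogonal projector onto the noise subspace

y = V @ rng.standard_normal(M)                  # a clean vector in span(V)
w = np.sqrt(sigma2) * rng.standard_normal(K)    # a white-noise vector
z = y + w
```

Here `P2 @ y` is zero, so nulling `P2 @ z` discards noise only, while `P1 @ z` still contains the noise that falls inside the S+N subspace.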


Page 27 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

Linear Estimation of the clean signal

– Time Domain Constrained Estimator

Minimize signal distortion while constraining the energy of residual noise in every frame below a given threshold

– Spectral Domain Constrained Estimator

Minimize signal distortion while constraining the energy of residual noise in each spectral component below a given threshold


Page 28 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

Time Domain Constrained Estimator

– Having $z = y + w$, let $\hat{y} = Hz$ be a linear estimator of y, where H is a $K \times K$ matrix

– The residual signal is

$$r = \hat{y} - y = (H - I)y + Hw = r_y + r_w$$

with $r_y$ and $r_w$ representing signal distortion and residual noise respectively


Page 29 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

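Defining Criterion

Signal distortion: $r_y = (H - I)y$. Energy: $\varepsilon_y^2 = \mathrm{tr}\,E\{r_y r_y^{\#}\}$

Residual noise: $r_w = Hw$. Energy: $\varepsilon_w^2 = \mathrm{tr}\,E\{r_w r_w^{\#}\}$

Solving:

$$\min_H \varepsilon_y^2 \quad \text{subject to: } \frac{1}{K}\varepsilon_w^2 \le \alpha\sigma_w^2, \quad 0 \le \alpha \le \frac{M}{K}$$

Minimize signal distortion while constraining the energy of residual noise in the entire frame below a given threshold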


Page 30 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

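After solving the constrained minimization by the “Kuhn-Tucker” necessary conditions we obtain

$$H_{TDC} = R_y (R_y + \mu\sigma_w^2 I)^{-1}$$

where $\mu$ is the Lagrange multiplier that must satisfy

$$\alpha = \frac{1}{K}\,\mathrm{tr}\{R_y^2 (R_y + \mu\sigma_w^2 I)^{-2}\}$$

Eigendecomposition of $H_{TDC}$:

$$H_{TDC} = U \begin{bmatrix} G & 0 \\ 0 & 0 \end{bmatrix} U^{\#}$$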


Page 31 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

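In order to null noisy components

$$H_{TDC} = U \begin{bmatrix} G & 0 \\ 0 & 0 \end{bmatrix} U^{\#} = U_1 G U_1^{\#}$$

$$G = \Lambda_y (\Lambda_y + \mu\sigma_w^2 I)^{-1}$$

If $\alpha = \alpha_{\max} = M/K$ then $H_{TDC} = I$, which means minimum distortion and maximum noise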
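A minimal sketch of the TDC estimator in its nulled form $U_1 G U_1^{\#}$, assuming real-valued data, white noise of known variance and an exactly known $R_z$. The function name `tdc_estimator` and the tolerance `tol` are hypothetical; the clean eigenvalues are obtained as $\lambda_z(k) - \sigma_w^2$.

```python
import numpy as np

def tdc_estimator(R_z, sigma2, mu, tol=1e-8):
    """Return (H, alpha) for the time-domain-constrained estimator."""
    lam_z, U = np.linalg.eigh(R_z)
    lam_y = lam_z - sigma2                      # eigenvalues of the clean covariance
    keep = lam_y > tol                          # principal (signal+noise) components
    U1, lam_y1 = U[:, keep], lam_y[keep]
    g = lam_y1 / (lam_y1 + mu * sigma2)         # diagonal of G
    H = U1 @ np.diag(g) @ U1.T                  # noise subspace is nulled
    alpha = float(np.sum(g ** 2)) / R_z.shape[0]  # (1/K) tr{G^2}: residual-noise level reached
    return H, alpha

rng = np.random.default_rng(6)
K, M, sigma2 = 8, 3, 0.5
V = rng.standard_normal((K, M))
R_y = V @ V.T
R_z = R_y + sigma2 * np.eye(K)

H, alpha = tdc_estimator(R_z, sigma2, mu=2.0)
H_direct = R_y @ np.linalg.inv(R_y + 2.0 * sigma2 * np.eye(K))  # closed form from the slide

H0, alpha0 = tdc_estimator(R_z, sigma2, mu=0.0)  # mu = 0: unit gain on the S+N subspace
```

With `mu=0` the gains are all one, `H0` reduces to the projector onto the S+N subspace and `alpha0` equals M/K; increasing `mu` lowers the residual noise at the price of more signal distortion.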


Page 32 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

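Spectral Domain Constrained Estimator

– Minimize signal distortion while constraining the energy of residual noise in each spectral component below a given threshold.

Results:

$$H = U Q U^{\#}, \qquad Q = \mathrm{diag}(q_{11}, \dots, q_{KK})$$

$$q_{kk} = \begin{cases} \alpha_k^{1/2} & k = 1, \dots, M \\ 0 & k = M+1, \dots, K \end{cases}$$

$$\alpha_k = \exp\{-\nu\sigma_w^2 / \lambda_y(k)\}$$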
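The SDC gains are simple to compute once the clean eigenvalues are available. The sketch below assumes the eigenvalues are sorted in descending order and that M is already known; the function name `sdc_gains` and the example numbers are illustrative only.

```python
import math

def sdc_gains(lam_y, sigma2, nu, M):
    """Diagonal of Q: q_kk = sqrt(alpha_k) for the first M components, 0 otherwise."""
    gains = []
    for k, lam in enumerate(lam_y):
        if k < M and lam > 0.0:
            alpha_k = math.exp(-nu * sigma2 / lam)   # per-component noise constraint
            gains.append(math.sqrt(alpha_k))
        else:
            gains.append(0.0)                        # noise subspace is nulled
    return gains

# Four eigenvalues, the last one belonging to the noise subspace (M = 3).
q = sdc_gains([8.0, 2.0, 0.5, 0.0], sigma2=1.0, nu=2.0, M=3)
```

Components with a large $\lambda_y(k)$ relative to the noise variance pass almost unchanged, while weak components are attenuated strongly; a larger `nu` makes the suppression more aggressive.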


Page 33 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

Notes

– Most of the computational complexity lies in the eigendecomposition of the estimated covariance.

– The eigendecomposition of the Toeplitz covariance matrix of the noisy vector is used as an approximation to the KLT

– Compromise between a large T for estimating Rz and a large K to satisfy M<K, while KT cannot be too large


Page 34 of 47

A Signal Subspace Approach for S.E. [Ephraim 95]

Implementation Results

– The improvement in SNR is proportional to K/M

– The SDC estimator is more powerful than the TDC estimator

– SNR improvements in Signal Subspace and SS are similar

– Subjective test: 83.9% preferred Signal Subspace over the noisy signal; 98.2% preferred Signal Subspace over SS


Page 35 of 47

On S.E. Algorithms based on Signal Subspace Methods [Hansen]

The dimension of the signal subspace is chosen at a point with almost equal singular values

Gain matrices for different estimators

– SDC

– TDC

– MV

– LS: G=I. Lowest signal distortion and highest residual noise; K/M improvement in SNR

SDC improves the SNR in the range 0-20 dB

Slide callouts: Lowest residual noise; $\frac{M}{K}\sigma_{noise}^2$; Less sensitive to errors in the noise estimation; Musical noise


Page 36 of 47

Extension of the Signal Subspace S.E. Approach to Colored Noise [Ephraim]

Whitening approach is not desirable for the SDC estimator.

Obtaining the gain matrix H for the SDC estimator:

$$\min_H \varepsilon_d^2 \quad \text{subject to: } E\{|v_i N|^2\} \le \alpha_i, \quad i = 1, \dots, m$$

$\tilde{H}$ is not diagonal when the input noise is colored

Whitening → Orthogonal Transformation $U'$ → modify components by $\tilde{H}$:

$$H = R_w^{1/2}\, U \tilde{H} U'\, R_w^{-1/2}$$
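The whiten, transform, modify, un-whiten structure of $H = R_w^{1/2} U \tilde{H} U' R_w^{-1/2}$ can be sketched as follows. The sketch assumes real-valued data and known covariances, and it uses a diagonal TDC-style gain for $\tilde{H}$; that gain is an assumption made for illustration and is not the SDC solution discussed on the slide. The function names are hypothetical.

```python
import numpy as np

def sqrt_and_inv_sqrt(R_w):
    """Symmetric square root of a positive-definite matrix and its inverse."""
    lam, E = np.linalg.eigh(R_w)
    return E @ np.diag(np.sqrt(lam)) @ E.T, E @ np.diag(1.0 / np.sqrt(lam)) @ E.T

def estimator_for_colored_noise(R_z, R_w, mu=1.0):
    R_half, R_inv_half = sqrt_and_inv_sqrt(R_w)
    R_zt = R_inv_half @ R_z @ R_inv_half        # whitened data: noise covariance becomes I
    lam, U = np.linalg.eigh(R_zt)               # orthogonal transform of the whitened data
    lam_y = np.maximum(lam - 1.0, 0.0)          # whitened clean eigenvalues
    H_tilde = np.diag(lam_y / (lam_y + mu))     # assumed diagonal gain (TDC-style)
    return R_half @ U @ H_tilde @ U.T @ R_inv_half

rng = np.random.default_rng(4)
K = 6
B = rng.standard_normal((K, K))
R_w = B @ B.T + K * np.eye(K)                   # a positive-definite colored-noise covariance
V = rng.standard_normal((K, 2))
R_y = V @ V.T                                   # rank-2 clean covariance

R_half, R_inv_half = sqrt_and_inv_sqrt(R_w)
white_cov = R_inv_half @ R_w @ R_inv_half       # should be the identity

H = estimator_for_colored_noise(R_y + R_w, R_w, mu=1.0)
H_direct = R_y @ np.linalg.inv(R_y + 1.0 * R_w) # equivalent closed form for this gain
H_noise_only = estimator_for_colored_noise(R_w, R_w, mu=1.0)
```

For this particular gain the result coincides with $R_y (R_y + \mu R_w)^{-1}$, and a noise-only input is suppressed completely.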


Page 37 of 47

An Adaptive KLT Approach for S.E. [Gazor]

Goal

– Enhancement of speech degraded by additive colored noise

Novelty

– Adaptive tracking-based algorithm for obtaining the KLT components

– A VAD based on the principal eigenvalues


Page 38 of 47

An Adaptive KLT Approach for S.E. [Gazor]

Objective

– Minimize the distortion when the residual noise power is limited to a specific level

Type of colored noise

– Has a diagonal covariance matrix in the KLT domain

$G = \Lambda_y(\Lambda_y + \mu\sigma_w^2 I)^{-1}$ is replaced by $G = \Lambda_y(\Lambda_y + \mu\Lambda_n)^{-1}$


Page 39 of 47

An Adaptive KLT Approach for S.E. [Gazor]

Adaptive KLT tracking algorithm

– Named “projection approximation subspace tracking”

– Reduces computational time

– Eigendecomposition is considered as a constrained optimization problem

– The problem is solved considering the quasi-stationarity of speech

– A recursive algorithm is then derived to find a close approximation of the eigenvectors of the noisy signal
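Projection approximation subspace tracking (PAST) is usually written as a recursive-least-squares style update of a basis matrix. The following is a minimal real-valued sketch of that generic update, not the exact variant used in [Gazor]; the forgetting factor `beta`, the initialisation and the synthetic data are assumptions chosen for illustration.

```python
import numpy as np

def past_update(W, P, x, beta=0.99):
    """One PAST step. W: (K, r) basis estimate, P: (r, r) inverse correlation of y."""
    y = W.T @ x                                 # project the new sample onto the current basis
    h = P @ y
    g = h / (beta + y @ h)                      # gain vector
    P = (P - np.outer(g, h)) / beta             # RLS-style update
    P = (P + P.T) / 2.0                         # keep P symmetric against round-off
    e = x - W @ y                               # projection-approximation error
    W = W + np.outer(e, g)                      # move the basis towards the new sample
    return W, P

rng = np.random.default_rng(5)
K, r = 8, 2
Q, _ = np.linalg.qr(rng.standard_normal((K, r)))   # true signal basis (orthonormal columns)

W, P = np.eye(K)[:, :r].copy(), np.eye(r)          # arbitrary starting point
for _ in range(3000):
    x = Q @ (3.0 * rng.standard_normal(r)) + 0.1 * rng.standard_normal(K)
    W, P = past_update(W, P, x)

Wq, _ = np.linalg.qr(W)                            # orthonormalize the tracked basis
alignment = np.linalg.norm(Q.T @ Wq) ** 2 / r      # 1.0 means identical subspaces
```

Each update costs on the order of K·r operations, compared with a full eigendecomposition per frame, which is the computational saving the slide refers to.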


Page 40 of 47

An Adaptive KLT Approach for S.E. [Gazor]

Voice activity detector

– When the current principal components’ energy is above 1/12 its past minimum and maximum

Implementation Results

SNR (dB)   Non-Processed   Ephraim’s   Noise Type
10         85%             55%         white
5          75%             69%         white
0          64%             89%         white
10         75%             73%         office
5          85%             79%         office
0          68%             89%         office


Page 41 of 47

Incorporating the Human Hearing Properties in the Signal Subspace Approach for S.E. [Jabloun]

Goal

– Keep as much of the residual noise as possible, in order to minimize signal distortion

Novelty

– Transformation from Frequency to Eigendomain (FET) for modeling the masking threshold.

Many masking models were introduced in the frequency domain, like the Bark scale

eigendomain → IFET → Masking → FET → eigendomain


Page 42 of 47

Incorporating the Human Hearing Properties in the Signal Subspace Approach for S.E. [Jabloun]

Use noise prewhitening to handle the colored noise

Implementation results

Input SNR   Compared with noisy signal   Compared with Signal Subspace
20 dB       92%                          71%
10 dB       85%                          78%
5 dB        85%                          92%


Page 43 of 47

An Energy-Constrained Signal Subspace Method for S.E. [Huang]

Novelty

– The colored noise is modelled by an AR process.

– Estimating the energy of the clean signal to adjust the speech enhancement

A prewhitening filter is constructed based on the estimated AR parameters.

– Optimal AR coefficients are given by [Key 98]


Page 44 of 47

An Energy-Constrained Signal Subspace Method for S.E. [Huang]

Implementation Results

Word Recognition Accuracy for noisy digits

Input SNR   0 dB   5 dB    10 dB   20 dB
Baseline    40 %   70 %    90 %    100 %
ECSS        90 %   100 %   100 %   100 %

SNR improvement for isolated noisy digits

Input SNR     0 dB   5 dB   10 dB   20 dB
Improvement   7.6    6.4    5.2     2.9


Page 45 of 47

S.E. Based on the Subspace Method [Asano]—Microphone Array

[Diagram: directional sources and ambient noise arriving at a microphone array]

The input spectrum observed at the mth microphone:

$$X_m(k) = \sum_{d=1}^{D} A_{m,d}(k)\, S_d(k) + N_m(k)$$

Vector notation for all microphones:

$$x_k = A_k s_k + n_k$$

The (spatial) correlation matrix for $x_k$ is

$$R_k = E[x_k x_k^H]$$

Then Eigenvalue Decomposition is applied to $R_k$


Page 46 of 47

S.E. Based on the Subspace Method [Asano]—Microphone Array

Procedure

– Weighting the eigenvalues of the spatial correlation matrix
  Energy of the D directional sources is concentrated on the D largest eigenvalues
  Ambient noise is reduced by weighting the eigenvalues of the noise-dominant subspace
  Discarding the M-D smallest eigenvalues when the direct-ambient ratio is high

– Using an MV beamformer to extract the directional component from the modified spatial correlation matrix


Page 47 of 47

S.E. Based on the Subspace Method [Asano]—Microphone Array

Implementation results

– Two directional speech signals + Ambient noise

Recognition Rate:

SNR     MV (A)   MV (B1)   MV-NSR (A)   MV-NSR (B1)
10 dB   81.1%    86.6%     81.5%        87.2%
5 dB    66.9%    71.5%     72.3%        78%


Page 48 of 47

Thanks For Your Attention

The End