online divergence switching for superresolution-based nonnegative matrix factorization

Online Divergence Switching for Superresolution-Based Nonnegative Matrix Factorization Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura (Nara Institute of Science and Technology, Japan) Yu Takahashi, Kazunobu Kondo (Yamaha Corporation, Japan) Hirokazu Kameoka (The University of Tokyo, Japan) 2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing Speech Analysis(2),2PM2-2

Upload: naistis

Post on 05-Nov-2014


Page 1: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Online Divergence Switching for

Superresolution-Based

Nonnegative Matrix Factorization

Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura

(Nara Institute of Science and Technology, Japan)

Yu Takahashi, Kazunobu Kondo

(Yamaha Corporation, Japan)

Hirokazu Kameoka (The University of Tokyo, Japan)

2014 RISP International Workshop on Nonlinear Circuits,

Communications and Signal Processing

Speech Analysis(2),2PM2-2

Page 2: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Outline

• 1. Research background

• 2. Conventional methods

– Nonnegative matrix factorization

– Supervised nonnegative matrix factorization

– Directional clustering

– Hybrid method

• 3. Proposed method

– Online divergence switching for hybrid method

• 4. Experiments

• 5. Conclusions



Page 4: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Research background

• Music signal separation technologies have received much attention.

– Applications: automatic music transcription, 3D audio systems, etc.

• Music signal separation based on nonnegative matrix factorization (NMF) is a very active research area.

• The separation performance of supervised NMF (SNMF) markedly degrades when the mixture contains many sources.

• We have proposed a new hybrid separation method for stereo music signals.

Page 5: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Research background

• Our proposed hybrid method

[Figure: processing flow. Input stereo signal → spatial separation method (directional clustering) → SNMF-based separation method (superresolution-based SNMF) → separated signal.]

Page 6: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Research background

• The optimal divergence criterion in superresolution-based SNMF depends on the spatial conditions of the input signal.

• Our aim in this presentation

– We propose a new separation scheme for this hybrid method that separates the target signal with high accuracy under any type of spatial condition.


Page 8: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

NMF [Lee, et al., 2001]

• NMF

– is a sparse representation algorithm.

– can extract significant features from the observed matrix.

[Figure: the observed matrix (spectrogram; frequency versus time, amplitude values) is approximated by the product of the basis matrix (spectral patterns) and the activation matrix (time-varying gains).]

Ω: Number of frequency bins

𝑇: Number of time frames

𝐾: Number of bases

Page 9: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Optimization in NMF

• The variable matrices (the basis matrix and the activation matrix) are optimized by minimizing the divergence between the observed matrix and their product.

• Euclidean distance (EUC-distance) and Kullback-Leibler divergence (KL-divergence) are often used for the divergence in the cost function.

• In NMF-based separation, the KL-divergence-based cost function achieves high separation performance.

Cost function: the sum, over all time-frequency entries, of the divergence between each entry of the observed matrix and the corresponding entry of the product of the variable matrices.
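As a rough illustration of this optimization, the sketch below uses the standard multiplicative update rules for NMF with either EUC-distance or KL-divergence. The names `Y`, `F`, `G`, `nmf`, and `cost` are assumptions chosen for this example (the symbols on the slide are not preserved in the transcript), and the iteration count and random initialization are arbitrary.

```python
import numpy as np

def nmf(Y, K, divergence="KL", n_iter=200, eps=1e-12, seed=0):
    """Approximate a nonnegative matrix Y (Omega x T) by F @ G with K bases."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    F = rng.random((n_freq, K)) + eps  # basis matrix (spectral patterns)
    G = rng.random((K, n_time)) + eps  # activation matrix (time-varying gains)
    for _ in range(n_iter):
        if divergence == "EUC":
            # Multiplicative updates for the squared Euclidean distance.
            F *= (Y @ G.T) / (F @ G @ G.T + eps)
            G *= (F.T @ Y) / (F.T @ F @ G + eps)
        elif divergence == "KL":
            # Multiplicative updates for the generalized KL-divergence.
            F *= ((Y / (F @ G + eps)) @ G.T) / (G.sum(axis=1)[None, :] + eps)
            G *= (F.T @ (Y / (F @ G + eps))) / (F.sum(axis=0)[:, None] + eps)
        else:
            raise ValueError("divergence must be 'EUC' or 'KL'")
    return F, G

def cost(Y, F, G, divergence="KL", eps=1e-12):
    """Sum of the entry-wise divergence between Y and F @ G."""
    X = F @ G + eps
    if divergence == "EUC":
        return float(np.sum((Y - X) ** 2))
    return float(np.sum(Y * np.log((Y + eps) / X) - Y + X))
```

Both update rules keep `F` and `G` nonnegative because they only multiply nonnegative quantities, which is why no explicit projection step is needed.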

Page 10: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

SNMF [Smaragdis, et al., 2007]

• SNMF utilizes some sample sounds of the target.

– Construct the trained basis matrix of the target sound.

– Decompose the mixture into the target signal and the other signals.
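The two SNMF steps can be sketched as follows, assuming KL-divergence multiplicative updates. The function names `train_basis` and `snmf_separate`, the symbols `F`, `G`, `H`, `U`, and the number of non-target bases `L` are hypothetical choices for this example, not notation taken from the slides.

```python
import numpy as np

EPS = 1e-12

def train_basis(Y_train, K, n_iter=200, seed=0):
    """Training step: learn K target bases from sample sounds of the target."""
    rng = np.random.default_rng(seed)
    F = rng.random((Y_train.shape[0], K)) + EPS
    G = rng.random((K, Y_train.shape[1])) + EPS
    for _ in range(n_iter):
        R = Y_train / (F @ G + EPS)
        F *= (R @ G.T) / (G.sum(axis=1)[None, :] + EPS)
        R = Y_train / (F @ G + EPS)
        G *= (F.T @ R) / (F.sum(axis=0)[:, None] + EPS)
    return F

def snmf_separate(Y, F, L, n_iter=200, seed=0):
    """Separation step: fit Y with F @ G + H @ U while keeping F fixed."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    G = rng.random((F.shape[1], n_time)) + EPS  # activations of the target bases
    H = rng.random((n_freq, L)) + EPS           # bases for the other signals
    U = rng.random((L, n_time)) + EPS           # activations of the other bases
    for _ in range(n_iter):
        R = Y / (F @ G + H @ U + EPS)
        G *= (F.T @ R) / (F.sum(axis=0)[:, None] + EPS)
        R = Y / (F @ G + H @ U + EPS)
        H *= (R @ U.T) / (U.sum(axis=1)[None, :] + EPS)
        R = Y / (F @ G + H @ U + EPS)
        U *= (H.T @ R) / (H.sum(axis=0)[:, None] + EPS)
    return F @ G, H @ U  # estimated target and other spectrograms
```

Only `G`, `H`, and `U` are updated during separation; the trained basis `F` is never modified, which is what makes the decomposition supervised.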

Page 11: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Problem of SNMF

• The separation performance of SNMF markedly degrades when many interference sources exist.

Page 12: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Directional clustering [Araki, et al., 2007]

• Directional clustering

– utilizes differences between channels as a separation cue.

– is equivalent to binary masking in the spectrogram domain.

• Problems

– It cannot separate sources in the same direction.

– Artificial distortion arises owing to the binary masking.

[Figure: a stereo input signal with sources at left (L), center (C), and right (R) is separated by binary masking. The separated signal is the entry-wise product of the spectrogram and the binary mask.]

Spectrogram (dominant direction of each time-frequency bin; rows: frequency, columns: time)

C C C R L R
C L L L R R
C C C C R R
C R R L L L
C C C C C C

Binary mask for the center direction (rows: frequency, columns: time)

1 1 1 0 0 0
1 0 0 0 0 0
1 1 1 1 0 0
1 0 0 0 0 0
1 1 1 1 1 1
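The binary masking in this figure can be reproduced in a few lines. The direction labels below are copied from the slide; the magnitude values in `spectrogram` are made up for illustration, and the clustering step that would produce the labels from a real stereo signal is not shown.

```python
import numpy as np

# Dominant direction of each time-frequency bin (rows: frequency, columns: time),
# as shown on the slide: C = center, L = left, R = right.
labels = np.array([
    list("CCCRLR"),
    list("CLLLRR"),
    list("CCCCRR"),
    list("CRRLLL"),
    list("CCCCCC"),
])

# Binary mask that keeps only the bins assigned to the center direction.
mask = (labels == "C").astype(int)

# Hypothetical magnitude spectrogram with the same shape as the label grid.
spectrogram = np.arange(1, 31, dtype=float).reshape(5, 6)

# Directional clustering output for the center: entry-wise product.
separated = mask * spectrogram
```

The zeros left in `separated` are the bins that were assigned to other directions.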

Page 13: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Hybrid method [D. Kitamura, et al., 2013]

• We have proposed a new SNMF called superresolution-based SNMF and a hybrid method that uses it.

• The hybrid method consists of directional clustering and superresolution-based SNMF.

Page 14: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Superresolution-based SNMF

• This SNMF reconstructs the spectrogram obtained from directional clustering using supervised basis extrapolation.

[Figure: spectrograms (frequency versus time). The input spectrogram contains components from the target direction and other directions. Directional clustering yields the separated cluster, which contains chasms. Superresolution-based SNMF yields the reconstructed spectrogram.]

Page 15: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Superresolution-based SNMF

• Spectral chasms arise owing to directional clustering.

– Treat these chasms as unseen observations.

– Extrapolate the fittest bases from the supervised basis.

[Figure: spectrogram of the separated cluster (frequency versus time) with the chasms marked.]

Page 16: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Superresolution-based SNMF

[Figure: three panels plotting the frequency of source components against direction (left, center, right). (a) Input signal, with the target marked. (b) After directional clustering. (c) After superresolution-based SNMF, with the extrapolated components marked.]

Page 17: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Decomposition model and cost function

• The divergence is defined at all grids except for the chasms by using the index matrix obtained from directional clustering.

• Decomposition model: the supervised bases are fixed, and the remaining matrices are variables.

• Cost function: the divergence restricted by the index matrix, plus a regularization term and a penalty term, each with a weighting parameter. The cost function also involves a binary complement and the Frobenius norm.
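A minimal sketch of the divergence term alone is given below, assuming the index matrix holds 1 at observed grids and 0 at chasms. The function name `masked_divergence` and the symbols `Y`, `X`, and `I` are hypothetical. The regularization and penalty terms are deliberately left out because their exact form is not preserved in this transcript.

```python
import numpy as np

def masked_divergence(Y, X, I, divergence="KL", eps=1e-12):
    """Sum the divergence between Y and the model X only where I == 1.

    Y: observed (binary-masked) spectrogram
    X: model spectrogram, e.g. the sum of target and non-target reconstructions
    I: index matrix (assumed 1 = observed grid, 0 = chasm)
    """
    if divergence == "EUC":
        d = (Y - X) ** 2
    elif divergence == "KL":
        d = Y * np.log((Y + eps) / (X + eps)) - Y + X
    else:
        raise ValueError("divergence must be 'EUC' or 'KL'")
    # Chasm grids are multiplied by zero, so they do not constrain the model.
    return float(np.sum(I * d))
```

Because the chasm grids contribute nothing to this term, the model is free to place energy there, which is what allows the supervised bases to be extrapolated into the chasms.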

Page 18: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Update rules

• We can obtain the update rules for the optimization of the three variable matrices.

Update rules:


Page 20: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Consideration for optimal divergence

• Separation performance of conventional SNMF

– KL-divergence is better than EUC-distance.

• However, for superresolution-based SNMF, it is not obvious whether KL-divergence or EUC-distance is better.

– The optimal divergence depends on the amount of spectral chasms.

Page 21: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Consideration for optimal divergence

• Superresolution-based SNMF has two tasks: signal separation and basis extrapolation.

• Abilities of each divergence

Divergence       Signal separation    Basis extrapolation
KL-divergence    Very good            Poor
EUC-distance     Good                 Good

Page 22: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Consideration for optimal divergence

• The spectrum decomposed by NMF with KL-divergence tends to be sparser than that decomposed by NMF with EUC-distance.

• A sparse basis is not suitable for extrapolation from the observable data.

Page 23: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Consideration for optimal divergence

• The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms because of the trade-off between separation and extrapolation abilities.

[Figure: performance versus sparseness, from weak (anti-sparse) to strong (sparse), with curves for separation, extrapolation, and total performance. EUC-distance and KL-divergence are marked on the sparseness axis.]

Page 24: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Consideration for optimal divergence

• The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms.

– If there are many chasms, the extrapolation ability is required, so EUC-distance should be used.

– If there are few or no chasms, the separation ability is required, so KL-divergence should be used.

[Figure: two spectrograms (frequency versus time) with chasms marked, one with many chasms and one with almost none.]

Page 25: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Hybrid method for online input data

• When we consider applying the hybrid method to online input data…

[Figure: the observed spectrogram passes through directional clustering, and the resulting binary mask gives the online binary-masked spectrogram (frequency versus time).]

Page 26: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Hybrid method for online input data

• We divide the online spectrogram into block parts.

[Figure: the spectrogram (frequency versus time) is split along the time axis into blocks, and superresolution-based SNMF is applied to each block in parallel.]

Page 27: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Online divergence switching

• We calculate the rate of chasms in each block part and compare it with a threshold value.

– If there are many chasms, we use superresolution-based SNMF with EUC-distance.

– If there are not many chasms, we use superresolution-based SNMF with KL-divergence.
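The switching rule can be sketched as below, assuming the index matrix from directional clustering holds 1 at observed grids and 0 at chasms. The function names, the block length, the threshold value, and the strict comparison are all hypothetical choices for illustration; the slides do not state concrete values.

```python
import numpy as np

def chasm_rate(index_block):
    """Fraction of time-frequency grids in a block that are chasms (index 0)."""
    return 1.0 - float(np.mean(index_block))

def choose_divergences(index_matrix, block_len, threshold):
    """Split the index matrix along time and pick a divergence for each block."""
    choices = []
    for start in range(0, index_matrix.shape[1], block_len):
        block = index_matrix[:, start:start + block_len]
        rate = chasm_rate(block)
        # Many chasms: extrapolation matters, so use EUC-distance.
        # Few chasms: separation matters, so use KL-divergence.
        choices.append((start, rate, "EUC" if rate > threshold else "KL"))
    return choices

# Example: the first block is mostly chasms, the second has none.
I = np.ones((4, 6), dtype=int)
I[:, :3] = 0
I[0, 0] = 1
plan = choose_divergences(I, block_len=3, threshold=0.5)
```

Each entry of `plan` gives the start frame of a block, its chasm rate, and the divergence that a per-block superresolution-based SNMF would then use.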

Page 28: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Procedure of proposed method



Page 30: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Experimental conditions

• We used stereo-panning signals.

• The mixture consists of four instruments generated by a MIDI synthesizer.

• We used the same type of MIDI sounds as the target instruments for supervision in the training process.

– Supervision sound: two octaves of notes that cover all the notes of the target signal.

[Figure: stereo panning layout with source positions labeled 1, 2, and 3 between left and right, with the center marked; the target source is indicated.]

Page 31: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Experimental conditions

• We compared three methods.

– Hybrid method using only EUC-distance-based SNMF (Conventional method 1)

– Hybrid method using only KL-divergence-based SNMF (Conventional method 2)

– Proposed hybrid method that switches the divergence to the optimal one (Proposed method)

• We used the signal-to-distortion ratio (SDR) as the evaluation score.

– SDR indicates the total separation accuracy, which includes both the quality of the separated target signal and the degree of separation.
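For intuition about the SDR scale, the sketch below computes a simplified energy-ratio version in decibels. This is an assumption made for illustration: the slides do not give the formula, and SDR in source separation evaluation is commonly computed with a more elaborate decomposition of the estimate. The name `simple_sdr` is hypothetical.

```python
import numpy as np

def simple_sdr(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: reference energy over the energy of the error."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = estimate - reference
    return 10.0 * np.log10((np.sum(reference ** 2) + eps) / (np.sum(error ** 2) + eps))
```

With this definition, an estimate whose error has one hundredth of the reference energy scores 20 dB, and larger errors give lower scores.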

Page 32: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Experimental result

• Average SDR scores for each method over 12 combinations in which the four instruments are shuffled.

• The proposed method outperforms the other methods.

[Figure: bar chart of SDR [dB] (axis from 8.0 to 10.0; higher is better) for Conventional method 1, Conventional method 2, and the Proposed method.]

Page 33: Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Factorization

Conclusions

• We proposed a new divergence switching scheme for superresolution-based SNMF.

• This method separates online input signals using the optimal divergence in NMF.

• The proposed method can be used for any type of spatial condition of the sources, and separates the target signal with high accuracy.

Thank you for your attention!