Online Divergence Switching for Superresolution-Based Nonnegative Matrix Factorization
Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura
(Nara Institute of Science and Technology, Japan)
Yu Takahashi, Kazunobu Kondo
(Yamaha Corporation, Japan)
Hirokazu Kameoka (The University of Tokyo, Japan)
2014 RISP International Workshop on Nonlinear Circuits,
Communications and Signal Processing
Speech Analysis(2),2PM2-2
Outline
• 1. Research background
• 2. Conventional methods
– Nonnegative matrix factorization
– Supervised nonnegative matrix factorization
– Directional clustering
– Hybrid method
• 3. Proposed method
– Online divergence switching for hybrid method
• 4. Experiments
• 5. Conclusions
Research background
• Music signal separation technologies have received
much attention.
• Music signal separation based on nonnegative matrix
factorization (NMF) is a very active research area.
• The separation performance of supervised NMF (SNMF) degrades markedly when the mixture contains many sources.
• Applications: automatic music transcription, 3D audio systems, etc.
• We have proposed a new hybrid separation method for stereo music signals.
Research background
• Our proposed hybrid method
[Figure: input stereo signal → spatial separation method (directional clustering) → SNMF-based separation method (superresolution-based SNMF) → separated signal]
Research background
• The optimal divergence criterion in superresolution-based SNMF depends on the spatial conditions of the input signal.
• Our aim in this presentation: we propose a new separation scheme for this hybrid method that separates the target signal with high accuracy under any spatial condition.
NMF [Lee, et al., 2001]
• NMF
– is a sparse representation algorithm.
– can extract significant features from the observed matrix.
[Figure: the observed matrix (spectrogram; frequency × time, amplitude entries) is approximated by the product of the basis matrix (spectral patterns) and the activation matrix (time-varying gains).]
Ω: Number of frequency bins
𝑇: Number of time frames
𝐾: Number of bases
Optimization in NMF
• The variable matrices (the basis matrix and the activation matrix) are optimized by minimizing the divergence between the observed matrix and their product.
• Euclidean distance (EUC-distance) and Kullback-Leibler divergence (KL-divergence) are often used as the divergence in the cost function.
• In NMF-based separation, the KL-divergence-based cost function achieves high separation performance.
Cost function: the sum, over all entries, of the divergence between each entry of the observed matrix and the corresponding entry of the product of the variable matrices.
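As a minimal sketch of this optimization, the following Python code applies the standard multiplicative update rules for the EUC-distance and KL-divergence costs. The names `nmf`, `Y`, `F`, `G`, the iteration count, and the small constant `eps` are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np

def euc_distance(Y, X):
    """Squared Euclidean distance summed over all entries."""
    return np.sum((Y - X) ** 2)

def kl_divergence(Y, X, eps=1e-12):
    """Generalized KL-divergence summed over all entries."""
    return np.sum(Y * np.log((Y + eps) / (X + eps)) - Y + X)

def nmf(Y, K, divergence="kl", n_iter=200, eps=1e-12, seed=0):
    """Approximate a nonnegative matrix Y (frequency x time) by F @ G."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    F = rng.random((n_freq, K)) + eps   # basis matrix (spectral patterns)
    G = rng.random((K, n_time)) + eps   # activation matrix (time-varying gains)
    for _ in range(n_iter):
        if divergence == "euc":
            F *= (Y @ G.T) / (F @ G @ G.T + eps)
            G *= (F.T @ Y) / (F.T @ F @ G + eps)
        elif divergence == "kl":
            F *= ((Y / (F @ G + eps)) @ G.T) / (G.sum(axis=1) + eps)
            G *= (F.T @ (Y / (F @ G + eps))) / (F.sum(axis=0)[:, None] + eps)
        else:
            raise ValueError("divergence must be 'euc' or 'kl'")
    return F, G
```

Both update rules keep every entry nonnegative because they only multiply by nonnegative ratios, which is why no explicit constraint handling appears in the loop.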
SNMF [Smaragdis, et al., 2007]
• SNMF utilizes sample sounds of the target.
– Training: construct the trained basis matrix from the target sample sounds.
– Separation: decompose the mixture into the target signal and the other signals.
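The two SNMF stages can be sketched as follows, assuming the KL-divergence cost and multiplicative updates. The function names `train_basis` and `snmf_separate`, the numbers of bases `K` and `L`, and the iteration counts are hypothetical choices for illustration, not values from the slides.

```python
import numpy as np

EPS = 1e-12  # small constant to avoid division by zero (assumption)

def train_basis(Y_train, K, n_iter=200, seed=0):
    """Training: learn a basis matrix F from the target sample sounds (KL-NMF)."""
    rng = np.random.default_rng(seed)
    F = rng.random((Y_train.shape[0], K)) + EPS
    G = rng.random((K, Y_train.shape[1])) + EPS
    for _ in range(n_iter):
        R = Y_train / (F @ G + EPS)
        F *= (R @ G.T) / (G.sum(axis=1) + EPS)
        R = Y_train / (F @ G + EPS)
        G *= (F.T @ R) / (F.sum(axis=0)[:, None] + EPS)
    return F

def snmf_separate(Y, F, L, n_iter=200, seed=1):
    """Separation: fit Y ~ F @ G + H @ U while keeping the trained F fixed."""
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y.shape[1])) + EPS  # activations of the target
    H = rng.random((Y.shape[0], L)) + EPS           # bases of the other sources
    U = rng.random((L, Y.shape[1])) + EPS           # activations of the others
    for _ in range(n_iter):
        R = Y / (F @ G + H @ U + EPS)
        G *= (F.T @ R) / (F.sum(axis=0)[:, None] + EPS)
        R = Y / (F @ G + H @ U + EPS)
        H *= (R @ U.T) / (U.sum(axis=1) + EPS)
        R = Y / (F @ G + H @ U + EPS)
        U *= (H.T @ R) / (H.sum(axis=0)[:, None] + EPS)
    return F @ G, H @ U  # target spectrogram, other-source spectrogram
```

Only `G`, `H`, and `U` are updated in the separation stage; the trained basis `F` is never written to, which is what makes the method supervised.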
Problem of SNMF
• The separation performance of SNMF markedly
degrades when many interference sources exist.
Directional clustering [Araki, et al., 2007]
• Directional clustering
– utilizes differences between channels as a separation cue.
– is equivalent to binary masking in the spectrogram domain.
• Problems
– Cannot separate sources in the same direction
– Artificial distortion arises owing to the binary masking.
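[Figure: the stereo input signal, with sources at left, center, and right, is separated by binary masking into the signal of one direction.]
Example (rows: frequency, columns: time). The separated spectrogram is the entry-wise product of the binary mask and the spectrogram.
Cluster label of each grid (C: center, L: left, R: right):
C C C R L R
C L L L R R
C C C C R R
C R R L L L
C C C C C C
Binary mask for the center cluster:
1 1 1 0 0 0
1 0 0 0 0 0
1 1 1 1 0 0
1 0 0 0 0 0
1 1 1 1 1 1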
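The binary masking step can be illustrated with the label grid from the example above. This is a minimal sketch: the function names `binary_mask` and `apply_mask` are hypothetical, and the clustering that produces the direction labels is assumed to have been done already.

```python
import numpy as np

# Direction label of each time-frequency grid (C: center, L: left, R: right),
# copied from the example on the slide.
labels = np.array([
    list("CCCRLR"),
    list("CLLLRR"),
    list("CCCCRR"),
    list("CRRLLL"),
    list("CCCCCC"),
])

def binary_mask(labels, target):
    """Return 1 where a grid belongs to the target direction and 0 elsewhere."""
    return (labels == target).astype(float)

def apply_mask(spectrogram, mask):
    """Entry-wise product of the spectrogram and the binary mask."""
    return spectrogram * mask

center_mask = binary_mask(labels, "C")
```

Every grid set to 0 by the mask loses its energy entirely, which is the source of the artificial distortion listed under the problems of directional clustering.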
Hybrid method [D. Kitamura, et al., 2013]
• We have proposed a new SNMF called
superresolution-based SNMF and its hybrid method.
• The hybrid method consists of directional clustering followed by superresolution-based SNMF.
Superresolution-based SNMF
• This SNMF reconstructs the spectrogram obtained
from directional clustering using supervised basis
extrapolation.
[Figure: directional clustering splits the input spectrogram (frequency × time) into the target direction and the other directions. The separated cluster of the target direction contains chasms, and superresolution-based SNMF converts it into the reconstructed spectrogram.]
Superresolution-based SNMF
• Spectral chasms arise in the separated cluster owing to directional clustering.
• These chasms are treated as unseen observations.
• The fittest supervised bases are extrapolated into the chasms.
[Figure: separated cluster (frequency × time) with chasms, and the supervised bases.]
Superresolution-based SNMF
[Figure: frequency of source components versus direction (left, center, right), with panels for the input signal, after directional clustering, and after superresolution-based SNMF. Annotations mark the target and the extrapolated components.]
Decomposition model and cost function
• The divergence is defined at all grids except for the chasms by using the index matrix obtained from directional clustering.
• Decomposition model: the observed spectrogram is approximated by the sum of the target component, built from the fixed supervised bases, and the component of the other sources.
• Cost function: the divergence over the observed grids, plus a regularization term and a penalty term, each scaled by a weighting parameter. The terms use the binary complement of the index matrix and the Frobenius norm.
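A cost of this shape can be sketched as below. This is an illustration under stated assumptions, not the exact cost function of the method: the symbol names (`Y`, `I`, `F`, `G`, `H`, `U`, `lam`, `mu`) are hypothetical, and the specific forms chosen here for the regularization term (energy of the target component inside the chasms) and the penalty term (squared Frobenius norm of the product of the two basis matrices) are assumptions consistent with the roles named on the slide.

```python
import numpy as np

def kl_div(Y, X, eps=1e-12):
    """Entry-wise generalized KL-divergence."""
    return Y * np.log((Y + eps) / (X + eps)) - Y + X

def euc_dist(Y, X):
    """Entry-wise squared Euclidean distance."""
    return (Y - X) ** 2

def superres_cost(Y, I, F, G, H, U, lam, mu, divergence="kl"):
    """Masked divergence plus regularization and penalty terms (assumed forms)."""
    target = F @ G                 # target component from the fixed supervised bases
    model = target + H @ U         # target plus the other sources
    D = kl_div(Y, model) if divergence == "kl" else euc_dist(Y, model)
    data_term = np.sum(I * D)                             # observed grids only
    regularization = lam * np.sum((1 - I) * target ** 2)  # chasms (complement of I)
    penalty = mu * np.linalg.norm(F.T @ H, "fro") ** 2    # keeps H apart from F
    return data_term + regularization + penalty
```

Because the divergence is multiplied by the index matrix, the values of `Y` inside the chasms have no effect on the cost; there the target component is shaped only by the regularization term and by what the supervised bases learn from the observed grids.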
Update rules
• We can obtain update rules that optimize the variable matrices by minimizing this cost function.
Consideration for optimal divergence
• Separation performance of conventional SNMF: KL-divergence is better than EUC-distance.
• However, for superresolution-based SNMF, it is not fixed which of KL-divergence and EUC-distance is better.
– The optimal divergence depends on the amount of spectral chasms.
Consideration for optimal divergence
• Superresolution-based SNMF has two tasks: signal separation and basis extrapolation.
• Abilities of each divergence
                 Signal separation   Basis extrapolation
KL-divergence    Very good           Poor
EUC-distance     Good                Good
Consideration for optimal divergence
• A spectrum decomposed by NMF with KL-divergence tends to be sparser than one decomposed by NMF with EUC-distance.
• A sparse basis is not suitable for extrapolation from the observable data.
Consideration for optimal divergence
• The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms because of the trade-off between separation and extrapolation abilities.
[Figure: performance versus sparseness, from weak (anti-sparse, EUC-distance) to strong (sparse, KL-divergence), with curves for separation, extrapolation, and total performance.]
Consideration for optimal divergence
• The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms.
– If there are many chasms, the extrapolation ability is required, so EUC-distance should be used.
– If there are few or no chasms, the separation ability is required, so KL-divergence should be used.
Hybrid method for online input data
• When we consider applying the hybrid method to
online input data…
[Figure: directional clustering applies a binary mask to the observed spectrogram (frequency × time) as it arrives, giving an online binary-masked spectrogram.]
Hybrid method for online input data
• We divide the online spectrogram into blocks along the time axis and apply superresolution-based SNMF to the blocks in parallel.
Online divergence switching
• We calculate the rate of chasms in each block and compare it with a threshold value.
– When the rate is above the threshold value (there are many chasms), superresolution-based SNMF with EUC-distance is used.
– When the rate is below the threshold value (there are few chasms), superresolution-based SNMF with KL-divergence is used.
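The switching rule can be sketched as follows, assuming the rate of chasms is measured as the fraction of zero entries of the binary mask within a block. The function names, the block length `block_len`, and the value of `threshold` are hypothetical; the slides do not state them.

```python
import numpy as np

def chasm_rate(mask_block):
    """Fraction of grids in a block that the binary mask set to zero."""
    return 1.0 - float(np.mean(mask_block))

def choose_divergences(mask, block_len, threshold):
    """Pick 'euc' or 'kl' for each time block of a binary mask (freq x time)."""
    choices = []
    for start in range(0, mask.shape[1], block_len):
        block = mask[:, start:start + block_len]
        # Many chasms: extrapolation matters, so use EUC-distance.
        # Few chasms: separation matters, so use KL-divergence.
        choices.append("euc" if chasm_rate(block) > threshold else "kl")
    return choices
```

Each block is judged on its own mask only, so the choice needs no information from later blocks and can be made as the input arrives.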
Procedure of proposed method
Experimental conditions
• We used stereo-panning signals.
• Each signal is a mixture of four instruments generated by a MIDI synthesizer.
• We used the same type of MIDI sounds as the target instruments as supervision for the training process.
• Supervision sound: two-octave notes that cover all the notes of the target signal.
[Figure: panning of the four sources, numbered 1 to 4, between left, center, and right, with the target source marked.]
Experimental conditions
• We compared three methods.
– Hybrid method using only EUC-distance-based SNMF
(Conventional method 1)
– Hybrid method using only KL-divergence-based SNMF
(Conventional method 2)
– Proposed hybrid method that switches the divergence to
the optimal one (Proposed method)
• We used the signal-to-distortion ratio (SDR) as an evaluation score.
– SDR indicates the total separation accuracy, which includes both the quality of the separated target signal and the degree of separation.
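To make the score concrete, the sketch below computes a simplified SDR as the energy ratio, in dB, between the reference target signal and the total error of the estimate. This is an assumption made for illustration: the slides do not state which SDR implementation was used, and the widely used BSS Eval definition first decomposes the estimate into target, interference, and artifact components. The function name `simple_sdr` is hypothetical.

```python
import numpy as np

def simple_sdr(reference, estimate):
    """Simplified SDR in dB: reference energy over error energy."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    distortion = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(distortion ** 2))
```

For example, an estimate that equals the reference scaled by 0.9 leaves an error with 1/100 of the reference energy, which gives 20 dB.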
Experimental result
• Average SDR scores for each method over 12 combinations obtained by shuffling the four instruments.
• The proposed method outperforms the other methods.
[Figure: bar chart of average SDR [dB] (axis from 8.0 to 10.0, higher is better) for conventional method 1, conventional method 2, and the proposed method.]
Conclusions
• We propose a new divergence switching scheme for
superresolution-based SNMF.
• This method separates online input signals using the optimal divergence in NMF.
• The proposed method can be used for any spatial condition of the sources, and separates the target signal with high accuracy.
Thank you for your attention!