Post on 20-Dec-2015
Learning to Align Polyphonic Music. Slide 1
Learning to Align Polyphonic Music
Shai Shalev-Shwartz
Hebrew University, Jerusalem
Joint work with
Yoram Singer, Google Inc.
Joseph Keshet, Hebrew University
Learning to Align Polyphonic Music. Slide 2
Motivation
Two ways of representing music:
- Symbolic representation
- Acoustic representation
Learning to Align Polyphonic Music. Slide 3
Symbolic Representation
[Figure: notes plotted with axes time vs. pitch]
symbolic representation:
- pitch
- start-time
Learning to Align Polyphonic Music. Slide 4
Acoustic Representation
acoustic signal:
Feature Extraction (e.g. Spectral Analysis)
acoustic representation:
[Figure: spectrogram; axes: Time (0 to 2) vs. Frequency (0 to 4000)]
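As a rough illustration of the spectral-analysis step, the sketch below computes a magnitude spectrogram with a short-time Fourier transform. The function name `spectrogram`, the frame length, the hop size, and the 8000 Hz sample rate are all assumptions made for this example; the slides do not say which features or parameters were actually used.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude STFT: one row per time frame, one column per frequency bin."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Real-input FFT of every frame; keep magnitudes only.
    return np.abs(np.fft.rfft(frames, axis=1))

sample_rate = 8000  # Hz; hypothetical value chosen for the example
t = np.arange(2 * sample_rate) / sample_rate   # two seconds of samples
tone = np.sin(2 * np.pi * 1000 * t)            # synthetic 1000 Hz tone

spec = spectrogram(tone)
freqs = np.fft.rfftfreq(256, d=1 / sample_rate)
peak_hz = freqs[spec.mean(axis=0).argmax()]    # strongest frequency bin
```

Each row of `spec` is one time frame, so a sequence of rows is one possible acoustic representation of the kind the slide describes.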
Learning to Align Polyphonic Music. Slide 5
The Alignment Problem Setting
[Figure: spectrogram (Time vs. Frequency) shown together with the notes (time vs. pitch)]
actual start-time:
Learning to Align Polyphonic Music. Slide 6
The Alignment Problem Setting
Goal: learn an alignment function
Input to the alignment function:
- symbolic representation
  - pitch
  - start-times
- acoustic representation
Output of the alignment function:
- actual start-times
Learning to Align Polyphonic Music. Slide 7
Previous Work
• Dynamic Programming (rule based)
  • Dannenberg 1984
  • Soulez et al. 2003
  • Orio & Schwarz 2001
• Generative Approaches
  • Raphael 1999
  • Durey & Clements 2001
  • Shalev-Shwartz et al. 2002
Learning to Align Polyphonic Music. Slide 8
Our Solution
Discriminative Learning from examples
Training Set -> Discriminative Learning Algorithm -> Alignment function
Learning to Align Polyphonic Music. Slide 9
Why Discriminative Learning?
“When solving a given problem, try to avoid a more general problem as an intermediate step”
(Vladimir Vapnik’s principle for solving problems using a restricted amount of information)
Or, if you would like to visit Barcelona, buy a ticket!
Don’t waste so much time on writing a paper for ISMIR 2004 …
Learning to Align Polyphonic Music. Slide 10
Outline of Solution
1. Define a quantitative assessment of alignments
2. Define a hypothesis class, i.e. the form of our alignment functions:
a. Map all possible alignments into vectors in an abstract vector-space
b. Find a projection in the vector-space which ranks alignments according to their quality
3. Suggest a learning algorithm
Learning to Align Polyphonic Music. Slide 11
Assessing alignments
e.g.
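The example on this slide is not preserved in the transcript. One natural assessment, assumed here for illustration and consistent with the results being reported as "Loss (ms)", is the average absolute deviation between the suggested and the actual start-times. The function name `alignment_loss` and the sample values are hypothetical.

```python
def alignment_loss(actual_ms, suggested_ms):
    """Average absolute deviation between two lists of start-times (in ms)."""
    if len(actual_ms) != len(suggested_ms):
        raise ValueError("alignments must cover the same notes")
    total = sum(abs(a - s) for a, s in zip(actual_ms, suggested_ms))
    return total / len(actual_ms)

# Hypothetical start-times for four notes, in milliseconds.
actual = [0, 500, 1000, 1500]
slightly_off = [0, 520, 1010, 1490]
grossly_off = [300, 900, 1600, 2000]

loss_slight = alignment_loss(actual, slightly_off)  # small loss
loss_gross = alignment_loss(actual, grossly_off)    # large loss
```

A loss of this kind is zero only for the correct alignment and grows as the suggested start-times drift away from it.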
Learning to Align Polyphonic Music. Slide 12
Feature Functions for Alignment
Assessing the quality of a suggested alignment
A feature function for alignment takes:
- acoustic and symbolic representation
- suggested alignment (actual start-times)
e.g. [two examples shown on the slide]
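To make the idea of a feature function concrete, here is a minimal sketch that maps an acoustic input, a symbolic input, and a suggested alignment to a two-dimensional vector. Both features (energy rise at the suggested onsets, and agreement between suggested and score inter-onset gaps) are invented for this example, as are all names and numbers; the actual features are described in the paper, not in these slides.

```python
def feature_vector(energy, score_starts, suggested_frames):
    """Map (acoustic, symbolic, suggested alignment) to a small feature vector."""
    # Feature 1: mean rise in frame energy at the suggested onsets.
    rises = [energy[f] - energy[f - 1] for f in suggested_frames if f > 0]
    onset_rise = sum(rises) / len(suggested_frames)

    # Feature 2: how well the gaps between suggested onsets match the
    # gaps in the score (0 is a perfect match, more negative is worse).
    score_gaps = [b - a for a, b in zip(score_starts, score_starts[1:])]
    sugg_gaps = [b - a for a, b in zip(suggested_frames, suggested_frames[1:])]
    gap_error = sum(abs(s - g) for s, g in zip(score_gaps, sugg_gaps))
    tempo_fit = -gap_error / len(score_gaps)

    return [onset_rise, tempo_fit]

# Hypothetical per-frame energy with clear onsets at frames 2, 5 and 8.
energy = [0, 0, 5, 5, 1, 6, 6, 2, 7, 7]
score_starts = [0, 3, 6]  # start-times from the score, in frames

phi_correct = feature_vector(energy, score_starts, [2, 5, 8])
phi_wrong = feature_vector(energy, score_starts, [1, 3, 9])
```

In this toy case the correct alignment scores higher on both features, which is the property a useful feature function should have.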
Learning to Align Polyphonic Music. Slide 13
Feature Functions for Alignment
Mapping all possible alignments into a vector space
[Figure: points in the vector space labeled "correct alignment", "slightly incorrect alignment", "grossly incorrect alignment"]
Learning to Align Polyphonic Music. Slide 14
Main Solution Principle
Find a linear projection that ranks alignments according to their quality
[Figure: projection of the points labeled "correct alignment", "slightly incorrect alignment", "grossly incorrect alignment"]
Learning to Align Polyphonic Music. Slide 15
Main Solution Principle (cont.)
An example of projection with low confidence
[Figure: projection of the points labeled "correct alignment", "slightly incorrect alignment", "grossly incorrect alignment"]
Learning to Align Polyphonic Music. Slide 16
Main Solution Principle (cont.)
An example of incorrect projection
[Figure: projection of the points labeled "correct alignment", "slightly incorrect alignment", "grossly incorrect alignment"]
Learning to Align Polyphonic Music. Slide 17
Hypothesis class
The form of our alignment functions:
- predict the alignment which attains the highest projection
- a weight vector defines the direction of projection
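The prediction rule can be sketched as an argmax over candidate alignments of the inner product between a weight vector and the feature vector. The code below enumerates every candidate by brute force, which is only feasible for toy inputs; an efficient search is assumed to be needed in practice. The names `predict_alignment` and `toy_phi`, the weights, and the data are all hypothetical.

```python
from itertools import combinations

def predict_alignment(w, phi, x, n_notes, n_frames):
    """Return the alignment y maximizing the projection w . phi(x, y)."""
    # Candidates: every strictly increasing choice of start frames.
    candidates = combinations(range(n_frames), n_notes)
    return max(
        candidates,
        key=lambda y: sum(wi * fi for wi, fi in zip(w, phi(x, y))),
    )

def toy_phi(energy, y):
    """Hypothetical two-feature map used only for this example."""
    total_rise = sum(energy[f] - energy[f - 1] for f in y if f > 0)
    span_fit = -abs((y[-1] - y[0]) - 6)  # expected span of 6 frames
    return [total_rise, span_fit]

energy = [0, 0, 5, 5, 1, 6, 6, 2, 7, 7]
w = [1.0, 0.5]  # direction of projection, chosen by hand here
best = predict_alignment(w, toy_phi, energy, n_notes=3, n_frames=len(energy))
```

Here `best` is the tuple of start frames with the highest projection; learning amounts to choosing `w` so that this winner is the correct alignment.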
Learning to Align Polyphonic Music. Slide 18
Learning algorithm
Optimization Problem:
• Given a training set:
• Find:
• a projection and
• a maximal confidence scalar
such that the data is ranked correctly:
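The formulas on this slide are not preserved in the transcript. One common margin-based way to state such a problem is sketched below; it is an assumption, not the authors' exact formulation. The notation is also assumed: training pairs $(x_i, y_i)$, feature map $\phi$, weight vector $\mathbf{w}$ (the projection), confidence scalar $C$, and alignment loss $\gamma$.

```latex
\max_{\mathbf{w},\, C} \; C
\quad \text{s.t.} \quad
\|\mathbf{w}\| \le 1,
\qquad
\mathbf{w} \cdot \phi(x_i, y_i) - \mathbf{w} \cdot \phi(x_i, y')
\;\ge\; C \, \gamma(y_i, y')
\quad \forall i,\; \forall y' \ne y_i
```

Under this reading, every incorrect alignment $y'$ must project below the correct one by a margin that grows with how wrong it is, and there is one such constraint per training example and per possible alignment.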
Learning to Align Polyphonic Music. Slide 19
Algorithmic aspects
• Iterative algorithm:
  • Works on one alignment example at a time
  • The algorithm works in polynomial time although the number of constraints is exponentially large
  • Simple to implement
• Convergence:
  • Converges to a high confidence solution
  • #iterations depends on the best attainable confidence
• Generalization:
  • The gap between test and train error decreases with the #examples. The gap is bounded above by [bound shown as a formula on the slide]
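To illustrate what an iterative, one-example-at-a-time learner can look like, here is a structured-perceptron-style sketch. It is a stand-in for illustration and not the authors' update rule. It also enumerates candidates exhaustively, whereas the slide states that the real algorithm runs in polynomial time. The feature map `toy_phi` and the training data are hypothetical.

```python
from itertools import combinations

def score(w, feats):
    return sum(wi * fi for wi, fi in zip(w, feats))

def train(examples, phi, n_features, epochs=10):
    """Perceptron-style training: fix the weights whenever a prediction is wrong."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y_true in examples:
            candidates = combinations(range(len(x)), len(y_true))
            y_hat = max(candidates, key=lambda y: score(w, phi(x, y)))
            if y_hat != y_true:
                # Move w toward the correct alignment's features and
                # away from the wrongly preferred alignment's features.
                good, bad = phi(x, y_true), phi(x, y_hat)
                w = [wi + g - b for wi, g, b in zip(w, good, bad)]
    return w

def toy_phi(energy, y):
    """Hypothetical features: energy rise at onsets, fit to a gap of 3 frames."""
    total_rise = sum(energy[f] - energy[f - 1] for f in y if f > 0)
    gap_fit = -abs((y[1] - y[0]) - 3)
    return [total_rise, gap_fit]

# Two hypothetical training examples: (per-frame energy, correct start frames).
examples = [
    ([0, 4, 4, 0, 5, 5], (1, 4)),
    ([0, 0, 3, 3, 0, 6], (2, 5)),
]
w = train(examples, toy_phi, n_features=2)
```

On this toy data a single update is enough: afterwards the correct alignment of both examples attains the highest projection, so later passes leave `w` unchanged.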
Learning to Align Polyphonic Music. Slide 20
Experimental Results
• Task: alignment of polyphonic piano music
• Dataset: 12 musical pieces where sound and MIDI were both recorded + other performances of the same pieces in MIDI format
• Features: see the paper
• Algorithms:
  • Discriminative method
  • Generative method: Generalized Hidden Markov Model (GHMM)
    • Using the same features as in the discriminative method
    • Using different numbers of Gaussians (1, 3, 5, 7)
Learning to Align Polyphonic Music. Slide 21
Experimental Results (Cont.)
Our discriminative method outperforms GHMM
[Figure: bar chart of Loss (ms), axis from 10 to 80, for GHMM-1, GHMM-3, GHMM-5, GHMM-7 and Discriminative]
Learning to Align Polyphonic Music. Slide 22
The End