Andrey Kuznetsov
Evgeny Pyshkin
Saint-Petersburg State Polytechnical University
Speaker: Andrey Kuznetsov
Music Search: purpose
Private interest: identify a piece of music heard occasionally
Art interest: citation analysis
Commercial interest: customers willing to buy the recording only know the tune
Copyright issues: plagiarism
Scenarios of Music Search «I remember the author/title/album …»
«I remember lyrics»
Text search
Not a case of melody search
Search!
Scenarios of Music Search «I have a recorded fragment»
Audio fingerprint search
Not a case of melody search
Scenarios of Music Search «I remember (can sing/hum/whistle/tap/strum on a
virtual instrument) a tune»
The most complicated scenario
Input data inaccuracy
Human perception factors
Algorithms
Query-by-humming
Note score analyzing
A case of melody search
From Melodies in Mind … (Users’ Point of View) Extract a melody from the mind
Scores sheet
Virtual instrument
Humming, whistling, etc.
Search within database
Obtain results in form of ordered list
From Melodies in Mind … (Developers’ Point of View) Create a database
Stores preprocessed samples
Extract a melody from the mind
User interface
Preprocessor algorithms
Search within database
Comparison with special algorithm
Provide results in form of ordered list
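The search step above (compare the query with each stored sample, then return an ordered list) can be sketched roughly as follows. This is only an illustration: `rank_compositions`, `toy_distance` and the sample database are hypothetical names and data, and `toy_distance` is a simple placeholder, not the comparison algorithm used in the talk.

```python
from typing import Callable, Dict, List, Sequence, Tuple

def toy_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """Placeholder distance (an assumption): summed pitch differences
    plus a penalty of one octave per missing or extra note."""
    pitch_cost = sum(abs(x - y) for x, y in zip(a, b))
    length_cost = 12 * abs(len(a) - len(b))
    return float(pitch_cost + length_cost)

def rank_compositions(
    query: Sequence[int],
    database: Dict[str, Sequence[int]],
    distance: Callable[[Sequence[int], Sequence[int]], float],
) -> List[Tuple[str, float]]:
    """Score every stored melody against the query, closest first."""
    scored = [(title, distance(query, melody)) for title, melody in database.items()]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical preprocessed samples stored as MIDI pitch sequences.
database = {"A": [60, 62, 64], "B": [60, 62, 65], "C": [67, 65, 64]}
ranking = rank_compositions([60, 62, 64], database, toy_distance)
```

Passing the distance as a parameter keeps the ranking code unchanged when the placeholder is swapped for a different comparison algorithm.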
From Melodies in Mind … Create a database
Stores preprocessed samples
Extract a melody from the mind User interface (allows input)
Scores sheet
Virtual instrument
Humming, whistling, etc.
Preprocessor algorithms
Search within database Comparison with special algorithm
Provide results in form of ordered list
From Information in Mind … Create a database
Stores preprocessed samples
Extract information from the mind User interface (allows input)
Scores sheet
Virtual instrument
Humming, whistling, etc.
Preprocessor algorithms
Search within database Comparison with special algorithm
Provide results in form of ordered list
• Instrumental ensemble
• Time signature
• Key signature
• Tempo
• Lyric
• etc.
Search Filter (Example) General MIDI 16 groups, 128 instruments + drums
Likelihood of detection: 1 instrument = 1/128
1 group = 1/16
Melody 3 notes/intervals
Likelihood of detection: Random notes in one octave = 1/(12^3) = 1/1728
Intervals (±1 tone) = 1/(5^3) = 1/125
And it’s still difficult to guess a melody by only 4 notes!
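The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes uniformly random choices; it also assumes that "±1 tone" means five possible semitone steps (from -2 to +2), which is consistent with the 5^3 on the slide. The function name is illustrative.

```python
from fractions import Fraction

def random_match_probability(choices_per_step: int, steps: int) -> Fraction:
    """Chance that a uniformly random sequence hits one fixed pattern."""
    return Fraction(1, choices_per_step ** steps)

# Filter by instrumentation (General MIDI: 16 groups, 128 instruments).
one_instrument = Fraction(1, 128)
one_group = Fraction(1, 16)

# Filter by a three-element melody fragment.
notes_in_octave = random_match_probability(12, 3)   # 12 semitones per octave
intervals = random_match_probability(5, 3)          # assumed steps: -2..+2 semitones

print(one_instrument, one_group, notes_in_octave, intervals)
```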
EMD3D The main problem is: “Same notes –
Different compositions”
Produces a lot of false positive results
Solution is: “Differentiate notes taking into account human perception factors”
Third coordinate is introduced
EMD3D. Perception Factors Volume Levels Volume levels (v)
Expectation: users usually reproduce loudest note
N notes sounding simultaneously
Rank notes according to their volume level
v ∊ [0; N), 0 = loudest
Advanced volume model
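A minimal sketch of the volume ranking described above, assuming notes are given as MIDI pitch and velocity, with velocity standing in for volume. The `Note` type and the tie-breaking rule (the earlier note gets the lower rank) are assumptions, not details from the slides.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Note:
    pitch: int      # MIDI pitch number
    velocity: int   # MIDI velocity, used here as a stand-in for volume

def volume_ranks(chord: List[Note]) -> List[int]:
    """Return v for each note of the chord: 0 = loudest, N-1 = quietest."""
    # sorted() is stable, so equally loud notes keep their original order.
    order = sorted(range(len(chord)), key=lambda i: -chord[i].velocity)
    ranks = [0] * len(chord)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks

chord = [Note(60, 70), Note(64, 100), Note(67, 85)]
ranks = volume_ranks(chord)
```

For this chord the middle note (velocity 100) is the loudest and receives v = 0.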
EMD3D. Perception Factors Pitch Repetition Diminishing the significance of repeated notes (r)
Expectation: a repeated note doesn’t attract interest
Similar: Entropy as measure of “interestingness” [10,11]
r ∊ {0;1}, 0 = not repeated
Exceptions:
The only note repeated in the voice
The note that sounds after a rest
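The repetition flag and its two exceptions could be computed along these lines. The sketch rests on two interpretations that the slides do not spell out: a note counts as repeated when it has the same pitch as the immediately preceding note, and "the only note repeated in the voice" is read as a voice that contains a single distinct pitch. Rests are modelled as `None`.

```python
from typing import List, Optional

def repetition_flags(voice: List[Optional[int]]) -> List[Optional[int]]:
    """Return r per event: 0 = not repeated, 1 = repeated, None for rests."""
    distinct_pitches = {p for p in voice if p is not None}
    single_pitch_voice = len(distinct_pitches) == 1  # exception 1 (assumed reading)
    flags: List[Optional[int]] = []
    previous: Optional[int] = None  # None at the start and right after a rest
    for event in voice:
        if event is None:
            flags.append(None)
        elif single_pitch_voice or previous is None or event != previous:
            # previous is None covers exception 2: a note sounding after a rest
            flags.append(0)
        else:
            flags.append(1)
        previous = event
    return flags

flags = repetition_flags([60, 60, None, 60, 62, 62])
```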
EMD3D. Perception Factors Highest and Lowest Notes Highest and lowest notes of a concord (m)
Expectation: Often the melody is either in the highest or in the lowest voice
Similar: Skyline algorithms [16]
m ∊ {0;1}, 0 = in highest/lowest voice
Sometimes expectation is wrong
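A rough sketch of the m coordinate for one chord, assuming the chord is given as a list of MIDI pitches and that the highest and lowest voices are simply the maximum and minimum pitch. The function name is illustrative.

```python
from typing import List

def extreme_voice_flags(chord_pitches: List[int]) -> List[int]:
    """Return m per note: 0 = highest or lowest note of the chord, 1 = inner note."""
    if not chord_pitches:
        return []
    highest, lowest = max(chord_pitches), min(chord_pitches)
    return [0 if pitch in (highest, lowest) else 1 for pitch in chord_pitches]

m_flags = extreme_voice_flags([48, 60, 64, 67])
```

Like the skyline heuristic it resembles, this marks inner notes as less likely melody candidates, so it will be wrong whenever the melody sits in an inner voice.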
Generalize the Approach Situations and their respective if-then-like instructions
amount to creating a knowledge base
Statistical Significance Evaluation Assume Z subspace is shifted by x
p ∊ [-1;1]
Initial value: p=1 for each note that definitely belongs to a melody
Initial value: p=0 for all other notes
During utilization:
p has to be increased for those notes that users used to identify the composition.
p has to be decreased for all other notes
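The update rule above can be sketched as follows. The slides do not say by how much p changes, so the fixed `step` parameter is an assumption, as are the function names; the result is clamped to keep p within [-1; 1].

```python
from typing import Dict, Iterable, Set

def initial_significance(note_ids: Iterable[str], melody_ids: Set[str]) -> Dict[str, float]:
    """p = 1 for notes known to belong to the melody, p = 0 for all others."""
    return {n: (1.0 if n in melody_ids else 0.0) for n in note_ids}

def update_significance(
    p: Dict[str, float], used_ids: Set[str], step: float = 0.1
) -> Dict[str, float]:
    """Raise p for notes a user relied on, lower it for the rest, clamp to [-1, 1]."""
    updated = {}
    for note_id, value in p.items():
        delta = step if note_id in used_ids else -step  # step size is an assumption
        updated[note_id] = max(-1.0, min(1.0, value + delta))
    return updated

p0 = initial_significance(["a", "b", "c"], {"a"})
p1 = update_significance(p0, used_ids={"b"}, step=0.25)
```

Here a user identified the composition by note "b", so its p rises from 0 to 0.25 while the other two notes each drop by 0.25.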
Conclusions & Further Work Use different search criteria
Instrumental ensemble
Time signature (tension)
Key signature (tonality)
Title of the music piece
Authorship
Mono or polyphonic melody patterns
Search method
Use knowledge bases to formalize human perception factors
Use algorithms that take human perception factors into consideration
Use statistical information to improve search quality
References [1] Rainer Typke, Frans Wiering, Remco C. Veltkamp. Evaluating the Earth Mover’s Distance for measuring symbolic melodic similarity.
EMD vs. EMD3D Example 2
Results for query (ordered list):
Composition 15. EMD3D=0
Composition 8. EMD3D=1
Composition 3. EMD3D=2
Composition 11. EMD3D=3
Composition 17. EMD3D=4
…
Composition 79. EMD3D=100
Results for query (ordered list):
Composition 1. EMD=0
Composition 2. EMD=0
Composition 3. EMD=0
Composition 4. EMD=0
Composition 5. EMD=0
…
Composition 100. EMD=0