Andrey Kuznetsov, Evgeny Pyshkin

Saint-Petersburg State Polytechnical University

Speaker: Andrey Kuznetsov

Post on 28-Oct-2019

Music Search: Purpose

Private interest: identify a piece of music heard by chance

Art interest: citation analysis

Commercial interest: customers willing to buy the recording know only the tune

Copyright issues: plagiarism

Scenarios of Music Search

«I remember the author/title/album …»

«I remember lyrics»

Text search. Not a case of melody search

Search!

Scenarios of Music Search

«I have a recorded fragment»

Audio fingerprint search. Not a case of melody search

Scenarios of Music Search

«I remember (can sing/hum/whistle/tap/strum on a virtual instrument) a tune»

The most complicated scenario

Input data inaccuracy

Human perception factors

Algorithms

Query-by-humming

Note score analysis

A case of melody search

From Melodies in Mind … (Users’ Point of View)

Extract a melody from the mind

Score sheet

Virtual instrument

Humming, whistling, etc.

Search within database

Obtain results as an ordered list

From Melodies in Mind … (Developers’ Point of View)

Create a database

Stores preprocessed samples

Extract a melody from the mind

User interface

Preprocessor algorithms

Search within database

Comparison using a dedicated algorithm

Provide results as an ordered list

From Melodies in Mind …

Create a database

Stores preprocessed samples

Extract a melody from the mind

User interface (allows input)

Score sheet

Virtual instrument

Humming, whistling, etc.

Preprocessor algorithms

Search within database

Comparison using a dedicated algorithm

Provide results as an ordered list

From Information in Mind …

Create a database

Stores preprocessed samples

Extract information from the mind

User interface (allows input)

Score sheet

Virtual instrument

Humming, whistling, etc.

Preprocessor algorithms

Search within database

Comparison using a dedicated algorithm

Provide results as an ordered list

• Instrumental ensemble

• Time signature

• Key signature

• Tempo

• Lyrics

• etc.

Search Filter (Example)

General MIDI: 16 groups, 128 instruments + drums

Likelihood of detection:

1 instrument = 1/128

1 group = 1/16

Melody: 3 notes/intervals

Likelihood of detection:

Random notes in one octave = 1/12³ = 1/1728

Intervals (±1 tone) = 1/5³ = 1/125

And it is still difficult to guess a melody from only 4 notes!
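
The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function name `chance_of_random_match` is illustrative and not from the talk, and it assumes that all options are equally likely and that successive choices are independent. Reading "±1 tone" as the five intervals -2 to +2 semitones is also an assumption, chosen because it matches the 5³ on the slide.

```python
from fractions import Fraction


def chance_of_random_match(options: int, length: int = 1) -> Fraction:
    """Probability that a uniformly random guess matches a sequence of
    `length` independent choices, each drawn from `options` possibilities."""
    return Fraction(1, options ** length)


# General MIDI filter: 128 instruments arranged in 16 groups.
print(chance_of_random_match(128))    # 1/128
print(chance_of_random_match(16))     # 1/16

# Three notes, each one of the 12 pitch classes of an octave.
print(chance_of_random_match(12, 3))  # 1/1728

# Three intervals, each one of -2, -1, 0, +1, +2 semitones (assumed reading).
print(chance_of_random_match(5, 3))   # 1/125
```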

Earth Mover’s Distance (EMD)

Earth Mover’s Distance 3D (EMD3D)

EMD. Definition
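
The formula from this slide did not survive in the transcript. For orientation, the commonly used definition of the Earth Mover's Distance between two weighted point sets, which is also the form evaluated in [1], is given below. The symbols are assumptions of this note, not taken from the slide: A has m points with weights w_i, B has n points with weights u_j, d_ij is the ground distance between point i of A and point j of B, and f_ij is the flow between them.

```latex
\mathrm{EMD}(A, B) =
  \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}\, d_{ij}}
       {\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}},
\qquad
\text{where } F = (f_{ij}) \text{ minimizes } \sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}\, d_{ij}
\text{ subject to}

f_{ij} \ge 0, \qquad
\sum_{j=1}^{n} f_{ij} \le w_i, \qquad
\sum_{i=1}^{m} f_{ij} \le u_j, \qquad
\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij} = \min\Bigl(\sum_{i=1}^{m} w_i,\; \sum_{j=1}^{n} u_j\Bigr).
```

In the melodic setting, each note is a weighted point, typically with onset time and pitch as coordinates.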

EMD. Interpretation

[1]
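
To make the interpretation concrete, here is a minimal sketch of the EMD for a deliberately simplified case: both note sets have the same number of notes and every note has weight 1. Under that assumption an optimal flow can always be chosen as a one-to-one assignment, so a brute-force search over permutations is enough for short fragments. The function name, the `(onset time, pitch)` tuple layout, and the Euclidean ground distance are assumptions of this sketch. A general implementation would need weighted notes and a transportation-problem solver.

```python
from itertools import permutations
from math import dist

Note = tuple[float, float]  # (onset time, pitch), an assumed layout


def emd_unit_weights(a: list[Note], b: list[Note]) -> float:
    """EMD between two equally sized note sets in which every note has weight 1."""
    if len(a) != len(b) or not a:
        raise ValueError("this sketch needs two non-empty sets of equal size")
    # With unit weights the cheapest flow is a one-to-one matching of notes.
    cheapest = min(
        sum(dist(p, q) for p, q in zip(a, perm))
        for perm in permutations(b)
    )
    return cheapest / len(a)  # total flow equals the number of notes


query = [(0.0, 60.0), (1.0, 62.0), (2.0, 64.0)]
transposed = [(0.0, 61.0), (1.0, 63.0), (2.0, 65.0)]

print(emd_unit_weights(query, query))       # 0.0
print(emd_unit_weights(query, transposed))  # 1.0
```

The brute-force search grows factorially with the number of notes, so it is only suitable for illustrating the idea on fragments of a few notes.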

EMD & Polyphonic Music

Beethoven’s Moonlight Sonata

Moonlight => Chopin’s Polonaise No. 4

Chopin’s Polonaise No. 4

Polonaise No. 4 => Scriabin’s Prelude No. 4, Op. 11

EMD3D

The main problem: “Same notes, different compositions”

Produces a lot of false positive results

Solution: “Differentiate notes by taking human perception factors into account”

A third coordinate is introduced

EMD3D. Perception Factors: Volume Levels

Volume levels (v)

Expectation: users usually reproduce the loudest note

N notes sound simultaneously

Rank notes according to their volume level

v ∊ [0; N), 0 = loudest

Advanced volume model
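
The basic ranking rule (v ∊ [0; N), 0 = loudest) can be sketched as follows. Using MIDI velocities as the volume level and breaking ties by input order are assumptions of this sketch, and `volume_ranks` is an illustrative name. The "advanced volume model" mentioned on the slide is not covered.

```python
def volume_ranks(velocities: list[int]) -> list[int]:
    """Return v for each note of a chord: 0 for the loudest, N - 1 for the quietest."""
    # Indices of the notes ordered from loudest to quietest (stable for ties).
    order = sorted(range(len(velocities)), key=lambda i: -velocities[i])
    ranks = [0] * len(velocities)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


# Three simultaneous notes with velocities 64, 100 and 80.
print(volume_ranks([64, 100, 80]))  # [2, 0, 1]
```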

EMD3D. Perception Factors: Pitch Repetition

Diminishing the significance of repeated notes (r)

Expectation: a repeated note doesn’t attract interest

Similar: Entropy as measure of “interestingness” [10,11]

r ∊ {0;1}, 0 = not repeated

Exceptions:

The only note repeated in the voice

The note that sounds after a rest
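
A possible reading of this rule, written as a sketch. A voice is modelled as a list of MIDI pitches with `None` for a rest. A note gets r = 1 when it repeats the pitch of the immediately preceding note. Both exceptions are interpreted here, and that interpretation is an assumption: a note that follows a rest is never marked as repeated, and a voice that consists of a single repeated pitch is left unmarked. The name `repetition_flags` is illustrative.

```python
from typing import Optional


def repetition_flags(voice: list[Optional[int]]) -> list[Optional[int]]:
    """Return r per event: 1 = repeated note, 0 = not repeated, None for a rest."""
    pitches = {p for p in voice if p is not None}
    single_pitch_voice = len(pitches) == 1  # assumed reading of the first exception
    flags: list[Optional[int]] = []
    previous: Optional[int] = None  # None at the start and right after a rest
    for pitch in voice:
        if pitch is None:
            flags.append(None)
        elif pitch == previous and not single_pitch_voice:
            flags.append(1)
        else:
            flags.append(0)
        previous = pitch
    return flags


print(repetition_flags([60, 60, 62, None, 62, 62]))  # [0, 1, 0, None, 0, 1]
print(repetition_flags([67, 67, 67]))                # [0, 0, 0]
```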

EMD3D. Perception Factors: Highest and Lowest Notes

Highest and lowest notes of a chord (m)

Expectation: the melody is often in either the highest or the lowest voice

Similar: Skyline algorithms [16]

m ∊ {0;1}, 0 = in highest/lowest voice

Sometimes the expectation is wrong
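
The flag m can be sketched for a single chord as shown below. Representing the chord as a list of MIDI pitches and the name `extreme_voice_flags` are assumptions of this sketch. Unlike a skyline algorithm, which keeps only the top note, this marks both extremes and keeps every note.

```python
def extreme_voice_flags(chord: list[int]) -> list[int]:
    """Return m per pitch: 0 if it is the highest or lowest note of the chord, else 1."""
    if not chord:
        return []
    lowest, highest = min(chord), max(chord)
    return [0 if pitch in (lowest, highest) else 1 for pitch in chord]


# C3, G3, C4, E4: only the outer notes get m = 0.
print(extreme_voice_flags([48, 55, 60, 64]))  # [0, 1, 1, 0]
```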

EMD3D. Definition

EMD. Interpretation

EMD3D. Interpretation
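
The EMD3D formula itself is not preserved in the transcript, so the following is only an illustration of the idea of a third coordinate, not the authors' definition. It assumes that the third coordinate z is a weighted sum of the factors v, r and m, that the notes of a query have z = 0, that all notes have weight 1, and that the ground distance is Euclidean in (onset time, pitch, z). All names and the weights are hypothetical.

```python
from itertools import permutations
from math import dist

Note3D = tuple[float, float, float]  # (onset time, pitch, z), an assumed layout


def perception_coordinate(v: int, r: int, m: int,
                          weights: tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Hypothetical third coordinate: a weighted sum of the three perception factors."""
    return weights[0] * v + weights[1] * r + weights[2] * m


def emd3d_unit_weights(a: list[Note3D], b: list[Note3D]) -> float:
    """Unit-weight EMD over 3D notes, found by brute force over one-to-one matchings."""
    if len(a) != len(b) or not a:
        raise ValueError("this sketch needs two non-empty sets of equal size")
    cheapest = min(
        sum(dist(p, q) for p, q in zip(a, perm))
        for perm in permutations(b)
    )
    return cheapest / len(a)


query = [(0.0, 60.0, 0.0), (1.0, 62.0, 0.0)]

# Same onsets and pitches, but different perceptual salience in the database.
z_melody = perception_coordinate(v=0, r=0, m=0)         # loudest, not repeated, outer voice
z_accompaniment = perception_coordinate(v=1, r=1, m=1)  # quieter, repeated, inner voice
melody = [(0.0, 60.0, z_melody), (1.0, 62.0, z_melody)]
accompaniment = [(0.0, 60.0, z_accompaniment), (1.0, 62.0, z_accompaniment)]

print(emd3d_unit_weights(query, melody))         # 0.0
print(emd3d_unit_weights(query, accompaniment))  # 3.0
```

With the third coordinate dropped, both fragments would be at distance 0 from the query, which is the "same notes, different compositions" problem described above.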

Generalize the Approach

Describing situations and the respective if-then-like instructions means creating a knowledge base

EMD vs. EMD3D: Example

The composition

Fragment A. Part of accompaniment

Fragment B. Melody

Fragment A: EMD = 0, EMD3D = 4.17

Fragment B: EMD = 0, EMD3D = 0.56

Statistical Significance Evaluation

Assume the Z subspace is shifted by x

p ∊ [-1;1]

Initial value: p=1 for each note that definitely belongs to a melody

Initial value: p=0 for all other notes

During utilization:

p has to be increased for those notes that users used to identify the composition.

p has to be decreased for all other notes
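
The update rule can be sketched as below. The slide does not say by how much p changes, so the fixed `step` is an assumption, as are the function name and the use of integer note identifiers. Values are clamped so that p stays within [-1; 1].

```python
def update_significance(p: dict[int, float], used: set[int],
                        step: float = 0.25) -> dict[int, float]:
    """Raise p for notes that users relied on, lower it for all others, clamp to [-1, 1]."""
    return {
        note_id: max(-1.0, min(1.0, value + (step if note_id in used else -step)))
        for note_id, value in p.items()
    }


# Initial values: notes 0 and 1 definitely belong to the melody, notes 2 and 3 do not.
p = {0: 1.0, 1: 1.0, 2: 0.0, 3: 0.0}

# Suppose a successful query matched notes 0 and 2 of this composition.
p = update_significance(p, used={0, 2})
print(p)  # {0: 1.0, 1: 0.75, 2: 0.25, 3: -0.25}
```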

Conclusions & Further Work

Use different search criteria

Instrumental ensemble

Time signature (tension)

Key signature (tonality)

Title of the music piece

Authorship

Mono or polyphonic melody patterns

Search method

Use knowledge bases to formalize human perception factors

Use algorithms that take into consideration human perception factors

Use statistical information to improve search quality

Q&A

References

[1] Rainer Typke, Frans Wiering, Remco C. Veltkamp. Evaluating the Earth Mover’s Distance for measuring symbolic melodic similarity.

EMD vs. EMD3D: Example 2

Results for query (ordered list):

Composition 15. EMD3D=0

Composition 8. EMD3D=1

Composition 3. EMD3D=2

Composition 11. EMD3D=3

Composition 17. EMD3D=4

Composition 79. EMD3D=100

Results for query (ordered list):

Composition 1. EMD=0

Composition 2. EMD=0

Composition 3. EMD=0

Composition 4. EMD=0

Composition 5. EMD=0

Composition 100. EMD=0