music retrieval
![Page 1: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/1.jpg)
Music retrieval
• Conventional music retrieval systems
– Exact queries: “Give me all songs from J.Lo’s latest album”
• What about “Give me the music that I like”?
New methods are needed: sophisticated similarity measures
• Increasing importance:
– MP3 players (10^3 songs)
– Personal music collections (10^4 songs)
– Music on demand
• many songs, huge market value…
![Page 2: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/2.jpg)
Proposal
• Try a classifier method
– Similarity measure: enables matching of fuzzy data, always returns results
• Implement relevance feedback
– User feedback: improves retrieval performance
![Page 3: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/3.jpg)
Classifier systems
• Genetic programming
• Neural networks
• Curve fitting algorithms
• Vector quantizers
![Page 4: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/4.jpg)
Tree-structured Vector Quantization
• Audio parameterization
– Feature space: MFCC coefficients
• Quantization tree
– A supervised learning algorithm, TreeQ
– Attempts to partition feature space for maximum class separation
![Page 5: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/5.jpg)
Features: MFCC coefficients
waveform → DFT → Log → Mel → IDFT → MFCCs
MFCCs: a 13-dimensional vector per window
100 Hamming windows/second
5-minute song → 30·10^3 windows
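The window count on this slide can be checked with a few lines of arithmetic. The sketch below takes the rate (100 windows/second) and the vector size (13 coefficients) from the slide; the function names and the window length used in the Hamming example are illustrative assumptions, not part of the original system.

```python
import math

# Values taken from the slide.
WINDOWS_PER_SECOND = 100
COEFFS_PER_WINDOW = 13

def mfcc_matrix_shape(duration_seconds):
    """Return (number of windows, coefficients per window) for one song."""
    n_windows = int(duration_seconds * WINDOWS_PER_SECOND)
    return n_windows, COEFFS_PER_WINDOW

def hamming_window(length):
    """Standard Hamming window: 0.54 - 0.46 * cos(2*pi*n / (length - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

n_windows, n_coeffs = mfcc_matrix_shape(5 * 60)  # a 5-minute song
print(n_windows, n_coeffs, n_windows * n_coeffs)  # 30000 13 390000

# Hypothetical window length of 5 samples, only to show the taper shape.
print([round(w, 2) for w in hamming_window(5)])
```

A 5-minute song therefore yields 30,000 vectors of 13 values each, which is the amount of data the later steps have to summarise per song.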
![Page 6: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/6.jpg)
Classifying feature space
![Page 7: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/7.jpg)
Nearest neighbor
Discrimination line in feature space
• Problems:
– Curse of dimensionality
– Distribution assumptions
– Complicated distributions
![Page 8: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/8.jpg)
Vector Quantization: Adding decision surfaces
• Each surface is added such that
– it cuts only one dimension (speed)
– the mutual information is maximized
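The criterion for a single cut can be sketched as follows. This assumes the mutual information is taken between the class label and the side of the cut a vector falls on, computed as H(class) minus the weighted entropy of the two sides; the slide does not spell out the exact formula, so treat this as one plausible reading. All names and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mutual_information(vectors, labels, dim, threshold):
    """I(class; side of the cut) for a cut on one dimension only."""
    left = [y for x, y in zip(vectors, labels) if x[dim] <= threshold]
    right = [y for x, y in zip(vectors, labels) if x[dim] > threshold]
    n = len(labels)
    conditional = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - conditional

# Toy 2-dimensional feature vectors with two classes (assumed data).
vectors = [(0.1, 5.0), (0.2, 1.0), (0.8, 4.0), (0.9, 2.0)]
labels = ["good", "good", "poor", "poor"]

print(mutual_information(vectors, labels, dim=0, threshold=0.5))  # 1.0
print(mutual_information(vectors, labels, dim=1, threshold=3.0))  # 0.0
```

In this toy case the cut on dimension 0 separates the classes perfectly (1 bit), while the cut on dimension 1 tells nothing about the class, so the first cut would be preferred.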
![Page 9: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/9.jpg)
![Page 10: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/10.jpg)
![Page 11: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/11.jpg)
![Page 12: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/12.jpg)
Until further splits are not worthwhile
– according to certain stop conditions
![Page 13: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/13.jpg)
Decision tree
• Tree partitions feature space
– L regions (cells/leaves)
– Based on class membership of training data
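A minimal sketch of how such a tree could be grown from labelled vectors: cuts are restricted to one dimension, chosen by information gain, and growth stops when a cut is no longer worthwhile. The three stop conditions used here (maximum depth, minimum gain, minimum number of points) are assumptions for illustration; the slides only say that "certain stop conditions" apply, and this is not the TreeQ implementation itself.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_cut(points, labels):
    """Return (gain, dim, threshold) of the best single-dimension cut."""
    best = (0.0, None, None)
    for dim in range(len(points[0])):
        values = sorted({p[dim] for p in points})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring values
            sides = [[y for p, y in zip(points, labels) if (p[dim] <= t) == side]
                     for side in (True, False)]
            gain = entropy(labels) - sum(len(s) / len(labels) * entropy(s)
                                         for s in sides)
            if gain > best[0]:
                best = (gain, dim, t)
    return best

def grow(points, labels, depth=0, max_depth=8, min_gain=0.01, min_points=2):
    """Recursively split until an (assumed) stop condition is met."""
    gain, dim, t = (best_cut(points, labels) if len(points) >= min_points
                    else (0.0, None, None))
    if dim is None or gain < min_gain or depth >= max_depth:
        return {"leaf": True, "classes": Counter(labels)}
    left = [(p, y) for p, y in zip(points, labels) if p[dim] <= t]
    right = [(p, y) for p, y in zip(points, labels) if p[dim] > t]
    return {"leaf": False, "dim": dim, "threshold": t,
            "left": grow(*zip(*left), depth + 1, max_depth, min_gain, min_points),
            "right": grow(*zip(*right), depth + 1, max_depth, min_gain, min_points)}

def count_leaves(node):
    """Number of regions L that the tree defines."""
    if node["leaf"]:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])

points = [(0.1,), (0.2,), (0.8,), (0.9,)]
labels = ["good", "good", "poor", "poor"]
tree = grow(points, labels)
print(count_leaves(tree))  # 2
```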
![Page 14: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/14.jpg)
Template generation
• Generate templates for
– Training data
– Test data
• Each MFCC vector is routed through the tree
![Page 15: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/15.jpg)
Template generation
• With a series of feature vectors, each vector will end up in one of the leaves.
• This results in a histogram, or template, for each series of feature vectors.
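Template generation can be sketched as routing each vector to a leaf and counting visits. The tree below is hand-built and hypothetical (L = 3 leaves over 2-dimensional vectors), and normalising the counts to sum to 1 is an assumption; the slides only state that the result is a histogram.

```python
def leaf_index(node, vector):
    """Follow the threshold tests from the root until a leaf is reached."""
    while "leaf_id" not in node:
        branch = "left" if vector[node["dim"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["leaf_id"]

def make_template(tree, vectors, n_leaves):
    """Histogram of leaf visits for a series of feature vectors."""
    counts = [0] * n_leaves
    for v in vectors:
        counts[leaf_index(tree, v)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts

# Hypothetical tree: first cut on dimension 0, second cut on dimension 1.
tree = {"dim": 0, "threshold": 0.5,
        "left": {"leaf_id": 0},
        "right": {"dim": 1, "threshold": 2.0,
                  "left": {"leaf_id": 1},
                  "right": {"leaf_id": 2}}}

song = [(0.1, 3.0), (0.7, 1.0), (0.9, 1.5), (0.8, 4.0)]
print(make_template(tree, song, n_leaves=3))  # [0.25, 0.5, 0.25]
```

However long the song is, its template has only L entries, which is what makes the later comparison between songs cheap.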
![Page 16: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/16.jpg)
Template comparison
Corpus templates (one per training class): A, B, …, n
Query template: X
Compute similarity: sim(X,A), sim(X,B), sim(X,C), …, sim(X,n)
Augmented similarity measure, e.g. DiffSim(X) = sim(X,A) - sim(X,C)
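A small sketch of the comparison step, assuming sim is the cosine similarity between two histogram templates (the results slide lists a "cos" measure, but other measures would fit the same scheme). The template values are made up for illustration.

```python
import math

def cosine_sim(x, y):
    """Cosine of the angle between two histogram templates."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def diff_sim(query, template_a, template_c):
    """DiffSim(X) = sim(X, A) - sim(X, C)."""
    return cosine_sim(query, template_a) - cosine_sim(query, template_c)

# Hypothetical templates over L = 3 leaves.
A = [0.7, 0.2, 0.1]
C = [0.1, 0.2, 0.7]
X = [0.6, 0.3, 0.1]

print(diff_sim(X, A, C) > 0)  # True: X resembles A more than C
```

Taking the difference means a query scores high only when it is close to one class and also far from the other, rather than being merely similar to everything.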
![Page 17: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/17.jpg)
Template comparison
Corpus templates (one per training class): A, B, …, n
Query templates → Compute similarity → DiffSim(X) → Sort → Result list
![Page 18: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/18.jpg)
Preliminary experiments
• Test subjects listened to 107 songs
– Rated them: good, fair, poor (class membership Cg, Cf, Cp)
• Training process:
– For each user
  • Randomly select a subset (N songs) from each class
  • Construct a tree based on class membership
  • Generate histogram templates for Cg, Cf, Cp
• For each song X
– Generate histogram template
– Compute DiffSim(X) = sim(X,Cg) - sim(X,Cp)
• Sort the list of songs according to DiffSim
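The scoring and sorting steps above might look like the sketch below. It assumes per-song leaf counts are already available, that a class template is the sum of the counts of its training songs (equivalent to routing all of the class's vectors through the tree), and that sim is cosine similarity. Song identifiers and counts are invented.

```python
import math

def cosine_sim(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def class_template(song_counts):
    """Sum the leaf counts of all training songs in one class."""
    return [sum(column) for column in zip(*song_counts)]

def rank_songs(counts_by_song, good_ids, poor_ids):
    """Sort songs by DiffSim(X) = sim(X, Cg) - sim(X, Cp), best first."""
    c_good = class_template([counts_by_song[s] for s in good_ids])
    c_poor = class_template([counts_by_song[s] for s in poor_ids])
    score = {s: cosine_sim(h, c_good) - cosine_sim(h, c_poor)
             for s, h in counts_by_song.items()}
    return sorted(score, key=score.get, reverse=True)

# Hypothetical leaf counts for four songs over L = 3 leaves.
counts_by_song = {"s1": [8, 1, 1], "s2": [1, 1, 8],
                  "s3": [6, 3, 1], "s4": [2, 2, 6]}

# N = 1 training song per class: s1 rated good, s2 rated poor.
print(rank_songs(counts_by_song, good_ids=["s1"], poor_ids=["s2"]))
# ['s1', 's3', 's4', 's2']
```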
![Page 19: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/19.jpg)
Results
N         1      3      5      7      9
random  0.236  0.234  0.246  0.240  0.234
cos     0.305  0.364  0.370  0.388  0.389
![Page 20: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/20.jpg)
Relevance feedback
classifier → result list → user → (feedback) → classifier
![Page 21: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/21.jpg)
Implementation
Adjust histogram profiles based on user feedback
• For each user
– Select the top M songs from the result list
– Add the contents of the songs to the histogram profile based on the user rating (class membership Cg, Cf, Cp)
– For each song X
  • Generate histogram template
  • Compute DiffSim(X) = sim(X,Cg) - sim(X,Cp)
– Sort the list of songs according to DiffSim
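The profile update could be sketched as below, assuming "adding the contents of the songs" means adding each rated song's leaf counts to the histogram of the class the user assigned it to. The data structures, names and values are hypothetical.

```python
def apply_feedback(profiles, counts_by_song, ranking, user_ratings, top_m):
    """Add the leaf counts of the top-M songs to the profile of their rated class."""
    updated = {cls: list(hist) for cls, hist in profiles.items()}
    for song in ranking[:top_m]:
        cls = user_ratings[song]  # "good", "fair" or "poor"
        updated[cls] = [a + b for a, b in zip(updated[cls], counts_by_song[song])]
    return updated

# Hypothetical class profiles (Cg, Cf, Cp) over L = 3 leaves.
profiles = {"good": [8, 1, 1], "fair": [0, 0, 0], "poor": [1, 1, 8]}
counts_by_song = {"s3": [6, 3, 1], "s4": [2, 2, 6]}
ranking = ["s3", "s4"]                        # current result list, best first
user_ratings = {"s3": "good", "s4": "poor"}   # feedback given by the user

new_profiles = apply_feedback(profiles, counts_by_song, ranking,
                              user_ratings, top_m=1)
print(new_profiles["good"])  # [14, 4, 2]
print(new_profiles["poor"])  # [1, 1, 8]
```

With the updated profiles, every song is scored and sorted again with DiffSim, so each round of feedback reshapes the result list.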
![Page 22: Music retrieval](https://reader036.vdocuments.mx/reader036/viewer/2022062518/5681403c550346895dababa9/html5/thumbnails/22.jpg)
Improvement
Rows: top M songs used for feedback; columns: amount of training data N
M \ N     1      3      5      7      9
1       27.68   5.88  10.22   2.52   4.74
3       40.94  19.70  23.80  17.20  27.50
5       52.15  32.14  34.08  27.99  40.59
7       62.89  43.45  43.76  36.45  52.89