MACHINE LEARNING TECHNIQUES FOR MUSIC PREDICTION S. Grant Lowe Advisor: Prof. Nick Webb.
Post on 27-Dec-2015
Predicting Popular Music using data mining and machine learning
Machine Learning Techniques for Music Prediction S. Grant LoweAdvisor: Prof. Nick WebbResearch questionsCan we predict the year in which a song was released?Can we predict the genre of a song?Can we identify which attributes are the strongest in answering these questions?BackgroundHit Song ScienceGenre ClassificationYear PredictionapproachUse WEKAUse the Million Song DatasetWEKAMachine Learning SoftwareContains Visualization tools and algorithms for data analysis and modelingdataMillion Song Data Set: commercial tracks from 1922-2011,collected by LabROSA
REPLACE THIS7Early challengesData in the wrong Format: HDF5 vs CSVLots of missing Data!Almost half of the songs are missing year, a very important attributeMany attributes are being ignored because a majority of the songs are missing data.ArtistID -> Year?
AttributesThe MSD contains 53 descriptive attributes for each song, along with 90 timbre attributes. Attributes were removed if they were not good indicators of release year or genre, or if they were too closely tied to what was being classified.
Attribute MotivationRanked Descriptive Attributes Loudness (measured in decibels) Duration (in seconds) Tempo (estimated tempo in BPM) Time Signature (estimated beats per bar) Key Mode (major or minor)
Timbre is the quality of a musical note or sound that distinguishes different types of musical instruments, or voices. It is a complex notion also referred to as sound color, texture, or tone quality, and is derived from the shape of a segments spectro-temporal surface, independently of pitch and loudness.11Early Results Descriptive AttributesDiscretized into 6 decades; 1960-1970, 1970-1980, etc.Baseline (Chance selection): 16.67%First Tests: 6-9% correctly classifiedMore recent Tests: 25-30%Why Random Forest and BayesNet?Early Results
Random forestsare anensemble learningmethod forclassification,regressionand other tasks, that operate by constructing a multitude ofdecision treesat training time and outputting the class that is themodeof the classes (classification) or mean prediction (regression) of the individual trees. Random forests correct for decision trees' habit ofoverfittingto their training set.
14Genre predictionGenres:Classic pop and rockClassicalDance and ElectronicaFolkHip-HopJazzMetalPopRock and IndieSoul and ReggaeGenre Prediction Results
16Conclusions & Future WorkTimbre Attributes are better than Descriptive Attributes Why?Taste ProfileLyrical/Emotional ContentTag Dataset
Timbre - Recording QualityDescriptive Attributes Weaknesses no popular key for a decade, etc17Questions?