CS158: Final Project
![Page 1: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/1.jpg)
Music Recommendation
Evan Casey and Erin Coughlan
![Page 2: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/2.jpg)
Problem
● Million Song Dataset Challenge (Kaggle)
  ○ 110k users, 1M+ unique songs
● Music Recommendation
  ○ Recommend songs for each user based on a larger training set of user listening histories
● Winner - 0.17910 (17.9%)
● Benchmark - 0.02079 (2.1%)
![Page 3: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/3.jpg)
Data
● Million Song Dataset
● Two subsets of 1000 users (random and most active)
● Echonest API to get metadata
![Page 4: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/4.jpg)
Echonest Data
Metadata we obtained:
● Tempo
● Danceability
● Energy
● Speech
● Acousticness
Unavailable metadata:
● Genre
● Artist popularity
● Song popularity
● Location
● Year released
![Page 5: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/5.jpg)
Previous Approaches
Dynamic K-Means:
● Kim et al. (6th Int’l Conference on ML)
● Li et al. (University of Michigan)
Item and user-based collaborative filtering:
● Niu et al. (Stanford)
● Lu et al. (Stanford)
![Page 6: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/6.jpg)
K-Means
![Page 7: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/7.jpg)
K-Means for our Problem
Step 1: K-Means from all songs listened to by all users
![Page 8: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/8.jpg)
K-Means for our Problem
Step 2: K-Means from user listening history
![Page 9: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/9.jpg)
K-Means for our Problem
Step 3: Predict based on location of user centroids
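The three steps above can be sketched in plain Python. This is a minimal illustration, not the project's code: the function names, the deterministic seeding, the toy feature vectors, and the choice to rank unheard songs by distance to the user centroid are all assumptions. With a single user centroid, running k-means on the listening history reduces to taking the mean of the user's songs, which is what the sketch does.

```python
import math

def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; seeded with the first k points (assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids

def recommend_by_centroid(song_features, history, k, top_n):
    """song_features: {song_id: feature vector}; history: songs the user played."""
    songs = list(song_features)
    # Step 1: cluster every song by its metadata features.
    centroids = kmeans([song_features[s] for s in songs], k)
    # Step 2: one user centroid, i.e. k-means with k=1 on the history.
    user_centroid = mean([song_features[s] for s in history])
    # Step 3: take unheard songs from the cluster nearest the user centroid.
    target = nearest(user_centroid, centroids)
    candidates = [s for s in songs
                  if s not in history and nearest(song_features[s], centroids) == target]
    candidates.sort(key=lambda s: math.dist(song_features[s], user_centroid))
    return candidates[:top_n]

# Toy features (made up for illustration), e.g. scaled tempo and energy.
toy = {"a": [0, 0], "b": [0.1, 0], "c": [0, 0.2],
       "x": [5, 5], "y": [5.1, 5], "z": [5, 5.3]}
print(recommend_by_centroid(toy, ["a", "b"], k=2, top_n=2))
```

A weighted variant could scale each song in the history by its play count before averaging, and a multi-centroid variant could call `kmeans` on the history with k greater than 1.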
![Page 10: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/10.jpg)
Mean Average Precision
Predicted:
Actual:
![Page 11: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/11.jpg)
Mean Average Precision
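Since the formula on this slide survives only as an image, here is a hedged sketch of one common way to compute truncated mean average precision. The function names are hypothetical, and the default cutoff of 500 and the normalization by min(number of actual songs, k) are assumptions about how the challenge scored submissions.

```python
def average_precision(predicted, actual, k=500):
    """AP@k for one user: predicted is a ranked list, actual the songs truly played."""
    actual = set(actual)
    if not actual:
        return 0.0
    hits = 0
    score = 0.0
    seen = set()
    for rank, song in enumerate(predicted[:k], start=1):
        # Count each relevant song once, at the rank where it first appears.
        if song in actual and song not in seen:
            hits += 1
            score += hits / rank  # precision at this rank
        seen.add(song)
    return score / min(len(actual), k)

def mean_average_precision(predictions, truths, k=500):
    """Average AP@k over every user in truths; missing predictions score 0."""
    users = list(truths)
    return sum(average_precision(predictions.get(u, []), truths[u], k)
               for u in users) / len(users)

# Toy example: hits at ranks 1 and 3 out of three relevant songs.
print(average_precision(["a", "b", "c", "d"], ["a", "c", "e"]))
```

In the toy call the precision is 1/1 at rank 1 and 2/3 at rank 3, so the score is (1 + 2/3) / 3.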
![Page 12: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/12.jpg)
What are the Results?

| Variant | Mean Average Precision |
| --- | --- |
| All Metadata | 0.00200326282427 |
| Weighted Centroids | 0.00375567272976 |
| Multiple Centroids (2) | 0.00364834470835 |
| Modified Metadata | 0.00994279218087 |
| All Improvements | 0.01008282844 |
| More Data | 0.00266295400221 |
![Page 13: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/13.jpg)
Number of Clusters?
![Page 14: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/14.jpg)
Collaborative Filtering
[Figure: example user-by-song matrix for users Shawn, Billy, and Paul; cell values not recoverable from the slide]
![Page 15: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/15.jpg)
Collaborative Filtering
![Page 16: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/16.jpg)
User-based Collaborative Filtering
Step 1: Obtain user history profile
![Page 17: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/17.jpg)
User-based Collaborative Filtering
Step 2: Calculate similarity between users to find their nearest neighbors
![Page 18: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/18.jpg)
User-based Collaborative Filtering
Step 3: Compute weighted average of the ratings by the neighbors and find the items with the highest average
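The three steps can be sketched as follows. This is a minimal illustration under stated assumptions: listens are treated as binary (a song is either in a user's history or not), similarity is cosine similarity over song sets, and the function names, neighbor count, and toy histories are hypothetical rather than taken from the project.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two users' song sets (binary listens)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def recommend_user_cf(histories, user, n_neighbors=2, top_n=3):
    """histories: {user_id: set of song_ids}."""
    # Step 1: the user's history profile.
    profile = histories[user]
    # Step 2: similarity to every other user, keep the nearest neighbors.
    sims = [(cosine(profile, songs), other)
            for other, songs in histories.items() if other != user]
    neighbors = sorted(sims, reverse=True)[:n_neighbors]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return []
    # Step 3: similarity-weighted average of the neighbors' (binary) ratings.
    scores = defaultdict(float)
    for sim, other in neighbors:
        for song in histories[other] - profile:
            scores[song] += sim / total
    # Highest average first; ties broken by song id for determinism.
    return sorted(scores, key=lambda s: (-scores[s], s))[:top_n]

# Toy histories (made up for illustration).
toy = {"shawn": {"s1", "s2", "s3"},
       "billy": {"s1", "s2", "s4"},
       "paul": {"s3", "s5"}}
print(recommend_user_cf(toy, "shawn"))
```

Billy shares two of Shawn's three songs and Paul only one, so Billy's unheard song outranks Paul's.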
![Page 19: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/19.jpg)
Implementation Details
MRJob
Used Amazon EMR with MRJob to parallelize the algorithm across multiple machines
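The slides do not say how the algorithm was split into map and reduce phases, so the following is only one plausible decomposition, written as plain Python generators with a tiny local driver instead of the mrjob library itself. It counts how many songs each pair of users has in common, which is the numerator of a set-based user similarity. All names here are hypothetical; in mrjob, the mapper and reducer functions would become methods on a subclass of `MRJob`.

```python
from collections import defaultdict
from itertools import combinations

def run_step(records, mapper, reducer):
    """Local stand-in for one MapReduce step: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    out = []
    for key in sorted(groups):
        out.extend(reducer(key, groups[key]))
    return out

def map_listen(record):
    """(user, song) -> song keyed by itself, so listeners meet in one reducer."""
    user, song = record
    yield song, user

def reduce_song(song, users):
    """Emit one count for every pair of users who both played this song."""
    for u, v in combinations(sorted(set(users)), 2):
        yield (u, v), 1

def map_identity(record):
    """Pass (pair, 1) records through unchanged."""
    yield record

def reduce_count(pair, ones):
    """Total number of songs the two users share."""
    yield pair, sum(ones)

def co_listen_counts(listens):
    pairs = run_step(listens, map_listen, reduce_song)
    return dict(run_step(pairs, map_identity, reduce_count))

# Toy listening log (made up for illustration).
log = [("shawn", "s1"), ("billy", "s1"),
       ("shawn", "s2"), ("billy", "s2"), ("paul", "s2")]
print(co_listen_counts(log))
```

Keying the first step by song keeps each reducer's work proportional to one song's listeners, which is what lets the pair counting spread across machines.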
![Page 20: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/20.jpg)
What are the Results?

| Variant | Mean Average Precision |
| --- | --- |
| User Collaborative Filtering (1k Users) | 0.008223545412 |
| User Collaborative Filtering (10k Users) | 0.012654713312 |
| User Collaborative Filtering (110k Users) | 0.112794360446 |
![Page 21: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/21.jpg)
Compiled Results

| Method | Mean Average Precision |
| --- | --- |
| Benchmark (1k Users) | 0.0104030562401 |
| K-means | 0.01008282844 |
| Benchmark (110k Users) | 0.02079 |
| User Collaborative Filtering | 0.112794360446 |
![Page 22: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/22.jpg)
Improvements?
● Ensemble techniques
● More metadata from Echonest (genre, artist popularity, etc.)
● MapReduce for k-means
![Page 23: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/23.jpg)
Questions?
![Page 24: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/24.jpg)
References
http://cs229.stanford.edu/proj2012/NiuYinZhang-MillionSongDatasetChallenge.pdf
http://cs229.stanford.edu/proj2012/LuXiongLiu-MusicRecommenderSystemUtilizingUsers%E2%80%99ListeningHistoryandSocialNetworkInformation.pdf
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4457263&tag=1
http://www-personal.umich.edu/~yjli/content/projectreport.pdf
![Page 25: CS158: Final Project](https://reader034.vdocuments.mx/reader034/viewer/2022051818/549f7437ac795947768b4a4f/html5/thumbnails/25.jpg)
GitHub Repo
https://github.com/erinkidd01/CS158FinalProject.git