recommender)systems) - people | mit...
TRANSCRIPT
![Page 1: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/1.jpg)
Recommender Systems Collabora2ve Filtering and Matrix Factoriza2on
Narges Razavian
Thanks to lecture slides from Alex Smola@CMU Yahuda Koren@Yahoo labs and Bing Liu@UIC
![Page 2: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/2.jpg)
We Know What You Ought To Be Watching This Summer
![Page 3: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/3.jpg)
Amazon.com
![Page 4: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/4.jpg)
score movie user
1 21 1
5 213 1
4 345 2
4 123 2
3 768 2
5 76 3
4 45 4
1 568 5
2 342 5
2 234 5
5 76 6
4 56 6
score movie user
? 62 1
? 96 1
? 7 2
? 3 2
? 47 3
? 15 3
? 41 4
? 28 4
? 93 5
? 74 5
? 69 6
? 83 6
Training data Test data
An example
![Page 5: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/5.jpg)
Two basic approaches
• Content-‐based recommenda2ons: • The user will be recommended items based on profile informa2on or similar to the ones the user preferred in the past;
• Collabora2ve filtering (or collabora2ve recommenda2ons): • The user will be recommended items that people with similar tastes and preferences liked in the past.
• Hybrids: Combine collabora2ve and content-‐based methods.
5
![Page 6: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/6.jpg)
Road Map
• Introduc2on • Content-‐based recommenda@on
• Collabora2ve filtering based recommenda2on – K-‐nearest neighbor – Matrix factoriza2on
6
![Page 7: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/7.jpg)
7
Content-‐Based Recommenda2on
• Recommend items that matches the User Profile.
• The Profile is based on items user has liked in the past or explicit interests that he defines.
• A content-‐based recommender system matches the profile of the item to the user profile to decide on its relevancy to the user.
![Page 8: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/8.jpg)
Road Map
• Introduc2on • Content-‐based recommenda2on
• Collabora2ve filtering based recommenda2ons – K-‐nearest neighbor – Matrix factoriza2on
8
![Page 9: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/9.jpg)
Collabora2ve Filtering Idea
A 9 B 3 C : : Z 5
A B C 9 : : Z 10
A 5 B 3 C : : Z 7
A B C 8 : : Z
A 6 B 4 C : : Z
A 10 B 4 C 8 . . Z 1
User Database
Correla2on Match
A 9 B 3 C : : Z 5
A 10 B 4 C 8 . . Z 1
![Page 10: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/10.jpg)
Collabora2ve filtering
• Collabora2ve filtering (CF): most widely-‐used recommenda2on approach in prac2ce. • k-‐nearest neighbor, • matrix factoriza2on
• Key characteris2c of CF: it predicts the u2lity of items for a user based on the items previously rated by other like-‐minded users.
10
![Page 11: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/11.jpg)
k-‐Nearest Neighbor
• kNN : – u2lizes the en2re user-‐item database to generate predic2ons directly, i.e., there is no model building.
• This approach includes both • User-‐based methods • Item-‐based methods
• Two primary phases: • the neighborhood forma2on phase and • the recommenda2on phase.
11
![Page 12: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/12.jpg)
Neighborhood forma2on phase
• The similarity between the target user, u, and a neighbor, v, can be calculated using the Pearson’s correla@on coefficient:
• ru,i is the ra2ng given to item I by user u. C is the list of items rated by BOTH users, u and v
12
![Page 13: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/13.jpg)
Recommenda2on Phase
• Then we can compute the ra2ng predic2on of item i for target user u
where V is the set of k similar users(could be all users), rv,i is the ra2ng of user v given to item i,
13
![Page 14: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/14.jpg)
Issue with the user-‐based kNN CF
• Lack of scalability: • it requires the real-‐2me comparison of the target user to all user records in order to generate predic2ons.
• Any sugges2ons to improve this?
• A varia2on of this approach that remedies this problem is called item-‐based CF.
14
![Page 15: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/15.jpg)
Item-‐based CF
• The item-‐based approach works by comparing items based on their pacern of ra2ngs across users. The similarity of items i and j is computed as follows:
15
![Page 16: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/16.jpg)
Recommenda2on phase
• Ader compu2ng the similarity between items we select a set of k most similar items to the target item and generate a predicted value of user u’s ra2ng
where J is the set of k similar items
16
![Page 17: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/17.jpg)
Prac2cal Issues : Cold Start
• New user – Rate some ini2al items – Non-‐personalized recommenda2ons – Describe tastes – Demographic info.
• New Item – Non-‐CF : content analysis, metadata
![Page 18: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/18.jpg)
Road Map
• Introduc2on • Content-‐based recommenda2on
• Collabora2ve filtering based recommenda2ons – K-‐nearest neighbor – Matrix factoriza@on
18
![Page 19: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/19.jpg)
Geared towards females
Geared towards males
serious
escapist
The Princess Diaries
The Lion King
Braveheart
Lethal Weapon
Independence Day
Amadeus The Color Purple
Dumb and Dumber
Ocean’s 11
Sense and Sensibility
Gus
Dave
Latent factor models
![Page 20: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/20.jpg)
Latent factor models
45531
312445
53432142
24542
522434
42331
items
.2 -.4 .1
.5 .6 -.5
.5 .3 -.2
.3 2.1 1.1
-2 2.1 -.7
.3 .7 -1
-.9 2.4 1.4 .3 -.4 .8 -.5 -2 .5 .3 -.2 1.1
1.3 -.1 1.2 -.7 2.9 1.4 -1 .3 1.4 .5 .7 -.8
.1 -.6 .7 .8 .4 -.3 .9 2.4 1.7 .6 -.4 2.1
~
~
items
users
users
![Page 21: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/21.jpg)
Es2mate unknown ra2ngs as inner-‐products of factors:
45531
312445
53432142
24542
522434
42331
items
.2 -.4 .1
.5 .6 -.5
.5 .3 -.2
.3 2.1 1.1
-2 2.1 -.7
.3 .7 -1
-.9 2.4 1.4 .3 -.4 .8 -.5 -2 .5 .3 -.2 1.1
1.3 -.1 1.2 -.7 2.9 1.4 -1 .3 1.4 .5 .7 -.8
.1 -.6 .7 .8 .4 -.3 .9 2.4 1.7 .6 -.4 2.1
~
~
items
users
users
?
![Page 22: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/22.jpg)
Es2mate unknown ra2ngs as inner-‐products of factors:
45531
312445
53432142
24542
522434
42331
items
.2 -.4 .1
.5 .6 -.5
.5 .3 -.2
.3 2.1 1.1
-2 2.1 -.7
.3 .7 -1
-.9 2.4 1.4 .3 -.4 .8 -.5 -2 .5 .3 -.2 1.1
1.3 -.1 1.2 -.7 2.9 1.4 -1 .3 1.4 .5 .7 -.8
.1 -.6 .7 .8 .4 -.3 .9 2.4 1.7 .6 -.4 2.1
~
~
items
users
users
?
![Page 23: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/23.jpg)
Es2mate unknown ra2ngs as inner-‐products of factors:
45531
312445
53432142
24542
522434
42331
items
.2 -.4 .1
.5 .6 -.5
.5 .3 -.2
.3 2.1 1.1
-2 2.1 -.7
.3 .7 -1
-.9 2.4 1.4 .3 -.4 .8 -.5 -2 .5 .3 -.2 1.1
1.3 -.1 1.2 -.7 2.9 1.4 -1 .3 1.4 .5 .7 -.8
.1 -.6 .7 .8 .4 -.3 .9 2.4 1.7 .6 -.4 2.1
~
~
items
users
2.4
users
![Page 24: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/24.jpg)
Challenges • Similar to SVD, but less constrained:
– Factorize with missing values!
• Re-‐define objec2ve func2on:
• Can use gradient descent to deal with missing values
To avoid over-‐filng
![Page 25: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/25.jpg)
Stochas2c Gradient Descent • For each data point,
• Deriva2ves on variables (q and p) are used for update:
• Both p and q are unknown, so we have to alternate – Will converge to local op2ma
![Page 26: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/26.jpg)
Incorpora2ng bias • Some users rate movies higher than others • Some movies get hyped and get higher ra2ngs
• The new model:
• The new objec2ve func2on
• Deriva2ves:
![Page 27: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/27.jpg)
Further modeling assump2ons
• Changing preferences over 2me?
• Varying confidence levels in ra2ngs?
• Other ideas?
![Page 28: Recommender)Systems) - People | MIT CSAILpeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture25.pdf · user movie score 1 21 1 1 213 5 2 345 4 2 123 4 2 768 3 3 76 5 4 45 4 5](https://reader033.vdocuments.mx/reader033/viewer/2022051918/6009f8707ce64b552a15ff6a/html5/thumbnails/28.jpg)
Summary
• Recommenda2on based on – Content – Collabora2ve filtering
• Collabora2ve filtering – Neighborhood method – Matrix Factoriza2on
• Possible Further topics – Hybrid models of content and collabora2ve to impute missing values and deal with cold start