Collaborative Filtering Recommendation Algorithm Based on Hadoop
![Page 1: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/1.jpg)
2015/2/27
Scaling-up Item-based Collaborative Filtering Recommendation Algorithm based on Hadoop
Jing Jiang, Jie Lu, Guangquan Zhang, Guodong Long. 2011 IEEE World Congress on Services
![Page 2: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/2.jpg)
outline
✤ Collaborative Filtering
✤ scaling-up item-based CF
✤ experimentation and evaluation
![Page 3: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/3.jpg)
Collaborative Filtering
✤ Collaborative filtering (CF) techniques have achieved widespread success in E-commerce nowadays.
![Page 4: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/4.jpg)
Collaborative Filtering
✤ Collaborative filtering is a method of making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating). (Source: Wikipedia)
![Page 5: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/5.jpg)
Collaborative Filtering
1. Weight all users with respect to their similarity to the active user
2. Select a subset of users to use as a set of predictors
3. Compute a prediction from a weighted combination of selected neighbors’ ratings
![Page 6: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/6.jpg)
Simple example: rating vectors
Nathan [5, 1, 5]
Joe [5, 2, 5]
John [2, 5, 2.5]
Al [2, 2, 4]
Use cosine to compute each user's similarity to Nathan:
cos(Nathan, Joe) = 0.99
cos(Nathan, John) = 0.64
cos(Nathan, Al) = 0.91
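The cosine values on this slide can be reproduced with a few lines of Python. This is a minimal sketch; the function name `cosine` and the dictionary layout are illustrative choices, not something taken from the slides.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Rating vectors as listed on the slide.
ratings = {
    "Nathan": [5, 1, 5],
    "Joe": [5, 2, 5],
    "John": [2, 5, 2.5],
    "Al": [2, 2, 4],
}

# Similarity of every other user to the active user, Nathan.
similarities = {
    name: cosine(ratings["Nathan"], vec)
    for name, vec in ratings.items()
    if name != "Nathan"
}

for name, sim in similarities.items():
    print(f"cos(Nathan, {name}) = {sim:.4f}")
```

The results agree with the slide to within the second decimal place; the value for John comes out at about 0.6486, which the slide lists as 0.64.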
![Page 7: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/7.jpg)
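Simple example: predict Nathan's unknown rating (?) from the neighbors' ratings (Joe 4, John 3, Al 2), weighted by similarity
cos(Nathan, Joe) = 0.99
cos(Nathan, John) = 0.64
cos(Nathan, Al) = 0.91
? = (0.99*4 + 0.64*3 + 0.91*2) / (0.99 + 0.64 + 0.91) = 3.03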
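The weighted combination in step 3 can be checked numerically. The sketch below assumes, as the formula on the slide suggests, that the three neighbors rated the target item 4, 3 and 2; `weighted_prediction` is a hypothetical helper name.

```python
def weighted_prediction(neighbors):
    """Similarity-weighted average of neighbor ratings.

    neighbors: list of (similarity, rating) pairs.
    """
    numerator = sum(sim * rating for sim, rating in neighbors)
    denominator = sum(sim for sim, _ in neighbors)
    if denominator == 0:
        raise ValueError("similarities sum to zero; no prediction possible")
    return numerator / denominator

# (similarity to Nathan, neighbor's rating), values taken from the slide.
neighbors = [(0.99, 4), (0.64, 3), (0.91, 2)]

prediction = weighted_prediction(neighbors)
print(round(prediction, 2))  # 3.03
```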
![Page 8: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/8.jpg)
Collaborative Filtering
✤ User-Based CF: compute similarity based on users
✤ Item-Based CF: compute similarity based on items
![Page 9: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/9.jpg)
Collaborative Filtering
✤ User-Based CF: compute similarity based on users
To predict user A's rating for item4, given that user B rated item4 as 5 and user F rated item4 as 1:
rating(user A, item4) = (5 * similarity(user A, user B) + 1 * similarity(user A, user F)) / (similarity(user A, user B) + similarity(user A, user F))
![Page 10: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/10.jpg)
Collaborative Filtering
✤ Item-Based CF: compute similarity based on items
To predict user A's rating for item4, given that user A rated item2 as 1 and item3 as 1:
rating(user A, item4) = (1 * similarity(item2, item4) + 1 * similarity(item3, item4)) / (similarity(item2, item4) + similarity(item3, item4))
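The item-based prediction can be sketched in the same way, this time weighting the target user's own ratings by item-to-item similarity. The similarity values below are hypothetical placeholders (the slides do not give them), and `predict_item_based` is an illustrative name.

```python
def predict_item_based(user_ratings, item_sims, target_item):
    """Predict a rating for target_item from the user's own ratings,
    weighted by each rated item's similarity to target_item."""
    numerator = 0.0
    denominator = 0.0
    for item, rating in user_ratings.items():
        sim = item_sims.get(frozenset((item, target_item)))
        if sim is None:
            continue  # no similarity known for this pair
        numerator += rating * sim
        denominator += sim
    if denominator == 0:
        raise ValueError("no usable similarities for the target item")
    return numerator / denominator

# User A's known ratings, as on the slide.
user_a = {"item2": 1, "item3": 1}

# Hypothetical item-item similarities, for illustration only.
item_sims = {
    frozenset(("item2", "item4")): 0.8,
    frozenset(("item3", "item4")): 0.5,
}

prediction = predict_item_based(user_a, item_sims, "item4")
print(prediction)
```

Because both of user A's ratings are 1, the weighted average is 1 whatever the similarity values are.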
![Page 11: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/11.jpg)
scaling-up item-based CF
Divide the CF algorithm into two steps:
1. Similarity computation
2. Prediction and recommendation
Similarity measure: Pearson correlation, with values ranging from -1 to 1
![Page 12: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/12.jpg)
scaling-up item-based CF
Pearson correlation (range -1 to 1): the covariance of the two items' ratings, normalized by their standard deviations
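For reference, the standard form of the Pearson correlation between two items $i$ and $j$ is given below. The notation is an assumption made here for illustration: $U_{ij}$ is the set of users who rated both items, $R_{u,i}$ is user $u$'s rating of item $i$, and $\bar{R}_i$ is the average rating of item $i$. It may differ in detail from the formula shown on the original slide.

$$
\mathrm{sim}(i,j) = \frac{\sum_{u \in U_{ij}} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}{\sqrt{\sum_{u \in U_{ij}} (R_{u,i} - \bar{R}_i)^2}\,\sqrt{\sum_{u \in U_{ij}} (R_{u,j} - \bar{R}_j)^2}}
$$

The numerator is the covariance-like term, and the denominator rescales it so that the result lies between -1 and 1.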
![Page 13: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/13.jpg)
scaling-up item-based CF
Similarity computation
| user (u) | apple (i) | milk | toast (j) |
|---|---|---|---|
| sam | 2 | 0 | 4 |
| john | 5 | 5 | 3 |
| tim | 2 | 4 | ? |

Ri = (2+5+2)/3 = 3, Rj = (4+3)/2 = 3.5
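As a rough check of the similarity step, the sketch below computes the Pearson correlation between apple and toast from the ratings on this slide. It assumes the item averages are taken over every user who rated the item (as in Ri and Rj above), that the sums run over users who rated both items, and that sam's 0 for milk counts as a rating; the function names are illustrative.

```python
import math

# Ratings from the slide; None marks the unknown rating (tim, toast).
ratings = {
    "sam": {"apple": 2, "milk": 0, "toast": 4},
    "john": {"apple": 5, "milk": 5, "toast": 3},
    "tim": {"apple": 2, "milk": 4, "toast": None},
}

def item_mean(item):
    """Average rating of an item over all users who rated it."""
    values = [r[item] for r in ratings.values() if r[item] is not None]
    return sum(values) / len(values)

def pearson(i, j):
    """Pearson correlation between items i and j over co-rating users."""
    mean_i, mean_j = item_mean(i), item_mean(j)
    co_rated = [
        (r[i], r[j])
        for r in ratings.values()
        if r[i] is not None and r[j] is not None
    ]
    numerator = sum((a - mean_i) * (b - mean_j) for a, b in co_rated)
    denominator = math.sqrt(sum((a - mean_i) ** 2 for a, _ in co_rated)) * \
        math.sqrt(sum((b - mean_j) ** 2 for _, b in co_rated))
    return numerator / denominator if denominator else 0.0

sim_apple_toast = pearson("apple", "toast")
print(item_mean("apple"), item_mean("toast"))
print(round(sim_apple_toast, 4))
```

With these assumptions the two items come out strongly negatively correlated (about -0.95), since sam and john rate them in opposite directions relative to the item averages.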
![Page 14: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/14.jpg)
scaling-up item-based CF
Similarity computation
| user (u) | apple (j) | milk | toast (i) |
|---|---|---|---|
| sam | 2 | 0 | 4 |
| john | 5 | 5 | 3 |
| tim | 2 | 4 | ? |

Ru(sam) = (2+0+4)/3 = 2
Rj = (2+5+2)/3 = 3, Ri = (4+3)/2 = 3.5
![Page 15: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/15.jpg)
scaling-up item-based CF
The three parts of intensive computation are:
(1) computing the average rating for each item
(2) computing the similarity between item pairs
(3) computing predicted items for the target user
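Part (1), the per-item average, maps naturally onto a map/shuffle/reduce pattern. The sketch below simulates that pattern in plain Python rather than using the Hadoop API, so it should be read as an illustration of the idea and not as the authors' job layout. The records reuse the apple/milk/toast ratings from the earlier slide; the function names are hypothetical.

```python
from collections import defaultdict

# (user, item, rating) records; the unknown rating (tim, toast) is omitted.
records = [
    ("sam", "apple", 2), ("sam", "milk", 0), ("sam", "toast", 4),
    ("john", "apple", 5), ("john", "milk", 5), ("john", "toast", 3),
    ("tim", "apple", 2), ("tim", "milk", 4),
]

def map_phase(records):
    """Emit (item, rating) so that ratings are grouped by item."""
    for _user, item, rating in records:
        yield item, rating

def shuffle(pairs):
    """Group values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Average the ratings collected for each item."""
    return {item: sum(vals) / len(vals) for item, vals in groups.items()}

item_means = reduce_phase(shuffle(map_phase(records)))
print(item_means)
```

Keying the map output by item means each reducer only needs the ratings of one item, which is what lets this step be spread across nodes.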
![Page 16: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/16.jpg)
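[Figure: MapReduce data flow in three numbered stages (1, 2, 3); the input records are ratings of item i by user j, and the map phase is keyed by item i]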
![Page 17: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/17.jpg)
Stage 1: [formula], defined over the set of users who rated both item k and item l
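One way to obtain the set of users who rated both item k and item l is to group ratings by user, emit every pair of items that user rated, and then collect the users per item pair. The sketch below simulates this in plain Python; it is an assumed illustration, not the authors' Hadoop code, and the records and function names are hypothetical (the ratings reuse the apple/milk/toast example).

```python
from collections import defaultdict
from itertools import combinations

# (user, item, rating) records; the unknown rating (tim, toast) is omitted.
records = [
    ("sam", "apple", 2), ("sam", "milk", 0), ("sam", "toast", 4),
    ("john", "apple", 5), ("john", "milk", 5), ("john", "toast", 3),
    ("tim", "apple", 2), ("tim", "milk", 4),
]

def map_by_user(records):
    """For each user, emit ((item_k, item_l), user) for every rated pair."""
    items_by_user = defaultdict(list)
    for user, item, _rating in records:
        items_by_user[user].append(item)
    for user, items in items_by_user.items():
        # Sort so that each pair has one canonical key.
        for k, l in combinations(sorted(items), 2):
            yield (k, l), user

def reduce_by_pair(pairs):
    """Collect the set of co-rating users for each item pair."""
    co_raters = defaultdict(set)
    for pair, user in pairs:
        co_raters[pair].add(user)
    return dict(co_raters)

co_raters = reduce_by_pair(map_by_user(records))
for pair, users in sorted(co_raters.items()):
    print(pair, sorted(users))
```

Here tim contributes only to the (apple, milk) pair, because tim has no rating for toast.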
![Page 18: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/18.jpg)
Stage 2: similarity
Stage 3: map by user j
![Page 19: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/19.jpg)
experimentation and evaluation
3 nodes
Each node has an Intel P4 CPU, 1 GB RAM, and an 80 GB disk.
All the machines were connected with one 100 Mbps switch.
![Page 20: Collaborative Filtering Recommendation Algorithm based on Hadoop](https://reader030.vdocuments.mx/reader030/viewer/2022032422/55a931a71a28ab35368b45e4/html5/thumbnails/20.jpg)
experimentation and evaluation