Scalable Collaborative Filtering for Commerce Recommendation
DESCRIPTION
The slides for the DataScience.SG Meetup (23 Sep 2014). An introduction to a scalable collaborative filtering method that we built on Hadoop MapReduce for a commerce recommendation application.
TRANSCRIPT
![Page 1: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/1.jpg)
Scalable Collaborative Filtering for Commerce Recommendation
Yiqun Hu & Yew Yap Goh {yiqhu, ygoh}@paypal.com
September, 2014
![Page 2: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/2.jpg)
Agenda
• Our Problem: Commerce Recommendation
• Collaborative Filtering (CF) 101
• Mahout’s Solution and Its Problems
• Our Solution
• Summary & Take Away
![Page 3: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/3.jpg)
Commerce Recommendation
![Page 4: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/4.jpg)
Which One To Choose?
We Accept
![Page 5: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/5.jpg)
Which One To Choose?
We Accept
![Page 6: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/6.jpg)
Which One To Choose?
We Accept
![Page 7: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/7.jpg)
A Win-Win Solution
Recommendation Engine
• Consumer: Make good use of my money
• Merchant: Grow my business
![Page 8: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/8.jpg)
CF Input - Interaction Matrix
Consumer
Merchant
Commerce Interaction Matrix
![Page 9: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/9.jpg)
CF Input - Implicit Feedback
Binary Likeness Matrix
Confidence Matrix
Consumer
Merchant
Interaction Matrix
Commerce Interaction Matrix
* Yifan Hu, Yehuda Koren and Chris Volinsky, Collaborative Filtering for Implicit Feedback Datasets, ICDM 2008
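A minimal sketch of how the interaction matrix might be turned into the two implicit-feedback inputs named on this slide, following the cited paper: a binary likeness matrix (1 wherever any interaction exists) and a confidence matrix of the form 1 + alpha * r. The function name, the toy matrix and the value of `alpha` are assumptions for illustration, not values from the talk.

```python
import numpy as np

def implicit_feedback(interactions, alpha=40.0):
    """Return (likeness, confidence) matrices for a consumer x merchant count matrix."""
    r = np.asarray(interactions, dtype=float)
    likeness = (r > 0).astype(float)   # 1 if the consumer ever transacted with the merchant
    confidence = 1.0 + alpha * r       # more transactions -> higher confidence
    return likeness, confidence

# Hypothetical consumer x merchant transaction counts.
R = np.array([[0, 3, 0],
              [1, 0, 2]])
P, C = implicit_feedback(R, alpha=40.0)
```

Entries with no interaction keep a confidence of 1 rather than 0, so they still act as weak negative evidence during fitting.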
![Page 10: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/10.jpg)
CF Modeling - Matrix Factorization
• U - the models of every consumer
• V - the models of every merchant
• Find the optimal U/V via optimization
[Figure: matrix factorization diagram, axes labeled Consumer and Merchant]
![Page 11: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/11.jpg)
Alternating Least Squares (ALS)
• Objective terms: Data Fitting, Regularization
• Iteratively update: fix V and update U, then fix U and update V
* Yifan Hu, Yehuda Koren and Chris Volinsky, Collaborative Filtering for Implicit Feedback Datasets, ICDM 2008
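The slide's equations did not survive extraction. For reference, the objective and the closed-form alternating updates from the cited paper can be written as below. The notation is an assumption mapped onto this deck's symbols: u_i and v_j are the rows of U and V for consumer i and merchant j, p_ij is the binary likeness, c_ij the confidence, C^i and C^j the diagonal confidence matrices for consumer i and merchant j, and λ the regularization weight.

```latex
\min_{U,V}\;
\underbrace{\sum_{i,j} c_{ij}\,\bigl(p_{ij} - u_i^{\top} v_j\bigr)^2}_{\text{data fitting}}
\;+\;
\underbrace{\lambda\Bigl(\sum_i \lVert u_i \rVert^2 + \sum_j \lVert v_j \rVert^2\Bigr)}_{\text{regularization}}

% Fix V and update U:
u_i = \bigl(V^{\top} C^{i} V + \lambda I\bigr)^{-1} V^{\top} C^{i}\, p_i

% Fix U and update V:
v_j = \bigl(U^{\top} C^{j} U + \lambda I\bigr)^{-1} U^{\top} C^{j}\, p_j
```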
![Page 12: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/12.jpg)
Scalable ALS
Constant in current iteration
Only need to consider the nonzero entries (annotated on two terms)
* Yifan Hu, Yehuda Koren and Chris Volinsky, Collaborative Filtering for Implicit Feedback Datasets, ICDM 2008
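The annotations above refer to the rewriting V^T C V = V^T V + V^T (C - I) V from the cited paper: V^T V is the same for every consumer within an iteration, and the remaining terms only touch merchants with nonzero interactions. A minimal sketch of one consumer update under that rewriting follows; the function and variable names, the random factors and the confidence values are assumptions for illustration.

```python
import numpy as np

def update_user(V, VtV, item_ids, confidences, lam):
    """Solve one consumer's factor vector using only the merchants they interacted with."""
    f = V.shape[1]
    Vu = V[item_ids]                              # factors of the nonzero merchants only
    cu = np.asarray(confidences, dtype=float)     # confidence values for those merchants
    # V^T C V = V^T V + Vu^T diag(c - 1) Vu; V^T V is constant in this iteration
    A = VtV + (Vu * (cu - 1.0)[:, None]).T @ Vu + lam * np.eye(f)
    b = Vu.T @ cu                                 # V^T C p, since p = 1 only on nonzero entries
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
V = rng.normal(size=(6, 3))                       # hypothetical merchant factors
VtV = V.T @ V                                     # computed once per iteration
u_fast = update_user(V, VtV, item_ids=[1, 4], confidences=[41.0, 81.0], lam=0.1)

# Naive dense computation over all merchants, for comparison only.
c = np.ones(6)
c[[1, 4]] = [41.0, 81.0]
p = np.zeros(6)
p[[1, 4]] = 1.0
u_naive = np.linalg.solve(V.T @ np.diag(c) @ V + 0.1 * np.eye(3), V.T @ (c * p))
```

Both paths give the same vector; the fast path costs time proportional to the number of nonzero entries for that consumer rather than the total number of merchants.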
![Page 13: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/13.jpg)
Open Source Technologies
![Page 14: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/14.jpg)
Open Source Technologies
![Page 15: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/15.jpg)
Open Source Technologies
![Page 16: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/16.jpg)
Implementation in Mahout
• Aggregate user ratings: (1534,2323,2) (1534,1128,3) (1534,5678,1) … → (1534, {1128:3, 2323:2, 5678:1, …})
• Initialize item models: initialize matrix V using the average ratings
• Update user models
• Update item models
Run K iterations
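The aggregation step above can be sketched in plain Python as follows. This is only an illustration of the data transformation shown on the slide, not Mahout's actual code; the function name is hypothetical.

```python
from collections import defaultdict

def aggregate_user_ratings(triples):
    """Group (user, item, rating) triples into one {item: rating} dict per user."""
    per_user = defaultdict(dict)
    for user, item, rating in triples:
        per_user[user][item] = rating
    return dict(per_user)

# The three example triples from the slide.
triples = [(1534, 2323, 2), (1534, 1128, 3), (1534, 5678, 1)]
ratings = aggregate_user_ratings(triples)
```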
![Page 17: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/17.jpg)
How To Parallelize ?
[Figure: Input Transaction Matrix; Worker 1, Worker 2, Worker 3, …, Worker K-1, Worker K]
![Page 18: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/18.jpg)
How To Parallelize ?
[Figure: Input Transaction Matrix; Worker 1, Worker 2, Worker 3, …, Worker K-1, Worker K]
![Page 19: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/19.jpg)
How To Parallelize ?
[Figure: Input Transaction Matrix; Worker 1, Worker 2, Worker 3, …, Worker K-1, Worker K]
![Page 20: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/20.jpg)
For Every Worker …
Worker 1
…
Compute
Solve Least Squares Problem
![Page 21: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/21.jpg)
Simple Illustration
Consumer
Merchant
Commerce Interaction Matrix
Worker 1
Worker 2
Worker 3
Worker 4
![Page 22: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/22.jpg)
Simple Illustration
Consumer
Merchant
Commerce Interaction Matrix
Worker 1
Worker 2
Worker 3
Worker 4
![Page 23: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/23.jpg)
Simple Illustration
Consumer
Merchant
Commerce Interaction Matrix
Worker 1
Worker 2
Worker 3
Worker 4
![Page 24: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/24.jpg)
Simple Illustration
Consumer
Merchant
Commerce Interaction Matrix
Worker 1
Worker 2
Worker 3
Worker 4
![Page 25: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/25.jpg)
Simple Illustration
Consumer
Merchant
Commerce Interaction Matrix
Worker 1
Worker 2
Worker 3
Worker 4
![Page 26: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/26.jpg)
Simple Illustration
Consumer
Merchant
Commerce Interaction Matrix
Worker 1
Worker 2
Worker 3
Worker 4
![Page 27: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/27.jpg)
Anything Wrong?
• Unnecessary broadcast of all item vectors to every worker;
• The full set of item vectors can be too large to load into memory.
[Figure: Worker 1, Worker 2, Worker 3, …, Worker K-1, Worker K]
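To get a rough feel for the memory concern, the size of a dense item-factor matrix can be estimated as items × rank × bytes per value. The sketch below uses the merchant count reported in this deck's scalability comparison; the rank of 100 and the 8-byte floats are assumptions, since the talk does not state them.

```python
def item_factor_bytes(n_items, rank, bytes_per_value=8):
    """Bytes needed to hold a dense n_items x rank factor matrix."""
    return n_items * rank * bytes_per_value

n_merchants = 4_386_744        # merchant count from the deck's comparison table
rank = 100                     # hypothetical number of latent factors
size_bytes = item_factor_bytes(n_merchants, rank)
size_gib = size_bytes / 2**30  # the same amount would be broadcast to every worker
```

Under these assumptions each worker would need to hold a few gigabytes of factors before doing any work, and the figure grows linearly with both the item count and the rank.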
![Page 28: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/28.jpg)
Scalable ALS
Constant in current iteration
Only need to consider the nonzero entries (annotated on two terms)
![Page 29: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/29.jpg)
ALS Recap
![Page 30: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/30.jpg)
ALS Recap
![Page 31: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/31.jpg)
The Solution of Spotify
* Chris Johnson, Erik Bernhardsson, Algorithmic Music Recommendations at Spotify
![Page 32: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/32.jpg)
For Every Map Job …
JobTracker
Worker 1 Worker 2 Worker 3
block 1 block 2 block 3
![Page 33: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/33.jpg)
For Every Map Job …
JobTracker
Worker 1 Worker 2 Worker 3
block 1 block 2 block 3
![Page 34: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/34.jpg)
The Solution of Spotify
* Chris Johnson, Erik Bernhardsson, Algorithmic Music Recommendations at Spotify
![Page 35: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/35.jpg)
Simple Illustration
Mapper 1 Mapper 2 Mapper 3 Mapper 4 Mapper 5
![Page 36: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/36.jpg)
Simple Illustration
Mapper 1 Mapper 2 Mapper 3 Mapper 4 Mapper 5
![Page 37: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/37.jpg)
Simple Illustration
Mapper 1 Mapper 2 Mapper 3 Mapper 4 Mapper 5
![Page 38: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/38.jpg)
Simple Illustration
Mapper 1 Mapper 2 Mapper 3 Mapper 4 Mapper 5
Reducer 1
![Page 39: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/39.jpg)
Simple Illustration
Mapper 1 Mapper 2 Mapper 3 Mapper 4 Mapper 5
Reducer 1 Reducer 2
![Page 40: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/40.jpg)
Simple Illustration
Mapper 1 Mapper 2 Mapper 3 Mapper 4 Mapper 5
Reducer 1 Reducer 2 Reducer 3
![Page 41: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/41.jpg)
Simple Illustration
Mapper 1 Mapper 2 Mapper 3 Mapper 4 Mapper 5
Reducer 1 Reducer 2 Reducer 3 Reducer 4
![Page 42: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/42.jpg)
Inside A Mapper
File System
Memory
Mapper
![Page 43: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/43.jpg)
Inside A Mapper
File System
Memory
Mapper
![Page 44: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/44.jpg)
Inside A Mapper
File System
Memory
Mapper
![Page 45: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/45.jpg)
Inside A Mapper
File System
Memory
Mapper
![Page 46: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/46.jpg)
Inside A Mapper
File System
Memory
Disadvantages:
• Unnecessary file copy to all nodes before the job starts;
• Every map function call risks a context switch and an inefficient I/O load.
Mapper
![Page 47: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/47.jpg)
Our Solution - Shard Partition
(Lee, {1:1, 2:5}) (Kim, {1:4, 2:2}) (Ha, {1:2, 2:4}) (Oh, {1:2, 2:4})
(Lee, {4:2}) (Kim, {4:5}) (Ha, {3:3}) (Oh, {4:5})
(Lee, {5:4}) (Kim, {5:1, 6:2}) (Ha, {6:5}) (Oh, {5:1})
Consumer
Merchant
Commerce Interaction Matrix
![Page 48: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/48.jpg)
Our Solution - Shard Partition
Shard 1
(Lee, {1:1, 2:5}) (Kim, {1:4, 2:2}) (Ha, {1:2, 2:4}) (Oh, {1:2, 2:4})
(Lee, {4:2}) (Kim, {4:5}) (Ha, {3:3}) (Oh, {4:5})
(Lee, {5:4}) (Kim, {5:1, 6:2}) (Ha, {6:5}) (Oh, {5:1})
Consumer
Merchant
Commerce Interaction Matrix
![Page 49: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/49.jpg)
Our Solution - Shard Partition
Shard 1 Shard 2
(Lee, {1:1, 2:5}) (Kim, {1:4, 2:2}) (Ha, {1:2, 2:4}) (Oh, {1:2, 2:4})
(Lee, {4:2}) (Kim, {4:5}) (Ha, {3:3}) (Oh, {4:5})
(Lee, {5:4}) (Kim, {5:1, 6:2}) (Ha, {6:5}) (Oh, {5:1})
Consumer
Merchant
Commerce Interaction Matrix
![Page 50: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/50.jpg)
Our Solution - Shard Partition
Shard 1 Shard 2 Shard 3
(Lee, {1:1, 2:5}) (Kim, {1:4, 2:2}) (Ha, {1:2, 2:4}) (Oh, {1:2, 2:4})
(Lee, {4:2}) (Kim, {4:5}) (Ha, {3:3}) (Oh, {4:5})
(Lee, {5:4}) (Kim, {5:1, 6:2}) (Ha, {6:5}) (Oh, {5:1})
Consumer
Merchant
Commerce Interaction Matrix
![Page 51: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/51.jpg)
Our Solution - Shard Partition
Shard 1 Shard 2 Shard 3
Shard 1: (Lee, {1:1, 2:5}) (Kim, {1:4, 2:2}) (Ha, {1:2, 2:4}) (Oh, {1:2, 2:4})
(Lee, {4:2}) (Kim, {4:5}) (Ha, {3:3}) (Oh, {4:5})
(Lee, {5:4}) (Kim, {5:1, 6:2}) (Ha, {6:5}) (Oh, {5:1})
Consumer
Merchant
Commerce Interaction Matrix
![Page 52: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/52.jpg)
Our Solution - Shard Partition
Shard 1 Shard 2 Shard 3
Shard 1: (Lee, {1:1, 2:5}) (Kim, {1:4, 2:2}) (Ha, {1:2, 2:4}) (Oh, {1:2, 2:4})
Shard 2: (Lee, {4:2}) (Kim, {4:5}) (Ha, {3:3}) (Oh, {4:5})
(Lee, {5:4}) (Kim, {5:1, 6:2}) (Ha, {6:5}) (Oh, {5:1})
Consumer
Merchant
Commerce Interaction Matrix
![Page 53: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/53.jpg)
Our Solution - Shard Partition
Shard 1 Shard 2 Shard 3
Shard 1: (Lee, {1:1, 2:5}) (Kim, {1:4, 2:2}) (Ha, {1:2, 2:4}) (Oh, {1:2, 2:4})
Shard 2: (Lee, {4:2}) (Kim, {4:5}) (Ha, {3:3}) (Oh, {4:5})
Shard 3: (Lee, {5:4}) (Kim, {5:1, 6:2}) (Ha, {6:5}) (Oh, {5:1})
Consumer
Merchant
Commerce Interaction Matrix
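The shard rows above are consistent with splitting each consumer's ratings by merchant-id range (merchants 1-2, 3-4 and 5-6). A minimal sketch of such a partition follows; the range boundaries are inferred from the example rows, and the function name is hypothetical rather than taken from the authors' implementation.

```python
def partition_into_shards(user_ratings, shard_ranges):
    """Split each consumer's {merchant: rating} dict by inclusive merchant-id range."""
    shards = [dict() for _ in shard_ranges]
    for user, ratings in user_ratings.items():
        for merchant, rating in ratings.items():
            for k, (lo, hi) in enumerate(shard_ranges):
                if lo <= merchant <= hi:
                    shards[k].setdefault(user, {})[merchant] = rating
                    break
    return shards

# Full per-consumer ratings, reassembled from the three rows on the slide.
user_ratings = {
    "Lee": {1: 1, 2: 5, 4: 2, 5: 4},
    "Kim": {1: 4, 2: 2, 4: 5, 5: 1, 6: 2},
    "Ha": {1: 2, 2: 4, 3: 3, 6: 5},
    "Oh": {1: 2, 2: 4, 4: 5, 5: 1},
}
shards = partition_into_shards(user_ratings, [(1, 2), (3, 4), (5, 6)])
```

Each shard then only needs the factor vectors of its own merchant range, which is what keeps the per-job memory footprint bounded.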
![Page 54: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/54.jpg)
Our Solution - Parallel Shard Processing
Shard 1 Shard 2 Shard 3
Consumer
Merchant
Commerce Interaction Matrix
![Page 55: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/55.jpg)
Our Solution - Parallel Shard Processing
Shard 1 Shard 2 Shard 3
Stage 1: Compute the individual contributions of each rating
Consumer
Merchant
Commerce Interaction Matrix
MapReduce Job for Shard 1
MapReduce Job for Shard 2
MapReduce Job for Shard 3
![Page 56: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/56.jpg)
Our Solution - Parallel Shard Processing
Shard 1 Shard 2 Shard 3
Global MapReduce Job for ALS
Stage 1: Compute the individual contributions of each rating
Stage 2: Aggregate all contributions for every user and update their models in parallel
Consumer
Merchant
Commerce Interaction Matrix
MapReduce Job for Shard 1
MapReduce Job for Shard 2
MapReduce Job for Shard 3
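One way to read the two stages is sketched below in single-process Python. Stage 1 emits, for every rating in a shard, that rating's additive contribution to the consumer's least-squares system. Stage 2 sums the contributions per consumer and solves. This is an interpretation under stated assumptions, not the authors' MapReduce code: the confidence form 1 + alpha * r, the function names, the random merchant factors and the precomputed global V^T V term are all hypothetical.

```python
import numpy as np

def stage1_contributions(shard, V, alpha):
    """Per shard: emit (consumer, A_part, b_part) for each individual rating."""
    out = []
    for user, ratings in shard.items():
        for merchant, r in ratings.items():
            v = V[merchant]
            c = 1.0 + alpha * r                   # assumed confidence for this rating
            out.append((user, (c - 1.0) * np.outer(v, v), c * v))
    return out

def stage2_update(contributions, VtV, lam):
    """Sum contributions per consumer, then solve each consumer's system."""
    f = VtV.shape[0]
    A, b = {}, {}
    for user, A_part, b_part in contributions:
        A[user] = A.get(user, np.zeros((f, f))) + A_part
        b[user] = b.get(user, np.zeros(f)) + b_part
    return {u: np.linalg.solve(VtV + A[u] + lam * np.eye(f), b[u]) for u in A}

rng = np.random.default_rng(0)
V = {m: rng.normal(size=2) for m in range(1, 7)}  # hypothetical merchant factors
shards = [{"Lee": {1: 1, 2: 5}}, {"Lee": {4: 2}}, {"Lee": {5: 4}}]
VtV = sum(np.outer(v, v) for v in V.values())     # constant within the iteration
contribs = [c for s in shards for c in stage1_contributions(s, V, alpha=1.0)]
U = stage2_update(contribs, VtV, lam=0.1)
```

Because the contributions are plain sums, the shards can be processed independently and in any order before the aggregation step.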
![Page 57: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/57.jpg)
For Every Map Job …
JobTracker
Worker 1 Worker 2 Worker 3
block 1 block 2 block 3
![Page 58: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/58.jpg)
For Every Map Job …
JobTracker
Worker 1 Worker 2 Worker 3
block 1 block 2 block 3
![Page 59: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/59.jpg)
Scalability Comparison
| | #Customers | #Merchants | #Transactions | Capacity |
| --- | --- | --- | --- | --- |
| Apache Mahout | 4,790,651 | 1,195,890 | 23,721,103 | Fail |
| Our Method | 45,948,109 | 4,386,744 | 324,084,408 | Success |

More than 10x scalability improvement!
![Page 60: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/60.jpg)
Summary
• Recommendation Problem
• Collaborative Filtering by Matrix Factorization
• Alternating Least Squares (ALS)
• A Scalable Solution
![Page 61: Scalable Collaborative Filtering for Commerce Recommendation](https://reader033.vdocuments.mx/reader033/viewer/2022051513/547ca765b4af9fda158b519e/html5/thumbnails/61.jpg)
Thank You!
Questions & Answers