recommendations @ rakuten group

Recommendations @ Rakuten Group. RecSys 2015/12/01. Vincent Michel & David Mas. Big Data Europe, Big Data Department, Rakuten Inc. / Big Data, PriceMinister.

Upload: recsysfr

Post on 14-Jan-2017


TRANSCRIPT

Page 1: Recommendations @ Rakuten Group

Recommendations @ Rakuten Group

RecSys 2015/12/01
Vincent Michel & David Mas
Big Data Europe, Big Data Department, Rakuten Inc. / Big Data, PriceMinister

Page 2: Recommendations @ Rakuten Group

Presentation Overview

§  Rakuten Group
§  Recommendations Challenges
  •  Challenges of Recommendations @ Rakuten
  •  Items Catalogues and Similarities
  •  Exploring Recommendations Models
  •  Recommendations Evaluation and Public Initiatives
§  Conclusion

Page 3: Recommendations @ Rakuten Group

Rakuten Group Worldwide

Recommendation challenges
•  Different languages
•  User behavior
•  Business areas

Page 4: Recommendations @ Rakuten Group

Rakuten Group in Numbers

Rakuten in Japan
•  > 12,000 employees
•  > 48 billion euros of GMS
•  > 100,000,000 users
•  > 250,000,000 items
•  > 40,000 merchants

Rakuten Group
•  Kobo: 18,000,000 users
•  Viki: 28,000,000 users
•  Viber: 345,000,000 users

Page 5: Recommendations @ Rakuten Group

Rakuten Ecosystem

Rakuten global ecosystem:
•  Member-based business model that connects Rakuten services
•  Rakuten ID common to various Rakuten services
•  Online shopping and services

Main business areas
•  E-commerce
•  Internet finance
•  Digital content

Recommendation challenges
•  Cross-services
•  Aggregated data
•  Complex user features

Page 6: Recommendations @ Rakuten Group

Rakuten’s e-commerce: B2B2C Business Model

Business to Business to Consumer:
•  Merchants located in different regions / online virtual shopping mall
•  Main profit sources
  •  Fixed fees from merchants
  •  Fees based on each transaction and other services

Recommendation challenges
•  Many shops
•  Item references
•  Global catalog

Page 7: Recommendations @ Rakuten Group

Big Data Department @ Rakuten

Big Data Department: 150+ engineers in Japan / Europe / US

Missions
•  Development and operations of internal systems for:
  •  Recommendations
  •  Search
  •  Targeting
  •  User behavior tracking

Average traffic
•  > 100,000,000 events / day
•  > 40,000,000 item views / day
•  > 50,000,000 searches / day
•  > 750,000 purchases / day

Technology stack
•  Java / Python / Ruby
•  Solr / Lucene
•  Cassandra / Couchbase
•  Hadoop / Hive / Pig
•  Redis / Kafka


Page 9: Recommendations @ Rakuten Group

Recommendations on Rakuten Marketplaces

Non-personalized recommendations
•  All-shop recommendations:
  •  Item to item
  •  User to item
•  In-shop recommendations
•  Review-based recommendations

Personalized recommendations
•  Purchase history recommendations
•  Cart add recommendations
•  Order confirmation recommendations

System status and scale
•  In production in over 35 services of Rakuten Group worldwide
•  Several hundred servers running:
  •  Hadoop
  •  Cassandra
  •  APIs

Page 10: Recommendations @ Rakuten Group

Challenges in Recommendations

Diagram: Items Catalogue, Items Similarity, Recommendations Engine, Evaluation Process

•  Items catalogues
  •  Catalogue for multiple shops with different item references?
•  Items similarity / distances
  •  Cross-services aggregation?
  •  Lots of parameters?
•  Recommendations engine
  •  Best / optimal recommendations logic?
•  Evaluation process
  •  Offline / online evaluation?
  •  Long tail? KPI?

Page 11: Recommendations @ Rakuten Group

Recommendations Architecture: Constantly Evolving

Architecture diagram components: Browsing Events, Purchase Events, Cocounts Storage, Catalogue(s), Distribution layer, Recommendations (offline / materialized), Recommendations (online algebra / multi-arm).


Page 13: Recommendations @ Rakuten Group

Items Catalogues

Use different levels of aggregation to improve recommendations:
•  Category level (e.g. food, soda, clothes, …)
•  Product level (manufactured items)
•  Item-in-shop level (a specific product sold by a specific shop)

Benefits:
•  Increased statistical power in co-events computation
•  Easier business handling (picking the right item)
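The gain in statistical power can be illustrated with a minimal sketch. The catalogue, the identifiers, and the helper names below are hypothetical and not taken from the talk; the point is only that two baskets from different shops share no item-level pair, but do share a product-level and a category-level pair.

```python
from collections import Counter
from itertools import combinations

# Hypothetical catalogue: item-in-shop id -> (product id, category).
CATALOGUE = {
    "shopA:cola-33cl": ("cola-33cl", "soda"),
    "shopB:cola-33cl": ("cola-33cl", "soda"),
    "shopA:chips": ("chips", "food"),
    "shopC:chips": ("chips", "food"),
}

# Position of each aggregation level in the catalogue tuple (None = raw id).
LEVELS = {"item": None, "product": 0, "category": 1}


def to_level(item_id, level):
    """Map an item-in-shop id to its key at the requested aggregation level."""
    index = LEVELS[level]
    return item_id if index is None else CATALOGUE[item_id][index]


def co_counts(baskets, level):
    """Count unordered co-occurrences of distinct keys within each basket."""
    counts = Counter()
    for basket in baskets:
        keys = sorted({to_level(item, level) for item in basket})
        counts.update(combinations(keys, 2))
    return counts


baskets = [
    ["shopA:cola-33cl", "shopA:chips"],
    ["shopB:cola-33cl", "shopC:chips"],
]
item_level = co_counts(baskets, "item")        # two pairs, each seen once
product_level = co_counts(baskets, "product")  # one pair, seen twice
```

At the item-in-shop level every pair is observed only once, while the product-level pair accumulates both observations.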

Page 14: Recommendations @ Rakuten Group

Enriching Catalogues using Record Linkage

Record linkage
•  Use external sources (e.g., Wikidata) to align markets' products
•  Fuzzy matching of 600K vs 350K items for the movie alignment use case
•  Blocking algorithm

Cross recommendation
•  Global catalog
•  Items aggregation
•  Helps with cold start issues
•  Improved navigation

Diagram: Marketplace 1, Marketplace 2, Reference database
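A minimal sketch of blocking followed by fuzzy matching, using only the Python standard library. The blocking key (first letter of the title), the threshold, the identifiers, and the sample titles are all assumptions for illustration; the talk does not describe the actual blocking or matching functions.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def blocking_key(title):
    """Hypothetical blocking key: first letter of the normalized title."""
    return title.strip().lower()[:1]


def link_records(left, right, threshold=0.85):
    """Return (left_id, right_id, score) for the best match above threshold.

    Only records sharing a blocking key are compared, which avoids the full
    len(left) * len(right) comparison.
    """
    blocks = defaultdict(list)
    for right_id, title in right.items():
        blocks[blocking_key(title)].append((right_id, title))

    links = []
    for left_id, title in left.items():
        best = None
        for right_id, candidate in blocks[blocking_key(title)]:
            score = SequenceMatcher(None, title.lower(), candidate.lower()).ratio()
            if score >= threshold and (best is None or score > best[2]):
                best = (left_id, right_id, score)
        if best is not None:
            links.append(best)
    return links


# Hypothetical marketplace titles and reference entries.
marketplace_1 = {"m1-1": "The Matrix", "m1-2": "Amelie"}
reference = {"ref-1": "The Matrx", "ref-2": "Amélie", "ref-3": "Toy Story"}
links = link_records(marketplace_1, reference, threshold=0.8)
```

A coarser blocking key reduces the risk of missing a match at the cost of more comparisons; the right trade-off depends on the catalogue.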

Page 15: Recommendations @ Rakuten Group

Co-occurrences and Similarities Computation

Only access to unitary data (purchase / browsing)

Use co-occurrences for computing items similarity

Multiple possible parameters:
•  Size of the time window to be considered: do browsing and purchase data reflect similar behavior?
•  Threshold on co-occurrences: is one co-occurrence significant enough to be used? Two? Three?
•  Symmetric or asymmetric: is the order important in the co-occurrence? A then B == B then A?
•  Similarity metrics: which similarity metrics should be used on top of the co-occurrences?
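The parameters listed above can be made concrete with a short sketch. The event format, function names, and default values are assumptions; the two metrics are the ones named later in the deck (`conditional_probability` and `cosine_similarity`), written here in their usual count-based forms rather than as Rakuten's actual implementation.

```python
from collections import Counter, defaultdict
from math import sqrt


def count_cooccurrences(events, window_days=1, symmetric=True):
    """Count item pairs seen for the same user within `window_days`.

    `events` is an iterable of (user_id, item_id, timestamp_in_days) tuples.
    In asymmetric mode the pair (a, b) means "a then b" (ties on the
    timestamp are ordered by item id, which is arbitrary).
    """
    by_user = defaultdict(list)
    for user, item, timestamp in events:
        by_user[user].append((timestamp, item))

    pairs, item_counts = Counter(), Counter()
    for history in by_user.values():
        history.sort()
        for _, item in history:
            item_counts[item] += 1
        for i, (time_a, a) in enumerate(history):
            for time_b, b in history[i + 1:]:
                if time_b - time_a > window_days:
                    break  # history is sorted, later events are further away
                if a == b:
                    continue
                pairs[(a, b)] += 1
                if symmetric:
                    pairs[(b, a)] += 1
    return pairs, item_counts


def similarities(pairs, item_counts, metric="cosine_similarity", threshold=1):
    """Turn co-occurrence counts into scores, dropping pairs below threshold."""
    scores = {}
    for (a, b), n_ab in pairs.items():
        if n_ab < threshold:
            continue
        if metric == "conditional_probability":
            scores[(a, b)] = n_ab / item_counts[a]  # estimate of P(b | a)
        else:
            scores[(a, b)] = n_ab / sqrt(item_counts[a] * item_counts[b])
    return scores


events = [("u1", "tv", 0), ("u1", "cable", 1), ("u2", "tv", 0),
          ("u2", "cable", 0.5), ("u3", "tv", 3)]
pairs, counts = count_cooccurrences(events, window_days=1, symmetric=False)
scores = similarities(pairs, counts, "conditional_probability", threshold=2)
```

With `symmetric=False` only the "tv then cable" direction is counted, so a cable is recommended after a TV but not the reverse.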

Page 16: Recommendations @ Rakuten Group

Co-occurrences Example

Figure: browsing and purchase events on a timeline (dates between 08/09/2015 and 24/11/2015), grouped either by session ("Session ?") or by time window ("Time window 1", "Time window 2").

Page 17: Recommendations @ Rakuten Group

Co-occurrences Computation

Classical co-occurrences
•  Co-purchases
•  Co-browsing

Other possible co-occurrences
•  Complementary items: items browsed and bought together ("You may also want…")
•  Substitute items: items browsed and not bought together ("Similar items…")
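One possible reading of the complementary / substitute distinction, as a sketch. The session format and the rule used here (a co-browsed pair is complementary when both items were bought in the session, and a substitute candidate otherwise) are assumptions; the slides do not spell out the exact rule.

```python
from collections import Counter
from itertools import combinations


def split_cooccurrences(sessions):
    """Classify co-browsed pairs as complementary or substitute candidates.

    `sessions` is an iterable of (browsed_items, purchased_items) pairs.
    """
    complementary, substitute = Counter(), Counter()
    for browsed, purchased in sessions:
        purchased = set(purchased)
        for pair in combinations(sorted(set(browsed)), 2):
            if purchased.issuperset(pair):
                complementary[pair] += 1  # browsed and bought together
            else:
                substitute[pair] += 1     # browsed together, not both bought
    return complementary, substitute


sessions = [
    (["phone", "case", "phone-b"], ["phone", "case"]),
    (["phone", "phone-b"], ["phone-b"]),
]
complementary, substitute = split_cooccurrences(sessions)
```

Here the phone and its case come out as complementary, while the two phones come out as substitutes.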


Page 19: Recommendations @ Rakuten Group

Recommendations Algebra

Algebra for defining and combining recommendations engines

Key ideas
•  Reuse existing logic and combine it easily.
•  Write business logic, not code!
•  Handle multiple input/output formats.

Available logics
•  Content-based
•  Collaborative filtering
•  Item-item
•  User-item (personalization)

Available backends
•  In-memory
•  HDF5 files
•  Cassandra
•  Couchbase

Available hybridization
•  Linear algebra / weighting
•  Mixed
•  Cascade engines
•  Meta-level

Page 20: Recommendations @ Rakuten Group

Python Algebra Example

>>> engine1 = RecommendationsEngine(nb_recos=20, datatype='purchase',
...                                 asymmetric=True,
...                                 distance='conditional_probability')
>>> engine2 = RecommendationsEngine(similarity_th=0.01, datatype='browsing',
...                                 asymmetric=False,
...                                 distance='cosine_similarity')
>>> composite_engine = engine1 + 0.2 * engine2
>>> # Get recommendations from items (item-to-item)
>>> recos = composite_engine.recommendations_by_items([123, 456, 789, …])

Diagram: composite engine =
•  Purchase-based, Top-20, asymmetric, conditional probability
•  + 0.2 × Browsing-based, similarity > 0.01, symmetric, cosine similarity
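How an expression such as `engine1 + 0.2 * engine2` can be supported in Python is sketched below with operator overloading. `ToyEngine` and `CompositeEngine` are hypothetical stand-ins backed by in-memory dictionaries; they are not the `RecommendationsEngine` class from the talk, whose internals are not shown in the slides.

```python
from collections import defaultdict


class ToyEngine:
    """Hypothetical engine: maps item -> {recommended item: score}."""

    def __init__(self, similarities):
        self.similarities = similarities

    def scores(self, item):
        return dict(self.similarities.get(item, {}))

    def __add__(self, other):
        return CompositeEngine([(1.0, self), (1.0, other)])

    def __rmul__(self, weight):
        # Called for `0.2 * engine`, since float does not know about engines.
        return CompositeEngine([(float(weight), self)])

    def recommendations_by_items(self, items, nb_recos=20):
        totals = defaultdict(float)
        for item in items:
            for candidate, score in self.scores(item).items():
                if candidate not in items:
                    totals[candidate] += score
        ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:nb_recos]


class CompositeEngine(ToyEngine):
    """Weighted sum of the scores of several engines."""

    def __init__(self, weighted_engines):
        self.weighted_engines = weighted_engines

    def scores(self, item):
        totals = defaultdict(float)
        for weight, engine in self.weighted_engines:
            for candidate, score in engine.scores(item).items():
                totals[candidate] += weight * score
        return dict(totals)


engine1 = ToyEngine({"tv": {"cable": 0.5, "stand": 0.25}})
engine2 = ToyEngine({"tv": {"stand": 1.0, "remote": 0.5}})
composite_engine = engine1 + 0.2 * engine2
recos = composite_engine.recommendations_by_items(["tv"])
```

Because a composite is itself an engine, composites can be nested and reused in further expressions, which is what makes the notation read like business logic.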

Page 21: Recommendations @ Rakuten Group

Python Algebra with Personalization

>>> history = HistoryEngine(datatype='purchase', time_window=180, time_decay=0.01)
>>> engine1.register_history_engine(history)
>>> # …same code as previously (user-to-item)
>>> recos = composite_engine.recommendations_by_user('userid')

Diagram: composite engine =
•  Purchase-based, Top-20, asymmetric, conditional probability
•  + 0.2 × Browsing-based, similarity > 0.01, symmetric, cosine similarity
•  with purchase history: time window 180 days, time decay 0.01
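A sketch of what a purchase history with a time window and a time decay could compute. The exponential form `exp(-time_decay * age_in_days)` is an assumption, as are the function names and data layout; the slides give only the parameter values (180 days, 0.01), not the formula.

```python
from math import exp


def history_weights(purchases, today, time_window=180, time_decay=0.01):
    """Weight each purchased item by recency.

    `purchases` is a list of (item_id, day) pairs and `today` a day index.
    Assumption: exponential decay exp(-time_decay * age_in_days); purchases
    older than `time_window` days are ignored.
    """
    weights = {}
    for item, day in purchases:
        age = today - day
        if 0 <= age <= time_window:
            weight = exp(-time_decay * age)
            weights[item] = max(weight, weights.get(item, 0.0))
    return weights


def recommend_for_user(purchases, item_scores, today, nb_recos=20):
    """Sum item-to-item scores over the user's history, weighted by recency."""
    weights = history_weights(purchases, today)
    totals = {}
    for item, weight in weights.items():
        for candidate, score in item_scores.get(item, {}).items():
            if candidate not in weights:  # skip items already in the history
                totals[candidate] = totals.get(candidate, 0.0) + weight * score
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:nb_recos]


purchases = [("tv", 100), ("book", 0)]
item_scores = {"tv": {"cable": 0.5}, "book": {"bookmark": 0.9}}
weights = history_weights(purchases, today=200)
user_recos = recommend_for_user(purchases, item_scores, today=200)
```

In this example the book bought 200 days ago falls outside the window, so only the TV purchase contributes.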

Page 22: Recommendations @ Rakuten Group

Python Algebra – Complete Example

Diagram:

Item-level composite engine, with purchase history (time window 180 days, time decay 0.01):
•  Purchase-based, Top-20, asymmetric, conditional probability
•  + 0.2 × Browsing-based, similarity > 0.01, symmetric, cosine similarity

X (cascade)

Category-level composite engine:
•  Purchase-based, category-level, similarity > 0.01, asymmetric, conditional probability
•  + 0.1 × Browsing-based, category-level, similarity > 0.1, symmetric, cosine similarity
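One common way to read a cascade is that a second engine refines the candidates produced by the first. The sketch below follows that reading: item-level candidates are kept only when their category is also supported by a category-level engine. The function name, the threshold, and the tie-breaking rule are assumptions, and the cascade operator in the talk may behave differently.

```python
def cascade(item_scores, category_scores, category_of, min_category_score=0.01):
    """Filter item-level candidates with a category-level engine.

    `item_scores` maps candidate item -> score, `category_scores` maps
    category -> score, and `category_of` maps item -> category. Candidates
    are ranked by item score, with the category score breaking ties.
    """
    kept = []
    for candidate, score in item_scores.items():
        category_score = category_scores.get(category_of[candidate], 0.0)
        if category_score >= min_category_score:
            kept.append((candidate, score, category_score))
    kept.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return [candidate for candidate, _, _ in kept]


item_scores = {"cable": 0.5, "novel": 0.4, "stand": 0.3}
category_of = {"cable": "electronics", "novel": "books", "stand": "electronics"}
category_scores = {"electronics": 0.6, "books": 0.001}
cascaded = cascade(item_scores, category_scores, category_of)
```

The category-level engine has far more data per key than the item-level one, so it can veto candidates that only look related because of a few noisy co-occurrences.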


Page 24: Recommendations @ Rakuten Group

Recommendation Quality Challenges

Diagram labels: Minor Product, Major Product, New Product, Old Product, (Popular); regions (A), (B), (C), (D).

Recommendations categories
•  Cold start issue
  •  External data?
  •  Cross-services?
•  Hot products (A)
  •  Top-N items?
•  Short tail (B)
•  Long tail (C + D)

Page 25: Recommendations @ Rakuten Group

Long Tail is Fat

Long tail numbers
•  Most of the items are long tail
•  They still represent a large portion of the traffic

Charts: browsing share and number of items for the popular, short tail, and long tail segments.

Long tail approaches
•  Content-based
•  Aggregation / clustering
•  Personalization
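A sketch of how a catalogue could be split into the three segments in order to measure both charts (number of items and browsing share per segment). The cumulative-share thresholds and the view counts are hypothetical; the talk does not give the cut-offs it uses.

```python
def segment_catalogue(views, popular_share=0.2, short_tail_share=0.4):
    """Split items into popular / short tail / long tail by cumulative views.

    Items are sorted by decreasing view count; an item belongs to a segment
    according to the share of traffic already covered by more viewed items.
    The thresholds are hypothetical.
    """
    total = sum(views.values())
    segments, cumulative = {}, 0
    for item, count in sorted(views.items(), key=lambda kv: (-kv[1], kv[0])):
        share_before = cumulative / total
        if share_before < popular_share:
            segments[item] = "popular"
        elif share_before < popular_share + short_tail_share:
            segments[item] = "short tail"
        else:
            segments[item] = "long tail"
        cumulative += count
    return segments


views = {"a": 50, "b": 20, "c": 10, "d": 5, "e": 5, "f": 4, "g": 3, "h": 3}
segments = segment_catalogue(views)
long_tail = [item for item, name in segments.items() if name == "long tail"]
long_tail_share = sum(views[item] for item in long_tail) / sum(views.values())
```

In this toy catalogue six of the eight items are long tail, and together they still account for 30% of the views.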

Page 26: Recommendations @ Rakuten Group

Evaluation

Diagram: datasets (browsing history, query history, purchase history) and algorithms feed an offline test (long-term research) and an online test (KPI maximization); offline results are used as prior; correlation between offline metrics and value.

Hybrid approach
•  Offline for long-term and prior
•  Online for short-term and maximizing KPIs
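One way offline results can serve as a prior for online KPI maximization is a multi-armed bandit with Beta priors (Thompson sampling). This is a sketch of that general technique, not of Rakuten's system: the class, the engine names, and the pseudo-counts are hypothetical, and the slides only mention "multi-arm" and "use as prior".

```python
import random


class BetaBandit:
    """Thompson sampling over engines with Beta priors on the click rate."""

    def __init__(self, priors, seed=None):
        # priors: engine name -> (alpha, beta) pseudo-counts, e.g. derived
        # from offline hits and misses (an assumption of this sketch).
        self.params = {name: list(ab) for name, ab in priors.items()}
        self.rng = random.Random(seed)

    def choose(self):
        """Sample a plausible click rate per engine and pick the best one."""
        samples = {
            name: self.rng.betavariate(alpha, beta)
            for name, (alpha, beta) in self.params.items()
        }
        return max(samples, key=samples.get)

    def update(self, name, clicked):
        """Record an online outcome for the engine that was shown."""
        self.params[name][0 if clicked else 1] += 1

    def expected_rate(self, name):
        alpha, beta = self.params[name]
        return alpha / (alpha + beta)


# Hypothetical offline results: engine A looked better than engine B.
bandit = BetaBandit({"engine_a": (3, 7), "engine_b": (2, 8)}, seed=0)
chosen = bandit.choose()
bandit.update("engine_b", clicked=True)
```

A weak prior (small pseudo-counts) lets online feedback override an offline ranking quickly, which matters when offline metrics correlate poorly with the KPI.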

Page 27: Recommendations @ Rakuten Group

Offline Evaluation

Pros/Cons
•  Convenient way to try new ideas
•  Fast and cheap
•  But hard to align with online KPI

Approaches
•  Rescoring
•  Prediction game
•  Business simulator

Target = item bought by user
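The prediction game can be sketched as a hit rate: hide an item the user bought, ask the engine for recommendations from the rest of the history, and check whether the hidden item appears in the top k. The function name, the test-case format, and the toy recommender are assumptions made for illustration.

```python
def hit_rate_at_k(recommend, test_cases, k=20):
    """Fraction of test cases whose held-out purchase is in the top-k list.

    `recommend` maps a list of history items to a ranked list of items;
    `test_cases` is a list of (history_items, bought_item) pairs.
    """
    if not test_cases:
        return 0.0
    hits = sum(
        1 for history, target in test_cases
        if target in recommend(history)[:k]
    )
    return hits / len(test_cases)


# Hypothetical recommender backed by a fixed item-to-item table.
TABLE = {"tv": ["cable", "stand"], "phone": ["case"]}


def toy_recommend(history):
    ranked = []
    for item in history:
        ranked.extend(r for r in TABLE.get(item, []) if r not in ranked)
    return ranked


test_cases = [(["tv"], "stand"), (["phone"], "charger")]
score = hit_rate_at_k(toy_recommend, test_cases, k=2)
```

Such a metric is fast and cheap to compute, but it rewards predicting what users would have bought anyway, which is one reason it can be hard to align with online KPIs.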

Page 28: Recommendations @ Rakuten Group

Offline Evaluation for Online Learning


Page 29: Recommendations @ Rakuten Group

Public Initiative – Viki Recommendation Challenge

567 submissions from 132 participants

http://www.dextra.sg/challenges/rakuten-viki-video-challenge


Page 31: Recommendations @ Rakuten Group

Conclusion

Rakuten provides marketplaces worldwide

Specific challenges for recommendations
•  Items catalogue: reinforce the statistical power of co-occurrences across shops and services
•  Items similarities: find the right parameters for the different use cases
•  Recommendations models: what are the best models for in-shop, all-shops, and personalized recommendations?
•  Evaluation: handling the long tail? Comparing different models?

Page 32: Recommendations @ Rakuten Group

We are Hiring!

Data Scientist / Software Developer
•  Build algorithms for recommendations, search, targeting
•  Predictive modeling, machine learning, natural language processing
•  Working closely with the business
•  Python, Java, Hadoop, Couchbase, Cassandra…
•  Also hiring: search engine developers, big data system administrators, etc.

Big Data Department, team in Paris
•  http://global.rakuten.com/corp/careers/bigdata/
•  http://www.priceminister.com/recrutement/?p=197

Page 33: Recommendations @ Rakuten Group

THANKS!

Questions?

More on Rakuten tech initiatives
•  http://www.slideshare.net/rakutentech
•  http://rit.rakuten.co.jp/oss.html
•  http://rit.rakuten.co.jp/opendata.html

Positions
•  http://global.rakuten.com/corp/careers/bigdata/
•  http://www.priceminister.com/recrutement/?p=197