large scale recommendation in e-commerce -- qiang yan


Upload: qiang-yan

Post on 02-Dec-2014


DESCRIPTION

These are the slides for the Large Scale Recommender Systems workshop at RecSys 2014.

TRANSCRIPT

Page 1: Large scale recommendation in e-commerce -- qiang yan

Large Scale Recommendation in E-Commerce

Qiang Yan, Quan Yuan

Taobao Search & P13N Team

Alibaba Group

Page 2: Large scale recommendation in e-commerce -- qiang yan

Outline

• Introduction
  – Data in Taobao
  – Recommendation in Taobao
• Approaches to recommendation
  – eTREC
  – Rank
• Lessons we learn
• Conclusion & Challenges

Page 3: Large scale recommendation in e-commerce -- qiang yan

Outline

• Introduction
  – Data in Taobao
  – Recommendation in Taobao
• Approaches to recommendation
  – eTREC
  – Rank
• Lessons we learn
• Conclusion & Challenges

Page 4: Large scale recommendation in e-commerce -- qiang yan

Data in Taobao

Largest online and mobile commerce company in the world

Page 5: Large scale recommendation in e-commerce -- qiang yan

Recommendations in taobao.com

• Item Discovery
• P13N in Vertical Industry
• My Taobao – Guess You Like
• Shop Discovery

Page 6: Large scale recommendation in e-commerce -- qiang yan

Recommendations in Taobao Mobile

Powered by Recommendation

• Flash Sale
• Dress rec for female
• P13N HQ REC
• New Items REC

Page 7: Large scale recommendation in e-commerce -- qiang yan

Recommendations in Taobao Mobile

Powered by Recommendation

• P13N in Vertical Industry
• Shop Discovery
• Item Rec

Page 8: Large scale recommendation in e-commerce -- qiang yan

Outline

• Introduction
  – Data in Taobao
  – Recommendation in Taobao
• Approaches to recommendation
  – eTREC
  – Rank
• Lessons we learn
• Conclusion & Challenges

Page 9: Large scale recommendation in e-commerce -- qiang yan

Overview

• Application: HomePage (PC), Vertical (PC), HomePage (mobile), Vertical (mobile), Shopping Path
• Re-Rank: Diversity, Freshness, Business Goal
• Rank: RTP, Xlib, Olive
• Match/Retrieval: RT, CF, Content, User-Based, DT, LC
• Platform: TPP, Tair, HBase, UPS, JStorm, ODPS
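The layered flow on this slide (match, then rank, then re-rank) can be illustrated with a small sketch. Everything here is hypothetical: the `recommend` function, the item dictionaries, the `score` callback, and the per-category cap used as a stand-in for the "Diversity" re-rank goal are illustration choices, not the Taobao implementation.

```python
def recommend(user, matchers, score, max_per_category=2, k=5):
    """Toy three-stage pipeline.

    Match:   union the candidates returned by several retrieval sources.
    Rank:    order the candidates by a model score.
    Re-rank: apply a simple diversity cap per category (an assumption).
    """
    # Match: de-duplicate candidates by item id.
    candidates = {}
    for match in matchers:
        for item in match(user):
            candidates[item["id"]] = item

    # Rank: highest score first.
    ranked = sorted(candidates.values(),
                    key=lambda it: score(user, it), reverse=True)

    # Re-rank: keep at most max_per_category items from each category.
    result, per_category = [], {}
    for item in ranked:
        seen = per_category.get(item["category"], 0)
        if seen < max_per_category:
            result.append(item)
            per_category[item["category"]] = seen + 1
        if len(result) == k:
            break
    return result
```

In this sketch each matcher plays the role of one retrieval source (for example CF or content based), and the re-rank step is the place where freshness or business rules could be added.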

Page 10: Large scale recommendation in e-commerce -- qiang yan

Overview

• Application: HomePage (PC), Vertical (PC), HomePage (mobile), Vertical (mobile), Shopping Path
• Re-Rank: Diversity, Freshness, Business Goal
• Rank: RTP, Xlib, Olive
• Match/Retrieval: RT, CF, Content, User-Based, DT, LC
• Platform: TPP, Tair, HBase, UPS, JStorm, ODPS

eTREC

Page 11: Large scale recommendation in e-commerce -- qiang yan

eTREC

A highly efficient distributed feature-based collaborative filtering tool

• Data: Items, Users, Content (word, tag)
• User-Item: ItemCF, UserCF, ContentCF
• Feature-Based CF: User, Item, Features
  – Tags
  – Style
  – Latent Class
  – …

Page 12: Large scale recommendation in e-commerce -- qiang yan

eTREC

Implementation trick 1 – Operators

• Jaccard
• Cosine
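The two operators named on this slide have standard definitions, shown below as a minimal sketch. The function names and the sparse `{feature_id: weight}` representation are assumptions for illustration; the slide does not show how eTREC implements them.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two feature sets: |A ∩ B| / |A ∪ B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity of two sparse vectors stored as {feature_id: weight}."""
    # Iterate over the shorter vector when computing the dot product.
    if len(u) > len(v):
        u, v = v, u
    dot = sum(w * v[f] for f, w in u.items() if f in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

Both operators need only a dot product (or intersection size) plus per-entity norms (or set sizes), which is why they fit a pipeline that computes norms and dot products first and similarities second.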

Page 13: Large scale recommendation in e-commerce -- qiang yan

eTREC

Implementation trick 2 – Fewer Map/Reduce jobs

• NormAndDot
• CalSim
  – Fewer emitted item-item pairs

Data flow:

• Input: feature_id, entity_id, preference, payload
• NormAndDot job output: entity_id, norm(i), <j, dot(i, j)>, …
• CalSim job output: entity_id, <j, sim(i, j)>, …

ItemCF in Mahout
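The two jobs can be mimicked in memory to show what each stage produces. This is a single-machine sketch under stated assumptions: the record layout follows the slide but drops the `payload` field, each (feature_id, entity_id) pair is assumed to occur once, cosine is assumed as the similarity, and the function names `norm_and_dot` and `cal_sim` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def norm_and_dot(records):
    """Stage 1: from (feature_id, entity_id, preference) rows, compute each
    entity's squared norm and the dot product of entities sharing a feature."""
    by_feature = defaultdict(list)
    sq_norm = defaultdict(float)
    for feature_id, entity_id, pref in records:
        by_feature[feature_id].append((entity_id, pref))
        sq_norm[entity_id] += pref * pref

    dots = defaultdict(float)
    for entries in by_feature.values():
        # Emit each unordered pair once (i < j) instead of both (i, j) and (j, i).
        for (i, pi), (j, pj) in combinations(sorted(entries), 2):
            dots[(i, j)] += pi * pj
    return sq_norm, dots


def cal_sim(sq_norm, dots):
    """Stage 2: turn norms and dot products into cosine similarities."""
    sims = defaultdict(dict)
    for (i, j), dot in dots.items():
        denom = math.sqrt(sq_norm[i] * sq_norm[j])
        s = dot / denom if denom else 0.0
        sims[i][j] = s
        sims[j][i] = s
    return sims
```

For item-item similarity the feature is the user, so a row such as `("u1", "a", 1.0)` reads "user u1 expressed preference 1.0 for item a". Pairs that share no feature are never emitted, which keeps the intermediate output small.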

Page 14: Large scale recommendation in e-commerce -- qiang yan

eTREC

• Features
  – Fast
    • 400M users × 200M items in less than 20 minutes
  – Easy to use
  – Scalable
    • User-defined similarity (default: cosine, jaccard, asymcosine)
    • User-defined item-item pairs
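One way to picture a user-defined similarity is as a function of the dot product and the two squared norms. The registry below is a hypothetical interface, not eTREC's API. The `jaccard` entry assumes binary preferences, and `asymcosine` is assumed to mean the asymmetric cosine dot / (n_i^alpha * n_j^(1 - alpha)); the slide does not define it.

```python
# Each similarity takes dot(i, j) and the squared norms of i and j.
SIMILARITIES = {
    "cosine": lambda dot, ni, nj: dot / (ni * nj) ** 0.5,
    # Valid for binary vectors, where the squared norm equals the set size.
    "jaccard": lambda dot, ni, nj: dot / (ni + nj - dot),
    # Assumed definition; alpha = 0.5 reduces to plain cosine.
    "asymcosine": lambda dot, ni, nj, alpha=0.2:
        dot / (ni ** alpha * nj ** (1 - alpha)),
}


def similarity(kind, dot, sq_norm_i, sq_norm_j):
    """Apply a built-in similarity by name, or any user-supplied callable."""
    fn = SIMILARITIES[kind] if isinstance(kind, str) else kind
    return fn(dot, sq_norm_i, sq_norm_j)
```

A caller could pass its own function, for example `similarity(lambda d, a, b: d / min(a, b), 2.0, 4.0, 8.0)`, without touching the stage that computes norms and dot products.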

Page 15: Large scale recommendation in e-commerce -- qiang yan

Rank: Olive

• Olive = Real-time Streaming System + Online Learning
• Why do we need online learning?
  – User
    • User interests shift
    • Mixed accounts, family accounts
  – Item
    • Millions of new items per day
    • 10M updated items (title, price, etc.)
  – Context
    • Promotions, discounts, etc.
    • Festivals: National Day, 11.11

Page 16: Large scale recommendation in e-commerce -- qiang yan

Olive

Goal
• Respond in real time to P/N (positive/negative) feedback and improve the user experience
• More accurate recommendations
• Stable model

Model
• FTRL
• AdPredictor

Page 17: Large scale recommendation in e-commerce -- qiang yan

Framework

Asynchronous Distributed OGD

Diagram components:

• Parameter Server (Tair/HBase)
• Reducers
• Arrows labeled w and △w
• Updaters
• FG, IG
• Data Shards
• Storm/JStorm, TT/MetaQ
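The idea of workers pulling w and pushing △w can be sketched on a single machine. This is a simplified simulation, not the Storm/JStorm topology on the slide: the `ParameterServer` class is an in-memory stand-in for Tair/HBase, the loss is assumed to be logistic, and staleness is imitated by letting every worker read the same snapshot before any update is applied.

```python
import math
from collections import defaultdict


class ParameterServer:
    """In-memory stand-in for the key-value store holding the weights."""

    def __init__(self):
        self.w = defaultdict(float)

    def pull(self, keys):
        return {k: self.w[k] for k in keys}

    def push(self, delta):
        for k, d in delta.items():
            self.w[k] += d


def worker_step(ps, shard, lr=0.1):
    """Pull the weights one shard needs, take a local gradient step on the
    logistic loss, and return the weight delta to push."""
    keys = {f for x, _ in shard for f in x}
    w = ps.pull(keys)
    delta = defaultdict(float)
    for x, y in shard:  # x: {feature: value}, y: 0 or 1
        z = sum(w[f] * v for f, v in x.items())
        p = 1.0 / (1.0 + math.exp(-z))
        for f, v in x.items():
            delta[f] -= lr * (p - y) * v
    return delta


def train(ps, shards, rounds=50):
    for _ in range(rounds):
        # All workers compute against the same snapshot, then push,
        # which mimics the stale reads of asynchronous training.
        deltas = [worker_step(ps, shard) for shard in shards]
        for delta in deltas:
            ps.push(delta)
```

Because each worker touches only the features present in its shard, the pull and push traffic stays proportional to the shard's active features rather than to the full model size.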

Page 18: Large scale recommendation in e-commerce -- qiang yan

Olive -- FTRL

• FTRL-Proximal

[Equation: FTRL-Proximal update, with its terms annotated as follows]

  – Update with (sub-)gradient
  – Updated models not far from previous
  – L1-norm
  – L2-norm
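The four annotations above correspond to the terms of the per-coordinate FTRL-Proximal update. The sketch below follows the commonly published per-coordinate form for logistic regression; the class name, hyperparameter defaults, and the clamping of the logit are assumptions, and the slide does not show which variant Olive uses.

```python
import math
from collections import defaultdict


class FTRLProximal:
    """Per-coordinate FTRL-Proximal for logistic regression on sparse input."""

    def __init__(self, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # accumulated adjusted gradients
        self.n = defaultdict(float)  # accumulated squared gradients

    def weight(self, f):
        """Closed-form weight; the L1 term zeroes out small coordinates."""
        z = self.z[f]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n[f])) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x):
        s = sum(self.weight(f) * v for f, v in x.items())
        s = max(min(s, 35.0), -35.0)  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """One online step on example x ({feature: value}) with label y in {0, 1}."""
        p = self.predict(x)
        for f, v in x.items():
            g = (p - y) * v  # (sub-)gradient of the log loss
            sigma = (math.sqrt(self.n[f] + g * g) - math.sqrt(self.n[f])) / self.alpha
            # The sigma term keeps the new weight close to the previous one.
            self.z[f] += g - sigma * self.weight(f)
            self.n[f] += g * g
        return p
```

Weights are never stored directly; they are derived from `z` and `n` on demand, so features whose accumulated signal stays below the L1 threshold cost nothing at serving time.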

Page 19: Large scale recommendation in e-commerce -- qiang yan

FTRL in Action

• Cold start
  – OWLQN-LR based FTRL pre-trained model
  – 3.9B samples, 1.7B features, pre-training in 21 minutes
• Stability
  – Residual-based cascading online training
  – |w − w0| constraint
  – Mini-batch update

[Figure: AUC (axis 0.5 to 0.75) and beta (axis −3 to 0) by hour (0 to 12)]
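The slide does not say how the |w − w0| constraint is enforced. One possible reading, sketched below, is a mini-batch gradient step followed by clipping each weight into a box around the pre-trained weight w0. The function name, the `radius` parameter, and the clipping itself are assumptions made for illustration.

```python
import math


def minibatch_step(w, w0, batch, lr=0.05, radius=0.5):
    """One mini-batch logistic-regression step, then clip each updated weight
    so that |w[f] - w0[f]| <= radius (w0 is the pre-trained model)."""
    # Average the log-loss gradient over the mini-batch.
    grad = {}
    for x, y in batch:  # x: {feature: value}, y: 0 or 1
        z = sum(w.get(f, 0.0) * v for f, v in x.items())
        p = 1.0 / (1.0 + math.exp(-z))
        for f, v in x.items():
            grad[f] = grad.get(f, 0.0) + (p - y) * v / len(batch)

    new_w = dict(w)
    for f, g in grad.items():
        base = w0.get(f, 0.0)
        stepped = w.get(f, 0.0) - lr * g
        # Projection onto the box [base - radius, base + radius].
        new_w[f] = min(max(stepped, base - radius), base + radius)
    return new_w
```

Averaging over a mini-batch smooths out single noisy events, and the box keeps a burst of unusual traffic from dragging the online model far from the offline one.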

Page 20: Large scale recommendation in e-commerce -- qiang yan

Experiments

Samples (3.9B):
• Offline: pre-train an offline FTRL model on 14 days of pv (page view) and click data
• Online: the following 4 days

Model: FTRL

Page 21: Large scale recommendation in e-commerce -- qiang yan

Experiments

• Accuracy

[Figure: GAUC (axis 0.65 to 0.77) of LR vs. FTRL by day, 20140604 to 20140607; annotation: 10%+]

• Stability

[Figure: beta (axis −2.196 to −2.19) for feature gender_comb_1_1; second axis 0.15 to 0.35]
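The slides do not define GAUC. A common reading, assumed here, is a group AUC: AUC is computed separately for each user and then averaged with weights equal to the user's number of impressions. The function names and the rule of skipping users with only one label class are assumptions.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative
    (ties count as 0.5). Returns None if one class is missing."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gauc(user_ids, labels, scores):
    """Impression-weighted average of per-user AUC."""
    groups = defaultdict(lambda: ([], []))
    for u, y, s in zip(user_ids, labels, scores):
        groups[u][0].append(y)
        groups[u][1].append(s)

    total = weight = 0.0
    for ys, ss in groups.values():
        a = auc(ys, ss)
        if a is not None:  # skip users with only clicks or only non-clicks
            total += a * len(ys)
            weight += len(ys)
    return total / weight if weight else None
```

Under this reading the metric measures how well items are ordered within each user's own list, which matches how a recommendation slot is actually consumed.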

Page 22: Large scale recommendation in e-commerce -- qiang yan

Olive -- AdPredictor

• AdPredictor -- Not Sparse
  – Pruning parameters
• Advantages:
  – Bayesian model (easy to add domain knowledge)
  – Models uncertainty explicitly
  – Natural exploration
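The advantages listed above follow from AdPredictor keeping a Gaussian belief (mean and variance) per weight. The sketch below uses the online update for binary features from Graepel et al. (see the reference slide); the class layout, default values, the lower clamp on t, and the `sample_score` helper for exploration are assumptions, and parameter pruning is omitted.

```python
import math
import random
from collections import defaultdict


def pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def cdf(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


class AdPredictor:
    """Bayesian probit model with one Gaussian belief per binary feature."""

    def __init__(self, beta=1.0, prior_var=1.0):
        self.beta = beta
        self.mean = defaultdict(float)
        self.var = defaultdict(lambda: prior_var)

    def _stats(self, features):
        total_mean = sum(self.mean[f] for f in features)
        total_var = self.beta ** 2 + sum(self.var[f] for f in features)
        return total_mean, total_var

    def predict(self, features):
        """Predicted click probability for a list of active features."""
        m, v = self._stats(features)
        return cdf(m / math.sqrt(v))

    def sample_score(self, features, rng=random):
        """Exploration: score with weights sampled from the current beliefs."""
        return sum(rng.gauss(self.mean[f], math.sqrt(self.var[f]))
                   for f in features)

    def update(self, features, clicked):
        y = 1.0 if clicked else -1.0
        m, v = self._stats(features)
        sd = math.sqrt(v)
        t = max(y * m / sd, -5.0)  # clamp for numerical stability
        vt = pdf(t) / cdf(t)
        wt = vt * (vt + t)
        for f in features:
            s2 = self.var[f]
            self.mean[f] += y * (s2 / sd) * vt
            self.var[f] = s2 * (1.0 - (s2 / v) * wt)
```

Every weight carries a variance, so the model is dense by construction, which is the "Not Sparse" point on the slide. Rarely seen features keep a large variance, so sampled scores vary more for them and they get explored naturally.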

Page 23: Large scale recommendation in e-commerce -- qiang yan

Outline

• Introduction
  – Data in Taobao
  – Recommendation in Taobao
• Approaches to recommendation
  – eTREC
  – Rank
• Lessons we learn
• Conclusion & Challenges

Page 24: Large scale recommendation in e-commerce -- qiang yan

Lessons we learn

• Ways to improve recommender systems

[Figure: Data 40%, Match 30%, Rank 20%, Re-rank 10%]

• Relevance vs. user experience

[Figure: RT, Item-CF, Content, User-CF placed between Relevance and User Experiences]

• Mobile (contextual) features are very important when ranking recommendations on mobile

Page 25: Large scale recommendation in e-commerce -- qiang yan

Outline

• Introduction
  – Data in Taobao
  – Recommendation in Taobao
• Approaches to recommendation
  – eTREC
  – Rank
• Lessons we learn
• Challenges

Page 26: Large scale recommendation in e-commerce -- qiang yan

Challenges

• Heterogeneous data (search, social, POI, image, etc.) for recommendation
• Multimodal inputs: images, speech, QR codes
• Context-aware and interactive recommendation
• Recommendation traffic allocation for a better e-commerce ecosystem

Page 27: Large scale recommendation in e-commerce -- qiang yan

References

• T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich. Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine. In Proc. 27th Internat. Conf. on Machine Learning, 2010.
• H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and L1 regularization. In AISTATS, 2011.
• H. B. McMahan and O. Muralidharan. On calibrated predictions for auction selection mechanisms. CoRR, abs/1211.3955, 2012.
• J. Jiang, J. Lu, G. Zhang, and G. Long. Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop. In Proc. 2011 IEEE World Congress on Services, pp. 490-497, 2011.
• W. Chu, L. Li, et al. Contextual bandits with linear payoff functions. JMLR, 2011.
• P. Auer. Using confidence bounds for exploitation-exploration trade-offs. JMLR, 2002.

Page 28: Large scale recommendation in e-commerce -- qiang yan

WE'RE HIRING

Qiang Yan [email protected]
Chang Liu [email protected]

 
