Reco4J @ Munich Meetup (April 18th)

Alessandro Negro, Reco4J Project @ Munich Meetup, April 2013. Reco4J Project: Intelligent Recommendations for Your Business

Upload: alessandro-negro

Post on 24-May-2015

DESCRIPTION

These slides were presented at the Munich Meetup of April 18th. They present the Reco4J project, its high-level view, and its vision. See the project site for more details: http://www.reco4j.org

TRANSCRIPT

Page 1: Reco4J @ Munich Meetup (April 18th)

Alessandro Negro, Reco4J Project @ Munich Meetup - April 2013

Reco4J Project: Intelligent Recommendations for Your Business

Page 2: Reco4J @ Munich Meetup (April 18th)

Recommender Systems

• A system that can recommend or present items to the user based on the user's interests and interactions
• One of the best ways to provide a personalized customer experience
• Built by exploiting collective intelligence to perform predictions
• Examples: Amazon, YouTube, Netflix, Yahoo, Tripadvisor, Last.fm, IMDb

Page 3: Reco4J @ Munich Meetup (April 18th)

The Example: Netflix

• The world's largest online movie rental service, with 33 million members in 40 countries
• 60% of members select movies based on recommendations (September 2008)
• Netflix Prize: US$ 1,000,000 was awarded to the BellKor's Pragmatic Chaos team, which bested Netflix's own algorithm for predicting ratings by 10.06% (September 2009)
• 75% of the content watched on the service comes from its recommendation engine (April 2012)
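The Netflix Prize scored entries by root-mean-square error (RMSE) on held-out ratings, so the 10.06% figure is a relative reduction in that error. The sketch below shows how such a percentage could be computed. The ratings and the function names are invented for illustration and are not taken from the slides or from the competition data.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which new_rmse undercuts baseline_rmse (0.10 means 10%)."""
    return (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical ratings on a 1-5 scale, made up for this example only.
actual = [4, 3, 5, 2, 1]
baseline = [3, 3, 3, 3, 3]   # a predictor that always answers 3
improved = [4, 3, 4, 2, 2]   # a predictor that is closer to the truth

gain = relative_improvement(rmse(baseline, actual), rmse(improved, actual))
print(f"relative RMSE improvement: {gain:.2%}")
```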

Page 4: Reco4J @ Munich Meetup (April 18th)

Why Recommender Systems

• Standard uses:
  – Increase the number of items sold
  – Sell more diverse items
  – Increase user satisfaction
  – Increase user fidelity
  – Better understand what the user wants
• Advanced uses:
  – Create ad hoc campaigns (per geographic area, per type of user)
  – Optimize product distribution over a wide area for large retail chains

Page 5: Reco4J @ Munich Meetup (April 18th)

Problem

• There are no available software products for state-of-the-art recommender systems
• A high-end recommender engine can be built only through expensive custom projects
• Large-scale user/item datasets require a big data approach
• There is no "best solution"
• There is no "one solution fits all"
• The Netflix winner composed 104 different algorithms
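Composing several algorithms into one predictor can be as simple as averaging their outputs with weights. The sketch below is a minimal weighted blend and is not the method the winning team used. The two component predictors, their names, and the weights are hypothetical stand-ins.

```python
def blend(predictors, weights):
    """Return a predictor that is the weighted average of the given predictors."""
    if len(predictors) != len(weights) or not predictors:
        raise ValueError("need one weight per predictor")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    def blended(user, item):
        weighted = sum(w * p(user, item) for p, w in zip(predictors, weights))
        return weighted / total

    return blended

# Hypothetical component predictors, invented for illustration.
def global_mean(user, item):
    return 3.5  # every rating is predicted as the overall average

def item_average(user, item):
    return {"matrix": 4.5}.get(item, 3.5)  # per-item average with a fallback

combined = blend([global_mean, item_average], weights=[1, 3])
print(combined("alice", "matrix"))
```

In practice the weights would be fitted on held-out ratings rather than chosen by hand.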

Page 6: Reco4J @ Munich Meetup (April 18th)

Solution: Reco4J

A graph-based recommender engine

Page 7: Reco4J @ Munich Meetup (April 18th)

Reco4J Main Goals

• Implement the state of the art in recommendation on top of a graph model
• Provide software / cloud services / consultancy
• Contribute to the RecSys research field

Page 8: Reco4J @ Munich Meetup (April 18th)

Reco4J Features

• Composable models/algorithms
• Persistent models
• Updatable models
• Independent from source knowledge datasets
• Cluster and cloud-ready
• Multitenant
• Social recommendations

Page 9: Reco4J @ Munich Meetup (April 18th)

Reco4J Under the Hood

• J is for Java
• Collaborative filtering algorithms
  – Neighborhood-based methods
  – Latent factor models
• Neo4J Graph Database:
  – Data source repository
  – Persistent model repository
• Hadoop cluster/MapReduce
• Apache Mahout
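To make "latent factor models" concrete, the sketch below factorizes a tiny rating list with stochastic gradient descent. Each user and item gets a short vector, and a rating is predicted as their dot product. It is a generic textbook formulation written in Python for brevity, not Reco4J's Java implementation. The data, function names, and hyperparameters are all assumptions.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_latent_factors(ratings, n_factors=2, epochs=500, lr=0.02, reg=0.02, seed=0):
    """Learn user and item factor vectors from (user, item, rating) triples."""
    rng = random.Random(seed)
    user_f, item_f = {}, {}
    for user, item, _ in ratings:
        if user not in user_f:
            user_f[user] = [rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
        if item not in item_f:
            item_f[item] = [rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
    for _ in range(epochs):
        for user, item, rating in ratings:
            p, q = user_f[user], item_f[item]
            err = rating - dot(p, q)
            for f in range(n_factors):
                pf, qf = p[f], q[f]
                # Gradient step on squared error with L2 regularization.
                p[f] += lr * (err * qf - reg * pf)
                q[f] += lr * (err * pf - reg * qf)
    return user_f, item_f

def squared_error(ratings, user_f, item_f):
    return sum((r - dot(user_f[u], item_f[i])) ** 2 for u, i, r in ratings)

# Hypothetical ratings, invented for illustration.
RATINGS = [
    ("alice", "matrix", 5), ("alice", "titanic", 1),
    ("bob", "matrix", 4), ("bob", "titanic", 2),
    ("carol", "matrix", 1), ("carol", "titanic", 5),
]

users, items = train_latent_factors(RATINGS)
print("training squared error:", squared_error(RATINGS, users, items))
```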

Page 10: Reco4J @ Munich Meetup (April 18th)

Advantages of a graph database

• NoSQL database to handle Big Data issues
• Extensibility
• Not an aggregate-oriented database
• Minimal information needed
• Natural way of representing connections:
  – User-to-item
  – Item-to-item
  – User-to-user
• Graph partitioning (sharding)
• Performance

Page 11: Reco4J @ Munich Meetup (April 18th)

Example: Find neighbors
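The figure for this slide did not survive in the transcript. As a rough illustration of the idea, the sketch below finds a user's neighbors by walking user-to-item edges and then back from each item to the other users who rated it. A plain Python dictionary stands in for the graph database, no Neo4J API is used, and the data and function name are hypothetical.

```python
from collections import Counter

def find_neighbors(rated, user, k=2):
    """Return up to k (neighbor, shared_item_count) pairs for the given user.

    rated maps each user to the set of items that user has rated.
    """
    # Build the reverse edges: item -> users who rated it.
    rated_by = {}
    for other, items in rated.items():
        for item in items:
            rated_by.setdefault(item, set()).add(other)

    # Two-hop traversal: user -> item -> other user.
    shared = Counter()
    for item in rated.get(user, ()):
        for other in rated_by[item]:
            if other != user:
                shared[other] += 1

    # Most shared items first; ties broken alphabetically for stable output.
    return sorted(shared.items(), key=lambda pair: (-pair[1], pair[0]))[:k]

# Hypothetical user-to-item edges, invented for illustration.
RATED = {
    "alice": {"A", "B", "C"},
    "bob": {"A", "B"},
    "carol": {"C"},
    "dave": {"D"},
}

print(find_neighbors(RATED, "alice"))  # -> [('bob', 2), ('carol', 1)]
```

Counting shared items is the simplest possible similarity. A real neighborhood-based method would usually weight the overlap by the ratings themselves.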

Page 12: Reco4J @ Munich Meetup (April 18th)

Why Neo4J?

• Java based
• Embeddable/Extensible
• Native graph storage with native graph processing engine
• Open Source, with commercial version
• Property Graph
• ACID support
• Scalability/HA
• Comprehensive query/traversal options

Page 13: Reco4J @ Munich Meetup (April 18th)

Recommendation Model

Page 14: Reco4J @ Munich Meetup (April 18th)

Persistence Model

Page 15: Reco4J @ Munich Meetup (April 18th)

Persistence Model

Page 16: Reco4J @ Munich Meetup (April 18th)

Persistence Model

Page 17: Reco4J @ Munich Meetup (April 18th)

Reco4J + Hadoop

• Queue-based process
• Operates both on cluster and cloud
• Each process downloads data from Neo4J/Reco4J before or during computation
• Stores data into the Reco4J model
• Scales by increasing the number of:
  – Neo4J nodes (only one master)
  – Hadoop nodes
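The queue-based flow above can be sketched in a few lines: workers pull jobs from a shared queue, fetch the data each job needs, compute a result, and store it in a shared model. The slides do not describe the job format or the computation, so everything here is an assumption. Threads stand in for Hadoop nodes, one dictionary stands in for the Neo4J data source, another for the Reco4J model, and the per-item average is a placeholder computation.

```python
import queue
import threading

def run_workers(jobs, source, n_workers=2):
    """Process every job in the queue and return the resulting model."""
    job_queue = queue.Queue()
    for job in jobs:
        job_queue.put(job)

    model = {}               # stand-in for the persistent model
    lock = threading.Lock()  # guards writes to the shared model

    def worker():
        while True:
            try:
                item = job_queue.get_nowait()
            except queue.Empty:
                return  # no work left
            ratings = source[item]                # "download" the data for this job
            average = sum(ratings) / len(ratings)  # placeholder computation
            with lock:
                model[item] = average              # store the result in the model

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return model

# Hypothetical data source, invented for illustration.
SOURCE = {"m1": [4, 5], "m2": [2, 2, 5]}

print(run_workers(["m1", "m2"], SOURCE))
```

Adding workers raises throughput only while the queue still holds jobs, which mirrors the slide's point about scaling by adding nodes.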

Page 18: Reco4J @ Munich Meetup (April 18th)

Reco4J in the Cloud

• Recommendation as a service (RaaS)
• Reco4J cloud infrastructure offers:
  – Pay as you need
  – Pay as you grow
  – Support for burst
  – Periodical analysis at lower costs
  – Test/evaluate several algorithms on a reduced dataset
  – Compose algorithms dynamically

Page 19: Reco4J @ Munich Meetup (April 18th)

Consultancy Goals

Analysis
Data Source
Exploration
Process Definition
Import Data
Test/Evaluation
Deploy

Page 20: Reco4J @ Munich Meetup (April 18th)

Research Topics

• Real-time recommendation
• Multi-criteria recommender systems
• Recommending to groups
• Parallel algorithms
• Filtering

Page 21: Reco4J @ Munich Meetup (April 18th)

Reco4J Site Analytics

Page 22: Reco4J @ Munich Meetup (April 18th)

Thank you

Alessandro Negro
LinkedIn: http://it.linkedin.com/in/alessandronegro/
Email: [email protected]
Reco4J Site: http://www.reco4j.org
Twitter: @reco4j
GitHub: https://github.com/reco4j