Reco4J Project @ Munich Meetup April 2013

These slides were presented at the Munich Meetup on April 18th, 2013. They present the Reco4J project, a high-level view of it, and its vision.


Reco4J Project

Intelligent Recommendations for Your Business

Recommender Systems
• A system that can recommend or present items to the user based on the user’s interests and interactions
• One of the best ways to provide a personalized customer experience
• Built by exploiting collective intelligence to perform predictions
• Examples: Amazon, YouTube, Netflix, Yahoo, TripAdvisor, IMDb

The Example: Netflix
• The world's largest online movie rental service, with 33 million members in 40 countries
• 60% of members select movies based on recommendations (September 2008)
• Netflix Prize: US$ 1,000,000 was awarded to the BellKor's Pragmatic Chaos team, which bested Netflix's own algorithm for predicting ratings by 10.06% (September 2009)
• 75% of the content watched on the service comes from its recommendation engine (April 2012)

Why Recommender Systems
• Standard uses:
  – Increase the number of items sold
  – Sell more diverse items
  – Increase user satisfaction
  – Increase user fidelity
  – Better understand what the user wants
• Advanced uses:
  – Create ad hoc campaigns (per geographic area, per type of user)
  – Optimize product distribution over a wide area for large retail chains

Problem
• There are no available software products for state-of-the-art recommender systems
• A high-end recommender engine can be built only through expensive custom projects
• Large scale user/item datasets require a big data approach
• There is no "best solution"
• There is no "one solution fits all"
• The Netflix winner composed 104 different algorithms

Solution: Reco4J

A graph-based recommender engine

Reco4J Main Goals
• Implement state-of-the-art recommendation techniques on top of a graph model
• Provide software / cloud services / consultancy
• Contribute to the RecSys research field

Reco4J Features
• Composable models/algorithms
• Persistent models
• Updatable models
• Independent from source knowledge datasets
• Cluster and cloud-ready
• Multi-tenant
• Social recommendations
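
One way to picture composable models: every model exposes the same prediction signature, and a blender combines several of them into a new model with that same signature. The sketch below is illustrative only and is not Reco4J's API; `Predictor`, `global_mean`, `user_mean`, `blend`, the sample ratings, and the weights are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# (user, item) -> predicted rating; every model shares this signature
Predictor = Callable[[str, str], float]

# Hypothetical sample data: (user, item) -> rating
RATINGS: Dict[Tuple[str, str], float] = {
    ("alice", "m1"): 5.0,
    ("alice", "m2"): 3.0,
    ("bob", "m1"): 4.0,
    ("bob", "m3"): 2.0,
}


def global_mean(user: str, item: str) -> float:
    """Baseline model: the mean of all known ratings."""
    return sum(RATINGS.values()) / len(RATINGS)


def user_mean(user: str, item: str) -> float:
    """Baseline model: this user's mean rating, else the global mean."""
    own = [r for (u, _), r in RATINGS.items() if u == user]
    return sum(own) / len(own) if own else global_mean(user, item)


def blend(models: List[Tuple[Predictor, float]]) -> Predictor:
    """Compose several predictors into one by a weighted average."""
    total = sum(weight for _, weight in models)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    def predict(user: str, item: str) -> float:
        return sum(weight * model(user, item) for model, weight in models) / total

    return predict


# The blended model is itself a Predictor, so it can be blended again.
combined = blend([(global_mean, 1.0), (user_mean, 3.0)])
nested = blend([(combined, 1.0), (global_mean, 1.0)])
print(combined("alice", "m3"), nested("alice", "m3"))
```

Because the result of `blend` has the same signature as its inputs, compositions can be nested freely, which is the property that lets an ensemble grow one algorithm at a time.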

Reco4J Under the Hood
• J is for Java
• Collaborative filtering algorithms
  – Neighborhood-based methods
  – Latent factor models
• Neo4J Graph Database:
  – Data source repository
  – Persistent model repository
• Hadoop cluster/MapReduce
• Apache Mahout
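
As a rough illustration of a neighborhood-based method, the sketch below predicts a rating as the similarity-weighted average of the ratings given by the most similar users. It is written in Python for brevity and is not Reco4J code; the `cosine` and `predict` helpers, the sample ratings, and the choice of cosine similarity over co-rated items are assumptions made for the example.

```python
from math import sqrt
from typing import Dict, Optional

Ratings = Dict[str, Dict[str, float]]  # user -> {item: rating}

# Hypothetical sample data
RATINGS: Ratings = {
    "alice": {"m1": 5.0, "m2": 3.0, "m3": 4.0},
    "bob": {"m1": 4.0, "m2": 2.0, "m3": 5.0, "m4": 4.0},
    "carol": {"m1": 1.0, "m2": 5.0, "m4": 2.0},
}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity computed over the items both users rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def predict(ratings: Ratings, user: str, item: str, k: int = 2) -> Optional[float]:
    """Similarity-weighted average over the k most similar users who rated `item`."""
    candidates = [
        (cosine(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbors = sorted(candidates, reverse=True)[:k]
    weight = sum(sim for sim, _ in neighbors)
    if weight == 0:
        return None  # nobody comparable has rated this item
    return sum(sim * rating for sim, rating in neighbors) / weight


print(predict(RATINGS, "alice", "m4"))
```

In this toy data alice's ratings resemble bob's more than carol's, so the prediction for `m4` lands nearer to bob's rating. Practical systems usually mean-center the ratings first, since raw cosine on positive ratings separates users only weakly.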

Advantages of a graph database
• NoSQL database suited to handling big data issues
• Extensibility
• Not an aggregate-oriented database
• Minimal information needed
• Natural way of representing connections:
  – User-to-item
  – Item-to-item
  – User-to-user
• Graph partitioning (sharding)
• Performance

Example: Find neighbors
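
The figure for this slide is not part of the transcript. As a hedged sketch of what a neighbor lookup can look like on a user-to-item graph, the code below walks two hops (user to item, then item to user) and counts how many items each other user shares with the starting user. A plain in-memory adjacency map stands in for the graph database; the data and the `invert` and `find_neighbors` helpers are hypothetical, and no Neo4J API is used.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical user -> item edges (for example, "rated" relationships)
USER_ITEMS: Dict[str, Set[str]] = {
    "alice": {"m1", "m2", "m3"},
    "bob": {"m1", "m3", "m4"},
    "carol": {"m2"},
    "dave": {"m5"},
}


def invert(user_items: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Build the reverse edges: item -> users connected to it."""
    item_users: Dict[str, Set[str]] = {}
    for user, items in user_items.items():
        for item in items:
            item_users.setdefault(item, set()).add(user)
    return item_users


def find_neighbors(user_items: Dict[str, Set[str]], user: str) -> List[Tuple[str, int]]:
    """Return (neighbor, shared item count), most shared items first."""
    item_users = invert(user_items)
    shared: Counter = Counter()
    for item in user_items.get(user, set()):  # hop 1: user -> item
        for other in item_users[item]:        # hop 2: item -> user
            if other != user:
                shared[other] += 1
    # Sort by count descending, then by name so the order is deterministic
    return sorted(shared.items(), key=lambda pair: (-pair[1], pair[0]))


print(find_neighbors(USER_ITEMS, "alice"))  # [('bob', 2), ('carol', 1)]
```

The traversal only touches the items and users actually connected to the starting user, which is why connection-centric queries like this map naturally onto a graph model.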

Why Neo4J?
• Java based
• Embeddable/Extensible
• Native graph storage with native graph processing engine
• Open Source, with commercial version
• Property Graph
• ACID support
• Scalability/HA
• Comprehensive query/traversal options

Recommendation Model

Persistence Model

Reco4J + Hadoop
• Queue-based process
• Operates both on a cluster and in the cloud
• Each process downloads data from Neo4J/Reco4J before or during computation
• Stores data into the Reco4J model
• Scales by increasing the number of:
  – Neo4J nodes (only one master)
  – Hadoop nodes
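
The queue-based flow described above (take a job, download the data, compute, store the result in the model) can be sketched as follows. This is a single-machine illustration under stated assumptions, not the Reco4J/Hadoop implementation: threads stand in for worker processes, the `SOURCE` dictionary stands in for Neo4J, the `MODEL` dictionary stands in for the persistent Reco4J model, and the per-user mean is a placeholder computation.

```python
import queue
import threading
from typing import Dict, List

# Hypothetical in-memory stand-ins for the data source and the model store
SOURCE: Dict[str, List[float]] = {"alice": [5.0, 3.0], "bob": [4.0, 2.0, 3.0]}
MODEL: Dict[str, float] = {}
MODEL_LOCK = threading.Lock()


def worker(jobs: "queue.Queue[str]") -> None:
    """Take jobs until the queue is empty: download, compute, store."""
    while True:
        try:
            user = jobs.get_nowait()
        except queue.Empty:
            return
        ratings = SOURCE[user]              # download the data for this job
        mean = sum(ratings) / len(ratings)  # placeholder computation
        with MODEL_LOCK:
            MODEL[user] = mean              # store the result into the model
        jobs.task_done()


def run(users: List[str], n_workers: int = 2) -> None:
    """Fill the queue, start the workers, and wait for them to finish."""
    jobs: "queue.Queue[str]" = queue.Queue()
    for user in users:
        jobs.put(user)
    threads = [threading.Thread(target=worker, args=(jobs,)) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


run(list(SOURCE))
print(MODEL)
```

Adding workers increases throughput without changing the job logic, which mirrors the idea of scaling by adding nodes.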

Reco4J in the Cloud
• Recommendation as a service (RaaS)
• Reco4J cloud infrastructure offers:
  – Pay as you need
  – Pay as you grow
  – Support for bursts
  – Periodic analysis at lower costs
  – Test/evaluate several algorithms on a reduced dataset
  – Compose algorithms dynamically

Consultancy Goals

Data Source

Process Definition

Import Data

Research Topics
• Real-time recommendation
• Multi-criteria recommender systems
• Recommending to groups
• Parallel algorithms
• Filtering

Reco4J Site Analytics

Thank you

Alessandro Negro
LinkedIn: http://
Email: [email protected]
Reco4J Site: http://
Twitter: @reco4j
GitHub: https://