reco4j @ munich meetup (april 18th)
DESCRIPTION
These slides were presented at the Munich Meetup on April 18th. They present the reco4j project, its high-level view, and its vision. See the project site for more details: http://www.reco4j.org
TRANSCRIPT
Alessandro Negro Reco4J Project @ Munich Meetup -‐ April 2013
Reco4J Project
Intelligent Recommendations for Your Business
Recommender Systems
• A system that can recommend or present items to the user based on the user's interests and interactions
• One of the best ways to provide a personalized customer experience
• Built by exploiting collective intelligence to perform predictions
• Examples: Amazon, YouTube, Netflix, Yahoo, Tripadvisor, Last.fm, IMDb
The Example: Netflix
• The world's largest online movie rental service, with 33 million members in 40 countries
• 60% of members select movies based on recommendations (September 2008)
• Netflix Prize: US$ 1,000,000 was awarded to the BellKor's Pragmatic Chaos team, which bested Netflix's own algorithm for predicting ratings by 10.06% (September 2009)
• 75% of the content watched on the service comes from its recommendation engine (April 2012)
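The 10.06% figure is a relative reduction in root-mean-square error (RMSE) on predicted ratings. A minimal sketch of that arithmetic follows. The two RMSE values are the publicly reported test-set scores as commonly cited, not numbers taken from these slides, and the function names are illustrative.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which new_rmse improves on baseline_rmse."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Assumed values (not from the slides): commonly cited test-set RMSE of
# Netflix's own algorithm and of the winning entry.
BASELINE_RMSE = 0.9525
WINNER_RMSE = 0.8567

improvement = relative_improvement(BASELINE_RMSE, WINNER_RMSE)
print(f"{improvement:.2%}")
```

Because the measure is relative, the same absolute RMSE gain counts for more against a stronger baseline.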
Why Recommender Systems
• Standard uses:
– Increase the number of items sold
– Sell more diverse items
– Increase user satisfaction
– Increase user fidelity
– Better understand what the user wants
• Advanced uses:
– Create ad hoc campaigns (per geographic area, per type of user)
– Optimize product distribution over a wide area for large retail chains
Problem
• There are no available software products for state-of-the-art recommender systems
• A high-end recommender engine can be built only through expensive custom projects
• Large-scale user/item datasets require a big data approach
• There is no "best solution"
• There is no "one solution fits all"
• The Netflix winner composed 104 different algorithms
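One way to read "composed" is blending: several predictors are combined so that their errors partly cancel. The sketch below is a toy two-predictor linear blend with the weight chosen by grid search on held-out ratings. It is an illustration of the general idea only, not the method used by the Netflix Prize winner or by Reco4J; the data and function names are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )


def blend(predictions, weights):
    """Weighted sum of several predictors' outputs (one row per predictor)."""
    return [
        sum(w * row[i] for w, row in zip(weights, predictions))
        for i in range(len(predictions[0]))
    ]


def best_two_way_blend(pred_a, pred_b, actual, steps=100):
    """Grid-search w in [0, 1] for w*a + (1-w)*b on held-out ratings."""
    best_w, best_err = 0.0, float("inf")
    for k in range(steps + 1):
        w = k / steps
        err = rmse(blend([pred_a, pred_b], [w, 1 - w]), actual)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


# Toy held-out ratings and two hypothetical predictors with opposite biases.
actual = [4.0, 3.0, 5.0, 2.0]
pred_a = [4.5, 3.5, 5.5, 2.5]  # always 0.5 too high
pred_b = [3.5, 2.5, 4.5, 1.5]  # always 0.5 too low

w, err = best_two_way_blend(pred_a, pred_b, actual)
print(w, err)
```

Each predictor alone has an RMSE of 0.5 on this toy data, while the equal-weight blend is exact, which is the effect that motivates composing many algorithms.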
Solution: Reco4J
A graph-based recommender engine
Reco4J Main Goals
• Implement the state of the art in recommendation on top of a graph model
• Provide software / cloud services / consultancy
• Contribute to the RecSys research field
Reco4J Features
• Composable models/algorithms
• Persistent models
• Updatable models
• Independent from source knowledge datasets
• Cluster and cloud-ready
• Multitenant
• Social recommendations
Reco4J Under the Hood
• J is for Java
• Collaborative filtering algorithms
– Neighborhood-based methods
– Latent factor models
• Neo4J Graph Database:
– Data source repository
– Persistent model repository
• Hadoop cluster/MapReduce
• Apache Mahout
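As an illustration of what a latent factor model does, here is a minimal sketch of matrix factorization trained by stochastic gradient descent, one common form of the technique. The slides do not say which variant Reco4J implements, so the algorithm choice, hyperparameters, toy ratings, and function names are all assumptions. It is written in Python for brevity, although Reco4J itself is Java.

```python
import random


def train_mf(ratings, n_users, n_items, n_factors=2, lr=0.01, reg=0.02,
             epochs=500, seed=0):
    """Fit factor vectors so that dot(P[u], Q[i]) approximates rating r."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
         for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
         for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def mean_squared_error(P, Q, ratings):
    """Average squared error over the given (user, item, rating) triples."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)


# Hypothetical toy data: (user index, item index, rating).
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]

P0, Q0 = train_mf(ratings, 3, 3, epochs=0)  # untrained baseline
P, Q = train_mf(ratings, 3, 3)              # trained model
mse_before = mean_squared_error(P0, Q0, ratings)
mse_after = mean_squared_error(P, Q, ratings)
```

The trained factors can also score pairs that were never rated, for example `predict(P, Q, 0, 2)`, which is what makes the model usable for recommendation.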
Advantages of a graph database
• NoSQL database to handle Big Data issues
• Extensibility
• Not an aggregate-oriented database
• Minimal information needed
• Natural way of representing connections:
– User-to-item
– Item-to-item
– User-to-user
• Graph partitioning (sharding)
• Performance
Example: Find neighbors
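The figure for this slide is not part of the transcript. As a rough sketch of what finding neighbors on a user-to-item graph can look like, the code below does a two-hop traversal (user, rated items, other users who rated them) and ranks the candidates by cosine similarity. The in-memory dictionaries stand in for a graph database, and the data, similarity measure, and function names are assumptions rather than Reco4J's actual implementation.

```python
import math
from collections import defaultdict

# Hypothetical toy graph: user -> {item: rating}, i.e. user-to-item edges
# carrying a rating property.
rated = {
    "alice": {"matrix": 5.0, "inception": 4.0, "titanic": 1.0},
    "bob": {"matrix": 4.0, "inception": 5.0},
    "carol": {"titanic": 5.0, "notebook": 4.0},
    "dave": {"notebook": 5.0},
}


def build_item_index(rated):
    """Reverse the edges: item -> set of users who rated it."""
    index = defaultdict(set)
    for user, items in rated.items():
        for item in items:
            index[item].add(user)
    return index


def cosine(a, b):
    """Cosine similarity of two sparse rating vectors stored as dicts."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def find_neighbors(user, rated, k=2):
    """Return up to k (neighbor, similarity) pairs reachable in two hops."""
    index = build_item_index(rated)
    candidates = {other for item in rated[user] for other in index[item]}
    candidates.discard(user)
    scored = [(cosine(rated[user], rated[c]), c) for c in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored[:k]]


neighbors = find_neighbors("alice", rated)
print(neighbors)
```

Only users who share at least one item with "alice" are ever scored, so "dave" is never visited. That locality is the practical appeal of doing this as a graph traversal.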
Why Neo4J?
• Java based
• Embeddable/Extensible
• Native graph storage with native graph processing engine
• Open source, with commercial version
• Property graph
• ACID support
• Scalability/HA
• Comprehensive query/traversal options
Recommendation Model
Persistence Model
Reco4J + Hadoop
• Queue-based process
• Operates both on cluster and cloud
• Each process downloads data from Neo4J/Reco4J before or during computation
• Stores data into the Reco4J model
• Scales by increasing the number of:
– Neo4J nodes (only one master)
– Hadoop nodes
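The queue-based flow above (take a job, download the data it needs, compute, store the result into the model) can be sketched with a plain in-process work queue. This is only an analogy under stated assumptions: threads stand in for Hadoop nodes, a dictionary stands in for Neo4J, per-user mean rating stands in for the real computation, and every name here is hypothetical.

```python
import queue
import threading

# Stand-in for the graph store (assumption): user -> {item: rating}.
SOURCE = {
    "alice": {"a": 5.0, "b": 3.0},
    "bob": {"a": 4.0, "b": 2.0, "c": 3.0},
    "carol": {"c": 1.0},
}


def worker(jobs, model, lock):
    """Pull user ids off the queue until it is empty."""
    while True:
        try:
            user = jobs.get_nowait()
        except queue.Empty:
            return
        ratings = SOURCE[user]                       # "download" step
        mean = sum(ratings.values()) / len(ratings)  # computation step
        with lock:
            model[user] = mean                       # store into the model


def run(n_workers=2):
    """Fill the queue, run the workers to completion, return the model."""
    jobs = queue.Queue()
    for user in SOURCE:
        jobs.put(user)
    model, lock = {}, threading.Lock()
    threads = [threading.Thread(target=worker, args=(jobs, model, lock))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return model


model = run()
print(sorted(model.items()))
```

Adding workers changes only `n_workers`; the jobs themselves stay independent, which is the property that lets this kind of process scale out by adding nodes.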
Reco4J in the Cloud
• Recommendation as a service (RaaS)
• Reco4J cloud infrastructure offers:
– Pay as you need
– Pay as you grow
– Support for bursts
– Periodic analysis at lower cost
– Test/evaluate several algorithms on a reduced dataset
– Compose algorithms dynamically
Consultancy Goals
Analysis
Data Source
Exploration
Process Definition
Import Data
Test/Evaluation
Deploy
Research Topics
• Real-time recommendation
• Multi-criteria recommender systems
• Recommending to groups
• Parallel algorithms
• Filtering
Reco4J Site Analytics
Thank you
Alessandro Negro
LinkedIn: http://it.linkedin.com/in/alessandronegro/
Email: [email protected]
Reco4J Site: http://www.reco4j.org
Twitter: @reco4j
GitHub: https://github.com/reco4j