Recommendation Engines: Some Practical and Theoretical Considerations


Upload: peter-wittek

Post on 11-May-2015

DESCRIPTION

A talk given at Youku.com about the machine learning and scalability aspects of contemporary recommendation systems.

TRANSCRIPT

Page 1: Recommendation Engines: Some Practical and Theoretical Considerations

Introduction Collaborative Filtering Sparsity and Scalability Contextuality Conclusions

Recommendation Engines: Some Practical and Theoretical Considerations

Research Seminar at Youku.com

Peter Wittek

University of Borås & Tsinghua University

April 22, 2013

Peter Wittek Recommendation Engines

Page 2: Recommendation Engines: Some Practical and Theoretical Considerations

Outline

1 Introduction

2 Collaborative Filtering

3 Sparsity and Scalability

4 Contextuality

5 Conclusions

Page 3: Recommendation Engines: Some Practical and Theoretical Considerations

What It Is About

Objective: Predict user preferences
Content-based filtering versus collaborative filtering

Users versus content/metadata indexing
Hybrid systems

Roots in information retrieval, relevance feedback

Page 4: Recommendation Engines: Some Practical and Theoretical Considerations

An Example: Slope One

A simple baseline method
Easy to implement, efficient

Slope One

            Item I    Item J
User A      1         1.5
User B      2         ?

Difference for User A: 1.5 − 1 = 0.5

Prediction for User B on Item J: ? = 2 + (1.5 − 1) = 2.5

Page 5: Recommendation Engines: Some Practical and Theoretical Considerations

Background Operations

Memory-based
  Simple
  Robust
  Cosine dissimilarity, Euclidean distance, etc.
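To make the memory-based approach concrete, here is a small sketch of the two measures named above, used to find the nearest neighbour of a user. It assumes users are represented as equal-length rating vectors with 0 standing in for "not rated"; that encoding and all names are illustrative choices, not something the slides prescribe.

```python
import math


def cosine_dissimilarity(u, v):
    """Return 1 - cosine similarity of two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # convention chosen here for all-zero vectors
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Return the Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_neighbour(target, others, distance):
    """Return the key in `others` whose vector is closest to `target`."""
    return min(others, key=lambda name: distance(target, others[name]))


# Hypothetical ratings of four items; 0 means "not rated" in this sketch.
alice = [5, 3, 0, 1]
others = {
    "bob": [4, 3, 0, 1],
    "carol": [1, 0, 5, 4],
}
closest_by_cosine = nearest_neighbour(alice, others, cosine_dissimilarity)
closest_by_euclid = nearest_neighbour(alice, others, euclidean_distance)
```

Nothing is trained here: the stored ratings are the model, which is why such methods are simple and robust but have to scan the data at query time.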

Model-based
  Full array of machine learning algorithms

Figure: The kernel trick, panels (a) and (b)

Page 7: Recommendation Engines: Some Practical and Theoretical Considerations

User Recommendations

Even a few ratings are more accurate than metadata
The cold start problem
Analysing users’ behaviour, preferences, ratings
Explicit versus implicit data
Scalability and sparsity

Page 8: Recommendation Engines: Some Practical and Theoretical Considerations

Learning Methods

Simple ones: k-NN, decision trees
More intricate ones: matrix factorization, support vector machines, artificial neural networks

A feed-forward neural network

Page 9: Recommendation Engines: Some Practical and Theoretical Considerations

An Example Pipeline

Page 11: Recommendation Engines: Some Practical and Theoretical Considerations

The Matrix We Are Facing

High-dimensional
Sparse, missing elements

0.01%-0.1% nonzero
Low-rank

User types

The problem as a sparse matrix

Users (≈ 10^7 − 10^8), Videos (≈ 10^8 − 10^9)

? ? ? 4 ?
? 1 ? ? ?
? ? ? ? 5
1 ? ? ? 2
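A sparse matrix like this is usually stored as a list of its known entries rather than as a full grid. The sketch below does this for a 4 × 5 toy matrix with five known ratings, patterned after the example; which axis holds users and which holds videos is an assumption, as are all names. The last lines apply the 0.01% density figure to the lower end of the quoted sizes.

```python
# Known ratings keyed by (row, column); every other cell is simply absent.
known = {
    (0, 3): 4,
    (1, 1): 1,
    (2, 4): 5,
    (3, 0): 1,
    (3, 4): 2,
}
n_rows, n_cols = 4, 5


def rating(row, col):
    """Return the stored rating, or None for an unknown cell."""
    return known.get((row, col))


density = len(known) / (n_rows * n_cols)

# Back-of-the-envelope count for 10^7 users and 10^8 videos at 0.01% nonzero.
cells = 10**7 * 10**8
stored = cells // 10_000  # 0.01% of all cells
print(density, stored)
```

Even at that density roughly 10^11 ratings remain, which is why the size of each stored entry matters.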

Page 12: Recommendation Engines: Some Practical and Theoretical Considerations

Dealing With Sparsity

Rating from 1 to 5
That is three bits at best

For Netflix:
  log2 |users| = 18.8
  log2 |movies| = 14.1
  The numbers date to the competition (pre-2009).

Each entry will barely take three bytes
Further tweaks can halve the storage requirement
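One way to arrive at the three-byte figure is sketched below. It assumes that ratings are grouped per movie, so each entry only needs a user index and a rating; that grouping is an interpretation, not something stated on the slide. The data set sizes (480,189 users and 17,770 movies) are the publicly reported Netflix Prize figures and agree with the logarithms quoted above; `pack` and `unpack` are hypothetical helper names.

```python
import math

N_USERS = 480_189   # publicly reported Netflix Prize size (not on the slide)
N_MOVIES = 17_770   # publicly reported Netflix Prize size (not on the slide)

user_bits = math.ceil(math.log2(N_USERS))    # bits for a user index
movie_bits = math.ceil(math.log2(N_MOVIES))  # bits for a movie index
rating_bits = 3                              # ratings 1..5 fit in three bits


def pack(user_id, rating):
    """Pack a (user index, rating) pair into three bytes."""
    if not 0 <= user_id < (1 << user_bits) or not 1 <= rating <= 5:
        raise ValueError("user_id or rating out of range")
    word = (user_id << rating_bits) | rating
    return word.to_bytes(3, "big")


def unpack(blob):
    """Inverse of pack: return (user index, rating)."""
    word = int.from_bytes(blob, "big")
    return word >> rating_bits, word & ((1 << rating_bits) - 1)


entry = pack(123_456, 4)
print(user_bits, movie_bits, user_bits + rating_bits, len(entry))
```

Under these assumptions a user index and a rating need 19 + 3 = 22 bits, which just fits into 24 bits; grouping per user instead would need 15 + 3 = 18 bits per entry.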

Page 13: Recommendation Engines: Some Practical and Theoretical Considerations

Low Rank Approximation

Goal: Estimate ratings for unknown elements

Singular Value Decomposition

$A = U S V^{T}$

Here S is a diagonal matrix containing the singular values in decreasing order.
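Keeping only the largest singular values gives the low-rank approximation. As a rough illustration, the sketch below finds the largest singular value and its two vectors by power iteration in plain Python and builds the rank-1 approximation from them. It assumes a complete, non-zero matrix with non-negative entries; how the missing ratings are filled in beforehand is not specified on the slide and is left out here. All function names are illustrative.

```python
import math


def matvec(m, v):
    """Multiply matrix `m` (list of rows) by vector `v`."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def top_singular_triplet(a, iterations=100):
    """Approximate the largest singular value of `a` and its vectors (u, s, v)."""
    at = transpose(a)
    v = [1.0] * len(a[0])             # start vector; suits non-negative data
    for _ in range(iterations):
        w = matvec(at, matvec(a, v))  # one power-iteration step on A^T A
        length = norm(w)
        v = [x / length for x in w]
    av = matvec(a, v)
    s = norm(av)
    u = [x / s for x in av]
    return u, s, v


def rank_one(a, iterations=100):
    """Return the rank-1 approximation s * u * v^T of `a`."""
    u, s, v = top_singular_triplet(a, iterations)
    return [[s * ui * vj for vj in v] for ui in u]


# Hypothetical complete ratings: three users (rows) by three items (columns).
ratings = [
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
]
approximation = rank_one(ratings)
```

A production system would rely on a sparse, truncated SVD routine from a numerical library instead; the point here is only that a few singular triplets already capture the dominant structure.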

Page 14: Recommendation Engines: Some Practical and Theoretical Considerations

Conceptual Dynamics

The matrix is not static
Incremental biseration + Gaussian blurring + 3D visualization

Figure: Snapshots on BBC videos, panels (a) and (b)

Page 15: Recommendation Engines: Some Practical and Theoretical Considerations

Scalability

Learning algorithms are computationally demanding
Some parallelize well
Apache Mahout originally grew out of a scalable CF library

Based on Apache Hadoop
MapReduce: scaling out on a large number of nodes of commodity hardware
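The MapReduce pattern itself is easy to show in miniature. The following single-process toy counts how often two videos were watched by the same user, a typical building block of item-based collaborative filtering. It does not use the Hadoop or Mahout APIs; the function names and the `history` data are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_user(user, items):
    """Map step: emit ((item_a, item_b), 1) for every pair one user watched."""
    for pair in combinations(sorted(items), 2):
        yield pair, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the partial counts for one item pair."""
    return key, sum(values)


# Hypothetical viewing history: user -> set of videos.
history = {
    "u1": {"v1", "v2", "v3"},
    "u2": {"v1", "v2"},
    "u3": {"v2", "v3"},
}
emitted = [kv for user, items in history.items() for kv in map_user(user, items)]
cooccurrence = dict(reduce_counts(k, vs) for k, vs in shuffle(emitted).items())
print(cooccurrence)
```

Each map call only looks at one user and each reduce call only at one item pair, so both phases can be spread over many nodes without coordination.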

Page 16: Recommendation Engines: Some Practical and Theoretical Considerations

Real-Time Systems

Update operations and queries
Parallel and distributed execution
Acceleration by graphics hardware

Massively parallel architecture

Figure: GPU compute device made up of compute units, each containing stream cores

Page 18: Recommendation Engines: Some Practical and Theoretical Considerations

What Is Contextuality

The users’ preferences are not static
The preferences are a function of the present context
Indirect clues: current browsing history, recent purchase history, etc.
Infer context-type/micro-profile
Small improvements over baseline methods have already been reported

Page 19: Recommendation Engines: Some Practical and Theoretical Considerations

Enter Quantum Mechanics

Foraging theory: how to maximize net energy intake in a patchy environment
Quantum-like contextual patterns emerge from classical decisions
Foraging decisions can translate to problems in recommendation systems

Forget Bayes’ rule: Consider Lüders’ rule

$$\| P_{b_1} |\psi_{a_1}\rangle \|^2 = \frac{\| P_{b_1} P_{a_1} |\psi\rangle \|^2}{\| P_{a_1} |\psi\rangle \|^2} \qquad (1)$$

Two projection operators generally do not commute, leading to a sequential, context-sensitive model of decision making.
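A two-dimensional toy case shows the order effect numerically. The sketch below builds two projectors that do not commute, evaluates equation (1), and compares the probability of observing a1 then b1 with that of b1 then a1. The particular state and projectors are arbitrary choices for illustration and carry no meaning for recommendation data.

```python
import math


def matvec(m, v):
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sq_norm(v):
    return sum(x * x for x in v)


def projector(unit):
    """Projector onto the line spanned by a real unit vector."""
    return [[a * b for b in unit] for a in unit]


s = 1 / math.sqrt(2)
P_a = projector([1.0, 0.0])  # outcome a1
P_b = projector([s, s])      # outcome b1, at 45 degrees to a1
psi = [math.cos(math.pi / 6), math.sin(math.pi / 6)]  # unit state vector


def luders(p_second, p_first, state):
    """Equation (1): probability of the second outcome given the first."""
    after_first = matvec(p_first, state)
    return sq_norm(matvec(p_second, after_first)) / sq_norm(after_first)


b_given_a = luders(P_b, P_a, psi)
joint_ab = sq_norm(matvec(P_b, matvec(P_a, psi)))  # a1 first, then b1
joint_ba = sq_norm(matvec(P_a, matvec(P_b, psi)))  # b1 first, then a1
commute = matmul(P_a, P_b) == matmul(P_b, P_a)
print(b_given_a, joint_ab, joint_ba, commute)
```

In this setup the two sequential probabilities come out as about 0.375 and 0.467, whereas a classical joint probability would not depend on the order of the two questions.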

Page 21: Recommendation Engines: Some Practical and Theoretical Considerations

Summary

A large part of recommendation systems is an engineering problem

Hybridise collaborative and content-based filtering
Assemble and tune machine learning pipelines
Measure prediction quality and monetary gains
Continue tuning

Exciting theoretical considerations await further research
