CS 572: Information Retrieval (Emory University). Lecture 21: Recommender Systems.

Page 1: CS 572: Information Retrieval, Emory University (eugene/cs572/lectures/lecture21), 2016-04-04.

CS 572: Information Retrieval

Recommender Systems

Acknowledgements

Many slides in this lecture are adapted from Xavier Amatriain (Netflix) and Y. Koren (Yahoo).

Page 2:

Big Picture

Part 1: Classical IR: indexing, ranking, metrics

Part 2: Web: crawling, indexing, social web

Part 3: User-centric Information Access

– Recommender Systems

– Personalized Search

– Search interfaces and “intelligent assistants”

4/4/2016 CS 572: Information Retrieval. Spring 2016 2

Page 3:

Recommender Systems

• Recommender Systems: general approaches (today)

– Applications: News, Social recommendation (Wed)

• User feedback in Search (Wed)

• Personalized search (4/11)

• Search interfaces (4/13)

• Question Answering (4/18, Denis Savenkov)

– Collaborative Filtering

• Application: News Recommendation

– Wed: Applications to Search

• Improving Search Ranking with User Feedback

Page 4:

Logistics for next 2 weeks

• Monday, 18 April: Question Answering (Denis Savenkov)

• Wed, April 20: Final project Milestone presentations:

– 10 minutes (max 8 slides)

– Email slides or URL to me by Tuesday night, 4/19.

Page 5:

Information Retrieval Until Now

[Diagram] Query string → IR system over a document corpus → ranked documents (1. Doc1, 2. Doc2, 3. Doc3, …)

Page 6:

Information Overload Is Real

Page 7:

Recommender Systems Help

Page 8:

Problem Definition: Recommendation

Page 9:

Recommendation: Formal Problem

Page 10:

RS: Utility Matrix

[Table] A sparse user-item utility matrix: users Alice, Bob, Carol, and David (rows) by items King Kong, LOTR, Matrix, and Nacho Libre (columns), with only a few known ratings (values such as 0.2, 0.3, 0.4, 0.5, 1) and most cells empty.

Page 11:

Recommendation System: Process

Page 12:

Paradigms of recommender systems

Recommender systems reduce information overload by estimating relevance

Page 13:

Paradigms of recommender systems

Personalized recommendations

Page 14:

Paradigms of recommender systems

Collaborative: "Tell me what's popular among my peers"

Page 15:

Paradigms of recommender systems

Content-based: "Show me more of what I've liked"

Page 16:

Paradigms of recommender systems

Knowledge-based: "Tell me what fits based on my needs"

Page 17:

Paradigms of recommender systems

Hybrid: combinations of various inputs and/or composition of different mechanisms

Page 18:

Key Problems

• Gathering “known” ratings for the matrix

• Extrapolating unknown ratings from known ratings

– Mainly interested in high unknown ratings

• Evaluating extrapolation methods

Page 19:

Gathering Ratings

• Explicit

– Ask people to rate items

– Doesn’t work well in practice – people can’t be bothered

• Implicit

– Learn ratings from user actions

• e.g., purchase implies high rating

– What about low ratings?

Page 20:

Rating interface

Page 21:

Extrapolating Utilities

• Key problem: matrix U is sparse

– most people have not rated most items

• Three approaches

– Content-based

– Collaborative

– Hybrid

Page 22:

Main Approaches

Page 23:

Tradeoffs

• Collaborative
– Pros: no knowledge-engineering effort, serendipity of results, learns market segments
– Cons: requires some form of rating feedback, cold start for new users and new items

• Content-based
– Pros: no community required, comparison between items possible
– Cons: content descriptions necessary, cold start for new users, no surprises

• Knowledge-based
– Pros: deterministic recommendations, assured quality, no cold start, can resemble a sales dialogue
– Cons: knowledge-engineering effort to bootstrap, basically static, does not react to short-term trends

Page 24:

What works (in the “real world”)

Page 25:

COLLABORATIVE FILTERING

Page 26:

Collaborative Filtering (CF)

• The most prominent approach to generating recommendations
– used by large, commercial e-commerce sites
– well understood; various algorithms and variations exist
– applicable in many domains (books, movies, DVDs, …)

• Approach
– use the "wisdom of the crowd" to recommend items

• Basic assumption and idea
– users give ratings to catalog items (implicitly or explicitly)
– customers who had similar tastes in the past will have similar tastes in the future

Page 27:

User-based nearest-neighbor collaborative filtering (1)

• The basic technique:
– Given an "active user" (Alice) and an item I not yet seen by Alice
– The goal is to estimate Alice's rating for this item, e.g., by:
• finding a set of users (peers) who liked the same items as Alice in the past and who have rated item I
• using, e.g., the average of their ratings to predict whether Alice will like item I
• doing this for all items Alice has not seen and recommending the best-rated

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 3

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

Page 28:

Collaborative Filtering in Action

[Diagram] Each user in the database is a vector of item ratings (items A–Z, e.g., A:9, B:3, Z:5). The active user's rating vector is correlation-matched against the user database; the best-matching neighbor is selected, and items the neighbor rated highly but the active user has not seen (e.g., item C) are extracted as recommendations.

Page 29:

User-based nearest-neighbor collaborative filtering (2)

• Some first questions:
– How do we measure similarity?
– How many neighbors should we consider?
– How do we generate a prediction from the neighbors' ratings?

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 3

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

Page 30:

Measuring user similarity

• A popular similarity measure in user-based CF: Pearson correlation

sim(a, b) = Σ_{p∈P} (r_{a,p} − r̄_a)(r_{b,p} − r̄_b) / ( sqrt(Σ_{p∈P} (r_{a,p} − r̄_a)²) · sqrt(Σ_{p∈P} (r_{b,p} − r̄_b)²) )

a, b: users; r_{a,p}: rating of user a for item p; P: set of items rated by both a and b; r̄_a, r̄_b: the users' average ratings. Possible similarity values are between −1 and 1.

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 3

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

sim(Alice, User1) = 0.85; sim(Alice, User2) = 0.70; sim(Alice, User4) = −0.79
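The Pearson measure on this slide can be sketched in plain Python over the running example table. One assumption (the slide leaves it open): user means are computed over the co-rated items only; with that convention the code reproduces the similarity values shown, up to rounding.

```python
# Pearson correlation between two users over their co-rated items P,
# using the running example table (Alice plus User1..User4).
from math import sqrt

ratings = {
    "Alice": {"Item1": 5, "Item2": 3, "Item3": 4, "Item4": 4},
    "User1": {"Item1": 3, "Item2": 1, "Item3": 2, "Item4": 3, "Item5": 3},
    "User2": {"Item1": 4, "Item2": 3, "Item3": 4, "Item4": 3, "Item5": 5},
    "User3": {"Item1": 3, "Item2": 3, "Item3": 1, "Item4": 5, "Item5": 4},
    "User4": {"Item1": 1, "Item2": 5, "Item3": 5, "Item4": 2, "Item5": 1},
}

def pearson(a, b):
    """Pearson correlation of users a and b over their co-rated items."""
    common = sorted(set(ratings[a]) & set(ratings[b]))
    if not common:
        return 0.0
    mean_a = sum(ratings[a][p] for p in common) / len(common)
    mean_b = sum(ratings[b][p] for p in common) / len(common)
    num = sum((ratings[a][p] - mean_a) * (ratings[b][p] - mean_b) for p in common)
    den_a = sqrt(sum((ratings[a][p] - mean_a) ** 2 for p in common))
    den_b = sqrt(sum((ratings[b][p] - mean_b) ** 2 for p in common))
    if den_a == 0 or den_b == 0:
        return 0.0
    return num / (den_a * den_b)

print(round(pearson("Alice", "User1"), 2))  # 0.85
print(round(pearson("Alice", "User4"), 2))  # -0.79
```

sim(Alice, User2) comes out as 0.7071 here, which the slide rounds to 0.70.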

Page 31:

Pearson correlation

• Takes differences in rating behavior into account

• Works well in typical domains compared with alternative measures, such as cosine similarity

[Chart] Ratings (0–5) of Alice, User1, and User4 on Item1–Item4: Alice's and User1's curves rise and fall together, while User4's moves in the opposite direction.

Page 32:

Making predictions

• A common prediction function:

pred(a, p) = r̄_a + Σ_{b∈N} sim(a, b) · (r_{b,p} − r̄_b) / Σ_{b∈N} |sim(a, b)|

• Calculate whether the neighbors' ratings for the unseen item i are higher or lower than their average

• Combine the rating differences, using the similarity as a weight

• Add/subtract the neighbors' bias from the active user's average and use this as a prediction
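A minimal sketch of this weighted-deviation prediction, with the neighbor similarities (0.85 for User1, 0.70 for User2, from the Pearson slide) and the users' average ratings hard-coded from the running example so the snippet stands alone:

```python
# Predict Alice's rating for Item5 from two neighbors:
# active user's mean plus the similarity-weighted rating deviations.
neighbor_ratings = {"User1": 3, "User2": 5}    # their ratings for Item5
neighbor_means = {"User1": 2.4, "User2": 3.8}  # their average ratings
sims = {"User1": 0.85, "User2": 0.70}          # similarity to Alice
alice_mean = 4.0                               # Alice's average rating

def predict(active_mean, neighbors):
    num = sum(sims[b] * (neighbor_ratings[b] - neighbor_means[b]) for b in neighbors)
    den = sum(abs(sims[b]) for b in neighbors)
    return active_mean + num / den

print(round(predict(alice_mean, ["User1", "User2"]), 2))  # 4.87
```

The predicted 4.87 is well above Alice's own average, so Item5 would rank high among her recommendations.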

Page 33:

Significance Weighting

• Important not to trust correlations based on very few co-rated items.

• Include significance weights, sa,u, based on number of co-rated items, m.

w_{a,u} = s_{a,u} · c_{a,u}, where s_{a,u} = m/50 if m ≤ 50, and s_{a,u} = 1 if m > 50
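As a one-function sketch (threshold 50 as on this slide), damping a correlation c when the number of co-rated items m is small:

```python
# Significance weighting: scale the correlation c down linearly when the
# number of co-rated items m is below the threshold (50 on the slide).
def significance_weight(c, m, threshold=50):
    s = 1.0 if m > threshold else m / threshold
    return s * c

print(significance_weight(0.9, 100))  # 0.9  (enough co-rated items)
print(significance_weight(0.9, 25))   # 0.45 (only 25 co-rated items)
```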

Page 34:

Collaborative Filtering Method

• Weight all users with respect to similarity with the active user.

• Select a subset of the users (neighbors) to use as predictors.

• Normalize ratings and compute a prediction from a weighted combination of the selected neighbors’ ratings.

• Present items with highest predicted ratings as recommendations.

Page 35:

Making recommendations

• Making predictions is typically not the ultimate goal

• Usual approach (in academia)

– Rank items based on their predicted ratings

• However

– This might lead to the inclusion of (only) niche items

– In practice also: Take item popularity into account

• Approaches

– "Learning to rank"

• Optimize according to a given rank evaluation metric (see later)

Page 36:

Problems with Collaborative Filtering

• Cold Start: There needs to be enough other users already in the system to find a match.

• Sparsity: If there are many items to be recommended, even if there are many users, the user/ratings matrix is sparse, and it is hard to find users that have rated the same items.

• First Rater: Cannot recommend an item that has not been previously rated.
– New items
– Esoteric items

• Popularity Bias: Cannot recommend items to someone with unique tastes.
– Tends to recommend popular items.

Page 37:

Improving the metrics / prediction function

• Not all neighbor ratings might be equally "valuable"
– Agreement on commonly liked items is not as informative as agreement on controversial items
– Possible solution: give more weight to items that have a higher variance

• Value of the number of co-rated items
– Use "significance weighting", e.g., by linearly reducing the weight when the number of co-rated items is low

• Case amplification
– Intuition: give more weight to "very similar" neighbors, i.e., where the similarity value is close to 1

• Neighborhood selection
– Use a similarity threshold or a fixed number of neighbors

Page 38:

Memory-based and model-based approaches

• User-based CF is said to be "memory-based"
– the rating matrix is directly used to find neighbors / make predictions
– does not scale for most real-world scenarios
– large e-commerce sites have tens of millions of customers and millions of items

• Model-based approaches
– based on an offline pre-processing or "model-learning" phase
– at run time, only the learned model is used to make predictions
– models are updated / re-trained periodically
– large variety of techniques used
– model building and updating can be computationally expensive

Page 39:

2001: Item-based collaborative filtering recommendation algorithms, B. Sarwar et al., WWW 2001

• Scalability issues arise with user-to-user CF when there are many more users than items (m >> n, m = |users|, n = |items|)
– e.g., Amazon.com
– Space complexity O(m²) when pre-computed
– Time complexity for computing Pearson: O(m²n)

• High sparsity leads to few common ratings between two users

• Basic idea: "Item-based CF exploits relationships between items first, instead of relationships between users"

Page 40:

Item-based collaborative filtering

• Basic idea:

– Use the similarity between items (and not users) to make predictions

• Example:

– Look for items that are similar to Item5

– Take Alice's ratings for these items to predict the rating for Item5

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 3

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

Page 41:

The cosine similarity measure

• Produces better results in item-to-item filtering
– for some datasets; no consistent picture in the literature

• Ratings are seen as vectors in n-dimensional space
• Similarity is calculated based on the angle between the vectors:

sim(a, b) = (a · b) / (|a| · |b|)

• Adjusted cosine similarity
– takes average user ratings into account, transforming the original ratings
– U: set of users who have rated both items a and b

sim(a, b) = Σ_{u∈U} (r_{u,a} − r̄_u)(r_{u,b} − r̄_u) / ( sqrt(Σ_{u∈U} (r_{u,a} − r̄_u)²) · sqrt(Σ_{u∈U} (r_{u,b} − r̄_u)²) )
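A sketch of the item-based idea on the running example, using plain (unadjusted) cosine similarity between item columns, restricted to User1–User4 since they all rated Item5. The prediction rule used here, a similarity-weighted average of Alice's own ratings, is one common choice, not the only one:

```python
# Item-based CF sketch: cosine similarity between item rating vectors,
# then a similarity-weighted average of Alice's ratings for Item5.
from math import sqrt

# item columns of the running example, restricted to User1..User4
items = {
    "Item1": [3, 4, 3, 1],
    "Item2": [1, 3, 3, 5],
    "Item3": [2, 4, 1, 5],
    "Item4": [3, 3, 5, 2],
    "Item5": [3, 5, 4, 1],
}
alice = {"Item1": 5, "Item2": 3, "Item3": 4, "Item4": 4}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

sims = {i: cosine(items[i], items["Item5"]) for i in alice}
pred = sum(sims[i] * alice[i] for i in alice) / sum(sims.values())

print(max(sims, key=sims.get))  # Item1 is most similar to Item5
print(round(pred, 2))           # 4.08
```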

Page 42:

Pre-processing for item-based filtering

• Item-based filtering does not solve the scalability problem by itself
• Pre-processing approach by Amazon.com (in 2003)
– Calculate all pairwise item similarities in advance
– The neighborhood used at run time is typically rather small, because only items the user has rated are taken into account
– Item similarities are supposed to be more stable than user similarities

• Memory requirements
– Up to N² pairwise similarities to be memorized (N = number of items) in theory
– In practice, significantly lower (items with no co-ratings)
– Further reductions possible
• Minimum threshold for co-ratings (consider only items rated by at least n users)
• Limit the size of the neighborhood (might affect recommendation accuracy)

Page 43:

More on ratings

• Pure CF-based systems rely only on the rating matrix
• Explicit ratings
– Most commonly used (1 to 5, 1 to 7 Likert response scales)
– Research topics
• "Optimal" granularity of scale; indication that a 10-point scale is better accepted in the movie domain
• Multidimensional ratings (multiple ratings per movie)
– Challenge
• Users are not always willing to rate many items; sparse rating matrices
• How to stimulate users to rate more items?

• Implicit ratings
– clicks, page views, time spent on a page, demo downloads, …
– Can be used in addition to explicit ones; question of correctness of interpretation

Page 44:

Data sparsity problems

• Cold start problem
– How to recommend new items? What to recommend to new users?

• Straightforward approaches
– Ask/force users to rate a set of items
– Use another method (e.g., content-based, demographic, or simply non-personalized) in the initial phase

• Alternatives
– Use better algorithms (beyond nearest-neighbor approaches)
– Example:
• In nearest-neighbor approaches, the set of sufficiently similar neighbors might be too small to make good predictions
• Assume "transitivity" of neighborhoods

Page 45:

Example algorithms for sparse datasets

• Recursive CF
– Assume there is a very close neighbor n of u who, however, has not rated the target item i yet
– Idea:
• Apply the CF method recursively and predict a rating for item i for the neighbor
• Use this predicted rating instead of the rating of a more distant direct neighbor

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 ?

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

sim = 0.85; predict the rating for User1 first

Page 46:

Graph-based methods

• "Spreading activation" (sketch)
– Idea: use paths of lengths > 3 to recommend items
– Length 3: recommend Item3 to User1
– Length 5: Item1 also becomes recommendable

Page 47:

More model-based approaches

• Plethora of different techniques proposed in recent years, e.g.,
– Matrix factorization techniques, statistics
• singular value decomposition, principal component analysis
– Association rule mining
• compare: shopping basket analysis
– Probabilistic models
• clustering models, Bayesian networks, probabilistic Latent Semantic Analysis
– Various other machine learning approaches

• Costs of pre-processing
– Usually not discussed
– Incremental updates possible?

Page 48:

2000: Application of Dimensionality Reduction in Recommender System, B. Sarwar et al., WebKDD Workshop

• Basic idea: trade more complex offline model building for faster online prediction generation

• Singular Value Decomposition for dimensionality reduction of rating matrices
– Captures important factors/aspects and their weights in the data
– Factors can be genre or actors, but also non-interpretable ones
– Assumption that k dimensions capture the signals and filter out noise (k = 20 to 100)

• Constant time to make recommendations

• Approach also popular in IR (Latent Semantic Indexing), data compression, …
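A rough sketch of the SVD idea (not the paper's exact pipeline): mean-fill the running example matrix, truncate the decomposition to k = 2 factors, and read predictions off the low-rank reconstruction. NumPy and the choice k = 2 are assumptions for illustration:

```python
# SVD-based dimensionality reduction sketch on the running example matrix.
# 0 marks the unknown rating (Alice, Item5); unknowns are imputed with the
# user's mean before factorizing.
import numpy as np

R = np.array([
    [5, 3, 4, 4, 0],   # Alice
    [3, 1, 2, 3, 3],   # User1
    [4, 3, 4, 3, 5],   # User2
    [3, 3, 1, 5, 4],   # User3
    [1, 5, 5, 2, 1],   # User4
], dtype=float)

filled = R.copy()
for u in range(filled.shape[0]):
    known = R[u] > 0
    filled[u, ~known] = R[u, known].mean()  # impute with the user's mean

U, s, Vt = np.linalg.svd(filled, full_matrices=False)
k = 2
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # rank-k reconstruction

print(round(approx[0, 4], 2))  # rank-2 estimate of Alice's Item5 rating
```

The rank-k product plays the role of M_k = U_k Σ_k V_k^T; predictions are constant-time lookups into the reconstruction.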

Page 49:

A picture says …

[Chart] Users Bob, Mary, Alice, and Sue plotted in the two-dimensional latent space (Dim1 × Dim2, both axes from −1 to 1).

Page 50:

Matrix factorization

• SVD: M_k = U_k × Σ_k × V_k^T

U_k:          Dim1   Dim2
  Alice       0.47  −0.30
  Bob        −0.44   0.23
  Mary        0.70  −0.06
  Sue         0.31   0.93

V_k^T:
  Dim1  −0.44  −0.57   0.06   0.38   0.57
  Dim2   0.58  −0.66   0.26   0.18  −0.36

Σ_k:          Dim1   Dim2
  Dim1        5.63   0
  Dim2        0      3.23

• Prediction: r̂_ui = r̄_u + U_k(Alice) · Σ_k · V_k^T(EPL) = 3 + 0.84 = 3.84

Page 51:

2008: Factorization meets the neighborhood: a multifaceted collaborative filtering model, Y. Koren, ACM SIGKDD

• Stimulated by work on the Netflix competition
– Prize of $1,000,000 for an accuracy improvement of 10% in RMSE compared to Netflix's own Cinematch system
– Very large dataset (~100M ratings, ~480K users, ~18K movies)
– Last ratings per user withheld (set K)

• Root mean squared error metric, optimized to 0.8567:

RMSE = sqrt( Σ_{(u,i)∈K} (r̂_ui − r_ui)² / |K| )
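The RMSE metric on this slide can be written directly; the (prediction, actual) pairs below are made up for illustration:

```python
# RMSE over a held-out set K of (prediction, actual) rating pairs:
# square the errors, average over |K|, take the square root.
from math import sqrt

def rmse(pairs):
    """pairs: iterable of (predicted, actual) rating pairs."""
    pairs = list(pairs)
    return sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))

print(rmse([(3.5, 4), (2.0, 2), (5.0, 4)]))  # error magnitudes 0.5, 0, 1
```

Because errors are squared before averaging, RMSE punishes a few large misses more than MAE does, which is why the competition's 10% target was hard-won.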

Page 52:

2008: Factorization meets the neighborhood: a multifaceted collaborative filtering model, Y. Koren, ACM SIGKDD (continued)

• Merges neighborhood models with latent factor models

• Latent factor models
– good at capturing weak signals in the overall data

• Neighborhood models
– good at detecting strong relationships between close items

• Combination in a single prediction function:

r̂_ui = μ + b_u + b_i + q_i^T p_u

min_{p*, q*, b*} Σ_{(u,i)∈K} (r_ui − μ − b_u − b_i − q_i^T p_u)² + λ (‖p_u‖² + ‖q_i‖² + b_u² + b_i²)

– Local search method such as stochastic gradient descent to determine the parameters
– The penalty term λ(…) on large values avoids over-fitting

Page 53:

Collaborative Filtering Issues

• Pros:
– well understood, works well in some domains, no knowledge engineering required

• Cons:
– requires a user community, sparsity problems, no integration of other knowledge sources, no explanation of results

• What is the best CF method?
– In which situation and which domain? Inconsistent findings; always the same domains and datasets; differences between methods are often very small (1/100)

• How to evaluate prediction quality?
– MAE / RMSE: what does an MAE of 0.7 actually mean?
– Serendipity: not yet fully understood

• What about multi-dimensional ratings?

Page 54:

Content-based recommendations

• Main idea: recommend items to customer C similar to previous items rated highly by C

• Movie recommendations

– recommend movies with same actor(s), director, genre, …

• Websites, blogs, news

– recommend other sites with “similar” content

Page 55:

Overall idea

[Diagram] Items the user likes are summarized into item profiles (illustrated with red circles and triangles); a user profile is built from them and matched against the item profiles of new items to produce recommendations.

Page 56:

Item Profiles

• For each item, create an item profile

• A profile is a set of features
– movies: author, title, actor, director, …
– text: set of "important" words in the document

• How to pick important words?
– The usual heuristic is TF.IDF (Term Frequency times Inverse Doc Frequency)

Page 57:

TF.IDF Strikes Again!

f_ij = frequency of term t_i in document d_j
n_i = number of docs that mention term i
N = total number of docs

TF_ij = f_ij / max_k f_kj;  IDF_i = log2(N / n_i)

TF.IDF score: w_ij = TF_ij × IDF_i

Doc profile = set of words with the highest TF.IDF scores, together with their scores
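A small sketch of the profile computation, assuming the common normalizations TF_ij = f_ij / max_k f_kj and IDF_i = log2(N / n_i) (the slide defines the variables but not the exact variants). The three-document corpus is invented:

```python
# TF.IDF doc profile: frequency-normalized TF times log-scaled IDF.
from math import log2

docs = [
    "the matrix sequel the matrix".split(),
    "the king kong versus godzilla".split(),
    "the hobbit and lotr".split(),
]
N = len(docs)

def tf_idf(doc):
    counts = {t: doc.count(t) for t in set(doc)}
    max_f = max(counts.values())
    scores = {}
    for term, f in counts.items():
        n_i = sum(1 for d in docs if term in d)  # docs mentioning the term
        scores[term] = (f / max_f) * log2(N / n_i)
    return scores

profile = tf_idf(docs[0])
# "the" occurs in every document, so IDF (and its score) is 0;
# rare terms like "matrix" dominate the profile.
print(sorted(profile, key=profile.get, reverse=True)[:2])  # ['matrix', 'sequel']
```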

Page 58:

User profiles and prediction

• User profile possibilities:
– Weighted average of rated item profiles
– Variation: weight by difference from the average rating for the item
– …

• Prediction heuristic
– Given user profile c and item profile s, estimate u(c, s) = cos(c, s) = (c · s) / (|c| |s|)
– Need an efficient method to find items with high utility
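The cosine prediction heuristic as code, over profiles represented as sparse feature-to-weight dicts; the feature names and weights are invented for illustration:

```python
# Cosine utility u(c, s) between a user profile c and an item profile s.
from math import sqrt

def cos_utility(c, s):
    shared = set(c) & set(s)
    dot = sum(c[f] * s[f] for f in shared)
    if dot == 0:
        return 0.0  # no overlapping features (or orthogonal profiles)
    norm_c = sqrt(sum(w * w for w in c.values()))
    norm_s = sqrt(sum(w * w for w in s.values()))
    return dot / (norm_c * norm_s)

user = {"action": 0.8, "sci-fi": 0.6}
item_a = {"action": 1.0, "sci-fi": 1.0}  # overlaps the user's tastes
item_b = {"romance": 1.0}                # no shared features

print(round(cos_utility(user, item_a), 2))  # 0.99
print(cos_utility(user, item_b))            # 0.0
```

Ranking all items by this utility and returning the top scores is the content-based analogue of the CF prediction step.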

Page 59:

Model-based approaches

• For each user, learn a classifier that classifies items into rating classes
– liked by the user vs. not liked by the user
– e.g., Bayesian, regression, SVM

• Apply the classifier to each item to find recommendation candidates

• Problem: scalability
– Can revisit when we cover classifiers in detail

Page 60:

Advantages of Content-Based Approach

• No need for data on other users.

– No cold-start or sparsity problems.

• Able to recommend to users with unique tastes.

• Able to recommend new and unpopular items

– No first-rater problem.

• Can provide explanations of recommended items by listing content-features that caused an item to be recommended.


Disadvantages of Content-Based Approach

• Requires content that can be encoded as meaningful features.

• Users’ tastes must be represented as a learnable function of these content features.

• Unable to exploit quality judgments of other users.

– Unless these are somehow included in the content features.


Combining Content and Collaboration

• Content-based and collaborative methods have complementary strengths and weaknesses.

• Combine methods to obtain the best of both.

• Various hybrid approaches:

– Apply both methods and combine recommendations.

– Use collaborative data as content.

– Use content-based predictor as another collaborator.

– Use content-based predictor to complete collaborative data.


Movie Domain

• EachMovie Dataset [Compaq Research Labs]

– Contains user ratings for movies on a 0–5 scale.

– 72,916 users (avg. 39 ratings each).

– 1,628 movies.

– Sparse user-ratings matrix – (2.6% full).

• Crawled Internet Movie Database (IMDb)

– Extracted content for titles in EachMovie.

• Basic movie information:

– Title, Director, Cast, Genre, etc.

• Popular opinions:

– User comments, Newspaper and Newsgroup reviews, etc.


Content-Boosted Collaborative Filtering (Movie domain)

[System diagram: a web crawler extracts content from IMDb for the EachMovie titles into a Movie Content Database; a content-based predictor uses it to expand the sparse User Ratings Matrix into a Full User Ratings Matrix; collaborative filtering then combines this with the active user's ratings to produce recommendations.]


Content-Boosted CF - I

[Diagram: a user-ratings vector — user-rated items plus unrated items — is fed to the content-based predictor as training examples; the predictor fills in the unrated items with predicted ratings, producing a pseudo user-ratings vector.]


Content-Boosted CF - II

• Compute pseudo user ratings matrix

– Full matrix – approximates actual full user ratings matrix

• Perform CF

– Using Pearson corr. between pseudo user-rating vectors

[Diagram: the sparse User Ratings Matrix is run through the content-based predictor to produce the dense Pseudo User Ratings Matrix.]
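In miniature, the two steps might look as follows (`content_predict` is a hypothetical stand-in for the content-based predictor; dict-based rating vectors are an assumption):

```python
def pseudo_vector(user_ratings, all_items, content_predict):
    # keep actual ratings; fill every unrated item with a content-based prediction
    return {i: user_ratings.get(i, content_predict(i)) for i in all_items}

def pearson(u, v, items):
    # Pearson correlation between two (pseudo) user-rating vectors
    n = len(items)
    mu_u = sum(u[i] for i in items) / n
    mu_v = sum(v[i] for i in items) / n
    num = sum((u[i] - mu_u) * (v[i] - mu_v) for i in items)
    den = (sum((u[i] - mu_u) ** 2 for i in items) ** 0.5 *
           sum((v[i] - mu_v) ** 2 for i in items) ** 0.5)
    return num / den

items = ["m1", "m2", "m3", "m4"]
alice = pseudo_vector({"m1": 5, "m2": 1}, items, lambda i: 3.0)
bob   = pseudo_vector({"m1": 4, "m2": 2}, items, lambda i: 3.0)
r = pearson(alice, bob, items)  # the two pseudo vectors correlate perfectly here
```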


Classic Experiments (Konstan)

• Used subset of EachMovie (7,893 users; 299,997 ratings)

• Test set: 10% of the users, selected at random
– Test users rated at least 40 movies
– Train on the remainder

• Hold-out set: 25% of items for each test user
– Predict rating of each item in the hold-out set

• Compared CBCF to other prediction approaches:
– Pure CF
– Pure content-based
– Naïve hybrid (averages CF and content-based predictions)


Metrics

• Mean Absolute Error (MAE)

– Compares numerical predictions with user ratings

• ROC sensitivity [Herlocker 99]

– How well predictions help users select high-quality items

– Ratings ≥ 4 considered “good”; < 4 considered “bad”

• Paired t-test for statistical significance


Results - I

[Bar chart: MAE (y-axis, 0.90–1.06) for each algorithm — CF, Content, Naïve, CBCF]

CBCF is significantly better (4% over CF) at (p < 0.001)


Results - II

[Bar chart: ROC-4 sensitivity (y-axis, 0.58–0.68) for each algorithm — CF, Content, Naïve, CBCF]

CBCF outperforms the rest (5% improvement over CF)


Novelty versus Trust

• There is a trade-off
– High confidence recommendations
• Recommendations are obvious
• Low utility for user
• However, they build trust
– Users like to see some recommendations that they know are right
– Recommendations with high prediction yet lower confidence
• Higher variability of error
• Higher novelty → higher utility for user
– McLaughlin and Herlocker argue that “very obscure” recommendations are often bad (e.g., hard to obtain)


Sample Results: [Mooney et al, SIGIR 04]

• Much better at predicting top-rated movies

• Drawback: tends to often predict blockbuster/already popular movies

• A serendipity/trust trade-off

[Bar chart: Modified Precision at Top-N (Top 1 to Top 20) for User-to-User, Item-Item, and Distribution methods]


Recommender Systems in e-Commerce

• One Recommender Systems research question

– What should be in that list?


Recommender Systems in e-Commerce

• Another question, both in research and practice

– How do we know that these are good recommendations?


Recommender Systems in e-Commerce

• This might lead to …
– What is a good recommendation?
– What is a good recommendation strategy?
– What is a good recommendation strategy for my business?

[Slide callouts: “We hope you will buy also …” / “These have been in stock for quite a while now …”]


What is a good recommendation? What are the measures in practice?

• Total sales numbers
• Promotion of certain items
• …
• Click-through rates
• Interactivity on platform
• …
• Customer return rates
• Customer satisfaction and loyalty


Purpose and success criteria (1)

• Different perspectives/aspects
– Depends on domain and purpose
– No holistic evaluation scenario exists

• Retrieval perspective
– Reduce search costs
– Provide “correct” proposals
– Assumption: users know in advance what they want

• Recommendation perspective
– Serendipity: identify items from the Long Tail
– Users did not know about their existence


When does a RS do its job well?

"Recommend widely unknown items that users might actually like!"

20% of items accumulate 74% of all positive ratings

Recommend items from the long tail


Purpose and success criteria (2)

• Prediction perspective
– Predict to what degree users like an item
– Most popular evaluation scenario in research

• Interaction perspective
– Give users a “good feeling”
– Educate users about the product domain
– Convince/persuade users: explain

• Finally, conversion perspective
– Commercial situations
– Increase “hit”, “clickthrough”, “lookers to bookers” rates
– Optimize sales margins and profit


How do we as researchers know?

• Test with real users
– A/B tests
– Example measures: sales increase, click-through rates

• Laboratory studies
– Controlled experiments
– Example measures: satisfaction with the system (questionnaires)

• Offline experiments
– Based on historical data
– Example measures: prediction accuracy, coverage


Empirical research

• Characterizing dimensions:

– Who is the subject that is in the focus of research?

– What research methods are applied?

– In which setting does the research take place?

Subject: online customers, students, historical online sessions, computers, …

Research method: experiments, quasi-experiments, non-experimental research

Setting: lab, real-world scenarios


Research methods

• Experimental vs. non-experimental (observational) research methods

– Experiment (test, trial): “An experiment is a study in which at least one variable is manipulated and units are randomly assigned to different levels or categories of manipulated variable(s).”

• Units: users, historic sessions, …

• Manipulated variable: type of RS, groups of recommended items, explanation strategies, …

• Categories of manipulated variable(s): content-based RS, collaborative RS


Experiment designs


Evaluation ~ IR Evaluation

• Recommendation is viewed as an information retrieval task:
– Retrieve (recommend) all items which are predicted to be “good” or “relevant”

• Common protocol:
– Hide some items with known ground truth
– Rank items or predict ratings -> count -> cross-validate

• Ground truth established by human domain experts

Prediction \ Reality | Actually Good       | Actually Bad
Rated Good           | True Positive (tp)  | False Positive (fp)
Rated Bad            | False Negative (fn) | True Negative (tn)
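Counting the table's four cells for one user's recommendations, with precision and recall on top (the toy item sets are invented):

```python
def confusion(rated_good, actually_good, all_items):
    tp = len(rated_good & actually_good)   # recommended and actually good
    fp = len(rated_good - actually_good)   # recommended but actually bad
    fn = len(actually_good - rated_good)   # good but not recommended
    tn = len(all_items - rated_good - actually_good)
    return tp, fp, fn, tn

all_items = set(range(10))
recommended = {0, 1, 2, 3}
good = {1, 2, 5}
tp, fp, fn, tn = confusion(recommended, good, all_items)
precision = tp / (tp + fp)   # 2 / 4 = 0.5
recall    = tp / (tp + fn)   # 2 / 3
```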


Dilemma of IR measures in RS

IR measures are frequently applied; however:

• Ground truth for most items is actually unknown

• What is a relevant item?

• Different ways of measuring precision are possible

• Results from offline experimentation may have limited predictive power for online user behavior


Metrics: Rank Score – position matters

• Rank Score extends recall and precision to take the positions of correct items in a ranked list into account
– Particularly important in recommender systems, as lower-ranked items may be overlooked by users
– Learning-to-rank: optimize models for such measures (e.g., AUC)

For a user:
– Actually good: Item 237, Item 899
– Recommended (predicted as good): Item 345, Item 237 (hit), Item 187
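The slide leaves the exact formula open; one common position-discounted variant is the half-life utility of Breese et al., sketched here (the half-life value and item IDs are illustrative):

```python
def rank_score(ranked, relevant, halflife=2):
    # a hit at rank p contributes 1 / 2^((p - 1)/(h - 1)); halflife h must be > 1
    return sum(1.0 / 2 ** ((p - 1) / (halflife - 1))
               for p, item in enumerate(ranked, start=1)
               if item in relevant)

recommended   = ["item345", "item237", "item187"]
actually_good = {"item237", "item899"}
score = rank_score(recommended, actually_good)
# the single hit (item237) sits at rank 2, so it contributes 1/2
```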


Accuracy measures

• Datasets with items rated by users
– MovieLens datasets: 100K–10M ratings
– Netflix: 100M ratings

• Historic user ratings constitute ground truth

• Metrics measure error rate

– Mean Absolute Error (MAE) computes the average absolute deviation between predicted and actual ratings:
MAE = \frac{1}{N} \sum_{i=1}^{N} |p_i - r_i|

– Root Mean Square Error (RMSE) is similar to MAE, but places more weight on larger deviations:
RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (p_i - r_i)^2}
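Both error measures are one-liners (the toy predictions are invented):

```python
import math

def mae(pred, actual):
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

def rmse(pred, actual):
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred))

pred   = [4.0, 3.5, 2.0, 5.0]
actual = [5.0, 3.0, 2.0, 3.0]
# absolute errors: 1.0, 0.5, 0.0, 2.0 -> MAE = 0.875
# RMSE = sqrt(5.25 / 4) ~ 1.146: the 2.0-star miss weighs more than under MAE
```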


Offline experimentation example

• Netflix competition
– Web-based movie rental
– Prize of $1,000,000 for an accuracy improvement (RMSE) of 10% compared to Netflix's own Cinematch system

• Historical dataset
– ~480K users rated ~18K movies on a scale of 1 to 5 (~100M ratings)
– Last 9 ratings/user withheld:
• Probe set – given to teams for evaluation
• Quiz set – evaluates teams' submissions for the leaderboard
• Test set – used by Netflix to determine the winner

• Today
– Rating prediction is only seen as one additional input into the recommendation process


An imperfect world

• Offline evaluation is the cheapest variant

– Still, it gives us valuable insights

– and lets us compare our results (in theory)

• Dangers and trends:

– Domination of accuracy measures

– Focus on a small set of domains (40% on movies in CS)

• Alternative and complementary measures:

– Diversity, coverage, novelty, familiarity, serendipity, popularity, concentration effects (long tail)



Netflix Challenge


Movie rating data

Training data (user, movie, score):
user 1, movie 21, score 1
user 1, movie 213, score 5
user 2, movie 345, score 4
user 2, movie 123, score 4
user 2, movie 768, score 3
user 3, movie 76, score 5
user 4, movie 45, score 4
user 5, movie 568, score 1
user 5, movie 342, score 2
user 5, movie 234, score 2
user 6, movie 76, score 5
user 6, movie 56, score 4

Test data (score to predict):
user 1, movies 62 and 96: ?
user 2, movies 7 and 3: ?
user 3, movies 47 and 15: ?
user 4, movies 41 and 28: ?
user 5, movies 93 and 74: ?
user 6, movies 69 and 83: ?


Netflix Prize

• Training data

– 100 million ratings

– 480,000 users

– 17,770 movies

– 6 years of data: 2000-2005

• Test data

– Last few ratings of each user (2.8 million)

– Evaluation criterion: root mean squared error (RMSE)

– Netflix Cinematch RMSE: 0.9514

• Competition

– 2700+ teams

– $1 million grand prize for 10% improvement on Cinematch result

– $50,000 2007 progress prize for 8.43% improvement


Overall rating distribution

• A third of ratings are 4s

• Average rating is 3.68 (from TimelyDevelopment.com)


#ratings per movie

• Avg #ratings/movie: 5627


#ratings per user

• Avg #ratings/user: 208


Average movie rating by movie count

• More ratings go to better movies (from TimelyDevelopment.com)


Most loved movies

Count | Avg rating | Title

137812 4.593 The Shawshank Redemption

133597 4.545 Lord of the Rings: The Return of the King

180883 4.306 The Green Mile

150676 4.460 Lord of the Rings: The Two Towers

139050 4.415 Finding Nemo

117456 4.504 Raiders of the Lost Ark

180736 4.299 Forrest Gump

147932 4.433 Lord of the Rings: The Fellowship of the ring

149199 4.325 The Sixth Sense

144027 4.333 Indiana Jones and the Last Crusade


Important RMSEs (from erroneous to accurate)

Global average: 1.1296
User average: 1.0651
Movie average: 1.0533
Cinematch: 0.9514; baseline
BellKor: 0.8693; 8.63% improvement
Grand Prize: 0.8563; 10% improvement
Inherent noise: ????

The gap between the simple averages and the prize-level models is the room for personalization.


Challenges

• Size of data

– Scalability

– Keeping data in memory

• Missing data

– 99 percent missing

– Very imbalanced

• Avoiding overfitting

• Test and training data differ significantly



The Rules

• Improve RMSE metric by 10%

– RMSE = square root of the average squared difference between each prediction and the actual rating

– amplifies the contributions of egregious errors, both false positives ("trust busters") and false negatives ("missed opportunities").

• 100 million “training” ratings provided

– User, Rating, Movie name, date


Lessons from the Netflix Prize

Yehuda Koren

The BellKor team

(with Robert Bell & Chris Volinsky)


• Use an ensemble of complementing predictors

• Two half-tuned models are worth more than a single, fully tuned model

The BellKor recommender system


Effect of ensemble size

[Line chart: RMSE (y-axis, 0.87–0.8925) as a function of ensemble size (#Predictors, x-axis, 1–50)]


• Use an ensemble of complementing predictors

• Two half-tuned models are worth more than a single, fully tuned model

• But: many seemingly different models expose similar characteristics of the data, and won't mix well

• Concentrate efforts along three axes...

The BellKor recommender system


The first axis:

• Multi-scale modeling of the data

• Combine top level, regional modeling of the data, with a refined, local view:

– k-NN: Extracting local patterns

– Factorization: Addressing regional effects

The three dimensions of the BellKor system

[Diagram: nested scales — global effects → factorization → k-NN, from the broadest view of the data to the most local]


Multi-scale modeling – 1st tier

Global effects:

• Mean rating: 3.7 stars

• The Sixth Sense is 0.5 stars above avg

• Joe rates 0.2 stars below avg

⇒ Baseline estimation: Joe will rate The Sixth Sense 4 stars


Multi-scale modeling – 2nd tier

Factors model:

• Both The Sixth Sense and Joe are placed high on the “Supernatural Thrillers” scale

⇒ Adjusted estimate: Joe will rate The Sixth Sense 4.5 stars


Multi-scale modeling – 3rd tier

Neighborhood model:

• Joe didn't like the related movie Signs

⇒ Final estimate: Joe will rate The Sixth Sense 4.2 stars


The second axis:

• Quality of modeling

• Make the best out of a model

• Strive for:

– Fundamental derivation

– Simplicity

– Avoid overfitting

– Robustness against #iterations, parameter setting, etc.

• Optimizing is good, but don’t overdo it!

The three dimensions of the BellKor system

[Diagram axes: global ↔ local, plus a quality axis]


• The third dimension will be discussed later...

• Next: moving the multi-scale view along the quality axis

The three dimensions of the BellKor system

[Diagram axes: global ↔ local, quality; the third dimension marked “???”]


• Earliest and most popular collaborative filtering method

• Derive unknown ratings from those of “similar” items (movie-movie variant)

• A parallel user-user flavor: rely on ratings of like-minded users (not in this talk)

Local modeling through k-NN


k-NN

[Utility matrix: movies 1–6 (rows) × users 1–12 (columns); each entry is either unknown or a rating between 1 and 5]


k-NN

[Same utility matrix; task: estimate the rating of movie 1 by user 5, marked “?”]


k-NN

[Same utility matrix]

Neighbor selection: identify movies similar to movie 1 that were rated by user 5


k-NN

[Same utility matrix]

Compute similarity weights: s13 = 0.2, s16 = 0.3


k-NN

[Same utility matrix]

Predict by taking the weighted average:

(0.2*2 + 0.3*3) / (0.2 + 0.3) = 2.6


Properties of k-NN

• Intuitive

• No substantial preprocessing is required

• Easy to explain reasoning behind a recommendation

• Accurate?


k-NN on the RMSE scale (from erroneous to accurate)

Global average: 1.1296
User average: 1.0651
Movie average: 1.0533
k-NN (common practice): 0.96 → improved: 0.91
Cinematch: 0.9514
BellKor: 0.8693
Grand Prize: 0.8563
Inherent noise: ????


k-NN - Common practice

1. Define a similarity measure between items: s_ij

2. Select neighbors — N(i;u): items most similar to i, that were rated by u

3. Estimate the unknown rating, r_ui, as the baseline estimate b_ui plus a weighted average:

\hat{r}_{ui} = b_{ui} + \frac{\sum_{j \in N(i;u)} s_{ij} (r_{uj} - b_{uj})}{\sum_{j \in N(i;u)} s_{ij}}

(b_ui is the baseline estimate for r_ui)
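As a function, the common-practice estimate reads as follows (the worked numbers reuse the earlier slide's example, with baselines of 0 for simplicity):

```python
def knn_predict(b_ui, neighbors):
    """neighbors: list of (s_ij, r_uj, b_uj) tuples for j in N(i; u)."""
    num = sum(s * (r - b) for s, r, b in neighbors)
    den = sum(s for s, _, _ in neighbors)
    return b_ui + num / den

# user 5's ratings of the two neighbors of movie 1: movie 3 -> 2, movie 6 -> 3
r = knn_predict(0.0, [(0.2, 2.0, 0.0), (0.3, 3.0, 0.0)])
# (0.2*2 + 0.3*3) / (0.2 + 0.3) = 2.6
```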


k-NN - Common practice

1. Define a similarity measure between items: s_ij

2. Select neighbors — N(i;u): items similar to i, rated by u

3. Estimate the unknown rating, r_ui, as the weighted average:

\hat{r}_{ui} = b_{ui} + \frac{\sum_{j \in N(i;u)} s_{ij} (r_{uj} - b_{uj})}{\sum_{j \in N(i;u)} s_{ij}}

Problems:

1. Similarity measures are arbitrary; no fundamental justification

2. Pairwise similarities isolate each neighbor; neglect interdependencies among neighbors

3. Taking a weighted average is restricting; e.g., when neighborhood information is limited


Interpolation weights

• Use a weighted sum rather than a weighted average:

$$\hat{r}_{ui} = b_{ui} + \sum_{j \in N(i;u)} w_{ij}\,(r_{uj} - b_{uj})$$

(We allow $\sum_{j \in N(i;u)} w_{ij} \neq 1$.)

• Model relationships between item i and its neighbors

• Can be learned through a least-squares problem over all other users v who rated i:

$$\min_{w} \sum_{v \neq u} \Big( r_{vi} - b_{vi} - \sum_{j \in N(i;u)} w_{ij}\,(r_{vj} - b_{vj}) \Big)^2$$
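The least-squares objective above is an ordinary linear regression, so the weights can be sketched with a standard solver. This is an illustrative toy, not the BellKor implementation: the residual values, the baseline `b_ui`, and the new user's residuals are all invented.

```python
# Learn interpolation weights w_ij by least squares: rows of X are the
# residuals (r_vj - b_vj) of the neighbor items for each training user v,
# and y holds the residuals (r_vi - b_vi) on the target item i.
import numpy as np

X = np.array([[ 0.5, -0.2],      # user v1's residuals on the 2 neighbors
              [ 1.0,  0.3],      # user v2
              [-0.4,  0.1]])     # user v3
y = np.array([0.4, 1.1, -0.3])   # residuals r_vi - b_vi on item i

w, *_ = np.linalg.lstsq(X, y, rcond=None)   # interpolation weights w_ij

# Predict for a new user u with neighbor residuals (r_uj - b_uj) = (0.8, 0.0):
b_ui = 3.5                        # assumed baseline for (u, i)
r_hat = b_ui + np.array([0.8, 0.0]) @ w
```

Note the weights are not forced to sum to 1, matching the weighted-sum (rather than weighted-average) formulation.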

Page 125:

Interpolation weights

$$\min_{w} \sum_{v \neq u} \Big( r_{vi} - b_{vi} - \sum_{j \in N(i;u)} w_{ij}\,(r_{vj} - b_{vj}) \Big)^2$$

(Most of the $r_{vj}$ are unknown; estimate inner products among movie ratings.)

• Interpolation weights are derived based on their role; no use of an arbitrary similarity measure

• Explicitly account for interrelationships among the neighbors

Challenges:

• Deal with missing values

• Avoid overfitting

• Efficient implementation

Page 126:

Latent factor models

[Figure: movies and users placed in a two-dimensional latent space. One axis runs from "geared towards females" to "geared towards males"; the other from "serious" to "escapist". Movies shown include The Princess Diaries, Sense and Sensibility, The Color Purple, Amadeus, The Lion King, Braveheart, Lethal Weapon, Independence Day, Ocean's 11, and Dumb and Dumber; example users Gus and Dave appear in the same space.]

Page 127:

Latent factor models

[Figure: a users × items rating matrix with many missing entries, approximated (~) by the product of an items × 3 factor matrix and a 3 × users factor matrix -- a rank-3 SVD approximation.]

Page 128:

Estimate unknown ratings as inner-products of factors:

[Figure: the same rank-3 SVD approximation; an unknown entry "?" of the rating matrix is highlighted.]

Page 129:

Estimate unknown ratings as inner-products of factors:

[Figure: repeated from the previous slide (animation step); the "?" entry is still highlighted.]

Page 130:

Estimate unknown ratings as inner-products of factors:

[Figure: the highlighted missing entry is filled in with the inner product of the corresponding user and item factors, here 2.4.]

Page 131:

Latent factor models

[Figure: the rank-3 factorization of the rating matrix, as on the previous slides.]

Properties:

• SVD isn't defined when entries are unknown → use specialized methods

• Very powerful model → can easily overfit; sensitive to regularization

• Probably the most popular model among contestants

– 12/11/2006: Simon Funk describes an SVD-based method

– 12/29/2006: Free implementation at timelydevelopment.com
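One such specialized method, in the spirit of the "Funk SVD" mentioned above, fits factors by stochastic gradient descent over only the observed entries, so no full SVD is needed. This is a minimal sketch under invented data and guessed hyperparameters, not the contest code:

```python
# SGD-trained factor model: minimize sum over observed (u, i, r) of
# (r - p_u . q_i)^2 plus L2 regularization, updating one rating at a time.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 5, 4, 2
# Observed (user, item, rating) triples; everything else is missing.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
           (2, 1, 2), (2, 3, 4), (3, 2, 5), (4, 3, 3)]

P = 0.1 * rng.standard_normal((n_users, k))   # user factors p_u
Q = 0.1 * rng.standard_normal((n_items, k))   # item factors q_i
lr, reg = 0.05, 0.02                          # learning rate, L2 penalty

for _ in range(300):                          # SGD epochs
    for u, i, r in ratings:
        pu = P[u].copy()                      # keep the pre-update copy
        err = r - pu @ Q[i]
        P[u] += lr * (err * Q[i] - reg * pu)
        Q[i] += lr * (err * pu - reg * Q[i])

rmse = np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings]))
```

The regularization term is what keeps this "very powerful model" from simply memorizing the observed entries.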

Page 132:

Factorization on the RMSE scale

Global average: 1.1296
User average: 1.0651
Movie average: 1.0533
Cinematch: 0.9514
Factorization: ~0.93 → ~0.89
BellKor: 0.8693
Grand Prize: 0.8563
Inherent noise: ????

Page 133:

Our approach

• User factors: model a user u as a vector $p_u \sim N_k(\mu, \Sigma)$

• Movie factors: model a movie i as a vector $q_i \sim N_k(\gamma, \Lambda)$

• Ratings: measure "agreement" between u and i: $r_{ui} \sim N(p_u^T q_i, \varepsilon^2)$

• Maximize the model's likelihood:

– Alternate between recomputing user factors, movie factors, and model parameters

– Special cases: alternating ridge regression; nonnegative matrix factorization
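The "alternating ridge regression" special case can be sketched directly: fix the item factors and solve a small ridge problem per user, then fix the user factors and solve per item. A toy illustration with an invented matrix and mask, not the authors' implementation:

```python
# Alternating ridge regression (ALS) on a tiny partially observed matrix.
# Zeros in R mark missing entries; M masks the observed ones.
import numpy as np

R = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [1., 0., 0., 4.]])
M = R > 0                           # observed-entry mask
k, reg = 2, 0.1
rng = np.random.default_rng(1)
P = rng.standard_normal((4, k))     # user factors
Q = rng.standard_normal((4, k))     # item factors

for _ in range(20):
    for u in range(4):              # ridge regression for each user factor
        J = M[u]
        A = Q[J].T @ Q[J] + reg * np.eye(k)
        P[u] = np.linalg.solve(A, Q[J].T @ R[u, J])
    for i in range(4):              # then for each item factor
        J = M[:, i]
        A = P[J].T @ P[J] + reg * np.eye(k)
        Q[i] = np.linalg.solve(A, P[J].T @ R[J, i])

err = np.sqrt(((P @ Q.T - R)[M] ** 2).mean())   # RMSE on observed entries
```

Each inner solve is a closed-form ridge regression, which is why the alternation converges quickly on small problems; an item with no observed ratings (column 2 here) simply gets a zero factor vector.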

Page 134:

Combining multi-scale views

[Diagram: global effects, regional effects, and local effects are combined in two ways -- residual fitting (weighted average → factorization → k-NN, each stage fitting the previous stage's residuals) and a unified model blending k-NN and factorization.]

Page 135:

Localized factorization model

• Standard factorization: user u is a linear function parameterized by $p_u$:

$$r_{ui} = p_u^T q_i$$

• Allow the user factors -- $p_u$ -- to depend on the item being predicted:

$$r_{ui} = p_{u(i)}^T q_i$$

• The vector $p_{u(i)}$ models the behavior of u on items like i

Page 136:

RMSE vs. #Factors

[Figure: RMSE on the Netflix Probe set (y-axis, roughly 0.905-0.935; lower is more accurate) versus the number of factors (x-axis, 0-80). The localized factorization curve lies below the standard factorization curve.]

Page 137:

Top contenders for Progress Prize 2007

[Figure: % improvement over Cinematch (0-10%) from 10/2/2006 through 10/2/2007 for the top contenders -- BellKor, ML@Toronto, wxyzConsulting, Gravity, and "How low can he go?" -- with the Grand Prize level marked; BellKor's gains are annotated "multi-scale modeling".]

Page 138:

• Can exploit movie titles and release year

• But the movie side is pretty much covered anyway...

• It’s about the users!

• Turning to the third dimension...

Seek alternative perspectives of the data

[Diagram: the global-local and quality axes from before, plus a third axis labeled "???".]

Page 139:

The third dimension of the BellKor system

• A powerful source of information: characterize users by which movies they rated, rather than how they rated them

• A binary representation of the data

[Figure: the users × movies rating matrix alongside its binary counterpart, where each entry is 1 if the user rated the movie and 0 otherwise.]
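The binary "who rated what" view is a one-line transformation; a toy sketch with an invented matrix:

```python
# Replace each rating with 1 if present, 0 if missing (0 marks unobserved).
import numpy as np

R = np.array([[4, 0, 5, 0],
              [0, 3, 0, 1],
              [2, 0, 0, 4]])      # users x movies; 0 = not rated
B = (R > 0).astype(int)           # binary implicit-feedback matrix
```

B discards the rating values but keeps the rating pattern, which is exactly the signal the slide describes.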

Page 140:

The third dimension of the BellKor system

[Image: movie #17270]

Great news for recommender systems:

• Works even better on real-life datasets (?)

• Improve accuracy by exploiting implicit feedback

• Implicit behavior is abundant and easy to collect:

– Rental history

– Search patterns

– Browsing history

– ...

• Implicit feedback allows predicting personalized ratings for users who never rated!

Page 141:

The three dimensions of the BellKor system

[Diagram: three axes -- global ↔ local, quality, and explicit (ratings) ↔ implicit (binary).]

Where do you want to be?

• All over the global-local axis

• Relatively high on the quality axis

• Can we go in between on the explicit-implicit axis?

• Yes! Relevant methods:

– Conditional RBMs [ML@UToronto]

– NSVD [Arek Paterek]

– Asymmetric factor models

Page 142:

Lessons

What it takes to win:

1. Think deeper – design better algorithms

2. Think broader – use an ensemble of multiple predictors

3. Think different – model the data from different perspectives

At the personal level:

1. Have fun with the data

2. Work hard; have a long breath

3. Find good teammates

Rapid progress of science:

1. Availability of large, real-life data

2. Challenge, competition

3. Effective collaboration

[Image: movie #13043]

Page 143:

[Figure: the partially observed users × movies rating matrix from earlier slides.]

Yehuda Koren

AT&T Labs – Research

[email protected]

BellKor homepage: www.research.att.com/~volinsky/netflix/

Page 144:

Summary

• The most effective approaches are often hybrids of collaborative and content-based filtering

– c.f. Netflix, Google News recommendations

• Lots of data are available, but algorithm scalability is critical

– (use Map/Reduce, Hadoop)

• Suggested rules of thumb (take with a grain of salt):

– With moderate amounts of rating data per user, use collaborative filtering with kNN (a reasonable baseline)

– With little data, use content-based (model-based) recommendations

Page 145:

Resources

• Recommender Systems Survey: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1423975

• Koren et al. on winning the Netflix challenge:

– http://videolectures.net/kdd09_koren_cftd/

– http://www.searchenginecaffe.com/2009/09/winning-1m-netflix-prize-methods.html

• http://www.recommenderbook.net/
