recommendation engines

Recommendation Engines: A key personalization feature of modern web applications Haralambos (Babis) Marmanis NEJUG June 11, 2009

Upload: babis-marmanis

Post on 10-May-2015

DESCRIPTION

Modern web applications embrace personalization in order to provide a unique customer experience. Recommendation engines in general, and Collaborative Filtering in particular, are essential techniques for delivering state-of-the-art personalization on a web site. These slides are based on a presentation that I gave to the New England Java Users Group (NEJUG) in 2009; in that respect, they are quite old. Nevertheless, the content covers the fundamental concepts of these techniques, and the fundamentals never go out of fashion. The code references are from the Yooreeka project, which started with the code of the book "Algorithms of the Intelligent Web" (Manning, 2009). You can find the Yooreeka 2.0 API (Javadoc) at http://www.marmanis.com/static/javadoc/index.html

TRANSCRIPT

Page 1: Recommendation Engines

Recommendation Engines: A key personalization feature of modern web applications

Haralambos (Babis) Marmanis

NEJUG, June 11, 2009

Page 2: Recommendation Engines

Introduction Basic Concepts Collaborative Filtering Content based Netflix Prize Summary

Presentation Outline

1 Introduction
    Recommendations in Action
    “It’s the Economy ...”
    Java source code

2 Basic Concepts
    The Online Music Store Example
    Similarity
    Distance (formulas)
    Similarity (formulas)
    The “best” Similarity formula

3 Collaborative Filtering
    User based
    Rating Counting Matrix
    Item based

4 Content based
    Text Parsing & Analysis
    Document representation

5 Netflix Prize
    Netflix Prize Description
    Lessons learned

6 Summary

Page 3: Recommendation Engines

Recommendations in Action

Online store recommendations

Amazon.com: Provide recommendations for purchasing more items

Page 4: Recommendation Engines

Recommendations in Action

Online store recommendations

Netflix.com: Provide recommendations for viewing more movies

Page 5: Recommendation Engines

Recommendations in Action

Content recommendations

Any news portal or other content aggregator: Recommendations for articles, books, news stories

Page 6: Recommendation Engines

“It’s the Economy ...”

The Long Tail

Goodbye Pareto Principle, Hello Long Tail

Erik Brynjolfsson, Yu (Jeffrey) Hu, and Michael D. Smith used a log-linear curve to describe the relationship between Amazon.com sales and sales ranking.

They found that a large proportion of Amazon.com’s book sales come from obscure books that were not available in brick-and-mortar stores.

They also found that the consumer benefit from access to increased product variety in online book stores is ten times larger than the benefit from access to lower prices online!

Page 9: Recommendation Engines

Java source code

Yooreeka!

Open Source, Machine Learning library

Search, recommendations, clustering, classification, and combination of classifiers!

URL: http://code.google.com/p/yooreeka/

Page 11: Recommendation Engines

The Online Music Store Example

Frank’s music ratings

Page 12: Recommendation Engines

The Online Music Store Example

Constantine’s music ratings

Page 13: Recommendation Engines

The Online Music Store Example

Catherine’s music ratings

Page 14: Recommendation Engines

Similarity

The notion of Similarity

Often based on the notion of distance

The smaller the distance, the greater the similarity

Similarity values are typically constrained to [0,∞) or [0,1]

It is not necessary to define similarity formulas. E.g. if d < ε then similar, otherwise not.

Similarity could also be empirical or probabilistic

Page 19: Recommendation Engines

Distance (formulas)

Let $X$ and $Y$ be two vectors in $\mathbb{R}^N$ with components $X_i$ and $Y_i$.

Minkowski or p-norm distance

$$d = \left( \sum_{i=1}^{N} |X_i - Y_i|^p \right)^{1/p} \tag{1}$$

Manhattan distance ($p = 1$)

$$d = \sum_{i=1}^{N} |X_i - Y_i| \tag{2}$$

Chebyshev or $L_\infty$ distance

$$d = \lim_{p \to \infty} \left( \sum_{i=1}^{N} |X_i - Y_i|^p \right)^{1/p} = \max_i |X_i - Y_i| \tag{3}$$
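As a minimal sketch of these distances in Java: the class and method names below are illustrative and are not taken from Yooreeka. The Manhattan distance is computed as the p = 1 case of the Minkowski distance (a sum of absolute differences), and the Chebyshev distance as the largest absolute coordinate difference, which is what the p → ∞ limit converges to.

```java
/** Illustrative distance helpers; names are hypothetical, not part of Yooreeka. */
final class Distances {

    /** Minkowski (p-norm) distance; p is assumed to be >= 1. */
    static double minkowski(double[] x, double[] y, double p) {
        checkSameLength(x, y);
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += Math.pow(Math.abs(x[i] - y[i]), p);
        }
        return Math.pow(sum, 1.0 / p);
    }

    /** Manhattan distance: sum of absolute coordinate differences (p = 1). */
    static double manhattan(double[] x, double[] y) {
        checkSameLength(x, y);
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += Math.abs(x[i] - y[i]);
        }
        return sum;
    }

    /** Chebyshev distance: largest absolute coordinate difference (p -> infinity). */
    static double chebyshev(double[] x, double[] y) {
        checkSameLength(x, y);
        double max = 0.0;
        for (int i = 0; i < x.length; i++) {
            max = Math.max(max, Math.abs(x[i] - y[i]));
        }
        return max;
    }

    private static void checkSameLength(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }
}
```

With p = 2 the Minkowski distance is the familiar Euclidean distance, and for large p it approaches the Chebyshev value.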

Page 23: Recommendation Engines

Similarity (formulas)

Naïve Similarity

$$\mathrm{sim}_{\mathrm{Naive}} = \frac{\beta}{\beta + d} \tag{4}$$

where $d$ is the Euclidean distance.

Similarity I

$$\mathrm{sim}_{I} = 1 - \tanh(\sigma) \tag{5}$$

where $\sigma$ is the biased estimator of sample variance.

Similarity II

$$\mathrm{sim}_{II} = \mathrm{sim}_{I} \times \frac{\mathrm{common}}{\mathrm{maximum}} \tag{6}$$

There is more ... Jaccard, Tanimoto, and so on.
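The three formulas above can be sketched in Java as follows. This is a sketch under stated assumptions, not the Yooreeka implementation: the class and method names are made up, β is assumed positive, σ is passed in by the caller because the slide does not say which sample it is computed over, and "common" and "maximum" are read as a count of shared ratings and the largest such count possible.

```java
/** Illustrative similarity formulas; names and parameter meanings are assumptions. */
final class SimilarityFormulas {

    /** Euclidean distance between two equally sized vectors. */
    static double euclidean(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Naive similarity: beta / (beta + d), with beta > 0 assumed. */
    static double naive(double[] x, double[] y, double beta) {
        return beta / (beta + euclidean(x, y));
    }

    /** Biased sample variance: mean squared deviation from the mean (divides by n). */
    static double biasedVariance(double[] values) {
        if (values.length == 0) {
            return 0.0;
        }
        double mean = 0.0;
        for (double v : values) {
            mean += v / values.length;
        }
        double sumSquares = 0.0;
        for (double v : values) {
            sumSquares += (v - mean) * (v - mean);
        }
        return sumSquares / values.length;
    }

    /** Similarity I: 1 - tanh(sigma). */
    static double similarityI(double sigma) {
        return 1.0 - Math.tanh(sigma);
    }

    /** Similarity II: Similarity I scaled by common / maximum (maximum > 0 assumed). */
    static double similarityII(double simI, int common, int maximum) {
        return simI * common / maximum;
    }
}
```

Identical vectors give a naive similarity of exactly 1, and the value decays toward 0 as the distance grows; β controls how fast.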

Page 26: Recommendation Engines

The “best” Similarity formula

Which is the best similarity formula?

There is no such thing! It depends on the problem, the data, the definition of ... “best”.

Spertus, Sahami, and Buyukkokten (2005). Evaluating similarity measures: a large-scale study in the Orkut social network. Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining.

The simple L2 based (cosine) similarity showed the best empirical results among six similarity metrics.
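For reference, cosine similarity between two rating vectors can be sketched as below. The class name is illustrative, and returning 0 when either vector has zero length is a choice made for this sketch (the quantity is undefined in that case).

```java
/** Illustrative cosine similarity; the class name is hypothetical. */
final class CosineSimilarity {

    /** Returns dot(x, y) / (|x| * |y|), or 0 if either vector is all zeros. */
    static double cosine(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normX = 0.0;
        double normY = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }
        if (normX == 0.0 || normY == 0.0) {
            return 0.0; // undefined for zero vectors; 0 is a convention chosen here
        }
        return dot / (Math.sqrt(normX) * Math.sqrt(normY));
    }
}
```

Because the measure depends only on the angle between the vectors, two users whose ratings differ by a constant factor are treated as identical.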

Page 31: Recommendation Engines

Tapestry

Experimental mail system by Goldberg et al. (circa 1992) at Xerox PARC

Page 36: Recommendation Engines

User based

User Similarity Matrix

       U1   U2   U3   U4   U5   ..
U1  [ S11  S12  S13  S14  S15  ... ]
U2  [ S21  S22  S23  S24  S25  ... ]
U3  [ S31  S32  S33  S34  S35  ... ]
U4  [ S41  S42  S43  S44  S45  ... ]
U5  [ S51  S52  S53  S54  S55  ... ]
..  [ ...  ...  ...  ...  ...  ... ]

Page 37: Recommendation Engines

User based

User Similarity Matrix (cont.)

       U1    U2     U3     U4     U5     ..
U1  [ 1.0,  0.333, 0.385, 0.333, 0.364, ... ]
U2  [ 0.0,  1.000, 0.545, 0.385, 0.615, ... ]
U3  [ 0.0,  0.000, 1.000, 0.364, 0.636, ... ]
U4  [ 0.0,  0.000, 0.000, 1.000, 0.231, ... ]
U5  [ 0.0,  0.000, 0.000, 0.000, 1.000, ... ]
..  [ 0.0,  0.000, 0.000, 0.000, 0.000, ... ]

Page 38: Recommendation Engines

Rating Counting Matrix

Rating Counting Matrix

       R1   R2   R3   R4   R5
R1  [ X11  X12  X13  X14  X15 ]
R2  [ X21  X22  X23  X24  X25 ]
R3  [ X31  X32  X33  X34  X35 ]
R4  [ X41  X42  X43  X44  X45 ]
R5  [ X51  X52  X53  X54  X55 ]
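The slide does not define the entries of this matrix. One plausible reading, assumed here, is that for a pair of users the entry X_ij counts the items that the first user rated i and the second user rated j, on a 1 to 5 scale. Under that assumption, a simple similarity is the fraction of commonly rated items on which the two users agree (the diagonal of the matrix). The class and method names below are illustrative and are not the Yooreeka `RatingCountMatrix` API.

```java
import java.util.Map;

/** Illustrative rating-count matrix for a pair of users; not the Yooreeka API. */
final class RatingCountSketch {

    /**
     * Builds a levels x levels matrix whose entry [i-1][j-1] counts the items
     * that user A rated i and user B rated j. Ratings are assumed to be in 1..levels.
     */
    static int[][] countMatrix(Map<String, Integer> ratingsA,
                               Map<String, Integer> ratingsB,
                               int levels) {
        int[][] counts = new int[levels][levels];
        for (Map.Entry<String, Integer> entry : ratingsA.entrySet()) {
            Integer ratingB = ratingsB.get(entry.getKey());
            if (ratingB != null) { // only items rated by both users are counted
                counts[entry.getValue() - 1][ratingB - 1]++;
            }
        }
        return counts;
    }

    /** Fraction of commonly rated items with identical ratings; 0 if nothing in common. */
    static double agreementSimilarity(int[][] counts) {
        int agreement = 0;
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            for (int j = 0; j < counts[i].length; j++) {
                total += counts[i][j];
                if (i == j) {
                    agreement += counts[i][j];
                }
            }
        }
        return total == 0 ? 0.0 : (double) agreement / total;
    }
}
```

Computing this value for every pair of users fills a user similarity matrix; since the measure is symmetric, only the upper triangle needs to be stored.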

Page 39: Recommendation Engines

Rating Counting Matrix

BeanShell script (Users)

// Load the sample music dataset
BaseDataset ds = MusicData.createDataset();

// Create a user-based recommender over the dataset
Delphi delphi = new Delphi(ds, RecommendationType.USER_BASED);

// Pick a user, list the most similar users, then ask for recommendations
MusicUser mu1 = ds.pickUser("Bob");

delphi.findSimilarUsers(mu1);

delphi.recommend(mu1);

Page 40: Recommendation Engines

Item based

Item Similarity Matrix

       I1    I2     I3     I4     I5     ...
I1  [ 1.0,  0.333, 0.385, 0.333, 0.364, ... ]
I2  [ 0.0,  1.000, 0.545, 0.385, 0.615, ... ]
I3  [ 0.0,  0.000, 1.000, 0.364, 0.636, ... ]
I4  [ 0.0,  0.000, 0.000, 1.000, 0.231, ... ]
I5  [ 0.0,  0.000, 0.000, 0.000, 1.000, ... ]
..  [ 0.0,  0.000, 0.000, 0.000, 0.000, ... ]

Page 41: Recommendation Engines

Item based

BeanShell script (Items)

// ds is the dataset created in the user-based script
Delphi delphi = new Delphi(ds, RecommendationType.ITEM_BASED);

// Recommendations for a user, now driven by item similarity
MusicUser mu1 = ds.pickUser("Bob");

delphi.recommend(mu1);

// Pick an item and list the items most similar to it
MusicItem mi = ds.pickItem("La Bamba");

delphi.findSimilarItems(mi);
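To make the item-based idea concrete, here is a minimal sketch of how an item similarity matrix can drive a recommendation. It is not how `Delphi` is implemented; it is one common scheme, assumed for illustration: the predicted rating of an unrated item is the similarity-weighted average of the ratings the user has already given. The names are hypothetical, a rating of 0 marks "not rated", and the similarity matrix is assumed to be full and symmetric (not just the upper triangle).

```java
/** Illustrative item-based prediction; not the Delphi implementation. */
final class ItemBasedSketch {

    /**
     * Predicts the user's rating for item `target` as the similarity-weighted
     * average of the user's known ratings. Returns 0 if nothing can be inferred.
     */
    static double predict(double[] userRatings, double[][] itemSimilarity, int target) {
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (int k = 0; k < userRatings.length; k++) {
            if (k == target || userRatings[k] <= 0.0) {
                continue; // skip the target itself and items the user has not rated
            }
            weightedSum += itemSimilarity[target][k] * userRatings[k];
            weightTotal += Math.abs(itemSimilarity[target][k]);
        }
        return weightTotal == 0.0 ? 0.0 : weightedSum / weightTotal;
    }

    /** Returns the index of the unrated item with the highest prediction, or -1 if none. */
    static int bestUnrated(double[] userRatings, double[][] itemSimilarity) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int j = 0; j < userRatings.length; j++) {
            if (userRatings[j] > 0.0) {
                continue; // already rated, nothing to recommend
            }
            double score = predict(userRatings, itemSimilarity, j);
            if (score > bestScore) {
                bestScore = score;
                best = j;
            }
        }
        return best;
    }
}
```

An item that is very similar to something the user rated highly ends up with a high predicted rating, which is the behavior one expects from an item-based recommender.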

Page 42: Recommendation Engines

Item based

Peruse the code

Delphi

UserBasedSimilarity

ItemBasedSimilarity

BaseSimilarityMatrix

RatingCountMatrix

Page 48: Recommendation Engines

Text Parsing & Analysis

No more ratings, what do we do?

Now we deal with documents

So, we need to define similarity based on the content of the documents

Use Lucene’s StandardAnalyzer

Build your own! (see CustomAnalyzer)
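To illustrate what "similarity based on content" can mean, the sketch below reduces each document to a set of terms and compares two documents with the Jaccard coefficient. It deliberately avoids Lucene: the tokenizer is a plain regular-expression split and the stop-word list is a tiny made-up sample, both stand-ins for a real analyzer such as StandardAnalyzer. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Illustrative content-based similarity; a stand-in for a real text analyzer. */
final class ContentSimilaritySketch {

    // A tiny sample stop-word list, assumed for illustration only.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "in", "to", "is"));

    /** Lowercases the text, splits on non-alphanumeric characters, drops stop words. */
    static Set<String> terms(String text) {
        Set<String> result = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                result.add(token);
            }
        }
        return result;
    }

    /** Jaccard coefficient: size of the intersection divided by size of the union. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0; // undefined for two empty sets; 0 is a convention chosen here
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }
}
```

A real analyzer would also handle stemming, punctuation inside tokens, and language-specific rules, which is where most of the NLP difficulty lies.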

Page 52: Recommendation Engines

Document representation

No more ratings!

Page 56: Recommendation Engines

Netflix Prize Description

Netflix prize

More than 100 million ratings

480 thousand randomly-chosen, anonymous customers

18 thousand movie titles

Page 60: Recommendation Engines

Lessons learned

Important considerations

Data normalization

Neighbor selection
    How many neighbors?
    Who are the “best” neighbors?

Neighbor weights

“Our experience is that most efforts should be concentrated in deriving substantially different approaches, rather than refining a single technique.”
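The three considerations (normalization, neighbor selection, neighbor weights) show up together in a basic neighborhood predictor. The sketch below is one textbook-style arrangement, assumed for illustration and not taken from any Netflix Prize entry: ratings are normalized by subtracting each neighbor's mean, the k most similar neighbors are selected, and their similarities are used as weights. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Illustrative k-nearest-neighbor rating prediction with mean-centering. */
final class NeighborhoodSketch {

    /**
     * Predicts a rating as userMean plus the similarity-weighted average of the
     * mean-centered ratings of the k most similar neighbors.
     * The arrays are parallel: index v describes neighbor v.
     */
    static double predict(double userMean,
                          double[] similarities,
                          double[] neighborRatings,
                          double[] neighborMeans,
                          int k) {
        // Neighbor selection: order neighbor indices by decreasing similarity.
        Integer[] order = new Integer[similarities.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -similarities[i]));

        double weightedSum = 0.0;
        double weightTotal = 0.0;
        int limit = Math.min(k, order.length);
        for (int n = 0; n < limit; n++) {
            int v = order[n];
            // Normalization: use the deviation from the neighbor's own mean.
            double deviation = neighborRatings[v] - neighborMeans[v];
            // Neighbor weight: the similarity itself.
            weightedSum += similarities[v] * deviation;
            weightTotal += Math.abs(similarities[v]);
        }
        return weightTotal == 0.0 ? userMean : userMean + weightedSum / weightTotal;
    }
}
```

Changing k, the ordering criterion, or the weighting function gives a different predictor, which is one reason these choices matter so much in practice.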

Page 65: Recommendation Engines

Important considerations

Business value validation - “Long Tail”, “niches to riches”, etc.

Similarity metrics - Many to choose from, do not be afraid to explore!

Collaborative Filtering: “Show me your friend ...”
    User based
    Item based

Content based recommendations - NLP challenges

Large scale implementations - Speed, data size, quality
