amazon item-to-item recommendations
Amazon.com Recommendations: Item-to-Item Collaborative Filtering
Authors: Greg Linden, Brent Smith, and Jeremy York
Origin: January/February 2003, published by the IEEE Computer Society
Reporter: 朱韋恩
Date: 2008/11/3
Outline
Introduction
Problems
Recommendation Algorithms
Comparison
Conclusion
Recommender systems in our lives
Some problems
Many applications require the result set to be returned in real time
New customers typically have extremely limited information
Customer data is volatile
Three common approaches to solving the problem
Traditional collaborative filtering
Cluster models
Search-based methods
Amazon.com Item-to-Item CF Algorithm
Traditional Collaborative Filtering
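Nearest-neighbor CF algorithm using cosine distance
Each customer is represented as an N-dimensional vector of items; the similarity of two customers A and B is the cosine of the angle between their vectors: similarity(A, B) = cos(A, B) = (A · B) / (||A|| ||B||)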
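The cosine measure between two customers can be sketched in a few lines of Python. This is only an illustration: the four-item catalog and the 0/1 purchase vectors are made-up data, and returning 0.0 for a customer with no purchases is an assumption, not something the slides state.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length purchase vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: a customer with no purchases matches nobody
    return dot / (norm_a * norm_b)

# Hypothetical catalog of N = 4 items; 1 means the customer bought the item.
customer_a = [1, 1, 0, 0]
customer_b = [1, 0, 1, 0]
customer_c = [0, 0, 1, 1]

sim_ab = cosine_similarity(customer_a, customer_b)  # one shared item
sim_ac = cosine_similarity(customer_a, customer_c)  # no shared items
```

Customers A and B share one of their two items, so they score higher than A and C, who share none.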
Traditional Collaborative Filtering
Disadvantages
1. Examines only a small customer sample...
2. Item-space partitioning...
3. If it discards the most popular or unpopular items...
Cluster Models
Goal: Divide the customer base into many segments and assign the user to the segment containing the most similar customers
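A rough sketch of the online assignment step follows. It assumes each segment has already been summarized offline as a centroid vector and that similarity is measured by cosine; the slides specify neither, and the segment names and numbers are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def assign_segment(user, centroids):
    """Return the name of the segment whose centroid is most similar to the user."""
    return max(centroids, key=lambda name: cosine(user, centroids[name]))

# Hypothetical segment centroids, assumed to be computed offline.
segments = {
    "cooking": [0.9, 0.8, 0.1, 0.0],
    "sci-fi": [0.0, 0.1, 0.7, 0.9],
}
user = [1, 0, 0, 0]  # the user bought only the first item

segment = assign_segment(user, segments)
```

The user is compared against two centroids instead of every customer, which is where the online speed-up comes from.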
Cluster Models
Advantage: the user is compared with a small number of segments rather than the entire customer base, so online scalability and performance are better. The complex and expensive clustering computation is run offline.
Disadvantage: recommendation quality is low.
Search-Based Methods
Given the user's purchased and rated items, the algorithm constructs a search query to find other popular items
For example, items by the same author, artist, or director, or with similar keywords
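The query construction can be illustrated with a small sketch. The catalog, its field names (`id`, `author`, `keywords`, `sales`), and the ranking by sales are all assumptions made for this example, not details from the slides.

```python
# Hypothetical catalog; every field name and value is made up for illustration.
CATALOG = [
    {"id": "b1", "author": "Author A", "keywords": {"robots", "sci-fi"}, "sales": 500},
    {"id": "b2", "author": "Author A", "keywords": {"empire", "sci-fi"}, "sales": 900},
    {"id": "b3", "author": "Author B", "keywords": {"space", "sci-fi"}, "sales": 700},
    {"id": "b4", "author": "Author C", "keywords": {"cooking"}, "sales": 800},
]

def build_query(purchased_ids, catalog):
    """Collect the authors and keywords of everything the user has bought."""
    authors, keywords = set(), set()
    for item in catalog:
        if item["id"] in purchased_ids:
            authors.add(item["author"])
            keywords |= item["keywords"]
    return authors, keywords

def search(purchased_ids, catalog, limit=5):
    """Return unpurchased items matching the query, most popular first."""
    authors, keywords = build_query(purchased_ids, catalog)
    matches = [
        item for item in catalog
        if item["id"] not in purchased_ids
        and (item["author"] in authors or item["keywords"] & keywords)
    ]
    matches.sort(key=lambda item: -item["sales"])
    return [item["id"] for item in matches[:limit]]

results = search({"b1"}, CATALOG)
```

The query grows with every purchase, which hints at why this approach becomes impractical for users with very long histories.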
Search-Based Methods
If the user has few purchases or ratings, search-based recommendation algorithms scale and perform well
If the user has thousands of purchases, it is impractical to base a query on all the items
Search-Based Methods
Disadvantage: recommendations are often either
1. too general, or
2. too narrow
Item-to-Item Collaborative Filtering
Rather than matching the user to similar customers, build a similar-items table by finding items that customers tend to purchase together
Amazon.com uses this method
Amazon.com
Item-to-Item CF Algorithm
For each item in product catalog, I1
  For each customer C who purchased I1
    For each item I2 purchased by customer C
      Record that a customer purchased I1 and I2
  For each item I2
    Compute the similarity between I1 and I2
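The loop above can be sketched in Python as follows. The slide does not say which similarity measure is used, so this sketch assumes cosine similarity over binary "who bought it" vectors, which reduces to the co-purchase count divided by the square root of the product of the two items' buyer counts. The customers and items are made-up data.

```python
import math
from collections import defaultdict

def build_similar_items(purchases):
    """purchases maps customer -> set of items; returns item -> [(other, similarity)]."""
    # Invert the data so each item knows which customers bought it.
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    table = {}
    for i1, i1_buyers in buyers.items():
        # Record how often each other item I2 was bought together with I1.
        co_counts = defaultdict(int)
        for customer in i1_buyers:
            for i2 in purchases[customer]:
                if i2 != i1:
                    co_counts[i2] += 1
        # Assumed measure: cosine similarity on binary purchase vectors.
        sims = [
            (i2, both / math.sqrt(len(i1_buyers) * len(buyers[i2])))
            for i2, both in co_counts.items()
        ]
        sims.sort(key=lambda pair: (-pair[1], pair[0]))
        table[i1] = sims
    return table

# Hypothetical purchase history.
purchases = {
    "alice": {"book", "lamp"},
    "bob": {"book", "lamp", "pen"},
    "carol": {"book", "pen"},
    "dave": {"lamp"},
}
table = build_similar_items(purchases)
```

Only item pairs that share at least one customer are ever scored, so items with no common buyers cost nothing.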
Item-to-Item Collaborative Filtering
Advantage: increases scalability and performance
Scalability: A Comparison
Traditional CF: impractical on large data sets
Cluster models: perform much of the computation offline, but recommendation quality is relatively poor
Search-based models: scale poorly for customers with numerous purchases and ratings
Scalability: A Comparison
Item-to-Item CF:
- creates the similar-items table offline
- fast for extremely large data sets
- quality is excellent
- performs well with limited user data
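To see why the online step is fast once the table exists, here is a minimal lookup sketch. The slides do not describe how the table is used at request time, so summing similarity scores across the user's items is an assumption, and the table contents are hypothetical.

```python
from collections import defaultdict

# Hypothetical similar-items table, assumed to be precomputed offline.
SIMILAR_ITEMS = {
    "book": [("pen", 0.82), ("lamp", 0.67)],
    "lamp": [("book", 0.67), ("pen", 0.41)],
    "pen": [("book", 0.82), ("lamp", 0.41)],
}

def recommend(purchased, table, top_n=3):
    """Aggregate the neighbors of each purchased item and return the best ones."""
    scores = defaultdict(float)
    for item in purchased:
        for other, sim in table.get(item, []):
            if other not in purchased:
                scores[other] += sim  # assumed aggregation: sum of similarities
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]

recs_one_item = recommend({"book"}, SIMILAR_ITEMS)
recs_two_items = recommend({"book", "lamp"}, SIMILAR_ITEMS)
```

The work per request depends only on how many items the user has bought, not on the number of customers or the size of the catalog, and a single purchase is already enough to produce recommendations.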
Conclusion