Toward the Next Generation of Recommender Systems
2008. 11. 05
IEEE Transactions on Knowledge and Data Engineering
Volume 17 , Issue 6 (June 2005)
Written by Gediminas Adomavicius, Alexander Tuzhilin
Summarized by Gihyun Gong
Copyright 2008 by CEBT
About paper
This paper presents an overview of recommendation systems
Focuses on rating-based recommendation, which is the most popular approach
Content based
Collaborative filtering
Hybrid methods
Extending capabilities of recommendation system
Outline
About recommendation
Recommendation methods
Demographic filtering
Content-based Methods
Collaborative Methods
Hybrid Methods
Current research issues in recommendation system
Recommendation
Recommendation is a type of information filtering technique that attempts to present information items (movies, music, books, news, images, web pages) that are likely to be of interest to the user
Recommendation can be formulated as:
C : set of all users
S : set of all possible items
u : utility function C × S → R that measures the usefulness of item s to user c
Recommendation is reduced to the problem of estimating ratings for the items that have not been seen by a user
How are ratings obtained?
How are unknown ratings estimated?
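The formulation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rating data, the names `Ratings`, `item_mean_estimate`, and `recommend`, and the item-mean baseline estimator are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, Optional, Set, Tuple

# Known values of u(c, s), keyed by (user, item). Hypothetical data.
Ratings = Dict[Tuple[str, str], float]

ratings: Ratings = {
    ("alice", "A"): 5.0, ("alice", "B"): 2.0,
    ("bob", "A"): 4.0, ("bob", "C"): 5.0,
    ("carol", "B"): 3.0, ("carol", "C"): 4.0, ("carol", "D"): 1.0,
}
items: Set[str] = {"A", "B", "C", "D"}


def item_mean_estimate(known: Ratings, user: str, item: str) -> float:
    """Assumed baseline estimator: mean of the item's known ratings (0.0 if none)."""
    values = [r for (_, s), r in known.items() if s == item]
    return sum(values) / len(values) if values else 0.0


def recommend(
    known: Ratings,
    all_items: Set[str],
    user: str,
    estimate: Callable[[Ratings, str, str], float] = item_mean_estimate,
) -> Optional[str]:
    """Return the unseen item s with the highest estimated u(user, s)."""
    unseen = [s for s in sorted(all_items) if (user, s) not in known]
    if not unseen:
        return None
    return max(unseen, key=lambda s: estimate(known, user, s))


print(recommend(ratings, items, "alice"))
```

Any estimator with the same signature can be passed as `estimate`; the content-based and collaborative methods differ mainly in how that estimate is produced.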
Recommendation (cont’d)
Problem of recommender system
Utility u is usually not defined on the whole C × S space, but only on some subset of it
The recommendation engine should be able to estimate the ratings of the non-rated user/item combinations
Recommendation system
A recommendation system guides the user in a personalized way to interesting or useful objects in a large space of possible options
Recommender systems are usually classified into the following categories, based on how recommendations are made:
Demographic filtering
Content-based recommendations: The user will be recommended items similar to the ones the user preferred in the past
Collaborative recommendations: The user will be recommended items that are preferred by other people with similar tastes and preferences
Hybrid approaches: These methods combine collaborative and content-based methods.
Demographic filtering
Uses demographic information
Ages, Jobs, Location, …
Advantages
No feedback is needed
No cold start problem
Disadvantages
Cannot provide personalization
Low accuracy
Too general
Content-based recommendation
Recommends items similar to those the user preferred in the past
User preference profile is the key
Matching “user preferences” with “item characteristics”
Designed mostly to recommend text-based items
The content in these systems is usually described with keywords
Similarity measure
TF-IDF
Cosine similarity
Similarity function
TF-IDF
N is the number of documents
n_i is the number of documents in which keyword k_i appears
f_{i,j} is the number of times keyword k_i appears in document j
$w_{i,j} = TF_{i,j} \times IDF_i = \frac{f_{i,j}}{\max_k f_{k,j}} \times \log\frac{N}{n_i}$
Cosine Similarity
For text matching, the attribute vectors A and B are usually the tf-idf vectors of the documents.
$\cos(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|}$
[Figure: vectors v1, v2, and user]
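The two similarity ingredients above can be illustrated with a minimal Python sketch. It assumes term frequency is normalized by the most frequent keyword in the document and that IDF is log(N / n_i); the keyword lists and function names are hypothetical, and the "user profile" is simply taken to be the vector of one liked document.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Compute w_{i,j} = (f_{i,j} / max_k f_{k,j}) * log(N / n_i) per document."""
    n_docs = len(docs)
    # n_i: number of documents that contain keyword k_i
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)  # f_{i,j}
        max_count = max(counts.values()) if counts else 1
        vectors.append({
            term: (f / max_count) * math.log(n_docs / doc_freq[term])
            for term, f in counts.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors (0.0 if either is all zeros)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical keyword lists for three items.
docs = [
    ["space", "alien", "space"],
    ["space", "robot"],
    ["romance", "drama"],
]
vectors = tfidf_vectors(docs)
user_profile = vectors[0]  # assumption: the user liked the first item
for j in (1, 2):
    print(j, round(cosine(user_profile, vectors[j]), 3))
```

In this sketch the second item shares the keyword "space" with the profile and gets a positive score, while the third shares nothing and scores zero.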
Limitations of Content-based methods
Limited Content Analysis
This method is based on text, but not all content is well represented by keywords
– Picture, Taste, …
Overspecialization
User is limited to being recommended items similar to those already rated
Items unlike anything the user has rated are not shown
Can be addressed by introducing some randomness, e.g., mutation in genetic algorithms
New User Problem
This method uses user preference profile
New users have very few ratings (or no history available)
System needs the new user’s ratings of sample items
However, people usually do not want to rate sample items
Collaborative Filtering
Uses trend information (“word of mouth”)
Basic idea of CF
1. Build a ratings table from user ratings.
2. Compare users’ ratings and calculate the similarity between users. The group of users with high similarity is called the ‘nearest neighborhood’.
3. Predict the user’s preference based on the ratings of the nearest neighborhood.
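The three steps above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes cosine similarity over co-rated items as the user similarity and a similarity-weighted average as the prediction, and the ratings table and function names are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

# Step 1: ratings table, user -> {item: rating}. Hypothetical data.
RatingsTable = Dict[str, Dict[str, float]]

table: RatingsTable = {
    "alice": {"A": 5.0, "B": 1.0},
    "bob": {"A": 5.0, "B": 1.0, "C": 5.0},
    "carol": {"A": 1.0, "B": 5.0, "C": 1.0},
}


def user_similarity(t: RatingsTable, a: str, b: str) -> float:
    """Step 2: cosine similarity over the items both users rated."""
    common = set(t[a]) & set(t[b])
    if not common:
        return 0.0
    dot = sum(t[a][s] * t[b][s] for s in common)
    norm_a = math.sqrt(sum(t[a][s] ** 2 for s in common))
    norm_b = math.sqrt(sum(t[b][s] ** 2 for s in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_neighbors(
    t: RatingsTable, user: str, item: str, k: int
) -> List[Tuple[str, float]]:
    """The k most similar users among those who rated the item."""
    candidates = [
        (other, user_similarity(t, user, other))
        for other in t
        if other != user and item in t[other]
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:k]


def predict(t: RatingsTable, user: str, item: str, k: int = 2) -> Optional[float]:
    """Step 3: similarity-weighted average of the neighbors' ratings."""
    neighbors = nearest_neighbors(t, user, item, k)
    total = sum(sim for _, sim in neighbors)
    if total == 0.0:
        return None  # no usable neighbor rated the item
    return sum(sim * t[other][item] for other, sim in neighbors) / total


print(predict(table, "alice", "C"))
```

Here alice's ratings match bob's far more closely than carol's, so bob's high rating of item C dominates the prediction.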
Collaborative Filtering methods
Memory-based (or Nearest-Neighborhood)
Similarity based model
Uses the entire collection of items previously rated by the users
Store all user information in a Database
Model-based
Probabilistic model
Uses the collection of ratings to learn a model, which is then used to make rating predictions
Based on machine learning
– Bayesian networks, clustering, neural networks, …
Advantages of Collaborative Filtering
Can deal with multimedia contents
Can recommend based on user preference and the quality of items
Can recommend serendipitous items
Limitations of Collaborative methods
New User Problem
Must first learn the user’s preferences from the ratings that the user gives
New Item Problem
Until the new item is rated by a substantial number of users, the recommender system would not be able to recommend it
User’s rating problem
Different users might use different scales
Sparsity
The number of ratings already obtained is usually very small compared to the number of ratings that need to be predicted
Scalability
Computing cost grows with the size of the C × S space
Systems typically have to search millions of users and items, which causes a serious scalability problem
Moreover, the correlations between users change when new users are added
Adaptability
A user’s requirements may change over time
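Of the limitations above, the rating-scale problem is commonly handled by normalizing each user's ratings before comparing users. The sketch below shows one such normalization, mean-centering, as an assumption rather than as the method the paper prescribes; the data and the name `mean_center` are hypothetical.

```python
from typing import Dict

RatingsTable = Dict[str, Dict[str, float]]  # user -> {item: rating}


def mean_center(table: RatingsTable) -> RatingsTable:
    """Subtract each user's own mean rating from that user's ratings."""
    centered: RatingsTable = {}
    for user, user_ratings in table.items():
        if not user_ratings:
            centered[user] = {}
            continue
        mean = sum(user_ratings.values()) / len(user_ratings)
        centered[user] = {item: r - mean for item, r in user_ratings.items()}
    return centered


# Two hypothetical users with the same preferences on different scales.
scales = {
    "strict": {"A": 1.0, "B": 2.0, "C": 3.0},
    "generous": {"A": 3.0, "B": 4.0, "C": 5.0},
}
centered = mean_center(scales)
print(centered["strict"])
print(centered["generous"])
```

After centering, both users have identical deviations from their own averages, so a similarity measure applied to the centered values treats them as having the same taste.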
Survey of Hybrid methods
Combining separate recommenders
– Linear combination of the two outputs
– Voting scheme
Adding content-based characteristics to a collaborative model
– Add a content-based profile for each user
– Use filterbots, which act as virtual users
Adding collaborative characteristics to a content-based model
– Add user profiles represented by term vectors for each item
Single unifying model
Knowledge-based techniques
– Entrée uses some domain knowledge
– Quickstep and Foxtrot systems use a topic ontology
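The first hybrid option listed above, a linear combination of two separate recommenders' outputs, can be sketched as follows. The scores, the weight, and the choice to treat a missing score as 0.0 are assumptions made only for this illustration.

```python
from typing import Dict


def linear_hybrid(
    content_scores: Dict[str, float],
    collab_scores: Dict[str, float],
    weight: float = 0.5,
) -> Dict[str, float]:
    """Blend two score tables: weight * content + (1 - weight) * collaborative."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    all_items = set(content_scores) | set(collab_scores)
    return {
        item: weight * content_scores.get(item, 0.0)      # assumed default 0.0
        + (1.0 - weight) * collab_scores.get(item, 0.0)   # assumed default 0.0
        for item in all_items
    }


# Hypothetical predicted ratings from the two separate recommenders.
content = {"A": 4.0, "B": 2.0}
collab = {"A": 2.0, "B": 5.0}

blended = linear_hybrid(content, collab, weight=0.5)
print(max(blended, key=blended.get))
```

The weight controls which recommender dominates: at 1.0 the result is purely content-based, at 0.0 purely collaborative.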
Extending capabilities
Comprehensive understanding of Users and Items
Profiles in pure content-based and collaborative systems still tend to be quite simple and do not utilize some of the more advanced profiling techniques
In addition to traditional profile features, such as keywords and simple user demographics, more advanced profiling techniques based on data mining rules, sequences, and signatures that describe a user’s interests can be used to build user profiles
Extending capabilities (cont’d)
Multidimensionality of Recommendations
Current recommendation systems use only two dimensions
– User x Item
We can extend dimension of recommendation
– Context(TPOK), Demographic information, …
Extending capabilities (cont’d)
Example of multidimensionality: movies
Traditional recommendation considers just two dimensions
– Who is the user?
– What movie?
We can consider other information
– Characteristics of the movie?
– Which person wants to see the movie?
– Where and how the movie will be seen?
– With whom the movie will be seen?
– When will the movie be seen?
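One simple way to picture the extra dimensions listed above is to store the context with each rating and restrict the data to the relevant context before applying an ordinary User × Item step. The sketch below assumes a single hypothetical context dimension ("with whom") and a plain per-movie mean as the estimate; it is an illustration, not the paper's method.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Rating(NamedTuple):
    user: str
    movie: str
    companion: str  # hypothetical context dimension: with whom the movie is seen
    value: float


def context_slice(
    ratings: List[Rating], companion: str
) -> Dict[Tuple[str, str], float]:
    """Keep only ratings given in one context, as a plain User x Item table."""
    return {(r.user, r.movie): r.value for r in ratings if r.companion == companion}


def movie_mean(table: Dict[Tuple[str, str], float], movie: str) -> Optional[float]:
    """Mean rating of a movie within the slice, or None if it has no ratings."""
    values = [v for (_, m), v in table.items() if m == movie]
    return sum(values) / len(values) if values else None


# Hypothetical ratings of the same movie in two different contexts.
context_ratings = [
    Rating("alice", "Horror", "friends", 5.0),
    Rating("bob", "Horror", "friends", 4.0),
    Rating("alice", "Horror", "family", 2.0),
    Rating("carol", "Horror", "family", 1.0),
]

print(movie_mean(context_slice(context_ratings, "friends"), "Horror"))
print(movie_mean(context_slice(context_ratings, "family"), "Horror"))
```

The same movie gets a very different estimate depending on the companion, which a two-dimensional table would average away.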
Extending capabilities (cont’d)
Multicriteria Rating
Expands the rating criteria
Taking a linear combination of multiple criteria and reducing the problem to a single-criterion optimization problem
Optimizing the most important criterion and converting the other criteria to constraints
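Both multicriteria strategies above can be sketched briefly. The restaurant names, criteria, weights, and thresholds below are hypothetical values chosen only for illustration.

```python
from typing import Dict, Optional

Criteria = Dict[str, float]  # criterion name -> rating


def linear_score(criteria: Criteria, weights: Dict[str, float]) -> float:
    """Strategy 1: collapse several criteria into one weighted score."""
    return sum(weights[name] * criteria[name] for name in weights)


def best_with_constraints(
    candidates: Dict[str, Criteria], main: str, minimums: Dict[str, float]
) -> Optional[str]:
    """Strategy 2: maximize the main criterion, subject to minimums on the others."""
    feasible = {
        name: c
        for name, c in candidates.items()
        if all(c[k] >= threshold for k, threshold in minimums.items())
    }
    if not feasible:
        return None
    return max(feasible, key=lambda name: feasible[name][main])


# Hypothetical multicriteria ratings for two restaurants.
restaurants = {
    "R1": {"food": 9.0, "decor": 4.0, "service": 6.0},
    "R2": {"food": 7.0, "decor": 8.0, "service": 8.0},
}
weights = {"food": 0.5, "decor": 0.25, "service": 0.25}

print(max(restaurants, key=lambda r: linear_score(restaurants[r], weights)))
print(best_with_constraints(restaurants, "food", {"decor": 3.0}))
```

The two strategies can disagree: the weighted score favors the all-round restaurant, while the constrained version picks the one with the best food as long as the other criteria are acceptable.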
Extending capabilities (cont’d)
Restaurant example :
Extending capabilities (cont’d)
Nonintrusiveness
The problem of feedback normalizing
One way to explore the intrusiveness problem is to determine an optimal number of ratings the system should ask from a new user
This topic is related to Opinion Mining
Extending capabilities (cont’d)
Flexibility
Most of the recommendation methods are “hard-wired” into the systems
Therefore, the end-user cannot customize recommendations according to his or her needs in real time.
Also, most of the recommender systems recommend only individual items to individual users and do not deal with aggregation.
However, it is important to be able to provide aggregated recommendations in a number of applications, such as recommending brands or categories of products to certain segments of users (e.g., vacations in Florida to students).
One way to support aggregated recommendations is by utilizing the OLAP-based approach.
Recommendation Query Language (RQL)
Extending capabilities (cont’d)
RQL is an SQL-like language for expressing flexible user-specified recommendation requests
For example, the request “recommend to each user from New York the best three movies that are longer than two hours” can be expressed in RQL.
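To make the semantics of that request concrete, the sketch below expresses it in plain Python rather than in RQL. The RQL syntax itself is not shown here, and the user table, movie lengths, predicted ratings, and the name `recommend_top_n` are all hypothetical.

```python
from typing import Dict, List, Tuple


def recommend_top_n(
    users: Dict[str, str],                   # user -> city
    movie_lengths: Dict[str, int],           # movie -> length in minutes
    predicted: Dict[Tuple[str, str], float], # (user, movie) -> predicted rating
    city: str,
    min_length: int,
    n: int = 3,
) -> Dict[str, List[str]]:
    """For each user in `city`, the n best movies longer than `min_length`."""
    result: Dict[str, List[str]] = {}
    for user, user_city in users.items():
        if user_city != city:
            continue
        eligible = [
            movie
            for movie, length in movie_lengths.items()
            if length > min_length and (user, movie) in predicted
        ]
        eligible.sort(key=lambda movie: predicted[(user, movie)], reverse=True)
        result[user] = eligible[:n]
    return result


# Hypothetical data for illustration only.
users = {"u1": "New York", "u2": "Boston"}
movie_lengths = {"M1": 150, "M2": 90, "M3": 130, "M4": 125, "M5": 200}
predicted = {
    ("u1", "M1"): 4.0, ("u1", "M2"): 5.0, ("u1", "M3"): 3.0,
    ("u1", "M4"): 4.5, ("u1", "M5"): 2.0,
}

print(recommend_top_n(users, movie_lengths, predicted, "New York", 120))
```

The city, the length threshold, and the number of results are parameters of the request rather than hard-wired choices, which is the kind of flexibility a query language is meant to give the end-user.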