a hybrid recommender system: user profiling from keywords and ratings ana stanescu, swapnil nagar,...

28
A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences on Web Intelligence (WI) and Intelligent Agent Technology (IAT)

Upload: pauline-hodges

Post on 12-Jan-2016

220 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

A Hybrid Recommender System: User Profiling from Keywords and Ratings

Ana Stanescu, Swapnil Nagar, Doina Caragea

2013 IEEE/WIC/ACM International Conferences on Web Intelligence (WI) and Intelligent Agent Technology (IAT)

Page 2: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

2

Outline

Introduction Related Work Approaches Experimental Setup Results Conclusion

Page 3: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

3

Introduction(1/3) Recommendation systems[3]

Content-Based User preferred in the past. Data scarcity problem. Cannot identify new and different items.

Collaborative Filtering Based on the user-user similarity. A new item cannot be recommended.

Hybrid

• [3] M. Balabanovic and Y. Shoham. Fab: content-based, collaborative recommendation. Communications of the ACM, 40, 1997.

Page 4: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

4

Introduction(2/3)

We propose a hybrid system that mediates the data sparsity problem and reduces the noise from the user generated content.

We adapt for movies the Weighted Tag Recommender (WTR) approach from [14]. Addressed the problem of recommending books on

Amazon and built their system exclusively from tag information.

• [14] H. Liang, Y. Xu, Y. Li, R. Nayak, and G. Shaw. A hybrid recommender systems based on weighted tags. 10th SIAM International Conference on Data Mining, 2010.

Page 5: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

5

Introduction(3/3)

Weighted Tag-Rating Recommender (WTRR). Weighted Keyword-Rating Recommender

(WKRR). Both our keyword and tag representations of users

can help alleviate the noise and semantic ambiguity problems inherent in the information contributed by users of social networks.

Page 6: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

6

Related Work(1/3) Tagging is a type of labeling, whose purpose is to

assist users in the process of finding content on the web. [18]

Tags are free annotations and there are no constrains assigning tags.

A hybrid system proposed by Liang et al. [14] addresses these problems, by using weighted tags.

• [14] H. Liang, Y. Xu, Y. Li, R. Nayak, and G. Shaw. A hybrid recommender systems based on weighted tags. 10th SIAM International Conference on Data Mining, 2010.

• [18] A. Said, B. Kille, E. W. De Luca, and S. Albayrak. Personalizing tags: a folksonomy-like approach for recommending movies. In Proceedings of the 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems, HetRec ’11, 2011.

Page 7: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

7

Related Work (2/3)

For domains where both tags and ratings are available, a recommender system should exploit all the information.

Systems that leverage ratings, which can be either explicitly provided by the users[5], are known to perform well.

Ratings can also be noisy.[2]• [5] R. M. Bell, Y. Koren, and C. Volinsky. The Bellkor 2008 solution to the Netflix prize.

2008.• [2] X. Amatriain, J. Pujol, and N. Oliver. I like it... i like it not: Evaluating user ratings noise

in recommender systems. In User Modeling, Adaptation, and Personalization, Lecture Notes in Computer Science. 2009.

Page 8: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

8

Related Work (3/3)

The system proposed by [6] is an ensemble of various recommenders primarily used for mining and aggregating the information from various sources.

In [12], the authors propose learning multiple models which can incorporate different types of inputs to predict the preferences of diverse users.

• [6] E. Bothos, K. Christidis, D. Apostolou, and G. Mentzas. Information market based recommender systems fusion. In Proceedings of the 2nd International Workshop on Informatio.

• [12] C. Jones, J. Ghosh, and A. Sharma. Learning multiple models for exploiting predictive heterogeneity in recommender systems. 2011.

Page 9: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

9

Approaches – WTRR(1/5)

Weighted Tag-Rating Recommender(WTRR) The book recommender system proposed in [14] is

built from tag information only. Tags may not always capture the true preference of

the user. We incorporate the actual ratings.

• [14] H. Liang, Y. Xu, Y. Li, R. Nayak, and G. Shaw. A hybrid recommender systems based on weighted tags. 10th SIAM International Conference on Data Mining, 2010.

Page 10: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

10

Approaches – WTRR(2/5) Tag Relevance

Finding meaning of each tag for each user individually

Tag Relatedness Metric

Summation of ratings assigned to the movie mi by all the users who used tag tx.

Summation of all the ratings from the users who tagged mi.

Measures how similar tag ty is to a given tag tx.

The set of movies tagged with tx by ui.

Page 11: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

11

Approaches – WTRR(3/5)

User Profile To leverage the advantages of hybrid systems,

users topic preferences and movie preferences are combined.

Every user is represented by a profile, encoded using a vector of weights:

• uiT : user ui’s topic preferences. (values denoting how much ui is interested in each tag.)

• uiM : user ui’s movie preferences.

Page 12: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

12

Approaches – WTRR(4/5)

Weight of each tag for a user

Total relevance weight of ty for ui

Summation of ratings assigned to the movie mj by all the users who used tx.

Summation of all ratings assigned to the movie mj by all the users who tagged it.

Page 13: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

13

Approaches – WTRR(5/5)

Inverse user frequency of tag ty

The tag representation of each user(Values of the topic preference vector ui

T for each user ui)

• |Uty| is the number of users that used ty .

• e is Euler’s number.

Page 14: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

14

Approaches – WKRR(1/4)

Weighted Keyword-Rating Recommender (WKRR). Our algorithm dynamically creates a user profile

from IMDB movie keywords and explicit user ratings.

Similar to WTRR, we profile users on preference.

• uiK : user ui’s keyword topic preferences.

• uiR : user ui’s rating-based movie preferences.

Page 15: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

15

Approaches – WKRR(2/4)

Movie Description Based on Weighted Keywords movie keyword relevance metric

Page 16: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

16

Approaches – WKRR(3/4)

The Representation of Keywords degree of connection between keywords

representation of keyword kx

Page 17: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

17

Approaches – WKRR(4/4)

User Profile Generation From Keywords Weight of a keyword to a user

Total relevance weight of a keyword for a user

Page 18: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

18

Approaches – Neighborhood Formation(1/2)

In order to predict a user’s rating for an unseen movie, we first set out to find the community of users sharing similar taste.

Identify for each user u, an ordered list of k most similar users such that sim(u, u1) is maximum, sim(u, u2) is the second highest and so on.

Page 19: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

19

Approaches – Neighborhood Formation(2/2)

The similarity between two users

In this paper, ω = 0.9.

Page 20: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

20

Approaches – Rating Prediction Formula(1/2)

Traditional Top N algorithms choose the Top N most similar neighbors to predict the missing value. Set of users similar to u:

Page 21: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

21

Approaches – Rating Prediction Formula(2/2)

To calculate the missing ratings we used a popular user-based prediction formula described in [11].

• [11] J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl. Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 2004.

• ru : the average of the ratings given by user u.

• wuv : the similarity value between user u and user v.

• σu : the standard deviation of ratings given by user u.• N(u) : set of most similar users to user u.

Page 22: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

22

Experimental Setup(1/3)

Dataset hetrec2011- movielens-2k dated May 2011[7]

Based on the original MovieLens10M dataset, published by the GroupLens research group.

• [7] I. Cantador, P. Brusilovsky, and T. Kuflik. 2nd workshop on information heterogeneity and fusion in recommender systems (hetrec 2011). In Proceedings of the 5th ACM conference on Recommender systems, 2011.

• http://www.grouplens.org

Page 23: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

23

Experimental Setup(2/3)

Evaluation Metrics Predictive accuracy metrics

Root Mean Squared Error (RMSE) Mean Absolute Error (MAE)

• N : the total number of ratings from all users.• pu,m : the predicted rating for user u on movie m.

• ru,m : the actual rating for movie m assigned by the user u.

Page 24: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

24

Experimental Setup(3/3)

Experiments We trained our algorithm on the train set and then

predicted the ratings in the test set. We kept 80% of users for training, while 20% of

users were set aside for test.

Page 25: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

25

Results(1/3)

Compare WTRR ,WKRR, and purely collaborative (PC) approach

Page 26: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

Results(2/3)

Compare the results of the WKRR with the results of state of the art approaches reported in [6] and [12].

26

• [6] E. Bothos, K. Christidis, D. Apostolou, and G. Mentzas. Information market based recommender systems fusion. In Proceedings of the 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems, 2011.

• [12] C. Jones, J. Ghosh, and A. Sharma. Learning multiple models for exploiting predictive heterogeneity in recommender systems. 2011.

Page 27: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

27

Results(3/3)

Page 28: A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences

28

Conclusion

We propose a novel hybrid recommendation technique.

WTRR and WKRR use tags and keywords, respectively.

The results of our experiments show that the performance of WKRR exceeds the other approaches.

WTRR is better than WKRR, when only the subset of data with both tags and keywords is used.