personalized paper recommendation based on user historical behavior
Personalized Paper Recommendation Based on
User Historical Behavior
Yuan Wang
College of Information Technology Science, Nankai University, Tianjin, China
kdd.nankai.edu.cn/wangy
Major contributors: Jie Liu, XingLiang Dong, Tianbi Liu, and YaLou Huang
Nankai University IIP Lab
Outline
Why do we need personalized paper recommendation?
  Background and our motivation
  Related approaches
Personalized Paper Recommendation model
  Overview
  Estimation for each part
  The optimization of the model
Results
  Data Collection
  Evaluation
Conclusion
Background
The rapid development of Digital Libraries (DL) for information sharing and search
More new ideas are first posted on DLs
Information overload and information loss
Motivation
How is the problem tackled now? An effective technique is needed
Current practice: sending an email or using an RSS subscription
  Requires users to state their interests explicitly
Users' historical browsing behaviors on the site reveal what users have in mind
  Behaviors: publishing, marking a paper as favorite, rating, making a comment, and tagging
Motivation
Focus on providing more relevant papers to researchers
Learn their personalized preferences from their historical behaviors
Related approaches
Personalized recommendation
Collaborative filtering
  Users tend to prefer the things their friends prefer
  Examples: PHOAKS, Referral Web, CiteSeer search, Amazon, eBay, Douban; mainly in commercial recommendation systems
  Drawback: cold start; remedies include pseudo users, adding new scores, and clustering
Content-based filtering
  Based on content and user similarity
  Examples: WebWatcher, LIRA, Letizia
Related approaches
Personalized recommendation for scientific papers
Collaborative filtering methods
  Citation relationships, similar to PageRank: little use of user preference
  Learning from another recommendation system, then filtering: the input is difficult to obtain
  Preferences mined from logs: noisy
  Drawback: content information should be taken into consideration
Content-based filtering
  Uses the information carried by the papers themselves
  No cold-start or data-sparsity problem
  Statistical models are effective for paper recommendation
Personalized Paper Recommendation model
A triple relationship for PPR: (Di, Dx, U)
Description
  Di: the document set to be recommended to the user
  Dx: the document set the user is viewing
  U: the current user
Similarity between recommended resource and users
Personalized Paper Recommendation model
Assumption: users and documents are drawn i.i.d.
Description
  p(di): prior probability of a paper
  p(dx|di): similarity between papers
  p(uk|di): similarity between user and paper
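The three parts listed above can be combined into a single ranking score. A minimal sketch, assuming the parts multiply as p(di) · p(dx|di) · p(uk|di); the slides list the three parts, but the combining formula did not survive extraction, so the product form is an assumption. The function name `rank_candidates` and the toy numbers are hypothetical.

```python
import math

def rank_candidates(prior, paper_sim, user_sim):
    """Rank candidate papers by log p(di) + log p(dx|di) + log p(uk|di).

    Each argument maps a candidate paper id to a probability in (0, 1].
    The product form is an assumption, not taken from the slides.
    """
    scores = {}
    for di in prior:
        scores[di] = (math.log(prior[di])
                      + math.log(paper_sim[di])
                      + math.log(user_sim[di]))
    # Highest score first
    return sorted(scores, key=scores.get, reverse=True)

# Toy values (hypothetical)
prior = {"p1": 0.5, "p2": 0.3, "p3": 0.2}
paper_sim = {"p1": 0.01, "p2": 0.05, "p3": 0.02}
user_sim = {"p1": 0.02, "p2": 0.01, "p3": 0.10}
ranking = rank_candidates(prior, paper_sim, user_sim)
```

Working in log space avoids underflow when the word-level probabilities behind p(dx|di) and p(uk|di) are very small.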
Prior Probability of Paper
Evaluate the probability that a document will be selected across the whole website, based on users' historical behaviors
The more valuable a paper, the more operations it receives
A = {down, keep, visit, tag, score, comment, collect, ……}
Absolute discount smoothing
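One way the prior could be estimated from action counts with absolute discounting is sketched here. The exact formula is not preserved in the slides, so this is an illustration under stated assumptions: every action in A counts equally, the discount `delta` is a hypothetical value, and the discounted mass is spread uniformly over all candidate papers. The name `paper_prior` is also hypothetical.

```python
from collections import Counter

def paper_prior(action_log, all_papers, delta=0.5):
    """Estimate p(di) from action counts with absolute discounting.

    action_log: iterable of (action, paper_id) pairs.
    all_papers: every paper id that may be recommended (must include
                every paper that appears in action_log).
    delta: discount in (0, 1); the value here is hypothetical.
    """
    counts = Counter(paper for _action, paper in action_log)
    total = sum(counts.values())
    seen = len(counts)
    n = len(all_papers)
    if total == 0:
        # No behavior recorded yet: fall back to a uniform prior
        return {d: 1.0 / n for d in all_papers}
    # Mass removed from papers with actions, spread uniformly over all papers
    reserved = delta * seen / total
    return {d: max(counts[d] - delta, 0.0) / total + reserved / n
            for d in all_papers}

log = [("visit", "p1"), ("down", "p1"), ("tag", "p1"), ("visit", "p2")]
prior = paper_prior(log, ["p1", "p2", "p3"])
```

A paper with no recorded actions (here `p3`) still receives a small non-zero prior, which is the point of the smoothing.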
Similarity between Papers
Title, abstract, keywords, and research domain serve as each document's features
Similarity is calculated after word segmentation
  w: each word in a document
  tf(w; di): the frequency with which w appears in di
  tf(di): the total frequency of all words in di
  tf(w, D): the frequency with which w appears in all documents
  tf(D): the total frequency of all words in all documents
  a: a smoothing parameter (0.1 here)
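The quantities tf(w; di), tf(di), tf(w, D), tf(D), and a fit a smoothed unigram language model. A minimal sketch, assuming linear interpolation between the document and the collection frequencies and a product over the words of dx; the slides' own formula is not preserved, so both choices are assumptions, and `word_prob` and `paper_similarity` are hypothetical names. Documents are assumed to be non-empty, already segmented word lists.

```python
from collections import Counter

def word_prob(w, doc_tf, coll_tf, a=0.1):
    """p(w|di): document frequency interpolated with collection frequency."""
    p_doc = doc_tf[w] / sum(doc_tf.values())      # tf(w; di) / tf(di)
    p_coll = coll_tf[w] / sum(coll_tf.values())   # tf(w, D) / tf(D)
    return (1 - a) * p_doc + a * p_coll

def paper_similarity(dx_words, di_words, collection):
    """p(dx|di): product of p(w|di) over the words of dx."""
    doc_tf = Counter(di_words)
    coll_tf = Counter(w for doc in collection for w in doc)
    p = 1.0
    for w in dx_words:
        p *= word_prob(w, doc_tf, coll_tf)
    return p

# Toy collection (hypothetical)
docs = [["graph", "random", "walk"],
        ["topic", "model", "graph"],
        ["user", "model"]]
dx = ["graph", "model"]
sims = [paper_similarity(dx, di, docs) for di in docs]
```

In this toy collection the second document, which contains both words of dx, gets the highest similarity.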
Similarity between User and Paper
Actions on papers reveal users' preferences
Assumption: words in user profiles and documents are drawn i.i.d.
  Wk: the set of words in the user's profile
  tf(w; uk): the frequency with which w appears in the profile of uk
  P(w|di): calculated as in the similarity between papers
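Under the i.i.d. assumption, the user-paper similarity can be sketched as a product of P(w|di) over the profile words, with each word weighted by tf(w; uk). The weighting and the interpolation smoothing are assumptions, since the slides' formula is not preserved, and `user_paper_log_similarity` is a hypothetical name. Every profile word is assumed to occur somewhere in the collection.

```python
import math
from collections import Counter

def user_paper_log_similarity(profile_words, di_words, collection, a=0.1):
    """log p(uk|di) = sum over w in Wk of tf(w; uk) * log P(w|di)."""
    user_tf = Counter(profile_words)   # tf(w; uk)
    doc_tf = Counter(di_words)         # tf(w; di)
    coll_tf = Counter(w for doc in collection for w in doc)
    doc_len = sum(doc_tf.values())     # tf(di)
    coll_len = sum(coll_tf.values())   # tf(D)
    log_p = 0.0
    for w, tf_user in user_tf.items():
        p_w = (1 - a) * doc_tf[w] / doc_len + a * coll_tf[w] / coll_len
        log_p += tf_user * math.log(p_w)
    return log_p

# Toy data (hypothetical): the profile is built from words of papers the user acted on
docs = [["graph", "random", "walk"],
        ["topic", "model", "graph"],
        ["user", "model"]]
profile = ["graph", "graph", "walk"]
scores = [user_paper_log_similarity(profile, di, docs) for di in docs]
```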
The optimization of the model
The original model did not consider the transitive relation between users' preferences
Optimize the model by matrix transition
[Figure: User A, User B, User C]
The optimization of the model
Random Walk on graph
Normalize by column; normalize by row
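A random-walk step over users can be sketched as follows. Which matrix the slides normalize by column and which by row, and how the walk is combined with the original model, are not recoverable from the transcript; this sketch assumes a row-normalized user-user similarity matrix is used as the transition matrix and applied to a user-paper score matrix. All names and numbers are hypothetical.

```python
def normalize_rows(m):
    """Scale each row to sum to 1 (all-zero rows are left as zeros)."""
    out = []
    for row in m:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

def normalize_columns(m):
    """Scale each column to sum to 1 by normalizing the rows of the transpose."""
    transposed = [list(col) for col in zip(*m)]
    return [list(row) for row in zip(*normalize_rows(transposed))]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def propagate(user_sim, user_paper, steps=1):
    """Apply `steps` random-walk steps: scores <- P @ scores."""
    p = normalize_rows(user_sim)
    scores = user_paper
    for _ in range(steps):
        scores = matmul(p, scores)
    return scores

# Hypothetical example: users A, B, C and two papers
user_sim = [[1.0, 1.0, 0.0],
            [1.0, 1.0, 1.0],
            [0.0, 1.0, 1.0]]
user_paper = [[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 1.0]]
one_step = propagate(user_sim, user_paper, steps=1)
```

After one step, user B, who has no direct score, inherits preference from the similar users A and C. More steps keep averaging the scores across users.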
Data Collection
Training data
  Website: www.paper.edu.cn
  User behaviors: publish, keep, download, visit, tag, score, and comment on papers
  Papers: first published between October 1, 2010 and March 1, 2011
Test data
  Format: (U, Dx, Di, L)
  638 data samples: 339 labeled 1 and 299 labeled 0
  26 users and 93 papers
Evaluation Metrics
MAP (Mean Average Precision): shows the accuracy of the models
NDCG (Normalized Discounted Cumulative Gain): ranked-list accuracy, evaluated from NDCG@1 to NDCG@6
Results
Personalized or not: personalization gives a five percent increase; the average improvement in NDCG is 10.2%.
Results
User preference iteration: the more iterations, the more sharply MAP declines, because the original information is lost as the number of iterations increases.
Results
Optimization of the model: "ori" means the original model and "n" means the number of iterations. ori+1 gets the highest score.
Conclusion
Proposed a personalized recommendation model based on users' historical behavior.
Users' preference profiles are extracted from historical behavior, with the help of content from the user model and paper information.
The recommendation service is provided with a generative model.
A random walk is added to the original model to help transfer correlation between users.
Appendix
MAP
M is the size of the recommended paper set, p(j) is the precision of the first j recommended papers, and l(j) is the label of the jth paper. C(di) is the total number of papers related to the viewed paper di.
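The MAP formula itself did not survive extraction. A minimal sketch of the usual definition that matches the symbols above, AP(di) = (1 / C(di)) · Σ_{j=1..M} p(j) · l(j), with MAP as the mean over all viewed papers. The function names are hypothetical, and when C(di) is not supplied the sketch assumes every related paper appears in the ranked list.

```python
def average_precision(labels, num_relevant=None):
    """AP for one ranked list of 0/1 labels l(1..M).

    num_relevant plays the role of C(di); if omitted, it is taken to be
    the number of 1-labels in the list (an assumption).
    """
    if num_relevant is None:
        num_relevant = sum(labels)
    if num_relevant == 0:
        return 0.0
    hits = 0
    total = 0.0
    for j, label in enumerate(labels, start=1):
        if label:
            hits += 1
            total += hits / j   # p(j): precision of the first j papers
    return total / num_relevant

def mean_average_precision(ranked_label_lists):
    """MAP: mean of AP over all ranked lists."""
    aps = [average_precision(labels) for labels in ranked_label_lists]
    return sum(aps) / len(aps)

ap = average_precision([1, 0, 1, 0])
map_score = mean_average_precision([[1, 0, 1, 0], [0, 1]])
```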
NDCG
r(i) is the relevance grade of the ith paper, and Zn is a normalization parameter that ensures NDCG@n equals 1 for a perfect ranking of the top results.
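The NDCG formula is also missing from the transcript. A minimal sketch of a common form, NDCG@n = Zn · Σ_{i=1..n} (2^{r(i)} − 1) / log2(1 + i), where Zn is the reciprocal of the same sum computed on the ideal ordering. The gain 2^r − 1 and the base-2 logarithm are widely used choices assumed here, not confirmed by the slides; the function names are hypothetical.

```python
import math

def dcg_at_n(grades, n):
    """Discounted cumulative gain of the first n relevance grades."""
    return sum((2 ** r - 1) / math.log2(1 + i)
               for i, r in enumerate(grades[:n], start=1))

def ndcg_at_n(grades, n):
    """DCG@n divided by the DCG@n of the ideal (sorted) ordering."""
    ideal = dcg_at_n(sorted(grades, reverse=True), n)
    return dcg_at_n(grades, n) / ideal if ideal else 0.0

perfect = ndcg_at_n([1, 1, 0], 3)
swapped = ndcg_at_n([0, 1], 2)
```

With the 0/1 labels of the test data, the gain 2^r − 1 reduces to the label itself.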