Research Article: Exploiting Explicit and Implicit Feedback for Personalized Ranking
Gai Li1 and Qiang Chen2
1 School of Electronics and Information Engineering, Shunde Polytechnic, Shunde, Guangdong 528333, China. 2 Department of Computer Science, Guangdong University of Education, Guangzhou, Guangdong 510303, China.
Correspondence should be addressed to Gai Li; ligai999@126.com
Received 14 October 2015; Accepted 12 November 2015
Academic Editor: Anna Pandolfi
Copyright © 2016 G. Li and Q. Chen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data, rather than making full use of the information in the dataset. Until now, nobody has studied a personalized ranking algorithm that exploits both explicit and implicit feedback. In order to overcome the defects of prior research, a new personalized ranking algorithm (MERR_SVD++) based on the newest xCLiMF model and the SVD++ algorithm is proposed, which exploits both explicit and implicit feedback simultaneously and optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR). Experimental results on practical datasets show that our proposed algorithm outperforms existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ grows linearly with the number of ratings. Because of its high precision and good scalability, MERR_SVD++ is suitable for processing big data and has wide application prospects in the field of Internet information recommendation.
1. Introduction
As e-commerce grows in popularity, an important challenge is helping customers sort through a large variety of offered products to easily find the ones they will enjoy the most. One of the tools that address this challenge is the recommender system, which has been attracting a lot of attention recently [1–4]. Recommender systems are a subclass of information filtering systems that seek to predict the "rating" or "preference" that users would give to an item [1]. Preferences for items are learned from users' past interactions with the system, such as purchase histories or click logs. The purpose of the system is to recommend items that users might like from a large collection. Recommender systems have been applied to many areas on the Internet, such as the e-commerce system Amazon, the DVD rental system Netflix, and Google News. Recommender systems are usually classified into three categories based on how recommendations are made: content-based recommendations, collaborative filtering (CF), and hybrid approaches. Among these approaches, collaborative filtering is the most widely used and the most successful.
Recently, collaborative filtering algorithms have been widely studied in both academic and industrial fields. The data processed by collaborative filtering algorithms are divided into two categories: explicit feedback data (e.g., ratings, votes) and implicit feedback data (e.g., clicks, purchases). Explicit feedback data are more widely used in the research field of recommender systems [1, 2, 4–7]. They are often in the form of numeric ratings from users that express their preferences regarding specific items. Implicit feedback data are easier to collect. The research on CF with implicit feedback is also called One-Class Collaborative Filtering (OCCF) [8–15], in which only positive implicit feedback, or only positive examples, can be observed. The explicit and implicit feedback data can be expressed in matrix form, as shown in Figure 1. In the explicit feedback matrix, an element can be any real number, but ratings are often integers in the range 1–5, such as the ratings on Netflix, where a missing element represents a missing example. In the implicit feedback matrix, the positive-only user preference data can be represented as a single-valued matrix.
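To make the two matrix forms concrete, here is a minimal NumPy sketch. The 4×4 example matrix is hypothetical (it is not one of the matrices of Figure 1); it illustrates how a single-valued implicit matrix can be derived from an explicit rating matrix.

```python
import numpy as np

# Hypothetical explicit feedback: integer ratings in 1..5; 0 marks a missing entry.
X_explicit = np.array([
    [5, 0, 3, 0],
    [0, 4, 0, 1],
    [2, 0, 0, 5],
    [0, 3, 4, 0],
])

# Implicit feedback is positive-only, so the matrix is single-valued:
# 1 where any interaction (click, purchase) was observed, 0 elsewhere.
X_implicit = (X_explicit > 0).astype(int)
```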
Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2016, Article ID 2535329, 11 pages. http://dx.doi.org/10.1155/2016/2535329

Figure 1: Examples of an explicit feedback matrix (a) and an implicit feedback matrix (b) for a recommender system.

Collaborative filtering algorithms can also be divided into two categories: collaborative filtering (CF) algorithms based on rating prediction [2, 4, 5, 8, 9, 12, 16, 17] and personalized ranking (PR) algorithms based on ranking prediction [3, 6, 7, 10, 11, 13–15, 18]. In collaborative filtering algorithms based on rating prediction, one predicts the actual rating for an item that a customer has not rated yet and then ranks the items according to the predicted ratings. On the other hand, for personalized ranking algorithms based on ranking prediction, one predicts a preference ordering over the yet unrated items without going through the intermediate step of rating prediction. Actually, from the recommendation perspective, the order over the items is more important than their ratings. Therefore, in this paper we focus on personalized ranking algorithms based on ranking prediction.
The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data, rather than making full use of the information in the dataset. However, in most real world recommender systems both explicit and implicit user feedback are abundant and could potentially complement each other. It is desirable to be able to unify these two heterogeneous forms of user feedback in order to generate more accurate recommendations. The idea of complementing explicit feedback with implicit feedback was first proposed in [16], where the author considered explicit feedback as how the users rated the movies and implicit feedback as which movies were rated by the users. The two forms of feedback were combined via a factorized neighborhood model (called Singular Value Decomposition++, SVD++), an extension of the traditional nearest item-based model in which the item-item similarity matrix was approximated via low rank factorization. In order to make full use of explicit and implicit feedback, Liu et al. [17] proposed the Co_Rating model, which developed matrix factorization models that could be trained from explicit and implicit feedback simultaneously. Both SVD++ and Co_Rating are based on rating prediction. Until now, nobody has studied a personalized ranking algorithm exploiting both explicit and implicit feedback.
In order to overcome the defects of prior research, this paper proposes a new personalized ranking algorithm (MERR_SVD++) which exploits both explicit and implicit feedback and optimizes Expected Reciprocal Rank (ERR), based on the newest Extended Collaborative Less-Is-More Filtering (xCLiMF) model [18] and the SVD++ algorithm. Experimental results on practical datasets show that our proposed algorithm outperforms existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ grows linearly with the number of ratings. Because of its high precision and good scalability, MERR_SVD++ is suitable for processing big data and has wide application prospects in the field of Internet information recommendation.
The rest of this paper is organized as follows. Section 2 introduces previous related work. Section 3 demonstrates the problem formalization and the SVD++ algorithm. A new personalized ranking algorithm (MERR_SVD++) is proposed in Section 4. The experimental results and discussion are presented in Section 5, followed by the conclusion and future work in Section 6.
2. Related Work
2.1. Rating Prediction. In conventional CF tasks, the most frequently used evaluation metrics are the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). Therefore, rating prediction (such as in the Netflix Prize) has been the most popular method for solving the CF problem. Rating prediction methods are usually regression based: they minimize the error between predicted ratings and true ratings. The simplest algorithm for rating prediction is K-Nearest-Neighbor (KNN) [19], which predicts the missing ratings from the neighborhood of users or items. KNN is a memory-based algorithm, and one needs to compute all the similarities between different users or items. More efficient algorithms are model based: they build a model from the visible ratings and compute all the missing ratings from the model. Widely used model-based rating prediction methods include PLSA [20], the Restricted Boltzmann Machine (RBM) [21], and a series of matrix factorization techniques [22–25].
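As an illustration of the memory-based approach, here is a minimal user-based KNN sketch. The ratings, the cosine similarity over co-rated items, and the helper names are hypothetical; this is not the exact method of [19], only the general neighborhood idea.

```python
import numpy as np

# Hypothetical user-item rating matrix; 0 denotes a missing rating.
R = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [1.0, 0.0, 0.0, 4.0],
])

def cosine_sim(a, b):
    """Cosine similarity restricted to items co-rated by both users."""
    mask = (a > 0) & (b > 0)
    if not mask.any():
        return 0.0
    va, vb = a[mask], b[mask]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb) + 1e-12))

def predict_knn(R, u, i, k=2):
    """Predict R[u, i] as a similarity-weighted mean over the k most
    similar users who actually rated item i."""
    sims = [(cosine_sim(R[u], R[v]), v)
            for v in range(R.shape[0]) if v != u and R[v, i] > 0]
    sims.sort(reverse=True)
    top = sims[:k]
    if not top:
        return 0.0          # nobody rated item i
    num = sum(s * R[v, i] for s, v in top)
    den = sum(abs(s) for s, v in top)
    return num / (den + 1e-12)

pred = predict_knn(R, u=1, i=1)
```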
2.2. Learning to Rank. LTR is a core technology for information retrieval. When a query is input into a search engine, LTR is responsible for ranking all the documents or Web pages according to their relevance to this query or other objectives. Many LTR algorithms have been proposed recently, and they can be classified into three categories: pointwise, listwise, and pairwise [3, 26].
In the pointwise approach, it is assumed that each query-document pair in the training data has a numerical or ordinal score. The LTR problem can then be approximated by a regression problem: given a single query-document pair, its score is predicted.
As the name suggests, the listwise approach takes the entire set of documents associated with a query in the training data as the input to construct a model and predict their scores.
The pairwise approach does not focus on accurately predicting the degree of relevance of each document; instead, it mainly cares about the relative order of two documents. In this sense, it is closer to the concept of "ranking".
2.3. Personalized Ranking Algorithms for Collaborative Filtering. Personalized ranking algorithms can also be divided into two categories: personalized ranking with implicit feedback (PRIF) [6, 7, 10, 11, 13–15] and personalized ranking with explicit feedback (PREF) [18, 27–29]. The foremost PRIF method is Bayesian Personalized Ranking (BPR) [11], which converts the OCCF problem into a ranking problem. Pan et al. [13] proposed Adaptive Bayesian Personalized Ranking (ABPR), which generalized the BPR algorithm for homogeneous implicit feedback and learned the confidence adaptively. Pan et al. [15] proposed Group Bayesian Personalized Ranking (GBPR), which introduces richer interactions among users: GBPR introduces group preference to relax the individual and independence assumptions of BPR. Jahrer and Toscher [6] proposed SVD and AFM, which used a ranking-based objective function constructed by a matrix decomposition model and a stochastic gradient descent optimizer. Takacs and Tikk [7] proposed RankALS, which presented a computationally effective approach for the direct minimization of a ranking objective function without sampling. Gai [10] proposed a new PRIF algorithm, Pairwise Probabilistic Matrix Factorization (PPMF), to further improve the performance of previous PRIF algorithms. Recently, Shi et al. [14] proposed Collaborative Less-is-More Filtering (CLiMF), in which the model parameters are learned by directly maximizing the well-known information retrieval metric Mean Reciprocal Rank (MRR). However, CLiMF is not suitable for other evaluation metrics (e.g., MAP, AUC, and NDCG). References [6, 7, 10, 11, 13–15] can improve the performance of OCCF by solving the data sparsity and imbalance problems of PRIF to a certain extent. As for PREF, [27] was adapted from PLSA and [28] employed the KNN technique of CF, both of which utilized the pairwise ranking method. Reference [29] utilized the listwise method to build its ranker, which was a variation of Maximum Margin Matrix Factorization [22]. Shi et al. [18] proposed the Extended Collaborative Less-Is-More Filtering (xCLiMF) model, which can be seen as a generalization of the CLiMF method. The key idea of the xCLiMF algorithm is that it builds a recommendation model by optimizing Expected Reciprocal Rank, an evaluation metric that generalizes Reciprocal Rank (RR) in order to incorporate users' explicit feedback. References [18, 27–29] can also improve the performance of PREF by solving the data sparsity and imbalance problems to a certain extent.
3. Problem Formalization and SVD++
3.1. Problem Formalization. In this paper we use capital letters to denote a matrix (such as $X$). Given a matrix $X$, $X_{ij}$ represents its element in row $i$ and column $j$, $X_{i\cdot}$ indicates the $i$th row of $X$, $X_{\cdot j}$ symbolizes the $j$th column of $X$, and $X^{T}$ stands for the transpose of $X$.
3.2. SVD++. Given a matrix $X = (X_{ij})_{m\times n}$, the total number of users is $m$ and the total number of items is $n$. If $X$ is an explicit feedback matrix, then $X_{ij} \in \{0, 1, 2, 3, 4, 5\}$ or $X_{ij}$ is unknown. We want to approximate $X$ with a low rank matrix $\hat{X} = (\hat{X}_{ij})_{m\times n}$, where $\hat{X} = U^{T}V$, $U \in C^{d\times m}$, $V \in C^{d\times n}$. $U$ and $V$ denote the explicit feature matrices of rank $d$ for users and items, respectively, with $d \ll k$, where $k$ denotes the rank of $X$, $k \le \min(m, n)$.
If $X$ is an implicit feedback matrix, then $X_{ij} \in \{0, 1\}$ or $X_{ij}$ is unknown. We want to approximate $X$ with a low rank matrix $\hat{X} = (\hat{X}_{ij})_{m\times n}$, where $\hat{X} = H^{T}W$, $H \in C^{d\times m}$, $W \in C^{d\times n}$. $H$ and $W$ denote the implicit feature matrices of rank $d$ for users and items, respectively.
SVD++ is a collaborative filtering algorithm unifying explicit and implicit feedback based on rating prediction and matrix factorization [16]. In SVD++, the matrix $V$ denotes the explicit and implicit feature matrix of items simultaneously. The feature vector of user $u$ is defined as

$$U_{u} + |N(u)|^{-1/2} \sum_{j \in N(u)} W_{j}, \tag{1}$$

where $U_{u}$ is used to characterize the user's explicit feedback, $|N(u)|^{-1/2}\sum_{j\in N(u)} W_{j}$ is used to characterize the user's implicit feedback, $N(u)$ denotes the set of all items on which user $u$ gave implicit feedback, and $W_{j}$ denotes the implicit feature vector of item $j$.

The prediction formula for $\hat{X}_{ui}$ in SVD++ can thus be defined as

$$\hat{X}_{ui} = \Big(U_{u} + |N(u)|^{-1/2} \sum_{j \in N(u)} W_{j}\Big)^{T} V_{i}. \tag{2}$$
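A minimal NumPy sketch of the SVD++ prediction formula (2). The factor sizes, random initialization, and the implicit feedback sets $N(u)$ are hypothetical, chosen only to exercise the formula.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, d = 4, 6, 3                     # users, items, latent dimension (toy sizes)

U = rng.normal(0, 0.01, (d, m))       # explicit user factors U_u
V = rng.normal(0, 0.01, (d, n))       # item factors V_i
W = rng.normal(0, 0.01, (d, n))       # implicit item factors W_j

# Hypothetical implicit feedback sets N(u)
N = {0: [1, 2, 5], 1: [0, 3], 2: [2], 3: [0, 1, 4]}

def predict(u, i):
    """Eq. (2): X_hat[u,i] = (U_u + |N(u)|^(-1/2) * sum_{j in N(u)} W_j)^T V_i."""
    nu = N[u]
    p = U[:, u] + len(nu) ** -0.5 * W[:, nu].sum(axis=1)
    return float(p @ V[:, i])

x = predict(0, 4)
```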
4. Exploiting Explicit and Implicit Feedback for Personalized Ranking
In this section we first introduce our MERR_SVD++ model, then give the learning algorithm of this model, and finally analyze its computational complexity.
4.1. Exploiting Explicit and Implicit Feedback for Personalized Ranking. In practical applications, the user scans the results list from top to bottom and stops when a result is found that fully satisfies the user's information need. The usefulness of an item at rank $i$ is dependent on the usefulness of the items at ranks less than $i$. Reciprocal Rank (RR) is an important evaluation metric in the research field of information retrieval [14], and it strongly emphasizes the relevance of results returned at the top of the list. The ERR measure is a generalized version of RR designed to be used with multiple relevance level data (e.g., ratings). ERR has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list.
Using the definition of ERR in [18, 30], we can describe ERR for a ranked item list of user $u$ as follows:

$$\mathrm{ERR}_{u} = \sum_{i=1}^{n} \frac{P_{ui}}{R_{ui}}, \tag{3}$$
where $R_{ui}$ denotes the rank position of item $i$ for user $u$ when all items are ranked in descending order of the predicted relevance values, and $P_{ui}$ denotes the probability that user $u$ stops at position $i$. As in [30], $P_{ui}$ is defined as follows:

$$P_{ui} = r_{ui} \prod_{j=1}^{n} \big(1 - r_{uj}\, I(R_{uj} < R_{ui})\big), \tag{4}$$
where $I(x)$ is an indicator function equal to 1 if the condition is true and 0 otherwise, and $r_{ui}$ denotes the probability that user $u$ finds item $i$ relevant. Substituting (4) into (3), we obtain the calculation formula of $\mathrm{ERR}_{u}$:

$$\mathrm{ERR}_{u} = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \big(1 - r_{uj}\, I(R_{uj} < R_{ui})\big). \tag{5}$$
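To make (3)-(5) concrete, a small sketch that evaluates the exact $\mathrm{ERR}_{u}$ of a ranked list in its cascade form: the product over $j$ with $R_{uj} < R_{ui}$ is the probability that the user has not stopped before rank $i$. The relevance probabilities are obtained from ratings via the mapping in (6) below; the list of ratings and the helper names are illustrative assumptions.

```python
def relevance(x, x_max=5):
    """Eq. (6): map a rating x to a probability of relevance; 0 for unrated items."""
    return (2 ** x - 1) / 2 ** x_max if x > 0 else 0.0

def err(ratings_in_rank_order, x_max=5):
    """Eq. (5): ERR_u for a list already sorted by predicted relevance."""
    r = [relevance(x, x_max) for x in ratings_in_rank_order]
    total, p_not_stopped = 0.0, 1.0
    for rank, ri in enumerate(r, start=1):
        total += p_not_stopped * ri / rank   # P_ui / R_ui, eq. (3)-(4)
        p_not_stopped *= 1.0 - ri            # cascade: user survives rank `rank`
    return total

# Ranking the 5-rated item first scores higher than ranking it last:
good = err([5, 4, 1])
bad = err([1, 4, 5])
```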
We use a mapping function similar to the one used in [18] to convert ratings (or levels of relevance in general) to probabilities of relevance as follows:

$$r_{ui} = \begin{cases} \dfrac{2^{X_{ui}} - 1}{2^{X_{\max}}}, & I_{ui} > 0, \\ 0, & I_{ui} = 0, \end{cases} \tag{6}$$

where $I_{ui}$ is an indicator function. Note that $I_{ui} > 0$ ($I_{ui} = 0$) indicates that user $u$'s preference for item $i$ is known (unknown), $X_{ui}$ denotes the rating given by user $u$ to item $i$, and $X_{\max}$ is the highest rating.

In this paper we define $R(u)$ as the set of all items on which user $u$ gave explicit feedback. Thus $N(u) = R(u)$ in a dataset where the users only gave explicit feedback, and the implicit feedback dataset is created by setting all the rating data in the explicit feedback dataset to 1; a toy example can be seen in Figure 2. If the dataset contains both explicit feedback data and implicit feedback data, then $N(u) = R(u) + M(u)$, with $M(u)$ denoting the set of all items on which user $u$ only gave implicit feedback; a toy example can be seen in Figure 3. The influence of implicit feedback on the performance of MERR_SVD++ is examined in Section 5.4.2. If we use the traditional SVD++ algorithm to unify explicit and implicit feedback data, the prediction formula for $\hat{X}_{ui}$ in SVD++ can be defined as

$$\hat{X}_{ui} = \Big(U_{u} + |N(u)|^{-1/2} \sum_{j \in N(u)} W_{j}\Big)^{T} V_{i}. \tag{7}$$
So far, through the introduction of SVD++, we can exploit both explicit and implicit feedback simultaneously by optimizing the evaluation metric ERR. We therefore call our model MERR_SVD++.
Note that the value of the rank $R_{ui}$ depends on the value of $\hat{X}_{ui}$. For example, if the predicted relevance value $\hat{X}_{ui}$ of item $i$ is the second highest among all the items for user $u$, then we will have $R_{ui} = 2$.
Similar to other ranking measures such as RR, ERR is also nonsmooth with respect to the latent factors of users and items, that is, $U$, $V$, and $W$. It is thus impossible to optimize ERR directly using conventional optimization techniques. We therefore employ smoothing techniques that were also used in CLiMF [14] and xCLiMF [18] to attain a smoothed version of ERR. In particular, we approximate the rank-based terms $1/R_{ui}$ and $I(R_{uj} < R_{ui})$ in (5) by smooth functions with respect to the model parameters $U$, $V$, and $W$. The approximation formulas are as follows:

$$I(R_{uj} < R_{ui}) \approx g(\hat{X}_{uj} - \hat{X}_{ui}), \qquad \frac{1}{R_{ui}} \approx g(\hat{X}_{ui}), \tag{8}$$

where $g(x)$ is the logistic function, that is, $g(x) = 1/(1 + e^{-x})$.
Substituting (8) into (5), we obtain a smoothed approximation of $\mathrm{ERR}_{u}$:

$$\mathrm{ERR}_{u} = \sum_{i=1}^{n} r_{ui}\, g(\hat{X}_{ui}) \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big). \tag{9}$$

Note that for notational convenience we make use of the substitution $\hat{X}_{u(j-i)} = \hat{X}_{uj} - \hat{X}_{ui}$.
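The smoothed objective (9) can be evaluated directly, unlike the original rank-based ERR. A small sketch with hypothetical relevance probabilities and predicted scores:

```python
import numpy as np

def g(x):
    """Logistic function used in the smoothing (8)."""
    return 1.0 / (1.0 + np.exp(-x))

def smoothed_err(r, x_hat):
    """Eq. (9): smoothed ERR_u. r[i] are relevance probabilities r_ui,
    x_hat[i] are predicted scores X_hat_ui."""
    n = len(r)
    total = 0.0
    for i in range(n):
        prod = 1.0
        for j in range(n):
            prod *= 1.0 - r[j] * g(x_hat[j] - x_hat[i])   # uses X_hat_{u(j-i)}
        total += r[i] * g(x_hat[i]) * prod
    return total

r = np.array([31 / 32, 15 / 32, 1 / 32])   # from the mapping (6), ratings 5, 4, 1
aligned = smoothed_err(r, np.array([2.0, 0.5, -1.0]))    # scores agree with relevance
reversed_ = smoothed_err(r, np.array([-1.0, 0.5, 2.0]))  # scores disagree
```

Scores that rank the relevant items higher yield a larger smoothed ERR, which is what gradient ascent on (9) exploits.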
Given the monotonicity of the logarithm function, the model parameters that maximize (9) are equivalent to the parameters that maximize $\ln((1/|N(u)|)\mathrm{ERR}_{u})$. Specifically, we have

$$\begin{aligned} \{U_{u}, V, W\} &= \arg\max_{U_{u},V,W} \mathrm{ERR}_{u} = \arg\max_{U_{u},V,W} \ln\Big(\frac{1}{|N(u)|}\,\mathrm{ERR}_{u}\Big) \\ &= \arg\max_{U_{u},V,W} \ln\Big(\sum_{i=1}^{n} \frac{r_{ui}\, g(\hat{X}_{ui})}{|N(u)|} \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big). \end{aligned} \tag{10}$$
Figure 2: A toy example of a dataset in which the users only gave explicit feedback: (a) the explicit feedback dataset; (b) the implicit feedback dataset.
Figure 3: A toy example of a dataset that contains both explicit feedback data and implicit feedback data: (a) the explicit feedback dataset; (b) the implicit feedback dataset, where the entries in bold denote the data for which users only gave implicit feedback.
Based on Jensen's inequality and the concavity of the logarithm function, in a similar manner to [14, 18], we derive the lower bound of $\ln((1/|N(u)|)\mathrm{ERR}_{u})$ as follows:

$$\begin{aligned} \ln\Big(\frac{1}{|N(u)|}\,\mathrm{ERR}_{u}\Big) &= \ln\Big(\sum_{i=1}^{n} \frac{r_{ui}\, g(\hat{X}_{ui})}{|N(u)|} \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big) \\ &\ge \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui} \ln\Big(g(\hat{X}_{ui}) \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big) \\ &= \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui} \Big(\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big). \end{aligned} \tag{11}$$
We can neglect the constant $1/|N(u)|$ in the lower bound and obtain a new objective function:

$$L(U_{u}, V, W) = \sum_{i=1}^{n} r_{ui} \Big(\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big). \tag{12}$$
Taking into account all $m$ users and using the Frobenius norm of the latent factors for regularization, we obtain the objective function of MERR_SVD++:

$$L(U, V, W) = \sum_{u=1}^{m} \sum_{i=1}^{n} r_{ui} \Big(\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big) - \frac{\lambda}{2}\big(\|U\|^{2} + \|V\|^{2} + \|W\|^{2}\big), \tag{13}$$
in which $\lambda$ denotes the regularization coefficient and $\|U\|$ denotes the Frobenius norm of $U$. Note that the lower bound $L(U, V, W)$ is much less complex than the original objective function in (9), and standard optimization methods, for example, gradient ascent, can be used to learn the optimal model parameters $U$, $V$, and $W$.
4.2. Optimization. We can now maximize the objective function (13) with respect to the latent factors: $\arg\max_{U,V,W} L(U, V, W)$. Note that $L(U, V, W)$ represents an approximation of the mean value of ERR across all the users; we can thus remove the constant coefficient $1/m$, since it has no influence on the optimization of $L(U, V, W)$. Since the objective function is smooth, we can use gradient ascent for the optimization. The gradients can be derived in a similar manner to xCLiMF [18], as shown in the following:
$$\frac{\partial L}{\partial U_{u}} = \sum_{i=1}^{n} r_{ui} \Big(g(-\hat{X}_{ui})\,V_{i} + \sum_{j=1}^{n} \frac{r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui})\,(V_{i} - V_{j})}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})}\Big) - \lambda U_{u}, \tag{14}$$
$$\frac{\partial L}{\partial V_{i}} = r_{ui} \Big(g(-\hat{X}_{ui}) + \sum_{j=1}^{n} r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui}) \Big(\frac{1}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})} - \frac{1}{1 - r_{ui}\, g(\hat{X}_{ui} - \hat{X}_{uj})}\Big)\Big) \cdot \Big(U_{u} + |N(u)|^{-1/2} \sum_{k \in N(u)} W_{k}\Big) - \lambda V_{i}, \tag{15}$$
$$\frac{\partial L}{\partial W_{i}} = r_{ui} \Big(g(-\hat{X}_{ui})\,V_{i} + \sum_{j=1}^{n} \frac{r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui})\,(V_{i} - V_{j})}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})}\Big) - \lambda W_{i}. \tag{16}$$
The learning algorithm for the MERR_SVD++ model is outlined in Algorithm 1.
The published research papers in [8–11, 13, 18] show that the use of the normal distribution $N(0, 0.01)$ for the initialization of the feature matrices is very effective, so we also use this approach for the initialization of $U$, $V$, and $W$ in our proposed algorithm.
Input: training set $X$; learning rate $\alpha$; regularization coefficient $\lambda$; number of features $d$; maximal number of iterations itermax.
Output: feature matrices $U$, $V$, $W$.

for $u = 1, 2, \ldots, m$ do
    index the relevant items for user $u$: $N(u) = \{j \mid X_{uj} > 0,\ 0 \le j \le n\}$
end
initialize $U^{(0)}$, $V^{(0)}$, and $W^{(0)}$ with random values, and set $t = 0$
repeat
    for $u = 1, 2, \ldots, m$ do
        update $U_{u}$: $U_{u}^{(t+1)} = U_{u}^{(t)} + \alpha\, \partial L / \partial U_{u}^{(t)}$
        for $i \in N(u)$ do
            update $V_{i}$ and $W_{i}$:
            $V_{i}^{(t+1)} = V_{i}^{(t)} + \alpha\, \partial L / \partial V_{i}^{(t)}$
            $W_{i}^{(t+1)} = W_{i}^{(t)} + \alpha\, \partial L / \partial W_{i}^{(t)}$
        end
    end
    $t = t + 1$
until $t \ge$ itermax
$U = U^{(t)}$, $V = V^{(t)}$, $W = W^{(t)}$

Algorithm 1: MERR_SVD++.

4.3. Computational Complexity. Here we first analyze the complexity of the learning process for one iteration. By exploiting the data sparseness in $X$, the computational complexity of the gradient in (14) is $O(d\bar{n}^{2}m + d\bar{n}m)$, where $\bar{n}$ denotes the average number of relevant items across all the users. The computational complexities of the gradients in (15) and (16) are also $O(d\bar{n}^{2}m + d\bar{n}m)$. Hence, the complexity of the learning algorithm in one iteration is in the order of $O(d\bar{n}^{2}m)$. In the case that $\bar{n}$ is a small number, that is, $\bar{n}^{2} \ll m$, the complexity is linear in the number of users in the data collection. Note that we have $\bar{n}m = s$, in which $s$ denotes the number of nonzeros in the user-item matrix; the complexity of the learning algorithm is then $O(d\bar{n}s)$, that is, linear in the number of nonzeros (i.e., relevant observations in the data) even in the case that $\bar{n}$ is large. In sum, our analysis shows that MERR_SVD++ is suitable for large scale use cases. Note that we also empirically verify the complexity of the learning algorithm in Section 5.4.3.
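The learning procedure of Algorithm 1 can be sketched for the special case $N(u) = R(u)$. The following NumPy implementation is illustrative, not the authors' MATLAB code: the toy data, hyperparameter values, and helper names are assumptions, and the per-score gradient `df` is derived from the objective (12) in the same way as (14)-(16).

```python
import numpy as np

def g(x):
    """Logistic function; note that g'(x) = g(x) * g(-x)."""
    return 1.0 / (1.0 + np.exp(-x))

def train_merr_svdpp(X, d=4, lr=0.05, lam=0.001, iters=300, x_max=5, seed=42):
    """Gradient-ascent sketch of Algorithm 1 with N(u) = R(u).
    X is an m x n integer rating matrix; 0 marks unobserved entries."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.normal(0, 0.01, (m, d))
    V = rng.normal(0, 0.01, (n, d))
    W = rng.normal(0, 0.01, (n, d))
    R = (2.0 ** X - 1) / 2 ** x_max          # relevance probabilities, eq. (6)
    for _ in range(iters):
        for u in range(m):
            items = np.flatnonzero(X[u])
            if items.size == 0:
                continue
            c = items.size ** -0.5
            p = U[u] + c * W[items].sum(axis=0)   # SVD++ profile, eq. (1)
            f = V[items] @ p                       # predicted scores X_hat_ui
            r = R[u, items]
            F = f[None, :] - f[:, None]            # F[i, j] = f_j - f_i
            A = g(F) * g(-F)                       # g'(f_j - f_i)
            D = 1.0 - r[None, :] * g(F)            # 1 - r_j * g(f_j - f_i)
            # dL/df_i, the scalar factor shared by the gradients (14)-(16)
            df = r * (g(-f) + (A * r[None, :] / D).sum(axis=1)
                            - (A * r[None, :] / D.T).sum(axis=1))
            dvec = df @ V[items]                   # sum_i (dL/df_i) V_i
            V[items] += lr * (df[:, None] * p[None, :] - lam * V[items])
            U[u] += lr * (dvec - lam * U[u])
            W[items] += lr * (c * dvec - lam * W[items])
    return U, V, W

def scores(U, V, W, X, u):
    items = np.flatnonzero(X[u])
    p = U[u] + items.size ** -0.5 * W[items].sum(axis=0)
    return V @ p

# Toy data: 3 users, 5 items; item 4 is never rated by anyone.
X = np.array([
    [5, 3, 0, 1, 0],
    [4, 0, 2, 5, 0],
    [0, 5, 4, 0, 0],
])
U, V, W = train_merr_svdpp(X)
s0 = scores(U, V, W, X, 0)
```

After training, the scores of a user's relevant items rise well above those of items no one interacted with, which is the qualitative behavior the smoothed ERR objective is designed to produce.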
5. Experiment
5.1. Datasets. We use two datasets for the experiments. The first is the MovieLens 1 million dataset (ML1m) [18], which contains ca. 1M ratings (1–5 scale) from ca. 6K users on 3.7K movies. The sparseness of the ML1m dataset is 95.53%. The second dataset is the Netflix dataset [2, 16, 17], which contains 100,000,000 ratings (1–5 scale) from 480,189 users on 17,770 movies. Due to the huge size of the Netflix data, we extract a subset of 10,000 users and 10,000 movies in which each user has rated more than 100 different movies.
5.2. Evaluation Metrics. Just as has been justified by [17], NDCG is a very suitable evaluation metric for personalized ranking algorithms which combine explicit and implicit feedback. Our proposed MERR_SVD++ algorithm exploits both explicit and implicit feedback simultaneously and optimizes the well-known personalized ranking evaluation metric Expected Reciprocal Rank (ERR). We therefore use NDCG and ERR as the evaluation metrics for the predictive quality of the models in this paper.
NDCG is one of the most widely used measures for ranking problems. To define $\mathrm{NDCG}_{u}$ for a user $u$, one first needs to define $\mathrm{DCG}_{u}$:

$$\mathrm{DCG}_{u} = \sum_{i=1}^{m} \frac{2^{\mathrm{pref}(i)} - 1}{\log(i + 1)}, \tag{17}$$

where $\mathrm{pref}(i)$ is a binary indicator returning 1 if the $i$th item is preferred and 0 otherwise. $\mathrm{DCG}_{u}$ is then normalized into the interval $[0, 1]$ by the DCG of the ideal ranked list:

$$\mathrm{NDCG}_{u} = \frac{\mathrm{DCG}_{u}}{\mathrm{IDCG}_{u}}, \tag{18}$$

where $\mathrm{IDCG}_{u}$ denotes the DCG of the ranked list sorted exactly according to the user's tastes, with the positive items placed at the head of the list. The NDCG of all the users is the mean score over all users.
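A small sketch of (17)-(18) for a single user. The function name and its binary-input convention are illustrative assumptions; log base 2 is used here, and since the same base appears in both $\mathrm{DCG}_{u}$ and $\mathrm{IDCG}_{u}$, the base cancels in the ratio.

```python
import numpy as np

def ndcg_at_k(ranked_pref, k=5):
    """Eq. (17)-(18): NDCG@k for one user; ranked_pref[i] is the binary
    pref(i) indicator of the item at rank i+1."""
    ranked_pref = np.asarray(ranked_pref, dtype=float)[:k]
    discounts = np.log2(np.arange(2, ranked_pref.size + 2))   # log(i + 1)
    dcg = ((2 ** ranked_pref - 1) / discounts).sum()
    ideal = np.sort(ranked_pref)[::-1]                        # preferred items first
    idcg = ((2 ** ideal - 1) / discounts).sum()
    return dcg / idcg if idcg > 0 else 0.0

perfect = ndcg_at_k([1, 1, 0, 0, 0])   # preferred items already at the top
worse = ndcg_at_k([0, 0, 0, 1, 1])     # preferred items pushed to the bottom
```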
ERR is a generalized version of Reciprocal Rank (RR) designed to be used with multiple relevance level data (e.g., ratings). It has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list. Using the definition of ERR in [18], we can define ERR for a ranked item list of user $u$ as follows:

$$\mathrm{ERR}_{u} = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \big(1 - r_{uj}\, I(R_{uj} < R_{ui})\big). \tag{19}$$

Similar to NDCG, the ERR of all the users is the mean score over all users.
Since in recommender systems the user's satisfaction is dominated by only a few items at the top of the recommendation list, our evaluation in the following experiments focuses on the performance of the top-5 recommended items, that is, NDCG@5 and ERR@5.
5.3. Experiment Setup. For each dataset we randomly selected 5 rated items (movies) and 1000 unrated items (movies) for each user to form a test set. We then randomly selected a varying number of rated items from the rest to form a training set. For example, just as in [14, 18], under the condition of "Given 5" we randomly selected 5 rated items (disjoint from the items in the test set) for each user in order to generate a training set. We investigated a variety of "Given" conditions for the training sets, that is, 5, 10, and 15 for the ML1m dataset and 10, 20, and 30 for the extracted Netflix dataset. Generated recommendation lists for each user are compared to the ground truth in the test set in order to measure the performance.
All the models were implemented in MATLAB R2009a. For MERR_SVD++, the value of the regularization parameter $\lambda$ was selected from the range $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 1, 10\}$, and the optimal parameter value was used. The learning rate $\alpha$ was selected from the set $A = \{\alpha \mid \alpha = 0.0001 \times 2^{c},\ \alpha \le 0.5,\ c > 0,\ c\ \text{a constant}\}$, and the optimal parameter value was also used. In order to compare their performances fairly, we set the number of features to 10 for all matrix factorization models. The optimal values of all parameters for all the baseline models were determined individually; more detailed settings of the parameters for all the baselines can be found in the corresponding references. For all the algorithms used in our experiments, we repeated the experiment 5 times for each of the different conditions of each dataset, and the performances reported were averaged across the 5 runs.
5.4. Experiment Results. In this section we present a series of experiments to evaluate MERR_SVD++. We designed the experiments in order to address the following research questions:

(1) Does the proposed MERR_SVD++ outperform state-of-the-art personalized ranking approaches for top-N recommendation?

(2) Does the performance of MERR_SVD++ improve when we only increase the number of implicit feedback data for each user?

(3) Is MERR_SVD++ scalable for large scale use cases?
5.4.1. Performance Comparison. We compare the performance of MERR SVD++ with that of five baseline algorithms. The approaches we compare with are listed below:
(i) Co-Rating [17]: a state-of-the-art CF model that can be trained from explicit and implicit feedback simultaneously.

(ii) SVD++ [16]: the first proposed CF model that combines explicit and implicit feedback.

(iii) xCLiMF [18]: a state-of-the-art PR approach which aims at directly optimizing ERR for top-N recommendation in domains with explicit feedback data (e.g., ratings).

(iv) CofiRank [29]: a PR approach that optimizes the NDCG measure [28] for domains with explicit feedback data (e.g., ratings). The implementation is based on the publicly available software package from the authors.

(v) CLiMF [14]: a state-of-the-art PR approach that optimizes the Mean Reciprocal Rank (MRR) measure [31] for domains with implicit feedback data (e.g., click, follow). We use the explicit feedback datasets by binarizing the rating values with a threshold. On the ML1m dataset and the extracted Netflix dataset we take ratings 4 and 5 (the highest two relevance levels), respectively, as the relevance threshold for top-N recommendation.
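The binarization used for the CLiMF baseline amounts to thresholding the graded ratings; a minimal sketch (the helper name is ours):

```python
def binarize(ratings, threshold):
    """Convert graded ratings (item -> rating) to implicit-style binary
    relevance: ratings at or above the threshold become 1, all others 0."""
    return {item: 1 if r >= threshold else 0 for item, r in ratings.items()}
```

For example, on ML1m ratings 4 and 5 map to 1 and everything else to 0.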
The results of the experiments on the ML1m and the extracted Netflix datasets are shown in Figure 4.

8 Mathematical Problems in Engineering

[Figure 4: The performance comparison of MERR SVD++ and the baselines (CLiMF, CofiRank, xCLiMF, SVD++, Co-Rating, and MERR SVD++) in terms of NDCG@5 and ERR@5 on the ML1m dataset (Given 5/10/15) and the extracted Netflix dataset (Given 10/20/30).]

Rows denote the varieties of "Given" conditions for the training sets based on the ML1m and the extracted Netflix datasets, and columns denote the quality of NDCG and ERR. Figure 4 shows that MERR SVD++ outperforms the baseline approaches in terms of both ERR and NDCG in all of the cases. The results show that the improvement of ERR aligns consistently with the improvement of NDCG, indicating that optimizing ERR would not degrade the utility of recommendations that is captured by the NDCG measure. It can be seen that the relative performance of MERR SVD++ improves as the number of observed ratings from the users increases. This result indicates that MERR SVD++ can learn better top-N recommendation models if more observations of the graded relevance data from users can be used. The results also reveal that it is difficult to model user preferences encoded in multiple levels of relevance with limited observations, in particular when the number of observations is lower than the number of relevance levels.
Compared to Co-Rating, which is based on rating prediction, MERR SVD++ is based on ranking prediction and succeeds in enhancing the top-ranked performance by optimizing ERR. As reported in [17], the performance of SVD++ is slightly weaker than that of Co-Rating, which is because the SVD++ model only attempts to approximate the observed ratings and does not model preferences expressed in implicit feedback. The results also show that Co-Rating and SVD++ significantly outperform xCLiMF and CofiRank, which confirms our belief that implicit feedback could indeed complement explicit feedback. It can be seen in Figure 4 that xCLiMF significantly outperforms CLiMF in terms of both ERR and NDCG across all the settings of relevance thresholds and datasets. The results indicate that the information loss from binarizing multilevel relevance data would inevitably make recommendation models based on binary relevance data, such as CLiMF, suboptimal for the use cases with explicit feedback data.

[Figure 5: The influence of implicit feedback on the performance of MERR SVD++ — ERR@5 and NDCG@5 against the increased numbers of implicit feedback for each user (0 to 50).]
Hence we give a positive answer to our first research question.
5.4.2. The Influence of Implicit Feedback on the Performance of MERR SVD++. The influence of implicit feedback on the performance of MERR SVD++ can be found in Figure 5. Here N(u) = R(u) + M(u), where M(u) denotes the set of all items that user u only gave implicit feedback on. Rows denote the increased numbers of implicit feedback for each user, and columns denote the quality of ERR and NDCG. In our experiment the increased numbers of implicit feedback for each user are the same. We use the extracted Netflix dataset under the condition of "Given 10". Figure 5 shows that the quality of ERR and NDCG of MERR SVD++ synchronously and linearly improves with the increase of implicit feedback for each user, which confirms our belief that implicit feedback could indeed complement explicit feedback.

With this experimental result, we give a positive answer to our second research question.
5.4.3. Scalability. The last experiment investigated the scalability of MERR SVD++ by measuring the training time that was required for the training set at different scales. Firstly, as analyzed in Section 4.3, the computational complexity of MERR SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate the scalability, we used different numbers of users in the training set: under each condition we randomly selected from 10% to 100% of the users in the training set and their rated items as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6.

[Figure 6: Scalability analysis of MERR SVD++ in terms of the number of users in the training set — runtime per iteration (sec) against the percentage of users used for training (10%–100%), under the Given 5, Given 10, and Given 15 conditions.]

We can observe that the computational time under each condition increases almost linearly with the increase of the number of users. Secondly, as also discussed in Section 4.3, the computational complexity of MERR SVD++ could be further approximated to be linear in the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm against different scales of the training sets under different "Given" conditions. The result is shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.
The observations from this experiment allow us to answer our last research question positively.
6. Conclusion and Future Work
The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. Until now, nobody has studied personalized ranking algorithms that exploit both explicit and implicit feedback. In order to overcome the defects of prior research, in this paper we have presented a new personalized ranking algorithm (MERR SVD++) that exploits both explicit and implicit feedback simultaneously. MERR SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the newest xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR SVD++ showed a linear correlation with the number of ratings. Because of its high precision and good expansibility, MERR SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by solving the latency problem of personalized recommendation, and has wide application prospects in the field of internet information recommendation. And because MERR SVD++ exploits both explicit and implicit feedback simultaneously, MERR SVD++ can solve the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.
For future work, we plan to extend our algorithm to richer ones so that our algorithm can solve the grey sheep problem and the cold start problem of personalized recommendation. Also, we would like to explore more useful information from the explicit feedback and implicit feedback simultaneously.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan city (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).
References

[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734–749, 2005.

[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401–410, ACM Press, Sydney, Australia, August 2015.

[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143–152, IEEE, Seattle, Wash, USA, February 2012.

[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002–4008, ACM Press, Buenos Aires, Argentina, July 2015.

[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193–202, ACM, New York, NY, USA, August 2014.

[6] M. Jahrer and A. Toscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153–167, ACM, San Diego, Calif, USA, August 2011.

[7] G. Takacs and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83–90, ACM, Dublin, Ireland, 2012.

[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics, pp. 395–404, Wuhan, China, October 2014.

[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052–3058, ACM Press, Austin, Tex, USA, January 2015.

[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics (SPAC '14), pp. 181–190, Wuhan, China, October 2014.

[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452–461, Morgan Kaufmann, Montreal, Canada, 2009.

[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039–1053, 2015.

[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173–180, 2015.

[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077–3081, ACM, Beijing, China, August 2013.

[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691–2697, Beijing, China, 2013.

[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1–24, 2010.

[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445–1448, ACM, Toronto, Canada, October 2010.

[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431–434, ACM, Hong Kong, October 2013.

[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490–497, IEEE, Washington, DC, USA, July 2011.

[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89–115, 2004.

[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791–798, ACM Press, Corvallis, Ore, USA, June 2007.

[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11–217, British Columbia, Canada, 2004.

[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838–850, 2013.

[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249–258, ACM Press, Sydney, Australia, August 2015.

[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198–1204, ACM Press, Austin, Tex, USA, January 2015.

[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.

[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759–766, ACM Press, Hong Kong, November 2009.

[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '08), pp. 83–90, ACM, Singapore, July 2008.

[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank – maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79–86, Curran Associates, Vancouver, Canada, 2007.

[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621–630, ACM, New York, NY, USA, November 2009.

[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text Retrieval Conference (TREC-8 '99), Gaithersburg, Md, USA, November 1999.
[Figure 1: Examples of an explicit feedback matrix (a), containing graded ratings, and an implicit feedback matrix (b), containing binary entries, for a recommender system.]
on rating prediction [2, 4, 5, 8, 9, 12, 16, 17] and personalized ranking (PR) algorithms based on ranking prediction [3, 6, 7, 10, 11, 13–15, 18]. In collaborative filtering algorithms based on rating prediction, one predicts the actual rating for an item that a customer has not rated yet and then ranks the items according to the predicted ratings. On the other hand, for personalized ranking algorithms based on ranking prediction, one predicts a preference ordering over the yet unrated items without going through the intermediate step of rating prediction. Actually, from the recommendation perspective, the order over the items is more important than their ratings. Therefore, in this paper we focus on personalized ranking algorithms based on ranking prediction.
The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. However, in most real world recommender systems both explicit and implicit user feedback are abundant and could potentially complement each other. It is desirable to be able to unify these two heterogeneous forms of user feedback in order to generate more accurate recommendations. The idea of complementing explicit feedback with implicit feedback was first proposed in [16], where the author considered explicit feedback as how the users rated the movies and implicit feedback as what movies were rated by the users. The two forms of feedback were combined via a factorized neighborhood model (called Singular Value Decomposition++, SVD++), an extension of the traditional nearest item-based model in which the item-item similarity matrix was approximated via low rank factorization. In order to make full use of explicit and implicit feedback, Liu et al. [17] proposed the Co-Rating model, which developed matrix factorization models that could be trained from explicit and implicit feedback simultaneously. Both SVD++ and Co-Rating are based on rating prediction. Until now, nobody has studied personalized ranking algorithms that exploit both explicit and implicit feedback.
In order to overcome the defects of prior research, this paper proposes a new personalized ranking algorithm (MERR SVD++) which exploits both explicit and implicit feedback and optimizes Expected Reciprocal Rank (ERR), based on the newest Extended Collaborative Less-Is-More Filtering (xCLiMF) model [18] and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR SVD++ showed a linear correlation with the number of ratings. Because of its high precision and good expansibility, MERR SVD++ is suitable for processing big data and has wide application prospects in the field of internet information recommendation.
The rest of this paper is organized as follows. Section 2 introduces previous related work. Section 3 demonstrates the problem formalization and the SVD++ algorithm. A new personalized ranking algorithm (MERR SVD++) is proposed in Section 4. The experimental results and discussion are presented in Section 5, followed by the conclusion and future work in Section 6.
2. Related Work
2.1. Rating Prediction. In conventional CF tasks the most frequently used evaluation metrics are the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). Therefore, rating prediction (such as in the Netflix Prize) has been the most popular method for solving the CF problem. Rating prediction methods are always regression based; they minimize the error between predicted ratings and true ratings. The simplest algorithm for rating prediction is K-Nearest-Neighbor (KNN) [19], which predicts the missing ratings from the neighborhood of users or items. KNN is a memory-based algorithm, and one needs to compute all the similarities between different users or items. More efficient algorithms are model based; they build a model from the visible ratings and compute all the missing ratings from the model. Widely used model-based rating prediction methods include PLSA [20], the Restricted Boltzmann Machine (RBM) [21], and a series of matrix factorization techniques [22–25].
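The memory-based KNN idea above can be sketched in a few lines: predict a missing rating as a similarity-weighted average over the most similar items the user has rated. This is an illustrative item-based variant with cosine similarity, not code from [19]; the helper names are ours.

```python
import math

def cosine(a, b):
    """Cosine similarity between two item rating vectors (dicts: user -> rating)."""
    common = set(a) & set(b)
    num = sum(a[u] * b[u] for u in common)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def knn_predict(user, item, item_ratings, k=2):
    """Predict `user`'s rating for `item` as a similarity-weighted average
    over the k most similar items that the user has rated."""
    sims = [(cosine(item_ratings[item], ratings), ratings[user])
            for other, ratings in item_ratings.items()
            if other != item and user in ratings]
    sims.sort(reverse=True)
    top = [(s, r) for s, r in sims[:k] if s > 0]
    if not top:
        return 0.0
    return sum(s * r for s, r in top) / sum(s for s, _ in top)
```

Note that every prediction scans all items, which is why the text calls KNN memory based and model-based methods more efficient.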
2.2. Learning to Rank. LTR is a core technology for information retrieval. When a query is input into a search engine, LTR is responsible for ranking all the documents or Web pages according to their relevance to this query or other objectives. Many LTR algorithms have been proposed recently, and they can be classified into three categories: pointwise, listwise, and pairwise [3, 26].
In the pointwise approach, it is assumed that each query-document pair in the training data has a numerical or ordinal score. Then the LTR problem can be approximated by a regression problem: given a single query-document pair, its score is predicted.

As the name suggests, the listwise approach takes the entire set of documents associated with a query in the training data as the input to construct a model and predict their scores.

The pairwise approach does not focus on accurately predicting the degree of relevance of each document; instead, it mainly cares about the relative order of two documents. In this sense, it is closer to the concept of "ranking".
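The pairwise idea can be made concrete with a standard logistic pairwise loss: for each ordered pair where document i should rank above document j, penalize the model by −ln σ(s_i − s_j). This is a generic illustration of the category, not an algorithm from the cited references.

```python
import math

def pairwise_logistic_loss(scores, pairs):
    """Pairwise LTR objective: `pairs` contains (i, j) meaning doc i should
    rank above doc j; only the relative order of the two scores matters."""
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    return sum(-math.log(sigmoid(scores[i] - scores[j])) for i, j in pairs)
```

A correctly ordered pair yields a small loss, a reversed pair a large one, so minimizing the loss pushes the model toward the right relative order.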
2.3. Personalized Ranking Algorithms for Collaborative Filtering. The algorithms for personalized ranking can also be divided into two categories: personalized ranking with implicit feedback (PRIF) [6, 7, 10, 11, 13–15] and personalized ranking with explicit feedback (PREF) [18, 27–29]. The foremost of PRIF is Bayesian Personalized Ranking (BPR) [11], which converts the OCCF problem into a ranking problem. Pan et al. [13] proposed Adaptive Bayesian Personalized Ranking (ABPR), which generalized the BPR algorithm for heterogeneous implicit feedback and learned the confidence adaptively. Pan and Chen [15] proposed Group Bayesian Personalized Ranking (GBPR) via introducing richer interactions among users. GBPR introduces group preference to relax the individual and independence assumptions of BPR. Jahrer and Toscher [6] proposed SVD and AFM, which used a ranking based objective function constructed by a matrix decomposition model and a stochastic gradient descent optimizer. Takacs and Tikk [7] proposed RankALS, which presented a computationally effective approach for the direct minimization of a ranking objective function without sampling. Gai [10] proposed a new PRIF algorithm (Pairwise Probabilistic Matrix Factorization (PPMF)) to further improve the performance of previous PRIF algorithms. Recently, Shi et al. [14] proposed Collaborative Less-is-More Filtering (CLiMF), in which the model parameters were learned by directly maximizing the well-known information retrieval metric Mean Reciprocal Rank (MRR). However, CLiMF is not suitable for other evaluation metrics (e.g., MAP, AUC, and NDCG). References [6, 7, 10, 11, 13–15] can improve the performance of OCCF by solving the data sparsity and imbalance problems of PRIF to a certain extent. As for PREF, [27] was adapted from PLSA and [28] employed the KNN technique of CF, both of which utilized the pairwise ranking method. Reference [29] utilized the listwise method to build its ranker, which was a variation of Maximum Margin Matrix Factorization [22]. Shi et al. [18] proposed the Extended Collaborative Less-Is-More Filtering (xCLiMF) model, which can be seen as a generalization of the CLiMF method. The key idea of the xCLiMF algorithm is that it builds a recommendation model by optimizing Expected Reciprocal Rank, an evaluation metric that generalizes Reciprocal Rank (RR) in order to incorporate users' explicit feedback. References [18, 27–29] can also improve the performance of PREF by solving the data sparsity and imbalance problems of PREF to a certain extent.
3. Problem Formalization and SVD++
3.1. Problem Formalization. In this paper we use capital letters to denote a matrix (such as X). Given a matrix X, X_ij represents its element, X_i indicates the ith row of X, X_j symbolizes the jth column of X, and X^T stands for the transpose of X.
3.2. SVD++. Given a matrix X = (X_ij)_{m×n}, where the total number of users is m and the total number of items is n: if X is an explicit feedback matrix, then X_ij ∈ {0, 1, 2, 3, 4, 5} or X_ij is unknown. We want to approximate X with a low rank matrix X̂ = (X̂_ij)_{m×n}, where X̂ = U^T V, U ∈ C^{d×m}, V ∈ C^{d×n}. U and V denote the explicit feature matrices of rank d for users and items, respectively; d ≪ k, where k denotes the rank of X and k ≤ min(m, n).

If X is an implicit feedback matrix, then X_ij ∈ {0, 1} or X_ij is unknown. We want to approximate X with a low rank matrix X̂ = (X̂_ij)_{m×n}, where X̂ = H^T W, H ∈ C^{d×m}, W ∈ C^{d×n}. H and W denote the implicit feature matrices of rank d for users and items, respectively.

SVD++ is a collaborative filtering algorithm unifying explicit and implicit feedback, based on rating prediction and matrix factorization [16]. In SVD++, the matrix V denotes the explicit and implicit feature matrix of items simultaneously. The feature vector of user u can be defined as

U_u + |N(u)|^{−1/2} Σ_{j∈N(u)} W_j,    (1)

where U_u is used to characterize the user's explicit feedback, |N(u)|^{−1/2} Σ_{j∈N(u)} W_j is used to characterize the user's implicit feedback, N(u) denotes the set of all items for which user u gave implicit feedback, and W_j denotes the implicit feature vector of item j.

So the prediction formula for X̂_ui in SVD++ can be defined as

X̂_ui = (U_u + |N(u)|^{−1/2} Σ_{j∈N(u)} W_j)^T V_i.    (2)
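The SVD++ prediction of (1)–(2) can be sketched directly: add the normalized sum of the implicit item factors W_j, j ∈ N(u), to the explicit user factor U_u, then take the inner product with the item factor V_i. A minimal pure-Python illustration (vectors as lists; the helper name is ours):

```python
import math

def svdpp_predict(u_vec, v_vec, w_vecs):
    """Eq. (2): (U_u + |N(u)|^{-1/2} * sum_{j in N(u)} W_j)^T V_i.
    u_vec: explicit factor vector U_u of user u;
    v_vec: factor vector V_i of item i;
    w_vecs: implicit factor vectors W_j for the items j in N(u)."""
    d = len(u_vec)
    norm = 1.0 / math.sqrt(len(w_vecs)) if w_vecs else 0.0
    combined = [u_vec[f] + norm * sum(w[f] for w in w_vecs) for f in range(d)]
    return sum(combined[f] * v_vec[f] for f in range(d))
```

With an empty N(u) the formula degenerates to the plain matrix factorization score U_u^T V_i.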
4. Exploiting Explicit and Implicit Feedback for Personalized Ranking

In this section we will firstly introduce our MERR SVD++ model, then give the learning algorithm of this model, and finally analyze its computational complexity.

4.1. Exploiting Explicit and Implicit Feedback for Personalized Ranking. In practical applications the user scans the results list from top to bottom and stops when a result is found that fully satisfies the user's information need. The usefulness of an item at rank i is dependent on the usefulness of the items at ranks less than i. Reciprocal Rank (RR) is an important evaluation metric in the research field of information retrieval [14], and it strongly emphasizes the relevance of results returned at the top of the list. The ERR measure is a generalized version of RR designed to be used with multiple relevance level data (e.g., ratings). ERR has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list.
Using the definition of ERR in [18, 30], we can describe ERR for a ranked item list of user u as follows:

ERR_u = Σ_{i=1}^{n} P_ui / R_ui,    (3)

where R_ui denotes the rank position of item i for user u when all items are ranked in descending order of the predicted relevance values, and P_ui denotes the probability that the user u stops at position i. As in [30], P_ui is defined as follows:

P_ui = r_ui ∏_{j=1}^{n} (1 − r_uj I(R_uj < R_ui)),    (4)

where I(x) is an indicator function equal to 1 if the condition is true and 0 otherwise, and r_ui denotes the probability that user u finds the item i relevant. Substituting (4) into (3), we obtain the calculation formula of ERR_u:

ERR_u = Σ_{i=1}^{n} (r_ui / R_ui) ∏_{j=1}^{n} (1 − r_uj I(R_uj < R_ui)).    (5)
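In its cascade form, (5) can be evaluated in one pass over the list sorted by predicted relevance: walk down the ranks, crediting each position by its relevance probability discounted by the rank and by the probability that the user has not already stopped. A minimal sketch (the function name is ours):

```python
def err(relevance_probs):
    """Eq. (5), cascade form: relevance_probs[k] is the relevance probability
    r_ui of the item shown at rank k+1, in ranked order."""
    err_val, p_not_stopped = 0.0, 1.0
    for rank, r in enumerate(relevance_probs, start=1):
        err_val += p_not_stopped * r / rank   # stop here with prob p_not_stopped * r
        p_not_stopped *= (1.0 - r)            # user continues past this rank
    return err_val
```

For example, a list whose top item is certainly relevant achieves the maximum ERR of 1, regardless of what follows.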
We use a mapping function similar to the one used in[18] to convert ratings (or levels of relevance in general) toprobabilities of relevance as follows
119903
119906119894=
2
119883119906119894minus 1
2
119883max 119868
119906119894gt 0
0 119868
119906119894= 0
(6)
where 119868
119906119894is an indicator function Note that 119868
119906119894gt 0 (119868
119906119894=
0) indicates that user 119906rsquos preference to item 119894 is known(unknown) 119883
119906119894denotes the rating given by user 119906 to item 119894
and119883max is the highest ratingIn this paper we define that 119877(119906) denotes the set of all
items that user 119906 gave explicit feedback so 119873(119906) = 119877(119906) inthe dataset that the users only gave explicit feedback and theimplicit feedback dataset is created by setting all the ratingdata in explicit feedback dataset as 1 A toy example can beseen in Figure 2 If the dataset contains both explicit feedbackdata and implicit feedback data 119873(119906) = 119877(119906) + 119872(119906)119872(119906) denoting the set of all items that user 119906 only gaveimplicit feedback A toy example can be seen in Figure 3The influence of implicit feedback on the performance ofMERR SVD++ can be found in Section 542 If we use thetraditional SVD++ algorithm to unify explicit and implicit
feedback data, the prediction formula of $\hat{X}_{ui}$ in SVD++ can be defined as

$$\hat{X}_{ui} = \left(U_u + |N(u)|^{-1/2} \sum_{j \in N(u)} W_j\right)^{T} V_i. \quad (7)$$
So far, through the introduction of SVD++, we can exploit both explicit and implicit feedback simultaneously by optimizing the evaluation metric ERR. We therefore call our model MERR_SVD++.
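The rating-to-relevance mapping (6) and the SVD++ prediction rule (7) can be sketched as follows. This is a minimal Python/numpy illustration with toy values; the paper's own implementation is in MATLAB, and all variable names and sizes here are illustrative assumptions:

```python
import numpy as np

X_max = 5.0                           # highest rating on a 1-5 scale

def relevance(x_ui):
    """Eq. (6): map a rating X_ui to a probability of relevance r_ui.
    Items without explicit feedback (x_ui == 0) get probability 0."""
    return (2.0 ** x_ui - 1.0) / 2.0 ** X_max if x_ui > 0 else 0.0

# one user's ratings for 4 items (0 = no explicit feedback)
ratings = np.array([5.0, 3.0, 0.0, 1.0])
r = np.array([relevance(x) for x in ratings])

# SVD++ prediction, eq. (7): scores for all items of this user
d, n = 4, 4                           # latent dimensionality, number of items
rng = np.random.default_rng(1)
U_u = rng.normal(0, 0.01, d)          # explicit user factors
V = rng.normal(0, 0.01, (n, d))       # item factors
W = rng.normal(0, 0.01, (n, d))       # implicit-feedback item factors
N_u = np.flatnonzero(ratings > 0)     # N(u): items with feedback

z = U_u + len(N_u) ** -0.5 * W[N_u].sum(axis=0)
X_hat = V @ z                         # predicted relevance scores
```

A rating of 5 maps to $r_{ui} = 31/32 \approx 0.969$, so the mapping strongly favours the highest relevance levels, which is what lets ERR emphasize the top of the list.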
Note that the value of the rank $R_{ui}$ depends on the value of $\hat{X}_{ui}$. For example, if the predicted relevance value $\hat{X}_{ui}$ of item $i$ is the second highest among all the items for user $u$, then we have $R_{ui} = 2$.
Similar to other ranking measures such as RR, ERR is nonsmooth with respect to the latent factors of users and items, that is, $U$, $V$, and $W$. It is thus impossible to optimize ERR directly using conventional optimization techniques. We therefore employ the smoothing techniques that were also used in CLiMF [14] and xCLiMF [18] to attain a smoothed version of ERR. In particular, we approximate the rank-based terms $1/R_{ui}$ and $I(R_{uj} < R_{ui})$ in (5) by smooth functions of the model parameters $U$, $V$, and $W$. The approximation formulas are as follows:

$$I(R_{uj} < R_{ui}) \approx g(\hat{X}_{uj} - \hat{X}_{ui}), \qquad \frac{1}{R_{ui}} \approx g(\hat{X}_{ui}), \quad (8)$$

where $g(x)$ is the logistic function, that is, $g(x) = 1/(1 + e^{-x})$.
Substituting (8) into (5), we obtain a smoothed approximation of $\mathrm{ERR}_u$:

$$\mathrm{ERR}_u = \sum_{i=1}^{n} r_{ui}\, g(\hat{X}_{ui}) \prod_{j=1}^{n} \left(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\right), \quad (9)$$

where, for notational convenience, we make use of the substitution $\hat{X}_{u(j-i)} = \hat{X}_{uj} - \hat{X}_{ui}$.
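The smoothed approximation (9) transcribes directly into code. An illustrative Python sketch, where `r` holds the relevance probabilities of eq. (6) and `x_hat` the predicted scores of eq. (7):

```python
import numpy as np

def g(x):
    # logistic function g(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def smoothed_err(r, x_hat):
    """Smoothed ERR_u of eq. (9): 1/R_ui is replaced by g(X_ui),
    and I(R_uj < R_ui) by g(X_uj - X_ui)."""
    total = 0.0
    for i in range(len(r)):
        prod = 1.0
        for j in range(len(r)):
            prod *= 1.0 - r[j] * g(x_hat[j] - x_hat[i])
        total += r[i] * g(x_hat[i]) * prod
    return total

# toy values: three items, one clearly relevant and highly scored
r = np.array([0.9, 0.4, 0.0])
x_hat = np.array([2.0, 0.5, -1.0])
val = smoothed_err(r, x_hat)
```

Unlike (5), this quantity is differentiable in `x_hat`, and therefore in $U$, $V$, and $W$, which is what makes gradient-based optimization possible.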
Given the monotonicity of the logarithm function, the model parameters that maximize (9) are equivalent to the parameters that maximize $\ln((1/|N(u)|)\,\mathrm{ERR}_u)$. Specifically, we have

$$U_u, V, W = \arg\max_{U_u, V, W} \mathrm{ERR}_u = \arg\max_{U_u, V, W} \ln\!\left(\frac{1}{|N(u)|}\,\mathrm{ERR}_u\right) = \arg\max_{U_u, V, W} \ln\!\left(\sum_{i=1}^{n} \frac{r_{ui}\, g(\hat{X}_{ui})}{|N(u)|} \prod_{j=1}^{n} \left(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\right)\right). \quad (10)$$
Mathematical Problems in Engineering 5
Figure 2: A toy example of the dataset in which the users gave only explicit feedback (user-item matrices). (a) denotes the explicit feedback dataset (ratings on a 1-5 scale); (b) denotes the derived implicit feedback dataset (all observed ratings set to 1).
Figure 3: A toy example of the dataset that contains both explicit feedback data and implicit feedback data (user-item matrices). (a) denotes the explicit feedback dataset; (b) denotes the implicit feedback dataset, with the numbers in bold denoting entries for which the users gave only implicit feedback.
Based on Jensen's inequality and the concavity of the logarithm function, in a similar manner to [14, 18], we derive the lower bound of $\ln((1/|N(u)|)\,\mathrm{ERR}_u)$ as follows:

$$\ln\!\left(\frac{1}{|N(u)|}\,\mathrm{ERR}_u\right) = \ln\!\left(\sum_{i=1}^{n} \frac{r_{ui}\, g(\hat{X}_{ui})}{|N(u)|} \prod_{j=1}^{n} \left(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\right)\right) \ge \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui} \ln\!\left(g(\hat{X}_{ui}) \prod_{j=1}^{n} \left(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\right)\right) = \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui}\!\left(\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\!\left(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\right)\right). \quad (11)$$
We can neglect the constant $1/|N(u)|$ in the lower bound and obtain a new objective function:

$$L(U_u, V, W) = \sum_{i=1}^{n} r_{ui}\!\left(\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\!\left(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\right)\right). \quad (12)$$
Taking into account all $m$ users and using the Frobenius norm of the latent factors for regularization, we obtain the objective function of MERR_SVD++:

$$L(U, V, W) = \sum_{u=1}^{m} \sum_{i=1}^{n} r_{ui}\!\left(\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\!\left(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\right)\right) - \frac{\lambda}{2}\left(\|U\|^{2} + \|V\|^{2} + \|W\|^{2}\right), \quad (13)$$
in which $\lambda$ denotes the regularization coefficient and $\|U\|$ denotes the Frobenius norm of $U$. Note that the lower bound $L(U, V, W)$ is much less complex than the original objective function in (9), and standard optimization methods, for example, gradient ascent, can be used to learn the optimal model parameters $U$, $V$, and $W$.
4.2. Optimization. We can now maximize the objective function (13) with respect to the latent factors: $\arg\max_{U,V,W} L(U, V, W)$. Note that $L(U, V, W)$ represents an approximation of the mean value of ERR across all the users. We can thus remove the constant coefficient $1/m$, since it has no influence on the optimization of $L(U, V, W)$. Since the objective function is smooth, we can use gradient ascent for the optimization. The gradients can be derived in a similar manner to xCLiMF [18], as shown in the following:
$$\frac{\partial L}{\partial U_u} = \sum_{i=1}^{n} r_{ui}\!\left[g(-\hat{X}_{ui})\, V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui})\,(V_i - V_j)}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})}\right] - \lambda U_u, \quad (14)$$
$$\frac{\partial L}{\partial V_i} = r_{ui}\!\left[g(-\hat{X}_{ui}) + \sum_{j=1}^{n} r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui})\left(\frac{1}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})} - \frac{1}{1 - r_{ui}\, g(\hat{X}_{ui} - \hat{X}_{uj})}\right)\right]\left(U_u + |N(u)|^{-1/2} \sum_{k \in N(u)} W_k\right) - \lambda V_i, \quad (15)$$
$$\frac{\partial L}{\partial W_i} = r_{ui}\!\left[g(-\hat{X}_{ui})\, V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui})\,(V_i - V_j)}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})}\right] - \lambda W_i. \quad (16)$$
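For $\lambda = 0$, the gradient (14) can be verified numerically against central finite differences of the per-user objective (12). A small Python sketch with randomly drawn toy factors (illustrative only; the paper's implementation is in MATLAB):

```python
import numpy as np

rng = np.random.default_rng(0)

def g(x):                 # logistic function
    return 1.0 / (1.0 + np.exp(-x))

def g_prime(x):           # derivative of the logistic function
    return g(x) * (1.0 - g(x))

n, d = 6, 4               # toy sizes: items, latent dimensions
U_u = rng.normal(0, 0.01, d)         # user factors
V = rng.normal(0, 0.01, (n, d))      # item factors
W = rng.normal(0, 0.01, (n, d))      # implicit-feedback item factors
r = rng.uniform(0.1, 0.9, n)         # relevance probabilities r_ui, eq. (6)
N = np.arange(n)                     # N(u): here every item has feedback

def scores(U_u):
    z = U_u + len(N) ** -0.5 * W[N].sum(axis=0)   # SVD++ vector, eq. (7)
    return V @ z

def objective(U_u):       # per-user lower bound, eq. (12)
    X = scores(U_u)
    total = 0.0
    for i in range(n):
        inner = np.log(g(X[i]))
        for j in range(n):
            inner += np.log(1.0 - r[j] * g(X[j] - X[i]))
        total += r[i] * inner
    return total

def grad_U(U_u):          # analytic gradient, eq. (14) with lambda = 0
    X = scores(U_u)
    grad = np.zeros(d)
    for i in range(n):
        term = g(-X[i]) * V[i]
        for j in range(n):
            num = r[j] * g_prime(X[j] - X[i]) * (V[i] - V[j])
            term = term + num / (1.0 - r[j] * g(X[j] - X[i]))
        grad += r[i] * term
    return grad

# central finite differences as a reference
eps = 1e-6
num = np.zeros(d)
for k in range(d):
    e = np.zeros(d); e[k] = eps
    num[k] = (objective(U_u + e) - objective(U_u - e)) / (2 * eps)

assert np.allclose(grad_U(U_u), num, atol=1e-6)
```

The check passes because (14) is the exact gradient of (12) (using $g'(x)/g(x) = g(-x)$); the same style of check can be applied to (15) and (16).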
The learning algorithm for the MERR_SVD++ model is outlined in Algorithm 1.

The published research papers in [8-11, 13, 18] show that using the normal distribution $N(0, 0.01)$ for the initialization of the feature matrices is very effective, so we also use this approach for the initialization of $U$, $V$, and $W$ in our proposed algorithm.
4.3. Computational Complexity. Here we first analyze the complexity of the learning process for one iteration. By exploiting the data sparseness in $X$, the computational complexity of the gradient in (14) is $O(d\bar{n}^{2}m + d\bar{n}m)$. Note that $\bar{n}$
Algorithm 1: MERR_SVD++.

Input: training set $X$, learning rate $\alpha$, regularization coefficient $\lambda$, the number of features $d$, and the maximal number of iterations itermax.
Output: feature matrices $U$, $V$, $W$.
for $u = 1, 2, \ldots, m$ do
    Index the relevant items for user $u$: $N(u) = \{j \mid X_{uj} > 0,\ 0 \le j \le n\}$
end
Initialize $U^{(0)}$, $V^{(0)}$, and $W^{(0)}$ with random values, and set $t = 0$.
repeat
    for $u = 1, 2, \ldots, m$ do
        Update $U_u$: $U_u^{(t+1)} = U_u^{(t)} + \alpha\, \partial L / \partial U_u^{(t)}$
        for $i \in N(u)$ do
            Update $V_i$, $W_i$:
            $V_i^{(t+1)} = V_i^{(t)} + \alpha\, \partial L / \partial V_i^{(t)}$
            $W_i^{(t+1)} = W_i^{(t)} + \alpha\, \partial L / \partial W_i^{(t)}$
        end
    end
    $t = t + 1$
until $t \ge$ itermax
$U = U^{(t)}$, $V = V^{(t)}$, $W = W^{(t)}$.
denotes the average number of relevant items across all the users. The computational complexity of the gradient in (16) is also $O(d\bar{n}^{2}m + d\bar{n}m)$, and the computational complexity of the gradient in (15) is $O(d\bar{n}^{2}m + d\bar{n}m)$ as well. Hence, the complexity of the learning algorithm in one iteration is in the order of $O(d\bar{n}^{2}m)$. In the case that $\bar{n}$ is a small number, that is, $\bar{n}^{2} \ll m$, the complexity is linear in the number of users in the data collection. Note that we have $s = \bar{n}m$, in which $s$ denotes the number of nonzeros in the user-item matrix. The complexity of the learning algorithm is then $O(d\bar{n}s)$. Since we usually have $\bar{n} \ll s$, the complexity $O(d\bar{n}s)$ remains linear in the number of nonzeros (i.e., relevant observations in the data) even in the case that $\bar{n}$ is large. In sum, our analysis shows that MERR_SVD++ is suitable for large-scale use cases. Note that we also empirically verify the complexity of the learning algorithm in Section 5.4.3.
5. Experiment

5.1. Datasets. We use two datasets for the experiments. The first is the MovieLens 1 million dataset (ML1m) [18], which contains ca. 1M ratings (1-5 scale) from ca. 6K users on 3.7K movies; the sparseness of the ML1m dataset is 95.53%. The second dataset is the Netflix dataset [2, 16, 17], which contains 100,000,000 ratings (1-5 scale) from 480,189 users on 17,770 movies. Due to the huge size of the Netflix data, we extract a subset of 10,000 users and 10,000 movies, in which each user has rated more than 100 different movies.
5.2. Evaluation Metrics. Just as has been justified by [17], NDCG is a very suitable evaluation metric for personalized ranking algorithms which combine explicit and implicit feedback. Our proposed MERR_SVD++ algorithm exploits both explicit and implicit feedback simultaneously and optimizes the well-known personalized ranking evaluation metric Expected Reciprocal Rank (ERR). We therefore use NDCG and ERR as the evaluation metrics for the predictive ability of the models in this paper.

NDCG is one of the most widely used measures for ranking problems. To define $\mathrm{NDCG}_u$ for a user $u$, one first needs to define $\mathrm{DCG}_u$:

$$\mathrm{DCG}_u = \sum_{i=1}^{m} \frac{2^{\mathrm{pref}(i)} - 1}{\log(i + 1)}, \quad (17)$$

where $\mathrm{pref}(i)$ is a binary indicator returning 1 if the $i$th item is preferred and 0 otherwise. $\mathrm{DCG}_u$ is then normalized by the ideal ranked list into the interval $[0, 1]$:

$$\mathrm{NDCG}_u = \frac{\mathrm{DCG}_u}{\mathrm{IDCG}_u}, \quad (18)$$

where $\mathrm{IDCG}_u$ denotes the DCG of the ranked list sorted exactly according to the user's tastes, with positive items placed at the head of the ranked list. The NDCG over all the users is the mean of the per-user scores.
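Equations (17) and (18) translate directly into code. An illustrative Python sketch; the paper leaves the logarithm base unspecified, so base 2, the common choice, is assumed here:

```python
import numpy as np

def dcg(prefs):
    """Eq. (17): prefs[i] is 1 if the item at rank i+1 is preferred, else 0."""
    ranks = np.arange(1, len(prefs) + 1)
    return np.sum((2.0 ** np.asarray(prefs) - 1.0) / np.log2(ranks + 1))

def ndcg(prefs):
    """Eq. (18): DCG normalized by the DCG of the ideal ordering."""
    ideal = sorted(prefs, reverse=True)
    idcg = dcg(ideal)
    return dcg(prefs) / idcg if idcg > 0 else 0.0

# preferred item ranked first -> perfect score of 1
assert ndcg([1, 0, 0]) == 1.0
# preferred item ranked last -> score below 1
assert ndcg([0, 0, 1]) < 1.0
```

Truncating `prefs` to the first 5 positions gives the NDCG@5 variant used in the experiments below.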
ERR is a generalized version of Reciprocal Rank (RR) designed to be used with multiple-relevance-level data (e.g., ratings). It has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list. Using the definition of ERR in [18], we can define ERR for a ranked item list of user $u$ as follows:

$$\mathrm{ERR}_u = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \left(1 - r_{uj}\, I(R_{uj} < R_{ui})\right). \quad (19)$$

Similar to NDCG, the ERR over all the users is the mean of the per-user scores.
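The exact ERR of (19) can be computed directly from the relevance probabilities and the predicted ranks (an illustrative sketch):

```python
def err(r, ranks):
    """Eq. (19): r[i] is the relevance probability of item i and
    ranks[i] its predicted rank (1 = top of the list)."""
    n = len(r)
    total = 0.0
    for i in range(n):
        prod = 1.0
        for j in range(n):
            if ranks[j] < ranks[i]:   # I(R_uj < R_ui) = 1
                prod *= 1.0 - r[j]
        total += r[i] / ranks[i] * prod
    return total

# two items with r = 0.5, at ranks 1 and 2:
# ERR = 0.5/1 + (0.5/2) * (1 - 0.5) = 0.625
assert abs(err([0.5, 0.5], [1, 2]) - 0.625) < 1e-12
```

The cascade product shrinks the contribution of lower-ranked items whenever more relevant items precede them, which is the "satisfied user stops early" behaviour ERR models.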
Since in recommender systems the user's satisfaction is dominated by only a few items at the top of the recommendation list, our evaluation in the following experiments focuses on the performance of the top-5 recommended items, that is, NDCG@5 and ERR@5.
5.3. Experiment Setup. For each dataset, we randomly selected 5 rated items (movies) and 1000 unrated items (movies) for each user to form a test set. We then randomly selected a varying number of rated items from the rest to form a training set. For example, just as in [14, 18], under the condition of "Given 5" we randomly selected 5 rated items (disjoint from the items in the test set) for each user in order to generate a training set. We investigated a variety of "Given" conditions for the training sets, that is, 5, 10, and 15 for the ML1m dataset, and 10, 20, and 30 for the extracted Netflix dataset. The generated recommendation lists for each user are compared to the ground truth in the test set in order to measure the performance.
All the models were implemented in MATLAB R2009a. For MERR_SVD++, the value of the regularization parameter $\lambda$ was selected from the range $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 1, 10\}$, and the optimal parameter value was used. The learning rate $\alpha$ was selected from the set $A = \{\alpha \mid \alpha = 0.0001 \times 2^{c},\ \alpha \le 0.5,\ c > 0,\ c \text{ is a constant}\}$, and the optimal parameter value was also used. In order to compare their performances fairly, for all matrix factorization models we set the number of features to 10. The optimal values of all parameters for all the baseline models were determined individually; more detailed parameter settings for all the baselines can be found in the corresponding references. For all the algorithms used in our experiments, we repeated the experiment 5 times for each of the different conditions of each dataset, and the performances reported were averaged across the 5 runs.
5.4. Experiment Results. In this section we present a series of experiments to evaluate MERR_SVD++. We designed the experiments in order to address the following research questions:

(1) Does the proposed MERR_SVD++ outperform state-of-the-art personalized ranking approaches for top-N recommendation?

(2) Does the performance of MERR_SVD++ improve when we only increase the number of implicit feedback data for each user?

(3) Is MERR_SVD++ scalable for large-scale use cases?
5.4.1. Performance Comparison. We compare the performance of MERR_SVD++ with that of five baseline algorithms. The approaches we compare with are listed below:

(i) Co-Rating [17]: a state-of-the-art CF model that can be trained from explicit and implicit feedback simultaneously.

(ii) SVD++ [16]: the first proposed CF model that combines explicit and implicit feedback.

(iii) xCLiMF [18]: a state-of-the-art PR approach which aims at directly optimizing ERR for top-N recommendation in domains with explicit feedback data (e.g., ratings).

(iv) CofiRank [29]: a PR approach that optimizes the NDCG measure [28] for domains with explicit feedback data (e.g., ratings). The implementation is based on the publicly available software package from the authors.

(v) CLiMF [14]: a state-of-the-art PR approach that optimizes the Mean Reciprocal Rank (MRR) measure [31] for domains with implicit feedback data (e.g., click, follow). We use the explicit feedback datasets by binarizing the rating values with a threshold: on the ML1m dataset and the extracted Netflix dataset we take ratings 4 and 5 (the highest two relevance levels), respectively, as the relevance threshold for top-N recommendation.
The results of the experiments on the ML1m and the extracted Netflix datasets are shown in Figure 4. Rows denote
Figure 4: The performance comparison of MERR_SVD++ and baselines. (a) NDCG@5 on the ML1m dataset; (b) ERR@5 on the ML1m dataset; (c) NDCG@5 on the extracted Netflix dataset; (d) ERR@5 on the extracted Netflix dataset. Each panel compares CLiMF, CofiRank, xCLiMF, SVD++, Co_Rating, and MERR_SVD++ across the "Given" conditions for the training sets.
the varieties of "Given" conditions for the training sets based on the ML1m and the extracted Netflix datasets, and columns denote the quality of NDCG and ERR. Figure 4 shows that MERR_SVD++ outperforms the baseline approaches in terms of both ERR and NDCG in all of the cases. The results show that the improvement in ERR aligns consistently with the improvement in NDCG, indicating that optimizing ERR does not degrade the utility of recommendations as captured by the NDCG measure. It can be seen that the relative performance of MERR_SVD++ improves as the number of observed ratings from the users increases. This result indicates that MERR_SVD++ can learn better top-N recommendation models if more observations of the graded relevance data from users can be used. The results also reveal that it is difficult to model user preferences encoded in multiple levels of relevance with limited observations, in particular when the number of observations is lower than the number of relevance levels.
Compared to Co-Rating, which is based on rating prediction, MERR_SVD++ is based on ranking prediction and succeeds in enhancing the top-ranked performance by optimizing ERR. As reported in [17], the performance of SVD++ is slightly weaker than that of Co-Rating, because the SVD++ model only attempts to approximate the observed ratings and does not model preferences expressed in implicit feedback. The results also show that Co-Rating and SVD++ significantly outperform xCLiMF and CofiRank, which confirms our belief that implicit feedback can indeed complement explicit feedback. It can be seen in Figure 4 that xCLiMF significantly outperforms CLiMF in terms of
Figure 5: The influence of implicit feedback on the performance of MERR_SVD++ (ERR@5 and NDCG@5 versus the increased number of implicit feedback items per user).
both ERR and NDCG across all the settings of relevance thresholds and datasets. The results indicate that the information loss from binarizing multilevel relevance data would inevitably make recommendation models based on binary relevance data, such as CLiMF, suboptimal for use cases with explicit feedback data.
Hence, we give a positive answer to our first research question.
5.4.2. The Influence of Implicit Feedback on the Performance of MERR_SVD++. The influence of implicit feedback on the performance of MERR_SVD++ can be seen in Figure 5. Here $N(u) = R(u) + M(u)$, where $M(u)$ denotes the set of all items for which user $u$ gave only implicit feedback. Rows denote the increased numbers of implicit feedback items for each user, and columns denote the quality of ERR and NDCG. In our experiment, the increased numbers of implicit feedback items are the same for each user, and we use the extracted Netflix dataset under the condition of "Given 10". Figure 5 shows that the ERR and NDCG quality of MERR_SVD++ improves synchronously and linearly with the increase of implicit feedback for each user, which confirms our belief that implicit feedback can indeed complement explicit feedback.
With this experimental result, we give a positive answer to our second research question.
5.4.3. Scalability. The last experiment investigated the scalability of MERR_SVD++ by measuring the training time required for training sets at different scales. Firstly, as analyzed in Section 4.3, the computational complexity of MERR_SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate the scalability, we used different numbers of users in the training set: under each condition, we randomly selected from 10% to 100% of the users in the training set, together with their rated items, as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6. We can observe that the computational time under
Figure 6: Scalability analysis of MERR_SVD++ in terms of the number of users in the training set (runtime per iteration in seconds versus the percentage of users used for training, under the "Given 5", "Given 10", and "Given 15" conditions).
each condition increases almost linearly with the number of users. Secondly, as also discussed in Section 4.3, the computational complexity of MERR_SVD++ can be further approximated as linear in the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm against different scales of the training sets under different "Given" conditions. The result is shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.
The observations from this experiment allow us to answer our last research question positively.
6. Conclusion and Future Work
The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data, rather than making full use of the information in the dataset; until now, nobody had studied a personalized ranking algorithm that exploits both explicit and implicit feedback. In order to overcome the defects of prior research, in this paper we have presented a new personalized ranking algorithm (MERR_SVD++) that exploits both explicit and implicit feedback simultaneously. MERR_SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the newest xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ shows a linear correlation with the number of ratings. Because of its high precision and good extensibility, MERR_SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by alleviating the latency problem of personalized recommendation, and has wide application prospects in the
field of internet information recommendation. And because MERR_SVD++ exploits both explicit and implicit feedback simultaneously, it can alleviate the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.
For future work, we plan to extend our algorithm to richer models so that it can address the grey sheep and cold start problems of personalized recommendation. We would also like to explore more useful information from the explicit feedback and implicit feedback simultaneously.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan city (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).
References
[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734-749, 2005.
[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401-410, ACM Press, Sydney, Australia, August 2015.
[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143-152, IEEE, Seattle, Wash, USA, February 2012.
[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002-4008, ACM Press, Buenos Aires, Argentina, July 2015.
[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193-202, ACM, New York, NY, USA, August 2014.
[6] M. Jahrer and A. Toscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153-167, ACM, San Diego, Calif, USA, August 2011.
[7] G. Takacs and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83-90, ACM, Dublin, Ireland, 2012.
[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics, pp. 395-404, Wuhan, China, October 2014.
[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052-3058, ACM Press, Austin, Tex, USA, January 2015.
[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics (SPAC '14), pp. 181-190, Wuhan, China, October 2014.
[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452-461, Morgan Kaufmann, Montreal, Canada, 2009.
[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039-1053, 2015.
[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173-180, 2015.
[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077-3081, ACM, Beijing, China, August 2013.
[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691-2697, Beijing, China, 2013.
[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1-24, 2010.
[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445-1448, ACM, Toronto, Canada, October 2010.
[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431-434, ACM, Hong Kong, October 2013.
[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490-497, IEEE, Washington, DC, USA, July 2011.
[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89-115, 2004.
[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791-798, ACM Press, Corvallis, Ore, USA, June 2007.
[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11-217, British Columbia, Canada, 2004.
[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838-850, 2013.
[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249-258, ACM Press, Sydney, Australia, August 2015.
[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198-1204, ACM Press, Austin, Tex, USA, January 2015.
[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.
[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759-766, ACM Press, Hong Kong, November 2009.
[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '08), pp. 83-90, ACM, Singapore, July 2008.
[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank: maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79-86, Curran Associates, Vancouver, Canada, 2007.
[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621-630, ACM, New York, NY, USA, November 2009.
[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text Retrieval Conference (TREC-8 '99), Gaithersburg, Md, USA, November 1999.
Mathematical Problems in Engineering 3
2.2. Learning to Rank. LTR is the core technology for information retrieval. When a query is input into a search engine, LTR is responsible for ranking all the documents or Web pages according to their relevance to this query or other objectives. Many LTR algorithms have been proposed recently, and they can be classified into three categories: pointwise, listwise, and pairwise [3, 26].
In the pointwise approach, it is assumed that each query-document pair in the training data has a numerical or ordinal score. The LTR problem can then be approximated by a regression problem: given a single query-document pair, its score is predicted.
As the name suggests, the listwise approach takes the entire set of documents associated with a query in the training data as the input to construct a model and predict their scores.
The pairwise approach does not focus on accurately predicting the degree of relevance of each document; instead, it mainly cares about the relative order of two documents. In this sense, it is closer to the concept of "ranking".
2.3. Personalized Ranking Algorithms for Collaborative Filtering. The algorithms for personalized ranking can be divided into two categories: personalized ranking with implicit feedback (PRIF) [6, 7, 10, 11, 13-15] and personalized ranking with explicit feedback (PREF) [18, 27-29]. The foremost PRIF method is Bayesian Personalized Ranking (BPR) [11], which converts the OCCF problem into a ranking problem. Pan et al. [13] proposed Adaptive Bayesian Personalized Ranking (ABPR), which generalized the BPR algorithm for homogeneous implicit feedback and learned the confidence adaptively. Pan et al. [15] proposed Group Bayesian Personalized Ranking (GBPR) by introducing richer interactions among users; GBPR introduces group preference to relax the individual and independence assumptions of BPR. Jahrer and Toscher [6] proposed SVD and AFM, which used a ranking based objective function constructed by a matrix decomposition model and a stochastic gradient descent optimizer. Takacs and Tikk [7] proposed RankALS, which presented a computationally effective approach for the direct minimization of a ranking objective function without sampling. Gai [10] proposed a new PRIF algorithm, Pairwise Probabilistic Matrix Factorization (PPMF), to further improve the performance of previous PRIF algorithms. Recently, Shi et al. [14] proposed Collaborative Less-is-More Filtering (CLiMF), in which the model parameters are learned by directly maximizing the well-known information retrieval metric Mean Reciprocal Rank (MRR). However, CLiMF is not suitable for other evaluation metrics (e.g., MAP, AUC, and NDCG). References [6, 7, 10, 11, 13-15] can improve the performance of OCCF by alleviating the data sparsity and imbalance problems of PRIF to a certain extent. As for PREF, [27] was adapted from PLSA and [28] employed the KNN technique of CF, both of which utilized the pairwise ranking method. Reference [29] utilized the listwise method to build its ranker, which was a variation of Maximum Margin Factorization [22]. Shi et al. [18] proposed the Extended Collaborative Less-Is-More Filtering (xCLiMF) model, which can be seen as a generalization of the CLiMF method. The key idea of the xCLiMF algorithm is that it builds a recommendation model by optimizing Expected Reciprocal Rank (ERR), an evaluation metric that generalizes Reciprocal Rank (RR) in order to incorporate users' explicit feedback. References [18, 27-29] can also improve the performance of PREF by alleviating the data sparsity and imbalance problems of PREF to a certain extent.
3. Problem Formalization and SVD++
3.1. Problem Formalization. In this paper, we use capital letters to denote a matrix (such as $X$). Given a matrix $X$, $X_{ij}$ represents its element, $X_{i\cdot}$ indicates the $i$th row of $X$, $X_{\cdot j}$ symbolizes the $j$th column of $X$, and $X^{T}$ stands for the transpose of $X$.
3.2. SVD++. Given a matrix $X = (X_{ij})_{m\times n}$, the total number of users is $m$ and the total number of items is $n$. If $X$ is an explicit feedback matrix, then $X_{ij} \in \{0,1,2,3,4,5\}$ or $X_{ij}$ is unknown. We want to approximate $X$ with a low rank matrix $\hat{X} = (\hat{X}_{ij})_{m\times n}$, where $\hat{X} = U^{T}V$, $U \in C^{d\times m}$, $V \in C^{d\times n}$; $U$ and $V$ denote the explicit feature matrices of rank $d$ for users and items, respectively, with $d \ll k$, where $k$ denotes the rank of $X$ and $k \le \min(m,n)$.
If $X$ is an implicit feedback matrix, then $X_{ij} \in \{0,1\}$ or $X_{ij}$ is unknown. We want to approximate $X$ with a low rank matrix $\hat{X} = (\hat{X}_{ij})_{m\times n}$, where $\hat{X} = H^{T}W$, $H \in C^{d\times m}$, $W \in C^{d\times n}$; $H$ and $W$ denote the implicit feature matrices of rank $d$ for users and items, respectively.
SVD++ is a collaborative filtering algorithm unifying explicit and implicit feedback, based on rating prediction and matrix factorization [16]. In SVD++, the matrix $V$ denotes the explicit and implicit feature matrix of items simultaneously. The feature matrix of users can be defined as
$$U_{u} + |N(u)|^{-1/2}\sum_{j\in N(u)} W_{j},\qquad(1)$$
where $U_{u}$ is used to characterize the user's explicit feedback, $|N(u)|^{-1/2}\sum_{j\in N(u)} W_{j}$ is used to characterize the user's implicit feedback, $N(u)$ denotes the set of all items on which user $u$ gave implicit feedback, and $W_{j}$ denotes the implicit feature vector of item $j$.

So the prediction formula for $\hat{X}_{ui}$ in SVD++ can be defined as
$$\hat{X}_{ui} = \left(U_{u} + |N(u)|^{-1/2}\sum_{j\in N(u)} W_{j}\right)^{T} V_{i}.\qquad(2)$$
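For concreteness, the SVD++ prediction in (1)-(2) can be sketched in a few lines. This is a minimal Python illustration with made-up toy dimensions and values, not the authors' implementation:

```python
import numpy as np

def svdpp_predict(U_u, V_i, W, N_u):
    # Eq. (2): X_hat_ui = (U_u + |N(u)|^(-1/2) * sum_{j in N(u)} W_j)^T V_i
    # U_u: (d,) explicit user factors; V_i: (d,) item factors;
    # W: (n, d) implicit item factors; N_u: items with implicit feedback from u.
    implicit = W[N_u].sum(axis=0) / np.sqrt(len(N_u))
    return float((U_u + implicit) @ V_i)

# toy example with d = 2 latent features and two implicitly seen items
U_u = np.array([0.1, 0.2])
V_i = np.array([0.3, -0.1])
W = np.array([[0.05, 0.0], [0.0, 0.05]])
print(round(svdpp_predict(U_u, V_i, W, [0, 1]), 6))  # prints 0.017071
```

The implicit part shifts the user's explicit factor vector toward the items the user has interacted with, scaled by $|N(u)|^{-1/2}$ so that heavy users are not overweighted.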
4. Exploiting Explicit and Implicit Feedback for Personalized Ranking
In this section, we will first introduce our MERR_SVD++ model, then give the learning algorithm of this model, and finally analyze its computational complexity.
4.1. Exploiting Explicit and Implicit Feedback for Personalized Ranking. In practical applications, the user scans the results
list from top to bottom and stops when a result is found that fully satisfies the user's information need. The usefulness of an item at rank $i$ is dependent on the usefulness of the items at ranks less than $i$. Reciprocal Rank (RR) is an important evaluation metric in the research field of information retrieval [14], and it strongly emphasizes the relevance of results returned at the top of the list. The ERR measure is a generalized version of RR, designed to be used with multiple relevance level data (e.g., ratings). ERR has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list.
Using the definition of ERR in [18, 30], we can describe ERR for a ranked item list of user $u$ as follows:
$$\mathrm{ERR}_{u} = \sum_{i=1}^{n}\frac{P_{ui}}{R_{ui}},\qquad(3)$$
where $R_{ui}$ denotes the rank position of item $i$ for user $u$ when all items are ranked in descending order of the predicted relevance values, and $P_{ui}$ denotes the probability that user $u$ stops at position $i$. As in [30], $P_{ui}$ is defined as follows:
$$P_{ui} = r_{ui}\prod_{j=1}^{n}\left(1 - r_{uj}\,I\left(R_{uj} < R_{ui}\right)\right),\qquad(4)$$
where $I(x)$ is an indicator function equal to 1 if the condition is true and 0 otherwise, and $r_{ui}$ denotes the probability that user $u$ finds item $i$ relevant. Substituting (4) into (3), we obtain the calculation formula of $\mathrm{ERR}_{u}$:
$$\mathrm{ERR}_{u} = \sum_{i=1}^{n}\frac{r_{ui}}{R_{ui}}\prod_{j=1}^{n}\left(1 - r_{uj}\,I\left(R_{uj} < R_{ui}\right)\right).\qquad(5)$$
We use a mapping function similar to the one used in [18] to convert ratings (or levels of relevance in general) to probabilities of relevance, as follows:
$$r_{ui} = \begin{cases}\dfrac{2^{X_{ui}} - 1}{2^{X_{\max}}}, & I_{ui} > 0,\\[4pt] 0, & I_{ui} = 0,\end{cases}\qquad(6)$$
where $I_{ui}$ is an indicator function. Note that $I_{ui} > 0$ ($I_{ui} = 0$) indicates that user $u$'s preference for item $i$ is known (unknown), $X_{ui}$ denotes the rating given by user $u$ to item $i$, and $X_{\max}$ is the highest rating.

In this paper, we define $R(u)$ as the set of all items on which user $u$ gave explicit feedback, so $N(u) = R(u)$ in a dataset where the users only gave explicit feedback; the implicit feedback dataset is then created by setting all the rating data in the explicit feedback dataset to 1. A toy example can be seen in Figure 2. If the dataset contains both explicit feedback data and implicit feedback data, $N(u) = R(u) + M(u)$, with $M(u)$ denoting the set of all items on which user $u$ only gave implicit feedback. A toy example can be seen in Figure 3. The influence of implicit feedback on the performance of MERR_SVD++ can be found in Section 5.4.2. If we use the traditional SVD++ algorithm to unify explicit and implicit feedback data, the prediction formula for $\hat{X}_{ui}$ in SVD++ can be defined as
$$\hat{X}_{ui} = \left(U_{u} + |N(u)|^{-1/2}\sum_{j\in N(u)} W_{j}\right)^{T} V_{i}.\qquad(7)$$
So far, through the introduction of SVD++, we can exploit both explicit and implicit feedback simultaneously while optimizing the evaluation metric ERR; hence we call our model MERR_SVD++.
Note that the value of the rank $R_{ui}$ depends on the value of $\hat{X}_{ui}$. For example, if the predicted relevance value $\hat{X}_{ui}$ of item $i$ is the second highest among all the items for user $u$, then we will have $R_{ui} = 2$.
Similar to other ranking measures, such as RR, ERR is also nonsmooth with respect to the latent factors of users and items, that is, $U$, $V$, and $W$. It is thus impossible to optimize ERR directly using conventional optimization techniques. We therefore employ the smoothing techniques that were also used in CLiMF [14] and xCLiMF [18] to attain a smoothed version of ERR. In particular, we approximate the rank-based terms $1/R_{ui}$ and $I(R_{uj} < R_{ui})$ in (5) by smooth functions of the model parameters $U$, $V$, and $W$. The approximation formulas are as follows:
$$I\left(R_{uj} < R_{ui}\right) \approx g\left(\hat{X}_{uj} - \hat{X}_{ui}\right),\qquad \frac{1}{R_{ui}} \approx g\left(\hat{X}_{ui}\right),\qquad(8)$$
where $g(x)$ is the logistic function, that is, $g(x) = 1/(1 + e^{-x})$.
Substituting (8) into (5), we obtain a smoothed approximation of $\mathrm{ERR}_{u}$:
$$\mathrm{ERR}_{u} = \sum_{i=1}^{n} r_{ui}\,g\left(\hat{X}_{ui}\right)\prod_{j=1}^{n}\left(1 - r_{uj}\,g\left(\hat{X}_{u(j-i)}\right)\right).\qquad(9)$$
Note that, for notational convenience, we make use of the substitution $\hat{X}_{u(j-i)} = \hat{X}_{uj} - \hat{X}_{ui}$.
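The smoothed objective (9) is straightforward to evaluate directly. A small Python sketch follows, with made-up relevance probabilities and predicted scores; it implements the formula exactly as written, with the product running over all $j$:

```python
import numpy as np

def g(x):
    # logistic function from Eq. (8)
    return 1.0 / (1.0 + np.exp(-x))

def smoothed_err(r, x_hat):
    # Eq. (9): r[i] are relevance probabilities r_ui from Eq. (6),
    # x_hat[i] are predicted relevance values X_hat_ui.
    n = len(r)
    total = 0.0
    for i in range(n):
        prod = 1.0
        for j in range(n):
            prod *= 1.0 - r[j] * g(x_hat[j] - x_hat[i])
        total += r[i] * g(x_hat[i]) * prod
    return total

# two items: one very relevant and ranked high, one weakly relevant
print(smoothed_err([0.9, 0.2], [2.0, -1.0]))  # approx 0.44, dominated by the first item
```

Unlike (5), this expression is differentiable in the scores, which is what makes the gradient ascent of Section 4.2 possible.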
Given the monotonicity of the logarithm function, the model parameters that maximize (9) are equivalent to the parameters that maximize $\ln((1/|N(u)|)\mathrm{ERR}_{u})$. Specifically, we have
$$\begin{aligned}U_{u}, V, W &= \arg\max_{U_{u},V,W}\ \mathrm{ERR}_{u} = \arg\max_{U_{u},V,W}\ \ln\left(\frac{1}{|N(u)|}\mathrm{ERR}_{u}\right)\\ &= \arg\max_{U_{u},V,W}\ \ln\left(\sum_{i=1}^{n}\frac{r_{ui}\,g\left(\hat{X}_{ui}\right)}{|N(u)|}\prod_{j=1}^{n}\left(1 - r_{uj}\,g\left(\hat{X}_{u(j-i)}\right)\right)\right).\end{aligned}\qquad(10)$$
Figure 2: A toy example of a dataset in which the users only gave explicit feedback (user-item rating matrices): (a) denotes the explicit feedback dataset (ratings on a 1-5 scale); (b) denotes the implicit feedback dataset (all observed ratings set to 1).
Figure 3: A toy example of a dataset that contains both explicit feedback data and implicit feedback data (user-item matrices): (a) denotes the explicit feedback dataset; (b) denotes the implicit feedback dataset, and the numbers in bold denote the data for which users only gave implicit feedback.
Based on Jensen's inequality and the concavity of the logarithm function, in a similar manner to [14, 18], we derive the lower bound of $\ln((1/|N(u)|)\mathrm{ERR}_{u})$ as follows:
$$\begin{aligned}\ln\left(\frac{1}{|N(u)|}\mathrm{ERR}_{u}\right) &= \ln\left[\sum_{i=1}^{n}\frac{r_{ui}\,g\left(\hat{X}_{ui}\right)}{|N(u)|}\prod_{j=1}^{n}\left(1 - r_{uj}\,g\left(\hat{X}_{u(j-i)}\right)\right)\right]\\ &\ge \frac{1}{|N(u)|}\sum_{i=1}^{n} r_{ui}\ln\left[g\left(\hat{X}_{ui}\right)\prod_{j=1}^{n}\left(1 - r_{uj}\,g\left(\hat{X}_{u(j-i)}\right)\right)\right]\\ &= \frac{1}{|N(u)|}\sum_{i=1}^{n} r_{ui}\left[\ln g\left(\hat{X}_{ui}\right) + \sum_{j=1}^{n}\ln\left(1 - r_{uj}\,g\left(\hat{X}_{u(j-i)}\right)\right)\right].\end{aligned}\qquad(11)$$
We can neglect the constant $1/|N(u)|$ in the lower bound and obtain a new objective function:
$$L\left(U_{u}, V, W\right) = \sum_{i=1}^{n} r_{ui}\left[\ln g\left(\hat{X}_{ui}\right) + \sum_{j=1}^{n}\ln\left(1 - r_{uj}\,g\left(\hat{X}_{u(j-i)}\right)\right)\right].\qquad(12)$$
Taking into account all $m$ users and using the Frobenius norm of the latent factors for regularization, we obtain the objective function of MERR_SVD++:
$$L(U, V, W) = \sum_{u=1}^{m}\sum_{i=1}^{n} r_{ui}\left[\ln g\left(\hat{X}_{ui}\right) + \sum_{j=1}^{n}\ln\left(1 - r_{uj}\,g\left(\hat{X}_{u(j-i)}\right)\right)\right] - \frac{\lambda}{2}\left(\|U\|^{2} + \|V\|^{2} + \|W\|^{2}\right),\qquad(13)$$
in which $\lambda$ denotes the regularization coefficient and $\|U\|$ denotes the Frobenius norm of $U$. Note that the lower bound $L(U,V,W)$ is much less complex than the original objective function in (9), and standard optimization methods, for example, gradient ascent, can be used to learn the optimal model parameters $U$, $V$, and $W$.
4.2. Optimization. We can now maximize the objective function (13) with respect to the latent factors: $\arg\max_{U,V,W} L(U,V,W)$. Note that $L(U,V,W)$ represents an approximation of the mean value of ERR across all the users; we can thus remove the constant coefficient $1/m$, since it has no influence on the optimization of $L(U,V,W)$. Since the objective function is smooth, we can use gradient ascent for the optimization. The gradients can be derived in a similar manner to xCLiMF [18], as shown in the following:
$$\frac{\partial L}{\partial U_{u}} = \sum_{i=1}^{n} r_{ui}\left[g\left(-\hat{X}_{ui}\right)V_{i} + \sum_{j=1}^{n}\frac{r_{uj}\,g'\left(\hat{X}_{uj} - \hat{X}_{ui}\right)\left(V_{i} - V_{j}\right)}{1 - r_{uj}\,g\left(\hat{X}_{uj} - \hat{X}_{ui}\right)}\right] - \lambda U_{u},\qquad(14)$$
$$\frac{\partial L}{\partial V_{i}} = r_{ui}\left[g\left(-\hat{X}_{ui}\right) + \sum_{j=1}^{n} r_{uj}\,g'\left(\hat{X}_{uj} - \hat{X}_{ui}\right)\cdot\left(\frac{1}{1 - r_{uj}\,g\left(\hat{X}_{uj} - \hat{X}_{ui}\right)} - \frac{1}{1 - r_{ui}\,g\left(\hat{X}_{ui} - \hat{X}_{uj}\right)}\right)\right]\left(U_{u} + |N(u)|^{-1/2}\sum_{k\in N(u)} W_{k}\right) - \lambda V_{i},\qquad(15)$$
$$\frac{\partial L}{\partial W_{i}} = r_{ui}\left[g\left(-\hat{X}_{ui}\right)V_{i} + \sum_{j=1}^{n}\frac{r_{uj}\,g'\left(\hat{X}_{uj} - \hat{X}_{ui}\right)\left(V_{i} - V_{j}\right)}{1 - r_{uj}\,g\left(\hat{X}_{uj} - \hat{X}_{ui}\right)}\right] - \lambda W_{i}.\qquad(16)$$
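As a sanity check on the derivation, the per-user part of the objective (13) and the gradient (14) can be sketched and verified against a finite-difference approximation. This is a minimal Python illustration with made-up dimensions and data, not the authors' MATLAB implementation; terms with $r_{uj} = 0$ vanish, so the inner sums are restricted to rated items:

```python
import numpy as np

def g(x):
    return 1.0 / (1.0 + np.exp(-x))

def L_u(U_u, V, W, N_u, r, lam=0.01):
    # per-user part of Eq. (13); the V and W regularizers are omitted
    # because they do not depend on U_u
    z = U_u + W[N_u].sum(axis=0) / np.sqrt(len(N_u))
    x = V @ z  # predicted scores X_hat_ui, Eq. (7)
    total = -0.5 * lam * U_u @ U_u
    for i in np.nonzero(r)[0]:
        inner = np.log(g(x[i]))
        for j in np.nonzero(r)[0]:
            inner += np.log(1.0 - r[j] * g(x[j] - x[i]))
        total += r[i] * inner
    return total

def grad_U_u(U_u, V, W, N_u, r, lam=0.01):
    # Eq. (14); note g'(x) = g(x) * (1 - g(x))
    z = U_u + W[N_u].sum(axis=0) / np.sqrt(len(N_u))
    x = V @ z
    grad = -lam * U_u
    for i in np.nonzero(r)[0]:
        term = g(-x[i]) * V[i]
        for j in np.nonzero(r)[0]:
            d = g(x[j] - x[i])
            term += r[j] * d * (1 - d) * (V[i] - V[j]) / (1 - r[j] * d)
        grad += r[i] * term
    return grad

rng = np.random.default_rng(0)
d, n = 2, 4
U_u = rng.normal(0, 0.1, d)
V, W = rng.normal(0, 0.1, (n, d)), rng.normal(0, 0.1, (n, d))
N_u = np.array([0, 2, 3])
r = np.array([31 / 32, 0.0, 7 / 32, 3 / 32])  # Eq. (6) for ratings 5, unknown, 3, 2
eps = 1e-5
num = np.array([(L_u(U_u + eps * e, V, W, N_u, r) - L_u(U_u - eps * e, V, W, N_u, r)) / (2 * eps)
                for e in np.eye(d)])
print(np.allclose(num, grad_U_u(U_u, V, W, N_u, r), atol=1e-6))  # prints True
```

The analytic gradient matching the numerical one confirms that (14) is indeed the derivative of (13) with respect to $U_{u}$.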
The learning algorithm for the MERR_SVD++ model is outlined in Algorithm 1.

The published research papers [8-11, 13, 18] show that using the normal distribution $N(0, 0.01)$ to initialize the feature matrices is very effective, so we also use this approach for the initialization of $U$, $V$, and $W$ in our proposed algorithm.
Algorithm 1: MERR_SVD++.
    Input: training set $X$, learning rate $\alpha$, regularization $\lambda$, the number of features $d$, and the maximal number of iterations itermax.
    Output: feature matrices $U$, $V$, $W$.
    for $u = 1, 2, \ldots, m$ do
        index the relevant items for user $u$: $N(u) = \{j \mid X_{uj} > 0,\ 0 \le j \le n\}$
    end
    initialize $U^{(0)}$, $V^{(0)}$, and $W^{(0)}$ with random values, and set $t = 0$
    repeat
        for $u = 1, 2, \ldots, m$ do
            update $U_{u}$: $U_{u}^{(t+1)} = U_{u}^{(t)} + \alpha\,\partial L/\partial U_{u}^{(t)}$
            for $i \in N(u)$ do
                update $V_{i}$, $W_{i}$: $V_{i}^{(t+1)} = V_{i}^{(t)} + \alpha\,\partial L/\partial V_{i}^{(t)}$, $W_{i}^{(t+1)} = W_{i}^{(t)} + \alpha\,\partial L/\partial W_{i}^{(t)}$
            end
        end
        $t = t + 1$
    until $t \ge$ itermax
    $U = U^{(t)}$, $V = V^{(t)}$, $W = W^{(t)}$.

4.3. Computational Complexity. Here we first analyze the complexity of the learning process for one iteration. By exploiting the data sparseness in $X$, the computational complexity of the gradient in (14) is $O(d\bar{n}^{2}m + d\bar{n}m)$, where $\bar{n}$ denotes the average number of relevant items across all the users. The computational complexity of the gradient in (16) is also $O(d\bar{n}^{2}m + d\bar{n}m)$, and the computational complexity of the gradient in (15) is $O(d\bar{n}^{2}m + d\bar{n}m)$ as well. Hence, the complexity of the learning algorithm in one iteration is in the order of $O(d\bar{n}^{2}m)$. In the case that $\bar{n}$ is a small number, that is, $\bar{n}^{2} \ll m$, the complexity is linear in the number of users in the data collection. Note that we have $s = \bar{n}m$, in which $s$ denotes the number of nonzeros in the user-item matrix; the complexity of the learning algorithm is then $O(d\bar{n}s)$. Since we usually have $\bar{n} \ll s$, the complexity remains $O(d\bar{n}s)$ even in the case that $\bar{n}$ is large, that is, linear in the number of nonzeros (i.e., relevant observations in the data). In sum, our analysis shows that MERR_SVD++ is suitable for large scale use cases. Note that we also empirically verify the complexity of the learning algorithm in Section 5.4.3.
5. Experiment
5.1. Datasets. We use two datasets for the experiments. The first is the MovieLens 1 million dataset (ML1m) [18], which contains ca. 1M ratings (1-5 scale) from ca. 6K users on 3.7K movies. The sparseness of the ML1m dataset is 95.53%. The second dataset is the Netflix dataset [2, 16, 17], which contains 100,000,000 ratings (1-5 scale) from 480,189 users on 17,770 movies. Due to the huge size of the Netflix data, we extract a subset of 10,000 users and 10,000 movies, in which each user has rated more than 100 different movies.
5.2. Evaluation Metrics. Just as has been justified by [17], NDCG is a very suitable evaluation metric for personalized ranking algorithms which combine explicit and implicit feedback. Our proposed MERR_SVD++ algorithm exploits both explicit and implicit feedback simultaneously and optimizes the well-known personalized ranking evaluation metric Expected Reciprocal Rank (ERR). We therefore use NDCG and ERR as the evaluation metrics for the predictive quality of models in this paper.
NDCG is one of the most widely used measures for ranking problems. To define $\mathrm{NDCG}_{u}$ for a user $u$, one first needs to define $\mathrm{DCG}_{u}$:
$$\mathrm{DCG}_{u} = \sum_{i=1}^{m}\frac{2^{\mathrm{pref}(i)} - 1}{\log_{2}(i + 1)},\qquad(17)$$
where $\mathrm{pref}(i)$ is a binary indicator returning 1 if the $i$th item is preferred and 0 otherwise. $\mathrm{DCG}_{u}$ is then normalized by the ideal ranked list into the interval $[0,1]$:
$$\mathrm{NDCG}_{u} = \frac{\mathrm{DCG}_{u}}{\mathrm{IDCG}_{u}},\qquad(18)$$
where $\mathrm{IDCG}_{u}$ denotes the value of $\mathrm{DCG}_{u}$ when the ranked list is sorted exactly according to the user's tastes, with positive items placed at the head of the ranked list. The NDCG of all the users is the mean score over users.
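The NDCG@k computation in (17)-(18) can be sketched as follows. This is a toy Python illustration with binary preference labels and made-up lists; list positions are 0-based, so $\log_{2}(i+2)$ corresponds to $\log_{2}(i+1)$ with 1-based ranks:

```python
import numpy as np

def ndcg_at_k(pref_ranked, k=5):
    # Eqs. (17)-(18): pref_ranked holds binary pref(i) labels of the
    # recommended list, best-predicted item first.
    def dcg(labels):
        return sum((2 ** rel - 1) / np.log2(i + 2) for i, rel in enumerate(labels[:k]))
    idcg = dcg(sorted(pref_ranked, reverse=True))  # ideal ranking
    return dcg(pref_ranked) / idcg if idcg > 0 else 0.0

print(ndcg_at_k([1, 0, 1, 0, 0]))  # two preferred items, one at the top
print(ndcg_at_k([1, 1, 0, 0, 0]))  # perfect ranking -> prints 1.0
```

A perfect ranking scores exactly 1.0, and misplacing a preferred item is penalized logarithmically with its position.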
ERR is a generalized version of Reciprocal Rank (RR), designed to be used with multiple relevance level data (e.g., ratings). It has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list. Using the definition of ERR in [18], we can define ERR for a ranked item list of user $u$ as follows:
$$\mathrm{ERR}_{u} = \sum_{i=1}^{n}\frac{r_{ui}}{R_{ui}}\prod_{j=1}^{n}\left(1 - r_{uj}\,I\left(R_{uj} < R_{ui}\right)\right).\qquad(19)$$
Similar to NDCG, the ERR of all the users is the mean score over users.
Since in recommender systems the user's satisfaction is dominated by only a few items at the top of the recommendation list, our evaluation in the following experiments focuses on the performance of the top-5 recommended items, that is, NDCG@5 and ERR@5.
5.3. Experiment Setup. For each dataset, we randomly selected 5 rated items (movies) and 1000 unrated items (movies) for each user to form a test set. We then randomly selected a varying number of rated items from the rest to form a training set. For example, just as in [14, 18], under the condition of "Given 5" we randomly selected 5 rated items (disjoint from the items in the test set) for each user in order to generate a training set. We investigated a variety of "Given" conditions for the training sets, that is, 5, 10, and 15 for the ML1m dataset and 10, 20, and 30 for the extracted Netflix dataset. The recommendation lists generated for each user are compared to the ground truth in the test set in order to measure the performance.
All the models were implemented in MATLAB R2009a. For MERR_SVD++, the value of the regularization parameter $\lambda$ was selected from the range $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 1, 10\}$, and the optimal parameter value was used. The learning rate $\alpha$ was selected from the set $A = \{\alpha \mid \alpha = 0.0001 \times 2^{c},\ \alpha \le 0.5,\ c > 0,\ c\ \text{is a constant}\}$, and the optimal parameter value was also used. In order to compare their performances fairly, we set the number of features to 10 for all matrix factorization models. The optimal values of all parameters for all the baseline models used are determined individually; more detailed parameter settings for all the baselines can be found in the corresponding references. For all the algorithms used in our experiments, we repeated the experiment 5 times for each of the different conditions of each dataset, and the performances reported were averaged across the 5 runs.
5.4. Experiment Results. In this section, we present a series of experiments to evaluate MERR_SVD++. We designed the experiments in order to address the following research questions.
(1) Does the proposed MERR_SVD++ outperform state-of-the-art personalized ranking approaches for top-N recommendation?
(2) Does the performance of MERR_SVD++ improve when we only increase the number of implicit feedback data for each user?
(3) Is MERR_SVD++ scalable for large scale use cases?
5.4.1. Performance Comparison. We compare the performance of MERR_SVD++ with that of five baseline algorithms. The approaches we compare with are listed below.
(i) Co-Rating [17]: a state-of-the-art CF model that can be trained from explicit and implicit feedback simultaneously.
(ii) SVD++ [16]: the first proposed CF model that combines explicit and implicit feedback.
(iii) xCLiMF [18]: a state-of-the-art PR approach which aims at directly optimizing ERR for top-N recommendation in domains with explicit feedback data (e.g., ratings).
(iv) CofiRank [29]: a PR approach that optimizes the NDCG measure [28] for domains with explicit feedback data (e.g., ratings). The implementation is based on the publicly available software package from the authors.
(v) CLiMF [14]: a state-of-the-art PR approach that optimizes the Mean Reciprocal Rank (MRR) measure [31] for domains with implicit feedback data (e.g., click, follow). We use the explicit feedback datasets by binarizing the rating values with a threshold: on the ML1m dataset and the extracted Netflix dataset, we take ratings of 4 and 5 (the two highest relevance levels), respectively, as the relevance threshold for top-N recommendation.
The results of the experiments on the ML1m and the extracted Netflix datasets are shown in Figure 4. Rows denote
Figure 4: The performance comparison of MERR_SVD++ and baselines (CLiMF, CofiRank, xCLiMF, SVD++, Co_Rating, and MERR_SVD++): (a) NDCG@5 and (b) ERR@5 on the ML1m dataset under the "Given 5", "Given 10", and "Given 15" conditions; (c) NDCG@5 and (d) ERR@5 on the extracted Netflix dataset under the "Given 10", "Given 20", and "Given 30" conditions.
the varieties of "Given" conditions for the training sets based on the ML1m and the extracted Netflix datasets, and columns denote the quality of NDCG and ERR. Figure 4 shows that MERR_SVD++ outperforms the baseline approaches in terms of both ERR and NDCG in all of the cases. The results show that the improvement of ERR aligns consistently with the improvement of NDCG, indicating that optimizing ERR does not degrade the utility of recommendations as captured by the NDCG measure. It can be seen that the relative performance of MERR_SVD++ improves as the number of observed ratings from the users increases. This result indicates that MERR_SVD++ can learn better top-N recommendation models if more observations of the graded relevance data from users are available. The results also reveal that it is difficult to model user preferences encoded in multiple levels of relevance with limited observations, in particular when the number of observations is lower than the number of relevance levels.

Compared to Co-Rating, which is based on rating prediction, MERR_SVD++ is based on ranking prediction and succeeds in enhancing the top-ranked performance by optimizing ERR. As reported in [17], the performance of SVD++ is slightly weaker than that of Co-Rating, because the SVD++ model only attempts to approximate the observed ratings and does not model the preferences expressed in implicit feedback. The results also show that Co-Rating and SVD++ significantly outperform xCLiMF and CofiRank, which confirms our belief that implicit feedback can indeed complement explicit feedback. It can be seen in Figure 4 that xCLiMF significantly outperforms CLiMF in terms of
Figure 5: The influence of implicit feedback on the performance of MERR_SVD++ (ERR@5 and NDCG@5 versus the increased number of implicit feedback entries per user, from 0 to 50).
both ERR and NDCG across all the settings of relevance thresholds and datasets. The results indicate that the information loss from binarizing multilevel relevance data inevitably makes recommendation models based on binary relevance data, such as CLiMF, suboptimal for use cases with explicit feedback data.

Hence, we give a positive answer to our first research question.
5.4.2. The Influence of Implicit Feedback on the Performance of MERR_SVD++. The influence of implicit feedback on the performance of MERR_SVD++ can be seen in Figure 5. Here $N(u) = R(u) + M(u)$, where $M(u)$ denotes the set of all items on which user $u$ only gave implicit feedback. Rows denote the increased numbers of implicit feedback entries for each user, and columns denote the quality of ERR and NDCG. In our experiment, the increased number of implicit feedback entries is the same for each user. We use the extracted Netflix dataset under the condition of "Given 10". Figure 5 shows that the ERR and NDCG quality of MERR_SVD++ improves synchronously and linearly with the increase of implicit feedback for each user, which confirms our belief that implicit feedback can indeed complement explicit feedback.

With this experimental result, we give a positive answer to our second research question.
5.4.3. Scalability. The last experiment investigated the scalability of MERR_SVD++ by measuring the training time required for the training set at different scales. First, as analyzed in Section 4.3, the computational complexity of MERR_SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate the scalability, we used different numbers of users in the training set: under each condition, we randomly selected from 10% to 100% of the users in the training set, and their rated items, as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6. We can observe that the computational time under
Figure 6: Scalability analysis of MERR_SVD++ in terms of the number of users in the training set (runtime per iteration in seconds versus the percentage of users used for training, under the "Given 5", "Given 10", and "Given 15" conditions).
each condition increases almost linearly with the number of users. Second, as also discussed in Section 4.3, the computational complexity of MERR_SVD++ can be further approximated as linear in the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm against different scales of the training sets under different "Given" conditions. The result is also shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.

The observations from this experiment allow us to answer our last research question positively.
6. Conclusion and Future Work
The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data, rather than making full use of the information in the dataset. Until now, nobody has studied personalized ranking algorithms that exploit both explicit and implicit feedback. In order to overcome the defects of prior research, in this paper we have presented a new personalized ranking algorithm (MERR_SVD++) that exploits both explicit and implicit feedback simultaneously. MERR_SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the newest xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ shows a linear correlation with the number of ratings. Because of its high precision and good expansibility, MERR_SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by solving the latency problem of personalized recommendation, and has wide application prospects in the
10 Mathematical Problems in Engineering
field of internet information recommendation And becauseMERR SVD++ exploits both explicit and implicit feedbacksimultaneously MERR SVD++ can solve the data sparsityand imbalance problems of personalized ranking algorithmsto a certain extent
For future work, we plan to extend our algorithm to richer models so that it can address the grey sheep problem and the cold start problem of personalized recommendation. We would also like to explore more useful information from explicit and implicit feedback simultaneously.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan city (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).
References
[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734-749, 2005.
[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401-410, ACM Press, Sydney, Australia, August 2015.
[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143-152, IEEE, Seattle, Wash, USA, February 2012.
[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002-4008, ACM Press, Buenos Aires, Argentina, July 2015.
[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193-202, ACM, New York, NY, USA, August 2014.
[6] M. Jahrer and A. Toscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153-167, ACM, San Diego, Calif, USA, August 2011.
[7] G. Takacs and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83-90, ACM, Dublin, Ireland, 2012.
[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics, pp. 395-404, Wuhan, China, October 2014.
[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052-3058, ACM Press, Austin, Tex, USA, January 2015.
[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics (SPAC '14), pp. 181-190, Wuhan, China, October 2014.
[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452-461, Morgan Kaufmann, Montreal, Canada, 2009.
[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039-1053, 2015.
[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173-180, 2015.
[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077-3081, ACM, Beijing, China, August 2013.
[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691-2697, Beijing, China, 2013.
[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1-24, 2010.
[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445-1448, ACM, Toronto, Canada, October 2010.
[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431-434, ACM, Hong Kong, October 2013.
[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490-497, IEEE, Washington, DC, USA, July 2011.
[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89-115, 2004.
[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791-798, ACM Press, Corvallis, Ore, USA, June 2007.
[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11-217, British Columbia, Canada, 2004.
[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838-850, 2013.
[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249-258, ACM Press, Sydney, Australia, August 2015.
[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198-1204, ACM Press, Austin, Tex, USA, January 2015.
[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.
[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759-766, ACM Press, Hong Kong, November 2009.
[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08), pp. 83-90, ACM, Singapore, July 2008.
[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank: maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79-86, Curran Associates, Vancouver, Canada, 2007.
[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621-630, ACM, New York, NY, USA, November 2009.
[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text Retrieval Conference (TREC-8 '99), Gaithersburg, Md, USA, November 1999.
4 Mathematical Problems in Engineering
list from top to bottom and stops when a result is found that fully satisfies the user's information need. The usefulness of an item at rank i is dependent on the usefulness of the items at ranks less than i. Reciprocal Rank (RR) is an important evaluation metric in the research field of information retrieval [14], and it strongly emphasizes the relevance of results returned at the top of the list. The ERR measure is a generalized version of RR designed to be used with multiple relevance level data (e.g., ratings). ERR has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list.
Using the definition of ERR in [18, 30], we can describe ERR for a ranked item list of user u as follows:

    ERR_u = \sum_{i=1}^{n} \frac{P_{ui}}{R_{ui}},    (3)

where R_{ui} denotes the rank position of item i for user u when all items are ranked in descending order of the predicted relevance values, and P_{ui} denotes the probability that user u stops at position i. As in [30], P_{ui} is defined as follows:

    P_{ui} = r_{ui} \prod_{j=1}^{n} \big(1 - r_{uj}\, I(R_{uj} < R_{ui})\big),    (4)

where I(x) is an indicator function equal to 1 if the condition is true and 0 otherwise, and r_{ui} denotes the probability that user u finds item i relevant. Substituting (4) into (3), we obtain the calculation formula of ERR_u:

    ERR_u = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \big(1 - r_{uj}\, I(R_{uj} < R_{ui})\big).    (5)
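To make the definitions in (3)-(5) concrete, the following sketch computes ERR_u for one user from a vector of relevance probabilities r_{ui} and predicted scores. The function name and the toy numbers are our own illustration, not code from the paper; it uses the cascade form of (4), accumulating the probability that the user has not yet stopped:

```python
def err(relevance_probs, scores):
    """ERR of one ranked list, per Eqs. (3)-(5):
    ERR_u = sum_i (r_ui / R_ui) * prod over higher-ranked j of (1 - r_uj)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    not_stopped = 1.0  # probability the user scans past all higher positions
    total = 0.0
    for rank, item in enumerate(order, start=1):
        r = relevance_probs[item]
        total += not_stopped * r / rank  # P_ui / R_ui, with P_ui as in Eq. (4)
        not_stopped *= 1.0 - r           # user continues past this position
    return total

# toy example: item 0 is highly relevant and ranked first
print(err([0.75, 0.25, 0.0], [2.1, 0.3, -1.0]))  # 0.78125
```

Ranking the relevant item last instead (scores [-1.0, 0.3, 2.1]) drops the value to 0.3125, illustrating the metric's strong emphasis on the top of the list.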
We use a mapping function similar to the one used in [18] to convert ratings (or levels of relevance in general) to probabilities of relevance as follows:

    r_{ui} = \frac{2^{X_{ui}} - 1}{2^{X_{\max}}}  if I_{ui} > 0,
    r_{ui} = 0                                    if I_{ui} = 0,    (6)

where I_{ui} is an indicator function. Note that I_{ui} > 0 (I_{ui} = 0) indicates that user u's preference for item i is known (unknown), X_{ui} denotes the rating given by user u to item i, and X_{\max} is the highest rating.

In this paper we define R(u) as the set of all items for which user u gave explicit feedback, so N(u) = R(u) in a dataset in which the users only gave explicit feedback, and the implicit feedback dataset is created by setting all the rating data in the explicit feedback dataset to 1. A toy example can be seen in Figure 2. If the dataset contains both explicit feedback data and implicit feedback data, N(u) = R(u) + M(u), with M(u) denoting the set of all items for which user u only gave implicit feedback. A toy example can be seen in Figure 3. The influence of implicit feedback on the performance of MERR_SVD++ can be found in Section 5.4.2. If we use the traditional SVD++ algorithm to unify explicit and implicit feedback data, the prediction formula for \hat{X}_{ui} in SVD++ can be defined as

    \hat{X}_{ui} = \Big(U_u + |N(u)|^{-1/2} \sum_{j \in N(u)} W_j\Big)^{T} V_i.    (7)
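A direct transcription of the SVD++ prediction rule (7) can look as follows; the latent dimensionality, the factor values, and the helper name `predict` are all illustrative assumptions of ours:

```python
import math

def predict(U_u, V_i, W, N_u):
    """Eq. (7): X_ui = (U_u + |N(u)|^{-1/2} * sum_{j in N(u)} W_j)^T V_i."""
    scale = 1.0 / math.sqrt(len(N_u))
    # explicit user factor augmented by the implicit-feedback item factors W_j
    profile = [u_f + scale * sum(W[j][f] for j in N_u)
               for f, u_f in enumerate(U_u)]
    return sum(p * v for p, v in zip(profile, V_i))

# toy factors with d = 2 latent features (made-up numbers)
U_u = [0.5, -0.2]
V = {0: [0.3, 0.1], 1: [-0.4, 0.8]}
W = {0: [0.2, 0.0], 1: [0.1, 0.3], 2: [-0.1, 0.1]}
N_u = [0, 1, 2]  # all items with explicit or implicit feedback from user u
print(predict(U_u, V[0], W, N_u))  # ~0.1877
```

Items that only received implicit feedback thus still shift the user's effective profile through the W_j factors, which is exactly why SVD++ can blend the two feedback channels.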
So far, through the introduction of SVD++, we can exploit both explicit and implicit feedback simultaneously by optimizing the evaluation metric ERR; hence we call our model MERR_SVD++.
Note that the value of the rank R_{ui} depends on the value of \hat{X}_{ui}. For example, if the predicted relevance value \hat{X}_{ui} of item i is the second highest among all the items for user u, then we will have R_{ui} = 2.

Similar to other ranking measures such as RR, ERR is nonsmooth with respect to the latent factors of users and items, that is, U, V, and W. It is thus impossible to optimize ERR directly using conventional optimization techniques. We therefore employ smoothing techniques that were also used in CLiMF [14] and xCLiMF [18] to attain a smoothed version of ERR. In particular, we approximate the rank-based terms 1/R_{ui} and I(R_{uj} < R_{ui}) in (5) by smooth functions of the model parameters U, V, and W. The approximate formulas are as follows:

    I(R_{uj} < R_{ui}) \approx g(\hat{X}_{uj} - \hat{X}_{ui}),
    \frac{1}{R_{ui}} \approx g(\hat{X}_{ui}),    (8)

where g(x) is the logistic function, that is, g(x) = 1/(1 + e^{-x}).
Substituting (8) into (5), we obtain a smoothed approximation of ERR_u:

    ERR_u = \sum_{i=1}^{n} r_{ui}\, g(\hat{X}_{ui}) \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big),    (9)

where, for notational convenience, we make use of the substitution \hat{X}_{u(j-i)} = \hat{X}_{uj} - \hat{X}_{ui}.
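The smoothed surrogate (9) can be evaluated directly with a double loop over items; the sketch below is our own transcription (names and toy values assumed), useful for checking the effect of the approximation on small examples:

```python
import math

def g(x):
    """Logistic function used in Eq. (8)."""
    return 1.0 / (1.0 + math.exp(-x))

def smoothed_err(r, x_hat):
    """Eq. (9): sum_i r_ui g(X_ui) prod_j (1 - r_uj g(X_uj - X_ui))."""
    n = len(r)
    total = 0.0
    for i in range(n):
        prod = 1.0
        for j in range(n):
            # g(X_uj - X_ui) smoothly approximates I(R_uj < R_ui)
            prod *= 1.0 - r[j] * g(x_hat[j] - x_hat[i])
        total += r[i] * g(x_hat[i]) * prod
    return total
```

As with the exact ERR, swapping the scores of a relevant and a less relevant item lowers the value, so gradient ascent on this surrogate pushes relevant items up the ranking.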
Given the monotonicity of the logarithm function, the model parameters that maximize (9) are equivalent to the parameters that maximize ln((1/|N(u)|) ERR_u). Specifically, we have

    U_u, V, W = \arg\max_{U_u, V, W} ERR_u
              = \arg\max_{U_u, V, W} \ln\big((1/|N(u)|)\, ERR_u\big)
              = \arg\max_{U_u, V, W} \ln\Big(\sum_{i=1}^{n} \frac{r_{ui}\, g(\hat{X}_{ui})}{|N(u)|} \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big).    (10)
Figure 2: A toy example of a dataset in which the users only gave explicit feedback: (a) the explicit feedback dataset (ratings on a 1-5 scale); (b) the implicit feedback dataset derived from it (all rated entries set to 1).
Figure 3: A toy example of a dataset that contains both explicit feedback data and implicit feedback data: (a) the explicit feedback dataset; (b) the implicit feedback dataset, in which the entries in bold denote items for which the users only gave implicit feedback.
Based on Jensen's inequality and the concavity of the logarithm function, in a similar manner to [14, 18], we derive the lower bound of ln((1/|N(u)|) ERR_u) as follows:

    \ln\Big(\frac{1}{|N(u)|}\, ERR_u\Big)
      = \ln\Big[\sum_{i=1}^{n} \frac{r_{ui}\, g(\hat{X}_{ui})}{|N(u)|} \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big]
      \geq \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui} \ln\Big[g(\hat{X}_{ui}) \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big]
      = \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui} \Big[\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big].    (11)
We can neglect the constant 1/|N(u)| in the lower bound and obtain a new objective function:

    L(U_u, V, W) = \sum_{i=1}^{n} r_{ui} \Big[\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big].    (12)
Taking into account all m users and using the Frobenius norm of the latent factors for regularization, we obtain the objective function of MERR_SVD++:

    L(U, V, W) = \sum_{u=1}^{m} \sum_{i=1}^{n} r_{ui} \Big[\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Big] - \frac{\lambda}{2}\big(\|U\|^2 + \|V\|^2 + \|W\|^2\big),    (13)

in which \lambda denotes the regularization coefficient and \|U\| denotes the Frobenius norm of U. Note that the lower bound L(U, V, W) is much less complex than the original objective function in (9), so standard optimization methods, for example, gradient ascent, can be used to learn the optimal model parameters U, V, and W.
4.2. Optimization. We can now maximize the objective function (13) with respect to the latent factors: \arg\max_{U, V, W} L(U, V, W). Note that L(U, V, W) represents an approximation of the mean value of ERR across all the users; we can thus remove the constant coefficient 1/m, since it has no influence on the optimization of L(U, V, W). Since the objective function is smooth, we can use gradient ascent for the optimization. The gradients can be derived in a similar manner to xCLiMF [18], as shown in the following:
    \frac{\partial L}{\partial U_u} = \sum_{i=1}^{n} r_{ui} \Bigg[g(-\hat{X}_{ui})\, V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui})\,(V_i - V_j)}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})}\Bigg] - \lambda U_u,    (14)

    \frac{\partial L}{\partial V_i} = r_{ui} \Bigg[g(-\hat{X}_{ui}) + \sum_{j=1}^{n} r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui}) \Bigg(\frac{1}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})} - \frac{1}{1 - r_{ui}\, g(\hat{X}_{ui} - \hat{X}_{uj})}\Bigg)\Bigg] \Big(U_u + |N(u)|^{-1/2} \sum_{k \in N(u)} W_k\Big) - \lambda V_i,    (15)

    \frac{\partial L}{\partial W_i} = r_{ui} \Bigg[g(-\hat{X}_{ui})\, V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui})\,(V_i - V_j)}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})}\Bigg] - \lambda W_i.    (16)
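Since (14)-(16) are ordinary gradients of a smooth objective, they can be validated numerically. The sketch below checks the U_u-gradient against central finite differences of the per-user objective (12), under our own simplification that \hat{X}_{ui} = U_u \cdot V_i (the implicit term of (7) is dropped and \lambda = 0, which leaves the bracketed part of (14) unchanged):

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def gp(x):  # derivative of the logistic function
    return g(x) * (1.0 - g(x))

def objective(U, V, r):
    """Per-user objective of Eq. (12) with X_ui = U . V_i."""
    x = [sum(a * b for a, b in zip(U, v)) for v in V]
    total = 0.0
    for i in range(len(r)):
        inner = math.log(g(x[i]))
        for j in range(len(r)):
            inner += math.log(1.0 - r[j] * g(x[j] - x[i]))
        total += r[i] * inner
    return total

def grad_U(U, V, r):
    """Bracketed gradient of Eq. (14), without the -lambda*U_u term."""
    x = [sum(a * b for a, b in zip(U, v)) for v in V]
    d = len(U)
    grad = [0.0] * d
    for i in range(len(r)):
        vec = [g(-x[i]) * V[i][f] for f in range(d)]
        for j in range(len(r)):
            c = r[j] * gp(x[j] - x[i]) / (1.0 - r[j] * g(x[j] - x[i]))
            for f in range(d):
                vec[f] += c * (V[i][f] - V[j][f])
        for f in range(d):
            grad[f] += r[i] * vec[f]
    return grad

U = [0.1, -0.2]
V = [[0.5, 0.3], [-0.4, 0.7], [0.2, -0.1]]
r = [0.96875, 0.03125, 0.46875]  # ratings 5, 1, 4 mapped through Eq. (6)
analytic = grad_U(U, V, r)
for f in range(2):
    Up, Um = list(U), list(U)
    Up[f] += 1e-6
    Um[f] -= 1e-6
    fd = (objective(Up, V, r) - objective(Um, V, r)) / 2e-6
    assert abs(fd - analytic[f]) < 1e-6  # analytic and numeric gradients agree
```

A check like this is a cheap safeguard when reimplementing the update rules of Algorithm 1.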
The learning algorithm for the MERR_SVD++ model is outlined in Algorithm 1.

The published research in [8-11, 13, 18] shows that using the normal distribution N(0, 0.01) for the initialization of the feature matrices is very effective, so we also use this approach for the initialization of U, V, and W in our proposed algorithm.
Algorithm 1 (MERR_SVD++)

Input: training set X; learning rate \alpha; regularization coefficient \lambda; number of features d; maximal number of iterations itermax.
Output: feature matrices U, V, W.

for u = 1, 2, ..., m do
    index the relevant items for user u: N(u) = {j | X_{uj} > 0, 0 <= j <= n}
end for
initialize U^{(0)}, V^{(0)}, and W^{(0)} with random values, and set t = 0
repeat
    for u = 1, 2, ..., m do
        update U_u: U_u^{(t+1)} = U_u^{(t)} + \alpha \partial L / \partial U_u^{(t)}
        for i \in N(u) do
            update V_i, W_i:
            V_i^{(t+1)} = V_i^{(t)} + \alpha \partial L / \partial V_i^{(t)}
            W_i^{(t+1)} = W_i^{(t)} + \alpha \partial L / \partial W_i^{(t)}
        end for
    end for
    t = t + 1
until t >= itermax
U = U^{(t)}, V = V^{(t)}, W = W^{(t)}

4.3. Computational Complexity. Here we first analyze the complexity of the learning process for one iteration. By exploiting the data sparseness in X, the computational complexity of the gradient in (14) is O(d \bar{n}^2 m + d \bar{n} m), where \bar{n} denotes the average number of relevant items across all the users. The computational complexity of the gradient in (16) is also O(d \bar{n}^2 m + d \bar{n} m), and the computational complexity of the gradient in (15) is O(d \bar{n}^2 m + d \bar{n} m) as well. Hence, the complexity of the learning algorithm in one iteration is in the order of O(d \bar{n}^2 m). In the case that \bar{n} is a small number, that is, \bar{n}^2 \ll m, the complexity is linear in the number of users in the data collection. Note that we have s = \bar{n} m, in which s denotes the number of nonzeros in the user-item matrix; the complexity of the learning algorithm is then O(d \bar{n} s). Since we usually have \bar{n} \ll s, the complexity remains linear in the number of nonzeros (i.e., relevant observations in the data) even in the case that \bar{n} is large. In sum, our analysis shows that MERR_SVD++ is suitable for large-scale use cases. Note that we also empirically verify the complexity of the learning algorithm in Section 5.4.3.
5. Experiment

5.1. Datasets. We use two datasets for the experiments. The first is the MovieLens 1 million dataset (ML1m) [18], which contains ca. 1M ratings (1-5 scale) from ca. 6K users on 3.7K movies; the sparseness of the ML1m dataset is 95.53%. The second dataset is the Netflix dataset [2, 16, 17], which contains 100,000,000 ratings (1-5 scale) from 480,189 users on 17,770 movies. Due to the huge size of the Netflix data, we extract a subset of 10,000 users and 10,000 movies in which each user has rated more than 100 different movies.
5.2. Evaluation Metrics. Just as has been justified by [17], NDCG is a very suitable evaluation metric for personalized ranking algorithms which combine explicit and implicit feedback, and our proposed MERR_SVD++ algorithm exploits both explicit and implicit feedback simultaneously while optimizing the well-known personalized ranking evaluation metric Expected Reciprocal Rank (ERR). So we use NDCG and ERR as the evaluation metrics for the predictability of models in this paper.

NDCG is among the most widely used measures for ranking problems. To define NDCG_u for a user u, one first needs to define DCG_u:

    DCG_u = \sum_{i=1}^{m} \frac{2^{pref(i)} - 1}{\log(i + 1)},    (17)

where pref(i) is a binary indicator returning 1 if the i-th item is preferred and 0 otherwise. DCG_u is then normalized into the interval [0, 1] by the DCG of the ideal ranked list:

    NDCG_u = \frac{DCG_u}{IDCG_u},    (18)

where IDCG_u denotes the DCG of the ranked list sorted exactly according to the user's tastes, with the positive items placed at the head of the list. The NDCG over all the users is the mean of the per-user scores.
ERR is a generalized version of Reciprocal Rank (RR) designed to be used with data having multiple relevance levels (e.g., ratings). It has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list. Using the definition of ERR in [18], we can define ERR for a ranked item list of user u as follows:

    ERR_u = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \big(1 - r_{uj}\, I(R_{uj} < R_{ui})\big).    (19)

Similar to NDCG, the ERR over all the users is the mean of the per-user scores.
Since in recommender systems the user's satisfaction is dominated by only a few items at the top of the recommendation list, our evaluation in the following experiments focuses on the performance of the top-5 recommended items, that is, NDCG@5 and ERR@5.
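For reference, truncated versions of these metrics can be written down directly from (17)-(19); the helper names and the binary preference convention below are our own assumptions:

```python
import math

def dcg_at_k(preferred, ranking, k=5):
    """Eq. (17) truncated at k, with binary pref(i)."""
    return sum((2.0 ** (1 if item in preferred else 0) - 1.0) / math.log(idx + 1.0)
               for idx, item in enumerate(ranking[:k], start=1))

def ndcg_at_k(preferred, ranking, k=5):
    """Eq. (18): DCG normalized by the DCG of the ideal ranking."""
    ideal = sorted(ranking, key=lambda item: item not in preferred)
    idcg = dcg_at_k(preferred, ideal, k)
    return dcg_at_k(preferred, ranking, k) / idcg if idcg > 0 else 0.0

def err_at_k(rel_probs, ranking, k=5):
    """Eq. (19) truncated at k; rel_probs follow the mapping of Eq. (6)."""
    not_stopped, total = 1.0, 0.0
    for rank, item in enumerate(ranking[:k], start=1):
        r = rel_probs.get(item, 0.0)
        total += not_stopped * r / rank
        not_stopped *= 1.0 - r
    return total

print(ndcg_at_k({"a", "c"}, ["a", "b", "c", "d", "e"]))  # < 1.0: "c" sits at rank 3
```

Both metrics reward placing relevant items early, which is why the two tend to move together in the results reported below.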
5.3. Experiment Setup. For each dataset, we randomly selected 5 rated items (movies) and 1000 unrated items (movies) for each user to form a test set. We then randomly selected a varying number of rated items from the rest to form a training set. For example, just as in [14, 18], under the condition of "Given 5" we randomly selected 5 rated items (disjoint from the items in the test set) for each user in order to generate a training set. We investigated a variety of "Given" conditions for the training sets, that is, 5, 10, and 15 for the ML1m dataset and 10, 20, and 30 for the extracted Netflix dataset. The generated recommendation lists for each user are compared to the ground truth in the test set in order to measure the performance.
All the models were implemented in MATLAB R2009a. For MERR_SVD++, the value of the regularization parameter \lambda was selected from the range {10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 1, 10}, and the optimal parameter value was used. The learning rate \alpha was selected from the set A = {\alpha | \alpha = 0.0001 \times 2^{c}, \alpha \leq 0.5, c > 0, and c is a constant}, and the optimal parameter value was also used. In order to compare their performances fairly, for all matrix factorization models we set the number of features to 10. The optimal values of all parameters for all the baseline models are determined individually; more detailed settings of the parameters for all the baselines can be found in the corresponding references. For all the algorithms used in our experiments, we repeated the experiment 5 times for each of the different conditions of each dataset, and the performances reported were averaged across the 5 runs.
5.4. Experiment Results. In this section we present a series of experiments to evaluate MERR_SVD++. We designed the experiments in order to address the following research questions.

(1) Does the proposed MERR_SVD++ outperform state-of-the-art personalized ranking approaches for top-N recommendation?

(2) Does the performance of MERR_SVD++ improve when we only increase the number of implicit feedback data for each user?

(3) Is MERR_SVD++ scalable for large scale use cases?
5.4.1. Performance Comparison. We compare the performance of MERR_SVD++ with that of five baseline algorithms. The approaches we compare with are listed below.

(i) Co-Rating [17]: a state-of-the-art CF model that can be trained from explicit and implicit feedback simultaneously.

(ii) SVD++ [16]: the first CF model that combines explicit and implicit feedback.

(iii) xCLiMF [18]: a state-of-the-art PR approach which aims at directly optimizing ERR for top-N recommendation in domains with explicit feedback data (e.g., ratings).

(iv) CofiRank [29]: a PR approach that optimizes the NDCG measure [28] for domains with explicit feedback data (e.g., ratings). The implementation is based on the publicly available software package from the authors.

(v) CLiMF [14]: a state-of-the-art PR approach that optimizes the Mean Reciprocal Rank (MRR) measure [31] for domains with implicit feedback data (e.g., click, follow). We use the explicit feedback datasets by binarizing the rating values with a threshold. On the ML1m dataset and the extracted Netflix dataset, we take ratings 4 and 5 (the highest two relevance levels), respectively, as the relevance threshold for top-N recommendation.
Figure 4: The performance comparison of MERR_SVD++ and baselines (NDCG@5 and ERR@5 for CLiMF, CofiRank, xCLiMF, SVD++, Co-Rating, and MERR_SVD++ under the different "Given" conditions on the ML1m and the extracted Netflix datasets).

The results of the experiments on the ML1m and the extracted Netflix datasets are shown in Figure 4. Rows denote
the varieties of "Given" conditions for the training sets based on the ML1m and the extracted Netflix datasets, and columns denote the quality of NDCG and ERR. Figure 4 shows that MERR_SVD++ outperforms the baseline approaches in terms of both ERR and NDCG in all of the cases. The results show that the improvement of ERR aligns consistently with the improvement of NDCG, indicating that optimizing ERR does not degrade the utility of recommendations as captured by the NDCG measure. It can be seen that the relative performance of MERR_SVD++ improves as the number of observed ratings from the users increases. This result indicates that MERR_SVD++ can learn better top-N recommendation models if more observations of the graded relevance data from users can be used. The results also reveal that it is difficult to model user preferences encoded
Figure 5: The influence of implicit feedback on the performance of MERR_SVD++ (the values of ERR@5 and NDCG@5 plotted against the increased number of implicit feedback entries per user).
both ERR and NDCG across all the settings of relevancethresholds and datasets The results indicate that the infor-mation loss from binarizing multilevel relevance data wouldinevitably make recommendation models based on binaryrelevance data such as CLiMF suboptimal for the use caseswith explicit feedback data
Hence we give a positive answer to our first researchquestion
542 The Influence of Implicit Feedback on the Performanceof MERR SVD++ The influence of implicit feedback on theperformance of MERR SVD++ can be found in Figure 5Here 119873(119906) = 119877(119906) + 119872(119906) 119872(119906) denotes the set of allitems that user 119906 only gave implicit feedback Rows denotethe increased numbers of implicit feedback for each userand columns denote the quality of ERR and NDCG In ourexperiment the increased numbers of implicit feedback foreach user are the same We use the extracted Netflix datasetunder the condition of ldquoGiven 10rdquo Figure 5 shows that thequality of ERR and NDCG of MERR SVD++ synchronouslyand linearly improves with the increase of implicit feedbackfor each user which confirms our belief that implicit feedbackcould indeed complement explicit feedback
With this experimental result, we give a positive answer to our second research question.
5.4.3. Scalability. The last experiment investigated the scalability of MERR_SVD++ by measuring the training time required for training sets at different scales. Firstly, as analyzed in Section 4.3, the computational complexity of MERR_SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate the scalability, we used different numbers of users in the training set: under each condition, we randomly selected from 10% to 100% of the users in the training set, together with their rated items, as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6. We can observe that the computational time under each condition increases almost linearly with the number of users. Secondly, as also discussed in Section 4.3, the computational complexity of MERR_SVD++ can be further approximated as linear in the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm against different scales of the training sets under different "Given" conditions. The result is shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.

Figure 6: Scalability analysis of MERR_SVD++ in terms of the number of users in the training set. (x-axis: users for training, 10%-100%; y-axis: runtime per iteration in seconds, 0-40; curves: Given 5, Given 10, Given 15.)
The observations from this experiment allow us to answer our last research question positively.
6. Conclusion and Future Work
The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset; until now, no personalized ranking algorithm had been studied that exploits both explicit and implicit feedback. To overcome the defects of prior research, in this paper we have presented a new personalized ranking algorithm (MERR_SVD++) that exploits both explicit and implicit feedback simultaneously. MERR_SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the newest xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ scales linearly with the number of ratings. Because of its high precision and good extensibility, MERR_SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by alleviating the latency problem of personalized recommendation, and has wide application prospects in the field of Internet information recommendation. Moreover, because MERR_SVD++ exploits both explicit and implicit feedback simultaneously, it can alleviate the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.
For future work, we plan to extend our algorithm to richer models so that it can address the grey sheep problem and the cold start problem of personalized recommendation. We would also like to explore more useful information from explicit and implicit feedback simultaneously.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan city (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).
References
[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734-749, 2005.
[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401-410, ACM Press, Sydney, Australia, August 2015.
[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143-152, IEEE, Seattle, Wash, USA, February 2012.
[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002-4008, ACM Press, Buenos Aires, Argentina, July 2015.
[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193-202, ACM, New York, NY, USA, August 2014.
[6] M. Jahrer and A. Toscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153-167, ACM, San Diego, Calif, USA, August 2011.
[7] G. Takacs and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83-90, ACM, Dublin, Ireland, 2012.
[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics, pp. 395-404, Wuhan, China, October 2014.
[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052-3058, ACM Press, Austin, Tex, USA, January 2015.
[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics (SPAC '14), pp. 181-190, Wuhan, China, October 2014.
[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452-461, Morgan Kaufmann, Montreal, Canada, 2009.
[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039-1053, 2015.
[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173-180, 2015.
[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077-3081, ACM, Beijing, China, August 2013.
[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691-2697, Beijing, China, 2013.
[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1-24, 2010.
[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445-1448, ACM, Toronto, Canada, October 2010.
[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431-434, ACM, Hong Kong, October 2013.
[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490-497, IEEE, Washington, DC, USA, July 2011.
[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89-115, 2004.
[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791-798, ACM Press, Corvallis, Ore, USA, June 2007.
[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11-217, British Columbia, Canada, 2004.
[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838-850, 2013.
[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249-258, ACM Press, Sydney, Australia, August 2015.
[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198-1204, ACM Press, Austin, Tex, USA, January 2015.
[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.
[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759-766, ACM Press, Hong Kong, November 2009.
[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08), pp. 83-90, ACM, Singapore, July 2008.
[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank: maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79-86, Curran Associates, Vancouver, Canada, 2007.
[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621-630, ACM, New York, NY, USA, November 2009.
[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text Retrieval Conference (TREC-8), Gaithersburg, Md, USA, November 1999.
[31] E M Voorhees ldquoThe trec-8 question answering track reportrdquoin Proceedings of the 8th Text Retrieval Conference (TREC-8 rsquo99)Gaithersburg Md USA November 1999
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
Figure 2: A toy example of a dataset in which the users only gave explicit feedback. (a) denotes the explicit feedback dataset; (b) denotes the implicit feedback dataset.

Figure 3: A toy example of a dataset that contains both explicit feedback data and implicit feedback data. (a) denotes the explicit feedback dataset; (b) denotes the implicit feedback dataset; the numbers in bold denote the data for which users only gave implicit feedback.
Based on Jensen's inequality and the concavity of the logarithm function, in a similar manner to [14, 18], we derive the lower bound of \(\ln((1/|N(u)|)\,\mathrm{ERR}_u)\) as follows:
\[
\begin{aligned}
\ln\Bigl(\frac{1}{|N(u)|}\,\mathrm{ERR}_u\Bigr)
&= \ln\Bigl[\sum_{i=1}^{n}\frac{r_{ui}\,g(X_{ui})}{|N(u)|}\prod_{j=1}^{n}\bigl(1-r_{uj}\,g(X_{uj}-X_{ui})\bigr)\Bigr]\\
&\geq \frac{1}{|N(u)|}\sum_{i=1}^{n} r_{ui}\,\ln\Bigl[g(X_{ui})\prod_{j=1}^{n}\bigl(1-r_{uj}\,g(X_{uj}-X_{ui})\bigr)\Bigr]\\
&= \frac{1}{|N(u)|}\sum_{i=1}^{n} r_{ui}\Bigl[\ln g(X_{ui})+\sum_{j=1}^{n}\ln\bigl(1-r_{uj}\,g(X_{uj}-X_{ui})\bigr)\Bigr].
\end{aligned}
\tag{11}
\]
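The Jensen-based bound in (11) can be checked numerically. The following sketch is our own illustration (not from the paper): it uses the logistic function for g, a fixed binarized relevance vector, and random scores, and verifies that the right-hand side of (11) never exceeds the left-hand side.

```python
import math
import random

def g(x):
    """Logistic function used to smooth 1/R_ui and the rank indicator."""
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
n = 8
r = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0]    # binarized relevance r_ui
X = [random.gauss(0.0, 1.0) for _ in range(n)]  # predicted scores X_ui
N = sum(r)                                      # |N(u)|, number of relevant items

def inner(i):
    # g(X_ui) * prod_j (1 - r_uj * g(X_uj - X_ui))
    return g(X[i]) * math.prod(1.0 - r[j] * g(X[j] - X[i]) for j in range(n))

# Left-hand side of (11): ln((1/|N(u)|) * smoothed ERR_u)
lhs = math.log(sum(r[i] * inner(i) for i in range(n)) / N)

# Right-hand side of (11): the lower bound obtained via Jensen's inequality
rhs = (1.0 / N) * sum(
    r[i] * (math.log(g(X[i]))
            + sum(math.log(1.0 - r[j] * g(X[j] - X[i])) for j in range(n)))
    for i in range(n)
)

assert rhs <= lhs  # the bound holds for arbitrary scores
```

Because the weights r_ui / |N(u)| sum to one, the inequality holds for any score vector, which is exactly what makes the surrogate objective safe to maximize.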
We can neglect the constant \(1/|N(u)|\) in the lower bound and obtain a new objective function:
\[
L(U_u, V, W) = \sum_{i=1}^{n} r_{ui}\Bigl[\ln g(X_{ui}) + \sum_{j=1}^{n}\ln\bigl(1-r_{uj}\,g(X_{uj}-X_{ui})\bigr)\Bigr].
\tag{12}
\]
Taking into account all m users and using the Frobenius norm of the latent factors for regularization, we obtain the objective function of MERR_SVD++:
\[
L(U, V, W) = \sum_{u=1}^{m}\sum_{i=1}^{n} r_{ui}\Bigl[\ln g(X_{ui}) + \sum_{j=1}^{n}\ln\bigl(1-r_{uj}\,g(X_{uj}-X_{ui})\bigr)\Bigr] - \frac{\lambda}{2}\bigl(\|U\|^2 + \|V\|^2 + \|W\|^2\bigr),
\tag{13}
\]
in which \(\lambda\) denotes the regularization coefficient and \(\|U\|\) denotes the Frobenius norm of U. Note that the lower bound \(L(U,V,W)\) is much less complex than the original objective function in (9), so standard optimization methods, for example, gradient ascent, can be used to learn the optimal model parameters U, V, and W.
4.2. Optimization. We can now maximize the objective function (13) with respect to the latent factors: \(\arg\max_{U,V,W} L(U,V,W)\). Note that \(L(U,V,W)\) represents an approximation of the mean value of ERR across all the users; we can thus remove the constant coefficient \(1/m\), since it has no influence on the optimization of \(L(U,V,W)\). Since the objective function is smooth, we can use gradient ascent for the optimization. The gradients can be derived in a similar manner to xCLiMF [18], as shown in the following:
\[
\frac{\partial L}{\partial U_u} = \sum_{i=1}^{n} r_{ui}\Bigl[g(-X_{ui})\,V_i + \sum_{j=1}^{n}\frac{r_{uj}\,g'(X_{uj}-X_{ui})\,(V_i - V_j)}{1 - r_{uj}\,g(X_{uj}-X_{ui})}\Bigr] - \lambda U_u,
\tag{14}
\]
\[
\frac{\partial L}{\partial V_i} = r_{ui}\Bigl[g(-X_{ui}) + \sum_{j=1}^{n} r_{uj}\,g'(X_{uj}-X_{ui})\Bigl(\frac{1}{1 - r_{uj}\,g(X_{uj}-X_{ui})} - \frac{1}{1 - r_{ui}\,g(X_{ui}-X_{uj})}\Bigr)\Bigr]\Bigl(U_u + |N(u)|^{-1/2}\sum_{k\in N(u)} W_k\Bigr) - \lambda V_i,
\tag{15}
\]
\[
\frac{\partial L}{\partial W_i} = r_{ui}\Bigl[g(-X_{ui})\,V_i + \sum_{j=1}^{n}\frac{r_{uj}\,g'(X_{uj}-X_{ui})\,(V_i - V_j)}{1 - r_{uj}\,g(X_{uj}-X_{ui})}\Bigr] - \lambda W_i.
\tag{16}
\]
The learning algorithm for the MERR_SVD++ model is outlined in Algorithm 1.

The published research in [8-11, 13, 18] shows that using the normal distribution N(0, 0.01) to initialize the feature matrices is very effective, so we also use this approach to initialize U, V, and W in our proposed algorithm.
Input: training set X, learning rate α, regularization coefficient λ, number of features d, and maximal number of iterations itermax.
Output: feature matrices U, V, W.
  for u = 1, 2, ..., m do
      index the relevant items for user u: N(u) = {j | X_uj > 0, 0 ≤ j ≤ n}
  end
  initialize U^(0), V^(0), and W^(0) with random values, and set t = 0
  repeat
      for u = 1, 2, ..., m do
          update U_u: U_u^(t+1) = U_u^(t) + α ∂L/∂U_u^(t)
          for i ∈ N(u) do
              update V_i, W_i: V_i^(t+1) = V_i^(t) + α ∂L/∂V_i^(t), W_i^(t+1) = W_i^(t) + α ∂L/∂W_i^(t)
          end
      end
      t = t + 1
  until t ≥ itermax
  U = U^(t), V = V^(t), W = W^(t)

Algorithm 1: MERR_SVD++.

4.3. Computational Complexity. Here we first analyze the complexity of the learning process for one iteration. By exploiting the data sparseness in X, the computational complexity of the gradient in (14) is O(dn̄²m + dn̄m), where n̄ denotes the average number of relevant items across all the users. The computational complexity of the gradient in (16) is also O(dn̄²m + dn̄m), and the computational complexity of the gradient in (15) is O(dn̄²m + dn̄m) as well. Hence, the complexity of the learning algorithm in one iteration is of the order O(dn̄²m). In the case that n̄ is a small number, that is, n̄² ≪ m, the complexity is linear in the number of users in the data collection. Note that we have s = n̄m, in which s denotes the number of nonzeros in the user-item matrix; the complexity of the learning algorithm is then O(dn̄s). Since we usually have n̄ ≪ s, the complexity is O(dn̄s) even in the case that n̄ is large, that is, linear in the number of nonzeros (i.e., relevant observations in the data). In sum, our analysis shows that MERR_SVD++ is suitable for large-scale use cases. Note that we also empirically verify the complexity of the learning algorithm in Section 5.4.3.
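To make Algorithm 1 and the gradients (14)-(16) concrete, the following is a minimal gradient-ascent sketch in Python. It is our own illustration, not the authors' MATLAB implementation, and it assumes the SVD++-style score X_ui = V_i · (U_u + |N(u)|^{-1/2} Σ_{k∈N(u)} W_k); users and items are assumed to be indexed from 0.

```python
import numpy as np

def g(x):            # logistic function
    return 1.0 / (1.0 + np.exp(-x))

def dg(x):           # derivative of the logistic function
    return g(x) * (1.0 - g(x))

def train_merr_svdpp(R, d=10, alpha=0.01, lam=0.01, iter_max=50, seed=0):
    """Sketch of Algorithm 1 (our own code, stated assumptions above).

    R: dict mapping user u (0..m-1) -> dict {item i: relevance r_ui in [0, 1]}.
    """
    rng = np.random.default_rng(seed)
    m = len(R)
    n = 1 + max(i for items in R.values() for i in items)
    U = rng.normal(0.0, 0.1, (m, d))   # the paper initializes from N(0, 0.01)
    V = rng.normal(0.0, 0.1, (n, d))
    W = rng.normal(0.0, 0.1, (n, d))

    for _ in range(iter_max):
        for u, items in R.items():
            N_u = list(items)                        # relevant items N(u)
            p_u = U[u] + W[N_u].sum(axis=0) / np.sqrt(len(N_u))
            X = {i: V[i] @ p_u for i in N_u}         # scores X_ui
            grad_U = -lam * U[u]
            for i in N_u:
                r_i = items[i]
                bracket = r_i * g(-X[i]) * V[i]      # vector part, eqs. (14), (16)
                scal = r_i * g(-X[i])                # scalar part, eq. (15)
                for j in N_u:
                    if j == i:                       # the j = i terms vanish
                        continue
                    r_j = items[j]
                    diff = X[j] - X[i]
                    denom = 1.0 - r_j * g(diff)
                    bracket += r_i * r_j * dg(diff) * (V[i] - V[j]) / denom
                    scal += r_i * r_j * dg(diff) * (
                        1.0 / denom - 1.0 / (1.0 - r_i * g(-diff)))
                grad_U += bracket
                V[i] += alpha * (scal * p_u - lam * V[i])   # eq. (15)
                W[i] += alpha * (bracket - lam * W[i])      # eq. (16)
            U[u] += alpha * grad_U                          # eq. (14)
    return U, V, W
```

Because the sums in (14)-(16) are weighted by r_uj, items without feedback contribute nothing, so the loops run only over N(u); this is what yields the O(dn̄²m) per-iteration cost analyzed above.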
5. Experiment

5.1. Datasets. We use two datasets for the experiments. The first is the MovieLens 1 million dataset (ML1m) [18], which contains ca. 1M ratings (1-5 scale) from ca. 6K users on 3.7K movies; the sparseness of the ML1m dataset is 95.53%. The second dataset is the Netflix dataset [2, 16, 17], which contains 100,000,000 ratings (1-5 scale) from 480,189 users on 17,770 movies. Due to the huge size of the Netflix data, we extract a subset of 10,000 users and 10,000 movies in which each user has rated more than 100 different movies.
5.2. Evaluation Metrics. As justified in [17], NDCG is a very suitable evaluation metric for personalized ranking algorithms which combine explicit and implicit feedback. Moreover, our proposed MERR_SVD++ algorithm exploits both explicit and implicit feedback simultaneously and optimizes the well-known personalized ranking evaluation metric Expected Reciprocal Rank (ERR). So we use NDCG and ERR as the evaluation metrics for the predictive ability of models in this paper.
NDCG is one of the most widely used measures for ranking problems. To define NDCG_u for a user u, one first needs to define DCG_u:
\[
\mathrm{DCG}_u = \sum_{i=1}^{m} \frac{2^{\mathrm{pref}(i)} - 1}{\log(i+1)},
\tag{17}
\]
where pref(i) is a binary indicator returning 1 if the i-th item is preferred and 0 otherwise. DCG_u is then normalized by the ideal ranked list into the interval [0, 1]:
\[
\mathrm{NDCG}_u = \frac{\mathrm{DCG}_u}{\mathrm{IDCG}_u},
\tag{18}
\]
where IDCG_u denotes the DCG of the ranked list sorted exactly according to the user's tastes, with positive items placed at the head of the list. The NDCG of all the users is the mean score over users.
ERR is a generalized version of Reciprocal Rank (RR), designed to be used with data with multiple relevance levels (e.g., ratings). It has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list. Using the definition of ERR in [18], we can define ERR for a ranked item list of user u as follows:
\[
\mathrm{ERR}_u = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \bigl(1 - r_{uj}\, I(R_{uj} < R_{ui})\bigr),
\tag{19}
\]
where R_ui denotes the rank position of item i in the list for user u. Similar to NDCG, the ERR of all the users is the mean score over users.
Since in recommender systems the user's satisfaction is dominated by only a few items at the top of the recommendation list, our evaluation in the following experiments focuses on the performance of the top-5 recommended items, that is, NDCG@5 and ERR@5.
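For concreteness, the two metrics can be sketched in code as follows. This is our own illustration, not the paper's evaluation script: we assume base-2 logarithms in (17) and, for ERR, the standard graded-relevance mapping r = (2^grade - 1) / 2^max_grade from [30]; both choices are assumptions here.

```python
import math

def ndcg_at_k(ranked_rels, ideal_rels, k=5):
    """NDCG@k per equations (17)-(18); ranked_rels[i] is the binary
    preference pref(i) of the item shown at rank i+1."""
    def dcg(rels):
        # log(i+1) with 1-based ranks, assumed base 2
        return sum((2 ** rel - 1) / math.log2(i + 2)
                   for i, rel in enumerate(rels[:k]))
    ideal = dcg(sorted(ideal_rels, reverse=True))
    return dcg(ranked_rels) / ideal if ideal > 0 else 0.0

def err_at_k(ranked_grades, k=5, max_grade=5):
    """ERR@k per equation (19); ranked_grades[i] is the rating grade of
    the item shown at rank i+1."""
    p, score = 1.0, 0.0
    for i, grade in enumerate(ranked_grades[:k]):
        r = (2 ** grade - 1) / (2 ** max_grade)  # mapping from [30]
        score += p * r / (i + 1)                 # r_ui / R_ui, with rank R_ui = i+1
        p *= 1.0 - r                             # probability user not yet satisfied
    return score
```

A perfect ranking yields NDCG@k = 1, and a top-ranked 5-star item alone already gives ERR@5 close to 1, which is the "top-heavy" behavior the paper exploits.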
5.3. Experiment Setup. For each dataset, we randomly selected 5 rated items (movies) and 1000 unrated items (movies) for each user to form a test set. We then randomly selected a varying number of rated items from the rest to form a training set. For example, just as in [14, 18], under the condition of "Given 5" we randomly selected 5 rated items (disjoint from the items in the test set) for each user in order to generate a training set. We investigated a variety of "Given" conditions for the training sets, that is, 5, 10, and 15 for the ML1m dataset and 10, 20, and 30 for the extracted Netflix dataset. The recommendation lists generated for each user are compared to the ground truth in the test set in order to measure the performance.
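The "Given n" protocol above can be sketched as follows; the function below is a minimal per-user illustration (the function name, the item-universe size, and the sampling details are our own assumptions, not the authors' code).

```python
import random

def given_n_split(user_items, n_given, n_test=5, n_unrated=1000,
                  n_items=10000, seed=0):
    """'Given n' split for one user: hold out n_test rated items plus
    n_unrated unrated items for testing, then sample n_given of the
    remaining rated items for training.

    user_items: dict item_id -> rating for one user (len >= n_test + n_given).
    Returns (train_items, test_items)."""
    rng = random.Random(seed)
    rated = list(user_items)
    test_rated = rng.sample(rated, n_test)
    unrated = [i for i in range(n_items) if i not in user_items]
    test_unrated = rng.sample(unrated, n_unrated)
    remaining = [i for i in rated if i not in test_rated]
    train = rng.sample(remaining, min(n_given, len(remaining)))
    return train, test_rated + test_unrated
```

By construction the training items are disjoint from the held-out rated items, matching the "disjoint from the items in the test set" requirement of the protocol.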
All the models were implemented in MATLAB R2009a. For MERR_SVD++, the value of the regularization parameter λ was selected from the range {10^-5, 10^-4, 10^-3, 10^-2, 10^-1, 1, 10}, and the optimal parameter value was used. The learning rate α was selected from the set A = {α | α = 0.0001 × 2^c, α ≤ 0.5, c > 0, c is a constant}, and the optimal parameter value was also used. In order to compare their performances fairly, for all matrix factorization models we set the number of features to 10. The optimal values of all parameters for the baseline models were determined individually; more detailed parameter settings for all the baselines can be found in the corresponding references. For all the algorithms used in our experiments, we repeated each experiment 5 times under each of the different conditions of each dataset, and the performances reported were averaged across the 5 runs.
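For concreteness, the two hyperparameter grids described above can be generated as follows (a small illustrative snippet; the variable names are ours).

```python
# Regularization grid: {1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1, 10}
lambdas = [10.0 ** e for e in range(-5, 2)]

# Learning-rate set A = {alpha = 0.0001 * 2^c : alpha <= 0.5, c > 0}
alphas = []
c = 1
while 0.0001 * 2 ** c <= 0.5:
    alphas.append(0.0001 * 2 ** c)
    c += 1
```

The learning-rate set thus contains 12 candidates, from 0.0002 up to 0.4096, after which the next doubling exceeds the 0.5 cap.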
5.4. Experiment Results. In this section we present a series of experiments to evaluate MERR_SVD++. We designed the experiments to address the following research questions:

(1) Does the proposed MERR_SVD++ outperform state-of-the-art personalized ranking approaches for top-N recommendation?

(2) Does the performance of MERR_SVD++ improve when we only increase the amount of implicit feedback data for each user?

(3) Is MERR_SVD++ scalable for large-scale use cases?
5.4.1. Performance Comparison. We compare the performance of MERR_SVD++ with that of five baseline algorithms, listed below:

(i) Co-Rating [17]: a state-of-the-art CF model that can be trained from explicit and implicit feedback simultaneously.

(ii) SVD++ [16]: the first proposed CF model that combines explicit and implicit feedback.

(iii) xCLiMF [18]: a state-of-the-art PR approach which aims at directly optimizing ERR for top-N recommendation in domains with explicit feedback data (e.g., ratings).

(iv) CofiRank [29]: a PR approach that optimizes the NDCG measure [28] for domains with explicit feedback data (e.g., ratings). The implementation is based on the publicly available software package from the authors.

(v) CLiMF [14]: a state-of-the-art PR approach that optimizes the Mean Reciprocal Rank (MRR) measure [31] for domains with implicit feedback data (e.g., click, follow). We use the explicit feedback datasets by binarizing the rating values with a threshold: on the ML1m dataset and the extracted Netflix dataset we take ratings 4 and 5 (the highest two relevance levels), respectively, as the relevance threshold for top-N recommendation.
The results of the experiments on the ML1m and the extracted Netflix datasets are shown in Figure 4. Rows denote the varieties of "Given" conditions for the training sets based on the ML1m and the extracted Netflix datasets, and columns denote the quality of NDCG and ERR. Figure 4 shows that MERR_SVD++ outperforms the baseline approaches in terms of both ERR and NDCG in all of the cases. The results show that the improvement of ERR aligns consistently with the improvement of NDCG, indicating that optimizing ERR does not degrade the utility of recommendations as captured by the NDCG measure. It can be seen that the relative performance of MERR_SVD++ improves as the number of observed ratings from the users increases. This result indicates that MERR_SVD++ can learn better top-N recommendation models if more observations of the graded relevance data from users can be used. The results also reveal that it is difficult to model user preferences encoded in multiple levels of relevance with limited observations, in particular when the number of observations is lower than the number of relevance levels.

Figure 4: The performance comparison of MERR_SVD++ and baselines. (Panels (a)-(b): NDCG@5 and ERR@5 on the ML1m dataset under Given 5/10/15; panels (c)-(d): NDCG@5 and ERR@5 on the extracted Netflix dataset under Given 10/20/30; curves: CLiMF, CofiRank, xCLiMF, SVD++, Co_Rating, MERR_SVD++.)
[4] Q Yao and J Kwok ldquoAccelerated inexact soft-impute for fastlarge-scale matrix completionrdquo in Proceedings of the 24rd Inter-national Joint Conference on Artificial Intelligence pp 4002ndash4008 ACM Press Buenos Aires Argentina July 2015
[5] Q Diao M Qiu C-Y Wu A J Smola J Jiang and C WangldquoJointly modeling aspects ratings and sentiments for movierecommendation (JMARS)rdquo in Proceedings of the 20th ACMSIGKDD International Conference on Knowledge Discovery andDataMining (KDD rsquo14) pp 193ndash202 ACMNewYorkNYUSAAugust 2014
[6] M Jahrer and A Toscher ldquoCollaborative filtering ensemble forrankingrdquo in Proceedings of the 17nd International Conference on
Knowledge Discovery and Data Mining pp 153ndash167 ACM SanDiego Calif USA August 2011
[7] G Takacs and D Tikk ldquoAlternating least squares for person-alized rankingrdquo in Proceedings of the 5th ACM Conference onRecommender Systems pp 83ndash90 ACM Dublin Ireland 2012
[8] G Li and L Li ldquoDual collaborative topicmodeling from implicitfeedbacksrdquo in Proceedings of the IEEE International Conferenceon Security Pattern Analysis and Cybernetics pp 395ndash404Wuhan China October 2014
[9] H Wang X Shi and D Yeung ldquoRelational stacked denoisingautoencoder for tag recommendationrdquo in Proceedings of the29th AAAI Conference on Artificial Intelligence pp 3052ndash3058ACM Press Austin Tex USA January 2015
[10] L Gai ldquoPairwise probabilistic matrix factorization for implicitfeedback collaborative filteringrdquo in Proceedings of the IEEEInternational Conference on Security Pattern Analysis andCybernetics (SPAC rsquo14) pp 181ndash190 Wuhan China October2014
[11] S Rendle C Freudenthaler Z Gantner and L Schmidt-Thieme ldquoBPR Bayesian personalized ranking from implicitfeedbackrdquo in Proceedings of the 25th Conference on Uncertaintyin Artificial Intelligence (UAI rsquo09) pp 452ndash461 Morgan Kauf-mann Montreal Canada 2009
[12] L Guo J Ma H R Jiang Z M Chen and C M XingldquoSocial trust aware item recommendation for implicit feedbackrdquoJournal of Computer Science and Technology vol 30 no 5 pp1039ndash1053 2015
[13] W K Pan H Zhong C F Xu and Z Ming ldquoAdaptive Bayesianpersonalized ranking for heterogeneous implicit feedbacksrdquoKnowledge-Based Systems vol 73 pp 173ndash180 2015
[14] Y Shi A Karatzoglou L Baltrunas M Larson N Oliver andA Hanjalic ldquoCLiMF collaborative less-is-more filteringrdquo inProceedings of the 23rd International Joint Conference on Artifi-cial Intelligence (IJCAI rsquo13) pp 3077ndash3081 ACMBeijing ChinaAugust 2013
[15] WK Pan andLChen ldquoGBPR grouppreference based bayesianpersonalized ranking for one-class collaborative filteringrdquo inProceedings of the 23rd International Joint Conference on Artifi-cial Intelligence (IJCAI rsquo13) pp 2691ndash2697 Beijing China 2013
[16] Y Koren ldquoFactor in the neighbors scalable and accurate col-laborative filteringrdquoACMTransactions on Knowledge Discoveryfrom Data vol 4 no 1 pp 1ndash24 2010
[17] N N Liu E W Xiang M Zhao and Q Yang ldquoUnifyingexplicit and implicit feedback for collaborative filteringrdquo inProceedings of the 19th International Conference on Informationand Knowledge Management (CIKM rsquo10) pp 1445ndash1448 ACMToronto Canada October 2010
[18] Y Shi A Karatzoglou L BaltrunasM Larson andA HanjalicldquoXCLiMF Optimizing expected reciprocal rank for data withmultiple levels of relevancerdquo in Proceedings of the 7th ACMConference on Recommender Systems (RecSys rsquo13) pp 431ndash434ACM Hongkong October 2013
[19] J Jiang J Lu G Zhang and G Long ldquoScaling-up item-based collaborative filtering recommendation algorithm basedon Hadooprdquo in Proceedings of the 7th IEEE World Congress onServices pp 490ndash497 IEEE Washington DC USA July 2011
[20] T Hofmann ldquoLatent semantic models for collaborative filter-ingrdquo ACM Transactions on Information Systems vol 22 no 1pp 89ndash115 2004
[21] R Salakhutdinov A Mnih and G Hinton ldquoRestricted Boltz-mannmachines for collaborative filteringrdquo in Proceedings of the
Mathematical Problems in Engineering 11
24rd International Conference onMachine Learning (ICML rsquo07)pp 791ndash798 ACM Press Corvallis Ore USA June 2007
[22] N Srebro J Rennie andT Jaakkola ldquoMaximum-marginmatrixfactorizationrdquo in Proceedings of the 18th Annual Conferenceon Neural Information Processing Systems pp 11ndash217 BritishColumbia Canada 2004
[23] J T Liu C H Wu and W Y Liu ldquoBayesian probabilisticmatrix factorization with social relations and item contents forrecommendationrdquo Decision Support Systems vol 55 no 3 pp838ndash850 2013
[24] D Feldman and T Tassa ldquoMore constraints smaller coresetsconstrained matrix approximation of sparse big datardquo in Pro-ceedings of the 21th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo15) pp 249ndash258ACM Press Sydney Australia August 2015
[25] S Mirisaee E Gaussier and A Termier ldquoImproved local searchfor binarymatrix factorizationrdquo in Proceedings of the 29th AAAIConference on Artificial Intelligence pp 1198ndash1204 ACM PressAustin Tex USA January 2015
[26] T Y Liu Learning to Rank for Information Retrieval SpringerNew York NY USA 2011
[27] NN LiuM Zhao andQ Yang ldquoProbabilistic latent preferenceanalysis for collaborative filteringrdquo in Proceedings of the 18thInternational Conference on Information and Knowledge Man-agement (CIKM rsquo09) pp 759ndash766 ACM Press Hong KongNovember 2009
[28] N N Liu and Q Yang ldquoEigenRank a ranking-orientedapproach to collaborative filteringrdquo in Proceedings of the 31stAnnual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval (ACM SIGIR rsquo08) pp 83ndash90 ACM Singapore July 2008
[29] M Weimer A Karatzoglou Q V Le and A J SmolaldquoCofiRankndashmaximum margin matrix factorization for collab-orative rankingrdquo in Proceedings of the 21th Conference onAdvances in Neural Information Processing Systems pp 79ndash86Curran Associates Vancouver Canada 2007
[30] O Chapelle D Metlzer Y Zhang and P Grinspan ldquoExpectedreciprocal rank for graded relevancerdquo in Proceedings of the 18thACM International Conference on Information and KnowledgeManagement (CIKM rsquo09) pp 621ndash630 ACM New York NYUSA November 2009
[31] E M Voorhees ldquoThe trec-8 question answering track reportrdquoin Proceedings of the 8th Text Retrieval Conference (TREC-8 rsquo99)Gaithersburg Md USA November 1999
6 Mathematical Problems in Engineering
in which λ denotes the regularization coefficient and ‖U‖ denotes the Frobenius norm of U. Note that the lower bound L(U, V, W) is much less complex than the original objective function in (9), and standard optimization methods, for example, gradient ascent, can be used to learn the optimal model parameters U, V, and W.
4.2. Optimization. We can now maximize the objective function (13) with respect to the latent factors: arg max_{U,V,W} L(U, V, W). Note that L(U, V, W) represents an approximation of the mean value of ERR across all the users. We can thus remove the constant coefficient 1/m, since it has no influence on the optimization of L(U, V, W). Since the objective function is smooth, we can use gradient ascent for the optimization. The gradients can be derived in a similar manner to xCLiMF [18], as shown in the following:
$$\frac{\partial L}{\partial U_u} = \sum_{i=1}^{n} r_{ui}\left[ g(-X_{ui})\,V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'(X_{uj}-X_{ui})\,(V_i-V_j)}{1 - r_{uj}\, g(X_{uj}-X_{ui})} \right] - \lambda U_u, \qquad (14)$$

$$\frac{\partial L}{\partial V_i} = r_{ui}\left[ g(-X_{ui}) + \sum_{j=1}^{n} r_{uj}\, g'(X_{uj}-X_{ui})\left( \frac{1}{1 - r_{uj}\, g(X_{uj}-X_{ui})} - \frac{1}{1 - r_{ui}\, g(X_{ui}-X_{uj})} \right) \right]\left( U_u + |N(u)|^{-1/2} \sum_{k\in N(u)} W_k \right) - \lambda V_i, \qquad (15)$$

$$\frac{\partial L}{\partial W_i} = r_{ui}\left[ g(-X_{ui})\,V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'(X_{uj}-X_{ui})\,(V_i-V_j)}{1 - r_{uj}\, g(X_{uj}-X_{ui})} \right] - \lambda W_i. \qquad (16)$$
The learning algorithm for the MERR_SVD++ model is outlined in Algorithm 1.

The published research in [8–11, 13, 18] shows that initializing the feature matrices from the normal distribution N(0, 0.01) is very effective, so we also use this approach for the initialization of U, V, and W in our proposed algorithm.
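For concreteness, this initialization can be sketched as follows (the matrix sizes are toy values, and 0.01 is read here as the standard deviation of N(0, 0.01), which is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 10, 100, 200  # number of features, users, and items (toy sizes)

# Draw every entry of the feature matrices from N(0, 0.01).
U = rng.normal(0.0, 0.01, size=(m, d))
V = rng.normal(0.0, 0.01, size=(n, d))
W = rng.normal(0.0, 0.01, size=(n, d))
```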
4.3. Computational Complexity. Here we first analyze the complexity of the learning process for one iteration. By exploiting the data sparseness in X, the computational complexity of the gradient in (14) is O(dn̄²m + dn̄m). Note that
Input: training set X; learning rate α; regularization λ; the number of features d; and the maximal number of iterations itermax.
Output: feature matrices U, V, W.
for u = 1, 2, ..., m do
    Index the relevant items for user u: N(u) = {j | X_uj > 0, 0 ≤ j ≤ n}
end for
Initialize U^(0), V^(0), and W^(0) with random values, and set t = 0
repeat
    for u = 1, 2, ..., m do
        Update U_u: U_u^(t+1) = U_u^(t) + α ∂L/∂U_u^(t)
        for i ∈ N(u) do
            Update V_i, W_i:
            V_i^(t+1) = V_i^(t) + α ∂L/∂V_i^(t)
            W_i^(t+1) = W_i^(t) + α ∂L/∂W_i^(t)
        end for
    end for
    t = t + 1
until t ≥ itermax
U = U^(t), V = V^(t), W = W^(t)

Algorithm 1: MERR_SVD++.
n̄ here denotes the average number of relevant items across all the users. The computational complexity of the gradient in (16) is also O(dn̄²m + dn̄m), and the computational complexity of the gradient in (15) is likewise O(dn̄²m + dn̄m). Hence, the complexity of the learning algorithm in one iteration is on the order of O(dn̄²m). In the case that n̄ is a small number, that is, n̄² ≪ m, the complexity is linear in the number of users in the data collection. Note that we have s = n̄m, in which s denotes the number of nonzeros in the user-item matrix; the complexity of the learning algorithm is then O(dn̄s). Since we usually have n̄ ≪ s, the complexity remains O(dn̄s) even in the case that n̄ is large, that is, linear in the number of nonzeros (i.e., relevant observations in the data). In sum, our analysis shows that MERR_SVD++ is suitable for large-scale use cases. Note that we also empirically verify the complexity of the learning algorithm in Section 5.4.3.
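To make the procedure concrete, here is a minimal NumPy sketch of one sweep of Algorithm 1 under stated assumptions: g is taken to be the logistic function and r_ui = (2^{Y_ui} − 1)/2^{Y_max}, both following xCLiMF [18]; the function name merr_svdpp_epoch and the dense-matrix layout are ours, not from the paper.

```python
import numpy as np

def g(x):
    # Logistic function (assumed choice of g, following xCLiMF [18]).
    return 1.0 / (1.0 + np.exp(-x))

def dg(x):
    # Derivative g'(x) of the logistic function.
    return g(x) * (1.0 - g(x))

def rel_prob(y, y_max=5.0):
    # Map graded ratings to relevance probabilities (xCLiMF-style mapping).
    return (2.0 ** y - 1.0) / (2.0 ** y_max)

def merr_svdpp_epoch(Y, U, V, W, alpha=0.05, lam=0.001):
    # One sweep of Algorithm 1: gradient-ascent updates following (14)-(16).
    m, n = Y.shape
    for u in range(m):
        N = np.flatnonzero(Y[u] > 0)            # N(u): items with feedback
        if N.size == 0:
            continue
        r = rel_prob(Y[u, N])                   # r_ui for i in N(u)
        p = U[u] + W[N].sum(axis=0) / np.sqrt(N.size)
        X = V[N] @ p                            # scores X_ui
        D = X[:, None] - X[None, :]             # D[j, a] = X_uj - X_ua
        denom = 1.0 - r[:, None] * g(D)         # 1 - r_uj g(X_uj - X_ua)
        # (14): gradient with respect to U_u
        gU = np.zeros(U.shape[1])
        for a in range(N.size):
            diff = V[N[a]] - V[N]               # V_i - V_j over j in N(u)
            frac = r[:, None] * dg(D[:, a])[:, None] * diff / denom[:, a][:, None]
            gU += r[a] * (g(-X[a]) * V[N[a]] + frac.sum(axis=0))
        U[u] += alpha * (gU - lam * U[u])
        # (15) and (16): gradients with respect to V_i and W_i for i in N(u)
        for a in range(N.size):
            i = N[a]
            s = (r * dg(D[:, a]) * (1.0 / denom[:, a]
                                    - 1.0 / (1.0 - r[a] * g(-D[:, a])))).sum()
            V[i] += alpha * (r[a] * (g(-X[a]) + s) * p - lam * V[i])
            diff = V[i] - V[N]
            frac = r[:, None] * dg(D[:, a])[:, None] * diff / denom[:, a][:, None]
            W[i] += alpha * (r[a] * (g(-X[a]) * V[i] + frac.sum(axis=0)) - lam * W[i])
    return U, V, W
```

In practice one would repeat this sweep until itermax, as in Algorithm 1; the inner sums over j are the O(n̄²) terms counted in the complexity analysis above.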
5. Experiment
5.1. Datasets. We use two datasets for the experiments. The first is the MovieLens 1 million dataset (ML1m) [18], which contains ca. 1M ratings (1–5 scale) from ca. 6K users on 3.7K movies; the sparseness of the ML1m dataset is 95.53%. The second dataset is the Netflix dataset [2, 16, 17], which contains about 100,000,000 ratings (1–5 scale) from 480,189 users on 17,770 movies. Due to the huge size of the Netflix data, we extract a subset of 10,000 users and 10,000 movies in which each user has rated more than 100 different movies.
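The sparseness figure quoted above can be checked directly from the dataset sizes (the exact ML1m counts, 6,040 users, 3,706 movies, and 1,000,209 ratings, come from the public dataset description):

```python
def sparseness(num_users, num_items, num_ratings):
    # Percentage of empty cells in the user-item rating matrix.
    return 100.0 * (1.0 - num_ratings / (num_users * num_items))

# ML1m: 6,040 users x 3,706 movies with 1,000,209 ratings
print(round(sparseness(6040, 3706, 1000209), 2))  # -> 95.53
```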
5.2. Evaluation Metrics. Just as has been justified by [17], NDCG is a very suitable evaluation metric for personalized ranking algorithms which combine explicit and implicit feedback. Our proposed MERR_SVD++ algorithm exploits both explicit and implicit feedback simultaneously and optimizes the well-known personalized ranking evaluation metric Expected Reciprocal Rank (ERR). So we use NDCG and ERR as evaluation metrics for the predictive ability of the models in this paper.
NDCG is among the most widely used measures for ranking problems. To define NDCG_u for a user u, one first needs to define DCG_u:

$$\mathrm{DCG}_u = \sum_{i=1}^{m} \frac{2^{\mathrm{pref}(i)} - 1}{\log(i + 1)}, \qquad (17)$$

where pref(i) is a binary indicator returning 1 if the ith item is preferred and 0 otherwise. DCG_u is then normalized by the ideal ranked list into the interval [0, 1]:

$$\mathrm{NDCG}_u = \frac{\mathrm{DCG}_u}{\mathrm{IDCG}_u}, \qquad (18)$$

where IDCG_u denotes the DCG of the ranked list sorted exactly according to the user's tastes, with the positive items placed at the head of the ranked list. The NDCG of all the users is the mean score over the users.
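A small sketch of (17)-(18); the log base is not specified in (17), so the natural logarithm is assumed here, and pref is the binary preference of the item at each rank:

```python
import numpy as np

def dcg(pref):
    # DCG per (17): pref[i-1] is the binary preference of the item at rank i.
    ranks = np.arange(1, len(pref) + 1)
    return float(np.sum((2.0 ** np.asarray(pref) - 1.0) / np.log(ranks + 1)))

def ndcg(pref):
    # NDCG per (18): normalize by the DCG of the ideally sorted list.
    idcg = dcg(np.sort(np.asarray(pref))[::-1])
    return dcg(pref) / idcg if idcg > 0 else 0.0

print(ndcg([1, 1, 0]))  # already ideally ordered -> 1.0
```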
ERR is a generalized version of the Reciprocal Rank (RR), designed to be used with data with multiple relevance levels (e.g., ratings). It has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list. Using the definition of ERR in [18], we can define ERR for a ranked item list of user u as follows:

$$\mathrm{ERR}_u = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \left(1 - r_{uj}\, I(R_{uj} < R_{ui})\right), \qquad (19)$$

where R_ui denotes the rank of item i in the list for user u and I(·) is the indicator function. Similar to NDCG, the ERR of all the users is the mean score over the users.
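Equation (19) can be sketched directly (here r holds the relevance probabilities r_ui and ranks holds the 1-based positions R_ui; the function name and the input layout are our own choices):

```python
import numpy as np

def err(r, ranks):
    # ERR per (19): sum over items of (r_ui / R_ui) times the probability
    # that no higher-ranked item has already satisfied the user.
    r = np.asarray(r, dtype=float)
    ranks = np.asarray(ranks)
    total = 0.0
    for i in range(len(r)):
        not_stopped = np.prod(1.0 - r[ranks < ranks[i]])  # items ranked above i
        total += (r[i] / ranks[i]) * not_stopped
    return total

print(err([1.0], [1]))  # a single fully relevant item at rank 1 -> 1.0
```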
Since in recommender systems the user's satisfaction is dominated by only a few items at the top of the recommendation list, our evaluation in the following experiments focuses on the performance of the top-5 recommended items, that is, NDCG@5 and ERR@5.
5.3. Experiment Setup. For each dataset, we randomly selected 5 rated items (movies) and 1000 unrated items (movies) for each user to form a test set. We then randomly selected a varying number of rated items from the rest to form a training set. For example, just as in [14, 18], under the condition of "Given 5" we randomly selected 5 rated items (disjoint from the items in the test set) for each user in order to generate a training set. We investigated a variety of "Given" conditions for the training sets, that is, 5, 10, and 15 for the ML1m dataset and 10, 20, and 30 for the extracted Netflix dataset. The recommendation lists generated for each user are compared to the ground truth in the test set in order to measure the performance.
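The "Given n" protocol described above can be sketched per user as follows (the function name and arguments are illustrative, not from the paper):

```python
import numpy as np

def given_n_split(rated_items, n_given, n_test=5, seed=0):
    # Hold out n_test rated items for testing, then sample n_given
    # disjoint rated items for training, as in the "Given n" protocol.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(np.asarray(rated_items))
    test, rest = perm[:n_test], perm[n_test:]
    return rest[:n_given], test

train, test = given_n_split(range(40), n_given=10)
```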
All the models were implemented in MATLAB R2009a. For MERR_SVD++, the value of the regularization parameter λ was selected from the range {10⁻⁵, 10⁻⁴, 10⁻³, 10⁻², 10⁻¹, 1, 10}, and the optimal parameter value was used. The learning rate α was selected from the set A = {α | α = 0.0001 × 2^c, α ≤ 0.5, c > 0, and c is a constant}, and the optimal parameter value was also used. In order to compare their performances fairly, for all matrix factorization models we set the number of features to 10. The optimal values of all parameters for all the baseline models used were determined individually; more detailed settings of the parameters for all the baselines can be found in the corresponding references. For all the algorithms used in our experiments, we repeated the experiment 5 times for each of the different conditions of each dataset, and the performances reported were averaged across the 5 runs.
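The two search grids above can be generated explicitly (a sketch; the upper bound on c follows from the constraint α ≤ 0.5):

```python
# Lambda grid: {10^-5, 10^-4, 10^-3, 10^-2, 10^-1, 1, 10}
lambdas = [10.0 ** e for e in range(-5, 2)]

# Alpha grid: A = {alpha | alpha = 0.0001 * 2^c, alpha <= 0.5, c > 0}
alphas, c = [], 1
while 0.0001 * 2 ** c <= 0.5:
    alphas.append(0.0001 * 2 ** c)
    c += 1

print(len(lambdas), len(alphas))  # -> 7 12
```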
5.4. Experiment Results. In this section we present a series of experiments to evaluate MERR_SVD++. We designed the experiments in order to address the following research questions:

(1) Does the proposed MERR_SVD++ outperform state-of-the-art personalized ranking approaches for top-N recommendation?

(2) Does the performance of MERR_SVD++ improve when we only increase the number of implicit feedback data for each user?

(3) Is MERR_SVD++ scalable for large-scale use cases?
5.4.1. Performance Comparison. We compare the performance of MERR_SVD++ with that of five baseline algorithms. The approaches we compare with are listed below:

(i) Co-Rating [17]: a state-of-the-art CF model that can be trained from explicit and implicit feedback simultaneously.

(ii) SVD++ [16]: the first proposed CF model that combines explicit and implicit feedback.

(iii) xCLiMF [18]: a state-of-the-art PR approach which aims at directly optimizing ERR for top-N recommendation in domains with explicit feedback data (e.g., ratings).

(iv) CofiRank [29]: a PR approach that optimizes the NDCG measure [28] for domains with explicit feedback data (e.g., ratings). The implementation is based on the publicly available software package from the authors.

(v) CLiMF [14]: a state-of-the-art PR approach that optimizes the Mean Reciprocal Rank (MRR) measure [31] for domains with implicit feedback data (e.g., click, follow). We use the explicit feedback datasets by binarizing the rating values with a threshold: on the ML1m dataset and the extracted Netflix dataset we take ratings 4 and 5 (the highest two relevance levels), respectively, as the relevance threshold for top-N recommendation.
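The binarization used for CLiMF can be sketched as follows (threshold 4 corresponds to the ML1m setting above, threshold 5 to the extracted Netflix setting):

```python
import numpy as np

def binarize(Y, threshold=4):
    # Ratings at or above the threshold become relevant (1), the rest 0.
    return (np.asarray(Y) >= threshold).astype(int)

print(binarize([[5, 3, 4], [1, 0, 2]]).tolist())  # -> [[1, 0, 1], [0, 0, 0]]
```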
The results of the experiments on the ML1m and the extracted Netflix datasets are shown in Figure 4, in which rows denote the varieties of "Given" conditions for the training sets based on the ML1m and the extracted Netflix datasets and columns denote the quality in terms of NDCG@5 and ERR@5.

Figure 4: The performance comparison of MERR_SVD++ and the baselines (CLiMF, CofiRank, xCLiMF, SVD++, Co-Rating, and MERR_SVD++): (a) NDCG@5 on the ML1m dataset; (b) ERR@5 on the ML1m dataset; (c) NDCG@5 on the extracted Netflix dataset; (d) ERR@5 on the extracted Netflix dataset.

Figure 4 shows that MERR_SVD++ outperforms the baseline approaches in terms of both ERR and NDCG in all of the cases. The results show that the improvement of ERR aligns consistently with the improvement of NDCG, indicating that optimizing ERR would not degrade the utility of recommendations as captured by the NDCG measure. It can be seen that the relative performance of MERR_SVD++ improves as the number of observed ratings from the users increases. This result indicates that MERR_SVD++ can learn better top-N recommendation models if more observations of the graded relevance data from users can be used. The results also reveal that it is difficult to model user preferences encoded in multiple levels of relevance with limited observations, in particular when the number of observations is lower than the number of relevance levels.

Compared to Co-Rating, which is based on rating prediction, MERR_SVD++ is based on ranking prediction and succeeds in enhancing the top-ranked performance by optimizing ERR. As reported in [17], the performance of SVD++ is slightly weaker than that of Co-Rating, because the SVD++ model only attempts to approximate the observed ratings and does not model the preferences expressed in implicit feedback. The results also show that Co-Rating and SVD++ significantly outperform xCLiMF and CofiRank, which confirms our belief that implicit feedback could indeed complement explicit feedback. It can be seen in Figure 4 that xCLiMF significantly outperforms CLiMF in terms of
Figure 5: The influence of implicit feedback on the performance of MERR_SVD++ (x-axis: the increased number of implicit feedback entries per user, from 0 to 50; y-axis: the values of the evaluation metrics ERR@5 and NDCG@5, ranging from 0.05 to 0.10).
both ERR and NDCG across all the settings of relevance thresholds and datasets. The results indicate that the information loss from binarizing multilevel relevance data would inevitably make recommendation models based on binary relevance data, such as CLiMF, suboptimal for use cases with explicit feedback data.

Hence, we give a positive answer to our first research question.
5.4.2. The Influence of Implicit Feedback on the Performance of MERR_SVD++. The influence of implicit feedback on the performance of MERR_SVD++ can be seen in Figure 5. Here N(u) = R(u) ∪ M(u), where M(u) denotes the set of items for which user u gave only implicit feedback. Rows denote the increased numbers of implicit feedback entries for each user, and columns denote the quality in terms of ERR and NDCG. In our experiment the increased numbers of implicit feedback entries for each user are the same, and we use the extracted Netflix dataset under the condition of "Given 10." Figure 5 shows that the ERR and NDCG scores of MERR_SVD++ improve synchronously and linearly with the increase of implicit feedback for each user, which confirms our belief that implicit feedback could indeed complement explicit feedback.

With this experimental result, we give a positive answer to our second research question.
5.4.3. Scalability. The last experiment investigated the scalability of MERR_SVD++ by measuring the training time required for the training set at different scales. Firstly, as analyzed in Section 4.3, the computational complexity of MERR_SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate the scalability, we used different numbers of users in the training set: under each condition, we randomly selected from 10% to 100% of the users in the training set, together with their rated items, as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6. We can observe that the computational time under
Figure 6: Scalability analysis of MERR_SVD++ in terms of the number of users in the training set (x-axis: percentage of users used for training, from 10% to 100%; y-axis: runtime per iteration in seconds, from 0 to 40; curves: "Given 5," "Given 10," and "Given 15").
each condition increases almost linearly with the increase of the number of users. Secondly, as also discussed in Section 4.3, the computational complexity of MERR_SVD++ can be further approximated as linear in the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm against different scales of the training sets under different "Given" conditions. The result is also shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.
The observations from this experiment allow us to answerour last research question positively
6. Conclusion and Future Work
The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data, rather than making full use of the information in the dataset. Until now, nobody has studied personalized ranking algorithms that exploit both explicit and implicit feedback. In order to overcome the defects of prior research, in this paper we have presented a new personalized ranking algorithm (MERR_SVD++) that exploits both explicit and implicit feedback simultaneously. MERR_SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the recent xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ grows linearly with the number of ratings. Because of its high precision and good expansibility, MERR_SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by alleviating the latency problem of personalized recommendation, and has wide application prospects in the
field of internet information recommendation. Moreover, because MERR_SVD++ exploits both explicit and implicit feedback simultaneously, it can alleviate the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.
For future work, we plan to extend our algorithm to richer models so that it can address the grey sheep and cold start problems of personalized recommendation. We would also like to explore more useful information from explicit feedback and implicit feedback simultaneously.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan city (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).
References
[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734–749, 2005.
[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401–410, ACM Press, Sydney, Australia, August 2015.
[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143–152, IEEE, Seattle, Wash, USA, February 2012.
[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002–4008, ACM Press, Buenos Aires, Argentina, July 2015.
[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193–202, ACM, New York, NY, USA, August 2014.
[6] M. Jahrer and A. Toscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153–167, ACM, San Diego, Calif, USA, August 2011.
[7] G. Takacs and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83–90, ACM, Dublin, Ireland, 2012.
[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics, pp. 395–404, Wuhan, China, October 2014.
[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052–3058, ACM Press, Austin, Tex, USA, January 2015.
[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics (SPAC '14), pp. 181–190, Wuhan, China, October 2014.
[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452–461, Morgan Kaufmann, Montreal, Canada, 2009.
[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039–1053, 2015.
[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173–180, 2015.
[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077–3081, ACM, Beijing, China, August 2013.
[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691–2697, Beijing, China, 2013.
[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1–24, 2010.
[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445–1448, ACM, Toronto, Canada, October 2010.
[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431–434, ACM, Hong Kong, October 2013.
[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490–497, IEEE, Washington, DC, USA, July 2011.
[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89–115, 2004.
[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the
Mathematical Problems in Engineering 11
24rd International Conference onMachine Learning (ICML rsquo07)pp 791ndash798 ACM Press Corvallis Ore USA June 2007
[22] N Srebro J Rennie andT Jaakkola ldquoMaximum-marginmatrixfactorizationrdquo in Proceedings of the 18th Annual Conferenceon Neural Information Processing Systems pp 11ndash217 BritishColumbia Canada 2004
[23] J T Liu C H Wu and W Y Liu ldquoBayesian probabilisticmatrix factorization with social relations and item contents forrecommendationrdquo Decision Support Systems vol 55 no 3 pp838ndash850 2013
[24] D Feldman and T Tassa ldquoMore constraints smaller coresetsconstrained matrix approximation of sparse big datardquo in Pro-ceedings of the 21th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo15) pp 249ndash258ACM Press Sydney Australia August 2015
[25] S Mirisaee E Gaussier and A Termier ldquoImproved local searchfor binarymatrix factorizationrdquo in Proceedings of the 29th AAAIConference on Artificial Intelligence pp 1198ndash1204 ACM PressAustin Tex USA January 2015
[26] T Y Liu Learning to Rank for Information Retrieval SpringerNew York NY USA 2011
[27] NN LiuM Zhao andQ Yang ldquoProbabilistic latent preferenceanalysis for collaborative filteringrdquo in Proceedings of the 18thInternational Conference on Information and Knowledge Man-agement (CIKM rsquo09) pp 759ndash766 ACM Press Hong KongNovember 2009
[28] N N Liu and Q Yang ldquoEigenRank a ranking-orientedapproach to collaborative filteringrdquo in Proceedings of the 31stAnnual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval (ACM SIGIR rsquo08) pp 83ndash90 ACM Singapore July 2008
[29] M Weimer A Karatzoglou Q V Le and A J SmolaldquoCofiRankndashmaximum margin matrix factorization for collab-orative rankingrdquo in Proceedings of the 21th Conference onAdvances in Neural Information Processing Systems pp 79ndash86Curran Associates Vancouver Canada 2007
[30] O Chapelle D Metlzer Y Zhang and P Grinspan ldquoExpectedreciprocal rank for graded relevancerdquo in Proceedings of the 18thACM International Conference on Information and KnowledgeManagement (CIKM rsquo09) pp 621ndash630 ACM New York NYUSA November 2009
[31] E M Voorhees ldquoThe trec-8 question answering track reportrdquoin Proceedings of the 8th Text Retrieval Conference (TREC-8 rsquo99)Gaithersburg Md USA November 1999
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
Mathematical Problems in Engineering 7
5.2 Evaluation Metrics. As justified in [17], NDCG is a very suitable evaluation metric for personalized ranking algorithms that combine explicit and implicit feedback. Our proposed MERR_SVD++ algorithm exploits both explicit and implicit feedback simultaneously and optimizes the well-known personalized ranking evaluation metric Expected Reciprocal Rank (ERR). We therefore use NDCG and ERR as the evaluation metrics for model performance in this paper.
NDCG is another widely used measure for ranking problems. To define NDCG_u for a user u, one first defines DCG_u:

$$\mathrm{DCG}_u = \sum_{i=1}^{m} \frac{2^{\mathrm{pref}(i)} - 1}{\log(i+1)}, \tag{17}$$

where pref(i) is a binary indicator returning 1 if the i-th item is preferred and 0 otherwise. DCG_u is then normalized by that of the ideal ranked list into the interval [0, 1]:

$$\mathrm{NDCG}_u = \frac{\mathrm{DCG}_u}{\mathrm{IDCG}_u}, \tag{18}$$

where IDCG_u denotes the DCG of the list ranked exactly according to the user's tastes, that is, with the positive items placed at the head of the ranked list. The NDCG over all users is the mean of the per-user scores.
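To make the metric concrete, the DCG/NDCG computation of (17) and (18) can be sketched in Python (the paper's experiments were implemented in MATLAB; this standalone `dcg`/`ndcg` pair is a hypothetical illustration, with `prefs` holding the binary pref(i) values in ranked order):

```python
import math

def dcg(prefs):
    """DCG of a ranked list (Eq. 17): prefs[i] is 1 if the item at rank i+1 is preferred, else 0."""
    # enumerate is 0-based, so rank r = i + 1 and the denominator log(r + 1) becomes log(i + 2)
    return sum((2 ** p - 1) / math.log(i + 2) for i, p in enumerate(prefs))

def ndcg(prefs):
    """NDCG (Eq. 18): DCG normalized by the ideal ordering, which places all preferred items first."""
    idcg = dcg(sorted(prefs, reverse=True))
    return dcg(prefs) / idcg if idcg > 0 else 0.0
```

A list with all preferred items at the head scores exactly 1.0; pushing preferred items down the list lowers the score toward 0.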
ERR is a generalized version of the Reciprocal Rank (RR), designed for data with multiple relevance levels (e.g., ratings). It has properties similar to RR in that it strongly emphasizes the relevance of results returned at the top of the list. Using the definition of ERR in [18], we can define the ERR of a ranked item list for user u as

$$\mathrm{ERR}_u = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \left(1 - r_{uj}\, I\left(R_{uj} < R_{ui}\right)\right), \tag{19}$$

where R_{ui} denotes the rank of item i in the list of user u, r_{ui} denotes the probability that user u finds item i relevant, and I(·) is the indicator function. Similar to NDCG, the ERR over all users is the mean of the per-user scores.
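The cascade structure of (19) admits a simple linear-time computation: the product over items ranked above position i can be accumulated while scanning the list from rank 1 downward. A sketch in Python (a hypothetical illustration; the mapping from a grade g to a relevance probability r = (2^g - 1) / 2^max_level follows the usual ERR/xCLiMF convention):

```python
def err(grades_in_rank_order, max_level):
    """ERR (Eq. 19) for a list of graded ratings given in ranked order (rank 1 first).

    Each grade g is mapped to a stopping probability r = (2**g - 1) / 2**max_level;
    items ranked earlier discount the contribution of items ranked later.
    """
    probs = [(2 ** g - 1) / 2 ** max_level for g in grades_in_rank_order]
    total, p_not_stopped = 0.0, 1.0
    for rank, r in enumerate(probs, start=1):
        total += p_not_stopped * r / rank   # r_ui / R_ui times the cascade product so far
        p_not_stopped *= (1 - r)            # accumulate prod over items ranked above
    return total
```

Placing a highly rated item at rank 1 yields a higher ERR than placing the same item at rank 2, reflecting the metric's top-of-list emphasis.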
Since in recommender systems the user's satisfaction is dominated by only a few items at the top of the recommendation list, our evaluation in the following experiments focuses on the performance of the top-5 recommended items, that is, NDCG@5 and ERR@5.
5.3 Experiment Setup. For each dataset, we randomly selected 5 rated items (movies) and 1000 unrated items (movies) for each user to form a test set. We then randomly selected a varying number of rated items from the rest to form a training set. For example, as in [14, 18], under the "Given 5" condition we randomly selected 5 rated items (disjoint from the items in the test set) for each user in order to generate a training set. We investigated a variety of "Given" conditions for the training sets: 5, 10, and 15 for the ML1m dataset, and 10, 20, and 30 for the extracted Netflix dataset. The recommendation list generated for each user is compared to the ground truth in the test set in order to measure performance.
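The per-user "Given n" split described above can be sketched as follows (a hypothetical re-implementation for clarity; function and parameter names are our own, and the original experiments were run in MATLAB):

```python
import random

def given_n_split(user_rated, user_unrated, n_given,
                  n_test_rated=5, n_test_unrated=1000, seed=0):
    """Per-user 'Given n' split: 5 rated + 1000 unrated items form the test set,
    and n_given further rated items (disjoint from the test set) form the training set."""
    rng = random.Random(seed)
    rated = list(user_rated)
    rng.shuffle(rated)
    test_rated = rated[:n_test_rated]                       # held-out rated items
    train = rated[n_test_rated:n_test_rated + n_given]      # 'Given n' training items
    test_unrated = rng.sample(list(user_unrated),
                              min(n_test_unrated, len(user_unrated)))
    return train, test_rated + test_unrated
```

By construction the training items are disjoint from the test items, matching the protocol of [14, 18].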
All the models were implemented in MATLAB R2009a. For MERR_SVD++, the value of the regularization parameter λ was selected from the range {10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 1, 10}, and the optimal value was used. The learning rate α was selected from the set A = {α | α = 0.0001 × 2^c, α ≤ 0.5, c > 0, c an integer}, and the optimal value was likewise used. For a fair comparison, the number of latent features was set to 10 for all matrix factorization models. The optimal parameter values for all baseline models were determined individually; more detailed parameter settings for the baselines can be found in the corresponding references. For every algorithm used in our experiments, each experiment was repeated 5 times under each condition of each dataset, and the reported performance is the average over the 5 runs.
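The two hyperparameter grids described above can be enumerated directly; a small sketch (variable names are our own):

```python
# Regularization grid: {1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1, 10}
lambdas = [10.0 ** e for e in range(-5, 2)]

# Learning-rate grid A = {alpha = 0.0001 * 2**c : alpha <= 0.5, c a positive integer},
# i.e. 0.0002, 0.0004, ..., 0.4096 (12 values)
alphas = [0.0001 * 2 ** c for c in range(1, 20) if 0.0001 * 2 ** c <= 0.5]
```

A full sweep thus evaluates 7 × 12 = 84 (λ, α) pairs per condition, after which the best-performing pair is retained.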
5.4 Experiment Results. In this section we present a series of experiments to evaluate MERR_SVD++. We designed the experiments to address the following research questions:
(1) Does the proposed MERR_SVD++ outperform state-of-the-art personalized ranking approaches for top-N recommendation?

(2) Does the performance of MERR_SVD++ improve when we only increase the amount of implicit feedback data for each user?

(3) Is MERR_SVD++ scalable for large-scale use cases?
5.4.1 Performance Comparison. We compare the performance of MERR_SVD++ with that of five baseline algorithms. The approaches we compare with are listed below:
(i) Co-Rating [17]: a state-of-the-art CF model that can be trained from explicit and implicit feedback simultaneously.

(ii) SVD++ [16]: the first proposed CF model that combines explicit and implicit feedback.

(iii) xCLiMF [18]: a state-of-the-art PR approach that aims at directly optimizing ERR for top-N recommendation in domains with explicit feedback data (e.g., ratings).

(iv) CofiRank [29]: a PR approach that optimizes the NDCG measure [28] for domains with explicit feedback data (e.g., ratings). The implementation is based on the publicly available software package from the authors.

(v) CLiMF [14]: a state-of-the-art PR approach that optimizes the Mean Reciprocal Rank (MRR) measure [31] for domains with implicit feedback data (e.g., clicks, follows). We use the explicit feedback datasets by binarizing the rating values with a threshold: on the ML1m dataset and the extracted Netflix dataset we take ratings of 4 and 5 (the two highest relevance levels), respectively, as the relevance threshold for top-N recommendation.
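The binarization applied for CLiMF amounts to a simple thresholding step; a hypothetical helper (the threshold is 4 for ML1m and 5 for the extracted Netflix dataset):

```python
def binarize(ratings, threshold):
    """Map graded ratings to binary relevance: 1 if rating >= threshold, else 0."""
    return [1 if r >= threshold else 0 for r in ratings]
```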
The results of the experiments on the ML1m and the extracted Netflix datasets are shown in Figure 4. Rows correspond to the different "Given" conditions for the training sets on the ML1m and the extracted Netflix datasets, and columns correspond to the NDCG and ERR quality.

Figure 4: The performance comparison of MERR_SVD++ and the baselines. Panels (a) and (b) show NDCG@5 and ERR@5 on the ML1m dataset under the "Given 5", "Given 10", and "Given 15" conditions; panels (c) and (d) show NDCG@5 and ERR@5 on the extracted Netflix dataset under the "Given 10", "Given 20", and "Given 30" conditions. The compared methods are CLiMF, CofiRank, xCLiMF, SVD++, Co_Rating, and MERR_SVD++.

Figure 4 shows that MERR_SVD++ outperforms the baseline approaches in terms of both ERR and NDCG in all cases. The improvement in ERR aligns consistently with the improvement in NDCG, indicating that optimizing ERR does not degrade the utility of recommendations as captured by the NDCG measure. It can also be seen that the relative performance of MERR_SVD++ improves as the number of observed ratings from the users increases. This result indicates that MERR_SVD++ can learn better top-N recommendation models when more graded relevance observations from users are available. The results also reveal that it is difficult to model user preferences encoded in multiple levels of relevance with limited observations, in particular when the number of observations is lower than the number of relevance levels.
Compared to Co-Rating, which is based on rating prediction, MERR_SVD++ is based on ranking prediction and succeeds in enhancing top-ranked performance by optimizing ERR. As reported in [17], the performance of SVD++ is slightly weaker than that of Co-Rating, because the SVD++ model only attempts to approximate the observed ratings and does not model the preferences expressed in implicit feedback. The results also show that Co-Rating and SVD++ significantly outperform xCLiMF and CofiRank, which confirms our belief that implicit feedback can indeed complement explicit feedback. It can also be seen in Figure 4 that xCLiMF significantly outperforms CLiMF in terms of both ERR and NDCG across all the settings of relevance thresholds and datasets.

Figure 5: The influence of implicit feedback on the performance of MERR_SVD++ (x-axis: the number of implicit feedback entries added per user, from 0 to 50; y-axis: the values of the evaluation metrics ERR@5 and NDCG@5).

The results indicate that the information loss from binarizing multilevel relevance data inevitably makes recommendation models based on binary relevance data, such as CLiMF, suboptimal for use cases with explicit feedback data.
Hence, we give a positive answer to our first research question.
5.4.2 The Influence of Implicit Feedback on the Performance of MERR_SVD++. The influence of implicit feedback on the performance of MERR_SVD++ is shown in Figure 5. Here N(u) = R(u) + M(u), where M(u) denotes the set of items for which user u gave only implicit feedback. The x-axis of Figure 5 denotes the number of implicit feedback entries added per user, and the y-axis denotes the quality in ERR and NDCG. In our experiment the number of added implicit feedback entries is the same for each user. We use the extracted Netflix dataset under the "Given 10" condition. Figure 5 shows that the ERR and NDCG quality of MERR_SVD++ improves synchronously and almost linearly as the amount of implicit feedback per user increases, which confirms our belief that implicit feedback can indeed complement explicit feedback.
With this experimental result, we give a positive answer to our second research question.
5.4.3 Scalability. The last experiment investigated the scalability of MERR_SVD++ by measuring the training time required for training sets of different scales. First, as analyzed in Section 4.3, the computational complexity of MERR_SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate the scalability, we used different numbers of users in the training set: under each condition we randomly selected from 10% to 100% of the users and their rated items as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6.

Figure 6: Scalability analysis of MERR_SVD++ in terms of the number of users in the training set (x-axis: percentage of users used for training, from 10% to 100%; y-axis: runtime per iteration in seconds; one curve for each of the "Given 5", "Given 10", and "Given 15" conditions).

We can observe that the computational time under each condition increases almost linearly with the number of users. Second, as also discussed in Section 4.3, the computational complexity of MERR_SVD++ can be further approximated as linear in the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm on training sets of different scales under different "Given" conditions. As also shown in Figure 6, the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.
The observations from this experiment allow us to answer our last research question positively.
6 Conclusion and Future Work
The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data, rather than making full use of the information in the dataset. Until now, personalized ranking algorithms exploiting both explicit and implicit feedback had not been studied. To overcome this limitation, in this paper we have presented a new personalized ranking algorithm (MERR_SVD++) that exploits both explicit and implicit feedback simultaneously. MERR_SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the recent xCLiMF model and the SVD++ algorithm. Experimental results on real-world datasets showed that our proposed algorithm outperforms existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ scales linearly with the number of ratings. Because of its high precision and good scalability, MERR_SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by alleviating the latency problem of personalized recommendation, and has wide application prospects in the field of Internet information recommendation. Moreover, because MERR_SVD++ exploits both explicit and implicit feedback simultaneously, it can alleviate the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.
For future work, we plan to extend our algorithm to richer models so that it can address the gray sheep and cold start problems of personalized recommendation. We would also like to explore more useful information from explicit and implicit feedback simultaneously.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan city (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).
References
[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734-749, 2005.
[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401-410, ACM Press, Sydney, Australia, August 2015.
[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143-152, IEEE, Seattle, Wash, USA, February 2012.
[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002-4008, ACM Press, Buenos Aires, Argentina, July 2015.
[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193-202, ACM, New York, NY, USA, August 2014.
[6] M. Jahrer and A. Toscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153-167, ACM, San Diego, Calif, USA, August 2011.
[7] G. Takacs and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83-90, ACM, Dublin, Ireland, 2012.
[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis, and Cybernetics, pp. 395-404, Wuhan, China, October 2014.
[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052-3058, ACM Press, Austin, Tex, USA, January 2015.
[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis, and Cybernetics (SPAC '14), pp. 181-190, Wuhan, China, October 2014.
[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452-461, Morgan Kaufmann, Montreal, Canada, 2009.
[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039-1053, 2015.
[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173-180, 2015.
[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077-3081, ACM, Beijing, China, August 2013.
[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691-2697, Beijing, China, 2013.
[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1-24, 2010.
[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445-1448, ACM, Toronto, Canada, October 2010.
[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431-434, ACM, Hong Kong, October 2013.
[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490-497, IEEE, Washington, DC, USA, July 2011.
[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89-115, 2004.
[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791-798, ACM Press, Corvallis, Ore, USA, June 2007.
[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11-217, British Columbia, Canada, 2004.
[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838-850, 2013.
[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249-258, ACM Press, Sydney, Australia, August 2015.
[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198-1204, ACM Press, Austin, Tex, USA, January 2015.
[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.
[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759-766, ACM Press, Hong Kong, November 2009.
[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08), pp. 83-90, ACM, Singapore, July 2008.
[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank: maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79-86, Curran Associates, Vancouver, Canada, 2007.
[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621-630, ACM, New York, NY, USA, November 2009.
[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text Retrieval Conference (TREC-8 '99), Gaithersburg, Md, USA, November 1999.
8 Mathematical Problems in Engineering
0 Given 5 Given 10 Given 150
001
002
003
004
005
006The results on the ML1m dataset
The varieties of ldquoGivenrdquo conditions for the training sets
ND
CG
5
CLiMFCofiRankxCLiMF
SVD++Co_RatingMERR_SVD++
(a)
0 Given 5 Given 10 Given 150
001
002
003
004
005
006
007The results on the ML1m dataset
ERR
5
The varieties of ldquoGivenrdquo conditions for the training sets
CLiMFCofiRankxCLiMF
SVD++Co_RatingMERR_SVD++
(b)
0 Given 10 Given 20 Given 300
002
004
006
008
010
012The results on the extracted Netflix dataset
ND
CG
5
The varieties of ldquoGivenrdquo conditions for the training sets
CLiMFCofiRankxCLiMF
SVD++Co_RatingMERR_SVD++
(c)
0 Given 10 Given 20 Given 300
001
002
003
004
005
006
007
008The results on the extracted Netflix dataset
ERR
5
The varieties of ldquoGivenrdquo conditions for the training sets
CLiMFCofiRankxCLiMF
SVD++Co_RatingMERR_SVD++
(d)
Figure 4 The performance comparison of MERR SVD++ and baselines
the varieties of ldquoGivenrdquo conditions for the training sets basedon the ML1m and the extracted Netflix datasets and columnsdenote the quality of NDCG and ERR Figure 4 showsthat MERR SVD++ outperforms the baseline approachesin terms of both ERR and NDCG in all of the cases Theresults show that the improvement of ERR aligns consistentlywith the improvement of NDCG indicating that optimizingERR would not degrade the utility of recommendations thatare captured by the NDCG measure It can be seen thatthe relative performance of MERR SVD++ improves as thenumber of observed ratings from the users increases Thisresult indicates that MERR SVD++ can learn better top-Nrecommendation models if more observations of the gradedrelevance data from users can be used The results alsoreveal that it is difficult to model user preferences encoded
in multiple levels of relevance with limited observations inparticular when the number of observations is lower than thenumber of relevance levels
Compared to Co-Rating which is based on ratingprediction MERR SVD++ is based on ranking predictionand succeeds in enhancing the top-ranked performance byoptimizing ERR As reported in [17] the performance ofSVD++ is slightly weaker than that of Co-Rating which isbecause SVD++ model only attempts to approximate theobserved ratings and does not model preferences expressedin implicit feedback The results also show that Co-Ratingand SVD++ significantly outperform xCLiMF and CofiRankwhich confirms our belief that implicit feedback could indeedcomplement explicit feedback It can be seen in Figure 4that xCLiMF significantly outperforms CLiMF in terms of
Mathematical Problems in Engineering 9
0 10 20 30 40 50005
006
007
008
009
010
The increased numbers of implicit feedback for each user
The v
alue
of e
valu
atio
n m
etric
s
ERR5NDCG5
Figure 5 The influence of implicit feedback on the performance ofMERR SVD++
both ERR and NDCG across all the settings of relevancethresholds and datasets The results indicate that the infor-mation loss from binarizing multilevel relevance data wouldinevitably make recommendation models based on binaryrelevance data such as CLiMF suboptimal for the use caseswith explicit feedback data
Hence, we give a positive answer to our first research question.
5.4.2. The Influence of Implicit Feedback on the Performance of MERR SVD++. The influence of implicit feedback on the performance of MERR SVD++ can be found in Figure 5. Here N(u) = R(u) + M(u), where M(u) denotes the set of items for which user u gave only implicit feedback. Rows denote the increased numbers of implicit feedback items for each user, and columns denote the quality of ERR and NDCG. In our experiment, the increased number of implicit feedback items is the same for each user. We use the extracted Netflix dataset under the condition of "Given 10". Figure 5 shows that the ERR and NDCG of MERR SVD++ improve synchronously and linearly with the increase of implicit feedback for each user, which confirms our belief that implicit feedback can indeed complement explicit feedback.
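As background for how the implicit-feedback set N(u) enters the score, the SVD++ prediction rule of Koren [16], on which MERR SVD++ builds, can be sketched as follows. This is a minimal NumPy sketch of standard SVD++, not the paper's full MERR SVD++ implementation, and the function and variable names are illustrative:

```python
import numpy as np

def svdpp_predict(mu, b_u, b_i, q_i, p_u, Y_Nu):
    """SVD++ score for one (user, item) pair.

    mu   : global rating mean; b_u, b_i: user and item biases
    q_i  : item latent factor vector
    p_u  : explicit user latent factor vector
    Y_Nu : rows are the implicit item factors y_j for every j in N(u),
           i.e., all items the user interacted with (rated or implicit-only,
           N(u) = R(u) + M(u) in the paper's notation)
    """
    # Implicit feedback enters through the |N(u)|^(-1/2)-normalized
    # sum of the implicit item factors y_j.
    implicit = Y_Nu.sum(axis=0) / np.sqrt(len(Y_Nu))
    return mu + b_u + b_i + q_i @ (p_u + implicit)
```

Growing M(u), as in this experiment, enlarges Y_Nu, so each extra implicit observation directly sharpens the user representation that the ranking objective is optimized over.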
With this experimental result, we give a positive answer to our second research question.
5.4.3. Scalability. The last experiment investigated the scalability of MERR SVD++ by measuring the training time required at different training-set scales. Firstly, as analyzed in Section 4.3, the computational complexity of MERR SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate this scalability, we used different numbers of users in the training set: under each condition, we randomly selected from 10% to 100% of the users in the training set, together with their rated items, as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6. We can observe that the computational time under
[Figure 6: Scalability analysis of MERR SVD++ in terms of the number of users in the training set. x-axis: users for training (%), 10-100; y-axis: runtime per iteration (sec), 0-40; curves: "Given 5", "Given 10", and "Given 15".]
each condition increases almost linearly with the number of users. Secondly, as also discussed in Section 4.3, the computational complexity of MERR SVD++ can be further approximated as linear in the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm against different scales of the training sets under different "Given" conditions. The result is also shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.
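The source of this linear-in-nonzeros behavior is that one SGD pass does a constant amount of work per observed entry. A toy matrix-factorization epoch makes this concrete; it is an illustrative sketch standing in for one MERR SVD++ iteration, not the paper's implementation:

```python
import numpy as np

def one_epoch(triples, P, Q, lr=0.01):
    """One SGD pass over the observed (user, item, rating) entries.

    Each nonzero entry costs O(k) for factor dimension k, so the
    epoch cost is linear in the number of observed ratings.
    Returns the number of per-entry updates performed.
    """
    updates = 0
    for u, i, r in triples:
        err = r - P[u] @ Q[i]
        p_old = P[u].copy()
        P[u] += lr * err * Q[i]      # update user factors
        Q[i] += lr * err * p_old     # update item factors
        updates += 1
    return updates

# Synthetic data: 50 users, 40 items, 1000 observed ratings.
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(50, 8))
Q = rng.normal(scale=0.1, size=(40, 8))
triples = [(int(rng.integers(50)), int(rng.integers(40)),
            float(rng.integers(1, 6))) for _ in range(1000)]

# Exactly one constant-cost update per nonzero entry.
assert one_epoch(triples, P, Q) == len(triples)
```

Doubling the number of triples doubles the updates per epoch, matching the near-linear runtime curves in Figure 6.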
The observations from this experiment allow us to answer our last research question positively.
6 Conclusion and Future Work
The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset; until now, no personalized ranking algorithm has exploited both explicit and implicit feedback. To overcome these defects, in this paper we have presented a new personalized ranking algorithm (MERR SVD++) that exploits both explicit and implicit feedback simultaneously. MERR SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the recent xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR SVD++ scales linearly with the number of ratings. Because of its high precision and good extensibility, MERR SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by addressing the latency problem of personalized recommendation, and has wide application prospects in the
field of Internet information recommendation. Moreover, because MERR SVD++ exploits both explicit and implicit feedback simultaneously, it can alleviate the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.
For future work, we plan to extend our algorithm to richer models so that it can address the grey sheep and cold start problems of personalized recommendation. We would also like to explore more useful information from explicit and implicit feedback simultaneously.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan City (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).
References
[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734–749, 2005.
[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401–410, ACM Press, Sydney, Australia, August 2015.
[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143–152, IEEE, Seattle, Wash, USA, February 2012.
[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002–4008, ACM Press, Buenos Aires, Argentina, July 2015.
[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193–202, ACM, New York, NY, USA, August 2014.
[6] M. Jahrer and A. Toscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153–167, ACM, San Diego, Calif, USA, August 2011.
[7] G. Takacs and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83–90, ACM, Dublin, Ireland, 2012.
[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics, pp. 395–404, Wuhan, China, October 2014.
[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052–3058, ACM Press, Austin, Tex, USA, January 2015.
[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics (SPAC '14), pp. 181–190, Wuhan, China, October 2014.
[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452–461, Morgan Kaufmann, Montreal, Canada, 2009.
[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039–1053, 2015.
[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173–180, 2015.
[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077–3081, ACM, Beijing, China, August 2013.
[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691–2697, Beijing, China, 2013.
[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1–24, 2010.
[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445–1448, ACM, Toronto, Canada, October 2010.
[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431–434, ACM, Hong Kong, October 2013.
[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490–497, IEEE, Washington, DC, USA, July 2011.
[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89–115, 2004.
[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791–798, ACM Press, Corvallis, Ore, USA, June 2007.
[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11–217, British Columbia, Canada, 2004.
[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838–850, 2013.
[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249–258, ACM Press, Sydney, Australia, August 2015.
[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198–1204, ACM Press, Austin, Tex, USA, January 2015.
[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.
[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759–766, ACM Press, Hong Kong, November 2009.
[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '08), pp. 83–90, ACM, Singapore, July 2008.
[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank–maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79–86, Curran Associates, Vancouver, Canada, 2007.
[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621–630, ACM, New York, NY, USA, November 2009.
[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text Retrieval Conference (TREC-8 '99), Gaithersburg, Md, USA, November 1999.
Mathematical Problems in Engineering 9
0 10 20 30 40 50005
006
007
008
009
010
The increased numbers of implicit feedback for each user
The v
alue
of e
valu
atio
n m
etric
s
ERR5NDCG5
Figure 5 The influence of implicit feedback on the performance ofMERR SVD++
both ERR and NDCG across all the settings of relevancethresholds and datasets The results indicate that the infor-mation loss from binarizing multilevel relevance data wouldinevitably make recommendation models based on binaryrelevance data such as CLiMF suboptimal for the use caseswith explicit feedback data
Hence we give a positive answer to our first researchquestion
542 The Influence of Implicit Feedback on the Performanceof MERR SVD++ The influence of implicit feedback on theperformance of MERR SVD++ can be found in Figure 5Here 119873(119906) = 119877(119906) + 119872(119906) 119872(119906) denotes the set of allitems that user 119906 only gave implicit feedback Rows denotethe increased numbers of implicit feedback for each userand columns denote the quality of ERR and NDCG In ourexperiment the increased numbers of implicit feedback foreach user are the same We use the extracted Netflix datasetunder the condition of ldquoGiven 10rdquo Figure 5 shows that thequality of ERR and NDCG of MERR SVD++ synchronouslyand linearly improves with the increase of implicit feedbackfor each user which confirms our belief that implicit feedbackcould indeed complement explicit feedback
With this experimental result we give a positive answerto our second research question
543 Scalability The last experiment investigated the scala-bility of MERR SVD++ by measuring the training time thatwas required for the training set at different scales Firstlyas analyzed in Section 43 the computational complexityof MERR SVD++ is linear in the number of users in thetraining set when the average number of items rated peruser is fixed To demonstrate the scalability we used differentnumbers of users in the training set under each conditionwe randomly selected from 10 to 100 users in the trainingset and their rated items as the training data for learning thelatent factors The results on the ML1m dataset are shown inFigure 6 We can observe that the computational time under
10 20 30 40 50 60 70 80 90 1000
5
10
15
20
25
30
35
40
Users for training ()
Runt
ime p
er it
erat
ion
(sec
)
Given 5Given 10Given 15
Figure 6 Scalability analysis of MERR SVD++ in terms of thenumber of users in the training set
each condition increases almost linearly to the increase of thenumber of users Secondly as also discussed in Section 43the computational complexity of MERR SVD++ could befurther approximated to be linear to the amount of knowndata (ie nonzero entries in the training user-item matrix)To demonstrate this we examined the runtime of the learningalgorithm against different scales of the training sets underdifferent ldquoGivenrdquo conditions The result is shown in Figure 6from which we can observe that the average runtime of thelearning algorithm per iteration increases almost linearly asthe number of nonzeros in the training set increases
The observations from this experiment allow us to answerour last research question positively
6 Conclusion and Future Work
The problem of the previous researches on personalizedranking is that they focused on either explicit feedbackdata or implicit feedback data rather than making full useof the information in the dataset Until now nobody hasstudied personalized ranking algorithm by exploiting bothexplicit and implicit feedback In order to overcome thedefects of prior researches in this paper we have presenteda new personalized ranking algorithm (MERR SVD++) byexploiting both explicit and implicit feedback simultaneouslyMERR SVD++ optimizes the well-known evaluation metricExpected Reciprocal Rank (ERR) and is based on the newestxCLiMF model and SVD++ algorithm Experimental resultson practical datasets showed that our proposed algorithmoutperformed existing personalized ranking algorithms overdifferent evaluation metrics and that the running time ofMERR SVD++ showed a linear correlation with the num-ber of rating Because of its high precision and the goodexpansibility MERR SVD++ is suitable for processing bigdata and can greatly improve the recommendation speedand validity by solving the latency problem of personalizedrecommendation and has wide application prospect in the
10 Mathematical Problems in Engineering
field of internet information recommendation And becauseMERR SVD++ exploits both explicit and implicit feedbacksimultaneously MERR SVD++ can solve the data sparsityand imbalance problems of personalized ranking algorithmsto a certain extent
For futurework we plan to extend our algorithm to richerones so that our algorithm can solve the grey sheep problemand cold start problem of personalized recommendationAlso we would like to explore more useful information fromthe explicit feedback and implicit feedback simultaneously
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
This work is sponsored in part by the National NaturalScience Foundation of China (nos 61370186 6140326461402122 61003140 61033010 and 61272414) Science andTechnology Planning Project of Guangdong Province (nos2014A010103040 and 2014B010116001) Science and Tech-nology Planning Project of Guangzhou (nos 2014J4100032and 201510010203) the Ministry of Education and ChinaMobile Research Fund (no MCM20121051) the secondbatch open subject of mechanical and electrical professionalgroup engineering technology development center in Foshancity (no 2015-KJZX139) and the 2015 Research BackboneTeachers Training Program of Shunde Polytechnic (no 2015-KJZX014)
References
[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge and DataEngineering vol 17 no 6 pp 734ndash749 2005
[2] S Ahn A Korattikara and N Liu ldquoLarge-scale distributedBayesian matrix factorization using stochastic gradientMCMCrdquo in Proceedings of the 21th ACMSIGKDDConference onKnowledgeDiscovery andDataMining pp 401ndash410 ACMPressSydney Australia August 2015
[3] S Balakrishnan and S Chopra ldquoCollaborative rankingrdquo in Pro-ceedings of the 5th ACM International Conference onWeb Searchand DataMining (WSDM rsquo12) pp 143ndash152 IEEE Seattle WashUSA February 2012
[4] Q Yao and J Kwok ldquoAccelerated inexact soft-impute for fastlarge-scale matrix completionrdquo in Proceedings of the 24rd Inter-national Joint Conference on Artificial Intelligence pp 4002ndash4008 ACM Press Buenos Aires Argentina July 2015
[5] Q Diao M Qiu C-Y Wu A J Smola J Jiang and C WangldquoJointly modeling aspects ratings and sentiments for movierecommendation (JMARS)rdquo in Proceedings of the 20th ACMSIGKDD International Conference on Knowledge Discovery andDataMining (KDD rsquo14) pp 193ndash202 ACMNewYorkNYUSAAugust 2014
[6] M Jahrer and A Toscher ldquoCollaborative filtering ensemble forrankingrdquo in Proceedings of the 17nd International Conference on
Knowledge Discovery and Data Mining pp 153ndash167 ACM SanDiego Calif USA August 2011
[7] G Takacs and D Tikk ldquoAlternating least squares for person-alized rankingrdquo in Proceedings of the 5th ACM Conference onRecommender Systems pp 83ndash90 ACM Dublin Ireland 2012
[8] G Li and L Li ldquoDual collaborative topicmodeling from implicitfeedbacksrdquo in Proceedings of the IEEE International Conferenceon Security Pattern Analysis and Cybernetics pp 395ndash404Wuhan China October 2014
[9] H Wang X Shi and D Yeung ldquoRelational stacked denoisingautoencoder for tag recommendationrdquo in Proceedings of the29th AAAI Conference on Artificial Intelligence pp 3052ndash3058ACM Press Austin Tex USA January 2015
[10] L Gai ldquoPairwise probabilistic matrix factorization for implicitfeedback collaborative filteringrdquo in Proceedings of the IEEEInternational Conference on Security Pattern Analysis andCybernetics (SPAC rsquo14) pp 181ndash190 Wuhan China October2014
[11] S Rendle C Freudenthaler Z Gantner and L Schmidt-Thieme ldquoBPR Bayesian personalized ranking from implicitfeedbackrdquo in Proceedings of the 25th Conference on Uncertaintyin Artificial Intelligence (UAI rsquo09) pp 452ndash461 Morgan Kauf-mann Montreal Canada 2009
[12] L Guo J Ma H R Jiang Z M Chen and C M XingldquoSocial trust aware item recommendation for implicit feedbackrdquoJournal of Computer Science and Technology vol 30 no 5 pp1039ndash1053 2015
[13] W K Pan H Zhong C F Xu and Z Ming ldquoAdaptive Bayesianpersonalized ranking for heterogeneous implicit feedbacksrdquoKnowledge-Based Systems vol 73 pp 173ndash180 2015
[14] Y Shi A Karatzoglou L Baltrunas M Larson N Oliver andA Hanjalic ldquoCLiMF collaborative less-is-more filteringrdquo inProceedings of the 23rd International Joint Conference on Artifi-cial Intelligence (IJCAI rsquo13) pp 3077ndash3081 ACMBeijing ChinaAugust 2013
[15] WK Pan andLChen ldquoGBPR grouppreference based bayesianpersonalized ranking for one-class collaborative filteringrdquo inProceedings of the 23rd International Joint Conference on Artifi-cial Intelligence (IJCAI rsquo13) pp 2691ndash2697 Beijing China 2013
[16] Y Koren ldquoFactor in the neighbors scalable and accurate col-laborative filteringrdquoACMTransactions on Knowledge Discoveryfrom Data vol 4 no 1 pp 1ndash24 2010
[17] N N Liu E W Xiang M Zhao and Q Yang ldquoUnifyingexplicit and implicit feedback for collaborative filteringrdquo inProceedings of the 19th International Conference on Informationand Knowledge Management (CIKM rsquo10) pp 1445ndash1448 ACMToronto Canada October 2010
[18] Y Shi A Karatzoglou L BaltrunasM Larson andA HanjalicldquoXCLiMF Optimizing expected reciprocal rank for data withmultiple levels of relevancerdquo in Proceedings of the 7th ACMConference on Recommender Systems (RecSys rsquo13) pp 431ndash434ACM Hongkong October 2013
[19] J Jiang J Lu G Zhang and G Long ldquoScaling-up item-based collaborative filtering recommendation algorithm basedon Hadooprdquo in Proceedings of the 7th IEEE World Congress onServices pp 490ndash497 IEEE Washington DC USA July 2011
[20] T Hofmann ldquoLatent semantic models for collaborative filter-ingrdquo ACM Transactions on Information Systems vol 22 no 1pp 89ndash115 2004
[21] R Salakhutdinov A Mnih and G Hinton ldquoRestricted Boltz-mannmachines for collaborative filteringrdquo in Proceedings of the
Mathematical Problems in Engineering 11
24rd International Conference onMachine Learning (ICML rsquo07)pp 791ndash798 ACM Press Corvallis Ore USA June 2007
[22] N Srebro J Rennie andT Jaakkola ldquoMaximum-marginmatrixfactorizationrdquo in Proceedings of the 18th Annual Conferenceon Neural Information Processing Systems pp 11ndash217 BritishColumbia Canada 2004
[23] J T Liu C H Wu and W Y Liu ldquoBayesian probabilisticmatrix factorization with social relations and item contents forrecommendationrdquo Decision Support Systems vol 55 no 3 pp838ndash850 2013
[24] D Feldman and T Tassa ldquoMore constraints smaller coresetsconstrained matrix approximation of sparse big datardquo in Pro-ceedings of the 21th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo15) pp 249ndash258ACM Press Sydney Australia August 2015
[25] S Mirisaee E Gaussier and A Termier ldquoImproved local searchfor binarymatrix factorizationrdquo in Proceedings of the 29th AAAIConference on Artificial Intelligence pp 1198ndash1204 ACM PressAustin Tex USA January 2015
[26] T Y Liu Learning to Rank for Information Retrieval SpringerNew York NY USA 2011
[27] NN LiuM Zhao andQ Yang ldquoProbabilistic latent preferenceanalysis for collaborative filteringrdquo in Proceedings of the 18thInternational Conference on Information and Knowledge Man-agement (CIKM rsquo09) pp 759ndash766 ACM Press Hong KongNovember 2009
[28] N N Liu and Q Yang ldquoEigenRank a ranking-orientedapproach to collaborative filteringrdquo in Proceedings of the 31stAnnual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval (ACM SIGIR rsquo08) pp 83ndash90 ACM Singapore July 2008
[29] M Weimer A Karatzoglou Q V Le and A J SmolaldquoCofiRankndashmaximum margin matrix factorization for collab-orative rankingrdquo in Proceedings of the 21th Conference onAdvances in Neural Information Processing Systems pp 79ndash86Curran Associates Vancouver Canada 2007
[30] O Chapelle D Metlzer Y Zhang and P Grinspan ldquoExpectedreciprocal rank for graded relevancerdquo in Proceedings of the 18thACM International Conference on Information and KnowledgeManagement (CIKM rsquo09) pp 621ndash630 ACM New York NYUSA November 2009
[31] E M Voorhees ldquoThe trec-8 question answering track reportrdquoin Proceedings of the 8th Text Retrieval Conference (TREC-8 rsquo99)Gaithersburg Md USA November 1999
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
10 Mathematical Problems in Engineering
field of internet information recommendation And becauseMERR SVD++ exploits both explicit and implicit feedbacksimultaneously MERR SVD++ can solve the data sparsityand imbalance problems of personalized ranking algorithmsto a certain extent
For futurework we plan to extend our algorithm to richerones so that our algorithm can solve the grey sheep problemand cold start problem of personalized recommendationAlso we would like to explore more useful information fromthe explicit feedback and implicit feedback simultaneously
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
This work is sponsored in part by the National NaturalScience Foundation of China (nos 61370186 6140326461402122 61003140 61033010 and 61272414) Science andTechnology Planning Project of Guangdong Province (nos2014A010103040 and 2014B010116001) Science and Tech-nology Planning Project of Guangzhou (nos 2014J4100032and 201510010203) the Ministry of Education and ChinaMobile Research Fund (no MCM20121051) the secondbatch open subject of mechanical and electrical professionalgroup engineering technology development center in Foshancity (no 2015-KJZX139) and the 2015 Research BackboneTeachers Training Program of Shunde Polytechnic (no 2015-KJZX014)
References
[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734–749, 2005.
[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401–410, ACM Press, Sydney, Australia, August 2015.
[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143–152, IEEE, Seattle, Wash, USA, February 2012.
[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002–4008, ACM Press, Buenos Aires, Argentina, July 2015.
[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193–202, ACM, New York, NY, USA, August 2014.
[6] M. Jahrer and A. Töscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153–167, ACM, San Diego, Calif, USA, August 2011.
[7] G. Takács and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83–90, ACM, Dublin, Ireland, 2012.
[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics, pp. 395–404, Wuhan, China, October 2014.
[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052–3058, ACM Press, Austin, Tex, USA, January 2015.
[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics (SPAC '14), pp. 181–190, Wuhan, China, October 2014.
[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452–461, Morgan Kaufmann, Montreal, Canada, 2009.
[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039–1053, 2015.
[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173–180, 2015.
[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077–3081, ACM, Beijing, China, August 2013.
[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691–2697, Beijing, China, 2013.
[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1–24, 2010.
[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445–1448, ACM, Toronto, Canada, October 2010.
[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431–434, ACM, Hong Kong, October 2013.
[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490–497, IEEE, Washington, DC, USA, July 2011.
[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89–115, 2004.
Mathematical Problems in Engineering
[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791–798, ACM Press, Corvallis, Ore, USA, June 2007.
[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11–217, British Columbia, Canada, 2004.
[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838–850, 2013.
[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249–258, ACM Press, Sydney, Australia, August 2015.
[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198–1204, ACM Press, Austin, Tex, USA, January 2015.
[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.
[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759–766, ACM Press, Hong Kong, November 2009.
[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '08), pp. 83–90, ACM, Singapore, July 2008.
[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank: maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79–86, Curran Associates, Vancouver, Canada, 2007.
[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621–630, ACM, New York, NY, USA, November 2009.
[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text REtrieval Conference (TREC-8 '99), Gaithersburg, Md, USA, November 1999.