Transfer learning in heterogeneous collaborative filtering domains
Authors/ Weike Pan and Qiang Yang
Affiliation/ Dept. of CSE, Hong Kong University of Science and Technology
Source/ Journal of Artificial Intelligence (2013)
Presenter/ Allen Wu
Date/ 2013/3/27
Outline
• Introduction
• Heterogeneous collaborative filtering problems
• Transfer by collective factorization
• Experimental results
• Conclusion
Introduction
• Data sparsity is a major challenge in collaborative filtering (CF).
• With sparse ratings, prediction models overfit easily.
• Some auxiliary data of the form “like” or “dislike” may be more easily obtained.
• It’s more convenient for users to express preferences this way.
• How do we take advantage of auxiliary knowledge to alleviate the sparsity problem?
• Most existing transfer learning methods in CF consider auxiliary data from several perspectives.
• User-side transfer, item-side transfer, and knowledge transfer.
Rating-matrix generative model (ICML’09)
• RMGM is derived and extended from the FMM generative model,
which can be formulated as
• The difference:
• It learns $(U, V)$ and $(U_3, V_3)$ alternately.
• A soft indicator matrix is used, e.g., $U \in [0, 1]^{n \times d}$.
Model formulation
• Assume a user u’s rating on an item i in the target data, $r_{ui}$, is generated from
• a user-specific latent feature vector $U_{u\cdot} \in \mathbb{R}^{1 \times d}$, where $u = 1, \dots, n$;
• an item-specific latent feature vector $V_{i\cdot} \in \mathbb{R}^{1 \times d}$, where $i = 1, \dots, m$;
• some data-dependent effect, denoted $B \in \mathbb{R}^{d \times d}$.
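To make the three ingredients concrete, here is a minimal sketch of how they could combine into a predicted rating. It assumes the usual tri-factorization form, where the prediction is the user row vector times B times the transposed item row vector; the slide lists the ingredients but not the combining formula, so that form is an assumption. The function name `predict_rating` and all numbers are hypothetical.

```python
def predict_rating(u_vec, B, v_vec):
    """Return u_vec * B * v_vec^T for 1-by-d row vectors and a d-by-d matrix B.

    Assumed tri-factorization form; not taken verbatim from the slides.
    """
    d = len(u_vec)
    # First compute the 1-by-d row vector u_vec * B.
    uB = [sum(u_vec[k] * B[k][j] for k in range(d)) for j in range(d)]
    # Then take the inner product with the item vector.
    return sum(uB[j] * v_vec[j] for j in range(d))


# Toy example with d = 2 (made-up numbers, for illustration only).
U_u = [1.0, 2.0]          # user-specific latent features
V_i = [0.5, 1.0]          # item-specific latent features
B = [[1.0, 0.0],
     [0.0, 2.0]]          # data-dependent effect
r_hat = predict_rating(U_u, B, V_i)
print(r_hat)
```

With a diagonal B as in the toy example, each latent dimension is simply rescaled before the user and item vectors are matched; a full B would also let different dimensions interact.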
Model formulation (Cont.)
• Likelihood:
• Prior:
• $\text{Posterior} \propto \text{Likelihood} \times \text{Prior}$ (Bayesian inference)
• $\log(\text{Posterior}) = \log(\text{Likelihood}) + \log(\text{Prior}) + \text{const}$
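The likelihood and prior formulas did not survive on this slide, so the following is only a sketch of how the log-posterior turns into a training objective. It assumes a Gaussian likelihood on observed ratings and zero-mean Gaussian priors on U, V and B, in which case the negative log-posterior is a squared error plus squared-norm penalties. The function name and the weights `alpha_u`, `alpha_v`, `beta` are hypothetical, and only the target matrix is modeled (no auxiliary data).

```python
def squared_norm(M):
    """Sum of squared entries of a matrix given as a list of rows."""
    return sum(x * x for row in M for x in row)


def neg_log_posterior(R, Y, U, B, V, alpha_u=0.1, alpha_v=0.1, beta=0.1):
    """Negative log-posterior up to an additive constant (assumed Gaussian model).

    R: n-by-m ratings, Y: n-by-m 0/1 mask of observed entries,
    U: n-by-d, B: d-by-d, V: m-by-d.
    """
    d = len(B)
    loss = 0.0
    for u, row in enumerate(R):
        for i, r in enumerate(row):
            if not Y[u][i]:
                continue  # unobserved ratings contribute nothing to the likelihood
            pred = sum(U[u][k] * B[k][j] * V[i][j]
                       for k in range(d) for j in range(d))
            loss += 0.5 * (r - pred) ** 2  # from log(Likelihood)
    # From log(Prior): one squared-norm penalty per parameter matrix.
    penalty = 0.5 * (alpha_u * squared_norm(U)
                     + alpha_v * squared_norm(V)
                     + beta * squared_norm(B))
    return loss + penalty


# Toy example: one user, one item, d = 2 (made-up numbers).
objective = neg_log_posterior(
    R=[[4.5]], Y=[[1]],
    U=[[1.0, 2.0]], B=[[1.0, 0.0], [0.0, 2.0]], V=[[0.5, 1.0]],
)
print(objective)
```

Minimizing this quantity is equivalent to maximizing the posterior, which is why the prior shows up as regularization.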
Learning U and V in CMTF
• Theorem 1. Given B and V, we can obtain the user-specific latent matrix U in closed form.
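The statement of the theorem is all the slide gives, so the sketch below is not the paper's formula. It illustrates why a closed form is plausible: with B and V fixed, each user's row is the solution of a small regularized least-squares problem. It assumes a squared-error objective on the target ratings only, with a hypothetical regularization weight `alpha_u`; any auxiliary-data terms are left out. The helper names are hypothetical, and the solver is written out by hand to keep the example dependency-free.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[r]) + [b[r]] for r in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def update_user_row(r_u, y_u, B, V, alpha_u=0.1):
    """Closed-form row of U for one user, given B and V (simplified sketch).

    r_u: the user's ratings over m items, y_u: 0/1 mask of observed ratings.
    Solves (sum_i w_i w_i^T + alpha_u I) x = sum_i r_ui w_i with w_i = B V_i^T.
    """
    d = len(B)
    C = [[alpha_u if j == k else 0.0 for k in range(d)] for j in range(d)]
    b = [0.0] * d
    for i, (r, observed) in enumerate(zip(r_u, y_u)):
        if not observed:
            continue
        # w = B V_i^T: the item's features after applying B.
        w = [sum(B[j][k] * V[i][k] for k in range(d)) for j in range(d)]
        for j in range(d):
            b[j] += r * w[j]
            for k in range(d):
                C[j][k] += w[j] * w[k]
    return solve_linear(C, b)


# Toy example: two items, d = 2, identity B (made-up numbers).
identity = [[1.0, 0.0], [0.0, 1.0]]
row = update_user_row([2.0, 3.0], [1, 1], identity, identity, alpha_u=1.0)
print(row)
```

Because each row depends only on that user's observed ratings, the rows of U can be updated independently of one another once B and V are fixed.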
Performance on Netflix at different sparsity levels
• CSVD performs better than CMTF in all cases.
Conclusion
• This paper investigates how to address the sparsity problem in CF via a transfer learning solution.
• The TCF (transfer by collective factorization) framework is proposed to transfer knowledge from auxiliary data to target data and so alleviate data sparsity.
• Experimental results show that TCF performs significantly better than several state-of-the-art baseline algorithms.
• In the future, the “pure” cold-start problem, for users without any ratings, needs to be addressed via transfer learning.