arXiv:2107.04846v1 [cs.IR] 10 Jul 2021

Propagation-aware Social Recommendation by Transfer Learning

Haodong Chang1[0000−0002−5015−1793] and Yabo Chu2[0000−0002−1694−9179]

1 University of Technology Sydney, Australia
2 Northeastern University, China

(Haodong Chang and Yabo Chu contributed equally to this work)

Abstract. Social-aware recommendation approaches have been recognized as an effective way to address the data sparsity issue of traditional recommender systems. The underlying assumption is that the knowledge in social user-user connections can be shared and transferred to the domain of user-item interactions, thereby helping to learn user preferences. However, most existing approaches adopt only the first-order connections among users during transfer learning and ignore higher-order connections. We argue that recommendation performance can also benefit from high-order social relations. In this paper, we propose a novel Propagation-aware Transfer Learning Network (PTLN) based on the propagation of social relations. We aim to better mine the shared knowledge hidden in social networks and thus further improve recommendation performance. Specifically, we explore social influence in two aspects: (a) higher-order friends are taken into consideration through an order bias; (b) different friends in the same order have distinct importance for recommendation, which is modeled by an attention mechanism. In addition, we design a novel regularization to bridge the gap between social relations and user-item interactions. We conduct extensive experiments on two real-world datasets and outperform the compared methods in terms of ranking accuracy, especially for cold-start users with few historical interactions.

Keywords: Recommender system · Social Connections · Transfer Learning · Social-aware Recommendation.

1 Introduction

Nowadays, recommender systems play an essential role in recommending items of interest to users. The key to success is learning precise user and item embeddings, and Collaborative Filtering (CF) is the most traditional method [1,2] for learning them from users' historical records, such as ratings, clicks, and reviews. However, many users lack enough interaction data to receive accurate recommendations. This data sparsity problem limits the performance of CF-based models.

With the prevalence of online social networks, social connections have been widely leveraged to alleviate the data sparsity problem, forming the line of research called social-aware recommendation. Transfer learning [3,4] is a useful


Fig. 1. An example illustrating the influence of different friends. Users b and c have the same color as user u, indicating that they have preferences similar to those of user u, whereas user a has different preferences. The system then makes a recommendation by considering these friends' preferences. "Mission Impossible" is at the front of the recommended list, and "Avatar" is ranked behind the other two films.

approach to learn the common knowledge shared between a source domain and a target domain, and then transfer this common knowledge to enhance model learning in the target domain. Transfer learning has also been applied in social-aware recommendation to learn user preferences from social connections and then transfer them to the item domain, leading to more refined user preferences and thus better recommendation performance. However, most existing methods adopt only the first-order connections and ignore the high-order connections. For example, in Figure 1, users a and b are both first-order friends of user u, while user c lies in the second order. Users u, b, and c share similar interests, while user a has different interests from u. In this case, user c (in the second order) will have a more positive influence on learning the preferences of user u than the first-order friend a.

Therefore, we argue that high-order friends are informative and can also help learn user preferences, especially considering that users may not have many direct connections with other users. It is valuable to find more relevant social friends to deal with the data sparsity problem in social networks. We therefore adopt trust propagation in our model to mine the informative knowledge hidden in high-order social relations. Specifically, social influence is considered in two aspects. Firstly, friends in different orders affect the learning of user preferences, and each order has a distinct bias towards preference learning. To the best of our knowledge, we are the first to take order bias into account when modelling high-order social influence. Secondly, friends in the same order have different importance for preference learning. We apply an attention mechanism to adaptively learn the importance of friends in the same order. Moreover, we propose a novel regularization term that formulates the relationship between domain-specific and cross-domain (common) knowledge to reduce the risk of model overfitting.

To summarize, the main contributions of this paper are as follows:


– We apply transfer learning to learn the common knowledge shared between the social and item domains, and leverage social propagation to take high-order social influence into account for better recommendation.

– We propose a new factor, 'order bias', to distinguish social influence in high orders from that in low orders. We also design a novel regularization term that formulates the relationship between domain-specific and cross-domain (common) knowledge and thus helps to avoid overfitting.

– We conduct extensive experiments on two real-world datasets, Ciao and Yelp, and demonstrate the effectiveness of our approach in ranking accuracy.

2 Related Work

Social-aware Recommendation. Most previous social-aware recommendation works are based on homophily and social influence theory, that is, users who are connected tend to have similar behavioral preferences, and people with similar behavioral preferences are more likely to establish connections. In a recommendation model, this means that a user's feature vector should be as close as possible to the feature vectors of similar users in the vector space. For example, [5] assumed that users are more likely to have seen items consumed by their friends, and extended BPR [2] by changing the negative sampling strategy. TrustSVD [6] argued that not only the user's explicit rating data and social relationships should be modeled, but the user's implicit behavior data and social relationships should also be considered. Therefore, implicit social information is introduced on top of the SVD++ [7] model. Recent research has used deep neural networks and achieved notable accuracy. For example, SAMN [8] leverages an attention mechanism to model both aspect- and friend-level differences for social-aware recommendation. However, these methods use only direct social connections and ignore high-order social relationships, which contain a wealth of information.

There are also some studies that consider trust propagation to obtain high-order information. DeepInf [9] models high-order structure to predict social influence. [10] proposed the DiffNet neural model with a layer-wise influence diffusion part to model how users' trusted friends recursively influence users' latent preferences. The follow-up work [11] jointly models the higher-order structure of the social network and the interest network. However, these methods need text or image information for data enhancement, which may limit their generality. Moreover, existing methods ignore how the order of a friend affects that friend's influence on the user.

Our work differs from the above studies in that the designed model uses an attention mechanism to adaptively aggregate the influence of different friends in each order. In addition, the influence of the order itself is modeled as an order bias, which adjusts a friend's influence depending on the friend's order.

Transfer Learning. Transfer learning deals with the situation where the data obtained from different resources are distributed differently. It assumes the existence of a common knowledge structure that defines the domain relatedness, and incorporates this structure in the learning process by discovering a shared latent


feature space in which the data distributions across domains are close to each other. [12] pointed out that parts of the source domain data are inconsistent with the target domain observations, which may affect the construction of the model in the target domain. Based on that, some researchers [3,13] designed selective latent factor transfer models to better capture the consistency and heterogeneity across domains for recommendation. However, in these works, the transfer ratio needs to be selected manually and cannot change dynamically in different scenarios.

There are also some studies that consider the adaptation issue in transfer learning. [14] proposed to adapt the transfer-all and transfer-none schemes by estimating the similarity between a source and a target task. [15] designed a completely heterogeneous transfer learning method to determine the different transferability of source knowledge. However, these methods mainly focus on task adaptation or domain adaptation. [4] proposed to adapt each user's two kinds of information (item interactions and social connections) with a finer granularity, which allows the shared knowledge of each user to be transferred in a personalized manner. [16] proposed a dual transfer learning-based model that improves recommendation performance across domains. Nevertheless, these methods still ignore the following two issues: 1) high-order information is very helpful for improving recommendation performance; 2) sparse data in the rating domain and the social domain can lead to overfitting.

Our method leverages high-order information for transfer learning. We also propose a novel regularization so that the user representation of the common knowledge can be used to reconstruct the user representations in the social and item domains, which reduces the risk of overfitting caused by the lack of data.

3 Our Proposed Model

3.1 Notations

Suppose we have a user set $U$ and an item set $V$, and let $M$ denote the number of users and $N$ the number of items. The symbols $u$ and $t$ denote two different users, and $v$ denotes an item. $F_u$ represents the friend set of user $u$. In social rating networks, users can form social connections with other users and interact with items, resulting in two matrices: the user-user social matrix and the user-item interaction matrix. The user-item interaction matrix $R = [r_{uv}]_{M \times N}$ is built from users' historical behaviors, where $r_{uv} = 1$ indicates that user $u$ has an observed interaction (e.g., a purchase or a click) with item $v$. Similarly, we define the user-user social matrix $X = [x_{ut}]_{M \times M}$ from the social network, where $x_{ut} = 1$ indicates that user $u$ trusts user $t$. We represent user $u$'s embedding in three parts, $c_u$, $s_u$, and $i_u$, where $c_u$ denotes the latent factors shared between the item domain and the social domain, i.e., the common knowledge, and $s_u$ and $i_u$ are the user latent factors specific to the social domain and the item domain, respectively. The purpose of item recommendation is to generate a ranked list of items that meet user $u$'s preferences.
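To make the notation concrete, the following minimal Python sketch builds $R$, $X$, and the friend sets $F_u$ for a toy dataset. The toy pairs, the dictionary `F`, and the helper `kth_order_friends` are illustrative assumptions and not part of the paper. In particular, treating the k-th order friends of a user as the users at shortest trust-path distance exactly k is only one plausible reading of the higher-order friend sets used by the propagation layer.

```python
# Toy data: 3 users and 4 items (all values are illustrative assumptions).
M, N = 3, 4
interactions = [(0, 1), (0, 3), (1, 0), (2, 3)]   # (user u, item v) pairs
trust = [(0, 1), (1, 2), (2, 0), (0, 2)]          # (user u, trusted user t) pairs

# R[u][v] = 1 if user u has an observed interaction with item v.
R = [[0] * N for _ in range(M)]
for u, v in interactions:
    R[u][v] = 1

# X[u][t] = 1 if user u trusts user t.
X = [[0] * M for _ in range(M)]
for u, t in trust:
    X[u][t] = 1

# F[u]: first-order friend set of user u, read off row u of X.
F = {u: {t for t in range(M) if X[u][t] == 1} for u in range(M)}


def kth_order_friends(F, u, k):
    """Users whose shortest trust-path distance from u is exactly k.

    This definition of a k-th order friend is an assumption made for
    illustration; the paper does not spell out the construction.
    """
    seen = {u}
    frontier = {u}
    for _ in range(k):
        frontier = {t for w in frontier for t in F[w]} - seen
        seen |= frontier
    return frontier


second_order_of_user_1 = kth_order_friends(F, 1, 2)
```

In this toy graph user 1 trusts only user 2, and user 2 trusts user 0, so user 0 is the single second-order friend of user 1.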


[Figure 2 image not reproduced. Recoverable labels: Propagation Layer with propagation blocks (ProB) for k = 1, 2, ..., K, each taking user embeddings (i, c, s) and an order bias; Prediction Layer with Attention, feature embeddings, Trust Predict and Rating Predict.]

Fig. 2. An overview of our PTLN model. 'ProB' represents the propagation block introduced in Figure 3.

3.2 Model Overview

The overall structure of our Propagation-aware Transfer Learning Network (PTLN) is illustrated in Figure 2. It takes three types of input: 1) the user embeddings of user $u$ and of $u$'s friends in each order, 2) the social user embeddings of $u$'s first-order friends, and 3) the item embeddings of the items that $u$ has interacted with. The outputs of our model are the predicted probability $\hat{r}_{uv}$ that user $u$ will like item $v$, and the predicted probability $\hat{x}_{ut}$ that user $u$ will trust another user $t$. The main architecture of PTLN contains two components: the propagation layer and the prediction layer.

The propagation layer propagates over the social network to incorporate the influence of high-order social friends, and then aggregates the social influence of friends in different orders. In addition, the order itself is modeled as an order bias, indicating the influence bias of general friends in a specific order. In the prediction layer, we adopt an attention mechanism that considers the domain relationships to better transfer the domain-specific knowledge and the shared knowledge for each task. Moreover, we adopt an efficient whole-data based training strategy [4] and add a novel regularization term to the loss function to optimize the model.

3.3 Propagation Layer

In this part, we aim to explore high-order social influence based on the idea that a user may share similar preferences with her friends. As shown in Figure 2, the propagation layer is constructed as a multi-block structure. The input of each block is the user embedding of the target user $u$ and those of $u$'s friends at this order. The output is a new user embedding that includes the influence of high-order friends. The new user embedding of every aspect is calculated in the same way within a propagation block, so we take the common knowledge aspect as an example to explain the details of the formulas. The new user embedding is learned in the following four steps:

1) Calculate Similarity Embedding. A user's social connections indirectly influence the user's preferences to different degrees. As discussed in the


Fig. 3. The details of the K-th propagation block in the common knowledge aspect.

introduction, the similarity between two connected users can be used as an essential basis for revealing the degree of influence. Thus we adopt an attention mechanism to assign non-uniform weights to the friends according to the similarity between the user and each friend. We first calculate the similarity embedding between user $u$ and her $k$-th order friend $t$ in the common knowledge aspect as follows:

$$\mathrm{sim}^{C}_{(u,t)} = c^{0}_{u} \odot c^{0}_{t} \tag{1}$$

where $\mathrm{sim}^{C}_{(u,t)} \in \mathbb{R}^{D_1}$ denotes the similarity embedding between user $u$ and her $k$-th order friend $t \in F^{k}_{u}$ in the common knowledge aspect. The superscript 0 indicates an initial embedding, $F^{k}_{u}$ represents the $k$-th order friend set of user $u$, and the operation $\odot$ denotes the element-wise product of vectors.

2) Calculate Attention Score. After obtaining the similarity embeddings of the $k$-th order friends, the attention is calculated with a trainable weight matrix $W \in \mathbb{R}^{D_1 \times 1}$. Each aspect has its own trainable weight matrix. The attention of the $k$-th order friend $t$ in the common knowledge aspect, $A^{*(C)}_{(u,t)}$, is defined as:

$$A^{*(C)}_{(u,t)} = W_{C}^{\top}\, \mathrm{sim}^{C}_{(u,t)} \tag{2}$$

where $W_C$ is the trainable weight matrix for the common knowledge aspect. Then we use the softmax function to normalize the friend's attention score:

$$A^{C}_{(u,t)} = \frac{\exp\big(A^{*(C)}_{(u,t)}\big)}{\sum_{z \in F^{k}_{u}} \exp\big(A^{*(C)}_{(u,z)}\big)} \tag{3}$$

where $A^{C}_{(u,t)}$ is the final attention of friend $t$, which indicates the degree of $t$'s influence on user $u$.

3) Aggregate Friends' Influence. We leverage the attention scores to aggregate the influence of the $k$-th order friends, so that the resulting friend influence embedding dynamically absorbs the influence of the user's friends at this order:

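$$f^{k}_{(C,u)} = \sum_{t \in F^{k}_{u}} A^{k}_{(u,t)}\, c^{0}_{t} \tag{4}$$

where $f^{k}_{(C,u)} \in \mathbb{R}^{D_1}$ represents user $u$'s friend influence embedding at the $k$-th order, and $A^{k}_{(u,t)}$ is the normalized attention score of Eq. (3) for the $k$-th order friend $t$.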
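Steps 1) to 3) can be sketched for a single aspect and a single order as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the function name `friend_influence`, the list-based embeddings, and the toy values are hypothetical, and $W_C$ is stored as a plain vector of length $D_1$.

```python
import math


def friend_influence(c_u, friends, w):
    """Attention-weighted sum of friend embeddings (Eqs. 1-4), one aspect, one order.

    c_u:     initial embedding of user u (length D1)
    friends: initial embeddings of u's k-th order friends
    w:       trainable weights of length D1 (plays the role of W_C in Eq. 2)
    """
    if not friends:
        # No friend at this order: no attention weights and a zero influence vector.
        return [], [0.0] * len(c_u)
    # Eq. (1) element-wise product, then Eq. (2) projection to a scalar score.
    raw = [sum(wi * a * b for wi, a, b in zip(w, c_u, c_t)) for c_t in friends]
    # Eq. (3): softmax over the friends (shifted by the max for numerical stability).
    top = max(raw)
    exps = [math.exp(s - top) for s in raw]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Eq. (4): attention-weighted sum of the friends' initial embeddings.
    f = [sum(a * c_t[d] for a, c_t in zip(attn, friends)) for d in range(len(c_u))]
    return attn, f


# Hypothetical toy values: user u and two friends with D1 = 2.
attn, f = friend_influence([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0])
```

In this toy call the first friend has the larger element-wise overlap with the user, so it receives the larger attention weight.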


4) Update User Embedding. When generating the friend influence embedding, we consider only the similarity between the preferences of the user and the friend, and ignore the influence of the friend's order discussed in the introduction. Therefore, we propose the concept of order bias to model the influence bias of general friends in a specific order. The order bias dynamically adapts the friend influence according to the order. With the friend influence embedding and the order bias, the user embedding is updated as follows:

$$c^{k}_{u} = c^{0}_{u} + f^{k}_{(C,u)} + o^{k} \tag{5}$$

The generated embedding $c^{k}_{u}$ is the new user embedding in the $k$-th order, and $o^{k} \in \mathbb{R}^{D_1}$ is the order bias of the $k$-th order.

After propagating $K$ times, we obtain $K$ new user embeddings, from the first order to the $K$-th order. We combine the new user embeddings of all orders with the initial user embedding to generate the final user embedding $c_u$ as follows:

$$c_{u} = \sum_{k} c^{k}_{u} \tag{6}$$
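The update and the final combination can be illustrated with a short sketch. The function name `propagate` and the toy numbers are hypothetical. Because the text states that the initial embedding is combined with the per-order embeddings, the sketch starts the sum of Eq. (6) at k = 0; that index range is an assumption, since the equation itself only writes a sum over k.

```python
def propagate(c0, influences, order_bias):
    """Per-order update (Eq. 5) followed by the sum over orders (Eq. 6).

    c0:         initial embedding c_u^0 (length D1)
    influences: influences[k-1] is the friend influence f^k for k = 1..K
    order_bias: order_bias[k-1] is the order bias o^k for k = 1..K
    """
    per_order = [
        [c + f + o for c, f, o in zip(c0, f_k, o_k)]  # Eq. (5)
        for f_k, o_k in zip(influences, order_bias)
    ]
    final = list(c0)  # k = 0 term: the initial embedding (assumed to be included)
    for c_k in per_order:
        final = [a + b for a, b in zip(final, c_k)]   # Eq. (6)
    return per_order, final


# Hypothetical toy values with D1 = 2 and K = 2.
per_order, c_u = propagate(
    c0=[1.0, 2.0],
    influences=[[0.5, 0.5], [0.1, 0.2]],
    order_bias=[[0.0, 0.1], [0.2, 0.0]],
)
```

With these values the first-order embedding is approximately [1.5, 2.6], the second-order embedding approximately [1.3, 2.2], and the final embedding approximately [3.8, 6.8].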

3.4 Prediction Layer

A transfer learning framework can transfer the shared knowledge from the source domain to the target domain, which makes it a promising way of using cross-domain data. [3] points out that the degree of relationship between domains varies from user to user. Thus, we apply an attention mechanism that combines the domain-specific knowledge and the common knowledge to better learn the feature embeddings that represent the social domain preference and the item domain preference. For a user whose two domains are less related, the shared knowledge ($c$) will be penalized and the attention network will learn to utilize more domain-specific knowledge ($s$ or $i$) instead. Formally, the item domain attention and the social domain attention are defined as:

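$$\alpha^{*}_{(C,u)} = h_{\alpha}^{\top} \delta(W_{\alpha} c_{u} + b_{\alpha}); \qquad \alpha^{*}_{(I,u)} = h_{\alpha}^{\top} \delta(W_{\alpha} i_{u} + b_{\alpha}) \tag{7}$$

$$\beta^{*}_{(C,u)} = h_{\beta}^{\top} \delta(W_{\beta} c_{u} + b_{\alpha}); \qquad \beta^{*}_{(S,u)} = h_{\beta}^{\top} \delta(W_{\beta} s_{u} + b_{\alpha}) \tag{8}$$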

The weight matrices $W \in \mathbb{R}^{D_1 \times D_2}$, the vectors $h \in \mathbb{R}^{D_1}$, and the bias units $b$ serve as parameters of the two-layer attention network. The subscripts $\alpha$ and $\beta$ refer to the item domain and the social domain, respectively. $D_2$ denotes the dimension of the attention network, and $\delta$ is the nonlinear activation function ReLU.

Then, the final attention scores are normalized with a softmax function:

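$$\alpha_{(C,u)} = \frac{\exp(\alpha^{*}_{(C,u)})}{\exp(\alpha^{*}_{(C,u)}) + \exp(\alpha^{*}_{(I,u)})} = 1 - \alpha_{(I,u)}; \qquad \beta_{(C,u)} = \frac{\exp(\beta^{*}_{(C,u)})}{\exp(\beta^{*}_{(C,u)}) + \exp(\beta^{*}_{(S,u)})} = 1 - \beta_{(S,u)} \tag{9}$$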

$\alpha_{(C,u)}$ and $\beta_{(C,u)}$ denote the weights of the common knowledge $c$ for the item domain and the social domain, respectively, which determine how much to transfer in each


domain. After obtaining the above attention weights, the feature embeddings of user $u$ for the two domains are calculated as follows:

$$p^{I}_{u} = \alpha_{(I,u)}\, i_{u} + \alpha_{(C,u)}\, c_{u}; \qquad p^{S}_{u} = \beta_{(S,u)}\, s_{u} + \beta_{(C,u)}\, c_{u} \tag{10}$$

The two generated feature embeddings $p^{I}_{u}$ and $p^{S}_{u}$ represent the user's preferences for items and for other users after transferring the shared knowledge between the two domains.
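The gating between domain-specific and common knowledge in Eqs. (7), (9), and (10) can be sketched for one domain as follows. This is a hedged illustration rather than the authors' code: the helper names are hypothetical, and the sketch stores $W$ as $D_2$ rows of length $D_1$ with $h$ and $b$ of length $D_2$ so that the matrix-vector product is well defined, which is a layout assumed here for convenience.

```python
import math


def relu(x):
    return [max(0.0, v) for v in x]


def attention_score(h, W, b, x):
    """Two-layer attention score h^T ReLU(W x + b) for one embedding x."""
    hidden = relu([sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)])
    return sum(hi * zi for hi, zi in zip(h, hidden))


def domain_feature(specific, common, h, W, b):
    """Blend a domain-specific and a common embedding for one domain.

    Returns the weight of the common knowledge (Eq. 9) and the feature
    embedding p (Eq. 10).
    """
    s_common = attention_score(h, W, b, common)      # Eq. (7), common part
    s_specific = attention_score(h, W, b, specific)  # Eq. (7), specific part
    top = max(s_common, s_specific)                  # shift for numerical stability
    e_c, e_s = math.exp(s_common - top), math.exp(s_specific - top)
    w_common = e_c / (e_c + e_s)
    w_specific = 1.0 - w_common
    p = [w_specific * a + w_common * c for a, c in zip(specific, common)]
    return w_common, p


# Hypothetical toy parameters with D1 = D2 = 2.
w_common, p_item = domain_feature(
    specific=[1.0, 0.0], common=[0.0, 1.0],
    h=[1.0, 2.0], W=[[1.0, 0.0], [0.0, 1.0]], b=[0.0, 0.0],
)
```

Because the two weights always sum to one, a user whose common embedding scores low under the attention network automatically relies more on the domain-specific embedding.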

To predict the scores of each item and user, we adopt a neural form of matrix factorization (MF) [17] that utilizes the user's feature embedding. A specific output layer is employed for each task. The scores of user $u$ for item $v$ and for user $t$ are calculated as follows:

$$\hat{r}_{uv} = W_{I}\,(p^{I}_{u} \odot q_{v}); \qquad \hat{x}_{ut} = W_{S}\,(p^{S}_{u} \odot g_{t}) \tag{11}$$

$q_{v}$ and $g_{t}$ denote the latent factor vectors of item $v$ and of user $t$ as a friend, respectively. The operation $\odot$ denotes the element-wise product of vectors.
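As a small illustration of Eq. (11), the sketch below scores items for one user and keeps the top-n. The names `score` and `top_n`, the dictionary of item factors, and the toy values are assumptions made for this example; the same scoring function applies to the social task when the item factors are replaced by friend factors.

```python
def score(w_out, p_u, factor):
    """Eq. (11): output-layer weights applied to an element-wise product."""
    return sum(w * a * b for w, a, b in zip(w_out, p_u, factor))


def top_n(w_out, p_u, item_factors, n):
    """Rank item ids by predicted score, highest first, and keep the first n."""
    scores = {v: score(w_out, p_u, q_v) for v, q_v in item_factors.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical toy values with D1 = 2 and three items.
items = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
ranking = top_n([1.0, 1.0], [1.0, 0.5], items, n=2)
```

Here the three items receive scores 1.0, 0.5, and 1.5, so the recommended list starts with item 2 followed by item 0.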

A whole-data based strategy leverages the full data and therefore potentially offers better coverage. Thus, we adopt an efficient whole-data training strategy [4] to optimize our model. The loss functions of the two tasks are defined as follows:

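$$\tilde{L}_{I}(\Theta) = \sum_{i=1}^{D_1} \sum_{j=1}^{D_1} \left( (h_{I,i}\, h_{I,j}) \Big( \sum_{u \in B} p^{I}_{u,i}\, p^{I}_{u,j} \Big) \Big( \sum_{v \in V} c^{I-}_{v}\, q_{v,i}\, q_{v,j} \Big) \right) + \sum_{u \in B} \sum_{v \in V^{+}} \left( (1 - c^{I-}_{v})\, \hat{r}_{uv}^{2} - 2\, \hat{r}_{uv} \right) \tag{12}$$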

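$$\tilde{L}_{S}(\Theta) = \sum_{i=1}^{D_1} \sum_{j=1}^{D_1} \left( (h_{S,i}\, h_{S,j}) \Big( \sum_{u \in B} p^{S}_{u,i}\, p^{S}_{u,j} \Big) \Big( \sum_{t \in U} c^{S-}_{t}\, g_{t,i}\, g_{t,j} \Big) \right) + \sum_{u \in B} \sum_{t \in U^{+}} \left( (1 - c^{S-}_{t})\, \hat{x}_{ut}^{2} - 2\, \hat{x}_{ut} \right) \tag{13}$$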

The subscripts $I$ and $S$ refer to the item domain and the social domain. $D_1$ is the number of latent factors. The scalars $h$, $p$, $q$, $g$ denote elements of their corresponding vectors $\mathbf{h}$, $\mathbf{p}$, $\mathbf{q}$, $\mathbf{g}$, and $i$ and $j$ index the elements of a vector. $V^{+}$ and $U^{+}$ denote the items that the user has interacted with and the friends that the user is directly connected to, respectively. $B$ is a batch of users. $c^{I-}_{v}$ and $c^{S-}_{t}$ are the weights of negative instances in the two domains.
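The structure of Eq. (12) is easier to see next to a naive loss. The sketch below compares a weighted squared error over every (user, item) pair, with weight 1 for observed pairs and $c^{I-}_{v}$ for unobserved ones, against the factorized form. The two differ only by a constant equal to the number of observed pairs, which does not affect optimization. All names and the random toy data are illustrative assumptions, and `h` stands for the output-layer weights that appear as $h_I$ in Eq. (12).

```python
import random


def predict(h, p_u, q_v):
    return sum(hi * a * b for hi, a, b in zip(h, p_u, q_v))


def naive_loss(h, P, Q, positives, c_neg):
    """Weighted squared error over every (user, item) pair; positives have weight 1."""
    total = 0.0
    for u, p_u in enumerate(P):
        for v, q_v in enumerate(Q):
            r_hat = predict(h, p_u, q_v)
            if (u, v) in positives:
                total += (1.0 - r_hat) ** 2
            else:
                total += c_neg[v] * r_hat ** 2
    return total


def efficient_loss(h, P, Q, positives, c_neg):
    """Eq. (12): factorized all-pairs term plus a correction over observed pairs only."""
    D = len(h)
    total = 0.0
    for i in range(D):
        for j in range(D):
            user_part = sum(p[i] * p[j] for p in P)
            item_part = sum(c * q[i] * q[j] for c, q in zip(c_neg, Q))
            total += h[i] * h[j] * user_part * item_part
    for u, v in positives:
        r_hat = predict(h, P[u], Q[v])
        total += (1.0 - c_neg[v]) * r_hat ** 2 - 2.0 * r_hat
    return total


random.seed(0)
D, n_users, n_items = 3, 4, 5
h = [random.uniform(-1, 1) for _ in range(D)]
P = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(n_users)]
Q = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(n_items)]
c_neg = [random.uniform(0.1, 0.5) for _ in range(n_items)]
positives = {(0, 1), (1, 3), (2, 0), (3, 4), (0, 4)}

gap = naive_loss(h, P, Q, positives, c_neg) - efficient_loss(h, P, Q, positives, c_neg)
```

The factorized term never enumerates user-item pairs, so its cost depends on the embedding size and on the numbers of users and items separately rather than on their product; only the observed pairs are visited one by one.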

Both rating and social information are very sparse, which can lead to overfitting. We assume that there is an implicit correlation between the common knowledge and the domain-specific knowledge. This assumption motivates us to propose a novel regularization term to counter the overfitting problem:

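$$\tilde{L}_{Reg}(\Theta) = \sum_{k} \left( \lVert i^{k} - \theta^{k}_{\alpha} c^{k} \rVert^{2} + \lVert s^{k} - \theta^{k}_{\beta} c^{k} \rVert^{2} \right) \tag{14}$$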

where $\theta$ represents the weight of the common knowledge $c$, and the subscripts $\alpha$ and $\beta$ refer to the item domain and the social domain, respectively.

Finally, we integrate the losses of both sub-tasks and the novel regularization term into an overall objective function as follows:

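$$L(\Theta) = \tilde{L}_{I}(\Theta) + \lambda_{1} \tilde{L}_{S}(\Theta) + \lambda_{2} \tilde{L}_{Reg}(\Theta) + \lambda_{3} \lVert \Theta \rVert^{2} \tag{15}$$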


$\Theta$ represents the parameters of our model. $\lambda_{1}$, $\lambda_{2}$, and $\lambda_{3}$ are hyperparameters that adjust the weight of each term.
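A minimal sketch of how the regularizer of Eq. (14) and the overall objective of Eq. (15) fit together is given below. It rests on assumptions that the text does not settle: each $\theta$ is treated as a scalar, embeddings are plain lists, and the function names and toy values are hypothetical.

```python
def sq_norm(x):
    return sum(v * v for v in x)


def correlative_reg(i_list, s_list, c_list, theta_alpha, theta_beta):
    """Eq. (14): distance between domain-specific and scaled common embeddings.

    Each theta is treated as a scalar here, which is an assumption.
    """
    total = 0.0
    for i_k, s_k, c_k, ta, tb in zip(i_list, s_list, c_list, theta_alpha, theta_beta):
        total += sq_norm([a - ta * c for a, c in zip(i_k, c_k)])
        total += sq_norm([a - tb * c for a, c in zip(s_k, c_k)])
    return total


def total_objective(loss_item, loss_social, loss_reg, params, lam1, lam2, lam3):
    """Eq. (15): weighted sum of both task losses, the regularizer, and an L2 penalty."""
    l2 = sum(sq_norm(p) for p in params)
    return loss_item + lam1 * loss_social + lam2 * loss_reg + lam3 * l2


# Hypothetical toy values with a single index k and D1 = 2.
reg = correlative_reg(
    i_list=[[1.0, 2.0]], s_list=[[0.0, 1.0]], c_list=[[1.0, 1.0]],
    theta_alpha=[1.0], theta_beta=[0.5],
)
objective = total_objective(2.0, 1.0, reg, params=[[1.0, 2.0]], lam1=0.5, lam2=0.1, lam3=0.01)
```

For these numbers the regularizer evaluates to 1.5 and the objective to approximately 2.7.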

4 Experiments

Table 1. Performance of all compared methods on the Ciao and Yelp datasets. The last column "Avg Imp" indicates the average improvement of PTLN over the corresponding baseline. N indicates the top-N task.

Ciao

Method  Pre@5   Pre@10  Pre@15  Recall@5  Recall@10  Recall@15  NDCG@5  NDCG@10  NDCG@15  MRR@5   MRR@10  MRR@15  Avg Imp
BPR     0.0208  0.017   0.0141  0.0272    0.0496     0.0631     0.0289  0.036    0.0402   0.0479  0.0538  0.056   60.84%
NCF     0.0217  0.0176  0.0149  0.0392    0.057      0.0721     0.0294  0.0385   0.0441   0.0508  0.057   0.0596  45.92%
SAMN    0.0266  0.0225  0.0195  0.0482    0.0743     0.0959     0.0405  0.0506   0.0575   0.0562  0.0632  0.0653  17.47%
EATNN   0.0295  0.0233  0.0195  0.0528    0.0763     0.094      0.0454  0.054    0.0598   0.071   0.0787  0.0816  6.53%
PTLN    0.0307  0.0244  0.0203  0.0571    0.0818     0.1006     0.0494  0.0585   0.0646   0.0755  0.083   0.0866  -

Yelp

Method  Pre@5   Pre@10  Pre@15  Recall@5  Recall@10  Recall@15  NDCG@5  NDCG@10  NDCG@15  MRR@5   MRR@10  MRR@15  Avg Imp
BPR     0.0349  0.0282  0.0235  0.0317    0.0497     0.0601     0.0507  0.0537   0.0548   0.0832  0.092   0.0949  16.03%
NCF     0.0337  0.0292  0.0266  0.0503    0.0626     0.0711     0.0429  0.0465   0.0487   0.0721  0.081   0.0843  24.74%
SAMN    0.0333  0.0283  0.0276  0.0496    0.0568     0.0695     0.045   0.047    0.0511   0.0745  0.0822  0.0865  12.96%
EATNN   0.0327  0.029   0.0266  0.0507    0.0579     0.0666     0.0462  0.049    0.0516   0.0749  0.0835  0.087   12.18%
PTLN    0.0356  0.0307  0.0274  0.0558    0.0647     0.0714     0.0543  0.057    0.0587   0.0889  0.0978  0.1009  -

4.1 Experimental Settings

Dataset. We experimented with two public datasets: Ciao [4] and Yelp [18]. Ciao provides a large amount of rating information and social information, while on Yelp users make friends with others and express their experience through reviews and ratings. The two datasets were constructed following previous work [4,18]. Each dataset contains users' ratings of the items they have interacted with and the social connections between users. To address the Top-N recommendation task, we remove all ratings lower than 4 in both datasets and keep the remaining ratings as positive feedback with a score of 1. This preprocessing aims at recommending lists of items that users liked, and is widely used in existing works [4,8,19].
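The rating preprocessing described above amounts to a simple filter. The sketch below assumes ratings arrive as (user, item, rating) triples; the function name `binarize` and the toy records are hypothetical.

```python
def binarize(ratings, threshold=4):
    """Keep (user, item) pairs rated at least `threshold` and map them to feedback 1."""
    return {(u, v): 1 for u, v, r in ratings if r >= threshold}


# Hypothetical toy ratings on a 1-5 scale.
raw = [("u1", "a", 5), ("u1", "b", 3), ("u2", "a", 4), ("u2", "c", 1)]
implicit = binarize(raw)
```

Only the two ratings of 4 or higher survive, and each is stored as an implicit interaction with value 1.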

Baselines. To evaluate the performance of Top-N recommendation, we compare our PTLN with the following methods. BPR [2]: a classic and widely used ranking algorithm for recommendation, which learns pairwise relations between rated and unrated items for each user rather than directly learning to predict ratings. NCF [17]: a neural CF model that combines an element-wise product with hidden layers over the concatenation of user and item embeddings to capture their high-order interactions. SAMN [8]: a state-of-the-art deep learning method that leverages an attention mechanism to model both aspect- and friend-level differences for


social-aware recommendation. EATNN [4]: a state-of-the-art method that uses attention mechanisms to adaptively capture the interplay between the item domain and the social domain for each user.

Evaluation Metrics. We adopt four popular metrics for evaluation: Precision, Recall, NDCG (Normalized Discounted Cumulative Gain), and MRR (Mean Reciprocal Rank). Specifically, NDCG is a position-aware ranking metric that assigns a higher score to hits at higher positions. MRR considers the ranking position of the first correct item in the recommended list. The higher the value of these metrics, the better the performance of the recommender system.
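For reference, the four metrics can be computed per user as in the sketch below and then averaged over all test users. These are the common binary-relevance definitions; the paper does not give its exact formulas, so details such as the NDCG normalization are assumptions, and the sketch expects a non-empty set of relevant items.

```python
import math


def precision_at_n(ranked, relevant, n):
    return sum(1 for v in ranked[:n] if v in relevant) / n


def recall_at_n(ranked, relevant, n):
    return sum(1 for v in ranked[:n] if v in relevant) / len(relevant)


def ndcg_at_n(ranked, relevant, n):
    # Discounted gain of the hits, normalized by the best achievable ordering.
    dcg = sum(1.0 / math.log2(pos + 2) for pos, v in enumerate(ranked[:n]) if v in relevant)
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(min(n, len(relevant))))
    return dcg / ideal


def mrr_at_n(ranked, relevant, n):
    # Reciprocal rank of the first hit within the top n, or 0 if there is none.
    for pos, v in enumerate(ranked[:n]):
        if v in relevant:
            return 1.0 / (pos + 1)
    return 0.0


# Hypothetical example: a top-5 list and three held-out relevant items.
ranked = ["a", "b", "c", "d", "e"]
relevant = {"b", "e", "z"}
metrics = {
    "precision": precision_at_n(ranked, relevant, 5),
    "recall": recall_at_n(ranked, relevant, 5),
    "ndcg": ndcg_at_n(ranked, relevant, 5),
    "mrr": mrr_at_n(ranked, relevant, 5),
}
```

In this example two of the three relevant items are retrieved and the first hit is at position 2, so Precision@5 is 0.4, Recall@5 is 2/3, and MRR@5 is 0.5.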

Parameter Setting. The parameters of all baseline methods were initialized as in the corresponding papers and were then carefully tuned to achieve optimal performance. The learning rate of all models was tuned among [0.0005, 0.0001, 0.005, 0.001, 0.05, 0.01]. To prevent overfitting, we tuned the dropout ratio in [0.5, 0.7, 0.9]. The batch size was tested in [16, 32, 64, 128, 256], and the embedding size $D_1$ and the dimension of the attention network $D_2$ were tested in [32, 64, 128, 256]. For our PTLN model, $D_1$ and $D_2$ were set to 128 and 32 on Ciao, and to 64 and 32 on Yelp. The learning rate was set to 0.0005 on Yelp and 0.01 on Ciao. The dropout ratio $\rho$ was set to 0.7 on both datasets.
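The search ranges and the reported PTLN settings can be written down as a small configuration, as sketched below. The dictionary layout and key names are assumptions made for illustration, and the text does not say whether the ranges were searched exhaustively; the grid is enumerated here only to show the size of the full search space.

```python
from itertools import product

# Candidate values listed in the text.
search_space = {
    "learning_rate": [0.0005, 0.0001, 0.005, 0.001, 0.05, 0.01],
    "dropout": [0.5, 0.7, 0.9],
    "batch_size": [16, 32, 64, 128, 256],
    "D1": [32, 64, 128, 256],
    "D2": [32, 64, 128, 256],
}

# Settings reported for PTLN. The chosen batch size is not reported, so it is omitted.
ptln_settings = {
    "Ciao": {"D1": 128, "D2": 32, "learning_rate": 0.01, "dropout": 0.7},
    "Yelp": {"D1": 64, "D2": 32, "learning_rate": 0.0005, "dropout": 0.7},
}

# Every combination of the candidate values (exhaustive search is an assumption).
grid = [dict(zip(search_space, combo)) for combo in product(*search_space.values())]
```

The full grid contains 6 x 3 x 5 x 4 x 4 = 1440 configurations per model and dataset.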

4.2 Performance Comparison

We investigate the Top-N performance with N set to 5, 10, and 15, in line with real recommendation scenarios. From the results in Table 1, we make the following observations:

Fig. 4. Performance of different PTLN variants on Ciao and Yelp.

1. Methods incorporating social information generally perform better than non-social methods. On Ciao, SAMN, EATNN, and PTLN outperform BPR and NCF on every metric, whereas on Yelp BPR remains competitive with SAMN and EATNN on NDCG and MRR. This result is broadly consistent with previous work, which indicates that social information reflects users' interests and is helpful for recommendation.

2. Our method PTLN achieves the best performance on the two datasets in almost all settings (the only exception in Table 1 is Precision at N=15 on Yelp, where SAMN is slightly higher). Specifically, compared to EATNN, the best baseline, which uses attention mechanisms to adaptively capture the interplay between the item domain and the social domain for each user, PTLN improves by about 6.53% on Ciao and 12.18% on Yelp on average. The improvement of our model over the baselines could be attributed to two reasons: 1) our model considers the propagation of social domain knowledge, item domain knowledge, and common knowledge, which allows the latent factors to be modeled with a finer granularity; 2) we consider the difference in friends' influence and the order bias.
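The average improvements quoted above can be reproduced from Table 1 under one plausible reading of "Avg Imp": the mean, over the twelve metric columns, of the relative improvement of PTLN over the baseline. The table caption does not state the formula, so this definition is an assumption; it does reproduce the reported 6.53% and 12.18% for EATNN.

```python
# Table 1 rows in column order: Pre@5/10/15, Recall@5/10/15, NDCG@5/10/15, MRR@5/10/15.
ciao = {
    "EATNN": [0.0295, 0.0233, 0.0195, 0.0528, 0.0763, 0.094,
              0.0454, 0.054, 0.0598, 0.071, 0.0787, 0.0816],
    "PTLN": [0.0307, 0.0244, 0.0203, 0.0571, 0.0818, 0.1006,
             0.0494, 0.0585, 0.0646, 0.0755, 0.083, 0.0866],
}
yelp = {
    "EATNN": [0.0327, 0.029, 0.0266, 0.0507, 0.0579, 0.0666,
              0.0462, 0.049, 0.0516, 0.0749, 0.0835, 0.087],
    "PTLN": [0.0356, 0.0307, 0.0274, 0.0558, 0.0647, 0.0714,
             0.0543, 0.057, 0.0587, 0.0889, 0.0978, 0.1009],
}


def avg_improvement(ours, baseline):
    """Mean relative improvement, in percent, across all metric columns."""
    gains = [(o - b) / b for o, b in zip(ours, baseline)]
    return 100.0 * sum(gains) / len(gains)


ciao_imp = avg_improvement(ciao["PTLN"], ciao["EATNN"])
yelp_imp = avg_improvement(yelp["PTLN"], yelp["EATNN"])
```

Rounded to two decimals, the two values agree with the last column of Table 1 for the EATNN rows.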

Table 2. Performance with different propagation depth K on Ciao and Yelp.

Dataset  K  Pre@10  Recall@10  NDCG@10  MRR@10
Ciao     1  0.0242  0.0802     0.0559   0.0791
Ciao     2  0.0244  0.0818     0.0585   0.0835
Ciao     3  0.0242  0.0799     0.0557   0.0798
Yelp     1  0.0306  0.0619     0.0559   0.0966
Yelp     2  0.0307  0.0647     0.057    0.0978
Yelp     3  0.0301  0.0617     0.0558   0.0973

Analysis of the propagation depth K. The number of propagation layers K reflects the extent to which the model uses social information and the degree to which social information influences the model. Table 2 shows the results for different values of K on both datasets. When K increases from 1 to 2, the performance increases, while the performance drops when K=3. We empirically conclude that a depth of two is sufficient for social recommendation.

4.3 Ablation Study

Impact of the order bias. A key characteristic of our proposed model is the order bias, which captures the general effect of the order of a user's friends. PTLN-O denotes a variant of PTLN without the order bias. Figure 4 shows that the order bias improves performance markedly. We speculate that a possible reason is that the order bias can dynamically adjust the output after fusion, so that the updated user embedding of each order better reflects the user's preferences after being influenced by friends.

Impact of the attention mechanism. Another critical characteristic of our proposed model is that it considers the diversity of friends' influence through an attention mechanism. PTLN-A directly aggregates the friends' influence and the user's embedding without any attention learning process. From Figure 4, we can see that our model achieves a notable improvement on the Ciao dataset when the difference in friends' influence is considered. However, the improvement on the Yelp dataset is not as large as on Ciao. This observation implies that the usefulness of modeling the importance of different elements varies across datasets, and that our proposed friend-level attention can adapt to the requirements of different datasets.


Impact of the novel regularization. To evaluate the effectiveness of the proposed correlative regularization, we compare PTLN with PTLN-R, a variant of PTLN without the novel regularization, in Figure 4. The PTLN model performs better than PTLN-R, indicating that our novel regularization makes the algorithm more stable.

5 Conclusions

In this paper, we present a novel social-aware recommendation model, PTLN, to address the data sparsity problem. The core component of our model is the propagation layer, which learns the user embedding of each order by leveraging high-order information from the social domain, the item domain, and the common knowledge between the two domains. An attention mechanism and the concept of order bias are further employed to better distinguish the influence of different friends. The proposed PTLN outperforms state-of-the-art recommendation models on different evaluation metrics, especially on the dataset with complicated social relationships and fewer item interactions, which supports our hypothesis about the varying degrees of different friends' influence.

References

1. Yifan Hu, Yehuda Koren, and Chris Volinsky. Collaborative filtering for implicit feedback datasets. In 2008 Eighth IEEE International Conference on Data Mining, pages 263–272. IEEE, 2008.

2. Steffen Rendle et al. BPR: Bayesian personalized ranking from implicit feedback. arXiv preprint arXiv:1205.2618, 2012.

3. Lin Xiao, Zhang Min, Zhang Yongfeng, Liu Yiqun, and Shaoping Ma. Learning and transferring social and item visibilities for personalized recommendation. In CIKM 2017, pages 337–346, 2017.

4. Chong Chen, Min Zhang, Chenyang Wang, Weizhi Ma, Minming Li, Yiqun Liu, and Shaoping Ma. An efficient adaptive transfer neural network for social-aware recommendation. In SIGIR 2019, pages 225–234, 2019.

5. Tong Zhao et al. Leveraging social connections to improve personalized ranking for collaborative filtering. In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, pages 261–270, 2014.

6. Guibing Guo et al. TrustSVD: Collaborative filtering with both the explicit and implicit influence of user trust and of item ratings. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 29, 2015.

7. Yehuda Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 426–434, 2008.

8. Chong Chen et al. Social attentional memory network: Modeling aspect- and friend-level differences in recommendation. In WSDM 2019, pages 177–185, 2019.

9. Jiezhong Qiu et al. DeepInf: Social influence prediction with deep learning. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2110–2119, 2018.



10. Le Wu, Peijie Sun, Yanjie Fu, Richang Hong, Xiting Wang, and Meng Wang. A neural influence diffusion model for social recommendation. In SIGIR 2019.

11. Le Wu et al. DiffNet++: A neural influence and interest diffusion network for social recommendation. arXiv preprint arXiv:2002.00844, 2020.

12. Eric Eaton et al. Selective transfer between learning tasks using task-based boosting. In Proceedings of the AAAI Conference on Artificial Intelligence, 2011.

13. Zhongqi Lu et al. Selective transfer learning for cross domain recommendation. In Proceedings of the 2013 SIAM International Conference on Data Mining, pages 641–649. SIAM, 2013.

14. Bin Cao, Sinno Jialin Pan, Yu Zhang, Dit-Yan Yeung, and Qiang Yang. Adaptive transfer learning. In AAAI, volume 2, page 7, 2010.

15. Seungwhan Moon and Jaime G. Carbonell. Completely heterogeneous transfer learning with attention: what and what not to transfer. In IJCAI, volume 1, 2017.

16. Pan Li and Alexander Tuzhilin. DDTCDR: Deep dual transfer cross domain recommendation. In Proceedings of the 13th International Conference on Web Search and Data Mining, pages 331–339, 2020.

17. Xiangnan He et al. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, pages 173–182, 2017.

18. Chuan Shi et al. Semantic path based personalized recommendation on weighted heterogeneous information networks. In CIKM 2015, pages 453–462, 2015.

19. Yao Wu, Christopher DuBois, Alice X. Zheng, and Martin Ester. Collaborative denoising auto-encoders for top-n recommender systems. In WSDM 2016.
