Link Recommendation in P2P Social Networks
Yusuf Aytaş, Hakan Ferhatosmanoğlu, Özgür Ulusoy
Bilkent University, Ankara, Turkey
VLDB WOSS 2012
Outline
• Introduction
• Motivation for P2P Social Networks
• Link Recommendation
• P2P Top-k Common Neighbor
• Experiments
• Discussion
• Future Work
Introduction
• Social networks are mostly based on centralized infrastructure (“fat server, thin client”).
• However, P2P infrastructure is a natural alternative for social networks.
• There are problems with centralized infrastructure.
Problems with Centralized Systems
• Privacy: Social network providers can misuse users’ data.
• Censorship: Social network providers can censor what users share.
• Scalability: Data can be distributed over the network.
• These problems can be avoided in P2P social networks.
Advantages of P2P Systems
• Data can be maintained by the peers themselves, with no need for another computer.
• Each user can define their own level of privacy.
• Misuse of both linkage data and user data is prevented.
• Accordingly, a significant amount of research is needed on algorithms and systems for P2P social networks.
P2P Social Network Challenges
• Algorithm Perspective
– Distributed graph algorithms
– P2P Performance
• Systems Perspective
– Storage
– Robustness
– Security
• SOWHOO: Our open source implementation
» https://github.com/yusufaytas/sowhoo
Social Network Algorithms on P2P Environment
• In a P2P Social Network, peers have limited information about the network.
• Known algorithms such as link prediction, community detection, and information diffusion should be revisited.
• The efficiency of the overlay network should be taken into account as well as algorithm accuracy.
• In this context, we propose a new approach, “Link Recommendation”.
Problem Background
• Common Neighbor: A node is more likely to interact with another node if the number of their shared neighbors is high.
• Top-K Query Processing: Finding the k objects that have the highest scores.
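Sorted list S1:
Id   S1
a    0.9
d    0.85
e    0.83
h    0.75
.    .

Sorted list S2:
Id   S2
e    0.96
f    0.84
b    0.83
d    0.56
.    .

Combined scores:
Id   S1     S2
a    0.9    -
e    0.83   0.96
d    0.85   0.56
f    -      0.84
b    -      0.83
h    0.75   -

The slide also shows the values 0.23, 0.34, 0.41, and 0.27, which appear to belong to the cells marked "-"; their exact placement is not recoverable.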
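As a minimal sketch of the two ideas above, the Python below scores candidate nodes by their number of shared neighbors and keeps the k best. The graph, the node names, and the helper `top_k_common_neighbors` are hypothetical illustrations and are not taken from the slides; this is the plain centralized baseline, not the authors' P2P algorithm.

```python
import heapq

def top_k_common_neighbors(graph, node, k):
    """Return up to k non-neighbors of `node`, ranked by shared-neighbor count."""
    neighbors = graph[node]
    scores = {}
    for nb in neighbors:
        for candidate in graph[nb]:
            # Skip the node itself and nodes it is already linked to.
            if candidate == node or candidate in neighbors:
                continue
            scores[candidate] = scores.get(candidate, 0) + 1
    # Highest count first; ties are broken by node id for determinism.
    return heapq.nsmallest(k, scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical undirected graph stored as adjacency sets.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "E", "F"},
    "C": {"A", "E"},
    "D": {"A", "E", "F"},
    "E": {"B", "C", "D"},
    "F": {"B", "D"},
}

print(top_k_common_neighbors(graph, "A", 2))  # [('E', 3), ('F', 2)]
```

Here E shares three neighbors with A (B, C, D) and F shares two (B, D), so E is the strongest candidate link for A.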
Problem Background
• Zhang proposed a Common Neighbor algorithm (NCNP) to predict links in a distributed graph.
• Kermarrec proposed a distributed social graph embedding algorithm (SocS) for link prediction.
• We consider P2P environment settings.
• Our approach uses P2P Top-k retrieval to enhance performance.
• Scoring methods improve network overlay.
Link Recommendation
• Link recommendation: suggesting new links by considering both neighborhood information and network performance.
• To measure both social information and the P2P network, we use node scoring.
• We adapted Common Neighbors to a distributed environment using Fagin’s Algorithm (FA) and the Threshold Algorithm (TA).
Link Recommendation (Cont’d)
[Figure: example graph; surviving labels: 2, 23, 9, 5]
Node Scoring
• Node Importance
• Reputation Scoring
• P2P Systems Measures
• Composite Measures
– Trusted Centrality
– Available Authority
• Our weighting strategy may suggest friendships that improve the P2P topology.
Top-K Common Neighbor
[Figure: example graph with nodes A, B, C, D, E, and F]
• Node A requests a new recommended node.
• Each node returns a recommended node.
• Node A evaluates the returned nodes and terminates if the algorithm has converged.
Top-K FA and TA Common Neighbor
• The Top-K FA Common Neighbor algorithm stops once it has received k recommended nodes from all neighbors.
– It generally results in the worst-case scenario.
• The Top-K TA Common Neighbor algorithm stops once it has k recommended nodes whose scores are at least the (approximated) threshold.
– The threshold is recalculated at each iteration.
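The two stopping rules above can be contrasted with a small sketch. It makes several assumptions that are not in the slides: each neighbor's recommendations are modeled as a list sorted by descending score, a candidate's total score is the sum over neighbors, a candidate missing from a list scores 0 there, and one "round" reads one entry from every neighbor. The function names and the data are hypothetical; this is the textbook FA/TA scheme, not the authors' exact P2P protocol.

```python
def _totals(candidates, lookups):
    """Summed score of each candidate over all neighbors (missing = 0)."""
    return {c: sum(lk.get(c, 0) for lk in lookups) for c in candidates}

def _top_k(totals, k):
    """Ids of the k best candidates, ties broken by id."""
    return sorted(totals, key=lambda c: (-totals[c], c))[:k]

def fa_top_k(lists, k):
    """FA rule: stop once k candidates were returned by every neighbor."""
    lookups = [dict(lst) for lst in lists]
    seen = [set() for _ in lists]
    rounds, max_rounds = 0, max(len(lst) for lst in lists)
    while rounds < max_rounds:
        for i, lst in enumerate(lists):
            if rounds < len(lst):
                seen[i].add(lst[rounds][0])
        rounds += 1
        if len(set.intersection(*seen)) >= k:
            break
    return _top_k(_totals(set.union(*seen), lookups), k), rounds

def ta_top_k(lists, k):
    """TA rule: stop once k candidates score at least the current threshold."""
    lookups = [dict(lst) for lst in lists]
    seen = set()
    rounds, max_rounds = 0, max(len(lst) for lst in lists)
    while rounds < max_rounds:
        threshold = 0  # best total any not-yet-seen candidate could reach
        for lst in lists:
            if rounds < len(lst):
                candidate, score = lst[rounds]
                threshold += score
                seen.add(candidate)
        rounds += 1
        totals = _totals(seen, lookups)
        best = _top_k(totals, k)
        if len(best) == k and totals[best[-1]] >= threshold:
            break
    return _top_k(_totals(seen, lookups), k), rounds

# Hypothetical recommendation lists from two neighbors.
neighbor_1 = [("a", 10), ("c", 4), ("d", 3), ("b", 1)]
neighbor_2 = [("b", 6), ("d", 5), ("c", 4), ("a", 3)]

print(fa_top_k([neighbor_1, neighbor_2], 1))  # (['a'], 3)
print(ta_top_k([neighbor_1, neighbor_2], 1))  # (['a'], 2)
```

Both rules return the same candidate here, but TA stops after two rounds while FA needs three, because FA must wait until some candidate has appeared in every list. For monotone aggregates such as this sum, the threshold test never fires later than the FA test.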
Setup For Experiments
• Synthetic and real data
• For real data
– Gnutella (6301 nodes and 20777 edges)
– Wikipedia (7115 nodes and 103689 edges)
• For synthetic data, we implemented:
– Uniformly distributed model,
– Small world model of Watts and Strogatz,
– Clustering model of Holme and Kim.
• We plan to use data from SOWHOO.
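To make one of the synthetic models above concrete, here is a minimal pure-Python sketch of the Watts and Strogatz small-world construction: a ring lattice whose edges are rewired with probability p. The function name `watts_strogatz`, its parameters, and the sizes used below are illustrative assumptions; the slides do not show the authors' generator.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Ring lattice of n nodes with k nearest neighbors, rewired with probability p."""
    if k % 2 or not 0 < k < n:
        raise ValueError("k must be even and satisfy 0 < k < n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Ring lattice: connect each node to k/2 neighbors on each side.
    for v in range(n):
        for offset in range(1, k // 2 + 1):
            u = (v + offset) % n
            adj[v].add(u)
            adj[u].add(v)
    # Rewire each lattice edge (v, v + offset) with probability p.
    for offset in range(1, k // 2 + 1):
        for v in range(n):
            u = (v + offset) % n
            if u not in adj[v] or rng.random() >= p:
                continue
            # New endpoint must avoid self-loops and duplicate edges.
            choices = [w for w in range(n) if w != v and w not in adj[v]]
            if not choices:
                continue  # v is already connected to every other node
            w = rng.choice(choices)
            adj[v].discard(u)
            adj[u].discard(v)
            adj[v].add(w)
            adj[w].add(v)
    return adj

small_world = watts_strogatz(20, 4, 0.1, seed=1)
edge_count = sum(len(nbrs) for nbrs in small_world.values()) // 2
print(edge_count)  # 40: rewiring moves edges but never changes their number
```

With p = 0 the result is the regular ring lattice; increasing p adds shortcuts while keeping the edge count at n * k / 2.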
Experiments (Performance)
• We evaluated the algorithms’ efficiency as the number of interactions vs. the number of edges.
• An interaction (or access) retrieves recommended node information, i.e. weight and address, from a peer.
• We assigned weights to the network globally and locally, following power-law and uniform distributions.
• A global weight is a single value for a node and does not depend on which node is looking at it. Local weights are assigned by each node and therefore differ from node to node.
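One possible reading of the global and local weights described above is sketched below. It assumes that a global weight is one value per node, that a local weight is a value each node assigns to each of its neighbors, and that the power-law case is drawn from a Pareto distribution. The function names, the `alpha` parameter, and the example graph are hypothetical.

```python
import random

def _draw(rng, dist, alpha):
    """Draw one weight from the requested distribution."""
    if dist == "uniform":
        return rng.random()                # value in [0, 1)
    if dist == "power-law":
        return rng.paretovariate(alpha)    # Pareto value, always >= 1
    raise ValueError(f"unknown distribution: {dist}")

def global_weights(adj, dist="uniform", alpha=2.0, seed=None):
    """One weight per node, identical from every node's point of view."""
    rng = random.Random(seed)
    return {v: _draw(rng, dist, alpha) for v in sorted(adj)}

def local_weights(adj, dist="uniform", alpha=2.0, seed=None):
    """Each node assigns its own weight to each of its neighbors."""
    rng = random.Random(seed)
    return {v: {u: _draw(rng, dist, alpha) for u in sorted(adj[v])}
            for v in sorted(adj)}

# Hypothetical adjacency sets.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
g = global_weights(adj, "uniform", seed=7)
l = local_weights(adj, "power-law", seed=7)
print(g["B"])                 # the same weight for B, whoever asks
print(l["A"]["B"], l["C"]["B"])  # A's and C's own weights for B
```

Under this reading, a global weight can be looked up once per node, whereas a local weight has to be retrieved from the peer that assigned it.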
Top-K TA vs. Top-K FA
Experiments (Accuracy)
• We evaluated the algorithms by the nodes they recommend, using regular Common Neighbor as the baseline.
• We also need to evaluate by using:
– Rank of recommended nodes.
– Sum of weights for recommended nodes.
• Performance measure (ω) for the accuracy and efficiency trade-off:
Top-K TA vs. Top-K FA
SOWHOO
• We are building a P2P Social Network application to test our algorithms.
[Figure: network diagram with two super peers]
SOWHOO (Cont’d)
• SOWHOO has 3 layers: application layer, system layer, and network layer.
• The application layer handles user requests and provides the user interface.
• The system layer provides mechanisms such as pub/sub and notify/update.
• The network layer provides the messaging infrastructure between peers.
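As a rough illustration of the kind of pub/sub mechanism a system layer might offer, the sketch below keeps a list of callbacks per topic and delivers each published message to them. The class `PubSub`, its methods, and the topic name are hypothetical and are not SOWHOO's actual API; in a real P2P setting, delivery would go through the network layer instead of in-process calls.

```python
from collections import defaultdict

class PubSub:
    """In-process publish/subscribe registry keyed by topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register `callback` to be called for every message on `topic`."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver `message` to all subscribers; return how many received it."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)

# Hypothetical usage: a peer subscribes to recommendation updates.
bus = PubSub()
inbox = []
bus.subscribe("recommendations", inbox.append)
delivered = bus.publish("recommendations", {"node": "E", "weight": 3})
print(delivered, inbox)  # 1 [{'node': 'E', 'weight': 3}]
```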
Discussion
• We presented ongoing work on Link Recommendation.
• We proposed the P2P Top-K FA and TA Common Neighbor algorithms to find recommended links for a node.
• P2P Top-K TA Common Neighbor is significantly better than P2P Top-K FA Common Neighbor in terms of efficiency.
• We also presented weighting methods and proposed combined weights.
Future Work
• We are planning to extend the Top-K TA Common Neighbor algorithm into Top-K TA Common Neighbor+.
• We will test our algorithms against the accuracy measures we have discussed.
• We are planning to complete the implementation of SOWHOO.
• We will test our algorithms on data generated by SOWHOO.