Using Transactional Information to Predict Link Strength in Online Social Networks Indika Kahanda and Jennifer Neville Purdue University

Post on 14-Jan-2016

Using Transactional Information to Predict Link Strength in Online Social Networks

Indika Kahanda and Jennifer Neville

Purdue University

Online social networks (OSNs)

• Explosive growth of online communities enables study of social processes and behavior at a larger scale than ever before

•Facebook: 200 mil active users

•MySpace: 125 mil active users

•LinkedIn: 40 mil users

• User-contributed data is much more extensive than hand-collected networks previously studied in social science

OSNs are larger and more heterogeneous than manually-collected social networks

Purdue Facebook Network: min degree = 1, median degree = 81, max degree = 2173

UNC National Longitudinal Study of Adolescent Health In-School Survey: min degree = 1, median degree = 7, max degree = 10

High median degree implies the presence of many weak, or spurious, friendship links.

Conjecture: Strong relationships can be identified automatically from transactional link information

OSNs contain additional information about user interactions

Wall communications

Group membership

Photo postings

Purdue Facebook network

• 56061 public users in March 2008

• Undergrads, grad students, faculty, staff, alumni

Information about strong relationships

• Top Friends application allows users to nominate some of their friends as “best friends”

• This provides us with positive and negative training examples of strong relationships

• 4900 Purdue users have Top Friends application visible publicly (9%)

• 17,393 Purdue users are nominated as a Top Friend

• Max out-degree = 40, max in-degree = 14

Automatically identifying top friends

• Formulate this as a link strength prediction task

• For each friend pair (u,v), predict whether they are “top friends” given their attributes, interactions, and network information.

• Use supervised learning methods: logistic regression, naïve Bayes classifiers, and bagged decision trees

• Consider features from four different categories: attribute similarity, topological connectivity, transactional connectivity, and network-transactional connectivity.

• Evaluate on data from the public Purdue Facebook network

• Use basic attribute information from profile, friendship links, wall postings, picture postings, group memberships, and “top friend” nominations

Related work

• Link prediction

• Focuses on predicting future links between any (u,v) pair in a network with a single edge type (i.e., friendship)

• Previous methods primarily use attribute similarity features (e.g., Taskar et al. ‘03) or topological features of the network (e.g., Liben-Nowell & Kleinberg ‘04)

• Adamic and Adar (‘03) used ancillary network information for link prediction but they focused on similarity-based features instead of transactions/interactions

• Pruning spurious links

• Singh et al. (’05) and Hill et al. (‘07) sample nodes and edges based on structural properties but they do not consider transactional information

Feature types

(1) Attribute-based features

Assess attribute similarity between users (e.g., number of matches)

U: Gender: Male, Religious: Christian, Political: Moderate

V: Gender: Male, Religious: Agnostic, Political: Conservative

(2) Topological features

Assess connectivity of users in friendship network (e.g., number of common neighbors)
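The two examples named on this slide (number of attribute matches, number of common neighbors) can be sketched as follows. This is a minimal illustration, not the authors' feature code: the function names, the dictionary layout, and the toy friendship lists are assumptions made for the example; only the two profiles come from the slide.

```python
def attribute_matches(profile_u, profile_v):
    """Count attributes that both profiles fill in with the same value."""
    shared_keys = profile_u.keys() & profile_v.keys()
    return sum(1 for key in shared_keys if profile_u[key] == profile_v[key])


def common_neighbors(friends, u, v):
    """Count users who are friends with both u and v."""
    return len(friends.get(u, set()) & friends.get(v, set()))


# Profiles taken from the slide; the friendship lists below are made up.
u_profile = {"gender": "Male", "religious": "Christian", "political": "Moderate"}
v_profile = {"gender": "Male", "religious": "Agnostic", "political": "Conservative"}
friends = {"u": {"v", "a", "b", "c"}, "v": {"u", "a", "c", "d"}}

matches = attribute_matches(u_profile, v_profile)
shared = common_neighbors(friends, "u", "v")
```

For the slide's example pair, only gender matches, so the attribute feature takes the value 1.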

Feature types

(3) Transactional features

Assess transactional activity between user pairs (e.g., number of bi-directional posts)

Link types between U and V: wall post, photo post, same group

(4) Network-transactional features

Assess connectivity of users in transaction networks (i.e., moderate transactional activity by interactions with other users)
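One way to read the difference between the two feature types: a transactional feature counts activity inside the pair, while a network-transactional feature puts that count in the context of each user's activity with everyone else. The sketch below assumes a simple normalization (the share of u's outgoing wall posts that go to v); the slides do not give the exact formula, so both function names and the normalization are hypothetical.

```python
from collections import Counter

# Toy directed wall posts as (poster, wall owner); the data is made up.
wall_posts = [("u", "v"), ("u", "v"), ("v", "u"), ("u", "a"), ("u", "b"), ("a", "u")]
post_counts = Counter(wall_posts)


def posts_between(counts, u, v):
    """Transactional feature: wall posts exchanged by u and v in either direction."""
    return counts[(u, v)] + counts[(v, u)]


def outgoing_share(counts, u, v):
    """Network-transactional feature (assumed form): share of u's posts that go to v."""
    total_out = sum(n for (src, _), n in counts.items() if src == u)
    return counts[(u, v)] / total_out if total_out else 0.0


exchanged = posts_between(post_counts, "u", "v")
share = outgoing_share(post_counts, "u", "v")
```

Under this reading, two posts to a friend mean more from a user who posts rarely than from one who posts on hundreds of walls.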

Methodology

• Models

•Bagged decision trees, naïve Bayes classifiers, and logistic regression

• Experiments

•Feature ranking

•Feature type comparison

•Link type comparison

•Overall classification

• Performance measure: area under the ROC curve (AUC)

•Measures the quality of (probability) rankings produced by the model
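AUC can be computed directly from a ranking: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function below is a small pairwise sketch of that definition with made-up scores; it is quadratic in the number of examples and only meant for illustration.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels (1 = top friend) and predicted probabilities; values are made up.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

Because AUC depends only on the ordering of scores, it is not distorted by the class imbalance between top friends and other friends.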

Facebook sample

• Random sample of 500 users with top friends application

•Consider all friends of those 500 users

•Top friends: positive training examples

•Other friends: negative training examples

•Restrict attention to pairs that have values for 4 common attributes

• Final sample consisted of 8766 linked friends with 896 (10.2%) positive examples
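The labeling rule on this slide can be sketched as follows: every friend of a sampled user becomes one example, labeled positive if that friend is nominated as a top friend. The data structures and names are assumptions for illustration, and the attribute filter (pairs with values for 4 common attributes) is omitted.

```python
def label_pairs(friends, top_friends):
    """Build (u, v, label) examples for each sampled user u and each friend v."""
    examples = []
    for u, nominated in top_friends.items():
        for v in sorted(friends.get(u, set())):
            examples.append((u, v, 1 if v in nominated else 0))
    return examples


# Toy sample of two users with the Top Friends application; data is made up.
friends = {"u": {"a", "b", "c"}, "w": {"a", "d"}}
top_friends = {"u": {"a"}, "w": set()}

examples = label_pairs(friends, top_friends)
positives = sum(label for _, _, label in examples)

# Positive rate reported on the slide: 896 of 8766 pairs.
reported_rate = 896 / 8766
```

The reported counts are consistent with the stated 10.2% positive rate.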

Experiment 1: Feature rankings

• Compare relative importance of each of the 50 features

• Measures:

•Information gain

•Chi-square statistic

• Compute average rank of each feature and look at top 15:

•12 are network-transactional features, 3 are transactional

•12 use wall information, 3 use picture information
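The ranking procedure (score each feature by information gain and by the chi-square statistic, then average the two ranks) can be sketched in plain Python. This assumes discrete feature values, so continuous features would need to be binned first; the feature names and toy data are hypothetical and the slides do not say how ties or discretization were handled.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def chi_square(feature, labels):
    """Pearson chi-square statistic of the feature-by-label contingency table."""
    total = len(labels)
    f_counts, y_counts = Counter(feature), Counter(labels)
    joint = Counter(zip(feature, labels))
    stat = 0.0
    for x in f_counts:
        for y in y_counts:
            expected = f_counts[x] * y_counts[y] / total
            stat += (joint[(x, y)] - expected) ** 2 / expected
    return stat


def average_ranks(features, labels):
    """Average each feature's rank (1 = best) over the two measures."""
    ranks = {name: 0.0 for name in features}
    for measure in (information_gain, chi_square):
        ordered = sorted(features, key=lambda n: measure(features[n], labels), reverse=True)
        for position, name in enumerate(ordered, start=1):
            ranks[name] += position / 2
    return ranks


# Toy binary features and labels (1 = top friend); names and data are made up.
toy_labels = [1, 1, 0, 0]
toy_features = {"wall_share_high": [1, 1, 0, 0], "same_gender": [1, 0, 1, 0]}
ranks = average_ranks(toy_features, toy_labels)
```

In the toy data the first feature separates the classes perfectly and the second is uninformative, so both measures agree on the order.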

Experiment 2: Feature type comparison

•Ablation study using features of each type separately

•Attribute-based: AUC=50%

•Topological: AUC=75%

•Transactional: AUC=74%

•Network-transactional: AUC=84%

•Network-transactional features achieve best performance

Experiment 3: Link type comparison

•Ablation study using data from each link type separately (all features)

•Wall: AUC=82%

•Picture: AUC=62%

•Groups: AUC=63%

•Friendship: AUC=77%

•Wall information results in best performance

Why doesn’t picture information improve performance? Sparsity.

28% of user pairs have 1 wall link

4% of user pairs have 1 picture link

Experiment 4: Overall classification results

• Uses 50 features, compares performance of three different models

• Bagged decision trees achieve best performance

• Network-transactional features account for 97% of the performance observed using all features

Bagged Decision Trees: AUC=87%

Naïve Bayes: AUC=81%

Logistic Regression: AUC=82%
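The 97% figure appears to be the ratio of the network-transactional AUC from Experiment 2 (84%) to the best all-feature AUC here (87%). That reading is an assumption, since the slides do not state how the percentage was derived; the arithmetic is shown below.

```python
# AUC values reported on the slides.
auc_network_transactional = 0.84  # Experiment 2, network-transactional features only
auc_all_features = 0.87           # Experiment 4, bagged decision trees, all 50 features

# Assumed interpretation: share of the full-model AUC reached by one feature type.
share_of_performance = auc_network_transactional / auc_all_features
share_percent = round(share_of_performance * 100)
```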

Conclusion

• Formulated a link strength prediction task to automatically identify stronger relationships among existing friendships.

• Compared the utility of attribute-based, topological, transactional, and network-transactional features

• Showed that in addition to good accuracy overall, network-transactional features had the largest impact on model performance

• Results indicate that transactional events are useful for predicting link strength

• However, it is also necessary to consider the transactional events in the context of user behavior within the larger social network

Future work

• Exploit temporal aspect of transactions to improve predictions

• Address the more general link-strength prediction task by formulating a latent variable model

[Figure: Distribution of inter-arrival times for user pairs. X-axis: inter-arrival times (in days); y-axis: no. of intervals; series: all friends, top friends.]

Thank you!

• Indika Kahanda:

[email protected]

• http://web.ics.purdue.edu/~ikahanda/

• Jennifer Neville:

[email protected]

• http://www.cs.purdue.edu/~neville/

Questions?