Predicting Positive and Negative Links in Online Social Networks

Jure Leskovec (Stanford), Daniel Huttenlocher and Jon Kleinberg (Cornell)


Post on 19-Jan-2016



TRANSCRIPT

Page 1: Predicting Positive and Negative Links in Online Social Networks

Jure Leskovec (Stanford), Daniel Huttenlocher and Jon Kleinberg (Cornell)

Page 2: Predicting Positive and Negative Links in Online Social Networks

Rich social structure in online computing applications

Such structures are modeled by networks

Most social network analyses view links as positive: friends, fans, followers

But generally links can convey either friendship or antagonism

Page 3: Predicting Positive and Negative Links in Online Social Networks

Our plan: study social interactions on the Web that have positive and negative relationships
Questions:
  How do edge signs and network structure interact?
Approach: edge sign prediction problem
  Given a network and signs on all but one edge, predict the missing sign
Applications:
  Friend recommendation for social media
    Easier to predict whether you know someone vs. to predict what you think of them

[Figure: signed network in which one edge has an unknown sign (?)]

Page 4: Predicting Positive and Negative Links in Online Social Networks

Each link A → B is explicitly tagged with a sign:
  Epinions: Trust/Distrust
    Does A trust B's product reviews? (only positive links are visible)
  Wikipedia: Support/Oppose
    Does A support B to become a Wikipedia administrator?
  Slashdot: Friend/Foe
    Does A like B's comments?
  Other examples: World of Warcraft [Szell et al. 2010]

[Figure: example signed network]

Page 5: Predicting Positive and Negative Links in Online Social Networks

Edge signs can be predicted with ~90% accuracy using only the local network structure
  No need for global trust-propagation mechanisms

Data-oriented justification of classical theories from social psychology
  Our models align with theories of Balance and Status

Near-perfect generalization: same underlying mechanism of signed edge creation
  Can train on "how people vote" and predict trust as well as the model trained on trust itself

Page 6: Predicting Positive and Negative Links in Online Social Networks

Machine learning formulation: predict the sign of edge (u,v)
  Class label:
    +1: positive edge
    -1: negative edge
  Learning method: logistic regression
  Evaluation: accuracy and ROC curves
  Dataset:
    Original: 80% + edges
    Balanced: 50% + edges
  Features for learning: next slide

[Figure: signed network in which the edge (u,v) has an unknown sign (?)]
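The slides do not say how the balanced dataset is built. One common way to reach 50% + edges is to downsample the majority sign, and the sketch below assumes that approach. The function name `balance_edges`, the `(u, v, sign)` tuple layout and the toy data are all hypothetical.

```python
import random

def balance_edges(edges, seed=0):
    """Downsample the more frequent sign so both signs are equally common.

    `edges` is a list of (u, v, sign) tuples with sign in {+1, -1}.
    This is an assumed construction, not necessarily the authors' procedure.
    """
    rng = random.Random(seed)
    pos = [e for e in edges if e[2] == +1]
    neg = [e for e in edges if e[2] == -1]
    n = min(len(pos), len(neg))
    return rng.sample(pos, n) + rng.sample(neg, n)

# Toy data: 8 positive and 2 negative edges (80% positive).
edges = [(i, i + 1, +1) for i in range(8)] + [(0, 5, -1), (3, 7, -1)]
balanced = balance_edges(edges)
```

On a balanced dataset random guessing is right half of the time, which makes accuracy numbers easier to interpret than on the 80% positive original.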

Page 7: Predicting Positive and Negative Links in Online Social Networks

For each edge (u,v) create features:
  Triad counts (16):
    Counts of signed triads edge u → v takes part in
  Node degree (7 features):
    Signed degree: $d^+_{out}(u)$, $d^-_{out}(u)$, $d^+_{in}(v)$, $d^-_{in}(v)$
    Total degree: $d_{out}(u)$, $d_{in}(v)$
    Embeddedness of edge (u,v)

[Figure: signed triads around edge (u,v)]
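The feature list above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the graph is a dict mapping a directed pair `(a, b)` to a sign, the target edge is left out of all counts so that its sign is never used, and embeddedness is taken to be the number of common neighbours of u and v regardless of direction. The name `edge_features` and the feature keys are hypothetical.

```python
from collections import Counter

def edge_features(edges, u, v):
    """Return (degree features, triad counts) for edge (u, v).

    `edges` maps a directed pair (a, b) to its sign (+1 or -1).
    """
    # Assumption: hide the target edge so its sign cannot leak into the features.
    rest = {e: s for e, s in edges.items() if e != (u, v)}

    def links(a, b):
        """All (direction, sign) pairs between a and b: 'F' is a->b, 'B' is b->a."""
        found = []
        if (a, b) in rest:
            found.append(("F", rest[(a, b)]))
        if (b, a) in rest:
            found.append(("B", rest[(b, a)]))
        return found

    def neighbours(n):
        return {b for a, b in rest if a == n} | {a for a, b in rest if b == n}

    degree = {
        "d_out_pos(u)": sum(1 for (a, _), s in rest.items() if a == u and s > 0),
        "d_out_neg(u)": sum(1 for (a, _), s in rest.items() if a == u and s < 0),
        "d_in_pos(v)": sum(1 for (_, b), s in rest.items() if b == v and s > 0),
        "d_in_neg(v)": sum(1 for (_, b), s in rest.items() if b == v and s < 0),
    }
    degree["d_out(u)"] = degree["d_out_pos(u)"] + degree["d_out_neg(u)"]
    degree["d_in(v)"] = degree["d_in_pos(v)"] + degree["d_in_neg(v)"]

    common = (neighbours(u) & neighbours(v)) - {u, v}
    degree["embeddedness"] = len(common)

    # 2 directions x 2 signs on each of the two legs gives 16 triad types.
    triads = Counter()
    for w in common:
        for dir_uw, s_uw in links(u, w):
            for dir_wv, s_wv in links(w, v):
                triads[(dir_uw, s_uw, dir_wv, s_wv)] += 1
    return degree, triads

# Toy graph: x and y are common neighbours of u and v, z is not.
toy = {("u", "x"): +1, ("x", "v"): -1, ("v", "y"): +1,
       ("y", "u"): +1, ("u", "v"): +1, ("u", "z"): -1}
degree, triads = edge_features(toy, "u", "v")
```

Each triad key records the direction and sign of the u-w leg and of the w-v leg, so the counter has at most 16 distinct keys per edge.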

Page 8: Predicting Positive and Negative Links in Online Social Networks

Error rates:
  Epinions: 6.5%
  Slashdot: 6.6%
  Wikipedia: 19%

Signs can be modeled from local network structure alone
  Trust propagation model of [Guha et al. '04] has 14% error on Epinions

Triad features perform less well for less embedded edges

Wikipedia is harder to model:
  Votes are publicly visible

[Figure panels: Epinions, Slashdot, Wikipedia]

Page 9: Predicting Positive and Negative Links in Online Social Networks

Our goal is not just to predict signs but also to derive insights into usage of signed edges

Logistic regression learns a weight $b_i$ for each feature $x_i$

Connection to theories from social psychology:
  Structural balance
  Theory of status
which both give predictions on the sign of the edge (u,v) based on the triad it is embedded into
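For reference, the standard logistic regression model, written with the weights $b_i$ and features $x_i$ named on this slide, takes the form below. The intercept $b_0$ and the feature count $n$ are notation assumed here.

```latex
P(+ \mid x) = \frac{1}{1 + e^{-\left(b_0 + \sum_{i=1}^{n} b_i x_i\right)}}
```

Under this form a positive weight $b_i$ means that a larger value of feature $x_i$ makes a positive edge more likely, and a negative weight makes it less likely, which is what allows the learned weights to be compared with the predictions of balance and status.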

Page 10: Predicting Positive and Negative Links in Online Social Networks

Consider edges as undirected
Start with intuition [Heider '46]:
  Friend of my friend is my friend
  Enemy of enemy is my friend
  Enemy of friend is my enemy
Look at connected triples of nodes that are consistent with this logic:

[Figure: balanced triads, one with three positive edges and one with one positive and two negative edges]
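The three rules above are equivalent to saying that a triangle is balanced when the product of its edge signs is positive. A minimal sketch, with signs encoded as +1 and -1 and hypothetical function names:

```python
def is_balanced(s1, s2, s3):
    """A triangle is balanced when the product of its three edge signs is positive."""
    return s1 * s2 * s3 > 0

def balance_prediction(s_ux, s_xv):
    """Sign of edge (u, v) that makes the triangle u-x-v balanced."""
    return s_ux * s_xv

# Friend of my friend, enemy of my enemy, enemy of my friend.
friend_of_friend = balance_prediction(+1, +1)
enemy_of_enemy = balance_prediction(-1, -1)
enemy_of_friend = balance_prediction(+1, -1)
```

The all-negative triangle and the triangle with exactly one negative edge both have a negative product, so they are the unbalanced cases.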

Page 11: Predicting Positive and Negative Links in Online Social Networks

Status theory [Davis-Leinhardt '68, Guha et al. '04, Leskovec et al. '10]
  Positive link u → v means: v has higher status than u
  Negative link u → v means: v has lower status than u
  Based on signs/directions of links from/to node x make a prediction

Status and balance can make different predictions:

[Figure: three example triads on nodes u, x, v with the predicted sign of edge (u,v). Two examples: Balance: +, Status: -, LogReg: -. One example: Balance: -, Status: -, LogReg: -]
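The status reading of a single triad can be sketched as follows. The encoding is an assumption made for illustration: each link between a node and x is a `(direction, sign)` pair, where `"to_x"` means the link points at x and `"from_x"` means it leaves x, and a prediction of 0 means status theory is silent for that triad. The function names are hypothetical.

```python
def relative_status(direction, sign):
    """Status of node a relative to x (+1 above, -1 below) from one link.

    A positive link a -> x puts x above a; a negative link a -> x puts x below a.
    Links leaving x are read the other way round.
    """
    return -sign if direction == "to_x" else sign

def status_prediction(u_link, v_link):
    """Predicted sign of u -> v from the links joining u and v to x."""
    su = relative_status(*u_link)
    sv = relative_status(*v_link)
    if sv > su:
        return +1   # v ends up above u
    if sv < su:
        return -1   # v ends up below u
    return 0        # both on the same side of x: no prediction

# x -> u is positive and v -> x is positive, so u is above x and v is below x.
status_says = status_prediction(("from_x", +1), ("to_x", +1))
# Balance ignores direction and multiplies the two signs.
balance_says = (+1) * (+1)
```

In this example the two theories disagree: status predicts a negative edge from u to v, while balance predicts a positive one.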

Page 12: Predicting Positive and Negative Links in Online Social Networks


Page 13: Predicting Positive and Negative Links in Online Social Networks

Both theories agree well with learned models

Further observations:
  Backward-backward triads have smaller weights than forward and mixed direction triads
  Balance agrees better with Epinions and Slashdot, while Status agrees better with Wikipedia
  The learned models consistently disagree with Balance on "enemy of my enemy is my friend"

Page 14: Predicting Positive and Negative Links in Online Social Networks

Balance based and learned coefficients:

Feature    Balance theory   Epinions   Slashdot   Wikipedia
const             0           0.43       1.49       0.04
++                1           0.05       0.04       0.05
+-               -1          -0.11      -0.24      -0.16
-+               -1          -0.21      -0.35      -0.14
--                1          -0.01      -0.03      -0.05

Balance theory column: the model if signs were created purely based on Balance theory
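To see how such coefficients turn into a prediction, the sketch below plugs the Epinions column into the logistic function. It rests on several assumptions: the four non-constant rows are taken to be the undirected triad types ++, +-, -+ and -- in that order, the features are raw triad counts, and the names `EPINIONS` and `prob_positive` are hypothetical.

```python
import math

# Epinions column of the table; row labels are an assumed reading of the slide.
EPINIONS = {"const": 0.43, "++": 0.05, "+-": -0.11, "-+": -0.21, "--": -0.01}

def prob_positive(coef, counts):
    """Probability of a positive edge given triad counts, via the logistic function."""
    z = coef["const"] + sum(coef[t] * n for t, n in counts.items())
    return 1.0 / (1.0 + math.exp(-z))

no_triads = prob_positive(EPINIONS, {})
three_pp = prob_positive(EPINIONS, {"++": 3})
ten_pm = prob_positive(EPINIONS, {"+-": 10})
```

With no triads the positive constant alone pushes the prediction toward a positive edge; adding ++ triads raises the probability, and enough +- triads pull it below one half.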

Page 15: Predicting Positive and Negative Links in Online Social Networks

Status based and learned coefficients:

Feature      Status theory   Epinions   Slashdot   Wikipedia
const              0          -0.68      -1.39      -0.30
u < x < v          1           0.11       0.05       0.03
u > x > v         -1          -0.10      -0.11      -0.19
u < x > v          0           0.06       0.16       0.03
u > x < v          0          -0.01       0.04       0.05

[Figure: triads where u > x > v]

Status theory column: the model if signs were created purely based on Status theory

Page 16: Predicting Positive and Negative Links in Online Social Networks

[Figure panels: Epinions, Slashdot, Wikipedia]

Page 17: Predicting Positive and Negative Links in Online Social Networks

Do people use these very different linking systems by obeying the same principles?
  How generalizable are the results across the datasets?
  Train on the row dataset, predict on the column dataset

Almost perfect generalization of the models even though networks come from very different applications

Page 18: Predicting Positive and Negative Links in Online Social Networks

Suppose we are only interested in predicting whether there is a trust edge or no edge

Does knowing negative edges help?

[Figure: the network with both positive and negative edges vs. the same network with positive edges only]

YES!

Page 19: Predicting Positive and Negative Links in Online Social Networks


Page 20: Predicting Positive and Negative Links in Online Social Networks


Page 21: Predicting Positive and Negative Links in Online Social Networks


Page 22: Predicting Positive and Negative Links in Online Social Networks


Page 23: Predicting Positive and Negative Links in Online Social Networks

Heuristic predictors for (u,v):
  Balance: choose the sign that makes the majority of the triads balanced
  Status: predict based on status, $\sigma(v) = d^+_{in}(v) + d^-_{out}(v) - d^+_{out}(v) - d^-_{in}(v)$
  Out-sign of u: majority sign
  In-sign of v: majority sign

Observations:
  Triadic models do better with increasing embeddedness
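The balance and status heuristics can be sketched as below. The status estimate of a node counts positive in-links and negative out-links as gains, and positive out-links and negative in-links as losses. Two parts are assumptions not stated on the slide: the decision rule that predicts + when the status of u is at most the status of v, and breaking ties toward + in the balance vote. All function names and the toy graph are hypothetical.

```python
def balance_heuristic(triads):
    """Majority vote over triads; each triad is a (sign_ux, sign_xv) pair.

    The product of the two signs is the sign that would balance that triad.
    Ties go to + (assumption).
    """
    votes = sum(s_ux * s_xv for s_ux, s_xv in triads)
    return +1 if votes >= 0 else -1

def sigma(edges, n):
    """Status estimate: d+_in(n) + d-_out(n) - d+_out(n) - d-_in(n)."""
    total = 0
    for (a, b), s in edges.items():
        if b == n:
            total += s   # +1 for a positive in-link, -1 for a negative in-link
        if a == n:
            total -= s   # -1 for a positive out-link, +1 for a negative out-link
    return total

def status_heuristic(edges, u, v):
    """Predict + when u's status estimate does not exceed v's (assumed rule)."""
    return +1 if sigma(edges, u) <= sigma(edges, v) else -1

# Toy graph without the target edge (u, v): v is endorsed twice, u is not.
toy = {("a", "v"): +1, ("b", "v"): +1, ("u", "a"): +1, ("c", "u"): -1}
u_to_v = status_heuristic(toy, "u", "v")
v_to_u = status_heuristic(toy, "v", "u")
```

Both heuristics use only links around u and v, so like the learned model they need no global propagation step.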