social network analysis via factor graph model

Post on 23-Feb-2016


SOCIAL NETWORK ANALYSIS VIA FACTOR GRAPH MODEL

Zi Yang

OUTLINE

Background
Challenge
Unsupervised case 1: Representative user finding
Unsupervised case 2: Community discovery
Experiments
Supervised case: Modeling information diffusion in social network

BACKGROUND

Social network
  Example: Digg.com
    A popular social news website for people to discover and share content
    Various types of user behaviors: submit, digg, comment, and reply to a comment
    Edges: an edge exists if one user diggs or comments on a story of another

BACKGROUND

Community discovery
  Modularity property (pair-wise constraint), in the standard form:

$$\exp\left[\left(A_{i,j} - \frac{k_i k_j}{2m}\right)\delta(y_i, y_j)\right]$$

Affinity propagation
  Clustering via factor graph model
  Update rules:

$$r(i,k) \leftarrow s(i,k) - \max_{k' \text{ s.t. } k' \neq k} \{a(i,k') + s(i,k')\}$$

$$a(i,k) \leftarrow \min\Big\{0,\; r(k,k) + \sum_{i' \text{ s.t. } i' \notin \{i,k\}} \max\{0, r(i',k)\}\Big\}$$

$$a(k,k) \leftarrow \sum_{i' \text{ s.t. } i' \notin \{k\}} \max\{0, r(i',k)\}$$
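The three update rules above can be sketched in plain Python as follows. This is a minimal illustration, not the presenter's implementation: the function name `affinity_propagation`, the damping factor, and the fixed iteration count are assumptions added here (damping is a common stabilizer that the slide does not mention).

```python
def affinity_propagation(s, iters=200, damping=0.5):
    """Sketch of affinity propagation on a similarity matrix.

    s[i][k] is the similarity of point i to candidate exemplar k;
    s[k][k] is the preference of k. Requires at least two points.
    Returns, for each point, the index of its chosen exemplar.
    """
    n = len(s)
    r = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    a = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iters):
        # r(i,k) <- s(i,k) - max_{k' != k} {a(i,k') + s(i,k')}
        for i in range(n):
            for k in range(n):
                best = max(a[i][kp] + s[i][kp] for kp in range(n) if kp != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - best)
        for k in range(n):
            pos = [max(0.0, r[ip][k]) for ip in range(n)]
            total = sum(pos) - pos[k]  # sum over i' != k of max{0, r(i',k)}
            for i in range(n):
                if i == k:
                    new = total  # a(k,k)
                else:
                    # a(i,k) <- min{0, r(k,k) + sum over i' not in {i,k}}
                    new = min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    # Each point picks the exemplar maximizing a(i,k) + r(i,k).
    return [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]
```

On two well-separated groups of points (using negative squared distance as similarity and a shared preference on the diagonal), points in the same group are expected to end up with the same exemplar.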

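BACKGROUND

Affinity propagation

$$S(c_{1:N}) = \sum_{i=1}^{N} s(i, c_i) + \sum_{k=1}^{N} \delta_k(c_{1:N})$$

where

$$\delta_k(c_{1:N}) = \begin{cases} -\infty, & \text{if } c_k \neq k \text{ but } \exists i: c_i = k \\ 0, & \text{otherwise} \end{cases}$$

Local factor: $s(i, c_i)$
Regional constraint: $\delta_k(c_{1:N})$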
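To make the roles of the local factor and the regional constraint concrete, here is a small sketch that evaluates the affinity propagation objective for a given assignment. The function name `ap_objective` and the list-of-lists layout for `s` are assumptions made for illustration.

```python
import math


def ap_objective(s, c):
    """Evaluate S(c) = sum_i s(i, c_i) + sum_k delta_k(c).

    s[i][k] is a similarity and c[i] is the exemplar chosen by point i.
    The regional constraint returns -inf when some point chooses k as
    its exemplar although k does not choose itself.
    """
    n = len(c)
    for k in range(n):
        chosen_by_someone = any(c[i] == k for i in range(n))
        if c[k] != k and chosen_by_someone:
            return -math.inf  # invalid configuration
    # All regional constraints contribute 0; only local factors remain.
    return sum(s[i][c[i]] for i in range(n))
```

With `s = [[-2, -1], [-1, -2]]`, the assignment `[0, 0]` is valid, while `[1, 0]` is rejected because each point names the other as exemplar and neither names itself.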


CHALLENGES

How can the local properties be captured for social network analysis?

Community discovery is a graph clustering problem: how can the edge information be considered directly?
  Homophily

What constraints can be applied to describe the formation/evolution of a community?


REPRESENTATIVE USER FINDING

Problem definition: given a social network $G = (V, E)$ and (optionally) a confidence for each user $v_i$, the objective is to find a pair-wise representativeness on each edge in the network and to estimate the representative degree of each user in the network, which is denoted by a set of variables $\{y_i\}$ satisfying $y_i \in \{1, \dots, N\}$. In other words, $y_i$ represents the user that $v_i$ mostly trusts (or relies on).

REPRESENTATIVE USER FINDING

Modeling
  Input: the nodes v1, v2, v3, v4 of the social network
  Variables: y1, y2, y3, y4, one per node; each variable represents the representative of its node

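REPRESENTATIVE USER FINDING

Modeling: node feature function

[Figure: factor graph with nodes v1 to v4, variables y1 to y4, and node factors g1(y1), g2(y2), g3(y3), g4(y4)]

The node feature function $g_i(y_i)$ is an observation of the similarity between the node and the variable. It is built from the edge weights $w$ and divided by a normalization factor:
  Neighbor representative: if $y_i \in O(i)$, the value is based on $w_{i,y_i}$
  Self-representative: if $y_i = i$, the value is based on the weights summed over $j \in NB(i)$
  Otherwise: $g_i(y_i) = 0$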

REPRESENTATIVE USER FINDING

Modeling: edge feature function

[Figure: the same factor graph with edge factors f2,1(y2,y1), f2,3(y2,y3), f2,4(y2,y4), and f3,2(y3,y2) added]

The edge feature function $f_{i,j}(y_i, y_j)$ is defined piecewise:
  One value if $y_i = y_j$ (the vertices of the edge have the same representative)
  Another value if $y_i \neq y_j$ (the vertices of the edge have different representatives)

Undirected edge: bidirected influence

REPRESENTATIVE USER FINDING

Modeling: regional feature function

A feature function defined on the set of neighboring nodes of $v_i$ and $v_i$ itself.

[Figure: the same factor graph with regional factors h1(y1,y2), h2(y2,y3,y4), h3(y3,y1), and h4(y4,y2) added]

$$h_k = h_k(y_{I(k) \cup \{k\}}) = \begin{cases} 0 & \text{if } y_k = k \text{ and } y_i \neq k \text{ for all } i \in I(k) \\ 1 & \text{otherwise} \end{cases}$$

This avoids a “leader without followers”.
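The "leader without followers" constraint can be written as a tiny predicate. This is only a sketch: the name `regional_feature` and the dictionary `in_neighbors` (standing in for the set $I(k)$, read here as the nodes that may pick $k$ as representative) are assumptions.

```python
def regional_feature(k, y, in_neighbors):
    """Return 0 if node k represents itself but no node in I(k) picks k.

    y maps each node to its chosen representative; in_neighbors[k] is
    an assumed encoding of the neighbor set I(k).
    """
    is_leader = y[k] == k
    has_follower = any(y[i] == k for i in in_neighbors[k])
    return 0 if is_leader and not has_follower else 1
```

For example, with `y = {1: 1, 2: 1, 3: 3}` and node 3 having only node 1 as neighbor, node 3 is a leader nobody follows, so its regional factor is 0 and the whole configuration gets zero probability.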

REPRESENTATIVE USER FINDING

Modeling: objective function

$$\max_{y_{1:N}} \log P(y_{1:N})$$

$$P(y_{1:N}) = \frac{1}{Z} \prod_{i=1}^{N} g_i \prod_{e_{i,j} \in E} f_{i,j} \prod_{k=1}^{N} h_k = \frac{1}{Z} \prod_{i=1}^{N} g_i(y_i) \prod_{e_{i,j} \in E} f_{i,j}(y_i, y_j) \prod_{k=1}^{N} h_k(y_{I(k) \cup \{k\}})$$

Solving: max-sum algorithm
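On a graph with only a handful of nodes, the objective can be checked by exhaustive search instead of max-sum. The sketch below deliberately swaps in brute force for the max-sum algorithm named on the slide, and every factor value in `toy_example` is a made-up number chosen for illustration, not a value from the presentation.

```python
import itertools
import math


def log_joint(y, nodes, edges, g, f, h):
    """Unnormalized log P(y): sum of log node, edge and regional factors."""
    factors = [g(i, y) for i in nodes]
    factors += [f(i, j, y) for (i, j) in edges]
    factors += [h(k, y) for k in nodes]
    if any(v <= 0 for v in factors):
        return -math.inf  # a zero factor rules the configuration out
    return sum(math.log(v) for v in factors)


def brute_force_map(nodes, candidates, edges, g, f, h):
    """Try every assignment; feasible only for very small graphs."""
    best, best_score = None, -math.inf
    for choice in itertools.product(*(candidates[i] for i in nodes)):
        y = dict(zip(nodes, choice))
        score = log_joint(y, nodes, edges, g, f, h)
        if score > best_score:
            best, best_score = y, score
    return best, best_score


def toy_example():
    """Path graph 1 - 2 - 3 with hypothetical factor values."""
    nodes = [1, 2, 3]
    edges = [(1, 2), (2, 3)]
    neighbors = {1: [2], 2: [1, 3], 3: [2]}
    # Each node may choose itself or one of its neighbors.
    candidates = {i: [i] + neighbors[i] for i in nodes}

    def g(i, y):  # toy node factor: node 2 is the attractive choice
        return 0.6 if y[i] == 2 else 0.2

    def f(i, j, y):  # toy edge factor: reward shared representatives
        return 0.6 if y[i] == y[j] else 0.4

    def h(k, y):  # "leader without followers" gets factor 0
        lonely = y[k] == k and all(y[i] != k for i in neighbors[k])
        return 0.0 if lonely else 1.0

    return brute_force_map(nodes, candidates, edges, g, f, h)
```

In this toy setup every factor reaches its maximum only when all three nodes choose node 2, so that assignment is the maximizer. The normalizer $Z$ is omitted because it does not affect the argmax.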

REPRESENTATIVE USER FINDING

Model learning

[Equations: max-sum update rules for the messages $a$, $r$, $p_{ijk}$, and $c_{ijk}$, written with max and min operations over the neighbor sets $I(\cdot)$ and $O(\cdot)$]

REPRESENTATIVE USER FINDING

A bit of explanation
  $p_{ijk}$: how likely user $v_i$ is to persuade $v_j$ to take $v_k$ as his representative
  $c_{ijk}$: how likely user $v_i$ is to comply with the suggestion from $v_j$ that he consider $v_k$ as his representative
  The direction of this process: along the directed edges

[Figure: three small graphs over nodes v1, v2, v3]

REPRESENTATIVE USER FINDING

Algorithm


COMMUNITY DISCOVERY

Problem definition: given a social network $G$ and an expected number of communities $C$, a virtual node $u_c \in U$ is introduced for each community. The objective is to find a community $y_i$ for each person $v_i$, satisfying $y_i \in \{1, \dots, C\}$, which represents the community that $v_i$ belongs to, so as to maximize the preservation of structure (or maximize the modularity $Q$ of the community).

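COMMUNITY DISCOVERY

Feature definition: what's different?

[Figure: factor graph with nodes v1 to v4, variables y1 to y4, virtual community nodes u1 and u2, node factors g1(y1) to g4(y4), and edge factors f1,3(y1,y3), f2,1(y2,y1), f2,3(y2,y3), f2,4(y2,y4), f3,2(y3,y2)]

Node feature function: $g_i(y_i)$, an exponential of a sum over the neighbors $j \in I(i) \cup O(i)$ involving $y_j$ and $y_i$, with a $1/|X|$ factor.

Edge feature function:

$$f_{i,j}(y_i, y_j) = \exp q_{i,j} = \exp\left[\left(A_{i,j} - \frac{k_i k_j}{2m}\right)\delta(y_i, y_j)\right]$$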
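Since the edge feature is tied to modularity, it can help to see how the modularity $Q$ of a candidate community assignment is computed. The sketch below uses the standard (Newman) definition, $Q = \frac{1}{2m}\sum_{i,j}\big(A_{i,j} - \frac{k_i k_j}{2m}\big)\delta(y_i, y_j)$; the function name `modularity` and the edge-dictionary layout are assumptions for illustration.

```python
def modularity(weights, labels):
    """Standard modularity Q of a labeling on an undirected weighted graph.

    weights: dict {(i, j): w}, each undirected edge listed once (assumed layout).
    labels:  dict {node: community id}.
    """
    degree = {}
    for (i, j), w in weights.items():
        degree[i] = degree.get(i, 0.0) + w
        degree[j] = degree.get(j, 0.0) + w
    m = sum(weights.values())  # total edge weight
    q = 0.0
    # Sum A_ij - k_i k_j / 2m over ordered pairs in the same community.
    for i in degree:
        for j in degree:
            if labels[i] != labels[j]:
                continue
            a_ij = weights.get((i, j), 0.0) + weights.get((j, i), 0.0)
            q += a_ij - degree[i] * degree[j] / (2.0 * m)
    return q / (2.0 * m)
```

For two triangles joined by a single edge, labeling each triangle as its own community gives $Q = 5/14$, while putting all six nodes in one community gives $Q = 0$.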

COMMUNITY DISCOVERY

Algorithm

Result output and Variable updates


Experiments

Dataset: Digg.com
  A popular social news website for people to discover and share content
  9,583 users, 56,440 contacts
  Various types of user behaviors: submit, digg, comment, and reply to a comment
  Edges (308,362 in total): an edge exists if one user diggs or comments on a story of another
  Weight of an edge: the total number of diggs and comments
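The edge construction described above can be sketched as a simple counting pass over an action log. The record layout `(actor, story_owner, kind)` and the function name `build_weighted_edges` are hypothetical; the slides do not specify how the Digg data was stored.

```python
from collections import Counter


def build_weighted_edges(actions):
    """Count diggs and comments from one user on another user's stories.

    actions: iterable of (actor, story_owner, kind) tuples (assumed layout).
    Returns {(actor, story_owner): weight}.
    """
    weights = Counter()
    for actor, owner, kind in actions:
        # Only diggs and comments on someone else's story create edges.
        if kind in ("digg", "comment") and actor != owner:
            weights[(actor, owner)] += 1
    return dict(weights)
```

Each key is a directed pair, and its value is the total number of diggs and comments, matching the edge weight definition on the slide.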

Experiments

Dataset: Digg.com (9,583 users, 56,440 contacts, 308,362 edges; the weight of an edge is the total number of diggs and comments)

Settings: Parameter 0.6

Experiments

Result: 3 most self-representative users on 3 different topics for Digg user network

Experiments

Result: 3 most representative users of 5 communities on 3 different subsets

Experiments

Result: Representative network on a subgraph of the Digg-2 network

[Figure: representative network whose nodes are Digg users (pyrates, mikek814, rocr69, 1nfiniteLoop, pavelmah, GordonFree, maxthreepwood, upick, ritubpant, wonderwal, mklopez, Omek, SirPopper, irfanmp, numberneal, mpind176, louiebaur, zohaibusman, optimusprime01) and whose edges carry scores ranging from 0.0000 to 0.0024]


Modeling information diffusion in social network
  Supervised model
  Bridging the actual value (label) with the variable
  More variables to come?
  Learning the weights

Thanks
