[ieee 2012 international conference on advances in social networks analysis and mining (asonam 2012)...



Page 1: [IEEE 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012) - Istanbul (2012.08.26-2012.08.29)] 2012 IEEE/ACM International Conference on Advances

Global Similarity in Social Networks with Typed Edges

D.B. Skillicorn and Q. Zheng
School of Computing
Queen’s University
Kingston, Canada

{skill,quan}@cs.queensu.ca

Abstract—Most real-world social network analysis treats edges (relationships) as having different intensities (weights), but the same qualitative properties. We address the problem of modelling edges of qualitatively different types that nevertheless interact with one another. For example, influence flows along friend and colleague edges differently, but treating the two sets of different kinds of edges as independent graphs surely misses interesting and useful structure.

We model the subgraph corresponding to each edge type as a layer, and show how to weight the edges connecting the layers to produce a consistent spectral embedding, including for directed graphs. This embedding can be used to compute social network properties of the combined graph, to predict edges, and to predict edge types. We illustrate with Padgett’s dataset of Florentine families in the 15th Century.

I. INTRODUCTION

Social networks capture the pairwise connections between individuals (or, in general, nodes of any kind), and then induce knowledge from the global structure implied by the totality of these pairwise connections. Properties at three scales can be derived from such data:

• Properties associated with individual nodes: betweenness, centrality;

• Properties associated with the neighborhood of each node: closeness, clustering coefficient, density;

• Properties of the graph as a whole: diameter, contained substructures.

These properties can be directly computed on the graph describing the network, but direct algorithms are often expensive and do not handle incremental settings well, since adding only a single node to a graph can completely alter its path structure. It is common, therefore, to use spectral techniques to embed a graph in a geometric (usually Euclidean) space. Many of the desired properties can then be computed directly from the geometry.

For example, the edges in a social network graph can be weighted with positive weights indicating the strength of the local similarity between the pair of nodes that each edge connects. The implicit similarity between unconnected nodes is a function of the strength, and often the number, of paths between those nodes. Once embedded, all of these similarities can be computed easily based on the reciprocal of the distances between the nodes: close nodes are considered to be highly similar, even if not originally connected. This is the basis of edge prediction.

Spectral approaches to graphs, therefore, integrate local, pairwise similarity information over the entire graph, and use the integrated information to embed the graph, from which improved local similarity information can then be determined. Even for a connected pair of nodes, the distance between them (and its implied similarity) may change from its original value, indicating that the global structure of the graph has implications for what at first appeared to be purely local information. The natural dimension of an embedding of a graph with n nodes is n − 1, so it is usual to follow an embedding by a projection to a much lower dimensional space before calculating distances. Projection to a lower dimensionality also makes visualization possible.

In much social network analysis, edges are weighted, but of a single type. This rules out several interesting kinds of analysis. For example, the social network of an individual often consists of two different kinds of relationships, those associated with work (colleagues) and play (friends) with, of course, potential overlaps. If we want to model the way in which, say, influence works in such a social network, treating all of the edges the same seems inadequate. Presumably some kinds of influence flow better to (as it were) colleagues while other kinds flow better to friends. On the other hand, it also seems inadequate to treat the colleague and friend networks as entirely separate. Presumably some influence can flow from an individual to colleagues, and then on to their friends.

The contribution of this paper is to extend the spectral graph approach to networks where the edges have (one of a fixed number of) types, as well as positive weights. This makes it possible to combine the features of, for example, Facebook and LinkedIn into a single network framework that takes into account the qualitative differences in the functionality of edges.

As a side-effect of the construction, it is possible to answer two kinds of edge prediction questions:

1) Should there be an edge between these two nodes; and, if so, what type should it be?

2) If there is a new edge between two nodes, what type of edge is it?

This enables, for example, a refinement of the suggestions made by social network systems from “this is someone you might know” to “this is a potential colleague” or “this is a potential friend”.

2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

978-0-7695-4799-2/12 $26.00 © 2012 IEEE

DOI 10.1109/ASONAM.2012.23


Validation in this setting is, of course, problematic since it is not clear how influence “should” flow across heterogeneous edges. We illustrate the approach by applying it to a well-studied dataset. Our results do not support the most popular understanding of the rise of the Medici family.

II. RELATED WORK

We frame the problem of interest as combining the information implicit in subgraphs whose edges represent different kinds of local similarity. The problem of finding several different clusterings in a dataset that are somehow consistent with each other has also been addressed, and solutions to one problem serve also as solutions to the other. However, the different ways of framing the problem have led to different algorithmic strategies.

Most attempts to represent graphs with different edge types begin with the separate subgraphs of each type, embed each one independently into a geometric space, and then attempt to “paste” these spaces together into a consistent whole. For example, Zhou et al. [13] combine normalized Laplacian matrices of different views with user-determined weights for each. Xia et al. [11] and Kumar and Daumé [5] use iterative techniques, using each representation of the graph in turn to converge to a common representation. Cheng and Zhao [1] combine the distances in each separate embedding to create a single similarity matrix and then repeat the spectral clustering on this matrix to produce the final embedding. Tang et al. [9] use an approach they call Linked Matrix Factorization to decompose each graph and then combine pieces of the decompositions with weights to create a global fusion of the data. The problem with strategies of these kinds is deciding how the individual representations should be combined, about which there is usually not enough information to decide in a principled way.

Another line of attack is to connect the subgraphs representing each view of the data with edges whose weights are then determined using a regularization framework [4, 7]. De Sa [3] considers the two-subgraph case, models it as a bipartite graph, and uses a spectral partition of the product matrices from the graph. Regularization is problematic because it is not clear whether to build a global regularizer that applies equally to edges of each type and to the edges that connect different subgraphs; or whether each subgraph should have its own regularizer and there should also be another regularizer for the connecting edges. There does not seem to be a compelling argument for either possibility, let alone a way to relate them to one another.

III. APPROACH

Embeddings of undirected graphs attempt to place nodes that are well connected close together so that the (heavy) edges that join them are short. The implicit circularity in this description is resolved, as it often is, by the use of an eigendecomposition whose result is a geometric fixed point. An alternative but equivalent view is that the length of each edge in the embedding should reflect the frequency with which it is traversed in a random walk on the graph, so that long edges connect parts of the graph that are difficult to reach from one another, and so represent good places to cut the graph into clusters.

Direct use of an eigendecomposition of an adjacency matrix fails because well-connected nodes correspond to rows with larger entries. An eigendecomposition embeds such nodes far from the origin when, from a graph perspective, they should be central. Adjacency matrices are therefore transformed into one of a number of Laplacian matrices [6]. The choice among them embodies decisions about how edges are integrated into paths and whether differences in local degrees should affect the distance scale in the embedding; the transformation also turns the graph structure inside-out so that well-connected points are placed centrally (and sparsely connected points are placed peripherally).

For directed graphs, the intuition behind spectral techniques is less obvious. Approaches take the asymmetric adjacency matrix of a directed graph and turn it, in one of a number of ways, into a symmetric matrix which can then be embedded in the standard way [2].

Consider a set of n nodes connected by edges of c different types which, for convenience, we will consider to be colors. The set of edges of each particular type (color) forms a subgraph on the set of n nodes. There could, of course, be edges of more than one type between a particular pair of nodes, and there could be nodes connected by edges of only one type; in general, nodes will have incident edges of several types. Edges are positively weighted with a value that represents the typed pairwise similarity between the nodes they connect.

Each typed subgraph can be represented by a weighted n × n adjacency matrix. But how should these matrices be combined to represent the whole graph? For simplicity of description, we assume that c = 2 and call the two types of edges red and green. Our strategy for representing the multicolored graph is to replicate the nodes of the graph c times, and to represent each subgraph as a layer, connecting the versions of the same nodes. Thus each node in the original graph is represented by c virtual copies of itself, one in each layer. The adjacency matrix of the entire graph is cn × cn, with the adjacency matrices for each color as submatrices down the main diagonal.

But how then are the layers connected to one another? Or, to put it another way, what should appear in the off-diagonal part of the larger adjacency matrix? It is conceivable that downweighted versions of single-color edges could be added to represent “diagonal” paths from, say, a red version of one node to a green version of another. However, it turns out to be simpler to add only “vertical” edges connecting the layers, that is, edges from the version of a node of one color to the matching version of the same node of another color. In other words, the off-diagonal n × n submatrices of the large adjacency matrix are all diagonal matrices.

Now the question becomes: what weights should be associated with these vertical edges? The basis for choosing these weights comes from lazy random walks. In such models, a random walker stays at the current node with probability 1/2 or moves with probability 1/2, choosing among the outgoing edges with probability proportional to their weights. Lazy random walks are better behaved than ordinary random walks since they always have a stationary distribution.

In the layered model, then, a random walker at a node in one of the layers makes a move that stays within the same layer with probability 1/2 (the actual move depending on the weights of the outgoing edges of the node as usual), or a move to another layer with total probability 1/2, divided uniformly across the c − 1 other layers. Thus, from a monochrome perspective, the random walker behaves like an ordinary random walker in the monochrome graph that is the union of the colored subgraphs.

Given a graph with c colored subgraphs, therefore, we build a cn × cn adjacency matrix. The submatrices on the main diagonal are the c different colored subgraphs. The remaining submatrices in each block row are diagonal matrices whose entries are the row sums of the matching colored matrix divided by c − 1. Thus the row sum of the adjacency matrix is twice the row sum of each row of a color submatrix. Henceforth, we ignore the constant factor, which does not affect the eigenvectors.

Let R be the adjacency matrix of the graph with red edges, and G the adjacency matrix of the graph with green edges. Then we construct

$$A = \begin{pmatrix} R & D_r \\ D_g & G \end{pmatrix}$$

where $D_r$ (respectively $D_g$) is the matrix with the degrees (or row sums) of R (respectively G) on its diagonal. In other words, the two versions of each node of the original graph are connected by vertical edges whose weight depends on the edge weights of that node in the red and green subgraphs.
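The block construction above can be sketched in a few lines of NumPy. This is only an illustration for the two-color case: the function name `layered_adjacency` and the toy matrices are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def layered_adjacency(R, G):
    """Build the 2n x 2n block matrix [[R, D_r], [D_g, G]] for c = 2.

    R and G are n x n weighted adjacency matrices of the red and green
    subgraphs. D_r and D_g carry the row sums of R and G on their
    diagonals; they are the weights of the "vertical" edges.
    """
    R = np.asarray(R, dtype=float)
    G = np.asarray(G, dtype=float)
    D_r = np.diag(R.sum(axis=1))
    D_g = np.diag(G.sum(axis=1))
    return np.block([[R, D_r], [D_g, G]])

# Toy data (hypothetical): red layer is the path 0-1-2,
# green layer is a single edge 0-2 of weight 2.
R = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
G = np.array([[0, 0, 2],
              [0, 0, 0],
              [2, 0, 0]])
A = layered_adjacency(R, G)
```

In this toy case the upward and downward vertical weights of node 0 differ (1 versus 2), so `A` is not symmetric, and node 1 has an all-zero row in the green layer.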

Embedding such a matrix constructed by using two copies of R, for example, is straightforward because the matrix is symmetric; it embeds the two copies of each node exactly at the same locations, demonstrating why these values for the edge weights between layers are appropriate.

In general, A will not be symmetric because the total red edge weight and the total green edge weight are not the same at each node. Hence the graph described by the adjacency matrix is actually a directed graph because the downwards and upwards edges between layers have different weights.

It is possible to explicitly symmetrize the entries so that the downward and upward edges between versions of the same node have the same weight. However, we use directed-graph spectral techniques directly, constructing a symmetric graph to embed, but in a different way. One of the advantages of using directed-graph spectral techniques is that it makes it possible for the red and green subgraphs also to be directed, increasing the space of possible models.

The social-network intuition behind this model is as follows:

Each individual plays a number of different roles, and in each role is connected to a different set of others with role-specific intensities or closeness. Properties such as influence flow along edges of a single color exactly as we assume they do in existing social-network modelling. However, paths that involve multiple colors encounter a kind of resistance that is proportional to how easily each individual can move among their roles. In other words, the cost associated with flow changing from one colored edge to another is modelled as the cost of individuals “changing gear” to convey some property such as influence from one of their networks to another. In practice, this cost might be associated with contexts: an individual has to remember something they learned at work in their home context in order to pass it on to a friend.

For a directed graph, the edge weight on an individual edge is not a good estimate of its global importance in connecting the graph. In a general graph, an edge with a large weight will be traversed often by a random walker. However, in a directed graph such an edge can only be traversed often if there are plentiful ways for the random walker to reach the node(s) upstream of it. This cannot be determined from the immediate neighborhood of its source node.

Computing an embedding for a directed graph is therefore a two-stage process: first, determine the global importance of each node, and then use this information to modulate the edge weights and symmetrize the matrix appropriately.

The first step, then, is to compute the stationary distribution of the random walk, R, on the 2n-node graph. This distribution is a mapping from each node to a real value between 0 and 1 representing the fraction of the time a random walker spends at that node.

The 2n-node adjacency matrix, A, can be converted into a random walk matrix, R, by dividing each row by its row sum. However, there is a technical problem when a node is disconnected in one (or more) of the layers, and so a row sum is zero. If the minor diagonal entries are chosen to sum to 1, then this version of the node is drawn more strongly towards the other layers than the others of its color. Although this is not implausible in theory, it does not seem appropriate for the application domain. As a result we set the major diagonal entry to 1/2, and divide the remaining probability equally across the minor diagonal entries. In other words, for nodes that are isolated in a subgraph, we introduce a self-loop for the node in that subgraph.
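A minimal sketch of this row normalization, including the special case for a node that is isolated in one layer, might look as follows. The function name `random_walk_matrix` is hypothetical, and the walk matrix is called `P` here only to avoid a clash with the red adjacency matrix R; the input is assumed to be a layered matrix with n nodes per layer.

```python
import numpy as np

def random_walk_matrix(A, n):
    """Row-normalize a layered cn x cn adjacency matrix A.

    Rows with zero sum (a node isolated in its layer) get a self-loop
    of probability 1/2, with the remaining 1/2 split equally over the
    copies of the same node in the other c - 1 layers.
    """
    A = np.asarray(A, dtype=float)
    c = A.shape[0] // n
    P = np.zeros_like(A)
    for i in range(A.shape[0]):
        row_sum = A[i].sum()
        if row_sum > 0:
            P[i] = A[i] / row_sum
        else:
            P[i, i] = 0.5                      # major diagonal entry
            node = i % n
            for layer in range(c):
                j = layer * n + node           # same node, other layer
                if j != i:
                    P[i, j] = 0.5 / (c - 1)    # minor diagonal entries
    return P

# Hypothetical 2-layer example with n = 3; row 4 (node 1 in the
# second layer) is isolated and therefore all zero.
A = np.array([[0, 1, 0, 1, 0, 0],
              [1, 0, 1, 0, 2, 0],
              [0, 1, 0, 0, 0, 1],
              [2, 0, 0, 0, 0, 2],
              [0, 0, 0, 0, 0, 0],
              [0, 0, 2, 2, 0, 0]], dtype=float)
P = random_walk_matrix(A, n=3)
```

Every row of `P` sums to 1, and the isolated copy of node 1 keeps half of its probability mass on itself.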

We then compute the stationary distribution of R on the 2n-node graph, representing the time a random walker spends at each node, and therefore representing the global importance of each node in the directed graph. This stationary distribution is the principal left eigenvector of R. Let Π be the 2n × 2n diagonal matrix whose entries are those of the stationary distribution. Construct the matrix W as

$$W = \left(\Pi^{1/2} R\, \Pi^{-1/2} + \Pi^{-1/2} R'\, \Pi^{1/2}\right)/2$$

where the superscript dash indicates transposition. W is a symmetric matrix where the edge weights, especially the edge weights of the “vertical” edges between layers, have been altered to reflect the global importance of the nodes at the top and bottom of each edge [2, 12].
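The two stages just described (stationary distribution, then symmetrization) can be sketched as below. This is an illustrative implementation under the assumption that the walk matrix, here called `P`, is row-stochastic with a strictly positive stationary distribution; the function names and the small 3-state chain are made up for the example.

```python
import numpy as np

def stationary_distribution(P):
    """Principal left eigenvector of P, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)      # left eigenvectors of P
    k = np.argmax(vals.real)             # eigenvalue 1 is the largest
    pi = np.abs(vecs[:, k].real)
    return pi / pi.sum()

def symmetrize(P, pi):
    """W = (Pi^{1/2} P Pi^{-1/2} + Pi^{-1/2} P' Pi^{1/2}) / 2."""
    s = np.sqrt(pi)
    M = (s[:, None] * P) / s[None, :]    # Pi^{1/2} P Pi^{-1/2}
    return (M + M.T) / 2                 # M.T is the second term

# Hypothetical directed 3-state chain; not data from the paper.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.50, 0.00, 0.50]])
pi = stationary_distribution(P)
W = symmetrize(P, pi)
```

For this chain the stationary distribution works out to (0.4, 0.4, 0.2), and the vector of square roots of its entries is an eigenvector of `W` with eigenvalue 1, which is the trivial direction that is discarded when embedding.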

W can be embedded in (2n − 1)-dimensional space in a standard way and (provided that the layered graph is connected) truncated using the “right hand” eigenvectors, columns 2n − 1, 2n − 2, and so on, as the coordinates of the points corresponding to each node.

The directed-graph symmetrizing process reduces, in the undirected case, to a symmetric Laplacian embedding. Such an embedding is problematic when the degrees of different nodes have substantially different magnitudes, as they do in social networks because some people have many connections and some only a few. In particular, a symmetric Laplacian embedding tends to place nodes with few connections closer to the origin than their better-connected neighbors despite their being more isolated and so better represented as more remote. Accordingly, we alter the embedding in such a way that, in the undirected case, it corresponds to a walk Laplacian embedding. This is done by dividing each entry in the eigenvectors by the square root of the corresponding entry of the stationary distribution, that is, multiplying the eigenvector matrix by $\Pi^{-1/2}$.
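As a rough sketch of the truncation and rescaling steps, the code below takes the eigenvectors of a symmetric W next to the trivial top one and divides each row by the square root of the matching stationary probability. The helper name `embed` and the 4-node path graph are assumptions for illustration; the example uses the undirected case, where W reduces to D^{-1/2} S D^{-1/2} and the stationary distribution is proportional to degree.

```python
import numpy as np

def embed(W, pi, k=2):
    """Return k-dimensional coordinates from symmetric W.

    np.linalg.eigh sorts eigenvalues in ascending order, so the last
    column is the trivial eigenvector (eigenvalue 1). The next k
    columns to its left are used as coordinates and then rescaled by
    Pi^{-1/2}.
    """
    vals, vecs = np.linalg.eigh(W)
    coords = vecs[:, -2:-2 - k:-1]       # columns -2, -3, ...
    return coords / np.sqrt(pi)[:, None]

# Undirected toy graph (hypothetical): the path 0-1-2-3.
S = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = S.sum(axis=1)
pi = d / d.sum()                         # stationary distribution
W = S / np.sqrt(np.outer(d, d))          # D^{-1/2} S D^{-1/2}
X = embed(W, pi, k=2)
```

On this path graph the first coordinate places the two end nodes at opposite extremes and the two inner nodes at half that distance from the origin, so the sparsely connected end nodes are the more remote ones, as the walk Laplacian rescaling intends.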

The embedding agrees across layers (that is, each copy of the same virtual node is embedded at exactly the same location) when multiple copies of the same graph are used as the input data because the graph is undirected and both approaches reduce to an embedding of the walk Laplacian on each of the subgraphs.

As an illustration, Figure 1 shows the embedding of a graph whose red layer consists of a circle of 8 nodes, all connected to two central nodes, and whose green layer consists of two cliques of size four, all of whose nodes are connected to the two central nodes. In the embedding, the distorting effect of the red layer on the green cliques pulls them into trapezoids, while the green layer pulls the red circle of points into an ellipse. The longer blue lines indicate those points where the other layer has the most effect on the placement of the points.

IV. AN EXAMPLE

To illustrate the power of the typed-edge representation, we build the combined social network of Florentine families in the 15th Century. This period has been extensively studied because of the rise of the Medici family, who rose from humble beginnings to dominate Tuscany and the Papacy over the ensuing two centuries.

One popular theory explaining the growth in Medici power at the beginning of the 15th Century is that they developed two separated social networks. On the one hand, they built financial relationships with nouveau riche families; on the other, they built marriage ties with the “old money”, oligarch families [8]. By keeping these two kinds of ties distinct, they were able to act as gatekeepers between two communities, a role they parlayed into power broking.

Fig. 1. The embedding of a simple graph

TABLE I
DIVISION AMONG FLORENTINE FAMILIES

Medici block    Oligarchs
Medici          Peruzzi
Tornabuoni      Castellani
Acciaiuoli      Strozzi
Ginori          Albizzi
Pazzi           Bischeri
Ridolfi         Guadagni
                Lamberteschi

Barbadori, Salviati with divided loyalties.

Data about the connections among Florentine families was collected by Padgett (www.casos.cs.cmu.edu/computationaltools/datasets/sets/padgett/index.html), and labelled as either marriage or business ties. This dataset has been extensively analyzed from a social network perspective, but from the point of view of one of the link types at a time, or as a single network [10].

Although allegiances shifted from time to time, the families can be divided roughly into those affiliated with the Medicis, and the older, oligarch families as shown in Table I.

Figure 2 shows the subgraph of marriage ties, embedded using two identical copies of the graph to illustrate the consistency of the method. The graph shows a division among the oligarchs between those with slightly closer ties to the Medicis (Albizzi, Lamberteschi, Guadagni) and those without (Castellani, Strozzi, Peruzzi, Bischeri).

Figure 3 shows the subgraph of financial ties. Here there is obvious structure; essentially there is a linear connection with the Medicis almost at one end, and the oligarchs at the other. The two groups of oligarchs visible in the marriage network are rearranged in the financial network, so that the Castellani have close financial connections to the Medici, but not close marriage connections.


Fig. 2. Social network of marriages (importance embedding). Axes: eigenvector columns n−1 and n−2.

Fig. 3. Social network of financial dealings (importance embedding). Axes: eigenvector columns n−1 and n−2.

A. Combined social network

The combined social network is shown in Figure 4. Edges between the two versions of each individual node are drawn in blue. The length of these blue edges can be interpreted as an indication of how different the social and business roles of each family are in this context.

The combined graph shows that, when both kinds of relationships are taken into account, there are two different classes of families. The first includes Salviati, Pazzi, Acciaiuoli, Ridolfi, and Albizzi, who are all relatively close to the Medici; and the second includes Guadagni, Lamberteschi, Bischeri, Peruzzi, Strozzi, and Castellani. The combined network provides information about the network status of the two families considered to have divided loyalties. The Salviati family, on this data, is clearly connected to the Medicis. The Barbadori family plays a strong intermediary role in both networks but they appear to be better associated with the Medici side.

This combined social network does not support a model in which the Medicis act as intermediaries between the newer families and the oligarchs, since the Medicis are at one extreme of this arrangement, rather than central to it. There are four different connections between the Medici group and the oligarchs: via Ridolfi, via Tornabuoni, via Ginori and Albizzi, and via Barbadori, this last being the most important (and most direct) because it involves both marriage and financial connections. The combined graph also makes it clear that the Strozzi family is the most isolated from the Medici family. Indeed, the relationship between these families became, over the next century, one of the great rivalries in Tuscany.

B. Edge prediction

The embedded combined social network allows us to do edge prediction because the distances between pairs of nodes in the embedding are a meaningful reflection of their global similarity, even if they were not already connected in the given subgraphs.

The first form of edge prediction answers the question: should there be an edge between these two vertices? For example, in many datasets two unconnected nodes that have a short distance between them suggest either that there is a connection that failed to be collected in the dataset, or that there should be or might soon be an edge (for example, in collaboration networks).

In the Florentine families data, we can, for example, ask the question: which unconnected families are likely to be connected? In a 15th Century version of Facebook, we could suggest to both families that the other is one with which they might build an alliance.

We do this by computing the edge length for each pair of nodes that is not already connected in the given datasets, using the five most significant dimensions in the embedding. (As a result, the distances do not necessarily match the apparent distances in the two-dimensional view of the embedding in Figure 4.) The pairwise edge lengths are given in Table II, with the Acciaiuoli, Pazzi, and Salviati families removed, since they are all Medici clients and connect only weakly to the rest of the network.
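The ranking step can be illustrated with a short sketch. It assumes the coordinates of one layer have already been computed and truncated (the paper uses five dimensions; the toy example uses two), and that the layer's adjacency matrix is available. The function name `predict_edges` and all numbers below are invented for the example and are not the Florentine data.

```python
import numpy as np
from itertools import combinations

def predict_edges(X, adj):
    """Rank unconnected node pairs of one layer by embedded distance.

    X   : (n, k) array of embedded coordinates for the layer.
    adj : (n, n) adjacency matrix of the same layer.
    Returns a list of (distance, i, j), shortest distance first.
    """
    n = adj.shape[0]
    candidates = []
    for i, j in combinations(range(n), 2):
        if adj[i, j] == 0 and adj[j, i] == 0:      # not yet connected
            dist = float(np.linalg.norm(X[i] - X[j]))
            candidates.append((dist, i, j))
    return sorted(candidates)

# Hypothetical coordinates and edges (0-1 and 1-2 are connected).
X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.1, 0.0],
              [5.0, 5.0]])
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 0],
                [0, 0, 0, 0]])
ranked = predict_edges(X, adj)
```

Here the pair (0, 2) comes first in `ranked`, since those two nodes are unconnected but embedded very close together, which is the kind of pair the text proposes as a likely new edge.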

The shortest distance in this embedding is the financial distance between Medici and Castellani, suggesting that Barbadori’s role as middleman in financial dealings is at risk. The Medici family also has short potential marriage connections to the Peruzzi and Castellani families. The Strozzi and Medici families are quite remote from one another in this social network, but the data suggests that, if they make a connection, it is significantly more likely to be a marriage connection rather than a financial connection. And indeed that is what happened: the Strozzis married into the Medicis, but not until some years later.

Fig. 4. Combined social network with red marriage edges, green financial edges, and blue edges representing the magnitude of role differences

A natural question is why the distances are only calculated within single layers. What about the distances from the red version of one node to the green version of another? Our model does not include 'diagonal' edges of this kind, so such distances are meaningless within our framework. It can happen that points $x_R$ and $y_G$ are much closer together than $x_R$ and $y_R$, or $x_G$ and $y_G$. When this happens, it means that $x_R$ is well-connected to some red-layer point, $z_R$, $y_G$ is well-connected to its green-layer analogue, $z_G$, and $z_R$ and $z_G$ are well-connected vertically. This amounts to saying that vertical edges with high weight "pull" the layers closer together, but this is already fully accounted for by the framework.

A second version of the edge prediction problem is this: suppose a new edge appears, either in the real world or because the dataset has been updated. What color is it most likely to be? This question can be answered by comparing the distances between the relevant pair of nodes in the red layer and in the green layer, and predicting that the edge has the type of whichever layer gives the smaller distance. For example, the Bischeri and Castellani families are already fairly well connected by both marriage and financial links, but do not have a direct connection. If we learn that these two families have made a direct alliance, which mechanism is more likely to have been used? The marriage distance between them is 0.109 and the financial distance is 0.114, so we predict that a marriage relationship is slightly more likely.
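The decision rule for edge-type prediction can be sketched in a few lines. The function name `predict_edge_type` and the dictionary input are hypothetical choices for this illustration; the distances are the Bischeri–Castellani values quoted in the text. The behavior on ties is not specified in the paper; this sketch simply returns the first layer listed.

```python
def predict_edge_type(layer_distances):
    """Return the layer in which the two nodes are closest.

    layer_distances: dict mapping layer name -> distance between the
                     pair of nodes in that layer of the embedding.
    On a tie, the first layer in insertion order is returned
    (an arbitrary choice made for this sketch).
    """
    return min(layer_distances, key=layer_distances.get)


# Bischeri–Castellani distances as reported in Table II.
bischeri_castellani = {"marriage": 0.109, "financial": 0.114}
predicted = predict_edge_type(bischeri_castellani)
```

With these values `predicted` is the marriage layer, although the margin (0.005) is small, which is why the text describes marriage as only slightly more likely.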

V. DISCUSSION AND CONCLUSIONS

Social networks that ignore the different possible kinds of relationships, as most do today, surely miss useful information about how properties such as influence work in heterogeneous networks. The flow along friendship and colleague edges is different because the contexts in which each takes place are usually different, geographically and mentally, a difference exacerbated by the amounts of time we spend in these different contexts. On the other hand, treating, say, friendship and business networks as distinct (Facebook and LinkedIn) also surely misses parts of the interactions among them.

We have developed a spectral approach to understanding and analyzing graphs with a finite number of edge types that provides well-motivated results, and a deeper intuition to help understand the resulting embedding. There is always a problem of validation in such approaches because ground truth is not well-defined; but we show how the approach can be used to analyze groups where there is already some understanding of the group dynamics. The edge-typed data provides a more nuanced understanding of how this particular social network is structured. Incidentally, it also suggests that a model of how Medici power was deployed through the 15th Century is not supported by this data, at least during this early period of Medici history.

Our use of a directed graph algorithm enables the approach to be generalized to more realistic scenarios in which the individual typed edges are themselves directed. In the Florentine families, for example, marriage has very different implications from the point of view of the man's family than from the woman's; and a business relationship is very different from the point of view of the borrower than from the lender. These effects can be accounted for directly in the framework we have constructed.

The nominal size of the matrices describing the networks grows quadratically with the number of edge types, but the extra pieces are only diagonal matrices, so sparse matrix representations keep the actual growth linear. Hence the analysis algorithms are reasonably scalable.

REFERENCES

[1] Yong Cheng and Ruilian Zhao. Multiview spectral clustering viaensemble. In Granular Computing, 2009, GRC ’09. IEEE InternationalConference on, pages 101 –106, aug. 2009.

[2] F. Chung. Laplacians and the Cheeger inequality for directed graphs.Annals of Combinatorics, 9:1–19, 2005.

[3] Virginia R. de Sa. Spectral clustering with two views. In Proceedingsof the Workshop on Learning with Multiple Views, 22 nd ICML. ICML(International Conference on Machine Learning), 2005.

[4] Xiaowen Dong, Pascal Frossard, Pierre Vandergheynst, and NikolaiNefedov. Clustering with multi-layer graphs: A spectral perspective.Technical report, Computing Research Repository, 2011.

[5] Abhishek Kumar and Hal Daum. A co-training approach for multi-viewspectral clustering. Computer, 94(5):393–400, 2011.

[6] Ulrike Luxburg. A tutorial on spectral clustering. Statistics andComputing, 17(4):395–416, December 2007.

[7] Pradeep Muthukrishnan, Dragomir R. Radev, and Qiaozhu Mei. Edgeweight regularization over multiple graphs for similarity learning. InGeoffrey I. Webb, Bing Liu, Chengqi Zhang, Dimitrios Gunopulos, andXindong Wu, editors, ICDM, pages 374–383. IEEE Computer Society,2010.

[8] J.F. Padgett and C.K. Ansell. Robust action and the rise of the Medici.American Journal of Sociology, 96(6):1259–1319, May 1993.

[9] Wei Tang, Zhengdong Lu, and Inderjit S. Dhillon. Clustering withmultiple graphs. In Wang, Kargupta, Ranka, Yu, and Wu, editors,IEEE International Conference on Data Mining, pages 1016–1021. IEEEComputer Society, 2009.

[10] S. Wasserman and K. Faust. Social Network Analysis: Methods andApplications. Cambridge University Press, 1994.

[11] Tian Xia, Dacheng Tao, Tao Mei, and Yongdong Zhang. Multiview spec-tral embedding. IEEE Transactions on Systems, Man, and Cybernetics,Part B, 40(6):1438–1446, 2010.

[12] D. Zhou, J. Huang, and B. Scholkopf. Learning from labeled andunlabeled data on a directed graph. In L. De Raedt and S. Wrobel,editors, Proceedings of the 22nd International Conference on MachineLearning (ICML), pages 1041–1048, 2005.

[13] Dengyong Zhou and Christopher J. C. Burges. Spectral clustering andtransductive learning with multiple views. In Proceedings of the 24thinternational conference on Machine learning, ICML ’07, pages 1159–1166, New York, NY, USA, 2007. ACM.

TABLE II. TABLE OF DISTANCES BETWEEN UNCONNECTED NODES (SMALLER MEANS MORE CLOSELY RELATED)

| Pair | Marriage distance | Business distance |
|---|---|---|
| Albizzi–Barbadori | 0.200 | 0.264 |
| Albizzi–Bischeri | 0.200 | 0.288 |
| Albizzi–Castellani | 0.208 | 0.276 |
| Albizzi–Lamberteschi | 0.243 | 0.303 |
| Albizzi–Peruzzi | 0.204 | 0.277 |
| Albizzi–Ridolfi | 0.286 | 0.426 |
| Albizzi–Strozzi | 0.227 | 0.311 |
| Albizzi–Tornabuoni | 0.215 | 0.285 |
| Barbadori–Bischeri | 0.120 | 0.144 |
| Barbadori–Guadagni | 0.186 | 0.194 |
| Barbadori–Lamberteschi | 0.205 | 0.170 |
| Barbadori–Ridolfi | 0.257 | 0.369 |
| Barbadori–Strozzi | 0.140 | 0.216 |
| Barbadori–Tornabuoni | 0.152 | 0.120 |
| Bischeri–Castellani | 0.109 | 0.114 |
| Bischeri–Ginori | 0.178 | 0.170 |
| Bischeri–Medici | 0.121 | 0.154 |
| Bischeri–Ridolfi | 0.215 | 0.319 |
| Bischeri–Tornabuoni | 0.115 | 0.128 |
| Castellani–Ginori | 0.161 | 0.108 |
| Castellani–Guadagni | 0.191 | 0.172 |
| Castellani–Medici | 0.112 | 0.102 |
| Castellani–Ridolfi | 0.261 | 0.365 |
| Castellani–Tornabuoni | 0.169 | 0.133 |
| Ginori–Guadagni | 0.201 | 0.206 |
| Ginori–Lamberteschi | 0.237 | 0.193 |
| Ginori–Peruzzi | 0.165 | 0.122 |
| Ginori–Ridolfi | 0.280 | 0.367 |
| Ginori–Strozzi | 0.191 | 0.225 |
| Ginori–Tornabuoni | 0.194 | 0.132 |
| Guadagni–Medici | 0.160 | 0.190 |
| Guadagni–Peruzzi | 0.160 | 0.142 |
| Guadagni–Ridolfi | 0.216 | 0.310 |
| Guadagni–Strozzi | 0.260 | 0.358 |
| Lamberteschi–Medici | 0.185 | 0.173 |
| Lamberteschi–Ridolfi | 0.239 | 0.324 |
| Lamberteschi–Strozzi | 0.284 | 0.339 |
| Lamberteschi–Tornabuoni | 0.141 | 0.141 |
| Medici–Peruzzi | 0.112 | 0.112 |
| Medici–Strozzi | 0.172 | 0.251 |
| Peruzzi–Ridolfi | 0.244 | 0.350 |
| Peruzzi–Tornabuoni | 0.151 | 0.125 |
| Strozzi–Tornabuoni | 0.216 | 0.285 |
