computing local and global centrality
DESCRIPTION
Some recent results on computing PageRank and Katz scores on large networks that I presented at a Dagstuhl workshop.
TRANSCRIPT
COMPUTING LOCAL AND GLOBAL CENTRALITY DAVID F. GLEICH (AND MANY OTHERS)! DATA MINING, NETWORKS AND DYNAMICS 2011 NOVEMBER 7
1
Pooya Esfandiar
Byung-Won On
Chen Greif
Laks V.S. Lakshmanan
Francesco Bonchi
LOCAL GLOBAL
Vahab Mirrokni
Reid Andersen
2/41
Graph centrality
Global: How important is a node?
Local: How important is a node with respect to another one?
3/41
Graph centrality
Koschützki et al.: a centrality must respect isomorphism; higher is better.
Examples: node degree, 1/shortest-path.
4/41
Graph centrality
This talk: path summation

$\sum_{\ell} f(\text{paths of length } \ell)$

The local Katz score:

$\sum_{\ell} \alpha^{\ell} \cdot (\text{number of paths of length } \ell \text{ between } i \text{ and } j)$
5/41
A – adjacency matrix, L – Laplacian matrix, P – random walk transition matrix

Katz score: $K_{i,j} = [(I - \alpha A^T)^{-1}]_{i,j}$

Commute time: $C_{i,j} = \mathrm{vol}(G)\,(L^{+}_{i,i} + L^{+}_{j,j} - 2L^{+}_{i,j})$

PageRank: $X_{i,j} = (1 - \alpha)[(I - \alpha P^T)^{-1}]_{i,j}$, i.e., $(I - \alpha P^T)x = (1 - \alpha)e/n$
6/41
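As a concrete sketch of the three definitions above, here is how they can be computed with dense linear algebra on a toy graph (not from the talk; the toy graph and the choices of α are mine):

```python
import numpy as np

# Toy undirected path graph on four nodes.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
n = A.shape[0]
d = A.sum(axis=1)                        # degrees

# Katz: K = (I - alpha_k * A^T)^{-1}, with alpha_k below the 1/||A||_2 limit.
alpha_k = 0.85 / np.linalg.norm(A, 2)
K = np.linalg.inv(np.eye(n) - alpha_k * A.T)

# Commute time: C_ij = vol(G) * (L+_ii + L+_jj - 2 L+_ij).
L = np.diag(d) - A
Lplus = np.linalg.pinv(L)                # pseudoinverse of the Laplacian
C = d.sum() * (np.add.outer(np.diag(Lplus), np.diag(Lplus)) - 2 * Lplus)

# Personalized PageRank: X = (1 - alpha) * (I - alpha * P^T)^{-1};
# column i solves (I - alpha*P^T) x = (1 - alpha) e_i.
alpha = 0.85
P = A / d[:, None]                       # row-stochastic random-walk matrix
X = (1 - alpha) * np.linalg.inv(np.eye(n) - alpha * P.T)
```

Each column of X sums to one, and the commute-time matrix is symmetric with a zero diagonal, which is a quick sanity check on the formulas.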
USES FOR CENTRALITY
Ranking features for web-search/classification
Najork, M. A.; Zaragoza, H. & Taylor, M. J. HITS on the web: How does it compare?
Becchetti, L.; Castillo, C.; Donato, D.; Baeza-Yates, R. & Leonardi, S. Link analysis for Web spam detection
Interesting nodes
GeneRank, ProteinRank, TwitterRank, IsoRank, FutureRank, HostRank, DiffusionRank, ItemRank, SocialPageRank, SimRank
7/41
USES FOR CENTRALITY
Ranking networks of comparisons
Chartier, T. P.; Kreutzer, E.; Langville, A. N. & Pedings, K. E. Sensitivity and Stability of Ranking Vectors
Clustering or community detection
Andersen, R.; Chung, F. & Lang, K. Local Graph Partitioning using PageRank Vectors
Link prediction
Savas et al. (Hold on about 90 minutes.)
8/41
THESE GET USED A LOT. THEY MUST BE FAST.
9
MATRICES, MOMENTS, QUADRATURE
Estimate a quadratic form. Also used by Benzi and Boito (LAA) for Katz scores and the matrix exponential.

Bounds: $l \le x^T f(Z)\,x \le u$

Commute: $(e_i - e_j)^T L^{+} (e_i - e_j)$

Katz: $\tfrac{1}{4}(e_i + e_j)^T (I - \alpha A^T)^{-1}(e_i + e_j) - \tfrac{1}{4}(e_i - e_j)^T (I - \alpha A^T)^{-1}(e_i - e_j)$
10/41
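The Katz and commute entries above are bilinear forms $e_i^T B e_j$, while MMQ bounds quadratic forms $x^T f(Z)\,x$; the ¼ combinations come from the polarization identity for symmetric B. A quick numeric check (the matrix and indices are arbitrary):

```python
# Polarization identity: for symmetric B,
# ei^T B ej = 1/4 (ei+ej)^T B (ei+ej) - 1/4 (ei-ej)^T B (ei-ej).
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
B = M + M.T                              # any symmetric matrix
i, j = 1, 3
ei, ej = np.eye(5)[i], np.eye(5)[j]

direct = ei @ B @ ej
polar = (0.25 * ((ei + ej) @ B @ (ei + ej))
         - 0.25 * ((ei - ej) @ B @ (ei - ej)))
```

Expanding the two quadratic forms, the diagonal terms cancel and the cross terms add up to exactly $4 B_{ij}$, which is why two quadratic-form estimates suffice for one pairwise score.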
MMQ - THE BIG IDEA
Quadratic form
→ Weighted sum (think: A is s.p.d., use the EVD)
→ Stieltjes integral ("a tautology")
→ Quadrature approximation
→ Matrix equation (Lanczos)
David F. Gleich (Purdue), Univ. Chicago SSCS Seminar
11/41
MMQ PROCEDURE
Goal: Given
1. Run k steps of Lanczos on , starting with
2. Compute , with an additional eigenvalue at ; set (corresponds to a Gauss-Radau rule with u as a prescribed node)
3. Compute , with an additional eigenvalue at ; set (corresponds to a Gauss-Radau rule with l as a prescribed node)
4. Output as lower and upper bounds on
12/41
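The Lanczos step behind the procedure can be sketched in a few lines: k steps of Lanczos on an s.p.d. matrix A with start vector u give a tridiagonal $T_k$, and $\|u\|^2 [f(T_k)]_{1,1}$ is the Gauss-rule estimate of $u^T f(A)\,u$. This is a minimal sketch of my own (plain Gauss rule only; the Gauss-Radau modifications that yield the two-sided bounds are omitted):

```python
import numpy as np

def lanczos_quadrature(A, u, k, f):
    """Gauss-rule estimate of u^T f(A) u from k steps of Lanczos."""
    n = len(u)
    Q = np.zeros((n, k))
    alphas, betas = [], []
    beta = np.linalg.norm(u)
    norm_u2 = beta ** 2
    q, q_prev, b = u / beta, np.zeros(n), 0.0
    for i in range(k):
        Q[:, i] = q
        w = A @ q - b * q_prev               # three-term recurrence
        a = float(q @ w)
        w = w - a * q
        w = w - Q[:, :i + 1] @ (Q[:, :i + 1].T @ w)  # full reorthogonalization
        b = np.linalg.norm(w)
        alphas.append(a)
        betas.append(b)
        if b < 1e-12:                        # breakdown: invariant subspace
            break
        q_prev, q = q, w / b
    m = len(alphas)
    T = (np.diag(alphas)
         + np.diag(betas[:m - 1], 1) + np.diag(betas[:m - 1], -1))
    nodes, vecs = np.linalg.eigh(T)          # quadrature nodes and weights
    return norm_u2 * float(np.sum(vecs[0, :] ** 2 * f(nodes)))

# Example: estimate u^T A^{-1} u for a random s.p.d. matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)
u = rng.standard_normal(50)
est = lanczos_quadrature(A, u, 10, lambda t: 1.0 / t)
exact = float(u @ np.linalg.solve(A, u))
```

For the bounds on the slide, one would additionally extend $T_k$ with a prescribed eigenvalue at u (resp. l) to get the Gauss-Radau lower and upper estimates.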
[Figure: upper and lower bounds, and the corresponding error, vs. matrix-vector products for arxiv, Katz, hard alpha; 𝛼 = 1/(‖A‖₂ + 1).]
Error Bounds
How well does it work?
13/41
MY COMPLAINTS
Matvecs are expensive. Takes many iterations. Just one score comes out!
14/41
KATZ SCORES ARE LOCALIZED
Up to 50 neighbors is 99.65% of the total mass.
Katz scores are highly localized.

$(I - \alpha A^T)\,k = e_i$
15/41
HOW CAN WE EXPLOIT THIS?
16
TOP-K ALGORITHM FOR KATZ
Approximate
where is sparse Keep sparse too Ideally, don’t “touch” all of
17/41
TOP-K ALGORITHM FOR KATZ
Approximate
where is sparse Keep sparse too Ideally, don’t “touch” all of
This is possible for personalized PageRank!
18/41
Richardson for $Ax = b$:

$x^{(k+1)} = x^{(k)} + r^{(k)}, \qquad r^{(k+1)} = b - Ax^{(k+1)}$

Equivalent to gradient descent on $\min_x\; x^T A x - 2x^T b$ when $A = A^T$, $A \succeq 0$.

What about coordinate descent?

Gauss-Southwell for $Ax = b$:

$x^{(k+1)} = x^{(k)} + r^{(k)}_{j} e_j, \qquad r^{(k+1)} = r^{(k)} - r^{(k)}_{j} A e_j$

How to pick j?
Frequently "rediscovered" for PageRank: McSherry (WWW 2005), Berkhin (JIM 2007), Andersen-Chung-Lang (FOCS 2006).
19/41
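The Gauss-Southwell update can be sketched for a single Katz column. This is a sketch of my own (sparse residual plus a lazy max-heap to pick the largest residual; `katz_column_gs` and the other names are mine), and the point is that it only touches vertices the residual actually reaches:

```python
import heapq
from collections import defaultdict

def katz_column_gs(neighbors, i, alpha, tol=1e-8, max_steps=1_000_000):
    """Approximate x solving (I - alpha*A^T) x = e_i by repeatedly
    relaxing the coordinate j with the largest residual."""
    x = defaultdict(float)
    r = {i: 1.0}                   # residual of e_i - (I - alpha*A^T) x
    heap = [(-1.0, i)]             # max-heap on |r_j|, lazy deletion
    for _ in range(max_steps):
        j = None
        while heap:
            neg, cand = heapq.heappop(heap)
            if abs(r.get(cand, 0.0)) == -neg:   # entry still current?
                j = cand
                break
        if j is None or abs(r[j]) < tol:
            break
        rj = r[j]
        x[j] += rj                 # x <- x + r_j e_j
        r[j] = 0.0                 # the -r_j e_j part of the residual update
        for u in neighbors[j]:     # the +alpha r_j A^T e_j part
            r[u] = r.get(u, 0.0) + alpha * rj
            heapq.heappush(heap, (-abs(r[u]), u))
    return dict(x)

# Toy 4-cycle; alpha * ||A||_1 = 0.6 < 1, so the iteration converges.
nbrs = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
scores = katz_column_gs(nbrs, 0, alpha=0.3)
```

On termination every residual entry is below `tol`, so the returned scores satisfy the Katz system up to that tolerance; the seed's own score is the largest.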
DEMO!
20
NEW CONVERGENCE THEORY
Katz and PageRank are equivalent if $\alpha < 1/\|A\|_1$. Gauss-Southwell converges when $\alpha < 1/\|A\|_2$ (Luo and Tseng 1992) if j is picked as the largest residual.
Read all about it: Fast matrix computations for pair-wise and column-wise commute times and Katz scores. Bonchi, Esfandiar, Gleich, Greif, Lakshmanan, J. Internet Mathematics (to appear).
21/4
1
[Figure: Precision@k for exact top-k sets vs. equivalent matrix-vector products; hollywood, Katz, hard alpha; curves for k = 10, 100, 1000 and CG at k = 25.]
1,000,000 nodes, 100,000,000 edges
22/41
OPEN QUESTIONS
I can't find any existing derivation of this method in the non-symmetric case (prior to the PageRank literature). Any thoughts? How can we show that the method converges for a non-symmetric matrix when $(I - \alpha P^T)$ is not diagonally dominant?
23/41
OVERLAPPING CLUSTERS FOR DISTRIBUTED CENTRALITY
24
LARGE GRAPHS, IN PRACTICE
[Figure: edge lists (src -> dst) stored redundantly as Copy 1 and Copy 2 on many hard drives.]
Edge lists, maybe tied together by a common host, stored redundantly on many hard drives.
25/41
UTILIZE SOME REDUNDANCY?
To compute global PageRank?
26
Overlapping clusters for distributed computation. Andersen, Gleich, Mirrokni, WSDM 2012 (to appear).
Use the redundancy to reduce communication when solving a PageRank problem
Overlapping Clusters
27/41
Communication-avoiding algorithms. Communication is the limiting factor in most computations these days; flops are, relatively speaking, free.
28/41
KEY POINTS
Utilize personalized PageRank vectors to find the clusters with "good" conductance scores. Define "core" vertices for each cluster. Find a good way to cover the graph with these clusters. Use restricted additive Schwarz to solve (thanks Prof. Szyld and Prof. Frommer!)
29/41
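The restricted additive Schwarz step can be sketched on a toy linear system. This is a dense sketch with names of my own (the real setting solves the PageRank system over overlapping graph clusters with core vertices), showing the defining idea: every overlapping block solves locally, but only the core entries of each local solution are added back:

```python
import numpy as np

def ras_solve(A, b, blocks, cores, sweeps=200):
    """Stationary restricted additive Schwarz iteration: overlapping
    blocks solve on the current residual; only core entries update x."""
    x = np.zeros_like(b)
    for _ in range(sweeps):
        r = b - A @ x
        update = np.zeros_like(x)
        for block, core in zip(blocks, cores):
            # local solve on the overlapping block
            local = np.linalg.solve(A[np.ix_(block, block)], r[block])
            # restricted update: keep only the core entries
            for pos, idx in enumerate(block):
                if idx in core:
                    update[idx] += local[pos]
        x = x + update
    return x

# Toy s.p.d. M-matrix (1-D Laplacian-like) with two overlapping blocks
# whose cores partition the indices.
n = 8
A = (np.diag(3.0 * np.ones(n))
     + np.diag(-1.0 * np.ones(n - 1), 1)
     + np.diag(-1.0 * np.ones(n - 1), -1))
b = np.ones(n)
blocks = [[0, 1, 2, 3, 4], [3, 4, 5, 6, 7]]
cores = [{0, 1, 2, 3}, {4, 5, 6, 7}]
xs = ras_solve(A, b, blocks, cores)
```

Because the cores do not overlap, each entry of x is owned by exactly one block, which is what cuts the communication relative to plain additive Schwarz.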
All nodes solve locally using the coordinate descent method.
30/41
All nodes solve locally using the coordinate descent method.
A core vertex for the gray cluster.
31/41
All nodes solve locally using the coordinate descent method.
Red sends residuals to white. White sends residuals to red.
32/41
White then uses the coordinate descent method to adjust its solution. Will cause communication to red/blue.
33/41
[Figure: Relative Work vs. Volume Ratio, compared with the Metis partitioner; curves for Swapping Probability and PageRank Communication on usroads and web-Google.]
The volume ratio is how much more of the graph we need to store.
It works!
34/41
PERSONALIZED PAGERANK CLUSTERS
Solve $(I - \alpha P^T)x = (1 - \alpha)e_i$ to a large degree-weighted tolerance 𝜺. Sweep over the vertices in order of their degree-normalized rank. Find the best conductance set. A Cheeger-like inequality makes this rigorous. (Not a heuristic.)
35/41
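The cluster-finding step above can be sketched in the spirit of the Andersen-Chung-Lang procedure: push out an approximate personalized PageRank vector, then sweep prefixes of the degree-normalized order and keep the best-conductance set. All names and the tolerance handling here are mine:

```python
from collections import deque

def ppr_push(neighbors, seed, alpha=0.85, eps=1e-4):
    """Approximate x solving (I - alpha*P^T) x = (1 - alpha) e_seed,
    pushing residual mass until r_u < eps * deg(u) everywhere."""
    x, r = {}, {seed: 1.0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        ru, du = r.get(u, 0.0), len(neighbors[u])
        if ru < eps * du:
            continue                    # stale queue entry
        x[u] = x.get(u, 0.0) + (1 - alpha) * ru
        r[u] = 0.0
        share = alpha * ru / du         # spread along the random walk
        for v in neighbors[u]:
            r[v] = r.get(v, 0.0) + share
            if r[v] >= eps * len(neighbors[v]):
                queue.append(v)
    return x

def best_conductance_sweep(neighbors, x):
    """Sweep prefixes of the degree-normalized order of x; return the
    prefix set with the smallest conductance."""
    vol_G = sum(len(ns) for ns in neighbors.values())
    order = sorted(x, key=lambda u: x[u] / len(neighbors[u]), reverse=True)
    in_set, cut, vol = set(), 0, 0
    best_phi, best_set = float("inf"), set()
    for u in order:
        vol += len(neighbors[u])
        cut += sum(-1 if v in in_set else 1 for v in neighbors[u])
        in_set.add(u)
        denom = min(vol, vol_G - vol)
        phi = cut / denom if denom > 0 else 1.0
        if phi < best_phi:
            best_phi, best_set = phi, set(in_set)
    return best_set, best_phi

# Two triangles joined by one edge; seeding in the left triangle
# should recover it as the best-conductance cluster.
nbrs = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
        3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
cluster, phi = best_conductance_sweep(nbrs, ppr_push(nbrs, 0))
```

The degree-weighted stopping rule is what keeps the push local: total work is bounded by the pushed mass over 𝜺, independent of the graph size.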
CORE VERTICES
Compute the expected “leavetime” for each vertex in a cluster. Keep increasing the threshold for a “good” vertex until every vertex is core in some cluster. Then approximate a set-cover problem to cover the graph with clusters, and use a heuristic to pack vertices until
36/41
MY QUESTIONS and future directions
REVERSE ORDER
37
GRAPH SPECTRA
Some work by Banerjee and Jost.
38/41