computing local and global centrality
DESCRIPTION
Some recent results on computing PageRank and Katz scores on large networks that I presented at a Dagstuhl workshop.
TRANSCRIPT
COMPUTING LOCAL AND GLOBAL CENTRALITY DAVID F. GLEICH (AND MANY OTHERS)! DATA MINING, NETWORKS AND DYNAMICS 2011 NOVEMBER 7
1
Pooya Esfandiar
Byung-Won On
Chen Greif
Laks V.S. Lakshmanan
Francesco Bonchi
LOCAL GLOBAL
Vahab Mirrokni
Reid Andersen
2/41
Graph centrality
Global: How important is a node?
Local: How important is a node with respect to another one?
3/41
Graph centrality
Koschützki et al.: a centrality must respect isomorphism; higher is better.
Examples: node degree, 1/shortest-path.
4/41
Graph centrality
This talk: path summation

$\sum_{\ell} f(\text{paths of length } \ell)$

The local Katz score:

$\sum_{\ell} \alpha^{\ell} \cdot (\text{number of paths of length } \ell \text{ between } i \text{ and } j)$
5/41
A – adjacency matrix, L – Laplacian matrix, P – random walk transition matrix

Katz score: $K_{i,j} = [(I - \alpha A^T)^{-1}]_{i,j}$

Commute time: $C_{i,j} = \mathrm{vol}(G)\,(L^{+}_{i,i} + L^{+}_{j,j} - 2L^{+}_{i,j})$

PageRank: $X_{i,j} = (1 - \alpha)[(I - \alpha P^T)^{-1}]_{i,j}$, i.e., $(I - \alpha P^T)x = (1 - \alpha)e/n$
6/41
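As a concrete sketch of the three definitions above, here is how they can be computed with dense linear algebra on a toy graph (not from the talk; the toy graph and the choices of α are mine):

```python
import numpy as np

# Toy undirected path graph on four nodes.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
n = A.shape[0]
d = A.sum(axis=1)                        # degrees

# Katz: K = (I - alpha_k * A^T)^{-1}, with alpha_k below the 1/||A||_2 limit.
alpha_k = 0.85 / np.linalg.norm(A, 2)
K = np.linalg.inv(np.eye(n) - alpha_k * A.T)

# Commute time: C_ij = vol(G) * (L+_ii + L+_jj - 2 L+_ij).
L = np.diag(d) - A
Lplus = np.linalg.pinv(L)                # pseudoinverse of the Laplacian
C = d.sum() * (np.add.outer(np.diag(Lplus), np.diag(Lplus)) - 2 * Lplus)

# Personalized PageRank: X = (1 - alpha) * (I - alpha * P^T)^{-1};
# column i solves (I - alpha*P^T) x = (1 - alpha) e_i.
alpha = 0.85
P = A / d[:, None]                       # row-stochastic random-walk matrix
X = (1 - alpha) * np.linalg.inv(np.eye(n) - alpha * P.T)
```

Each column of X sums to one, and the commute-time matrix is symmetric with a zero diagonal, which is a quick sanity check on the formulas.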
USES FOR CENTRALITY
Ranking features for web-search/classification
Najork, M. A.; Zaragoza, H. & Taylor, M. J. HITS on the web: How does it compare?
Becchetti, L.; Castillo, C.; Donato, D.; Baeza-Yates, R. & Leonardi, S. Link analysis for Web spam detection
Interesting nodes
GeneRank, ProteinRank, TwitterRank, IsoRank, FutureRank, HostRank, DiffusionRank, ItemRank, SocialPageRank, SimRank
7/41
USES FOR CENTRALITY
Ranking networks of comparisons
Chartier, T. P.; Kreutzer, E.; Langville, A. N. & Pedings, K. E. Sensitivity and Stability of Ranking Vectors
Clustering or community detection
Andersen, R.; Chung, F. & Lang, K. Local Graph Partitioning using PageRank Vectors
Link prediction
Savas et al. (Hold on about 90 minutes.)
8/41
THESE GET USED A LOT. THEY MUST BE FAST.
9
MATRICES, MOMENTS, QUADRATURE
Estimate a quadratic form. Also used by Benzi and Boito (LAA) for Katz scores and the matrix exponential.

Bounds: $l \le x^T f(Z)\,x \le u$

Commute: $(e_i - e_j)^T L^{+} (e_i - e_j)$

Katz: $\tfrac{1}{4}(e_i + e_j)^T (I - \alpha A^T)^{-1}(e_i + e_j) - \tfrac{1}{4}(e_i - e_j)^T (I - \alpha A^T)^{-1}(e_i - e_j)$
10/41
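The Katz and commute entries above are bilinear forms $e_i^T B e_j$, while MMQ bounds quadratic forms $x^T f(Z)\,x$; the ¼ combinations come from the polarization identity for symmetric B. A quick numeric check (the matrix and indices are arbitrary):

```python
# Polarization identity: for symmetric B,
# ei^T B ej = 1/4 (ei+ej)^T B (ei+ej) - 1/4 (ei-ej)^T B (ei-ej).
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
B = M + M.T                              # any symmetric matrix
i, j = 1, 3
ei, ej = np.eye(5)[i], np.eye(5)[j]

direct = ei @ B @ ej
polar = (0.25 * ((ei + ej) @ B @ (ei + ej))
         - 0.25 * ((ei - ej) @ B @ (ei - ej)))
```

Expanding the two quadratic forms, the diagonal terms cancel and the cross terms add up to exactly $4 B_{ij}$, which is why two quadratic-form estimates suffice for one pairwise score.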
MMQ - THE BIG IDEA
Quadratic form
→ Weighted sum (think: A is s.p.d., use the EVD)
→ Stieltjes integral ("a tautology")
→ Quadrature approximation
→ Matrix equation (Lanczos)
David F. Gleich (Purdue), Univ. Chicago SSCS Seminar
11/41
MMQ PROCEDURE
Goal: Given
1. Run k steps of Lanczos on , starting with
2. Compute , with an additional eigenvalue at ; set (corresponds to a Gauss-Radau rule with u as a prescribed node)
3. Compute , with an additional eigenvalue at ; set (corresponds to a Gauss-Radau rule with l as a prescribed node)
4. Output as lower and upper bounds on
12/41
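The Lanczos step behind the procedure can be sketched in a few lines: k steps of Lanczos on an s.p.d. matrix A with start vector u give a tridiagonal $T_k$, and $\|u\|^2 [f(T_k)]_{1,1}$ is the Gauss-rule estimate of $u^T f(A)\,u$. This is a minimal sketch of my own (plain Gauss rule only; the Gauss-Radau modifications that yield the two-sided bounds are omitted):

```python
import numpy as np

def lanczos_quadrature(A, u, k, f):
    """Gauss-rule estimate of u^T f(A) u from k steps of Lanczos."""
    n = len(u)
    Q = np.zeros((n, k))
    alphas, betas = [], []
    beta = np.linalg.norm(u)
    norm_u2 = beta ** 2
    q, q_prev, b = u / beta, np.zeros(n), 0.0
    for i in range(k):
        Q[:, i] = q
        w = A @ q - b * q_prev               # three-term recurrence
        a = float(q @ w)
        w = w - a * q
        w = w - Q[:, :i + 1] @ (Q[:, :i + 1].T @ w)  # full reorthogonalization
        b = np.linalg.norm(w)
        alphas.append(a)
        betas.append(b)
        if b < 1e-12:                        # breakdown: invariant subspace
            break
        q_prev, q = q, w / b
    m = len(alphas)
    T = (np.diag(alphas)
         + np.diag(betas[:m - 1], 1) + np.diag(betas[:m - 1], -1))
    nodes, vecs = np.linalg.eigh(T)          # quadrature nodes and weights
    return norm_u2 * float(np.sum(vecs[0, :] ** 2 * f(nodes)))

# Example: estimate u^T A^{-1} u for a random s.p.d. matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)
u = rng.standard_normal(50)
est = lanczos_quadrature(A, u, 10, lambda t: 1.0 / t)
exact = float(u @ np.linalg.solve(A, u))
```

For the bounds on the slide, one would additionally extend $T_k$ with a prescribed eigenvalue at u (resp. l) to get the Gauss-Radau lower and upper estimates.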
[Figure: upper and lower bounds, and the corresponding error, vs. matrix-vector products for arxiv, Katz, hard alpha; 𝛼 = 1/(‖A‖₂ + 1).]
Error Bounds
How well does it work?
13/41
MY COMPLAINTS
Matvecs are expensive. Takes many iterations. Just one score comes out!
14/41
KATZ SCORES ARE LOCALIZED
Up to 50 neighbors is 99.65% of the total mass.
Katz scores are highly localized.

$(I - \alpha A^T)\,k = e_i$
15/41
HOW CAN WE EXPLOIT THIS?
16
TOP-K ALGORITHM FOR KATZ
Approximate
where is sparse Keep sparse too Ideally, don’t “touch” all of
17/41
TOP-K ALGORITHM FOR KATZ
Approximate
where is sparse Keep sparse too Ideally, don’t “touch” all of
This is possible for personalized PageRank!
18/41
Richardson for $Ax = b$:

$x^{(k+1)} = x^{(k)} + r^{(k)}, \qquad r^{(k+1)} = b - Ax^{(k+1)}$

Equivalent to gradient descent on $\min_x\; x^T A x - 2x^T b$ when $A = A^T$, $A \succeq 0$.

What about coordinate descent?

Gauss-Southwell for $Ax = b$:

$x^{(k+1)} = x^{(k)} + r^{(k)}_{j} e_j, \qquad r^{(k+1)} = r^{(k)} - r^{(k)}_{j} A e_j$

How to pick j?
Frequently "rediscovered" for PageRank: McSherry (WWW 2005), Berkhin (JIM 2007), Andersen-Chung-Lang (FOCS 2006).
19/41
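The Gauss-Southwell update can be sketched for a single Katz column. This is a sketch of my own (sparse residual plus a lazy max-heap to pick the largest residual; `katz_column_gs` and the other names are mine), and the point is that it only touches vertices the residual actually reaches:

```python
import heapq
from collections import defaultdict

def katz_column_gs(neighbors, i, alpha, tol=1e-8, max_steps=1_000_000):
    """Approximate x solving (I - alpha*A^T) x = e_i by repeatedly
    relaxing the coordinate j with the largest residual."""
    x = defaultdict(float)
    r = {i: 1.0}                   # residual of e_i - (I - alpha*A^T) x
    heap = [(-1.0, i)]             # max-heap on |r_j|, lazy deletion
    for _ in range(max_steps):
        j = None
        while heap:
            neg, cand = heapq.heappop(heap)
            if abs(r.get(cand, 0.0)) == -neg:   # entry still current?
                j = cand
                break
        if j is None or abs(r[j]) < tol:
            break
        rj = r[j]
        x[j] += rj                 # x <- x + r_j e_j
        r[j] = 0.0                 # the -r_j e_j part of the residual update
        for u in neighbors[j]:     # the +alpha r_j A^T e_j part
            r[u] = r.get(u, 0.0) + alpha * rj
            heapq.heappush(heap, (-abs(r[u]), u))
    return dict(x)

# Toy 4-cycle; alpha * ||A||_1 = 0.6 < 1, so the iteration converges.
nbrs = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
scores = katz_column_gs(nbrs, 0, alpha=0.3)
```

On termination every residual entry is below `tol`, so the returned scores satisfy the Katz system up to that tolerance; the seed's own score is the largest.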
DEMO!
20
NEW CONVERGENCE THEORY
Katz and PageRank are equivalent if $\alpha < 1/\|A\|_1$. Gauss-Southwell converges when $\alpha < 1/\|A\|_2$ (Luo and Tseng 1992) if j is picked as the largest residual.
Read all about it: Fast matrix computations for pair-wise and column-wise commute times and Katz scores. Bonchi, Esfandiar, Gleich, Greif, Lakshmanan, J. Internet Mathematics (to appear).
21/4
1
[Figure: Precision@k for exact top-k sets vs. equivalent matrix-vector products; hollywood, Katz, hard alpha; curves for k = 10, 100, 1000 and CG at k = 25.]
1,000,000 nodes, 100,000,000 edges
22/41
OPEN QUESTIONS
I can't find any existing derivation of this method in the non-symmetric case (prior to the PageRank literature). Any thoughts? How can we show that the method converges for a non-symmetric matrix when $(I - \alpha P^T)$ is not diagonally dominant?
23/41
OVERLAPPING CLUSTERS FOR DISTRIBUTED CENTRALITY
24
LARGE GRAPHS, IN PRACTICE
[Figure: edge lists (src -> dst) stored redundantly as Copy 1 and Copy 2 on many hard drives.]
Edge lists, maybe tied together by a common host, stored redundantly on many hard drives.
25/41
UTILIZE SOME REDUNDANCY?
To compute global PageRank?
26
Overlapping clusters for distributed computation. Andersen, Gleich, Mirrokni, WSDM 2012 (to appear).
Use the redundancy to reduce communication when solving a PageRank problem
Overlapping Clusters
27/41
Communication-avoiding algorithms. Communication is the limiting factor in most computations these days; flops are, relatively speaking, free.
28/41
KEY POINTS
Utilize personalized PageRank vectors to find the clusters with "good" conductance scores. Define "core" vertices for each cluster. Find a good way to cover the graph with these clusters. Use restricted additive Schwarz to solve (thanks Prof. Szyld and Prof. Frommer!)
29/41
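The restricted additive Schwarz step can be sketched on a toy linear system. This is a dense sketch with names of my own (the real setting solves the PageRank system over overlapping graph clusters with core vertices), showing the defining idea: every overlapping block solves locally, but only the core entries of each local solution are added back:

```python
import numpy as np

def ras_solve(A, b, blocks, cores, sweeps=200):
    """Stationary restricted additive Schwarz iteration: overlapping
    blocks solve on the current residual; only core entries update x."""
    x = np.zeros_like(b)
    for _ in range(sweeps):
        r = b - A @ x
        update = np.zeros_like(x)
        for block, core in zip(blocks, cores):
            # local solve on the overlapping block
            local = np.linalg.solve(A[np.ix_(block, block)], r[block])
            # restricted update: keep only the core entries
            for pos, idx in enumerate(block):
                if idx in core:
                    update[idx] += local[pos]
        x = x + update
    return x

# Toy s.p.d. M-matrix (1-D Laplacian-like) with two overlapping blocks
# whose cores partition the indices.
n = 8
A = (np.diag(3.0 * np.ones(n))
     + np.diag(-1.0 * np.ones(n - 1), 1)
     + np.diag(-1.0 * np.ones(n - 1), -1))
b = np.ones(n)
blocks = [[0, 1, 2, 3, 4], [3, 4, 5, 6, 7]]
cores = [{0, 1, 2, 3}, {4, 5, 6, 7}]
xs = ras_solve(A, b, blocks, cores)
```

Because the cores do not overlap, each entry of x is owned by exactly one block, which is what cuts the communication relative to plain additive Schwarz.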
All nodes solve locally using the coordinate descent method.
30/41
All nodes solve locally using the coordinate descent method.
A core vertex for the gray cluster.
31/41
All nodes solve locally using the coordinate descent method.
Red sends residuals to white. White sends residuals to red.
32/41
White then uses the coordinate descent method to adjust its solution. Will cause communication to red/blue.
33/41
[Figure: Relative Work vs. Volume Ratio, compared with the Metis partitioner; curves for Swapping Probability and PageRank Communication on usroads and web-Google.]
The volume ratio is how much more of the graph we need to store.
It works!
34/41
PERSONALIZED PAGERANK CLUSTERS
Solve $(I - \alpha P^T)x = (1 - \alpha)e_i$ to a large degree-weighted tolerance 𝜺. Sweep over the vertices in order of their degree-normalized rank. Find the best conductance set. A Cheeger-like inequality makes this rigorous. (Not a heuristic.)
35/41
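The cluster-finding step above can be sketched in the spirit of the Andersen-Chung-Lang procedure: push out an approximate personalized PageRank vector, then sweep prefixes of the degree-normalized order and keep the best-conductance set. All names and the tolerance handling here are mine:

```python
from collections import deque

def ppr_push(neighbors, seed, alpha=0.85, eps=1e-4):
    """Approximate x solving (I - alpha*P^T) x = (1 - alpha) e_seed,
    pushing residual mass until r_u < eps * deg(u) everywhere."""
    x, r = {}, {seed: 1.0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        ru, du = r.get(u, 0.0), len(neighbors[u])
        if ru < eps * du:
            continue                    # stale queue entry
        x[u] = x.get(u, 0.0) + (1 - alpha) * ru
        r[u] = 0.0
        share = alpha * ru / du         # spread along the random walk
        for v in neighbors[u]:
            r[v] = r.get(v, 0.0) + share
            if r[v] >= eps * len(neighbors[v]):
                queue.append(v)
    return x

def best_conductance_sweep(neighbors, x):
    """Sweep prefixes of the degree-normalized order of x; return the
    prefix set with the smallest conductance."""
    vol_G = sum(len(ns) for ns in neighbors.values())
    order = sorted(x, key=lambda u: x[u] / len(neighbors[u]), reverse=True)
    in_set, cut, vol = set(), 0, 0
    best_phi, best_set = float("inf"), set()
    for u in order:
        vol += len(neighbors[u])
        cut += sum(-1 if v in in_set else 1 for v in neighbors[u])
        in_set.add(u)
        denom = min(vol, vol_G - vol)
        phi = cut / denom if denom > 0 else 1.0
        if phi < best_phi:
            best_phi, best_set = phi, set(in_set)
    return best_set, best_phi

# Two triangles joined by one edge; seeding in the left triangle
# should recover it as the best-conductance cluster.
nbrs = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
        3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
cluster, phi = best_conductance_sweep(nbrs, ppr_push(nbrs, 0))
```

The degree-weighted stopping rule is what keeps the push local: total work is bounded by the pushed mass over 𝜺, independent of the graph size.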
CORE VERTICES
Compute the expected “leavetime” for each vertex in a cluster. Keep increasing the threshold for a “good” vertex until every vertex is core in some cluster. Then approximate a set-cover problem to cover the graph with clusters, and use a heuristic to pack vertices until
36/41
MY QUESTIONS and future directions
REVERSE ORDER
37
GRAPH SPECTRA
Some work by Banerjee and Jost.
38/41