gaps between the theory and practice of large-scale matrix-based network computations

Gaps between theory & practice in large-scale matrix computations for networks, by David F. Gleich, Assistant Professor, Computer Science, Purdue University

Uploaded by david-gleich, 11-May-2015

DESCRIPTION

I discuss runtimes for computing the personalized PageRank vector and how they relate to open questions about how we should tackle these network-based measures via matrix computations.

TRANSCRIPT

Page 1: Gaps between the theory and practice of large-scale matrix-based network computations

Gaps between theory & practice in large-scale matrix computations for networks

David F. Gleich, Assistant Professor, Computer Science, Purdue University

Page 2: Gaps between the theory and practice of large-scale matrix-based network computations

Networks as matrices

Bott, Genetic Psychology Manuscripts, 1928

Page 3: Gaps between the theory and practice of large-scale matrix-based network computations

Networks as matrices

Page 4: Gaps between the theory and practice of large-scale matrix-based network computations

Networks as matrices

Page 5: Gaps between the theory and practice of large-scale matrix-based network computations

Networks as matrices

A =

Page 6: Gaps between the theory and practice of large-scale matrix-based network computations

Image from rockysprings, deviantart, CC share-alike

Everything in the world can be explained by a matrix, and we see how deep the rabbit hole goes. The talk ends, you believe whatever you want to.

Page 7: Gaps between the theory and practice of large-scale matrix-based network computations
Page 8: Gaps between the theory and practice of large-scale matrix-based network computations

Matrix computations in a red-pill

Solve a problem better by exploiting its structure!

Page 9: Gaps between the theory and practice of large-scale matrix-based network computations

My research: Models and algorithms for high performance matrix and network computations on data

[Excerpt from p. 18 of P. G. Constantine, D. F. Gleich, Y. Hou, and J. Templeton. Fig. 4.5: Error in the reduced-order model compared to the prediction standard deviation for one realization of the bubble locations at the final time, for two values of the bubble radius, s = 0.39 cm and s = 1.95 cm; panels (a)-(d) show the error and standard deviation at each radius. (Colors are visible in the electronic version.)]

the varying conductivity fields took approximately twenty minutes to construct using Cubit after substantial optimizations.

Working with the simulation data involved a few pre- and post-processing steps: interpret 4 TB of Exodus II files from Aria, globally transpose the data, compute the TSSVD, and compute predictions and errors. The preprocessing steps took approximately 8-15 hours. We collected precise timing information, but we do not report it as these times are from a multi-tenant, unoptimized Hadoop cluster where other jobs with sizes ranging between 100 GB and 2 TB of data sometimes ran concurrently. Also, during our computations, we observed failures in hard disk drives and issues causing entire nodes to fail. Given that the cluster has 40 cores, there was at most 2400 cpu-hours consumed via these calculations, compared to the 131,072 hours it took to compute 4096 heat transfer simulations on Red Sky. Thus, evaluating the ROM was about 50-times faster than computing a full simulation.

We used 20,000 reducers to convert the Exodus II simulation data. This choice determined how many map tasks each subsequent step utilized: around 33,000. We also found it advantageous to store matrices in blocks of about 16 MB per record. The reduction in the data enabled us to use a laptop to compute the coefficients of the ROM and apply to the far face for the UQ study in Section 4.4.

Here are a few pertinent challenges we encountered while performing this study. Generating 8192 meshes with different material properties and running independent

Ax = b,  min ‖Ax − b‖,  Ax = λx

Massive matrix computations on multi-threaded and distributed architectures; big data methods too.

Fast & scalable network analysis.

Tensor eigenvalues and a power method.


Tensor methods for network alignment

Network alignment is the problem of computing an approximate isomorphism between two networks. In collaboration with Mohsen Bayati, Amin Saberi, Ying Wang, and Margot Gerritsen, the PI has developed a state of the art belief propagation method (Bayati et al., 2009).

FIGURE 6 – Previous work from the PI tackled network alignment with matrix methods for edge overlap: an edge (i, j) in A overlaps an edge (i′, j′) in B through the links in L. This proposal is for matching triangles using tensor methods: a triangle (i, j, k) in A is matched to a triangle (i′, j′, k′) in B. If x_i, x_j, and x_k are indicators associated with the edges (i, i′), (j, j′), and (k, k′), then we want to include the product x_i x_j x_k in the objective, yielding a tensor problem.

We propose to study tensor methods to perform network alignment with triangle and other higher-order graph moment matching. Similar ideas were proposed by Svab (2007); Chertok and Keller (2010) also proposed using triangles to aid in network alignment problems. In Bayati et al. (2011), we found that triangles were a key missing component in a network alignment problem with a known solution. Given that preserving a triangle requires three edges between two graphs, this yields a tensor problem:

maximize   Σ_{i∈L} w_i x_i + Σ_{i∈L} Σ_{j∈L} x_i x_j S_{i,j} + Σ_{i∈L} Σ_{j∈L} Σ_{k∈L} x_i x_j x_k T_{i,j,k}   (the last term is the triangle overlap term)

subject to x is a matching.

Here, T_{i,j,k} = 1 when the edges corresponding to i, j, and k in L result in a triangle in the induced matching. Maximizing this objective is an intractable problem. We plan to investigate a heuristic based on a rank-1 approximation of the tensor T and a maximum-weight matching based rounding. Similar heuristics have been useful in other matrix-based network alignment algorithms (Singh et al., 2007; Bayati et al., 2009). The work involves enhancing the Symmetric-Shifted-Higher-Order Power Method due to Kolda and Mayo (2011) to incredibly large and sparse tensors. On this aspect, we plan to collaborate with Tamara G. Kolda. In an initial evaluation of this triangle matching on synthetic problems, using the tensor rank-1 approximation alone produced results that identified the correct solution whereas all matrix approaches could not.

vision for the future

All of these projects fit into the PI's vision for modernizing the matrix-computation paradigm to match the rapidly evolving space of network computations. This vision extends beyond the scope of the current proposal. For example, the web is a huge network with over one trillion unique URLs (Alpert and Hajaj, 2008), and search engines have indexed over 180 billion of them (Cuil, 2009). Yet, why do we need to compute with the entire network? By way of analogy, note that we do not often solve partial differential equations or model macro-scale physics by explicitly simulating the motion or interaction of elementary particles. We need something equivalent for the web and other large networks. Such investigations may take many forms: network models, network geometry, or network model reduction. It is the vision of the PI that the language, algebra, and methodology of matrix computations will


maximize   Σ_{ijk} T_{ijk} x_i x_j x_k
subject to  ‖x‖₂ = 1

Human protein interaction network: 48,228 triangles. Yeast protein interaction network: 257,978 triangles. The tensor T has ~100,000,000,000 nonzeros.

We work with it implicitly

[x^(next)]_i = ρ · ( Σ_{jk} T_{ijk} x_j x_k + γ x_i ),  where ρ rescales the iterate to unit 2-norm.

SSHOPM method due to Kolda and Mayo

Network alignment
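A minimal sketch of one implicit SSHOPM sweep in Python, assuming the symmetric tensor is stored only as a list of index triples (i, j, k) with T_{ijk} = 1; the names triples, gamma, and niter are illustrative, not from the talk:

    import numpy as np
    from itertools import permutations

    def sshopm(triples, n, gamma=1.0, niter=100):
        # shifted higher-order power method on an implicit symmetric tensor:
        # y_i = sum_{jk} T_{ijk} x_j x_k, computed by streaming over the triples
        x = np.ones(n) / np.sqrt(n)
        for _ in range(niter):
            y = np.zeros(n)
            for t in triples:
                for (i, j, k) in set(permutations(t)):  # symmetrize on the fly
                    y[i] += x[j] * x[k]
            y += gamma * x                 # the shift gamma
            x = y / np.linalg.norm(y)      # rho rescales to unit 2-norm
        return x

    # e.g., two triangles sharing an edge, in a tensor over 4 indices
    x = sshopm([(0, 1, 2), (1, 2, 3)], n=4)

Streaming over the triples is what "we work with it implicitly" means here: the ~10^11 nonzeros are never materialized.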

Page 10: Gaps between the theory and practice of large-scale matrix-based network computations

One canonical problem

(I − α Aᵀ D⁻¹) x = f

A  adjacency matrix
D  degree matrix
α  regularization
f  "prior" or "givens"

PageRank
Personalized PageRank
Semi-supervised learning on graphs
Protein function prediction
Gene-experiment association
Network alignment
Food webs

Page 11: Gaps between the theory and practice of large-scale matrix-based network computations

One canonical problem

(I − α Aᵀ D⁻¹) x = f

Vahab – clustering; Karl – clustering; Art – prediction; Jen – prediction; Sebastiano – ranking/centrality

Page 12: Gaps between the theory and practice of large-scale matrix-based network computations

An example on a graph: PageRank by Google

[Figure: a small directed graph on nodes 1-6.]

The model: 1. follow edges uniformly with probability α, and 2. randomly jump with probability 1 − α; we'll assume everywhere is equally likely. The places we find the surfer most often are important pages.

(I − α M) x = f with

M = [ 1/6 1/2 0   0   0 0
      1/6 0   0   1/3 0 0
      1/6 1/2 0   1/3 0 0
      1/6 0   1/2 0   0 0
      1/6 0   1/2 1/3 0 1
      1/6 0   0   0   1 0 ],   x = [x1; x2; x3; x4; x5; x6],   f = [0; 0; 0; 1; 0; 0]

(I − α Aᵀ D⁻¹) x = f

a non-singular linear system (α < 1) with a non-negative inverse; it works with weights, directed & undirected graphs, weights that sum to less than 1 in each column, …
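As a sanity check, a sketch that builds this 6-by-6 system with NumPy and solves it directly; α = 0.85 is an illustrative choice, not from the slide:

    import numpy as np

    M = np.array([
        [1/6, 1/2, 0,   0,   0, 0],
        [1/6, 0,   0,   1/3, 0, 0],
        [1/6, 1/2, 0,   1/3, 0, 0],
        [1/6, 0,   1/2, 0,   0, 0],
        [1/6, 0,   1/2, 1/3, 0, 1],
        [1/6, 0,   0,   0,   1, 0],
    ])                              # M = A^T D^{-1}; every column sums to 1
    alpha = 0.85                    # illustrative regularization parameter
    f = np.zeros(6); f[3] = 1.0     # f = e_4, in 0-indexed form
    x = np.linalg.solve(np.eye(6) - alpha * M, f)   # the PageRank-style solve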

Page 13: Gaps between the theory and practice of large-scale matrix-based network computations

An example on a bigger graph

Newman's netscience graph: 379 vertices, 1828 non-zeros. The solution is "zero" on most of the nodes; f has a single one here (at the highlighted node).

Page 14: Gaps between the theory and practice of large-scale matrix-based network computations

A special case: (I − α Aᵀ D⁻¹) x = e_i, "one column" or "one node", so x = column i of (I − α Aᵀ D⁻¹)⁻¹.

Localized solutions

Page 15: Gaps between the theory and practice of large-scale matrix-based network computations

An example on a bigger graph: a crawl of flickr from 2006; ~800k nodes, 6M edges, α = 1/2.

[Figure: plot(x) over the ~800k nodes, and the error ‖x_true − x_nnz‖₁ against the number of nonzeros retained, on log-log axes.]

Page 16: Gaps between the theory and practice of large-scale matrix-based network computations

Complexity is complex

•  Linear system – O(n³)
•  Sparse linear system (undirected) – O(m log(m) γ), where γ is a factor from the latest results on solving SDD systems on graphs
•  Neumann series – O(m log(tol)/log(α))
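For concreteness, a sketch of the Neumann-series option: truncating x = Σ_k (α M)^k f once the geometric tail drops below tol gives the log(tol)/log(α) iteration count, at one sparse matrix-vector product per term. Here M, f, alpha, and tol stand in for the quantities above:

    import numpy as np

    def neumann_pagerank(M, f, alpha, tol=1e-8):
        # x = sum_k (alpha M)^k f; the tail after k terms is O(alpha^k),
        # so roughly log(tol)/log(alpha) products M @ term are needed
        x = f.astype(float).copy()
        term = x.copy()
        while np.linalg.norm(term, 1) > tol:
            term = alpha * (M @ term)
            x += term
        return x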

Page 17: Gaps between the theory and practice of large-scale matrix-based network computations

[Scan of Forsythe and Leibler, 1950, "Matrix Inversion by a Monte Carlo Method": an exposition of the method devised by J. von Neumann and S. M. Ulam, which inverts a class of matrices by repeatedly playing a solitaire game whose expected payment is an entry of the inverse, and which lends itself to obtaining a single element of the inverse matrix without determining the rest of the matrix.]

Forsythe and Leibler, 1950

Page 18: Gaps between the theory and practice of large-scale matrix-based network computations

Monte Carlo methods for PageRank
•  Avrachenkov et al. 2005, Monte Carlo methods in PageRank
•  Fogaras et al. 2005, Fully scaling personalized PageRank
•  Das Sarma et al. 2008, Estimating PageRank on graph streams
•  Bahmani et al. 2010, Fast and incremental personalized PageRank
•  Bahmani et al. 2011, PageRank & MapReduce
•  Borgs et al. 2012, Sublinear PageRank

Complexity – "O(log |V|)"
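A minimal Monte Carlo sketch in the spirit of these papers, assuming every node has at least one out-link (graph, seed, and N are illustrative names): the endpoint of an α-terminated random walk from the seed is distributed as the normalized personalized PageRank vector.

    import random

    def mc_pagerank(graph, seed, alpha, N=100_000):
        # endpoint frequencies of alpha-terminated random walks estimate the
        # solution of (I - alpha A^T D^{-1}) x = (1 - alpha) e_seed;
        # rescale by 1/(1 - alpha) to match f = e_seed
        counts = {}
        for _ in range(N):
            v = seed
            while random.random() < alpha:         # continue w.p. alpha
                v = random.choice(list(graph[v]))  # uniform out-edge
            counts[v] = counts.get(v, 0) + 1
        return {v: c / N for v, c in counts.items()}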

Page 19: Gaps between the theory and practice of large-scale matrix-based network computations

Gauss-Seidel and Gauss-Southwell: methods to solve Ax = b

x^(k+1) = x^(k) + ρ_j e_j,  with ρ_j chosen such that [A x^(k+1)]_j = [b]_j

In words: "relax" or "free" the j-th coordinate of your solution vector in order to satisfy the j-th equation of your linear system.

Gauss-Seidel: repeatedly cycle through j = 1 to n.
Gauss-Southwell: use the value of j that has the highest magnitude residual r^(k) = b − A x^(k).
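As a toy illustration of the rule above, a dense Gauss-Southwell sketch; it assumes A has a nonzero diagonal and that the iteration converges (e.g., A diagonally dominant), and all names are illustrative:

    import numpy as np

    def gauss_southwell(A, b, tol=1e-10, maxiter=10_000):
        x = np.zeros_like(b, dtype=float)
        r = b - A @ x                        # r^(k) = b - A x^(k)
        for _ in range(maxiter):
            j = int(np.argmax(np.abs(r)))    # largest-magnitude residual entry
            if abs(r[j]) < tol:
                break
            rho = r[j] / A[j, j]
            x[j] += rho                      # relax the j-th coordinate ...
            r -= rho * A[:, j]               # ... so that [A x]_j = [b]_j
        return x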

Page 20: Gaps between the theory and practice of large-scale matrix-based network computations

Matrix computations in a red-pill

Solve a problem better by exploiting its structure!

Page 21: Gaps between the theory and practice of large-scale matrix-based network computations

Gauss-Seidel/Southwell for PageRank: with access to in-links & degrees, PageRankPull; with access to out-links, PageRankPush.

Pull (j = blue node, with in-neighbors a, b, c):
x_j^(k+1) − α Σ_{i→j} x_i^(k)/deg_i = f_j
that is, x_j^(k+1) − α x_a^(k)/6 − α x_b^(k)/2 − α x_c^(k)/3 = f_j; solve for x_j^(k+1).

Push (j = blue node): let r^(k) = f + α Aᵀ D⁻¹ x^(k) − x^(k); then
x_j^(k+1) = x_j^(k) + r_j^(k)
and update r^(k+1) (here j has degree 3, with neighbors a, b, c):
r_a^(k+1) = r_a^(k) + α r_j^(k)/3
r_b^(k+1) = r_b^(k) + α r_j^(k)/3
r_c^(k+1) = r_c^(k) + α r_j^(k)/3
r_j^(k+1) = 0

Page 22: Gaps between the theory and practice of large-scale matrix-based network computations

Python code for PPR Push

    # initialization
    # graph is a dict mapping each node to the set of its neighbors
    # f is a dict holding the nonzeros of the right-hand side
    # eps is the stopping tolerance, 0 < alpha < 1
    x = dict()
    r = dict()
    sumr = 0.
    for (node, fi) in f.items():
        r[node] = fi
        sumr += fi

    # main loop
    # stopping at ||r||_1 <= eps*(1 - alpha) bounds the 1-norm error by eps
    while sumr > eps * (1 - alpha):
        j = max(r, key=r.get)           # Gauss-Southwell: largest residual
        rj = r[j]
        x[j] = x.get(j, 0.) + rj
        r[j] = 0.
        sumr -= rj
        deg = len(graph[j])
        for i in graph[j]:
            r[i] = r.get(i, 0.) + alpha / deg * rj
            sumr += alpha / deg * rj

If f ≥ 0, this terminates when ||x_true − x_alg||₁ < ε.
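For reuse, the same loop wrapped as a function and driven on a toy 4-cycle; ppr_push and the toy graph are illustrative, not from the talk:

    def ppr_push(graph, f, alpha=0.85, eps=1e-6):
        x, r = {}, dict(f)
        sumr = sum(r.values())
        while sumr > eps * (1 - alpha):
            j = max(r, key=r.get)
            rj = r[j]
            x[j] = x.get(j, 0.) + rj
            r[j] = 0.
            sumr -= rj
            deg = len(graph[j])
            for i in graph[j]:
                r[i] = r.get(i, 0.) + alpha / deg * rj
                sumr += alpha / deg * rj
        return x

    graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}  # undirected 4-cycle
    x = ppr_push(graph, {0: 1.0}, alpha=0.5, eps=1e-8)    # seed mass on node 0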

Page 23: Gaps between the theory and practice of large-scale matrix-based network computations

Relaxation methods for PageRank
•  Arasu et al. 2002, PageRank computation and the structure of the web
•  Jeh and Widom 2003, Scaling personalized PageRank
•  McSherry 2005, A unified framework for PageRank acceleration
•  Andersen et al. 2006, Local PageRank
•  Berkhin 2007, Bookmark coloring algorithm for PageRank

Complexity – "O(|E|)"

Page 24: Gaps between the theory and practice of large-scale matrix-based network computations

[Figure: ‖x_true − x_alg‖₁ against the number of edges the algorithm touches, for Monte Carlo and relaxation, on a 22k node, 2M edge Facebook graph (seed node degree = 155). Reference lines mark nnz(A) and 10·nnz(A); Monte Carlo is sublinear "in theory", the annotations mark the gaps between the methods, and "how I'd solve it" is labeled.]

Page 25: Gaps between the theory and practice of large-scale matrix-based network computations

Matrix computations in a red-pill

Solve a problem better by exploiting its structure!

Page 26: Gaps between the theory and practice of large-scale matrix-based network computations

Some unity?

Theorem (Gleich and Kloster, 2013, arXiv:1310.3423).* Consider solving personalized PageRank using the Gauss-Southwell relaxation method on a graph with a Zipf law in the degrees with exponent p = 1 and max degree d. Then the work involved in getting a solution with 1-norm error ε is

work = O( (1/ε) · (1/(1−α)) · d (log d)² )**

* (the paper currently bounds exp(A D⁻¹) e_i, but the analysis yields this bound for PageRank)
** (this bound is not very useful, but it justifies that this method isn't horrible in theory)

Improve this?

Page 27: Gaps between the theory and practice of large-scale matrix-based network computations

There is more structure

[Figure 4: The 8 graph motifs or graphlets G1-G8 that we use to match a sample of a social network from Facebook or LinkedIn to a sample of the MGEO-P network model.]

mean of 5.4. In both networks, larger graphs have bigger effective diameters, although the differences are slight.

Graph summaries: graph motifs and spectral densities

Both graph motifs and spectral densities are numeric summaries of a graph that abstract the details of a network into a small set of values that are independent of the particular nodes of a network. These summaries have the property that isomorphic graphs have the same values, although there may exist non-isomorphic graphs that also have the same values. For instance, co-spectral graphs have the same spectral densities.21 We wish to use these summaries in order to determine the best dimension that preserves the summary.

Graph motifs, graphlets, or graph moments are the frequency or abundance of specific small subgraphs in a large network. We study undirected, connected graphs of up to four nodes as our graph motifs. This is the set of 8 graphs shown in Figure 4. Counting the exact number of occurrences of each subgraph within the large graph takes time O(n⁴), which is prohibitively large. Instead, we employ the rand-esu sampling algorithm22 as implemented in the igraph library.23 This algorithm approximately estimates the count of each subgraph and depends on a sampling probability that can be interpreted as the probability of continuing a search. Thus, values near 1 indicate exact scores and small probabilities truncate the search. The value we use is 10^(−log n/log 10 + 1), i.e., 10/n. We use log-transformed output from this procedure in order to capture the dynamic range of the resulting values.
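For reference, a minimal sketch of this motif computation with python-igraph's motifs_randesu on a stand-in random graph; treating the paper's continue probability as the complement of igraph's last-level cut_prob is an assumption:

    import math
    import igraph as ig

    g = ig.Graph.Erdos_Renyi(n=1000, m=5000)     # stand-in network
    p = 10 ** (-math.log(g.vcount(), 10) + 1)    # 10/n, as in the text
    # cut_prob gives the probability of pruning the ESU search at each depth
    counts = g.motifs_randesu(size=4, cut_prob=[0, 0, 0, 1 - p])
    # log-transform to capture the dynamic range (NaN marks disconnected classes)
    log_counts = [math.log(c + 1) for c in counts if not math.isnan(c)]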

Todo: Insert discussion of use of network motifs; see the rand-esu paper (Wernicke-2006-motifs). It seems like we don't need to discuss this and should cite the PNAS paper and the Janssen et al. paper for this info instead. The point here was that they were trying to claim something about function, and this was more complicated as random graphs exhibit nontrivial graph motifs.

Consider the graph G = (V, E); the normalized Laplacian matrix dictates many network properties, including the behavior of a random walk, the number of connected components, an approximate connectivity measure, and many other features.24,25 The spectral density of the normalized Laplacian is a particularly helpful characterization that is a


The one ring.

(See C. Seshadhri’s talk for the reference)

Page 28: Gaps between the theory and practice of large-scale matrix-based network computations

Further open directions

Nice to solve: Unifying convergence results for Monte Carlo and relaxation on large networks, to get provably efficient, practical algorithms.
–  Use triangles? Use preconditioning?

A curiosity: Is there any reason to use a Krylov method?
–  A staple of matrix computations: A V_k = V_{k+1} H_k with H_k small.

BIG gap: Can we get algorithms with "top k" or "ordering" convergence?
–  See Bahmani et al. 2010; Sarkar et al. 2008 (proximity search).

Important? Are there useful, tractable multi-linear problems on a network?
–  e.g., triangles for network alignment; e.g., Kannan's planted clique problem.

Supported by NSF CAREER 1149756-CCF. www.cs.purdue.edu/homes/dgleich