Spectral Methods for Complex Networks
Richard C. Wilson Dept. of Computer Science
University of York
Outline
Part I
1. Brief recap of spectral graph theory
2. Representation
3. Spectra of graph models
4. Application to graph partitioning
Part II
1. Paths and Cycles
2. Formal Series
3. Counting paths
4. Counting cycles
Matrix Representation
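A matrix representation X of a network is a matrix whose entries represent the vertices and edges
Adjacency matrix A (rows and columns ordered by vertex label 1 to 5)
$$A = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \end{pmatrix}$$
Degree matrix D
$$D = \operatorname{diag}(1, 3, 3, 1, 2)$$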
Matrix Representation
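The Laplacian (L) is $L = D - A$
$$L = \begin{pmatrix} 1 & -1 & 0 & 0 & 0 \\ -1 & 3 & -1 & 0 & -1 \\ 0 & -1 & 3 & -1 & -1 \\ 0 & 0 & -1 & 1 & 0 \\ 0 & -1 & -1 & 0 & 2 \end{pmatrix}$$
Signless Laplacian $L_s = D + A$
$$L_s = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 \\ 1 & 3 & 1 & 0 & 1 \\ 0 & 1 & 3 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 2 \end{pmatrix}$$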
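A minimal NumPy sketch of these definitions. It assumes the 5-vertex example has edges 1-2, 2-3, 2-5, 3-4 and 3-5 (read off the adjacency matrix); the `edges` list and variable names are illustrative, with vertices stored 0-based.

```python
import numpy as np

# Assumed edge list for the 5-vertex example (0-based vertex indices).
edges = [(0, 1), (1, 2), (1, 4), (2, 3), (2, 4)]
n = 5

A = np.zeros((n, n), dtype=int)
for u, v in edges:
    A[u, v] = A[v, u] = 1       # undirected network: symmetric adjacency

D = np.diag(A.sum(axis=1))      # degree matrix
L = D - A                       # Laplacian
L_s = D + A                     # signless Laplacian
print(L)
```

Every row of L sums to zero, which is why the all-ones vector is always an eigenvector with eigenvalue 0.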
Matrix Representation
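Normalized Laplacian
$\hat{L} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} A D^{-1/2}$
Entries are
$\hat{L}_{uv} = 1$ if $u = v$
$\hat{L}_{uv} = -\frac{1}{\sqrt{d_u d_v}}$ if $(u,v) \in E$
$\hat{L}_{uv} = 0$ otherwise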
Incidence matrix
The incidence matrix of a graph is a matrix describing the relationship
between vertices and edges
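Example: network on vertices 1, 2, 3 with edges (1,2) and (2,3)
$M = \begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 0 & 1 \end{pmatrix}$ (rows: vertices 1, 2, 3; columns: edges (1,2), (2,3))
Relationship to signless Laplacian: $L_s = M M^T$
Adjacency: $A = M M^T - D$
Laplacian: $L = 2D - M M^T$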
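The three incidence-matrix relationships can be checked numerically. This is a small sketch on the 3-vertex path example, using an unoriented incidence matrix; variable names are illustrative.

```python
import numpy as np

# Path graph 1-2-3 (0-based vertices), edges (1,2) and (2,3).
edges = [(0, 1), (1, 2)]
n = 3

# Unoriented incidence matrix: M[v, e] = 1 if vertex v is an endpoint of edge e.
M = np.zeros((n, len(edges)), dtype=int)
for e, (u, v) in enumerate(edges):
    M[u, e] = M[v, e] = 1

MMt = M @ M.T
D = np.diag(np.diag(MMt))   # the diagonal of M M^T holds the degrees
A = MMt - D                 # adjacency
L = 2 * D - MMt             # Laplacian
L_s = MMt                   # signless Laplacian
print(A)
```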
Matrix Representation
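Consider the Laplacian (L) of this network
$$L = \begin{pmatrix} 1 & -1 & 0 & 0 & 0 \\ -1 & 3 & -1 & 0 & -1 \\ 0 & -1 & 3 & -1 & -1 \\ 0 & 0 & -1 & 1 & 0 \\ 0 & -1 & -1 & 0 & 2 \end{pmatrix}$$
Clearly if we label the network differently, we get a different matrix (here labels 1 and 2 are swapped)
$$L' = \begin{pmatrix} 3 & -1 & -1 & 0 & -1 \\ -1 & 1 & 0 & 0 & 0 \\ -1 & 0 & 3 & -1 & -1 \\ 0 & 0 & -1 & 1 & 0 \\ -1 & 0 & -1 & 0 & 2 \end{pmatrix}$$
In fact $L' = P L P^T$ represents the same graph for any permutation matrix P of the n labels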
Characterisations
Are two networks the same? (Graph Isomorphism), or is there
a bijection between the vertices such that all the edges are
in correspondence?
Interesting problem in computational complexity theory: its complexity is unknown, but it is hypothesised to form a separate class (GI) within the NP hierarchy; problems at least as hard are called GI-hard
Graph Automorphism: Isomorphism between a graph and
itself.
Characterisations
An equivalent statement: Two networks are isomorphic iff there exists a permutation matrix P such that
$X_2 = P X_1 P^T$
where X is a full matrix representation
X should contain all information about the network
– Applies to L, A etc., not to D
P is a relabelling; it changes the order in which we label the vertices
Our measurements from a matrix representation should be invariant under this transformation (similarity transform)
Spectral Graph Theory
Properties of the graph from the eigenvalues (eigenvectors) of
a matrix representation of the graph
Symmetric (undirected): $X = U \Lambda U^T$
Always has n real eigenvalues
Non-symmetric: $X = U_R \Lambda U_L^T = U_R \Lambda U_R^{-1}$
Possibly complex eigenvalues
Perron-Frobenius Theorem
Perron-Frobenius Theorem:
If X is an irreducible square matrix with non-negative entries, then there exists an eigenpair (λ,u) such that
$\lambda \in \mathbb{R}$, $\lambda \ge |\lambda_j|$ for all j, and $u_i \ge 0$ for all i
Applies to both left and right eigenvector
• Key theorem: if our matrix is non-negative, we can find a principal (largest) eigenvalue which is positive and has a non-negative eigenvector
• Irreducible implies associated digraph is strongly connected
Spectrum
The graph has an ordered set of eigenvalues (λ1, λ2, …, λn), ordered by size (I will use smallest first).
The (ordered) set of eigenvalues is called the spectrum of the graph.
Theorem: The spectrum is unchanged by the relabelling transform $X_2 = P X_1 P^T$, i.e. $\Lambda_2 = \Lambda_1$
Corollary: If two graphs are isomorphic, they have the same spectrum
This does not solve the isomorphism problem, as two different graphs may have the same spectrum
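A quick numerical check of the theorem, sketched on the Laplacian of the 5-vertex example (edge list assumed as 1-2, 2-3, 2-5, 3-4, 3-5) with one hypothetical relabelling that swaps labels 1 and 2.

```python
import numpy as np

edges = [(0, 1), (1, 2), (1, 4), (2, 3), (2, 4)]   # assumed example, 0-based
n = 5
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
X1 = np.diag(A.sum(axis=1)) - A                    # Laplacian as the representation X

perm = [1, 0, 2, 3, 4]                             # swap labels 1 and 2
P = np.eye(n)[perm]                                # permutation matrix
X2 = P @ X1 @ P.T                                  # relabelled representation

spec1 = np.linalg.eigvalsh(X1)                     # ascending eigenvalues
spec2 = np.linalg.eigvalsh(X2)
print(np.round(spec1, 4))
print(np.round(spec2, 4))
```

The matrices differ entry by entry, yet the two printed spectra agree to numerical precision.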
Undirected networks: Spectrum of A
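Spectrum of A: positive and negative eigenvalues
$\operatorname{Tr}(A) = \sum_{i=1}^{n} \lambda_i = 0$
$-d_{\max} \le \lambda_1 \le \lambda_2 \le \dots \le \lambda_n \le d_{\max}$
$(A^m)_{ij}$ gives the number of paths of length m joining i and j
$\operatorname{Tr}(A^m) = \sum_{i=1}^{n} \lambda_i^m$ = number of cycles of length m
$\operatorname{Tr}(A^2) = 2|E|$
$\operatorname{Tr}(A^3) = 6 \times$ (number of distinct cycles of length 3), since each triangle is counted from 3 starting points in 2 directions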
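The trace identities can be checked directly. A sketch on the 5-vertex example (edges assumed as 1-2, 2-3, 2-5, 3-4, 3-5, which gives 5 edges and one triangle):

```python
import numpy as np

edges = [(0, 1), (1, 2), (1, 4), (2, 3), (2, 4)]   # assumed example, 0-based
n = 5
A = np.zeros((n, n), dtype=int)
for u, v in edges:
    A[u, v] = A[v, u] = 1

lam = np.linalg.eigvalsh(A)            # eigenvalues of A, ascending
tr2 = np.trace(A @ A)                  # closed walks of length 2 = 2|E|
tr3 = np.trace(A @ A @ A)              # closed walks of length 3 = 6 * triangles

print(tr2, tr3, tr3 // 6)
print(np.round(lam.sum(), 10), np.round((lam ** 3).sum(), 10))
```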
Undirected networks: Spectrum of A
Bipartite graph: if λ is an eigenvalue, then so is −λ; Sp(A) is symmetric around 0
Perron-Frobenius Theorem (A is a non-negative matrix):
$\lambda_n$ is the largest-magnitude eigenvalue, and the corresponding eigenvector $u_n$ is non-negative
Undirected Networks: Spectrum of L
Spectrum of L: L is positive semi-definite
$0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$, $\sum_{i=1}^{n} \lambda_i = 2|E|$
There always exists an eigenvector 1 (the all-ones vector) with eigenvalue 0, because of zero row-sums
The number of zeros in the spectrum is the number of connected components of the network.
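As an illustration of the component count, here is a sketch on a hypothetical 5-vertex network made of a triangle and a separate edge; the helper name `laplacian` and the tolerance `1e-9` are choices made for this example.

```python
import numpy as np

def laplacian(n, edges):
    """Laplacian L = D - A of an undirected graph on vertices 0..n-1."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    return np.diag(A.sum(axis=1)) - A

# Hypothetical example: triangle (0,1,2) plus a disconnected edge (3,4).
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
lam = np.linalg.eigvalsh(laplacian(5, edges))      # ascending order

n_components = int(np.sum(np.abs(lam) < 1e-9))     # zeros in the spectrum
print(n_components, lam.sum())                     # sum of eigenvalues is 2|E|
```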
Spectrum of L
A spanning tree of a graph is a tree containing only edges in
the network and all the vertices
Example
Kirchhoff’s theorem
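The number of spanning trees of a graph is
$\tau(G) = \frac{1}{n} \prod_{i=2}^{n} \lambda_i$
where $\lambda_1 = 0 \le \lambda_2 \le \dots \le \lambda_n$ are the eigenvalues of L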
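A sketch of Kirchhoff's theorem on the 5-vertex example (edges assumed as 1-2, 2-3, 2-5, 3-4, 3-5). That network is a triangle with two pendant vertices, so a spanning tree is obtained by dropping one of the three triangle edges. The cofactor form of the theorem is included as a cross-check.

```python
import numpy as np

edges = [(0, 1), (1, 2), (1, 4), (2, 3), (2, 4)]   # assumed example, 0-based
n = 5
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
L = np.diag(A.sum(axis=1)) - A

lam = np.linalg.eigvalsh(L)                 # lam[0] is the zero eigenvalue
n_trees = np.prod(lam[1:]) / n              # (1/n) * product of the other eigenvalues

# Cross-check: any cofactor of L (delete one row and the same column).
n_trees_cofactor = np.linalg.det(L[1:, 1:])
print(round(n_trees), round(n_trees_cofactor))
```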
Spectrum of normalised L
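Spectrum of $\hat{L}$: positive semi-definite
$0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n \le 2$, $\sum_{i=1}^{n} \lambda_i = |V|$ (no isolated vertices)
As with the Laplacian, the number of zeros in the spectrum is the number of connected components of the network.
An eigenvector exists with eigenvalue 0 and entries $(\sqrt{d_1}, \sqrt{d_2}, \dots, \sqrt{d_n})^T$
'Scale invariance' for eigenvalues: they lie in [0, 2] whatever the size of the network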
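These properties of the normalized Laplacian can be verified numerically. A sketch on the 5-vertex example (edges assumed as 1-2, 2-3, 2-5, 3-4, 3-5, no isolated vertices):

```python
import numpy as np

edges = [(0, 1), (1, 2), (1, 4), (2, 3), (2, 4)]   # assumed example, 0-based
n = 5
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

d = A.sum(axis=1)                                  # degrees (all non-zero here)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_hat = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt    # normalized Laplacian

lam = np.linalg.eigvalsh(L_hat)
residual = L_hat @ np.sqrt(d)                      # should vanish: sqrt(d) is a null vector
print(np.round(lam, 4), np.round(lam.sum(), 10))
```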
Regular networks
• A network is regular if all vertices have the same
degree
• Spectra (eigenvalues and eigenvectors) essentially
the same
$D = kI$
$L = kI - A$
$\hat{L} = I - \frac{1}{k} A$
Eigensystem stability and the spectral difference
• If the network changes for some reason
– Rewiring, random noise etc.
• The eigenvalues and eigenvectors will change
• Let N be a symmetric matrix representing the
change (deleted/extra edges)
• The change in an eigenvalue is bounded above by
the Frobenius norm of N
– Small perturbation, small change in eigenvalues
$\lambda_k(X) + \lambda_1(N) \le \lambda_k(X + N) \le \lambda_k(X) + \lambda_n(N)$
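A sketch of the eigenvalue bound, taking X to be the adjacency matrix of the 5-vertex example (edges assumed as 1-2, 2-3, 2-5, 3-4, 3-5) and N a hypothetical perturbation that adds the single edge 1-4. Eigenvalues are sorted smallest first, so `lam_N[0]` and `lam_N[-1]` are the extreme eigenvalues of N.

```python
import numpy as np

edges = [(0, 1), (1, 2), (1, 4), (2, 3), (2, 4)]   # assumed example, 0-based
n = 5
X = np.zeros((n, n))
for u, v in edges:
    X[u, v] = X[v, u] = 1.0

N = np.zeros((n, n))
N[0, 3] = N[3, 0] = 1.0                 # hypothetical change: one extra edge

lam_X = np.linalg.eigvalsh(X)
lam_N = np.linalg.eigvalsh(N)
lam_XN = np.linalg.eigvalsh(X + N)

lower = lam_X + lam_N[0]                # lambda_k(X) + smallest eigenvalue of N
upper = lam_X + lam_N[-1]               # lambda_k(X) + largest eigenvalue of N
max_shift = np.max(np.abs(lam_XN - lam_X))
print(max_shift, np.linalg.norm(N))     # shift vs Frobenius norm of N
```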
Eigensystem stability and the spectral difference
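• If N is small compared to X we can apply eigenperturbation theory
$X' = X + N$
$\lambda_k' \approx \lambda_k + u_k^T N u_k$
$u_k' \approx u_k + \sum_{j=1, j \ne k}^{n} \frac{u_j^T N u_k}{\lambda_k - \lambda_j} u_j$
• Eigenvectors are not stable if the spectral difference $|\lambda_k - \lambda_j|$ is small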
References
Spectra of Graphs, Brouwer & Haemers, Springer
Graph Spectra for Complex Networks, Van Mieghem,
Cambridge University Press
Spectral Graph Theory, Fan Chung, American Mathematical
Society
Spectral Methods and Labels
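So far, we have considered edges only as present or absent {0,1}. If we have more edge information, we can encode it in a variety of ways: edges can be weighted to encode attributes, and diagonal entries can be included to encode vertices
$$A = \begin{pmatrix} 0.4 & 0.2 & 0 & 0 & 0 \\ 0.2 & 0.6 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \end{pmatrix}$$
[Figure: the example network with vertex attributes 0.4 and 0.6 and edge weight 0.2]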
Coding Attributes
• Note: When using Laplacian, add diagonal elements after
forming L
• Label attributes: Code labels into [0,1]
• Example: chemical structures
Edges: ─ 0.5, ═ 1.0, Aromatic 0.75
Vertices: C 0.7, N 0.8, O 0.9
Coding Attributes
Spectral theory works equally well for complex matrices
A matrix entry is x+iy, so it can encode two independent attributes per entry, x and y. A symmetric matrix becomes a Hermitian matrix: $A = A^{\dagger}$
Eigenvalues real, eigenvectors complex
[Example: a 3×3 Hermitian matrix with complex off-diagonal entries]
Spectra of Network Models
• A number of famous network models give very
distinctive eigenvalue distributions
• Example: Erdos-Renyi random graph model
• Edges are chosen by connecting each pair of
vertices with fixed probability p
Erdos-Renyi Spectrum of A
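[Figure: eigenvalue density of A for an Erdos-Renyi graph; horizontal axis from -50 to 50, vertical axis from 0 to 0.015]
Wigner semi-circle law: $\rho(\lambda) = \frac{1}{2\pi npq} \sqrt{4npq - \lambda^2}$, with $q = 1 - p$
One giant eigenvalue of $\lambda_n \approx np$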
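A sketch that samples one Erdos-Renyi graph and compares its spectrum with the two features above. The values `n = 400`, `p = 0.1` and the random seed are arbitrary choices for illustration; agreement is only approximate at finite n.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, arbitrary choice
n, p = 400, 0.1
q = 1.0 - p

# Connect each pair of vertices independently with probability p.
upper = np.triu(rng.random((n, n)) < p, k=1)
A = (upper | upper.T).astype(float)

lam = np.linalg.eigvalsh(A)             # ascending eigenvalues
radius = 2.0 * np.sqrt(n * p * q)       # edge of the semi-circle bulk

print("giant eigenvalue:", lam[-1], "vs np =", n * p)
print("bulk range:", lam[0], lam[-2], "vs +/-", radius)

# Empirical density of the bulk, to compare with the semi-circle formula.
hist, bin_edges = np.histogram(lam[:-1], bins=30, density=True)
```

A histogram of `lam[:-1]` should look roughly semi-circular, with the single giant eigenvalue sitting well outside the bulk.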
Scale free
• Scale-free (Preferential attachment)
• Network grows by adding new vertices
– m new edges added each time
• Probability of connection proportional to degree
Triangular distribution
Edges of triangle follow power law decay
One large eigenvalue of order $m^{1/2} n^{1/4}$
Scale-free Spectrum of A
[Figure: eigenvalue density of A for a scale-free graph; horizontal axis from -10 to 10, vertical axis from 0 to 0.25]
Triangular distribution
Small world
• Small world (Watts-Strogatz)
• Basic ring topology with m neighbours
• Reconnect edges randomly with prob. p
• When p=0, regular graph with degree m
– Degenerate spectrum with sharp peaks
• When p=1, ER random graph
– Semi-circle law
• Transitions between two for p∈[0,1]
Small world
[Figure: eigenvalue densities of A for the small-world model, four panels: p=0.0, p=0.1, p=0.3, p=0.5]
Spectral Partitioning and Cuts
• Divide a network into modules or clusters
• Minimise C
$C = \operatorname{cut}(P_1, P_2) = \sum_{(i,j) \in \text{cut}} A_{ij}$
– This simple approach does not work
Spectral Partitioning
• Should prefer equal partitions
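Ratio cut: $C = \frac{\operatorname{cut}(P_1, P_2)}{|P_1|} + \frac{\operatorname{cut}(P_1, P_2)}{|P_2|}$
Normalized cut: $C = \frac{\operatorname{cut}(P_1, P_2)}{\operatorname{vol}(P_1)} + \frac{\operatorname{cut}(P_1, P_2)}{\operatorname{vol}(P_2)}$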
Spectral Partitioning
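• Analysis (ratio cut)
Introduce indicator vector x
$x_i = \sqrt{|P_2| / |P_1|}$ if $v_i \in P_1$
$x_i = -\sqrt{|P_1| / |P_2|}$ if $v_i \in P_2$
Has following properties
1. $\mathbf{1}^T x = 0$
2. $\|x\|^2 = n$
3. $x_i$ takes only two values
$x^T L x = |V| \left( \frac{\operatorname{cut}(P_1, P_2)}{|P_1|} + \frac{\operatorname{cut}(P_1, P_2)}{|P_2|} \right)$, i.e. |V| times the ratio cut
Relax property 3 and let $x \in \mathbb{R}^n$:
$\min_x x^T L x$ subject to $\mathbf{1}^T x = 0$, $\|x\|^2 = n$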
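The relaxed problem is solved by the eigenvector of L with the second-smallest eigenvalue (the Fiedler vector), since the smallest belongs to the all-ones vector excluded by the constraint. A minimal sketch on a hypothetical test network, two 4-cliques joined by a single edge, discretising x by sign:

```python
import numpy as np

k = 4                                   # clique size (hypothetical example)
n = 2 * k
A = np.zeros((n, n))
A[:k, :k] = 1.0                         # first clique
A[k:, k:] = 1.0                         # second clique
np.fill_diagonal(A, 0.0)                # no self-loops
A[k - 1, k] = A[k, k - 1] = 1.0         # single bridging edge

L = np.diag(A.sum(axis=1)) - A
lam, U = np.linalg.eigh(L)              # eigenvalues in ascending order
fiedler = U[:, 1]                       # eigenvector of the second-smallest eigenvalue
in_P1 = fiedler > 0                     # discretise x into two values

cut = A[np.ix_(in_P1, ~in_P1)].sum()    # weight of edges crossing the partition
ratio_cut = cut / in_P1.sum() + cut / (~in_P1).sum()
print(in_P1.astype(int), cut, ratio_cut)
```

Thresholding at zero is one simple discretisation; other thresholds on the same eigenvector are possible.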
Spectral Partitioning
• Similarly, for normalized cut
$\operatorname{ncut}(P_1, P_2)$: $\min_x x^T \hat{L} x$ subject to $x^T D^{1/2} \mathbf{1} = 0$
• Discretize x into two values to obtain partitions
• Solution depends on finding eigenvector
• Type of cut depends on matrix
– Equally well use another matrix, e.g. adjacency
$\max_x x^T A x$ subject to $x^T u_n = 0$
• A measures affinity between vertices for being in the same partition
Modularity
• Modularity is a measure of partition quality
relative to some base graph model
• Can be summarised in modularity matrix
$B_{ij} = A_{ij} - P_{ij}$
• $P_{ij}$ is the expected affinity according to base model
– Needs to be more clustered than the model
• Common to use the configuration model as the base
$P_{ij} = \frac{d_i d_j}{2|E|}$
Modularity
• Modularity
$Q = \frac{1}{2|E|} x^T B x$
$\operatorname{mcut}(P_1, P_2)$: $\max_x x^T B x$
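A sketch of a modularity-based split using the configuration model as the base. It assumes a ±1 indicator vector x obtained from the sign of the leading eigenvector of B, and uses the 1/(2|E|) normalisation as written here; the test network (two 4-cliques joined by one edge) is hypothetical.

```python
import numpy as np

k = 4
n = 2 * k
A = np.zeros((n, n))
A[:k, :k] = 1.0
A[k:, k:] = 1.0
np.fill_diagonal(A, 0.0)
A[k - 1, k] = A[k, k - 1] = 1.0         # bridge between the two cliques

d = A.sum(axis=1)                       # degrees
two_E = d.sum()                         # 2|E|
B = A - np.outer(d, d) / two_E          # modularity matrix, P_ij = d_i d_j / 2|E|

lam, U = np.linalg.eigh(B)
x = np.where(U[:, -1] > 0, 1.0, -1.0)   # sign of the leading eigenvector
Q = x @ B @ x / two_E
print(x.astype(int), Q)
```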
Paths
Paths
• The structure of a network can be probed by
looking at the paths
– Communicability
– Commute time
• Generally not tractable to enumerate paths – too
many
• Need to think carefully about what can be
computed in practice
– Powers of A, exp(A) etc.
$f(A) = U f(\Lambda) U^T$
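A sketch of evaluating a matrix function through the eigendecomposition, here exp(A) for the 5-vertex example (edges assumed as 1-2, 2-3, 2-5, 3-4, 3-5), compared against a truncated power series. The truncation at 40 terms is an arbitrary choice that is ample for this small matrix.

```python
import numpy as np

edges = [(0, 1), (1, 2), (1, 4), (2, 3), (2, 4)]   # assumed example, 0-based
n = 5
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

# f(A) = U f(Lambda) U^T with f = exp, valid because A is symmetric.
lam, U = np.linalg.eigh(A)
exp_A = U @ np.diag(np.exp(lam)) @ U.T

# Reference: sum_k A^k / k!, truncated.
series = np.zeros((n, n))
term = np.eye(n)                        # A^0 / 0!
for k in range(1, 40):
    series += term
    term = term @ A / k                 # becomes A^k / k!
print(np.max(np.abs(exp_A - series)))
```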
Path
A path is a contiguous sequence of edges in the network
The length of p, l(p) is the number of edges traversed
A simple path is a self-avoiding path, which does not repeat
any vertices (with the possible exception of i and j)
$p_{ij} = w_{i,i_1} w_{i_1,i_2} w_{i_2,i_3} \cdots w_{i_{l-2},i_{l-1}} w_{i_{l-1},j}$
Example (network on vertices 1 to 5): $p = (2,4)(4,3)(3,5) = 2435$, $l(p) = 3$
Cycles
• A cycle is a closed path in a network, i.e. a path
across edges returning to the same vertex (i=j)
• Cycles are often an important structural
component of networks
Cycles
• A cycle is a sequence $c_{ii} = w_{i,i_1} w_{i_1,i_2} w_{i_2,i_3} \cdots w_{i_{l-2},i_{l-1}} w_{i_{l-1},i}$
• A simple cycle does not repeat any vertex except the first/last
• Two cycles may be considered equivalent if they are the same cycle with different starting points
Examples (network on vertices 1 to 5):
121: simple
2342 ~ 3423 ~ 4234: simple
53435: non-simple
Counting paths
• Formal adjacency matrix
• Replace {0,1} with formal variables representing
edges
• Allows us to keep track of which sequences
contribute to a particular calculation
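– Substitute specific values to find actual values
$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \rightarrow W = \begin{pmatrix} 0 & w_{12} & 0 \\ 0 & 0 & w_{23} \\ w_{31} & 0 & 0 \end{pmatrix}$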
Counting paths
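• Example
$(W^2)_{ij} = \sum_k w_{ik} w_{kj} = \sum_{p \in P^{(2)}_{ij}} w(p)$
With $w_{xy} = 1$: $(A^2)_{ij}$ = number of paths with $l(p) = 2$
$\exp(W) = \sum_k \frac{1}{k!} W^k$
With $w_{xy} = 1$: weighted sum of paths of all lengths
Walk generating function
$(I - W)^{-1} = I + W + W^2 + W^3 + \dots$
$\mathbf{1}^T (I - W)^{-1} \mathbf{1} = n + \sum_{p \in P^{(1)}} w(p) + \sum_{p \in P^{(2)}} w(p) + \sum_{p \in P^{(3)}} w(p) + \dots$
$w_{ij} = 1$: useless, doesn't converge
$w_{ij} = z$: $\mathbf{1}^T (I - zA)^{-1} \mathbf{1} = \sum_{p \in P} z^{l(p)}$
• Can use z to control convergence, z<1/n
$P_k = \frac{1}{k!} \frac{d^k}{dz^k} (I - zA)^{-1} \Big|_{z=0} = A^k$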
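A sketch of the walk generating function on the 5-vertex example (edges assumed as 1-2, 2-3, 2-5, 3-4, 3-5). The value `z = 0.1` is an arbitrary choice below 1/n; the truncated series is only there to confirm that the matrix inverse sums the paths of all lengths.

```python
import numpy as np

edges = [(0, 1), (1, 2), (1, 4), (2, 3), (2, 4)]   # assumed example, 0-based
n = 5
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

z = 0.1                                            # z < 1/n guarantees convergence
G = np.linalg.inv(np.eye(n) - z * A)               # (I - zA)^{-1}

# Truncated series I + zA + z^2 A^2 + ... for comparison.
series = sum(z ** k * np.linalg.matrix_power(A, k) for k in range(60))

ones = np.ones(n)
total = ones @ G @ ones                            # sum over all paths p of z^{l(p)}
paths_len2 = int(round(ones @ A @ A @ ones))       # coefficient of z^2
print(total, paths_len2)
```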
Example
MUTAG
Collection of 188 labelled chemical compounds.
Task is to predict whether each compound has
mutagenicity or not.
Method | Dataset | Accuracy
Random walk kernel | Mutag (labelled) | 90.0%
Backtrackless walk kernel | Mutag (labelled) | 91.1%
Feature vector from Random walk | COIL (unlabeled) | 94.4%
Feature vector from backtrackless random walk | COIL (unlabeled) | 95.5%
Feature vector from Ihara coefficients | COIL (unlabeled) | 94.4%
Shortest Path Kernel | COIL (unlabeled) | 86.7%
Feature vector from Random walk | Mutag (unlabeled) | 89.4%
Feature vector from backtrackless random walk | Mutag (unlabeled) | 90.5%
Feature vector from Ihara coefficients | Mutag (unlabeled) | 80.5%
Graph Kernels
• The walk generating function efficiently counts
paths
• Including backtracks
• Tottering masks interesting information
• Simple paths difficult to compute
Oriented Line Graph
• Oriented line graph (OLG): no backtracking
[Figure: a network on vertices 1 to 4 with each edge split into a directed pair (e12/e21, e23/e32, e24/e42, e14/e41, e34/e43), and the corresponding oriented line graph]
1. Convert edges into directed pairs
2. Each directed edge becomes a vertex
3. Join vertices where the head of one edge
meets the tail of another
4. Reverse pairs are not joined (eg. e12, e21)
Backtrackless Walks
• The adjacency of the OLG is given by T (the
Hashimoto matrix of the network)
• Paths on T are paths on A, except backtracks do
not appear
– Path of length l on T is path of length l+1 on A
• Count paths on T, but T can be big (2|E|×2|E|)
$B(z) = (I - zT)^{-1}$
Efficient computation
• Complexity is a problem
– $|V| = n$ but $|E| \sim n^2$, so T can have order $n^4$ entries while each n×n matrix has $n^2$
• We can directly compute the n×n matrix $A_k$, defined as
$(A_k)_{ij}$ = number of paths in G of length k with no backtracking, starting at i and ending at j
here i, j run over the vertices of G.
• Recursions for the matrices $A_k$: let A be the adjacency matrix of a simple graph G and Q be an n×n diagonal matrix whose ith diagonal entry is the degree of the ith node minus 1. Then
$A_k = A$ if $k = 1$
$A_k = A^2 - (Q + I)$ if $k = 2$
$A_k = A_{k-1} A - A_{k-2} Q$ if $k \ge 3$
[Stark and Terras 1996, Aziz et al 2013]
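A sketch comparing the recursion with a direct count on the Hashimoto matrix T. It assumes the 4-vertex example with edges 1-2, 2-3, 2-4, 1-4, 3-4 (stored 0-based); the helper names and the selector matrices `S` and `H` are illustrative. A backtrackless path of k edges in G corresponds to a path of length k-1 in the oriented line graph.

```python
import numpy as np

edges = [(0, 1), (1, 2), (1, 3), (0, 3), (2, 3)]   # assumed example, 0-based
n = 4
A = np.zeros((n, n), dtype=int)
for u, v in edges:
    A[u, v] = A[v, u] = 1
Q = np.diag(A.sum(axis=1) - 1)                     # degree minus 1 on the diagonal

def backtrackless_counts(A, Q, kmax):
    """Return [A_1, ..., A_kmax] from the vertex-based recursion (kmax >= 2)."""
    I = np.eye(len(A), dtype=int)
    out = [A, A @ A - (Q + I)]
    for _ in range(3, kmax + 1):
        out.append(out[-1] @ A - out[-2] @ Q)
    return out

# Hashimoto matrix: directed edge (u,v) links to (v,y) unless y == u.
darts = edges + [(v, u) for u, v in edges]
m = len(darts)
T = np.zeros((m, m), dtype=int)
for a, (u, v) in enumerate(darts):
    for b, (x, y) in enumerate(darts):
        if v == x and y != u:
            T[a, b] = 1

S = np.zeros((n, m), dtype=int)                    # S[i, e] = 1 if dart e starts at i
H = np.zeros((n, m), dtype=int)                    # H[j, e] = 1 if dart e ends at j
for e, (u, v) in enumerate(darts):
    S[u, e] = 1
    H[v, e] = 1

def via_hashimoto(k):
    """Count backtrackless paths of length k by walking k-1 steps on T."""
    return S @ np.linalg.matrix_power(T, k - 1) @ H.T

A_k = backtrackless_counts(A, Q, 6)
print(all(np.array_equal(A_k[k - 1], via_hashimoto(k)) for k in range(1, 7)))
```

The recursion only ever multiplies n×n matrices, whereas the direct count needs the 2|E|×2|E| matrix T.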
Cycles
• It is easy to count short simple cycles in a network
• As we noted earlier, $\operatorname{Tr}(A^2) = 2|E| = 10$ (number of 2-cycles)
• $\operatorname{Tr}(A^3) = 6 \times$ (number of triangles) $= 12$ (number of simple 3-cycles)
• $\operatorname{Tr}(A^4) = 50$, which is the number of 4-cycles, most of which are not simple
Example (network on vertices 1 to 4): $c_1 = 12341$, $c_2 = 121$, $c_3 = 141$
$c_1$ is a simple 4-cycle
$12141 = c_2 c_3$ is also a 4-cycle
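The three trace values can be reproduced with a few lines. This sketch assumes the 4-vertex example has edges 1-2, 2-3, 2-4, 1-4 and 3-4 (5 edges, 2 triangles), stored 0-based:

```python
import numpy as np

edges = [(0, 1), (1, 2), (1, 3), (0, 3), (2, 3)]   # assumed example, 0-based
n = 4
A = np.zeros((n, n), dtype=int)
for u, v in edges:
    A[u, v] = A[v, u] = 1

# Tr(A^k) counts closed walks of length k, simple or not.
traces = {k: int(np.trace(np.linalg.matrix_power(A, k))) for k in (2, 3, 4)}
print(traces)
```

Only 8 of the closed walks of length 4 come from the single simple 4-cycle (4 starting points, 2 directions); the rest retrace edges.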
Cycles
• A cycle in OLG(G) induces a cycle in G
• Since backtracks are not allowed, certain cycles do not
appear
– Cycles of length 2
– Cycles with tails
• Let T be the adjacency of the OLG
– Called the Hashimoto matrix of G
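• Still get repeats at larger size, e.g. $c_1^2$, $c_1 c_2$
$\operatorname{Tr}(T^n)$ counts simple cycles up to length 5 exactly: each simple cycle of length $n \le 5$ contributes $2n$ (n starting points, 2 directions)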
Cycles
• What about other matrix functions?
• Structural measures should be invariant to
permutation similarity transform
• det and perm seem obvious choices
– Counts hikes of length n, collections of disjoint cycles
• perm hard to compute
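$\det W = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} w_{i,\sigma(i)} = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, c_1 c_2 \cdots c_m$
$\operatorname{perm} W = \sum_{\sigma \in S_n} \prod_{i=1}^{n} w_{i,\sigma(i)} = \sum_{\sigma \in S_n} c_1 c_2 \cdots c_m$
where $c_1, \dots, c_m$ are the disjoint cycles of the permutation σ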
Ihara Zeta Function
• Ihara (1966), Sunada (1986)
• Prime cycle of a graph:
– A cycle which has no backtracking and is not a multiple of another
cycle
[Figure: three example cycles. Prime; not prime (backtracking); not prime (twice round a single cycle)]
Prime Cycles
– Similar trick to walk generating function
– Sum over hikes of any length: $\det(I - W)^{-1} = \sum_{h \in H} h$
• Use T to eliminate backtracks in the hikes and let $w_{ij} \to z$ to get a generating function
• Ihara zeta function of network
$\zeta_G(z) = \prod_{c \in C_H} \left(1 - z^{l(c)}\right)^{-1} = \det(I - zT)^{-1}$
– Effectively series over Ihara prime cycles
• Efficient evaluation using (large) eigenvalues of T
Application: Social Balance
• Some social interactions can be characterised by a
positive/negative interaction
– Friend/enemy, for/against
• Social theory suggests that networks should evolve
into a balanced state to decrease tension
– Does this happen in practice?
[Figure: triangle on Alice, Bob and Carol with edge signs +, +, −]
Balance in Networks
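• Early work focussed on triangles
– Easy to count
$w_{ij} = +1$ friends, $w_{ij} = -1$ enemies
$w(c) = +1$ balanced, $w(c) = -1$ unbalanced
$R_3 = \frac{N_3^-}{N_3^+ + N_3^-} = \frac{\operatorname{Tr}(A^3) - \operatorname{Tr}(A_s^3)}{2 \operatorname{Tr}(A^3)}$
where $A_s$ is the signed adjacency matrix, $A = |A_s|$, and $N_3^+$, $N_3^-$ are the numbers of balanced and unbalanced triangles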
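A sketch of the triangle balance ratio on a hypothetical signed network with two triangles, one balanced (+, +, +) and one unbalanced (+, +, −). It assumes a signed adjacency matrix `A_s` with entries ±1 and its unsigned counterpart `A = |A_s|`; each triangle contributes ±6 to the signed trace and 6 to the unsigned one.

```python
import numpy as np

# Hypothetical signed edges (u, v, sign) on 4 vertices, 0-based.
signed_edges = [(0, 1, +1), (1, 3, +1), (0, 3, -1), (1, 2, +1), (2, 3, +1)]
n = 4
A_s = np.zeros((n, n))
for u, v, s in signed_edges:
    A_s[u, v] = A_s[v, u] = s
A = np.abs(A_s)                                    # unsigned adjacency

tr = np.trace(np.linalg.matrix_power(A, 3))        # 6 * (N+ + N-)
tr_s = np.trace(np.linalg.matrix_power(A_s, 3))    # 6 * (N+ - N-)

n_balanced = (tr + tr_s) / 12
n_unbalanced = (tr - tr_s) / 12
R3 = (tr - tr_s) / (2 * tr)                        # fraction of unbalanced triangles
print(n_balanced, n_unbalanced, R3)
```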
Cycles
• Problem with counting balanced squares:
• There is no simple way of counting simple cycles of
arbitrary length in a graph
• A Hamiltonian cycle is a simple cycle which visits all
vertices of the graph
• Determining whether such a cycle exists is known to be NP-complete
– No polynomial-time algorithm likely for general simple cycle
counting
Cycles
• Can count Ihara cycles instead
– Simple up to length 5
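– No cycle powers $c^k$
$N_l^+ - N_l^- = \frac{1}{l} \sum_{d | l} \mu(d) \operatorname{Tr}\left(T_s^{l/d}\right)$
$N_l^+ + N_l^- = \frac{1}{l} \sum_{d | l} \mu(d) \operatorname{Tr}\left(T^{l/d}\right)$
where μ is the Möbius function and $T_s$ is the signed version of T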
Simple cycles
• Generating function for simple cycles
– S all simple cycles
$P(z) = \sum_{s \in S} z^{l(s)}$
• There is a trace formula for this function [Giscard et al 2016]
$P(z) = \int \frac{1}{z} \sum_{H \prec G} \operatorname{Tr}\left[ (zA_H)^{|H|} (I - zA_H)^{|N(H)|} \right] dz$
• Naturally this is NP-hard to compute
– Can get efficient approximations for shorter cycles using Monte Carlo sampling, particularly on sparse graphs
Balance in real networks
• The WikiElections network represents the votes of Wikipedia users during the elections of other users to adminship.
– Directed, 8,297 vertices, 12,915 edges
Balance in real networks
• The Epinions network is a large directed graph on 131,828
vertices representing relations between the users of the
consumer review website Epinions.com.
– Directed with 841,372 edges