
Spectral Methods for Complex Networks

Richard C. Wilson Dept. of Computer Science

University of York

+ path

Outline

Part I

1. Brief recap of spectral graph theory

2. Representation

3. Spectra of graph models

4. Application to graph partitioning

Part II

1. Paths and Cycles

2. Formal Series

3. Counting paths

4. Counting cycles

Matrix Representation

A matrix representation X of a network is a matrix with entries representing the vertices and edges

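Adjacency: $A_{uv} = 1$ if $(u,v) \in E$, and $0$ otherwise

[Figure: example network with vertices labelled 1 to 5]

Degree matrix: $D$ is diagonal with $D_{uu} = d_u$, the degree of vertex $u$

The Laplacian (L) is $L = D - A$

Signless Laplacian: $D + A$

Normalized Laplacian: $\hat{L} = D^{-1/2} L D^{-1/2}$

Entries are $\hat{L}_{uv} = 1$ if $u = v$, $-1/\sqrt{d_u d_v}$ if $(u,v) \in E$, and $0$ otherwise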

Incidence matrix

The incidence matrix of a graph is a matrix describing the relationship

between vertices and edges

Relationship to signless Laplacian: $M M^T = D + A$

Adjacency: $A = M M^T - D$

Laplacian: $L = 2D - M M^T$
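The matrix representations and the incidence relations can be checked numerically. The following is a minimal sketch, assuming an unoriented 0/1 incidence matrix M (one row per vertex, one column per edge) and a small illustrative graph that is not taken from the slides.

```python
import numpy as np

# Illustrative graph (an assumption): 4 vertices, 5 undirected edges, 0-indexed.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
n = 4

A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

D = np.diag(A.sum(axis=1))   # degree matrix
L = D - A                    # Laplacian
L_signless = D + A           # signless Laplacian

# Normalized Laplacian D^{-1/2} L D^{-1/2} (assumes no isolated vertices).
d_inv_sqrt = 1.0 / np.sqrt(np.diag(D))
L_hat = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]

# Unoriented incidence matrix: M[v, e] = 1 if vertex v is an endpoint of edge e.
M = np.zeros((n, len(edges)))
for e, (u, v) in enumerate(edges):
    M[u, e] = M[v, e] = 1.0

signless_ok = np.allclose(M @ M.T, L_signless)    # M M^T = D + A
laplacian_ok = np.allclose(L, 2 * D - M @ M.T)    # L = 2D - M M^T
```

With an oriented incidence matrix (entries +1 and -1) the product $M M^T$ would give $L$ itself instead; the unoriented form is used here because it matches the signless relation.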

Consider the Laplacian (L) of this network

Clearly if we label the network differently, we get a different matrix

In fact $L' = P L P^T$ represents the same graph for any permutation matrix $P$ of the $n$ labels

[Figure: the example network, vertices labelled 1 to 5, under two different labellings]

Characterisations

Are two networks the same? (Graph Isomorphism), or is there

a bijection between the vertices such that all the edges are

in correspondence?

Interesting problem in computational theory: complexity unknown, but hypothesised to be a separate class (GI) in the NP hierarchy; problems at least as hard are called GI-hard

Graph Automorphism: Isomorphism between a graph and

itself.

Characterisations

An equivalent statement: Two networks are isomorphic iff there exists a permutation matrix P such that $X_2 = P X_1 P^T$

X should contain all information about the network (X is a full matrix representation)

– Applies to L, A etc not to D

P is a relabelling; it changes the order in which we label the vertices

Our measurements from a matrix representation should be invariant under this transformation (similarity transform)

Spectral Graph Theory

Properties of the graph from the eigenvalues (eigenvectors) of

a matrix representation of the graph

Symmetric (undirected)

Always has n real eigenvalues

Non-symmetric

Possibly complex eigenvalues

Perron-Frobenius Theorem

Perron-Frobenius Theorem:

If X is an irreducible square matrix with non-negative entries, then there exists an eigenpair (λ,u) such that λ is real and positive, no other eigenvalue exceeds λ in magnitude, and the entries of u are non-negative

Applies to both left and right eigenvector

• Key theorem: if our matrix is non-negative, we can find a principal (largest) eigenvalue which is positive and has a non-negative eigenvector

• Irreducible implies associated digraph is strongly connected

Spectrum

The graph has an ordered set of eigenvalues (λ1, λ2, …, λn), ordered by size (I will use smallest first).

The (ordered) set of eigenvalues is called the spectrum of the

graph.

Theorem: The spectrum is unchanged by the relabelling

transform

Corollary: If two graphs are isomorphic, they have the same

spectrum

This does not solve the isomorphism problem, as two

different graphs may have the same spectrum
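The invariance of the spectrum under relabelling is easy to confirm numerically. A minimal sketch, assuming a path on 5 vertices as an arbitrary example and a randomly drawn permutation matrix P:

```python
import numpy as np

rng = np.random.default_rng(0)

# Laplacian of a small example graph (a path on 5 vertices; an arbitrary choice).
n = 5
A = np.zeros((n, n))
for u in range(n - 1):
    A[u, u + 1] = A[u + 1, u] = 1.0
L = np.diag(A.sum(axis=1)) - A

# Random permutation matrix: row i of P has its single 1 in column perm[i].
perm = rng.permutation(n)
P = np.eye(n)[perm]
L_relabelled = P @ L @ P.T

spec = np.linalg.eigvalsh(L)                      # ascending order
spec_relabelled = np.linalg.eigvalsh(L_relabelled)
same_spectrum = np.allclose(spec, spec_relabelled)
```

The converse check is not possible this way: equal spectra do not prove that two graphs are isomorphic.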

Undirected networks: Spectrum of A

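Spectrum of A: Positive and negative eigenvalues

$-d_{\max} \le \lambda_1 \le \lambda_2 \le \dots \le \lambda_n \le d_{\max}$, with $\lambda_1 \le 0 \le \lambda_n$

$(A^k)_{ij}$ gives the number of paths of length $k$ joining $i$ and $j$

$\mathrm{Tr}(A^k) = \sum_i \lambda_i^k$ is the number of cycles of length $k$

The number of distinct cycles of length $k$ is smaller: the trace counts a cycle once for each starting vertex and direction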

Undirected networks: Spectrum of A

Bipartite graph: If λ is an eigenvalue, then so is –λ, Sp(A)

symmetric around 0

Perron-Frobenius Theorem (A is a non-negative matrix)

$\lambda_n$ is the largest-magnitude eigenvalue, and the corresponding eigenvector $u_n$ is non-negative

Undirected Networks: Spectrum of L

Spectrum of L: L positive semi-definite

There always exists an eigenvector 1 with eigenvalue 0,

because of zero row-sums

The number of zeros in the spectrum is the number of connected components of the network.

Spectrum of L

A spanning tree of a graph is a tree containing only edges in

the network and all the vertices

Example

Kirchhoff’s theorem

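The number of spanning trees of a graph is $\frac{1}{n} \prod_{i=2}^{n} \lambda_i$, where $\lambda_2, \dots, \lambda_n$ are the eigenvalues of L other than the guaranteed zero $\lambda_1$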
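Kirchhoff's theorem can be tried on graphs whose spanning-tree counts are known in closed form. The sketch below assumes the spectral form of the theorem, $\frac{1}{n}\prod_{i \ge 2} \lambda_i$ with the Laplacian eigenvalues sorted smallest first; the function name is illustrative.

```python
import numpy as np

def count_spanning_trees(A):
    """Spanning-tree count from the Laplacian spectrum (Kirchhoff's theorem)."""
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A
    lam = np.linalg.eigvalsh(L)      # ascending; lam[0] is the zero eigenvalue
    return np.prod(lam[1:]) / n

# Complete graph K4: Cayley's formula gives 4^(4-2) = 16 spanning trees.
K4 = np.ones((4, 4)) - np.eye(4)
trees_K4 = count_spanning_trees(K4)

# Cycle C5: deleting any one of its 5 edges leaves a spanning tree.
C5 = np.zeros((5, 5))
for u in range(5):
    C5[u, (u + 1) % 5] = C5[(u + 1) % 5, u] = 1.0
trees_C5 = count_spanning_trees(C5)
```

For a disconnected graph a second eigenvalue is zero, so the product vanishes, which agrees with the absence of any spanning tree.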

Spectrum of normalised L

Spectrum of $\hat{L}$: Positive semi-definite

As with Laplacian, the number of zeros in the spectrum is the number of connected components of the network.

Eigenvector exists with eigenvalue 0 and entries $\left(\sqrt{d_1}, \sqrt{d_2}, \dots, \sqrt{d_n}\right)^T$

‘Scale invariance’ for eigenvalues: $0 \le \lambda \le 2$ whatever the size of the network

Regular networks

• A network is regular if all vertices have the same

degree

• Spectra (eigenvalues and eigenvectors) of the different matrix representations are essentially the same, since $D = dI$

Eigensystem stability and the spectral difference

• If the network changes for some reason

– Rewiring, random noise etc.

• The eigenvalues and eigenvectors will change

• Let N be a symmetric matrix representing the

change (deleted/extra edges)

• The change in an eigenvalue is bounded above by

the Frobenius norm of N

– Small perturbation, small change in eigenvalues

$\lambda_k(X) + \lambda_1(N) \le \lambda_k(X + N) \le \lambda_k(X) + \lambda_n(N)$

Eigensystem stability and the spectral difference

• If N is small compared to X we can apply

eigenperturbation theory

• Eigenvectors not stable if spectral difference |λk-λj|

is small

References

Spectra of Graphs, Brouwer & Haemers, Springer

Graph Spectra for Complex Networks, Van Mieghem, Cambridge University Press

Spectral Graph Theory, Fan Chung, American Mathematical Society

Spectral Methods and Labels

So far, we have considered edges only as present or absent {0,1}. If we have more edge information, we can encode it in a variety of ways. Edges can be weighted to encode attributes, and diagonal entries can be included to encode vertices

[Example: a weighted matrix with non-zero diagonal entries]

Coding Attributes

• Note: When using Laplacian, add diagonal elements after

forming L

• Label attributes: Code labels into [0,1]

• Example: chemical structures

─ (single bond) 0.5

═ (double bond) 1.0

Aromatic 0.75

Vertices

Coding Attributes

Spectral theory works equally well for complex matrices

Matrix entry is x+iy so can encode two independent attributes

per entry, x and y. Symmetric matrix becomes Hermitian

matrix

Eigenvalues real, eigenvectors complex

[Example: a 3×3 Hermitian matrix with complex off-diagonal entries]

Spectra of Network Models

• A number of famous network models give very

distinctive eigenvalue distributions

• Example: Erdos-Renyi random graph model

• Edges are chosen by connecting each pair of

vertices with fixed probability p

Erdos-Renyi Spectrum of A

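[Figure: eigenvalue distribution of A for an Erdos-Renyi graph]

Wigner semi-circle law: the bulk of the spectrum lies in $[-2\sqrt{npq},\ 2\sqrt{npq}]$, with $q = 1 - p$

One giant eigenvalue of approximately $np$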
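The two features of the Erdos-Renyi spectrum, a semi-circular bulk and one isolated giant eigenvalue, can be observed on a single sampled graph. A minimal sketch, assuming the standard values (bulk radius $2\sqrt{npq}$ with $q = 1 - p$, giant eigenvalue near $np$) and arbitrary choices of n, p and random seed:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 400, 0.1          # arbitrary illustrative values
q = 1.0 - p

# Sample the upper triangle independently with probability p, then symmetrise.
upper = np.triu(rng.random((n, n)) < p, k=1)
A = (upper | upper.T).astype(float)

lam = np.linalg.eigvalsh(A)             # ascending order
giant = lam[-1]                         # isolated largest eigenvalue
bulk_edge = 2.0 * np.sqrt(n * p * q)    # predicted semi-circle radius
bulk_max = np.abs(lam[:-1]).max()       # largest magnitude among the rest
```

Plotting a histogram of `lam[:-1]` should show the semi-circle; the agreement with the predicted radius improves as n grows.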

Scale free

• Scale-free (Preferential attachment)

• Network grows by adding new vertices

– m new edges added each time

• Probability of connection proportional to degree

Triangular distribution

Edges of triangle follow power law decay

One large eigenvalue of order $m^{1/2} n^{1/4}$

Scale-free Spectrum of A

[Figure: eigenvalue distribution of A for a scale-free network; triangular distribution]

Small world

• Small world (Watts-Strogatz)

• Basic ring topology with m neighbours

• Reconnect edges randomly with prob. p

• When p=0, regular graph with degree m

– Degenerate spectrum with sharp peaks

• When p=1, ER random graph

– Semi-circle law

• Transitions between the two for p∈(0,1)

Small world

[Figure: three eigenvalue distributions for small-world networks]

Spectral Partitioning and Cuts

• Divide a network into modules or clusters

• Minimise the cut $C = \mathrm{cut}(P_1, P_2) = \sum_{i \in P_1,\, j \in P_2} A_{ij}$

– This simple approach does not work

Spectral Partitioning

• Should prefer equal partitions

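Ratio cut: $C = \frac{\mathrm{cut}(P_1,P_2)}{|P_1|} + \frac{\mathrm{cut}(P_1,P_2)}{|P_2|}$

Normalized cut: $C = \frac{\mathrm{cut}(P_1,P_2)}{\mathrm{vol}(P_1)} + \frac{\mathrm{cut}(P_1,P_2)}{\mathrm{vol}(P_2)}$, where $\mathrm{vol}(P)$ is the sum of the degrees of the vertices in $P$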

• Analysis (ratio cut)

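Introduce indicator vector x, for example $x_i = \sqrt{|P_2|/|P_1|}$ for $i \in P_1$ and $x_i = -\sqrt{|P_1|/|P_2|}$ for $i \in P_2$

Has following properties

1. $x^T \mathbf{1} = 0$

2. $x^T x = |V|$

3. $x_i$ takes only two values

With this choice $x^T L x = |V| \left( \frac{\mathrm{cut}(P_1,P_2)}{|P_1|} + \frac{\mathrm{cut}(P_1,P_2)}{|P_2|} \right)$, i.e. $|V|$ times the ratio cut

Relax $x$ to real values, $x \in \mathbb{R}^n$: $\min_x x^T L x$ subject to $x^T \mathbf{1} = 0$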

• Similarly, for normalized cut: $\mathrm{ncut}(P_1,P_2) \rightarrow \min_x x^T \hat{L} x$ subject to $x^T D^{1/2} \mathbf{1} = 0$

• Discretize x into two values to obtain partitions

• Solution depends on finding eigenvector

• Type of cut depends on matrix

– Equally well use another matrix, e.g. adjacency: $\max_x x^T A x$ subject to $x^T u_n = 0$

• A measures affinity between vertices for being in the same partition
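The relaxed ratio-cut problem, minimising $x^T L x$ subject to $x^T \mathbf{1} = 0$, is solved by the eigenvector of the second-smallest eigenvalue of L, and thresholding it gives a partition. The sketch below assumes a sign threshold at zero as the discretisation step and uses a hypothetical test graph of two 4-cliques joined by a single edge.

```python
import numpy as np

def spectral_bisection(A):
    """Split vertices by the sign of the second-smallest Laplacian eigenvector."""
    L = np.diag(A.sum(axis=1)) - A
    lam, U = np.linalg.eigh(L)     # eigenvalues in ascending order
    x = U[:, 1]                    # minimiser of x^T L x with x orthogonal to 1
    return x >= 0                  # discretise x into two values

# Hypothetical test graph: cliques {0,1,2,3} and {4,5,6,7}, bridged by edge 3-4.
n = 8
A = np.zeros((n, n))
for block in (range(0, 4), range(4, 8)):
    for u in block:
        for v in block:
            if u != v:
                A[u, v] = 1.0
A[3, 4] = A[4, 3] = 1.0

labels = spectral_bisection(A)
```

The overall sign of an eigenvector is arbitrary, so which clique receives `True` can differ between runs or libraries; only the grouping is meaningful.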

Modularity

• Modularity is a measure of partition quality

relative to some base graph model

• Can be summarised in modularity matrix $B_{ij} = A_{ij} - P_{ij}$

• $P_{ij}$ is the expected affinity according to base model

– Needs to be more clustered than the model

• Common to use the configuration model as the base model, $P_{ij} = d_i d_j / 2|E|$

Modularity

• Modularity: $\mathrm{mcut}(P_1,P_2) \rightarrow \max_x x^T B x$

• The structure of a network can be probed by

looking at the paths

– Communicability

– Commute time

• Generally not tractable to enumerate paths – too many

• Need to think carefully about what can be computed in practice

– Powers of A, exp(A) etc.: $f(A) = U f(\Lambda) U^T$
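Matrix functions such as powers of A and exp(A) can be computed from the eigendecomposition of a symmetric A by applying the function to the eigenvalues. A minimal sketch, assuming a small illustrative graph and a helper name, `matrix_function`, that is not from the slides:

```python
import numpy as np

def matrix_function(A, f):
    """f(A) = U f(Lambda) U^T for a symmetric matrix A."""
    lam, U = np.linalg.eigh(A)
    return (U * f(lam)) @ U.T      # scales column k of U by f(lam[k])

# Illustrative graph: 4 vertices, edges 0-1, 1-2, 2-3, 3-0, 1-3.
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]:
    A[u, v] = A[v, u] = 1.0

exp_A = matrix_function(A, np.exp)
cube_A = matrix_function(A, lambda lam: lam ** 3)

# Cross-check exp(A) against the truncated power series sum_k A^k / k!.
series = np.zeros_like(A)
term = np.eye(4)
for k in range(1, 30):
    series += term
    term = term @ A / k
```

One eigendecomposition costs $O(n^3)$ and can then be reused for any number of functions f.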

A path is a contiguous sequence of edges in the network: $p_{ij} = w_{i,i_1} w_{i_1,i_2} w_{i_2,i_3} \cdots w_{i_{l-2},i_{l-1}} w_{i_{l-1},j}$

Example: $p_{25} = (2,4)(4,3)(3,5) = 2435$

The length of p, l(p), is the number of edges traversed

A simple path is a self-avoiding path, which does not repeat any vertices (with the possible exception of i and j)

Cycles

• A cycle is a closed path in a network, i.e. a path

across edges returning to the same vertex (i=j)

• Cycles are often an important structural

component of networks

Cycles

• A cycle is a sequence $c_i = w_{i,i_1} w_{i_1,i_2} w_{i_2,i_3} \cdots w_{i_{l-2},i_{l-1}} w_{i_{l-1},i}$

• A simple cycle does not repeat any vertex except the first/last

• Two cycles may be considered equivalent if they are the same cycle with different starting points: $2342 \sim 3423 \sim 4234$

[Figure: examples of a simple cycle and a non-simple cycle]

Counting paths

• Formal adjacency matrix

• Replace {0,1} with formal variables $w_{ij}$ representing the edges

• Allows us to keep track of which sequences contribute to a particular calculation

– Substitute specific values to find actual values

Counting paths

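• Example

$(W^2)_{ij} = \sum_k w_{ik} w_{kj}$; setting every $w = 1$ gives the number of paths with $l(p) = 2$

Weighted sum of paths of all lengths

$(I - W)^{-1} = I + W + W^2 + W^3 + \cdots$, so $[(I - W)^{-1}]_{ij} = \delta_{ij} + \sum_{p \in P^{(1)}_{ij}} p + \sum_{p \in P^{(2)}_{ij}} p + \sum_{p \in P^{(3)}_{ij}} p + \cdots$, where $P^{(l)}_{ij}$ is the set of paths of length $l$ from $i$ to $j$

$w_{ij} \to 1$: useless, doesn't converge

Walk generating function

$w_{ij} \to z$: $[(I - zA)^{-1}]_{ij} = \sum_{p} z^{l(p)}$

• Can use z to control convergence, $z < 1/\lambda_n$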
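The walk generating function can be compared directly with its defining series. A minimal sketch, assuming an illustrative 4-vertex graph and a value of z chosen at half the radius of convergence $1/\lambda_n$:

```python
import numpy as np

# Illustrative graph: 4 vertices, edges 0-1, 1-2, 2-3, 3-0, 1-3.
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]:
    A[u, v] = A[v, u] = 1.0

lam_max = np.linalg.eigvalsh(A)[-1]   # largest eigenvalue of A
z = 0.5 / lam_max                     # safely inside the radius 1 / lam_max

# Closed form: (I - zA)^{-1}.
closed_form = np.linalg.inv(np.eye(4) - z * A)

# Series: sum over lengths k of z^k A^k, where A^k counts paths of length k.
series = np.zeros((4, 4))
term = np.eye(4)
for _ in range(200):
    series += term
    term = z * (term @ A)
```

Entry (i, j) of either matrix is the sum over all paths from i to j, each weighted by $z^{l(p)}$; with z at or beyond $1/\lambda_n$ the series no longer converges.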

Example

Collection of 188 labelled chemical compounds.

Task is to predict whether each compound has

mutagenicity or not.

Method Dataset Accuracy

Random walk kernel

Backtrackless walk kernel

Mutag(labelled)

90.0%

Feature vector from Random walk

Feature vector from backtrackless random walk

Feature vector from Ihara coefficients

Shortest Path Kernel

COIL(unlabeled)

Feature vector from Random walk

Feature vector from backtrackless random walk

Feature vector from Ihara coefficients

Mutag(unlabeled)

Graph Kernels

• The walk generating function efficiently counts

• Including backtracks

• Tottering masks interesting information

• Simple paths difficult to compute

Oriented Line Graph

• Oriented line graph (OLG): no backtracking

[Figure: an example network and its oriented line graph, whose vertices are directed edges such as e14, e41, e23, e32, e24, e42]

1. Convert edges into directed pairs

2. Each directed edge becomes a vertex

3. Join vertices where the head of one edge

meets the tail of another

4. Reverse pairs are not joined (e.g. e12, e21)
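The four construction steps translate almost line for line into code. The sketch below is one possible implementation, with a hypothetical function name and an edge-list input format chosen for illustration; it returns the adjacency matrix of the oriented line graph.

```python
import numpy as np

def oriented_line_graph(edges):
    """Adjacency matrix T of the oriented line graph of an undirected graph."""
    # Step 1: convert each undirected edge into a pair of directed edges.
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    m = len(directed)
    # Step 2: each directed edge becomes a vertex (row/column) of T.
    T = np.zeros((m, m))
    for a, (u, v) in enumerate(directed):
        for b, (x, y) in enumerate(directed):
            # Step 3: join u->v to x->y when the first ends where the second starts.
            # Step 4: skip the reverse pair (u->v followed by v->u).
            if v == x and y != u:
                T[a, b] = 1.0
    return directed, T

# Triangle: 3 undirected edges give a 6 x 6 matrix.
directed, T = oriented_line_graph([(0, 1), (1, 2), (2, 0)])
closed_2 = np.trace(T @ T)                         # backtracks are excluded
closed_3 = np.trace(np.linalg.matrix_power(T, 3))  # the triangle, 3 starts x 2 directions
```

The double loop is quadratic in the number of directed edges, which is fine for small examples but reflects the size problem of the matrix itself.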

Backtrackless Walks

• The adjacency of the OLG is given by T (the

Hashimoto matrix of the network)

• Paths on T are paths on A, except backtracks do

not appear

– Path of length l on T is path of length l+1 on A

• Count paths on T, but T can be big (2|E|×2|E|)

$B(z) = (I - zT)^{-1}$

Efficient computation

• Complexity is a problem: with $|V| = n$, $|E|$ can be of order $n^2$, so $T$ can have of order $n^4$ entries

• We can directly compute the n×n matrix $A_k$, defined as

$(A_k)_{i,j}$ = number of paths in $G$ of length $k$ with no backtracking, starting at $i$ and ending at $j$;

here i, j run over the vertices of G.

• Recursions for the matrices $A_k$ – Let A be the adjacency matrix of a simple graph G and Q be an n×n diagonal matrix whose ith diagonal entry is the degree of the ith node minus 1. Then

$A_1 = A$, $A_2 = A^2 - (Q + I)$, $A_k = A_{k-1} A - A_{k-2} Q$ for $k \ge 3$

[Stark and Terras 1996, Aziz et al 2013]
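The n×n recursion avoids building the much larger oriented line graph. The following sketch assumes the standard form of the recursion, $A_1 = A$, $A_2 = A^2 - (Q + I)$ and $A_k = A_{k-1} A - A_{k-2} Q$ for $k \ge 3$; the function name and the example graph are illustrative.

```python
import numpy as np

def backtrackless_counts(A, k_max):
    """Return [A_1, ..., A_kmax]; A_k[i, j] counts backtrackless paths of length k."""
    n = A.shape[0]
    Q = np.diag(A.sum(axis=1) - 1.0)            # degree minus 1 on the diagonal
    mats = [A.copy()]                           # A_1 = A
    if k_max >= 2:
        mats.append(A @ A - (Q + np.eye(n)))    # A_2 removes the i -> j -> i paths
    for _ in range(3, k_max + 1):
        mats.append(mats[-1] @ A - mats[-2] @ Q)
    return mats[:k_max]

# Illustrative graph: 4 vertices, edges 0-1, 1-2, 2-3, 3-0, 1-3.
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]:
    A[u, v] = A[v, u] = 1.0

counts = backtrackless_counts(A, 5)
```

Each step costs one n×n matrix product, so paths up to length k take $O(k n^3)$ time regardless of the number of edges.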

Cycles

• It is easy to count short simple cycles in a network

• As we noted earlier, $\mathrm{Tr}(A^2) = 2|E| = 10$ (number of 2-cycles)

• $\mathrm{Tr}(A^3) = 12$ (number of simple 3-cycles)

• $\mathrm{Tr}(A^4) = 50$, which is the number of 4-cycles, most of which are not simple

– 12341 is a simple 4-cycle

– 12141 is also a 4-cycle (121 followed by 141)
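The trace counts can be reproduced on a small graph. The sketch below assumes a 4-vertex graph with edges 1-2, 2-3, 3-4, 4-1 and 2-4; this graph is an inference, chosen because it contains the cycles 12341 and 12141 and yields the trace values 10, 12 and 50.

```python
import numpy as np

# Assumed example graph, vertex labels 1..4 (inferred, not stated in the slides).
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (2, 4)]
A = np.zeros((4, 4), dtype=int)
for u, v in edges:
    A[u - 1, v - 1] = A[v - 1, u - 1] = 1

tr2 = int(np.trace(np.linalg.matrix_power(A, 2)))   # closed paths of length 2
tr3 = int(np.trace(np.linalg.matrix_power(A, 3)))   # closed paths of length 3
tr4 = int(np.trace(np.linalg.matrix_power(A, 4)))   # closed paths of length 4

num_edges = tr2 // 2        # each edge is counted from both endpoints
num_triangles = tr3 // 6    # each triangle: 3 starting points x 2 directions
```

The graph has a single simple 4-cycle, which contributes only 8 of the 50 closed paths of length 4; the remainder retrace edges.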

Cycles

• A cycle in OLG(G) induces a cycle in G

• Since backtracks are not allowed, certain cycles do not

appear

– Cycles of length 2

– Cycles with tails

• Let T be the adjacency of the OLG

– Called the Hashimoto matrix of G

– $\mathrm{Tr}(T^n)$ counts simple cycles up to length 5 exactly

• Still get repeats at larger size, e.g. $c_1^2$, $c_1 c_2$

Cycles

• What about other matrix functions?

• Structural measures should be invariant to

permutation similarity transform

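• det and perm seem obvious choices

– Counts hikes of length n, collections of disjoint cycles

$\det W = \sum_{\sigma} \mathrm{sgn}(\sigma) \prod_i w_{i,\sigma(i)} = \sum \mathrm{sgn}(c_1 c_2 \cdots c_k)\, c_1 c_2 \cdots c_k$

$\mathrm{perm}\, W = \sum_{\sigma} \prod_i w_{i,\sigma(i)} = \sum c_1 c_2 \cdots c_k$

• perm hard to compute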

Ihara Zeta Function

• Ihara (1966), Sunada (1986)

• Prime cycle of a graph:

– A cycle which has no backtracking and is not a multiple of another

[Figure: three example cycles: prime; not prime (backtracking); not prime (twice round a single cycle)]

Prime Cycles

– Similar trick to walk generating function

– $\det(I - W)$: sum over hikes of any length

• Use T to eliminate backtracks in the hikes and let $w_{ij} \to z$ to get a generating function

• Ihara zeta function of network: $\zeta_G(z)^{-1} = \det(I - zT) = \prod_{p\ \mathrm{prime}} \left(1 - z^{l(p)}\right)$

– Effectively series over Ihara prime cycles

• Efficient evaluation using (large) eigenvalues of T
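The determinant form of the Ihara zeta function can be checked on a graph whose prime cycles are known. A minimal sketch, assuming the identity $\zeta_G(z)^{-1} = \det(I - zT)$ with T the adjacency matrix of the oriented line graph; the function names are illustrative. A triangle has exactly two prime cycles, one per direction, each of length 3, so the product over primes is $(1 - z^3)^2$.

```python
import numpy as np

def hashimoto(edges):
    """Adjacency matrix of the oriented line graph (no backtracking)."""
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    m = len(directed)
    T = np.zeros((m, m))
    for a, (u, v) in enumerate(directed):
        for b, (x, y) in enumerate(directed):
            if v == x and y != u:     # consecutive edges, reverse pair excluded
                T[a, b] = 1.0
    return T

def ihara_zeta_reciprocal(edges, z):
    """Evaluate 1 / zeta_G(z) as det(I - zT)."""
    T = hashimoto(edges)
    return np.linalg.det(np.eye(T.shape[0]) - z * T)

triangle = [(0, 1), (1, 2), (2, 0)]
z = 0.3                               # arbitrary evaluation point
value = ihara_zeta_reciprocal(triangle, z)
expected = (1 - z ** 3) ** 2          # product over the two prime cycles
```

For larger graphs the same determinant can be written as a product of $(1 - z\mu_i)$ over the eigenvalues $\mu_i$ of T, which is why the large eigenvalues dominate the evaluation.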

Application: Social Balance

• Some social interactions can be characterised by a

positive/negative interaction

– Friend/enemy, for/against

• Social theory suggests that networks should evolve

into a balanced state to decrease tension

– Does this happen in practice?

Balance in Networks

• Early work focussed on triangles

– Easy to count

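friends: +1

enemies: −1

balanced: product of the edge signs around the triangle is +1

unbalanced: product of the edge signs around the triangle is −1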

Cycles

• Problem with counting balanced squares:

• There is no simple way of counting simple cycles of

arbitrary length in a graph

• A Hamiltonian cycle is a simple cycle which visits all

vertices of the graph

• Determining whether such a cycle exists is known to be NP-complete

– No polynomial-time algorithm likely for general simple cycle

counting

Cycles

• Can count Ihara cycles instead

– Simple up to length 5

– No cycle powers $c^k$

Simple cycles

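• Generating function for simple cycles: $P(z) = \sum_{s \in S} z^{l(s)}$

– S all simple cycles

• There is a trace formula for this function [Giscard et al 2016]:

$P(z) = \int \frac{1}{z} \sum_{H} \mathrm{Tr}\left[ (zA_H)^{|H|} \, (I - zA_H)^{|N(H)|} \right] dz$, with the sum over connected induced subgraphs $H$ of the network, $A_H$ the adjacency matrix of $H$ and $N(H)$ the neighbourhood of $H$

• Naturally this is NP-hard to compute

– Can get efficient approximations for shorter cycles using Monte Carlo sampling, particularly on sparse graphs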

Balance in real networks

• The WikiElections network represents the votes of Wikipedia users during the elections of other users to adminship.

– Directed, 8,297 vertices, 12,915 edges

Balance in real networks

• The Epinions network is a large directed graph on 131,828

vertices representing relations between the users of the

consumer review website Epinions.com.

– Directed with 841,372 edges
