Response network emerging from simple perturbation

Seung-Woo Son

Complex System and Statistical Physics Lab.,

Dept. Physics, KAIST, Daejeon 305-701, Korea

Motivation : Microarray Data

• Microarray data show the response of each gene to an experiment, which acts as a perturbation of the genetic network. ex) gene deletion, temperature change, etc.

• Just as a genetic network is built from microarray data, a secondary network can be constructed from the response of a primary network under perturbation. ex) node removal (?)

“ Can the secondary network represent the primary network correctly ? ”

“ What is the meaning of the response under perturbation ? ”

“ Ultimately, can we recover the primary network from the secondary network ? ”

Introduction : Node Removal Perturbations

• When a node is removed, network structure changes. The network can break into several isolated clusters.

• Giant cluster size decreases gradually and the average path length increases.

R. Albert and A.-L. Barabási, Reviews of Modern Physics, 74, 47 (2002)

• An SF network is more tolerant of random node removal than a random network.

• In an SF network, the diameter changes under single-node removal follow a power-law distribution. J.-H. Kim, K.-I. Goh, B. Kahng and D. Kim, Physical Review Letters, 91, 5 (2003)
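The effect of a node removal on the giant cluster can be checked directly. The sketch below is only an illustration, not the code behind the slides: the function name `giant_cluster_size` and the adjacency-dictionary input format are assumptions made here.

```python
from collections import deque

def giant_cluster_size(adj, removed=()):
    """Size of the largest connected cluster once `removed` nodes are deleted.

    adj: dict mapping each node to a list of its neighbours (undirected).
    """
    seen = set(removed)          # removed nodes are never visited
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:             # breadth-first search over one cluster
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best

# A star is the extreme case: removing the hub isolates every leaf.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
print(giant_cluster_size(star))               # 5
print(giant_cluster_size(star, removed=[0]))  # 1
print(giant_cluster_size(star, removed=[1]))  # 4
```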

Introduction : Load & Betweenness Centrality

• What is the “Load” ?

– When every pair of nodes in a network exchanges data packets along the shortest path, the load of a node is the total number of data packets passing through that node.

ex) Internet traffic jam

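[Figure: two copies of an example network with start and target nodes marked and a number on each node, illustrating the load l_{i→j}(k) and the betweenness b_{i→j}(k).]

$l(k) = \sum_{i,j} l_{i \to j}(k)$, $\quad b(k) = \sum_{i,j} b_{i \to j}(k)$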

• Betweenness Centrality BC ( Freeman, 1977 )

– If $g_{i \to j}$ is the number of geodesic paths from i to j and $g_{i \to k \to j}$ is the number of those paths that pass through k, then $g_{i \to k \to j} / g_{i \to j}$ is the proportion of geodesic paths from i to j that pass through k. The sum over all i,j pairs is the betweenness centrality.

$b(k) = \sum_{i,j} g_{i \to k \to j} / g_{i \to j}$
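The definition of betweenness centrality can be turned into a short calculation. This is a minimal sketch under stated assumptions, not the implementation used for the slides: the network is unweighted and undirected, pairs (i, j) are counted as ordered pairs, and the endpoints i and j themselves receive no contribution. The names `bfs_counts` and `betweenness` are chosen here for illustration.

```python
from collections import deque

def bfs_counts(adj, s):
    """Return (dist, sigma): hop distance from s and number of geodesics from s."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:              # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:     # u precedes v on a geodesic
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """b(k) = sum over ordered pairs (i, j) of g_ikj / g_ij, endpoints excluded."""
    info = {s: bfs_counts(adj, s) for s in adj}
    b = {k: 0.0 for k in adj}
    for i in adj:
        dist_i, sig_i = info[i]
        for j in dist_i:                   # only targets reachable from i
            if j == i:
                continue
            for k in adj:
                if k == i or k == j or k not in dist_i:
                    continue
                dist_k, sig_k = info[k]
                # k lies on a geodesic i -> j iff the distances add up
                if j in dist_k and dist_i[k] + dist_k[j] == dist_i[j]:
                    b[k] += sig_i[k] * sig_k[j] / sig_i[j]
    return b

# Path 0 - 1 - 2: only the middle node carries geodesics, (0,2) and (2,0).
print(betweenness({0: [1], 1: [0, 2], 2: [1]}))  # {0: 0.0, 1: 2.0, 2: 0.0}
```

The triple loop costs O(N^3) after the N breadth-first searches, which is fine for small examples; for networks of the size discussed here a Brandes-style accumulation would be the usual choice.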

[Figure: distributions of b_i (avg. over row) and b(k) (avg. over column), log-log axes; fitted exponent ~ 2.0.]

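Introduction : BC Changes Δbi(k) - BA model

$\Delta b_i(k) = b_i(k) - b^o(k)$

$b^o(k)$ : BC of the k-th node in the whole network

$b_i(k)$ : BC of the k-th node after removal of the i-th node

[Equation garbled in the source and not recoverable: a second expression written in terms of j.]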

cf) diameter changes

J.-H. Kim, K.-I. Goh, B. Kahng and D. Kim, Physical Review Letters, 91, 5 (2003)

Distribution of Δbi(k) - BA model

• The distribution of Δbi(k) follows a power law with exponent 2.1.

• The summation of BC changes after the i-th node removal is linearly proportional to the BC of the i-th node in the BA model.

$\Delta b_i \propto b^o(i)$

$\Delta b_i = \sum_{k \ne i} \Delta b_i(k) = \sum_{k \ne i} b_i(k) - \sum_{k \ne i} b^o(k) = b_i - b^o + b(i)$
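The BC change Δbi(k) = bi(k) − bo(k) can be computed by brute force: evaluate BC on the whole network, delete node i, and evaluate it again. The sketch below assumes an unweighted, undirected network stored as an adjacency dictionary, with BC summed over ordered pairs and endpoints excluded; the helper names are hypothetical.

```python
from collections import deque

def bfs_counts(adj, s):
    """Hop distances and geodesic counts from s."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v], sigma[v] = dist[u] + 1, 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """BC over ordered pairs, endpoints excluded (an assumed convention)."""
    info = {s: bfs_counts(adj, s) for s in adj}
    b = {k: 0.0 for k in adj}
    for i in adj:
        dist_i, sig_i = info[i]
        for j in dist_i:
            for k in dist_i:
                if len({i, j, k}) < 3:
                    continue
                dist_k, sig_k = info[k]
                if j in dist_k and dist_i[k] + dist_k[j] == dist_i[j]:
                    b[k] += sig_i[k] * sig_k[j] / sig_i[j]
    return b

def remove_node(adj, i):
    """Copy of the network without node i and its links."""
    return {u: [v for v in nbrs if v != i] for u, nbrs in adj.items() if u != i}

def bc_change(adj, i):
    """Delta b_i(k) = b_i(k) - b^o(k) for every remaining node k."""
    b_o = betweenness(adj)
    b_i = betweenness(remove_node(adj, i))
    return {k: b_i[k] - b_o[k] for k in b_i}

# Ring 0-1-2-3: removing node 0 leaves the path 1-2-3, so node 2 gains BC
# while nodes 1 and 3 lose the geodesics they used to relay.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(bc_change(ring, 0))  # {1: -1.0, 2: 1.0, 3: -1.0}
```

In this example the changes sum to −1, which equals bi − bo + b(0) = 2 − 4 + 1 when the sums run over the remaining nodes.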

MST & Percolation Network

• How to build the secondary network ? : Based on the BC change Δbi(j) = “correlation” between node i and j

– MST (minimum spanning tree)
A graph G = (V,E) with weighted edges. The subset of E of minimum total weight which forms a tree on V ≡ MST. Each node is linked to the most influential one, under the constraint that the N vertices must be connected with only (N-1) edges.

– Percolation
After sorting Δbi(j) in descending order, add a link between i and j following that order. When all nodes form one giant cluster, stop the attachment. The links with values Δbi(j) > b* (percolation threshold) are therefore the valid, connected ones.
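Both constructions can be sketched with one pass over the sorted pair weights and a union-find structure. This is an illustrative sketch, not the original procedure: it assumes one symmetric weight per pair (i, j), given as a dictionary, and it reads "linked to the most influential one" as a Kruskal pass that keeps the largest weights (equivalently, a minimum spanning tree on the negated weights). The function names are made up for this example.

```python
def find(parent, x):
    """Root of the cluster containing x, with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def build_secondary(nodes, weight, mode):
    """Build a secondary network from pair weights.

    weight: dict {(i, j): w}, one entry per unordered pair.
    mode:   "mst" keeps only links that merge two clusters (N - 1 links);
            "percolation" keeps every link until one giant cluster remains.
    """
    parent = {n: n for n in nodes}
    clusters = len(parent)
    links = []
    # Visit pairs in descending order of weight.
    for (i, j), w in sorted(weight.items(), key=lambda kv: -kv[1]):
        if clusters == 1:            # all nodes connected: stop attaching
            break
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:
            parent[ri] = rj          # merge the two clusters
            clusters -= 1
            links.append((i, j))
        elif mode == "percolation":
            links.append((i, j))     # extra link inside an existing cluster
    return links

nodes = ["a", "b", "c", "d"]
w = {("a", "b"): 5, ("b", "c"): 4, ("a", "c"): 3, ("c", "d"): 2, ("a", "d"): 1}
print(build_secondary(nodes, w, "mst"))          # 3 links, a tree
print(build_secondary(nodes, w, "percolation"))  # 4 links, threshold at w = 2
```

In the percolation run the weight of the last link added plays the role of the threshold b*.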

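[Equation garbled in the source and not recoverable: an expression in terms of j that accompanies the link weight Δbi(j).]

[Figure: the same weighted example graph on nodes a to f shown twice, labelled “MST” and “Percolation”, with the selected links marked.]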

Result : Secondary Networks

• The degree k of a secondary network contains global information about the primary network, because it is constructed from BC, which is calculated from information on the whole network.

• Sparser or denser networks that retain the information of the original network can be constructed.

• Secondary networks represent the primary network well, with significant link matches.

[Figure: secondary network construction; panels labelled “BA 100” and “MST”.]

Result : Minimal Spanning Tree

[Figure: degree distribution D( k_mst ) versus degree of MST ( k_mst ) for the BA model ( N = 1000, m = 2 ), log-log axes, with an exponent 2.2 fit line.]

[Figure: k_mst and k_per versus degree of primary network ( k_org ); series: mst, percolation, original; reference lines with slope 1, slope 3.3 and slope 0.9.]

• In the MST network, the degree distribution follows a power law with exponent 2.2, not 3.0 ( Scale-free )

$D(k_{mst}) \sim k_{mst}^{-2.2}$

• The degree of each node in the secondary networks is linearly correlated with its degree in the primary network.

$k_{mst},\ k_{per} \propto k_{org}$

Result : Percolation Network

• The degree distribution of the percolation network follows a power law. ( exponent 1.9, i.e. $D(k_p) \sim k_p^{-1.9}$ )

• Percolation features appear during giant cluster formation.

[Figure: degree distribution D( k_p ) versus degree of percolation network ( k_p ) for the BA model ( N = 1000, m = 2 ), log-log axes, with an exponent 1.9 fit line.]

[Figure: size of giant cluster ( G / N ) versus percolation value ( v / v_max ) for BA networks with 1000 and 3000 nodes.]

[Figure: number of nodes (n) and size of giant cluster (s) versus number of links attached (m), log-log axes, with a linear fit of slope 0.91.]

Similarity Measurement between Two Networks

• The links of each node are regarded as a vector in an N-dimensional vector space.
– The vector inner product shows the similarity between two networks.

• Binary undirected network case : the inner product counts how many links of the two networks overlap.

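[Figure: two networks on the same nodes 0 to 6, compared link by link.]

$v_i = ( w_{i1}, w_{i2}, \ldots, w_{ij}, \ldots, w_{iN} )$, $\quad x_i = v_i \cdot v_i'$

Similarity measure X : obtained from $\sum_{i=1}^{N} x_i$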
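For binary undirected networks the inner products reduce to counting shared links, since the sum of x_i over all nodes counts every shared link twice. The sketch below is hedged: the normalisation of X did not survive in the slide, so dividing the number of matches by the smaller of the two link counts is an assumption made here, chosen because it appears consistent with the MST and percolation rows reported for the BA model (907 / 999 ≈ 0.908 and 1529 / 1996 ≈ 0.766). The function names are illustrative.

```python
def link_set(edges):
    """Undirected links as a set of frozensets, so (i, j) equals (j, i)."""
    return {frozenset(e) for e in edges}

def similarity(edges_a, edges_b):
    """Return (matches, X) for two networks on the same node labels.

    matches: number of links present in both networks.
    X:       matches divided by the smaller link count (assumed normalisation).
    """
    a, b = link_set(edges_a), link_set(edges_b)
    matches = len(a & b)
    return matches, matches / min(len(a), len(b))

primary = [(0, 1), (1, 2), (2, 3)]
secondary = [(1, 0), (2, 3), (0, 3), (0, 2)]
matches, X = similarity(primary, secondary)
print(matches, round(X, 3))  # 2 0.667
```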

BA model ( N = 1000 , M = 1996 )

Network             links    X        matches
MST network         999      0.908    907
Percolation net.    3377     0.766    1529
Other BA net.       1996     0.019    39
RG network          1996     0.012    23
Random net.         2041     0.003    5
Random net.         957      0.001    1

• The similarity measures between the secondary networks and the primary network are significantly higher than those for the other networks.

( MST : 90.8 % , percolation : 76.6 % )

The secondary networks represent the primary network well.

$w_{ij} = \begin{cases} 1 & \text{( if } i, j \text{ are linked )} \\ 0 & \text{( otherwise )} \end{cases}$

Conclusions & Future Works

• Conclusions

– The two secondary networks, MST & percolation network, reproduce the scale-free behavior, and the degree of each node is proportional to its degree in the primary network. The degree therefore contains global information about the primary network.

– Similarity measurement shows that the secondary networks reproduce the original network quite well. ( MST: 91% , percolation: 77% )

– The BC change values Δbi(j) represent the interaction between node i and node j, and they are directly related to the diameter change.

– The relation between Δbi(j) and b(i) might help to explain the network classification by BC distribution exponents.

• Future Works

– BC change calculation for other network models and real networks.
– Precise relation between Δbi(j) and b(i) , analytic calculation.
– Finding the primary network from secondary network information.

Distribution of BC Changes Δbi(k)

$\Delta b_i(j) = b_i(j) - b^o(j)$

$\Delta b_i = \sum_{j \ne i} \Delta b_i(j) = \sum_{j \ne i} [\, b_i(j) - b^o(j) \,] = b_i - b^o + b(i)$

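$d^o = \frac{1}{N(N-1)} \sum_{k,l} d^o_{k,l}$, with $D^o \equiv \sum_{k,l} d^o_{k,l}$

[Derivation garbled in the source: the relative diameter change $(d_i - d^o)/d^o$ is rewritten in terms of $D_i$, $D^o$ and N, and then in terms of $b_i$, $b^o$ and $b(i)$.]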

bi : summation of BC after the i-th node is removed

bo : summation of BC over the whole network

bκ : summation of BC from the κ-th node to all others

$D = \sum_{k,l} d_{k,l} = \sum_i b(i) + N(N-1) = b + N(N-1)$
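The link between the summed BC and the summed distances follows from counting intermediate nodes. The short derivation below is a sketch under assumptions stated here rather than taken from the slide: the network is connected, BC is summed over ordered pairs, and the endpoints of a geodesic are not counted. Every geodesic between i and j has $d_{i,j} - 1$ intermediate nodes, so the fractions $g_{i \to k \to j} / g_{i \to j}$ summed over k give exactly $d_{i,j} - 1$.

```latex
\sum_{k} b(k)
  = \sum_{i \neq j} \sum_{k \neq i,j} \frac{g_{i \to k \to j}}{g_{i \to j}}
  = \sum_{i \neq j} \left( d_{i,j} - 1 \right)
  = D - N(N-1),
\qquad D \equiv \sum_{i \neq j} d_{i,j}
```

Under these conventions a change in the total BC after a node removal is a change in the total path length, which is why BC changes and diameter changes are so closely tied.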

[Figure: example network with nodes A to G and a marked start node; each node is labelled with a number.]

Δbi : summation of BC changes when the i-th node is removed.

① Network deformation = select alternative shortest path + detour ( Contribution to Δbi > 0 )

② Lost a source of BC = $2 b_\kappa - 2(N-1)$ ( Contribution to Δbi < 0 )

$b(i) = \sum_{k,l} b_{k,l}(i) = \sum_{k,l} \frac{g_{k,l}(i)}{g_{k,l}}$

– $g_{k,l} = g_{k,l}(i)$ ( every geodesic between k and l passes through i ) : detour

– $g_{k,l} > g_{k,l}(i)$ ( only some geodesics pass through i ) : select alternative shortest path

Nonlinear!

Contribution to Δbi = portion of b(i) : ① 77.4% , ② 22.6%

A : Small closeness centrality of A → large sum of distances from A → large ② contribution and small network deformation → $\Delta b_A < 0$

B : Large closeness centrality of B → small sum of distances from B → small ② contribution and large network deformation → $\Delta b_B > 0$ ( λ : detour length )

[Figure: number of nodes ( N_d ) versus degree (k), log-log axes; $N_d(k) \sim k^{-2.9}$.]

Introduction : Scale-free network

• What is the Scale Free Network?

– An SF network is a network with a power-law degree distribution, $D(k) \sim k^{-\gamma}$.

Ex) BA model : growth and preferential attachment

A.-L. Barabási and R. Albert, Emergence of scaling in random networks, Science, 286, 509 (1999)

[Figure: BA growth example with nodes 0 to 6; “Next One?” marks the choice of the next attachment.]

Ex) Empirical results of real networks : World-Wide Web, Internet, movie actor collaboration network, science collaboration graph, cellular network, etc.

R. Albert, H. Jeong, and A.-L. Barabási, Nature (London), 406, 378 (2000)

– An SF network shows error and attack tolerance.
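Growth and preferential attachment, the two ingredients of the BA model, fit in a few lines. The sketch below is one common way to write it, not necessarily the generator used for the results in this talk: the seed graph (a complete graph on m + 1 nodes), the function name `ba_network` and the `seed` parameter are assumptions introduced here.

```python
import random

def ba_network(n, m, seed=0):
    """Grow an n-node network; each new node attaches to m existing nodes
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Assumed seed graph: complete graph on the first m + 1 nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Every node appears in `stubs` once per link end, so a uniform draw
    # from this list is a degree-proportional (preferential) choice.
    stubs = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:          # m distinct targets, no multi-links
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

edges = ba_network(1000, 2)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print(len(edges), max(degree.values()))
```

With this seed graph the link count is m(m + 1)/2 + m(n − m − 1), so the total differs slightly from one initial condition to another.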

[Figure: load distribution D( ℓ ) versus load ℓ ( load by SHT ), log-log axes.]

Introduction : Load & Classification of networks

• What is the “Load” ?

– When every pair of nodes in a network exchanges data packets along the shortest path, the load, or “betweenness centrality (BC),” of a node is the total number of data packets passing through that node.

Ex) Internet traffic jam, influential people in a social network, etc.

[Figure: example network with nodes A to G and a marked start node; each node is labelled with a number.]

– “It is found that the load distribution follows a power-law with the exponent δ~2.2(1)”

K.-I. Goh, B. Kahng, and D. Kim, Universal Behavior of Load Distribution in Scale-Free Networks, PRL, 87, 27 (2001)

$D(\ell) \sim \ell^{-2.2}$

– The load exponent is robust and does not depend on the network model, so it can be used to classify networks.

K.-I. Goh, et al., Classification of scale-free networks, PNAS, 99, 20 (2002)

δ is a universal value !
