[ieee 2013 ieee symposium on computational intelligence for communication systems and networks...

Efficient Scheme for Load Balancing on Heterogeneous Biswapped Networks

Liting Sun

School of Automation Engineering

University of Science and Technology Beijing

Beijing, China 100083

Email: [email protected]

Chaonan Tong

School of Automation Engineering

University of Science and Technology Beijing

Beijing, China 100083

Email: [email protected]

Abstract—Existing local iterative algorithms for load balancing are poorly suited to many large-scale interconnection networks, mainly because they require complicated Laplace spectrum computations. For the large-scale biswapped network (BSN), this paper proposes a simpler and more effective scheme, named HCDE-X, for load balancing on heterogeneous BSNs. The scheme avoids computing the Laplace matrix of the whole network and needs only the spectral information of the much smaller basis or factor graph. Illustrative examples and algorithm analysis show that the suggested scheme lowers the algorithmic and computational complexity, reduces unnecessary communication cost, and is simpler and more effective than the traditional schemes.

I. INTRODUCTION

Load balancing is an important problem in parallel and distributed processing, and it strongly affects system performance in terms of processing time. The goal of the load balancing process is to migrate loads across edges so that all nodes reach a balanced state after migration. Many diffusion algorithms have been proposed for load balancing on general networks. For homogeneous networks, Cybenko [1] presented the first order scheme (FOS). Muthukrishnan et al. [2] improved FOS and presented the second order scheme (SOS), which speeds up the iteration process. Diekmann et al. [3] developed a new iterative algorithm, called the optimal polynomial scheme (OPS), which improves the convergence speed and balances the loads among nodes within a finite number of iterations. Seeking a simpler and more efficient algorithm than OPS, Elsasser et al. [4] presented the optimal diffusion scheme (OPT). In homogeneous networks, the balanced state is achieved when all nodes have equal loads. In heterogeneous networks, however, each node has a different capability, so the balanced state means that each node takes a load proportional to its capability. Load balancing in heterogeneous networks is therefore somewhat more complicated than in homogeneous networks. For heterogeneous networks, Elsasser et al. [5] presented diffusive load balancing schemes, and Rotaru et al. [6] designed a transition algorithm from homogeneous systems to heterogeneous systems. Both works proved the effectiveness of their algorithms in theory.

The algorithms above are the classical load balancing algorithms. In recent years, load balancing has remained an active research topic and has been studied in a wider scope. In [7], Alsarhan et al. suggested load-balanced clustering for cognitive radio technology. Warabino et al. [8] presented advanced load balancing for cellular networks. Considering the future trend of the load balancing problem, Keyvanpour et al. [9] classified load balancing algorithms in distributed systems into two main categories: topology dependent and topology independent.

However, traditional diffusion algorithms are not well suited to large-scale interconnection networks (such as biswapped networks) because of the complicated spectrum computation they require. The biswapped network (BSN) is a new type of interconnection network that is closely related to OTIS [10, 11]. BSN is more regular than OTIS, and is built of 2n copies of an n-node basis network (called the cluster), all isomorphic to one another, connected by a simple rule. Recent studies show that BSN has good properties, such as low expansion cost, and that, like OTIS, it inherits good characteristics of its factor network [12, 13].

In this paper, we introduce a new algorithm, HCDE-X, for load balancing on heterogeneous BSNs. The paper is organized as follows. In Section II, we give basic definitions pertaining to load balancing, diffusion algorithms for heterogeneous networks, and the structure of BSN. In Section III, we propose the algorithm for load balancing on heterogeneous BSNs. In Section IV, we analyze the performance of the traditional algorithms and the new one; the results show the simplicity and efficiency of our approach.

978-1-4673-5903-0/13/$31.00 ©2013 IEEE

II. DEFINITIONS

Definition 1 (BSN Graph). Let $\Omega = (V_\Omega, E_\Omega)$ be an undirected graph. The BSN graph associated with $\Omega$, $BSN(\Omega) = (V, E)$, is an undirected graph with the vertex set

$$V(BSN(\Omega)) = \{\langle i, g, p\rangle \mid g, p \in V_\Omega,\ i = 0, 1\}$$

and the edge set

$$E(BSN(\Omega)) = \{(\langle i, g, p_1\rangle, \langle i, g, p_2\rangle) \mid (p_1, p_2) \in E_\Omega,\ i = 0, 1\} \cup \{(\langle i, g, p\rangle, \langle 1-i, p, g\rangle) \mid g, p \in V_\Omega,\ i = 0, 1\}.$$
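The construction in Definition 1 can be sketched in a few lines of Python. This is only an illustration: the function name `build_bsn`, the tuple encoding `(i, g, p)` of a vertex and the 4-node ring used as basis graph are choices made for this sketch, not notation from the paper.

```python
from itertools import product


def build_bsn(basis_nodes, basis_edges):
    """Return (vertices, edges) of BSN(Omega) following Definition 1.

    A vertex is a tuple (i, g, p): part i, group g, node p inside the group.
    Edges are stored as frozensets so each undirected edge appears once.
    """
    vertices = [(i, g, p) for i in (0, 1) for g in basis_nodes for p in basis_nodes]
    edges = set()
    # Intra-group (basis) edges: every group is a copy of the basis graph.
    for i, g in product((0, 1), basis_nodes):
        for p1, p2 in basis_edges:
            edges.add(frozenset({(i, g, p1), (i, g, p2)}))
    # Inter-group (swap) edges: <i, g, p> -- <1 - i, p, g>.
    for i, g, p in vertices:
        edges.add(frozenset({(i, g, p), (1 - i, p, g)}))
    return vertices, edges


# Hypothetical example: a 4-node ring as the basis graph.
ring_nodes = [0, 1, 2, 3]
ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
bsn_vertices, bsn_edges = build_bsn(ring_nodes, ring_edges)
print(len(bsn_vertices), len(bsn_edges))
```

For an n-node basis graph the sketch yields 2n² vertices, and every vertex has exactly one swap edge in addition to its basis edges.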

From the above, if we regard each copy of the basis network as a group, the definition postulates 2n groups, each of which is a copy of $\Omega$. The name "biswapped network" (BSN) arises from two defining properties of the network: first, when groups are viewed as super-nodes, the resulting graph of super-nodes is a complete 2n-node bipartite graph; second, the inter-group links connect processor g in cluster p of part 0 with processor p in cluster g of part 1.

If $\Omega$ has $n$ nodes, then $BSN(\Omega)$ is composed of $2n$ node-disjoint subnetworks $\Omega_{ij}$ ($i = 0, 1$; $0 \le j \le n-1$), which constitute the groups or clusters. Each group is isomorphic to $\Omega$. Denote the vertex set of $\Omega_{ij}$ as

$$V_{ij} = \{v_{ijk} \mid 0 \le k \le n-1\},$$

and define its edge set as

$$E_{ij} = \{(v_{ijm}, v_{ijl}) \mid (v_m, v_l) \in E_\Omega\}.$$

The vertex set $V$ of $BSN(\Omega)$ is

$$V = \bigcup_{i \in \{0,1\};\ 0 \le j \le n-1} V_{ij}.$$

The edge set $E$ of $BSN(\Omega)$ can be partitioned into two subsets: the intragroup or basis edge set $E_1$, and the intergroup or swap edge set $E_2$. Clearly,

$$E_1 = \bigcup_{i \in \{0,1\};\ 0 \le j \le n-1} E_{ij};$$

$$E_2 = \{\langle v_{ijk}, v_{(1-i)kj}\rangle \mid k < j,\ i = 0, 1\}.$$

Fig. 1 shows an example of a BSN formed with a 4-node ring as its basis or factor network.

Fig. 1 BSN-R4

Let $W_{ij} = (w_{ij1}, w_{ij2}, \ldots, w_{ijn})^T$ and $C_{ij} = (c_{ij1}, c_{ij2}, \ldots, c_{ijn})^T$ ($i = 0, 1$) represent the loads and weights on the $j$th factor network of part $i$, respectively. Similarly, let $W_i = (w_{i1}, w_{i2}, \ldots, w_{in})^T$ and $C_i = (c_{i1}, c_{i2}, \ldots, c_{in})^T$ ($i = 0, 1$) denote the loads and weights on $BSN(\Omega)$. Additionally, $C'$ and $C'_{ij}$ are taken to be diagonal matrices, with the elements of the vectors $C_i$ and $C_{ij}$ as their diagonal entries, respectively. That is:

$$C' = \mathrm{diag}(c_{011}, c_{012}, \ldots, c_{0nn}, c_{111}, \ldots, c_{1nn}),$$

$$C'_{ij} = \mathrm{diag}(c_{ij1}, c_{ij2}, \ldots, c_{ijn}) \quad (i = 0, 1).$$

Let $A^f$ and $A$ be the node-edge incidence matrices of the basis graph $\Omega$ and of $BSN(\Omega)$, respectively. Take $A_2$ to be the matrix specifying the incidence of the intergroup edges in $E_2$ to the nodes of $BSN(\Omega)$. Each column of $A$, $A^f$ and $A_2$ has exactly two nonzero entries, 1 and -1, which mark the nodes incident to the corresponding edge. The signs of these entries give the direction of the flow produced on that edge during load balancing. The Laplacian of a graph is $L = AA^T$. Let $L$ and $L^f$ be the Laplace matrices of $BSN(\Omega)$ and its basis network $\Omega$, respectively.
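As a small illustration of the relation $L = AA^T$, the following sketch builds the incidence matrix and Laplacian of a 4-node ring. The helper names and the edge orientation (+1 at the first endpoint, -1 at the second) are assumptions of this sketch; any orientation gives the same Laplacian.

```python
def incidence_matrix(num_nodes, edges):
    """Node-edge incidence matrix: column e has +1 at edges[e][0] and -1 at edges[e][1]."""
    A = [[0] * len(edges) for _ in range(num_nodes)]
    for e, (u, v) in enumerate(edges):
        A[u][e] = 1
        A[v][e] = -1
    return A


def laplacian(A):
    """Compute L = A A^T with plain nested loops."""
    n = len(A)
    return [[sum(a * b for a, b in zip(A[r], A[s])) for s in range(n)] for r in range(n)]


ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
A_f = incidence_matrix(4, ring_edges)
L_f = laplacian(A_f)
for row in L_f:
    print(row)
```

The diagonal of the result holds the node degrees, and each off-diagonal entry is -1 exactly where two nodes are adjacent.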

For heterogeneous networks, we have to use the generalized Laplacian representation $LC^{-1}$. In this case, the Laplacians of $BSN(\Omega)$ and $\Omega$ are defined respectively as $L = \bar{A}\bar{A}^T$ and $L^f = \bar{A}^f (\bar{A}^f)^T$, where $\bar{A} = AF^{-1}$ and $\bar{A}^f = A^f (F^f)^{-1}$. Here $F$ denotes the diagonal matrix with $F_{ii} = \sqrt{f_{ii}}$, where $f_{ii}$ is the edge weight of edge $i$ in $BSN(\Omega)$. Likewise, $F^f$ denotes the diagonal matrix with $F^f_{ii} = \sqrt{f^f_{ii}}$, where $f^f_{ii}$ is the edge weight of edge $i$ in $\Omega$. The Laplacian matrix of $BSN(\Omega)$ is then $L(C')^{-1}$, and the Laplacian matrix of $\Omega$ is $L^f (C'_{ij})^{-1}$.

The matrix $M^f$ defined by $M^f = I - \alpha^f L^f (C'_{ij})^{-1}$ is called the diffusion matrix of the polynomial-based scheme, where $\alpha \in (0, 1)$ is a constant. Similarly, the matrix $M$ is obtained as $M = I - \alpha L (C')^{-1}$. Let $\lambda^f_1 < \lambda^f_2 < \ldots < \lambda^f_{m'}$ be the $m'$ distinct eigenvalues of $L^f (C'_{ij})^{-1}$ in increasing order, and let $\lambda_1 < \lambda_2 < \ldots < \lambda_m$ be the $m$ distinct eigenvalues of $L(C')^{-1}$. Then $M$ and $M^f$ have the eigenvalues $\mu_i = 1 - \alpha\lambda_i$ and $\mu^f_i = 1 - \alpha^f\lambda^f_i$, respectively. We define the diffusion norms $\gamma$ and $\gamma^f$ as

$$\gamma = \max(|\mu_2|, |\mu_m|) < 1, \qquad \gamma^f = \max(|\mu^f_2|, |\mu^f_{m'}|) < 1.$$

For any polynomial-based diffusion scheme, a small diffusion norm leads to a fast rate of convergence. The workload $w^k$ in step $k$ can be expressed in the form $w^k = p_k(M) w^0$ for any polynomial-based load balancing scheme. The convergence of a polynomial-based load balancing scheme depends on whether the error $e^k = w^k - \bar{w}$ between the iterate $w^k$ and the balanced load $\bar{w}$ tends to zero [3].
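To make the heterogeneous diffusion step concrete, here is a minimal sketch of the simplest polynomial-based scheme, the first order iteration $w^{k+1} = (I - \alpha L C^{-1}) w^k$, on a 4-node ring with unit edge weights. The node weights, the initial loads, the step size `alpha = 0.25` and the iteration count are hypothetical values chosen for this illustration.

```python
def fos_step(loads, weights, edges, alpha):
    """One step w <- w - alpha * L * C^{-1} * w, written edge by edge."""
    new_loads = list(loads)
    for u, v in edges:
        # The flow from u to v is proportional to the difference of the
        # weight-normalized loads on the two endpoints.
        flow = alpha * (loads[u] / weights[u] - loads[v] / weights[v])
        new_loads[u] -= flow
        new_loads[v] += flow
    return new_loads


ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
weights = [1.0, 2.0, 1.0, 2.0]   # hypothetical node capabilities
loads = [12.0, 0.0, 0.0, 0.0]    # all load starts on node 0
for _ in range(200):
    loads = fos_step(loads, weights, ring_edges, alpha=0.25)
print([round(w, 6) for w in loads])
```

Each step conserves the total load, and the iterates should approach loads proportional to the weights, which is the balanced state for a heterogeneous network.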

III. HCDE-X FOR HETEROGENEOUS BSN

For a large-scale BSN, it is difficult to compute all eigenvalues of a $2n^2 \times 2n^2$ matrix before the load balancing process starts, and the calculation is very time-consuming. Therefore we introduce a hybrid scheme of diffusion and dimension exchange, called the HCDE-X scheme, where X represents any general load balancing scheme. In the suggested scheme, the balancing process is divided into four stages.

Because inter-group links have much larger bandwidth than intra-group links, we assign edge weights to the inter-group links and let the weights on the intra-group links be infinitely close to zero. Let $l_{gp}$ be the edge weight on the edge linking group $g$ in part 0 and group $p$ in part 1. Then $l_{gp}$ is expressed as follows:

$$l_{gp} = \frac{\left(\sum_{i=1}^{n} c_{gi}\right)_{0} \left(\sum_{i=1}^{n} c_{pi}\right)_{1}}{\left(\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}\right)_{0} + \left(\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}\right)_{1}} \tag{1}$$

In the formula above, $c_{gp}$ denotes the weight of node $\langle g, p\rangle$ of one part, and the subscripts 0 and 1 indicate the part over which each sum is taken.
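A short sketch of how formula (1) could be evaluated is given below. It assumes that the formula is a fraction whose numerator is the product of the two group weight sums and whose denominator is the total weight of both parts. The function name, the nested-list layout of the weights and the numbers are hypothetical.

```python
def intergroup_edge_weight(c_part0, c_part1, g, p):
    """Edge weight l_gp, reading formula (1) as a fraction.

    c_part0[g][i] is the weight of node i in group g of part 0;
    c_part1[p][i] is the weight of node i in group p of part 1.
    """
    group_g = sum(c_part0[g])   # total weight of group g in part 0
    group_p = sum(c_part1[p])   # total weight of group p in part 1
    total = sum(map(sum, c_part0)) + sum(map(sum, c_part1))
    return group_g * group_p / total


# Hypothetical weights for n = 2 (two groups of two nodes in each part).
c0 = [[1.0, 1.0], [2.0, 2.0]]
c1 = [[1.0, 3.0], [1.0, 1.0]]
l_01 = intergroup_edge_weight(c0, c1, 0, 1)
l_10 = intergroup_edge_weight(c0, c1, 1, 0)
print(l_01, l_10)
```

Under this reading, links between heavier groups receive larger weights, so they carry proportionally more load in the exchange stages.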

For the following discussion, we take $z_i$ and $z^f_i$ to be the corresponding eigenvectors of $L(C')^{-1}$ and $L^f (C'_{ij})^{-1}$, and define $x_i$ as the flows on the edges generated by the load balancing process in stage $i$.

The HCDE-X algorithm is described as follows.

In the first stage, the suggested scheme diffuses node loads iteratively until all factor networks reach a balanced state. In other words, the initial loads $W^{0}_{ij}$ of the $j$th factor network in part $i$ reach a locally balanced state $\overline{W}^{0}_{ij}$. During this stage, the workload $W^{k}_{ij}$ in step $k$ can be expressed as

$$W^{k}_{ij} = \left(I_n \otimes p_k(M^f)\right) W_{ij}. \tag{2}$$

After this stage, the flows $x_1$ produced in the factor network can be obtained, following [14], as

$$x_1 = \left(A^f (F^f)^{-1}\right)^T \sum_{i=2}^{m'} \frac{1 - (\mu^f_i)^{k}}{\lambda^f_i} z^f_i. \tag{3}$$

In formula (3), $k$ is the number of iterations in the diffusion stage.

In the second stage, we perform a dimension exchange and simple diffusion strategy over all intergroup links. After this process, the migrated load $w_\Delta$ on every $l_{gp}$ can be expressed as

$$w_\Delta = \left| \frac{l_{gp}}{c_{gp}} \times w_{gp} - \frac{l_{gp}}{c_{pg}} \times w_{pg} \right|. \tag{4}$$

In this formula, $c_{gp}$ and $w_{gp}$ denote the weight and load on node $\langle g, p\rangle$ of part 0, whereas $c_{pg}$ and $w_{pg}$ denote the same quantities on node $\langle p, g\rangle$ of part 1. Similar to formula (3), the flows produced in stage 2 are

$$x_2 = \left(I \otimes \left(A_2 (F')^{-1}\right)^T\right) \sum_{i=2}^{m_2} \frac{1 - (\mu'_i)^{k^{*}}}{\lambda'_i} z'_i. \tag{5}$$

The matrix $A_2$ has been specified above; we now add some explanation of formula (5). We denote by $\lambda'_i$ ($0 \le i \le m_2$) the Laplacian eigenvalues of the graph whose vertex set is $V$ and whose edge set is $E_2$. Let $z'_i$ be the orthogonal eigenvectors corresponding to $\lambda'_i$, and let $\mu'_i$ be the corresponding eigenvalues of the diffusion matrix.
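The exchange over a single inter-group link in formula (4) can be sketched as follows. Formula (4) gives only the magnitude of the migrated load; the direction used here (load leaves the endpoint with the larger load-to-weight ratio) is an assumption of this sketch, and the function name and numbers are hypothetical. The edge weight in the example is an illustrative two-node value, not one computed from formula (1).

```python
def exchange_over_link(w_gp, c_gp, w_pg, c_pg, l_gp):
    """Move load across one intergroup link.

    Returns (new_w_gp, new_w_pg, moved), where moved is w_Delta of formula (4).
    """
    signed = l_gp * (w_gp / c_gp - w_pg / c_pg)
    moved = abs(signed)
    # Assumed direction: load flows away from the node with the larger w / c.
    return w_gp - signed, w_pg + signed, moved


# Hypothetical link: weights 1 and 2, loads 9 and 3, edge weight 2/3.
new_gp, new_pg, moved = exchange_over_link(9.0, 1.0, 3.0, 2.0, 2.0 / 3.0)
print(new_gp, new_pg, moved)
```

In this example the two endpoints end up with equal load-to-weight ratios, and the sum of the two loads is unchanged by the exchange.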

In the third and fourth stages, we proceed with diffusion and dimension exchange using the same iterative polynomial-based load balancing as in the first and second stages. The flows $x_3$ generated in each factor network are

$$x_3 = \left(A^f (F^f)^{-1}\right)^T \sum_{i=2}^{m'} \frac{1 - (\mu^f_i)^{k^{**}}}{\lambda^f_i} z^f_i, \tag{6}$$

and the flows $x_4$ are

$$x_4 = \left(I \otimes \left(A_2 (F')^{-1}\right)^T\right) \sum_{i=2}^{m_2} \frac{1 - (\mu'_i)^{k^{***}}}{\lambda'_i} z'_i. \tag{7}$$

After the load balancing process has finished, the whole network reaches a balanced state. The flows $x$ produced by the HCDE-X scheme can be calculated as

$$\begin{aligned}
x &= \sum_{j=1}^{2n} x_1 + x_2 + \sum_{j=1}^{2n} x_3 + x_4 \\
&= x_2 + x_4 + \sum_{j=1}^{2n} (x_1 + x_3) \\
&= \left(A^f (F^f)^{-1}\right)^T \sum_{j=1}^{2n} \sum_{i=2}^{m'} \frac{2 - (\mu^f_i)^{k} - (\mu^f_i)^{k^{**}}}{\lambda^f_i} z^f_i + \left(I \otimes \left(A_2 (F')^{-1}\right)^T\right) \sum_{i=2}^{m_2} \frac{2 - (\mu'_i)^{k^{*}} - (\mu'_i)^{k^{***}}}{\lambda'_i} z'_i.
\end{aligned} \tag{8}$$

If we use the general diffusion scheme X, the generated flows $x'$ on BSN will be

$$x' = \left(I \otimes \left(AF^{-1}\right)^T\right) \sum_{i=2}^{m} \frac{1 - (\mu_i)^{k'}}{\lambda_i} z_i. \tag{9}$$

From the suggested algorithm, we can see that most of the migrations occur within factor networks. Comparing formulas (8) and (9), the calculation of the flows $x$ resulting from HCDE-X avoids the large-scale, complicated Laplace spectrum computation for the whole network and requires only the Laplace eigenvalues of the factor networks. Therefore, our new scheme lowers the algorithmic and computational complexity.
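As a rough worked illustration of the size difference for a BSN built on a 4-node ring ($n = 4$): the whole-network matrix has order $2n^2$, while a factor-network matrix has order $n$. The cubic cost model below is an assumption about a generic dense eigenvalue solver, not a figure from the paper.

```latex
\[
2n^2 = 2 \cdot 4^2 = 32, \qquad
\frac{(2n^2)^3}{n^3} = 8n^3 = 8 \cdot 4^3 = 512 .
\]
```

Under that assumption, one eigenvalue computation on the whole network costs on the order of 512 times as much as one on the factor network, and the gap grows with $n^3$.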

Fig. 2 illustrates the preceding four-stage load balancing process. In subfigure (a), the larger integers represent the loads of the nodes, while the decimal above each node denotes its weight. The decimals on the intergroup links are the edge weights. In subfigure (b), each factor network has reached a balanced state after the first diffusion stage. In the second stage, loads are diffused and exchanged over the intergroup links, as shown in subfigure (c). The load migrated via each inter-group link can be computed using formula (4). Similar to stages 1 and 2, stages 3 and 4 are illustrated in subfigures (d) and (e). From subfigure (e), we can see that nodes with the same weight have nearly the same workload, and the whole network is in a balanced state.

The HCDE-X algorithm is outlined in Algorithm 1.

IV. ALGORITHM ANALYSIS

The convergence speed of HCDE-X is determined by the diffusion norm of the corresponding factor network and does not depend on the distribution of node weights, regardless of whether X is FOS, SOS, OPS or OPT.

Theorem 1. Let $\Omega$ be the factor network of $BSN(\Omega)$; then $\mu^f(\Omega) \prec \mu(BSN(\Omega))$.

Proof: Let $\mu$ and $\mu^f$ be the Laplace eigenvalues of $BSN(\Omega)$ and $\Omega$, and let $\lambda$ and $\lambda^f$ be the eigenvalues of $BSN(\Omega)$ and the basis network $\Omega$. Let $D$ denote the diagonal degree matrix and $A$ the adjacency matrix, so that

$$\mu^f(\Omega) = \lambda^f_1(D(\Omega) + A(\Omega)). \tag{10}$$

$BSN(\Omega)$ can be described as a complete bipartite graph, and $\Omega$ is a connected subgraph of $BSN(\Omega)$; then

$$\begin{aligned}
\mu^f(\Omega) = \mu^f_1(\Omega) &= \lambda^f_1(D(\Omega) + A(\Omega)) \\
&\prec \lambda_1(D(BSN(\Omega)) + A(BSN(\Omega))) \\
&= \mu(BSN(\Omega)) = \mu_1(BSN(\Omega)).
\end{aligned} \tag{11}$$

$\mu^f(\Omega) \prec \mu(BSN(\Omega))$ implies that, when the HCDE-FOS scheme and the FOS scheme are applied to $BSN(\Omega)$, HCDE-FOS has a smaller upper bound on the error than FOS in the $k$th iteration, according to $\|e^k\|_2 \le r^k$.
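The effect of a smaller bound can be illustrated numerically. The sketch below assumes a geometric error bound of the form $\|e^k\| \le \gamma^k \|e^0\|$, the usual form for first order schemes, and computes the smallest $k$ for which $\gamma^k$ drops below a tolerance. The two norm values and the tolerance are hypothetical and are not taken from the paper.

```python
import math


def iterations_needed(gamma, eps):
    """Smallest k with gamma**k <= eps, for 0 < gamma < 1 and 0 < eps < 1."""
    k = math.ceil(math.log(eps) / math.log(gamma))
    # Guard against floating-point rounding at the boundary.
    while gamma ** k > eps:
        k += 1
    while k > 0 and gamma ** (k - 1) <= eps:
        k -= 1
    return k


k_small_norm = iterations_needed(0.75, 1e-6)   # hypothetical factor-network norm
k_large_norm = iterations_needed(0.90, 1e-6)   # hypothetical whole-network norm
print(k_small_norm, k_large_norm)
```

A smaller norm gives a smaller iteration count for the same tolerance, which is the sense in which a tighter bound on the factor network helps.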

Theorem 2. Let $\psi''_t$ and $\psi'_t$ be the potentials of $BSN(\Omega)$ and $\Omega$ after $t$ iterations; then $\psi''_t \succ \psi'_t$.

Proof: We know that $\psi_t = \mu^{2t}\psi_0$, where $\psi_0$ is the initial potential. By Theorem 1, we find that $\psi''_t \succ \psi'_t$.

Theorem 3. To achieve a balanced state on BSN, HCDE-OPS has a smaller upper bound on the number of iterations required.

Proof: We know that, for the OPS scheme, the number of iterations does not depend on the diameter of the graph. Owing to the symmetry of BSN, the upper bound $k$ on the number of iterations of the balancing flows satisfies $k \prec D(BSN(\Omega))$. With HCDE-X, $d(\Omega) \prec D(BSN(\Omega))$, which implies that, when the HCDE-OPS scheme and the OPS scheme are applied to $BSN(\Omega)$, HCDE-OPS has a smaller upper bound on the number of iterations of the balancing flows.


(a) Unbalanced initial node distribution on a heterogeneous BSN-R4

(b) Node loads after the first step on a heterogeneous BSN-R4

(c) Node loads after the second step on a heterogeneous BSN-R4

(d) Node loads after the third step on a heterogeneous BSN-R4

(e) Node loads after the fourth step on a heterogeneous BSN-R4

Fig. 2 An example of the HCDE-X scheme applied to a heterogeneous BSN

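Algorithm 1 HCDE-X

Require: $BSN(\Omega)$ network consisting of $\Omega_{ij}$, $W_{ij}$, $C_{ij}$;

Ensure: balanced load vector $w^{l}_{ij}$;

for all groups $\Omega_{ij}$ of $BSN(\Omega)$ do
    run the diffusion procedure X on $\Omega_{ij}$;
end for
for all intergroup edges $e: (\langle i, g, p\rangle, \langle 1-i, p, g\rangle)$ do
    $l_{gp} = \frac{\left(\sum_{i=1}^{n} c_{gi}\right)\left(\sum_{i=1}^{n} c_{pi}\right)}{\left(\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}\right)_{0} + \left(\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}\right)_{1}}$;
    $w_{gp} = w_{gp} + \left(\frac{l_{gp}}{c_{gp}} \times w_{gp} - \frac{l_{gp}}{c_{pg}} \times w_{pg}\right)$;
end for
for all groups $\Omega_{ij}$ do
    run the diffusion procedure X on $\Omega_{ij}$;
end for
for all intergroup edges $e: (\langle i, g, p\rangle, \langle 1-i, p, g\rangle)$ do
    $l_{gp} = \frac{\left(\sum_{i=1}^{n} c_{gi}\right)\left(\sum_{i=1}^{n} c_{pi}\right)}{\left(\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}\right)_{0} + \left(\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}\right)_{1}}$;
    $w_{gp} = w_{gp} + \left(\frac{l_{gp}}{c_{gp}} \times w_{gp} - \frac{l_{gp}}{c_{pg}} \times w_{pg}\right)$;
end for
for all groups $\Omega_{ij}$ do
    return the balanced load vector as $w^{l}_{ij}$;
end for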
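The four stages of Algorithm 1 can be sketched in runnable form as below. This is an illustration under several assumptions, not the authors' implementation: X is taken to be a first order diffusion with unit intra-group edge weights, formula (1) is read as a fraction, and the exchange moves load away from the endpoint with the larger load-to-weight ratio so that the total load is conserved. All function names, the `rounds` parameter and the test instance are hypothetical.

```python
def diffuse_group(w, c, edges, alpha, steps):
    """Scheme X = FOS inside one group: repeat w <- w - alpha * L * C^{-1} * w."""
    w = list(w)
    for _ in range(steps):
        flows = [alpha * (w[u] / c[u] - w[v] / c[v]) for u, v in edges]
        for (u, v), f in zip(edges, flows):
            w[u] -= f
            w[v] += f
    return w


def exchange_stage(W, C, n):
    """Exchange load over every swap link <0, g, p> -- <1, p, g>."""
    total = sum(sum(group) for part in C for group in part)
    for g in range(n):
        for p in range(n):
            # Edge weight under the fraction reading of formula (1).
            l_gp = sum(C[0][g]) * sum(C[1][p]) / total
            # Assumed sign: load leaves the node with the larger load/weight ratio.
            f = l_gp * (W[0][g][p] / C[0][g][p] - W[1][p][g] / C[1][p][g])
            W[0][g][p] -= f
            W[1][p][g] += f


def hcde_fos(W, C, edges, n, alpha=0.25, steps=200, rounds=2):
    """Alternate intra-group diffusion and inter-group exchange `rounds` times."""
    W = [[list(group) for group in part] for part in W]   # work on a copy
    for _ in range(rounds):
        for i in (0, 1):
            for g in range(n):
                W[i][g] = diffuse_group(W[i][g], C[i][g], edges, alpha, steps)
        exchange_stage(W, C, n)
    return W


def max_imbalance(W, C):
    """Largest deviation of a node's load/weight ratio from the global ratio."""
    total_w = sum(sum(group) for part in W for group in part)
    total_c = sum(sum(group) for part in C for group in part)
    rho = total_w / total_c
    return max(abs(w / c - rho)
               for Wp, Cp in zip(W, C) for Wg, Cg in zip(Wp, Cp)
               for w, c in zip(Wg, Cg))


# Hypothetical instance: BSN over a 4-node ring, all load on node <0, 0, 0>.
n = 4
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
C = [[[1.0, 2.0, 1.0, 2.0] for _ in range(n)] for _ in range(2)]
W = [[[0.0] * n for _ in range(n)] for _ in range(2)]
W[0][0][0] = 480.0

four_stage = hcde_fos(W, C, ring, n)             # the four stages of Algorithm 1
repeated = hcde_fos(W, C, ring, n, rounds=40)    # the same stages repeated
print(max_imbalance(W, C), max_imbalance(four_stage, C), max_imbalance(repeated, C))
```

With `rounds=2` the sketch performs exactly the diffusion, exchange, diffusion, exchange sequence. How close the result is to balance after these four stages depends on the initial distribution; the `rounds` parameter is an addition of this sketch so that the stages can be repeated. Only the small per-group diffusion is iterated, so no quantity defined on the whole $2n^2$-node network is needed.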

V. CONCLUSION

The scheme described in this paper is based on the common idea of a hybrid diffusion-exchange-diffusion-exchange strategy, but it takes advantage of the special structure of BSN to reduce the number of iterations and the required communication cost. A main focus of our ongoing work is whether intensive tasks can be partitioned into the same groups by using the special structure of BSN, in order to further reduce the communication cost.

REFERENCES

[1] G. Cybenko, Dynamic load balancing for distributed memory multiprocessors, Journal of Parallel and Distributed Computing, 1989.7(2):279-301.

[2] S. Muthukrishnan, B. Ghosh and M. H. Schultz, First- and second-order diffusive methods for rapid, coarse, distributed load balancing, Theory of Computing Systems, 1998.31(4):331-354.

[3] R. Diekmann, A. Frommer and B. Monien, Efficient schemes for nearest neighbor load balancing, Parallel Computing, 1999.25(7):789-812.

[4] R. Elsasser, B. Monien, R. Preis, A. Frommer, Optimal and alternating-direction load balancing schemes, in Proceedings of Euro-Par'99, 31 Aug - 3 Sept, 1999, Berlin, Germany: Springer-Verlag.

[5] R. Elsasser, B. Monien and R. Preis, Diffusive load balancing schemes on heterogeneous networks, Theory of Computing Systems, 2001.35(3):305-320.


[6] T. Rotaru, H. H. Nageli, Dynamic load balancing by diffusion in heterogeneous systems, Journal of Parallel and Distributed Computing, 2004.64(4):481-97.

[7] A. Alsarhan and A. Agarwal, Load balancing for spectrum management in a cluster-based cognitive network, in 2011 Canadian Conference on Electrical and Computer Engineering, CCECE 2011, 8 May - 11 May, 2011, Niagara Falls, Canada: Institute of Electrical and Electronics Engineers Inc.

[8] T. Warabino, S. Kaneko, S. Nanba, Advanced load balancing in LTE/LTE-A cellular network, in 2012 IEEE 23rd International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC 2012, 9 September - 12 September, 2012, Sydney, NSW, Australia: Institute of Electrical and Electronics Engineers Inc.

[9] M. R. Keyvanpour, H. Mansourifar, B. Bagherzade, A novel classification of load balancing algorithms in distributed systems, 2011 SSITE International Conference on Computers and Advanced Technology in Education, ICCATE 2011, 3 November - 4 November, 2011, 2012, Beijing, China: Springer Verlag.

[10] B. Parhami, Swapped interconnection networks: Topological, performance, and robustness attributes, Journal of Parallel and Distributed Computing, 2005.65(11):1443-52.

[11] W. Chen, W. Xiao, B. Parhami, Swapped (OTIS) networks built of connected basis networks are maximally fault tolerant, IEEE Transactions on Parallel and Distributed Systems, 2009.20(3):361-366.

[12] W. Chen, W. Xiao, Topological Properties of Biswapped Networks (BSNs): Node Symmetry and Maximal Fault Tolerance, Chinese Journal of Computers, 2010.33(5):822-32.

[13] W. Xiao, Biswapped networks and their topological properties, in SNPD 2007: 8th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, 30 July - 1 August, 2007.

[14] H. Arndt, On finite dimension exchange algorithms, Linear Algebra and Its Applications, 2004. 380: p. 73-93.

