588 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 7, NO. 4, AUGUST 1999

The Deflection Self-Routing Banyan Network: A Large-Scale ATM Switch Using the Fully Adaptive Self-Routing and its Performance Analyses

Jae-Hyun Park, Member, IEEE, Hyunsoo Yoon, and Heung-Kyu Lee

Abstract—Because Internet traffic, which will be the major traffic of broadband integrated services digital networks, is bursty, there is a higher possibility that, when cells are being switched within a multistage switching network, multiple cells arriving simultaneously at a switching element through different incoming links must be forwarded along the same outgoing link. In this paper, we propose a high-performance large-scale ATM switch that deals with this link contention problem. It is a new unbuffered augmented Banyan network using fully adaptive self-routing control: the deflection self-routing Banyan network. To utilize all the links of the network as alternate paths, we employ a deflection-routing algorithm in each switching element, such that cells failing to get selected for the intended link are sent along different links, in the hope that they later return, or detour around the contended link, and continue their journey to the destination. Cells are never dropped within the switching network, even though the switch has no multiple-cell buffers. The proposed routing is as simple as that of the generic Banyan network, and all the switch elements (SE’s) have a uniform structure. To design the proposed network and its self-routing, we use the topological property that all the SE’s of the Banyan network are arranged in a regular pattern. We formulate and prove these properties through an algebraic formalism. We also carry out a performance analysis to provide a quantitative comparison against the Banyan network and the replicated Banyan networks. As a result, we show that the new network has far better performance and scalability than the other networks.

Index Terms—Algebraic formalism, ATM switch, deflection self-routing Banyan network, performance evaluation, topological properties, unbuffered Banyan network.

I. INTRODUCTION

MULTISTAGE switching networks have been widely used as efficient interconnection structures for parallel computer systems and as the switching nodes in high-speed communication networks. In the future, multistage switching networks are also expected to be used in broadband integrated services digital networks (B-ISDN’s) and transport systems based on the asynchronous transfer mode (ATM) [1]. The reasons lie in their suitability for VLSI implementation and their self-routing capability [2]. This class of networks is usually known as a Banyan network.

Manuscript received February 27, 1998; revised December 11, 1998; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor H. J. Chao.

J.-H. Park is with the System Architecture Laboratory, Samsung Electronics Company, Bundang-ku, Sungnam 463-050, Korea (e-mail: [email protected]).

H. Yoon and H.-K. Lee are with the Department of Computer Science, Korea Advanced Institute of Science and Technology, Yusung-Ku, Taejon 305-701, Korea (e-mail: [email protected]; [email protected]).

Publisher Item Identifier S 1063-6692(99)06954-X.

The multistage switching network is built up of smaller switch elements (SE’s), which are connected, through their links, to other SE’s or terminal devices. Because the Internet data traffic that will make up the majority of the traffic in a future B-ISDN is bursty, there is a higher possibility that multiple cells arriving simultaneously at an SE through different incoming links have to be forwarded along the same outgoing link. Furthermore, the larger the network is, and the more unbalanced its traffic patterns are, the higher the possibility of such output contention.

There are many ways to deal with output contention: adding a distribution network in front of a routing network, using several networks in parallel, recirculating cells, adding one or more extra stages to a network, deflecting cells, augmenting extra internal links and modifying the self-routing, increasing the bandwidth of internal links relative to the ports, and providing internal buffers [3]. However, most of them have shortcomings: they require much more hardware, a complicated routing scheme, or both.

Although increasing the bandwidth of internal links and providing internal buffers have been widely used to implement commercial ATM switching systems [4], the hardware cost of these schemes is high, because the majority of data traffic is bursty and has unbalanced traffic patterns [5], [6]. Deflection routing is a good alternative for treating output contention: cells failing to get selected for the destined link are sent along different links, in the hope that they later return, or detour around the contended link, and continue their journey to the destination. Deflection routing is used in high-speed multihop networks, since it gives good performance [7] and is easy to implement [8].

As a kind of deflection routing, some methods of adding extra internal links to the Banyan network and modifying its self-routing have been proposed in [9]–[11]. They have succeeded in improving the performance and imparting fault tolerance. However, they use partially adaptive self-routing; they place their main stress upon the supplemental links and only partially use the already existing links. As a result, they do not consider all links of the Banyan network as alternate links/paths.

We propose a high-performance large-scale ATM switch that uses all links of the Banyan network as alternate links/paths. We name this new augmented Banyan network the cyclic Banyan network. This switching network is a class of the deflection self-routing Banyan network, in which the fully adaptive self-routing scheme exploits all links of the SE’s as alternate links/paths.

1063–6692/99$10.00 © 1999 IEEE

While the proposed network has no internal buffer, it never loses cells within the switching network. Having no memory buffer means that it is easy to speed up and can even be used for fiber-optic switching. This approach also eliminates the need for backward status links to implement flow control. These characteristics permit an implementation using gate-array technology, which offers greater flexibility in cost, performance, and other design parameters than a dedicated VLSI solution.

By adding extra links between SE’s within the same stage and extending the self-routing scheme, we can use all links as alternate links/paths. Therefore, under both balanced and unbalanced traffic patterns, the proposed network has far better performance and scalability than other networks. To design the proposed network and its self-routing, we use the topological property that all SE’s of the Banyan network are arranged in a regular pattern. In other words, each stage of the Banyan network is composed of a sequence of the cyclic group, realized with SE’s, and the stages are connected symmetrically through the links between them. We prove these properties of the Banyan networks through an algebraic formalism in this paper. The proposed self-routing is as simple as that of the Banyan network, and all SE’s of the proposed network have a uniform structure. We also present a simple cell-resequencing buffer providing a hardware sliding-window mechanism so that the proposed network preserves the integrity of the cell sequence.

This paper is organized as follows. In Section II, we investigate the topological properties of the Banyan network to derive the cyclic Banyan network and its self-routing scheme through an algebraic formalism. In Section III, we present the definition of the proposed network and its routing scheme. In Section IV, we analyze the performance of the network under uniform traffic via an analytic model and simulation, and evaluate it under nonuniform traffic via simulation. Finally, concluding remarks are given in Section V.

II. TOPOLOGICAL PROPERTIES OF THE BANYAN NETWORK

The Banyan network as defined by Goke and Lipovski provides a unique path from each source port to each destination port. More particularly, the set of paths destined for an SE in the network forms a spanning tree, and the set of paths from an SE also forms a spanning tree [12]. However, only a certain class of these spanning trees is important, because all SE’s belonging to the same spanning tree can deliver a cell to the same destination port by using the basic self-routing algorithm. As depicted by thick continuous lines in Fig. 1, such a spanning tree consists of the links from an SE of the last stage (the root of the tree) to all SE’s of the first stage (the leaves of the tree), together with the SE’s connected by these links as its nodes.

In related works [9]–[11], such spanning trees have been used to make the Banyan network fault-tolerant by providing alternate paths. In [11], the alternate paths are obtained by chaining the SE’s in the same level within the trees with augmented links. When a fault occurs in the selected output link, the cell can be delivered to an SE of the same level within the tree, as depicted with arrow “A” in Fig. 1, and then routed correctly again from that SE to the destination port using the basic routing algorithm of the Banyan network. In other works [9], [10], alternate paths are made by connecting an additional backward (forward) link to the child (parent) SE. When a fault occurs, the cell can be delivered to a child (parent) SE one level lower (higher) within the tree through the additional link, as depicted with arrow “B” (“C”) in Fig. 1.

In order to impart fault tolerance to the Banyan network, or to improve the performance, or both, these schemes place their main stress upon the supplemental link. The reason is that once a cell is routed through an output link other than the intended one, rather than through the augmented link, there is no path leading the cell to the destination port. Thus, these schemes use only the augmented link to detour around a faulty or congested link [13]. There is, however, always a vacant link within the very same SE at the moment of contention, whenever the input cells are not all destined for distinct links. Our objective is to provide an augmented Banyan network and an appropriate fully adaptive self-routing algorithm for it, wherein the routing algorithm uses all links of the basic Banyan network as alternate links/paths, in addition to the supplemental links.

In this section, we investigate the topological properties of the Banyan network to derive the cyclic Banyan network and the fully adaptive self-routing algorithm corresponding to the network. Since it is widely known that many multistage interconnection networks (MIN’s) are topologically equivalent to each other [14], we will use the delta network [15] for the sake of simplicity and clarity of illustration in the following discussion. Also, we will refer to SE’s and links through a numbering convention of the same type as that proposed in [14]. This convention will be used both to describe the configuration of the delta network and the cyclic Banyan network and to prove the validity of the routing scheme. For the $N \times N$ switching network, the $\log_2 N$ stages are numbered consecutively from 1 to $\log_2 N$, beginning with the left-most stage as stage 1. For each of the stages, each link (input or output) associated with the stage has a relative position, with respect to the top of the stage, that is identified by a binary numeral having $\log_2 N$ digits. In each stage, each SE has a relative position, with respect to the top of the stage, that is identified, in a manner similar to the representation of a link, by a binary numeral with $\log_2 N - 1$ digits and that will be termed the level. The destination address of an input cell is likewise represented by a binary expression of $\log_2 N$ digits.
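The numbering convention can be made concrete with a short sketch. It assumes 2 × 2 SE’s, so that a network with N ports has log2 N stages of N/2 SE’s each; the function names are illustrative only and do not come from the paper.

```python
def describe_network(size: int) -> dict:
    """Basic dimensions of a size x size Banyan network built from 2 x 2 SE's."""
    stages = size.bit_length() - 1          # log2(size) when size is a power of two
    if size < 2 or (1 << stages) != size:
        raise ValueError("size must be a power of two")
    return {
        "stages": stages,                   # stages are numbered 1 .. stages
        "links_per_stage": size,
        "ses_per_stage": size // 2,
        "link_digits": stages,              # binary digits in a link label
        "se_digits": stages - 1,            # binary digits in an SE level
    }


def link_label(position: int, size: int) -> str:
    """Binary label of the link at `position`, counted from the top of a stage."""
    return format(position, f"0{describe_network(size)['link_digits']}b")


def se_label(level: int, size: int) -> str:
    """Binary label (the level) of the SE at `level`, counted from the top of a stage."""
    return format(level, f"0{describe_network(size)['se_digits']}b")
```

For the 16 × 16 network of Fig. 1, this gives four stages of eight SE’s, four-digit link labels, and three-digit SE levels.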

First of all, let us define the delta network topologically.

Definition 1: For each SEat a stage of the delta network, each output link

of the SE is connected with an input link of stage


Fig. 1. The spanning trees, the equivalence class of SE’s, and the alternate paths in the delta network (N = 16).

in accordance with

connected with link

and

connected with link

where .

These functions describe the interconnections by mapping an SE in one stage to two SE’s in the next stage, one SE per output link of the SE. One function describes the SE that is reachable via the lower output link of the SE, and the other describes the SE that is reachable via the upper output link. The functions are useful for deriving the topological relationships between the SE’s. Now, we define an abbreviation to indicate an SE by its stage and level.

Definition 2: The symbol is the SE located at level of stage where 1 .

We will now show that an equivalence relation exists between the SE’s of a stage as follows.

Definition 3: For the self-routing control function implemented in the delta network, denotes the set of output links in the last stage (i.e., stage ) attainable from the SE , that is, is the set of output links reached when is applied to all possible addresses A, where

connected with link .

When a cell arrives at any input link of an SE in a stage, the function is applied to the address associated with the cell in order to transfer the cell through an output link to an SE in the next stage.
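As an illustration of destination-tag self-routing, the following sketch routes a cell through an omega network. The omega network is used here only as a stand-in for the delta network of Definition 1: the perfect-shuffle wiring, the most-significant-bit-first tag order, and the function name are assumptions of this sketch, not the paper’s definitions.

```python
def route(source: int, dest: int, n: int):
    """Trace a cell through an n-stage omega network of 2 x 2 SE's.

    Returns (hops, final_link), where hops lists (SE level, output bit) per stage.
    """
    mask = (1 << n) - 1
    link = source
    hops = []
    for stage in range(1, n + 1):
        # Inter-stage wiring: perfect shuffle = rotate the n-bit link label left.
        link = ((link << 1) | (link >> (n - 1))) & mask
        level = link >> 1                      # SE that receives the cell
        bit = (dest >> (n - stage)) & 1        # tag bit examined at this stage
        link = (level << 1) | bit              # leave on the output selected by the bit
        hops.append((level, bit))
    return hops, link
```

Each stage consumes exactly one bit of the destination tag, so the SE needs no knowledge of the rest of the network; after the last stage the link label equals the destination address.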

Let us define the relation on the set of all SE’s of a stage. The symbol is defined to represent the set of all SE’s of a stage.

Definition 4: is the set that is composed of all SE’s of stage .

Definition 5: A relation on the set is a subset of the Cartesian product with the following property:

.

Lemma 1: The relation on the set is an equivalence relation.

The fact that the relation is an equivalence relation facilitates the proof of certain properties of the delta network that are useful for deriving an adaptive self-routing algorithm.

Definition 6: Given the relation and an SE of , the equivalence class is a subset determined by as follows:

Let us define the notation “/ ” to indicate a partition induced by an equivalence relation.

Definition 7: At stage in the Banyan network, the set of equivalence classes induced by the relation is given by the notation “/ ” defined by

.

Now we will present useful properties to derive the fully adaptive self-routing scheme.

Theorem 1: All SE’s that belong to the same equivalence class in stage (i.e., all ) have identifiers that terminate with the same ( )-digit binary sequence “ ,” (where ). When , i.e., in the first stage, all SE’s are in the same equivalence class, so the identifiers of the SE’s of the first stage need not share any bit [16].

From Theorem 1, the following holds.

Corollary 1: If a cell starts from any SE within an equivalence class, it will be delivered to the same output port by the routing function .
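The equivalence classes can be enumerated by brute force. The sketch below again substitutes an omega network for the delta network (an assumption; the wiring and names are not from the paper), groups the SE’s of a stage by the set of output ports they can reach, and lets one check that the members of a class share their low-order level bits, in the spirit of Theorem 1.

```python
from itertools import product


def reachable_outputs(level: int, stage: int, n: int) -> frozenset:
    """Output ports reachable from SE `level` of `stage` in an n-stage omega network."""
    mask = (1 << n) - 1
    outputs = set()
    # One routing bit per remaining stage, including the current one.
    for bits in product((0, 1), repeat=n - stage + 1):
        link = (level << 1) | bits[0]
        for bit in bits[1:]:
            link = ((link << 1) | (link >> (n - 1))) & mask   # perfect shuffle
            link = (link & ~1) | bit                          # choose the SE output
        outputs.add(link)
    return frozenset(outputs)


def equivalence_classes(stage: int, n: int) -> list:
    """Group the SE levels of a stage by their set of reachable output ports."""
    classes = {}
    for level in range(1 << (n - 1)):
        classes.setdefault(reachable_outputs(level, stage, n), []).append(level)
    return list(classes.values())
```

In this stand-in model, stage 1 of a 16 × 16 network forms a single class, and every later stage doubles the number of classes while halving their size.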

Theorem 1 provides a basis for the following definitions and results, which establish a topological characterization of the Banyan network. Let us define the function to consolidate the relationship among the SE’s within the same stage.

Definition 8: At stage in the Banyan network, the SE that is levels away relatively from an SE is given by the function defined by

and the SE that is levels away relatively from an SE is given by the function defined by

where .

The following corollary shows that the relative topological distance between two adjacent SE’s belonging to the same equivalence class is the same.

Corollary 2: The topological distance between two SE’s chosen randomly from all SE’s belonging to an equivalence class in stage is levels

where .

Proof: According to Definitions 7 and 8, and Theorem 1, proven.

According to Corollary 2, we can show that there are different kinds of SE’s in stage .

Corollary 3: The order of the equivalence classes in stage is

From Corollaries 2 and 3, we can show that the SE’s of the different classes in a stage are set in a regular array that is generated by applying the function consecutively.

Theorem 2: Let be the group of the set under the operation where the SE’s are distinguished by the logical identity relation. Then, is the cyclic group such that any element SE of set is the generator of .

As a result, Corollary 2 shows the key property of the general cyclic group, and Theorem 2 and Corollaries 2 and 3 show that each stage of the Banyan network is composed of the sequences of the cyclic group of SE’s, such that the order is . In Theorem 3 below, we show that if the equivalence relation exists between SE’s, then the SE’s given by applying the function or the function or both also have the equivalence relation with each other. In addition, it is worth noting that the cyclic group is the subgroup of the cyclic group of the next stage.
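The cyclic structure can be illustrated with a small sketch. It assumes, as in an omega-network stand-in, that the equivalence class of an SE in a stage is determined by the low-order bits of its level and that moving “levels away” is addition modulo the number of SE’s per stage; both are assumptions of this sketch, and the function names are hypothetical.

```python
def class_index(level: int, stage: int) -> int:
    """Index of the equivalence class of `level` in `stage` (assumed low-order bits)."""
    return level % (1 << (stage - 1))


def step(level: int, k: int, n: int) -> int:
    """SE that is k levels away within the same stage, wrapping around cyclically."""
    return (level + k) % (1 << (n - 1))


def orbit(stage: int, n: int) -> list:
    """Classes visited by repeatedly stepping one level, starting from level 0."""
    seen = []
    level = 0
    while class_index(level, stage) not in seen:
        seen.append(class_index(level, stage))
        level = step(level, 1, n)
    return seen
```

Under these assumptions a single one-level step generates every class of a stage, and the orders for a 64 × 64 network (n = 6) are 1, 2, 4, 8, 16, and 32 for stages 1 through 6, each dividing the order of the next stage.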

Theorem 3: The relation is a congruence that satisfies the following requirements: 1) the relation is an equivalence relation and 2) for function and function we have, for all and in the set of all SE’s in the network, the following implication: and .

Proof: According to Definitions 1 and 5, Lemma 1, and Theorem 2, proven.

The theorems and corollaries provide a basis to derive the following theorem and corollaries, which constitute the basis of the cyclic Banyan network and its fully adaptive self-routing scheme.

Corollary 4: The path starting from an output link can be replaced with another path starting from an output link of the equivalent SE thereof

where .

Proof: By applying Definition 8, Theorem 1, and Corollary 2 above, proven.

Also, the following theorem shows that if there exists a path that can transfer cells within a stage, then the path starting from an output link of the SE that is a certain number of levels away from an initial SE can be replaced with another path starting from an output link of the initial SE and then going to the SE that is twice the original number of levels away in the next stage. Using this characteristic, we can defer the congestion that occurs in a stage to the next stages.

Theorem 4: If there is a path that can transfer the cell within a stage, which is represented by the function , the path starting from an output link of the SE that is levels away from an initial SE can be replaced with another path starting from an output link of the initial SE and then going to the SE that is levels away, in the next stage

where , and .

Proof: When ,

(by Definitions 1 and 8)

Let’s assume when as follows:

When ,

(by Definition 8)

by case

(by case

(by Definition 8)

By mathematical induction, proven.

This proves the equality for . The case for follows in a similar manner, mutatis mutandis. Also, the case for follows in a similar manner, mutatis mutandis.
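The deferral property of Theorem 4 can be checked exhaustively on a small example. The sketch below uses an omega network as a stand-in for the delta network and models “levels away” as addition modulo the number of SE’s per stage; the wiring and both function names are assumptions of this sketch.

```python
def next_se(level: int, bit: int, n: int) -> int:
    """SE of the next stage reached from SE `level` via output `bit` (omega wiring)."""
    mask = (1 << n) - 1
    link = (level << 1) | bit
    link = ((link << 1) | (link >> (n - 1))) & mask   # perfect shuffle to the next stage
    return link >> 1


def shift(level: int, k: int, n: int) -> int:
    """SE that is k levels away within the same stage (cyclic)."""
    return (level + k) % (1 << (n - 1))


def deferral_holds(n: int) -> bool:
    """Moving k levels and then leaving equals leaving and then moving 2k levels."""
    ses = range(1 << (n - 1))
    return all(
        next_se(shift(s, k, n), b, n) == shift(next_se(s, b, n), 2 * k, n)
        for s in ses for k in ses for b in (0, 1)
    )
```

In this model the next-stage SE is (2 · level + bit) modulo the number of SE’s per stage, which is why a displacement of k levels in one stage becomes a displacement of 2k levels in the next.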


It is worth noting that the function is a homomorphism. The following corollary shows that a path starting from an output link can be substituted with another path starting from another output link within the same SE.

Corollary 5:

(by Definitions 1 and 8)

(by the above equation)

(by Definition 8, Corollary 2, and Theorem 4)

(by Definitions 1 and 8)

and

(by the above equation)

(by Definitions 8, Corollary 2, and Theorem 4)

where .

As we have shown in the foregoing discussion, Banyan networks have topological properties that make them attractive starting points from which to derive a new multipath unbuffered augmented Banyan network and a nonminimal fully adaptive self-routing control algorithm for that network. If we make interconnections among the SE’s within a stage (i.e., from an SE to another SE ), then all the existing links, as well as all the additional links, can be used for fully adaptive self-routing control. Fig. 2 illustrates conceptually how a cell is transferred through the switching network in accordance with the fully adaptive self-routing control algorithm. Each circle stands for a stage, and the solid line belonging to a circle stands for the cyclic group within the stage. The unidirectional arrows from a cyclic group to its supergroup in the next stage illustrate the interstage routing. The bidirectional arrow over cyclic groups illustrates possible intrastage routing. The number over each circle is the order of the cyclic group of each stage in a 64 × 64 Banyan network. Since there can be various kinds of interconnections to provide paths from an SE to another SE , we can make several augmented Banyan networks for which we can provide fully adaptive self-routing algorithms. The cyclic Banyan architecture specifies a particular type of intrastage connection among the SE’s. This architecture provides a network with efficiency and cost advantages relative to other extensions.

III. CYCLIC BANYAN NETWORK AND ITS SELF-ROUTING

In this section, we present a cost-effective extension of the Banyan network, the cyclic Banyan network, and the fully adaptive self-routing scheme for the network.

A. Network Configuration

The cyclic Banyan network can be obtained from the delta network, such as that depicted in Fig. 1, by the addition of links chaining all SE’s of a stage. Implementation of these additional links requires the network to comprise augmented SE’s, each having chain-in links and chain-out links as well as the input links and output links appropriate for the delta network.

Fig. 2. The cyclic groups, the homomorphism �j, and the fully adaptive self-routing.

Fig. 3 illustrates an SE having the additional links appropriate for use in the cyclic Banyan network. The configuration of the SE derives from a 2 × 2 configuration having only two input links and two output links, such as would be used in the delta network. Two chain-in links and two chain-out links have been added to the configuration of a 2 × 2 switch. The SE is thus a 4 × 4 crossbar switch that operates in accordance with a fully adaptive self-routing control algorithm to be described below.

Definition 1 and the following Definition 9 provide a precise topological characterization of the cyclic Banyan network.

Definition 9: For each SE of the cyclic Banyan network, the chain-out links of the SE are connected, respectively, with the chain-in link of the SE and the SE given by

and

Fig. 4 shows an example of a 16 × 16 cyclic Banyan network configured from the general 16 × 16 delta network. The function and the function map a given SE to another SE within the same stage.

B. The Fully Adaptive Self-Routing Scheme

As in many other MIN’s, the routing of the cyclic Banyan network is controlled by means of a destination tag. In addition to the general destination tag, it also uses, for each cell, a deviation tag having a fixed size of binary digits. The deviation tag is updated at each stage to represent the value of the topological distance between the SE that actually receives the cell and the originally intended SE (or an SE equivalent to the originally intended SE). Here, the originally intended SE is the SE in a given stage that would have received the cell in accordance with the basic self-routing of the underlying Banyan network. The value of is calculated from the current value of , the selected link, and the originally intended link. Thus, the destination address of a cell in transit across the cyclic Banyan network comprises a pair ( ). When a cell arrives at the input link of stage 1, the value of is set to zero; i.e., the cell has the destination address ( ). The fully adaptive self-routing algorithm set forth in Definition 10 determines thereafter how each of the successive SE’s transfers the cell and updates the value of .

Fig. 3. The organization of the augmented switching element.

Fig. 4. The cyclic Banyan network and the fully adaptive self-routing scheme thereof (N = 16).

Definition 10: Let ( ) denote the destination address for an input cell to be transferred across a cyclic Banyan network, and let denote the routing control function for the Banyan network from which the cyclic Banyan network is configured. Let denote the binary complement of . A fully adaptive self-routing control algorithm for the SE in the cyclic Banyan network is defined by a link-allocation procedure and updating rules as follows:

I) Link-Allocation Procedure: For each input cell:

1) If then:

a) send the cell to ;
b) if it fails, send the cell to ;
c) if it fails, send the cell to ;
d) if it fails, send the cell to .

2) If then:

a) send the cell to ;
b) if it fails, send the cell to ;
c) if it fails, send the cell to ;
d) if it fails, send the cell to .

II) The Updating Rules for :

1) When

a) if and is selected,then set

b) if and is selected,then set

c) if is selected, then setd) if is selected, then set .

2) When

a) if and is selected,then set

b) if and is selected,then set

c) if and is selected,then set

d) if is selected, then set

e) if is selected, then set

where

if mod
otherwise

and

if
otherwise.

The following, Theorem 5, shows that the proposed routing scheme is fully adaptive: for all SE’s in the cyclic Banyan network, except for the SE’s in the last stage, a cell can be routed either to the output links or to the chain-out links, and still be deliverable to a designated output port of the network.

Theorem 5: Let and . For each port of connected with or , a valid path exists, in accordance with the self-routing method of Definition 10, through the port from to a designated output port of the network.

Proof: From Corollary 5, it follows that, when , a valid alternative path exists through the link connected with (where denotes the binary complement of ). Therefore, instead of the intended output link, another output link can be utilized as an alternate path to deliver the input cell to the destined output port correctly. Similarly, by Corollary 4, a valid alternative path exists through the chain-out links connected with or . Conversely, if a valid path exists through the chain-out link, Theorem 4 provides that a valid path exists through .

We can now prove the correctness of our fully adaptive self-routing scheme for the cyclic Banyan network.

Theorem 6: The routing control procedure of Definition 10, as applied in the cyclic Banyan network, correctly delivers an input cell with an arbitrary destination tag to the indicated destination port.

Proof: The method first allocates an available link in accordance with the procedure of Part I. Theorem 5 provides the existence of a valid path from that link to the destination port.

Second, the method updates the deviation tag in accordance with the rules of Part II. The following shows that these updating rules actually cause the cell to proceed along a path to the destination port.

1) The updating rules 1(a) and 1(b) are proven by applying Corollary 5. The updating rules 1(c) and 1(d) are proven by applying Corollary 4.

2) The updating rules 2(a) and 2(b) are proven by applying Corollary 5 and Theorem 4. The updating rule 2(c) is proven by applying Theorem 4.

3) represents the required number of additional adjustment routing steps through successive levels of the current stage, by means of the chain-out link(s), for the cell to proceed to the next stage on a path to the destination port. Thus, the updating rule 2(d) is proven. With this and Corollary 2, the rule 2(e) is also proven.

Fig. 4 illustrates fully adaptive routing control in the cyclic Banyan network in accordance with the method of Definition 10, where a cell collision or a fault is represented by the symbol “X.” In the first case, a cell tries to move from the input port (0000) to the output port (0000). As indicated in Fig. 4, if a collision or fault occurs, the SE redirects the cell from the intended output link, now blocked by the collision or the fault, to another output link. The cell then travels through the cyclic Banyan network along an alternate path in accordance with the routing control method of Definition 10. In the second case, a cell enters the network at the input port (1011) and has the output port (1011) as its designated output port. While the cell travels through the network, it encounters two congested/faulty links. In response to these collisions/faults, each of the SE’s redirects the cell in accordance with the routing control method, as shown in Fig. 4. In the third case, a cell enters the network at the input port (1110), has the output port (1110) as its designated output port, and encounters three congested/faulty links.

IV. PERFORMANCE ANALYSIS OF THE CYCLIC BANYAN NETWORK

The performance enhancement of the cyclic Banyan network is accomplished by using the fully adaptive, deflection self-routing control, which utilizes all the links to make alternate paths. In this section, we analyze this performance enhancement of the cyclic Banyan network over several kinds of networks. We use a common queuing network model to investigate the performance of the cyclic Banyan network.

A. The Cyclic Banyan Network Under Uniform Traffic

We analyze the network under the uniform traffic model; thus, we employ the following assumptions, as is usually done.


Assumption 1:

• New cells arrive at input ports according to a Poisson process with rate where .

• Input cells are uniformly distributed over all output ports. Also, the probability that a cell arrives at the network is the same for all input ports.

• Each cell has an equal probability of winning a conflict.

These assumptions imply that, for each switching stage of the network, the pattern of cell distribution is identical and statistically independent for all the SE’s. Therefore, each switching stage can be characterized by a single SE, which makes the analysis of the network very simple.
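Because one SE characterizes a whole stage, the load can be propagated stage by stage. As an illustration only, the sketch below applies the well-known recurrence for a generic unbuffered Banyan network built from d × d SE’s, not the cyclic Banyan network analyzed in this section: if each input link of an SE carries a cell with probability p, each output link carries one with probability 1 − (1 − p/d)^d. The function name and the parameter `d` are assumptions made for this example.

```python
def banyan_throughput(num_stages: int, load: float, d: int = 2) -> float:
    """Per-link load after num_stages stages of a generic unbuffered Banyan.

    Assumes uniform traffic, independent links, and d x d SE's that drop
    every cell losing a contention (no deflection, no buffering).
    """
    p = load
    for _ in range(num_stages):
        # An output link is busy unless none of the d inputs requests it.
        p = 1.0 - (1.0 - p / d) ** d
    return p


if __name__ == "__main__":
    # One stage of 2 x 2 SE's under full load: 1 - (1 - 1/2)^2 = 0.75
    print(banyan_throughput(1, 1.0))
    # Ten stages correspond to a 1024 x 1024 network when d = 2.
    print(banyan_throughput(10, 1.0))
```

The recurrence shows why the throughput of the generic network falls as stages are added, which is the baseline against which the deflection scheme is compared later in this section.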

In many works [17], [18] on the performance analysis of multipath switching networks, the contention at the destination was ignored. This assumption was justified by using a fast output queue that could accept multiple requests per cycle, and by employing a multiplexer in front of the queue.

For example, to avoid the contention at the destination, a multiplexer operating at two times the speed of the network and a fast queue accepting two cells per cycle are generally used for the replication-2 Banyan, and a 4 × 1 multiplexer operating at four times the speed of the network and a fast queue accepting four cells per cycle are used for the replication-4 Banyan. For the replication-8 Banyan, an 8 × 1 multiplexer operating at eight times the speed of the network and a fast queue accepting eight cells per cycle are used as well.

For the proposed switching network, we need a 4 × 1 multiplexer operating at four times the speed of the network and a fast queue accepting four cells per cycle. Additionally, we need a 1 × 2 demultiplexer operating at the same speed as the network for each output port and a 1 × 3 demultiplexer operating at the same speed. Therefore, we can also apply the following assumption to the performance analyses of the various kinds of networks compared in this section.

Assumption 2: There is no contention at the output port of the network.

The cyclic Banyan network, using the proposed deflection self-routing control, does not utilize a buffering mechanism or a backpressure mechanism to deal with output link contention within SE’s. Therefore, a cell that arrives at the last stage with a nonzero deviation tag is lost unless a chain-out link is available. Thus, we assume the following.

Assumption 3: The SE’s in the last stage drop an input cell if the value of the deviation tag of the cell is nonzero and no chain-out link is available.

For the performance analysis of the cyclic Banyan network, we use a model similar to the queuing model of [19], which was used for the analysis of a multicomputer network. In that model, each packet was associated with a class or a type. By associating a cell with a class in the same way, we model the performance of the deflection self-routing in the cyclic Banyan network. We classify a cell in accordance with the topological distance between the SE at which the cell actually arrives and the originally intended SE (or an SE equivalent to the originally intended SE). Thus, we classify a cell in stage as one of classes, and classify a cell for which this topological distance is zero as class “0.”

The class associated with a cell changes as the cell is transferred from one SE to another SE in the same stage or the next stage. The probability that a cell of one class changes into a cell of another class depends only on the probability that the cell wins the competition against other cells within the SE.

We first define the following variables in the same manner as in [19], and derive a set of state equations relating these variables. The symbol is defined to represent the number of cells which were transferred from an SE of a certain stage to an SE of a certain stage and at the same time become the cells of a new class.

Definition 11: The symbol is defined to represent the number of cells which transferred from the stage to the stage and become the class .

We also define to represent the total number of all cells in stage , which were in stage previously.

Definition 12: The total number of cells in stage , which were in stage previously, is denoted as the symbol defined by: .

Now let us define some abbreviations to simplify the expression of the state equations presented later.

Definition 13: The symbol denotes the number of all input links of an SE, and the symbol denotes the number of all chain-in links of an SE. The total number of all the cells departing from SE’s of the stage and arriving at the input links of an SE of the stage , except for the cells of class 0, is denoted by the symbol defined by . The number of all cells departing from SE’s of the stage and arriving at the chain-in links of an SE of the same stage , except for the cells of class 0, is denoted by the symbol defined by .

The symbol is defined to represent the transition probability with which the cell of a class changes into the cell of another class.

Definition 14: The symbol is defined to denote the transition probability that the cell of the class of the stage is changed into the class while delivered to an SE of the stage .

Using the symbol , we can easily illustrate the state transition diagram used in the proposed performance model as shown in Fig. 5, which depicts the case of the cyclic Banyan network with N = 16. Each state is labeled with the pair of the stage number and the topological distance between the SE at which the cell actually arrives and the originally intended SE (or an SE equivalent to the originally intended SE).

We can obtain the throughput of the cyclic Banyan network in steady state by using this kind of state diagram and the state-transition probability , which will be presented in the following.

Theorem 7: When the state transition is in steady state, the state transition probability that the cell of the class 0 of the first stage is changed into the class while it is delivered


Fig. 5. The state transitions of cell switched through the cyclic Banyan network (N = 16).

to an SE of the second stage is given by

The probability that the total “ ” cells arrive at “ ” input links from among “ ” input links of an SE in the first stage at given cell time is a binomial probability function

The probability that at least one of “ ” cells of the class 0 within an SE of the first stage is changed into the class while it is delivered through the output link under consideration and to SE’s of the second stage, is given by the function that will be derived later. Then the transition probability is the sum, over all possible numbers of cells arriving at the input links, of the product of these two probabilities.
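The structure of this first-stage transition probability, a binomial arrival probability weighted by a class-change probability and summed over the number of arriving cells, can be sketched as follows. This is only an illustration: the names `binomial_pmf`, `transition_probability`, `rho` (per-link arrival probability), and `class_change` (a stand-in for the function derived later in this section) are assumptions introduced here, not the notation of the analysis.

```python
from math import comb
from typing import Callable


def binomial_pmf(k: int, n: int, rho: float) -> float:
    """Probability that exactly k of n independent links carry a cell."""
    return comb(n, k) * rho**k * (1.0 - rho) ** (n - k)


def transition_probability(
    n_links: int, rho: float, class_change: Callable[[int], float]
) -> float:
    """Sum over k of P(k cells arrive) * P(class change | k cells).

    class_change(k) is a hypothetical placeholder for the per-SE
    probability function that the analysis derives later.
    """
    return sum(
        binomial_pmf(k, n_links, rho) * class_change(k)
        for k in range(n_links + 1)
    )


if __name__ == "__main__":
    # Placeholder: the event occurs whenever at least one cell arrives.
    at_least_one = lambda k: 1.0 if k >= 1 else 0.0
    # With 2 links and rho = 0.5 this equals 1 - (1 - 0.5)^2.
    print(transition_probability(2, 0.5, at_least_one))
```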

For the stage ( 2), the state transition probability that the cell of the class of the stage is changed into the class while it is delivered to an SE of the stage is given by


where and . The joint probability that, from the stage , “ ” cells of the class 0, cells of class and cells of class arrive at “ ” input links from among “ ” input links of an SE in the stage at given cell time is a multinomial probability function

The joint probability that “ ” cells of the class 0, cells of class and cells of class arrive at “ ” chain-in links from among “ ” chain-in links of an SE in the stage is a multinomial probability function

The probability that at least one from among “ ” cells of the class within an SE of the stage is changed into the class while it is delivered through the output link under consideration and to an SE of the stage is given by the function that will also be derived later. Then the transition probability is the sum, over all possible numbers of cells arriving at the input links and the chain-in links, of the product of these three probabilities.
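For the later stages, the arriving cells belong to several classes, so the arrival term becomes a multinomial probability. The following minimal sketch evaluates a generic multinomial probability mass function; treating "idle link" as one more category, and the names `multinomial_pmf`, `counts`, and `probs`, are assumptions of this example rather than the paper's formulation.

```python
from math import factorial, prod
from typing import Sequence


def multinomial_pmf(counts: Sequence[int], probs: Sequence[float]) -> float:
    """Probability of observing counts[i] links in category i.

    Each link independently falls into category i with probability
    probs[i]; the categories could be "cell of class 0", "cell of
    class 1", ..., and "idle link" (an assumption of this sketch).
    """
    n = sum(counts)
    coeff = factorial(n)
    for c in counts:
        coeff //= factorial(c)  # exact: partial quotients stay integral
    return coeff * prod(p**c for p, c in zip(probs, counts))


if __name__ == "__main__":
    # Two links: one class-0 cell, one class-1 cell, no idle link.
    # 2!/(1! 1! 0!) * 0.5 * 0.2 = 0.2
    print(multinomial_pmf([1, 1, 0], [0.5, 0.2, 0.3]))
```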

Now let us derive the probability function . To model the transformation of the class of an input cell switched within SE’s in accordance with the deflection self-routing control, we derive the formulas hereinafter based on [15]. First, let us define the probability function for the case that all the cells arriving at an SE are of class 0, i.e., the case that and , in the following Lemma 2.

Lemma 2: The probability that at least one from among all the input cells of the class 0 which are in an SE of the stage is still of the class 0, while it is delivered through the output link under consideration, and to an SE of the next stage is given by the function defined by

The probability that at least one from among all the input cells of the class 0 within an SE of the stage becomes of the class 1, while it is delivered through the output link under consideration, and to an SE of the next stage is given by the function defined by

The probability that at least one from among all the input cells of the class 0 within an SE of the stage becomes of the class 1, while it is delivered through the chain-out link under consideration, and to an SE of the same stage is given by the function defined by

if

if .

Second, let us define the probability function for the case that the degree of the deflection is reduced by one level through the intrastage routing within the same stage. We obtained these formulas based on [15]. To simplify the mathematical expressions for the probability function, we introduce the matrices in Definition 15 hereinafter.

Definition 15: Each of the matrix and the matrix is defined, respectively, by

and

where

and

Additionally, let us define three operation symbols for matrices as follows. First, the symbol “ ” denotes a multiplication of two matrices analogous to the scalar product of vectors

Also, the symbol “ ” denotes a multiplication of two matrices analogous to the Cartesian product of vectors

Lastly, the symbol “ ” denotes an evaluation of a matrix analogous to the summation of the components of a vector as


the following:

Using the two matrices and three operators defined above, we can simply define the transition probability function as in Lemma 3.

Lemma 3: The probability function that at least one from among all the input cells of the class within an SE of the stage becomes of the class while it is delivered through the chain-out link under consideration is defined by

if

if and

if and

if and

if and

if and

if and and

if and

where

Finally, we present the probability functions for the case that the degree of the deflection is increased by losing the competition for the routing.

Lemma 4: The probability that at least one from among all the input cells of the class within an SE of the stage becomes of the class while it is delivered through the chain-out link under consideration, and to an SE of the same stage is given by the function defined by

if

The probability that at least one from among all the input cells of the class within an SE of the stage becomes of the class while it is delivered through the output link under consideration, and to an SE of the next stage is given by the function defined by

if

if

where

The probability that at least one from among all the input cells of the class within an SE of the stage becomes of the class while it is delivered through the output link under consideration, and to an SE of the next stage is given by the function defined by

if

if .

Algorithm: Due to the complexity of the state transitions, we measure the performance of the network by iterative calculations. Each step of the iteration starts from the first stage and ends at the last stage. These steps are repeated until the steady state is reached. The normalized throughput is obtained by dividing the number of accepted cells by the maximum number of possible arrivals.

The normalized delay is obtained by dividing the sum of the total delay for the interstage routings and the total expected delay for the intrastage routings by the number of stages of the network.
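The two normalizations described above amount to simple ratios. A minimal sketch follows, assuming that the maximum number of possible arrivals is one cell per input port per cell time; the function and parameter names are hypothetical and chosen only for this illustration.

```python
def normalized_throughput(
    accepted_cells: float, num_ports: int, num_slots: int = 1
) -> float:
    """Accepted cells divided by the maximum number of possible arrivals.

    Assumes at most one cell per port per cell time, so the maximum
    number of arrivals is num_ports * num_slots.
    """
    return accepted_cells / (num_ports * num_slots)


def normalized_delay(
    interstage_delay: float, intrastage_delay: float, num_stages: int
) -> float:
    """(interstage delay + expected intrastage delay) / number of stages."""
    return (interstage_delay + intrastage_delay) / num_stages


if __name__ == "__main__":
    # 900 cells accepted by a 16-port network over 100 cell times.
    print(normalized_throughput(900, 16, 100))
    # A cell crossing 10 stages with 7 extra intrastage hops on average.
    print(normalized_delay(10, 7, 10))
```

Under these definitions, a network in which cells never take a chain-out link has a normalized delay of exactly 1.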

Simulation: In order to validate the analysis presented in the previous section, we simulated cyclic Banyan networks. The basic assumptions for the analysis were implemented in the simulator as follows.

• The sources generate cells according to a Poisson process with rate where .

• The destination of each cell generated by the sources is set randomly by a random number generator (one out of zero to ) to simulate uniform traffic.


Fig. 6. Normalized throughput versus network size for the cyclic Banyan network.

• If there is a conflict among cells for a link, one cell selected randomly is transferred to the link, and the other cells are deflected to other links.

• The throughput and the delay are measured at each output port of the network, and averaged over the network size and simulation time span to get the normalized throughput and the normalized delay of the network.
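Two of the simulator rules listed above, the uniformly random destination and the random choice of a contention winner, can be sketched as below. This is not the authors' simulator; the function names and the use of Python's `random.Random` are assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence, Tuple


def random_destination(num_ports: int, rng: random.Random) -> int:
    """Uniformly pick an output port in 0 .. num_ports - 1."""
    return rng.randrange(num_ports)


def resolve_conflict(
    contenders: Sequence[int], rng: Optional[random.Random] = None
) -> Tuple[int, List[int]]:
    """Pick one winner at random; every other cell is to be deflected.

    contenders holds identifiers of the cells requesting the same link.
    Returns (winner, deflected_cells).
    """
    if not contenders:
        raise ValueError("at least one contending cell is required")
    rng = rng or random.Random()
    idx = rng.randrange(len(contenders))  # each cell wins equally often
    cells = list(contenders)
    winner = cells.pop(idx)
    return winner, cells


if __name__ == "__main__":
    rng = random.Random(1999)
    winner, deflected = resolve_conflict([10, 11, 12], rng)
    print(winner, deflected, random_destination(16, rng))
```

How the deflected cells are then assigned to the remaining links is governed by the link-allocation procedure of Definition 10, which this sketch does not model.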

The sample simulation results and the analytic results are shown in Figs. 6 and 7. Fig. 6 shows normalized throughput versus network size for the cyclic Banyan network with the input load of 1.0. A notable fact found from Fig. 6 is that the normalized throughput starts to degrade when the size reaches . This is because the number of the crosspoints ( ) of the cyclic Banyan network starts to be less than that ( ) of the crossbar, i.e., the complete connection, from the size . Fig. 7 plots normalized delay versus network size for the same network with the input load of 1.0. There are differences between the analytic results and the simulation results, since the analysis is approximate and based on several uniformity and independence assumptions. These approximations and assumptions make the analysis simple, easy to understand, and easy to compute.

Comparing the performance of the cyclic Banyan network with that of the generic Banyan network and the replicated Banyan networks, we now investigate the performance improvement of the cyclic Banyan network as follows. For fair comparison among the networks, let us first investigate the hardware cost of the networks. We assume that the hardware cost is proportional to the number of gates involved. Hence, the hardware cost of the generic Banyan network is ( SE’s) × (number of gates per SE), that of the Replicated-2 Banyan network is ( SE’s) × (number of gates per SE) × (2 Replication), that of the cyclic Banyan network is ( SE’s) × (number of gates per SE), that of the B-network is ( SE’s) × (number of gates per SE), that of the Replicated-4 Banyan network is ( SE’s) × (number of gates per SE) × (4 Replication), and that

Fig. 7. Normalized delay versus network size for the cyclic Banyan network.

of the Replicated-8 Banyan network is ( SE’s) × (number of gates per SE) × (8 Replication).

So we can normalize these expressions for the hardware costs of the networks by dividing each of them by ( ), which is the number of all the SE’s of the generic Banyan network. The normalized hardware cost is the number of gates needed to construct the function corresponding to one SE of the generic Banyan network.

The normalized hardware costs for the generic Banyan network, the Replicated-2 Banyan network, the cyclic Banyan network, the B-network, the Replicated-4 Banyan network, and the Replicated-8 Banyan network are (number of gates per SE), (number of gates per SE × 2), (number of gates per SE), (number of gates per SE) × 2, (number of gates per SE × 4), and (number of gates per SE × 8), respectively.

The detailed values of the normalized hardware cost for the networks of size N = 1024 are shown in Table I. To evaluate the number of gates for one switching unit, we use the following assumptions. We assume that we need one gate for a NAND logic, three gates for an exclusive-or logic, and seven gates for a flip-flop. We also assume that each switching unit is composed of input cell buffers, output cell latches, a complete interconnection, composed of flip-flops, by which we can transfer cells from the input cell buffers to the output cell latches, an input selection function, an output selection function, and a typical contention control function.

The input cell buffer is composed of 512 flip-flops for storing 64 bytes. The output cell latch is composed of 16 flip-flops for storing 2 bytes. The number of flip-flops for the complete interconnection is evaluated by the expression: (the number of the input cell buffers) × (the number of the output cell latches) × (2-byte bandwidth). It is notable that we need a few additional gates for processing and updating the deviation tag and controlling the contention in accordance with the routing scheme defined in Definition 10, and the number of the additional gates for the deviation


TABLE I
NORMALIZED HARDWARE COMPLEXITY (NUMBER OF GATES) FOR THE BANYAN, REPLICATED BANYANS, B-NETWORK, AND CYCLIC BANYAN (N = 1024)

Fig. 8. Normalized throughput versus network size for Banyan, replicated Banyans, B-network, and cyclic Banyan.

tag is proportional to scale for the case of the network. Regardless of the size of the network, the number of gates for the switching unit is almost invariable. Therefore, the normalized hardware costs of networks of various sizes are almost the same as that of the network.
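The storage part of the gate count follows directly from the stated assumptions (seven gates per flip-flop, 512 flip-flops per input cell buffer, 16 flip-flops per output cell latch, and a 2-byte-wide complete interconnection). The sketch below works through that arithmetic only; it leaves out the selection and contention control logic, whose gate counts are not given in the text, and the numbers of buffers and latches passed in are hypothetical example values rather than figures from Table I.

```python
GATES_PER_FLIP_FLOP = 7
INPUT_BUFFER_FLIP_FLOPS = 512  # one 64-byte input cell buffer
OUTPUT_LATCH_FLIP_FLOPS = 16   # one 2-byte output cell latch
LINK_WIDTH_BITS = 16           # 2-byte bandwidth of the interconnection


def storage_flip_flops(n_input_buffers: int, n_output_latches: int) -> int:
    """Flip-flops in the buffers, the latches, and the interconnection."""
    buffers = n_input_buffers * INPUT_BUFFER_FLIP_FLOPS
    latches = n_output_latches * OUTPUT_LATCH_FLIP_FLOPS
    interconnection = n_input_buffers * n_output_latches * LINK_WIDTH_BITS
    return buffers + latches + interconnection


def storage_gates(n_input_buffers: int, n_output_latches: int) -> int:
    """Gate count of the storage elements only (no control logic)."""
    flip_flops = storage_flip_flops(n_input_buffers, n_output_latches)
    return flip_flops * GATES_PER_FLIP_FLOP


if __name__ == "__main__":
    # Hypothetical unit with 2 input buffers and 2 output latches:
    # 1024 + 32 + 64 = 1120 flip-flops, i.e., 7840 gates.
    print(storage_flip_flops(2, 2), storage_gates(2, 2))
    # Hypothetical unit with 4 input buffers and 4 output latches.
    print(storage_flip_flops(4, 4), storage_gates(4, 4))
```

The input cell buffers dominate the count in both cases, which is consistent with the observation that the gates added for the deviation tag are few by comparison.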

According to Table I, to construct the cyclic Banyan network, we need about 20% more gates than for the replicated-2 Banyan network. The hardware cost of the cyclic Banyan is 1.2 times that of the replicated-2 Banyan network, 0.8 times that of the B-network, 0.6 times that of the replicated-4 Banyan network, and 0.3 times that of the replicated-8 Banyan network.

As shown in Fig. 8, the throughput of the 1024 × 1024 cyclic Banyan network is 3.51 times that of the regular Banyan network, 2.15 times that of the replicated-2 Banyan network, 1.51 times that of the replicated-4 Banyan network, and even 1.20 times that of the replicated-8 Banyan network. The normalized delays for the proposed networks of all sizes are only about 1.7 times those of the generic Banyan networks of the same sizes, as shown in Fig. 7. Furthermore, as shown in Fig. 9, if we reduce the load or speed up the proposed network, we can obtain a network whose throughput is almost 1.0.
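The cost ratios and throughput ratios quoted for the 1024 × 1024 case can be combined into a rough throughput-per-cost comparison. This figure of merit is introduced here only as an illustration and is not a metric used in the analysis; the dictionary and function names are likewise assumptions of the example.

```python
# Cost of the cyclic Banyan relative to each network (from Table I text).
COST_RATIO = {"replicated-2": 1.2, "replicated-4": 0.6, "replicated-8": 0.3}

# Throughput of the cyclic Banyan relative to each network (N = 1024).
THROUGHPUT_RATIO = {
    "replicated-2": 2.15,
    "replicated-4": 1.51,
    "replicated-8": 1.20,
}


def throughput_per_cost_gain(network: str) -> float:
    """Relative throughput divided by relative hardware cost.

    A value above 1 means the cyclic Banyan delivers more throughput
    per gate than the named network, under the quoted ratios.
    """
    return THROUGHPUT_RATIO[network] / COST_RATIO[network]


if __name__ == "__main__":
    for name in COST_RATIO:
        print(name, round(throughput_per_cost_gain(name), 2))
```

Under the quoted ratios the gain grows with the replication factor, since the replicated networks add hardware faster than they add throughput.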

As the size of the networks grows, the advantage of the cyclic Banyan network over the replicated Banyan networks increases. The throughput of the 524 288 × 524 288 cyclic Banyan

Fig. 9. Normalized throughput versus network size for the cyclic Banyan network under several load factors.

network is 4.76 times that of the regular Banyan network, 2.69 times that of the replicated-2 Banyan network, 1.70 times that of the replicated-4 Banyan network, and even 1.22 times that of the replicated-8 Banyan network. We can see that an extremely large proposed network has far better performance than the replicated Banyan networks. Thus, the cyclic Banyan network has far better scalability, so that it can be used for large-scale ATM systems in which the replicated networks cannot be employed.

B. The Cyclic Banyan Network Under Nonuniform Traffic

We analyze the cyclic Banyan network under nonuniform traffic. For the performance analyses, we ran simulations. We employ the same assumptions as in the analysis under uniform traffic, except Assumption 1. We can represent a nonuniform traffic as a load matrix where element means the given probability that a cell arriving at input port would be destined for output port . Thus, the sum of row of the matrix represents the total load on input port and the sum of column represents the total load offered to output port . There can be infinitely many nonuniform traffic patterns.

Of these patterns, we are interested in one nonuniform traffic pattern, the hot-group model [2], which may represent realistic traffic patterns. We assume that input cells are nonuniformly distributed over all output ports in such a way that the output


Fig. 10. Normalized throughput versus congested stage of 512 × 512 cyclic Banyan network under nonuniform traffic.

ports can be divided into two groups named “hot outlet group” and “cold outlet group.” The load matrix is partitioned as . The row sum of is while the row sum of is such that . In other words, there are two equal-sized outlet groups, and the probability of a cell originating at a source going to destination group is , while going to destination group is . Let be the hot ratio; then and , . Using this model under various input loads and hot ratios , we ran the simulations described hereinafter.
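A load matrix of this kind can be built as in the minimal sketch below. It assumes that the first half of the output ports forms the hot outlet group, that a cell goes to the hot group with probability equal to the hot ratio, that destinations are uniform within each group, and that every input port offers the same load; the function name and these conventions are assumptions of the example, not a restatement of the lost formulas.

```python
from typing import List


def hot_group_load_matrix(
    num_ports: int, load: float, hot_ratio: float
) -> List[List[float]]:
    """Load matrix whose entry [i][j] is the load from input i to output j.

    Outputs 0 .. num_ports/2 - 1 are taken as the hot outlet group and
    the rest as the cold outlet group (an assumption of this sketch).
    """
    if num_ports % 2 != 0:
        raise ValueError("two equal-sized outlet groups need an even size")
    half = num_ports // 2
    hot = load * hot_ratio / half           # per-port share, hot group
    cold = load * (1.0 - hot_ratio) / half  # per-port share, cold group
    row = [hot] * half + [cold] * half
    return [list(row) for _ in range(num_ports)]


if __name__ == "__main__":
    matrix = hot_group_load_matrix(4, load=0.8, hot_ratio=0.6)
    print([round(x, 3) for x in matrix[0]])            # one row
    print(round(sum(matrix[0]), 3))                     # load on input 0
    print(round(sum(r[0] for r in matrix), 3))          # load on output 0
```

Each row sums to the input load, while each hot-group column sums to more than the input load, which is what concentrates contention on the hot outlets.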

A notable fact that we found from the simulation results is that the performance of the cyclic Banyan network under nonuniform traffic is independent of the location of the congested stage. Fig. 10 verifies this fact. If the sizes of the two outlet groups are equal, then whatever nonuniform traffic pattern we apply, the simulated throughput is the same for a given input condition. So we can estimate the whole performance under a nonuniform traffic pattern with only two factors: input load and hot ratio .

Now we investigate the performance of the cyclic Banyan network under nonuniform traffic. As shown in Fig. 11, when we put load 0.7 into the cyclic Banyan network, if the hot ratio is 0.6, the throughput of the 512 × 512 network is degraded by only 0.022%. If the hot ratio is 0.7, the throughput of the network is degraded by 2.871%. Even if the hot ratio is 0.8, the throughput of the network is degraded by 12.321%. When we put load 0.8 into the cyclic Banyan network, if the hot ratio is 0.6, the throughput of the 512 × 512 network is degraded by 1.668%. If the hot ratio is 0.7, the throughput of the network is degraded by 11.230%. When we put load 0.9 into the cyclic Banyan network, if the hot ratio is 0.6, the throughput of the 512 × 512 network is degraded by 6.840%. If the hot ratio is 0.7, the throughput of the network is degraded by 17.502%. We could also see that the throughput of the 128 × 128 proposed network is 0.99 even when we applied load and hot ratio to the network as shown in Fig. 11. In this case, the throughput is only 0.666% less than that under uniform traffic.

Fig. 11. Normalized throughput versus network size for the cyclic Banyan network under nonuniform traffic.

Fig. 12. Logical structure of the cell resequencing buffer.

C. Cell Sequence Integrity

Cells having the same source and the same destination can reach the output port out of sequence, since they can be transferred through different paths in the cyclic Banyan network. Cells transferred through different paths may experience different delays depending on the paths. By equalizing the network delay encountered by cells, we can eliminate the cell sequence integrity problem. We introduce a cell resequencing buffer with which each output port of the network is equipped. Basically, this cell resequencing buffer is a memory buffer with random read and write, in which the cells are held before delivery, as shown in Fig. 12. The detailed design of the cell resequencing buffer will be presented in the Appendix.

From the simulation results shown in Fig. 13, we can see that the delay of the cells transferred through the 512 × 512 cyclic Banyan network is less than 100 cell times for almost all proper combinations of the input loads and the hot ratios. Furthermore, Fig. 14 shows that the delay encountered by a cell traveling through switching networks of various sizes smaller than 1024 × 1024 is also less than 100 cell times, even under the heavy load and hot ratio pairs.

As a result, when the cyclic Banyan network is employed as the multipath switching network, we can provide an effective cell resequencing buffer using a small resequencing window, where the resequencing window is the number of equalization slots, i.e., the minimum number of


Fig. 13. Delay distribution for various input load and hot ratio.

Fig. 14. Delay distribution for various network size.

time slots that a cell must wait before leaving the network. So we can construct the cell resequencing buffer with a small quantity of hardware, as shown in Fig. 15 in the Appendix. It is notable that the total delay of the cells transferred through the cyclic Banyan network and then through the resequencing buffer is 100 cell times, and the delay variation of the cells is 0 cell times.
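The delay-equalization idea can be illustrated in software: every cell is held at the output until a fixed window has elapsed since it entered the network, so all cells leave with the same total delay and therefore in their original order. The sketch below is not the hardware design of the Appendix; the class name, the assumption that each cell carries its entry time, and the heap-based storage are all introduced here for illustration.

```python
import heapq
from typing import Any, List, Tuple


class ResequencingBuffer:
    """Holds each cell until entry_time + window (delay equalization)."""

    def __init__(self, window: int = 100) -> None:
        self.window = window  # resequencing window in cell times
        self._heap: List[Tuple[int, int, Any]] = []
        self._arrivals = 0    # tie-breaker for equal release times

    def push(self, entry_time: int, cell: Any) -> None:
        """Store a cell that entered the network at entry_time."""
        release_time = entry_time + self.window
        heapq.heappush(self._heap, (release_time, self._arrivals, cell))
        self._arrivals += 1

    def pop_due(self, now: int) -> List[Any]:
        """Return, in release order, all cells whose release time has come."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


if __name__ == "__main__":
    buf = ResequencingBuffer(window=100)
    # Cell B entered one slot after cell A but overtook it in the network.
    buf.push(entry_time=1, cell="B")
    buf.push(entry_time=0, cell="A")
    print(buf.pop_due(100))  # A is released first
    print(buf.pop_due(101))  # then B
```

This only works if no cell spends more than the window inside the network, which is why the observed bound of 100 cell times on the delay matters for the size of the buffer.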

V. CONCLUDING REMARKS

In this paper, we have designed and evaluated a newhigh-performance, large-scale ATM switch, i.e., the cyclicBanyan network that is a class of the deflection self-routingBanyan network. It is a new augmented Banyan network usingfully adaptive self-routing control that exploits all links asalternate paths. By adding extra links between SE’s within thesame stage and extending the self-routing scheme, we presentthe network using all links as alternate paths. Although theproposed network has no internal buffer, it never loses cellwithin the switching network.

To design the proposed network and its self-routing, we usethe topological properties: all SE’s of the Banyan network arearranged in a regular pattern in terms of topology. In otherwords, each stage of the Banyan network is composed ofsequence of thecyclic group realized with SE’s, and stagesalso connected symmetrically through the links between them.We proved such properties of the Banyan networks throughan algebraic formalism in this paper.

The proposed routing scheme is a fully distributed routingscheme that requires a little additional computation in eachSE. The computation is increment/decrement function andshift function for a fixed sized operand ( bits).The proposed self-routing is as simple as that of the Banyannetwork, and all SE’s of proposed network have a uniformstructure.

We also provide performance analyses under uniform trafficpattern and nonuniform one for a quantitative comparison. As aresult of the analysis of the networks under uniform traffic, wehave found that the throughput of 10241024 cyclic Banyannetwork is 3.51 times that of the regular Banyan network,2.15 times that of the replicated-2 Banyan network, 1.51 timesthat of the replicated-4 Banyan network, and even 1.20 timesthat of the replicated-8 Banyan network, whereas the hardwarecomplexity of the network is 1.2 times that of the replicated-2Banyan network. All the normalized delays of the proposednetworks of all sizes are only about 1.7 times that of thegeneric Banyan network of all sizes, respectively. As the sizeof networks is larger, the performance of the cyclic Banyannetwork is very better than the replicated Banyan networks.

From the analysis under nonuniform traffic, we provided a quantitative comparison between the proposed network under uniform traffic and under nonuniform traffic, with several pairs of load and hot ratio as parameters. With a load of 0.7 on the 512 × 512 cyclic Banyan network, the throughput is degraded by only 0.022% when the hot ratio is 0.6, by 2.871% when the hot ratio is 0.7, and by 12.321% even when the hot ratio is 0.8. With a load of 0.8, the throughput is degraded by 1.668% when the hot ratio is 0.6 and by 11.230% when the hot ratio is 0.7. With a load of 0.9, the throughput is degraded by 6.840% when the hot ratio is 0.6 and by 17.502% when the hot ratio is 0.7.

We also present a simple cell resequencing buffer that provides a hardware sliding-window mechanism, so that the proposed network can preserve the sequence integrity of cells that travel along various paths and then arrive at an output port. This simple control mechanism is possible because the cell delay variation of the proposed network is bounded by 100 cell times, as shown in the simulation results. As a result, we presented a switching network whose cell delay is 100 cell times and whose cell delay variation is 0 cell times.

We have introduced only one new network; using the topological properties we found, many other extensions of the Banyan network based on deflection self-routing can of course be derived. We have also considered only Banyan networks built out of 4 × 4 SE’s; however, this scheme can also be applied to networks composed of larger SE’s, e.g., 8 × 8 SE’s.

Fig. 15. The detail design of the cell resequencing buffer.

APPENDIX

DETAIL DESIGN OF THE CELL RESEQUENCING BUFFER

In this appendix, we present the cell resequencing buffer shown in Fig. 15. The cell resequencing buffer is composed of the following: the cell buffer memory, in which the actual cells are stored; the set of buckets, which maintain linked lists of the addresses of the cells stored in the cell buffer memory; the waiting pool of idle addresses of the cell buffer memory; the cell input functions, which calculate the number of time slots a cell must wait before leaving the network, select an input bucket, link an address obtained from the waiting pool to the linked list of that bucket, and store the cell; and the cell output functions, which select the bucket from which the address of the output cell is fetched, fetch the address, and deliver the cell.

We now examine the function of the cell resequencing buffer by reviewing its behavior. Before operation, the maximum delay register is initialized with the estimated maximum delay, and the base bucket address increment register and the output bucket address increment register are each initialized to zero.

According to the cell time clock, an input cell is stored in the cell buffer memory at the rising edge, and an output cell is fetched from the cell buffer memory and delivered to the output at the falling edge. To use this cell resequencing buffer, we need to extend the internal cell format by seven bits. This field stores the total delay time of the cell as it transits the network. The total delay time is calculated by incrementing the field by one each time the cell transits a link between SE’s, and it is updated in the header of the cell before the transit.
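A minimal sketch of this header field follows, in Python. The seven-bit width and the increment per link come from the description above; the names `DELAY_FIELD_BITS` and `advance_delay`, and the choice to raise an error on overflow, are assumptions made for illustration only. Seven bits can count up to 127 link transits.

```python
DELAY_FIELD_BITS = 7
DELAY_FIELD_MAX = (1 << DELAY_FIELD_BITS) - 1  # largest value the field can hold


def advance_delay(header_delay: int) -> int:
    """Return the delay field after the cell transits one more inter-SE link."""
    if not 0 <= header_delay <= DELAY_FIELD_MAX:
        raise ValueError("delay field out of range")
    if header_delay == DELAY_FIELD_MAX:
        # Assumed behavior: the text does not say what happens on overflow.
        raise OverflowError("delay field would exceed seven bits")
    return header_delay + 1


# A cell that crosses five links before reaching the resequencing buffer.
delay = 0
for _ in range(5):
    delay = advance_delay(delay)
print(delay, DELAY_FIELD_MAX)
```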

When a cell arrives at the cell resequencing buffer from the SE of the final stage, the select input bucket function calculates the address of the bucket to which the cell belongs as follows: the value of the maximum delay register, minus the delay time in the cell header, plus the value of the base bucket address increment register. The result is the address of the bucket that maintains the linked list of the addresses of its cells. The address at which the arriving cell itself will be stored is obtained from the idle address pool. The cell is then stored in the cell buffer memory, and its address is linked to the list maintained by the selected bucket.

On the output side, a cell is delivered to the output at each cell time as follows. The address of the bucket maintaining the linked list that holds the address of the appropriate output cell is obtained from the output bucket address increment register. At the falling edge of the cell time clock, the address fetched from the linked list is sent to the memory address register, and the cell stored at this location in the cell buffer memory is sent to the output. As a result, the delay times of all cells are equalized to a fixed delay time, and so we can guarantee the integrity of the cell sequence.
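The behavior described in this appendix can be sketched in software as follows. This is a simplified Python model under stated assumptions, not the hardware design of Fig. 15: we assume that the number of buckets equals the maximum delay plus one, that both bucket address increment registers advance by one per cell time modulo the number of buckets, and that Python deques stand in for the linked lists and the idle address pool. The class and method names, and the `capacity` parameter, are hypothetical.

```python
from collections import deque


class ResequencingBuffer:
    def __init__(self, max_delay, capacity):
        self.max_delay = max_delay            # maximum delay register
        self.num_buckets = max_delay + 1      # assumed bucket count
        self.base = 0                         # base bucket address increment register
        self.out = 0                          # output bucket address increment register
        self.memory = {}                      # cell buffer memory: address -> cell
        self.idle = deque(range(capacity))    # waiting pool of idle addresses
        self.buckets = [deque() for _ in range(self.num_buckets)]

    def store(self, cell, delay):
        """Rising edge: place an arriving cell in the bucket for its exit slot."""
        if not 0 <= delay <= self.max_delay:
            raise ValueError("delay exceeds the maximum delay register")
        # Bucket address = maximum delay - header delay + base register.
        bucket = (self.max_delay - delay + self.base) % self.num_buckets
        address = self.idle.popleft()         # IndexError if the memory is full
        self.memory[address] = cell
        self.buckets[bucket].append(address)

    def fetch(self):
        """Falling edge: deliver at most one cell, then advance both registers."""
        cell = None
        bucket = self.buckets[self.out]
        if bucket:
            address = bucket.popleft()
            cell = self.memory.pop(address)
            self.idle.append(address)         # return the address to the idle pool
        # Assumed: both registers step once per cell time. Buckets that hold
        # several cells at once are not handled specially in this sketch.
        self.base = (self.base + 1) % self.num_buckets
        self.out = (self.out + 1) % self.num_buckets
        return cell


def run(arrivals, max_delay, slots):
    """arrivals maps an arrival slot to (cell, delay accumulated in the network)."""
    buffer = ResequencingBuffer(max_delay, capacity=16)
    departures = []
    for slot in range(slots):
        if slot in arrivals:
            buffer.store(*arrivals[slot])
        cell = buffer.fetch()
        if cell is not None:
            departures.append((slot, cell))
    return departures


# Cells 0, 1, 2 enter the network at slots 0, 1, 2 and arrive out of order.
arrivals = {3: ("cell-1", 2), 5: ("cell-2", 3), 6: ("cell-0", 6)}
departures = run(arrivals, max_delay=8, slots=12)
print(departures)
```

In this example the cells reach the buffer in the order 1, 2, 0 but leave at slots 8, 9, and 10 in the order 0, 1, 2, each exactly eight slots after entering the network, which is the equalized delay the text describes.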

REFERENCES

[1] CCITT, “Broadband aspects of ISDN,” CCITT Recommendation I.121,Blue Book, Geneva, Switzerland, 1989, vol. III.7.

[2] S. Gianatti and A. Pattavina, “Performance analysis of shared-bufferedBanyan networks under arbitrary traffic patterns,” inProc. INFO-COM’93, pp. 943–952.

Page 17: The deflection self-routing Banyan network: a large-scale ATM …hklee.kaist.ac.kr/publications/1999 IEEE Trans. Network... · 2001-07-03 · 588 IEEE/ACM TRANSACTIONS ON NETWORKING,

604 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 7, NO. 4, AUGUST 1999

[3] R. Rooholamini and M. Garver, “Finding the right ATM switch for themarket,” IEEE Comput.,vol. 27, pp. 16–28, Apr. 1994.

[4] E. P. Pathgeb, W. Fischer, C. Hinterberger, E. Wallmeier, and R. Wille-Fier, “The main streetXpresscore services node—A versatile ATMswitch architecture for the full service network,”IEEE J. Select. AreasCommun.,vol. 15, pp. 795–806, June 1997.

[5] W. Willinger, M. S. Taqqu, R. Sherman, and D. V. Wilson, “Self-similarity through high-variability: Statistical analysis of ethernet LANtraffic at the source level,”IEEE/ACM Trans. Networking,vol. 5, pp.71–86, Feb. 1997.

[6] B. Tsybakov and N. D. Georganas, “On self-similar traffic in ATMqueues: Definitions, overflow probability bound, and cell delay dis-tribution,” IEEE/ACM Trans. Networking,vol. 5, pp. 397–410, June1997.

[7] P. T. Gaughan and S. Yalamanchili, “Adaptive routing protocols forhypercube interconnection networks,”IEEE Comput.,vol. 26, pp. 12–24,May 1993.

[8] E. Ayanoglu, “Signal flow graphs for path enumeration and deflectionrouting analysis,” inProc. IEEE GLOBECOM’89,pp. 1022–1029.

[9] C.-T. A. Lea, “Load-sharing Banyan network,”IEEE Trans. Comput.,vol. C-35, pp. 1025–1034, Dec. 1986.

[10] K. Y. Lee and H. Yoon, “The B-network: A multistage interconnectionnetwork with backward links,”IEEE Trans. Comput.,vol. 39, pp.966–969, Apr. 1990.

[11] N. Tzeng, P. Yew, and C. Zhu, “The performance of a fault-tolerantmultistage iterconnection network,” inProc. 1985 Int. Conf. ParallelProcessing,pp. 458–465.

[12] G. R. Goke and G. J. Lipovski, “Banyan networks for partitioning mul-tiprocessor systems,” inProc. 1st Annu. Symp. Computer Architecture,1973, pp. 21–28.

[13] J. Park, H. Yoon, H. Lee, and S. Eun, “The ring-Banyan network: Afault-tolerant multistage interconnection network with an adaptive self-routing,” in Proc. 1992 Int. Conf. Parallel and Distributed Systems,Hsinshu, Taiwan, pp. 196–203.

[14] C. Wu and T. Feng, “On a class of multistage interconnections net-works,” IEEE Trans. Comp.,vol. C-29, pp. 694–702, Aug. 1980.

[15] J. H. Patel, “Performance of processor-memory interconnections formultiprocessors,”IEEE Trans. Comput.,vol. C-30, pp. 771–780, Oct.1981.

[16] J.-H. Park, H. Yoon, and H.-K. Lee, “The cyclic Banyan network:A fault-tolerant multistage interconnection network with the fully-adaptive self-routing,” inProc. 7th IEEE Symp. Parallel and DistributedProcessing,TX, Oct. 1995, pp. 702–710.

[17] R. Venkatesan and H. T. Mouftah, “Balanced gamma network—A newcandidate for broadband packet switch architectures,” inProc. IEEEINFOCOM’92, pp. 2482–2488.

[18] T. D. Morris and E. F. Gehringer, “A cost-effective reliable multipathinterconnection network,”ACM Comput. Architecture News,pp. 45–65,June 1991.

[19] M. Harchol and P. E. Black, “Queuing theory analysis of greedy routingon arrays and tori,” Dept. Elect. Eng. Comput. Sci., Univ. California,Berkeley, CA 94720, Tech. Rep. UCB/CSD 93/756, June 1993.

Jae-Hyun Park (S’91–M’96) received the B.S. degree in computer science from Chung-Ang University, Korea, in 1988, and the M.S. and Ph.D. degrees in computer science from the Korea Advanced Institute of Science and Technology, Taejon, in 1991 and 1995, respectively.

Since 1995, he has been a Senior Engineer with Samsung Electronics Company, Korea. His main research interests include interconnection networks, ATM switching architectures, multiprotocol label switching, parallel computing, distributed/parallel operating systems, and hard real-time operating systems.

He received the 1996 Best Research Paper Award-Golden Prize in 1997, the 1997 Best Research Paper Award-Bronze Prize in 1998, and the 1997 Award of Excellence in Research and Development-Silver Prize in 1998, all from Samsung Electronics Company. His biographical profile was included in the 1998 Who’s Who in the World (New Providence, NJ: Marquis Who’s Who, 1998).

Hyunsoo Yoon received the B.S. degree in electronics engineering from the Seoul National University, Seoul, Korea, in 1979, the M.S. degree in computer science from Korea Advanced Institute of Science and Technology, Taejon, in 1981, and the Ph.D. degree in computer and information science from The Ohio State University, Columbus, in 1988.

During 1978–1980, he was with the Tongyang Broadcasting Company, Korea, and then with Samsung Electronics Company, Seoul, Korea, during 1980–1984. From 1988 to 1989, he was a Member of the Technical Staff with AT&T Bell Labs, Indian Hill, IL. Since 1989, he has been a Professor with the Department of Computer Science, Korea Advanced Institute of Science and Technology. His main research interests include parallel computer architectures, parallel computing, interconnection networks, and high speed communication networks.

Heung-Kyu Lee received the B.S. degree in electronics engineering from the Seoul National University, Seoul, Korea, in 1978, and the M.S. and Ph.D. degrees in computer science from the Korea Advanced Institute of Science and Technology, Taejon, Korea, in 1981 and 1984, respectively.

From 1984 to 1985, he was a Post-Doctoral Fellow at the University of Michigan, Ann Arbor. Since 1986, he has been a Professor in the Department of Computer Science, Korea Advanced Institute of Science and Technology. Since 1990, he has also been a Research Staff Member at the Satellite Technology Research Center, Korea Advanced Institute of Science and Technology. His major interests are real-time fault-tolerant computing, multimedia systems, and satellite remote sensing.