

A New Design Technique for Weakly Indicating Function Blocks

P. Balasubramanian, D.A. Edwards

School of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom.

E-mail: (padmanab, doug)@cs.man.ac.uk.

978-1-4244-2277-7/08/$25.00 ©2008 IEEE

Authorized licensed use limited to: The University of Manchester. Downloaded on July 23, 2009 at 10:27 from IEEE Xplore. Restrictions apply.

Abstract-This paper presents a novel technique for gate-level design of combinatorial logic as weakly indicating function blocks. The input state space associated with a function block expands exponentially with a gradual increase in the number of inputs. As a result, an asynchronous realization would incur a large area overhead. Hence, a novel design methodology for realizing combinational logic as a function block is developed under the discipline of quasi-delay-insensitivity with four-phase handshaking and dual-rail encoding, whilst trying to mitigate the area overhead. The focus is on design adhering to the weakly indicating timing regime. Based on analysis with some combinational benchmarks and widely used logic circuit functionality, the proposed method is found to enable compact realizations and appears to be promising for weakly indicating function block design comprising many inputs and outputs.

I. INTRODUCTION

Digital logic design has been dominated by synchronous solutions for the past several decades. The renewal of interest in, and requirement of, asynchronous design is largely motivated by the need to overcome the problems associated with clocking. Future deep sub-micron technologies would be characterized by irregular parameter variations (voltage and temperature variations, process changes and noise), which could make asynchronous design an attractive solution. It is expected that parametric variance of device delay might reach 35% by 2020 [1]. Since asynchronous circuits are self-timed, they tend to absorb the deviations of device characteristics. Self-timed design is a method for designing asynchronous circuits such that their correct operation depends neither on the speed of their components nor on the delays of the communication wires. The scope for asynchronous design techniques is increasing due to a number of potential advantages:

* Enhanced modularity (permitting design reuse)
* No problems with clock signal skew
* Operational scalability (system may slow down but does not fail)
* Robustness to parametric variations, for example, temperature and supply voltage [2]
* Reduced susceptibility to electro-magnetic interference
* Minimization of power consumption due to the absence of clock activity
* Only the portion of the circuit involved in computation dissipates power [3], paving the way for automatic power-down of unused circuitry.

Unlike synchronous circuits, where a global clock is used to control when the outputs of the combinational logic are read, asynchronous circuits must operate correctly independent of any delays in circuit elements and as such, they dispense with the need for a global clock. The vast majority of existing design automation flows target synchronous circuits. Even when asynchronous designs leverage existing tool flows, they introduce large area overheads. As a result, asynchronous function block implementation requires special techniques in comparison with the synthesis of synchronous digital circuits.

The remainder of this paper is organized as follows. Section 2 describes the function block. Section 3 discusses some relevant works. Section 4 describes the novel weak-indication function block design technique with the help of a block diagram and also lists the logic decomposition constraints necessary for an asynchronous style implementation. The proposed method is illustrated by means of a case study in section 5. Results corresponding to the different functionalities considered are given in section 6. In the section after that, we make the concluding remarks and highlight directions of our further work.

II. FUNCTION BLOCK - CHARACTERIZATION

A function block is the asynchronous equivalent of a digital combinatorial circuit [4]: it computes one or more outputs from a set of input signals. Apart from computing the desired function outputs based on the function inputs, a function block must also be transparent to the handshaking that is implemented by its surrounding latches. The transparency to handshaking is what makes function blocks different from combinational logic circuits. Besides, it should have a valid combinatorial structure, meaning that it should have no dangling inputs or outputs and no feed-back signal paths.

Function blocks can be strongly indicating (SI) or weakly indicating (WI) depending on how they behave with respect to the signaling transparency. In this work, we restrict ourselves to the WI timing regime and discuss how to arrive at realizations suitable for implementation with standard cells along with custom designed C-elements, forming part of the cell library. A function block is categorized as WI if it starts to compute and produce valid (empty) outputs as soon as possible, i.e. when some but not all of its input signals might have become valid (empty). Such a function block never produces all valid (all empty) outputs until after all inputs have become valid (empty). This behavior is defined by Seitz's weak conditions [10]. It has also been proved in [10] that if independent function blocks satisfy the weak-indication constraints, then they can be combined to form larger function blocks, suitable for design of cascaded arithmetic circuits.

In the case of circuits using dual-rail (DR) signals, function blocks implicitly indicate completion. In contrast to bundled-data encoding, DR encoded data does not require a separate request wire; instead it is embedded within the data wires. Moreover, each data wire x is now represented by two data wires x0 and x1. A transition on the x0 wire indicates that a zero has been transmitted, while a transition on the x1 wire indicates that a one has been transmitted. Simultaneous transitions on both x0 and x1 are not allowed and hence are considered invalid. Since the request wire is embedded within the data wires, a transition on either x0 or x1 informs the receiver about the validity of the data. In other words, DR circuits always produce functionally correct outputs independent of the wire delays, whereas the bundled-data protocol is not delay-insensitive (DI). The 4-phase protocol, also known as the return-to-zero (RTZ) protocol, is similar to the 2-phase protocol except for the fact that it is level-sensitive. Therefore, the request wire has to go low before it can go high again, leading to an intermediate RTZ phase between two successive data values. The advantage of the 4-phase protocol is that logic processing elements can be made much simpler and familiar logic gates can be used, whereas designs based on the 2-phase protocol would require complex control circuits to process transitions.

Circuits designed following the four-phase protocol DR approach are generally quasi-delay-insensitive (QDI), since the class of DI circuits is rather small [5]. QDI is as robust as the DI class to variable operating conditions and transistor variations [7]. A circuit is QDI if and only if the production rule set describing it is stable and non-interfering [8]. It is also an attractive design style, mainly for the simple timing closure and analysis that it permits. QDI circuit design assumes that both operators and wires can take an arbitrary time (finite and positive) to switch, except for certain wires that form isochronic forks [6] [9] (the weakest compromise to DI). If the delays to all the end-points of a forking wire are identical, then the wire-fork is called isochronic. The isochronic fork assumption has been defined in terms of acknowledgement by Martin in [5] as: "In an isochronic fork, when a transition on one output is acknowledged and thus completed, the transitions on all outputs are acknowledged and thus completed".

III. PREVIOUS WORK

DI (including QDI) circuits can be identified as pertaining either to control circuits or to datapaths. While control circuits are those that realize the sequencing of actions of a computation, datapaths are those that deal with the manipulation and transmission of data. Datapath design, in general, is more difficult than control circuit design, and relatively much of the published research has concentrated on the design of control circuitry. Though a generalized method for design of delay-insensitive datapath circuits has been discussed in [11], it largely corresponds to the design of an adder circuit. In fact, the transistor level implementation of an adder circuit using CMOS technology in [11] represents the ultimate for the case of a WI adder design. Recent works have also primarily targeted implementation of arithmetic circuits, especially integer addition [12] [13] [14]. The DIMS approach [12], widely adopted for strongly indicating adder designs, can be modified as in [4] to suit the weak-indication timing model. But it is not suitable for synthesizing combinational logic with many inputs, as it severely exacerbates the area overhead, besides giving rise to many unacknowledged transitions in the circuit (circuit orphans) for naive C-element decomposition. The notion of a quad-gate has been introduced in [14]. Based on published results, it can be observed that the quad-gate based implementation of a DI carry-lookahead adder with speedup (DICLASP) [13], mentioned as quad-DICLASP in [14], consumes more transistors and dissipates significantly more power than the original DICLASP of [13], while reporting a marginal improvement in delay for the best case addition. For the worst case 32-bit addition, there is degradation in speed. The merits and demerits of quad-gate based implementation for realization of arbitrary combinational logic have not yet been reported in current literature. In [15], early output logic has been proposed, which has a flavor of the weak-indication phenomenon in that some/all valid outputs may be produced without requiring the arrival of all valid inputs. However, the converse does not hold, i.e. weak-indication is not strictly synonymous with the early output logic of [15]. Though theoretical models based on Petri nets for the WI timing regime were made available in [16], general purpose datapath synthesis algorithms for this timing model have not yet been developed [21]. Further, as mentioned in [20], weak-indication logic synthesis involves relatively more complexity and is difficult. This paper aims to address precisely this issue.

IV. WI FUNCTION BLOCK DESIGN

As mentioned in the previous section, much of the previous work has addressed the design of arithmetic circuits, paying little attention to realization of an arbitrary non-regenerative logic functionality. An asynchronous implementation of an arbitrary combinational logic usually requires the generation of all minterms (standard product terms), which is O[2^n] for 'n' inputs, causing an input state space explosion. Hence the main issue dealt with in this paper is the efficient realization of any arbitrary combinational logic as a QDI weak-indication function block pertaining to the discipline of 4-phase handshaking with DR encoding using standard library cells (including 2-input and 3-input C-elements). This poses a real and significant challenge for asynchronous logic design as, in general, the area overhead becomes massive in comparison with synchronous combinational logic synthesis solutions. In contrast with a SI function block design, which exhibits worst case latency as in a synchronous circuit, actual case latency can be expected from a WI function block. In order to facilitate good area optimized implementations (measured in terms of two-input gate count [19] and transistor cost [20]), we bank on widely used conventional multiple-output minimization strategies [18] and utilize them for translation to asynchronous logic while complying with the weak-indication property by incorporating novel logic transformations and decomposition techniques (which guarantee speed-independence), while simultaneously satisfying the monotonic cover constraint (MCC) [23].

A. Terminologies and Definitions

Before we mention the main issues and constraints involved in synthesis and decomposition, new terminologies are elucidated alongside existing ones for clarity.

1) Support set and Dependency set of a Boolean cube

A literal is a variable or its inversion (say, k or k'). A Boolean cube, C, is a conjunction of literals. The support set of a cube, S(C), entails the enumeration of all the literals that are a function of the cube, while a cube dependency set, D(C), entails enumeration of all the support set literals in their actual form that the cube depends upon for its evaluation to a logic '1'. For a cube C assigned with ef'g'h, its S(C) = {e,f,g,h} and D(C) = {e,f',g',h}.

2) Cubes Support Intersection set and Cubes Dependency Intersection set

The intersection of the support sets of two cubes (dependency sets of two cubes) is characterized by the literals that are common to both the cubes. This is referred to by an intersection set, CSI (CDI). For example, with D(C1) and D(C2) described by {a,b,c',d} and {a,b',c', …} respectively, CSI[S(C1), S(C2)] = {a,b,c} and CDI[D(C1), D(C2)] = {a,c'} and hence their respective cardinalities are: |CSI[S(C1), S(C2)]| = 3 and |CDI[D(C1), D(C2)]| = 2.

3) Covering and Covered cubes [17]; CE

We identify a cube C1 as fully covering another cube C2 if D(C2) ⊆ D(C1), and hence the following equality holds: |CDI[D(C1), D(C2)]| = |D(C2)| and CE = |D(C2)|.

4) DNF and MODNF

A Boolean formula is said to be in disjunctive normal form (DNF) if it is a disjunction of clauses, each of which is a conjunction of literals. The mutually orthogonal DNF (MODNF) of a logic function consists of a set of conjunctions which are mutually orthogonal cubes [24], i.e. the cubes do not overlap. Every Boolean cube is mutually orthogonal to every other Boolean cube in a MODNF. Formally, when two Boolean cubes C1 and C2 are mutually orthogonal, the following inequalities hold: |CSI[S(C1), S(C2)]| ≥ 1 and |CDI[D(C1), D(C2)]| ≥ 0.

B. Synthesis issues and Decomposition constraints

* Obtain a minimized solution for all the m function outputs (in positive phase) of the multiple-input multiple-output functionality description, originally in single-rail format. These correspond to the expressions for true function outputs, after DR encoding.
* Obtain minimized expressions for all the m function outputs (in negative phase). These correspond to the Boolean equations for false outputs after DR encoding.
* Transform both the true and false function output expressions into MODNF. This is accomplished through redundancy insertion using the identity law and also by recursive application of the absorption and distributive axioms of Boolean algebra.
* Where C1 and C2 are not mutually orthogonal and |CSI[S(C1), S(C2)]| = 0 and S(C1) is singleton (S(C2) could also be singleton); by the absorption law, we therefore get: |D(C2)| = |S(C2)| = |S(C1)| + 1 = |D(C1)| + 1.
* Where C1 and C2 are not mutually orthogonal and if |CSI[S(C1), S(C2)]| = 0 and S(C1) is not singleton (S(C2) is also not singleton), then redundancy insertion through the identity axiom is followed by application of the distributive and absorption laws. Subsequently, additional cube(s) get introduced into the logic network.
* Consider the whole functionality as a global combinatorial network with DR inputs and outputs.
* For k cubes, perform cube dependency intersection with each of the remaining (k-1) cubes in the network.
* If for two cubes C1 and C2, the equality relationship |CDI[D(C1), D(C2)]| = |D(C1)| = |D(C2)| holds, then both C1 and C2 are replaced by an intermediate node symbol and it is substituted back into its parent nodes. This analogy can be extended to any number of cubes in the network.
* In the above scenario, CE = |D(C1)| = |D(C2)|. If the equality is not satisfied, then opt for the case where |CDI[D(C1), D(C2)]| = |D(C1)| - 1 = |D(C2)| - 1. Then replace the shared literals by an intermediate node and substitute it back into its parent nodes via a conjunction. However, this is permissible only when HD(C1, C2) = 1 in order to preserve speed-independence, where HD signifies the Hamming distance between the cube vectors.
* A cube in the transformed Boolean network is to be chosen only once; also, an intermediate node (cube) resulting from the network should be chosen only once, whether for substitution or for covering. The covering cube absorbs the covered cube.
* If a cube covers more than one cube in the network, then priority for a particular cube to be covered would be based upon a maximal value of CE for the covered cube.
* A covered cube can be the covering cube for another cube, and a covered cube can be substituted into multiple cubes, provided CE for the other covering cubes with respect to this covered cube is equal to the cardinality of the dependency set of the covered cube.
* Uncovered smaller cubes (bound dictated by maximum fan-in) might exist in the final network.
* Synchronization of the DR encoded inputs (for both data and spacer values) is achieved through block I1 (shown in figure 1), which has 2n DR inputs and a single output.
* Decomposed multi-level realization of the function outputs, which satisfy the MCC, is represented by block I2. The DR encoded inputs to the block may be only a subset of the actual inputs, whereas there would certainly be 2m DR outputs.
* Block I3 basically ensures that the weak-indication criteria are satisfied by ensuring synchronization between the output of block I1 and any two outputs of block I2, enabling self-timed operation. But it is recommended not to combine those two complementary data rails of a function output which are already associated with the longest path delay, as it would further increase their delay. The logical values of the inputs from block I2 to block I3 are indeed preserved on its output side. The number of actual function outputs extracted from block I2 is (2m-2).

The above logic decomposition constraints facilitate weak-indication function block realization using only C-elements and OR gates. The implementation is partly technology-dependent in the sense that the synthesis and decomposition process facilitates incorporating 3-input C-elements besides 2-input Muller elements. This is preferable as the depth of the synthesized network would be reduced. The block diagram of the proposed method is shown in figure 1.

V. ILLUSTRATIVE EXAMPLE

The weak-indication circuit implementation of a 4-to-2 encoder with 8 DR inputs (i11, i10, i21, i20, i31, i30, i41, i40) and 4 DR outputs (y11, y10, y21, y20) is portrayed by figure 2 as an illustration of the proposed method. Of these, (i11, i10) correspond to the most significant DR inputs, while (y11, y10) represent the most significant DR outputs. The portions of the circuit corresponding to the three functional blocks of figure 1 have been clearly marked and identified as blocks I1, I2 and I3 in figure 2.

To comment on the area complexity of the resultant multi-level speed-independent network, for the sake of simplicity, the 2-input gate count metric used in [19] and the transistor cost metric used in [20] have been employed here. In case of the former, an n-input C-element (CEn) is equated to (n-1) 2-input C-elements ((n-1)CE2), a CE2 is considered as equal to four 2-input gates (4G2) and an n-input gate (Gn) is replaced by (n-1)G2. In the latter case, where a strongly indicating network is synthesized using only CE2 and 2-input OR gate (OR2) components, the cost of a CE2 is equated to 10 and that of an OR2 to 6. While we have adopted the former approach as it is for estimating the gate count, to compute the transistor cost we have additionally considered the cost of a 3-input C-element as equal to 12, in line with the library specification. But we have decomposed all n-input OR gates into (n-1) OR2 gates, for ease of calculation. This results in a difference between actual and estimated transistor cost, as highlighted in Table I, for a DR weak-indication 4-to-2 encoder. However, the difference is not reflected in the gate count parameter as it is normalized.

TABLE I
TWO-INPUT GATE COUNT AND TRANSISTOR COST FOR WI 4:2 ENCODER

Logic function | 2-input gate count | Transistor cost
4-to-2 Encoder | 79 (actual), 79 (estimated) | 212 (actual), 220 (estimated)

We define the generalized optimization rules that are valid for disjunctive operations: a DNF sub-expression can be isolated from a logic network without creating an orphan if and only if:

* it is associated with both (only) a variable and its complement separately (or)
* at least one of its fork outputs is fed to an OR-gate input.

VI. RESULTS AND DISCUSSION

Analysis has been carried out by considering some IWLS benchmarks [22] and a few widely used digital logic circuits, and these are listed in Table II. Extracting SOP sub-expressions would enable further reduction in transistor cost for logic functions. On the whole, for the circuit functionalities listed in Table II, the proposed method enables compact realizations (known thus far) for combinational benchmarks. However, this excludes SOP sub-expression extraction and variable substitution.
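The two area metrics described in Section V reduce to simple arithmetic. The following is a minimal Python sketch of that arithmetic, not the authors' tool: the function names and the sample netlist (a list of hypothetical (gate kind, fan-in) pairs) are illustrative assumptions, while the unit costs are the ones quoted in the text (CE2 = four 2-input gates; CE2 = 10, CE3 = 12, OR2 = 6 transistor-cost units).

```python
# Unit costs quoted in Section V; everything else here is an illustrative assumption.
G2_PER_CE2 = 4   # one 2-input C-element counts as four 2-input gates
COST_CE2 = 10    # transistor cost of a 2-input C-element
COST_CE3 = 12    # transistor cost of a 3-input C-element (library value)
COST_OR2 = 6     # transistor cost of a 2-input OR gate


def gate_count(netlist):
    """Two-input gate equivalents: CEn -> (n-1) CE2, CE2 -> 4 G2, Gn -> (n-1) G2."""
    total = 0
    for kind, fan_in in netlist:
        if kind == "C":
            total += (fan_in - 1) * G2_PER_CE2
        elif kind == "OR":
            total += fan_in - 1
        else:
            raise ValueError(f"unknown gate kind: {kind}")
    return total


def estimated_transistor_cost(netlist):
    """Transistor cost with CE2/CE3 taken as cells and every n-input OR gate
    decomposed into (n-1) OR2 gates, as done for the estimates in the text."""
    total = 0
    for kind, fan_in in netlist:
        if kind == "C" and fan_in == 2:
            total += COST_CE2
        elif kind == "C" and fan_in == 3:
            total += COST_CE3
        elif kind == "OR":
            total += (fan_in - 1) * COST_OR2
        else:
            raise ValueError(f"unsupported gate: {kind}{fan_in}")
    return total


# Hypothetical netlist: two CE2, one CE3 and one 4-input OR gate.
example = [("C", 2), ("C", 2), ("C", 3), ("OR", 4)]
print(gate_count(example))                 # 4 + 4 + 8 + 3 = 19
print(estimated_transistor_cost(example))  # 10 + 10 + 12 + 18 = 50
```

Decomposing wide OR gates into OR2 chains is what makes the estimate differ from the actual library cost, which is the gap Table I reports for the 4-to-2 encoder.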

Figure 1. Block diagram of the proposed method. Block I1: synchronization of dual-rail inputs (2n inputs). Block I2: decomposed monotonic logic. Block I3: synchronization to satisfy weak-indication functional constraints (4 inputs, 2 outputs).
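The synchronizing role of block I1 can be illustrated with a behavioural model of a Muller C-element. The sketch below is an assumption about one common way to build such a synchronizer (OR the two rails of each input, then merge the results with a C-element); the internal structure of the paper's block is not recoverable from the figure residue, so the structure and all names here are hypothetical.

```python
class CElement:
    """Behavioural Muller C-element: the output rises when all inputs are 1,
    falls when all inputs are 0, and otherwise holds its previous value."""

    def __init__(self):
        self.out = 0

    def update(self, inputs):
        if all(inputs):
            self.out = 1
        elif not any(inputs):
            self.out = 0
        return self.out


def rail_pair_valid(rails):
    """A dual-rail pair (x1, x0) carries data when exactly one rail is high;
    OR-ing the rails detects this because (1, 1) is never driven."""
    x1, x0 = rails
    return x1 | x0


def synchronize(c_element, dual_rail_inputs):
    """Hypothetical block-I1-style detector: 2n rails in, one output out."""
    return c_element.update([rail_pair_valid(p) for p in dual_rail_inputs])


c = CElement()
spacer = [(0, 0), (0, 0)]
partial = [(1, 0), (0, 0)]   # only the first input has arrived
data = [(1, 0), (0, 1)]      # both inputs valid: values 1 and 0

trace = [synchronize(c, s) for s in (spacer, partial, data, partial, spacer)]
print(trace)  # [0, 0, 1, 1, 0]
```

The trace shows the hysteresis that matters for indication: the output rises only after every input is valid and falls only after every input has returned to the spacer.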


Figure 2. Optimal implementation of a weak-indication 4-to-2 Encoder (circuit regions marked as blocks I1, I2 and I3).
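The rail names used for the encoder (for example i11 and i10) follow the dual-rail convention of Section II: each bit x travels on a pair of wires (x1, x0), and in the four-phase protocol every data word is followed by an all-zero spacer. A minimal sketch of that encoding, with hypothetical helper names and made-up input words:

```python
SPACER = (0, 0)  # empty state between two data values (return-to-zero)


def encode(bit):
    """Map a single-rail bit to its dual-rail pair (x1, x0)."""
    return (1, 0) if bit else (0, 1)


def decode(rails):
    """Return 0 or 1 for valid data, None for the spacer; (1, 1) is illegal."""
    if rails == (1, 1):
        raise ValueError("both rails high: invalid dual-rail code word")
    if rails == SPACER:
        return None
    return 1 if rails == (1, 0) else 0


def four_phase_stream(words):
    """Interleave every data word with an all-spacer word."""
    stream = []
    for word in words:
        stream.append([encode(b) for b in word])
        stream.append([SPACER] * len(word))
    return stream


# Two 4-bit input words, one bit per encoder input i1..i4.
stream = four_phase_stream([[1, 0, 0, 0], [0, 0, 1, 0]])
print(stream[0])                       # [(1, 0), (0, 1), (0, 1), (0, 1)]
print([decode(r) for r in stream[1]])  # [None, None, None, None]
```

Because validity is carried by the rails themselves, a receiver needs no separate request wire to tell data from spacer.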

For higher order functions, a selective consideration of the input space is necessitated in order to facilitate speed-independent decompositions. This adds some more complexity to the decomposition process, which is already quite complex as it decomposes several gates in the entire implementation space simultaneously. Hence speed-independent decompositions of larger functional specifications need to be analyzed by selectively considering a portion of the entire input space, employing valid multiple-input C-element decomposition rules.

Additionally, replacement of certain Muller elements by logical conjunction operators is presently being considered. This is of interest and importance as it will tend to further reduce the gap (in terms of area overhead) between synchronous and asynchronous logic realizations.

TABLE II
TWO-INPUT GATE EQUIVALENT COUNT AND TRANSISTOR COST FOR WI REALIZATIONS OF COMBINATIONAL FUNCTIONS

Combinatorial logic and its specification | 2-input gate count | Transistor cost
newtpla1 (10 inputs, 2 outputs) | 167 | 480
con1 (7 inputs, 2 outputs) | 183 | 514
newcwp (4 inputs, 5 outputs) | 143 | 444
wim (4 inputs, 7 outputs) | 130 | 422
newbyte (5 inputs, 8 outputs) | 181 | 566
squar5 (5 inputs, 8 outputs) | 500 | 1356
dc1 (4 inputs, 7 outputs) | 193 | 614
c17 (5 inputs, 2 outputs) | 115 | 300
newapla2 (6 inputs, 7 outputs) | 337 | 914
4-bit priority encoder (4 inputs, 4 outputs) | 55 | 168
8-bit priority encoder (8 inputs, 8 outputs) | 128 | 422
3-to-8 decoder with active high Enable (4 inputs, 8 outputs) | 104 | 350
4-to-16 decoder with active high Enable (5 inputs, 16 outputs) | 213 | 758
5-to-32 decoder with active high Enable (6 inputs, 32 outputs) | 442 | 1670

VII. CONCLUSION AND FURTHER WORK

This paper presents a novel technique for gate level WI function block design of an arbitrary combinatorial logic, within the ambit of QDI with DR encoding and 4-phase handshaking. To our knowledge, there has not been any method formulated to address the specific problem dealt with in this paper. Improving the efficacy of the synthesis technique and trying to address larger specifications forms a part of our present work. Power/performance evaluation of the designs pertaining to the proposed heuristic would be our next step.

ACKNOWLEDGMENT

The authors acknowledge the support of EPSRC, UK for this research through the SEDATE project grant EP/D052238/1.

REFERENCES

[1] SIA's ITRS Report 2005, http://www.itrs.net

[2] A.J. Martin, S.M. Burns, T.K. Lee, D. Borkovic and P.J. Hazewindus, "The first asynchronous microprocessor: the test results," Computer Architecture News, vol. 17, no. 4, pp. 95-110, June 1989.

[3] K. van Berkel, R. Burgess, J. Kessels, A. Peeters, M. Roncken and F. Schalij, "A fully asynchronous low power error corrector for the DCC player," Proc. IEEE ISSCC, pp. 88-89, 1994.

[4] J. Sparso and S. Furber (Eds.), Principles of Asynchronous Circuit Design - A Systems Perspective, Kluwer Academic Publishers, 2001.

[5] A.J. Martin, "The limitations to delay-insensitivity in asynchronous circuits," Proc. 6th MIT Conference, pp. 263-278, MIT Press, 1980.

[6] A.J. Martin, "Compiling communicating processes into delay-insensitive VLSI circuits," Distributed Computing, vol. 1, no. 4, pp. 226-234, 1986.

[7] T.M. Mak, "Is CMOS more reliable with scaling," presented at IEEE International On-Line Testing Workshop, July 2002.

[8] R. Manohar and A.J. Martin, "Quasi-delay-insensitive circuits are Turing-complete," Caltech CS Technical Report, CS-TR-95-11, 1995.

[9] K. van Berkel, "Beware the isochronic fork," Integration, the VLSI journal, vol. 13, no. 2, pp. 103-128, June 1992.

[10] C.L. Seitz, in Introduction to VLSI Systems, Chapter 7 - System Timing, C.A. Mead and L.A. Conway (Eds.), Addison-Wesley, 1990.

[11] A.J. Martin, "Asynchronous datapaths and the design of an asynchronous adder," Formal Methods in System Design, vol. 1, no. 1, pp. 119-137, July 1992.

[12] J. Sparso and J. Staunstrup, "Delay-insensitive multi-ring structures," Integration, the VLSI journal, vol. 15, no. 3, pp. 313-340, Oct. 1993.

[13] F.-C. Cheng, S. Unger and M. Theobald, "Self-timed carry-lookahead adders," IEEE Trans. on Computers, vol. 49, no. 7, pp. 659-672, July 2000.

[14] X. Li and J.W. Sanders, "Efficient function block implementation of self-timed circuits," Proc. 47th IEEE Intl. Midwest Symposium on Circuits and Systems, vol. 2, pp. 269-272, 2004.

[15] C. Brej, "Early Output Logic and Anti-Tokens," PhD thesis, University of Manchester, 2006.

[16] A. Yakovlev, M. Kishinevsky, A. Kondratyev, L. Lavagno and M.P. Koutny, "On the models for asynchronous circuit behaviour with OR causality," Formal Methods in System Design, vol. 9, no. 3, pp. 189-234, November 1996.

[17] B. Teel and D. Wilde, "A logic minimizer for VLSI PLA design," Proc. 19th IEEE Design Automation Conference, pp. 156-162, 1982.

[18] R.K. Brayton, A.L. Sangiovanni-Vincentelli, C.T. McMullen and G.D. Hachtel, Logic Minimization Algorithms for VLSI Synthesis, Kluwer Academic Publishers, 1984.

[19] I. David, R. Ginosar and M. Yoeli, "An efficient implementation of Boolean functions as self-timed circuits," IEEE Trans. on Computers, vol. 41, no. 1, pp. 2-11, January 1992.

[20] W.B. Toms and D.A. Edwards, "Efficient synthesis of speed independent combinational logic circuits," Proc. 10th Asia and South-Pacific Design Automation Conference, pp. 1022-1026, 2005.

[21] SEDATE project, Case for Support. Proposed Research, 2006.

[22] K. McElvain, "IWLS '93 Benchmark Set: Version 4.0," distributed as part of the MCNC International Workshop on Logic Synthesis, May 1993.

[23] A. Kondratyev, M. Kishinevsky and A. Yakovlev, "Hazard-free implementation of speed-independent circuits," IEEE Trans. on CAD of Integrated Circuits and Systems, vol. 17, no. 9, pp. 749-771, September '98.

[24] A.D. Zakrevski, Logical Synthesis of Cascade Circuits (translated from Russian), Energiya, Moscow, 1974.
