1 design space exploration for power-efficient mixed-radix ling adders chung-kuan cheng computer...

40
1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California, San Diego

Upload: candace-collins

Post on 12-Jan-2016

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

1

Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders

Chung-Kuan ChengComputer Science and Engineering

Depart. University of California, San Diego

Page 2: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

2

Outline Prefix Adder Problem

Background & Previous Work Extensions: High-radix, Ling

Our Work Area/Timing/Power Models Mixed-Radix (2,3,4) Adders ILP Formulation

Experimental Results Future Work

Page 3: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

Prefix Adder – Challenges

Logical Levels

Wire Tracks

Fanouts

Area

Physical placement

Detail routing

New Design Scope

Timing

Gate Cap

Wire Cap

Gate sizingBuffer insertion

Signal slope

Input arrival time

Output require time

Increasing impact of physical design and concern of power.

Power

Static power

Dynamic power

Power gating

Activity Probability

Page 4: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

4

Binary Addition Input: two n-bit binary numbers

and , one bit carry-in Output: n-bit sum and one bit

carry out Prefix Addition: Carry generation &

propagation

011... aaan

011... bbbn 0c

011... sssn

nc

)(

:Propagate

:Generate

1

iiii

iiii

iii

iii

bacs

cpgc

bap

bag

Page 5: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

5

Prefix Addition – Formulation

iiiiii bapbag Pre-processing:

Post-processing:

Prefix Computation:

iii

iii

cps

cPGc

0]0:[]0:[1

]:1[]:[]:[

]:1[]:[]:[]:[

kjjiki

kjjijiki

PPP

GPGG

Page 6: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

6

1234

12:13:14:1

Prefix Adder – Prefix Structure Graph

gpi

pi

G[i:0]

si

biai

GP[i, j] GP[j-1, k]

GP[i, k]

gp generator

sum generator

GP cell

Pre-processing

Post-processing

Prefix Computation

Page 7: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

7

Previous Works – Classical prefix adders

12345678

12:13:14:16:17:18:1 5:1

Brent-Kung:Logical levels: 2log2n–1 Max fanouts: 2Wire tracks: 1

12345678

12:13:14:16:17:18:1 5:1

Kogge-Stone:Logical levels: log2nMax fanouts: 2Wire tracks: n/2

12345678

12:13:14:16:17:18:1 5:1

Sklansky:Logical levels: log2nMax fanouts: n/2Wire tracks: 1

Page 8: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

8

High-Radix Adders

Each cell has more than two fan-in’s

Pros: less logic levels 6 levels (radix-2) vs. 3 levels (radix-4)

for 64-bit addition Cons: larger delay and power in

each cell

Page 9: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

9

Radix-3 Sklansky & Kogge-Stone Adder

David Harris, “Logical Effort of Higher Valency Adders”

Page 10: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

10

Ling Adders

iiiiii bapbag Pre-processing:

Post-processing:

Prefix Computation:

iii

iii

cps

cPGc

0]0:1[]0:1[

]:1[]:[]:[

]:1[]:[]:[]:[

kjjiki

kjjijiki

PPP

GPGG

Prefix Ling

* *[ 1:0] [ 1:0] 1( )i i i i i is G t G t p

,i i i i i i

i i i

g a b p a b

t a b

1*

]1:[1*

]1:[ , iiiiiiii ppPggG* * * *[ : ] [ : ] [ 1: 1] [ 1: ]i k i j i j j kG G P G

*]:1[

*]:[

*]:[ kjjiki PPP

*]0:1[1 iii Gpc

Page 11: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

11

An 8-bit Ling Adder

H1H2H3H4H5H6H7H8

12345678

0 1

si

ai bipi-1

Hi

gi pi-1 pi-2

G*i P*

i-1

gi-1

Page 12: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

12

Area Model

Distinguish physical placement from logical structure, but keep the bit-slice structure.

Logical view Physical view

Bit position

Logical level

Bit position

Physical level

Compact placement

12345678 12345678

Page 13: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

13

Timing Model Cell delay calculation:

pfd Effort Delay Intrinsic Delay

hgf

Logical EffortElectrical Effort = Cout/Cin= (fanouts+wirelength) / size

Intrinsic properties of the cell

Page 14: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

14

Power Model Total power consumption:

Dynamic power + Static Power Static power: leakage current of device

Psta = *#cells Dynamic power: current switching

capacitancePdyn = Cload

is the switching probability = j (j is the logical level*)

cellsCjPPP loadstadyntotal # * Vanichayobon S, etc, “Power-speed Trade-off in Parallel Prefix Circuits”

Page 15: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

15

ILP Formulation OverviewStructure variables: •GP cells•Connections (wires)•Physical positions

Capacitance variables: •Gate cap•Vertical wire cap•Horizontal wire cap

Timing variables: •Input arrival time•Output arrival time

Power Objective

ILPILOG CPLEX

Optimal Solution

Page 16: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

16

Integer Linear Programming (ILP) ILP: Linear Programming with integer

variables. Difficulties and techniques:

Constraints are not linear Linearize using pseudo linear constraints

Search Space too large Reduce search space

Search is slow Add redundant constraints to speedup

Page 17: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

17

ILP – Integer Linear Programming

Linear Programming: linear constraints, linear objective, fractional variables.

Integer Linear Programming: Linear Programming with integer variables.

LP Optimal

Constraints

ILP Optimal

Page 18: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

ILP – Pseudo-Linear Constraint

Minimize: x3

Subject to: x1 300

x2 500

x3 = min(x1, x2)

Minimize: x3

Subject to: x1 300

x2 500

x3 x1

x3 x2

x3 x1 – 1000 b1 (1)

x3 x2 – 1000 (1 – b1) (2)

b1 is binary

Problem: ILP formulation:

LP objective: 0ILP objective: 300

• A constraint is called pseudo-linear if it’s not effective until some integer variables are fixed.

• Pseudo-linear constraints mostly arise from IF/ELSE scenarios• binary decision variables are introduced to indicate true or false.

Page 19: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

19

ILP Solver Search Procedure

0Root (all vars are fractional)

b1

b2

b3

b4

2

‘0’ ‘1’infeasible

‘0’ ‘1’

3feasible

(current best)

infeasible

Minimize F(b1, b2, b3, b4, f1,…) bi is binary

It is VERY helpful if ILP objective is close to LP objective

2

‘0’ ‘1’

4

‘0’ ‘1’

3 2

4

55

‘0’ ‘1’

Cut

2 Bound (Smallest candidate)

Page 20: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

Interval Adjacency Constraint

H1H2H3H4H5H6H7H8

12345678

(7,3): Interval [7,1]

(3,2): Interval [3,1]

(7,2): Interval [7,4]

Must be adjacent,i.e. 4 = 3 + 1

(column id, logic level)

Page 21: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

21

Linearization for Interval Adjacency Constraint

(i, j)

(i, h) (k1, l1) (k2, l2)

wl wr1 wr2

],[ ),(),(R

hiL

hi yy ],[ )1,1()1,1(R

lkL

lk yy ],[ )2,2()2,2(R

lkL

lk yy

],[ ),(),(R

jiL

ji yy

11 if 1),(),( (i,j,k,l) wrwl(i,j,h) yy Llk

Rhi

1 if 1),,,(1),(

),( wl(i,j,h) lkjiwrkylk

Rhi

11 ),,,(1),(

),( wl(i,j,h))(nlkjiwrkylk

Rhi

11 ),,,(1),(

),( wl(i,j,h))(nlkjiwrkylk

Rhi

iyLji ),(

Linearize

Pseudo Linear

Left interval bound equal to column index

Page 22: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

22

Search Space Reduction Ling’s adder:

separate odd and even bits

Double the bit-width we are able to search H1H2H3H4H5H6H7H8

12345678

Page 23: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

23

Redundant Constraints Cell (i,j) is known to have logic level j before

wire connection Assume load is MinLoad (fanout=1 with

minimum wire length):

MinLoadjP ji ),(

Cell (i,j) has a path of length j-1 Assume each cell along the path has MinLoad

)(),( MinLoadLEPDjT ji

Page 24: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

24

Experiments – 16-bit Uniform Timing

Page 25: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

Experiments – 16-bit Uniform Timing

25

Page 26: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

26

Min-Power Radix-2 Adder (delay= 22, power = 45.5FO4 )

1

1

2

2

3

3

4

4

5

5

6

6

7

7

8

8

9

9

10

10

11

11

12

12

13

13

14

14

15

15

16

16

Page 27: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

27

Min-Power Radix-2&4 Adder (delay=18, power = 29.75FO4 )

1

1

2

2

3

3

4

4

5

5

6

6

7

7

8

8

9

9

10

10

11

11

12

12

13

13

14

14

15

15

16

16

Radix-2 Cell Radix-4 Cell

Page 28: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

28

Min-Power Mixed-Radix Adder (delay=20, power = 28.0FO4)

1

1

2

2

3

3

4

4

5

5

6

6

7

7

8

8

9

9

10

10

11

11

12

12

13

13

14

14

15

15

16

16

Radix-2 Cell Radix-4 Cell Radix-3 Cell

Page 29: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

29

Experiments – 16-bit Non-uniform Time (Mixed Radix)

ILP is able to handle non-uniform timings

Ling adders are most superior in increasing arrival time– faster carries

Page 30: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

Increasing Arrival Time (delay=35.5, power = 27.0FO4 )

30

Page 31: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

Decreasing Arrival Time (delay=34.5, power = 30.5FO4)

31

Page 32: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

Convex Arrival Time (delay=35.9, power = 32.4FO4 )

32

Page 33: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

Increasing Required Time (delay=34.5, power = 30.5FO4)

33

Page 34: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

Decreasing Required Time (delay=36.5, power = 32.5FO4)

34

Page 35: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

Convex Required Time (delay=36.5, power = 32.5FO4)

35

Page 36: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

36

Experiments – 64-bit Hierarchical Structure (Mixed-Radix)

Handle high bit-width applications 16x4 and 8x8

ILP Block ILP Block ILP Block ILP Block

ILP Block

a1b1a16b16a17b17a32b32a33b33a48b48a49b49a64b64

…... …... …... …...

Level 1

Level 2

…... …... …... …...

…... …... …... …...

GP*[64:50]GP*[48:34] GP*[32:18] GP*[16:2]

GP*[1:1]GP*[17:17]GP*[33:33]GP*[49:49]

…... …... …...H64 H49 H48 H33 H32 H17 H16 H1

Page 37: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

37

Experiments – 64-bit Hierarchical Structure

TSL: a 64-bit high-radix three-stage Ling adder

V. Oklobdzija and B. Zeydel, “Energy-Delay Characteristics of CMOS Adders”,in High-Performance Energy-Efficient Microprocessor Design, pp. 147-170, 2006

Page 38: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

38

ASIC Implementation - Results 64-bit hierarchical design (mixed-radix)

by ILP vs. fast carry look-ahead adder by Synopsys Design Compiler

TSMC 90nm standard cell library was used

Page 39: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

39

Future Work

ILP formulation improvement Expected to handle 32 or 64 bit

applications without hierarchical scheme

Optimizing other computer arithmetic modules Comparator, Multiplier

Page 40: 1 Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders Chung-Kuan Cheng Computer Science and Engineering Depart. University of California,

40

Q & A

Thank You!