ECE 551 Digital System Design & Synthesis Lecture 12 “To synthesis, and beyond…”

Upload: orinda

Post on 23-Feb-2016


Page 1: ECE 551 Digital System Design & Synthesis

ECE 551 Digital System Design & Synthesis

Lecture 12 “To synthesis, and beyond…”

Page 2: ECE 551 Digital System Design & Synthesis

So, the thing finally synthesized!
- So, what have you created so far?
  - A list of the required hardware cells
  - A netlist describing their interconnections
  - A simulation model that hopefully reflects reality more accurately than the pure HDL-level simulation
    - Includes semi-accurate logic delays

Page 3: ECE 551 Digital System Design & Synthesis

Now What?
- After synthesis, we have a netlist mapped to our specific tech library
  - ROMs
  - PLDs
  - FPGAs
  - Standard cells
  - Custom logic
- Choose implementation platform based on cost and performance requirements

Page 4: ECE 551 Digital System Design & Synthesis

ROMs
- Use like a GIANT truth table
- Can be inefficient for simple logic!
  - Gates: specify just the 1’s, or specify just the 0’s
  - ROM: has to specify both! All outputs for all possible minterms

address     data
d c b a     x y z
0 0 0 0     0 0 0
0 0 0 1     0 1 1
0 0 1 0     0 1 1
0 0 1 1     0 1 0
0 1 0 0     0 1 1
0 1 0 1     0 1 0
0 1 1 0     0 1 0
0 1 1 1     0 1 1
1 0 0 0     0 1 1
1 0 0 1     0 1 0
. . . .     . . .
1 1 0 0     0 1 0
1 1 0 1     0 1 1
1 1 1 0     0 1 1
1 1 1 1     1 1 0
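As a minimal sketch of the "giant truth table" idea, the Python below fills a 16-entry ROM whose address is (d, c, b, a) and whose data word is (x, y, z). The three functions used here (x = AND, y = OR, z = XOR of the address bits) are an assumption that happens to agree with the rows listed on the slide; `build_rom` and `read_rom` are hypothetical helper names.

```python
from itertools import product

def build_rom():
    """Return 16 data words (x, y, z), indexed by the address bits d c b a."""
    rom = []
    for d, c, b, a in product((0, 1), repeat=4):
        x = d & c & b & a   # assumed function: AND of all address bits
        y = d | c | b | a   # assumed function: OR of all address bits
        z = d ^ c ^ b ^ a   # assumed function: XOR (parity) of the address bits
        rom.append((x, y, z))
    return rom

def read_rom(rom, d, c, b, a):
    """A ROM read is an indexed lookup: the address selects one stored word."""
    return rom[(d << 3) | (c << 2) | (b << 1) | a]

rom = build_rom()
print(read_rom(rom, 0, 0, 0, 1))
print(read_rom(rom, 1, 1, 1, 1))
```

Note that every one of the 16 words is stored, even though each function could be described far more compactly with gates.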

Page 5: ECE 551 Digital System Design & Synthesis

ROMs
- Use like a GIANT truth table

[Figure: the truth table drawn as a ROM block, with address inputs d, c, b, a and data outputs x, y, z.]

Page 6: ECE 551 Digital System Design & Synthesis

ROMs
- Use like a GIANT truth table
- 64 Kbit ROM: 8K entries x 8 bits (13 address lines)
  - 8 Boolean functions using any of these 13 1-bit variables

[Figure: ROM block with 13 address inputs (a through m) and 8 data outputs (s through z).]

Page 7: ECE 551 Digital System Design & Synthesis

ROMs
- Use like a GIANT truth table
- 64 Kbit ROM: 8K entries x 8 bits (13 address lines)
  - Two 4-bit functions of three 4-bit variables (plus a flag)
  - Other options possible

[Figure: ROM block with 13 address inputs (a through m) and 8 data outputs (s through z).]
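The sizing arithmetic behind these slides can be checked with a short sketch. A ROM with n address lines and m data bits stores 2^n words of m bits each; `rom_bits` is a hypothetical helper name, and the 16- and 20-input cases are included only to illustrate how quickly the size grows.

```python
def rom_bits(addr_lines, data_bits):
    """Total storage in bits: one data word for every possible address."""
    return (2 ** addr_lines) * data_bits

for n in (13, 16, 20):
    bits = rom_bits(n, 8)
    print(f"{n} address lines x 8 data bits = {bits} bits ({bits // 1024} Kbit)")
```

With 13 address lines and 8 data bits this gives 8192 words and 65536 bits, which is the 64K part described above. Each added input doubles the storage.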

Page 8: ECE 551 Digital System Design & Synthesis

ROM Logical Structure

[Figure: n inputs, addr[0] through addr[n-1], drive a non-programmable address decoder that forms 2^n minterms (word lines) from the inputs. The word lines feed an OR memory array (2^n x m) that produces the m outputs w[m-1] through w[0].]

Page 9: ECE 551 Digital System Design & Synthesis

ROM Circuit Structure

[Figure: an n-to-2^n address decoder (non-programmable, with En_bar enable) takes inputs addr[0] through addr[n-1] and drives the word lines (product terms). In the OR-plane (2^n x m), each memory cell is a mask-programmed pulldown transistor link. The bit-lines have pull-up resistors to VDD and form the outputs D[m-1] through D[0].]

Page 10: ECE 551 Digital System Design & Synthesis

Erasable Programmable ROM (EPROM)

[Figure: an n-to-2^n address decoder (non-programmable, with En_bar enable) takes inputs addr[0] through addr[n-1] and drives word lines min0 through min(2^n - 1) into the OR-plane (2^n x m). Each memory cell uses a programmable floating gate. The bit-lines w[m-1] through w[0] have pull-up resistors to VDD.]

Page 11: ECE 551 Digital System Design & Synthesis

Flash Memory
- A flash memory is an electrically erasable PROM configured with additional circuitry to allow in-circuit erasure/programming of blocks of memory (e.g., 16-64 Kbytes).
- Widely used as the program storage memory for computers and embedded systems, as well as data storage memory (audio, video, file systems)
  - High endurance: 100k/1M+ erase cycles
- Flash-based storage (SSDs) is cost-competitive with magnetic disks up to several GB, with no mechanical shock issues and much better random-access times.
- Some FPGAs use flash memory instead of SRAM to allow instant-on behavior and to avoid exposing IP.

Page 12: ECE 551 Digital System Design & Synthesis

Comparison of ROMs

Device   Programming mode           Erase mode
EEPROM   In-circuit, byte-by-byte   In-circuit, byte-by-byte
FLASH    In-circuit                 In-circuit, bulk or sector
EPROM    Out-of-circuit             Out-of-circuit, bulk, UV light
PROM     Custom by user (OTP***)    None
ROM*     Mask                       None

Complexity and Cost
Access Time: 150 ns

Example
- TMS47C256: 32K x 8, CMOS
- AT27BV400: 256K x 16 or 512K x 8
- Intel 2732: 4K x 8, NMOS, 45 ns
- Intel 2864: 8K x 8, NMOS
- AT49LV1024: 64K x 16, NMOS, 70 ns**

* Requires high volume to offset NRE
** Programming time: 500 ms
*** One-time programmable

Page 13: ECE 551 Digital System Design & Synthesis

ROMs
- Cheap: couple bucks each
- Reuse EEPROMs with different truth tables
- Non-volatile: keep values when power is gone
- Very slow compared to gates (memory read)
- Combinational-only
- Limited to fairly simple designs (e.g., 20 or fewer inputs) due to exponential scaling
- ROMs are good for complex operations that use few variables (trigonometry, matrix inversion, etc.)
- They are often used in combination with other types of logic

Page 14: ECE 551 Digital System Design & Synthesis

PLDs
- Programmable Logic Devices
  - PLA (Programmable Logic Array): programmable AND and OR arrays
  - PAL (Programmable Array Logic): programmable AND array and fixed OR array
- Programming done at points where wires cross

[Figure: a PLA and a PAL side by side. In each, the inputs a, !a, b, !b, c, !c, d, !d cross the product-term lines, which feed the outputs x and y.]

Page 15: ECE 551 Digital System Design & Synthesis

PLDs
- Programming points where wires cross
  - x = a b c + a d
  - y = a b c d + a b d + b c d

[Figure: the PLA and PAL from the previous slide, with inputs a, !a, b, !b, c, !c, d, !d, product terms, and outputs x and y, programmed for these equations.]
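The AND-plane / OR-plane structure can be modeled in a few lines. This is only a sketch: the product terms are taken literally from the equations on the slide, and any complement bars in the original figure did not survive in this transcript, so treating every literal as uncomplemented is an assumption. The names `and_plane`, `or_plane`, and `pla` are hypothetical.

```python
def and_plane(inputs, product_terms):
    """Each product term maps a variable name to the value it requires."""
    return [all(inputs[var] == want for var, want in term.items())
            for term in product_terms]

def or_plane(products, connections):
    """Each output ORs together the product terms it is connected to."""
    return {out: any(products[i] for i in term_ids)
            for out, term_ids in connections.items()}

# Programmed crossing points (assumed: all literals uncomplemented).
TERMS = [
    {"a": 1, "b": 1, "c": 1},          # 0: a b c
    {"a": 1, "d": 1},                  # 1: a d
    {"a": 1, "b": 1, "c": 1, "d": 1},  # 2: a b c d
    {"a": 1, "b": 1, "d": 1},          # 3: a b d
    {"b": 1, "c": 1, "d": 1},          # 4: b c d
]
CONNECTIONS = {"x": [0, 1], "y": [2, 3, 4]}

def pla(a, b, c, d):
    products = and_plane({"a": a, "b": b, "c": c, "d": d}, TERMS)
    return or_plane(products, CONNECTIONS)

print(pla(1, 0, 0, 1))
print(pla(0, 1, 1, 1))
```

In a PLA both `TERMS` and `CONNECTIONS` are programmable; in a PAL only `TERMS` would be, with each output wired to a fixed group of product terms.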

Page 16: ECE 551 Digital System Design & Synthesis

PLDs
- Moderate per-unit price: 1s to 10s of $
- Most are re-programmable
- Faster than ROMs
- Relatively slow compared to gates
  - Programming points cause delay
- Limited complexity
  - “Complex” PLDs have sequential ability, but are still too limited for very complex designs
  - Crossbar design scales poorly with number of inputs
- Good when you don’t need the complexity of an FPGA and want to save money.

Page 17: ECE 551 Digital System Design & Synthesis

FPGAs
- Field Programmable Gate Array
  - Temporary (Flash/SRAM based)
  - Permanent (Anti-fuse), not as common
- Pros
  - Allow for very complex implementations
  - Generally reusable (upgrades/bug-fixes/prototype)
  - Low non-recurring engineering (NRE) costs
- Cons
  - Expensive per unit (10s-100s of $)
  - Slower than gates
    - Programming points
- MPGA: mask-programmable (one time)

Page 18: ECE 551 Digital System Design & Synthesis

Programming an FPGA
- Most designs based on SRAM
  - During configuration, the SRAM bits in the device are written with the desired values
  - Note that this means that your IP is being passed into the FPGA in a serial stream for the whole world to see!
- Different circuits implemented based on values set in SRAM bits that form LUTs, control multiplexers, and make routing connections

Page 19: ECE 551 Digital System Design & Synthesis

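Routing Elements
- Programmable connection
- Programmable bypass

[Figure: a programmable connection, in which a programming bit P joins Routing Resource #1 and Routing Resource #2; and a programmable bypass, in which a programming bit P controls whether SIGNAL passes through the DFF on its way to OUT.]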

Page 20: ECE 551 Digital System Design & Synthesis

Logic Elements
- Look-Up Table (LUT)
  - Essentially a very small memory
  - Most common size is 4-input LUT

[Figure: a 3-input LUT. Programmed bits P1 through P8 occupy entries 0-7; inputs a, b, c select one entry, which drives OUT.]

Page 21: ECE 551 Digital System Design & Synthesis

Logic Elements
- Look-Up Table (LUT) Example
  - OUT = a XOR b XOR c

[Figure: 3-input LUT with inputs a, b, c and output OUT. Stored bits for entries 0-7: 0 1 1 0 1 0 0 1.]

Page 22: ECE 551 Digital System Design & Synthesis

Logic Elements
- Look-Up Table (LUT) Example
  - OUT = ab + ac + bc

[Figure: 3-input LUT with inputs a, b, c and output OUT. Stored bits for entries 0-7: 0 0 0 1 0 1 1 1.]
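The two LUT examples can be reproduced with a small model. This is a sketch under one stated assumption: the entry index is formed as a·4 + b·2 + c (a is the most significant select bit). Both example functions are symmetric in their inputs, so the stored bits do not depend on that choice. `make_lut` and `lut_read` are hypothetical names.

```python
def make_lut(func, n_inputs=3):
    """Return the LUT contents: one stored bit per input combination."""
    contents = []
    for index in range(2 ** n_inputs):
        # Unpack the index into input bits, most significant bit first.
        bits = [(index >> (n_inputs - 1 - i)) & 1 for i in range(n_inputs)]
        contents.append(func(*bits) & 1)
    return contents

def lut_read(contents, *inputs):
    """Evaluating the logic is just a memory read at the address of the inputs."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return contents[index]

parity = make_lut(lambda a, b, c: a ^ b ^ c)
majority = make_lut(lambda a, b, c: (a & b) | (a & c) | (b & c))
print(parity)    # [0, 1, 1, 0, 1, 0, 0, 1]
print(majority)  # [0, 0, 0, 1, 0, 1, 1, 1]
```

The same hardware implements either function; only the eight stored bits differ.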

Page 23: ECE 551 Digital System Design & Synthesis

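Logic Elements
- Look-Up Table (LUT)
  - Extremely flexible in implementing logic
    - Can implement any function of its inputs!
  - Larger and slower than just using gates

[Figure: a 3-input LUT. Programmed bits P1 through P8 occupy entries 0-7; inputs a, b, c select one entry, which drives OUT.]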

Page 24: ECE 551 Digital System Design & Synthesis

FPGA Logic Structure
- “Cell” or “logic block”:
  - 1 or more LUTs (generally 4-input)
  - At least one D flip-flop
  - Possibly fast carry logic
- Connect several logic blocks to form circuit

[Figure: a logic block with inputs I1 through I4 feeding a 4-LUT, carry logic with Cin and Cout, a DFF, and the output OUT.]

Page 25: ECE 551 Digital System Design & Synthesis

Xilinx 4000 Configurable Logic Block (CLB)

Page 26: ECE 551 Digital System Design & Synthesis

Xilinx 4000 FPGA (# of CLBs not to scale)

[Figure: an array of CLBs interleaved with switch matrices, crossed by vertical and horizontal long lines and surrounded by a ring of IOBs.]

Page 27: ECE 551 Digital System Design & Synthesis

FPGA Summary
- Allow for complex implementations
- Generally reusable (upgrades/bugfixes/prototype)
- Low non-recurring engineering (NRE) costs
- Relatively expensive per unit (10s-100s of $)
- Slower than pure gates (programming points), but FPGAs are normally first to the latest technology
- Newer FPGAs incorporate memories, multipliers, peripherals, and even processors all on the same chip

Page 28: ECE 551 Digital System Design & Synthesis

FPGA Trends
- Hardware specialization
  - Memory block hierarchies
  - I/O interfaces
    - High-speed serial I/O
  - Clock management
  - Hardware for DSP (MAC units)
- Intellectual Property (IP) cores
  - Hard-cores
  - Soft-cores
  - http://www.altera.com/products/ip/ipm-index.html
- Conversion to mask-programmed devices
  - Altera HardCopy, Xilinx EasyPath
- Current Technology Examples...

Page 29: ECE 551 Digital System Design & Synthesis

Xilinx Virtex-5
- Xilinx’s nearly top-of-the-line FPGA
- 65nm process technology
- 550MHz RAM blocks
- 6-input LUTs
- Serial connectivity
  - Ethernet MACs
  - Rocket I/O serial, 3.25Gbps
  - PCI Express endpoint
- Enhanced DSP blocks (25x18, 48b accum)
- 1760 pin BGA with 1200 I/O
- EasyPath

Page 30: ECE 551 Digital System Design & Synthesis

Xilinx Virtex-5 Applications

Page 31: ECE 551 Digital System Design & Synthesis

Xilinx Virtex-5 Family

Page 32: ECE 551 Digital System Design & Synthesis

Altera Stratix III

Page 33: ECE 551 Digital System Design & Synthesis

Stratix III

Page 34: ECE 551 Digital System Design & Synthesis

Stratix III

Page 35: ECE 551 Digital System Design & Synthesis

Altera Stratix III

Page 36: ECE 551 Digital System Design & Synthesis

Altera NIOS

Page 37: ECE 551 Digital System Design & Synthesis

Altera NIOS

Page 38: ECE 551 Digital System Design & Synthesis

Altera NIOS

Page 39: ECE 551 Digital System Design & Synthesis

Stratix III vs. Virtex-5
http://www.altera.com/literature/wp/wp-01007.pdf

Page 40: ECE 551 Digital System Design & Synthesis

Stratix III vs. Virtex-5

Page 41: ECE 551 Digital System Design & Synthesis

More Current Products
- Actel FPGAs
  - Flash-based design eliminates configuration time
  - Less susceptible to radiation-induced upsets
  - Also manufactured in antifuse technology

Page 42: ECE 551 Digital System Design & Synthesis

Mask-Programmable Gate Arrays
- Mask-programmable (MPGAs)
- Fixed logic elements, metal routing added

[Figure: gate-array layout. Base cells (transistor / gate) sit at fixed spacing, with metal interconnect placed in channels between cells.]

Page 43: ECE 551 Digital System Design & Synthesis

MPGAs
- Cheap per-unit pricing ($1s-$10s)
- Fast compared to ROMs/PLDs/FPGAs
- Simpler mask than Standard Cell (routing only)
  - Fixed gates available
- High non-recurring engineering (NRE) cost: design time, mask fabrication...
  - $10Ks-$100Ks
- Best for medium-to-large quantities
- Used for medium-to-high-volume designs, or hardware that must be faster than FPGA

Page 44: ECE 551 Digital System Design & Synthesis

Standard Cells
- Gates and other small structures
- Can also use macroblocks
  - Groups of pre-optimized cells
  - Larger custom-layout structures
  - Better logic density

From: http://www.zuraleff.com/layout

Page 45: ECE 551 Digital System Design & Synthesis

Standard Cell Layouts

[Figure: standard-cell layout. Cells (gate, flip-flop, 1-bit adder, …) and megacells sit at adjustable spacing, with metal interconnect placed in channels between cells.]

Page 46: ECE 551 Digital System Design & Synthesis

IC Layout Styles

Technologies in terms of layout styles:

[Figure: two layout styles side by side. Standard Cell: cells (gate, flip-flop, 1-bit adder, …) and megacells at adjustable spacing. Gate Array: base cells (transistor / gate) at fixed spacing. In both, metal interconnect is placed in channels between cells.]

Page 47: ECE 551 Digital System Design & Synthesis

Standard Cells
- Cheap per-unit pricing ($1s-$10s)
- Achieve better logic density than MPGA
- Fast compared to ROMs/PLDs/FPGAs
- High NREs (design time, mask fabrication...)
  - $100Ks-$10Ms
  - More expensive masks than Gate Arrays
- Used for large quantities and/or performance-critical operations
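The NRE versus per-unit trade-off across these platforms comes down to a break-even volume. The sketch below assumes a simple linear cost model (total = NRE + unit cost × volume); the dollar figures are hypothetical values picked from within the ranges quoted on the slides, not data from the lecture, and `break_even_volume` is a made-up helper name.

```python
import math

def total_cost(nre, unit_cost, volume):
    """Assumed linear model: one-time NRE plus a fixed cost per unit."""
    return nre + unit_cost * volume

def break_even_volume(nre_a, unit_a, nre_b, unit_b):
    """Smallest volume at which option B (higher NRE, cheaper units)
    costs no more than option A (lower NRE, pricier units)."""
    if unit_a <= unit_b or nre_b <= nre_a:
        raise ValueError("expects A with lower NRE and higher unit cost than B")
    return math.ceil((nre_b - nre_a) / (unit_a - unit_b))

# Hypothetical numbers: an FPGA at $50 per unit with no NRE,
# versus standard cells at $5 per unit with $1M of NRE.
volume = break_even_volume(0, 50, 1_000_000, 5)
print(volume)
print(total_cost(0, 50, volume), total_cost(1_000_000, 5, volume))
```

Below the break-even volume the FPGA is cheaper overall; above it the standard-cell design wins, before considering speed, power, or time-to-market.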

Page 48: ECE 551 Digital System Design & Synthesis

Custom Logic
- Manual layout
- Extremely high NRE
  - Huge design time!
  - Even longer verification time
- Maximum performance and density
- PLD/FPGA physical hardware is custom logic
  - They sell a LOT of them!
  - You don’t have to amortize all of their NRE, just part

Page 49: ECE 551 Digital System Design & Synthesis

Hardware Implementations
- Making the right platform choice is one of the most important decisions for a design project’s success
- There is no one “best” method
- Tradeoffs between cost, speed, time-to-market, upgradeability, power efficiency
- Technological changes are shifting traditional design choices. Engineers must be ready.

Page 50: ECE 551 Digital System Design & Synthesis

Hardware Trends
- Standard Cell & Custom getting more expensive
  - Validation is getting harder with smaller gates and more complex designs, and is not scaling well w/Moore’s Law.
- Licensing of IP is being used to counteract NRE
  - “Hard” (layout) and “Soft” (HDL) IP cores
  - ARM architecture a great example

Page 51: ECE 551 Digital System Design & Synthesis

Hardware Trends
- FPGAs are getting faster and bigger
  - Big enough to implement a lot of designs that used to require Standard Cells
  - Lots of built-in IP for connectivity: Ethernet, USB, SATA
- Power is becoming a significant driver
  - Moore’s Law scaling survives for logic density but is dying for total power consumption
  - More computing devices are battery powered, and batteries are not keeping pace with Moore’s Law

Page 52: ECE 551 Digital System Design & Synthesis

Technology Mapping
- Generally part of synthesis
- Use different tools / components based on standard cells vs. FPGA target
- Divides your circuit into basic building blocks.

Page 53: ECE 551 Digital System Design & Synthesis

Tech Mapping: Standard Cells
- Need to select your library
  - Which cells you’re using
  - Which macro-cells / specialized structures
- In this class, we’re using:
  - TSMC 65/45/40 nm cell libraries
- Tech mapping then implements your netlist in terms of the available cells
- How do you choose?

Page 54: ECE 551 Digital System Design & Synthesis

Tech Mapping: Standard Cells
- Example boolean equation: z = a b c + c d + e
- Example cell library: 2-input NAND, INV
- Resulting tech-mapped circuit:

[Figure: gate-level circuit with inputs a, b, c, d, e and output z.]
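One possible mapping of z = a b c + c d + e onto a library of 2-input NANDs and inverters can be checked exhaustively. The gate arrangement below is an illustrative choice (5 NAND2, 3 INV), not necessarily the circuit drawn on the slide; `nand2`, `inv`, and the intermediate net names are hypothetical.

```python
from itertools import product

def nand2(p, q):
    return 1 - (p & q)

def inv(p):
    return 1 - p

def z_reference(a, b, c, d, e):
    """The original Boolean equation."""
    return (a & b & c) | (c & d) | e

def z_mapped(a, b, c, d, e):
    """The same function using only NAND2 and INV cells."""
    ab = inv(nand2(a, b))         # a b
    n_abc = nand2(ab, c)          # !(a b c)
    n_cd = nand2(c, d)            # !(c d)
    s = nand2(n_abc, n_cd)        # a b c + c d  (De Morgan)
    return nand2(inv(s), inv(e))  # s + e        (De Morgan)

equivalent = all(z_reference(*v) == z_mapped(*v)
                 for v in product((0, 1), repeat=5))
print(equivalent)
```

A real tech mapper would weigh several such coverings against the area and delay of each library cell.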

Page 55: ECE 551 Digital System Design & Synthesis

Tech Mapping: FPGAs
- Need to know building blocks of the FPGA
  - LUT size (if uses LUTs)
  - Any special resources (multipliers, RAM blocks)
- Tech mapping then implements your netlist in terms of those building blocks

Page 56: ECE 551 Digital System Design & Synthesis

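Tech Mapping: FPGAs
- Example boolean equation: z = a b c + c d + e
- Example basic block: 4-input LUT
- Resulting tech-mapped circuit:

[Figure: the gate-level circuit with inputs a, b, c, d, e and output z, covered by two 4-input LUTs, LUT #1 and LUT #2.]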

Page 57: ECE 551 Digital System Design & Synthesis

Tech Mapping: FPGAs
- Example boolean equation: z = a b c + c d + e
- Example basic block: 4-input LUT
- Resulting tech-mapped circuit:

[Figure: the same circuit covered by two 4-input LUTs. LUT #1 computes y = a b c; LUT #2 computes z = y + c d + e.]
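The two-LUT mapping can be verified the same way. In this sketch LUT #1 computes y = a b c (its fourth input is unused and tied to 0, an assumption) and LUT #2 computes z = y + c d + e from the four inputs y, c, d, e. `make_lut4` and `lut4` are hypothetical helper names.

```python
from itertools import product

def make_lut4(func):
    """Build the 16 stored bits of a 4-input LUT from a Python function."""
    return [func((i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1) & 1
            for i in range(16)]

def lut4(contents, i3, i2, i1, i0):
    """Read the stored bit addressed by the four inputs."""
    return contents[(i3 << 3) | (i2 << 2) | (i1 << 1) | i0]

LUT1 = make_lut4(lambda a, b, c, unused: a & b & c)   # y = a b c
LUT2 = make_lut4(lambda y, c, d, e: y | (c & d) | e)  # z = y + c d + e

def z_mapped(a, b, c, d, e):
    y = lut4(LUT1, a, b, c, 0)   # fourth input tied low (assumption)
    return lut4(LUT2, y, c, d, e)

def z_reference(a, b, c, d, e):
    return (a & b & c) | (c & d) | e

mapping_ok = all(z_mapped(*v) == z_reference(*v)
                 for v in product((0, 1), repeat=5))
print(mapping_ok)
```

The function has five inputs, so it cannot fit in a single 4-input LUT; two is the minimum for this LUT size.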

Page 58: ECE 551 Digital System Design & Synthesis

Now What?
- So you’ve:
  - Designed your hardware in Verilog.
  - Chosen your hardware implementation (std. cells, FPGA, etc.)
- How do you get from a netlist to silicon?
  - VLSI CAD (“Physical Design”)

Page 59: ECE 551 Digital System Design & Synthesis

VLSI CAD Flow

[Figure: flow chart.]
Verified HDL Description -> Translation -> Generic Netlist -> Technology Mapping (with Cell Library / FPGA Description) -> Post-Synthesis Netlist -> Partition & Floorplan -> Place -> Route
- Std. cells: Mask Gen. -> ...To Fab!
- FPGA: Config. Bits -> …Program!

Page 60: ECE 551 Digital System Design & Synthesis

Partitioning & Floorplanning
- Sometimes you have BIG circuits
  - Makes placement take a long time
  - Yields poor results (too large a solution space)
- Use partitioning and floorplanning
  - Partitioning: Divide netlist into partitions
  - Floorplanning: Assign partitions to chip regions
  - Place regions separately
  - Benefit: Small problems are easier to solve well than large ones
- What’s the Disadvantage?

Page 61: ECE 551 Digital System Design & Synthesis

Partitioning Example

[Figure: a netlist of twelve interconnected blocks, A through L.]

How might we choose to form 3 partitions?

Page 62: ECE 551 Digital System Design & Synthesis

Partitioning Example - Bad

[Figure: the netlist of blocks A through L divided into three partitions.]

Page 63: ECE 551 Digital System Design & Synthesis

Partitioning
- We want to try to make our partitions as independent as possible.
  - Independent = fewer outside connections
- Why?
  - Want to keep wires short
  - Try to place partitions adjacent to the partitions they interconnect with
  - If we have a lot of interconnections, this may not be easy/possible
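"Fewer outside connections" can be made concrete as a cut size: the number of connections whose endpoints land in different partitions. The connectivity of the A through L example did not survive in this transcript, so the edge list below is entirely hypothetical, as are the two groupings and the names `cut_size` and `assign`.

```python
def cut_size(edges, assignment):
    """Count connections whose two endpoints are in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])

def assign(groups):
    """Map each block name to the index of the partition that contains it."""
    return {block: idx for idx, group in enumerate(groups) for block in group}

# Hypothetical netlist: three tightly connected clusters joined by two wires.
EDGES = [
    ("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
    ("E", "F"), ("F", "G"), ("E", "G"), ("G", "H"),
    ("I", "J"), ("J", "K"), ("K", "L"), ("I", "L"),
    ("D", "E"), ("H", "I"),
]

better = assign(["ABCD", "EFGH", "IJKL"])  # follows the clusters
worse = assign(["AEIJ", "BFKL", "CDGH"])   # splits every cluster

print(cut_size(EDGES, better))
print(cut_size(EDGES, worse))
```

Practical partitioners also balance partition sizes; minimizing the cut alone would put everything in one partition.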

Page 64: ECE 551 Digital System Design & Synthesis

Partitioning Example - Bad

[Figure: the netlist of blocks A through L divided into three partitions.]

Page 65: ECE 551 Digital System Design & Synthesis

Partitioning Example - Better

[Figure: the netlist of blocks A through L divided into three partitions.]

Page 66: ECE 551 Digital System Design & Synthesis

Floorplanning
- OK, so we’ve divided our problem up into partitions
- Now, figure out where partitions should be placed relative to one another
  - Assign partitions to regions of the silicon / FPGA
- Try to avoid long wires between partitions
- Don’t want to have to route wires through too many other partitions
  - Wastes area in those partitions

Page 67: ECE 551 Digital System Design & Synthesis

Floorplanning Example

[Figure: nine partitions, numbered 1 through 9, with their interconnections.]

Page 68: ECE 551 Digital System Design & Synthesis

Floorplanning Example
- Try to arrange partitions to minimize cross-partition routing

[Figure: partitions 1 through 9 assigned to regions of the chip.]

Eat your heart out, Sudoku.

Page 69: ECE 551 Digital System Design & Synthesis

Placement
- Need to assign physical locations to cells/LUTs
  - If partitioning: relative to the partition boundaries
  - Otherwise: relative to the chip boundaries
- Common goal
  - Reduce total wirelength of placed circuit

Page 70: ECE 551 Digital System Design & Synthesis

Placement
- Standard Cells:
  - Choosing a row for each cell
  - Choosing a location within the row for each cell
- FPGAs:
  - Choosing which physical LUT implements each netlist LUT
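Placers need a cheap estimate of "total wirelength" before any routing exists. One widely used estimate is half-perimeter wirelength (HPWL): for each net, the width plus height of the bounding box around its pins. The lecture does not say which metric its tools use, so HPWL is a swapped-in illustration here, and the cell names, nets, and coordinates are hypothetical.

```python
def hpwl(net, positions):
    """Half-perimeter of the bounding box around all cells on one net."""
    xs = [positions[cell][0] for cell in net]
    ys = [positions[cell][1] for cell in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_wirelength(nets, positions):
    return sum(hpwl(net, positions) for net in nets)

# Hypothetical netlist: each net is the tuple of cells it connects.
NETS = [("u1", "u2"), ("u2", "u3", "u4"), ("u1", "u4")]

# Two candidate placements, as (x, y) grid locations for each cell.
compact = {"u1": (0, 0), "u2": (1, 0), "u3": (1, 1), "u4": (0, 1)}
spread = {"u1": (0, 0), "u2": (3, 2), "u3": (0, 2), "u4": (3, 0)}

print(total_wirelength(NETS, compact))
print(total_wirelength(NETS, spread))
```

A placer would search over assignments of cells to rows or LUT sites for one that lowers this total.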

Page 71: ECE 551 Digital System Design & Synthesis

Routing
- Have locations for all the cells/LUTs in the netlist
- Now need to connect them together to actually make the circuit
- Different techniques for std. cell vs. FPGA
- Divided into:
  - Global
  - Detailed (local)

Page 72: ECE 551 Digital System Design & Synthesis

Global Routing
- Find a rough path for each net
- Figure out what areas a signal passes through

Page 73: ECE 551 Digital System Design & Synthesis

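Detailed Routing: Std. Cells
- Connect the cells within the global regions
- Common goal: minimize channel width

[Figure: a routing channel of a given channel width, with net numbers on the pins along its two edges.]
Pins on one edge:      1 2 2 4 4 0 3 0 4
Pins on opposite edge: 2 4 4 3 0 0 3 3 1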
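A standard lower bound on channel width is the channel density: the largest number of nets whose horizontal spans overlap at any one column. The sketch below applies it to the two rows of pin numbers from the slide, assuming they are the top and bottom edges of the channel read left to right, and that 0 means "no pin" (both are assumptions about the lost figure). `channel_density` is a hypothetical name.

```python
def channel_density(top, bottom):
    """Max number of nets whose column spans overlap at a single column."""
    spans = {}
    for column, pins in enumerate(zip(top, bottom)):
        for net in pins:
            if net == 0:          # assumed convention: 0 = unconnected pin
                continue
            lo, hi = spans.get(net, (column, column))
            spans[net] = (min(lo, column), max(hi, column))
    return max(
        sum(1 for lo, hi in spans.values() if lo <= column <= hi)
        for column in range(len(top))
    )

TOP = [1, 2, 2, 4, 4, 0, 3, 0, 4]
BOTTOM = [2, 4, 4, 3, 0, 0, 3, 3, 1]

print(channel_density(TOP, BOTTOM))
```

The density only bounds the number of tracks from below; vertical constraints between pins in the same column can force a real router to use more.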

Page 74: ECE 551 Digital System Design & Synthesis

Detailed Routing: FPGAs
- Assign signals in netlist to:
  - Wires
  - Switchbox points
- Fixed set of available resources
  - Can’t “widen” routing channels like Std Cell
- Common goal: Reduce congestion
  - Congestion is the ratio of signals:wires
  - By keeping areas “open”, more likely to be able to route later signals

Page 75: ECE 551 Digital System Design & Synthesis

Detailed Routing: FPGAs
- Common goal: Reduce congestion

Page 76: ECE 551 Digital System Design & Synthesis

Detailed Routing: FPGAs
- Common goal: Reduce congestion

Page 77: ECE 551 Digital System Design & Synthesis

Detailed Routing: FPGAs
- Frequently start with an “idealized” routing
  - Signals can share wires
- Repeatedly “rip up” and reroute
  - One or more nets (signals)
- Stop when no wires are shared
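The rip-up-and-reroute loop can be sketched on a toy routing graph. This is a simplified negotiated-congestion scheme written for illustration, not the algorithm of any particular FPGA tool: the first pass lets nets share wires for free, and later passes make a wire more expensive if another net occupies it or if it has been shared before. The graph, the two nets, and the cost formula are all assumptions.

```python
import heapq
from collections import Counter

def make_graph(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, []).append(u)
    return graph

def shortest_path(graph, src, dst, cost):
    """Dijkstra where entering a node costs cost(node)."""
    best, prev, heap = {src: 0}, {}, [(0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == dst:
            break
        if dist > best[node]:
            continue
        for nxt in graph[node]:
            new_dist = dist + cost(nxt)
            if new_dist < best.get(nxt, float("inf")):
                best[nxt], prev[nxt] = new_dist, node
                heapq.heappush(heap, (new_dist, nxt))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def rip_up_and_reroute(graph, nets, max_iters=20):
    history = {node: 0 for node in graph}
    routes = {}
    for iteration in range(max_iters):
        penalty = 0 if iteration == 0 else 1  # first pass: sharing is free
        for name, (src, dst) in nets.items():
            routes.pop(name, None)            # rip up this net
            used = Counter(n for p in routes.values() for n in p)
            def cost(node):
                return (1 + history[node]) * (1 + penalty * used[node])
            routes[name] = shortest_path(graph, src, dst, cost)
        usage = Counter(n for p in routes.values() for n in p)
        shared = [n for n, count in usage.items() if count > 1]
        if not shared:
            return routes                     # stop: no wires are shared
        for n in shared:
            history[n] += 1
    raise RuntimeError("no legal routing found within max_iters")

# Wire "m" is the shortest way through for both nets; each has a detour.
GRAPH = make_graph([
    ("sA", "m"), ("m", "tA"), ("sB", "m"), ("m", "tB"),
    ("sA", "u1"), ("u1", "u2"), ("u2", "tA"),
    ("sB", "v1"), ("v1", "v2"), ("v2", "tB"),
])
routes = rip_up_and_reroute(GRAPH, {"A": ("sA", "tA"), "B": ("sB", "tB")})
print(routes)
```

In the first pass both nets take the shared wire `m`; once sharing is penalized, at least one of them moves to its detour and the loop stops.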

Page 78: ECE 551 Digital System Design & Synthesis

Final Steps: Std. Cells
- Generate “masks” for each layer, indicating where the material in that layer goes
  - Have cell locations, cell library has cell “design”
  - Plus metal layers created during the routing phase
- Send to chip fabrication foundry

Page 79: ECE 551 Digital System Design & Synthesis

Final Steps: FPGAs
- Generate the “configuration bitstream”
  - The series of 1’s and 0’s that determine the FPGA’s function
- Tools determine these values based on:
  - LUT contents
  - Routing resource usage
- Load the configuration onto the FPGA
  - Also called “programming” or “configuring”

Page 80: ECE 551 Digital System Design & Synthesis

Conclusion
- Synthesis isn’t the end of the process!
  - Many steps after it
- Choose target implementation
  - Examine cost/performance tradeoffs
- Use CAD tools to implement synthesized circuit on FPGA or std. cells
  - Optionally partition & floorplan
  - Place & Route
  - Generate bitstream or layout masks
- See ECE 556 for more details on CAD algorithms

Page 81: ECE 551 Digital System Design & Synthesis

3.125 Gb/s Transceiver

Page 82: ECE 551 Digital System Design & Synthesis

Xilinx Digital Clock Manager (DCM)
- Eliminate clock skew using Delay-Locked Loop (DLL)
  - Monitors clock skew on output and corrects
- Frequency doubling, multiphase clocks
- Fractional Digital Frequency Synthesizer (DFS): fOUT = (M/N) x fIN

Page 83: ECE 551 Digital System Design & Synthesis

Input/Output Block (IOB)
- Slew rate and drive strength control
- Pull-up, pull-down and keeper
- DDR signals
- Controlled-Z input/output
- Boundary scan