advances in designing clockless ddigital systemsigital systemsnowick/async-overview-pt1-revd.pdf ·...

29
Advances in Designing Clockless Digital Systems Advances in Designing Advances in Designing Clockless Clockless Digital Systems Digital Systems Prof. Steven M. Prof. Steven M. Nowick Nowick [email protected] [email protected] Department of Computer Science Department of Computer Science Columbia University Columbia University New York, NY, USA New York, NY, USA

Upload: others

Post on 25-Apr-2020

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

Advances in Designing Clockless Digital Systems

Advances in Designing Advances in Designing ClocklessClockless Digital SystemsDigital Systems

Prof. Steven M. Prof. Steven M. [email protected]@cs.columbia.edu

Department of Computer ScienceDepartment of Computer ScienceColumbia UniversityColumbia UniversityNew York, NY, USANew York, NY, USA

Page 2: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#2

IntroductionIntroduction

Synchronous vs. Asynchronous Systems?Synchronous vs. Asynchronous Systems?

Synchronous Systems:Synchronous Systems: use a use a global clockglobal clockentire system operates entire system operates at fixedat fixed--raterate

uses uses “centralized control”“centralized control”

clock

Page 3: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#3

Introduction (cont.)Introduction (cont.)

Synchronous vs. Asynchronous Systems? (cont.)Synchronous vs. Asynchronous Systems? (cont.)

Asynchronous Systems:Asynchronous Systems: no global clockno global clock

components can operate atcomponents can operate at varying ratesvarying rates

communicate locallycommunicate locally via via “handshaking”“handshaking”

uses uses “distributed control” “distributed control”

“handshakinginterfaces”(channels)

Page 4: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#4

Trends and ChallengesTrends and Challenges

Trends in Chip Design: Trends in Chip Design: next decadenext decade“Semiconductor Industry Association (SIA) Roadmap”“Semiconductor Industry Association (SIA) Roadmap” (97(97--8)8)

Unprecedented Challenges:Unprecedented Challenges:complexity and scale (= size of systems)complexity and scale (= size of systems)

clock speedsclock speeds

power managementpower management

reusability & scalabilityreusability & scalability

“time“time--toto--market”market”

Design becoming unmanageable using a centralized Design becoming unmanageable using a centralized single clock (synchronous) approach….single clock (synchronous) approach….

Page 5: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#5

Trends and Challenges (cont.)Trends and Challenges (cont.)

1. Clock Rate:1. Clock Rate:

1980: 1980: several several MegaHertzMegaHertz

2001: 2001: ~750 ~750 MegaHertzMegaHertz -- 1+ 1+ GigaHertzGigaHertz2005:2005: several several GigaHertzGigaHertz

Design Challenge:Design Challenge:

“clock skew”:“clock skew”: clock must be clock must be nearnear--simultaneoussimultaneous across across entire chipentire chip

Page 6: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#6

Trends and Challenges (cont.)Trends and Challenges (cont.)

2. Chip Size and Density:2. Chip Size and Density:

Total #Transistors per Chip: Total #Transistors per Chip: 6060--80% increase/year80% increase/year~1970: ~1970: 4 thousand4 thousand (Intel 4004 microprocessor)(Intel 4004 microprocessor)

today: today: 5050--200+ million200+ million

2006 and beyond:2006 and beyond: towards 1towards 1 billion+billion+

Design Challenges:Design Challenges:system complexity, design time, clock distributionsystem complexity, design time, clock distributionclock will require 10clock will require 10--20 cycles to reach across chip20 cycles to reach across chip

Page 7: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#7

Trends and Challenges (cont.)Trends and Challenges (cont.)

3. Power Consumption3. Power Consumption

Low power: everLow power: ever--increasing demandincreasing demand

consumer electronics:consumer electronics: batterybattery--poweredpowered

highhigh--end processors:end processors: avoid expensive fans, packagingavoid expensive fans, packaging

Design Challenge:Design Challenge:

clock clock inherentlyinherently consumes power consumes power continuouslycontinuously

“power“power--down” techniques: complex, only partly effectivedown” techniques: complex, only partly effective

Page 8: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#8

Trends and Challenges (cont.)Trends and Challenges (cont.)

4. Time4. Time--toto--Market, Design ReMarket, Design Re--Use, ScalabilityUse, Scalability

Increasing pressure for faster Increasing pressure for faster “time“time--toto--market”.market”. Need:Need:reusable components:reusable components: “plug“plug--andand--play” designplay” design

flexible interfacing:flexible interfacing: under varied conditions, voltage scalingunder varied conditions, voltage scaling

scalable design:scalable design: easy system upgradeseasy system upgrades

Design Challenge:Design Challenge: mismatch w/ central fixedmismatch w/ central fixed--rate clockrate clock

Page 9: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#9

Trends and Challenges (cont.)Trends and Challenges (cont.)

5. Future Trends: “Mixed Timing” Domains5. Future Trends: “Mixed Timing” Domains

Chips themselves becoming Chips themselves becoming distributed systems….distributed systems….contain many subcontain many sub--regions, regions, operating at different speeds:operating at different speeds:

Design Challenge:Design Challenge: breakdown of single centralizedbreakdown of single centralizedclock controlclock control

Page 10: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#10

Asynchronous Design: Potential AdvantagesAsynchronous Design: Potential Advantages

Several Potential Advantages:Several Potential Advantages:

Lower PowerLower Powerno clockno clock components use power only “on demand”components use power only “on demand”

Robustness, ScalabilityRobustness, Scalability

no global timingno global timing “mix“mix--andand--match” variablematch” variable--speed componentsspeed components

composablecomposable/modular design style /modular design style “object“object--oriented”oriented”

Higher PerformanceHigher Performance

systems not limited to “worstsystems not limited to “worst--case” clock ratecase” clock rate

Page 11: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#11

Asynchronous Design: Some Recent DevelopmentsAsynchronous Design: Some Recent Developments

1. Philips Semiconductors:1. Philips Semiconductors:commercial use: 100 million commercial use: 100 million asyncasync chips for consumer electronics:chips for consumer electronics:

pagers, cell phones, smart cards, digital passports, automotive pagers, cell phones, smart cards, digital passports, automotive 33--4x lower power4x lower power,, less electromagnetic interference (“EMI”)less electromagnetic interference (“EMI”)

2. Intel:2. Intel:experimental: Pentium instructionexperimental: Pentium instruction--length decoder = “RAPPID” (1990’s)length decoder = “RAPPID” (1990’s)33--4x faster 4x faster than synchronous subsystemthan synchronous subsystem

3. Sun Labs:3. Sun Labs:commercial use: highcommercial use: high--speed speed FIFO’sFIFO’s in recent “Ultra’s” (memory access)in recent “Ultra’s” (memory access)

4. IBM Research:4. IBM Research:experimental: highexperimental: high--speed pipelines, filters, mixedspeed pipelines, filters, mixed--timing systemstiming systems

Recent Startups:Recent Startups: Fulcrum, Fulcrum, TheseusTheseus Logic, Handshake Solutions, Logic, Handshake Solutions, SilistrixSilistrix

Page 12: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#12

Asynchronous CAD Tools: Recent DevelopmentsAsynchronous CAD Tools: Recent Developments

DARPA’sDARPA’s “CLASS” Program:“CLASS” Program: ClocklessClockless InitiativeInitiative (2003(2003--07)07)

Goals:Goals:-- CAD tool: CAD tool: produce viable produce viable commercialcommercial--grade grade asyncasync tool flowtool flow-- demonstration: demonstration: a complex Boeing ASIC chipa complex Boeing ASIC chip

Participants:Participants:Lead (PI):Lead (PI): BoeingBoeingIndustrial participants:Industrial participants:

Philips (via Philips (via asyncasync incubated startup, “Handshake Solutions”)incubated startup, “Handshake Solutions”)TheseusTheseus Logic, Logic, CodetronixCodetronix

Academic participants:Academic participants:Columbia, UNC, UW, Yale, OSUColumbia, UNC, UW, Yale, OSU

Targets:Targets: cover wide “design space” cover wide “design space” –– very robust to highvery robust to high--speed circuitsspeed circuits

Columbia’s role:Columbia’s role: (i) high(i) high--speed pipelines, (ii) CAD optimizationsspeed pipelines, (ii) CAD optimizations

Page 13: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#13

Asynchronous Design: ChallengesAsynchronous Design: Challenges

Critical Design Issues:Critical Design Issues:

components must components must communicate cleanly:communicate cleanly: ‘hazard‘hazard--free’ designfree’ design

highlyhighly--concurrent designs:concurrent designs: much harder to verify!much harder to verify!

Lack of Automated “ComputerLack of Automated “Computer--Aided Design” Tools:Aided Design” Tools:

most commercial “CAD” tools targeted to synchronous most commercial “CAD” tools targeted to synchronous

Page 14: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#14

What Are CAD Tools?What Are CAD Tools?

Software programs to aid digital designers = Software programs to aid digital designers = “computer“computer--aided design” toolsaided design” tools

automatically automatically synthesize synthesize and and optimizeoptimize digital circuitsdigital circuits

Output:optimized circuit

implementation

Input:desired circuit

specificationCADTOOL

Page 15: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#15

Asynchronous Design ChallengeAsynchronous Design Challenge

Lack of Existing Asynchronous Design Tools:Lack of Existing Asynchronous Design Tools:

Most commercial “CAD” tools targeted to synchronousMost commercial “CAD” tools targeted to synchronous

Synchronous CAD tools: Synchronous CAD tools: major drivers of growth in microelectronics industry major drivers of growth in microelectronics industry

Asynchronous “chickenAsynchronous “chicken--andand--egg” problem:egg” problem:

few CAD tools few CAD tools less commercial use of less commercial use of asyncasync designdesign

especially lacking: tools for designing/especially lacking: tools for designing/optmzngoptmzng. large systems. large systems

Page 16: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#16

Overview: My Research AreasOverview: My Research Areas

CAD Tools for Asynchronous Controllers (CAD Tools for Asynchronous Controllers (FSM’sFSM’s))“MINIMALIST” Package“MINIMALIST” Package: for synthesis + optimization: for synthesis + optimization

Other Research Areas:Other Research Areas:

CAD Tools for Designing LargeCAD Tools for Designing Large--Scale Scale AsyncAsync SystemsSystems

MixedMixed--Timing Interface Circuits:Timing Interface Circuits:

for interfacing sync/for interfacing sync/asyncasync systemssystems

HighHigh--Speed Asynchronous PipelinesSpeed Asynchronous Pipelines

Page 17: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#17

CAD Tools for CAD Tools for AsyncAsync ControllersControllers

MINIMALIST:MINIMALIST: developed at Columbia University [1994developed at Columbia University [1994--] ] extensible CAD package for synthesis of extensible CAD package for synthesis of asynchronous controllersasynchronous controllersintegrates synthesis, optimization and verification toolsintegrates synthesis, optimization and verification toolsused in 80+ sites/17+ countries (being taught in IIT Bombay)used in 80+ sites/17+ countries (being taught in IIT Bombay)URL:URL: httphttp://://www.cs.columbia.edu/asyncwww.cs.columbia.edu/async

Includes several optimization tools: Includes several optimization tools: State MinimizationState MinimizationCHASM: CHASM: optimal state encodingoptimal state encoding22--Level HazardLevel Hazard--Free Logic MinimizationFree Logic MinimizationVerilogVerilog backback--endend

Key goal: Key goal: facilitate designfacilitate design--space explorationspace exploration

Page 18: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#18

Example: “PEExample: “PE--SENDSEND--IFC” (HP Labs)IFC” (HP Labs)Inputs:req-sendtreqrd-iqadbld-outack-pkt

Outputs:tackpeackadbld

0

1

2

7

3

4

5

6

8

9

10

req-send+ treq+ rd-iq+/adbld+

adbld-out+/peack+

rd-iq-/peack- adbld-

tack+

adbld-out- treq-rd-id+/ adbld+

adbld-out+/peack+

rd-iq-/ peack-adbld- tack-

adbld-out- treq+ ack-pkt+/ peack+ tack+

ack-pkt- treq-/peack- tack-

treq-/tack-

treq+/tack+

ack-pkt+/peack- tack-

adbld-out-treq- ack-pkt+/

peack+

req-send-/--

adbld-out-treq+ rd-iq+/

adbld+

From HP Labs“Mayfly” Project:

B.Coates, A.Davis, K.Stevens,“The Post Office

Experience: Designing a Large Asynchronous Chip”,

INTEGRATION: the VLSI Journal, vol. 15:3,pp. 341-66 (Oct. 1993)

Page 19: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#19

EXAMPLE (cont.):EXAMPLE (cont.):

Examples:

Design-Space Explorationusing MINIMALIST:

optimizing for area vs. speed

Page 20: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#20

CAD Tools for LargeCAD Tools for Large--Scale Asynchronous SystemsScale Asynchronous Systems

Target Architecture:Target Architecture:Input Specification:Input Specification:= “Control Data= “Control Data--flow Graph”flow Graph”

RegisterRegister RegisterRegister

Functional Functional UnitUnit

Functional Functional UnitUnit

Functional Functional UnitUnit

CtrlrCtrlr 11

CtrlrCtrlr 22

CtrlrCtrlr 33

control unitcontrol unit

StartStart

LoopLoop C< 0C< 0

B:=2dx+dxB:=2dx+dx C:=X<aC:=X<a

M:=U*X1M:=U*X1

EndloopEndloop

EndEnd

X:=X:=X+dxX+dx C:=X<aC:=X<a

Target:Target:-- synthesize synthesize distributed control distributed control -- 1 controller per functional unit1 controller per functional unit[Theobald/Nowick, IEEE Design Automation Conf. (2001)]

Page 21: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#21

MixedMixed--Timing InterfacesTiming Interfaces

AsynchronousDomain

SynchronousDomain 2

AsynchronousDomain

SynchronousDomain 1

Goal: provide low-latency communication between “timing domains”

Challenge: avoid synchronization errors

Goal: provide low-latency communication between “timing domains”

Challenge: avoid synchronization errors

Page 22: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#22

MixedMixed--Timing Interfaces: SolutionTiming Interfaces: SolutionAsync-Sync FIFO

AsynchronousDomain

SynchronousDomain 1

SynchronousDomain 2

Asy

nc-S

ync

FIFO

Sync

-Asy

ncFI

FO

… developed complete family of mixed-timing interface circuits[Chelcea/Nowick, IEEE Design Automation Conf. (2001)]

… developed complete family of mixed-timing interface circuits[Chelcea/Nowick, IEEE Design Automation Conf. (2001)]

Solution: insert mixed-timing FIFO’s ⇒ provide safe data transferSolution: insert mixed-timing FIFO’s ⇒ provide safe data transfer

AsynchronousDomain

Mixed-Clock FIFO’s

Page 23: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#23

HighHigh--Speed Asynchronous PipelinesSpeed Asynchronous Pipelines

global clock

NON-PIPELINED COMPUTATION: “datapath component” =adder, multiplier, etc.

SYNCHRONOUS

Page 24: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#24

HighHigh--Speed Asynchronous PipelinesSpeed Asynchronous Pipelines

global clock

SYNCHRONOUS

“PIPELINED COMPUTATION”: like an assembly line

no global clock ASYNCHRONOUS

Page 25: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#25

HighHigh--Speed Asynchronous PipelinesSpeed Asynchronous Pipelines

Goal: Goal: extremely fast extremely fast asyncasync datapathdatapath componentscomponents

speed:speed: comparable to comparable to fastestfastest existing synchronous designsexisting synchronous designs

additional benefits:additional benefits:

dynamically adapt to variabledynamically adapt to variable--speed interfaces: speed interfaces: voltage scaling!voltage scaling!

“elastic” processing of data in pipeline“elastic” processing of data in pipeline

no clock distributionno clock distribution

Contributions: Contributions: 3 new 3 new asyncasync pipeline stylespipeline styles [SINGH/NOWICK][SINGH/NOWICK]

MOUSETRAP:MOUSETRAP: static logicstatic logicHighHigh--Capacity/Capacity/LookaheadLookahead:: dynamic logicdynamic logic

Obtain multiObtain multi--GigaHertzGigaHertz speedsspeedsUsed by IBM, currently incorporated into Philips tool flowUsed by IBM, currently incorporated into Philips tool flow

Page 26: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#26

MOUSETRAP: A Basic FIFO (no computation)MOUSETRAP: A Basic FIFO (no computation)

Stages communicate using Stages communicate using transitiontransition--signaling:signaling:

reqN

ackN-1

reqN+1

ackN

Data Latch

Latch Controller

doneN

Data in Data out

En

Stage NStage N-1 Stage N+1

[Singh/Nowick, IEEE Int. Conf. on Computer Design (2001)]

Page 27: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#27

“MOUSETRAP” Pipeline: w/computation“MOUSETRAP” Pipeline: w/computation

logicData Latch

Latch Controller

doneN

logiclogic

reqreqNN

ackN-1

reqreqN+N+11

ackN

Stage N+1

delaydelay

Stage N

delaydelaydelaydelay

Stage N-1

Function Blocks:Function Blocks: use “synchronous” singleuse “synchronous” single--rail circuits (not hazardrail circuits (not hazard--free!)free!)“Bundled Data” Requirement:“Bundled Data” Requirement:

each each ““reqreq”” must arrive must arrive afterafter data inputs valid and stabledata inputs valid and stable

Page 28: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#28

Page 29: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M

#29

MOUSETRAP: A Basic FIFOMOUSETRAP: A Basic FIFOStages communicate using Stages communicate using transitiontransition--signaling:signaling:

reqN

ackN-1

reqN+1

ackN

Data Latch

Latch Controller

doneN

Data in Data out

En

1 transition1 transitionper data item!per data item!

Stage NStage N-1 Stage N+1

One Data Item