A. B. Kahng, Physical Synthesis 2.0, IWLS-2015 Keynote
Source: iwls.org/iwls2015/physical-synthesis-2.0.pdf


Physical Synthesis 2.0

Andrew B. Kahng
UCSD CSE and ECE Departments
[email protected]
http://vlsicad.ucsd.edu


Concept: “Design Principles”

• Partition the problem
  • Divide and conquer, hierarchy
  • Different abstraction levels: RT-level, gate-level, switch-level, transistor-level
• Orthogonalize concerns
  • Function vs. implementation
  • Logic vs. timing vs. embedding
• Solve chicken-egg conundrums
• Constrain the design space to simplify the design process
  • Balance between design complexity and performance
  • E.g., standard-cell methodology: “freedom from choice”

[UCSD ECE 260B / CSE 241A]


Concept: How the IC Design Flow is Evolving

• Flow expands in two directions
  • System-Level Design
  • Design for Manufacturability (DFM)
• More design care-abouts
  • Area, Timing, Power, Signal Integrity, Reliability, Cost
• Key challenges: loops, chicken-egg
  • “Design closure” through tight integrations
  • RTL, GDSII “signoffs” = business structure of semiconductor creation
• “One-pass flow”: required for Productivity, requires Predictability
  • By Guardbands?
  • By “Unifications”?
  • By Statistics?
  • By Methodology (to avoid issues)?

[Figure: design flow diagram. Stages: Architecture Design; High Level Synthesis; Logic Synthesis; FP, Place, CTS, Opt; Routing; Extraction, Timing, Physical Verification; Manufacturing. Artifacts: RTL; Gate Netlist; Updated Gate Netlist; GDSII. Verification runs alongside.]

[UCSD ECE 260B / CSE 241A]


Outline

• Why Physical Synthesis
• Physical Synthesis 1.0
• Example Challenges / Stressors
  • FinFET
  • Noise and Chaos
  • Clock Skew
  • Complexity and Hyperlocality
  • Better (and more complex) Signoff
  • New Mixed-Height Sweet Spot?
• Physical Synthesis 2.0?


Logic Design Needs Spatial Information

• High aspect ratio floorplan: shift one macro block from left to right, and vary its shape (with constant area)
• 10% power range (post-route): center location, taller blockage = more power, more contribution of wire (delays)
• Separation of logical, temporal, spatial must crumble

[Figure: post-route power (mW, axis range 190 to 230) vs. shift of the blockage location (0% to 100%), for macro sizes 260µm x 65µm and 184µm x 92µm]


How Do We Predict Spatial Information?

• Predict by modeling
  • Machine learning, regression, etc.
  • (Don’t dismiss this!)
  • [SLIP15] http://vlsicad.ucsd.edu/Publications/Conferences/325/c325.pdf
  • [DAC00] http://vlsicad.ucsd.edu/Publications/Conferences/112/c112.pdf
  • [DATE13] http://vlsicad.ucsd.edu/Publications/Conferences/296/c296.pdf
  • [SLIP13] http://vlsicad.ucsd.edu/Publications/Conferences/300/c300.pdf
• Predict by assuming and enforcing
  • Make a prediction, then make the prediction come true
  • (Constant-delay methodology)
• Predict by doing
  • Constructive prediction
  • (Run under the hood – quick and dirty, else no leverage)


Synthesis vs. Physical Synthesis

• Synthesis (DC, RC)
  • Elaboration, mapping to generic gates
  • Clock gating
  • Apply timing constraints, remap / optimize
  • Multibit FF optimization
  • MBIST insertion
  • Scan chain stitching
  • Further optimization, area recovery
• Physical Synthesis (DCT/DCG, RCP)
  • LEF list
  • Tech file, map file
  • tluplus_{max,min}
  • floorplan DEF
  • {min,max}_routing_layer


Physical Synthesis

• In
  • RTL + SDC + Library models + Floorplan DEF
• Out
  • Better netlist (usually), at one (worst) corner
  • Better netlist (usually) + placed DEF (not legalized)
  • N.B.: very fast TAT required by customers
• Netlist (+ placed DEF) is passed to P&R + signoff
  • Place, placeOpt, CTS, CTSOpt, route, routeOpt, leakage recovery, timing closure
  • Different companies and tools in a long tool chain


Example

[Figure: example flow. Inputs: floorplan specified by designers (floorplan in DEF or physical guidance); libraries, LEF, tech files; RC tech file (tluplus, captable). Physical synthesis (e.g., DCT) takes the floorplan and physical information and produces a netlist + initial placement, which goes to the P&R flow to obtain routed results.]


Note: “P&R + Signoff” is Complicated!

• N. MacDonald, Broadcom Corp., “Timing Closure in Deep Submicron Designs”, 2010 DAC Knowledge Center article

[Figure: timing closure loop. Top-level netlist / SPEF and block-level netlist / SPEF; static timing analysis for all modes / corners; breakdown of timing violations on a per-block basis; manual repair of timing failures; about 5 iterations until timing is closed.]

Violation classes addressed for each iteration (in order of priority):
(1) Electrical rule violations
(2) Noise violations
(3) Setup violations
(4) Hold violations

Operations permitted at each iteration (in order of preference):
(1) Vt swap, resizing, buffer insertion, NDR changes, useful skew
(2) Vt swap, resizing, buffer insertion, NDR changes
(3) Vt swap, resizing, buffer insertion
(4) Vt swap, resizing
(5) Vt swap
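The shrinking set of permitted repair operations can be written down as a small lookup table. This is only an illustrative sketch: it assumes the entries (1) to (5) index successive iterations, which the flattened slide does not state explicitly, and the names `PERMITTED_OPS` and `permitted_ops` are hypothetical.

```python
# Repair operations permitted per timing-closure iteration, as listed on the slide.
# Assumption: entries (1)..(5) correspond to iterations 1..5.
PERMITTED_OPS = {
    1: ["vt_swap", "resizing", "buffer_insertion", "ndr_changes", "useful_skew"],
    2: ["vt_swap", "resizing", "buffer_insertion", "ndr_changes"],
    3: ["vt_swap", "resizing", "buffer_insertion"],
    4: ["vt_swap", "resizing"],
    5: ["vt_swap"],
}


def permitted_ops(iteration: int) -> list[str]:
    """Return the repair operations allowed at a 1-based iteration number."""
    # Clamp out-of-range iterations: anything past the last listed iteration
    # falls back to the most restrictive set (an assumption of this sketch).
    last = max(PERMITTED_OPS)
    return PERMITTED_OPS[min(max(iteration, 1), last)]
```

Each later set is a subset of the earlier one, so the least disruptive change (Vt swap) is the only one still available when the design is close to signoff.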


Since That Article Was Written:

[Figure: timeline of signoff concerns accumulating across technology nodes 90nm, 65nm, 45/40nm, 28nm, 20nm, 16/14nm, 10nm, ≤7nm. Labels: BTI; temp inversion; noise; MCMM; maxtrans; EM; AOCV / POCV; PBA; fixed-margin spec; multi-patterning; cell-POCV; MOL, BEOL R; dynamic IR; fill effects; layout rules; BEOL, MOL variations; signoff criteria with AVS; SOC complexity; LVF; MIS; phys-aware timing ECO; min implant]

[DAC15]


How Can Physical Synthesis Possibly Work?

• “If it sounds too good to be true, it usually is …”
• What do we do with constraints at (physical) synthesis stage?
  • Overconstrain the clock period in synthesis (was by 20%, now by ~10%)
  • Utilization: 60% target in synthesis (sometimes 50%, 55%) → 85+% post-placement
  • Which detailed placer, CTS tool, router, optimizer?
  • Complex tool “sensitivities” (noisy, chaotic behavior)
  • Information that is ignored (advanced manufacturing)
  • Information that is never available (CTS, SI)
• What explains “success”? Guardbands, low expectations…?
  • Designers’ preoccupation with area and schedule helps…
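To make the guardband arithmetic concrete, here is a minimal sketch. It assumes "overconstrain by X%" means shortening the clock period handed to synthesis by that fraction (other readings, such as raising the target frequency by X%, give slightly different numbers), and it treats utilization as total cell area divided by core area. The function names are hypothetical.

```python
def synthesis_clock_period(signoff_period_ns: float, overconstraint: float) -> float:
    """Clock period given to synthesis when the period is tightened by a fraction."""
    if not 0.0 <= overconstraint < 1.0:
        raise ValueError("overconstraint must be in [0, 1)")
    return signoff_period_ns * (1.0 - overconstraint)


def core_area_for_utilization(cell_area_um2: float, utilization: float) -> float:
    """Core area implied by a total cell area and a utilization target."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return cell_area_um2 / utilization


# A 1.0 ns signoff period becomes 0.8 ns at 20% and 0.9 ns at 10% overconstraint.
period_20 = synthesis_clock_period(1.0, 0.20)
period_10 = synthesis_clock_period(1.0, 0.10)
```

Under the same reading, a netlist synthesized against a 60% utilization target sees a core about 0.85 / 0.60 ≈ 1.42x larger than the one it eventually occupies at 85% post-placement utilization, which is one way the synthesis-time view of wire lengths can differ from the final layout.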


Challenges

• FinFET, BEOL scaling effects
  • Drive
  • Resistivity
  • Gate-wire balance
• Clock effects
  • Skew across corners
  • Top-level clock distribution (CGCs, muxes, dividers, …)
  • Useful skews = area vs. delay tradeoffs
• “Extreme localization” effects
  • Advanced (multi-)patterning
  • Pin access, congestion, coupling
  • Breakdown of placement-optimization separation


Questions

• If Logic Synthesis can’t know outcomes at end of Physical Design, can it be doing the right thing?
  • (Simple information arguments)
  • (What margin is left on the table? Are we seeing placebo effects (association vs. causation, etc.)?)
• Can Logic Synthesis be made better aware of future Physical Design outcomes?
• Is Logic Synthesis at risk of being eclipsed by Physical Design? (Venus-Mars, Sun-Moon, etc.)


FinFET: Current Density + Discreteness

• Better electrostatic control + continued gate length scaling
  • Drive current ↑, cell height ↓ (e.g., 8.25T), better area density (w/ fin height ↑)
  • Effective width 1.6x at equivalent area with planar devices
• Current density ↑, plus fin discreteness challenges

[Figure: multi-fin 3D FinFET and cell layout. Layer labels: NWell, Fin, Active, Poly, MOL1, MOL2, VIA0 (MOL x M1), M1, VIA1 (M1 x M2), M2. Dimension labels: 4Ppoly, 3Pfin, 2Pfin, 1Pfin.]

http://www.synopsys.com/Company/Publications/DWTB/Pages/dwtb-finfet-jan2013.aspx
http://www.synopsys.com/Company/Publications/DWTB/Pages/dwtb-finfet-process-soc-2015q1.aspx


FinFET: Aggressive Voltage Scaling

• FinFET enables voltage scaling for reduced dynamic power
  • Better electrostatic control → better performance at low supply voltage
  • High-performance mode: wire-dominated
  • Low-performance mode: gate-dominated

C. H. Lin, VLSI-TSA, 2012, pp. 1-2.


Gate-Wire Balancing

• Unbalanced gate-wire delay causes severe delay variation on data and clock paths across modes
  • Delay variation in clock paths == skew variation → increased difficulty for timing closure (“ping-pong effect”)
• Minimization of skew variation is important for timing closure
  (Our work at DAC15 uses global-local optimization and achieves 22% skew variation reduction)

Corner            Clock latency (launch)   Clock latency (capture)   Skew
SS, 0.7V, -25°C   1.0                      1.1                       -0.1
FF, 1.1V, -25°C   0.9                      0.7                       +0.2

• Low voltage: gate delay dominates
• High voltage: wire delay dominates
• → Skew reversal → power/area overheads

[Figure: launch path, datapath, and capture path, annotated with the per-corner clock latencies; skew = -0.1/+0.2]

[DAC15]
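The skew reversal in the two-corner latency values on this slide can be checked with a few lines. This sketch uses skew = launch latency minus capture latency, which is the sign convention that reproduces the slide's -0.1 and +0.2. Measuring "skew variation" as the spread of skew across corners is a simplification chosen here for illustration, not necessarily the metric used in the cited DAC15 work.

```python
# Clock latencies per corner, taken from the slide (time units are not stated there).
CORNERS = {
    "SS, 0.7V, -25C": {"launch": 1.0, "capture": 1.1},
    "FF, 1.1V, -25C": {"launch": 0.9, "capture": 0.7},
}


def skew(latency: dict) -> float:
    """Skew of one launch/capture pair: launch latency minus capture latency."""
    return latency["launch"] - latency["capture"]


def skew_variation(corners: dict) -> float:
    """Spread (max minus min) of the skew across corners (simplified metric)."""
    skews = [skew(latency) for latency in corners.values()]
    return max(skews) - min(skews)


per_corner = {name: skew(latency) for name, latency in CORNERS.items()}
variation = skew_variation(CORNERS)
```

The sign of the skew flips between the slow low-voltage corner and the fast high-voltage corner, so a fix that helps setup at one corner can hurt at the other, which is the "ping-pong effect" named above.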


FinFET: Less Body Effect, Richer Libraries?

• FinFET 4-input NAND ~ planar bulk 3-input NAND
• More complex cells / higher fan-in cells could be made available to synthesis

[Figure: number of fan-in limited by body effect (curve labeled “w/ body effect”)]

‘Bulk FinFETs: Fundamentals, Modeling, and Application’, Jong-Ho Lee, SNU


Pin Accessibility Below 20nm

• Routing challenged by complex rules for multi-patterning
• Limited pin access with small track cells
  • Wider power rail for reliable connection → fewer pin access points
• Complex design rules + less pin access → difficulty in routing
• Pin accessibility problem → conflict between area reduction and routability

[Figure: 9T NAND2 layout (Fin, Poly, M1, V1, M2) showing the wider power rail and access points. Labels: inserted via; blocked by the via. Rule annotations: < MinOverlap; < MinSpacing; metal pitch < via pitch.]

[DAC15]


[ISQED02] http://vlsicad.ucsd.edu/Publications/Conferences/131/c131.pdf
[ISQED10] http://vlsicad.ucsd.edu/Publications/Conferences/267/c267.pdf


Slack vs. Layout Context

• Layout knobs: SRAM pitches and buffer keepout distances
• Post-P&R slacks of five embedded memories are “chaotic”
• Physical synthesis challenge: logic optimization given “chaos”

Testcase: Logic from OpenCores GPU THEIA + SRAMs

[Figure: WNS of paths through SRAMs (ns, axis range -1.3 to -0.7) vs. SRAM pitch (um, 0 to 30), series slack-1 to slack-5; delta slack > 300ps. Inset: floorplan with SRAM blockages 1 to 5 separated by sram_pitch, buffer keepouts, and the placement region for standard cells.]


Slack vs. Clock Period

• ∆path slack is 81ps at signoff clock period of 1.0ns
• Changing clock period to 0.82ns changes ∆path slack to 143ps!

[Figure: max delta path slack (SI - non-SI) (ns, axis range 0.06 to 0.15) vs. clock period (ns, 0.80 to 1.30). Annotations: 81ps at signoff clock period; 143ps at tighter clock period.]

[SLIP15]


Non-SI vs. SI

• Top-1000 critical paths from Viterbi design (clock period = 1.0ns)
• Slack diverges by 81ps!!! ~4 stages of logic at 28nm FDSOI
• Unfortunately, we don’t know coupling before routing!!!

[Figure: path slack in non-SI mode (ns) vs. path slack in SI mode (ns), with an ideal-correlation line; 81ps divergence]

[SLIP15]


WLM, RC (Interconnect proxy) Effects

• Example: SOCE-based “Shrunk2D” (S2D) flow [1]
• Perform synthesis with different WLM caps, P&R with S2D flow
• Shown: total power (#buffers, #instances, instance area, WL, … similar)

[Figure: 3DIC power (mW, axis range 20.8 to 23) vs. WLM cap (pF, 0 to 1.2). Annotation: 1.35mW (6.43%).]

[1] Panth et al., “Design and CAD Methodologies for Low Power Gate-Level Monolithic 3D ICs”, Proc. ISLPED, 2014, pp. 171-176.

[DAC15]


Sensitivity of CTS Outcomes to Layout Contexts

• Delay varies by up to 43% with clock entry point locations
• Delay varies by up to 45% with core aspect ratio
• NDRs, fill, buffer sizes, max fanout / max trans rules, …: 100ps impacts on insertion delays, skew, slacks

[Figure: fall delay (ps, axis range 0 to 800) vs. core aspect ratio (0.1 to 10.0), one series per clock entry point (BL, BLM, B, RBM, R)]

[SLIP13]


Useful Skew Improves Timing

• Useful skew optimization adjusts clock sink latencies to improve timing
• Our predictive useful skew flow resolves the “chicken-and-egg loop” → further improved timing

[Figure: total negative slack over 6 testcases {3 RTLs x 2 clock periods}: zero skew -893, typical useful skew -197, predictive useful skew -60. Useful skew improves timing.]

[Figure: three flip-flops FF1, FF2, FF3 with datapath delay/slack annotations. Zero skew: clock latencies 5, 5, 5; paths 7/3, 10/0, 7/3. Useful skew: clock latencies 7, 6, 5; paths 7/2, 10/2, 7/2.]

[ISQED14]
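The delay/slack numbers in the three-flip-flop example can be reproduced with the usual setup-slack relation, slack = period + capture latency - launch latency - datapath delay. The sketch below rests on assumptions, because the figure's wiring is lost in this transcript: the three paths are taken to form a loop FF1 → FF2 → FF3 → FF1 with the 10-unit path running from FF3 back to FF1, the clock period is taken as 10 (the value that gives the 10-unit path zero slack at zero skew), and flip-flop setup time is ignored.

```python
CLOCK_PERIOD = 10  # assumed; gives the 10-unit path zero slack at zero skew

# (launch FF, capture FF, datapath delay). The loop topology is an assumption.
PATHS = [("FF1", "FF2", 7), ("FF2", "FF3", 7), ("FF3", "FF1", 10)]


def setup_slacks(latency: dict, period: int = CLOCK_PERIOD) -> dict:
    """Setup slack per path: period + capture latency - launch latency - delay."""
    return {
        (src, dst): period + latency[dst] - latency[src] - delay
        for src, dst, delay in PATHS
    }


zero_skew = setup_slacks({"FF1": 5, "FF2": 5, "FF3": 5})
useful_skew = setup_slacks({"FF1": 7, "FF2": 6, "FF3": 5})
```

With equal latencies the slacks are 3, 3, and 0; with latencies 7, 6, 5 every path has slack 2. The total slack around the loop stays at 6 in both cases, so useful skew redistributes slack toward the critical path rather than creating it.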


Conventional Useful Skew Optimization

• Standard useful skew flow has chicken-egg problem
  • Netlist and placement assume zero skew
  • Useful skew optimization relies on placement
• One solution: back-annotation flows (large runtime)
  • Wang et al. in DAC06 propose to back-annotate useful skew from post-placement to before-synthesis

[Figure: flow with boxes RTL netlist; Synthesis; Placement / Place Opt.; Skew_opt; CTS; CTS Opt.; Routing / Route Opt.; and a back-annotation arc]


NOLO: No-Loop Useful Skew Optimization

• Our work: Cure the chicken-egg problem with delay prediction
• Use setup slacks from LVT-only synthesis → estimation of achievable slacks
• Use hold slacks from multi-Vt synthesis → reduce pessimism
• Advantage: one-pass approach, not constrained by placement

[Figure: flow with boxes RTL netlist; Synthesis w/ LVT; LVT-only netlist; Predictive Useful Skew; Synthesis w/ Multi-Vt; Placement/Place Opt.; CTS/CTS Opt.; Routing/Route Opt.]


Experimental Results

• Predictive flow achieves similar or better timing and much smaller runtime

[Figure: four panels (aes_cipher, des_perf, jpeg_encoder, mpeg2), each plotting runtime (min) vs. TNS (ns). Legend: back annotation (BA); prediction (w/o LVT-only syn); prediction (w/ LVT-only syn); average of various BA flows.]


BEOL Multi-Patterning Impacts

[Figure: mandrel/spacer patterning of Mx metal, showing the mandrel (width Mwidth, space Mspace), the spacer (width Swidth), floating fill wires, line-end extensions, and line-end cuts]

Wire1width = Mwidth
Wire2width = Mspace - 2*Swidth
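The two width relations on this slide are simple enough to evaluate directly. A minimal sketch follows; the function name `patterned_wire_widths` and the sample dimensions are made up for illustration and do not come from any real process.

```python
def patterned_wire_widths(m_width: float, m_space: float, s_width: float) -> tuple:
    """Return (Wire1width, Wire2width) from mandrel width/space and spacer width.

    Wire1width = Mwidth
    Wire2width = Mspace - 2 * Swidth
    """
    wire1 = m_width
    wire2 = m_space - 2 * s_width
    if wire2 <= 0:
        raise ValueError("mandrel space too small for two spacers plus a wire")
    return wire1, wire2


# Hypothetical dimensions in nm, chosen only to exercise the formulas.
nominal = patterned_wire_widths(m_width=24, m_space=72, s_width=24)
shifted = patterned_wire_widths(m_width=24, m_space=72, s_width=25)
```

Because Wire2width depends on both the mandrel space and twice the spacer width, a 1 nm change in spacer width moves that wire's width by 2 nm while leaving Wire1width unchanged, so the two wire populations vary differently.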


Placement-Sizing Interference

• New “interferences” between post-layout optimization and P&R
• Rules for device layers (FEOL) become considerably more complex and restrictive
  • Minimum implant width rules for implant region
  • Minimum notch and jog width rule for oxide diffusion (OD)

[Figure: rows of abutting HVT and LVT cells, with cell boundaries and OD regions marked]

[ICCAD15]


Placement-Sizing Interference (cont.)

• Drain-to-drain abutment (DDA)
• Example solution
  • Intertwine the historically separate tasks of P&R and post-route optimization

[Figure: cell layouts (cell boundary, active region, poly, power/ground, connection; terminals labeled S and D) illustrating a DDA violation, min implant width violations, and a min jog/notch width violation]

[ICCAD15]


I. Flexible Timing Models

• Setup time, hold time and clock-to-q (c2q) delay of FF ⇒ values interdependent, but NOT fixed
• Flexible FF timing model can exploit operating (function/test) modes ⇒ “free” pessimism reduction in STA
• Sequential LP:
  • setup-c2q opt
  • hold-c2q opt
• Goal: Find best {setup, hold, c2q} for each FF instance

[Figure: c2q-setup-hold surface, with c2q vs. setup and c2q vs. hold tradeoff curves; a setup-hold-c2q fixed model contrasted with a setup-hold-c2q flexible model (c2q1 ... c2qn)]

[ISQED14]


Flexible Timing Model Recovers Margin

• Independent datapaths in PBA: using fixed FF timing model loses performance optimization opportunity

[Figure: three flip-flops FF1, FF2, FF3 connected by datapaths of 470ps, 480ps, and 460ps. Per-FF values shown: setup 10ps / c2q 20ps; setup 10ps / c2q 20ps; setup 20ps / c2q 10ps. Each path totals 500ps. Annotation: 520ps? 500ps!]
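The 520ps versus 500ps contrast can be reproduced with a tiny search. The sketch below is a reconstruction under stated assumptions, not the slide's method: the paths are taken to form a loop FF1 → FF2 (470ps), FF2 → FF3 (460ps), FF3 → FF1 (480ps), each flip-flop may pick one of two hypothetical (setup, c2q) operating points, and the "fixed" model is assumed to use the worst-case value of each parameter (setup 20ps, c2q 20ps). It uses brute-force enumeration in place of the sequential LP named in the talk.

```python
from itertools import product

# (launch FF, capture FF, datapath delay in ps). The loop topology is an assumption.
PATHS = [("FF1", "FF2", 470), ("FF2", "FF3", 460), ("FF3", "FF1", 480)]
FFS = ["FF1", "FF2", "FF3"]
OPTIONS = [(10, 20), (20, 10)]  # hypothetical (setup, c2q) points in ps


def required_period(assignment: dict) -> int:
    """Largest c2q(launch) + datapath delay + setup(capture) over all paths."""
    return max(
        assignment[src][1] + delay + assignment[dst][0]
        for src, dst, delay in PATHS
    )


def best_flexible_assignment() -> tuple:
    """Enumerate every per-FF choice and keep the one with the smallest period."""
    best = None
    for choice in product(OPTIONS, repeat=len(FFS)):
        assignment = dict(zip(FFS, choice))
        period = required_period(assignment)
        if best is None or period < best[0]:
            best = (period, assignment)
    return best


fixed_model = {ff: (20, 20) for ff in FFS}  # assumed worst-case fixed values
fixed_period = required_period(fixed_model)
flex_period, flex_assignment = best_flexible_assignment()
```

Under these assumptions the fixed model needs 520ps, while the flexible choice (FF1 and FF2 at setup 10 / c2q 20, FF3 at setup 20 / c2q 10) balances all three paths at 500ps, matching the per-FF values shown in the figure.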


Improved Timing Signoff Flow

[Figure: flow with boxes Netlist (and SPEF, if routed); Extract path timing information; LP formulation with flexible flip-flop timing model; Solve Sequential LP (STA_FTmax, STA_FTmin); Solution; Annotate new timing model for each flip-flop; Timing signoff with annotated timing]

Takeaways
• Fix timing violations “for free”
• 48ps average improvement of slack over 5 designs in a foundry 65nm technology

Next
• Better exploitation of disjoint cycles/modes
• More accurate modeling of setup-hold-c2q tradeoff
• Circuit optimization should natively exploit FF timing model flexibility


II. Signoff Definition (e.g., with AVS, Aging)

• VBTI: Voltage for BTI-aging estimation
• Vlib: Supply voltage for timing library characterization
• Vfinal: Vdd of a circuit with AVS at end-of-lifetime

Chicken & Egg Loop
• Circuit implementation depends on VBTI and Vlib
• Vfinal depends on circuit
• VBTI and Vlib depend on aging during AVS (Vfinal)

[Figure: loop. Vlib and VBTI (|Vt|) feed a derated library; the derated library feeds circuit implementation and signoff; the resulting circuit undergoes BTI degradation and AVS, giving Vfinal; a “?” arc leads back to Vlib and VBTI.]

[DATE13]


Observations and Heuristics

• Observation #1: Vfinal is not sensitive to cells along the timing-critical path
  • Heuristic #1: Use average of critical path replicas to estimate Vfinal (Vheur)
• Observation #2: ΔVt with a constant Vfinal throughout lifetime ≈ adaptive Vdd
  • Heuristic #2: Approximate Vdd in AVS by constant Vheur
• Solve “Chicken & Egg Loop” by having VBTI = Vlib = Vheur ≈ Vfinal


“Knee” Point for Signoff Definition

            Low Vlib                      High Vlib
Low VBTI    Slower circuit, less aging    Faster circuit, less aging
High VBTI   Slower circuit, more aging    Faster circuit, more aging

Experiment setup:
• DC/AC BTI @ 125°C
• 32nm PTM technology
• 4 benchmark circuit implementations

• Optimistic aging library → large power penalty
• Overly pessimistic aging library → large area penalty
• Ignore AVS → larger area
• Our method finds “knee” point for balanced area and power tradeoff


Mixed Cell Height Implementation (!)

• Large cell height → better timing, but large area and power
• Small cell height → smaller area/power per gate, but large delay and more #buffers
• Mixing cell height enables tradeoffs between performance and area/power (recall FinFET introduction!) → better design QoR
• E.g., use large-height high-fanin cells to improve pin accessibility
• Already have flop trays, etc. as problematic multi-height instances

[Figure: mixed-height placement. Technology: 28nm LP. In red are 12T cells = larger area, smaller delay. In blue are 8T cells = smaller area, larger delay.]

[ICCAD15]


Cost of Mixing Cell Heights

• “Breaker cells” are required to align regions with different cell heights → optimization must comprehend corresponding area cost

[Figure: an 8T cell region bordered by 12T cell regions, with P/G rails and cell boundaries marked. Assume: M2 pitch = 64nm. Labels: Y directional shift; X directional shift; one M2 pitch; four sites; 64nm, 48nm, 64nm; no routing blockage; routing blockage on M1/M2.]


Optimization Flow

Flow: Synthesis → Initial placement → Partitioning → Legalization → Floorplan update → Cell mapping → Routing / RouteOpt

• Initial placement uses modified LEF → enables optimization with a conventional flow
• Slicing-based partition with DP to divide die area into regions with different cell heights
• Internal-timer guided placement legalization
• Floorplan update with “breaker cell” penalty
• Row-based cell mapping places cells onto rows with corresponding heights
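The slide only names "slicing-based partition with DP" and a "breaker cell" penalty, without giving the formulation. As a rough illustration of how a dynamic program can trade region purity against breaker cost, here is a deliberately simplified one-dimensional sketch. It is not the authors' algorithm: the slice model, the cost terms, and the name `partition_slices` are all assumptions made for this example.

```python
def partition_slices(slices: list, breaker_penalty: float) -> tuple:
    """Label each slice "8T" or "12T" to minimize a simple cost.

    slices: list of (area_8t, area_12t) for consecutive die slices, giving the
    area of cells in that slice that currently use each height.
    Cost = area of cells whose height differs from their slice's label
           + breaker_penalty for every change of label between neighbors.
    Returns (total_cost, labels).
    """
    if not slices:
        raise ValueError("need at least one slice")
    heights = ("8T", "12T")

    def mismatch(i: int, label: str) -> float:
        area_8t, area_12t = slices[i]
        # Cells of the other height would have to move or be remapped.
        return area_12t if label == "8T" else area_8t

    # best[label] = (cost, labels) of the cheapest labeling of the prefix
    # processed so far whose last slice carries `label`.
    best = {h: (mismatch(0, h), [h]) for h in heights}
    for i in range(1, len(slices)):
        new_best = {}
        for h in heights:
            candidates = []
            for prev in heights:
                cost, labels = best[prev]
                switch = 0 if prev == h else breaker_penalty
                candidates.append((cost + switch + mismatch(i, h), labels + [h]))
            new_best[h] = min(candidates, key=lambda c: c[0])
        best = new_best
    return min(best.values(), key=lambda c: c[0])


# Hypothetical slice data: the left half is mostly 8T cells, the right half mostly 12T.
cost, labels = partition_slices([(10, 1), (9, 2), (1, 10), (2, 9)], breaker_penalty=3)
```

A small penalty lets the labeling follow the local cell mix (one height change in the example), while a large penalty pushes the result toward a single-height design, which mirrors the tradeoff the "breaker cell" bullet describes.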


Example of Optimization Flow

[Figure: placement snapshots along the flow. Initial placement (8T/12T cells are “freely” placed); Partitioning (yellow blocks = regions); Legalization; New floorplan; Mixed-height placement. Technology: 28nm LP. Design: AES. 8T cells are in blue; 12T cells are in red.]


Benefits from Mixing Cell Heights

• Technology: 28nm LP (12T/8T); Design: AES
• 25% area reduction as compared to 12T-only design
• 20% performance improvement compared to 8T-only design


Physical Synthesis 2.0

• It’s the predictability! (and, prediction is challenged…)
  • New devices and patterning technologies
  • Complex PD tool chain; chaotic behavior of tools and flows
  • Oblivious to clocks, corners, coupling → how can Physical Synthesis be doing the right thing? (= target for margin recovery!)
• What will Physical Synthesis 2.0 look like?
  • (1) Higher-level value: what Physical Design cannot do
    • Datapath architecture selection
    • Resource sharing
    • Mux mapping
  • (2) Other types of prediction (machine learning, big data, etc.)!
  • (3) Constructive prediction deeper into implementation flow
    • (More integration…) Clock and MCMM awareness
    • Hyperlocality awareness: coloring, congestion, coupling, interactions …


THANK YOU !