Power-Optimal Pipelining in Deep Submicron Technology. Seongmoo Heo and Krste Asanović, Computer Architecture Group, MIT CSAIL. ISLPED 2004, 8/10/2004.

Page 1: Power-Optimal Pipelining in Deep Submicron Technology

Power-Optimal Pipelining in Deep Submicron Technology

Seongmoo Heo and Krste Asanović

Computer Architecture Group, MIT CSAIL

ISLPED 2004, 8/10/2004

Page 2: Power-Optimal Pipelining in Deep Submicron Technology

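Traditional Pipelining

• Goal: Maximum performance

[Figure: timing of one pipeline stage at supply voltage Vdd. Between clock edges (Clk), the cycle is filled by Clk-Q delay, propagation delay, and setup time.]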

Page 3: Power-Optimal Pipelining in Deep Submicron Technology

Pipelining as a Low-Power Tool

• Goal: Low-Power, Fixed Throughput

[Figure: timing of a pipeline stage at supply voltage Vdd (Clk, Clk-Q delay, propagation delay, setup time), with time slack left in each cycle.]

Page 4: Power-Optimal Pipelining in Deep Submicron Technology

Pipelining as a Low-Power Tool

• Goal: Low-Power, Fixed Throughput
• Time slack is traded for power (supply voltage scaling)

Page 5: Power-Optimal Pipelining in Deep Submicron Technology

Pipelining as a Low-Power Tool

[Figure: power vs. delay. Pipelining creates time slack at the cost of flip-flop power overhead.]

* Clock frequency fixed

Page 6: Power-Optimal Pipelining in Deep Submicron Technology

Pipelining as a Low-Power Tool

[Figure: power vs. delay. Supply voltage scaling turns the time slack into a power saving.]

* Clock frequency fixed

Page 7: Power-Optimal Pipelining in Deep Submicron Technology

Power-Optimal Pipelining

• Power reduction from pipelining is limited by the power overhead of the increased number of flip-flops → Power-Optimal Pipelining

Page 8: Power-Optimal Pipelining in Deep Submicron Technology

Power-Optimal Pipelining

[Figure: power vs. delay, marking the "too shallow pipelining" region.]

Page 9: Power-Optimal Pipelining in Deep Submicron Technology

Power-Optimal Pipelining

[Figure: power vs. delay, marking the "too deep pipelining" and "too shallow pipelining" regions.]

Page 10: Power-Optimal Pipelining in Deep Submicron Technology

Power-Optimal Pipelining

[Figure: power vs. delay, marking too deep, too shallow, and optimal pipelining. The optimal point gives the optimal power saving.]

Page 11: Power-Optimal Pipelining in Deep Submicron Technology

Contribution

• Pipelining is an old idea.
• Research focus has been on the performance impact of pipelining.
• The idea of using pipelining to lower power [Chandrakasan ’92] has not been fully explored in deep submicron technology.
• Analysis and circuit-level simulation of Power-Optimal Pipelining for different regimes of Vth, activity factor, and clock gating

Page 12: Power-Optimal Pipelining in Deep Submicron Technology

Bottom-to-Top Approach

1. Impact of pipelining on each power component
2. Impact of pipelining on total power (with/without clock-gating)

[Figure: total power (clock-gated) over time across active, inactive, and active periods. Components shown: switching power, leakage power, idle power.]

Page 13: Power-Optimal Pipelining in Deep Submicron Technology

Bottom-to-Top Approach

[Figure: total power (not clock-gated) over time across active, inactive, and active periods. Components shown: switching power, leakage power, idle power.]

* Idle power = power consumed when circuit is idle and not clock-gated
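The two timelines can be read as a bookkeeping rule for average power. The sketch below is only an illustration under assumed definitions, not the authors' model: `af` is the fraction of time the circuit is active, and `p_switch`, `p_leak`, and `p_idle` are hypothetical component powers in arbitrary units. It assumes an active circuit burns switching plus leakage power, while an inactive circuit burns leakage power if clock-gated and idle power otherwise.

```python
def total_power(af, p_switch, p_leak, p_idle, clock_gated):
    """Average power for a circuit that is active a fraction `af` of the time.

    Assumed bookkeeping (an illustration, not taken from the slides):
      active   -> switching + leakage
      inactive -> leakage only if clock-gated, otherwise idle power
    """
    if not 0.0 <= af <= 1.0:
        raise ValueError("activity factor must lie in [0, 1]")
    active = p_switch + p_leak
    inactive = p_leak if clock_gated else p_idle
    return af * active + (1.0 - af) * inactive


# Hypothetical component powers in arbitrary units.
gated = total_power(0.5, p_switch=10.0, p_leak=2.0, p_idle=5.0, clock_gated=True)
ungated = total_power(0.5, p_switch=10.0, p_leak=2.0, p_idle=5.0, clock_gated=False)
```

Under these assumptions the two cases coincide at an activity factor of 1 and diverge as the circuit spends more time inactive.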

Page 14: Power-Optimal Pipelining in Deep Submicron Technology

Methodology

• Target digital system: fixed throughput, highly parallel computation, logic-dominant
• Test bench
  – BPTM (Berkeley Predictive Technology Model) 70nm process: LVT (0.17/-0.2), MVT (0.19/-0.22), HVT (0.21/-0.24)
  – Hspice simulation at 100°C, Clock = 2 GHz

[Figure: one pipeline stage. Baseline of N FO4 inverters (N = 2 ~ 24) between TG flip-flops.]
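With the clock fixed at 2 GHz, the number of FO4 inverters per stage sets how slow each inverter is allowed to be, which is the slack that voltage scaling can spend. A minimal sketch of that budget, assuming a hypothetical `ff_overhead_ps` parameter for flip-flop delay (the slides give no value for it, so it defaults to zero here):

```python
def fo4_delay_budget_ps(n_fo4, clock_ghz=2.0, ff_overhead_ps=0.0):
    """Largest per-inverter delay (ps) that still fits N FO4 inverters in one cycle.

    `ff_overhead_ps` is a hypothetical allowance for flip-flop delay.
    """
    if n_fo4 <= 0:
        raise ValueError("n_fo4 must be positive")
    period_ps = 1000.0 / clock_ghz
    budget_ps = period_ps - ff_overhead_ps
    if budget_ps <= 0:
        raise ValueError("flip-flop overhead leaves no time for logic")
    return budget_ps / n_fo4


# Fewer inverters per stage leave more time for each one.
for n in (24, 12, 6):
    print(n, round(fo4_delay_budget_ps(n), 1))
```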

Page 15: Power-Optimal Pipelining in Deep Submicron Technology

Pipelining and Switching Power: Analytical Trend

[Figure: switching power vs. number of FO4 per stage, N. Logic switching power scales as O(N^2) (quadratic reduction as N decreases, since Vdd^2 ∝ N^2); flip-flop overhead scales as O(1/N). The optimal FO4 marks the optimal saving.]
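The trade-off between the O(N^2) logic term and the O(1/N) flip-flop term can be made concrete with a toy cost function. The sketch below assumes relative power of the form a·N^2 + b/N; the coefficients `a` and `b` are hypothetical, chosen only to give round numbers, and are not fitted to the simulations.

```python
def relative_switching_power(n, a=1.0, b=250.0):
    """Logic term a*N^2 plus flip-flop overhead b/N, in arbitrary units."""
    return a * n**2 + b / n


def optimal_n(a=1.0, b=250.0):
    """Stationary point of a*N^2 + b/N: 2*a*N = b/N^2, so N = (b/(2a))^(1/3)."""
    return (b / (2.0 * a)) ** (1.0 / 3.0)


# Cross-check the closed form against a scan over the N = 2..24 range.
best_integer_n = min(range(2, 25), key=relative_switching_power)
```

Because the optimum depends only on the cube root of b/a in this toy model, a large change in the relative flip-flop cost moves the optimal depth only modestly.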

Page 16: Power-Optimal Pipelining in Deep Submicron Technology

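Pipelining and Leakage Power: Analytical Trend

[Figure: leakage power vs. number of FO4 per stage, N. Logic leakage power scales as O(N^γ) (1 < γ < 2) (superlinear reduction as N decreases, since Vdd · e^(Vdd) ∝ N^γ, with the exponential term due to the DIBL effect); flip-flop overhead scales as O(1/N). The optimal FO4 marks the optimal saving.]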
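A power law with an exponent between 1 and 2 is what a linear term times a mild exponential looks like on log-log axes. The sketch below checks this numerically for a leakage model of the form Vdd · exp(k · Vdd). The DIBL sensitivity `k` (in 1/V) and the sample voltage are hypothetical values; for this model the local exponent works out to 1 + k · Vdd.

```python
import math


def leakage(vdd, k):
    """Toy leakage power: supply voltage times a DIBL-driven exponential."""
    return vdd * math.exp(k * vdd)


def loglog_slope(vdd, k, h=1e-4):
    """Local exponent d(ln P)/d(ln Vdd), estimated by a central difference."""
    lo, hi = vdd * (1.0 - h), vdd * (1.0 + h)
    return (math.log(leakage(hi, k)) - math.log(leakage(lo, k))) / (
        math.log(hi) - math.log(lo)
    )


# Hypothetical operating point: Vdd = 0.8 V, k = 0.5 per volt.
exponent = loglog_slope(0.8, 0.5)
```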

Page 17: Power-Optimal Pipelining in Deep Submicron Technology

Pipelining and Idle Power: Analytical Trend

• Clock-gating is not always possible
  – Increased control complexity
  – Insufficient setup time of clock enable signal
• Idle power = Leakage Power + Flip-flop Switching Power
  – Scales between leakage power scaling and flip-flop switching power scaling, depending on leakage level

Page 18: Power-Optimal Pipelining in Deep Submicron Technology

Pipelining and Idle Power: Analytical Trend

[Figure: two panels of relative idle power vs. number of FO4 per stage, N, each marking the optimal FO4 and the optimal saving.
  Leakage Power Scale: curves O(1/N) and O(N^γ) (1 < γ < 2).
  Flip-flop Switching Power Scale: curves O(1/N) and O(N); linear reduction of flip-flop switching power, since 1/N · Vdd^2 ∝ N.]
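The linear trend for flip-flop switching power follows from multiplying two scalings: the flip-flop count grows as 1/N while Vdd^2 shrinks as N^2. A small sketch of that arithmetic, assuming Vdd is proportional to N and normalizing to a hypothetical reference depth `n_ref` (24 is used here only because it is the upper end of the N range in the methodology):

```python
def relative_ff_switching_power(n, n_ref=24):
    """Flip-flop switching power at depth N relative to depth n_ref.

    Assumes the flip-flop count scales as 1/N and Vdd scales in proportion to N.
    """
    if n <= 0 or n_ref <= 0:
        raise ValueError("logic depths must be positive")
    flip_flop_count = n_ref / n        # more stages, hence more flip-flops
    vdd_squared = (n / n_ref) ** 2     # lower supply voltage, squared
    return flip_flop_count * vdd_squared   # simplifies to n / n_ref


# Halving the logic depth halves this term under the stated assumptions.
for n in (24, 12, 6):
    print(n, relative_ff_switching_power(n))
```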

Page 19: Power-Optimal Pipelining in Deep Submicron Technology

Simulation Results: Power Components

Fixed Throughput @ 2 GHz

Power component         Switching Power         Leakage Power           Idle Power
Right-hand side curve   O(N^2)                  O(N^γ) (1 < γ < 2)      O(N) or O(N^γ) (1 < γ < 2)
Saving*                 79% (HVT) ~ 82% (LVT)   70% (LVT) ~ 75% (HVT)   55% (HVT) ~ 70% (LVT)
N*                      6                       6                       8

N = Number of FO4 inverters per stage (not including flip-flop delay)
N* = Optimal N
Saving* = Optimal power saving by pipelining

Page 20: Power-Optimal Pipelining in Deep Submicron Technology

Optimal Power Saving

[Figure: two panels of relative power vs. activity factor. Clock Gating: optimal FO4 = 6. No Clock Gating: optimal FO4 = 6~8.]

* 2 GHz
* Flip-flop delay not included in optimal FO4

Page 21: Power-Optimal Pipelining in Deep Submicron Technology

Optimal Power Saving

[Figure: two panels of relative power vs. activity factor (Clock Gating: optimal FO4 = 6; No Clock Gating: optimal FO4 = 6~8), annotated with the switching power, leakage power, and idle power components.]

Page 22: Power-Optimal Pipelining in Deep Submicron Technology

Optimal Power Saving

[Figure: two panels of relative power vs. activity factor (Clock Gating: optimal FO4 = 6; No Clock Gating: optimal FO4 = 6~8), with LVT highlighted.]

Page 23: Power-Optimal Pipelining in Deep Submicron Technology

Discussion

• LVT can be fast and power-efficient
  – Enables lower Vdd
• Flip-flop delay is more important than flip-flop power for power-optimal pipelining

Page 24: Power-Optimal Pipelining in Deep Submicron Technology

Limitation of This Work

Factors considered, by their effect on optimal logic depth and on optimal power saving:

• Super-linear growth of flip-flops
• Additional memory
• Reduced glitches
• Parasitic wire capacitance

Page 25: Power-Optimal Pipelining in Deep Submicron Technology

Conclusion

• Pipelining is an effective low-power tool when used to support voltage scaling in digital systems implementing highly parallel computation.
• Optimal Logic Depth: 6-8 FO4
  – ~8-10 FO4 including flip-flop delay
• Optimal Power Saving: 55-80%
  – Depends on Vth, AF, and clock-gating
• Insights:
  – Pipelining is more effective with high AF
    • Pipelining is most effective at saving switching power
  – Pipelining is more effective with lower Vth
    • Except when leakage power is dominant
  – Pipelining is more effective with clock-gating
    • Reduced flip-flop overhead

Page 26: Power-Optimal Pipelining in Deep Submicron Technology

Acknowledgments

• Thanks to SCALE group members and anonymous reviewers
• Funded by NSF CAREER award CCR-0093354, NSF ITR award CCR-0219545, and a donation from Intel Corporation.

Page 27: Power-Optimal Pipelining in Deep Submicron Technology

BACKUP SLIDES