L26: Power Estimation

Upload: prosper-farmer

Post on 17-Dec-2015


[Figure: Microprocessor power dissipation. Power (W), 5 to 50, versus year, 1980 to 2000. Processors plotted: i286, i386 DX 16, i486 DX 25, i486 DX 50, i486 DX2 66, i486 DX4 100, P5 66, P6 166, P II 300, P III 500, P-PC601 50, P-PC604 133, P-PC750 400, Alpha 21064 200, Alpha 21164, Alpha 21264.]

[Figure: Battery technology. Capacity (Watt-hour/lb), 10 to 50, versus year, 1965 to 2000. Curve: Nickel-Cadmium. Annotation: "Is it possible?"]


Cost vs Power

Power        Strategy              Cost
1 Watt       none                  none
3-5 Watt     heat sink, air flow   $1-5
5-15 Watt    fan sink              $10-15
15+ Watt     exotic                $50+

Figure labels: Laptop Computer, Low power processor


Power Estimation

• Circuit Level Power Estimation

• Logic/module Level Power Estimation

• High Level Power Estimation

• Software Power Estimation


Power Estimation Techniques

• Circuit Simulation (SPICE): a set of input vectors, accurate, memory and time constraints

• Monte Carlo: randomly generated input patterns; the power per time interval T, measured with a simulator, is normally distributed

• Switch-level simulation (IRSIM): activity is defined as the number of rising and falling transitions over the total number of inputs

• PowerMill (transistor level): models steady-state transitions, hazards and glitches, transient short-circuit current, and leakage current; measures current density and voltage drop in the power net and identifies reliability problems caused by electromigration (EM) failures, ground bounce, and excessive voltage drops

• DesignPower (Synopsys): simulation-based analysis is within 8-15% of SPICE; probability-based analysis is within 15-20% of SPICE
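The Monte Carlo bullet can be illustrated with a minimal sketch. It assumes a stand-in power model (`toggle_power`, the number of toggling input bits, which is hypothetical and not a real simulator) and a stopping rule based on a 95% confidence interval over per-interval averages; the batch size, error bound, and function names are illustrative choices, not taken from the slides.

```python
import math
import random

def toggle_power(prev, cur):
    """Hypothetical stand-in for a simulator: power ~ number of toggling input bits."""
    return bin(prev ^ cur).count("1")

def monte_carlo_power(n_inputs=16, batch=30, rel_err=0.05, z=1.96,
                      max_batches=10000, seed=0):
    """Average per-interval power samples until the confidence
    half-width drops below rel_err times the running mean."""
    rng = random.Random(seed)
    prev = rng.getrandbits(n_inputs)
    samples = []
    while len(samples) < max_batches:
        total = 0
        for _ in range(batch):               # one time interval T = `batch` random patterns
            cur = rng.getrandbits(n_inputs)
            total += toggle_power(prev, cur)
            prev = cur
        samples.append(total / batch)        # interval averages are roughly normal
        if len(samples) >= 10:
            mean = sum(samples) / len(samples)
            var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
            half_width = z * math.sqrt(var / len(samples))
            if half_width <= rel_err * mean:
                return mean, half_width, len(samples)
    raise RuntimeError("estimate did not converge")
```

With uniformly random patterns each bit toggles half the time, so the estimate should settle near `n_inputs / 2`.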


Power Estimation Techniques

• Static (non-simulative): useful for synthesis and architectural exploration
  – Probability-based
  – Entropy-based
• Dynamic (simulative): useful for final power
  – Direct
  – Sampling-based
  – Compaction-based
• Hybrid (high-level simulation + low-level analytical model evaluation)
  – Power macromodels for datapath, control, memory
  – Instruction-level models for microprocessors, DSPs


Previous work(1)

• Simulation based approach

– accurate and system independent

– pattern dependent and after implementation

– Direct simulation

• SPICE, transistor-level simulator, IRSIM

– Statistical simulation

• Monte Carlo simulation


Previous work(2)

• Non-simulation based approach
  – library, stochastic, information theoretic model
  – Behavioral-level approach
    • library (parameter, area, delay, internal power dissipation)
    • useful in comparing different adder and multiplier architectures for their switching activity
    • stochastic
      – using probability density function, joint probability density function
    • Information theoretic
      – entropy


Previous work(3)

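• Logic-level approach
  – using signal probability
  – zero-delay based approach
  – OBDD

[Equation garbled in extraction: "2 (1 )(1 )n n np p", an expression in the signal probability p_n]

$$\mathrm{prob}(z) = \mathrm{prob}(x)\,\mathrm{prob}(f_x) + \mathrm{prob}(\bar{x})\,\mathrm{prob}(f_{\bar{x}})$$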
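The signal-probability recursion on this slide (Shannon expansion over one input at a time) can be sketched as below. The sketch assumes independent inputs and enumerates cofactors recursively, which is exponential in the number of inputs; an OBDD would share sub-results instead. The switching-activity estimate `2 * p * (1 - p)` is a common zero-delay, temporal-independence simplification and is an assumption here, since the slide's own activity formula did not survive extraction.

```python
def signal_probability(f, probs):
    """prob(f = 1) for independent inputs via recursive Shannon expansion.

    f maps a tuple of 0/1 input values to 0/1; probs[i] is prob(input i = 1).
    """
    def expand(assigned):
        i = len(assigned)
        if i == len(probs):
            return float(f(tuple(assigned)))
        p = probs[i]
        # prob(z) = prob(x) * prob(f_x) + prob(not x) * prob(f_not_x)
        return p * expand(assigned + [1]) + (1 - p) * expand(assigned + [0])
    return expand([])

def nor2(x):
    return int(not (x[0] or x[1]))

p_z = signal_probability(nor2, [0.5, 0.5])
# Assumed zero-delay activity estimate (not taken from the slide):
activity = 2 * p_z * (1 - p_z)
```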


Circuit Level

• SPICE
  – classical tool for power analysis of circuits
  – large runtime for large circuits
  – mainly used as the reference for other power estimation tools
• PowerMill
  – uses a simplified electrical model of the transistor
  – operating conditions of the transistor are stored as look-up tables (interpolated by piecewise linear approximation)
  – stages: partitioned subcircuits of source/drain connected transistors
  – 2-3 orders of magnitude faster than SPICE


Introduction

• Glitch – additional power is typically 20%

[Figure: digital CMOS circuits between two flip-flops driven by a common clock; labels: inputs, outputs, clock, flip-flop]


Circuit Level

• IRSIM
  – event-driven, switch-level simulator
  – circuit modeled by capacitive nodes and transistors
  – partitioning into stages is used
  – voltage levels: High, Low, Undetermined

[Figure: Partitioning into stages]


Theoretical background

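• Synchronous system controlled by a global clock

[Figure: clock and input signal x(t); labels: cycles 1, 2, 3, ..., N, and S]

Signal probability:

$$P_{xi} = \lim_{N\to\infty} \frac{\sum_{n=1}^{N*S} x_i(n)}{N*S}$$

Transition probability (0 to 1):

$$P_{xi}^{0\to1} = \lim_{N\to\infty} \frac{\sum_{n=1}^{N*S} \overline{x_i(n)}\,x_i(n+1)}{N*S}$$

[Remaining relation garbled in extraction: "0 0 0 1 0 1 0 1 1 1 xi xi xi xi xi xip p p p p p"; it involves the transition probabilities $p_{xi}^{0\to0}$, $p_{xi}^{0\to1}$ and $p_{xi}^{1\to1}$]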
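The limits on this slide can be approximated over a finite sample stream. The sketch below is an assumed finite-sample version: it divides by the stream length rather than taking N to infinity, and the last sample has no successor, so one transition term is missing. The function name is illustrative.

```python
def signal_and_transition_probability(samples):
    """Estimate P_xi and P_xi(0->1) from a list of 0/1 samples x_i(n)."""
    total = len(samples)
    p_one = sum(samples) / total
    # (1 - x(n)) * x(n+1) is 1 exactly when the signal rises between n and n+1
    rising = sum((1 - a) * b for a, b in zip(samples, samples[1:]))
    p_rise = rising / total
    return p_one, p_rise
```

For the stream `[0, 1, 1, 0, 1, 0, 0, 1]` there are four ones and three rising transitions in eight samples.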


Hierarchical approach to power estimation of combinational circuits (1)

• Estimate power of a large circuit in a short time
  – model sub-circuit
  – compute steady-state prob.
  – compute edge-activity using state-transition diagram (std)
  – compute energy
• State-Transition Diagram
  – 2-input NOR

$$node_2(n+1) = (1 - x_1(n)) + x_1(n) * x_2(n) * node_2(n)$$

$$node_3(n+1) = (1 - x_1(n))(1 - x_2(n))$$

[Figure: 2-input NOR gate with inputs X1, X2, internal node node2, and output node node3]
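The state-transition diagram of the 2-input NOR can be enumerated directly from the node equations. This is a sketch under the assumption that the state is the pair (node2, node3) and that the equations read node2(n+1) = (1 - x1) + x1*x2*node2 and node3(n+1) = (1 - x1)(1 - x2); the function names are illustrative.

```python
from itertools import product

def nor2_next_state(x1, x2, node2):
    """Next values of the internal node (node2) and the output (node3)."""
    node2_next = (1 - x1) + x1 * x2 * node2   # charged when x1 = 0, held when x1 = x2 = 1
    node3_next = (1 - x1) * (1 - x2)          # NOR output
    return node2_next, node3_next

def build_std():
    """Map (state, input pair) -> next state for every combination."""
    edges = {}
    for node2, node3 in product((0, 1), repeat=2):
        for x1, x2 in product((0, 1), repeat=2):
            edges[((node2, node3), (x1, x2))] = nor2_next_state(x1, x2, node2)
    return edges
```

Only three states are ever entered: (1, 1), (1, 0), and (0, 0); the output cannot be high while node2 is low.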


Hierarchical approach to power estimation of combinational circuits (2)


Hierarchical approach to power estimation of combinational circuits (3)

• Computation of steady-state prob.
  – Compute edge prob.
  – Make state-transition matrix
  – Compute steady-state prob.
  – Compute edge-activity
• Energy computation of each edge in the std
  – Compute edge activity energy using SPICE

$$energy = \sum_{j=0}^{\#\,of\,edges\,-\,1} W_j * EA_j$$
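The four steps and the energy sum can be sketched for the 2-input NOR as follows. Assumptions: inputs are independent from cycle to cycle with probabilities `p1` and `p2`, the steady state is found by power iteration, and the edge energies W_j are placeholders (the number of rising nodes per edge) rather than SPICE-characterized values.

```python
from itertools import product

def next_state(state, x1, x2):
    node2, _ = state
    return ((1 - x1) + x1 * x2 * node2, (1 - x1) * (1 - x2))

def edge_activities(p1, p2, iters=200):
    """Steady-state probabilities and edge activities EA_j of the NOR std."""
    states = [(0, 0), (1, 0), (1, 1)]              # reachable (node2, node3) states
    inputs = list(product((0, 1), repeat=2))

    def p_in(x1, x2):                              # edge probability of an input pair
        return (p1 if x1 else 1 - p1) * (p2 if x2 else 1 - p2)

    pi = {s: 1 / len(states) for s in states}
    for _ in range(iters):                         # power iteration on the transition matrix
        nxt = {s: 0.0 for s in states}
        for s in states:
            for x1, x2 in inputs:
                nxt[next_state(s, x1, x2)] += pi[s] * p_in(x1, x2)
        pi = nxt
    ea = {}
    for s in states:                               # EA_j = prob(source state) * prob(edge input)
        for x1, x2 in inputs:
            edge = (s, next_state(s, x1, x2))
            ea[edge] = ea.get(edge, 0.0) + pi[s] * p_in(x1, x2)
    return pi, ea

def average_energy(ea, w):
    """energy = sum over edges of W_j * EA_j."""
    return sum(w.get(edge, 0.0) * activity for edge, activity in ea.items())

pi, ea = edge_activities(0.5, 0.5)
# Hypothetical W_j: one energy unit per node that rises on the edge
w = {(src, dst): sum(d > s for s, d in zip(src, dst)) for (src, dst) in ea}
energy = average_energy(ea, w)
```

In a real flow the dictionary `w` would hold per-edge energies measured once per cell, so the same values can be reused for every instance of that cell.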


Hierarchical approach to power estimation of combinational circuits (4)

• Computation of output signal parameters
  – compute x3 using the std of NOR 1
  – compute the energy for the second NOR using the Wj calculated for NOR 1 and the EAj obtained for the second NOR

[Figure: two cascaded 2-input NOR gates; labels: X1, X2, node2, node3, X3, X4]


Hierarchical approach to power estimation of combinational circuits (5)

• Loading and routing considerations
  – Recompute the edge energies taking the load capacitance into account
  – Effect of loading can be taken into account


Power estimation of sequential circuits

• Sequential block has a combinational block and some storage elements like flip-flop

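• Extend the method to the flip-flop, with clock $\phi_1$

$$node_2(n+1) = D(n) * \phi_1(n) + (1 - \phi_1(n)) * node_4(n)$$

$$node_3(n+1) = 1 - node_2(n+1)$$

$$node_4(n+1) = node_2(n+1)$$

[Figure: flip-flop with input D, output Q, intermediate signal Q1, and internal nodes node2 to node7; state-transition diagram with states 010 and 101 (state bits node2, node3, node4; 010 means node2=0, node3=1, node4=0) and edges labeled 00,10,11 / 00,01,10 / 11 / 01]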


Experimental result(1)

• Power estimation of basic cells and multipliers


Logic Level

• Pattern dependent analysis
  – example: Entice-Aspen System ('94 Workshop on Low Power Design)
  – cell characterization
    • using SPICE simulation under different conditions
    • parameters: supply voltage, input signal slope, operating temperature, fabrication process variation
    • modeling styles: polynomials, tables, piecewise linear
    • power vector: set of logic values and signal transitions
  – activity analysis
    • using Verilog-XL simulator
    • find event vector in the power vector set
    • total energy = Σ (energy of power vector) × (# of occurrences)
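The activity-analysis step reduces to a weighted count: each simulated event is matched to a characterized power vector, and the total energy is the per-vector energy times its number of occurrences, summed over vectors. A minimal sketch, with hypothetical vector names and energy values standing in for a characterized cell library:

```python
from collections import Counter

def total_energy(events, energy_per_vector):
    """Sum of (energy of power vector) * (# of occurrences) over all vectors."""
    counts = Counter(events)
    return sum(energy_per_vector[vector] * n for vector, n in counts.items())

# Hypothetical characterization data (arbitrary energy units)
energy_per_vector = {"nand2:A_rise": 1.5, "nand2:B_fall": 2.0}
# Hypothetical event trace produced by a logic simulator
events = ["nand2:A_rise", "nand2:B_fall", "nand2:A_rise"]
energy = total_energy(events, energy_per_vector)
```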


High Level Power Estimation

• RTL power estimation
  – problem
    • given an RTL circuit description consisting of m modules, and an input vector sequence of length N, estimate the average power consumption
  – estimation process
    • perform behavioral simulation and collect the input statistics for all modules in the RTL description
    • evaluate the power macro-model equation for each module and sum over the modules
  – implementation
    • in the form of a power co-simulator
    • collect input statistics from the output of the behavioral simulator
    • produce the power value at the end


High Level Power Estimation

• Power macro model
  – census macro modeling
  – sampler macro modeling
  – adaptive macro modeling
• Census macro modeling
  – input data statistics must be collected for every simulation cycle
  – very slow simulation
  – assumed input vectors
    • macro-model is biased
    • e.g. pseudo-random, speech data, etc.

[Figure: census macro modeling flow; blocks and labels: Input vectors, Behavioral Simulator, input vectors for each module, Census Macro Modeling, Power estimate]

[Figure: sampler macro modeling flow; blocks and labels: Input vectors, Behavioral Simulator, input vectors for each module, Sampler Macro Modeling, Power estimate; Gate-level Power Simulation annotated "requires input vectors", "vectors", "power", "confidence level and interval"]


High Level Power Estimation

• Sampler macro modeling
  – collects and analyzes input vectors for a relatively small number of cycles
  – uses statistical random sampling methods
• Adaptive macro modeling
  – involves a gate-level simulator on a small number of cycles
  – improves the estimation accuracy
  – bias of the static macro models is reduced


High Level Power Estimation

• Power estimation at high level
  – statistical technique
    • only considers the operations of a given type and the number of bus, register, and memory accesses
  – power dissipation depends on
    • data activity
    • physical capacitance
  – two approaches to account for physical capacitance
    • develop analytic models for estimating the switched capacitance
    • synthesize the circuit and then estimate its power dissipation


High Level Power Estimation

• Develop analytic models
  – develop analytic models for estimation
  – the model is a function of the circuit complexity and technology/library parameters
  – key issue
    • estimation of the circuit complexity
• Synthesis approach
  – procedure
    • quick synthesis
    • estimate power dissipation using RTL/gate-level estimation techniques
  – tends to be more accurate
  – requires the development of a quick synthesis capability
    • much more time-efficient than a full synthesis program


Software Power Estimation

• Objective

– estimate the power dissipation of a piece of code

• Lower level method

– gate level power estimation

• Higher level method

– architectural power estimation

– bus switching activity

– instruction level power analysis


Software Power Estimation

• Gate level power estimation
  – most accurate method
  – too slow
  – usefulness
    • evaluate the power dissipation behavior of a processor design
    • characterize the processor to find the more efficient instructions
• Architectural power estimation
  – less precise but much faster
  – determine which system components are active in each execution cycle

• Bus switching activity

– bus activity is assumed to be representative of overall switching activity

– computed from the sequence of op-codes, addresses, and data

• Instruction level power analysis

– characterize the power dissipation of instruction sequence

– use for optimizing a program based on the power estimate
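For the bus switching activity method, the quantity of interest is the number of bus lines that toggle between consecutive words (op-codes, addresses, or data). A minimal sketch, assuming the bus trace is given as a list of integers and a hypothetical bus width:

```python
def bus_switching_activity(words, width=32):
    """Total number of bit toggles between consecutive words on a bus."""
    mask = (1 << width) - 1
    return sum(bin((a ^ b) & mask).count("1")
               for a, b in zip(words, words[1:]))
```

For the 4-bit trace `0000, 1111, 1010` the count is 4 + 2 = 6 toggles; multiplying by a per-line energy would turn this into an energy estimate.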


Software Power Estimation

• Instruction level power analysis
  – base cost
    • independent of the prior state of the processor
  – circuit state effects
    • take into account the effect of prior processor state

Base cost (pJ):

Instruction   Base cost
LOAD          1.98
DLOAD         2.37
ADD           0.99
MULT          1.19
LOAD;ADD      2.10
LOAD;MULT     2.25

Circuit state effects (pJ):

             LOAD   DLOAD  ADD    MULT   LOAD;ADD  LOAD;MULT
LOAD         0.13
DLOAD        0.15   0.17
ADD          1.19   1.19   0.26
MULT         0.92   0.92   0.53   0.66
LOAD;ADD     1.25   1.32   0.86   0.79   0.40
LOAD;MULT    1.06   1.06   0.99   0.96   0.53      0.79

Example:

Instruction                  Base   Circuit state
DLOAD A<-x, B<-y             2.37   1.19
LOAD C<-x; MULT D<-A,B       2.25   1.06
ADD A<-C,D                   0.99   0.99
Total                        5.61   3.24
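The example totals can be reproduced from the base-cost and circuit-state tables. The sketch assumes that the circuit-state matrix is symmetric (only its lower triangle is listed) and that the three instructions form a loop body, so the first instruction pays a state cost relative to the last one; that reading is an assumption, chosen because it matches the 1.19 pJ listed for DLOAD.

```python
BASE = {"LOAD": 1.98, "DLOAD": 2.37, "ADD": 0.99,
        "MULT": 1.19, "LOAD;ADD": 2.10, "LOAD;MULT": 2.25}
ORDER = list(BASE)                      # row/column order of the state table
LOWER = [                               # lower triangle of circuit state effects (pJ)
    [0.13],
    [0.15, 0.17],
    [1.19, 1.19, 0.26],
    [0.92, 0.92, 0.53, 0.66],
    [1.25, 1.32, 0.86, 0.79, 0.40],
    [1.06, 1.06, 0.99, 0.96, 0.53, 0.79],
]

def state_cost(prev, cur):
    """Circuit state cost between two instructions (symmetry assumed)."""
    i, j = ORDER.index(prev), ORDER.index(cur)
    return LOWER[max(i, j)][min(i, j)]

def program_energy(seq, loop=True):
    """Return (base total, circuit state total) in pJ for an instruction sequence."""
    base = sum(BASE[ins] for ins in seq)
    pairs = list(zip(seq, seq[1:]))
    if loop:                            # assumed: the sequence repeats as a loop body
        pairs.insert(0, (seq[-1], seq[0]))
    state = sum(state_cost(a, b) for a, b in pairs)
    return base, state

base_total, state_total = program_energy(["DLOAD", "LOAD;MULT", "ADD"])
```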


Power Optimization

• Modeling and Technology
  – Sources of power consumption
    • Switching component
    • Short-circuit component
    • Leakage component
    • Static power
  – Voltage Scaling
  – Adiabatic switching
• Circuit Design Level
• Logic and Module Design Level
• Architecture and System Design Level
• Some Design Examples


Switching component

– energy for charging parasitic capacitors (gate, diffusion, and interconnect)

– ex1. In CMOS, output nodes are charged or discharged.

– ex2. Charge sharing

$$\frac{1}{2} C_L V_{DD}^2$$

[Figure: CMOS gate with PMOS network and NMOS network; labels: High, Low, Evaluation]


Short circuit component

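– finite rise and fall times create a direct current path between Vdd and GND

$$V_{Tn} < V_{in} < V_{dd} - |V_{Tp}|$$ (both NMOS and PMOS are turned on)

$$I_{sc} = \frac{4}{T}\int_{t_1}^{t_2} i(t)\,dt = \frac{4}{T}\int_{t_1}^{t_2} \frac{k'}{2}\,(V_{in}(t) - V_T)^2\,dt = \frac{k'}{12}\,\frac{\tau}{T}\,\frac{(V_{DD} - 2V_T)^3}{V_{DD}}$$

where T is the switching period and $\tau$ is the input rise/fall time.

[Figure: inverter with input Vin and short-circuit current Is; waveforms of Vin and Is with the conduction interval t1 to t2]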
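The closed form for the mean short-circuit current can be checked numerically. The sketch assumes the usual simplifications behind that expression: a symmetric inverter with equal thresholds V_T, no load, and a linear input ramp V_in(t) = V_DD * t / tau, so that t1 = (V_T / V_DD) * tau and t2 = tau / 2. Parameter names and values are illustrative.

```python
def short_circuit_current_closed_form(k, v_dd, v_t, tau, period):
    """I_sc = (k'/12) * (tau/T) * (V_DD - 2*V_T)^3 / V_DD."""
    return (k / 12.0) * (tau / period) * (v_dd - 2 * v_t) ** 3 / v_dd

def short_circuit_current_numeric(k, v_dd, v_t, tau, period, steps=10000):
    """Same quantity via midpoint integration of i(t) = (k'/2) * (V_in(t) - V_T)^2."""
    t1 = v_t / v_dd * tau            # input crosses V_T
    t2 = tau / 2                     # input reaches V_DD / 2 (peak current)
    dt = (t2 - t1) / steps
    integral = 0.0
    for n in range(steps):
        t = t1 + (n + 0.5) * dt
        v_in = v_dd * t / tau        # assumed linear input ramp
        integral += 0.5 * k * (v_in - v_t) ** 2 * dt
    return 4.0 / period * integral   # factor 4: two symmetric halves, two edges per period
```

The cubic dependence on (V_DD - 2*V_T) means the short-circuit component vanishes as the supply approaches twice the threshold voltage.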


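Leakage component

• reverse-biased PN junction current

$$I_R = I_0\,(e^{V_j/V_t} - 1), \qquad |I_R| \approx I_0 \text{ under reverse bias}$$

• subthreshold current

$$I_{ds} = I_0\,e^{(V_{gs} - V_T)/(n V_t)}\,(1 - e^{-V_{ds}/V_t})$$

(V_T: threshold voltage; V_t: thermal voltage)

[Figure: device cross-section with Gate, P+ regions in an N region, Vdd, and Gnd; label: reverse leakage current]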


Static power

– although CMOS circuits consume power only when switching, some situations consume static power.

– reduced voltage levels feeding CMOS gates

– pseudo-NMOS logic style
  • single PMOS pullup network (always ON because the gate is grounded)
  • when the output is driven low, a conducting path from the supply to ground is created.

[Figure: CMOS gate fed by a reduced input level; labels: Vdd, Vdd, Weakly turned on, Vdd-Vtn]


Voltage Scaling

• Why?
  – Lowering the supply voltage is the most effective means of power reduction.
  – The feature size of the process geometry decreases.
  – Smaller process geometry requires the voltage to be lowered because of the thinner gate oxide.
  – Although the delay increases as the voltage is lowered, the shorter channel length of the advanced process increases circuit performance.

$$P = C_L V_{dd}^2 f + I_{SC} V_{DD} + I_{Leakage} V_{DD}$$
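The quadratic dependence of the switching term on the supply voltage is what makes voltage scaling so effective. A small sketch of the three-term power expression, with hypothetical component values (1 pF load, 100 MHz) used only to show the 5 V to 3.3 V ratio:

```python
def total_power(c_load, v_dd, freq, i_sc, i_leak):
    """P = C_L * V_dd^2 * f + I_SC * V_DD + I_Leakage * V_DD."""
    switching = c_load * v_dd ** 2 * freq
    short_circuit = i_sc * v_dd
    leakage = i_leak * v_dd
    return switching + short_circuit + leakage

# Hypothetical values; switching term only, to isolate the V^2 effect
p_5v = total_power(1e-12, 5.0, 100e6, 0.0, 0.0)
p_3v3 = total_power(1e-12, 3.3, 100e6, 0.0, 0.0)
ratio = p_3v3 / p_5v    # (3.3 / 5)^2, about 0.44
```

Scaling from 5 V to 3.3 V therefore cuts the switching power by a little more than half at the same frequency and load.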


Voltage Scaling

• Scaling from 5V to 3.3V

– External components (TTL) operated at 5V, and the cost of interfacing with them made voltage scaling difficult.

– Components from the low-voltage industry such as LVTTL, CMOS, and BiCMOS (which operate at 3.3V) make voltage scaling possible at low cost.

• Scaling below 3.3V

– Depending on the technology, the supply voltage can be lower than 3.3V.

– The supply voltage cannot be too close to the threshold voltage, or there is a significant speed loss.


Performance vs. Voltage scaling

– The lowest voltage possible without significant loss of performance is the voltage at which the electron velocity comes out of saturation.

– As the feature size shrinks, the same electric field can be obtained even when the supply voltage is decreased (i.e., velocity saturation occurs at a lower supply voltage).

[Figure: electron speed versus supply voltage, leveling off at the terminal speed; annotation: as process technology shrinks]


Adiabatic Switching

• Energy stored on a node with capacitance C charged to a voltage V is Esig = CV^2/2

• Energy drawn from the power supply is Einj = QV = CV^2

• Einj = 2 Esig: half of the energy drawn from the supply is dissipated

• Also, Esig is dissipated when the node is pulled low.

• All energy drawn from the supply is used only once before being discarded.


Adiabatic Switching

• Solution: charge the load from a supply that is at the same potential as the load

• Same supply voltage as the voltage of the load

• Charge transfer proceeds sufficiently slowly to not require a large potential drop. Energy dissipation varies roughly with the inverse of the switching time.

• Difficulties

– switching transitions must occur when there is no potential drop across the switching devices

– zero energy dissipation occurs only with arbitrarily slow switching: with realistic switching rates, the energy savings may not be sufficient compared to the added circuit complexity.
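The inverse dependence on switching time can be seen in a small numerical experiment: charge a capacitor C through a resistance R either from a step supply or from a slow linear ramp, and integrate the energy dissipated in R. This is a forward-Euler sketch under idealized assumptions (constant on-resistance, ideal ramp); the commonly quoted approximation E ≈ (RC/T) * C * V^2 for a ramp time T much longer than RC is used only as a reference and is not taken from the slides.

```python
def dissipated_energy(r, c, v, ramp_time, dt_frac=1e-3, settle=20.0):
    """Energy dissipated in r while charging c to v; ramp_time = 0 means a step supply."""
    tau = r * c
    dt = tau * dt_frac
    steps = int((ramp_time + settle * tau) / dt)
    v_c, energy = 0.0, 0.0
    for n in range(steps):
        t = n * dt
        # Supply voltage: ideal step, or linear ramp that then holds at v
        v_s = v if ramp_time == 0 or t >= ramp_time else v * t / ramp_time
        i = (v_s - v_c) / r
        energy += r * i * i * dt         # instantaneous loss in the switch resistance
        v_c += i * dt / c                # forward-Euler update of the node voltage
    return energy

e_step = dissipated_energy(1.0, 1.0, 1.0, 0.0)      # conventional charging
e_ramp = dissipated_energy(1.0, 1.0, 1.0, 100.0)    # ramp 100x slower than RC
```

With R = C = V = 1, the step case dissipates about CV^2/2 = 0.5, while the 100 RC ramp dissipates about 0.01, which is the 1/T scaling the slide refers to.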


Adiabatic Switching

• Example: inverter
  – 1. inputs (X, X') are set to a valid value.
  – 2. (evaluation) Va is slowly ramped from 0 to Vdd. One of Y and Y' is adiabatically charged to Vdd.
  – 3. (hold) Y and Y' can be used as the inputs of another stage.
  – 4. (restore) Va is ramped from Vdd to 0.

[Figure: adiabatic inverter with inputs X, X', outputs Y, Y', and supply Va; waveforms of X, X', Y, Y', Va]


Adiabatic Switching

• Other components of power consumption

– on-resistance of switches

– Process improvement(parasitic capacitance reduction) allows lower power consumption in adiabatic circuits

• Application

– When a small number of circuit nodes with significant capacitance are driven by a high voltage.

– Capacitive transducers, LCD panels, etc.


Reduction of Switched Capacitance

• Reducing the switching activity of the digital circuits to the minimal level required to perform the computation saves power.

• Methods

– power down mode of the chip

– gated clock

– circuit optimization to reduce transitions

– reduction of # of operations(algorithm change)

– data representation

– resource ordering

– logic style : Dynamic or Static

– layout optimization