2014-2-11 john lazzaro (not a prof - “john” is always ok)

76
UC Regents Spring 2014 © UCB CS 152: L7: Power and Energy 2014-2-11 John Lazzaro (not a prof - “John” is always OK) CS 152 Computer Architecture and Engineering www-inst.eecs.berkeley.edu/ ~cs152/ TA: Eric Love Lecture 7 -- Power and Energy Pla y:

Upload: fonda

Post on 21-Jan-2016

25 views

Category:

Documents


0 download

DESCRIPTION

www-inst.eecs.berkeley.edu/~cs152/. CS 152 Computer Architecture and Engineering. Lecture 7 -- Power and Energy. 2014-2-11 John Lazzaro (not a prof - “John” is always OK). TA: Eric Love. Play:. Today: Power and Energy. Metrics: Power and energy (intro). Short Break. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

2014-2-11

John Lazzaro(not a prof - “John” is always OK)

CS 152Computer Architecture and Engineering

www-inst.eecs.berkeley.edu/~cs152/

TA: Eric Love

Lecture 7 -- Power and Energy

Play:

Page 2: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Today: Power and Energy

Metrics: Power and energy (intro)

Short Break.

Metrics: Power and energy (technique)

Page 3: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Power and Energy

Universal: Power and energy units are comparable across all of applied physics.

So, we use automobilesto introduce terminology.

Page 4: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

The Joule: Unit of energy. A 1 Gallon gas container holds 130 MJ of energy.

1 J = 1 W s.

1 W = 1 J/s.

The Watt: Unit of power.A rate of energy (J/s). A gas pump hose delivers 6 MW.

120 KW: The power delivered by a Tesla Supercharger. Tesla Model S has a306 MJ battery (good for 265 miles).

Page 5: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Energy and Performance

Sad fact: Computers turn electrical energy into heat. Computation is a byproduct.

Air or water carries heat away, or chip melts.

Page 6: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

+1V -

1 Ohm Resistor

1A

1 Joule heats 1 gram of water 0.24 degree C

This is how electric tea pots work ...

1 Joule of Heat Energy per Second

1 Watt

20 W rating: Maximum power the package is able to transfer to the air. Exceed rating and resistor burns.

The Watt: Unit of power. The amount of energy burned in the resistor in 1 second.

The Joule: Unit of energy.Can also be expressed as Watt-Seconds. Burning 1 Watt for 100 seconds uses 100 Watt-Seconds of energy.

Page 7: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Cooling an iPod nano ...

Like resistor on last slide, iPod relies on passive transfer of heat from case to the air.

Why? Users don’t want fans in their pocket ...

To stay “cool to the touch” via passive cooling,

power budget of 5 W.

If iPod nano used 5W all the time, its battery would last 15 minutes ...

Page 8: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Powering an iPod nano (2005 edition)

1.2 W-hour battery: Can supply 1.2 watts of power for 1 hour.

1.2 W-hr / 5 W ≈ 15 minutes.

Real specs for iPod nano : 14 hours for music,

4 hours for slide shows.

85 mW for music.300 mW for slides.

More W-hours require bigger battery and thus bigger “form factor” -- it wouldn’t be “nano” anymore :-).

Page 9: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Finding the (2005) iPod nano CPU ...

Two 80 MHz CPUs. One CPU used for audio, one for slides.

Low-power ARM roughly 1mW per MHz ... variable clock, sleep modes.

85 mW system power realistic ...

A close relative ...

Page 10: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

The CPU is only part of power budget!

“Amdahl’s Law for Power”

If our CPU took no power at all to run, that would only double battery life!

CPULCD Backlight

“other”

LCD

GPU

2004-era notebook running a full

workload.

Page 11: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

2010 nano0.74

ounces(50% of

2005 Nano)

What’s happened since 2005?

“Up to” 24 hours audio playback. 70% improvement from 2005 nano. 0.39 W Hr

(33% of 2005 Nano)

Page 12: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Processors and Energy

Page 13: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Moore’s Law2.6 Billion

1 Million

2 Thousand

Page 14: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Main driver: device scaling ...

From: “Facing the Hot Chips Challenge Again”, Bill Holt, Intel, presented at Hot Chips 17, 2005.

Page 15: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

1974: Dennard Scaling

If we scale the gate length by a factor 𝞳, how should we scale other aspects of transistor to get the “best” results?

notscaled

𝞳 = 5 scaling

Page 16: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Dennard Scaling

notscaled

𝞳 = 5 scalingThings we do:

scale dimensions, doping, Vdd.

What we get: 𝞳2 as many transistors at the same power density!

Whose gates switch times faster!𝞳

Power density scaling ended in 2003 (Pentium 4: 3.2GHz, 82W, 55M FETs). Why? We could no longer scale Vdd.

Page 17: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

The Power Wall

Why? We can no longer fully scale Vdd ... because MOS transistor leakage current is no longer a negligible part of the power budget.

Page 18: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Switching Energy: Fundamental Physics

Every logic transition dissipates energy.

How can we limit

switching energy?

(1) Reduce # of clock transitions. But we have work to do ...(2) Reduce Vdd. But lowering Vdd limits the clock speed ...(3) Fewer circuits. But more transistors can do more work.(4) Reduce C per node. One reason why we scale processes.

Vdd

12

C

Vdd

E0-

>1=

2

Vdd

12

C

Vdd

E1-

>0=

2

C

Strong result: Independent of technology.

Page 19: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Scaling switching energy per gate ...

IC process scaling (“Moore’s Law”)

From: “Facing the Hot Chips Challenge Again”, Bill Holt, Intel, presented at Hot Chips 17, 2005.

Due to reducing V and C (length and width of Cs decrease, but plate distance gets smaller).

Recent slope more shallow because V is being scaled less aggressively.

Page 20: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

0V =

Second Factor: Leakage Currents

Even when a logic gate isn’t switching, it burns power.

Igate: Ideal capacitors have zero DC current. But modern transistor gates are a few atoms thick, and are not ideal.

Isub: Even when this nFet

is off, it passes an Ioff leakage current.

We can engineer any Ioffwe like, but a lower Ioff

also results in a lower Ion, and thus a lower

maximum clock speed.Intel’s 2006 processor designs, leakage vs switching power A lot of work

was done to get a ratio this good ... 50/50 is common.Bill Holt, Intel, Hot Chips 17.

Page 21: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Engineering “On” Current at 25 nm ...

I

dsVs

Vd

V g

0.7 = Vdd

0.25 ≈

Vt

I

ds

1.2 mA = Ion

Ioff

= 0 ???

We can increase Ion by

raising Vdd and/or lowering Vt.

Page 22: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Plot on a “Log” Scale to See “Off” Current

I

dsVs

Vd

V g

I

ds

Ioff

≈ 10 nA

We can decrease Ioff by

raising Vt - but that lowers Ion.

0.25 ≈

Vt

1.2 mA = Ion

0.7 = Vdd

Page 23: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Ioff? Ion? Recall: Timing Lecture ...

I

ds

0.25 ≈

Vt

1.2 mA = Ion

0.7 = Vdd

Ioff

≈ 10 nA

A “on” n-FET

empties the

bucket.

An “off” n-FET while bucket

fills.Ioff

Ion

Why on?

Vdd >>

Vt

Why

open?

Gnd <<

Vt

I ds

= Current through

n-FET

Page 24: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Device engineers trade speed and power

From: Silicon Device Scaling to the Sub-10-nm RegimeMeikei Ieong,1* Bruce Doris,2 Jakub Kedzierski,1 Ken Rim,1 Min Yang1

We can reduce leakage

(Pstandby) by raising Vt.

We can increase speed

by raising Vdd and lowering Vt.

We can reduce CV (Pactive)

by lowering Vdd.

2

Page 25: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Customize processes for product types ...

From: “Facing the Hot Chips Challenge Again”, Bill Holt, Intel, presented at Hot Chips 17, 2005.

Page 26: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Vgs

IdsIntel 22nm Process

Transistor channel is a raised fin.Gate controls

channel from sides and top.Channel depth is fin

width. 12-15nm for L=22nm.

Page 27: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Clock rates have flattened out, but ...

Page 28: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Performance: Put more transistors to work

Page 29: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Moore’s Law2.6 Billion

1 Million

2 Thousand

We still scale to get more

transistors per unit area ... but we use design techniques to

reduce power.

Page 30: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Takeaway Abstractions

From Part I ...

Page 31: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Small circuits can go very fast in standard CMOS ...

This oscillator runs at 210 GHz in a 32 nm SOI CMOS logic process, and consumes 42 mW ...

But if we used these techniques for a CPU, our 150 W air-cooled power limit would limit our design to using about ten thousand transistors ...... but we are used to using 100s of millions!

Page 32: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

The Power Wall

Dennard scaling stopped working in 2004.Why? MOSFET off currents became non-negligible. We limit clock speed to prevent chip from melting.

Page 33: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Dynamic Power: 4 ways to reduce it ...

Every logic transition dissipates energy.

How can we limit

switching energy?

(1) Reduce # of clock transitions. But we have work to do ...(2) Reduce Vdd. But lowering Vdd limits the clock speed ...(3) Fewer circuits. But more transistors can do more work.(4) Reduce C per node. One reason why we scale processes.

Vdd

12

C

Vdd

E0-

>1=

2

Vdd

12

C

Vdd

E1-

>0=

2

C

Strong result: Independent of technology.

Page 34: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

0V =

Static power: We trade off speed for power.

Even when a logic gate isn’t switching, it burns power.

Igate: Ideal capacitors have zero DC current. But modern transistor gates are a few atoms thick, and are not ideal.

Isub: Even when this nFet

is off, it passes an Ioff leakage current.

We can engineer any Ioffwe like, but a lower Ioff

also results in a lower Ion, and thus a lower

maximum clock speed.Intel’s 2006 processor designs, leakage vs switching power A lot of work

was done to get a ratio this good ... 50/50 is common.Bill Holt, Intel, Hot Chips 17.

Page 35: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Factor of 60 in leakage vs. 3.5 in speed

(40, 45, 50) are transistor channel lengths (in nm)

Chart shows 9 different NAND gates for an IC process, each with a different speed vs. static power tradeoff.

Page 36: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

FO4 (Delay, Power) vs Vdd -- 65nm

FO4 is the delay of one inverter driving four additional inverters. Power in this plot includes dynamic and static.

Page 37: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Six low-power design techniques

Power-down idle transistors

Parallelism and pipelining

Slow down non-critical paths

Clock gating

Thermal management

Data-dependent processing

Page 38: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Trading Hardware for Power

via Parallelism and Pipelining ...

Design Technique #1 (of 6)

Page 39: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Gate delay roughly

linear with Vdd

This magic trick brought to you by Cory Hall ...

And so, we can transform this:

Block processes stereo audio. 1/2

of clocks for “left”, 1/2 for “right”.

P ~ F ⨯ Vdd2

P ~ 1 ⨯ 12

Into this: Top block processes “left”, bottom “right”.

CV2 power only

P ~ #blks ⨯ F ⨯ Vdd 2

P ~ 2 ⨯ 1/2 ⨯ 1/4 = 1/4

Page 40: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Chandrakasan & Brodersen (UCB, 1992)

Simple

Pipelined

Parallel

From:

Page 41: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Multiple Cores for Low Power

Trade hardware for power, on a large

scale ...

Page 42: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Cell: The PS3 chip

2006

Page 43: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Cell (PS3 Chip): 1 CPU + 8 “SPUs”

PowerPC

L2 Cache512 KB

Synergistic Processing

Units(SPUs)

8

Page 44: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

One Synergistic Processing Unit (SPU)

256 KB Local Store, 128 128-bit Registers

SPU issues 2 inst/cycle (in order) to 7 execution units

SPU fills Local Store using DMA to DRAM and network

Page 45: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

A “Schmoo” plot for a Cell SPU ...

The lower Vdd, the longer the maximum clock period, the slower the clock frequency.

12

C

Vdd

E1-

>0=

212

C

Vdd

E0-

>1=

2

The lower Vdd, the less dynamic energy consumption.

Page 46: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Clock speed alone doesn’t help E/op ...

12

C

Vdd

E1-

>0=

212

C

Vdd

E0-

>1=

But, lowering clock frequency while keeping voltage constant spreads the same amount of work over a longer time, so chip stays cooler ... 2

Page 47: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Scaling V and f does lower energy/op

7W to reliably get 4.4 GHz performance. 47C die temp.

1 W to get 2.2 GHz performance. 26 C die temp. If a program that needs a

4.4 Ghz CPU can be recoded to use two 2.2 Ghz CPUs ... big win.

Page 48: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

How iPod nano 2005 puts its 2 cores to use ...

Two 80 MHz CPUs. Was used in several nano generations, with one CPU doing audio decoding, the other doing photos, etc.

Page 49: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

2013 Macbook Air

Haswell CPU/GPU

Voltage range: 0.655V to 1.041V ... 2.5x in CV2 energy

Page 50: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Powering down idle circuits

Design Technique #2 (of 6)

Page 51: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Add “sleep” transistors to logic ...

Example: Floating point unit logic.When running fixed-point instructions, put logic “to sleep”.

+++ When “asleep”, leakage power is dramatically reduced.--- Presence of sleep transistors slows down the clock rate when the logic block is in use.

Page 52: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Intel example: Sleeping cache blocks

From: “Facing the Hot Chips Challenge Again”, Bill Holt, Intel, presented at Hot Chips 17, 2005.

A tiny current supplied in “sleep” maintains SRAM state.

Page 53: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Intel Medfield

Page 54: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Intel Medfield

Switches 45 power “islands.”

Fine-grained control of leakage power, to track user activity.

“Race to idle” strategy -- finish tasks quickly, to get to power down.

Page 55: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Playing a game ...

Page 56: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Watching a video ...

Page 57: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Looking at phone screen, not doing anything ...

Page 58: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Phone in your pocket, waiting for a call ...

Page 59: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Slow down “slack paths”

Design Technique #3 (of 6)

Page 60: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Fact: Most logic on a chip is “too fast”

From “The circuit and physical design of the POWER4 microprocessor”, IBM J Res and Dev, 46:1, Jan 2002, J.D. Warnock et al.

Most wires have hundreds of picoseconds to spare.The critical path

Page 61: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Use several supply voltages on a chip ...

Why use multi-Vdd? We can reduce dynamic power by using low-power Vdd for logic off the critical path.

From: “Facing the Hot Chips Challenge Again”, Bill Holt, Intel, presented at Hot Chips 17, 2005.

What if we can’t do a multi-Vdd design? In a multi-Vt process, we can reduce leakage power on the slow logic by using high-Vth transistors.

Page 62: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

From a chapter from new book on ASIC design by Chinnery and Keutzer (UCB).

Logical partition into 0.8V and 1.0V nets

done manually to meet 350 MHz spec

(90nm).

Level-shifter insertion and placement done

automatically.

Dynamic power in 0.8V section cut 50%

below baseline.

Leakage power in 1.0V section cut 70%

below baseline.

Page 63: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Gating clocks to save power

Design Technique #4 (of 6)

Page 64: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

On a CPU, where does the power go?

From: Bose, Martonosi, Brooks: Sigmetrics-2001 Tutorial

Half of the power goes to latches (Flip-Flops).

Most of the time, the latches don’t change state.

So (gasp) gated clocks are a big win. But, done with CAD tools in a disciplined way.

Page 65: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Synopsis Design Compiler can do this ...

“Up to 70% power savings at the block level, for applicable circuits” Synopsis Data Sheet

<=

Page 66: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Power Compiler also can do this ...

10-20% push-buttonpower savings, using techniques like this one.

Page 67: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Data-Dependent Processing

Design Technique #5 (of 6)

Page 68: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Example: Video Decode Transform

Most of the time, the inputs flip between small positive and negative integers.

In 2’s complement, wastes power:

+1: 0b00001-1: 0b11110

Page 69: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Solution: Add bias value to all inputs

30+% power reduction for a bias of 64. For this linear transform, correcting the output for the bias is trivial.

Page 70: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Thermal Management

Design Technique #6 (of 6)

Page 71: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

Keep chip cool to minimize leakage power

A recipe for thermal runaway

Page 72: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

IBM Power 4: How does die heat up?

4 dies on amulti-chip

module

2 CPUsper die

Page 73: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

UC Regents Spring 2014 © UCBCS 152: L7: Power and Energy

115 Watts: Concentrated in “hot spots”

Fixedpointunits

Hot spots

Cachelogic

66.8 C == 152 F 82 C == 179.6 F

Page 74: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

Idea: Monitor temperature, servo clock speed

TDP = Thermal Design Point

Page 75: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

iPad Air, iPad Mini retina, and iPhone

5S all use the A7

Repeatedly running the same benchmark on three Apple

products.

The TDP of each form factor dictates how long it can run at

“top speed”TDP = Thermal Design Point

Page 76: 2014-2-11 John Lazzaro (not a prof - “John” is always OK)

On Thursday

Time to market via chip verification.

Have fun in section !