Temperature-Aware Design

Presented by Mehul Shah, 4/29/04

The Problem

Power & thermal densities are increasing
  Currently at 50 W/cm², 100 W/cm² at 50 nm technology
  Power density doubles every 3 years
  Operating Vdd scaling much more slowly (ITRS)

Cost of cooling rising exponentially
  $1 - $3 per Watt of power dissipation
  Packages designed for worst-case power

Hot spots: heat dissipation is non-uniform across the chip

Low-power design techniques are not sufficient

Big Hammer: Global Clock Gating limits performance

Impact of Temperature on Design

Increased Delay, Lower Reliability

Slower Transistors
  Carrier mobility lower at higher temperature
  Inverter 35% slower at 110 °C vs. 60 °C

Higher Leakage Power
  By orders of magnitude at higher temperature
  Leakage becoming more significant than switching power

Higher Metal Resistivity
  Copper 39% more resistive at 120 °C vs. 20 °C

Lower Mean-Time-To-Failure (MTF)
  MTF = MTFo * exp(Ea / (kb * T))
  MTF decreases exponentially w/ Temperature
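To make the last two effects concrete, here is a minimal Python sketch. It evaluates the MTF formula as a ratio between two temperatures (so MTFo cancels) and checks the copper figure with the usual linear resistivity model. The activation energy of 0.7 eV and the function names are illustrative assumptions, not values from the talk; the temperature coefficient of about 0.0039 per °C is the common textbook value for copper.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def mtf_ratio(temp_cool_c, temp_hot_c, activation_energy_ev):
    """Return MTF(cool) / MTF(hot) for MTF = MTFo * exp(Ea / (kb * T))."""
    t_cool = temp_cool_c + 273.15  # the formula needs absolute temperature
    t_hot = temp_hot_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_cool - 1.0 / t_hot)
    return math.exp(exponent)


def copper_resistance_increase(temp_ref_c, temp_c, alpha_per_c=0.0039):
    """Fractional increase from the linear model rho = rho0 * (1 + alpha * dT)."""
    return alpha_per_c * (temp_c - temp_ref_c)


if __name__ == "__main__":
    # Ea = 0.7 eV is an assumed, illustrative value; it is not from the slides.
    print(f"MTF(60 C) / MTF(110 C): {mtf_ratio(60.0, 110.0, 0.7):.1f}")
    print(f"Copper, 20 C -> 120 C: +{copper_resistance_increase(20.0, 120.0):.0%}")
```

With the assumed coefficient, the 100 °C rise gives the 39% resistivity increase quoted on the slide.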

Moral of the Story

Problem: Temperature adversely affects power, performance & reliability

Solution: “Temperature-Aware” Design

Temperature-Aware Design

Thermal Modeling
  Estimate operating temperature
  Simple: allow architects to easily reason about thermal effects
  Detailed: model runtime temperature at functional-unit granularity
  Computationally efficient
  Flexible: easily extend to novel architectures

Dynamic Thermal Management
  Use runtime behavior and thermal status to adjust/distribute workload among functional units

Talk Outline

Thermal Modeling
  Model Description
  Validation & Case Studies

Dynamic Thermal Management
  Results

Conclusions

References

Kevin Skadron et al., “Temperature-Aware Microarchitecture”

Wei Huang et al., “Compact Thermal Modeling for Temperature-Aware Design”

Thermal Modeling

Thermal model interacts with Power, Performance, Reliability models

Design convergence requires several iterations

Heat Flow vs. Electrical Phenomenon

Both can be described by the same differential equations
  Heat Flow = Electrical Current
  Temperature = Voltage
  Heat Absorption Capacity = Capacitance

Describe design as a Thermal RC circuit
  Node = Functional Block

Solve RC equations to obtain Node Temperature
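As a rough illustration of the analogy, the sketch below treats one functional block as a single RC node: power plays the role of current, temperature the role of voltage. The steady state follows the thermal version of V = I * R, and the transient is integrated with forward Euler. The function names and all numeric values are assumptions chosen for illustration, not parameters from the talk.

```python
def steady_state_temp(power_w, r_th, t_ambient):
    """Thermal analogue of V = I * R: temperature rise = power * resistance."""
    return t_ambient + power_w * r_th


def simulate_node(power_w, r_th, c_th, t_ambient, dt, steps):
    """Forward-Euler integration of C * dT/dt = P - (T - T_ambient) / R."""
    temp = t_ambient
    history = []
    for _ in range(steps):
        temp += dt * (power_w - (temp - t_ambient) / r_th) / c_th
        history.append(temp)
    return history


if __name__ == "__main__":
    # Assumed values: 10 W block, 2 K/W to ambient, 0.5 J/K, 45 C ambient.
    trace = simulate_node(10.0, 2.0, 0.5, 45.0, dt=0.01, steps=1000)
    print(f"after 10 s: {trace[-1]:.2f} C")
    print(f"steady state: {steady_state_temp(10.0, 2.0, 45.0):.2f} C")
```

The time constant is R * C (1 s with these numbers), so after ten time constants the simulated temperature sits essentially at the steady-state value.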

HotSpot Package

Equivalent Model

Equivalent Model (Continued)

Die area divided into micro-architectural blocks

Spreader, Sink divided into five blocks
  Rsp, Rhs: areas under the die
  Trapezoids: areas not covered by the die

Rconvective represents thermal resistance from package to air

RC Model
  Vertical R's: heat flow between layers
  Lateral R's: heat diffusion within a layer
    R1 = Block1 to Spreader, R2 = Block1 to rest of the chip
  R = t / (k * A)
    t: thickness
    k: thermal conductivity of the material
    A: cross-sectional area
  C = c * t * A
    c: thermal capacitance per unit volume
    Requires an empirical scaling factor due to the lumped model
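The two formulas can be evaluated directly. The sketch below does so for a hypothetical 4 mm x 4 mm silicon block that is 0.5 mm thick; the material constants (about 100 W/(m*K) and 1.75e6 J/(m^3*K) for silicon) and the function names are assumptions for illustration, and the scaling factor is left at 1.0 because the slide does not give its value.

```python
def vertical_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """R = t / (k * A), in K/W."""
    return thickness_m / (conductivity_w_per_mk * area_m2)


def thermal_capacitance(vol_heat_capacity, thickness_m, area_m2, scaling=1.0):
    """C = scaling * c * t * A, in J/K; scaling is the empirical lumped-model factor."""
    return scaling * vol_heat_capacity * thickness_m * area_m2


if __name__ == "__main__":
    # Assumed geometry and material values, not taken from the slides.
    area = 4e-3 * 4e-3   # 4 mm x 4 mm block
    thickness = 0.5e-3   # 0.5 mm of silicon
    r = vertical_resistance(thickness, 100.0, area)
    c = thermal_capacitance(1.75e6, thickness, area)
    print(f"R = {r:.4f} K/W, C = {c:.4f} J/K, RC = {r * c * 1e3:.2f} ms")
```

Halving the block area doubles R and halves C, which is why small, power-dense blocks heat up faster than large ones.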

HotSpot Validation

Fallacy of Using a Power Metric

Compact Thermal Model

Equivalent Model

Equivalent Model (Cont.)

Compact Model vs. HotSpot
  Arbitrary granularity grid
  Thermal interface material
  Spreader, Interface under the die are divided into chip granularity

Primary Heat Flow Path
  Rvertical = t / (k * A)
  C = Alpha * cp * ρ * t * A
    Alpha: accounts for the lumped capacitor model
    cp: specific heat
    ρ: material density
    t, A: layer thickness and cross-sectional area, as in Rvertical
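A grid model of this kind can be sketched in a few lines. The version below is a deliberately simplified stand-in, not the compact model itself: every grid cell has one lateral resistance to each neighbour and one lumped vertical resistance to ambient, and the steady state is found by Gauss-Seidel iteration. The function name, grid size, and resistance values are all assumptions.

```python
def solve_grid(power, r_lateral, r_vertical, t_ambient, iterations=2000):
    """Steady-state cell temperatures for a 2-D grid of thermal nodes."""
    rows, cols = len(power), len(power[0])
    temp = [[t_ambient] * cols for _ in range(rows)]
    for _ in range(iterations):
        for i in range(rows):
            for j in range(cols):
                neighbours = [
                    temp[a][b]
                    for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                    if 0 <= a < rows and 0 <= b < cols
                ]
                # Heat balance at the node: power in = heat out through all R's.
                g_total = 1.0 / r_vertical + len(neighbours) / r_lateral
                temp[i][j] = (
                    power[i][j] + t_ambient / r_vertical + sum(neighbours) / r_lateral
                ) / g_total
    return temp


if __name__ == "__main__":
    # Assumed example: a single 4 W hot spot in the centre of a 3 x 3 grid.
    hot_spot = [[0.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]]
    for row in solve_grid(hot_spot, r_lateral=5.0, r_vertical=10.0, t_ambient=45.0):
        print(" ".join(f"{t:6.2f}" for t in row))
```

Refining the grid only changes the size of the `power` matrix, which is the practical appeal of an arbitrary-granularity model.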

Equivalent Model (Secondary Path)

Interconnect Thermal Model

Self-heating power & wire length prediction

Pself = I^2 * R

R = ρm * L / Am
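The two interconnect formulas combine into a quick self-heating estimate. In the sketch below, the wire dimensions, the current, and the function names are hypothetical; the resistivity of about 1.7e-8 ohm*m is the room-temperature bulk value for copper, and real on-chip wires are typically more resistive.

```python
def wire_resistance(resistivity_ohm_m, length_m, cross_section_m2):
    """R = rho_m * L / A_m, in ohms."""
    return resistivity_ohm_m * length_m / cross_section_m2


def self_heating_power(current_a, resistance_ohm):
    """P_self = I^2 * R, in watts."""
    return current_a ** 2 * resistance_ohm


if __name__ == "__main__":
    # Assumed wire: 1 mm long, 0.2 um x 0.4 um cross-section, carrying 1 mA.
    r = wire_resistance(1.7e-8, 1e-3, 0.2e-6 * 0.4e-6)
    p = self_heating_power(1e-3, r)
    print(f"R = {r:.1f} ohm, P_self = {p * 1e3:.4f} mW")
```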

Equivalent Model (Secondary Path, Cont.)

Equivalent Thermal Resistance

Model Validation & Evaluation (Primary)

Steady State

Transient

Model Validation (Secondary)

Case Study

Thermal Management

Dynamic Thermal Management

Emergency Threshold: temperature above which chip is in thermal violation

Trigger Threshold: temperature above which DTM is applied
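The two thresholds partition the temperature range into three regions, which the following sketch makes explicit. The function name, the state labels, and the example threshold values are hypothetical; the only structure taken from the slide is that the trigger sits below the emergency threshold so that DTM can act before a violation occurs.

```python
def thermal_state(temp_c, trigger_c, emergency_c):
    """Classify a sensor reading against the two DTM thresholds."""
    if trigger_c >= emergency_c:
        raise ValueError("trigger threshold must be below the emergency threshold")
    if temp_c > emergency_c:
        return "violation"   # chip is in thermal violation
    if temp_c > trigger_c:
        return "dtm_active"  # DTM is applied to head off an emergency
    return "normal"


if __name__ == "__main__":
    # Assumed thresholds for illustration only.
    for reading in (70.0, 83.0, 86.0):
        print(reading, thermal_state(reading, trigger_c=82.0, emergency_c=85.0))
```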

DTM Techniques

Temperature-Tracking Frequency Scaling
Feedback-Controlled Fetch Toggling
Migrating Computation
Dynamic Voltage Scaling (DVS)
Global Clock Gating
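To give a feel for a feedback-controlled technique, the sketch below closes a loop around a single-node RC thermal model. A simple integral controller, standing in for whatever controller the original work uses, lowers the fetch duty cycle while the temperature is above the trigger threshold and raises it again when there is headroom. It assumes that power scales linearly with the duty cycle; every constant and name here is a hypothetical choice for illustration.

```python
T_AMBIENT = 45.0           # ambient temperature, C (assumed)
R_TH, C_TH = 1.0, 1.0      # thermal resistance (K/W) and capacitance (J/K), assumed
P_IDLE, P_PEAK = 10.0, 50.0  # power with fetch fully gated / fully enabled, W (assumed)
TRIGGER_C = 82.0           # trigger threshold used as the controller setpoint (assumed)
KI = 2.0                   # integral gain, 1 / (K * s) (assumed)


def run(controlled, steps=20000, dt=0.001):
    """Simulate the node; return (peak temperature, final temperature, final duty)."""
    temp, duty, peak = T_AMBIENT, 1.0, T_AMBIENT
    for _ in range(steps):
        if controlled:
            # Integral action: cut duty while above the trigger, restore it below.
            duty -= KI * (temp - TRIGGER_C) * dt
            duty = min(1.0, max(0.0, duty))  # clamping also prevents wind-up
        power = P_IDLE + duty * (P_PEAK - P_IDLE)
        temp += dt * (power - (temp - T_AMBIENT) / R_TH) / C_TH
        peak = max(peak, temp)
    return peak, temp, duty


if __name__ == "__main__":
    for label, flag in (("no DTM", False), ("fetch toggling", True)):
        peak, final, duty = run(flag)
        print(f"{label}: peak {peak:.1f} C, final {final:.1f} C, duty {duty:.2f}")
```

With these assumed numbers the uncontrolled node heads toward 95 °C, while the controlled run settles at the trigger temperature with the duty cycle reduced to the level the package can sustain. Global clock gating corresponds to forcing the duty cycle to zero, which is why the talk calls it a big hammer.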

DTM Results

Conclusions

Accurate thermal models are essential for early design estimation
  Models are similar to electrical RC networks
  Arbitrary granularity for localized temperature information
  Model all parts of the package

Architectural techniques can reduce demands on the IC package by
  Dynamically adjusting workload to avoid emergencies
  Reducing hot spots