Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction. Rakesh Kumar, Keith I. Farkas, Norman P. Jouppi, Parthasarathy Ranganathan, Dean M. Tullsen. Presenter: Borys Bradel


Post on 12-Jan-2016



Single-ISA Heterogeneous Multi-Core Architectures:

The Potential for Processor Power Reduction

Rakesh Kumar, Keith I. Farkas, Norman P. Jouppi,

Parthasarathy Ranganathan, Dean M. Tullsen

Presenter: Borys Bradel


Introduction

Different programs have different requirements (e.g. ILP)
  Extends to phases of a single program
Heterogeneous cores
  Use the core that matches the requirements
Reuse existing cores
  Use multiple generations of the same family of processors


Outline

Methodology
  Hardware
  Assumptions
  Power
Experiments
  Optimal – energy/energy delay product
  Heuristic based – static/dynamic
Related Work
Conclusion


Single ISA Multi-Core Benefits

Small area overhead because of the growth in core sizes between generations

Clock frequencies of older cores would scale with technology
  P3 1 GHz = P4 1.4 GHz
  Increased pipeline depth precisely because could not scale


Hardware – Alpha Family

2 in-order cores
  EV4 = 21064
  EV5 = 21164
2 out-of-order cores
  EV6 = 21264
  EV8- = 21464 (multithreading support removed)


Hardware Size

15% more area than just using 21464


Assumptions

Can switch cores dynamically
Private L1 caches and a common L2 cache
All cores use 0.10 micron technology
Single process executing on a single core at any one time
2.1 GHz clock (the 600 MHz, 0.35 micron 21264 scaled to 0.10 micron)
Input voltage 1.2 V
Cores shut down when idle
1000-cycle restart cost (staged, phase-locked loop left alone)
150 ns memory access
Stall cycles obtained through CACTI
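The 2.1 GHz figure is consistent with a simple rule in which clock frequency scales inversely with feature size. The sketch below checks that arithmetic; the linear-scaling rule and the function name are assumptions made for illustration, not something the slides spell out.

```python
def scaled_frequency_mhz(base_mhz: float, base_um: float, target_um: float) -> float:
    """Scale a clock frequency, ASSUMING it grows inversely with feature size."""
    return base_mhz * (base_um / target_um)

# 21264 (EV6): 600 MHz at 0.35 micron, moved to 0.10 micron.
ev6_mhz = scaled_frequency_mhz(600.0, 0.35, 0.10)

# 1000-cycle restart cost expressed in microseconds (cycles / MHz = microseconds).
restart_us = 1000 / ev6_mhz

print(round(ev6_mhz))        # 2100
print(round(restart_us, 2))  # 0.48
```

Under this assumption, the 1000-cycle restart cost corresponds to roughly half a microsecond at 2.1 GHz.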


Core Configurations


Power Model

Use Wattch to account for activity based dissipation

Use scaling and offset factors to account for other factors

This hybrid model is closer to manufacturer’s data points

Peak power: data-sheet values minus L2 cache and output pins

Typical power: scaled based on Intel chips
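One way to read the hybrid model is as a linear correction applied to an activity-based estimate. The sketch below is only an illustration of that reading: the class name, the two-point fitting procedure, and every number are hypothetical and are not taken from the paper or from Wattch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridPowerModel:
    scale: float   # hypothetical factor applied to the activity-based estimate
    offset: float  # hypothetical constant (watts) for dissipation the activity model misses

    def power(self, activity_power_w: float) -> float:
        return self.scale * activity_power_w + self.offset


def fit_two_point(activity_typ: float, target_typ: float,
                  activity_peak: float, target_peak: float) -> HybridPowerModel:
    """Choose scale/offset so the model hits a typical and a peak data point.

    ASSUMPTION: a two-point linear fit; the slides only say that scaling and
    offset factors are used.
    """
    scale = (target_peak - target_typ) / (activity_peak - activity_typ)
    offset = target_typ - scale * activity_typ
    return HybridPowerModel(scale, offset)


# Illustrative numbers only (watts); not measurements.
model = fit_two_point(activity_typ=10.0, target_typ=8.0,
                      activity_peak=20.0, target_peak=13.0)
print(model.scale, model.offset)  # 0.5 3.0
print(model.power(20.0))          # 13.0
```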


Power and Area Statistics


Performance Modeling

Use SMTSIM, a cycle accurate simulator

SimPoint is used to identify representative regions of each program and how many instructions must be fast-forwarded to reach them


Varying Performance Ratio


Varying Energy Efficiency Ratio


Oracle Switching for Energy

Performance always within 10% of EV8-
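A minimal sketch of an energy oracle, assuming the 10% bound is applied per interval and that each core's IPC and energy for the interval are known in advance. The `Sample` type, the function name, and all numbers are hypothetical and chosen for illustration.

```python
from typing import Dict, NamedTuple


class Sample(NamedTuple):
    ipc: float       # instructions per cycle in this interval
    energy_j: float  # energy consumed in this interval


def oracle_min_energy(interval: Dict[str, Sample],
                      baseline: str = "EV8-",
                      max_perf_loss: float = 0.10) -> str:
    """Pick the lowest-energy core whose IPC stays within the allowed loss.

    ASSUMPTION: the constraint is checked interval by interval against the
    baseline core. The baseline always qualifies, so a choice always exists.
    """
    floor = (1.0 - max_perf_loss) * interval[baseline].ipc
    eligible = {core: s for core, s in interval.items() if s.ipc >= floor}
    return min(eligible, key=lambda core: eligible[core].energy_j)


# Illustrative numbers only.
compute_bound = {
    "EV4": Sample(0.50, 1.0),
    "EV5": Sample(0.80, 2.0),
    "EV6": Sample(1.45, 5.0),
    "EV8-": Sample(1.50, 20.0),
}
memory_bound = {
    "EV4": Sample(0.30, 1.0),
    "EV8-": Sample(0.32, 20.0),
}
print(oracle_min_energy(compute_bound))  # EV6
print(oracle_min_energy(memory_bound))   # EV4
```

In the second interval every core is stalled on memory, so the smallest core meets the performance bound and wins on energy.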


Oracle Switching for Energy


Oracle Switching for Energy Delay Product

Performance always within 50% of EV8-
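For reference, the energy-delay product of core $c$ over interval $i$ is the energy consumed multiplied by the time taken. One possible formalization of the oracle is given below; reading "within 50%" as an IPC floor of half the EV8- value is an assumption, and the symbols $E$, $D$, and $c^{*}$ are introduced here only for illustration.

```latex
\mathrm{EDP}_c(i) = E_c(i)\, D_c(i), \qquad
c^{*}(i) = \arg\min_{\,c \,:\, \mathrm{IPC}_c(i) \,\ge\, 0.5\, \mathrm{IPC}_{\mathrm{EV8-}}(i)} E_c(i)\, D_c(i)
```

Because delay enters the metric directly, a slow core must save proportionally more energy to be selected than under the energy-only oracle.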


Oracle Switching for Energy Delay Product


Others

Voltage/frequency scaling – not as good
Static core selection
  Only EV6 and EV8- are used
Dynamic heuristic
  Running average performance within 10%
  Every 100 time intervals (100 million instructions) cores are sampled for 5 intervals
  Select best core based on sampling
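The sampling schedule and selection step can be sketched as follows. The 100-interval period and 5-interval sampling window come from the slide; the selection metric (lowest energy-delay among cores that meet the performance floor), the fallback rule, and all names and numbers are assumptions made for illustration.

```python
from typing import Dict, Tuple

SAMPLE_PERIOD = 100  # intervals between sampling phases (from the slide)
SAMPLE_LENGTH = 5    # intervals spent sampling (from the slide)


def is_sampling(interval_index: int) -> bool:
    """True during the first SAMPLE_LENGTH intervals of every period."""
    return interval_index % SAMPLE_PERIOD < SAMPLE_LENGTH


def select_core(samples: Dict[str, Tuple[float, float]],
                baseline_ipc: float,
                max_perf_loss: float = 0.10) -> str:
    """samples maps core name -> (ipc, energy_delay) observed while sampling.

    ASSUMPTION: "best" means lowest energy-delay among cores whose IPC is
    within max_perf_loss of the baseline; if none qualifies, fall back to
    the fastest sampled core.
    """
    floor = (1.0 - max_perf_loss) * baseline_ipc
    ok = {core: v for core, v in samples.items() if v[0] >= floor}
    if not ok:
        return max(samples, key=lambda core: samples[core][0])
    return min(ok, key=lambda core: ok[core][1])


# Illustrative numbers only.
observed = {"EV5": (0.8, 2.0), "EV6": (1.4, 3.0), "EV8-": (1.5, 6.0)}
print(select_core(observed, baseline_ipc=1.5))          # EV6
print(sum(is_sampling(i) for i in range(1000)))         # 50
```

With these parameters only 5% of intervals are spent sampling, which bounds the overhead of trying cores that turn out to be poor matches.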


Results for Heuristics


Results for Heuristics/Static Core


Related Work

Gating-based power optimization
  Cannot gate at a fine enough granularity
  May still have leakage
  This could be thought of as gating to reduce capabilities of different units
Voltage and frequency scaling
  Chip wide – one size does not fit all
  Fine grained – granularity problems


Conclusions

Heterogeneous multi-core architectures reduce the energy-delay product
  More fine grained than other approaches
Using several cores from the same family is good
  Reduces development/testing costs
Is it scalable?
Just use EV6??