Chapter 1 Performance & Technology Trends Read Sections 1.5, 1.6, and 1.8

Upload: rudolph-douglas

Post on 05-Jan-2016


Chapter 1

Performance & Technology Trends

Read Sections 1.5, 1.6, and 1.8

Walid Abu-Sufah

CPE 432 Chapter 1.2

Chapter 1 — Computer Abstractions and Technology — 2

Section 1.5: The Power Wall


Power Trends

In CMOS IC technology:

$$\text{Power} = \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency}$$

Power: ×30; Voltage: 5 V → 1 V; Frequency: ×1000

Clock rates hit a “power wall”
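As a rough check of the power relation above, the following Python sketch computes how dynamic power scales when capacitive load, voltage, and frequency each change by a given factor. The function name `power_ratio` and the 0.85 scaling factors are illustrative assumptions, not values from the slides.

```python
def power_ratio(cap_ratio: float, volt_ratio: float, freq_ratio: float) -> float:
    """Return P_new / P_old for dynamic CMOS power, P = C * V^2 * f."""
    return cap_ratio * volt_ratio ** 2 * freq_ratio


# Hypothetical design change: load, voltage and frequency each scaled to 85%.
print(round(power_ratio(0.85, 0.85, 0.85), 2))

# Slide figures for voltage and frequency, assuming an unchanged capacitive load.
print(round(power_ratio(1.0, 1 / 5, 1000), 1))
```

Because voltage enters squared, the drop from 5 V to 1 V cancels a factor of 25 of the ×1000 frequency increase; once voltage can no longer be lowered, any further frequency increase shows up directly as power.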


The power wall

Until about 2004, performance was improved mainly by increasing the clock frequency.

By 2006, however, companies could neither reduce the power generated nor remove more heat from the chip.

Hence performance could no longer be improved by increasing frequency, because of the additional power generated: THE POWER WALL.

How else can we improve performance?


Read Section 1.6: The Sea Change: The Switch to Multiprocessors


Uniprocessor Performance

Uniprocessor performance is constrained by power, instruction-level parallelism, and memory latency


A Sea Change is at Hand

The power challenge has forced a change in the design of microprocessors.

Since 2002, the rate of improvement in the response time of programs on desktop computers has slowed from a factor of 1.5 per year to less than a factor of 1.2 per year.

As of 2006, all desktop and server companies are shipping microprocessors with multiple processors (cores) per chip.

Product          AMD Barcelona   Intel Nehalem   IBM Power 6   Sun Niagara 2
Cores per chip   4               4               2             8
Clock rate       2.5 GHz         ~2.5 GHz?       4.7 GHz       1.4 GHz
Power            120 W           ~100 W?         ~100 W?       94 W

The plan is to double the number of cores per chip per generation (about every two years). This plan has not been followed!
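To get a feel for what the slowdown from 1.5 to 1.2 per year means, the sketch below compounds each yearly factor over a few years. The helper name `cumulative_improvement` and the five-year horizon are assumptions chosen only for illustration.

```python
def cumulative_improvement(rate_per_year: float, years: int) -> float:
    """Compound a yearly improvement factor over a number of years."""
    return rate_per_year ** years


years = 5  # illustrative horizon, not from the slides
for rate in (1.5, 1.2):
    total = cumulative_improvement(rate, years)
    print(f"{rate}x per year for {years} years -> {total:.2f}x overall")
```

Small differences in the yearly factor compound quickly, which is why the slowdown mattered so much to designers.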


Multicore microprocessors require explicitly parallel programming

In single-core microprocessors, the hardware implemented instruction-level parallelism to execute multiple instructions in parallel.

Instruction-level parallelism is hidden from the programmer.

Parallel programming is harder to do. It involves:

- Programming for performance
- Load balancing
- Optimizing communication and synchronization

With the introduction of multicore microprocessors, the Free Lunch Era ended!
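Why load balancing matters can be seen with a deliberately simplified model: if each core runs its share of the work concurrently and there is no communication or synchronization cost, the run finishes when the most heavily loaded core finishes. The function `parallel_speedup` and the work splits below are hypothetical, not taken from the slides.

```python
def parallel_speedup(work_per_core: list[float]) -> float:
    """Speedup over one core when each core runs its share concurrently.

    Assumes no communication or synchronization cost, so the run ends
    when the most heavily loaded core finishes.
    """
    return sum(work_per_core) / max(work_per_core)


balanced = [25.0, 25.0, 25.0, 25.0]    # hypothetical: equal shares on 4 cores
unbalanced = [55.0, 15.0, 15.0, 15.0]  # hypothetical: one core overloaded

print(parallel_speedup(balanced))
print(round(parallel_speedup(unbalanced), 2))
```

Under this model the unbalanced split gets less than half the speedup of the balanced one on the same four cores; real programs also pay communication and synchronization costs on top of that.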


Read Section 1.8: Pitfalls and Fallacies


Pitfalls and Fallacies

Pitfalls: Easily made mistakes

Fallacies: Errors, myths, …


Pitfall: Amdahl’s Law

Pitfall: Improving an aspect of a computer and expecting a proportional improvement in overall performance

$$\text{Time}_{\text{overall, improved}} = \frac{\text{Time}_{\text{improved aspect}}}{\text{improvement factor}} + \text{Time}_{\text{unimproved aspect}}$$

Example: multiply operations account for 80 seconds of a program's 100-second run time.

How much must multiply performance improve for the program to run 5 times faster (i.e., in 100/5 = 20 s)?

$$20 = \frac{80}{n} + 20$$

This would require 80/n = 0, which no finite n satisfies. Can't be done!
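The same question can be asked for any target time. The sketch below solves target = affected / n + unaffected for n and reports when no finite n exists. The function name `required_improvement` and the second call's 40-second target are assumptions added for illustration.

```python
from typing import Optional


def required_improvement(affected: float, unaffected: float,
                         target: float) -> Optional[float]:
    """Solve target = affected / n + unaffected for the improvement factor n.

    Returns None when the target is not above the unaffected time,
    because no finite improvement factor can reach it.
    """
    remaining = target - unaffected
    if remaining <= 0:
        return None
    return affected / remaining


print(required_improvement(80, 20, 20))  # the slide's case: 5x faster overall
print(required_improvement(80, 20, 40))  # hypothetical easier target: 2.5x overall
```

The first call returns `None` because the 20 seconds of non-multiply work already use up the whole target; the easier 40-second target is reachable with a finite factor.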


Amdahl’s Law

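$$\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}}$$

$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right]$$

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$

Best Speedup_overall you could ever hope to achieve:

$$\text{Speedup}_{\text{maximum}} = \frac{1}{1 - \text{Fraction}_{\text{enhanced}}}$$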
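A short Python sketch of the two formulas shows how the overall speedup approaches the maximum as the enhancement gets faster. The function names and the fraction of 0.5 are hypothetical choices for illustration only.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when a fraction of the run time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def max_speedup(fraction_enhanced: float) -> float:
    """Upper bound: the enhanced part takes no time at all."""
    return 1.0 / (1.0 - fraction_enhanced)


fraction = 0.5  # hypothetical: half of the run time can be enhanced
for s in (2, 10, 100, 1000):
    print(s, round(amdahl_speedup(fraction, s), 3))
print("limit", max_speedup(fraction))
```

However large the enhancement's own speedup becomes, the overall figure stays below the limit set by the part of the program that is not enhanced.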


Amdahl’s Law example

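The new CPU is 10X faster. The server is I/O bound, so 60% of its time is spent waiting for I/O. Only the remaining 40% benefits: Fraction_enhanced = 0.4, Speedup_enhanced = 10.

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}} = \frac{1}{(1 - 0.4) + \dfrac{0.4}{10}} = \frac{1}{0.64} = 1.56$$

• Apparently, it is human nature to be attracted by "10X faster" rather than keeping in perspective that the server is just 1.56X faster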


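Amdahl’s Law example: Make the common case fast

Fraction = 0.1, Speedup = 10

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}} = \frac{1}{(1 - 0.1) + \dfrac{0.1}{10}} = \frac{1}{0.91} = 1.1$$

Fraction = 0.9, Speedup = 10

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - 0.9) + \dfrac{0.9}{10}} = \frac{1}{0.19} = 5.3$$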
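The two cases above can be reproduced, and intermediate cases explored, with a few lines of Python. This is a minimal sketch; the function name `amdahl_speedup` and the extra fraction 0.5 are assumptions added for illustration.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when a fraction of the run time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Same 10x enhancement applied to a rare, a middling and a common case.
for fraction in (0.1, 0.5, 0.9):
    print(fraction, round(amdahl_speedup(fraction, 10), 1))
```

The enhancement itself is identical in every row; only how often it applies changes, which is the argument for making the common case fast.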


Pitfall: MIPS as a Performance Metric

MIPS: Millions of Instructions Per Second

Doesn’t account for:

- Differences in ISAs between computers
- Differences in complexity between instructions


Pitfall: MIPS as a Performance Metric (cont.)

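How should MIPS be computed? It is not the maximum theoretical MIPS quoted by the manufacturer.

For processor A executing program P:

$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6}$$

$$\text{Execution time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$$

$$\text{MIPS} = \frac{\text{Instruction count}}{\dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \times 10^6} = \frac{\text{Clock rate}}{\text{CPI} \times 10^6}$$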

CPI varies between programs on a given CPU
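The dependence of MIPS on the program can be sketched in Python. The function names, the 2 GHz clock rate, and the two CPI values are hypothetical; only the formulas come from the slide.

```python
def mips_from_time(instruction_count: float, execution_time_s: float) -> float:
    """MIPS = instruction count / (execution time * 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


def mips_from_cpi(clock_rate_hz: float, cpi: float) -> float:
    """MIPS = clock rate / (CPI * 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


clock_rate = 2.0e9  # hypothetical 2 GHz processor
for name, cpi in (("program P1", 1.0), ("program P2", 2.5)):
    # Same processor, different programs: only the CPI changes.
    print(name, mips_from_cpi(clock_rate, cpi))
```

The same processor yields a different MIPS rating for each program, so a single MIPS figure says little without naming the program it was measured on.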