
Upload: jemimah-conley

Post on 01-Jan-2016


Fundamentals of Computer Design

Introduction

Classes of Computers

Defining Computer Architecture

Trends in Technology

Trends in Power and Energy in Integrated Circuits

Trends in Cost

Dependability

Measuring and Reporting Performance

Quantitative Principles of Computer Design

CDA 4102/5155 – Fall 2015

Copyright © 2015 Prabhat Mishra


Microprocessor Performance Trends

Relative to VAX-11/780 using SpecInt Benchmarks

Due to technological advances

Due to advances in architecture

Slowdown due to limits of power and available ILP


Design Complexity

Exponential Growth – doubling of transistors every couple of years


Technology and Demand

Technology: the number of transistors is doubling every 2 years

Demand: communication, multimedia, entertainment, networking

Exponential growth of design complexity and verification complexity


Computer Market

Desktop

– Driven by price-performance

– $1000 - $10,000 [$100 - $1000 per processor]

Server

– Throughput, availability, scalability

– $10K - $10M [$200 - $2000 per processor]

Embedded Systems

– Application specific

– Low cost, low power, real-time performance

– $10 - $100,000 [$0.20 - $200 per processor]


An Example Embedded System

Digital Camera Block Diagram


Components of Embedded Systems

Analog Digital Analog

Memory

Coprocessors

Controllers

Converters

Processor

Interface

Software (Application Programs)

ASIC


Computer Architecture

Definition

– Instruction set architecture (ISA): programmer (user) view

– Implementation

– Organization: CPU, memory, buses, I/O

– Hardware: logic design, packaging technology

Computer design must meet

– Functional requirements

– Area, performance, cost, power goals

– Optimize, evaluate, and explore to find best possible architecture

Consider other factors

– Time-to-market, technology trend, safety, reliability, …


Instruction-Set Architecture (ISA)

An instruction set architecture is a specification of a standardized programmer-visible interface to hardware, comprised of:

A set of instructions (instruction types and operations)

– With associated argument fields, assembly syntax, binary encoding.

A set of named storage locations and addressing

– Registers, memory, … programmer-accessible caches?

A set of addressing modes (ways to name locations)

Types and sizes of operands

Control flow instructions

Often an I/O interface (usually memory-mapped)


Example: MIPS

Programmable storage

– 2^32 x bytes

– 31 x 32-bit GPRs (R0=0)

– 32 x 32-bit FP regs (paired DP)

– HI, LO, PC

[Figure: register set r0 (= 0), r1, …, r31, PC, lo, hi]

Data types ?

Format ?

Addressing Modes?

Arithmetic logical

– ADD, ADDU, SUB, SUBU, AND, OR, XOR, NOR, SLT, SLTU

– ADDI, ADDIU, SLTI, SLTIU, ANDI, ORI, XORI, LUI

– SLL, SRL, SRA, SLLV, SRLV, SRAV

Memory Access

– LB, LBU, LH, LHU, LW, LWL, LWR

– SB, SH, SW, SWL, SWR

Control

– J, JAL, JR, JALR

– BEQ, BNE, BLEZ, BGTZ, BLTZ, BGEZ, BLTZAL, BGEZAL


MIPS64 Instruction Format


MIPS Implementation


Pipelined Implementation


Technology Trend

Component

– IC technology: transistors/chip increases 55% per year

– DRAM: density increases 40-60% per year

– Magnetic disk: density increases 100% per year

– Network: Ethernet from 10 to 100 Mb took 10 years; 100 Mb to 1 Gb in 5 years

Scaling of performance, wires and power

– Feature size: 10 micron in 1971; 0.18 micron in 2001, …

– Microprocessor organization improvement

– Wiring delay

– Power issue: ~100 watts for 2GHz Pentium 4
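The annual growth rates above can be turned into doubling times with a one-line formula. The following is a minimal sketch, assuming the rates compound once per year; the function name `doubling_time_years` is illustrative and not part of the slides.

```python
import math

def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a compound annual growth rate (0.55 means 55%)."""
    return math.log(2) / math.log(1 + annual_growth)

# Growth rates quoted on the slide; compounding once per year is an assumption.
rates = [
    ("IC transistors/chip", 0.55),
    ("DRAM density (low end)", 0.40),
    ("DRAM density (high end)", 0.60),
    ("Magnetic disk density", 1.00),
]

for name, rate in rates:
    print(f"{name}: doubles roughly every {doubling_time_years(rate):.2f} years")
```

Under this assumption, a 55% annual increase corresponds to a doubling time of a little over a year and a half, which is in line with the "doubling every couple of years" statement earlier in the deck.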


Disk Comparison

CDC Wren I, 1983

3600 RPM

0.03 GBytes capacity

Tracks/Inch: 800

Bits/Inch: 9550

Three 5.25” platters

Bandwidth: 0.6 MBytes/sec

Latency: 48.3 ms

Cache: none

Seagate 373453, 2003

15000 RPM (4X)

73.4 GBytes (2500X)

Tracks/Inch: 64000 (80X)

Bits/Inch: 533,000 (60X)

Four 2.5” platters (in 3.5” form factor)

Bandwidth: 86 MBytes/sec (140X)

Latency: 5.7 ms (8X)

Cache: 8 MBytes


Memory Comparison

1980 DRAM (asynchronous)

0.06 Mbits/chip

64 K transistors, 35 mm^2

16-bit data bus per module

16 pins/chip

13 Mbytes/sec

Latency: 225 ns

(no block transfer)

2000 Double Data Rate Synchronous (clocked) DRAM

256.00 Mbits/chip (4000X)

256 M transistors, 204 mm^2

64-bit data bus per DIMM (4X)

66 pins/chip

1600 Mbytes/sec (120X)

Latency: 52 ns (4X)

Block transfers (page mode)


LAN Comparison

Ethernet 802.3

Year of Standard: 1978

10 Mbits/s link speed

Latency: 3000 µsec

Shared media

Coaxial cable

Ethernet 802.3ae

Year of Standard: 2003

10,000 Mbits/s link speed (1000X)

Latency: 190 µsec (15X)

Switched media

Category 5 copper wire

Coaxial cable: copper core, insulator, braided outer conductor, plastic covering

Twisted pair: copper, 1 mm thick, twisted to avoid antenna effect; "Cat 5" is 4 twisted pairs in bundle


CPU Comparison

1982 Intel 80286

12.5 MHz

2 MIPS (peak)

Latency 320 ns

134,000 xtors, 47 mm^2

16-bit data bus, 68 pins

Microcode interpreter, separate FPU chip

(no caches)

2001 Intel Pentium 4

1500 MHz (120X)

4500 MIPS (peak) (2250X)

Latency 15 ns (20X)

42,000,000 xtors, 217 mm^2

64-bit data bus, 423 pins

3-way superscalar, dynamic translate to RISC, superpipelined (22 stage), out-of-order execution

On-chip 8KB data cache, 96KB instr. trace cache, 256KB L2 cache


Bandwidth vs. Latency

Processor: ‘286, ‘386, ‘486, Pentium, Pentium Pro, Pentium 4 (21x,2250x)

Ethernet: 10Mb, 100Mb, 1000Mb, 10000 Mb/s (16x,1000x)

Memory Module: 16bit plain DRAM, Page Mode DRAM, 32b, 64b, SDRAM, DDR SDRAM (4x,120x)

Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x)

Each pair above is (latency improvement, bandwidth improvement): latency improved by roughly 10X while bandwidth improved by roughly 100X to 1000X.


Power and Energy

$$E = \int P \, dt$$

[Figure: power P versus time t; the energy E is the area under the curve]

In many cases, faster execution means less energy, but the opposite may be true if power has to be increased to allow faster execution.


Power and Energy

Power is drawn from a voltage source

Power: $P(t) = i_{DD}(t) \, V_{DD}$

Energy: $E = \int_0^T P(t) \, dt = \int_0^T i_{DD}(t) \, V_{DD} \, dt$

Average Power: $P_{avg} = \frac{E}{T} = \frac{1}{T} \int_0^T i_{DD}(t) \, V_{DD} \, dt$
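The energy and average-power definitions can be checked numerically on sampled data. This is a minimal sketch, assuming the supply current is sampled at a fixed interval and integrated with the trapezoidal rule; the sample values, supply voltage, and function names are hypothetical.

```python
def energy_joules(current_samples, vdd, dt):
    """Trapezoidal approximation of E = integral of i_DD(t) * V_DD dt."""
    power = [i * vdd for i in current_samples]  # instantaneous power P(t)
    return sum((power[k] + power[k + 1]) / 2 * dt for k in range(len(power) - 1))

def average_power_watts(current_samples, vdd, dt):
    """P_avg = E / T, where T is the total time spanned by the samples."""
    total_time = dt * (len(current_samples) - 1)
    return energy_joules(current_samples, vdd, dt) / total_time

# Hypothetical measurement: supply current in amperes, sampled every 1 ms.
samples = [0.5, 1.0, 1.5, 1.0, 0.5]
vdd = 1.2   # volts (assumed)
dt = 0.001  # seconds between samples (assumed)

print("Energy (J):", energy_joules(samples, vdd, dt))
print("Average power (W):", average_power_watts(samples, vdd, dt))
```

The trapezoidal rule is only one possible choice; any numerical integration of P(t) over the interval would illustrate the same relationship.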


Dynamic Power

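[Figure: supply V_DD delivering current i_DD(t) to a load capacitance C that switches at frequency f_sw]

Power needed to charge and discharge load capacitances when transistors switch.

The capacitor needs to charge for output to be ‘1’.

For output to be ‘0’, capacitor needs to discharge.

This repeats $T \cdot f_{sw}$ times over an interval of T.

$$P_{dynamic} = \frac{1}{T} \int_0^T i_{DD}(t) \, V_{DD} \, dt = \frac{V_{DD}}{T} \int_0^T i_{DD}(t) \, dt = \frac{V_{DD}}{T} \left[ T f_{sw} C V_{DD} \right] = C V_{DD}^2 f_{sw}$$

$$P_{dynamic} = \alpha C V_{DD}^2 f$$

Here, $\alpha$ is activity factor and f is clock frequency.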
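A short calculation makes the quadratic dependence on supply voltage concrete. This is a sketch only: the activity factor, capacitance, voltage, and frequency below are hypothetical values chosen for easy arithmetic, not figures from the slides.

```python
def dynamic_power(alpha, capacitance_f, vdd, freq_hz):
    """P_dynamic = alpha * C * V_DD^2 * f (watts)."""
    return alpha * capacitance_f * vdd ** 2 * freq_hz

# Hypothetical operating point: alpha = 0.1, C = 1 nF, V_DD = 1.0 V, f = 1 GHz.
base = dynamic_power(0.1, 1e-9, 1.0, 1e9)

# Same design with the supply voltage halved and everything else unchanged.
half_voltage = dynamic_power(0.1, 1e-9, 0.5, 1e9)

print("Baseline power (W):", base)
print("Power at half voltage relative to baseline:", half_voltage / base)
```

Halving the voltage alone cuts dynamic power to a quarter, which is why voltage is the most effective knob in the techniques discussed on the following slides.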


Static Power

Because leakage current flows even when a transistor is off, static power is now important too

$$Power_{static} = Current_{static} \times Voltage$$

Leakage current increases in processors with smaller transistor sizes

Increasing the number of transistors increases power even if they are turned off

In 2006, the goal for leakage was 25% of total power consumption; high performance designs were at 40%

Very low power systems even gate the voltage to inactive modules to control loss due to leakage


Reducing Energy Consumption

[www.transmeta.com]

Pentium Crusoe

Running the same multimedia application.

Infrared Cameras (FLIR) can be used to detect thermal distribution.


Dynamic Power Management (DPM)

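RUN: operational

IDLE: a SW routine may stop the CPU when not in use, while monitoring interrupts

SLEEP: Shutdown of on-chip activity

[Figure: power state diagram for the StrongARM SA1100. State power: RUN 400 mW, IDLE 50 mW, SLEEP 160 µW. Transition times shown on the arcs: 10 µs, 10 µs, 90 µs, 90 µs, 160 ms]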


Dynamic Voltage Scaling (DVS)

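$E = P \times T$, $P \propto V^2$

E (energy), P (power), T (time), V (voltage)

Example

A task is given with workload (W) and deadline (D).

Assume that idle energy is negligible.

[Figure: two schedules on a time axis with deadline D. At voltage V the task takes time T; at voltage V/2 it takes time 2T.]

$$E_1 \propto V_1^2 \cdot T_1 = V^2 \cdot T$$

$$E_2 \propto V_2^2 \cdot T_2 = \frac{V^2}{4} \cdot 2T = \frac{E_1}{2}$$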
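The example can be replayed in a few lines. This sketch uses the slide's model, E = P x T with P proportional to V^2, and the slide's premise that running at V/2 doubles the execution time; the units are arbitrary and the function name is illustrative.

```python
def relative_energy(voltage, time):
    """Energy up to a constant factor: E = P x T with P proportional to V^2."""
    return voltage ** 2 * time

# Schedule 1: full voltage V = 1.0 (arbitrary units), task takes T = 1.0.
e1 = relative_energy(1.0, 1.0)

# Schedule 2: half voltage, task takes twice as long (as in the slide's example).
e2 = relative_energy(0.5, 2.0)

print("E2 / E1 =", e2 / e1)
```

The ratio holds only while the slower schedule still meets the deadline D and idle energy stays negligible, the two conditions the example states.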


Multicores – Low Power?

Multicore

One core with frequency 2 GHz

Two cores with 1 GHz frequency (each)

– Same performance

– Two 1 GHz cores require half the power/energy

– Power ∝ freq^2

– A 1 GHz core needs one-fourth the power of a 2 GHz core.

New challenges

– Performance concerns: how to keep the cores busy?

– Reliability concerns: MTTF gets worse!

– and more …


DRAM Pricing

© 2003 Elsevier Science (USA). All rights reserved.


Processor Pricing (Intel Pentium III)

© 2003 Elsevier Science (USA). All rights reserved.


Silicon wafer and microprocessor die

This 8-inch wafer contains 564 MIPS64 R20K processors (0.18 µ)

Intel Pentium 4 Microprocessor


Cost of an Integrated Circuit (IC)

Cost of IC: (die + packaging + test) / yield

See examples in Page 22-24

Cost of a system

– Processor board: ~ 37%

– I/O device: ~ 37%

– Cabinet: ~ 6%

– Software: ~ 20%


Cost

Unit cost: monetary cost of manufacturing one unit, excluding NRE cost

NRE cost (Non-Recurring Engineering cost): the one-time monetary cost of designing the system

Total cost = NRE cost + unit cost * # of units

Per-product cost = total cost / # of units = (NRE cost / # of units) + unit cost

• Example

– NRE = $2000, unit = $100

– For 10 units

– total cost = $2000 + 10*$100 = $3000

– per-product cost = $2000/10 + $100 = $300

Amortizing NRE cost over the units results in an additional $200 per unit
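The cost formulas are easy to tabulate for different volumes. The sketch below reuses the slide's numbers (NRE = $2000, unit cost = $100); the function names and the extra volumes of 100 and 1000 units are illustrative additions.

```python
def total_cost(nre, unit_cost, units):
    """Total cost = NRE cost + unit cost * number of units."""
    return nre + unit_cost * units

def per_product_cost(nre, unit_cost, units):
    """Per-product cost = total cost / number of units."""
    return total_cost(nre, unit_cost, units) / units

NRE = 2000   # dollars, from the slide's example
UNIT = 100   # dollars, from the slide's example

for volume in (10, 100, 1000):  # 100 and 1000 are hypothetical volumes
    print(volume, "units:", per_product_cost(NRE, UNIT, volume), "dollars each")
```

As the volume grows, the amortized NRE share shrinks and the per-product cost approaches the unit cost, which is the trade-off behind the NRE-versus-unit-cost comparison that follows.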


NRE versus Unit Cost

[Figure: unit cost versus volume, comparing a "High NRE, low production cost" option with a "Low NRE, high production cost" option]


Cost versus Price


Define and Quantify Dependability

How to decide when a system is operating properly?

Infrastructure providers now offer Service Level Agreements (SLA) to guarantee that their networking or power service would be dependable

Systems alternate between 2 states of service with respect to an SLA:

State 1: Service accomplishment, where the service is delivered as specified in SLA

State 2: Service interruption, where the delivered service is different from the SLA

Failure = transition from state 1 to state 2

Restoration = transition from state 2 to state 1


Dependability

Module reliability = measure of continuous service accomplishment (or time to failure)

Two metrics:

1. Mean Time To Failure (MTTF) – measures Reliability

2. Failures In Time (FIT) = 1/MTTF, the rate of failures

Traditionally reported as failures per billion hours of operation

Mean Time To Repair (MTTR) measures Service Interruption

Mean Time Between Failures (MTBF) = MTTF+MTTR

Module availability measures service accomplishment with respect to the alternation between the 2 states of accomplishment and interruption (a number between 0 and 1, e.g. 0.9)

Module availability = MTTF / ( MTTF + MTTR)


Example

If modules have exponentially distributed lifetimes (age of module does not affect probability of failure), overall failure rate is the sum of failure rates of the modules

Calculate FIT and MTTF for 10 disks (1M hour MTTF per disk), 1 disk controller (0.5M hour MTTF), and 1 power supply (0.2M hour MTTF):

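$$FailureRate = 10 \times \frac{1}{1{,}000{,}000} + \frac{1}{500{,}000} + \frac{1}{200{,}000} = \frac{10 + 2 + 5}{1{,}000{,}000} = \frac{17}{1{,}000{,}000} = 17{,}000 \ FIT$$

$$MTTF = \frac{1{,}000{,}000{,}000}{17{,}000} \approx 59{,}000 \ hours$$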
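The same calculation can be scripted so that other configurations are easy to try. This is a minimal sketch under the slide's assumption of exponentially distributed lifetimes (failure rates add); the function names are illustrative, and the 24-hour MTTR used for the availability line is a hypothetical value that does not appear in the slides.

```python
def system_failure_rate(modules):
    """Sum of per-module failure rates; modules is a list of (count, mttf_hours)."""
    return sum(count / mttf_hours for count, mttf_hours in modules)

def availability(mttf_hours, mttr_hours):
    """Module availability = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Configuration from the slide: 10 disks, 1 disk controller, 1 power supply.
modules = [(10, 1_000_000), (1, 500_000), (1, 200_000)]

rate = system_failure_rate(modules)   # failures per hour
fit = rate * 1e9                      # failures per billion hours
mttf = 1 / rate                       # hours

print(f"FIT = {fit:.0f}, MTTF = {mttf:.0f} hours")

# Hypothetical repair time of 24 hours, used only to illustrate the formula.
print(f"Availability with a 24 h MTTR: {availability(mttf, 24):.5f}")
```

The script gives an MTTF slightly under 59,000 hours, which the slide rounds to 59,000.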


Performance Measurement

Performance metric: execution time

– Increasing performance decreases execution time

Other metrics

– Wall-clock time, response time, elapsed time

– CPU time: user or system

We will focus on CPU performance, i.e., user CPU time on unloaded system

Machine x is n times faster than machine y when:

$$n = \frac{ExecutionTime_y}{ExecutionTime_x} = \frac{Performance_x}{Performance_y}$$


Choosing Programs to Evaluate Performance

Real applications

– For example: gcc compiler, Microsoft Word

Modified (or scripted) applications

– For example: remove I/O, script to simulate interactive behavior

Kernels

– For example: Livermore loops, Linpack

Toy benchmarks

– For example: sieve of Eratosthenes, quicksort

Synthetic benchmarks

– For example: Whetstone, Dhrystone

[Figure: arrow labeled "Lower Accuracy" running down the list, from real applications to synthetic benchmarks]


Benchmark Suites

Desktop

– New SPEC CPU2006

– SPEC CPU2000: 11 integer, 14 floating-point

– SPECviewperf, SPECapc: graphics benchmarks

Server

– SPEC CPU2000: running multiple copies

– SPECSFS: for NFS performance

– SPECWeb: Web server benchmark

– TPC-x: measure transaction-processing, queries, and decision making database applications

Embedded Processor

– EEMBC: EDN Embedded Microprocessor Benchmark Consortium


SPEC CPU Benchmarks


Reporting Performance

Performance should be reproducible

Description of the machine and compiler flags

Report for both baseline and optimized version

Source code modifications

Not allowed in SPEC benchmarks

Allowed but difficult or impossible

– TPC-C using Oracle or SQL database

Allowed in supercomputer benchmarks

– Modify or re-write algorithms

Hand-coding in assembly for EEMBC benchmark


Comparing Performance

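Arithmetic Mean: $\frac{1}{n} \sum_{i=1}^{n} Time_i$

What is the mixture of programs in the workload?

[Table: execution times of the compared machines; only the summary row survives] Arithmetic Mean: 500.5, 55, 20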


Comparing Performance

Weighted Arithmetic Mean: $\sum_{i=1}^{n} Weight_i \times Time_i$

What if programs are fixed and inputs are not?


Comparing Performance

Geometric Mean: $\sqrt[n]{\prod_{i=1}^{n} ExecutionTimeRatio_i}$

Execution time ratio is normalized to a base machine.

Reference machine is not important.

The arithmetic means of normalized times differ depending on which machine is used as the basis, but the geometric means are the same.

Geometric mean does not predict execution time
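A small experiment shows why the choice of reference machine matters for the arithmetic mean of normalized times but not for the geometric mean. The per-program times below are assumed values, chosen so that their arithmetic means match the 500.5 and 55 quoted earlier; the function names are illustrative, and `math.prod` requires Python 3.8 or later.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n values."""
    return math.prod(values) ** (1 / len(values))

# Assumed execution times in seconds for programs P1 and P2 on machines A and B.
times = {"A": [1.0, 1000.0], "B": [10.0, 100.0]}

def normalized_means(reference):
    """Return {machine: (arithmetic mean, geometric mean)} of times normalized to reference."""
    ref = times[reference]
    result = {}
    for machine, t in times.items():
        ratios = [ti / ri for ti, ri in zip(t, ref)]
        result[machine] = (sum(ratios) / len(ratios), geometric_mean(ratios))
    return result

for reference in ("A", "B"):
    print("Normalized to", reference, "->", normalized_means(reference))
```

With these times, the arithmetic mean of normalized ratios makes whichever machine is not the reference look about five times slower, while the geometric mean rates the two machines as equal under either reference.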


Normalized Execution Times (SPECRatio)

Geometric mean does not predict execution time

– Performance of machines A and B is the same only if program P1 is executed 100 times for every occurrence of program P2

Rewards easy enhancements

– Improving program P3 (2 to 1) is the same as improving program P4 (1000 to 500).


Amdahl’s Law

$$Speedup = \frac{1}{(1 - f) + f/n}$$

• Where:

f is the fraction of the execution time that can be enhanced

n is the enhancement factor

• Example: f = 0.1, n = 10 gives Speedup ≈ 1.1

Make the common case fast

Performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used.


Application of Amdahl’s Law

Amdahl’s law is useful for comparing overall performance of two design alternatives.

Example: Floating-point (FP) operations consume 50% of the execution time of a graphics application. FP square root (FPSQRT) is used 20% of the time.

1. Improve FPSQRT operation execution by 10 times

– Speedup = 1 / ((1-0.2) + 0.2/10) = 1.22

2. Improve all FP operations by 1.6 times

– Speedup = 1 / ((1-0.5) + 0.5/1.6) = 1.23

Because FP operations account for a larger fraction of execution time, the modest improvement to all FP operations (case 2) yields a slightly larger gain than the drastic improvement of FPSQRT (case 1).
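Both design alternatives can be evaluated with a single helper. This is a minimal sketch of Amdahl's Law as stated on the previous slide; the function name `amdahl_speedup` is illustrative.

```python
def amdahl_speedup(f, n):
    """Overall speedup when a fraction f of execution time is sped up by a factor n."""
    return 1.0 / ((1.0 - f) + f / n)

# Case 1: FPSQRT (20% of execution time) made 10 times faster.
case1 = amdahl_speedup(0.2, 10)

# Case 2: all FP operations (50% of execution time) made 1.6 times faster.
case2 = amdahl_speedup(0.5, 1.6)

print(f"Case 1 speedup: {case1:.2f}")
print(f"Case 2 speedup: {case2:.2f}")

# Upper bound for case 1 if FPSQRT became arbitrarily fast: 1 / (1 - f).
print(f"Case 1 limit: {1.0 / (1.0 - 0.2):.2f}")
```

The last line shows the limit that Amdahl's Law imposes: even an infinitely fast FPSQRT could not push the overall speedup past 1.25.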


Measuring the Performance

Performance Equation

– CPU time = Instruction Count x Clock cycle time x CPI

How to compute these parameters

– Known for existing processors: clock cycle time

– Use of counters in new processors: CPI, instruction count

Simulation for performance analysis

– Profile based

– Trace-driven

– Execution-driven


CPU Performance Equation

The parameters are dependent

– Instruction Count: ISA and compiler technology

– CPI: Organization and ISA

– Cycle Time: Hardware technology and organization

Many performance enhancing techniques improve one parameter with small/predictable impacts on the other two.

$$CPU\ Time = Instruction\ Count \times Cycles\ Per\ Instruction \times Cycle\ Time$$

$$CPU\ Time = Instruction\ Count \times Cycles\ Per\ Instruction \times \frac{1}{Clock\ Rate}$$


Example

Parameters:

– Frequency of FP operations (incl. FPSQR) = 25%

– CPI for FP operations = 4; CPI for others = 1.33

– Frequency of FPSQR = 2%; CPI of FPSQR = 20

Compare 2 designs:

– Decrease CPI of FPSQR to 2

– Decrease CPI of all FP to 2.5

$$CPI_{orig} = \sum_{i=1}^{n} CPI_i \times \frac{IC_i}{Total\ IC} = (4 \times 25\%) + (1.33 \times 75\%) = 2.0$$

$$CPI_{new\ FPSQR} = CPI_{orig} - 2\% \times (CPI_{old\ FPSQR} - CPI_{new\ FPSQR}) = 2.0 - 2\% \times (20 - 2) = 1.64$$

$$CPI_{new\ FP} = (75\% \times 1.33) + (25\% \times 2.5) = 1.625$$
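The comparison can be reproduced with a weighted-CPI helper. This is a sketch using the slide's parameters; because it uses 1.33 exactly rather than the rounded intermediate values, its results differ from the slide's 2.0, 1.64 and 1.625 in the third decimal place. The function name `weighted_cpi` is illustrative.

```python
def weighted_cpi(mix):
    """Overall CPI from a list of (instruction fraction, CPI) pairs."""
    return sum(fraction * cpi for fraction, cpi in mix)

# Original design: 25% FP operations at CPI 4, 75% other instructions at CPI 1.33.
cpi_orig = weighted_cpi([(0.25, 4.0), (0.75, 1.33)])

# Design 1: FPSQR (2% of instructions) drops from CPI 20 to CPI 2.
cpi_new_fpsqr = cpi_orig - 0.02 * (20 - 2)

# Design 2: all FP operations drop to CPI 2.5.
cpi_new_fp = weighted_cpi([(0.25, 2.5), (0.75, 1.33)])

print(f"Original CPI:        {cpi_orig:.4f}")
print(f"CPI with new FPSQR:  {cpi_new_fpsqr:.4f}")
print(f"CPI with new FP:     {cpi_new_fp:.4f}")

# With instruction count and clock rate unchanged, speedup is the ratio of CPIs.
print(f"Speedup of design 2 over original: {cpi_orig / cpi_new_fp:.2f}")
```

Since instruction count and clock rate are the same in all three cases, the design with the lower CPI is the faster one, so improving all FP operations is marginally better here.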


Fallacies and Pitfalls

The relative performance of two processors with the same ISA can be judged by clock rate or by the performance of a single benchmark suite.

1.7 GHz Pentium 4 relative to 1.0 GHz Pentium III

© 2003 Elsevier Science (USA). All rights reserved.


Fallacies and Pitfalls

Benchmarks remain valid indefinitely.

– One line in matrix300 (SPEC89) executes 99% of the time

Peak performance tracks observed performance.

The best design is the one that optimizes the primary objective without considering design costs.

Synthetic benchmarks predict performance for real programs.

– Compiler/hardware optimizations can inflate performance

MIPS is an accurate measure for comparing performance among computers.

– Consider using FP hardware instead of FP routines.

$$MIPS = \frac{InstCount}{ExecTime \times 10^6} = \frac{ClockRate}{CPI \times 10^6}$$
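The FP hardware case can be illustrated with made-up numbers. In this sketch every figure is hypothetical: a program compiled to software FP routines executes many simple instructions, while the FP hardware version executes fewer instructions with a higher CPI on the same 100 MHz clock.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz

def mips(instruction_count, exec_time_s):
    """MIPS = instruction count / (execution time x 10^6)."""
    return instruction_count / (exec_time_s * 1e6)

CLOCK = 100e6  # 100 MHz, hypothetical

# Hypothetical software FP routines: 100 million instructions at CPI 1.0.
sw_time = cpu_time(100e6, 1.0, CLOCK)
sw_mips = mips(100e6, sw_time)

# Hypothetical FP hardware: 20 million instructions at CPI 2.0.
hw_time = cpu_time(20e6, 2.0, CLOCK)
hw_mips = mips(20e6, hw_time)

print(f"Software FP: {sw_time:.2f} s, {sw_mips:.0f} MIPS")
print(f"Hardware FP: {hw_time:.2f} s, {hw_mips:.0f} MIPS")
```

In this made-up case the hardware version finishes sooner yet reports a lower MIPS rating, which is why MIPS alone is a misleading basis for comparison.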