
BİL 221 Bilgisayar Yapısı

Lab. – 1: Benchmarking

Basic Performance Metrics

• Time related:
  – Execution time [seconds]
    • wall clock time
    • system and user time
  – Latency
  – Response time
• Rate related:
  – Rate of computation
    • floating point operations per second [flops]
    • integer operations per second [ops]
  – Data transfer (I/O) rate [bytes/second]
• Effectiveness:
  – Efficiency [%]
  – Memory consumption [bytes]
  – Productivity [utility/($*second)]
• Modifiers:
  – Sustained
  – Peak
  – Theoretical peak
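The difference between wall clock time and user/system time can be seen with a small timing harness. The following is a minimal sketch, not a prescribed lab procedure: the helper `measure` and the workload `busy_loop` are hypothetical names introduced here for illustration, and only the standard-library calls `time.perf_counter` and `os.times` are used.

```python
import os
import time


def measure(workload):
    """Run workload() once and return (wall, user, system) times in seconds."""
    cpu_before = os.times()            # cumulative CPU times of this process
    wall_before = time.perf_counter()  # high-resolution wall clock
    workload()
    wall_after = time.perf_counter()
    cpu_after = os.times()
    return (
        wall_after - wall_before,
        cpu_after.user - cpu_before.user,
        cpu_after.system - cpu_before.system,
    )


def busy_loop(n=200_000):
    """Hypothetical CPU-bound workload: sum of squares."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    wall, user, system = measure(busy_loop)
    print(f"wall clock: {wall:.4f} s, user: {user:.4f} s, system: {system:.4f} s")
```

For a CPU-bound workload the wall clock time is usually close to user time plus system time; for a workload that sleeps or waits on I/O, wall clock time can be much larger than both.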

What Is a Benchmark?

Benchmark: a standardized problem or test that serves as a basis for evaluation or comparison (as of computer system performance) [Merriam-Webster]

• The term “benchmark” also commonly applies to specially-designed programs used in benchmarking
• A benchmark should:
  – be domain specific (the more general the benchmark, the less useful it is for anything in particular)
  – be a distillation of the essential attributes of a workload
  – avoid using a single metric to express the overall performance
• Kinds of computational benchmarks:
  – synthetic: specially-created programs that impose a load on a specific component of the system
  – application: derived from a real-world application program

Purpose of Benchmarking

• To define the playing field
• To provide a tool enabling quantitative comparisons
• Acceleration of progress
  – enable better engineering by defining measurable and repeatable objectives
• Establishing a performance agenda
  – measure release-to-release or version-to-version progress
  – set goals to meet
  – be understandable and useful also to people without expertise in the field (managers, etc.)

Properties of a Good Benchmark

• Relevance: meaningful within the target domain
• Understandability
• Good metric(s): linear, orthogonal, monotonic
• Scalability: applicable to a broad spectrum of hardware/architecture
• Coverage: does not over-constrain the typical environment
• Acceptance: embraced by users and vendors
• Has to enable comparative evaluation
• Limited lifetime: there is a point when additional code modifications or optimizations become counterproductive

Adapted from: Standard Benchmarks for Database Systems by Charles Levine, SIGMOD '97

Metrics of Performance

[Figure: levels of a computer system, from top to bottom: Application; Programming Language; Compiler; ISA; Datapath, Control; Function Units; Transistors, Wires, Pins. Metrics shown alongside the levels: answers per month and operations per second; (millions of) instructions per second (MIPS) and (millions of) FP operations per second (MFLOP/s); megabytes per second; cycles per second (clock rate).]

MIPS and MFLOPS

MIPS

\[ \text{MIPS} = \frac{\text{Instruction count}}{\text{CPU time} \times 10^{6}} = \frac{\text{Clock rate}}{\text{CPI} \times 10^{6}} \]

\[ \text{CPU time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \]

• Machines with different instruction sets?
• Programs with different instruction mixes?
• Uncorrelated with performance
• Marketing metric
  – “Meaningless Indicator of Processor Speed”

MFLOPS

\[ \text{MFLOP/s} = \frac{\text{Number of FP operations}}{\text{CPU time} \times 10^{6}} \]

• Popular in supercomputing community
• Often not where time is spent
• Not all FP operations are equal
  – “Normalized” MFLOP/s
• Can magnify performance differences
  – A better algorithm (e.g., with better data reuse) can run faster even with higher FLOP count
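The MIPS formula and the "different instruction sets" objection can be checked numerically. This is a small sketch with made-up numbers: the two machines, their instruction counts, and their CPI values are hypothetical and chosen only to show that a higher MIPS rating can go with a longer execution time.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = instruction count * CPI / clock rate (seconds)."""
    return instruction_count * cpi / clock_rate_hz


def mips(instruction_count, cpu_time_s):
    """MIPS = instruction count / (CPU time * 10^6)."""
    return instruction_count / (cpu_time_s * 1e6)


CLOCK_RATE_HZ = 1e9  # both hypothetical machines run at 1 GHz

# Hypothetical machine A: many simple instructions, CPI = 1.0
time_a = cpu_time(2e9, 1.0, CLOCK_RATE_HZ)
mips_a = mips(2e9, time_a)

# Hypothetical machine B: fewer, more complex instructions, CPI = 1.5
time_b = cpu_time(1e9, 1.5, CLOCK_RATE_HZ)
mips_b = mips(1e9, time_b)

if __name__ == "__main__":
    print(f"A: {time_a:.2f} s, {mips_a:.1f} MIPS")
    print(f"B: {time_b:.2f} s, {mips_b:.1f} MIPS")
```

With these numbers machine A reports the higher MIPS rating while machine B finishes the same program sooner, which is why MIPS is not comparable across instruction sets.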

Benchmarks

• To increase predictability, collections of benchmark applications, called benchmark suites, are popular
• SPECCPU: popular desktop benchmark suite
  – CPU only, split between integer and floating point programs
  – SPECint2000 has 12 integer programs, SPECfp2000 has 14 floating-point programs
  – SPECCPU2006 was announced Spring 2006
  – SPECSFS (NFS file server) and SPECWeb (WebServer) added as server benchmarks
  – www.spec.org
• Transaction Processing Performance Council (TPC) measures server performance and cost-performance for databases
  – TPC-C: complex query for Online Transaction Processing
  – TPC-H: models ad hoc decision support
  – TPC-W: a transactional web benchmark
  – TPC-App: application server and web services benchmark

http://www.spec.org/

SPEC CPU (integer)

The set of “representative” applications keeps growing with time!

SPEC CPU (floating point)

How to Summarize Performance?

• Arithmetic average of execution times?
  – But the programs vary in basic speed, so some would be more important than others in the arithmetic average
• Could add weights per program, but how to pick the weights?
  – Different companies want different weights for their products
• SPECRatio: normalize execution times to a reference computer, yielding a ratio proportional to performance = time on reference computer / time on computer being rated
  – SPEC uses an older Sun machine as reference
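The two ideas above, the arithmetic average being dominated by long-running programs and the SPECRatio normalization, can be sketched as follows. All program names and times are hypothetical values picked for illustration; they are not SPEC measurements.

```python
def spec_ratio(reference_time_s, measured_time_s):
    """SPECRatio = time on reference computer / time on computer being rated."""
    return reference_time_s / measured_time_s


def arithmetic_mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical execution times in seconds.
reference_times = {"prog1": 1000.0, "prog2": 20.0}
measured_times = {"prog1": 500.0, "prog2": 5.0}

ratios = {
    name: spec_ratio(reference_times[name], measured_times[name])
    for name in reference_times
}

# Making the short program 5x faster barely moves the arithmetic mean of times.
mean_before = arithmetic_mean(measured_times.values())
mean_after = arithmetic_mean([measured_times["prog1"], measured_times["prog2"] / 5])

if __name__ == "__main__":
    print("SPECRatios:", ratios)
    print(f"mean time before: {mean_before} s, after speeding up prog2: {mean_after} s")
```

Here a fivefold speedup of `prog2` changes the mean execution time by less than 1%, while its SPECRatio would change by a factor of five, which is the motivation for normalizing before summarizing.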

Ratios

• If the SPECRatio of a program on Computer A is 1.25 times bigger than on Computer B, then

\[ 1.25 = \frac{\text{SPECRatio}_A}{\text{SPECRatio}_B} = \frac{\text{ExecutionTime}_{\text{reference}} / \text{ExecutionTime}_A}{\text{ExecutionTime}_{\text{reference}} / \text{ExecutionTime}_B} = \frac{\text{ExecutionTime}_B}{\text{ExecutionTime}_A} = \frac{\text{Performance}_A}{\text{Performance}_B} \]

• Note that when comparing 2 computers as a ratio, execution times on the reference computer drop out, so choice of reference computer is irrelevant
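As a worked instance of the cancellation, take hypothetical times (assumed here only for illustration): 1000 s on the reference computer, 400 s on Computer A, and 500 s on Computer B.

```latex
\[
\text{SPECRatio}_A = \frac{1000}{400} = 2.5, \qquad
\text{SPECRatio}_B = \frac{1000}{500} = 2.0, \qquad
\frac{\text{SPECRatio}_A}{\text{SPECRatio}_B} = \frac{2.5}{2.0} = 1.25 = \frac{500}{400}
= \frac{\text{ExecutionTime}_B}{\text{ExecutionTime}_A}
\]
% With a different (also hypothetical) reference time of 2000 s:
\[
\frac{2000/400}{2000/500} = \frac{5.0}{4.0} = 1.25
\]
```

The ratio stays at 1.25 for either reference time, since the reference term appears in both numerator and denominator.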

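Means

Let \( r = (r_1, \ldots, r_n) \) be an n-tuple of positive numbers, \( n \ge 1 \).

Quadratic mean:
\[ Q(r) = \sqrt{\frac{r_1^2 + r_2^2 + \cdots + r_n^2}{n}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} r_i^2} \]

Arithmetic mean:
\[ A(r) = \frac{r_1 + r_2 + \cdots + r_n}{n} = \frac{1}{n}\sum_{i=1}^{n} r_i \]

Geometric mean:
\[ G(r) = \sqrt[n]{r_1 r_2 \cdots r_n} = \left(\prod_{i=1}^{n} r_i\right)^{1/n} \]

Harmonic mean:
\[ H(r) = \frac{n}{\frac{1}{r_1} + \frac{1}{r_2} + \cdots + \frac{1}{r_n}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{r_i}} \]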
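The four means can be computed directly from their definitions. This is a minimal sketch using only the standard `math` module; the function names and the sample tuple `values` are chosen here for illustration.

```python
import math


def quadratic_mean(r):
    """Q(r): square root of the mean of the squares."""
    return math.sqrt(sum(x * x for x in r) / len(r))


def arithmetic_mean(r):
    """A(r): sum divided by n."""
    return sum(r) / len(r)


def geometric_mean(r):
    """G(r): n-th root of the product, computed via logarithms."""
    return math.exp(sum(math.log(x) for x in r) / len(r))


def harmonic_mean(r):
    """H(r): n divided by the sum of reciprocals."""
    return len(r) / sum(1.0 / x for x in r)


values = [1.0, 2.0, 4.0]  # hypothetical sample of positive numbers

if __name__ == "__main__":
    for mean in (harmonic_mean, geometric_mean, arithmetic_mean, quadratic_mean):
        print(f"{mean.__name__}: {mean(values):.4f}")
```

For any tuple of positive numbers the results satisfy H(r) ≤ G(r) ≤ A(r) ≤ Q(r), with equality only when all entries are equal.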

Geometric Mean

• Since these are ratios, the proper mean is the geometric mean (SPECRatio is unitless, so the arithmetic mean is meaningless)

\[ \text{GeometricMean} = \sqrt[n]{\prod_{i=1}^{n} \text{SPECRatio}_i} \]

1. Geometric mean of the ratios is the same as the ratio of the geometric means
2. Ratio of geometric means = geometric mean of performance ratios, so the choice of reference computer is irrelevant!

• These two points make the geometric mean of ratios attractive to summarize performance
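Both points can be checked numerically for a whole suite. The sketch below uses hypothetical execution times for three programs on two computers and two different reference machines; none of the numbers come from SPEC results.

```python
import math


def geometric_mean(values):
    """n-th root of the product, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def suite_score(reference_times, measured_times):
    """Geometric mean of the per-program SPECRatios."""
    return geometric_mean(r / m for r, m in zip(reference_times, measured_times))


# Hypothetical execution times in seconds for three programs.
reference_x = [1000.0, 20.0, 300.0]
reference_y = [400.0, 80.0, 150.0]
times_a = [500.0, 5.0, 100.0]
times_b = [250.0, 10.0, 150.0]

relative_x = suite_score(reference_x, times_a) / suite_score(reference_x, times_b)
relative_y = suite_score(reference_y, times_a) / suite_score(reference_y, times_b)
ratio_of_means = geometric_mean(times_b) / geometric_mean(times_a)

if __name__ == "__main__":
    print(f"A vs B with reference X: {relative_x:.6f}")
    print(f"A vs B with reference Y: {relative_y:.6f}")
    print(f"ratio of geometric means of times: {ratio_of_means:.6f}")
```

All three printed values agree up to floating-point rounding, so the relative standing of A and B does not depend on which reference machine was used.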

SPECjvm2008

Workloads

Workload Descriptions
