
Page 1: Gary Marsden, Slide 1, University of Cape Town. Computer Architecture – Introduction. Andrew Hutchinson & Gary Marsden (me) (gaz@cs.uct.ac.za), 2005

Gary Marsden, Slide 1, University of Cape Town

Computer Architecture – Introduction

Andrew Hutchinson & Gary Marsden (me)

(gaz@cs.uct.ac.za) 2005

Slide 2

The Grand Scheme

Abstractions
Role of Performance
Language of the Machine
Arithmetic [3]
Performance Issues (6)
Processor: Datapath & Control (3)
Pipelining (4)
Memory hierarchy (4)
Peripherals

CS1

CS2

Slide 3

Review

Chapter 1: Abstractions
– HLL -> Assembly -> Binary
– Application (system s/w (hardware))
– 5 classic components
• Input, output, memory, datapath, control
– Processor gets instructions & data from memory
– Input for placing items in memory
– Output reads from memory
– Control co-ordinates
– Abstraction / Interfaces

Slide 4

Review II

Chapter 2: The Role of Performance

CPU time = (Instruction Count x CPI) / Clock Rate

Popular measures
– MIPS
• Depends on instruction set
• Varies between programs
• Can vary inversely with performance
– GFLOPS
• Better, perhaps, as instructions are similar
• FLOPS can differ across machines

Slide 5

Review III

Chapter 3 - Machine Language
– R type and I type (distinguished by first field)
– Registers
– Address modes
• Register, base, immediate, PC relative
– P1: Simplicity favours regularity
– P2: Smaller is faster
– P3: Good design demands compromise
– P4: Make the common case fast

Chapter 4 - Arithmetic
– ALU construction

Slide 6

New Chapter 4 - Performance

What is a fast computer?
– Clock speed?
– Responsiveness?
– Data processing?

It depends…

Slide 7

What is performance?

There are so many elements to a computer system that it is pretty much impossible to determine, in advance, the performance level of a system

Computer vendors have their own idea of ‘performance’
– Macintosh users have the ‘bounce’ test

Performance means different things to different people
– 747 or F-15

Slide 8

Two key measures

Users of personal computers are interested in ‘response time’
– Also called execution time
– F-15: gets one job done as quickly as possible

Data centre managers are interested in ‘throughput’
– Total work done in a given time
– 747: moves the most passengers in a given time

This implies at least two different performance ‘metrics’
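To make the two metrics concrete, here is a minimal Python sketch. The job times and worker counts are made-up numbers and the function names are illustrative only. It assumes identical, independent jobs, with each worker running one job at a time.

```python
def response_time(seconds_per_job: float) -> float:
    """Time one user waits for one job to finish, in seconds."""
    return seconds_per_job


def throughput(seconds_per_job: float, workers: int) -> float:
    """Jobs completed per second by `workers` identical units working in parallel."""
    return workers / seconds_per_job


# Hypothetical machines: one fast unit versus eight slower units.
machines = {
    "fast_single": {"seconds_per_job": 2.0, "workers": 1},
    "slow_many": {"seconds_per_job": 4.0, "workers": 8},
}

for name, m in machines.items():
    print(
        f"{name}: response time {response_time(m['seconds_per_job'])} s, "
        f"throughput {throughput(**m)} jobs/s"
    )
```

In this sketch the single fast unit wins on response time while the eight slower units win on throughput, which is why one number cannot rank both machines.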

Slide 9

Response time

As we are looking at microprocessor design, response time is of primary interest

Start by considering execution time
– Longer is bad; shorter is good

So we can say that

Performance = 1 / Execution Time
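A small sketch of how this definition is typically used to compare two machines: the ratio of their performances equals the inverse ratio of their execution times. The timings below are hypothetical, and `speedup` is just a name chosen for this example.

```python
def performance(execution_time_s: float) -> float:
    """Performance as defined on the slide: 1 / execution time."""
    return 1.0 / execution_time_s


def speedup(time_x_s: float, time_y_s: float) -> float:
    """How many times faster machine X is than machine Y on the same task."""
    return performance(time_x_s) / performance(time_y_s)


# Hypothetical timings for the same program on two machines.
time_x = 10.0  # seconds on machine X
time_y = 15.0  # seconds on machine Y

print(f"X is about {speedup(time_x, time_y):.2f} times faster than Y")
```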

Slide 10

What is time?

“Time is an illusion; lunch time doubly so”

We kind of cheated when we said “time”

Time can be thought of in two ways
– Wall time / response time / elapsed time: all mean the number of seconds that pass from when the task starts until it ends
– CPU time: in multi-user systems, it makes more sense to count the time the CPU spends on that user’s task (‘system’ time confuses the issue)
• Try the Unix ‘time’ command
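The same distinction can be observed from inside a program. The sketch below uses Python's standard `time` module: `time.perf_counter()` for elapsed (wall) time and `time.process_time()` for the CPU time (user plus system) charged to the current process. The two workloads are toy examples invented for illustration.

```python
import time


def measure(task):
    """Return (wall_seconds, cpu_seconds) spent running `task`."""
    wall_start = time.perf_counter()  # elapsed (wall-clock) time
    cpu_start = time.process_time()   # user + system CPU time of this process
    task()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def sleepy():
    time.sleep(0.2)  # waiting: wall time passes, almost no CPU time is used


def busy():
    sum(i * i for i in range(200_000))  # computing: CPU time is usually close to wall time


for task in (sleepy, busy):
    wall, cpu = measure(task)
    print(f"{task.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

The sleeping task should show a large gap between the two numbers, which is the gap that CPU time is meant to remove on a shared system.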

Slide 11

CPU performance

In personal computers, most people focus on clock speed, e.g. a 4 GHz processor
– 4 GHz is the clock rate; 0.25 ns is the cycle length

CPU execution time = Clock cycles for a program / Clock Rate

So increasing the clock speed, or decreasing the number of cycles for a program, will improve performance

We need to think a bit more about ‘cycles for a program’
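As a quick check of the numbers on this slide, the sketch below computes the cycle length from the clock rate and then the execution time from a cycle count. The 8 billion cycle figure is a made-up example, and the function names are not from the slides.

```python
def cycle_time_s(clock_rate_hz: float) -> float:
    """Length of one clock cycle in seconds: the reciprocal of the clock rate."""
    return 1.0 / clock_rate_hz


def cpu_time_from_cycles(clock_cycles: float, clock_rate_hz: float) -> float:
    """CPU execution time = clock cycles for a program / clock rate."""
    return clock_cycles / clock_rate_hz


clock_rate = 4e9  # 4 GHz, as on the slide
print(f"cycle length: {cycle_time_s(clock_rate) * 1e9:.2f} ns")

cycles = 8e9  # hypothetical cycle count for some program
print(f"execution time: {cpu_time_from_cycles(cycles, clock_rate):.2f} s")

# Doubling the clock rate or halving the cycle count both halve the time.
print(cpu_time_from_cycles(cycles, 2 * clock_rate), cpu_time_from_cycles(cycles / 2, clock_rate))
```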

Slide 12

Program length

The number of cycles required for a program depends on the number of instructions in the program and the number of cycles it takes to execute each of these instructions
– We average this to CPI (Cycles per Instruction)

So…
– CPU clock cycles = Number of instructions x CPI

Therefore
– CPU execution time = (Number of instructions x CPI) / Clock Rate
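The formula can be tried out on two imaginary machines that run the same instruction set. All figures below are invented for illustration; the point is only that a higher clock rate does not guarantee a shorter execution time once CPI is taken into account.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    clock_rate_hz: float
    cpi: float  # average cycles per instruction on this program


def cpu_execution_time(instruction_count: float, machine: Machine) -> float:
    """CPU execution time = (number of instructions x CPI) / clock rate."""
    return instruction_count * machine.cpi / machine.clock_rate_hz


# Hypothetical program and machines.
program_instructions = 1e9
machine_a = Machine("A", clock_rate_hz=4e9, cpi=2.0)
machine_b = Machine("B", clock_rate_hz=3e9, cpi=1.2)

for m in (machine_a, machine_b):
    print(f"machine {m.name}: {cpu_execution_time(program_instructions, m):.3f} s")
```

With these numbers machine B finishes first even though its clock is slower, because it needs fewer cycles per instruction.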

Slide 13

A note on CPI

Different instructions take different numbers of clock cycles to complete
– A lot more on this later

By using hardware and software monitoring tools, one can measure an average CPI value that is accurate enough to be useful
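One common way to arrive at an average CPI is to weight the cycle cost of each instruction class by how often it occurs. The sketch below assumes such a breakdown is available; the class names, counts, and cycle costs are hypothetical.

```python
def average_cpi(mix: dict) -> float:
    """Weighted average CPI.

    `mix` maps an instruction class to (instruction_count, cycles_per_instruction).
    """
    total_instructions = sum(count for count, _ in mix.values())
    total_cycles = sum(count * cycles for count, cycles in mix.values())
    return total_cycles / total_instructions


# Hypothetical instruction mix for one program run.
mix = {
    "alu": (500, 1),
    "load_store": (300, 2),
    "branch": (200, 3),
}

print(f"average CPI: {average_cpi(mix):.2f}")
```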

Slide 14

So how can I make my program faster?

Algorithm: Instruction count (and possibly CPI)
– Algorithm choice affects number and type of instructions

Language: Instruction count and CPI
– Some languages require more statements per expression and may require more high-cycle instructions

Compiler: Instruction count and CPI
– Compilers can have a huge effect on both these measures; too complex to deal with here

Processor Instruction Set: CPI, clock rate & instruction count
– We will get into this in the next chapter
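To illustrate the first point, the sketch below counts loop iterations for two search algorithms on the same sorted data. Loop iterations are only a rough proxy for instruction count, not a measurement of real machine instructions, and the data set is an arbitrary example.

```python
def linear_search_steps(items, target):
    """Count loop iterations a linear scan needs to find `target`."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(items, target):
    """Count loop iterations a binary search needs on sorted `items`."""
    steps = 0
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


data = list(range(1_000_000))  # sorted, so binary search is valid
target = 999_999               # worst case for the linear scan

print("linear:", linear_search_steps(data, target))
print("binary:", binary_search_steps(data, target))
```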

Slide 15

Comparative Performance

So we know the elements, but how do we compare systems?

For situations where there is only one application to run, this is straightforward

Most of us, however, run multiple applications

Could test the application on different platforms, but only if it is available for the target platforms

Alternatively, create a simple application that models idealised usage
– The Benchmark!
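A toy version of this idea, as a sketch only: a small stand-in workload is run several times and the best wall-clock time is reported. The workload and the choice of "best of N" are assumptions made for this example; real benchmark suites are far more careful.

```python
import time


def benchmark(workload, repeats: int = 5) -> float:
    """Run `workload` several times and return the best wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return min(timings)


def toy_workload():
    # Hypothetical stand-in for "idealised usage": build, sort, and sum some numbers.
    data = [(i * 7919) % 10_007 for i in range(50_000)]
    return sum(sorted(data))


print(f"best of 5 runs: {benchmark(toy_workload):.4f} s")
```

Taking the minimum of several runs is one simple way to reduce noise from other activity on the machine; reporting the mean or median is an equally reasonable choice.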

Slide 16

Benchmarks

There are many of these created to indicate performance for different types of task

A good place to look is
– www.spec.org

However, some compilers are created to optimise for specific benchmarks

Hence the need for reproducibility: results should come with enough detail for someone else to repeat them

Slide 17

You may have heard of MIPS?

MIPS (Millions of Instructions Per Second) was, for a long time, the standard measure of performance, but has now been replaced

On the plus side, bigger numbers usually mean faster computers

On the down side
– Different instruction sets do different things
– MIPS ratings vary per program (different MIPS for same computer)
– MIPS can vary inversely with performance.
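The last point is easiest to see with numbers. The sketch below takes MIPS as instruction count divided by (execution time x 10^6) and compares two hypothetical compiled versions of the same program on the same 4 GHz machine. All instruction counts and CPI values are invented for illustration.

```python
def execution_time_s(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU execution time = (instruction count x CPI) / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def mips(instruction_count: float, exec_time_s: float) -> float:
    """Millions of instructions executed per second."""
    return instruction_count / (exec_time_s * 1e6)


clock_rate = 4e9  # same machine for both versions

# Hypothetical: version A uses many simple instructions, version B fewer but slower ones.
versions = {
    "A": {"instruction_count": 5e9, "cpi": 1.0},
    "B": {"instruction_count": 2e9, "cpi": 2.0},
}

for name, v in versions.items():
    t = execution_time_s(v["instruction_count"], v["cpi"], clock_rate)
    print(f"version {name}: time {t:.2f} s, MIPS {mips(v['instruction_count'], t):.0f}")
```

In this example version A earns the higher MIPS rating yet takes longer to finish, so ranking by MIPS would pick the slower program.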

Slide 18

Some performance laws

Moore’s law (as popularly quoted): “The complexity of an integrated circuit will double every 18 months”
– What are the implications of that?

Amdahl’s Law: The speedup of the overall task is limited by the fraction of the task that the improvement applies to, so even dramatic improvements to a small part have little overall effect (diminishing returns)
– Implication is that the common case should be fast
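Amdahl's Law is usually written as speedup = 1 / ((1 - f) + f / s), where f is the fraction of the original execution time that is improved and s is the factor by which that part gets faster. The symbols f and s are notation chosen for this sketch, and the scenarios are hypothetical.

```python
def amdahl_speedup(fraction_improved: float, improvement_factor: float) -> float:
    """Overall speedup when `fraction_improved` of the time is made `improvement_factor` times faster."""
    unchanged = 1.0 - fraction_improved
    return 1.0 / (unchanged + fraction_improved / improvement_factor)


# Hypothetical scenarios.
rare_case = amdahl_speedup(fraction_improved=0.2, improvement_factor=100.0)
common_case = amdahl_speedup(fraction_improved=0.8, improvement_factor=2.0)

print(f"20% of the task made 100x faster: overall {rare_case:.2f}x")
print(f"80% of the task made 2x faster:   overall {common_case:.2f}x")
```

A modest gain on the common case beats a dramatic gain on the rare case here, which is the reasoning behind making the common case fast.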