cs352h: computer systems architecturefussell/courses/cs352h/lectures/8-mips-pipeline.pdfuniversity...

37
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell CS352H: Computer Systems Architecture Topic 8: MIPS Pipelined Implementation September 29, 2009

Upload: dohanh

Post on 18-May-2018

231 views

Category:

Documents


4 download

TRANSCRIPT

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell

CS352H: Computer Systems Architecture

Topic 8: MIPS Pipelined Implementation

September 29, 2009

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 2

MIPS Pipeline

Five stages, one step per stageIF: Instruction fetch from memoryID: Instruction decode & register readEX: Execute operation or calculate addressMEM: Access memory operandWB: Write result back to register

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 3

Pipeline Performance

Assume time for stages is100ps for register read or write200ps for other stages

Compare pipelined datapath with single-cycle datapath

200ps200psj

500ps200ps100 ps200psbeq

600ps100 ps200ps100 ps200psR-format

700ps200ps200ps100 ps200pssw

800ps100 ps200ps200ps100 ps200pslw

Total timeRegisterwrite

Memoryaccess

ALU opRegisterread

Instr fetchInstr

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 4

Pipeline Performance

Single-cycle (Tc= 800ps)

Pipelined (Tc= 200ps)

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 5

Pipeline Speedup

If all stages are balancedi.e., all take the same timeTime between instructionspipelined

= Time between instructionsnonpipelined

Number of stages

If not balanced, speedup is lessSpeedup due to increased throughput

Latency (time for each instruction) does not decrease

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 6

Pipelining and ISA Design

MIPS ISA designed for pipeliningAll instructions are 32-bits

Easier to fetch and decode in one cyclec.f. x86: 1- to 17-byte instructions

Few and regular instruction formatsCan decode and read registers in one step

Load/store addressingCan calculate address in 3rd stage, access memory in 4th stage

Alignment of memory operandsMemory access takes only one cycle

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 7

Hazards

Situations that prevent starting the next instruction in thenext cycleStructure hazards

A required resource is busyData hazard

Need to wait for previous instruction to complete its dataread/write

Control hazardDeciding on control action depends on previous instruction

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 8

Structure Hazards

Conflict for use of a resourceIn MIPS pipeline with a single memory

Load/store requires data accessInstruction fetch would have to stall for that cycle

Would cause a pipeline “bubble”

Hence, pipelined datapaths require separateinstruction/data memories

Or separate instruction/data caches

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 9

Data Hazards

An instruction depends on completion of data access by aprevious instruction

add $s0, $t0, $t1sub $t2, $s0, $t3

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 10

Forwarding (aka Bypassing)

Use result when it is computedDon’t wait for it to be stored in a registerRequires extra connections in the datapath

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 11

Load-Use Data Hazard

Can’t always avoid stalls by forwardingIf value not computed when neededCan’t forward backward in time!

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 12

Code Scheduling to Avoid Stalls

Reorder code to avoid use of load result in the nextinstructionC code for A = B + E; C = B + F;

lw $t1, 0($t0)lw $t2, 4($t0)add $t3, $t1, $t2sw $t3, 12($t0)lw $t4, 8($t0)add $t5, $t1, $t4sw $t5, 16($t0)

stall

stall

lw $t1, 0($t0)lw $t2, 4($t0)lw $t4, 8($t0)add $t3, $t1, $t2sw $t3, 12($t0)add $t5, $t1, $t4sw $t5, 16($t0)

11 cycles13 cycles

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 13

Control Hazards

Branch determines flow of controlFetching next instruction depends on branch outcomePipeline can’t always fetch correct instruction

Still working on ID stage of branch

In MIPS pipelineNeed to compare registers and compute target early in the pipelineAdd hardware to do it in ID stage

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 14

Stall on Branch

Wait until branch outcome determined before fetching nextinstruction

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 15

Branch Prediction

Longer pipelines can’t readily determine branch outcomeearly

Stall penalty becomes unacceptable

Predict outcome of branchOnly stall if prediction is wrong

In MIPS pipelineCan predict branches not takenFetch instruction after branch, with no delay

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 16

MIPS with Predict Not Taken

Predictioncorrect

Predictionincorrect

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 17

More-Realistic Branch Prediction

Static branch predictionBased on typical branch behaviorExample: loop and if-statement branches

Predict backward branches takenPredict forward branches not taken

Dynamic branch predictionHardware measures actual branch behavior

e.g., record recent history of each branch

Assume future behavior will continue the trendWhen wrong, stall while re-fetching, and update history

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 18

Pipeline Summary

Pipelining improves performance by increasing instructionthroughput

Executes multiple instructions in parallelEach instruction has the same latency

Subject to hazardsStructure, data, control

Instruction set design affects complexity of pipelineimplementation

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 19

MIPS Pipelined Datapath

WB

MEM

Right-to-leftflow leads tohazards

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 20

Pipeline registers

Need registers between stagesTo hold information produced in previous cycle

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 21

Pipeline Operation

Cycle-by-cycle flow of instructions through the pipelineddatapath

“Single-clock-cycle” pipeline diagramShows pipeline usage in a single cycleHighlight resources used

c.f. “multi-clock-cycle” diagramGraph of operation over time

We’ll look at “single-clock-cycle” diagrams for load &store

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 22

IF for Load, Store, …

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 23

ID for Load, Store, …

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 24

EX for Load

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 25

MEM for Load

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 26

WB for Load

Wrongregisternumber

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 27

Corrected Datapath for Load

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 28

EX for Store

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 29

MEM for Store

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 30

WB for Store

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 31

Multi-Cycle Pipeline Diagram

Form showing resource usage

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 32

Multi-Cycle Pipeline Diagram

Traditional form

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 33

Single-Cycle Pipeline Diagram

State of pipeline in a given cycle

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 34

Pipelined Control (Simplified)

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 35

Pipelined Control

Control signals derived from instructionAs in single-cycle implementation

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 36

Pipelined Control

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 37

Concluding Remarks

ISA influences design of datapath and controlDatapath and control influence design of ISAPipelining improves instruction throughputusing parallelism

More instructions completed per secondLatency for each instruction not reduced

Hazards: structural, data, control