Comp Sci 251 -- Pipelining

Ch. 13 Pipelining


Pipelining


Performance of Pipeline

One instruction completes every clock cycle

n stages in pipeline: up to n times faster

speedup < n because some instructions do not need every stage

Note: individual instructions are not faster
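
The "up to n times faster" bound can be checked with a little arithmetic. The sketch below is only an idealized model: it assumes every stage takes exactly one clock cycle, there are no hazards, and the non-pipelined machine spends n of the same cycles on each instruction. The function names are hypothetical, chosen for illustration.

```python
def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles to finish a hazard-free instruction stream on an ideal pipeline."""
    # The first instruction needs num_stages cycles to pass through the pipeline;
    # every later instruction completes one cycle after its predecessor.
    return num_stages + (num_instructions - 1)


def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles when each instruction runs all stages before the next one starts."""
    return num_stages * num_instructions


def speedup(num_instructions: int, num_stages: int) -> float:
    """Ratio of non-pipelined to pipelined cycle counts."""
    return unpipelined_cycles(num_instructions, num_stages) / pipelined_cycles(
        num_instructions, num_stages
    )


if __name__ == "__main__":
    for k in (1, 3, 100, 1_000_000):
        print(k, "instructions:", round(speedup(k, 5), 3))
```

Under these assumptions the speedup for a single instruction is 1 (it is not faster), and it approaches but never reaches n as the instruction count grows, because of the cycles spent filling the pipeline.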


MIPS Pipeline

5 stages

– IF: instruction fetch
– ID: instruction decode (read registers)
– EX: instruction execution, address calc
– MEM: memory access
– WB: write back (to register)


Implementing a single-cycle pipeline

IF  ID  EX  MEM WB
    IF  ID  EX  MEM WB
        IF  ID  EX  MEM WB

• Every stage takes the same time, whether there is work or not

• Each stage must be stretched to accommodate the slowest instruction


Total time for each instruction

Instruction class                  Instr fetch  Reg read  ALU operation  Data access  Register write  Total time
Load word (lw)                     200 ps       100 ps    200 ps         200 ps       100 ps          800 ps
Store word (sw)                    200 ps       100 ps    200 ps         200 ps       -               700 ps
R-format (add, sub, and, or, slt)  200 ps       100 ps    200 ps         -            100 ps          600 ps
Branch (beq)                       200 ps       100 ps    200 ps         -            -               500 ps
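
The two clock periods used in the following slides can be derived from the timing table above. This is a minimal sketch, assuming a single-cycle machine must fit the slowest instruction into one clock and a pipelined machine must fit the slowest stage into one clock; the variable names are illustrative only.

```python
# Stage latencies in picoseconds, copied from the timing table;
# 0 marks a stage that the instruction class does not use.
STAGES = ("fetch", "reg_read", "alu", "data_access", "reg_write")
LATENCY_PS = {
    "lw":       (200, 100, 200, 200, 100),
    "sw":       (200, 100, 200, 200, 0),
    "R-format": (200, 100, 200, 0, 100),
    "beq":      (200, 100, 200, 0, 0),
}

# Total time per instruction class is the sum of the stages it uses.
total_ps = {name: sum(latencies) for name, latencies in LATENCY_PS.items()}

# Single-cycle design: the clock must cover the slowest instruction.
single_cycle_clock_ps = max(total_ps.values())

# Pipelined design: the clock must cover the slowest individual stage.
pipelined_clock_ps = max(max(latencies) for latencies in LATENCY_PS.values())

if __name__ == "__main__":
    for name, total in total_ps.items():
        print(f"{name:<9} {total} ps")
    print("single-cycle clock:", single_cycle_clock_ps, "ps")
    print("pipelined clock:   ", pipelined_clock_ps, "ps")
```

With the table's numbers this gives an 800 ps single-cycle clock (set by lw) and a 200 ps pipelined clock (set by the 200 ps stages).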


Single-cycle non-pipelined execution

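lw $7, 100($5)

lw $8, 200($5)

lw $9, 300($5)

[Figure: each lw runs instruction fetch, register read, ALU, memory access, and register write to completion in one 800 ps cycle before the next lw starts.]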

3 independent lw instructions will take 3 x 800 ps = 2400 ps


Pipelined execution

lw $7, 100($5)

lw $8, 200($5)

lw $9, 300($5)

[Figure: the three lw instructions overlap, each starting 200 ps after the previous one; time axis runs from 200 ps to 1400 ps.]

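A new lw instruction starts every 200 ps, so the time between the start of the first and the start of the fourth instruction is 3 x 200 ps = 600 ps

The three lw instructions themselves finish after 1400 ps (the last one starts at 400 ps and needs 5 stages of 200 ps)

Instruction throughput is 4 times that of the single-cycle non-pipelined execution (800 ps / 200 ps)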
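
The timing of the three lw instructions can be laid out explicitly. The sketch below assumes a hazard-free stream, a 200 ps pipelined clock and an 800 ps single-cycle clock as in the earlier slides; the helper `pipelined_schedule` is a hypothetical name, not part of any tool.

```python
STAGE_NAMES = ["IF", "ID", "EX", "MEM", "WB"]
CLOCK_PS = 200         # pipelined clock, set by the slowest stage
SINGLE_CYCLE_PS = 800  # non-pipelined clock, set by the slowest instruction (lw)


def pipelined_schedule(instructions):
    """Return {instruction: [(stage, start_ps, end_ps), ...]} with no stalls."""
    schedule = {}
    for i, instr in enumerate(instructions):
        # Instruction i enters IF at cycle i and advances one stage per cycle.
        schedule[instr] = [
            (stage, (i + s) * CLOCK_PS, (i + s + 1) * CLOCK_PS)
            for s, stage in enumerate(STAGE_NAMES)
        ]
    return schedule


program = ["lw $7, 100($5)", "lw $8, 200($5)", "lw $9, 300($5)"]
schedule = pipelined_schedule(program)

# Completion time of the last stage of the last instruction.
finish_pipelined_ps = max(
    end for stages in schedule.values() for _, _, end in stages
)
finish_single_cycle_ps = SINGLE_CYCLE_PS * len(program)

if __name__ == "__main__":
    # Print a staggered stage diagram, one row per instruction.
    for i, instr in enumerate(program):
        row = "     " * i + " ".join(f"{s:<4}" for s in STAGE_NAMES)
        print(f"{instr:<16}{row}")
    print("pipelined finish:   ", finish_pipelined_ps, "ps")
    print("single-cycle finish:", finish_single_cycle_ps, "ps")
```

In this model a new instruction starts every 200 ps, while the last of the three finishes at 1400 ps, which matches the end of the time axis in the figure.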


Pipeline Hazards

Control Hazards

– caused by conditional branch instructions
– cannot decide which instruction is next until stage 3 (or stage 2 with beefed up processor)
– pipeline wants to start next instruction during stage 2


Solutions to Control Hazards

Stall:
– start next instruction during stage 3 (assume branch is resolved in stage 2)
– equivalent to placing a “nop” after every branch

Predict:
– if incorrect, flush the bad instruction
– some prediction strategies:
    assume all branches not taken
    static: assume some always taken, others never taken
    dynamic: use past history, keep stats on branches
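
To make the prediction strategies concrete, here is a small sketch comparing "assume all branches not taken" with one possible dynamic scheme. The slides do not say which dynamic scheme is meant; the 2-bit saturating counter below is an assumed, commonly taught example, and the class names, the initial counter state, and the branch trace are all hypothetical.

```python
class AlwaysNotTaken:
    """Predicts every branch as not taken and keeps no history."""

    def predict(self, pc):
        return False

    def update(self, pc, taken):
        pass


class TwoBitPredictor:
    """One 2-bit saturating counter per branch address.

    Counter states 0..3; a value of 2 or 3 predicts taken.
    """

    def __init__(self):
        self.counters = {}

    def predict(self, pc):
        # Unseen branches start at 1, weakly not taken (an arbitrary choice).
        return self.counters.get(pc, 1) >= 2

    def update(self, pc, taken):
        counter = self.counters.get(pc, 1)
        self.counters[pc] = min(3, counter + 1) if taken else max(0, counter - 1)


def count_mispredictions(predictor, trace):
    """Run (pc, taken) pairs through a predictor and count wrong guesses."""
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


# A made-up loop branch at address 0x40: taken 9 times, then falls through,
# and the whole loop runs twice.
trace = [(0x40, taken) for taken in ([True] * 9 + [False]) * 2]

if __name__ == "__main__":
    print("always not taken:", count_mispredictions(AlwaysNotTaken(), trace))
    print("2-bit dynamic:   ", count_mispredictions(TwoBitPredictor(), trace))
```

On this trace the always-not-taken strategy is wrong on every taken branch, while the history-based predictor is wrong only on the first iteration and at each loop exit. Each wrong guess corresponds to one flushed instruction in the model described above.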


Solutions to Control Hazards

Delayed decision: (used in MIPS and SPARC)
– instruction following branch always executes
– branch takes place after this instruction
– compiler or assembler fills “delay slots” with useful instructions (or nop’s)
    Change order of neighboring instructions, if logically acceptable
    Or nop


Data Hazards

Assume register write happens in WB stage

add $s0, $t0, $t1

sub $t2, $s0, $t3

Example requires three pipeline stalls
– too costly to allow
– too frequent for compiler to resolve
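
The count of three stalls follows from the stage positions. The sketch below assumes, as the slide does, that the register is written in WB, and additionally assumes that a register cannot be read in ID during the same cycle in which it is written. Designs that write in the first half of a cycle and read in the second half would need one stall fewer. The function name is hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def stalls_without_forwarding(distance: int) -> int:
    """Stall cycles for a consumer issued `distance` instructions after its producer.

    Cycles are counted relative to the producer's IF stage (cycle 0).
    """
    # Cycle in which the producer writes its result to the register file.
    write_cycle = STAGES.index("WB")
    # Cycle in which the consumer would read registers if it never stalled.
    read_cycle = distance + STAGES.index("ID")
    # The read must happen strictly after the write.
    return max(0, write_cycle + 1 - read_cycle)


if __name__ == "__main__":
    # add $s0, $t0, $t1 followed immediately by sub $t2, $s0, $t3 is distance 1.
    for distance in range(1, 6):
        print("distance", distance, "->", stalls_without_forwarding(distance), "stalls")
```

For the add/sub pair (distance 1) this gives three stalls, and the hazard disappears once at least three independent instructions separate the two.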


Solution to Data Hazards

Forwarding: getting the missing item early from internal resources

– sub gets $s0 value from ALU, not reg. file
– sometimes forwarding avoids stalls


Solution to Data Hazards

sometimes forwarding only reduces stalls
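
The difference between "avoids" and "only reduces" can be sketched with the same stage arithmetic. The model below assumes the classic 5-stage pipeline in which an ALU result is available at the end of EX, a loaded value is available only at the end of MEM, and the consumer needs its operand at the start of its own EX stage. The load case is an assumed illustration of the slide's point, and the function name is hypothetical.

```python
def stalls_with_forwarding(producer_is_load: bool, distance: int = 1) -> int:
    """Stall cycles for a dependent instruction when results are forwarded.

    Cycles are counted relative to the producer's IF stage (cycle 0), with
    stages IF=0, ID=1, EX=2, MEM=3, WB=4.
    """
    # Last cycle the producer spends computing or fetching the value.
    ready_after_cycle = 3 if producer_is_load else 2
    # Cycle in which the consumer would enter EX if it never stalled.
    needed_in_cycle = distance + 2
    # The value can be forwarded in the cycle right after it is produced.
    return max(0, ready_after_cycle + 1 - needed_in_cycle)


if __name__ == "__main__":
    # add followed immediately by a dependent sub: forwarding removes all stalls.
    print("ALU result, next instruction: ", stalls_with_forwarding(False))
    # lw followed immediately by a dependent instruction: one stall remains.
    print("load result, next instruction:", stalls_with_forwarding(True))
    # lw with one independent instruction in between: no stall.
    print("load result, distance 2:      ", stalls_with_forwarding(True, 2))
```

Under these assumptions forwarding removes every stall for the add/sub pair, but a value loaded from memory arrives one cycle too late for the very next instruction, so one stall remains.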


Advanced Pipelining Techniques

Superpipelining: large number of stages

Superscalar:
– multiple copies of each stage
– several instructions started/finished per cycle


Advanced Pipelining Techniques

Dynamic Pipeline Scheduling


Pentium Pro / PowerPC 604
