instruction set issues

Upload: trella

Post on 01-Feb-2016

Instruction Set Issues

• MIPS easy
  – Instructions are only committed at the MEM/WB transition

• Other architectures are more difficult
  – Instructions may update state early
  – FP more difficult
  – Memory-updating ops (e.g. string moves)

Instruction Set Issues (cont.)

• Difficult architectural features
  – “Odd” bits of state (e.g. condition codes)
    • May need saving/restoring on exceptions
  – Implicitly set condition codes
    • Complicate branch resolution
    • Explicit setting helps here (still a RAW hazard)
  – Multicycle operations
    • Widely differing execution times, lots of potential data hazards, etc.

Instruction Set Issues

• VAX suffers from many of these problems

• Solution: pipeline the microcode

• Intel 32-bit 80x86 processors since 1995 use a similar approach

A.5. Handling Multicycle Operations

• MIPS: FP operations
  – Long latency (EX repeated)
  – Several functional units
  – Structural hazards
  – Data hazards

DLX: FP Design

• Four functional units:
  – Integer ALU
    • as before
  – FP multiplier
    • also used for integer multiplication
  – FP adder
    • addition, subtraction and conversion
  – FP divider
    • also used for integer division

MIPS Design with FP Units

MIPS Multicycle Operations

Unit             Latency   Initiation interval
Integer ALU      0         1
Memory (loads)   1         1
FP add           3         1
FP multiply      6         1
FP divide        24        25
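
The sketch below shows one way to read the two columns of the table. It assumes the usual definitions: latency is the number of cycles a dependent instruction issued immediately afterwards has to wait (with full forwarding), and initiation interval is the minimum spacing between two operations sent to the same unit. The names `UNITS`, `stages_to_result`, `raw_stalls` and `next_issue_cycle` are illustrative and not taken from the slides.

```python
# (latency, initiation interval) per unit, copied from the table above.
UNITS = {
    "Integer ALU":    (0, 1),
    "Memory (loads)": (1, 1),
    "FP add":         (3, 1),
    "FP multiply":    (6, 1),
    "FP divide":      (24, 25),
}

def stages_to_result(unit):
    """Stages from EX up to and including the one that produces the result."""
    latency, _ = UNITS[unit]
    return latency + 1

def raw_stalls(unit, gap=0):
    """Stall cycles for a consumer with `gap` independent instructions
    between it and the producer.

    Assumes full forwarding and a consumer that needs the value in EX.
    """
    latency, _ = UNITS[unit]
    return max(0, latency - gap)

def next_issue_cycle(unit, last_issue_cycle):
    """Earliest cycle at which the same unit can accept another operation."""
    _, interval = UNITS[unit]
    return last_issue_cycle + interval

print(stages_to_result("FP multiply"))    # 7
print(raw_stalls("FP multiply"))          # 6
print(raw_stalls("FP add", gap=2))        # 1
print(next_issue_cycle("FP divide", 10))  # 35
```

The last line reflects that the divider is not pipelined: a second divide has to wait the full 25 cycles, which is the structural hazard on divides.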

Hazards

• Divides
  – Structural hazard

• Multiple register writes possible in a cycle

• Out-of-order completion
  – WAW hazards
  – Exception-handling complications

• RAW hazards increase

Potential RAW Hazards

• Example (SPARC syntax):

    ldd   [%fp-8], %f4
    fmuld %f4, %f6, %f0
    faddd %f0, %f8, %f2
    std   %f2, [%fp-16]

Instr.  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
ld      F  D  X  M  W
mul        F  D  s  X  X  X  X  X  X  X  M  W
add           F  s  D  s  s  s  s  s  s  X  X  X  X  M  W
st                  F  s  s  s  s  s  s  D  X  s  s  s  M

(s = stall cycle)
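
The 17-cycle timeline above can be reproduced with a very small in-order model. This is a sketch under stated assumptions, not the slides' own method: full forwarding, one instruction per stage in IF and ID, a single MEM port shared by all instructions, 7 EX cycles for an FP multiply and 4 for an FP add (latency + 1), and stores needing their data only at MEM. It does not model the divider or back-pressure from instructions stalled after EX. All names (`EX_CYCLES`, `PROGRAM`, `schedule`) are hypothetical.

```python
# EX cycles per operation class (assumed: latency + 1 for the FP units).
EX_CYCLES = {"int": 1, "load": 1, "store": 1, "fadd": 4, "fmul": 7}

# (label, class, destination, sources); a store lists its data register first.
PROGRAM = [
    ("ld",  "load",  "%f4", ("%fp",)),
    ("mul", "fmul",  "%f0", ("%f4", "%f6")),
    ("add", "fadd",  "%f2", ("%f0", "%f8")),
    ("st",  "store", None,  ("%f2", "%fp")),
]

def schedule(program):
    """Return (label, IF, ID, first EX, last EX, MEM) cycle numbers."""
    ready = {}        # register -> first cycle its value can be forwarded
    mem_busy = set()  # cycles in which the MEM stage is already taken
    rows = []
    prev_d = prev_x = 0
    for i, (name, kind, dest, srcs) in enumerate(program):
        f = 1 if i == 0 else prev_d   # enter IF when the predecessor enters ID
        d = max(f + 1, prev_x)        # enter ID when the predecessor enters EX
        # Only the address operand of a store is needed in EX.
        ex_srcs = srcs[1:] if kind == "store" else srcs
        x = max([d + 1] + [ready.get(r, 0) for r in ex_srcs])
        x_end = x + EX_CYCLES[kind] - 1
        m = x_end + 1
        if kind == "store":           # wait for the store data
            m = max(m, ready.get(srcs[0], 0))
        while m in mem_busy:          # structural hazard on the MEM stage
            m += 1
        mem_busy.add(m)
        if dest is not None:
            # Loads produce their value in MEM, everything else at the end of EX.
            ready[dest] = (m if kind == "load" else x_end) + 1
        rows.append((name, f, d, x, x_end, m))
        prev_d, prev_x = d, x
    return rows

for name, f, d, x, x_end, m in schedule(PROGRAM):
    print(f"{name:4} IF={f:2} ID={d:2} EX={x:2}-{x_end:2} MEM={m:2}")
```

Under these assumptions the multiply starts EX in cycle 5, the add in cycle 12, and the store reaches MEM in cycle 17, one cycle after the add, because the two cannot share the MEM stage.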

Multiple Writes

• Up to four instructions may need to write in the same cycle

• Solution
  – Track writes in ID
  – Stall at instruction issue
  – Simpler: all stalls occur at one point

• Alternatively:
  – Stall at MEM or WB
    • Stall instruction with shorter latency (may free RAW hazards)
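
Tracking writes in ID is commonly done with a shift register of write-port reservations. The following is a minimal sketch of that idea, assuming a single register-file write port; the class name `WritePortTracker` and its methods are hypothetical.

```python
class WritePortTracker:
    """Shift register of write-port reservations, indexed by cycles from now."""

    def __init__(self, depth=32):
        # slots[k] is True if the write port is already reserved k cycles ahead.
        self.slots = [False] * depth

    def tick(self):
        """Advance one clock cycle: every reservation moves one slot closer."""
        self.slots = self.slots[1:] + [False]

    def try_issue(self, cycles_until_wb):
        """Reserve the port for an instruction in ID; False means stall in ID."""
        if self.slots[cycles_until_wb]:
            return False
        self.slots[cycles_until_wb] = True
        return True

port = WritePortTracker()
print(port.try_issue(6))   # FP add in ID: 4 EX + MEM + WB, so WB is 6 cycles away
for _ in range(3):
    port.tick()
print(port.try_issue(3))   # integer op three cycles later wants the same WB cycle
port.tick()
print(port.try_issue(3))   # after a one-cycle stall the port is free
```

The three calls return `True`, `False`, `True`. Because the check happens in ID, the conflicting instruction is held at issue rather than at MEM or WB.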

WAW Hazards

• Example:

    faddd %f4, %f6, %f2
    …                      ! Integer op
    ldd   [%fp-8], %f2

Instr.  1  2  3  4  5  6  7  8
faddd   F  D  X  X  X  X  M  W
…          F  D  X  M  W
ldd           F  D  X  M  W

WAW Hazards (cont.)

• Rare
  – Compiler scheduling may result in unlikely instruction sequences, so must be caught

• Solutions:
  – Stall issue of ldd
  – Prevent write by faddd
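
The first solution (stalling the later instruction at issue) can be sketched as a check in ID against the write-back cycles of instructions already in flight. The function `waw_stall` and the `pending_writes` map are hypothetical names, and the cycle numbers come from the faddd/ldd example: faddd writes %f2 in cycle 8, while an unstalled ldd would write it in cycle 7.

```python
def waw_stall(pending_writes, dest, wb_cycle):
    """Cycles to hold an instruction in ID so that it writes `dest`
    strictly after an earlier instruction that is still in flight.

    pending_writes maps a register name to the cycle in which an
    already-issued instruction will write it.
    """
    earlier = pending_writes.get(dest)
    if earlier is None or wb_cycle > earlier:
        return 0          # no earlier pending write, or ordering already correct
    return earlier - wb_cycle + 1

pending = {"%f2": 8}                 # faddd writes %f2 in cycle 8
print(waw_stall(pending, "%f2", 7))  # 2: ldd then writes in cycle 9
print(waw_stall(pending, "%f4", 7))  # 0: different destination
```

The second solution would instead leave the ldd alone and cancel the pending entry for faddd, so that its result is never written.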

Maintaining Precise Exceptions

• Out-of-order completion:

    fdivd %f2, %f4, %f0
    faddd %f10, %f8, %f10     ! completes long before fdivd
    fsubd %f12, %f14, %f12    ! completes long before fdivd

• fsubd may cause an exception after faddd has completed but before fdivd has
• The exception is then no longer precise

Maintaining Precise Exceptions

• It may be very difficult to handle exceptions precisely
  – E.g. the add has destroyed one of its operands!

• Four solutions:
  – Accept imprecise exceptions
    • But precise exceptions are needed for VM & IEEE FP
    • Allow switching between precise and imprecise modes

Maintaining Precise Exceptions

• Solutions (cont.)
  – Buffer results until earlier instructions complete
    • Buffers may grow very large, and extensive forwarding required
    • History files: restore original register values
    • Future files: store new register values
  – Software executes intervening instructions to get “up to date” before returning from exception
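
To make the history-file idea concrete, here is a minimal sketch: every register write logs the value it overwrites, and on an exception the writes of later instructions are undone, newest first. The class `HistoryFile`, the sequence numbers and the register values are assumptions chosen for illustration; the instruction sequence is the fdivd/faddd/fsubd example.

```python
class HistoryFile:
    """Logs the old value of every register write so it can be undone."""

    def __init__(self, regs):
        self.regs = regs   # register file, modelled as a dict
        self.log = []      # (sequence number, register, old value) in write order

    def write(self, seq, reg, value):
        self.log.append((seq, reg, self.regs[reg]))
        self.regs[reg] = value

    def retire(self, seq):
        """Drop entries once every instruction up to `seq` is known to be safe."""
        self.log = [e for e in self.log if e[0] > seq]

    def rollback(self, faulting_seq):
        """Undo, newest first, all writes by instructions after the faulting one."""
        for seq, reg, old in reversed(self.log):
            if seq > faulting_seq:
                self.regs[reg] = old
        self.log = [e for e in self.log if e[0] <= faulting_seq]

regs = {"%f8": 2.0, "%f10": 1.0, "%f12": 5.0, "%f14": 3.0}
hf = HistoryFile(regs)
hf.write(1, "%f10", regs["%f10"] + regs["%f8"])    # faddd completes early
hf.write(2, "%f12", regs["%f12"] - regs["%f14"])   # fsubd completes early
hf.rollback(0)                                     # fdivd (seq 0) raises an exception
print(regs["%f10"], regs["%f12"])                  # 1.0 5.0
```

After the rollback the operand that faddd destroyed is back in %f10, so the exception can be handled as if nothing after fdivd had executed. A future file takes the opposite approach: new values are kept to one side and copied into the architectural registers only in program order.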

Maintaining Precise Exceptions

• Solutions (cont.)
  – Hybrid scheme
    • Instructions are only issued when it is certain that preceding instructions will not cause an exception
    • May require stalling the pipeline

Performance of the MIPS FP Pipeline

• Structural hazards (divide unit)
  – Very low: 0-2 cycles per FP operation

• RAW hazards
  – Divide: 12-24 cycles, average 14.2
  – Add: 0.7-2.3 cycles, average 1.7
  – In general, about 0.5 × latency

Overall MIPS FP Performance

• Stalls per instruction
  – 0.65-1.21 cycles
  – Average: 0.87
  – 82% from FP RAW hazards
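
As a quick check of what these figures imply, the arithmetic below splits the average stall count into the FP RAW share and the remainder. It uses only the two numbers quoted above; the variable names are illustrative.

```python
avg_stalls = 0.87   # average stall cycles per instruction (from the slide)
raw_share = 0.82    # fraction of those stalls caused by FP RAW hazards

raw_stalls_per_instr = avg_stalls * raw_share
other_stalls_per_instr = avg_stalls - raw_stalls_per_instr

print(round(raw_stalls_per_instr, 2))    # 0.71
print(round(other_stalls_per_instr, 2))  # 0.16
```

So roughly 0.71 of the 0.87 stall cycles per instruction come from waiting on FP results, leaving about 0.16 for everything else.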

A.6. Putting It All Together: MIPS R4000 Pipeline

• 64-bit instruction set

• Eight-stage pipeline
  – superpipelining
  – IF + IS: instruction fetch
  – RF: decode/register fetch
  – EX: execution
  – DF + DS + TC: data cache access
  – WB: write back

MIPS R4000 Pipeline

• Performance
  – Load delay: two cycles
  – Branch delay: three cycles
    • Delayed branch (one cycle)
    • Predict-not-taken strategy, with annulling

• Increased forwarding requirements
  – Three stages between EX and WB now
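
The two delays follow from the stage positions. The sketch below assumes that load data becomes available at the end of DS and that the branch condition is resolved in EX; neither assumption is stated on the slide. The helper `delay_slots` is a hypothetical name.

```python
R4000_STAGES = ["IF", "IS", "RF", "EX", "DF", "DS", "TC", "WB"]
FIVE_STAGE = ["IF", "ID", "EX", "MEM", "WB"]

def delay_slots(stages, produced_at_end_of, needed_at_start_of):
    """Number of instruction slots between a producer and the first
    instruction that can use its result without stalling."""
    return stages.index(produced_at_end_of) - stages.index(needed_at_start_of)

# Load data is ready at the end of DS and is needed at the start of EX.
print(delay_slots(R4000_STAGES, "DS", "EX"))   # 2
# The branch outcome is known at the end of EX and steers the next IF.
print(delay_slots(R4000_STAGES, "EX", "IF"))   # 3
# For comparison, the classic five-stage pipeline has a one-cycle load delay.
print(delay_slots(FIVE_STAGE, "MEM", "EX"))    # 1
```

The deeper the pipeline, the further apart the producing and consuming stages sit, which is why both delays grow relative to the five-stage design.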

MIPS R4000 Pipeline

• Floating point
  – Three functional units
    • Divider, multiplier, adder
    • Shared components (8 sub-units)
  – Latency: 2–112 cycles
  – Initiation rate: 1–111 cycles
  – Complicated stall handling

MIPS R4000 Pipeline

• Performance:
  – CPI between 1.2 and 2.8 for SPEC92 benchmarks
  – Average: 2.0
    • Integer: 1.54
    • FP: 2.48
  – Integer apps: mainly branch delays
  – FP apps: mainly FP data hazard stalls (RAW)
