lec jan29 2009
Anshul Kumar, CSE IITD
CSL718 : Superscalar Processors
Issue and Dispatch
29th Jan, 2009

Early proposals/prototypes
[Timeline figure, 1982 to 1989]
• IBM: Cheetah, America project (4)
• DEC: Multititan project (2)
• Stanford U: Match (2), Torch (4)
• Kyushu U: SIMP (4), DSNS (4)
• Marker on the timeline: term "superscalar"

Commercial superscalars
RISCs
• Intel 960KA/KB ⇒ 960CA (3) 1989
• IBM Power 1 RS/6000 (4) 1990
• HP PA7000 ⇒ PA7100 (2) 1992
• SUN SPARC ⇒ SuperSparc (3) 1992
• DEC Alpha 21064 (2) 1992
• Motorola MC88100 ⇒ MC88110 (2) 1993
• Motorola PowerPC 601/603 (3) 1993
• MIPS R4000 ⇒ R8000 (4) 1994

Commercial superscalars
CISCs
• Intel 80486 ⇒ Pentium (2) 1993
• Motorola MC68040 ⇒ MC68060 (2) 1993
• Gmicro Gmicro/100p ⇒ Gmicro 500 (2) 1993
• AMD K5 (2) – 4 RISC instr 1995
• CYRIX M1 (2) 1995

Tasks of superscalar processing
• Parallel decoding and issue
• Parallel instruction execution
• Preserving the sequential consistency of instruction execution and exception processing

Superscalar decode and issue
[Figure: two pipelines compared]
• Scalar issue: I-cache → Instruction buffer → Decode & Issue; stages IF, D/I
• Superscalar issue: I-cache → Instruction buffer → Decode & Issue; stages IF, D, I

Parallel Decoding
• Fetch multiple instructions in instruction buffer
• Decode multiple instructions in parallel – instruction window
• Possibly check dependencies among these as well as with the instructions already under execution
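The dependency check in the last bullet can be sketched as follows. This is only an illustration, not the logic of any particular processor: the `Instr` record, the register names and the function `raw_dependencies` are hypothetical, and only read-after-write dependencies are considered.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    name: str
    dest: str      # destination register
    srcs: tuple    # source registers

def raw_dependencies(window, in_flight):
    """For each instruction in the window, list the earlier instructions
    (in flight or earlier in the window) that write one of its sources."""
    deps = {}
    for i, ins in enumerate(window):
        earlier = list(in_flight) + list(window[:i])
        deps[ins.name] = [e.name for e in earlier if e.dest in ins.srcs]
    return deps

window = [
    Instr("a", "r1", ("r2", "r3")),
    Instr("b", "r4", ("r1", "r5")),   # reads r1, which a writes
    Instr("c", "r6", ("r7", "r8")),   # reads r7, which in-flight x writes
]
in_flight = [Instr("x", "r7", ("r9",))]
deps = raw_dependencies(window, in_flight)
print(deps)
```

Hardware performs these comparisons in parallel with comparators; the loop here only shows which pairs have to be compared.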

Reducing decoding time
Pre-decoding
• Do partial decoding while instructions are being loaded in I-cache
• Decoded information is appended to the instruction
• This includes instruction class, resources required etc.
[Figure: Second level cache or main memory → (N bits/cycle) → Pre-decode unit → (N + n bits/cycle) → I-cache]
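A minimal sketch of the idea, under invented assumptions: a 32-bit instruction word whose top 6 bits are the opcode, and n = 2 predecode bits holding an instruction class. The encoding, the class table and all function names are hypothetical and chosen only for illustration.

```python
CLASS_BITS = 2  # n: extra bits stored with each instruction (assumed)
CLASSES = {"alu": 0, "load_store": 1, "branch": 2, "fp": 3}

def instruction_class(word):
    """Slow path: classify by inspecting the opcode (hypothetical encoding)."""
    opcode = word >> 26
    if opcode < 16:
        return "alu"
    if opcode < 32:
        return "load_store"
    if opcode < 48:
        return "branch"
    return "fp"

def predecode(word):
    """Done while filling the I-cache: append the class bits to the word."""
    return (word << CLASS_BITS) | CLASSES[instruction_class(word)]

def cached_class(cached):
    """Done at decode time: the class is read straight from the low n bits."""
    return cached & ((1 << CLASS_BITS) - 1)

word = (20 << 26) | 0x1234      # opcode 20 falls in the load/store range
cached = predecode(word)
print(cached_class(cached))
```

The I-cache entry grows from N to N + n bits, and the decoder no longer has to examine the opcode to find the instruction class.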

Pre-decoding examples
Processor              No. of predecode bits
PA 7200 (1995)         5
PA 8000 (1996)         5
PowerPC 620 (1996)     7
UltraSparc (1995)      4
HAL PM1 (1995)         4
AMD K5 (1995)          5 (per byte)
R 10000 (1996)         4

Blocking during issue
[Figure: Instruction buffer (issue window) → Decode, Check & Issue → EU, EU, EU]
Decode and issue instructions directly to EUs
Instructions may be blocked due to data dependency

Non-blocking Issue
[Figure: Instruction buffer → Decode & Issue → three reservation stations, each followed by Dep. checking/dispatch → EU]
Decode and issue to buffers
From buffers dispatch to EUs
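The difference between blocking and non-blocking issue can be sketched as follows. The model is deliberately crude and all names are hypothetical: each instruction is just a label, `ready` says whether its operands are available, and `rs_free` is the number of free reservation station entries.

```python
def blocking_issue(window, ready):
    """Issue directly to EUs: stop at the first instruction that is not ready."""
    issued = []
    for ins in window:
        if not ready[ins]:
            break
        issued.append(ins)
    return issued

def non_blocking_issue(window, ready, rs_free):
    """Issue to reservation stations: only a lack of free entries stops issue.
    Dependencies are checked later, when dispatching from the buffers."""
    issued = window[:rs_free]
    dispatchable = [ins for ins in issued if ready[ins]]
    return issued, dispatchable

window = ["a", "b", "c", "d"]
ready = {"a": True, "b": False, "c": True, "d": True}
print(blocking_issue(window, ready))
print(non_blocking_issue(window, ready, rs_free=4))
```

With direct issue, the dependent instruction b holds back c and d. With reservation stations all four leave the issue stage, and only b waits in its buffer.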

Handling of Issue Blockages
• Preserving issue order: in-order / out of order
• Alignment of instruction issue: aligned / unaligned

Issue Order
[Figure: two panels, each showing an issue window holding instructions a to e (instructions to be issued) and the instructions issued; legend: independent instruction, dependent instruction, issued instruction]
• Issue in strict program order: a is issued
• Out of order issue: a and c are issued
Example: MC 88110, PowerPC 601

Alignment
[Figure: two panels showing, for cycles 1 to 3, the instructions checked and the instructions issued out of a to h]
• Aligned issue: fixed window; the next window (f, g, h) is taken up only after the current window has been issued completely
• Unaligned issue: gliding window; the window moves on to f, g, h as earlier instructions are issued
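A small simulation can make the difference between the fixed and the gliding window concrete. It is a sketch under simplifying assumptions: issue is in order, up to `width` instructions are issued per cycle, and `ready_at[i]` (a hypothetical stand-in for the dependency check) gives the first cycle in which instruction i may be issued.

```python
def simulate(ready_at, width, aligned):
    """Return, for each instruction, the cycle in which it is issued."""
    n = len(ready_at)
    issued_in = [None] * n
    next_i = 0       # oldest instruction not yet issued
    win_start = 0    # start of the fixed window (aligned case only)
    cycle = 0
    while next_i < n:
        cycle += 1
        if aligned:
            if next_i >= win_start + width:   # window drained: load the next one
                win_start += width
            limit = min(win_start + width, n)
        else:
            limit = min(next_i + width, n)    # window glides with the oldest instruction
        while next_i < limit and ready_at[next_i] <= cycle:
            issued_in[next_i] = cycle
            next_i += 1
    return issued_in

ready_at = [1, 2, 2, 1, 1, 1]   # second and third instructions must wait until cycle 2
print(simulate(ready_at, width=3, aligned=True))
print(simulate(ready_at, width=3, aligned=False))
```

In this run the fourth instruction is issued in cycle 3 with the fixed window but already in cycle 2 with the gliding window, because the gliding window is refilled as soon as the first instruction leaves.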

Design space in instruction issue
• Coping with false data dependencies: no / register renaming
• Coping with unresolved control dependencies: wait / speculative
• Use of RSs: blocking / non-blocking
• Handling of issue blockages
• Issue rate (2-6)

Frequently used issue policies in scalar processors
• Traditional scalar issue: i386, MC68030, R3000, Sparc
• Traditional scalar issue with RSs: CDC 6600
• Traditional scalar issue with RSs and renaming: IBM 360/91
• Traditional scalar issue with spec. execution: i486, MC68040, R4000, MicroSparc

Frequently used issue policies in superscalar processors
(speculative execution in all)
• Straightforward superscalar issue, aligned: Pentium, PowerPC601, PA7100, SuperSparc, Alpha21164
• Straightforward superscalar issue, unaligned: MC68060, PA7200, UltraSparc
• Straightforward superscalar issue with RSs: MC88110, R8000
• Straightforward superscalar issue with renaming: PowerPC602
• Advanced superscalar issue (renaming + RSs): R10000, PentiumPro, PowerPC602, PA8000, Sparc64, Am29000, K5

Design Space of Reservation Stations
• Scope: partial / full
• Layout of reservation stations
• Operand fetch policy
• Instruction dispatch scheme

Layout of Reservation Stations
• Type: stand alone (RS), or combined with renaming and reordering
• Number of buffer entries: individual 2-4, group 6-16, central 20, total 15-40
• Number of read and write ports: depends on no. of EUs connected

Reservation Stations (RS)
[Figure: three arrangements of RSs and EUs]
• Individual RSs: one RS per EU
• Group RSs: one RS shared by a group of EUs
• Central RS: one RS serving all EUs

Operand Fetch Policies
• Issue bound fetch
• Dispatch bound fetch

Issue bound operand fetch (with single register file)
[Figure: Decode/issue reads the RF and passes instructions together with operand data to the RSs, which feed the EUs; instruction and data paths are drawn separately]

Dispatch bound operand fetch (with single register file)
[Figure: Decode/issue passes instructions to the RSs; the RF is read when instructions are dispatched from the RSs to the EUs; instruction and data paths are drawn separately]

Issue bound operand fetch (with multiple register files)
[Figure: as in the single register file case, but with two RFs read at issue, each serving its own RSs and EUs]

Dispatch bound operand fetch (with multiple register files)
[Figure: as in the single register file case, but with two RFs read at dispatch, each serving its own RSs and EUs]

Updating RFs and RSs
[Figure: Decode/issue, two RFs, RSs and EUs with instruction and data paths; results from the EUs update the RFs and the RSs]

Instruction dispatch scheme
• Dispatch policy
• Dispatch rate: single instr/cycle (individual RS), multiple instr/cycle (group or central RS)
• Checking operand availability
• Treatment of empty RS

Dispatch policy
• Selection rule: rule for identifying instructions which are ready for execution (data dependency check)
• Arbitration rule: rule for choosing one out of several ready instructions (earlier instruction has priority)
• Dispatch order
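The selection and arbitration rules can be sketched in a few lines. The reservation station entry layout used here (a dictionary with a sequence number `seq` and a list of source valid bits `src_valid`) is a hypothetical simplification, as are the function names.

```python
def select_ready(rs_entries):
    """Selection rule: entries whose source operands are all available."""
    return [e for e in rs_entries if all(e["src_valid"])]

def arbitrate(ready, rate=1):
    """Arbitration rule: earlier instructions (lower seq) have priority."""
    return sorted(ready, key=lambda e: e["seq"])[:rate]

rs = [
    {"seq": 7, "op": "mul", "src_valid": [True, True]},
    {"seq": 5, "op": "add", "src_valid": [True, False]},   # still waits for an operand
    {"seq": 6, "op": "sub", "src_valid": [True, True]},
]
chosen = arbitrate(select_ready(rs))
print([e["op"] for e in chosen])
```

Here mul and sub are both ready, and sub is dispatched because it is earlier in program order. Passing `rate=2` models a dispatch rate of two instructions per cycle.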

Dispatch order
• in-order
• partially out of order
• out of order
[Figure: RS entries checked for dispatch under each order]

Checking availability of operands
• Direct check of score-board bits (usual for dispatch bound operand fetch): control flow approach
• Check of explicit status bits in RS (usual for issue bound operand fetch): data flow approach
(control flow / data flow: Flynn's terminology)

Score-board
[Figure: register file in which each register has a data field and a one-bit status field]
Introduced with CDC6600

Checking in dispatch bound fetch
[Figure: decoded instruction → reservation station → EU, with the register file alongside]
• Reservation station entry (from decoded instruction): OC (opcode), Rs1, Rs2, Rd
• Reservation station to register file: Rs1, Rs2, Rd; check V bits of sources, reset V bit of Rd
• Register file to EU: Os1, Os2 (operand values), together with OC
• EU to register file: result, Rd; update Rd, set V bit
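The V bit handling for dispatch bound fetch can be sketched as below. Several things are assumptions made for the sketch and are not taken from the slide: the V bit of Rd is reset at issue time, Rd differs from both sources, no two instructions in flight write the same register (there is no renaming), and the class and function names are invented.

```python
class RegisterFile:
    def __init__(self, n):
        self.value = [0] * n
        self.valid = [True] * n    # one V bit per register (the score-board)

def issue(rf, rs, oc, rs1, rs2, rd):
    """Store only register identifiers in the RS; reset the V bit of Rd."""
    rs.append((oc, rs1, rs2, rd))
    rf.valid[rd] = False

def try_dispatch(rf, rs):
    """Dispatch the oldest entry whose source V bits are set.
    Operand values are fetched from the register file at this point."""
    for i, (oc, rs1, rs2, rd) in enumerate(rs):
        if rf.valid[rs1] and rf.valid[rs2]:
            del rs[i]
            return oc, rf.value[rs1], rf.value[rs2], rd
    return None

def write_back(rf, result, rd):
    """Result from the EU: update Rd and set its V bit."""
    rf.value[rd] = result
    rf.valid[rd] = True

rf = RegisterFile(8)
rf.value[1], rf.value[2] = 10, 32
rs = []
issue(rf, rs, "add", 1, 2, 3)     # r3 = r1 + r2
issue(rf, rs, "sub", 3, 1, 4)     # r4 = r3 - r1, needs r3
first = try_dispatch(rf, rs)      # add can go
blocked = try_dispatch(rf, rs)    # sub waits: V bit of r3 is reset
write_back(rf, 42, 3)
second = try_dispatch(rf, rs)     # sub now reads r3 = 42
print(first, blocked, second)
```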

Checking in issue bound fetch
[Figure: decoded instruction → register file → reservation station → EU]
• Decoded instruction to register file: Rs1, Rs2, Rd; reset V bit of Rd
• Register file to reservation station: Os1, Os2 (operand values)
• Reservation station entry: OC, Os1/Is1, Vs1, Os2/Is2, Vs2, Rd; check Vs1, Vs2
• Reservation station to EU: OC, Os1, Os2, Rd
• EU result, Rd: update Rd and set V bit in the register file; associative update of Is1, Is2 with Rd and set Vs bits in the reservation station
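The corresponding sketch for issue bound fetch keeps operand values, or register identifiers for operands not yet available, inside the reservation station entry. As before this is only an illustration: the entry layout and function names are invented, the register number itself serves as the identifier (no renaming), and Rd is assumed to differ from the sources and from the destinations of other instructions in flight.

```python
class RegisterFile:
    def __init__(self, n):
        self.value = [0] * n
        self.valid = [True] * n

def read_operand(rf, r):
    """Return (operand value, True) if r is valid, else (identifier r, False)."""
    return (rf.value[r], True) if rf.valid[r] else (r, False)

def issue(rf, rs, oc, rs1, rs2, rd):
    """Fetch operands (or their identifiers) at issue; reset the V bit of Rd."""
    s1, v1 = read_operand(rf, rs1)
    s2, v2 = read_operand(rf, rs2)
    rs.append({"oc": oc, "s1": s1, "v1": v1, "s2": s2, "v2": v2, "rd": rd})
    rf.valid[rd] = False

def try_dispatch(rs):
    """Dispatch the oldest entry with Vs1 and Vs2 set; the RF is not accessed."""
    for i, e in enumerate(rs):
        if e["v1"] and e["v2"]:
            del rs[i]
            return e["oc"], e["s1"], e["s2"], e["rd"]
    return None

def write_back(rf, rs, result, rd):
    """Update Rd in the RF, then associatively update waiting RS entries."""
    rf.value[rd] = result
    rf.valid[rd] = True
    for e in rs:
        for s, v in (("s1", "v1"), ("s2", "v2")):
            if not e[v] and e[s] == rd:
                e[s], e[v] = result, True

rf = RegisterFile(8)
rf.value[1], rf.value[2] = 10, 32
rs = []
issue(rf, rs, "add", 1, 2, 3)     # operands 10 and 32 are copied into the RS
issue(rf, rs, "sub", 3, 1, 4)     # r3 not valid: identifier 3 stored, Vs1 = 0
first = try_dispatch(rs)
blocked = try_dispatch(rs)
write_back(rf, rs, 42, 3)         # the waiting entry picks up 42 for r3
second = try_dispatch(rs)
print(first, blocked, second)
```

Compared with dispatch bound fetch, the availability check uses the status bits held in the RS itself, which corresponds to the data flow approach.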

Treatment of an empty RS
• Straightforward approach: at least one cycle stay in RS
• Bypassing RS if empty
Examples listed on the slide: Nx586, Sparc64, PowerPC 604

Approaches in dispatching
• Straightforward: in order, single instr/cycle, individual RSs. Power1, PPC603, Nx586, Am29000
• Enhanced: partially out of order, single instr/cycle, individual RSs. Power2, PPC604, 620
• Advanced: out of order, multiple instr/cycle, group/central RSs. PM1, PentiumPro, PA8000, R10000

Reference
1. D. Sima, T. Fountain, P. Kacsuk, "Advanced Computer Architectures: A Design Space Approach", Addison Wesley, 1997.