add on instruction set architecture - ryerson universitycourses/coe608/lectures/more-isa.pdfadd on...
Post on 08-May-2018
222 Views
Preview:
TRANSCRIPT
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 1
Add on
Instruction Set Architecture
COE608: Computer Organization and Architecture
Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan
Electrical and Computer Engineering Ryerson University
Overview • More MIPS Instructions • Alternative Architecture • ARM CPU - ISA • IA-32/x86 – ISA
Chapter 2 of the Text
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 2
Instruction Set Design Main Goals Maximize performance, minimize cost and reduce design time Between Software and Hardware Which is easier to modify?
instructions
software
hardware
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 3
Switch Statement
Choose among four alternatives depending on whether k has the value 0, 1, 2 or 3. Compile the following C code manually: switch (k) { case 0: f = i+j; break; case 1: f = g+h; break; case 2: f = g–h; break; case 3: f = i–j; break; } Rewrite it as a chain of if-else statements, which we already know how to compile: if (k = = 0) f = i+j; else if (k = = 1) f = g+h; else if (k = = 2) f = g–h; else if (k = = 3) f = i–j; Using the following mapping: f:$s0, g:$s1, h:$s2, i:$s3, j:$s4, k:$s5
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 4
Switch Statement
Final compiled MIPS code: bne $s5, $0, L1 add $s0, $s3, $s4 j Exit L1: addi $t0, $s5, -1 bne $t0, $0, L2 add $s0, $s1, $s2 j Exit L2: addi $t0, $s5, -2 bne $t0, $0, L3 sub $s0, $s1, $s2 j Exit L3: addi $t0, $s5, -3 bne $t0, $0, Exit sub $s0, $s3, $s4 Exit:
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 5
Function/Procedure Call
Registers play a major role in keeping track of information for function calls.
MIPS Registers
The constant 0 $0 $zero Reserved for Assembler $1 $at Return Values $2-$3 $v0-$v1 Arguments $4-$7 $a0-$a3 Temporary $8-$15 $t0-$t7 Saved $16-$23 $s0-$s7 More Temporary $24-$25 $t8-$t9 Used by Kernel $26-27 $k0-$k1 Global Pointer $28 $gp Stack Pointer $29 $sp Frame Pointer $30 $fp Return Address $31 $ra ... sum( a, b ); ... // a, b: $s0,$s1 int sum( int x, int y) { return x + y; }
1000 add $a0,$s0,$zero 1004 add $a1,$s1,$zero 1008 addi $ra,$zero,1016 1012 j sum 1016 ... 2000 sum: add $v0,$a0,$a1 2004 jr $ra
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 6
Another Procedure Call
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 7
Function/Procedure Call
Why use jr here? Why not simply use j? Procedure return: jump register jr $ra
Copies $ra to program counter Can also be used for computed jumps
e.g. for case/switch statements Procedure call: jump and link jal Procedure-Label Address of following instruction put in $ra
Jumps to target address Single instruction to jump and save return address: jump and link (jal) With jr $ra as part of sum
1008 addi $ra,$zero,1016 #$ra=1016 1012 j sum #goto sumWith jal
1008 jal sum # $ra=1012,goto sum
Why have a jal?
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 8
Function/Procedure Call Support Syntax for jal (jump link) is same as for j (jump): jal label jal should really be called “link and jump”: Step 1 (link): Save address of next instruction into $ra (Why next instruction? Why not current one?)
Step 2 (jump): Jump to the given label
Nested Procedures call other procedures For nested call, caller needs to save on the stack: Its return address Arguments and temporaries needed after the call Restore from the stack after the call C code:
int fact (int n) { if (n < 1) return f; else return n * fact(n - 1); }
Argument n in $a0 Result in $v0
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 9
Nested Procedure Call Support
MIPS code:
fact: addi $sp,$sp,-8:adjust stack for 2 items sw $ra,4($sp) :save return address sw $a0,0($sp) :save argument slti $t0,$a0, 1 :test for n < 1 beq $t0,$zero, L1 addi $v0,$zero, 1 :yes, result is 1 addi $sp,$sp, 8 :pop 2 items from jr $ra :stack and return
L1: addi $a0,$a0,-1 :else decrement n jal fact :recursive call lw $a0,0($sp) :restore original n lw $ra,4($sp) :and return address addi $sp,$sp,8 :pop 2 items from stack mul $v0,$a0,$v0 : multiply to get jr $ra :result and return
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 10
ARM & MIPS Similarities
ARM: the most popular embedded CPU core. ARM and MIPS CPUs have a similar basic set of instructions.
Compare and Branch in ARM Uses condition codes for result of an arithmetic or logical instruction
Compare instructions to set condition codes without keeping the result
Each instruction can be conditional • Top 4 bits of instruction word: condition value
Can avoid branches over single instructions
39Data addressing
Mem-mappedMem-mappedInput/output 31 × 32-bit15 × 32-bitRegisters
AlignedAlignedData alignment 32-bit flat32-bit flatAddress space
32 bits32 bitsInstruction size 19851985Date announced MIPSARM
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 11
The Intel x86 ISA
Evolution and backward compatibility • 8080 (1974): 8-bit microprocessor
Accumulator, plus 3 index-register pairs • 8086 (1978): 16-bit extension to 8080
Complex instruction set (CISC) • 8087 (1980): floating-point coprocessor
Adds FP instructions and register stack • 80286 (1982): 24-bit addresses, MMU
Segmented memory mapping and protection • 80386 (1985): 32-bit extension (IA-32)
Additional addressing modes and operations Paged memory mapping as well as segments
• i486 (1989): pipelined, on-chip caches/FPU Compatible competitors: AMD, Cyrix, …
• Pentium (1993): super-scalar, 64-bit datapath Later versions added MMX instructions The infamous FDIV bug
• Pentium Pro (1995), Pentium II (1997) New micro-architecture
• Pentium III (1999) Added SSE (Streaming SIMD Extensions) and
associated registers • Pentium 4 (2001) New micro-architecture & added SSE2 instructions
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 12
Basic x86 Registers 80386 Register Set
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 13
x86 Instruction Formats
Variable length encoding • Postfix bytes specify addressing mode • Prefix bytes modify operation
Operand length, repetition, locking, …
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 14
Implementing IA-32
• Complex instruction set makes implementation difficult Hardware translates instructions to simpler micro-operations Simple instructions: 1–1 Complex instructions: 1–many
Micro-engine similar to RISC Market share makes it economically viable
• Comparable performance to RISC Compilers avoid complex instructions
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 15
Typical Instructions of IA 32
______________________________________________________________________________________________ © G. Khan Computer Organization & Architecture – COE608: ISA and IS Design Page: 16
Fallacies
• Powerful instruction ⇒ higher performance
Fewer instructions required But complex instructions are hard to
implement May slow down all instructions, including
simple ones Compilers are good at making fast code
from simple instructions • Use assembly code for high
performance But modern compilers are better at dealing
with modern processors More lines of code ⇒ more errors and less
productivity
top related