CEN 333-316 Computer Organization and Design ISA Hesham Al-Twaijry Edited by: Mansour Al Zuair

Upload: melanie-agatha-nichols

Post on 21-Jan-2016


CEN 333-316 Computer Organization and Design

ISA

Hesham Al-Twaijry

Edited by: Mansour Al Zuair

These Lectures: ISA & MIPS

Components of an ISA:
• Operands and data types
• Computational operations
• Memory access & addressing
• Branches
• Procedure call
• Instruction encoding
• Assembling and linking
• Alternatives

Later in course: exceptions and interrupts


Assembly Language

Basic job of a CPU: execute lots of instructions.

Instructions are the primitive operations that the CPU may execute.

Different CPUs implement different sets of instructions. The set of instructions a particular CPU implements is an Instruction Set Architecture (ISA).

• Examples: Intel 80x86 (Pentium 4), IBM/Motorola PowerPC (Macintosh), MIPS, Intel IA64, ...


Instruction Set Architectures

Early trend was to add more and more instructions to new CPUs to do elaborate operations

• VAX architecture had an instruction to multiply polynomials!

RISC philosophy (Cocke IBM, Patterson, Hennessy, 1980s) – Reduced Instruction Set Computing

• Keep the instruction set small and simple, which makes it easier to build fast hardware.

• Let software do complicated operations by composing simpler ones.

MIPS Architecture

MIPS – semiconductor company that built one of the first commercial RISC architectures

We will study the MIPS architecture in some detail in this class

Why MIPS instead of Intel 80x86?
• MIPS is simple, elegant. Don't want to get bogged down in gritty details.
• MIPS widely used in embedded apps, x86 little used in embedded, and more embedded computers than PCs

MIPS Architectural Approach

Load/store or register-register instruction set
• Data must be in registers to be operated on
  – register operations affect the entire contents of register
• Only load/store instructions affect memory
• True in all RISC instruction sets
• True in all instruction sets designed since 1980

Emphasis on efficient implementation

Simplicity: provide primitives rather than solutions

Assembly Variables: Registers

Unlike HLLs such as C or Java, assembly cannot use variables
• Why not? Keep hardware simple

Assembly operands are registers
• limited number of special locations built directly into the hardware
• operations can only be performed on these!

Benefit: Since registers are directly in hardware, they are very fast (faster than 1 billionth of a second)

Drawback: Since registers are in hardware, there are a predetermined number of them
• Solution: MIPS code must be very carefully put together to efficiently use registers

Assembly Variables: Registers

32 registers in MIPS
• Why 32? Smaller is faster

Each MIPS register is 32 bits wide
• Groups of 32 bits called a word in MIPS

Registers are numbered from 0 to 31

Each register can be referred to by number or name

Number references: $0, $1, $2, … $30, $31

By convention, each register also has a name to make it easier to code. For now:
  $16 - $23 are named $s0 - $s7
  $8 - $15 are named $t0 - $t7

In general, use names to make your code more readable

Assembly Variables: Registers

Name      Register number   Usage
$zero     0                 the constant value 0
$v0-$v1   2-3               values for results and expression evaluation
$a0-$a3   4-7               arguments
$t0-$t7   8-15              temporaries
$s0-$s7   16-23             saved
$t8-$t9   24-25             more temporaries
$gp       28                global pointer
$sp       29                stack pointer
$fp       30                frame pointer
$ra       31                return address


C, Java variables vs. registers

In C (and most High Level Languages) variables declared first and given a type

• Example: int fahr, celsius; char a, b, c, d, e;

Each variable can ONLY represent a value of the type it was declared as (cannot mix and match int and char variables).

In Assembly Language, the registers have no type; operation determines how register contents are treated

Comments in Assembly and instructions

Another way to make your code more readable: comments!

Hash (#) is used for MIPS comments
• anything from hash mark to end of line is a comment and will be ignored

In assembly language, each statement (called an Instruction), executes exactly one of a short list of simple commands

Unlike in C (and most other High Level Languages), each line of assembly code contains at most 1 instruction

Instructions are related to operations (=, +, -, *, /) in C or Java

Data Types: Typical

• Bit: 0, 1
• Bit string: sequence of bits of a particular length
  – 8 bits is a byte
  – 16 bits is a half-word
  – 32 bits is a word
  – 64 bits is a double-word
• Character:
  – Supported as a byte (signed or unsigned)
• Decimal:
  – Digits 0-9 encoded as 0000b thru 1001b, two per byte
  – Not supported in most newer architectures
• Integers:
  – 2's complement: next chapter
• Floating point: M x 2^E
  – Single precision
  – Double precision
  – Extended precision

MIPS I Storage Model

2^32 bytes of memory: accessible by loads/stores

31 x 32-bit GPRs (R0 = 0)

32 x 32-bit FP regs, organized as 16 pairs

HI, LO: used for integer multiply/divide

PC: branch and procedure call

[Figure: general registers $0 (always 0), $1 ... $31; PC, lo, hi; floating-point registers $f0, $f1 ... $f15]

Computational Instructions

Arithmetic/logical instructions
• Three operand format: result + two sources
• Operands: registers, 16-bit immediates
• Signed & unsigned arithmetic operations:
  – Sign-extension for immediates
  – Trapping of overflow for signed values
• Compare instructions
  – Signed vs. Unsigned: comparison is different

Integer multiply/divide
• Use HI/LO registers

Floating point instructions
• Operate on floating point registers
• Double and single precision
• Typical: add, multiply, divide, subtract

MIPS Addition and Subtraction

Syntax of Instructions:
  1 2,3,4
where:
  1) operation by name
  2) operand getting result ("destination")
  3) 1st operand for operation ("source1")
  4) 2nd operand for operation ("source2")

Syntax is rigid:
• 1 operator, 3 operands
• Why? Keep hardware simple via regularity

Addition and Subtraction of Integers

Addition in Assembly
• Example: add $s0,$s1,$s2 (in MIPS)
  Equivalent to: a = b + c (in C)
  where MIPS registers $s0,$s1,$s2 are associated with C variables a, b, c

Subtraction in Assembly
• Example: sub $s3,$s4,$s5 (in MIPS)
  Equivalent to: d = e - f (in C)
  where MIPS registers $s3,$s4,$s5 are associated with C variables d, e, f

Addition and Subtraction of Integers

How do we do the following C statement?

  a = b + c + d - e;

Break into multiple instructions:

  add $t0, $s1, $s2   # temp = b + c
  add $t0, $t0, $s3   # temp = temp + d
  sub $s0, $t0, $s4   # a = temp - e

Notice: A single line of C may break up into several lines of MIPS.

Notice: Everything after the hash mark on each line is ignored (comments)

Addition and Subtraction of Integers

How do we do this?

  f = (g + h) - (i + j);

Use intermediate temporary registers:

  add $t0,$s1,$s2   # temp0 = g + h
  add $t1,$s3,$s4   # temp1 = i + j
  sub $s0,$t0,$t1   # f = (g+h)-(i+j)

MIPS instructions

Register Zero

How do we do f = g (in C)?

  add $s0,$s1,$zero (in MIPS)

where MIPS registers $s0,$s1 are associated with C variables f, g

Immediates:
• Add
    f = g + 10 (in C)
    addi $s0,$s1,10 (in MIPS)
  where MIPS registers $s0,$s1 are associated with C variables f, g

MIPS instructions

There is no Subtract Immediate in MIPS: Why?
• if an operation can be decomposed into a simpler operation, don't include it
• addi …, -X = subi …, X => so no subi

  addi $s0,$s1,-10 (in MIPS)
  f = g - 10 (in C)

where MIPS registers $s0,$s1 are associated with C variables f, g
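A negative constant works with addi because the 16-bit immediate field is sign-extended to 32 bits before the add. The following is a minimal C sketch of that behavior, not the hardware definition; the function names sign_extend16 and addi_model are made up for illustration, and the overflow trap of the real addi is not modeled.

```c
#include <stdint.h>

/* Sign-extend a 16-bit immediate field to a 32-bit signed value.
   If bit 15 is set, the field encodes a negative number. */
int32_t sign_extend16(uint16_t imm)
{
    return (imm & 0x8000u) ? (int32_t)imm - 0x10000 : (int32_t)imm;
}

/* Hypothetical model of addi: result = rs + sign_extend(imm).
   Assumption: the sum fits in 32 bits (real addi traps on overflow). */
int32_t addi_model(int32_t rs, uint16_t imm)
{
    return rs + sign_extend16(imm);
}
```

Here -10 is stored in the instruction as the field 0xFFF6, so addi_model(25, 0xFFF6) plays the role of addi $s0,$s1,-10 with g = 25.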

MIPS Integer Arithmetic

Instruction           Example            Meaning           Comments
add                   add $1,$2,$3       $1 = $2 + $3      3 operands; exception possible
subtract              sub $1,$2,$3       $1 = $2 – $3      3 operands; exception possible
add immediate         addi $1,$2,100     $1 = $2 + 100     + constant; exception possible
add unsigned          addu $1,$2,$3      $1 = $2 + $3      3 operands; no exceptions
subtract unsigned     subu $1,$2,$3      $1 = $2 – $3      3 operands; no exceptions
add imm. unsigned     addiu $1,$2,100    $1 = $2 + 100     + constant; no exceptions
set less than         slt $1,$2,$3       $1 = ($2 < $3)    compare less than signed
set less than imm.    slti $1,$2,100     $1 = ($2 < 100)   compare w. constant signed
set less than uns.    sltu $1,$2,$3      $1 = ($2 < $3)    compare less than unsigned
set l. t. imm. uns.   sltiu $1,$2,100    $1 = ($2 < 100)   compare < constant unsigned

There is no subtract-immediate-unsigned instruction: as with addi, use addiu with a negative constant.
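The difference between slt and sltu is only in how the same 32 bits are interpreted. A small C sketch under that reading (slt_model and sltu_model are illustrative names, not part of any library):

```c
#include <stdint.h>

/* sltu: compare the two bit patterns as unsigned integers. */
uint32_t sltu_model(uint32_t rs, uint32_t rt)
{
    return (rs < rt) ? 1u : 0u;
}

/* slt: compare the two bit patterns as 2's complement integers.
   Flipping the sign bit of both operands maps signed order onto
   unsigned order, which avoids any signed conversion in C. */
uint32_t slt_model(uint32_t rs, uint32_t rt)
{
    return ((rs ^ 0x80000000u) < (rt ^ 0x80000000u)) ? 1u : 0u;
}
```

With rs = 0xFFFFFFFF and rt = 1, slt_model gives 1 (because -1 < 1) while sltu_model gives 0 (because 4294967295 is not less than 1).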

Multiply / Divide

Start multiply, divide

Instruction         Example       Meaning                         Comments
multiply            mult $2,$3    Hi, Lo = $2 x $3                64-bit signed product
multiply unsigned   multu $2,$3   Hi, Lo = $2 x $3                64-bit unsigned product
divide              div $2,$3     Lo = $2 ÷ $3, Hi = $2 mod $3    Lo = quotient, Hi = remainder
divide unsigned     divu $2,$3    Lo = $2 ÷ $3, Hi = $2 mod $3    Unsigned quotient, unsigned remainder

Move result from multiply, divide

Instruction         Example       Meaning                         Comments
Move from Hi        mfhi $1       $1 = Hi                         Used to get copy of Hi
Move from Lo        mflo $1       $1 = Lo                         Used to get copy of Lo

Rationale:
• deal with 64-bit result
• simplify handling of instruction

[Figure: the register file and adder, with the separate HI and LO registers]
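To make the HI/LO split concrete, here is a hedged C sketch of the unsigned forms. The type hilo_t and the functions multu_model and divu_model are names invented for this example; the divisor is assumed to be nonzero.

```c
#include <stdint.h>

/* The pair of special registers written by multiply and divide. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} hilo_t;

/* multu: the 64-bit unsigned product is split across HI (upper 32 bits)
   and LO (lower 32 bits). */
hilo_t multu_model(uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)a * (uint64_t)b;
    hilo_t r = { (uint32_t)(product >> 32), (uint32_t)product };
    return r;
}

/* divu: LO receives the quotient and HI the remainder.
   Assumption: b != 0. */
hilo_t divu_model(uint32_t a, uint32_t b)
{
    hilo_t r = { a % b, a / b };
    return r;
}
```

After either operation a program would read the pieces it needs with mfhi and mflo, which corresponds to reading the hi and lo members here.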

MIPS Logical Instructions

Instruction           Example          Meaning            Comment
and                   and $1,$2,$3     $1 = $2 & $3       Logical AND
or                    or $1,$2,$3      $1 = $2 | $3       Logical OR
xor                   xor $1,$2,$3     $1 = $2 ^ $3       Logical XOR
nor                   nor $1,$2,$3     $1 = ~($2 | $3)    Logical NOR
and immediate         andi $1,$2,10    $1 = $2 & 10       Logical AND w. constant
or immediate          ori $1,$2,10     $1 = $2 | 10       Logical OR w. constant
xor immediate         xori $1,$2,10    $1 = $2 ^ 10       Logical XOR w. constant
shift left log        sll $1,$2,10     $1 = $2 << 10      Shift left by constant
shift right log       srl $1,$2,10     $1 = $2 >> 10      Shift right by constant
shift right arith     sra $1,$2,10     $1 = $2 >> 10      Shift right (sign extend)
shift left log var    sllv $1,$2,$3    $1 = $2 << $3      Shift left by variable
shift right log var   srlv $1,$2,$3    $1 = $2 >> $3      Shift right by variable
shift right arith     srav $1,$2,$3    $1 = $2 >> $3      Shift right arith. by var
load upper imm        lui $1,40        $1 = 40 << 16      Places immediate into upper 16 bits

How about larger constants?

We'd like to be able to load a 32 bit constant into a register

Must use two instructions, new "load upper immediate" instruction

  lui $t0, 1010101010101010

  $t0 after lui:   1010101010101010 0000000000000000   (lower half filled with zeros)

Then must get the lower order bits right, i.e.,

  ori $t0, $t0, 1010101010101010

  $t0:             1010101010101010 0000000000000000
  immediate:       0000000000000000 1010101010101010
  result of ori:   1010101010101010 1010101010101010
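The two-instruction sequence can be sketched in C as follows. This is only a model of the register effects; lui_model, ori_model and load_constant32 are hypothetical helper names.

```c
#include <stdint.h>

/* lui: place the 16-bit immediate in the upper half, zeros below. */
uint32_t lui_model(uint16_t imm)
{
    return (uint32_t)imm << 16;
}

/* ori: OR with the immediate zero-extended to 32 bits. */
uint32_t ori_model(uint32_t rs, uint16_t imm)
{
    return rs | (uint32_t)imm;
}

/* Build any 32-bit constant the way the lui/ori pair does. */
uint32_t load_constant32(uint32_t value)
{
    uint32_t t0 = lui_model((uint16_t)(value >> 16));   /* upper 16 bits */
    return ori_model(t0, (uint16_t)(value & 0xFFFFu));  /* lower 16 bits */
}
```

The bit pattern 1010101010101010 is 0xAAAA, so the slide's example corresponds to load_constant32(0xAAAAAAAA). Note that ori is used rather than addi because ori zero-extends its immediate, so the upper half set by lui is left untouched.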

Memory Addressing

Byte addressing:
• Since 1980 essentially every machine addresses memory down to the level of 8 bits (a byte).

Three questions:
• Can individual bytes be accessed?
  – Yes, in almost every machine (half-words also)
• How do byte addresses map into words?
  – Byte order
  – A word is accessible either as 32 bits or as 4 bytes
• How can words be positioned in memory?
  – Alignment


Byte Ordering

Two conventions, named after Gulliver's Travels

Big Endian:
• address of most significant byte = word address (x000 = Big End of word)
• IBM 360/370, Motorola 68k, Sparc, HP PA

Little Endian:
• address of least significant byte = word address (000x = Little End of word)
• Intel 80x86, DEC Vax, DEC Alpha

Bimodal: MIPS, PowerPC (both mostly Big Endian)

[Figure: a 32-bit word drawn from msb to lsb. Little endian numbers the bytes 3 2 1 0, so byte 0 is at the lsb end; big endian numbers them 0 1 2 3, so byte 0 is at the msb end]
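One way to check your understanding of the two conventions is to ask which byte of a word ends up at a given byte offset from the word address. The C sketch below answers that for a 32-bit word; both function names are invented for this example.

```c
#include <stdint.h>

/* Byte found at (word address + offset) when the word is stored big endian:
   offset 0 holds the most significant byte. offset must be 0..3. */
uint8_t byte_at_big_endian(uint32_t word, unsigned offset)
{
    return (uint8_t)(word >> (8u * (3u - offset)));
}

/* Byte found at (word address + offset) when the word is stored little endian:
   offset 0 holds the least significant byte. offset must be 0..3. */
uint8_t byte_at_little_endian(uint32_t word, unsigned offset)
{
    return (uint8_t)(word >> (8u * offset));
}
```

For the word 0x11223344, the byte at the word address itself is 0x11 on a big endian machine and 0x44 on a little endian one.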

Alignment

Alignment: require that objects fall on an address that is a multiple of their size.

Important performance effect.

Historically:
• Early machines (IBM 360 in 1964) require alignment
• Restriction removed in 1970s: too hard for programmers!
• RISC machines: reintroduce restriction, important to performance

Example: word access (also half-word and double word)

[Figure: a word occupying byte addresses 0 1 2 3 is aligned; a word that straddles that boundary is not aligned]
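The alignment rule is easy to state as a test on the address. A minimal sketch, assuming object sizes are powers of two (1, 2, 4 or 8 bytes); is_aligned is an illustrative name:

```c
#include <stdbool.h>
#include <stdint.h>

/* An object is aligned when its address is a multiple of its size.
   Assumption: size is a power of two, so the low bits must all be zero. */
bool is_aligned(uint32_t address, uint32_t size)
{
    return (address & (size - 1u)) == 0u;
}
```

For example, a word at address 8 is aligned, a word at address 6 is not, but a half-word at address 6 is.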

MIPS memory access

All memory access through loads and stores

Aligned words, halfwords, and bytes
• halfwords and bytes may be sign or 0 extended

Floating-point loads/stores for FP registers

Single addressing mode (displacement or based):
• 16-bit sign-extended displacement (immediate field)
• + register
• = memory address

In addition:
• displacement = 0 uses register contents as address
• register = $0 uses 16-bit displacement as address

[Figure: the contents of a register are added to the displacement to form the address of the memory location holding the data to load, or the location to store into]
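The single addressing mode amounts to one addition. The sketch below models it in C; effective_address is a made-up name, and the 32-bit wraparound on the sum is how unsigned arithmetic behaves in C, which matches adding a negative displacement in 2's complement.

```c
#include <stdint.h>

/* address = register contents + sign-extended 16-bit displacement */
uint32_t effective_address(uint32_t base, uint16_t displacement)
{
    /* Sign-extend: copy bit 15 of the displacement into bits 16..31. */
    uint32_t extended = (displacement & 0x8000u)
                            ? (0xFFFF0000u | displacement)
                            : displacement;
    return base + extended; /* wraps modulo 2^32 */
}
```

With base 0x1000, a displacement of 8 gives 0x1008, and a displacement field of 0xFFFC (that is, -4) gives 0x0FFC. Passing base 0 models the case where register $0 is used and the displacement alone is the address.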

MIPS Load/store Instructions

Instruction            Example            Meaning            Comments
store word             sw $3, 8($4)       Mem[$4+8]=$3       Store word
store half             sh $3, 6($4)       Mem[$4+6]=$3       Stores only lower 16 bits
store byte             sb $3, 7($4)       Mem[$4+7]=$3       Stores only lowest byte
store float            swc1 $f2, 4($2)    Mem[$2+4]=$f2      Store FP word

load word              lw $1, 8($2)       $1=Mem[8+$2]       Load word
load halfword          lh $1, 6($2)       $1=Mem[6+$2]       Load half; sign extend
load half unsigned     lhu $1, 6($2)      $1=Mem[6+$2]       Load half; zero extend
load byte              lb $1, 5($2)       $1=Mem[5+$2]       Load byte; sign extend
load byte unsigned     lbu $1, 5($2)      $1=Mem[5+$2]       Load byte; zero extend
load float             lwc1 $f1, 4($3)    $f1=Mem[4+$3]      Load FP register

MIPS Load/store Instructions

Load and store instructions

Example (h in $s2, base address of A in $s3):

  C code:      A[12] = h + A[8];

  MIPS code:   lw  $t0, 32($s3)     # $t0 = A[8]  (byte offset 8 x 4)
               add $t0, $s2, $t0    # $t0 = h + A[8]
               sw  $t0, 48($s3)     # A[12] = $t0 (byte offset 12 x 4)

Remember arithmetic operands are registers, not memory!

Can't write: add 48($s3), $s2, 32($s3)

Example

Can we figure out the code?

void swap(int v[], int k)
{
    int temp;

    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
}

swap:
    muli $2, $5, 4      # $2 = k * 4 (as on the slide; sll $2, $5, 2 is the usual MIPS form)
    add  $2, $4, $2     # $2 = address of v[k]
    lw   $15, 0($2)     # $15 = v[k]
    lw   $16, 4($2)     # $16 = v[k+1]
    sw   $16, 0($2)     # v[k] = old v[k+1]
    sw   $15, 4($2)     # v[k+1] = old v[k]
    jr   $31            # return to caller

Machine Language

Instructions, like registers and words of data, are also 32 bits long
• Example: add $t1, $s1, $s2
• registers have numbers, $t1=9, $s1=17, $s2=18

Instruction Format:

  000000  10001  10010  01001  00000  100000
  op      rs     rt     rd     shamt  funct

Can you guess what the field names stand for?
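Packing the six fields into one 32-bit word is a matter of shifts and ORs. Here is a small C sketch of that packing; encode_r is a hypothetical helper, and each argument is masked to its field width (6, 5, 5, 5, 5 and 6 bits).

```c
#include <stdint.h>

/* Pack the R-format fields into a 32-bit instruction word:
   op[31:26] rs[25:21] rt[20:16] rd[15:11] shamt[10:6] funct[5:0] */
uint32_t encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((op    & 0x3Fu) << 26) |
           ((rs    & 0x1Fu) << 21) |
           ((rt    & 0x1Fu) << 16) |
           ((rd    & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) << 6)  |
            (funct & 0x3Fu);
}
```

For add $t1, $s1, $s2 the call is encode_r(0, 17, 18, 9, 0, 32), which yields 0x02324820, the same bit pattern as the fields shown above written in hexadecimal.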

Machine Language

Consider the load-word and store-word instructions,
• What would the regularity principle have us do?
• New principle: Good design demands a compromise

Introduce a new type of instruction format
• I-type for data transfer instructions
• other format was R-type for register

Example: lw $t0, 32($s2)

  35   18   8    32
  op   rs   rt   16 bit number

Where's the compromise?
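The I-type layout can be sketched the same way as a field-packing function. encode_i below is an illustrative name, not an existing API; it assumes the immediate is passed as its raw 16-bit field.

```c
#include <stdint.h>

/* Pack the I-format fields into a 32-bit instruction word:
   op[31:26] rs[25:21] rt[20:16] immediate[15:0] */
uint32_t encode_i(uint32_t op, uint32_t rs, uint32_t rt, uint16_t immediate)
{
    return ((op & 0x3Fu) << 26) |
           ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) |
            (uint32_t)immediate;
}
```

For lw $t0, 32($s2), with opcode 35, $s2 = 18 and $t0 = 8, encode_i(35, 18, 8, 32) gives 0x8E480020. The three 5-bit fields rd, shamt and funct of the R format are merged here into one 16-bit field, while op, rs and rt stay in the same positions.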

Stored Program Concept

Instructions are bits

Programs are stored in memory, to be read or written just like data

[Figure: Processor connected to Memory; memory for data, programs, compilers, editors, etc.]

Fetch & Execute Cycle
• Instructions are fetched and put into a special register
• Bits in the register "control" the subsequent actions
• Fetch the "next" instruction and continue

MIPS Jump/branch Instructions

Two classes:
• Jumps: unconditional, not pc-relative
  – For procedure call, unconditional control, switch statements
• Branches: conditional and PC relative
  – For conditional control and pc-relative unconditional

Instruction      Example      Meaning                     Comment
Jump             j 10000      PC = 10000                  jump to address
Jump register    jr $31       PC = $31                    jump to address in register
Jump and link    jal 10000    $31 = PC + 4; PC = 10000    save PC of next instruction; jump to address

MIPS Compare and Branch

Conditional branch is compare-and-branch
• conditions:
  – comparison against 0: equality, sign-test
  – comparison of two registers: equality only
  – remaining set of compare-and-branch take two instructions
• unconditional formulated with $0: beq $0,$0,where

Instruction         Example          Meaning
branch equal        beq $1,$2,100    if ($1 == $2) PC = PC+4+100
branch not eq       bne $1,$2,100    if ($1 != $2) PC = PC+4+100
branch l.t. 0       bltz $1,100      if ($1 < 0) PC = PC+4+100
branch l.t./eq 0    blez $1,100      if ($1 <= 0) PC = PC+4+100
branch g.t. 0       bgtz $1,100      if ($1 > 0) PC = PC+4+100
branch g.t./eq 0    bgez $1,100      if ($1 >= 0) PC = PC+4+100

Compiling C if into MIPS

Compile by hand

  if (i == j)
      f = g+h;
  else
      f = g-h;

Use this mapping:
• f: $s0
• g: $s1
• h: $s2
• i: $s3
• j: $s4

[Figure: flowchart. Test i == j? If true (i == j), do f=g+h; if false (i != j), do f=g-h; both paths meet at Exit]

Final compiled MIPS code:

        beq $s3, $s4, True    # branch i==j
        sub $s0, $s1, $s2     # f=g-h (false)
        j   Fin               # go to Fin
  True: add $s0, $s1, $s2     # f=g+h (true)
  Fin:

Example: Searching an Array

C code:

  count = 0;
  for (index = head; index <= n; index++)
      if (C[index] == target) count++;

MIPS assembly code, assuming:
• head in $1; starting address of C in $2; target in $3; n in $4
• Register allocation: count in $5, index in $6

        li    $5,0           # set count = 0 (actually addiu $5,$0,0)
        move  $6,$1          # initial index (actually addu $6,$1,$0)
  loop: bgt   $6,$4,exit     # if index>n exit (actually slt, bne)
        sll   $7,$6,2        # multiply index by 4
        addu  $7,$7,$2       # address of C[index]
        lw    $8,0($7)       # $8 = C[index]
        bne   $8,$3,next     # test if equal
        addiu $5,$5,1        # increment count
  next: addiu $6,$6,1        # increment index
        b     loop           # unconditional branch to loop
  exit:

• Simplest, but not best code!

Instruction Support for Functions

C code:

  ... sum(a,b); ...      /* a, b: $s0,$s1 */
  }
  int sum(int x, int y) {
      return x+y;
  }

MIPS code:

  address
  1000        add  $a0,$s0,$zero    # x = a
  1004        add  $a1,$s1,$zero    # y = b
  1008        addi $ra,$zero,1016   # $ra=1016
  1012        j    sum              # jump to sum
  1016        ...

  2000  sum:  add  $v0,$a0,$a1
  2004        jr   $ra              # new instruction

Instruction Support for Functions

• Single instruction to jump and save return address: jump and link (jal)

• Before:
    1008 addi $ra,$zero,1016   # $ra=1016
    1012 j sum                 # go to sum

• After:
    1012 jal sum               # $ra=1016, go to sum

• Why have a jal? Make the common case fast: functions are very common.

• Syntax for jr (jump register):
    jr register

• Instead of providing a label to jump to, the jr instruction provides a register which contains an address to jump to.

• Very useful for function calls:
  – jal stores return address in register ($ra)
  – jr jumps back to that address

MIPS: Software Register Convention

 0   $0     zero constant 0
 1   $at    reserved for assembler
 2   $v0    expression evaluation &
 3   $v1    function results
 4   $a0    arguments
 5   $a1
 6   $a2
 7   $a3
 8   $t0    temporary: caller saves
 . . .
15   $t7
16   $s0    callee saves
 . . .
23   $s7
24   $t8    temporary (cont'd)
25   $t9
26   $k0    reserved for OS kernel
27   $k1
28   $gp    Pointer to global area
29   $sp    Stack pointer
30   $fp    frame pointer
31   $ra    Return Address (HW)

Caller save: caller saves at call if needed after call

Callee save: called procedure saves if register needed

MIPS Calling Convention

Caller (the calling function)
• Load arguments: first four in $a0–$a3, rest on stack
• Save caller-saved registers: $a0–$a3, $t0-$t9 if used
• Execute jal instruction

Callee (the function being called) start-up
• Allocate memory in frame: $sp = $sp – frame size
• Save callee-saved registers $s0–$s7, $fp, $ra if used
• Create frame: $fp = $sp + frame size

Return:
• Place return value in $v0
• Restore any callee-saved registers
• Pop stack: $sp = $sp + frame size
• Return by jr $ra

Only need to do what is needed!


Register Conventions - saved

When callee returns from executing, the caller needs to know which registers may have changed and which are guaranteed to be unchanged.

Register Conventions: A set of generally accepted rules as to which registers will be unchanged after a procedure call (jal) and which may be changed.

$0: No Change. Always 0.

$s0-$s7: Restore if you change. Very important, that's why they're called saved registers. If the callee changes these in any way, it must restore the original values before returning.

$sp: Restore if you change. The stack pointer must point to the same place before and after the jal call, or else the caller won’t be able to restore values from the stack.

HINT -- All saved registers start with S


Register Conventions - volatile

$ra: Can Change. The jal call itself will change this register. Caller needs to save on stack if nested call.

$v0-$v1: Can Change. These will contain the new returned values.

$a0-$a3: Can change. These are volatile argument registers. Caller needs to save if they’ll need them after the call.

$t0-$t9: Can change. That’s why they’re called temporary: any procedure may change them at any time. Caller needs to save if they’ll need them afterwards.

Instruction Encoding-I

3 formats, all 32 bits in length

Fixed 6-bit opcode begins each instruction

ALU Format (also R format): one opcode
• register-register-register ALU instructions

  Field   Bits   Use
  OP=0    6      opcode
  rs      5      first source register
  rt      5      second source register
  rd      5      result register
  sa      5      shift amount
  funct   6      function code

Function code: Detailed opcode: Add, Sub, or, and, ...
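Going the other way, hardware (or a disassembler) pulls the fields back out of the 32-bit word by shifting and masking. A minimal C sketch, with field accessor names invented for this example:

```c
#include <stdint.h>

/* Extract each R-format field from a 32-bit instruction word. */
uint32_t field_op(uint32_t word)    { return (word >> 26) & 0x3Fu; } /* bits 31..26 */
uint32_t field_rs(uint32_t word)    { return (word >> 21) & 0x1Fu; } /* bits 25..21 */
uint32_t field_rt(uint32_t word)    { return (word >> 16) & 0x1Fu; } /* bits 20..16 */
uint32_t field_rd(uint32_t word)    { return (word >> 11) & 0x1Fu; } /* bits 15..11 */
uint32_t field_sa(uint32_t word)    { return (word >> 6)  & 0x1Fu; } /* bits 10..6  */
uint32_t field_funct(uint32_t word) { return word & 0x3Fu; }         /* bits 5..0   */
```

As a usage example, the word 0x02324820 decodes to op = 0, rs = 17, rt = 18, rd = 9, sa = 0 and funct = 32, which is add $9, $17, $18. Because op is 0 for every ALU instruction, it is the funct field that tells add from sub, and, or, and so on.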

Instruction Encoding-II

Immediate instruction format (I format):
• Loads/stores (including floating point)
• Immediate instructions (e.g. addi, lui, etc.)
• different opcode for each instruction

  Field       Bits   Use
  OP          6      opcode
  rs          5      first source or base register
  rt          5      second source or target register
  immediate   16     immediate field

Instruction Encoding-III

Jump format (J format):
• used for j, jal
• 26-bit offset field

  Field         Bits   Use
  OP            6      opcode
  jump target   26     jump target address
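A 26-bit field cannot hold a full 32-bit address, so MIPS forms the destination by "pseudodirect" addressing: the field is shifted left by 2 (instructions are word aligned) and the top 4 bits are taken from the address of the following instruction. A C sketch of that computation, with jump_target as an illustrative name:

```c
#include <stdint.h>

/* Destination of j/jal: upper 4 bits of (PC + 4) combined with the
   26-bit field shifted left by 2. */
uint32_t jump_target(uint32_t pc, uint32_t target26)
{
    return ((pc + 4u) & 0xF0000000u) | ((target26 & 0x03FFFFFFu) << 2);
}
```

For instance, a jump at PC = 0x10000000 with field value 2500 lands at 0x10002710, since 2500 x 4 = 10000 = 0x2710 and the top nibble 0x1 is inherited from the PC. A jump can therefore reach any instruction within the current 256 MB region.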

Addressing Modes

[Figure: the five MIPS addressing modes]
1. Immediate addressing: op, rs, rt, Immediate
2. Register addressing: op, rs, rt, rd, ..., funct; the operand is a register
3. Base addressing: op, rs, rt, Address; register + Address selects a byte, halfword, or word in memory
4. PC-relative addressing: op, rs, rt, Address; PC + Address selects a word in memory
5. Pseudodirect addressing: op, Address; Address combined with PC selects a word in memory

Translation Hierarchy

[Figure: translation steps]
C program
  -> Compiler -> Assembly language program
  -> Assembler -> Object: Machine language module
  -> Linker (together with Object: Library routine (machine language)) -> Executable: Machine language program
  -> Loader -> Memory

Assembling Programs

Assembly:
• resolve labels on instructions and data:
  – relative to PC for instructions
  – relative to some register for data
  – either two-pass or use backpatch
• expand any macros and pseudoinstructions
• handle any assembler directives: data layout
• translate instructions to binary
• create object file:
  – headers
  – code segment (called text in Unix)
  – Data segment
  – Relocation information: instruction/data words to relocate
  – Symbol table: unresolved references + visible symbols
  – debugging information

Linking and Loading

Linker: combine multiple object modules, resolving cross references:
• Search and link in any library modules
• Determine address for any data or code in module and fix up the address appropriately
• Resolve cross references (both code and data)
• If all modules are present, yields an executable.

Loader
• Reads executable
• Loads code and data segments
• Initializes registers, stack, and arguments
• Jumps to program's start-up routine to initiate execution

Alternative ISA Approaches

Internal storage: registers, stacks, none
• Registers: choice since 1984, all machines in use today
• Stacks: in 1960s-70s
• Only memory: not used successfully in 25 years

Typical operations:
• Heavily used ones are little changed since 1970
• Fancy instructions in some machines, but underused

Operands and addressing: where can operands be
• Register-register: all since 1980
• Register-memory: 360, 80x86, 680x0
• Memory-memory: VAX
• Addressing: many different address modes

Instruction formats: fixed versus variable


Operations Supported

Most machines support a base set of operations like those in the MIPS ISA.

Recently many architectures have added limited support (both operations and data types) for graphics and multimedia.

Examples of operations included in more elaborate instruction sets:

• support for arithmetic and logical instructions on all data types (bytes, half words)
• support for larger integer data types
• string instructions: copies, compares, translation
• subroutine call instructions
• support for data structures: queues, stacks
• support for bit strings as a data type

Methods of Testing Condition

Condition Codes
• Processor status bits set as a side-effect of arithmetic instructions (possibly moves) or explicitly by compare or test instructions.
• example:
    add r1, r2, r3
    bz label

Condition Register: evaluate into register, test:
• example:
    cmp r1, r2, r3
    bgt r1, label

Compare and Branch
• example:
    bgt r1, r2, label

Accessing and Addressing Operands

All recent machines are general-purpose register architectures (mainly load/store architecture)

Substantial differences in both expressiveness and complexity based on how operands are accessed.

Example: VAX
• Any operand can reside in a register or in memory
• Any memory location can be addressed with any address mode
• Example: ADDW3 adds two 16-bit operands, result in 3rd
  – Each operand can be a register, immediate, or in memory: 27 combinations!
  – Each memory operand has a choice of 20+ addressing modes: more than 20,000 different forms of the add instruction
  – Instruction size varies accordingly

Addressing Modes

Addressing mode       Example               Meaning
Register              Add R4,R3             R4 ← R4+R3
Immediate             Add R4,#3             R4 ← R4+3
Displacement          Add R4,100(R1)        R4 ← R4+Mem[100+R1]
Register indirect     Add R4,(R1)           R4 ← R4+Mem[R1]
Indexed / Base        Add R3,(R1+R2)        R3 ← R3+Mem[R1+R2]
Direct or absolute    Add R1,(1001)         R1 ← R1+Mem[1001]
Memory indirect       Add R1,@(R3)          R1 ← R1+Mem[Mem[R3]]
Auto-increment        Add R1,(R2)+          R1 ← R1+Mem[R2]; R2 ← R2+d
Auto-decrement        Add R1,–(R2)          R2 ← R2–d; R1 ← R1+Mem[R2]
Scaled                Add R1,100(R2)[R3]    R1 ← R1+Mem[100+R2+R3*d]

Examples of Instruction Formats

[Figure: example layouts of variable, fixed, and hybrid instruction formats]

• If code size is most important, use variable length.
  – may be dictated by instruction set.
• If performance is most important, use fixed length.
• Hybrid, in use on 80x86, shares the pluses and minuses (+/–) of both.

Page 59: CEN 333-316 Computer Organization and Design ISA Hesham Al-Twaijry Edited by: Mansour Al Zuair

Chapter 2 59

Rationale for ISA Choices

Metrics:
• Design cost impact: HW and SW
• Performance and other execution time metrics
  – Instruction and data bytes accessed
• Static metrics (code size)

Influences on ISA Effectiveness
• Program usage: importance of various alternatives
• Organizational techniques:
  – Pipelining, memory hierarchies
• Compiler technology
• OS needs
• Basic implementation technology:
  – Memory vs. logic; high-speed vs. operations in parallel


Compilers and ISA

Ease of compilation
• Orthogonality: few special registers or special cases, all operand modes available with any data type or instruction type
• Completeness: support for wide range of applications
• Regularity: no overloading of meaning for instruction fields
• Streamlined: resource needs easily determined

Efficiency of code:
• Minimize hidden work: do what's needed
• Primitives rather than solutions

Register assignment is critical
• Easier if lots of registers


Operand Size Usage

Support these data sizes and types:
• 8-bit, 16-bit, 32-bit integers
• 32-bit and 64-bit floating point numbers

Frequency of reference by size

Size         Int Avg.   FP Avg.
Byte            7%         0%
Halfword       19%         0%
Word           74%        31%
Doubleword      0%        69%


Addressing Mode Usage

Three programs measured on machine with all address modes (VAX), not including registers:

• Displacement: 42% avg, 32% to 55%
• Immediate: 33% avg, 17% to 43%
• Register deferred (indirect): 13% avg, 3% to 24%
• Scaled: 7% avg, 0% to 16%
• Memory indirect: 3% avg, 1% to 6%
• Misc: 2% avg, 0% to 3%

75% displacement & immediate
88% displacement, immediate & register indirect

Addressing mode     TEX    spice   gcc
Memory indirect      1%     6%      1%
Scaled               0%    16%      6%
Register deferred   24%     3%     11%
Immediate           43%    17%     39%
Displacement        32%    55%     40%
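The 75% and 88% figures are running totals of the average usage numbers quoted on this slide. A minimal sketch of that arithmetic, using only the averages stated above; the helper name `cumulative_coverage` is hypothetical.

```python
# Average usage per addressing mode, in percent, as quoted on the slide.
average_usage = {
    "displacement": 42,
    "immediate": 33,
    "register deferred": 13,
    "scaled": 7,
    "memory indirect": 3,
    "misc": 2,
}

def cumulative_coverage(usage):
    """Return (mode, running total) pairs, most frequently used mode first."""
    total = 0
    coverage = []
    for mode, percent in sorted(usage.items(), key=lambda item: -item[1]):
        total += percent
        coverage.append((mode, total))
    return coverage

for mode, covered in cumulative_coverage(average_usage):
    print(f"{mode:18s} {covered:3d}%")
```

Supporting only the top three modes therefore covers 88% of the measured references, which is the argument for a small set of addressing modes.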


Immediate Size

How big are immediates?
• 50% to 60% fit within 8 bits
• 75% to 80% fit within 16 bits
• Assuming sign extension!

[Figure: distribution of the number of bits needed for immediate value (0 to 32), for TEX, spice, and gcc]
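"Fits within n bits, assuming sign extension" means the value lies in the two's-complement range of an n-bit field. A minimal sketch of that check and of the sign extension itself; the function names `fits_signed` and `sign_extend` are illustrative, not taken from the lecture.

```python
def fits_signed(value, bits):
    """True if value is representable as a bits-wide two's-complement field."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= value <= high

def sign_extend(field, bits):
    """Interpret the low `bits` bits of field as a signed integer."""
    field &= (1 << bits) - 1          # keep only the field's bits
    sign_bit = 1 << (bits - 1)
    return (field ^ sign_bit) - sign_bit

print(fits_signed(100, 8))      # small constants fit in 8 bits
print(fits_signed(40000, 16))   # too large for a signed 16-bit field
print(sign_extend(0xFFFC, 16))  # 16-bit pattern for a small negative number
```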


Displacement Address Size

• Average of 5 SPECfp and 5 SPECint programs
• 1% of addresses need > 16 bits
• 12-16 bits sufficient

[Figure: frequency (%) versus value (0 to 15), for integer and floating-point programs]


Top 10 80x86 Instructions

Simple instructions dominate instruction frequency

Rank   Instruction              Average (% of total executed)
1      load                     22%
2      conditional branch       20%
3      compare                  16%
4      store                    12%
5      add                       8%
6      and                       6%
7      sub                       5%
8      move register-register    4%
9      call                      1%
10     return                    1%
       Total                    96%


Conditional Branch Distance

Distance from branch in log(instructions)
• 35% of integer branches are -4 to +3 instructions

[Figure: distribution of the number of bits needed to specify distance between target and branch (1 to 15), for integer and floating-point programs]


Conditional Branches

PC-relative since most branches are relatively close to the current PC address

At least 8 bits suggested (± 128 instructions)

Compare Equal/Not Equal most important for integer programs (86%)

Frequency of comparison types in branches

Branch comparison   Int Avg.   FP Avg.
EQ/NE                 86%        37%
GT/LE                  7%        23%
LT/GE                  7%        40%
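A PC-relative branch stores a small signed offset instead of a full address. The sketch below follows the MIPS convention, where the offset counts instructions relative to the instruction after the branch (PC + 4) and the field is 16 bits wide; the function names are hypothetical, and other ISAs define the base and scaling differently.

```python
def branch_range(bits):
    """Signed range, in instructions, reachable with a bits-wide offset field."""
    return (-(1 << (bits - 1)), (1 << (bits - 1)) - 1)

def branch_target(pc, offset_field, bits=16, instr_bytes=4):
    """MIPS-style target: (PC + 4) + sign_extend(offset) * 4."""
    sign_bit = 1 << (bits - 1)
    offset = ((offset_field & ((1 << bits) - 1)) ^ sign_bit) - sign_bit
    return pc + instr_bytes + offset * instr_bytes

print(branch_range(8))                      # the "at least 8 bits" case
print(hex(branch_target(0x1000, 3)))        # forward branch, 3 instructions
print(hex(branch_target(0x1000, 0xFFFF)))   # offset -1 loops back to the branch itself
```

Strictly, an 8-bit signed field reaches 128 instructions backward and 127 forward, which the slide rounds to ± 128.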


Summary: ISA Desires vs. MIPS

• Use general purpose registers with a load-store architecture: Yes
• Provide at least 16 general purpose registers plus separate floating-point registers: 31 GPR & 32 FPR
• Support these addressing modes: displacement (with address offset of 12-16 bits), immediate (size 8-16 bits), and register deferred: Yes: 16 bits for immediate, displacement; (disp=0 => register deferred)
• All addressing modes apply to all data transfer instructions: Yes
• Use fixed instruction encoding if interested in performance and use variable instruction encoding if interested in code size: Fixed
• Support these data sizes and types: 8-bit, 16-bit, 32-bit integers and 32-bit and 64-bit floating point numbers: Yes
• Support these simple instructions, since they will dominate the number of instructions executed: load, store, add, subtract, move register-register, and, shift, compare equal, compare not equal, branch (with a PC-relative address at least 8 bits long), jump, call, and return: Yes, 16b branch offsets, simple branch compares
• Aim for a minimalist instruction set: Yes


Stack example

int leaf_example(int g, int h, int i, int j) {
    int f;
    f = (g + h) - (i + j);
    return f;
}

Answer
• We have 4 arguments and one return value
• 3 registers will be used: $s0, $t0, and $t1
• Create space for three registers on the stack


Stack example cont.

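leaf_example:
    addi $sp, $sp, -12     # make room for three words on the stack
    sw   $t1, 8($sp)       # save $t1
    sw   $t0, 4($sp)       # save $t0
    sw   $s0, 0($sp)       # save $s0
    add  $t0, $a0, $a1     # $t0 = g + h
    add  $t1, $a2, $a3     # $t1 = i + j
    sub  $s0, $t0, $t1     # f = (g + h) - (i + j)
    add  $v0, $s0, $zero   # return value f in $v0
    lw   $s0, 0($sp)       # restore old values
    lw   $t0, 4($sp)
    lw   $t1, 8($sp)
    addi $sp, $sp, 12      # release the stack space
    jr   $ra               # return to the caller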

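[Figure: the stack and the position of $sp at three points, labelled a., b., and c. At b. the stack holds the contents of registers $t1, $t0, and $s0, with $s0 at the lowest address, where $sp points. High addresses are at the top, low addresses at the bottom.]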
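The save, compute, and restore steps of the stack example can be traced with a toy model. This is only a sketch: registers and memory are plain Python dicts, the function name `run_leaf_example` and the starting register values are made up, and 32-bit wraparound is not modelled.

```python
def run_leaf_example(regs, mem):
    """Trace the leaf_example instruction sequence on dict-based registers and memory."""
    regs["$sp"] -= 12                         # make room for three words
    mem[regs["$sp"] + 8] = regs["$t1"]        # save $t1
    mem[regs["$sp"] + 4] = regs["$t0"]        # save $t0
    mem[regs["$sp"] + 0] = regs["$s0"]        # save $s0
    regs["$t0"] = regs["$a0"] + regs["$a1"]   # g + h
    regs["$t1"] = regs["$a2"] + regs["$a3"]   # i + j
    regs["$s0"] = regs["$t0"] - regs["$t1"]   # f
    regs["$v0"] = regs["$s0"] + 0             # return value
    regs["$s0"] = mem[regs["$sp"] + 0]        # restore old values
    regs["$t0"] = mem[regs["$sp"] + 4]
    regs["$t1"] = mem[regs["$sp"] + 8]
    regs["$sp"] += 12                         # release the stack space
    return regs["$v0"]

# Arbitrary starting state: arguments g=10, h=5, i=3, j=2.
regs = {"$a0": 10, "$a1": 5, "$a2": 3, "$a3": 2,
        "$s0": 111, "$t0": 222, "$t1": 333,
        "$sp": 0x7FFF0000, "$v0": 0}
mem = {}
result = run_leaf_example(regs, mem)
print(result)                                  # (10 + 5) - (3 + 2)
print(regs["$s0"], regs["$t0"], regs["$t1"])   # caller's values are back
```

After the call the three saved registers and $sp hold their original values, and only $v0 has changed.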