undergrad instruction set

COE–5320 Computer Architecture: Instruction Set Architecture [1]. Angel E. González-Lizardo, Ph.D., Polytechnic University of Puerto Rico, September 18, 2014.


DESCRIPTION

Instruction Set for MIPS fully explained for beginners.

TRANSCRIPT

  • COE5320 Computer Architecture: Instruction Set Architecture [1]

    Angel E. Gonzalez-Lizardo, Ph.D.

    Polytechnic University of Puerto Rico

    September 18, 2014

    COE5320 Computer Architecture Angel E. Gonzalez-Lizardo, Ph.D. 1

  • Instructions: Language of the Computer

    Introduction

    The words of a computer's language are called instructions.

    Its vocabulary is called the instruction set.

    Computer designers have a common goal: to find a language that makes it easy to build the hardware and the compiler while maximizing performance and minimizing cost.

    Simplicity of the equipment is a valuable consideration.

    The secret of computing is the stored program: both instructions and data are stored as numbers in the computer.

  • Instructions: Language of the Computer

    Arithmetic Instructions

    Every computer must be able to perform arithmetic.

    The MIPS instruction below adds b and c and places the sum in a:

    add a, b, c # a = b + c;

    Each MIPS instruction is able to perform only ONE operation.

    The statement,

    a = b + c + d + e

    translates into

    add a, b, c # The sum of b and c is placed in a.

    add a, a, d # The sum of b, c, and d is placed in a.

    add a, a, e # The sum of b, c, d, and e is placed in a.

    Three arithmetic operations generate 3 instructions.


  • Instructions: Language of the Computer

    First Design Principle

    Simplicity Favors Regularity

    The simpler the instructions, the simpler the hardware to execute them.

    MIPS Assembly

    Each line contains only one instruction.

    The text after the # is a comment.

    Comments always terminate at the end of the line.

    Each arithmetic instruction has exactly three operands, no more and no less.

  • Instructions: Language of the Computer

    First Design Principle

    Example: Compiling two C assignments into MIPS.

    This segment of a C program contains five variables.

    a = b + c;

    d = a - e;

    Translating these statements into MIPS assembly language is performed by a compiler and yields:

    add a, b, c

    sub d, a, e

    Two simple C statements compile into two assembly language instructions.


  • Instructions: Language of the Computer

    First Design Principle

    A more complex statement:

    f= (g+h)- (i+j);

    The compiler must break this statement into several instructions, for example

    add t0 , g, h # temporary variable t0 becomes g + h

    add t1 , i, j # temporary variable t1 becomes i + j

    sub f, t0 , t1 # f gets t0 - t1 , the final result


  • Instructions: Language of the Computer

    First Design Principle

    The table below shows the portions of MIPS assembly language described so far.

    Table: MIPS Assembly Language

    Category     Instruction   Example       Meaning     Comments
    Arithmetic   add           add a, b, c   a = b + c   Always three operands
    Arithmetic   subtract      sub a, b, c   a = b - c   Always three operands

  • Operands in a Computer

    Registers

    Operands of arithmetic instructions are always registers.

    Registers in the MIPS-32 architecture are 32 bits wide.

    The name word is given to such groups of 32 bits.

    The MIPS-32 architecture has only 32 registers.

    The reason for that is the Second Design Principle:

    Smaller is faster

  • Operands in a Computer

    Registers

    Example: Same as before, but using registers.

    f = (g + h) - (i + j);

    Assuming registers $s0, $s1, $s2, $s3, and $s4 are assigned to f, g, h, i, and j, respectively,

    add $t0 , $s1 , $s2 # register t0 becomes g + h

    add $t1 , $s3 , $s4 # register t1 becomes i + j

    sub $s0 , $t0 , $t1 # $s0 gets the final result

    Registers named $sx are used for variables, and the ones named $tx are used for temporary values.

  • Operands in a Computer

    Registers

    The processor can keep only a limited number of data elements in registers.

    Data structures and arrays must be kept in memory.

    Data transfer instructions move data from memory to the processor and vice versa.

    To access a word in memory, the instruction must provide a memory address.

    Memory is just a large single-dimensional array, with the address acting as the index.

  • Operands in a Computer

    Figure: Memory addresses and contents.

    Data Transfer Instructions

    The main instruction to move data from memory to the processor is load word (lw).

    The main instruction to move data from the processor to memory is store word (sw).

  • Operands in a Computer

    Example: Memory operation

    Assuming g and h are in $s1 and $s2 and the base address of the array A is in $s3 (as the code below implies), the statement

    g = h + A[8]

    will be compiled into

    lw $t0 , 32( $s3) # Temporary register $t0 gets A[8]

    add $s1 , $s2 , $t0 # g = h + A[8]

    The constant (32) in the data transfer instruction is called the Offset, while the register ($s3) is called the Base Register. The offset is 32 because A[8] lies 8 words, or 8 × 4 bytes, past the base address.

    Effective Address = Offset + Base Register
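    The effective-address rule can be checked with a few lines of Python. This is only a sketch: the base address 0x10010000 is a hypothetical value chosen for illustration, and the helper names are not part of the lecture.

```python
WORD_SIZE = 4  # bytes per MIPS-32 word

def word_offset(index: int) -> int:
    """Byte offset of element `index` in an array of 32-bit words."""
    return index * WORD_SIZE

def effective_address(base: int, offset: int) -> int:
    """Effective address = base register contents + offset (32-bit wraparound)."""
    return (base + offset) & 0xFFFFFFFF

# Hypothetical base address of A, standing in for the contents of $s3.
base_of_A = 0x10010000

print(word_offset(8))                                     # offset used by lw for A[8]
print(hex(effective_address(base_of_A, word_offset(8))))  # byte address of A[8]
```

    With these assumptions the offset for A[8] is 32, matching the constant in the lw instruction.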

  • Operands in a Computer

    Compiler

    The compiler:

    Associates registers with variables.

    Allocates memory locations for arrays and structures.

    Places the right data address into the data transfer instructions.

    In MIPS, a word address must be a multiple of 4. This is called the Alignment Restriction.

  • Operands in a Computer

    Immediate Instructions

    A constant in the instruction is called an immediate operand.

    More than half of MIPS arithmetic instructions use immediate operands.

    Since immediate operands are very frequent, immediate instructions are included in the instruction set.

    The instruction

    addi $s3 , $s3 , 4 #$s3 = $s3 + 4

    illustrates an arithmetic immediate instruction.

  • Immediate Instructions

    Third Design Principle

    Make the common case fast

    Immediate instructions illustrate the third design principle.

    Constant operands occur frequently.

    Immediate operands are much faster than constants loaded from memory.

    The tables in Figure 2 show a summary of the MIPS instruction set so far.

  • Immediate Instructions

    Figure: MIPS Architecture revealed so far


  • Instructions Formats

    R-type Instructions

    How do machines represent numbers? In binary format.

    For example, 123 (base 10) = 1111011 (base 2).

    MIPS Fields

    The instruction is divided into fields.

    Each field specifies part of the information needed for execution.

    The R-type (for Register) instruction format is:

    op       rs       rt       rd       shamt    function
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

  • Instructions Formats

    R-type Instructions

    op. Basic operation of the instruction, called the opcode.

    rs. The first register source operand.

    rt. The second register source operand.

    rd. The destination register, where the result is stored.

    shamt. Shift amount, used by the shift instructions to specify how many bit positions to shift.

    function. Function, used for selecting a variant of the operation specified by the op field.
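    To make the field layout concrete, the sketch below splits a 32-bit word into the six R-type fields. The example word 0x02324020 is the standard MIPS encoding of add $t0, $s1, $s2; the function name and the dictionary layout are illustrative choices, not part of the lecture.

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit word into the six R-type fields (6/5/5/5/5/6 bits)."""
    return {
        "op":    (word >> 26) & 0x3F,  # bits 31..26
        "rs":    (word >> 21) & 0x1F,  # bits 25..21
        "rt":    (word >> 16) & 0x1F,  # bits 20..16
        "rd":    (word >> 11) & 0x1F,  # bits 15..11
        "shamt": (word >> 6) & 0x1F,   # bits 10..6
        "funct": word & 0x3F,          # bits 5..0
    }

# add $t0, $s1, $s2: rs = 17 ($s1), rt = 18 ($s2), rd = 8 ($t0), funct = 32
fields = decode_r_type(0x02324020)
print(fields)
```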

  • Instructions Formats

    Fourth design principle

    Good design demands good compromises

    Fixed-length instructions are easier to decode.

    Keeping all instructions the same length means that different kinds of instructions need different formats.

  • Instructions Formats

    I-type Instructions

    The I-type (for immediate) instruction format used for data transfer is:

    op       rs       rt       immediate
    6 bits   5 bits   5 bits   16 bits

    The 16-bit address means a load word instruction can load any word within a region of ±2^15 or 32,768 bytes (±2^13 or 8,192 words) of the address in the base register.
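    The reach of the 16-bit field can be sketched as follows, assuming the immediate is treated as a signed two's complement value (as it is for lw, sw, and addi). The helper names are hypothetical.

```python
def fits_signed_16(value: int) -> bool:
    """True if value is representable as a 16-bit two's complement immediate."""
    return -(1 << 15) <= value <= (1 << 15) - 1

def to_imm16(value: int) -> int:
    """Return the 16-bit field contents for a signed offset, or raise if it is out of range."""
    if not fits_signed_16(value):
        raise ValueError(f"offset {value} does not fit in 16 bits")
    return value & 0xFFFF

print(fits_signed_16(1200))   # offset of A[300] in 4-byte words
print(fits_signed_16(32768))  # one byte past the positive limit
print(hex(to_imm16(-4)))      # negative offsets use two's complement
```

    Offsets outside this range need extra instructions to build the address, which is one of the compromises behind the fixed 32-bit instruction length.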

  • Instructions Formats

    Instruction Types so far

    Multiple formats complicate the hardware.

    To reduce complexity, keep the formats similar.

    The opcode identifies the format, indicating to the hardware which fields to look at.

    R-type arithmetic instructions have opcode 0, while load and store have distinct opcodes.

  • Instructions Formats

    Assembly to Machine Language

    Let's translate the statement

    A[300] = h + A[300]

    which is compiled into

    lw $t0 , 1200( $t1) #Temporary reg $t0 gets A[300]

    add $t0 , $s2 , $t0 #Temporary reg $t0 gets h + A[300]

    sw $t0 , 1200( $t1) #Stores h + A[300] into A[300]

    into machine language.

    Figure: Instruction fields in decimal representation


  • Instructions Formats

    Assembly to Machine Language

    In MIPS, registers $s0 to $s7 map onto registers 16 to 23.

    Registers $t0 to $t7 map onto registers 8 to 15.

    Thus, $s2 is register 18, $t0 is register 8, and $t1 is register 9.

    The binary representation of the instruction is in Figure 4.

    Figure: Instruction fields in binary representation

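    As a rough check of the translation, the sketch below packs the three instructions into 32-bit words. It assumes the standard MIPS opcodes for lw (35) and sw (43) and the funct code for add (32); the helper functions are illustrative and not a real assembler.

```python
REGISTERS = {"$t0": 8, "$t1": 9, "$s2": 18}  # register numbers used in this example
LW, SW, ADD_FUNCT = 35, 43, 32               # standard MIPS opcode/funct values

def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i(op, rs, rt, imm):
    """Pack the four I-type fields into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

t0, t1, s2 = REGISTERS["$t0"], REGISTERS["$t1"], REGISTERS["$s2"]
program = [
    encode_i(LW, t1, t0, 1200),              # lw  $t0, 1200($t1)
    encode_r(0, s2, t0, t0, 0, ADD_FUNCT),   # add $t0, $s2, $t0
    encode_i(SW, t1, t0, 1200),              # sw  $t0, 1200($t1)
]
for word in program:
    print(f"{word:032b}  0x{word:08X}")
```

    Note that lw and sw differ only in the opcode field, which illustrates why keeping the formats similar simplifies the hardware.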

  • Instructions Formats

    Two Key Principles

    1 Instructions are represented by numbers.

    2 Programs are stored in memory just like numbers.

    One consequence of the stored program is what has been called "intelligence in a can".

    Another is binary compatibility, the inheritance of ready-made software.

    Figure: Programs in memory


  • Instructions Formats

    Summary


  • Other Instructions

    Logic Instructions

    Bitwise operations are useful in a computer.

    Logical operations in Java or C translate directly to MIPS assembly.

    Table 2 shows a summary of MIPS logical operations.

    Table: C or Java to MIPS

    Logical Operations   C        Java     MIPS
    Shift Left           <<       <<       sll
    Shift Right          >>       >>>      srl
    Bit-by-Bit AND       &        &        and, andi
    Bit-by-Bit OR        |        |        or, ori
    Bit-by-Bit NOR       (none)   (none)   nor

  • Other Instructions

    Logic Instructions

    The first two operations are called shifts.

    They move all the bits in the word to the left or right.

    The emptied bits are filled with zeros.

    For example, if the register $s0 contains

    0000 0000 0000 0000 0000 0000 0000 1001 = 9,

    executing the instruction sll $t2, $s0, 4 sets $t2 to:

    0000 0000 0000 0000 0000 0000 1001 0000 = 144

  • Other Instructions

    Logic Instructions

    The machine version of the instruction is:

    op   rs   rt   rd   shamt   funct
    0    0    16   10   4       0

    Shifting left by i bits gives the same result as multiplying by 2^i.

    In the previous pattern, 9 × 2^4 = 144.

    Masking is a useful way of isolating fields.

    For example, executing and $t0, $t1, $t2 with

    $t2 = 0000 0000 0000 0000 0000 1101 0000 0000

    $t1 = 0000 0000 0000 0000 0011 1100 0000 0000

    gives

    $t0 = 0000 0000 0000 0000 0000 1100 0000 0000
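    The shift and mask examples can be reproduced with Python integers, assuming 32-bit registers are modelled by masking results to 32 bits. The helper names are illustrative.

```python
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit register

def sll(value: int, shamt: int) -> int:
    """Shift left logical: vacated low-order bits are filled with zeros."""
    return (value << shamt) & MASK32

def bits32(value: int) -> str:
    """Format a value as 32 binary digits in groups of four."""
    digits = format(value & MASK32, "032b")
    return " ".join(digits[i:i + 4] for i in range(0, 32, 4))

print(sll(9, 4), 9 * 2**4)   # shifting left by 4 equals multiplying by 2^4

t1 = 0x3C00                  # 0011 1100 0000 0000
t2 = 0x0D00                  # 0000 1101 0000 0000
print(bits32(t1 & t2))       # and $t0, $t1, $t2 keeps only bits set in both
```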

  • Other Instructions

    Logic Instructions

    If the instruction executed was or $t0, $t1, $t2, the result would be:

    $t0 = 0000 0000 0000 0000 0011 1101 0000 0000

    Further if we execute nor $t0, $t1, $t2, the result is

    $t0 = 1111 1111 1111 1111 1100 0010 1111 1111

    MIPS also provides

    andi: AND immediate

    ori: OR immediate
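    The or and nor results above can be verified in the same way. Python integers are unbounded, so this sketch masks the complement to 32 bits to imitate a register; that masking is an artifact of the model, not something the hardware does separately.

```python
MASK32 = 0xFFFFFFFF

def nor(a: int, b: int) -> int:
    """Bitwise NOR on 32-bit values: NOT (a OR b)."""
    return ~(a | b) & MASK32

t1 = 0x3C00
t2 = 0x0D00
print(hex(t1 | t2))      # or  $t0, $t1, $t2
print(hex(nor(t1, t2)))  # nor $t0, $t1, $t2
print(hex(nor(t1, 0)))   # NOR with zero behaves as bitwise NOT
```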

  • Other Instructions

    Figure: MIPS ISA so far

  • Other Instructions

    Decision-Making Instructions

    What distinguishes a computer from a calculator? Its ability to make decisions.

    MIPS assembly includes two decision-making instructions: branch if equal (beq) and branch if not equal (bne).

    beq reg1 , reg2 , L1 # go to label L1 if [reg1] = [reg2]

    bne reg1 , reg2 , L1 # go to label L1 if [reg1] != [reg2]

  • Other Instructions

    Decision-Making Instructions

    Example: Compiling an if statement into a conditional branch. In the following code segment, f, g, h, i, and j are variables.

    if (i==j) f = g + h; else f = f - i;

    Assuming that the five variables f through j correspond to the five registers $s0 to $s4, what is the compiled MIPS code?

    Answer

    beq $s3 , $s4 , Else # go to Else if i equals j

    sub $s0 , $s0 , $s3 # f = f - i (reached only when i != j)

    j Exit # go to Exit

    Else: add $s0 , $s1 , $s2 # f = g + h (reached only when i == j)

    Exit:

    The labels Else and Exit are assigned memory addresses pointing to the appropriate instructions during the compilation process. Note that in this answer the label Else marks the code for the i == j case, because the branch is taken when the condition is true.

  • Other Instructions

    Figure: Illustration of the options in the previous example

    Decision-Making Instructions

    Compilers frequently create branches and labels where they do not appear in the original code.

    Avoiding the writing of explicit labels and branches is one of the benefits of high-level language programming.

  • Basic Programming

    Loops

    Decision-making instructions are also used for loops.

    Compiling a loop with a variable array index. Here is a C loop:

    Loop: g = g + A[i];

    i = i + j;

    if (i!=h) go to Loop;

    Assume A is an array of 100 elements, that g, h, i, and j are associated with registers $s1 to $s4 by the compiler, and that $s5 holds the base address of A.

    Loop: add $t1 , $s3 , $s3 # Temp reg $t1= 2*i

    add $t1 , $t1 , $t1 # Temp reg $t1= 4*i

    add $t1 , $t1 , $s5 # $t1= address of A[i]

    lw $t0 , 0($t1) # Temp reg $t0= A[i]

    add $s1 , $s1 , $t0 # g = g + A[i]

    add $s3 , $s3 , $s4 # i = i + j

    bne $s3 , $s2 , Loop # go to Loop if i not equal to h

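    The loop can be traced instruction by instruction with a small Python model. This is a sketch under simplifying assumptions: memory is a dictionary from byte address to word, the base address defaults to 0, and the function name is hypothetical.

```python
def run_loop(A, g, h, i, j, base=0):
    """Mirror the compiled loop; returns the final (g, i).

    Like the original code, it never terminates if i steps past h.
    """
    memory = {base + 4 * k: value for k, value in enumerate(A)}  # word-addressed by bytes
    while True:
        t1 = i + i        # add $t1, $s3, $s3   -> 2*i
        t1 = t1 + t1      # add $t1, $t1, $t1   -> 4*i
        t1 = t1 + base    # add $t1, $t1, $s5   -> address of A[i]
        t0 = memory[t1]   # lw  $t0, 0($t1)     -> A[i]
        g = g + t0        # add $s1, $s1, $t0   -> g = g + A[i]
        i = i + j         # add $s3, $s3, $s4   -> i = i + j
        if i == h:        # bne $s3, $s2, Loop falls through when i == h
            return g, i

print(run_loop([1, 2, 3, 4, 5], g=0, h=4, i=0, j=2))  # adds A[0] and A[2]
```

    Doubling i twice is how the compiled code multiplies the index by 4 without a multiply instruction.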

  • Basic Programming

    Basic Blocks

    Sequences of instructions that end in a branch are so fundamental to compiling that they have their own name: the basic block.

    A basic block is a sequence of instructions with

    No embedded branches (except at the end)

    No branch targets (except at the beginning)

    A compiler identifies basic blocks for optimization.

    An advanced processor can accelerate the execution of basic blocks.

  • Basic Programming

    Example: While Programming

    Compile the C code:

    while (save[i] == k)

    i = i + j;

    Assume i, j, and k are associated with registers $s3, $s4, and $s5 by the compiler, and that $s6 holds the base address of save.

    Answer:

    Loop: add $t1 , $s3 , $s3 # Temp reg $t1= 2*i

    add $t1 , $t1 , $t1 # Temp reg $t1= 4*i

    add $t1 , $t1 , $s6 # $t1 = address of save[i]

    lw $t0 , 0($t1) # Temp reg $t0= save[i]

    bne $t0 , $s5 , Exit # go to Exit if save[i] does not equal k

    add $s3 , $s3 , $s4 # i = i + j

    j Loop # go to Loop

    Exit:


  • Basic Programming

    Set if Less Than

    Testing for equality or inequality is the most common way to stop a loop; however, sometimes it is useful to find out whether one variable is less than another.

    The MIPS slt (set if less than) instruction compares two registers and sets a third to 1 if the first is less than the second.

    For example, slt $t0, $s3, $s4 means that $t0 is set to 1 if $s3 < $s4. Otherwise $t0 is set to 0.

    The compiler uses slt, bne, and beq and the register $zero to create all relative conditions: =, !=, <, <=, >, >=.

  • Basic Programming

    Set if Less Than

    What is the code to test if variable a, associated with $s0, is less than variable b in $s1, and then branch to the label Less if the condition holds?

    Answer:

    slt $t0 , $s0 , $s1 # $t0 = 1 if $s0 < $s1 (a < b)

    bne $t0 , $zero , Less # go to Less if $t0 != 0, that is, if a < b
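    One way to see how slt, beq, and bne cover every relational test is to model them in Python. The function names below are hypothetical, and each returns whether the corresponding branch would be taken.

```python
def slt(a: int, b: int) -> int:
    """Set if less than: 1 when a < b, otherwise 0."""
    return 1 if a < b else 0

def branch_if_less(a, b):           # slt $t0, a, b ; bne $t0, $zero, L
    return slt(a, b) != 0

def branch_if_greater_equal(a, b):  # slt $t0, a, b ; beq $t0, $zero, L
    return slt(a, b) == 0

def branch_if_greater(a, b):        # slt $t0, b, a ; bne $t0, $zero, L
    return slt(b, a) != 0

def branch_if_less_equal(a, b):     # slt $t0, b, a ; beq $t0, $zero, L
    return slt(b, a) == 0

print(branch_if_less(3, 5), branch_if_greater_equal(3, 5))
```

    Swapping the operands of slt and choosing between bne and beq is enough to build all four orderings, so no separate "branch if less than" instruction is needed in this scheme.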

  • Basic Programming

    Case/Switch Statement

    Most programming languages have a case or switch statement.

    One way of implementing a switch is through a sequence of if-then-else tests.

    Another way is to use a table of addresses, called a jump address table.

    To support this, MIPS provides the instruction jump register (jr), an unconditional jump to the address specified in a register.

  • Basic Programming

    Case/Switch Statement

    Example: Compiling a switch statement using a jump address table.

    Consider the code:

    switch (k) {

    case 0: f = i + j; break ; /* k = 0*/

    case 1: f = g + h; break ; /* k = 1*/

    case 2: f = g - h; break ; /* k = 2*/

    case 3: f = i - j; break ; /* k = 3*/

    }

    Assume the six variables f through k are contained in registers $s0 to $s5 and that register $t2 contains 4.

  • Basic Programming

    Case/Switch Statement

    Answer

    The switch variable k is used to index a jump address table, and the program then jumps via the loaded value. First we make sure k is within the valid range, 0 to 3.

    slt $t3 , $s5 , $zero # $t3=1 if k < 0

    bne $t3 , $zero , Exit # if k < 0 go to Exit

    slt $t3 , $s5 , $t2 # Test if k < 4

    beq $t3 , $zero , Exit # if k >= 4 go to Exit

    Then we multiply the index k by 4 so we can use it as pointer.

    add $t1 , $s5 , $s5

    add $t1 , $t1 , $t1

    Assume that four sequential words in memory, starting at the address in $t4, contain the addresses corresponding to the labels L0, L1, L2, and L3.

    add $t1 , $t1 , $t4 # $t1 = address of JumpTable[k]

    lw $t0 , 0($t1) # $t0 = JumpTable[k]

  • Basic Programming

    Case/Switch Statement

    Answer

    The instruction jr jumps to the address specified in a register.

    jr $t0

    Finally, the cases

    L0: add $s0 , $s3 , $s4 # f = i + j

    j Exit

    L1: add $s0 , $s1 , $s2 # f = g + h

    j Exit

    L2: sub $s0 , $s1 , $s2 # f = g - h

    j Exit

    L3: sub $s0 , $s3 , $s4 # f = i - j

    Exit:
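    The jump address table idea can be sketched in Python with a list of callables standing in for the labels L0 to L3. The function name and parameter order are illustrative; an out-of-range k leaves f unchanged, as in the assembly code.

```python
def switch_dispatch(k, f, g, h, i, j):
    """Model of the compiled switch: bounds check, table lookup, then jump."""
    jump_table = [
        lambda: i + j,  # L0: case 0
        lambda: g + h,  # L1: case 1
        lambda: g - h,  # L2: case 2
        lambda: i - j,  # L3: case 3
    ]
    # slt/bne and slt/beq: leave through Exit when k < 0 or k >= 4
    if k < 0 or k >= len(jump_table):
        return f
    target = jump_table[k]  # lw $t0, 0($t1): load the table entry
    return target()         # jr $t0: jump to the loaded address

print([switch_dispatch(k, f=0, g=10, h=3, i=1, j=2) for k in range(-1, 5)])
```

    The cost of the dispatch is the same for every case, whereas a chain of if-then-else tests takes longer for later cases.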

  • Summary of MIPS Assembly


  • Summary of MIPS Assembly

    MIPS Instruction fields

    Figure: MIPS Instruction fields


  • Supporting Procedures

    Procedures

    Procedures are tools used for two purposes:

    1 Make the code easier to understand.

    2 Make the code reusable.

    Procedures are programs that concentrate on one portion of the task.

    Parameters

    Allow for separation between the procedure and the rest of the program and data.

    Allow values to be passed in and results to be returned.

  • Supporting Procedures

    Procedures

    To execute a procedure, the program must follow six steps:

    1 Place the parameters where the procedure can access them.

    2 Transfer control to the procedure.

    3 Acquire the storage resources needed by the procedure.

    4 Perform the desired task.

    5 Place the results where the main program can access them.

    6 Return control to the point of origin.

  • Supporting Procedures

    Register Allocation

    As mentioned, registers are the fastest place to hold data.

    They must be used as much as possible.

    MIPS allocates the following registers for procedure calling:

    $a0-$a3: four argument registers to pass parameters.

    $v0-$v1: two value registers to return values.

    $ra: one return address register to return to the point of origin.

    MIPS includes an instruction just to call procedures: jal ProcedureAddress

    The jump-and-link instruction saves the return address in $ra and jumps to the target address.

  • Supporting Procedures

    Register Allocation

    The execution of a procedure:

    The caller program places the parameters in $a0 to $a3.

    The caller program uses jal X to jump to the procedure X.

    The callee program (the procedure) performs its calculations.

    The callee program returns control using jr $ra.

  • Supporting Procedures

    Stack

    If the procedure needs more registers than the four argument registers, a stack is used.

    Any register needed by the caller must be restored to the value it held before the procedure was invoked.

    Moving register contents to memory for this purpose is called spilling registers.

    MIPS software allocates another register for the stack, called the stack pointer, $sp.

    MIPS stacks grow from higher addresses to lower addresses.

    This convention means you push values onto the stack by subtracting from the stack pointer.

    Adding to the stack pointer shrinks the stack, popping values off the stack.
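    The push and pop conventions can be modelled with a tiny Python class. This is a sketch only: the initial stack pointer 0x7FFFFFFC is a hypothetical starting address, and memory is a dictionary keyed by byte address.

```python
class Stack:
    """Downward-growing stack of 32-bit words, addressed through sp."""

    def __init__(self, top: int = 0x7FFFFFFC):  # hypothetical initial $sp
        self.sp = top
        self.memory = {}

    def push(self, value: int) -> None:
        self.sp -= 4                  # addi $sp, $sp, -4 : stack grows downward
        self.memory[self.sp] = value  # sw   value, 0($sp)

    def pop(self) -> int:
        value = self.memory.pop(self.sp)  # lw   value, 0($sp)
        self.sp += 4                      # addi $sp, $sp, 4 : stack shrinks
        return value

stack = Stack()
start = stack.sp
stack.push(11)
stack.push(22)
print(hex(stack.sp), stack.pop(), stack.pop(), stack.sp == start)
```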

  • Supporting Procedures

    Leaf Procedure

    Example: Compiling a procedure that does not call another procedure.

    Consider the code:

    int leaf_example (int g, int h, int i, int j)

    {

    int f;

    f = (g + h) - (i - j);

    return f;

    }

    The compiled program has three parts: saving the registers for the caller, performing the computations, and restoring the registers.

  • Supporting Procedures

    Leaf Procedure

    The parameters g, h, i, and j correspond to the argument registers $a0 to $a3, and f corresponds to $s0.

    leaf_example:

    addi $sp , $sp , -12 # make room in the stack for 3 items.

    sw $t1 , 8($sp) # save register $t1 to use afterwards

    sw $t0 , 4($sp) # save register $t0 to use afterwards

    sw $s0 , 0($sp) # save register $s0 to use afterwards

    add $t0 , $a0 , $a1 # $t0 contains g + h

    sub $t1 , $a2 , $a3 # $t1 contains i - j

    sub $s0 , $t0 , $t1 # f = (g + h) - (i - j)

    add $v0 , $s0 , $zero # returns f

    lw $s0 , 0($sp) # restore register $s0 for the caller

    lw $t0 , 4($sp) # restore register $t0 for the caller

    lw $t1 , 8($sp) # restore register $t1 for the caller

    addi $sp , $sp , 12 # adjust the stack pointer back

    jr $ra # jump back to the calling routine


  • Supporting Procedures

    Register Preservation Rules

    To avoid saving and restoring registers that are never used, MIPS offers two classes of registers:

    $t0-$t9. 10 temporary registers that are not preserved by the callee.

    $s0-$s7. 8 saved registers that must be preserved. If the callee uses them, it saves them and restores them.

    This simple convention reduces register spilling.

  • Supporting Procedures

    Nested Procedures

    Procedures that do not call other procedures are called leaf procedures.

    Life would be simpler if all procedures were leaf procedures.

    If a procedure A calls a procedure B, and both use $a0 to pass parameters, the value of $a0 that A still needs must be preserved across the call.

    One solution is to push the registers that must be preserved onto the stack.

  • Supporting Procedures

    Nested Procedures

    Let's consider the procedure that computes the factorial:

    int fact (int n)

    {

    if (n < 1) return (1);

    else return (n * fact(n-1));

    }

    Assuming we can add or subtract constants, what is the MIPS code for this procedure?

  • Supporting Procedures

    Nested Procedures

    The parameter n corresponds to the argument register $a0. Because fact calls itself, the return address in $ra and the value of $a0 must be pushed onto the stack.

    fact:

    addi $sp , $sp , -8 # Open two words in the stack

    sw $ra , 4($sp) # save the return address

    sw $a0 , 0($sp) # save the argument

    The next two instructions test whether n is less than 1.

    slti $t0 , $a0 , 1 # test for n < 1

    beq $t0 , $zero , L1 # if n >= 1 go to L1


  • Supporting Procedures

    Nested Procedures

    If n < 1, fact returns 1 by putting 1 into a value register.

    addi $v0 , $zero , 1 # returns 1

    addi $sp , $sp , 8 # pops two items off the stack

    jr $ra # return to after jal

    If n is not less than 1, n is decremented and then fact is called again.

    L1: addi $a0 , $a0 , -1 # decrement n

    jal fact # calls fact again

    Then, when fact returns, the old return address and old argument are restored, and the stack pointer is adjusted.

    lw $a0 , 0($sp) # restore argument

    lw $ra , 4($sp) # restore the return address

    addi $sp , $sp , 8 # pops two items off the stack

    Assuming a three-operand multiplication instruction exists:

    mul $v0 , $a0 , $v0 # n * fact(n-1)

    jr $ra # return to the caller

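    The role of the stack in the recursive fact procedure can be illustrated with an explicit stack in Python. This sketch simplifies the assembly: only the saved argument is pushed (the saved return address is omitted), and the frame that the assembly pushes and immediately pops in the base case is skipped.

```python
def fact(n: int) -> int:
    """Factorial computed the way the compiled code does, with an explicit stack."""
    stack = []            # each entry models one saved copy of $a0
    a0 = n
    while a0 >= 1:        # slti/beq: recurse while n >= 1
        stack.append(a0)  # sw $a0, 0($sp)
        a0 -= 1           # addi $a0, $a0, -1 ; jal fact
    v0 = 1                # base case: addi $v0, $zero, 1
    while stack:          # each return unwinds one frame
        a0 = stack.pop()  # lw $a0, 0($sp) ; addi $sp, $sp, 8
        v0 = a0 * v0      # n * fact(n-1)
    return v0

print(fact(5))
```

    The stack depth grows with n, which is why each call must save its own argument and return address before calling fact again.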

  • Supporting Procedures

    Nested Procedures

    Preserved across a procedure call: the saved registers $s0-$s7, the stack pointer $sp, the return address $ra, and the stack above the stack pointer.

    Not preserved across a procedure call: the argument registers $a0-$a3, the return value registers $v0-$v1, the temporary registers $t0-$t9, and the stack below the stack pointer.

  • Supporting Procedures

    Nested Procedures

    Variables local to the procedure are also stored on the stack.

    This is done when the variables do not fit in the registers.

    The procedure frame, or activation record, is the stack segment containing a procedure's saved registers and local variables.

    Some MIPS software uses a frame pointer ($fp) to point to the first word of the frame of a procedure.

    Hence, $fp points to the beginning of the procedure frame and $sp points to its end.

  • Supporting Procedures

    Nested Procedures

    The C language has two storage classes: automatic and static.

    Automatic variables are local data, discarded when the procedure exits.

    Static variables exist across exits from procedures.

    To ease access to static data, MIPS reserves another register, called the global pointer, $gp.

  • Nested Procedures

    The frame pointer points to the first word saved by the procedure.

    Figure: Stack Allocation (a) before, (b) during and (c) after the procedure call.


  • Summary

    Register Mapping

    Figure: MIPS register convention. Register 1, called $at, is reserved for the assembler, and registers 26 and 27, called $k0 and $k1, are reserved for the operating system.

  • Summary

    Figure: Summary


  • Communicating with People

    Communication

    Computers were created to crunch numbers, but they are also used to process text.

    Most computers today use the American Standard Code for Information Interchange (ASCII) to represent characters.

    ASCII characters are stored as 8-bit bytes.

    MIPS provides special instructions to move bytes.

    Load byte (lb) loads a byte from memory, placing it in the rightmost 8 bits of a register.

    Store byte (sb) takes a byte from the rightmost 8 bits of a register and writes it into memory.

  • Communicating with People

    ASCII Code


  • Communicating with People

    Communication

    Thus, for copying a byte:

    lb $t0 , 0($sp) # read a byte from the source

    sb $t0 , 0($gp) # write the byte to the destination

    There are three choices for representing a string:

    1 The first position of the string is reserved to give the length of the string.

    2 An accompanying variable has the length of the string (as in a structure).

    3 The last position of the string is marked with a character to mark the end of the string.

    The C language uses the null character to mark the end of strings.

  • Communicating with People

    Communication

    Example: Compiling a string copy procedure (C style).

    void strcpy (char x[], char y[])

    {

    int i;

    i=0;

    while ((x[i] = y[i]) !=0) /* copy and test byte */

    i = i + 1;

    }

    What is the MIPS assembly code?


  • Communicating with People

    Communication

    Assuming the base addresses of the arrays x and y are in the registers $a0 and $a1, respectively, and i is in $s0:

    strcpy:

    addi $sp , $sp , -4 # Adjust the stack for 1 word

    sw $s0 , 0($sp) # save $s0 in the newly opened word

    add $s0 , $zero , $zero # initialize i = 0

    L1: add $t1 , $a1 , $s0 # Address of y[i] in $t1

    lb $t2 , 0($t1) # $t2 = y[i]

    add $t3 , $a0 , $s0 # Address of x[i] in $t3

    sb $t2 , 0($t3) # x[i] = y[i]

    addi $s0 , $s0 , 1 # increment i

    bne $t2 , $zero , L1 # if the copied byte != 0 go to L1

    lw $s0 , 0($sp) # copied byte == 0, restore $s0

    addi $sp , $sp , 4 # Adjust the stack

    jr $ra # return
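    The byte-by-byte copy can be mirrored in Python using a bytearray as the destination. This sketch assumes the source ends with a null byte and that the destination is large enough, just as the C version does; the function name simply echoes the C one.

```python
def strcpy(x: bytearray, y: bytes) -> None:
    """Copy bytes from y to x up to and including the terminating null byte."""
    i = 0
    while True:
        byte = y[i]    # lb $t2, 0($t1)  : read y[i]
        x[i] = byte    # sb $t2, 0($t3)  : x[i] = y[i]
        i += 1         # addi $s0, $s0, 1
        if byte == 0:  # bne $t2, $zero, L1 falls through on the null byte
            break

destination = bytearray(8)
strcpy(destination, b"MIPS\x00")
print(destination)
```

    Because characters are single bytes, the index i is used directly as a byte offset and is not multiplied by 4 as it is for word arrays.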

  • Communicating with People

    Unicode

    There is also a universal encoding, Unicode, which uses 16 bits to represent a character.

    Java, for example, uses Unicode.

    MIPS has a set of instructions to load and store halfwords, or 16-bit quantities.

    These instructions will not be treated at the moment, but will be revisited later.

  • Constants and Immediate Operands

    Why Immediates

    A constant can be included in the instruction via the I-type instructions.

    52% of the arithmetic instructions in the gcc compiler use an immediate operand.

    69% of the instructions of spice use an immediate operand.

    Without immediates, a constant must first be loaded from memory. Observe the sequence:

    lw $t0 , AddrConstant4($zero) # $t0 = constant 4

    add $sp , $sp , $t0 # $sp = $sp + $t0

    With the immediate instruction addi $sp, $sp, 4 we avoid accessing the memory address AddrConstant4 to get the constant 4.

  • Constants and Immediate Operands

    Why Immediates

    Example: Translating assembly constants into machine language

    The add immediate instruction addi adds a constant to a register:

    addi $sp , $sp , 4 # $sp = $sp + 4

    The op field for addi is 8. Try to guess the rest of the fields in the corresponding machine instruction.

  • Constants and Immediate Operands

    Why Immediates

    We know that register $sp maps to register 29 of the register file, so fields rs and rt of the instruction must both be 29. The immediate field contains the constant in the instruction.

    op rs rt Immediate

    8 29 29 4

    In binary format:

    op rs rt Immediate

    001000 11101 11101 0000 0000 0000 0100

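    The binary row above can be regenerated from the decimal fields with a short Python helper. The function name is illustrative; the field widths follow the I-type format (6, 5, 5, and 16 bits).

```python
def i_type_bits(op: int, rs: int, rt: int, imm: int) -> str:
    """Render the four I-type fields as binary strings separated by spaces."""
    return " ".join([
        format(op, "06b"),            # 6-bit opcode
        format(rs, "05b"),            # 5-bit source register
        format(rt, "05b"),            # 5-bit target register
        format(imm & 0xFFFF, "016b")  # 16-bit immediate (two's complement)
    ])

bits = i_type_bits(8, 29, 29, 4)      # addi $sp, $sp, 4
word = int(bits.replace(" ", ""), 2)  # the same instruction as one 32-bit number
print(bits)
print(f"0x{word:08X}")
```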

  • Constants and Immediate Operands

    Why Immediates

    Immediate operands are also popular for comparisons, so MIPS has the instruction:

    slti $t0 , $t2 , 10 # $t0 = 1 if $t2 < 10

    Immediate instructions make it possible to:

    allocate only instruction space for constants;

    avoid spending memory accesses on those constants;

    avoid having the compiler place them in memory and resolve their addresses.

  • Constants and Immediate Operands

    Why Immediates
    Make the common case fast
    Constant operands are frequent in arithmetic operations.
    Making the operand part of the instruction is much faster than accessing memory to get it.
    Immediate addressing is therefore implemented to make the common case faster.

  • Target Address Computations

    Branches and Jumps
    The simplest addressing in MIPS is that of the jump.
    Jumps use the third MIPS instruction format, the J-type instruction.
    Consider the instruction j 10000, assembled into

    6 bits   26 bits
    2        10000

    where the opcode of the jump is 2 and the jump address is 10000.
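As a rough illustration of the two-field layout, the sketch below packs an opcode and a 26-bit address field into one word. The function name encode_j_type is hypothetical.

```c
#include <stdint.h>

/* Pack a J-type instruction: op (6 bits) | address field (26 bits).
   The name encode_j_type is an assumption for this example. */
uint32_t encode_j_type(uint32_t op, uint32_t address_field)
{
    return ((op & 0x3Fu) << 26) | (address_field & 0x03FFFFFFu);
}
```

With the values of the example, encode_j_type(2, 10000) gives the word 0x08002710.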


  • Target Address Computations

    Branches and Jumps
    The conditional branch instruction specifies two operands in addition to the branch address. For example:

    bne $s0, $s1, Exit # go to Exit if $s0 != $s1

    This is assembled as

    6 bits   5 bits   5 bits   16 bits
    5        16       17       Exit

    The new PC is obtained relative to the current PC (PC-relative):

    PCnew = (PC + 4) + Branch Immediate × 4    (1)

    or, in other words,

    Branch Target Address = (PC + 4) + Branch Immediate × 4    (2)

  • Target Address Computations

    Branches and Jumps

    Since the PC already holds the address of the next instruction (PC + 4), we can branch within ±2^15 words of the instruction that follows the branch.
    This is called PC-relative addressing mode.

    PC-relative addressing is used for all conditional branches because the target address is likely to be close to the branch.

    Jump and link (jal) calls a procedure that has no reason to be close to the call, so it uses the long addressing mode provided by the J-type instructions.
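The PC-relative computation can be written out as a small C sketch. The function name branch_target is hypothetical; the 16-bit immediate is treated as a signed word offset, as in the discussion above.

```c
#include <stdint.h>

/* Branch target = (PC + 4) + sign-extended immediate * 4.
   pc is the address of the branch instruction itself.
   The name branch_target is an assumption for this example. */
uint32_t branch_target(uint32_t pc, int16_t immediate)
{
    int32_t byte_offset = (int32_t)immediate * 4;  /* words to bytes */
    return pc + 4u + (uint32_t)byte_offset;        /* wraps modulo 2^32 */
}
```

For instance, a branch at address 0x00400000 with immediate 3 lands at 0x00400010, and an immediate of -1 branches back to the branch instruction itself.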


  • Target Address Computations

    Branching Far Away: Offsets of More than 16 Bits

    Nearly every conditional branch is to a nearby location.

    A far-away branch is one whose offset requires more than 16 bits.

    In such a case, the assembler inverts the test condition and inserts an unconditional jump.

    For example, the instruction

    beq $s0 , $s1 , L1

    is replaced by

    bne $s0 , $s1 , L2

    j L1

    L2:
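The decision the assembler has to make can be sketched as a range check on the word offset. This is only an illustration; the function name branch_offset_fits is hypothetical, and both addresses are assumed to be word aligned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true when the branch at branch_addr can reach target_addr
   with a signed 16-bit word offset measured from branch_addr + 4.
   The name branch_offset_fits is an assumption for this example. */
bool branch_offset_fits(uint32_t branch_addr, uint32_t target_addr)
{
    int64_t byte_offset = (int64_t)target_addr - ((int64_t)branch_addr + 4);
    int64_t word_offset = byte_offset / 4;   /* addresses assumed word aligned */
    return word_offset >= -32768 && word_offset <= 32767;
}
```

When the check fails, the inverted branch followed by a jump, as in the replacement above, is needed.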


  • Target Address Computations

    Branches and Jumps
    Jump Instruction

    opcode   26-bit address

    The 26-bit field is a word address, equivalent to a 28-bit byte address.

    The MIPS jump instruction replaces the lower 28 bits of the PC:

    PCnew = PC(31 : 28) & 26-bit field & 00

    where & denotes concatenation.

    If the jump target lies outside the current 256 MB region, the jump instruction must be replaced with a jr instruction, which allows a full 32-bit address.
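The concatenation can be expressed with masks and shifts. The sketch below follows the formula exactly as written on the slide; the function name jump_target is hypothetical.

```c
#include <stdint.h>

/* PCnew = PC(31:28) & 26-bit field & 00, with & meaning concatenation.
   The name jump_target is an assumption for this example. */
uint32_t jump_target(uint32_t pc, uint32_t field26)
{
    uint32_t upper = pc & 0xF0000000u;                 /* PC(31:28)        */
    uint32_t lower = (field26 & 0x03FFFFFFu) << 2;     /* word -> byte addr */
    return upper | lower;
}
```

With a PC of 0x00400000 and the field 10000 of the earlier example, the target is byte address 0x00009C40 (that is, 40000). Only the upper four bits of the PC survive, which is why a target in a different 256 MB region cannot be reached this way.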


  • Addressing Modes

    The five MIPS addressing modes
    1 Register addressing: the operand is a register.
    2 Base or displacement addressing: the operand is at the memory location whose address is the sum of a register and a constant in the instruction.
    3 Immediate addressing: the operand is a constant within the instruction.
    4 PC-relative addressing: the address is the sum of the PC and a constant in the instruction.
    5 Pseudodirect addressing: the jump address is the 26 bits of the instruction concatenated with the upper bits of the PC.

  • Addressing Modes

    The five MIPS addressing modes


  • Instruction Formats Summary

    Figure: MIPS Instruction Formats


  • How Compilers Work

    Steps to start a program


  • How Compilers Work

    Compilers
    Translate a C language program into an assembly language program.
    High-level language programs take fewer lines than assembly.
    In the 70s many operating systems were written in assembly because of small memories and inefficient compilers.
    As memory capacity increased and compilers improved, assembly programming was no longer indispensable.

  • How Compilers Work

    Assembler
    The assembler deals with the pseudoinstructions.
    Pseudoinstructions exist in the assembly language but do not have a hardware implementation.
    For example,

    move $t0, $t1

    is in fact executed as

    add $t0, $zero, $t1

    The assembler also accepts numbers in a variety of bases (hexadecimal, binary, etc.) and converts them to binary.
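The expansion of move can be pictured as the assembler emitting the machine word of the equivalent add. The C sketch below assumes the usual R-type layout (op 0 and funct 32 for add) and the conventional register numbers $t0 = 8 and $t1 = 9; the function names encode_r_type and expand_move are hypothetical.

```c
#include <stdint.h>

/* Pack the six R-type fields: op | rs | rt | rd | shamt | funct.
   Both function names are assumptions for this example. */
static uint32_t encode_r_type(uint32_t op, uint32_t rs, uint32_t rt,
                              uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((op & 0x3Fu) << 26) | ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) | ((rd & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) << 6) | (funct & 0x3Fu);
}

/* Expand "move dst, src" into "add dst, $zero, src". */
uint32_t expand_move(uint32_t dst, uint32_t src)
{
    return encode_r_type(0, 0 /* $zero */, src, dst, 0, 32 /* add */);
}
```

Under these assumptions, expand_move(8, 9) yields 0x00094020, the word for add $t0, $zero, $t1.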


  • How Compilers Work

    Assembler
    The assembler turns the program into an object file.
    The object file is a combination of

    machine language instructions,
    data,
    information needed to place the program in memory.

    The assembler keeps track of the labels used by the program in a symbol table containing symbol-address pairs.
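A symbol table of symbol-address pairs can be pictured as a simple array of records. The sketch below is a toy illustration, not how any particular assembler stores its table; the names struct symbol and symbol_lookup are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* One symbol-address pair. Names here are assumptions for this example. */
struct symbol {
    const char *name;
    uint32_t    address;
};

/* Linear search for a label. Returns 1 and stores the address when the
   label is defined, or 0 when it is not (e.g. an external reference). */
int symbol_lookup(const struct symbol *table, int count,
                  const char *name, uint32_t *address)
{
    for (int i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            *address = table[i].address;
            return 1;
        }
    }
    return 0;
}
```

A label that is not found is exactly the kind of unresolved reference that is left for the linker.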


  • How Compilers Work

    Assembler
    The object file for Unix typically contains 6 pieces:

    1 The object file header, describing the size and position of the other pieces.
    2 The text segment, containing the machine language code.
    3 The data segment, containing any data that comes with the program.
    4 The relocation information, identifying instructions and data words that depend on absolute addresses when the program is loaded into memory.
    5 The symbol table, containing the remaining labels that are not defined, such as external references.
    6 The debugging information, with a description of how the modules were compiled.

  • How Compilers Work

    Linker: Putting It Together
    Each procedure is compiled and assembled separately.
    A one-line change causes only one procedure to be recompiled or reassembled.
    The linker stitches together all the independently compiled procedures.
    Three steps for linking:

    1 Place code and data modules symbolically in memory.
    2 Determine the addresses of data and instruction labels.
    3 Patch both the internal and external references.

  • How Compilers Work

    Linker: Putting It Together
    The linker will

    Use the relocation information and the symbol table to resolve all undefined labels (jumps, branches, and data addresses).
    If all external references are resolved, determine the memory location for each module.
    When placing the modules in memory, relocate all absolute references (memory addresses that are not relative to a register) to their true locations.
    Produce an executable file that can be run on a computer.
    Usually the executable has the same format as the object file but without unresolved references.
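Patching a reference can be illustrated for the simplest case, a jump or jal whose address field was left empty by the assembler. This is a sketch under that assumption only; the function name patch_jump is hypothetical and real linkers handle several relocation types.

```c
#include <stdint.h>

/* Fill the 26-bit address field of a J-type instruction once the
   final byte address of the target is known. The opcode bits are
   preserved. The name patch_jump is an assumption for this example. */
uint32_t patch_jump(uint32_t instruction, uint32_t target_addr)
{
    uint32_t field26 = (target_addr >> 2) & 0x03FFFFFFu;  /* byte -> word */
    return (instruction & 0xFC000000u) | field26;
}
```

For example, a jal assembled with an empty field (0x0C000000) and a procedure finally placed at 0x00400060 becomes 0x0C100018.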


  • How Compilers Work

    Loader
    To start a program, the UNIX loader follows these steps:

    1 Reads the file header to determine the size of the text and data segments.
    2 Creates an address space large enough for the text and data.
    3 Copies the parameters (if any) to the main program onto the stack.
    4 Initializes the machine registers and sets the stack pointer to the first free location.
    5 Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program.
    6 When the main routine returns, the start-up routine terminates the program with an exit system call.

  • Examples

    The Swap Procedure
    Let's derive the MIPS code from a procedure written in C: the swap procedure.

    void swap(int v[], int k)
    {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
    }

  • The Swap Procedure

    When translating from C to assembly language we follow these steps:

    1 Allocate the registers to program variables.
    2 Produce code for the body of the procedure.
    3 Preserve registers across the procedure invocation.

  • Examples

    Register Allocation
    $a0-$a3 are the registers used to pass parameters to procedures.
    swap has only two parameters, v and k, and one additional variable, temp.
    Then $a0 and $a1 are associated with v and k, while temp is associated with $t0.
    We use $t0 since swap is a leaf procedure.

  • Examples

    Produce code
    First, multiply the index by 4 and add it to the base address:

    add $t1, $a1, $a1 # $t1 = k*2
    add $t1, $t1, $t1 # $t1 = k*4
    add $t1, $a0, $t1 # $t1 = v+(k*4), the address of v[k]

    Next, load v[k] and v[k+1]:

    lw $t0, 0($t1) # $t0 = v[k]
    lw $t2, 4($t1) # $t2 = v[k+1]

    Then, store the swapped values and return:

    sw $t2, 0($t1) # v[k] = $t2
    sw $t0, 4($t1) # v[k+1] = $t0
    jr $ra         # return to the caller
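To see why the index is multiplied by 4, the same steps can be mirrored in C with explicit byte addresses. This is only an illustrative sketch assuming 4-byte integers; the function name swap_by_address is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Mirror of the MIPS code: compute the byte address of v[k], load two
   words, store them back swapped. The name swap_by_address is an
   assumption for this example. */
void swap_by_address(int32_t v[], int k)
{
    unsigned char *addr = (unsigned char *)v + (size_t)k * 4; /* $t1 = v + k*4 */
    int32_t t0, t2;

    memcpy(&t0, addr, 4);       /* lw $t0, 0($t1) */
    memcpy(&t2, addr + 4, 4);   /* lw $t2, 4($t1) */
    memcpy(addr, &t2, 4);       /* sw $t2, 0($t1) */
    memcpy(addr + 4, &t0, 4);   /* sw $t0, 4($t1) */
}
```

The offsets 0 and 4 in the load and store instructions correspond to the two consecutive words v[k] and v[k+1].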


  • Examples

    sort Procedure

    void sort(int v[], int n)
    {
        int i, j;
        for (i = 0; i < n; i = i + 1) {
            for (j = i - 1; j >= 0 && v[j] > v[j+1]; j = j - 1) {
                swap(v, j);
            }
        }
    }

    Assume that i is in $s0, j is in $s1, the base address of v is in $s2, and n is in $s3.

  • Examples

    sort Procedure
    Saving registers:

    sort:    addi $sp, $sp, -20      # make room on the stack for 5 registers
             sw   $ra, 16($sp)       # save $ra
             sw   $s3, 12($sp)       # save $s3
             sw   $s2, 8($sp)        # save $s2
             sw   $s1, 4($sp)        # save $s1
             sw   $s0, 0($sp)        # save $s0

    Parameter saving:

             move $s2, $a0           # copy parameter v into $s2
             move $s3, $a1           # copy parameter n into $s3

    Outer loop:

             move $s0, $zero         # i = 0
    for1tst: slt  $t0, $s0, $s3      # $t0 = 0 if i >= n
             beq  $t0, $zero, exit1  # go to exit1 if i >= n

  • Examples

    sort Procedure
    Inner loop:

             addi $s1, $s0, -1       # j = i - 1
    for2tst: slti $t0, $s1, 0        # $t0 = 1 if j < 0
             bne  $t0, $zero, exit2  # go to exit2 if j < 0
             add  $t1, $s1, $s1      # $t1 = j*2
             add  $t1, $t1, $t1      # $t1 = j*4
             add  $t2, $s2, $t1      # $t2 = v + (j*4)
             lw   $t3, 0($t2)        # $t3 = v[j]
             lw   $t4, 4($t2)        # $t4 = v[j+1]
             slt  $t0, $t4, $t3      # $t0 = 0 if v[j+1] >= v[j]
             beq  $t0, $zero, exit2  # go to exit2 if v[j+1] >= v[j]

    Pass parameters and call:

             move $a0, $s2           # first parameter of swap is v
             move $a1, $s1           # second parameter of swap is j
             jal  swap

  • Examples

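    sort Procedure
    Inner loop:

             addi $s1, $s1, -1       # j = j - 1
             j    for2tst            # jump to the test of the inner loop

    Outer loop:

    exit2:   addi $s0, $s0, 1        # i = i + 1
             j    for1tst            # jump to the test of the outer loop

    Restoring registers:

    exit1:   lw   $s0, 0($sp)        # restore $s0
             lw   $s1, 4($sp)        # restore $s1
             lw   $s2, 8($sp)        # restore $s2
             lw   $s3, 12($sp)       # restore $s3
             lw   $ra, 16($sp)       # restore $ra
             addi $sp, $sp, 20       # restore the stack pointer
             jr   $ra                # return to the calling routine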

  • Examples

    Arrays vs. Pointers
    Modern optimizing compilers can produce code that is just as good for pointers as for arrays.

    Consider the code

    void clear1(int array[], int size)
    {
        int i;
        for (i = 0; i < size; i = i + 1)
            array[i] = 0;
    }

    void clear2(int *array, int size)
    {
        int *p;
        for (p = &array[0]; p < &array[size]; p = p + 1)
            *p = 0;
    }

  • Examples

    Arrays vs. Pointers
    clear1 uses indices while clear2 uses pointers.

    The second procedure deserves some explanation:

    The address of a variable is denoted by &.
    The object pointed to by a pointer is indicated by *.
    The declarations *p and *array declare them as pointers to integers.

    Let us look at the assembly code.

  • Examples

    Array version of clear
    Assume $a0 and $a1 hold array and size respectively. i is allocated in register $t0.

           move $t0, $zero        # i = 0
    loop1: add  $t1, $t0, $t0     # $t1 = i*2
           add  $t1, $t1, $t1     # $t1 = i*4
           add  $t2, $a0, $t1     # $t2 = address of array[i]
           sw   $zero, 0($t2)     # array[i] = 0
           addi $t0, $t0, 1       # i = i+1
           slt  $t3, $t0, $a1     # $t3 = (i < size)
           bne  $t3, $zero, loop1 # if (i < size) go to loop1

  • Examples

    Pointer version of clear
    Assume $a0 and $a1 hold array and size respectively. p is allocated in register $t0.

           move $t0, $a0          # p = address of array[0]
    loop2: sw   $zero, 0($t0)     # Memory[p] = 0
           addi $t0, $t0, 4       # p = p + 4
           add  $t1, $a1, $a1     # $t1 = size * 2
           add  $t1, $t1, $t1     # $t1 = size * 4
           add  $t2, $a0, $t1     # $t2 = address of array[size]
           slt  $t3, $t0, $t2     # $t3 = (p < &array[size])
           bne  $t3, $zero, loop2 # if (p < &array[size]) go to loop2

  • Examples

    Improved Pointer version of clear
    This version moves the address calculation out of the loop.

           move $t0, $a0          # p = address of array[0]
           add  $t1, $a1, $a1     # $t1 = size * 2
           add  $t1, $t1, $t1     # $t1 = size * 4
           add  $t2, $a0, $t1     # $t2 = address of array[size]
    loop2: sw   $zero, 0($t0)     # Memory[p] = 0
           addi $t0, $t0, 4       # p = p + 4
           slt  $t3, $t0, $t2     # $t3 = (p < &array[size])
           bne  $t3, $zero, loop2 # if (p < &array[size]) go to loop2

  • Examples

    Comparing
    Array version

           move $t0, $zero
    loop1: add  $t1, $t0, $t0
           add  $t1, $t1, $t1
           add  $t2, $a0, $t1
           sw   $zero, 0($t2)
           addi $t0, $t0, 1
           slt  $t3, $t0, $a1
           bne  $t3, $zero, loop1

    Pointer version

           move $t0, $a0
           add  $t1, $a1, $a1
           add  $t1, $t1, $t1
           add  $t2, $a0, $t1
    loop2: sw   $zero, 0($t0)
           addi $t0, $t0, 4
           slt  $t3, $t0, $t2
           bne  $t3, $zero, loop2

    The array version has to multiply the index in every iteration.
    The pointer is updated more efficiently.
    The array version executes 7 instructions per iteration; the pointer version executes 4.
    An optimizing compiler will translate the array version into the pointer version.
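The per-iteration counts translate into total dynamic instruction counts as follows: the array version runs 1 setup instruction plus 7 per iteration, and the improved pointer version runs 4 setup instructions plus 4 per iteration. The C sketch below just evaluates these two expressions; the function names are hypothetical, and each pseudoinstruction is counted as one instruction.

```c
/* Dynamic instruction counts for a given number of loop iterations.
   Both function names are assumptions for this example. */
unsigned long array_version_count(unsigned long iterations)
{
    return 1 + 7 * iterations;   /* move, then 7 instructions per pass */
}

unsigned long pointer_version_count(unsigned long iterations)
{
    return 4 + 4 * iterations;   /* 4 setup, then 4 instructions per pass */
}
```

For 100 iterations this gives 701 instructions against 404, so the pointer version wins for any loop of more than one iteration despite its longer setup.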


  • Real Stuff

    IA-32 Instructions
    Designers sometimes provide more powerful instructions than those found in MIPS.
    Their goal is to reduce the number of instructions in a program.
    The danger is increased hardware complexity, which can lengthen execution time.
    MIPS was the vision of a single small group in 1985.
    That is not the case for the Intel IA-32, which was developed by several independent groups who evolved the architecture over 20 years.

  • Real Stuff

    IA-32 Milestones
    1978: The Intel 8086 was announced as a dedicated-register architecture.

    1980: The Intel 8087 FP coprocessor was announced. It used a stack instead of registers and extended the 8086 architecture by 60 instructions.

    1982: The 80286 extended the 8086 architecture by increasing the address space to 24 bits, creating an elaborate memory-mapping and protection model, and adding a few instructions to handle the protection.

    1985: The 80386 extended the 80286 to 32 bits. It also added new instructions, turning the 386 into a nearly general-purpose register machine. Paging support was also added.

    1989-95: The 80486, Pentium, and Pentium Pro aimed for higher performance, adding only 4 new user-visible instructions.

  • Real Stuff

    IA-32 Milestones
    1997: MMX (Multi Media Extension) expanded the Pentium and Pentium Pro architectures with 57 new instructions that use the FP stack to accelerate multimedia and communications applications.

    1999: Intel added another 70 instructions, labelled SSE (Streaming SIMD Extensions), as part of the Pentium III.

    2001: Intel added another 144 instructions for double-precision arithmetic. FP registers can be used for FP operations instead of the stack.

    AMD enhanced the IA-32 architecture, increasing the address space from 32 to 64 bits (AMD64). It provides a legacy mode identical to IA-32 and a compatibility mode (user programs are IA-32, operating system is AMD64).

    Intel capitulated and embraced AMD64, enhancing it with a 128-bit compare-and-swap instruction. It also added SSE3, which supports complex arithmetic. AMD will offer SSE3 in subsequent chips.

  • Real Stuff

    The 80386 Register Set

    80386 Register Set

    The 80386 extended all 80286 16-bit registers (except the segment registers) to 32 bits.

    The prefix E was added to the register names to denote the 32-bit versions.

    The 80386 has only 8 GPRs, as opposed to the 32 GPRs of MIPS.

  • Real Stuff

    Operand types for arithmetic, logical, and data transfer instructions

    Source/destination operand | Second source operand
    Register                   | Register
    Register                   | Immediate
    Register                   | Memory
    Memory                     | Register
    Memory                     | Immediate

    The IA-32 arithmetic and logical instructions use one operand as both a source and the destination.

  • Real Stuff

    Addressing Modes

    Mode | Description | Register restrictions | MIPS equivalent
    Register indirect | Address is in a register | Not ESP or EBP | lw $s0, 0($s1)
    Based mode with 8- or 32-bit displacement | Address is the contents of a base register plus a displacement | Not ESP or EBP | lw $s0, 100($s1)
    Based plus scaled index | Address is Base + (2^scale × Index), where scale is 0, 1, 2, or 3 | Base: any GPR; Index: not ESP | mul $t0, $s2, 4; add $t0, $t0, $s1; lw $s0, 0($t0)
    Based plus scaled index with 8- or 32-bit displacement | Address is Base + (2^scale × Index) + displacement, where scale is 0, 1, 2, or 3 | Base: any GPR; Index: not ESP | mul $t0, $s2, 4; add $t0, $t0, $s1; lw $s0, 100($t0)

    Addresses within the instruction come in two sizes: displacements are 8 or 32 bits wide.

    Memory operands can be used in any instruction.

    There are restrictions on which registers can be used with each mode.
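The most general of these modes, based plus scaled index with displacement, can be sketched in C as a single expression. The function name scaled_index_address is hypothetical, and the register restrictions listed above are not modelled.

```c
#include <stdint.h>

/* Effective address = Base + (2^scale * Index) + displacement,
   with scale restricted to 0..3. The name scaled_index_address is
   an assumption for this example. */
uint32_t scaled_index_address(uint32_t base, uint32_t index,
                              unsigned scale, int32_t displacement)
{
    return base + (index << (scale & 3u)) + (uint32_t)displacement;
}
```

With base 0x1000, index 5, scale 2, and displacement 100, the address is 0x1000 + 20 + 100 = 0x1078. A scale of 2 corresponds to the multiplication by 4 that the MIPS equivalent performs with separate instructions.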


  • Real Stuff

    Classes of Instructions
    Four major classes of instructions:

    1 Data movement instructions: move, push, pop, etc.
    2 Arithmetic and logic instructions: test, integer, and decimal arithmetic operations.
    3 Control flow: conditional branches, unconditional jumps, calls, and returns.
    4 String instructions: string move and string compare.

  • Real Stuff

    Comparing IA-32 and MIPS

    IA-32
    Arithmetic and logic instructions can have one operand in memory.
    Conditional branches are based on condition codes or flags.
    Comparison with 0 requires extra instructions.
    Branch address is specified in bytes.

    MIPS
    Only data transfer instructions access memory.
    Conditional branches are based on an arithmetic comparison, which speeds up comparison with zero.
    Branch address is specified in words, favoring simplicity.

  • Real Stuff

    The 80386 Instruction Format

    One opcode bit says whether the offset is 8 or 32 bits wide.

    A postbyte after the opcode specifies the addressing mode.

    A second postbyte is used for the based plus scaled index modes.

  • Fallacies and Pitfalls

    More powerful instructions mean higher performance.
    Data transfers performed with the 80x86 repeat-prefix instructions yield 40 MB/sec, while load/store data transfers yield 60 MB/sec.

    Write code in assembly language for higher performance.
    With the level of optimization included in today's compilers, code written in a high-level language is often faster than code written in assembly, especially for long programs.

  • Concluding Remarks

    The 4 Design Principles
    1 Simplicity favors regularity. The regularity of the fields in the MIPS instructions allows all instruction types to be processed by almost the same hardware, keeping the machine simple.
    2 Smaller is faster. Speed is the reason for having 32 registers instead of more.
    3 Good design demands good compromises. For example, MIPS does not provide 32 bits for immediate addresses, in order to keep all instructions the same length.
    4 Make the common case fast. Arithmetic immediate instructions are an example of this principle.

  • References

    [1] David A. Patterson and John L. Hennessy. Computer Organization and Design: The Hardware/Software Interface. Morgan Kaufmann, San Mateo, CA, 2007.
