cmsc 22200 computer architecture - department of … · · 2016-09-29cmsc 22200 computer...

CMSC 22200 Computer Architecture

Lecture 2: ISA

Prof. Yanjing Li

Department of Computer Science

University of Chicago

Administrative Stuff

- Lab1 is out!
  - Due next Thursday (10/6)
- Lab2
  - Out next Thursday

Lecture Outline

- Introduction to ISA
- Case Study: ARMv8 / LEGv8

Review: Basic Concepts

- Basic concepts
  - What is a computer?
  - What is the von Neumann model?
  - What is ISA?
  - What is uarch?
  - Design point

ISA or uarch?

- Instruction (e.g., add)
- Number of general purpose registers
- Number of ports to the register file
- Number of cycles to execute the MUL instruction
- Whether or not the machine employs pipelined instruction execution
- Power/thermal management
- Support for virtual memory

ISA

- Instructions
  - Opcodes, Addressing Modes, Data Types
  - Instruction Types and Formats
  - Registers, Condition Codes
- Memory organization
  - Address space, Addressability, Alignment
  - Virtual memory management
- Call, Interrupt/Exception Handling
- Access Control, Priority/Privilege
- I/O: memory-mapped vs. instructions
- Task/thread Management
- Power and Thermal Management
- Multi-threading support, Multiprocessor support

Many Different ISAs Over Decades

- x86
- ARM
- MIPS
- SPARC
- IBM 360
- What are the fundamental differences, and why?

ISA Element: Instruction

- An instruction (or machine code) consists of
  - opcode: what the instruction does (add, sub, …)
  - operands: what it operates on (register, memory, immediate)
- Example

Data Types

- Representation of information for which there are instructions that operate on the representation
- ARMv8
  - Integer (byte, halfword, word, doubleword, quadword)
  - Floating point (half-, single-, double-precision)
  - Fixed point
  - Vector formats
- Others (e.g., x86)
  - BCD, strings

Instruction Processing Style

- Specifies the number of "operands" an instruction "operates" on and how it does so
- 0, 1, 2, 3 address machines
  - 0-address: stack machine (op, push, pop)
  - 1-address: accumulator machine (e.g., add mem)
  - 2-address: 2-operand machine (op D, S; one is both source and dest)
  - 3-address: 3-operand machine (op D, S1, S2; source and dest separate)
- E.g., ARMv8 is a 3-address machine

Instruction Classes

- Operate instructions
  - Process data: arithmetic and logical operations
  - Fetch operands, compute result, store result
  - Implicit sequential control flow (e.g., PC <= PC + 4)
- Data movement instructions
  - Move data between memory, registers, I/O devices
  - Implicit sequential control flow
- Control flow instructions
  - Change the sequence of instructions that are executed

Instruction Addressing Modes

- Specifies how to obtain an operand of an instruction
  - Register
  - Immediate
  - Memory (displacement, register indirect, indexed, absolute, memory indirect, autoincrement, autodecrement, …)
- Fewer or more addressing modes? Tradeoffs?

Instruction Addressing Modes for Memory

- Specify how to obtain memory operands
  - Absolute: LW Rt, 10000
    use immediate value as address
  - Register indirect: LW Rt, (rbase)
    use GPR[rbase] as address
  - Displaced or based: LW Rt, offset(rbase)
    use offset + GPR[rbase] as address
  - Indexed: LW Rt, (rbase, rindex)
    use GPR[rbase] + GPR[rindex] as address
  - Memory indirect: LW Rt, ((rbase))
    use value at M[GPR[rbase]] as address
  - Autoincrement/decrement: LW Rt, (rbase)
    use GPR[rbase] as address, but inc. or dec. GPR[rbase] each time
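The memory addressing modes listed on this slide differ only in how the effective address is formed. The following is a minimal C sketch of that idea, assuming a toy machine in which each address names one 64-bit word; the names `gpr`, `mem`, and the `ea_*` functions are hypothetical and are not part of any real ISA or of the lab code.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 32
#define MEM_WORDS 64

/* Toy architectural state (hypothetical names, not from the slides). */
uint64_t gpr[NUM_REGS];
uint64_t mem[MEM_WORDS];   /* simplification: one address = one 64-bit word */

/* Each function returns the effective address for one addressing mode. */
uint64_t ea_absolute(uint64_t imm)               { return imm; }
uint64_t ea_register_indirect(int rbase)         { return gpr[rbase]; }
uint64_t ea_displaced(int rbase, int64_t offset) { return gpr[rbase] + (uint64_t)offset; }
uint64_t ea_indexed(int rbase, int rindex)       { return gpr[rbase] + gpr[rindex]; }

/* Memory indirect: the address itself is read from memory. */
uint64_t ea_memory_indirect(int rbase) {
    assert(gpr[rbase] < MEM_WORDS);
    return mem[gpr[rbase]];
}

/* Autoincrement (post-increment variant): use the old base, then bump it. */
uint64_t ea_autoincrement(int rbase, uint64_t step) {
    uint64_t addr = gpr[rbase];
    gpr[rbase] += step;
    return addr;
}

/* LW Rt, <address>: read one word from the toy memory. */
uint64_t load_word(uint64_t addr) {
    assert(addr < MEM_WORDS);
    return mem[addr];
}
```

For instance, `load_word(ea_indexed(1, 2))` models `LW Rt, (r1, r2)`. Note that memory indirect needs two memory accesses per operand, which is one reason an ISA might leave that mode out.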

Instruction Length

- Fixed length: all instructions have the same length
  + Easier to decode a single instruction in hardware
  + Easier to decode multiple instructions concurrently (superscalar)
  -- Wasted bits in instructions (Why is this bad?)
  -- Harder-to-extend ISA (how to add new instructions?)
- Variable length: instructions have different lengths
  + Compact encoding (Why is this good?)
  + Extensibility
  -- More logic to decode a single instruction
  -- Harder to decode multiple instructions concurrently
- Tradeoffs
  - Code size (memory space, bandwidth, latency) vs. hardware complexity
  - ISA extensibility and expressiveness vs. hardware complexity
  - Performance/energy efficiency? Smaller code vs. ease of decode

Uniform/Non-uniform Decode of Instructions

- Uniform decode: the same bits in each instruction carry the same meaning
  - Opcode is always in the same location
  - Ditto operand specifiers, immediate values, …
  - Many "RISC" ISAs: MIPS, SPARC
  + Easier decode, simpler hardware
  + Enables parallelism: generate target address before knowing the instruction is a branch
  -- Restricts instruction format (fewer instructions?) or wastes space
- Non-uniform decode
  - E.g., opcode can be the 1st-7th byte in x86
  + More compact and powerful instruction format
  -- More complex decode logic
- Uniform decode usually means fixed length as well

x86 vs. MIPS Instruction Formats

- x86
- MIPS:

  R-type:  opcode = 0 (6-bit) | rs (5-bit) | rt (5-bit) | rd (5-bit) | shamt (5-bit) | funct (6-bit)
  I-type:  opcode (6-bit) | rs (5-bit) | rt (5-bit) | immediate (16-bit)
  J-type:  opcode (6-bit) | immediate (26-bit)
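Because MIPS uses uniform decode, every R-type field can be extracted with a fixed shift and mask. The sketch below illustrates this in C, assuming the R-type layout from the slide; `RTypeFields` and `decode_r_type` are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the six R-type fields. */
typedef struct {
    unsigned opcode, rs, rt, rd, shamt, funct;
} RTypeFields;

/* Split a 32-bit MIPS R-type word into its fields.
   Layout: opcode(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6). */
RTypeFields decode_r_type(uint32_t word) {
    RTypeFields f;
    f.opcode = (word >> 26) & 0x3Fu;
    f.rs     = (word >> 21) & 0x1Fu;
    f.rt     = (word >> 16) & 0x1Fu;
    f.rd     = (word >> 11) & 0x1Fu;
    f.shamt  = (word >> 6)  & 0x1Fu;
    f.funct  = word & 0x3Fu;
    assert(f.opcode == 0);   /* R-type instructions carry opcode 0 */
    return f;
}
```

As a usage example, the word `0x02324020` decodes to rs = 17, rt = 18, rd = 8, funct = 0x20, which is the MIPS encoding of `add $t0, $s1, $s2`.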

ISA Element: Registers

- Fast storage
- How many?
- Size of each register?
- General purpose vs. special purpose?
- Why is having registers a good idea?
  - Because programs exhibit a characteristic called data locality
  - A recently produced/accessed value is likely to be used more than once (temporal locality)
    - Storing that value in a register eliminates the need to go to memory each time that value is needed
    - Compiler: register optimization is important!

ISA Element: Memory Organization

- Address space: how many uniquely identifiable locations are in memory
- Addressability: how much data each uniquely identifiable location stores
  - Byte addressable: most ISAs
- Aligned/unaligned access

  MSB                          LSB
  byte-3  byte-2  byte-1  byte-0
  byte-7  byte-6  byte-5  byte-4

Load/Store vs. Memory/Memory Architectures

- Load/store architecture: operate instructions operate only on registers
  - E.g., MIPS, ARM and many RISC ISAs
- Memory/memory architecture: operate instructions can operate on memory locations
  - E.g., x86

ISA Element: I/O

- How to interface with I/O devices
  - Memory mapped I/O
    - A region of memory is mapped to I/O devices
    - I/O operations are loads and stores to those locations
  - Special I/O instructions
    - IN and OUT instructions in x86 deal with ports of the chip
  - Tradeoffs?
    - Which one is more general purpose?

Other ISA Elements

- Privilege modes
  - User vs supervisor
  - Who can execute what instructions?
- Exception and interrupt handling
  - What procedure is followed when something goes wrong with an instruction?
  - What procedure is followed when an external device requests the processor?
- Virtual memory
  - Each program has the illusion of the entire memory space, which is greater than physical memory
- Access protection

CISC vs. RISC

- CISC, complex instruction set computer -> complex instructions
  - Initially motivated by "not good enough" code generation
  - Memory size/bandwidth considerations
- RISC, reduced instruction set computer -> simple instructions
  - Goal: enable better compiler control and optimization
  - Motivated by
    - Simplifying the hardware -> lower cost, higher frequency
    - Enabling the compiler to optimize the code better
- Simple compiler, complex hardware vs. complex compiler, simple hardware

CISC vs. RISC

- Usually, …
- RISC
  - Simple instructions
  - Fixed length
  - Uniform decode
  - Few addressing modes
- CISC
  - Complex instructions
  - Variable length
  - Non-uniform decode
  - Many addressing modes

CISC vs. RISC

- Example: x86
- Each x86 instruction can be translated into a sequence of micro-instructions (uops)
  - Uops can be RISC-like
  - Stored in a read-only memory structure (UROM)
  - Why uops?
    - Simple processing engine to support complex instructions
    - Extensibility
    - Flexibility (can be patched to fix bugs)
- Translation -> unification of ISAs (ARM, x86, GPU)?

Aside: Ultimate RISC

[Figure; source: Wikipedia]

Review: Programmer Visible (Architectural) State

- Memory: M[0], M[1], M[2], M[3], M[4], ..., M[N-1]
  array of storage locations indexed by an address
- Program Counter
  memory address of the current instruction
- Registers
  - given special names in the ISA (as opposed to addresses)
  - general vs. special purpose

Instructions (and programs) specify how to transform the values of programmer visible state

Programmer Invisible State

- Microarchitectural state
- Programmer cannot access this directly
- E.g. cache state
- E.g. pipeline registers

ARMv8/LEGv8 Case Study


The ARMv8 ISA

- Commercialized by ARM Holdings (www.arm.com)
- Large share of embedded core market
  - Applications in mobile, consumer electronics, network/storage equipment, cameras, printers, …
- Typical of many modern ISAs
- Reference (5740 pages)
  - https://developer.arm.com/docs/ddi0487/a/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

** Based on original figure from [P&H CO&D, COPYRIGHT 2016 Elsevier. ALL RIGHTS RESERVED.]

ARMv8 Overview

- RISC, load/store architecture, both 32- and 64-bit
- 3-address machine
- 32-bit instructions
- Simple datatypes
  - int, fp, fixed point/vector interpretation
- Addressing modes: reg, imm, simple mem addressing
  - mem address from reg and instruction contents only
- 32 GPRs, PC, SP, ELR, 32 SIMD/FP registers
- Byte addressable
- Memory space and memory alignment?
- You will implement ARMv8 in C (Lab1)

LEGv8

- A subset of ARMv8
  - With some differences
- Reference
  - Green card from textbook
  - Also available online
  - http://booksite.elsevier.com/9780128017333/arm_ref.php

Instruction Formats


Registers

- 32 × 64-bit register file, and one 64-bit PC


Memory Accesses

- Memory is byte addressed
  - Each address identifies an 8-bit byte
- Alignment
  - Does not require words (4 bytes, or 32 bits) to be aligned in memory, except for instructions and the stack


R-format Instructions

- Instruction fields
  - opcode: operation code
  - Rm: the second register source operand
  - shamt: shift amount
  - Rn: the first register source operand
  - Rd: the register destination

  opcode  | Rm     | shamt  | Rn     | Rd
  11 bits | 5 bits | 6 bits | 5 bits | 5 bits


R-format Example

ADD X9,X20,X21  // add the values in X20 and X21, and put
                // the result in X9, or GPR[X9] = GPR[X20] + GPR[X21]

  opcode      | Rm    | shamt  | Rn    | Rd
  10001011000 | 10101 | 000000 | 10100 | 01001   (binary)

1000 1011 0001 0101 0000 0010 1000 1001 (binary) = 8B150289 (hex)
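The encoding of this example can be reproduced by shifting each field into place. The following is a minimal C sketch under the R-format field widths given on the slides (11, 5, 6, 5, 5 bits); `encode_r_format` is a hypothetical helper name, not part of the lab framework.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the five R-format fields into one 32-bit instruction word.
   Layout: opcode(11) | Rm(5) | shamt(6) | Rn(5) | Rd(5). */
uint32_t encode_r_format(uint32_t opcode, uint32_t rm, uint32_t shamt,
                         uint32_t rn, uint32_t rd) {
    assert(opcode < (1u << 11));
    assert(rm < 32 && rn < 32 && rd < 32);
    assert(shamt < 64);
    return (opcode << 21) | (rm << 16) | (shamt << 10) | (rn << 5) | rd;
}
```

Calling `encode_r_format(0x458, 21, 0, 20, 9)` (opcode 10001011000 in binary, Rm = X21, Rn = X20, Rd = X9) yields `0x8B150289`, matching the slide.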



shamt in R-format instructions

- shamt: how many positions to shift
- Shift left logical (LSL)
  - R[Rd] <- R[Rn] << shamt  // Shift left and fill with 0 bits
  - LSL by i bits: multiplies by 2^i
- Shift right logical (LSR)
  - R[Rd] <- R[Rn] >> shamt  // Shift right and fill with 0 bits
  - LSR by i bits: divides by 2^i (unsigned only)
- Note: R-format instructions in ARMv8 can shift the second operand before applying the operation specified by the opcode
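The shift semantics above map directly onto C's shift operators for unsigned values. The sketch below is an illustration under that assumption; `lsl`, `lsr`, and `add_shifted` are hypothetical helper names, and `add_shifted` models only the "shift the second operand, then add" case.

```c
#include <assert.h>
#include <stdint.h>

/* Logical shifts on 64-bit register values; vacated bits are filled with 0. */
uint64_t lsl(uint64_t value, unsigned shamt) {
    assert(shamt < 64);          /* shamt is a 6-bit field */
    return value << shamt;
}

uint64_t lsr(uint64_t value, unsigned shamt) {
    assert(shamt < 64);
    return value >> shamt;
}

/* Models an ADD whose second source operand is shifted left first. */
uint64_t add_shifted(uint64_t rn, uint64_t rm, unsigned shamt) {
    return rn + lsl(rm, shamt);
}
```

For example, `lsl(5, 3)` gives 40 (5 × 2^3), and `lsr(41, 3)` gives 5, since the unsigned division by 2^3 truncates.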



C to Assembly 101

- C code:

    f = (g + h) - (i + j);

  - f, …, j in X19, X20, …, X23
- Compiled into assembly:

    ADD X9, X20, X21
    ADD X10, X22, X23
    SUB X19, X9, X10
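One way to check the compiled sequence is to step through it on a toy register file. The following C sketch does that, assuming the register assignment from the slide (f in X19, g in X20, h in X21, i in X22, j in X23); the array `X` and the functions `exec_add`, `exec_sub`, and `compute_f` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file: X[0] .. X[31]. */
uint64_t X[32];

/* ADD Rd, Rn, Rm */
void exec_add(int rd, int rn, int rm) {
    assert(rd >= 0 && rd < 32 && rn >= 0 && rn < 32 && rm >= 0 && rm < 32);
    X[rd] = X[rn] + X[rm];
}

/* SUB Rd, Rn, Rm */
void exec_sub(int rd, int rn, int rm) {
    assert(rd >= 0 && rd < 32 && rn >= 0 && rn < 32 && rm >= 0 && rm < 32);
    X[rd] = X[rn] - X[rm];
}

/* Runs the three-instruction sequence for f = (g + h) - (i + j). */
int64_t compute_f(int64_t g, int64_t h, int64_t i, int64_t j) {
    X[20] = (uint64_t)g;
    X[21] = (uint64_t)h;
    X[22] = (uint64_t)i;
    X[23] = (uint64_t)j;
    exec_add(9, 20, 21);    /* X9  = g + h      */
    exec_add(10, 22, 23);   /* X10 = i + j      */
    exec_sub(19, 9, 10);    /* X19 = X9 - X10   */
    return (int64_t)X[19];
}
```

X9 and X10 serve as temporaries, so the source registers X20 through X23 keep their values after the sequence runs.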


I-format Instructions

- Immediate instructions
  - Rn: source register
  - Rd: destination register
  - Immediate field: constant data; zero-extended
- Example: ADDI X22, X22, #4
  - What does the machine code look like for ADDI?

  opcode  | immediate | Rn     | Rd
  10 bits | 12 bits   | 5 bits | 5 bits
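The machine code for the ADDI example can be worked out by packing the four I-format fields. The sketch below is one way to do this in C. It assumes the 10-bit ADDI opcode 1001000100 (binary) listed on the LEGv8 green card, a value the slides themselves do not state, and `encode_i_format` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the four I-format fields into one 32-bit instruction word.
   Layout: opcode(10) | immediate(12) | Rn(5) | Rd(5). */
uint32_t encode_i_format(uint32_t opcode, uint32_t imm12,
                         uint32_t rn, uint32_t rd) {
    assert(opcode < (1u << 10));
    assert(imm12 < (1u << 12));   /* immediate is zero-extended, so 0..4095 */
    assert(rn < 32 && rd < 32);
    return (opcode << 22) | (imm12 << 10) | (rn << 5) | rd;
}
```

With opcode `0x244` (1001000100 in binary), `encode_i_format(0x244, 4, 22, 22)` produces `0x910012D6` for `ADDI X22, X22, #4`.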


D-format Instructions

- Load/store instructions
  - Rn: base register
  - address: constant offset from contents of base register (+/- 32 doublewords)
  - op2: expands the opcode field
  - Rt: destination (load) or source (store) register number
- Example: LDUR X9,[X22,#64]
  - LDUR opcode: 11111000010 (binary); op2: 0
  - X9 (Rt field)
  - X22 (Rn field)

  opcode  | address | op2    | Rn     | Rt
  11 bits | 9 bits  | 2 bits | 5 bits | 5 bits
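The LDUR example can be encoded the same way as the other formats. This is a minimal C sketch assuming the D-format layout from the slide and a signed 9-bit byte offset (-256 to 255 bytes, which is the "+/- 32 doublewords" range); `encode_d_format` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the five D-format fields into one 32-bit instruction word.
   Layout: opcode(11) | address(9) | op2(2) | Rn(5) | Rt(5). */
uint32_t encode_d_format(uint32_t opcode, int32_t offset, uint32_t op2,
                         uint32_t rn, uint32_t rt) {
    assert(opcode < (1u << 11));
    assert(offset >= -256 && offset <= 255);   /* signed 9-bit byte offset */
    assert(op2 < 4);
    assert(rn < 32 && rt < 32);
    uint32_t address = (uint32_t)offset & 0x1FFu;  /* keep the low 9 bits */
    return (opcode << 21) | (address << 12) | (op2 << 10) | (rn << 5) | rt;
}
```

With the LDUR opcode `0x7C2` (11111000010 in binary), `encode_d_format(0x7C2, 64, 0, 22, 9)` gives `0xF84402C9` for `LDUR X9,[X22,#64]`.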


C to Assembly 201

- C code:

    A[12] = h + A[8];

  - h in X21, base address of A in X22
- Compiled code:
  - Index 8 requires offset of 64 (byte-addressed memory)

    LDUR X9,[X22,#64]
    ADD X9,X21,X9
    STUR X9,[X22,#96]
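The byte offsets 64 and 96 follow from 8-byte doublewords: 8 × 8 = 64 and 12 × 8 = 96. The C sketch below steps through the three instructions on a toy byte-addressed memory, assuming array A starts at the address held in X22; `memory`, `X`, `ldur`, `stur`, `exec_add`, and `run_example` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_BYTES 128

uint8_t  memory[MEM_BYTES];   /* toy byte-addressed memory */
uint64_t X[32];               /* toy register file */

/* LDUR Rt, [Rn, #offset]: load a doubleword from address X[rn] + offset. */
void ldur(int rt, int rn, int64_t offset) {
    uint64_t addr = X[rn] + (uint64_t)offset;
    assert(addr <= MEM_BYTES - 8);
    memcpy(&X[rt], &memory[addr], 8);   /* host byte order, for simplicity */
}

/* STUR Rt, [Rn, #offset]: store a doubleword to address X[rn] + offset. */
void stur(int rt, int rn, int64_t offset) {
    uint64_t addr = X[rn] + (uint64_t)offset;
    assert(addr <= MEM_BYTES - 8);
    memcpy(&memory[addr], &X[rt], 8);
}

/* ADD Rd, Rn, Rm */
void exec_add(int rd, int rn, int rm) {
    X[rd] = X[rn] + X[rm];
}

/* A[12] = h + A[8], with h in X21 and the base address of A in X22. */
void run_example(void) {
    ldur(9, 22, 64);       /* X9 = A[8]       */
    exec_add(9, 21, 9);    /* X9 = h + A[8]   */
    stur(9, 22, 96);       /* A[12] = X9      */
}
```

Because ARMv8 is a load/store architecture, the addition itself touches only registers; the two memory accesses are separate instructions.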


B Format Instructions

- Example: B L1
  - branch unconditionally to instruction labeled L1
- B opcode: 0A0 (hex) - 0BF (hex)
  - In ARMv8, it is 000101 (binary)
- Effect: if taken, PC = PC + BranchAddr

  opcode | BR_address
  6 bits | 26 bits
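The branch effect PC = PC + BranchAddr can be sketched in C as an encode step plus a target computation. This sketch assumes, following the BranchAddr definition on the LEGv8 green card, that the 26-bit BR_address field is a signed offset counted in 4-byte instructions; `encode_b_format` and `branch_target` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Encode B with a signed offset measured in instructions (words).
   Layout: opcode(6) = 000101 | BR_address(26). */
uint32_t encode_b_format(int32_t word_offset) {
    assert(word_offset >= -(1 << 25) && word_offset < (1 << 25));
    return (0x05u << 26) | ((uint32_t)word_offset & 0x03FFFFFFu);
}

/* Compute the branch target: PC + sign_extend(BR_address) * 4. */
uint64_t branch_target(uint64_t pc, uint32_t instruction) {
    uint32_t imm26 = instruction & 0x03FFFFFFu;
    /* Sign-extend from 26 bits: bit 25 is the sign bit. */
    int64_t offset = (imm26 & 0x02000000u)
                         ? (int64_t)imm26 - (1 << 26)
                         : (int64_t)imm26;
    return pc + (uint64_t)(offset * 4);
}
```

A branch three instructions forward encodes as `0x14000003`; from PC = 0x1000 its target is 0x100C. Encoding the offset in words rather than bytes extends the reach of the 26-bit field by a factor of four.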

