ARM Memory - University of New Brunswick · owen/courses/2253-2017/slides/04-arm-memory.pdf

ARM Memory

Owen Kaser, CS2253

Mostly corresponds to book Chapter 5.


Overview

● Loads and Stores

● Memory Maps

● Register-Indirect Addressing

● Post- and Pre-indexed Addressing


16 Registers is Not Enough

● So far, the only places discussed for data are the ARM's CPU registers.

● Most interesting programs need more data.

● We need memory outside the CPU for our bulk data storage.

● Also, memory can contain pre-computed tables (e.g., of trig functions) that are never altered.

● For your toaster's software, the machine code can be set at the factory. Fancy toaster: you can “flash” your toaster with improved software.


Loads and Stores

● Recall that ARM is a “load/store” architecture. Cannot directly do calculations on values in memory. Have to load them into a CPU register to use them as inputs.

● Similarly, calculations put results into registers. Then you can use a store instruction to put them into memory.

● Loads and stores need to specify where in memory things should go. This will be a numeric “memory address”.

● (Memory) addressing modes are small built-in calculations the CPU can do, to compute the memory address.

● Simple case: value in, say, R3 is to be used as the address.


System Memory Maps

● A system built around an ARM7TDMI processor uses 32-bit values as memory addresses. Each address would correspond to a byte (oops, octet).

● The overall “memory address space” ranges from 0 to 0xFFFFFFFF.

● But the overall memory address space is further subdivided (boundaries are often small multiples of powers of 2)

● RAM, ROM, flash, and I/O devices can be given their own subdivisions.

● More on I/O devices later in the course. For now, just realize that some memory addresses accept stores, and some ignore them.


Ex. Memory Map (extracts from book Table 5.1)

Start        End          Description
0x00000000   0x0003FFFF   On-chip flash
0x00040000   0x00FFFFFF   reserved
0x01000000   0x1FFFFFFF   ROM
0x20000000   0x20007FFF   (Static) RAM
…..
0x4000C000   0x4000CFFF   UART 0 (a “serial port”) device
…..
0xE0001000   0xE0001FFF   “data watchpoint and trace” (DWT) facility
….
0xE0004000   0xFFFFFFFF   reserved
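To make the idea of "subdivisions" concrete, here is a minimal Python sketch that looks an address up in the rows extracted above. The names `MEMORY_MAP` and `region_of` are made up for this illustration, and the elided rows of the table are deliberately not filled in, so addresses that fall in those gaps come back as `None`.

```python
# Rows copied from the extract above; the elided rows ("…..") are NOT filled in.
MEMORY_MAP = [
    (0x00000000, 0x0003FFFF, "On-chip flash"),
    (0x00040000, 0x00FFFFFF, "reserved"),
    (0x01000000, 0x1FFFFFFF, "ROM"),
    (0x20000000, 0x20007FFF, "(Static) RAM"),
    (0x4000C000, 0x4000CFFF, "UART 0"),
    (0xE0001000, 0xE0001FFF, "DWT"),
    (0xE0004000, 0xFFFFFFFF, "reserved"),
]

def region_of(address):
    """Return the description of the region holding `address`, or None."""
    for start, end, description in MEMORY_MAP:
        if start <= address <= end:  # End addresses are inclusive.
            return description
    return None  # Address lies in a part of the table not shown here.

print(region_of(0x20000010))  # an address inside the (Static) RAM row
print(region_of(0x4000C004))  # an address inside the UART 0 row
```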


For Simplicity....

● Let's only mess with addresses in a range that corresponds to RAM memory.

● Then, loads and stores both make sense.


Register-Indirect Addressing Mode

● Let's suppose you want to load the byte at address 0x00005000 into register R3.

● 8-bit value into a 32-bit container. If we want the 8-bit value to be zero-extended, use the LDRB instruction.

● If you want it sign-extended, use LDRSB.

● Simplest case: a register stores the address of some data you care about. Let's go for R1.

● Assembler:

        MOV  R1, #0x00005000 ; address to R1
        LDRB R3, [R1]        ; memory value to R3
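The difference between zero-extension and sign-extension can be sketched in Python. The function names `ldrb` and `ldrsb` below are just labels for this model (they are not a real library), and registers are modelled as unsigned 32-bit integers.

```python
def ldrb(byte_value):
    """Model LDRB: zero-extend an 8-bit value into a 32-bit register."""
    return byte_value & 0xFF  # Bits 8..31 of the result are 0.

def ldrsb(byte_value):
    """Model LDRSB: sign-extend an 8-bit value into a 32-bit register."""
    value = byte_value & 0xFF
    if value & 0x80:          # Top bit set: negative in two's complement.
        value |= 0xFFFFFF00   # Copy the sign bit into bits 8..31.
    return value

# Show both results for a positive byte and two "negative" bytes.
for b in (0x7F, 0x80, 0xFF):
    print(hex(b), hex(ldrb(b)), hex(ldrsb(b)))
```

The two instructions only disagree when bit 7 of the byte is set.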


Looping Through Memory

● Let's suppose you want to wipe clear (to 0) the contents of all memory locations from 0x00005000 to 0x00005FFF.

● A loop will work nicely.

        MOV  R1, #0x00005000 ; starting location
        MOV  R2, #0x00006000 ; when to stop
        MOV  R3, #0
LP      STRB R3, [R1]        ; wipe clear current location's value
        ADD  R1, R1, #1      ; advance to next location
        TEQ  R1, R2          ; has R1 hit the stopping location?
        BNE  LP
        ….


Speeding It Up

● If the area to be cleared is properly aligned (starts on a multiple of 4) and is the right size (a multiple of 4) we can clear out 4 consecutive addresses with one STR (store word) instruction.

● Recall that a 32-bit word is stored across 4 addresses: A, A+1, A+2, A+3.


Faster Code

        MOV  R1, #0x00005000 ; starting location
        MOV  R2, #0x00006000 ; when to stop
        MOV  R3, #0          ; 4 bytes of zeros
LP      STR  R3, [R1]        ; wipe clear current location's value AND the next 3 locations' values
        ADD  R1, R1, #4      ; advance to location of next group of 4 bytes
        TEQ  R1, R2          ; has R1 hit the stopping location?
        BNE  LP

● Loop runs only ¼ as many times now.
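The ¼ claim can be checked with a small Python model of the two loops. This is only a sketch: `memory` is a hypothetical 32 KiB bytearray standing in for RAM, and `clear` is a made-up helper whose `step` argument is 1 for the STRB loop and 4 for the STR loop.

```python
START, STOP = 0x00005000, 0x00006000

def clear(memory, step):
    """Zero memory[START:STOP], `step` bytes per pass; return the pass count."""
    r1, iterations = START, 0
    while True:
        memory[r1:r1 + step] = bytes(step)  # STRB (step=1) or STR (step=4)
        r1 += step                          # ADD R1, R1, #step
        iterations += 1
        if r1 == STOP:                      # TEQ R1, R2 then BNE LP
            break
    return iterations

memory = bytearray(b"\xAA" * 0x8000)           # hypothetical RAM, all nonzero
byte_count = clear(memory, 1)                  # byte-at-a-time loop
memory[START:STOP] = b"\xAA" * (STOP - START)  # dirty the area again
word_count = clear(memory, 4)                  # word-at-a-time loop
print(byte_count, word_count)
```

Note that the loop test is "equal to the stop address", so the word version only terminates because the area's size is a multiple of 4.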


Even Faster

● The pattern of “use a register to provide a memory address, then update the register in preparation for the next loop” is extremely common.

● ARM designers created an addressing mode, called “post-indexed”, that does BOTH of these operations in a single instruction.

● STR R3, [R1], #4 is equivalent to

        STR R3, [R1]
        ADD R1, R1, #4
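A rough Python model of STR R3, [R1], #4 shows the order of the two steps. The helper name `str_post_indexed` and the dict of registers are inventions for this sketch, and the byte order is assumed to be little-endian (the slides do not say which the system uses).

```python
def str_post_indexed(memory, regs, rd, rn, offset):
    """Model STR Rd, [Rn], #offset: store at the OLD Rn, then update Rn."""
    address = regs[rn]                                             # use base as-is
    memory[address:address + 4] = regs[rd].to_bytes(4, "little")   # assumed little-endian
    regs[rn] = (address + offset) & 0xFFFFFFFF                     # then bump the base

memory = bytearray(b"\xAA" * 16)   # hypothetical 16 bytes of nonzero memory
regs = {"R1": 4, "R3": 0}
str_post_indexed(memory, regs, "R3", "R1", 4)
print(regs["R1"], memory.hex())
```

Only the four bytes at the old value of R1 are cleared; R1 ends up pointing at the next word.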


Textbook Figure 5.2


Even Faster Code

        MOV  R1, #0x00005000 ; starting location
        MOV  R2, #0x00006000 ; when to stop
        MOV  R3, #0          ; 4 bytes of zeros
LP      STR  R3, [R1], #4    ; wipe 4, then advance “pointer” R1
        TEQ  R1, R2          ; has R1 hit the stopping location?
        BNE  LP


Java Pre- vs Post-Increment

● Can draw a parallel to Java's ++ operators.

● Recall, v = M[p++] in Java

– it uses the current value of p to index M

– then it increments p: post-increment.

● Versus v = M[++p] in Java

– it first increments p: pre-increment

– then the new value of p is used to index into M


Post-Indexed Addressing

● In ARM, post-indexed addressing takes a base register. (Should not be R15.)

● Uses that base register's value to go to memory

● Then updates the base register's value by a little computation

– adding/subtracting a constant (earlier example)

– adding/subtracting a register

    ● which is allowed to be modified by the barrel shifter

    ● can be shifted/rotated by a constant amount

    ● (for LDR/STR the shift amount has to be a constant; a register-specified shift amount cannot be encoded)

● Usefulness of fanciest of these seems doubtful

● LDR R1, [R2], R3, ROR #3 ; is this useful???


Useful? Example

● Java, for an int array M, variable x:

        j = 0;
        while (….) {
            sum += M[j];
            j += x;
        }

● ARM: suppose x in R2, start of M in R1

● In loop body: LDR R3, [R1], R2, LSL #2
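Here is a small Python sketch of what LDR R3, [R1], R2, LSL #2 does inside that loop. The loop condition is elided on the slide, so the sketch assumes a hypothetical fixed iteration count; `strided_sum`, `pack_words` and the sample data are likewise made up, and words are assumed little-endian.

```python
def pack_words(values):
    """Lay out 32-bit ints in consecutive addresses (little-endian assumed)."""
    return bytearray(b"".join(v.to_bytes(4, "little") for v in values))

def strided_sum(memory, base, x, count):
    """Sum every x-th word of an int array, stepping R1 like a pointer."""
    r1, r2, total = base, x, 0
    for _ in range(count):                 # hypothetical loop condition
        r3 = int.from_bytes(memory[r1:r1 + 4], "little")  # load from old R1
        r1 += r2 << 2                      # post-index: R1 += R2 LSL #2 (x * 4 bytes)
        total = (total + r3) & 0xFFFFFFFF  # ADD wraps at 32 bits
    return total

M = pack_words([10, 20, 30, 40, 50, 60])   # sample array contents
print(strided_sum(M, 0, 2, 3))             # visits M[0], M[2], M[4]
```

The LSL #2 is what turns "advance by x elements" into "advance by 4x bytes".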


Pre-Indexed Addressing

● There are two flavours of pre-indexed addressing. Both do a little computation and use the computed effective address to go to memory. In one, the base register is updated. Other flavour does not update.

● In assembly language, the ! symbol means to update the base register. Don't use R15 as the base register with !

● OK to use R15 without the ! symbol. The value of R15 is 8 bytes beyond the start of the current instruction's machine code. [Details of why are a bit advanced.]


Rationale for the “little computations”

● PC-relative addressing for constants

● Getting a field of an object, given the start of the object.

● Indexing into array of objects, selecting a field (if the object size is a power of two)

● (Selected largely by analyzing what compilers for HLLs would find useful, I think... rather than focussing on assembly language programmers)


Pre-indexed Figure (Textbook)

● Instruction is STR r0, [r1, #12]

● Add ! to update r1 when finished:

        STR r0, [r1, #12]! ; r1 ← 0x20C
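The two flavours of pre-indexed addressing can be modelled in a few lines of Python. This is a sketch under assumptions: `str_pre_indexed` is a made-up helper, r1 is assumed to start at 0x200 (which is what makes the effective address 0x20C), the value in r0 is arbitrary, and stores are assumed little-endian.

```python
def str_pre_indexed(memory, regs, rd, rn, offset, writeback=False):
    """Model STR Rd, [Rn, #offset], or STR Rd, [Rn, #offset]! if writeback."""
    address = (regs[rn] + offset) & 0xFFFFFFFF   # the "little computation"
    memory[address:address + 4] = regs[rd].to_bytes(4, "little")
    if writeback:                                # the ! symbol
        regs[rn] = address
    return address

memory = bytearray(0x400)                        # hypothetical 1 KiB of RAM
regs = {"r0": 0x12345678, "r1": 0x200}           # assumed starting values

plain = str_pre_indexed(memory, regs, "r0", "r1", 12)
r1_after_plain = regs["r1"]                      # base register untouched
banged = str_pre_indexed(memory, regs, "r0", "r1", 12, writeback=True)
r1_after_bang = regs["r1"]                       # base register updated
print(hex(plain), hex(r1_after_plain), hex(r1_after_bang))
```

Both flavours store to the same effective address; they differ only in whether the base register keeps its old value.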


Some Pre-indexed Examples

● MOV R1, #0x12345678 fails. Constant is not a rotation of an 8-bit value.

● Instead, initialize a memory location with your constant. Then use PC-relative addressing to load it.

● LDR R1, myConst ; pseudo-op

        … 1000 bytes later...

        myConst DCD 0x12345678

● The LDR instruction is actually something like

        LDR R1, [PC, #996] ; PC was already 8 ahead

● 996 is close enough to PC. Must be within 4 KiB.
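The arithmetic behind the 996 can be sketched as follows. This assumes one particular reading of "1000 bytes later": that 1000 bytes lie between the end of the 4-byte LDR and myConst, so myConst sits 1004 bytes after the LDR's own address. The helper name and the address 0x8000 are hypothetical; the ±4095 limit comes from the 12-bit immediate offset field of LDR.

```python
def pc_relative_offset(instruction_address, literal_address):
    """Offset the assembler would put in LDR Rx, [PC, #offset]."""
    pc = instruction_address + 8        # R15 reads as 8 bytes past the instruction
    offset = literal_address - pc
    if not -4095 <= offset <= 4095:     # 12-bit offset magnitude plus add/subtract
        raise ValueError("literal is out of range for a PC-relative LDR")
    return offset

ldr_address = 0x8000                    # hypothetical address of the LDR
my_const = ldr_address + 4 + 1000       # assumed reading of "1000 bytes later"
print(pc_relative_offset(ldr_address, my_const))
```

If the constant were placed too far away, the sketch raises an error, which mirrors the assembler refusing to generate the instruction.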


Ex: Field Access for an Object

● In HLLs, the fields of an object occupy consecutive memory addresses (possibly with padding)

● Let's suppose that an object starts at 1000. There are two 32-bit fields, then a 16-bit halfword field that we want to load into R2.

● Let's suppose that R1 contains the starting address of the object.

● Use LDRH R2, [R1, #8] ; immediate offset is 8

(Desired field starts 8 bytes later: gotta skip over first two words.)

● (Minor point: LDRH's immediate offset is only 8 bits, so it must be within ±255.)


Ex: Array Access

● Suppose R1 contains the starting address of an array.

● Suppose the array's elements are 4 bytes each

● To load the wth array element, we want address R1 + 4*w

● Suppose value w is in R2

● LDR R5, [R1, R2, LSL #2] loads desired value.


No ADR Pseudo-op

● The Crossware assembler does not seem to support ADR, which is used to put an address into a register (that you will then use as a base register). For instance, summing values in array…

        MOV  R0, #0             ; accumulate answer
        ADR  R1, MyArr          ; Keil pseudo-op
        ADR  R2, AfterMyArr     ; past last valid address
LP      LDR  R3, [R1], #4
        ADD  R0, R0, R3
        TEQ  R1, R2
        BNE  LP
        …..
MyArr   DCD  34, 23, 56, 78, 12345566, ……...
AfterMyArr DCB 0


Instead of ADR

● Instead of ADR, you should be able to do the following:

        MOV  R0, #0             ; accumulate answer
        LDR  R1, =MyArr
        LDR  R2, =AfterMyArr    ; past last valid address
LP      LDR  R3, [R1], #4
        ADD  R0, R0, R3
        TEQ  R1, R2
        BNE  LP
        …..
MyArr   DCD  34, 23, 56, 78, 12345566, ……...
AfterMyArr DCD 0                ; wasted word, could avoid...


LDR As Pseudoinstruction

● LDR Rx, =value works for any 32-bit value (address or constant).

● It sets aside space in a “constant pool”, preinitialized to value. This constant pool is (by default) at the end of the current AREA.

● Then it generates machine code for a PC-relative LDR into Rx from this preinitialized location.

● Like a convenient DCD and LDR Rx, [PC, #something]

● See textbook Chapter 6.


Machine-Code Formats: LDR/STR/LDRB/STRB

● From reference manual:


Meaning of Some Bits (Ref Man)


Exercise/Example

● Determine machine code for

LDR R3, [R1], #4

and also

STRB R3, [R1, R2, LSR #5]!
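To check hand-worked answers, a tiny encoder can be sketched in Python. The reference-manual figures are not reproduced in this transcript, so the bit positions below are my understanding of the ARM word/unsigned-byte load/store format (cond, 01, I, P, U, B, W, L, Rn, Rd, 12-bit offset field) and should be compared against the manual. `encode_ldr_str` is a made-up name, the condition defaults to "always", and the special encodings for LSR #32, ASR #32 and RRX are ignored.

```python
SHIFTS = {"LSL": 0, "LSR": 1, "ASR": 2, "ROR": 3}

def encode_ldr_str(load, byte, rd, rn, pre, writeback,
                   imm=None, rm=None, shift="LSL", shift_imm=0, cond=0xE):
    """Assemble one LDR/STR/LDRB/STRB word (sketch; see assumptions above).

    For post-indexed forms pass pre=0 and writeback=0: the base register
    is always updated there, and the W bit is left clear.
    """
    up = 1                                   # U bit: 1 = add offset to base
    if rm is None:                           # immediate offset, I bit = 0
        i_bit = 0
        if imm < 0:
            up, imm = 0, -imm                # subtract instead of add
        if imm > 0xFFF:
            raise ValueError("immediate offset needs more than 12 bits")
        offset_field = imm
    else:                                    # (shifted) register offset, I bit = 1
        i_bit = 1
        if not 0 <= shift_imm <= 31:
            raise ValueError("shift amount must be a constant 0..31 here")
        offset_field = (shift_imm << 7) | (SHIFTS[shift] << 5) | rm
    return (cond << 28 | 0b01 << 26 | i_bit << 25 | pre << 24 | up << 23 |
            byte << 22 | writeback << 21 | load << 20 |
            rn << 16 | rd << 12 | offset_field)

# The two instructions from the exercise; run to compare with your own answers.
print(hex(encode_ldr_str(load=1, byte=0, rd=3, rn=1, pre=0, writeback=0, imm=4)))
print(hex(encode_ldr_str(load=0, byte=1, rd=3, rn=1, pre=1, writeback=1,
                         rm=2, shift="LSR", shift_imm=5)))
```

Working the exercise by hand first and then running the sketch is the intended use.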


Load and Store Multiple

● There are instructions LDM and STM that load or store a number of registers.

● With LDM, a bit vector in the machine code indicates which registers to load. They are loaded from consecutive addresses.

● STM works similarly

● They are especially useful in storing things on the runtime stack, and will be looked at when we cover that topic.