machine-level programming i: introduction apr. 14, 2008 topics assembly programmer’s execution...

Machine-Level Programming I:IntroductionApr. 14, 2008

Machine-Level Programming I:IntroductionApr. 14, 2008

TopicsTopics Assembly Programmer’s

Execution Model Accessing Information

RegistersMemory

Arithmetic operations

EECS 213

– 2 – EECS213, S’08

X86 Evolution: Programmer’s viewX86 Evolution: Programmer’s viewNameName DateDate TransistorsTransistors CommentsComments

80868086 19781978 29k29k 16-bit processor, basis for IBM PC & DOS; limited to 1MB 16-bit processor, basis for IBM PC & DOS; limited to 1MB address spaceaddress space

8028680286 19821982 134K134K Added elaborate, but not very useful, addressing scheme; basis Added elaborate, but not very useful, addressing scheme; basis for IBM PC AT and Windowsfor IBM PC AT and Windows

386386 19851985 275K275K Extended to 32b, added “flat addressing”, capable of running Extended to 32b, added “flat addressing”, capable of running Unix, Linux/gcc usesUnix, Linux/gcc uses

486486 19891989 1.9M1.9M Improved performance; integrated FP unit into chipImproved performance; integrated FP unit into chip

PentiumPentium 19931993 3.1M3.1M Improved performanceImproved performance

PentiumPro PentiumPro (P6)(P6)

19951995 6.5M6.5M Added conditional move instructions; big change in underlying Added conditional move instructions; big change in underlying microarchitecturemicroarchitecture

Pentium/MMXPentium/MMX 19971997 6.5M6.5M Added special set of instructions for 64-bit vectors of 1, 2, or 4 Added special set of instructions for 64-bit vectors of 1, 2, or 4 byte integer databyte integer data

Pentium IIPentium II 19971997 7M7M Merged Pentium/MMZ and PentiumPro implementing MMX Merged Pentium/MMZ and PentiumPro implementing MMX instructions within P6instructions within P6

Pentium IIIPentium III 19991999 8.2M8.2M Instructions for manipulating vectors of integers or floating Instructions for manipulating vectors of integers or floating point; later versions included Level2 cachepoint; later versions included Level2 cache

Pentium 4Pentium 4 20012001 42M42M 8 byte ints and floating point formats to vector instructions8 byte ints and floating point formats to vector instructions

– 3 – EECS213, S’08

Assembly Programmer’s ViewAssembly Programmer’s View

Programmer-Visible StateProgrammer-Visible State EIP Program Counter

Address of next instruction

Register FileHeavily used program data

Condition CodesStore status information about

most recent arithmetic operationUsed for conditional branching

EIP

Registers

CPU Memory

Object CodeProgram Data

OS Data

Addresses

Data

Instructions

Stack

ConditionCodes

Memory Byte addressable array Code, user data, (some) OS

data Includes stack used to support

procedures

– 4 – EECS213, S’08

text

text

binary

binary

Compiler (gcc -S)

Assembler (gcc or as)

Linker (gcc or ld)

C program (p1.c p2.c)

Asm program (p1.s p2.s)

Object program (p1.o p2.o)

Executable program (p)

Static libraries (.a)

Turning C into Object CodeTurning C into Object Code Code in files p1.c p2.c Compile with command: gcc -O p1.c p2.c -o p

Use optimizations (-O)Put resulting binary in file p

– 5 – EECS213, S’08

Compiling Into AssemblyCompiling Into Assembly

C CodeC Code

int sum(int x, int y){ int t = x+y; return t;}

Generated Assembly

_sum:pushl %ebpmovl %esp,%ebpmovl 12(%ebp),%eaxaddl 8(%ebp),%eaxmovl %ebp,%esppopl %ebpret

Obtain with command

gcc -O -S code.c

Produces file code.s

– 6 – EECS213, S’08

Assembly CharacteristicsAssembly CharacteristicsMinimal Data TypesMinimal Data Types

“Integer” data of 1, 2, or 4 bytesData valuesAddresses

Floating point data of 4, 8, or 10 bytes No aggregate types such as arrays or structures

Just contiguously allocated bytes in memory

Primitive OperationsPrimitive Operations Perform arithmetic function on register or memory data Transfer data between memory and register

Load data from memory into registerStore register data into memory

Transfer controlUnconditional jumps to/from proceduresConditional branches

– 7 – EECS213, S’08

Code for sum

0x401040 <sum>:0x550x890xe50x8b0x450x0c0x030x450x080x890xec0x5d0xc3

Object CodeObject CodeAssemblerAssembler

Translates .s into .o Binary encoding of each instruction Nearly-complete image of executable

code Missing linkages between code in

different files

LinkerLinker Resolves references between files

One of the object codes must contain function main();

Combines with static run-time libraries E.g., code for malloc, printf

Some libraries are dynamically linked Linking occurs when program begins

execution

• Total of 13 bytes

• Each instruction 1, 2, or 3 bytes

• Starts at address 0x401040

– 8 – EECS213, S’08

Machine Instruction ExampleMachine Instruction ExampleC CodeC Code

Add two signed integers

AssemblyAssembly Add 2 4-byte integers

“Long” words in GCC parlanceSame instruction whether signed

or unsigned

Operands:x: Register %eaxy: Memory M[%ebp+8]t: Register %eax

» Return function value in %eax

Object CodeObject Code 3-byte instruction Stored at address 0x401046

int t = x+y;

addl 8(%ebp),%eax

0x401046: 03 45 08

Similar to expression x += y

– 9 – EECS213, S’08

Disassembled00401040 <_sum>: 0: 55 push %ebp 1: 89 e5 mov %esp,%ebp 3: 8b 45 0c mov 0xc(%ebp),%eax 6: 03 45 08 add 0x8(%ebp),%eax 9: 89 ec mov %ebp,%esp b: 5d pop %ebp c: c3 ret d: 8d 76 00 lea 0x0(%esi),%esi

Disassembling Object Code Disassembling Object Code

DisassemblerDisassemblerobjdump -d p Useful tool for examining object code Analyzes bit pattern of series of instructions Produces approximate rendition of assembly code Can be run on either a.out (complete executable) or .o file

– 10 – EECS213, S’08

Data Formats Data Formats

““word” – 16b data type due to its originsword” – 16b data type due to its origins 32b – double word 64b – quad words

The overloading of “l” in GAS causes no problems since FP The overloading of “l” in GAS causes no problems since FP involves different operations & registersinvolves different operations & registers

C declC decl Intel data typeIntel data type GAS suffixGAS suffix Size (bytes)Size (bytes)

charchar ByteByte bb 11

shortshort WordWord ww 22

int, unsigned, long int, int, unsigned, long int, unsigned long, char *unsigned long, char *

Double wordDouble word ll 44

floatfloat Single precisionSingle precision ss 44

doubledouble Double precisionDouble precision ll 88

long doublelong double Extended precisionExtended precision tt 10/1210/12

– 11 – EECS213, S’08

Accessing Information Accessing Information

8 32bit registers8 32bit registers

Six of them mostly for Six of them mostly for general purposegeneral purpose

Last two point to key data in Last two point to key data in a process stacka process stack

Two low-order bytes of the first Two low-order bytes of the first 4 can be access directly 4 can be access directly (low-order 16bit as well)(low-order 16bit as well)

%eax

%ecx

%edx

%ebx

%esi

%edi

%esp

%ebp

%ax %ah %al

%cx %ch %cl

%dx %dh %dl

%bx %bh %bl

%si

%di

%sp

%bp

15 0831 7

Stack pointer

Frame pointer

– 12 – EECS213, S’08

Instruction formats Instruction formats

Most instructions have 1 or 2 operandsMost instructions have 1 or 2 operands operator [source[, destination]] Operand types:

Immediate – constant, denoted with a “$” in frontRegister – either 8 or 16 or 32bit registersMemory – location given by an effective address

Source: constant or value from register or memory Destination: register or memory

– 13 – EECS213, S’08

Operand specifiers Operand specifiers

Operand formsOperand forms Imm means a number Ea means a register form, e.g., %eax

s is 1, 2, 4 or 8 (called the scale factor) Memory form is the most general; subsets also work, e.g.,

Absolute: Imm M[Imm] Base + displacement: Imm(Eb) M[Imm + R[Eb]]

Operand valuesOperand values R[Ea] means "value in register"

M[loc] means "value in memory location loc"

TypeType FormForm Operand valueOperand value NameName

ImmediateImmediate $$ImmImm ImmImm ImmediateImmediate

RegisterRegister EEaa R[ER[Eaa]] RegisterRegister

MemoryMemory ImmImm (E (Ebb, E, Eii, , ss)) M[M[ImmImm + R[E + R[Ebb] + R[E] + R[Eii] * ] * ss]] Scaled indexedScaled indexed

– 14 – EECS213, S’08

Operand quizOperand quiz

OperandOperand ValueValue TypeType

%eax%eax

0x1040x104

$0x108$0x108

(%eax)(%eax)

4(%eax)4(%eax)

9(%eax, %edx)9(%eax, %edx)

260(%ecx, %edx)260(%ecx, %edx)

0xFC(, %ecx, 4)0xFC(, %ecx, 4)

(%eax, %edx, 4)(%eax, %edx, 4)

AddressAddress ValueValue

0x1000x100 0xFF0xFF

0x1040x104 0xAB0xAB

0x1080x108 0x130x13

0x10C0x10C 0x110x11

RegisterRegister ValueValue

%eax%eax 0x1000x100

%ecx%ecx 0x10x1

%edx%edx 0x30x3

– 15 – EECS213, S’08

Moving dataMoving data

Among the most common instructionsAmong the most common instructions

IA32 restriction – cannot move from one memory location to another with IA32 restriction – cannot move from one memory location to another with one instructionone instruction

Note the differences between Note the differences between movb, movsbl and movzblmovb, movsbl and movzbl

Last two work with the stackLast two work with the stackpushl %ebp = subl $4, %esp

movl %ebp, (%esp)

Since stack is part of program mem, you can really access allSince stack is part of program mem, you can really access all

InstructionInstruction EffectEffect DescriptionDescription

mov{l,w,b} mov{l,w,b} S,DS,D D D ← S← S Move double word, word or byteMove double word, word or byte

movsbl movsbl S,DS,D D D ← SignExtend(S)← SignExtend(S) Move sign-extended byteMove sign-extended byte

movzbl movzbl S,DS,D D D ← ZeroExtend(S)← ZeroExtend(S) Move zero-extended byteMove zero-extended byte

pushl pushl SS R[%esp] R[%esp] ← R[%esp] – 4;← R[%esp] – 4;M[R[%esp]] ← SM[R[%esp]] ← S

Push S onto the stackPush S onto the stack

popl Dpopl D D ←D ← M[R[%esp]]M[R[%esp]]R[%esp] R[%esp] ← R[%esp] + 4;← R[%esp] + 4;

Pop S from the stackPop S from the stack

– 16 – EECS213, S’08

movl Operand Combinationsmovl Operand Combinations

Cannot do memory-memory transfers with single instruction

movl

Imm

Reg

Mem

Reg

Mem

Reg

Mem

Reg

Source Destination

movl $0x4,%eax

movl $-147,(%eax)

movl %eax,%edx

movl %eax,(%edx)

movl (%eax),%edx

C Analog

temp = 0x4;

*p = -147;

temp2 = temp1;

*p = temp;

temp = *p;

– 17 – EECS213, S’08

Using simple addressing modesUsing simple addressing modes

void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0;}

swap:pushl %ebpmovl %esp,%ebppushl %ebx

movl 8(%ebp),%edx movl 12(%ebp),%ecx

movl (%ecx),%eaxmovl (%edx),%ebxmovl %eax,(%edx)movl %ebx,(%ecx)

movl -4(%ebp),%ebxmovl %ebp,%esppopl %ebpret

Body

SetUp

Finish

Read value stored in location xp and store it in t0

Declares xp as being a pointer to an int

– 18 – EECS213, S’08

Understanding swapUnderstanding swap

void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0;}

Register Variable

%ecx yp

%edx xp

%eax t1

%ebx t0

movl 12(%ebp),%ecx # ecx = yp

movl 8(%ebp),%edx # edx = xp

movl (%ecx),%eax # eax = *yp (t1)

movl (%edx),%ebx # ebx = *xp (t0)

movl %eax,(%edx) # *xp = eax

movl %ebx,(%ecx) # *yp = ebx

0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

123

456

Address

0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

Old %ebp

Old %ebx

– 19 – EECS213, S’08








0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

123

456

Address

0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp 0x104

– 20 – EECS213, S’08








0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

123

456

Address

0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

0x120

0x104

– 21 – EECS213, S’08








0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

123

456

Address

0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

0x124

0x120

0x104

– 22 – EECS213, S’08








0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

123

456

Address

0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

456

0x124

0x120

0x104

– 23 – EECS213, S’08








0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

123

456

Address

0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

456

0x124

0x120

123

0x104

– 24 – EECS213, S’08








0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

456

456

Address

0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

456

0x124

0x120

123

0x104

– 25 – EECS213, S’08








0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

456

123

Address

0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

456

0x124

0x120

123

0x104

– 26 – EECS213, S’08

Address Computation InstructionAddress Computation Instruction

lealleal SrcSrc,,DestDest leal = Load Effective Address Src is address mode expression Set Dest to address denoted by expression

UsesUses Computing address without doing memory reference

E.g., translation of p = &x[i]; Computing arithmetic expressions of the form x + k*y

k = 1, 2, 4, or 8.Leal 7(%edx,%edx,4), %eax

» when %edx=x, %eax becomes 5x+7

– 27 – EECS213, S’08

Address computation quizAddress computation quiz

InstructionInstruction Result in %edxResult in %edx

leal 6(%eax), %edxleal 6(%eax), %edx

leal (%eax, %ecx), %edxleal (%eax, %ecx), %edx

leal (%eax, %ecx, 4), %edxleal (%eax, %ecx, 4), %edx

leal 7(%eax, %eax, 8), %edxleal 7(%eax, %eax, 8), %edx

leal 0xA(, %ecx, 4), %edxleal 0xA(, %ecx, 4), %edx

leal 9(%eax, %ecx, 2), %edxleal 9(%eax, %ecx, 2), %edx


%eax%eax xx

%ecx%ecx yy

%edx%edx ??

– 28 – EECS213, S’08

Some arithmetic operationsSome arithmetic operations

InstructionInstruction EffectEffect DescriptionDescription

incl incl DD D D ← D + 1← D + 1 IncrementIncrement

decl decl DD D D ← D – 1← D – 1 DecrementDecrement

negl negl DD D D ← -D← -D NegateNegate

notl notl DD D D ← ~D← ~D ComplementComplement

addl addl S,DS,D D D ← D + S← D + S AddAdd

subl subl S,DS,D D D ← D – S← D – S SubstractSubstract

imull imull S,DS,D D D ← D * S← D * S MultiplyMultiply

xorl xorl S,DS,D D D ← D ^ S← D ^ S Exclusive orExclusive or

orl orl S,DS,D D D ← D | S← D | S OrOr

andl andl S,DS,D D D ← D & S← D & S AndAnd

sall sall k,Dk,D D D ← D << k← D << k Left shiftLeft shift

shll shll k,Dk,D D D ← D << k← D << k Left shift (same as sall)Left shift (same as sall)

sarl sarl k,Dk,D D D ← D >> k← D >> k Arithmetic right shiftArithmetic right shift

shrl shrl k,Dk,D D D ← D >> k← D >> k Logical right shiftLogical right shift

– 29 – EECS213, S’08

Arithmetic quiz Arithmetic quiz

InstructionInstruction DestinationDestination ValueValue

addl %ecx, (%eax)addl %ecx, (%eax)

subl %edx, 4(%eax)subl %edx, 4(%eax)

imull $16, (%eax, %edx, 4)imull $16, (%eax, %edx, 4)

incl 8(%eax)incl 8(%eax)

decl %ecxdecl %ecx

subl %edx, %eaxsubl %edx, %eax

AddressAddress ValueValue

0x1000x100 0xFF0xFF

0x1040x104 0xAB0xAB

0x1080x108 0x130x13

0x10C0x10C 0x110x11


%eax%eax 0x1000x100

%ecx%ecx 0x10x1

%edx%edx 0x30x3

machine-level programming i: introduction apr. 14, 2008 topics assembly programmer’s execution...

Documents