
Upload: darren-campbell

Post on 17-Dec-2015



Introduction to Assembly

• Here we have a brief introduction to IBM PC Assembly Language
  – CISC instruction set
  – Special purpose register set
  – 8 and 16 bit operations initially (expanded to 32 bit operations with the 80386, and later to 64 bit operations with x86-64 processors)
  – Memory-register and register-register operations available
  – Several addressing modes including many implied addresses


Instruction Format

• [name/label] [mnemonic] [operands] [;comments]
  – Operands are either literals, variables/constants, or registers
• Number of operands depends on the type of instruction; for most instructions it ranges from 0 to 2
• Examples:
  – mov ax, bx : 2 operands, destination first and then source
  – mov ax, 5 : one operand is a literal
  – mov y, ax : register to memory movement (ax is stored into variable y)
  – add ax, 5 : 2 operands for add
  – mul value : 1 operand for mul, other operand implied to be ax
  – nop : no operands for the no-op instruction
  – je location : 1 operand, with the condition implied to be a flag set by an earlier comparison


Literals and Variables

• Literals require that the type of value be specified by following the value with one of the following:
  – D, d (or nothing) for decimal
  – H, h for hexadecimal (a hexadecimal literal must begin with a digit, hence the leading 0 in 0Ah)
  – Q, q for octal
  – B, b for binary
  – Strings are placed in ‘ ’ or “ ” marks
  – Examples:
    • 10101011b
    • 0Ah
    • 35
    • ‘hello’
    • “goodbye”
• As we will define all assembly code within C/C++ programs, we will declare all variables in the C/C++ code itself
  – We only have to worry about types
    • int is 32 bit
    • short is 16 bit
    • char is 8 bit
• We must ensure that we place the datum into the right sized register, and that if we reference a literal, it is specified to be of the proper size to fit in the associated variable
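The suffix rules for numeric literals can be sketched as a small C helper. This is only an illustration of the notation, not how any assembler is implemented: the name parse_asm_literal is hypothetical, string literals are not handled, and the function assumes non-negative values so that -1 can signal bad input.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the value of an assembler-style numeric
   literal such as "0Ah" or "10101011b", or -1 if the text is not valid. */
long parse_asm_literal(const char *text)
{
    size_t len = strlen(text);
    char digits[32];
    char *end;
    int base = 10;                      /* no suffix means decimal */
    long value;

    if (len == 0 || len >= sizeof digits)
        return -1;

    /* The last character selects the radix, in upper or lower case. */
    switch (tolower((unsigned char)text[len - 1])) {
    case 'h': base = 16; len--; break;  /* hexadecimal */
    case 'q': base = 8;  len--; break;  /* octal */
    case 'b': base = 2;  len--; break;  /* binary */
    case 'd': base = 10; len--; break;  /* explicit decimal */
    default:  break;
    }
    if (len == 0)
        return -1;                      /* a suffix with no digits */

    memcpy(digits, text, len);
    digits[len] = '\0';

    value = strtol(digits, &end, base);
    if (*end != '\0')
        return -1;                      /* a digit not allowed in this radix */
    return value;
}
```

For example, parse_asm_literal("10101011b") and parse_asm_literal("0ABh") both give 171, the same value written in two radixes.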


Registers

• There are 14 registers in the Intel-based architecture
  – They are all special purpose
    • you can only use a given register as it was intended to be used
    • but there are some exceptions to this rule as described below
  – There are 4 data registers:
    • AX – accumulator
      – the main data register
      – AX is an implied register in the Mul and Div instructions
    • BX – base register
      – used for addressing, particularly when dealing with arrays and strings
      – BX can be used as a data register when not used for addressing
    • CX – counter
      – implicitly used in loop instructions
      – in non-looping instructions, can be used as a data register
    • DX – data register
      – primarily used for In and Out instructions, but also used to store partial results of Mul and Div operations
      – in other cases, can be used as a data register
• On 32-bit processors (the 80386 and later, including the Pentium), each of these is expanded to 32 bits (EAX, EBX, ECX, EDX); the 16-bit registers are also split into 8-bit halves (AL, AH, BL, BH, CL, CH, DL, DH)
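How the 8-bit, 16-bit and 32-bit names overlap can be modeled in plain C with masks and shifts. This is a sketch of the layout only; the function names (reg_ax, reg_al, reg_ah, write_al) are made up for illustration and a uint32_t stands in for EAX.

```c
#include <stdint.h>

/* AX is the low 16 bits of EAX. */
uint16_t reg_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* AL is the low byte of AX, AH is the high byte of AX. */
uint8_t reg_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }
uint8_t reg_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* Writing AL replaces only the low byte; the other 24 bits of EAX stay. */
uint32_t write_al(uint32_t eax, uint8_t value)
{
    return (eax & 0xFFFFFF00u) | value;
}
```

With EAX holding 12345678h, this model gives AX = 5678h, AH = 56h and AL = 78h. The same layout applies to EBX, ECX and EDX.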


Other Registers

• Other registers cannot be used for data but have specific uses:
  – Segment registers
    • point to different segments in memory
    • used for implied addressing
      – SS – stack
      – CS – code
      – DS – data
      – ES – extra (used as a base pointer for variables)
  – Indexing registers
    • used for offsets to the current procedure, stack, or string depending on the instruction
      – BP – base pointer used with SS to address subroutine local variables on the stack
      – SP – stack pointer used with SS for top of stack
      – SI and DI – source and destination for string transfers
  – IP – program counter
  – Status flags


Operations: Data Movement

• mov and xchg instructions
  – mov allows for register-register, memory-register, register-memory, register-immediate and memory-immediate
    • first item is destination, second is source
    • memory-memory moves must be done with 2 instructions using a register as temporary storage
    • memory references can use direct, direct+offset, or register-indirect modes
    • if the datum is 8-bit, the register only uses its high or low side; 16-bit uses the entire register; 32-bit uses the extended register (e.g., EAX, EDX); and 64-bit combines two registers
  – xchg allows only register-register, memory-register and register-memory, and exchanges two values rather than moving one value as mov does


Operations: Arithmetic/Conditional

• inc/dec dest
• add/sub dest, source
  – dest is a register or memory reference; source for add/sub is a register, memory reference, or literal
    • as long as dest and source are not both memory references
• mul/div source
  – one datum is source, the other is implied to be eax (or ax or al)
  – destination is implied as edx:eax combined (or dx:ax, or ah:al, depending on size)
    • div places the quotient in eax, ax or al, and the remainder in edx, dx or ah
    • mul places the low half of the result in eax, ax or al, and the high half in edx, dx or ah
• shl, shr, sal, sar, shld, shrd
  – shift, shift arithmetic, shift double
• rol, ror, rcl, rcr
  – rotate, rotate with carry
• Logic operations: AND, OR, XOR, NOT
  – form is OP dest, source (NOT takes only dest)
• NEG dest
  – negate a two’s complement value (replace it with its opposite)
• CMP first, second
  – compare first and second and set the proper flag(s) (ZF, SF, CF, OF, PF)
  – the result of a cmp operation is then used by branch instructions
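The implied operands of the 32-bit mul and div can be modeled in portable C with 64-bit arithmetic. This is a sketch of the unsigned forms only. The struct reg_pair and the names mul32 and div32 are invented for the example. div32 assumes the caller passes a nonzero source and a quotient that fits in 32 bits; where the model would silently truncate, the real instruction raises a divide error.

```c
#include <stdint.h>

/* Hypothetical stand-in for the edx:eax register pair. */
struct reg_pair {
    uint32_t edx;   /* high half of a product, or remainder of a division */
    uint32_t eax;   /* low half of a product, or quotient of a division */
};

/* Models "mul source": edx:eax = eax * source (unsigned). */
struct reg_pair mul32(uint32_t eax, uint32_t source)
{
    uint64_t product = (uint64_t)eax * source;
    struct reg_pair r = { (uint32_t)(product >> 32), (uint32_t)product };
    return r;
}

/* Models "div source": divides the 64-bit value edx:eax by source,
   leaving the quotient in eax and the remainder in edx.
   Assumes source != 0 and that the quotient fits in 32 bits. */
struct reg_pair div32(uint32_t edx, uint32_t eax, uint32_t source)
{
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    struct reg_pair r = { (uint32_t)(dividend % source),
                          (uint32_t)(dividend / source) };
    return r;
}
```

For instance, multiplying FFFFFFFFh by 2 does not fit in eax alone: the model leaves 1 in edx and FFFFFFFEh in eax. This is why edx should not hold live data across a mul.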


Operations: Branches

• Conditional branches:
  – these must be preceded by an instruction which sets at least one status flag (this includes cmp operations)
    • the flag tested is based on which branch is used
  – je/jne location
    • branch if zero flag set/clear
  – jg/jge/jl/jle location
    • jump on >, >=, <, <= for signed values, decided from the sign, overflow and zero flags left by the comparison
  – jc/jnc/jz/jnz/jp/jnp location
    • jump on carry, no carry, zero, not zero, even parity, not even parity (odd parity)
• Unconditional branches do not use a previous comparison or flag, just branch to the given location
  – jmp location
    • jmp instructions are used to implement goto statements and procedure calls
• loop location
  – decrement cx (or ecx)
  – if cx (ecx) != 0 then branch to label location
    • used for for-loops
    • since cx (or ecx) is used implicitly here, inside such a loop structure, we cannot use cx/ecx as a data register!
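The link between cmp and the conditional jumps can be modeled in C. The sketch below computes the flags that a 32-bit cmp leaves behind and then applies the conditions that je, jl and jg test (jl is taken when the sign and overflow flags differ; jg when the zero flag is clear and sign equals overflow). The struct and function names are hypothetical, and the parity flag is left out to keep the example short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the status flags. */
struct flags {
    bool zf;   /* zero: the operands were equal */
    bool sf;   /* sign: top bit of first - second */
    bool cf;   /* carry: unsigned borrow, first < second as unsigned */
    bool of;   /* overflow: signed subtraction overflowed */
};

/* Models "cmp first, second": computes first - second, keeps only flags. */
struct flags cmp32(uint32_t first, uint32_t second)
{
    uint32_t diff = first - second;
    struct flags f;
    f.zf = (diff == 0);
    f.sf = (diff >> 31) != 0;
    f.cf = first < second;
    /* Overflow happens when the operands have different signs and the
       sign of the result differs from the sign of first. */
    f.of = (((first ^ second) & (first ^ diff)) >> 31) != 0;
    return f;
}

/* Conditions tested by the branches that follow the cmp. */
bool takes_je(struct flags f) { return f.zf; }
bool takes_jl(struct flags f) { return f.sf != f.of; }
bool takes_jg(struct flags f) { return !f.zf && f.sf == f.of; }
```

Under this model, comparing 3 with 5 makes takes_jl true, and comparing 7 with -2 (passed as (uint32_t)-2) makes takes_jg true, matching the signed meaning of the two jumps.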


Addressing Modes

• Immediate – place datum in instruction as a literal
  – add ax, 10
    • use this mode when datum is known at program implementation time
• Direct – place variable in instruction
  – mov ax, x ; moves x into register ax
  – add y, ax ; sets y = y + ax
    • use this mode to access a variable in memory
• Direct + Offset
  – mov ax, x+2 ; the offset is in bytes, so if x is an array of words this moves x[1]
  – mov ax, x[bx] ; offsets into x by the number of bytes stored in bx
    • Note: mov ax, x[y] is illegal as it has 2 memory references, x and y!
    • use this mode when dealing with strings, arrays and structs
• Register Indirect – use base and/or index registers
  – mov ax, [bx + si] ; base-indexed
  – mov ax, [si - 4] ; register with displacement
  – mov ax, [bx + si - 6] ; base-indexed with displacement
    • we will not use these modes


Addressing Examples

• Imagine that we have declared in C:
  – int a[ ] = {0, 11, 15, 21, 99};
• Then, the following accesses give us the values of a as shown:
  – mov eax, a          eax = 0
  – mov eax, a+4        eax = 11
  – mov eax, a+8        eax = 15
  – mov eax, a[ebx]     eax = 99 if ebx = 16
• If ebx and ecx both = 0 and size is the number of items in the array, then we can iterate through the array as follows:

      top: mov eax, a[ebx]
           … do something with the array value …
           add ebx, 4
           add ecx, 1
           cmp ecx, size
           jl top        // use jl since we stop once ecx == size
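The byte-offset arithmetic in these examples can be checked with a plain C model. This sketch uses the array from the slide and assumes, as the slides do, that int is 4 bytes. The helper names load_at_offset and sum_values are hypothetical, and "do something with the array value" is filled in with a running sum purely for illustration.

```c
#include <stddef.h>
#include <string.h>

/* The array declared on the slide. */
static const int a[] = {0, 11, 15, 21, 99};

/* Models "mov eax, a[ebx]": reads the int that starts ebx BYTES into a. */
int load_at_offset(const int *base, size_t ebx)
{
    int eax;
    memcpy(&eax, (const char *)base + ebx, sizeof eax);
    return eax;
}

/* Models the loop on the slide. Like the assembly, it tests the counter
   after the body, so it assumes size is at least 1. */
int sum_values(const int *base, int size)
{
    size_t ebx = 0;     /* byte offset into the array */
    int ecx = 0;        /* element counter */
    int total = 0;

    do {
        total += load_at_offset(base, ebx);   /* mov eax, a[ebx] */
        ebx += 4;                             /* add ebx, 4 */
        ecx += 1;                             /* add ecx, 1 */
    } while (ecx < size);                     /* cmp ecx, size / jl top */

    return total;
}
```

Here load_at_offset(a, 16) returns 99 and sum_values(a, 5) returns 146. The offset advances by 4 per element while the counter advances by 1, which is why the loop keeps them in separate registers.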


Writing Assembly in a C Program

• For simplicity, we will write our code inside of C (or C++) programs
• This allows us to
  – declare variables in C/C++, thus avoiding the .data section
  – do I/O in C/C++, thus avoiding difficulties dealing with assembly input and assembly output
  – compile our programs rather than dealing with assembling them using MASM or TASM
• To include assembly code in your C/C++ program, add the following compiler directive

      __asm { }

• And place all of your assembly code between the { }
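A minimal sketch of this arrangement follows. The __asm { } block syntax is specific to Microsoft Visual C++ building 32-bit x86 code, so the example guards it with the compiler's predefined macros and falls back to plain C elsewhere; that guard and the function name add_ints are additions for illustration, not something the slides prescribe.

```c
/* Adds two ints, using inline assembly where the compiler supports it. */
int add_ints(int x, int y)
{
    int sum;
#if defined(_MSC_VER) && defined(_M_IX86)
    /* 32-bit Visual C++: C variables are referenced by name. */
    __asm {
        mov eax, x      /* int is 4 bytes, so use a 32-bit register */
        add eax, y
        mov sum, eax
    }
#else
    /* Any other compiler: the same computation in plain C. */
    sum = x + y;
#endif
    return sum;
}
```

The C side declares the variables and would handle any input and output, while the assembly only moves data between those variables and registers.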


Data Types

• One problem that might arise in using C/C++ to run our assembly code is that we might mix up data types
  – if you declare a variable to be of type int, then this is a 4-byte variable
    • moving it into a register means that you must move it into a 4-byte register (such as eax) and not a 2-byte or 1-byte register!
    • if you try to move a variable into the wrong sized register, or a register value into the wrong sized variable, you will get an “operand size conflict” syntax error message when compiling your program
  – to use ax, bx, cx, dx, declare variables to be of type short
  – to use eax, ebx, ecx, edx, declare variables to be of type int
  – also notice that char variables are 1 byte, so they should use either the upper or lower half of a register (e.g., al, ah, dl, dh)
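The size matching can be sketched with a 16-bit variable and the 16-bit register ax. The example is again guarded so that the __asm block is used only by 32-bit Visual C++ and a plain C fallback is used elsewhere. The function name swap_bytes is hypothetical, and unsigned short is assumed to be 16 bits, as on the platforms the slides target.

```c
/* Swaps the two bytes of a 16-bit value. */
unsigned short swap_bytes(unsigned short value)
{
    unsigned short result;
#if defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        mov ax, value    /* 2-byte variable goes into a 2-byte register */
        xchg al, ah      /* exchange the low and high halves of ax */
        mov result, ax
    }
#else
    /* Portable equivalent: move each byte to the other position. */
    result = (unsigned short)((value << 8) | (value >> 8));
#endif
    return result;
}
```

Loading value into eax instead of ax in the __asm block would mismatch the operand sizes, which is the situation the “operand size conflict” error reports.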