30-Sep-01 94.201 - Fall 2001: copyright ©T. Pearce, D. Hutchinson, L. Marshall Sept. 2001 1
Assembly Language / Development Process
Problem: must convert ideas (human thoughts) into an executing program (binary image in memory)
Need: DEVELOPMENT PROCESS
• people-friendly way to write programs
• tools to support conversion to binary image
assembly language: used by people to describe programs
• syntax: set of symbols + grammar rules for constructing statements using symbols
• semantics: what is meant by statements; ultimately: the binary image
Assembly Language / Development Process
assembler: program that converts programs from assembly language to object format
• object format: an intermediate format
– mostly binary, but may include other info
linker: program that combines object files to create an “executable” file
loader: loads executable files into memory, and may initialize some registers (e.g. IP)
Assembler, linker and loader are tools.
Development Process
[Figure: tool chain from source file to executing program]
• Editor: people work here; produces the .ASM file
• Assembler: reads .ASM; produces .OBJ and .LST (human readable results, including assembly errors)
• Linker: may link multiple OBJ files; produces .EXE
• Loader: is part of operating system (or possibly debugger); loads the program into the Computer System (memory, processor IP)
• simulator loads .OBJ files directly
p86 Assembly Language
people create .ASM files using assembly language
syntax must account for all aspects of a program and development process:
• constant values
• reserve memory to use for variables
• write instructions: operations & operands
– addressing modes
• directives to tools in development process
p86 Assembly Language: Constants
binary value:
• consists of only 0’s and 1’s
• ends with ‘B’ or ‘b’, e.g. 10101110b
hexadecimal value:
• starts with 0 .. 9
• may include 0 .. 9, A .. F (a .. f)
• ends with ‘H’ or ‘h’, e.g. 0FFH (8-bit hex value)
decimal value:
• default format – no “qualifier” extension
• consists of digits in 0 .. 9, e.g. 12345
string: sequence of characters encoded as 7-bit ASCII bytes
• enclosed in single quotes, e.g. ‘Hi Mom’ (6 bytes)
character: string with length = 1
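The numeric constant formats above can be illustrated with a small Python sketch. The function name `parse_constant` is hypothetical and is not part of any p86 tool; it only mirrors the three rules listed on the slide and ignores strings.

```python
import re

def parse_constant(text: str) -> int:
    """Return the value of a p86-style numeric constant (illustrative only)."""
    t = text.strip()
    if re.fullmatch(r"[01]+[Bb]", t):               # binary: only 0/1, ends with B or b
        return int(t[:-1], 2)
    if re.fullmatch(r"[0-9][0-9A-Fa-f]*[Hh]", t):   # hex: starts with a digit, ends with H or h
        return int(t[:-1], 16)
    if re.fullmatch(r"[0-9]+", t):                  # decimal: default, no suffix
        return int(t)
    raise ValueError(f"not a valid numeric constant: {text!r}")

print(parse_constant("10101110b"))  # 174
print(parse_constant("0FFH"))       # 255
print(parse_constant("12345"))      # 12345
```

A value such as `FFH` is rejected by this sketch because a hexadecimal constant must start with a digit; that rule is what lets a constant be told apart from a name.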
p86 Assembly Language: Labels
user-defined names – represent addresses
lets programmer refer to addresses using logical names
– no need for concern with exact hexadecimal values
leave assembler to:
• decide exact addresses to use
• deal with hexadecimal addresses
labels are used to identify addresses for:
• control flow – identify address of target
• memory variables – identify address where data is stored
labels serve in 2 roles:
• label definition and label reference
p86 Assembly Language: Label Definition
used by assembler to decide exact address
must be first non-blank text on a line
user-defined name followed by “:”
name must start with alpha A .. Z a .. z, then contain: alpha, numeric, ‘_’
• e.g. Continue: L8R: Out_2_Lunch:
cannot redefine reserved words
• e.g. MOV: is illegal
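The naming rules above can be checked mechanically. The following Python sketch is an illustration only: the function name and the `RESERVED` set are hypothetical, the reserved-word list is deliberately partial, and treating reserved words as case-insensitive is an assumption.

```python
import re

# Partial, illustrative list of reserved words (assumption, not the full p86 set).
RESERVED = {"MOV", "JMP", "JE", "CMP", "DB", "DW", "END", "EQU", "ORG"}

def is_valid_label_definition(token: str) -> bool:
    """Check one token against the label definition rules on the slide."""
    if not token.endswith(":"):          # a definition is a name followed by ":"
        return False
    name = token[:-1]
    # must start with a letter, then letters, digits or underscore
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        return False
    return name.upper() not in RESERVED  # cannot redefine reserved words

for token in ["Continue:", "L8R:", "Out_2_Lunch:", "MOV:", "8Ball:"]:
    print(token, is_valid_label_definition(token))
```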
p86 Assembly Language: Label Address/Reference
label represents address of first allocated byte after definition
• e.g. DoThis: MOV AX, [ BX ]
• DoThis represents address of first byte of the MOV instruction
label reference: use of label in an operand
• refers to address assigned by assembler (no “:”)
control flow example (assume CX contains loop counter):
DoWhile:
        CMP CX, 0
        JE DoneDoWhile
        …
        JMP DoWhile
DoneDoWhile:
        MOV AX, . . . (etc.)
p86 Assembly Language: Memory Declarations
reserve memory for variables
2 sizes:
• DB reserves a byte of memory
• DW reserves a word (2 bytes) of memory
may also provide an (optional) initialization value as an operand
Examples:
        DB        ; reserves one byte
X:      DB        ; reserves one byte – label X is
                  ; defined to represent the address
                  ; of the byte
Y:      DB 3      ; reserve one byte – label Y etc.
                  ; and initialize the byte to 3
p86 Assembly Language: Memory Declarations
Examples (continued):
        DW        ; reserve 2 consecutive bytes
Z:      DW        ; reserves 2 bytes – label Z is defined to represent the
                  ; address of the first byte
W:      DW 256    ; reserve 2 bytes – label W etc., and initialize the bytes
                  ; to 256 decimal (little endian !!!)
HUH:    DW W      ; reserve 2 bytes – label HUH etc., and initialize the
                  ; bytes to contain the address of the variable W above
A:
B:
        DB ‘C’    ; reserves 1 byte – labels A and B both represent the
                  ; address of the byte, and initializes it to 43H
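The "little endian" remark on `W: DW 256` means the low byte is stored at the lower address. A minimal Python sketch of the byte layout follows; the helper names `dw_bytes` and `db_bytes` are hypothetical and exist only for this illustration.

```python
import struct

def dw_bytes(value: int) -> bytes:
    """Bytes stored by DW: 16-bit value, low byte first (little endian)."""
    return struct.pack("<H", value & 0xFFFF)

def db_bytes(value) -> bytes:
    """Bytes stored by DB: one byte per character of a string, else one byte."""
    if isinstance(value, str):
        return value.encode("ascii")
    return bytes([value & 0xFF])

print(dw_bytes(256).hex())   # 0001 -> low byte 00H at the first address, high byte 01H next
print(db_bytes("C").hex())   # 43
print(db_bytes(3).hex())     # 03
```

Read as a number, 256 is 0100H, yet memory holds 00H followed by 01H.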
p86 Assembly Language: Instructions
complete instruction must be on one line
instruction: mnemonic & operands
operands:
• immediate: constant
– can use label reference, example:
W:      DW
        . . .
        MOV BX, W
• register: register name
• direct: [ address ]
– state address as a label reference, example:
W:      DW
        . . .
        MOV AX, [ W ]
p86 Assembly Language: Instructions
Operands (continued)
• register indirect: [ register ]
– example
W:      DW
        . . .
        MOV BX, W
        MOV AX, [ BX ]
• relative-offset (control flow)
– use label reference to identify target address
– assembler calculates actual offset
        JMP There
        . . .
There:
        etc.
p86 Assembly Language: Tool Directives
statements that are intended to help tools
are not assembled directly into instructions or memory declarations
END Directive: directive to 2 tools: assembler and loader
directs assembler to stop reading from .ASM file
• any subsequent statements are ignored
operand: must be a label reference
• interpreted as specifying the address of the first instruction to be executed
• directs loader to load specified address into IP after loading .OBJ file
Syntax: END label-reference
p86 Assembly Language: 2 Pass Assembly Process
each pass: processes all statements in .ASM file sequentially from start to finish
1st Pass: for each statement:
1. check syntax
2. allocate any memory needed for image
– memory declaration (DB, DW)
– instruction: opcode + operands
3. if includes a label definition: assign value to label and keep record of (label, value) association in Symbol Table
if syntax errors in 1st pass, then write errors to .LST file and stop, else ....
2nd Pass: build binary for each statement:
• may require calculating offsets – may result in errors
– e.g. trying to jump too far for a conditional jump (target out of range)
• write results to .LST file
• if no errors – write results to .OBJ file
p86 Assembly Language: 1st Pass in Detail
Memory Allocation:
assumes allocation starts at address 0000H
uses location counter ($) to track address of next byte to allocate
as bytes are allocated, adjust $ value
for each memory declaration (DB, DW):
• allocate # bytes declared, for example: suppose $ = 0006H and next line of program is: DW
– will allocate 2 bytes for DW statement:
» use address 0006H for low byte
» use address 0007H for high byte
– after allocation: $ = 0008H (next byte to allocate)
• if next line of program is: DB ‘Hi Mom’
– will allocate 6 bytes for DB statement
» use address 0008H for ‘H’, …, 000DH for ‘m’
– after allocation: $ = 000EH
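The location counter arithmetic in this example can be reproduced with a few lines of Python. This is a sketch under the slide's stated sizes; `declaration_size` and `advance` are hypothetical helper names, not part of the assembler.

```python
def declaration_size(directive: str, operand=None) -> int:
    """Bytes reserved by one memory declaration, following the slide."""
    d = directive.upper()
    if d == "DW":
        return 2
    if d == "DB":
        # a quoted string reserves one byte per character; otherwise one byte
        return len(operand) if isinstance(operand, str) else 1
    raise ValueError(f"not a memory declaration: {directive}")

def advance(loc: int, directive: str, operand=None) -> int:
    """Return the location counter value after allocating the declaration."""
    return loc + declaration_size(directive, operand)

loc = 0x0006
loc = advance(loc, "DW")            # uses 0006H and 0007H
after_dw = loc
loc = advance(loc, "DB", "Hi Mom")  # uses 0008H .. 000DH
print(f"{after_dw:04X}H {loc:04X}H")  # 0008H 000EH
```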
p86 Assembly Language: 1st Pass in Detail
for each instruction:
• allocate bytes needed to encode opcode and operands
• example: suppose $ = 0006H when next line of program is: MOV CL, BL
– requires 2 bytes to encode instruction: opcode, CL dest, BL source
– after allocation: $ = 0008H
• if next line of program is: MOV BX, 1
– requires 4 bytes to encode
» 2 bytes: opcode, BX dest, imm as src
» 2 bytes: imm src value
– after allocation: $ = 000CH
when a label definition is encountered:
• value of label = $
– value of label is the address of the next byte allocated to the program
• save (label, $-value) in Symbol Table for recall during 2nd pass
Note: instruction encoding details later
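The bookkeeping described above (advance $ by each statement's size, record $ when a label is defined) can be sketched in Python. The representation of a statement as a `(label, size)` pair and the label name `Next` are assumptions made for this illustration; a real first pass would derive the sizes from the instruction encodings.

```python
def first_pass(statements, start=0x0000):
    """Build a symbol table from (label_or_None, size_in_bytes) pairs.

    Returns (symbol_table, final_location_counter).
    """
    loc = start
    symbols = {}
    for label, size in statements:
        if label is not None:
            symbols[label] = loc   # value of label = $ at the point of definition
        loc += size                # allocate the bytes for this statement
    return symbols, loc

# Slide example: $ = 0006H, then MOV CL, BL (2 bytes) and MOV BX, 1 (4 bytes),
# followed by a hypothetical label definition "Next:".
program = [(None, 2), (None, 4), ("Next", 0)]
symbols, end = first_pass(program, start=0x0006)
print({name: f"{addr:04X}H" for name, addr in symbols.items()})  # {'Next': '000CH'}
```

A label line on its own allocates nothing, so it is modelled here with size 0.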
p86 Assembly Language: 2nd Pass in Detail
build image of bytes to be loaded at addresses
memory declaration: initialization value?
• if no init value – use default?
instructions: opcodes and operands
• addressing mode info
• label references in operands: look up values to use for label in Symbol Table
• for relative-offsets: offset = label-value – $
results to .LST file
• errors
• image created (none if errors!)
.OBJ file contains image in format that can be loaded by simulator
• includes info about “start address” – to initialize IP during loading
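The relative-offset rule can be checked against the jumps in the ASSIGN2.LST example in these slides. The Python sketch below assumes that $ in the formula is the address of the byte following the jump instruction, which is the reading that matches the listed bytes; the function name `relative_offset` is hypothetical.

```python
def relative_offset(target: int, next_addr: int, width_bytes: int) -> bytes:
    """Encode offset = label-value - $ as a signed little-endian field.

    next_addr is $ after the jump instruction has been allocated (assumption).
    Raises ValueError when the target is out of range for the field width.
    """
    offset = target - next_addr
    bits = 8 * width_bytes
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= offset <= highest:
        raise ValueError("target out of range")
    # two's complement encoding of the signed offset
    return (offset & ((1 << bits) - 1)).to_bytes(width_bytes, "little")

print(relative_offset(0x001C, 0x0016, 1).hex())  # 06   (jc Got1, listed as 7206)
print(relative_offset(0x0012, 0x0025, 1).hex())  # ed   (jnz Outloop, listed as 75ED)
print(relative_offset(0x001F, 0x001C, 2).hex())  # 0300 (jmp Dump, listed as E90300)
```

A backward jump gives a negative offset, which is why `jnz Outloop` carries EDH (that is, -19). A conditional jump with a one-byte offset cannot reach a target more than 127 bytes ahead, which is the "target out of range" error mentioned for the 2nd pass.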
p86 Assembly Language: HI.LST Example
$     binary image
0000                start:
0000  C7C2E904      mov dx, 04E9H
0004  C6C048        mov al, 'H'
0007  EE            out [dx], al
0008  C6C069        mov al, 'i'
000B  EE            out [dx], al
000C  F4            hlt
                    end start
After 1st Pass:
• syntax OK
• bytes have been allocated to statements
• Symbol Table constructed:
Symbol    Value
start     0000H
After 2nd Pass: binary image constructed
Note: label definition: in 1st pass – put value in Symbol Table
p86 Assembly Language: ASSIGN2.LST Example
 34  0000            Start:
 35  0000  C7C4FEFF  mov sp, 0FFFEH
 37  0004  C7C2E904  mov dx, 04E9H
 43  0008  C6C510    mov ch, 16
 44  000B  C6C101    mov cl, 1
 45  000E  8B1EA300  mov bx, [Value]
 47  0012            Outloop:
 48  0012  D3E3      shl bx, cl
 49  0014  7206      jc Got1
 50  0016            Got0:
 51  0016  C6C030    mov al, '0'
 52  0019  E90300    jmp Dump
 54  001C            Got1:
 55  001C  C6C031    mov al, '1'
 57  001F            Dump:
 58  001F  EE        out [dx], al
p86 Assembly Language: ASSIGN2.LST Example
 60  0020  80ED01    sub ch, 1
 61  0023  75ED      jnz Outloop
Symbol        Value
Start         0000H
Outloop       0012H
Got0          0016H
Got1          001CH
Dump          001FH
316  009B  0A00      Ten: dw 10
317  009D  6400      Hundred: dw 100
318  009F  E803      Thousand: dw 1000
319  00A1  1027      TenThousand: dw 10000
321  00A3  AF90      Value: dw 090AFH
Symbol        Value
Ten           009BH
Hundred       009DH
Thousand      009FH
TenThousand   00A1H
Value         00A3H
Note: Do not confuse $ with IP
• $ = artifact of assembler
• IP = artifact of processor
p86 Assembly Language: EQU Directive
allows declaration of symbolic constants
improves (human) readability
reduces the use of “magic numbers”
Syntax: symbolic-name EQU numeric-constant
Note: No “:” after the symbolic name
1st Pass: records (symbolic-name, constant) in symbol table
2nd Pass: replaces every occurrence of the symbolic-name with the specified constant
• is NOT allocated any bytes !!!
p86 Assembly Language: EQU Example
DisplayPort EQU 04E9H          ; magic number
        MOV DX , DisplayPort   ; improved readability
is assembled to same encoding as
        MOV DX , 04E9H
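The two-pass handling of EQU (record in the 1st pass, substitute in the 2nd) can be sketched as text processing in Python. This is a simplification for illustration: `collect_equates` and `substitute` are hypothetical names, and a real assembler would work on parsed operands rather than raw strings.

```python
def collect_equates(lines):
    """1st pass: record (symbolic-name, constant) for every EQU line."""
    table = {}
    for line in lines:
        fields = line.split(";", 1)[0].split()   # drop the comment, split on blanks
        if len(fields) == 3 and fields[1].upper() == "EQU":
            table[fields[0]] = fields[2]          # no bytes are allocated for EQU
    return table

def substitute(operand: str, table) -> str:
    """2nd pass: replace a symbolic name with its constant, if it has one."""
    return table.get(operand, operand)

source = ["DisplayPort EQU 04E9H ; magic number",
          "MOV DX , DisplayPort"]
equates = collect_equates(source)
print(substitute("DisplayPort", equates))  # 04E9H
print(substitute("DX", equates))           # DX
```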
p86 Assembly Language: ORG Directive
allows specification of the address of the next byte to be allocated
improves (human) readability
reduces the use of “magic numbers”
Syntax: ORG numeric-constant
Assembler assigns location counter the value of the specified numeric constant
p86 Assembly Language: ORG Example
        …
        JMP DoAtF000H     ; suppose this instruction was
                          ; assembled to bytes starting at 0120H
        ORG 0F000H
DoAtF000H:
        MOV BX , . . .    ; MOV instruction will be assembled
                          ; to bytes starting at address F000H
Must always increase location counter (never decrease)
Typical use:
• Force assembly to specific address range
• Resulting code will reside in particular type of memory located at that address (e.g. ROM)
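The effect of ORG on the location counter, including the rule that it may only move forward, can be modelled with a small Python class. The class name `LocationCounter` and its methods are hypothetical, and the 3-byte size used for the JMP is an assumption taken from the `jmp Dump` entry (E90300) in the ASSIGN2.LST example.

```python
class LocationCounter:
    """Minimal model of the assembler's location counter ($)."""

    def __init__(self, start=0x0000):
        self.value = start

    def allocate(self, nbytes: int) -> int:
        """Reserve nbytes and return the address of the first byte."""
        address = self.value
        self.value += nbytes
        return address

    def org(self, address: int) -> None:
        """ORG: set $ to address; it must never decrease."""
        if address < self.value:
            raise ValueError("ORG must not decrease the location counter")
        self.value = address

lc = LocationCounter(0x0120)
jmp_addr = lc.allocate(3)   # JMP DoAtF000H, assumed to take 3 bytes
lc.org(0xF000)              # ORG 0F000H
label_value = lc.value      # DoAtF000H is defined here
print(f"{jmp_addr:04X}H {label_value:04X}H")  # 0120H F000H
```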