1

IKI10230 Pengantar Organisasi Komputer (Introduction to Computer Organization)

Lecture 09: Compiling-Assembling-Linking

Sources:
1. Paul Carter, PC Assembly Language
2. Hamacher, Computer Organization, 5th ed.
3. Lecture materials from CS61C/2000 & CS152/1997, UCB

21 April 2004

L. Yohanes Stefanus, Bobby Nazief

Course materials: http://www.cs.ui.ac.id/kuliah/POK/



2

Steps to Starting a Program

C program: foo.c
  → Compiler →
Assembly program: foo.s
  → Assembler →
Object (machine language module): foo.o
  → Linker (also takes lib.o) →
Executable (machine language program): foo.exe
  → Loader →
Memory

3

Example: C → Asm → Obj → Exe → Run

#include <stdio.h>

int main (int argc, char *argv[]) {
    int i;
    int sum = 0;

    for (i = 0; i <= 100; i = i + 1)
        sum = sum + i * i;
    printf ("The sum from 0 .. 100 is %d\n", sum);
}

4

Compiler

° Input: High-Level Language Code (e.g., C, Java)

° Output: Assembly Language Code (e.g., Intel x86)

° Note: Output may contain directives & pseudoinstructions

5

Example: C → Asm → Obj → Exe → Run

segment .text
LC0:    db "The sum from 0 .. 100 is %d",0xa,0
_main:
        push ebp
        mov ebp,esp
        sub esp,24
        mov dword [ebp-8],0
        mov dword [ebp-4],0
L3:
        cmp dword [ebp-4],100
        jle L6
        jmp L4
L6:
        mov eax,[ebp-4]
        imul eax,[ebp-4]
        add [ebp-8],eax
L5:
        inc dword [ebp-4]
        jmp L3
L4:
        add esp,-8
        mov eax,[ebp-8]
        push eax
        push dword LC0
        call _printf
        add esp,16
L2:
        mov esp,ebp
        pop ebp
        ret

6

Where Are We Now?

C program: foo.c
  → Compiler →
Assembly program: foo.s
  → Assembler →
Object (machine language module): foo.o
  → Linker (also takes lib.o) →
Executable (machine language program): a.out
  → Loader →
Memory

7

Assembler

° Reads and Uses Directives

° Replaces Pseudoinstructions

° Produces Machine Language

° Creates Object File

8

Producing Machine Language

° Simple Case
  • Arithmetic, Logical, Shifts, and so on.
  • All necessary info is within the instruction already.

° What about Branches?
  • PC-Relative
  • So once pseudoinstructions are replaced by real ones, we know by how many instructions to branch.

° What about jumps?
  • Some require absolute address.

° What about references to data?
  • These will require the full 32-bit address of the data.

° Addresses can’t be determined yet, so we create two tables…
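The branch case can be checked numerically. What follows is a minimal sketch in C, not part of the original slides; the function name `pc_relative_disp` is made up for illustration. It assumes the x86 convention used in this lecture's listings, where the displacement is counted in bytes from the address of the next instruction (counting in instructions, as the bullet puts it, fits fixed-length instruction sets).

```c
#include <stdint.h>

/* Displacement stored in a PC-relative jump: the distance in bytes
 * from the address of the NEXT instruction to the target.
 * insn_addr and insn_len are the jump's own address and encoded length.
 * The subtraction is done in unsigned arithmetic and then reinterpreted
 * as signed, which is what mainstream compilers do for this cast. */
int32_t pc_relative_disp(uint32_t insn_addr, uint32_t insn_len,
                         uint32_t target)
{
    return (int32_t)(target - (insn_addr + insn_len));
}
```

With offsets taken from this lecture's object-file listing and hex dump, `pc_relative_disp(0x3b, 2, 0x42)` gives 5 (the `jle`), `pc_relative_disp(0x3d, 5, 0x54)` gives 0x12 (the forward `jmp`), and `pc_relative_disp(0x4f, 5, 0x34)` gives -0x20, which is stored as 0xffffffe0 (the backward `jmp`).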

9

Symbol Table

° List of “items” in this file that may be used by other files.

° What are they?
  • Labels: function calling
  • Data: anything in the .data section; variables which may be accessed across files

° First Pass: record label-address pairs

° Second Pass: produce machine code
  • Result: can jump to a later label without first declaring it
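The two passes can be sketched in C as follows. This is an illustration under stated assumptions, not the assembler's actual implementation: the `struct line` representation (each source line already reduced to an optional label, a size in bytes, and an optional jump target) and the names `first_pass` and `second_pass` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One already-sized source line (hypothetical representation). */
struct line {
    const char *label;   /* label defined on this line, or NULL */
    uint32_t    size;    /* number of bytes the line assembles to */
    const char *target;  /* label this line jumps to, or NULL */
};

struct symbol {
    const char *name;
    uint32_t    addr;
};

/* First pass: record label-address pairs. Returns the symbol count. */
size_t first_pass(const struct line *src, size_t n, uint32_t start,
                  struct symbol *tab, size_t cap)
{
    uint32_t addr = start;
    size_t nsym = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i].label != NULL && nsym < cap) {
            tab[nsym].name = src[i].label;
            tab[nsym].addr = addr;
            nsym++;
        }
        addr += src[i].size;
    }
    return nsym;
}

/* Second pass: look up every jump target in the finished table.
 * resolved[i] receives the target address of line i (0 if none).
 * Returns how many targets were not found in this file. */
size_t second_pass(const struct line *src, size_t n,
                   const struct symbol *tab, size_t nsym,
                   uint32_t *resolved)
{
    size_t missing = 0;
    for (size_t i = 0; i < n; i++) {
        resolved[i] = 0;
        if (src[i].target == NULL)
            continue;
        size_t s = 0;
        while (s < nsym && strcmp(tab[s].name, src[i].target) != 0)
            s++;
        if (s < nsym)
            resolved[i] = tab[s].addr;
        else
            missing++;   /* left for the linker to resolve */
    }
    return missing;
}
```

Because the table is complete before the second pass starts, a line such as `jle L6` resolves even though `L6` is defined further down. A target that is not defined in the file at all (for example `_printf`) is counted as missing; in a real assembler it would be left for the linker.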

10

Relocation Table

° List of “items” for which this file needs the address.

° What are they?
  • Any label jumped to: jmp or call
    - internal
    - external (including lib files)
  • Any piece of data

11

Object File Format

° object file header: size and position of the other pieces of the object file

° text segment: the machine code

° data segment: binary representation of the data in the source file

° relocation information: identifies lines of code that need to be “handled”

° symbol table: list of this file’s labels and data that can be referenced

° debugging information

12

Example: C → Asm → Obj → Exe → Run

segment .text
0x0:
        db "The sum from 0 .. 100 is %d",0xa,0
0x1d:
        push ebp
        mov ebp,esp
        sub esp,24
        mov dword [ebp-8],0
        mov dword [ebp-4],0
0x34:
        cmp dword [ebp-4],100
        jle 0x05 (0x42)
        jmp 0x00000012 (0x54)
0x42:
        mov eax,[ebp-4]
        imul eax,[ebp-4]
        add [ebp-8],eax
0x4c:
        inc dword [ebp-4]
        jmp 0xffffffe0 (0x34)
0x54:
        add esp,-8
        mov eax,[ebp-8]
        push eax
        push 0x0
        call 0x0
        add esp,16
0x6e:
        mov esp,ebp
        pop ebp
        ret

13

Symbol Table Entries

° Symbol Table
  Label   Address
  LC0:    0x00000000
  main:   0x0000001d
  L3:     0x00000034
  L6:     0x00000042
  L5:     0x0000004c
  L4:     0x00000054
  L2:     0x0000006e

° Relocation Information
  Offset      Type    Value
  0x0000005f  dir32   .text    (LC0: offset 0 of .text segment)
  0x00000064  DISP32  _printf

14

Where Are We Now?

C program: foo.c
  → Compiler →
Assembly program: foo.s
  → Assembler →
Object (machine language module): foo.o
  → Linker (also takes lib.o) →
Executable (machine language program): a.out
  → Loader →
Memory

15

Link Editor/Linker

° Step 1: Take text segment from each .o file and put them together.

° Step 2: Take data segment from each .o file, put them together, and concatenate this onto end of text segments.

° Step 3: Resolve References
  • Go through Relocation Table and handle each entry
  • That is, fill in all absolute addresses
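Steps 1 and 2 amount to assigning each segment a final address. A minimal sketch in C, assuming a hypothetical `struct module` that carries only segment sizes and ignoring the alignment and padding a real linker may insert:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-object-file record: sizes in, base addresses out. */
struct module {
    uint32_t text_size;
    uint32_t data_size;
    uint32_t text_base;   /* filled in by layout() */
    uint32_t data_base;   /* filled in by layout() */
};

/* Place all text segments back to back starting at 'start', then all
 * data segments right after the last text segment.
 * Returns the total number of bytes laid out. */
uint32_t layout(struct module *m, size_t n, uint32_t start)
{
    uint32_t addr = start;
    for (size_t i = 0; i < n; i++) {      /* Step 1: text segments */
        m[i].text_base = addr;
        addr += m[i].text_size;
    }
    for (size_t i = 0; i < n; i++) {      /* Step 2: data segments */
        m[i].data_base = addr;
        addr += m[i].data_size;
    }
    return addr - start;
}
```

If the modules placed before foo.o have text segments totalling 0x15c0 bytes (a size assumed here only so that the numbers line up with the example), foo.o's 0x72 bytes of text start at 0x15c0. Once every base address is known, Step 3 has what it needs to fill in the relocation entries.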

16

Four Types of Addresses

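° PC-Relative Addressing (beq, bne on MIPS; jle and the relative jmp in this lecture's x86 example): never relocate

° Absolute Address (j, jal on MIPS; any x86 jump or call that encodes an absolute target): always relocate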

° External Reference (usually call): always relocate

° Data Reference: always relocate

17

Resolving References

° Linker assumes first word of first text segment is at address 0x00000000.

° Linker knows:
  • length of each text and data segment
  • ordering of text and data segments

° Linker calculates:
  • absolute address of each label to be jumped to (internal or external) and each piece of data being referenced

° To resolve references:
  • search for reference (data or label) in all symbol tables
  • if not found, search library files (for example, for printf)
  • once absolute address is determined, fill in the machine code appropriately

° Output of linker: executable file containing text and data (plus header)
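The "fill in the machine code" step can be sketched for the two relocation types that appear in this lecture's relocation table, dir32 and DISP32. This is a simplified illustration in C, not a real linker: the names `apply_reloc`, `get32`, and `put32` are hypothetical, and the sketch assumes the value already stored at the site is an addend to be kept, with DISP32 measured from the byte after the 4-byte field.

```c
#include <stdint.h>

/* Read and write a 32-bit little-endian field inside a byte image. */
static uint32_t get32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void put32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

enum reloc_type { RELOC_DIR32, RELOC_DISP32 };

/* Patch one relocation site.
 *   image    : the module's text bytes
 *   base     : final address of image[0]
 *   offset   : offset of the 4-byte field inside the module
 *   sym_addr : final address of the referenced symbol */
void apply_reloc(uint8_t *image, uint32_t base, uint32_t offset,
                 enum reloc_type type, uint32_t sym_addr)
{
    uint8_t *site = image + offset;
    uint32_t addend = get32(site);

    if (type == RELOC_DIR32) {
        /* absolute address: symbol address plus stored addend */
        put32(site, sym_addr + addend);
    } else {
        /* PC-relative: target minus address of the next instruction */
        put32(site, sym_addr + addend - (base + offset + 4));
    }
}
```

With foo.o's text placed at 0x15c0, patching offset 0x5f as dir32 against .text (address 0x15c0) stores 0x000015c0, and patching offset 0x64 as DISP32 against _printf at 0x2da0 stores 0x2da0 - 0x1628 = 0x00001778, the same values that appear in the linked listing.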

18

Example: C → Asm → Obj → Exe → Run

segment .text
0x15c0:
        db "The sum from 0 .. 100 is %d",0xa,0
0x15dd:
        push ebp
        mov ebp,esp
        sub esp,24
        mov dword [ebp-8],0
        mov dword [ebp-4],0
0x15f4:
        cmp dword [ebp-4],100
        jle 0x05 (0x1602)
        jmp 0x00000012 (0x1614)
0x1602:
        mov eax,[ebp-4]
        imul eax,[ebp-4]
        add [ebp-8],eax
0x160c:
        inc dword [ebp-4]
        jmp 0xffffffe0 (0x15f4)
0x1614:
        add esp,-8
        mov eax,[ebp-8]
        push eax
        push 0x000015c0
        call 0x00001778 (0x2da0)*
        add esp,16
0x162e:
        mov esp,ebp
        pop ebp
        ret

* 0x1628 + 0x1778 = 0x2da0 (0x1628 is the address of the instruction after the call)

19

Memory Map of the .EXE

Addresses                Contents
00000000 ...             .text: other objects
000015C0 - 00001631      .text: foo.o
...                      .text: other objects (..., _printf, ...)
0000B000 ... 0000BB04    .data

20

Where Are We Now?

C program: foo.c
  → Compiler →
Assembly program: foo.s
  → Assembler →
Object (machine language module): foo.o
  → Linker (also takes lib.o) →
Executable (machine language program): a.out
  → Loader →
Memory

21

Loader (1/3)

° Executable files are stored on disk.

° When one is run, loader’s job is to load it into memory and start it running.

° In reality, loader is the operating system (OS)
  • loading is one of the OS tasks

22

Loader (2/3)

° So what does a loader do?

° Reads executable file’s header to determine size of text and data segments

° Creates new address space for program large enough to hold text and data segments, along with a stack segment

° Copies instructions and data from executable file into the new address space (this may be anywhere in memory)

23

Loader (3/3)

° Copies arguments passed to the program onto the stack

° Initializes machine registers
  • Most registers cleared, but stack pointer assigned address of 1st free stack location

° Jumps to start-up routine that copies program’s arguments from stack to registers and sets the PC
  • If main routine returns, start-up routine terminates program with the exit system call
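The first loader steps (read the header, create an address space, copy text and data, set the stack pointer) can be modelled in a few lines of C. This is only a toy sketch under loud assumptions: the header layout `struct exe_header`, the `struct address_space` model, and the function `load_image` are all hypothetical, the stack is assumed to grow downward from the top of the space, and argument passing and the start-up routine are not modelled.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical executable header: just the two segment sizes. */
struct exe_header {
    uint32_t text_size;
    uint32_t data_size;
};

/* Hypothetical model of a freshly created address space. */
struct address_space {
    uint8_t *mem;    /* text, then data, then stack */
    uint32_t size;   /* total bytes */
    uint32_t sp;     /* initial stack pointer, as an offset into mem */
};

/* 'file' points at a header followed by the text and data bytes.
 * Returns 0 on success, -1 if memory could not be allocated. */
int load_image(const uint8_t *file, uint32_t stack_size,
               struct address_space *as)
{
    struct exe_header h;
    memcpy(&h, file, sizeof h);              /* read the header */

    /* space for text and data segments plus a stack segment */
    as->size = h.text_size + h.data_size + stack_size;
    as->mem = calloc(as->size ? as->size : 1, 1);
    if (as->mem == NULL)
        return -1;

    /* copy instructions and data from the file into the new space */
    memcpy(as->mem, file + sizeof h, h.text_size + h.data_size);

    /* assumed convention: stack starts at the top and grows down */
    as->sp = as->size;
    return 0;
}
```

A real loader is part of the operating system and works with the actual executable format and virtual memory; the sketch only mirrors the order of the steps listed above.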

24

Example: C → Asm → Obj → Exe → Run

0x000015c0: 0x20656854 0x206d7573 0x6d6f7266 0x2e203020
0x000015d0: 0x3031202e 0x73692030 0x0a642520 0xe5895500
0x000015e0: 0x0018ec81 0x45c70000 0x000000f8 0xfc45c700
0x000015f0: 0x00000000 0x64fc7d81 0x7e000000 0x0012e905
0x00001600: 0x458b0000 0x45af0ffc 0xf84501fc 0xe9fc45ff
0x00001610: 0xffffffe0 0xfff8c481 0x458bffff 0xc06850f8
0x00001620: 0xe8000015 0x00001778 0x0010c481 0xec890000
0x00001630: 0x0000c35d

0x000015c0: 54 68 65 20 73 75 6d 20 66 72 6f 6d 20 30 20 2e
            T  h  e     s  u  m     f  r  o  m     0     .

000015dd: 55                push ebp
000015de: 89e5              mov ebp,esp
000015e0: 81ec18000000      sub esp,0x18
000015e6: c745f800000000    mov dword [ebp-8],0
000015ed: c745fc00000000    mov dword [ebp-4],0
000015f4: 817dfc64000000    cmp dword [ebp-4],0x64
000015fb: 7e05              jle 0x1602
000015fd: e912000000        jmp 0x1614
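The word dump and the byte dump show the same memory: the x86 is little-endian, so the word 0x20656854 sits in memory as the bytes 54 68 65 20 ("The "). A small sketch in C makes the conversion explicit; the function name `words_to_bytes` is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Split 32-bit words into bytes in little-endian order (least
 * significant byte first), the order in which x86 stores them.
 * 'out' must have room for 4 * n bytes. */
void words_to_bytes(const uint32_t *words, size_t n, uint8_t *out)
{
    for (size_t i = 0; i < n; i++)
        for (int b = 0; b < 4; b++)
            out[4 * i + b] = (uint8_t)(words[i] >> (8 * b));
}
```

Applied to the first row of the dump (0x20656854 0x206d7573 0x6d6f7266 0x2e203020), this yields the 16 bytes that spell "The sum from 0 .".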

25

.ASM, .O, & .EXE

(COFF FORMAT)

26

Example: C → Asm → Obj → Exe → Run

.text
LC0:
        .ascii "The sum from 0 .. 100 is %d\12\0"
main:
        pushl %ebp
        movl %esp,%ebp
        subl $24,%esp
        movl $0,-8(%ebp)
        movl $0,-4(%ebp)
L3:
        cmpl $100,-4(%ebp)
        jle L6
        jmp L4
L6:
        movl -4(%ebp),%eax
        imull -4(%ebp),%eax
        addl %eax,-8(%ebp)
L5:
        incl -4(%ebp)
        jmp L3
L4:
        addl $-8,%esp
        movl -8(%ebp),%eax
        pushl %eax
        pushl $LC0
        call _printf
        addl $16,%esp
L2:
        movl %ebp,%esp
        popl %ebp
        ret

27

Example: C → Asm → Obj → Exe → Run

.text
0x0:
        .ascii "The sum from 0 .. 100 is %d\12\0"
0x20:
        pushl %ebp
        movl %esp,%ebp
        subl $24,%esp
        movl $0,-8(%ebp)
        movl $0,-4(%ebp)
0x34:
        cmpl $100,-4(%ebp)
        jle 6 (0x40)
        jmp 0x14 (0x50)
0x40:
        movl -4(%ebp),%eax
        imull -4(%ebp),%eax
        addl %eax,-8(%ebp)
0x4a:
        incl -4(%ebp)
        jmp -0x1b (0x34)
0x50:
        addl $-8,%esp
        movl -8(%ebp),%eax
        pushl %eax
        pushl $0x0
        call 0x0 (undefined)
        addl $16,%esp
0x64:
        movl %ebp,%esp
        popl %ebp
        ret

28

Symbol Table Entries

° Symbol Table
  Label   Address
  LC0:    0x00000000
  L2:     0x00000064
  L3:     0x00000034
  L4:     0x00000050
  L5:     0x0000004a
  L6:     0x00000040
  main:   0x00000020

° Relocation Information
  Address     Instr. Type  Dependency
  0x0000005c  call         printf

29

Example: C → Asm → Obj → Exe → Run

.text
0x15c0:
        .ascii "The sum from 0 .. 100 is %d\12\0"
0x15e0:
        pushl %ebp
        movl %esp,%ebp
        subl $24,%esp
        movl $0,-8(%ebp)
        movl $0,-4(%ebp)
0x15f4:
        cmpl $100,-4(%ebp)
        jle 6 (0x1600)
        jmp 0x14 (0x1610)
0x1600:
        movl -4(%ebp),%eax
        imull -4(%ebp),%eax
        addl %eax,-8(%ebp)
0x160a:
        incl -4(%ebp)
        jmp -0x1b (0x15f4)
0x1610:
        addl $-8,%esp
        movl -8(%ebp),%eax
        pushl %eax
        pushl $0x15c0
        call 0x2d90
        addl $16,%esp
0x1624:
        movl %ebp,%esp
        popl %ebp
        ret