linking: from the object file to the executable

96
Linking: from the object file to the executable An overview of static and dynamic linking Alessandro Di Federico Politecnico di Milano April 11, 2018

Upload: others

Post on 27-Mar-2022

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Linking: from the object file to the executable

Linking: from the objectfile to the executable

An overview of static and dynamic linking

Alessandro Di Federico

Politecnico di Milano

April 11, 2018

Page 2: Linking: from the object file to the executable

Index

ELF format overview

Static linking

Dynamic linking

Advanced linking features

Page 3: Linking: from the object file to the executable

main.c

Preprocessor

Compiler

Assembler

C code

Assembly

parse.c

Preprocessor

Compiler

Assembler

C code

Assembly

libm.a

LinkerObject file

Object file

Object files

program

Executable

Page 4: Linking: from the object file to the executable

We’ll focus on theELF format [2]

Page 5: Linking: from the object file to the executable

The object file

• An object file is the final form of a translation unit• It uses the ELF format• It’s divided in several sections:

.text The actual executable code

.data Global, initialized, writeable data.bss Global, non-initialized, writeable data

.rodata Global read-only data

.symtab The symbol table

.strtab String table for the symbol table.rela.section The relocation table for section

.shstrtab String table for the section names

Page 6: Linking: from the object file to the executable

An example

#include <stdint.h>

uint32_t uninitialized;uint32_t zero_intialized = 0;const uint32_t constant = 0x41424344;const uint32_t *initialized = &constant;

uint32_t a_function () {return 42 + uninitialized;

}

Page 7: Linking: from the object file to the executable

An example

$ gcc -c example.c -o example.o -fno-PIC -fno-stack-protector

Page 8: Linking: from the object file to the executable

.text

$ objdump -d -j .text example.o

Disassembly of section .text:

0000000000000000 <a_function >:0: 55 push rbp1: 48 89 e5 mov rbp ,rsp4: 8b 05 00 00 00 00 mov eax , [rip+0x0]a: 83 c0 2a add eax ,0x2ad: 5d pop rbpe: c3 ret

Page 9: Linking: from the object file to the executable

The sections

$ readelf -S example.o

Nr Name Type Off Size ES Flg Inf0 NULL 000 000 00 01 .text PROGBITS 040 00f 00 AX 02 .rela.text RELA 298 018 18 I 13 .data PROGBITS 050 008 00 WA 04 .rela.data RELA 2b0 018 18 I 35 .bss NOBITS 058 004 00 WA 06 .rodata PROGBITS 05c 004 00 A 0

...11 .shstrtab STRTAB 0b8 066 00 012 .symtab SYMTAB 120 138 18 913 .strtab STRTAB 258 039 00 0

Page 10: Linking: from the object file to the executable

The sections

$ readelf -S example.o

Nr Name Type Off Size ES Flg Inf0 NULL 000 000 00 01 .text PROGBITS 040 00f 00 AX 02 .rela.text RELA 298 018 18 I 13 .data PROGBITS 050 008 00 WA 04 .rela.data RELA 2b0 018 18 I 35 .bss NOBITS 058 004 00 WA 06 .rodata PROGBITS 05c 004 00 A 0

...11 .shstrtab STRTAB 0b8 066 00 012 .symtab SYMTAB 120 138 18 913 .strtab STRTAB 258 039 00 0

Page 11: Linking: from the object file to the executable

The sections

$ readelf -S example.o

Nr Name Type Off Size ES Flg Inf0 NULL 000 000 00 01 .text PROGBITS 040 00f 00 AX 02 .rela.text RELA 298 018 18 I 13 .data PROGBITS 050 008 00 WA 04 .rela.data RELA 2b0 018 18 I 35 .bss NOBITS 058 004 00 WA 06 .rodata PROGBITS 05c 004 00 A 0

...11 .shstrtab STRTAB 0b8 066 00 012 .symtab SYMTAB 120 138 18 913 .strtab STRTAB 258 039 00 0

Page 12: Linking: from the object file to the executable

The sections

$ readelf -S example.o

Nr Name Type Off Size ES Flg Inf0 NULL 000 000 00 01 .text PROGBITS 040 00f 00 AX 02 .rela.text RELA 298 018 18 I 13 .data PROGBITS 050 008 00 WA 04 .rela.data RELA 2b0 018 18 I 35 .bss NOBITS 058 004 00 WA 06 .rodata PROGBITS 05c 004 00 A 0

...11 .shstrtab STRTAB 0b8 066 00 012 .symtab SYMTAB 120 138 18 913 .strtab STRTAB 258 039 00 0

Page 13: Linking: from the object file to the executable

And the uninitialized variable?

They are special, we’ll talk about them later

Page 14: Linking: from the object file to the executable

And the uninitialized variable?

They are special, we’ll talk about them later

Page 15: Linking: from the object file to the executable

The sections

$ readelf -S example.o

Nr Name Type Off Size ES Flg Inf0 NULL 000 000 00 01 .text PROGBITS 040 00f 00 AX 02 .rela.text RELA 298 018 18 I 13 .data PROGBITS 050 008 00 WA 04 .rela.data RELA 2b0 018 18 I 35 .bss NOBITS 058 004 00 WA 06 .rodata PROGBITS 05c 004 00 A 0

...11 .shstrtab STRTAB 0b8 066 00 012 .symtab SYMTAB 120 138 18 913 .strtab STRTAB 258 039 00 0

Page 16: Linking: from the object file to the executable

.text

$ objdump -d -j .text example.o

Disassembly of section .text:

0000000000000000 <a_function >:0: 55 push rbp1: 48 89 e5 mov rbp ,rsp4: 8b 05 00 00 00 00 mov eax , [rip+0x0]a: 83 c0 2a add eax ,0x2ad: 5d pop rbpe: c3 ret

Page 17: Linking: from the object file to the executable

.text

$ objdump -d -j .text example.o

Disassembly of section .text:

0000000000000000 <a_function >:0: 55 push rbp1: 48 89 e5 mov rbp ,rsp4: 8b 05 00 00 00 00 mov eax , [rip+0x0]a: 83 c0 2a add eax ,0x2ad: 5d pop rbpe: c3 ret

Page 18: Linking: from the object file to the executable

.data, .rodata and .bss

$ objdump -s -j .data -j .rodata -j .bss example.o

Contents of section .data:0000 00000000 00000000 ........

Contents of section .rodata:0000 44434241 DCBA

Page 19: Linking: from the object file to the executable

And .bss?

.bss is not stored in the file, it’s all zeros

Page 20: Linking: from the object file to the executable

And .bss?.bss is not stored in the file, it’s all zeros

Page 21: Linking: from the object file to the executable

Custom sections

You can also create your own section, in GCC:

__attribute__ ((section ("mysection")))

Page 22: Linking: from the object file to the executable

Symbols

• A symbol is a label for a piece of code or data• A symbols is composed by:

st_name the name of the symbol (stored in .strtab).st_shndx index of the containing section.st_value the symbol’s address/offset in the section.st_size the size of the represented object.st_info the symbol type and binding.

st_other the symbol visibility.

Page 23: Linking: from the object file to the executable

Undefined symbols

If st_shndx is 0, the symbol is not defined in the current TUs.If the linker can’t find it in any TUs it will complain.

Page 24: Linking: from the object file to the executable

Common symbols

If st_shndx is COM (0xfff2), it’s a common symbol

• Used for uninitialized global variables• You can have multiple definition in different TUs• They can have different size, the largest will be chosen.• It has no storage associated in any object file• It ends up in .bss

Page 25: Linking: from the object file to the executable

Symbol types

The type of the symbol identifies what it represents:STT_OBJECT a global variable.STT_FUNC a function.

STT_SECTION a section.STT_FILE a translation unit.

Page 26: Linking: from the object file to the executable

Symbol binding

The binding of a symbol, determines if and how can be used byother translation units:STB_LOCAL local to the TU (static in C terms).

STB_GLOBAL available to other TUs (extern in C terms).STB_WEAK like STB_GLOBAL, but can be overridden.

Page 27: Linking: from the object file to the executable

Symbol visibility

binding =⇒ is it available to other TUs?visibility =⇒ is it available to other modules?

• A module is an executable or a dynamic library• Visibility determines if a function is exported by the library• Executables usually ignore visibility (they export nothing)• To change the visibility in GCC:

__attribute__ ((visibility ("...")))

Page 28: Linking: from the object file to the executable

Type of visibility

There are several different types of visibility:STV_DEFAULT visible, can use an external version.STV_HIDDEN not visible, always use own version.

STV_PROTECTED visible, but use always use own version.

Page 29: Linking: from the object file to the executable

Symbol table example

$ readelf -s example.o

# Val Sz Type Bind Vis Ndx Name0 000 0 NOTYPE LOCAL DEF UND1 000 0 FILE LOCAL DEF ABS example.c2 000 0 SECTION LOCAL DEF 13 000 0 SECTION LOCAL DEF 34 000 0 SECTION LOCAL DEF 5

...9 000 4 OBJECT GLOBAL DEF COM uninitialized

10 000 4 OBJECT GLOBAL DEF 5 zero_intialized11 000 4 OBJECT GLOBAL DEF 6 constant12 000 8 OBJECT GLOBAL DEF 3 initialized13 000 15 FUNC GLOBAL DEF 1 a_function

Page 30: Linking: from the object file to the executable

Relocations

A relocation is a directive for the linker

Dear linker, write the value of a certain symbolin a certain location in a certain way

Page 31: Linking: from the object file to the executable

Relocations

A relocation is a directive for the linker

Dear linker, write the value of a certain symbolin a certain location in a certain way

Page 32: Linking: from the object file to the executable

Structure of a relocation

• Relocations are organized in relocation tables• There can be a relocation table for each section• A relocation is composed by the following fields:

r_offset where to write (as an offset in the section).r_info symbol identifier and relocation type.

r_addend value to add to the symbol’s value (optional).

Page 33: Linking: from the object file to the executable

Relocation types

• Part of r_info, specify how to write the symbol’s value• There are several, architecture-specific, relocation types:

R_X86_64_64 full symbol value (64 bit).R_X86_64_PC32 offset from the relocation target (32 bit).And many others similar to these.

Page 34: Linking: from the object file to the executable

Relocation table example

$ readelf -r example.o

Section '.rela.text ' contains 1 entries:Offset Type Symbol 's Name + Addend000006 R_X86_64_PC32 uninitialized + 0

Section '.rela.data ' contains 1 entries:Offset Type Symbol 's Name + Addend000000 R_X86_64_64 constant + 0

Page 35: Linking: from the object file to the executable

Index

ELF format overview

Static linking

Dynamic linking

Advanced linking features

Page 36: Linking: from the object file to the executable

Who is the linker?

• Usually you don’t invoke it directly, GCC will do it for you• Use gcc -v if you want to see how it’s invoked• Under unix-like platforms, three main linkers:

ld.bfd wide feature set, slow, high memory usage.ld.gold ELF-only, fast, reduced memory usage.

lld the new kid in the block, faster, LLVM-based.

Page 37: Linking: from the object file to the executable

The linker

What does the linker do?

1 Take in input several object files2 Lay out the output binary3 Build the final symbol table from the inputs4 Apply all the relocations5 Output the final executable/dynamic library

Page 38: Linking: from the object file to the executable

The linker

What does the linker do?

1 Take in input several object files2 Lay out the output binary3 Build the final symbol table from the inputs4 Apply all the relocations5 Output the final executable/dynamic library

Page 39: Linking: from the object file to the executable

Binary layout

• Fix a starting address for each section name• Concatenate all the sections with the same name• Keep sections with same features close to each other

Page 40: Linking: from the object file to the executable

Building the symbol table

• Scan all the input symbol tables and merge them• Set symbol values to their final virtual address• Check all the undefined symbols have been resolved• Check no symbol is defined twice (one definition rule)

Page 41: Linking: from the object file to the executable

Relocations

Simply apply all the relocations directivesusing the symbols’ final address

Page 42: Linking: from the object file to the executable

Original .text

$ objdump -d -j .text example.o

Disassembly of section .text:

0000000000000000 <a_function >:0: 55 push rbp1: 48 89 e5 mov rbp ,rsp4: 8b 05 00 00 00 00 mov eax , [rip+0x0]a: 83 c0 2a add eax ,0x2ad: 5d pop rbpe: c3 ret

Page 43: Linking: from the object file to the executable

Linked .text

$ gcc main.c example.o -o example \-fno -PIC -fno -stack -protector

$ objdump -d -j .text example

Disassembly of section .text:

0000000000400514 <a_function >:400514: 55 push rbp400515: 48 89 e5 mov rbp ,rsp400518: 8b 05 72 0b 20 00 mov eax , [rip+0x200b72]40051e: 83 c0 2a add eax ,0x2a400521: 5d pop rbp400522: c3 ret

Page 44: Linking: from the object file to the executable

What about libraries?

• Let’s first focus on static libraries, i.e. .a files• Static libraries are copied into the final binary• A .a file is just an archive of object files

ar rcs libmine.a example.o other.oranlib libmine.a

• Typically a library is linked with the -l parameter1, e.g.gcc -lmine main.c -o main.o

• Not all the object files will be linked in• Only those providing otherwise undefined symbols

1libmine.a must be in the library search path (e.g. in /lib).Otherwise use -L to add a path to the library search path.

Page 45: Linking: from the object file to the executable

We said the linker places similar sections together

Why is that?

Page 46: Linking: from the object file to the executable

Memory mapping

Similar sections will be mapped together:• Code (e.g., .text) will go in a executable page• Read-only data (e.g., .rodata) will go in a read-only page• Writeable data (e.g., .data) will go in a writeable page

Page 47: Linking: from the object file to the executable

Introducing: segments

• A segment groups sections requiring similar permissions• Segments are defined in the program headers• A segment is composed by:

p_offset offset in the file where the segment starts.p_vaddr virtual address where it should be loaded.

p_filesz size of the segment in the file.p_memsz size of the segment in memory.p_flags permission: executable, writeable, readable.

Page 48: Linking: from the object file to the executable

Standard segments

Typically a program has two segments:+rx .rodata, .text, ...

+rw .data, .bss, ...

Page 49: Linking: from the object file to the executable

The kernel reads the program headers and maps the requiredpages in memory with the appropriate permissions

Page 50: Linking: from the object file to the executable

How comes we have both p_filesz and p_memsz?

They might differ: .bss is all zeros, so it’s not stored in the file.The p_memsz portion exceeding p_filesz is zero-intialized.

Page 51: Linking: from the object file to the executable

How comes we have both p_filesz and p_memsz?

They might differ: .bss is all zeros, so it’s not stored in the file.The p_memsz portion exceeding p_filesz is zero-intialized.

Page 52: Linking: from the object file to the executable

Program headers example

$ readelf -l example

Program Headers:Type Offset VirtAddr FileSiz MemSiz Flg...LOAD 0x000 0x400000 0x61c 0x61c R ELOAD 0xea8 0x600ea8 0x188 0x1f0 RW...

Section to Segment mapping:...02 ... .text ... .rodata ...03 ... .data .bss...

Page 53: Linking: from the object file to the executable

/proc/$PID/maps

$ cat /proc/$EXAMPLE_PID/maps

00400000 -00401000 r-xp ... ./ example00600000 -00602000 rw-p ... ./ example...

Page 54: Linking: from the object file to the executable

Bonus: final section table

$ readelf -S example

Nr Name Type Address Off Size Flg...8 .text PROGBITS 400380 000380 1e6 AX

...10 .rodata PROGBITS 400570 000570 004 A...18 .data PROGBITS 601020 001020 010 WA19 .bss NOBITS 601040 001030 058 WA...21 .shstrtab STRTAB 000000 00104f 0b222 .symtab SYMTAB 000000 001108 46823 .strtab STRTAB 000000 001570 148

Page 55: Linking: from the object file to the executable

Index

ELF format overview

Static linking

Dynamic linking

Advanced linking features

Page 56: Linking: from the object file to the executable

Why dynamic linking?

• Dynamic linking is required to support dynamic libraries• Dynamic libraries benefits:

• Fix a bug in a library ⇒ multiple applications will benefit• Load the library once ⇒ save physical memory• Do not replicate code ⇒ save disk space

Page 57: Linking: from the object file to the executable

What is dynamic linking?

• A lighter version of the linking process• It is performed at run-time• It loads the dynamic libraries in memory• It provides the address of symbols in dynamic libraries

Page 58: Linking: from the object file to the executable

Who is the dynamic linker?

$ readelf -l main

Program Headers:Type Offset VirtAddr FileSiz MemSiz Flg...INTERP 0x270 0x400270 0x00001c 0x00001c R[Program interpreter: /lib64/ld-linux -x86 -64.so.2]...

Page 59: Linking: from the object file to the executable

Differences w.r.t. static linking

• Used sections:.dynsym Dynamic symbol tables (instead of .symtab).dynstr String table for .dynsym (instead of .strtab)

.rela.dyn Dynamic relocation table (instead of .rela....)• Uses a different set of relocation types• r_offset is not an offset in a section, but a virtual address

Page 60: Linking: from the object file to the executable

Dynamic libraries

Dynamic libraries can be anywhere in the address space

• They can’t have absolute addresses at linking time• We want to share read-only parts among processes• We can’t patch the them at run-time!• So, no dynamic relocations in .text or .rodata

Page 61: Linking: from the object file to the executable

Position Independent Code

• Libraries are usually compiled as PIC• Never use absolute addresses• Use offsets from the current PC• In the linked binary, the program base address is 0

Page 62: Linking: from the object file to the executable

The idea

We can’t know a symbol’s absolute addressbut we know its relative distance from the current PC

For data too!The distance between .text and .data is constant2

2Usually

Page 63: Linking: from the object file to the executable

The idea

We can’t know a symbol’s absolute addressbut we know its relative distance from the current PC

For data too!The distance between .text and .data is constant2

2Usually

Page 64: Linking: from the object file to the executable

Two dynamic librarieslibone.c

extern int libtwo_variable;

int library_function(void) {return 42 + libtwo_variable;

}

libtwo.c

#include <stdlib.h>

int libtwo_variable = 0x435;

int libtwo_function(void) {system("echo␣hello");return 151;

}

Page 65: Linking: from the object file to the executable

Building a dynamic library

• Creating a dynamic library:gcc -fPIC -c libone.c -o libone.o

gcc -shared -fPIC libone.o -o libone.so

• Linking against a dynamic library:gcc main.c -lone -o main

• Dynamic libraries are not copied into the final executable• The .so file must be available in predefined paths:

• /usr/lib, /lib...• Any path in the LD_LIBRARY_PATH environment variable

Page 66: Linking: from the object file to the executable

Digression: linking order

• Input object files are always linked in• Their order doesn’t matter• Static and dynamic libraries order instead is important• Suppose libone.so requires libtwo.so

• The -ltwo parameter must be passed before -lone

• Or use fixed-point linking with --start-group/--end-group:gcc -Wl,--start-group -lone -ltwo -Wl,--end-group

Page 67: Linking: from the object file to the executable

Symbols in another library

What about symbols in another library?

• We can’t patch .text

• Two problems:1 How to access global variables in another module?2 How to call functions in another module?

Page 68: Linking: from the object file to the executable

Introducing: the GOT

• Let’s first see how we can access global variables• The linker will create a Global Offset Table (.got):

• It contains a pointer-sized entry for each imported variable• It holds their run-time addresses• It is populated upon startup by the dynamic loader

Page 69: Linking: from the object file to the executable

libone.so’s relocations$ gcc -fPIC -shared -L. -ltwo libone.c -o libone.so$ readelf -S libone.so

Nr Name Type Address Off Size ES Flg Lk Inf...18 .got PROGBITS 200fa8 fa8 58 08 WA 0 0...

.got is at the 0x200fa8-0x201000 range

$ readelf -r libone.so

Relocation section '.rela.dyn ':Offset Type Symbol 's Name + Addend...200fe0 R_X86_64_GLOB_DAT libtwo_variable + 0...

Page 70: Linking: from the object file to the executable

library_function code

$ objdump -j .text -d libone.so

00000000000006 c0 <library_function >:6c0: push rbp6c1: mov rbp ,rsp6c4: mov rax ,QWORD PTR [rip+0 x200915] # 200fe06cb: mov eax ,DWORD PTR [rax]6cd: add eax ,0x2a6d0: pop rbp6d1: ret

Page 71: Linking: from the object file to the executable

What about functions?

Do they work in the same way?

Page 72: Linking: from the object file to the executable

Introducing: lazy loading

• A program could import a lot of functions• Maybe some of them are not used very often• Applying all the relocations slows down the startup• Why don’t we fix the relocation only when needed?

Page 73: Linking: from the object file to the executable

The actors

• To implement lazy loading we use three new sections:.plt Small code stubs to call library functions

.got.plt Lazy GOT for library functions addresses.rela.plt Relocation table relative to .got.plt

• For each imported function we have an entry in all of them

Page 74: Linking: from the object file to the executable

libtwo.so’s sections

$ readelf -S libtwo.so

Nr Name Type Address Size ES Flg Inf...7 .rela.plt RELA 000538 48 18 AI 9

...9 .plt PROGBITS 0005a0 40 10 AX 0

...19 .got.plt PROGBITS 200 fa8 58 08 WA 0...

.got.plt is at the 0x200fa8-0x201000 range

Page 75: Linking: from the object file to the executable

$ objdump -d libtwo.so

Disassembly of section .text:

00000000000006 e0 <libtwo_function >:...700: call 5b0 <system@plt >

...

Disassembly of section .plt:...00000000000005 b0 <system@plt >:5b0: jmp QWORD PTR [rip+0 x200a0a] # 200fc0

...

Page 76: Linking: from the object file to the executable

.got.plt relocations

$ readelf -r libtwo.so

Relocation section '.rela.plt ':Offset Type Symbol 's Name + Addend200fc0 R_X86_64_JUMP_SLOT system + 0

Page 77: Linking: from the object file to the executable

The PLT

This was what it would look like without lazy loading

Page 78: Linking: from the object file to the executable

Lazy loading

• At startup, .got.plt doesn’t contain function addresses• It contains the address of the stub’s second instruction• This second part invokes the dynamic loader• The dynamic loader will fix the relocation• From then on, .got.plt will contain the correct address

Page 79: Linking: from the object file to the executable

The real PLT

$ objdump -d -j .plt libtwo.so

00000000000005 b0 <system@plt >:5b0: jmp QWORD PTR [rip+0 x200a0a] # 200fc05b6: push 0x05bb: jmp 5a0 <_init+0x20 >

$ objdump -s -j .got.plt libtwo.so

libtwo.so: file format elf64 -x86 -64

Contents of section .got.plt:200fb8 00000000 00000000 b6050000 00000000

b6 05 00 00 is 0x5b6 in little endian.

Page 80: Linking: from the object file to the executable

A note on the dynamic loader

• The dynamic loader ignores the sections• It just knows about program headers• In particular it uses PT_DYNAMIC, which points to .dynamic

• .dynamic contains all the information it needs:• Needed libraries• Address and size of .dynsym, .dynstr, .got.plt, .rela.plt• ...

Page 81: Linking: from the object file to the executable

.dynamic section

$ readelf -l libone.soProgram Headers:Type Offset VirtAddr FileSiz MemSiz Flg...DYNAMIC 0x000db8 0x200db8 0x0001f0 0x0001f0 RW...

$ readelf -S libone.so

Section Headers:Nr Name Type Address Off Size ES Flg...17 .dynamic DYNAMIC 200db8 000db8 0001f0 10 WA...

Page 82: Linking: from the object file to the executable

.dynamic content$ readelf -d libone.so

Type Name/Value(NEEDED) Shared library: [libtwo.so](NEEDED) Shared library: [libc.so.6]...(STRTAB) 0x358(SYMTAB) 0x208(STRSZ) 206 (bytes)(SYMENT) 24 (bytes)(PLTGOT) 0x200fa8(PLTRELSZ) 48 (bytes)...(JMPREL) 0x540(RELA) 0x468(RELASZ) 216 (bytes)(RELAENT) 24 (bytes)...

Page 83: Linking: from the object file to the executable

What’s mandatory?

This means that the section table is optional

• The same for .symtab and .strtab

• They are basically debugging information• In fact, the strip tool can remove them to save space

Page 84: Linking: from the object file to the executable

Index

ELF format overview

Static linking

Dynamic linking

Advanced linking features

Page 85: Linking: from the object file to the executable

Relaxation

• At compile-time we might ignore how long a jump is• Some ISAs have shorter instructions for short jumps• In MIPS, a long jump can take up to 3 instructions• What if the destination is unknown at compile-time?• The compiler must emit the most conservative option

Page 86: Linking: from the object file to the executable

Introducing: relaxation

• Certain smart linkers offer relaxation• That is, they’re able to move the code up and down• At link-time!• Symbols for each basic block must be provided

Page 87: Linking: from the object file to the executable

Reducing code size

• It might happen that unused functions make it to link time• The compiler ignores if a non-static function is unused• The linker has the required information to understand this

Page 88: Linking: from the object file to the executable

But the ELF linker is dumb!

• If it is asked to link a section, it will link it as a whole• The Mach-O format instead works at finer granularity

Page 89: Linking: from the object file to the executable

The solution

• The compiler can create a section per-function/data objectgcc -ffunction-sections -fdata-sections ...

• Then, we can tell the linker to drop unreferenced sections:gcc -Wl,--gc-sections

Page 90: Linking: from the object file to the executable

Link-time optimization

• Traditionally compilers optimize with a TU granularity• LTO (-flto) optimizes the program as whole• The compiler does not emit a standard object file• A high-level, internal representation is serialized:

GCC an ELF with sections containing GIMPLE.clang a bitcode file containing the LLVM IR.

• The linker will merge these IRs and optimize them

Page 91: Linking: from the object file to the executable

Pretty much like...

cat *.c > all.c

gcc all.c -o all

Page 92: Linking: from the object file to the executable

Pros and cons

• Pros:• Global view on the program• Larger optimization opportunities• Aggressive dead code elimination

• Cons:• Resource intensive• Only a subset of the optimizations can be run

Page 93: Linking: from the object file to the executable

Security considerations

-z relro An attacker [1] could change the .dynamic sectionfor malicious purposes. With -z relro enabled,once .dynamic has been initialized, it is markedread-only.

-z now An attacker [1] could use the lazy loading systemto call an arbitrary function. -z now completelydisables lazy loading.

-pie By default executable’s position, unlike libraries, isnot randomized. An attacker could then reusecode in the executable binary for maliciouspurposes. -pie compiles the executable as PICand produces a relocatable program (similar to ashared library).

Page 94: Linking: from the object file to the executable

Thanks

Page 95: Linking: from the object file to the executable

Bibliography

Alessandro Di Federico, Amat Cama, Yan Shoshitaishvili,Christopher Kruegel, and Giovanni Vigna.How the ELF Ruined Christmas.In 24th USENIX Security Symposium (USENIX Security15), pages 643–658, Washington, D.C., August 2015.USENIX Association.

Santa Cruz Operation.System V Application Binary Interface, 2013.http://www.sco.com/developers/gabi/latest/contents.html.

Page 96: Linking: from the object file to the executable

License

This work is licensed under the Creative CommonsAttribution-ShareAlike 3.0 Unported License. To view a copy ofthis license, visit http://creativecommons.org/licenses/by-sa/3.0/or send a letter to Creative Commons, 444 Castro Street, Suite900, Mountain View, California, 94041, USA.