Part 2: Advanced Static Analysis Chapter 4: A Crash Course in x86 Disassembly Chapter 5: IDA Pro Chapter 6: Recognizing C Code Constructs in Assembly

Upload: shana-collins

Post on 26-Dec-2015

TRANSCRIPT

Part 2: Advanced Static Analysis
  Chapter 4: A Crash Course in x86 Disassembly
  Chapter 5: IDA Pro
  Chapter 6: Recognizing C Code Constructs in Assembly

How software works

The gcc compiler driver pre-processes, compiles, assembles, and links to generate an executable
  Links together object code (e.g. game.o) and static libraries (e.g. libc.a) to form the final executable
  Links in references to dynamic libraries for code loaded at load time (e.g. libc.so.1)
  The executable may still load additional dynamic libraries at run-time

Build pipeline:
  Program source (hello.c)   -> Pre-processor ->
  Modified source (hello.i)  -> Compiler      ->
  Assembly code (hello.s)    -> Assembler     ->
  Object code (hello.o)      -> Linker        ->
  Executable code (hello)

Static libraries

Suppose you have utility code in x.c, y.c, and z.c that all of your programs use
  Link together individual .o files
      gcc -o hello hello.o x.o y.o z.o
  Create a library libmyutil.a using ar and ranlib, and link the library in statically
      libmyutil.a : x.o y.o z.o
          ar rvu libmyutil.a x.o y.o z.o
          ranlib libmyutil.a
      gcc -o hello hello.c -L. -lmyutil
  Note: library code is copied directly into the binary

Dynamic libraries

Avoid having multiple copies of common code on disk
  Problem: libc
    With static linking, "gcc program.c -lc" creates an a.out with libc object code copied into it (from libc.a)
    Almost all programs use libc!
  Solution: have binaries compiled with a reference to a library of shared objects rather than a copy of the library
    Libraries are loaded at run-time from the file system
    "ldd <binary>" shows which dynamic libraries a program relies upon
    gcc flag "-shared" and linker flag "-soname" (passed through gcc as "-Wl,-soname,<name>") for generating and handling dynamic shared object files

The linking process (ld)

Merges object files
  Merges multiple relocatable (.o) object files into a single executable program
Resolves external references
  References to symbols defined in another object file
Relocates symbols
  Relocates symbols from their relative locations in the .o files to new absolute positions in the executable
  Updates all references to these symbols to reflect their new positions
    References in both code and data
      code: a();           /* reference to symbol a */
      data: int *xp=&x;    /* reference to symbol x */

Executables

Various file formats
  Linux = Executable and Linkable Format (ELF)
  Windows = Portable Executable (PE)

ELF

Standard binary format for object files in Linux
One unified format for
  Relocatable object files (.o)
  Shared object files (.so)
  Executable object files
Better support for shared libraries than the old a.out format
More complete information for debuggers

ELF Object File Format

ELF header
  Magic number, type (.o, exec, .so), machine, byte ordering, entry point, etc.
Program header table
  Page size, virtual addresses of memory segments (sections), segment sizes
.text section
  Code
.data section
  Initialized (static) data
.bss section
  Uninitialized (static) data
  "Block Started by Symbol"

File layout, starting at offset 0:
  ELF header
  Program header table (required for executables)
  .text section
  .data section
  .bss section
  .symtab
  .rel.text
  .rel.data
  .debug
  Section header table (required for relocatables)
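
The ELF header fields listed above can be read straight from the first bytes of a file. The following is a minimal sketch, not a full parser: it assumes the file's leading bytes are already in a buffer, and the function names (is_elf, elf_byte_order, elf_type) are made up for this illustration. The offsets used (magic at bytes 0-3, byte order at byte 5, 16-bit type at offset 16) come from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Values of the e_type field for the three kinds of object file. */
enum { ELF_REL = 1, ELF_EXEC = 2, ELF_DYN = 3 };

/* Returns 1 if buf starts with the ELF magic number 0x7f 'E' 'L' 'F'. */
int is_elf(const uint8_t *buf, size_t len)
{
    return len >= 4 && buf[0] == 0x7f && buf[1] == 'E' &&
           buf[2] == 'L' && buf[3] == 'F';
}

/* Returns 1 for little-endian, 2 for big-endian, 0 if not an ELF file. */
int elf_byte_order(const uint8_t *buf, size_t len)
{
    return (is_elf(buf, len) && len > 5) ? buf[5] : 0;
}

/* Reads the 16-bit type field at offset 16, honoring the byte order.
 * Returns 0 if the buffer is too short or is not an ELF file. */
int elf_type(const uint8_t *buf, size_t len)
{
    if (!is_elf(buf, len) || len < 18)
        return 0;
    if (buf[5] == 2)                        /* big-endian */
        return (buf[16] << 8) | buf[17];
    return buf[16] | (buf[17] << 8);        /* little-endian */
}
```

A caller would read the first 18 bytes of a file and compare elf_type() against ELF_REL (.o), ELF_EXEC (executable), or ELF_DYN (.so).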

ELF Object File Format (cont.)

.symtab section
  Symbol table
  Procedure and static variable names
  Section names and locations
.rel.text section
  Relocation info for .text section
  Addresses of instructions that will need to be modified in the executable
  Instructions for modifying
.rel.data section
  Relocation info for .data section
  Addresses of pointer data that will need to be modified in the merged executable
.debug section
  Info for symbolic debugging (gcc -g)

PE (Portable Executable) file format

Windows file format for executables
Based on COFF format
  Magic numbers, headers, tables, directories, sections
Disassemblers
  Overlay data with C structures
  Load file as the OS loader would
  Identify entry points (default and exported)
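
As a rough illustration of the "magic numbers" a disassembler checks before overlaying structures on a PE file, the sketch below tests for the "MZ" signature at offset 0, follows the 32-bit little-endian offset stored at 0x3c, and checks for the "PE\0\0" signature there. The function names are hypothetical, and the code assumes the file contents are already in memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads a little-endian 32-bit value from buf at offset off. */
static uint32_t read_le32(const uint8_t *buf, size_t off)
{
    return (uint32_t)buf[off] | ((uint32_t)buf[off + 1] << 8) |
           ((uint32_t)buf[off + 2] << 16) | ((uint32_t)buf[off + 3] << 24);
}

/* Returns 1 if buf looks like a PE file, 0 otherwise. */
int is_pe(const uint8_t *buf, size_t len)
{
    uint32_t pe_off;

    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;                           /* no DOS header */
    pe_off = read_le32(buf, 0x3c);          /* offset of the PE signature */
    if (pe_off > len || len - pe_off < 4)
        return 0;                           /* offset points outside the buffer */
    return buf[pe_off] == 'P' && buf[pe_off + 1] == 'E' &&
           buf[pe_off + 2] == 0 && buf[pe_off + 3] == 0;
}
```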

Example C Program

m.c:
    #include <stdlib.h>

    int a(void);            /* defined in a.c */

    int e = 7;

    int main() {
        int r = a();
        exit(0);
    }

a.c:
    extern int e;           /* defined in m.c */

    int *ep = &e;
    int x = 15;
    int y;

    int a() {
        return *ep + x + y;
    }

Merging Relocatable Object Files into an Executable Object File

Relocatable object files:
  system code                    .text
  system data                    .data
  m.o   main()                   .text
        int e = 7                .data
  a.o   a()                      .text
        int *ep = &e             .data
        int x = 15               .data
        int y                    .bss

Executable object file, starting at offset 0:
  headers
  system code                    .text
  main()                         .text
  a()                            .text
  more system code               .text
  system data                    .data
  int e = 7                      .data
  int *ep = &e                   .data
  int x = 15                     .data
  uninitialized data             .bss
  .symtab, .debug

Program execution

Operating system provides
  Protection and resource allocation
  Abstract view of resources (files, system calls)
  Virtual memory
    Uniform memory space abstraction for each process
    Gives the illusion that each process has the entire memory space

How does a program get loaded?

The operating system creates a new process
  Including, among other things, a virtual memory space
  Important: any hardware-based debugger must know the OS state in the page tables to map accesses to virtual addresses
The system loader reads the executable file from the file system into the memory space
  Executable contains code and statically linked libraries
  Done via DMA (direct memory access)
  Executable in the file system remains and can be executed again
  Loads dynamic shared objects/libraries into memory
  Resolves addresses in code given where code/data is loaded
Then it starts the thread of execution running

Loading Executable Binaries

Executable object file for example program p, starting at offset 0:
  ELF header
  Program header table (required for executables)
  .text section
  .data section
  .bss section
  .symtab
  .rel.text
  .rel.data
  .debug
  Section header table (required for relocatables)

Process image:
  Virtual addr   Contents
  0x080483e0     init and shared lib segments
  0x08048494     .text segment (r/o)
  0x0804a010     .data segment (initialized r/w)
  0x0804a3b0     .bss segment (uninitialized r/w)

More on relocation

Assembly code with relative and absolute addresses
With the VM abstraction, old linkers decide layout and can supply definitive addresses
  Windows ".com" format
  Linker can statically bind the program to virtual addresses
  Now, they provide hints as to where they would like to be placed
But... this could also be done at load time (address space layout randomization)
  Windows ".exe" format
  Loader rewrites addresses to proper offsets
  System needs to force position-independent code
    Force the compiler to make all jumps and branches relative to the current location or relative to a base register set at run-time
  ELF uses a Global Offset Table (GOT)
    Symbol addresses obtained from the GOT before access
    Can be targeted for hooks!
    Implementation determines exploit
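
To see why a table of addresses can be targeted for hooks, consider this toy model. It is only an analogy, not how a real GOT is laid out: the one-entry table and all function names are invented for the example. Callers reach the function through the table, so overwriting the entry redirects every later call.

```c
typedef int (*func_ptr)(int);

static int real_impl(int v)   { return v + 1; }
static int hooked_impl(int v) { return v * 100; }

/* Hypothetical one-entry "offset table": callers never name real_impl directly. */
static func_ptr table[1] = { real_impl };

/* Every call fetches the target address from the table first. */
int call_via_table(int v)
{
    return table[0](v);
}

/* Overwriting the entry redirects all later calls: this is the hook. */
void install_hook(void)
{
    table[0] = hooked_impl;
}
```

Before install_hook() runs, call_via_table(1) reaches real_impl; afterwards the same call site reaches hooked_impl without any change to the calling code.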

Program execution

Programmer-visible state
  EIP: instruction pointer
    a.k.a. program counter
    Address of next instruction
  Register file
    Heavily used program data
  Condition codes
    Store status information about the most recent arithmetic operation
    Used for conditional branching
Memory
  Byte-addressable array
  Code, user data, OS data
  Includes stack used to support procedures

[Figure: the CPU (EIP, registers, condition codes) sends addresses to memory (object code, program data, OS data, stack); data and instructions move between the two.]

Run-time data structures

Process address space, from high to low addresses:
  0xffffffff
      kernel virtual memory (code, data, heap, stack)      memory invisible to user code
  0xc0000000
      user stack (created at runtime)                      <- %esp (stack pointer)
      memory-mapped region for shared libraries            starts at 0x40000000
      run-time heap (managed by malloc)                    <- brk
      read/write segment (.data, .bss)                     loaded from the executable file
      read-only segment (.init, .text, .rodata)            loaded from the executable file
  0x08048000
      unused
  0

Registers

The processor operates on data in registers (usually)
  movl (%eax), %ecx
    Fetch data at the address contained in %eax
    Store it in register %ecx
  movl $array, %ecx
    Move the address of variable array into %ecx
Typically, data is loaded into registers, manipulated or used, and then written back to memory
The IA32 architecture is "register poor"
  Few general-purpose registers
  Source or destination operand is often a memory location
  Makes context-switching among processes easy (less register state to store)

IA32 General Registers

  32-bit (bits 31-0)   16-bit (bits 15-0)   bits 15-8   bits 7-0   Role
  %eax                 %ax                  %ah         %al
  %ecx                 %cx                  %ch         %cl
  %edx                 %dx                  %dh         %dl
  %ebx                 %bx                  %bh         %bl
  %esi                 %si
  %edi                 %di
  %esp                 %sp                                         Stack pointer
  %ebp                 %bp                                         Frame pointer

%eax through %edi: general-purpose registers (mostly)
%esp and %ebp: special-purpose registers

Operand types

A typical instruction acts on 1 or more operands
  addl %ecx, %edx adds the contents of ecx to edx
Three general types of operands
  Immediate
    Like a C constant, but preceded by $
    e.g., $0x1F, $-533
    Encoded with 1, 2, or 4 bytes based on instruction
  Register: the value in one of the 8 integer registers
  Memory: a memory address
    There are many modes for addressing memory

Operand examples using mov

  Source   Destination   movl example           C analog
  Imm      Reg           movl $0x4,%eax         temp = 0x4;
  Imm      Mem           movl $-147,(%eax)      *p = -147;
  Reg      Reg           movl %eax,%edx         temp2 = temp1;
  Reg      Mem           movl %eax,(%edx)       *p = temp;
  Mem      Reg           movl (%eax),%edx       temp = *p;

Memory-memory transfers cannot be done with a single instruction

Addressing Modes

Immediate and register operands have only one mode
Memory, on the other hand...
  Absolute
    Specify the address of the data
  Indirect
    Use a register to calculate the address
  Base + displacement
    Use a register plus an absolute address to calculate the address
  Indexed
    Indexed: add the contents of an index register
    Scaled index: add the contents of an index register scaled by a constant

Summary of IA32 Operand Forms

  Type        Form              Operand value                   Name
  Immediate   $Imm              Imm                             Immediate
  Register    Ea                R[Ea]                           Register
  Memory      Imm               M[Imm]                          Absolute
  Memory      (Ea)              M[R[Ea]]                        Indirect
  Memory      Imm(Eb)           M[Imm + R[Eb]]                  Base + displacement
  Memory      (Eb, Ei)          M[R[Eb] + R[Ei]]                Indexed
  Memory      Imm(Eb, Ei)       M[Imm + R[Eb] + R[Ei]]          Indexed
  Memory      (, Ei, s)         M[R[Ei] * s]                    Scaled indexed
  Memory      Imm(, Ei, s)      M[Imm + R[Ei] * s]              Scaled indexed
  Memory      (Eb, Ei, s)       M[R[Eb] + R[Ei] * s]            Scaled indexed
  Memory      Imm(Eb, Ei, s)    M[Imm + R[Eb] + R[Ei] * s]      Scaled indexed
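
Every memory form in the table is a special case of the most general one, Imm(Eb, Ei, s). A small sketch of that address calculation, with a hypothetical function name and the register contents passed in as plain integers:

```c
#include <stdint.h>

/* Address named by Imm(Eb, Ei, s): Imm + R[Eb] + R[Ei] * s.
 * The scale s is 1, 2, 4, or 8; arithmetic wraps modulo 2^32 as on IA32. */
uint32_t effective_address(int32_t imm, uint32_t base, uint32_t index,
                           uint32_t scale)
{
    return (uint32_t)imm + base + index * scale;
}
```

For example, with %ebp holding 0x1000, the operand 8(%ebp) corresponds to effective_address(8, 0x1000, 0, 1). The simpler forms fall out by passing 0 for the parts that are absent.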

x86 instructions

Rules
  Source operand can be memory, register, or constant
  Destination can be memory or register
  Only one of source and destination can be memory
  Source and destination must be the same size
Flags
  Arithmetic, logic, and compare instructions set flags in EFLAGS
  Conditional branches are handled via EFLAGS

What's the "l" for on the end?

addl 8(%ebp),%eax
  It stands for "long" and is 32 bits
  It tells the size of the operand
  Baggage from the days of 16-bit processors
For x86 and x86_64
  8 bits is a byte
  16 bits is a word
  32 bits is a double word
  64 bits is a quad word

IA32 Standard Data Types

  C declaration    Intel data type       GAS suffix   Size in bytes
  char             Byte                  b            1
  short            Word                  w            2
  int              Double word           l            4
  unsigned         Double word           l            4
  long int         Double word           l            4
  unsigned long    Double word           l            4
  char *           Double word           l            4
  float            Single precision      s            4
  double           Double precision      l            8
  long double      Extended precision    t            10/12

Global vs. Local variables

Global variables are stored in either the .data or .bss section of the process
Local variables are stored on the stack

Global vs local example

Global version:
    #include <stdio.h>

    int x = 1;
    int y = 2;

    void a() {
        x = x+y;
        printf("Total = %d\n",x);
    }

    int main() { a(); }

Local version:
    #include <stdio.h>

    void a() {
        int x = 1;
        int y = 2;
        x = x+y;
        printf("Total = %d\n",x);
    }

    int main() { a(); }

Global vs local example

Global version (x and y declared outside a()): the variables are accessed at fixed addresses, 0x804966c for x and 0x8049670 for y

    080483c4 <a>:
     80483c4: push   %ebp
     80483c5: mov    %esp,%ebp
     80483c7: sub    $0x8,%esp
     80483ca: mov    0x804966c,%edx          # load x
     80483d0: mov    0x8049670,%eax          # load y
     80483d5: lea    (%edx,%eax,1),%eax      # x+y
     80483d8: mov    %eax,0x804966c          # store x
     80483dd: mov    0x804966c,%eax
     80483e2: mov    %eax,0x4(%esp)
     80483e6: movl   $0x80484f0,(%esp)
     80483ed: call   80482dc <printf@plt>
     80483f2: leave
     80483f3: ret

Local version (x and y declared inside a()): the variables live on the stack, at -0x8(%ebp) for x and -0x4(%ebp) for y

    080483c4 <a>:
     80483c4: push   %ebp
     80483c5: mov    %esp,%ebp
     80483c7: sub    $0x18,%esp
     80483ca: movl   $0x1,-0x8(%ebp)         # x = 1
     80483d1: movl   $0x2,-0x4(%ebp)         # y = 2
     80483d8: mov    -0x4(%ebp),%eax
     80483db: add    %eax,-0x8(%ebp)         # x = x+y
     80483de: mov    -0x8(%ebp),%eax
     80483e1: mov    %eax,0x4(%esp)
     80483e5: movl   $0x80484f0,(%esp)
     80483ec: call   80482dc <printf@plt>
     80483f1: leave
     80483f2: ret

Arithmetic operations

    void f() {
        int a = 0;
        int b = 1;
        a = a+11;
        a = a-b;
        a--;
        b++;
    }

    int main() { f(); }

    08048394 <f>:
     8048394: push   %ebp
     8048395: mov    %esp,%ebp
     8048397: sub    $0x10,%esp
     804839a: movl   $0x0,-0x8(%ebp)         # a = 0
     80483a1: movl   $0x1,-0x4(%ebp)         # b = 1
     80483a8: addl   $0xb,-0x8(%ebp)         # a = a+11
     80483ac: mov    -0x4(%ebp),%eax
     80483af: sub    %eax,-0x8(%ebp)         # a = a-b
     80483b2: subl   $0x1,-0x8(%ebp)         # a--
     80483b6: addl   $0x1,-0x4(%ebp)         # b++
     80483ba: leave
     80483bb: ret

Machine Instruction Example

C code
  Add two signed integers

    int sum(int x, int y) {
        int t = x+y;
        return t;
    }

Assembly
  Add two 4-byte integers
    "Long" words in GCC parlance
    Same instruction whether signed or unsigned
  Operands:
    x: Register %eax
    y: Memory M[%ebp+8]
    t: Register %eax
      Return function value in %eax

    _sum:
        pushl %ebp
        movl %esp,%ebp
        movl 12(%ebp),%eax
        addl 8(%ebp),%eax
        movl %ebp,%esp
        popl %ebp
        ret

Object code
  3-byte instruction
  Stored at address 0x401046

    0x401046: 03 45 08

Condition codes

The IA32 processor has a register called eflags (extended flags)
Each bit is a flag, or condition code
  CF  Carry Flag       SF  Sign Flag
  ZF  Zero Flag        OF  Overflow Flag
As programmers, we don't write to this register and seldom read it directly
Flags are set or cleared by hardware depending on the result of an instruction

Condition Codes (cont.)

Setting condition codes via compare instruction
  cmpl b,a
    Computes a-b without setting destination
    CF set if the subtraction borrows out of the most significant bit (a < b as unsigned values)
      Used for unsigned comparisons
    ZF set if a == b
    SF set if (a-b) < 0
    OF set if two's complement overflow
      (a>0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-b)>0)
  Byte and word versions: cmpb, cmpw
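
The flag rules for cmpl can be written out directly in C. This is a sketch for illustration only: the struct and function names are invented here, and the code models the four flags from the list above rather than the whole EFLAGS register.

```c
#include <stdint.h>

struct flags { int cf, zf, sf, of; };

/* Flags produced by "cmpl b,a", which computes a - b and discards the result. */
struct flags cmpl_flags(int32_t b, int32_t a)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;                  /* wraps modulo 2^32 */
    struct flags f;

    f.cf = ua < ub;                           /* unsigned borrow */
    f.zf = diff == 0;                         /* a == b */
    f.sf = (diff >> 31) & 1;                  /* sign bit of a - b */
    /* Signed overflow: a and b differ in sign, and the result's sign differs from a's. */
    f.of = (((ua ^ ub) & (ua ^ diff)) >> 31) & 1;
    return f;
}
```

With these flags, a signed "a < b" holds exactly when SF != OF, and an unsigned "a < b" holds exactly when CF is set, which is what the signed and unsigned conditional jumps test.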

Condition Codes (cont.)

Setting condition codes via test instruction
  testl b,a
    Computes a&b without setting destination
      Sets condition codes based on result
      Useful to have one of the operands be a mask
    Often used to test for zero or positive
      testl %eax, %eax
    ZF set when a&b == 0
    SF set when a&b < 0
  Byte and word versions: testb, testw
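
The same kind of sketch works for testl. The names below are hypothetical, and only ZF and SF are modeled because those are the two flags the slide describes.

```c
#include <stdint.h>

struct test_flags { int zf, sf; };

/* Flags produced by "testl b,a", which computes a & b and discards the result. */
struct test_flags testl_flags(int32_t b, int32_t a)
{
    uint32_t r = (uint32_t)a & (uint32_t)b;
    struct test_flags f;

    f.zf = r == 0;              /* no common bits set */
    f.sf = (r >> 31) & 1;       /* sign bit of a & b */
    return f;
}
```

Passing the same value for both operands, as in testl %eax, %eax, gives ZF = 1 when the value is zero and SF = 1 when it is negative. Passing a mask as one operand tests whether any of the masked bits are set.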

if statements

    void f() {
        int x = 1;
        int y = 2;
        if (x==y) {
            printf("x equals y.\n");
        } else {
            printf("x is not equal to y.\n");
        }
    }

    int main() { f(); }

    080483c4 <f>:
     80483c4: push   %ebp
     80483c5: mov    %esp,%ebp
     80483c7: sub    $0x18,%esp
     80483ca: movl   $0x1,-0x8(%ebp)         # x = 1
     80483d1: movl   $0x2,-0x4(%ebp)         # y = 2
     80483d8: mov    -0x8(%ebp),%eax
     80483db: cmp    -0x4(%ebp),%eax         # compare x with y
     80483de: jne    80483ee <f+0x2a>        # x != y: go to else branch
     80483e0: movl   $0x80484f0,(%esp)
     80483e7: call   80482d8 <puts@plt>
     80483ec: jmp    80483fa <f+0x36>        # skip else branch
     80483ee: movl   $0x80484fc,(%esp)
     80483f5: call   80482d8 <puts@plt>
     80483fa: leave
     80483fb: ret

if statements

    int a = 1, b = 3, c;
    if (a > b)
        c = a;
    else
        c = b;

    00000018: C7 45 FC 01 00 00 00  mov dword ptr [ebp-4],1      ; store a = 1
    0000001F: C7 45 F8 03 00 00 00  mov dword ptr [ebp-8],3      ; store b = 3
    00000026: 8B 45 FC              mov eax,dword ptr [ebp-4]    ; move a into EAX register
    00000029: 3B 45 F8              cmp eax,dword ptr [ebp-8]    ; compare a with b (subtraction)
    0000002C: 7E 08                 jle 00000036                 ; if (a<=b) jump to address 00000036
    0000002E: 8B 4D FC              mov ecx,dword ptr [ebp-4]    ; else move a (1) into ECX register
    00000031: 89 4D F4              mov dword ptr [ebp-0Ch],ecx  ; move ECX into c (12 bytes below EBP)
    00000034: EB 06                 jmp 0000003C                 ; unconditional jump to 0000003C
    00000036: 8B 55 F8              mov edx,dword ptr [ebp-8]    ; move b (3) into EDX register
    00000039: 89 55 F4              mov dword ptr [ebp-0Ch],edx  ; move EDX into c (12 bytes below EBP)

Loops

    int factorial_do(int x) {
        int result = 1;
        do {
            result *= x;
            x = x-1;
        } while (x > 1);
        return result;
    }

    factorial_do:
        pushl %ebp
        movl %esp, %ebp
        movl 8(%ebp), %edx
        movl $1, %eax
    .L2:
        imull %edx, %eax
        decl %edx
        cmpl $1, %edx
        jg .L2
        leave
        ret
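
One way to read the loop in the assembly is to rewrite the C with a label and a conditional goto, which is roughly the shape the compiler produces. The function name factorial_goto is made up for this sketch; the comments map each statement to the instruction shown above.

```c
/* do-while loop written the way the compiled code is laid out:
 * a label at the top of the body and a conditional jump back to it. */
int factorial_goto(int x)
{
    int result = 1;          /* movl $1, %eax    */
loop:                        /* .L2:             */
    result *= x;             /* imull %edx, %eax */
    x = x - 1;               /* decl %edx        */
    if (x > 1)               /* cmpl $1, %edx    */
        goto loop;           /* jg .L2           */
    return result;
}
```

Because the test sits at the bottom, the body always runs at least once, exactly as in the do-while source.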

C switch statements

Implementation options
  Series of conditionals
    Compare followed by je
    Good if few cases
    Slow if many cases
  Jump table (example below)
    Look up branch target from a table
    Possible with a small range of integer constants
GCC picks the implementation based on structure

Example:

    switch (x) {
    case 1:
    case 5:
        code at L0
    case 2:
    case 3:
        code at L1
    default:
        code at L2
    }

Jump table at .L3, one entry per value of x:
    x = 0:  .L2
    x = 1:  .L0
    x = 2:  .L1
    x = 3:  .L1
    x = 4:  .L2
    x = 5:  .L0

1. init jump table at .L3
2. get address at .L3+4*x
3. jump to that address
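
The three steps can be mimicked in C with an array of function pointers. This is an analogy rather than what GCC emits (the compiler's table holds code addresses inside one function); the handler names and their return values are placeholders standing in for the code at .L0, .L1, and .L2.

```c
/* Hypothetical handlers standing in for the code at .L0, .L1, and .L2. */
static int code_L0(void) { return 0; }
static int code_L1(void) { return 1; }
static int code_L2(void) { return 2; }    /* default case */

/* Step 1: the table for x = 0..5, in the same order as the table at .L3. */
static int (*const jump_table[6])(void) = {
    code_L2, code_L0, code_L1, code_L1, code_L2, code_L0
};

int dispatch(int x)
{
    /* One unsigned comparison rejects both negative and too-large values. */
    if ((unsigned)x > 5)
        return code_L2();
    /* Steps 2 and 3: fetch the target from the table and jump to it. */
    return jump_table[x]();
}
```

The cost of dispatch() does not depend on how many cases there are, which is why a jump table beats a chain of comparisons when the case values are dense.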

Example

    int switch_eg(int x) {
        int result = x;
        switch (x) {
        case 100:
            result *= 13;
            break;
        case 102:
            result += 10;
            /* Fall through */
        case 103:
            result += 11;
            break;
        case 104:
        case 106:
            result *= result;
            break;
        default:
            result = 0;
        }
        return result;
    }

Assembly for switch_eg

Key is the jump table at .L10
  Array of pointers to jump locations

        leal -100(%edx),%eax        # eax = x - 100
        cmpl $6,%eax
        ja .L9                      # out of range: default
        jmp *.L10(,%eax,4)          # jump through table
        .p2align 4,,7
    .section .rodata
        .align 4
        .align 4
    .L10:
        .long .L4                   # case 100
        .long .L9                   # case 101 (default)
        .long .L5                   # case 102
        .long .L6                   # case 103
        .long .L8                   # case 104
        .long .L9                   # case 105 (default)
        .long .L8                   # case 106
    .text
        .p2align 4,,7
    .L4:
        leal (%edx,%edx,2),%eax     # eax = 3*x
        leal (%edx,%eax,4),%edx     # edx = x + 12*x = 13*x
        jmp .L3
        .p2align 4,,7
    .L5:
        addl $10,%edx               # falls through to .L6
    .L6:
        addl $11,%edx
        jmp .L3
        .p2align 4,,7
    .L8:
        imull %edx,%edx
        jmp .L3
        .p2align 4,,7
    .L9:
        xorl %edx,%edx              # result = 0
    .L3:
        movl %edx,%eax              # return value

x86-64 conditionals

Modern CPUs with deep pipelines
  Instructions fetched far in advance of execution
  Mask the latency going to memory
  Problem: what if you hit a conditional branch?
    Must predict which branch to take!
    Branch prediction in CPUs is well-studied and fairly effective
    But best to avoid conditional branching altogether
x86-64 conditionals
  Conditional instruction execution

Conditional Move

Conditional move instruction
  cmovXX src, dest
  Move value from src to dest if condition XX holds
  No branching
  Handled as an operation within the Execution Unit
  Added with the P6 microarchitecture (PentiumPro onward)
Example

    movl 8(%ebp),%edx      # Get x
    movl 12(%ebp),%eax     # rval=y
    cmpl %edx, %eax        # rval:x
    cmovll %edx,%eax       # If <, rval=x

  Current version of GCC won't use this instruction
    Thinks it's compiling for a 386
Performance
  14 cycles on all data
  More efficient than conditional branching (simple control flow)
  But overhead: both branches are evaluated

x86-64 conditional example

    int absdiff(int x, int y) {
        int result;
        if (x > y) {
            result = x-y;
        } else {
            result = y-x;
        }
        return result;
    }

    absdiff:                   # x in %edi, y in %esi
        movl %edi, %eax        # eax = x
        movl %esi, %edx        # edx = y
        subl %esi, %eax        # eax = x-y
        subl %edi, %edx        # edx = y-x
        cmpl %esi, %edi        # x:y
        cmovle %edx, %eax      # eax=edx if <=
        ret
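
The conditional-move version has no branch: it computes both candidate results and then selects one. A C sketch of the same idea follows, with the select written as a bit mask so that the source itself contains no if. The function name is hypothetical, whether a compiler turns this (or the original if/else) into cmov depends on the compiler and its flags, and the subtraction is assumed not to overflow.

```c
/* Mirrors the cmov sequence: compute both results, then pick one. */
int absdiff_select(int x, int y)
{
    int xy = x - y;                           /* subl %esi, %eax */
    int yx = y - x;                           /* subl %edi, %edx */
    int take_yx = (x <= y);                   /* cmpl %esi, %edi */
    /* mask is all ones when take_yx is 1 and all zeros when it is 0 */
    unsigned mask = 0u - (unsigned)take_yx;

    /* cmovle %edx, %eax: keep xy unless x <= y */
    return (int)(((unsigned)yx & mask) | ((unsigned)xy & ~mask));
}
```

This makes the overhead noted on the previous slide visible: both x-y and y-x are always computed, even though only one is used.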

IA32 Stack

Region of memory managed with stack discipline
Grows toward lower addresses
Register %esp indicates the lowest stack address
  Address of the top element

[Figure: the stack drawn with its "bottom" at the highest address and its "top" at the lowest; addresses increase upward, the stack grows down, and the stack pointer %esp points at the top.]

IA32 Stack Pushing

Pushing
  pushl Src
    Decrement %esp by 4
    Fetch operand at Src
    Write operand at address given by %esp
  e.g. pushl %eax
      subl $4, %esp
      movl %eax,(%esp)

[Figure: the stack grows down; after the push, the stack pointer %esp has moved by -4 to the new top.]

IA32 Stack Popping

Popping
  popl Dest
    Read operand at address given by %esp
    Write to Dest
    Increment %esp by 4
  e.g. popl %eax
      movl (%esp),%eax
      addl $4,%esp

[Figure: the stack grows down; after the pop, the stack pointer %esp has moved by +4 toward the bottom.]
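
The push and pop steps can be tried out on a toy machine. The sketch below is an illustration under simplifying assumptions: memory is a 512-byte array, there is no bounds checking, and the struct and function names are invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toy machine: a small byte-addressed memory and a stack pointer. */
struct machine {
    uint8_t  mem[0x200];
    uint32_t esp;
};

/* pushl src: decrement %esp by 4, then write the operand at the new top. */
void pushl(struct machine *m, uint32_t src)
{
    m->esp -= 4;
    memcpy(&m->mem[m->esp], &src, 4);
}

/* popl: read the operand at the top, then increment %esp by 4. */
uint32_t popl(struct machine *m)
{
    uint32_t value;

    memcpy(&value, &m->mem[m->esp], 4);
    m->esp += 4;
    return value;
}
```

Starting with %esp at 0x108, pushing 213 moves %esp to 0x104, and popping returns 213 and restores %esp to 0x108. Note that popl does not erase the old value; it only moves the stack pointer.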

Stack Operation Examples

The stack already holds the value 123 and occupies addresses 0x108 through 0x110.

              Initially    After pushl %eax      After popl %edx
  %eax        213          213                   213
  %edx        555          555                   213
  %esp        0x108        0x104                 0x108
  Top         0x108        0x104 (holds 213)     0x108

After the pop, the value 213 is still in memory at 0x104, but it is no longer part of the stack.

Procedure Control Flow

Procedure call:
  call label
    Push address of next instruction (after the call) on the stack
    Jump to label
Procedure return:
  ret
    Pop address from the stack into the eip register
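
In terms of machine state, call and ret only touch %esp, %eip, and one stack slot. A minimal simulation, assuming a toy memory array with no bounds checking; the struct and function names are hypothetical, and the length of the call instruction is passed in explicitly because the return address is the address of the instruction after it.

```c
#include <stdint.h>
#include <string.h>

struct cpu {
    uint8_t  mem[0x200];    /* toy memory holding the stack */
    uint32_t esp;           /* stack pointer */
    uint32_t eip;           /* program counter */
};

/* call label: push the address of the next instruction, then jump to label. */
void do_call(struct cpu *c, uint32_t target, uint32_t insn_len)
{
    uint32_t ret_addr = c->eip + insn_len;

    c->esp -= 4;
    memcpy(&c->mem[c->esp], &ret_addr, 4);
    c->eip = target;
}

/* ret: pop the return address from the stack into %eip. */
void do_ret(struct cpu *c)
{
    memcpy(&c->eip, &c->mem[c->esp], 4);
    c->esp += 4;
}
```

For a 5-byte call at 0x804854e to 0x8048b90 with %esp at 0x108, do_call leaves %esp at 0x104 with 0x8048553 stored there, and a later do_ret sets %eip back to 0x8048553.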

Procedure Call Example

    804854e: e8 3d 06 00 00    call 8048b90 <main>
    8048553: 50                next instruction

              Before call       After call 8048b90
  %esp        0x108             0x104
  %eip        0x804854e         0x8048b90
  Stack       holds 123         return address 0x8048553 pushed at 0x104

%eip is the program counter

Procedure Return Example

    8048e90: c3    ret

              Before ret                      After ret
  %esp        0x104                           0x108
  %eip        0x8048e90                       0x8048553
  Stack       0x8048553 on top, at 0x104      return address popped into %eip

%eip is the program counter

Procedure Control Flow

When procedure foo calls who:
  foo is the caller, who is the callee
  Control is transferred to the callee
When procedure returns
  Control is transferred back to the caller
Last-called, first-return (LIFO) order
  Naturally implemented via the stack

    foo(…) {
        • • •
        who();        /* call */
        • • •
    }

    who(…) {
        • • •
        amI();        /* call */
        • • •
        amI();
        • • •
    }

    amI(…) {
        • • •
        • • •         /* ret */
    }

Procedure calls and stack frames

How does the callee know where to return later?
  Return address placed in a well-known location on the stack within a "stack frame"
How are arguments passed to the callee?
  Arguments placed in a well-known location on the stack within a "stack frame"
Upon procedure invocation
  Stack frame created for the procedure
  Stack frame is pushed onto the program stack
Upon procedure return
  Its frame is popped off of the stack
  Caller's stack frame is recovered

Call chain: foo => who => amI
  Stack bottom (highest address)
    foo's stack frame
    who's stack frame
    amI's stack frame
  Stack grows toward lower addresses

Keeping track of stack frames

The stack pointer (%esp) moves around
  Can be changed within a procedure
Problem
  How can we consistently find our parameters?
The base pointer (%ebp)
  Points to the base of our current stack frame
  Also called the frame pointer
  Within each function, %ebp stays constant
Most information on the stack is referenced relative to the base pointer
  Base pointer setup is the programmer's job
  Actually usually the compiler's job

IA32/Linux Stack Frame

Current stack frame (from top to bottom)
  Parameters for the function about to be called
    The "argument build" area, used when the current function acts as a caller
  Local variables
    If they can't be kept in registers
  Saved register context
  Old frame pointer
Caller stack frame
  Return address
    Pushed by the call instruction
  Arguments for this call
    Placed by the caller in its own "argument build" area

Stack layout, from higher to lower addresses:
  Caller frame     • • •
                   Arguments
                   Return addr
  Current frame    Old %ebp                              <- frame pointer (%ebp)
                   Saved registers + local variables
                   Argument build                        <- stack pointer (%esp)
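
The frame layout can be checked with a small simulation of the caller's pushes and the callee's prologue and epilogue. Everything here is a sketch under stated assumptions: a toy memory array with no bounds checking, two 32-bit arguments pushed right to left, and invented names for the struct and helper functions.

```c
#include <stdint.h>
#include <string.h>

struct frame_cpu {
    uint8_t  mem[0x200];    /* toy memory holding the stack */
    uint32_t esp, ebp;
};

static void push32(struct frame_cpu *c, uint32_t v)
{
    c->esp -= 4;
    memcpy(&c->mem[c->esp], &v, 4);
}

static uint32_t load32(const struct frame_cpu *c, uint32_t addr)
{
    uint32_t v;

    memcpy(&v, &c->mem[addr], 4);
    return v;
}

/* Caller side: push arguments right to left, then the return address. */
void call_with_two_args(struct frame_cpu *c, uint32_t arg1, uint32_t arg2,
                        uint32_t ret_addr)
{
    push32(c, arg2);
    push32(c, arg1);
    push32(c, ret_addr);            /* done by the call instruction */
}

/* Callee prologue: pushl %ebp; movl %esp,%ebp. */
void prologue(struct frame_cpu *c)
{
    push32(c, c->ebp);
    c->ebp = c->esp;
}

/* Callee epilogue: movl %ebp,%esp; popl %ebp. */
void epilogue(struct frame_cpu *c)
{
    c->esp = c->ebp;
    c->ebp = load32(c, c->esp);
    c->esp += 4;
}

/* Argument n (0-based) sits 8 + 4*n bytes above %ebp. */
uint32_t arg(const struct frame_cpu *c, unsigned n)
{
    return load32(c, c->ebp + 8 + 4 * n);
}
```

After the prologue, the old %ebp is at 0(%ebp), the return address at 4(%ebp), and the arguments at 8(%ebp) and 12(%ebp), no matter how %esp moves afterwards.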

swap

    void swap(int *xp, int *yp) {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }

    int zip1 = 15213;
    int zip2 = 91125;

    void call_swap() {
        swap(&zip1, &zip2);
    }

Calling swap from call_swap

    call_swap:
        • • •
        pushl $zip2     # Global Var
        pushl $zip1     # Global Var
        call swap
        • • •

Resulting stack (higher addresses first):
    • • •
    &zip2
    &zip1
    Rtn adr     <- %esp

swap

    void swap(int *xp, int *yp) {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }

    swap:
        # Setup
        pushl %ebp
        movl %esp,%ebp
        pushl %ebx

        # Body
        movl 12(%ebp),%ecx
        movl 8(%ebp),%edx
        movl (%ecx),%eax
        movl (%edx),%ebx
        movl %eax,(%edx)
        movl %ebx,(%ecx)

        # Finish
        movl -4(%ebp),%ebx
        movl %ebp,%esp
        popl %ebp
        ret

Page 58:

swap Setup #1

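swap:
  pushl %ebp
  movl %esp,%ebp
  pushl %ebx

Entering Stack (higher addresses first)
  •••
  &zip2
  &zip1
  Rtn adr     <- %esp
  (%ebp points into the caller's frame)

Resulting stack (after pushl %ebp)
  •••
  yp
  xp
  Rtn adr
  Old %ebp    <- %esp
  (%ebp still points into the caller's frame)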

Page 59:

swap Setup #2

swap:
  pushl %ebp
  movl %esp,%ebp
  pushl %ebx

Stack before instruction (higher addresses first)
  •••
  yp
  xp
  Rtn adr
  Old %ebp    <- %esp
  (%ebp still points into the caller's frame)

Resulting stack (after movl %esp,%ebp)
  •••
  yp
  xp
  Rtn adr
  Old %ebp    <- %ebp, %esp

Page 60:

swap Setup #3

swap:
  pushl %ebp
  movl %esp,%ebp
  pushl %ebx

Stack before instruction (higher addresses first)
  •••
  yp
  xp
  Rtn adr
  Old %ebp    <- %ebp, %esp

Resulting Stack (after pushl %ebx)
  •••
  yp
  xp
  Rtn adr
  Old %ebp    <- %ebp
  Old %ebx    <- %esp

Page 61:

Effect of swap Setup

Entering Stack (higher addresses first)
  •••
  &zip2
  &zip1
  Rtn adr     <- %esp
  (%ebp points into the caller's frame)

Resulting Stack, with Offset (relative to %ebp)
  •••
  12  yp
   8  xp
   4  Rtn adr
   0  Old %ebp    <- %ebp
      Old %ebx    <- %esp

Body
  movl 12(%ebp),%ecx   # get yp
  movl 8(%ebp),%edx    # get xp
  . . .

Page 62:

swap Finish #1

movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret

swap’s Stack, before and after movl -4(%ebp),%ebx (the stack itself is unchanged)
  Offset
  •••
  12  yp
   8  xp
   4  Rtn adr
   0  Old %ebp    <- %ebp
  -4  Old %ebx    <- %esp

Observation
  Saved & restored register %ebx

Page 63:

swap Finish #2

movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret

swap’s Stack before instruction
  Offset
  •••
  12  yp
   8  xp
   4  Rtn adr
   0  Old %ebp    <- %ebp
  -4  Old %ebx    <- %esp

swap’s Stack after movl %ebp,%esp
  Offset
  •••
  12  yp
   8  xp
   4  Rtn adr
   0  Old %ebp    <- %ebp, %esp

Page 64:

swap Finish #3

movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret

swap’s Stack before instruction
  Offset
  •••
  12  yp
   8  xp
   4  Rtn adr
   0  Old %ebp    <- %ebp, %esp

swap’s Stack after popl %ebp
  •••
  yp
  xp
  Rtn adr         <- %esp
  (%ebp again points into the caller's frame)

Page 65:

swap Finish #4

movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret

swap’s Stack before instruction
  •••
  yp
  xp
  Rtn adr         <- %esp
  (%ebp points into the caller's frame)

Exiting Stack (after ret)
  •••
  &zip2
  &zip1           <- %esp
  (%ebp points into the caller's frame)

Observation
  Saved & restored register %ebx
  Didn’t do so for %eax, %ecx, or %edx

Page 66:

swap

void swap(int *xp, int *yp)
{
  int t0 = *xp;
  int t1 = *yp;
  *xp = t1;
  *yp = t0;
}

swap:
  Setup
    pushl %ebp           # Save old %ebp of caller frame
    movl %esp,%ebp       # Set new %ebp for callee (current) frame
    pushl %ebx           # Save state of %ebx register from caller
  Body
    movl 12(%ebp),%ecx   # Retrieve parameter yp from caller frame
    movl 8(%ebp),%edx    # Retrieve parameter xp from caller frame
    movl (%ecx),%eax     # Perform swap
    movl (%edx),%ebx
    movl %eax,(%edx)
    movl %ebx,(%ecx)
  Finish
    movl -4(%ebp),%ebx   # Restore the state of caller’s %ebx register
    movl %ebp,%esp       # Set stack pointer to bottom of callee frame (%ebp)
    popl %ebp            # Restore %ebp to original state
    ret                  # Pop return address from stack to %eip

The pair movl %ebp,%esp / popl %ebp is equivalent to a single leave instruction
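The setup, body, and finish steps can be traced with a toy model of the stack. The sketch below is only an illustration under stated assumptions: memory is a small array of 32-bit words, "addresses" are byte offsets into that array, and the names Machine, mem_load, mem_store, push32, pop32, swap_sim, and run_call_swap are all hypothetical. It models the register and stack effects of each instruction in the swap listing, not real IA32 execution.

```c
#include <stdint.h>

/* Toy model: 64 words of memory; addresses are byte offsets into mem. */
enum { MEM_WORDS = 64 };

typedef struct {
    uint32_t mem[MEM_WORDS];
    uint32_t esp, ebp, eax, ebx, ecx, edx;
} Machine;

uint32_t mem_load(const Machine *m, uint32_t addr) { return m->mem[addr / 4]; }
void mem_store(Machine *m, uint32_t addr, uint32_t v) { m->mem[addr / 4] = v; }

/* The stack grows toward lower addresses: push decrements %esp first. */
void push32(Machine *m, uint32_t v) { m->esp -= 4; mem_store(m, m->esp, v); }
uint32_t pop32(Machine *m) { uint32_t v = mem_load(m, m->esp); m->esp += 4; return v; }

/* Mirrors the swap listing one instruction at a time. On entry the stack
   holds, from %esp upward: return address, xp, yp. */
void swap_sim(Machine *m)
{
    /* Setup */
    push32(m, m->ebp);                   /* pushl %ebp          */
    m->ebp = m->esp;                     /* movl %esp,%ebp      */
    push32(m, m->ebx);                   /* pushl %ebx          */

    /* Body */
    m->ecx = mem_load(m, m->ebp + 12);   /* movl 12(%ebp),%ecx  (yp) */
    m->edx = mem_load(m, m->ebp + 8);    /* movl 8(%ebp),%edx   (xp) */
    m->eax = mem_load(m, m->ecx);        /* movl (%ecx),%eax    */
    m->ebx = mem_load(m, m->edx);        /* movl (%edx),%ebx    */
    mem_store(m, m->edx, m->eax);        /* movl %eax,(%edx)    */
    mem_store(m, m->ecx, m->ebx);        /* movl %ebx,(%ecx)    */

    /* Finish */
    m->ebx = mem_load(m, m->ebp - 4);    /* movl -4(%ebp),%ebx  */
    m->esp = m->ebp;                     /* movl %ebp,%esp      */
    m->ebp = pop32(m);                   /* popl %ebp           */
    (void)pop32(m);                      /* ret: pops the return address
                                            (this model has no %eip) */
}

/* Plays the role of call_swap: stores zip1 and zip2, pushes their
   addresses and a dummy return address, then runs swap_sim. */
Machine run_call_swap(void)
{
    Machine m = {0};
    const uint32_t zip1_addr = 0, zip2_addr = 4;  /* stand-ins for globals */
    mem_store(&m, zip1_addr, 15213);
    mem_store(&m, zip2_addr, 91125);
    m.esp = MEM_WORDS * 4;   /* empty stack at the top of memory */
    m.ebp = 0xB0;            /* arbitrary caller frame pointer   */
    m.ebx = 0xCAFE;          /* caller's value in a callee-save register */
    push32(&m, zip2_addr);   /* pushl $zip2 */
    push32(&m, zip1_addr);   /* pushl $zip1 */
    push32(&m, 0x1234);      /* call swap pushes the return address */
    swap_sim(&m);
    return m;
}
```

After run_call_swap() returns, the two stored values have traded places, %ebx and %ebp hold the caller's values again, and %esp points at the first argument, which matches the Exiting Stack picture.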

Page 67:

Local variables

Where are they in relation to %ebp?
  Stored “above” %ebp (at lower addresses)
How are they preserved if the current function calls another function?
  Compiler updates %esp beyond local variables before issuing “call”
What happens to them when the current function returns?
  Are lost (i.e. no longer valid)
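Because locals vanish with the frame, a function should not hand its caller the address of one. A minimal sketch of the usual alternative, in which the caller owns the storage; the names make_counter and use_counter are made up for illustration:

```c
/* Broken pattern, shown only as a comment: the returned pointer refers
   to a frame that no longer exists once the function has returned.

       int *make_counter_bad(void) { int n = 42; return &n; }
*/

/* Safer alternative: the caller supplies the storage. */
void make_counter(int *out)
{
    int n = 42;   /* lives in make_counter's frame */
    *out = n;     /* copy the value out before the frame is torn down */
}

int use_counter(void)
{
    int value = 0;          /* lives in use_counter's frame */
    make_counter(&value);   /* still valid: this frame outlives the call */
    return value;
}
```

Here value sits in the caller's frame, so it stays valid for the whole call; only n is lost when make_counter returns.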

Page 68:

Register Saving Conventions

When procedure foo calls who:
  foo is the caller, who is the callee
Can Register be Used for Temporary Storage?
Conventions
  “Caller Save”
    Caller saves temporary in its frame before calling
  “Callee Save”
    Callee saves temporary in its frame before using

Page 69:

IA32 Register Usage

Integer Registers
  Two have special uses
    %ebp, %esp
  Three managed as callee-save
    %ebx, %esi, %edi
    Old values saved on stack prior to using
  Three managed as caller-save
    %eax, %edx, %ecx
    Do what you please, but expect any callee to do so, as well
  Return value in %eax

Figure: register groups
  Caller-Save Temporaries: %eax, %edx, %ecx
  Callee-Save Temporaries: %ebx, %esi, %edi
  Special: %esp, %ebp

Page 70:

simple.c

_simple:
  pushl %ebp            # Setup stack frame pointer
  movl %esp, %ebp
  movl 8(%ebp), %edx    # get xp
  movl 12(%ebp), %ecx   # get y
  movl (%edx), %eax     # move *xp to t
  addl %ecx, %eax       # add y to t
  movl %eax, (%edx)     # store t at *xp
  popl %ebp             # restore frame pointer
  ret                   # return to caller

int simple(int *xp, int y)
{
  int t = *xp + y;
  *xp = t;
  return t;
}

gcc -O2 -c simple.c
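To see what the listing computes, it can help to call simple from a small driver. The sketch below repeats simple from the slide and adds a hypothetical caller, call_simple, which is not part of the original material. The comment about argument order assumes the cdecl convention used in the listing, where xp is at 8(%ebp) and y at 12(%ebp).

```c
/* simple() as on the slide: adds y to *xp, stores the sum back through
   xp, and returns it (the listing leaves t in %eax). */
int simple(int *xp, int y)
{
    int t = *xp + y;
    *xp = t;
    return t;
}

/* Hypothetical caller, for illustration only. */
int call_simple(void)
{
    int x = 10;
    int r = simple(&x, 5);     /* cdecl: y is pushed first, then &x */
    return (r == x) ? r : -1;  /* return value and x both hold the sum */
}
```

The return value and the updated x agree because the same t is both stored through xp and left in %eax.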

Page 71:

Function pointers

Pointers in C can also point to code locations
Function pointers
  Store and pass references to code
Some uses
  Dynamic “late-binding” of functions
    Dynamically “set” a random number generator
    Replace large switch statements for implementing dynamic event handlers
      » Example: dynamically setting behavior of GUI buttons
  Emulating “virtual functions” and polymorphism from OOP
    qsort() with user-supplied callback function for comparison
      » man qsort
    Operating on lists of elements
      » multiplication, addition, min/max, etc.
Malware leverages this to execute its own code
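The qsort() use mentioned above is a compact way to see late binding: the library routine calls back through whatever function pointer it is given. A minimal sketch using the standard qsort from <stdlib.h>; the helper names cmp_int_asc, cmp_int_desc, and sort_ints are made up for this example.

```c
#include <stdlib.h>

/* Comparison callback in the form qsort() expects: negative, zero, or
   positive depending on how *a orders relative to *b. */
int cmp_int_asc(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

int cmp_int_desc(const void *a, const void *b)
{
    return cmp_int_asc(b, a);   /* same test with the operands reversed */
}

/* The sort order is picked at run time by choosing which function's
   address to pass; qsort itself never changes. */
void sort_ints(int *v, size_t n, int descending)
{
    int (*cmp)(const void *, const void *) =
        descending ? cmp_int_desc : cmp_int_asc;
    qsort(v, n, sizeof v[0], cmp);
}
```

In a disassembly, the call inside qsort would show up as an indirect call through a register or memory operand, which is why the target cannot be read off the call instruction alone.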

Page 72:

Using pointers to functions

// function prototypes
int doEcho(char*);
int doExit(char*);
int doHelp(char*);
int setPrompt(char*);

// dispatch table section
typedef int (*func)(char*);

typedef struct {
  char* name;
  func function;
} func_t;

func_t func_table[] = {
  { "echo", doEcho },
  { "exit", doExit },
  { "quit", doExit },
  { "help", doHelp },
  { "prompt", setPrompt },
};

#define cntFuncs (sizeof(func_table) / sizeof(func_table[0]))

// find the function and dispatch it
for (i = 0; i < cntFuncs; i++) {
  if (strcmp(command, func_table[i].name) == 0) {
    done = func_table[i].function(argument);
    break;
  }
}
if (i == cntFuncs)
  printf("invalid command\n");

Page 73:

Function pointers example

#include <sys/time.h>
#include <stdio.h>

void fp1(int i) { printf("Even %d\n", i); }
void fp2(int i) { printf("Odd %d\n", i); }

int main(int argc, char **argv)
{
  void (*fp)(int);
  int i = argc;

  if (argc % 2)
    fp = fp2;
  else
    fp = fp1;
  fp(i);
}

mashimaro % ./funcp a
Even 2
mashimaro % ./funcp a b
Odd 3
mashimaro %

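main:
  leal 4(%esp), %ecx     # %ecx = address of argc
  andl $-16, %esp        # align the stack to 16 bytes
  pushl -4(%ecx)
  pushl %ebp
  movl %esp, %ebp
  pushl %ecx
  subl $4, %esp
  movl (%ecx), %eax      # %eax = argc
  movl $fp2, %edx        # start with fp = fp2
  testb $1, %al          # is argc odd?
  jne .L4                # yes: keep fp2
  movl $fp1, %edx        # no: fp = fp1
.L4:
  movl %eax, (%esp)      # pass i as the argument
  call *%edx             # indirect call through the function pointer
  addl $4, %esp
  popl %ecx
  popl %ebp
  leal -4(%ecx), %esp
  ret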

Page 74:

Uses in operating system

Interrupt descriptor table
  Pointers to interrupt handler functions
  IDTR points to IDT
System services descriptor table
  Pointers to system call functions
Import address table
  Pointers to imported library calls
Malware attacks all of these

Page 75:

More disassembly

Code patterns in assembly
  Calling conventions (fast vs. standard vs. cdecl)
  ebp omission
  ecx use as C++ this pointer
  C++ vtables (virtual function table)
  WinXP SP2 prologue with patching support
    For detours
  Exception handlers (FS register)
    Linked list of functions stored in exception frames on stack

Page 76:

Advanced disassembly

Windows examples
  Largely the same with small modifications
  Size of operands (e.g. dword) specified explicitly (not in an operator suffix such as the l in movl)
  Reverse ordering of operands (destination first)

Page 77:

Disassembly example

0000 mov ecx, 5
0003 push aHello
0009 call printf
000E loop 00000003h
0014 ...

for(int i=0;i<5;i++)
{
  printf("Hello");
}

0000 cmp ecx, 100h
0003 jnz 001Bh
0009 push aYes
000F call printf
0015 jmp 0027h
001B push aNo
0021 call printf
0027 ...

if(x == 256)
{
  printf("Yes");
}
else
{
  printf("No");
}

Page 78:

Disassembly example

int main(int argc, char **argv)
{
  WSADATA wsa;
  SOCKET s;
  struct sockaddr_in name;
  unsigned char buf[256];

  // Initialize Winsock
  if(WSAStartup(MAKEWORD(1,1),&wsa))
    return 1;

  // Create Socket
  s = socket(AF_INET,SOCK_STREAM,0);
  if(INVALID_SOCKET == s)
    goto Error_Cleanup;

  name.sin_family = AF_INET;
  name.sin_port = htons(PORT_NUMBER);
  name.sin_addr.S_un.S_addr = htonl(INADDR_ANY);

  // Bind Socket To Local Port
  if(SOCKET_ERROR == bind(s,(struct sockaddr*)&name,sizeof(name)))
    goto Error_Cleanup;

  // Set Backlog parameters
  if(SOCKET_ERROR == listen(s,1))
    goto Error_Cleanup;

push ebp
mov ebp, esp
sub esp, 2A8h
lea eax, [ebp+0FFFFFE70h]
push eax
push 101h
call 4012BEh
test eax, eax
jz 401028h
mov eax, 1
jmp 40116Fh

push 0
push 1
push 2
call 4012B8h
mov dword ptr [ebp+0FFFFFE6Ch], eax
cmp dword ptr [ebp+0FFFFFE6Ch], byte 0FFh
jnz 401047h
jmp 401165h

mov word ptr [ebp+0FFFFFE5Ch], 2
push 800h
call 4012B2h
mov word ptr [ebp+0FFFFFE5Eh], ax
push 0
call 4012ACh
mov dword ptr [ebp+0FFFFFE60h], eax

push 10h
lea ecx, [ebp+0FFFFFE5Ch]
push ecx
mov edx, [ebp+0FFFFFE6Ch]
push edx
call 4012A6h
cmp eax, byte 0FFh
jnz 40108Dh
jmp 401165h

push 1
mov eax, [ebp+0FFFFFE6Ch]
push eax
call 4012A0h
cmp eax, byte 0FFh
jnz 4010A5h
jmp 401165h
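Matching this disassembly to the source is mostly a matter of decoding constants. The sketch below shows three conversions that come up in the listing; the helper names are hypothetical, and MAKEWORD_SKETCH is a stand-in written for this example rather than the definition from the Windows headers.

```c
#include <stdint.h>

/* Stand-in for the Windows MAKEWORD macro: a is the low byte, b the
   high byte of a 16-bit value. */
#define MAKEWORD_SKETCH(a, b) \
    ((uint16_t)(((uint8_t)(a)) | ((uint16_t)((uint8_t)(b)) << 8)))

/* Interpret a 32-bit displacement as the disassembler prints it
   (e.g. 0FFFFFE70h) as the signed offset it encodes. */
int64_t signed_disp32(uint32_t disp)
{
    return (disp & 0x80000000u) ? (int64_t)disp - 0x100000000LL
                                : (int64_t)disp;
}

/* Sign-extend an 8-bit immediate (e.g. "byte 0FFh") to 32 bits. */
int32_t sign_extend8(uint8_t imm)
{
    return (imm & 0x80u) ? (int32_t)imm - 256 : (int32_t)imm;
}
```

With these, push 101h lines up with MAKEWORD(1,1), [ebp+0FFFFFE70h] reads as ebp-190h (a local, which from the source appears to be wsa), and the comparisons against byte 0FFh are comparisons against -1, the value of INVALID_SOCKET and SOCKET_ERROR. Likewise push 10h is 16, which fits sizeof(name) for a struct sockaddr_in.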

Page 79:

Tools for disassembling

IDA Pro, IDA Pro Free

– Disassembler

– Execution graph

– Cross-referencing

– Searching

– Function analysis

– Function and variable labeling

Page 80:

Tools for disassembling

objdump
  objdump -d <object_file>
  Analyzes bit pattern of series of instructions
  Produces approximate rendition of assembly code
  Can be run on either executable or relocatable (.o) file
gdb Debugger
  gdb p
  disassemble sum
    Disassemble procedure
  x/13b sum
    Examine the 13 bytes starting at sum

Page 81:

In-class exercise

Lab 5-1 (Steps 1-17)

– Use IDA Pro to bring up the code of DllMain

– Bring up Figures 5-1L, the equivalent of 5-2L, and 5-3L

– Find the remote shell routine in which memcmp is used to compare command strings received over the network

– Show the code for the function called if the command robotwork is invoked

– Show IDA Pro graphs of DllMain and sub_10004E79

– Explain what the assembly code on p. 499 does

– Find the socket call referred to in Table 5-1L and change its integer constants to symbolic ones

– Show the assembly on p. 500. Find the routine that calls this assembly which shows that it is an anti-VM check.

Page 82:

In-class exercise

Lab 6-1

– Show the imported network functions in any tool

– Show the output of executing the binary

– Load binary in IDA Pro to generate Figure 6-1L

Lab 6-2

– Generate Listing 6-1L and 6-2L using a tool of your choice. What calls hint at this code's function?

– Using either Wireshark or netcat with ApateDNS, execute the malware to generate Listing 6-3L

– In IDA Pro, show the functions called by main. What does each one do?

– In IDA Pro, show the order that the WinINet calls are used and explain what each one does.

– Generate Listing 6-5L and explain what each cmp does.

Page 83:

Windows

Chapter 7: Analyzing Malicious Windows Programs

Page 84:

Types

Hungarian notation
  word (w) = 16 bit value
  double word (dw) = dword = 32 bit value
    • dwSize = a variable or field holding a 32-bit value
  Handles (H)
    • HWND = A handle to a window
  Long Pointer (LP)
  Callback
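The prefixes are easiest to read in a declaration. The sketch below uses stand-in typedefs with a _SKETCH suffix so that it compiles anywhere; on Windows the real WORD, DWORD, and handle types come from <windows.h>. The structure SampleInfo and its field names are invented purely to show the naming pattern.

```c
#include <stdint.h>

/* Stand-ins for the Windows types, sized as the slide describes. */
typedef uint16_t WORD_SKETCH;     /* w  : 16-bit value */
typedef uint32_t DWORD_SKETCH;    /* dw : 32-bit value */
typedef void    *HANDLE_SKETCH;   /* H  : opaque handle, e.g. HWND */

/* Hypothetical structure whose field names carry the prefixes. */
typedef struct {
    DWORD_SKETCH  dwSize;   /* dw: 32-bit size of this structure */
    WORD_SKETCH   wPort;    /* w : 16-bit value */
    HANDLE_SKETCH hWnd;     /* h : handle to a window */
    const char   *lpName;   /* lp: (long) pointer */
} SampleInfo;

SampleInfo make_sample(void)
{
    SampleInfo s = { (DWORD_SKETCH)sizeof(SampleInfo), 80, 0, "demo" };
    return s;
}
```

When a disassembler shows a structure access, the prefix on the documented field name tells you the operand width to expect: a w field is read with a word-sized access, a dw field with a dword-sized one.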

Page 85:

File system functions

Malware often hits file system
  CreateFile, ReadFile, WriteFile
  Memory mapping calls: CreateFileMapping, MapViewOfFile
  Trickiness
    • Alternate Data Streams (special file data)
    • \Device\PhysicalMemory (accesses memory)
    • \\.\ (accesses device)

Page 86:

Registry functions

Malware often hits registry
  Registry stores OS and program configuration information
  HKEY_LOCAL_MACHINE (HKLM) – Settings global to the machine
  HKEY_CURRENT_USER (HKCU) – Settings for current user
  Regedit tool for examining values
  Functions: RegOpenKeyEx, RegSetValueEx, RegGetValue (Listing 7-1)

Page 87:

Networking APIs

Berkeley sockets API
  socket, bind, listen, accept, connect, recv, send
  Listing 7-3
WinINet API
  InternetOpen, InternetOpenUrl, InternetReadFile

Page 88:

DLLs

Dynamic link libraries
  Store code that is re-used amongst applications including malware
  Can be used to store malicious code for injection into a process
  Malware uses standard Windows DLLs to interact with OS
  Malware uses third-party DLLs (e.g. Firefox DLL) to avoid re-implementing functions

Page 89:

Processes

Execute code outside of current process
  CreateProcess
  Listing 7-4
Hijack execution of current process
  Injecting code via debugger or DLLs
Companion execution
  Store executable in resource section of PE
  Program extracts executable and writes it to disk upon execution

Page 90:

Threads

Windows threads share same memory space but have separate registers and stack
  Used by malware to insert a malicious DLL into a process's address space
  CreateThread with address of LoadLibrary as start address

Page 91:

Services

Processes run in the background
  Scheduled and run by Windows service manager without user input
  OpenSCManager, CreateService, StartService
  Allows malware to maintain persistence on a machine
  Types
    • WIN32_SHARE_PROCESS = service code lives in a DLL and shares one process with other services (e.g. svchost.exe)
    • WIN32_OWN_PROCESS = independent process
    • KERNEL_DRIVER = loads code into kernel

Page 92:

COM

Microsoft Component Object Model
  Interface standard that allows software components to call each other
    • OleInitialize, CoInitializeEx
    • CLSID = class identifier, IID = interface identifier
  “Navigate” function in IWebBrowser2 interface
    • Used by malware to launch browser
    • Listing 7-11
  Malware implemented as COM server
    • Browser helper objects
    • Identify a COM server by the functions it exports
      – DllCanUnloadNow, DllGetClassObject, DllInstall, DllRegisterServer, DllUnregisterServer

Page 93:

Exceptions

Allow program to handle exceptional conditions during program execution
Windows Structured Exception Handling
  • Exception handling information stored on stack
  • Listing 7-13
  • Not all handlers respond to all exceptions
  • Thrown to caller's frame if not handled
Used by malware to hijack execution
  • Handler address replaced by address to injected malicious code
  • Adversary then triggers exception
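The search order described above (try the innermost handler, pass the exception outward if it declines) can be sketched as a walk over a linked list of handler records. This is a toy model and not the Windows SEH ABI: the Frame layout, the handler signature, and every name below are assumptions made for illustration.

```c
#include <stddef.h>

/* A handler returns 1 if it deals with the exception code, 0 if not. */
typedef int (*handler_fn)(int code);

/* One record per frame, linked from the innermost frame outward. */
typedef struct Frame {
    struct Frame *prev;      /* record belonging to the caller */
    handler_fn    handler;   /* function pointer stored on the stack */
} Frame;

int handle_div_zero(int code) { return code == 1; }
int handle_access(int code)   { return code == 2; }

/* Walk the chain from the innermost record. Returns how many records
   declined before one accepted the exception, or -1 if none did. */
int dispatch_exception(const Frame *top, int code)
{
    int depth = 0;
    for (const Frame *f = top; f != NULL; f = f->prev, depth++) {
        if (f->handler(code))
            return depth;
    }
    return -1;
}

/* Two nested frames: the inner one handles code 1, the outer code 2. */
int demo_dispatch(int code)
{
    Frame outer = { NULL,   handle_access   };
    Frame inner = { &outer, handle_div_zero };
    return dispatch_exception(&inner, code);
}
```

Since each record holds a plain function pointer in stack memory, whatever address is stored in handler is what runs when an exception reaches that record, which is the property the hijacking technique on this slide relies on.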

Page 94:

Kernel-mode malware

Windows API calls (Kernel32.dll)
  Typically call into underlying Native API (Ntdll.dll)
  Code in Ntdll then transfers to kernel (Ntoskrnl.exe) via INT 0x2E, SYSENTER, SYSCALL
    • Figure 7-3
Malware often calls Ntdll directly to avoid detection via interposition of security programs between Kernel32.dll and Ntdll.dll
    • Example: Windows API (ReadFile, WriteFile) versus Native API (NtReadFile, NtWriteFile)
    • Figure 7-4

Page 95:

Kernel-mode malware

Other Native API calls
  NtQuerySystemInformation, NtQueryInformationProcess, NtQueryInformationThread, NtQueryInformationFile, NtQueryInformationKey
    • Can also carry “Zw” prefix
  NtContinue
    • Used to return from an exception
    • Location to return is specified in exception context, but can be modified to transfer execution in nefarious ways

Page 96:

Kernel-mode malware

Legitimate programs typically do not use the Native API exclusively

Programs that are native applications (as specified in the subsystem field of the PE header) are likely malicious

Page 97:

In-class exercise

Lab 7-2

– Using strings, identify the network resource being used by the malware

– What imports give away the mechanism this malware uses to launch the browser?

– Go to the code snippet shown on p. 518. Follow the references to show the values of rclsid and riid in memory.

– Debug the program and break at the call shown on p. 519. Run the call to show the browser being launched with the embedded URL

Page 98:

Extra

Page 99:

Run-time data structures

Page 100:

More code snippets

Registry modifications for disabling task manager and changing browser default page

HKEY_CURRENT_USER\Software\Policies\Microsoft\Internet Explorer\Control Panel,Homepage
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\System DisableRegistryTools
HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main Start Page
HKEY_CURRENT_USER\Software\Yahoo\pager\View\YMSGR_buzz content url
HKEY_CURRENT_USER\Software\Yahoo\pager\View\YMSGR_Launchcast
DisableTaskMgr

Page 101:

More code snippets

Kills anti-virus, zone-alarm, firewall processes

Page 102:

More code snippets

New variants
  Download worm update files and register them as services
  regsvr32 MSINET.OCX
    Internet Transfer ActiveX Control
  Check for updates