tetcon2016 160104

BE-PUM: Binary Emulation for Pushdown Model Generation. Obfuscation code localization based on CFG generation of malware. Nguyen Minh Hai, Industrial University of Ho Chi Minh City (IUH), with Quan Thanh Tho, Ho Chi Minh City University of Technology (HCMUT), in collaboration with Mizuhito Ogawa (JAIST). January 2016

Upload: bordeaux-i

Post on 27-Jan-2017


TRANSCRIPT

Page 1: Tetcon2016 160104

BE-PUM: Binary Emulation for Pushdown Model Generation

Obfuscation code localization based on CFG generation of malware

Nguyen Minh Hai, Industrial University of Ho Chi Minh City (IUH)

with Quan Thanh Tho, Ho Chi Minh City University of Technology (HCMUT), in collaboration with Mizuhito Ogawa (JAIST)

January 2016

Page 2: Tetcon2016 160104

BE-PUM

• Binary Emulation for Pushdown Model Generation

• Key features:
  – Generate model (CFG) from binary code of malware
  – Show better results compared with many other tools, e.g. IDA Pro, Jakstab, Hopper...
  – Tackle many obfuscation techniques and successfully unpack many packers (27 different packers)
    – Generic Unpacker for Model Generation of Malware
  – Detect packer by semantic signature (recognizing packer techniques)
    – Semantic Signature Matching for Packer Detection

Page 3: Tetcon2016 160104

Agenda

1. Motivation
2. BE-PUM
3. Experiments
4. Conclusions
5. Demo

Page 4: Tetcon2016 160104

Malware

• Malware (malicious software): a real threat
  – Virus
  – Trojan horse
  – Keylogger

• How to deal with it
  – Signature detection (industry approach)
  – Emulation (sandbox approach)
  – Model checking (formal approach)

Page 5: Tetcon2016 160104

Issues

• Signature-based: defeated by obfuscation techniques

• Sandbox-based
  – Heavy cost
  – A virus may behave differently at different points in time
  – A virus may even detect the sandbox environment

• Model checking
  – Model generation
  – Model checking

Page 6: Tetcon2016 160104

Model Checking Outline

Model Generation → Model Checking

Page 7: Tetcon2016 160104

Typical approach

• A Control Flow Graph (CFG) is generated as the model
  – Each program location is mapped to a node
  – All destinations must be decided when branching

• Things are more difficult with sophisticated binaries:
  – Self-modifying code (encryption/decryption)
  – Indirect jumps
  – Many other obfuscation techniques

Page 8: Tetcon2016 160104

Control Flow Graph

• Choices of many tools (CodeSurfer/x86, McVeto, JakStab, BIRD, Renovo, Syman, BINCOA/OSMOSE, IDA Pro)

Hexa          Instructions
0x00401000    cmp eax, 0
0x00401003    jle 0x0040100d
0x00401005    mov eax, 0x00401001
0x0040100a    jmp 0x00401015
0x0040100c    halt
0x0040100d    mov eax, 0x00401018
0x00401012    sub eax, 5
0x00401015    sub eax, 1
0x00401018    0x0040100c

[Figure: CFG of the listing, node labels 00, 03, 05, 0A, 12, 0D, 15, 18, 0A]
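A minimal sketch of the typical static approach on the listing above, assuming the final instruction at 0x00401018 is the indirect `jmp eax` that appears in the running example later in the deck. The tuple format and the `static_successors` helper are illustrative and do not come from any of the tools named here. Each location becomes a node, direct branches take their destination from the operand, and the indirect jump is left unresolved (`None`), which is the point where a purely static CFG stops.

```python
# Hypothetical representation of the slide's listing: (address, instruction text).
LISTING = [
    (0x00401000, "cmp eax, 0"),
    (0x00401003, "jle 0x0040100d"),
    (0x00401005, "mov eax, 0x00401001"),
    (0x0040100A, "jmp 0x00401015"),
    (0x0040100C, "halt"),
    (0x0040100D, "mov eax, 0x00401018"),
    (0x00401012, "sub eax, 5"),
    (0x00401015, "sub eax, 1"),
    (0x00401018, "jmp eax"),  # assumption: indirect jump, as in the running example
]


def static_successors(listing):
    """Map each address to its statically known successors.

    A value of None means the destination depends on a register and
    cannot be decided without knowing run-time values.
    """
    addresses = [addr for addr, _ in listing]
    successors = {}
    for index, (addr, text) in enumerate(listing):
        mnemonic, _, operand = text.partition(" ")
        following = addresses[index + 1] if index + 1 < len(addresses) else None
        fallthrough = [following] if following is not None else []
        if mnemonic == "halt":
            successors[addr] = []
        elif mnemonic == "jmp":
            # Direct jump: one target. Indirect jump: unknown.
            successors[addr] = [int(operand, 16)] if operand.startswith("0x") else None
        elif mnemonic.startswith("j"):
            # Conditional jump: branch target plus fall-through.
            successors[addr] = [int(operand, 16)] + fallthrough
        else:
            successors[addr] = fallthrough
    return successors


cfg = static_successors(LISTING)
for addr, succ in cfg.items():
    shown = "unresolved" if succ is None else [hex(s) for s in succ]
    print(hex(addr), "->", shown)
```

The `None` entry for 0x00401018 is the gap that the concolic approach described later in the deck is meant to fill.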

Page 9: Tetcon2016 160104

Example

Page 10: Tetcon2016 160104


Page 11: Tetcon2016 160104


Page 12: Tetcon2016 160104

Demo

Page 13: Tetcon2016 160104

Demo

Page 14: Tetcon2016 160104

BE-PUM

• BE-PUM - Binary Emulation for Pushdown Model

• Apply pushdown model generation of binary code
  – Apply concolic testing (dynamic symbolic execution) to handle indirect jumps
  – Apply on-the-fly model generation to handle self-modifying code
  – Focus on obfuscation techniques which are used in malware and packer tools

Page 15: Tetcon2016 160104

Running Examples

Page 16: Tetcon2016 160104

Running Example

Hexa          Instructions
0x00401000    cmp eax, 0
0x00401003    jle 0x0040100d
0x00401005    mov eax, 0x00401001
0x0040100a    jmp 0x00401015
0x0040100c    halt
0x0040100d    mov eax, 0x00401018
0x00401012    sub eax, 5
0x00401015    sub eax, 1
0x00401018    jmp eax

Page 17: Tetcon2016 160104

Running Example

Hexa          Instructions
0x00401000    cmp eax, 0
0x00401003    jle 0x0040100d
0x00401005    mov eax, 0x00401001
0x0040100a    jmp 0x00401015
0x0040100c    halt
0x0040100d    mov eax, 0x00401018
0x00401012    sub eax, 5
0x00401015    sub eax, 1
0x00401018    jmp eax

eax = α

Page 19: Tetcon2016 160104

Running Example

Hexa          Instructions
0x00401000    cmp eax, 0
0x00401003    jle 0x0040100d
0x00401005    mov eax, 0x00401001
0x0040100a    jmp 0x00401015
0x0040100c    halt
0x0040100d    mov eax, 0x00401018
0x00401012    sub eax, 5
0x00401015    sub eax, 1
0x00401018    jmp eax

eax = α

eax <= 0 (jle taken)

eax > 0 (fall through)

Page 20: Tetcon2016 160104

Running Example

Hexa          Instructions
0x00401000    cmp eax, 0
0x00401003    jle 0x0040100d
0x00401005    mov eax, 0x00401001
0x0040100a    jmp 0x00401015
0x0040100c    halt
0x0040100d    mov eax, 0x00401018
0x00401012    sub eax, 5
0x00401015    sub eax, 1
0x00401018    jmp eax

[Figure: CFG with nodes 00, 03, 05, 0a, 12, 0d, 15, 18; eax = α]

jmp to α?

• Convert the symbolic value of α into a concrete value
• Use white-box testing to under-approximate α

Page 21: Tetcon2016 160104

Test-case Generation

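[Figure: the CFG (nodes 00, 03, 05, 0a, 12, 0d, 15, 18) drawn twice, one copy per generated test case, each annotated with a concrete eax value. The values are missing from the transcript: "Test-case-1 = ", "Test-case-2 = -".]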
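One way to picture test-case generation for the running example is sketched below. The branch `cmp eax, 0; jle` splits the symbolic input α into two path conditions, α <= 0 (jump taken) and α > 0 (fall through), and one concrete witness is needed per path. The deck states elsewhere that BE-PUM asks Z3 for such values; the interval check here is only a stand-in for a solver, and the function name and the chosen witnesses are assumptions made for illustration.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def test_cases_for_jle_zero(lo=INT32_MIN, hi=INT32_MAX):
    """Return one concrete eax value per feasible outcome of `cmp eax, 0; jle`.

    [lo, hi] is the range of signed 32-bit values eax may hold on entry.
    A real concolic engine would hand the path conditions to an SMT solver;
    intersecting intervals is enough for this single comparison.
    """
    cases = {}
    taken_hi = min(hi, 0)      # path condition: eax <= 0
    if lo <= taken_hi:
        cases["taken"] = taken_hi
    fall_lo = max(lo, 1)       # path condition: eax > 0
    if fall_lo <= hi:
        cases["fallthrough"] = fall_lo
    return cases


cases = test_cases_for_jle_zero()
for path, value in cases.items():
    print(path, "-> eax =", value)
```

If the input range excludes one side, for example `test_cases_for_jle_zero(lo=5)`, only the feasible path gets a test case, which mirrors how an unsatisfiable path condition produces no input.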

Page 22: Tetcon2016 160104

Enlarging the Model by Testing Result

Simulation Snapshot

eax=0x0040100C; start=0x00401000; return=0; address=0x0040100C;

Hexa          Instructions
0x00401000    cmp eax, 0
0x00401003    jle 0x0040100d
0x00401005    mov eax, 0x00401001
0x0040100a    jmp 0x00401015
0x0040100c    halt
0x0040100d    mov eax, 0x00401018
0x00401012    sub eax, 5
0x00401015    sub eax, 1
0x00401018    jmp eax

[Figure: CFG with nodes 00, 03, 05, 0a, 12, 0d, 15, 18 and node 0c; labels Test-case-1 and Test-case-2]

Page 23: Tetcon2016 160104

Framework


Page 24: Tetcon2016 160104

Strategy for covered instruction selection

• Instruction statistics collected from virus samples

• Full list of 300 supported instructions

Groups: Arithmetic; Conditional Jump; Move; Control; Call, Jump, Return

(table rows, left to right)

add shl call je jz jne jump mov cmovg ret cmp out setna lods daa
and sal jnz jb jnae xchg cmovl int pop setnae movs das
sub dec jc jnb jae movz cmovl aaa popa setnb neg enter
or inc jnc jng jnae movsb cmovna aad popf setnbe nop in
xor adc jle ja jl movsw cmovnae aam push setnc shld int1
imul shr jnge jnl jnbe movsx cmovnbe aas pusha setne shrd int3
ror ror jge jo jg movzb cmovne bsf pushf setng stc lahf
div rep jnle jns loop movzw cmovng bswap rdtsc setnge stos lea
sbb mul js jno jp cmova cmovnge bt sahf setnl test leave
clc sar jno jpe jecxz cmovb cmovnl btc scas setnle xlat
not ror jmp loope loopne cmovbe cmovnle btr seta setno cbw
idiv rcr loop loopz loopnz cmovc cmovno bts setae setnp cwde
xadd rol cmove cmovnp cbw setb setns cmps
adc rcl cmovp cmovns cdq setbe seto cmpxchg
dec mul cmovpe cmovnz clc setc setp cmpxchg8b
shr sbb cmovpo cmovo cld sete setpe cpuid
sar cmovs cmovz cli setg setpo cwd
cltd setge sets cwde
cmc setl setz cwt

Page 25: Tetcon2016 160104

Supported 400 Windows APIs

• Kernel32.dll: _lwrite, accept, bind, CloseHandle, closesocket, connect, CopyFile, CreateFile, CreateFileMapping, CreateProcess, CreateThread, DeleteFile, ExitProcess, FindClose, FindFirstFile, FindNextFile, FreeEnvironmentStrings, GetCommandLine, GetCurrentDirectory, GetCurrentProcess, GetEnvironmentStrings, GetFileAttributes, GetFileSize, GetFileType, gethostbyname, gethostname, GetLastError, GetLocalTime, GetModuleFileName, GetModuleHandle, GetProcAddress, GetStartupInfo, GetStdHandle, GetSystemDirectory, GetSystemTime, GetTickCount, GetVersion, GetVersionEx, GetWindowsDirectory, HeapAlloc, HeapCreate, HeapDestroy, HeapFree, HeapReAlloc, IsDebuggerPresent, listen, LoadLibrary, lstrcat, lstrcmp, lstrcpy, lstrlen, MapViewOfFile, MoveFile, PeekMessageA, ReadFile, recv, RegCloseKey, RegOpenKeyEx, RegSetValueEx, send, SetCurrentDirectory, SetEndOfFile, SetFileAttributes, SetFilePointer, SetHandleCount, shutdown, socket, UnmapViewOfFile, VirtualAlloc, VirtualFree, WaitForSingleObject, WinExec, WriteFile, WSACleanup, WSAStartup...

• User32.dll: MessageBox, SendMessage, FindWindow, PostMessage.

Page 26: Tetcon2016 160104

Best Practice

• Apply a breadth-first search strategy when asking Z3 to generate as many test cases as possible
• Use JNA (Java Native Access) to simulate API calls

Page 27: Tetcon2016 160104

Indirect Jump

• Virus.Win32.Aztec

00401057 . B8 00100000 MOV EAX,1000
0040105C . 05 00004000 ADD EAX, 00400000
00401061 . FFE0 JMP EAX

[Figure: CFG produced by BE-PUM vs. CFG produced by IDA Pro]
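The three Aztec instructions above can be resolved by tracking the value of EAX, which is what emulation provides and what a static disassembler lacks. The sketch below is a toy register tracker, not BE-PUM's implementation; the tuple encoding of instructions and the function name are assumptions. Operands are read as hexadecimal, consistent with the encoding `B8 00100000` (little-endian 0x00001000).

```python
def resolve_indirect_jump(instructions):
    """Evaluate mov/add on registers and return the target of the final `jmp reg`."""
    regs = {}
    for op, dst, src in instructions:
        # A source is either an immediate (int) or another register (str).
        value = regs[src] if isinstance(src, str) else src
        if op == "mov":
            regs[dst] = value & 0xFFFFFFFF
        elif op == "add":
            regs[dst] = (regs[dst] + value) & 0xFFFFFFFF
        elif op == "jmp":
            return regs[dst]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    raise ValueError("no jmp found")


# Hypothetical encoding of the listing at 00401057..00401061.
AZTEC = [
    ("mov", "eax", 0x1000),
    ("add", "eax", 0x00400000),
    ("jmp", "eax", None),
]

target = resolve_indirect_jump(AZTEC)
print(hex(target))
```

With the concrete value of EAX known, the edge from 00401061 to the computed target can be added to the model.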

Page 28: Tetcon2016 160104

Overlapping Instruction

• HLLW.Rolog.f
• Junk code modifies the return address.

00437002 E8 03000000 CALL 0043700A
00437007 E9 EB045D45 JMP 45A074F7

00437002 CALL 0043700A
0043700A POP EBP
0043700B INC EBP
0043700C PUSH EBP
0043700D RETN
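To see why the two listings above overlap, the bytes can be stepped through by hand or with a few lines of code. The sketch below is a toy emulator covering only the six opcodes involved (E8 call rel32, EB jmp rel8, 5D pop ebp, 45 inc ebp, 55 push ebp, C3 ret). The bytes 55 and C3 at 0043700C and 0043700D are not shown as hex on the slide; they are filled in from the standard single-byte encodings of PUSH EBP and RETN. The `run` function and its return format are assumptions for illustration.

```python
import struct

BASE = 0x00437002
# E8 03000000 | E9 EB 04 5D 45 | 55 | C3
CODE = bytes.fromhex("E803000000" "E9EB045D45" "55" "C3")


def run(code, base, entry, max_steps=16):
    """Execute from `entry`; return (addresses executed, final EIP)."""
    eip, ebp, stack, trace = entry, 0, [], []
    for _ in range(max_steps):
        offset = eip - base
        if not 0 <= offset < len(code):
            break  # control left the bytes we know about
        opcode = code[offset]
        trace.append(eip)
        if opcode == 0xE8:    # call rel32: push return address, jump
            rel = struct.unpack_from("<i", code, offset + 1)[0]
            stack.append(eip + 5)
            eip = eip + 5 + rel
        elif opcode == 0xEB:  # jmp rel8
            rel = struct.unpack_from("<b", code, offset + 1)[0]
            eip = eip + 2 + rel
        elif opcode == 0x5D:  # pop ebp: fetch the pushed return address
            ebp = stack.pop()
            eip += 1
        elif opcode == 0x45:  # inc ebp: shift the return address by one byte
            ebp = (ebp + 1) & 0xFFFFFFFF
            eip += 1
        elif opcode == 0x55:  # push ebp
            stack.append(ebp)
            eip += 1
        elif opcode == 0xC3:  # ret: jump to the modified return address
            eip = stack.pop()
        else:
            break
    return trace, eip


trace, final_eip = run(CODE, BASE, 0x00437002)
print([hex(a) for a in trace], hex(final_eip))
```

The trace never visits 00437007. The RETN lands on 00437008, one byte into the `E9 EB045D45` encoding, where the bytes `EB 04` decode as a short jump. A linear disassembly that trusts the JMP at 00437007 therefore follows a path the program never takes.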

Page 29: Tetcon2016 160104

Demo

BE-PUM

IDA Pro


Page 30: Tetcon2016 160104

Self-Modifying Code

• Virus.Win32.Seppuku.1606 : Self-Modifying Code

00401646 E8 B5F9FFFF CALL 00401000

EDI = 401067

004010E5 MOV EAX,DWORD PTR SS:[EBP+401489]
004010EB STOS DWORD PTR ES:[EDI]

00401646 E8 00000000 CALL 0040164B
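The slide shows the same address, 00401646, decoding to two different calls depending on when it is read. A short check of the rel32 arithmetic, written as a sketch with a hypothetical `call_target` helper, confirms both decodings and suggests why a model built on the fly has to distinguish a location by its current bytes as well as its address.

```python
import struct


def call_target(address, raw):
    """Decode a 5-byte `E8 rel32` call located at `address`."""
    if len(raw) != 5 or raw[0] != 0xE8:
        raise ValueError("not a near relative call")
    rel = struct.unpack("<i", raw[1:])[0]  # signed little-endian displacement
    return (address + 5 + rel) & 0xFFFFFFFF


ADDRESS = 0x00401646
before = call_target(ADDRESS, bytes.fromhex("E8B5F9FFFF"))
after = call_target(ADDRESS, bytes.fromhex("E800000000"))

# Assumption for illustration: key model nodes by (address, bytes) so the
# original and the rewritten instruction are kept as separate nodes.
nodes = {
    (ADDRESS, "E8B5F9FFFF"): before,
    (ADDRESS, "E800000000"): after,
}
for (addr, raw), target in nodes.items():
    print(hex(addr), raw, "-> CALL", hex(target))
```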

Page 31: Tetcon2016 160104

Decryption

• Email-Worm.Win32.Kickin.d : Self-decryption

00609223 pop ebp
00609224 push 3d
00609226 mov byte ptr ds:[esi+9cccd0e5],dh
0060922C retn 8d9e
0060922F pxor mm5,mm3
00609232 dec ecx
00609233 fiadd word ptr ds:[ecx+80a6b31]

Decryption loop

ecx was set to 0CAh

0060933A mov ecx,0ca
00609345 lods byte ptr ds:[esi]
00609346 xor al,ah
00609348 inc ah
0060934A rol ah,2
0060934D add ah,90
00609350 stos byte ptr es:[edi]
00609351 loopd 00609345

00609223 call 00609228
00609228 mov ebx, [ebp+402705]
0060922E add ebx,28
00609231 pop eax
00609232 sub eax,ebx
00609234 mov [ebp+40270d],eax
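The decryption loop above is a byte-wise XOR with a key in AH that changes every iteration. The sketch below mirrors the loop body instruction by instruction. It rests on several assumptions not given on the slide: the initial value of AH (a made-up 0x5A here), that `add ah,90` uses a hexadecimal operand as debugger listings usually do, and that source and destination refer to the same buffer. The function names are hypothetical.

```python
def rol8(value, count):
    """Rotate an 8-bit value left by `count` bits."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF


def kickin_loop(data, ah, count=0xCA):
    """Apply the slide's loop to `data`; returns the transformed bytes.

    `count` plays the role of ECX (0xCA on the slide). It is clamped to the
    buffer length so the sketch cannot read past the end.
    """
    out = bytearray(data)
    for i in range(min(count, len(out))):
        al = out[i] ^ ah          # lods byte ptr [esi] ; xor al, ah
        ah = (ah + 1) & 0xFF      # inc ah
        ah = rol8(ah, 2)          # rol ah, 2
        ah = (ah + 0x90) & 0xFF   # add ah, 90
        out[i] = al               # stos byte ptr [edi]
    return bytes(out)


plain = bytes(range(0xCA))
encrypted = kickin_loop(plain, ah=0x5A)
decrypted = kickin_loop(encrypted, ah=0x5A)
print(decrypted == plain)
```

Because the key stream depends only on the starting AH and not on the data, running the loop twice with the same key restores the input. Only after this loop has run do the bytes at 00609223 decode as the `call 00609228` sequence, so a disassembler that reads them earlier sees the first listing instead.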

Page 32: Tetcon2016 160104

Demo

BE-PUM

IDA Pro


Page 33: Tetcon2016 160104

Comparison with others

• BE-PUM current tool: precise models (CFG) generated from real malware
  – Indirect jumps (now)
  – Self-modification (now)
  – Decryption (now)
  – SEH (now)
  – Packer techniques (now)

• Experiments
  – Compare the CFG with those generated by Jakstab and IDA Pro

Page 34: Tetcon2016 160104

Experiment statistics


Page 35: Tetcon2016 160104

Supported Techniques in Packer


Page 36: Tetcon2016 160104

Related Works


Page 37: Tetcon2016 160104

Remarks

• BE-PUM plays the roles of both model generator and model emulator for binaries
  – Model Generation: on-the-fly manner, with concolic technique
    – Missing piece: loop invariants (handled by looping many times if needed)
  – Emulator
    – A “symbolic sandbox”

Page 38: Tetcon2016 160104

Demo


Page 39: Tetcon2016 160104

Thank you for your attention