taint scope

TAINTSCOPE: A Checksum-Aware Directed Fuzzing Tool for Automatic Software Vulnerability Detection. Tielei Wang¹, Tao Wei¹, Guofei Gu², Wei Zou¹ (¹ Peking University, China; ² Texas A&M University, US)

Upload: geeksec80

Post on 11-Jun-2015


TRANSCRIPT

Page 1: Taint scope

TAINTSCOPE

A Checksum-Aware Directed Fuzzing Tool for Automatic Software Vulnerability Detection

Tielei Wang¹, Tao Wei¹, Guofei Gu², Wei Zou¹

¹ Peking University, China

² Texas A&M University, US

Page 2: Taint scope

TERMS

Checksum – a way to check the integrity of data. Used in network protocols and files.

Fuzzing – generating malformed inputs and feeding them to the application.

Dynamic Taint Analysis – runs a program and observes which computations are affected by predefined taint sources (e.g. input).

[Figure: data, checksum function, and data with its checksum field]

Page 3: Taint scope

THE PROBLEM

The input mutation space is enormous.

Most malformed inputs are dropped at an early stage if the program employs a checksum mechanism.

Page 4: Taint scope

THE PROBLEM

 1 void decode_image(FILE* fd){
 2     ...
 3     int length = get_length(fd);
 4     int recomputed_chksum = checksum(fd, length);
 5     int chksum_in_file = get_checksum(fd);
       // line 6 is used to check the integrity of inputs
 6     if(chksum_in_file != recomputed_chksum)
 7         error();
 8     int Width = get_width(input_file);
 9     int Height = get_height(input_file);
10     int size = Width*Height*sizeof(int);
11     int* p = malloc(size);
12     ...
13     for(i=0; i<Height; i++){ // read ith row to p
14         read_row(p+Width*i, i, fd);

Page 5: Taint scope

THE IDEA

Infer whether and where a program checks the integrity of its input.

Identify which input bytes can flow into sensitive points: taint analysis at byte level monitors how the application uses the input data.

Create malformed inputs that focus on the “hot bytes”.

Repair the checksum fields in the input to expose vulnerabilities.

Fully automatic.

Found 27 new vulnerabilities – Adobe Acrobat, Google Picasa and more.

Page 6: Taint scope

HOW DOES IT WORK?

1. Dynamic taint tracing
2. Detecting checksum
3. Directed fuzzing
4. Repairing crashed samples

Page 7: Taint scope

HOW DOES IT WORK?

[Figure: system overview. Components: Execution Monitor, Checksum Locator, Directed Fuzzer, Checksum Repairer. Artifacts: Modified Program, Hot Bytes Info, Instruction Profile, Crashed Samples, Reports.]

Page 8: Taint scope

HOW DOES IT WORK?

1. DYNAMIC TAINT TRACING

Runs the program with well-formed input.

Execution monitor records:

  Which input bytes are related to arguments of API functions (e.g. malloc, strcpy) – “hot bytes” report.

  Which input bytes each conditional jump instruction (e.g. JZ, JE, JB) depends on – checksum report.

Considers only data flow (no control flow).

Page 9: Taint scope

HOW DOES IT WORK?

1. DYNAMIC TAINT TRACING

Instruments instructions – movement (e.g. MOV, PUSH), arithmetic (e.g. SUB, ADD), logic (e.g. AND, XOR).

Taints all values written by an instruction with the union of all taint labels associated with the values used by that instruction.

Also tracks the eflags register.

Before: eax {0x6, 0x7}, ebx {0x8, 0x9}

Instruction: add eax, ebx

After: eax {0x6, 0x7, 0x8, 0x9}, eflags {0x6, 0x7, 0x8, 0x9}
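The union rule for `add eax, ebx` can be sketched in C as below. This is only an illustration, not TaintScope's implementation: the bitset representation, the fixed 1024-byte input size, and the names `taint_set`, `shadow_regs` and `shadow_add_eax_ebx` are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INPUT_BYTES 1024u                 /* assumed input size */
#define TAINT_WORDS (INPUT_BYTES / 64u)

/* A taint set: bit i is set when input byte i influenced the value. */
typedef struct { uint64_t bits[TAINT_WORDS]; } taint_set;

static void taint_clear(taint_set *t) { memset(t, 0, sizeof *t); }

static void taint_add(taint_set *t, size_t offset) {
    assert(offset < INPUT_BYTES);
    t->bits[offset / 64] |= (uint64_t)1 << (offset % 64);
}

static int taint_has(const taint_set *t, size_t offset) {
    assert(offset < INPUT_BYTES);
    return (int)((t->bits[offset / 64] >> (offset % 64)) & 1u);
}

/* Number of distinct input bytes in the set. */
static size_t taint_count(const taint_set *t) {
    size_t n = 0;
    for (size_t i = 0; i < TAINT_WORDS; i++)
        for (uint64_t w = t->bits[i]; w != 0; w &= w - 1)  /* clear lowest set bit */
            n++;
    return n;
}

/* Hypothetical shadow state: one taint set per tracked register. */
typedef struct { taint_set eax, ebx, eflags; } shadow_regs;

/* Shadow rule for `add eax, ebx`: both written locations (eax and eflags)
 * receive the union of the labels of the values read (eax and ebx). */
static void shadow_add_eax_ebx(shadow_regs *r) {
    for (size_t i = 0; i < TAINT_WORDS; i++) {
        uint64_t u = r->eax.bits[i] | r->ebx.bits[i];
        r->eax.bits[i] = u;
        r->eflags.bits[i] = u;
    }
}
```

Starting from eax {0x6, 0x7} and ebx {0x8, 0x9}, calling `shadow_add_eax_ebx` leaves both eax and eflags tainted by {0x6, 0x7, 0x8, 0x9}, matching the slide's example. A bitset keeps the union to one OR per 64 input bytes.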

Page 10: Taint scope

HOW DOES IT WORK?

1. DYNAMIC TAINT TRACING - EXAMPLE

Input size is 1024 bytes.

 8 int Width = get_width(input_file);
 9 int Height = get_height(input_file);
10 int size = Width*Height*sizeof(int);
11 int* p = malloc(size);

“hot bytes” report (input bytes 0x8 through 0xf reach the argument of malloc):

…0x8048d5b: invoking malloc: [0x8,0xf]…

Page 11: Taint scope

HOW DOES IT WORK?

1. DYNAMIC TAINT TRACING - EXAMPLE

Input size is 1024 bytes.

6 if(chksum_in_file != recomputed_chksum)
7     error();

Checksum report (the JZ depends on 1024 input bytes, offsets 0x0 through 0x3ff):

…0x8048d4f: JZ: 1024: [0x0,0x3ff]…

Page 12: Taint scope

HOW DOES IT WORK?

2. DETECTING CHECKSUM

Checksum detector:

Identifies potential checksum check points: the recomputed checksum value depends on many input bytes.

Instruments conditional jumps. Before each one executes, checks whether the number of taint marks associated with the eflags register exceeds a threshold.

Problem: values derived from decompressed bytes can also depend on many input bytes.
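A minimal sketch of the threshold test, assuming the execution monitor has already produced one record per conditional jump. The `jump_record` layout and the function name are hypothetical; only the address 0x8048d4f and its count of 1024 come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record emitted by the execution monitor for one conditional jump. */
typedef struct {
    uint32_t addr;                /* address of the jump instruction */
    size_t   eflags_taint_count;  /* distinct input bytes tainting eflags */
} jump_record;

/* Copies the addresses of jumps whose eflags taint count exceeds `threshold`
 * into `out` (at most `out_cap` entries) and returns how many were copied. */
static size_t find_checksum_candidates(const jump_record *recs, size_t n,
                                       size_t threshold,
                                       uint32_t *out, size_t out_cap) {
    assert(recs != NULL && out != NULL);
    size_t found = 0;
    for (size_t i = 0; i < n && found < out_cap; i++)
        if (recs[i].eflags_taint_count > threshold)
            out[found++] = recs[i].addr;
    return found;
}
```

With a threshold of 16, the jump at 0x8048d4f (1024 bytes) is reported as a candidate, while a jump that depends on only a few bytes is not. These are candidates only; the refinement step narrows them down.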

Page 16: Taint scope

HOW DOES IT WORK?

2. DETECTING CHECKSUM

Refinement:

Well-formed inputs can pass the checksum test, but most malformed inputs cannot.

Run well-formed inputs and identify the always-taken and always-not-taken instructions.

Run malformed inputs and likewise identify the always-taken and always-not-taken instructions.

Identify the conditional jump instructions that behave completely differently when processing well-formed and malformed inputs.
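The refinement can be sketched as a comparison of two branch profiles per conditional jump, one gathered over well-formed runs and one over malformed runs. The `branch_profile` counters and function names below are assumptions for illustration, not the tool's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-jump counters accumulated over a set of runs. */
typedef struct {
    unsigned taken;      /* times the jump was taken */
    unsigned not_taken;  /* times the jump fell through */
} branch_profile;

typedef enum { NEVER_SEEN, ALWAYS_TAKEN, ALWAYS_NOT_TAKEN, MIXED } behavior;

/* Summarises how one jump behaved over one set of runs. */
static behavior classify(const branch_profile *p) {
    assert(p != NULL);
    if (p->taken == 0 && p->not_taken == 0) return NEVER_SEEN;
    if (p->not_taken == 0) return ALWAYS_TAKEN;
    if (p->taken == 0) return ALWAYS_NOT_TAKEN;
    return MIXED;
}

/* A jump is kept as a checksum check point only if it behaves in exactly
 * opposite ways on well-formed and on malformed inputs. */
static bool is_checksum_check(const branch_profile *well,
                              const branch_profile *malformed) {
    behavior w = classify(well);
    behavior m = classify(malformed);
    return (w == ALWAYS_TAKEN && m == ALWAYS_NOT_TAKEN) ||
           (w == ALWAYS_NOT_TAKEN && m == ALWAYS_TAKEN);
}
```

A jump that is taken in every well-formed run and never taken in any malformed run qualifies. A jump with mixed behavior in either set does not.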

Page 17: Taint scope

HOW DOES IT WORK?

2. DETECTING CHECKSUM

Checksum detector: creates bypass rules – always-taken, always-not-taken.

6 if(chksum_in_file != recomputed_chksum)
7     error();

Checksum report:

…0x8048d4f: JZ: 1024: [0x0,0x3ff]…

Bypass rule:

0x8048d4f: JZ: always-taken

Page 18: Taint scope

HOW DOES IT WORK?

2. DETECTING CHECKSUM

Checksum detector: checksum field identification.

The input bytes that affect chksum_in_file form the checksum field.

6 if(chksum_in_file != recomputed_chksum)
7     error();

Page 19: Taint scope

HOW DOES IT WORK?

3. DIRECTED FUZZING

Generates malformed test cases – feeds them to the original or instrumented program.

According to the bypass rules, alters the execution traces at check points – sets the eflags register.

Page 20: Taint scope

HOW DOES IT WORK?

3. DIRECTED FUZZING

All malformed test cases are constructed from the “hot bytes” information, using attack heuristics:

  Bytes that influence memory allocation are set to small, large or negative values.

  Bytes that flow into string functions are replaced by characters such as %n, %p.

Output – test cases that could cause the program to crash or consume 100% CPU.
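As a rough illustration of the memory-allocation heuristic, the sketch below overwrites a hot byte range with a fill byte chosen so that an integer field inside it reads as small, large, or negative. The fill table and the helper name `mutate_hot_range` are assumptions; the slides do not say which concrete values TaintScope uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed fill bytes. Repeated across a field they give, regardless of
 * byte order: 0x00.. (small), 0x7F7F.. (large positive), 0xFFFF.. (-1). */
static const uint8_t FILL_VALUES[] = { 0x00, 0x7F, 0xFF };
#define NUM_FILL_VALUES (sizeof FILL_VALUES / sizeof FILL_VALUES[0])

/* Overwrites the inclusive hot range [lo, hi] of `buf` with `fill`.
 * All other bytes of the well-formed seed stay untouched. */
static void mutate_hot_range(uint8_t *buf, size_t len,
                             size_t lo, size_t hi, uint8_t fill) {
    assert(buf != NULL && lo <= hi && hi < len);
    memset(buf + lo, fill, hi - lo + 1);
}
```

For the running example, a fuzzer would copy the 1024-byte seed once per entry of `FILL_VALUES` and call `mutate_hot_range(copy, 1024, 0x8, 0xf, FILL_VALUES[i])`. Only the eight bytes that reach malloc change, which is what keeps the mutation space small.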

Page 22: Taint scope

HOW DOES IT WORK?

3. DIRECTED FUZZING

 6 if(chksum_in_file != recomputed_chksum)
 7     error();
 8 int Width = get_width(input_file);
 9 int Height = get_height(input_file);
10 int size = Width*Height*sizeof(int);
11 int* p = malloc(size);

“hot bytes” report:

…0x8048d5b: invoking malloc: [0x8,0xf]…

Checksum report:

…0x8048d4f: JZ: 1024: [0x0,0x3ff]…

Bypass info:

0x8048d4f: JZ: always-taken

Before executing 0x8048d4f, the fuzzer sets the ZF flag in eflags to the opposite value.
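On x86, ZF is bit 6 of EFLAGS and JZ is taken when ZF is 1, so a bypass rule for a JZ reduces to forcing that one bit. The sketch below assumes the instrumentation exposes the saved eflags value as a 32-bit integer; the `bypass_rule` enum and `apply_jz_bypass` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define EFLAGS_ZF (UINT32_C(1) << 6)   /* zero flag: bit 6 of x86 EFLAGS */

typedef enum { ALWAYS_TAKEN, ALWAYS_NOT_TAKEN } bypass_rule;

/* Returns the eflags value to install just before a JZ so that it follows
 * the bypass rule: JZ jumps when ZF == 1 and falls through when ZF == 0. */
static uint32_t apply_jz_bypass(uint32_t eflags, bypass_rule rule) {
    assert(rule == ALWAYS_TAKEN || rule == ALWAYS_NOT_TAKEN);
    return rule == ALWAYS_TAKEN ? (eflags | EFLAGS_ZF)
                                : (eflags & ~EFLAGS_ZF);
}
```

For the rule `0x8048d4f: JZ: always-taken`, a malformed input leaves ZF clear after the comparison. `apply_jz_bypass(eflags, ALWAYS_TAKEN)` sets it, so execution continues past the check as it would for a well-formed input.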

Page 23: Taint scope

HOW DOES IT WORK?

4. REPAIRING CRASHED SAMPLES

Fixing is expensive, so checksum fields are fixed only in test cases that caused a crash.

How?

Cr – raw data in the checksum field

D – input data protected by the checksum field

Checksum() – the complete checksum algorithm

T – transformation

We want to pass the constraint:

Checksum(D) == T(Cr)

Page 24: Taint scope

HOW DOES IT WORK?

4. REPAIRING CRASHED SAMPLES

Using symbolic execution to solve:

Checksum(D) == T(Cr)

Checksum(D) is a runtime-determinable constant c, so the constraint becomes:

c == T(Cr)

Only Cr is a symbolic value.

Common transformations (e.g. converting from hex/oct to decimal) can be solved by existing solvers (STP).
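To make the constraint c == T(Cr) concrete, the sketch below takes one invented case: the checksum is a byte sum, and the checksum field stores it as eight ASCII hex digits, so T is hex parsing. Here T is inverted directly, as a stand-in for the solver-based approach on the slide. `checksum`, `transform` and `repair_field` are hypothetical names, and the format is an assumption, not one taken from the evaluated programs.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FIELD_LEN 8   /* assumed: checksum field holds 8 ASCII hex digits */

/* Hypothetical Checksum(): sum of the protected bytes modulo 2^32. */
static uint32_t checksum(const uint8_t *d, size_t n) {
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += d[i];
    return sum;
}

/* Hypothetical T(): parses the raw field Cr as a hexadecimal number. */
static uint32_t transform(const char cr[FIELD_LEN]) {
    char tmp[FIELD_LEN + 1];
    memcpy(tmp, cr, FIELD_LEN);
    tmp[FIELD_LEN] = '\0';
    return (uint32_t)strtoul(tmp, NULL, 16);
}

/* Repair: compute the constant c = Checksum(D) by running the checksum,
 * then pick the Cr that satisfies c == T(Cr) by formatting c as hex. */
static void repair_field(const uint8_t *d, size_t n, char cr[FIELD_LEN]) {
    char tmp[FIELD_LEN + 1];
    uint32_t c = checksum(d, n);
    int written = snprintf(tmp, sizeof tmp, "%08" PRIx32, c);
    assert(written == FIELD_LEN);
    (void)written;
    memcpy(cr, tmp, FIELD_LEN);
}
```

After `repair_field` runs on the mutated data D, `transform(cr) == checksum(d, n)` holds, so the original, unmodified program accepts the sample. Because c is obtained by executing the checksum, only the transformation T ever needs to be solved.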

Page 25: Taint scope

HOW DOES IT WORK?

4. REPAIRING CRASHED SAMPLES

If the new test case causes the original program to crash, a potential vulnerability is detected!

Page 26: Taint scope

EVALUATION

An incomplete list of applications:


Page 27: Taint scope

EVALUATION

“hot bytes” identification results – memory allocation

Page 28: Taint scope

EVALUATION

Checksum identification results (threshold = 16):

Page 29: Taint scope

EVALUATION

Correct checksum fields:


Page 30: Taint scope

EVALUATION

27 previously unknown vulnerabilities:

MS Paint, Google Picasa, Adobe Acrobat, ImageMagick

irfanview, gstreamer, Winamp, XEmacs

Amaya, dillo, wxWidgets, PDFlib

Page 31: Taint scope

EVALUATION

Vulnerabilities detected by TaintScope:

Page 32: Taint scope

DISCUSSION

TaintScope cannot deal with secure integrity check schemes (e.g. cryptographic hash algorithms, digital signatures) – it is impossible to generate valid test cases.

Limited effectiveness when all input data are encrypted (tracking decrypted data).

Checksum check point identification can be affected by the quality of the inputs.

Does not track control flow propagation.

Not all x86 instructions are instrumented by the execution monitor.

Page 33: Taint scope

CONCLUSION

TaintScope can perform:

Directed fuzzing

  Identifies which bytes flow into system/library calls.

  Dramatically reduces the mutation space.

Checksum-aware fuzzing

  Disables checksum checks by control flow alteration.

  Generates correct checksum fields in invalid inputs.

Page 34: Taint scope

QUESTIONS
