mining specifications of malicious behavior

47
Mining Specifications Mining Specifications of Malicious Behavior of Malicious Behavior Mihai Christodorescu (work done at University of Wisconsin) Somesh Jha University of Wisconsin Christopher Kruegel Technical University Vienna IBM Research

Upload: conner

Post on 25-Jan-2016

25 views

Category:

Documents


1 download

DESCRIPTION

Mining Specifications of Malicious Behavior. Mihai Christodorescu (work done at University of Wisconsin) Somesh Jha University of Wisconsin Christopher Kruegel Technical University Vienna. IBM Research. Wide Spectrum of Detectors. Static detectors:. Dynamic/hybrid detectors, host IDS:. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Mining Specifications of Malicious Behavior

Mining Specifications of Mining Specifications of Malicious BehaviorMalicious Behavior

Mihai Christodorescu (work done atUniversity of Wisconsin)

Somesh Jha University of Wisconsin

Christopher Kruegel Technical University Vienna

IBM Research

Page 2: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 2

Wide Spectrum of DetectorsWide Spectrum of Detectors

• Static detectors:

Semantics-aware malware detection [Christodorescu et al. 2005]

Model checking-based malware detection[Kinder et al. 2005]

Behavior-based spyware detection[Kirda et al. 2006]

Shadow Honeypots [Anagnostakis et al. 2005]

•Dynamic/hybrid detectors, host IDS:

……

Page 3: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 3

Misuse DetectionMisuse Detection

Distinct techniques, fundamentally similar…

They all require high-quality specifications of malicious behavior.

……

Page 4: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 4

Current Specification Current Specification GenerationGeneration• Specifications are manually developed

by experts.

Two issues:– Time consuming– Error prone

? Spec

Page 5: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 5

Problem 1: Problem 1: Specification Specification DelayDelayTime from appearance of new malware

to availability of specification:• Manual analysis of the binary• Testing of the specification

Anti-virus industry: 4-18 hours to generate a new specification.

window of vulnerability

Page 6: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 6

Problem 2: Problem 2: Spec. ImprecisionSpec. Imprecision

Too general = false positives Angry users

Infected machinesToo specific = false negatives

Page 7: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 7

Our SolutionOur Solution

MINIMAL: a technique for mining malicious specifications

• Automatic• Flexible specification language• Fast• Performs well (compared to a human

expert)

Page 8: Mining Specifications of Malicious Behavior

Specification LanguageSpecification Language

Page 9: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 9

What’s In a Specification?What’s In a Specification?

Requirements for obfuscationresilience:

1. Describe only relevant operations

2. Capture dependencies where present

3. Preserve independence of operations

Page 10: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 10

Specifying Malicious Specifying Malicious OperationsOperations• We chose system calls

– Compatible with specifications for behavior-based detectors

– Define interface between trusted OS and untrusted programs

• Mining algorithm is not restricted to the system-call interface.

Page 11: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 11

Specifying Malicious Specifying Malicious ConstraintsConstraints• Program operations are insufficient to

distinguish malicious from benign.

• We need to capture relations between operations:

F=open(“file”) ; read(F,buf) ; send(S,buf)

Constraints = logical formulas over system-call arguments

Page 12: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 12

A Sample Specification A Sample Specification (Malspec)(Malspec)• Mass-mailing malware:

Y))T,Base64(l(StringEqua

send(X4,“DATA”)

X:=socket()

connect(X2)

send(X3,“EHLO”)

Y:=read(Z2)

send(X5,T)

Z:=open(S2)

S:=process_name()

2XX

32 XX

43 XX

54 XX

2SS

2ZZ

Page 13: Mining Specifications of Malicious Behavior

Mining AlgorithmMining Algorithm

Page 14: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 14

The Specification Mining The Specification Mining ProblemProblem

Y))T,Base64(l(StringEqua

send(X4,“DATA”)

X:=socket()

connect(X2)

send(X3,“EHLO”)

Y:=read(Z2)

send(X5,T)

Z:=open(S2)

S:=process_name()

2XX

32 XX

43 XX

54 XX

2SS

2ZZ

Known malware

Known benign programsMINIMAL

Specification of malicious behavior

Page 15: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 15

The Basic Mining OperationThe Basic Mining Operation

Known malware

Known benign program

C Q A

A O O O Q C

Q M C P

O F

O

F M C

A

Malware dependence graph

O O A A B O

R R R M M A A

R RC C R

A W

O W

C C

Benign dependence graph

O O O Q

M C P

Minimalmalspec

Compute dependence graphsStep 1 Compute graph differenceStep 2

Page 16: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 16

Multi-Program MiningMulti-Program Mining

Maximal union of malspecs:

O O O Q

M C Pvs.

O Q

P

O O

M C

O O O Q

M C P

Page 17: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 17

System-Call Dependence System-Call Dependence GraphGraph• We use a dynamic analysis to construct

the dependence graph– Static analysis too imprecise on binary

code

• Steps:1. Collect system-call trace2. Infer dependencies between system calls3. Construct (an underapproximation of the)

dependence graph

Step 1

Page 18: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 18

Discovering DependencesDiscovering Dependences

NtOpenKey( 372, 0x20019, {24, 356, "ActiveComputerName", 0x40, 0, 0} )

NtQueryValueKey( 372, "ComputerName", Full, {TitleIdx=0, Type=1,Name="ComputerName", Data="Z...“}, 108, 76 )

NtClose( 372 )

Def-UseDependenc

es

SubstringDependenc

es

Step 1

Page 19: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 19

Discovering Local ConstraintsDiscovering Local Constraints

• Access to well-defined resources:– Windows registry– Access to self– System files/directories

Step 1

NtCreateFile ( …, { …, "I-Worm.Mydoom.l.exe", … }, … )

Page 20: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 20

Graph DifferencingGraph Differencing

Problem:Find the smallest subgraph of malicious operations that does not appear in any benign graph.

Solution:Minimal Contrast Subgraph

[Ting, Bailey SDM 2006]

Step 2

Page 21: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 21

Minimal Contrast SubgraphsMinimal Contrast Subgraphs

• Idea:Minimal contrast subgraphs andmaximal common edge setsare duals.

• Vertex and edge labels (i.e., system calls and constraint formulas) help the search.

Step 2

Page 22: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 22

Mining Contrast SubgraphsMining Contrast Subgraphs

• Size of graphs:100K-1.5M nodes, similar for edges

• Worst-case complexity: O(N!)

Step 2

C Q A

A O O O Q C

Q M C P

O F

O

F M C

AMalware dependence graph

O O A A B O

R R R M M A A

R RC C R

A W

O W

C C

Benign dependence graph

Page 23: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 23

Heuristics Reduce Problem Heuristics Reduce Problem SizeSize• Normalize dependence graph

– Replace system-call sequences with shorter equivalents

• Eliminate disconnected subgraphs

• Eliminate trivial subgraphs

[see paper for details]

Page 24: Mining Specifications of Malicious Behavior

EvaluationEvaluation

Page 25: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 25

Evaluating Evaluating MMINIINIMMALAL

• Goals:– Compare MINIMAL malspecs with those

from human expert– Use mined malspecs with behavior-based

detector

Page 26: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 26

Experimental SetupExperimental Setup

• Trace collection in Windows 2000:– Malware samples run with no user input

(cf. expected execution model)– Benign samples run with normal user input– Execution for 1 or 2 minutes

• 16 malware samples:– Netsky, MyDoom, Beagle

• 6 benign programs:– Firefox, Thunderbird, installers

Page 27: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 27

MMINIINIMMALAL vs. Human Expert vs. Human Expert

Average success rate: 77.26%Average mining time: 8 minutes

MMINIINIMMALAL malspecs

Netsky.A 5 7

Netsky.H 5 6

Bagle.C 7 12

MyDoom.E

7 10

Behavioral features as

given by Symantec website.

Behavioral features as

given by Symantec website.

Page 28: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 28

Mined Malspecs for Netsky.AMined Malspecs for Netsky.A

MMINIINIMMALAL malspecs

Create mutex

Self-installation

Modify boot sequence

Terminate antivirus

Email self as ZIP file

Copy self to network drive

Page 29: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 29

Limitations of Limitations of MMINIINIMMALAL

• Sensitive to test environment– Malicious behavior might not be observed

during tracing.

• Underapproximation of dependence graph– Complex constraints are not discovered.

• Sensitive to test-set selection– Not all differences are malicious behaviors.

Future WorkFuture Work

Page 30: Mining Specifications of Malicious Behavior

Mining Specifications of Mining Specifications of Malicious BehaviorMalicious Behavior

Mihai [email protected]

Page 31: Mining Specifications of Malicious Behavior

Here Be Dragons!Here Be Dragons!

Page 32: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 32

Mining Malspecs from Mining Malspecs from MalwareMalwareDynamic differential analysis

Malware system-call trace

Benign program system-call trace

Malspec:

Page 33: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 33

Specifying Malicious BehaviorSpecifying Malicious Behavior

• Example• Now manual• NEED: automatic

Page 34: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 34

A C Q A O O O Q C F O Q O M C P F O M C

Page 35: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 35

1.1. Only Relevant OperationsOnly Relevant Operations

• System calls

Approach: Collect system-call traces.

Page 36: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 36

2.2. DependencesDependences

Page 37: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 37

3.3. Independence of Independence of OperationsOperations

Page 38: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 38

Good SpecificationsGood Specifications

• One can write specifications satisfying the requirements.

• The algorithm to generate specifications must write specifications satisfying the requirements.

Page 39: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 39

Misuse-Detection Misuse-Detection FundamentalsFundamentals

• Database of specifications (“signatures”) defines what is malicious.

Protected computerProtected computerProtected computer

Malware detector

Protected computerUnknown executable Protected computer

SigDB

Page 40: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 40

• Byte signatures are not rich enough to capture malicious behavior.

Hackers evade detection through obfuscation.

Failures of Current DetectorsFailures of Current Detectors

All mass-mailing worms

Underapproximating byte signature

Overapproximating byte signature

Page 41: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 41

Behavior-Based DetectorsBehavior-Based Detectors

• New detection techniques use higher level specificationsSemantics-Aware Malware Detection (2005)Model Checking-based Malware Detection (2005)Behavior-based Spyware Detection (2006)

These still depend on the quality of the specifications!

Page 42: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 42

MMINIINIMMALAL: Malware-Mining : Malware-Mining AlgorithmAlgorithm

Known malware

NtOpenKey( 372, 0x20019, {24, 356, "ActiveComputerName",

0x40, 0, 0} )

Monitor system calls during Monitor system calls during executionexecution

System-call traceA C Q A O O O Q C F O Q O M C P F O M C

Step 1

Page 43: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 43

MMINIINIMMALAL: Malware-Mining : Malware-Mining AlgorithmAlgorithm

Discover dependences between Discover dependences between syscallssyscalls

System-call traceA C Q A O O O Q C F O Q O M C P F O M C

Data dependence

graph

C Q A

A O O O Q C

Q M C P

O F

O

F M C

A

Step 2

Page 44: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 44

MMINIINIMMALAL: Malware-Mining : Malware-Mining AlgorithmAlgorithm

Find malicious–benign graph Find malicious–benign graph differencedifference

C Q A

A O O O Q C

Q M C P

O F

O

F M C

A

Malspec

Malware dependence graph

O O O Q

M C P

O O A A B O

R R R M M A A

R RC C R

A W

O W

C C

Benign dependence graph

Step 3

Page 45: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 45

Dependence Graph Dependence Graph NormalizationNormalization• Many sequences of system calls are

equivalent:

Aggregation replaces such sequences with a canonical sequence. (see paper for details)

Step 2

read( …, 1 )read( …, 1 )read( …, 1 )read( …, 1 )read( …, 1 )

read( …, 5 )≡

Page 46: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 46

MMINIINIMMALAL Specs in Detection Specs in Detection

• Missed malspecs:– Due to unrecovered dependences

(e.g., ZIP compression of data)– Due to incompleteness of trace data

(e.g., certain behaviors did not execute)

• Using mined malspecs in semantics-aware malware detection:Netsky.A malspec Netsky.D, E, F, …

Page 47: Mining Specifications of Malicious Behavior

ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 47

ConclusionsConclusions

• The mining malicious of behavior can be automated

• Mined malspecs compare well with those from human experts

• Mining time significantly reduced over manual specification