ShaREing is Caring

Halvar Flake ([email protected]) Sebastian Porst ([email protected])

Upload: zynamics-gmbh

Post on 24-Jun-2015

TRANSCRIPT

Page 1: ShaREing is Caring

Halvar Flake ([email protected]), Sebastian Porst ([email protected])

Page 2: ShaREing is Caring
Page 3: ShaREing is Caring
Page 4: ShaREing is Caring

He has a problem

Huge Disassembled Binary

Statically linked library he is not aware of

Page 5: ShaREing is Caring

If he only knew ...

EScript.api (Adobe Reader JavaScript Engine)

libjs (SpiderMonkey): open-source JavaScript library

Page 6: ShaREing is Caring

He has a problem

Present: Guessing, strings, FLIRT signatures

Future: BinCrowd

Page 7: ShaREing is Caring

Demo

Page 8: ShaREing is Caring
Page 9: ShaREing is Caring
Page 10: ShaREing is Caring

Advantages

• Disassembly vs. source

• Notice vulnerable code

• Find other uses

Question: What else uses this vulnerable function?

Answer: [ List of Programs ]

AcroForm.api

Vulnerable libtiff

Page 11: ShaREing is Caring

Technical Intermezzo A: Finding Functions

Page 12: ShaREing is Caring

So how does one store functions?

We need fast lookup!

Compilers screw with us …

• Random register assignment

• Reordering instructions

• Switching mnemonics

For now we have three ways ...

Page 13: ShaREing is Caring

Small Prime Product

• Positive 64-bit integer number

• Characteristic for a function

• Small prime for each mnemonic

• Multiply the primes together

• Two functions are considered equal if they contain the same mnemonics the same number of times

– Order of mnemonics is ignored

• Match quality: High
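A minimal sketch of the small prime product, under stated assumptions: the mnemonic table `MNEMONICS` is a hypothetical ten-entry stand-in for a full instruction set, the product is assumed to wrap around at 64 bits, and only odd primes are used so that the wrapped product can never become zero. None of these details are given on the slide.

```python
from itertools import count

MASK64 = (1 << 64) - 1  # assumption: the product wraps around at 64 bits


def first_primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    primes = []
    for candidate in count(2):
        if len(primes) == n:
            return primes
        if all(candidate % p for p in primes):
            primes.append(candidate)


# Hypothetical mnemonic table; a real one would cover the whole instruction set.
MNEMONICS = ["mov", "push", "pop", "add", "sub", "cmp", "jz", "jnz", "call", "ret"]

# Skip the prime 2 (assumption): a product of odd primes stays odd modulo 2**64,
# so the result is always a positive 64-bit number.
PRIME_FOR = dict(zip(MNEMONICS, first_primes(len(MNEMONICS) + 1)[1:]))


def small_prime_product(mnemonics):
    """Multiply the primes of all mnemonics in a function, truncated to 64 bits."""
    product = 1
    for mnemonic in mnemonics:
        product = (product * PRIME_FOR[mnemonic]) & MASK64
    return product


# Two orderings of the same instructions give the same value.
original = ["push", "mov", "sub", "call", "ret"]
reordered = ["mov", "push", "call", "sub", "ret"]
print(small_prime_product(original) == small_prime_product(reordered))
```

Because multiplication is commutative, instruction reordering by the compiler does not change the value, which is why the order of mnemonics is ignored.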

Page 14: ShaREing is Caring

MD-Index

• Structural lookup in a database would be great

• Erm … but a graph is not a number

• We want a hash function for graphs!

Page 15: ShaREing is Caring

MD-Index

80-Bit Hash Value

Result: Fast DB lookup for particular functions

Page 16: ShaREing is Caring

MD-Index

• Take every edge in the graph

• For every edge, construct 5-tuple:

– # of incoming edges in the source

– # of outgoing edges in the source

– # of incoming edges in the target

– # of outgoing edges in the target

– Topological order of the edge in the graph

• So a graph gives us a set of vectors
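The tuple construction above can be sketched as follows. The graph representation (`{node: [successors]}`) and the function names are illustrative. The slide does not say how "topological order" is defined for flow graphs with loops, so this sketch assumes the breadth-first depth of the edge's source block from the entry block as a stand-in. It also assumes every block is reachable from the entry.

```python
from collections import deque


def bfs_levels(graph, entry):
    """Depth of each block from the entry (assumed stand-in for topological order)."""
    level = {entry: 0}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            if succ not in level:
                level[succ] = level[node] + 1
                queue.append(succ)
    return level


def edge_tuples(graph, entry):
    """Return one 5-tuple per edge of a flow graph given as {node: [successors]}."""
    indegree = {}
    for succs in graph.values():
        for succ in succs:
            indegree[succ] = indegree.get(succ, 0) + 1
    level = bfs_levels(graph, entry)

    tuples = []
    for src, succs in graph.items():
        for dst in succs:
            tuples.append((
                indegree.get(src, 0),      # incoming edges of the source
                len(graph.get(src, ())),   # outgoing edges of the source
                indegree.get(dst, 0),      # incoming edges of the target
                len(graph.get(dst, ())),   # outgoing edges of the target
                level[src],                # assumed order of the edge
            ))
    # Sorting makes the result independent of block naming and dict order.
    return sorted(tuples)


# A diamond-shaped flow graph: A branches to B and C, which both join in D.
diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
for t in edge_tuples(diamond, "A"):
    print(t)
```

The tuples contain only degrees and positions, never addresses or register names, so two compilations of the same function with an identical flow graph produce identical tuples.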

Page 17: ShaREing is Caring

MD-Index

• A set of vectors is not exactly a number

• Embed each vector into the reals:

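– Map to [formula not preserved in this transcript]

– It’s a 5-dimensional vector space over Q

– Each element is also “just” a number

• Use [formula not preserved in this transcript]

• Now mix all the results: [formula not preserved in this transcript]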
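The formulas on this slide did not survive in the transcript, so the following is only one plausible reading, not the authors' confirmed definition. It assumes that each 5-tuple is embedded using the basis 1, √2, √3, √5, √7 (which does span a 5-dimensional vector space over Q), that each embedded value x is passed through 1/√x, and that the results are mixed by summation. The names `embed` and `md_index` are illustrative.

```python
import math

# Assumed basis: 1 and the square roots of the first four primes.
# These are linearly independent over Q, so distinct integer tuples
# map to distinct real numbers.
BASIS = (1.0, math.sqrt(2), math.sqrt(3), math.sqrt(5), math.sqrt(7))


def embed(edge_tuple):
    """Map a 5-tuple of integers to a single real number."""
    return sum(c * b for c, b in zip(edge_tuple, BASIS))


def md_index(edge_tuples):
    """Mix all per-edge values into one number (assumed: sum of 1/sqrt(x))."""
    # math.fsum returns the correctly rounded sum, so the result does not
    # depend on the order in which the edges are visited.
    return math.fsum(1.0 / math.sqrt(embed(t)) for t in edge_tuples)


# Edge tuples of a diamond-shaped flow graph (entry, two branches, join).
diamond = [(0, 2, 1, 1, 0), (0, 2, 1, 1, 0), (1, 1, 2, 0, 1), (1, 1, 2, 0, 1)]
print(md_index(diamond))
```

How the resulting number is turned into the 80-bit hash value mentioned on the neighbouring slides is not stated in the transcript and is left out here.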

Page 18: ShaREing is Caring

MD-Index

80-Bit Hash Value

Result: Fast DB lookup for particular functions

Page 19: ShaREing is Caring

MD-Index with calls

• The flowgraph alone is too prone to false positives

• Encode the call positions, too

• Result: Hash function for flowgraph with calls at particular locations

Page 20: ShaREing is Caring

3-tiered lookup

• Does the prime product match?

– If yes, high confidence in correct match

• Does the MD-Index with calls match?

– If yes, medium confidence in correct match

• Does the MD-Index without calls match?

– If yes, low confidence in correct match
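The three tiers above can be sketched as a sequence of exact-match dictionary lookups, tried from the strongest signal to the weakest. The `Signature` type, the in-memory indexes, and the function names in the example are hypothetical. The transcript does not describe the actual BinCrowd database schema.

```python
from typing import NamedTuple


class Signature(NamedTuple):
    """Hypothetical per-function record; field order is the tier order."""
    prime_product: int
    md_index_with_calls: str
    md_index: str


TIERS = ("high", "medium", "low")


def build_indexes(database):
    """database: {function_name: Signature}. One dict per tier for fast lookup."""
    indexes = [{}, {}, {}]
    for name, signature in database.items():
        for index, key in zip(indexes, signature):
            index.setdefault(key, []).append(name)
    return indexes


def lookup(query, indexes):
    """Return (confidence, matching names) for the first tier that matches."""
    for tier, index, key in zip(TIERS, indexes, query):
        if key in index:
            return tier, index[key]
    return None, []


# Made-up example data.
database = {
    "js_NewObject": Signature(175305, "aa", "bb"),
    "memcpy": Signature(99, "cc", "dd"),
}
indexes = build_indexes(database)
print(lookup(Signature(1, "cc", "zz"), indexes))
```

Each tier is a plain key lookup, which is what makes the scheme fast: no graph comparison happens at query time.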

Page 21: ShaREing is Caring

Problems

• Comparison process is not very robust to changes in flow graphs

– BinDiff can do a lot more

– For most uses sufficient

• Comparison does not work for tiny functions

– Where tiny means fewer than 8 edges

– Context is not considered

Page 22: ShaREing is Caring
Page 23: ShaREing is Caring
Page 24: ShaREing is Caring

She has a problem

Dozens of previously analyzed rootkits

New suspicious file

Page 25: ShaREing is Caring

If she only knew

She came across that malware author two years ago

He reused his rootkit hiding code and she documented it back then

Page 26: ShaREing is Caring

Demo

Page 27: ShaREing is Caring
Page 28: ShaREing is Caring
Page 29: ShaREing is Caring
Page 30: ShaREing is Caring

Advantages

• Remember the past

• Import earlier results

• Simplify the future

Page 31: ShaREing is Caring

Technical Intermezzo B: Calculating scores for files

Page 32: ShaREing is Caring

How to find similar files?

Remember fuzziness

Here is what we do ...

So we have this database, but ...

One file typically contains several different statically linked and dynamically imported libraries

Page 33: ShaREing is Caring

Calculating a file score

• Calculate a score that depends on the number of matches weighted by their quality

• The higher the score, the more significant functions are shared by two files
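As a rough illustration of a quality-weighted score, the sketch below sums one weight per matched function. The weights in `QUALITY_WEIGHT` are invented for the example. The slides give no numbers and state that the real score calculation was still being worked on.

```python
# Hypothetical weights: a high-quality match counts more than a low-quality one.
QUALITY_WEIGHT = {"high": 1.0, "medium": 0.5, "low": 0.25}


def file_score(match_qualities):
    """Sum the weights of all function matches between two files."""
    return sum(QUALITY_WEIGHT[quality] for quality in match_qualities)


# Two files sharing two high-confidence and one low-confidence function match.
print(file_score(["high", "high", "low"]))
```

A different goal, such as comment porting versus library identification, would likely call for different weights or for normalising by file size.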

Page 34: ShaREing is Caring

Problems

• We are still working on score calculation

• Desired score depends on goal

– Comment porting, library identification, ...

Page 35: ShaREing is Caring
Page 36: ShaREing is Caring
Page 37: ShaREing is Caring

They have a problem

Complex team with different sub-teams

Information flow restricted by clearance levels

Page 38: ShaREing is Caring

If they only knew ...

BinCrowd manages different access levels in a centralized way

No data transfer from high clearance people to low clearance people

Page 39: ShaREing is Caring

They have another problem

Different members use different tools

Making new information available to other members is difficult

Page 40: ShaREing is Caring

If they only knew ...

BinCrowd makes it easy to exchange information between different tools

Individual members can use whatever tools they want

Page 41: ShaREing is Caring

No Demo

(BinNavi Plugin is not yet ready)

Page 42: ShaREing is Caring

Advantages

• Central database of knowledge

• Controlled transfer of information

• Synchronize information from different tools

Page 43: ShaREing is Caring

Technical Intermezzo C: Use BinCrowd

Page 44: ShaREing is Caring

How do you actually access it?

We host a free community server

Here is what you need ...

So we have this database, but ...

We have a prepopulated database where you can download and upload information.

Page 45: ShaREing is Caring

Software you need

• IDA Pro 5.6

• IDAPython 1.3.2

• A BinCrowd account (free)

• The BinCrowd IDA Pro Plugin

– http://github.com/zynamics

Page 46: ShaREing is Caring

Usage

• Register BinCrowd account

• Download the BinCrowd IDA Plugin

• Load BinCrowd IDA Plugin using ALT-9 in IDA

• Read the readme.txt file to find out what CTRL-1, CTRL-2, CTRL-3, and CTRL-4 do

Page 47: ShaREing is Caring

Workflow

• New IDB

• Download prior results

• Analyze IDB

• Upload new results

Page 48: ShaREing is Caring

Best practices

• Name your input files like program.version.compiler.optimization_level.xxx

Page 49: ShaREing is Caring

A fair warning

• Passwords are transmitted in plain-text

• Database will be reset randomly during beta

– All data will be lost, accounts will be kept

• Cross-site request forgeries are a dime a dozen

Page 50: ShaREing is Caring

Credits and Thanks

• Nathan Fain

– For getting the first version of BinCrowd off the ground

Page 51: ShaREing is Caring

Credits and Thanks

• Christian Ketterer

– For designing the web interface

• American Greetings

– Thanks in advance for not suing us over our liberal use of care bears when you guys find this presentation

Page 52: ShaREing is Caring

BinCrowd can be used for free!

Give it a try at http://bincrowd.zynamics.com