Page 1: Automated Mapping of Large Binary Objects

Automated Mapping of Large Binary Objects

Ben Sangster

Roy Ragsdale

Greg Conti

http://www.loc.gov/loc/lcib/0611/images/map.jpg

Page 2: Automated Mapping of Large Binary Objects

The views expressed in this presentation are those of the author and do not reflect the official policy or position of the United States Military Academy, the Department of the Army, the Department of Defense or the U.S. Government. 

http://www.cdcr.ca.gov/News/Images/overcrowding/MuleCreek_071906v1.jpg

Page 3: Automated Mapping of Large Binary Objects

Motivation

Hex        Decimal      Contents
0400-07FF  1024-2047    Screen memory
0800-9FFF  2048-40959   BASIC RAM memory
8000-9FFF  32768-40959  Alternate: ROM plug-in area
A000-BFFF  40960-49151  ROM: BASIC
A000-BFFF  40960-49151  Alternate: RAM
C000-CFFF  49152-53247  RAM memory, including alternate
D000-D02E  53248-53294  Video Chip (6566)
D400-D41C  54272-54300  Sound Chip (6581 SID)
D800-DBFF  55296-56319  Color nybble memory
DC00-DC0F  56320-56335  Interface chip 1, IRQ (6526 CIA)
DD00-DD0F  56576-56591  Interface chip 2, NMI (6526 CIA)
D000-DFFF  53248-57343  Alternate: Character set
E000-FFFF  57344-65535  ROM: Operating System
E000-FFFF  57344-65535  Alternate: RAM
FF81-FFF5  65409-65525  Jump Table

Page 4: Automated Mapping of Large Binary Objects

Goals

• Accurately identify regions within an arbitrary binary object

• Efficient algorithms

• Extensible framework

• Automated mapping process

• Automated process for generating test data

• Current State: BINMAP Utility

Page 5: Automated Mapping of Large Binary Objects
Page 6: Automated Mapping of Large Binary Objects

[Figure: a binary object spanning offsets 0 to ~12MB, with two labels reading "insert ~ 5MB here..."]

Page 7: Automated Mapping of Large Binary Objects

[Figure: the same binary object, offsets 0 to ~12MB, with two labels reading "insert ~ 5MB here...", now annotated with region labels: ASCII Text, Compressed Image 1, Compressed Image N, Unicode URLs, Data Structure, Data Structure]
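One way to obtain test data with a known ground truth, in the spirit of the annotated object above, is to concatenate fragments of known type and record where each one lands. The following is a minimal Python sketch under that assumption; the function name `build_test_object` and the sample fragments are hypothetical and are not taken from BinMap.

```python
import os

def build_test_object(fragments):
    """Concatenate (label, data) pairs into one blob.

    Returns the blob and its ground-truth map, a list of
    (start, end, label) tuples with inclusive byte offsets.
    """
    blob = bytearray()
    region_map = []
    for label, data in fragments:
        if not data:
            continue  # skip empty fragments so every region has start <= end
        start = len(blob)
        blob.extend(data)
        region_map.append((start, len(blob) - 1, label))
    return bytes(blob), region_map

# Hypothetical fragments; os.urandom stands in for compressed or encrypted data.
fragments = [
    ("ASCII Text", b"The quick brown fox jumps over the lazy dog. " * 20),
    ("Random Bytes", os.urandom(1024)),
    ("Repeating Value (0xFF)", b"\xff" * 512),
]
blob, region_map = build_test_object(fragments)
for start, end, label in region_map:
    print(f"{start:08X}-{end:08X}  {label}")
```

Because the map is produced at construction time, the output of any mapping tool run on `blob` can be compared against it region by region.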

Page 8: Automated Mapping of Large Binary Objects

[Figure: binary object from offset 0 to N, with f(x)]

Page 9: Automated Mapping of Large Binary Objects

[Figure: binary object from offset 0 to N, with f(x)]

Page 10: Automated Mapping of Large Binary Objects

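Partial Taxonomy

[Figure: tree rooted at "binary fragment"]

Entropy classes: high entropy, medium entropy, low entropy

Fragment types: encryption, compression, repeating values, machine code, human language, data structures, uncompressed media

Subtypes: encryption (AES, DES, ...; modes ECB, CBC, ...), compression (RLE, LZW, ...), human language (EN, FR, RU, ...)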

Page 11: Automated Mapping of Large Binary Objects

Goal

Hex        Decimal      Contents
0400-07FF  1024-2047    ASCII Text (English)
0800-9FFF  2048-40959   Pointer Table
8000-9FFF  32768-40959  Variable Length Array
A000-BFFF  40960-49151  Compressed Data
A000-BFFF  40960-49151  Unicode (Basic Latin)
C000-CFFF  49152-53247  Unknown Region
D000-D02E  53248-53294  Repeating Value (0xFF)
D400-D41C  54272-54300  Encrypted Region (AES)
D800-DBFF  55296-56319  PNG Image
DC00-DC0F  56320-56335  JavaScript
DD00-DD0F  56576-56591  Encrypted Region (RSA Key?)
D000-DFFF  53248-57343  Unknown Region
E000-FFFF  57344-65535  BMP Image
E000-FFFF  57344-65535  Unicode (Hyperlinks?)
FF81-FFF5  65409-65525  Repeating Value (0x00)
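The target output is essentially a list of labeled address ranges, some of which overlap. A minimal Python sketch of such a map follows; the `Region` class, its fields, and the `lookup` helper are illustrative assumptions, not BinMap's actual data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    start: int   # inclusive byte offset
    end: int     # inclusive byte offset
    label: str

    def row(self) -> str:
        """Format as 'hex range, decimal range, label', like the slide."""
        return f"{self.start:04X}-{self.end:04X}  {self.start}-{self.end}  {self.label}"

def lookup(regions, offset):
    """Return the labels of all regions covering offset (regions may overlap)."""
    return [r.label for r in regions if r.start <= offset <= r.end]

# Three rows from the slide; the labels are the slide's own examples.
regions = [
    Region(0x0400, 0x07FF, "ASCII Text (English)"),
    Region(0xD000, 0xD02E, "Repeating Value (0xFF)"),
    Region(0xD000, 0xDFFF, "Unknown Region"),
]
for r in regions:
    print(r.row())
print(lookup(regions, 0xD010))
```

Deriving the decimal column from the same integers as the hex column keeps the two representations from drifting apart.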

Page 12: Automated Mapping of Large Binary Objects

f(x)

Fragment type 1 a1-a2

Fragment type 2 a3-a4

Fragment type N a5-a6

Page 13: Automated Mapping of Large Binary Objects

                 Test 1  Test 2  Test 3  Test N
Fragment type 1  a1-a2   b1-b2   c1-c2   z1-z2
Fragment type 2  a3-a4   b3-b4   c3-c4   z3-z4
Fragment type N  a5-a6   b5-b6   c5-c6   z5-z6
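The table suggests that each fragment type is characterized by an expected value range for every test. One possible reading, sketched in Python below, assigns a fragment to a type when all of its measured values fall inside that type's ranges. This matching rule is an assumption, and the numeric ranges are placeholders invented for illustration, not measured values.

```python
def classify(measurements, profiles):
    """Return the fragment types whose per-test ranges all contain the measurements.

    measurements: {test_name: value}
    profiles: {fragment_type: {test_name: (low, high)}}, ranges inclusive
    """
    matches = []
    for fragment_type, ranges in profiles.items():
        if all(
            test in measurements and low <= measurements[test] <= high
            for test, (low, high) in ranges.items()
        ):
            matches.append(fragment_type)
    return matches

# Placeholder ranges (hypothetical, for illustration only).
profiles = {
    "Fragment type 1": {"Test 1": (0.0, 1.0), "Test 2": (10, 20)},
    "Fragment type 2": {"Test 1": (0.5, 4.0), "Test 2": (30, 40)},
}
print(classify({"Test 1": 0.7, "Test 2": 15}, profiles))
```

Returning a list rather than a single label leaves room for ambiguous fragments, where several profiles match, and for unknown ones, where none does.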

Page 14: Automated Mapping of Large Binary Objects

Shannon Entropy

Perl Random Number Sequence a1-a2

AES Encrypted Word Document a3-a4

ASCII Text Document a5-a6

BMP (Single Color) a7-a8

Page 15: Automated Mapping of Large Binary Objects

Shannon Entropy

Shannon entropy H(X) measures uncertainty and quantifies the information contained in a message.

http://en.wikipedia.org/wiki/Shannon_entropy
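For a fragment of bytes, H(X) can be estimated from byte frequencies as the sum over observed values of p(x) * log2(1 / p(x)). The Python sketch below assumes the alphabet is single bytes, so the result ranges from 0 to 8 bits per byte; the function name is illustrative and the slides do not say how BinMap computes it.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Estimate entropy in bits per byte from observed byte frequencies."""
    if not data:
        return 0.0
    n = len(data)
    # p * log2(1/p) with p = count / n, summed over the byte values present
    return sum((c / n) * math.log2(n / c) for c in Counter(data).values())

print(shannon_entropy(b"\x00" * 100))      # a single repeated value
print(shannon_entropy(b"ab" * 50))         # two equally likely values
print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

A single repeated value gives the minimum and a uniform spread over all 256 byte values gives the maximum, which is why repeating values and encrypted data sit at opposite ends of the scale.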

Other Techniques

- Hamming Weight
- Index of Coincidence
- Mean / Standard Deviation
- Traditional pattern matching
- <Your ideas?>
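The first three measures in this list are each a few lines of code. The Python sketch below uses their standard byte-level definitions; the function names are illustrative and are not BinMap's plug-in API.

```python
import statistics
from collections import Counter

def hamming_weight(data: bytes) -> int:
    """Total number of set bits in the fragment."""
    return sum(bin(b).count("1") for b in data)

def index_of_coincidence(data: bytes) -> float:
    """Probability that two distinct positions hold the same byte value."""
    n = len(data)
    if n < 2:
        return 0.0
    return sum(c * (c - 1) for c in Counter(data).values()) / (n * (n - 1))

def mean_and_stdev(data: bytes):
    """Mean and population standard deviation of the byte values."""
    values = list(data)
    if not values:
        raise ValueError("fragment must not be empty")
    return statistics.fmean(values), statistics.pstdev(values)

sample = b"\xff\x00\x0f"
print(hamming_weight(sample))
print(index_of_coincidence(b"aaaa"), index_of_coincidence(b"abcd"))
print(mean_and_stdev(b"\x02\x04"))
```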

Page 16: Automated Mapping of Large Binary Objects

Window Size (Shannon Entropy of AES sample)

Page 17: Automated Mapping of Large Binary Objects

Window Size (Shannon Entropy of AES sample)

Page 18: Automated Mapping of Large Binary Objects

Window Size (Shannon Entropy of AES sample)

Page 19: Automated Mapping of Large Binary Objects

Window Size (Shannon Entropy of AES sample)

Page 20: Automated Mapping of Large Binary Objects

Window Size (Shannon Entropy of 4 file types)

Page 21: Automated Mapping of Large Binary Objects

Window Size (Shannon Entropy of 4 file types)
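Window size matters because a window of W bytes can contain at most min(W, 256) distinct values, so its byte-level entropy cannot exceed log2(min(W, 256)) bits. The Python sketch below computes entropy over fixed-size windows to make that effect visible; the function names, the non-overlapping default step, and the use of `os.urandom` as a stand-in for an AES sample are all assumptions.

```python
import math
import os
from collections import Counter

def entropy(chunk: bytes) -> float:
    """Byte-level Shannon entropy in bits per byte."""
    n = len(chunk)
    if n == 0:
        return 0.0
    return sum((c / n) * math.log2(n / c) for c in Counter(chunk).values())

def windowed_entropy(data: bytes, window: int, step=None):
    """Yield (offset, entropy) for each full window; step defaults to window."""
    if window <= 0:
        raise ValueError("window must be positive")
    step = step or window
    for offset in range(0, len(data) - window + 1, step):
        yield offset, entropy(data[offset:offset + window])

# Random bytes (stand-in for encrypted data) followed by a repeating value.
sample = os.urandom(4096) + b"\x00" * 4096
for window in (64, 256, 1024):
    values = [h for _, h in windowed_entropy(sample, window)]
    print(window, round(max(values), 2), round(min(values), 2))
```

Small windows localize region boundaries more precisely but cap the measurable entropy, while large windows separate high-entropy types better at the cost of blurring boundaries.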

Page 22: Automated Mapping of Large Binary Objects

BinMap Demo

Page 23: Automated Mapping of Large Binary Objects

Extensibility

Page 24: Automated Mapping of Large Binary Objects

Example

Page 25: Automated Mapping of Large Binary Objects

Entropy/Evaluating

Page 26: Automated Mapping of Large Binary Objects

Future Work

• Improve Framework

– Analyze performance

– Develop & improve plug-ins

• Improve Datasets

• Integrate with visualization, interaction and GUI

• Other identification measures

• Apply data mining techniques

• Increase size of taxonomy

Code repository: http://binmap.googlecode.com

Page 27: Automated Mapping of Large Binary Objects

0x3F 0x3F 0x3F

? ? ?