analysis of (unknown) file formats
DESCRIPTION
This talk gives a general overview of the effort that goes into writing an unpacker or a validator for a wide range of binary file formats. Unpackers and validators are used in many security and utility products. Anti-virus products use them to inspect files and ease malware detection; other uses include hard-drive forensics and everyday extraction of files from archives. File format analysis is what makes writing such tools possible. The talk shares real-life experience, advice, and techniques, with insight into the analysis and programming challenges encountered daily and suggestions on how to solve them. The focus is on the most common file formats encountered in the “wild”.
TRANSCRIPT
ANALYSIS OF
(UNKNOWN)
FILE FORMATS
22nd September 2011
Mario Suvajac
Hi, I’m
Mario Suvajac
@msuvajac
suvajac.org
reversinglabs.com
FILE
FORMATS
http://www.tripleman.com/index.php?showimage=6
FILE FORMATS
• Structured information storage/carriers
– Compressed
– Encrypted
– All of the above
CATEGORIZATION
http://www.flickr.com/photos/fotomele/1072932978
CATEGORIZATION
• Availability
– Open
– Proprietary
• Specific to each information type, or contained in a generalized container format
• Executables, archives...
[Diagram: example of nested file formats. Recoverable labels: UPX 1.25; Unpacked PE32; Resources; Overlay*; Overlay; Data1.cab, Data1.hdr, Engine32.cab, Layout.bin, Setup.exe, Setup.ibt, Setup.ini, Setup.inx, File N; Engine32.cab with Engine32\*.*; Setup.ibt with LZ\setup.ibt\*.*]
WHY IS ANALYSIS
IMPORTANT?
http://www.flickr.com/photos/marodesu/5932256377
WHY IS ANALYSIS IMPORTANT?
• Writing unpackers & validators
– Anti-virus protection
– Computer forensics
– General software development
– ...
HOW TO
DO IT?
http://www.flickr.com/photos/karenilagan/2163284814
HOW TO DO IT?
• Specifications
• Reverse Engineering
• Asking Please
http://www.flickr.com/photos/19666640@N00/2884433955
FILE FORMAT PATTERNS
• File header
– Magic
– Sizes
– Offsets
– Algorithm ids
– Block descriptors
– ...
• Data
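The magic field is usually the first thing a tool checks. The following is a minimal sketch of signature-based identification, not the speaker's implementation. The table `MAGICS` and the function `identify` are hypothetical names introduced here for illustration; the signature bytes themselves are the well-known ones for each format.

```python
# Hypothetical signature table for illustration; real tools carry many more
# entries and often need to look beyond offset 0.
MAGICS = [
    (b"PK\x03\x04", "zip"),        # ZIP local file header
    (b"MSCF", "cab"),              # Microsoft Cabinet
    (b"\x7fELF", "elf"),           # ELF executable
    (b"\x1f\x8b", "gzip"),         # gzip stream
    (b"MZ", "dos/pe executable"),  # DOS stub that precedes a PE header
]


def identify(data: bytes) -> str:
    """Return a format name based on the leading magic bytes, or 'unknown'."""
    for magic, name in MAGICS:
        if data.startswith(magic):
            return name
    return "unknown"


print(identify(b"PK\x03\x04" + b"\x00" * 26))
print(identify(b"not a known format"))
```

A magic match is only a hint. The sizes, offsets, and algorithm ids that follow still have to be validated before the data is trusted.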
ZIP FILE FORMAT
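As an illustration of the header pattern (magic, sizes, algorithm id), here is a minimal sketch that parses one ZIP local file header with Python's `struct` module. The field layout follows the published ZIP specification; the function name `parse_local_header` and the dictionary keys are assumptions made for this example, and the sketch ignores ZIP64 and data descriptors.

```python
import struct

# ZIP local file header: 30 fixed bytes, little-endian.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def parse_local_header(data: bytes, offset: int = 0) -> dict:
    """Parse the local file header at `offset` (hypothetical helper)."""
    if offset < 0 or offset + LOCAL_HEADER.size > len(data):
        raise ValueError("truncated local file header")
    (magic, _version, flags, method, _mtime, _mdate, crc32,
     csize, usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(data, offset)
    if magic != b"PK\x03\x04":
        raise ValueError("bad magic")
    name_start = offset + LOCAL_HEADER.size
    name_end = name_start + name_len
    if name_end + extra_len > len(data):
        raise ValueError("file name or extra field runs past end of data")
    # Bit 11 of the flags marks a UTF-8 file name; otherwise assume CP437.
    encoding = "utf-8" if flags & 0x0800 else "cp437"
    return {
        "name": data[name_start:name_end].decode(encoding),
        "method": method,              # algorithm id: 0 = stored, 8 = deflate
        "crc32": crc32,
        "compressed_size": csize,
        "uncompressed_size": usize,
        "data_offset": name_end + extra_len,
    }


# Hand-built sample: a stored (uncompressed) entry named "a.txt".
sample = LOCAL_HEADER.pack(b"PK\x03\x04", 20, 0, 0, 0, 0, 0, 5, 5, 5, 0)
sample += b"a.txt" + b"hello"
print(parse_local_header(sample))
```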
Reverse
engineering
http://www.tripleman.com/index.php?showimage=520
BY Just Observing
• Experience based
• Hex editor
• Diffing
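Diffing two samples that differ in one known way (for example, the same archive saved with one extra file) narrows down which header fields encode what. Below is a minimal sketch of a byte-level diff that reports the offset ranges where two buffers differ. `diff_ranges` is a hypothetical helper written for this example, not a tool from the talk, and it does a naive same-offset comparison without realigning inserted bytes.

```python
def diff_ranges(a: bytes, b: bytes) -> list:
    """Return (start, end) offset pairs, end exclusive, where a and b differ."""
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i          # a differing run begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, common))
    longest = max(len(a), len(b))
    if longest != common:
        # Treat the extra tail of the longer buffer as a difference.
        if ranges and ranges[-1][1] == common:
            ranges[-1] = (ranges[-1][0], longest)
        else:
            ranges.append((common, longest))
    return ranges


for start, end in diff_ranges(b"HDR\x01\x00DATA", b"HDR\x02\x00DATA!"):
    print(f"differs at 0x{start:04x}..0x{end:04x}")
```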
BY Debugging
• Watching reads & further data manipulation
• Reversing compression & encryption algorithms
CODING TIPS
http://www.flickr.com/photos/the8rgrl/4642045
CODING TIPS
• Security risks
• Problems in practice
• corelib
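One common security risk in unpackers is trusting the size and offset fields read from a file. The following is a minimal sketch, under the assumption that the whole file is already in memory, of validating such fields before use. `checked_slice` and the `limit` default are hypothetical and are not part of any library mentioned in the talk.

```python
def checked_slice(data: bytes, offset: int, size: int,
                  limit: int = 64 * 1024 * 1024) -> bytes:
    """Return data[offset:offset+size] only if the range is fully valid."""
    if offset < 0 or size < 0:
        raise ValueError("negative offset or size")
    if size > limit:
        # Guard against headers that claim absurd sizes.
        raise ValueError("size exceeds sanity limit")
    # Compare against the remaining length instead of computing offset + size;
    # in fixed-width languages such as C the addition could overflow.
    if offset > len(data) or size > len(data) - offset:
        raise ValueError("range runs past end of data")
    return data[offset:offset + size]


blob = b"HEADERpayload"
print(checked_slice(blob, 6, 7))
try:
    checked_slice(blob, 6, 4096)   # size field lies about the payload length
except ValueError as exc:
    print("rejected:", exc)
```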
THANKS,
QUESTIONS?!
Btw.
IS HIRING