Building and Leveraging a Whitelist Database for Anti-Virus Testing
Presented at the International Antivirus Testing Workshop 2007 by Mario Vuksan, Director, Knowledgebase Services, Bit9
Agenda
• Growing Signature/Definition Problem
• Building a Global Whitelist
• Leveraging a Global Whitelist
• QA
Growing Signature Problem
• Cumulative unique variants have grown ten-fold over the last 5 years (Yankee Group)
• “Denial-of-Service” Attacks: Malware changing signature every 10 minutes
• Solutions
  – Heuristic & Behavioral Detections
• New Problem: High “False Positive” Count
Whitelist: a Google-sized Project
Sizing the Software Universe
• Number of Files Released Daily by:
  – Microsoft: 500K / IBM: 100K / SourceForge: 500K / Mozilla.org: 250K
• More Components, Daily Builds, Auto Updaters
• 2.7B Files Indexed, heading for 10B
• 30TB of Installers, heading for 100TB
• Acquiring 50M File Records Daily, ¼ of YouTube
• Tracking 20,000 Software Companies
  – E.g. DMOZ tracks 200,000+ Entities
[Chart: growth of the whitelist at June 2005, March 2006, May 2007 and Dec 2007. Files Indexed axis: 30M, 300M, 3B, 10B. Storage axis: 1 TB, 8 TB, 30 TB, 100 TB.]
Mechanics of a Whitelist
• Collect
• Extract
• Analyze
• Software Infrastructure
• Hardware Infrastructure
• Publish (Interfaces)
• Consumers
• Outbound Metadata
• Inbound User Metadata
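The collect, extract, analyze and publish stages can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions: `FileRecord`, `extract`, `analyze`, `publish` and `run_pipeline` are hypothetical names chosen for illustration, and the "analysis" step is reduced to a single check for the `MZ` executable marker. It does not describe how the Bit9 Knowledgebase is actually implemented.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """One whitelist entry (hypothetical layout, for illustration only)."""
    name: str
    sha256: str
    size: int
    metadata: dict = field(default_factory=dict)


def extract(name: str, payload: bytes) -> FileRecord:
    # Extract stage: derive the identifying hash and size from raw bytes.
    return FileRecord(name, hashlib.sha256(payload).hexdigest(), len(payload))


def analyze(record: FileRecord, payload: bytes) -> FileRecord:
    # Analyze stage: attach metadata. Here only a check for the DOS "MZ" marker.
    record.metadata["is_pe_candidate"] = payload[:2] == b"MZ"
    return record


def publish(records) -> dict:
    # Publish stage: expose records keyed by hash so consumers can look them up.
    return {record.sha256: record for record in records}


def run_pipeline(collected) -> dict:
    # collected: iterable of (name, payload) pairs produced by the collect stage.
    return publish(analyze(extract(name, data), data) for name, data in collected)
```

A consumer would then query the published index by hash, for example `run_pipeline(files).get(some_sha256)`.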
Building a Whitelist
• Trusted Partners
  – Benefits
    • Trusted Source of Binary Material
    • In-depth Information on the Binary Data Indexed
  – Realities
    • Expensive Partner Programs
    • Complicated Applications
    • Lack of Interest
    • Lack of Comprehensive Repositories
Certifying Software
  – Certificate Mechanism
    • As a Component for Validation
    • Costly Process, Cumbersome for QA Departments
    • Great When Seen on Shareware Sites, Less than 10% Penetration
  – First-Seen Date
    • Microsoft & Shared Installer Components
    • Long Time & No Detection: Likely Good
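The first-seen rule ("long time and no detection, likely good") can be written as a small decision function. This is a sketch under assumptions: the function name, the verdict strings and the one-year default threshold are all hypothetical, since the slides do not say how long "a long time" is.

```python
from datetime import date, timedelta


def first_seen_verdict(first_seen: date, detections: int, today: date,
                       min_age_days: int = 365) -> str:
    """Classify a file from its first-seen date and anti-malware detection count.

    min_age_days is an assumed threshold, not a value given in the talk.
    """
    if detections > 0:
        return "suspect"          # any detection overrides age
    if today - first_seen >= timedelta(days=min_age_days):
        return "likely good"      # old and never detected
    return "unknown"              # too new to judge on age alone
```

For example, a file first seen in June 2005 with no detections by May 2007 would come back as "likely good", while the same file seen last week would stay "unknown".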
Challenges of Software Acquisition
• Buying/Getting Physical Media
  – Retail Prices vs. eBay
  – How to process 35K DVDs?
• FTP Sites
• Web Sites
  – Simple: Links and Forms
  – Complicated: JavaScript
  – Super Complicated: Frames and AJAX
• Shareware Sites
• Warez
  – Legal Ramifications
  – Users vs. Collectors
Harvesting the Internet
• Order of Difficulty
  – FTPs: Wget, Curl
  – Simple HTTPs: Open Source Spiders
  – Try Grabbing Download.com
  – Try Grabbing Downloads.microsoft.com
  – Try Grabbing Canon or any Driver Site
• Datacenter Requirements
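For the "simple" end of the difficulty scale (plain links on static pages), a spider mostly needs to pull download links out of HTML. The sketch below uses only the Python standard library. The extension list and the names `DownloadLinkParser` and `find_download_links` are assumptions for illustration, and it deliberately does not handle the harder cases listed above (JavaScript, frames, AJAX, forms).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed set of interesting file types; a real harvester would use a longer list.
DOWNLOAD_EXTENSIONS = (".exe", ".msi", ".zip", ".cab")


class DownloadLinkParser(HTMLParser):
    """Collects absolute URLs of <a href> links that look like downloads."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Match on the extension, case-insensitively; query strings are not handled.
        if href and href.lower().endswith(DOWNLOAD_EXTENSIONS):
            self.links.append(urljoin(self.base_url, href))


def find_download_links(html: str, base_url: str) -> list:
    parser = DownloadLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

Fetching the page itself is left out so the example stays offline; the returned URLs could be handed to a tool such as Wget or Curl.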
Assuring Software is Trustworthy
• Anti-Malware Scanning
  – Name and Type Normalization
• Behavior Scanning
• Code Inspection
• External Meta Data Collection and Matching
Software Analysis Results
• Basic Embedded Data
• PE Header Analysis
  – Processor, Language, Binary Type
• Packers and Protectors
  – 500+ Variants
  – ASPack and Adobe
  – PECompact and Google
• Install Formats
  – Proprietary (like Skype)
  – Binary Diffs (Patch Factory, MS PSF)
• Runtime Analysis and Sandboxing
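As an illustration of PE header analysis, the sketch below reads the processor type, the subsystem and the DLL flag from a PE file using the field offsets of the published PE/COFF layout. The function name `parse_pe_header` and the abbreviated lookup tables are assumptions made for the example; this is not the analysis code used for the whitelist.

```python
import struct

# Partial lookup tables; values not listed are reported as hex / integer.
MACHINES = {0x014C: "Intel 386", 0x0200: "Intel Itanium", 0x8664: "AMD64"}
SUBSYSTEMS = {1: "Native", 2: "Windows GUI", 3: "Windows character", 9: "Windows CE"}
IMAGE_FILE_DLL = 0x2000  # bit in the COFF Characteristics field


def parse_pe_header(data: bytes) -> dict:
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # 20-byte COFF file header follows the signature.
    machine, _sections, _stamp, _symptr, _nsyms, opt_size, characteristics = \
        struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    subsystem = None
    if opt_size >= 70:
        # Subsystem sits at offset 68 of the optional header (PE32 and PE32+).
        (subsystem,) = struct.unpack_from("<H", data, pe_offset + 24 + 68)
    return {
        "machine": MACHINES.get(machine, hex(machine)),
        "subsystem": SUBSYSTEMS.get(subsystem, subsystem),
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
    }
```

Calling `parse_pe_header(open(path, "rb").read())` on a typical Windows GUI DLL would be expected to report a machine, a subsystem and `is_dll` set to true; truncated input raises `struct.error` rather than being handled here.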
Software Classifications
• Classifying Source
  – Trust-based vs. Type-based
• Classifying Files
  – Functional (Font, Driver, Screensaver) vs. Descriptive
• Classifying Products
  – Basic
    • Open Source
    • Commercial: Driver vs. Application
    • IM / P2P / Games
  – Better
    • Malware Classifications
  – Interesting
    • Steganography/Watermarking/Hacking/Hiding
Industry & Government Certifications
• Government Certifications
  – NIAP, FIPS, DCTS
• Vulnerability Reports
  – CVE, CERT, SANS, MSB, etc.
• For Good Software:
  – Certification Programs
    • Built for Vista, Windows Certified, Java Approved
  – eTrust Download
• For Malware:
  – StopBadware, CME
Leveraging the Whitelist
Distribution of language
[Chart: file language distribution, values paired with labels in chart order]
• English (U.S.): 85%
• Japanese: 2%
• About 1% each: Chinese (Traditional), Chinese (Simplified), Korean, German, Italian, French, Spanish, Portuguese (Brazil)
• Under 1% each: Dutch, Polish, Turkish, Russian, Swedish, Czech, Danish, Norwegian Bokmal, Finnish, Hungarian, Greek, Portuguese (Portugal), Hebrew, Arabic, English (Canadian), Slovak, Slovenian, Basque, Catalan, Croatian, Bulgarian, Ukrainian
PE Header Subsystem
Distribution of subsystem
[Chart: values paired with labels in chart order]
• The Windows graphical user interface (GUI) subsystem: 65%
• The Windows character subsystem: 29%
• Device drivers and native Windows processes: 5%
• Windows CE: 1%
• Under 1% each: the Posix character subsystem, unknown subsystem, Extensible Firmware Interface (EFI) applications, EFI drivers with boot services, EFI drivers with run-time services
Other PE Header Data
• Percentage of .NET applications (based on COR20 header): 6% .NET application, 94% others
• Percentage of binaries recognized as DLLs (based on file characteristics bitmask): 76% DLL, 24% others
• Percentage of binaries with a bound import table: 29% bound, 71% unbound
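Both the .NET check and the bound-import check come down to whether a particular entry in the PE data directory is populated: index 14 is the CLR (COR20) header and index 11 is the bound import table in the PE/COFF layout. The sketch below is one way to test for them. The function name `data_directory_flags` is an assumption, and treating "RVA and size both non-zero" as "present" is a simplification.

```python
import struct

BOUND_IMPORT_DIR = 11  # data directory index of the bound import table
CLR_HEADER_DIR = 14    # data directory index of the CLR (COR20) header


def data_directory_flags(data: bytes) -> dict:
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = pe_offset + 24  # optional header follows the 20-byte COFF header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:      # PE32
        dirs = opt + 96
    elif magic == 0x20B:    # PE32+
        dirs = opt + 112
    else:
        raise ValueError("unknown optional header magic")
    # NumberOfRvaAndSizes is the 4-byte field just before the directory array.
    (count,) = struct.unpack_from("<I", data, dirs - 4)

    def present(index: int) -> bool:
        if index >= count:
            return False
        rva, size = struct.unpack_from("<II", data, dirs + 8 * index)
        return rva != 0 and size != 0

    return {
        "is_dotnet": present(CLR_HEADER_DIR),
        "has_bound_imports": present(BOUND_IMPORT_DIR),
    }
```

Counting these flags over a corpus of files would give percentages of the kind shown on this slide.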
Distribution of machine code
[Chart: values paired with labels in chart order]
• Intel 386 or later processors and compatible processors: 87%
• Intel Itanium processor family: 8%
• AMD64: 4%
• Alpha AXP: 1%
• Under 1% each: MIPS little endian, Power PC little endian, ARM little endian, Thumb, Hitachi SH3, MIPS with FPU, Hitachi SH4, MIPS16
What about False Positives?
• Typical Suspects:
  – Internet Explorer
  – Drivers (Network, File Access)
  – OS Components
  – Universal Installer and Uninstaller Components
• Optimized Applications:
  – Using Obscure Third-Party Software
  – ASPack, PECompact, Themida
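One common reason packed applications trip heuristic detections is that compressed or encrypted code looks statistically random. A byte-entropy measure is a simple way to see this. The sketch below is a generic illustration, not a method described in the talk; the 7.0 bits-per-byte threshold in `looks_packed` is an assumed value.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())


def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    # Assumed threshold; high entropy only hints at packing or compression.
    return shannon_entropy(data) >= threshold
```

A whitelist that records the packer alongside such a measure lets a tester tell a legitimately packed commercial binary from an unknown high-entropy file.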
Archive Format Distribution
• Most popular archive/packer formats
  – ARC/GZIP: 44%
  – ARC/MSCAB: 27%
  – ARC/ZIP: 7%
  – ARC/BZIP2: 7%
  – ARC/TAR: 6%
  – SFX/MSCAB: 2%
  – ARC/LZ: 1%
  – SFX/UPX: 1%
  – ARC/MSI: 1%
  – SFX/MSDelta: 1%
  – ARC/PSF: 1%
• Under 1% each
  – Archives and self-extractors: ARC/RAR, ARC/ISCAB, SFX/ZIP, SFX/Nullsoft, SFX/RAR, SFX/IS, SFX/WISE, SFX/WISE/Embedded, ARC/ISO, ARC/7ZIP, SFX/7ZIP, SFX/BZIP2, SFX/NOS, ARC/UDF, ARC/WIM
  – Packers and protectors: UPX (0.72, 0.8x - 2.xx), ASPack (1.06b through 2.12), PECompact (1.30 through 2.xx), WinUPack (0.28 through 0.39), Private exeProtector 2.0, CExe 1.0a, PE Pack 1.0, PC Guard 5.00
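Many of the archive formats in this distribution can be recognized from their leading magic bytes. The sketch below covers a handful of them and reuses the ARC/ labels from the chart. The function name `identify_archive` and the choice of formats are assumptions for illustration; self-extracting (SFX) and packer detection need deeper inspection and are not attempted.

```python
# (magic prefix, label) pairs for formats identified by their first bytes.
SIGNATURES = [
    (b"\x1f\x8b", "ARC/GZIP"),
    (b"MSCF", "ARC/MSCAB"),
    (b"PK\x03\x04", "ARC/ZIP"),
    (b"BZh", "ARC/BZIP2"),
    (b"Rar!\x1a\x07", "ARC/RAR"),
    (b"7z\xbc\xaf\x27\x1c", "ARC/7ZIP"),
]


def identify_archive(data: bytes):
    """Return a format label, or None if no known signature matches."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    # tar has no leading magic; "ustar" appears at offset 257 of the header.
    if data[257:262] == b"ustar":
        return "ARC/TAR"
    return None
```

Only the first few hundred bytes of each file are needed, so this kind of check is cheap to run across a large collection.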
Or Are They False Positives? (FTP Injection Attacks)
• HP
• Nero AG
Vertical Detection
• Malware Sample Vertical File Detection Chart
• Good File Vertical Analysis
• Anti-Malware Reports per Web Site
  – Bit9 ISV Safe Software Program
Use Case: Anti-Malware
• Benefits
  – R&D Tool
    • Packers, Metadata, Sources
  – QA Tool
    • False Positives
  – Performance Accelerator
    • Robin Bloor’s AVID
    • Next Generation Anti-Malware
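As a QA tool, the whitelist lets a tester cross-check scanner output: any detection whose file hash is also on the whitelist is a false-positive candidate worth reviewing. The sketch below assumes detections arrive as a mapping from SHA-256 hash to detection name and the whitelist as a set of hashes. Both shapes, and the function names, are hypothetical.

```python
import hashlib


def sha256_hex(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()


def false_positive_candidates(detections: dict, whitelist: set) -> list:
    """Return sorted (hash, detection name) pairs for detected whitelisted files.

    detections: {sha256 hex: detection name} from an anti-malware scan (assumed).
    whitelist:  set of sha256 hex strings for known-good files (assumed).
    """
    return sorted((digest, name) for digest, name in detections.items()
                  if digest in whitelist)
```

A candidate is not automatically a false positive: as the FTP injection slides suggest, a trusted source can itself serve a tampered file, so each hit still needs review.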
About Bit9
• What We Do:
  – Application and Device Control Solutions and Software Metadata Reporting
• What We Offer:
  – Bit9 Parity Protects against Malicious Software and Data Leakage
  – The Bit9 Knowledgebase is the Largest Collection of Actionable Intelligence about the World’s Software
• Background
  – Founded in 2002 by founders of Okena (Cisco)
  – $2 Million NIST ATP Grant in 2003
  – Headquartered in Cambridge, Mass.
  – Venture Funded