GENESIS: A Framework For Achieving Component Diversity
John C. Knight, Jack W. Davidson, David Evans, Anh Nguyen-Tuong
University of Virginia
Chenxi Wang
Carnegie Mellon University
DARPA SRS Kickoff 3
What Is The Problem?
Many machines with the same vulnerability
What is a vulnerability?
  A vulnerability is a fault in the classic sense of dependability theory
Fault types:
  Degradation: something breaks in one copy
  Design: flaw in design affects all copies
Software faults are design faults
Redundancy & Degradation Faults
[Figure: N Modular Redundant (NMR) system. Inputs go to N identical computers (Computer 1, Computer 2, ..., Computer N), whose results pass through a voter to the outputs. Functions shown: error detection, damage assessment, state restoration, continued service.]
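The voting step of an NMR arrangement can be sketched as a strict majority vote over the replica outputs. This is a minimal illustration rather than anything taken from the slides; `majority_vote` and the sample outputs are hypothetical.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas, or None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # Require a strict majority; otherwise report that no agreement exists.
    return value if count > len(outputs) / 2 else None

# Three replicas, one of which has suffered a degradation fault.
replica_outputs = [42, 42, 17]
print(majority_vote(replica_outputs))  # 42
```

A voter of this kind masks a fault in one copy, but if every replica carries the same design fault they all agree on the same wrong answer.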
Redundancy & Design Faults
Redundancy is diversity
Works well for degradation faults:
  Faults have predictable statistical behavior
  Effective mathematical models available
What about design faults?
  Simple replication doesn’t work, obviously
  Requires different (diverse) designs to be effective
Design Diversity Development
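[Figure: Design diversity development. A component specification feeds N separate version developments (Version Development 1, 2, ..., N), whose results go to system assembly. Interaction barriers and technology restrictions apply to the development efforts.]
Goal: Different Faults Because Of Independent Development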
Design Diverse System
[Figure: N Version System. Inputs go to Version 1, Version 2, ..., Version N, whose results pass through a voter to the outputs.]
How “Different”?
Assumption: Different Faults Because Of Independent Development
Design Diversity
Does not work well for design faults
  No upper bound on failure probability
  No practical statistical models
  No definition of “design diversity”
  No procedure for achieving it
Linux vs. Windows is, however, worse: it is purely ad hoc
But, what else is there?
Data Diversity
Heisenbug (Jim Gray):
  Program fails
  Sometimes if you rerun the program, it works
  Applied to Tandem operating system
  We all do this in daily operation
Several variants of approach developed
Comprehensive, general approach developed: Data diversity
Data Diverse System
[Figure: N Copy Architecture. Inputs pass through a separate data re-expression for each copy (Copy 1, Copy 2, ..., Copy N), all running the same software. Each copy's result passes through a reverse data re-expression before reaching the voter.]
Data Diversity
Low cost: software is copied
Unknown performance for design faults
Experimental evidence that it works well
Can be very powerful:
  sin(x) = sin(a + b)
         = sin(a)cos(b) + cos(a)sin(b)
         = sin(a)sin(90° - b) + sin(90° - a)sin(b)
  Choose a and b, repeat, vote
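The "choose a and b, repeat, vote" idea can be sketched as follows. This is an illustrative sketch only: it uses radians (so 90 degrees becomes pi/2), picks `a` at random, and votes with the median, all of which are assumptions rather than details given on the slide. The names `sin_reexpressed` and `data_diverse_sin` are hypothetical.

```python
import math
import random
import statistics

def sin_reexpressed(x, rng):
    """Compute sin(x) from a random split x = a + b, calling only sin().

    Uses sin(a + b) = sin(a)sin(pi/2 - b) + sin(pi/2 - a)sin(b),
    the radian form of the identity on the slide.
    """
    a = rng.uniform(-math.pi, math.pi)   # arbitrary choice of a
    b = x - a                            # so that a + b == x
    half_pi = math.pi / 2
    return (math.sin(a) * math.sin(half_pi - b)
            + math.sin(half_pi - a) * math.sin(b))

def data_diverse_sin(x, copies=5, seed=0):
    """Run several re-expressed computations and vote with the median."""
    rng = random.Random(seed)
    results = [sin_reexpressed(x, rng) for _ in range(copies)]
    return statistics.median(results)

print(data_diverse_sin(math.radians(30)))  # close to 0.5
```

Each copy exercises `sin()` on different arguments, so a fault that is triggered only by particular input values is unlikely to affect a majority of the copies.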
The Vision
[Figure: Software and diversity specifications are input to the GENESIS Diversity Engine, which outputs a diverse population of functionally-equivalent software.]
Automated production of design-diverse, functionally-equivalent software
Automatic production of data-diverse, functionally-equivalent software
It might work…
Overall Approach
Analysis of the diversity space
Automated production of functionally-equivalent software and data:
  Compiler and meta-compiler technology:
    Source-level transformations
    Compiler transformations
    Data stream rewriting
  Virtual Machine Technology
    Run-time software translation techniques
Rationale that diversity is an effective defense mechanism:
  Experimental evaluation
  Modeling of effects of diversity on known vulnerabilities
  Application to COTS software
Hierarchic Design Diversity
[Figure: Hierarchy of variants. A software application yields Source Code Version 1 through Source Code Version N via source-to-source transformations. Each source version yields Binary 1, Binary 2, ..., Binary i via compiler transformations. Run-time transformations are applied to the binaries.]
Source to Source Transformations
Underlying model of tasks:
  e.g. fork/execs vs. threads
Process interaction:
  e.g. low-level semaphores vs. higher-level monitors
Fundamental libraries:
  e.g. libc, sockets, etc…
Diversity achieved by component combinations
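Diversity by component combination can be pictured as a Cartesian product over the design dimensions listed above. The sketch below is illustrative only; the dimension names and the alternatives for each are hypothetical placeholders, not GENESIS configuration options.

```python
from itertools import product

# Hypothetical alternatives for each design dimension named on the slide.
CHOICES = {
    "task_model": ["fork_exec", "threads"],
    "process_interaction": ["semaphores", "monitors"],
    "libc": ["libc_variant_a", "libc_variant_b"],
}

def enumerate_variants(choices):
    """Return one configuration dict per combination of component choices."""
    names = list(choices)
    combos = product(*(choices[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

variants = enumerate_variants(CHOICES)
print(len(variants))  # 2 * 2 * 2 = 8 combinations
```

The number of variants grows multiplicatively with each added dimension, which is why a small set of interchangeable components can produce a large population.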
Compiler Transformations
Generate N compilers that target different architectures
Manipulate formal description of target architecture, written in the Computer Systems Description Language (CSDL):
  Instruction Set Architecture (ISA) specification
  Calling convention specification
Example diversity techniques:
  Different calling conventions
  ISA subsets created, enforced dynamically
  Memory layouts (code and data)
  Implement the above within the same program
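One of the techniques above, different calling conventions, can be illustrated by varying which registers carry which arguments. This sketch is not CSDL and does not reflect how the authors' compilers are generated; the register names and the functions `make_calling_convention` and `place_arguments` are hypothetical.

```python
import random

ARG_REGISTERS = ["r0", "r1", "r2", "r3"]  # hypothetical register names

def make_calling_convention(seed):
    """Return a variant-specific order in which registers carry arguments."""
    order = ARG_REGISTERS[:]
    random.Random(seed).shuffle(order)   # the seed identifies the variant
    return order

def place_arguments(args, convention):
    """Map positional arguments to registers under a given convention."""
    if len(args) > len(convention):
        raise ValueError("too many arguments for a register-only convention")
    return dict(zip(convention, args))

convention = make_calling_convention(seed=7)
print(place_arguments([10, 20], convention))
```

Code injected by an attacker that assumes the standard argument placement would pass its arguments in the wrong registers on a variant built this way.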
Run-time Transformations
Software Dynamic Translation
STRATA system:
  Layer between hardware and application
  Designed to be easily retargeted
Virtual machine provides:
  Underlying target
  Supplementary rules on use of target
Software Dynamic Translation systems:
  FX!32
  Dynamo
  Transmeta
STRATA—Basic Operation
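[Figure: STRATA basic operation, the SDT virtual machine loop. Context Capture, then New PC, then Cached? If yes, Context Switch to the host CPU, which executes translated code from the cache. If no, New Fragment, then Fetch, Decode, Translate, and Next PC, repeated until Finished? is yes, followed by Context Switch. The figure is annotated "Enforce Desired Policies".]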
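The loop in the figure can be sketched with a toy machine. Everything here is an assumption made for illustration: the three-opcode instruction set (`add`, `djnz`, `halt`), the state layout, and the function names are hypothetical and are not STRATA's interface. The sketch shows fragments being translated once, cached by starting PC, and checked against a policy at translation time.

```python
def translate_fragment(program, pc, allowed_ops):
    """Fetch, decode, and translate from pc up to the next control transfer."""
    total = 0
    while True:
        op, arg = program[pc]                      # fetch and decode
        if op not in allowed_ops:                  # enforce the desired policy
            raise PermissionError(f"opcode {op!r} not allowed at pc={pc}")
        if op != "add":
            break                                  # control transfer ends the fragment
        total += arg                               # fold straight-line adds together
        pc += 1

    if op == "djnz":                               # decrement n, jump if nonzero
        taken, fallthrough = arg, pc + 1

        def fragment(state):
            state["acc"] += total
            state["n"] -= 1
            return taken if state["n"] != 0 else fallthrough
    elif op == "halt":
        def fragment(state):
            state["acc"] += total
            return None                            # no next PC: finished
    else:
        raise ValueError(f"unknown opcode {op!r}")
    return fragment


def run(program, n, allowed_ops=frozenset({"add", "djnz", "halt"})):
    """Execute program via cached fragments; return (acc, fragments translated)."""
    if n < 1:
        raise ValueError("n must be at least 1 for this toy loop counter")
    cache, translated = {}, 0
    state = {"acc": 0, "n": n}
    pc = 0
    while pc is not None:
        if pc not in cache:                        # Cached? No: build a new fragment
            cache[pc] = translate_fragment(program, pc, allowed_ops)
            translated += 1
        pc = cache[pc](state)                      # run translated code, get next PC
    return state["acc"], translated


PROGRAM = [("add", 2), ("add", 3), ("djnz", 0), ("halt", None)]
print(run(PROGRAM, n=4))  # (20, 2)
```

The loop body runs four times but is translated only once, and narrowing `allowed_ops` makes the translator reject a program that uses an opcode outside the permitted subset.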
Example STRATA Policies
Apply compile-time transformations dynamically:
  Rearrangement of basic blocks, calling sequence transformations, etc…
Dynamic injection and enforcement of behavioral policies
  E.g. resource usage (files, sockets, tasks)
Language diversity: dialects
  Only allow subsets of original instruction set
  Vary subsets dynamically
STRATA System Architecture
[Figure: STRATA system architecture. The Strata Virtual Machine sits between the Application and the Host CPU. Machine-independent components: Strata Virtual CPU, Context Management, Memory Management, Cache Management, Linker. A Target Interface connects them to the Target Specific Functions.]
Data Diversity
Diversity in the data space can avoid sequences of events that lead to failure
Diversity space offers large range of data re-expression options
  Precision (Exact, Approximate)
  Locality (Internal, External)
  Sequence (inorder-ontime, inorder-offtime, outoforder-ontime, outoforder-offtime)
[Figure: three axes labeled Precision, Locality, and Sequence.]
Data Re-expression Examples
Change floating point values:
  Lose precision
  Translate
  Rotate
Data sequences:
  Reorder data
  Change timing of data
Memory layout (code and data)
Reorder transactions
Reorder data in activation records
SQL Rewriting
…many more examples…
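The "Translate" re-expression can be sketched with an N-copy computation of a centroid: each copy shifts the input points by a different offset, runs the same code, and shifts the result back before voting. The choice of centroid as the computation, the median as the voter, and all function names are assumptions made for illustration.

```python
import statistics

def centroid(points):
    """The 'same software' that every copy runs."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def run_copy(points, offset):
    """Re-express the input by translation, compute, then reverse it."""
    dx, dy = offset
    shifted = [(x + dx, y + dy) for x, y in points]   # data re-expression
    cx, cy = centroid(shifted)
    return (cx - dx, cy - dy)                         # reverse re-expression

def n_copy_centroid(points, offsets):
    """Run one copy per offset and vote coordinate-wise with the median."""
    results = [run_copy(points, off) for off in offsets]
    return (statistics.median(r[0] for r in results),
            statistics.median(r[1] for r in results))

square = [(0, 0), (4, 0), (4, 2), (0, 2)]
print(n_copy_centroid(square, [(0, 0), (10, -3), (-7.5, 2.25)]))
```

Because floating point results from different copies may differ in the last few bits, a voter for this kind of data compares within a tolerance or, as here, takes a median rather than requiring exact equality.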
Data Re-expression Space
These examples are ad hoc
Proposals in literature are ad hoc
So:
  Use data re-expression space categorization to drive exploration of diversity techniques (instead of point solutions)
Evaluation
Theoretical:
  Modeling of effects of diversity on network vulnerabilities
    E.g., worm propagation
  Understand limits of diversity
  Categorization of “diversity space”
  Identify unnecessary homogeneity in software
    Not just code but also environment, configuration, etc…
Experimental:
  Directed fault seeding:
    Apply known exploits to target system
    Apply all Genesis techniques
    Evaluate variants’ resistance to attack
  Automated fault seeding
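As one way to picture the worm propagation modeling mentioned above, the sketch below uses a generic discrete-time susceptible-infected model. It is not the authors' model. It assumes hosts are split evenly across `variants`, that an exploit works against exactly one variant, and that each infected host probes `scans_per_step` addresses chosen uniformly from the whole population; all names and parameters are hypothetical.

```python
def simulate_worm(total_hosts, variants, scans_per_step, steps, initial_infected=1):
    """Return the expected number of infected hosts after each step."""
    vulnerable = total_hosts / variants           # only one variant is exploitable
    infected = float(min(initial_infected, vulnerable))
    history = [infected]
    for _ in range(steps):
        susceptible = vulnerable - infected
        # Each probe hits a still-susceptible host with probability
        # susceptible / total_hosts.
        new = infected * scans_per_step * susceptible / total_hosts
        infected += min(new, susceptible)         # cannot exceed the vulnerable pool
        history.append(infected)
    return history

monoculture = simulate_worm(10_000, variants=1, scans_per_step=2, steps=10)
diverse = simulate_worm(10_000, variants=4, scans_per_step=2, steps=10)
print(round(monoculture[-1]), round(diverse[-1]))
```

Under these assumptions diversity both caps the number of hosts that can ever be infected and slows the spread, because most probes land on hosts running a variant the exploit does not affect.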
Automatic Fault Seeding
Need test cases
Need typical vulnerabilities, i.e., bugs
Can typical bugs be synthesized?
Prior work on syntactic transformations:
  Simple mutations
  Wide variety of resilience
  Defects created with excellent statistical properties
Plan to try this route
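A simple syntactic mutation of the kind referred to above can be sketched with Python's standard `ast` module: the transformer below turns every `<` into `<=`, seeding an off-by-one defect in a bounds check. The target function `in_bounds` and the choice of mutation operator are hypothetical examples, not the prior work's tooling.

```python
import ast

SOURCE = """
def in_bounds(i, n):
    return 0 <= i < n
"""

class LtToLtE(ast.NodeTransformer):
    """Replace every '<' with '<=' to seed an off-by-one defect."""
    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.LtE() if isinstance(op, ast.Lt) else op
                    for op in node.ops]
        return node

def load_function(tree, name):
    """Compile a module AST and return the named function from it."""
    namespace = {}
    exec(compile(tree, filename="<seeded>", mode="exec"), namespace)
    return namespace[name]

original = load_function(ast.parse(SOURCE), "in_bounds")
mutant_tree = ast.fix_missing_locations(LtToLtE().visit(ast.parse(SOURCE)))
mutant = load_function(mutant_tree, "in_bounds")

print(original(5, 5), mutant(5, 5))  # False True
```

The mutant agrees with the original on most inputs and differs only at the boundary, which is the kind of defect an acceptance test suite can easily miss.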
Automated Fault Seeding
[Figure: Automated fault seeding. Labels: Target Software System (many copies, in two groups), Error Seeding, Genesis Transformations, Acceptance Tests, Vulnerability Assessment.]
State Of The Implementation
Exists, ready to use:
  CSDL
  Calling convention spec
  STRATA
Specific Questions Posed
What are you trying to do (the problem you are addressing)?
How will you show that you were successful?
What are the implications of successful results (or less than successful results)?
What is your technical approach?
What is new, or hasn’t been attempted?
What significant problems do you anticipate, what makes your project difficult, and how do you plan to approach the difficulties?
If successful, what have you thought about regarding transitioning the technology?
If successful, what would be next?
Practical Problem
If this works:
  Building a system will require lots of computer time
  Lots of systems will require LOTS of computer time
  But it is just computer time
Will not be able to just press CDs
Will require a substantial engineering investment