all you ever wanted to know about dynamic taint …aavgerin/slides/woot10.pdfall you ever wanted to...
Post on 24-Aug-2020
2 Views
Preview:
TRANSCRIPT
All You Ever Wanted to Know AboutDynamic Taint Analysis
&Forward Symbolic Execution
(but might have been afraid to ask)
Edward J. Schwartz, Thanassis Avgerinos, David Brumley
8/16/2010 Carnegie Mellon University 1
(Yes, we were trying to overflow the title length fieldon the submission server)
A Few Things You Need to Know AboutDynamic Taint Analysis
&Forward Symbolic Execution
(but might have been afraid to ask)
Edward J. Schwartz, Thanassis Avgerinos, David Brumley
8/16/2010 Carnegie Mellon University 2
The Root of All Evil
8/16/2010 Carnegie Mellon University 3
Humans write programs
This Talk:Computers Analyzing Programs Dynamically at
Runtime
Two Essential Runtime Analyses
8/16/2010 Carnegie Mellon University 4
Dynamic Taint Analysis:What values are derived from user input?
Detect Exploits [Costa2005,Crandall2005,Newsome2005,Suh2004]
Detectpacking in malware
[Bayer2009,Yin2007]
Forward Symbolic Execution:What input will make execution reach this line of code?
Input Filter Generation [Costa2007,Brumley2008]
Automated Test Case Generation
[Cadar2008,Godefroid2005,Sen2005]
Our Contributions
1: Turn English descriptions into an algorithm
– Operational Semantics
2: Algorithm highlights caveats, issues, and unsolved problems that are deceptively hard
8/16/2010 Carnegie Mellon University 5
Dynamic Taint Analysis:Is this value affected by user input?
Forward Symbolic Execution:What input will make execution
reach this line of code?
Computers Analyzing Programs Dynamically at Runtime
Our Contributions (cont’d)
3: Systematize recurring themes in a wealth of previous work
8/16/2010 Carnegie Mellon University 6
1. How it works – example
2. Desired properties
3. Example issue. Paper has many more.
8/16/2010 Carnegie Mellon University 7
Dynamic Taint Analysis:What values are derived from user input?
8/16/2010 Carnegie Mellon University 8
x = get_input( )
y = x + 42
…
goto yInput is tainted
untaintedtainted
x 7
ΔVar Val
Tx
Tainted?Var
τ
Inputt = IsUntrusted(src)get_input(src)↓ t
Taint Introduction
8/16/2010 Carnegie Mellon University 9
x = get_input( )
y = x + 42
…
goto yData derived from
user input is tainted
untaintedtainted
y 49
ΔVar Val
x 7
Ty
Tainted?
T
Var
x
τ
BinOpt1 = τ[x1] , t2 = τ[x2]
x1 + x2 ↓ t1 v t2
Taint Propagation
8/16/2010 Carnegie Mellon University 10
Policy ViolationDetected
x = get_input( )
y = x + 42
…
goto y
untaintedtainted ΔVar Val
x 7
y 49
Tainted?
T
T
Var
x
y
τTaint Checking
Pgoto(ta) = ¬ ta
(Must be true to execute)
8/16/2010 Carnegie Mellon University 11
x = get_input( )
y = …
…
goto y
…strcpy(buffer,argv[1]) ;…return ;
Jumping to overwritten
return address
Real Use:Exploit Detection
Different Use:Program Control
Memory Load
8/16/2010 Carnegie Mellon University 12
Variables Memory
ΔVar Val
x 7
Tainted?
T
Var
x
τ
μAddr Val
7 42
Tainted?
F
Addr
7
τμ
Problem: Memory Addresses
8/16/2010 Carnegie Mellon University 13
x = get_input( )y = load( x )… goto y
All values derived from user input
are tainted??
7 42μ
Addr Val
Tainted?
F
Addr
7τμ
x 7Δ
Var Val
μAddr Val
x = get_input( )y = load( x )… goto y
Jump target could be any untainted
memory cell value
Policy 1:
8/16/2010 Carnegie Mellon University 14
Loadv = Δ*x] , t = τμ[v]
load(x) ↓ t
Taint depends only on the memory cell
Taint Propagation
7 42
Tainted?
F
Addr
7τμ
x 7Δ
Var Val
UndertaintingFailing to identify tainted values
- e.g., missing exploits
jmp_table
Policy Violation?
8/16/2010 Carnegie Mellon University 15
x = get_input( )y = load(jmp_table + x % 2 )…goto y
Policy 2:
Memory
printa
printb
Address expression is tainted
Load v = Δ*x] , t = τμ[v], ta = τ[x]load(x) ↓ t v ta
If either the address or the memory cell is tainted, then the value is tainted
Taint Propagation
OvertaintingUnaffected values are tainted
- e.g., exploits on safe inputs
Research ChallengeState-of-the-Art is not perfect for all
programs
8/16/2010 Carnegie Mellon University 16
Undertainting:Policy may miss taint
Overtainting:Policy may wrongly
detect taint
• How it works – example
• Inherent problems of symbolic execution
• Proposed solutions
8/16/2010 Carnegie Mellon University 17
Forward Symbolic Execution:What input will make execution reach this line of code?
The Challenge
8/16/2010 Carnegie Mellon University 18
0x12345678
232 possible inputs
packet_len(int header, char *packet)char buf*2048+ = “…”;if (header < 0)
return 0;if (header == 0x12345678)
strcpy(buf, packet); return strlen(buf);
Forward Symbolic Execution:What input will make execution
reach this line of code?
f t
f t
A Simple Example
8/16/2010 Carnegie Mellon University 19
header < 0
header symboliccan have any
value
packet_len(int header, …)
If (header < 0)
If header == 0x12345678 return 0;
strcpy(buf,packet);return strlen(buf);
Interpreter
InterpreterInterpreter
InterpreterInterpreter
header ≥ 0 Λheader != 0x12345678
header ≥ 0 Λheader == 0x12345678
header ≥ 0
What input will make execution reach this line of
code?
8/16/2010 Carnegie Mellon University 20
One Problem: Exponential Blowup Due to Branches
Branch 2
Branch 3
Branch 1
Exponential Number of Interpreters/formulas in # of branches
Interpreter
8/16/2010 Carnegie Mellon University 21
Path Selection Heuristics
Symbolic Execution Tree
• Depth-First Search (bounded) ,Random Search [Cadar2008]
• Concolic Testing [Sen2005,Godefroid2008]…
However, these are heuristics. In the worst case all create an exponential number of formulas in the tree height.
Symbolic Execution is not Easy
• Exponential number of interpreters/formulas
• Exponentially-sized formulas
• Solving a formula is NP-Complete!
8/16/2010 Carnegie Mellon University 22
branching
substitutions + s + s + s +s + s + s + s == 42
Other Important Issues
8/16/2010 Carnegie Mellon University 23
Formalization
Π = (s + s + s + s +s + s + s + s) ==
42
More complex policies
Conclusion
• Dynamic taint analysis and forward symbolic execution used extensively in literature
– Formal algorithm and what is done for each possible step of execution often not emphasized
• We provided a formal definition and summarized
– Critical issues
– State-of-the-art solutions
– Common tradeoffs
8/16/2010 Carnegie Mellon University 24
8/16/2010 Carnegie Mellon University 25
Questions?
Thank You!thanassis@cmu.edu
top related