self-composition by symbolic execution
Presentation at the ICCSW 2013 workshop. Full paper is here: http://drops.dagstuhl.de/opus/volltexte/2013/4277/
THE PROBLEM | PRELIMINARIES | THE APPROACH | CONCLUSION
Self-composition by Symbolic Execution
Quoc-Sang Phan
Queen Mary, University of London
September 26, 2013
Outline
1 THE PROBLEM: Information Flow, Self-composition
2 PRELIMINARIES: The trace semantics, Symbolic Execution
3 THE APPROACH: Self-composition as Path-equivalence, Path-equivalence generation, Implementation
4 CONCLUSION
Information Flow | Self-composition
Attacker model
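[Figure: attacker model. The software (SW) takes a secret input H and a public input L and produces a public output O, which is visible to an external observer.]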
Examples
Direct flow (explicit flow)
O = H + 3;
Indirect flow (implicit flow)
if (H == L)
    O = true;   // accept password
else
    O = false;  // reject
The problem
(Qualitative) Information Flow: does the program leak information?
Quantitative Information Flow (QIF): how much does it leak?
Given a function F measuring secrecy, leakage of information is defined as:
$$\Delta F(H) = F(H) - F(H \mid O)$$
F can measure: Shannon entropy, Rényi’s min-entropy, guessing entropy.
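As an illustration of the leakage definition, the following minimal sketch instantiates F with Shannon entropy. It assumes a uniformly distributed 2-bit secret and a fixed public input; the function names are hypothetical and not taken from the talk or the paper.

```python
from collections import Counter
from math import log2


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as outcome counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def shannon_leakage(program, secrets, public):
    """Leakage F(H) - F(H|O) for a uniform secret and a fixed public input."""
    secrets = list(secrets)
    prior = shannon_entropy([1] * len(secrets))  # F(H): uniform prior
    # Group the secrets by the output they produce.
    by_output = Counter(program(h, public) for h in secrets)
    # F(H|O): expected remaining entropy once the output is observed.
    posterior = sum(
        n / len(secrets) * shannon_entropy([1] * n) for n in by_output.values()
    )
    return prior - posterior


def password_check(h, l):
    return h == l  # indirect flow


def add_three(h, l):
    return h + 3  # direct flow


two_bit_secrets = range(4)
leak_password = shannon_leakage(password_check, two_bit_secrets, public=0)
leak_direct = shannon_leakage(add_three, two_bit_secrets, public=0)
print(f"password check: {leak_password:.4f} bits")
print(f"O = H + 3:      {leak_direct:.4f} bits")
```

Under these assumptions the password check leaks about 0.81 bits per attempt, while the direct flow reveals the whole 2-bit secret.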
Two-step analysis for QIF
Detect the leaks ← this presentation.
“Measure” the leaks.
Detecting information flow leaks
Type system
  No false negatives, too many false positives (too restrictive).
  Fast.
Taint analysis
  Both false negatives and false positives.
  Fast (powerful to detect bugs).
Theorem proving (by self-composition)
  Precise: no false positives, no false negatives.
  Impractical in reality.
Self-composition
// program P
if (H == L)
    O = true;
else
    O = false;
// copy of P with all variables renamed
if (H1 == L1)
    O1 = true;
else
    O1 = false;
Self-composition in Hoare logic
$$\{L = L_1\}\ P;\ P_1\ \{O = O_1\}$$
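The Hoare triple can be read operationally: run P and its renamed copy on equal public inputs and check that the outputs agree. A minimal sketch, assuming a small finite input domain and using brute-force enumeration in place of a theorem prover (the helper names are hypothetical):

```python
from itertools import product


def program(h, l):
    """Program P: the password check."""
    return h == l


def program_copy(h1, l1):
    """Copy of P with all variables renamed (H1, L1, O1)."""
    return h1 == l1


def find_violation(values):
    """Search for inputs that break {L = L1} P; P1 {O = O1}."""
    for h, h1, l in product(values, repeat=3):
        l1 = l  # precondition: L = L1
        o = program(h, l)
        o1 = program_copy(h1, l1)
        if o != o1:  # postcondition O = O1 fails
            return {"H": h, "H1": h1, "L": l, "O": o, "O1": o1}
    return None


violation = find_violation(range(4))
print(violation)
```

A non-empty result means that two runs with the same public input and different secrets produce different outputs, so the password check is not noninterferent.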
Self-composition
Terauchi and Aiken. “Secure information flow as a safety problem”. SAS 2005.
“When we actually applied the self-composition approach, we found that not only are the existing automatic safety analysis tools not powerful enough to verify many realistic problem instances efficiently (or at all), but also that there are strong reasons to believe that it is unlikely to expect any future advance”.
Our contribution
Practical approach for Self-composition using Symbolic Execution and SMT solvers.
Shift the self-composing step from the source code to the symbolic expressions.
Generate self-composition formula in first-order theories.
Implement on Symbolic Pathfinder and Z3.
The trace semantics | Symbolic Execution
The formal system
A deterministic program is modelled as a transition system:
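$$P = (\Sigma, I, F, T)$$
$\Sigma$ is the set of program states;
$I \subseteq \Sigma$: the set of initial states. $\sigma \in I$ is a pair $\langle H, L \rangle$, which means $I = I_H \times I_L$.
$F \subseteq \Sigma$: the set of final states.
$T \subseteq \Sigma \times \Sigma$: the transition function.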
The trace semantics
A trace of (concrete) execution of program P:
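$$\rho = \sigma_0 \sigma_1 \ldots \sigma_n$$
with $\sigma_0 \in I$, $\sigma_n \in F$ and $\langle \sigma_i, \sigma_{i+1} \rangle \in T$ for all $i \in \{0, \ldots, n-1\}$.
The semantics of P: the set R of all possible traces.
$init(\rho) = \sigma_0$ and $fin(\rho) = \sigma_n$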
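To make the trace semantics concrete, here is a minimal sketch of the password check as a transition system. The state layout (a tuple with a program counter) and the two-valued input domain are assumptions made for illustration only.

```python
from itertools import product


# A state is a tuple (pc, H, L, O); pc is a hypothetical program counter.
def step(state):
    """Transition function T of the password check; None for final states."""
    pc, h, l, o = state
    if pc == "test":
        return ("then" if h == l else "else", h, l, o)
    if pc == "then":
        return ("end", h, l, True)
    if pc == "else":
        return ("end", h, l, False)
    return None  # pc == "end": final state


def trace(initial):
    """The unique trace starting from an initial state (P is deterministic)."""
    rho = [initial]
    while (nxt := step(rho[-1])) is not None:
        rho.append(nxt)
    return rho


def init(rho):
    return rho[0]


def fin(rho):
    return rho[-1]


# I = I_H x I_L, here with two-valued H and L; O is unset (None) initially.
initial_states = [("test", h, l, None) for h, l in product(range(2), repeat=2)]
R = [trace(s) for s in initial_states]  # the semantics: all traces
for rho in R:
    print(init(rho), "->", fin(rho))
```

Even in this toy setting the set R grows with the product of the input domains, which is why enumerating concrete traces does not scale.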
Symbolic Execution
Example
if (H == L)
    O = true;   // accept password
else
    O = false;  // reject
Execute program with input symbols: H = α and L = β
If (α == β): O = true.
If (α ≠ β): O = false.
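The forking behaviour in this example can be sketched with a toy symbolic executor. The mini-AST and the string representation of conditions are hypothetical; a real tool such as Symbolic Pathfinder works on Java bytecode and also asks a solver whether each path condition is satisfiable, which this sketch omits.

```python
# Hypothetical mini-AST: ("if", cond, then_branch, else_branch) or ("assign_O", value).
PASSWORD_CHECK = ("if", "alpha == beta", ("assign_O", True), ("assign_O", False))


def negate(cond):
    return f"not ({cond})"


def symbolic_execute(node, path_condition=()):
    """Yield one (path condition, symbolic value of O) pair per path."""
    if node[0] == "assign_O":
        yield path_condition, node[1]
    elif node[0] == "if":
        _, cond, then_branch, else_branch = node
        # Fork: follow both branches, recording the branch condition taken.
        yield from symbolic_execute(then_branch, path_condition + (cond,))
        yield from symbolic_execute(else_branch, path_condition + (negate(cond),))


paths = list(symbolic_execute(PASSWORD_CHECK))
for pc, output in paths:
    print(" and ".join(pc), "=> O =", output)
```

Each yielded pair corresponds to one leaf of the symbolic execution tree: the conjunction of branch conditions on the way down, plus the value assigned to O.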
Symbolic Execution
A deterministic program is modelled as a transition system:
$$P = (\Sigma^s, I^s, F^s, T^s)$$
$\Sigma^s$: the set of symbolic states
$I^s \subseteq \Sigma^s$: the set of initial symbolic states
$F^s \subseteq \Sigma^s$: the set of final symbolic states
$T^s \subseteq \Sigma^s \times \Sigma^s$: the transition function.
The semantics
A symbolic path (symbolic trace) of the program P:
$$\rho^s = \sigma^s_0 \sigma^s_1 \ldots \sigma^s_n$$
such that $\sigma^s_0 \in I^s$, $\sigma^s_n \in F^s$ and $\langle \sigma^s_i, \sigma^s_{i+1} \rangle \in T^s$ for all $i \in \{0, \ldots, n-1\}$.
The symbolic semantics of P: the set $R^s$ of all symbolic paths (aka the symbolic execution tree).
The summaries
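Denote by $X|_y$ the value of the variable X at the state y. For each $\sigma^s_i \in F^s$:
$$O|_{\sigma^s_i} = f_i(\alpha, \beta)$$
$\sigma^s_i$ is reachable iff path condition $c_i(\alpha, \beta)$ is SAT.
$$O = \begin{cases}
f_1(\alpha, \beta) & \text{if } c_1(\alpha, \beta) \\
f_2(\alpha, \beta) & \text{if } c_2(\alpha, \beta) \\
\ldots & \ldots \\
f_n(\alpha, \beta) & \text{if } c_n(\alpha, \beta)
\end{cases}$$
$$\forall i, j \in [1, n] \wedge i \neq j.\ c_i \wedge c_j = \bot$$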
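A summary of this shape can be represented directly as a list of (c_i, f_i) pairs. The sketch below does this for the password check and verifies the disjointness property by enumeration over a small domain; the representation as Python lambdas and the domain size are assumptions for illustration, where a real implementation would hand the conditions to an SMT solver.

```python
from itertools import combinations, product

# Summary of the password check: one (c_i, f_i) pair per symbolic path.
summary = [
    (lambda a, b: a == b, lambda a, b: True),   # c_1, f_1
    (lambda a, b: a != b, lambda a, b: False),  # c_2, f_2
]


def evaluate(summary, a, b):
    """Value of O for concrete inputs: f_i of the one path whose c_i holds."""
    matches = [f(a, b) for c, f in summary if c(a, b)]
    assert len(matches) == 1, "path conditions must be disjoint and exhaustive"
    return matches[0]


def pairwise_disjoint(summary, values):
    """Check c_i and c_j never hold together for i != j (on a finite domain)."""
    return all(
        not (ci(a, b) and cj(a, b))
        for (ci, _), (cj, _) in combinations(summary, 2)
        for a, b in product(values, repeat=2)
    )


disjoint = pairwise_disjoint(summary, range(4))
outputs = [evaluate(summary, a, 2) for a in range(4)]
print(disjoint, outputs)
```

Because the program is deterministic, exactly one path condition holds for any concrete input, so the summary behaves like a function from (α, β) to O.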
Self-composition as Path-equivalence | Path-equivalence generation | Implementation
Trace-equivalence
Self-composition in Hoare logic
$$\{L = L_1\}\ P;\ P_1\ \{O = O_1\}$$
Interpret in trace semantics:
Self-composition as Trace-equivalence
$$\forall \rho \in R,\ \rho_1 \in R_1.\ L|_{init(\rho)} = L_1|_{init(\rho_1)} \to O|_{fin(\rho)} = O_1|_{fin(\rho_1)}$$
→ impossible to enumerate all traces.
→ need an abstract interpretation: Symbolic Execution.
Path-equivalence
Self-composition in Hoare logic
$$\{L = L_1\}\ P;\ P_1\ \{O = O_1\}$$
Interpret in symbolic semantics:
Self-composition as Path-equivalence
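$$\forall \rho^s \in R^s,\ \rho^s_1 \in R^s_1.\ \left(L|_{init(\rho^s)} = L_1|_{init(\rho^s_1)}\right) \wedge path(\rho^s) \wedge path(\rho^s_1) \to \left(O|_{fin(\rho^s)} = O_1|_{fin(\rho^s_1)}\right)$$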
Path-equivalence generation
Symbolic Execution
$$H|_{init(\rho^s)} = \alpha;\quad L|_{init(\rho^s)} = \beta;\quad H_1|_{init(\rho^s_1)} = \alpha_1;\quad L_1|_{init(\rho^s_1)} = \beta$$
Path-equivalence generation
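$$PE \equiv DF \wedge IF$$
where:
$$DF \equiv \bigwedge_{i=1}^{n} \left( c_i(\alpha, \beta) \wedge c_i(\alpha_1, \beta) \to \left(f_i(\alpha, \beta) = f_i(\alpha_1, \beta)\right) \right)$$
$$IF \equiv \bigwedge_{i=1}^{n-1} \bigwedge_{j=i+1}^{n} \left( c_i(\alpha, \beta) \wedge c_j(\alpha_1, \beta) \to \left(f_i(\alpha, \beta) = f_j(\alpha_1, \beta)\right) \right)$$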
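The generation step can be sketched as follows: given a summary as a list of (c_i, f_i) pairs, build DF and IF and look for an assignment that falsifies PE. This sketch assumes a small finite domain and substitutes brute-force enumeration for the SMT solver used in the talk; the second summary (`parity_of_public`) is a made-up example of a program whose output depends only on the public input.

```python
from itertools import product


def implies(p, q):
    return (not p) or q


def direct_flow(summary, a, a1, b):
    """DF: both copies follow the same path i."""
    return all(
        implies(c(a, b) and c(a1, b), f(a, b) == f(a1, b)) for c, f in summary
    )


def indirect_flow(summary, a, a1, b):
    """IF: the copies follow different paths i < j."""
    n = len(summary)
    return all(
        implies(
            summary[i][0](a, b) and summary[j][0](a1, b),
            summary[i][1](a, b) == summary[j][1](a1, b),
        )
        for i in range(n - 1)
        for j in range(i + 1, n)
    )


def counterexample(summary, values):
    """Return (alpha, alpha1, beta) falsifying PE = DF and IF, or None."""
    for a, a1, b in product(values, repeat=3):
        if not (direct_flow(summary, a, a1, b) and indirect_flow(summary, a, a1, b)):
            return a, a1, b
    return None


# Summaries as lists of (c_i, f_i) pairs over (alpha, beta).
password_check = [
    (lambda a, b: a == b, lambda a, b: True),
    (lambda a, b: a != b, lambda a, b: False),
]
parity_of_public = [
    (lambda a, b: b % 2 == 0, lambda a, b: 0),
    (lambda a, b: b % 2 != 0, lambda a, b: 1),
]

leaky = counterexample(password_check, range(4))
secure = counterexample(parity_of_public, range(4))
print(leaky, secure)
```

A returned triple is a pair of secrets with a shared public input that the observer can tell apart; `None` means PE holds on the enumerated domain. With an SMT solver the same question is posed as satisfiability of the negation of PE over the full input space.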
Path-equivalence generation
The password checking program
$$O = \begin{cases}
true & \text{if } \alpha = \beta \\
false & \text{if } \alpha \neq \beta
\end{cases}$$
Path-equivalence generation
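$$PE \equiv DF \wedge IF$$
where:
$$DF \equiv (\alpha = \beta \wedge \alpha_1 = \beta \to true = true) \wedge (\alpha \neq \beta \wedge \alpha_1 \neq \beta \to false = false)$$
$$IF \equiv \alpha = \beta \wedge \alpha_1 \neq \beta \to true = false$$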
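Simplifying these two formulas shows why the password check is flagged. The derivation below is a worked sketch; the witness values are chosen for illustration and are not from the talk.

```latex
\[
\begin{aligned}
DF &\equiv \top \wedge \top \equiv \top
  && \text{(both consequents are trivially true)} \\
IF &\equiv \neg(\alpha = \beta \wedge \alpha_1 \neq \beta)
  && \text{(the consequent } true = false \text{ is } \bot\text{)} \\
\neg PE &\equiv \alpha = \beta \wedge \alpha_1 \neq \beta
  && \text{satisfied by } \alpha = 0,\ \beta = 0,\ \alpha_1 = 1
\end{aligned}
\]
```

Since the negation of PE is satisfiable, PE is not valid: two runs with the same public input and different secrets can produce different outputs, so the program leaks through the indirect flow.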
Implementation
Tools to use:
Symbolic Execution: Symbolic Pathfinder of NASA
SMT solver: Z3 of Microsoft
Also extended to Quantitative Information Flow.
The project
“Secure Information Flow by Symbolic Execution”
Google Summer of Code 2013: evaluation submitted yesterday.
Mentor organization: NASA’s Java Pathfinder team.
Conclusions
Shift the self-composing step from the source code to the symbolic expressions.
Generate self-composition formula in first-order theories.
Implement on Symbolic Pathfinder and Z3.
THANK YOU FOR YOUR ATTENTION!