Spring 2014 Program Analysis and Verification
Lecture 1: Introduction
Roman Manevich, Ben-Gurion University
Three stories about software unreliability
December 31, 2008
30GB Zunes all over the world fail en masse
Zune bug
    while (days > 365) {
        if (IsLeapYear(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
Zune bug, with the values of December 31, 2008 (days = 366, year = 2008) substituted into the conditions:
    while (366 > 365) {
        if (IsLeapYear(2008)) {
            if (366 > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
The loop condition holds, 2008 is a leap year, and 366 > 366 is false, so neither days nor year changes and the loop never terminates.
Suggested solution: wait for tomorrow
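One possible repair of the loop is sketched below. It is an illustration, not the fix that was actually shipped: the helper names `is_leap_year` and `year_from_days` and the `start_year` parameter are hypothetical. The idea is to leave the loop whenever the remaining day count fits into the current year, so day 366 of a leap year terminates instead of spinning.

```c
#include <stdbool.h>

/* Gregorian leap-year rule. */
bool is_leap_year(int year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Converts a 1-based day count starting at January 1 of start_year
   into a year. On return, *days is the 1-based day within that year.
   Hypothetical helper: name and signature are illustrative only. */
int year_from_days(int start_year, int *days) {
    int year = start_year;
    for (;;) {
        int len = is_leap_year(year) ? 366 : 365;
        if (*days <= len) {
            break;          /* day 366 of a leap year stops here */
        }
        *days -= len;       /* every other iteration makes progress */
        year += 1;
    }
    return year;
}
```

Every iteration either exits or strictly decreases `*days`, which is the termination argument the original loop lacked.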
February 25, 1991
On the night of February 25, 1991, a Patriot missile system operating in Dhahran, Saudi Arabia, failed to track and intercept an incoming Scud. The Iraqi missile struck an army barracks, killing 28 U.S. soldiers and injuring another 98.
Patriot missile failure
Patriot bug: rounding error
• Time is measured in units of 1/10 second
• Binary expansion of 1/10: 0.0001100110011001100110011001100...
• The 24-bit register stores 0.00011001100110011001100
• The error per tick is 0.0000000000000000000000011001100... binary, or ~0.000000095 decimal
• After 100 hours of operation the accumulated error is 0.000000095 × 100 × 3600 × 10 = 0.34 seconds
• A Scud travels at about 1,676 meters per second, and so travels more than half a kilometer in this time
Suggested solution: reboot every 10 hours
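The arithmetic on this slide can be checked with a few lines of C. This is a sketch under the assumption that the stored value is 1/10 chopped to the 23 fractional bits shown on the slide. The function names are hypothetical, and the computation uses double precision only to measure the error, not to model the original hardware.

```c
/* Value of 1/10 after chopping to 23 fractional bits, as on the slide. */
double chopped_tenth(void) {
    const double scale = 8388608.0;               /* 2^23 */
    return (double)(long)(0.1 * scale) / scale;   /* the cast truncates */
}

/* Accumulated clock drift in seconds after `hours` of 0.1 s ticks. */
double drift_seconds(double hours) {
    double error_per_tick = 0.1 - chopped_tenth();
    double ticks = hours * 3600.0 * 10.0;
    return error_per_tick * ticks;
}

/* Distance in meters covered at speed_mps during the drift. */
double miss_distance_m(double hours, double speed_mps) {
    return drift_seconds(hours) * speed_mps;
}
```

Calling `drift_seconds(100.0)` gives roughly a third of a second, and `miss_distance_m(100.0, 1676.0)` gives more than half a kilometer, in line with the figures above.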
August 13, 2003
Billy Gates why do you make this possible ? Stop making money and fix your software!!
(W32.Blaster.Worm)
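Windows exploit(s): buffer overflow
    #include <string.h>
    void foo(char *x) {
        char buf[2];
        strcpy(buf, x);   /* no bounds check: x may be longer than buf */
    }
    int main(int argc, char *argv[]) {
        foo(argv[1]);
    }
    ./a.out abracadabra
    Segmentation fault
[Figure: stack layout for the call to foo, listing in order: previous frame, return address, saved FP, char* x, buf[2]. Labels: "Stack grows this way", "Memory addresses". The input is written into the frame two characters at a time: ab, ra, ca, da, br, …]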
Buffer overrun exploits
    #include <stdio.h>
    #include <string.h>
    int check_authentication(char *password) {
        int auth_flag = 0;
        char password_buffer[16];
        strcpy(password_buffer, password);   /* unchecked copy */
        if (strcmp(password_buffer, "brillig") == 0)
            auth_flag = 1;
        if (strcmp(password_buffer, "outgrabe") == 0)
            auth_flag = 1;
        return auth_flag;
    }
    int main(int argc, char *argv[]) {
        if (check_authentication(argv[1])) {
            printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
            printf(" Access Granted.\n");
            printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
        } else
            printf("\nAccess Denied.\n");
    }
(source: “Hacking: The Art of Exploitation”, 2nd ed.)
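For contrast, here is a minimal sketch of a bounds-checked variant. It is not taken from the book. The name `check_authentication_bounded` is hypothetical, and rejecting over-long input outright is one design choice among several (truncating the copy would be another).

```c
#include <string.h>

/* Returns 1 if the password matches, 0 otherwise.
   Input that does not fit in the buffer is rejected before any copy. */
int check_authentication_bounded(const char *password) {
    char password_buffer[16];

    if (password == NULL || strlen(password) >= sizeof password_buffer) {
        return 0;   /* missing or too long: cannot be a valid password */
    }
    /* The length check above guarantees the copy fits, including '\0'. */
    memcpy(password_buffer, password, strlen(password) + 1);

    return strcmp(password_buffer, "brillig") == 0 ||
           strcmp(password_buffer, "outgrabe") == 0;
}
```

Because the flag is computed only after the length check, an over-long argument can no longer overwrite neighboring stack slots such as `auth_flag` or the return address.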
Example of challenges in correct coding
(In)correct usage of APIs
• Application trend: increasing number of libraries and APIs
– Non-trivial restrictions on permitted sequences of operations
• Typestate: temporal safety properties
– What sequences of operations are permitted on an object?
– Encoded as a DFA
e.g. “Don’t use a Socket unless it is connected”
[Figure: typestate DFA for Socket with states init, connected, closed, and err. Edge labels: connect(), close(), getInputStream(), getOutputStream(), and * (any operation).]
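A typestate property like this one can be written down as a plain transition function. The sketch below is an assumption-laden reading of the figure: the names are hypothetical, and it assumes that every operation not explicitly allowed in a state leads to the error state, which then absorbs all further operations.

```c
typedef enum { ST_INIT, ST_CONNECTED, ST_CLOSED, ST_ERR } SocketState;
typedef enum { OP_CONNECT, OP_GET_INPUT, OP_GET_OUTPUT, OP_CLOSE } SocketOp;

/* One DFA step. Assumption: anything the figure does not allow in a
   state goes to ST_ERR, and ST_ERR is absorbing (the "*" self-loop). */
SocketState socket_step(SocketState s, SocketOp op) {
    switch (s) {
    case ST_INIT:
        return op == OP_CONNECT ? ST_CONNECTED : ST_ERR;
    case ST_CONNECTED:
        if (op == OP_GET_INPUT || op == OP_GET_OUTPUT) return ST_CONNECTED;
        if (op == OP_CLOSE) return ST_CLOSED;
        return ST_ERR;
    case ST_CLOSED:
    case ST_ERR:
    default:
        return ST_ERR;
    }
}

/* Runs a whole operation sequence from ST_INIT; returns the final state. */
SocketState socket_run(const SocketOp *ops, int n) {
    SocketState s = ST_INIT;
    for (int i = 0; i < n; i++) {
        s = socket_step(s, ops[i]);
    }
    return s;
}
```

A sequence is a violation exactly when `socket_run` ends in `ST_ERR`. The hard part for a static analysis is not the automaton but knowing which run-time object each operation in the program text applies to.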
Challenges
    class SocketHolder { Socket s; }
    Socket makeSocket() {
        return new Socket(); // A
    }
    open(Socket l) { l.connect(); }
    talk(Socket s) { s.getOutputStream().write("hello"); }
    main() {
        Set<SocketHolder> set = new HashSet<SocketHolder>();
        while (…) {
            SocketHolder h = new SocketHolder();
            h.s = makeSocket();
            set.add(h);
        }
        for (Iterator<SocketHolder> it = set.iterator(); …) {
            Socket g = it.next().s;
            open(g);
            talk(g);
        }
    }
What are the options?
Testing is not enough
• Observe some program behaviors
• What can you say about other behaviors?
• Concurrency makes things worse
• Smart testing is useful
– requires the techniques that we will see in the course
Static analysis definition
Reason statically (at compile time) about the possible runtime behaviors of a program
“The algorithmic discovery of properties of a program by inspection of its source text” [1] (Manna, Pnueli)
[1] Does not have to literally be the source text; it just means without running the program
Is it at all doable?
    x = ?
    if (x > 0) {
        y = 42;
    } else {
        y = 73;
        foo();
    }
    assert (y == 42);
Bad news: problem is generally undecidable
The gist of static analysis
Goal: exploring program states
[Figure: the initial states, the reachable states, and the bad states]
Problem: cannot cover all reachable states in finite time
Central idea: use approximation
[Figure: within the universe of states, an under-approximation, the exact set of configurations/behaviors, and an over-approximation]
Technique: explore abstract states
[Figure, built up over four slides: initial states, reachable states, bad states]
Sound: cover all reachable states
[Figure: initial states, reachable states, bad states]
Unsound: miss some reachable states
[Figure: initial states, reachable states, bad states]
Imprecise abstraction
[Figure: initial states, reachable states, bad states; label: "False alarms"]
A sound message
    x = ?
    if (x > 0) {
        y = 42;
    } else {
        y = 73;
        foo();
    }
    assert (y == 42);
Assertion may be violated
Precision
• Avoid useless results
    UselessAnalysis(Program p) {
        printf("assertion may be violated\n");
    }
• Low false alarm rate
• Understand where precision is lost
Runtime vs. static analysis
                 Runtime                                Static analysis
Effectiveness    Can miss errors                        Can find rare errors
                 Finds real errors                      Can raise false alarms
Cost             Proportional to program’s execution    Proportional to program’s complexity
                 No need to efficiently handle          Can handle limited classes of programs
                 rare cases                             and still be useful
Static analysis success stories
Static Driver Verifier
[Figure: the Static Driver Verifier takes as input the driver’s source code in C, rules (precise API usage rules written in SLIC), and an environment model; it outputs defects, with 100% path coverage.]
Bill Gates’ Quote
"Things like even software verification, this has been the Holy Grail of computer science for many decades but now in some very key areas, for example, driver verification we’re building tools that can do actual proof about the software and how it works in order to guarantee the reliability." Bill Gates, April 18, 2002. Keynote address at WinHec 2002
The Astrée Static Analyzer
Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, Xavier Rival
ENS, France
Objectives of Astrée
• Prove absence of errors in safety-critical C code
• ASTRÉE was able to prove completely automatically the absence of any run-time error (RTE) in the primary flight control software of the Airbus A340 fly-by-wire system
– a program of 132,000 lines of C analyzed
By Lasse Fuss (Own work) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
About this course
A little about me
• History
– Studied B.Sc., M.Sc., Ph.D. at Tel-Aviv University
   • Research in program analysis with IBM and Microsoft
– Post-doc at UCLA and at UT Austin
– Joined Ben-Gurion University last year
Research interests
• What’s a good algorithm for automatically discovering (with no hints) that a program generates a binary tree where all leaves are connected in a list?
– Combine static analysis with machine learning
• What’s a good algorithm for automatically proving that a parallel program behaves “well”?
• How can we automatically synthesize parallel code that is both correct and efficient?
– Automatic reasoning + planning
Why study program analysis?
• Learn how to use abstraction to deal with intractable (usually undecidable) problems
• Understand how to systematically
– Design compiler optimizations
– Reason about correctness / find bugs (security)
• Some techniques may be applied in other domains
– Computational learning
– Analysis of biological systems
What do you get in this course?
• Learn basic principles of static analysis
– Understand jargon/papers
• Learn a few advanced techniques
– Some principled way of developing analysis
– Develop one in a small-scale project
• Put to practice what you learned in logic, automata, data structures, and programming
My role
• Teach you theory and practice
• Teach you how to think of new techniques
• E-mail: [email protected]
• Skype: rumster
• Office hours: Wednesday 12:00-14:00
• Course web-page
– Announcements
– Forum
– Slides + lecture recordings
Requirements
1. Attend classes (don’t miss more than 5)
2. 50%: 4-5 theoretical assignments and programming assignments
3. 50%: Exam
How to succeed in this course
• Make sure you understand the material in class
– Engage by asking questions and raising ideas
• Be on top of assignments
– Submit on time
– Don’t get stuck or give up on exercises: get help, ask me
– Don’t start working on assignments the day before
• Be ethical
Joe (a day before assignment deadline): “I don’t really understand what you want from me in this assignment, can you help me/extend the deadline?”
More about Static analysis
The static analysis approach
• Formalize software behavior in a mathematical model (semantics)
• Prove properties of the mathematical model
– Automatically, typically with approximation of the formal semantics
• Develop theory and tools for program correctness and robustness
Kinds of static analysis
• Spans a wide range
– type checking … up to full functional verification
• General safety specifications
• Security properties (e.g., information flow)
• Concurrency correctness conditions (e.g., absence of data races, absence of deadlocks, atomicity)
• Correct usage of libraries (e.g., typestate)
• Underapproximations useful for bug-finding, test-case generation,…
Static analysis techniques
• Abstract Interpretation
• Constraint-based analysis
• Type and effect systems
Static analysis for verification
[Figure: the Analyzer takes a program and a specification and reports either "Valid" or an abstract counterexample.]
Relation to program verification
Static analysis
• Fully automatic
• Applicable to a programming language
• Can be very imprecise
• May yield false alarms
Program verification
• Requires specification and loop invariants
• Program specific
• Relatively complete
• Provides counterexamples
• Provides useful documentation
• Can be mechanized using theorem provers
Verification challenge
    main(int i) {
        int x = 3, y = 1;
        do {
            y = y + 1;
        } while (--i > 0);
        assert 0 < x + y;
    }
Determine what states can arise during any execution
Challenge: set of states is unbounded
Abstract Interpretation
Determine what states can arise during any execution of the example program from “Verification challenge”
Challenge: set of states is unbounded
Solution: compute a bounded representation of (a superset of) the program states
Recipe
1) Abstraction
2) Transformers
3) Exploration
1) Abstraction
• concrete state $\sigma : \mathit{Var} \to \mathbb{Z}$
• abstract state (sign) $\sigma^\# : \mathit{Var} \to \{+, 0, -, ?\}$
Example: the concrete states
    x y i
    3 1 7
    3 2 6
    …
are represented by the abstract state
    x y i
    + + +
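The sign abstraction of a single variable can be sketched in a few lines of C. The type and function names are hypothetical. The domain has the four values from the slide, with `SIGN_TOP` standing for "?", and the join simply gives up as soon as two different signs meet.

```c
typedef enum { SIGN_NEG, SIGN_ZERO, SIGN_POS, SIGN_TOP } Sign;  /* -, 0, +, ? */

/* Abstraction of a single integer value. */
Sign sign_of(int v) {
    if (v > 0) return SIGN_POS;
    if (v < 0) return SIGN_NEG;
    return SIGN_ZERO;
}

/* Join: the most precise sign that covers both arguments. */
Sign sign_join(Sign a, Sign b) {
    return a == b ? a : SIGN_TOP;
}

/* Abstraction of a non-empty set of values taken by one variable. */
Sign sign_of_set(const int *vals, int n) {
    Sign s = sign_of(vals[0]);
    for (int k = 1; k < n; k++) {
        s = sign_join(s, sign_of(vals[k]));
    }
    return s;
}
```

Applied per variable, the values of y in the two concrete states above (1 and 2) abstract to `SIGN_POS`, while a set containing both -1 and 1 abstracts to `SIGN_TOP`.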
2) Transformers
• concrete transformer for y = y + 1
    (x, y, i):  (3, 1, 0)  →  (3, 2, 0)
• abstract transformer for y = y + 1
    (x, y, i):  (+, +, 0)  →  (+, +, 0)
                (+, -, 0)  →  (+, ?, 0)
                (+, 0, 0)  →  (+, +, 0)
                (+, ?, 0)  →  (+, ?, 0)
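The abstract transformer for `y = y + 1` amounts to a small case analysis on the sign of y. The sketch below uses hypothetical names and, like the slide, treats variables as mathematical integers, so machine overflow is ignored.

```c
typedef enum { SIGN_NEG, SIGN_ZERO, SIGN_POS, SIGN_TOP } Sign;  /* -, 0, +, ? */
typedef struct { Sign x, y, i; } AbsState;

/* Abstract effect of adding the constant 1 to a value of sign s. */
Sign sign_plus_one(Sign s) {
    switch (s) {
    case SIGN_POS:  return SIGN_POS;   /* positive + 1 stays positive */
    case SIGN_ZERO: return SIGN_POS;   /* 0 + 1 = 1 */
    case SIGN_NEG:  return SIGN_TOP;   /* -1 + 1 = 0, but -5 + 1 < 0 */
    default:        return SIGN_TOP;   /* unknown stays unknown */
    }
}

/* Abstract transformer for the statement y = y + 1. */
AbsState transform_incr_y(AbsState in) {
    AbsState out = in;                 /* x and i are untouched */
    out.y = sign_plus_one(in.y);
    return out;
}
```

The four cases reproduce the input/output rows listed for the abstract transformer: only a negative or unknown y loses information.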
3) Exploration
[Figure: the example program (int x=3, y=1; do { y = y + 1; } while (--i > 0); assert 0 < x + y) annotated with an abstract state (x, y, i) at each program point: ? ? ? once and + + ? at all the other points.]
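Exploration means applying the abstract transformers until the abstract states stop changing. The following is a minimal sketch for this one loop, not a general analyzer: the names are hypothetical, the loop body is hard-coded, and the sign of i after `--i` is conservatively set to "?".

```c
#include <stdbool.h>

typedef enum { SIGN_NEG, SIGN_ZERO, SIGN_POS, SIGN_TOP } Sign;  /* -, 0, +, ? */
typedef struct { Sign x, y, i; } AbsState;

Sign sign_join(Sign a, Sign b) { return a == b ? a : SIGN_TOP; }

AbsState state_join(AbsState a, AbsState b) {
    AbsState r = { sign_join(a.x, b.x), sign_join(a.y, b.y), sign_join(a.i, b.i) };
    return r;
}

bool state_eq(AbsState a, AbsState b) {
    return a.x == b.x && a.y == b.y && a.i == b.i;
}

/* Abstract effect of one iteration: y = y + 1; --i. */
AbsState loop_body(AbsState s) {
    if (s.y == SIGN_ZERO) s.y = SIGN_POS;       /* 0 + 1 > 0 */
    else if (s.y == SIGN_NEG) s.y = SIGN_TOP;   /* may reach 0 or stay negative */
    s.i = SIGN_TOP;                             /* sign after --i is not tracked */
    return s;
}

/* Iterates the loop-head state to a fixed point; returns the exit state. */
AbsState explore(void) {
    AbsState head = { SIGN_POS, SIGN_POS, SIGN_TOP };  /* after x=3, y=1; i unknown */
    for (;;) {
        AbsState next = state_join(head, loop_body(head));
        if (state_eq(next, head)) break;               /* nothing new: fixed point */
        head = next;
    }
    return loop_body(head);   /* do-while: the body runs before every exit */
}

/* 0 < x + y certainly holds when both x and y are known to be positive. */
bool assertion_proved(AbsState s) {
    return s.x == SIGN_POS && s.y == SIGN_POS;
}
```

The iteration must terminate because each variable can only move upward in a four-element domain. Here it stabilizes at once with x and y positive, so `assertion_proved(explore())` holds.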
Incompleteness
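    main(int i) {
        int x = 3, y = 1;
        do {
            y = y - 2;
            y = y + 3;
        } while (--i > 0);
        assert 0 < x + y;
    }
[Figure: abstract states (x, y, i) at the program points: ? ? ? once, + + ? once, and + ? ? at all the other points.]
The assertion always holds, since each iteration increases y by 1, but in the sign abstraction y = y - 2 maps + to ?, so the analysis cannot prove it.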
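The loss of precision can be reproduced by composing the two abstract transformers of the loop body. This is a sketch with hypothetical names. It only tracks the sign of y, and it treats variables as mathematical integers.

```c
typedef enum { SIGN_NEG, SIGN_ZERO, SIGN_POS, SIGN_TOP } Sign;  /* -, 0, +, ? */

/* Abstract effect of y = y - 2 on the sign of y. */
Sign sign_minus_two(Sign s) {
    switch (s) {
    case SIGN_NEG:  return SIGN_NEG;   /* negative - 2 stays negative */
    case SIGN_ZERO: return SIGN_NEG;   /* 0 - 2 < 0 */
    default:        return SIGN_TOP;   /* positive - 2 may be +, 0, or - */
    }
}

/* Abstract effect of y = y + 3 on the sign of y. */
Sign sign_plus_three(Sign s) {
    switch (s) {
    case SIGN_POS:  return SIGN_POS;   /* positive + 3 stays positive */
    case SIGN_ZERO: return SIGN_POS;   /* 0 + 3 > 0 */
    default:        return SIGN_TOP;   /* negative + 3 may be +, 0, or - */
    }
}

/* Sign of y after the loop body: y = y - 2; y = y + 3; */
Sign body_effect_on_y(Sign y) {
    return sign_plus_three(sign_minus_two(y));
}
```

Starting from a positive y, `body_effect_on_y` returns `SIGN_TOP`, even though concretely the body is just y = y + 1. The intermediate "?" cannot be recovered, which is why the analysis is sound but incomplete.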
Parity abstraction
Challenge: how to find “the right” abstraction
    while (x != 1) do {
        if (x % 2 == 0) {
            x := x / 2;
        } else {
            x := x * 3 + 1;
            assert (x % 2 == 0);
        }
    }
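A parity domain is enough to prove the assertion in the else branch. The sketch below uses hypothetical names and a three-valued domain (even, odd, unknown). It treats variables as mathematical integers.

```c
typedef enum { PAR_EVEN, PAR_ODD, PAR_TOP } Parity;

/* Parity of a product. */
Parity par_mul(Parity a, Parity b) {
    if (a == PAR_EVEN || b == PAR_EVEN) return PAR_EVEN;  /* even * anything is even */
    if (a == PAR_ODD && b == PAR_ODD) return PAR_ODD;     /* odd * odd is odd */
    return PAR_TOP;
}

/* Parity of a sum. */
Parity par_add(Parity a, Parity b) {
    if (a == PAR_TOP || b == PAR_TOP) return PAR_TOP;
    return a == b ? PAR_EVEN : PAR_ODD;   /* equal parities sum to even */
}

/* Abstract effect of x := x * 3 + 1. */
Parity par_triple_plus_one(Parity x) {
    return par_add(par_mul(x, PAR_ODD), PAR_ODD);   /* 3 and 1 are both odd */
}
```

In the else branch the test `x % 2 == 0` has failed, so x is odd there, and `par_triple_plus_one(PAR_ODD)` is `PAR_EVEN`. The sign domain from the earlier slides could not express this fact at all.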
How to find “the right” abstraction?
• Pick an abstract domain suited for your property
– Numerical domains
– Domains for reasoning about the heap
– …
• Combination of abstract domains
• Another approach
– Abstraction refinement
Following the recipe (in a nutshell)
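1) Abstraction
[Figure: a concrete state and an abstract state, each drawn as a list of heap nodes linked by n pointers and referenced by the variables x and t]
2) Transformers
[Figure: effect of the statement t->n = x on a heap of nodes referenced by x and t]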
Example: shape (heap) analysis
    void stack_init(int i) {
        Node* x = null;
        do {
            Node* t = malloc(…);
            t->n = x;
            x = t;
        } while (--i > 0);
        Top = x;
    }
    assert(acyclic(Top))
[Figure: heap shapes at the program points of stack_init, starting from the empty heap (emp) and growing into lists of nodes linked by n pointers and referenced by t, x, and Top]
3) Exploration
[Figure: the same program annotated with the abstract heap states over t, x, and Top that the exploration reaches]
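Abstraction refinement
Change the abstraction to match the program
[Figure: refinement loop. Verify takes the program, the specification, and the current abstraction, and reports either "Valid" or an abstract counterexample; Abstraction Refinement turns an abstract counterexample into either a counterexample or a new abstraction.]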
Recap: program analysis
• Reason statically (at compile time) about the possible runtime behaviors of a program
• Use sound overapproximation of program behavior
• Abstract interpretation
– abstract domain
– transformers
– exploration (fixed-point computation)
• Finding the right abstraction?
Next lecture: operational semantics of programming languages
References
• Patriot bug:
– http://www.cs.usyd.edu.au/~alum/patriot_bug.html
– Patrick Cousot’s NYU lecture notes
• Zune bug:
– http://www.crunchgear.com/2008/12/31/zune-bug-explained-in-detail/
• Blaster worm:
– http://www.sans.org/security-resources/malwarefaq/w32_blasterworm.php
• Interesting CACM article
– http://cacm.acm.org/magazines/2010/2/69354-a-few-billion-lines-of-code-later/fulltext
• Interesting blog post
– http://www.altdevblogaday.com/2011/12/24/static-code-analysis/