tvla: a system for generating abstract interpreters
DESCRIPTION
TVLA: A System for Generating Abstract Interpreters. T. Lev-Ami, R. Manevich, M. Sagiv http://www.cs.tau.ac.il/~tvla. A. Loginov, G. Ramalingam, E. Yahav. T. Reps & R. Wilhelm. Object Oriented Programs. Extensive use of the heap Dynamic allocation Recursive data structures - PowerPoint PPT PresentationTRANSCRIPT
T. Lev-Ami, R. Manevich, M. Sagiv
http://www.cs.tau.ac.il/~tvla
TVLA: A System for Generating
Abstract Interpreters
A. Loginov, G. Ramalingam, E. Yahav
T. Reps & R. Wilhelm
Object Oriented Programs
• Extensive use of the heap
• Dynamic allocation
• Recursive data structures
• Destructive reference updates• x.f := y
Verification of OO Programs
• Verify properties about the heap
• No runtime errors– Null dereferences– Freed memory– Dangling pointers
• No memory leaks
• Partial correctness
A Common Approach to Pointer Analysis in Software Verification
• Break verification problem into two phases
– Preprocessing phase – separate points-to analysis• typically no distinction between objects allocated at same site
– Verification phase – verification using points-to results• May lose precision
– May lose ability to perform strong updates– May produce false alarms
Program
Property
Pointer Analysis
VerificationAnalysis
Loss of Precision in Two-Phase Approach
f = new InputStream();
f.read();
f.close();
Verify f is not readafter it is closed
Straightforward ...
Loss of Precision in Two-Phase Approach
while (?) {
f = new InputStream();
f.read();
f.close();
}
f1f
False alarm!“read may be erroneous”
closed=1/2
The TVLA Approach
Express concrete semantics using first order logic
Abstract Interpretation using 3-valued Kleene’s logic1. verification integrated with heap analysis
heap analysis (merging criterion) can adapt to verification
Program
Property
Frontend
First-OrderTransition Sys
AbstractInterpretation
while (?) {
f = new InputStream();
f.read();
f.close();
}
The TVLA Approach: An Example
f
closedf
closedf closed
closedf closed closed
…
Concrete States
f
closedf
closedf
Abstract States
Outline
• Expressing concrete semantics using logic (TVLA input)
• Canonincal abstraction
• Abstract interpretation using Canonincal abstraction (TVLA engine)
• Applications & ongoing Work
Traditional Heap Interpretation• States = Two level stores
– env: Var Values– fields: Loc Values– Values=Loc Atoms
• Example – env = [x 30, p 79]– next = [30 40, 40 50, 50 79, 79 90]– val = [30 1, 40 2, 50 3, 79 4, 90 5]
1 40 2 50 3 79 4 90 5 0 x
p
30 40 50 79 90
Predicate Logic• Vocabulary
– A finite set of predicate symbols Peach with a fixed arity
• Logical Structures S provide meaning for predicates – A set of individuals (nodes) U– pS: (US)k {0, 1}
• FOTC over TC, express logical structure properties
Representing Stores as Logical Structures
• Locations Individuals• Program variables Unary predicates• Fields Binary predicates• Example
– U = {u1, u2, u3, u4, u5}– x = {u1}, p = {u3}– next = {<u1, u2>, <u2, u3>, <u3, u4>, <u4, u5>}
u1 u2 u3 u4 u5xn n n n
p
Concrete Interpretation Rules (Lists)Statement Safety Precondition Postcondition
x =NULL 1 v:x’(v)
x=malloc() v0: active(v0) v: x’(v) eq(v, v0)
v: active’(v) active(v) eq(v,v0)
x=y 1 v:x’(v) y(v)
x=y next v0: y(v0) active(v0) v:x’(v) next(v0, v)
x next=y v0: x(v0) active(v0) v1,v2: next’(v1, v2) ( eq(v1, v0 ) next(v1, v2))
(eq(v1, v0) y(v2))
Invariants• No memory leaks
v: active(v) {x PVar} v1: x(v1) next*(v1, v)• Acyclic list(x)
v1,v2: x(v1) next*(v1, v2) next+(v2, v1)• Reverse (x)
v1,v2, v3: x(v1) next*(v1, v2) next(v2, v3) next’(v3, v2)
• 1: True
• 0: False
• 1/2: Unknown
• A join semi-lattice: 0 1 = 1/2
Kleene 3-Valued Logic
1/2 Information
order
Logical order
Boolean Connectives [Kleene] 0 1/2 1
0 0 0 01/2 0 1/2 1/21 0 1/2 1
0 1/2 1
0 0 1/2 11/2 1/2 1/2 11 1 1 1
3-Valued Logical Structures
• A set of individuals (nodes) U
• Predicate meaning– pS: (US)k {0, 1, 1/2}
Canonical Abstraction• Partition the individuals into equivalence classes based on the values of
their unary predicates
– Every individual is mapped into its equivalence class
• Collapse predicates via
– pS (u’1, ..., u’k) = {pB (u1, ..., uk) | f(u1)=u’1, ..., f(u’k)=u’k) }
• At most 2A abstract individuals
Canonical Abstraction
x = NULL;
while (…) do {
t = malloc();
t next=x;
x = t
}
u1x
t
u2 u3
u1x
t
u2,3
n n
n
n
u1x
t
u2 u3n n
x
t
n nu2u1 u3
Canonical Abstraction
x = NULL;
while (…) do {
t = malloc();
t next=x;
x = t
} u1x
t
u2,3n
n
n
Canonical Abstraction and Equality
• Summary nodes may represent more than one element
• (In)equality need not be preserved under abstraction
• Explicitly record equality
• Summary nodes are nodes with eq(u, u)=1/2
Canonical Abstraction and Equality
x = NULL;
while (…) do {
t = malloc();
t next=x;
x = t
}
u1x
t
u2 u3
u1x
t
u2,3
eq
eq
eq
n n
n
n
eq eq
eq
eq
eq
eqeq
u2,3
Canonical Abstraction
x = NULL;
while (…) do {
t = malloc();
t next=x;
x = t
}
u1x
t
u2 u3n n
u1x
t
u2,3n
n
Heap Sharing predicate
is(v)=0
u1x
t
u2 un…
u1x
t
u2..n
n
n
v:is(v) v1,v2: next(v1,v) next(v2,v) eq(v1, v2)
is(v)=0 is(v)=0
is(v)=0 is(v)=0
n n n
Heap Sharing predicate
is(v)=0
u1x
t
u2 un…
is(v)=1 is(v)=0
n n
n
n
u1x
t
u2n
is(v)=0 is(v)=1 is(v)=0
n
u3..n
n
n
v:is(v) v1,v2: next(v1,v) next(v2,v) eq(v1, v2)
Best Conservative Interpretation (CC79)
Abstraction
ConcretizationConcrete Representati
on
Collecting Interpretation
stc
ConcreteRepresentati
on
AbstractRepresentati
on
Abstract Representati
on
Abstract Interpretation
st#
yx
yx ...
Evaluateupdateformulas
y
x
y
x
...
inverse embedding
y
x
y
xCanonincalCanonincal abstraction
xy
“Focus”- Based Transformer (x = x n)
“Focus”-Based Transformer (x = x n)
y
x
y
x
EvaluateupdateFormulas (Kleene)
y
x
y
xCanonincal
yx
yx
Focus(x n)
“Partial ”
xy
Example: Abstract Interpretation
t
x
n
x
t n
x
tn n
xtt
x
ntt
nt
x
t
x
t
xempty
x
tn
n
x
tn
n
n
x
tn
t n
xn
x
tn
nreturn x
x = t
t =malloc(..);
tnext=x;
x = NULL
TF
Cleanness Analysis of Linked Lists (Nurit Dor, SAS 2000)
Static analysis of C programs manipulating linked lists
Defects checked References to freed memory Null dereferences Memory leaks
Existing algorithms are inadequate Miss errors Report too many false alarms
Dereference of NULL pointers
typedef struct element {
int value; struct element *next; } Elements
bool search(int value, Elements *c) {Elements *elem;for (elem = c;
c != NULL; elem = elem->next;)
if (elem->val == value)
return TRUE;return FALSE
Dereference of NULL pointers
typedef struct element {
int value; struct element *next; } Elements
bool search(int value, Elements *c) {Elements *elem;for (elem = c;
c != NULL; elem = elem->next;)
if (elem->val == value)
return TRUE;return FALSE
potential null dereference
Dereference of NULL pointers
typedef struct element {
int value; struct element *next; } Elements
bool search(int value, Elements *c) {Elements *elem;for (elem = c;
elem != NULL; elem = elem->next;)
if (elem->val == value)
return TRUE;return FALSE
potential null dereference
Memory leakage
Elements* reverse(Elements *c){Elements *h,*g;h = NULL;while (c!= NULL) {
g = c->next;h = c;c->next = h;c = g;
}return h;
typedef struct element {
int value; struct element *next; } Elements
Memory leakage
Elements* reverse(Elements *c){Elements *h,*g;h = NULL;while (c!= NULL) {
g = c->next;h = c;c->next = h;c = g;
}return h;
leakage of address pointed-by h
typedef struct element {
int value; struct element *next; } Elements
Memory leakage
Elements* reverse(Elements *c){Elements *h,*g;h = NULL;while (c!= NULL) {
g = c->next;h = c;c->next = h;c = g;
}return h;
typedef struct element {
int value; struct element *next; } Elements
✔ No memory leaks
Proving Correctness of Sorting Implementations (Lev-Ami, Reps, S, Wilhelm ISSTA 2000)
Partial correctness The elements are sorted The list is a permutation of the original list
Termination At every loop iterations the set of elements
reachable from the head is decreased
Example: InsertSortList InsertSort(List x) { List r, pr, rn, l, pl; r = x; pr = NULL; while (r != NULL) { l = x; rn = r n; pl = NULL; while (l != r) { if (l data > r data) { pr n = rn; r n = l; if (pl = = NULL) x = r; else pl n = r; r = pr; break; } pl = l; l = l n; } pr = r; r = rn; } return x; }
typedef struct list_cell { int data; struct list_cell *n;} *List;
assert sorted[x];
assert permutation[x,x’]
v:inOrder[dle, n](v) v1: next(v, v1) dle(v,v1)
Example: InsertSort
Run Demo
List InsertSort(List x) { if (x == NULL) return NULL pr = x; r = x->n; while (r != NULL) {
pl = x; rn = r->n; l = x->n; while (l != r) {
pr->n = rn ; r->n = l;
pl->n = r; r = pr; break; }
pl = l; l = l->n;
} pr = r; r = rn;
}
typedef struct list_cell { int data; struct list_cell *n;} *List;
14
Example: Mark and Sweep
void Sweep() { unexplored = Universe collected = while (unexplored ) { x = SelectAndRemove(unexplored) if (x marked) collected = collected {x} } assert(collected = = Universe – Reachset(root) )}
void Mark(Node root) { if (root != NULL) { pending = pending = pending {root} marked = while (pending ) { x = SelectAndRemove(pending) marked = marked {x} t = x left if (t NULL) if (t marked) pending = pending {t} t = x right if (t NULL) if (t marked) pending = pending {t} } } assert(marked = = Reachset(root))}
Challenges: Heap & Concurrency[Yahav POPL’01]
Concurrency with the heap is evil… Java threads are just heap allocated objectsData and control are strongly related
Thread-scheduling info may require understanding of heap structure (e.g., scheduling queue)
Heap analysis requires information about thread scheduling
Thread t1 = new Thread();Thread t2 = new Thread();…t = t1;…t.start();
Configurations – Example
at[l_C]
rval[myLock]
held_by
at[l_1]rval[myLock]
at[l_0]at[l_0]at[l_1]
rval[myLock]
blocked
l_0: while (true) {l_1: synchronized(myLock) {l_C: // critical sectionl_2: } l_3: }
Concrete Configuration
at[l_C]
rval[myLock]
held_by
at[l_1]rval[myLock]
at[l_0]at[l_0]
at[l_1]
rval[myLock]
blocked
Examples VerifiedProgram Property
twoLock Q No interferenceNo memory leaksPartial correctness
Producer/consumer No interferenceNo memory leaks
ApprenticeChallenge
Counter increasing
Dining philosophers with resource ordering
Absence of deadlock
Mutex Mutual exclusion
Web Server No interference
Compile-Time GC for Java(Ran Shaham, SAS’03)
The compiler can issue free when objects are no longer needed
Analysis of Java/JavaCard programsGeneric Java Bytecode FrontendPrecise analysis but expensive (cost of
procedures)
More Scalable Solution[Gilad Arnold]
Combine backward and forward analysis Forward analysis determine the shape Backward analysis locates heap liveness Use
E. YahavSchool of Computer Science
Tel-Aviv University
Verifying Safety Propertiesusing Separation
and Heterogeneous AbstractionsPLDI’04
G. RamalingamIBM T.J. Watson Research Center
Outline of Separation
Decompose verification problem into a set of subproblems
Adapt abstraction to each subproblem
Outline of Separation
Decompose verification problem into a set of subproblems• Analysis-user specifies a separation strategy
Adapt abstraction to each subproblem Heterogeneous abstraction
Prototype Implementation
Implemented over TVLACorrectness comes from the embedding theoremApplied to several example programs
Up to 5000 lines of Java
Used to verify Absence of concurrent modification exception (CME) JDBC API conformance IOStreams API conformance
Improved performance In some cases improved precision
Analysis Times
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
ISPath InputStream5 InputStream5b inputStream6JDBCExample
JDBCExample 2
db KernelBenchmark 1
SQLExecutor
Benchmark
Tim
e (s
ec)
vanilla
single
Ongoing Work
Temporal properties [Yahav, ESOP’03]Assume guarantee reasoning [Yorsh, VMCAI’04,
TACAS’04]Correctness of collection implementation
[Livshits]Automatic refinement [Loginov, ESOP’03] Integrating numeric domains [Gopan, TACAS’04]Procedure summaries [Jeannet, SAS’04]Sclalablity [Manevich SAS’04, Lev-Ami]Heap modularity [Bauer & Rinetzky]
Conclusion
FOTC is expressive for defining semanticsTVLA is a very useful research tool
Try before you publish Easy to try new algorithms
Sometimes faster than existing shape analyzers
But has high interpretation overhead Currently improved