Modular Static Analysis with Sets and Relations for Verifying Data Structure Consistency
Viktor Kuncak
Computer Science and Artificial Intelligence Lab
MIT
Joint work with: Martin Rinard, Andreas Podelski, Daniel Jackson, Patrick Lam, Thomas Wies, Karen Zee, Huu Hai Nguyen, Peter Schmitt, Suhabe Bugrara
Program analysis and verification
Discover/verify properties of software systems
Practical relevance: programmer productivity
– performance: compiler optimizations
– reliability: discovering and preventing errors
– maintainability: understanding code
Broader implications
– automated analysis of formal artifacts (implications for XML documents, formal proofs)
Spectrum of analysis techniques
Broad research area, many dimensions
– bug finding versus bug prevention
– control-intensive versus data-intensive systems
– generic versus application-specific properties
Original ideal: full program verification
Reality: verify partial correctness properties
– success story: type systems
– active area: temporal properties (typestate)
– trend: towards complex properties
Data structure consistency properties
[Figure: example data structures and their consistency properties]
– doubly-linked list (fields next, prev; entry point root): acyclicity of next; x.next.prev = x
– list with entry point first and a size field (size = 3): size field is consistent with the number of stored objects
– binary tree (fields left, right): graph is a tree
Shape not given by types, but by structural properties; may change over time
Unbounded number of objects, dynamically allocated
class Node { Node f1, f2; }
Inconsistent data structures
Can cause program crashes
Looping
[Figure: lists whose next and prev links are inconsistent]
Unexpected outcome of operations
– removing two instead of one element
internal consistency
External data structure consistency
If a person has borrowed a book, then
– person is registered with library, and
– book is in the catalog
[Figure: object model with a borrows relation between Person and Book, with multiplicities [0..4] and [0..1]]
A person can borrow at most 4 books at a time
Two persons cannot borrow the same book
– correlate different data structures
– global
– meaningful to users of the system
– capture design constraints (object models)
– inconsistency can lead to policy violations
– relies on internal consistency to be even meaningful
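The library constraints above can be phrased as checks over plain sets and one relation. The following is a minimal run-time sketch in Python, not the static analysis the talk describes; the function name `check_library`, the parameter names, and the sample data are all hypothetical.

```python
from collections import Counter

def check_library(registered, catalog, borrows, max_books=4):
    """Return the violated external consistency constraints, sorted.

    borrows is a set of (person, book) pairs; all names are illustrative.
    """
    violations = []
    for person, book in borrows:
        if person not in registered:
            violations.append(f"{person} is not registered")
        if book not in catalog:
            violations.append(f"{book} is not in the catalog")
    # A person can borrow at most max_books books at a time.
    per_person = Counter(person for person, _ in borrows)
    for person, n in per_person.items():
        if n > max_books:
            violations.append(f"{person} borrows {n} books")
    # Two persons cannot borrow the same book.
    per_book = Counter(book for _, book in borrows)
    for book, n in per_book.items():
        if n > 1:
            violations.append(f"{book} is borrowed by {n} persons")
    return sorted(violations)

ok = check_library({"ann", "bob"}, {"b1", "b2"},
                   {("ann", "b1"), ("bob", "b2")})
bad = check_library({"ann"}, {"b1"},
                    {("ann", "b1"), ("bob", "b1")})
```

The checks only make sense if the sets themselves are well-formed, which is the point of the last bullet: external consistency relies on internal consistency.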
Goal
Prove data structure consistency
– for all program executions (sound)
– with high level of automation
– both internal and external consistency
– both implementation and use of data structures
Using static analysis to enforce data structure consistency
[Figure: a static analyzer takes the source code of a program and consistency properties (for example, x.next.prev = x) and answers either "data structures are consistent" or "error in program!"]
Example source code:
. . .
proc remove(x : Node) {
  Node p=x.prev; n=x.next;
  if (p!=null) p.next = n; else root = n;
  if (n!=null) n.prev = p;
}
. . .
Challenges in verifying consistency
– complex, heterogeneous data structures, in the context of application; developer-defined properties
– precision
– no single approach will work
– communication with developers
– scalability
Outline
Goal: verify data structure consistency
Our approach through an example
Bohne: one of the analyses in our system
Current status and ongoing work
Future work
Example: Minesweeper Game
Analyzed using our system (based on Java version)
(actual screenshot)
Minesweeper game data structures
[Figure: Cell objects with boolean fields init and isExposed, linked by next and prev]
Minesweeper consistency properties
[Figure: Cell objects with boolean fields init and isExposed, linked by next and prev]
– next is acyclic
– prev is inverse of next
– object is in hidden cells list iff its init flag is true and its isExposed flag is false
Formalization as an invariant
object is in hidden cells list iff its init flag is true and its isExposed flag is false
Invariant: expression that is true whenever program reaches certain points
{x | next*(HiddenListRoot,x)} = {x | x.init & !x.isExposed}
Complex consistency properties
Difficulties
– need to track exact reachability properties
– correlate linking information with stored data
Need a way to deal with complexity
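To make the two sides of the hidden-cells invariant concrete, here is a small Python sketch that computes both sets on a concrete heap and compares them. It is a run-time check on one heap, not the static proof the talk is about; the class `Cell` and the helper names are assumptions made for the example.

```python
class Cell:
    def __init__(self, init=True, is_exposed=False):
        self.init = init
        self.is_exposed = is_exposed
        self.next = None

def reachable(root):
    """{x | next*(root, x)}: cells reachable from root via next (null excluded)."""
    seen, x = set(), root
    while x is not None and x not in seen:
        seen.add(x)
        x = x.next
    return seen

def invariant_holds(hidden_list_root, all_cells):
    list_content = reachable(hidden_list_root)
    unexposed = {x for x in all_cells if x.init and not x.is_exposed}
    return list_content == unexposed

a, b, c = Cell(), Cell(), Cell(is_exposed=True)
a.next = b                                  # hidden list: a -> b
cells = {a, b, c}
holds_before = invariant_holds(a, cells)    # c is exposed and not in the list
b.is_exposed = True                         # flag changed, list not updated
holds_after = invariant_holds(a, cells)
```

The second check fails because the flag and the list were updated independently, which is exactly the correlation between linking information and stored data that makes the property hard to track.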
Towards factoring out complexity
ListContent = {x | next*(HiddenListRoot,x)}
UnexposedCells = {x | x.init & !x.isExposed}
ListContent = UnexposedCells
How to enable such reasoning in our program?
– abstract reasoning in terms of sets
Minesweeper source code
record Cell {
  init : bool;
  isExposed : bool;
  next, prev : Cell;
}
proc expose(c : Cell) { remove(c); setFlag(c); }
proc setFlag(c : Cell) { c.isExposed = true; }
proc remove(c : Cell) {
  Cell p=c.prev; n=c.next;
  if (p!=null) p.next = n; else root = n;
  if (n!=null) n.prev = p;
}
[Figure: Cell objects with fields init, isExposed, next, prev]
Splitting the code into a Board module and a List module
– encapsulate state: each module declares its part of the fields in a partial record Cell { }
– encapsulate operations
– replace implementations (in the analysis only) with set specifications
– resulting property: List.content = Board.UnexpCells
Encapsulating complexity in modules
No need to reason about data structure details!
– can use more scalable analyses
[Figure: the sets content and UnexpCells]
Reasoning in terms of sets
Minesweeper source code
Property: List.content = Board.UnexpCells
Board module
– proc expose(c:Cell) { remove(c); setFlag(c); }
– proc setFlag(c : Cell): UnexpCells’ = UnexpCells - c
List module
– partial record Cell { }
– proc remove(c : Cell): content’ = content - c
Equality is preserved: expose(c) removes c from both sets
Justifying reasoning in terms of sets
Minesweeper source code
Property: List.content = Board.UnexpCells
proc expose(c:Cell) { remove(c); setFlag(c); }
List module
– partial record Cell { }
– specification of proc remove(c : Cell): content’ = content - c
– implementation: proc remove(c : Cell) { ... if (p!=null) p.next = n; ... }
Board module
– specification of proc setFlag(c : Cell): UnexpCells’ = UnexpCells - c
– abstraction: UnexpCells = {x | x.init & !x.isExposed}
– implementation: proc setFlag(c : Cell) { c.isExposed = true; }
Three sections of a module
List module

specification section:
spec module List {
  specvar content : Cell set;
  proc remove(c : Cell)
    requires c in content & c != null
    modifies content
    ensures content’ = content - c;
}

abstraction section:
abst module List {
  content = { x : Cell | next* root x };
  invariant tree [next];
  invariant ALL x y. prev x = y → (x ≠ null ∧ y ≠ null → next y = x);
}

implementation section:
impl module List {
  partial record Cell { next, prev : Cell; }
  var root : Cell;
  proc remove(c : Cell) {
    Cell p=c.prev; n=c.next;
    if (p!=null) p.next = n; else root = n;
    if (n!=null) n.prev = p;
  }
}

content = {x | next*(root, x)}: modularized the invariant!
Showing conformance: use precise analyses, but only inside the List module
Reasoning about List invariants is confined to List module
Verification of List has dual benefits:
• justify analysis of clients
• prove partial correctness of List operations
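The division of a module into specification, abstraction, and implementation can be mimicked at run time. The sketch below is an illustration in Python under stated assumptions: the class `DList`, the helper `add`, and the use of `assert` for requires/ensures are inventions for this example and are not Hob syntax or Hob's checking method, which is static.

```python
class Cell:
    def __init__(self, name):
        self.name, self.next, self.prev = name, None, None

class DList:
    """Doubly-linked list; root, next, prev play the role of the impl section."""
    def __init__(self):
        self.root = None

    def content(self):
        """Abstraction section: content = {x | next*(root, x)} (assumes next is acyclic)."""
        out, x = set(), self.root
        while x is not None:
            out.add(x)
            x = x.next
        return out

    def add(self, c):
        """Hypothetical helper, not shown in the talk: insert c at the front."""
        c.prev, c.next = None, self.root
        if self.root is not None:
            self.root.prev = c
        self.root = c

    def remove_impl(self, c):
        """Implementation section, transcribed from the slide."""
        p, n = c.prev, c.next
        if p is not None:
            p.next = n
        else:
            self.root = n
        if n is not None:
            n.prev = p

    def remove(self, c):
        """Specification section, checked dynamically around remove_impl."""
        old = self.content()
        assert c is not None and c in old        # requires
        self.remove_impl(c)
        assert self.content() == old - {c}       # ensures content' = content - c

lst = DList()
x, y, z = Cell("x"), Cell("y"), Cell("z")
for cell in (x, y, z):
    lst.add(cell)                                # list is now z -> y -> x
lst.remove(y)
names = sorted(cell.name for cell in lst.content())
```

A client that only calls `remove` and reads `content()` never needs to look at `next` or `prev`, which is the sense in which reasoning about List invariants stays confined to the List module.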
Summary of our approach: two steps
[Figure: an application (data structure client) on top of A interface and B interface, which sit on top of A implementation and B implementation]
Reasoning about program in terms of simpler interfaces: scalable analyses
– uses of interfaces
– global consistency
Checking that interfaces reflect implementations and internal consistency is preserved: precise analyses
This approach addresses challenges
[Figure: application (data structure client), A interface and B interface, A implementation and B implementation, annotated with analysis1, analysis2, analysis3]
– scalability: reasoning about program in terms of simpler structures
– precision within data structures: checking that abstract structures reflect implementations
– heterogeneity: multiple analyses
– developers communicate with system via interfaces
Used in manual verification, VDM, ESC/Java as data abstraction
Key question in automating approach (while keeping it useful)
[Figure: application (data structure client), A interface and B interface, A implementation and B implementation, annotated with analysis1, analysis2, analysis3]
How to choose interface language?
Our solution: set algebra
Set algebra as interface language
Useful: express key data structure properties
– disjointness (A ∩ B = ∅), inclusion (A ⊆ B)
– insertion (S’ = S ∪ x), removal (S’ = S \ x)
– conceptual object state
  • initialization, sequencing of API operations
  • symbolic notations for hierarchical state charts
Verifiable: on both sides of set abstraction
– typestate techniques for interface uses
– shape analyses for interface implementations
Two systems based on this insight
[Figure: components of the Hob and Jahob data structure analysis systems, annotated with publication venues]
Components shown:
– flag analysis for high-level properties
– Bohne invariant inference (annotation inference algorithms)
– field constraint analysis
– verification condition generator
– decision procedure dispatcher
– decision procedures and theorem provers: MONA, CVC Lite, Isabelle, Omega solver for linear arithmetic, BAPA
Venues shown: SVV'05, VMCAI'06, CADE'05, JAR, POPL’02, SAS’03, VMCAI’04, VMCAI'05, CC'05, AOSD'05, VSTTE’05
Outline
Goal: verify data structure consistency
Our approach through an example
Bohne: one of the analyses in our system
Current status and ongoing work
Future work
Bohne analysis properties
Analyzes linked data structures
Precisely handles reachability properties
– can define set of elements reachable from root: content = { x | next*(root,x) }
Predictable: based on decision procedures
[Figure: a doubly-linked list (next, prev, root) and a binary tree (left, right)]
Starting point
Verification condition (VC) – a logical formula saying: “If precondition holds at entry, then postcondition holds in the final state, invariants are preserved, and there are no run-time errors”
basic verifier = vcgen + decision procedure
[Figure: the Bohne analysis. The implementation, specification, and abstraction sections provide pre, body, post; the verification condition generator performs a syntactic translation (as in symbolic execution) to the VC; the decision procedure answers valid ("data structures are consistent") or invalid ("error in program!")]
VC: pre → wlp(body, post)
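The formula pre → wlp(body, post) can be illustrated on a toy language. The sketch below is in Python and covers only assignments and sequencing; the statement encoding (`assign`, `seq`) is hypothetical, and validity is checked by brute-force enumeration over a small finite domain, which stands in for the decision procedure and is not how MONA or the Hob verification condition generator work.

```python
from itertools import product

def assign(var, expr):
    """Statement var := expr(state)."""
    return ("assign", var, expr)

def seq(*stmts):
    """Sequential composition of statements."""
    return ("seq", stmts)

def wlp(stmt, post):
    """Weakest liberal precondition of stmt with respect to predicate post."""
    if stmt[0] == "assign":
        _, var, expr = stmt
        # wlp(var := e, Q) is Q evaluated in the updated state.
        return lambda s: post({**s, var: expr(s)})
    # wlp(s1; s2, Q) = wlp(s1, wlp(s2, Q)), folded from right to left.
    for sub in reversed(stmt[1]):
        post = wlp(sub, post)
    return post

def valid(pre, body, post, variables, domain):
    """Check pre -> wlp(body, post) in every state over a finite domain."""
    cond = wlp(body, post)
    for values in product(domain, repeat=len(variables)):
        s = dict(zip(variables, values))
        if pre(s) and not cond(s):
            return False
    return True

# Swap x and y through a temporary t.
swap = seq(assign("t", lambda s: s["x"]),
           assign("x", lambda s: s["y"]),
           assign("y", lambda s: s["t"]))
pre = lambda s: s["x"] == 1 and s["y"] == 2
good = valid(pre, swap, lambda s: s["x"] == 2 and s["y"] == 1,
             ["x", "y", "t"], range(3))
bad = valid(pre, swap, lambda s: s["x"] == 1, ["x", "y", "t"], range(3))
```

Here `wlp` is purely a translation of the program text into a predicate, and all the reasoning effort is pushed into `valid`, mirroring the split "basic verifier = vcgen + decision procedure".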
Decision procedure
Goal: precise reasoning about reachability
Reachability properties in trees are decidable
– Monadic Second-Order Logic over Trees
– existing MONA decision procedure
  • construct a tree automaton for each formula
  • check emptiness of the language of automaton
[Figure: a binary tree with left and right fields]
Using this approach:
– We can analyze implementations of trees
– But only trees. Even parent links would introduce cycles!
Beyond trees
Field constraint analysis
Enables reasoning about non-tree fields
Can handle broader class of data structures
– doubly-linked lists, trees with parent pointers
– skip lists
[Figure: a two-level skip list; the next fields form the tree backbone, the nextSub fields are constrained fields]
Constrained fields satisfy constraint invariant: ALL x y. nextSub(x) = y → next+(x,y)
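As a concrete reading of the constraint on nextSub, the following Python sketch checks it on one small heap. The class `Node`, the attribute name `next_sub`, and the treatment of a null nextSub as trivially satisfying the constraint are assumptions made for the example; the analysis itself reasons about the constraint symbolically.

```python
class Node:
    def __init__(self, key):
        self.key, self.next, self.next_sub = key, None, None

def next_plus(x):
    """Nodes reachable from x by one or more next steps (assumes next is acyclic)."""
    out, y = [], x.next
    while y is not None:
        out.append(y)
        y = y.next
    return out

def field_constraint_holds(nodes):
    """ALL x y. nextSub(x) = y --> next+(x, y), checked on concrete nodes."""
    for x in nodes:
        y = x.next_sub
        # Assumption: a null nextSub is treated as satisfying the constraint.
        if y is not None and not any(y is z for z in next_plus(x)):
            return False
    return True

nodes = [Node(k) for k in (1, 2, 3, 4)]
for a, b in zip(nodes, nodes[1:]):
    a.next = b                       # backbone: 1 -> 2 -> 3 -> 4
nodes[0].next_sub = nodes[2]         # shortcut 1 -> 3 follows the backbone
ok = field_constraint_holds(nodes)
nodes[2].next_sub = nodes[0]         # backward link 3 -> 1 violates the constraint
broken = field_constraint_holds(nodes)
```

Note that the constraint does not say which node nextSub points to, only that the target lies further along the backbone.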
Elimination of constrained fields
Field constraint analysis on top of MONA (VMCAI'06)
[Figure: a skip list; the next fields form the tree backbone, the nextSub fields are constrained fields]
Constrained fields satisfy constraint invariant: ALL x y. nextSub(x) = y → next+(x,y)
VC1(next, nextSub) is translated to VC2(next)
– soundness: if VC2 is valid, then VC1 is valid
– completeness: if VC2 is invalid, then VC1 is invalid (for useful class including preservation of field constraints)
Elimination of constrained fields
[Figure: a skip list; the next fields form the tree backbone, the nextSub fields are constrained fields]
Constrained fields satisfy constraint invariant: ALL x y. nextSub(x) = y → next+(x,y)
Previous approaches
– constraining formula must be deterministic
We allow arbitrary constraint formulas
– fields need not be uniquely given by backbone
Inferring invariants
would need loop invariants
Loop invariant synthesis
Possible states at entry to List.remove(c)
[Figure: several lists of different lengths, with root and c pointing to nodes]
Problem: unbounded number of objects
Solution: partition objects into sets
Partitioning with reachability
Partitioning properties of objects:
– Proot: pointed to by root
– Pc: pointed to by c
– Rc: reachable from c
Group nodes according to whether properties hold
[Figure: an abstract heap with summary nodes Proot, Pc, !Pc & Rc, and !Proot & !Pc & !Rc; an abstract heap represents an unbounded number of concrete heaps]
Corresponding formula:
∀x. ((Proot(x) & !Pc(x) & !Rc(x)) |
     (!Proot(x) & !Pc(x) & !Rc(x)) |
     (!Proot(x) & Pc(x) & Rc(x)) |
     (!Proot(x) & !Pc(x) & Rc(x)))
| ∀x. ((Proot(x) & Pc(x) & !Rc(x)) |
       (!Proot(x) & !Pc(x) & Rc(x)))
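Grouping nodes by which partitioning properties hold can be tried out on concrete lists. The Python sketch below represents an abstract heap as the set of (Proot, Pc, Rc) valuations that occur; the function names and the choice to make Rc reflexive (c is reachable from itself) are assumptions for the example and may differ from the definitions used by Bohne.

```python
class Node:
    def __init__(self):
        self.next = None

def make_list(length):
    """Build a singly-linked list and return its nodes in order."""
    nodes = [Node() for _ in range(length)]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    return nodes

def reachable_from(c):
    """Nodes reachable from c by zero or more next steps (assumes acyclic next)."""
    out = set()
    while c is not None:
        out.add(c)
        c = c.next
    return out

def abstract_heap(nodes, root, c):
    """Set of summary nodes: one (Proot, Pc, Rc) valuation per non-empty group."""
    rc = reachable_from(c)
    return frozenset((x is root, x is c, x in rc) for x in nodes)

short, longer = make_list(4), make_list(6)
h_short = abstract_heap(short, short[0], short[1])
h_long = abstract_heap(longer, longer[0], longer[1])
```

Both lists map to the same three summary nodes, which is how a finite abstract heap stands for an unbounded family of concrete heaps.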
Domain for inferring loop invariants
⋁c ∀x. ⋁b ⋀a Pa(a,b,c)(x)
– ⋀a: a summary node, built from partitioning properties and their negations (Rx, !Rx)
– ∀x. ⋁b: an abstract heap, for example with summary nodes Proot, Pc, !Pc & Rc, and !Proot & !Pc & !Rc
– ⋁c: set of possible abstract heaps at a given program point (C1, C2, C3, C4)
POPL’02: graph-based
SAS’03: undecidability
VMCAI’04: formulas
SAS’05 (Podelski, Wies)
Domain for inferring loop invariants
⋁c ∀x. ⋁b ⋀a Pa(a,b,c)(x)
[Figure: an abstract heap with summary nodes Proot, Pc, !Pc & Rc, and !Proot & Rroot & !Px & !Rx]
Compared to predicate abstraction
⋁b ⋀a Pa(a,b)
– predicates on object x and state, not just state
– enables needed precision and efficiency
Propagating abstract heaps
[Figure: initial heaps propagated through the statements n = c.next and p = c.prev]
Finite state space: explore using a worklist algorithm
How to compute whether a heap F2 is a successor of a heap F1?
Use verification condition generator!
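The worklist exploration of a finite state space can be sketched generically. In the Python sketch below, `explore` and the `successors` callback are hypothetical names; in the setting of the talk the callback would be answered by verification-condition queries over abstract heaps, whereas here a toy function on integers stands in for it.

```python
from collections import deque

def explore(initial, successors):
    """Return all states reachable from initial, visiting each state once."""
    seen = set(initial)
    work = deque(initial)
    while work:
        state = work.popleft()
        for nxt in successors(state):
            if nxt not in seen:      # only new states go back on the worklist
                seen.add(nxt)
                work.append(nxt)
    return seen

# Toy stand-in for the transition relation: doubling modulo 5.
reached = explore([1], lambda s: [(2 * s) % 5])
```

Termination follows from finiteness: each state enters the worklist at most once, so the loop runs at most as many times as there are states.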
Computing transitions
[Figure: the Bohne analysis with invariant synthesis. From F1, a basic block, and F2, the verification condition generator builds F1 → wlp(F2) and passes it to the decision procedure; if the answer is valid, the transition from F1 to F2 is possible]
Making invariant synthesis feasible
Naive algorithm: 2^2n queries
Reducing number of queries
– transform each summary node independently (Cartesian abstraction)
– avoid recomputation
  • precompute abstractions of transitions (generalization of Boolean programs)
  • precompute unsatisfiable conjunctions
  • ‘semantic’ caching of queries
– auxiliary analysis to propagate true conjuncts
Improvements crucial for making analysis feasible
Analyses Developed in Hob
[Figure annotations: speeds of the individual analyses]
– 1 line / sec
– depends on graduate student
– 10 lines / sec
– 100 lines / sec using MONA (but could use SAT)
Outline
Goal: verify data structure consistency
Our approach through an example
Bohne: one of the analyses in our system
Current status
– analyzed programs
– ongoing work
Future work
Minesweeper experience
[Figure: Cell objects with boolean fields init and isExposed, linked by next and prev]
– next is acyclic
– prev is inverse of next
– object is in hidden cells list iff initialized and isExposed is false
Verified properties meaningful to designers and end users
disjoint(Hidden.content, Exposed.content)
“A cell is never both hidden and exposed”– consistency needed to understand the game
! disjoint(Mined.content,Exposed.content) => gameOver
“If a mined cell is exposed, the game is over”– defining property of the game
List with a cursor
spec module IterList {
  specvar Content : Node set;
  specvar Iter : Node set;
  invariant Iter in Content;
  proc remove(n : Node)
    requires n in Content & n != null
    ensures Content’ = Content – n & Iter’ = Iter – n
}
impl module IterList {
  var root, current : Node;
  proc remove(n : Node) {
    if (n==current) { current = current.next; }
    if (n==root) { root = root.next; }
    Node prv, nxt;
    prv = n.prev; nxt = n.next;
    if (prv!=null) { prv.next = nxt; }
    if (nxt!=null) { nxt.prev = prv; }
    n.next = null; n.prev = null;
  }
}
[Figure: a list with root and current; the sets Content and Iter]
BUG: the slide marks the (n==current) update and the Iter’ = Iter – n conjunct separately; without the update, remove can leave current pointing to a removed node
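The cursor problem can be reproduced on a concrete list. The Python sketch below follows the slide's remove procedure and adds a hypothetical `update_cursor` switch to toggle the (n==current) update; interpreting Iter as the set of nodes reachable from current is an assumption, since the abstraction function for Iter is not shown in the transcript.

```python
class Node:
    def __init__(self, name):
        self.name, self.next, self.prev = name, None, None

class IterList:
    def __init__(self, nodes):
        self.root = self.current = nodes[0] if nodes else None
        for a, b in zip(nodes, nodes[1:]):
            a.next, b.prev = b, a

    @staticmethod
    def _from(x):
        """Names of nodes reachable from x via next."""
        out = set()
        while x is not None:
            out.add(x.name)
            x = x.next
        return out

    def content(self):     # Content: nodes reachable from root
        return self._from(self.root)

    def iter_set(self):    # Iter: nodes reachable from current (assumed)
        return self._from(self.current)

    def remove(self, n, update_cursor=True):
        if update_cursor and n is self.current:
            self.current = self.current.next
        if n is self.root:
            self.root = self.root.next
        prv, nxt = n.prev, n.next
        if prv is not None:
            prv.next = nxt
        if nxt is not None:
            nxt.prev = prv
        n.next = n.prev = None

def run(update_cursor):
    a, b, c = Node("a"), Node("b"), Node("c")
    lst = IterList([a, b, c])
    lst.current = b
    lst.remove(b, update_cursor)
    return lst.iter_set() <= lst.content()   # invariant: Iter in Content

fixed_ok = run(True)
buggy_ok = run(False)
```

Without the update, current still refers to the unlinked node b, so Iter is {b} while Content is {a, c} and the invariant fails.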
Verifying use of cursors
List.openIter();
bool b = List.isLastIter();
while (!b) {
c = List.nextIter();
View.drawCell(c);
b = List.isLastIter();
}
spec module IterList {
specvar Content, Iter : Node set;
invariant Iter in Content;
proc isLastIter() returns b : bool
ensures b' <=> (Iter ' = {});
proc nextIter() returns n : Node
requires Iter != { }
modifies Iter
ensures (n != null) &
(n in Iter) & (Iter ' = Iter - n) &
(n in Content);
}
Verified properties of the client loop:
– iterator initialized before use
– no iteration past the end
– each cell visited exactly once
Further analyzed programs
Water particle simulation: ordering of computation phases
Web server: initialization, ordering, data structures
– serving http://hob.csail.mit.edu
High-level properties
– relationships between different data structures
– none of the individual analyses could handle them alone
Individual data structures:
– trees (w/ parents), doubly-linked lists (w/ cursors)
– skip lists, lists with cross pointers, array, priority queue
Ongoing work:
– turn-based strategy game, collection classes
– operating system data structures
Jahob system
Successor to Hob
Goal: check data structures in more scenarios
– richer interfaces and invariants
  • maps to specify association lists, hash tables
  • relations to specify unbounded number of instances
  • symbolic cardinality constraints on sets
– future extension to other properties
Implementation language: Java subset
Specification language: Isabelle subset
New specialized decision procedures
Fine-grained combination of logics
Relational interfaces: impl and use
Logics for verifying uses of relations
– two variable logic with counting (SAS’04)
– fragments of first-order logic (AIOOL’05)
[Figure: Book and Person connected by the borrows relation with multiplicities [0..4] and [0..1]; objects 1, 2, 3 and A, B]
borrows = {(1, A), (2, B), (3, B)}
Use multiple logics for each verification condition in implementation of relation
New high-level analysis
– modular methodology that supports OO style
New decision procedures
BAPA: Sets with cardinality bounds
Imposing constraints on abstract content
– card(content) = size
– card(a.content) = card(b.content)
[Figure: a list with entry point first and size = 3; the size field is consistent with the number of stored objects]
Boolean Algebra with Presburger Arithmetic
Not widely known, but natural extension of BAs
Gave first complexity bound (CADE'05, JAR)
– quantifier elimination algorithm (as in LICS’03)
Recent results (see technical report, submissions):
– first PSPACE algorithm for quantifier-free fragment
– identified new useful polynomial-time fragment
Syntax:
S ::= V | S1 ∪ S2 | S1 ∩ S2 | S1 \ S2
T ::= k | C | T1 + T2 | T1 – T2 | C·T | card(S)
A ::= S1 = S2 | S1 ⊆ S2 | T1 = T2 | T1 < T2
F ::= A | F1 ∧ F2 | F1 ∨ F2 | ¬F | ∃S.F | ∃k.F
From BAPA to PA
If A, B are disjoint, then |A ∪ B| = |A| + |B|
Make them disjoint: Venn diagram
[Figure: Venn diagram of sets x, y, z with numbered regions; one region is labeled |x^c ∩ y ∩ z^c|]
Reduce set vars to integer vars
For quantifiers, use quantifier elimination
Preserves alternations: elementary
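The reduction from set variables to integer variables rests on the Venn regions being pairwise disjoint, so every cardinality is a sum of region sizes. The Python sketch below illustrates this on concrete finite sets; the function names and the sample sets are made up for the example, and the sketch computes with known sets rather than solving for unknown region sizes as the BAPA algorithm does.

```python
from itertools import product

def venn_regions(universe, sets):
    """Map each membership vector to the size of the corresponding Venn region."""
    names = sorted(sets)
    sizes = {}
    for signs in product([True, False], repeat=len(names)):
        region = {e for e in universe
                  if all((e in sets[n]) == s for n, s in zip(names, signs))}
        sizes[signs] = len(region)
    return names, sizes

def card(names, sizes, member):
    """Cardinality of a set expression given as a predicate over a region's signs."""
    return sum(k for signs, k in sizes.items()
               if member(dict(zip(names, signs))))

universe = set(range(10))
sets = {"x": {0, 1, 2, 3}, "y": {2, 3, 4, 5}, "z": {3, 5, 6}}
names, sizes = venn_regions(universe, sets)

# |x ∪ y| as a sum over the regions inside x or y.
union_xy = card(names, sizes, lambda r: r["x"] or r["y"])
# |x^c ∩ y ∩ z^c| is a single region.
only_y = card(names, sizes, lambda r: not r["x"] and r["y"] and not r["z"])
```

With three set variables there are 2^3 = 8 regions, which is why the number of integer variables grows exponentially in the number of set variables.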
Quantifier-free BAPA
Previous technique gives NEXPTIME
Can do it in PSPACE:
– analyze resulting equations
  • exponentially many variables
  • polynomially many equations
  • finite model property: solutions singly exp.
– guess sizes of sets
– use alternating PTIME algorithm to check them
Also identified a tree-like fragment, in PTIME
Hob and Jahob systems
Future work: roadmap
So far: conformance of code to model (specification)
Next: address the construction of models
– counterexamples for models (Alloy, FSE’05)
– testing, run-time checking of specifications
– efficient execution of declarative specifications
Fostering adoption of specifications
– inference, syntax, quantifiers, defaults, templates
Deploy within software development environments
Integrate domain-specific knowledge
– operating systems, games, embedded systems
Related work
Comparison to modular set-based analysis
– we are similarly
  • modular (but contracts: mutation, heap vs higher-order)
  • use sets and relations (of objects, not terms); note: LICS’03
– we also have
  • data abstraction: public and private contracts, abst funs
  • flow sensitivity, mutation: typestate
  • shape properties: relationships between typestates
  • different analyses in different modules
– but so far no: higher order functions, contract inference, two-level constraints, IDE
Related work
Shape analysis
– Jones, Muchnick ’79: memory optimizations
– Larus, Hilfinger ’88: detecting conflicts in memory accesses
– Hendren, Nicolau ’90: parallelization, connection analysis
– Chase, Wegman, Zadeck ’90: allocation-site model
– Klarlund, Schwartzbach ’93: graph types
– Deutsch ’94: symbolic bounds on paths
– Fradet, Metayer ’97: graph-grammars
– Sagiv, Reps, Wilhelm ’99: 3-valued framework
– Lev-Ami, Sagiv ’00: TVLA implementation
– Moeller, Schwartzbach ’01: PALE based on MONA
– Yorsh, Reps, Sagiv ’04: assume/guarantee reasoning for 3VL
– McPeak, Necula ’05: local pointer properties
– Rugina, Hackett ’05: region-based
– Lee, Yang, Yi ’05: combining three-valued and grammar-based
Related work
Model checking:
– Holzmann ’97: SPIN
– Burch, Clarke, Long, McMillan, Dill ’92: SMV
– Fisman, Pnueli ’01: non-regular infinite state systems
Predicate abstraction – extracting models
– Graf, Saidi ’97: using PVS
– Ball, Podelski ’01: Cartesian abstraction
– Ball, Majumdar, Millstein, Rajamani ’01: SLAM
– Henzinger, Jhala, Sutre ’02: BLAST
– Flanagan, Qadeer ’02: use of Skolem constants
– Lahiri, Seshia, Bryant ’04: UCLID, indexed predicates
– Balaban, Pnueli, Zuck ’05: small models for lists
– Bingham, Rakamaric ’06: abstraction of lists
– Lahiri, Qadeer ’06: lists and data properties
Related work
Decision procedures and theorem provers
– Barrett, Berezin ’04: CVC Lite
– Detlefs, Nelson, Saxe ’03: Simplify
– Ball, Lahiri, Musuvathi ’05: Zap
– Thatcher, Wright ’68: MSOL over finite trees
– Klarlund, Moeller, Schwartzbach ’00: MONA
– Yorsh, Rabinovich, Sagiv, Meyer, Bouajjani ’06: reachability logic
– BAPA: Feferman, Vaught ’59; Zarba ’04, ’05
– Voronkov ’95: Vampire, Weidenbach ’01: Spass
– Gordon ’85: HOL, Pfenning ’91: LF, Coquand, Huet ’85: Coq
– Constable, Allen, Bromley, Cleaveland, Cremer, Harper, Howe, Knoblock, Mendler, Panangaden, Sasaki, Smith ’86: NuPRL
– Gray, Hickey, Nogin, Tapus: MetaPRL
– Kaufmann, Manolios, Moore ’00: ACL2
– Nipkow, Paulson, Wenzel ’02: Isabelle
Related work
Program verification systems
– King ’70, Deutsch ’73, Suzuki ’73, Nelson ’81, Guttag, Horning ’93
– Good, Akers, Smith ’86: Gypsy
– Jones ’86: VDM
– Abrial, Lee, Neilson, Scharbach, Soerensen ’91: B method
– Owre, Shankar, Rushby, Stringer-Calvert: PVS
– Ahrendt, Baar, Beckert, Giese, Habermalz, Haehnle, Menzel, Schmitt ’00: KeY
– Foulger, King ’01: SPARK Ada
– Flanagan, Leino, Lillibridge, Nelson, Saxe, Stata ’02: ESC/Java
– Marche, Paulin-Mohring, Urbain ’03: Krakatoa
– Breunesse, Poll ’05: model fields in JML
– Barnett, DeLine, Jacobs, Fähndrich, Leino, Schulte, Venter ’05: Spec#
– Leino, Mueller ’06: model fields in Spec#
Conclusions
Goal: statically verify data structure consistency
Hob system: language, framework, analyses
– specification language based on sets
– new shape analysis, new high-level analysis
– analyzed minesweeper, water, web server
  • detailed data structure properties: trees, arrays, ...
  • properties meaningful to users of the system
Jahob system
– richer specification language with relations
– new decision procedures and analyses
Related work
Array bounds checking
– Bodik, Gupta, Sarkar ’00: demand-driven
– Rugina, Rinard ’00: bounds and region analysis
Pointer analyses
– Steensgaard ’96: points-to in almost linear time
– Andersen ’94: inclusion constraints
– Fähndrich, Rehof, Das ’00: instantiation constraints
– Salcianu, Rinard ’05: side-effect analysis
– Sridharan, Gopan, Shan, Bodik ’05: demand-driven
– Sridharan, Bodik ’06: refinement-based
Cost of analyzing data structures
Doubly exponential state space
Non-elementary decision procedure
Mutable reversal of list using a loop
– content’ = content, structure remains acyclic list: 5 seconds
2-level skip list insertion
– content’ = content ∪ {x}, structure remains skip list: 35 seconds
Insertion into parent tree
– content’ = content ∪ {x}, structure remains parent tree: 83 seconds
Related work
Type systems
– Freeman, Pfenning ’91: refinement types
– Xi, Pfenning ’99: dependent ML
– Harren, Necula ’05: dependent types in typed assembly
– Smith, Walker, Morrisett ’00: alias types
Typestate systems
– Strom, Yemini ’86: typestate for initialization
– Fähndrich, DeLine ’01 ’04: finite state protocols
– Das, Lerner, Seigle ’02: typestate inference
– Ramalingam, Warshavsky, Field, Goyal, Sagiv ’02
Related work
Bug finding and dynamic specification synthesis
– Jackson, Vaziri ’00: finding bugs in code with Alloy
– Taghdiri ’04: counterexample-driven refinement
– Xie, Aiken ’05: Saturn
– Evans ’94: LCLint
– Engler, Musuvathi ’00: metacompilation
– Hovemeyer, Pugh ’04: FindBugs
– Boyapati, Khurshid, Marinov ’02: Korat
– Sen, Marinov, Agha: CUTE
– Ernst, Czeisler, Griswold, Notkin ’00: dynamic invariant inference
– Ammons, Bodik, Larus ’02: dynamic finite state inference
Additional details and topics
Decidability of structural subtyping
Relational reasoning about datatypes
Two-variable logic and spatial conjunction
Boolean algebra with Presburger arithmetic
High-level analysis using set algebra