Delta Debugging and Model Checkers for fault localization
Amin Alipour
Note: Some slides/figures in this presentation have been used/adapted from presentations by Andreas Zeller, Tevfik Bultan, and Alex Groce.
Outline
• Software faults – some facts
• Delta debugging
– Simplifying test cases
– Isolating failure-inducing parts in test cases
– Search in space
• Model checking
– Background
– Distance metrics
• Conclusion
Software faults
• A software fault (flaw, bug) perturbs the state of a program into an error state.
• An error state can propagate through the execution of the program and cause a failure.
• A failure is the manifestation of an error.
Software debugging
• What do we have for debugging?
– Program
– Set of test cases
– …
• For maintainable debugging of failures:
– We need to understand the test case/failure.
– We need to identify the location of faults (fault localization).
• Can we automate it?
Approaches to Fault Localization
• Program Slicing
• Program Spectra
• Statistical Reasoning
• Delta Debugging
• Model Checking
Delta Debugging
• Goal:
– Remove components irrelevant to the failure from test cases.
– This can improve comprehension of the failure.
• Delta debugging comes with two techniques:
– Simplification (minimization) of test cases, and
– Isolation of failure-inducing parts of test cases.
Delta Debugging
• Failing test cases are usually cluttered by unnecessary/irrelevant things.
…
<td align=left valign=top><SELECT NAME="op sys" MULTIPLE SIZE=7><OPTION VALUE="All">All<OPTION VALUE="Windows 3.1">Windows 3.1<OPTION VALUE="Windows 95">Windows 95<OPTION VALUE="Windows 98">Windows 98<OPTION VALUE="Windows ME">Windows ME<OPTION VALUE="Windows 2000">Windows 2000<OPTION VALUE="Windows NT">Windows NT<OPTION VALUE="Mac System 7">Mac System 7<OPTION VALUE="Mac System 7.5">Mac System 7.5<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1<OPTION VALUE="Mac System 8.0">Mac System 8.0<OPTION VALUE="Mac System 8.5">Mac System 8.5<OPTION VALUE="Mac System 8.6">Mac System 8.6<OPTION VALUE="Mac System 9.x">Mac System 9.x<OPTION VALUE="MacOS X">MacOS X<OPTION VALUE="Linux">Linux<OPTION VALUE="BSDI">BSDI<OPTION VALUE="FreeBSD">FreeBSD<OPTION VALUE="NetBSD">NetBSD<OPTION VALUE="OpenBSD">OpenBSD<OPTION VALUE="AIX">AIX<OPTION VALUE="BeOS">BeOS <OPTION VALUE="HP-UX">HP-UX<OPTION VALUE="IRIX">IRIX<OPTION VALUE="Neutrino">Neutrino<OPTION VALUE="OpenVMS">OpenVMS<OPTION VALUE="OS/2">OS/2<OPTION VALUE="OSF/1">OSF/1<OPTION VALUE="Solaris">Solaris<OPTION VALUE="SunOS">SunOS<OPTION VALUE="other">other</SELECT></td>
<td align=left valign=top><SELECT NAME="priority" MULTIPLE SIZE=7><OPTION VALUE="--">--<OPTION VALUE="P1">P1<OPTION VALUE="P2">P2<OPTION VALUE="P3">P3<OPTION VALUE="P4">P4<OPTION VALUE="P5">P5</SELECT></td>
<td align=left valign=top><SELECT NAME="bug severity" MULTIPLE SIZE=7><OPTION VALUE="blocker">blocker<OPTION VALUE="critical">critical<OPTION VALUE="major">major<OPTION VALUE="normal">normal<OPTION VALUE="minor">minor<OPTION VALUE="trivial">trivial<OPTION VALUE="enhancement">enhancement</SELECT></tr></table>
…
Simplification of test cases
• Goal:
– Minimize the size of a failing test case, $c_F$.
• $c_F = \Delta_1 \cup \Delta_2 \cup \dots \cup \Delta_n$
• Finding a truly minimal test case requires checking all subsets of the $\Delta_i$.
• Delta debugging simplifies a failing test case $c_F$ to a 1-minimal test case.
• 1-minimal failing test case:
– A failing test case is 1-minimal if removing any single part of it ($\Delta_i$) makes the failure disappear.
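The 1-minimality condition can be checked directly, one removal at a time. The following is a minimal sketch, not part of the original slides: the name `is_one_minimal` and the toy failure predicate `fails` (the input "fails" whenever it contains both `<` and `>`) are assumptions made for illustration.

```python
from typing import Callable, Sequence


def is_one_minimal(test_case: Sequence, fails: Callable[[list], bool]) -> bool:
    """True if test_case fails and removing any single element removes the failure."""
    items = list(test_case)
    if not fails(items):
        return False  # only a failing test case can be 1-minimal
    for i in range(len(items)):
        reduced = items[:i] + items[i + 1:]  # drop exactly one element
        if fails(reduced):
            return False  # still fails, so element i was not needed
    return True


def fails(chars: list) -> bool:
    """Hypothetical failure: the input contains both '<' and '>'."""
    return "<" in chars and ">" in chars


if __name__ == "__main__":
    print(is_one_minimal(list("<>"), fails))
    print(is_one_minimal(list("<a>"), fails))
```

The check needs only one test per element, which is why 1-minimality is a practical target, whereas a globally minimal test case would require testing every subset.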
Simplification Algorithm
• $\nabla_i = c_F - \Delta_i$
• Test each $\Delta_1, \Delta_2, \dots, \Delta_n$ and each $\nabla_1, \nabla_2, \dots, \nabla_n$
• There are four possible outcomes:
1. Some $\Delta_i$ causes failure
– Partition $\Delta_i$ into two and continue with $\Delta_i$ as the test set
2. Some $\nabla_i$ causes failure
– Continue with $\nabla_i$ as the test set, with $n - 1$ subsets
3. No test causes failure
– Increase granularity by generating a partition with $2n$ subsets
4. The granularity can no longer be increased
– Done, found the 1-minimal subset
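The four outcomes above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation used in the slides: the test case is a list of characters, `crashes` is a hypothetical stand-in for running the real program, and the helper names `split` and `ddmin` are chosen here.

```python
from typing import Callable, List, Sequence


def split(items: list, n: int) -> List[list]:
    """Partition items into n contiguous, non-empty chunks (needs 1 <= n <= len(items))."""
    chunks, start = [], 0
    for i in range(n):
        end = start + (len(items) - start) // (n - i)
        chunks.append(items[start:end])
        start = end
    return chunks


def ddmin(c_fail: Sequence, fails: Callable[[list], bool]) -> list:
    """Reduce a failing test case to a 1-minimal failing test case."""
    current = list(c_fail)
    assert fails(current), "ddmin needs a failing test case to start from"
    n = 2
    while len(current) >= 2:
        subsets = split(current, n)
        # Outcome 1: some subset fails on its own -> continue with it, n = 2
        hit = next((s for s in subsets if fails(s)), None)
        if hit is not None:
            current, n = hit, 2
            continue
        # Outcome 2: some complement fails -> continue with it, n - 1 subsets
        complements = [
            [x for j, s in enumerate(subsets) if j != i for x in s]
            for i in range(n)
        ]
        hit = next((c for c in complements if fails(c)), None)
        if hit is not None:
            current, n = hit, max(n - 1, 2)
            continue
        # Outcome 3: nothing fails -> increase granularity to 2n subsets
        if n < len(current):
            n = min(2 * n, len(current))
            continue
        # Outcome 4: granularity cannot be increased -> current is 1-minimal
        break
    return current


def crashes(chars: list) -> bool:
    """Hypothetical failure: '<SELECT' appears and is followed later by '>'."""
    text = "".join(chars)
    start = text.find("<SELECT")
    return start != -1 and ">" in text[start:]


if __name__ == "__main__":
    original = '<SELECT NAME="priority" MULTIPLE SIZE=7>'
    print("".join(ddmin(list(original), crashes)))
```

With this toy predicate the reduced input is `<SELECT>`; a real setup would replace `crashes` with a function that runs the program under test on the candidate input.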
Simplification - Example
[Figure: successive steps of the simplification, with granularity n = 2, 4, 3, 2, 4, 3]
Simplification Example 2
Each row lists the test number, the input, and the outcome (F = fail, P = pass).
1 <SELECT NAME="priority" MULTIPLE SIZE=7> F
2 <SELECT NAME="priority" MULTIPLE SIZE=7> P
3 <SELECT NAME="priority" MULTIPLE SIZE=7> P
4 <SELECT NAME="priority" MULTIPLE SIZE=7> P
5 <SELECT NAME="priority" MULTIPLE SIZE=7> F
6 <SELECT NAME="priority" MULTIPLE SIZE=7> F
7 <SELECT NAME="priority" MULTIPLE SIZE=7> P
8 <SELECT NAME="priority" MULTIPLE SIZE=7> P
9 <SELECT NAME="priority" MULTIPLE SIZE=7> P
10 <SELECT NAME="priority" MULTIPLE SIZE=7> F
11 <SELECT NAME="priority" MULTIPLE SIZE=7> P
12 <SELECT NAME="priority" MULTIPLE SIZE=7> P
13 <SELECT NAME="priority" MULTIPLE SIZE=7> P
Simplification Example 2 (cont’d)
14 <SELECT NAME="priority" MULTIPLE SIZE=7> P
15 <SELECT NAME="priority" MULTIPLE SIZE=7> P
16 <SELECT NAME="priority" MULTIPLE SIZE=7> F
17 <SELECT NAME="priority" MULTIPLE SIZE=7> F
18 <SELECT NAME="priority" MULTIPLE SIZE=7> F
19 <SELECT NAME="priority" MULTIPLE SIZE=7> P
20 <SELECT NAME="priority" MULTIPLE SIZE=7> P
21 <SELECT NAME="priority" MULTIPLE SIZE=7> P
22 <SELECT NAME="priority" MULTIPLE SIZE=7> P
23 <SELECT NAME="priority" MULTIPLE SIZE=7> P
24 <SELECT NAME="priority" MULTIPLE SIZE=7> P
25 <SELECT NAME="priority" MULTIPLE SIZE=7> P
26 <SELECT NAME="priority" MULTIPLE SIZE=7> F
Simplification
• Delta debugging reduces the large HTML input shown earlier to a single failure-inducing tag:
<SELECT>
Isolation of Failure-inducing part from test case
• Even a minimal test case still contains elements that are not directly related to the failure.
– E.g., a minimal test case for a C compiler still needs some symbols, such as { and }, or variable declarations, to keep the test input valid, even though they might be irrelevant to the failure.
#define SIZE 20
double mult(double z[], int n) {
  int i, j;
  i = 0;
  for (j = 0; j < n; j++) {
    i = i + j + 1;
    z[i] = z[i] * (z[0] + 1.0);
  }
  return z[n];
}
Isolation of Failure-inducing part from a test case
• How to isolate failure-related parts?
– Find a pair of passing and failing inputs that are very similar and contrast them.
#define SIZE 20
double mult(double z[], int n) {
  int i, j;
  i = 0;
  for (j = 0; j < n; j++) {
    i + j + 1;
    z[i] = z[i] * (z[0] + 1.0);
  }
  return z[n];
}

#define SIZE 20
double mult(double z[], int n) {
  int i, j;
  i = 0;
  for (j = 0; j < n; j++) {
    i = i + j + 1;
    z[i] = z[i] * (z[0] + 1.0);
  }
  return z[n];
}
Failing Test Case Passing Test Case
Isolation Algorithm
• Narrow the gap between the passing and the failing test case by removing differences between them, making the two more similar.
Isolation Example
2 <SELECT NAME="priority" MULTIPLE SIZE=7> F
4 <SELECT NAME="priority" MULTIPLE SIZE=7> F
7 <SELECT NAME="priority" MULTIPLE SIZE=7> P
6 <SELECT NAME="priority" MULTIPLE SIZE=7> P
5 <SELECT NAME="priority" MULTIPLE SIZE=7> P
3 <SELECT NAME="priority" MULTIPLE SIZE=7> P
1 <SELECT NAME="priority" MULTIPLE SIZE=7> P
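The narrowing step can be sketched as follows. This is a simplified illustration rather than the full isolation algorithm: it assumes every test either passes or fails (no unresolved outcomes), in which case the narrowing amounts to bisecting the difference between the two test cases. Configurations are sets of character indices into a sample input, and `fails` is a hypothetical failure predicate.

```python
from typing import Callable, Set, Tuple

TEXT = '<SELECT NAME="priority" MULTIPLE SIZE=7>'


def fails(config: Set[int]) -> bool:
    """Hypothetical failure: the kept characters contain '<SELECT' followed later by '>'."""
    text = "".join(TEXT[i] for i in sorted(config))
    start = text.find("<SELECT")
    return start != -1 and ">" in text[start:]


def isolate(
    c_pass: Set[int], c_fail: Set[int], fails: Callable[[Set[int]], bool]
) -> Tuple[Set[int], Set[int]]:
    """Narrow c_pass (passing) and c_fail (failing) until they differ in one element."""
    c_pass, c_fail = set(c_pass), set(c_fail)
    assert c_pass <= c_fail and not fails(c_pass) and fails(c_fail)
    while len(c_fail - c_pass) > 1:
        diff = sorted(c_fail - c_pass)
        candidate = c_pass | set(diff[: len(diff) // 2])  # add half of the difference
        if fails(candidate):
            c_fail = candidate  # failure already occurs: shrink the failing side
        else:
            c_pass = candidate  # still passes: grow the passing side
    return c_pass, c_fail


if __name__ == "__main__":
    passing, failing = isolate(set(), set(range(len(TEXT))), fails)
    print([TEXT[i] for i in failing - passing])
```

Unlike simplification, the result here is a pair of large, nearly identical inputs; the single remaining difference is reported as the failure-inducing part.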
Cause for a failure
Can we use the isolation technique to find causes of the failure?
Cause for a failure - example
cause of a failure - example
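Cause Transitions
[Figure: a failing run r_f and a passing run r_p compared at locations l_1, l_2, l_i, l_{i+1}, l_j, l_{j+1}, with cause labels a, a, ab, b, c; the figure marks a "Cause" and a "Cause Transition".]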
Discussion on delta debugging
• It scales well.
• It requires minimal information about the program and its specification.
• There are several extensions to it:
– Hierarchical delta debugging
– Isolating schedules in concurrent systems
– Isolating failure-inducing changes in repositories
Model Checkers for fault localization
Model Checking Problem
[Figure: a model checker takes a program/model and a specification/assertions as input and answers either "Satisfied" or with a counter-example.]
Fault Localization with Model Checkers
• Model Checkers can perform different queries on program paths and states.
• These queries can be used for fault localization:
– Contrasting
– Distance metrics
– Max-SAT
Explanation with Distance Metrics
• How it’s done:
1. First, the program (P) and specification (spec) are sent to the model checker.
2. The model checker finds a counterexample, C.
3. The explanation tool uses P, spec, and C to generate (via Bounded Model Checking) a formula whose solutions are executions of P that are not counterexamples.
4. Constraints are added to this formula for an optimization problem: find a solution that is as similar to C as possible, by the distance metric d. The formula + optimization problem is S.
5. An optimization tool (PBS, the Pseudo-Boolean Solver) finds a solution to S: an execution of P that is not a counterexample and is as similar as possible to C. Call this execution -C.
6. Report the differences (Δs) between C and -C to the user: explanation and fault localization.
[Figure: P+spec goes into the model checker, which produces C; the BMC/constraint generator produces S; the optimization tool produces -C; C and -C are compared to give the Δs.]
“SSA” Transformation
Original program:
int main () {
  int x, y;
  int z = y;
  if (x > 0)
    y--;
  else
    y++;
  z++;
  assert (y == z);
}
After the transformation:
int main () {
  int x0, y0;
  int z0 = y0;
  y1 = y0 - 1;
  y2 = y0 + 1;
  guard1 = x0 > 0;
  y3 = guard1?y1:y2;
  z1 = z0 + 1;
  assert (y3 == z1);
}
Transformation to Equations
• The SSA assignments and the assertion become one conjunction of equations:
(z0 == y0 ∧
 y1 == y0 - 1 ∧
 y2 == y0 + 1 ∧
 guard1 == (x0 > 0) ∧
 y3 == guard1?y1:y2 ∧
 z1 == z0 + 1 ∧
 y3 == z1)
• Uninitialized variables in CBMC are unconstrained inputs.
• CBMC (1) negates the assertion:
(z0 == y0 ∧
 y1 == y0 - 1 ∧
 y2 == y0 + 1 ∧
 guard1 == (x0 > 0) ∧
 y3 == guard1?y1:y2 ∧
 z1 == z0 + 1 ∧
 y3 != z1)
• then (2) translates to SAT and uses a fast solver to find a counterexample.
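To make the search for a counterexample concrete, the equations can be evaluated for candidate inputs and checked against the negated assertion. The sketch below is only a stand-in for what a SAT-based tool does: it enumerates a small, arbitrarily chosen input range instead of solving over bit-vectors, it ignores integer overflow, and the function names are chosen here for illustration.

```python
from itertools import product
from typing import Iterable, Optional


def execute(x0: int, y0: int) -> dict:
    """Evaluate the SSA equations for the unconstrained inputs x0 and y0."""
    z0 = y0
    y1 = y0 - 1
    y2 = y0 + 1
    guard1 = x0 > 0
    y3 = y1 if guard1 else y2
    z1 = z0 + 1
    return {"x0": x0, "y0": y0, "z0": z0, "y1": y1, "y2": y2,
            "guard1": guard1, "y3": y3, "z1": z1}


def find_counterexample(domain: Iterable[int]) -> Optional[dict]:
    """Return an execution that satisfies the negated assertion y3 != z1, if any."""
    values = list(domain)
    for x0, y0 in product(values, repeat=2):
        run = execute(x0, y0)
        if run["y3"] != run["z1"]:  # the assertion y3 == z1 is violated
            return run
    return None


if __name__ == "__main__":
    print(find_counterexample(range(-2, 3)))
```

Every counterexample found this way has `guard1` true, because the assertion can only fail when the `y--` branch is taken.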
Execution Representation
• Remove the assertion to get an equation for any execution of the program:
(z0 == y0 ∧
 y1 == y0 - 1 ∧
 y2 == y0 + 1 ∧
 guard1 == (x0 > 0) ∧
 y3 == guard1?y1:y2 ∧
 z1 == z0 + 1)
• An execution is represented by assignments to all variables in the equations.
• Counterexample (satisfies y3 != z1):
x0 == 1, y0 == 5, z0 == 5, y1 == 4, y2 == 6, guard1 == true, y3 == 4, z1 == 6
• Use the assertion (y3 == z1) to find a passing trace.
• Passing trace (successful execution):
x0 == 0, y0 == 5, z0 == 5, y1 == 4, y2 == 6, guard1 == false, y3 == 6, z1 == 6
The Distance Metric d
• d = number of changes (Δs) between two executions
• Counterexample:
x0 == 1, y0 == 5, z0 == 5, y1 == 4, y2 == 6, guard1 == true, y3 == 4, z1 == 6
• Successful execution:
x0 == 0, y0 == 5, z0 == 5, y1 == 4, y2 == 6, guard1 == false, y3 == 6, z1 == 6
• The two executions differ in x0, guard1, and y3, so d = 3.
• 3 is the minimum possible distance between the counterexample and a successful execution.
• To compute the metric, add a new SAT variable for each potential Δ:
Δx0 == (x0 != 1)
Δy0 == (y0 != 5)
Δz0 == (z0 != 5)
Δy1 == (y1 != 4)
Δy2 == (y2 != 6)
Δguard1 == !guard1
Δy3 == (y3 != 4)
Δz1 == (z1 != 6)
• And minimize the sum of the Δ variables (treated as 0/1 values): a pseudo-Boolean problem.
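The metric and its minimization can be illustrated on the running example. This sketch replaces the pseudo-Boolean solver with brute-force enumeration over a small, arbitrarily chosen input range, so it shows the optimization problem rather than how PBS solves it; the function names are assumptions made here.

```python
from itertools import product
from typing import Iterable


def execute(x0: int, y0: int) -> dict:
    """Evaluate the SSA equations of the example program."""
    guard1 = x0 > 0
    y1, y2 = y0 - 1, y0 + 1
    return {"x0": x0, "y0": y0, "z0": y0, "y1": y1, "y2": y2,
            "guard1": guard1, "y3": y1 if guard1 else y2, "z1": y0 + 1}


def deltas(cex: dict, run: dict) -> dict:
    """One 0/1 variable per SSA variable: 1 if its value differs from the counterexample."""
    return {var: int(run[var] != cex[var]) for var in cex}


def distance(cex: dict, run: dict) -> int:
    """d = sum of the delta variables."""
    return sum(deltas(cex, run).values())


def closest_passing(cex: dict, domain: Iterable[int]) -> dict:
    """Among executions satisfying the assertion y3 == z1, pick one minimizing d."""
    values = list(domain)
    runs = (execute(x0, y0) for x0, y0 in product(values, repeat=2))
    passing = [run for run in runs if run["y3"] == run["z1"]]
    return min(passing, key=lambda run: distance(cex, run))


if __name__ == "__main__":
    counterexample = execute(1, 5)
    best = closest_passing(counterexample, range(-8, 9))
    changed = [var for var, d in deltas(counterexample, best).items() if d]
    print(distance(counterexample, best), changed)
```

For the counterexample with x0 == 1 and y0 == 5, the closest passing executions keep y0 == 5 and change only x0, guard1, and y3, which matches d = 3.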
Explanation with Distance Metrics
[Figure: the same pipeline labeled with its tools: CBMC as the model checker, explain as the BMC/constraint generator, and PBS as the optimization tool.]
Discussion
• Usefulness of Fault Localization Techniques
– Effectiveness:
  • Precision: low false negatives
  • Informativeness: enough clues to make a fix or refute
– Efficiency:
  • Performance: it should run within the budget constraints.
  • Scalability: ability to run on real-size programs.
  • Information usage: making the most of the information available.
Discussion
Fault Localization
• Input: program, test cases, specification, development history, developers, comments
• Output: suspicious components
Discussion
Fault Localization
• Output: suspicious components
• Why? (with respect to the program and specification)
• No answer!
…
Thank you!
What does model checking give us?
• We can query program (sub)paths with different characteristics, e.g.:
– All failing paths
– All passing paths
The Distance Metric d
• An SSA-form oddity:
– The distance metric can compare values from code that doesn’t run in either execution being compared.
– This can be the determining factor in which of two traces is most similar to a counterexample.
– Counterintuitive but not necessarily incorrect: it simply extends the comparison to all hypothetical control flow paths.
Model Checking
• Model checking problem:
– Given a transition system $M$ and a property $\varphi$, verify whether $M$ satisfies $\varphi$.
• $M$ can represent a program.
• $\varphi$ can denote a desired property of the program, e.g.:
– Deadlock does not happen; a particular function is called at most once.
• A model checking procedure must either verify the program or return a counter-example (a failing trace).
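For a safety property, the verify-or-return-a-trace behavior can be sketched with a breadth-first search over the reachable states. This is a toy explicit-state checker under stated assumptions, not how any particular tool works: the transition system, the state encoding `(pc, calls)`, and all function names are hypothetical.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, List, Optional


def check_safety(
    initial: Iterable[Hashable],
    successors: Callable[[Hashable], Iterable[Hashable]],
    is_safe: Callable[[Hashable], bool],
) -> Optional[List[Hashable]]:
    """Return None if all reachable states are safe, else a shortest failing trace."""
    parent = {state: None for state in initial}
    queue = deque(parent)
    while queue:
        state = queue.popleft()
        if not is_safe(state):
            trace = []
            while state is not None:  # walk back to an initial state
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


def make_successors(allow_retry: bool):
    """Hypothetical program model; a state is (pc, number of calls, capped at 2)."""
    def successors(state):
        pc, calls = state
        if pc == 0:  # pc 0 calls the function, then moves to pc 1
            return [(1, min(calls + 1, 2))]
        if pc == 1:  # pc 1 either finishes (pc 2) or retries from pc 0
            return [(2, calls)] + ([(0, calls)] if allow_retry else [])
        return []    # pc 2 is terminal
    return successors


def called_at_most_once(state) -> bool:
    return state[1] <= 1


if __name__ == "__main__":
    print(check_safety([(0, 0)], make_successors(True), called_at_most_once))
    print(check_safety([(0, 0)], make_successors(False), called_at_most_once))
```

With the retry transition enabled, the checker returns the trace that calls the function twice; without it, every reachable state satisfies the property and the result is `None`.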
Typical State of program