![Page 1: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/1.jpg)
Rui Abreu Dept. of Informatics Engineering
Faculty of Engineering University of Porto
Thanks: Peter Zoeteweij, Tom Janssen, Arjan J.C. van Gemund
Johan de Kleer, Wolfgang Mayer
Fault Diagnosis of Software Systems
LESI, UM
ST, UU Philips Research Labs
PhD, TUD
Ass. prof., FEUP Siemens, Porto
About the speaker…
Software Faults
• Faults (bugs) have been around since the beginning of computer science
• can have serious financial or life-threatening consequences
Automated Diagnosis
• Purpose: identify the root causes of system failures
• Originated in the AI area
  – expert systems
  – model-based diagnosis
• Primarily applied to hardware (circuits, mechanical devices), e.g.,
  – Line stuck at 0/1, valve stuck,
  – Sensors, amps not working,
  – Leakage, …
Software
• Automated diagnosis ≠ automated debugging
  – E.g., applications in recovery don’t require the level of detail needed for debugging
• Background: embedded systems
  – Functionality shift HW → SW
  – 25% annual growth rate
  – Nearly constant fault density
  – Decreasingly dependable systems
Dependability approaches
How to reverse the trend?
• Design / development time: decrease SW fault density + LOC
  – Formal methods, SW arch, code generation, testing, …
• Run-time: deal with imperfection
  – Fault detection, isolation, recovery (FDIR)
Software fault diagnosis
Contributes to dependability in two ways
• At (design / ) development time:
  – Shortens the test – diagnose – repair cycle
  – More bugs solved → more reliable products
  – (or shorter time to market)
• At run-time:
  – Can serve as the basis for (automated) recovery
  – Requires recovery-oriented design
Live Demo later on
Outline
• Part I
  – Diagnosis principles
  – Model-Based Diagnosis
  – Spectrum-Based Fault Localization
  – Live Demo
• Part II
  – Existing systems
  – Current research
  – Case studies
  – Further applications
  – Other approaches
Spectrum-based fault localization
• Black-box technique: no modeling required
  – As opposed to model-based diagnosis
• Inherently inaccurate
  – As opposed to model-based diagnosis
• Appears to work well in practice
• Lends itself well to integration with existing testing schemes
• Low CPU and memory overhead
Integration with testing
Test suite t1 t2 t3 t4 t5
Integration with testing
Status t1 t2 t3 t4 t5
System components are ranked according to their likelihood of causing the detected errors (ranks shown in the figure: 1, 2, 2)
Terminology
• fault: the cause of an error in the system (bug: array index uninitialized)
• error: system state that may cause a failure (index out of bounds)
• failure: behavior ≠ expected behavior (segmentation fault)
Terminology
fault → error → failure
For our purposes, the distinction between errors and failures is less relevant: failures are errors that affect the user, i.e., that are externally observable.
This depends on
• specification
• what can be observed
Example: rational bubble sort
    void RationalSort( int n, int *num, int *den ) {
        int i,j;
        for ( i=n-1; i>=0; i-- ) {
            assert( den[i] != 0 );
            for ( j=0; j<i; j++ ) {
                if ( RationalGT( num[j], den[j], num[j+1], den[j+1] ) ) {
                    swap( &num[j], &num[j+1] );
                    /* swap( &den[j], &den[j+1] ); */
                }
            }
        }
    }
Fault: forgot to swap denominators
Error: sequence is not a permutation of input sequence
Failure: output is not a sorted version of the input
Example: rational bubble sort
• Failure example:
[Figure: trace of the numerator and denominator arrays for a failing run]
Example: rational bubble sort
• Faults do not automatically lead to errors: RationalSort works fine if the input array is already sorted, or if all denominators are equal:
3/1, 2/1, 1/1 → 2/1, 3/1, 1/1 → 2/1, 1/1, 3/1 → 1/1, 2/1, 3/1
Example: rational bubble sort
• Errors do not automatically lead to failures:
4/1, 2/2, 0/1 → 2/1, 4/2, 0/1 → 2/1, 0/2, 4/1 → 0/1, 2/2, 4/1
ERROR!
• Numerators are swapped
• Denominators are not swapped
• However, the end result is, by mere chance, correct
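The behavior described above can be reproduced with a small self-contained version of the example. This is only a sketch: the slides do not show `RationalGT` or `swap`, so the bodies below are assumptions (comparison by cross-multiplication, which presumes positive denominators), and `is_sorted` is a hypothetical helper added for checking the result.

```c
#include <assert.h>

/* Assumed helper: a/b > c/d, valid for positive denominators b and d. */
static int RationalGT(int a, int b, int c, int d) {
    return a * d > c * b;
}

/* Assumed helper: exchange two ints. */
static void swap(int *p, int *q) {
    int t = *p;
    *p = *q;
    *q = t;
}

/* Faulty sort from the slides: the denominators are never swapped. */
void RationalSort(int n, int *num, int *den) {
    int i, j;
    for (i = n - 1; i >= 0; i--) {
        assert(den[i] != 0);
        for (j = 0; j < i; j++) {
            if (RationalGT(num[j], den[j], num[j + 1], den[j + 1])) {
                swap(&num[j], &num[j + 1]);
                /* swap(&den[j], &den[j + 1]); */
            }
        }
    }
}

/* Hypothetical checker: 1 if the rationals are in non-decreasing order. */
int is_sorted(int n, const int *num, const int *den) {
    for (int k = 0; k + 1 < n; k++)
        if (RationalGT(num[k], den[k], num[k + 1], den[k + 1]))
            return 0;
    return 1;
}
```

Calling `RationalSort` on 4/1, 2/2, 0/1 leaves 0/1, 2/2, 4/1, which happens to be sorted even though the intermediate states were erroneous. An input such as 1/2, 1/3 stays unsorted, because swapping the equal numerators changes nothing.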
Fault Diagnosis
Identify component(s) that are root cause of failure
[Figure: system y = f(x,h) with input x and components f1 … f5; legend: healthy / faulty]
Diagnose failure: solve inverse problem h = f⁻¹(x,y)
Diagnosis: h2 = fault state, or h4 and h5 = fault state
x, y: observation vectors
f: system function, fi: component functions
h: system health state vector, hi: component health vars
Example Fault Diagnoses
• stuck-at-x, stuck valve, motor (HW)
• sensors, amps not working (HW, SW)
• wires disconnected, crossed (HW, SW)
• wrong function, leakage (HW, SW)
• excessive component delay (HW, SW)
• component intermittent out-of-spec (HW, SW)
• ...
Modeling Information (1)
[Figure: system y = f(x,h) with components f1 … f5, alongside a model M of nominal behavior that computes y’ = M(x); y and y’ are compared (=?) to give Pass/Fail]
• Suppose we only have a nominal model (M) of f:
  – can only test for system failure, not locate component failure
Modeling Information (2)
[Figure: system y = f(x,h) with components f1 … f5, alongside component models M1 … M5 that compute y’ = M(x,h’); y and y’ are compared (=?) to give Pass/Fail]
• Suppose we also have models (Mi) of all fi:
  – can infer location(s) of failure (Model-Based Diagnosis)
  – search for h’ = h such that y’ is consistent with y
Modeling Information (3)
[Figure: system y = f(x,h) with components f1 … f5, alongside a trace of which components 1 … 5 were involved; y is compared (=?) with y’ = M(x) to give Pass/Fail]
• Suppose we only have a trace on the involvement of fi:
  – can infer location(s) of failure (Spectrum-Based Diagnosis)
  – correlate trace with Pass/Fail test outcomes
Models in diagnosis
• Define expected behavior
• Need not be explicit
• May help reason about faults
[Figure: three inverters i1, i2, i3 with input x, internal wire z, and outputs y1 and y2; expected behavior: y1 = y2 = x]
Reasoning about faults
[Figure: the inverter circuit i1, i2, i3 annotated with the observed values 1, 0, 1]
[Figure: the same circuit with one candidate explanation marked “Invalid explanation!”]
Health assignments (h1,h2,h3), with 1 = healthy and 0 = faulty:
valid: (1,0,1), (0,0,1), (1,0,0), (0,1,0), (0,0,0)
invalid: (1,1,1), (0,1,1), (1,1,0)
P(fail) = 0.01
valid, with prior probability:
(1,0,1) 0.009801
(0,0,1) 0.000099
(1,0,0) 0.000099
(0,1,0) 0.000099
(0,0,0) 0.000001
invalid: (1,1,1), (0,1,1), (1,1,0)
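The valid/invalid split and the probabilities can be checked by brute force. The sketch below assumes the weak inverter model (a healthy inverter negates its input, a faulty one may output anything), an unobserved wire z between i1 and the other two inverters, and the reading x = 1, output of i2 = 0, output of i3 = 1 for the observed values; the function names are hypothetical, and components are taken to fail independently with probability `p_fail`.

```c
#include <assert.h>

/* Returns 1 if health assignment (h1,h2,h3) can explain input x and the
   outputs y2 (of i2) and y3 (of i3) for some value of the hidden wire z. */
int consistent_weak(int h1, int h2, int h3, int x, int y2, int y3) {
    assert((x == 0 || x == 1) && (y2 == 0 || y2 == 1) && (y3 == 0 || y3 == 1));
    for (int z = 0; z <= 1; z++) {
        if (h1 && z != !x) continue;   /* healthy i1 forces z = not x  */
        if (h2 && y2 != !z) continue;  /* healthy i2 forces y2 = not z */
        if (h3 && y3 != !z) continue;  /* healthy i3 forces y3 = not z */
        return 1;
    }
    return 0;
}

/* Prior probability of a health assignment (1 = healthy, 0 = faulty),
   assuming each component fails independently with probability p_fail. */
double prior(int h1, int h2, int h3, double p_fail) {
    int h[3] = { h1, h2, h3 };
    double p = 1.0;
    for (int i = 0; i < 3; i++)
        p *= h[i] ? 1.0 - p_fail : p_fail;
    return p;
}
```

With `p_fail` = 0.01, `prior(1, 0, 1, 0.01)` is 0.99 · 0.01 · 0.99 = 0.009801, so the single-fault candidate (1,0,1) dominates the ranking.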
Model-based diagnosis
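[Figure: inverter i with health variable hi, input xi, and output yi]
model component i: hi ⇒ (yi = ¬xi)
model system:
  h1 ⇒ (y1 = ¬x1)
  h2 ⇒ (y2 = ¬x2)
  h3 ⇒ (y3 = ¬x3)
  y1 = z
  z = x2
  z = x3
[Figure: circuit with input x1, internal wire z, outputs y2 and y3, inverters i1, i2, i3, and health variables h1, h2, h3]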
Model-based diagnosis
• For a given input, the model specifies a function
  f : health states × input → output
• Diagnosis entails calculating the inverse:
  f⁻¹ : input × output → health states
• f⁻¹ can be computed by a truth maintenance system
Improving MBD accuracy
[Figure: the inverter circuit i1, i2, i3 with observed values 1, 0, 1]
Valid diagnoses: (1,0,1), (0,0,1), (1,0,0), (0,1,0), (0,0,0)
Improving MBD accuracy
Add a second observation
[Figure: the inverter circuit i1, i2, i3 with observed values 0, 0, 1]
first observation: (1,0,1), (0,0,1), (1,0,0), (0,1,0), (0,0,0)
multiple: (0,0,1), (1,0,0), (0,1,0), (0,0,0)
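Combining observations amounts to keeping only the health assignments that explain every one of them. The sketch below is one way to do this by enumeration; it assumes the weak inverter model, an unobserved wire z, and the readings (x, output of i2, output of i3) = (1, 0, 1) for the first observation and (0, 0, 1) for the second. The `Observation` type and both function names are hypothetical.

```c
#include <assert.h>

/* One observation of the circuit: input x, output y2 of i2, output y3 of i3. */
typedef struct { int x, y2, y3; } Observation;

/* Weak model: a healthy inverter negates its input, a faulty one may do
   anything. z is the unobserved wire from i1 to i2 and i3. */
static int explains(int h1, int h2, int h3, Observation o) {
    for (int z = 0; z <= 1; z++) {
        if (h1 && z != !o.x) continue;
        if (h2 && o.y2 != !z) continue;
        if (h3 && o.y3 != !z) continue;
        return 1;
    }
    return 0;
}

/* Sets valid[4*h1 + 2*h2 + h3] = 1 for every health assignment that explains
   all n observations, and returns how many assignments remain. */
int diagnose(const Observation *obs, int n, int valid[8]) {
    assert(n >= 0);
    int count = 0;
    for (int code = 0; code < 8; code++) {
        int h1 = (code >> 2) & 1, h2 = (code >> 1) & 1, h3 = code & 1;
        valid[code] = 1;
        for (int k = 0; k < n; k++) {
            if (!explains(h1, h2, h3, obs[k])) {
                valid[code] = 0;
                break;
            }
        }
        count += valid[code];
    }
    return count;
}
```

Under these assumptions the first observation alone leaves five candidates, and adding the second removes (1,0,1), which matches the "multiple" column.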
Improving MBD accuracy
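Add more probes
[Figure: the inverter circuit i1, i2, i3 with observed values 0, 0, 1 and a probe reading 1]
first observation: (1,0,1), (0,0,1), (1,0,0), (0,1,0), (0,0,0)
multiple: (0,0,1), (1,0,0), (0,1,0), (0,0,0)
z=1: (0,1,0), (0,0,0)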
Model strength
• Different models capture different failure modes
• This is called the model “strength”
• “Stronger” inverter model:
  hi = ok ⇒ (yi = ¬xi)
  hi = stuck at 1 ⇒ (yi = 1)
  hi = stuck at 0 ⇒ (yi = 0)
  hi = bypass ⇒ (yi = xi)
Model strength
[Figure: inverter i with health variable hi, input xi, and output yi]
model component i:
  hi ⇒ (yi = ¬xi)
  ¬hi ⇒ ¬yi (“stuck-at-zero”)
Improving MBD accuracy
[Figure: the inverter circuit i1, i2, i3 with observed values 1, 0, 1]
weak: (1,0,1), (0,0,1), (1,0,0), (0,1,0), (0,0,0)
strong: (1,0,1), (0,0,1)
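The effect of the stronger model can be sketched in a few lines: once a faulty inverter is known to be stuck at zero, every health assignment predicts exactly one output pair, so consistency becomes a direct comparison. The code assumes the reading x = 1, output of i2 = 0, output of i3 = 1 for the observed values; `inverter` and `consistent_strong` are hypothetical names.

```c
#include <assert.h>

/* Stuck-at-zero model: a healthy inverter outputs not x,
   a faulty inverter always outputs 0. */
static int inverter(int healthy, int x) {
    return healthy ? !x : 0;
}

/* Returns 1 if health assignment (h1,h2,h3) predicts exactly the observed
   outputs y2 (of i2) and y3 (of i3) for input x. */
int consistent_strong(int h1, int h2, int h3, int x, int y2, int y3) {
    assert(x == 0 || x == 1);
    int z = inverter(h1, x);  /* wire between i1 and i2, i3 */
    return inverter(h2, z) == y2 && inverter(h3, z) == y3;
}
```

Under these assumptions only (1,0,1) and (0,0,1) remain consistent, which agrees with the "strong" column.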
Weak model
[Figure: components c1, c2, c3 with input x and outputs y1 and y2; expected behavior: y1 = y2 = x]
• Expected behavior is specified
• Functionality of c1, c2, and c3 is unknown
• c1 and c2 determine y1
• c1 and c3 determine y2
Weak model
[Figure: components c1, c2, c3 with observed values 1, 0, 1]
• c1 is involved both in correct and incorrect behavior
• c3 is only involved in correct behavior
• c2 is the only component that is exclusively involved in the computation of an incorrect result
Spectrum-based fault localization (SFL)
• Identify the components / parts whose activities coincide with the occurrence of failures
• Software can be seen as an executable model that tells us which parts of a system are involved in a computation
• Two approaches can be distinguished
  – Statistics-based
  – Based on (logic) reasoning
Program spectra
• Execution profiles that indicate, or count, which parts of a software system are used in a particular test case
• Introduced in [Reps97] for diagnosing Y2K problems
• Many different forms exist [Harrold98]:
  – Spectra of program locations
  – Spectra of branches / paths
  – Spectra of data dependencies
  – Spectra of method call sub-sequences
Block / function hit spectra
x1 x2 … xi … xM
Function hit spectrum
  1: function i called
  0: function i not called
Block hit spectrum
  1: block i executed
  0: block i not executed
Block:
• C statement (compound stmt)
• cases of a switch statement
Fault diagnosis
1. Spectra for N test cases (N cases × M components)
x11 x12 … x1M | e1
x21 x22 … x2M | e2
…
xN1 xN2 … xNM | eN
Row i: the blocks that are executed in case i
Column j : the test cases in which block j was executed
2. Error detection per test case
  ei = 1: error in the i-th test
  ei = 0: no error in the i-th test
Statistics-based Fault diagnosis
Compare every column vector (block j) with the error vector to obtain a similarity sj.
Statistics-based Fault diagnosis
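Jaccard similarity coefficient:
$s_j = \frac{n_{11}}{n_{11} + n_{10} + n_{01}}$
where $n_{pq}$ counts the test cases in which block j has value p and the error vector has value q.
block j:      1 0 1 0 1
error vector: 0 1 1 0 1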
Worked example for block j = (1,0,1,0,1) and error vector (0,1,1,0,1):
$s_j = \frac{2}{2 + 1 + 1} = \frac{1}{2}$
Statistics-based Fault diagnosis
x11 x12 … x1M | e1
x21 x22 … x2M | e2
…
xN1 xN2 … xNM | eN
s1 s2 … sM
(N cases × M components, plus the error vector)
For every block: similarity with the error vector.
The component with the highest sj most likely contains the fault.
Statistics-based Fault diagnosis
component   a  b  c  d  e  f  g  fail
test 1      0  1  1  1  1  0  0  0
test 2      0  0  0  1  0  1  1  1
test 3      1  1  1  1  0  0  0  1
test 4      0  0  0  0  0  0  0  0
test 5      1  1  0  1  1  0  1  1
s           ⅔  ½  ¼  ¾  ¼  ⅓  ⅔
$s = \frac{n_{11}}{n_{11} + n_{10} + n_{01}}$
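The coefficients in the table can be recomputed directly from the matrix. The sketch below hard-codes the five tests and seven components from the slide; the names `spectra`, `failed`, `jaccard`, and `most_suspect` are hypothetical, components a to g map to column indices 0 to 6, and returning 0 for a component with an empty denominator is an assumption.

```c
#include <assert.h>

#define N_TESTS 5
#define N_COMPS 7

/* Activity matrix: rows are tests 1..5, columns are components a..g. */
static const int spectra[N_TESTS][N_COMPS] = {
    {0, 1, 1, 1, 1, 0, 0},
    {0, 0, 0, 1, 0, 1, 1},
    {1, 1, 1, 1, 0, 0, 0},
    {0, 0, 0, 0, 0, 0, 0},
    {1, 1, 0, 1, 1, 0, 1},
};

/* Error vector: 1 if the test failed. */
static const int failed[N_TESTS] = {0, 1, 1, 0, 1};

/* Jaccard similarity between column j and the error vector. */
double jaccard(int j) {
    assert(j >= 0 && j < N_COMPS);
    int n11 = 0, n10 = 0, n01 = 0;
    for (int i = 0; i < N_TESTS; i++) {
        if (spectra[i][j] && failed[i]) n11++;  /* executed, test failed     */
        else if (spectra[i][j]) n10++;          /* executed, test passed     */
        else if (failed[i]) n01++;              /* not executed, test failed */
    }
    int denom = n11 + n10 + n01;
    return denom ? (double)n11 / denom : 0.0;
}

/* Index of the component with the highest similarity (first one on ties). */
int most_suspect(void) {
    int best = 0;
    for (int j = 1; j < N_COMPS; j++)
        if (jaccard(j) > jaccard(best)) best = j;
    return best;
}
```

Here `most_suspect()` returns index 3, i.e. component d with similarity ¾, so d would be inspected first.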
Example: rational bubble sort
    void RationalSort( int n, int *num, int *den ) {
        int i,j;                                        /* block 1 */
        for ( i=n-1; i>=0; i-- ) {
            assert( den[i] != 0 );                      /* block 2 */
            for ( j=0; j<i; j++ ) {
                if ( RationalGT( num[j], den[j],        /* block 3 */
                                 num[j+1], den[j+1] ) ) {
                    swap( &num[j], &num[j+1] );         /* block 4 */
                    /* swap( &den[j], &den[j+1] ); */
                }
            }
        }
    }
Fault: forgot to swap denominators
Error: sequence is not a permutation of input sequence
Failure: output is not a sorted version of the input
Example (2)
Earlier example: 4/1, 2/2, 0/1 → 2/1, 4/2, 0/1 → 2/1, 0/2, 4/1 → 0/1, 2/2, 4/1
ERROR!
Example (3)
Block 4 has highest similarity coefficient -> most likely suspect
Reasoning-based Fault Diagnosis
• MBD
  – Reasoning approach based on behavioral component models
  – High(er) diagnostic accuracy
  – Prohibitive (modeling and/or diagnosis) cost
• SFL
  – Statistical approach based on execution spectra
  – Lower diagnostic accuracy: cannot reason over multiple faults
  – No modeling (except test oracle) + low diagnosis cost
Idea: Extend SFL with MBD
• Combine best of both worlds
• MBD
  – Reasoning approach based on behavioral component models
  – High(er) diagnostic accuracy
  – Prohibitive (modeling and/or diagnosis) cost
• SFL
  – Statistical approach based on execution spectra
  – Lower diagnostic accuracy: cannot reason over multiple faults (MF)
  – No modeling (except test oracle) + low diagnosis cost
Working Example
SFL
TARANTULA
Reasoning
Ranking Candidates
• Probabilities updated according to Bayes’ rule
– where
Ranking Candidates
• Many ε-policies exist
– Ideally
• e.g.,
• ε = (1-h1) . (1-h2) . (1-h1) . (1-h2) . h1 . h2
– But estimating hj is far from trivial, hence approximations have been used so far (BAYES-A, [Abreu et al., WODA’08])
![Page 69: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/69.jpg)
69
Barinel
• Barinel’s key idea: for each candidate dk, compute the values hj of the candidate’s faulty components that maximize the probability Pr(e|dk) of the observations e occurring, conditioned on candidate dk
![Page 70: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/70.jpg)
70
Barinel Algorithm
1. Compute set of valid diagnosis candidates
• D = {d1 = {1,2}, d2 = {1,3}}
2. Derive Pr(e|d)
• Pr(e|d1) = (1 – h1 . h2) . (1 – h2) . (1 – h1) . h1
• Pr(e|d2) = (1 – h1) . (1 – h3) . (1 – h1) . h3 . h1

c1  c2  c3  e
1   1   0   1 (F)
0   1   1   1 (F)
1   0   0   1 (F)
1   0   1   0 (P)
![Page 71: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/71.jpg)
71
Barinel Algorithm
3. Compute hj by maximizing Pr(e|d)
– Maximum likelihood estimation
– Gradient ascent procedure
– Pr(e|d1): h1 = 0.47 ; h2 = 0.19 ; Pr(d1) = 0.19
– Pr(e|d2): h1 = 0.41 ; h3 = 0.50 ; Pr(d2) = 0.04
4. Rank candidates according to Pr(d)
– D = <{1,2}, {1,3}>
– Inspection starts with components 1 and 2
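The derivation of Pr(e|d) in step 2 can be sketched in C as follows. This is a minimal illustration, not Barinel's implementation: it assumes, as the slide's formulas imply, that hj is the probability that a faulty component j still behaves correctly, and that a run passes only if every involved candidate component does. The names `pr_e_given_d`, `A` and `e` are made up for this sketch, and the maximization of step 3 is not reproduced.

```c
#include <stddef.h>

#define N_RUNS 4
#define N_COMP 3

/* Activity matrix and error vector of the example (1 = involved / failed). */
static const int A[N_RUNS][N_COMP] = {
    {1, 1, 0},
    {0, 1, 1},
    {1, 0, 0},
    {1, 0, 1},
};
static const int e[N_RUNS] = {1, 1, 1, 0};

/*
 * Pr(e|d) for a candidate d, where d[j] = 1 means component j is assumed faulty.
 * For each run, take the product of h[j] over the candidate's components that
 * were involved in the run: a passed run contributes that product, a failed
 * run contributes one minus that product.
 */
static double pr_e_given_d(const int d[N_COMP], const double h[N_COMP])
{
    double pr = 1.0;
    for (size_t i = 0; i < N_RUNS; i++) {
        double all_ok = 1.0;
        for (size_t j = 0; j < N_COMP; j++) {
            if (d[j] && A[i][j]) {
                all_ok *= h[j];
            }
        }
        pr *= e[i] ? (1.0 - all_ok) : all_ok;
    }
    return pr;
}
```

Evaluated at the hj values reported on the slide, d1 scores higher than d2, which agrees with the ranking in step 4. The function also returns 0 for a set such as {2} that leaves a failed run unexplained, so it doubles as a validity check for candidates.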
![Page 72: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/72.jpg)
72
LIVE DEMO
• Requirements:
– The Zoltar toolset (www.fdir.org/zoltar)
– LLVM
– OS: Linux
• Because of Zoltar…
– Soon
• Will be available as an Eclipse plugin
• Support for Java
• T. Janssen, R. Abreu, and A.J.C. van Gemund, Zoltar: A Toolset for Automatic Fault Localization. In Proceedings of the 24th International Conference on Automated Software Engineering (ASE'09) - Tools Track, pp. 662--664, Auckland, New Zealand, November 2009. IEEE Computer Society. (Best Demo Award)
![Page 73: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/73.jpg)
73
Model-based vs. Spectrum-based
Model-based
• Model used primarily for reasoning
• All generated explanations are valid
• Most likely diagnosis need not be actual cause
• Well suited for hardware
Spectrum-based
• Model used primarily for error detection
• Ranking may contain invalid explanations
• Invalid explanations may rank high
• Well suited for software
![Page 74: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/74.jpg)
74
Outline
• Part I
– Diagnosis principles
– Model-Based Diagnosis
– Spectrum-Based Fault Localization
– Live Demo
• Part II
– Existing systems
– Lessons learned
– Case studies
– Further applications
– Related work
![Page 75: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/75.jpg)
75
Existing applications
• PinPoint: large on-line transaction processing systems (search engines, web mail) [Chen02]
• Tarantula: visualizing test information to aid manual debugging [Jones02]
• Ochiai [TAIC PART07; JSS09]
• Barinel [ASE09]
• …
![Page 76: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/76.jpg)
76
Similarity Coefficients
• Jaccard (PinPoint)
• Tarantula
• Ochiai (molecular biology)
![Page 77: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/77.jpg)
77
Fault diagnosis
block j:      (1, 0, 1, 0, 1)
error vector: (0, 1, 1, 0, 1)
Jaccard similarity coefficient:
$s_j = \frac{n_{11}}{n_{11} + n_{10} + n_{01}}$
![Page 78: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/78.jpg)
78
Fault diagnosis
block j:      (1, 0, 1, 0, 1)
error vector: (0, 1, 1, 0, 1)
Jaccard similarity coefficient:
$s_j = \frac{n_{11}}{n_{11} + n_{10} + n_{01}}$
![Page 79: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/79.jpg)
79
Fault diagnosis
block j:      (1, 0, 1, 0, 1)
error vector: (0, 1, 1, 0, 1)
Jaccard similarity coefficient:
$s_j = \frac{2}{2 + n_{10} + n_{01}}$
![Page 80: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/80.jpg)
80
Fault diagnosis
block j:      (1, 0, 1, 0, 1)
error vector: (0, 1, 1, 0, 1)
Jaccard similarity coefficient:
$s_j = \frac{2}{2 + 1 + n_{01}}$
![Page 81: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/81.jpg)
81
Fault diagnosis
block j:      (1, 0, 1, 0, 1)
error vector: (0, 1, 1, 0, 1)
Jaccard similarity coefficient:
$s_j = \frac{2}{2 + 1 + 1}$
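The counting behind this worked example can be sketched in C. The sketch assumes that the first index of n_pq says whether the block was executed in a run and the second whether that run failed. The vectors used in the usage note are one reading of the slide's figure that reproduces n11 = 2, n10 = 1, n01 = 1. The names `struct counts`, `count_runs` and `jaccard` are illustrative, and returning 0 for an empty denominator is an assumption of this sketch.

```c
#include <stddef.h>

/* Counts n_pq: p = block executed in the run, q = run failed. */
struct counts {
    unsigned n11, n10, n01, n00;
};

/* Compare one block's column of the spectra with the error vector. */
static struct counts count_runs(const int *block, const int *error, size_t n_runs)
{
    struct counts c = {0, 0, 0, 0};
    for (size_t i = 0; i < n_runs; i++) {
        if (block[i] && error[i])       c.n11++;
        else if (block[i] && !error[i]) c.n10++;
        else if (!block[i] && error[i]) c.n01++;
        else                            c.n00++;
    }
    return c;
}

/* Jaccard similarity; taken to be 0 when the denominator is 0 (assumption). */
static double jaccard(struct counts c)
{
    unsigned den = c.n11 + c.n10 + c.n01;
    return den ? (double)c.n11 / den : 0.0;
}
```

Calling `count_runs` with block j = (1, 0, 1, 0, 1) and error vector = (0, 1, 1, 0, 1) gives n11 = 2, n10 = 1, n01 = 1, n00 = 1, and `jaccard` then returns 2 / (2 + 1 + 1) = 0.5.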
![Page 82: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/82.jpg)
82
Diagnostic quality
• Percentage of blocks that need not be inspected:
![Page 83: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/83.jpg)
83
Discussion
• Under the specific conditions of our experiment, Ochiai outperforms 8 other coefficients.
• Why?
• To what extent does this depend on the conditions of our experiment?
– Quality of the passed / failed information
– Number of runs
– Artificial bugs in Siemens set
![Page 84: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/84.jpg)
84
Ochiai outperforms Tarantula
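$$\frac{\frac{n_{11}}{n_{11}+n_{01}}}{\frac{n_{11}}{n_{11}+n_{01}} + \frac{n_{10}}{n_{10}+n_{00}}} = \frac{1}{1 + \frac{n_{10}}{n_{10}+n_{00}} \cdot \frac{n_{11}+n_{01}}{n_{11}}} = \frac{1}{1 + c \, \frac{n_{10}}{n_{11}}} \qquad (n_{11} > 0)$$
with $c = \frac{n_{11}+n_{01}}{n_{10}+n_{00}} = \frac{N_F}{N_P}$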
![Page 85: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/85.jpg)
85
Ochiai outperforms Tarantula
Tarantula: $\frac{1}{1 + c \, \frac{n_{10}}{n_{11}}}$
Ochiai: $\frac{n_{11}}{\sqrt{(n_{11}+n_{01})(n_{11}+n_{10})}}$
![Page 86: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/86.jpg)
86
Ochiai outperforms Tarantula
Tarantula: $\frac{1}{1 + c \, \frac{n_{10}}{n_{11}}}$
Only presence in passed runs lowers the similarity
Ochiai: $\frac{n_{11}}{\sqrt{(n_{11}+n_{01})(n_{11}+n_{10})}}$
Absence in failed runs also lowers the similarity
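A small numeric sketch in C makes the difference concrete. The counts in the usage note are hypothetical (4 failed and 4 passed runs) and are not taken from the slides. Ochiai is computed in squared form to avoid a square root; squaring preserves the order of non-negative values, so the ranking is unchanged. The function names are illustrative.

```c
/* Tarantula: failed-run ratio divided by (failed-run ratio + passed-run ratio). */
static double tarantula(unsigned n11, unsigned n10, unsigned n01, unsigned n00)
{
    unsigned nf = n11 + n01;   /* number of failed runs */
    unsigned np = n10 + n00;   /* number of passed runs */
    double fail_ratio = nf ? (double)n11 / nf : 0.0;
    double pass_ratio = np ? (double)n10 / np : 0.0;
    double den = fail_ratio + pass_ratio;
    return den > 0.0 ? fail_ratio / den : 0.0;
}

/* Squared Ochiai: n11^2 / ((n11 + n01) * (n11 + n10)); ranks like Ochiai. */
static double ochiai_squared(unsigned n11, unsigned n10, unsigned n01)
{
    double den = (double)(n11 + n01) * (n11 + n10);
    return den > 0.0 ? (double)n11 * n11 / den : 0.0;
}
```

Take block A with (n11, n10, n01, n00) = (1, 0, 3, 4), executed in only one of the four failed runs, and block B with (4, 1, 0, 3), executed in all failed runs and in one passed run. Tarantula ranks A above B because only n10 lowers its score, whereas Ochiai ranks B above A because A's absence from three failed runs is penalized as well.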
![Page 87: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/87.jpg)
87
Ochiai outperforms Jaccard
Jaccard: $\frac{n_{11}}{n_{11} + n_{01} + n_{10}}$
Ochiai: $\frac{n_{11}}{\sqrt{(n_{11}+n_{01})(n_{11}+n_{10})}}$
![Page 88: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/88.jpg)
88
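Ochiai outperforms Jaccard
$\frac{n_{11}}{\sqrt{(n_{11}+n_{01})(n_{11}+n_{10})}}$
square: $\frac{n_{11}^2}{(n_{11}+n_{01})(n_{11}+n_{10})}$
rewrite denominator: $\frac{n_{11}^2}{n_{11}^2 + n_{11}n_{10} + n_{11}n_{01} + n_{01}n_{10}}$
eliminate $n_{11}$ (divide numerator and denominator by $n_{11}$): $\frac{n_{11}}{n_{11} + n_{10} + n_{01} + n_{01}n_{10}/n_{11}}$
None of these steps modifies the ranking!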
![Page 89: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/89.jpg)
89
Ochiai outperforms Jaccard
Jaccard: $\frac{n_{11}}{n_{11} + n_{01} + n_{10}}$
Ochiai: $\frac{n_{11}}{n_{11} + n_{10} + n_{01} + n_{01}n_{10}/n_{11}}$
differences are amplified
![Page 90: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/90.jpg)
90
Quality of the passed / failed info
• Failure detection is a crude error detection mechanism.
• qe = n11 / (n11 + n10)
• In the Siemens Set, qe ranges from 1.4% on average for schedule2 to 20.3% on average for tot_info.
• Can be increased by excluding a run that contributes to n10
• Can be decreased by excluding a run that contributes to n11
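As a small illustration of how excluding runs moves qe, here is a sketch in C. The function name is made up for this example, and returning 0 when the block was never executed is an assumption of the sketch.

```c
/* q_e = n11 / (n11 + n10): of the runs that execute the block, the share that failed. */
static double error_detection_quality(unsigned n11, unsigned n10)
{
    unsigned executed = n11 + n10;
    return executed ? (double)n11 / executed : 0.0;
}
```

With n11 = 2 and n10 = 8, qe is 0.2. Dropping one run that contributes to n10 raises it to 2/9, while dropping one that contributes to n11 lowers it to 1/9.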
![Page 91: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/91.jpg)
91
Quality of the passed / failed info
Small fraction of fault activations detected is enough
![Page 92: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/92.jpg)
92
Number of runs
• On average, for the Siemens set:
– Adding more failed tests is safe
– 6 failed tests are enough
– The number of passed tests has no influence
• However:
– For individual runs the effect of adding passed tests differs
– It stabilizes around 20 passed tests
![Page 93: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/93.jpg)
93
Influence of #runs
![Page 94: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/94.jpg)
94
Influence of #runs
• On average, for our benchmark:
– Adding failed runs is safe
– 6 failed runs is enough
– The number of passed runs has no influence
• However
– For individual runs, the effect of more passed runs differs
– It stabilizes around 20
![Page 95: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/95.jpg)
95
Dependence on Siemens set faults
• Investigate industrial relevance in TRADER project: improve the user-perceived reliability of high-volume consumer electronics devices
• Test case: television platform from NXP
• Partners:
– Universities of Delft, Twente, Leiden
– Embedded Systems Institute, Design Technology Institute, IMEC Leuven
– NXP (formerly Philips Semiconductors)
![Page 96: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/96.jpg)
96
Embedded systems
• Low overhead
• Little infrastructure needed
• Consumer electronics
– No time for exhaustive debugging
– Helps to identify responsible teams / developers
• Diagnosis can drive a recovery mechanism, e.g., rebooting suspect processes
![Page 97: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/97.jpg)
97
Case study – platform
• Control software of an analog TV
• Decodes RC input, displays the on-screen menu and teletext, optimizes parameters for audio / video processing based on signal analysis, etc.
• 450 K lines of C code
• 2 MB of RAM + 2 MB in development version
• CPU: MIPS running a small multi-tasking OS
• Work is organized in 315 logical threads
• UART connection to a PC
![Page 98: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/98.jpg)
98
Case study
TV TXT TV
1. Load problem:
![Page 99: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/99.jpg)
99
Diagnosis
• 150 hit spectra of 315 functions, corresponding to the logical threads (one per second): 60 sec. TV, 30 sec. TXT, 60 sec. TV
• Marked the last 60 spectra as failed
• 2nd in ranking of 315 functions
![Page 100: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/100.jpg)
100
Case study
2. Teletext lock-up:
– Existing problem in another product line
– Copied to our platform, triggered by a remote control key sequence
– Inconsistency in two state variables, for which only specific combinations are allowed
![Page 101: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/101.jpg)
101
Lock-up problem
• Fault: case study lock-up
– In text mode, the sequence introduces a state inconsistency
• Error detection:
– Check on the two variables involved in the inconsistency (Hasan Sozer)
• Collecting spectra:
– Small Koala component for caching / transmitting spectra
– Transaction: time between two key presses
• Diagnosis:
– block that introduces the inconsistency
2 ? 1 1
![Page 102: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/102.jpg)
102
Sample code (used for testing purposes only!)

    Bool mgkey__rkeyntf_OnUp (KeySource source, KeySystem system, KeyCommand command)
    {
        hook_log (20345);                   /* log use of the block in the current spectrum */
        if ((1) && Enabled) {
            Bool translated=0;
            hook_log (20346);
            hook_EndTransaction ();         /* start a new spectrum */
            ...
            if ( !translated) {
                hook_log (20349);
                Translate (source, system, &command);
            }
            if (command >= 1000 && command <= 1009) {
                hook_log (20350);
                seq[0] = seq[1]; seq[1] = seq[2]; seq[2] = seq[3];
                seq[3] = command - 1000;
                if ( !triggered) {
                    hook_log (20351);
                    if (seq[0] == 1 && seq[1] == 2) {
                        hook_log (20353);
                        triggered = 1;
                        switch (seq[3]) {
                        case 1:
                            hook_log (20354);   /* remember block 20354 */
                            tmode = 6;          /* inconsistency */
                            break;
                        case 2:
                            hook_log (20355);
                            ...
![Page 103: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/103.jpg)
103
Experiments
x11 x12 … x1M | e1
x21 x22 … x2M | e2
…   …   … …   | …
xN1 xN2 … xNM | eN

Columns: M > 60,000
Rows: 6 - 26 trans.
13,451 – 13,796 blocks were executed in the scenarios
![Page 104: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/104.jpg)
104
Scenario 1
23 key presses:
P+ P- Vol+ Vol- Txt 751 100 121 100 Txt Txt Txt 751
• 23 spectra of 65535 bytes
• 23 error detection reports: 22 pass, 1 fail
![Page 105: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/105.jpg)
105
Diagnosis
Block numbers sorted on decreasing similarity to the error vector:
20353 (1/1) 20354 (1/1) 58890 (1/4) 3134 (1/5) 3664 (1/6) 3135 (1/6) 58889 (1/7) 59839 (1/8) 29569 (1/9) 1256 (1/9) 15755 (1/10) 20351 (1/10) 15781 (1/11) 15777 (1/11) 15778 (1/11) 15779 (1/11) 15782 (1/11) 15823 (1/11) 20432 (1/11) 15727 (1/11) ...
• Block 20354 is right at the top of the diagnosis!
• … but it shares the first position with block 20353.
![Page 106: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/106.jpg)
106
Scenario 2
26 key presses:
P+ P- Vol+ Vol- Txt 121 751 100 121 100 Txt Txt Txt 751
The sequence 121 7 exonerates block 20353
![Page 107: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/107.jpg)
107
Diagnosis for scenario 2
• Block 20354 is diagnosed correctly
• Its Ochiai similarity to the error vector is twice as large as that of any other block
20354 (1/1) 20353 (1/2) 3134 (1/5) 50466 (1/11) 20432 (1/11) 15755 (1/11) 58208 (1/12) 58207 (1/12) 59816 (1/12) 50439 (1/12) 50436 (1/12) 14817 (1/12) 50432 (1/12) 50437 (1/12) 50288 (1/12) 50428 (1/12) 14814 (1/12) 14816 (1/12) 50422 (1/12) 14813 (1/12) ...
![Page 108: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/108.jpg)
108
Error detection
Log an error on each transition from a consistent state to an inconsistent state:

Sample code (used for testing purposes only!)

    void hook_EndTransaction()
    {
        ...
        if (IS_TXT_OFF(tmode) == IS_DISPLAY_TXT(CurrentDisplayProfile) ) {
            if ( gv_error_prev == 0 ) {
                hook_log( PROFILE_SIZE * 2 - 1 );
            }
            gv_error_prev = 1;
        } else {
            gv_error_prev = 0;
        }
        ...
    }
![Page 109: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/109.jpg)
109
Error detection
Definition of error is important
• Inconsistent state: all transactions after the first inconsistency are considered to demonstrate an error. This obscures the actual cause.
• Change to inconsistent state: also obscures the cause if the scenario includes TV mode to Txt transitions after the first inconsistency.
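The two definitions of an error can be contrasted with a minimal C sketch. The function names are hypothetical, and each transaction is reduced to a single flag saying whether the state is inconsistent at its end; the real check in the TV software compares two state variables.

```c
/* "Inconsistent state": every transaction that ends in an inconsistent state fails. */
static int error_by_state(int inconsistent_now)
{
    return inconsistent_now != 0;
}

/* "Change to inconsistent state": only a transition from consistent to
 * inconsistent fails; *inconsistent_prev carries the previous state. */
static int error_by_transition(int inconsistent_now, int *inconsistent_prev)
{
    int error = inconsistent_now && !*inconsistent_prev;
    *inconsistent_prev = (inconsistent_now != 0);
    return error;
}
```

For the state sequence 0, 1, 1, 0, 1 the first definition marks transactions 2, 3 and 5 as failed and the second marks only 2 and 5. The later failure at transaction 5 is the kind of repeated transition that the second bullet warns about.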
![Page 110: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/110.jpg)
110
Case Studies (Summary)
Case               To Inspect            Out of / Previous
Load Problem       2 logical threads     315
Teletext Lock-Up   2 blocks              60K
NVM corrupt        96 blocks, 10 files   150K, 1.8K
Scrolling Bug      5 blocks              150K
Invisible Pages    12 blocks             150K
Tuner Problem      2 files               1.8K
Zapping Crash      1 run (15 mins)       1 day (develop)
Wrong Audio        1 run (15 mins)       ½ day (expert)
![Page 111: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/111.jpg)
111
Experiments
• Off-loading spectra via UART for off-line analysis
• Half-word encoded spectra
• Critical sections
• Idle time after 24 transactions (tests)
• TV ran really slowly (zapping takes seconds), but was stable
![Page 112: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/112.jpg)
112
Resource constraints
• Limited memory
• Limited CPU time
• Concurrency
![Page 113: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/113.jpg)
113
Memory constraints
• It is not possible to store the spectra for all tests on the embedded device
• E.g., block level instrumentation of the TV software yields spectra of over 60,000 flags
• Using a byte per flag, we could store 24 spectra
• Offloading a spectrum via the UART took several seconds
• Fortunately, we don’t need to store all spectra
![Page 114: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/114.jpg)
114
Update counters at run-time
Current sp.   …   0    1    …   0    1    …
a00           …   ++        …             …
a10           …        ++   …             …
a01           …             …   ++        …
a11           …             …        ++   …
                  passed        failed

$s = \frac{n_{11}}{n_{11} + n_{10} + n_{01}}$
![Page 115: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/115.jpg)
115
Perform diagnosis anytime
a00   …   3    4    …   10   12   …
a10   …   10   11   …   3    4    …
a01   …   0    1    …   1    1    …
a11   …   1    2    …   1    1    …

1. Calculate coefficients: $s = \frac{n_{11}}{n_{11} + n_{10} + n_{01}}$
2. Sort
1/7 1/11 1/5 1/6
![Page 116: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/116.jpg)
116
Self diagnosing system
          a  b  c  d  e  f  g   fail
test 1
test 2
test 3
test 4
test 5

a00       0  0  0  0  0  0  0
a01       0  0  0  0  0  0  0
a10       0  0  0  0  0  0  0
a11       0  0  0  0  0  0  0

Jaccard
![Page 117: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/117.jpg)
117
Self diagnosing system
          a  b  c  d  e  f  g   fail
test 1    0  1  1  1  1  0  0   0
test 2
test 3
test 4
test 5

a00       1  0  0  0  0  1  1
a01       0  0  0  0  0  0  0
a10       0  1  1  1  1  0  0
a11       0  0  0  0  0  0  0

Jaccard   0  0  0  0  0  0  0
![Page 118: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/118.jpg)
118
Self diagnosing system
          a  b  c  d  e  f  g   fail
test 1
test 2    0  0  0  1  0  1  1   1
test 3
test 4
test 5

a00       1  0  0  0  0  1  1
a01       1  1  1  0  1  0  0
a10       0  1  1  1  1  0  0
a11       0  0  0  1  0  1  1

Jaccard   0  0  0  ½  0  1  1
![Page 119: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/119.jpg)
119
Self diagnosing system
          a  b  c  d  e  f  g   fail
test 1
test 2
test 3    1  1  1  1  0  0  0   1
test 4
test 5

a00       1  0  0  0  0  1  1
a01       1  1  1  0  2  1  1
a10       0  1  1  1  1  0  0
a11       1  1  1  2  0  1  1

Jaccard   ½  ⅓  ⅓  ⅔  0  ½  ½
![Page 120: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/120.jpg)
120
Self diagnosing system
          a  b  c  d  e  f  g   fail
test 1
test 2
test 3
test 4    0  0  0  0  0  0  0   0
test 5

a00       2  1  1  1  1  2  2
a01       1  1  1  0  2  1  1
a10       0  1  1  1  1  0  0
a11       1  1  1  2  0  1  1

Jaccard   ½  ⅓  ⅓  ⅔  0  ½  ½
![Page 121: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/121.jpg)
121
Self diagnosing system
          a  b  c  d  e  f  g   fail
test 1
test 2
test 3
test 4
test 5    1  1  0  1  1  0  1   1

a00       2  1  1  1  1  2  2
a01       1  1  2  0  2  2  1
a10       0  1  1  1  1  0  0
a11       2  2  1  3  1  1  2

Jaccard   ⅔  ½  ¼  ¾  ¼  ⅓  ⅔
![Page 122: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/122.jpg)
122
Self diagnosing system
          a  b  c  d  e  f  g   fail
test 1
test 2
test 3
test 4
test 5

a00       2  1  1  1  1  2  2
a01       1  1  2  0  2  2  1
a10       0  1  1  1  1  0  0
a11       2  2  1  3  1  1  2

Jaccard   ⅔  ½  ¼  ¾  ¼  ⅓  ⅔

We obtain the same diagnosis without storing any data related to the individual test cases
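The counter-based scheme of this walkthrough can be sketched in C as follows. It is an illustration under assumptions rather than the code that ran on the TV: the names `counters`, `update_counters` and `jaccard_of` are made up, one byte per flag is used for readability, and a coefficient with an empty denominator is reported as 0, as the slides do after test 1.

```c
#include <stddef.h>

#define N_BLOCKS 7   /* blocks a..g of the example */

/* counters[j][p][q]: p = block j executed, q = run failed (a_pq on the slides). */
static unsigned counters[N_BLOCKS][2][2];

/* Fold one finished spectrum into the counters; the spectrum can then be discarded. */
static void update_counters(const unsigned char spectrum[N_BLOCKS], int failed)
{
    for (size_t j = 0; j < N_BLOCKS; j++) {
        counters[j][spectrum[j] ? 1 : 0][failed ? 1 : 0]++;
    }
}

/* Jaccard coefficient a11 / (a11 + a10 + a01) of block j; 0 if the denominator is 0. */
static double jaccard_of(size_t j)
{
    unsigned den = counters[j][1][1] + counters[j][1][0] + counters[j][0][1];
    return den ? (double)counters[j][1][1] / den : 0.0;
}
```

Feeding the five tests of the walkthrough through `update_counters` one at a time reproduces the final row of coefficients, for example ¾ for block d and ¼ for block c, while only four counters per block are kept in memory.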
![Page 123: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/123.jpg)
123
CPU time constraints
• Many embedded systems have real-time constraints (e.g., update display area during vertical blank)
• Recording a spectrum involves setting a bit for every block / function / etc. executed: unavoidable, but affordable
• Processing the recorded spectra must be done on a low-priority thread
![Page 124: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/124.jpg)
124
Spectrum cache
Spectrum currently being recorded
Spectrum currently being processed during idle time
Tests / transactions should allow sufficient idle time to prevent overflow of the spectrum cache
![Page 125: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/125.jpg)
125
Concurrency
• Hit-spectra occupy a bit per block/function/etc.
• Bit level access is not atomic: two threads modifying different bits in the same word lead to incorrect results
• Options:
– Critical section per update (time)
– Use a word per block/function/etc. (space)
– Record spectra per thread (time + space)
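The space-for-safety option can be illustrated with a short C sketch. The names and the block count are hypothetical, and the sketch assumes a target on which an aligned word store is atomic; it does not show locking or per-thread spectra.

```c
#include <stdint.h>

#define N_BLOCKS 64

/* Bit-encoded spectrum: compact, but |= is a read-modify-write of a shared
 * byte, so two threads updating neighbouring bits can overwrite each other
 * unless the update is protected by a critical section. */
static uint8_t bit_spectrum[(N_BLOCKS + 7) / 8];

static void log_block_bit(unsigned block)
{
    bit_spectrum[block / 8] |= (uint8_t)(1u << (block % 8));
}

/* Word-encoded spectrum: each block owns a whole word, so logging is a plain
 * store that never touches another block's flag, at the cost of more memory. */
static unsigned word_spectrum[N_BLOCKS];

static void log_block_word(unsigned block)
{
    word_spectrum[block] = 1u;
}
```

Both functions record the same information. The word-encoded array is simply `8 * sizeof(unsigned)` times larger than the bit-encoded one, which is the time versus space trade-off named in the list above.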
![Page 126: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/126.jpg)
126
Trade-offs
Time Space
Small cache: idle time in tests, critical sections, bit encoded spectra
Large cache: fast tests, atomic updates, word encoded spectra
![Page 127: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/127.jpg)
127
Further applications
• Systems that warn of possible errors within themselves [Reps97]
– Obtain spectra for nominal behavior in a warming-up period
– Generate a warning if previously unseen behavior is detected
• Recovery: reset those processes whose behavior appears to correlate with potentially dangerous situations
– Form of software rejuvenation [Huang95]
– Requires recovery-oriented design
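The warning idea in the first bullet can be sketched in C with a deliberately simple notion of "previously unseen behavior": a block that executes after the warm-up period although it never executed during it. This is an assumption made for illustration; the cited work may use richer spectra. The function names and the block count are hypothetical.

```c
#include <stddef.h>

#define N_BLOCKS 8

/* Blocks seen at least once during the warm-up period (nominal behavior). */
static unsigned char nominal[N_BLOCKS];

/* Warm-up: merge an observed spectrum into the nominal profile. */
static void learn_nominal(const unsigned char spectrum[N_BLOCKS])
{
    for (size_t j = 0; j < N_BLOCKS; j++) {
        if (spectrum[j]) {
            nominal[j] = 1;
        }
    }
}

/* After warm-up: count executed blocks that were never seen during warm-up. */
static size_t unseen_blocks(const unsigned char spectrum[N_BLOCKS])
{
    size_t unseen = 0;
    for (size_t j = 0; j < N_BLOCKS; j++) {
        if (spectrum[j] && !nominal[j]) {
            unseen++;
        }
    }
    return unseen;
}
```

A caller would invoke `learn_nominal` for every transaction during warm-up and afterwards raise a warning whenever `unseen_blocks` returns a non-zero count.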
![Page 128: Fault Diagnosis of Software Systems - tarot2010.ist.tugraz.attarot2010.ist.tugraz.at/slides/Abreu.pdf– shortens the test – diagnose – repair cycle – More bugs solved → more](https://reader033.vdocuments.mx/reader033/viewer/2022041415/5e1b0e1b3958c142c2749d35/html5/thumbnails/128.jpg)
128
Related work
Delta Debugging [Zeller]:
• Search for the smallest difference (delta) in the initial state (input) of a passed run and a failed run that causes the failure of interest
– Maintain dependencies between variables to guarantee valid states
• Iteratively advance both runs in the debugger, and repeat the search on the current state
• Stop when the failure of interest occurs
• This results in a sequence of failure-inducing states that helps to locate the fault
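The search for a small failure-inducing difference can be illustrated with a deliberately simplified C sketch. It is not Zeller's algorithm: it works only on input differences (not program states), represents them as bits in a mask, and drops them greedily one at a time. `minimize_deltas`, `fails_fn`, and `demo_fails` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical test oracle: returns nonzero if applying this set of
   differences to the passing input still makes the run fail. */
typedef int (*fails_fn)(uint32_t deltas);

/* Greedy one-at-a-time minimization: drop each difference in turn and
   keep the smaller set whenever the failure persists. The result is
   1-minimal: removing any single remaining difference makes the run pass. */
uint32_t minimize_deltas(uint32_t deltas, fails_fn fails)
{
    assert(fails(deltas)); /* the full difference must reproduce the failure */
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < 32; i++) {
            uint32_t bit = (uint32_t)1 << i;
            if (!(deltas & bit))
                continue;
            uint32_t candidate = deltas & ~bit;
            if (fails(candidate)) {
                deltas = candidate;
                changed = 1;
            }
        }
    }
    return deltas;
}

/* Demo oracle (an assumption for illustration): the run fails only when
   differences 1 and 4 are both applied. */
int demo_fails(uint32_t deltas)
{
    return (deltas & 0x12u) == 0x12u;
}
```

Starting from eight differences (`0xFF`), the demo oracle leaves only differences 1 and 4.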
Related work
• Nearest-neighbor [Renieris, Reiss]
• Compare spectra for
– Single failed run
– Most similar passed run
• Can be seen as SFL on a subset of all spectra
• Many variants for selecting the subset are possible
• Initial experiments did not promise great benefits
• [Jones05]: Tarantula outperforms NN and DD
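A small C sketch of the nearest-neighbor idea follows. It assumes binary block-hit spectra packed into 64-bit masks and uses Hamming distance as the similarity measure, which is one possible choice rather than necessarily the one used in the cited work. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of set bits in x. */
static int popcount64(uint64_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Index of the passed run whose spectrum differs from the failed run
   in the fewest blocks (ties go to the earliest run). */
size_t nearest_passed(uint64_t failed, const uint64_t *passed, size_t n_passed)
{
    assert(passed != NULL && n_passed > 0);
    size_t best = 0;
    int best_dist = popcount64(failed ^ passed[0]);
    for (size_t i = 1; i < n_passed; i++) {
        int d = popcount64(failed ^ passed[i]);
        if (d < best_dist) {
            best_dist = d;
            best = i;
        }
    }
    return best;
}

/* Blocks executed in the failed run but not in its nearest passed run. */
uint64_t suspicious_blocks(uint64_t failed, const uint64_t *passed, size_t n_passed)
{
    return failed & ~passed[nearest_passed(failed, passed, n_passed)];
}
```

For a failed spectrum `0xB` and passed spectra `0x6`, `0x3`, `0xC`, the nearest passed run is the second one and only block 3 is reported.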
Related work
DD + dynamic slicing [Gupta et al]:
• Backward (dynamic) slice: all statements that influence the value of a variable at a point in the execution.
• Forward slice: all statements that are affected by a variable at a point in the execution.
• Intersect:
– Forward slice of the minimal failure-inducing input difference
– Backward slice of the variables where the failure occurs
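The intersection can be sketched in C over a tiny dependence graph. The sketch assumes the dependences of one execution have already been recorded as a bitmask per statement (`dep[i]` lists the statements that statement `i` directly depends on); real dynamic slicers work on execution traces, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Backward slice: stmt plus every statement it transitively depends on. */
uint32_t backward_slice(const uint32_t dep[], int n, int stmt)
{
    assert(n > 0 && n <= 32 && stmt >= 0 && stmt < n);
    uint32_t slice = (uint32_t)1 << stmt;
    uint32_t prev;
    do { /* iterate to a fixed point */
        prev = slice;
        for (int i = 0; i < n; i++)
            if (slice & ((uint32_t)1 << i))
                slice |= dep[i];
    } while (slice != prev);
    return slice;
}

/* Forward slice: stmt plus every statement that transitively depends on it. */
uint32_t forward_slice(const uint32_t dep[], int n, int stmt)
{
    uint32_t slice = 0;
    for (int t = 0; t < n; t++)
        if (backward_slice(dep, n, t) & ((uint32_t)1 << stmt))
            slice |= (uint32_t)1 << t;
    return slice;
}

/* Candidate fault locations: statements affected by the failure-inducing
   input that also influence the point where the failure is observed. */
uint32_t suspect_statements(const uint32_t dep[], int n,
                            int input_stmt, int failure_stmt)
{
    return forward_slice(dep, n, input_stmt) &
           backward_slice(dep, n, failure_stmt);
}
```

For a hypothetical five-statement program where statement 2 uses statement 0, statement 3 uses statement 1, and statement 4 uses both 2 and 3, an input difference at statement 0 and a failure at statement 4 narrow the candidates to statements 0, 2, and 4.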
Related work
[Figure labels: X; faulty output; min. failure-inducing input difference]
• Reported results are impressive
• Slicing is expensive
Related work
Model-based debugging
#include <stdio.h>

void f( int x ) {
  int z, y1, y2;
  /* 1 */ z = x+1;
  /* 2 */ y1 = z*2;
  /* 3 */ y2 = z+2;
  /* 4 */ printf( "y1=%d, y2=%d\n", y1, y2 );
}
hi: statement i contributes to the intended behavior of the program
h1 ⇒ zOK
h2 ⇒ (zOK ⇒ y1OK)
h3 ⇒ (zOK ⇒ y2OK)
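To see how this model yields diagnoses, the following C sketch checks which sets of faulty statements are consistent with an observation. The observation used in the usage note (y1 wrong, y2 correct) is an assumption for illustration, and `is_diagnosis` is a hypothetical name. A candidate is consistent if some value of zOK satisfies all three implications.

```c
#include <assert.h>

static int implies(int a, int b)
{
    return !a || b;
}

/* faulty: bitmask of statements assumed faulty (bit 0 = statement 1,
   bit 1 = statement 2, bit 2 = statement 3), so h_i is the negation of bit i-1.
   y1_ok / y2_ok: whether the observed outputs were correct.
   Returns 1 if the candidate is consistent with the model and the observation. */
int is_diagnosis(unsigned faulty, int y1_ok, int y2_ok)
{
    assert(faulty < 8u);
    int h1 = !(faulty & 1u);
    int h2 = !(faulty & 2u);
    int h3 = !(faulty & 4u);
    /* zOK is not observed, so try both values */
    for (int z_ok = 0; z_ok <= 1; z_ok++) {
        if (implies(h1, z_ok) &&
            implies(h2, implies(z_ok, y1_ok)) &&
            implies(h3, implies(z_ok, y2_ok)))
            return 1;
    }
    return 0;
}
```

Under the assumed observation, `is_diagnosis(0, 0, 1)` is 0 (the program cannot be entirely healthy), while statement 1 alone or statement 2 alone each explain the failure. Statement 3 alone does not, since with h1 and h2 true the model forces y1 to be correct.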
Related work
Model-based software debugging
• [Wotawa02]: model-based debugging using dependency-based models is equivalent to slicing
• Survey in [Mayer & Stumptner, ASE08]
SFL vs. Related Work
Conclusion
• More than any other software fault diagnosis method, SFL
– is cheap
– is practicable, and
– appears to work in practice
• SFL can easily be integrated with testing
• Ongoing work
– Diagnosis-based Approach to Test Sequencing
– Automatic Recovery from Software Failures [QSIC10]
Help me! I’ve got a problem. We might lose the picture for a while…
No Way, Not now! They’re about to score!
I’ll recover for you!