
Post on 29-Dec-2015


Delta Debugging

Koushik Sen, EECS, UC Berkeley


<td align=left valign=top><SELECT NAME="op sys" MULTIPLE SIZE=7><OPTION VALUE="All">All<OPTION VALUE="Windows 3.1">Windows 3.1<OPTION VALUE="Windows 95">Windows 95<OPTION VALUE="Windows 98">Windows 98<OPTION VALUE="Windows ME">Windows ME<OPTION VALUE="Windows 2000">Windows 2000<OPTION VALUE="Windows NT">Windows NT<OPTION VALUE="Mac System 7">Mac System 7<OPTION VALUE="Mac System 7.5">Mac System 7.5<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1<OPTION VALUE="Mac System 8.0">Mac System 8.0<OPTION VALUE="Mac System 8.5">Mac System 8.5<OPTION VALUE="Mac System 8.6">Mac System 8.6<OPTION VALUE="Mac System 9.x">Mac System 9.x<OPTION VALUE="MacOS X">MacOS X<OPTION VALUE="Linux">Linux<OPTION VALUE="BSDI">BSDI<OPTION VALUE="FreeBSD">FreeBSD<OPTION VALUE="NetBSD">NetBSD<OPTION VALUE="OpenBSD">OpenBSD<OPTION VALUE="AIX">AIX<OPTION VALUE="BeOS">BeOS<OPTION VALUE="HP-UX">HP-UX<OPTION VALUE="IRIX">IRIX<OPTION VALUE="Neutrino">Neutrino<OPTION VALUE="OpenVMS">OpenVMS<OPTION VALUE="OS/2">OS/2<OPTION VALUE="OSF/1">OSF/1<OPTION VALUE="Solaris">Solaris<OPTION VALUE="SunOS">SunOS<OPTION VALUE="other">other</SELECT></td><td align=left valign=top><SELECT NAME="priority" MULTIPLE SIZE=7><OPTION VALUE="--">--<OPTION VALUE="P1">P1<OPTION VALUE="P2">P2<OPTION VALUE="P3">P3<OPTION VALUE="P4">P4<OPTION VALUE="P5">P5</SELECT></td><td align=left valign=top><SELECT NAME="bug severity" MULTIPLE SIZE=7><OPTION VALUE="blocker">blocker<OPTION VALUE="critical">critical<OPTION VALUE="major">major<OPTION VALUE="normal">normal<OPTION VALUE="minor">minor<OPTION VALUE="trivial">trivial<OPTION VALUE="enhancement">enhancement</SELECT></tr></table>

How do we go from this

File Print Segmentation Fault


into this

<SELECT>

File Print Segmentation Fault

Simplification

• Once one has reproduced a problem, one must find out what’s relevant:
– Does the problem really depend on 10,000 lines of input?
– Does the failure really require this exact schedule?
– Do we need this sequence of calls?

Broken computer

• How do you identify the problem?


Why Simplify?

• Ease of communication. A simplified test case is easier to communicate.

• Easier debugging. Smaller test cases result in smaller states and shorter executions.

• Identify duplicates. Simplified test cases subsume several duplicates.


Motivation

• The Mozilla open-source web browser project receives several dozen bug reports a day.

• Each bug report has to be simplified
– Eliminate all details irrelevant to producing the failure
• To facilitate debugging
• To make sure it does not replicate a similar bug report

• In July 1999, Bugzilla listed more than 370 open bug reports for Mozilla.
– These were not even simplified
– Mozilla engineers were overwhelmed with work
– They created the Mozilla BugAThon: a call for volunteers to process bug reports

Your Solution

• How do you solve these problems?

• Binary search
– Cut the test case in half
– Iterate

• Brilliant idea: Why not automate this?

Binary Search

• Proceed by binary search. Throw away half the input and see if the output is still wrong.

• If not, go back to the previous state and discard the other half of the input.
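The two bullets above can be sketched in Python. This is a minimal illustration, not the slides' own code: `fails` is a hypothetical predicate standing in for "run the program and check whether the output is still wrong", and the toy `crashes` function is an assumed stand-in for the browser.

```python
def binary_search_reduce(data, fails):
    """Repeatedly keep whichever half of `data` still fails.

    `fails` is a hypothetical predicate: True if the program still
    misbehaves on the given input. Stops when neither half fails alone.
    """
    while len(data) > 1:
        mid = len(data) // 2
        first, second = data[:mid], data[mid:]
        if fails(first):
            data = first          # failure survives in the first half
        elif fails(second):
            data = second         # go back and keep the other half instead
        else:
            break                 # both halves pass: plain halving is stuck
    return data


# Toy stand-in for the browser: it "crashes" whenever the input contains "<SELECT".
def crashes(text):
    return "<SELECT" in text


page = '<td align=left><SELECT NAME="priority" MULTIPLE SIZE=7></td>'
reduced = binary_search_reduce(page, crashes)
print(reduced)
```

Note the `break` branch: when both halves pass, halving alone cannot make progress, which is the case the later slides address by changing granularity.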


Simplified input


<td align=left valign=top><SELECT NAME="op sys" MULTIPLE SIZE=7><OPTION VALUE="All">All<OPTION VALUE="Windows 3.1">Windows 3.1<OPTION VALUE="Windows 95">Windows 95<OPTION VALUE="Windows 98">Windows 98<OPTION VALUE="Windows ME">Windows ME<OPTION VALUE="Windows 2000">Windows 2000<OPTION VALUE="Windows NT">Windows NT<OPTION VALUE="Mac System 7">Mac System 7<OPTION VALUE="Mac System 7.5">Mac System 7.5<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1<OPTION VALUE="Mac System 8.0">Mac System 8.0<OPTION VALUE="Mac System 8.5">Mac System 8.5<OPTION VALUE="Mac System 8.6">Mac System 8.6<OPTION VALUE="Mac System 9.x">Mac System 9.x<OPTION VALUE="MacOS X">MacOS X<OPTION VALUE="Linux">Linux<OPTION VALUE="BSDI">BSDI<OPTION VALUE="FreeBSD">FreeBSD<OPTION VALUE="NetBSD">NetBSD<OPTION VALUE="OpenBSD">OpenBSD<OPTION VALUE="AIX">AIX<OPTION VALUE="BeOS">BeOS<OPTION VALUE="HP-UX">HP-UX<OPTION VALUE="IRIX">IRIX<OPTION VALUE="Neutrino">Neutrino<OPTION VALUE="OpenVMS">OpenVMS<OPTION VALUE="OS/2">OS/2<OPTION VALUE="OSF/1">OSF/1<OPTION VALUE="Solaris">Solaris<OPTION VALUE="SunOS">SunOS<OPTION VALUE="other">other</SELECT></td><td align=left valign=top><SELECT NAME="priority" MULTIPLE SIZE=7><OPTION VALUE="--">--<OPTION VALUE="P1">P1<OPTION VALUE="P2">P2<OPTION VALUE="P3">P3<OPTION VALUE="P4">P4<OPTION VALUE="P5">P5</SELECT></td><td align=left valign=top><SELECT NAME="bug severity" MULTIPLE SIZE=7><OPTION VALUE="blocker">blocker<OPTION VALUE="critical">critical<OPTION VALUE="major">major<OPTION VALUE="normal">normal<OPTION VALUE="minor">minor<OPTION VALUE="trivial">trivial<OPTION VALUE="enhancement">enhancement</SELECT></tr></table>

Complex Input

File Print Segmentation Fault

Simplified Input

• <SELECT NAME="priority" MULTIPLE SIZE=7>
• Simplified from 896 lines to a single line
• Required only 12 tests

Binary Search

• <SELECT NAME="priority" MULTIPLE SIZE=7>

[Animation: the line is tested whole (✘), then each half is tested separately (✔, ✔)]

What do we do if both halves pass?

Change Granularity

• Increase granularity of test case
– Failure of the test case: more chances
– Progress of the search: slower, reduced by amount < ½

• Decrease granularity of test case
– Failure of the test case: fewer chances
– Progress of the search: faster, reduced by amount > ½

General DD Algorithm

Basic idea:
1. Start with few & large changes first
2. If all alternatives pass or are unresolved, apply more & smaller changes.

[Figure: the change set split first into $\Delta_1, \Delta_2$, then into smaller subsets $\Delta_1, \ldots, \Delta_8$]

Binary Search

• <SELECT NAME="priority" MULTIPLE SIZE=7>

[Animation: the search continues at finer granularity, testing smaller pieces of the line; outcomes shown in order: ✘ ✔ ✔ ✔ ✘]

Inputs and Failures

• Let $E$ be the set of possible inputs

• $r_P \in E$ corresponds to an input that passes

• $r_F \in E$ corresponds to an input that fails

Binary Search

• <SELECT NAME="priority" MULTIPLE SIZE=7>

Example of $r_F$ and $r_P$?

$r_P$ = <SELECT NALE SIZE=7>

$r_F$ = <SELECT NAME="priori

Changes

• We can go from one input to another by changes

• A change $\delta$ is a mapping $\delta : E \to E$ which takes one input and changes it to another input

$r_1$ = <SELECT NAty" MULTIPLE SIZE=7>
$\delta'$ = insert ME="priori at the appropriate position
What is $\delta'(r_1)$?

<SELECT NAME="priority" MULTIPLE SIZE=7>

Decomposing Changes

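• A change $\delta$ can be decomposed into a number of elementary changes $\delta_1, \delta_2, \ldots, \delta_n$ where $\delta = \delta_1 \circ \delta_2 \circ \cdots \circ \delta_n$
– where $(\delta_i \circ \delta_j)(r) = \delta_i(\delta_j(r))$

• For example, deleting a part of the input file can be decomposed into deleting characters one by one from the input file
– another way to say it: by composing deletions of single characters we can get a change that deletes part of the input file

$\delta'$ = insert ME="priori
$\delta' = \delta_1 \circ \delta_2 \circ \delta_3 \circ \delta_4 \circ \delta_5 \circ \delta_6 \circ \delta_7 \circ \delta_8 \circ \delta_9 \circ \delta_{10}$
$\delta_1$ = insert M
$\delta_2$ = insert E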
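As a rough sketch of how changes compose, the insertion of ME="priori can be modelled as ten single-character insertions. The helper names `insert_at` and `compose` are hypothetical, and inserting every character at the same index is one assumed way to make the composition order work out: with $(\delta_i \circ \delta_j)(r) = \delta_i(\delta_j(r))$ the rightmost change runs first, so the last character is inserted first and ends up furthest right.

```python
from functools import reduce


def insert_at(pos, ch):
    """Elementary change: insert the single character `ch` at index `pos`."""
    return lambda r: r[:pos] + ch + r[pos:]


def compose(*changes):
    """(d1 o d2 o ... o dn)(r) = d1(d2(...dn(r)...)): the rightmost change runs first."""
    return lambda r: reduce(lambda acc, d: d(acc), reversed(changes), r)


r1 = '<SELECT NAty" MULTIPLE SIZE=7>'
missing = 'ME="priori'            # the ten characters to insert
gap = len('<SELECT NA')           # index in r1 where the characters are missing

# delta_1 = insert M, delta_2 = insert E, ..., delta_10 = insert i,
# all at the same index so that the last-applied change (delta_1) lands first.
deltas = [insert_at(gap, ch) for ch in missing]
delta_prime = compose(*deltas)

print(delta_prime(r1))
```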

Summary

• We have an input with failure: $r_F$

• We have an input without failure: $r_P$

• We have a set of changes $c_F = \{\delta_1, \delta_2, \ldots, \delta_n\}$ such that

$r_F = (\delta_1 \circ \delta_2 \circ \cdots \circ \delta_n)(r_P)$, where $r_F$ is a run with failure

• Each subset $c$ of $c_F$ is a test case

Testing Test Cases

• Given a test case $c$, we would like to know whether the input generated by applying the changes in $c$ to $r_P$ causes a failure

• We define a function

$test : \mathrm{Powerset}(c_F) \to \{P, F, ?\}$

such that, given $c = \{\delta_1, \delta_2, \ldots, \delta_m\} \subseteq c_F$,

$test(c) = F$ iff $(\delta_1 \circ \delta_2 \circ \cdots \circ \delta_m)(r_P)$ is a failing input

Minimizing Test Cases

• Now the question is: Can we find the minimal test case $c$ such that $test(c) = F$?

• A test case $c \subseteq c_F$ is called the global minimum of $c_F$ if for all $c' \subseteq c_F$, $|c'| < |c| \Rightarrow test(c') \neq F$

• Global minimum is the smallest set of changes which will make the program fail

• Finding the global minimum may require us to perform an exponential number of tests
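To see where the exponential cost comes from, here is a brute-force sketch that simply enumerates subsets by increasing size. The function name and the toy test (which fails exactly when changes 3 and 7 are both present) are assumptions made for illustration.

```python
from itertools import combinations


def global_minimum(c_F, test):
    """Smallest subset of c_F with test(subset) == "F", by exhaustive search.

    Subsets are tried in order of increasing size, so the first failing one
    has minimum cardinality. This may need up to 2**len(c_F) calls to test.
    """
    tests_run = 0
    for size in range(len(c_F) + 1):
        for subset in combinations(c_F, size):
            tests_run += 1
            if test(set(subset)) == "F":
                return set(subset), tests_run
    return None, tests_run


# Toy test: the failure needs changes 3 and 7 together.
def toy_test(c):
    return "F" if {3, 7} <= set(c) else "P"


minimum, tests_run = global_minimum(list(range(1, 9)), toy_test)
print(minimum, tests_run)
```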

Search for 1-minimal Input

• Different problem formulation: find a set of changes that cause the failure, but removing any single change causes the failure to go away

• This is 1-minimality

Minimizing Test Cases

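• A test case $c \subseteq c_F$ is called a local minimum of $c_F$ if for all $c' \subset c$, $test(c') \neq F$

• A test case $c \subseteq c_F$ is n-minimal if for all $c' \subset c$, $|c| - |c'| \leq n \Rightarrow test(c') \neq F$

• A test case is 1-minimal if for all $\delta_i \in c$, $test(c - \{\delta_i\}) \neq F$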

Naïve Algorithm

• To find a 1-minimal subset of $C$, simply

• Remove one element $\delta$ from $C$

• If $test(C - \{\delta\}) = F$, recurse with the smaller set

• If $test(C - \{\delta\}) \neq F$ for every $\delta \in C$, $C$ is 1-minimal
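A minimal sketch of this naive procedure, written as a loop instead of a recursion. The names are hypothetical, and the toy test again assumes the failure needs changes 3 and 7 together.

```python
def naive_minimize(c, test):
    """Drop one change at a time until no single removal keeps the failure."""
    c = list(c)
    progress = True
    while progress:
        progress = False
        for d in c:
            smaller = [x for x in c if x != d]
            if test(smaller) == "F":
                c = smaller           # failure persists: continue with the smaller set
                progress = True
                break
    return c                          # every single removal now makes the failure vanish


def is_one_minimal(c, test):
    """Check the 1-minimality definition directly."""
    return test(c) == "F" and all(
        test([x for x in c if x != d]) != "F" for d in c
    )


def toy_test(c):
    return "F" if {3, 7} <= set(c) else "P"


result = naive_minimize(range(1, 9), toy_test)
print(result)
```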


Analysis

• In the worst case,
– We remove one element from the set per iteration
– After trying every other element

• Work is potentially N + (N-1) + (N-2) + …

• This is $O(N^2)$
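The quadratic bound follows from the arithmetic series. As a worked step, assuming each iteration tests every remaining element once before a single removal succeeds:

```latex
\[
N + (N-1) + (N-2) + \cdots + 1 \;=\; \sum_{k=1}^{N} k \;=\; \frac{N(N+1)}{2} \;=\; O(N^2)
\]
```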


Work Smarter, Not Harder

• We can often do better

• Silly to start out removing 1 element at a time
– Try dividing change set in 2 initially
– Increase # of subsets if we can’t make progress
– If we get lucky, search will converge quickly

Minimization Algorithm

• The delta debugging algorithm finds a 1-minimal test case

• It partitions the set $c_F$ into $\Delta_1, \Delta_2, \ldots, \Delta_n$
– $\Delta_1, \Delta_2, \ldots, \Delta_n$ are pairwise disjoint
– $c_F = \Delta_1 \cup \Delta_2 \cup \cdots \cup \Delta_n$

• Define the complement of $\Delta_i$ as $\nabla_i = c_F - \Delta_i$

• Start with n = 2
• Test each test case defined by the partition and their complements
• Reduce the test case if a smaller failure-inducing set is found
– otherwise refine the partition, i.e. n = n*2

Each Step of the Minimization Algorithm

Let n = 2
(A) Start with $\Delta$ as test set
(B) Test each $\Delta_1, \Delta_2, \ldots, \Delta_n$ and each $\nabla_1, \nabla_2, \ldots, \nabla_n$
(C) There are four possible outcomes
1. Some $\Delta_i$ causes failure
– Go to step (A) with $\Delta = \Delta_i$ and n = 2
2. Some $\nabla_i$ causes failure
– Go to step (A) with $\Delta = \nabla_i$ and n = n - 1
3. No test causes failure
– Increase granularity: Go to (A) with $\Delta = \Delta$ and n = 2n
4. The granularity can no longer be increased
– Done, found the 1-minimal subset
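The steps of the minimization algorithm can be sketched in Python as follows. This is an illustrative version under stated assumptions, not a reference implementation: changes are assumed to be distinct values in a list, `test` is assumed to return "F", "P" or "?", and the guards `max(n - 1, 2)` and `min(len(c), 2 * n)` are added here to keep n within bounds.

```python
def ddmin(c_fail, test):
    """Return a 1-minimal failing subset of c_fail (a list of distinct changes)."""
    c = list(c_fail)
    n = 2
    while len(c) >= 2:
        # Partition c into n non-empty, pairwise disjoint subsets.
        bounds = [i * len(c) // n for i in range(n + 1)]
        subsets = [c[bounds[i]:bounds[i + 1]] for i in range(n)]
        complements = [[d for d in c if d not in s] for s in subsets]

        # Outcome 1: some subset fails -> reduce to it, reset n to 2.
        failing = next((s for s in subsets if test(s) == "F"), None)
        if failing is not None:
            c, n = failing, 2
            continue

        # Outcome 2: some complement fails -> reduce to it, n = n - 1.
        failing = next((k for k in complements if test(k) == "F"), None)
        if failing is not None:
            c, n = failing, max(n - 1, 2)
            continue

        # Outcome 3: nothing fails -> increase granularity if possible.
        if n < len(c):
            n = min(len(c), 2 * n)
        else:
            break                     # Outcome 4: granularity exhausted, done
    return c


def needs_3_and_7(c):
    return "F" if {3, 7} <= set(c) else "P"


def needs_5(c):
    return "F" if 5 in c else "P"


print(ddmin(list(range(1, 9)), needs_3_and_7))
print(ddmin(list(range(1, 9)), needs_5))
```

With a single failure-inducing change (`needs_5`), every step halves the set, which is the binary-search behaviour; with two interacting changes the search has to fall back on complements and finer partitions.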

Analysis

• Worst case is still quadratic

• Subdivide until each set is of size 1
– Reduced to the naïve algorithm

• Good news
– For single failure, converges in log N
– Binary search again

The GNU C Compiler

• This program (bug.c) causes GCC 2.95.2 to crash when optimization is enabled

• We would like to minimize this program in order to file a bug report

• In the case of GCC, a passing program run is the empty input

• For the sake of simplicity, we model a change as the insertion of a single character
– $r_P$ is running GCC with an empty input
– $r_F$ means running GCC with bug.c
– each change $\delta_i$ inserts the i-th character of bug.c

#define SIZE 20

double mult(double z[], int n)
{
    int i, j;

    i = 0;
    for (j = 0; j < n; j++) {
        i = i + j + 1;
        z[i] = z[i] * (z[0] + 1.0);
    }
    return z[n];
}

void copy(double to[], double from[], int count)
{
    int n = (count + 7) / 8;
    switch (count % 8) do {
        case 0: *to++ = *from++;
        case 7: *to++ = *from++;
        case 6: *to++ = *from++;
        case 5: *to++ = *from++;
        case 4: *to++ = *from++;
        case 3: *to++ = *from++;
        case 2: *to++ = *from++;
        case 1: *to++ = *from++;
    } while (--n > 0);
    return mult(to, 2);
}

int main(int argc, char *argv[])
{
    double x[SIZE], y[SIZE];
    double *px = x;

    while (px < x + SIZE)
        *px++ = (px - x) * (SIZE + 1.0);
    return copy(y, x, SIZE);
}

The GNU C Compiler

• The test procedure would
– create the appropriate subset of bug.c
– feed it to GCC
– return ✘ iff GCC had crashed, and ✔ otherwise

[Figure: size of the test case in characters as minimization proceeds: 755, 377, 188, 77]
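A test procedure along these lines might look like the sketch below. Several things are assumptions and not taken from the slides: that a `gcc` executable is on the PATH, that `-O -c` is a suitable command line, and that a crash can be recognised by a negative return code (process killed by a signal on POSIX) or by the text "internal compiler error" on stderr. The outcomes are written "F" and "P" for ✘ and ✔.

```python
import os
import subprocess
import tempfile


def classify(returncode, stderr):
    """Map a compiler run to "F" (crashed) or "P" (anything else).

    Assumed heuristic: a negative return code or an internal compiler
    error message counts as a crash; ordinary syntax errors count as passing.
    """
    crashed = returncode < 0 or "internal compiler error" in stderr
    return "F" if crashed else "P"


def gcc_test(source_chars, kept, compiler=("gcc", "-O", "-c")):
    """Compile the subset of bug.c selected by `kept` (indices of applied changes)."""
    text = "".join(source_chars[i] for i in sorted(kept))
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "candidate.c")
        with open(path, "w") as f:
            f.write(text)
        result = subprocess.run(
            [*compiler, path, "-o", os.path.join(tmp, "candidate.o")],
            capture_output=True,
            text=True,
        )
    return classify(result.returncode, result.stderr)
```

Keeping `classify` separate from the subprocess call makes the crash heuristic easy to adjust for a different compiler or platform.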


The GNU C Compiler

• The minimized code is

t(double z[],int n){int i,j;for(;;){i=i+j+1;z[i]=z[i]*(z[0]+0);}return z[n];}

• The test case is 1-minimal
– No single character can be removed without removing the failure
– Even every superfluous whitespace has been removed
– The function name has shrunk from mult to a single t
– This program actually has a semantic error (infinite loop), but GCC still isn't supposed to crash

• So where could the bug be?
– We already know it is related to optimization
– If we remove the -O option to turn off optimization, the failure disappears

The GNU C Compiler

• The GCC documentation lists 31 options to control optimization on Linux

• It turns out that applying all of these options causes the failure to disappear
– Some option(s) prevent the failure

-ffloat-store  -fno-default-inline  -fno-defer-pop
-fforce-mem  -fforce-addr  -fomit-frame-pointer
-fno-inline  -finline-functions  -fkeep-inline-functions
-fkeep-static-consts  -fno-function-cse  -ffast-math
-fstrength-reduce  -fthread-jumps  -fcse-follow-jumps
-fcse-skip-blocks  -frerun-cse-after-loop  -frerun-loop-opt
-fgcse  -fexpensive-optimizations  -fschedule-insns
-fschedule-insns2  -ffunction-sections  -fdata-sections
-fcaller-saves  -funroll-loops  -funroll-all-loops
-fmove-all-movables  -freduce-all-givs  -fno-peephole
-fstrict-aliasing

The GNU C Compiler

• We can use test case minimization in order to find the preventing option(s)
– Each $\delta_i$ stands for removing a GCC option
– Having all $\delta_i$ applied means to run GCC with no option (failing)
– Having no $\delta_i$ applied means to run GCC with all options (passing)

• After seven tests, the single option -ffast-math is found which prevents the failure
– Unfortunately, it is a bad candidate for a workaround because it may alter the semantics of the program
– Thus, we remove -ffast-math from the list of options and make another run
– Again after seven tests, it turns out that -fforce-addr also prevents the failure
– Further examination shows that no other option prevents the failure

The GNU C Compiler

• So, this is what we can send to the GCC maintainers
– The minimal test case
– “The failure only occurs with optimization”
– “-ffast-math and -fforce-addr prevent the failure”

Case Studies

• Reducing a Mozilla bug to
– Print the HTML file containing <SELECT>

• Minimizing fuzz input
– Fuzzing: take a program, send it randomly generated input and see if it crashes
– Used delta debugging to minimize fuzz input

Another Application

• Yesterday, my program worked. Today, it does not. Why?
– The new release 4.17 of GDB changed 178,000 lines
– It no longer integrated properly with DDD (a graphical front-end)
– How to isolate the change that caused the failure?

Summary

• Delta Debugging is a technique, not a tool

• Bad News:
– Probably must be reimplemented for each significant system
– To exploit knowledge of changes

• Good News:
– Relatively simple algorithm, significant payoff
– It’s worth reimplementing