Tracing regression bugs – Presented by Dor Nir

Upload: leo-booker

Post on 17-Dec-2015

TRANSCRIPT

1

Tracing regression bugs. Presented by Dor Nir.

2

Outline Problem introduction. Regression bug – definition. Industry tools. Proposed solution. Experimental results. Future work.

3

Micronose Corp. A big company founded by Nataly Noseman. 1998 – version 1 of Nosepad (a great success).

4

Nosepad version 1

class Nosepad
{
    bool bDirty;

    void AddNose(){
        ...
        bDirty = true;
    }

    void DeleteNose(){
        ...
        bDirty = true;
    }

    bool IsDirty(){
        return bDirty;
    }

    void Save(){
        ...
        bDirty = false;
    }

    void Exit(){
        if(IsDirty()) {
            if(MsgBox("Save your noses?"))
                Save();
        }
        CloseWindow();
    }
}

5

Nosepad version 2

New features required:
    …
    Undo/Redo mechanism.
    …

Micronose is expanding: promotions, new recruits.

6

Undo/Redo design

Undo stack – each operation is added to the undo stack.
Redo stack – when an operation is undone, it moves to the redo stack.

[Diagram: Key "a" and Key "b" operations moving between the Undo and Redo stacks.]

7

Nosepad version 2

class Nosepad
{
    …
    Stack undoStack;
    Stack redoStack;

    void Undo(){
        undoStack.Top().Operate(false);
        redoStack.Push(undoStack.Pop());
    }

    void Redo(){
        redoStack.Top().Operate(true);
        undoStack.Push(redoStack.Pop());
    }

    void AddNose(){
        ...
        undoStack.Push(AddNoseOp);
        redoStack.Clear();
    }

    void DeleteNose(){
        ...
        undoStack.Push(DelNoseOp);
        redoStack.Clear();
    }
    .
    .
    .
}

8

Zelda from QA

9

Nosepad version 2 correction

class Nosepad
{
    …
    Stack undoStack;
    Stack redoStack;

    void Undo(){
        undoStack.Top().Operate(false);
        redoStack.Push(undoStack.Pop());
    }

    void Redo(){
        redoStack.Top().Operate(true);
        undoStack.Push(redoStack.Pop());
    }

    void AddNose(){
        ...
        undoStack.Push(AddNoseOp);
        redoStack.Clear();
    }

    void DeleteNose(){
        ...
        undoStack.Push(DelNoseOp);
        redoStack.Clear();
    }

    bool IsDirty(){
        return bDirty && !undoStack.IsEmpty();
    }
}

10

Zelda from QA

11

Regression bug observations

The second bug is a regression bug. The same test will succeed on version 1 and fail on version 2. The specifications for version 1 haven't changed in version 2 (only additions).

12

Version 1 specifications: 1. X  2. Y  3. Z
Version 2 specifications: 1. X  2. Y  3. Z  4. A  5. B

Changes in code that break X, Y or Z → regression bug.
Changes that break only A or B → a bug, but no regression.

13

Regression bug definition

Regression bug – changes in existing code that change the behavior of the application so it does not meet a specification that was previously met.

14

How to avoid regression bugs?

Prevent inserting regression bugs into the code:
    Simple design.
    Programming language.
    Good programming methodology:
        Test-driven development.
        Code review.
        Communication.

Find regression bugs before the product is released:
    Extensive testing.
    White-box \ black-box testing.

15

Automatic tools. Find whether a regression bug exists. QuickTest Professional.

16

Where is it? What was the cause of the regression bug? What was the change that caused the regression bug?

17

What is a change? Changing existing code lines. Adding new code lines. Deleting code lines.
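These three kinds of change are exactly what a line diff reports (a changed line shows up as a delete plus an add). Below is a minimal sketch based on the standard longest-common-subsequence algorithm; the talk's tool used a commercial diff tool instead, so everything here is purely illustrative.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Minimal LCS-based line diff: emits "- " lines for deletions and
// "+ " lines for additions between two file versions.
std::vector<std::string> Diff(const std::vector<std::string>& oldLines,
                              const std::vector<std::string>& newLines) {
    size_t n = oldLines.size(), m = newLines.size();
    // lcs[i][j] = length of the LCS of oldLines[i..] and newLines[j..]
    std::vector<std::vector<int>> lcs(n + 1, std::vector<int>(m + 1, 0));
    for (size_t i = n; i-- > 0;)
        for (size_t j = m; j-- > 0;)
            lcs[i][j] = (oldLines[i] == newLines[j])
                            ? lcs[i + 1][j + 1] + 1
                            : std::max(lcs[i + 1][j], lcs[i][j + 1]);
    // Walk the table, emitting deleted and added lines.
    std::vector<std::string> edits;
    size_t i = 0, j = 0;
    while (i < n && j < m) {
        if (oldLines[i] == newLines[j]) { ++i; ++j; }
        else if (lcs[i + 1][j] >= lcs[i][j + 1]) edits.push_back("- " + oldLines[i++]);
        else edits.push_back("+ " + newLines[j++]);
    }
    while (i < n) edits.push_back("- " + oldLines[i++]);
    while (j < m) edits.push_back("+ " + newLines[j++]);
    return edits;
}
```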

18

Problem definition

Given a checkpoint C that failed, and the source code S of the AUT (application under test), we want to find the places (changes) p1, p2, …, pn in the code S that cause C to fail. We want to do it independently of the source-code language or technology. We know that at time T (prior to the failure) the checkpoint passed.

19

Solution 1

[Diagram: QA (tests) and the programmer (source code) in cooperation.]

20

Solution 1 – map

[Diagram: each test is mapped to the source code it exercises.]

Tests: "SELECT NAMES from Table1" is not empty; file t.xml was created successfully; check text in message box.

Source code: Windows.cpp, errorMessages.cpp, File.cpp, IO.cs, C:\code\files, DB project.

21

Solution 1 Much work has to be done for each new test. Maintenance is hard. We end up with a lot of code to be analyzed. Could use automatic tools (profilers).

22

Solution 2

[Diagram: the same tests are mapped to the source code, but only through the changes.]

Tests: "SELECT NAMES from Table1" is not empty; file t.xml was created successfully; check text in message box.

Source code: Windows.cpp, errorMessages.cpp, File.cpp, IO.cs, C:\code\files, DB project.

23

Source control

Version control tool. Database of source code. Check-in \ check-out operations. History of versions. Differences between versions. Very common in software development. Currently on the market: VSS, StarTeam, ClearCase, CVS and many more.

24

Checkpoint-to-code tool

Input: a checkpoint and the source code.
First phase: finding the regression bug – the source control tool yields Change A, Change B, …
Second phase: heuristics rank the changes against the failed checkpoint.
Output: relevant changes: 1. Change X  2. Change Y  3. Change Z  …

25

Heuristics (second phase)

Rank changes. Each heuristic gets a different weight. Two kinds of heuristics:
    Technology-dependent.
    Non-technology-dependent.
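The ranking-and-weighting step can be sketched as a weighted sum of heuristic scores. The `Heuristic` signature and the example weights below are illustrative assumptions, not the tool's actual interface.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A heuristic scores one change against one checkpoint; higher means
// "more likely to have caused the failure". Signature is assumed.
using Heuristic = std::function<double(const std::string& checkpoint,
                                       const std::string& change)>;

// Combine several heuristics into a single rank, giving each heuristic
// its own weight, as the slide describes.
double CombinedRank(const std::string& checkpoint,
                    const std::string& change,
                    const std::vector<std::pair<Heuristic, double>>& heuristics) {
    double rank = 0.0;
    for (const auto& hw : heuristics)
        rank += hw.second * hw.first(checkpoint, change);
    return rank;
}
```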

26

Non-technology heuristics. Do not depend on the technology of the code. Textually driven. No semantics.

27

Code lines affinity

Checkpoint: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3".

28

Check-in comment affinity

Checkpoint: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3".

Check-in comment: Go to the next waiter when the next-item event is raised.

29

File affinity

Checkpoint: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3".

Word histogram of file Clerk.cpp:
    Waiter   186
    Waiters   15
    Next      26
    Number   174
    …

30

File name affinity

Checkpoint: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3".

File: ClerkDlg.cpp

31

More possible non-technology heuristics

Programmer history:
    Reliable vs. "prone to error" programmers.
    Experience in the specific module.

Time of change:
    Late at night.
    Close to the release deadline.

32

Technology heuristics Depend on the source code language. Take advantage of known keywords. Use the semantics.

33

Function\Class\Namespace affinity

Checkpoint: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3".

34

Code complexity

Deepness, number of branches.

if(b1 && b2)
{
    if(c2 && d1)
        c1 = true;
    else
    {
        if((c2 && d2) || e1)
            c1 = false;
    }
}

>

if(b1 && b2 && c2 && d1)
    c1 = true;

(The nested snippet above is ranked more complex than the flattened one below it.)
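One crude way to turn "deepness and number of branches" into a number is to count brace nesting and boolean connectives. This is an illustrative stand-in, not the talk's actual metric.

```cpp
#include <algorithm>
#include <string>

// Crude complexity proxy for a snippet of code: maximum brace nesting
// depth ("deepness") plus the number of boolean connectives && and ||
// ("branching").
int Complexity(const std::string& code) {
    int depth = 0, maxDepth = 0, branches = 0;
    for (size_t i = 0; i < code.size(); ++i) {
        if (code[i] == '{') {
            maxDepth = std::max(maxDepth, ++depth);
        } else if (code[i] == '}') {
            --depth;
        } else if (i + 1 < code.size() &&
                   ((code[i] == '&' && code[i + 1] == '&') ||
                    (code[i] == '|' && code[i + 1] == '|'))) {
            ++branches;
            ++i;  // consume the second character of the operator
        }
    }
    return maxDepth + branches;
}
```

On the two snippets above this yields 6 for the nested version (depth 2, four connectives) and 3 for the flattened one.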

35

Words affinity problem

{red, flower, white, black, cloud} vs. {rain, green, red, coat}

36

Words affinity problem (cont.)

Affinity({red, flower, white, black, cloud}, {rain, green, red, coat})
    >
Affinity({red, flower, white, black, cloud}, {train, table, love})

37

Word affinity

Affinity(red, flower)  <  Affinity(red, blue)  <  Affinity(red, red)

38

How can we measure affinity?

Vector space model of information retrieval – Wong S.K.M., Raghavan. Similarity of documents.

Improving web search results using affinity graph – Benyu Zhang, Hua Li, Lei Ji, Wensi Xi, Weiguo Fan. Similarity of documents. Diversity vs. information richness of documents.

39

Affinity definition

Synonym(a) – group of words that are synonyms of a or similar in meaning to a.

Synonym(choose) = {chosen, picked out, choice, superior, prime, discriminating, choosy, picky, select, selection}

40

Words affinity definition (cont.)

ShallowAffinity(a, b) = 1 if a == b, 0 otherwise.

Affinity(a, b) = 1 if a == b, otherwise ShallowAffinity(Synonym(a), Synonym(b)).

Affinity of groups of words, for A = {a1, a2, a3, …, an} and B = {b1, b2, b3, …, bm}:

ShallowAffinity(A, B) = [ Σ(i=1..n) Σ(j=1..m) ShallowAffinity(ai, bj) ] / ( |A| · |B| )

AsymmetricAffinity(A, B) = [ Σ(i=1..n) max{ Affinity(ai, b1), …, Affinity(ai, bm) } ] / |A|

Affinity(A, B) = [ AsymmetricAffinity(A, B) + AsymmetricAffinity(B, A) ] / 2
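The definitions above can be implemented directly. The tiny synonym table below is a hypothetical stand-in for the WordNet lookup the tool actually uses.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <vector>

using Words = std::vector<std::string>;

// Hypothetical synonym table standing in for WordNet; unknown words fall
// back to themselves, so the returned set is never empty.
std::set<std::string> Synonym(const std::string& a) {
    static const std::map<std::string, std::set<std::string>> table = {
        {"clerk",  {"clerk", "waiter", "employee"}},
        {"waiter", {"waiter", "clerk", "server"}},
    };
    auto it = table.find(a);
    return it != table.end() ? it->second : std::set<std::string>{a};
}

// ShallowAffinity on groups: the fraction of equal (a, b) pairs.
double ShallowAffinity(const std::set<std::string>& A,
                       const std::set<std::string>& B) {
    double sum = 0;
    for (const auto& a : A)
        for (const auto& b : B)
            if (a == b) sum += 1;
    return sum / (A.size() * B.size());
}

// Word affinity: exact match, else compare the synonym sets.
double Affinity(const std::string& a, const std::string& b) {
    return a == b ? 1.0 : ShallowAffinity(Synonym(a), Synonym(b));
}

// AsymmetricAffinity(A, B): average over A of each word's best match in B.
double AsymmetricAffinity(const Words& A, const Words& B) {
    double sum = 0;
    for (const auto& a : A) {
        double best = 0;
        for (const auto& b : B) best = std::max(best, Affinity(a, b));
        sum += best;
    }
    return sum / A.size();
}

// Symmetric group affinity, per the slide's definition.
double Affinity(const Words& A, const Words& B) {
    return (AsymmetricAffinity(A, B) + AsymmetricAffinity(B, A)) / 2;
}
```

With this table, Affinity("clerk", "waiter") is positive but below 1, matching the intended ordering on the "word affinity" slide.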

42

Using affinity in the tool

Rank(C, P) = Affinity(Words(C), Words(P))

Words(C) = the group of words in the description of the checkpoint C.

Words(P) = the group of words in the source code / check-in / file etc.

43

Using affinity in heuristics

Code line affinity:
    Words(P, L) = the group of words in the source code located L lines from the change P.
    β – coefficient that gives a different weight to lines inside the change.

Check-in comment affinity:

Rank2(C, P) = Affinity(Words(C), Words(checkin(P)))

44

Using affinity in heuristics (cont.)

File affinity: P is a change in file F with histogram map.

Rank3(C, P) = FileRank(C, F)

FileRank(C, F) = HstgrmAffinity(Words(C), Words(F), Hstgrm(F))

HstgrmAffinity(A, B, map) = [ Σ(i=1..n) max{ Affinity(ai, b1), …, Affinity(ai, bm) } · map[ai] ] / Σ(i=1..n) map[ai]
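A direct implementation of the histogram affinity as reconstructed above. The exact-match word affinity is a placeholder (the tool refines it with synonym sets), and indexing the histogram by the checkpoint word is an assumption made while reconstructing the formula.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

using Words = std::vector<std::string>;
using Histogram = std::map<std::string, int>;

// Word-level affinity placeholder: exact match only.
double Affinity(const std::string& a, const std::string& b) {
    return a == b ? 1.0 : 0.0;
}

// Histogram-weighted affinity: each checkpoint word's best match in the
// file is weighted by that word's count in the file's word histogram
// (0 if absent), normalized by the total weight.
double HstgrmAffinity(const Words& A, const Words& B, const Histogram& map) {
    double weighted = 0, total = 0;
    for (const auto& a : A) {
        double best = 0;
        for (const auto& b : B) best = std::max(best, Affinity(a, b));
        auto it = map.find(a);
        int w = (it != map.end()) ? it->second : 0;
        weighted += best * w;
        total += w;
    }
    return total > 0 ? weighted / total : 0.0;
}
```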

45

Using affinity in heuristics (cont.)

File name affinity:

Rank4(C, P) = Affinity(Words(C), Words(FileName(P)))

Code elements affinity:

Rank5(C, P) = 1/2 · Affinity(Words(C), Words(FunctionName(P)))
            + 3/8 · Affinity(Words(C), Words(ClassName(P)))
            + 1/8 · Affinity(Words(C), Words(Namespace(P)))

46

Algorithm

Input: C – checkpoint. T – the last time checkpoint C passed.

1. Get the latest version of the source code for C from the source control tool.
2. Get file versions from the source control tool one by one until the version's check-in time is smaller than T. For each file version:
    1. Get the change between the two sequential versions.
    2. Analyze and rank the change with respect to the checkpoint C (Rank(C, P)).
    3. Add the rank to the DB.
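The steps above can be sketched against a mocked source-control interface. All names and types here (`FileVersion`, `Change`, the toy `Rank`) are illustrative stand-ins: the real tool talks to Visual SourceSafe, diffs versions with a diff tool, and ranks with the affinity heuristics.

```cpp
#include <string>
#include <utility>
#include <vector>

// One stored version of a file in the (mocked) source control tool.
struct FileVersion {
    long checkinTime;      // when this version was checked in
    std::string contents;  // file contents at this version
};

// A "change" = the diff between two sequential versions of a file.
struct Change {
    std::string before, after;
};

// Toy rank: 1 if the checkpoint text appears in the changed file.
double Rank(const std::string& checkpoint, const Change& p) {
    return p.after.find(checkpoint) != std::string::npos ? 1.0 : 0.0;
}

// Walk file versions newest-to-oldest until the check-in time drops
// below T (the last time checkpoint C passed), rank each sequential
// change, and collect the ranks (the "DB").
std::vector<std::pair<double, Change>>
FindRelevantChanges(const std::string& C, long T,
                    const std::vector<FileVersion>& versions) {  // newest first
    std::vector<std::pair<double, Change>> db;
    for (size_t i = 0; i + 1 < versions.size(); ++i) {
        if (versions[i].checkinTime < T) break;  // older than the last pass
        Change p{versions[i + 1].contents, versions[i].contents};
        db.push_back({Rank(C, p), p});
    }
    return db;
}
```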

47

Observations

Rank_i(C, P1) can be compared with Rank_j(C, P2), for P1 ≠ P2 and i ≠ j.

Better affinity → better results.

The project is not always in a valid state.

48

Implementation

Visual SourceSafe – source control.

Araxis Merge – diff tool.

MS Word

WordNet

MS Access – DB.

49

WordNet. Developed at Princeton University. A large lexical database of English. English nouns, verbs, adjectives, and adverbs are organized into synonym sets, each representing one underlying lexicalized concept. Different relations link the synonym sets.

50

Additional views Group by file. Group by time of change.

51

The tool

52

The tool

53

Experimental results

Source code:
    C++, MFC framework.
    891 files in 29 folders.
    3 million lines of code.
    3984 check-ins.

54

Experimental results (cont.)

Checkpoint   No grouping   Group by file
1            1             1
2            2             7
3            2             2
4            -             -
5            -             1

55

Challenges

Time:
    Cache.
    Filtering by one heuristic.

Word equality:
    Source code vocabulary. Example – m_CountItemInTable.
    Additional synonyms. Clerk ≈ Waiter.

56

Future work

Add more heuristics.

Learning mechanism – automatic tuning of the heuristics. Why?

Finding out more about the sources of regression bugs:
    Bad programmer. Deadline. Technology. Design.
