Predicting Re-opened Bugs: A Case Study on the Eclipse Project
Emad Shihab, A. Ihara, Y. Kamei, W. Ibrahim, M. Ohira, B. Adams, A. E. Hassan and K. Matsumoto
[email protected] SAIL, Queen’s University, Canada

Upload: sailqu

Post on 15-Apr-2017


Page 1: Wcre2010 shihab

Predicting Re-opened Bugs: A Case Study on the Eclipse Project

Emad Shihab, A. Ihara, Y. Kamei, W. Ibrahim, M. Ohira, B. Adams, A. E. Hassan and K. Matsumoto

[email protected]

SAIL, Queen’s University, Canada

NAIST, Japan

Page 2: Wcre2010 shihab

When you discover a bug …

Report bug → Fix bug → Verify fix → Close bug

Re-opened

Bug report

Page 3: Wcre2010 shihab

Degrade quality …

Page 4: Wcre2010 shihab

Increase maintenance costs …

Page 5: Wcre2010 shihab

Unnecessary re-work…

Page 6: Wcre2010 shihab

Research questions …

1. Which attributes indicate re-opened bugs?

2. Can we accurately predict if a bug will be re-opened using the extracted attributes?

Page 7: Wcre2010 shihab

Approach overview

1. Mine code and bug repositories
2. Extract attributes
3. Determine best attributes
4. Predict re-opened bugs

Page 8: Wcre2010 shihab

Our dimensions …

Work habit
Bug report
Bug fix
People

Page 9: Wcre2010 shihab

Work habit attributes

1. Time (Hour of day)
2. Weekday
3. Day of month
4. Month
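The four work habit attributes can all be derived from a single timestamp. The sketch below is only an illustration: the slides do not say which event timestamp is used, and the function name, the dictionary keys and the integer encoding of the weekday are assumptions.

```python
from datetime import datetime


def work_habit_attributes(timestamp: datetime) -> dict:
    """Derive the four work habit attributes from one event timestamp.

    Hypothetical helper: which event the timestamp belongs to is not
    stated in the slides.
    """
    return {
        "hour_of_day": timestamp.hour,     # 0-23
        "weekday": timestamp.weekday(),    # 0 = Monday ... 6 = Sunday
        "day_of_month": timestamp.day,     # 1-31
        "month": timestamp.month,          # 1-12
    }


# Made-up timestamp: Wednesday, 13 October 2010, 15:30.
example = work_habit_attributes(datetime(2010, 10, 13, 15, 30))
```

Encoding the weekday as an integer avoids locale-dependent day names; a categorical label would work equally well for a decision tree.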

Page 10: Wcre2010 shihab

Bug report attributes

1. Component
2. Platform
3. Severity
4. Priority
5. CC list
6. Priority changed
7. Description size
8. Description text
9. Number of comments
10. Comment size
11. Comment text

Metadata

Textual data
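As a rough illustration of how the eleven bug report attributes might be collected from one report, here is a minimal sketch. The input field names (`priority_history`, `cc`, `comments`, and so on) are hypothetical, and measuring sizes in words and the CC list as a count are assumptions; the slides do not define these details.

```python
def bug_report_attributes(report: dict) -> dict:
    """Collect the bug report attributes from a report given as a dict.

    All input field names are hypothetical. Sizes are counted in words,
    which is an assumption, not something the slides specify.
    """
    description = report.get("description", "")
    comments = report.get("comments", [])
    priorities = report.get("priority_history", [])
    return {
        "component": report.get("component"),
        "platform": report.get("platform"),
        "severity": report.get("severity"),
        "priority": priorities[-1] if priorities else None,
        "cc_list_size": len(report.get("cc", [])),
        "priority_changed": len(set(priorities)) > 1,
        "description_size": len(description.split()),
        "description_text": description,
        "number_of_comments": len(comments),
        "comment_size": sum(len(c.split()) for c in comments),
        "comment_text": " ".join(comments),
    }


# Made-up report used only to exercise the function.
sample = bug_report_attributes({
    "component": "UI",
    "platform": "PC",
    "severity": "major",
    "priority_history": ["P3", "P2"],
    "cc": ["a@example.org", "b@example.org"],
    "description": "Editor crashes when opening a file",
    "comments": ["Cannot reproduce on Linux", "Fixed in nightly build"],
})
```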

Page 11: Wcre2010 shihab

Bug fix attributes

1. Time to resolve (in days)
2. Last status
3. Number of edited files
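The bug fix attributes are simple to compute once the relevant dates and the list of changed files are known. The following sketch assumes the time to resolve is the number of whole days between opening and resolution; the function name, parameter names and the sample status value are hypothetical.

```python
from datetime import datetime


def bug_fix_attributes(opened: datetime, resolved: datetime,
                       last_status: str, edited_files: list) -> dict:
    """Compute the three bug fix attributes for one bug.

    Assumption: time to resolve is counted in whole days between the
    two timestamps.
    """
    return {
        "time_to_resolve_days": (resolved - opened).days,
        "last_status": last_status,
        # Count each file once even if it was edited in several commits.
        "number_of_edited_files": len(set(edited_files)),
    }


# Made-up values for illustration only.
fix = bug_fix_attributes(
    opened=datetime(2010, 1, 1, 9, 0),
    resolved=datetime(2010, 1, 7, 17, 0),
    last_status="FIXED",
    edited_files=["A.java", "B.java", "A.java"],
)
```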

Page 12: Wcre2010 shihab

People attributes

1. Reporter name
2. Reporter experience
3. Fixer name
4. Fixer experience
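The slides do not define how experience is measured. One plausible reading, used here purely as an assumption, is the number of earlier bugs the same person reported or fixed. The sketch below walks a chronologically ordered list of bugs and attaches the four people attributes to each; the field names are hypothetical.

```python
from collections import Counter


def add_people_attributes(bugs: list) -> list:
    """Return the people attributes for each bug.

    `bugs` must be in chronological order, each with hypothetical
    'reporter' and 'fixer' fields. Experience is assumed to be the
    number of earlier bugs handled by the same person.
    """
    reported = Counter()
    fixed = Counter()
    rows = []
    for bug in bugs:
        rows.append({
            "reporter_name": bug["reporter"],
            "reporter_experience": reported[bug["reporter"]],
            "fixer_name": bug["fixer"],
            "fixer_experience": fixed[bug["fixer"]],
        })
        # Update the counters only after recording, so the current bug
        # does not count toward its own experience value.
        reported[bug["reporter"]] += 1
        fixed[bug["fixer"]] += 1
    return rows


# Made-up history of three bugs.
people = add_people_attributes([
    {"reporter": "alice", "fixer": "bob"},
    {"reporter": "alice", "fixer": "carol"},
    {"reporter": "dave", "fixer": "bob"},
])
```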

Page 13: Wcre2010 shihab

Research question 1

Which attributes indicate re-opened bugs?

Comment text, description text and fix location (component) are the best indicators

Page 14: Wcre2010 shihab

Top node analysis setup

1. Build 10 decision trees for each attribute set
2. Record the frequency and level of each attribute
3. Repeat using all attributes

Page 15: Wcre2010 shihab

Decision tree prediction model

[Figure: example decision tree with Level 1, Level 2 and Level 3 marked. Nodes shown: No. files (>= 5 / < 5), Dev exp (>= 3 / < 3), Month, Time (>= 12 / < 12), Time to resolve (>= 6 / < 6, >= 24 / < 24). Leaves are labeled Re-opened or Not Re-opened.]

Page 16: Wcre2010 shihab

Top node analysis example with 3 trees

Tree 1: Comment (level 1); Time, No. comments (level 2)
Tree 2: Comment (level 1); Time, No. files (level 2)
Tree 3: No. files (level 1); Time, Description size (level 2)

Level    Frequency  Attribute
Level 1  2          Comment
Level 1  1          No. files
Level 2  3          Time
Level 2  1          No. comments
Level 2  1          No. files
Level 2  1          Description size
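The counting step of the top node analysis can be sketched in a few lines. Each tree is represented here as a list of levels, which is a simplification chosen for illustration (a real decision tree library would need a traversal to produce it); the function name is hypothetical. The input is the three example trees from this slide.

```python
from collections import Counter, defaultdict


def top_node_analysis(trees: list) -> dict:
    """Count how often each attribute appears at each tree level.

    Each tree is a list of levels; each level is a list of the
    attribute names used for splits at that depth.
    """
    counts = defaultdict(Counter)
    for tree in trees:
        for level, attributes in enumerate(tree, start=1):
            counts[level].update(attributes)
    return {level: dict(counter) for level, counter in counts.items()}


# The three example trees from the slide (levels 1 and 2 only).
trees = [
    [["Comment"], ["Time", "No. comments"]],
    [["Comment"], ["Time", "No. files"]],
    [["No. files"], ["Time", "Description size"]],
]
result = top_node_analysis(trees)
```

For these three trees the counts match the frequency table on the slide: Comment appears twice and No. files once at level 1, and Time appears three times at level 2.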

Page 17: Wcre2010 shihab

Which attributes best indicate re-opened bugs?

Work habit attributes

9 X Month
1 X Time (Hour of day)
Weekday
Day of month

Page 18: Wcre2010 shihab

Which attributes best indicate re-opened bugs?

Bug report attributes

Component
Platform
Severity
Priority
CC list
Priority changed
Description size
Description text
Number of comments
Comment size
10 X Comment text

Metadata

Textual data

Page 19: Wcre2010 shihab

Which attributes best indicate re-opened bugs?

Bug fix attributes

7 X Time to resolve
3 X Last status
Number of files in fix

Page 20: Wcre2010 shihab

Which attributes best indicate re-opened bugs?

People attributes

5 X Reporter name
5 X Fixer name
Reporter experience
Fixer experience

Page 21: Wcre2010 shihab

Combining all attributes

Level    Frequency  Attribute
Level 1  10         Comment text
Level 2  19         Description text
Level 2  1          Component

Page 22: Wcre2010 shihab

Research question 2

Can we accurately predict if a bug will be re-opened using the extracted attributes?

Our models can correctly predict re-opened bugs with 63% precision and 85% recall

Page 23: Wcre2010 shihab

Decision tree prediction model

[Figure: example decision tree with Level 1, Level 2 and Level 3 marked. Nodes shown: No. files (>= 5 / < 5), Dev exp (>= 3 / < 3), Month, Time (>= 12 / < 12), Time to resolve (>= 6 / < 6, >= 24 / < 24). Leaves are labeled Re-opened or Not Re-opened.]

Page 24: Wcre2010 shihab

Performance measures

                          Actual: Re-opened  Actual: Not re-opened
Predicted: Re-opened      TP                 FP
Predicted: Not re-opened  FN                 TN

Re-opened precision: TP / (TP + FP)

Re-opened recall: TP / (TP + FN)

Not re-opened precision: TN / (TN + FN)

Not re-opened recall: TN / (TN + FP)
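The four measures follow directly from the confusion matrix counts. The sketch below computes them for both classes; the function names are hypothetical and the counts in the example are made up for illustration, not taken from the study.

```python
def ratio(numerator: int, denominator: int) -> float:
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def precision_recall(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision and recall for the re-opened and not re-opened classes."""
    return {
        "re-opened": {
            "precision": ratio(tp, tp + fp),
            "recall": ratio(tp, tp + fn),
        },
        "not re-opened": {
            "precision": ratio(tn, tn + fn),
            "recall": ratio(tn, tn + fp),
        },
    }


# Made-up counts: 6 true positives, 2 false positives,
# 4 false negatives, 8 true negatives.
scores = precision_recall(tp=6, fp=2, fn=4, tn=8)
```

With these counts, re-opened precision is 6 / 8 = 0.75 and re-opened recall is 6 / 10 = 0.6, while not re-opened recall is 8 / 10 = 0.8.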

Page 25: Wcre2010 shihab

Predicting re-opened bugs

Precision and recall (%)

Attribute set  Precision  Recall
Work habits    33         74
Bug report     63         83
Bug fix        21         83
People         27         67

Page 26: Wcre2010 shihab

Predicting NOT re-opened bugs

Precision and recall (%)

Attribute set  Precision  Recall
Work habits    93         71
Bug report     97         91
Bug fix        93         39
People         91         66

Page 27: Wcre2010 shihab

Combining all attributes

Precision and recall (%)

Class          Precision  Recall
re-opened      63         85
NOT re-opened  97         90

Page 28: Wcre2010 shihab

Bug comments are important …

Bug report is the most important attribute set

Comment text is the most important bug report attribute

What words are important?

Page 29: Wcre2010 shihab

Important words

Re-opened: control, background, debugging, breakpoint, blocked, platforms

Not Re-opened: verified, duplicate, screenshot, important, testing, warning
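To make the word lists concrete, the sketch below tallies how many words from each list occur in a comment. This is only a keyword count for illustration and is not the text model used in the study; the function name, the tokenization rule and the sample comment are assumptions.

```python
import re

# Word lists as shown on the slide.
REOPENED_WORDS = {"control", "background", "debugging",
                  "breakpoint", "blocked", "platforms"}
NOT_REOPENED_WORDS = {"verified", "duplicate", "screenshot",
                      "important", "testing", "warning"}


def count_indicative_words(comment_text: str) -> dict:
    """Count tokens of a comment that appear in each word list."""
    # Lowercase and keep alphabetic runs only (a simplifying assumption).
    tokens = re.findall(r"[a-z]+", comment_text.lower())
    return {
        "re-opened": sum(token in REOPENED_WORDS for token in tokens),
        "not re-opened": sum(token in NOT_REOPENED_WORDS for token in tokens),
    }


# Made-up comment for illustration.
counts = count_indicative_words(
    "Breakpoint is ignored while debugging; see screenshot."
)
```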

Page 30: Wcre2010 shihab

Page 31: Wcre2010 shihab

Predicting re-opened bugs

Attribute set  Re-opened (Pr / Re)  Not Re-opened (Pr / Re)
Work habits    33% / 74%            93% / 71%
Bug report     63% / 83%            97% / 91%
Bug fix        21% / 83%            93% / 39%
People         27% / 67%            91% / 66%

Page 32: Wcre2010 shihab

Predicting re-opened bugs

Work habits Bug report Bug fix People

Page 33: Wcre2010 shihab

Predicting NOT re-opened bugs

Attribute set  Pr   Re
Work habits    93%  71%
Bug report     97%  91%
Bug fix        93%  39%
People         91%  66%

Page 34: Wcre2010 shihab

Predicting re-opened bugs

Class          Pr   Re
Re-opened      63%  85%
Not Re-opened  97%  90%

Page 35: Wcre2010 shihab

Approach overview

1. Mine code and bug repositories
2. Attributes of re-opened bugs
3. Predict re-opened bugs
4. Measure performance

Page 36: Wcre2010 shihab

Predicting re-opened bugs

[Figure: precision and recall quantity per attribute set (Work habits, Bug report, Bug fix, People)]

Page 37: Wcre2010 shihab

Which attributes best indicate re-opened bugs?

Work habits: Month (9), Time (1)
Bug report: Comment text (10)
Bug fix: Time to fix (7), Last status (3)
People: Fixer (5), Reporter (5)

Page 38: Wcre2010 shihab

Bug report

Page 39: Wcre2010 shihab

A typical work day…

Page 40: Wcre2010 shihab

Bug report attributes

1. Component
2. Platform
3. Severity
4. Priority
5. CC list
6. Priority changed
7. Description size
8. Description text
9. Number of comments
10. Comment size
11. Comment text

Metadata

Textual data