
Mining Version Archives

Cătălin Hrițcu, [email protected]

International Max Planck Research School for Computer Science


Version Archives

[Figure: a version archive storing successive revisions (for example, Revision 42 and Revision 68) of files such as miner.jpg, plugin.properties, and ComparePreferencePage.java.]

* Version archives are used by the great majority of software development projects. They store documents such as source files together with their change history, and allow concurrent access by multiple developers.

* Open source projects use CVS and, more recently, Subversion to manage their code repositories. Their version archives constitute an easily available source of very valuable information that can be investigated - a true treasure.

Version Archives

* Guiding Software Changes
* Mapping Bugs to Fixes
* Mapping Failures to Defects
* Locating Cross-Cutting Concerns
* Raising Risk Awareness
* Predicting Component Failures
* Discovering Usage Patterns

[a true treasure] - that can be used for

* mapping failures to defects and fixes
* predicting changes that could potentially lead to bugs
* discovering application-specific usage patterns
* locating cross-cutting concerns so that they can be converted to aspects

All these applications are very useful in practice; however, in this talk I will focus on only one:

* guiding programmers along related changes

Tomorrow, Nicolas Bettenburg will present a way to predict component failures using complexity metrics extracted from the source code. The experiments were conducted on well-known projects at Microsoft, and Prof. Zeller gave a presentation on this yesterday, in the Automated Debugging lecture.

There is also another paper, by J. Anvik and others, entitled “Who Should Fix This Bug?”, which was discussed at length in this seminar. The authors of the paper identified difficulties in tracing information between bug repositories and version archives. Some of the problems are trivial, like mapping CVS user names to the corresponding email addresses in Bugzilla, but tool support is still needed to automate this.

The Cost of Change

[Figure: the share of maintenance versus all other activities related to software development, under an optimistic and a pessimistic estimate.]

Many studies show that maintenance - the process of changing software - accounts for most of the costs of projects. Maintenance alone is usually between 50 and 75 percent of the costs.

So changing software is in fact expensive.

The Risks of Change

But this is not all.

Even the smallest change can cause the entire system to fail, in the most unexpected ways.

Changing Software

Missed Me!

Done?

However, at least changing software is easy! Right? All you need to do is start your favorite text editor and you are almost done.

And if by software you mean a school project, changing it does of course work like this.

However, for large software projects with tens of thousands of files, the situation is a lot worse. Searching for the right location to change is like looking up a person in a phone book when you only know their first name. It is a tedious and error-prone process that can take days and, what’s worse, you don’t even know when you are done.

So changing software is really hard, and developers could use some guidance.


The idea is simple.

All of you probably know Amazon.com, and their online store. Amazon was a bookstore at first, but today it sells almost everything.

And it has a very interesting feature.


Amazon helps customers browse along related items, by providing this list of books that were typically included in the same purchase. This information is obtained by applying data mining on the database of previous purchases.

Can’t we do the same thing for large programs?

Can’t we guide programmers along related changes?

“Programmers who changed this item also changed ...”

Of course we can! Using version archives, we can obtain information like “programmers who changed this method also changed these other methods”.

Evolutionary Coupling

Version Archives convey important information about how a system evolved over time. In particular, version archives can tell us which parts of the system are coupled by common changes -- the authors call this evolutionary coupling.

Let’s see an example from the Eclipse compare plug-in. We see two files, ComparePreferencePage.java and plugin.properties, each listed with the number of changes: ComparePreferencePage.java was changed 40 times, while plugin.properties was changed 69 times.

Both files have been changed together 20 times, indicating some evolutionary coupling. This is not a very strong coupling, though, since ComparePreferencePage.java has also been changed 20 times without plugin.properties being changed at the same time.

To obtain more details, we can increase the granularity and determine the evolutionary coupling between the individual attributes and methods.

This reveals new couplings - for instance, a coupling between the fKeys[] attribute and the initDefaults() method as well as a coupling between the fKeys[] attribute and the plugin.properties file. Both couplings are strong: In 10 out of 11 times that fKeys[] has been changed, plugin.properties has been changed, too.
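The counts quoted in these notes can be turned into a simple coupling ratio. The sketch below is only an illustration: the helper name `coupling_strength` is hypothetical, and expressing coupling as "co-changes divided by total changes" is an assumption chosen to match the "10 out of 11" reading above, not necessarily the exact measure used by the authors.

```python
from fractions import Fraction


def coupling_strength(co_changes: int, total_changes: int) -> Fraction:
    """Fraction of an entity's changes in which the other entity changed too.

    Hypothetical helper for illustration; it is not part of ROSE.
    """
    if total_changes <= 0:
        raise ValueError("total_changes must be positive")
    return Fraction(co_changes, total_changes)


# Counts quoted in the notes above.
file_level = coupling_strength(20, 40)    # ComparePreferencePage.java -> plugin.properties
entity_level = coupling_strength(10, 11)  # fKeys[] -> plugin.properties

print(f"file level:   {file_level} = {float(file_level):.2f}")
print(f"entity level: {entity_level} = {float(entity_level):.2f}")
```

Note that the ratio is directional: seen from plugin.properties, the same 20 joint changes cover only 20 of its 69 changes.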

It is worth noting that evolutionary coupling is often not detectable by program analysis, since changes might affect resource or documentation files.

ROSE

[Figure: ROSE processes the version archive (miner.jpg, plugin.properties, ComparePreferencePage.java, ...) and extracts transactions such as the following.]

#42 (aweinand): fixed #13332 createGeneralPage() createTextComparePage() fKeys[] initDefaults() buildnotes_compare.html PatchMessages.properties plugin.properties

The concept of evolutionary coupling is used in a tool called ROSE (Reengineering of Software Evolution), which is developed by Tom Zimmermann and others at the chair of Prof. Zeller.

ROSE works as follows:

It first processes a version archive and extracts transactions, which are sets of changes that have been committed at the same time. For CVS a sliding window approach is used ...

Finally, large transactions are removed, since it is very unlikely that they will help us determine evolutionary coupling.
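To make the two preprocessing steps more concrete, here is a minimal sketch of how per-entity CVS check-ins might be grouped into transactions with a sliding time window, and how large transactions might then be dropped. Everything specific in it is an assumption for illustration: the `Checkin` record, the grouping key (same author and same log message), the 200-second window, the size limit of 30, and the invented timestamps. The notes do not state the values ROSE actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkin:
    author: str
    message: str
    timestamp: int  # seconds
    entity: str     # changed file, method, or attribute


def group_transactions(checkins, window=200):
    """Group per-entity check-ins into transactions.

    A check-in joins an open transaction if it has the same author and log
    message and arrives within `window` seconds of that transaction's latest
    check-in, so the window slides forward with every check-in.
    The 200-second default is an assumption of this sketch.
    """
    transactions = []
    latest = {}  # (author, message) -> [last timestamp, entity set]
    for c in sorted(checkins, key=lambda c: c.timestamp):
        key = (c.author, c.message)
        slot = latest.get(key)
        if slot is not None and c.timestamp - slot[0] <= window:
            slot[0] = c.timestamp
            slot[1].add(c.entity)
        else:
            entities = {c.entity}
            latest[key] = [c.timestamp, entities]
            transactions.append(entities)
    return transactions


def drop_large(transactions, max_size=30):
    """Discard transactions touching more than `max_size` entities (limit assumed)."""
    return [t for t in transactions if len(t) <= max_size]


# Timestamps are invented for illustration.
log = [
    Checkin("aweinand", "fixed #13332", 0, "fKeys[]"),
    Checkin("aweinand", "fixed #13332", 150, "initDefaults()"),
    Checkin("aweinand", "fixed #13332", 300, "plugin.properties"),
    Checkin("aweinand", "fixed #13332", 5000, "buildnotes_compare.html"),
]
transactions = group_transactions(log)
print(transactions)
```

The first three check-ins end up in one transaction even though the first and the third are 300 seconds apart, because each one falls within the window of its predecessor. The fourth arrives much later and starts a new transaction.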

Mining Rules

#47423 ... ... ... fKeys[] initDefaults() ... ... plugin.properties

#41999 ... ... ... fKeys[] initDefaults() ... ... plugin.properties

#30989 ... ... ... fKeys[] initDefaults() ... ... plugin.properties

#20814 ... ... ... fKeys[] initDefaults() ... ... plugin.properties

#9872 ... ... ... fKeys[] initDefaults() ... ... plugin.properties

#752 ... ... ... fKeys[] initDefaults() ... ... plugin.properties

#601 ... ... ... fKeys[] initDefaults() ... ... ...

#42 (aweinand): fixed #13332 createGeneralPage() createTextComparePage() fKeys[] initDefaults() buildnotes_compare.html PatchMessages.properties plugin.properties

Out of these transactions, ROSE infers rules like this one:

{fKeys[], initDefaults()} -> plugin.properties

Support = 7, Confidence = 7/8

Whenever fKeys and initDefaults are changed, plugin.properties should probably also be changed.

Note that these rules are probabilistic. Each rule has an associated support, the number of transactions the rule was derived from, and a confidence, which is the support divided by the number of transactions that contain the premises of the rule. In the example, seven of the eight transactions that changed both fKeys[] and initDefaults() also changed plugin.properties, so the support is 7 and the confidence is 7/8.

The rules are computed on the fly, once a set of changes is known, and are restricted to a single conclusion. These optimizations make mining with ROSE very efficient: it takes about half a second after each change.
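The support and confidence computation can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions, not ROSE's code: the function name `suggest` and its default thresholds are hypothetical, and the entities elided as "..." on the slide are simply left out of the example history.

```python
from collections import Counter


def suggest(transactions, changed, min_support=1, min_confidence=0.1):
    """Return (entity, support, confidence) rules for the current change set.

    Only transactions that contain every entity in `changed` are scanned, and
    every rule has a single conclusion. Illustrative sketch, not ROSE's code.
    """
    changed = frozenset(changed)
    matching = [t for t in transactions if changed <= t]
    counts = Counter(e for t in matching for e in t - changed)
    rules = []
    for entity, support in counts.items():
        confidence = support / len(matching)
        if support >= min_support and confidence >= min_confidence:
            rules.append((entity, support, confidence))
    # Highest confidence first; ties broken by support, then by name.
    return sorted(rules, key=lambda r: (-r[2], -r[1], r[0]))


premise = {"fKeys[]", "initDefaults()"}
history = (
    [frozenset(premise | {"plugin.properties"})] * 6  # #47423 ... #752
    + [frozenset(premise)]                            # #601
    + [frozenset(premise | {                          # #42
        "createGeneralPage()", "createTextComparePage()",
        "buildnotes_compare.html", "PatchMessages.properties",
        "plugin.properties"})]
)

top = suggest(history, premise)[0]
print(top)  # ('plugin.properties', 7, 0.875)
```

Because only the transactions containing the current change set are scanned, and each rule has a single conclusion, the work per query stays small. With stricter thresholds such as a minimum support of 3 and a minimum confidence of 0.9, this particular rule would not be reported, since 7/8 is below 0.9.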

ROSE is implemented as an Eclipse plugin, and it can easily be installed directly from Eclipse.


Once installed and set up, the tool ...

suggests likely further changes

ordered by confidence

ROSE Suggests Further Changes


and prevents errors due to incomplete changes

ROSE prevents incomplete changes

Customizable

Finally, ROSE is customizable: one can change the support and confidence needed for ROSE to make a recommendation. One can also choose which kinds of changes ROSE should be concerned with: modifications, additions, deletions, or all three, which is the default.

Experiments

For the experiments, the authors used eight large open source projects, which span a wide range of applications, architectures, and languages:

* Eclipse
* GCC
* GIMP
* JBoss
* JEdit
* KOffice
* PostgreSQL
* Python

Project Size

[Figure: two bar charts showing the number of files and the number of CVS transactions for Eclipse, GCC, GIMP, JBoss, JEdit, KOffice, PostgreSQL, and Python.]

[Figure: precision (y-axis, 0 to 0.8) plotted against feedback (x-axis, 0 to 0.6), with one curve each for minimum support counts of 1, 3, and 5 and point labels ranging from 0.1 to 1.0. The regions used for navigation and for error prevention are marked.]

Results: Precision Versus Feedback

* Precision = the fraction of returned results that were expected
* Feedback = the fraction of queries where ROSE makes at least one recommendation
* Recall = the fraction of expected results that were returned

As usual, there is a tradeoff between precision and feedback, and ROSE can be customized so that it achieves either high precision or high feedback. So you can have either precise suggestions or many suggestions, but not both.

For example, when suggesting related changes ROSE optimizes feedback: support = 1, confidence = 0.1 -> feedback over 0.66 and precision around 0.3

However, when preventing errors due to incomplete changes ROSE optimizes precision: support = 5, confidence = 0.3
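The three measures defined above can be computed from a list of queries as in the following sketch. The function name `evaluate`, the toy queries, and in particular the choice to average precision and recall per query are assumptions made for illustration. The notes do not say how the authors aggregate the values.

```python
def evaluate(queries):
    """Compute (feedback, precision, recall) over a list of queries.

    Each query is a pair (suggested, expected) of sets. Precision is averaged
    over queries with at least one suggestion, recall over queries with at
    least one expected item. This averaging is an assumption of the sketch.
    """
    if not queries:
        raise ValueError("need at least one query")
    answered = [(s, e) for s, e in queries if s]
    feedback = len(answered) / len(queries)
    precision = (
        sum(len(s & e) / len(s) for s, e in answered) / len(answered)
        if answered else 0.0
    )
    relevant = [(s, e) for s, e in queries if e]
    recall = (
        sum(len(s & e) / len(e) for s, e in relevant) / len(relevant)
        if relevant else 0.0
    )
    return feedback, precision, recall


# Toy queries, invented for illustration.
queries = [
    ({"a", "b"}, {"a"}),  # one of the two suggestions was expected
    (set(), {"c"}),       # no suggestion at all
    ({"x"}, {"x", "y"}),  # correct, but incomplete
]
feedback, precision, recall = evaluate(queries)
print(f"feedback={feedback:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Raising the minimum support and confidence removes weak suggestions, which tends to push precision up while more queries end up with no suggestion at all, so feedback goes down.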

[Figure: Navigation (support = 1, confidence = 0.1). Bar chart of feedback, recall, and precision for Eclipse, GCC, GIMP, JBoss, JEdit, KOffice, PostgreSQL, Python, and their average.]

Results: Navigation through the source code

A feedback of 0.66 means that in two out of three cases ROSE made at least one suggestion.

Precision is still high enough: with a precision of around 0.3, the programmer has to check about three suggestions in order to find a relevant one.

Error Prevention

[Figure: Error prevention (support = 3, confidence = 0.9). Bar chart of feedback, recall, and precision for Eclipse, GCC, GIMP, JBoss, JEdit, KOffice, PostgreSQL, Python, and their average.]

Results: Error Prevention

Feedback is very low: ROSE issued a warning in only 3% of the cases. However, precision and recall are both very high: 75% of the items in a warning do need to be considered, and 70% of the items that also need to be changed are in fact recommended.

One result not present on this chart is that only 2% of the queries caused false alarms. So ROSE does not stand in the way.

Granularity

[Figure: feedback, recall, and precision at fine versus coarse granularity, in two panels: Navigation (support = 1, confidence = 0.1) and Prevention (support = 3, confidence = 0.9).]

In contrast to other approaches, ROSE detects coupling between fine-grained program entities.

It is relatively easy, however, to make ROSE work at the coarse-grained file level. In this case ROSE makes more suggestions (feedback and recall increase while precision stays the same), but the suggestions are less specific and therefore less useful.

Maintenance

[Figure: feedback, recall, and precision for all changes, non-maintenance changes, and maintenance changes (support = 1, confidence = 0.1).]

The experiments also validated the hypothesis that the predictive power of ROSE is best for changes to existing entities, that is, for maintenance. The average recall rises to 44%, while precision stays almost the same.

Dimensions

[Figure: recall and precision when mining only modifications versus modifications, adds, and deletes (support = 1, confidence = 0.1, non-maintenance).]

Distinguishing between different kinds of changes (modifications, additions, and deletions), and thus obtaining multidimensional association rules, is especially useful in the non-maintenance case. Considering all three increases the recall by 6 percentage points.
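One way to picture multidimensional rules is to make every item a (kind, entity) pair instead of a bare entity name. The sketch below is only an illustration of that idea: the `confidence` helper, the pair encoding, and the tiny history are all invented, and the notes do not describe how ROSE represents change kinds internally.

```python
def confidence(transactions, premise, conclusion):
    """Confidence of the rule premise -> conclusion over a list of item sets."""
    premise = frozenset(premise)
    matching = [t for t in transactions if premise <= t]
    if not matching:
        return 0.0
    return sum(conclusion in t for t in matching) / len(matching)


# Invented history; every item is a (kind, entity) pair.
history = [
    {("modify", "Parser.java"), ("modify", "Parser.g")},
    {("modify", "Parser.java"), ("modify", "Parser.g")},
    {("delete", "Parser.java"), ("delete", "Parser.g"), ("modify", "build.xml")},
]

# One-dimensional view of the same history: the change kind is thrown away.
flat = [{entity for _, entity in t} for t in history]

flat_rule = confidence(flat, {"Parser.java"}, "build.xml")
delete_rule = confidence(history, {("delete", "Parser.java")}, ("modify", "build.xml"))
modify_rule = confidence(history, {("modify", "Parser.java")}, ("modify", "build.xml"))
print(flat_rule, delete_rule, modify_rule)
```

In the flat view, build.xml is a weak suggestion for every change to Parser.java. With the change kind as an extra dimension, it is suggested only when Parser.java is deleted and never when it is merely modified.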

[Figure: feedback, recall, precision, Likelihood 10, and transactions per day plotted over the Eclipse releases OSS, 2.0, 2.1, 2.1.1, 2.1.2, 2.1.3, and 3.0 (y-axis from 0 to 0.8).]

This chart shows how different measures of predictive power evolve over the lifetime of a project. This chart is for Eclipse, so the period is about 4 years.

One can easily notice the very short time between project start and the time ROSE makes useful suggestions. Almost all measures tend to stabilize after only a couple of months.

[Figure: recall over the Eclipse releases OSS, 2.0, 2.1, 2.1.1, 2.1.2, 2.1.3, and 3.0 (y-axis from 0 to 0.35) for four series: All Changes (default), Last 180 Days, Linear Weighting, and Perfect Ranking.]

Finally, the authors considered restricting the data used for the predictions to only recent changes. As one might imagine, this is useful for more dynamic projects with frequent refactorings, like Eclipse, where old data quickly becomes obsolete, but less useful for stable projects like GCC.

This chart is again for Eclipse, and you can see that restricting the data to the last 180 days or using a linear weighting function increases the recall of the top 10 suggestions by several percent. What you see on top is the perfect ranking. What is interesting is that the difference between the suggestions given by ROSE and this perfect ranking is not that big. However, there is still room for further improvement.
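The effect of favoring recent changes can be sketched by letting every transaction count with a weight that depends on its age. This is a hypothetical illustration: the function names, the 1460-day horizon of the linear weight, and the small history are assumptions, and the notes do not give the exact weighting function used in the experiments.

```python
def weighted_confidence(history, changed, candidate, today, weight):
    """Confidence of changed -> candidate where a transaction counts weight(age).

    history: list of (day, entities) pairs; age = today - day, in days.
    """
    changed = frozenset(changed)
    total = hits = 0.0
    for day, entities in history:
        if changed <= entities:
            w = weight(today - day)
            total += w
            if candidate in entities:
                hits += w
    return hits / total if total else 0.0


def all_changes(age):
    """Default: every transaction counts fully."""
    return 1.0


def last_180_days(age):
    """Hard cut-off: only transactions from the last 180 days count."""
    return 1.0 if age <= 180 else 0.0


def linear(age, horizon=1460):
    """Weight falls linearly from 1 (today) to 0 at `horizon` days (assumed shape)."""
    return max(0.0, 1.0 - age / horizon)


# Invented history: an old coupling a-b and a recent coupling a-c.
history = [
    (0, {"a", "b"}),
    (900, {"a"}),
    (950, {"a", "c"}),
]
today = 1000

for weight in (all_changes, last_180_days, linear):
    print(weight.__name__,
          round(weighted_confidence(history, {"a"}, "c", today, weight), 3),
          round(weighted_confidence(history, {"a"}, "b", today, weight), 3))
```

In this toy history, both the hard cut-off and the linear weighting rank the recent coupling with c above the old coupling with b, whereas counting all changes equally gives both the same confidence.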

Conclusions

* We have seen how information extracted from version archives can be used to guide developers along related software changes and to prevent errors due to incomplete changes.

* We presented ROSE, an Eclipse plugin that automates the whole process.

* And finally, we have seen strong empirical evidence of the usefulness of the techniques in different circumstances.

Thomas Zimmermann

Many Thanks

In the end I would like to thank Tom Zimmermann, the author of ROSE, for his support when preparing this presentation.


And of course ... Thank you!