Mining Version Archives
Cătălin Hrițcu
International Max Planck Research School for Computer Science
1
Version Archives
[Figure: a version archive holding numbered revisions (Revision 42, Revision 68, ...) of files such as miner.jpg, plugin.properties and ComparePreferencePage.java]
2
* Version archives are used by the great majority of software development projects. They store documents like source files together with their change history, and allow concurrent access to multiple developers.
* Open source projects use CVS and more recently Subversion to manage their code repositories. Their version archives constitute an easily available source of very valuable information that can be investigated - a true treasure.
Version Archives
* Guiding Software Changes
* Mapping Bugs to Fixes
* Mapping Failures to Defects
* Locating Cross-Cutting Concerns
* Raising Risk Awareness
* Predicting Component Failures
* Discovering Usage Patterns
3
[a true treasure] - that can be used for:
* mapping failures to defects and fixes
* predicting changes that could potentially lead to bugs
* discovering application-specific usage patterns
* locating cross-cutting concerns so that they can be converted to aspects
All these applications are very useful in practice; however, in this talk I will focus on only one:
* guiding programmers along related changes
Nicolas Bettenburg will present tomorrow a way to predict component failures using complexity metrics extracted from the source code. The experiments were conducted on well-known projects at Microsoft, and Prof. Zeller gave a presentation on this yesterday, in the Automated Debugging lecture.
There is also another paper, by J. Anvik and others, entitled “Who Should Fix this Bug?”, which was discussed at length in this seminar. The authors of the paper identified difficulties in tracing information between bug repositories and version archives. Some of the problems are trivial, like mapping CVS user names to the corresponding email addresses in Bugzilla, but tool support is still needed in order to make this automatic.
The Cost of Change
[Figure: share of project cost taken by maintenance versus all other activities related to software development, for an optimistic and a pessimistic estimate]
4
Many studies show that maintenance - the process of changing software - accounts for most of the costs of projects. Maintenance alone is usually between 50 and 75 percent of the costs.
So changing software is in fact expensive.
The Risks of Change
5
But this is not all.
Even the smallest change can cause the entire system to fail, in the most unexpected ways.
Changing Software
Missed Me!
Done?
6
However, at least changing software is easy! Right? All you need to do is start your favorite text editor and you are almost done.
And if by software you mean a school project, changing it works of course like this.
However, for large software projects with tens of thousands of files, the situation is a lot worse. Searching for the right location to change is like looking in a phone book for a person of whom you only know the first name. It is a tedious and error-prone process that can take days, and, what’s worse, you don’t even know when you are done.
So changing software is really hard, and developers could use some guidance.
7
The idea is simple.
All of you probably know Amazon.com, and their online store. Amazon was a bookstore at first, but today it sells almost everything.
And it has a very interesting feature.
8
Amazon helps customers browse along related items, by providing this list of books that were typically included in the same purchase. This information is obtained by applying data mining on the database of previous purchases.
Can’t we do the same thing for large programs?
Can’t we guide programmers along related changes?
“Programmers who changed this item also changed ...”
9
Of course we can! Using version archives we can obtain information like “programmers who changed this method also changed these other methods”.
Evolutionary Coupling
10
Version Archives convey important information about how a system evolved over time. In particular, version archives can tell us which parts of the system are coupled by common changes -- the authors call this evolutionary coupling.
Let’s see an example from the eclipse compare plug-in. We see two files, ComparePreferencePage.java and plugin.properties, each listed with its number of changes: ComparePreferencePage.java was changed 40 times, while plugin.properties was changed 69 times.
Both files have been changed together 20 times, indicating some evolutionary coupling. This is not a very strong coupling, though, since ComparePreferencePage.java has been changed 20 times without plugin.properties being changed at the same time.
To obtain more details, we can increase the granularity and determine the evolutionary coupling between the individual attributes and methods.
This reveals new couplings - for instance, a coupling between the fKeys[] attribute and the initDefaults() method as well as a coupling between the fKeys[] attribute and the plugin.properties file. Both couplings are strong: In 10 out of 11 times that fKeys[] has been changed, plugin.properties has been changed, too.
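The counting behind these numbers can be sketched in a few lines of Python. This is only an illustration, not how ROSE is implemented: the function names and the toy history below are assumptions, built to reproduce the "10 out of 11" figure for fKeys[] and plugin.properties.

```python
from collections import Counter
from itertools import combinations


def coupling_counts(transactions):
    """Count how often each entity changed and how often each pair changed together."""
    changes = Counter()
    together = Counter()
    for entities in transactions:
        unique = set(entities)
        changes.update(unique)
        # Store each pair in sorted order so (a, b) and (b, a) share one counter.
        for pair in combinations(sorted(unique), 2):
            together[pair] += 1
    return changes, together


def coupling_ratio(changes, together, a, b):
    """Fraction of the changes to `a` in which `b` was changed as well."""
    if changes[a] == 0:
        return 0.0
    return together[tuple(sorted((a, b)))] / changes[a]


# Toy history (assumed): fKeys[] changed 11 times, 10 of them with plugin.properties.
history = [{"fKeys[]", "plugin.properties"}] * 10 + [{"fKeys[]"}]
changes, together = coupling_counts(history)
ratio = coupling_ratio(changes, together, "fKeys[]", "plugin.properties")
print(f"{ratio:.2f}")
```

Note that the ratio is not symmetric: it is computed relative to the changes of the first entity, which is why a file that changes very often can look weakly coupled from its own side.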
It is worth noting that evolutionary coupling is quite often not detectable by program analysis, since changes might affect resource or documentation files.
ROSE
#42 (aweinand): fixed #13332 createGeneralPage() createTextComparePage() fKeys[] initDefaults() buildnotes_compare.html PatchMessages.properties plugin.properties
11
The concept of evolutionary coupling is used in a tool called ROSE (Reengineering of Software Evolution), which is developed by Tom Zimmermann and others at the chair of Prof. Zeller.
ROSE works as follows:
It first processes a version archive and extracts transactions - which are sets of changes that have been committed at the same time. For CVS a sliding window approach is used ...
Finally, large transactions are removed, since it is very unlikely that they will help us determine evolutionary coupling.
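As a rough illustration of this preprocessing step, here is a minimal Python sketch. Everything specific in it is an assumption made for the example: the Checkin record, the rule that check-ins belong together when they share author and log message, the 200-second window and the 30-entity size limit are not taken from this talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkin:
    author: str
    message: str
    timestamp: int  # seconds; CVS records each file check-in separately
    entity: str


def group_transactions(checkins, window=200, max_size=30):
    """Group single check-ins into transactions, then drop the large ones.

    Assumption: two check-ins belong to the same transaction if they share
    author and log message and lie at most `window` seconds apart.
    """
    open_groups = {}   # (author, message) -> most recent group for that key
    transactions = []
    for c in sorted(checkins, key=lambda c: c.timestamp):
        key = (c.author, c.message)
        group = open_groups.get(key)
        if group is not None and c.timestamp - group["last"] <= window:
            group["entities"].add(c.entity)
            group["last"] = c.timestamp  # the window slides with each check-in
        else:
            group = {"last": c.timestamp, "entities": {c.entity}}
            open_groups[key] = group
            transactions.append(group["entities"])
    return [t for t in transactions if len(t) <= max_size]


log = [
    Checkin("aweinand", "fixed #13332", 0, "fKeys[]"),
    Checkin("aweinand", "fixed #13332", 50, "initDefaults()"),
    Checkin("aweinand", "fixed #13332", 240, "plugin.properties"),
    Checkin("aweinand", "fixed #13332", 1000, "buildnotes_compare.html"),
]
print(group_transactions(log))
```

The third check-in is 240 seconds after the first but only 190 seconds after the second, so it still joins the same transaction. That is the difference between a sliding window and a fixed one.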
#47423 ... ... ... fKeys[] initDefaults() ... ... plugin.properties
#41999 ... ... ... fKeys[] initDefaults() ... ... plugin.properties
#30989 ... ... ... fKeys[] initDefaults() ... ... plugin.properties
#20814 ... ... ... fKeys[] initDefaults() ... ... plugin.properties
#9872 ... ... ... fKeys[] initDefaults() ... ... plugin.properties
#752 ... ... ... fKeys[] initDefaults() ... ... plugin.properties
#601 ... ... ... fKeys[] initDefaults() ... ......
#42 (aweinand): fixed #13332 createGeneralPage() createTextComparePage() fKeys[] initDefaults() buildnotes_compare.html PatchMessages.properties plugin.properties
Mining Rules
12
Out of these transactions it infers rules like
{fKeys[], initDefaults()} -> plugin.properties
Support = 7 Confidence = 7/8
Mining Rules
13
Whenever fKeys and initDefaults are changed, plugin.properties should probably also be changed.
Note that these rules are probabilistic, and each has two associated measures: a support, which is the number of transactions the rule was derived from, and a confidence, which is the support divided by the number of transactions that contain the premises of the rule.
The rules are computed on the fly, once a set of changes is known and are restricted to a single conclusion. These optimizations make mining with ROSE very efficient - about half a second after each change.
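To make support and confidence concrete, here is a small Python sketch of mining single-conclusion rules on the fly for a given set of changed entities. It is a simplified illustration, not ROSE's actual algorithm. The function name suggest and the toy transaction list are assumptions; the list is built to match the eight transactions on the slide, with the elided entities left out.

```python
from collections import Counter
from fractions import Fraction


def suggest(transactions, changed, min_support=1, min_confidence=0.1):
    """Return (entity, support, confidence) rules for the premise `changed`."""
    changed = set(changed)
    # Only transactions that contain the whole premise can support a rule.
    matching = [set(t) for t in transactions if changed <= set(t)]
    if not matching:
        return []
    counts = Counter()
    for t in matching:
        counts.update(t - changed)  # every other entity is a candidate conclusion
    rules = []
    for entity, support in counts.items():
        confidence = Fraction(support, len(matching))
        if support >= min_support and confidence >= min_confidence:
            rules.append((entity, support, confidence))
    # Highest confidence first, then highest support, then by name.
    return sorted(rules, key=lambda r: (-r[2], -r[1], r[0]))


transactions = (
    [{"fKeys[]", "initDefaults()", "plugin.properties"}] * 6
    + [{"fKeys[]", "initDefaults()"}]
    + [{"createGeneralPage()", "createTextComparePage()", "fKeys[]",
        "initDefaults()", "buildnotes_compare.html",
        "PatchMessages.properties", "plugin.properties"}]
)
for entity, support, confidence in suggest(transactions, {"fKeys[]", "initDefaults()"}):
    print(entity, support, confidence)
```

Because the premise is fixed by what the programmer has already changed, only the transactions containing it are scanned, and each remaining entity yields at most one rule. With the stricter thresholds min_support=3 and min_confidence=0.9, the rule for plugin.properties (confidence 7/8) would no longer be reported.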
14
ROSE is implemented as an eclipse plugin, and it can easily be installed directly from eclipse.
15
Once installed and set up, the tool ...
suggests likely further changes
ordered by confidence
16
ROSE Suggests Further Changes
17
18
and prevents errors due to incomplete changes
ROSE prevents incomplete changes
19
20
Finally, ROSE is customizable ...
Customizable
21
Finally, ROSE is customizable ... one can change the support and confidence needed for ROSE to make a recommendation. One can also set what kinds of changes ROSE should be concerned with: modifications, additions, deletions or all three of them, which is also the default.
Experiments
22
For the experiments the authors used eight large open source projects, which span a wide range of applications, architectures and languages:
* eclipse
* gcc
* gimp
* jboss
* jedit
* koffice
* postgresql
* python
[Figure: bar charts of the number of files and the number of CVS transactions for Eclipse, GCC, GIMP, JBoss, JEdit, KOffice, PostgreSQL and Python]
23
Project Size
[Figure: precision (0 to 0.8) plotted against feedback (0 to 0.6), with one curve each for minimum support count = 1, 3 and 5 and points labelled from 0.1 to 1.0; two regions are marked Navigation and Error Prevention]
24
Results: Precision Versus Feedback
Precision = the fraction of returned results that were expected
Feedback = the fraction of queries where ROSE makes at least one recommendation
Recall = the fraction of expected results that were returned
As usual there is a tradeoff between these two, and ROSE can be customized so that it achieves either high precision or high feedback. So you can have either precise suggestions or many suggestions, but not both.
For example, when suggesting related changes ROSE optimizes feedback: support = 1, confidence = 0.1 -> feedback over 0.66 and precision around 0.3
However, when preventing errors due to incomplete changes ROSE optimizes precision: support = 5, confidence = 0.3
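The three measures can be computed as in the following Python sketch. It assumes that each query is recorded as a pair (recommended, expected) of sets and that precision and recall are pooled over all queries; how the authors averaged across queries is not stated in this talk, so that choice is an assumption.

```python
def evaluate(results):
    """Compute feedback, precision and recall over a list of queries.

    results: list of (recommended, expected) pairs of sets, one per query.
    """
    answered = sum(1 for recommended, _ in results if recommended)
    returned = sum(len(recommended) for recommended, _ in results)
    expected = sum(len(exp) for _, exp in results)
    correct = sum(len(recommended & exp) for recommended, exp in results)
    return {
        # queries with at least one recommendation / all queries
        "feedback": answered / len(results) if results else 0.0,
        # correct recommendations / all recommendations
        "precision": correct / returned if returned else 0.0,
        # correct recommendations / everything that was expected
        "recall": correct / expected if expected else 0.0,
    }


# Three toy queries (entity names are placeholders).
queries = [
    ({"a", "b", "c"}, {"a"}),   # three suggestions, one of them relevant
    (set(), {"x"}),             # no suggestion at all: lowers feedback
    ({"d"}, {"d", "e"}),        # one precise suggestion, one item missed
]
print(evaluate(queries))
```

Raising the support and confidence thresholds removes suggestions: that can only lower feedback and recall, while precision typically goes up. This is the tradeoff visible on the chart.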
[Figure: feedback, recall and precision for Eclipse, GCC, GIMP, JBoss, JEdit, KOffice, PostgreSQL, Python and on average; Navigation, support = 1, confidence = 0.1]
25
Results: Navigation through the source code
A feedback of 0.66 means that in two out of three cases ROSE made at least one suggestion.
Precision is still high enough - the programmer has to check three suggestions in order to find a relevant one.
Error Prevention
[Figure: feedback, recall and precision for Eclipse, GCC, GIMP, JBoss, JEdit, KOffice, PostgreSQL, Python and on average; support = 3, confidence = 0.9]
26
Results: Error Prevention
Feedback is very low: ROSE issued a warning in only 3% of the cases. However, precision and recall are both very high: on average, 75% of the items in a warning really do need to be changed, and 70% of the items that need to be changed are in fact recommended.
One result not present on this chart is that only 2% of the queries caused false alarms. So ROSE does not stand in the way.
Granularity
[Figure: feedback, recall and precision at fine versus coarse granularity, for navigation (support = 1, confidence = 0.1) and for prevention (support = 3, confidence = 0.9)]
27
In contrast to other approaches, ROSE detects coupling between fine-grained program entities.
However, it is relatively easy to make ROSE work at the coarse-grained file level, and in this case ROSE makes more suggestions (feedback and recall increase while precision stays the same). The drawback is that these suggestions are less specific and therefore less useful.
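Switching from fine to coarse granularity amounts to replacing each changed entity by the file that contains it. The Python sketch below assumes, purely for illustration, that a fine-grained entity is stored as a (file, member) pair, with member set to None when the file itself is the entity.

```python
def to_file_level(transactions):
    """Project fine-grained transactions onto the files that were touched."""
    return [{file for file, _member in t} for t in transactions]


fine = [
    {
        ("ComparePreferencePage.java", "fKeys[]"),
        ("ComparePreferencePage.java", "initDefaults()"),
        ("plugin.properties", None),
    }
]
print(to_file_level(fine))
```

Several fine-grained entities collapse into one file, so more transactions match a given query, but a suggestion can only say "look at this file", not which method or attribute in it.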
Maintenance
[Figure: feedback, recall and precision for all changes, non-maintenance changes and maintenance changes; support = 1, confidence = 0.1]
28
The experiments also validated the hypothesis that the predictive power of ROSE is best for changes to existing entities. The average recall rises to 44%, while precision stays almost the same.
Dimensions
[Figure: recall and precision when considering only modifications versus modifications, adds and deletes; support = 1, confidence = 0.1, non-maintenance changes]
29
Especially useful in the non-maintenance case is distinguishing between different kinds of changes, such as modifications, additions and deletions, thus obtaining multidimensional association rules. Considering all three increases the recall by 6%.
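One simple way to picture the extra dimension is to make the kind of change part of each item, so that "fKeys[] was modified" and "fKeys[] was added" become different items in a transaction. The Python sketch below is an assumption about representation only; the kind labels and the entity names newKey() and oldKey() are hypothetical.

```python
def to_items(changes, kinds=("modify", "add", "delete"), tagged=True):
    """Turn one transaction's (kind, entity) pairs into a set of items.

    kinds:  which kinds of change to keep
    tagged: keep the kind as part of the item (multidimensional rules)
            or drop it and keep only the entity
    """
    kept = [(kind, entity) for kind, entity in changes if kind in kinds]
    if tagged:
        return set(kept)
    return {entity for _kind, entity in kept}


transaction = [
    ("modify", "fKeys[]"),
    ("add", "newKey()"),
    ("delete", "oldKey()"),
]
print(to_items(transaction, kinds=("modify",)))  # only modifications
print(to_items(transaction))                     # all three kinds, tagged
```

The resulting item sets can be fed to any rule miner unchanged; rules then read like "when fKeys[] is modified and a method is added, plugin.properties is modified".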
[Figure: feedback, recall, precision, likelihood 10 and transactions per day (scale 0 to 0.8) over the releases OSS, 2.0, 2.1, 2.1.1, 2.1.2, 2.1.3 and 3.0]
30
This chart shows how different measures of predictive power evolve over the lifetime of a project. This chart is for eclipse, so the period is about 4 years.
One can easily notice the very short time between project start and the time ROSE makes useful suggestions. Almost all measures tend to stabilize after only a couple of months.
[Figure: recall (scale 0 to 0.35) over the releases OSS, 2.0, 2.1, 2.1.1, 2.1.2, 2.1.3 and 3.0 for four series: Perfect Ranking, Linear Weighting, Last 180 Days and All Changes (default); labelled values 0.24, 0.27, 0.28 and 0.35]
31
Finally, the authors considered restricting the data used for the predictions to only recent changes. As one might imagine, this is useful for more dynamic projects with frequent refactorings, like eclipse, where old data quickly becomes obsolete, but less useful for stable projects like gcc.
This chart is again for eclipse, and you can see that restricting the data to the last 180 days or using a linear weighting function increases the recall of the top 10 suggestions by several percent. What you see on top is the perfect ranking. What is interesting is that the difference between the suggestions given by ROSE and this perfect ranking is not that big. However, there is still room for further improvement.
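Both variants can be sketched in Python as follows. The talk does not define the weighting function, so the one used here is an assumption: a transaction's weight grows linearly from 0 for the oldest transaction to 1 for one made now. Timestamps are in days, and the history is a list of (timestamp, entities) pairs.

```python
def recent_only(history, now, days=180):
    """Keep only transactions from the last `days` days."""
    return [(t, entities) for t, entities in history if now - t <= days]


def linear_weight(timestamp, start, now):
    """Assumed weighting: 0 for the oldest transaction, 1 for one made now."""
    if now == start:
        return 1.0
    return (timestamp - start) / (now - start)


def weighted_confidence(history, changed, candidate, now):
    """Confidence of `changed -> candidate` with recent transactions counting more."""
    if not history:
        return 0.0
    changed = set(changed)
    start = min(t for t, _ in history)
    premise = conclusion = 0.0
    for t, entities in history:
        if changed <= entities:
            weight = linear_weight(t, start, now)
            premise += weight
            if candidate in entities:
                conclusion += weight
    return conclusion / premise if premise else 0.0


history = [
    (0, {"a", "b"}),
    (100, {"a"}),
    (300, {"a", "b"}),
    (360, {"a", "b"}),
]
print(recent_only(history, now=360))
print(weighted_confidence(history, {"a"}, "b", now=360))
```

In this toy history the unweighted confidence of a -> b is 3/4, while the weighted one is higher because the transaction without b is old. A hard cut-off like recent_only is the extreme case where old transactions get weight 0 and recent ones weight 1.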
Conclusions
32
* We have seen how information extracted from version archives can be used to guide developers along related software changes and to prevent errors due to incomplete changes.
* We presented ROSE, an eclipse plugin that automates the whole process
* And finally we have seen strong empirical evidence of the usefulness of the techniques in different circumstances
Thomas Zimmermann
Many Thanks
33
In the end I would like to thank Tom Zimmermann, the author of ROSE, for his support when preparing this presentation.
34
And of course ... Thank you!