A New Perspective on Predicting Maintenance Costs

Florian Uunk

April 29, 2012
Amsterdam, The Netherlands

Thesis supervisor: Patricia Lago
Examiner: Ivica Crnkovic
External advisor: Rick Kazman, University of Hawai‘i and Software Engineering Institute, Carnegie Mellon University

A thesis in partial fulfillment of the requirements for the degrees:
Master of Science in Computer Science
Master of Science in Software Engineering

VU University Amsterdam, Faculty of Sciences
Mälardalen University, School of Innovation, Design and Engineering

This thesis is performed as part of the Double Degree program in Global Software Engineering (GSEEM).



Abstract

In my thesis I focus on providing a foundation of data on which decision makers can base refactoring decisions. For this, I examine the relationship between software complexity and maintenance effort. To make the data a stronger basis for refactoring decisions, I present a new approach of correlating file metrics to maintenance effort, in which I look at the relation between changes in file metrics over multiple releases and changes in the maintenance effort spent on these files. I do this using a broadened, more complete notion of maintenance effort. I measure maintenance effort in 4 ways: the number of lines of code that had to be changed to resolve tasks, the amount of discussion that tasks generated, the number of atomic changes to a file that were required to resolve a task, and the number of bugs per month. To test this framework, I extracted data from 3 open source projects, where I measured the variation of both complexity and maintenance effort, using this new notion of effort, over multiple releases, and investigated their correlation. I found that 21 of the tested metrics were significantly correlated to the effort measures, with complexity-based metrics and incoming propagation cost showing the highest correlation. Of the proposed measures for maintenance effort, the amount of discussion required to resolve an issue shows the highest correlation to the chosen metrics.


Contents

1 Introduction
  1.1 Research questions
  1.2 Roadmap

2 Background
  2.1 A modularity decision framework
  2.2 Research on metrics

3 Related work
  3.1 Research on metrics and fault rates
  3.2 Research on metrics and maintenance effort
  3.3 Differences in my approach

4 Methodology
  4.1 The subject projects
    4.1.1 Criteria
    4.1.2 Selected projects
  4.2 The selected metrics
    4.2.1 Criteria
    4.2.2 Selected metrics
  4.3 Measuring maintenance effort
    4.3.1 Discussion
    4.3.2 Change in lines of code (Churn)
    4.3.3 Actions
    4.3.4 Faults

5 The data collection methods
  5.1 The data model
  5.2 Extracting the project data
    5.2.1 Extracting releases
    5.2.2 Extracting issues and comments
    5.2.3 Extracting files and source instances
    5.2.4 Extracting churn
  5.3 Measuring the metrics
    5.3.1 Measuring LOC metrics
    5.3.2 Measuring C&K metrics
    5.3.3 Measuring Propagation cost metrics
  5.4 Connecting metrics to maintenance

6 Statistical analysis
  6.1 Normalizing the data
  6.2 Spearman analysis
  6.3 Data grouping

7 Results

8 Discussion
  8.1 Research questions
  8.2 Comparison to previous work
  8.3 Threats to validity
  8.4 Future work

9 Conclusion

A Apache project overview

1 Introduction

Estimating the duration, effort, cost, and complexity of software projects is of vital importance for IT management. Underestimating cost and effort can lead to loss of credibility. Overestimating costs can lead to missed opportunities and unhappy stakeholders [29]. However, estimations have traditionally been problematic in software engineering. Various studies have examined the accuracy of estimations; in a review, Molukken and Jorgensen [37] show that 60-80% of all software projects run over budget, with projects going over budget by 33% on average. Heemstra lists a number of reasons that make estimations difficult for software projects. In addition to the inherent difficulty of measuring, he notes the lack of reference data to base estimates on [27].

More than half of the total development effort in software projects is spent on the maintenance phase [32]. With the trend of iterative development, this percentage may even have increased in recent years. During the maintenance phase, software ages [39] and the code base gets cluttered by an accumulation of changes, often referred to as technical debt [23]. When technical debt is not addressed, further development will be hindered.

Project managers have to decide if they want to spend their development time on building new features, or on refactoring their code base. Various frameworks have been developed to support these decisions [14], [30]. However, these frameworks rely on accurate maintenance cost predictions for the code with and without refactoring. Currently, most researchers seem to agree that there is a lack of this data [10], [30], [41].

But if managers do not have good insight into the benefits of refactoring, it is difficult to know when to refactor. My research aims to provide a foundation of data—relating complexity to effort and hence cost—upon which sound refactoring decisions may be based. This foundation can then be used in combination with work on modularity violation detection like [45] and [48] to decide which parts of the code base to refactor.

Specifically, I have three research goals: 1) to introduce a new and more complete approach to measuring effort in software projects, including non-code-based measures such as the amount of discussion and the number of actions needed to resolve an issue, 2) to introduce a new approach of correlating source code metric variation to changes in maintenance effort, and 3) to analyze which set of source code metrics is most correlated to maintenance effort using these new approaches.


1.1 Research questions

The primary hypothesis of this thesis is:

There is a statistically significant correlation between software complexity variation over successive releases of software project files, and the variation of effort required to maintain those files.

Finding a way to correlate maintenance effort with changes in one or more source code metrics of a file, based on measurements that are easy to capture and track, will make it possible to predict future maintenance costs of that file. This information can then be used to make economics-driven decisions about software maintenance, including refactoring. While answering this question is of interest to most software projects, few industrial projects are willing to contribute the data needed to answer the question with authority. For this reason I have turned to open-source projects as the primary source of data. The choice of open-source projects is not without its own complexities, but I believe that the challenges of dealing with open-source data are solvable, as I will discuss.

There are a number of questions that I will need to answer in my effort to prove or disprove the primary hypothesis presented above.

1. What metrics should be considered when predicting maintenance costs?

Numerous source code metrics have been proposed and studied, but not all of them have a proven impact on maintenance cost. I first need to identify the set of metrics, with a solid theoretical and empirical basis, that I can test using my approach.

2. Which measures can effectively approximate maintenance effort?

Maintenance effort manifests in multiple ways, and is not always measurable in code. For example, adding a few lines of code may require a great deal of effort, in the form of intensive discussion or multiple revisions, if the change itself is difficult. To link complexity metrics to effort, I must first establish a way of measuring maintenance effort.

3. Which complexity metrics are significantly correlated to which maintenance effort type?

The answer to this question will help in understanding which maintenance effort measures increase or decrease with which complexity metrics. I will examine the metrics selected in 1) and their correlation to the maintenance effort measures selected in 2) on a per-file basis, using statistical models.


1.2 Roadmap

The rest of the thesis is organized as follows: In section 2 I give some background on modularity decisions, source code metrics, and modularity violation detection. In section 3, I discuss the related work in the field, and how my research framework differs from this previous work. Section 4 describes the data selection. Section 5 describes how I extracted this data. Section 6 describes the analysis process on the data. Section 7 shows the results I have obtained. Section 8 contains the discussion, threats to validity, and suggestions for future work. Section 9 contains the conclusion.


2 Background

In this section, I will describe the theoretical context that this thesis builds upon. I will first describe a modularity decision making framework for which the results of this thesis can be used. I will then discuss the theoretical background on source code metrics, the different dimensions of complexity that they measure, and some influential research on metrics.

2.1 A modularity decision framework

To assess the economic benefits of refactoring, one has to weigh the costs of refactoring against the benefits in maintenance costs. Leitch and Stroulia [30] developed a framework for these assessments. The steps in their framework are shown in figure 1.

[Figure 1: Model for economics-based refactoring decisions. The diagram shows the flow from the legacy system through maintenance cost prediction, refactoring analysis, refactoring plan, refactoring cost prediction, and post-refactoring maintenance cost prediction, into ROI estimation and the refactoring decision.]

The framework starts with predicting the maintenance cost based on the current code base. Then, the developers perform a refactoring analysis, where they investigate the source code for refactoring opportunities. They deliver a refactoring plan, which consists of a number of proposed changes to the source code that are aimed at improving maintainability. The costs of executing this plan can be predicted, for example using COCOMO [7]. Based on this plan, the maintenance costs of the system after the refactoring effort are predicted. The predictions of the refactoring cost, maintenance without refactoring, and maintenance after refactoring are then input for a Return-On-Investment (ROI) analysis. This analysis assesses the economic benefits of the refactoring plan. Based on this ROI analysis, a manager can decide to accept or reject the refactoring plan.

There are a couple of challenges in using this framework. First, refactoring opportunities have to be identified. Numerous publications have addressed this problem, like Trifu and Marinescu [45] and Wong et al. [48]. Second, the maintenance cost with and without the proposed refactoring has to be estimated. In this thesis, I will provide additional data that helps in making these estimations, based on source code metrics of the project.

2.2 Research on metrics

Published source code metrics can be divided into 5 categories, based on what they measure: size, complexity, coupling, cohesion, and inheritance [19]. I will give a brief description of each category, along with some of the most influential publications on source code metrics.

Size  Size is the most straightforward metric for source code. The number of lines of code (LOC) is the easiest way of measuring size. But, like any metric, it has its flaws. For example, as any experienced coder knows, it is possible to write the same code with fewer (or more) lines of code, while maintaining the same complexity. To mitigate these problems, several other metrics have been proposed.

File complexity  Measures of how complex the source file is are postulated to affect modifiability and maintainability: lower is better. Examples of complexity-based metrics are Halstead Volume [25], a metric based on operator and operand counts, and McCabe Complexity [34], a metric based on the number of possible paths in the program control graph.

Coupling  Coupling describes the number of connections the code has to other files or other classes. The assumption is that lower coupling is better. Briand et al. proposed a set of metrics that measure the different possible kinds of class-to-class coupling [8]. Another coupling metric is Propagation Cost, which was first introduced by MacCormack et al. in 2006 [33].

Cohesion  Cohesion measures how strongly the responsibilities of a file are related. The rationale behind measuring cohesion is the belief that source files and classes should focus on just one thing, and that doing so will improve maintainability.


Inheritance  Inheritance-based metrics only apply to object-oriented code. Less complex inheritance hierarchies are expected to be easier to understand and maintain.

Chidamber and Kemerer [16] (henceforth C&K) developed the first, and so far most influential, metrics suite aimed at measuring object-oriented source code. It includes metrics for complexity, coupling, cohesion, and inheritance.

As you can see, many different metrics and metric types have been proposed. However, there is a lack of strong statistical evidence to support many of these metrics. I will discuss the existing empirical research in section 3.


3 Related work

While there is a large amount of published research on source code metrics and maintenance effort, the field is far from stabilized.

In the research literature there are 2 broad categories of project outcomes that have been associated with software metrics: linking metrics to fault rates, and linking metrics to maintenance effort. I will first review research on fault rates, and then review the research on maintenance effort. Finally, I discuss how my approach differs from previous work.

3.1 Research on metrics and fault rates

Much work has been done to study the correlation between source code metrics and fault rates. I now discuss some of the most influential studies.

Comparing in controlled experiments  Basili et al. in 1996 [4] gave 8 groups of students the same assignment; at the end, the number of errors found per class in the testing phase was counted and then related to the various metrics in the C&K suite. They find all metrics in the suite to yield significant results. Briand et al. [11] replicated the experimental setup of Basili et al. with 8 student projects, and correlated a total of 49 different metrics from various suites to fault rates. They find that most high-scoring metrics are coupling and inheritance based.

Comparing in industry data  Tang et al. [44] compare the C&K metrics set against fault data in 3 real-time systems. They find significant results for 2 complexity-based metrics. Cartwright and Shepperd [15] measured a number of complexity and inheritance metrics, and concluded that complexity metrics are able to forecast faults. Subramanyam and Krishnan [43] correlate some of the C&K metrics to fault rates. They find significant results for complexity and inheritance based metrics.

Comparing in open source data  Gyimóthy et al. [24] study the C&K metric suite and a few metrics they developed themselves in the Mozilla software. They find significant values for complexity, size, inheritance, and coupling metrics.

Comparing over releases  El Emam et al. [19, 20] compare a larger set of object-oriented metrics against fault rates, where they control for class size. They construct a prediction model based on the first version of a project, and then test this on subsequent versions. They conclude that, of the metrics they examine, only coupling-based metrics provide significant results. Olague et al. [38] look at data for a project that utilizes iterative development. They compare metric data with faults over various iterations. They find significant values for complexity metrics. Zhou et al. [51] have written a correction of the statistics used in Olague's work. They also performed new research on a project that contains various iterations, and again find significant values for complexity-based metrics.

3.2 Research on metrics and maintenance effort

A number of papers have attempted to correlate source code metrics to maintenance effort. However, there is no generally agreed method to determine the maintenance effort for a file. I will now describe a number of approaches that have been attempted.

Comparing against expert judgement  Welker et al. [47] propose a polynomial that uses complexity-based metrics to predict maintenance effort. The weights for each of these metrics are automatically fitted, so that the polynomial matches expert judgement data on 8 systems. They present this polynomial as the Maintainability Index.

Comparing against Maintainability Index  Misra [36] and Zhou and Xu [50] compare a list of complexity and inheritance metrics against the Maintainability Index at the system level. Both papers find significant correlations in both categories.

Comparing in controlled experiments  Harrison et al. [26] compare metrics against both expert judgement and maintenance measurements obtained in a controlled experiment. They find correlations between complexity and cohesion and their maintenance measures, and a correlation between complexity and the expert judgement of a system.

Arisholm [2] looks at 10 changes made to an industrial system, where he logged the hours spent on the tasks. He finds no correlation between source code metrics and effort. However, this may be due to the small size of the data set.

Comparing against change  Li and Henry [31] did research linking a set of metrics to the total change volume of classes in 2 different projects. They found significant correlations for complexity, coupling, cohesion, and inheritance metrics. Binkley and Schach [6] looked at the change volume of an industrial system. They positively correlate this to coupling and complexity metrics and to one inheritance metric. Ware et al. [46] look at the number of changes and the number of lines changed for files in a commercial application. They find significant correlations for complexity and coupling measures.

Comparing over releases  Demeyer and Ducasse [17] try to identify problem areas in the source code of a project and check if those problem areas get refactored in later releases. They find no correlation; however, they also note that the project is in good shape, so there might not have been a need for refactoring. Alshayeb and Li [1] try to correlate a polynomial, consisting of complexity, coupling, and inheritance metrics, to maintenance effort in iterative projects. They do this by measuring lines of code added, deleted, and changed, first between releases of a project, then between changes within a release. They find that their constructed polynomial is reasonably good at predicting effort between changes, but less good at predicting effort between releases.

3.3 Differences in my approach

There are three ways in which my research is different from earlier research in this area. The first difference is that instead of simply comparing file measures to maintenance effort, I compare an increase or decrease in file measures to an increase or decrease in maintenance effort. This analysis gives much more insight into the effect of source code metric variation during the lifetime of a project. If there is a clear correlation between a difference in some metric values and maintainability measures, the project manager will be able to use this knowledge to make informed decisions about maintenance and refactoring opportunities.

The second difference is that instead of measuring maintenance effort as a difference in lines of code or the number of total changes, I look at a more precise estimation of effort. For this, I add 2 new measurements: the number of actions—changes to a file—required to resolve an issue, and the amount of discussion required to resolve an issue. This results in a new, more comprehensive and holistic view of maintenance effort. The amount of code changed is an inadequate measure of effort because it does not take into account how difficult it was to make those changes. Actions and discussions attempt to capture this dimension of effort; the idea is that a complex change will require more discussion among project members and more changes to files (since some of the initial changes will not be correct and will need subsequent rounds of changes).


The third difference is that instead of looking at either effort measures or fault rates, I look at both, since faults introduce extra maintenance work that has to be performed on files as well.

To my knowledge, no previous work on maintenance prediction measures discussion or the number of actions, or also considers fault rates. I am also the first to look at change in file metrics versus change in maintenance effort over releases. Previous work like [1, 19, 20, 38, 51] has also looked at maintenance effort or faults over various releases, but instead of comparing variations in metrics over different releases to variations in maintenance effort, they create a formula to predict maintenance effort from source code metrics, and test that over various releases and the changes on the project during those releases. To my knowledge, my research also uses the biggest data set so far, containing data for over 6000 changes.


4 Methodology

In this section I will describe the subject projects that I have studied, the set of metrics I chose to collect from these projects, the rationale behind these choices, and my data collection methods.

4.1 The subject projects

The selection of projects is important to the quality of the data, and therefore to the validity of the research. I now describe the criteria I used to select the projects for this research, and the motivation behind those criteria.

4.1.1 Criteria

To make my research as convincing as possible, I attempted to obtain a set of projects that was as large and heterogeneous as possible. I specifically searched for heterogeneity along the following dimensions:

Variation in domain of software  Uses of software can be categorized into various application domains. I tried to find projects from distinct domains to ensure that my research results would apply generically.

Variation in source code sizes  Even though Dolado [18] has shown that development productivity in software engineering does not vary significantly across project sizes, maintaining a large-scale software project is in practice still different from maintaining a small-scale one, so I wanted to ensure that I sampled from projects with a broad range of sizes.

Variation in team size  There has been considerable research on the effects of team size on development speed. Brooks argues that smaller teams tend to have greater productivity per person [12], whereas Raymond [40] argues that larger teams are beneficial in Open Source Software.

Variation in Project Age  The age of a body of software can influence developer productivity in ways that may not be measurable by source code metrics. For example, the technology chosen (language of implementation, operating system, development libraries, etc.) can cease being supported, and key developers can leave the project, resulting in knowledge loss.

Sampling the data from a set of projects that is heterogeneous along these dimensions will make my results more generally applicable. To control the scope of the work that I initially pursued, I have restricted my attention to projects written in Java, so that I could reuse the metrics extraction process for the entire set of projects. Furthermore, to be able to extract maintenance effort data, all of the chosen projects were required to use a version control system and a bug tracking system. To make sure a sufficient amount of data is available for each project, the selected projects must all contain source code and maintenance data for at least 3 releases, and all have a minimum of 500 resolved items in their bug tracking systems.

4.1.2 Selected projects

Based on the selection criteria, I surveyed various well-known open source projects. A main obstacle in selecting projects was that many open source projects used different bug tracking systems and version control systems. To make sure that all data was in the same format, and that the projects adhere to the same conventions, I decided to select projects that are stored in the Apache Software Foundation repository. The Apache Software Foundation has a database of, as of 31-12-2011, 102 projects, which all use Jira for their bug tracking and Subversion for their version control. The full set from this analysis is shown in Appendix A. I have analysed these projects, and based on the criteria described above, I selected 3 Java projects for my research.

I have summarized the project characteristics, the first and last release for which I have extracted data, and the number of resolved or closed issues that I was able to extract in Table 1.

Table 1: Selected Projects

  Project   Releases   Resolved Issues   Contributors   First Release        Last Release         Domain
  Derby     18         3058              102            08-2005 (10.1.1.0)   10-2011 (10.8.2.2)   Database
  Lucene    18         2444              41             03-2006 (1.9.1)      12-2010 (3.0.3)      Distributed search
  PDFBox    8          699               9              02-2010 (1.0.0)      07-2011 (1.6.0)      PDF document manipulation tool

Derby  Derby^1 is a relational database. It was started by Cloudscape Inc. in 1996. Through a series of acquisitions, the project was transferred to IBM ownership in 2001. IBM then donated it to the Apache Foundation in 2004. Subsequent to this donation IBM was still contributing to the project [22]. In 2005, Sun Microsystems (now Oracle) took an interest in Derby, and the company has made significant contributions since that time [28]. I extracted information from 18 releases of Derby available in the Apache repository. As of 31-12-2011, a total of 102 contributors had contributed to their bug-tracking system on over 3000 resolved or closed items.

^1 http://db.apache.org/derby/

Lucene  Lucene^2 is a high-performance, full-featured text search engine library. It was started as a SourceForge project in 2000, but it has been an Apache project since 2001. I extracted data from 18 releases. A total of 41 contributors have contributed to Lucene's bug-tracking system, as of 31-12-2011, on over 2400 resolved or closed items. An example consumer of the Lucene search engine is Twitter Search [13].

PDFBox  PDFBox^3 is a collection of command-line tools for working with PDF documents. The PDFBox project was started in 2008. Just 9 people have made contributions to their bug-tracking system as of 31-12-2011, on almost 700 resolved or closed issues. I extracted data from 8 releases for this project.

^2 http://lucene.apache.org/core/
^3 http://pdfbox.apache.org/

4.2 The selected metrics

In this section, I will discuss the metrics that I calculated for each project. Each of these metrics will subsequently be compared against various measures of maintenance effort to determine which of them are good predictors of future maintenance effort (and hence cost).

4.2.1 Criteria

As described in section 3, numerous metrics (summarized in [35, 41]) have been proposed that purport to predict software quality and maintenance effort. Unfortunately, testing all these metrics for their power in predicting maintenance effort was infeasible. To select a smaller target set of metrics to analyze, I applied 3 criteria.

The metric is widely applicable  Since I am restricting my research to projects written in Java, the metrics will have to be applicable at least to this language.

The metric is defined at file level  The unit of analysis in my research is the source file, so the metric has to be interpretable at the file level. I also employ metrics that are defined at the class level, but for these metrics I only consider files that contain just 1 class. This constraint only eliminates around 7% of the files from the candidate data set.

The metric has been consistently proven in previous research  To keep the scope of the research manageable, I chose only metrics that have been consistently shown to be correlated with maintenance effort in previous studies.

4.2.2 Selected metrics

I have selected the following metrics to investigate their correlation to maintenance cost:

Source Lines of Code (LOC)  The total lines of code in the file. The idea behind this metric is that, all other things being equal, larger files are harder to maintain.

Weighted Method Complexity (WMC)  The sum of the complexities of the methods in the class defined in this file.

Response For Class (RFC)  The total number of methods that can be invoked through method invocations on the class defined in this file.

Coupling Between Objects (CBO)  The number of other classes that the class in this file is connected to.

Lack of Cohesion Of Methods (LCOM)  The number of method pairs in the class in this file that do not share the usage of a single attribute of the class.

Depth in Tree (DIT)  The number of classes that are a superclass of the class in this file.

Number of Children (NOC)  The number of classes that have the class in this file as a superclass.

WMC, RFC, CBO, LCOM, DIT, and NOC have all been described by C&K in 1994 [16]. I have altered their definitions slightly to make them meaningful at the file level, as described in section 4.2.1. The C&K suite has been studied heavily, and its metrics have been validated in many studies (e.g. [6, 26, 31, 36, 46]).
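To make the LCOM definition above concrete, here is a minimal Python sketch; the input format, which maps each method to the set of class attributes it uses, is an assumption for illustration, and this is not how the ckjm tool of section 5.3.2 computes the metric.

    from itertools import combinations

    def lcom(method_attrs):
        # LCOM as defined above: the number of method pairs in the class
        # that share no attribute. `method_attrs` maps each method name to
        # the set of class attributes it uses (an assumed input format).
        return sum(1 for (_, a), (_, b) in combinations(method_attrs.items(), 2)
                   if not a & b)

    # connect/render and disconnect/render share no attribute -> LCOM = 2
    print(lcom({"connect": {"host", "port"},
                "disconnect": {"host"},
                "render": {"buffer"}}))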


In addition to the C&K metrics, I have selected one more metric—Propagation Cost. The creators of this metric claim that it captures architectural complexity.

Propagation cost (PC):  The propagation cost metric, a coupling-based metric, is based on the observation that a change in one class can affect other classes. This metric was first introduced by MacCormack et al. in 2006 [33]. I have added this metric because primary research shows coupling-based metrics have a strong correlation to the appearance of ripple effects (a change in a single file affecting other files) [9]. There has also been some promising early research on the predictiveness of propagation cost on maintenance effort [14, 33].

Propagation cost is based on a visibility matrix for the files in a project [33]. This visibility matrix is a binary matrix where a project's files are the labels on the rows and columns, and dependencies between the files are the values. These dependency values are determined using a path length L, which allows a file A to be dependent on a file B through a dependency chain of length L. For example, with a path length of 1, only direct dependencies are used for calculating the propagation cost—this is the traditional coupling metric. The system-level propagation cost is then the sum of all dependencies in the visibility matrix, divided by the total number of possible dependencies in the project.

However, propagation cost calculated this way is not defined at the file level. Therefore, instead of taking the sum of all dependencies in the matrix, I take the sum of the incoming dependencies for a file, and divide that by the total possible dependencies per file, to calculate the incoming propagation cost per file. Concretely, this means I take the sum of the column in the visibility matrix that represents the file, and divide this value by the length of the column. I do the same for outgoing dependencies to calculate the outgoing propagation cost per file, where I take the sum of the row in the visibility matrix that represents the file, and divide that by the length of the row. This approach is a slight variation on the work of Ferneley [21] and Yang and Tempero [49], who have found promising results.

In my research I have investigated both incoming and outgoing propagation costs with path lengths of 1, 3, 5, 10, and 20 to see which variant of the propagation cost metric has the most predictive power.

Finally, I introduce a new variant of the propagation cost metric that employs a decay rate. With this decay rate, the strength of indirect dependencies is reduced by a factor D for each additional step in the dependency chain between the 2 files. In the present study, I applied a decay rate of 0.1. The above combinations give a total of 20 propagation cost metrics (where the decaying version of propagation cost of length 1 is the same as the non-decaying version). The 20 propagation cost metrics, plus the 7 selected C&K metrics, add up to a grand total of 27 metrics that I calculate for each file of each release of each project.
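To illustrate the propagation cost variants described above, the following Python sketch computes incoming and outgoing propagation cost per file from a matrix of direct dependencies. The matrix orientation and the exact way decay interacts with multiple chain lengths are my assumptions; the actual values in this thesis are derived from the DSM produced by the Janus tool (section 5.3.3).

    import numpy as np

    def visibility(dep, path_length, decay=1.0):
        # dep is a binary matrix with dep[i, j] = 1 when file i depends
        # directly on file j (an assumed orientation). A chain of up to
        # path_length steps makes one file visible from another; with
        # decay < 1, each step beyond the first weakens the dependency.
        n = dep.shape[0]
        vis = np.zeros((n, n))
        step = np.eye(n)
        for k in range(1, path_length + 1):
            step = np.clip(step @ dep, 0, 1)   # chains of exactly k steps
            vis = np.maximum(vis, decay ** (k - 1) * step)
        return vis

    def propagation_costs(dep, path_length, decay=1.0):
        vis = visibility(dep, path_length, decay)
        n = dep.shape[0]
        incoming = vis.sum(axis=0) / n   # column sum: who depends on this file
        outgoing = vis.sum(axis=1) / n   # row sum: what this file depends on
        return incoming, outgoing

    # Chain a -> b -> c: with path length 3 and decay 0.1, c is visible
    # from a with weight 0.1.
    dep = np.array([[0, 1, 0],
                    [0, 0, 1],
                    [0, 0, 0]])
    print(propagation_costs(dep, path_length=3, decay=0.1))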

4.3 Measuring maintenance effort

Since I am looking at open source projects, the developers did not log hours for their maintenance work. To approximate maintenance effort, I scrutinized the data set to see which proxy measures for maintenance effort I could find. I settled on and collected 4 file-based proxy measures.

4.3.1 Discussion

This is the amount of discussion that occurred in resolving an issue. The assumption is that a more complex change is likely to generate more discussion. Concretely, I measure the number of comments that have been made in the bug tracking system to resolve an issue that affects a file.

4.3.2 Change in lines of code (Churn)

Churn is the total number of lines of code that were changed in the file to resolve an issue. If a file is changed multiple times for the same issue, I check whether the file changes overlap, to make sure I don't count the same changes multiple times.
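A minimal sketch of this overlap check, assuming each action's diff has already been reduced to the set of line numbers it touches (a simplification of the diff handling described in section 5.2.4):

    def churn(line_sets):
        # Unique changed lines in one file for one issue. `line_sets`
        # holds, per action, the set of line numbers the action touches
        # (an assumed simplification of the parsed diffs); overlapping
        # changes are counted only once.
        changed = set()
        for lines in line_sets:
            changed |= lines
        return len(changed)

    # Two actions touch lines 10-20 and 15-25: 16 unique lines, not 22.
    print(churn([set(range(10, 21)), set(range(15, 26))]))  # -> 16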

4.3.3 Actions

For actions I measure the number of actions that were performed to resolve an issue. Concretely, this counts the total number of patches and commits that were needed to resolve an issue that affects a file. The notion here is that the more complex the file, the more likely it is that it will require a large number of actions if something has to be changed in it.

4.3.4 Faults

This is the number of bugs that were reported per month during the release. If more bugs occur in a file, then more maintenance effort will have to be put into the file to restore its functionality.


5 The data collection methods

5.1 The data model

The data model used in the data extraction and analysis procedure is illustrated in Figure 2.

[Figure 2: The data model. The entities and their relations are described in the text below.]

For each project, I study a number of releases, each of which has a set of files associated with it. This connection is realized through the concept of source instances. A source instance is a folder in the version control system that stores an instance of the source code. Such an instance exists for each release. Each project also has a list of issues, which are extracted from the project's bug- or issue-tracking software. Issues consist of both bug reports and change requests. Developers can submit patches to suggest a solution to an issue. Patches consist of a list of actions, which are changes to files that were made to resolve an issue. For each action, I measure the number of lines of code added and removed, where changed lines count as both added and removed. Each action corresponds to a change done in 1 file for 1 issue, but issues are often resolved using multiple actions, and multiple actions on a single file may be performed in multiple patches to resolve an issue. The patches that finally get accepted (and which resolve the issue) are called commits. Developers can also associate comments with issues. These comments are used for communication between developers.

For each file, we generate a number of file metrics. These are also stored in the database.
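As an illustration, the core of this data model could be sketched in Python as follows; the class and field names are mine and do not reflect the actual database schema.

    from dataclasses import dataclass, field
    from datetime import date
    from typing import List, Optional

    @dataclass
    class Action:
        file_path: str            # each action changes exactly one file
        lines_added: int          # changed lines count as both
        lines_removed: int        # added and removed

    @dataclass
    class Patch:
        issue_key: str
        accepted: bool            # an accepted patch is a commit
        actions: List[Action] = field(default_factory=list)

    @dataclass
    class Issue:
        key: str
        issue_type: str           # bug report or change request
        created: date             # determines the fault-rate release
        resolved: Optional[date]  # determines the effort release
        comments: int             # the discussion measure

    @dataclass
    class Release:
        version: str
        released: date
        source_instance: str      # tag folder holding the release snapshot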

I will now discuss the various entities and the properties of those entities that are extracted into the database for each project.

Releases  A release in a project is a snapshot of the source code that was distributed as a new version. Releases are the time unit over which I compare metric and effort values in the analysis. For releases, I extract the following properties:

• Version number: The version number tells us which version the release represents.

• Release date: The release date allows us to sort releases in chronological order. It is also used to calculate the time between releases, which is used for calculating the fault rate per month.

• Source instance: The source instance describes which instance of the source code in the version control system contains the snapshot of the source code in this release.

Source instance  The version control system contains, by convention, multiple folders containing different instances of the source code. All 3 projects I look at use the same organization for these folders. There is the trunk folder, which contains the instance of the code where the main development is done. There is the branches folder, where developers can create a separate instance for separate development and testing, and there is the tags folder.

The tags folder is a directory in the version control system that contains a replica of the source code as it existed at the moment of a release. It is a convention that a tag directory is stored in this way for each release. I use these tag directories to extract metrics of the files during their various releases. For source instances, I extract the following properties:

• Instance path: This is the path from the root of the version control system to this instance of the source code.

• Instance type: The instance type describes whether the instance contains the trunk, a branch, or a tag.

Files  Files are the unit of measurement in the analysis. I look at both metric values and effort measurements on a per-file basis. There is an entry in the database for each file in the repository. For files, I extract the following properties:

• File path: The path from the root of the repository to this file.


• Source instance: The instance of the source code in the repository that this file belongs to.

• Relative path: The path from the root of the source instance to this file. This path is used to compare files between source instances.

Issues  Issues represent change requests or bug reports in the bug tracking system. They represent a wanted or needed atomic change in the project. For issues, I extract the following properties:

• Issue type: The issue type describes whether the issue is a bug report or a change request.

• Status: The status describes whether the issue has been resolved or closed, or whether it is still open.

• Creation date: The date this issue was created. I use this to calculate the fault rate during a release.

• Resolution date: The date this issue was resolved or closed. I use this to see which issues have been fixed during a release, and thus for which issues to calculate maintenance effort.

Patches  Patches are suggested changes to the source code, which would (help) resolve an issue. Patches consist of a list of actions. For patches, I extract the following properties:

• Issue: The issue that this patch aims to (help) resolve.

Commits  A commit is a change in the code in the version control system. One or more commits together resolve an issue. Commits also consist of actions, and can be seen as patches that have been accepted as resolving an issue. For commits, I extract the following properties:

• Issue: The issue that this commit resolves.


Actions  An action is a (suggested) change to a file. It consists of a list of lines that have to be added or removed. Patches and commits consist of a set of actions to one or more files, where each action belongs to either a patch or a commit, and each action applies to exactly one file. For actions, I extract the following properties:

• Patch or Commit: The patch or commit that this action belongs to.

• File: The file that this action affects.

• Lines added: The number of lines of code that are added to this file by this action.

• Lines removed: The number of lines of code that are removed from this file by this action.

Comments  Comments can be placed on change requests or bug reports. They are used by developers to discuss the resolution of the issue. I use the number of comments to estimate the amount of discussion that was required to resolve an issue. For comments, I extract the following properties:

• Issue: The issue that this comment is on.

File Metrics  For each file, I extract and store a number of file metrics. These are used to relate to the various maintenance measures in search of a correlation. Each file has a row in the file metrics table for each metric type. File metrics are stored with the following properties:

• File: The file this metric value belongs to.

• Metric Type: The type of metric that this value measures of the file.

• Metric value: The value that this file has regarding the metric type.


5.2 Extracting the project data

I populated the data model described above by extracting data from the bug tracking system and version control repositories for each project. Since the 3 selected projects were all maintained by the Apache foundation, it was possible to use the same technologies for all projects. The bug tracking system in use is Jira^4, which has a WSDL API that is usable for data extraction. Their version control system is Subversion^5.

For the extraction of the data, I used the Taverna Workflow Management System^6. Taverna allows you to create workflows, which are essentially programs that have a pipes-and-filters architecture. Filters can be in the form of Java programs, UNIX shell scripts, XML operations, or service calls.

^4 http://www.atlassian.com/software/jira/
^5 http://subversion.tigris.org/
^6 http://www.taverna.org.uk

I have constructed a number of workflows that I will discuss here. These workflows query both systems for the necessary data, format it, and insert it into the database based on the data model. The workflows are all written so that they either update the information if it already exists in the database, or insert it if it does not exist yet. This way, if any error occurs during the sometimes long extraction runs, the workflow can just be restarted. This means that a workflow run can fail many times, but as long as one of the runs was successful, the data will be consistent.
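The update-or-insert behavior can be sketched as follows; the use of SQLite and the table layout are illustrative assumptions, not the actual database behind the workflows.

    import sqlite3

    conn = sqlite3.connect("extraction.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS issue
                    (key TEXT PRIMARY KEY, status TEXT, resolved TEXT)""")

    def upsert_issue(key, status, resolved):
        # Re-running the extraction overwrites the row instead of
        # duplicating it, so interrupted runs can simply be restarted.
        conn.execute(
            """INSERT INTO issue (key, status, resolved) VALUES (?, ?, ?)
               ON CONFLICT(key) DO UPDATE SET
                   status = excluded.status, resolved = excluded.resolved""",
            (key, status, resolved))
        conn.commit()

    upsert_issue("DERBY-123", "Resolved", "2011-06-01")
    upsert_issue("DERBY-123", "Closed", "2011-06-01")   # same row, updated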

5.2.1 Extracting releases

The workflow that extracts release information first queries the WSDL interface of the Jira bug tracking system for the existing releases in the project. It then extracts the version number and release date from the returned XML. A Java script then connects to the database to insert this information. The source instance field does not get filled by this workflow yet, since that information is not available in Jira. I manually linked each release to its corresponding source instance after the workflows had succeeded.

5.2.2 Extracting issues and comments

The issues and comments tables are both populated by the same workflow. This workflow first queries the Jira WSDL interface for a list of issues. Then, on a per-issue basis, it queries the WSDL interface for comments to that issue. It then inserts the obtained information for both entities into the database. Since this workflow runs for every issue, it can take a while to execute. For this reason, I've written the workflow to run over a specified time range. This time range makes the workflow query only for issues that were resolved within it.

5.2.3 Extracting files and source instances

This workflow expects a full checkout of the project repository to be available on the filesystem. It recursively scans all files and folders in this checkout, where it first identifies the different source instances. Since our 3 projects all follow the same convention of dividing the code up into trunk, branches, and tags sections, this can be done using a regular expression. The workflow then subtracts the part that describes the source instance from the full path to obtain the relative path for each file or folder. Finally, the source instances and their files that are obtained this way are inserted into the database.

5.2.4 Extracting churn

To extract the churn per issue, we need all patches and commits per issue. The workflow queries the database for a list of (already inserted) issues that are resolved. Then, per issue, it queries the Jira WSDL interface for a list of patches that have been submitted, and downloads these patch files. Meanwhile, it queries the Jira web interface for a list of commits that have been made to resolve the issue, and queries Subversion for the diff files of these commits.

Now the workflow has a list of patch files and Subversion commit diff files. These 2 file types are both in the same format, and represent the changes that are made to files in applying the patch or commit. The workflow orders this list chronologically. This combination of patches and commit diff files is a log of what changes were made to files to resolve an issue. To get from this list of patches and commits to a list of actions, the workflow applies a regular expression that splits the commits and patches into affected files, with the change in each file.

For each action, the workflow now checks what the difference is with the action on that file in the previous patch or commit (if it exists). This gives us the incremental effort for each action that was applied in order to resolve the issue.
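A sketch of this incremental comparison, again assuming each action has been reduced to the set of line numbers it touches:

    def incremental_efforts(actions_lines):
        # Per-action incremental effort for one file and one issue.
        # `actions_lines` lists, in chronological order, the set of line
        # numbers each patch or commit touches (an assumed simplification).
        # Each action is credited only with lines not touched before.
        seen = set()
        efforts = []
        for lines in actions_lines:
            efforts.append(len(lines - seen))
            seen |= lines
        return efforts

    # A 10-line patch, then a rework touching 4 old lines and 2 new ones.
    print(incremental_efforts([set(range(1, 11)), {3, 4, 5, 6, 11, 12}]))
    # -> [10, 2]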

Because this workflow also takes a long time to run, I have made it possible to run it over time ranges, like the workflow described in section 5.2.2.


5.3 Measuring the metrics

To know the metrics for each file during each release, I need to measure each file in the source instances that are snapshots of releases. For this I developed a set of workflows that extract the different metric types. Each workflow measures all files in a specified source instance, so the workflows have to be run for each source instance in the projects. I will describe the workflows below, along with the input they expect from the source files.

5.3.1 Measuring LOC metrics

The workflow that measures the LOC metric takes the directory that contains the source instance as input, and then runs the UNIX command:

wc -l

on all files to extract the number of lines of code. These values are then stored in the FileMetrics table.

5.3.2 Measuring C&K metrics

To measure the C&K metrics, I used ckjm, a freely available tool written by Diomidis Spinellis [42]. This tool takes .class files as input, which are compiled Java files. To obtain these .class files, I downloaded the .jar archives of each project for all the releases I needed file metrics for, and extracted these .jar files into folders that contain the .class files.

Since Java class names are directly linked to file names, and package hierarchies are directly linked to directory trees, I am able to reconstruct the file path from the Java name by replacing the periods in the class name with forward slashes. As discussed in section 4.2.1, files with more than one class defined in them are ignored in this analysis.
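This mapping can be sketched in a few lines of Python; the handling of nested classes is my assumption, since the text above only specifies that files with more than one class are ignored.

    def class_to_relative_path(class_name):
        # Map a Java class name to its source path, relying on Java's
        # convention that package hierarchies mirror directory trees.
        # Nested classes (Outer$Inner) map to the outer class's file.
        outer = class_name.split("$")[0]
        return outer.replace(".", "/") + ".java"

    print(class_to_relative_path("org.apache.lucene.index.IndexWriter"))
    # -> org/apache/lucene/index/IndexWriter.java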

The ckjm tool calculates the C&K metrics that we have described in section 4.2.2, and outputs the values as plain text. The workflow then extracts these values and inserts them into the FileMetrics table.

5.3.3 Measuring Propagation cost metrics

Like the C&K metrics, the propagation costs are also measured from the compiled .class versions of the files. This workflow uses two tools that have been developed at Drexel University: Moka and Janus. The Moka tool transforms the .class files into .moka files, which contain an XML-based summary of the class. These .moka files are then used by Janus to measure which class dependencies exist. Janus outputs a Dependency Structure Matrix (DSM), which is a matrix that contains all files in the project and the dependencies among them. This matrix is then used to calculate the different propagation cost variants and their path lengths, as described in section 4.2.2. The class names are linked to file paths in the same way as done in section 5.3.2. The metrics are then stored in the FileMetrics table.

5.4 Connecting metrics to maintenance

I now want to compare the metric values against maintenance measures for each release. We get the file metrics by looking at the files in the source instance that is connected to the release in the database. The FileMetrics table contains the metrics for these files.

In extracting effort data for the maintenance measures, we make a distinction between faults on the one hand, and discussion, churn, and actions on the other. Fault data is calculated by counting the number of reported faults during the release, so the creation date of the bug report determines which release the value counts towards. For discussion, churn, and action data, we look at the resolution date, so the date that the issue was resolved determines which release the value counts towards.
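As a sketch of this attribution rule, assuming an event counts toward the first release published on or after its date (my reading of the rule; the exact boundary handling is not specified above):

    from bisect import bisect_left
    from datetime import date

    def release_for(event_date, releases):
        # `releases` is a chronologically sorted list of (date, version)
        # pairs. For faults the event date is the issue's creation date;
        # for discussion, churn, and actions it is the resolution date.
        dates = [d for d, _ in releases]
        i = bisect_left(dates, event_date)
        return releases[i][1] if i < len(releases) else None  # after last release

    releases = [(date(2010, 2, 1), "1.0.0"), (date(2010, 7, 1), "1.1.0")]
    print(release_for(date(2010, 5, 12), releases))  # -> 1.1.0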


6 Statistical analysis

6.1 Normalizing the data

The metrics described in section 4.2.2 give a data set per file that contains a list of releases, with associated complexity metric and maintenance effort data. Since I want to analyze the relationship between increasing or decreasing code metrics and maintenance effort, I transform these numbers so that they reflect changes compared with the previous version. To this end I take the metric value of each file in a release, and divide that value by the metric value in the previous release. This gives me a factor representing the increase or decrease in the metric value as compared with the previous release. I do the same calculation for effort values, dividing average effort measures by their comparable values in the previous release. These derived data points thus represent the difference between two releases, or the relative change per release of the same file. Since these are now relative values, I can compare them across different files and can compare files across projects without having to correct for file size.
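A minimal sketch of this transformation for the metric values; for effort values the divisor is the average effort over the previous releases, as the discussion of Table 2 below explains.

    def relative_changes(series):
        # Relative change per release for one file. `series` is a
        # chronological list of (release, value) pairs; the first entry
        # yields no relative value, mirroring Table 2.
        return {release: curr / prev
                for (_, prev), (release, curr) in zip(series, series[1:])}

    # client.java's metric column from Table 2: 100 -> 140 -> 70
    print(relative_changes([("1.0", 100), ("1.1", 140), ("1.2", 70)]))
    # -> {'1.1': 1.4, '1.2': 0.5}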

Table 2: Example Data Extracted from Project Files

       File          Release  Issue    Metric  Effort  Group  Rel. Metric  Rel. Effort
    1  client.java   1.0      PROJ-01  100     40      -      -            -
    2  client.java   1.0      PROJ-02  100     60      -      -            -
    3  client.java   1.1      PROJ-11  140     75      1      1.4          1.5
    4  client.java   1.2      PROJ-20  70      60      2      0.5          0.8
    5  library.java  1.1      PROJ-11  60      75      -      -            -
    6  library.java  2.0      PROJ-31  72      150     1      1.2          1.5
    7  server.java   1.0      PROJ-11  200     40      -      -            -
    8  server.java   1.2      PROJ-21  240     52      1      1.2          1.3

This relative value calculation is exemplified in Table 2. Please ignore the group column for now; it will be discussed in section 6.2. In row 3 you can see that, to get the relative metric value, the metric value is divided by the metric value in the previous release. For the relative effort value, the value is divided by the average effort over the issues in the previous release. The relative values of row 4, which belongs to the release after that of row 3, are then calculated using the absolute values of row 3. The file “library.java” in rows 5 and 6 has no data associated with it for release 1.0 or 1.2, so the entry for release 1.1 is used as the first data point, and the entry for release 2.0 as the second data point.

I have generated a table like this for each metric type versus effort type combination. This yields a total of 27 x 4 = 108 tables.


6.2 Spearman analysis

I now have a table of data points that I can analyze to understand the correlations between complexity metric values and effort measures. Unfortunately, the effort values do not follow a normal distribution, which limits the statistical tools that I can employ. For this reason I have opted to use a Spearman rank correlation test.
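For a single metric/effort combination within one group, rho can be computed with a standard statistics library; the sketch below uses Apache Commons Math (my choice for illustration, the thesis does not prescribe the statistics tooling) with made-up values:

    import org.apache.commons.math3.stat.correlation.SpearmansCorrelation;

    /** Sketch: Spearman's rho for one metric/effort combination within one group.
     *  The data values are illustrative, not taken from the study. */
    public class SpearmanSketch {
        public static void main(String[] args) {
            double[] relMetric = {1.4, 0.5, 1.2, 1.2};   // relative metric changes
            double[] relEffort = {1.5, 0.8, 1.5, 1.3};   // matching relative effort changes
            double rho = new SpearmansCorrelation().correlation(relMetric, relEffort);
            System.out.printf("rho = %.3f%n", rho);
        }
    }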

However, the Spearman test assumes independent measurements in the data set. This means that I have to make sure that I do not compare data points of various releases of the same file in the same analysis, as these could be dependent.

6.3 Data grouping

I thus categorize the data points into groups as follows: I first skip all data points which come from the first release I have of a file, since I need values that are relative to the previous release for my analysis. After that, all data points that belong to the second release of a file go in group 1, all data points that belong to the third release of a file go in group 2, and so on. This means that data from different releases of the same file will not be compared against each other.
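A sketch of this assignment (with a hypothetical DataPoint type, not the actual analysis code) is:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    /** Sketch of the group assignment. perFile maps each file to its data
     *  points ordered by observed release, oldest first; the first observed
     *  release is skipped, the second goes to group 1, the third to group 2,
     *  and so on. */
    public class GroupAssignment {
        record DataPoint(String file, String release, double relMetric, double relEffort) {}

        static Map<Integer, List<DataPoint>> assignGroups(Map<String, List<DataPoint>> perFile) {
            Map<Integer, List<DataPoint>> groups = new HashMap<>();
            for (List<DataPoint> points : perFile.values()) {
                // i == 0 is the first observed release: no previous release, so skip it.
                for (int i = 1; i < points.size(); i++) {
                    groups.computeIfAbsent(i, g -> new ArrayList<>()).add(points.get(i));
                }
            }
            return groups;
        }
    }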

Table 2 shows an example of how the assignment to groups is done. Rows 1 and 2 belong to the first release I have information on for file “client.java”. Because I use values that are relative to the previous release, I cannot use these values in my analysis. Row 3 belongs to the second release of “client.java”, so it goes into group 1. Row 4 belongs to the third release of “client.java” that I have information for, so it goes into group 2.

Row 5 contains the first release of “library.java” that I have information for, so it has no relative values. Please note that release 1.1, to which the data point of row 5 belongs, does not have to be the first release in which library.java existed in the project; rather, release 1.1 is the first release for which I have maintenance effort data for “library.java”. Row 6 belongs to the second release of “library.java” that I have information for, so it goes into group 1.

Row 7 is the first release of “server.java” that I have information for, so I cannot add it to a group. Row 8 is the second release of “server.java” that I have information for, so I add it to group 1. Please note for row 8 that the information in group 1 here is not from the release right after the release in row 7, but from two releases later. This is because release 1.2 is the first release after release 1.0 for which there is maintenance effort information.

Table 3 shows the number of data points per group that were obtained in this way, in this case for discussion data. Because the number of data points shrinks in the higher groups, which decreases the accuracy of the analysis there, I decided to only include groups 1, 2, and 3 in my analysis.

Table 3: Data Points Derived from Grouping

    Group  Data points
    1      960
    2      434
    3      182
    4      84
    5      40

Finally, in my analysis I exclude issues that did not affect source files. I also need changes over more than one release of a file for the file to show up in a group. Because of this selection of the data, the 6201 issues that I extracted from the three projects resulted in a total of 1739 usable data points that are in a group, of which 1576 are in the first 3 groups.

In summary, each group of data shows whether and how one type of maintenance effort varies with source code complexity changes in two successive releases. Investigating all the groups will show how these two aspects change together over multiple releases.


7 Results

Since I have performed a Spearman analysis on a set of 108 (27 x 4) code metric type versus maintenance effort type combinations, over 3 different groups, I have obtained 324 p and rho values. Normally, it would be acceptable to treat all outcomes with a p-value under 0.05 as significant. However, with a large number of tests like this, that would lead to many false positives. I correct for this using the False Discovery Rate (FDR), described by Benjamini and Hochberg [5], to control the expected proportion of false positives in the result set. I set the expected proportion of false positives to 5%; this means that (on average) 5% of the results that are found significant are actually false positives, or type I errors. This results in a total of 63 Spearman analyses with a significant outcome, spread over 40 metric type - effort type combinations. The highest p-value that is still declared significant at an FDR of 5% is 0.010.
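The Benjamini and Hochberg step-up procedure behind this correction is compact; the sketch below is my own illustration, with q set to the 5% level used here:

    import java.util.Arrays;

    /** Sketch of the Benjamini-Hochberg step-up procedure [5]: sort the m
     *  p-values, find the largest k with p(k) <= (k / m) * q, and declare the
     *  k smallest p-values significant at FDR level q. */
    public class BenjaminiHochberg {
        /** Returns the largest p-value still declared significant (0.0 if none). */
        static double significanceCutoff(double[] pValues, double q) {
            double[] p = pValues.clone();
            Arrays.sort(p);
            int m = p.length;
            double cutoff = 0.0;
            for (int k = 1; k <= m; k++) {
                if (p[k - 1] <= (double) k / m * q) {
                    cutoff = p[k - 1];  // keep the largest p-value passing the test
                }
            }
            return cutoff;
        }
    }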

The results are presented in Table 4. Values with p-values at or below this threshold are considered significant at an FDR of 5%.

Significant rho values in my results range from 0.29 to 0.03. I have chosen to consider all rho values as long as there is a significant p-value, since this p-value already shows that there is a correlation, even if it is not a very strong one. There are many factors that can influence the data, such as the inherent difficulty of the issue, the experience of the developer, and the fact that I only measure proxies for effort. Each of these factors will reduce the strength of the correlation between the collected metric and effort values. I conjecture that measuring these correlations again with a larger data set, or one that filters out specific sources of variation (such as issue difficulty), would reveal higher rho values.

The data shows that most of the significant hits, 22 in total, are of metric/effort combinations that use the amount of discussion as the measure of effort. Actions show up 8 times in the significant hits, and churn shows up 9 times. Of the 27 considered metric types, there is only 1 significant correlation to fault measures, which is a surprisingly low number.

This data shows that the amount of discussion that is needed to resolve an issue is most closely correlated to source code metrics. All metrics that score significantly on the faults, churn, or actions measures also score significantly on the discussion effort measure.

Of the scoring metric types, I found significant results for complexity-based and coupling-based metrics in the C&K metric suite, and for various variations of the propagation cost. There is only 1 significant result for the size-based metric (LOC), and none of the inheritance or cohesion based metrics from the C&K suite show significant results.


Among the variations of propagation cost, the incoming propagation cost metrics seem to score higher rho values than the outgoing propagation cost. The optimum path length for both incoming and outgoing propagation cost appears to be 5. Non-decaying propagation cost scores higher than decaying propagation cost; however, this may also depend on the decay factor that I have chosen.

Another noticeable fact is that the decaying and non-decaying propagation cost metrics with path length 1 score exactly equal. This is expected, as the decay factor only starts to take effect with path lengths greater than 1. You can also see that CBO (coupling between objects) and the outgoing propagation cost of length 1, which are similar in intent and implementation, also score similar results.


Table 4: Metric categories - effort measures

Metric Type      Effort Type   P Group 1   P Group 2   P Group 3   Rho Group 1   Rho Group 2   Rho Group 3
RFC              discussion    0.000044    0.000005    0.008104    0.267665      0.294932      0.258305
PROP-IN-5-N      churn         0.001877    0.000618    0.024730    0.262433      0.276050      0.255251
PROP-IN-5-N      discussion    0.000000    0.000002    0.029023    0.252568      0.269939      0.290199
PROP-IN-10-N     actions       0.005395    0.014327    0.010235    0.245397      0.255007      0.227875
PROP-IN-20-N     discussion    0.000000    0.000002    0.031972    0.214027      0.254779      0.213701
WMC              discussion    0.011136    0.000860    0.037767    0.194296      0.234121      0.198693
PROP-IN-10-N     discussion    0.000000    0.000017    0.033132    0.210000      0.249413      0.258509
PROP-IN-3-N      discussion    0.000000    0.000014    0.033328    0.204334      0.245909      0.469582
PROP-IN-20-N     churn         0.060578    0.000866    0.002220    0.181921      0.231393      0.201866
PROP-IN-3-N      actions       0.050249    0.158010    0.001243    0.095129      0.138786      0.201210
CBO              discussion    0.000569    0.001236    0.068078    0.169772      0.214526      0.232638
PROP-OUT-5-D     discussion    0.000032    0.000406    0.088805    0.169210      0.210509      0.435038
PROP-OUT-5-N     discussion    0.000008    0.000000    0.104063    0.167687      0.199103      0.179817
PROP-IN-10-N     churn         0.104461    0.005751    0.004181    0.167291      0.191257      0.174478
PROP-IN-3-D      discussion    0.000135    0.000513    0.107429    0.167505      0.196652      0.175641
PROP-IN-20-D     churn         0.085308    0.007814    0.034112    0.142537      0.181523      0.164104
PROP-IN-10-D     churn         0.093091    0.009535    0.039627    0.129498      0.177632      0.158916
PROP-OUT-20-D    discussion    0.000030    0.000230    0.140024    0.137781      0.210541      0.162847
PROP-OUT-3-D     discussion    0.000038    0.001662    0.115767    0.155631      0.186864      0.202353
PROP-OUT-3-N     discussion    0.000000    0.000003    0.125026    0.148419      0.182640      0.167002
PROP-IN-20-N     actions       0.005473    0.240186    0.026023    0.162186      0.125535      0.162042
PROP-OUT-10-D    discussion    0.000040    0.000275    0.130689    0.137781      0.177770      0.163026
PROP-IN-5-N      actions       0.025416    0.137811    0.004198    0.118281      0.188294      0.154387
PROP-IN-5-D      discussion    0.000002    0.000554    0.166887    0.118084      0.182912      0.153885
PROP-OUT-5-N     churn         0.514329    0.271746    0.007842    0.050278      0.085223      0.148704
PROP-OUT-10-N    actions       0.775158    0.004388    0.183897    0.047211      0.144762      0.117764
PROP-OUT-20-N    churn         0.522952    0.199760    0.000499    0.045013      0.095871      0.123543
PROP-OUT-10-N    faults        0.596500    0.249479    0.006697    0.041914      0.071588      0.120719
PROP-OUT-10-N    churn         0.506907    0.276148    0.000585    0.048636      0.091226      0.120624
PROP-OUT-10-N    discussion    0.000274    0.000000    0.219109    0.082389      0.157583      0.139906
PROP-IN-10-D     discussion    0.000000    0.000096    0.224926    0.083375      0.130134      0.167204
PROP-OUT-1-D     discussion    0.000737    0.020970    0.166423    0.104263      0.147759      0.150384
PROP-OUT-20-N    discussion    0.000039    0.000000    0.240412    0.074278      0.126909      0.139861
PROP-IN-3-N      churn         0.001035    0.043624    0.111714    0.098234      0.198409      0.159015
PROP-IN-20-D     discussion    0.000000    0.000070    0.244104    0.068123      0.125535      0.135649
PROP-OUT-5-N     actions       0.572362    0.001008    0.075781    0.114235      0.089123      0.130535
PROP-OUT-1-N     discussion    0.000737    0.020970    0.166423    0.088898      0.142376      0.150384
PROP-OUT-20-N    actions       0.703452    0.001969    0.161903    0.067607      0.072738      0.141149
PROP-OUT-3-N     actions       0.633997    0.003344    0.229549    0.036049      0.068901      0.143187
LOC              discussion    0.003094    0.073916    0.805239    0.027155      0.052623      0.107775


8 Discussion

In this section, I discuss the results, threats to validity, and possible future work.

8.1 Research questions

Based on the analysis results presented in the previous sections, I can now answer the research questions proposed in section 1:

1. What metrics should be considered when predicting maintenance costs?
   Previous work shows that the 7 metrics from the C&K metric suite are good predictors of maintenance effort. Propagation cost metrics have also been shown to be promising predictors.

2. Which measures can effectively approximate maintenance effort?
   Discussion, actions, and churn data all appear to be valid proxy measures for effort, in the sense that they are correlated with complexity measures that have been shown to be strongly correlated with effort in other research.

3. Which complexity metrics are significantly correlated to which maintenance effort type?
   Propagation cost, RFC, CBO, WMC, and LOC show significant correlations to maintenance effort in the form of discussion. Various versions of the propagation cost metric also show correlations to actions and churn. One version of the propagation cost metric shows a correlation to faults.

8.2 Comparison to previous work

Despite the different approach in correlating metrics to effort data, and the different representation of maintenance effort, my results are in many ways comparable to results in previous work. Many of the C&K metrics score significantly on maintenance effort, and I have also found significant results for many variations of the propagation cost metric. Coupling and complexity based metrics score the highest correlations, both in my research and in previous research.

In previous work, there is great variation in whether or not inheritance and cohesion based metrics are correlated to maintenance effort. In my research, no metrics based on either of these principles score significantly.


An interesting result is that there is only 1 significant correlation between source metrics and fault rates. This strongly contradicts previous research, which consistently reported correlations between metrics and fault rates. This could be caused by the fact that we correlate the change in source metrics to the change in fault rates. It is possible that developers introduce bugs during the restructuring of code, which cancel out the positive effects of the refactoring. I give some suggestions for future work in section 8.4.

8.3 Threats to validity

Here I discuss the threats to the conclusions drawn in this research. I have split this section into three parts: construct validity, internal validity, and external validity.

Construct validity Because developers in open source projects do not log their effort, I had to employ a set of measures that I believe together approximate true maintenance effort. This hypothesis is impossible to test within my existing research framework. This research would therefore be more accurate, and my hypothesized maintenance cost measures could be validated, if future research were conducted using this framework on projects that log actual effort hours.

Internal validity One assumption I made is that all bugs and change requests that were resolved between one release and the next are attributable to the latest release. However, it is possible that a few of the bugs or change requests actually applied to an older release that was still being maintained. Such bugs or change requests are described as backports by Bachmann et al. [3].

Another threat to validity is that, because I start from the bug or change request and from there look at the effort, my analysis misses commits that are not linked to issues in the bug tracker.

To make it possible to use class metrics in our file-based approach, I also ignored files that have more than one class defined in them. However, I showed in section 4.2.1 that only 7% of the files are excluded for this reason.

External validity A threat to the external validity is that I only researched open source projects. While I expect that similar results would be obtained for industry projects, I cannot guarantee it.

Furthermore, I have only investigated 3 projects, and I have only investigated a subset of the possible metrics that I could have considered. For example, I could have considered propagation cost metrics with different path lengths and different decay factors. A larger study employing more projects and more metric types would improve the validity of my conclusions.

8.4 Future work

In future work, this research framework could be expanded with data from more projects. This would give a more solid basis for the data analysis. Another possibility is focusing on collecting data from non-open-source projects, specifically projects with logged hours for maintenance work. These logged hours would also make it possible to verify my current assumptions about the maintenance measures used. It would also be interesting to measure more maintenance effort proxies in real industrial settings when effort in terms of logged hours is not available. It is possible that maintenance effort measures behave differently in industrial settings.

As mentioned above, there are many extraneous project factors, such as the inherent difficulty of bugs and issues and the inherent skill of the developers, that add noise to the data. Further study that controls for some of these noise factors is thus called for. A solution here would be to categorize change requests into groups of small, medium, and large requests. This way, it would be possible to see what the influence of source code metrics is on the different sizes of changes, and to eliminate the noise that this variation in change size adds to the data.

Another point for further research is to investigate why fault rates as a maintenance measure yield fewer significant results in our research. It would be interesting to look at the long-term reduction (over multiple releases) in fault rates after refactoring, and at the possible introduction of faults during the refactoring effort (in the release after the refactoring), to see whether the latter negatively influenced our results.

Finally, if a balanced data set that contains more maintenance cost measures and both open source and industrial projects becomes available, the next step would be to investigate polynomial models that can predict maintenance costs based on predicted values of source code metrics.


9 Conclusion

In this thesis, I introduced a new way of examining the relationship between source code metrics and maintenance effort (and hence cost) that is focused on supporting modularity decisions. I do this by investigating whether and how variations in source code metrics correlate with variations in maintenance effort. I also introduced a new perspective on maintenance effort which considers non-code-based measurements, including the amount of discussion and actions needed to resolve issues. Using this new approach, I conducted research on the correlation between source code metrics and maintenance effort in 3 open source projects, in which I identified a set of source code metrics that have the most impact on maintenance effort.

The framework proposed in this thesis has the potential to help predict future maintenance costs based on changes in source code metrics. The maintenance effort measures proposed here are a step further toward linking with actual costs in terms of time and money, and fit into a long-term vision of explicitly estimating the value of costly maintenance activities, such as refactoring, that are currently hard to justify.

A paper has been extracted from this work and, at the moment of thesis completion, is under submission at the 28th International Conference on Software Maintenance.


Acknowledgment

I would like to thank my external advisor, Rick Kazman, for all the constructive feedback on my research and writing.

I would also like to thank the research group: Yuanfang Cai, Peppo Valetto, Phil Maconi, and Carlos Andrade, for their support and suggestions, with special thanks to Kevin Cheung and Lu Xiao for assisting in the data extraction process.

I would also like to thank Fetsje Bijma of the Department of Mathematics at the VU University Amsterdam for her advice on the data analysis in this research.


References

[1] M. Alshayeb and W. Li. An empirical validation of object-oriented metrics in two different iterative software processes. IEEE Transactions on Software Engineering, 29(11):1043-1049, Nov. 2003.

[2] E. Arisholm. Empirical assessment of the impact of structural properties on the changeability of object-oriented software. Information and Software Technology, 48(11):1046-1055, 2006.

[3] A. Bachmann, C. Bird, F. Rahman, P. Devanbu, and A. Bernstein. The missing links: bugs and bug-fix commits. In Proceedings of the Eighteenth ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE '10), pages 97-106, New York, NY, USA, 2010. ACM.

[4] V. Basili, L. Briand, and W. Melo. A validation of object-oriented design metrics as quality indicators. IEEE Transactions on Software Engineering, 22(10):751-761, Oct. 1996.

[5] Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological), 57(1):289-300, 1995.

[6] A. B. Binkley and S. R. Schach. Inheritance-based metrics for predicting maintenance effort: An empirical study. Technical Report TR 9705, Computer Science Department, Vanderbilt University, 1997.

[7] B. W. Boehm, Clark, Horowitz, Brown, Reifer, Chulani, R. Madachy, and B. Steece. Software Cost Estimation with COCOMO II. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1st edition, 2000.

[8] L. Briand, P. Devanbu, and W. Melo. An investigation into coupling measures for C++. In Proceedings of the 19th International Conference on Software Engineering (ICSE '97), pages 412-421, May 1997.

[9] L. Briand, J. Wust, and H. Lounis. Using coupling measurement for impact analysis in object-oriented systems. In Proceedings of the IEEE International Conference on Software Maintenance (ICSM '99), pages 475-482, 1999.

[10] L. C. Briand and J. Wust. Empirical studies of quality models in object-oriented systems. In Advances in Computers, volume 56, pages 97-166. Elsevier, 2002.


[11] L. C. Briand, J. Wust, J. W. Daly, and D. V. Porter. Exploring the relationships between design measures and software quality in object-oriented systems. Journal of Systems and Software, 51(3):245-273, 2000.

[12] F. P. Brooks, Jr. The Mythical Man-Month: Essays on Software Engineering. Addison-Wesley, Reading, MA, 1975.

[13] M. Busch. Twitter's new search architecture. http://engineering.twitter.com/2010/10/twitters-new-search-architecture.html, October 2010.

[14] J. Carriere, R. Kazman, and I. Ozkaya. A cost-benefit framework for making architectural decisions in a business context. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering, volume 2, pages 149-157, May 2010.

[15] M. Cartwright and M. Shepperd. An empirical investigation of an object-oriented software system. IEEE Transactions on Software Engineering, 26(8):786-796, Aug. 2000.

[16] S. Chidamber and C. Kemerer. A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6):476-493, June 1994.

[17] S. Demeyer and S. Ducasse. Metrics, do they really help? In Proceedings of Langages et Modeles a Objets (LMO'99), pages 69-82, Paris, France, 1999.

[18] J. Dolado. On the problem of the software cost function. Information and Software Technology, 43(1):61-72, 2001.

[19] K. El-Emam, S. Benlarbi, N. Goel, and S. Ray. A validation of object-oriented metrics. Technical Report NRC/ERB-1063, National Research Council Canada, Oct. 1999.

[20] K. E. Emam, W. Melo, and J. C. Machado. The prediction of faulty classes using object-oriented design metrics. Journal of Systems and Software, 56(1):63-75, 2001.

[21] E. H. Ferneley. Design metrics as an aid to software maintenance: An empirical study. Journal of Software Maintenance: Research and Practice, 11(1):55-72, 1999.


[22] Apache Software Foundation. Derby history. http://db.apache.org/derby/derby_charter.html#Derby+History, February 2006.

[23] M. Fowler. Technical debt. http://www.martinfowler.com/bliki/TechnicalDebt.html, February 2009.

[24] T. Gyimothy, R. Ferenc, and I. Siket. Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Transactions on Software Engineering, 31(10):897-910, Oct. 2005.

[25] M. H. Halstead. Elements of Software Science (Operating and Programming Systems Series). Elsevier Science Inc., New York, NY, USA, 1977.

[26] R. Harrison, S. J. Counsell, and R. V. Nithi. An investigation into the applicability and validity of object-oriented design metrics. Empirical Software Engineering, 3(3):255-273, Sept. 1998.

[27] F. Heemstra. Software cost estimation. Information and Software Technology, 34(10):627-639, 1992.

[28] M. LaMonica. Apache Derby graduates with Sun onboard. http://news.cnet.com/Apache-Derby-graduates-with-Sun-onboard/2100-7344_3-5818473.html, August 2005.

[29] A. L. Lederer and J. Prasad. Causes of inaccurate software development cost estimates. Journal of Systems and Software, 31(2):125-134, 1995.

[30] R. Leitch and E. Stroulia. Assessing the maintainability benefits of design restructuring using dependency analysis. In Proceedings of the Ninth International Software Metrics Symposium, pages 309-322, Sept. 2003.

[31] W. Li and S. Henry. Object-oriented metrics that predict maintainability. Journal of Systems and Software, 23(2):111-122, 1993.

[32] B. P. Lientz. Issues in software maintenance. ACM Computing Surveys, 15:271-278, September 1983.

[33] A. MacCormack, J. Rusnak, and C. Y. Baldwin. Exploring the structure of complex software designs: An empirical study of open source and proprietary code. Management Science, 52:1015-1030, July 2006.

[34] T. McCabe. A complexity measure. IEEE Transactions on Software Engineering, SE-2(4):308-320, Dec. 1976.


[35] M. Xenos, D. Stavrinoudis, K. Zikouli, and D. Christodoulakis. Object-oriented metrics - a survey. In Proceedings of FESMA 2000, Federation of European Software Measurement Associations, Madrid, Spain, 2000.

[36] S. C. Misra. Modeling design/coding factors that drive maintainability of software systems. Software Quality Control, 13(3):297-320, Sept. 2005.

[37] K. Molokken and M. Jorgensen. A review of surveys on software effort estimation. In Proceedings of the 2003 International Symposium on Empirical Software Engineering (ISESE 2003), pages 223-230, Sept.-Oct. 2003.

[38] H. M. Olague, L. H. Etzkorn, S. L. Messimer, and H. S. Delugach. An empirical validation of object-oriented class complexity metrics and their ability to predict error-prone classes in highly iterative, or agile, software: a case study. Journal of Software Maintenance and Evolution: Research and Practice, 20(3):171-197, 2008.

[39] D. Parnas. Software aging. In Proceedings of the 16th International Conference on Software Engineering (ICSE-16), pages 279-287, May 1994.

[40] E. Raymond. The cathedral and the bazaar. Knowledge, Technology & Policy, 12:23-49, 1999.

[41] M. Riaz, E. Mendes, and E. Tempero. A systematic review of software maintainability prediction and metrics. In Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement (ESEM 2009), pages 367-377, Oct. 2009.

[42] D. Spinellis. ckjm - Chidamber and Kemerer Java metrics. http://www.spinellis.gr/sw/ckjm/, May 2010.

[43] R. Subramanyam and M. Krishnan. Empirical analysis of CK metrics for object-oriented design complexity: implications for software defects. IEEE Transactions on Software Engineering, 29(4):297-310, April 2003.

[44] M.-H. Tang, M.-H. Kao, and M.-H. Chen. An empirical study on object-oriented metrics. In Proceedings of the Sixth International Software Metrics Symposium, pages 242-249, 1999.

[45] A. Trifu and R. Marinescu. Diagnosing design problems in object oriented systems. In Proceedings of the 12th Working Conference on Reverse Engineering, Nov. 2005.


[46] M. P. Ware, F. G. Wilkie, and M. Shapcott. The application of product measures in directing software maintenance activity. Journal of Software Maintenance and Evolution, 19(2):133-154, Mar. 2007.

[47] K. D. Welker, P. W. Oman, and G. G. Atkinson. Development and application of an automated source code maintainability index. Journal of Software Maintenance: Research and Practice, 9(3):127-159, 1997.

[48] S. Wong, Y. Cai, M. Kim, and M. Dalton. Detecting software modularity violations. In Proceedings of the 33rd International Conference on Software Engineering (ICSE 2011), pages 411-420, May 2011.

[49] H. Yang and E. Tempero. Measuring the strength of indirect coupling. In Proceedings of the 18th Australian Software Engineering Conference (ASWEC 2007), pages 319-328, April 2007.

[50] Y. Zhou and B. Xu. Predicting the maintainability of open source software using design metrics. Wuhan University Journal of Natural Sciences, 13:14-20, 2008.

[51] Y. Zhou, B. Xu, and H. Leung. On the ability of complexity metrics to predict fault-prone classes in object-oriented systems. Journal of Systems and Software, 83(4):660-674, 2010.


A Apache project overview

Table 5: Apache projects

Name                        Total issues  Releases  First release  Last release  Technique   Description
Harmony                     6500          15        2007/12        2010/09       C++/Java    A Java Runtime Environment
Hadoop Common               6000          59        2006/04        2010/08       Java        The Hadoop common library
Geronimo                    6000          23        2006/02        2011/11       Java        J2EE Application server
Derby                       5300          19        2004/08        2011/04       Java        Relational Database
Axis2                       5000          16        2006/05        2011/08       Java        Web Services framework
HBase                       4500          27        2008/02        2011/08       Java        Big data storage
Camel                       4500          12        2009/12        2011/09       Java        Message routing framework
OFBiz                       4400          7         2010/04        2011/05       Java        Open source automation package
Wicket                      4000          5         2011/08        2011/10       Java        Java Web apps framework
Tuscany                     3900          25        2006/05        2011/08       Java/C++    SOA framework
CXF                         3800          14        2009/03        2011/10       Java        SOA framework
Struts 2                    3600          5         2009/11        2011/09       Java        Enterprise web apps framework
Lucene                      3500          23        2006/02        2011/09       Java        Full-text search library
QPID                        3500          13        2006/12        2011/08       C++         Message queueing framework
ActiveMQ                    3400          18        2006/06        2011/04       Java        Message queueing framework
Cassandra                   3300          44        2009/07        2011/10       Java        Distributed storage
JackRabbit                  3000          72        2006/02        2011/10       Java        Content repository
MyFaces                     3000          39        2009/03        2011/09       Java        Web apps user interface platform
Felix                       3000          34        2007/07        2011/09       Java        Service platform
Struts 1                    2900          21        2007/07        2009/12       Java        Enterprise web apps framework
SOLR                        2900          9         2006/12        2011/09       Java        Search server based on Lucene
Axis                        2700          5         2007/05        2011/10       Java        Web services framework
Tapestry                    2500          19        2006/06        2008/09       Java        Java web apps framework
Xalan J2                    2500          11        2001/02        2007/12       Java        XSLT and XPath implementation
HIVE                        2500          7         2009/04        2011/07       Java        Data warehouse built on Hadoop
Sling                       2200          13        2008/06        2011/08       Java        Web applications framework
PIG                         2200          11        2007/10        2011/07       Java        Platform for analyzing large data sets
UIMA                        2200          5         2008/08        2011/08       Java/C++    Analysis tool for unstructured content
Cocoon                      2200          4         2006/04        2008/04       Java        Web development framework
MyFaces Trinidad            2100          12        2007/02        2011/04       Java        Extension on the MyFaces framework
Hadoop Map/Reduce           2100          6         2009/11        2010/08       Java        Distributed pipeline framework
Roller                      2000          40        2005/10        2011/05       Java        Blog server
Xerces-C++                  2000          24        2008/06        2011/06       C++         XML parser
OpenJPA                     2000          19        2007/08        2011/07       Java        Java persistence implementation
Tapestry 5                  1700          19        2008/09        2011/03       Java        Java web apps framework
ServiceMix                  1650          29        2005/11        2011/05       Java        Enterprise service broker
Directory ApacheDS          1600          18        2006/09        2011/08       Java        LDAP server
OpenEJB                     1600          10        2007/09        2011/10       Java        EJB implementation
Cayenne                     1600          6         2010/05        2011/09       Java        Object/Relational Mapping framework
Xerces-J                    1500          25        2000/12        2010/11       Java        XML parser
Axis2 C                     1500          17        2006/02        2009/04       C           Web services framework
MyFaces Tomahawk            1500          10        2005/10        2011/07       Java        Extension of the MyFaces framework
Shindig                     1500          7         2009/07        2010/12       Java/PHP    OpenSocial container
Thrift                      1400          8         2010/04        2011/08       Multiple    Framework for services development
James Server                1300          7         2004/06        2011/08       Java        Java enterprise mail server
Hadoop HDFS                 1300          6         2009/11        2010/08       Java        Distributed filesystem
JetSpeed 2                  1250          11        2004/10        2011/10       Java        Enterprise portal platform
CouchDB                     1200          13        2009/07        2011/07       Erlang      Document database
Ivy                         1200          12        2005/04        2010/10       Java        Dependency manager
ZooKeeper                   1200          12        2008/10        2011/02       Java        Server for distributed coordination
Forrest                     1200          10        2004/10        2011/02       Java        Publishing framework
BeeHive                     1150          6         2004/11        2006/12       Java        Object model on J2EE and Struts
PDFBox                      1100          8         2010/02        2011/07       Java        PDF tool
Axis-C++                    1100          7         2005/06        2007/01       C++         Web Services framework
HttpComponents HttpClient   1100          3         2007/08        2011/08       Java        Client-side HTTP implementation
MyFaces Tobago              1000          18        2010/01        2011/10       Java        Extension of the MyFaces framework
ODE                         1000          10        2007/05        2011/02       Java        Business process execution engine
C++ Standard Library        1000          8         2005/07        2009/05       C++         ISO/IEC 14882 C++ standard implementation
ServiceMix Components       1000          6         2007/09        2011/02       Java        Components for the ServiceMix service bus
Traffic server              950           15        2010/05        2011/08       C++         Proxy server software
Karaf                       940           13        2010/07        2011/10       Java        Service platform
ServiceMix 4                940           11        2008/04        2011/05       Java        Enterprise service bus
AVRO                        900           14        2009/07        2011/09       Multiple    Data serialization system
Mahout                      850           6         2009/04        2011/05       Java        Machine learning implementations
Geronimo devtools           800           18        2006/02        2011/11       Java        Set of development tools for Geronimo
Pivot                       800           8         2009/04        2011/01       Java        Web deployment platform
Flume                       800           6         2010/06        2011/08       Java        Log data analysis platform
iBatis for Java             750           19        2004/11        2010/04       Java        Data mapper framework
Commons lang                750           11        2002/10        2009/08       Java        Helper utilities for the java.lang API
Directory studio            720           7         2007/02        2010/04       Java        LDAP tooling platform
JspWiki                     700           10        2007/12        2010/08       Java        WikiWiki clone
JDO                         700           7         2006/01        2010/07       Java        Java persistence implementation
Synapse                     700           5         2007/11        2011/09       Java        Enterprise service bus
Xalan C                     700           3         2005/08        2009/06       C++         XSLT and XPath implementation
Commons math                690           6         2004/12        2011/03       Java        Mathematics and statistics components
Tika                        620           10        2007/12        2011/09       Java        Content analysis toolkit
Pluto                       600           15        2005/12        2011/09       Java        Java Portlet implementation
Buildr                      600           14        2007/11        2011/06       Ruby        Java applications build system
OpenWebBeans                600           11        2008/12        2011/08       Java        Context and dependency injection framework
Chukwa                      600           4         2009/04        2010/04       Java        Distributed systems monitoring system
JetSpeed                    600           4         2004/07        2005/05       Java        User customizable portal system
NPanDay                     570           9         2008/12        2011/05       .NET/Java   Integration of Apache Maven for .NET
VCL                         550           3         2009/12        2011/04       Multiple    Virtual machines management framework
Juddi                       530           20        2004/06        2011/10       Java        UDDI v3 implementation
Tiles                       520           15        2007/02        2011/02       Java        Web interface template framework
Shale                       500           5         2006/10        2008/05       Java        Web application framework
XmlBeans                    470           10        2004/06        2009/12       Java        XML to Java binding
Chemistry                   450           7         2011/03        2011/09       Java        Content management interoperability service
Commons net                 425           12        2003/02        2011/06       Java        Client side protocol implementations
Commons Beanutils           400           12        2001/06        2010/03       Java        Object accessibility wrapper
Axiom                       400           12        2006/11        2011/07       Java        XML object model
Whirr                       400           6         2010/09        2011/08       Java        Libraries for running cloud services
ActiveMQ C++ client         390           28        2006/11        2011/04       C++         ActiveMQ C++ client
Xwork                       380           7         2004/11        2011/04       Java        Command/interceptor framework
Commons Configuration       370           8         2004/10        2011/09       Java        Generic configuration interface
Commons Collections         370           7         2001/07        2006/05       Java        Data structures library
ESME                        370           4         2010/03        2011/08       Scala       Enterprise Messaging Environment
Commons dbcp                365           7         2002/08        2010/02       Java        Database connection pools library
ActiveMQ .Net               340           7         2009/07        2011/06       .NET        ActiveMQ .NET client
WSS4J                       320           13        2008/06        2010/10       Java        Web services security
James IMAP                  315           4         2010/09        2011/06       Java        Imap library
Wookie                      250           4         2009/06        2011/10       Java        Widgets server