[ieee 2013 4th international workshop on emerging trends in software metrics (wetsom) - san...
Measuring Software Projects Mayan Style
Siim Karus
University of Tartu
Tartu, Estonia
Abstract—The progress of contemporary software projects is
subject to several different measurements. These measurements
are often subjective and rely on developers’ personal predictions.
In addition, software projects are assumed to progress linearly
from the beginning to an end. This can be a good approximation
of progress in projects with rigid and clearly defined and planned
deliverables or deadlines, but is not sufficient for application in
community-driven loosely defined projects. In this paper, we are
proposing an alternative cyclical view on measuring progress in
software development. Based on cyclical time perceptions from
other fields of life, we analysed 23 open source software projects
to find reoccurring patterns in open source software projects.
The empirically derived cyclical and event-based measurements
of software projects’ progress do not suffer from the linear
approximation issues seen with many other measurements.
Accordingly, we believe that the derived progress modelling
technique describes community-driven software better as it
adapts to environmental changes and lends itself to building
estimations of the projects’ future.
Index Terms—Open source software, wavelet analysis,
patterns, evolution, measurement
I. INTRODUCTION
Open source software projects offer an increasing wealth of
software project evolution data [1]. Much of this data is stored
in different information systems like source code repositories,
change management systems, project management systems and
social information exchange systems like e-mail. All these have
been useful sources of data for researchers aiming to improve
the quality of software or the software development process.
In our study, we are following this long-lasting practice by
taking a look at source code repositories in order to find
indicators that could be useful for tracking the projects’
progress. In other words, we are looking for reoccurring
patterns in open source software projects.
Knowing these patterns could help in:
• Identifying project state or relative progress;
• Identifying naturally occurring iterations in loosely led
  projects;
• Making estimations on subsequent software project
  evolution cycles;
• Preparing projects for their next milestones;
• Making projects comparable with each other.
In contemporary software development, subjective
experience from previous projects and project life cycle
theories, which are often based on the former, are used as the
basis in these tasks. Sometimes linear approximations like
project size in lines of code (LOC)¹, cumulative code churn
(sum of LOC added, deleted, and modified), or calendar-based
duration are used to aid in these tasks. Unfortunately, as can
be seen from Figure 1, neither project size nor project duration
provides a truly meaningful and reliable means for tracking
project progress, due to the uneven activity in the projects.
Moreover, the projects differ greatly in size, activity and
duration, making size and duration based comparisons difficult.
Learning from the calendars developed by ancient cultures
like the Maya, we know that time and evolution can be
considered cyclical. This notion allows us to measure duration
and progress even if we do not know the end date. Thus, it
follows the practice of open source community-driven software
projects that have no planned “end”. It does, however, come at
the expense of requiring a “container” for cycles so that we
would know when an old cycle ends and a new one begins.
We aim to fill that gap by applying scale-tolerant
frequent pattern mining to identify similarities in software
projects and offer a common scale for comparing projects.
Correspondingly, we are trying to answer the following
questions:
RQ1. Are there reoccurring common evolution patterns in
open source software projects?
RQ2. If common evolution patterns exist, are they
regular/cyclical?
The paper starts by giving background on the life cycles of
software projects and on the wavelet analysis method in Section II.
The dataset used for the study is explained in Section III and
the method in Section IV. We present the discussion in Section
V and conclude with Section VI.
II. BACKGROUND
To solve the undertaken task, we are making use of a
technique popular in mechanics, medicine, image and audio
processing. However, this technique is rarely used for business
process data exploration, which is the theme in this paper.
Thus, in the next subsections, we will give an overview of
project life cycle theories followed by a short introduction to
the wavelet analysis technique employed in this paper.
A. Project Life Cycle Theories
In order to determine whether software projects follow
natural cyclical patterns, we need to understand the death of
software projects as it marks the end of the final cycle. The
death of an open source software project is a little-studied
phenomenon. It is as if open source software projects are
expected to never die. Inactive projects are removed from the
active Web and become difficult to find, soon to be forgotten,
which contributes to the illusion.
¹ In this paper, lines of code consists of all lines (including empty) of
all textual files stored in the project repository.
978-1-4673-6331-0/13/$31.00 © 2013 IEEE. WETSoM 2013, San Francisco, CA, USA
There exists a wealth of studies into the vitality of open
source software [2]. These studies have focused on the
projects’ ability to provide support and grow in the number of
releases. We are not interested in measuring projects’ strength
or success; we only look for signs related to the development
process, ignoring community support phases. This gives a very
different interpretation to the “death” and success of a project,
as a project where development has ceased can still be
successful in terms of user adoption and continued community
support, as described in Section II.B. Most importantly,
cessation of development does not always mean failure.
Success in formal closed source in-firm software
development projects is clearly defined by their financial
success. However, this measure of success does not apply to
many of the open source projects, which are community led
and distributed freely. This has led to a redefinition of success
in open source software projects. Success in open source
software projects can be defined as a high level of end-user
adoption [3], high software quality [4], a high level of developer
engagement, or in many other objective or subjective respects
[5]. This distinction, along with the volunteer-oriented
processes, means a different approach to development needs to
be taken.
Eric Raymond explored the differences between open
source projects and closed source projects in his book “The
Cathedral and the Bazaar” [6], which has become one of the
most cited books on the management of community-driven
open projects. In his book he compares the closed team projects
to building a cathedral, where the process is planned, and open
team projects to a bazaar, where participants are constantly
competing and collaborating to reach the target in independent
small scale steps. The bazaar-like agile behaviours are clearly
noticeable in open source development processes; however, the
two distinct management philosophies have begun merging as
open source projects have increased in scale [7, 8].
A frequent criticism about the studies of open source
software is the low number of projects involved in these studies
[9]. In this study, we are using 23 projects from different
repositories and development teams to reduce inherent risks
from data sampling.
B. Causes of Death
Even though open source software has been around since
the advent of programming, the nature of open source software
has become more varied. In particular, improved opportunities
to cooperate have made community developed long-living
software more common. This can be seen as a necessary
evolutionary step to handle the increasing complexity and
volume of modern software.
A common aspect of interest in those projects is the
motivation behind them. There are claims of intrinsic
motivation in open source software development as well as
external motivators in play [10, 11]. Whatever the exact
motivators for participating in open source software
development, it is clear that they have a significant impact on
the way the projects are managed and progress. This impact will
also show in the evolutionary patterns found in the projects.
Figure 1. Relative progress by date vs. relative progress by cumulative LOC churn. Bubble size reflects number of commits.
There is much research trying to identify what makes some
open source projects successful and why some projects never
seem to catch on. The research on the success of open source
software projects is fragmented by the various definitions of
success of open source software projects [12]. The determinants
of open source project success can be external or internal
[13, 2]. For example, developers and also end-users have shown
a preference for more open project licenses [14].
The studies on the success of projects are mainly focused
on identifying the reasons for popularity among end-users.
While this is a valid definition of success, we are left
wondering why some projects seem to carry on forever,
releasing new versions every now and then, while others seem
to cease evolving (independent of their popularity).
For the purposes of this study, we consider two main causes
of discontinued development of community-driven open source
software projects: loss of popularity, and reaching maturity. In
the case of solo projects, we could also include cases where the
developer is simply no longer available (e.g. due to death,
employment, etc.), which is not the case in multiple-developer
community-driven projects.
1) Loss of Popularity
Loss of popularity can be a result of many different events.
For example, a product might lose its share due to another
product replacing it. Another scenario for loss of popularity is
the technical advancement of platforms. For example,
applications built for older operating systems or deprecated
hardware will be unusable or not needed in the new setting.
2) Reaching Maturity
If a project reaches maturity, it will be used without any
changes for the foreseeable future. That means the product
either becomes future-proof or achieves a high level of forward
compatibility. This differs from the development process
maturity as defined by CMMI [15] or OMM [16].
In the case of becoming future-proof, the product will function
as-is without any change. A common way of achieving this is
by building in extensibility options. This makes it the preferred
aim for software libraries and protocols (e.g. TCP/IP or HTTP),
which can go without change for decades.
Forward compatibility on the other hand is achieved by
adaptation. This could mean application of fuzzy logic,
extensibility, modularity, natural-language-processing, change
estimation, or even self-evolving programs (e.g. worms and
viruses often apply this approach). A simple example of such
projects is wrappers for different libraries.
Even though differentiating between the causes of death
would be advisable, there is too little data on the use of
software to make such a distinction. At best we could say that
projects which are no longer available on the Web have died
due to loss of popularity. Unfortunately, the unavailability of
these projects makes it impossible to gather data about them.
Thus, we can assume that all the projects involved in this study
have at least a minimal user base for which the software is
sufficiently mature and that the projects have died a “good” death.
C. Wavelet Analysis
Wavelet analysis is the analysis of signals (time series) by
decomposing the signal into wavelet coefficients (also known
as shift coefficients) and scaling coefficients based on wavelet
functions (also known as filters). Such decomposition allows
compression as the resulting number of coefficients can be
smaller than the number of original samples. This
decomposition can be repeated on the scaling coefficients until
the number of resulting coefficients is smaller than the
filter length.
In this study we chose to use a Daubechies filter of length 2
(also known as Haar wavelet [17]) due to its simplicity and
simple interpretation. We are applying discrete wavelet
transform meaning we are using discrete shift when matching
the time series with the wavelet. A decomposition of the LOC
series of project “docbook2X” along the two time series
dimensions to different levels of wavelet transform is shown in
Figure 2. In this figure, lines mark scaling coefficients (V) and
bars mark wavelet coefficients (W). All these coefficients are
normalised and the last level of decomposition is excluded as it
has only 1 value.
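The decomposition described above can be illustrated with a minimal sketch. It assumes the standard pyramid scheme, in which each further level is computed from the previous level's scaling coefficients, and an orthonormal Haar filter; the function names `haar_step` and `haar_dwt` and the sign convention of the wavelet coefficients are choices made for this illustration, not taken from the study's tooling.

```python
import math


def haar_step(series):
    """One level of the Haar transform: returns (scaling, wavelet) coefficients.

    Each output value summarises one non-overlapping pair of inputs, so both
    output vectors are half as long as the input.
    """
    if len(series) % 2:
        raise ValueError("series length must be even")
    s = 1 / math.sqrt(2)  # orthonormal Haar filter weight
    pairs = [(series[i], series[i + 1]) for i in range(0, len(series), 2)]
    scaling = [(a + b) * s for a, b in pairs]  # smoothed trend (V)
    wavelet = [(b - a) * s for a, b in pairs]  # local change (W)
    return scaling, wavelet


def haar_dwt(series):
    """Repeat haar_step on the scaling coefficients, level by level.

    Returns a list of (scaling, wavelet) tuples, one per transform level.
    Stops when fewer than 2 values (the filter length) remain or the
    current length is odd.
    """
    levels = []
    current = list(series)
    while len(current) >= 2 and len(current) % 2 == 0:
        current, detail = haar_step(current)
        levels.append((current, detail))
    return levels
```

For a series of four samples, `haar_dwt` yields two levels: the first with two scaling and two wavelet coefficients, the second with one of each. The last level corresponds to the single-value level that is excluded from Figure 2.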
Wavelet transform has proven important in signal
processing thanks to its inherent properties which allow
comparisons at different scales and shifts. This gives three
important advantages compared to many other time series
analysis techniques:
• Scaling coefficients allow fuzzy matching as
  differences in details are “smoothed out”.
• Filter coefficients allow detection of small anomalies
  in series.
• Discrete transform levels make series of different
  lengths or scale comparable.
These advantages have been beneficial in financial
analytics for identification of anomalies and correlations to
identify opportunities [18, 19, 20]. The fuzzy matching and
scale comparisons have proven useful for clone detection in
image processing [21, 22]. The advantages of wavelet analysis
techniques are also useful for frequent pattern analysis of time
series data such as that used in this study.
III. DATA
A. Projects
The analysis was performed on 23 open source software
projects. 18 of these projects were from a dataset of software
projects chosen randomly using Google Code Search. The
projects were selected from various repositories employing
different source code languages, and having multiple
developers in a team. This made sure that we represent
different team sizes, and project types. 15 of these projects are
on-going and 3 have had no development activity for at least a
year (are “dead”). The alive projects have stayed alive for a
minimum of 4 years and at least 3 years after the last data
sample timestamp used in the analysis. Projects
“fbug-read-only” and “vim7” have an earlier last data sample
date as the development in these projects was moved to another
repository. The list of projects in this dataset is given in
Table I. This dataset was verified to have source code and
activity structure similar to the dataset of more than 400000
open source software projects tracked by ohloh.net². Thus, this
dataset should represent the overall state of open source
software development fairly well.
The dataset of 18 projects was complemented by 5 dead
projects from sourceforge.com. The aim of complementing the
original dataset was to balance the number of dead and alive
projects in the study. These 5 projects were also used as an
independent dataset for verifying some of the patterns found in
the 18 project dataset.
We used specially built software to implement ETL
(extract-transform-load). The software downloaded the
projects’ CVS and SVN repository data into a SQL Server³
database and processed the history data to count lines of code
(LOC) and code churn metrics. A commit log of the projects
was exported from the SQL Server database for wavelet analysis.
² http://www.ohloh.net/
³ http://www.microsoft.com/en-us/sqlserver/
B. Metrics
We conducted wavelet analysis with respect to two different
time series dimensions: days since the first commit and
cumulative code churn. Code churn is the sum of code added,
modified and removed [23]. Those two time series were chosen
due to their popularity in project process measurement
frameworks. Even though, some solutions use lines of code
(LOC) in project snapshot to measure progress in software
development, we consider it a bad practice as LOC is not
monotonously growing throughout the development process
(see Figure 2). This measure can still be used comparing
progress to estimated final size of the software code base.
Future cumulative code churn can be estimated with reasonable
accuracy based on project snapshots as well [24]. Thus,
cumulative code churn as development progress measure
combines some of the benefits of measuring progress in time
spent and LOC of final code produced.
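The difference between the two size-related measures can be shown with a short sketch. It assumes each commit is summarised as a hypothetical `(added, modified, deleted)` tuple of line counts, and that modified lines leave the snapshot size unchanged; both are simplifications made for this illustration.

```python
from itertools import accumulate


def cumulative_churn(commits):
    """Running sum of LOC added + modified + deleted, oldest commit first."""
    return list(accumulate(a + m + d for a, m, d in commits))


def loc_snapshots(commits):
    """Project size after each commit, assuming an empty starting repository.

    Assumption: a modified line neither adds to nor removes from the size.
    """
    return list(accumulate(a - d for a, _m, d in commits))


# Hypothetical history: initial import, a small extension, then a clean-up
# commit that deletes more than it adds.
history = [(100, 0, 0), (20, 10, 0), (0, 5, 60)]
churn = cumulative_churn(history)
size = loc_snapshots(history)
```

In this made-up history the cumulative churn never decreases, while the snapshot size drops after the clean-up commit, which is why churn can serve as a progress axis where snapshot LOC cannot.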
The data series used in the analysis were related to the
developers participating in the projects, code churn, and project
size.
Figure 2. DWT decomposition of LOC series by time (left) and cumulative churn (right). Bottom graphs show original series, number in brackets shows
transform level, lines show normalized scaling coefficients, bars normalized wavelet coefficients.
The metrics relating to developers were:
• Average number of active developers – it is reasonable
  to assume that more active developers will write code
  faster (more cumulative code in the same timeframe)
• Cumulative number of developers – this reflects the
  diversity of knowledge of the code as different
  developers work on different sections of the code
• Number of commits – we would expect to see the
  commit frequency to drop gradually before the death of
  a project
• Relative team size (cumulative number of developers
  divided by the total number of developers at the date of
  the last data point collected about the project) – as the
  inclusion of other/new developers might be planned,
  we get to know how many different developers have
  already touched the code
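Two of the developer metrics above, cumulative number of developers and relative team size, can be sketched as follows. The input is assumed to be a hypothetical list of commit author names in chronological order; the function names are illustrative only.

```python
def cumulative_developers(authors):
    """Number of distinct developers seen up to and including each commit."""
    seen = set()
    counts = []
    for author in authors:
        seen.add(author)
        counts.append(len(seen))
    return counts


def relative_team_size(authors):
    """Cumulative developer count divided by the total at the last data point."""
    counts = cumulative_developers(authors)
    if not counts:
        return []
    total = counts[-1]
    return [c / total for c in counts]


# Hypothetical commit log authors, oldest commit first.
log = ["ann", "bob", "ann", "cat", "bob"]
```

For this made-up log the cumulative count rises from 1 to 3, and the relative team size reaches 1.0 at the commit where the third developer first appears.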
The metrics relating to code churn were:
• Mean LOC added, modified, deleted, and churned per
  commit (4 metrics) – large commits might lead to
  defects, which could be the cause for a project to be
  abandoned
• Cumulative LOC added, modified, deleted, and
  churned per commit (4 metrics) – the size of a project
  history relates to the complexity and abundance of
  different thought patterns, which could be a deterrent
  to new and old developers
• Relative cumulative LOC churn (only for dead
  projects) – the progress of development measured in
  LOC
The metrics relating to project size were:
• Mean LOC – the size of a project relates to the
  complexity, which could be a deterrent to new and old
  developers
• Mean number of files – the size of a project relates to
  the complexity, which could be a deterrent to new and
  old developers
• Relative progress by date (only for dead projects)
In our study, lines of code is measured by counting all text
lines including source code, comments, configuration settings,
readme, and build files. This takes into account our previous
findings showing that on average 4 different types of code are
used in an open source software project and that plaintext or
configuration files are a significant portion of that code [25].
IV. METHOD
The analysis and data preparation were conducted in several
steps: data aggregation, discrete wavelet transform, similar
region detection and grouping.
In the first step, data series were aggregated along the two
time series dimensions. For days since first commit, the data
was aggregated in 7 day frames (corresponding to a week). For
cumulative code churn, a frame of 1000 LOC was used instead.
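The aggregation step can be sketched as a simple bucketing of per-commit values by frame index. The source does not state which aggregate is used for each metric, so the `reducer` argument (sum by default) is an assumption, as is the function name `aggregate_by_frame`.

```python
from collections import defaultdict


def aggregate_by_frame(positions, values, frame_size, reducer=sum):
    """Group values into fixed-width frames along one time series dimension.

    positions  -- position of each commit on the chosen axis, e.g. days since
                  the first commit or cumulative code churn in LOC
    frame_size -- 7 for weekly frames, 1000 for 1000-LOC churn frames
    reducer    -- aggregate applied within a frame (assumed, not from the paper)
    """
    frames = defaultdict(list)
    for pos, val in zip(positions, values):
        frames[int(pos // frame_size)].append(val)
    return {index: reducer(vals) for index, vals in sorted(frames.items())}


# Hypothetical data: one commit each on days 0, 3, 6, 7 and 15.
commits_per_week = aggregate_by_frame([0, 3, 6, 7, 15], [1] * 5, frame_size=7)

# The same idea along the churn axis with 1000-LOC frames.
commits_per_kloc = aggregate_by_frame([500, 900, 1000, 2500], [1] * 4, frame_size=1000)
```

Frames without any commit are simply absent from the result; how such gaps are filled before the transform is not described in the source and is left open here.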
TABLE I. LISTING OF PROJECTS INVOLVED IN THE STUDY.
Name                  State       Duration (weeks)  Cumulative churn (kLOC)  Location
bibliographic         inactive    309               348                      www.openoffice.org/bibliographic
bizdev                inactive    272               531                      www.openoffice.org/bizdev
commons               active      121               2498                     wso2.org
dia                   active      641               2521                     live.gnome.org/Dia
docbook               active      454               9073                     docbook.sourceforge.net
docbook2X             inactive    432               234                      docbook2x.sourceforge.net
esb                   active      121               1057                     wso2.org
exist                 active      363               3578                     exist.sourceforge.net
fbug-read-only        repo moved  22                39                       fbug.googlecode.com
feedparser-read-only  active      246               105                      feedparser.googlecode.com
gnome-doc-utils       active      263               64                       live.gnome.org/GnomeDocUtils
gnucash               active      604               4835                     gnucash.org
groovy                active      321               1775                     svn.codehaus.org/groovy
ivam                  inactive    19                19                       ivam.sourceforge.net
jackcc                inactive    12                1152                     jackcc.sourceforge.net
jd4x                  inactive    173               25                       jdx.sourceforge.net
jedidbd               inactive    42                293                      jedidbd.sourceforge.net
tei                   active      276               3642                     tei.sourceforge.net
valgrind              active      375               2646                     valgrind.org
vim7                  repo moved  21                496                      vim.org
VirtualDubMod15       inactive    157               24922                    virtualdubmod.sourceforge.net
wsas                  active      120               2077                     wso2.org
wsf                   active      121               3630                     wso2.org
In the second step, discrete wavelet transform with
Daubechies filter with length 2 (also known as Haar filter) was
applied on the data series. This gives us two coefficient vectors
(wavelet and scaling coefficient) for each transform
(compression) level.
Linearly positively similar (maximum deviation 0.5%)
maximal sub-sequences of the coefficient vectors were
identified in the third step. We were looking at sub-sequences
of the minimum length of 3 as any two 2-value sequences are
linearly similar (but not necessarily positively). We only
looked for similarities in the same dimension and the same type
of coefficient (for example, we did not look for similarities
between cumulative LOC added filter coefficient vectors and
number of commits scaling coefficient vectors).
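The similarity criterion of the third step can be sketched under an explicit assumption: two equally long sub-sequences x and y are treated as linearly positively similar when y is approximated by a·x + b with a > 0, and the largest residual of the least-squares fit stays within 0.5% of the value range of y. The source does not spell out how the deviation is normalised, so this definition, and the names `linearly_positively_similar` and `similar_windows`, are illustrative only.

```python
def linearly_positively_similar(x, y, max_deviation=0.005):
    """True if y ~ a*x + b with a > 0 within the allowed relative deviation."""
    # Two points always lie on a line, so at least 3 values are required.
    if len(x) != len(y) or len(x) < 3:
        return False
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:  # constant x: no slope can be fitted
        return False
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope <= 0:  # similar shape but mirrored or flat: not "positive"
        return False
    intercept = mean_y - slope * mean_x
    spread = max(y) - min(y)  # non-zero whenever slope > 0
    worst = max(abs(yi - (slope * xi + intercept)) for xi, yi in zip(x, y))
    return worst / spread <= max_deviation


def similar_windows(a, b, length=3, max_deviation=0.005):
    """All (i, j) start offsets where a[i:i+length] matches b[j:j+length]."""
    return [
        (i, j)
        for i in range(len(a) - length + 1)
        for j in range(len(b) - length + 1)
        if linearly_positively_similar(a[i:i + length], b[j:j + length], max_deviation)
    ]
```

This brute-force window comparison only finds fixed-length matches; extending matched windows to maximal sub-sequences, as done in the study, would be a further step on top of it.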
The analysis and data aggregation were performed using R
Statistics Suite⁴ with “wavelets”, “zoo”, and “chron” packages.
Package “wavelets” includes discrete wavelet transform
methods, package “zoo” includes time series aggregation
methods and package “chron” extends support for date and
time manipulations.
This method allows identification of patterns that could
present themselves in different levels of detail in respect to
cumulative LOC churn and date. However, there might be
metrics that are not covered in this study showing similar
evolution patterns between projects. Thus, the patterns
identified in this study cannot be considered a complete listing
of reoccurring patterns shared between projects.
V. DISCUSSION
The analysis of similar and very common patterns
identified 58 patterns and sub-patterns that occurred in at least
14 projects more than once. No pattern was identified as
common to all projects. Two patterns of steady increase in
cumulative LOC added, when plotted with respect to
cumulative code churn, were each found in 18 projects
(different sets of projects), making them the most universal
patterns found in the study. These patterns occurred on average
4 times in a project. These patterns were: an approximately
2.1% increase in cumulative LOC added for three consecutive
periods (P1), and an approximately 4.4% increase followed by
a 3.8% increase and a 3.4% increase in cumulative LOC added
(P2). These can be summed up as a stable growth pattern and a
decreasing growth speed pattern.
An important aspect of these patterns is that they
occurred at different scales. That is, both patterns P1 and P2
contained themselves. More specifically, P1 occurred up to
twice at a lower level before occurring again at a higher level.
Pattern P2 did not display such a cyclical pattern, as each
scaling level increased the frequency 2-7 times.
The most common patterns in relation to calendar time were
found to relate to cumulative LOC added (P3: an increase of
about 0.4% followed by two periods of no change) and
cumulative LOC churned (P4: a very small code churn increase
slowing down in the following periods). Both of these patterns
were present in 16 projects and occurred on average 7.5 times
in each project. Pattern P4 showed cyclical behaviour, as its
occurrences were about twice as frequent at a lower level as at
a higher level, while pattern P3 did not show cyclical behaviour.
⁴ http://www.r-project.org/
When measuring project age using pattern P4, we notice
that all dead projects that contain this pattern die at an age of
around 0:2:0, where age is noted as O1:O2:O3. Here O1 is the
number of pattern occurrences at reverse scaling level 4 (the
time series is divided into 2^4 sections), O2 is the number of
occurrences at a lower scaling level (a level twice as detailed
as that of O1) since the previous O1 occurrence, and O3 is the
number of occurrences at the next lower level.
An odd dead project is “bizdev”, which has no O2
occurrences at all and dies at 0:0:4, which is close to 0:2:0
given the average frequency multiplier of 2 between subsequent
scaling levels. “VirtualDubMod15” is another deviation: it
has a total of 5 O3 occurrences and dies at the old age of 0:3:0.
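The age notation can be made concrete with a small sketch. It assumes that pattern occurrences at the three scaling levels have already been mapped to positions on a common axis (for example, frame indices of the original series), and that the O3 counter restarts after each O2 occurrence in the same way that O2 restarts after each O1 occurrence; the latter is our reading of the notation rather than something stated explicitly. The function name `pattern_age` and the example positions are hypothetical.

```python
def pattern_age(o1_positions, o2_positions, o3_positions, now):
    """Return the age string 'O1:O2:O3' of a project at position `now`.

    Each *_positions argument lists where the pattern was observed at one
    scaling level, coarsest level first, on a shared position axis.
    """
    # O1: all occurrences at the coarsest level so far.
    o1_seen = [p for p in o1_positions if p <= now]
    last_o1 = max(o1_seen, default=float("-inf"))

    # O2: occurrences at the next level since the previous O1 occurrence.
    o2_seen = [p for p in o2_positions if last_o1 < p <= now]
    last_o2 = max(o2_seen, default=last_o1)

    # O3: occurrences at the finest level since the previous O2 occurrence
    # (assumed reset rule, see the lead-in).
    o3_seen = [p for p in o3_positions if last_o2 < p <= now]

    return f"{len(o1_seen)}:{len(o2_seen)}:{len(o3_seen)}"


# Hypothetical occurrence positions, not taken from the dataset.
age_a = pattern_age([], [10, 30], [2, 5, 12, 20], now=40)
age_b = pattern_age([], [], [3, 9, 14, 20], now=25)
```

With these made-up positions, `age_a` is a project with two mid-level occurrences and none at the finest level since the last of them, and `age_b` is a project with fine-level occurrences only, mirroring the shape of the 0:2:0 and 0:0:4 ages discussed above.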
Alive projects fall into two categories: projects that have
not reached 0:2:0 (most of the alive projects fall into this
category) and projects that have at least one O1 occurrence
(“wsf” and “feedparser-read-only” are examples of this
category). The project closest to reaching 0:2:0 is “commons”,
which stood at 0:1:1 at the time of data collection. A similar
observation on a different subset of projects can be made using
pattern P1.
There exist complementary patterns to P3 and P4 that
cover projects not covered by P3 and P4 respectively. P4’s
complementary pattern is P5: a 1.4% increase in cumulative LOC
churn followed by a 1.1% increase and a 0.6% increase.
Interestingly, this pattern does not have cyclical properties. On
the other hand, P3’s complementary pattern P6 of diminishing
increase in cumulative LOC added does repeat itself in lower
scaling levels. Despite the similarities, the project sets sharing
patterns P3-P6 are all different.
The reoccurring and cyclical patterns are similar to ancient
calendars as the cycle length is not fixed (it is approximate).
Instead, the cycle end is determined by pattern occurring in
higher scale (in calendars, it is determined by the cycle of
another celestial body).
Other reoccurring patterns were found also in project size in
LOC (non-cyclical) and number of files (non-cyclical) in
relation to calendar date, and cumulative number of developers
(cycle time around 1.5 occurrences) and number of files (non-
cyclical) in relation to cumulative code churn. When patterns
common to fewer projects were allowed, more than 250
additional patterns satisfied the condition of occurring at least 4
times on average in a project. These patterns were identified in
the cumulative number of developers, cumulative churn and its
components, LOC, and number of files in relation to both time
series.
VI. CONCLUSIONS
We have demonstrated that open source software projects
contain reoccurring and similar evolution patterns (RQ1). We
do also confirm that there is no universal reoccurring evolution
pattern – instead, there are several reoccurring patterns that
show similarities between different projects. Thus, if one is to
utilise these similarities to adjust different projects into a
common scale, one needs to start by identifying similarity
patterns in these projects.
The second result of the study is in confirming that open
source software projects evolve in cyclical patterns (RQ2). That
is, evolution patterns at a low scale are periodically repeated at
a larger scale. This can be useful in understanding the seemingly
missing deadlines of the projects.
The cyclical patterns turn out to be suitable as a common
scale for the projects in the dataset. This is shown by the
uniformity in the calculated ages of the projects at the time of
their death. This uniformity was achieved using unadjusted
parameters for common pattern identification, thus, a further
study could introduce even better and more uniform patterns by
identifying better commonality criteria for the patterns.
The path of using pattern matching to create common
baselines for comparing software projects can be extended by
looking at interactions between the evolution patterns in the
projects. That is, we might find that some features have regular
repeating patterns within a repeat cycle of another feature.
Another type of interaction that we might be interested in is
the classification of pattern occurrence events by another feature
(for example, distinguishing churn increase patterns according
to whether they coincide with increase in LOC). Such studies
have the potential of uncovering new estimation and planning
models for software development. That is, we could identify
when the projects reach critical stages depending on
administrative and architectural choices like openness to
developers or use of agile development processes.
A path we are pursuing by introducing wavelet analysis to
software evolution analysis services is validation and
verification of such patterns in industrial settings. Whilst this
approach might not identify project end correctly, it does offer
a secret-preserving means to interact with and process
proprietary source code in order to improve or extend our
models with industry experience.
ACKNOWLEDGMENT
This research is partly funded by ERDF via the Estonian
Centre of Excellence in Computer Science.
REFERENCES
[1] A. Deshpande and D. Riehle, “The Total Growth of Open Source,” in
Open Source Development, Communities and Quality, 2008.
[2] U. Raja and M. Tretter, “Defining and Evaluating a Measure of Open
Source Project Survivability,” IEEE Transactions on Software
Engineering, vol. 38, no. 1, pp. 163-174, 2012.
[3] J. M. Beaver, X. Cui, J. L. St Charles and T. E. Potok, “Modeling success
in FLOSS project groups,” in Proceedings of the 5th International Conference on Predictor Models in Software Engineering
(PROMISE09), Vancouver, British Columbia, Canada, 2009.
[4] C. Conley and L. Sproull, “Easier Said than Done: An Empirical Investigation of Software Design and Quality in Open Source Software
Development,” in 42nd Hawaii International Conference on System
Sciences, 2009. HICSS '09., 2009.
[5] A. H. Ghapanchi, A. Aurum and G. Low, “A taxonomy for measuring
the success of open source software projects,” First Monday, vol. 16, no. 8, 2011.
[6] E. S. Raymond, The Cathedral and the Bazaar, vol. 3.0, Thyrsus Enterprises, 2000.
[7] J. Wesselius, “The Bazaar inside the Cathedral: Business Models for Internal Markets,” IEEE Software, vol. 25, no. 3, pp. 60-66, 2008.
[8] S. Black, P. Boca, J. Bowen, J. Gorman and M. Hinchey, “Formal Versus Agile: Survival of the Fittest,” Computer, vol. 42, no. 9, pp. 37-45, 2009.
[9] K. Crowston, K. Wei, J. Howison and A. Wiggins, “Free/Libre open-source software development: What we know and what we do not
know,” ACM Comput. Surv., vol. 44, no. 2, p. 35, 2008.
[10] J. Bitzer, W. Schrettl and P. J. Schröder, “Intrinsic motivation in open source software development,” Journal of Comparative Economics, vol.
35, no. 1, pp. 160-169, 2007.
[11] P. V. Singh, “The small-world effect: The influence of macro-level
properties of developer collaboration networks on open-source project
success,” ACM Trans. Softw. Eng. Methodol., vol. 20, no. 2, p. 27, August 2010.
[12] K. Crowston, J. Howison and H. Annabi, “Information systems success
in free and open source software development: theory and measures,” Software Process: Improvement and Practice, vol. 11, no. 2, pp. 123-
148, 2006.
[13] S.-Y. T. Lee, H.-W. Kim and S. Gupta, “Measuring open source software
success,” Omega, vol. 37, no. 2, pp. 426-438, 2009.
[14] C. Subramaniam, R. Sen and M. L. Nelson, “Determinants of open source software project success: A longitudinal study,” Decision Support
Systems, vol. 46, no. 2, pp. 576-585, 2009.
[15] D. Ahern, A. Clouse and R. Turner, Cmmi® distilled: a practical
introduction to integrated process improvement, Third ed., Addison-
Wesley Professional, 2008.
[16] E. Petrinja, R. Nambakam and A. Sillitti, “Introducing the OpenSource
Maturity Model,” in Proceedings of the 2009 ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and
Development, 2009.
[17] R. S. Stanković and B. J. Falkowski, “The Haar wavelet transform: its status and achievements,” Computers & Electrical Engineering, vol. 29,
no. 1, pp. 25-44, 2003.
[18] F. In and S. Kim, “The Hedge Ratio and the Empirical Relationship
between the Stock and Futures Markets: A New Approach Using
Wavelet Analysis,” The Journal of Business, vol. 79, no. 2, pp. 799-820, 2006.
[19] A. Rua and L. C. Nunes, “International comovement of stock market
returns: A wavelet analysis,” Journal of Empirical Finance, vol. 16, no. 4, pp. 632-639, 2009.
[20] J. Yang and P. Lin, “Dynamic risk measurement of futures based on wavelet theory,” in Seventh International Conference on Computational
Intelligence and Security (CIS), 2011.
[21] S. Khan and A. Kulkarni, “Reduced Time Complexity for Detection of Copy-Move Forgery Using Discrete Wavelet Transform,” International
Journal of Computer Applications, vol. 6, no. 7, pp. 31-36, September
2010.
[22] Y. Wang, K. Gurule, J. Wise and J. Zheng, “Wavelet Based Region
Duplication Forgery Detection,” in Ninth International Conference on Information Technology: New Generations (ITNG), 2012.
[23] J. Munson and S. Elbaum, "Code churn: a measure for estimating the
impact of code change," in Proceedings. International Conference on Software Maintenance, 1998., 1998.
[24] S. Karus and M. Dumas, “Code Churn Estimation Using Organisational and Code Metrics: An Experimental Comparison,” Information and
Software Technology, vol. 54, no. 2, pp. 203-211, February 2012.
[25] S. Karus and H. Gall, “A study of language usage evolution in open
source software,” in Proceedings of the 8th International Working
Conference, Honolulu, HI, USA, 2011.