Size Doesn’t Matter? On the Value of Software Size Features for Effort Estimation
DESCRIPTION
Ekrem Kocaguneli, Tim Menzies : WVU, USA; Jairus Hihn : JPL, USA; Byeong Ho Kang : UTAS, Australia. PROMISE'12, Lund, Sweden
TRANSCRIPT
Size Doesn’t Matter?
On the Value of Software Size Features for Effort Estimation
Ekrem Kocaguneli, Tim Menzies : WVU, USA; Jairus Hihn : JPL, USA; Byeong Ho Kang : UTAS, Australia
Sound bites
Size matters!
But lack of size features can be tolerated
• Caveat: need to first prune irrelevancies
PROMISE’12
Sept 2012
Role of Size Features in SEE
Size features are at the heart of some of the most widely used SEE methods
COCOMO is based on LOC
Function points (FP) is based on logical transactions
Various others exist, such as number of requirements, number of modules, number of web pages and so on…
Role of Size Features in SEE (cntd.)
Size features have their advantages and disadvantages
LOC counting can be automated and LOC is a good measure a posteriori, but it is difficult to estimate early on
FP provides a size metric based on early design information; hence it is more accurate a priori
FP counting cannot be automated and is subjective… even though training reduces the estimate variation
Objections to Size Features
Although particular size features may have their advantages in certain scenarios, there is strong opposition…
“Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.” Bill Gates
“This (referring to LOC) is a very costly measuring unit because it encourages the writing of insipid code, but today I am less interested in how foolish a unit it is from even a pure business point of view.” E. W. Dijkstra
So we question: under what conditions are size features actually a “must”, and can we compensate for their absence?
So let’s check…
If we throw away size attributes, what happens?
If we remove “size”, what happens?
Datasets: Cocomo81, Cocomo81o, Cocomo81e, Cocomo81s, Nasa93, Nasa93c1, Nasa93c2, Nasa93c5, Sdr, Desharnais, DesharnaisL1, DesharnaisL2, DesharnaisL3
Error Measures
MAR
MMRE
MdMRE
Pred(25)
MMER
MBRE
MIBRE
Methods
CART
1NN
Compare standard successful methods run on reduced and full data sets, using 7 error measures and 13 data sets…
Full data sets include size features
Reduced data sets lack size features
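As a rough illustration (not the authors' code), three of the listed error measures can be sketched in Python; `actual` and `predicted` below are made-up effort values:

```python
# Sketches of three standard SEE error measures: MAR (mean absolute
# residual), MMRE (mean magnitude of relative error), and Pred(25)
# (fraction of estimates within 25% of the actual value).

def mar(actual, predicted):
    """Mean absolute residual: mean(|actual - predicted|)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mmre(actual, predicted):
    """Mean magnitude of relative error: mean(|actual - predicted| / actual)."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

def pred(actual, predicted, level=0.25):
    """Fraction of estimates whose relative error is at most `level`."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) / a <= level)
    return hits / len(actual)

actual = [100.0, 200.0, 50.0]
predicted = [110.0, 150.0, 50.0]
print(mar(actual, predicted))   # 20.0
print(mmre(actual, predicted))
print(pred(actual, predicted))  # 1.0
```

MdMRE, MMER, MBRE and MIBRE follow the same pattern (median instead of mean, inverted or balanced relative errors).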
Evaluation (cntd.)
Datasets: Cocomo81, Cocomo81o, Cocomo81e, Cocomo81s, Nasa93, Nasa93c1, Nasa93c2, Nasa93c5, Sdr, Desharnais, DesharnaisL1, DesharnaisL2, DesharnaisL3
Error Measures
MAR
MMRE
MdMRE
Pred(25)
MMER
MBRE
MIBRE
Methods
pop1NN
CART
1NN
Using 7 error measures
Compare pop1NN against CART & 1NN
On multiple data sets collected via COCOMO, COCOMOII and FP
Mann-Whitney, 95% confidence
Why CART? Dejaeger et al., TSE 2012
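For illustration only (not the paper's statistical machinery), the U statistic behind such Mann-Whitney win/loss comparisons can be sketched as follows; a real 95% test would also need the p-value, e.g. from `scipy.stats.mannwhitneyu`:

```python
# Pure-Python sketch of the Mann-Whitney U statistic, used to compare
# two samples of error values (e.g., MREs produced by two methods).

def mann_whitney_u(xs, ys):
    """U statistic for sample xs: count of (x, y) pairs with x < y,
    with tied pairs counted as 0.5."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x < y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

print(mann_whitney_u([0.1, 0.2, 0.3], [0.4, 0.5, 0.6]))  # 9.0
```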
Results (full data has “size”, reduced does not)
CART on reduced-dataset vs. CART on full-dataset
Last column shows total loss count of CART run on reduced dataset (i.e. no size features)
In 7 of 13 tests, taking out size makes CART perform worse
Results (full data has “size”, reduced does not)
Total loss counts of CART and 1NN run on reduced data vs. their variants run on full data…
Standard methods are better off with the size attributes of the data sets… i.e., they cannot compensate well for the lack of size attributes
(copied from last slide)
New idea
If we prune data irrelevancies, can we survive losing size attributes?
Instance selection
• Chang (1974)
– Most of the instances are uninformative.
– Reduced data sets of size 514, 150, 66 to 34, 14, 6 prototypes.
• Li et al. (2009) – genetic algorithm for instance selection
• Turhan et al. (2009) – instance selection as a filter for cross-company defect data
– See also Kocaguneli et al. (2011)
• Kocaguneli et al. (2011) – variance-based selection
– Dendrogram of clusters: prune sub-trees with large variances
• Keung et al. (2011) – Analogy-X, an instance selection method for analogy-based estimation
• New idea, pop1NN: a very simple instance selector
pop1NN: the urchin shape
We propose that a “popularity”-based method can compensate for the lack of size features
The “popularity” of an instance is the number of times it is the nearest-neighbor of other instances
A sea urchin is a good metaphor for SEE data: popular central instances that are the closest neighbors of scattered surrounding instances…
Formally, this is rNN
• rNN = Reverse Nearest Neighbor
– E.g., how many residential areas would find a new store as their nearest choice.
– E.g., predict the popularity of a new cell phone plan: determine how many profiles have the plan as their best match, against the existing plans in the market.
• Can be computed efficiently (rNN chaining)
– See Lopez-Sastre et al., “Fast Reciprocal Nearest Neighbors Clustering”, Signal Processing, 2012, Vol. 92, pages 270–275
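As a minimal sketch on made-up 1-D data (not the paper's features), an instance's rNN count can be computed by letting every instance vote for its single nearest neighbor:

```python
# Minimal sketch of reverse-nearest-neighbor ("popularity") counting.
# Each instance votes for its nearest neighbor; an instance's popularity
# is the number of votes it receives.

def popularity(points):
    """Return rNN counts: how often each point is another point's 1-NN."""
    counts = [0] * len(points)
    for i, p in enumerate(points):
        # Find the nearest neighbor of point i (excluding itself).
        nearest = min((j for j in range(len(points)) if j != i),
                      key=lambda j: abs(points[j] - p))
        counts[nearest] += 1
    return counts

pts = [0.0, 0.1, 0.2, 5.0]
print(popularity(pts))  # [1, 2, 1, 0] -- the outlier 5.0 gets no votes
```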
So let’s check…
If we (1) throw away size attributes and (2) irrelevant rows,
then what happens?
Details: pop1NN (cntd.)
pop1NN is a 6-step procedure:
1. Calculate distances between every pair of training instances
2. Convert the distances of Step 1 into an ordering of neighbors
3. Mark closest neighbors and calculate popularity
4. Order training instances in decreasing popularity
5. Decide which instances to select
• Experiments with nearest neighbor on a hold-out set
6. Return estimates for the test instances
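The six steps could be sketched roughly as below; this is a hypothetical simplification that uses Euclidean distance and a fixed `keep` cut-off in place of the paper's hold-out experiments of Step 5:

```python
import math

def euclid(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pop1nn(train, train_effort, tests, keep=3):
    """Rough pop1NN sketch: keep the `keep` most popular training rows,
    then answer each test query with its 1-NN among the kept rows."""
    n = len(train)
    # Steps 1-3: distances -> nearest neighbors -> popularity counts.
    pop = [0] * n
    for i in range(n):
        nearest = min((j for j in range(n) if j != i),
                      key=lambda j: euclid(train[i], train[j]))
        pop[nearest] += 1
    # Steps 4-5: order by decreasing popularity and select a subset
    # (the paper selects via hold-out experiments; a fixed cut-off is
    # used here purely for illustration).
    order = sorted(range(n), key=lambda i: -pop[i])[:keep]
    # Step 6: 1-NN estimates for the test instances over the kept rows.
    estimates = []
    for t in tests:
        best = min(order, key=lambda i: euclid(train[i], t))
        estimates.append(train_effort[best])
    return estimates

# Made-up data: three clustered projects plus one outlier.
est = pop1nn([[1, 1], [1, 2], [2, 1], [10, 10]], [5, 6, 7, 50],
             [[1.5, 1.5]], keep=2)
print(est)  # [5] -- the outlier row is pruned before estimation
```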
Results (reduced data)
Loss values of pop1NN (on reduced data) vs. CART and 1NN (on full data)
pop1NN loses 2 out of 13 data sets against 1NN
pop1NN loses 4 out of 13 data sets against CART
Discussion
Conclusions
Successful methods (1NN & CART) cannot compensate for the lack of size attributes very well
Lack of size features decreases their performance on the majority of the data sets
When 1NN is augmented with a popularity-based pre-processor to create pop1NN:
Lack of size features can be tolerated in most of the data sets
Caveat: need to first prune irrelevancies
Size features are essential for standard learners
Practitioners with enough resources to correctly collect size features should do so
In the absence of such resources, pop1NN-like methods can compensate for the lack of size features
Future Work
• pop1NN as a feature selector?
– Lipowezky (1998):
• feature and case selection are similar tasks,
• both remove cells in the hypercube of all instances times all features.
– So it should be possible to convert a case selection mechanism into a feature selector:
• Transpose the data
• Nearby columns are correlated
• Keep columns that are near no other
• Active learning:
– pop1NN does not use dependent variable information.
– It can identify the popular instances of a data set and guide expert reflection on collecting dependent variable information.
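The transpose-based feature-selection idea from Future Work might be prototyped by reusing the same popularity machinery on columns instead of rows; this is purely a hypothetical sketch of the slide's heuristic (keep columns that are no other column's nearest neighbor):

```python
# Hypothetical transpose-based feature selector: treat each column as a
# point, compute column popularity (rNN counts), and keep the columns
# that are no other column's nearest neighbor, i.e. not redundant with
# a nearby (correlated) column.

def select_features(rows):
    cols = list(zip(*rows))  # transpose: features become the "instances"
    n = len(cols)

    def dist(a, b):
        # Squared Euclidean distance is enough for nearest comparisons.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    pop = [0] * n
    for i in range(n):
        nearest = min((j for j in range(n) if j != i),
                      key=lambda j: dist(cols[i], cols[j]))
        pop[nearest] += 1
    return [i for i in range(n) if pop[i] == 0]  # indices of kept columns

# Made-up data: columns 0 and 1 are near-duplicates, column 2 stands alone.
rows = [[1.0, 1.1, 9.0],
        [2.0, 2.1, 3.0],
        [3.0, 3.1, 7.0]]
print(select_features(rows))  # [2]
```

Note that this crude rule drops every member of a correlated group; a practical selector would presumably keep one representative per group.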
Questions? Comments?