

    Software Reuse: Metrics and Models

    WILLIAM FRAKES

    Virginia Tech

    and

    CAROL TERRY

    INCODE Corporation

As organizations implement systematic software reuse programs to improve productivity and quality, they must be able to measure their progress and identify the most effective reuse strategies. This is done with reuse metrics and models. In this article we survey metrics and models of software reuse and reusability, and provide a classification structure that will help users select them. Six types of metrics and models are reviewed: cost-benefit models, maturity assessment models, amount of reuse metrics, failure modes models, reusability assessment models, and reuse library metrics.

Categories and Subject Descriptors: D.2.8 [Software Engineering]: Metrics; D.2.m [Software Engineering]: Miscellaneous - reusable software; K.6.0 [Management of Computing and Information Systems]: General - economics; K.6.4 [Management of Computing and Information Systems]: System Management - quality assurance

    General Terms: Economics, Measurement, Performance

Additional Key Words and Phrases: Cost-benefit analysis, maturity assessment, reuse level, object-oriented, software reuse failure modes model, reusability assessment, reuse library metrics, software, reuse, reusability, models, economics, quality, productivity, definitions

    1. INTRODUCTION

Software reuse, the use of existing software artifacts or knowledge to create new software, is a key method for significantly improving software quality and productivity. Reusability is the degree to which a thing can be reused. To achieve significant payoffs a reuse program must be systematic [Frakes and Isoda 1994]. Organizations implementing systematic software reuse programs must be able to measure their progress and identify the most effective reuse strategies.

In this article we survey metrics and models of software reuse and reusability. A metric is a quantitative indicator of an attribute of a thing. A model specifies relationships among metrics. In

Authors' addresses: W. Frakes, Virginia Tech, Computer Science Department, 2990 Telestar Court, Falls Church, VA 20165; email: [email protected]; C. Terry, INCODE Corp., 1100 S. Main St., Blacksburg, VA 24060; email: [email protected].

Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. © 1996 ACM 0360-0300/96/0600-0415 $03.50

    ACM Computing Surveys, Vol. 28, No. 2, June 1996


Figure 1, reuse models and metrics are categorized into types: (1) reuse cost-benefit models, (2) maturity assessment, (3) amount of reuse, (4) failure modes, (5) reusability, and (6) reuse library metrics. Reuse cost-benefit models include economic cost/benefit analysis as well as quality and productivity payoff. Maturity assessment models categorize reuse programs by how advanced they are in implementing systematic reuse. Amount of reuse metrics are used to assess and monitor a reuse improvement effort by tracking percentages of reuse for life cycle objects. Failure modes analysis is used to identify and order the impediments to reuse in a given organization. Reusability metrics indicate the likelihood that an artifact is reusable. Reuse library metrics are used to manage and track usage of a reuse repository. Organizations often encounter the need for these metrics and models in the order presented.

Software reuse can apply to any life cycle product, not only to fragments of source code. This means that developers can pursue reuse of requirements documents, system specifications, design structures, and any other development artifact [Barnes and Bollinger 1991]. Jones [1993] identifies ten potentially reusable aspects of software projects as shown in Table 1. In addition to these life cycle products, processes (such as the waterfall model of software development and the

    Figure 1. Categorization of reuse metrics and models.

CONTENTS

1. INTRODUCTION
   1.1 Types of Reuse
2. COST BENEFIT ANALYSIS
   2.1 Cost/Productivity Models
   2.2 Quality of Investment
   2.3 Business Reuse Metrics
   2.4 Relation of Reuse to Quality and Productivity
3. MATURITY ASSESSMENT
   3.1 Koltun and Hudson Reuse Maturity Model
   3.2 SPC Reuse Capability Model
4. AMOUNT OF REUSE
   4.1 Reuse Level
   4.2 Reuse Metrics for Object-Oriented Systems
   4.3 Reuse Predictions for Life Cycle Objects
5. SOFTWARE REUSE FAILURE MODES MODEL
6. REUSABILITY ASSESSMENT
7. REUSE LIBRARY METRICS
8. SUMMARY
APPENDIX 1: DEFINITIONS OF TYPES OF REUSE
REFERENCES



SEI capability maturity model), and knowledge are also potentially reusable.

    1.1 Types of Reuse

Clear definitions of types of reuse are necessary prerequisites to measurement. Table 2 provides a faceted classification of reuse definitions gathered from the literature. Each column specifies a facet, with the facet name in bold. Below each facet are its corresponding terms. (See Appendix 1 for detailed definitions of the terms in the table.)

Terms in parentheses are synonyms. Development scope refers to whether the reusable components are from a source external or internal to a project. Modification refers to how much a reusable asset is changed. Approach refers to different technical methods for implementing reuse. Domain scope refers to whether reuse occurs within a family of systems or between families of systems. Management refers to the degree to which reuse is done systematically. Reused entity refers to the type of the reused object.

Reuse in an organization can be defined by selecting appropriate facet-term pairs from this table. For example, an organization might choose to do both internal and external code reuse, allowing only black box modification as part of a compositional approach. They might choose to focus on reuse within their domain (vertical) and to pursue a systematic management approach.

The following sections discuss in detail each of the six types of reuse metrics and models.

    2. COST BENEFIT ANALYSIS

As organizations contemplate systematic software reuse, the first question that will arise will probably concern costs and benefits. Organizations will need to justify the cost and time involved in systematic reuse by estimating these costs and potential payoffs. Cost benefit analysis models include economic cost-benefit models and quality and productivity payoff analyses.

Several reuse cost-benefit models have been reported. None of these models are derived from data, nor have they been validated with data. Instead, the models allow a user to simulate the tradeoffs between important economic parameters such as cost and productivity. These are estimated by setting arbitrary values for cost and productivity measures of systems without reuse, and then estimating these parameters for systems with reuse. There is considerable commonality among the models, as described in the following.

    2.1 Cost/Productivity Models

Gaffney and Durek [1989] propose two cost and productivity models for software reuse. The simple model shows the cost of reusing software components. The cost-of-development model builds upon the simple model by representing the cost of developing reusable components.

The simple model works as follows. Let C be the cost of software development for a given product relative to all new code (for which C = 1). R is the proportion of reused code in the product

    Table 1. Reusable Aspects of Software Projects


(R ≤ 1). (Note that R is a type of reuse level; this topic is discussed in detail in Section 4.) b is the cost, relative to that for all new code, of incorporating the reused code into the new product (b = 1 for all new code). The relative cost for software development is:

[(relative cost of all new code) × (proportion of new code)] + [(relative cost of reused software) × (proportion of reused software)].

The equation for this is:

C = (1)(1 − R) + (b)(R) = (b − 1)R + 1,

and the corresponding relative productivity is:

P = 1/C = 1/[(b − 1)R + 1].

b must be < 1 for reuse to be cost effective. The size of b depends on the life cycle phase of the reusable component. If only the source code is reused, then one must go through the requirements, design, and testing to complete development of the reusable component. In this case, Gaffney and Durek estimate b = 0.85. If requirements, design, and code are reused as well, then only the testing phase must be done and b is estimated to be 0.08.

The cost-of-development model includes the cost of reusing software and the cost of developing reusable components as follows. Let E represent the cost of developing a reusable component relative to the cost of producing a component that is not reusable. E is expected to be > 1 because creating a component for reuse generally requires extra effort. Let n be the number of uses over which the code development cost will be amortized. The new value for C (cost) incorporates these measures:

C = (b + E/n − 1)R + 1.

Other models help a user estimate the effect of reuse on software quality (number of errors) and on software development schedules. Using reusable software generally results in higher overall development productivity; however, the costs of building reusable components must be recovered through many reuses. Some empirical estimates of the relative cost of producing reusable components and for cost recovery via multiple reuses are discussed in the next section.
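The two Gaffney-Durek models can be sketched in a few lines of code. This is an illustrative sketch, not from the paper: the function names and the sample values of R, b, E, and n are my own, and only the formulas themselves come from the models above.

```python
def relative_cost_simple(R, b):
    """Simple model: C = (b - 1) * R + 1, relative to all-new code (C = 1)."""
    return (b - 1) * R + 1

def relative_productivity(R, b):
    """P = 1 / C."""
    return 1.0 / relative_cost_simple(R, b)

def relative_cost_with_development(R, b, E, n):
    """Cost-of-development model: C = (b + E/n - 1) * R + 1.

    E is the relative cost of building the reusable component; n is the
    number of uses over which that cost is amortized.
    """
    return (b + E / n - 1) * R + 1

def payoff_threshold(b, E):
    """Number of uses beyond which reuse pays off: C < 1 requires n > E / (1 - b)."""
    return E / (1 - b)

# A hypothetical product that is half reused code, reusing source code only
# (b = 0.85), with reusable parts twice as costly to build (E = 2):
print(relative_cost_simple(R=0.5, b=0.85))                      # ~0.925
print(relative_cost_with_development(R=0.5, b=0.85, E=2, n=4))  # ~1.175
print(payoff_threshold(b=0.85, E=2))                            # ~13.3 uses
```

Note how amortization changes the picture: with only four uses, the extra development cost makes the product more expensive than all-new code (C > 1), even though the simple model says reuse is cheaper.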

Applications of Cost/Productivity Models. Margono and Rhoads [1993] applied the cost-of-development model to assess the economic benefits of a reuse effort on a large-scale Ada project (the United States Federal Aviation Administration's Advanced Automation System (FAA/AAS)). They applied the model to various types of software categorized by the source (local, commercial, or public) and mode of reuse (reused with or without change). The equation for C in the reuse economics model was modified to reflect acquisition, development, and integration costs. Results show that the cost of developing for reuse is often twice the cost of developing an equivalent nonreusable component.

Table 2. Types of Software Reuse

Favaro [1991] utilized the model from Barnes et al. [1988] to analyze the economics of reuse. (Note: the Barnes et al. [1988] model is the same as that of Gaffney and Durek [1989].) The relevant variables and formulas are shown in Table 3.

Favaro's research team estimated the quantities R and b for an Ada-based development project. They found it difficult to estimate R because it was unclear whether to measure source code or relative size of the load modules. The parameter b was even more difficult to estimate because it was unclear whether cost should be measured as the amount of real time necessary to install the component in the application and whether the cost of learning should be included.

Favaro classified the Booch components according to their relative complexity. The following categories are listed in order of increasing complexity. (The quoted definitions are from Booch [1987].)

Monolithic: Denotes that the structure is always treated as a single unit and that its individual parts cannot be manipulated. Monolithic components do not permit structural sharing; stack, string, queue, deque, ring, map, set, and bag components are monolithic.

Polylithic: Denotes that the structure is not atomic and that its individual parts can be manipulated. Polylithic components permit structural sharing; lists, trees, and graph components are polylithic.

Graph: A collection of nodes and arcs (which are ordered pairs of nodes).

Menu, mask: End-products of the project. They were developed as generalized, reusable components and so were included in the study.

Table 4 shows values reported by Favaro for E, the relative cost of creating a reusable component; b, the integration cost of a reusable component; and N0, the payoff threshold value.

Table 3. Barnes and Bollinger's economic investment model


The overall cost of reusable components relative to nonreusable components is E + b. b is expected to be less than 1.0, since it should be cheaper to integrate a reusable component than to create the component from scratch. E is greater than or equal to 1.0, showing that costs of developing reusable components are higher than costs of developing nonreusable components. Favaro found that the cost of reusability increased with the complexity of the component. Monolithic components were so simple there was no extra cost to develop them as reusable components. In contrast, the cost of the mask component more than doubled as it was generalized. The integration cost b was also high in complex applications. The values for N0 show that the monolithic and polylithic component costs are amortized after only two reuses. However, the graph component must be used approximately five times before its costs are recovered, and the most complex form of the mask will require thirteen reuses for amortization. In summary, costs rise quickly with component size and complexity.

    2.2 Quality of Investment

Barnes and Bollinger [1991] examined the cost and risk features of software reuse and suggested an analytical approach for making good reuse investments. Reuse activities are divided into producer activities and consumer activities. Producer activities are reuse investments, or costs incurred while making one or more work products easier to reuse by others. Consumer activities are reuse benefits, or measures in dollars of how much the earlier reuse investment helped or hurt the effectiveness of an activity. The total reuse benefit can then be found by estimating the reuse benefit for all subsequent activities that profit from the reuse investment.

The quality of investment (Q) is the ratio of reuse benefits (B) to reuse investments (R): Q = B/R. If Q is less than one for a reuse effort, then that effort resulted in a net financial loss. If Q is greater than one, then the investment provided a good return. Three major strategies are identified for increasing Q: increase the level of reuse, reduce the average cost of reuse, and reduce the investment needed to achieve a given reuse benefit.

    2.3 Business Reuse Metrics

Poulin et al. [1993] present a set of metrics used by IBM to estimate the effort saved by reuse. The study weighs potential benefits against the expenditures of time and resources required to identify and integrate reusable software into a product. Although the measures used are similar to the cost/productivity models [Gaffney and Durek 1989] already discussed, metrics are named from a business

    Table 4. Costs and payoff threshold values for reusable components [Favaro 1991]


for subjects programming in the CODE environment and those programming in CODE and ROPE. The data reveal that ROPE had a significant effect on development time for all the experimental programs.

Error rates were used to measure quality. Compile errors, execution errors, and logic errors were all counted. The results are shown in Table 8. The use of ROPE reduced error rates, but the data are less clear than those for productivity. The researchers attribute this to the difficulty of collecting the data and to the lack of distinction between design and code errors.

In summary, the CODE/ROPE experiment showed a high correlation between the measures of reuse rate, development time, and decreases in number of errors.

Card et al. [1986] conducted an empirical study of software design practices in a FORTRAN-based scientific computing environment. The goals of the analysis were to identify the types of software that are reused and to quantify the benefits of software reuse. The results were as follows.

Table 7. Mean Development Time and 95% Confidence Intervals in Hours [Browne et al. 1990]

Table 8. Mean Number of Errors and 95% Confidence Intervals [Browne et al. 1990]

• The modules that were reused without modification tended to be small and simple, exhibiting a relatively low decision rate.

• Extensively modified modules tended to be the largest of all reused software (rated from extensively modified to unchanged) in terms of the number of executable statements.

• 98 percent of the modules reused without modification were fault free, and 82 percent of them were in the lowest cost per executable statement category.

These results were consistent with a previous Software Engineering Laboratory study [Card et al. 1982], which showed that reusing a line of code amounts to only 20 percent of the cost of developing it new.

Matsumura [Frakes 1991] described an implementation of a reuse program at Toshiba. Results of the reuse program showed a 60 percent ratio of reused components and a decrease in errors of 20 to 30 percent. Managers felt that the reuse program would be profitable if a component were reused at least three times.

A study of reuse in the object-oriented environment was conducted by Chen and Lee [1993]. They developed an environment, based on an object-oriented approach, to design, manufacture, and use reusable C++ components. A controlled experiment was conducted to substantiate the reuse approach in terms of software productivity and quality. Results showed improvements in software productivity of 30 to 90 percent, measured in lines of code developed per hour (LOC/hr).

The Cost/Productivity Model by Gaffney and Durek [1989] specified the effect of reuse on software quality (number of errors) and on software development schedules. Results suggested that tradeoffs can occur between the proportion of reuse and the costs of developing and using reusable components. In a study of the latent error content of a software product, the relative error content decreased for each additional use of the software but leveled off between three and four uses. The models show that the number of uses of the reusable software components directly correlates with the development product productivity. Gaffney and Durek believe that the costs of building reusable parts must be shared across many users to achieve higher payoffs from software reuse.

    3. MATURITY ASSESSMENT

Reuse maturity models support an assessment of how advanced reuse programs are in implementing systematic reuse, using an ordinal scale of reuse phases. They are similar to the Capability Maturity Model developed at the Software Engineering Institute (SEI) at Carnegie Mellon University [Humphrey 1989]. A maturity model is at the core of planned reuse, helping organizations understand their past, current, and future goals for reuse activities. Several reuse maturity models have been developed and used, though they have not been validated.

3.1 Koltun and Hudson Reuse Maturity Model

Koltun and Hudson [1991] developed the reuse maturity model shown in Table 9. Columns indicate phases of reuse maturity, assumed to improve along an ordinal scale from 1 (Initial/Chaotic) to 5 (Ingrained). Rows correspond to dimensions of reuse maturity such as Motivation/Culture and Planning for Reuse. For each of the ten dimensions of reuse, the amount of organizational involvement and commitment increases as an organization progresses from initial/chaotic reuse to ingrained reuse. Ingrained reuse incorporates fully automated support tools and accurate reuse measurement to track progress.

To use this model, an organization will assess its reuse maturity before beginning a reuse improvement program by identifying its placement on each dimension. (In our experience, most organizations are between Initial/Chaotic and Monitored at the start of the program.) The organization will then use the model to guide activities


that must be performed to achieve higher levels of reuse maturity. Once an organization achieves Ingrained reuse, reuse becomes part of the business routine and will no longer be recognized as a distinct discipline.

    3.2 SPC Reuse Capability Model

The reuse capability model developed by the Software Productivity Consortium [Davis 1993] has two components: an assessment model and an implementation model.

The assessment model consists of a set of categorized critical success factors (stated as goals) that an organization can use to assess the present state of its reuse practice. The factors are organized into four primary groups: management, application development, asset development, and process and technology factors. For example, in the group Application Development Factors/Asset Awareness and Accessibility is the goal "Developers are aware of and reuse assets that are specifically acquired/developed for their application."

Table 9. Hudson and Koltun Reuse Maturity Model

The implementation model helps prioritize goals and build successive stages of implementation. The SPC identifies four stages in the risk-reduction growth implementation model:

(1) Opportunistic: The reuse strategy is developed on the project level. Specialized reuse tools are used and reusable assets are identified.

(2) Integrated: A standard reuse strategy is defined and integrated into the corporation's software development process. The reuse program is fully supported by management and staff. Reuse assets are categorized.

(3) Leveraged: The reuse strategy expands over the entire life cycle and is specialized for each product line. Reuse performance is measured and weaknesses of the program identified.

(4) Anticipating: New business ventures take advantage of the reuse capabilities and reusable assets. High-payoff assets are identified. The reuse technology is driven by customers' needs.

The SPC continues to evolve the model. It has been used in pilot applications by several organizations, but no formal validation has been done.

    4. AMOUNT OF REUSE

Amount of reuse metrics are used to assess and monitor a reuse improvement effort by tracking percentages of reuse of life cycle objects over time. In general, the metric is:

(amount of life cycle object reused) / (total size of life cycle object).

A common form of this metric is based on lines of code as follows:

(lines of reused code in system or module) / (total lines of code in system or module).

Frakes [1990], Terry [1993], and Frakes and Terry [1994] extend the basic amount of reuse metric by defining reuse level metrics that include factors such as abstraction level of the life cycle objects and formal definitions of internal and external reuse. Work is also underway [Bieman and Karunanithi 1993; Chidamber and Kemerer 1994] to define metrics specifically for object-oriented systems. The following sections discuss this work in detail. To date, little work has been done on amount of reuse metrics for generative reuse, although Biggerstaff [1992] implicitly defines such a metric by reporting a ratio of generated source lines to specification effort for the Genesis system: 40 KSLOC for 30 minutes of specification effort.

    4.1 Reuse Level

The basic dependent variable in software reuse improvement efforts is the level of reuse [Frakes 1993]. Reuse level measurement assumes that a system is composed of parts at different levels of abstraction. The levels of abstraction must be defined to measure reuse. For example, a C-based system is composed of modules (.c files) that contain functions, and functions that contain lines of code. The reuse level of a C-based system, then, can be expressed in terms of modules, functions, or lines of code. A software component (lower level item) may be internal or external. An internal lower level component is one developed for the higher level component. An external lower level component is used by the higher level component, but was created for a different item or for general use.

The following quantities can be calculated given a higher level item composed of lower level items:

L = the total number of lower level items in the higher level item.

E = the number of lower level items from an external repository in the higher level item.

I = the number of lower level items in the higher level item that are not from an external repository.

M = the number of items not from an external repository that are used more than once.

These counts are of unique items (types), not tokens (references). Given these quantities, the following

reuse level metrics are defined [Frakes 1990]:

External Reuse Level: E/L

Internal Reuse Level: M/L

Total Reuse Level: External Reuse Level + Internal Reuse Level.

Internal, external, and total reuse levels will assume values between 0 and 1. More reuse occurs as the reuse level value approaches 1. A reuse level of 0 indicates no reuse.

The user must provide information to calculate these reuse measures. The user must define the abstraction hierarchy, a definition of external repositories, and a definition of the uses relationship. For each part in the parts-based approach, we must know the name of the part, source of the part (internal or external), level of abstraction, and amount of usage.
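Given such a parts list, the counts and reuse levels are straightforward to compute. The sketch below uses an invented flat data structure and invented part names; only the definitions of L, E, M, and the reuse levels come from the model above.

```python
# Each lower level item: (name, source, number of uses in the higher level item)
parts = [
    ("read_config", "internal", 3),   # internal, used more than once
    ("log_error",   "internal", 1),   # internal, used only once
    ("strcmp",      "external", 5),   # from an external repository
    ("qsort",       "external", 1),
]

L = len(parts)  # counts unique items (types), not references (tokens)
E = sum(1 for _, source, _ in parts if source == "external")
M = sum(1 for _, source, uses in parts if source == "internal" and uses > 1)

external_reuse_level = E / L                                     # 2/4 = 0.50
internal_reuse_level = M / L                                     # 1/4 = 0.25
total_reuse_level = external_reuse_level + internal_reuse_level  # 0.75
```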

Terry [1993] extended this formal model by adding a variable reuse threshold level that had been arbitrarily set to one in the original model. The variables and reuse level metrics are:

ITL = internal threshold level, the maximum number of times an internal item can be used before it is reused.

ETL = external threshold level, the maximum number of times an external item can be used before it is reused.

IU = number of internal lower level items that are used more than ITL.

EU = number of external lower level items that are used more than ETL.

T = total number of lower level items in the higher level item, both internal and external.

Internal Reuse Level: IU/T

External Reuse Level: EU/T

Total Reuse Level: (IU + EU)/T.

The reuse frequency metric is based on references (tokens) to reused components, rather than counting components only once as was done for reuse level. The variables and reuse frequency metrics are:

IUF = number of references in the higher level item to reused internal lower level items.

EUF = number of references in the higher level item to reused external lower level items.

TF = total number of references to lower level items in the higher level item, both internal and external.

Internal Reuse Frequency: IUF/TF

External Reuse Frequency: EUF/TF

Total Reuse Frequency: (IUF + EUF)/TF.
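The level/frequency distinction (types versus tokens) can be sketched as follows. The item names, use counts, and threshold values are invented; in particular, ETL = 0, which counts any use of an external item as reuse, is my choice for the example, not a prescription from the model.

```python
# source and number of references for each lower level item
use_counts = {
    "parse":  ("internal", 4),
    "render": ("internal", 1),
    "strcpy": ("external", 6),
    "malloc": ("external", 2),
}

ITL, ETL = 1, 0  # internal and external threshold levels

T  = len(use_counts)  # unique items (types)
IU = sum(1 for src, n in use_counts.values() if src == "internal" and n > ITL)
EU = sum(1 for src, n in use_counts.values() if src == "external" and n > ETL)

total_reuse_level = (IU + EU) / T      # 3/4

# Frequency counts references (tokens) rather than unique items (types):
TF  = sum(n for _, n in use_counts.values())
IUF = sum(n for src, n in use_counts.values() if src == "internal" and n > ITL)
EUF = sum(n for src, n in use_counts.values() if src == "external" and n > ETL)

total_reuse_frequency = (IUF + EUF) / TF   # 12/13, vs. a reuse level of 3/4
```

Because the heavily referenced items dominate the token count, the frequency (12/13) is higher than the level (3/4) for the same system, which is exactly the distinction the two metrics are meant to capture.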

Program size is often used as a measure of complexity. The complexity weighting for internal reuse is the sum of the sizes of all reused internal lower level items divided by the sum of the sizes of all internal lower level items within the higher level item. An example size weighting for internal reuse in a C system is the ratio of the size (calculated in number of lines of noncommentary source code) of reused internal functions to the size of all internal functions in the system.

The software tool rl calculates reuse level and frequency for C code. Given a set of C files, rl reports the following information:

(1) internal reuse level;
(2) external reuse level;
(3) total reuse level;
(4) internal reuse frequency;
(5) external reuse frequency;
(6) total reuse frequency;
(7) complexity (size) weighting for internal functions.

The user may specify an internal and external threshold level. The software allows multiple definitions of higher level and lower level abstractions. The allowed higher level abstractions are system, file, or function. The lower level abstractions are function and NCSL (Non-Commentary Source Lines of code).

4.2 Reuse Metrics for Object-Oriented Systems

Bieman [1992] and Bieman and Karunanithi [1993] have proposed reuse metrics for object-oriented systems. Bieman [1992] identifies three perspectives from which to view reuse: server, client, and system. The server perspective is the perspective of the library or a particular library component. The analysis focuses on how the entity is reused by the clients. From the client perspective, the goal is knowing how a particular program entity reuses other program entities. The system perspective is a view of reuse in the overall system, including servers and clients.

The server reuse profile of a class characterizes how the class is reused by the client classes. Verbatim server reuse in an object-oriented system is basically the same as in procedural systems, using object-oriented terminology. Leveraged server reuse is supported through inheritance. A client can reuse the server either by extension, adding methods to the server, or by overload, redefining methods. (Note that McGregor and Sykes [1992] offer good definitions of the object-oriented terminology used in this section.)

The client reuse profile characterizes how a new class reuses existing library classes. It too can be verbatim or leveraged, with similar definitions to the server perspective.

Measurable system reuse attributes include:

• % of new system source text imported from the library;

• % of new system classes imported verbatim from the library;

• % of new system classes derived from library classes, and the average % of the leveraged classes that is imported;

• average number of verbatim and leveraged clients for servers, and servers for clients;

• average number of verbatim and leveraged indirect clients for servers, and indirect servers for clients;

• average length and number of paths between indirect servers and clients for verbatim and leveraged reuse.

Bieman and Karunanithi [1993] describe a prototype tool that is under development to collect the proposed measures from Ada programs. This work recognizes the differences between object-oriented systems and procedural systems, and exploits those differences through unique measurements.

Chidamber and Kemerer [1994] propose a metrics suite for object-oriented design. The paper defines the following metrics based on measurement theory:

    (1) Weighted methods per class;

    (2) Depth of inheritance tree;

    (3) Number of children;

(4) Coupling between object classes; and

    (5) Responses for a class.

Among these, the most significant to reuse is the metric depth of inheritance tree. This metric calculates the length of inheritance hierarchies. Shallow hierarchies forsake reusability for the simplicity of understanding, thus reducing the extent of method reuse within an application. Chidamber and Kemerer assert that this metrics suite can help manage reuse opportunities by measuring inheritance.
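For a single-inheritance hierarchy, depth of inheritance tree (DIT) is simply the number of ancestors between a class and the root. The following sketch computes it for an invented hierarchy; the class names are illustrative, not from the metrics suite.

```python
# Hypothetical single-inheritance hierarchy: child -> parent (None = root).
parents = {
    "Shape": None,
    "Polygon": "Shape",
    "Rectangle": "Polygon",
    "Square": "Rectangle",
}

def dit(cls):
    """Depth of inheritance tree: count ancestors up to the root."""
    depth = 0
    while parents[cls] is not None:
        cls = parents[cls]
        depth += 1
    return depth

for c in parents:
    print(c, dit(c))   # Shape 0, Polygon 1, Rectangle 2, Square 3
```

Note that with multiple inheritance, DIT is defined as the length of the longest path to the root, which this simplified sketch does not handle.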

    428 W. Frakes and C. Terry

    ACM Computing Surveys, Vol. 28, No. 2, June 1996


4.3 Reuse Predictions for Life Cycle Objects

Frakes and Fox [1995] present models that allow the prediction of reuse levels for one life cycle object based on reuse levels for other life cycle objects. Such models can be used to estimate reuse levels of later life cycle objects, such as code, from earlier ones, such as requirements. The authors define both temporal and nontemporal models, and discuss methods for tailoring the models for a specific organization. The models are based on reuse data collected from 113 respondents from 29 organizations, primarily in the US. The study found that there were significant correlations between the reuse levels of life cycle objects, and that prediction models improved as data from more life cycle objects were added to the models.

5. SOFTWARE REUSE FAILURE MODES MODEL

Implementing systematic reuse is difficult, involving both technical and nontechnical factors. Failure modes analysis provides an approach to measuring and improving a reuse process based on a model of the ways a reuse process can fail. The reuse failure modes model reported by Frakes and Fox [1996] can be used to evaluate the quality of a systematic reuse program, to determine reuse impediments in an organization, and to devise an improvement strategy for a systematic reuse program.

Given the many factors that may affect reuse success, how does an organization decide which ones to address in its reuse improvement program? This question can be answered by finding out why reuse is not taking place in the organization. This can be done by considering reuse failure modes, that is, the ways that reuse can fail.

The reuse failure modes model has seven failure modes corresponding to the steps a software engineer will need to complete in order to reuse a component. The failure modes are:

    No Attempt to Reuse

    Part Does Not Exist

    Part Is Not Available

    Part Is Not Found

    Part Is Not Understood

    Part Is Not Valid

    Part Can Not Be Integrated

Each failure mode has failure causes associated with it. No Attempt to Reuse, for example, has among its failure causes resource constraints, no incentive to reuse, and lack of education. To use the model, an organization gathers data on reuse failure modes and causes, and then uses this information to prioritize its reuse improvement activities.
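The data-gathering and prioritization step can be sketched as a simple tally. The individual failure records below are invented; the mode names are those of the model.

```python
# Sketch: rank failure modes by frequency to prioritize reuse improvement.
# Each entry represents one failed reuse attempt, tagged with its mode.
from collections import Counter

failures = [
    "No Attempt to Reuse", "Part Is Not Found", "No Attempt to Reuse",
    "Part Can Not Be Integrated", "No Attempt to Reuse", "Part Is Not Found",
    "Part Is Not Understood",
]

# Most frequent modes come first; these become the first targets of the
# improvement program.
for mode, count in Counter(failures).most_common():
    print(f"{count:2d}  {mode}")
```

In this invented data set, No Attempt to Reuse dominates, so the organization would investigate its causes (incentives, education, resources) first.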

    6. REUSABILITY ASSESSMENT

Another important reuse measurement area concerns the estimation of reusability for a component. Such metrics are potentially useful in two key areas of reuse: reuse design and reengineering for reuse. The essential question is, are there measurable attributes of a component that indicate its potential reusability? If so, then these attributes will be goals for reuse design and reengineering. One of the difficulties in this area is that attributes of reusability are often specific to given types of reusable components, and to the languages in which they are implemented. In this section we review work in this area.

In a study of NASA software, Selby [1989] identified several module attributes that distinguished black-box reuse modules from others in his sample. The attributes included:

    Fewer module calls per source line;

    Fewer I/O parameters per source line;

    Fewer read/write statements per line;

Higher comment-to-code ratios;

More utility function calls per source line;

    Fewer source lines.



These data suggest that modules possessing these attributes will be more reusable.
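Two of Selby's indicators, module size in source lines and the comment-to-code ratio, can be computed with a naive line classifier. The counting rules here are simplified assumptions; the sample module is invented.

```python
# Sketch: compute size and comment-to-code ratio for one module using a
# naive rule: a non-blank line is a comment iff it starts with '#'.
module_source = """\
# Reverse a list without library calls.
# Pure function: no I/O, no globals.
def reverse(xs):
    out = []
    for x in xs:
        out.insert(0, x)
    return out
"""

lines = [ln.strip() for ln in module_source.splitlines() if ln.strip()]
comment_lines = sum(1 for ln in lines if ln.startswith("#"))
code_lines = len(lines) - comment_lines

comment_ratio = comment_lines / code_lines
print(f"source lines: {code_lines}, comment/code ratio: {comment_ratio:.2f}")
```

By Selby's indicators, a small module with a high comment ratio like this one would be a better black-box reuse candidate than a large, sparsely commented one.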

Basili et al. [1990] reported on two reuse studies of systems written in Ada. The first study defined measures of data bindings to characterize and identify reusable components. Data binding is the sharing of global data between program elements [Hutchins and Basili 1985]. Basili et al. [1990] proposed a method to identify data bindings within a program. After identification of the bindings, a cluster analysis computes and weights coupling strengths. A coupling is based on references to variables and parameters (data bindings). Aliasing, or referencing, is not taken into account; only one level of data bindings is considered. The cluster analysis identifies which modules are strongly coupled and may not be good candidates for reuse, and which modules are found to be independent of others and are potentially reusable. A similar use of module coupling information to identify potential objects in C code has been reported by Dunn and Knight [1991].
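The core idea, counting shared variable references between module pairs, can be sketched as follows. The module and variable names are invented, and the unweighted count stands in for the study's weighted coupling strengths.

```python
# Sketch: count data bindings between module pairs - here simply the
# number of global variables both modules reference.
from itertools import combinations

refs = {                      # module -> global variables it references (invented)
    "parser":  {"symtab", "errcnt"},
    "emitter": {"symtab", "outbuf"},
    "report":  {"errcnt"},
}

bindings = {
    (p, q): len(refs[p] & refs[q])
    for p, q in combinations(sorted(refs), 2)
}

# Pairs with many bindings are strongly coupled (poor reuse candidates);
# pairs with none are independent and potentially reusable in isolation.
print(bindings)
```

A cluster analysis would then group modules by these coupling strengths rather than just listing pairwise counts.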

The second study defines an abstract measurement of reusability of Ada components. Potentially reusable software is identified, and a method to measure distances from that ideal is defined. By measuring the amount of change necessary to convert an existing program into one composed of maximally reusable components, an indication of the reusability of the program can be obtained.

    7. REUSE LIBRARY METRICS

As can be seen in Figure 2, a reuse library is a repository for storing reusable assets, plus an interface for searching the repository. Library assets can be obtained from existing systems through reengineering, designed and built from scratch, or purchased. Components then are usually certified, a process for assuring that they have desired attributes such as testing of a certain type. The components are then classified so that users can effectively search for them. The most common classification schemes are enumerated, faceted, and free-text indexing [Frakes and Gandel 1990]. The evaluation criteria for indexing schemes of reuse libraries are: costs, searching effectiveness, support for understanding, and efficiency.

    Figure 2. Reuse library system.



Indexing costs include the cost of creating a classification scheme, maintaining the classification scheme, and updating the database for the scheme.

These concerns are negligible for a small collection, but become more important as the collection grows. Each of these costs can be measured in terms of dollars or effort. These costs can be normalized by dividing total costs by the number of components handled by the library.

Searching effectiveness addresses how well the classification methods help users locate reusable components, and is usually measured with recall and precision. Recall is the number of relevant items retrieved divided by the number of relevant items in the database. The denominator is generally not known and must be estimated using sampling methods. Precision is the number of relevant items retrieved divided by the total number of items retrieved. Another effectiveness measure is overlap, which reports the percentage of relevant documents retrieved jointly by two methods. Frakes and Pole [1994], in a study of representation methods for reusable software, reported no statistically significant difference in terms of recall and precision between enumerated, faceted, free-text keyword, and attribute-value classification schemes. They reported high overlap measures ranging from .72 to .85 for all pairs of the classification methods.
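These measures can be computed directly from sets of component identifiers. The component names below are invented, and the overlap formula shown (jointly retrieved relevant items over all relevant items retrieved by either method) is one reasonable reading of the definition above, not necessarily the exact formula Frakes and Pole used.

```python
# Sketch: recall, precision, and overlap for a reuse-library search.
relevant    = {"stack", "queue", "list", "btree"}   # all relevant items in the library
retrieved_a = {"stack", "queue", "hash", "heap"}    # results of classification method A
retrieved_b = {"stack", "list", "hash"}             # results of classification method B

recall    = len(relevant & retrieved_a) / len(relevant)       # 2/4 = 0.50
precision = len(relevant & retrieved_a) / len(retrieved_a)    # 2/4 = 0.50

# Overlap between the two methods, over relevant retrieved items only.
rel_a, rel_b = relevant & retrieved_a, relevant & retrieved_b
overlap = len(rel_a & rel_b) / len(rel_a | rel_b)             # 1/3

print(recall, precision, round(overlap, 2))
```

In practice the `relevant` set is unknown and, as the text notes, its size must be estimated by sampling.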

To reuse a component, a software engineer must not only find it, but also understand it. Understanding metrics measure how well a classification method helps users understand reusable components. Frakes and Pole [1994] used a 7-point ordinal scale (7 = best) to measure support for understanding. They reported no significant difference between the classification methods using this metric.

In order to be useful, a reuse library must also be efficient. Library efficiency deals with nonfunctional requirements such as memory usage, indexing file size, and retrieval speed. Memory usage can be measured by the number of bytes needed to store the collection. Indexing overhead ratios can be calculated by dividing the sum of the sizes of the raw data and indexing files by the size of the raw data. Retrieval speed is usually calculated by measuring the time it takes the system to execute a search of a given query on a given database.
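Two of these efficiency measures can be sketched as follows. The file sizes, the overhead-ratio definition (total stored bytes relative to the raw collection), and the toy search function are all assumptions for illustration.

```python
# Sketch: indexing overhead ratio and retrieval-speed measurement.
import time

raw_bytes   = 4_000_000   # size of the raw component collection (invented)
index_bytes = 1_000_000   # size of the indexing files (invented)

overhead_ratio = (raw_bytes + index_bytes) / raw_bytes   # 1.25

def search(query):
    """Stand-in for the library's real search engine."""
    return [c for c in ["stack", "queue"] if query in c]

start = time.perf_counter()
hits = search("que")
elapsed = time.perf_counter() - start

print(f"overhead {overhead_ratio:.2f}, {len(hits)} hits in {elapsed:.6f}s")
```

Timing a fixed query against a fixed database, as here, is what makes retrieval-speed figures comparable across systems.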

Quality of assets is another important aspect of a reuse library. There is considerable anecdotal evidence that this is the most important factor in determining successful use of a reuse library. Frakes and Nejmeh [1987] proposed the following metrics as indicators of the quality of assets in a reuse library.

Time in Use: the module should have been used in one or more systems that have been released to the field for a period of three months.

Reuse Statistics: the extent to which the module has been successfully reused by others is perhaps the best indicator of module quality.

Reuse Reviews: favorable reviews from those who have used the module are a good indication that the module is of higher quality.

Complexity: overly complex modules may not be easy to modify or maintain.

Inspection: the module should have been inspected.

Testing: the module should have been thoroughly tested at the unit level, with statement coverage of 100 percent and branch coverage of at least 80 percent.
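A library could apply these indicators as a crude certification gate. The thresholds for time in use and coverage come from the text above; the record format and the particular checks combined are assumptions for illustration.

```python
# Sketch: a quality gate over some of the Frakes and Nejmeh indicators.
asset = {                         # invented record for one candidate module
    "months_in_field": 5,
    "statement_coverage": 1.00,   # fraction of statements covered by unit tests
    "branch_coverage": 0.85,      # fraction of branches covered
    "inspected": True,
}

passes = (
    asset["months_in_field"] >= 3          # released to the field for 3+ months
    and asset["statement_coverage"] >= 1.00
    and asset["branch_coverage"] >= 0.80
    and asset["inspected"]
)
print("certified" if passes else "rejected")
```

The remaining indicators (reuse statistics, reviews, complexity) are comparative rather than threshold-based, so they are better used for ranking assets than for a pass/fail gate.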

Another class of reuse library metrics is used to measure usage of a reuse library system. The following list was supplied by the ASSET system, an on-line commercial reuse library system.

Time on-line (system availability): this is a measure of the number of hours the system is available for use.

Account demographics: assigns users to the following categories: government/contractor, commercial, academia, non-US;

Table 10. Summary of Software Reuse Metrics and Models

Table A1. Definitions of Reuse Types

Type of library function performed: searches, browses, extracts;

Asset distribution: whether electronic or via a human librarian;

    Number of subscriber accounts;

Available assets by type: interoperation, supplier listings, local components;

Number of extractions by collection: number of assets extracted by collection, number of assets extracted by evaluation level;

Listing of assets by domain;

Request for services by: Telnet logins, modem logins, FTP, World Wide Web.

These metrics can provide good management information for a library system. They can be used to demonstrate the value of the library to management as well as to provide information for continuous quality improvement.

    8. SUMMARY

A reuse program must be planned, deliberate, and systematic if it is to give large payoffs. As organizations implement systematic software reuse programs in an effort to improve productivity and quality, they must be able to measure their progress and identify the most effective reuse strategies. In this article we surveyed metrics and models of software reuse. A metric is a quantitative indicator of an attribute. A model specifies relationships between metrics.

Table 10 presents a summary of the reuse metrics and models discussed. As is typical in an emerging discipline such as systematic software reuse, many of these metrics and models still lack formal validation. Despite this, they are being used and are being found useful in industrial practice.

    Appendix 1: Definitions of Types of Reuse

The terms in Table A1 describe various reuse issues. Some terms in the table overlap in meaning. For example, the table terms public and external both describe the part of a product that was constructed externally; private and internal describe the part of a product that was not constructed externally but was developed and reused within a single product. The terms verbatim and black-box both describe reuse without modification; leveraged and white-box describe reuse with modification. The final four terms in the table describe levels of reuse that can occur in the object-oriented paradigm. (References in descriptions in Table A1 are provided for further information. They do not necessarily indicate first use of the term.)

    REFERENCES

AGRESTI, W. AND EVANCO, W. 1992. Projecting software defects in analyzing Ada designs. IEEE Trans. Softw. Eng. 18, 11, 988–997.

BARNES, B. ET AL. 1988. A framework and economic foundation for software reuse. In IEEE Tutorial: Software Reuse: Emerging Technology, W. Tracz, Ed. IEEE Computer Society Press, Washington, D.C.

BARNES, B. AND BOLLINGER, T. 1991. Making software reuse cost effective. IEEE Softw. 1, 13–24.

BASILI, V. R., ROMBACH, H. D., BAILEY, J., AND DELIS, A. 1990. Ada reusability and measurement. Computer Science Tech. Rep. Series, University of Maryland, May.

BIEMAN, J. 1992. Deriving measures of software reuse in object-oriented systems. In BCS-FACS Workshop on Formal Aspects of Measurement, Springer-Verlag.

BIEMAN, J. AND KARUNANITHI, S. 1993. Candidate reuse metrics for object-oriented and Ada software. In Proceedings of the IEEE-CS First International Software Metrics Symposium.

BIGGERSTAFF, T. 1992. An assessment and analysis of software reuse. In Advances in Computers, Marshall Yovits, Ed. Academic Press, New York.

BOOCH, G. 1987. Software Components with Ada. Benjamin/Cummings, Menlo Park, CA.

BROWNE, J., LEE, T., AND WERTH, J. 1990. Experimental evaluation of a reusability-oriented parallel programming environment. IEEE Trans. Softw. Eng. 16, 2, 111–120.

CARD, D., MCGARRY, F., PAGE, G., ET AL. 1982. The software engineering laboratory. NASA/GSFC, 2.

CARD, D., CHURCH, V., AND AGRESTI, W. 1986. An empirical study of software design practices. IEEE Trans. Softw. Eng. 12, 2, 264–270.

CHEN, D. AND LEE, P. 1993. On the study of software reuse: using reusable C components. J. Syst. Softw. 20, 1, 19–36.

CHIDAMBER, S. AND KEMERER, C. 1994. A metrics suite for object-oriented design. IEEE Trans. Softw. Eng. 20, 6, 476–493.

DAVIS, T. 1993. The reuse capability model: a basis for improving an organization's reuse capability. In Proceedings of the Second International Workshop on Software Reusability (Herndon, VA).

DUNN, M. F. AND KNIGHT, J. C. 1991. Software reuse in an industrial setting: A case study. In Proceedings of the Thirteenth International Conference on Software Engineering, IEEE Computer Society Press, Austin, TX, 329–338.

FAVARO, J. 1991. What price reusability? A case study. Ada Lett. (Spring), 115–124.

FENTON, N. 1991. Software Metrics: A Rigorous Approach. Chapman & Hall, London.

FRAKES, W. 1993. Software reuse as industrial experiment. Am. Program. (Sept.), 27–33.

FRAKES, W. (Moderator). 1991. Software reuse: is it delivering? In Proceedings of the Thirteenth International Conference on Software Engineering (Los Alamitos, CA). IEEE Computer Society Press, Los Alamitos, CA.

FRAKES, W. 1990. An empirical framework for software reuse research. In Proceedings of the Third Workshop on Tools and Methods for Reuse (Syracuse, NY).

FRAKES, W. AND FOX, C. 1995. Modeling reuse across the software lifecycle. J. Syst. Softw. 30, 3, 295–301.

FRAKES, W. AND FOX, C. 1996. Quality improvement using a software reuse failure modes model. IEEE Trans. Softw. Eng. 24, 4 (April), 274–279.

FRAKES, W. AND GANDEL, P. 1990. Representing reusable software. Inf. Softw. Technol. 32, 10, 653–664.

FRAKES, W. AND ISODA, S. 1994. Success factors of systematic reuse. IEEE Softw. 11, 5, 14–19.

FRAKES, W. B. AND NEJMEH, B. A. 1987. Software reuse through information retrieval. In Proceedings of the Twentieth Annual Hawaii International Conference on Systems Sciences. Kona, Jan., 530–535.

FRAKES, W. AND POLE, T. 1994. An empirical study of representation methods for reusable software components. IEEE Trans. Softw. Eng. 20, 8, 617–630.

FRAKES, W. AND TERRY, C. 1994. Reuse level metrics. In Proceedings of the Third International Conference on Software Reuse (Rio de Janeiro), W. Frakes, Ed., IEEE Computer Society Press, Los Alamitos, CA, 139–148.

GAFFNEY, J. E. AND DUREK, T. A. 1989. Software reuse: key to enhanced productivity: some quantitative models. Inf. Softw. Technol. 31, 5, 258–267.

HUMPHREY, W. 1989. Managing the Software Process. Addison-Wesley, Reading, MA.

HUTCHINS, D. H. AND BASILI, V. 1985. System structure analysis: Clustering with data bindings. IEEE Trans. Softw. Eng. 11, 8, 749–757.

JONES, C. 1993. Software return on investment preliminary analysis. Software Productivity Research, Inc.

KOLTUN, P. AND HUDSON, A. 1991. A reuse maturity model. In Fourth Annual Workshop on Software Reuse (Herndon, VA).

    LILLIE. 1995. Personal communication.

MARGONO, T. AND RHOADS, T. 1993. Software reuse economics: cost-benefit analysis on a large-scale Ada project. In International Conference on Software Engineering. ACM, New York.

MCGREGOR, J. AND SYKES, D. 1992. Object-Oriented Software Development: Engineering Software for Reuse. Van Nostrand Reinhold, New York.

OGUSH, M. 1992. A software reuse lexicon. Crosstalk (Dec.).

POULIN, J. S., CARUSO, J. M., AND HANCOCK, D. R. 1993. The business case for software reuse. IBM Syst. J. 32, 4, 567–594.

PRIETO-DIAZ, R. 1993. Status report: Software reusability. IEEE Softw. (May), 61–66.

SELBY, R. W. 1989. Quantitative studies of software reuse. In Software Reusability, Volume II, T. J. Biggerstaff and A. J. Perlis, Eds., Addison-Wesley, Reading, MA.

TERRY, C. 1993. Analysis and implementation of software reuse measurement. Virginia Polytechnic Institute and State University, Master's Project and Report.

    Received April 1994; revised October 1995; accepted November 1995
