
Page 1: Software Metrics - Data Collection

What is good data?

•Are they correct?

•Are they accurate?

•Are they appropriately precise?

•Are they consistent?

•Are they associated with a particular activity or time period?

•Can they be replicated?

Page 2: Software Metrics - Data Collection

How to Define the Data?

[Diagram: attributes of processes, products, and resources are measured; data collection yields raw data, extraction refines it into refined data (the essential data), and analysis produces derived attribute values.]

Page 3: Software Metrics - Data Collection

How to Define the Data?

First step: what to measure?

Direct measure or indirect measure

Page 4: Software Metrics - Data Collection

How to Define the Data?

Deciding what to measure - GQM (Goal-Question-Metric)

List the major goals of the development or maintenance project

Derive from each goal the questions that must be answered to determine if the goals are being met

Decide what must be measured in order to be able to answer the questions adequately

Page 5: Software Metrics - Data Collection

GQM (Goal-Question-Metric)

Goal: Evaluate effectiveness of coding standard

Questions: Who is using the standard? What is the coders' productivity? What is the code quality?

Metrics: Proportion of coders using the standard, experience of coders, code size (function points, lines of code), errors
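A minimal sketch of how this GQM example could be recorded as a data structure, assuming Python; the class and field names are illustrative, not part of the slides:

    from dataclasses import dataclass, field

    @dataclass
    class Question:
        text: str
        metrics: list[str] = field(default_factory=list)

    @dataclass
    class Goal:
        statement: str
        questions: list[Question] = field(default_factory=list)

    # GQM tree for the coding-standard goal above
    goal = Goal(
        statement="Evaluate effectiveness of the coding standard",
        questions=[
            Question("Who is using the standard?",
                     ["Proportion of coders using the standard", "Experience of coders"]),
            Question("What is the coders' productivity?",
                     ["Code size (function points, lines of code)"]),
            Question("What is the code quality?",
                     ["Errors per unit of code size"]),
        ],
    )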

Page 6: Software Metrics - Data Collection

Software Quality

•How many problems have been found with a product?

•Whether the product is ready for release to the next development stage or to the customer?

•How the current version of a product compares in quality with previous or competing versions?

•How effective are the prevention, detection, and removal processes?

Page 7: Software Metrics - Data Collection

Software Quality

The terminology is not uniform!!

Measure - For example, an organization may measure its software quality in terms of faults per thousand lines of code (KLOC).
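As a small illustration of such a measure, assuming Python and invented figures:

    # Hypothetical figures: 24 faults found in 12,500 lines of code
    faults = 24
    loc = 12_500

    faults_per_kloc = faults / (loc / 1000)
    print(f"{faults_per_kloc:.2f} faults/KLOC")   # 1.92 faults/KLOC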

Page 8: Software Metrics - Data Collection

Software Quality - Terminology

Human Error → Fault → Failure

IEEE standard 729

A fault occurs when a human error results in a mistake in some software product (incorrect code, as well as incorrect instructions in the user manual).

A failure is the departure of a system from its required behavior.

Page 9: Software Metrics - Data Collection

Software Quality - Terminology

Errors - often means faults

Anomalies - A class of faults that are unlikely to cause failures in themselves but may nevertheless eventually cause failures indirectly. Ex. Non-meaningful variable name

Defects - normally refer collectively to faults and failures

Bugs - faults occurring in the code

Crashes - a special type of failure, where the system ceases to function.

Page 10: Software Metrics - Data Collection

Software Quality - Terminology

The reliability of a software system is defined in terms of failures observed during operation, rather than in terms of faults.

Page 11: Software Metrics - Data Collection

Software Quality

Key elements to record when a problem is observed

•Location - where did the problem occur
•Timing - when did it occur
•Symptom - what was observed
•End Result - which consequences resulted
•Mechanism - how did it occur
•Cause - why did it occur
•Severity - how much was the user affected
•Cost - how much did it cost

Page 12: Software Metrics - Data Collection

Failure Report

Location – installation where failure was observed

Timing – CPU time, clock time or some temporal measure

Symptom – Type of error message or indication of failure

End Result – Description of failure, such as “operating system crash,” “services degraded,” “loss of data,” “wrong output,” “no output”

Mechanism – Chain of events, including keyboard commands and state data, leading to the failure; what function the system was performing, or how heavy the workload was, when the failure occurred

Page 13: Software Metrics - Data Collection

Failure Report

Cause – Reference to the possible fault(s) leading to the failure
trigger: physical hardware failure, operating conditions, malicious action, user error, erroneous report
source: unintentional design fault, intentional design fault, usability problem

Severity – “Critical,” “Major,” “Minor”

Cost – Cost to fix plus loss of potential business
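A minimal sketch of how a failure report with these fields could be stored, assuming Python; the class, enum, and field names are illustrative, not a standard schema:

    from dataclasses import dataclass
    from datetime import datetime
    from enum import Enum

    class Severity(Enum):
        CRITICAL = "critical"
        MAJOR = "major"
        MINOR = "minor"

    @dataclass
    class FailureReport:
        location: str        # installation where the failure was observed
        timing: datetime     # clock time (CPU time could be kept in a separate field)
        symptom: str         # error message or other indication of failure
        end_result: str      # e.g. "operating system crash", "loss of data"
        mechanism: str       # chain of events leading to the failure
        cause: str           # suspected fault(s), trigger, and source
        severity: Severity
        cost: float          # cost to fix plus loss of potential business

    report = FailureReport(
        location="customer site A",
        timing=datetime(2024, 3, 1, 14, 30),
        symptom="error dialog on save",
        end_result="loss of data",
        mechanism="save issued while a background task held the file lock",
        cause="unintentional design fault (missing lock check)",
        severity=Severity.MAJOR,
        cost=4500.0,
    )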

Page 14: Software Metrics - Data Collection

Fault Report

Location – within-system identifier (and version), such as module or document name

Timing – Phases of development during which fault was created, detected, and corrected

Symptom – Type of error message reported (IEEE standard on software anomalies, 1992)

End Result – Failure caused by the fault

Mechanism – How the fault was created (specification, coding, design, maintenance), detected (inspection, unit testing, system testing, integration testing), and corrected (steps taken to correct it)

Page 15: Software Metrics - Data Collection

Fault Report

Cause – Type of human error that led to the fault:
communication: imperfect transfer of information
conceptual: misunderstanding
clerical: typographical or editing errors

Severity – Refer to severity of resulting or potential failure

Cost – Time or effort to locate and correct
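A fault report could be recorded in the same way; a sketch assuming Python, where the enumerations for cause and life-cycle phase are illustrative assumptions:

    from dataclasses import dataclass
    from enum import Enum

    class FaultCause(Enum):
        COMMUNICATION = "imperfect transfer of information"
        CONCEPTUAL = "misunderstanding"
        CLERICAL = "typographical or editing error"

    class Phase(Enum):
        REQUIREMENTS = "requirements and specification"
        DESIGN = "design"
        CODING = "coding"
        TESTING = "testing"
        MAINTENANCE = "maintenance"

    @dataclass
    class FaultReport:
        location: str       # module or document name (and version)
        created_in: Phase   # phase in which the fault was introduced
        detected_in: Phase  # phase/activity in which it was detected
        symptom: str        # type of error message reported
        end_result: str     # failure caused by the fault, if any
        mechanism: str      # how it was created, detected, and corrected
        cause: FaultCause
        severity: str       # severity of the resulting or potential failure
        cost_hours: float   # time or effort to locate and correct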

Page 16: Software Metrics - Data Collection

Change Report

Location – identifier of document or module changed

Timing – When change was made

Symptom – Type of change

End Result – Success of change

Mechanism – How and by whom change was performed

Cause – Corrective, adaptive, preventive, or perfective

Severity – Impact on rest of system

Cost – Time and effort for change implementation and test
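Once change reports carry the cause field, they can be tallied to show how maintenance work splits across corrective, adaptive, preventive, and perfective changes. A small sketch, assuming Python and invented sample data:

    from collections import Counter

    # Hypothetical change log: (change id, cause, effort in hours)
    changes = [
        ("CR-101", "corrective", 6.0),
        ("CR-102", "adaptive", 3.5),
        ("CR-103", "corrective", 2.0),
        ("CR-104", "perfective", 8.0),
        ("CR-105", "preventive", 1.5),
    ]

    count_by_cause = Counter(cause for _, cause, _ in changes)

    effort_by_cause = {}
    for _, cause, hours in changes:
        effort_by_cause[cause] = effort_by_cause.get(cause, 0.0) + hours

    print(count_by_cause)    # Counter({'corrective': 2, 'adaptive': 1, ...})
    print(effort_by_cause)   # {'corrective': 8.0, 'adaptive': 3.5, ...}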

Page 17: Software Metrics - Data Collection

Refined Failure Data for Software Reliability

Time Domain Data – recording the individual times at which failures occurred (provides better accuracy in the parameter estimates)

Interval Domain Data – counting the number of failures occurring over a fixed period

Page 18: Software Metrics - Data Collection

Time domain approach

Failure record   Actual failure time   Time between failures
1                25                    25
2                55                    30
3                70                    15
4                95                    25
5                112                   17

Page 19: Software Metrics - Data Collection

Interval domain approach

Time interval   Observed number of failures
0               0
1               2
2               4
3               1
4               1
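The two views are related: given time-domain data such as the failure times on page 18, interval-domain counts can be derived by bucketing the times into fixed periods. A sketch assuming Python; the interval width of 25 time units is an arbitrary choice for illustration:

    # Time-domain data: actual failure times from page 18
    failure_times = [25, 55, 70, 95, 112]

    # Inter-failure times (the "time between failures" column)
    between = [t - s for s, t in zip([0] + failure_times, failure_times)]
    print(between)    # [25, 30, 15, 25, 17]

    # Interval-domain view: number of failures in each fixed-width interval
    width = 25
    counts = [0] * (max(failure_times) // width + 1)
    for t in failure_times:
        counts[t // width] += 1
    print(counts)     # [0, 1, 2, 1, 1]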

Page 20: Software Faults

Software Fault Type

Thayer et al. (1978):
Computational errors
Logical errors
Input/output errors
Data handling errors
Operating system/system support errors
Configuration errors
User interface errors
Data base interface errors
Preset data base errors
Global variable definition errors
Operator errors
Unidentified errors

Page 21: Software Faults

Faults Introduced in the Software Life Cycle

Requirement and Specification

Design - Functional design, logical design

Coding

Testing

Maintenance

Page 22: Software Faults

Errors Removed in the Software Life Cycle

Validation

Unit Testing

Integration Testing

Acceptance Testing

Operation and Demonstration
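A common way to use these two lists together is to cross-tabulate faults by the phase in which they were introduced and the activity that removed them. A small sketch, assuming Python and invented fault records:

    from collections import defaultdict

    # Hypothetical fault records: (phase introduced, activity that removed it)
    fault_records = [
        ("requirements", "acceptance testing"),
        ("design", "integration testing"),
        ("coding", "unit testing"),
        ("coding", "unit testing"),
        ("coding", "integration testing"),
        ("design", "operation"),
    ]

    matrix = defaultdict(int)
    for introduced, removed in fault_records:
        matrix[(introduced, removed)] += 1

    for (introduced, removed), n in sorted(matrix.items()):
        print(f"{introduced:>12} -> {removed:<20} {n}")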