CSCI 6960- Research Methods
- 1 -HO 3
© Houman Younessi 2007
Lecture 3
Measurement and Metrics
Why do we need to measure anything?
Because we seek confirmation of our experiences and of our “theories”.
Measurement has become a basic tenet of our rational approach to the expansion of human knowledge.
The scientific method has as its basis the measurement of phenomena of interest to us in order to develop quantitative descriptions of these.
Quantification is therefore the very basis of modern science.
Measurement has also had a correspondingly profound impact on all fields of engineering. In fact it can be safely asserted that modern engineering is defined in terms of its scientific basis, its quantification of relationships and its measurement based approaches.
For software engineering to qualify as a true engineering discipline, it too must adopt (more correctly, develop) an empirical, measurement based foundation.
Unfortunately, much computer science research thus far, and particularly “software engineering” research, has been of an “advocacy” nature.
Advocacy can be useful, in fact crucial, as a necessary precursor to the development of any field into a scientifically based discipline.
However, there comes a time when the field has to go beyond being based on the untested (or at least inadequately tested) recommendations of authority figures. History has many examples.
Biology
Medicine?
Psychology
Anthropology
Physics
History
We are at a crossroads.
We are sitting under the apple tree.
As pioneers and practitioners of a young discipline trying to make the transition, we must be vigilant. There are still many techniques, even “metrics”, that are proposed on the basis of inadequate experience, theory or confirmation. These are mere proclamations. Although they are sometimes useful, and often even necessary for the support and development of the practice, they must be treated with extreme caution.
So what is a measure?
An empirical objective assignment of a number or symbol to an entity to characterize a specific attribute. (Fenton, 1991)
Dimension or quantity reckoned by some standard. (Webster’s Dictionary).
This means that a measure is not just a number: it characterizes a mapping between the manifestation of an aspect of interest in an element or entity within our universe of discourse and a mathematical or symbolic system of ranking and comparison. The aspect of interest is called the “attribute”, and the mathematical system of ranking and comparison into which attributes are mapped is called a “scale”. The action of producing this mapping is termed “measurement”.
So a direct, or atomic, measure is a quantification of a directly observed aspect of a phenomenon, obtained by mapping that aspect to a numerical or symbolic value on a scale.
Scales:
The Nominal Scale: A simple mapping into a number of disjoint sets without regard to any other relationships. This is a naming scale.
The Ordinal Scale: A mapping based on rank value. This creates an ordered category.
The Interval Scale: A mapping in which both the ordering and the distance between the values of attributes can be deduced.
The Ratio Scale: A mapping from the real world onto the set of real numbers.
Only certain transformations are allowed, or are indeed meaningful, in relation to any of the aforementioned scales. For example:
Given that there are 7 males and 9 females in a room, what is the “average” gender of the individuals in this room?
Given that Ada, C, C++ and SQL are the programming languages used in project A, and C and SQL are the ones used in project B, what is the “minimum” language used in each project?
It is therefore important to know what operations are permissible when dealing with measured quantities:
Measurement and Permissibility

| | Nominal | Ordinal | Interval | Ratio |
|---|---|---|---|---|
| Properties | Identity | Identity, magnitude | Identity, magnitude, equal intervals | Identity, magnitude, equal intervals, zero |
| Relations | Equivalence | Equivalence, less than | Equivalence, less than, ratio of interval | Equivalence, less than, ratio of interval, ratio of values |
| Operations | None | Rank order | Add, subtract | All |
| Type of Data | Names, labels | Ordered data | Score | Absolute score |
| Statistics | Mode, frequency | Median, percentile | Mean, std. dev., Pearson correl. | All, e.g.: geo. mean, coeff. of variation |
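The "Statistics" row of the table can be sketched in code. The following is a minimal illustration, not a standard library facility: the names `Scale`, `MINIMUM_SCALE`, `is_permissible` and `summarize` are hypothetical, and the sketch assumes that each scale also permits the statistics of the weaker scales to its left.

```python
from enum import IntEnum
from statistics import mean, median_low, mode


class Scale(IntEnum):
    """Scale types ordered by increasing power (left to right in the table)."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Weakest scale on which each statistic is meaningful, read off the table.
MINIMUM_SCALE = {
    "mode": Scale.NOMINAL,
    "median": Scale.ORDINAL,
    "mean": Scale.INTERVAL,
    "geometric_mean": Scale.RATIO,
}


def is_permissible(statistic: str, scale: Scale) -> bool:
    """Return True if the statistic is meaningful for data on this scale."""
    return scale >= MINIMUM_SCALE[statistic]


def summarize(values, scale: Scale) -> dict:
    """Compute only those summary statistics that the scale permits."""
    result = {"mode": mode(values)}
    if is_permissible("median", scale):
        # median_low picks an actual data point, so no averaging is needed.
        result["median"] = median_low(values)
    if is_permissible("mean", scale):
        result["mean"] = mean(values)
    return result


# The room from the earlier example: 7 males and 9 females (nominal data).
room = ["male"] * 7 + ["female"] * 9
print(summarize(room, Scale.NOMINAL))  # {'mode': 'female'}
```

For the nominal gender data only the mode is reported; asking for an "average" gender is refused by construction rather than answered with a meaningless number.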
An important fact about scales is the power of each scale as a means of measurement. Generally, as we go from left to right in the table just presented, the “power” of the scale increases.
Any phenomenon may be measured on any scale, given sufficient understanding of the underlying principles. The aim of science is to measure more and more observable phenomena using as high a scale as possible. For example:
We want to measure the temperature of a number of objects, say objects A, B, C and D.
On the nominal scale we can say something like: there is a category 1 and a category 13, and we assign A and D to category 1 and the other two to category 13, based on some arrangement.
Not terribly useful but still a measurement.
On the ordinal scale we say that we have an ordering based on the amount of perceived heat in an object. We devise a three-level scale of Cold, Warm and Hot. We then place A in category Cold, B and D in category Warm, and C in category Hot.
A bit more useful, but can we use this scale in a sophisticated scientific laboratory when minute changes in temperature need to be measured?
Using the interval scale, we can say that we have a scale divided into n equal parts called, say, degrees. The difference between the temperature of x (which is presumed to always be constant) and that of y (whose temperature is also presumed to be constant but different from that of x) is divided into n equal distances, each called a degree. We still need to devise a means of assessing how far up the scale any one artifact w actually registers (a thermometer). To do so we need more in-depth knowledge of the universe in relation to the concept of temperature than we did with the previous scales.
What is the drawback of this scale?
Using the ratio scale, we might use the very in-depth knowledge that the temperature of an object relates to the amount of energy per unit of mass possessed by that object, or to the level of atomic excitation of the body. We can now say that if a body is in a completely non-excited state, it lacks heat and should therefore rate a zero for its temperature (absolute zero). As the level of excitation increases, we can correspondingly increase the reading for the temperature of the body in question, based on some pre-agreed scale that is homogeneous with the rate of increase of energy (or molecular excitation) in that body. In fact, we could measure the temperature of the body in terms of the size of this excitation.
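The practical difference between the interval and ratio treatments of temperature can be checked numerically. The sketch below assumes the familiar Celsius and Fahrenheit scales as examples of interval scales and Kelvin and Rankine as examples of ratio scales; the three sample temperatures are invented for illustration.

```python
import math


def celsius_to_fahrenheit(celsius: float) -> float:
    """Change of unit AND of zero point: allowed between interval scales."""
    return celsius * 9.0 / 5.0 + 32.0


def celsius_to_kelvin(celsius: float) -> float:
    """Shift the zero point to absolute zero."""
    return celsius + 273.15


def kelvin_to_rankine(kelvin: float) -> float:
    """Change of unit only: the zero point stays at absolute zero."""
    return kelvin * 9.0 / 5.0


cool_c, warm_c, hot_c = 10.0, 20.0, 40.0  # hypothetical readings in Celsius
cool_f, warm_f, hot_f = (celsius_to_fahrenheit(t) for t in (cool_c, warm_c, hot_c))

# Ratios of values are NOT preserved between interval scales.
ratio_c = warm_c / cool_c  # 2.0
ratio_f = warm_f / cool_f  # 68 / 50 = 1.36

# Ratios of intervals ARE preserved between interval scales.
interval_c = (hot_c - warm_c) / (warm_c - cool_c)  # 2.0
interval_f = (hot_f - warm_f) / (warm_f - cool_f)  # 2.0

# Ratios of values are preserved between ratio scales.
cool_k, warm_k = celsius_to_kelvin(cool_c), celsius_to_kelvin(warm_c)
ratio_k = warm_k / cool_k
ratio_r = kelvin_to_rankine(warm_k) / kelvin_to_rankine(cool_k)

print(ratio_c, ratio_f)                # 2.0 1.36
print(math.isclose(ratio_k, ratio_r))  # True
```

Saying that 20 degrees is "twice as hot" as 10 degrees therefore depends on the arbitrary choice of zero on an interval scale, while the same statement on a scale anchored at absolute zero survives a change of unit.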
Question:
A measure of the “quality” of a given process of software development may be given by evaluating that process using the SEI’s Capability Maturity Model (CMM).
What scale is this model on?
What chances do you give the measurement?
Composite Measures and Indirect Scales:
In order to come up with “more powerful” or “higher” scale measures of a phenomenon, we usually resort not to a direct observation and ranking in terms of an atomic measure but an indirect one. Examples:
Temperature (just seen)
Velocity of moving objects.
There are however rules that apply in how we can combine measurements on various scales. This is a challenging discussion not without its difficulties. The general rule however is that:
The scale type for an indirect measure (M) is only as strong as the weakest of the atomic scale types that compose it.
Unfortunately, this is one rule that is often broken in software engineering, leading at times to unusable or misleading results.
Example:
In Halstead’s equation for programming effort: e=V/L; V is Program Volume (on a ratio scale) and L is Program Level (on an ordinal scale). Halstead however claims that e represents the number of mental discriminations necessary to implement a program which ought to be represented on a ratio scale (as it is a count measure)!!!
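The weakest-scale rule, applied to the Halstead example, can be sketched as follows. The names `Scale` and `composite_scale` are hypothetical, and the scale types assigned to V and L are the ones stated in the example above.

```python
from enum import IntEnum


class Scale(IntEnum):
    """Scale types ordered from weakest to strongest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def composite_scale(*component_scales: Scale) -> Scale:
    """An indirect measure is only as strong as its weakest component."""
    return min(component_scales)


volume_scale = Scale.RATIO   # V, Program Volume
level_scale = Scale.ORDINAL  # L, Program Level

effort_scale = composite_scale(volume_scale, level_scale)
print(effort_scale.name)  # ORDINAL
```

Under the rule, e = V/L can be at most ordinal, which is why interpreting it as a count on a ratio scale is questionable.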
Dimensionality:
Another concept often ignored when “measurement” is used or proposed in software engineering is the concept of dimensionality: not only the scales but also the dimensions on the left-hand and right-hand sides of an equation must be identical.
Example:
Although both SLOC and No. of Loops are on the ratio scale, and addition is permissible on that scale, adding SLOC and No. of Loops is probably nonsensical.
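One way to guard against this kind of mistake is to carry the dimension along with the number. The sketch below is a minimal illustration with a hypothetical `Quantity` type and invented values; it only handles addition and treats the unit label as the dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A measured value tagged with the unit (dimension) it was counted in."""
    value: float
    unit: str

    def __add__(self, other: "Quantity") -> "Quantity":
        # Addition is only meaningful when both operands share a dimension.
        if self.unit != other.unit:
            raise TypeError(f"cannot add {other.unit} to {self.unit}")
        return Quantity(self.value + other.value, self.unit)


sloc_module_a = Quantity(1200, "SLOC")  # hypothetical counts
sloc_module_b = Quantity(800, "SLOC")
loop_count = Quantity(35, "loops")

total = sloc_module_a + sloc_module_b
print(total)  # Quantity(value=2000, unit='SLOC')

try:
    sloc_module_a + loop_count
except TypeError as err:
    print(err)  # cannot add loops to SLOC
```

Both operands are ratio-scale counts, so the scale check alone would not catch the error; the dimension check does.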
Desirable properties of Measurement:
Reliability
Effectiveness of range
Validity
Consonance
Dimensionality
Practicality
Reliability
When any two measures of the same entity, made in the same way and independently, agree, we have measurement reliability.
A measure is reliable if it meets the correlation condition.
If M1(A) is a measure of A obtained through experiment 1 and M2(A) is the measure of the same attribute obtained through experiment 2, and if |M1(A) − M2(A)| → 0, then the correlation condition is satisfied.
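In practice the difference between two measurements is rarely exactly zero, so a check of the correlation condition needs some notion of "close enough". The sketch below assumes an explicit tolerance chosen by the experimenter; the function name, the tolerance and the readings are all hypothetical.

```python
def satisfies_correlation_condition(first_run, second_run, tolerance):
    """Check that |M1(A) - M2(A)| <= tolerance for every measured entity A.

    first_run and second_run hold the two measurements of the same
    entities, in the same order.
    """
    if len(first_run) != len(second_run):
        raise ValueError("both runs must measure the same entities")
    return all(abs(m1 - m2) <= tolerance
               for m1, m2 in zip(first_run, second_run))


# Hypothetical repeated measurements of three entities.
experiment_1 = [12.0, 7.5, 20.1]
experiment_2 = [12.1, 7.4, 20.0]

print(satisfies_correlation_condition(experiment_1, experiment_2, tolerance=0.2))   # True
print(satisfies_correlation_condition(experiment_1, experiment_2, tolerance=0.05))  # False
```

The appropriate tolerance depends on the precision of the instrument and on the use to which the measure is put.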
Reliability
If M1 and M2 are the same type of measure made by the same experimenter at different times, then the reliability is called:
Test-Retest Reliability
If M1 and M2 are the same type of measure made by different experimenters at the same or different times, each blind to the result of the other, then the reliability is called:
Inter-Rater Reliability
Reliability
If M1, N1, …, P1 are different types of measures made to measure the same phenomenon, and they agree among themselves, then the reliability is called:
Internal Consistency
Reliability
The scientific tradition has demonstrated that the concept of measurement reliability is of utmost importance. Why?
Because if the measures we obtain are not reliable, then the study cannot yield useful information or relationships.
Reliability
The factors that contribute to reliability include:
• The precision of the operational definition of the construct
• The clarity of the operational definition of the construct
• The care with which we carry out measures
• The number of independent observations
Effectiveness of range
Would you use the bathroom scales to weigh:
• spices for your pie recipe, or
• your RV?
MTTF
Validity
Measurements must be accurate reflections of the “true” behavior or property we perceive in the real world as reflected in the entity measured. This is the:
representation condition
Example:
If, in measuring the complexity of software using a measure C, program A is “more complex” than program B, then C(A) must be larger than C(B).
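The representation condition can be checked mechanically once the empirical judgments are written down. The sketch below assumes the judgments are available as "more complex than" pairs; the function name, the program labels and the values of C are hypothetical.

```python
def satisfies_representation_condition(measure_values, empirical_order):
    """Check that the measure preserves every empirical judgment.

    measure_values maps each entity to its measured value, e.g. C(A).
    empirical_order lists (more, less) pairs, meaning that 'more' is
    judged in the real world to have more of the attribute than 'less'.
    """
    return all(measure_values[more] > measure_values[less]
               for more, less in empirical_order)


# Hypothetical judgments: A is more complex than B, and B more than C.
judged_more_complex = [("A", "B"), ("B", "C")]

candidate_measure = {"A": 14, "B": 9, "C": 3}
flawed_measure = {"A": 5, "B": 9, "C": 3}

print(satisfies_representation_condition(candidate_measure, judged_more_complex))  # True
print(satisfies_representation_condition(flawed_measure, judged_more_complex))     # False
```

A single violated pair is enough to show that a proposed measure does not reflect the attribute as we perceive it.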
Consonance
We must be certain that the measure and the measurement are aligned with our project, process or product goals. In this way, the data will not be open to abuse.
Dimensionality
If a measure is set into a relationship of equality with another, then dividing the right-hand side (RHS) by the left-hand side (LHS) must result in an entity that is mathematically AND LOGICALLY devoid of dimensions.
Practicality
Collecting data and making measurements should be easy.
However, the requirement of practicality is context dependent.
Bubble chambers and cyclotrons.
In the context of software engineering this usually means automatability.
Relating Measures: Prediction Models
Assessment measures Assessment systems
Predictive measures Prediction systems
To have a prediction system one needs:
• A base measure
• A target measure
• A prediction model
• A set of prediction procedures
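The four ingredients of a prediction system can be made concrete with a toy example. The sketch below assumes a straight-line model fitted by ordinary least squares, which is only one possible choice of prediction model; the base measure (size), the target measure (effort) and all data values are invented for illustration.

```python
def fit_line(base_values, target_values):
    """Prediction procedure: fit target = intercept + slope * base by least squares."""
    n = len(base_values)
    mean_base = sum(base_values) / n
    mean_target = sum(target_values) / n
    spread = sum((x - mean_base) ** 2 for x in base_values)
    covariation = sum((x - mean_base) * (y - mean_target)
                      for x, y in zip(base_values, target_values))
    slope = covariation / spread
    intercept = mean_target - slope * mean_base
    return slope, intercept


def predict(model, base_value):
    """Apply the prediction model to a new value of the base measure."""
    slope, intercept = model
    return intercept + slope * base_value


# Hypothetical history: size in KSLOC (base) and effort in person-months (target).
size = [10, 20, 30, 40]
effort = [25, 45, 65, 85]

model = fit_line(size, effort)
print(model)               # (2.0, 5.0)
print(predict(model, 50))  # 105.0
```

Fitting the model does not validate it; the invented data here lie exactly on a line, which real project data would not.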
The concept of validity becomes very important when we become concerned with prediction systems.
We now need not only measures that are valid in terms of the representation condition, but also a model that is valid in terms of establishing the relationship that exists between them.
The predictive model must be validated.
This validation may be:
Deterministic
Stochastic
“Proof” of validity is only possible on very rare occasions, when there is a correctness-preserving set of mathematical transformations that relates one measure to the other. In all other cases we provide “evidence” for validation.