Post on 19-Dec-2015
1
Software Reliability Growth Models Incorporating Fault Dependency with
Various Debugging Time Lags
Chin-Yu Huang, Chu-Ti Lin, Sy-Yen Kuo, Michael R. Lyu*, and Chuan-Ching Sue
*Computer Science & Engineering Department
The Chinese University of Hong Kong
Shatin, Hong Kong
The 28th Annual International Computer Software and Applications Conference (COMPSAC 2004)
2
Outline
Introduction
Reviews of software fault detection and correction process
Considering failure dependency & debugging time lag in software reliability modeling
Numerical examples
Conclusions
3
Introduction
Software is becoming increasingly complex and sophisticated, so reliability is becoming a primary goal for software developers.
Software reliability is defined as the probability of failure-free software operation for a specified period of time in a specified environment.
Over the past 30 years, many Software Reliability Growth Models (SRGMs) have been proposed for estimation of reliability growth of products during software development processes.
4
Introduction (contd.)
One common assumption of conventional SRGMs is that detected faults are immediately removed.
In practice, this assumption may not be realistic in software development.
The time to remove a fault depends on the complexity of the detected fault, the skills of the debugging team, the available manpower, and the software development environment.
Therefore, the time lag between the detection and correction processes should not be neglected.
5
Introduction (contd.)
On the other hand, there are two types of faults in a software system: mutually independent faults and mutually dependent faults.
– Mutually independent faults are on different program paths.
– Mutually dependent faults can be removed only if the leading faults are removed.
In this paper, we will incorporate the ideas of failure dependency and time-dependent delay function into software reliability growth modeling.
6
Basic Assumptions of Most SRGMs
1. The fault removal process follows the Non-homogeneous Poisson Process (NHPP).
2. The software system is subject to failures at random times caused by the manifestation of remaining faults in the system.
3. All faults are independent and equally detectable.
4. Each time a failure occurs, the corresponding fault is immediately and perfectly removed. No new faults are introduced.
7
Issues of Fault Detection and Correction Processes in SRGMs
It is noted that assumption (4) may not be realistic in practice.
Actually, most existing SRGMs can be reinterpreted as delayed fault-detection models that can model the software fault detection and correction processes.
We can remove the impractical assumption that the fault-correction process is perfect and establish a corresponding time-dependent delay function to fit the fault-correction process.
8
Definition 1
Given a fault-detection and fault-correction process, one defines the delay-effect factor, φ(t), to be a time-dependent function that measures the expected delay in correcting a detected fault at any time.
9
Definition 2
An SRGM is called a delayed-time NHPP model if it obeys the following assumptions:
1. The fault detection process follows the NHPP.
2. The software system is subject to failures at random times caused by the manifestation of remaining faults in the system.
3. All faults are independent and equally detectable.
4. The rate of change of the mean value function (MVF) of the number of detected faults is exponentially decreasing.
5. The detected faults are not immediately removed. The fault correction process lags the fault detection process by a delay-effect factor φ(t).
10
Fault Detection and Correction Processes in SRGMs
Based on the above assumptions (1)-(4), the original MVF of the NHPP model is
$$m_{original}(t) = a(1 - \exp[-rt]), \quad a > 0,\ r > 0,$$
where a is the expected number of initial faults, and r is the fault detection rate.
From assumption (5) in Definition 2 and the above equation, the new MVF can be depicted as
$$m(t) = m_{original}(t - \varphi(t)) = a(1 - \exp[-rt]\exp[r\varphi(t)]), \quad a > 0,\ r > 0.$$
We thus derive the following theorem.
11
Theorem 1
Given a delay-effect factor, φ(t), we have:
1) The fault-detection intensity of the delayed-time NHPP SRGM is
$$\lambda(t) = \frac{dm(t)}{dt} = ar\exp[-rt]\exp[r\varphi(t)]\left(1 - \frac{d\varphi(t)}{dt}\right),$$
where a>0 and r>0.
2) $$\frac{d\varphi(t)}{dt} \le 1.$$
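As a sketch of where Theorem 1 comes from (a reconstruction from the delayed MVF m(t) = a(1 − exp[−r(t − φ(t))]), not a derivation shown on the slides), differentiating with the chain rule gives:

```latex
\begin{aligned}
m(t) &= a\left(1 - \exp[-r(t - \varphi(t))]\right), \\
\lambda(t) = \frac{dm(t)}{dt}
  &= ar\left(1 - \frac{d\varphi(t)}{dt}\right)\exp[-r(t - \varphi(t))] \\
  &= ar\exp[-rt]\exp[r\varphi(t)]\left(1 - \frac{d\varphi(t)}{dt}\right).
\end{aligned}
```

Because a, r, and both exponentials are positive, the intensity is non-negative exactly when dφ(t)/dt ≤ 1, which is the condition in part 2 of the theorem.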
12
Three Conventional SRGMs
In the following, we will review three conventional NHPP-based SRGMs that can be directly derived from Definition 1, Definition 2, and Theorem 1.
1) Goel-Okumoto model
2) Yamada delayed S-shaped model
3) Yamada Weibull-type testing-effort function model
13
Goel-Okumoto model
This model, first proposed by Goel and Okumoto, is one of the most popular NHPP models in the field of software reliability modeling.
If $\varphi(t) = 0$, then (Definition 1)
a) we have $\frac{d\varphi(t)}{dt} = 0 \le 1$ (Theorem 1)
b) mean value function (MVF): $m(t) = a(1 - \exp[-rt])$ (Definition 2)
14
Yamada delayed S-shaped model
The Yamada delayed S-shaped model is a modification of the NHPP to obtain an S-shaped curve for the cumulative number of failures detected such that the failure rate initially increases and later decays.
If $\varphi(t) = \frac{1}{r}\ln(1 + rt)$, then (Definition 1)
a) we have $\frac{d\varphi(t)}{dt} = \frac{1}{1 + rt} \le 1$ (Theorem 1)
b) MVF: $m(t) = a(1 - (1 + rt)\exp[-rt])$ (Definition 2)
15
Yamada Weibull-type testing-effort function model
Yamada et al. proposed a software reliability model incorporating the amount of testing effort expended during the software testing phase. The distribution of testing effort can be described by a Weibull curve.
If $\varphi(t) = t - \alpha + \alpha\exp[-\beta t^{\gamma}]$, then (Definition 1)
a) we have $\varphi'(t) = 1 - \alpha\beta\gamma t^{\gamma - 1}\exp[-\beta t^{\gamma}] \le 1$ (Theorem 1)
b) Mean value function: $m(t) = a\{1 - \exp[-r\alpha(1 - \exp[-\beta t^{\gamma}])]\}$ (Definition 2)
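The three reductions above can be checked numerically. The following is a minimal Python sketch, not code from the presentation: the function names are hypothetical, and the Weibull parameters alpha, beta, gamma are the usual three Weibull-curve parameters, assumed here because the slides do not list their symbols. It plugs each delay-effect factor into the delayed MVF a(1 − exp[−r(t − φ(t))]) and compares the result with the familiar closed form.

```python
import math


def delayed_mvf(t, a, r, phi):
    """Delayed-time NHPP MVF: a * (1 - exp[-r * (t - phi(t))])."""
    return a * (1.0 - math.exp(-r * (t - phi(t))))


def phi_goel_okumoto(t):
    """No correction delay: phi(t) = 0."""
    return 0.0


def make_phi_delayed_s(r):
    """phi(t) = ln(1 + r t) / r, which yields the delayed S-shaped model."""
    return lambda t: math.log(1.0 + r * t) / r


def make_phi_weibull(alpha, beta, gamma):
    """phi(t) = t - alpha + alpha * exp[-beta * t^gamma] (assumed symbols)."""
    return lambda t: t - alpha + alpha * math.exp(-beta * t ** gamma)


def mvf_goel_okumoto(t, a, r):
    return a * (1.0 - math.exp(-r * t))


def mvf_delayed_s(t, a, r):
    return a * (1.0 - (1.0 + r * t) * math.exp(-r * t))


def mvf_weibull(t, a, r, alpha, beta, gamma):
    effort = alpha * (1.0 - math.exp(-beta * t ** gamma))
    return a * (1.0 - math.exp(-r * effort))


if __name__ == "__main__":
    a, r = 374.050, 0.197651  # illustrative values only
    week = 10.0
    print(delayed_mvf(week, a, r, make_phi_delayed_s(r)), mvf_delayed_s(week, a, r))
```

Passing the delay as a callable keeps one MVF implementation for all three models, so only φ(t) changes from model to model.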
16
Considering Failure Dependency in Software Fault Modeling
Assumptions:
1. The fault detection process follows the NHPP.
2. The software system is subject to failures at random times caused by the manifestation of remaining faults in the system.
3. All detected faults can be categorized as leading faults and dependent faults. Besides, the total number of faults is finite.
4. The mean number of leading faults detected in the time interval (t, t+Δt] is proportional to the mean number of remaining leading faults in the system. Besides, the proportionality is a constant over time.
17
Considering Failure Dependency in Software Fault Modeling (contd.)
5. The mean number of dependent faults detected in the time interval (t, t+Δt] is proportional to the mean number of remaining dependent faults in the system and to the ratio of leading faults removed at time t to the total number of faults. Besides, the proportionality is a constant over time.
6. The detected dependent fault may not be immediately removed and it lags the fault detection process by a delay-effect factor φ(t). That is, φ(t) is the time delay between the removal of the leading fault and the removal of the dependent fault(s).
7. No new faults are introduced during the fault removal process.
18
Considering Failure Dependency in Software Fault Modeling (contd.)
From assumptions (3) & (4), we have a = a1 + a2. Besides, a1 = P × a and a2 = (1 − P) × a, and m(t) = m1(t) + m2(t), where:
a : expected number of initial faults.
a1 : total number of leading faults.
a2 : total number of dependent faults.
P : portion of the leading faults.
m(t) : the MVF of the expected number of initial faults.
m1(t) : the MVF of the expected number of leading faults detected in time (0, t].
m2(t) : the MVF of the expected number of dependent faults detected in time (0, t].
19
Considering Failure Dependency in Software Fault Modeling (contd.)
From assumption (4), the number of detected leading faults is proportional to the number of remaining leading faults. Thus we obtain the following differential equation:
$$\frac{dm_1(t)}{dt} = r \times [a_1 - m_1(t)], \quad a_1 > 0,\ 0 < r < 1.$$
Solving the above differential equation under the boundary condition m1(0) = 0, we have
$$m_1(t) = a_1(1 - \exp[-rt]).$$
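For readers who want the intermediate step, the leading-fault equation dm1(t)/dt = r[a1 − m1(t)] is separable. The following is a sketch of the standard separation-of-variables argument with m1(0) = 0, and is not taken from the slides:

```latex
\begin{aligned}
\frac{dm_1}{a_1 - m_1} &= r\,dt
  \;\Longrightarrow\; -\ln(a_1 - m_1(t)) = rt + C, \\
m_1(0) = 0 &\;\Longrightarrow\; C = -\ln a_1, \\
\ln\frac{a_1 - m_1(t)}{a_1} &= -rt
  \;\Longrightarrow\; m_1(t) = a_1\left(1 - \exp[-rt]\right).
\end{aligned}
```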
20
Considering Failure Dependency in Software Fault Modeling (contd.)
Similarly, from assumptions (5)-(7), we obtain the following differential equation:
$$\frac{dm_2(t)}{dt} = \theta \times [a_2 - m_2(t)] \times \frac{m_1(t - \varphi(t))}{a},$$
where θ is the proportionality constant of assumption (5).
In the following, we will give a detailed description of the possible behavior of φ(t).
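The dependent-fault equation dm2(t)/dt = θ[a2 − m2(t)] m1(t − φ(t))/a is also separable, so every choice of φ(t) leads to the same general form. The following sketch assumes the initial condition m2(0) = 0, which the slides do not state explicitly:

```latex
\begin{aligned}
\frac{dm_2}{a_2 - m_2} &= \frac{\theta}{a}\, m_1(t - \varphi(t))\,dt, \\
m_2(t) &= a_2\left(1 - \exp\left[-\frac{\theta}{a}\int_0^t m_1(s - \varphi(s))\,ds\right]\right).
\end{aligned}
```

Each case that follows amounts to evaluating this integral for a particular delay-effect factor.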
21
Case 1
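If $\varphi(t) = 0$, we obtain
$$\frac{dm_2(t)}{dt} = \theta \times [a_2 - m_2(t)] \times \frac{a_1(1 - \exp[-rt])}{a},$$
$$m_2(t) = a_2\left(1 - \exp\left[\frac{\theta a_1(1 - \exp[-rt])}{ra} - \frac{\theta a_1 t}{a}\right]\right).$$
From assumption (3), the MVF of the expected number of initial faults is
$$m(t) = a\left(1 - P\exp[-rt] - (1 - P)\exp\left[\frac{P\theta}{r}(1 - \exp[-rt]) - P\theta t\right]\right).$$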
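To make the Case 1 result concrete, here is a minimal Python sketch of the closed form m(t) = a(1 − P exp[−rt] − (1 − P) exp[(Pθ/r)(1 − exp[−rt]) − Pθt]). The function names are hypothetical; the parameter values are the first row of Table 3 (the fit to DS1) and are used only as an illustration.

```python
import math


def case1_leading(t, a, r, p):
    """Leading faults for phi(t) = 0: m1(t) = a1 * (1 - exp[-r t]), a1 = P * a."""
    return p * a * (1.0 - math.exp(-r * t))


def case1_dependent(t, a, r, theta, p):
    """Dependent faults: closed-form solution of the m2 equation when phi(t) = 0."""
    a2 = (1.0 - p) * a
    exponent = (p * theta / r) * (1.0 - math.exp(-r * t)) - p * theta * t
    return a2 * (1.0 - math.exp(exponent))


def case1_mvf(t, a, r, theta, p):
    """Total MVF m(t) = m1(t) + m2(t) for Case 1."""
    return case1_leading(t, a, r, p) + case1_dependent(t, a, r, theta, p)


if __name__ == "__main__":
    # First row of Table 3 (Case 1, DS1).
    fit = dict(a=420.193, r=0.07594, theta=0.398488, p=0.55)
    for week in (1, 5, 10, 19):
        print(week, round(case1_mvf(week, **fit), 1))
```

Keeping m1 and m2 as separate functions makes it easy to check the dependent-fault differential equation by finite differences.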
22
Case 2
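If $\varphi(t) = \frac{1}{r}\ln(1 + rt)$, we obtain
$$\frac{dm_2(t)}{dt} = \theta \times [a_2 - m_2(t)] \times \frac{a_1(1 - (1 + rt)\exp[-rt])}{a},$$
$$m_2(t) = a_2\left(1 - \exp\left[\frac{\theta a_1(2 - 2\exp[-rt] - rt\exp[-rt] - rt)}{ra}\right]\right).$$
From assumption (3), the MVF of the expected number of initial faults is
$$m(t) = a\left(1 - P(1 + rt)\exp[-rt] - (1 - P)\exp\left[\frac{2P\theta}{r}(1 - \exp[-rt]) - P\theta t(1 + \exp[-rt])\right]\right).$$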
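A similar sketch for Case 2, where φ(t) = ln(1 + rt)/r, evaluates the closed form next to a few observed values. The function names are hypothetical; the parameters are the first row of Table 5 and the observed counts are weeks 1, 10, and 19 of Table 1.

```python
import math


def case2_corrected_leading(t, a, r, p):
    """Leading faults after the delay: a1 * (1 - (1 + r t) * exp[-r t])."""
    return p * a * (1.0 - (1.0 + r * t) * math.exp(-r * t))


def case2_dependent(t, a, r, theta, p):
    """Dependent faults: closed-form solution of the m2 equation for Case 2."""
    decay = math.exp(-r * t)
    exponent = (2.0 * p * theta / r) * (1.0 - decay) - p * theta * t * (1.0 + decay)
    return (1.0 - p) * a * (1.0 - math.exp(exponent))


def case2_mvf(t, a, r, theta, p):
    """Total MVF for Case 2."""
    return case2_corrected_leading(t, a, r, p) + case2_dependent(t, a, r, theta, p)


if __name__ == "__main__":
    fit = dict(a=412.600, r=0.228869, theta=0.090558, p=0.72)  # Table 5, first row
    observed = {1: 15, 10: 206, 19: 328}  # selected weeks from Table 1
    for week, cnf in observed.items():
        print(week, cnf, round(case2_mvf(week, **fit), 1))
```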
23
Case 3
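If $\varphi(t) = t - \alpha + \alpha\exp[-\beta t^{\gamma}]$, we obtain
$$\frac{dm_2(t)}{dt} = \theta \times [a_2 - m_2(t)] \times \frac{a_1\{1 - \exp[-r\alpha(1 - \exp[-\beta t^{\gamma}])]\}}{a}.$$
Similarly, we can obtain the MVF of the expected number of initial faults by means of the same steps.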
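The slides give no closed form for Case 3, so one option is to integrate the dependent-fault equation numerically. The sketch below uses a hand-written fourth-order Runge-Kutta step; it is an illustration rather than the authors' method. The function names and all parameter values (including the Weibull symbols alpha, beta, gamma) are hypothetical assumptions, not estimates from the presentation.

```python
import math


def integrate_m2(theta, a, a2, corrected_leading, t_end, steps=2000):
    """RK4 for dm2/dt = theta * (a2 - m2) * corrected_leading(t) / a, with m2(0) = 0."""

    def rate(t, m2):
        return theta * (a2 - m2) * corrected_leading(t) / a

    h = t_end / steps
    t, m2 = 0.0, 0.0
    for _ in range(steps):
        k1 = rate(t, m2)
        k2 = rate(t + h / 2.0, m2 + h * k1 / 2.0)
        k3 = rate(t + h / 2.0, m2 + h * k2 / 2.0)
        k4 = rate(t + h, m2 + h * k3)
        m2 += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return m2


def weibull_corrected_leading(a1, r, alpha, beta, gamma):
    """m1(t - phi(t)) for the Weibull-type delay (assumed symbols alpha, beta, gamma)."""
    return lambda t: a1 * (1.0 - math.exp(-r * alpha * (1.0 - math.exp(-beta * t ** gamma))))


# Hypothetical parameter values chosen only to exercise the code.
a, p, r, theta = 400.0, 0.6, 0.1, 0.3
leading = weibull_corrected_leading(p * a, r, alpha=40.0, beta=0.01, gamma=1.5)
m2_case3 = integrate_m2(theta, a, (1.0 - p) * a, leading, t_end=19.0)
print(round(m2_case3, 2))
```

The same integrator accepts any corrected leading-fault curve, so running it on Case 1 or Case 2 and comparing with their closed forms is a quick sanity check.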
24
Numerical Examples
Two Real Software Failure Data Sets
– Ohba’s data set (1984)
– Data set from RVLIS project in Taiwan (2003)
Criteria for Model’s Comparison
– Mean Square of Fitting Error
– Mean Error of Prediction
– Noise
Performance Analysis
– We only consider Case 1 & Case 2 as illustration.
– Parameters of all models are estimated by using the method of LSE and MLE.
25
Data Description
The first data set (DS1):
– The system was a PL/I database application software consisting of approximately 1,317,000 lines of code. During 19 weeks, 47.65 CPU hours were consumed and about 328 software faults were removed.
26
Data Description (contd.)
WEEK CNF WEEK CNF WEEK CNF WEEK CNF
1 15 6 110 11 233 16 311
2 44 7 146 12 255 17 320
3 66 8 175 13 276 18 325
4 103 9 179 14 298 19 328
5 105 10 206 15 304
Table 1: Cumulative number of failures by week for DS1.
CNF: cumulative number of failures.
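Because the table lists cumulative counts, the per-week counts have to be recovered by differencing. A small Python sketch follows; the variable and function names are hypothetical, and the data are copied from Table 1.

```python
# Cumulative number of failures (CNF) for weeks 1..19, copied from Table 1.
DS1_CNF = [15, 44, 66, 103, 105, 110, 146, 175, 179, 206,
           233, 255, 276, 298, 304, 311, 320, 325, 328]


def weekly_counts(cumulative):
    """Turn cumulative failure counts into per-week counts."""
    previous = 0
    counts = []
    for total in cumulative:
        counts.append(total - previous)
        previous = total
    return counts


ds1_weekly = weekly_counts(DS1_CNF)
print(ds1_weekly)
```

The per-week series is uneven (for example, 37 failures in week 4 but only 2 in week 5), which is worth keeping in mind when reading the fitting errors reported later.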
27
Data Description (contd.)
The second data set (DS2):
– RVLIS, a detection system used to monitor the level of water within the reactor vessel. The coding language is VersaPro 2.03 and the development platform is GE FANUC PLC 9030. It took 25 weeks to complete the test. During the test phase, 230 software faults were removed.
28
WEEK CNF WEEK CNF WEEK CNF WEEK CNF
1 44 8 100 15 197 22 230
2 75 9 124 16 205 23 230
3 75 10 130 17 214 24 230
4 75 11 130 18 215 25 230
5 75 12 159 19 225
6 75 13 175 20 227
7 75 14 181 21 228
Table 2: Cumulative number of failures by week for DS2.
CNF: cumulative number of failures.
29
Criteria for Model’s Comparison (1)
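Noise:
$$\text{Noise} = \sum_{i=1}^{n} \left| \frac{r_i - r_{i-1}}{r_{i-1}} \right|,$$
where ri is the predicted failure rate.
Mean Square of Fitting Error (MSE):
$$\text{MSE} = \frac{1}{k}\sum_{i=1}^{k} [m(t_i) - m_i]^2,$$
where mi is the observed number of faults by time ti and m(ti) is the number estimated by the model.
– A smaller MSE indicates a smaller fitting error and better performance.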
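Both criteria are short to compute. The following is a minimal Python sketch with hypothetical function names, assuming the rates are passed as the sequence r_0, r_1, ..., r_n and that no rate used as a denominator is zero.

```python
def mean_square_error(predicted, observed):
    """MSE = (1/k) * sum over i of (m(t_i) - m_i)^2."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sum(squared) / len(squared)


def noise(rates):
    """Noise = sum over i = 1..n of |(r_i - r_{i-1}) / r_{i-1}|, rates = [r_0, ..., r_n]."""
    return sum(abs((current - previous) / previous)
               for previous, current in zip(rates, rates[1:]))


if __name__ == "__main__":
    print(mean_square_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    print(noise([2.0, 4.0, 2.0]))
```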
30
Criteria for Model’s Comparison (2)
Mean Error of Prediction (MEOP):
$$\text{MEOP} = \frac{\sum_{i=k}^{n} |n_i - m_i|}{n - k + 1},$$
where ni is the observed cumulative number of failures at time si and mi is the predicted cumulative number of failures at time si, i=k, k+1,…, n.
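MEOP is the mean absolute prediction error over the n − k + 1 points from i = k to n. A minimal Python sketch follows; the function name is hypothetical, and the caller is assumed to pass only the points from index k onward.

```python
def mean_error_of_prediction(observed, predicted):
    """MEOP = sum over i = k..n of |n_i - m_i|, divided by (n - k + 1).

    Both sequences hold only the points i = k..n, so their common length
    is n - k + 1.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(n_i - m_i) for n_i, m_i in zip(observed, predicted)) / len(observed)


if __name__ == "__main__":
    # Made-up numbers, only to show the call.
    print(mean_error_of_prediction([10, 20, 30], [12, 19, 30]))
```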
31
Performance Analysis
Table 3: Comparison results of different SRGMs for Case1-DS1.
Model a r θ P MSE MEOP Noise
Equation of Case 1. 420.193 0.07594 0.398488 0.55 89.3622 7.29677 1.21422
Equation of Case 1. 437.776 0.181895 0.477890 0.2 97.5109 7.66603 1.29856
Equation of Case 1. 416.043 0.133663 0.405508 0.3 93.0371 7.45637 1.27661
Equation of Case 1. 408.365 0.104933 0.383069 0.4 90.6259 7.34446 1.25936
Equation of Case 1. 412.750 0.084698 0.387177 0.5 89.5128 7.29751 1.23357
Equation of Case 1. 430.387 0.068848 0.414505 0.6 89.4898 7.31007 1.19243
Equation of Case 1. 466.626 0.055417 0.472794 0.7 90.7684 7.39115 1.12704
Equation of Case 1. 535.871 0.043297 0.588124 0.8 94.3338 7.64427 1.01778
Equation of Case 1. 671.845 0.032573 0.820907 0.9 104.231 8.19621 0.79665
Goel-Okumoto model 760.534 0.032269 — — 139.815 9.89065 0.60332
Delay S-shaped model 374.050 0.197651 — — 168.673 9.78299 2.34455
32
Performance Analysis (contd.)
Case 1 (DS1)
– The proposed model estimates P=0.55. That is, the software may contain two categories of faults: 55% are leading faults and 45% are dependent faults.
– Other possible values of P are also evaluated and listed in Table 3.
– The proposed model provides the lowest MSE and MEOP when compared with the Goel-Okumoto model and the Yamada delayed S-shaped model.
– We can see that the proposed model provides the best fit and prediction for the first data set.
33
Table 4: Comparison results of different SRGMs for Case1-DS2.
Model a r θ P MSE MEOP Noise
Equation of Case 1. 284.399 0.083388 0.156682 0.71 247.105 12.0832 1.47165
Equation of Case 1. 355.001 0.274404 0.225143 0.2 254.381 13.0026 1.34102
Equation of Case 1. 327.266 0.181337 0.177243 0.3 252.878 12.6201 1.34785
Equation of Case 1. 306.302 0.140744 0.157981 0.4 250.432 12.3646 1.39488
Equation of Case 1. 292.591 0.116051 0.150571 0.5 248.650 12.1969 1.43638
Equation of Case 1. 285.321 0.098494 0.150182 0.6 247.551 12.1029 1.46346
Equation of Case 1. 284.136 0.084779 0.155675 0.7 247.110 12.0807 1.47203
Equation of Case 1. 289.565 0.073387 0.167628 0.8 247.456 12.1452 1.45800
Equation of Case 1. 303.129 0.063654 0.187542 0.9 249.052 12.3365 1.41749
Goel-Okumoto model 326.364 0.055693 — — 253.217 12.7256 1.35427
Delay S-shaped model 247.221 0.191014 — — 409.026 12.9325 3.10483
34
Performance Analysis (contd.)
Case 1 (DS2)
– The proposed model estimates P=0.71. That is, the software may contain two categories of faults: 71% are leading faults and 29% are dependent faults.
– Other possible values of P are also evaluated and listed in Table 4.
– The proposed model provides the lowest MSE and MEOP when compared with the Goel-Okumoto model and the Yamada delayed S-shaped model.
– We can see that the proposed model provides the best fit and prediction for the second data set.
35
Model a r θ P MSE MEOP Noise
Equation of Case 2. 412.600 0.228869 0.090558 0.72 161.585 9.75609 2.16918
Equation of Case 2. 523.048 0.460176 0.283841 0.2 139.241 9.68111 1.50403
Equation of Case 2. 501.357 0.358750 0.192887 0.3 145.669 9.79057 1.69780
Equation of Case 2. 478.453 0.305156 0.147434 0.4 150.444 9.83572 1.84090
Equation of Case 2. 455.454 0.271896 0.121256 0.5 154.445 9.81603 1.96342
Equation of Case 2. 434.286 0.248901 0.104441 0.6 157.935 9.77504 2.06808
Equation of Case 2. 415.631 0.231742 0.092751 0.7 161.033 9.75492 2.15467
Equation of Case 2. 399.508 0.218213 0.084170 0.8 163.818 9.75748 2.22772
Equation of Case 2. 385.722 0.207095 0.077631 0.9 166.349 9.75588 2.29030
Goel-Okumoto model 760.534 0.032269 — — 139.815 9.89065 0.60332
Delay S-shaped model 374.050 0.197651 — — 168.673 9.78299 2.34455
Table 5: Comparison results of different SRGMs for Case2-DS1.
36
Performance Analysis (contd.)
Case 2 (DS1)
– The proposed model estimates P=0.72 for this data set, which indicates that the software may contain two categories of faults: 72% are leading faults and 28% are dependent faults.
– Other possible values of P are also evaluated and listed in Table 5.
– The proposed model provides nearly the lowest MEOP when compared with the Goel-Okumoto model and the Yamada delayed S-shaped model.
– Overall, the MVF of the proposed model provides a good fit to this data.
37
Model a r θ P MSE MEOP Noise
Equation of Case 2. 264.181 0.218560 0.082480 0.77 402.515 13.2156 2.83969
Equation of Case 2. 330.303 0.683468 0.246725 0.2 334.808 14.6244 1.68513
Equation of Case 2. 313.562 0.409470 0.184855 0.3 370.289 14.2028 2.00071
Equation of Case 2. 301.219 0.325860 0.144944 0.4 383.379 13.9400 2.24942
Equation of Case 2. 290.181 0.280493 0.119105 0.5 390.808 13.7258 2.42998
Equation of Case 2. 279.858 0.251092 0.101601 0.6 396.008 13.5182 2.59620
Equation of Case 2. 270.325 0.230121 0.089176 0.7 400.082 13.3320 2.74551
Equation of Case 2. 261.688 0.214181 0.079991 0.8 403.478 13.1691 2.87787
Equation of Case 2. 253.989 0.201489 0.072978 0.9 406.418 13.0372 2.99694
Goel-Okumoto model 326.364 0.055693 — — 253.217 12.7256 1.35427
Delay S-shaped model 247.221 0.191014 — — 409.026 12.9325 3.10483
Table 6: Comparison results of different SRGMs for Case2-DS2.
38
Performance Analysis (contd.)
Case 2 (DS2)
– The proposed model estimates P=0.77 for this data set, which indicates that the software may contain two categories of faults: 77% are leading faults and 23% are dependent faults.
– Other possible values of P are also evaluated and listed in Table 6.
– On the other hand, we know that the inflection S-shaped model is based on the dependency of faults by postulating the assumption that some of the faults are not detectable before some other faults are removed. Therefore, it may provide us some information for reference.
39
Performance Analysis (contd.)
Case 2 (DS2, contd.)
– After the simulation, we find that the estimated value of the inflection rate (which indicates the ratio of the number of detectable faults to the total number of faults in the software) is 0.598 for DS2. It indicates that the growth curve is slightly S-shaped.
– On average, the proposed model performs well on this real data set.
40
Conclusions
In this paper, we incorporate both failure dependency and time-dependent delay function into software reliability assessment.
Specifically, all detected faults can be categorized as leading faults and dependent faults.
Moreover, the fault-correction process can be modeled as a delayed fault-detection process and it lags the detection process by a time-dependent delay.
Experimental results show that the proposed framework to incorporate both failure dependency and time-dependent delay function for SRGM has a fairly accurate prediction capability.