Biological modeling of software development dynamics
Software development can be treated
as an optimization problem:
Software quality: number of
features, performance, reliability, etc.
Resources: time, money, etc.
Maximise software quality
subject to the constraint of limited resources
Questions to answer:
› How long will it take to complete a particular
task?
› How long will it take to test a particular module?
› How many defects remain in a particular
module after a release?
› At a particular time, what portion of resources should be devoted to testing as
opposed to developing new features?
How to answer the previous questions:
› By building a model
Purpose of modelling
› Predict, explain, discover, guide
Modelling is a two-step process:
› Determine the set of variables of interest
› Determine the set of equations that describe how these variables change and which relations exist among them
Software development is an iterative
process
The process of software development
can be described by the phrase
evolution of software
The phrase implies similarities with
evolution of biological systems
How can we use this fact? – central idea
of this presentation
Bio-inspired optimization algorithms:
› Genetic algorithm – motivated by Darwin’s principle of natural selection
› Memetic algorithm – motivated by genetics combined with experience
› Particle swarm optimization – motivated by behavior of a flock of birds
› Ant colony optimization – motivated by behavior of an ant-colony
› Bee-inspired optimization, shuffled frog leaping algorithm, etc.
Artificial neural networks
› A computational structure inspired by the central
nervous system
Artificial immune system
› Computationally intelligent systems inspired
by the principles of the immune system
Capture-recapture bug-estimation method
› Statistical method developed for population
estimation in bio-ecosystems
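As a rough illustration of how capture-recapture carries over to defect estimation, the sketch below uses the two-sample Lincoln-Petersen estimator, one of the simplest capture-recapture estimators. The slides do not say which estimator is meant, so this choice is an assumption, and the function names and reviewer counts are hypothetical.

```python
def lincoln_petersen(found_by_a: int, found_by_b: int, found_by_both: int) -> float:
    """Estimate the total number of defects from two independent reviews.

    Mirrors the ecological setting: review A "marks" defects, review B
    "recaptures" some of them, and the overlap scales up to the population.
    """
    if found_by_both <= 0:
        raise ValueError("no overlap between the reviews: estimator is undefined")
    return found_by_a * found_by_b / found_by_both


def estimated_remaining(found_by_a: int, found_by_b: int, found_by_both: int) -> float:
    """Estimated defects that neither review has found yet."""
    total = lincoln_petersen(found_by_a, found_by_b, found_by_both)
    distinct_found = found_by_a + found_by_b - found_by_both
    return total - distinct_found


# Hypothetical counts: reviewer A finds 20 bugs, reviewer B finds 15, 10 in common.
print(lincoln_petersen(20, 15, 10))     # 30.0
print(estimated_remaining(20, 15, 10))  # 5.0
```

The estimator assumes that both reviews are independent and that every defect is equally likely to be found, which rarely holds exactly for real code.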
Rapid release (RR) methodology:
› Adopted by most major software companies (Google, Facebook, Mozilla and Microsoft)
› Main principle: abandon long development cycles in favor of faster releases in order to bring the latest features and fixes to end users
Information for deciding whether software is ready for a new release:
› The relationship between the number of bugs in the system, b(t), and the effort necessary to resolve them, e(t).
› The more effort is spent on resolving defects, the less time is left for developing new features.
O1 – An increase (or decrease) of e(t) leads to a decrease (or increase) of b(t):
$e(t)\uparrow \;\Rightarrow\; b(t)\downarrow$, $e(t)\downarrow \;\Rightarrow\; b(t)\uparrow$
O2 – An increase (or decrease) of b(t) requires an increase (or decrease) of e(t):
$b(t)\uparrow \;\Rightarrow\; e(t)\uparrow$, $b(t)\downarrow \;\Rightarrow\; e(t)\downarrow$
O3 – As a result of the previous two observations, both b(t) and e(t) exhibit periodic oscillations.
O4 – The relationship from O1 and O2 is
not linear.
O5 – A small increase of effort can lead to
a significant reduction in the number of defects.
O6 – Pareto principle: the majority of time is
spent on a small number of difficult bugs
(approximately, removal of 30% of bugs
requires 70% of time).
O7 – Changes in code churn exhibit a growth rate that can be modeled by a sigmoid function.
O8 – b(t) rises more steeply than it falls.
O9 – e(t) falls more steeply than it rises.
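Observation O7 can be made concrete with a logistic curve, the most common sigmoid. The sketch below assumes that cumulative code churn is the quantity following the curve; the slides do not give the exact form, and the parameter names and values are hypothetical.

```python
import math


def logistic(t: float, upper: float, rate: float, midpoint: float) -> float:
    """Logistic (sigmoid) curve that rises from 0 towards `upper`.

    `midpoint` is the time of fastest growth; `rate` controls the steepness.
    """
    return upper / (1.0 + math.exp(-rate * (t - midpoint)))


# Hypothetical project: 10,000 changed lines in total, fastest growth in week 12.
weeks = range(0, 25, 4)
churn = [logistic(w, upper=10_000, rate=0.5, midpoint=12) for w in weeks]
for w, c in zip(weeks, churn):
    print(f"week {w:2d}: {c:8.1f} changed lines (cumulative)")
```

Growth is slow at first, fastest around the midpoint, and levels off as the curve approaches its upper limit.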
Observations O1-O9 imply a similarity with a predator-prey ecosystem:
› Predator – tester, programmer
› Prey – bugs
The most famous predator-prey model, the Lotka-Volterra (LV) model, is expressed by:
$$\frac{db(t)}{dt} = \alpha b - \beta e b, \qquad \frac{de(t)}{dt} = -\gamma e + \delta b e$$
where α, β, γ and δ are positive constants (conventional LV notation).
Main characteristics:
› In the case e(t)=0 (no hunt for bugs), b(t) grows exponentially.
› The rate at which developers/testers detect and remove bugs is proportional to both the number of bugs and the effort invested in bug reduction (βeb). Intuitively, the more bugs there are in the system, the easier it is to detect and eliminate some of them, and the more effort is invested in bug reduction, the more bugs are removed.
› In the absence of bugs (b(t)=0), effort decays exponentially to zero.
› The rate at which the effort grows is proportional to the rate at which the developers/testers encounter bugs.
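To see the oscillations that the LV model produces, the sketch below integrates db/dt = αb − βeb and de/dt = −γe + δbe with a hand-written fourth-order Runge-Kutta step. The names α, γ and δ follow the conventional LV notation (only β appears in the slide text), and all parameter values and initial conditions are made up for illustration.

```python
import math


def lv_rhs(b, e, alpha, beta, gamma, delta):
    """Right-hand side of the LV model: bugs b are prey, effort e is predator."""
    db = alpha * b - beta * e * b      # bug growth minus bugs removed by effort
    de = -gamma * e + delta * b * e    # effort decay plus effort drawn in by bugs
    return db, de


def rk4_step(b, e, h, params):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1b, k1e = lv_rhs(b, e, *params)
    k2b, k2e = lv_rhs(b + 0.5 * h * k1b, e + 0.5 * h * k1e, *params)
    k3b, k3e = lv_rhs(b + 0.5 * h * k2b, e + 0.5 * h * k2e, *params)
    k4b, k4e = lv_rhs(b + h * k3b, e + h * k3e, *params)
    b_next = b + h * (k1b + 2 * k2b + 2 * k3b + k4b) / 6
    e_next = e + h * (k1e + 2 * k2e + 2 * k3e + k4e) / 6
    return b_next, e_next


def simulate(b0, e0, h, steps, params):
    """Return the list of (b, e) states, starting from (b0, e0)."""
    traj = [(b0, e0)]
    for _ in range(steps):
        b, e = traj[-1]
        traj.append(rk4_step(b, e, h, params))
    return traj


def invariant(b, e, alpha, beta, gamma, delta):
    """Quantity that exact LV solutions keep constant, so orbits are closed."""
    return delta * b - gamma * math.log(b) + beta * e - alpha * math.log(e)


params = (1.0, 0.5, 0.8, 0.4)  # alpha, beta, gamma, delta: made-up values
traj = simulate(b0=3.0, e0=1.0, h=0.01, steps=5000, params=params)
drift = abs(invariant(*traj[-1], *params) - invariant(*traj[0], *params))
print(f"bugs range:   {min(b for b, _ in traj):.2f} .. {max(b for b, _ in traj):.2f}")
print(f"effort range: {min(e for _, e in traj):.2f} .. {max(e for _, e in traj):.2f}")
print(f"invariant drift after {len(traj) - 1} steps: {drift:.2e}")
```

Because the exact solution keeps `invariant` constant, every trajectory is a closed orbit whose amplitude depends entirely on the starting point, so a small perturbation moves the system to a different orbit for good.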
Issues with the presented model:
› Extremely sensitive to small perturbations
› Allows unlimited exponential growth of the number of bugs
› Assumes an unlimited ability of a single developer/tester to detect and eliminate bugs
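Improvement of the LV model that resolves
the previous issues:
$$\frac{db(t)}{dt} = \alpha b\left(1 - \frac{b}{K}\right) - \frac{\beta b}{k + b}\,e, \qquad \frac{de(t)}{dt} = -\gamma e + \frac{\delta b}{k + b}\,e$$
where α, β, γ, δ, K and k are positive constants.
The number of bugs is limited by the size and
complexity of the project, which is
specified by the parameter K.
The rate at which a single developer/tester
detects and removes bugs is limited
(the saturating term b/(k + b)).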
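A minimal simulation sketch of the improved model, assuming it has the form db/dt = αb(1 − b/K) − βbe/(k + b) and de/dt = −γe + δbe/(k + b). The Greek parameter names are conventional rather than taken from the slides, and the values below are made up and chosen to give damped oscillations.

```python
def improved_rhs(b, e, alpha, beta, gamma, delta, K, k):
    """Right-hand side of the improved model (bounded bugs, saturating removal)."""
    removal = b / (k + b)  # bug removal per unit effort, saturates for large b
    db = alpha * b * (1.0 - b / K) - beta * removal * e
    de = -gamma * e + delta * removal * e
    return db, de


def simulate(b0, e0, h, steps, params):
    """Integrate with classical RK4; returns the list of (b, e) states."""
    def f(state):
        return improved_rhs(state[0], state[1], *params)

    def shift(state, slope, factor):
        return (state[0] + factor * slope[0], state[1] + factor * slope[1])

    traj = [(b0, e0)]
    for _ in range(steps):
        s = traj[-1]
        k1 = f(s)
        k2 = f(shift(s, k1, 0.5 * h))
        k3 = f(shift(s, k2, 0.5 * h))
        k4 = f(shift(s, k3, h))
        traj.append(tuple(
            s[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(2)
        ))
    return traj


# Made-up parameters, in order: alpha, beta, gamma, delta, K, k
params = (1.0, 1.0, 0.5, 1.0, 2.0, 1.0)
traj = simulate(b0=0.5, e0=0.5, h=0.02, steps=10_000, params=params)
b_end, e_end = traj[-1]
print(f"largest bug count seen: {max(b for b, _ in traj):.3f} (K = {params[4]})")
print(f"final state: b = {b_end:.3f}, e = {e_end:.3f}")
```

With these values the bug count never exceeds K, and the trajectory spirals in towards the point where both derivatives vanish, here b = 1 and e = 1.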
The relationship among the parameters defines two main regions.
The model allows regular oscillations (suitable for the beginning and the middle of a project) as well as damped oscillations (suitable for a project near completion).
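The slides do not state where the boundary between the two regions lies. Assuming the improved model has the form db/dt = αb(1 − b/K) − βbe/(k + b), de/dt = −γe + δbe/(k + b) (known in ecology as the Rosenzweig-MacArthur model), the standard result for that form is that oscillations are damped when the equilibrium bug level b* = γk/(δ − γ) lies above (K − k)/2 and sustained when it lies below. The sketch below encodes that criterion; the function names and parameter values are hypothetical.

```python
def coexistence_equilibrium(alpha, beta, gamma, delta, K, k):
    """Return (b*, e*) with both values positive, or None if no such point exists."""
    if delta <= gamma:
        return None  # effort decays faster than bugs can ever sustain it
    b_star = gamma * k / (delta - gamma)
    if b_star >= K:
        return None  # bug ceiling K is too low to sustain any effort
    e_star = (alpha / beta) * (1.0 - b_star / K) * (k + b_star)
    return b_star, e_star


def regime(alpha, beta, gamma, delta, K, k):
    """Classify the long-run behaviour under the assumed model form."""
    eq = coexistence_equilibrium(alpha, beta, gamma, delta, K, k)
    if eq is None:
        return "no coexistence"
    b_star, _ = eq
    # (K - k) / 2 is where the curve db/dt = 0 peaks; exact equality is the
    # borderline case and is reported as "sustained" here for simplicity.
    return "damped" if b_star > (K - k) / 2 else "sustained"


# Same made-up parameters, differing only in the bug ceiling K.
print(regime(1.0, 1.0, 0.5, 1.0, K=2.0, k=1.0))  # damped
print(regime(1.0, 1.0, 0.5, 1.0, K=4.0, k=1.0))  # sustained
```

In this reading, raising K while keeping the other parameters fixed moves the model from the damped region into the region of regular oscillations.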
Conclusions:
› The model was evaluated on a real-life, small-size project developed under the RR methodology
› The model captures observations O1-O9 fairly accurately
› Future work: investigate whether the results can be generalized to projects with different characteristics.
[Figure: normalized values of e(t) and b(t)]