Swinburne University of Technology
Faculty of Engineering and Industrial Sciences
Thesis submitted in fulfillment of the requirements for
the degree of Doctor of Philosophy
Failure Prediction for Water
Pipes
by Azam Dehghan
Victoria, Australia, 2009
To my beloved family, Reza, Helia, and my dear mother and father.
ABSTRACT
This thesis focuses on predicting the future condition of pipes in water sup-
ply networks based on their previous performance using statistical analysis. The
contemporary methods developed to solve this problem are reviewed and a num-
ber of novel statistical analyses and new probabilistic techniques that enhance
the failure prediction accuracy and uncertainty modelling are introduced.
When a complete history of water pipe failures is available, the statistical
analysis will efficiently provide an accurate formulation of the relationship be-
tween failure frequencies and the factors contributing to the overall structural
deterioration of the pipes. This result can then be effectively utilised to predict
the future failures of the pipes. In practice, however, a complete dataset col-
lected on a regular basis for each pipe of water networks is very costly and not
readily available. In such circumstances, more sophisticated statistically derived
models are required. In this thesis, a failure history of water mains provided
by City West Water PTY LTD (CWW) is studied and analysed as a typical
database that is usually available for water supply networks. This database is
also used for comprehensive simulation and evaluation purposes.
An intelligent statistical reliability model based on artificial neural networks is
proposed for reliability estimation of pipes similar in terms of material, diameter,
location, etc. Application of this model to the CWW failure dataset shows
that it substantially outperforms existing statistical reliability models based on
lognormal and Weibull distributions.
In the next step, the ensemble of failures of each group of similar pipes
(called a pipe class) is studied as a random process and demonstrated to be
non-stationary because of the time-varying environmental factors that affect the
pipe failure processes.
This thesis concludes by proposing a new non-parametric probabilistic
technique developed to capture the non-stationary process of pipe failures de-
spite the lack of information about time-variant factors which is typical of the
data available in water distribution systems. The predictions are updated auto-
matically and therefore take the gradual time-variant factors into account.
The output of this novel non-parametric auto-updating technique is a confidence
interval that represents the range of the possible number of failures occurring in
a given period of time in the future with a given confidence. The results of the
evaluation of this method for prediction of failures in the CWW failure database
show that, in 95% of the cases, the actual number of failures is within the
confidence interval given by the suggested technique.
ACKNOWLEDGEMENTS
This research was supported by City West Water PTY LTD, which provided its
failure database for the water pipes in the western suburbs of Melbourne,
Australia. The kind support received from my coordinating supervisor, Associate
Professor Kerry J. McManus, and my associate supervisor, Associate Professor
Emad F. Gad, during the several milestones of my research studies is
appreciated. I would also like to thank Reza for the constant support in all aspects.
DECLARATION
I declare that this thesis:
• contains no material which has been accepted for an award to me of any
other degree or diploma;
• to the best of my knowledge, contains no material previously published
or written by another person except where due reference is made in the
text of this thesis;
• where the work is based on joint research or publications, discloses the
relative contributions of the respective workers or authors;
• has been professionally edited and the editing has addressed only the style
of the thesis and not its substantive content.
Signature:
Azam Dehghan Date:
Contents
List of Symbols and Abbreviations
1 Introduction
1.1 Background
1.2 Introduction to the Modelling of Failures in Water Pipes
1.3 Objectives and Structure of Study
2 Literature Review
2.1 Introduction
2.2 Mechanical Properties and Manufacturing Techniques of Cast Iron Pipes
2.3 Structural Failures of Cast Iron Pipes
2.4 Effective Factors in Pipe Failure Mechanisms: A Review of Previous Studies
2.4.1 Pipe diameter
2.4.2 Pipe length
2.4.3 Pipe age
2.4.4 Pipe material
2.4.5 Manufacturing methods
2.4.6 Corrosion
2.4.7 Pipe’s failure history
2.4.8 Water pressure
2.4.9 Soil condition of the bedding
2.4.10 Seasonal variations
2.5 Current Models Developed for Pipe Failure Analysis and Prediction
2.5.1 Physical analysis
2.5.2 Descriptive analysis
2.5.3 Statistical analysis
2.6 Reliability Analysis of Water Networks
2.7 Milestones of Study and Summary
3 Data Description
3.1 Typical Failure Data in Water Distribution Systems
3.2 Contents of Database of This Study
3.3 Spatial Location of Pipes
3.3.1 Estimation of postcodes for given AMG coordinates
3.3.2 Estimation of postcode for pipes with no spatial data
3.3.3 Distribution of failures in different postcodes
3.4 Adding the Rainfall Information to the Data
3.5 Failure Times
3.6 Summary
4 Intelligent Reliability Analysis of Water Pipes Using Artificial Neural Networks
4.1 Introduction
4.2 Reliability Analysis: Principles and Definitions
4.2.1 Reliability of water distribution systems
4.3 Objectives of The Proposed Reliability Analysis
4.4 Structure of the Proposed Reliability Model
4.5 Empirical Estimation of Survival Functions
4.6 Weibull and Lognormal Lifetime Models
4.6.1 Weibull lifetime distribution
4.6.2 Lognormal lifetime distribution
4.7 Intelligent Reliability Prediction by Artificial Neural Networks
4.8 Simulation Results
4.9 Conclusions
5 Characteristics of Water Main Lifetimes as Random Processes
5.1 Introduction
5.2 Non-Stationary Random Failure Processes and Parametric Lifetime Models
5.3 Likelihood of Number of Failures: A Probabilistic Definition for Failure Frequency
5.4 Derivation of Theoretical LNF Values From Lifetime Models
5.5 Empirical Calculation of LNF Values
5.6 Case Study
5.6.1 Effect of rainfall on failure rates
5.7 Conclusion
6 A Non-Parametric Technique for Failure Prediction of Deteriorating Components
6.1 Introduction
6.2 Maximum Likelihood Estimation of Future LNF Values
6.3 Required Level of Accuracy for the Inter-Failure Times
6.4 Prediction of Inter-Failure Times
6.5 Failure Prediction Using The Estimated LNF Values
6.5.1 Prediction of number of failures
6.5.2 Confidence intervals
6.5.3 Failure prediction for multiple future time intervals
6.5.4 A step-by-step algorithm for failure prediction
6.6 Results of Failure Prediction Using the Proposed Non-Parametric Technique
6.7 Conclusions
7 Conclusions and Recommendations for Further Work
7.1 Summary of Study and Achievements
7.2 Recommendations for Further Work
Bibliography
A An Introduction to Artificial Neural Networks
A.1 Introduction
A.2 McCulloch-Pitts Neurons
A.3 Linear Neuron Models
A.4 Multi-Layer Feed-Forward Perceptrons
B Instruction Manual for Practitioners
B.1 Introduction
B.2 Preparation of Training Data
B.3 Training the Artificial Neural Network
B.3.1 Learning by Error Back-Propagation
B.3.2 Step-by-Step Training Algorithm
B.4 Prioritisation of Pipes by Reliability Prediction Using the Neural Network Model
B.5 MATLAB Source Code
C Research Publications
List of Symbols and Abbreviations
Abbreviation Description                                            Page of 1st appearance
µk           The inter-failure time elapsed between two             124
             consecutive NOFk events
∆H           head loss due to friction                              26
AC pipes     Asbestos Cement pipes                                  23
AMG          Australian Map Grids                                   73
ANN          Artificial Neural Network                              9
AWWARF       American Water Works Association Research Foundation   55
CHM          Hazen-Williams coefficient                             26
CBD          Central Business District                              9
CDF          Cumulative Distribution Function                       48
CI pipes     Cast Iron pipes                                        23
CICL         Cast Iron Cement Lined                                 71
CWW          City West Water Pty Ltd                                9
D            internal pipe diameter                                 26
DI pipes     Ductile Iron pipes                                     23
DSS          Decision Support System                                32
Eh           Redox potential (millivolts)                           46
ENOF         Expected Number Of Failures                            105
FIR          Finite Impulse Response                                127
FOM          Force Of Mortality                                     48
GCI pipes    Grey Cast Iron pipes                                   24
h(x)         Hazard function                                        48
H(x)         Cumulative hazard function                             48
IFTi         The Inter-Failure Time between the times Ti−1 and Ti   79
KPI          key performance indicators                             69
L            Length of pipe                                         26
LNF          Likelihood of Number of Failures                       104
ML           Maximum Likelihood                                     125
MSE          Mean Square Error                                      95
NDT          Non-Destructive Technique                              27
nh           The number of neurons in the hidden layer of the       93
             ANN-based intelligent reliability model
NOFk(nT)     The event of occurrence of k failures during the       104
             n-th time interval [(n − 1)T, nT]
pdf          probability density function                           48
pH           Soil pH                                                46
PHM          Proportional Hazard Model                              49
pmf          probability mass function                              108
Q            Water flow                                             26
ROCOF        rate of occurrence of failures                         6
S(x)         Survival function                                      48
SR           Saturated soil resistivity (ohm-cm)                    46
STEM         Shifted Time-Exponential Model                         40
STPM         Shifted Time-Power Model                               42
TFF          Time to the first failure                              104
VAR          Variance                                               126
WPSI model   Winkler Pipe-Soil Interaction model                    32
List of Figures
1.1 The bathtub curve of the life cycle of a buried pipe
2.1 Cumulative failure plot for a single pipe in CWW
2.2 The shifted time exponential model is fitted to past failure rates which are the cumulative number of previous failures at different times in the life of the pipe. Larger λ corresponds to a worse performance and vice versa.
3.1 Failure data in water networks may be available only during specific time windows and include left and right censored outside the window.
3.2 Plot of AMGX coordinates versus the pipes' unique IDs
3.3 Plot of AMGY coordinates versus the pipes' unique IDs
3.4 Failure rates in each of the postcodes 3000–3039 in the region under study (average number of breaks per km during 1997–2000).
3.5 Failure rates in each of the postcodes 3040–5277 in the region under study (average number of breaks per km during 1997–2000).
3.6 Geographical map of the licence area of City West Water.
3.7 Quarterly recorded rainfalls during 1997-2000
3.8 Histograms of monthly records of rainfall
3.9 Failure and inter-failure times for a class of pipes
4.1 Schematic diagram of a survival function estimator for a water pipe with given type (material), diameter and construction date: The estimator gives the pipe reliability to survive until a given assessment date in the future.
4.2 The step-wise empirical survival function
4.3 Architecture of the proposed neural reliability analyser
4.4 Diagram of an artificial neuron model in a multi-layer feed-forward perceptron network.
4.5 Nonlinear profile of the Sigmoid function, the activation function of all neurons in the proposed ANN-based reliability model.
4.6 Empirical and modelled survival function plots (class 1)
4.7 Empirical and modelled survival function plots (class 2)
4.8 Empirical and modelled survival function plots (class 5)
5.1 Demonstration of inter-failure times in three instances
5.2 Empirical LNF values P0, P3 and P4
5.3 Expansive soils in Victoria
5.4 Rainfalls and corresponding empirical average number of failures
5.5 Daily average number of failures during each season in 1997-2000, and the corresponding rainfall records for CICL pipes with 100mm diameter.
5.6 Daily average number of failures during each season in 1997-2000, and the corresponding rainfall records for CI pipes with 100mm diameter.
5.7 Deviation of rainfalls from their average, plotted versus the corresponding ENOF values: A regression line demonstrates the nearly linear correlation between the failure rates and rainfall deviations.
6.1 Inter-failure time, showing an example where µ2 = 4T
6.2 Variance of the estimated inter-failure times is a decreasing function of the maximum likelihood estimates of Pk values.
6.3 The number of failures occurring during 20 consecutive time intervals (top plot) and the inter-failure times, obtained from these data. The µk’s are constant and change only when an NOFk occurs. Therefore, at each time, only one µk changes and the rest stay at the same value.
6.4 A case example to show the unreasonable results with using the mode of distribution instead of the statistical mean of number of failures for prediction.
6.5 An example of LNF values for the number of failures per day, to be the base of computation of weekly and monthly LNF values.
6.6 LNF values for the number of failures per week, based on the daily LNF values plotted in Figure 6.5.
6.7 LNF values for the number of failures per month, based on the daily LNF values plotted in Figure 6.5.
6.8 A step-by-step algorithm for the proposed non-parametric failure prediction technique.
6.9 Expected number of failures and their 80% confidence intervals based on weekly updating, for CICL pipes with 100mm diameter located in postcode 3021.
6.10 Expected number of failures and their 80% confidence intervals based on monthly updating, for CICL pipes with 100mm diameter located in postcode 3021.
6.11 Expected number of failures and their 80% confidence intervals based on quarterly updating, for CICL pipes with 100mm diameter located in postcode 3021.
6.12 Expected number of failures and 80% confidence interval based on monthly updating, compared to predictions given by simple averaging of recent records.
A.1 Some examples of activation function or “nonlinearity” of a unit
A.2 Architecture of a typical feed-forward neural network
B.1 Input layer of the ANN model.
B.2 Hidden layer of the ANN model.
B.3 Hidden layer of the ANN model.
B.4 An example of the reliability versus age plot of a model derived for a class of pipes.
B.5 The reliability of the same class of pipes (as in Figure B.4) in the year 2015, plotted versus the construction year of the pipes in that class.
B.6 For the “red” and “blue” classes of pipes, the pipes constructed before the year Y1 and Y2, respectively, are high-risk.
B.7 The source of the Matlab function that realises the three layers of the Neural Network. For a given set of input values, this function returns the output of the network and the activation values (outputs) of the neurons in the hidden layer.
B.8 One epoch of the training process of Neural Network by error back-propagation. This epoch is repeated until the estimation error of the neural network falls below a small given threshold.
B.9 Page 1 of the code that repeats the training epochs until convergence. It reads the failure history from a Microsoft Excel file.
B.10 Page 2 of the code that repeats the training epochs until convergence. It reads the failure history from a Microsoft Excel file.
B.11 Example of the Excel file to be processed by the Matlab program.
List of Tables
2.1 Factors affecting structural deterioration of water distribution pipes (Rostum, 1997)
2.2 Sample relative failure rates for different pipe materials
2.3 Estimated water leakage in 12 U.S. cities in 1978
2.4 Water loss percentages for different causes, measured in Boston, 1978
2.5 Importance of different criteria for replacement/rehabilitation
2.6 Deterministic Time Exponential Models
2.7 Deterministic Power and Linear models
2.8 Probabilistic models using time-dependent Poisson model
2.9 Probabilistic models using Cox’s proportional hazard
2.10 Probabilistic models using Weibull hazard function
2.11 Miscellaneous probabilistic models
3.1 Construction history of cast iron mains in City West Water (Righetti, 2001)
3.2 Statistics of the six classes of pipes in the dataset, selected for reliability analysis and failure prediction in this study.
4.1 MSE of survival function estimation by various models
5.1 Approximate clay content of different types of soils
5.2 The empirical expected number of failures and their standard deviations for 16 consecutive seasons during 1997-2000.
5.3 Properties of different texture types found in soils (Northcote)
6.1 Rejection rate for weekly, monthly and quarterly updating
B.1 An example of pipe material and diameter coding.
Chapter 1
Introduction
1.1 Background
The performance of water distribution systems is primarily assessed based
on the standards governing the process of delivering water to customers.
Such standards are generally established in relation to quantity, quality
and reliability of the service, as follows:
Quantity: Water distribution systems should provide the required flows at,
or above, the minimum pressure. Hydraulic reliability of water distribution
systems assists the managers of water distribution companies to guarantee
the quantity requirements of the network. For instance, nodes in water
networks are supposed to receive a given supply at a given pressure (head)
and hence actual performance can be compared with the required standards.
Quality: Water distribution systems should ensure that the quality of the water
(in terms of flavour, odour, appearance and sanitary security) delivered
to consumers complies with the established regulations and standards for
drinking water; this is a necessary function of the supply of water. Since safe
drinking water is essential to a healthy life, every effort needs to be made to
ensure that drinking water suppliers provide consumers with water that is
safe to use. Satisfying this criterion remains a critical priority that should
be considered in every function within a water distribution system. For
instance, according to Australian Drinking Water Guidelines (2004): “There
should be effective maintenance procedures to repair faults and burst mains in a
manner that will prevent contamination.”
Reliability: Water distribution systems should keep unscheduled disruptions
to supply below the acceptable level specified by the authorities.
In this sense, improvement of network reliability leads to reduction of water
losses.
In order to meet the quantity, quality and reliability standards, water
distribution systems need to be constantly monitored to detect any faults
regarding the water quantity (including delivery pressure), quality and dis-
tribution system reliability.
A water distribution system mainly consists of pipes, valves, storage
tanks and pumping stations. Among these components, water reticulation
pipes, as the lifelines of urban and rural communities, are the primary parts
and are also typically known to be the most maintenance-intensive assets of
water supply systems. Water mains are continuously subjected to numerous
types of environmental and operational stresses that lead to their deteriora-
tion. Increased operation and maintenance costs, water losses and reduction
of water quality are common consequences of deterioration of water mains.
Thus, to maintain the acceptable level of service, the ongoing challenge
facing managers of water distribution systems is the appropriate maintenance
of water pipes, typically within financial constraints.
Against this backdrop, there is also often a need to improve the reliability
of the system and to improve the service delivered to the users. One of
the issues in this environment is the lack of regular condition monitoring of
the asset. Data is mainly obtained on failure of the asset by one mode or
another, rather than by regular measurement of the state of the asset.
Water distribution networks comprise thousands of kilometres of reticulation
pipes that are not in perfect condition owing to deterioration, construction
method or inappropriate instruction order. Obviously, overall replacement
of these parts of the networks is not economically feasible for the asset owners
and operators. Therefore, it is necessary to handle older networks in
appropriate ways. Besides, many old water pipes are still functioning
satisfactorily. Age of the pipes, therefore, cannot be used as the sole criterion
in their assessment. It should be noted that the water networks of most
of the developed municipalities within the region under study were
constructed more than 100 years ago. The region under study comprises the western
suburbs and inner sections of the City of Melbourne in Australia. The older
parts of the networks were built in accordance with standards and construction
practices that are now considered inappropriate.
A number of researchers have been trying to develop decision support
tools to assist the managers of water companies to prioritise their assets in
terms of condition and develop criteria and programs for maintenance/replacement
of water mains. Some of these criteria can be explicitly quantified (e.g.
maintenance cost, capital cost, hydraulic carrying capacity at present and
future demands). However, quantifying some other criteria such as reliabil-
ity and the social costs associated with pipe failures may require surrogate
or implicit evaluation techniques.
A key challenge that has attracted the attention of many researchers in
the field of water supply during recent decades is the development of
reliable models for prediction of the future maintenance/replacement
requirements of water mains. Traditionally, in water supply systems, a sig-
nificant number of network repairs are performed on an unscheduled basis
generally in response to pipe or other component failure. This reactive
maintenance of assets has the disadvantage that measures are not taken
before the occurrence of damage, and pipe failures cause considerable costs
and inconvenience for water distribution systems and society. Considering
the limited financial resources, the ability to avoid frequent damages and to
optimise the use of available funds is highly desirable.
Since it is not practical or economically possible to replace the entire
length of the network, in a proactive management scheme (which provides
timely maintenance), the assets are prioritised for their repair, rehabilitation
or replacement. This is a proactive strategy in the sense that it removes the
need to wait for failures to occur before fixing and measuring their rate of
occurrence. In such a strategy, the manager of the system determines the
maintenance requirements of water mains by taking into account the state
of the pipes and forecasting their future performance. Employing predictive
models is therefore the preferred option for water network management.
The initial motivation for this research was the situation of
a water supply network in the west of Melbourne, Victoria, Australia.
In 1995/96, the water mains of City West Water PTY LTD (CWW) were
experiencing the highest rate of failures among Australian water distribution
systems (WSAA facts’99 1999). In the same year, CWW’s Water Reticulation
Asset Status (Pipe Structural Performance) report (Water Reticulation Asset
Status Report 1997) stated that in 1995/96, CWW had 3.1 times the
break rate of another water retailer in Melbourne, namely South East Water.
Given this background, CWW has, since 1999, recognised the need for the failure
analysis of water pipes.
The managers of CWW required more control over the number
of interruptions to the service of consumers. Thus, the need for efficient
maintenance/rehabilitation planning based on reliable failure prediction was
recognised. Accordingly, an investigation of cast iron pipes and associated
failures with a view to formulating a strategy for cost-effective asset man-
agement in both the short and longer term was conducted (Righetti 2001).
In response to the above-mentioned demand, existing techniques for
failure prediction of water pipes were reviewed, and new methods and analyses
were developed to accommodate the common requirements and conditions of water
mains and failure databases of water authorities like CWW. The results of
this study are presented in this thesis.
The main approach chosen in this thesis to tackle the failure analysis
problem is predictive analysis which is based on modelling past pipe break-
age behaviour in order to project it into the future. The main focus is on
deriving failure/reliability models in terms of the component-based char-
acteristics of the pipes such as their diameter, age, type and geographical
location, regardless of the system-based characteristics of the pipes such as
their maximum pressure and nodal isolation.
This study aims to provide reliable techniques for water distribution
managers in developing their maintenance/rehabilitation policies. Exist-
ing techniques for prediction of pipe failure were investigated, in order to
adopt a proper technique that meets the objective of this study. The type
of available data was an important factor in determining the direction of
research.
The ability to confidently estimate the economically viable life of water
mains is fundamental to developing the maintenance/rehabilitation strategies
in water distribution systems. The economically viable life of a water
main should be assessed by analysis of the level of disruption avoidance
and continued operational and maintenance costs versus rehabilitation and
replacement costs, such that the best long-term solution can be identified
(Skipworth et al. 2002).
The main difficulty associated with developing efficient rehabilitation or
maintenance strategies for water mains is that the water mains are under
the ground and therefore, monitoring of the physical condition of each and
every pipe is not feasible. One of the prerequisites of undertaking this task
in a proactive manner is the ability to predict the expected failure behaviour
of pipes.
Previous research reviewed in this chapter has shown that the failure
process is a complex function of a large number of variables, some of which
cannot be directly quantified. To deliver drinking water to the consumers
under certain standards on quantity, quality and reliability criteria, water
distribution companies need more than experienced managers and opera-
tors. In fact, there is a need for mathematical models to estimate future
pipe failures with an acceptable confidence in order to plan a maintenance
or replacement strategy for their network. Such models can be substan-
tially beneficial to water authorities through the efficient improvement of
their maintenance/rehabilitation planning by replacing the common reactive
strategies with prediction-based proactive strategies.
Recognising that maintenance/replacement strategies are developed to
keep the annual service interruptions to no more than a certain number, e.g. five,
this study concentrates on developing reliable estimators either for the
expected number of failures or for the reliability of a class of pipes at a certain time.
Therefore, this study explores a proper technique for statistical analysis of
failure data.
A range of existing failure analysis models are deterministic in the sense
that their outcome is a value such as number of failures in a certain time.
However, probabilistic measures are more reliable and preferred for devel-
opment of maintenance/rehabilitation strategies. Indeed, probabilistic ap-
proaches are more realistic as they can provide confidence intervals for their
estimates. Confidence intervals give the planners an idea of how reliable
the predictions are. Thus, this research follows a probabilistic approach to
devise new prediction techniques for water pipe failures.
To perform the statistical analysis, this study adopts a probabilistic
approach. The reason for this choice is that the outcome of a probabilis-
tic model, unlike deterministic models, is a single probability (or a set of
probabilities), not just a certain value. Considering the complexity of the
diverse range of factors affecting the mechanism of failure, such
estimates are more reliable than deterministic results as a basis for
developing maintenance/rehabilitation strategies in water distribution
systems. Moreover, the outcome of these techniques can be estimated within a
certain confidence interval. These features, together with the fact that
water networks cannot be monitored regularly or frequently, make this
class of analysis more appealing to water supply managers and researchers.
Based on critical review of existing models for prediction of structural
failures of water pipes it was found that predicting the future failure be-
haviour of a pipe for which little or no burst data exists is very difficult, and
even models based on “grouped” analysis have a high degree of uncertainty
associated with such single pipes.
1.2 Introduction to the Modelling of Failures in
Water Pipes
Prediction of future behaviour of pipes in a water supply system is a very
complex task. The factors that lead to pipe failures vary from one
water distribution system to another, so the failure patterns of a
network in one city differ from those in another location. These patterns
even differ between groups of pipes within a single network, because
the relative importance of the contributing factors may vary. This
means that failure analysis should be performed for individual pipes or pipes
with similar failure-related characteristics.
Observing the physical condition of water pipes can reveal some information
about their structural state. For example, with new non-destructive
testing techniques, it is possible to measure the average pit depth and also
the maximum pit depth caused by internal corrosion. However, these tech-
niques are time consuming and rather expensive. Therefore, the approach
of physical analysis is only practical for critical water mains.
As constant monitoring of the physical condition of the numerous pipes
within a deteriorating network is not feasible, a practical alternative is
prioritisation by statistical analysis. One of the first solutions in this
framework was the attempt to establish the life cycle of a typical buried
pipe. A number of researchers, e.g. Ascher and Feingold (1984), described
the pattern of a pipe’s life cycle by a so-called “bathtub curve”, named
after its characteristic shape, as illustrated in Figure 1.1. It is the plot
of the hazard function against time, which is well known in reliability
analysis of mechanical units. More precisely, the bathtub curve illustrates
the temporal development of the rate of occurrence of failures (ROCOF) in a
water pipe over its service life.
When modelled by a bathtub curve, the life cycle of a pipe is assumed to
consist of three phases. The first years after installation, the burn-in
phase, are associated with failures mostly due to faulty pipes or
installation errors. The frequency of failures decreases in this phase until
the pipe reaches a stable condition that is almost trouble free (the in-usage
phase). This phase of the pipe’s life cycle continues until the pipe’s age
and the accumulated degradation cause an increasing frequency of failures.
This last stage of the life cycle, the wear-out phase, is of most importance
in developing the maintenance strategies for mature water distribution
systems.
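The three phases described above can be sketched numerically. The following Python fragment is only an illustration: the functional form (a decaying exponential for the burn-in phase, a constant for the in-usage phase and a Weibull-type power term for the wear-out phase) and all parameter values are assumptions chosen to reproduce a bathtub shape. They are not taken from this study or from the cited references.

```python
import math


def bathtub_rocof(age_years, burn_in=0.5, decay=0.3, baseline=0.05,
                  wear_scale=80.0, wear_shape=4.0):
    """Hypothetical bathtub-shaped ROCOF as a function of pipe age.

    All parameters are illustrative assumptions, not fitted values.
    """
    # Burn-in phase: early failures that die out exponentially with age.
    early = burn_in * math.exp(-decay * age_years)
    # Wear-out phase: Weibull-type hazard that grows with age.
    wear_out = (wear_shape / wear_scale) * (age_years / wear_scale) ** (wear_shape - 1)
    # In-usage phase: a constant background rate dominates in between.
    return early + baseline + wear_out


ages = [0, 5, 20, 40, 60, 80, 100]
rates = [bathtub_rocof(a) for a in ages]

for age, rate in zip(ages, rates):
    print(f"age {age:3d} years: ROCOF = {rate:.4f}")
```

With these assumed values the rate is highest at installation, falls to a minimum around the middle of the listed ages and rises again for old pipes, which is the qualitative behaviour the bathtub curve is meant to capture.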
Figure 1.1: The bathtub curve of the life cycle of a buried pipe (Kleiner and
Rajani 2001)
A bathtub curve cannot always accurately describe the life cycle of all
pipes. Pipes under different conditions experience various extents and
patterns of stages in their life. Some of the models explained in
Chapter 2 assume different numbers of phases for the life cycle of pipes.
Some assume different curve shapes for ROCOF-time plots (e.g., alternative
B to the wear-out phase in Figure 1.1).
It should be mentioned that the maintenance databases available for most
water distribution systems do not cover the early years of pipes laid
about 100 years ago. More specifically, the data available for water
distribution systems are both left and right censored, in the sense that
failures in the early years are usually not recorded and future failures
are yet to be observed. Proactive management of assets in these systems can
only be performed by using decision support tools that can extract realistic
estimates of the future performance of the assets from such data.
Proactive maintenance strategies for water mains usually use estimates
of one of the following quantities as their key decision-making criterion:
(a) The number of failures in a given time period;
(b) The time remaining until the next failure;
(c) The statistical distribution of (a); or
(d) The statistical distribution of (b).
To obtain such estimates, most statistical models implicitly assume that the
pattern of occurrences of failures repeats over time and therefore, it can be
modelled based on the failure history.
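As a simple illustration of how quantities (a) to (d) relate to each other, the sketch below assumes that failures of a class of pipes follow a homogeneous Poisson process with a constant rate. This stationary assumption, the rate and the forecast horizon are all hypothetical and are used here only because they lead to closed-form expressions.

```python
import math

# Hypothetical values: failures per year for one class of pipes,
# and the length of the forecast period in years.
rate = 0.4
horizon = 5.0


def poisson_pmf(k, mean):
    """Probability of exactly k failures when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_next_failure_within(t):
    """Probability that the next failure occurs within t years (exponential law)."""
    return 1.0 - math.exp(-rate * t)


# (a) expected number of failures in the period
expected_failures = rate * horizon
# (b) expected time until the next failure
mean_time_to_next = 1.0 / rate
# (c) distribution of the number of failures (first few probabilities)
count_distribution = [poisson_pmf(k, expected_failures) for k in range(6)]

print(expected_failures, mean_time_to_next)
print([round(p, 4) for p in count_distribution])
# (d) distribution of the time to the next failure, evaluated at one year
print(round(prob_next_failure_within(1.0), 4))
```

A deterministic model in the sense used here would report only (a) or (b), whereas a probabilistic model reports (c) or (d), from which bounds on the estimates can be derived.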
Statistical models can be generally categorised into major classes of de-
terministic and probabilistic models. The outcome of a deterministic model
is a certain value such as estimated breakage rate or time to next failure
or number of failures in a certain time in the future (outcomes (a) and (b)
in the above list). These models are applied to groups of pipes that are
homogeneous in terms of some of their breakage-related characteristics such
as diameter, material, length, geographical location, etc. Classification of
pipes is performed in such a way that each class contains pipes of similar
failure patterns. A failure history of each homogeneous group is then used
to estimate the future number of failures (or failure times) for that class.
Probabilistic models on the other hand return the distribution of fail-
ure times or numbers (outcomes (c) and (d) in the above list) in terms of
probabilistic measures such as probability of occurrence of given number
of failures up to a certain time in the future. These models are usually
associated with more complex mathematical frameworks. A probabilistic
approach, however, provides the capability of determining some lower and
upper bounds (confidence intervals) for the estimates, which can be helpful
from a planning point of view.
1.3 Objectives and Structure of Study
This thesis describes the development of new state-of-the-art failure
prediction techniques for water mains. The main motivation of this study was
to contribute to the sustainability of water supply systems, one of the oldest
icons of infrastructure, present in most of the urban and rural communities
across the globe.
The mechanism of pipe failure has been studied by many researchers as
a complex process under the influence of a wide variety of internal and
external factors, e.g. (Clark et al. 1982, Constantine et al. 1998, Kleiner
and Rajani 2001). Each combination of physical, environmental and
operational factors leads to a different pattern of failure rates. These
studies have been conducted with the aim of formulating the failure process
in terms of some of the measurable contributing factors. In many case
studies, the effects of some of these factors on the failure rate have been
examined.
As the first step towards developing a solution for the research problem
of this thesis, previous studies in the literature were thoroughly reviewed.
Chapter 2 presents a review of other studies conducted in this field and the
existing techniques and models. The models are classified into a number
of distinct categories. For each model, its required data, strengths and
limitations, and the model outcomes are presented and evaluated.
A special feature of this thesis is the use of a case study for the purpose
of comparative simulation and evaluation of pipe failure predictions. This
case study uses a failure history of water mains provided by City West Water
Pty Ltd (CWW). This water retailer supplies water to the western and some
inner suburbs and parts of the Central Business District (CBD) of Melbourne,
Australia. Almost all of the break-related challenges in the water industry
are present in this case study.
The type of data that is available in most water distribution systems is
explained in Chapter 3. More specifically, this chapter describes the
characteristics and limitations of the database provided by CWW that is
used and improved in this study. It should be noted that the data used
in this study consist of failures of metal pipes digitally recorded during
1997-2000. The database has undergone a significant data audit in a recent
study supported by CWW.
On the basis of the reviews and analysis in Chapters 2 and 4, a technique
for predicting the reliability of water mains at a given time in the future
is proposed. Reliability analysis is performed separately for each class of
pipes, where each class is chosen to be homogeneous in terms of some
break-related characteristics. The reliability of a class of pipes at a given
time in the future is defined as the probability that the class of pipes
operates with no failure until that time. An Artificial Neural Network (ANN)
is used to produce the reliability estimation models for separate classes of
pipes.
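The definition of reliability given above can be illustrated with a simple empirical estimate: the fraction of observed times-to-failure that exceed the time of interest. The sketch below is not the ANN model proposed in this thesis. The failure times are hypothetical, and the estimate ignores the censoring that real failure histories exhibit.

```python
def empirical_reliability(failure_times, t):
    """Fraction of observed times-to-failure that exceed t.

    This is a naive estimate of the probability of no failure up to time t;
    it treats every record as a complete (uncensored) observation.
    """
    if not failure_times:
        raise ValueError("at least one observation is required")
    survivors = sum(1 for x in failure_times if x > t)
    return survivors / len(failure_times)


# Hypothetical times to failure (in days) for one homogeneous class of pipes.
times = [30, 45, 60, 90, 120, 200, 365, 400]

for t in (0, 100, 400):
    print(t, empirical_reliability(times, t))
```

The estimate starts at one, decreases as the time of interest increases and reaches zero once every observed failure time has been passed.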
The use of ANNs for modelling and estimation is a well-established technique
in reliability analysis of mechanical and electrical systems. This method is
a powerful tool with applications in many fields involving curve fitting,
pattern recognition, etc. Furthermore, ANNs are capable of learning
underlying physical models from noisy and fluctuating patterns of incomplete
and censored data. Chapter 4 provides a short review of feed-forward
artificial neural network models for homogeneous groups of pipes.
An ANN technique is then formulated to generate reliability estimation
models. These neural network models are trained and evaluated for each
class of pipes using the case study dataset described in Chapter 3.
The process of training the ANN model using training records and vali-
dation of the trained model using the rest of the data (validation records)
are also explained in Chapter 4. For the purpose of comparison, Weibull
and lognormal distribution models are fitted to the failure records in the
case study and applied to estimation of the reliability of the same classes
of pipes used with the ANN model. The accuracy of reliability estimates
given by the proposed neural network method is compared to the outcomes
of the Weibull and lognormal estimators.
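For reference, the reliability functions implied by the two classical lifetime distributions can be written in a few lines. The sketch below uses the standard textbook forms of the Weibull and lognormal survival functions. The failure times and the Weibull parameters are hypothetical, only the closed-form maximum likelihood fit of the lognormal model is shown, and none of this is claimed to be the fitting procedure used in the thesis.

```python
import math


def weibull_reliability(t, shape, scale):
    """R(t) = exp(-(t / scale) ** shape) for a two-parameter Weibull model."""
    return math.exp(-((t / scale) ** shape))


def lognormal_reliability(t, mu, sigma):
    """R(t) = 1 - Phi((ln t - mu) / sigma), written with the erfc function."""
    z = (math.log(t) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def fit_lognormal(times):
    """Maximum likelihood estimates: mean and standard deviation of ln(times)."""
    logs = [math.log(x) for x in times]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


# Hypothetical times to failure (in days) for one class of pipes.
times = [30.0, 45.0, 60.0, 90.0, 120.0, 200.0, 365.0, 400.0]
mu, sigma = fit_lognormal(times)

# Hypothetical Weibull parameters; a Weibull fit needs an iterative solver
# and is omitted from this sketch.
shape, scale = 1.5, 300.0

for t in (50.0, 150.0, 300.0):
    print(t, round(weibull_reliability(t, shape, scale), 4),
          round(lognormal_reliability(t, mu, sigma), 4))
```

Both functions assume a fixed functional form for the whole lifetime of the pipe class, which is the property that distinguishes them from a model that learns the shape of the reliability curve from the data.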
Although the resulting neural network models are trained and evaluated
using the particular failure history, a similar procedure can be repeated
to produce models for other failure histories. Hence, the presented neural
modelling method is a generic technique and can be modified for other
failure datasets, as well.
The resulting neural network model is also compared to the two classical
lifetime models (Weibull and lognormal) using the existing database. Unlike
the distribution models, the neural network technique inherently learns and
reconstructs the pattern of the data and does not assume a predefined
behaviour for the data. This characteristic makes this model a general tool
not restricted to particular data.
Chapter 5 studies the underlying perceptions of the statistical models
existing in the literature and their assumptions on the nature of failure
occurrences. In a novel approach, this chapter looks at the ensemble of pipe
failure occurrences as a random process. A set of probabilistic values is
introduced to investigate the statistical characteristics of this
random process. Visualising the variations of this random process using
these probabilistic values demonstrates the non-stationarity of the failure
process. This important characteristic of the failure process of water pipes
is illustrated by histograms of empirical values of failure probabilities for the
number of failures occurring per season using the available failure records.
The variations and trends in the patterns of failure frequencies in the
CWW database - which are due to the non-stationarity of failure process
- are studied in similar seasons of consecutive years. This characteristic is
not exclusive to this database and applies to any other failure history. The
reason for the non-stationarity is that in addition to the static characteristics
of pipes, there is a range of dynamic factors that affect the rate of failures
of water mains.
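The comparison of similar seasons in consecutive years can be sketched as a simple counting exercise. The fragment below is an assumption-laden illustration rather than the analysis of Chapter 5: the failure dates are invented, the seasons follow the southern hemisphere convention, and December is assigned to the summer of the following year so that each summer is a contiguous period.

```python
from collections import Counter
from datetime import date

# Southern hemisphere seasons by month (an assumption suited to Melbourne).
SEASON_BY_MONTH = {
    12: "summer", 1: "summer", 2: "summer",
    3: "autumn", 4: "autumn", 5: "autumn",
    6: "winter", 7: "winter", 8: "winter",
    9: "spring", 10: "spring", 11: "spring",
}


def season_key(d):
    """Return (year, season); December counts towards the next year's summer."""
    year = d.year + 1 if d.month == 12 else d.year
    return (year, SEASON_BY_MONTH[d.month])


def failures_per_season(failure_dates):
    """Count the recorded failures in each (year, season) pair."""
    return Counter(season_key(d) for d in failure_dates)


# Hypothetical failure dates for one class of pipes.
dates = [
    date(1997, 1, 15), date(1997, 2, 3), date(1997, 7, 20),
    date(1997, 12, 28), date(1998, 1, 9), date(1998, 1, 30),
    date(1998, 2, 11), date(1998, 7, 2),
]
counts = failures_per_season(dates)

for key in sorted(counts):
    print(key, counts[key])
```

If the failure process were stationary, the counts for the same season would fluctuate around a common level from year to year. A systematic change between years, as in the invented summer counts here, is the kind of pattern that indicates non-stationarity.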
Dynamic factors such as seasonal variations of rainfall, which are the
source of changes and trends in patterns of failures, are not easily taken
into account by existing techniques. These time variant factors are exclu-
sive to each network (region) and vary according to the characteristics of
the location. For example, the CWW network is located on a bedding of
expansive soil. This is also the case for many other cities such as Montreal,
Canada and parts of Texas, USA. Water mains at these locations suffer from
shrinkage and expansion of soil due to changes in moisture content of soil
as a result of rainfall variation. In other words, rainfall may substantially
affect the failure process of water pipes of such areas. The research in this
study examines the variations of failure rates in terms of changes in rainfall.
The deficiency of existing probabilistic models of reliability and lifetime
(all of which are parametric) in capturing the proven non-stationary process
of pipe failures is mathematically demonstrated in Chapter 5. To model
the pattern of failure occurrences, parametric models assume a particular
probability density function (pdf) for the structural failures. Thus, these
models yield biased results by ignoring time-varying factors in the statistical
analysis of break rates.
There are many time-variant factors (e.g. temperature effects; soil-
moisture effects; cumulative length of replaced water mains; cumulative
length of cathodically protected water mains; loss of bedding support;
corrosion pit growth) that affect the rate of failures and are almost
impossible to take into account together in a single general parametric
model. Even if
all of these data were recorded during the age of water mains, any attempt
to estimate these factors in a given time in the future in order to predict the
pipe condition would involve considerable uncertainty due to large spatial
and temporal variability that is inherent in this information.
Given the background in Chapter 5 regarding the limitations within the
data and complexity of process, the best way to account for the variation of
environmental and operational factors is to consider the factors implicitly
by dividing the pipes into homogeneous groups and attempting to develop
a technique to produce non-parametric models for each group. In this
context, even neural network models are not free of parameters. Although the
parameters of neural networks, namely the synaptic weights, are tuned using
the training records, these parameters are then fixed for the models. Future
estimations are therefore performed based on these fixed parameters.
After discussing the limitations of using parametric methods in analysis
of the water pipe failure process, an iterative non-parametric approach is
explored to address the problem. Chapter 6 contains a review of a number
of existing studies that have considered the time-variant factors in predict-
ing the future failures. The strengths and limitations of these techniques
are explained. The absence of a generic technique that can be used (with a
degree of modification) on the failure history of water pipes of any network
to provide estimates of future failures for separate groups of pipes was
the motivation for further study on this subject. This kind of estimate
can assist the managers of water distribution systems with the prioritisation
of classes of pipes in their maintenance/replacement planning.
Chapter 6 presents dynamic models that can be updated automatically
as new records are added to the database. Such non-parametric techniques
implicitly tackle the problem of capturing the non-stationarity of the
failure process in spite of incomplete and limited failure histories.
A number of probabilistic values are introduced to express the quantities
representing the gradual variations of the factors influencing the deteriora-
tion process in the pipes which in turn leads to pipe failure. The outcome of
applying this technique to a failure history of a class of pipes is the expected
number of failures up to a time in the future. Simulations are performed
based on a day-by-day updating, and the results of estimations for weekly,
monthly, and quarterly periods of forecasting are presented. The study is
based on a robust mathematical framework and also benefits from theoret-
ical convergence and stability properties demonstrated using principles of
probability theory.
The developed technique is applied to the existing failure history
described in Chapter 3. Details of processing the data, applying the proposed
technique, and the outcomes of this statistical analysis are described
in this chapter. The algorithm for applying the technique to any failure
history is provided. Limitations and advantages of the technique are also
discussed in the chapter. It should be noted that the technique is generic
and can be used for other failure histories as well as for components of some
other infrastructure systems with similar characteristics.
The last chapter of the thesis, Chapter 7, summarises the achievements and
conclusions of the study as well as limitations and recommendations for
further work.
In summary, the objectives of this thesis can be articulated as
follows:
(a) Review of factors affecting pipe failure and of existing models for
predicting future performance;
(b) Analysis of a typical failure database and identification of typical
limitations of databases;
(c) Development of a generic model for predicting future failures based
on existing limited data using ANN;
(d) Examination of failure patterns and development of a more refined
model; and
(e) Development of supporting documentation for the proposed model for
use by water authorities to adapt it to various pipe categories and
data sets.
Chapter 2
Literature Review
2.1 Introduction
Water distribution systems consist of a range of components such as pumps,
nodes, and pipes spread across the geographical network. All of the elements
are liable to fail due to gradual degradation or possible defects in
manufacturing or installation. The cost of maintenance/replacement of
water pipes is a huge burden on water distribution companies and usually
accounts for most of the expenditure of the whole system. In addition to
maintenance costs, disruption to traffic, as well as water losses, reduction
in the quality of service, and reduction in the quality of water are typical
outcomes of pipe failures.
The physical mechanisms that lead to the pipe breakage are often very
complex. On the other hand, developing a performance estimation model
for the pipes requires an understanding of potential causes of pipe failures.
To provide a background about the factors that can be involved in the failure
process, mechanical properties and manufacturing techniques of cast iron
pipes, as the most used type of pipe in mature water distribution systems,
are reviewed in Section 2.2. Common causes of structural failures of cast
iron pipes are then discussed in Section 2.3. In Section 2.4 a review of
previous studies on the impact of factors leading to failure process of water
pipes is presented. Section 2.5 reviews studies previously conducted on
structural failures of water pipes and the associated mathematical analyses
performed by other researchers. The merits and shortcomings of the current
models developed for forecasting the structural failures of water pipes are
discussed in this section and the research problem addressed by this thesis
is defined in terms of the need to fill the existing gaps and shortcomings of
current techniques.
2.2 Mechanical Properties and Manufacturing
Techniques of Cast Iron Pipes
Cast iron is manufactured by adding a large amount of carbon to molten
iron. Carbon typically makes up about 2.5 to 4% of the weight of the various
types of cast iron, whereas for most steels this figure is less than 1.2%.
Adding the extra carbon lowers the melting point of the metal and makes
it more fluid and much easier to cast in complex shapes. However, when
the metal solidifies, some (or most) of the carbon forms graphite flakes.
These flakes reduce the strength of the metal and act as crack formers, initiating
mechanical failures. These flakes cause the metal to behave in a nearly
brittle fashion, rather than displaying the elastic, ductile behaviour of steel.
Fractures in this type of metal tend to take place along the flakes, which
gives the fracture surface a grey colour. This is why the metal is called
grey cast iron. Almost all cast irons contain other elements added
to them to improve their castability, strength, or other useful properties.
For instance, silicon can be found in almost all cast irons as it is added to
lower the metal’s casting temperature. Manganese is also very common to
increase the strength and protect against the detrimental effects of sulphur
impurities in the metal. There are five types of cast iron; however, only
two have been used in the manufacture of water pipes, namely pit cast iron
and spun cast iron. Pipes installed until the early 1970s were made of grey
cast iron, and since then ductile iron has been used for production of metal
pipes.
The presence of graphite flakes in the structure of grey cast iron causes
unusual mechanical properties in the material. While it is not truly brittle,
neither is it a ductile material in the same way as steel or even ductile cast
iron. Another important factor is that grey cast iron has very different
mechanical behaviour in tension than in compression, typically being 2.5 to
3.5 times stronger under the latter loading conditions.
There are two main kinds of grey cast iron pipes, resulting from different
manufacturing methods. The first cast iron pipes were pit cast pipes, while
the more recent manufacturing technique used spin casting. Pit
casting consists of pouring the molten metal into sand moulds that are placed
in a pit. The metal is then allowed to cool and the solidified pipe is taken
out of the pit. Moulds used in spin casting are made of sand or water-cooled
metal, and rotate around the longitudinal axis of the pipe. Owing to the
spinning of the mould, the metal is evenly distributed around the mould.
In addition, the molten metal solidifies much more quickly than in the pit
casting method. This results in a significant difference in the size and
shape of the graphite flakes in grey cast iron.
A major difference between these two techniques of casting is that flakes
are considerably larger in pit cast pipes than in spun cast pipes. The smaller
flakes and their different distribution in the spun cast pipes are the reason
that they are generally much stronger than pit cast pipes.
2.3 Structural Failures of Cast Iron Pipes
There are a number of causes for the structural failures of cast iron water
pipes. Some of the common conditions and failure causes are as follows:
- Corrosion: A common failure cause in metal pipes is corrosion. Although
pipe failures can be due to corrosion alone, in most breakages
with clear mechanical causes corrosion has an accelerating role by
weakening the fabric of the pipe. Simple corrosion pitting is a minor
failure mode in small diameter pipes (reticulation pipes, < 300 mm
diameter), but is more important in pipes with large diameters. Corrosion
of grey cast iron pipes typically comprises two separate but related
processes. Simple corrosion pitting is the same as in steel
pipes, but in cast iron pipes graphitisation can also take place.
Graphitisation removes some of the iron from the pipe, leaving a matrix of
graphite flakes that is held together in part by iron oxide.
- Manufacturing defects: Grey cast iron pipes were manufactured
using a number of different methods; as noted previously, the two
most common were pit casting and spin (centrifugal) casting.
Manufacturing defects are the source of some failures.
Potential structural flaws in pit cast pipes: One of the manufacturing
defects in pit cast pipes is the presence of inclusions. Inclusions
are undesired elements in the structure of the metal that
weaken it. There are two types of inclusions that can be found
in grey cast iron pipes. One type, which appears as small black
spheres, is undissolved ferrosilicon. Iron phosphide is the other
type of inclusion in pit cast pipes. The other common manufacturing
defect of pit cast pipes is the uneven thickness of the
wall around their circumference. When the mould is not aligned
properly, casting will result in variable thicknesses.
Potential structural flaws in spin cast pipes: Spin casting
potentially results in fewer manufacturing defects than pit
casting. Possible problems include inclusions, variations in the
wall thickness of the pipe along its length, porosity and improper
cleaning of the pipe moulds after casting.
- Excessive forces: There are various sources of excessive forces that
may cause pipe breaks or accelerate the occurrence of failures, including:
Ground movement forces directly transferred to the pipe; and
Ground movements farther away along the line, which may
cause a failure at a change in pipeline direction at a thrust block
(Lackington 1991, Pascal and Revol 1994, Dyachkov 1994).
- Human error: Human errors that are potential causes of pipe failures
can occur at the design phase or at other stages of pipeline
construction. For example, some types of corrosion-related failures have
been identified as the result of poor installation techniques. The common
causes of failures due to human errors are:
Third party damage caused by other civil operations;
Human errors during pipe installation that cause future
failures (e.g. a failure initiated from a scratch that
partially removed the pipe’s coating during installation); and
Improper repair practices.
- Multiple event failures: Failure processes of grey cast iron pipes
are usually due to a combination of factors that may include external
loading, internal pressure, manufacturing flaws and corrosion damage
(Morris 1967). Many circumferential and bell split type failures actually
occur as a series of multiple events. Pipe cracks that are detected
usually first appear as water leaking from a small crack. If
the damage is not detected, a second or even third cracking event may
take place, with the process continuing until the pipe fails completely,
unless it is replaced following a leak detection operation.
2.4 Effective Factors in Pipe Failure Mechanisms:
A Review of Previous Studies
In recent decades, many researchers have attempted to relate the rate (fre-
quency) of water pipe failures to the attributes of the pipes and environ-
mental conditions. This section contains a review of the studies that have
been conducted on pipe characteristics in order to determine the quality
and extent of their impact on pipe failure.
A variety of factors contributing to failures of water pipes have been
identified by a number of researchers (Shamir and Howard 1979, Kelly and
O’Day 1982, Goulter and Kazemi 1988, Arnold 1960, Remus 1960, Niemeyer
1960). Fitzgerald (1960) examined pipe failure conditions in the United
States and made specific recommendations that accurate and detailed leak
and failure records should be maintained to develop and (or) maintain ef-
fective programs for break reduction. Morris (1967) suggested a number of
possible causes for water main breaks, but emphasised that “the cause of
water main breaks cannot always be ascertained”.
Rajani and Tesfamariam (2005) reported that, in most cases, a combi-
nation of circumstances leads to the failure of a pipe. Factors contributing
to pipe failure include operational conditions, design parameters, external
loads (traffic, frost, etc.), internal loads (operating and surge pressures),
temperature changes, loss of bedding support, pipe properties and condition,
and corrosion pit geometry. This information is recorded rarely, if at
all, and it is therefore very difficult to ascertain the precise causes of
failure. Even if all this information were available, any attempt to estimate the
state of the pipe condition would involve considerable uncertainty due to a
large spatial and temporal variability that is inherent in this information.
Focusing on the correlation of corrosion rate and failure of cast iron pipes,
the authors concluded that long-term performance of buried cast iron pipes
is dictated by pit growth rate (growth rate of a single corrosion pit), unsup-
ported length of pipe (likely to develop as a result of prolonged leakage or
wash out), material fracture toughness and temperature differential. Also,
soil movement and ground frost development were mentioned as determin-
ing factors.
Pipe breakage is most likely to occur when the environmental and oper-
ational stresses act upon a pipe whose structural integrity has been compro-
mised by corrosion, degradation, inadequate installation or manufacturing
defects (Kleiner and Rajani 2001). Common variables involved in the
deterioration process of water networks can be grouped into structural,
environmental, hydraulic and maintenance variables (Rostum 1997).
Table 2.1: Factors affecting structural deterioration of water distribution
pipes (Rostum 1997)
Structural            External/environmental      Internal            Maintenance
variables             variables                   variables           variables
Location of pipe      Soil type                   Water velocity      Date of failure
Diameter              Loading                     Water pressure      Date of repair
Length                Groundwater                 Water quality       Location of failure
Year of construction  Direct stray current        Water hammer        Type of failure
Pipe material         Bedding condition           Internal corrosion  Previous failure history
Joint method          Leakage rate
Internal protection   Other networks
External protection   Salt for de-icing of roads
Pressure class        Temperature
Wall thickness        External corrosion
Laying depth
Bedding condition
Table 2.1 lists a number of factors in each group of variables. Most of
the factors are constant with time, but some are time-dependent (e.g. water
quality, water velocity). In the following sections, factors that are commonly
assumed to have the greatest impact on pipe failure are discussed.
2.4.1 Pipe diameter
All of the studies reviewed in the literature that investigate the relationship
between the size of a pipe and its failure frequency agree on the existence of
an inverse relationship, e.g. (Andreou 1986, Eisenbeis 1999, Ciottoni 1983,
Kettler and Goulter 1985a, Guan 1995, Mavin 1996). In a recent study,
Boxall et al. (2006) have observed an exponentially decreasing relationship
between failure rate and diameter. The high frequency of failures for pipes
with small diameters can be explained by reduced pipe strength, reduced
wall thickness and less reliable joints for smaller pipes (Kettler and Goulter
1985a).
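The exponentially decreasing relationship described above can be illustrated with a minimal sketch. The functional form rate = a * exp(-b * D) and the default values of a and b are assumptions chosen purely for illustration; they are not taken from Boxall et al. (2006) or from any dataset.

```python
import math


def failure_rate(diameter_mm: float, a: float = 0.5, b: float = 0.01) -> float:
    """Illustrative failure rate (failures/km/year) for a given pipe diameter.

    Assumed form: rate = a * exp(-b * diameter). Both a and b are
    hypothetical placeholders, not fitted values.
    """
    if diameter_mm <= 0:
        raise ValueError("diameter must be positive")
    return a * math.exp(-b * diameter_mm)


if __name__ == "__main__":
    # Smaller diameters give higher illustrative failure rates.
    for d in (100, 150, 300, 600):
        print(f"D = {d:>3} mm -> {failure_rate(d):.4f} failures/km/year")
```

In practice, a and b would have to be estimated from the failure records of the network under study.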
In a study of pipe-soil interaction, Rajani and Tesfamariam (2005) found
that the growth rate of a single corrosion pit in cast iron mains is nearly
always more detrimental to thin (small diameter) than to thick (large
diameter) pipes if all other data or properties remain unchanged.
A sensitivity analysis conducted in a recent study by Tesfamariam et
al. (2006) confirms that large-diameter mains are more sensitive to external
loads, and small-diameter mains are sensitive to the extent of loss of bedding
support.
2.4.2 Pipe length
For long pipes (L > 1000 m), environmental conditions such as the soil
conditions of the bedding and the traffic conditions above the network
can vary along the pipe. Andreou (1986) stated that the hazard function of
each pipe is approximately proportional to the square root of its length. A
number of other researchers (Eisenbeis 1999, Lei 1997, Eisenbeis 1997) have
reported similar findings. The length of a pipe may also be considered as
a surrogate for connection density, and if connections are considered as
points of weakness, shorter pipes may exhibit higher burst rates than longer
pipes (Skipworth et al. 2002).
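The square-root relationship attributed to Andreou (1986) above can be sketched as a simple scaling rule. The function name and the 100 m reference length are hypothetical and serve only to illustrate the proportionality.

```python
import math


def relative_hazard(length_m: float, reference_length_m: float = 100.0) -> float:
    """Hazard of a pipe relative to a reference pipe, assuming the hazard
    is proportional to the square root of pipe length.

    The reference length of 100 m is an arbitrary, hypothetical choice.
    """
    if length_m <= 0 or reference_length_m <= 0:
        raise ValueError("lengths must be positive")
    return math.sqrt(length_m / reference_length_m)


if __name__ == "__main__":
    for length in (100, 400, 1600):
        print(f"L = {length:>4} m -> relative hazard {relative_hazard(length):.2f}")
```

Under this assumption, a pipe four times as long has twice the hazard, so the hazard per unit length falls as the pipe gets longer.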
2.4.3 Pipe age
Early analyses of pipe breakage conditions suggested that there is not a
strong correlation between the rate of pipe breakage and its age (O’Day
et al. 1980, O’Day 1982, O’Day 1983, Ciottoni 1983). However, in later
studies by Clark et al. (1982), Chambers (1983) and Kettler and Goulter
(1985a) a correlation between the age of pipe and frequency of its failure
occurrences was observed. In a quantitative study of the performance of
several water distribution systems, Butler and West (1987) reported that the
average leakage in U.K. water networks, which were about 50 years old, was
about 30%. The corresponding figures for two water distribution systems in
Germany and three in Holland, with average system ages of 20 and 25 years,
ranged from 2% to 15%.
Goulter and Kazemi (1988) later concluded that age should not be the
single factor used for assessing the pipe condition. Also, Herbert (1994)
mentioned the age as an important factor in combination with knowledge
of network condition and weak points to allow accurate assessments.
Andreou et al. (1987) reported a tendency of pipes with failures at early
ages to perform better than pipes that failed at later ages. A number of
researchers reported that during the first few years after installation, pipe
failures occur quite randomly (hardly predictable) due to factors such as
unusual external loads on the pipe (Hoyland and Rausand 1994, Rausand
and Reinertsen 1996).
In some cases, older pipes are more resistant to failure than younger
pipes. For grey cast iron pipes, this can be explained by the thinner walls
produced by newer casting methods. Pipes with thinner walls are more
susceptible to greater effects of corrosion and higher stress levels for the
same external loads compared to pipes with thicker walls. However, jointing
techniques have improved over the years, allowing greater deflections at
joints.
It is also observed that some construction eras have a higher break rate
than others (Pelletier et al. 2003). Different installation eras show different
failure characteristics. It therefore appears that construction practice for
each time period has more effect on the pipe failure characteristics than the
age of the pipe (Andreou et al. 1987).
2.4.4 Pipe material
The strength of a pipe is a major factor in the ability of pipe to resist the
internal and external loads. The material of the pipe also determines the
vulnerability of pipe to corrosion. Kettler and Goulter (1985a) investigated
the variations of failure rates with pipe material and examined the type of
failures (e.g. longitudinal split, joint, or circumferential failure) for different
pipe materials.
During the study of a water distribution system in Trondheim, Norway,
Lei (1997) concluded that pipe material should be treated as a
categorisation factor and not as a covariate. The same conclusion was drawn
by Achim et al. (2007), who illustrated it with three-dimensional plots of
the number of previous failures against pipe age and pipe length for the
two pipe materials for which data existed.
Most water networks are mainly made of cast iron pipes (grey cast iron
and ductile iron pipes) and the longest existing failure records belong to this
class of pipes. Many researchers have focused on analysis of failures of grey
cast iron pipes (Andreou 1986, Goulter and Kazemi 1988, Eisenbeis 1999,
Rajani and Tesfamariam 2005, Tesfamariam et al. 2006). Cement mortar
lining of water mains has been used to minimise corrosion and tuberculation
since the 1920s (Mays 2000).
Since the 1970s, ductile iron (DI) has been used as pipe material in many
water distribution networks (Kirmeyer et al. 1994). A ductile iron pipe is
produced with low contents of phosphorus and sulphur, while magnesium
is added to the molten grey iron prior to casting. Consequently, the final
microstructure of DI consists of a uniform distribution of graphite nodules
within the ferritic iron matrix. In contrast, the graphite is in the form of
flakes in grey cast iron pipes. This superior mechanical structure of DI pipes
compared to cast iron pipes initially caused water distribution managers to
lay these pipes with minimal or no corrosion protection. Within a few years
it became apparent that unprotected DI pipes in aggressive soils tend to
corrode at a rate equal to that of cast iron (CI) pipes. However, because
DI pipes had smaller wall thickness than their equivalent-size CI pipes,
perforation appeared in many cases relatively soon after installation (Rajani
and Kleiner 2003).
The use of asbestos cement (AC) pipes in water networks has usually been
associated with health concerns related to the release of asbestos
fibres into the drinking water due to chemical attack on the asbestos cement
material and the erosion of the internal surface of the pipe by the water.
It is observed that in some environments, AC pipe materials are subject to
damage due to various chemical processes that either leach out the cement
material or penetrate the pipe wall to form the products that weaken the
cement matrix (Mordak and Wheeler 1988).
The other detrimental mechanism observable in AC pipes is corrosion,
which is identified in the form of pits and holes on pipe walls. A num-
ber of chemical agents including acids, sulphates, magnesium salts, alkaline
hydroxides, ammonia and soft water were reported by Nebesar (1983) as
the source of corrosion in this type of pipe. Some organic compounds were
found to be “corrosive” as well. External corrosion of AC pipes follows
the same principles as internal corrosion, i.e. it is governed by the pH,
alkalinity and sulphate content of the soil or groundwater that attacks
the pipes (Jarvis 1998). In recent times, polyvinyl chloride (PVC) and
polyethylene (PE) pipes have been introduced for use in water networks.
Eisenbeis (1997) has presented a statistical analysis of failure rates in
plastic pipes.
Obviously, failure rates differ for various pipe materials. However, due
to the diversity of characteristics and environmental and system conditions
that influence the failure behaviour of pipes, it is not easy to establish
a clear ranking of pipe performance based on material alone. This problem
is illustrated by Table 2.2, which contains an analysis of relative failure
rates for different pipe materials compared to the corresponding failure
rates of grey cast iron pipes, extracted from a study
conducted by Eisenbeis et al. (2000) on the software tools used by European
water cement (AC) pipes. In this case, the average age of the pipe is not
taken into account. For Reggio Emilia, Trondheim and Bergen, the relative
failure rate is the ratio of the failure rate of pipes of the material
concerned in the area to the failure rate of grey cast iron pipes in the
same area. For Bordeaux, the relative failure rate is the ratio of the
hazard functions of pipes of the material concerned and of grey cast iron
pipes.
The improvement in the failure rates of grey cast iron (GCI) pipes from
1994 to 1996 may be due to a policy change that resulted in decreasing
water pressure in the distribution systems. For Bordeaux, Table 2.2
compares ductile iron and GCI pipes. Even after eliminating the influence
of age, it shows that GCI pipes break more often than ductile iron pipes.
The table also shows that, in Norway, asbestos cement and unprotected
ductile iron pipes are more vulnerable than GCI pipes.
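The relative failure rate used in Table 2.2 is a simple ratio against grey cast iron, which can be sketched as follows. The function name and the sample failure rates are hypothetical and are not values from Eisenbeis et al. (2000).

```python
def relative_failure_rates(rates, reference="grey cast iron"):
    """Divide each material's failure rate by that of the reference material.

    rates maps a material name to a failure rate (e.g. failures/km/year)
    observed in a single area over a single period.
    """
    if reference not in rates:
        raise KeyError(f"reference material {reference!r} is missing")
    ref_rate = rates[reference]
    if ref_rate <= 0:
        raise ValueError("reference failure rate must be positive")
    return {material: rate / ref_rate for material, rate in rates.items()}


if __name__ == "__main__":
    # Hypothetical failure rates for one area (failures/km/year).
    sample = {"grey cast iron": 0.40, "PVC": 0.10, "asbestos cement": 0.60}
    for material, value in relative_failure_rates(sample).items():
        print(f"{material}: {value:.2f}")
```

A value below 1 means that the material fails less often than grey cast iron in the same area, and a value above 1 means that it fails more often.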
2.4.5 Manufacturing methods
Performance of pipes of the same material may differ due to different man-
ufacturing methods. For instance, the first cast iron pipes were horizontally
cast in sand moulds. Pit cast iron pipes had uneven wall thickness as a
result of this manufacturing technique. Later, vertical (spin) casting was
introduced as an advanced technique in pipe manufacturing industry. Spin
casting was first developed in the United Kingdom in 1916. However, the
transition between the two techniques in North America largely took place
between 1920 and 1930 (Rajani 1995). Most pipes installed after the lat-
ter date were made by spin casting. Makar and Kleiner (2000) stated that
centrifugal casting produced a stronger pipe due to differences in the mi-
crostructure produced by the two processes.
Spun cast iron pipes had more even wall thicknesses. This method allowed
the production of pipes with thinner walls. Centrifugal casting methods
resulted in even greater consistency of wall thickness. These methods
were used in Australia for the first time in 1962, when a centrifugal casting
machine was used to produce pipes with diameters up to 750 mm (Price
and Sutton 1988). Production techniques, as well as the materials, need to
be considered when analysing grey cast iron pipe failures. The production
method is correlated with the year of production, which in turn is related
to the laying year available in most pipe records. Walski and Pelliccia
(1982) took into account the manufacturing order of pipes by suggesting an
exponential model for determining the state of a pipe, with two parameters
representing the type of pipe casting.

Table 2.2: Sample relative failure rates for different pipe materials used in
water distribution systems (Eisenbeis 1999), where relative failure rate =
(failure rate of the material concerned) / (failure rate of grey cast iron).
Columns: Reggio Emilia 1994; Reggio Emilia 1995; Reggio Emilia 1996;
Bordeaux; Trondheim 1978-1996; Bergen 1978-1999.
PE                                      0.01  0.11  0.25  0.06  0.06
PVC                                     0.21  0.25  0.3   0.01  0.12
Asbestos cement                         0.34  0.64  0.68  1.92  1.44
Steel                                   0.08  0.11  0.15
Grey cast iron                          1     1     1     1     1     1
Ductile iron (no corrosion protection)  0.81  1.75
Ductile iron (corrosion protection)     0.22  0.12
2.4.6 Corrosion
Corrosion is one of the main causes of deterioration of iron pipes (Rastad
1995). This electrochemical process is caused by a variety of environmental
conditions that induce formation of electrochemical cells, which encourage
external corrosion pits in ductile iron (DI) and graphitised zones in cast
iron (CI). These signs of corrosion can emerge as early as 5 years or as late
as 30 to 65 years after installation (Rajani and Kleiner 2003).
This natural phenomenon usually occurs in two basic ways, galvanic and
electrolytic corrosion. Galvanic corrosion involves direct electric current
that is generated within the galvanic cell, whereas in electrolytic corrosion
the direct current is from an external source. Appropriate background and
detailed discussions on corrosion theory, galvanic corrosion, electrolytic or
stray current corrosion and bacteriological corrosion can be found in pub-
lished information such as Peabody (1967), Peabody (2001), and NACE
(1984). O’Day (1989) identified galvanic corrosion to be the primary reason
for the external deterioration of iron pipes, determined by the soil proper-
ties, such as pH, resistivity, moisture content, and redox potential.
In reality, grey cast iron pipes fail because of a combination of factors
that may include external loading, internal pressure, manufacturing flaws
and corrosion damage (Morris 1967). Many circumferential and bell split
type failures occur as a series of multiple events. Detailed definitions and
characteristics of failure modes such as circumferential and bell split are
available in Makar et al. (2001). In these cases, the pipe cracks part way
through and may start leaking water. If the damage is not detected, a second
or even third cracking event may take place, with the process continuing
until the pipe fails completely or is removed from service as a result of
leak detection work (Kleiner et al. 2005).
Internal corrosion depends on the characteristics of the transported wa-
ter (e.g. pH and alkalinity) and external corrosion depends on the charac-
teristics of environment around the pipe (e.g. soil characteristics and soil
moisture). Wall thickness and pipe strength decrease over time under the
effect of corrosion, and the likelihood of breakage increases. Ahammed and
Melchers (1994) attempted to model the reduction in wall thickness of iron
pipes that results from corrosion. The spread of corrosion in a mature
iron pipe is generally considered to be a power function of its age.
Internal corrosion, as well as the means used to control it (water
treatment), has another detrimental effect on the performance of the
system. The authors stated that loss of carrying capacity is a direct
impact of internal corrosion, through a decrease in the Hazen-Williams
coefficient. Some researchers believe that the effect of deteriorating
factors whose impact increases with the age of the pipe can be described
by the Hazen-Williams formula (e.g. Finnemore and Franzini 2002). This
equation relates the flow, Q, to a series of parameters as:
\[
Q = K \, C_{HW} \, D^{2.63} \left( \frac{\Delta H}{L} \right)^{N} \tag{2.1}
\]
where C_HW is the Hazen-Williams coefficient; D is the internal diameter;
∆H is the head loss due to friction; L is the length; N is the exponent of
the hydraulic slope, usually taken as 0.54; and K is a constant dependent
on the choice of units. According to Equation (2.1), a drop in the value of
C_HW with age would lead to an increase in ∆H, i.e. a higher head loss over
the length L of pipe, at a given level of the flow Q.
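The effect described by Equation (2.1) can be checked numerically with a short sketch. The value K = 0.2785 is an assumption: it is the constant commonly quoted for SI units (Q in m³/s, D, ∆H and L in metres) and is not stated in the text. The pipe dimensions and coefficient values in the example are hypothetical.

```python
def flow(c_hw, diameter_m, head_loss_m, length_m, k=0.2785, n=0.54):
    """Flow Q from Equation (2.1): Q = K * C_HW * D**2.63 * (dH / L)**N.

    k = 0.2785 is an assumed SI-unit constant (Q in m^3/s, lengths in m).
    """
    return k * c_hw * diameter_m ** 2.63 * (head_loss_m / length_m) ** n


def head_loss(flow_m3s, c_hw, diameter_m, length_m, k=0.2785, n=0.54):
    """Equation (2.1) solved for the friction head loss dH at a given flow."""
    return length_m * (flow_m3s / (k * c_hw * diameter_m ** 2.63)) ** (1.0 / n)


if __name__ == "__main__":
    # Hypothetical 150 mm main, 500 m long, carrying 10 L/s.
    q, d, length = 0.010, 0.150, 500.0
    for c in (130, 100, 80):  # C_HW is assumed to decline as the pipe ages
        print(f"C_HW = {c:>3} -> head loss {head_loss(q, c, d, length):.2f} m")
```

For a fixed flow, the head loss scales as C_HW raised to the power -1/N, so a modest drop in C_HW produces a disproportionately large increase in head loss.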
Karaa and Marks (1990) argued that external corrosion is an important
factor to incorporate in predictive models as its intensity, unlike that for
internal corrosion, will vary from pipe to pipe as soil conditions vary.
In the past few years, different non-destructive techniques (NDT) have
become available (Hartman and Karlson 2002, Rajani and Kleiner 2004) to
measure remaining wall thickness, corrosion pits (ductile iron) or
graphitisation depths (cast iron) along the pipes. Results obtained from these NDT
measurements have to be incorporated within a broad decision support tool
to assess condition state, determine time to failure and remaining service life
for each inspected pipe and subsequently establish proactive management
strategies. Rajani et al. (1996) and Rajani and Tesfamariam (2004) have
developed a pipe-soil interaction model that determines stresses, strains,
and displacements at any point along the length of a jointed pipe. Rajani
and Tesfamariam (2005) developed a fuzzy logic based model to integrate
corrosion rates with the remaining wall thickness or pit geometry measure-
ments obtained from NDT inspections to arrive at the time to failure.
Tesfamariam et al. (2006) also developed an analytical model based on
a soil-pipe interaction model to estimate the remaining service life of one
cast iron pipe with several corrosion pits of significant depths observed at
the time of inspection. They expressed the resulting estimate as a factor
of safety. The authors also undertook a sensitivity analysis, using a
Monte Carlo type random sampling method, to identify the critical
components of data that merit further investigation. The sensitivity
analysis strongly suggested that reducing pit depth (graphitisation) growth
by using proper corrosion control can be the single most effective way to
decelerate the breakage growth rate of existing pipes.
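The idea of a Monte Carlo type sensitivity analysis can be illustrated with a toy sketch. This is not the model of Tesfamariam et al. (2006): the factor-of-safety formula, the linear pit growth, the input ranges and all names below are assumptions made purely for illustration.

```python
import random


def factor_of_safety(initial_wall_mm, pit_growth_mm_per_yr,
                     age_yr=50.0, required_wall_mm=4.0):
    """Toy factor of safety: remaining wall thickness / required thickness.

    Linear pit growth is a simplifying assumption made for this sketch.
    """
    remaining = max(initial_wall_mm - pit_growth_mm_per_yr * age_yr, 0.0)
    return remaining / required_wall_mm


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def sensitivity(n_samples=5000, seed=0):
    """Sample the inputs at random and correlate each with the output."""
    rng = random.Random(seed)
    # Hypothetical input ranges, sampled independently and uniformly.
    walls = [rng.uniform(9.0, 11.0) for _ in range(n_samples)]
    rates = [rng.uniform(0.02, 0.12) for _ in range(n_samples)]
    safety = [factor_of_safety(w, r) for w, r in zip(walls, rates)]
    return {
        "initial_wall_mm": pearson(walls, safety),
        "pit_growth_mm_per_yr": pearson(rates, safety),
    }


if __name__ == "__main__":
    for name, corr in sensitivity().items():
        print(f"{name}: correlation with factor of safety = {corr:+.2f}")
```

Inputs whose correlation with the output has the largest magnitude are the ones that merit closer investigation. With the assumed ranges, the pit growth rate dominates.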
2.4.7 Pipe failure history
Shamir and Howard (1979) claimed that analysis of previous failures can
assist in identifying the primary causes for breaks within a distribution
network. Once these primary causes have been isolated, changes in pipeline
design, such as construction characteristics, joint design, pipe material, and
maintenance procedure, can be initiated to improve the situation.
In fact, the failure history of a pipe is a significant factor in prediction
of future failures (Walski and Pelliccia 1982). Andreou (1986) used Cox’s
proportional hazards model to analyse the breakage rates in the water
network. Cox’s proportional hazards model is explained in more detail in
Section 2.5.3. Andreou reported that the breakage rate increased with each
breakage occurrence up to the third break, after which the breakage rate was
constant. At this point, the pipes were assumed to be in a “fast breaking
state”. The number of previous breaks was found to significantly affect the
hazard function of the pipes. Eisenbeis (1999) observed a similar pattern.
Malandain et al. (1999) included these findings from Andreou (1986) and
Eisenbeis (1999) in a break rate model that will be discussed later in this
chapter.
Goulter and Kazemi (1988) observed the temporal and spatial cluster-
ing of water-main breaks, indicating that a previous break increased the
likelihood of future breaks in its immediate vicinity. The authors reported
that about 60% of all subsequent breaks occurred within 3 months of the
previous break. They suggested that the subsequent breaks are caused by
damage during repair operations, such as pressure surge while refilling the
pipe after repair or ground movements caused by excavation, back-filling
and the movement of heavy vehicles. Several factors unrelated to repair
activities are also responsible for the clustering of breaks in the network.
Pipes in the same location often have the same age and materials and are
laid with the same construction and joining methods. Pipes in the same
location are also likely to be exposed to the same external and internal
corrosion conditions.
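The temporal clustering statistic quoted above (the share of subsequent breaks that occur within 3 months of the previous break) can be computed from a list of break dates, as in the sketch below. The 90-day window is an assumed approximation of 3 months, the break dates are hypothetical, and the spatial aspect of clustering is not modelled.

```python
from datetime import date


def share_within(break_dates, window_days=90):
    """Fraction of subsequent breaks that occur within window_days of the
    previous break, for the breaks recorded on one pipe or in one vicinity."""
    ordered = sorted(break_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return 0.0  # fewer than two breaks: no subsequent break to assess
    return sum(gap <= window_days for gap in gaps) / len(gaps)


if __name__ == "__main__":
    # Hypothetical break history.
    history = [date(2000, 1, 10), date(2000, 2, 20),
               date(2001, 6, 1), date(2001, 7, 15)]
    print(f"{share_within(history):.0%} of subsequent breaks fall within 90 days")
```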
2.4.8 Water pressure
Water pipes are designed to resist a designated internal pressure of water
passing through them. Minimum feasible operating pressures are still con-
strained by local topography (Lambert 1998) and operating requirements.
A flawless installation operation provides a uniform support over the entire
length of the pipes by the bedding. Poor installation practices or distur-
bance over time (due to soil movement) may cause a lack of support in some
points that leads to bending moments and longitudinal stresses. The ability
of the structure of a pipe to resist such forces is a function of the tensile
strength of the material and wall thickness (Skipworth et al. 2002).
In this context, static pressure of water and pressure surges in a distri-
bution system can affect the pipe failure process. Pressure surges can occur
when water and air valves open and close during network operations. These
surges can be one of the factors in failure clustering as valves are closed and
opened during repair activities. Andreou (1986) found static pressure to be
an influential factor when modelling pipe failure patterns, but its
importance was not considerable in comparison with the other factors
affecting failure. Clark et al. (1982) used both the absolute pressure
and the differential pressure (surge) when modelling the time to the first
failure of water pipes.
2.4.9 Soil condition of the bedding
Soil conditions affect external corrosion rates of water pipes and play an
important role in the process of pipe degradation, especially for iron pipes.
Clark et al. (1982) considered the presence of a corrosive soil environment
in the analysis of pipe failure, but found a low correlation between length
of pipe laid in a corrosive environment and failure frequency. Malandain et
al. (1998) used a GIS to relate the soil conditions to the breakage
rate of pipes in the water network in Lyon, France.
Eisenbeis (1999) also used ground condition (defined as the presence or
absence of corrosive soil) as an explanatory variable in the analysis of pipe
failures. The definitions and classifications of soil aggressivity differ
between studies. In Trondheim (Lei 1997), a broad classification
has been applied to represent the soil:
• Very aggressive: (Tidal zone, high ground water level, natural soil with
resistivity under 750 Ohm cm, pH less than 5, polluted by chemicals,
stray current, etc.)
• Moderately aggressive: (Clay, wetland, nonhomogeneous, etc.)
• Not aggressive: (Natural soil resistivity over 2500 Ohm cm, dry con-
ditions, sand, moraine).
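The classification above can be partly expressed as a rule-based function. The sketch below uses only the two numeric criteria in the list (resistivity and pH). Treating every soil between the two resistivity thresholds as moderately aggressive is an assumption, since the original classification defines that class by soil type rather than by resistivity.

```python
def soil_aggressivity(resistivity_ohm_cm: float, ph: float) -> str:
    """Classify soil from resistivity (Ohm cm) and pH only.

    Simplified, hypothetical reading of the Trondheim classification:
    tidal zones, groundwater level, pollution and stray current are ignored.
    """
    if resistivity_ohm_cm < 750 or ph < 5:
        return "very aggressive"
    if resistivity_ohm_cm > 2500:
        return "not aggressive"
    # Assumption: anything between the two thresholds is treated as moderate.
    return "moderately aggressive"


if __name__ == "__main__":
    for resistivity, ph in ((500, 7.0), (1500, 7.0), (3000, 7.0), (3000, 4.5)):
        print(resistivity, ph, "->", soil_aggressivity(resistivity, ph))
```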
In a study of the breakage profile of asbestos cement water mains, Mordak
and Wheeler (1988) observed that the distribution of failures through the
year was fairly random for areas where sandy/gravel soils commonly occur,
whereas in areas with cohesive clay soils, most failures occurred during the
dry summer months. Cohesive clay soils are also associated with high inci-
dence of circumferential fractures, which are commonly related to bending
stresses.
The lack of a strong association between soil condition and breakage rate
is reported in a study by Boxall et al. (2006). These authors suggested that
this observation may be due to the spatial resolution of the data
preparation that was performed prior to the analysis.
Hu and Hubble (2005) examined the influence of deteriorating factors
for water mains in Regina, Canada. They reported the soil condition as a
critical factor in failure mechanism of water mains of the region.
Rajani et al. (1996) developed a Winkler-type pipe-soil interaction
(WPSI) model, which is based on mechanics and is hence termed a mechanistic
or physical model. This technique has been used in some recent studies
(e.g. Rajani and Tesfamariam 2005, Tesfamariam et al. 2006) to estimate
the remaining wall thickness of individual pipes.
2.4.10 Seasonal variations
Seasonal weather variations can be linked to pipe breaks. Many breaks
are recorded in the hot summer months, when facilities are functioning at
maximum capacity to meet peak demands. These breaks are due to the
resulting excessive pressure, along with other causes such as external
stresses or corrosion.
In the winter when the water temperature drops very quickly, axial stress
is added to the internal pressure of the water, possibly surpassing the factor
of safety of pipes. Additional stress may result from frost effects, especially
when pipes are rather shallow.
A seasonal pattern, with the greatest number of failures occurring during
the winter, is common for many water distribution networks (Eisenbeis 1999,
Sægrov et al. 1999). Andreou (1986) found that smaller diameter pipes (less
than 8 inches) have higher failure rates in the winter.
Kleiner and Rajani (2002) provided many references to reported obser-
vations concerning the influence of temperature and soil moisture on the
frequency of water main breaks. Rajani et al. (1996) showed that differen-
tial temperature change between pipe and soil, and also soil shrinkage due
to dryness, result in the development of stresses in the pipe.
In a study of research needs for rehabilitation of water networks, Sægrov
et al. (1999) observed both a winter and a summer peak in break rate
in the UK. The summer peak was attributed to drying and uneven shrinkage
of clay soils, whilst the winter peak may have been due to frost loading
or thermal contraction effects. In addition, the researchers reported that
the annual breakage rate over a period of ten years was significantly
correlated with mean annual daytime temperature and was inversely related
to the total annual rainfall.
2.5 Current Models Developed for Pipe Failure
Analysis and Prediction
In the literature, the techniques developed to analyse water pipe failures can
be generally categorised into physical, descriptive, and statistical analysis
methods. In the following subsections, the main characteristics of each
category are explained and their performance and areas of application are
compared.
2.5.1 Physical analysis
Physical analysis methods are based on evaluating the structural and envi-
ronmental characteristics of each individual pipe. Such evaluations include
identification of the scope and severity of corrosion on the internal and ex-
ternal pipe walls and estimation of the stresses caused by the loads applied
to the water pipe. For example, the Water Resources Council (WRc) method
(Williams et al. 1984) is a physical analysis technique for assessing the
residual life of cast iron pipes based on measuring the pit depth. This
method has been the standard procedure for corrosion assessment in the UK
for many years. However, recent studies have criticised it as flawed in
several respects (Marshall 2002, Olliff and Rolfe 2002, Kleiner and Rajani
2001, Rajani and Y. 2001), especially in its underlying assumption of a
continuing linear corrosion rate.
Marshall (2002) enhanced the WRc method and developed a more ratio-
nal procedure. This new method is based on fracture mechanics and relates
it directly to the flexural failure that is commonly encountered. This pro-
cedure needs to be implemented and examined, although there seems to be
no drive to promote this methodology within the water industry.
Olliff and Rolfe (2002) advocated a condition assessment approach to
rehabilitation practice. They stated that although much effort has been
made to improve the investigation and renovation techniques, this has not
been matched in the area of analysis of condition data.
A major Canadian program has been conducted to assess the corrosion of
pipes. The studies undertaken in this forum (Kleiner and Rajani 2001)
produced a number of methods for assessing the probable future performance
of pipes. However, the survey reviews physically based models, which need
a wide range of physical pipe characteristics, and such information is
commonly unavailable.
As noted previously, in recent years, a number of non-destructive tech-
niques (NDT) have been developed to measure wall thickness, corrosion
pits (ductile iron), or graphitisation depth (cast iron) of pipes (Jackson
et al. 1992, Hartman and Karlson 2002, Lillie et al. 2004, Rajani and
Kleiner 2004). These NDT measurements need to be incorporated within
a decision support system (DSS) to assess the pipe condition to decide
whether the pipe should be replaced and when to do it.
It is important to note that a pipe failure usually takes place as a result
of multiplicative effects of environmental, operational, and design factors.
Therefore, the cause of a failure cannot be precisely determined.
Measurement and recording of some of the design parameters and opera-
tional factors are associated with some level of imprecision. For instance,
measurement and recording of quantities such as temperature changes, traf-
fic, operating and surge pressure, corrosion pit geometry, loss of bedding
support (as a result of prolonged leakage or wash out) and so on are not
completely accurate or readily available within operational work. Given this
background, it is clear that assessment of pipe condition always involves
some level of uncertainty.
The difficulties involved in the estimation of the past, present, and future
corrosion rates add to the uncertainty in determining the remaining service
life of pipes. Rajani et al. (1996) developed the Winkler pipe-soil interaction
(WPSI) model to take most of the predominant factors into account. However,
Winkler models are liable to some inaccuracy due to uncertainties in the
estimation of the model parameters (coefficients).
Most physical models for pipe-soil interaction are deterministic and do
not consider the uncertainties involved in precision of data and parameter
estimation. In a possibilistic approach, Rajani and Tesfamariam (2005)
developed a fuzzy model to integrate corrosion rates with the remaining
wall thickness or pit geometry measurements to estimate the time to failure.
This technique is a step towards estimating the remaining service life of
one pipe length with several corrosion pits of significant depths observed
at the time of inspection, and it accounts for the unsupported length.
The technique of Rajani and Tesfamariam (2005) can be applied to pipes with
known corrosion pit depth buried in soil of known corrosivity. In a recent
study, Tesfamariam et al. (2006) also cast the WPSI model in a possibilistic
framework to predict the remaining wall thickness of cast iron pipes under
intensive inspection and converted the outcome to the structural factor of
safety. In this model, material failure theories specific to cast iron mains
were combined with fuzzy stress solutions to determine the fuzzy structural
capacity in terms of factor(s) of safety.
Other physical analysis efforts for water pipes include Doleac et al.
(1980), Kumar et al. (1984), Makar (1999), Rajani and Makar (2000a) and
Makar et al. (2001). As stated in Chapter 1, this thesis concentrates on
modelling and prediction of water main failures to develop
maintenance/replacement strategies in water distribution systems.
Physically based models require assessment of the properties of each buried
pipe and such techniques need detailed knowledge of the long-term (histor-
ical) behaviour of the pipeline under consideration in addition to detailed
assessment of the individual pipe from different aspects.
2.5.2 Descriptive analysis
Descriptive analysis methods are based on calculation of descriptive statis-
tics to provide an insight into the breakage patterns and trends. Descriptive
analyses, by their nature, can be only performed in locations where compre-
hensive databases of the pipe characteristics and breaks are available. There
are very few case studies of descriptive analysis reported in the literature.
Some cities often cited for participating in such studies are Winnipeg, Man-
itoba, Canada (Kettler and Goulter 1985a, Goulter and Kazemi 1988, Goul-
ter et al. 1993, Jacobs and Karney 1994); New York (O’Day et al. 1980, Male
et al. 1988, Male et al. 1990); Cincinnati and New Haven, Conn. (Clark and
Goodrich 1988, Goodrich 1986, Karaa and Marks 1990); suburban Paris
and Bordeaux, France (Eisenbeis 1999); three municipalities of Quebec:
Chicoutimi, Gatineau and Saint-Georges (Pelletier et al. 2003) and Boston
(Sullivan 1982).
As an example of descriptive analysis, the study by Sullivan (1982) con-
cludes that the type of routine maintenance practice directly affects the
evolution of the system state over time. Leaks that are left unrepaired
lead to major breaks, and the accumulation of such leaks explains the
high proportion of water losses through leaks that are unaccounted for.
As shown in Table 2.3, the percentage of water lost through leaks varied
between 8 and 17 percent in 12 U.S. cities in 1978. Unaccounted-for water is
defined as the difference between the total amount of water pumped into
the water system from the sources and the amount of metered water use by
the customers of the water system expressed as a percentage of the total
water pumped into the system. Table 2.4 shows the percentage of water
losses for different causes and demonstrates that in Boston in 1978, 37.30%
34 CHAPTER 2. LITERATURE REVIEW
Table 2.3: Estimated water leakage in 12 U.S. cities in 1978 (Sullivan 1982)
Location              Percent of water lost through leaks
Boston, Mass.         17
Cleveland, Ohio       15
St. Louis, Mo.        15
Pittsburgh, Pa.       14
Tulsa, Okla.          14
Philadelphia, Pa.     12
Hartford, Conn.       11
Kansas City, Mo.      11
Cincinnati, Ohio      11
Buffalo, N.Y.         10
Baltimore, Md.        10
Portland, Ore.        8
Table 2.4: Water loss percentages for different causes, measured in Boston,
1978 (Sullivan 1982)
Cause                     Amount   Amount   Percent of water   Percent of total
                          (ML/d)   (mgd)    unaccounted-for    water purchased
Undetermined              117.7    31.1     46.4               21.7
Leaks and breaks          94.6     25       37.3               17.5
Blow offs and flashings   4.5      1.2      1.8                0.8
Fire fighting             7.1      1.9      2.8                1.3
Unmetered public usage    15.8     4.2      6.3                2.9
Other                     4.5      1.2      1.8                0.8
of water losses were due to leaks in mains and the remaining losses were
mostly due to service pipe leaks. For other cities, at least a quarter of losses
are due to leaks in mains.
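The definition of unaccounted-for water given above can be written as a one-line calculation. The following is a minimal sketch; the function name and the example volumes are hypothetical and are not taken from Table 2.4.

```python
def unaccounted_for_percent(pumped, metered):
    """Unaccounted-for water as a percentage of the total water pumped.

    pumped  : total volume pumped into the system from the sources
    metered : volume of metered customer use (same units as `pumped`)
    """
    if pumped <= 0:
        raise ValueError("total pumped volume must be positive")
    return 100.0 * (pumped - metered) / pumped


# Hypothetical volumes in ML/d, for illustration only
example_percent = unaccounted_for_percent(500.0, 400.0)
print(example_percent)
```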
Table 2.5, taken from Karaa (1984), shows the importance of different
criteria in using descriptive analysis for developing the replacement plan-
ning under different maintenance practices. Intensive maintenance practices
are considered proactive, and poor maintenance practices are known as a
reactive type of maintenance strategy.
This type of analysis is limited by the challenges faced in construct-
Table 2.5: Importance of different criteria for replacement/rehabilitation
decision-making under different types of maintenance policies (Karaa 1984)
Criteria            Intensive MP   Fair MP   Poor MP
Economic Analysis   IF             PF        MF
Loss of Pressure    PF             PF        IF
Water Quality       PF             PF        IF
Reliability         MF             PF        IF
MP: Maintenance Practices
IF: Important Factor
PF: Partial Factor
MF: Marginal Factor
ing databases. These challenges include availability of personnel and re-
sources, missing and conflicting data, non-computerised information (paper
archives), and the like. Indeed, development of such databases has been
a concern for many researchers (O’Day 1982, Clark and Goodrich 1989,
Habibian 1992). The database used in this research, provided by CWW, is
one such example and is explained in detail in Chapter 3.
2.5.3 Statistical analysis
Statistical analysis methods are based on modelling of a pipe lifetime or the
frequency of the failures (or the probability distribution of these quantities),
then using those models to make decisions on replacement and/or
rehabilitation of the pipes. Statistical analysis of pipe failures begins with
plotting the cumulative number of failures over time, a process which requires failure
data for each pipe. A cumulative plot can indicate whether there is a trend
in the failure times. From a practitioner’s point of view, such a plot can
be a convenient tool for making maintenance decisions for individual pipes.
However, this technique is too time consuming to be carried out for each
pipe in the entire network.
Figure 2.1 shows an example of a cumulative failure plot for a single
pipe in the CWW distribution system (CICL 100 mm pipe, constructed in
1967, located in postcode 3029). The curve is convex, indicating a dete-
riorating pipe. In order to predict future failures in a statistical analysis
method, a parametric or non-parametric model is fitted to the curve plotted
using the dataset. This procedure can only be used for pipes with several
[Figure 2.1: Cumulative failure plot for a single pipe in CWW (100 mm CICL
pipe, constructed in 1967 in postcode 3029). Horizontal axis: failure date
(Dec 97 to May 00); vertical axis: cumulative number of failures (0 to 7).]
Note on Weibull and lognormal models: when properly chosen and credible,
Weibull or lognormal analysis models offer considerable insight into the
lifetime reliability of products. However, there are also many ways in which
such an analysis can be corrupted (Warrington).
previous breaks and is most useful for making decisions about distribution
and service lines. However, multiple breaks cannot be accepted on
transmission and trunk mains where the consequence of failure is high. Besides,
existing failure histories are not recorded based on regular inspections and
not all the recorded failure dates are accurate. For instance, a lack of
regular inspections of underground pipes causes their failures to be
recorded with a delay, only after the obvious consequences of the failures are
reported. Thus, creating plots for every single pipe in the system requires
extensive data analysis and manipulation.
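The cumulative failure plot described above can be prepared from a list of recorded failure dates. The sketch below is illustrative only: the dates are hypothetical (they are not the CWW records behind Figure 2.1), and the "shrinking intervals" test is a crude stand-in, assumed here, for visually judging whether the cumulative curve is convex.

```python
from datetime import date


def cumulative_failures(failure_dates):
    """Return (date, cumulative count) pairs, ordered by failure date."""
    ordered = sorted(failure_dates)
    return [(d, i + 1) for i, d in enumerate(ordered)]


def intervals_in_days(failure_dates):
    """Days elapsed between consecutive recorded failures."""
    ordered = sorted(failure_dates)
    return [(b - a).days for a, b in zip(ordered, ordered[1:])]


def looks_deteriorating(failure_dates):
    """Crude check: later inter-failure intervals are shorter on average,
    which corresponds to a convex cumulative failure curve."""
    gaps = intervals_in_days(failure_dates)
    if len(gaps) < 2:
        return False  # too few failures to say anything
    half = len(gaps) // 2
    first, second = gaps[:half], gaps[half:]
    return sum(second) / len(second) < sum(first) / len(first)


# Hypothetical failure dates for a single pipe
dates = [
    date(1997, 12, 1), date(1998, 9, 1), date(1999, 3, 1), date(1999, 7, 1),
    date(1999, 10, 1), date(1999, 12, 1), date(2000, 1, 15),
]
for day, count in cumulative_failures(dates):
    print(day.isoformat(), count)
print("deteriorating:", looks_deteriorating(dates))
```

As the text notes, such a check is only meaningful for pipes with several recorded breaks and reasonably accurate failure dates.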
Statistical models for analysis of water pipe breaks have attracted the
interest of many researchers during recent decades because the results of
statistical analysis can be used for a variety of purposes in water network
management. In the long term, the models can be used to estimate future
budget needs for rehabilitation. In the short term, the models can be used
to define candidates for replacement based on poor structural condition.
Statistical methods use the history of failure occurrences to identify
the patterns of pipe failures in the past. In order to predict the future
performance of water pipes (likely failure rate or probability of occurrence
of failure at a time in the future), these patterns are assumed to continue in
the future. It will be shown in Chapter 5 that this assumption is inaccurate
for water mains as the failures are proven to be non-stationary random
processes with time-varying characteristics. Therefore, either auto-updating
nonparametric techniques are required, or the parameters of lifetime models
(in the existing probabilistic approaches) should be constantly updated, or
a time-dependent component should be built into the model.
Statistical methods that are used for the analysis of condition of water
pipes can be categorised into deterministic and probabilistic methods. In
a further classification, probabilistic models can be divided into probabilis-
tic multivariate and probabilistic single-variate models that are applied to
grouped data (Kleiner and Rajani 1999).
Statistically derived models can be applied with various levels of input
data and may thus be particularly useful for water mains with small failure
databases available or for which the low cost of a failure does not justify
expensive data acquisition campaigns (Kleiner and Rajani 2001). However,
the small size of failure databases, in such cases, causes small sample bias
in the statistical models (being treated as estimators in a mathematical
context) and more accurate and robust models would be desirable for ap-
plications where small failure databases are available.
Deterministic models
The outcome of applying a deterministic model to failure data is a value
representing the condition of the pipe at a time in the future (e.g. the number
of failures in the future, the failure rate or the time to next failure). In the
pipe failure analysis literature, deterministic models are generally divided into
three types: time-exponential, time-power and time-linear models.
Deterministic time-exponential models
One of the well known deterministic models is the model of Shamir and
Howard (1979) who used regression analysis to obtain a break prediction
model that relates a pipe’s breakage rate to the exponent of its age.
N(t) = N(t_0) e^{A(t+g)}    (2.2)
where: t is the time elapsed (from present) in years; N(t) is the number of
failures per unit length per year (km^{-1} year^{-1}); N(t_0) = N(t) at the year
of installation of the pipe (i.e., when the pipe is new); g is the current age
of the pipe; and A is the coefficient of breakage rate growth (year^{-1}).
The underlying assumption of the above model is N(t_0) ≠ 0, which
means that on average, a pipe is assumed to always have a breakage fre-
quency, albeit very small in the beginning of its life. The required data for
this model are pipe length, installation date and breakage history.
Formation of homogeneous groups is essential for analysis according to criteria
like pipe type, diameter, soil type, failure type, overburden characteristics,
etc. This limits the application of this model to failure histories with
a large number of data points for each homogeneous group of pipes. Besides,
Shamir and Howard (1979) did not provide any details on the location of the
study, the quality and quantity of available data or the method of analysis.
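Since Shamir and Howard (1979) gave no details of their regression, the following is only a sketch of one common way to fit the exponential model of Equation (2.2): taking logarithms turns it into a straight line in pipe age, which can be fitted by ordinary least squares. The function names and the synthetic data are assumptions made for illustration.

```python
import math


def fit_exponential_rate(ages, rates):
    """Fit ln(rate) = ln(N0) + A * age by ordinary least squares.

    ages  : pipe ages in years (t + g in Equation 2.2)
    rates : observed breakage rates, all strictly positive
    Returns (N0, A).
    """
    logs = [math.log(r) for r in rates]
    n = len(ages)
    mean_age = sum(ages) / n
    mean_log = sum(logs) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (y - mean_log) for a, y in zip(ages, logs))
    growth = sxy / sxx
    n0 = math.exp(mean_log - growth * mean_age)
    return n0, growth


def predict_rate(n0, growth, t, g):
    """Breakage rate t years from now for a pipe of current age g."""
    return n0 * math.exp(growth * (t + g))


# Synthetic rates generated from N0 = 0.05 and A = 0.08 (hypothetical values)
ages = [10, 20, 30, 40]
rates = [0.05 * math.exp(0.08 * a) for a in ages]
n0, growth = fit_exponential_rate(ages, rates)
print(n0, growth, predict_rate(n0, growth, t=5, g=40))
```

The logarithm requires strictly positive rates, which mirrors the model's own assumption that the breakage rate is never exactly zero.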
The two-parameter exponential model of Equation (2.2) is simple and
relatively easy to implement but its simplicity warrants careful treatment
in applying the model to data that are partitioned precisely. It should
also be noted that this exponential model implicitly assumes a uniform
distribution of breaks along all the water mains in a group. This assumption
was questioned by others, e.g. (Goulter and Kazemi 1988, Goulter et al.
1993, Mavin 1996).
Walski and Pelliccia (1982) attempted to enhance the exponential model
of Equation (2.2) by incorporating additional factors in the analysis based
on observations made by the US Army Corps of Engineers in Binghamton,
N.Y. (Kumar et al. 1984). The new expanded model added three dimen-
sions to the two-parameter model of Shamir and Howard (1979), namely,
consideration of the type of pipe casting, distinguishing between first break
and subsequent breaks, and consideration of pipe diameter. For example,
the model for (pit/sand spun) cast iron pipes with 500mm diameter is given
by:
N(t) = C_1 C_2 N(t_0) e^{A(t+g)}    (2.3)
where: C_1 = ratio of break frequency for (pit/sand spun) cast iron pipes with
at most one previous break, to the overall break frequency for all (pit/sand
spun) cast iron pipes; and C_2 = ratio of break frequency for pit cast
iron pipes with 500 mm diameter, to the overall break frequency for pit cast
iron pipes.
The model of Equation (2.3) requires the same data as the Shamir and
Howard (1979) model, plus information on the method of pipe casting and
pipe diameter. Walski and Pelliccia (1982) did not provide any indication of
whether the correction factors they proposed indeed improved the prediction
quality and by how much. It is likely that these three added dimensions
influence the prediction of breakage rates of water mains. However, the
correction factors seem to have been derived arbitrarily and assumed to act
in a multiplicative manner on the breakage frequency (without apparent
statistical justification).
The assumption of multiplicative effect of correction factors on the break-
age frequency implies that these dimensions affect only the initial breakage
rate and not its annual growth rate. Furthermore, since no statistical test
of significance was performed by the authors, the validity of these proposed
exponential models is questionable.
Clark et al. (1982) proposed further enhancement of the exponential
model and transformed it into a two-phase model as presented in Equations
(2.4) and (2.5). They observed a lag between the year of pipe installation
and the first break record. Consequently, they proposed a model comprising
a linear equation to predict the time elapsed to the first break and an
exponential equation to predict the number of subsequent breaks.
NY = x_1 + x_2 D + x_3 P + x_4 I + x_5 RES + x_6 LH + x_7 T    (2.4)
REP = y_1 e^{y_2 t} e^{y_3 T} e^{y_4 PRD} e^{y_5 DEV} SL^{y_6} SH^{y_7}    (2.5)
where:
NY = number of years from installation to first repair (break);
x_i, y_i = regression parameters;
D= diameter of the pipe;
P= absolute pressure in a pipe;
I= percentage of pipe overlain by industrial development;
RES= percentage of pipe overlain by residential development;
LH= length of pipe in highly corrosive soil;
T= pipe type (1=metallic, 0=reinforced concrete);
REP= number of repairs (breaks);
PRD= pressure differential;
t= age of pipe from first break;
DEV = percentage of pipe length in low and moderately corrosive soil;
SL= surface area of pipe in low corrosive soil; and
SH= surface area of pipe in highly corrosive soil.
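The two-phase structure of Equations (2.4) and (2.5) can be sketched as two small functions. This is an illustration under stated assumptions: the exponents in Equation (2.5) are read here as applying to the pipe type T and the pressure differential PRD, matching the variable list above, and the coefficient values in the example are hypothetical placeholders rather than the values fitted by Clark et al. (1982).

```python
import math


def years_to_first_break(x, D, P, I, RES, LH, T):
    """Equation (2.4): linear model for the years from installation to the
    first repair. `x` holds the seven regression parameters x1..x7."""
    return x[0] + x[1] * D + x[2] * P + x[3] * I + x[4] * RES + x[5] * LH + x[6] * T


def repairs_after_first_break(y, t, T, PRD, DEV, SL, SH):
    """Equation (2.5): exponential model for the number of repairs, with t the
    age of the pipe from its first break. `y` holds the parameters y1..y7."""
    return (
        y[0]
        * math.exp(y[1] * t)
        * math.exp(y[2] * T)
        * math.exp(y[3] * PRD)
        * math.exp(y[4] * DEV)
        * SL ** y[5]
        * SH ** y[6]
    )


# Hypothetical coefficients, chosen only to show how the covariates combine
x_demo = [10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0]
y_demo = [2.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0]
print(years_to_first_break(x_demo, D=150, P=60, I=10, RES=70, LH=0, T=1))
print(repairs_after_first_break(y_demo, t=10, T=1, PRD=5, DEV=50, SL=1.0, SH=1.0))
```

The first function adds the covariate effects, while the second multiplies them, which is the distinction the discussion of this model turns on.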
This model requires the time of installation, breakage history, type and
diameter of the pipe, as well as information about operating pressure, soil
corrosivity and zoning type of the area overlaying the pipe. Additional data
such as the type of breaks and pipe vintage are required to enhance the
model. Only a moderate “goodness of fit”, with r^2 equal to 0.23 and 0.47 for
the linear and exponential expressions, respectively, was reported (Clark
et al. 1982).∗
∗The goodness of fit of a statistical model describes how well it fits a set of observa-
The model of Clark et al. (1982) was the first reported attempt to
explicitly account for several factors that were potential contributors to the pipe
breakage rate and considered two distinctly different deterioration stages in
the life of a water main. The linear Equation (2.4) implied that the covari-
ates acted on the time to first breakage independently and additively. The
low r^2 value corresponding to the linear equation could suggest that this
assumption may be incorrect and that the factors affecting pipe deterioration
act jointly rather than independently.
The low r^2 value could also indicate that other factors affecting time
to first break were present, but were not considered in the equation. The
exponential Equation (2.5) is similar to other time-exponential models de-
scribed above and considers the breakage rate primarily as an exponential
function of time since the first break.
Other covariates are assumed to act multiplicatively on the breakage
rate. It should be noted that the covariates expressing corrosivity effects are
power functions. The moderate r^2 value corresponding to the exponential
equation suggests that more research is required to determine the suitability
of this model. The authors did not provide information as to the relative
contribution of each covariate to the total r^2.
It is possible that the accuracy of the model could be improved if other
types of data were available. The authors also did not indicate whether both
equations had been applied to a holdout sample (sample of water mains
that was “held out” for validation purposes and thus was not included in
the dataset to perform the regression analysis). Validation with a holdout
sample would have provided more convincing evidence as to the predictive
power of the model. Although this model has been referenced extensively,
no documented reference is available to indicate if its use has been repeated
elsewhere.
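Holdout validation of the kind described above can be sketched as follows. The split fraction, the fixed random seed and the use of the usual coefficient of determination as the r^2 measure are all assumptions made for this example; the source does not specify them.

```python
import random


def split_holdout(records, holdout_fraction=0.25, seed=0):
    """Shuffle the records and set a fraction aside for validation.

    Returns (training_records, holdout_records). The holdout records must not
    be used when estimating the regression parameters.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical pipe identifiers standing in for water main records
pipe_ids = list(range(20))
training, holdout = split_holdout(pipe_ids)
print(len(training), len(holdout))
```

A model would be fitted on `training` only, and `r_squared` would then be computed from its predictions for the `holdout` pipes.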
Constantine et al. (1998) developed a shifted time-exponential model
(STEM) that uses the data categorised by year of construction. This is a
method for predicting the number of failures in the next year for individual
pipes. This model is expressed by Equation (2.6):
H(x) = l \lambda e^{\beta x}    (2.6)
tions. Measures of goodness of fit typically summarise the discrepancy between
observed values and the values expected under the model in question. Such
measures can be used in statistical hypothesis testing, to test whether two
samples are drawn from identical distributions, or whether outcome frequencies
follow a specified distribution. Goodness of fit can generally be described as:
r^2 = \sum (o - e)^2 / e^2, where r^2 = goodness of fit, o = an observed
frequency, e = an expected (theoretical) frequency.
[Figure 2.2: The shifted time exponential model is fitted to past failure rates,
which are the cumulative number of previous failures at different times in
the life of the pipe. Larger λ corresponds to a worse performance and vice
versa. Horizontal axis: time (in the past); vertical axis: failure rate (in the
past). Legend: λ = 1.00, typical pipe of the group; λ = 0.25, a pipe with better
performance; λ = 2.10, a pipe with worse performance.]
where:
H(x)= expected number of total failures at pipe age x;
l= length;
β= rate variable;
λ=scale parameter (or the shifted time parameter); and
x=age of the pipe (year).
The rate variable (β) is the same for a homogeneous group of pipes. The
scale parameter λ should be calculated separately for each individual pipe.
This parameter is calculated by fitting the model to the cumulative number
of previous failures at different times in the life of the pipe (as shown in
Figure 2.2).
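The per-pipe fitting step can be illustrated with a short sketch. With β fixed for the group, Equation (2.6) is linear in λ, so a least-squares estimate has a closed form. Using least squares is an assumption made here, since the source does not state how the fit is carried out, and the numbers in the example are synthetic.

```python
import math


def stem_expected_failures(length, lam, beta, age):
    """Equation (2.6): expected total number of failures at pipe age x."""
    return length * lam * math.exp(beta * age)


def fit_lambda(length, beta, ages, cumulative_counts):
    """Least-squares estimate of the scale parameter for one pipe.

    Minimises sum_i (c_i - lam * g_i)^2 with g_i = length * exp(beta * age_i),
    which gives lam = sum(c_i * g_i) / sum(g_i^2).
    """
    basis = [length * math.exp(beta * a) for a in ages]
    numerator = sum(c * g for c, g in zip(cumulative_counts, basis))
    denominator = sum(g * g for g in basis)
    return numerator / denominator


# Synthetic history generated with lam = 0.25 (hypothetical pipe and group beta)
pipe_length, group_beta = 1.2, 0.05
observed_ages = [10, 20, 30]
observed_counts = [stem_expected_failures(pipe_length, 0.25, group_beta, a)
                   for a in observed_ages]
lam_hat = fit_lambda(pipe_length, group_beta, observed_ages, observed_counts)
print(lam_hat)
```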
In this method, the expected influence of the environmental conditions
on the failure process of pipe is modelled by making a number of assump-
tions. The model uses different fixed parameters for each group of assets.
However, an extensive amount of knowledge would be required about the
manufacturing characteristics and real time information of water hammer,
corrosivity of the soil, external loading, soil movement etc.
The predictive capability of STEM for the examined data is not signifi-
cant. Besides, rate parameters in the STEM need to be calculated. Hence,
STEM is not an accurate predictive model. It is not clear what covari-
ates (e.g. soil type, diameter, etc.) are included in STEM. Righetti (2001)
applied a STEM to a database of failures in a water distribution system
in Melbourne, Australia, and reported very poor correlation between the
predicted and observed values, as the STEM seemed to underestimate the
number of failures.
Neural networks have also been applied to predict water pipe failures by
Sacluti et al. (1999) and Sacluti (1999). The neural network models
introduced in those papers are deterministic models which directly output the
future failure rates (or number of failures). Achim et al. (2007) have also
proposed a deterministic artificial neural network (ANN) model to predict the
failure rates (number of failures/km/year) for the individual pipes in an
existing failure dataset. In that study, pipe material is used as a
stratifying factor that divides the failure data into two subsets, one for each
of the two pipe types present in the failure history under study. The inputs
to this model are diameter, age, year of construction, length and the pair of
geographical coordinates.
To evaluate the reliability of their neural network model, Achim et al.
(2007) used scatter plots of the estimated measures versus the targets, and
the plots of residual errors versus target. The resulting r^2 values, obtained
graphically, indicate that the proposed model performs better than
the shifted time power model (STPM), which will be reviewed later in this
chapter, and the STEM applied to the same data. However, other performance
indicators, besides r^2 values, are required to assess the performance of the
proposed model.
Kleiner and Rajani (2002) proposed the following multi-variate expo-
nential model:
N(x_t) = N(x_{t_0}) \exp(a^{\top} x_t)    (2.7)
where: x_t is the vector of time-dependent covariates prevailing at time t;
N(x_t) is the number of breaks resulting from x_t; a is the vector of parameters
corresponding to the covariates; and x_{t_0} is the vector of baseline x values
at the year of reference t_0, that is, the start of the history.
Time-dependent covariates could be pipe age, temperature, soil moisture,
etc. Parameters N(x_{t_0}) and a can be found by least squares regression
(with or without linear transformation) or by using the maximum likelihood
method. Equation (2.7) is applied to groups of water mains that are as-
sumed homogeneous with respect to their deterioration rates. The grouping
is typically done by some or all of the static factors (e.g., by size, material,
vintage, etc.) for which data are available.
In addition to the age of pipes, this model considers the time-dependent
factors of temperature effects (expressed as the freezing index); soil-moisture
effects (expressed as the rainfall deficit); cumulative length of replaced wa-
ter mains; cumulative length of cathodically protected (retrofitted) water
mains. The variety of data needed for this model can be a limitation,
considering the typical historical data available in most water
distribution systems. In addition, this model should be applied to long
histories of failures. For a short failure dataset that includes a period with
predominantly decreasing breakage rates, applying this model may yield
counter-intuitive results such as positive effect of ageing and/or negative
effects of replacement.
Pelletier et al. (2003) distinguished two different break orders in their
failure data set. They used the Weibull distribution for the first break order
(time to failure from installation to first break), while using the exponential
distribution to describe the behavior of subsequent breaks (time to failure
from first to second break, second to third, and so forth). This model is
simply referred to as the “Weibull/exponential model” and, despite its
mathematical simplicity, captures the essence of ageing. However,
the modelling strategy does not take into account the variability in the
annual number of pipe breaks due to factors other than the deterioration
resulting from the natural ageing of the pipes (most often, the weakening
of the pipes due to corrosion). Examples of factors that can contribute
to higher breakage rates in a given year are disturbances due to traffic
and construction, flooding, soil properties, water quality, and so on. This
equation takes into account the factor of frosting which is a concern in cold
areas like Canada, but not an issue in warmer climates such as Australia.
Table 2.6 briefly reviews the deterministic time exponential models dis-
cussed in this section.
Deterministic time-power models
The following model was one of the first time-power models suggested
by Mavin (1996):
n = \alpha t^{\beta} e    (2.8)
where:
n = number of breaks at time t;
t = age of pipe;
e = random error term;
α, β = coefficients estimated from regression analysis.
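Because the error term in Equation (2.8) is multiplicative, taking logarithms gives a straight line in ln t, so α and β can be estimated by ordinary least squares on the log-log scale. The sketch below assumes this approach, which is one standard choice and not necessarily the regression used by Mavin (1996); the data are synthetic.

```python
import math


def fit_power_model(ages, breaks):
    """Fit ln(n) = ln(alpha) + beta * ln(t) by ordinary least squares.

    ages   : pipe ages t, all strictly positive
    breaks : number of breaks n at each age, all strictly positive
    Returns (alpha, beta).
    """
    xs = [math.log(t) for t in ages]
    ys = [math.log(n) for n in breaks]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = math.exp(mean_y - beta * mean_x)
    return alpha, beta


# Synthetic, noise-free data generated with alpha = 0.02 and beta = 1.5
power_ages = [5, 10, 20, 40]
power_breaks = [0.02 * t ** 1.5 for t in power_ages]
alpha_hat, beta_hat = fit_power_model(power_ages, power_breaks)
print(alpha_hat, beta_hat)
```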
Mavin (1996) compared a time-exponential model and a time-power
Table 2.6: Deterministic Time Exponential Models
Reference                     Attributes
Shamir and Howard (1979)      Relates a pipe’s breakage rate to the exponent
                              of its age, Equation (2.2)
Walski and Pelliccia (1982)   Enhanced the exponential model (2.2) by
                              consideration of the type of pipe casting, its
                              diameter, and distinguishing between first
                              break and subsequent breaks
Clark et al. (1982)           Number of failures after first breakage,
                              Equation (2.5)
Constantine et al. (1998)     A shifted time-exponential model (STEM)
                              that uses the data categorised by year of
                              construction for prediction of number of
                              failures in the next year, for the individual
                              pipes, Equation (2.6)
Righetti (2001)               Applied a STEM to data from a water
                              distribution system in Melbourne, Australia
Kleiner and Rajani (2002)     Developed a general, multi-variate exponential
                              model, Equation (2.7), to consider some
                              time-dependent factors in predicting
                              water main breaks
Pelletier et al. (2003)       Weibull distribution was used for the first
                              break order and the exponential distribution
                              to describe the behavior of subsequent
                              breaks (time to failure from first to second
                              break, second to third, and so forth)
Achim et al. (2007)           Developed an artificial neural network
                              (ANN) model to predict the failure rates
                              for the individual pipes of an existing dataset
model depicted in Equation (2.8) by applying both to filtered data ob-
tained from three Australian water utilities. The author proposed some
rules to filter the pipe breakage data, based on calculating the probability
of two consecutive breaks (Constantine and Darroch 1993), and discarding
the second break if the probability is low.
Mavin (1996) found that the performance of the two models in predicting
water main breaks was comparable. This model is applicable to pipes with
more than 6 failure experiences. This preliminary assumption is clearly
impractical in developing the maintenance strategies for water distribution
systems.
The shifted time power model (STPM) presented by Constantine et al.
(1998) is given by the following equations:
H(x) = l \lambda x^{\beta}    (2.9)
rate of failure per year at age x = \frac{dH}{dx} = \beta l \lambda x^{\beta - 1}    (2.10)
where:
H(x)= expected number of total failures at pipe age x;
l= length;
β= rate variable;
λ= scale parameter (or the shifted time parameter); and
x= age of the pipe (year).
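Equations (2.9) and (2.10) can be checked against each other numerically: the failure rate should equal the slope of the expected number of failures. The parameter values below are hypothetical and serve only to illustrate the relationship.

```python
def stpm_expected_failures(length, lam, beta, age):
    """Equation (2.9): expected total number of failures at pipe age x."""
    return length * lam * age ** beta


def stpm_failure_rate(length, lam, beta, age):
    """Equation (2.10): rate of failure per year at age x, i.e. dH/dx."""
    return beta * length * lam * age ** (beta - 1)


# Hypothetical parameters: length 2, lambda 0.01, beta 1.8, pipe age 30 years
stpm_args = (2.0, 0.01, 1.8)
step = 1e-6
numerical_slope = (
    stpm_expected_failures(*stpm_args, 30.0 + step)
    - stpm_expected_failures(*stpm_args, 30.0 - step)
) / (2 * step)
analytic_rate = stpm_failure_rate(*stpm_args, 30.0)
print(numerical_slope, analytic_rate)
```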
The study reported in this thesis found that the power based model,
as well as the exponential model (mentioned previously), did not fit the
database used for the study. To use these models, the rate variable should be
calculated for each class of pipes and a scale parameter should be calculated
for each individual pipe. Regardless of the choice of these parameters,
the prediction of failure rates retains an exponential nature and the same
criticisms that apply for Constantine et al. (1998)’s STEM also apply to
STPM.
Righetti (2001) also developed a shifted time power model (STPM) for
failure analysis of water pipes. Application of both STEM and STPM to
the same set of failure data showed that the STPM resulted in
over-prediction of the number of failures.
Deterministic time-linear models
Kettler and Goulter (1985a) have proposed a linear relationship between
pipe age and its failure rate, given by the following equation:
N = k_0 \cdot Age    (2.11)
where:
N = number of breaks per year; and
k_0 = regression parameter.
McMullen (1982) has proposed the following model for the age of pipe
at the first break:
Age = 65.78 + 0.028 SR - 6.338 pH - 0.049 Eh    (2.12)
where:
Age= age of pipe at its first break (years);
SR= saturated soil resistivity (ohm-cm);
pH= soil pH; and
Eh= redox potential (millivolts).
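Equation (2.12) can be evaluated directly once the three soil measurements are known. The sketch below uses the coefficients quoted above; the soil values in the example are hypothetical.

```python
def age_at_first_break(sr_ohm_cm, soil_ph, eh_millivolts):
    """Equation (2.12): predicted pipe age (years) at its first break.

    sr_ohm_cm     : saturated soil resistivity (ohm-cm)
    soil_ph       : soil pH
    eh_millivolts : redox potential (millivolts)
    """
    return 65.78 + 0.028 * sr_ohm_cm - 6.338 * soil_ph - 0.049 * eh_millivolts


# Hypothetical soil measurements, for illustration only
predicted_age = age_at_first_break(sr_ohm_cm=1000.0, soil_ph=7.0, eh_millivolts=100.0)
print(round(predicted_age, 3))
```

In this model, higher resistivity lengthens the predicted time to first break, while higher pH and redox potential shorten it.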
Soil resistivity has traditionally been considered an important parameter
in evaluating the corrosivity of a soil (Norin and Vinka 2005). Redox
potential is an intensity parameter of the overall redox (oxidation-reduction)
reaction potential in the system (similar in concept to pH). Redox potential
(Eh) describes the electrical state of a matrix. In soils, Eh is an important
parameter controlling the persistence of many organic and inorganic
compounds (Vorenhout et al. 2004).
The data required for this model are typically not available. Sporadic data
collection is not expensive; however, a continuous and extensive data collection
program is costly. Continuous monitoring of soil properties is important
where ground water conditions have not reached steady state or are season-
ally dependent. Table 2.7 summarises the deterministic power and linear
models discussed in the literature.
Probabilistic models
Probabilistic models are strongly preferred by managers of water
distribution systems when establishing maintenance or rehabilitation strategies.
The reason is that these models also quantify the level of uncertainty. This
is reasonable because stochastic factors always affect pipe failures, such as
the corrosion process, soil movement due to moisture content and type of
soil, and external and internal burdens on pipes. The uncertainty caused by
these stochastic factors is implicitly ignored by deterministic models.
However, it can be quantified by probabilistic models in terms of confidence
intervals for estimated values. This can be explained by the following example.
Suppose that a probabilistic method gives \hat{x} as an estimate for a variable x.
The interval [\hat{x} − δ, \hat{x} + δ] is the α-confidence interval for the
estimate \hat{x} if Pr(x ∈ [\hat{x} − δ, \hat{x} + δ]) = α (Sheskin 2003).
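As a concrete illustration of the interval [estimate − δ, estimate + δ], the sketch below computes δ for the mean of a sample using a normal approximation. The choice of the sample mean as the estimator, the normal approximation and the sample values are all assumptions made for this example; they are not a method proposed in this thesis.

```python
import math
from statistics import NormalDist, mean, stdev


def normal_confidence_interval(samples, alpha=0.95):
    """Return (estimate, delta) so that [estimate - delta, estimate + delta]
    is an approximate alpha-confidence interval for the mean.

    Here alpha is the coverage probability, as in the definition above.
    """
    estimate = mean(samples)
    z = NormalDist().inv_cdf(0.5 + alpha / 2.0)  # two-sided normal quantile
    delta = z * stdev(samples) / math.sqrt(len(samples))
    return estimate, delta


# Hypothetical yearly failure counts for a group of pipes
yearly_failures = [1.0, 2.0, 3.0, 4.0, 5.0]
estimate, delta = normal_confidence_interval(yearly_failures, alpha=0.95)
print(estimate - delta, estimate + delta)
```

A deterministic model would report only the central estimate; the half-width δ is the additional information a probabilistic model provides.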
Survival analysis is the main approach to probabilistic modelling of pipe
failures in the literature. The analysis of survival data is a traditional
statistical theme that deals with data on the time to the next failure. Survival
analysis has been used to predict pipe breakage behaviour by many researchers
in the past two decades. Some researchers have specifically adapted sur-
Table 2.7: Deterministic Power and Linear models
Reference                     Attributes
Mavin (1996)                  Estimates the number of breaks; applicable
                              to pipes with more than 6 failure
                              experiences, Equation (2.8)
Constantine et al. (1998)     Shifted time power model (STPM),
                              Equation (2.10)
Righetti (2001)               Applied a shifted time power model
                              (STPM) to a failure history of a water
                              distribution system in Melbourne
Achim et al. (2007)           Compared the performance of a STPM
                              model to a STEM model and an ANN
                              model for a failure history of a water
                              network in Melbourne, Australia
Kettler and Goulter (1985a)   A linear model for estimation of number
                              of failures per year, Equation (2.11)
McMullen (1982)               A linear model for estimation of the pipe’s
                              age at first break as a function of resistivity,
                              pH, and redox potential of its bedding
                              soil, Equation (2.12)
vival analysis (as most frequently used in the biomedical field) to water
pipe failure problems (Clark et al. 1982, Clark and Goodrich 1988, Andreou
et al. 1987).
Survival analysis incorporates the fact that, although some pipes break while
other similar pipes do not, those breaks have a strong impact on the
likelihood of future breaks for those similar pipes. Pipes can fail many
times in their lifetime. Each time a failure is observed, an immediate
intervention on the network is necessary. Many researchers (Andreou et
al. 1987, Eisenbeis 1999, Gustafson and Clancy 1999) have shown that the
breakage pattern strongly depends on the number of previous breaks that
pipes have experienced. The number of previous breaks is often reported
as the most important factor for predicting future breaks. Survival anal-
ysis is particularly useful in this field when pipe break records have been
maintained for a good portion of the water pipe network history.
The survival function and the hazard function are the fundamental
elements in survival analysis. The basic quantity employed to
describe time-to-event phenomena is the survival function (i.e. component
48 CHAPTER 2. LITERATURE REVIEW
reliability). This is the probability that an individual will survive beyond
time x. It is defined as:
S(x) = Pr(X > x) (2.13)
where X is a random variable denoting the time to the next failure.
S(x) is the survival function, a non-increasing function with a value
of one at the origin that decreases towards zero as x increases. When X
is a continuous random variable, the survival function is the complement of
the cumulative distribution function (CDF) of X, that is, S(x) = 1−F (x),
where F (x) = Pr(X ≤ x).
The hazard function is known as the conditional failure rate in reliability
theory, the force of mortality (FOM) in demography, or simply the hazard
rate. The hazard function is defined by:
h(x) = \lim_{\Delta x \to 0} \frac{\Pr(x \le X < x + \Delta x \mid X \ge x)}{\Delta x} \qquad (2.14)
The term h(x)Δx can best be interpreted as the probability that the next
failure occurs in (x, x + Δx), given that the pipe has survived until
time x. If X is a continuous random variable denoting the time of
occurrence of the next failure, then:
h(x) = \frac{f(x)}{S(x)} = -\frac{d}{dx} \ln[S(x)] \qquad (2.15)
where f(x) is the probability density function (pdf) of X. A related quantity
is the cumulative hazard function H(x), defined as
H(x) = \int_{0}^{x} h(u)\,du = -\ln[S(x)]. \qquad (2.16)
Thus, for continuous lifetime models:
S(x) = \exp[-H(x)] = \exp\left[-\int_{0}^{x} h(u)\,du\right]. \qquad (2.17)
The failure time distribution of pipes in a water distribution network may
be investigated through the survival function S(x), or the hazard function
h(x).
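The relationships in Equations (2.13) to (2.17) can be checked numerically. The following is a minimal sketch only: it assumes a Weibull lifetime as an example distribution, and the scale and shape values are illustrative, not taken from any dataset in this thesis.

```python
import math

def weibull_survival(x, scale, shape):
    """S(x) = Pr(X > x) for a Weibull lifetime (example distribution only)."""
    return math.exp(-((x / scale) ** shape))

def weibull_hazard(x, scale, shape):
    """h(x) = f(x) / S(x), Equation (2.15), in closed form for the Weibull case."""
    return (shape / scale) * (x / scale) ** (shape - 1)

def cumulative_hazard(hazard, x, steps=20000):
    """H(x) = integral of h(u) du from 0 to x, Equation (2.16), midpoint rule."""
    du = x / steps
    return sum(hazard((i + 0.5) * du) for i in range(steps)) * du

scale, shape = 60.0, 2.0   # illustrative values, not fitted to any data
x = 40.0

H = cumulative_hazard(lambda u: weibull_hazard(u, scale, shape), x)
S_from_H = math.exp(-H)                        # Equation (2.17)
S_direct = weibull_survival(x, scale, shape)   # Equation (2.13)
print(S_from_H, S_direct)
```

The two printed values agree, which illustrates that the survival function and the hazard function carry the same information about the failure time distribution.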
Kaara (1984), Marks (1985) and Andreou (1986) introduced the use
of a proportional hazard model for analysing failures in water distribution
networks. A general failure prediction model, named the “Cox propor-
tional hazard regression model”, was used by many researchers and ad-
justed for different data. Cox (1972) introduced the proportional hazard
model (PHM) in order to estimate the effects of different covariates on the
time to failure of a system. The Cox model has been used extensively in
medical statistics, where it relates outcomes such as life expectancy and
the duration of symptom-free periods to the treatment applied, individual
histories and so on. The general form of Cox’s proportional hazard model is
given in Equation (2.18):
h(t, Z) = h0(t) exp(bηZ) (2.18)
where:
h(t, Z)=hazard function, i.e. the instantaneous rate of failure;
h0(t)=arbitrary baseline hazard function;
Z=vector of covariates on the hazard function;
b=vector of coefficients to be estimated by regression from available data;
and
η= model parameter (chosen by trial and error for the best fitting to data).
A proportional hazard model is used to estimate the time to next failure.
The covariates Z represent environmental and operational factors that
influence the failure of water mains. Ageing of water pipes can be
represented by the baseline hazard function h0(t), which can be defined as a
time-dependent ageing parameter.
In a case study, Marks (1985) proposed the following second degree
polynomial of Equation (2.19) as the baseline hazard function:
h_0(t) = 2 \times 10^{-4} - 10^{-5} t + 2 \times 10^{-7} t^{2} \qquad (2.19)
and used multiple regression to determine covariates Z that affect the break-
age rate. The most significant covariates identified by Marks (1985) are the
pipe length, operating pressure, percentage of low land development, pipe
“vintage” (or period of installation), pipe age at second (or higher) break
rate, number of previous breaks in pipe, and soil corrosivity.
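The structure of the proportional hazard model can be sketched in a few lines. The sketch below uses the polynomial baseline of Equation (2.19) and, as a simplifying assumption, sets the model parameter η to 1 so that the covariate factor is exp(b·Z). The two covariates and their coefficients are hypothetical and serve only to show that the hazard ratio between two pipes does not depend on age.

```python
import math

def marks_baseline_hazard(t):
    """Baseline hazard of Equation (2.19); t is the pipe age."""
    return 2e-4 - 1e-5 * t + 2e-7 * t ** 2

def cox_hazard(t, covariates, coefficients):
    """h(t, Z) = h0(t) * exp(b . Z), assuming eta = 1 (a simplification)."""
    linear_predictor = sum(b * z for b, z in zip(coefficients, covariates))
    return marks_baseline_hazard(t) * math.exp(linear_predictor)

def survival(t, covariates, coefficients, steps=5000):
    """S(t) = exp(-integral of h(u, Z) du from 0 to t), midpoint rule."""
    du = t / steps
    cumulative = sum(cox_hazard((i + 0.5) * du, covariates, coefficients)
                     for i in range(steps)) * du
    return math.exp(-cumulative)

# Hypothetical covariates: [number of previous breaks, corrosive-soil flag]
b = [0.4, 0.7]        # made-up coefficients, not estimated from data
pipe_a = [0, 0]
pipe_b = [2, 1]

ratio_at_5 = cox_hazard(5.0, pipe_b, b) / cox_hazard(5.0, pipe_a, b)
ratio_at_30 = cox_hazard(30.0, pipe_b, b) / cox_hazard(30.0, pipe_a, b)
print(ratio_at_5, ratio_at_30)   # both equal exp(0.4*2 + 0.7*1)
```

The constant ratio is the "proportional" property of the model: covariates shift the hazard up or down by a fixed factor, while all dependence on age is carried by the baseline.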
Cox’s proportional hazard model was also applied to the first three
breakages of a pipe by Andreou et al. (1987) who used the same baseline
hazard function and the same vector of covariates of Marks (1985) for the
early stage of pipe life (up to the third failure). For later stages (after the
third failure) a constant hazard was assumed as:
h = λ = exp(bηZ) (2.20)
where λ is the constant hazard. This model could not predict future failure
times with appropriate accuracy. A moderately low value of r² = 0.34 for
the prediction of later stages, after 3 breakages, did not indicate a good
model fit. However, this model became the reference for many modelling
efforts by other researchers.
As mentioned above, Andreou et al. (1987) divided a pipe’s lifetime into
two stages. During the analysis of their failure data, they observed that
the time intervals between the first three consecutive failures were in
ascending order. After the third failure, these intervals seemed to be
constant. They used a Cox proportional hazard model for the first phase of
a pipe’s lifetime. In order to model the constant period, they considered a
Poisson distribution. Similar approaches were taken by a number of other
researchers, e.g. Marks et al. (1987). Assuming a constant rate of failure in
the second phase of a pipe’s lifetime is an inaccurate basis for analysis;
most predictive models instead assume an exponential or power relationship
between age and failure rate.
Herz (1997) and Lei and Sgrov (1998) developed probabilistic models
for estimation of the useful life of a pipe, considering it as the time to the
first failure. These lifetime models are meant to model the pipe breakage
rate over its lifetime. Although such lifetime analyses provide an insight
into failure mechanisms, they are impractical for developing decisions on
management of distribution systems. This is because the detailed failure
databases required by such models are unavailable in most water companies.
The majority of pipes in mature water networks are old and full breakage
records cannot be obtained from existing data.
Constantine and Darroch (1993) and Constantine et al. (1996) developed
a time-dependent Poisson process with mean breakage rate depending on
pipe age.
H(t) = (t/\theta)^{\beta} \qquad (2.21)
where:
H(t)= mean number of failures per unit length at age t (not to be mistaken
with the cumulative hazard function); and
t= pipe age;
θ , β = scale and shape parameters, respectively.
This is a Weibull random process because the resulting cumulative distri-
bution is in the form of the Weibull cumulative distribution function. In
(Constantine and Darroch 1993) and (Constantine et al. 1996), the parame-
ter β (shape parameter) is considered constant for a homogeneous group of
failures (e.g. failures that are attributed to corrosion only), whereas θ (scale
parameter) is the following function of some operational and environmental
covariates:
\theta = \theta_0 e^{\alpha Z} \qquad (2.22)
where θ0 is the baseline value, α is the vector of coefficients to be estimated
by regression, and Z is a vector of covariates affecting breakage rate.
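A small simulation may help to show what Equations (2.21) and (2.22) imply. The sketch below is an illustration only: it treats a pipe of unit length, uses made-up parameter values, and draws failure ages from a Poisson process whose mean function is H(t), then compares the average simulated count with H(t).

```python
import math
import random

def scale_parameter(theta0, alpha, covariates):
    """theta = theta0 * exp(alpha . Z), Equation (2.22)."""
    return theta0 * math.exp(sum(a * z for a, z in zip(alpha, covariates)))

def mean_failures(t, theta, beta):
    """H(t) = (t / theta)^beta, Equation (2.21), for a unit length of pipe."""
    return (t / theta) ** beta

def simulate_failure_ages(horizon, theta, beta, rng):
    """Failure ages up to `horizon` from a Poisson process with mean function
    H(t). The k-th arrival g_k of a unit-rate Poisson process is mapped to the
    age theta * g_k^(1/beta), which is the inverse of H."""
    ages, g = [], 0.0
    while True:
        g += rng.expovariate(1.0)
        age = theta * g ** (1.0 / beta)
        if age > horizon:
            return ages
        ages.append(age)

rng = random.Random(1)
# All values below are made up for illustration.
theta = scale_parameter(theta0=40.0, alpha=[-0.3], covariates=[1.0])
beta = 1.8
runs = 4000
average = sum(len(simulate_failure_ages(50.0, theta, beta, rng))
              for _ in range(runs)) / runs
expected = mean_failures(50.0, theta, beta)
print(average, expected)
```

With β greater than one, the simulated failures become more frequent as the pipe ages, which is the ageing behaviour that the time-dependent formulation is intended to capture.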
Bremond (1997) applied the PHM model, Equation (2.20), as a baseline
hazard function and proposed the following time-dependent Poisson model:
h_0(t) = \lambda \beta (\lambda t)^{\beta - 1} \qquad (2.23)
where:
t= time to (next) failure; and
λ, β= scale and shape parameters (respectively) of the Weibull distribution.
The model resulted in a good fit to a large failure history of more than a
decade in France. The baseline hazard function used in this proportional
hazard model is the Weibull hazard function.
The use of the Poisson distribution for failure analysis is based on the
underlying assumption that there is a constant risk of failure, although the
times between failures are not necessarily equal. In other words, it is
implicitly assumed that each pipe experiences breakages occurring completely
randomly at a constant rate over the period of observation. This is not a
realistic assumption, since the pipe will age over the period of observation.
Gradual deterioration should therefore lead to a change of failure rate over
that period. This can be considered a source of inaccuracy in all failure
analysis models based on the Poisson distribution.
As mentioned earlier, the time-dependent Poisson model of Constantine
and Darroch (1993) was used by Mavin (1996) for data filtering. He tried
to eliminate the failure occurrences caused by operational or other factors
rather than by deterioration as a result of ageing. For this purpose he
introduced some rules to identify the failures that occurred despite having
a low probability. For example, when two successive failures occurred in
spite of a low probability (less than 1%), the second one was attributed to
accidental damage or a faulty operational procedure. Based on this reasoning,
the second failure was filtered out of the data and ignored in further
analysis.
Eisenbeis (1994) proposed an approach similar to Andreou’s application
of the proportional hazard model (Andreou 1986), but assumed a Weibull
distribution for the baseline hazard function. This model also included three
stages. The first stage described hazard functions for the pipes that had not
experienced a failure. The second stage described hazard functions for the
second to fourth failures, while the third stage described the hazard functions
for pipes after their fourth failure. Since the baseline hazard function was
actually a Weibull model, the procedure for predicting new failures was
only valid in the case where the Weibull distribution was reduced to an
exponential distribution.
This five-parameter distribution model was applicable because 40 and 54
years of failure data on the large urban water pipe networks were available in
the study. The availability of data permitted the use of Cox’s proportional
hazards model which requires the significant risk factors to be identified first
by regression analysis. This significantly increases the number of parameters
that must be calibrated. The risk function for Cox’s proportional hazards
model was estimated at each time step from the data. Some significant risk
factors identified for one or more networks were pipe length, pipe diameter,
soil corrosivity, traffic intensity above the pipe, and installation after 1966.
Such data are not available in every water distribution system and therefore
the above technique is not applicable in many cases.
Lei and Sgrov (1998) used the Cox’s proportional hazards model and
the Weibull accelerated life model, given below, to analyse the water distri-
bution network in a case study:
\ln(T) = \mu + x^{T}\beta + \sigma Z \qquad (2.24)
where:
T= time to next failure;
x= vector of explanatory covariates;
Z= random variable distributed as a Weibull;
σ= parameter to be estimated by maximum likelihood; and
β= vector of parameters estimated by maximum likelihood.
In this study only the first failure was analysed, and all maintenance
activities were considered failures. In this model, time to next failure is:
T = f(µ, σZexηβ) (2.25)
The essence of the accelerated lifetime models is that time to next failure
expands or contracts relative to that at x=0, where x is defined as a vector
of explanatory variables. In the study presented in (Lei and Sgrov 1998),
the explanatory variables that were included in x were: age groups, pipe
size groups and length of pipe.
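The way covariates stretch or shrink the time scale in an accelerated lifetime model can be illustrated directly from Equation (2.24). This is a sketch under stated assumptions: the coefficients and covariate values are made up, and the random term is drawn as z = ln(E) with E exponentially distributed, a choice that makes T Weibull-distributed and is not taken from the cited study.

```python
import math
import random

def aft_time(mu, beta, x, sigma, z):
    """T = exp(mu + x . beta + sigma * z), Equation (2.24) solved for T."""
    return math.exp(mu + sum(b * xi for b, xi in zip(beta, x)) + sigma * z)

def acceleration_factor(beta, x):
    """Factor by which covariates x expand (>1) or contract (<1) the time
    to next failure relative to the reference case x = 0."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))

rng = random.Random(7)
mu, sigma = 4.0, 0.5          # made-up values
beta = [-0.25, 0.10]          # made-up coefficients
x = [1.0, 2.0]                # hypothetical covariate values

# Assumption: z = ln(E), E ~ Exp(1), so that T follows a Weibull distribution.
z = math.log(rng.expovariate(1.0))

t_reference = aft_time(mu, beta, [0.0, 0.0], sigma, z)
t_pipe = aft_time(mu, beta, x, sigma, z)
factor = acceleration_factor(beta, x)
print(t_pipe / t_reference, factor)
```

For the same random draw, the ratio of the two failure times equals the acceleration factor, which is the sense in which time "expands or contracts relative to that at x=0".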
The research showed no significant difference between the results
of the two models. This is not surprising since it can be shown that an
accelerated lifetime model is equivalent to the proportional hazard model
when Z has a Weibull distribution (Cox and Oakes 1984). The authors did
not report whether the model was validated. They also did not comment on
the quality of the predictions. Besides, the study is limited by the decision to
treat all maintenance activities (including non-repair activities like flushing)
in the network as failures.
Eisenbeis (1999) also applied the accelerated lifetime model of Equa-
tion (2.24) for a number of failure histories. He considered different meth-
ods for using the proposed modelling approach in municipalities with brief
pipe break histories. His approach was to lengthen the pipe break history
through creating a sample of pipe breaks by randomly selecting break dates
that follow the shape of the survival function of the general model.
The covariates used for each system were decided considering the local
conditions. Using the previous number of breaks as a covariate complicated
the application of the method necessitating the use of Monte-Carlo simula-
tion for the prediction of the number of failures at a desired time-horizon.
The author reported good predictions using this method. However, he
did not provide any details to demonstrate these results. Besides, adding a
created sample of data to the real breakage record makes the reliability of
the resulting model questionable.
Le Gat (1999) described the application of the Weibull proportional
hazard model for the analysis of irrigation pipes in the southern part of
France. The expected number of failures for each pipe was predicted using
this model. The proposed method followed the principles introduced in the
works of Eisenbeis (1999) and Andreou (1986). A Monte Carlo simulation
based on the survival functions was introduced to predict pipe failures.
Monte Carlo simulation is a method for iteratively evaluating a deter-
ministic model using sets of random numbers as inputs. This method is
often used when the model is complex, nonlinear, or involves numerous
uncertain parameters. A simulation can typically involve over 10,000 eval-
uations of the model, a task which in the past was only practical using
supercomputers (Metropolis and Ulam 1949).
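The following sketch shows the kind of Monte Carlo calculation described above, applied to the number of failures within a time horizon when the hazard depends on the number of previous breaks. It is not the procedure of any of the cited studies: the Weibull inter-failure times, the way each break shortens the scale (capped after three breaks), and all parameter values are assumptions made for illustration.

```python
import math
import random

def simulate_break_count(horizon, base_scale, shape, break_effect, rng):
    """One simulated history: number of breaks within `horizon`.
    Inter-break times are Weibull; each of the first three breaks shortens
    the scale by a factor exp(-break_effect) (an assumed dependence)."""
    elapsed, breaks = 0.0, 0
    while True:
        scale = base_scale * math.exp(-break_effect * min(breaks, 3))
        elapsed += rng.weibullvariate(scale, shape)
        if elapsed > horizon:
            return breaks
        breaks += 1

def expected_breaks(horizon, runs, rng, **model):
    """Monte Carlo estimate of the mean number of breaks within `horizon`."""
    total = sum(simulate_break_count(horizon, rng=rng, **model)
                for _ in range(runs))
    return total / runs

rng = random.Random(2009)
model = dict(base_scale=25.0, shape=1.5, break_effect=0.3)  # made-up values
estimate_10 = expected_breaks(10.0, 10000, rng, **model)
estimate_20 = expected_breaks(20.0, 10000, rng, **model)
print(estimate_10, estimate_20)
```

Because the distribution of each inter-failure time changes with the break count, the expected number of failures has no convenient closed form here, which is why repeated random sampling is used.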
Eisenbeis (1997) presented an analysis of two French networks and one
Norwegian network using a Weibull proportional hazard model. The model
used a stratification of the failure data according to the number of previous
failures recorded. Acceptable agreements between observed and predicted
failures were reported. PHM modelling requires the inclusion of several vari-
ates in a single analysis. This reduces the amount of pre-grouping that is
required by a single, two or three parameter model. However, this grouping
requirement is not altogether eliminated and careful analysis is required to
identify groups of pipes that may differ in their underlying ageing process.
Besides, the underlying assumption that environmental and operational factors
affect the failure hazard of all types of pipes in the same proportion
reduces the reliability of these models. For instance, soil condition
has a clearly greater effect on unprotected cast iron pipes than on
coated or cathodically protected pipes. If these two categories are not
stratified in the analysis, the differences may reduce the accuracy of the
results. Hence, using these models requires careful examination of the data
in order to identify the covariates with the best predictive ability, as
well as those which are required for data stratification.
Cohort survival models are another class of survival models. A statistical
distribution, named the Herz distribution, was introduced and used by Herz
(1996), Herz (1997) and Herz (1998). The Herz distribution was developed
specifically for the ageing of infrastructure elements. In Herz (1996), the
interrelationship between ageing and the occurrence of first failure was found
to be very weak. When the Herz distribution is used as a survival function for
failure analysis, the failure rate/renewal rate first increases at an
accelerating pace with age, then increases more gradually, and finally
approaches a boundary value asymptotically. What is called the failure rate/renewal rate
in this thesis, in statistical terms, is the hazard function for the service life of
a pipe. The pipe is replaced when the service life is expired. The probability
density function f(t), survival function S(t) and hazard function h(t) were
given as:
f(t) = \frac{(a + 1)\, b\, e^{b(t-c)}}{\left[a + e^{b(t-c)}\right]^{2}} \qquad (2.26)
S(t) = \frac{a + 1}{a + e^{b(t-c)}} \qquad (2.27)
h(t) = \frac{b\, e^{b(t-c)}}{a + e^{b(t-c)}} \qquad (2.28)
where the values of a, b and c parameters may be derived empirically for
the past periods and particular types of pipes. When used to forecast, they
must be based on expert judgement, i.e. on pipe survival estimates by
managers and engineers (Herz 1996). The ageing function (with upper and
lower boundaries) must be established for each group of pipes. The model
predicts the residual life (i.e. remaining lifetime) for each pipe cohort and
can be used to estimate rehabilitation requirements. This is the reason that
the Herz model is known as a cohort survival model.
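The behaviour of Equations (2.26) to (2.28) can be examined with a short sketch. The parameter values below are made up, and treating c as a resistance time before which no failure occurs is an assumption of this sketch rather than a statement from the cited papers. The sketch checks that h(t) = f(t)/S(t) and that the hazard approaches the boundary value b for old pipes.

```python
import math

def herz_survival(t, a, b, c):
    """S(t) of Equation (2.27); taken as 1 for t <= c (assumed resistance time)."""
    if t <= c:
        return 1.0
    return (a + 1.0) / (a + math.exp(b * (t - c)))

def herz_density(t, a, b, c):
    """f(t) of Equation (2.26); taken as 0 before c."""
    if t < c:
        return 0.0
    e = math.exp(b * (t - c))
    return (a + 1.0) * b * e / (a + e) ** 2

def herz_hazard(t, a, b, c):
    """h(t) of Equation (2.28); taken as 0 before c."""
    if t < c:
        return 0.0
    e = math.exp(b * (t - c))
    return b * e / (a + e)

a, b, c = 10.0, 0.1, 20.0   # made-up parameters for one pipe cohort

for age in (20.0, 40.0, 60.0, 100.0, 300.0):
    print(age, herz_survival(age, a, b, c), herz_hazard(age, a, b, c))
```

The printed hazard rises quickly at first, then more slowly, and levels off near b, matching the qualitative description of the failure rate/renewal rate given above.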
This thesis examined the application of the above model to the CWW failure
database by fitting the empirical failure rates (calculated from the past
failure data; see Section 4.5 for more details) to the models given in
Equations (2.26)-(2.28). A small correlation coefficient of r² = 0.25 was
observed, demonstrating a poor performance of the above model for the case
study of this thesis.
A number of European research centres have developed models for as-
sessing rehabilitation and renewal needs for water infrastructure. These
decision support tools contain several modules including a network inven-
tory module, a failure and break forecasting module, an economic data
module and a strategy comparison module. Several major European cities
have used the Herz distribution for planning pipeline renovation and reha-
bilitation. The procedure has been included into the user-friendly software
KANEW in a research project sponsored by American Water Works Associ-
ation Research Foundation (AWWARF) (Deb 1998). KANEW predicts the
date that the selected pipe sections will reach the end of their service lives.
The pipe sections are differentiated by date of installation and by type of
pipe sections with distinctive life spans.
The system assumes service life to be a random variable, starting after
some time of resistance and being characterized by a median age and a
standard deviation, or age that would be reached by a certain percentage of
the most durable pipe section. The user can choose the parameters of the
Herz distribution.
Predictions are based on optimistic assumptions of service lives that
are derived from failure and rehabilitation statistics for different types of
pipes. The cohort survival model of KANEW is a tool for exploring network
rehabilitation strategies.
The main limitations of KANEW are:
- Since no covariate structure is included in KANEW, the model
does not provide for the analysis of individual pipes. Ageing functions
are specified for each type of pipe, not for individual pipes. This implies
that the model should only be used for analysis of rehabilitation
requirements for the entire water distribution network (i.e. network
level).
- The parameters of the Herz distribution used in KANEW are based
on historical renewal rates and not historical break rates. The re-
newal rates reflect the rehabilitation policies in the past (e.g. often
tending to maintain a fixed average age of the stock) and the economic
and technical condition of the period. Furthermore, the rehabilitation
policies are likely to change in the future. So the parameters would
have to be changed in order to reflect future standards and policies.
Kulkarni et al. (1986) proposed a Bayesian diagnostic model for estima-
tion of system-wide probability of failures:
\Pr(\text{failure of specific characteristics}) = \frac{P_{c/f} P_f}{P_{c/f} P_f + P_{c/nf} (1 - P_f)} \qquad (2.29)
where:
Pc/f = probability of observing specified characteristics on a segment that
failed; and
Pc/nf = probability of observing the same characteristics on a segment that
has not failed.
This model can be applied to homogeneous groups of pipes in terms of
criteria such as diameter, length, age and type, soil characteristics and
water pressure. For the failure database of the case study in this thesis, the
failure probabilities given by the above equation were compared with the
empirical failure probabilities calculated directly from the failure records
(see Section 4.5). On average, the resulting predictions differed
substantially, by 67.5% (over-prediction).
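Equation (2.29) is an application of Bayes' rule, which the following sketch makes explicit. It reads Pf as the overall (prior) probability that a segment has failed, which is an interpretation assumed here since the symbol is not defined in the list above, and it uses hypothetical segment counts.

```python
def failure_probability(p_char_given_failed, p_char_given_not_failed, p_failure):
    """Equation (2.29): probability of failure for segments showing the
    specified characteristics. p_failure is read as the prior probability
    of failure (an assumption of this sketch)."""
    numerator = p_char_given_failed * p_failure
    return numerator / (numerator + p_char_given_not_failed * (1.0 - p_failure))

# Hypothetical counts of pipe segments, for illustration only.
failed_total, failed_with_characteristics = 200, 120
intact_total, intact_with_characteristics = 1800, 360

p_f = failed_total / (failed_total + intact_total)
posterior = failure_probability(
    failed_with_characteristics / failed_total,
    intact_with_characteristics / intact_total,
    p_f,
)
direct = failed_with_characteristics / (
    failed_with_characteristics + intact_with_characteristics)
print(posterior, direct)
```

When all three inputs come from the same set of counts, the result coincides with the fraction of failed segments among those showing the characteristics, which is a useful consistency check on any implementation.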
Malandain et al. (1998) used a Poisson regression model to quantify
the influence of diameter, material, and position of the pipe (i.e. located
in a road or not) on the break rate. The time passed since installation
is not included in the regression. The water network of Lyon in France
was used as a case study. Prior to the analysis, the pipes were grouped
according to structural and environmental factors. In order to model the
break rate (in the form of a hazard function) as a function of time, the
break rate function was divided into three different intervals. Each interval
was analysed separately, resulting in a step function for the break rates. In
the early stage of a pipe’s life, the hazard function increased and a Weibull
model was assumed based on the results from Eisenbeis (1999). In the
following stages, an exponential model (i.e. constant hazard function) was
used. The authors pointed out that the proposed approach should only
be used at network level and not at pipe level. A Geographic Information
System (GIS) was used for identifying the spatial variation for the break
rate caused by environmental variables (e.g. soil condition).
Gustafson and Clancy (1999) described a method to model the occur-
rence of pipe failures in grey cast iron pipes with a semi-Markov model,
where the “state” of the water mains is represented by the number of fail-
ures and the time between failures is used as the “holding time”. The re-
quired probability distributions were estimated using survival analysis. The
time to first failure was modelled with a 3-parameter generalised gamma
distribution and the subsequent failures with an exponential distribution,
identical for all t_i (i > 1), where t_i is the time between the (i − 1)th and
the i-th breaks of a pipe. The dataset was divided into three groups of pipes,
depending on the original wall thickness. No explanatory variables were
included in the analysis, due to lack of data. The authors reported that the
mean failure time is strongly related to the number of failures and concluded
that these grey cast iron pipes are deteriorating. The model’s reliability is
called into question by the poor time resolution (just one year) available for
recorded failures.
Li and Haimes (1992) also used a PHM as introduced by Andreou (1986)
to identify two stages of deterioration and their accompanying hazard func-
tions. The authors used the formula of Walski and Pelliccia (1982) to esti-
mate the repair time of a pipe (the time it takes to repair the pipe), and to
estimate the accompanying cost of repair and replacement. This was used
to formulate a two-stage decision making process.
Goulter et al. (1993) assumed a non-homogeneous Poisson distribution
for subsequent failures of an initial failure in a spatial and temporal cluster.
The authors used a cross-referencing scheme to determine the mean number
of failures that occur subsequent to initial failures. Nonlinear regression
was used to determine parameters based on time and space for the non-
homogeneous Poisson model.
P(x) = \frac{m^{x} e^{-m}}{x!} \qquad (2.30)
where:
P (x) = probability of x failures;
m = average number of subsequent failures occurring in the cluster domain;
x = number of subsequent failures occurring in the cluster domain.
This method assumed the Poisson distribution for the failures occurring
within a fixed interval of time (T ) and space (S). Hence, a non-linear
regression function was used to estimate the initial values of m:
m = b_0 t^{b_1} s^{b_2} + \varepsilon \qquad (2.31)
where b0, b1, and b2 are the regression parameters; and ε denotes a random
error. After a while (e.g. one year), parameter m is updated using the new
set of failure data.
m = m(s, t) = \int_{0}^{S} \int_{0}^{T} r(s, t)\,dt\,ds \qquad (2.32)
where:
s = distance from the first break in a cluster;
t = time elapsed from the first break in a cluster;
S = space interval (meters); and
T = time interval (days).
The proposed technique gives acceptable approximations when the mean
number of subsequent failures is very low. It was observed that in the case
of a high mean, the distribution obtained with the regression model results
in an inadequate fit. This model requires precise information about
the location of breakages, which calls for the use of a GIS. Besides, the
model can only be used to predict the probability of breakages occurring
subsequent to an initial breakage in the cluster. Also, climatic or other
variables that influence the failure profile and have a high level of local
annual variation cannot easily be considered using this technique.
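As a brief illustration of Equations (2.30) and (2.31), the sketch below computes the mean number of subsequent failures from the power-form regression (without the error term) and the corresponding Poisson probabilities. The regression parameters and the time and space intervals are invented for the example.

```python
import math

def mean_subsequent_failures(t, s, b0, b1, b2):
    """m = b0 * t^b1 * s^b2, the deterministic part of Equation (2.31)."""
    return b0 * t ** b1 * s ** b2

def poisson_pmf(x, m):
    """P(x) = m^x * exp(-m) / x!, Equation (2.30)."""
    return m ** x * math.exp(-m) / math.factorial(x)

# Invented values: a 90-day, 500-metre cluster domain and made-up coefficients.
m = mean_subsequent_failures(t=90.0, s=500.0, b0=0.002, b1=0.5, b2=0.4)
p_none = poisson_pmf(0, m)
p_at_least_one = 1.0 - p_none

for x in range(4):
    print(x, poisson_pmf(x, m))
print("at least one subsequent failure:", p_at_least_one)
```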
Jacobs and Karney (1994) have proposed the following simple proba-
bilistic model:
P^{-1} = a_0 + a_1\,\mathrm{Length} + a_2\,\mathrm{Age} \qquad (2.33)
where P^{-1} denotes the reciprocal of the probability of a day with no breaks
and a_0, a_1, a_2 are regression coefficients. Data required for this model are
pipe length, age, and breakage history. More data enables the formation of
homogeneous groups. This model has a very simplistic approach towards
various environmental and maintenance factors influencing the complicated
mechanism of failure.
Lee and Kim (2004) attempted to estimate the probability of pipe failures
in terms of their different failure-related characteristics, such as their
corrosion rates in the depth and length directions. The focus of this study
was on the evaluation of failure probabilities at different times in the
future and on their dependence on the aforementioned characteristics. For
each characteristic, two random variables were considered: resistance (R)
and load (L). The failure caused by that characteristic is denoted by
R < L or R − L = Z < 0, and the failure probability is derived by assuming
that R and L are both normal and independent:
\text{Failure Probability} = \Pr(Z < 0) = \Phi\left(-\frac{\mu_Z}{\sigma_Z}\right) = \Phi\left(-\frac{\mu_R - \mu_L}{\sqrt{\sigma_R^{2} + \sigma_L^{2}}}\right) \qquad (2.34)
2.6. RELIABILITY ANALYSIS OF WATER NETWORKS 59
where:
Φ(.) = CDF of a standard normal random variable;
µZ = average of the Z variable;
µR = average of the R variable;
µL = average of the L variable;
σZ = standard deviation of the Z variable;
σR = standard deviation of the R variable; and
σL = standard deviation of the L variable.
For each characteristic, the term (µZ/σZ) and its time variations are
determined using different models. Those values are then substituted into
the above equation. However, assuming the normal distribution for both the
resistance (R) and load (L) variables corresponding to each of the
characteristics is unrealistic. In addition, the model could only be applied
to very large histories of pipe failures that include a large variety of
failure-related characteristics for the pipes. This kind of data is
generally not available in most water distribution systems.
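Equation (2.34) can be evaluated with the standard library alone, since the standard normal CDF can be written in terms of the error function. The sketch below is illustrative: the means and standard deviations of resistance and load are made-up numbers in arbitrary units.

```python
import math

def standard_normal_cdf(u):
    """CDF of a standard normal random variable, via the error function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def failure_probability(mu_r, sigma_r, mu_l, sigma_l):
    """Pr(Z < 0) for Z = R - L with R and L independent and normal,
    Equation (2.34)."""
    mu_z = mu_r - mu_l
    sigma_z = math.sqrt(sigma_r ** 2 + sigma_l ** 2)
    return standard_normal_cdf(-mu_z / sigma_z)

# Made-up values: mu_Z = 5 and sigma_Z = 5, so the argument of the CDF is -1.
p = failure_probability(mu_r=12.0, sigma_r=3.0, mu_l=7.0, sigma_l=4.0)
print(p)
```

A larger gap between mean resistance and mean load, or smaller variances, pushes the ratio µZ/σZ up and the failure probability down.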
Most of the probabilistic models that have appeared in the literature
(including the models reviewed in this section) are listed in Tables 2.8 to
2.11. Table 2.8 lists the probabilistic models based on the Poisson distribu-
tion. The probabilistic models using Cox’s and Weibull proportional hazard
function are listed in Tables 2.9 and 2.10, respectively. The remainder of
the probabilistic models in the literature are listed in Table 2.11.
2.6 Reliability Analysis of Water Networks
Reliability analysis has been a constant challenge for managers of
infrastructure systems such as water networks. Many researchers have
addressed this problem from various points of view. For instance, Walters
(1988) described the reliability of water distribution systems as one of the
most challenging unsolved issues facing water supply systems. Two major
issues have been recognised as being particularly problematic in
performing a reliability assessment: firstly, what measure is the
most appropriate for assessment of reliability, and secondly, what is an ac-
ceptable level of reliability (Xu and Goulter 1998a).
A number of researchers, e.g. Mays (1996) and Ostfeld and Shamir
(1996) define the reliability assessment of a water distribution network as
measuring the ability of the system to meet the consumer requirements in
terms of quantity and quality under both normal and abnormal operating
conditions. Maglionico and Ugarelli (2004) have defined specific Reliability
Indicators for both hydraulic and quality aspects, and for their
combination. In that research, the reliability of the whole system was
determined as the average value of the Reliability Indicators calculated for
all the nodes during the period of simulation (100 years).
A review of existing studies under the title of reliability analysis shows
that there is currently no universally accepted definition or measure for the
reliability of water distribution systems, as it requires both the
quantification of reliability measures and criteria that are
meaningful and appropriate. The behaviour of a water distribution system
is governed by physical laws that describe the flow relationships in pipes
and hydraulic control elements, consumer demand, and system layout.
Therefore, reliability considerations for water distribution systems are an
integral part of all decisions concerning the planning, design and operation phases. Two
scenarios are possible for the failure of water distribution systems. Water
networks, like structural trusses, may fail because of loadings being greater
than design levels. Like electrical networks, they also may fail because
components (e.g. pipes) break even with loadings being below the design
levels. Thus, there are two primary probabilistic factors which contribute
to the performance of water distribution systems, namely, the probability
of failure of individual network components and the probability of actual
demand being greater than the design load (Goulter 1987).
A number of models were established for the hydraulic (capacity)
reliability analysis of the system, e.g. (Gupta and Bhave 1994, Han and
Dai 1996, Goulter 1992, Xu and Goulter 1998b, Cooke and Jager 1998,
Goulter 1990, Goulter and Bouchart 1987, Xu and Powell 1991, Xu and
Goulter 1999). Xu et al. (2003) define the capacity
reliability as the probability that the nodal demand is met at or over the
prescribed minimum pressure for a fixed network configuration.
In a general sense, reliability is the ability to deliver design flows under a
wide range of conditions. Obviously, component failure, e.g. pipe breakage
can severely hamper the ability of a network to perform up to specifications.
Some researchers have attempted to define the parts of the problem
and to incorporate them into design models. Hobbs (1985) and Hobbs
and Beim (1986) proposed a series of approaches for reliability assessments
in water supply systems which include both the probability of component
failures and the probability distribution of demands. The approaches were
however developed for the supply aspect of water systems rather than the
distribution network itself. Shamir and Howard (1985) proposed approaches
for reliability assessment of water supply systems but they were similarly
concerned primarily with the supply aspect of the system.
The reliability of the components of a distribution system in delivering the
required quantity of water can be referred to as hydraulic reliability. For
instance, flow at the nodes can be regarded as an indicator for measuring the
function of a distribution system. In this case, the system reliability is
measured by assessing the condition of nodes in receiving a given supply at a
given head. If this head is not attainable, supply at the node is reduced.
Each node can thus be in a normal, reduced service, or failure mode.
The system will be said to be in normal mode if all nodes are receiving
normal supply, in failure mode if supply to any node has been shut off, and
in reduced mode if some node or nodes are receiving reduced supply but no
nodes are completely shut off. In a reliability assessment of water distribution
systems, Kettler and Goulter (1985b) used the probability of failure of
major water supply paths, while Goulter and Coals (1986) used the probability
of node isolation. Both approaches used a linear program through
constraints restricting the average number of breaks per year permitted in
each link. The relevant probabilities of interest were also calculated using
the average failure rates in each link.
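The node-based definition of the system mode given above can be written as a short rule. The following Python sketch is only an illustration of that definition; the names `Mode` and `system_mode` are hypothetical and do not come from any of the cited models.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    REDUCED = "reduced"
    FAILURE = "failure"


def system_mode(node_modes):
    """Combine the modes of the individual nodes into one system mode.

    Failure: supply to at least one node has been shut off.
    Reduced: no node is shut off, but some node receives reduced supply.
    Normal:  every node receives normal supply.
    """
    modes = list(node_modes)
    if any(m is Mode.FAILURE for m in modes):
        return Mode.FAILURE
    if any(m is Mode.REDUCED for m in modes):
        return Mode.REDUCED
    return Mode.NORMAL
```

A single shut-off node therefore dominates: a network with one failed node and several reduced nodes is reported as being in failure mode.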
In an examination of the hydraulic reliability of distribution systems,
Cullinane (1986) presented concepts of mechanical reliability and availability
as quantitative measures of system reliability, while Mays et al. (1986) used
a cut-set approach for modelling the reliability of a network. The proposed
procedure showed how failure definition can be directly included into an
optimisation design model. In the summary of the procedure, it was men-
tioned that the study of reliability of water distribution systems is severely
hampered by lack of an accepted definition of measures for reliability.
For the purpose of assessment of reliability of water distribution systems,
the first step is to define the values and criteria to measure the reliability.
Then, there is a need for a mathematical model that is capable of predicting
the state of components. The last step is to combine the reliability measures of
the system with the performance state of system components. The mathematical
modelling of component failures, mentioned as the middle stage of reliability
analysis of water distribution system, is the focus of this thesis.
2.7 Milestones of Study and Summary
The studies presented in this thesis have been undertaken in a number of
steps described as follows. First, a failure database was prepared consisting
of records of past failures in the network during a number of years com-
bined with GIS information about the location of pipes. Since the database
contains failures of grey cast iron pipes, the characteristics and failure mech-
anism of this type of pipe were briefly studied.
Existing probabilistic models were listed and discussed earlier in this
chapter. These models partially or completely rely on a number of dis-
tribution models. In other words, the probabilistic methods, reviewed in
this chapter, extract the pattern of previous failure occurrences in order to
project it to the future. For this purpose they use known distribution models.
It was noted in this research that the parametric modelling approach does
not take into account the non-stationary nature of failure occurrences as a
random process. This matter has been theoretically analysed as presented
in this thesis.
The other observation made in this literature review is that most of the
existing models require knowledge of the rank order of failures, i.e. whether
a failure in the existing record is the first or second or the n-th failure since
installation. However, in many mature water distribution systems such as
that of CWW, records of failures that occurred before a specific date (prior to
the first day of record) are unavailable.
Based on the identified limitations of these existing models, the charac-
teristics of available datasets and the purpose of this study, a new technique
that considers the random changes of environmental factors affecting the
performance of water mains and is capable of handling incomplete data has
been formulated during the course of this research.
A new probabilistic model in which the mathematical form of the model
is learnt from previous failures using an artificial neural network (ANN) is
proposed. The neural network technique is adopted to reconstruct the pattern
of previous failure occurrences in order to predict the reliability of each
class of pipe at a specified time. The proposed neural network
model resulted in good estimates without the need for predefined distribu-
tions for fitting the previous failure data to. The ANN technique benefits
from its high computational power in dealing with noisy and censored data
and fitting them to the closest nonlinear curve. The black box of the neural
network model includes a number of parameters that are adjusted through
learning from past failures. However, due to the existence of these parameters,
although they are adjustable, the neural model is a parametric model
and is unable to accommodate the non-stationarity of the failure process.
To advance the study to achieve an accurate and adaptive non-parametric
model, a new technique for probabilistic analysis of water mains has been
developed. This technique can be applied to the water pipe failure history
to estimate the expected number of failures within a given number of time
intervals (days, weeks, months, etc.) in the future. The upper bound and
lower bound of 80% confidence intervals for the estimations can also be
determined. The outputs of the prediction method can be automatically
updated with time, so the proposed method implicitly takes into account the
gradual variations of the factors influencing the deterioration process.
Table 2.8: Probabilistic models using time-dependent Poisson model

Reference                         Attributes
Andreou et al. (1987)             Assumed a constant hazard function of Equation
                                  (2.20), which considers a Poisson distribution
                                  for the inter-failure times after the third
                                  failure
Constantine and Darroch (1993)    Proposed a time-dependent Poisson model,
                                  Equation (2.21), with the scale parameter a
                                  function of some operational and environmental
                                  covariates, Equation (2.22)
Bremond (1997)                    Proposed a time-dependent Poisson model of
                                  Equation (2.21) to apply the proportional
                                  hazard method of Equation (2.20)
Eisenbeis (1994)                  Used the same proportional hazard of Equation
                                  (2.20) but assumed a Weibull distribution for
                                  the baseline hazard function. This model
                                  included three stages. The first stage
                                  described hazard functions for pipes that have
                                  not experienced a failure. The second stage
                                  described hazard functions for the second to
                                  fourth failure, while the third stage described
                                  the hazard functions for pipes after their
                                  fourth failure
Malandain et al. (1998) &         Divided the break rate function into three
Malandain et al. (1999)           different intervals. They used the Weibull
                                  hazard function for applying the proportional
                                  hazard method for the early stage of a pipe’s
                                  life
Table 2.9: Probabilistic models using Cox’s proportional hazard

Reference                         Attributes
Kaara (1984)                      Introduced the use of Cox’s proportional
                                  hazards model for analysing failures in water
                                  distribution networks. The non-parametric
                                  multivariate model, Equation (2.18), was used
                                  for a survival-based model
Marks (1985)                      Used Equation (2.19) as the baseline hazard
                                  function for Cox’s proportional hazards model
Andreou et al. (1987)             Used Equation (2.19) as the baseline hazard
                                  function for the early stage of a pipe’s life
                                  (up to the first three breaks)
Li and Haimes (1992)              Used Andreou’s proportional hazard Equation
                                  (2.19) and a Poisson model for developing a
                                  decision support system
Lei and Sægrov (1998)             Used Cox’s proportional hazards model for a
                                  failure history
Table 2.10: Probabilistic models using Weibull hazard function

Reference                         Attributes
Le Gat (1999)                     The expected number of failures for each pipe
                                  was predicted using a Weibull proportional
                                  hazard model (PHM) for the analysis of
                                  irrigation pipes in the southern part of France
Lei and Sægrov (1998)             Used a Weibull accelerated life model,
                                  Equations (2.24) and (2.25)
Eisenbeis (1997)                  Proposed a Weibull PHM that used a
                                  stratification of the failure data based on the
                                  number of previous failures recorded
Table 2.11: Miscellaneous probabilistic models

Reference                         Attributes
Herz (1996)                       Introduced Equation (2.28) as a cohort hazard
                                  function, assuming a weak correlation between
                                  ageing and the occurrence of the first failure;
                                  then a sharp increase in the correlation of
                                  failure rate with ageing; and finally a gradual
                                  increase of the failure rate with ageing,
                                  approaching a boundary value asymptotically
Deb (1998)                        A software package for renewal projects, named
                                  KANEW, was developed using the Herz
                                  distribution function of Equation (2.28).
                                  Specified ageing functions for each type of
                                  pipe
Eisenbeis (1999)                  Used the Gumbel distribution for the
                                  accelerated life model of Equation (2.24)
Kulkarni et al. (1986)            Proposed a Bayesian diagnostic model for
                                  estimation of the system-wide probability of
                                  failures, Equation (2.29). The model is
                                  applicable to homogeneous groups of pipes in
                                  terms of diameter, length, etc., and gives the
                                  probability of observing specified
                                  characteristics on a segment of pipes that
                                  have failed
Goulter et al. (1993)             Assumed a non-homogeneous Poisson distribution
                                  for subsequent failures of an initial failure
                                  in a spatial and temporal cluster; Equations
                                  (2.30), (2.31) and (2.32)
Gustafson and Clancy (1999)       Described the use of a semi-Markov method for
                                  cast iron pipes. The time to first failure was
                                  modelled with a 3-parameter generalised gamma
                                  distribution and the subsequent failures with
                                  an exponential distribution
Jacobs and Karney (1994)          Equation (2.33); returns the reciprocal of the
                                  probability of a day with no breaks
Lee and Kim (2004)                Equation (2.34); estimates the probability of
                                  failure in terms of corrosion rate etc.,
                                  assuming that resistance and load are both
                                  normal and independent
Chapter 3
Data Description
3.1 Typical Failure Data in Water Distribution Systems
A common problem associated with failure time records of mature water
distribution systems is the lack of a complete failure history. In fact, the
water networks of most cities were established more than 100 years ago and
many of the early buried pipes are still in service, while their complete failure
records since the first years of their installation are not available. Thus,
decisions regarding management of the major parts of water distribution
systems are made in absence of a complete dataset.
Figure 3.1 depicts the typical data that is usually available in water
distribution networks. In this figure, the time window shows the period
of time for which failure records are available. The ‘×’ symbols on the
horizontal (time) axis denote occurrences of failures. Failures are likely to
have occurred prior to the starting point of the available failure data, but not
to have been recorded. Therefore, the left side of the available data is unknown
and is called left-censored data. Besides, given a fixed set of data, the right
side of the available failure data (spanning the period of time between the last
failure time and the next failure) may also be unknown, i.e. the data can also
be right-censored.
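To make the two kinds of censoring concrete, the following Python sketch splits the history of a single pipe into its observed and unobserved parts for a given time window. It is an illustration only: the function name, the argument names and the returned keys are hypothetical, and it assumes that the pipe was installed before the end of the window.

```python
from datetime import date


def censoring_summary(installed, window_start, window_end, failures):
    """Describe what a recording window reveals about one pipe.

    `failures` holds failure dates; any date outside the window is treated
    as unobserved and is dropped, mimicking a record that starts late.
    """
    start = max(installed, window_start)
    observed = sorted(f for f in failures if start <= f <= window_end)
    # With no observed failure, the whole window is right-censored.
    last_seen = observed[-1] if observed else start
    return {
        "observed_failures": observed,
        # Service time before the window, for which failures are unknown.
        "left_censored_days": max((window_start - installed).days, 0),
        # Time from the last observed failure to the end of the window;
        # the next failure lies somewhere beyond this span.
        "right_censored_days": (window_end - last_seen).days,
    }
```

For a pipe installed decades before the window opens, the left-censored span is much longer than the window itself, which is the situation described above for mature networks.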
In addition to these limitations, data from a water distribution system
might be subject to recording or typing errors and the like. Failure data
may also be incomplete. Records in hard copy may not be realistically
accessible. Reliance is therefore placed on digital records, which may cover
only a few years and represent only a small portion of the pipe history.
The presence of false data in a failure record can potentially cause deviation
in the results of data analysis. Therefore, false data should be identified and
eliminated from the database prior to analysis. Although the absence of
failure records for a short period of time does not cause considerable
inaccuracy, missing essential information in a failure record, such as the size
of a failed pipe, can decrease the credibility of the resulting models. Failure
records might also lack detail: for instance, information on pipe deterioration,
soil type, condition of bedding, etc., may be absent.
Figure 3.1: Failure data in water networks may be available only during
specific time windows and include left- and right-censored data outside the
window. (Time axis starting at the installation year; failures marked ‘×’
are recorded only within the time window.)
3.2 Contents of Database of This Study
The database of this study is a failure history of cast iron pipes belonging
to City West Water (CWW). CWW is a water retailing company in the western
suburbs of Melbourne. Melbourne Water, in its report (Water Main
Renewal Study 1991), noted that the western region of Melbourne was
experiencing a disproportionately high rate of failures. A burst rate three
times that of Melbourne’s other two water supply regions (which are
separate water retail companies) was reported. Indeed, between 1972 and
1990 the annual average water main failure rate throughout CWW’s cur-
rent licence area was approximately 1 failure/km/year, as compared with
0.3 to 0.5 failures/km/year for the other two networks (Water Main Renewal
Study 1991). Also, CWW’s “Water Reticulation Asset Status - Pipe Structural
Performance” report (Water Reticulation Asset Status Report 1997)
reported that in 1995/96, CWW had 3.1 times the break rate of South East
Water. In the same year, water industry benchmarks revealed that CWW
had the highest water main break rate in Australia (WSAA facts’99 1999).
Given this background, since 1999, CWW has recognised the need for
the failure analysis of water pipes. Accordingly, an investigation of cast
iron pipes and associated failures was conducted with a view to formulating a
strategy for cost-effective asset management in both the short and long terms
(Righetti 2001). That study resulted in some models for failure prediction
of those water mains, which are discussed in the literature review of
this thesis.
Cast iron pipes were chosen specifically because they comprise more than
half of CWW’s water mains and
contribute disproportionately to the number of failures and customer service
key performance indicators (KPIs) (Righetti 2001). This thesis reports the
failure analysis study as conducted on CWW’s cast iron pipes. However, it
is important to note that the developed techniques and approaches discussed
in this thesis are not exclusive to this data and can be tuned and applied
to other failure histories of other classes of pipes as well.
In-service water mains or pipes are subjected to continuous deleteri-
ous reactions and internal and external loads that undermine the intended
design factor of safety (FS). Consequently, the service life is significantly
reduced if existing stresses on structurally deteriorated pipes exceed the ex-
pected or admissible design loads or stresses. Pipe failure is defined as an
event in which the factor of safety falls below a critical value, FScr (usu-
ally set to 1), i.e., FS < FScr. Cast iron, a brittle material, typically fails
through fracture at strains of 0.5%. Thus, the fracture of brittle materials
such as cast iron is dictated by its ultimate strength.
In the database used in this study, a water main failure means a struc-
tural failure of the pipe that results in water visibly escaping to the en-
vironment. The failure does not mean a failure to meet defined regulated
customer service standards. It also does not include a failure of the pipe
to supply the water at the required quality, or general leakage of water
from unknown sources. At City West Water, these two latter issues are
either small or non-existent. It has been found at CWW that in many
areas, it is currently economically advantageous to reactively repair failure
spots as they occur, rather than undertake renewals to address or prevent
failures (Righetti 2001). However, in some instances a renewal is required
due to regulatory constraints on interruptions to supply, as well as critical
situations where single or repeat failures lead to high profile incidents and
social costs. Both the regulatory requirements and asset conditions are of
significant importance to CWW.
Traditionally, CWW used three categories to assess their assets:
• Frequently bursting pipe: Pipes that had four or more breaks in
  12 months or less. Due consideration was also given to the number of
  historical breaks recorded for the asset. These assets were considered
  at high risk of causing more than five unplanned interruptions per
  year to each customer.
• Brittle pipe: Pipes that had three breaks in 12 months or less and
  a historically high number of breaks. This criterion was developed in
  recognition that CWW had the highest number of breaks in
  Australia. This category was defined to target the pipes in a shut off
  block that fail regularly (three times per year) such that the overall
  contribution of each pipe combined to result in customers in the shut
  off block experiencing more than five unplanned interruptions. A shut
  off block is the zone of water supply that gets interrupted as a result of
  interruption to any of the water utilities in that area. Water mains
  were renewed under this criterion only where it was not viable to
  install additional valves to reduce the size of the shut off block. The
  reason for considering this category was to proactively renew assets
  that were most likely to become frequently bursting pipes in the near
  future.
• Critical assets: Assets that were located in areas where the
  consequences of a failure were high and it was suspected that the probability
  of a failure was also high. For instance, water mains more than 80
  years old that cross under tram tracks in the CBD are considered
  critical. High failure probability was determined with respect to
  performance or inferred, based on design life/age of asset.
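The first two categories are counting rules on failure dates and can be sketched in a few lines of Python. This is an illustrative reading of the definitions above, not CWW’s actual procedure: the 12-month period is approximated by 365 days, the threshold `high_history` for a “historically high number of breaks” is a hypothetical value that the source does not specify, and critical assets are left out because they depend on consequence information rather than on break counts.

```python
from datetime import date


def max_breaks_in_window(failure_dates, window_days=365):
    """Largest number of breaks whose dates span at most `window_days`."""
    dates = sorted(failure_dates)
    best, start = 0, 0
    for end, current in enumerate(dates):
        # Slide the left edge until the span fits inside the window.
        while (current - dates[start]).days > window_days:
            start += 1
        best = max(best, end - start + 1)
    return best


def categorise(failure_dates, high_history=6):
    """Apply the break-count rules; `high_history` is an assumed threshold."""
    peak = max_breaks_in_window(failure_dates)
    if peak >= 4:
        return "frequently bursting"
    if peak == 3 and len(failure_dates) >= high_history:
        return "brittle"
    return "uncategorised"
```

The sliding window makes the rule independent of calendar years, so four breaks between, say, October and the following March are still counted together.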
This categorisation did not prove to be sufficiently efficient and reliable
to be used to develop an effective asset management policy in CWW. In-
deed, this categorisation was insufficient for determining the yearly capital
expenditure assignment and renewal programming because, according to
this simplistic technique, any pipe with four or more failures should be re-
placed even if these failures did not cause an interruption to water supply
(e.g. circumferential failures can be repaired by applying a clamp). There-
fore, by employing this method, the link between the actual customer needs
and the projected likelihood of repeat interruptions is weak. In addition,
there is no financial evaluation of the cost to CWW versus the financial
benefits of this program. To address this deficiency, this study aims at de-
veloping a predictive approach toward assessment of classes of pipes in the
future. Having reliable estimates for the probability of failure occurrences
of the pipes in the future, the above mentioned definitions can be used as
criteria for making decisions about maintenance/replacement strategies for
those classes of water mains.
The data used in this study is a history of 6381 failures recorded during
1997–2000 in a Microsoft Access file. Each record describes the occurrence
of a pipe breakage event and comprises the pipe identification number, pipe
type (material), pipe size (diameter), pipe length, date of construction of
the pipe, failure type, and the date of failure. Environmental factors such
as soil type, climate, pressure zone and cause of failures are not available in
this database.
Significant data auditing was undertaken by CWW to assess the accuracy
of the records, and false records were eliminated from the data so that the
dataset can be used with a high confidence in its validity and accuracy.
The data used in this study was not prepared on the basis of a regular
inspection along the network. Rather, it consisted of digital records of
breakage occurrences over four years which is not a long history compared to
the age of this network. However, the data contained a sufficiently large number
of records for each group of pipes (with the same type and diameter) for
the analysis to be significant. This is mainly due to the existence of a small
number of pipe groups and also the high frequency of failures.
The majority of failures in the CWW area are experienced by cast iron pipes
which represent more than 50% of CWW assets (Righetti 2001). There were
two types of pipes in this database: cast iron (CI) and cast iron cement
lined (CICL) pipes. In the existing dataset, the acronym CICL referred
specifically to spun grey cast iron pipes with factory cement lining, and the
acronym CI referred specifically to pit cast, unlined pipe (that may have
been cement lined in-situ).
All pipes with diameters less than 300mm are referred to as reticulation
assets. Existing pipes of this dataset have diameters of 80mm, 100mm,
125mm, 150mm and 175mm. They were installed in the region between
1857 and 1985. The construction history of cast iron pipes in all areas
under City West Water licence, categorised by year of manufacture and
construction technique, is presented in Table 3.1. According to this table,
taken from Righetti (2001), poor construction techniques were used for
installing the pipes of this study and this has contributed to the high rate
of failures observed in the cast iron pipes of the area.
Table 3.1: Construction history of cast iron mains in City West Water
(Righetti 2001)

Construction date    Length of installed    Quality of construction
                     pipes (km)             technique
Prior to 1920        218                    poor
1920 to 1928         143                    poor
1929 to 1967         617                    poor
1968 to 1985         523                    improved with granular backfill

Based on the material and diameter, the pipes of the existing dataset
could be divided into six classes as listed in Table 3.2. Other classes (possible
combinations of pipe types and diameters) had too small a population (fewer
than 20 records) to be analysed by statistical modelling techniques and were
eliminated from this study.

Table 3.2: Statistics of the six classes of pipes in the dataset, selected for
reliability analysis and failure prediction in this study.

Class No.    Pipe material    Diameter (mm)    No. of breakages
1            CI               80               321
2            CI               100              922
3            CI               125              60
4            CI               150              105
5            CICL             100              3886
6            CICL             150              793
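The selection of classes described above amounts to grouping the failure records by material and diameter and discarding small groups. A minimal Python sketch follows; the record layout mirrors the fields listed for the database, but the field names, the assumption that pipe length is stored in metres, and the function name are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FailureRecord:
    pipe_id: int
    material: str        # "CI" or "CICL"
    diameter_mm: int
    length_m: float      # unit is an assumption; the source gives none
    constructed: date
    failure_type: str
    failed_on: date


def select_classes(records, min_records=20):
    """Count breakages per (material, diameter) class and keep the classes
    with at least `min_records` records."""
    counts = Counter((r.material, r.diameter_mm) for r in records)
    return {cls: n for cls, n in counts.items() if n >= min_records}
```

Applied to the full failure history, a rule of this kind would leave only the classes that are large enough for statistical modelling.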
3.3 Spatial Location of Pipes
The condition of water pipes is influenced by a number of factors. These
factors include the environmental conditions and structural characteristics
(e.g. pipe diameter, wall thickness, pipe material). External loads and
rainfall and soil characteristics that influence the failure rate of pipes are
generally similar for the pipes in a neighbourhood. In other words, spatial
clustering of water pipes of a network results in homogeneous classes of
pipes in terms of external deteriorating factors.
Environmental factors causing deterioration, especially soil characteris-
tics, and internal factors such as pressure fluctuations usually do not vary
considerably for the pipes across a postcode area. Thus, in order to eliminate
the variation of rainfall and soil characteristics of the water mains being
categorised into homogeneous groups, the pipes under this study were identified
in terms of their postcode.
3.3.1 Estimation of postcodes for given AMG coordinates
Available data did not include the location of all pipes. For most of the
pipes of the dataset, the Australian Map Grid coordinates (AMGX and AMGY)
were available. These coordinates were converted to longitude
and latitude values using a GIS tool. The spatial data transformation is
performed by a Perl script which has parameters passed to it via the forms
interface. The transformation algorithm makes use of the rigorous formulae of
Redfearn (1984) and The Australian Geodetic Datum Technical Manual,
Special Publication (1986). Depending on the precision of the entered
coordinates, this transformation computes results to the nearest 4mm (nominal)
on the ground.
In the next step, a package of data, provided by “Australia Post” was
used to estimate the corresponding postcode of each point using its geo-
graphical longitude and latitude. The package included a Microsoft Excel
database containing the coordinates and corresponding postcodes across
Victoria. A MATLAB program was written to read the pipe longitude
and latitude values from the failure history (calculated for the pipes with
given AMG coordinates), find the closest coordinates in the Australia Post
postcode database, and record the corresponding postcodes of all pipes.
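The nearest-coordinate lookup can be sketched as follows. This is an illustrative Python version rather than the MATLAB program used in the study. The text only states that the closest coordinates were found, so the great-circle (haversine) distance used here is an assumed choice of metric, and the function names and the layout of the postcode table are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_postcode(lat, lon, table):
    """Return the postcode whose reference point is closest to (lat, lon).

    `table` is an iterable of (postcode, latitude, longitude) rows.
    """
    closest = min(table, key=lambda row: haversine_km(lat, lon, row[1], row[2]))
    return closest[0]
```

A brute-force search of this kind is linear in the size of the postcode table for every pipe, which is acceptable for a dataset of a few thousand failure records.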
3.3.2 Estimation of postcode for pipes with no spatial data
For the small number of pipes in the failure history without any geographical
information, a “linear interpolation” technique was used to estimate
their AMG coordinates. The process of calculating unknown values
from known values when a constant rate of change is assumed is called
linear interpolation (Watson and Duff 1997). The main attribute of this
method is that it is stable and easy to compute. The method works by
effectively drawing a straight line between two neighbouring samples and
returning the appropriate point along that line. For example, let η be a number
between 0 and 1 which represents how far one wants to interpolate a signal
y between the times n and n + 1. Then the linearly interpolated value
y(n + η) can be defined as follows:
y(n + η) = (1 − η) y(n) + η y(n + 1)    (3.1)
Using this method to estimate the AMG coordinates of pipes between
known locations implies the assumption that the pipes’ unique IDs were assigned
in spatial order. In Figures 3.2 and 3.3, the X and Y coordinates of the
pipes with known AMG values are plotted. It is observed that the AMGX and
AMGY values vary almost linearly with the unique IDs. Thus, the missing
AMGX and AMGY coordinates of pipes with failure records in the dataset
can be calculated as follows. First, all failure records are sorted according to
the unique ID numbers of the pipes. Consider a pipe with unique ID number
UID1 with missing AMG coordinates. If the unique IDs of the two pipes
on either side of this pipe in the sorted list are UID0 and UID2, such
that UID0 ≤ UID1 ≤ UID2, then the missing AMG coordinates
of the pipe UID1 are given by Equation (3.1), in which y(n) and y(n + 1)
are the corresponding AMG coordinates of the two pipes UID0 and UID2 and η =
(UID1 − UID0)/(UID2 − UID0).
The failure records were first reordered by sorting the pipe unique IDs.
For any pipe with no AMG coordinates, those values were then estimated
by linear interpolation of the AMG coordinates of the closest right and left
neighbouring data records in the list. A MATLAB program was developed
to find the closest records and implement the linear interpolation for each
pipe with unknown AMG coordinates. Having the AMG coordinates, a
similar procedure was used to obtain the postcode for each of those pipes.
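The interpolation step can be sketched in Python as below. It applies Equation (3.1) with η = (UID1 − UID0)/(UID2 − UID0) to both coordinates. This is an illustration and not the MATLAB program itself: the function name and the tuple layout are hypothetical, and the sketch assumes that the list of known pipes is already sorted by unique ID.

```python
from bisect import bisect_left


def interpolate_amg(uid, known):
    """Estimate (AMGX, AMGY) for pipe `uid` from its neighbours by unique ID.

    `known` is a list of (uid, amgx, amgy) tuples sorted by uid, containing
    only pipes whose coordinates are available.
    """
    uids = [row[0] for row in known]
    i = bisect_left(uids, uid)
    if i < len(uids) and uids[i] == uid:
        return known[i][1], known[i][2]  # coordinates already known
    if i == 0 or i == len(uids):
        raise ValueError("uid lies outside the range of pipes with known coordinates")
    uid0, x0, y0 = known[i - 1]  # closest neighbour on the left
    uid2, x2, y2 = known[i]      # closest neighbour on the right
    eta = (uid - uid0) / (uid2 - uid0)
    return (1 - eta) * x0 + eta * x2, (1 - eta) * y0 + eta * y2
```

Pipes whose unique ID falls below the smallest or above the largest known ID cannot be interpolated and are reported as an error here; how such cases were treated in the study is not stated.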
3.3.3 Distribution of failures in different postcodes
Using the information on location of pipes in the failure history, a study
of the distribution of failures across different postcode areas is useful for
classifying the pipe breaks into almost homogeneous groups regarding the
environmental variables such as soil characteristics and rainfall.
Figures 3.4 and 3.5 show the rate of failure events in each postcode of
the CWW licence area. These figures show that the failure rates vary from
postcode to postcode. Some postcodes have experienced a markedly large
number of failures, while in some, the failure rate was quite low during that
period. This observation is compatible with the conclusions of the earlier
work by Goulter and Kazemi (1988) on the existence of a strong spatial
clustering effect in the occurrence of subsequent failures.
Figure 3.2: Plot of AMGX coordinates versus the pipes’ unique IDs as evidence
for the credibility of using linear interpolation to estimate the missing
AMGX coordinates.
It is important to note that because of the wide variation of pipe lengths,
the highest failure frequency (the number of breaks occurring per year,
regardless of pipe lengths) does not necessarily occur in the same area as
the largest failure rate per km. For example, according to the bar plots
in Figures 3.4 and 3.5, the pipes in the area with postcode 3061 had the
largest failure rate (per km). But in terms of the total number of failures
that occurred, the postcode area 3021 was identified as the worst region, with
686 failures recorded for CICL pipes alone in this region. In order to reduce the
small-sample bias of the reliability estimates and failure predictions given by
the statistical techniques presented in the later chapters of this thesis, those
techniques are tested on the failure records in the area with postcode
3021 (which has the largest population of failure records in the database).
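The difference between failure frequency and failure rate per km can be illustrated with a small Python sketch. The function and argument names are hypothetical, and the sketch assumes that the total length of pipe in each postcode area is known separately, since this quantity is not part of the failure records themselves.

```python
def failure_summary(failure_counts, network_km, years):
    """Per area: breaks per year (frequency) and breaks per km of pipe (rate).

    failure_counts: {area: number of breaks recorded over the period}
    network_km:     {area: total length of pipe in km}
    years:          length of the recording period in years
    """
    summary = {}
    for area, breaks in failure_counts.items():
        summary[area] = {
            "per_year": breaks / years,
            "per_km": breaks / network_km[area],
        }
    return summary
```

With made-up numbers, an area with 100 breaks on 200 km of pipe has the higher frequency, while an area with 10 breaks on 2 km of pipe has the higher rate per km, which is the kind of divergence noted above.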
3.4 Adding the Rainfall Information to the Data
Rainfall is also a determining factor in the failure process of pipes, espe-
cially in expansive soil bedding. Since the geographic area under licence of
City West Water has expansive clay, pipes were regarded as being affected
by shrinkage and expansion of surrounding soil. The severity of these en-
vironmental stresses is influenced by the rainfall profile of the area. Thus,
in order to consider the effect of this environmental factor that impacts the
mechanism of pipe failures, the rainfall history of the area under study during
the period of data collection was also needed.
Figure 3.3: Plot of AMGY coordinates versus the pipes’ unique IDs as evidence
for the credibility of using linear interpolation to estimate the missing
AMGY coordinates.
[Figure 3.4: Failure rates in each of the postcodes 3000–3039 in the region under study (average number of breaks per km during 1997–2000). Bar plot; horizontal axis: postcodes; vertical axis: failure rates (per km).]
Monthly rainfall records for 1997–2000 were obtained from a climate station in the region under study. City West Water's boundaries contain the local government areas of Brimbank, Hobsons Bay, Maribyrnong, Melbourne (north of the Yarra River), Moonee Valley, Wyndham, Yarra and parts of Melton and Hume. The licence area of CWW is shown in Figure 3.6. Keilor station was chosen for its location in the middle of this region. Keilor (37°42′36′′ S and 144°49′44.4′′ E) is located in postcode 3036 of Victoria, partly in the City of Brimbank and partly in the City of Hume.
[Figure 3.5: Failure rates in each of the postcodes 3040–5277 in the region under study (average number of breaks per km during 1997–2000). Bar plot; horizontal axis: postcodes; vertical axis: failure rates (per km).]
Figure 3.6: Geographical map of the licence area of City West Water.
The rainfall dataset was a Microsoft Excel file containing monthly rainfall
in millimetres. To compare the rainfall profiles of the same seasons in
different years, a histogram of the seasonal rainfall for the
years 1997–2000 is plotted in Figure 3.7. The summer of 1997 is
observed to have been considerably drier than subsequent summers, and the
spring of 2000 was wet compared with the other springs. Such
observations, along with the monthly variations of rainfall (as plotted in
Figure 3.8), are studied later in Chapter 5 in conjunction with the breakage
[Figure 3.7: Histograms of quarterly records of rainfall of the region in 1997–2000. Horizontal axis: season (summer, autumn, winter, spring), one bar per year; vertical axis: rainfall (mm).]
[Figure 3.8: Histograms of monthly records of rainfall of the region in 1997–2000. Horizontal axis: month (January to December), one bar per year; vertical axis: rainfall (mm).]
history of pipes to investigate the non-stationary nature of failure occur-
rences as a random process.
3.5 Failure Times
The available failure history comprises the dates of failures that occurred
during 1997-2000. Different approaches in failure analysis use different no-
tations. Some statistical analysis methods (for example the probabilistic
[Figure 3.9: Failure times (T_i) and inter-failure times (IFT_i) of a class of pipes, marked by crosses on a time axis starting at 0.]
technique developed in Chapter 6) use the inter-failure times rather than
failure times. Regardless of the notation, the sequence of failure times
and the sequence of inter-failure times represent the same information about
the failure history.
A graphical description of the failure history, starting from time t = 0, is shown in Figure 3.9. Each cross symbol corresponds to a failure time (T_i) of a class of pipes, where T_i is the actual time of the i-th failure occurrence. Each inter-failure time is the time elapsed between two consecutive failures. The inter-failure times are denoted by IFT_1, IFT_2, ..., where IFT_i = T_i − T_{i−1} for i = 1, 2, ..., with T_0 = 0.
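As an illustration of why the two notations carry the same information, the following Python sketch converts a sequence of failure times into inter-failure times and back. The function names and the sample values are hypothetical and are not taken from the CWW database.

```python
from typing import List, Sequence


def inter_failure_times(failure_times: Sequence[float]) -> List[float]:
    """Return IFT_i = T_i - T_(i-1) for i = 1..N, with T_0 = 0."""
    previous = 0.0
    result = []
    for t in sorted(failure_times):
        result.append(t - previous)
        previous = t
    return result


def failure_times(ifts: Sequence[float]) -> List[float]:
    """Recover T_i as the running sum of the inter-failure times."""
    total = 0.0
    result = []
    for gap in ifts:
        total += gap
        result.append(total)
    return result


# Hypothetical failure times of one class of pipes (arbitrary time units).
sample_times = [3.0, 7.5, 12.0]
sample_ifts = inter_failure_times(sample_times)
recovered = failure_times(sample_ifts)
```

Converting in either direction loses nothing, which is why an analysis may use whichever sequence suits its formulation.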
For the purpose of statistical analysis of water networks, it is assumed
that the water network is repaired immediately after occurrence of a failure.
This implies that the repair times are negligible compared to the failure and
inter-failure times, which is a reasonable assumption for water networks.
3.6 Summary
The first section of this chapter describes the limitations of a typical database that is usually available for water distribution networks. Generally, the time window of the recorded failure history of old pipes does not cover their total life span. This means that any analysis based on such databases carries a level of uncertainty as a result of left-censored data. This is in addition to recording errors, which are inevitable in such databases and should be minimised using a data integrity check.
The database that is used in this thesis is described in detail in the next chapter. This database is provided by City West Water PTY LTD and comprises a four-year failure history of their cast iron water pipes. The definition of failure in this database and the other terminology used, as well as a description of the contents of this failure history, are presented. The failure history contains the material, size, construction date, length, and failure dates of pipes. However, the spatial location of these pipes, an important factor in the condition of pipes, was missing from the available information. The attempts to recover this information and to attribute a postcode to each pipe are explained in detail. A study of the distribution of failures across postcode areas is used for classifying the data into almost homogeneous groups with regard to environmental factors such as soil characteristics and rainfall. Rainfall is a determining factor in the breakage of pipes in the expansive soils that cover most of the CWW area. Obtaining the rainfall information and adding it to the data is discussed. The discussion of data specification is concluded with a description of the inter-failure times that are used in Chapter 5. However, the statistical analysis model developed in Chapter 3, which follows a different approach, adopts a different notation and uses the failure times instead.
Chapter 4
Intelligent Reliability Analysis of
Water Pipes Using Artificial
Neural Networks
4.1 Introduction
During their lifetimes, water mains, as the most essential and high main-
tenance components of water distribution systems, are constantly exposed
to a vast range of deleterious influences. Consequently, their design factors
of safety may significantly degrade with time, leading to structural failures.
The conditions causing failures in water mains were discussed earlier in
Chapter 2.
It was also noted previously that prediction of the performance of pipes
in the future is essential for developing proper strategies for maintenance
or replacement of water distribution systems. This is a challenging task as
water mains are buried in the ground and regular monitoring of the state
of each individual pipe is not feasible.
Studies by other researchers, conducted to predict the performance of
pipes, were reviewed in Chapter 2. Existing mathematical models were
explained and their strengths and limitations were discussed. The com-
parative literature review in Chapter 2 concluded that the best option for
water distribution authorities is to perform statistical analysis on the profile
of their water mains. This type of analysis helps managers to estimate
the future performance of the pipes.
This chapter introduces a method in which an artificial neural network (ANN) is used to develop probabilistic models for the likelihood of failures in water mains. In this study, the future performance of water mains is studied in the context of reliability analysis of the water pipes with failures recorded in the available dataset described in Chapter 3.
4.2 Reliability Analysis: Principles and Definitions
According to ISO 8402, the reliability of a system is defined as the ability of
the system to perform a required function, under given environmental and
operational conditions, for a stated period of time (Hoyland and Rausand
1994). Reliability of a component is commonly defined similarly, namely,
the likelihood of its proper operation for a given period of time in the future.
In a general sense, reliability analysis involves mathematical modelling of the likelihood of failure events, to estimate the performance of a system or a component at a certain time. Models are used in failure/reliability analysis of a system to aid the comprehension of the future behaviour of the system or its components. Reasonable estimation of component performance is essential for preparing an optimum asset management strategy for the system. This way, unreliable components are recognised and actions can be taken to mitigate the adverse impact of component failures on the cost and/or quality of system service.
4.2.1 Reliability of water distribution systems
In water distribution systems, reliability means the ability to deliver design
flows under a wide range of conditions (Goulter 1987). Estimating the
reliability with this definition is an overriding challenge in the management
of this sector.
Reliability of a water distribution system or its components can be studied from several perspectives, which are quite different in their primary causes. Shamir and Howard (1985) proposed a number of approaches for reliability analysis in the water industry. However, those approaches were developed for the quality of water.
For reliability assessment of water distribution systems, Kettler and
Goulter (1985b) analysed the probability of failure of major water sup-
ply paths, while Goulter and Coals (1986) studied the probability of node
isolation. Both approaches used linear programming through constraints
restricting the average number of breaks per year permitted in each link.
The relevant probabilities of interest were also calculated using the average
failure rates in each link. In an examination of the hydraulic reliability
of distribution systems, Cullinane (1986) presented concepts of mechanical
reliability and availability as quantitative measures of system reliability.
Mays et al. (1986) used a cut-set approach for modelling the reliability of the network. The proposed procedure showed how a failure definition can be directly included in an optimisation design model. In their summary, Mays et al. (1986) noted that the study of the reliability of water distribution systems is severely hampered by the lack of an accepted definition of, and measures for, reliability. Currently, there is no universally accepted definition or measure for the reliability of water distribution systems in terms of appropriate and valid criteria for reliability and quantification of these reliability measures.
4.3 Objectives of the Proposed Reliability Analysis
This study is conducted particularly to estimate the state of reliability/failure
of the water mains using the dataset described in Chapter 3. The limita-
tions mentioned for this data are common for water distribution systems. In
the reliability study presented in this chapter, the first step is to introduce
proper reliability measures. In order to define reliability measures serving
the purpose of this study, it should be noted that delivering a constant and
satisfying service to the customers (under economic considerations) is the
major focus of water distribution systems. Failures of pipes, as major com-
ponents of water distribution systems, can diminish the ability of networks
to perform up to required specifications.
In order to quantify the quality of service, governmental and/or local
regulations may specify a maximum number of interruptions to the service
of water distribution systems. As a consequence, an ongoing problem, trou-
bling water supply managers, is to control the number of upcoming pipe
breakages. It is very difficult for water companies, if not impossible, to
absolutely guarantee that unplanned interruptions do not exceed the limit.
Reliability analysis, however, can give good indications of which pipes are at risk of exceeding the limit. In this way, the probability of not exceeding the acceptable number of unplanned failures can be estimated. In other words, the risks can be quantified and appropriate actions taken, such as replacing the assets that are in critical condition.
To achieve the above-mentioned objective, the reliability measure targeted for estimation in this study is the probability of a homogeneous class of components working properly (with no failure) up to a certain time in the future. Outcomes of this kind of analysis are useful in developing reliable strategies for maintenance/replacement of water pipes.
After establishing the scope of the study by setting suitable targets, the study documented in this chapter improves on the existing probabilistic techniques for water distribution systems by introducing a new reliability estimation technique based on ANNs. The probabilistic models developed using this technique are evaluated, and their performance is compared with that of some existing probabilistic techniques by applying them to the same dataset.
4.4 Structure of the Proposed Reliability Model
A diagram of reliability estimation system for pipes of a water distribution
network is presented in Figure 4.1. As shown in this figure, the ultimate
estimation system is supposed to accept some structural characteristics of
the pipes, and a time in the future in which the reliability of pipes is ques-
tioned. After processing the inputs, the estimation system is expected to
return the reliability of the pipe at that certain time. The reliability of each
pipe at a given time in future (the probability that pipe will not fail up to
that time) can be extracted from the reliability model of the homogeneous
class that the pipe belongs to.
In this system, construction and failure dates, pipe material and diameter are the characteristics of the pipes used as inputs by the estimator. First, the pipes in the existing failure data should be classified, based on their similar characteristics, into homogeneous classes of suitable populations. In fact, the available data dictates the inputs to the estimator. In this study, pipes are classified by material and diameter. However, if more details become available about the physical characteristics of each pipe and/or system characteristics such as pressure zone, soil characteristics, and the like, the estimation system can be tuned to consider more inputs. If there is a sufficient number of failure records for each group, this will result in more accurate models.
One of the inputs is the assessment date (expected date) until which the
[Figure 4.1: Schematic diagram of a survival function estimator for a water pipe with given type (material), diameter and construction date. The estimator gives the pipe reliability to survive until a given assessment date in the future. Inputs: pipe diameter, pipe type, construction date, assessment date; output: reliability of the given pipe to survive until the assessment date.]
water pipes are studied for their possible failures. The output of the model
is the reliability of pipes in the given class, i.e., the likelihood that the pipes in
that class work properly until the given assessment date.
In this study, a new design is devised for the content of the black box
depicted in Figure 4.1. For this purpose, an artificial neural network is
trained and evaluated. A portion of failure data is used for training the
neural estimation system, and the remainder of it is used for evaluation
of the resulting estimator. Furthermore, in this chapter, two well-known lifetime models, based on Weibull and lognormal lifetime distributions, are examined using the same data. These models serve as alternative baseline models for realising the black box of Figure 4.1 and as benchmarks for assessing the proposed intelligent lifetime model, by comparing its ability to predict pipe reliabilities with the predictions made by the Weibull and lognormal models.
4.5 Empirical Estimation of Survival Functions
The probabilistic technique developed in this study is a method for survival
analysis of water pipes. Survival analysis has been used to predict pipe
breakage behaviour by many researchers in the past two decades. Liter-
ature of this type of analysis was reviewed under probabilistic models in
Chapter 2.
The reliability of a pipe at a time in the future (also called the expected date in this thesis) can be expressed by a survival function, denoted by S(t) and defined as:
$$S(t) = \Pr(T_{\mathrm{FAIL}} \geq t) \tag{4.1}$$
where t is the pipe's age at the assessment date (the time interval between the construction date and the assessment date) and $T_{\mathrm{FAIL}}$ is the time interval between the construction date and the next failure date. The proposition $T_{\mathrm{FAIL}} \geq t$ is equivalent to the event of “no failure before the assessment date”.
The data used in this study includes the failure history of water pipes
as described in Chapter 3. The first step in the reliability analysis of these
pipes is to convert the database to an operable format, stratified by material
and diameter of the pipes. The details of simulation studies and application
of the proposed models to the available data are presented in Section 4.8.
After classifying the pipes, the reliability values of the pipes should be empirically calculated. These empirical values are used for tuning the parameters of the resulting models. Given the failure times for pipes of a specific class in the database, the reliability of that class at an age t can be empirically calculated as follows:
$$S_{\mathrm{EMP}}(t) = \frac{\text{number of failures in the class that occurred at ages over } t}{\text{total number of failures in the class in the history}} \tag{4.2}$$
where $S_{\mathrm{EMP}}(t)$ is the reliability of that class at age t.
Let $t_1, t_2, \ldots, t_n$ be the set of pipe ages at the failure times recorded in the history of a particular class of pipes. The set is first sorted in ascending order to $t_{(1)}, t_{(2)}, \ldots, t_{(n)}$. Then, the general expression of Equation (4.2) can be refined to Equation (4.3), which gives the empirical values of the survival function at the pipe ages corresponding to the failure events and at their right neighbourhood points:
$$S_{\mathrm{EMP}}(t_{(i)}) = 1 - \frac{i-1}{n}; \qquad S_{\mathrm{EMP}}(t_{(i)} + \varepsilon) = 1 - \frac{i}{n} \tag{4.3}$$
where $\varepsilon$ is an infinitesimal quantity and $t_{(i)} + \varepsilon$ denotes the right neighbourhood of $t_{(i)}$, as shown in Figure 4.2.
Equation (4.3) defines a non-increasing staircase function whose steps occur at the observed failure ages. This method therefore returns discrete estimates. However, both the benchmark probabilistic models and the intelligent ANN model return continuous survival functions. Therefore, the staircase curve is approximated by a continuous curve that passes through the mid-points of the discontinuities, as shown in Figure 4.2.
[Figure 4.2: The step-wise empirical survival function (solid) and its continuous approximation (dashed). Horizontal axis: t, with the failure ages t(1), t(2), t(3), t(4) marked; vertical axis: S(t), with the levels 1, 1−1/n, 1−2/n, 1−3/n, 1−4/n and 0 marked.]
The reference values of the empirical survival function at each failure age $t_{(i)}$ are given by Equation (4.4):
$$S_{\mathrm{EMP}}(t_{(i)}) = \frac{\left(1 - \frac{i-1}{n}\right) + \left(1 - \frac{i}{n}\right)}{2} = 1 - \frac{i - \frac{1}{2}}{n}. \tag{4.4}$$
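The mid-point rule of Equation (4.4) can be sketched in a few lines of Python. This is only an illustration: the function name and the sample ages are hypothetical, and tied failure ages are simply given consecutive ranks, which the text does not discuss.

```python
from typing import List, Sequence, Tuple


def empirical_survival(ages_at_failure: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (t_(i), S_EMP(t_(i))) pairs following Equation (4.4).

    The ages are sorted in ascending order and the i-th smallest age
    receives the mid-point value 1 - (i - 1/2) / n.
    """
    ordered = sorted(ages_at_failure)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one failure age is required")
    return [(t, 1.0 - (i - 0.5) / n) for i, t in enumerate(ordered, start=1)]


# Hypothetical pipe ages (in years) at the recorded failures of one class.
reference_points = empirical_survival([40, 10, 30, 20])
# With n = 4 the reference values are 0.875, 0.625, 0.375 and 0.125.
```

These pairs are the reference points to which a continuous survival model for the class would be fitted.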
Up to this point, Equation (4.4) has been presented for calculating empirical measures of the reliability of pipes at a given time. The concern of the study at this stage is to develop exclusive models for each class of pipes that enable managers to estimate the reliability of pipes at a given time in the future. The estimation system depicted in Figure 4.1 comprises a number of models, each serving a particular class of pipes.
Since the Weibull and lognormal lifetime distributions are found to be
the most commonly used models in literature of reliability analysis of differ-
ent systems, these models are examined for performance comparison with
the proposed intelligent model. Therefore, before the proposed intelligent
reliability modelling method is described, the mathematical formulations of
the Weibull and lognormal models are briefly reviewed in the next section.
4.6 Weibull and Lognormal Lifetime Models
4.6.1 Weibull lifetime distribution
The Weibull distribution is among the most widely used distributions for life-data analysis. It is also a well-known statistical model in reliability engineering and failure analysis. Due to its flexibility, this model can mimic the behaviour of other statistical distributions such as the normal and the exponential distributions.
In his paper “A Statistical Distribution Function of Wide Applicability”, Weibull (1951), who was studying metallurgical failures, pointed out that normal distributions are not applicable for characterising initial metallurgical strengths. He then introduced the Weibull distribution and reported its successful application to seven case studies. This continuous probability distribution has been used repeatedly to provide a reasonable lifetime model for many types of components (Crowder et al. 1994). It has mainly been used to model failures caused by fatigue, corrosion, mechanical abrasion, diffusion, and other degradation processes.
A number of researchers (e.g. Eisenbeis (1999), Eisenbeis (1997), Lei and Sægrov (1998) and Le Gat (1999)) have used the Weibull model in failure analysis of water pipes.
In this study, a two-parameter Weibull model is used for survival function analysis. The two-parameter Weibull cumulative distribution function (CDF) is given by:
$$F(t) = 1 - e^{-(t/\eta)^{\beta}} \tag{4.5}$$
where t ≥ 0 is a given component age (corresponding to an assessment date), β is the shape parameter, η is the scale parameter, and F(t) is the CDF, i.e. the probability of a failure occurring before the assessment date, which is the complement of the survival function: F(t) = 1 − S(t).
The Weibull plot is a graphical tool for determining whether a dataset comes from a population that can be fitted by a two-parameter Weibull distribution. Usually, the Weibull model is fitted to a dataset by linear regression of $Y = \log(-\log(1 - F(t)))$, which is empirically given by $Y = \log(-\log(1 - (i - 0.5)/n))$, versus $X = \log(t_{(i)})$ for i = 1, 2, ..., n. The correlation coefficient of X and Y, $\rho_{X,Y}$, is a fitness indicator. When $\rho_{X,Y}$ reaches a minimum threshold, the Weibull distribution is considered acceptable for the data, and the model parameters are calculated by linear regression assuming the following linear relationship (derived by rearranging Equation (4.5) as $1 - F(t) = e^{-(t/\eta)^{\beta}}$ and taking logarithms twice on both sides):
$$Y = \beta\,(X - \log(\eta)). \tag{4.6}$$
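The regression described above can be sketched as follows. This is a minimal illustration using ordinary least squares written out with the Python standard library; the function names are hypothetical, the ages are assumed to be strictly positive, and the threshold applied to the correlation coefficient is left to the user because its value is not stated here.

```python
import math
from typing import Sequence, Tuple


def linear_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least squares of y on x: (slope, intercept, correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)


def fit_weibull(ages: Sequence[float]) -> Tuple[float, float, float]:
    """Estimate (beta, eta, rho) from failure ages via Equation (4.6)."""
    ordered = sorted(ages)
    n = len(ordered)
    xs = [math.log(t) for t in ordered]
    ys = [math.log(-math.log(1.0 - (i - 0.5) / n)) for i in range(1, n + 1)]
    slope, intercept, rho = linear_fit(xs, ys)
    beta = slope                          # Y = beta * X - beta * log(eta)
    eta = math.exp(-intercept / slope)    # hence intercept = -beta * log(eta)
    return beta, eta, rho


def weibull_survival(t: float, beta: float, eta: float) -> float:
    """S(t) = 1 - F(t) = exp(-(t / eta) ** beta), from Equation (4.5)."""
    return math.exp(-((t / eta) ** beta))
```

A caller would compare the returned correlation coefficient with a chosen threshold before accepting the fitted shape and scale parameters.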
4.6.2 Lognormal lifetime distribution
The lognormal distribution is another flexible model that can empirically
fit to many types of failure data. Lognormal distributions are encoun-
tered frequently in metal fatigue testing, maintenance data (time to re-
pair), chemical-process equipment failures and repairs, crack propagation,
and loading variables in probabilistic design.
A lognormal distribution arises when the time to failure or repair results from cumulative contributing factors. This property can be observed in several deterioration processes associated with fatigue and creep mechanisms. Deterioration in such cases is generally progressive. For example, a crack grows rapidly under high stress because the stress increases progressively as the crack grows. Indeed, in many situations, failure or repair times depend on several factors that are random in nature. In such cases, the multiplicative effect of these factors leads to a lognormal failure or repair distribution. Therefore, the lognormal model can be theoretically derived under assumptions matching many failure degradation processes (Bishop and Bloomfield 2003, Goldthwaitel 1976).
A theoretical justification for using lognormal distributions comes from
the Central Limit Theorem when the logarithm of lifetime is considered
the sum of a large number of small independent effects (Crowder et al.
1994). Applying the Central Limit Theorem to small additive errors in the
log domain and justifying a normal model is equivalent to justifying the
lognormal model in real time when a process moves towards failure based
on the cumulative effect of many small “multiplicative” shocks.
The CDF of the lognormal distribution is given as follows:
$$F(t) = \Phi\!\left(\frac{\log(t) - \mu}{\sigma}\right) \tag{4.7}$$
where t > 0, µ (scale parameter) and σ (shape parameter) are the mean and standard deviation of the log-lifetimes, and Φ is the CDF of the standard normal distribution N(0, 1). The survival function S(t) is again given by 1 − F(t), and by taking the inverse CDF ($\Phi^{-1}$) of both sides of Equation (4.7), the following equation is derived:
$$\Phi^{-1}(1 - S(t)) = \frac{\log(t) - \mu}{\sigma} \tag{4.8}$$
To fit a lognormal model to a dataset, linear regression is applied to the $Y = \Phi^{-1}((i - 0.5)/n)$ values versus the $X = \log(t_{(i)})$ values. The degree of linearity of the X–Y plot can be considered as a fitness indicator. If the correlation coefficient of X and Y is sufficiently large, the assumption of a lognormal distribution for the data is considered justified, and the estimates of the parameters µ and σ are computed either by applying linear regression with the following linear relationship assumed between the X and Y values:
$$Y = \frac{X - \mu}{\sigma} \tag{4.9}$$
or by computing the average and standard deviation of the log-lifetimes for µ and σ, respectively.
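Both estimation routes for the lognormal model can be sketched with the Python standard library, where `statistics.NormalDist` supplies Φ and its inverse. The function names are hypothetical, the ages are assumed to be strictly positive, and the use of the sample (rather than population) standard deviation in the second route is an assumption of this sketch.

```python
import math
from statistics import NormalDist, fmean, stdev
from typing import Sequence, Tuple

STANDARD_NORMAL = NormalDist()  # N(0, 1)


def fit_lognormal(ages: Sequence[float]) -> Tuple[float, float, float]:
    """Estimate (mu, sigma, rho) by regressing Y on X as in Equation (4.9)."""
    ordered = sorted(ages)
    n = len(ordered)
    xs = [math.log(t) for t in ordered]
    ys = [STANDARD_NORMAL.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # Y = X / sigma - mu / sigma
    intercept = mean_y - slope * mean_x
    sigma = 1.0 / slope
    mu = -intercept * sigma
    return mu, sigma, sxy / math.sqrt(sxx * syy)


def fit_lognormal_moments(ages: Sequence[float]) -> Tuple[float, float]:
    """Alternative route: mean and standard deviation of the log-lifetimes."""
    logs = [math.log(t) for t in ages]
    return fmean(logs), stdev(logs)


def lognormal_survival(t: float, mu: float, sigma: float) -> float:
    """S(t) = 1 - Phi((log(t) - mu) / sigma), from Equations (4.7) and (4.8)."""
    return 1.0 - STANDARD_NORMAL.cdf((math.log(t) - mu) / sigma)
```

The two routes generally give slightly different estimates on real data, so it is worth recording which one was used when models are compared.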
4.7 Intelligent Reliability Prediction by Artificial Neural Networks
Artificial neural networks (ANNs) have been widely applied to solve different problems such as modelling, system identification, control, feature extraction, computer vision, software reliability analysis, metal and fracture analysis and the like. In recent years in particular, neural network (NN) analysis has been used in the reliability analysis of different mechanical and electronic systems (e.g. Lee et al. (1999), Bevilacqua et al. (2003), Carvalho et al. (1999), Moon et al. (1998)). An introduction to the history and basic types of ANNs is presented in Appendix A for interested readers.
ANNs have also been reported to be applied to predict water pipe failures
(Sacluti et al. 1999, Sacluti 1999). However, the neural network models
introduced in those papers are deterministic models which directly produce
the future failure rates (or number of failures). A distinction is made here
as the reliability estimation framework that is introduced in this chapter
uses an ANN to learn the pattern of survival function values for the water
pipes. More precisely, a probabilistic model is developed here which provides
survival probabilities for future given dates - survival function values as
defined in Equation (4.1).
This study proposes using an ANN to estimate the survival
functions of water mains. Artificial neural networks, known as
universal approximators (Hertz et al. 1991), are capable of generating models
that fit the empirical reliability values more accurately than existing
probabilistic models. In this section, a feed-forward perceptron with one
[Figure 4.3: Architecture of the proposed method of reliability analysis by a feed-forward perceptron. Block labels: pipe diameter, pipe type, construction date, assessment date, normalisation, survival function model; output: reliability of the given type of pipes to survive until the assessment date.]
hidden layer is proposed as an intelligent replacement for current models in
survival function modelling.
Multi-layer feed-forward perceptrons that include linear output units and a single layer of nonlinear hidden units have been theoretically proven to be able to represent most reasonable functions as closely as desired if they are trained using the back-propagation learning algorithm (Leshno et al. 1993). Hence, in this study, a feed-forward perceptron with one hidden layer is trained. The architecture of the proposed neural reliability estimator is illustrated in Figure 4.3.
It is well-known in ANN literature that a multi-layer perceptron will be
trained more rapidly and accurately if its inputs vary in bounded ranges
such as [0, 1] (Tarassenko 1998). Thus, all inputs to the neural network
(including the time lengths) are normalised to the range of [0, 1].
One of the inputs to the network is the time elapsed from the construction date of the pipe until the assessment date in the future (the pipe age at the assessment date). Pipe types and diameters were the other inputs to the network, and were encoded and normalised. The database of this study contained the failure history of pipes of two different types (materials) and five different sizes (diameters). Types and diameters of pipes were encoded as {0, 1} and {1, 2, ..., 5}, respectively.
The artificial neuron model used in this study is depicted in Figure 4.4.
The neuron computes the weighted sum of the input signals and applies the
weighted sum as an input to its activation function and returns the output of
the function. In the proposed design for the intelligent reliability estimation
technique shown in Figure 4.3, the activation function of all neurons is the
[Figure 4.4: Diagram of an artificial neuron model in a multi-layer feed-forward perceptron network. Each input X_i from a neuron in the previous layer is multiplied by the weight W_i of its synapse; the weighted inputs and a constant bias input (usually equal to 1) are summed and passed through the activation function, whose output goes to the neurons in the next layer.]
[Figure 4.5: Nonlinear profile of the Sigmoid function, the activation function of all neurons in the proposed ANN-based reliability model. Horizontal axis: x from −10 to 10; vertical axis: f(x) from 0 to 1.]
sigmoid function given below:
$$f(x) = \frac{1}{1 + e^{-x}}. \tag{4.10}$$
Figure 4.5 shows the nonlinear profile of the sigmoid function.
Since the output neuron of the network has a Sigmoid activation func-
tion, the single output of the neural network is bounded within [0, 1] which
is appropriate to model a probability value (survival function).
The extreme outputs (0 and 1) could occur only if the Sigmoid function saturates. This is undesirable, as it reduces the learning capability of the network. To prevent this situation, the proposed neural network is set to learn 0.9 times the empirical reliability values. Accordingly, the network output is divided by 0.9 to recover the reliability estimate, as shown in Figure 4.3.
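The forward computation of such an estimator can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in this study: the weights are random and untrained, the bias is treated as an extra weight on a constant input of 1, the diameter code is assumed to be scaled by 1/5, and the age is assumed to be scaled by a hypothetical `max_age`. All function names are hypothetical.

```python
import math
import random
from typing import List, Tuple

N_HIDDEN = 15        # number of hidden neurons reported in the text
OUTPUT_SCALE = 0.9   # the network learns 0.9 * S_EMP; outputs are divided by 0.9

Network = Tuple[List[List[float]], List[float]]


def sigmoid(x: float) -> float:
    """Activation function of Equation (4.10)."""
    return 1.0 / (1.0 + math.exp(-x))


def normalise_inputs(pipe_type: int, diameter_code: int,
                     age: float, max_age: float) -> List[float]:
    """Map the raw inputs to [0, 1]; the scaling constants are assumptions."""
    return [float(pipe_type), diameter_code / 5.0, age / max_age]


def init_network(n_inputs: int = 3, n_hidden: int = N_HIDDEN, seed: int = 0) -> Network:
    """Small random weights; the last weight of each neuron is its bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(network: Network, inputs: List[float]) -> float:
    """One hidden layer of sigmoid neurons followed by a sigmoid output neuron."""
    hidden, output = network
    extended = inputs + [1.0]  # constant bias input
    activations = [sigmoid(sum(w * x for w, x in zip(weights, extended)))
                   for weights in hidden]
    return sigmoid(sum(w * a for w, a in zip(output, activations + [1.0])))


def reliability(network: Network, pipe_type: int, diameter_code: int,
                age: float, max_age: float) -> float:
    """Undo the 0.9 target scaling to obtain the survival probability estimate."""
    y = forward(network, normalise_inputs(pipe_type, diameter_code, age, max_age))
    return y / OUTPUT_SCALE


untrained = init_network()
example = reliability(untrained, pipe_type=1, diameter_code=3, age=40.0, max_age=120.0)
```

Because the sigmoid output lies strictly between 0 and 1, the rescaled value can slightly exceed 1 when the network output is above 0.9; how such cases are handled is not specified here.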
Basically, the number of neurons in the hidden layer of the neural network (denoted by $n_h$) is a determining factor in its learning and generalisation capabilities. With a very small $n_h$, the network is not able to learn the empirical survival function accurately. On the other hand, if $n_h$ is very large, the generalisation power of the network diminishes because it fits too closely to all the random and non-smooth variations of the points in the reference curve (Kartalopoulos 1996). This trade-off should be balanced by trial and error. In this study, $n_h = 15$ neurons resulted in neural networks able to learn accurately the failure profile of all pipe classes with records in the existing failure history.
The learning ability of each neuron is improved by making small
adjustments to its weights to reduce the difference between the actual and
desired outputs of the network. The initial weights are small, randomly
assigned numbers, which are updated in each iteration to make the output
consistent with the training examples. If, at iteration p, the actual
output is Y(p) and the desired output is Y_d(p), then the error is given by:
\[ e(p) = Y_d(p) - Y(p). \tag{4.11} \]
Iteration p here refers to the p-th training example presented to the neural
network. If the error, e(p), is positive, the network output Y(p) should be
increased, but if it is negative, it should be reduced.
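To make the role of the error in Equation (4.11) concrete, the following Python sketch trains a single sigmoid neuron with a generic gradient (delta-rule) update. It is an assumption-laden illustration, not the back-propagation procedure of Appendix B: the learning rate, the number of epochs, the initial weight range and the toy data are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, inputs):
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(examples, learning_rate=0.5, epochs=2000, seed=0):
    """Train one sigmoid neuron on (inputs, desired_output) pairs."""
    rng = random.Random(seed)
    n_inputs = len(examples[0][0])
    # Small random initial weights, as described in the text.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        for inputs, desired in examples:       # one training example per iteration p
            actual = predict(weights, bias, inputs)
            error = desired - actual           # e(p) = Yd(p) - Y(p), Equation (4.11)
            # Error times the slope of the sigmoid at the current output.
            delta = error * actual * (1.0 - actual)
            # A positive error pushes the output up, a negative error pushes it down.
            weights = [w + learning_rate * delta * x for w, x in zip(weights, inputs)]
            bias += learning_rate * delta
    return weights, bias


# Hypothetical data: a survival-like value that falls as normalised age rises.
examples = [([0.0], 0.9), ([0.25], 0.7), ([0.5], 0.5), ([0.75], 0.3), ([1.0], 0.1)]
weights, bias = train_neuron(examples)
```

After training, `predict(weights, bias, [0.0])` should be clearly larger than `predict(weights, bias, [1.0])`, reflecting the decreasing pattern in the toy data.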
Complete details of the back-propagation learning algorithm, along with
step-by-step instructions for implementing the proposed intelligent model
with a given failure database, are given in Appendix B.
4.8 Simulation Results
This section presents comparative results of modelling and prediction of fail-
ure times using Weibull and lognormal models and the proposed ANN-based
reliability estimation method. The models have been applied to reconstruct
the failure profiles of the pipes with breaks recorded in the dataset explained
in Chapter 3.
The dataset is a pipe break database containing the history of pipe
failures that occurred during 1997–2000. There are two types of pipes in
the dataset:
CI (Cast Iron) pipes, which are encoded by “0”; and
CICL (Cast Iron Cement Lined) pipes, which are encoded by “1”
in the inputs of the ANN shown in Figure 4.3.
Based on the pipe types and diameters, the pipes in the existing data are
divided into six classes as listed in Table 3.2. Other possible
combinations of pipe types and diameters had fewer than 20 failure events
in the history. Such a small population, with a small value of n in
Equation (4.4), would result in low-precision survival probabilities. For
this reason, such classes are ignored in this simulation.
The modelling technique developed in this study returns a specific model
for reliability estimation of each particular class of pipes. For
comparison, Weibull and lognormal models are also fitted to each class of
pipes. Although both distribution models fit the data in this study, the
simulation results show that the ANN-based estimator resulted in
remarkably higher accuracy in survival function modelling compared to
those two models.
Two sets of data are required for developing the desired network: a
training set and a validation set. Each failure record in a dataset
comprises the available inputs and the desired output to be generated by
the fully trained neural network. The training data are used to tune the
synaptic weights of the network and optimise the system variables in
accordance with the data that are fed to it. Validation data are used to
evaluate the generalisation performance of the trained neural network
model and to examine whether it is capable of generating the expected
outputs for data records which it has not encountered during the training
process.
In this study, each failure record includes the following components:
construction date, pipe type, pipe diameter, failure date and the empirical
survival probability of the pipe at the failure date that is given to the neural
network as the assessment date during training and validation.
Figure 4.6: Empirical and modelled survival function plots versus the pipe
age in years (Assessment Date - Construction Date) for pipe class 1 (CI
type with 80 mm diameter).
[Axes: survival measures versus pipe age (years).]

For each of the six pipe classes, the available failure records are
alternately assigned to the training set and the validation set, so that
each set holds half of the data. This way of selecting the training and
validation data ensures that during training the neural network learns the
whole spectrum of the pipe ages recorded in failure events, and that during
validation the ANN is examined over the complete range of pipe ages
recorded in the dataset.
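The alternating selection of training and validation records can be sketched as follows. Sorting the records by pipe age before alternating is an assumption made here so that both halves span the full age range; the record fields and values are hypothetical.

```python
def alternate_split(records):
    """Assign every other record (ordered by pipe age) to training and validation."""
    ordered = sorted(records, key=lambda r: r["age_years"])  # assumed ordering
    training = ordered[0::2]     # 1st, 3rd, 5th, ... record
    validation = ordered[1::2]   # 2nd, 4th, 6th, ... record
    return training, validation


# Hypothetical failure records of one pipe class.
records = [{"age_years": a} for a in [31, 5, 58, 12, 44, 20]]
training, validation = alternate_split(records)
training_ages = [r["age_years"] for r in training]
validation_ages = [r["age_years"] for r in validation]
```

Because neighbouring ages go to different sets, each set covers roughly the same range of pipe ages as the full history.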
Figure 4.6 shows the empirical and estimated survival measures plotted
versus the pipe age (assessment date – construction date) for the first
class of pipes (80 mm CI). It is observed that the lognormal and Weibull
models, because they decrease smoothly, cannot closely follow the
empirical survival curve, while the ANN model learns the non-smooth,
stepped behaviour of the empirical plot. It should be noted that assessment
dates do not refer to the first ever failures of the pipes, but to the next
failure.
The superior performance of the proposed neural technique compared
to the classic methods is also observed in similar plots for the other five
classes of pipes. For irregular or highly nonlinear behaviour of this kind,
neural networks perform better than the Weibull and lognormal models.
Figures 4.7 and 4.8 show similar plots for classes 2 (100 mm CI) and 5
(100 mm CICL).
Figure 4.7: Empirical and modelled survival function plots versus pipe age
in years (Assessment Date - Construction Date) for pipe class 2 (CI type
with 100 mm diameter).
[Axes: survival measures versus pipe age (years).]

In order to compare quantitatively the performance of the proposed neural
network method with that of the Weibull and lognormal models, the Mean
Square Error (MSE) of the survival function estimates is defined and
computed for each method and for each class of pipes. The MSE for class i
is defined as:
\[ \mathrm{MSE}_i = \left[ \frac{1}{n_i} \sum_{k=1}^{n_i} \bigl( S_{\mathrm{EMP}}(t_k) - S(t_k) \bigr)^2 \right]^{1/2} \tag{4.12} \]
where S_EMP(t_k) is the empirical survival value at time t_k, S(t_k) is the
reliability estimate obtained from the model, n_i is the total number of
failure records in the history of pipes of class i, and t_k; k = 1, ..., n_i
denotes the set of failure times in the history. Because of the exponent
1/2, the quantity in Equation (4.12) is, strictly speaking, a
root-mean-square error.
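Equation (4.12) translates directly into code. The sketch below is a minimal Python version; the function name and the sample survival values are assumptions for illustration and are not taken from the CWW dataset.

```python
import math


def survival_error(empirical, modelled):
    """Error measure of Equation (4.12), including the exponent 1/2."""
    if not empirical or len(empirical) != len(modelled):
        raise ValueError("both sequences must be non-empty and of equal length")
    n = len(empirical)
    squared = sum((s_emp - s) ** 2 for s_emp, s in zip(empirical, modelled))
    return math.sqrt(squared / n)


# Hypothetical survival values at four failure times.
empirical = [1.0, 0.8, 0.6, 0.4]
modelled = [0.9, 0.8, 0.7, 0.4]
error = survival_error(empirical, modelled)   # sqrt((0.01 + 0 + 0.01 + 0) / 4)
```

The same function can be applied to the outputs of any of the three models, which is what makes the per-class comparison possible.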
The MSE values calculated for each class are presented in Table 4.1 for
the examined Weibull and lognormal models and the proposed neural method.
The simulation results indicate that the estimation error of the proposed
neural scheme is improved by 79% and 83% compared to the Weibull and
lognormal methods, respectively, for class 1 pipes, 82% and 77% for class 2
pipes, 50% and 60% for class 3 pipes, 33% and 46% for class 4 pipes, 88%
and 71% for class 5 pipes, and 60% and 47% for class 6 pipes. There does
not appear to be a clear advantage in using Weibull over lognormal as, on
average, both had the same level of accuracy.
Figure 4.8: Empirical and modelled survival function plots versus pipe age
in years (Assessment Date - Construction Date) for pipe class 5 (CICL type
with 100 mm diameter).
[Axes: survival measures versus pipe age (years).]
Table 4.1: Mean Square Error (MSE) of survival function estimation of
different pipe classes using Weibull, lognormal, and neural network models

Class No.  Class (type, diameter in mm)  Weibull  Lognormal  Neural Network
1          CI 80                         0.112    0.143      0.023
2          CI 100                        0.098    0.075      0.017
3          CI 125                        0.113    0.140      0.055
4          CI 150                        0.084    0.105      0.055
5          CICL 100                      0.066    0.027      0.007
6          CICL 150                      0.032    0.024      0.013
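The relative improvements and the average errors can be recomputed from the entries of Table 4.1, as in the Python sketch below. The dictionary layout and function name are assumptions. Since the table entries are rounded to three decimals, the percentages obtained this way may differ by a few points from those quoted in the text.

```python
# Values copied from Table 4.1 (rounded to three decimals in the source).
TABLE_4_1 = {
    "CI 80":    {"weibull": 0.112, "lognormal": 0.143, "ann": 0.023},
    "CI 100":   {"weibull": 0.098, "lognormal": 0.075, "ann": 0.017},
    "CI 125":   {"weibull": 0.113, "lognormal": 0.140, "ann": 0.055},
    "CI 150":   {"weibull": 0.084, "lognormal": 0.105, "ann": 0.055},
    "CICL 100": {"weibull": 0.066, "lognormal": 0.027, "ann": 0.007},
    "CICL 150": {"weibull": 0.032, "lognormal": 0.024, "ann": 0.013},
}


def relative_improvement(baseline, proposed):
    """Percentage reduction of the error relative to a baseline model."""
    return 100.0 * (baseline - proposed) / baseline


improvements = {
    name: {
        "vs_weibull": relative_improvement(row["weibull"], row["ann"]),
        "vs_lognormal": relative_improvement(row["lognormal"], row["ann"]),
    }
    for name, row in TABLE_4_1.items()
}

averages = {
    model: sum(row[model] for row in TABLE_4_1.values()) / len(TABLE_4_1)
    for model in ("weibull", "lognormal", "ann")
}

for name, values in improvements.items():
    print(f"{name}: {values['vs_weibull']:.1f}% vs Weibull, "
          f"{values['vs_lognormal']:.1f}% vs lognormal")
```

For class 1 the table values give an improvement of a little over 79% against the Weibull model, and the three column averages round to the figures reported in the conclusions of this chapter.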
4.9 Conclusions
The first part of this chapter provided a brief background on reliability
analysis in the field of water distribution. The technique is general and
the arguments behind this study are applicable to any pipe failure history.
However, failure histories that are maintained over longer periods of time
and contain more data points result in better reliability estimation
models. The proposed technique was applied to the failure history explained
earlier in Chapter 3.
Pipes of the network were classified into classes of similar material and
diameter. The focus was on providing a reliability estimation model for
each class. To obtain estimates that can aid the managers of water
distribution systems in decision making on maintenance/replacement
strategies, the reliability of pipes was estimated for homogeneous groups
of pipes. The reliability of a pipe is extracted from the reliability model
for the corresponding class. In other words, the model obtained for the
class to which the pipe belongs is the reference for the reliability
estimation.
Artificial neural networks, as universal estimators, have been used to
perform probabilistic modelling for this problem. Incomplete data and the
non-linearity of the failure pattern are limitations that reduce the
applicability of most mathematical models usually applied for modelling
purposes in this literature. ANNs are widely used in reliability analysis
of other systems, with a proven reputation in handling noisy and incomplete
data with nonlinear dynamics. The other advantage of using ANNs is that
they can be simulated using software programs, and the learning of
converting coordinates can be easily implemented. A brief background on
neural networks is presented in Appendix A.
In Section 4.7, a reliability model is developed using neural networks to
learn the pattern of variations of the survival values calculated from the
failure history. The estimator is trained to predict the future behaviour
of water pipes in distribution networks. The proposed estimator benefits
from the powerful learning potential of neural networks and their noise
tolerance. The inputs to the system were the construction dates, pipe
types, diameters of the pipes and the assessment date. The output was the
reliability of individual pipes, i.e. the likelihood of them working
properly until the assessment date.
Weibull and lognormal distributions, well known lifetime models in this
literature (Borror et al. 2003, Hariga 1996), were examined in this study
for survival modelling. Section 4.6 contains a background on application
of these classic mathematical distributions for the purpose of lifetime mod-
elling.
When properly chosen and applied within their range of validity, the
Weibull and lognormal models offer considerable insight into the lifetime
and reliability of products. These two models were also used to develop
reliability estimation models for the different classes of pipes. Based on
regression analysis, the use of these models was found to be acceptable.
The quantitative comparison of the performance of the estimation systems
developed using the proposed ANN-based technique, the Weibull model, and
the lognormal model, in terms of the mean square errors of estimation, is
presented in Table 4.1. For all six classes of pipes investigated, the ANN
model produced the highest level of accuracy compared to the Weibull and
lognormal estimates. On average, the mean square errors for the ANN model,
Weibull and lognormal were 0.03, 0.084 and 0.086 respectively.
A graphical comparison between the three above-mentioned estimation
systems is also available for each class of pipes separately.
Figures 4.6, 4.7, and 4.8 demonstrate this comparison for some of these
pipe classes. Both the quantitative and the graphical comparisons confirmed
the advantage of the proposed neural estimation system over the two other
examined techniques for the existing failure history.
The results show that the mean square errors of the reliability estimates
given by the proposed method (for different classes of pipes) are up to 83%
(at least 33%) smaller than the errors of the well-known lognormal and
Weibull distribution models.
In addition, there is no guarantee that the classic models will fit the
failure history in other, similar cases; that is, lognormal or Weibull
models may not always fit the failure data acceptably. This places further
emphasis on the advantage of the proposed neural network method and adds
weight to the generalisation power of neural network based modelling.
The proposed neural network approach, on the other hand, is potentially
capable of learning the behaviour of almost any non-linear pattern of failure
data and reconstructing the pattern of data for prediction purposes. The
proposed ANN-based reliability estimator benefits from the generalisation
power of neural networks and their ability to perform modelling without
knowing or assuming any underlying distribution, based on even censored
data.
In Appendix B, a step-by-step practical algorithm is presented, and
details on how to design the different layers of the ANN and train it by
error back-propagation are provided. A practical approach to prioritisation
of pipes for replacement/rehabilitation based on their predicted
reliability in future times is also provided.
It is important to note that the occurrence of a pipe failure is the result
of the multiplicative influence of a vast range of physical, environmental
and operational factors on the pipe. Some of these factors have rarely been
recorded by water distribution systems during the network service.
Furthermore, some of these determining factors, such as soil movement and
temperature, have irregular and complex variations and need to be modelled
separately. This complexity has motivated the author to concentrate
on the process of failure occurrences from another point of view.
A critical approach to the existing models, including the proposed neural
network model, is taken in order to take the study to the next stage and to
try to fill the gap between existing models and the true nature of the
failure process. To be more precise, a component of this study is to
provide a more accurate understanding of the nature of failure occurrence.
This understanding, in turn, guides the research towards developing more
accurate models that attempt to reflect the nature of the failure process.
Chapter 5
Characteristics of Water Main
Lifetimes as Random Processes
5.1 Introduction
Different types of modelling techniques have been developed to analyse
the pipe breakages by studying their reliability and remaining life, e.g.
(Shamir and Howard 1979, O’Day et al. 1980, Andreou et al. 1987, Clark
and Goodrich 1988, Karaa and Marks 1990, Gustafson and Clancy 1999).
In this context, survival analysis has been commonly applied and has re-
sulted in parametric lifetime models, e.g. Weibull and lognormal (Andreou
et al. 1987, Gustafson and Clancy 1999). These models, previously devel-
oped for the failure/reliability prediction of water pipes, were reviewed in
Chapter 2. An ANN model for survival analysis of water pipes was also
proposed and explained in detail, and simulation results of applying this
technique to the dataset were presented in Chapter 4.
Although the lifetime models realised by the proposed ANN-based method
are capable of reconstructing the pattern of previously recorded failure oc-
currences (in order to project this pattern to the future) more accurately
than other classical probabilistic methods, there are still large uncertain-
ties involved in failure prediction by either of the models. This is because
there are a number of factors neglected in the modelling process (and not
recorded in the database), that cause deterioration of a water pipe. These
factors consist of a diverse range of physical characteristics, environmental,
and operational parameters as explained in Chapter 2.
It is important to note that the effect of the above-mentioned factors
is not exactly the same in different cases because the incidence of each
pipe failure is a complex process, resulting from the multiplicative effect
of those factors. Some of the affecting factors are rarely recorded by
water distribution systems and are therefore unavailable to be taken into
account in the development of any model.
Even if all of the affecting factors were available for predicting the failure
occurrences in the future, prediction of these factors in the future could
involve considerable uncertainty. The reason for this ongoing uncertainty is
that some of these factors, such as soil shrinkage (as a result of significant
variation in moisture content) and temperature differential do not follow
regular patterns and their fluctuations need further consideration.
Development of an in-depth understanding of pipe failures due to influ-
ence by the various factors is the main focus of this chapter. In particular,
this chapter illustrates that water pipe failures (failure rates or inter-failure
times) are non-stationary random processes and the deficiency of parametric
techniques for the analysis of such failure processes is demonstrated through
mathematical and empirical analyses.
In Section 5.2, the concepts and definitions of stationary random pro-
cesses are briefly reviewed and it is shown that parametric lifetime models
cannot accurately model non-stationary failure processes. A new set of
probability-based definitions of failure rate are introduced and their char-
acteristics are discussed in Section 5.3. Theoretical failure rates for general
parametric models and two-parameter Weibull models are derived in Sec-
tion 5.4, followed by explanation of empirical calculation of the new failure
rates using a database, presented in Section 5.5. Comparison of the theoret-
ical and empirical results for the CWW dataset is presented in Section 5.6.
Section 5.7 presents the conclusions of the study reflected in this chapter,
and explains the motivation and direction towards further work which is
covered in Chapter 6.
5.2 Non-Stationary Random Failure Processes
and Parametric Lifetime Models
A random process is an ensemble of consecutive random variables that cor-
respond to the possible outcomes of a random event. For example, the
number of pipe failures occurring during one month is a random variable,
and the ensemble of such numbers corresponding to a number of consecutive
months is a random process.
In engineering applications, a random process is usually referred to as
stationary if the mean and variance of the process are time-invariant; oth-
erwise it is referred to as a non-stationary process (Bras and Rodriguez
1993, Balakrishnan 1995). More precisely, by definition, a random pro-
cess is wide-sense-stationary if its mean and second-order statistical prop-
erties (its correlation function) are time-invariant. If the distribution func-
tions of the random variables that constitute the random process are all
identical, then the random process is referred to as strict-sense stationary
(Balakrishnan 1995).
Using a lifetime model with time-invariant parameters for the water
pipes implicitly assumes that the random process of time intervals between
consecutive failures of the pipes is a stationary process in the strict
sense. Weibull, lognormal and other similar probabilistic lifetime models
are parametric models. The ANN-based model introduced in Chapter 4
also falls within the class of parametric models, with its synaptic weights
as its parameters.
The parameters of these models are tuned and adjusted using an
optimisation method. For example, the parameters of the Weibull and
lognormal models are tuned using linear regression as explained in
Chapter 4, and the weights of the ANN model are tuned using
back-propagation as described in Chapter 4 and Appendix B. After the
parameters are tuned they remain fixed, and the models are time-invariant
parametric models, unable to predict failures that follow a non-stationary
random process. It is demonstrated in this chapter that all parametric
lifetime models implicitly share the underlying assumption that the random
processes of failure occurrences in water mains are stationary random
processes. The results presented in this chapter also show that the failure
process of the pipes listed in the database (described in Chapter 3) is in
fact non-stationary.
5.3 Likelihood of Number of Failures: A Probabilistic
Definition for Failure Frequency
Existing failure analysis methods for water pipes usually quantify the past
behaviour of pipe failures in terms of either failure rates, e.g. (Rajani and
Makar 2000b) or inter-failure times, e.g. (Rajani and Tesfamariam 2005),
and project them into the future. In this chapter, a new set of probabilistic
measures are introduced for the purpose of investigating the characteristics
of failure processes.
Instead of deterministic measures of failure rates or inter-failure times,
the failure process is studied in terms of probabilistic measures such as
probabilities of certain numbers of failures occurring during specific time
intervals.
The failure history is divided into equal time intervals. The length of
time intervals, denoted by T in this chapter, should be chosen carefully.
Very long time intervals result in a rough analysis in which variations of
the failure process during the long time intervals are not uncovered. On the
other hand, the length of time intervals should be long enough to include
a fair number of failures on average. The n-th time interval is the interval
within [(n − 1)T, nT ].
The event of occurrence of k failures during the n-th time interval is
denoted by NOF_k(nT). There is a direct relationship between the
inter-failure times (denoted by IFT) and the number of failures occurring
during each time interval. In order to show this relationship, three
instances of occurrence of the events NOF_0(nT), NOF_1(nT) and NOF_2(nT)
are illustrated in Figure 5.1, where TFF denotes the time to the first
failure occurring after the time (n − 1)T, and the next consecutive
inter-failure times are denoted by IFT_1 and IFT_2, respectively. The time
elapsed between the most recent failure and the time (n − 1)T is denoted
by t_f.
As Figure 5.1 shows, the event of no failure occurring during the
n-th time interval, NOF_0(nT), is equivalent to:
\[ \mathrm{NOF}_0(nT) \equiv \mathrm{TFF} > T \tag{5.1} \]
and similarly, the events NOF_1(nT) and NOF_2(nT) are equivalent to:
\[ \mathrm{NOF}_1(nT) \equiv (\mathrm{TFF} \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 > T) \tag{5.2} \]
\[ \mathrm{NOF}_2(nT) \equiv (\mathrm{TFF} \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 + \mathrm{IFT}_2 > T). \tag{5.3} \]
The probability of occurrence of k failures during the n-th time interval
is denoted by P_k(nT) and is equal to Pr(NOF_k(nT)). The variable n implies
possible variations of such probabilities with time, which will be
discussed further in this chapter.
Each of the probability values in the set {P_k(nT) | k = 0, 1, ..., M} is
called a Likelihood of Number of Failures (LNF value for short), where M is
the maximum number of failures that can occur within a single time interval.
The calculation of the theoretical and empirical LNF values is discussed
in Sections 5.4 and 5.5. It is important to note that the probabilistic
approach introduced in this chapter to define and evaluate water pipe
failure rates, unlike the deterministic approach, does not merely return a
certain number of failures (or failure rate, as it is commonly called in
the context of infrastructure system analysis). Instead, the focus is on
the probabilities of certain numbers of failures, from which a confidence
interval can be estimated (as shown later in this section); this is a
valuable measure for developing maintenance strategies.

Figure 5.1: Demonstration of inter-failure times in three instances of
occurrence of the events: (a) NOF_0(nT), (b) NOF_1(nT), and (c) NOF_2(nT).
[Each panel shows a time axis marked at (n − 2)T, (n − 1)T, nT, (n + 1)T
and (n + 2)T, with TFF, IFT_1, IFT_2 and t_f indicated.]
Given the LNF values for the n-th time interval, the expected number of
failures (denoted by ENOF(nT)) in that time interval can be directly
calculated as the statistical mean of the number of failures, given by:
\[ \mathrm{ENOF}(nT) = \sum_{k=0}^{M} k \, P_k(nT) \tag{5.4} \]
This value is equivalent to the failure rate as commonly computed in
infrastructure systems analysis. However, by using the LNF values, a con-
fidence interval can also be calculated for the above failure rate. A confi-
dence interval quantifies the existing uncertainty in the calculated failure
rate, and it is particularly useful if the future failure rates are calculated
by Equation (5.4). For example, the statement “with a probability of 90%,
18 to 22 failures will occur in each month in future” is more meaningful
and useful for planning, compared to the statement “20 failures will occur
monthly”.
Assume that, using the LNF values, a measure for the expected number
of failures, ENOF, is calculated using Equation (5.4). The interval
[ENOF − δ, ENOF + δ] is the β-confidence interval corresponding to this
failure rate, if:
\[ \Pr\bigl(x \in [\mathrm{ENOF} - \delta, \mathrm{ENOF} + \delta]\bigr) = \beta \tag{5.5} \]
where x denotes the number of failures occurring in a time interval.
Calculation of the half-width of the confidence interval, δ, is
straightforward by histogram analysis of the LNF values. The LNF values for
the immediate right and left neighbours of ENOF are added to the LNF value
for the ENOF value. If the result equals β, the interval between these two
neighbours is the β-confidence interval. Otherwise, this interval should be
symmetrically extended until the area under the LNF curve equals β.
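The histogram procedure for Equations (5.4) and (5.5) can be sketched in Python as follows. Several choices here are assumptions rather than part of the method as stated: the window is centred on ENOF rounded to the nearest integer, it stops at the first width whose probability mass reaches β (an exact match is unlikely with discrete values), and it is clipped at 0 and M. The LNF values in the example are hypothetical.

```python
def expected_number_of_failures(lnf):
    """Equation (5.4): lnf[k] is P_k, the probability of k failures in an interval."""
    return sum(k * p for k, p in enumerate(lnf))


def confidence_interval(lnf, beta, tolerance=1e-12):
    """Widen a symmetric window around ENOF until it holds at least beta."""
    centre = round(expected_number_of_failures(lnf))   # assumed: nearest integer
    max_k = len(lnf) - 1
    delta = 1                                          # start with the immediate neighbours
    while True:
        low = max(0, centre - delta)                   # clipped at the ends of the histogram
        high = min(max_k, centre + delta)
        mass = sum(lnf[low:high + 1])
        if mass >= beta - tolerance or (low == 0 and high == max_k):
            return low, high, mass
        delta += 1


# Hypothetical LNF values for k = 0, ..., 4 failures per interval.
lnf = [0.1, 0.2, 0.4, 0.2, 0.1]
enof = expected_number_of_failures(lnf)
low, high, mass = confidence_interval(lnf, beta=0.8)
```

For these values the expected number of failures is 2, and the window from 1 to 3 failures already holds a probability of 0.8.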
5.4 Derivation of Theoretical LNF Values From
Lifetime Models
Lifetime distribution models form the core components of probabilistic ap-
proaches used in the analysis of water pipe failures. Such models usually
contain parametric functions with constant coefficients that are tuned by
an optimisation technique applied to failure records in a dataset. If such a
model is available, it provides a probability density function for inter-failure
times and the lifetime, which is the time to the first failure (TFF ). These
probability density functions are denoted by fIFT (t) and fTFF (t), respec-
tively. There is a direct relationship between these two density functions,
as explained below:
The sum tf + TFF is an inter-failure time and therefore, it is a ran-
dom variable with the IFT density function. Since in probabilistic lifetime
modelling, the consecutive failure times are assumed independent from each
other, the random variables tf and TFF are independent and the density
function of their sum equals the convolution of their individual density func-
tions:
\[ f_{IFT}(t) = f_{t_f}(t) * f_{TFF}(t) = \int_0^t f_{t_f}(\tau) \, f_{TFF}(t - \tau) \, d\tau. \tag{5.6} \]
On the other hand, since T − t_f is also a time to the first failure, we
can express the density of t_f as f_{t_f}(t) = f_TFF(T − t). Substituting
this into Equation (5.6) gives the following integral equation for the
density function f_IFT(t):
\[ f_{IFT}(t) = \int_0^t f_{TFF}(T - \tau) \, f_{TFF}(t - \tau) \, d\tau. \tag{5.7} \]
From Equation (5.1), the LNF value P_0(nT) can be calculated as:
\[ P_0(nT) = \Pr\{\mathrm{NOF}_0(nT)\} = \Pr\{\mathrm{TFF} > T\} = \int_T^{\infty} f_{TFF}(t) \, dt. \tag{5.8} \]
The event of occurrence of only one failure during the n-th time interval,
NOF_1(nT), is expressed in Equation (5.2) and the LNF value P_1(nT) can
be expressed as follows:
\[ P_1(nT) = \Pr\{\mathrm{NOF}_1(nT)\} = \Pr\{\mathrm{TFF} \le T \wedge \mathrm{TFF} + \mathrm{IFT}_1 > T\} = \int_0^T \int_{T - t_1}^{\infty} f_{TFF}(t_1) \, f_{IFT}(t_2) \, dt_2 \, dt_1 \tag{5.9} \]
where:
t_1 = time to the first failure, measured from the start of the interval, and
t_2 = inter-failure time between the first and the second failure.
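Similarly, from Equation (5.3), the LNF value P_2(nT) is derived as below:
\[ P_2(nT) = \Pr\{\mathrm{NOF}_2(nT)\} = \Pr\{(\mathrm{TFF} \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 + \mathrm{IFT}_2 > T)\} = \int_0^T \int_0^{T - t_1} \int_{T - t_1 - t_2}^{\infty} f_{TFF}(t_1) \, f_{IFT}(t_2) \, f_{IFT}(t_3) \, dt_3 \, dt_2 \, dt_1. \tag{5.10} \]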
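The above derivation can be generalised to any number of failures k, for
which the probability P_k(nT) is given by:
\[ P_k(nT) = \Pr\{\mathrm{NOF}_k(nT)\} = \Pr\Bigl\{(\mathrm{TFF} \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 \le T) \wedge \cdots \wedge \Bigl(\mathrm{TFF} + \sum_{i=1}^{k-1} \mathrm{IFT}_i \le T\Bigr) \wedge \Bigl(\mathrm{TFF} + \sum_{i=1}^{k} \mathrm{IFT}_i > T\Bigr)\Bigr\} \]
\[ = \int_0^T \int_0^{T - t_1} \cdots \int_0^{T - \sum_{i=1}^{k-1} t_i} \int_{T - \sum_{i=1}^{k} t_i}^{\infty} f_{TFF}(t_1) \, f_{IFT}(t_2) \cdots f_{IFT}(t_{k+1}) \, dt_{k+1} \cdots dt_1. \tag{5.11} \]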
Equations (5.8-5.11) are based on the assumption that the LNF values
are time-invariant (independent of the absolute time nT ) as they merely
depend on the number of failures k and the time-invariant pdf of the TFF
and inter-failure times.
To clarify the above point, the LNF values are derived for a
two-parameter Weibull lifetime model, which has been repeatedly applied for
failure analysis of many types of units, with the probability density
function (Crowder et al. 1994):
\[ f_{IFT}(t) = \frac{\eta}{\alpha} \left(\frac{t}{\alpha}\right)^{\eta - 1} e^{-\left(\frac{t}{\alpha}\right)^{\eta}} \tag{5.12} \]
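where η and α are the shape and scale parameters. The following formula
for P_k(nT) is derived:
\[ P_k(nT) = \int_0^T \int_0^{T - t_1} \cdots \int_0^{T - \sum_{i=1}^{k-1} t_i} \int_{T - \sum_{i=1}^{k} t_i}^{\infty} \left(\frac{\eta}{T\alpha}\right)^{k+1} f_{TFF}(t_1) \left[\frac{t_2 \cdots t_{k+1}}{T^k \alpha^k}\right]^{\eta - 1} e^{-\left(\frac{t_2 / T}{\alpha}\right)^{\eta} - \cdots - \left(\frac{t_{k+1} / T}{\alpha}\right)^{\eta}} \, dt_{k+1} \cdots dt_1 \tag{5.13} \]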
where fTFF (t1) can be derived from the Weibull distribution of inter-failure
times given in Equation (5.12), by solving the integral equation (5.7).
The above LNF values are time-invariant and this property is not spe-
cific to the Weibull model. Indeed, as long as the distribution has constant
parameters that are not updated with time, the derived LNF values are
independent of time (they do not depend on either n or T ) and merely
depend on k. When time-invariant lifetime distributions such as Weibull
distribution in Equation (5.12) are utilised to model the failure process, the
random process formed by the consecutive inter-failure times is implicitly
assumed to be strict-sense stationary (with time-invariant probability den-
sity function). The derivations made in this section assume that in such
cases, failure rate (number of failures occurring during a specified time in-
terval) is a strict-sense stationary process, too. More precisely, it would
be a discrete random process with time-invariant probability mass function
(pmf), and the LNF values defined in this chapter would be its pmf.
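The time-invariance of the theoretical LNF values can be illustrated numerically. The Python sketch below does not evaluate the integrals of Equations (5.11) or (5.13); it swaps in a Monte Carlo simulation of a renewal process with Weibull inter-failure times. As a loudly stated simplification, the time to the first failure is drawn from the same Weibull density as the inter-failure times instead of being obtained from Equation (5.7). The function name, the number of runs and the parameter values are assumptions.

```python
import random
from collections import Counter


def simulate_lnf(shape, scale, interval_length, n_runs=20000, seed=1):
    """Estimate P_k, k = 0, 1, ..., for one interval by simulation."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_runs):
        elapsed, failures = 0.0, 0
        while True:
            # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
            # Simplification: the first draw (TFF) uses the same density as the IFTs.
            elapsed += rng.weibullvariate(scale, shape)
            if elapsed > interval_length:
                break
            failures += 1
        counts[failures] += 1
    max_k = max(counts)
    return [counts[k] / n_runs for k in range(max_k + 1)]


# Hypothetical parameters: shape 1 reduces the Weibull to an exponential model.
lnf = simulate_lnf(shape=1.0, scale=2.0, interval_length=1.0)
```

The function has no argument for the interval index n: with fixed parameters the estimated values depend only on k, which is the point made in this section. For shape 1 the simplification is exact, because the exponential distribution is memoryless, and the estimate of P_0 should be close to exp(−T/α) ≈ 0.607 for the values used here.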
5.5 Empirical Calculation of LNF Values
Given a dataset that includes water pipe failures over a long period, the
LNF values can be empirically estimated using the histogram technique. The
failure history is divided into time units referred to as time periods. The
duration of the time periods should be short enough to assume that the LNF
values remain almost constant during the time intervals within each time
period. On the other hand, the time periods should be long enough to
provide reasonable empirical estimates of the LNF values during each period
by the histogram technique. For instance, in the analysis presented in
this chapter, each time period is three months long and each time interval
is one day long. If a time period S_i includes the intervals between n_1 T
and n_2 T, i.e. S_i = [n_1 T, n_2 T], then the following empirical LNF
values are given by the histogram technique:
\[ P_k^{EMP}(n_1 T) = P_k^{EMP}((n_1 + 1)T) = \cdots = P_k^{EMP}(n_2 T) = P_k^{EMP}(S_i) = \frac{\text{Number of } \mathrm{NOF}_k \text{ events occurring during } S_i}{n_2 - n_1} \tag{5.14} \]
where P_k^EMP(nT) is the empirical estimate of P_k(nT).
For each time period S_i, the expected number of failures, denoted by
ENOF(S_i), is the statistical mean of the number of failures occurring
during a time interval within S_i:
\[ \mathrm{ENOF}(S_i) = \sum_{k=0}^{M} k \, P_k^{EMP}(S_i) = \frac{\text{Number of failures occurring during } S_i}{n_2 - n_1} \tag{5.15} \]
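The histogram estimates of Equations (5.14) and (5.15) amount to counting how many intervals of a period contain exactly k failures. A minimal Python sketch is given below; the function names and the daily counts are hypothetical, and the number of intervals in the period is taken as the length of the list of counts.

```python
from collections import Counter


def empirical_lnf(interval_counts):
    """Equation (5.14): fraction of intervals in the period with exactly k failures."""
    n_intervals = len(interval_counts)
    histogram = Counter(interval_counts)
    max_k = max(interval_counts)
    return [histogram[k] / n_intervals for k in range(max_k + 1)]


def expected_failures(interval_counts):
    """Equation (5.15): total failures in the period divided by the number of intervals."""
    return sum(interval_counts) / len(interval_counts)


# Hypothetical number of failures on each of ten consecutive days of one period.
daily_counts = [0, 1, 0, 2, 1, 0, 0, 3, 1, 0]
lnf = empirical_lnf(daily_counts)
enof = expected_failures(daily_counts)
mean_from_lnf = sum(k * p for k, p in enumerate(lnf))
```

Both sides of Equation (5.15) agree for these counts: the mean computed from the empirical LNF values equals the total number of failures divided by the number of intervals.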
5.6 Case Study
This probabilistic approach is implemented using the database of CWW
pipe breakages that occurred during 1997–2000. Details of the
characteristics of this database and its preparation process are presented
in Chapter 3.
In order to consider the effect of the material, size, and geographical
location of the pipes in the lifetime models, these factors are chosen as
the dividing criteria in the classification of pipes. For example, one
subset of the pipes in this classification is the class of Cast Iron Cement
Lined (CICL) pipes with a diameter of 100 mm, located in the area covered
by the single postcode 3021. A total of 1450 failures occurred for this
class of pipes over the course of four years (16 seasons), which is
sufficient for the purpose of the analysis presented in this chapter.
In this analysis, each time period is three months (one season) long and
failures are counted on a day-by-day basis, i.e. each time interval is
one day long. For each season, a set of LNF values is empirically
calculated using Equation (5.14).
Figure 5.2 shows the empirical values of Pk(Si) for k = 0, 3, 4 and
their variations over 16 consecutive seasons. The main reason why the
actual LNF values vary with time is that the random process of water
pipe failures is non-stationary, owing to environmental factors that
affect the rate of failures and the inter-failure times.

Figure 5.2: Empirical LNF values P0, P3 and P4 for 16 consecutive seasons
during 1997–2000. [Plot: empirical LNF values (vertical axis, 0 to 0.4)
versus time period in seasons, Summer 1997 to Spring 2000; curves P0, P3
and P4.]

As discussed in Sections 5.2 and 5.4, parametric probabilistic lifetime
models assume a stationary random process of failures and therefore they
cannot predict the LNF variations with time. For instance, consider a
two-parameter Weibull model fitted to the water pipe failure records. Hav-
ing the parameters η and α, the ITF density function in Equation (5.7)
is substituted with the Weibull density function given in Equation (5.12),
and the TFF density function is numerically calculated by solving the in-
tegral Equation (5.7). Then, numerical calculation of the integrals in Equa-
tion (5.13) results in the following constant LNF values:
P0 = 0.2431, P3 = 0.1283 and P4 = 0.0729.
5.6.1 Effect of rainfall on failure rates
As discussed in Chapter 2, in addition to the size, material and
geographical location of the pipes, other factors affect the pipe
failure process. Some examples include construction details, external
and internal loads, and corrosion. In most breakages with clear
mechanical causes, corrosion has an accelerating role by weakening the
fabric of the pipe. Although such factors are not considered in this
case study, over the four-year failure history most of them can be
assumed to vary only slightly from one season to the next. However, one
factor that is not steady over consecutive seasons is soil movement.
Soil movement is a particularly critical factor in regions with
expansive soils, which swell and shrink in proportion to the amount of
moisture present in the soil. As water is introduced into the soil (by
rainfall), the soil expands, and as it dries out it contracts, often
leaving small fissures or cracks. Excessive drying and wetting of the
soil progressively deteriorates structures over the years. The resulting
soil movement can exert pressures as large as 718 kPa (Nelson and Debora
1992), which could lead to cracking.
A number of investigations have been conducted on the effect of
moisture-induced soil movement on the breakage of water pipes. For
instance, Rajani et al. (1996) showed that differential temperature
change between pipe and soil, and also soil shrinkage due to dryness,
result in the development of stresses in the pipe. Also, in a study of
the high rate of failures in the Fort Worth area, shifting soils were
suspected to be the main cause (Morris 1975). According to the same
reference, bending stresses caused by the swelling of soils such as
expansive clay were found to be three to four times greater than those
caused by effects such as internal pressure.
Similarly, substantial areas of the State of Victoria (including the CWW
licence area) are covered by expansive clay soils. The soil map of
Victoria is shown in Figure 5.3 (Mann 1997). The textures and properties
of different soils, reported by Northcote (1979), are presented as a
supplement in Tables 5.1 and 5.3. These data demonstrate that the CWW
network is mainly located in a region with expansive soil. Therefore,
pipe fractures are likely to occur over time due to soil movements
mainly caused by rainfall fluctuation.
While the soil type is time-invariant, rainfall is a non-stationary
process. Thus, rainfall fluctuation is expected to be the major factor
contributing to the non-stationarity of the failure process. Dry seasons
are expected to be associated with high rates of breakages due to soil
shrinkage.

Figure 5.3: Expansive soils in Victoria (Mann 1997). Map legend (see
Atlas of Australian Soils, CSIRO):
Cracking clay soils (CC10-13, Basaltic)
Cracking clay soils (CC1-5, CC8-9;12;Ka1-3;Ke1-4)
Hard setting loamy soils with brown or mottled brown clayey subsoils (Ra1-2;Rb1;Rf1, Basaltic)
Hard setting loamy soils with mottled dark clayey subsoils (HH1-2, Basaltic)
Hard setting loamy soils with mottled yellow clayey subsoils (Ta4;Tb19;Va2,9, Basaltic)
Hard setting loamy soils with mottled yellow clayey subsoils (Ta2;Tb4;Td5;Ub24-26;Ub29;Va1,3,4,5,7,8,11;Vd1)
Hard setting loamy soils with red clayey subsoils (Oa2-3, Basaltic)
Hard setting loamy soils with red clayey subsoils (Md4;O2,6;Ob4;Oc1-2;P1-2;Qb2-3)
Friable loamy soils (G1;Rg1, Basaltic)
Friable loamy soils (G3)
Friable (highly structured) porous earths (GG1;Mg2,5,7,17;M11, Basaltic)
Peats (Z5)
Yellow leached earths (EE2)
Grey brown highly calcareous loamy earths (Lb8-9)
Hard-setting loamy soils with mottled yellow clayey subsoils (Tb1)
Leached sand soils (Cb2)
Sandy soils with mottled yellow clayey subsoils (Wa8;X1,4,5;Ya4,15,19)
Lakes

Table 5.1: Approximate clay content of different types of soils

Texture symbol   Field texture grade      Approx. clay content (%)
S                Sand                     Less than 5%
LS               Loamy sand               Approx. 5%
CS               Clayey sand              5–10%
SL               Sandy loam               10–20%
FSL              Fine sandy loam          10–20%
SCL              Light sandy clay loam    15–20%
L                Loam                     About 25%
Lfsy             Loam, fine sandy         Approx. 25%
ZL               Silty loam               25% and with silt 25%+
SCL              Sandy clay loam          20–30%
CL               Clay loam                30–35%
CLS              Clay loam, sandy         30–35%
ZCL              Silty clay loam          30–35% and with silt 25%+
SC               Sandy clay               35–40%
ZC               Silty clay               35–40%
LC               Light clay               Clay: 35–40% and with silt: 25%+
LMC              Light medium clay        40–45%
MC               Medium clay              45–55%
MHC              Medium heavy clay        50%+
HC               Heavy clay               50%+

Figure 5.4(a) shows the rainfall records for the 16 seasons during
1997–2000. The variations of rainfall through the seasons are
significant: the average rainfall per season is 110 mm and the standard
deviation is 42 mm. In addition, large fluctuations in soil moisture are
considered the main source of soil movement resulting in pipe breakages.
To examine the consistency of the variations in LNF values with rainfall
variations, the empirical failure rates (ENOF values) are calculated
using Equation (5.15) and plotted in Figure 5.4(b). The standard
deviation of ENOF values has also been calculated using the following
equation:

\[
\sigma_{ENOF}(S_i) = \sqrt{\sum_{k=0}^{M} \left(k - ENOF(S_i)\right)^2 P^{EMP}_k(S_i)}. \tag{5.16}
\]

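Equation (5.16) can be evaluated directly from a set of empirical LNF values. The sketch below is only an illustration: the function name and the example probabilities are hypothetical and are not values from the CWW dataset.

```python
import math


def enof_and_std(lnf):
    """Mean (Eq. 5.15) and standard deviation (Eq. 5.16) of the number of
    failures per interval, given LNF values lnf[k] = P_k for k = 0..M."""
    mean = sum(k * p_k for k, p_k in enumerate(lnf))
    variance = sum((k - mean) ** 2 * p_k for k, p_k in enumerate(lnf))
    return mean, math.sqrt(variance)


# Hypothetical LNF values for one season (they sum to one).
lnf_example = [0.5, 0.3, 0.1, 0.1]
mean, std = enof_and_std(lnf_example)
print(mean, std)
```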
The numerical results of ENOF values and their standard deviations are
presented in Table 5.2. The relatively small standard deviations of ENOF
values in every season show how accurately the expected number of failures
is calculated using Equation (5.15).
In Figure 5.4, it is observed that in 11 of the 16 seasons the direction
of variation (increasing or decreasing) of the failure rate is opposite
to that of the rainfall. Thus, in this case study, rainfall and failure
rates vary in opposite directions in about 69% of the seasons. For the
remaining five seasons (outliers), other factors such as abnormal
loading or preventative maintenance may explain the inconsistency of
failure pattern variations with rainfall variations.

Figure 5.4: Comparison of rainfall in each season with its corresponding
empirical average number of failures (empirical failure rates or ENOF
values) for CICL pipes with 100 mm diameter: (a) Rainfalls (b) Empirical
average number of failures over each season computed as the ratio of
total number of breaks occurred in each season to the number of pipes
involved. [Plots: (a) rainfall (mm, 0 to 220) per season from Summer
1997 to Spring 2000, with the average rainfall marked; (b) empirical
average number of failures (0 to 1) per season.]

Table 5.2: The empirical expected number of failures and their standard
deviations for 16 consecutive seasons during 1997–2000.

Season (Si)   ENOF   σENOF     Season (Si)    ENOF   σENOF
Summer 97     1.00   0.0223    Summer 99      0.78   0.0327
Autumn 97     0.98   0.0457    Autumn 99      0.88   0.0009
Winter 97     0.81   0.0190    Winter 99      0.80   0.0273
Spring 97     0.71   0.0149    Spring 99      0.69   0.0131
Summer 98     0.94   0.0398    Summer 2000    0.93   0.0387
Autumn 98     0.94   0.0247    Autumn 2000    0.86   0.0216
Winter 98     0.81   0.0082    Winter 2000    0.79   0.0280
Spring 98     0.62   0.0208    Spring 2000    0.68   0.0147
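The direction comparison described for Figure 5.4 can be sketched as a count of season-to-season changes in which rainfall and failure rate move in opposite directions. This is an assumed reading of the comparison, since the text does not state exactly how the 16 seasons were scored; the function name and the five-season series below are hypothetical.

```python
def opposite_direction_fraction(rainfall, failure_rate):
    """Fraction of consecutive-season changes in which rainfall and the
    failure rate change in opposite directions (one rises, the other falls).

    Changes where either series stays exactly constant are skipped,
    because they have no direction to compare.
    """
    if len(rainfall) != len(failure_rate):
        raise ValueError("both series must cover the same seasons")
    opposite = 0
    compared = 0
    for i in range(1, len(rainfall)):
        d_rain = rainfall[i] - rainfall[i - 1]
        d_fail = failure_rate[i] - failure_rate[i - 1]
        if d_rain == 0 or d_fail == 0:
            continue
        compared += 1
        if d_rain * d_fail < 0:
            opposite += 1
    if compared == 0:
        raise ValueError("no comparable changes in the series")
    return opposite / compared


# Hypothetical seasonal series (rainfall in mm, ENOF per day).
rain = [60, 100, 150, 130, 70]
enof_series = [1.0, 0.9, 0.8, 0.85, 0.8]
fraction = opposite_direction_fraction(rain, enof_series)
print(fraction)
```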
To highlight the correlation between the rainfall data and the number of
failures, Figure 5.5 plots the rainfall data versus the empirical
(expected) number of failures, i.e. the ENOF values per km of pipe
length, for CICL pipes with 100 mm diameter. It is observed that when
the rainfall is significantly higher or lower than its average value
(about 110 mm), there is a corresponding increase in the number of pipe
failures. Two diverging lines have been plotted to clarify this
increase. A similar trend has been observed in the failure pattern of
other classes of pipes. For instance, Figure 5.6 shows the same
correlation for CI pipes with 100 mm diameter.
To quantify the correlation between rainfall and failure rate
variations, the magnitude of the deviation of rainfall from its average
is plotted versus the ENOF values for CICL pipes with 100 mm diameter in
Figure 5.7, and a regression line is fitted to the points (excluding the
five outliers). The outliers are identified by a robust estimation
technique, the Least Median of Squares (LMS) estimator. This method
finds the optimum linear fit to the data (excluding the detected outlier
samples) and automatically results in an inlier-outlier dichotomy. The
outliers in this case study are assumed to be mainly associated with
random effects and extreme climate variations. Although the outliers are
not considered in this analysis, they also contribute to the random
time-variations of the failure process and its non-stationarity. Indeed,
their existence also demonstrates the deficiency of the parametric
(probabilistic) models developed for water pipe failures in the
literature.
The correlation coefficient of the regression is 0.83, which is
sufficiently large to validate our assumption of an almost linear
relationship between the failure rates and the deviation of rainfall
from its average (disregarding the extreme climate variations which
cause the outlier points). A similar trend has been observed in the data
of other classes of pipes.

Figure 5.5: Daily average number of failures during each season in
1997–2000, and the corresponding rainfall records for CICL pipes with
100 mm diameter. [Scatter plot: rainfall (mm, 20 to 200) versus number
of failures per km (0.02 to 0.035).]

Figure 5.6: Daily average number of failures during each season in
1997–2000, and the corresponding rainfall records for CI pipes with
100 mm diameter. [Scatter plot: rainfall (mm, 20 to 200) versus number
of failures per km (3 × 10^−3 to 8 × 10^−3).]

Figure 5.7: Deviation of rainfalls from their average, plotted versus
the corresponding ENOF values: A regression line demonstrates the nearly
linear correlation between the failure rates and rainfall deviations.
[Scatter plot: absolute deviation of rainfall from its average (mm, 0 to
100) versus number of failures (ENOF, 0.6 to 1); legend: regression
line, outliers.]
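A robust line fit of the kind used to separate inliers from outliers can be sketched as follows. This is not the implementation used in the thesis: it is a minimal approximation of Least Median of Squares that only searches over lines through pairs of data points, and the outlier rule (a robust scale of 1.4826 (1 + 5/(n − 2)) times the root of the minimal median, with a 2.5-scale cut-off) is the rule commonly paired with LMS and is assumed here. All names and data are hypothetical.

```python
from itertools import combinations
from statistics import median


def lms_line(points):
    """Approximate Least Median of Squares fit of y = slope * x + intercept.

    Each pair of points defines a candidate line; the candidate whose
    squared residuals have the smallest median is kept.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a*x + b
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        med_sq = median((y - slope * x - intercept) ** 2 for x, y in points)
        if best is None or med_sq < best[0]:
            best = (med_sq, slope, intercept)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    med_sq, slope, intercept = best
    return slope, intercept, med_sq


def split_inliers(points, slope, intercept, med_sq, threshold=2.5):
    """Label points as inliers or outliers using a robust residual scale
    (assumed rule, requires more than two points)."""
    n = len(points)
    scale = 1.4826 * (1 + 5 / (n - 2)) * med_sq ** 0.5
    limit = threshold * scale + 1e-12  # tolerance for an exact fit
    inliers, outliers = [], []
    for x, y in points:
        residual = abs(y - slope * x - intercept)
        (inliers if residual <= limit else outliers).append((x, y))
    return inliers, outliers


# Hypothetical data: seven points on y = 2x + 1 and three gross outliers.
data = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9), (5, 11), (6, 13),
        (7, 40), (8, -20), (9, 0)]
slope, intercept, med_sq = lms_line(data)
inliers, outliers = split_inliers(data, slope, intercept, med_sq)
print(slope, intercept, outliers)
```

Because the median ignores the worst half of the residuals, the three gross outliers do not pull the fitted line, which is the property that makes the inlier-outlier split possible.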
Large variations of failure rate with rainfall are expected, as a
significant change in soil moisture (due to a change in rainfall) leads
to the swelling and shrinkage of reactive soils. This in turn leads to
movement, distortion and subsequent failure of pipes. However, it should
be noted that it is not only the amount of rainfall that is responsible
for the increase in pipe failures; more important is the rate of change
of rainfall and soil moisture. In other words, if a very dry soil (due
to below-average rainfall) receives a high amount of rainfall (well
above average), the resulting soil movement would be quite significant
even though the total rainfall might be about average.
The high rate of change in rainfall can be seen in Figure 5.4(a) by
comparing the rainfalls for summer and winter of 1997. It should also be
noted that if a high rainfall occurs while the soil is fully saturated
from earlier events, it is unlikely that further soil movement will take
place and hence pipes would not experience a higher than average number
of failures. Indeed, this explains the outliers in Figure 5.7 where at
certain periods there are high rainfalls but no appreciable increase in
the number of failures.
5.7 Conclusion
In this chapter, pipe failures were considered as random processes and
their characteristics were studied. A set of probabilistic definitions,
called Likelihood of Number of Failures (LNF), was introduced for use in
investigating the characteristics of this random process.

Using the database of CWW, LNF values were empirically calculated.
Theoretical LNF values were derived from a Weibull distribution model as
an example of general parametric lifetime models. It was demonstrated
that the LNF values derived from classical lifetime models are
time-invariant, while their empirical values vary from one season to the
next. Therefore, the failure of water pipes is a non-stationary random
process. This result is in contrast with the underlying assumption of
existing lifetime models on stationarity of the failure process.
It is emphasised that these findings and conclusions are general and
applicable to any other database of pipe failures. This chapter also reflected
a more specific investigation using the failure dataset of CWW. It was noted
that CWW is located in an area of expansive soil and soil movement was
shown to be a source of non-stationary behaviour of failure represented in
this database.
To investigate the effect of soil movements (caused by rainfall) on
failure rate variations, the variations of the empirical failure rates
(the statistical mean of the number of failures occurring each day) were
studied together with the variations of rainfall in the area. Comparison
of the concurrent plots of rainfall and the empirical failure rates
showed that, most of the time, variations of failure rates could be
explained by variations of rainfall.
As a general approach to tackle the time-varying nature of pipe failure
processes, it is suggested to regularly update the parameters of lifetime
models. While this may require demanding computational updates and
costly expert staff, the resulting predictions would be more realistic and
reliable.
As an alternative solution to the problem, a non-parametric method has
been developed for efficient analysis of the non-stationary pipe failure
processes and is presented in Chapter 6. This technique is able to
handle and automatically update dynamic models for non-stationary pipe
failure processes.

Table 5.3: Properties of different texture types found in soils
(Northcote).

Texture symbol   Behaviour of moist bolus
S      Coherence nil to very slight; cannot be moulded; sand grains of medium size; single sand grains stick to fingers.
LS     Slight coherence; sand grains of medium size; can be sheared between thumb and forefinger to give minimal ribbon of 5 mm.
CS     Slight coherence; sand grains of medium size; sticky when wet; many sand grains stick to fingers; will form a minimal ribbon of 5-15 mm; discolours fingers with clay stain.
SL     Bolus coherent but very sandy to touch; will form a ribbon of 15-25 mm; dominant sand grains are of medium size and are readily visible.
FSL    Bolus coherent; fine sand can be felt and heard when manipulated; will form a ribbon of 13-25 mm; sand grains are clearly evident under a hand lens.
SCL    Bolus strongly coherent but sandy to touch; grains dominantly medium sized and easily visible; will form a ribbon of 2-2.5 cm.
L      Bolus coherent and rather spongy; smooth feel when manipulated but no obvious sandiness or 'silkiness'; somewhat greasy to the touch if much organic matter present; will form ribbon of 25 mm.
Lfsy   Bolus coherent and slightly spongy; fine sand can be felt and heard when manipulated; will form a ribbon of 25 mm.
ZL     Coherent bolus; silky when manipulated; will form a ribbon of 25 mm.
SCL    Strongly coherent bolus, sandy to the touch; medium sized sand grains visible in finer matrix; will form a ribbon of 25-40 mm.
CL     Coherent plastic bolus; smooth to manipulate; will form a ribbon of 40-50 mm.
CLS    Coherent plastic bolus; medium sized sand grains visible in finer matrix; will form a ribbon of 40-50 mm.
ZCL    Coherent smooth bolus, plastic and silky to the touch; will form a ribbon of 40-50 mm.
SC     Plastic bolus; fine to medium sands can be seen, felt or heard in clayey matrix; will form a ribbon of 50-75 mm.
ZC     Plastic bolus; smooth and silky to manipulate; will form a ribbon of 50-75 mm.
LC     Plastic bolus; smooth to touch; slight resistance to shearing; will form a ribbon of 50-75 mm.
LMC    Plastic bolus; smooth to touch; slight to moderate resistance to forming a ribbon; will form a ribbon of 75 mm.
MC     Smooth plastic bolus; can be moulded into a rod without fracturing; has moderate resistance to forming a ribbon; will form a ribbon of 75 mm+.
MHC    Smooth plastic bolus; can be moulded into a rod without fracturing; has a moderate to firm resistance to forming a ribbon; will form a ribbon of 75 mm or more.
HC     Smooth plastic bolus; can be moulded into rods without fracturing; has firm resistance to forming a ribbon; will form a ribbon of 75 mm+.
Chapter 6
A Non-Parametric Technique for
Failure Prediction of
Deteriorating Components
6.1 Introduction
The statistical models that are currently available for failure prediction
of water pipes were reviewed and discussed in Chapter 2. These models
were classified broadly into Deterministic, Probabilistic multi-variate and
Probabilistic single-variate models applied to grouped data (Kleiner and
Rajani 2001). Most of the existing models fall within the category of de-
terministic models. However, the focus of this study is on probabilistic
models.
A probabilistic model, unlike deterministic models, does not merely re-
turn a certain (predicted) quantity such as the number of failures or the date
of future failures. Instead, it focuses on estimating the failure probabilities
that can be used for developing a maintenance strategy. Using those fail-
ure probabilities, confidence intervals for the predicted quantities can also
be calculated. The concept of confidence interval and its use to develop
proactive maintenance strategies will be discussed later in this chapter.
Most breakage prediction models in water mains developed so far deal
almost exclusively with the static factors. These models may yield biased re-
sults by ignoring environmental time-varying factors in the statistical anal-
ysis of break rates (Kleiner and Rajani 2002). To model the pattern of
failure occurrences, such models assume a particular probability density
function (pdf) for the structural failures (Barraza et al. 2000, Barraza et
al. 1996, Lyu 1996, Hossain and Dahiya 1993, Goel and Okumoto 1979). In
these models, the failure process is assumed to be stationary as a simplifying
hypothesis.
For the water pipes recorded in the dataset explained in Chapter 3,
empirical LNF values were separately calculated for several consecutive
seasons in Chapter 5. The results showed that these empirical LNF values
can vary substantially from one season to another and therefore, the water
pipe failures are non-stationary random processes.
Furthermore, it was shown that the parametric models currently used for
failure prediction assume a time-invariant pdf for the time or the number of
failures (with fixed parameters) and do not take into account the variations
of failure patterns with time (Wood 1996a, Wood 1996b, Sahinoglu 1992,
Musa et al. 1987, Miller 1980, Leighton and Rivest 1986, Sukert 1976, Sukert
1979, Rajani and Tesfamariam 2005). This was clarified and mathemati-
cally demonstrated for the Weibull distribution model which is a well-known
lifetime model, widely used as a failure prediction model for many compo-
nents (Crowder et al. 1994). This limitation is, however, not specific to the
Weibull model. Indeed, as long as the distribution has constant parameters
that are not updated with time (and correspond with a stationary random
process), the probabilities of future failures are modelled as a function of
inter-failure times and not the absolute time n, and therefore the derived
LNF values are time-invariant and merely depend on k and the fixed pa-
rameters of the model.
The only statistical technique in the literature for predicting water
main breaks that has considered time-dependent variables other than pipe
age and the number of previous breaks is a deterministic multi-variate
exponential model developed by Kleiner and Rajani (2002). In addition to
the age of pipes, the time-varying factors taken into account by this
model include temperature effects, soil-moisture effects, cumulative
length of replaced water mains and cumulative length of cathodically
protected water mains. The forecast climate data required for this model
were obtained from Fourier analysis, which assumes that climate change
follows harmonic cycles. The inherent inaccuracy of climate forecasting
is naturally added to the error of failure prediction.
Considering the failure records that are usually available in most
water distribution systems, the variety of data needed can be a
limitation of the model of Kleiner and Rajani (2002). A computer program
named WARP has been developed using this model to perform analyses of
historical breakage rates with or without any number of these covariates
(Kleiner and Rajani 2003). However, when one or more covariates are
selected, the results reflect background ageing (which is the consistent
increase in pipe breakage rate due to corrosion and other steady,
continuous deterioration processes) as well as annual variations due to
the influence of those time-varying factors.
In addition, the model of Kleiner and Rajani (2002) should be applied
to long histories of failures. In some cases, the available dataset is
short and includes a period of predominantly decreasing breakage rates.
Applying this model to such datasets may yield results that are
counter-intuitive, such as a positive effect of ageing and/or negative
effects of replacement. For this reason, the authors suggested that the
model should be used judiciously and the outcome of the analysis
interpreted with caution.
Filling the existing gap between the non-stationary random nature of
component failure processes and parametric failure prediction models has
been the main focus of this study. This chapter presents a novel
non-parametric estimation technique for prediction of future LNF values.
The proposed technique does not have any constant model parameters and
therefore it does not model the random process of failures based on
stationarity assumptions. More precisely, the proposed technique
complies with the non-stationary characteristics of the pipe failures.
As will be explained later in detail, given the LNF values, one can
predict the future number of failures within any given period of time.
Moreover, these values can be used to compute accurate confidence
intervals (lower and upper bounds) for such a prediction. Therefore,
prediction of the LNF values of the component failure process is the
main focus of the technique proposed in this chapter.
In this study, the notations and definitions introduced in Chapter 5
are followed. The time interval is chosen to be one day. Within a time
period, the LNF values vary only slightly and are assumed constant for
the heuristic calculation of empirical LNF values using the histogram
technique. For the water pipe failures recorded in the above-mentioned
history, such an assumption is applicable for a time period of one
season (90 time intervals).
In the next section, an algorithm for prediction of future LNF values
based on maximum likelihood estimation is given. Section 6.4 discusses
the use of the predicted LNF values to calculate the expected number of
failures in the immediate next time interval and to obtain confidence
intervals for the estimated number of failures.
The prediction scheme is then extended to estimate the total number of
failures that will occur during upcoming multiple time intervals.
Application of the proposed technique on the failure history of water
pipes described in Chapter 5 is then demonstrated in Section 6.6
followed by the conclusions of the study.

6.2 Maximum Likelihood Estimation of Future LNF Values

The inter-failure time elapsed between two consecutive NOFk events is
denoted by µk. Figure 6.1 shows an instance of µ2 (the inter-failure
time between two NOF2 events).

Figure 6.1: Inter-failure time, showing an example where µ2 = 4T.
[Diagram: time axis from (n−5)T to nT with failure occurrences marked
and µ2 = 4T indicated.]

If NOFk(nT) is a true statement (i.e. k failures occur during the time
interval [(n−1)T, nT]), then the probability of next µk = m (the next
NOFk to occur "m" time intervals later) is given by:

\[
\Pr(\text{next } \mu_k = m) = \Pr(\Omega_m)
\]

where

\[
\Omega_m \equiv\ \sim NOF_k((n+1)T) \wedge \ldots \wedge \sim NOF_k((n+m-1)T) \wedge NOF_k((n+m)T) \tag{6.1}
\]

where ∼ means the logical "NOT". The above expression can be calculated
as follows:

\[
\Pr(\text{next } \mu_k = m) = (1 - P_k)^{m-1} P_k. \tag{6.2}
\]

Equation 6.2 expresses a relationship between future LNF values and
inter-failure times. This can be utilised to turn the problem of
prediction of LNF values into the problem of estimation of the next
inter-failure time. Later in this chapter, it is mathematically proven
that this approach results in more certain and reliable predictions.
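Equation (6.2) has the form of a geometric distribution, which can be checked numerically. The sketch below is illustrative only: the function name and the value P_k = 0.25 are hypothetical, and the formula is taken to assume that P_k stays constant and that successive time intervals are independent over the horizon considered.

```python
def next_interfailure_pmf(p_k, m):
    """Pr(next mu_k = m) from Eq. (6.2): m - 1 intervals without an
    NOF_k event, followed by one interval with an NOF_k event."""
    if not 0 < p_k <= 1:
        raise ValueError("p_k must lie in (0, 1]")
    if m < 1:
        raise ValueError("m must be a positive number of intervals")
    return (1 - p_k) ** (m - 1) * p_k


# Hypothetical LNF value.
p_k = 0.25
horizon = range(1, 200)
probabilities = [next_interfailure_pmf(p_k, m) for m in horizon]
total = sum(probabilities)
mean_gap = sum(m * p for m, p in zip(horizon, probabilities))
print(total, mean_gap)
```

Over a long enough horizon the probabilities sum to approximately one and the mean inter-failure time approaches 1/P_k, which is the relationship exploited when LNF values are estimated from inter-failure times.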
Using a maximum likelihood estimation approach to prediction of LNF
values, the first step is to calculate the likelihood of LNF values for a given
inter-failure time in the future. More precisely, the following question is
to be answered: Given that “next µk = m” what is the likelihood of a
set of LNF values, Pk, during the next m time intervals? The Maximum
Likelihood (ML) estimates of the LNF values are then the values with the
highest likelihood for the observed inter-failure data.
To obtain an ML estimate, denoted by P̂k, the likelihood of the observed
data, which include the next inter-failure time µk(next) = m (yet to be
estimated), is maximised. From Equation (6.2), this likelihood is given
by the following equation:

\[
\Pr(\mu_k(\text{next}) = m \mid P_k) = P_k (1 - P_k)^{m-1} \tag{6.3}
\]

The maximum of the above likelihood is derived by solving the algebraic
equation:

\[
\frac{d\left(P_k (1 - P_k)^{m-1}\right)}{dP_k} = 0 \tag{6.4}
\]

which results in P̂k = 1/m.
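For completeness, the step from Equation (6.4) to the solution 1/m can be written out as follows. This is a worked sketch of the standard calculation, stated for m ≥ 2 (for m = 1 the likelihood is simply P_k, which is largest at P_k = 1 = 1/m).

```latex
\begin{align*}
\frac{d}{dP_k}\left[P_k(1-P_k)^{m-1}\right]
  &= (1-P_k)^{m-1} - (m-1)\,P_k\,(1-P_k)^{m-2} \\
  &= (1-P_k)^{m-2}\left[(1-P_k) - (m-1)P_k\right] \\
  &= (1-P_k)^{m-2}\,(1 - mP_k).
\end{align*}
% For 0 < P_k < 1 the factor (1-P_k)^{m-2} is positive, so the derivative
% vanishes only at P_k = 1/m. It is positive for P_k < 1/m and negative
% for P_k > 1/m, so this stationary point is the maximum of the likelihood.
```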
On the other hand, for each time interval, the probabilities of all the
NOFk's should sum to one:

\[
\sum_{k=0}^{M} P_k = 1 \tag{6.5}
\]

Therefore, a normalisation factor ξ is introduced into the solution of
Equation (6.4) and the following final formula is derived for the ML
estimates of the LNF values:

\[
\hat{P}_k = \frac{\xi}{\hat{\mu}_k}; \qquad
\xi = \left(\sum_{i=0}^{M} \hat{\mu}_i^{-1}\right)^{-1}. \tag{6.6}
\]

The future inter-failure times are denoted by µ̂i in the above formula,
as they also need to be estimated. However, it is shown in the next
section that the above maximum likelihood estimates of LNF values are
highly robust to inaccuracies involved in estimation of inter-failure
times.
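The normalised estimate in Equation (6.6) amounts to taking reciprocals of the estimated inter-failure times and rescaling them to sum to one. A minimal sketch, with a hypothetical function name and hypothetical inter-failure times (in units of time intervals):

```python
def ml_lnf_estimates(inter_failure_times):
    """ML estimates of the LNF values from estimated inter-failure times.

    inter_failure_times[k] is the estimated next inter-failure time of
    NOF_k events and must be positive. Each estimate is xi / mu_k, where
    the normalisation factor xi makes the estimates sum to one (Eq. 6.6).
    """
    if any(mu <= 0 for mu in inter_failure_times):
        raise ValueError("inter-failure times must be positive")
    inverses = [1.0 / mu for mu in inter_failure_times]
    xi = 1.0 / sum(inverses)
    return [xi * inv for inv in inverses]


# Hypothetical estimated inter-failure times for k = 0, 1, 2, 3.
mu_hat = [1, 2, 4, 4]
lnf_hat = ml_lnf_estimates(mu_hat)
print(lnf_hat)
```

Short inter-failure times map to large LNF estimates, so the event that recurs most often dominates the predicted distribution.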
6.3 Required Level of Accuracy for the Inter-
Failure Times
To predict the LNF values by Equation (6.6), future inter-failure times
should be predicted first. Before presenting a technique to predict the
future inter-failure times, the effect of accuracy of this prediction on
the predicted LNF values is investigated.

Figure 6.2: Variance of the estimated inter-failure times is a
decreasing function of the maximum likelihood estimates of Pk values.
[Plot: variance of inter-failure estimates (0 to 10) versus ML estimate
of Pk (0 to 1).]

From Equation (6.2), the inter-failure times are observed to have geomet-
ric distributions with parameter Pk. Therefore, in the process of maximum
likelihood estimation of LNF values, the variance of the inter-failure times
is given by:

\[
\mathrm{VAR}(\mu_k) = \frac{1 - P_k}{P_k^2}. \tag{6.7}
\]

Bearing in mind that, in estimation theory, variance is an indicator of
the error of estimation, Equation (6.7) and its plot in Figure 6.2 show
an inverse relationship between the error of inter-failure time
prediction and the LNF values predicted from the inter-failure times by
Equation (6.6). The larger the LNF estimates are, the more significant
is their effect on prediction of future failures. Such large LNF values
need to be accurately estimated, whereas the level of accuracy for small
LNF values is not important. Equation (6.7) and Figure 6.2 show that the
inter-failure times that are estimated more accurately (and have a
smaller variance) correspond to such large Pk's. In other words, a high
accuracy of prediction for the inter-failure time µk is required, and
indeed occurs, only when the corresponding LNF value Pk is large.
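The behaviour of Equation (6.7) can be tabulated for a few LNF values. The function name and the sample values of P_k below are hypothetical and chosen only to show the trend.

```python
def interfailure_variance(p_k):
    """Variance of a geometrically distributed inter-failure time (Eq. 6.7)."""
    if not 0 < p_k <= 1:
        raise ValueError("p_k must lie in (0, 1]")
    return (1 - p_k) / p_k ** 2


# Hypothetical LNF values, from rare to frequent NOF_k events.
sample_p = [0.1, 0.25, 0.5, 0.9]
variances = [interfailure_variance(p) for p in sample_p]
for p, var in zip(sample_p, variances):
    print(f"P_k = {p:.2f}  ->  VAR(mu_k) = {var:.4f}")
```

The variance falls steeply as P_k grows, which is the decreasing relationship described for Figure 6.2.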
If, for some k, the LNF value Pk is large, then the events NOFk occur
frequently and there are many short consecutive inter-failure times in
each period of time. “Finite Impulse Response” (FIR) filters (discussed
in the next section) are applied to predict the inter-failure times.
Since the accuracy of such filters increases with the number of
available data, when the LNF value for some k is large (and many short
consecutive µk’s appear in the recent data), the inter-failure time will
automatically be predicted more accurately than for other k’s with small
LNF. This is a point of strength of the proposed technique that
guarantees a satisfying performance in the prediction of Pk values by µk
estimation using an FIR filter.
6.4 Prediction of Inter-Failure Times
Given a failure history for a class of assets, before predicting future inter-
failure times, previous µk’s need to be computed. The set of µk’s for each k
is empirically calculated by simply determining the times elapsed between
each two consecutive NOFk’s.
For each time interval [(n − 1)T, nT], the inter-failure time µk(nT) is
defined as the number of time units between the two most recent NOFk
events. The µk(nT) values are initially set to zero at n = 0 and NOFk
events for all k’s are assumed to have occurred at this initial time.
The error incurred by the inaccuracy of such initialisation is transient
and fades out after training data are applied. More precisely, after the
estimation scheme is initialised and failure data from a history of
recent failures (up to the present) are given as inputs, the transient
effects of initialisation fade out. This is a well-known property of
robust and stable recursive estimation techniques such as the one
introduced in this chapter.
At each time, nT , the number of failures occurring during the interval
[(n − 1)T, nT ] is recorded. If this number is m, then NOFm has occurred
during this interval and its corresponding inter-failure time, µm(nT ), is
updated to the number of time intervals since the last occurrence of NOFm.
Figure 6.3 shows a case example in which the numbers of failures occurring
during 20 consecutive time intervals are used to calculate the inter-failure
times. Since at most four failures occurred during a single time interval
in the data, only µ0, µ1, µ2, µ3 and µ4 are calculated.
[Figure 6.3: six stacked panels plotted against time (in terms of time intervals, 1 to 20); top panel: No. of Failures; five lower panels labelled 0 to 4 (inter-failure times)]
Figure 6.3: The number of failures occurring during 20 consecutive time
intervals (top plot) and the inter-failure times obtained from these data.
The µk’s are constant and change only when an NOFk occurs. Therefore,
at each time, only one µk changes and the rest stay at the same value.
As it is shown in Figure 6.3, for each k, the µk values at different times
are constant and change only when an NOFk occurs. Therefore, at each
time, only one µk changes and the rest stay unchanged. This characteristic
of the consecutive inter-failure times will be revisited later in this chapter
to clarify some properties of the proposed non-parametric failure prediction
method.
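The bookkeeping described in this section can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `inter_failure_times`, its arguments, and the sample counts are assumptions made here (the data behind Figure 6.3 are not reproduced), and the initialisation follows the text, with every NOFk assumed to have occurred at n = 0.

```python
def inter_failure_times(counts, max_failures):
    """Return [mu_0, ..., mu_M] after every time step n = 0, 1, ..., N.

    counts[n - 1] is the number of failures recorded in [(n-1)T, nT].
    """
    last_seen = [0] * (max_failures + 1)  # every NOF_k assumed to occur at n = 0
    mu = [0] * (max_failures + 1)         # mu_k initialised to zero
    history = [list(mu)]
    for n, m in enumerate(counts, start=1):
        if not 0 <= m <= max_failures:
            raise ValueError("count outside [0, max_failures]")
        mu[m] = n - last_seen[m]          # intervals since the last NOF_m
        last_seen[m] = n
        history.append(list(mu))          # all other mu_k stay unchanged
    return history


# Hypothetical daily failure counts, for illustration only.
counts = [0, 1, 0, 2, 0, 0, 1]
history = inter_failure_times(counts, max_failures=2)
for n, mu in enumerate(history):
    print(n, mu)
```

At each step only the entry for the observed count m is rewritten, which mirrors the property noted above that at most one µk changes per time interval.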
Having the set of µk’s for all possible k’s, the number of time intervals
between the most recent NOFk and the next anticipated NOFk needs to be
predicted for each k. A specific FIR filter is proposed here to predict the
next µk as a weighted average of the J + 1 most recent µk’s. J is a
user-defined parameter that can be tuned for best performance. The weights
are chosen in such a way that more recent inter-failure times have larger
effects on the
predicted value. The filter input-output equation is as follows:
\[ \mu_k((n+1)T) = \frac{\sum_{i=0}^{J} \lambda^{i} \, \mu_k((n-i)T)}{\sum_{i=0}^{J} \lambda^{i}} \tag{6.8} \]
where λ ∈ [0, 1] is a forgetting factor to give more recent inter-failure times
more influence on the prediction process. The denominator term is a nor-
malising factor to guarantee that FIR coefficients sum to one (as expected
in an averaging scheme).
In the special case of λ = 1, the FIR filter performs a simple averaging
over the recent data. Using the J + 1 recent µk’s rather than the whole
data is particularly necessary when dealing with large sets of data. The
parameters J and λ are tuned by using a portion of failure history data and
validated by the rest of history database.
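A minimal Python sketch of the weighted average in Equation (6.8) is given below. The function name `predict_next_mu` and the sample values are assumptions made for illustration; the list passed in plays the role of µk(nT), µk((n − 1)T), ..., µk((n − J)T), most recent first, so its length is J + 1.

```python
def predict_next_mu(recent_mu, lam):
    """Weighted average of recent inter-failure times, as in Equation (6.8).

    recent_mu[i] stands for mu_k((n - i)T), so recent_mu[0] is the most recent.
    """
    if not recent_mu:
        raise ValueError("at least one inter-failure time is required")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("the forgetting factor must lie in [0, 1]")
    weights = [lam ** i for i in range(len(recent_mu))]  # lambda^i, i = 0..J
    return sum(w * m for w, m in zip(weights, recent_mu)) / sum(weights)


# Hypothetical values, for illustration only.
recent = [2, 4, 6]
print(predict_next_mu(recent, 1.0))  # plain average of the three values
print(predict_next_mu(recent, 0.5))  # pulled towards the most recent value
```

With λ = 1 all weights are equal and the result is the simple average; smaller values of λ shift the prediction towards the most recent inter-failure time.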
To summarise, the prediction of LNF values for the immediate next
time interval is performed in three steps:
1- Having the failure database, the inter-failure times, µk’s, are
calculated for all k’s, for the most recent J + 1 time intervals.
2- The next inter-failure times, µk(n + 1), are calculated using
Equation (6.8) for all k’s.
3- The LNF values at the next time interval, Pk(n + 1)’s, are estimated
using Equation (6.6).
6.5 Failure Prediction Using the Estimated LNF Values
6.5.1 Prediction of number of failures
Having the estimated LNF values, one can easily calculate the expected
number of failures as a prediction for the number of failures that will occur
during the next time interval:
\[ ENOF(n+1) = \sum_{k=0}^{M} k \, P_k(n+1). \tag{6.9} \]
In the above equation, the expected number of failures is derived as the
statistical mean of the number of failures. The statistical mean is the most
meaningful measure for the purpose of this study. For instance, if the mode
of the distribution of the number of failures (the number of failures, k,
corresponding with the largest LNF value) is chosen as the predicted number
of failures, then in many circumstances, particularly when the time interval
is short, it will result in unreasonable predictions.
[Figure 6.4: column plot of LNF value (Pk) against number of failures (k), for k = 0 to 8; annotated "ENOF=4.2 (rounded to 4)" and "Mode=7"]
Figure 6.4: A case example showing the unreasonable results of using the
mode of the distribution instead of the statistical mean of the number of
failures for prediction.
For a “one day” time interval, a typical column plot of LNF values is
shown in Figure 6.4. The maximum LNF value corresponds with k = 7.
However, since another LNF value (P2) is close to P7, the statistical mean
of the number of failures calculated by Equation (6.9) is 4.2, rounded
to 4, and this appears to be a more logical prediction for the next
number of failures.
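The contrast between the mean and the mode can be reproduced with a short Python sketch of Equation (6.9). The LNF values below are hypothetical numbers chosen to be bimodal; they are not the values plotted in Figure 6.4, and the function names are assumptions made for illustration.

```python
def expected_failures(pmf):
    """Statistical mean of the number of failures, as in Equation (6.9)."""
    return sum(k * p for k, p in enumerate(pmf))


def mode_of_failures(pmf):
    """Number of failures k with the largest LNF value."""
    return max(range(len(pmf)), key=lambda k: pmf[k])


# Hypothetical LNF values P_0..P_8 with two competing peaks (k = 2 and k = 7).
pmf = [0.05, 0.10, 0.22, 0.08, 0.05, 0.07, 0.10, 0.25, 0.08]

enof = expected_failures(pmf)
print(f"ENOF = {enof:.2f}, rounded to {round(enof)}")
print(f"Mode = {mode_of_failures(pmf)}")
```

For these numbers the mode sits on the upper peak, while the mean falls between the two peaks, which is the behaviour the paragraph above argues for.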
6.5.2 Confidence intervals
In addition to the direct prediction of the number of failures, LNF values
can also be used to calculate Confidence Intervals (CI) for the predictions
given by Equation (6.9). The interval [ENOF − δ, ENOF + δ] is an
α–confidence interval (α–CI) for the predicted number of failures, if:
Pr (x ∈ [ENOF − δ, ENOF + δ]) = α. (6.10)
To calculate the above α–CI, the LNF values corresponding with the
right and left neighbourhoods of ENOF are progressively added until the
sum equals (or exceeds) α. The α–CI is then determined as the interval
within the final left and right neighbourhoods.
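One way to read the neighbourhood-expansion procedure is sketched below in Python. Several details are assumptions made here rather than statements from the text: the expansion is centred on ENOF rounded to the nearest integer, one value is added on each side per step, and the interval is clipped to the range [0, M]. The function name and the sample LNF values are also hypothetical.

```python
def confidence_interval(pmf, alpha):
    """Grow a symmetric neighbourhood around ENOF until its mass reaches alpha."""
    enof = sum(k * p for k, p in enumerate(pmf))
    centre = round(enof)      # assumption: centre on the rounded mean
    top = len(pmf) - 1        # largest possible number of failures, M
    delta = 0
    while True:
        lo = max(0, centre - delta)   # assumption: clip to [0, M]
        hi = min(top, centre + delta)
        mass = sum(pmf[lo:hi + 1])
        if mass >= alpha - 1e-12 or (lo == 0 and hi == top):
            return lo, hi, mass
        delta += 1


# Hypothetical LNF values P_0..P_8, for illustration only.
pmf = [0.05, 0.10, 0.22, 0.08, 0.05, 0.07, 0.10, 0.25, 0.08]
lo, hi, mass = confidence_interval(pmf, alpha=0.80)
print(f"80% CI: [{lo}, {hi}] with total LNF mass {mass:.2f}")
```

Because the LNF values are discrete, the accumulated mass generally exceeds α rather than equalling it, which is why the stopping rule in the text reads "equals (or exceeds)".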
6.5.3 Failure prediction for multiple future time intervals
The three-step prediction scheme results in ML estimates for the Pk values
corresponding with the failures likely to occur during a single time interval
in the future. Using these LNF values and Equation (6.9) and Equation
(6.10), one can predict the number of failures and estimate the confidence
interval for a single time interval.
Since the time interval is a short period of time, failure prediction is
often required for longer periods spanning multiple time intervals.
Therefore, the LNF values for the number of failures during the next L
time intervals need to be estimated from the Pk values estimated
for a single time interval. The following theorem in probability theory can
be used to estimate the required LNF values (Stark 1994):
Theorem: If K1 and K2 are two independent discrete random variables
with their probability mass functions (pmf) given as $P^1_0, P^1_1, \ldots, P^1_M$
and $P^2_0, P^2_1, \ldots, P^2_M$, then the pmf of the sum of the two variables
K = K1 + K2 is given by the convolution of the two pmf’s as follows:
\[ \Pr(K_1 + K_2 = k) = P^1_k * P^2_k = \sum_{i=0}^{k} P^1_i \, P^2_{k-i} \; ; \quad 0 \le k \le 2M. \tag{6.11} \]
If K1 and K2 are the numbers of component failures in an infrastructure
system during two different time intervals, then based on the assumptions
made so far in this chapter, $P^1_i$ and $P^2_i$ are zero for i > M. Thus, the
above convolution results in the following formulae for the LNF values
corresponding to the total number of failures over the two time
intervals:
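\[ \Pr(K_1 + K_2 = k) =
\begin{cases}
\sum_{i=0}^{k} P^1_i \, P^2_{k-i} & \text{if } 0 \le k \le M \\
\sum_{i=k-M}^{k} P^1_i \, P^2_{k-i} & \text{if } M < k \le 2M \\
0 & \text{otherwise}
\end{cases} \tag{6.12} \]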
The past failure history information used to predict the LNF values
does not imply any difference between the likelihood of failures occurring
in the immediate next time interval, in the second next time interval, in
the one after that, and so on. Therefore, at each time, it is assumed that the
estimated LNF values apply to all future (single) time intervals. Based on
this assumption, Equation (6.12) can be used to calculate the LNF values
for two time intervals by substituting:
\[ P^1_k = P^2_k = P_k(n+1) \; ; \quad \text{for all } k \in [0, M] \tag{6.13} \]
where Pk(n + 1) is the predicted LNF value for the next single time interval,
given by the three steps listed in Section 6.4. It will be denoted by Pk
henceforward, for short.
[Figure 6.5: plot of Pk estimates for the number of failures per day against number of failures per day (k), for k = 0 to 6]
Figure 6.5: An example of LNF values for the number of failures per day,
used as the basis for computing weekly and monthly LNF values.
By generalising the convolution theorem and Equation (6.11), the LNF
values corresponding to the time period that includes the next L time in-
tervals can be calculated as follows:
\[ \text{LNF for the period of next } L \text{ time intervals for } k \text{ failures} = \underbrace{P_k * P_k * \cdots * P_k}_{(L-1)\ \text{convolutions}} \tag{6.14} \]
In order to study the effect of convolution on the LNF values for longer
periods of time, an example is presented in Figures 6.5–6.7. A typical set
of LNF values estimated using the three steps listed in Section 6.4 for a
single time interval (one day) is shown in Figure 6.5. Using Equation (6.14)
with L = 7, the LNF values for a week are estimated through six successive
convolutions of the daily LNF values, as shown in Figure 6.6. Similarly,
the LNF values are calculated for a one-month period (L = 30) and shown in
Figure 6.7.
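The repeated convolution of Equation (6.14) can be sketched in plain Python as follows. The function names and the three-point daily pmf are assumptions made for illustration; they are not the values plotted in Figure 6.5.

```python
def convolve(p, q):
    """pmf of the sum of two independent counts with pmfs p and q (Equation 6.11)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def lnf_for_period(single_pmf, num_intervals):
    """LNF values for the next L intervals: L - 1 convolutions of the single-interval pmf."""
    if num_intervals < 1:
        raise ValueError("num_intervals must be at least 1")
    result = list(single_pmf)
    for _ in range(num_intervals - 1):
        result = convolve(result, single_pmf)
    return result


# Hypothetical daily LNF values P_0, P_1, P_2 (so M = 2), for illustration only.
daily = [0.5, 0.3, 0.2]
weekly = lnf_for_period(daily, 7)    # six convolutions
monthly = lnf_for_period(daily, 30)  # twenty-nine convolutions

print(len(weekly) - 1)  # largest possible weekly count, 7 * M
print(sum(k * p for k, p in enumerate(weekly)))
```

The support grows from [0, M] to [0, L·M], and the mean of the period pmf is L times the single-interval mean, which gives a quick sanity check on any implementation.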
[Figure 6.6: plot of Pk estimates for the number of failures per week against number of failures per week (k), for k = 0 to 40]
Figure 6.6: LNF values for the number of failures per week, based on the
daily LNF values plotted in Figure 6.5.
[Figure 6.7: plot of Pk estimates for the number of failures per month against number of failures per month (k), for k = 0 to 150]
Figure 6.7: LNF values for the number of failures per month, based on the
daily LNF values plotted in Figure 6.5.
The results of convolution in the above example show that the distribution
of the number of failures during long periods of time (large L)
approaches the normal distribution as L increases. This phenomenon is
expected and is explained by the Central Limit Theorem, which states that the
distribution of the mean (and therefore the sum) of L independent and
identically distributed random variables (such as K1, K2, . . . , KL in the
notation of this thesis) approaches a normal distribution as L increases.
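As a worked note on this limit, and under the same assumption used in Equation (6.13) that the L single-interval counts are independent and share the pmf P_0, ..., P_M, the parameters of the approximating normal distribution follow from standard results for sums of independent variables. The symbols m_1 and σ_1² are notation introduced here for the single-interval mean and variance; they do not appear elsewhere in the thesis.

```latex
m_1 = \sum_{k=0}^{M} k \, P_k , \qquad
\sigma_1^2 = \sum_{k=0}^{M} (k - m_1)^2 \, P_k ,
\\[4pt]
\mathrm{E}\!\left[ K_1 + \cdots + K_L \right] = L \, m_1 , \qquad
\mathrm{Var}\!\left[ K_1 + \cdots + K_L \right] = L \, \sigma_1^2 ,
\\[4pt]
K_1 + \cdots + K_L \;\approx\; \mathcal{N}\!\left( L \, m_1 ,\; L \, \sigma_1^2 \right)
\quad \text{for large } L .
```

The mean therefore grows in proportion to L while the standard deviation grows only as the square root of L, so the period distribution becomes relatively narrower around its mean as L increases.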
6.5.4 A step-by-step algorithm for failure prediction
A step-by-step algorithm for the proposed failure prediction technique is
shown in Figure 6.8. The algorithm is presented in such a way that the
prediction mechanism adapts itself to variations in the failure pattern.
More precisely, the predicted number of failures likely to occur during the
next L time intervals is updated after each time interval. The algorithm
works as explained below:
At the end of the current time interval, the recorded number of failures
(that have actually occurred during that time interval) is given as an input
to the algorithm. If this number is k, a corresponding new inter-failure
time µk is calculated and appended to the database of the recent inter-failure
times as a new record. Such a database would comprise an ensemble
of inter-failure times (µ0’s, µ1’s, ..., and µM’s) recorded during the
history of operation of the infrastructure system.
Using the updated inter-failure times, the next inter-failure times (for
different numbers of failures, k, to occur during a single time interval) are
predicted by using Equation (6.8), and the maximum likelihood estimates of
the LNF values (for a single time interval) are calculated by Equation (6.6).
Then the LNF values corresponding to the number of failures occurring
during the next L time intervals are predicted by using Equation (6.14).
Using the LNF values, the expected number of failures (that will possi-
bly occur during the next L time intervals) and its α–CI are calculated by
Equations (6.9) and (6.10) and displayed as the current prediction.
As the next time interval expires and the actual number of failures during
that time interval is recorded and input to the algorithm, the above
procedure is repeated and all predictions are updated for the coming
time intervals. This regular updating makes the proposed algorithm
adaptive to the non-stationary nature of component failures.
Inputs:
– Maximum number of failures, M, that can occur per each time
interval (e.g. day);
– The number of recent inter-failure times, J, to be used by
the FIR filter to predict next inter-failure times;
– The forgetting factor, λ ∈ [0, 1] for the FIR filter to
predict next inter-failure times;
– The number of time intervals, L in each time period in the
future for which the LNF values, number of failures and
its confidence interval are to be predicted;
– The confidence interval threshold, α ∈ [0, 1]; and
– A dataset containing the number of past failures that occurred in
each time interval for a class of pipes;
Repeat the following steps to update the predictions at the end
of each time interval:
1- Input the number of failures in the most recent time
interval;
2- Update the past inter-failure times, µk’s;
3- For each k ∈ [0,M ], estimate the next inter-failure time,
µk(n + 1), using Equation (6.8);
4- For each k ∈ [0,M ], estimate the LNF values for the next
single time interval, Pk, using Equation (6.6);
5- For each k ∈ [0,M ], estimate the LNF values for the next
time period (L time intervals), using Equation (6.14);
6- Predict the number of failures during the next period of
time (L time intervals), ENOF, using Equation (6.9);
7- Calculate the α-CI of the predicted number of failures,
using Equation (6.10);
8- Display the outputs: The predicted number of failures for
the given time period in the future, ENOF, and its α-CI,
[ENOF − δ, ENOF + δ].
Figure 6.8: A step-by-step algorithm for the proposed non-parametric fail-
ure prediction technique.
6.6 Results of Failure Prediction Using the Proposed Non-Parametric Technique
The proposed technique is applied to predict the failures of water pipes
using the CWW failure database. The database of this study, as explained
in Chapter 3, is provided by City West Water PTY LTD (CWW) that
supplies the western suburbs of Melbourne.
The proposed technique is applied and its performance evaluated for
homogeneous classes of pipes. Thus, pipes are classified and the failure
analysis of this study is performed separately for each group. The recorded
characteristics of the pipes in the database of this study are typical of
what is available in most water distribution systems, and this study aims at
developing practical models that can be tuned for similar kinds of data.
It is important to note that in case of availability of a diverse range
of information in the database and sufficient data points, pipe classifica-
tion can be narrow and specific, resulting in development of more accurate
predictions, using the proposed technique.
As described in Chapter 3, the pipes of existing failure history are clas-
sified based on their size, material, and location. For instance, one class
of pipes includes all Cast Iron Cement Lined (CICL) pipes with 100mm
diameter and located in the postcode area 3021 Melbourne, Victoria, Aus-
tralia. For each class of pipes, the algorithm is run separately and the future
failures are predicted accordingly.
The time interval is set to “one day”. By trial and error, J = 29
and λ = 0.90 are found to be suitable choices for the inter-failure time
prediction by Equation (6.8). More precisely, the 30 most recent NOFk’s are
considered as the most effective data in the process of inter-failure time
prediction, and the forgetting factor in the FIR filter is set to 0.90.
The algorithm described in the previous section and shown in Figure 6.8
requires an initial seed of past inter-failure times for prediction of future
inter-failure times and the corresponding LNF values. For this purpose,
the failure records of the first year (1997) are used as an initial seed and
the LNF values and expected numbers of failures are predicted by the proposed
technique for the following three years, 1998-2000.
In three separate studies, weekly (L = 7), monthly (L = 30) and quarterly
(L = 90) prediction periods are considered. As an indicator for the
reliability of estimations, 80% confidence intervals (α = 0.80) are also
calculated for each ENOF prediction. The results of weekly, monthly and
quarterly failure predictions for the CICL pipes with 100mm diameter, located
in the postcode 3021, are plotted in Figures 6.9, 6.10 and 6.11, respectively.
[Figure 6.9: Number of Failures (0 to 16) by season, Spr.97 to Win.00; series: 80% CI lower bound, 80% CI upper bound, ENOF, True number of failures]
Figure 6.9: Expected number of failures and their 80% confidence intervals
based on weekly updating, for CICL pipes with 100mm diameter located in
postcode 3021.
[Figure 6.10: Number of Failures (0 to 50) by season, Spr.97 to Win.00; series: 80% CI lower bound, 80% CI upper bound, ENOF, True number of failures]
Figure 6.10: Expected number of failures and their 80% confidence intervals
based on monthly updating, for CICL pipes with 100mm diameter located
in postcode 3021.
In contrast to previous prediction techniques in the literature, the proposed
non-parametric method gives confidence intervals for the predicted
number of failures. Thus, a rejection rate is defined below to quantitatively
evaluate the performance of the prediction technique:
In an experiment, the actual numbers of failures that have occurred during
each of N consecutive time periods (in this case study, there are N = 156
weeks, N = 36 months, or N = 12 seasons for the validation data over three
years) are recorded and denoted here by NOF1, ..., NOFN. On the other
hand, these quantities are predicted by some technique (using previous data,
e.g. the failure record of the first year in our case study) and the predicted
values are denoted by ENOF1, ..., ENOFN. For each predicted number
ENOFi, the prediction technique also provides a lower and an upper bound
(confidence interval), respectively denoted by Li and Ui. The rejection rate
of the prediction technique, evaluated by the experiment, is the fraction of
time periods for which the actual number of failures does not fall
within its estimated confidence interval:
\[ \text{rejection rate} \triangleq \frac{\sum_{i=1}^{N} I\left( NOF_i \notin [L_i, U_i] \right)}{N} \tag{6.15} \]
where I is the indicator function:
\[ I(x) = \begin{cases} 1 & \text{if } x \text{ is true} \\ 0 & \text{otherwise} \end{cases} \tag{6.16} \]
The lower the rejection rate, the better the performance of failure prediction.
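The rejection rate of Equation (6.15) amounts to counting the periods whose actual count falls outside the predicted interval. A minimal Python sketch follows; the function name and the four-period sample are hypothetical and serve only to illustrate the calculation.

```python
def rejection_rate(actual, lower, upper):
    """Fraction of periods whose actual count lies outside [L_i, U_i] (Equation 6.15)."""
    if not actual or not (len(actual) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    misses = sum(1 for nof, lo, hi in zip(actual, lower, upper)
                 if not lo <= nof <= hi)
    return misses / len(actual)


# Hypothetical counts and confidence bounds for four periods.
actual = [3, 5, 9, 2]
lower = [1, 4, 5, 3]
upper = [4, 8, 8, 6]
print(f"{rejection_rate(actual, lower, upper):.2%}")
```

As a check on the arithmetic, 8 misses out of 144 periods give 8/144 ≈ 5.56%, which is the weekly figure reported in Table 6.1.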
Table 6.1 shows the rejection rates of the proposed prediction technique
for weekly, monthly and quarterly updating time periods in our case study.
It is observed that the rejection rates increase (and the performance of
prediction decreases) with increasing length of the time period. In other
words, the predicted LNF values and number of failures are highly accurate
for the next day, accurate for the next week, less accurate for the next
month, and far less accurate for the next season.
Table 6.1: Rejection rate for weekly, monthly and quarterly updating
Updating Period              Weekly   Monthly   Quarterly
Total No. of Points          144      36        12
No. of Correct Predictions   136      33        9
Rejection Rate               5.56%    8.33%     25.00%
[Figure 6.11: Number of Failures (10 to 90) by season, Spr.97 to Win.00; series: 80% CI lower bound, 80% CI upper bound, ENOF, True number of failures]
Figure 6.11: Expected number of failures and their 80% confidence intervals
based on quarterly updating, for CICL pipes with 100mm diameter located
in postcode 3021.
The proposed technique and the developed ANN model both aim to
solve the same problem, but with different approaches. The ANN model
benefits from a universal estimation technique that is well known in
reliability estimation of mechanical systems. The technique developed in
this chapter is a non-parametric model that is self-updating.
A simple approach to the failure analysis problem used by some managers
is to predict the number of future failures within the next period of
time by averaging the numbers of failures occurring during some recent time
periods. For comparison purposes, this technique is also applied to predict
the number of failures in each month for the case study in this chapter.
The results are illustrated in Figure 6.12. It is observed that for most of
the months, the true number of failures is within the confidence interval
given by the technique suggested in this chapter and is close to the
predicted number of failures. In contrast, the predicted numbers of failures
given by the averaging technique are mostly outside the confidence interval
and far from the true numbers of failures.
[Figure 6.12: Number of Failures (0 to 50) by season, Spr.97 to Win.00; series: 80% CI lower bound, 80% CI upper bound, ENOF, True number of failures, Predicted by Averaging]
Figure 6.12: Expected number of failures and 80% confidence interval based
on monthly updating, compared to predictions given by simple averaging of
recent records.
In order to quantify this comparison, the Mean Square Error (MSE, the
mean of the squared deviations from the true numbers of failures, as
defined in Chapter 4) of both estimation techniques is calculated. The
prediction error of the proposed technique, MSE=18, is more than
four times smaller than that of the averaging technique, MSE=76.40.
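For reference, the averaging baseline and the MSE comparison can be sketched as follows in Python. The window length, the function names and the monthly counts are all assumptions made for illustration: the text does not state how many recent periods the managers' averaging uses, and the MSE here simply follows the wording above (mean of squared deviations from the true counts).

```python
def mean_square_error(predicted, actual):
    """Mean of the squared deviations of predictions from the true counts."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def moving_average_forecast(history, window):
    """Predict each period as the mean of the `window` periods before it."""
    if window < 1 or window >= len(history):
        raise ValueError("window must be between 1 and len(history) - 1")
    return [sum(history[i - window:i]) / window
            for i in range(window, len(history))]


# Hypothetical monthly failure counts; the window of 3 is an assumed choice.
monthly_counts = [10, 12, 8, 14, 20, 6]
window = 3
baseline = moving_average_forecast(monthly_counts, window)
truth = monthly_counts[window:]
print(baseline)
print(mean_square_error(baseline, truth))
```

The same `mean_square_error` function can be applied to the ENOF predictions of the proposed technique, so that both methods are scored against identical true counts.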
6.7 Conclusions
This chapter presented a non-parametric technique for failure prediction
which is compliant with the non-stationary nature of water pipes failure
processes. This technique is applied to the water pipe failure history of the
database described in Chapter 3 to estimate the expected number of failures
within a given number of time intervals in the future. Furthermore, an 80%
confidence interval is determined for the estimated number of failures.
The proposed technique implicitly considers the gradual variations of the
factors influencing the deterioration process by automatic updating of the
predictions with time. This updating is performed after each time interval
as every time interval corresponds to a new inter-failure time update. More
precisely, in every time interval, one µk is added to the record of inter-failure
times.
In this technique, the problem of prediction of LNF values is turned into
the prediction of inter-failure times. It is mathematically shown that more
accurately predicted inter-failure times correspond to larger LNF value
predictions using Maximum Likelihood estimation. Note that accurate
estimation matters more for large LNF values than for small ones.
Therefore, the proposed technique is highly tolerant to possible
inaccuracies in inter-failure time prediction, and this is a point of
strength for the technique.
On the other hand, it was shown that the accuracy of prediction de-
creases as the time-range of prediction widens. In other words, the pre-
dicted number of failures is more accurate for the next week than the next
month.
A step-by-step algorithm was presented for the proposed non-parametric
technique in Subsection 6.5.4 and Figure 6.8. From the progressive steps de-
vised in this algorithm, it is evident that the technique continuously adapts
its predictions with the most recent changes in the failure trends and pat-
terns. This automatic adaptation is the key reason for its compliance with
the unrecorded environmental time-varying factors that affect the failure
process and make it non-stationary.
An illustrative and quantitative comparison of the accuracy of estimations
performed by the proposed technique and the simple averaging technique was
undertaken in this chapter. The comparison exhibited the satisfactory
performance of failure predictions by the suggested technique. The best
accuracy in predictions, 94.4%, was obtained from weekly updating. It
corresponds to a 5.56% rejection rate from the 80% confidence interval over
a total of 144 points. The rejection rate is 8.33% for monthly updating
over a total of 36 points. This figure is much higher (25%) for quarterly
updating, with a total of 12 points.
It is important to note that the proposed non-parametric technique is
not exclusive to failure prediction for different classes of water pipes. The
method is generic and can be used for prediction of groups of infrastructure
system components which show gradual deterioration over time. The technique
is most useful in capturing the random processes of overall system
degradation in terms of the failure processes of groups of components. It is
not suitable for the prediction of a single component failure or for systems
where performance is sensitive to the behaviour of a single component whose
failure would set off a chain reaction.
Chapter 7
Conclusions and Recommendations for Further Work
7.1 Summary of Study and Achievements
Individual components of infrastructure systems such as water mains are
liable to failure due to environmental interactions and stresses and also
material degradation. For instance, deterioration and degradation of pipes
are inevitable after several decades of service under a vast range of
environmental loads. Gradual degradation of components eventually reduces
their capacity to resist the imposed loads. This situation leads to the
structural failure of components, such as bursts in water pipes.
With increasing customer expectations that form regulatory constraints,
shifting technological paradigms, and changing reporting and accountability
frameworks and requirements, the corporations that operate
infrastructure systems have recognised the need to develop reliable
maintenance strategies to maintain a successful course through these many
demands. More specifically, planners and decision makers of water distribution
systems seek cost-effective strategies to exploit the full extent of the useful
life of the pipes, while meeting customer expectations in terms of service
quality. In a proactive approach towards development of an efficient
strategy for asset management, reliability analysis and failure prediction of
water pipes are vitally required.
During recent decades, many researchers have attempted to measure the
effect of different physical characteristics and environmental specifications
of water pipes on their failure frequency. Chapter 2 of this thesis presents a
comprehensive review of the studies and their approaches towards predicting
the future condition of pipes based on their previous performance.
As reviewed in Chapter 2, a number of physical analyses have been de-
veloped for individual pipes to assess the rate of their deterioration in order
to measure their deterioration characteristics and predict their future condi-
tion. Some researchers have conducted statistical analysis on the history of
water pipes to formulate the relationship of failure frequencies with most of
the factors contributing to the overall structural deterioration of the pipes.
In practice, a complete dataset, components of which are collected on a
regular basis for each pipe of water networks, is very costly and not readily
available. For minor water mains, few data are available and the low cost of
failures does not justify an expensive data acquisition campaign. In this
situation, statistically derived models that can be applied to various levels
of input data are useful for the purpose of failure analysis.
This study has addressed the latter approach that can be applied on
water mains in a cost effective and realistic manner. The objective of this
thesis was to develop new failure prediction techniques for water pipes that
can be used as decision support tools by managers of water distribution
systems (to plan the maintenance/replacement of their water mains). To be
more specific, the study was conducted to improve the existing probabilistic
statistical techniques. Resulting models for describing the technical state of
pipelines are realistic tools for maintenance planning. The models proposed
in this thesis were applied to a failure database provided by City West
Water which supplies potable water to the Western and some inner suburbs
of Melbourne, Australia. This database, similar to failure records in other
water distribution systems, was an incomplete dataset just covering a period
which is only a portion of the total age of metal pipes of a mature water
distribution system.
The failure records available at most water distribution companies are
limited to short time spans (e.g. less than one decade), while
the pipes themselves could be over 100 years old. The techniques developed
in this study can cope with this limitation, predicting pipe reliabilities in
the future using a limited range of break records.
Furthermore, the techniques developed in this study are general and
the procedure of obtaining the models can be used for any set of failure
data for infrastructure components. The results of the proposed modelling
methods can be used to predict the overall structural state of a network
by investigating the reliabilities of different classes (homogeneous groups) of
pipes. By incorporating the data recorded for the whole system, the global
maintenance/replacement effort is facilitated using the estimations obtained
from proposed techniques.
An ANN-based technique was used in this study to produce models for
reliability estimation of classes of pipes. Weibull and lognormal models
were also applied to the same data and the accuracy of the resulting models
was compared both illustratively and quantitatively. Neural network
models proved to be more accurate in fitting a suitable curve to the
failure history and providing accurate predictions for pipe reliabilities in
the future. The quantitative evaluation of performance of estimation systems
developed using the proposed ANN-based technique, the Weibull model,
and the lognormal model was performed through comparison in terms of the
mean square errors of estimation. For all six classes of pipes investigated,
the ANN model produced the highest level of accuracy compared to the
Weibull and lognormal estimates. On average, the mean square errors for
the ANN, Weibull and lognormal models were 0.03, 0.084 and 0.086,
respectively. In order to make the ANN-based prediction techniques easier
to use for other databases, a manual has been developed and is presented in
Appendix B.
In the next step, the ensemble of failures of all pipes in a class was
mathematically studied as a random process. The random process of failures
was demonstrated to be non-stationary because of the time-varying
environmental factors that affect the pipe failure processes.
The non-stationary character of the failure processes was visualised through
the changes in the patterns of failure occurrences in similar seasons of
consecutive years. For this purpose, a probabilistic measure representing the state
of pipe failures over relatively short time intervals was defined. This measure
was used to demonstrate mathematically the deficiency of parametric
models in capturing the non-stationary nature of failure occurrences. Different
patterns of failure occurrences were related to a number of dynamic factors
that affect the process of pipe failures. These time-varying factors were
explained to be specific to each region and to vary considerably from network
to network. For the case study of this thesis, the expansive clay soil of the
area was expected to play an important role in this non-stationarity. This
was confirmed by a histogram of failure occurrences plotted in conjunction
with the rainfall records for the period covered by the database.
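As a rough illustration of comparing failure patterns across similar seasons of consecutive years, the sketch below tallies failures per year and season and reports each season's share of that year's failures. This relative frequency is an assumed stand-in, not the probabilistic measure defined in the thesis, and the failure dates are hypothetical.

```python
from collections import Counter
from datetime import date

# Assumption: southern-hemisphere meteorological seasons, keyed by month number.
SEASON_BY_MONTH = {
    12: "summer", 1: "summer", 2: "summer",
    3: "autumn", 4: "autumn", 5: "autumn",
    6: "winter", 7: "winter", 8: "winter",
    9: "spring", 10: "spring", 11: "spring",
}

def season_year_counts(failure_dates):
    """Count failures per (calendar year, season) pair.

    Simplification: December is counted in its own calendar year rather
    than being merged with the following January and February.
    """
    counts = Counter()
    for d in failure_dates:
        counts[(d.year, SEASON_BY_MONTH[d.month])] += 1
    return counts

def seasonal_shares(counts, year):
    """Return each season's share of the failures recorded in `year`."""
    in_year = {s: n for (y, s), n in counts.items() if y == year}
    total = sum(in_year.values())
    if total == 0:
        return {}
    return {s: n / total for s, n in in_year.items()}

# Hypothetical failure dates for one class of pipes.
failures = [
    date(2001, 1, 15), date(2001, 2, 3), date(2001, 7, 20),
    date(2002, 1, 9), date(2002, 7, 2), date(2002, 7, 28), date(2002, 8, 11),
]

counts = season_year_counts(failures)
for year in (2001, 2002):
    print(year, seasonal_shares(counts, year))
```

If the seasonal shares stayed roughly constant from year to year, a stationary description might be adequate; shares that shift noticeably between years are the kind of pattern the paragraph above attributes to time-varying factors.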
At the next stage of this work, a non-parametric probabilistic technique
was developed. The non-stationary process of pipe failures can be captured
by this technique despite the lack of information about time-variant factors
which is typical of the data available in water distribution systems. The
model produced by the proposed non-parametric method is updated
automatically and therefore takes gradual time-variant factors into account.
The outcome of the model is an estimate of the expected number of failures
over a future period specified by the operator. It is demonstrated
that the accuracy of predictions decreases as the length of the
prediction period increases. The probabilistic basis of the technique also enables it
to provide upper and lower bounds for the estimated number of failures
at a given confidence level (a confidence interval).
For the existing case study, weekly, monthly and quarterly updating were
examined. The best prediction accuracy observed was 94.4%,
obtained with weekly updating. This result was deduced from a 5.56%
rejection rate from the 80% confidence interval over a total of 144
points. The rejection rate was 8.33% for monthly updating over a total of 36
points, and much higher (25%) for quarterly
updating over a total of 12 points.
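The rejection rates quoted above can be checked with a short sketch. The function `rejection_rate` is a hypothetical helper showing how such a rate could be computed from observed counts and interval bounds. The implied numbers of rejected points are inferred here from the reported percentages and are not stated in the thesis.

```python
def rejection_rate(observed, lower, upper):
    """Fraction of observations falling outside their [lower, upper] interval."""
    if not observed or not (len(observed) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    rejected = sum(
        1 for o, lo, hi in zip(observed, lower, upper) if o < lo or o > hi
    )
    return rejected / len(observed)

# Reported figures: (number of points, rejection rate) per updating scheme.
reported = {
    "weekly": (144, 0.0556),
    "monthly": (36, 0.0833),
    "quarterly": (12, 0.25),
}

for scheme, (n_points, rate) in reported.items():
    # Inferred count of points outside the 80% interval (an assumption).
    implied_rejections = round(n_points * rate)
    accuracy = 1.0 - implied_rejections / n_points
    print(f"{scheme}: {implied_rejections} of {n_points} rejected, "
          f"accuracy {accuracy:.1%}")
```

Under this reading, the 94.4% accuracy for weekly updating is simply one minus the rejection rate.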
It is emphasised that the techniques developed in this study are generic
in nature. Although the presented models and simulation results have
only been applied to the CWW database, the techniques can be tuned and applied
to most other water pipe failure databases to produce probabilistic models
for various classes of pipes in those databases. Classes of pipes can also be
defined more narrowly when larger and more complete
databases are available. Indeed, larger and more homogeneous groups will result in more
accurate and reliable estimations.
The non-parametric probabilistic technique proposed in Chapter 6 can
also be applied to the failure history of components of other infrastructure
systems, provided that their failure patterns are smooth and do not
include sharp peaks. Infrastructure systems that can be subject to sudden,
significant changes in failure rate (e.g., severe failing behaviour in steel
infrastructure due to fatigue) cannot be modelled using this technique.
The objectives of this research have been achieved by:
(a) Review of the factors affecting pipe failure and of existing models for
predicting future performance in Chapter 2,
(b) Analysis of a typical failure database and identification of typical lim-
itations of databases in Chapter 3,
(c) Establishment of a neural network model for structural failures of
water reticulation pipes in Chapter 4,
(d) Investigation of the nature of occurrence of failure of water pipes in
terms of its stationarity as a random process in Chapter 5,
(e) Establishment of a non-parametric model that treats
failure occurrences as a non-stationary random process in
Chapter 6,
(f) Development of supporting documentation for the developed model,
for use by water authorities in adapting it to various pipe categories
and data sets, in Appendix B.
This study has resulted in a number of publications that are listed in Ap-
pendix C.
7.2 Recommendations for Further Work
The objectives of this thesis have been achieved with two distinct new
modelling techniques developed and validated. Both overcome a major limitation
of existing pipe systems, namely incomplete failure datasets. However, there is
room for further study and enhancement of the developed techniques.
• The vast range of causes of failure makes accurate prediction of pipe
failures a highly complicated task. As discussed in Chapter 2, some
models accurately predict the future failure rates of individual pipes
by incorporating a long list of environmental and operational parameters,
as well as physical characteristics, into their prediction routines.
However, as explained in Chapter 3, because of data limitations,
this kind of analysis is not commonly applicable. Therefore,
as stated in Section 4.4, the models developed in this study accept
a few structural characteristics of the pipes (pipe age, material,
construction date, diameter and location) as inputs and return the
reliability of a class of pipes (all sharing the given characteristics).
These statistical models carry some level of inaccuracy
for various reasons, e.g. classes of pipes may be too general and
non-uniform behaviour of pipes may be neglected in this classification.
More detailed information on individual pipes
in the failure histories would help to develop more sophisticated models
that incorporate other influencing factors as inputs (in addition to
the ones used in this work). This would enhance the prediction performance
of the models in cases where more details of the failed pipes
are available.
• The proposed techniques do not consider the type of failure (e.g.
longitudinal, circumferential, etc.) in modelling the failure occurrences.
Taking this factor into account in the development of probabilistic models
may result in more realistic and accurate estimations. This factor
may be used to divide the pipes into homogeneous groups if there
are sufficient failure records for each such group in the history.
• The failure analysis in this study is conducted on the basis of
categorising the pipes into various homogeneous groups. The techniques
proposed here produce a model to estimate the failures of each group
separately. Although this method is useful in obtaining practical
results for asset management purposes, the individual performance of pipes
is overshadowed by group behaviour. For instance, a pipe with one
failure in its record is treated the same as one that has experienced
several failures during the same period. It should be mentioned that
this was not an issue for the models developed using the available
database, which did not contain pipes that had experienced more than
five failures within the duration of the failure history. However, this may
cause inaccuracy in modelling large failure histories that include pipes with
significantly high failure rates (i.e. pipes that have
experienced large numbers of failures). To resolve this issue, further data
involving large numbers of failures need to be collected. From these data,
further refinement of the model can be made.
• The whole study is conducted on reticulation pipes. Similar analyses
can also be performed on failures of other components of water
distribution systems, such as valves. Given sufficient failure
data for these components, their failures can also be predicted on
a similar basis. This will serve the goal of assisting the planners of
water distribution systems in providing short- and long-term
maintenance/replacement policies for the entire network.
• The probabilistic techniques developed and presented are generic and
can be used for other failure histories as well. However, they need
to be modified for other databases, and this modification requires an
understanding of the mathematical framework and the underlying
fundamental principles of these techniques as explained throughout the
chapters. Although the manual provides a step-by-step
algorithm and the source code of a MATLAB program for its
implementation, a more user-friendly package, in the form of computer
software, can make the techniques more accessible and easier to use
for the designated purpose. Developing a computer program with
interactive features and the capability of accepting certain inputs and
returning expected outputs was outside the scope of this thesis and
is recommended for future work.
• A practical way to assess the impact of various pipe failure conditions
on the overall operation of water distribution networks can be quite
helpful in interpreting the outcomes of probabilistic failure analysis
methods. It is suggested that the ability to assess the vulnerability of the network to the
failure of any particular class or group of pipes and, more specifically,
to provide a quantitative estimate of the impact on
each nodal demand be added to the manual of the developed
techniques. Such a tool, similar to the solution suggested by
Jowitt and Xu (1993), requires knowledge of the network configuration
and a set of typical operating conditions that might already be
available from a routine network analysis of the intact distribution
network. The results of such a method can be combined with pipe failure
probabilities to provide measures of network reliability that can be used
more conveniently by operators of water distribution systems.
Bibliography
Achim, D., F. Ghotb and K. J. McManus (2007). Application of artificial neural networks for prediction of water pipe asset life. ASCE Journal of Infrastructure Systems.
Ahammed, M. and R. E. Melchers (1994). Reliability of underground pipelines subject to corrosion. ASCE Journal of Transport Engineering Division 120(6), 989–1002.
Andreou, S. (1986). Predictive models for pipe break failures and their implications on maintenance planning strategies for deteriorating water distribution systems. PhD thesis. MIT.
Andreou, S. A., D. H. Marks and R. M. Clark (1987). A new methodology for modelling break failure patterns in deteriorating water distribution systems: Theory. Applications and Advances in Water Resources 10(1), 2–10.
Arnold, G. E. (1960). Experience with main breaks in four large cities - Philadelphia. Journal of American Water Works Association 53(8), 1041–1044.
Ascher, H. and H. Feingold (1984). Repairable systems - Modeling, inference, misconceptions and their causes. Marcel Dekker. New York.
Balakrishnan, A.V. (1995). Introduction to Random Processes in Engineering. 1st ed. John Wiley and Sons. New York.
Balakrishnan, N. and W. W. S. Chen (1999). Handbook of Tables for Order Statistics from Lognormal Distributions with Applications. Kluwer. Amsterdam, Netherlands.
Barraza, N. R., B. Cernuschi and F. Cernuschi (1996). Applications and extensions of the chains-of-rare-events model. Journal of IEEE Transactions on Reliability 45, 417–421.
Barraza, N. R., J. D. Pfefferman, B. Cernuschi and F. Cernuschi (2000). An application of the chains-of-rare-events model to software development failure prediction. In: Proc. of 5th International Conference of Reliable Software Technologies (H. B. Keller and E. Odereder, Eds.). Springer-Verlag. Potsdam, Germany. pp. 185–195.
Bevilacqua, M., M. Braglia and R. Montanari (2003). Classification and regression tree approach for pumps failure rate analysis. Reliability Engineering and System Safety 79(1), 59–67.
Bishop, G. P. and E. R. Bloomfield (2003). Using a log-normal failure rate distribution for worst case bound reliability prediction. In: 14th International Symposium on Software Reliability Engineering (ISSRE03). IEEE. Denver, Colorado.
Borror, C.M., J.B. Kates and D.C. Montgomery (2003). Robustness of the time between events cusum. International Journal of Production Research pp. 3435–3444.
Boxall, J. B., A. O’Hagan, S. Pooladsaz, A. J. Saul and D. M. Unwin (2006). Estimation of burst rates in water distribution mains. ICE, Water Management.
Bras, R. L. and I. Rodriguez (1993). Random Functions and Hydrology. Dover Publications. N.Y., USA. Chapter 1, pp. 1–11.
Bremond, B. (1997). Statistical modelling as help in network renewal decision. In: Diagnostics of Urban Infrastructure, European Commission Co-operation on Science and Technology (COST). Paris, France.
Butler, M. and J. West (1987). Leakage prevention and system renewal. In: Pipeline Management Seminar, Pipeline Industries Guild. U.K.
Carvalho, H.S., P.C. Nascimento, A.P. Alves da Silva, J.C.S. Souza, M.T. Schilling and M.B. Do Couto Filho (1999). Neural networks based approach for reliability estimation. In: IEEE International Conference on Electric Power Engineering. Budapest. p. 181.
Chambers, G. L. (1983). Analysis of Winnipeg’s water main failure problem. Technical report. City of Winnipeg works and operations division.
Ciottoni, A. (1983). Computerized data management in determining causes of water main breaks: Philadelphia case study. In: 1983 Int. Symp. on Urban Hydrology, Hydraulics and Sediment Control. Univ. of Kentucky, Lexington, Kentucky. pp. 323–329.
Clark, R. M. and J. A. Goodrich (1988). Developing a data base on infrastructure needs. International Journal of American Water Works Association 81(7), 81–87.
Clark, R. M. and J. A. Goodrich (1989). Developing a data base on infrastructure needs. Journal of American Water Works Association 81(7), 81–87.
Clark, R. M., C. L. Stafford and J. A. Goodrich (1982). Water distribution systems: A spatial and cost evaluation. ASCE Journal of Water Resources Planning and Management Division 108(3), 243–256.
Constantine, A. G. and J. N. Darroch (1993). Pipeline reliability: stochastic models in engineering technology and management. World Scientific Publishing Co.
Constantine, A. G., J. N. Darroch and R. Miller (1996). Predicting underground pipe failure. Australian Water Works Association.
Constantine, G., R. Miller and J. Darroch (1998). Prediction of pipeline failures from incomplete data. Technical Report 145. Urban Water Research Association of Australia.
Cooke, R. and E. Jager (1998). A probabilistic model for the failure frequency of underground gas pipelines. Journal of risk analysis.
Cox, D. R. and D. Oakes (1984). Analysis of survival data. Chapman and Hall Ltd. London.
Cox, D.R. (1972). Regression models and life-tables (with discussion). Journal of the Royal Statistical Society.
Crow, E. L., Ed. (1988). Lognormal Distributions: Theory and Applications. Dekker. New York.
Crowder, M. J., A. C. Kimber, R. L. Smith and T. J. Sweeting (1994). Statistical Analysis of Reliability Data. Chapman and Hall. London.
Cullinane, M. J. (1986). Hydraulic reliability of urban water distribution systems. In: ASCE Conference on Water Forum 86: World water issues in evolution (M. Karamouz, G. R. Baumli and W. J. Brick, Eds.). Long Beach, California. pp. 1264–1271.
Deb, A.K. (1998). Quantifying future rehabilitation and replacement needs of water mains. In: AWWA Research. Denver.
Doleac, M. L., S. L. Lackey and G. Bratton (1980). Prediction of time-to-failure for buried cast-iron pipe. In: AWWA 1980 Annual Conference. Denver.
Dyachkov, A. (1994). Rehabilitation of the water distribution in the city of Moscow. In: Water Supply Congress, International Water Supply Association. Vol. 12. Zurich. pp. 89–94.
Eisenbeis, P. (1994). Modelisation statistique de la provision des faillances sur les conduites deau potable. PhD thesis. University Louis Pasteur.
Eisenbeis, P. (1997). Estimating the aging of a water mains network with the aid of a record of past failures. In: Deterioration of Built Environment: Buildings, Roads and Water Systems. Norwegian University of Science and Technology. pp. 125–133.
Eisenbeis, P. (1999). Estimating the ageing of a water mains network with the aid of a record of past failures. In: Deterioration of Built Environment: Buildings, Roads and Water Systems. Norwegian University of Science and Technology. pp. 125–133.
Eisenbeis, P., P. Gauffre and S. Sgrov (2000). Water infrastructure management: An overview of European models and databases. In: AWWARF Infrastructure Conference. Baltimore, Maryland.
Finnemore, E. J. and J. B. Franzini (2002). Fluid Mechanics. 10th ed. McGraw Hill.
Fitzgerald, J. H. (1960). Corrosion as a primary cause of cast iron main breaks. Journal of American Water Works Association 60(8), 882–897.
Goel, A. L. and K. Okumoto (1979). Time-dependent error-detection rate model for software reliability and other performance measures. Journal of IEEE Transactions Reliability 28, 206–211.
Goldthwaitel, L. R. (1976). Failure rate study for lognormal lifetime model. Bell laboratories, Inc. New York.
Goodrich, J. A. (1986). Drinking water distribution system reliability: A case study. In: Water Forum 86: World Water Issues in Evolution (M. Karamouz, G. R. Baumli and W. J. Brick, Eds.). ASCE. New York. pp. 1256–1263.
Goulter, I. C. (1987). Current and future use of systems analysis in water distribution network design. Journal of Civil Engineering Systems.
Goulter, I. C. (1990). Reliability-constrained pipe network model. Journal of hydraulic engineering.
Goulter, I. C. (1992). System analysis in water-distribution network design: from theory to practice. Journal of water resources and management.
Goulter, I. C. and A. V. Coals (1986). Quantitative approaches to reliability assessment in pipe networks. ASCE Journal of Transportation Engineering 112(3), 287–301.
Goulter, I. C. and F. Bouchart (1987). Joint consideration of pipe breakage and pipe flow probabilities. In: ASCE 1987 National Conference on Hydraulic Engineering (R. M. Ragan, Ed.). Williamsburg, Virginia. pp. 469–474.
Goulter, I., J. Davidson and P. Jacobs (1993). Predicting water main breakage rates. ASCE Journal of Water Resources Planning and Management 119(4), 419–436.
Goulter, I.C. and A.F. Kazemi (1988). Spatial and temporal groupings of water main pipe breakage in Winnipeg. Canadian Journal of Civil Engineering 15(1), 91–97.
Guan, X. (1995). Condition and Replacement of Regina's Water Distribution System. M.Sc. thesis. University of Regina.
Gupta, R. and P. R. Bhave (1994). Reliability analysis of water-distribution systems. Journal of Environmental Engineering.
Gustafson, J. M. and D. V. Clancy (1999). Modeling the occurrence of breaks in cast iron water mains using methods of survival analysis. In: Proceedings of AWWA Annual Conf. American Water Works Association. AWWA. Denver, USA.
Habibian, A. (1992). Developing and utilizing data bases for water main rehabilitation. Journal of American Water Works Association 84(7), 75–79.
Han, Y. L. and SH. Dai (1996). Artificial neural network method for flawed pipe failure evaluation: probabilistic model. International Journal of vessels and piping 68, 203–207.
Hariga, M. A. (1996). Maintenance inspection model for a single machine with general failure distribution. Microelectronics and Reliability.
Hartman, W.F. and K. Karlson (2002). Condition assessment of water mains using remote field technology. In: International Conference of Infrastructures 2002. Montreal, Quebec, Canada.
Herbert, H. (1994). Technical and economic criteria determining the rehabilitation and/or renewal of drinking water pipelines. International Journal of Water Supply 12(3-4), 105–118.
Hertz, J., A. Krogh and R. G. Palmer (1991). Introduction to the theory of neural computation. Santa Fe Institute.
Herz, R. (1996). Ageing processes and rehabilitation needs of drinking water distribution networks. Journal of Water Supply Research and Technology-Aqua 45(5), 221–231.
Herz, R. (1997). Rehabilitation of water mains and sewers. Water-Saving Strategies in Urban Renewal - European Approaches. European Academy of the Urban Environment.
Herz, R. (1998). Exploring rehabilitation needs and strategies for drinking water distribution networks. In: First IWSA/AISE International Conference on Master Plans for Water Utilities. Prague.
Hobbs, B. F. (1985). Computer applications in water resources, reliability assessment of urban water supply. In: ASCE Speciality Conf. Buffalo. New York. pp. 1229–1238.
Hobbs, B. F. and G. K. Beim (1986). Verification of water supply reliability model. In: Proc. ASCE Conf. Water Forum, 86: World Water Issues in Evolution. Buffalo. New York. pp. 1218–1229.
Hossain, S. A. and R. C. Dahiya (1993). Estimating the parameters of a non-homogeneous Poisson-process model for software reliability. Journal of IEEE Transaction on Reliability 42, 604–612.
Hoyland, A. and M. Rausand (1994). System Reliability Theory: Models and Statistical Methods. John Wiley and Sons, Inc. New York.
Hu, Y. and D.W. Hubble (2005). Failure conditions of asbestos cement water mains in Regina. In: 1st CSCE Specialty Conference on Infrastructure Technologies, Management and Policy. Toronto, Ontario, Canada.
Jackson, R. Z., C. Pitt and R. Skabo (1992). Non-destructive testing of water mains for physical integrity. In: Annual Conference of American Water Works Association. AWWA Research Foundation. Denver, USA. p. 109.
Jacobs, P. and B. Karney (1994). GIS development with application to cast iron water main breakage rate. In: 2nd Int. Conf. on Water Pipeline Systems. BHR Group Ltd. Edinburgh, Scotland.
Jarvis, B. (1998). Asbestos-cement pipe corrosion interim report. Technical report. Customer Services Division, Water Corporation.
Jowitt, P.W. and C. Xu (1993). Predicting pipe failure effects in water distribution networks. ASCE Journal of Water Resources Planning and Management 119(1), 18–31.
Kaara, A.F. (1984). A decision support model for the investment planning of the reconstruction and rehabilitation of mature water distribution systems. PhD thesis. MIT.
Karaa, F. A. and D. H. Marks (1990). Performance of water distribution networks: Integrated approach. ASCE Journal of Performance of Constructed Facilities 4(1), 51–67.
Kartalopoulos, S. V. (1996). Understanding Neural Networks and Fuzzy Logic: Concepts and Applications. IEEE Press.
Kelly, D. and D. O’Day (1982). Organizing and analyzing leak and break data for making replacement decisions. Journal of American Water Works Association 74(11), 589–594.
Kettler, A.J. and I.C. Goulter (1985a). Analysis of pipe breakage in urban water distribution networks. In: Canadian Journal of Civil Engineering. Vol. 12. pp. 286–293.
Kettler, A.J. and I.C. Goulter (1985b). Analysis of pipe breakage in urban water distribution networks. In: Canadian Journal of Civil Engineering. Vol. 12. pp. 286–293.
Kirmeyer, G. J., W. Richards and C. D. Smith (1994). An assessment of water distribution systems and associated research needs. In: AWWA Research Federation. Denver.
Kleiner, Y. and B. Rajani (1999). Using limited data to assess future needs. Journal of Water Supply Research and Technology-Aqua 91(7), 47–61.
Kleiner, Y. and B. Rajani (2001). Comprehensive review of structural deterioration of water mains: Statistical models. Urban Water 3(3), 131–150.
Kleiner, Y. and B. Rajani (2002). Forecasting variations and trends in water-main breaks. ASCE Journal of infrastructure systems 8(44), 122–131.
Kleiner, Y. and B. Rajani (2003). Water main assets: from deterioration to renewal. In: AWWA Annual Conference and Exposition, Catch the Wave. AWWA. Anaheim, CA, USA. pp. 1–12.
Kleiner, Y., O. Hunaidi and D. Krys (2005). Failures in Gray Cast Iron Distribution Pipes. Vol. 2006. National Research Council Canada, Institute for Research in Construction.
Kolmogorov, A.N. (1941). American Mathematical Society Translations 1958. Vol. 8.
Kulkarni, R. B., K. Golabi and J. Chuang (1986). Analytical techniques for selection of repair-or-replace options for cast iron gas piping systems phase I. Technical Report II. Gas research institute.
Kumar, A., E. Meronkly and E. Segan (1984). Development of concepts for corrosion assessment and evaluation of underground pipelines. Technical Report ll. U.S. Army Corps of Engineers.
Lackington, D. W. (1991). Leakage control, reliability and quality of supply. International Journal of Civil Engineering Systems 8, Civil Engineering Systems.
Lambert, A. O. (1998). A realistic basis for objective international comparisons of real losses from public water supply systems. In: The Institute of Civil Engineers Conf., Water Environment 98 - Maintaining the Flow. London.
Le Gat, Y. (1999). Forecasting pipe failures in drinking water network using stochastic processes models - respective relevance of renewal and Poisson processes. In: 13th Conference of EJSW. Dresden University of Technology. Montreal.
Lee, J. A., D. P. Almond and B. Harris (1999). The use of neural networks for the prediction of fatigue lives of composite materials. Composites Part A: Applied Science and Manufacturing 30(10), 1159–1169.
Lee, O. S. and H. Kim (2004). Estimation of failure probability using boundary conditions of failure pressure model of buried pipe lines. Journal of Key Engineering Materials pp. 270–273.
Lei, J. (1997). Statistical approach for describing lifetimes of water mains - case Trondheim municipality. Technical report. Trondheim Municipality.
Lei, J. and S. Sgrov (1998). Statistical approach for describing lifetimes of water mains. Journal of Water Science and Technology 38(6), 209–217.
Leighton, T. F. and R. L. Rivest (1986). Estimating a probability using finite memory. Journal of IEEE Transactions of Information Theory 32(6), 733–742.
Leshno, M., V. Y. Lin, A. Pinkus and S. Schocken (1993). Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Networks 6(6), 861–867.
Li, D. and Y. Haimes (1992). Optimal maintenance-related decision-making for deteriorating water distribution-systems 1. semi-Markovian model for a water main. Water Resources Research 28, 1053–1061.
Lillie, K., C. Reed, M.A.R. Rodgers, S. Daniels and D. Smart (2004). In: Workshop on Condition assessment devices for water transmission mains. AWWA Research Foundation. Denver, Colo.
Lyu, M. R. (1996). Handbook of Software Reliability Engineering. McGraw Hill. New York.
Maglionico, M. and R. Ugarelli (2004). Reliability of a water supply system in quantity and quality terms. In: 19th European Junior Scientist Workshop on Process data and integrated urban water modelling. Lyon, France.
Makar, J. M. (1999). Failure analysis for grey cast-iron water pipes. In: 1999 AWWA Distribution System Symp. American Water Works Association. Denver.
Makar, J. M., R. Desnoyers and S. E. McDonald (2001). Failure modes and mechanisms in grey cast-iron pipes. In: International Conference on Underground Infrastructure Research.
Makar, J.M. and Y. Kleiner (2000). Maintaining water pipeline integrity. In: AWWA Infrastructure Conference. Vol. 1. Baltimore, Maryland.
Malandain, J., P. Gauffre and M. Miramond (1998). Organising a decision support system for infrastructure maintenance: Application to water supply systems. In: 1st International Conference on new Information technologies for decision Making in Civil Engineering. Montreal. pp. 1013–1025.
Malandain, J., P. Gauffre and M. Miramond (1999). Modeling the aging of water infrastructure. In: 13th Conference of EJSW.
Male, J. W., T. M. Walski and A. H. Slutsky (1988). Analysis of New York City’s water main replacement policy. In: Conference on Pipeline Infrastructure (B. A. Bennett, Ed.). ASCE. New York. pp. 306–312.
Male, J. W., T. M. Walski and A. H. Slutsky (1990). Analyzing water main replacement policies. ASCE Journal of Water Resources Planning and Management 116(3), 362–374.
Mann, A. (1997). The identification of ideal locations of moisture barriers. Masters thesis. Swinburne University of Technology.
Marks, H. D. (1985). Predicting urban water distribution maintenance strategies: A case study of New Haven, Connecticut. Technical report. US Environmental Protection Agency.
Marks, H. D., S. Andreou, Jeffrey L., C. Park and A. Zaslavski (1987). Statistical models for water main failures.
Marshall, G.P. (2002). The residual structural properties of cast iron pipes structural and design criteria for linings for water mains. Technical Report 01/VVM/02/14. UK Water Improvement Report.
Mavin, K. (1996). Predicting the failure performance of individual water mains. Technical report. Urban Water Research Association of Australia.
Mays, L. W. (1996). Review of reliability analysis of water distribution systems. In: Stochastic Hydraulics (A. A. Balkem, Ed.). Rotterdam, Netherlands. pp. 53–62.
Mays, L. W. (2000). Water Distribution System Handbook. McGraw Hill. New York.
Mays, L. W., N. Duan and Y. C. Sun (1986). Modelling reliability in water distribution network design. In: ASCE Conference on Water Forum 86: world water issues in evolution (M. Karamouz, G. R. Baumli and W. J. Brick, Eds.). Buffalo, New York.
McCulloch, W. S. and W. H. Pitts (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5, 115–133.
McMullen, L. D. (1982). Advanced concepts in soil evaluation for exterior pipeline corrosion. In: AWWA Annual Conference. Miami.
Metropolis, N. and S. Ulam (1949). The Monte Carlo method. Journal of American Statistics Association 44, 335–341.
Miller, A. M. (1980). A study of the reliability model. Master’s thesis. University of Maryland.
Minsky, M. and S. Papert (1969). Perceptrons. MIT Press. Cambridge, MA.
Moon, Y.B., C.K. Divers and H.J. Kim (1998). Aews: an integrated knowledge-based system with neural networks for reliability prediction. Computers in Industry 35, 101–108.
Mordak, J. and J. Wheeler (1988). Deterioration of asbestos cement water mains. Technical report. Final Report to the Department of the Environment, Water Research Center.
Morris, R. E. (1975). The Distribution System. Manual of water utility operations. Austin the Association. Texas.
Morris, R.E. (1967). Principal causes and remedies of water main breaks. Journal of American Water Works Association pp. 782–798.
Musa, J. D., A. Iannino and K. Okumoto (1987). Software Reliability: Measurement, Prediction, Application. McGraw-Hill. New York, USA.
NACE (1984). Corrosion Basics An Introduction. National Association of Corrosion Engineers.
Nebesar, B. (1983). Asbestos/cement pipe corrosion: Part 2. Technical Report 83-17E. Canada Centre for Mineral and Energy Technology, Energy, Mines and Resources Canada.
Nelson, J. D. and D. J. Miller (1992). Expansive soils: problems and practice in foundation and pavement engineering. John Wiley and Sons, Inc. New York.
Niemeyer, H. W. (1960). Experience with main breaks in four large cities - Indianapolis. Journal of American Water Works Association 52(8), 1051–1058.
Norin, M. and T. G. Vinka (2005). Corrosion of carbon steel and zinc in filling materials in an urban environment. Technical report. Swedish Corrosion Institute.
Northcote, K.H. (1979). A factual key for the recognition of Australian soils. Adelaide, South Australia. Rellim Technical Publications Pty. Ltd.
O’Day, D. K. (1982). Organizing and analysing leak and break data for making main replacement decisions. Journal of American Water Works Association 74(11), 589–596.
O’Day, D. K. (1983). Analyzing infrastructure conditions–a practical approach. ASCE Journal of Civil Engineering 53(4), 39–42.
O’Day, D. K., C. M. Fox and G. M. Huguet (1980). Ageing urban water systems: A computerized case study. Public Works 111(8), 61–64.
O’Day, K. (1989). External corrosion in distribution systems. Journal of Water Works Association (AWWA) 81(10), 45–52.
Olliff, J. and S. Rolfe (2002). Condition assessment: The essential basis for best rehabilitation practice. In: No-Dig 2002. Copenhagen.
Ostfeld, A. and U. Shamir (1996). Design of optimal reliable multi-quality water-supply systems. ASCE Journal of Water Resources Planning and Management 119(1), 83–98.
Pascal, O. and D. Revol (1994). Renovation of water supply systems. In: Water Supply Congress, International Water Supply Association. Vol. 12. Budapest. pp. 6–7.
Peabody, A.W. (1967). Control of pipeline corrosion. National Association of Corrosion Engineers.
Peabody, A.W. (2001). Control of pipeline corrosion. National Association of Corrosion Engineers.
Pelletier, G., A. Mailhot and J. P. Villeneuve (2003). Modeling water pipe breaks-three case studies. ASCE Journal of Water Resources Planning and Management 129(2), 115–123.
Price, D. and J. Sutton (1988). Technology in Australia 1788-1988. Australian Science and Technology Heritage Centre.
Rajani, B. (1995). Repair or replace? IRC studies corroded water mains.
Rajani, B. and J. M. Makar (2000a). A methodology to estimate remaining service life of grey cast iron water mains. Canadian Journal of Civil Engineering 27, 1259–1272.
Rajani, B. and J. Makar (2000b). A methodology to estimate remaining service life of gray cast iron water mains. Canadian Journal of Civil Engineering 27, 1259–1272.
Rajani, B. and Y. Kleiner (2001). Comprehensive review of structural deterioration of water mains: Physically based models. Urban Water 3, 151–164.
Rajani, B. and S. Tesfamariam (2004). Uncoupled axial, flexural, and circumferential pipe-soil interaction analyses of jointed water mains. Canadian Geotechnical Journal 41, 997–1010.
Rajani, B. and S. Tesfamariam (2005). Estimating time to failure of ageing cast iron water mains under uncertainties. In: International Conference of Computing and Control in the Water Industry (CCWI2005). Exeter, Devon, UK.
Rajani, B. and Y. Kleiner (2003). Protecting ductile-iron water mains: What protection method works best for what soil condition. Journal of American Water Works Association (AWWA) 95(11), 110–125.
Rajani, B. and Y. Kleiner (2004). Non-destructive inspection techniques to determine structural distress indicators in water mains. In: Workshop on Evaluation and Control of Water Loss in Urban Water Networks. Valencia, Spain. pp. 1–20.
Rajani, B., C. Zhan and S. Kuraoka (1996). Pipe-soil interaction analysis of jointed watermains. Canadian Geotechnical Journal 33(3), 393–404.
Rastad, C. (1995). Nordic experiences with water pipeline systems. In: 3rd International Conference, Sector C- Pipe materials and handling. CEOCOR Praha.
Rausand, M. and R. Reinertsen (1996). Failure mechanisms and life models. International Journal of Reliability, Quality and Safety Engineering 3(3), 137–152.
Redfearn, J.C.B. (1984). Transverse mercator formulae. Empire Survey Review.
Remus, G. J. (1960). Experience with main breaks in four large cities-detroit. Journal of American Water Works Association 52(8), 1048–1051.
Righetti (2001). Cast iron condition assessment study, recognition systems ltd. Technical report. City West Water.
Rosenblatt, F. (1962). A comparison of several perceptron models. Self-Organizing Systems. Spartan Books. Washington, DC.
Rostum, J. (1997). The concept of business risk used for rehabilitation of water networks. In: 10th EJSW. Tautra, Norway.
Rumelhart, D. E. and J. L. McClelland (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 1. MIT Press. Cambridge, MA.
Sacluti, F. (1999). Modeling water distribution pipe failures using artificial neural networks. Master’s thesis. Department of Civil and Environment Engineering. University of Alberta, Canada.
Sacluti, F., S.J. Stanley and Q. Zhang (1999). Use of artificial neural networks to predict water distribution pipe breaks. In: 51st Annual Conference of the Western Canada Water and Wastewater Association. Saskatoon, Canada. p. 12.
Sahinoglu, M. (1992). Compound-poisson software reliability model. IEEE Transactions on Software Engineering 18, 624–630.
Sægrov, S., J.F. Melo Baptista, P. Conroy, R.K. Herz, P. LeGauffre, G. Moss, J.E. Oddevald, B. Rajani and M. Schiatti (1999). Rehabilitation of water networks: Survey of research needs and on-going efforts. Journal of Urban Water.
Shamir, U. and C. Howard (1979). Analytic approach to scheduling pipe replacement. International Journal of American Water Works Association 71(5), 248–258.
Shamir, U. and C.D. Howard (1985). Reliability and risk assessment for water supply systems. In: Computer Applications in Water Resources. Buffalo. New York. pp. 1218–1228.
Sheskin, D. (2003). The handbook of parametric and non-parametric statistical procedures, Mathematics. CRC Press. New York.
Skipworth, P., M. Engelhardt, A. Cashman, D. Savic, A. Saul and G. Walters (2002). Whole life costing for water distribution network management. Thomas Telford Publishing. London.
Stark, H. (1994). Probability, random processes, and estimation theory for engineers. 2nd ed. Prentice Hall. Englewood Cliffs, NJ.
Sukert, A. N. (1976). A software reliability modeling study. Technical Report RADC-TR-76-247. Rome Air Development Centre.
Sukert, A. N. (1979). Empirical validation of three error prediction models. Journal of IEEE Transaction on Reliability 28, 199–205.
Sullivan, J. P. (1982). Maintaining ageing systems-boston’s approach. Journal of American Water Works Association 74(11), 555–559.
Tarassenko, L. (1998). A guide to neural computing applications. First ed. John Wiley and Sons. page 17.
Tesfamariam, S., B. Rajani and R. Sadiq (2006). Possibilistic approach for consideration of uncertainties to estimate structural capacity of ageing cast iron water mains. Canadian Journal of Civil Engineering pp. 1050–1064.
The Australian Geodetic Datum Technical Manual. Special Publication (1986).
Vorenhout, M., H. G. van der Geest, D. van Marum, K. Wattel and H. J. Eijsackers (2004). Automated and continuous redox potential measurements in soil. Journal of Environment Quality 33, 1562–1567.
Walski, T. M. and A. Pelliccia (1982). Economic analysis of water main breaks. Journal of American Water Works Association (AWWA) 74(3), 140–147.
Walters, G. (1988). Optimal design of pipe networks: a review. In: International Conference on Computer and Water Resources. Vol. 2. Computational Mechanics Publication. Southampton, U.K. pp. 21–31.
Water Main Renewal Study (1991). Technical report. Melbourne Water. Melbourne.
Water Reticulation Asset Status Report (1997). Technical report. City West Water. Melbourne. Pipe Structural Performance for the period July 1994 to June 1997.
Watson, G. A. and Iain S. Duff (1997). The state of the art in numerical analysis. The Institute of Mathematics and Its Applications. Oxford: Clarendon Press. London, England.
Weibull, A. W. (1951). Statistical distribution of wide applicability. Journal of Applied Mechanics 18, 292–297.
Widrow, B. and M. E. Hoff (1960). Adaptive switching circuits. 1960 IRE WESCON Convention Record pp. 96–104.
Williams, S., R.G. Ainsworth and A.F. Elvidge (1984). A Method of Assessing the Corrosivity of Waters Towards Iron. WRc. Swindon.
Wood, A. (1996a). Predicting software reliability. Journal of IEEE on Computer Sciences 29, 69–77.
Wood, A. (1996b). Software reliability growth models. Technical Report 96.1. Tandem Tech. Germany.
WSAA facts’99 (1999). Technical report. Water Service Association of Australia (WSAA). Melbourne.
Xu, C. and C. Goulter (1998a). Probabilistic model for water distribution reliability. ASCE Journal of Water Resources Planning and Management 124(4), 218.
Xu, C. and I.C. Goulter (1999). Reliability-based optimal design of water distribution networks. ASCE Journal of Water Resources Planning and Management 125(6), 352–362.
Xu, C. and R.S. Powell (1991). Water supply system reliability - concepts and measures. Journal of Civil Engineering Systems 8(4), 191–195.
Xu, C., I.C. Goulter and K.S. Tickle (2003). Assessing the capacity reliability of ageing water distribution systems. Journal of Civil Engineering and Environmental Systems 20(2), 119–133.
Xu, Ch. and I. Goulter (1998b). Probabilistic model for water distribution reliability. Journal of Water Resources Planning and Management Division, ASCE.
Appendices
Appendix A
An Introduction to Artificial
Neural Networks
A.1 Introduction
A neural network can be defined as a model of reasoning based on what
occurs in the human brain. The brain consists of a densely interconnected
set of nerve cells, or basic information-processing units, called neurons.
Learning is a fundamental and essential characteristic of biological neural
networks. By using multiple neurons simultaneously, the brain can perform
its functions much faster than the fastest computers in existence today.
Each neuron has a very simple structure, but an army of such elements
constitutes a tremendous processing power. The ease with which they can
learn led to attempts to emulate a biological neural network in a computer.
The first computational model of an artificial neuron was presented by
McCulloch and Pitts (1943). The McCulloch-Pitts neuron was one of the
first attempts to capture the essential properties of a real neuron. The
inputs and outputs of a McCulloch-Pitts neuron model are binary
(exclusively ones or zeros); the nodes produce only binary results. There
were no weights, and the activation function was always the unit step
function.
A.2 McCulloch-Pitts Neurons
A McCulloch-Pitts neuron consists of a set of n excitatory inputs, $X_i$;
a set of m inhibitory inputs, $X_{n+j}$; a threshold, u; a unit step
activation function; and a single neuron output, Y. The neuron computes
the sum of the input signals and compares the result with the threshold
value, u. If the net input is less than the threshold, the neuron output is
“OFF” (e.g. 0); otherwise, the neuron becomes activated and its output
attains the value “ON” (e.g. +1).
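The behaviour described above can be illustrated with a minimal MATLAB sketch. The input values and the threshold are hypothetical, and the treatment of the inhibitory inputs (any active inhibitory input forces the output to 0, which is the classical McCulloch-Pitts convention) is an assumption, since the text does not spell it out.

```matlab
% Minimal sketch of a McCulloch-Pitts neuron (all values hypothetical).
x_exc = [1 1 0];   % n = 3 binary excitatory inputs
x_inh = 0;         % m = 1 binary inhibitory input
u     = 2;         % threshold

net = sum(x_exc);              % unweighted sum of the excitatory inputs
if any(x_inh)
    Y = 0;                     % assumption: an active inhibitory input vetoes firing
else
    Y = double(net >= u);      % unit step: ON (1) if the net input reaches u
end
disp(Y)
```

With two of the three excitatory inputs active and no inhibition, the net input equals the threshold and the neuron switches ON.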
The next step in the development of artificial neural networks was the
introduction of the delta learning rule by Widrow and Hoff (1960), which
brought about the concept of perceptron networks (which they called
“ADALINE”, standing for adaptive linear neuron). The perceptron is the
simplest form of a neural network. It consists of a single neuron with
adjustable synaptic weights and a hard limiter.
The adaptive linear combiner outputs a linear combination of its inputs.
This element receives an input vector
$X_k = [X_{0k}\; X_{1k}\; \ldots\; X_{nk}]^\top$, where $^\top$ denotes
“transpose”. The components of $X_k$ may be continuous or binary values.
The components of the input vector are weighted by a set of coefficients,
the weight vector $W_k = [W_{0k}\; W_{1k}\; \ldots\; W_{nk}]^\top$. The
weights are continuous values that may be positive or negative. The sum of
the weighted inputs is then computed, producing a linear output:
\[
s_k = X_k^\top W_k \tag{A.1}
\]
A.3 Linear Neuron Models
The adaptive linear neuron, or ADALINE, is an adaptive threshold logic
element. It consists of an adaptive linear combiner cascaded with a
hard-limiting quantizer that is used to produce a binary ±1 output,
$Y_k = \mathrm{sign}(s_k)$. A bias weight, $W_{0k}$, which is connected to
a constant input, $X_0 = 1$, controls the threshold level of the quantizer.
Such an element may be seen as a McCulloch-Pitts neuron augmented
with a learning rule for adjusting its weights. In single-element neural net-
works, the weights are often trained to classify binary patterns using binary
desired responses. After training the ADALINE, if it responds correctly to
input patterns that were not included in the training set, it is said that
generalisation has taken place.
Learning and generalisation are among the most useful attributes of
ADALINEs and neural networks in general. With n binary inputs and
one binary output, there are $2^n$ possible input patterns. A general logic
implementation would be capable of classifying each pattern as either +1 or
−1, in accordance with the desired response. Thus, there are $2^{2^n}$
possible logic functions connecting n inputs to a single binary output. A
single ADALINE is capable of realising only a subset of these functions,
known as the linearly separable logic functions or threshold logic functions.
These are the set of logic functions that can be obtained with all possible
weight variations. With two inputs, the only two functions an ADALINE
cannot learn are exclusive OR and exclusive NOR, so 14 of the 16 possible
functions are realisable. With many inputs, however, only a small fraction
of all possible logic functions are realisable, i.e., linearly separable.
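The limitation can be illustrated with a short MATLAB sketch of a two-input ADALINE, trained once on AND and once on exclusive OR. The update used here is a batch form of the delta (LMS) rule, chosen so that the result is deterministic; Widrow and Hoff's rule is normally applied one pattern at a time. The bipolar input coding, the step size and the number of epochs are assumptions made for illustration.

```matlab
% Sketch: two-input ADALINE with a batch form of the delta (LMS) rule.
X = [1 -1 -1; 1 -1 1; 1 1 -1; 1 1 1];   % rows: [X0 = 1 (bias), X1, X2]
d_and = [-1; -1; -1;  1];               % desired responses for AND
d_xor = [-1;  1;  1; -1];               % desired responses for exclusive OR
mu = 0.5;                               % step size (assumed)

targets = [d_and, d_xor];
n_wrong = zeros(1, 2);
for t = 1:2
    d = targets(:, t);
    w = zeros(3, 1);                    % weight vector [W0; W1; W2]
    for epoch = 1:100
        s = X * w;                      % linear combiner output, equation (A.1)
        w = w + mu * X' * (d - s) / 4;  % reduce the mean squared error
    end
    y = sign(X * w);                    % hard-limiting quantizer
    n_wrong(t) = sum(y ~= d);           % number of misclassified patterns
end
disp(n_wrong)
```

For AND the weights settle near [-0.5; 0.5; 0.5] and all four patterns are classified correctly. For exclusive OR the error gradient is zero at the starting point, the weights never move, and no pattern is classified correctly.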
In the course of the development of artificial neural networks, MADALINE
(multiple adaptive linear neuron) was introduced to remove the limitations
of ADALINE in dealing with functions that are not linearly separable.
MADALINE is one of the earliest trainable layered neural networks. Retinal
inputs were connected to a layer of adaptive ADALINE elements, the outputs
of which were connected to a fixed logic device that generated the system
output. MADALINEs were constructed with various fixed logic devices, such
as AND, OR, and majority vote-taker elements, in the second layer. These
three functions are all threshold logic functions.
Early neural networks (MADALINE) were invented for pattern classification
problems. Nowadays, neural networks are equally useful for tasks such
as interpolation, system modelling, state estimation, adaptive filtering, and
nonlinear control. Rosenblatt (1962) took this course of improvement to
the next stage by proving the convergence of the perceptron training rule.
Later, Minsky and Papert (1969) showed that the perceptron cannot deal
with nonlinearly-separable data sets, even those that represent functions
as simple as the exclusive OR.
A.4 Multi-Layer Feed-Forward Perceptrons
Between 1970 and 1985, very little research was reported on neural
networks. The invention of back-propagation, which can learn from
nonlinearly-separable data sets, by Rumelhart and McClelland (1986) led to
a revolution in this field. Since 1986, a great deal of research has been
conducted on the enhancement and application of neural networks. For
instance, an artificial neuron may use the following transfer or activation
function:
\[
X = \sum_{i=1}^{n} X_i W_i, \qquad
Y = \begin{cases} 1 & \text{if } X \ge u \\ -1 & \text{otherwise} \end{cases}
\tag{A.2}
\]
The above type of activation function is called a sign function. This
neuron computes the weighted sum of the input signals and compares the
result with a threshold value, u. If the net input is less than the threshold,
the neuron output is OFF (e.g. −1). However, if the net input is greater
than or equal to the threshold, the neuron becomes activated and its output
attains the value ON (e.g. +1). Some examples of the activation function,
or “nonlinearity”, of a unit are presented in Figure A.1.
Artificial neural networks were first invented by computer scientists, but
many scientists from physics, engineering, psychology, etc., have tried to
improve them. Early neural networks were invented for pattern
classification problems. Nowadays neural networks are useful for tasks such
as interpolation, system modelling, state estimation, adaptive filtering, and
nonlinear control. In Figure A.2, a typical feed-forward neural network is
illustrated. The neural network typically consists of a set of n excitatory
inputs, $X_i$; a set of m inhibitory inputs, $X_{n+j}$; an activation
function; neuron outputs, $y_i$; and weights, $W_{ij}$.
Each neuron computes the weighted sum of its input signals. An activation
function processes the outcome of this linear combiner and returns the
output of the neuron, which contributes to the inputs of the neurons in
the next layer. Non-linear activation functions allow a network of multiple
neurons to represent non-linear relationships, which increases its
computational power. Examples of activation functions are the step
function, the sign function and the sigmoid function.
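The three activation functions named above can be written compactly in MATLAB. This is only an illustrative sketch: the threshold is assumed to be zero, and the sigmoid is taken in its standard logistic form.

```matlab
% Sketch of the step, sign and sigmoid activation functions.
u = 0;                                      % threshold (assumed to be zero)
step_fn    = @(x) double(x >= u);           % step: 0 or 1
sign_fn    = @(x) 2 * double(x >= u) - 1;   % sign as in equation (A.2): -1 or +1
sigmoid_fn = @(x) 1 ./ (1 + exp(-x));       % sigmoid: smooth, between 0 and 1

x = [-2 0 2];
disp([step_fn(x); sign_fn(x); sigmoid_fn(x)])
```

Each row of the displayed matrix holds the response of one function to the same three net inputs, which makes the difference between the hard-limiting functions and the smooth sigmoid easy to see.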
The number of covariates involved in the process under study determines
the number of neurons in the input layer of the network. The output signal
of each neuron in the last layer is one of the outputs of the neural
network. The layers of neurons between the first and the last layers are
called hidden layers. The ability to adjust the number of hidden layers,
the number of neurons in each hidden layer, and the initial coefficients
increases the flexibility of neural networks in mimicking the patterns of
different data.
Figure A.1: Some examples of activation function or “nonlinearity” of a
unit: (a) step function (b) sign function (c) linear function.
[Three panels, each plotting the output Y against the input X.]
Figure A.2: Architecture of a typical feed-forward neural network
[Diagram with three labelled parts: Input Layer, Hidden Layer(s), Output Layer.]
Appendix B
Reliability Prediction and
Prioritisation Using an Artificial
Neural Network: A Step-by-Step
Instruction Manual for
Practitioners
B.1 Introduction
This appendix presents a step-by-step instruction manual for practitioners
and engineers in the water distribution industry. The implementation
comprises three steps, described as follows:

1. Data Preparation: This step involves the preparation of pipe failure
   data as recorded in a history of past pipe breaks. At this stage,
   the past failure records are sorted and their empirical reliabilities are
   calculated accordingly, to be used for training the neural network in
   the next step. The details are explained in Section B.2.

2. Training: At this step, the ensemble of failure records prepared in
   the previous step is used to train the artificial neural network (ANN),
   which is a feed-forward perceptron. This ANN can learn the failure
   patterns through training by the error back-propagation technique. The
   architecture of the ANN model and details of its training are presented
   in Section B.3.

3. Prediction and Prioritisation: After being trained, the ANN can
   be used for failure prediction and prioritisation of the pipes based on
   their reliabilities. The direct output of the trained ANN is an estimate
   of the reliability of a given pipe (knowing its construction date,
   material and diameter) at a given date (the assessment date) in the
   future. This predicted value can be used to prioritise different classes
   of pipes (pipe groups with similar diameter and material) for repair,
   rehabilitation and/or replacement planning and scheduling purposes.
   The details are presented in Section B.4.

The source code of the MATLAB program that realises both the neural
network and its training in software is provided in Section B.5.
B.2 Preparation of Training Data
Before the neural network model can be used for lifetime prediction and
prioritisation for the purpose of replacement/repair planning, it must first
be trained using a database of past pipe breaks. A proper database would
include a large number of failure records, each with the following fields:
• Pipe age at the failure date
• Pipe material
• Pipe diameter
• Empirical reliability of the pipe at the time.
The pipe age at the failure date is calculated as the difference between
the construction and failure dates. It can be in terms of days, months or
years, depending on the required resolution of prediction. However, it is
important to note that for a higher resolution, a larger number of failure
records is required for proper training of the neural network. In this study,
with the failure database provided by CWW, the pipe ages were calculated
in terms of the number of months. The maximum age in the database
should also be recorded and will be denoted by ∆D, hereafter.
The pipe material is recorded (or later converted in the software) as
a numerical code. For example, if only CI, CICL, DI, AC, and GWCL
pipes exist in the network, then the pipe material will be coded as shown
in Table B.1. The number of existing types of pipe material should also be
recorded and will be denoted by $N_{Type}$ hereafter (in the above case,
$N_{Type} = 5$).

Table B.1: An example of pipe material and diameter coding.

Material   Code      Diameter (mm)   Code
CI         1         80              1
CICL       2         100             2
DI         3         110             3
AC         4         120             4
GWCL       5         150             5

Similar to the pipe material, the pipe diameter is also coded as one of the
numbers 1, 2, . . . . For example, if every pipe in the network has a
diameter of 80 mm, 100 mm, 110 mm, 120 mm or 150 mm, then the pipe
diameters will be coded as shown in Table B.1. The number of existing pipe
diameter codes should also be recorded and will be denoted by $N_D$. In the
example shown in Table B.1, $N_D = 5$.
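The coding of Table B.1 can be sketched in MATLAB as follows. The four break records are hypothetical, and the use of `ismember` to look up the codes is one possible choice rather than the method used in the thesis program.

```matlab
% Sketch: coding pipe materials and diameters as in Table B.1.
material_list = {'CI', 'CICL', 'DI', 'AC', 'GWCL'};  % order defines codes 1..5
diameter_list = [80 100 110 120 150];                % mm, order defines codes 1..5

% Hypothetical break records
materials = {'DI', 'CI', 'GWCL', 'CI'};
diameters = [100 150 80 150];

% The second output of ismember is the position of each entry in the list
[~, material_code] = ismember(materials, material_list);
[~, diameter_code] = ismember(diameters, diameter_list);

N_Type = numel(material_list);   % number of material types
N_D    = numel(diameter_list);   % number of diameter codes
disp([material_code; diameter_code])
```

Keeping the code lists explicit, rather than deriving them from the data, fixes the mapping of Table B.1 regardless of the order in which records appear.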
To calculate the empirical reliabilities of the pipes, their break records in
the database should first be grouped into different classes. In each class,
the pipes have the same material and diameter. A total of
$N_D \times N_{Type}$ different classes can potentially exist, although
usually some classes are empty (with no break record in the database).
Suppose there are n break records in the history for a particular class
of pipes and the pipe ages at the times of failure are recorded. The set of
pipe ages is first sorted to $t_{(1)}, t_{(2)}, \ldots, t_{(n)}$ in ascending
order. Then, for each age $t_{(i)}$; $i = 1, \ldots, n$, the empirical
reliability of the pipes in that class is calculated as follows:
\[
\begin{aligned}
S_{EMP}(t_{(1)}) &= 1 - 1/(2n) \\
S_{EMP}(t_{(2)}) &= 1 - 3/(2n) \\
&\;\;\vdots \\
S_{EMP}(t_{(i)}) &= 1 - (2i - 1)/(2n) \\
&\;\;\vdots \\
S_{EMP}(t_{(n)}) &= 1/(2n).
\end{aligned}
\tag{B.1}
\]
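As a worked illustration of equation (B.1), the following MATLAB sketch computes the empirical reliabilities for one class of pipes. The four failure ages are hypothetical.

```matlab
% Sketch: empirical reliabilities for one class of pipes, equation (B.1).
ages = [310 95 402 188];            % hypothetical pipe ages at failure (months)

t   = sort(ages);                   % t(1) <= t(2) <= ... <= t(n)
n   = numel(t);
idx = 1:n;
S_emp = 1 - (2*idx - 1) / (2*n);    % S_EMP(t(i)) = 1 - (2i - 1)/(2n)

disp([t; S_emp])
```

For n = 4 the reliabilities are 0.875, 0.625, 0.375 and 0.125, paired with the sorted ages 95, 188, 310 and 402 months.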
All four columns of the training dataset are now complete and can be
directly applied to train a neural network model as described in the next
section. Each row of this dataset corresponds to a break record and includes
numerical values for the pipe age, the code of the pipe material, the code
of the pipe diameter and the empirical reliability of the pipe (calculated
at the given age of the pipe for the pipe class to which the pipe belongs).

Figure B.1: Input layer of the ANN model.
[Diagram: the pipe age, pipe material code and pipe diameter code are scaled
by 1/∆D, 1/NType and 1/ND to give I1, I2 and I3; together with the input
I4 = 1, these are passed to the hidden layer.]
B.3 Training the Artificial Neural Network
The neural network model of pipe reliability is a feed-forward perceptron
with three layers. The first layer is the input layer. As shown in Figure B.1,
the pipe age, material code and diameter code are normalised and passed
to the hidden layer. The inputs to the hidden layer are denoted by the
symbols I1, I2 and I3, along with a 4-th input I4 = 1 which is called the
bias input.
The hidden layer is shown in Figure B.2. The outputs of this layer are
given by:
\[
J_j = f\left( \sum_{i=1}^{4} W_{ij} I_i \right); \quad j = 1, \ldots, n_h
\tag{B.2}
\]
where $n_h$ is the number of neurons in the hidden layer and $W_{ij}$ is the
weight of the synapse connecting the i-th input $I_i$ to the j-th hidden
neuron. The function f(.) is called the activation function of the hidden
neurons and has the following mathematical form (which is called the
sigmoid function):
\[
y = f(x) = \frac{1}{1 + \exp(-x)}. \tag{B.3}
\]
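Figure B.2: Hidden layer of the ANN model.
[Diagram: the inputs I1, I2, I3 and I4 = 1 are connected through the weights
Wij to the hidden neurons, whose outputs are J1, J2, ..., Jnh; an additional
constant Jnh+1 = 1 is also shown.]

Figure B.3: Output layer of the ANN model.
[Diagram: J1, J2, ..., Jnh and Jnh+1 = 1 are connected through the weights
V1, V2, ..., Vnh, Vnh+1 to a single neuron with output y, which is scaled
by 10/9 to give S.]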
The output layer has a single neuron, as shown in Figure B.3. The
neuron is connected to the neurons of the hidden layer through synapses
with weights $V_1, \ldots, V_{n_h+1}$. The reliability estimate is generated
by this single neuron:
\[
S = \frac{10}{9}\, f\left( \sum_{j=1}^{n_h+1} V_j J_j \right). \tag{B.4}
\]
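Equations (B.2) to (B.4) amount to one forward pass through the network, which can be sketched in MATLAB as follows. The pipe, the weights and the number of hidden neurons are hypothetical. The normalisation (dividing each input by its recorded maximum ∆D, NType or ND) and the constant input J_{nh+1} = 1 are assumptions consistent with Section B.2 and the code of Section B.5.

```matlab
% Sketch of one forward pass through equations (B.2)-(B.4).
f = @(x) 1 ./ (1 + exp(-x));            % sigmoid, equation (B.3)

Delta_D = 1200;  N_Type = 5;  N_D = 5;  % hypothetical maxima from Section B.2
age = 300;  material_code = 3;  diameter_code = 2;   % hypothetical pipe

% Assumed normalisation: divide each input by its recorded maximum
I = [age / Delta_D; material_code / N_Type; diameter_code / N_D; 1];  % I4 = 1

nh = 4;                                 % number of hidden neurons (assumed)
W = 0.01 * ones(4, nh);                 % hypothetical hidden weights, 4 x nh
V = 0.01 * ones(nh + 1, 1);             % hypothetical output weights, (nh+1) x 1

J = f(W' * I);                          % hidden outputs, equation (B.2)
S = 10/9 * f(V' * [J; 1]);              % reliability estimate, equation (B.4)
disp(S)
```

Because the sigmoid lies strictly between 0 and 1, the estimate S lies between 0 and 10/9.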
B.3.1 Learning by Error Back-Propagation
Training of the neural network means tuning of the weights Wij and Vj in
such a way that the output of the network, S, follows the failure patterns
existing in the training data. The training method described in this section
is called error back-propagation and it is the most popular learning technique
used with feed-forward multi-layer perceptron neural networks.
Suppose there is a total of N break records in the training dataset that
has been prepared following the instructions given in Section B.2. For the
k-th record (1 ≤ k ≤ N), the empirical reliability is denoted by
$S^k_{EMP}$ (the superscript k is an index, not a power), and the neural
network is to be trained to generate these empirical reliabilities for the
break records listed in the database. If the output of the neural network,
given by equations (B.2)-(B.4), is $S^k$ for the k-th break record, the
reliability estimation error term for this record is
$e_k = S^k_{EMP} - S^k$. Using error back-propagation, the weights of the
neural network are tuned in such a way that the following objective function
is minimised:
\[
E = \frac{1}{2} \sum_{k=1}^{N} \left( S^k_{EMP} - S^k \right)^2
  = \frac{1}{2} \sum_{k=1}^{N} e_k^2. \tag{B.5}
\]
The optimisation problem is solved using the gradient descent technique.
The training involves multiple repetitive epochs. In each epoch, each weight
w changes in the opposite direction to the gradient of the objective
function:
\[
\Delta w = -\eta \frac{\partial E}{\partial w} + \alpha \Delta w_{\text{prev.}}. \tag{B.6}
\]
The term $\alpha \Delta w_{\text{prev.}}$ is called the momentum term and is
added to the gradient term to reduce the risk of the minimisation process
being trapped in a local minimum. The two parameters η and α should both be
positive and less than one, and are chosen by trial and error. For the
failure dataset provided by CWW for this study, η = 0.5 and α = 0.7 were
suitable choices. For a more complete and larger dataset, different values
may be appropriate. A small η slows down the convergence of the optimisation
process, whereas a large value causes the parameter values to oscillate
widely around their optimum values throughout the optimisation process
(oscillatory convergence).
B.3.2 Step-by-Step Training Algorithm
First, the weights of the neural network are initialised to small random
values. In this study, they were initialised to values between −0.01 and
0.01. The training algorithm involves repetition of several steps (each
repetition cycle is called an epoch) until the objective function E
converges. In each epoch the following steps are taken:

1. For every break record in the training dataset, indexed with k
   (1 ≤ k ≤ N), compute and save the following values:
   • the outputs of the hidden neurons $J_j(k)$; $j = 1, \ldots, n_h$, using
     equation (B.2);
   • the output of the neural network (reliability estimate $S^k$), using
     equation (B.4); and
   • the reliability estimation error $e_k = S^k_{EMP} - S^k$.

2. For each weight $V_j$, calculate the following gradient:
\[
\frac{\partial E}{\partial V_j} = -\sum_{k=1}^{N} e_k\, S^k (1 - 0.9 S^k)\, J_j(k). \tag{B.7}
\]

3. For each weight $W_{ij}$, calculate the following gradient:
\[
\frac{\partial E}{\partial W_{ij}} = -\sum_{k=1}^{N} e_k\, S^k (1 - 0.9 S^k)\, J_j(k) (1 - J_j(k))\, V_j\, I_i(k). \tag{B.8}
\]

4. Change the weights of the neural network by $\Delta V_j$ and
   $\Delta W_{ij}$ given below:
\[
\begin{aligned}
\Delta V_j &= -\eta\, \partial E / \partial V_j + \alpha\, \Delta V_j(\text{prev.}) \\
\Delta W_{ij} &= -\eta\, \partial E / \partial W_{ij} + \alpha\, \Delta W_{ij}(\text{prev.})
\end{aligned}
\tag{B.9}
\]
   where $\Delta V_j(\text{prev.})$ and $\Delta W_{ij}(\text{prev.})$ are the
   weight changes applied in the previous epoch.

The above steps are repeated until convergence. The convergence is
examined by checking the absolute difference between the values of E in the
current and previous epochs. Training stops when the absolute difference
is less than a small threshold ($10^{-4}$ in this study).
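The four steps and the stopping rule can be put together in a compact MATLAB sketch. It is vectorised over the break records and is not the thesis program itself. The tiny synthetic dataset (eight records of a single pipe class), the number of hidden neurons and the cap of 5000 epochs are assumptions; η, α, the initialisation range and the tolerance follow the values quoted in the text.

```matlab
% Sketch of the training loop of Section B.3.2 on a tiny synthetic dataset.
f = @(x) 1 ./ (1 + exp(-x));               % sigmoid, equation (B.3)

N     = 8;                                 % hypothetical number of break records
age   = (1:N) / N;                         % normalised ages, ascending
I     = [age; ones(3, N)];                 % 4 x N inputs: I2 = I3 = 1, I4 = 1 (bias)
S_emp = 1 - (2*(1:N) - 1) / (2*N);         % empirical reliabilities, equation (B.1)

nh = 3;  eta = 0.5;  alpha = 0.7;  tol = 1e-4;
W  = 0.02 * rand(4, nh) - 0.01;            % small random weights in [-0.01, 0.01]
V  = 0.02 * rand(nh + 1, 1) - 0.01;
dW = zeros(size(W));  dV = zeros(size(V)); % previous weight changes
E_hist = [];

for epoch = 1:5000
    % Step 1: forward pass for all records
    J  = f(W' * I);                        % nh x N hidden outputs, equation (B.2)
    Jb = [J; ones(1, N)];                  % append the constant J_{nh+1} = 1
    S  = 10/9 * f(V' * Jb);                % 1 x N estimates, equation (B.4)
    e  = S_emp - S;                        % estimation errors
    E_hist(end + 1) = 0.5 * sum(e .^ 2);   % objective function, equation (B.5)

    % Stop when E changes by less than the tolerance between epochs
    if epoch > 1 && abs(E_hist(end) - E_hist(end - 1)) < tol
        break
    end

    % Steps 2 and 3: gradients, equations (B.7) and (B.8)
    delta = e .* S .* (1 - 0.9 * S);       % 1 x N factor shared by both gradients
    gV    = -Jb * delta';                  % (nh+1) x 1
    back  = (V(1:nh) * delta) .* J .* (1 - J);   % nh x N
    gW    = -I * back';                    % 4 x nh

    % Step 4: weight update with momentum, equation (B.9)
    dV = -eta * gV + alpha * dV;   V = V + dV;
    dW = -eta * gW + alpha * dW;   W = W + dW;
end
fprintf('epochs: %d, E: %.4f -> %.4f\n', numel(E_hist), E_hist(1), E_hist(end));
```

The factor S(1 − 0.9S) in both gradients is the derivative of the output S = (10/9) f(z) with respect to its net input z, which is why it appears in equations (B.7) and (B.8).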
B.4 Prioritisation of Pipes by Reliability Prediction Using the Neural Network Model
Using the trained neural network, reliability models can be derived for each
class of pipe. In this context, the term reliability model means the pipe
reliabilities at different ages for pipes with the same material and
diameter.

Figure B.4: An example of the reliability versus age plot of a model derived
for a class of pipes.
[Plot of Reliability against Age (years); marked reliabilities 1.00, 0.87
and 0.21; marked ages 25 and 50.]

Assume that, using the failure database, the reliability of a class of
pipes is modelled as a function of pipe age as shown in Figure B.4. By
projecting this reliability model into the future, the reliability of those
pipes in the year 2015 can be calculated as a similar (but reflected)
function of the construction dates of the pipes, as shown in Figure B.5.
For the purpose of predictive prioritisation of pipes, the following steps
are taken:
1. An assessment date in the future is determined.
2. The reliability of each class of pipes at the assessment date is
   calculated and plotted versus construction date.
3. A reliability threshold is set (e.g. 100% × α = 80%).
4. For each class, the pipes whose future reliability will be less than
   the given threshold are determined (in terms of their construction
   date) and marked as “high-risk”.
5. The list of high-risk pipes is the output of the proactive technique.
   The pipes in this list have the highest priority for replacement (or
   rehabilitation).
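The steps above can be sketched in MATLAB for a single class of pipes. The reliability curve below is a hypothetical stand-in for the output of the trained neural network, and the assessment year, the range of construction years and the variable names are likewise assumptions.

```matlab
% Sketch of the prioritisation steps for a single class of pipes.
% Hypothetical reliability-versus-age model standing in for the trained ANN
reliability = @(age_years) exp(-(age_years / 45) .^ 3);

assessment_year    = 2030;                    % step 1: assessment date
construction_years = 1950:2020;
ages = assessment_year - construction_years;  % pipe ages at the assessment date
S    = reliability(ages);                     % step 2: reliability per construction year

threshold = 0.8;                              % step 3: reliability threshold (80%)
high_risk = construction_years(S < threshold);   % step 4: flagged construction years

% Step 5: report the most recent construction year that is still high-risk
fprintf('High-risk: pipes built in or before %d\n', max(high_risk));
```

Because reliability decreases with age, the high-risk set is always a block of the oldest construction years, so reporting its latest year is enough to describe it.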
An example is shown in Figure B.6. The reliabilities of two classes of pipes
at a given assessment date in the future are predicted and plotted versus
their construction dates. The horizontal line at 100% × α intersects the two
plots at the construction years Y1 and Y2. Thus, for the class with the
“red” reliability plot, the pipes constructed before the year Y1 are
identified as high-risk and for the class with the “blue” reliability plot,
the pipes constructed before the year Y2 are identified as high-risk.

Figure B.5: The reliability of the same class of pipes (as in Figure B.4)
in the year 2015, plotted versus the construction year of the pipes in that
class.
[Plot of Reliability against Construction Year; marked reliabilities 0.87
and 0.21; marked construction years 1990 and 1965.]

Figure B.6: For the “red” and “blue” classes of pipes, the pipes constructed
before the year Y1 and Y2, respectively, are high-risk.
[Plot of Reliability against Construction Year for two classes; marked
levels 100% and 100% × α; marked construction years Y1 and Y2.]
    function [shat,J]=NN(I,W,V)
    % This function implements the actual neural network model. It gets the
    % 3 x 1 input vector I and outputs the reliability estimate shat and the
    % hidden layer neuron activities J, which is an nh x 1 vector. The size
    % of the weight matrix W is 4 x nh and the weight vector V is (nh+1) x 1.

    J=f(W'*[I;1]); % computes an nh x 1 vector J. The input I is padded with a 1 for the bias input.
    shat=10/9*f(V'*[J;1]); % computes the output shat. J is padded with a 1 for the bias.

    function y=f(x)
    % This is the activation function of each neuron in the ANN, which is a
    % sigmoid function.
    y=1./(1+exp(-x));

Figure B.7: The source of the MATLAB function that realises the three layers
of the neural network. For a given set of input values, this function returns
the output of the network and the activation values (outputs) of the neurons
in the hidden layer.
B.5 MATLAB Source Code
In this section, the MATLAB source files of the program, written to imple-
ment and test the performance of the neural network model, are presented.
Figure B.7 shows the function ‘NN.m’ which implements the three layers
of the neural network shown in Figures B.1-B.3. The inputs of the neural
network are denoted by the 3 × 1 vector variable ‘I’ and they are in the
same order as shown in Figure B.1. The weights are denoted by the vari-
able symbols ‘W’ and ‘V’ which are 4 × nh and (nh + 1) × 1, respectively.
The function returns the output of the neural network which is an estimate
S of the reliability of the given pipe at the given age. It also returns an
nh × 1 vector ‘J’ which denotes the outputs of the neurons in the hidden
layer. The function ‘y=f(x)’ defined within ‘NN.m’ is the sigmoid activation
function of the neurons as defined in Equation (B.3).
Figure B.8 shows the file ‘Training_Epoch.m’, a function that implements
one training epoch for the neural network. Its inputs are the ages, the
material and diameter codes, the empirical reliabilities, the weights
trained in the previous epoch and the weight variations from the previous
epoch. Note that the ages, material and diameter codes and empirical
reliabilities are vectors extracted from the failure history (this
extraction is performed within another m-file that repeatedly calls
‘Training_Epoch’ over multiple epochs to train the neural network).
The ‘Training_Epoch’ function initialises its variables, generates the
input vector ‘I’ for each failure record and calls the ‘NN’ function. It
then accumulates the estimation error over all records, computes the
updated weights and returns them, along with their most recent variations
and the cumulative estimation error E itself.
The complete iterative training scheme for the neural network, using
error back-propagation, is implemented by the function ‘NN_Training’. The
source code of this function, the m-file ‘NN_Training.m’, is shown
in Figures B.9 and B.10.
The function ‘NN_Training’ reads the records of a failure database stored
in a Microsoft Excel file that is assumed to be named ‘Sample_data.xls’;
the name can be changed within the code if required. The Excel file has to
be generated with the format shown in Figure B.11. There are four columns:
the type (material), diameter and age of the pipe at the time of the
recorded failure, and the empirical reliability of the pipe. The failures
of each class of pipe should be sorted in ascending order of pipe age, and
the empirical reliabilities can then be computed according to Equation 4.4.
More precisely, the empirical reliabilities of the smallest to largest ages
of the failed pipes of the same class will be $1 - \frac{1}{2n}$,
$1 - \frac{3}{2n}$, $1 - \frac{5}{2n}$, ..., $\frac{1}{2n}$, respectively
($n$ is the total number of recorded failures for the class).
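The sequence above can be generated for one class of pipe as sketched
below. This is a minimal illustration only: the failure ages are invented,
and the general term 1 − (2i − 1)/(2n) for the i-th smallest age is inferred
from the sequence listed above rather than copied from Equation 4.4.

```matlab
% Sketch: empirical reliabilities for one pipe class (hypothetical failure ages).
ages        = [37; 12; 25; 41];      % invented ages at failure, in years
ages_sorted = sort(ages);            % ascending order of pipe age
n           = numel(ages_sorted);    % total number of recorded failures for the class
i           = (1:n)';                % rank of each failure after sorting
S_emp       = 1 - (2*i - 1) / (2*n); % 1 - 1/(2n), 1 - 3/(2n), ..., 1/(2n)

disp([ages_sorted S_emp]);           % age and empirical reliability, side by side
```

For these four records the empirical reliabilities are 0.875, 0.625, 0.375
and 0.125, which would form the fourth column of the Excel file for this
class.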
After the contents of the Excel file are read into a numeric matrix and a
text cell array, the text is processed first and the material information
is coded into material codes. The diameters are then coded in a similar
way. The weights of the neural network are initialised to small random
numbers, and the training epochs are repeated until convergence. At the
end, the weights of the trained neural network are saved into a file named
‘Trained_Weights.mat’ for future use. They can be loaded into variables
and used as inputs to the ‘NN’ function to estimate the reliability of a
pipe of a given class (material and diameter codes) at a given age in the
future.
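A possible way of using the saved weights is sketched below. It assumes
that ‘Trained_Weights.mat’ contains the variables ‘Wnew’ and ‘Vnew’ saved by
‘NN_Training’, and that the inputs are normalised exactly as in
‘Training_Epoch.m’. The pipe class and the ages are hypothetical, the forward
pass of ‘NN.m’ is repeated inline so that the script is self-contained, and
random placeholder weights are used if the file is not found.

```matlab
% Sketch: estimating reliability at several ages from saved weights.
f = @(x) 1 ./ (1 + exp(-x));              % sigmoid activation, as in 'NN.m'
if exist('Trained_Weights.mat', 'file')
    S = load('Trained_Weights.mat');      % assumed to hold Wnew and Vnew
    W = S.Wnew;
    V = S.Vnew;
else
    nh = 20;                              % placeholder weights (NOT trained values)
    W  = 0.02 * rand(4, nh) - 0.01;
    V  = 0.02 * rand(nh + 1, 1) - 0.01;
end

DeltaD = 200; Ntype = 5; Nd = 5;          % normalisation factors from 'Training_Epoch.m'
material_code = 2;                        % hypothetical class: 'CICL'
diameter_code = 2;                        % hypothetical class: diameter 100
ages = 0:20:100;                          % hypothetical ages of interest, in years

shat = zeros(size(ages));
for k = 1:numel(ages)
    I = [ages(k)/DeltaD; material_code/Ntype; diameter_code/Nd];
    J = f(W' * [I; 1]);                   % hidden-layer outputs
    shat(k) = 10/9 * f(V' * [J; 1]);      % reliability estimate at ages(k)
end
disp([ages' shat']);                      % age and estimated reliability
```

With placeholder weights the printed values carry no meaning; the sketch is
only intended to show the order and scaling of the three inputs.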
function [Wnew,Vnew,Delta_Wnew,Delta_Vnew,E]=Training_Epoch(Ages,Material_Codes,Diameter_Codes,S_Emp_Data,Wold,Vold,Delta_Wold,Delta_Vold)
% This function performs one epoch of back-propagation training of the ANN.
% The inputs are the vectors of ages, empirical reliability values and
% material and diameter codes, extracted from the failure history, together
% with the old weight values and their previous variations. The outputs are
% the new weights, the updated variations and the total error E.
etha=0.05; alpha=0.1; % etha scales the gradient term and alpha scales the previous weight variation
DeltaD=200; % Presumably, no pipe in the database is more than 200 years old, so this is a good normalisation factor
Ntype=5; % 5 types of pipe materials are assumed to exist in the database
Nd=5; % 5 types of pipe diameters are assumed to exist in the database
Imatrix=[Ages/DeltaD Material_Codes/Ntype Diameter_Codes/Nd]'; % Each column of this matrix is one input to the neural network, which is to be trained to generate the empirical reliability
N=length(Ages); % the number of failure records

E=0; dE_dV=0; dE_dW=0; % The total error and partial derivatives are initialised to zero to be summed up in the following loop
for k=1:N
    I=Imatrix(:,k); % the input vector is extracted
    [shat,J]=NN(I,Wold,Vold); % the outputs of the ANN are calculated
    e=S_Emp_Data(k)-shat; % the estimation error is calculated
    E=E+1/2*e^2; % calculates formula (B.5)
    dE_dV=dE_dV+e*shat*(1-0.9*shat)*[J ; 1]; % calculates (B.7)
    dE_dW=dE_dW+e*shat*(1-0.9*shat)*[I ; 1]*(J.*(1-J).*Vold(1:(length(Vold)-1)))'; % calculates (B.8)
end
Delta_Vnew=+etha*dE_dV+alpha*Delta_Vold; % calculates the new V weight variations from formula (B.9)
Delta_Wnew=+etha*dE_dW+alpha*Delta_Wold; % calculates the new W weight variations from formula (B.9)
% updating the weights
Vnew=Vold+Delta_Vnew;
Wnew=Wold+Delta_Wnew;
Figure B.8: One epoch of the training process of the neural network by error
back-propagation. This epoch is repeated until the estimation error of the
neural network falls below a given small threshold.
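The factor shat*(1-0.9*shat) in the listing of Figure B.8 follows from the
output scaling used in ‘NN.m’. The short derivation below is a sketch based
only on the code shown in Figures B.7 and B.8; the symbols $z$, $e_k$ and
$\hat{s}_k$ are introduced here for illustration, and the correspondence
with Equations (B.5) and (B.7) is assumed from the comments in the code
rather than checked against those equations.

```latex
\begin{align*}
\hat{s} &= \tfrac{10}{9}\, f(z), \qquad
  z = V^{\top}\begin{bmatrix} J \\ 1 \end{bmatrix}, \qquad
  f(z) = \frac{1}{1 + e^{-z}}, \\
\frac{\partial \hat{s}}{\partial z}
  &= \tfrac{10}{9}\, f(z)\bigl(1 - f(z)\bigr)
   = \hat{s}\,\bigl(1 - 0.9\,\hat{s}\bigr), \\
E &= \sum_{k=1}^{N} \tfrac{1}{2}\, e_k^{2}, \qquad e_k = S_k - \hat{s}_k, \\
-\frac{\partial E}{\partial V}
  &= \sum_{k=1}^{N} e_k\, \hat{s}_k \bigl(1 - 0.9\,\hat{s}_k\bigr)
     \begin{bmatrix} J_k \\ 1 \end{bmatrix}.
\end{align*}
```

Under these assumptions the variables dE_dV and dE_dW accumulate the negative
of the error gradient, which is why the weight variations are formed by
adding etha*dE_dV and etha*dE_dW to the momentum terms.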
function NN_Training
% This function reads the failure data from a database in the form of an
% Excel file and uses them to train the neural network.
[Numeric,Text]=xlsread('Sample_data.xls'); % reads the numeric and text data from the Excel file which is the failure database
Type_Text=Text(2:end,1); % extracts the first column of the Excel file, which includes the text of pipe materials

% the following lines code the pipe materials into a vector of 1, 2, 3, 4 or 5's according to Table B.1
Material_Codes=zeros(size(Type_Text));
Material_Codes(strcmp(Type_Text,'CI'))=1;
Material_Codes(strcmp(Type_Text,'CICL'))=2;
Material_Codes(strcmp(Type_Text,'DI'))=3;
Material_Codes(strcmp(Type_Text,'AC'))=4;
Material_Codes(strcmp(Type_Text,'GWCL'))=5;

% the following lines code the pipe diameters into a vector of 1, 2, 3, 4 or 5's according to Table B.1
Diameters=Numeric(:,1);
Diameter_Codes=zeros(size(Diameters));
Diameter_Codes(Diameters==80)=1;
Diameter_Codes(Diameters==100)=2;
Diameter_Codes(Diameters==110)=3;
Diameter_Codes(Diameters==120)=4;
Diameter_Codes(Diameters==150)=5;

Ages=Numeric(:,2); % extracts the ages from the numeric data of the Excel file
S_Emp_Data=Numeric(:,3); % extracts the empirical reliabilities from the numeric data of the Excel file

% Start training the neural network

nh=20; % choose the number of neurons in the hidden layer as 20
Wold=0.02*rand(4,nh)-0.01; % initialise the weights W randomly between -0.01 and 0.01
Vold=0.02*rand(nh+1,1)-0.01; % initialise the weights V randomly between -0.01 and 0.01
Delta_Wold=zeros(size(Wold)); % initialise the delta W to all zeros
Figure B.9: Page 1 of the code that repeats the training epochs until
convergence. It reads the failure history from a Microsoft Excel file.
Delta_Vold=zeros(size(Vold)); % initialise the delta V to all zeros

E=1000; % initialise the training error to a large number
while E/length(Ages) > 0.001
    % repeat the training epochs until the average estimation error is smaller than a small threshold (chosen as 0.001 here)
    [Wnew,Vnew,Delta_Wnew,Delta_Vnew,E]=Training_Epoch(Ages,Material_Codes,Diameter_Codes,S_Emp_Data,Wold,Vold,Delta_Wold,Delta_Vold);
    Wold=Wnew; Vold=Vnew; Delta_Wold=Delta_Wnew; Delta_Vold=Delta_Vnew; % update the weights and their inter-epoch variations for the next epoch
    disp(E);
end

save Trained_Weights Wnew Vnew; % saves the final weights of the trained neural network in the 'Trained_Weights.mat' file
Figure B.10: Page 2 of the code that repeats the training epochs until
convergence. It reads the failure history from a Microsoft Excel file.
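The coding of materials and diameters can also be written with a lookup
instead of one assignment per class. The sketch below is an alternative to
the listing of Figure B.9, not the code used in this study; it assumes the
same class order as that listing, and the records shown are invented. As in
the listing, an entry that does not match any class receives the code 0.

```matlab
% Sketch: coding materials and diameters with ismember (alternative to Figure B.9).
materials = {'CI', 'CICL', 'DI', 'AC', 'GWCL'};  % codes 1..5, in the order of the listing
diameters = [80 100 110 120 150];                % codes 1..5, in the order of the listing

Type_Text = {'DI'; 'CI'; 'GWCL'; 'PVC'};         % invented records; 'PVC' is not a listed class
Diameters = [100; 150; 80; 225];                 % invented records; 225 is not a listed class

[~, Material_Codes] = ismember(Type_Text, materials);  % position in the list, 0 if absent
[~, Diameter_Codes] = ismember(Diameters, diameters);  % position in the list, 0 if absent

unknown = (Material_Codes == 0) | (Diameter_Codes == 0); % records with an unlisted class
disp([Material_Codes Diameter_Codes unknown]);
```

Flagging the records with code 0 before training makes it easier to notice
entries of the Excel file that fall outside the classes of Table B.1.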
Figure B.11: Example of the Excel file to be processed by the MATLAB
program.
Appendix C
Research Publications
The results of this research study have been published in the proceedings
of two international conferences. After being presented at the conferences,
those papers were extended and modified, then submitted to and accepted
for publication in two journals. Furthermore, a paper has been submitted
to a journal, has been revised twice, and is awaiting the Editor’s final
decision. The publications are listed below:
1. Dehghan, A., K. J. McManus and E. F. Gad (2008), Statistical Analysis
of Structural Failures of Water Pipes in a Case Study, Proceedings
of the Institution of Civil Engineers (ICE) - Water Management, 161(4),
207–214, August.
2. Dehghan, A., K. J. McManus and E. F. Gad (2008), Probabilistic
Failure Prediction for Deteriorating Pipelines: A Non-Parametric
Approach, ASCE Journal of Performance of Constructed Facilities,
22(1), 45–53, February.
3. Dehghan, A., K. J. McManus and E. F. Gad (2007), Non-Parametric
Approach to Probabilistic Analysis of Structural Failures of Cast Iron
Pipes, In: Proceedings of the Eleventh International Conference on
Civil, Structural and Environmental Engineering Computing, St. Julians,
Malta, Paper No. 240, September.
4. Dehghan, A. and K. J. McManus (2005), Improved Estimation of Water
Pipes Reliability for Urban Water Supply Systems, In: Proceedings
of The First International Conference on Structural Condition
Assessment, Monitoring and Improvement, Perth, Australia, 163–169,
December.
5. Dehghan, A. and K. J. McManus, Reliability Analysis of Water
Distribution Pipes Using Artificial Neural Networks, Submitted to AWWA
Journal, Published by American Water Works Association (AWWA),
USA, Second revision submitted, Waiting for final decision by the
journal’s Editor.