
Swinburne University of Technology

Faculty of Engineering and Industrial Sciences

Thesis submitted in fulfillment of the requirements for

the degree of Doctor of Philosophy

Failure Prediction for Water Pipes

by Azam Dehghan

Victoria, Australia, 2009

To my beloved family, Reza, Helia, and my dear mother and father.

ABSTRACT

This thesis focuses on predicting the future condition of pipes in water supply networks from their past performance, using statistical analysis. Contemporary methods developed to solve this problem are reviewed, and a number of novel statistical analyses and new probabilistic techniques that enhance failure prediction accuracy and uncertainty modelling are introduced.

When a complete history of water pipe failures is available, statistical analysis can efficiently provide an accurate formulation of the relationship between failure frequencies and the factors contributing to the overall structural deterioration of the pipes. This result can then be used to predict future pipe failures. In practice, however, a complete dataset collected on a regular basis for each pipe of a water network is very costly and not readily available. In such circumstances, more sophisticated statistically derived models are required. In this thesis, a failure history of water mains provided by City West Water PTY LTD (CWW) is studied and analysed as a typical database of the kind usually available for water supply networks. This database is also used for comprehensive simulation and evaluation purposes.

An intelligent statistical reliability model based on artificial neural networks is

proposed for reliability estimation of pipes similar in terms of material, diameter,

location, etc. Application of this model to the CWW failure dataset shows

that it substantially outperforms existing statistical reliability models based on

lognormal and Weibull distributions.

In the next step, the ensemble of failures of each group of similar pipes (called a pipe class) is studied as a random process and demonstrated to be non-stationary because of the time-varying environmental factors that affect the pipe failure processes.

This thesis concludes by proposing a new non-parametric probabilistic technique developed to capture the non-stationary process of pipe failures despite the lack of information about time-variant factors, which is typical of the data available in water distribution systems. The predictions are updated automatically and therefore take gradual time-variant factors into account.

The output of this novel non-parametric auto-updating technique is a confidence interval that represents the range of the possible number of failures occurring in a given future period of time with a given confidence. Evaluation of this method on the CWW failure database shows that, in 95% of the cases, the actual number of failures is within the confidence interval given by the suggested technique.

ACKNOWLEDGEMENTS

This research was supported by City West Water PTY LTD, who provided their failure database for the water pipes in the western suburbs of Melbourne, Australia. The kind support received from my coordinating supervisor, Associate Professor Kerry J. McManus, and my associate supervisor, Associate Professor Emad F. Gad, during the several milestones of my research studies is appreciated. I would also like to thank Reza for his constant support in all aspects.

DECLARATION

I declare that this thesis:

• contains no material which has been accepted for an award to me of any

other degree or diploma;

• to the best of my knowledge, contains no material previously published

or written by another person except where due reference is made in the

text of this thesis;

• where the work is based on joint research or publications, discloses the

relative contributions of the respective workers or authors;

• has been professionally edited and the editing has addressed only the style

of the thesis and not its substantive content.

Signature:

Azam Dehghan Date:

Contents

List of Symbols and Abbreviations v

1 Introduction 1

1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Introduction to the Modelling of Failures in Water Pipes . . 6

1.3 Objectives and Structure of Study . . . . . . . . . . . . . . 8

2 Literature Review 15

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.2 Mechanical Properties and Manufacturing Techniques of Cast

Iron Pipes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.3 Structural Failures of Cast Iron Pipes . . . . . . . . . . . . . 17

2.4 Effective Factors in Pipe Failure Mechanisms: A Review of

Previous Studies . . . . . . . . . . . . . . . . . . . . . . . . 19

2.4.1 Pipe diameter . . . . . . . . . . . . . . . . . . . . . . 20

2.4.2 Pipe length . . . . . . . . . . . . . . . . . . . . . . . 21

2.4.3 Pipe age . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.4.4 Pipe material . . . . . . . . . . . . . . . . . . . . . . 22

2.4.5 Manufacturing methods . . . . . . . . . . . . . . . . 24

2.4.6 Corrosion . . . . . . . . . . . . . . . . . . . . . . . . 25

2.4.7 Pipe’s failure history . . . . . . . . . . . . . . . . . . 27

2.4.8 Water pressure . . . . . . . . . . . . . . . . . . . . . 28

2.4.9 Soil condition of the bedding . . . . . . . . . . . . . . 29

2.4.10 Seasonal variations . . . . . . . . . . . . . . . . . . . 30

2.5 Current Models Developed for Pipe Failure Analysis and Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2.5.1 Physical analysis . . . . . . . . . . . . . . . . . . . . 31

2.5.2 Descriptive analysis . . . . . . . . . . . . . . . . . . . 33


2.5.3 Statistical analysis . . . . . . . . . . . . . . . . . . . 35

2.6 Reliability Analysis of Water Networks . . . . . . . . . . . . 59

2.7 Milestones of Study and Summary . . . . . . . . . . . . . . . 62

3 Data Description 67

3.1 Typical Failure Data in Water Distribution Systems . . . . . 67

3.2 Contents of Database of This Study . . . . . . . . . . . . . . 68

3.3 Spatial Location of Pipes . . . . . . . . . . . . . . . . . . . . 72

3.3.1 Estimation of postcodes for given AMG coordinates . 73

3.3.2 Estimation of postcode for pipes with no spatial data 73

3.3.3 Distribution of failures in different postcodes . . . . . 74

3.4 Adding the Rainfall Information to the Data . . . . . . . . . 75

3.5 Failure Times . . . . . . . . . . . . . . . . . . . . . . . . . . 78

3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

4 Intelligent Reliability Analysis of Water Pipes Using Artificial Neural Networks 81

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

4.2 Reliability Analysis: Principles and Definitions . . . . . . . . 82

4.2.1 Reliability of water distribution systems . . . . . . . 82

4.3 Objectives of the Proposed Reliability Analysis . . . . . . . 83

4.4 Structure of the Proposed Reliability Model . . . . . . . . . 84

4.5 Empirical Estimation of Survival Functions . . . . . . . . . . 85

4.6 Weibull and Lognormal Lifetime Models . . . . . . . . . . . 88

4.6.1 Weibull lifetime distribution . . . . . . . . . . . . . . 88

4.6.2 Lognormal lifetime distribution . . . . . . . . . . . . 89

4.7 Intelligent Reliability Prediction by Artificial Neural Networks 90

4.8 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . 93

4.9 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

5 Characteristics of Water Main Lifetimes as Random Processes 101

5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

5.2 Non-Stationary Random Failure Processes and Parametric

Lifetime Models . . . . . . . . . . . . . . . . . . . . . . . . . 102

5.3 Likelihood of Number of Failures: A Probabilistic Definition

for Failure Frequency . . . . . . . . . . . . . . . . . . . . . . 103

5.4 Derivation of Theoretical LNF Values From Lifetime Models 106

5.5 Empirical Calculation of LNF Values . . . . . . . . . . . . . 108


5.6 Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

5.6.1 Effect of rainfall on failure rates . . . . . . . . . . . . 110

5.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

6 A Non-Parametric Technique for Failure Prediction of Deteriorating Components 121

6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

6.2 Maximum Likelihood Estimation of Future LNF Values . . . 124

6.3 Required Level of Accuracy for the Inter-Failure Times . . . 125

6.4 Prediction of Inter-Failure Times . . . . . . . . . . . . . . . 127

6.5 Failure Prediction Using the Estimated LNF Values . . . . 129

6.5.1 Prediction of number of failures . . . . . . . . . . . . 129

6.5.2 Confidence intervals . . . . . . . . . . . . . . . . . . 130

6.5.3 Failure prediction for multiple future time intervals . 131

6.5.4 A step-by-step algorithm for failure prediction . . . . 134

6.6 Results of Failure Prediction Using the Proposed Non-Parametric

Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136

6.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

7 Conclusions and Recommendations for Further Work 143

7.1 Summary of Study and Achievements . . . . . . . . . . . . . 143

7.2 Recommendations for Further Work . . . . . . . . . . . . . . 147

Bibliography 151

A An Introduction to Artificial Neural Networks 169

A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

A.2 McCulloch-Pitts Neurons . . . . . . . . . . . . . . . . . . . . 169

A.3 Linear Neuron Models . . . . . . . . . . . . . . . . . . . . . 170

A.4 Multi-Layer Feed-Forward Perceptrons . . . . . . . . . . . . 171

B Instruction Manual for Practitioners 175

B.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 175

B.2 Preparation of Training Data . . . . . . . . . . . . . . . . . 176

B.3 Training the Artificial Neural Network . . . . . . . . . . . . 178

B.3.1 Learning by Error Back-Propagation . . . . . . . . . 179

B.3.2 Step-by-Step Training Algorithm . . . . . . . . . . . 180

B.4 Prioritisation of Pipes by Reliability Prediction Using the

Neural Network Model . . . . . . . . . . . . . . . . . . . . . 181

B.5 MATLAB Source Code . . . . . . . . . . . . . . . . . . . . . 184


C Research Publications 189

List of Symbols and Abbreviations

Abbreviation  Description                                               Page of 1st Appearance

µk            The inter-failure time elapsed between two consecutive    124
              NOFk events
∆H            head loss due to friction                                 26
AC pipes      Asbestos Cement pipes                                     23
AMG           Australian Map Grids                                      73
ANN           Artificial Neural Network                                 9
AWWARF        American Water Works Association Research Foundation      55
CHM           Hazen-Williams coefficient                                26
CBD           Central Business District                                 9
CDF           Cumulative Distribution Function                          48
CI pipes      Cast Iron pipes                                           23
CICL          Cast Iron Cement Lined                                    71
CWW           City West Water Pty Ltd                                   9
D             internal pipe diameter                                    26
DI pipes      Ductile Iron pipes                                        23
DSS           Decision Support System                                   32
Eh            Redox potential (millivolts)                              46
ENOF          Expected Number Of Failures                               105
FIR           Finite Impulse Response                                   127
FOM           Force Of Mortality                                        48
GCI pipes     Grey Cast Iron pipes                                      24
h(x)          Hazard function                                           48
H(x)          Cumulative hazard function                                48
IFTi          The Inter-Failure Time between the times Ti−1 and Ti      79
KPI           key performance indicators                                69
L             Length of pipe                                            26
LNF           Likelihood of Number of Failures                          104
ML            Maximum Likelihood                                        125
MSE           Mean Square Error                                         95
NDT           Non-Destructive Technique                                 27
nh            The number of neurons in the hidden layer of the          93
              ANN-based intelligent reliability model
NOFk(nT)      The event of occurrence of k failures during the n-th     104
              time interval [(n−1)T, nT]
pdf           probability density function                              48
pH            Soil pH                                                   46
PHM           Proportional Hazard Model                                 49
pmf           probability mass function                                 108
Q             Water flow                                                26
ROCOF         rate of occurrence of failures                            6
S(x)          Survival function                                         48
SR            Saturated soil resistivity (ohm-cm)                       46
STEM          Shifted Time-Exponential Model                            40
STPM          Shifted Time-Power Model                                  42
TFF           Time to the first failure                                 104
VAR           Variance                                                  126
WPSI model    Winkler Pipe-Soil Interaction model                       32

List of Figures

1.1 The bathtub curve of the life cycle of a buried pipe . . . . . . . 7

2.1 Cumulative failure plot for a single pipe in CWW . . . . . . . . 36

2.2 The shifted time exponential model is fitted to past failure rates

which are the cumulative number of previous failures at different

times in the life of the pipe. Larger λ corresponds to a worse

performance and vice versa. . . . . . . . . . . . . . . . . . . . . 41

3.1 Failure data in water networks may be available only during specific time windows and may be left- and right-censored outside the window. . . . . . 68

3.2 Plot of AMGX coordinates versus the pipes’ unique IDs . . . . . 75

3.3 Plot of AMGY coordinates versus the pipes’ unique IDs . . . . . 76

3.4 Failure rates in each of the postcodes 3000–3039 in the region

under study (average number of breaks per km during 1997–2000). 76

3.5 Failure rates in each of the postcodes 3040–5277 in the region

under study (average number of breaks per km during 1997–2000). 77

3.6 Geographical map of the licence area of City West Water. . . . 77

3.7 Quarterly recorded rainfalls during 1997-2000 . . . . . . . . . . 78

3.8 Histograms of monthly records of rainfall . . . . . . . . . . . . . 78

3.9 Failure and inter-failure times for a class of pipes . . . . . . . . 79

4.1 Schematic diagram of a survival function estimator for a water

pipe with given type (material), diameter and construction date:

The estimator gives the pipe reliability to survive until a given

assessment date in the future. . . . . . . . . . . . . . . . . . . . 85

4.2 The step-wise empirical survival function . . . . . . . . . . . . . 87

4.3 Architecture of the proposed neural reliability analyser . . . . . 91


4.4 Diagram of an artificial neuron model in a multi-layer feed-forward perceptron network. . . . . . 92

4.5 Nonlinear profile of the Sigmoid function, the activation function

of all neurons in the proposed ANN-based reliability model. . . 92

4.6 Empirical and modelled survival function plots (class 1) . . . . . 95



5.1 Demonstration of inter-failure times in three instances . . . . . 105

5.2 Empirical LNF values P0, P3 and P4 . . . . . . . . . . . . . . 110

5.3 Expansive soils in Victoria . . . . . . . . . . . . . . . . . . . . . 112

5.4 Rainfalls and corresponding empirical average number of failures 114

5.5 Daily average number of failures during each season in 1997-

2000, and the corresponding rainfall records for CICL pipes with

100mm diameter. . . . . . . . . . . . . . . . . . . . . . . . . . . 116

5.6 Daily average number of failures during each season in 1997-

2000, and the corresponding rainfall records for CI pipes with

100mm diameter. . . . . . . . . . . . . . . . . . . . . . . . . . . 116

5.7 Deviation of rainfalls from their average, plotted versus the corresponding ENOF values: A regression line demonstrates the nearly linear correlation between the failure rates and rainfall deviations. . . . . . 117

6.1 Inter-failure time, showing an example where µ2 = 4T . . . . . . 124

6.2 Variance of the estimated inter-failure times is a decreasing function of the maximum likelihood estimates of Pk values. . . . . . 126

6.3 The number of failures occurring during 20 consecutive time intervals (top plot) and the inter-failure times obtained from these data. The µk’s are constant and change only when an NOFk occurs. Therefore, at each time, only one µk changes and the rest stay at the same value. . . . . . 128

6.4 A case example to show the unreasonable results with using the

mode of distribution instead of the statistical mean of number

of failures for prediction. . . . . . . . . . . . . . . . . . . . . . . 130

6.5 An example of LNF values for the number of failures per day, used as the basis for computing weekly and monthly LNF values. . . . . . 132

6.6 LNF values for the number of failures per week, based on the

daily LNF values plotted in Figure 6.5. . . . . . . . . . . . . . 133


6.7 LNF values for the number of failures per month, based on the

daily LNF values plotted in Figure 6.5. . . . . . . . . . . . . . 133

6.8 A step-by-step algorithm for the proposed non-parametric failure

prediction technique. . . . . . . . . . . . . . . . . . . . . . . . . 135

6.9 Expected number of failures and their 80% confidence intervals based on weekly updating, for CICL pipes with 100mm diameter located in postcode 3021. . . . . . 137

6.10 Expected number of failures and their 80% confidence intervals based on monthly updating, for CICL pipes with 100mm diameter located in postcode 3021. . . . . . 138

6.11 Expected number of failures and their 80% confidence intervals based on quarterly updating, for CICL pipes with 100mm diameter located in postcode 3021. . . . . . 139

6.12 Expected number of failures and 80% confidence interval based

on monthly updating, compared to predictions given by simple

averaging of recent records. . . . . . . . . . . . . . . . . . . . . 140

A.1 Some examples of activation function or “nonlinearity” of a unit 173

A.2 Architecture of a typical feed-forward neural network . . . . . . 174

B.1 Input layer of the ANN model. . . . . . . . . . . . . . . . . . . 178

B.2 Hidden layer of the ANN model. . . . . . . . . . . . . . . . . . . 179

B.3 Hidden layer of the ANN model. . . . . . . . . . . . . . . . . . . 179

B.4 An example of the reliability versus age plot of a model derived

for a class of pipes. . . . . . . . . . . . . . . . . . . . . . . . . . 182

B.5 The reliability of the same class of pipes (as in Figure B.4) in

the year 2015, plotted versus the construction year of the pipes

in that class. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183

B.6 For the “red” and “blue” classes of pipes, the pipes constructed

before the year Y1 and Y2, respectively, are high-risk. . . . . . . 183

B.7 The source of the Matlab function that realises the three layers of the Neural Network. For a given set of input values, this function returns the output of the network and the activation values (outputs) of the neurons in the hidden layer. . . . . . 184

B.8 One epoch of the training process of the Neural Network by error back-propagation. This epoch is repeated until the estimation error of the neural network falls below a small given threshold. . . . . . 186


B.9 Page 1 of the code that repeats the training epochs until convergence. It reads the failure history from a Microsoft Excel file. . . . . . 187

B.10 Page 2 of the code that repeats the training epochs until convergence. It reads the failure history from a Microsoft Excel file. . . . . . 188

B.11 Example of the Excel file to be processed by the Matlab program. 188

List of Tables

2.1 Factors affecting structural deterioration of water distribution

pipes (Rostum; 1997) . . . . . . . . . . . . . . . . . . . . . . . . 20

2.2 Sample relative failure rates for different pipe materials . . . . . 25

2.3 Estimated water leakage in 12 U.S. cities in 1978 . . . . . . . . 34

2.4 Water loss percentages for different causes, measured in Boston,

1978 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

2.5 Importance of different criteria for replacement/rehabilitation . . . 35

2.6 Deterministic Time Exponential Models . . . . . . . . . . . . . 44

2.7 Deterministic Power and Linear models . . . . . . . . . . . . . . 47

2.8 Probabilistic models using time-dependent Poisson model . . . . 64

2.9 Probabilistic models using Cox’s proportional hazard . . . . . . 65

2.10 Probabilistic models using Weibull hazard function . . . . . . . 65

2.11 Miscellaneous probabilistic models . . . . . . . . . . . . . . . . 66

3.1 Construction history of cast iron mains in City West Water

(Righetti, 2001) . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

3.2 Statistics of the six classes of pipes in the dataset, selected for

reliability analysis and failure prediction in this study. . . . . . 72

4.1 MSE of survival function estimation by various models . . . . . 97

5.1 Approximate clay content of different types of soils . . . . . . . 113

5.2 The empirical expected number of failures and their standard

deviations for 16 consecutive seasons during 1997-2000. . . . . 115

5.3 Properties of different texture types found in soils (Northcote) -

Continued to the next page. . . . . . . . . . . . . . . . . . . . 119

6.1 Rejection rate for weekly, monthly and quarterly updating . . . 139

B.1 An example of pipe material and diameter coding. . . . . . . . . 177


Chapter 1

Introduction

1.1 Background

The performance of water distribution systems is primarily assessed based

on the standards governing the process of delivering water to customers.

Such standards are generally established in relation to quantity, quality

and reliability of the service, as follows:

Quantity: Water distribution systems should provide the required flows at,

or above, the minimum pressure. Hydraulic reliability of water distribution

systems assists the managers of water distribution companies to guarantee

the quantity requirements of the network. For instance, nodes in water

networks are supposed to receive a given supply at a given pressure (head)

and hence actual performance can be compared with the required standards.

Quality: Water distribution systems should ensure that the quality of the water (in terms of flavour, odour, appearance and sanitary security) delivered to consumers complies with the established regulations and standards for drinking water; this is a necessary function of water supply. Since safe drinking water is essential to a healthy life, every effort needs to be made to ensure that drinking water suppliers provide consumers with water that is safe to use. Satisfying this criterion remains a critical priority that should be considered in every function within a water distribution system. For instance, according to the Australian Drinking Water Guidelines (2004): “There should be effective maintenance procedures to repair faults and burst mains in a manner that will prevent contamination.”


Reliability: Water distribution systems should keep unscheduled disruptions to supply below the acceptable level specified by the authorities. In this sense, improvement of network reliability leads to reduction of water losses.

In order to meet the quantity, quality and reliability standards, water distribution systems need to be constantly monitored to detect any faults regarding water quantity (including delivery pressure), water quality and distribution system reliability.

A water distribution system mainly consists of pipes, valves, storage tanks and pumping stations. Among these components, water reticulation pipes, as the lifelines of urban and rural communities, are the primary parts and are typically known to be the most maintenance-intensive assets of water supply systems. Water mains are continuously subjected to numerous types of environmental and operational stresses that lead to their deterioration. Increased operation and maintenance costs, water losses and reduced water quality are common consequences of the deterioration of water mains.

Thus, to maintain an acceptable level of service, the ongoing challenge facing managers of water distribution systems is the appropriate maintenance of water pipes, typically within financial constraints. Against this backdrop, there is also often a need to improve the reliability of the system and the service delivered to users. One of the issues in this environment is the lack of regular condition monitoring of the asset. Data is mainly obtained when the asset fails by one mode or another, rather than by regular measurement of the state of the asset.

Water distribution networks include thousands of kilometres of reticulation pipes that are not in perfect condition due to deterioration, construction method or inappropriate instruction order. Obviously, wholesale replacement of these parts of the networks is not economically feasible for asset owners and operators. Therefore, it is necessary to handle older networks in appropriate ways. Besides, many old water pipes are still functioning satisfactorily. The age of the pipes, therefore, cannot be used as the sole criterion in their assessment. It should be noted that the water networks of most of the developed municipalities within the region under study were constructed more than 100 years ago. The region under study comprises the western suburbs and inner sections of the City of Melbourne in Australia. The older parts of the networks were built in accordance with standards and construction practices that are now considered inappropriate.

A number of researchers have been trying to develop decision support tools to assist the managers of water companies in prioritising their assets in terms of condition and in developing criteria and programs for maintenance/replacement of water mains. Some of these criteria can be explicitly quantified (e.g. maintenance cost, capital cost, and hydraulic carrying capacity at present and future demands). However, quantifying other criteria, such as reliability and the social costs associated with pipe failures, may require surrogate or implicit evaluation techniques.

A key challenge that has attracted the attention of many researchers in the field of water supply during recent decades is the development of reliable models for prediction of the future maintenance/replacement requirements of water mains. Traditionally, in water supply systems, a significant number of network repairs are performed on an unscheduled basis, generally in response to pipe or other component failure. This reactive maintenance of assets has the disadvantage that measures are not taken before damage occurs, and pipe failures cause considerable costs and inconvenience for water distribution systems and society. Considering the limited financial resources, the ability to avoid frequent damage and to optimise the use of available funds is highly desirable.

Since it is not practical or economically possible to replace the entire length of the network, in a proactive management scheme (which provides timely maintenance), the assets are prioritised for repair, rehabilitation or replacement. This strategy is proactive in the sense that it removes the need to wait for failures to occur before fixing them and measuring their rate of occurrence. In such a strategy, the manager of the system determines the maintenance requirements of water mains by taking into account the state of the pipes and forecasting their future performance. Employing predictive models is therefore the preferred option for water network management.

The initial motivation for this research was the situation of a water supply network in the west of Melbourne, Victoria, Australia. In 1995/96, the water mains of City West Water PTY LTD (CWW) were experiencing the highest rate of failures among Australian water distribution systems (WSAA facts’99 1999). For the same year, CWW’s Water Reticulation Asset Status: Pipe Structural Performance report (Water Reticulation Asset Status Report 1997) stated that CWW had 3.1 times the break rate of another water retailer in Melbourne, namely South East Water. Given this background, CWW has recognised the need for failure analysis of water pipes since 1999.

The managers of CWW required more control over the number of interruptions to the service of consumers. Thus, the need for efficient maintenance/rehabilitation planning based on reliable failure prediction was recognised. Accordingly, an investigation of cast iron pipes and associated failures was conducted with a view to formulating a strategy for cost-effective asset management in both the short and longer term (Righetti 2001).

In response to the above-mentioned demand, existing techniques for failure prediction of water pipes were investigated, and new methods and analyses were developed to accommodate the common requirements and conditions of water mains and failure databases of water authorities like CWW. The results of this study are presented in this thesis.

The main approach chosen in this thesis to tackle the failure analysis problem is predictive analysis, which is based on modelling past pipe breakage behaviour in order to project it into the future. The main focus is on deriving failure/reliability models in terms of the component-based characteristics of the pipes, such as their diameter, age, type and geographical location, regardless of the system-based characteristics of the pipes, such as their maximum pressure and nodal isolation.

This study aims to provide reliable techniques to assist water distribution managers in developing their maintenance/rehabilitation policies. Existing techniques for prediction of pipe failure were investigated in order to adopt a technique that meets the objective of this study. The type of available data was an important factor in determining the direction of the research.

The ability to confidently estimate the economically viable life of water mains is fundamental to developing maintenance/rehabilitation strategies in water distribution systems. The economically viable life of a water main should be assessed by analysing the level of disruption avoidance and the continued operational and maintenance costs versus rehabilitation and replacement costs, such that the best long-term solution can be identified (Skipworth et al. 2002).

The main difficulty associated with developing efficient rehabilitation or

maintenance strategies for water mains is that the water mains are under

the ground and therefore, monitoring of the physical condition of each and

every pipe is not feasible. One of the prerequisites of undertaking this task

in a proactive manner is the ability to predict the expected failure behaviour

of pipes.

Previous research, reviewed in this chapter, has shown that the failure process is a complex function of a large number of variables, some of which cannot be directly quantified. To deliver drinking water to consumers under certain standards of quantity, quality and reliability, water distribution companies need more than experienced managers and operators. In fact, there is a need for mathematical models that estimate future pipe failures with acceptable confidence in order to plan a maintenance or replacement strategy for the network. Such models can be substantially beneficial to water authorities by improving their maintenance/rehabilitation planning, replacing the common reactive strategies with prediction-based proactive strategies.

Recognising that maintenance/replacement strategies are developed to keep the number of annual service interruptions to no more than a certain number, e.g. five, this study concentrates on developing reliable estimators for either the expected number of failures or the reliability of a class of pipes over a certain time. Therefore, this study explores a proper technique for statistical analysis of the available failure data.

A range of existing failure analysis models are deterministic in the sense

that their outcome is a value such as number of failures in a certain time.

However, probabilistic measures are more reliable and preferred for devel-

opment of maintenance/rehabilitation strategies. Indeed, probabilistic ap-

proaches are more realistic as they can provide confidence intervals for their

estimates. Confidence intervals give the planners an idea about how reliable

the predictions are. Thus, this research follows a probabilistic approach to

devise new prediction techniques for water pipe failures.

To perform the statistical analysis, this study adopts a probabilistic

approach. The reason for this choice is that the outcome of a probabilis-

tic model, unlike deterministic models, is a single probability (or a set of

probabilities) not just a certain value. Considering the complexity of the

diverse range of factors affecting the mechanism of failure, these types of

estimations are more reliable than deterministic results, to be used in devel-

oping maintenance/rehabilitation strategies in water distribution systems.

Besides, the outcome of these techniques can be estimated within a certain

confidence interval. These specifications, along with the characteristics of

data available in water networks, that cannot be monitored regularly and

often, make this class of analysis more appealing for water supply managers

and researchers.

Based on critical review of existing models for prediction of structural

failures of water pipes it was found that predicting the future failure be-

haviour of a pipe for which little or no burst data exists is very difficult, and

even models based on “grouped” analysis have a high degree of uncertainty

associated with such single pipes.


1.2 Introduction to the Modelling of Failures in

Water Pipes

Prediction of future behaviour of pipes in a water supply system is a very

complex task. The factors that lead to the pipe failures vary from one

water distribution system to another. The failure patterns of a water

distribution network therefore differ from one city to another. These

patterns even differ between groups of pipes within a network. The

reason is that the relative importance of effective factors may vary. This

means that failure analysis should be performed for individual pipes or pipes

with similar failure-related characteristics.

Observing the physical condition of water pipes can reveal some infor-

mation about their structural state. For example with new non-destructive

testing techniques, it is possible to measure the average pit depth and also

the maximum pit depth caused by internal corrosion. However, these tech-

niques are time consuming and rather expensive. Therefore, the approach

of physical analysis is only practical for critical water mains.

As constant monitoring of the physical conditions of the numerous pipes

within the deteriorating network is not feasible, a practical alternative is pri-

oritisation by Statistical Analysis. One of the first solutions in this frame-

work has been the attempt to establish the life cycle of a typical buried

pipe. A number of researchers, e.g. Ascher and Feingold (1984), described

the pattern of a pipe’s life cycle by a so-called “bathtub curve”, named after

its characteristic shape, as illustrated in Figure 1.1. It is the plot of

hazard function against time, which is well-known in reliability analysis of

mechanical units. More precisely, the bathtub curve illustrates the tempo-

ral development of the rate of occurrence of failures (ROCOF) in a water

pipe over its service-life.

When modelled by a bathtub curve, the life cycle of a pipe is assumed to

consist of three phases. The first years after installation, the burn-in phase,

are associated with failures mostly due to faulty pipes or installation

errors. The frequency of failures decreases in this phase until the pipe reaches

a stable situation that is almost trouble free (in-usage phase). This phase

of pipe’s life cycle will continue up to the time that the pipe’s age and the

accumulated degradation cause increasing frequency of failures. This last

stage of life cycle, the wear-out phase, is of most importance in developing

the maintenance strategies for mature water distribution systems.
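
The three phases described above can be sketched as a piecewise ROCOF function. This is only an illustration of the shape of the curve: the function name, the phase boundaries and all rate values are hypothetical assumptions, and none of them is estimated from the CWW data or taken from the cited literature.

```python
import math


def bathtub_rocof(age_years, burn_in_end=5.0, wear_out_start=60.0,
                  base_rate=0.05, burn_in_excess=0.20, wear_out_growth=0.04):
    """Illustrative rate of occurrence of failures at a given pipe age.

    All default values are hypothetical; they only reproduce the
    three-phase bathtub shape and are not fitted to any dataset.
    """
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years < burn_in_end:
        # Burn-in phase: the excess rate decays linearly to the base rate.
        return base_rate + burn_in_excess * (1.0 - age_years / burn_in_end)
    if age_years < wear_out_start:
        # In-usage phase: low and roughly constant failure rate.
        return base_rate
    # Wear-out phase: the rate grows exponentially past the onset age.
    return base_rate * math.exp(wear_out_growth * (age_years - wear_out_start))


if __name__ == "__main__":
    for age in (0, 2, 5, 30, 60, 80, 100):
        print(age, round(bathtub_rocof(age), 4))
```

A different wear-out shape (such as alternative B in Figure 1.1) could be obtained by swapping the last branch for another increasing function of age.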

A bathtub curve cannot always accurately describe the life cycle of all

Figure 1.1: The bathtub curve of the life cycle of a buried pipe (Kleiner and

Rajani 2001)

pipes. Pipes under different conditions experience various extents and pat-

terns of stages in their life. Some of the models that will be explained

in Chapter 2 assume different numbers of phases for the life cycle of pipes.

Some assume different curve-shapes for ROCOF-time plots (e.g., alternative

B to the wear-out phase in Figure 1.1).

It should be mentioned that maintenance databases available for most

water distribution systems do not cover the early years of those pipes laid

about 100 years ago. More specifically, the data available in water

distribution systems are both left and right censored in the sense that failures

in the early years are usually not available and the future is to be predicted.

Proactive management of assets in these systems can only be performed by

using the decision supporting tools that can extract realistic estimates for

future performance of their assets from such data.

Proactive maintenance strategies for water mains usually use estimates

of one of the following quantities as their key decision-making criterion:

(a) The number of failures in a given time period;

(b) The time remaining until the next failure;

(c) The statistical distribution of (a); or

(d) The statistical distribution of (b).

To obtain such estimates, most statistical models implicitly assume that the

pattern of occurrences of failures repeats over time and therefore, it can be

modelled based on the failure history.


Statistical models can be generally categorised into major classes of de-

terministic and probabilistic models. The outcome of a deterministic model

is a certain value such as estimated breakage rate or time to next failure

or number of failures in a certain time in the future (outcomes (a) and (b)

in the above list). These models are applied to groups of pipes that are

homogeneous in terms of some of their breakage-related characteristics such

as diameter, material, length, geographical location, etc. Classification of

pipes is performed in such a way that each class contains pipes of similar

failure patterns. A failure history of each homogeneous group is then used

to estimate the future number of failures (or failure times) for that class.

Probabilistic models on the other hand return the distribution of fail-

ure times or numbers (outcomes (c) and (d) in the above list) in terms of

probabilistic measures such as probability of occurrence of given number

of failures up to a certain time in the future. These models are usually

associated with more complex mathematical frameworks. A probabilistic

approach, however, provides the capability of determining some lower and

upper bounds (confidence intervals) for the estimations, which can be help-

ful from planning point of view.
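
The difference between the two kinds of outcome can be sketched with a simple counting model. The homogeneous Poisson assumption used here is made purely for illustration and is not the model proposed in this thesis; the function names and the example rate are likewise hypothetical. The deterministic outcome is a single expected count, whereas the probabilistic outcome is a full distribution from which an upper bound can be read.

```python
import math


def poisson_pmf(k, expected):
    """Probability of exactly k failures when the expected count is `expected`."""
    return math.exp(-expected) * expected ** k / math.factorial(k)


def expected_failures(rate_per_year, horizon_years):
    """Deterministic outcome: a single expected number of failures."""
    return rate_per_year * horizon_years


def failure_count_distribution(rate_per_year, horizon_years, max_count):
    """Probabilistic outcome: P(N = k) for k = 0 .. max_count."""
    expected = expected_failures(rate_per_year, horizon_years)
    return [poisson_pmf(k, expected) for k in range(max_count + 1)]


def upper_bound(rate_per_year, horizon_years, level=0.95):
    """Smallest count k such that P(N <= k) reaches the requested level."""
    expected = expected_failures(rate_per_year, horizon_years)
    cumulative = 0.0
    k = 0
    while True:
        cumulative += poisson_pmf(k, expected)
        if cumulative >= level:
            return k
        k += 1


if __name__ == "__main__":
    rate = 2.0  # hypothetical failures per year for one class of pipes
    print("expected count:", expected_failures(rate, 1.0))
    print("95% upper bound:", upper_bound(rate, 1.0))
```

A planner working with the expected count alone would budget for two failures in this hypothetical case, while the distribution shows that noticeably higher counts remain plausible.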

1.3 Objectives and Structure of Study

This thesis describes the development of new state-of-the-art failure

prediction techniques for water mains. The main motivation of this study was

to contribute to the sustainability of water supply systems, one of the oldest

icons of infrastructure, present in most of the urban and rural communities

across the globe.

The mechanism of pipe failure has been studied by many researchers as

a complex process under influence of a vast variety of internal and external

factors, e.g. (Clark et al. 1982, Constantine et al. 1998, Kleiner and Rajani

2001). Each combination of these physical characteristics, environmental,

and operational factors leads to a different pattern of failure rates. These

studies have been conducted with the aim of formulating the process of

failure in terms of some of the measurable influencing factors. In many case studies,

the effects of some of these factors on the failure rate have been studied.

As the first step towards developing a solution for the research problem

of this thesis, previous studies in this literature were thoroughly reviewed.

Chapter 2 presents a review of other studies conducted in this field and the

existing techniques and models. The models are classified into a number

of distinct categories. For each model, its required data, strengths and

limitations, and the model outcomes are presented and evaluated.

A special feature of this thesis is the usage of a case study for the purpose

of comparative simulation and evaluation. This case study uses a failure

history of water mains provided by City West Water Pty Ltd (CWW) to evaluate pipe failure

predictions. This water retailer supplies water to the western and some inner

suburbs and parts of the Central Business District (CBD) of Melbourne,

Australia. Almost all the break-related challenges in water industry are

involved in this case study.

The type of data that is available in most water distribution systems is

explained in Chapter 3. More specifically, this chapter also describes the

characteristics and limitations of the database provided by CWW that is

used and improved in this study. It should be noted that the data used

in this study consists of failures of metal pipes digitally recorded during

1997-2000. The database has passed a significant data auditing in a recent

study supported by CWW.

On the basis of the reviews and analysis in Chapters 2 and 4, a technique

for prediction of the reliability of water mains in a given time in the future

is proposed. Reliability analysis is separately performed for each class of

pipes where that class is chosen to be homogeneous in terms of some break-

related characteristics. Reliability of a class of pipes in a given time in the

future is defined as the probability of having that class of pipes working

with no failure till that time. An Artificial Neural Network (ANN) is used

for producing the reliability estimation models for separate classes of pipes.

Usage of ANNs for modelling and estimation is a well adopted technique

in reliability analysis of mechanical and electrical systems. This method is

a powerful tool with applications in many fields involving curve fitting, pat-

tern recognition, etc. Furthermore, ANNs are capable of learning underlying

physical models from noisy and wavy patterns of incomplete and censored

data. Chapter 4 provides a short review of feed-forward artificial neural

network models for homogeneous groups of pipes.

An ANN technique is then formulated to generate reliability estimation

models. These neural network models are trained and evaluated for each

class of pipes using the case study dataset described in Chapter 3.

The process of training the ANN model using training records and vali-

dation of the trained model using the rest of the data (validation records)

are also explained in Chapter 4. For the purpose of comparison, Weibull

and lognormal distribution models are fitted to the failure records in the

case study and applied to estimation of the reliability of the same classes


of pipes used with the ANN model. The accuracy of reliability estimates

given by the proposed neural network method is compared to the outcomes

of the Weibull and lognormal estimators.
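
For reference, the reliability functions of the two classical lifetime models can be written down directly from the definition of reliability given above (the probability of no failure up to time t). The sketch below uses the standard textbook forms of the Weibull and lognormal survival functions; the parameter names and the example values are assumptions for illustration and are not the values fitted in this study.

```python
import math


def weibull_reliability(t, shape, scale):
    """R(t) = exp(-(t / scale) ** shape) for the two-parameter Weibull model."""
    if t <= 0:
        return 1.0
    return math.exp(-((t / scale) ** shape))


def lognormal_reliability(t, mu, sigma):
    """R(t) = 1 - Phi((ln t - mu) / sigma) for the lognormal model."""
    if t <= 0:
        return 1.0
    z = (math.log(t) - mu) / sigma
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - cdf


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the calling convention.
    for t in (1.0, 5.0, 10.0, 20.0):
        print(t,
              round(weibull_reliability(t, shape=1.5, scale=10.0), 4),
              round(lognormal_reliability(t, mu=2.0, sigma=0.5), 4))
```

Both functions impose a fixed shape on the reliability curve once their parameters are chosen, which is the property that distinguishes them from the neural network models.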

Although the resulting neural network models are trained and evaluated

using the particular failure history, a similar procedure can be repeated

to produce models for other failure histories. Hence, the presented neural

modelling method is a generic technique and can be modified for other

failure datasets, as well.

The resulting neural network model is also compared to the two clas-

sical life time models (Weibull and lognormal models) using the existing

database. However, the neural network technique inherently learns and re-

constructs the pattern of data and unlike the distribution models does not

assume a predefined behaviour for the data. This characteristic makes this

model a general tool not restricted to particular data.

Chapter 5 studies the underlying perceptions of the statistical models

existing in the literature and their assumptions on the nature of failure

occurrences. In a novel approach, this chapter looks at the ensemble of pipe

failure occurrences as a random process. A set of probabilistic values are

introduced to be used to investigate the statistical characteristics of this

random process. Visualising the variations of this random process using

these probabilistic values demonstrates the non-stationarity of the failure

process. This important characteristic of the failure process of water pipes

is illustrated by histograms of empirical values of failure probabilities for the

number of failures occurring per season using the available failure records.
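
A first step towards such a seasonal view is to count the failures recorded in each season of each year, so that the same season can be compared across consecutive years. The sketch below is one possible way of doing this and is not the procedure of Chapter 5: the function names are hypothetical, the Southern Hemisphere season boundaries are an assumption, December is assigned to the summer of its own calendar year for simplicity, and the dates in the usage example are invented placeholders rather than CWW records.

```python
from collections import Counter
from datetime import date

# Assumed Southern Hemisphere meteorological seasons.
SEASON_BY_MONTH = {
    12: "summer", 1: "summer", 2: "summer",
    3: "autumn", 4: "autumn", 5: "autumn",
    6: "winter", 7: "winter", 8: "winter",
    9: "spring", 10: "spring", 11: "spring",
}


def seasonal_failure_counts(failure_dates):
    """Count failures per (year, season) from a list of failure dates."""
    return Counter((d.year, SEASON_BY_MONTH[d.month]) for d in failure_dates)


def same_season_series(counts, season):
    """Return [(year, count), ...] for one season, ordered by year."""
    years = sorted(year for (year, s) in counts if s == season)
    return [(year, counts[(year, season)]) for year in years]


if __name__ == "__main__":
    # Invented placeholder dates, used only to show the calling convention.
    records = [date(1997, 1, 5), date(1997, 1, 20),
               date(1998, 2, 1), date(1998, 7, 3)]
    counts = seasonal_failure_counts(records)
    print(same_season_series(counts, "summer"))
```

If the counts for the same season drift from year to year, that drift is an informal indication of the non-stationarity discussed in this chapter.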

The variations and trends in the patterns of failure frequencies in the

CWW database - which are due to the non-stationarity of failure process

- are studied in similar seasons of consecutive years. This characteristic is

not exclusive to this database and applies to any other failure history. The

reason for the non-stationarity is that in addition to the static characteristics

of pipes, there is a range of dynamic factors that affects the rate of failures

of water mains.

Dynamic factors such as seasonal variations of rainfall, which are the

source of changes and trends in patterns of failures, are not easily taken

into account by existing techniques. These time variant factors are exclu-

sive to each network (region) and vary according to the characteristics of

the location. For example, the CWW network is located on a bedding of

expansive soil. This is also the case for many other cities such as Montreal,

Canada and parts of Texas, USA. Water mains at these locations suffer from

shrinkage and expansion of soil due to changes in moisture content of soil


as a result of rainfall variation. In other words, rainfall may substantially

affect the failure process of water pipes of such areas. The research in this

study examines the variations of failure rates in terms of changes in rainfall.

The deficiency of existing probabilistic (which are all parametric) models

of reliability and lifetime in capturing the proven non-stationary process

of pipe failures is mathematically demonstrated in Chapter 5. To model

the pattern of failure occurrences, parametric models assume a particular

probability density function (pdf) for the structural failures. Thus, these

models yield biased results by ignoring time-varying factors in the statistical

analysis of break rates.

There are many time-variant factors (e.g. temperature effects; soil-

moisture effects; cumulative length of replaced water mains; cumulative

length of cathodically protected water mains; loss of bedding support; cor-

rosion pit growth) that affect the rate of failures and are almost impossible

to be taken all into account in a single general parametric model. Even if

all of these data were recorded during the age of water mains, any attempt

to estimate these factors in a given time in the future in order to predict the

pipe condition would involve considerable uncertainty due to large spatial

and temporal variability that is inherent in this information.

Given the background in Chapter 5 regarding the limitations within the

data and complexity of process, the best way to account for the variation of

environmental and operational factors is to consider the factors implicitly

by dividing the pipes into homogeneous groups and attempting to develop

a technique to produce non-parametric models for each group. In this con-

text, even neural network models are not free of parameters. Although the

parameters of neural networks, namely the synaptic weights, are tuned using

the training records, these parameters are fixed for the models. Future

estimations are therefore performed based on these fixed parameters.

After discussing the limitations of using parametric methods in analysis

of the water pipe failure process, an iterative non-parametric approach is

explored to address the problem. Chapter 6 contains a review of a number

of existing studies that have considered the time-variant factors in predict-

ing the future failures. The strengths and limitations of these techniques

are explained. The absence of a generic technique that can be used (with a

degree of modification) for the failure history of water pipes of any network to

provide the estimations for future failures of separate groups of pipes was

the motivation for further study on this subject. This kind of estimation

can assist the managers of water distribution systems with prioritisation of

classes of pipes in their maintenance/replacement plannings.


Chapter 6 presents dynamic models that can be updated automatically

as new records are added to the database. Such non-parametric techniques

implicitly tackle the problem of capturing the non-stationarity of the failure process

in spite of incomplete and limited failure histories.

A number of probabilistic values are introduced to express the quantities

representing the gradual variations of the factors influencing the deteriora-

tion process in the pipes which in turn leads to pipe failure. The outcome of

applying this technique to a failure history of a class of pipes is the expected

number of failures up to a time in the future. Simulations are performed

based on a day-by-day updating, and the results of estimations for weekly,

monthly, and quarterly periods of forecasting are presented. The study is

based on a robust mathematical framework and also benefits from theoret-

ical convergence and stability properties demonstrated using principles of

probability theory.

The developed technique is applied to the existing failure history, de-

scribed in Chapter 3. Details of processing the data, applying the proposed

technique, and the outcomes of this statistical analysis are clearly described

in this chapter. The algorithm of applying the technique on any failure his-

tory is provided. Limitations and advantages of technique are also discussed

in the chapter. It should be noted that the technique is generic and can

be used for other failure histories as well as for components of some other

infrastructure systems with similar characteristics.

The last chapter of the thesis, Chapter 7, summarises the achievements and

conclusions of the study as well as limitations and recommendations for

further work.

In summary, the main objectives of this thesis can be articulated as

follows:

(a) Review of factors affecting pipe failure and of existing models for

predicting future performance,

(b) Analysis of a typical failure database and identification of typical lim-

itations of databases,

(c) Development of a generic model for predicting future failures based

on existing limited data using ANN,

(d) Examination of failure patterns and development of a more refined

model, and

(e) Development of supporting documentation for the proposed model, for

use by water authorities in adapting it to various pipe categories and

data sets.

Chapter 2

Literature Review

2.1 Introduction

Water distribution systems consist of a range of components such as pumps,

nodes, and pipes spread across the geographical network. All of the elements

are liable to fail due to gradual degradation or possible defects in manu-

facturing or installing operations. The cost of maintenance/replacement of

water pipes is a huge burden on water distribution companies and usually

accounts for most of the expenditures of the whole system. In addition to

maintenance costs, disruption to traffic, as well as water losses, reduction

in the quality of service, and reduction in the quality of water are typical

outcomes of pipe failures.

The physical mechanisms that lead to the pipe breakage are often very

complex. On the other hand, developing a performance estimation model

for the pipes requires an understanding of potential causes of pipe failures.

To provide a background about the factors that can be involved in the failure

process, mechanical properties and manufacturing techniques of cast iron

pipes, as the most used type of pipe in mature water distribution systems,

are reviewed in Section 2.2. Common causes of structural failures of cast

iron pipes are then discussed in Section 2.3. In Section 2.4 a review of

previous studies on the impact of factors leading to failure process of water

pipes is presented. Section 2.5 reviews studies previously conducted on

structural failures of water pipes and the associated mathematical analyses

performed by other researchers. The merits and shortcomings of the current

models developed for forecasting the structural failures of water pipes are

discussed in this section and the research problem addressed by this thesis

is defined in terms of the need to fill the existing gaps and shortcomings of

current techniques.

2.2 Mechanical Properties and Manufacturing

Techniques of Cast Iron Pipes

Cast iron is manufactured by adding a large amount of carbon to molten

iron. Carbon typically makes up about 2.5−4% of the weight of the various types of

cast iron whereas, for instance, this figure for most steels is less than 1.2%.

Adding the extra carbon lowers the melting point of the metal and makes

it more fluid and much easier to cast in complex shapes. However, when

the metal solidifies, some (or most) of the carbon forms graphite flakes.

These flakes reduce the strength of the metal and act as crack formers, initiating

mechanical failures. These flakes cause the metal to behave in a nearly

brittle fashion, rather than displaying the elastic, ductile behaviour of steel.

Fractures in this type of metal tend to take place along the flakes, which

give the fracture surface a grey colour. This is the reason for naming this

metal as grey cast iron. Almost all cast irons contain other elements added

to them to improve their castability, strength, or other useful properties.

For instance, silicon can be found in almost all cast irons as it is added to

lower the metal’s casting temperature. Manganese is also very common to

increase the strength and protect against the detrimental effects of sulphur

impurities of the metal. There are five types of cast iron; however, only

two have been used in the manufacture of water pipes, namely pit cast iron

and spun cast iron. Pipes installed until the early 1970s were made of grey

cast iron, and since then ductile iron has been used for production of metal

pipes.

The presence of graphite flakes in the structure of grey cast iron causes

unusual mechanical properties in the material. While it is not truly brittle,

neither is it a ductile material in the same way as steel or even ductile cast

iron. Another important factor is that grey cast iron has very different

mechanical behaviour in tension than in compression, typically being 2.5 to

3.5 times stronger under the latter loading conditions.

There are two main kinds of grey cast iron pipes, resulting from different

manufacturing methods. The first cast iron pipes were pit cast pipes, while

the more recent technique in the manufacture of pipes used spin casting. Pit

casting consists of pouring the molten metal into sand moulds that are placed

in a pit. The metal is then allowed to cool and the solidified pipe is taken out

of the pit. Moulds used in spin casting are made of sand or water-cooled

metal, and rotate around the longitudinal axis of the pipe. Owing to the

spinning of the mould, the metal is evenly distributed around the mould.

Besides, the molten metal solidifies much more quickly than in the pit casting

method. This results in a significant difference in the size and shape of the

graphite flakes in grey cast iron.

A major difference between these two techniques of casting is that flakes

are considerably larger in pit cast pipes than in spun cast pipes. The smaller

flakes and their different distribution in the spun cast pipes are the reason

that they are generally much stronger than pit cast pipes.

2.3 Structural Failures of Cast Iron Pipes

There are a number of causes for the structural failures of cast iron water

pipes. Some of the common conditions and failure causes are as follows:

- Corrosion: A common failure cause in metal pipes is corrosion. Al-

though pipe failures can be merely due to corrosion, in most of break-

ages with clear mechanical causes corrosion has an accelerating role by

weakening the fabric of the pipe. Simple corrosion pitting is a minor

failure mode in small diameter pipes (reticulation pipes < 300 mm in

diameter), but more important in pipes with large diameters. Corrosion

of grey cast iron pipes typically comprises two separate but related

processes. Although simple corrosion pitting is the same as in steel

pipes, in cast iron pipes, graphitisation can also take place. Graphi-

tisation removes some of the iron from the pipe, leaving a matrix of

graphite flakes that is held together in part by iron oxide.

- Manufacturing defects: Grey cast iron pipes were manufactured

using a number of different methods, as noted previously, the two

most common being pit casting and spin or centrifugal casting. As

mentioned before, manufacturing defects are the source of some fail-

ures.

Potential structural flaws in pit casting: One of the manufac-

turing defects in pit cast pipes is the presence of inclusions. In-

clusions are undesired elements in the structure of metal that

weaken it. There are two types of inclusions that can be found

in grey cast iron pipes. One type which appears as small black

spheres is undissolved ferrosilicon. Iron phosphide is the other


type of inclusion in pit cast pipes. The other common

manufacturing defect of pit cast pipes is the uneven thickness of the

wall around their circumference. When the mould is not aligned

properly, casting will result in variable thicknesses.

Potential structural flaws in spin cast pipes: Spin casting

potentially results in fewer manufacturing defects compared to pit

casting. Possible problems include inclusions, variations in the

wall thickness of the pipe along its length, porosity and improper

cleaning of the pipe moulds after casting.

- Excessive forces: There are various sources of excessive forces that

may cause pipe breaks or accelerate failure occurrences, including:

Ground movement forces directly transferred to the pipe; and

Ground movements that happen farther away along the line, which may

cause a failure at a change in pipeline direction at a thrust block

(Lackington 1991, Pascal and Revol 1994, Dyachkov 1994).

- Human error: Human errors that are potential causes of pipe fail-

ures can occur at the design phase or other parts of pipeline con-

struction. For example, some types of corrosion related failures have

been identified as result of poor installation techniques. The common

causes of failures due to human errors are:

Third party damage that is caused by other civil operations;

Human errors during pipe installation operation that cause fu-

ture failures (e.g. a failure that is initiated from a scratch that

partially removed the pipe’s coating during the installation.); and

Improper repair practices.

- Multiple event failures: Failure processes of grey cast iron pipes

are usually due to a combination of factors that may include external

loading, internal pressure, manufacturing flaws and corrosion damage

(Morris 1967). Many circumferential and bell split type failures actu-

ally occur as a series of multiple events. Pipe cracks that are detected

usually initiate in the form of water leaking from a small crack. If

the damage is not detected, a second or even third cracking event may

take place, with the process continuing until the pipe fails completely

if not removed due to a leak detection operation.

2.4 Effective Factors in Pipe Failure Mechanisms:

A Review of Previous Studies

In recent decades, many researchers have attempted to relate the rate (fre-

quency) of water pipe failures to the attributes of the pipes and environ-

mental conditions. This section contains a review of the studies that have

been conducted on pipe characteristics in order to determine the quality

and extent of their impact on pipe failure.

A variety of factors contributing to failures of water pipes have been

identified by a number of researchers (Shamir and Howard 1979, Kelly and

O’Day 1982, Goulter and Kazemi 1988, Arnold 1960, Remus 1960, Niemeyer

1960). Fitzgerald (1960) examined pipe failure conditions in the United

States and made specific recommendations that accurate and detailed leak

and failure records should be maintained to develop and (or) maintain ef-

fective programs for break reduction. Morris (1967) suggested a number of

possible causes for water main breaks, but emphasised that “the cause of

water main breaks cannot always be ascertained”.

Rajani and Tesfamariam (2005) reported that, in most cases, a combi-

nation of circumstances leads to the failure of a pipe. Factors contributing

to pipe failure include operational conditions, design parameters, external

loads (traffic, frost, etc.), internal loads (operating and surge pressures),

temperature changes, loss of bedding support, pipe properties and

condition, and corrosion pit geometry. This information is recorded rarely if at

all and it is therefore very difficult to ascertain the precise causes of

failure. Even if all this information were available, any attempt to estimate the

state of the pipe condition would involve considerable uncertainty due to a

large spatial and temporal variability that is inherent in this information.

Focusing on the correlation of corrosion rate and failure of cast iron pipes,

the authors concluded that long-term performance of buried cast iron pipes

is dictated by pit growth rate (growth rate of a single corrosion pit), unsup-

ported length of pipe (likely to develop as a result of prolonged leakage or

wash out), material fracture toughness and temperature differential. Also,

soil movement and ground frost development were mentioned as determin-

ing factors.

Pipe breakage is most likely to occur when the environmental and oper-

ational stresses act upon a pipe whose structural integrity has been compro-

mised by corrosion, degradation, inadequate installation or manufacturing

defects (Kleiner and Rajani 2001). Common variables involved in the deterioration

process of water networks can be grouped into structural, environmental,

hydraulic and maintenance (Rostum 1997).

Table 2.1: Factors affecting structural deterioration of water distribution

pipes (Rostum 1997)

Structural variables    External/environmental variables    Internal variables    Maintenance variables
Location of pipe        Soil type                           Water velocity        Date of failure
Diameter                Loading                             Water pressure        Date of repair
Length                  Groundwater                         Water quality         Location of failure
Year of construction    Direct stray current                Water hammer          Type of failure
Pipe material           Bedding condition                   Internal corrosion    Previous failure history
Joint method            Leakage rate
Internal protection     Other networks
External protection     Salt for de-icing of roads
Pressure class          Temperature
Wall thickness          External corrosion
Laying depth
Bedding condition

Table 2.1 lists a number of factors in each group of variables. Most of

the factors are constant over time, while some are time-dependent (e.g. water

quality, water velocity). In the following section, factors that are commonly

assumed to have the greatest impact on pipe failure are discussed.

2.4.1 Pipe diameter

All of the studies reviewed in the literature that investigate the relationship

between the size of a pipe and its failure frequency agree on the existence of

an inverse relationship, e.g. (Andreou 1986, Eisenbeis 1999, Ciottoni 1983,

Kettler and Goulter 1985a, Guan 1995, Mavin 1996). In a recent study,

Boxall et al. (2006) have observed an exponentially decreasing relationship


between failure rate and diameter. The high frequency of failures for pipes

with small diameters can be explained by reduced pipe strength, reduced

wall thickness and less reliable joints for smaller pipes (Kettler and Goulter

1985a).

In a study of pipe-soil interaction, Rajani and Tesfamariam (2005) found

that the growth of a single corrosion pit in cast iron mains is nearly always

more detrimental to thin (small-diameter) than to thick (large-diameter)

pipes if all other data or properties remain unchanged.

A sensitivity analysis conducted in a recent study by Tesfamariam et

al. (2006) confirms that large-diameter mains are more sensitive to external

loads, and small-diameter mains are sensitive to the extent of loss of bedding

support.

2.4.2 Pipe length

For long pipes (L > 1000 m), environmental conditions such as soil conditions

of the bedding and traffic conditions at the surface above the network

can vary along the pipe. Andreou (1986) stated that the hazard function of

each pipe is approximately proportional to the square root of its length. A

number of other researchers (Eisenbeis 1999, Lei 1997, Eisenbeis 1997) have

also reported similar findings. Length of pipe may also be considered as

a surrogate for connection density, and if connections are considered as a

point of weakness, shorter pipes may exhibit higher burst rates than longer

pipes (Skipworth et al. 2002).
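The square-root relationship reported by Andreou (1986) can be illustrated with a minimal sketch. The baseline hazard value and the reference length used here are hypothetical numbers chosen only for the example; they are not taken from the cited studies.

```python
import math


def length_scaled_hazard(base_hazard, length_m, ref_length_m=100.0):
    """Scale a baseline hazard by sqrt(length / reference length).

    base_hazard and ref_length_m are illustrative assumptions, not
    values reported in the literature.
    """
    if length_m <= 0 or ref_length_m <= 0:
        raise ValueError("lengths must be positive")
    return base_hazard * math.sqrt(length_m / ref_length_m)


# Under this assumption, a pipe four times as long has twice the hazard.
ratio = length_scaled_hazard(0.02, 400.0) / length_scaled_hazard(0.02, 100.0)
```

Under a square-root law, the hazard per metre falls as the pipe gets longer, so it cannot be treated as a constant rate per unit length.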

2.4.3 Pipe age

Early analyses of pipe breakage conditions suggested that there is not a

strong correlation between the rate of pipe breakage and its age (O’Day

et al. 1980, O’Day 1982, O’Day 1983, Ciottoni 1983). However, in later

studies by Clark et al. (1982), Chambers (1983) and Kettler and Goulter

(1985a) a correlation between the age of pipe and frequency of its failure

occurrences was observed. In a quantitative study of the performance of some

water distribution systems, Butler and West (1987) reported that the average

leakage in U.K. water network systems aged about 50 years was

about 30%. The corresponding figures for two water distribution systems in Germany

and three in Holland, with average system ages of 20 and 25 years, were in

the range from 2% to 15%.


Goulter and Kazemi (1988) later concluded that age should not be the

single factor used for assessing the pipe condition. Also, Herbert (1994)

mentioned the age as an important factor in combination with knowledge

of network condition and weak points to allow accurate assessments.

Andreou et al. (1987) reported a tendency of pipes with failures at early

ages to perform better than pipes that failed at later ages. A number of

researchers reported that during the first few years after installation, pipe

failures occur quite randomly (hardly predictable) due to factors such as

unusual external loads on the pipe (Hoyland and Rausand 1994, Rausand

and Reinertsen 1996).

In some cases, older pipes are more resistant to failure than younger

pipes. For grey cast iron pipes, this can be explained by the thinner walls

produced by newer casting methods. Pipes with thinner walls are more

susceptible to greater effects of corrosion and higher stress levels for the

same external loads compared to pipes with thicker walls. However, jointing

techniques have improved over the years, allowing greater deflections at

joints.

It is also observed that some construction eras have a higher break rate

than others (Pelletier et al. 2003). Different installation eras show different

failure characteristics. It therefore appears that construction practice for

each time period has more effect on the pipe failure characteristics than the

age of the pipe (Andreou et al. 1987).

2.4.4 Pipe material

The strength of a pipe is a major factor in the ability of pipe to resist the

internal and external loads. The material of the pipe also determines the

vulnerability of pipe to corrosion. Kettler and Goulter (1985a) investigated

the variations of failure rates with pipe material and examined the type of

failures (e.g. longitudinal split, joint, or circumferential failure) for different

pipe materials.

During the study of a water distribution system in Trondheim, Norway,

Lei (1997) realised that pipe material should be considered as a categori-

sation factor and not a covariate. The same conclusion was drawn from a

study by Achim et al. (2007). The authors showed this concept by three

dimensional plots of the number of previous failures and pipe age and pipe

length for two different materials of pipes for which the data existed.

Most water networks are mainly made of cast iron pipes (grey cast iron

and ductile iron pipes) and the longest existing failure records belong to this


class of pipes. Many researchers have focused on analysis of failures of grey

cast iron pipes (Andreou 1986, Goulter and Kazemi 1988, Eisenbeis 1999,

Rajani and Tesfamariam 2005, Tesfamariam et al. 2006). Cement mortar

lining of water mains has been used to minimise corrosion and tuberculation

since the 1920s (Mays 2000).

Since the 1970s, ductile iron (DI) has been used as pipe material in many

water distribution networks (Kirmeyer et al. 1994). A ductile iron pipe is

produced with low contents of phosphorus and sulphur, while magnesium

is added to the molten grey iron prior to casting. Consequently, the final

microstructure of DI consists of a uniform distribution of graphite nodules

within the ferritic iron matrix. In contrast, the graphite is in the form of

flakes in grey cast iron pipes. This superior mechanical structure of DI pipes

compared to cast iron pipes initially caused water distribution managers to

lay these pipes with minimal or no corrosion protection. Within a few years

it became apparent that unprotected DI pipes in aggressive soils tend to

corrode at a rate equal to that of cast iron (CI) pipes. However, because

DI pipes had smaller wall thickness than their equivalent-size CI pipes,

perforation appeared in many cases relatively soon after installation (Rajani

and Kleiner 2003).

The usage of Asbestos Cement (AC) pipes in water networks has been

usually associated with the health concern related to the release of asbestos

fibres into the drinking water due to chemical attack on the asbestos cement

material and the erosion of the internal surface of the pipe by the water.

It is observed that in some environments, AC pipe materials are subject to

damage due to various chemical processes that either leach out the cement

material or penetrate the pipe wall to form the products that weaken the

cement matrix (Mordak and Wheeler 1988).

The other detrimental mechanism observable in AC pipes is corrosion,

which is identified in the form of pits and holes on pipe walls. A num-

ber of chemical agents including acids, sulphates, magnesium salts, alkaline

hydroxides, ammonia and soft water were reported by Nebesar (1983) as

the source of corrosion in this type of pipe. Some organic compounds were

found to be “corrosive” as well. External corrosion of AC pipes follows

the same principles as internal corrosion, i.e., pH, alkalinity, sulphates

contained in the soils or groundwater which attack the pipes (Jarvis 1998). In

recent times, polyvinyl chloride (PVC) and polyethylene (PE) pipes have

been introduced for use in water networks. Eisenbeis (1997) has presented

a statistical analysis of failure rates in plastic pipes.

Obviously, failure rates differ for various pipe materials. However, due


to the diversity of characteristics and environmental and system conditions

that influence the failure behaviour of pipes, it is not easily possible to

come up with a clear prioritisation of pipes performance just based on their

material. This problem is illustrated by Table 2.2 that contains an anal-

ysis of relative failure rates for different pipe materials compared to the

corresponding failure rates of grey cast iron pipes extracted from a study

conducted by Eisenbeis et al. (2000) on the software tools used by European

water cement (AC) pipes. In this case, the average age of the pipe is not

taken into account. For Reggio Emilia, Trondheim and Bergen, the relative

failure rate is the ratio of the failure rate of pipes of the material concerned

to the failure rate of grey cast iron pipes

in the same area. For Bordeaux, the relative failure rate is the ratio of the hazard

function of pipes of the material concerned to that of grey cast iron pipes.

The improvement of failure rates for grey cast iron (GCI) pipes, de-

creasing from 1994 to 1996, may be due to a policy change that resulted

in decreasing water pressure in the distribution systems. In Table 2.2, for

Bordeaux, ductile iron and GCI pipes are compared. Even after eliminat-

ing the influence of age, it shows that GCI pipes break more than ductile

iron pipes. The table also shows that, in Norway, asbestos cement and

unprotected ductile iron pipes are more vulnerable than GCI.

2.4.5 Manufacturing methods

Performance of pipes of the same material may differ due to different man-

ufacturing methods. For instance, the first cast iron pipes were horizontally

cast in sand moulds. Pit cast iron pipes had uneven wall thickness as a

result of this manufacturing technique. Later, vertical (spin) casting was

introduced as an advanced technique in pipe manufacturing industry. Spin

casting was first developed in the United Kingdom in 1916. However, the

transition between the two techniques in North America largely took place

between 1920 and 1930 (Rajani 1995). Most pipes installed after the lat-

ter date were made by spin casting. Makar and Kleiner (2000) stated that

centrifugal casting produced a stronger pipe due to differences in the mi-

crostructure produced by the two processes.

Spun cast iron pipes had more even wall thicknesses. This method allowed the production of pipes with thinner walls. Centrifugal casting methods resulted in even greater consistency of wall thickness. These methods were used in Australia for the first time in 1962, when a centrifugal casting machine was used to produce pipes with diameters up to 750 mm (Price and Sutton 1988). Production techniques, as well as the materials, need to be considered when analysing grey cast iron pipe failures. The production method is correlated to the year of production, which in turn is related to the laying year available in most pipe records. Walski and Pelliccia (1982) took into account the manufacturing order of pipes by suggesting an exponential model for determining the state of a pipe, with two parameters representing the type of pipe casting.

Table 2.2: Sample relative failure rates for different pipe materials used in water distribution systems (Eisenbeis 1999), where relative failure rate = (failure rate of the material concerned) / (failure rate of grey cast iron).

Reggio Emilia   Reggio Emilia   Reggio Emilia   Bordeaux   Trondheim   Bergen
1994   1995   1996   1978-1996   1978-1999
PE   0.01   0.11   0.25   0.06   0.06
PVC   0.21   0.25   0.3   0.01   0.12
Asbestos Cement   0.34   0.64   0.68   1.92   1.44
Steel   0.08   0.11   0.15
Grey Cast Iron   1   1   1   1   1   1
Ductile Iron (no corrosion protection)   0.81   1.75
Ductile Iron (corrosion protection)   0.22   0.12

2.4.6 Corrosion

Corrosion is one of the main causes of deterioration of iron pipes (Rastad

1995). This electrochemical process is caused by a variety of environmental

conditions that induce formation of electrochemical cells, which encourage

external corrosion pits in ductile iron (DI) and graphitised zones in cast

iron (CI). These signs of corrosion can emerge as early as 5 years or as late

as 30 to 65 years after installation (Rajani and Kleiner 2003).

This natural phenomenon usually occurs in two basic ways, galvanic and

electrolytic corrosion. Galvanic corrosion involves direct electric current

that is generated within the galvanic cell, whereas in electrolytic corrosion

the direct current is from an external source. Appropriate background and


detailed discussions on corrosion theory, galvanic corrosion, electrolytic or

stray current corrosion and bacteriological corrosion can be found in pub-

lished information such as Peabody (1967), Peabody (2001), and NACE

(1984). O’Day (1989) identified galvanic corrosion to be the primary reason

for the external deterioration of iron pipes, determined by the soil proper-

ties, such as pH, resistivity, moisture content, and redox potential.

In reality, grey cast iron pipes fail because of a combination of factors

that may include external loading, internal pressure, manufacturing flaws

and corrosion damage (Morris 1967). Many circumferential and bell split

type failures occur as a series of multiple events. Detailed definitions and

characteristics of failure modes such as circumferential and bell split are

available in Makar et al. (2001). In these cases, the pipe cracks part way

through and may start leaking water. If the damage is not detected, a second

or even third cracking event may take place, with the process continuing

until the pipe fails completely or is removed from service due to a leak

detection work (Kleiner et al. 2005).

Internal corrosion depends on the characteristics of the transported wa-

ter (e.g. pH and alkalinity) and external corrosion depends on the charac-

teristics of environment around the pipe (e.g. soil characteristics and soil

moisture). Wall thickness and pipe strength decrease over time under the effect

of corrosion, and the likelihood of breakage increases. Ahammed and Melchers

(1994) attempted to model the reduction of wall thickness of iron

pipes resulting from corrosion. The spread of corrosion in mature iron pipes is

generally considered to be a power function of pipe age. Internal corrosion, as

well as the means that are used to control it (water treatment), has another

detrimental effect on the performance of the system. The authors stated

that loss of carrying capacity is a direct impact of internal corrosion, due

to a decrease in the Hazen-Williams coefficient. The effect of deteriorating

factors whose impacts increase with the age of the pipe is believed by some

researchers to be captured by the Hazen-Williams formula (e.g. Finnemore and Franzini

(2002)). This equation relates the flow, Q, to a series of parameters as:

$$Q = K\,C_{HW}\,D^{2.63}\left(\frac{\Delta H}{L}\right)^{N} \qquad (2.1)$$

where $C_{HW}$ is the Hazen-Williams coefficient; $D$ is the internal diameter; $\Delta H$ is the head loss due to friction; $L$ is the length; $N$ is the exponent of the hydraulic slope, usually taken as 0.54; and $K$ is a constant dependent on the choice of units. According to Equation (2.1), a drop in the value of $C_{HW}$ with age leads to an increase in $\Delta H$, i.e. a higher head loss over the length $L$ of pipe, at a given level of the flow $Q$.
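The effect of a falling Hazen-Williams coefficient can be checked numerically with a short sketch of Equation (2.1). It assumes SI units, for which the constant is commonly taken as K = 0.2785 (flow in m^3/s, diameter, length and head loss in m). The coefficient values 130 and 80, the diameter, the length and the flow are hypothetical inputs chosen only for illustration.

```python
K_SI = 0.2785    # unit constant for SI units (assumed: Q in m^3/s, lengths in m)
N_SLOPE = 0.54   # exponent of the hydraulic slope, as in Equation (2.1)


def hazen_williams_flow(c_hw, diameter_m, head_loss_m, length_m):
    """Flow Q from Equation (2.1) for a given head loss over the pipe."""
    slope = head_loss_m / length_m
    return K_SI * c_hw * diameter_m ** 2.63 * slope ** N_SLOPE


def hazen_williams_head_loss(flow_m3s, c_hw, diameter_m, length_m):
    """Head loss obtained by solving Equation (2.1) for the head loss."""
    slope = (flow_m3s / (K_SI * c_hw * diameter_m ** 2.63)) ** (1.0 / N_SLOPE)
    return slope * length_m


# Hypothetical 300 mm main, 1000 m long, carrying 0.05 m^3/s.
new_pipe_loss = hazen_williams_head_loss(0.05, 130.0, 0.3, 1000.0)
aged_pipe_loss = hazen_williams_head_loss(0.05, 80.0, 0.3, 1000.0)
```

Because the head loss varies with the coefficient raised to the power -1/N, the ratio of the two losses here is (130/80)^(1/0.54), so a moderate drop in the coefficient more than doubles the head loss at the same flow.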


Karaa and Marks (1990) argued that external corrosion is an important

factor to incorporate in predictive models as its intensity, unlike that for

internal corrosion, will vary from pipe to pipe as soil conditions vary.

In the past few years, different non-destructive techniques (NDT) have

become available (Hartman and Karlson 2002, Rajani and Kleiner 2004) to

measure remaining wall thickness, corrosion pits (ductile iron) or graphiti-

zation depths (cast iron) along the pipes. Results obtained from these NDT

measurements have to be incorporated within a broad decision support tool

to assess condition state, determine time to failure and remaining service life

for each inspected pipe and subsequently establish proactive management

strategies. Rajani et al. (1996) and Rajani and Tesfamariam (2004) have

developed a pipe-soil interaction model that determines stresses, strains,

and displacements at any point along the length of a jointed pipe. Rajani

and Tesfamariam (2005) developed a fuzzy logic based model to integrate

corrosion rates with the remaining wall thickness or pit geometry measure-

ments obtained from NDT inspections to arrive at the time to failure.

Tesfamariam et al. (2006) also developed an analytical model based on

a soil-pipe interaction model to estimate the remaining service life of one

cast iron pipe with several corrosion pits of significant depths observed at

the time of inspection. They expressed the resulting estimation as a factor

of safety. The authors also undertook a sensitivity analysis, using a

Monte Carlo type random sampling method, to identify the critical compo-

nents of data that merit further investigation. Sensitivity analysis strongly

suggested that reducing pit depth (graphitisation) growth by using proper

corrosion control can be the single most effective way to decelerate the

breakage growth rate of existing pipes.

2.4.7 Pipe failure history

Shamir and Howard (1979) claimed that analysis of previous failures can

assist in identifying the primary causes for breaks within a distribution

network. Once these primary causes have been isolated, changes in pipeline

design, such as construction characteristics, joint design, pipe material, and

maintenance procedure, can be initiated to improve the situation.

In fact, the failure history of a pipe is a significant factor in prediction

of future failures (Walski and Pelliccia 1982). Andreou (1986) used Cox’s

proportional hazards model to analyse the breakage rates in the water network.

Cox’s proportional hazards model is explained in more detail in

Section 2.5.3. He reported that breakage rate increased with each break-


age occurrence, up to the third break after which the breakage rate was

constant. At this point, the pipes were assumed to be in a “fast breaking

state”. The number of previous breaks was found to significantly affect the

hazard function of the pipes. Eisenbeis (1999) observed a similar pattern.

Malandain et al. (1999) included these findings from Andreou (1986) and

Eisenbeis (1999) in a break rate model that will be discussed later in this

chapter.
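The reported pattern, in which the breakage rate rises with each break and then levels off after the third, can be sketched with a proportional-hazards style multiplier exp(beta * x) in which the covariate x is the number of previous breaks capped at three. This is only a simplified illustration and not the model fitted by Andreou (1986); the coefficient beta = 0.5 is a hypothetical value.

```python
import math

FAST_BREAKING_THRESHOLD = 3  # after the third break the rate is treated as constant


def relative_hazard(previous_breaks, beta=0.5):
    """Hazard multiplier exp(beta * x) relative to a pipe with no breaks.

    x is the number of previous breaks, capped at the threshold to mimic
    the "fast breaking state". beta is an illustrative assumption.
    """
    effective_breaks = min(previous_breaks, FAST_BREAKING_THRESHOLD)
    return math.exp(beta * effective_breaks)


# Multipliers for pipes with 0 to 5 previous breaks.
multipliers = [relative_hazard(k) for k in range(6)]
```

The multiplier grows for the first three breaks and is identical for any pipe with three or more previous breaks.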

Goulter and Kazemi (1988) observed the temporal and spatial cluster-

ing of water-main breaks, indicating that a previous break increased the

likelihood of future breaks in its immediate vicinity. The authors reported

that about 60% of all subsequent breaks occurred within 3 months of the

previous break. They suggested that the subsequent breaks are caused by

damage during repair operations, such as pressure surge while refilling the

pipe after repair or ground movements caused by excavation, back-filling

and the movement of heavy vehicles. Several factors unrelated to repair

activities are also responsible for the clustering of breaks in the network.

Pipes in the same location often have the same age and materials and are

laid with the same construction and joining methods. Pipes in the same

location are also likely to be exposed to the same external and internal

corrosion conditions.
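A temporal clustering statistic of the kind reported by Goulter and Kazemi (1988) can be computed from a list of break dates, as in the sketch below. It considers only the time between consecutive breaks and ignores the spatial dimension. The 90-day window is an assumed stand-in for "3 months", and the dates are invented for illustration.

```python
from datetime import date


def fraction_within(break_dates, window_days=90):
    """Share of subsequent breaks occurring within window_days of the previous one."""
    ordered = sorted(break_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return 0.0
    return sum(gap <= window_days for gap in gaps) / len(gaps)


# Hypothetical break history for one neighbourhood.
example_breaks = [
    date(2001, 1, 10),
    date(2001, 2, 20),
    date(2001, 3, 5),
    date(2002, 6, 1),
    date(2002, 7, 15),
]
share = fraction_within(example_breaks)
```

In this invented history, three of the four subsequent breaks follow the previous break within the window.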

2.4.8 Water pressure

Water pipes are designed to resist a designated internal pressure of water

passing through them. Minimum feasible operating pressures are still con-

strained by local topography (Lambert 1998) and operating requirements.

A flawless installation operation provides a uniform support over the entire

length of the pipes by the bedding. Poor installation practices or distur-

bance over time (due to soil movement) may cause a lack of support in some

points that leads to bending moments and longitudinal stresses. The ability

of the structure of a pipe to resist such forces is a function of the tensile

strength of the material and wall thickness (Skipworth et al. 2002).

In this context, static pressure of water and pressure surges in a distri-

bution system can affect the pipe failure process. Pressure surges can occur

when water and air valves open and close during network operations. These

surges can be one of the factors in failure clustering as valves are closed and

opened during repair activities. Andreou (1986) found the static pressure

an effective factor when modelling the pipe failure patterns, but the im-

portance of the variable was not found to be considerable in comparison to


other affecting factors. Clark et al. (1982) used both the absolute pressure

and the differential pressure (surge) when modelling the time to the first

failure of water pipes.

2.4.9 Soil condition of the bedding

Soil conditions affect external corrosion rates of water pipes and play an

important role in the process of pipe degradation, especially for iron pipes.

Clark et al. (1982) considered the presence of a corrosive soil environment

in the analysis of pipe failure, but found a low correlation between length

of pipe laid in a corrosive environment and failure frequency. Malandain et

al. (1998) used a GIS system to relate the soil conditions to the breakage

rate of pipes in the water network in Lyon, France.

Also Eisenbeis (1999) used ground condition (defined as the presence or

absence of corrosive soil) as an explanatory variable in the analysis of pipe

failures. There are different definitions and classifications for aggressivity

of soil in different studies. In Trondheim (Lei 1997), a broad classification

has been applied to represent the soil:

• Very aggressive: (Tidal zone, high ground water level, natural soil with

resistivity under 750 Ohm cm, pH less than 5, polluted by chemicals,

stray current, etc.)

• Moderately aggressive: (Clay, wetland, nonhomogeneous, etc.)

• Not aggressive: (Natural soil resistivity over 2500 Ohm cm, dry con-

ditions, sand, moraine).
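The numerical parts of the Trondheim classification can be expressed as a simple rule, sketched below. The sketch uses only the resistivity and pH thresholds listed above, reduces chemical pollution and stray current to a single flag, and omits the qualitative criteria such as soil type and groundwater level. Assigning resistivities between 750 and 2500 Ohm cm to the moderate class is an assumption made for this example.

```python
def classify_soil(resistivity_ohm_cm, ph, polluted_or_stray_current=False):
    """Simplified soil aggressivity class based on resistivity and pH."""
    if resistivity_ohm_cm < 750 or ph < 5 or polluted_or_stray_current:
        return "very aggressive"
    if resistivity_ohm_cm > 2500:
        return "not aggressive"
    # Assumption: intermediate resistivity is treated as moderately aggressive.
    return "moderately aggressive"


low_resistivity = classify_soil(500, 7.0)
dry_sand = classify_soil(3000, 7.0)
intermediate = classify_soil(1500, 7.0)
acidic = classify_soil(3000, 4.5)
```

A low pH overrides a high resistivity in this rule, so an acidic but highly resistive soil is still classed as very aggressive.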

In a study of profile of breakages of asbestos cement water mains, Mor-

dak and Wheeler (1988) observed that distribution of failures through the

year was fairly random for areas where sandy/gravel soils commonly occur,

whereas in areas with cohesive clay soils, most failures occurred during the

dry summer months. Cohesive clay soils are also associated with high inci-

dence of circumferential fractures, which are commonly related to bending

stresses.

A lack of a strong association between soil condition and breakage rate

is reported in a study by Boxall et al. (2006). These authors suggested that

this observation may be due to the spatial resolution of the data that was

performed prior to the analysis.


Hu and Hubble (2005) examined the influence of deteriorating factors

for water mains in Regina, Canada. They reported the soil condition as a

critical factor in failure mechanism of water mains of the region.

Rajani et al. (1996) developed a Winkler-type pipe-soil interaction

(WPSI) model that is based on mechanics and is hence termed a mechanistic or

physical model. This technique has been used in some recent studies, e.g.

(Rajani and Tesfamariam 2005, Tesfamariam et al. 2006) to produce esti-

mations for remaining wall thickness of individual pipes.

2.4.10 Seasonal variations

Seasonal weather variations can be linked to pipe breaks. Many breaks

are recorded in hot months of summer when facilities are functioning at

maximum capacity to meet peak demands. These breaks are due to such

excessive pressure, along with other causes such as external stresses or cor-

rosion.

In the winter when the water temperature drops very quickly, axial stress

is added to the internal pressure of the water, possibly surpassing the factor

of safety of pipes. Additional stress may result from frost effects, especially

when pipes are rather shallow.

A seasonal pattern, with the greatest number of failures occurring during

the winter, is common for many water distribution networks (Eisenbeis 1999,

Sægrov et al. 1999). Andreou (1986) found that smaller diameter pipes (less

than 8 inches) have higher failure rates in the winter.

Kleiner and Rajani (2002) provided many references to reported obser-

vations concerning the influence of temperature and soil moisture on the

frequency of water main breaks. Rajani et al. (1996) showed that differen-

tial temperature change between pipe and soil, and also soil shrinkage due

to dryness, result in the development of stresses in the pipe.

In a study of research needs for rehabilitation of water networks, Sægrov

et al. (1999) observed both a winter and a summer peak in break rate

in the UK. The summer peak was attributed to drying and uneven shrinkage

of clay soils, whilst the winter peak may have been due to frost loading

or thermal contraction effects. In addition, the researchers reported that

annual breakage rate over a period of ten years had significant correlation

with mean annual daytime temperature and was inversely related to the

total annual rainfall.


2.5 Current Models Developed for Pipe Failure

Analysis and Prediction

In the literature, the techniques developed to analyse water pipe failures can

be generally categorised into physical, descriptive, and statistical analysis

methods. In the following subsections, the main characteristics of each

category are explained and their performance and areas of application are

compared.

2.5.1 Physical analysis

Physical analysis methods are based on evaluating the structural and envi-

ronmental characteristics of each individual pipe. Such evaluations include

identification of the scope and severity of corrosion on the internal and ex-

ternal pipe walls and estimation of the stresses caused by the loads applied

to the water pipe. For example, the Water Research Centre (WRc) method

(Williams et al. 1984) is a physical analysis technique for assessing the residual

life of cast iron pipes based on measuring the pit depth. This method

has been the standard procedure for corrosion assessment in the UK for many

years. However, recent studies have criticised it as flawed in several respects

(Marshall 2002, Olliff and Rolfe 2002, Kleiner and Rajani 2001, Rajani

and Y. 2001), especially in its underlying assumption of a

continuing linear corrosion rate.

Marshall (2002) enhanced the WRc method and developed a more ratio-

nal procedure. This new method is based on fracture mechanics and relates

it directly to the flexural failure that is commonly encountered. This pro-

cedure needs to be implemented and examined, although there seems to be

no drive to promote this methodology within the water industry.

Olliff and Rolfe (2002) advocated a condition assessment approach to

rehabilitation practice. They stated that although much effort has been

made to improve the investigation and renovation techniques, this has not

been matched in the area of analysis of condition data.

A major Canadian program has been conducted with the purpose of

assessment of corrosion of pipes. The studies undertaken in this forum

(Kleiner and Rajani 2001) came up with a number of methods for assess-

ing the probable performance of pipes in the future. However, the survey

reviews physically based models which need a wide range of pipe physical

characteristics and such information is commonly unavailable.


As noted previously, in recent years, a number of non-destructive tech-

niques (NDT) have been developed to measure wall thickness, corrosion

pits (ductile iron), or graphitisation depth (cast iron) of pipes (Jackson

et al. 1992, Hartman and Karlson 2002, Lillie et al. 2004, Rajani and

Kleiner 2004). These NDT measurements need to be incorporated within

a decision support system (DSS) to assess the pipe condition to decide

whether the pipe should be replaced and when to do it.

It is important to note that a pipe failure usually takes place as a result

of multiplicative effects of environmental, operational, and design factors.

Therefore, the reason of a failure occurrence cannot be precisely determined.

Measurement and recording of some of the design parameters and opera-

tional factors are associated with some level of imprecision. For instance,

measurement and recording of quantities such as temperature changes, traf-

fic, operating and surge pressure, corrosion pit geometry, loss of bedding

support (as a result of prolonged leakage or wash out) and so on are not

completely accurate or readily available within operational work. Given this

background, it is clear that assessment of pipes condition always involves

some level of uncertainty.

The difficulties involved in the estimation of the past, present, and future

corrosion rates add to the uncertainty in determining the remaining service

life of pipes. Rajani et al. (1996) developed the Winkler pipe-soil interaction

(WPSI) model to take most of the predominant factors into account. However,

Winkler models are liable to have some inaccuracy due to uncertainties in

estimation of the model parameters (coefficients).

Most physical models for pipe-soil interaction are deterministic and do

not consider the uncertainties involved in precision of data and parameter

estimation. In a possibilistic approach, Rajani and Tesfamariam (2005)

developed a fuzzy model to integrate corrosion rates with the remaining

wall thickness or pit geometry measurements to estimate the time to failure.

This technique is a step towards estimating the remaining service life of one pipe

length with several corrosion pits of significant depths observed at the time

of inspection and accounts for the unsupported length.

Rajani and Tesfamariam (2005)’s technique can be applied to pipes with

known corrosion pit depth buried in known soil corrosivity. In a recent

study, Tesfamariam et al. (2006) also cast the WPSI model in a possibilistic

framework to predict the remaining wall thickness of cast iron pipes under

intensive inspection and converted the outcome to the structural factor of

safety. In this model, material failure theories specific to cast iron mains

were combined with fuzzy stress solutions to determine the fuzzy structural


capacity in terms of factor(s) of safety.

There are also some other physical analysis efforts for water pipes -

e.g. (Doleac et al. 1980, Kumar et al. 1984, Makar 1999, Rajani and

Makar 2000a, Makar et al. 2001). As it was stated in Chapter 1, this the-

sis concentrates on modelling and prediction of water main failures to de-

velop maintenance/replacement strategy in water distribution systems. The

physical-based models require assessment of the properties of each buried

pipe and such techniques need detailed knowledge of the long-term (histor-

ical) behaviour of the pipeline under consideration in addition to detailed

assessment of the individual pipe from different aspects.

2.5.2 Descriptive analysis

Descriptive analysis methods are based on calculation of descriptive statis-

tics to provide an insight into the breakage patterns and trends. Descriptive

analyses, by their nature, can be only performed in locations where compre-

hensive databases of the pipe characteristics and breaks are available. There

are very few case studies of descriptive analysis reported in the literature.

Some cities often cited for participating in such studies are Winnipeg, Man-

itoba, Canada (Kettler and Goulter 1985a, Goulter and Kazemi 1988, Goul-

ter et al. 1993, Jacobs and Karney 1994); New York (O’Day et al. 1980, Male

et al. 1988, Male et al. 1990); Cincinnati and New Haven, Conn. (Clark and

Goodrich 1988, Goodrich 1986, Karaa and Marks 1990); suburban Paris

and Bordeaux, France (Eisenbeis 1999); three municipalities of Quebec:

Chicoutimi, Gatineau and Saint-Georges (Pelletier et al. 2003) and Boston

(Sullivan 1982).

As an example of descriptive analysis, the study by Sullivan (1982) con-

cludes that the type of routine maintenance practice directly affects the

evolution of the system state over time. The existence of leaks that are

not repaired leads to major breaks, and the accumulation of such leaks explains
the high proportion of the water losses through leaks which are
unaccounted for.

As shown in Table 2.3, the percentage of water lost through leaks varied
between 8 and 17 percent in 12 U.S. cities in 1978. Unaccounted-for water is
defined as the difference between the total amount of water pumped into
the water system from the sources and the amount of metered water use by
the customers of the water system, expressed as a percentage of the total
water pumped into the system. Table 2.4 shows the percentage of water
losses for different causes and demonstrates that in Boston in 1978, 37.3%
of water losses were due to leaks in mains and the remaining losses were
mostly due to service pipe leaks. For other cities, at least a quarter of losses
are due to leaks in mains.

Table 2.3: Estimated water leakage in 12 U.S. cities in 1978 (Sullivan; 1982)

Location             Percent of water lost through leaks
Boston, Mass.        17
Cleveland, Ohio      15
St. Louis, Mo.       15
Pittsburgh, Pa.      14
Tulsa, Okla.         14
Philadelphia, Pa.    12
Hartford, Conn.      11
Kansas City, Mo.     11
Cincinnati, Ohio     11
Buffalo, N.Y.        10
Baltimore, Md.       10
Portland, Ore.       8

Table 2.4: Water loss percentages for different causes, measured in Boston,
1978 (Sullivan; 1982)

Cause                     Amount    Amount   Percent of water   Percent of total
                          (ML/d)    (mgd)    unaccounted-for    water purchased
Undetermined              117.7     31.1     46.4               21.7
Leaks and breaks          94.6      25       37.3               17.5
Blow offs and flashings   4.5       1.2      1.8                0.8
Fire fighting             7.1       1.9      2.8                1.3
Unmetered public usage    15.8      4.2      6.3                2.9
Other                     4.5       1.2      1.8                0.8

Table 2.5, taken from Karaa (1984), shows the importance of different

criteria in using descriptive analysis for developing the replacement plan-

ning under different maintenance practices. Intensive maintenance practices

are considered proactive, and poor maintenance practices are known as a

reactive type of maintenance strategy.

Table 2.5: Importance of different criteria for replacement/rehabilitation
decision-making, under different types of maintenance policies (Karaa;
1984)

Criteria            Intensive MP   Fair MP   Poor MP
Economic Analysis   IF             PF        MF
Loss of Pressure    PF             PF        IF
Water Quality       PF             PF        IF
Reliability         MF             PF        IF

MP: Maintenance Practices; IF: Important Factor; PF: Partial Factor;
MF: Marginal Factor

This type of analysis is limited by the challenges faced in constructing
databases. These challenges include availability of personnel and resources,
missing and conflicting data, non-computerised information (paper
archives), and the like. Indeed, development of such databases has been
a concern for many researchers (O’Day 1982, Clark and Goodrich 1989,
Habibian 1992). The database used in this research, which was provided by
CWW, is one such example and is explained in detail in Chapter 3.

2.5.3 Statistical analysis

Statistical analysis methods are based on modelling of a pipe lifetime or the

frequency of the failures (or the probability distribution of these quantities),

then using those models to make decisions on replacement and/or rehabili-

tation of the pipes. Statistical analysis of pipe failures begins with plotting

the cumulative number of failures over time, a process which requires failure

data for each pipe. A cumulative plot can indicate whether there is a trend

in the failure times. From a practitioner’s point of view, such a plot can

be a convenient tool for making maintenance decisions for individual pipes.

However, this technique is too time consuming to be carried out for each

pipe in the entire network.

Figure 2.1 shows an example of a cumulative failure plot for a single

pipe in the CWW distribution system (CICL 100 mm pipe, constructed in

1967, located in postcode 3029). The curve is convex, indicating a dete-

riorating pipe. In order to predict future failures in a statistical analysis

method, a parametric or non-parametric model is fitted to the curve plotted

using the dataset. This procedure can only be used for pipes with several


Note on Weibull and lognormal models: when they are properly chosen and credible, Weibull or lognormal analysis models offer considerable insight into the lifetime reliability of products. However, there are also many ways in which such an analysis can be corrupted (Warrington).

[Figure 2.1: plot of cumulative number of failures (1 to 7) against failure date (Dec 97 to May 00)]

Figure 2.1: Cumulative failure plot for a single pipe in CWW (100 mm CICL
pipe, constructed in 1967 in postcode 3029)

previous breaks and is most useful for making decisions about distribution

and service lines. However, multiple breaks cannot be accepted on trans-

mission and trunk mains where the consequence of failure is high. Besides,

existing failure histories are not recorded based on regular inspections and
not all the recorded failure dates are accurate. For instance, a lack of regular
inspections of underground pipes causes their failure occurrences to be

recorded with delay only after the obvious consequences of the failures are

reported. Thus, creating plots for every single pipe in the system requires

extensive data analysis and manipulation.

Statistical models for analysis of water pipe breaks have attracted the

interest of many researchers during recent decades because the results of

statistical analysis can be used for a variety of purposes in water network

management. In the long term, the models can be used to estimate future

budget needs for rehabilitation. In the short term, the models can be used

to define candidates for replacement based on poor structural condition.

Statistical methods use the history of failure occurrences to identify

the patterns of pipe failures in the past. In order to predict the future

performance of water pipes (likely failure rate or probability of occurrence

of failure at a time in the future), these patterns are assumed to continue in

the future. It will be shown in Chapter 5 that this assumption is inaccurate


for water mains as the failures are proven to be non-stationary random

processes with time-varying characteristics. Therefore, either auto-updating

nonparametric techniques are required, or the parameters of lifetime models

(in the existing probabilistic approaches) should be constantly updated or

a time dependent component should be built in the model.

Statistical methods that are used for the analysis of condition of water

pipes can be categorised into deterministic and probabilistic methods. In

a further classification, probabilistic models can be divided into probabilis-

tic multivariate and probabilistic single-variate models that are applied to

grouped data (Kleiner and Rajani 1999).

Statistically derived models can be applied with various levels of input

data and may thus be particularly useful for water mains with small failure

databases available or for which the low cost of a failure does not justify

expensive data acquisition campaigns (Kleiner and Rajani 2001). However,

the small size of failure databases, in such cases, causes small sample bias

in the statistical models (being treated as estimators in a mathematical

context) and more accurate and robust models would be desirable for ap-

plications where small failure databases are available.

Deterministic models

The outcome of applying a deterministic model to failure data is a value

representing the condition of pipe at a time in the future (e.g. number of

failures in the future or failure rate or time to next failure). In the literature

of pipe failure analysis, deterministic models are generally divided into three

types: time-exponential, time-power and time-linear models.

Deterministic time-exponential models

One of the well known deterministic models is the model of Shamir and

Howard (1979) who used regression analysis to obtain a break prediction

model that relates a pipe’s breakage rate to the exponent of its age.

$$N(t) = N(t_0)\, e^{A(t+g)} \qquad (2.2)$$

where: $t$ is the time elapsed (from present) in years; $N(t)$ is the number of
failures per unit length per year (km$^{-1}$ year$^{-1}$); $N(t_0) = N(t)$ at the year
of installation of the pipe (i.e., when the pipe is new); $g$ is the current age
of the pipe; and $A$ is the coefficient of breakage rate growth (year$^{-1}$).

The underlying assumption of the above model is $N(t_0) \neq 0$, which

means that on average, a pipe is assumed to always have a breakage fre-

quency, albeit very small in the beginning of its life. The required data for


this model are pipe length, installation date and breakage history. Pipes must
first be partitioned into homogeneous groups according to criteria
such as pipe type, diameter, soil type, failure type and overburden characteristics.
This limits the application of the model to failure histories with
a large number of data points for each homogeneous group of pipes. Besides,

Shamir and Howard (1979) did not provide any details on the location of the

study, the quality and quantity of available data or the method of analysis.
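As a minimal sketch of how Equation (2.2) could be fitted by regression, the snippet below log-transforms the breakage rate and applies ordinary least squares to ln N against pipe age. The function names, the parameter values and the noise-free synthetic data are illustrative assumptions; they are not taken from Shamir and Howard (1979), who did not report their method of analysis.

```python
import math

def breakage_rate(t, n0, a, g):
    """Equation (2.2): N(t) = N(t0) * exp(A * (t + g))."""
    return n0 * math.exp(a * (t + g))

def fit_log_linear(ages, rates):
    """Ordinary least squares on ln(rate) = ln(N(t0)) + A * age.

    Returns the pair (N(t0), A).
    """
    logs = [math.log(r) for r in rates]
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, logs))
    a = sxy / sxx
    return math.exp(mean_y - a * mean_x), a

# Hypothetical, noise-free data for one homogeneous group:
# pipe age in years and observed breaks per km per year.
ages = [5, 10, 15, 20, 25, 30]
rates = [breakage_rate(0, n0=0.05, a=0.08, g=age) for age in ages]

n0_hat, a_hat = fit_log_linear(ages, rates)

# Predicted breakage rate 10 years from now for a pipe that is 30 years old today.
forecast = breakage_rate(10, n0_hat, a_hat, g=30)
```

With real data the fit would be carried out separately for each homogeneous group, which is why the model needs many data points per group.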

The two-parameter exponential model of Equation (2.2) is simple and

relatively easy to implement but its simplicity warrants careful treatment

in applying the model to data that are partitioned precisely. It should

also be noted that this exponential model implicitly assumes a uniform

distribution of breaks along all the water mains in a group. This assumption

was questioned by others, e.g. (Goulter and Kazemi 1988, Goulter et al.

1993, Mavin 1996).

Walski and Pelliccia (1982) attempted to enhance the exponential model

of Equation (2.2) by incorporating additional factors in the analysis based

on observations made by the US Army Corps of Engineers in Binghamton,

N.Y. (Kumar et al. 1984). The new expanded model added three dimen-

sions to the two-parameter model of Shamir and Howard (1979), namely,

consideration of the type of pipe casting, distinguishing between first break

and subsequent breaks, and consideration of pipe diameter. For example,

the model for (pit/sand spun) cast iron pipes with 500mm diameter is given

by:

$$N(t) = C_1\, C_2\, N(t_0)\, e^{A(t+g)} \qquad (2.3)$$

where: $C_1$ = ratio of break frequency for (pit/sand spun) cast iron pipes with
at most one previous break, to the overall break frequency for all (pit/sand
spun) cast iron pipes; and $C_2$ = ratio between break frequency for pit cast
iron pipes with 500 mm diameter, to overall break frequency for pit cast
iron pipes.

The model of Equation (2.3) requires the same data as the Shamir and
Howard (1979) model plus information on the method of pipe casting and

pipe diameter. Walski and Pelliccia (1982) did not provide any indication of

whether the correction factors they proposed indeed improved the prediction

quality and by how much. It is likely that these three added dimensions

influence the prediction of breakage rates of water mains. However, the

correction factors seem to have been derived arbitrarily and assumed to act

in a multiplicative manner on the breakage frequency (without apparent

statistical justification).


The assumption of multiplicative effect of correction factors on the break-

age frequency implies that these dimensions affect only the initial breakage

rate and not its annual growth rate. Furthermore, since no statistical test

of significance was performed by the authors, the validity of these proposed

exponential models is questionable.
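The point about multiplicative correction factors can be checked numerically. The sketch below evaluates Equation (2.3) with and without correction factors and compares the year-on-year growth of the breakage rate. All parameter values and function names are hypothetical and chosen only for illustration.

```python
import math

def corrected_rate(t, n0, a, g, c1=1.0, c2=1.0):
    """Equation (2.3): N(t) = C1 * C2 * N(t0) * exp(A * (t + g))."""
    return c1 * c2 * n0 * math.exp(a * (t + g))

def annual_growth_factor(t, **params):
    """Ratio N(t + 1) / N(t), i.e. the year-on-year growth of the breakage rate."""
    return corrected_rate(t + 1, **params) / corrected_rate(t, **params)

base = dict(n0=0.05, a=0.08, g=20)        # hypothetical group parameters
adjusted = dict(base, c1=1.4, c2=0.7)     # hypothetical correction factors

growth_base = annual_growth_factor(0, **base)
growth_adjusted = annual_growth_factor(0, **adjusted)

# The factors rescale the level of the curve by C1 * C2 at every age...
level_ratio = corrected_rate(0, **adjusted) / corrected_rate(0, **base)
# ...while the growth factor stays at exp(A) in both cases.
```

Because C1 and C2 cancel in the ratio N(t + 1)/N(t), they can only shift the curve up or down; the growth coefficient A is untouched.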

Clark et al. (1982) proposed further enhancement of the exponential

model and transformed it into a two-phase model as presented in Equations

(2.4) and (2.5). They observed a lag between the year of pipe installation

and the first break record. Consequently, they proposed a model comprising

a linear equation to predict the time elapsed to the first break and an

exponential equation to predict the number of subsequent breaks.

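$$\mathrm{NY} = x_1 + x_2 D + x_3 P + x_4 I + x_5\,\mathrm{RES} + x_6\,\mathrm{LH} + x_7 T \qquad (2.4)$$

$$\mathrm{REP} = y_1\, e^{y_2 t}\, e^{y_3 T}\, e^{y_4 \mathrm{PRD}}\, e^{y_5 \mathrm{DEV}}\, \mathrm{SL}^{y_6}\, \mathrm{SH}^{y_7} \qquad (2.5)$$

where:

NY = number of years from installation to first repair (break);

$x_i$, $y_i$ = regression parameters;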

D= diameter of the pipe;

P= absolute pressure in a pipe;

I= percentage of pipe overlain by industrial development;

RES= percentage of pipe overlain by residential development;

LH= length of pipe in highly corrosive soil;

T= pipe type (1=metallic, 0=reinforced concrete);

REP= number of repairs (breaks);

PRD= pressure differential;

t= age of pipe from first break;

DEV = percentage of pipe length in low and moderately corrosive soil;

SL= surface area of pipe in low corrosive soil; and

SH= surface area of pipe in highly corrosive soil.
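The two-phase structure of Equations (2.4) and (2.5) can be sketched as two small functions. This is only an illustration: the coefficient values below are hypothetical and are not those estimated by Clark et al. (1982), and the third exponent of Equation (2.5) is read here as acting on the pipe type T.

```python
import math

def years_to_first_repair(x, D, P, I, RES, LH, T):
    """Equation (2.4): linear model for years from installation to first repair."""
    x1, x2, x3, x4, x5, x6, x7 = x
    return x1 + x2 * D + x3 * P + x4 * I + x5 * RES + x6 * LH + x7 * T

def number_of_repairs(y, t, T, PRD, DEV, SL, SH):
    """Equation (2.5): exponential model for the number of repairs after the first break."""
    y1, y2, y3, y4, y5, y6, y7 = y
    exponent = y2 * t + y3 * T + y4 * PRD + y5 * DEV
    return y1 * math.exp(exponent) * SL ** y6 * SH ** y7

# Hypothetical regression parameters and pipe attributes (illustration only).
x_coeffs = (20.0, 0.01, -0.02, -0.05, 0.03, -0.004, 5.0)
y_coeffs = (0.2, 0.07, 0.1, 0.002, 0.01, 0.1, 0.15)

ny = years_to_first_repair(x_coeffs, D=150, P=60, I=10, RES=80, LH=100, T=1)
rep = number_of_repairs(y_coeffs, t=12, T=1, PRD=30, DEV=40, SL=50, SH=20)
```

The first function is additive in its covariates, whereas in the second the covariates multiply one another, with the soil surface areas SL and SH entering as power terms.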

This model requires the time of installation, breakage history, type and

diameter of the pipe, as well as information about operating pressure, soil

corrosivity and zoning type of the area overlaying the pipe. Additional data

such as the type of breaks and pipe vintage are required to enhance the

model. Only a moderate “goodness of fit”, with $r^2$ equal to 0.23 and 0.47 for
the linear and exponential expressions, respectively, was reported (Clark
et al. 1982).∗

∗The goodness of fit of a statistical model describes how well it fits a set of observa-


The Clark et al. (1982) model was the first reported attempt to explicitly
account for several factors that were potential contributors to the pipe

breakage rate and considered two distinctly different deterioration stages in

the life of a water main. The linear Equation (2.4) implied that the covari-

ates acted on the time to first breakage independently and additively. The

low $r^2$ value corresponding to the linear equation could suggest that this
assumption may be incorrect and that the factors affecting pipe deterioration
act jointly rather than independently.

The low $r^2$ value could also indicate that other factors affecting time

to first break were present, but were not considered in the equation. The

exponential Equation (2.5) is similar to other time-exponential models de-

scribed above and considers the breakage rate primarily as an exponential

function of time since the first break.

Other covariates are assumed to act multiplicatively on the breakage

rate. It should be noted that the covariates expressing corrosivity effects are

power functions. The moderate $r^2$ value corresponding to the exponential
equation suggests that more research is required to determine the suitability
of this model. The authors did not provide information as to the relative
contribution of each covariate to the total $r^2$.

It is possible that the accuracy of the model could be improved if other

types of data were available. The authors also did not indicate whether both

equations had been applied to a holdout sample (sample of water mains

that was “held out” for validation purposes and thus was not included in

the dataset to perform the regression analysis). Validation with a holdout

sample would have provided more convincing evidence as to the predictive

power of the model. Although this model has been referenced extensively,

no documented reference is available to indicate if its use has been repeated

elsewhere.

Constantine et al. (1998) developed a shifted time-exponential model

(STEM) that uses the data categorised by year of construction. This is a

method for prediction of number of failures in the next year for the individual
pipes. This model is expressed by Equation (2.6):

$$H(x) = l\,\lambda\,e^{\beta x} \qquad (2.6)$$

tions. Measures of goodness of fit typically summarise the discrepancy between observed values and the values expected under the model in question. Such measures can be used in statistical hypothesis testing, to test whether two samples are drawn from identical distributions, or whether outcome frequencies follow a specified distribution. Goodness of fit can generally be described as: $r^2 = \sum (o - e)^2 / e^2$; where $r^2$ = goodness of fit, $o$ = an observed frequency, $e$ = an expected (theoretical) frequency.



[Figure 2.2: plot of failure rate (in the past) against time (in the past); legend: λ = 1.00, typical pipe of the group; λ = 0.25, a pipe with better performance; λ = 2.10, a pipe with worse performance]

Figure 2.2: The shifted time exponential model is fitted to the past failure
rates, which are the cumulative number of previous failures at different times
in the life of the pipe. Larger λ corresponds to a worse performance and
vice versa.


where:

H(x)= expected number of total failures at pipe age x;

l= length;

β= rate variable;

λ=scale parameter (or the shifted time parameter); and

x=age of the pipe (year).

The rate variable (β) is the same for a homogeneous group of pipes. The

scale parameter λ should be calculated separately for each individual pipe.

This parameter is calculated by fitting the model to the cumulative number

of previous failures at different times in the life of the pipe (as shown in

Figure 2.2).
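One way the per-pipe scale parameter could be obtained is sketched below: with β fixed for the group, Equation (2.6) is linear in λ, so a least-squares fit to the cumulative failure counts has a closed form. The least-squares criterion, the function names and the numbers are assumptions made for illustration; Constantine et al. (1998) may have fitted the model differently.

```python
import math

def stem_expected_failures(x, length, lam, beta):
    """Equation (2.6): H(x) = l * lambda * exp(beta * x)."""
    return length * lam * math.exp(beta * x)

def fit_scale_parameter(ages, cumulative_failures, length, beta):
    """Least-squares lambda for one pipe, with beta fixed for its group.

    H(x) is linear in lambda, so the minimiser of sum (H_i - lambda * b_i)^2
    with b_i = l * exp(beta * x_i) is sum(b_i * H_i) / sum(b_i^2).
    """
    basis = [length * math.exp(beta * x) for x in ages]
    numerator = sum(b * h for b, h in zip(basis, cumulative_failures))
    denominator = sum(b * b for b in basis)
    return numerator / denominator

# Hypothetical pipe: 0.4 km long, group rate variable beta = 0.05 per year.
beta, length = 0.05, 0.4
ages = [30, 32, 35, 38, 40]
observed = [stem_expected_failures(x, length, lam=2.1, beta=beta) for x in ages]

lam_hat = fit_scale_parameter(ages, observed, length, beta)

# A pipe with twice as many failures at the same ages gets twice the lambda.
lam_worse = fit_scale_parameter(ages, [2 * h for h in observed], length, beta)
```

The second fit illustrates the reading of Figure 2.2: a larger λ corresponds to a pipe that has performed worse than its group.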

In this method, the expected influence of the environmental conditions

on the failure process of pipe is modelled by making a number of assump-

tions. The model uses different fixed parameters for each group of assets.

However, an extensive amount of knowledge would be required about the

manufacturing characteristics and real time information of water hammer,

corrosivity of the soil, external loading, soil movement etc.

The predictive capability of STEM for the examined data is not signifi-

cant. Besides, rate parameters in the STEM need to be calculated. Hence,

STEM is not an accurate predictive model. It is not clear what covari-


ates (e.g. soil type, diameter, etc.) are included in STEM. Righetti (2001)

applied a STEM to a database of failures in a water distribution system

in Melbourne, Australia, and reported very poor correlation between the

predicted and observed values, as the STEM seemed to underestimate the

number of failures.

Neural networks have also been applied to predict water pipe failures by
Sacluti et al. (1999) and Sacluti (1999). The neural network models introduced
in those papers are deterministic models which directly output the
future failure rates. Achim et al. (2007) have also proposed a deterministic
artificial neural network (ANN) model to predict the failure rates (number
of failures/km/year) for the individual pipes of an existing failure dataset. In
that study, pipe material is used as a stratifying factor that divides the
failure data into two subsets for the two types of pipes available in the failure
history under study. The inputs to this model are diameter, age, year of
construction, length and the pair of geographical coordinates.

To evaluate the reliability of their neural network model, Achim et al.

(2007) used scatter plots of the estimated measures versus the targets, and

the plots of residual errors versus target. The resulting $r^2$ values obtained
from graphical means indicate that the proposed model performs better than
the shifted time power model (STPM), which will be reviewed later in this
chapter, and STEM applied to the same data. However, other performance
indicators, besides $r^2$ values, are required to assess the performance of the
proposed model.

Kleiner and Rajani (2002) proposed the following multi-variate expo-

nential model:

$$N(\mathbf{x}_t) = N(\mathbf{x}_{t_0}) \exp(\mathbf{a}^{\top} \mathbf{x}_t) \qquad (2.7)$$

where: $\mathbf{x}_t$ is the vector of time-dependent covariates prevailing at time $t$;
$N(\mathbf{x}_t)$ is the number of breaks resulting from $\mathbf{x}_t$; $\mathbf{a}$ is the vector of parameters
corresponding to the covariates; and $\mathbf{x}_{t_0}$ is the vector of baseline $\mathbf{x}$ values
at the year of reference $t_0$, that is, the start of the history.

Time-dependent covariates could be pipe age, temperature, soil moisture,
etc. Parameters $N(\mathbf{x}_{t_0})$ and $\mathbf{a}$ can be found by least squares regression
(with or without linear transformation) or by using the maximum likelihood

method. Equation (2.7) is applied to groups of water mains that are as-

sumed homogeneous with respect to their deterioration rates. The grouping

is typically done by some or all of the static factors (e.g., by size, material,

vintage, etc.) for which data are available.
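The least-squares route with a linear transformation can be sketched as follows: taking logarithms of Equation (2.7) gives ln N = ln N(x_t0) + a'x_t, which is an ordinary linear regression. The covariates (age and a freezing index), their values and the small normal-equations solver are hypothetical choices for illustration and do not reproduce the procedure of Kleiner and Rajani (2002).

```python
import math

def predict_breaks(n_ref, a, x):
    """Equation (2.7): N(x_t) = N(x_t0) * exp(a . x_t)."""
    return n_ref * math.exp(sum(ai * xi for ai, xi in zip(a, x)))

def solve_linear_system(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [rv - factor * cv for rv, cv in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_log_linear(covariates, breaks):
    """Least squares on ln N = ln N(x_t0) + a . x_t; returns (N(x_t0), a)."""
    design = [[1.0] + list(x) for x in covariates]
    logs = [math.log(b) for b in breaks]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, logs)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    return math.exp(coef[0]), coef[1:]

# Hypothetical yearly covariates for one group: (age in years, freezing index).
covariates = [(20, 1.2), (21, 0.8), (22, 1.5), (23, 0.9),
              (24, 1.1), (25, 1.7), (26, 0.6), (27, 1.3)]
true_a = [0.05, 0.3]
breaks = [predict_breaks(0.8, true_a, x) for x in covariates]  # noise-free

n_ref_hat, a_hat = fit_log_linear(covariates, breaks)
```

With noisy real data the log transformation changes the error structure, which is one reason the maximum likelihood method is mentioned as an alternative.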

In addition to the age of pipes, this model considers the time-dependent


factors of temperature effects (expressed as the freezing index), soil-moisture
effects (expressed as the rainfall deficit), cumulative length of replaced water
mains, and cumulative length of cathodically protected (retrofitted) water
mains. The variety of data needed for this model can be a limitation,
considering the historical data typically available in most water
distribution systems. In addition, this model should be applied to long histories
of failures. For a short failure dataset that includes a period with

predominantly decreasing breakage rates, applying this model may yield

counter-intuitive results such as positive effect of ageing and/or negative

effects of replacement.

Pelletier et al. (2003) distinguished two different break orders in their

failure data set. They used the Weibull distribution for the first break order

(time to failure from installation to first break), while using the exponential

distribution to describe the behavior of subsequent breaks (time to failure

from first to second break, second to third, and so forth). This model is
simply referred to as the “Weibull/exponential model” and, despite its
mathematical simplicity, captures the essence of the ageing process. However,
the modelling strategy does not take into account the variability in the
annual number of pipe breaks due to factors other than the deterioration
resulting from the natural ageing of the pipes (most often, the weakening
of the pipes due to corrosion). Examples of factors that can contribute

to higher breakage rates in a given year are disturbances due to traffic

and construction, flooding, soil properties, water quality, and so on. This

equation takes into account the factor of frosting which is a concern in cold

areas like Canada, but not an issue in warmer climates such as Australia.
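The two break orders of the Weibull/exponential model can be illustrated by simulating the break history of a single pipe: a Weibull-distributed time to the first break, followed by exponentially distributed times between subsequent breaks. The parameter values and the function name are hypothetical and are not the estimates of Pelletier et al. (2003).

```python
import random

def simulate_break_ages(shape, scale, mean_gap, horizon, rng):
    """Ages (years) at which one pipe breaks, up to `horizon`.

    First break order: Weibull time from installation to first break.
    Later break orders: exponential time between consecutive breaks.
    """
    ages = []
    # random.weibullvariate(alpha, beta) takes the scale first, then the shape.
    age = rng.weibullvariate(scale, shape)
    while age <= horizon:
        ages.append(age)
        # random.expovariate takes the rate, i.e. 1 / mean time between breaks.
        age += rng.expovariate(1.0 / mean_gap)
    return ages

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Hypothetical parameters: first break typically after a few decades,
# subsequent breaks on average every four years.
break_ages = simulate_break_ages(shape=2.5, scale=45.0, mean_gap=4.0,
                                 horizon=100.0, rng=rng)

# Annual break counts for a small hypothetical group of pipes.
group = [simulate_break_ages(2.5, 45.0, 4.0, 100.0, rng) for _ in range(200)]
breaks_in_year_60 = sum(1 for pipe in group for a in pipe if 60 <= a < 61)
```

Since the only source of variation here is the ageing process itself, the simulated annual counts contain none of the year-specific effects (traffic, construction, flooding) mentioned above.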

Table 2.6 briefly reviews the deterministic time exponential models dis-

cussed in this section.

Deterministic time-power models

The following model was one of the first time-power models suggested

by Mavin (1996):

$$n = \alpha\, t^{\beta}\, e \qquad (2.8)$$

where:

n = number of breaks at time t;

t = age of pipe;

e = random error term;

α, β = coefficients estimated from regression analysis.

Table 2.6: Deterministic time-exponential models

Reference                     Attributes
Shamir and Howard (1979)      Relates a pipe’s breakage rate to the exponent of
                              its age, Equation (2.2)
Walski and Pelliccia (1982)   Enhanced the exponential model (2.2) by
                              consideration of the type of pipe casting, its
                              diameter, and distinguishing between first break
                              and subsequent breaks
Clark et al. (1982)           Number of failures after first breakage,
                              Equation (2.5)
Constantine et al. (1998)     A shifted time-exponential model (STEM) that uses
                              the data categorised by year of construction for
                              prediction of number of failures in next year, for
                              the individual pipes, Equation (2.6)
Righetti (2001)               Applied a STEM to data from a water distribution
                              system in Melbourne, Australia
Kleiner and Rajani (2002)     Developed a general, multi-variate exponential
                              model, Equation (2.7), to consider some
                              time-dependent factors in predicting water main
                              breaks
Pelletier et al. (2003)       Weibull distribution was used for the first break
                              order and the exponential distribution to describe
                              the behavior of subsequent breaks (time to failure
                              from first to second break, second to third, and
                              so forth)
Achim et al. (2007)           Developed an artificial neural network (ANN) model
                              to predict the failure rates for the individual
                              pipes of an existing dataset

Mavin (1996) compared a time-exponential model and the time-power
model depicted in Equation (2.8) by applying both to filtered data obtained
from three Australian water utilities. The author proposed some
rules to filter the pipe breakage data, based on calculating the probability
of two consecutive breaks (Constantine and Darroch 1993), and discarding
the second break if the probability is low.

Mavin (1996) found that the performance of the two models in predicting

water main breaks was comparable. This model is applicable to pipes with


more than 6 failure experiences. This preliminary assumption is clearly

impractical in developing the maintenance strategies for water distribution

systems.
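The regression behind Equation (2.8) can be sketched by taking logarithms, which turns the time-power model into a straight line, ln n = ln α + β ln t + ln e. In the snippet below the error term is set to 1 (noise-free synthetic data); the names and numbers are hypothetical and are not taken from Mavin (1996).

```python
import math

def time_power_breaks(t, alpha, beta):
    """Deterministic part of Equation (2.8): n = alpha * t**beta (error term e = 1)."""
    return alpha * t ** beta

def fit_time_power(ages, breaks):
    """Least squares on ln(n) = ln(alpha) + beta * ln(t); returns (alpha, beta)."""
    lx = [math.log(t) for t in ages]
    ly = [math.log(n) for n in breaks]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    beta = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
            / sum((x - mx) ** 2 for x in lx))
    return math.exp(my - beta * mx), beta

# Hypothetical break counts at five pipe ages (years).
ages = [10, 20, 30, 40, 50]
breaks = [time_power_breaks(t, alpha=0.02, beta=1.8) for t in ages]

alpha_hat, beta_hat = fit_time_power(ages, breaks)
```

Two parameters are estimated from the log-transformed points, so a pipe with only a handful of recorded breaks gives a very unstable fit, which is consistent with the restriction to pipes with more than 6 failures.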

The shifted time power model (STPM) presented by Constantine et al.

(1998) is given by the following equations:

$$H(x) = l\,\lambda\,x^{\beta} \qquad (2.9)$$

$$\text{rate of failure per year at age } x = \frac{dH}{dx} = \beta\, l\, \lambda\, x^{\beta-1} \qquad (2.10)$$

where:

H(x)= expected number of total failures at pipe age x;

l= length;

β= rate variable;

λ= scale parameter (or the shifted time parameter); and

x= age of the pipe (year).
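The relationship between Equations (2.9) and (2.10) can be verified numerically: the failure rate is the derivative of the expected cumulative number of failures. The sketch below compares the analytic rate with a central finite difference; the parameter values are hypothetical.

```python
def stpm_expected_failures(x, length, lam, beta):
    """Equation (2.9): H(x) = l * lambda * x**beta."""
    return length * lam * x ** beta

def stpm_failure_rate(x, length, lam, beta):
    """Equation (2.10): dH/dx = beta * l * lambda * x**(beta - 1)."""
    return beta * length * lam * x ** (beta - 1)

# Hypothetical pipe: 0.5 km long, lambda = 0.01, group rate variable beta = 1.6.
params = dict(length=0.5, lam=0.01, beta=1.6)
age = 40.0
step = 1e-4

# Central finite difference of H around the chosen age.
numeric_rate = (stpm_expected_failures(age + step, **params)
                - stpm_expected_failures(age - step, **params)) / (2 * step)
analytic_rate = stpm_failure_rate(age, **params)
```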

The study reported in this thesis found that the power based model,

as well as the exponential model (mentioned previously), did not fit the

database used for the study. To use these models, the rate variable should be

calculated for each class of pipes and a scale parameter should be calculated

for each individual pipe. Regardless of the choice of these parameters,

the prediction of failure rates retains an exponential nature and the same

criticisms that apply for Constantine et al. (1998)’s STEM also apply to

STPM.

Righetti (2001) also developed a shifted time power (STPM) model for

failure analysis of water pipes. Application of both STEM and STPM to

the same set of failure data showed that the STPM resulted in
overprediction of the number of failures.

Deterministic time-linear models

Kettler and Goulter (1985a) have proposed a linear relationship between

pipe age and its failure rate, given by the following equation:

$$N = k_0 \cdot \mathrm{Age} \qquad (2.11)$$

where:

$N$ = number of breaks per year; and

$k_0$ = regression parameter.

McMullen (1982) has proposed the following model for the age of pipe

at the first break:

$$\mathrm{Age} = 65.78 + 0.028\,SR - 6.338\,pH - 0.049\,E_h \qquad (2.12)$$


where:

Age= age of pipe at its first break (years);

SR= saturated soil resistivity (ohm-cm);

pH= soil pH; and

Eh= redox potential (millivolts).
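As a worked example of Equation (2.12), the function below evaluates the predicted age at first break for a set of soil measurements. The coefficients are those quoted in the equation; the soil values used in the example are hypothetical.

```python
def age_at_first_break(sr, ph, eh):
    """Equation (2.12): predicted pipe age (years) at its first break.

    sr: saturated soil resistivity (ohm-cm)
    ph: soil pH
    eh: redox potential (millivolts)
    """
    return 65.78 + 0.028 * sr - 6.338 * ph - 0.049 * eh

# Hypothetical soil sample: 1000 ohm-cm, pH 7.0, redox potential 200 mV.
example_age = age_at_first_break(sr=1000, ph=7.0, eh=200)
# 65.78 + 28.0 - 44.366 - 9.8 = 39.614 years

# A more resistive soil, all else equal, gives a later predicted first break.
later_age = age_at_first_break(sr=2000, ph=7.0, eh=200)
```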

Soil resistivity has traditionally been considered an important parameter
in evaluating the corrosivity of a soil (Norin and Vinka 2005). Redox
potential is an intensity parameter of the overall redox (oxidation-reduction)
reaction potential in the system (similar in concept to pH). Redox potential
($E_h$) describes the electrical state of a matrix. In soils, $E_h$ is an important
parameter controlling the persistence of many organic and inorganic
compounds (Vorenhout et al. 2004).

The data required for this model are typically not available. Sporadic data
collection is not expensive; however, a continuous and extensive data collection
program is costly. Continuous monitoring of soil properties is important

where ground water conditions have not reached steady state or are season-

ally dependent. Table 2.7 summarises the deterministic power and linear

models discussed in the literature.

Probabilistic models

Probabilistic models are strongly preferred by managers of water distribu-

tion systems in establishment of maintenance or rehabilitation strategies.

The reason is that these models also quantify the level of uncertainty which

is reasonable because there are always stochastic factors that affect the

pipe failures such as corrosion process, soil movement due to moisture con-

tent and type of soil, and external and internal burdens on pipes. The

uncertainty caused by the aforementioned stochastic factors is implicitly

ignored by deterministic models. However it can be quantified by proba-

bilistic models in terms of confidence intervals for estimated values. This

can be explained by the following example. Suppose that a probabilistic

method is giving $\hat{x}$ as an estimate for a variable $x$. The interval $[\hat{x}-\delta, \hat{x}+\delta]$
is the α-confidence interval for the estimate $\hat{x}$ if $\Pr(x \in [\hat{x}-\delta, \hat{x}+\delta]) = \alpha$

(Sheskin 2003).
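The definition of an α-confidence interval can be illustrated by simulation: if an estimate is repeatedly formed from fresh data, the interval around the estimate should contain the true value in a fraction α of the repetitions. The sketch below assumes, purely for illustration, that the estimate is the mean of normally distributed observations with known standard deviation; none of the numbers relate to pipe data.

```python
import math
import random

def interval_covers(true_value, estimate, delta):
    """True if the true value lies in [estimate - delta, estimate + delta]."""
    return estimate - delta <= true_value <= estimate + delta

def empirical_coverage(true_mean, sigma, n, delta, trials, rng):
    """Fraction of repeated experiments whose interval contains the true mean."""
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        estimate = sum(sample) / n
        hits += interval_covers(true_mean, estimate, delta)
    return hits / trials

rng = random.Random(1)  # fixed seed for reproducibility
sigma, n = 2.0, 25
# Half-width that gives alpha of about 0.95 for a normal mean with known sigma.
delta = 1.96 * sigma / math.sqrt(n)

coverage = empirical_coverage(true_mean=10.0, sigma=sigma, n=n,
                              delta=delta, trials=2000, rng=rng)
```

The empirical coverage should come out close to 0.95, which is the sense in which a probabilistic model quantifies its own uncertainty.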

Survival analysis is the main approach to probabilistic modelling of pipe

failures in the literature. The analysis of survival data is a traditional statis-

tical theme that deals with the time to the next failure data. Survival anal-

ysis has been used to predict pipe breakage behaviour by many researchers

in the past two decades. Some researchers have specifically adapted survival
analysis (as most frequently used in the biomedical field) to water
pipe failure problems (Clark et al. 1982, Clark and Goodrich 1988, Andreou
et al. 1987).

Table 2.7: Deterministic power and linear models

Reference                     Attributes
Mavin (1996)                  Estimates the number of breaks; applicable to
                              pipes with more than 6 failure experiences,
                              Equation (2.8)
Constantine et al. (1998)     Shifted time power model (STPM), Equation (2.10)
Righetti (2001)               Applied a shifted time power model (STPM) to a
                              failure history of a water distribution system in
                              Melbourne
Achim et al. (2007)           Compared the performance of a STPM to a STEM and
                              an ANN model for a failure history of a water
                              network in Melbourne, Australia
Kettler and Goulter (1985a)   A linear model for estimation of number of
                              failures per year, Equation (2.11)
McMullen (1982)               A linear model for estimation of the pipe’s age at
                              first break as a function of resistivity, pH, and
                              redox potential of its bedding soil,
                              Equation (2.12)

Survival analysis incorporates the fact that, although some pipes break while
other similar pipes do not, those breaks have a strong impact on the
likelihood of future breaks for those similar pipes. Pipes can fail many

times in their lifetime. Each time a failure is observed, an immediate

intervention on the network is necessary. Many researchers (Andreou et

al. 1987, Eisenbeis 1999, Gustafson and Clancy 1999) have shown that the

breakage pattern strongly depends on the number of previous breaks that

pipes have experienced. The number of previous breaks is often reported

as the most important factor for predicting future breaks. Survival anal-

ysis is particularly useful in this field when pipe break records have been

maintained for a good portion of the water pipe network history.

The survival function and proportional hazard function are the
fundamental elements in survival analysis. The basic quantity employed to

describe time-to-event phenomena is the survival function (i.e. component


reliability). This is the probability that an individual will survive beyond

time x. It is defined as:

S(x) = Pr(X > x) (2.13)

where X is a random variable denoting the time of next failure.

S(x) is the survival function, a non-increasing function that equals one
at the origin and tends to zero as x grows large. When X is a continuous
random variable, the survival function is the complement of the cumulative
distribution function (CDF) of X, that is, S(x) = 1 − F(x),
where F(x) = Pr(X ≤ x).

The hazard function is known as the conditional failure rate in reliability

theory, the force of mortality (FOM) in demography, or simply the hazard

rate. The hazard function is defined by:

h(x) = \lim_{\Delta x \to 0} \frac{\Pr(x \le X < x + \Delta x \mid X \ge x)}{\Delta x}    (2.14)

The term h(x)Δx can best be interpreted as the probability that the next
failure occurs in (x, x + Δx), given that the pipe has survived until
time x. If X is a continuous random variable denoting the time of
occurrence of the next failure, then:

h(x) = \frac{f(x)}{S(x)} = -\frac{d}{dx} \ln[S(x)]    (2.15)

where f(x) is the probability density function (pdf) of X. A related quantity

is the cumulative hazard function H(x), defined as

H(x) = \int_0^x h(u)\,du = -\ln[S(x)].    (2.16)

Thus, for continuous lifetime models:

S(x) = \exp[-H(x)] = \exp\left[-\int_0^x h(u)\,du\right].    (2.17)

The failure time distribution of pipes in a water distribution network may

be investigated through the survival function S(x), or the hazard function

h(x).
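The relationships in Equations (2.13) to (2.17) can be checked numerically. The following is a minimal sketch only: it assumes a Weibull lifetime purely for illustration, and the function names and parameter values (a scale of 50 years and a shape of 1.5) are hypothetical and are not taken from any of the studies reviewed here.

```python
import math

def weibull_survival(x, scale, shape):
    """S(x) = Pr(X > x) for a Weibull lifetime (illustrative choice)."""
    return math.exp(-((x / scale) ** shape))

def weibull_pdf(x, scale, shape):
    """Density f(x) = -dS/dx of the same Weibull lifetime."""
    return (shape / scale) * (x / scale) ** (shape - 1) * weibull_survival(x, scale, shape)

def hazard(x, scale, shape):
    """Equation (2.15): h(x) = f(x) / S(x)."""
    return weibull_pdf(x, scale, shape) / weibull_survival(x, scale, shape)

def cumulative_hazard(x, scale, shape, steps=10_000):
    """Equation (2.16): midpoint-rule approximation of the integral of h over [0, x]."""
    du = x / steps
    return sum(hazard((i + 0.5) * du, scale, shape) for i in range(steps)) * du

# Hypothetical parameters: scale of 50 years, shape of 1.5 (increasing hazard).
SCALE, SHAPE, AGE = 50.0, 1.5, 30.0
survival_direct = weibull_survival(AGE, SCALE, SHAPE)
# Equation (2.17): S(x) = exp[-H(x)]
survival_from_hazard = math.exp(-cumulative_hazard(AGE, SCALE, SHAPE))
```

Under these assumptions the two survival values agree to within the integration error, which illustrates why a failure time distribution can be studied through either S(x) or h(x).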

Kaara (1984), Marks (1985) and Andreou (1986) introduced the use

of a proportional hazard model for analysing failures in water distribution

networks. A general failure prediction model, named the “Cox propor-

tional hazard regression model”, was used by many researchers and ad-

justed for different data. Cox (1972) introduced the proportional hazard


model (PHM) in order to estimate the effects of different covariates on the

time to failure of a system. The Cox model has been used extensively in

medical statistics, where the benefit of the analysis of data on such factors

as life expectancy and duration of periods of freedom from symptoms of a

disease as related to treatment applied, individual histories and so on, is

obvious. The general form of Cox’s proportional hazard model is given in

Equation (2.18):

h(t, Z) = h0(t) exp(bηZ) (2.18)

where:

h(t, Z)=hazard function, which is instantaneous rate of failure;

h0(t)=arbitrary baseline hazard function;

Z=vector of covariates on the hazard function;

b=vector of coefficients to be estimated by regression from available data;

and

η= model parameter (chosen by trial and error for the best fitting to data).

A proportional hazard model is used to estimate the time to next failure.
The covariates Z represent environmental and operational factors that
influence the failure of water mains. Ageing of water pipes can be
represented by the baseline hazard function h0(t), which can be defined as a
time-dependent ageing parameter.

In a case study, Marks (1985) proposed the following second degree

polynomial of Equation (2.19) as the baseline hazard function:

h_0(t) = 2 \times 10^{-4} - 10^{-5} t + 2 \times 10^{-7} t^{2}    (2.19)

and used multiple regression to determine covariates Z that affect the break-

age rate. The most significant covariates identified by Marks (1985) are the

pipe length, operating pressure, percentage of low land development, pipe

“vintage” (or period of installation), pipe age at second (or higher) break

rate, number of previous breaks in pipe, and soil corrosivity.
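The way a proportional hazard model combines a baseline hazard with covariates can be sketched as follows. The baseline is the polynomial of Equation (2.19); the covariate values and coefficients are hypothetical and chosen only for illustration, and the exponent of Equation (2.18) is simplified here to the inner product of b and Z (the parameter η is omitted, which is an assumption of this sketch).

```python
import math

def baseline_hazard_marks(t):
    """Quadratic baseline hazard of Equation (2.19); t is the pipe age."""
    return 2e-4 - 1e-5 * t + 2e-7 * t ** 2

def proportional_hazard(t, covariates, coefficients):
    """h(t, Z) = h0(t) * exp(b . Z), a simplified reading of Equation (2.18)."""
    linear_predictor = sum(b * z for b, z in zip(coefficients, covariates))
    return baseline_hazard_marks(t) * math.exp(linear_predictor)

# Hypothetical coefficients and covariate vectors (not fitted to any data).
COEFFICIENTS = [0.4, 0.2]
PIPE_A = [1.0, 0.5]   # e.g. scaled pressure and number of previous breaks
PIPE_B = [0.0, 0.0]   # reference pipe with all covariates at zero

# The hazard ratio between two pipes does not depend on age t.
ratio_at_10 = proportional_hazard(10, PIPE_A, COEFFICIENTS) / proportional_hazard(10, PIPE_B, COEFFICIENTS)
ratio_at_60 = proportional_hazard(60, PIPE_A, COEFFICIENTS) / proportional_hazard(60, PIPE_B, COEFFICIENTS)
```

The constant ratio is the defining "proportional" property: covariates scale the hazard up or down, while the shape of the ageing curve comes entirely from h0(t).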

Cox’s proportional hazard model was also applied to the first three

breakages of a pipe by Andreou et al. (1987) who used the same baseline

hazard function and the same vector of covariates of Marks (1985) for the

early stage of pipe life (up to the third failure). For later stages (after the

third failure) a constant hazard was assumed as:

h = λ = exp(bηZ) (2.20)

where λ is the constant hazard. This model could not predict future failure
times with appropriate accuracy. A moderately low value of r^2 = 0.34 for
the prediction of the later stages (after three breakages) indicated a poor
model fit. However, this model became the reference for many modelling
efforts by other researchers.

As mentioned above, Andreou et al. (1987) divided a pipe's lifetime into
two stages. During the analysis of their failure data, they observed that
the time intervals between the first three consecutive failures followed an
ascending order. After the third failure, these intervals seemed to be
constant. They used a Cox proportional hazard model for the first phase of a
pipe's lifetime. In order to model the constant period, they considered a
Poisson distribution. Similar approaches were taken by a number of other
researchers, e.g. Marks et al. (1987). Assuming a constant rate of failure in
the second phase of a pipe's lifetime is an inaccurate basis for analysis,
and most predictive models instead assume an exponential or power
relationship between age and failure rate.

Herz (1997) and Lei and Sægrov (1998) developed probabilistic models

for estimation of the useful life of a pipe, considering it as the time to the

first failure. These lifetime models are meant to model the pipe breakage

rate over its lifetime. Although such lifetime analyses provide an insight

into failure mechanisms, they are impractical for developing decisions on

management of distribution systems. This is because the detailed failure

databases required by such models are unavailable in most water companies.

The majority of pipes in mature water networks are old and full breakage

records cannot be obtained from existing data.

Constantine and Darroch (1993) and Constantine et al. (1996) developed
a time-dependent Poisson process with the mean breakage rate depending on
pipe age:

H(t) = (t/\theta)^{\beta}    (2.21)

where:
H(t) = mean number of failures per unit length at age t (not to be mistaken
with the cumulative hazard function);
t = pipe age; and
θ, β = scale and shape parameters, respectively.

This is a Weibull random process because the resulting cumulative distri-

bution is in the form of the Weibull cumulative distribution function. In

(Constantine and Darroch 1993) and (Constantine et al. 1996), the parame-

ter β (shape parameter) is considered constant for a homogeneous group of

failures (e.g. failures that are attributed to corrosion only), whereas θ (scale

parameter) is the following function of some operational and environmental


covariates:

\theta = \theta_0 e^{\alpha Z}    (2.22)

where θ0 is the baseline value, α is the vector of coefficients to be estimated

by regression, and Z is a vector of covariates affecting breakage rate.
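Equations (2.21) and (2.22) can be combined in a few lines. This is a sketch under assumed values: the baseline scale, the shape parameter, the coefficients and the covariates below are hypothetical and are not estimates from Constantine and Darroch (1993) or Constantine et al. (1996).

```python
import math

def scale_parameter(theta0, alpha, covariates):
    """Equation (2.22): theta = theta0 * exp(alpha . Z)."""
    return theta0 * math.exp(sum(a * z for a, z in zip(alpha, covariates)))

def mean_failures(age, theta, beta):
    """Equation (2.21): H(t) = (t / theta) ** beta, failures per unit length."""
    return (age / theta) ** beta

def expected_failures_between(age_start, age_end, theta, beta):
    """Expected failures per unit length accumulated between two ages."""
    return mean_failures(age_end, theta, beta) - mean_failures(age_start, theta, beta)

# Hypothetical values for illustration only.
THETA0, BETA = 80.0, 2.0
ALPHA = [0.1, -0.2]
COVARIATES = [1.0, 0.5]

theta = scale_parameter(THETA0, ALPHA, COVARIATES)
next_decade = expected_failures_between(40.0, 50.0, theta, BETA)
previous_decade = expected_failures_between(30.0, 40.0, theta, BETA)
```

With a shape parameter greater than one, the expected number of failures in each successive decade grows, which is the ageing behaviour that a constant-rate Poisson model cannot represent.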

Bremond (1997) applied the PHM model, Equation (2.20), as a baseline

hazard function and proposed the following time-dependent Poisson model:

h_0(t) = \lambda \beta (\lambda t)^{\beta - 1}    (2.23)

where:

t= time to (next) failure; and

λ, β= scale and shape parameters (respectively) of the Weibull distribution.

The model resulted in a good fit to a large failure history of more than a
decade in France. The baseline hazard function used in this proportional
hazard model is the Weibull hazard function.

The use of the Poisson distribution for failure analysis is based on the
underlying assumption that there is a constant risk of failure, while the
times between failures are not necessarily equal. In other words, it is
implicitly assumed that each pipe experiences breakages occurring completely
randomly at a constant rate over the period of observation. This is not a
realistic assumption, since the pipe ages over the period of observation.
Gradual deterioration should therefore change the failure rate over that
period. This can be considered a source of inaccuracy in all failure
analysis models based on the Poisson distribution.

As mentioned earlier, the time-dependent Poisson model of Constantine
and Darroch (1993) was used by Mavin (1996) for data filtering. He tried
to eliminate the failure occurrences caused by operational or other factors
rather than by deterioration as a result of ageing. For this purpose, he
introduced some rules to identify the failures that occurred despite a low
probability. For example, when two successive failures occurred in spite of a
low probability (less than 1%), the second one was attributed to accidental
damage or a faulty operational procedure. Based on this reasoning, the second
failure was filtered out of the data and ignored in further analysis.
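A filtering rule of this kind can be sketched as below. The sketch is not Mavin's procedure: it assumes, for simplicity, a constant failure rate, so that the probability of a further failure within a gap g is 1 − exp(−rate × g), and it discards any failure whose occurrence so soon after the previous recorded failure had a probability below the 1% threshold. The function names, the rate and the failure times are hypothetical.

```python
import math

def probability_of_failure_within(gap_years, rate_per_year):
    """Probability of at least one failure within the gap, assuming a constant rate."""
    return 1.0 - math.exp(-rate_per_year * gap_years)

def filter_failures(failure_times, rate_per_year, threshold=0.01):
    """Drop failures that follow the previous recorded failure improbably quickly."""
    kept = []
    previous = None
    for time in sorted(failure_times):
        improbable = (
            previous is not None
            and probability_of_failure_within(time - previous, rate_per_year) < threshold
        )
        if not improbable:
            kept.append(time)
        # Compare the next failure with the latest recorded one, kept or not.
        previous = time
    return kept

# Hypothetical record: failure times in years since installation.
RECORD = [12.00, 12.01, 17.50, 23.00]
filtered = filter_failures(RECORD, rate_per_year=0.2)
```

Here the failure at 12.01 years follows the previous one after only a few days, an event with a probability of roughly 0.2% under the assumed rate, so it is treated as unrelated to ageing and removed.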

Eisenbeis (1994) proposed an approach similar to Andreou's application
of the proportional hazard model (Andreou 1986), but assumed a Weibull
distribution for the baseline hazard function. This model also included three
stages. The first stage described hazard functions for the pipes that had not
experienced a failure. The second stage described hazard functions for the
second to fourth failures, while the third stage described the hazard
functions for pipes after their fourth failure. Since the baseline hazard
function was actually a Weibull model, the procedure for predicting new
failures was only valid in the case where the Weibull distribution reduced
to an exponential distribution.

This five-parameter distribution model was applicable because 40 and 54

years of failure data on the large urban water pipe networks were available in

the study. The availability of data permitted the use of Cox’s proportional

hazards model which requires the significant risk factors to be identified first

by regression analysis. This significantly increases the number of parameters

that must be calibrated. The risk function for Cox’s proportional hazards

model was estimated at each time step from the data. Some significant risk

factors identified for one or more networks were pipe length, pipe diameter,

soil corrosivity, traffic intensity above the pipe, and installation after 1966.

Such data are not available in every water distribution system and therefore
the above technique is not applicable in many cases.

Lei and Sægrov (1998) used Cox's proportional hazards model and
the Weibull accelerated life model, given below, to analyse the water
distribution network in a case study:

\ln(T) = \mu + x^{T}\beta + \sigma Z    (2.24)

where:

T= time to next failure;

x= vector of explanatory covariates;

Z= random variable distributed as a Weibull;

σ= parameter to be estimated by maximum likelihood; and

β= vector of parameters estimated by maximum likelihood.

In this study only the first failure was analysed, and all maintenance activ-

ities were considered a failure. In this model, time to next failure is:

T = f(µ, σZexηβ) (2.25)

The essence of the accelerated lifetime models is that time to next failure

expands or contracts relative to that at x=0, where x is defined as a vector

of explanatory variables. In the study presented in Lei and Sægrov (1998),

the explanatory variables that were included in x were: age groups, pipe

size groups and length of pipe.
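The expansion or contraction of the time to next failure can be made explicit by solving Equation (2.24) for T, which gives T = exp(µ + σZ) × exp(xᵀβ). The sketch below illustrates this with hypothetical parameter values and a single fixed draw of Z; none of the numbers come from the study of Lei and Sægrov (1998).

```python
import math

def linear_predictor(x, beta):
    """Inner product x^T beta of covariates and coefficients."""
    return sum(x_i * beta_i for x_i, beta_i in zip(x, beta))

def time_to_next_failure(mu, sigma, z, x, beta):
    """Equation (2.24) solved for T: T = exp(mu + x^T beta + sigma * z)."""
    return math.exp(mu + linear_predictor(x, beta) + sigma * z)

def acceleration_factor(x, beta):
    """Factor by which time to failure is scaled relative to x = 0."""
    return math.exp(linear_predictor(x, beta))

# Hypothetical values for illustration only.
MU, SIGMA, Z_DRAW = 3.0, 0.5, -0.2
BETA = [-0.3, 0.1]
X = [1.0, 2.0]

baseline_time = time_to_next_failure(MU, SIGMA, Z_DRAW, [0.0, 0.0], BETA)
pipe_time = time_to_next_failure(MU, SIGMA, Z_DRAW, X, BETA)
```

In this example the factor exp(xᵀβ) is below one, so the covariates contract the time to next failure relative to the baseline pipe with x = 0.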

The research showed no significant difference between the results
of the two models. This is not surprising since it can be shown that an
accelerated lifetime model is equivalent to the proportional hazard model
when Z has a Weibull distribution (Cox and Oakes 1984). The authors did
not report whether the model was validated, nor did they comment on
the quality of the predictions. In addition, the study is limited by the
decision to treat all maintenance activities (including non-repair activities
like flushing) in the network as failures.

Eisenbeis (1999) also applied the accelerated lifetime model of
Equation (2.24) to a number of failure histories. He considered different
methods for using the proposed modelling approach in municipalities with brief
pipe break histories. His approach was to lengthen the pipe break history
by creating a sample of pipe breaks, randomly selecting break dates
that follow the shape of the survival function of the general model.

The covariates used for each system were decided considering the local
conditions. Using the previous number of breaks as a covariate complicated
the application of the method, necessitating the use of Monte Carlo
simulation for the prediction of the number of failures at a desired time
horizon. The author reported good predictions using this method but
did not provide any details to demonstrate these results. Moreover, adding a
created sample of data to the real breakage record makes the reliability of
the resulting model questionable.

Le Gat (1999) described the application of the Weibull proportional

hazard model for the analysis of irrigation pipes in the southern part of

France. The expected number of failures for each pipe was predicted using

this model. The proposed method followed the principles introduced in the

works of Eisenbeis (1999) and Andreou (1986). A Monte Carlo simulation

based on the survival functions was introduced to predict pipe failures.

Monte Carlo simulation is a method for iteratively evaluating a deter-

ministic model using sets of random numbers as inputs. This method is

often used when the model is complex, nonlinear, or involves numerous

uncertain parameters. A simulation can typically involve over 10,000 eval-

uations of the model, a task which in the past was only practical using

supercomputers (Metropolis and Ulam 1949).
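A Monte Carlo prediction of the number of failures over a planning horizon can be sketched as follows. This is an illustration rather than the procedure of Le Gat (1999) or Eisenbeis (1999): it assumes that the times between successive failures are independent Weibull draws with fixed, hypothetical parameters, and it uses the standard library generator `random.Random.weibullvariate`.

```python
import random

def simulate_failure_count(horizon, scale, shape, rng):
    """Count failures of one pipe within the horizon, drawing Weibull inter-failure times."""
    elapsed, count = 0.0, 0
    while True:
        elapsed += rng.weibullvariate(scale, shape)
        if elapsed > horizon:
            return count
        count += 1

def expected_failures(horizon, scale, shape, runs=10_000, seed=0):
    """Average the simulated failure counts over many independent runs."""
    rng = random.Random(seed)
    total = sum(simulate_failure_count(horizon, scale, shape, rng) for _ in range(runs))
    return total / runs

# Hypothetical case: shape 1 reduces the Weibull to an exponential with mean 5 years,
# so the count over 10 years is Poisson with mean 10 / 5 = 2.
estimate = expected_failures(horizon=10.0, scale=5.0, shape=1.0)
```

The exponential case is used because its expected count is known in closed form, which gives a simple check on the simulation before it is applied to models with no analytical solution.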

Eisenbeis (1997) presented an analysis of two French networks and one

Norwegian network using a Weibull proportional hazard model. The model

used a stratification of the failure data according to the number of previous

failures recorded. Acceptable agreements between observed and predicted

failures were reported. PHM modelling requires the inclusion of several
covariates in a single analysis. This reduces the amount of pre-grouping that
is required by a single-, two- or three-parameter model. However, this
grouping requirement is not altogether eliminated and careful analysis is
required to identify groups of pipes that may differ in their underlying
ageing process. Moreover, the underlying assumption that environmental and
operational factors affect the failure hazard of all types of pipes in the
same proportion reduces the reliability of these models. For instance, soil
condition has a clearly higher effect on unprotected cast iron pipes than on
coated or cathodically protected pipes. If these two categories are not
stratified in the analysis, the differences may reduce the accuracy of the
results. Hence, using these models requires careful examination of the data
in order to identify the covariates with the best predictive ability, as
well as those which are required for data stratification.

Cohort survival models are another class of survival models. A statistical
distribution, named the Herz distribution, was introduced and used by Herz
(1996), Herz (1997) and Herz (1998). The Herz distribution was developed
specifically for the ageing of infrastructure elements. In Herz (1996), the
interrelationship between ageing and the occurrence of first failure was found
to be very weak. Using the Herz distribution as a survival function for failure
analysis has the feature that the failure rate/renewal rate first increases
progressively with age, then increases more gradually, and finally approaches
a boundary value asymptotically. What is called the failure rate/renewal rate
in this thesis is, in statistical terms, the hazard function for the service
life of a pipe. The pipe is replaced when its service life has expired. The
probability density function f(t), survival function S(t) and hazard function
h(t) were given as:

f(t) = \frac{(a + 1)\, b\, e^{b(t-c)}}{[a + e^{b(t-c)}]^{2}}    (2.26)

S(t) = \frac{a + 1}{a + e^{b(t-c)}}    (2.27)

h(t) = \frac{b\, e^{b(t-c)}}{a + e^{b(t-c)}}    (2.28)

where the values of a, b and c parameters may be derived empirically for

the past periods and particular types of pipes. When used to forecast, they

must be based on expert judgement, i.e. on pipe survival estimates by

managers and engineers (Herz 1996). The ageing function (with upper and

lower boundaries) must be established for each group of pipes. The model

predicts the residual life (i.e. remaining lifetime) for each pipe cohort and

can be used to estimate rehabilitation requirements. This is why the
Herz model is known as a cohort survival model.
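Equations (2.26) to (2.28) can be evaluated directly, as in the following sketch. The parameter values are hypothetical, and the functions are assumed to apply for ages t ≥ c, where the survival function starts from one.

```python
import math

def herz_survival(t, a, b, c):
    """Equation (2.27): S(t) = (a + 1) / (a + exp(b (t - c)))."""
    return (a + 1) / (a + math.exp(b * (t - c)))

def herz_pdf(t, a, b, c):
    """Equation (2.26): f(t) = (a + 1) b exp(b (t - c)) / (a + exp(b (t - c)))^2."""
    growth = math.exp(b * (t - c))
    return (a + 1) * b * growth / (a + growth) ** 2

def herz_hazard(t, a, b, c):
    """Equation (2.28): h(t) = b exp(b (t - c)) / (a + exp(b (t - c)))."""
    growth = math.exp(b * (t - c))
    return b * growth / (a + growth)

# Hypothetical parameters for one pipe cohort (ages in years).
A, B, C = 10.0, 0.1, 20.0
hazard_at_60 = herz_hazard(60.0, A, B, C)
hazard_limit = herz_hazard(200.0, A, B, C)   # approaches the boundary value b
```

The sketch reproduces the behaviour described above: the hazard equals f(t)/S(t), rises with age, and levels off at the boundary value b for very old pipes.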


This thesis examined the application of the above model to the CWW
failure database by fitting the empirical failure rates (calculated from the
past failure data; see Section 4.5 for more details) to the models given in
Equations (2.26)-(2.28). A small correlation coefficient of r^2 = 0.25 was
observed, demonstrating a poor performance of the above model for the case
study of this thesis.

A number of European research centres have developed models for as-

sessing rehabilitation and renewal needs for water infrastructure. These

decision support tools contain several modules including a network inven-

tory module, a failure and break forecasting module, an economic data

module and a strategy comparison module. Several major European cities

have used the Herz distribution for planning pipeline renovation and
rehabilitation. The procedure has been incorporated into the user-friendly
software KANEW in a research project sponsored by the American Water Works
Association Research Foundation (AWWARF) (Deb 1998). KANEW predicts the
date on which the selected pipe sections will reach the end of their service
lives. The pipe sections are differentiated by date of installation and by
type of pipe sections with distinctive life spans.

The system assumes service life to be a random variable, starting after

some time of resistance and being characterized by a median age and a

standard deviation, or age that would be reached by a certain percentage of

the most durable pipe section. The user can choose the parameters of the

Herz distribution.

Predictions are based on optimistic assumptions of service lives that

are derived from failure and rehabilitation statistics for different types of

pipes. The cohort survival model of KANEW is a tool for exploring network

rehabilitation strategies.

The main limitations of KANEW are:

- Since no covariate structure is included in KANEW, the model
does not provide for the analysis of individual pipes. Ageing functions
are specified for each type of pipe, not for individual pipes. This
implies that the model should only be used for analysis of rehabilitation
requirements for the entire water distribution network (i.e. at network
level).

- The parameters of the Herz distribution used in KANEW are based
on historical renewal rates and not historical break rates. The
renewal rates reflect the rehabilitation policies of the past (e.g. often
tending to maintain a fixed average age of the stock) and the economic
and technical conditions of the period. Furthermore, the rehabilitation
policies are likely to change in the future, so the parameters would
have to be changed in order to reflect future standards and policies.

Kulkarni et al. (1986) proposed a Bayesian diagnostic model for estima-

tion of system-wide probability of failures:

\Pr(\text{failure} \mid \text{specific characteristics}) = \frac{P_{c/f}\, P_f}{P_{c/f}\, P_f + P_{c/nf}\,(1 - P_f)}    (2.29)

where:
P_f = prior (unconditional) probability that a segment fails;
P_{c/f} = probability of observing specified characteristics on a segment that
failed; and
P_{c/nf} = probability of observing the same characteristics on a segment that
has not failed.

This model can be applied to groups of pipes that are homogeneous in terms
of criteria such as diameter, length, age and type, soil characteristics and
water pressure. For the failure database of the case study in this thesis, the
failure probabilities given by the above equation were compared with the
empirical failure probabilities calculated directly from the failure records
(see Section 4.5). On average, a substantial difference of 67.5%
(over-prediction) was observed in the resulting predictions.
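Equation (2.29) is an application of Bayes' theorem and can be evaluated as in the sketch below. The function name and the three input probabilities are hypothetical and serve only to show the arithmetic.

```python
def posterior_failure_probability(p_char_given_failed, p_char_given_not_failed, p_failure):
    """Equation (2.29): probability of failure given the observed characteristics."""
    numerator = p_char_given_failed * p_failure
    denominator = numerator + p_char_given_not_failed * (1.0 - p_failure)
    return numerator / denominator

# Hypothetical inputs: 20% of segments fail overall; the characteristics are seen
# on 80% of failed segments and on 10% of segments that have not failed.
posterior = posterior_failure_probability(0.8, 0.1, 0.2)
```

With these inputs the numerator is 0.16 and the denominator is 0.16 + 0.08 = 0.24, so observing the characteristics raises the failure probability from 0.2 to about 0.67.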

Malandain et al. (1998) used a Poisson regression model to quantify

the influence of diameter, material, and position of the pipe (i.e. located

in a road or not) on the break rate. The time passed since installation

is not included in the regression. The water network of Lyon in France

was used as a case study. Prior to the analysis, the pipes were grouped

according to structural and environmental factors. In order to model the

break rate (in the form of a hazard function) as a function of time, the

break rate function was divided into three different intervals. Each interval

was analysed separately, resulting in a step function for the break rates. In

the early stage of a pipe’s life, the hazard function increased and a Weibull

model was assumed based on the results from Eisenbeis (1999). In the

following stages, an exponential model (i.e. constant hazard function) was

used. The authors pointed out that the proposed approach should only

be used at network level and not at pipe level. A Geographic Information

System (GIS) was used for identifying the spatial variation for the break

rate caused by environmental variables (e.g. soil condition).

Gustafson and Clancy (1999) described a method to model the occur-

rence of pipe failures in grey cast iron pipes with a semi-Markov model,


where the “state” of the water mains is represented by the number of fail-

ures and the time between failures is used as the “holding time”. The re-

quired probability distributions were estimated using survival analysis. The

time to first failure was modelled with a three-parameter generalised gamma
distribution and the subsequent failures with an exponential distribution,
identical for all t_i (i > 1), where t_i is the time between the (i − 1)-th
and the i-th breaks of a pipe. The dataset was divided into three groups of pipes,

depending on the original wall thickness. No explanatory variables were

included in the analysis, due to lack of data. The authors reported that the

mean failure time is strongly related to the number of failures and concluded

that these grey cast iron pipes are deteriorating. The model’s reliability is

called into question by the poor time resolution (just one year) available for

recorded failures.

Li and Haimes (1992) also used a PHM as introduced by Andreou (1986)

to identify two stages of deterioration and their accompanying hazard func-

tions. The authors used the formula of Walski and Pelliccia (1982) to esti-

mate the repair time of a pipe (the time it takes to repair the pipe), and to

estimate the accompanying cost of repair and replacement. This was used

to formulate a two-stage decision making process.

Goulter et al. (1993) assumed a non-homogeneous Poisson distribution
for the failures that follow an initial failure within a spatial and temporal
cluster. The authors used a cross-referencing scheme to determine the mean
number of failures that occur subsequent to initial failures. Nonlinear
regression was used to determine the parameters, based on time and space, of
the non-homogeneous Poisson model:

P(x) = \frac{m^{x} e^{-m}}{x!}    (2.30)

where:
P(x) = probability of x failures;
m = average number of subsequent failures occurring in the cluster domain; and
x = number of subsequent failures occurring in the cluster domain.

This method assumed the Poisson distribution for the failures occurring

within a fixed interval of time (T ) and space (S). Hence, a non-linear

regression function was used to estimate the initial values of m:

m = b_0 t^{b_1} s^{b_2} + \varepsilon    (2.31)

where b_0, b_1, and b_2 are the regression parameters, and ε denotes a random
error. After some time (e.g. one year), the parameter m is updated using the
new set of failure data:

m = m(s, t) = \int_0^S \int_0^T r(s, t)\, dt\, ds    (2.32)

where:

s = distance from the first break in a cluster;

t = time elapsed from the first break in a cluster;

S = space interval (meters); and

T = time interval (days).

The proposed technique gives acceptable approximations when the mean
number of subsequent failures is very low. It was observed that, in the case
of a high mean, the distribution obtained with the regression model
results in an inadequate fit. This model requires precise information about
the location of breakages, which calls for the use of a GIS. Moreover, the
model can only be used to predict the probability of breakages occurring
subsequent to an initial breakage in the cluster. In addition, climatic or
other variables that influence the failure profile with a high level of local
annual variation cannot be easily considered using this technique.
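Equations (2.30) and (2.31) can be combined as in the following sketch, in which the regression mean (without the random error term) is fed into the Poisson probability. The regression parameters and the cluster dimensions are hypothetical and are not values reported by Goulter et al. (1993).

```python
import math

def cluster_mean(t, s, b0, b1, b2):
    """Deterministic part of Equation (2.31): m = b0 * t**b1 * s**b2."""
    return b0 * t ** b1 * s ** b2

def poisson_probability(x, m):
    """Equation (2.30): P(x) = m**x * exp(-m) / x!."""
    return m ** x * math.exp(-m) / math.factorial(x)

# Hypothetical regression parameters and cluster size (30 days, 100 metres).
B0, B1, B2 = 0.01, 0.5, 0.4
mean_subsequent = cluster_mean(30.0, 100.0, B0, B1, B2)

probability_none = poisson_probability(0, mean_subsequent)
probability_at_least_one = 1.0 - probability_none
total_probability = sum(poisson_probability(x, mean_subsequent) for x in range(50))
```

The probability of at least one subsequent failure in the cluster is then one minus the probability of none, which is the quantity of practical interest after an initial break.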

Jacobs and Karney (1994) proposed the following simple probabilistic
model:

P^{-1} = a_0 + a_1 \cdot \text{Length} + a_2 \cdot \text{Age}    (2.33)

where P^{-1} denotes the reciprocal of the probability of a day with no breaks
and a_0, a_1, a_2 are regression coefficients. The data required for this
model are pipe length, age, and breakage history. More data enables the
formation of homogeneous groups. This model takes a very simplistic approach
towards the various environmental and maintenance factors influencing the
complicated mechanism of failure.

Lee and Kim (2004) attempted to estimate the probability of pipe failures
in terms of their different failure-related characteristics, such as their
corrosion rates in the depth and length directions. The focus of this study
was on the evaluation of failure probabilities at different times in the
future and on their dependence on the aforementioned characteristics. For each
characteristic, two random variables were considered: resistance (R) and
load (L) variables. The failure caused by that characteristic is denoted by
R < L or R − L = Z < 0, and the failure probability is derived by assuming
that R and L are both normal and independent:

\text{Failure Probability} = \Pr(Z < 0) = \Phi\left(-\frac{\mu_Z}{\sigma_Z}\right) = \Phi\left(-\frac{\mu_R - \mu_L}{\sqrt{\sigma_R^{2} + \sigma_L^{2}}}\right)    (2.34)


where:
Φ(.) = CDF of a standard normal random variable;
µ_Z = average of the Z variable;
µ_R = average of the R variable;
µ_L = average of the L variable;
σ_Z = standard deviation of the Z variable;
σ_R = standard deviation of the R variable; and
σ_L = standard deviation of the L variable.

For each characteristic, the term (µ_Z/σ_Z) and its time variations are
determined using different models. Those values are then substituted into
the above equation. However, assuming a normal distribution for both the
resistance (R) and load (L) variables corresponding to each of the
characteristics is unrealistic. In addition, the model can only be applied to
very large histories of pipe failures that include a large variety of
failure-related characteristics for the pipes. This kind of data is generally
not available in most water distribution systems.
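The calculation in Equation (2.34) can be sketched with the standard library alone, using the error function to evaluate the standard normal CDF. The means and standard deviations below are hypothetical and do not come from Lee and Kim (2004).

```python
import math

def standard_normal_cdf(value):
    """CDF of a standard normal random variable, via the error function."""
    return 0.5 * (1.0 + math.erf(value / math.sqrt(2.0)))

def failure_probability(mean_r, std_r, mean_l, std_l):
    """Equation (2.34): Pr(R - L < 0) for independent normal R and L."""
    mean_z = mean_r - mean_l
    std_z = math.sqrt(std_r ** 2 + std_l ** 2)
    return standard_normal_cdf(-mean_z / std_z)

# Hypothetical resistance and load (same units): Z has mean 6 and standard deviation 5.
probability = failure_probability(mean_r=10.0, std_r=3.0, mean_l=4.0, std_l=4.0)
```

Here the ratio µ_Z/σ_Z is 1.2, giving a failure probability of roughly 0.115; when the mean resistance equals the mean load the probability is exactly one half.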

Most of the probabilistic models that have appeared in the literature

(including the models reviewed in this section) are listed in Tables 2.8 to

2.11. Table 2.8 lists the probabilistic models based on the Poisson distribu-

tion. The probabilistic models using Cox’s and Weibull proportional hazard

function are listed in Tables 2.9 and 2.10, respectively. The remainder of

the probabilistic models in the literature are listed in Table 2.11.

2.6 Reliability Analysis of Water Networks

Reliability analysis of infrastructure systems has been a constant challenge
for managers of infrastructure systems such as water networks. Many
researchers have addressed this problem from various points of view. For
instance, Walters (1988) described the reliability of water distribution
systems as one of the most challenging unsolved issues facing water supply
systems. Two major issues have been recognised as being particularly
problematic in performing a reliability assessment: firstly, what measure is
the most appropriate for the assessment of reliability, and secondly, what is
an acceptable level of reliability (Xu and Goulter 1998a).

A number of researchers, e.g. Mays (1996) and Ostfeld and Shamir

(1996) define the reliability assessment of a water distribution network as

measuring the ability of the system to meet the consumer requirements in

terms of quantity and quality under both normal and abnormal operating


conditions. Maglionico and Ugarelli (2004) defined specific Reliability
Indicators for both hydraulic and quality aspects, and for their combination.
In that research, the reliability of the whole system was determined as
the average value of the Reliability Indicators calculated for all the nodes
during the period of simulation (100 years).

A review of existing studies on reliability analysis shows that there is
currently no universally accepted definition or measure for the reliability
of water distribution systems, as this requires both the quantification of
reliability measures and criteria that are
meaningful and appropriate. The behaviour of a water distribution system

is governed by physical laws that describe the flow relationships in pipes

and hydraulic control elements, consumer demand, and system layout.

Therefore, reliability considerations for water distribution systems are an
integral part of all decisions concerning the planning, design and operation
phases. Two

scenarios are possible for the failure of water distribution systems. Water

networks, like structural trusses, may fail because of loadings being greater

than design levels. Like electrical networks, they also may fail because

components (e.g. pipes) break even with loadings being below the design

levels. Thus, there are two primary probabilistic factors which contribute

to the performance of water distribution systems, namely, the probability

of failure of individual network components and the probability of actual

demand being greater than the design load (Goulter 1987).

A number of models were established for the hydraulic (capacity)
reliability analysis of the system, e.g. (Gupta and Bhave 1994, Han and
Dai 1996, Goulter 1992, Xu and Goulter 1998b, Cooke and Jager 1998,
Goulter 1990, Goulter and Bouchart 1987, Xu and Powell 1991, Xu and
Goulter 1999). Xu et al. (2003) define the capacity
reliability as the probability that the nodal demand is met at or over the
prescribed minimum pressure for a fixed network configuration.

In a general sense, reliability is the ability to deliver design flows under a

wide range of conditions. Obviously, component failure (e.g. pipe breakage)
can severely hamper the ability of a network to perform up to specifications.

Some researchers have attempted to define the parts of the problem

and to incorporate them into design models. Hobbs (1985) and Hobbs

and Beim (1986) proposed a series of approaches for reliability assessments

in water supply systems which include both the probability of component

failures and the probability distribution of demands. The approaches were

however developed for the supply aspect of water systems rather than the

distribution network itself. Shamir and Howard (1985) proposed approaches


for reliability assessment of water supply systems but they were similarly

concerned primarily with the supply aspect of the system.

The reliability of the components of a distribution system in delivering the
required quantity of water can be referred to as hydraulic reliability. For
instance, flow at the nodes can be regarded as an indicator for measuring the
function of a distribution system. In this case, the system reliability is
measured by assessing the condition of nodes in receiving a given supply at a
given head. If this head is not attainable, supply at the node is reduced.
Each node can thus be in a normal, reduced service, or failure mode.

The system will be said to be in normal mode if all nodes are receiving

normal supply, in failure mode if supply to any node has been shut off, and

in reduced mode if some node or nodes are receiving reduced supply but no

nodes are completely shut off. In a reliability assessment of water distribution

systems, Kettler and Goulter (1985b) used the probability of failure of

major water supply paths while Goulter and Coals (1986) used the probability

of node isolation. Both approaches used a linear program through

constraints restricting the average number of breaks per year permitted in

each link. The relevant probabilities of interest were also calculated using

the average failure rates in each link.
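The node and system modes defined above can be illustrated with a minimal sketch. The names `Mode`, `node_mode` and `system_mode` are hypothetical, and treating a node with zero supply as shut off is an assumption made only for this example; the thesis does not prescribe an implementation.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    REDUCED = "reduced"
    FAILURE = "failure"


def node_mode(supply: float, required_supply: float) -> Mode:
    """Classify one node by the supply it receives (assumed rule)."""
    if supply <= 0.0:
        return Mode.FAILURE      # supply to the node has been shut off
    if supply < required_supply:
        return Mode.REDUCED      # required head not attainable, supply reduced
    return Mode.NORMAL


def system_mode(node_modes: list[Mode]) -> Mode:
    """Failure if any node is shut off; reduced if some node is reduced
    and none is shut off; normal if all nodes receive normal supply."""
    if any(m is Mode.FAILURE for m in node_modes):
        return Mode.FAILURE
    if any(m is Mode.REDUCED for m in node_modes):
        return Mode.REDUCED
    return Mode.NORMAL
```

A single shut-off node is enough to put the whole system in failure mode, which is why the failure check comes first.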

In an examination of the hydraulic reliability of distribution systems,

Cullinane (1986) presented the concepts of mechanical reliability and availability

as quantitative measures of system reliability, while Mays et al. (1986) used

a cut-set approach for modelling the reliability of the network. The proposed

procedure showed how failure definition can be directly included into an

optimisation design model. In the summary of the procedure, it was men-

tioned that the study of reliability of water distribution systems is severely

hampered by lack of an accepted definition of measures for reliability.

For the purpose of assessment of reliability of water distribution systems,

the first step is to define the values and criteria to measure the reliability.

Then, there is a need for a mathematical model that is capable of predicting

the state of components, and the last step is combining the reliability measures of

the system and the performance state of system components. The mathematical

modelling of component failures, mentioned as the middle stage of reliability

analysis of water distribution system, is the focus of this thesis.


2.7 Milestones of Study and Summary

The studies presented in this thesis have been undertaken in a number of

steps described as follows. First, a failure database was prepared consisting

of records of past failures in the network during a number of years com-

bined with GIS information about the location of pipes. Since the database

contains failures of grey cast iron pipes, the characteristics and failure mech-

anism of this type of pipe were briefly studied.

Existing probabilistic models were listed and discussed earlier in this

chapter. These models partially or completely rely on a number of dis-

tribution models. In other words, the probabilistic methods, reviewed in

this chapter, extract the pattern of previous failure occurrences in order to

project it to the future. For this purpose they use known distribution models.

It was noted in this research that the parametric modelling approach does

not take into account the non-stationary nature of failure occurrences as a

random process. This matter has been theoretically analysed as presented

in this thesis.

The other observation made in this literature review is that most of the

existing models require knowledge of the rank order of failures, i.e. whether

a failure in the existing record is the first or second or the n-th failure since

installation. However, in many mature water distribution systems such as

CWW, the data on failures that occurred before a specific date (prior to

the first day of record) are unavailable.

Based on the identified limitations of these existing models, the charac-

teristics of available datasets and the purpose of this study, a new technique

that considers the random changes of environmental factors affecting the

performance of water mains and is capable of handling incomplete data has

been formulated during the course of this research.

A new probabilistic model in which the mathematical form of the model

is learnt from previous failures using an artificial neural network (ANN) is

proposed. The neural network technique is adopted to reconstruct the pat-

tern of previous failure occurrences in order to predict the reliability of each

class of pipe at a specified time. The proposed neural network

model resulted in good estimates without the need for predefined distribu-

tions for fitting the previous failure data to. The ANN technique benefits

from its high computational power in dealing with noisy and censored data

and fitting them to the closest nonlinear curve. The black box of the neural

network model includes a number of parameters that are adjusted through

learning from past failures. However, due to the existence of these parameters,

although they are adjustable, the neural model is a parametric model

and unable to accommodate the non-stationarity of the failure process.

To advance the study to achieve an accurate and adaptive non-parametric

model, a new technique for probabilistic analysis of water mains has been

developed. This technique can be applied to the water pipe failure history

to estimate the expected number of failures within a given number of time

intervals (days, weeks, months, etc.) in the future. The upper bound and

lower bound of 80% confidence intervals for the estimations can also be

determined. The outputs of the prediction method can be automatically

updated with time, so that the proposed method implicitly takes into account the

gradual variations of the factors influencing the deterioration process.


Table 2.8: Probabilistic models using time-dependent Poisson model

Reference                         Description

Andreou et al. (1987)             Assumed a constant hazard function of Equation (2.20), which
                                  considers a Poisson distribution for the inter-failure times
                                  after the third failure

Constantine and Darroch (1993)    Proposed a time-dependent Poisson model, Equation (2.21), with
                                  the scale parameter a function of some operational and
                                  environmental covariates, Equation (2.22)

Bremond (1997)                    Proposed a time-dependent Poisson model of Equation (2.21), to
                                  apply the proportional hazard method of Equation (2.20)

Eisenbeis (1994)                  Used the same proportional hazard of Equation (2.20) but assumed
                                  a Weibull distribution for the baseline hazard function. This
                                  model included three stages. The first stage described hazard
                                  functions for pipes that have not experienced a failure. The
                                  second stage described hazard functions for the second to fourth
                                  failure, while the third stage described the hazard functions
                                  for pipes after their fourth failure.

Malandain et al. (1998, 1999)     In their analysis, the break rate function was divided into
                                  three different intervals. They used the Weibull hazard function
                                  for applying the proportional hazard method for the early stage
                                  of a pipe’s life

Table 2.9: Probabilistic models using Cox’s proportional hazard

Reference                         Description

Kaara (1984)                      Introduced the use of Cox’s proportional hazards model for
                                  analysing failures in water distribution networks. The
                                  non-parametric multivariate model, Equation (2.18), was used for
                                  a survival based model

Marks (1985)                      Used Equation (2.19) as the baseline hazard function for Cox’s
                                  proportional hazards model

Andreou et al. (1987)             Used Equation (2.19) as the baseline hazard function for the
                                  early stage of a pipe’s life (up to the first three breaks)

Li and Haimes (1992)              Used Andreou’s proportional hazard Equation (2.19) and a Poisson
                                  model for developing a decision support system

Lei and Sgrov (1998)              Used Cox’s proportional hazards model for a failure history

Table 2.10: Probabilistic models using Weibull hazard function

Reference                         Description

Le Gat (1999)                     The expected number of failures for each pipe was predicted
                                  using Weibull proportional hazard model (PHM) for the analysis
                                  of irrigation pipes in the southern part of France

Lei and Sgrov (1998)              Used a Weibull accelerated life model Equations (2.24) and (2.25)

Eisenbeis (1997)                  Proposed a Weibull PHM that used a stratification of the failure
                                  data based on the number of previous failures recorded


Table 2.11: Miscellaneous probabilistic models

Reference                         Description

Herz (1996)                       Introduced Equation (2.28) as a cohort hazard function, assuming
                                  a weak correlation between ageing and the occurrence of the
                                  first failure; then, a sharp increase in the correlation of
                                  failure rate with ageing; and finally a gradual increase of the
                                  failure rate with ageing, approaching a boundary value
                                  asymptotically

Deb (1998)                        Software for renewal projects, named KANEW, was developed using
                                  the Herz distribution function of Equation (2.28). Ageing
                                  functions were specified for each type of pipe

Eisenbeis (1999)                  Used the Gumbel distribution for the accelerated life model of
                                  Equation (2.24)

Kulkarni et al. (1986)            Proposed a Bayesian diagnostic model for estimation of the
                                  system-wide probability of failures, Equation (2.29). The model
                                  is applicable to groups of pipes that are homogeneous in terms
                                  of diameter, length, etc., and gives the probability of
                                  observing specified characteristics on a segment of pipes that
                                  have failed

Goulter et al. (1993)             Assumed a non-homogeneous Poisson distribution for failures
                                  subsequent to an initial failure in a spatial and temporal
                                  cluster; Equations (2.30), (2.31) and (2.32)

Gustafson and Clancy (1999)       Described the use of a semi-Markov method for cast iron pipes.
                                  The time to first failure was modelled with a 3-parameter
                                  generalised gamma distribution and the subsequent failures with
                                  an exponential distribution

Jacobs and Karney (1994)          Equation (2.33); returns the reciprocal of the probability of a
                                  day with no breaks

Lee and Kim (2004)                Equation (2.34); estimates the probability of failure in terms
                                  of corrosion rate etc., assuming that resistance and load are
                                  both normal and independent

Chapter 3

Data Description

3.1 Typical Failure Data in Water Distribution Systems

A common problem associated with failure time records of mature water

distribution systems is the lack of a complete failure history. In fact, the water

networks of most cities were established more than 100 years ago, and

many of the early buried pipes are still in service while their complete failure

records since the first years of their installation are not available. Thus,

decisions regarding management of the major parts of water distribution

systems are made in the absence of a complete dataset.

Figure 3.1 depicts the typical data that is usually available in water

distribution networks. In this figure, the time window shows the period

of time for which failure records are available. The ‘×’ symbols on the

horizontal (time) axis denote occurrences of failures. Failures are likely to

have occurred prior to the starting point of the available failure data, but not

to have been recorded. Therefore, the left side of the available data is unknown and is

called left-censored data. In addition, given a fixed set of data, the right side of

available failure data (spanning the period of time between the last failure

time and next failure) may also be unknown, i.e. the data can also be right

censored.
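The left and right censoring described above can be made concrete with a small sketch. It assumes that the left-censored span runs from installation to the start of the record window, and that the right-censored span runs from the last recorded failure (or the window start, if there is none) to the end of the window. The function name `censoring_summary` and its return keys are hypothetical.

```python
from datetime import date


def censoring_summary(installed, window_start, window_end, recorded_failures):
    """Summarise what a fixed record window reveals about one pipe."""
    # Only failures inside the time window are available for analysis.
    in_window = sorted(
        d for d in recorded_failures if window_start <= d <= window_end
    )
    # Anything between installation and the window start is unobserved.
    left_censored_days = max((window_start - installed).days, 0)
    # After the last observed failure, the time to the next failure is unknown.
    last_event = in_window[-1] if in_window else max(window_start, installed)
    right_censored_days = (window_end - last_event).days
    return {
        "failures_in_window": len(in_window),
        "left_censored_days": left_censored_days,
        "right_censored_days": right_censored_days,
    }


summary = censoring_summary(
    installed=date(1950, 1, 1),          # illustrative pipe, not real data
    window_start=date(1997, 1, 1),
    window_end=date(2000, 12, 31),
    recorded_failures=[date(1998, 3, 10), date(1999, 7, 1)],
)
```

For this illustrative pipe, 47 years of history before the window are unobserved, while only the last 549 days of the window are right censored.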

In addition to these limitations, data from a water distribution system

might be subject to recording or typing errors and the like. Failure data

may also be incomplete. Records in hard copy may not be realistically accessible.

Reliance is therefore placed on digital records, which may have only a

few years of records which represent only a small portion of the pipe history.

Presence of false data in a failure record can potentially cause deviation in

the results of data analysis. Therefore, false data should be identified and

eliminated from the database prior to analysis. Although the absence of

failure records for a short period of time does not cause considerable inaccuracy,

missing essential information in a failure record, such as the size of a

failed pipe, can decrease the credibility of the resulting models. Failure data

might also be incomplete in that, for instance, details of pipe deterioration,

soil type, condition of bedding, etc., may be absent.

Figure 3.1: Failure data in water networks may be available only during

specific time windows and include left- and right-censored data outside the window.

(Diagram labels: Installation Year, Time, Left Censored, Available Failure Data, Right Censored, Time Window; × marks denote failures.)

3.2 Contents of Database of This Study

The database of this study is a failure history of cast iron pipes belonging

to City West Water (CWW). CWW is a water retailing company in the western

suburbs of Melbourne. Melbourne Water, in its report (Water Main

Renewal Study 1991), noted that the western region of Melbourne was experiencing

a disproportionately high rate of failures. A burst rate three

times that of Melbourne’s other two water supply regions (which are

separate water retail companies) was reported. Indeed, between 1972 and

1990 the annual average water main failure rate throughout CWW’s cur-

rent licence area was approximately 1 failure/km/year, as compared with

0.3 to 0.5 failures/km/year for the other two networks (Water Main Renewal

Study 1991). Also “CWW’s Water Reticulation Asset Status- Pipe Struc-

tural Performance report” (Water Reticulation Asset Status Report 1997)

reported that in 1995/96, CWW had 3.1 times the break rate of South East

Water. In the same year, water industry benchmarks revealed that CWW

had the highest water main break rate in Australia (WSAA facts’99 1999).

Given this background, since 1999, CWW has recognised the need for

the failure analysis of water pipes. Accordingly, an investigation of cast

iron pipes and associated failures with a view to formulating a strategy for

cost-effective asset management in both short and long terms was conducted

(Righetti 2001). That study resulted in some models for failure prediction

of those water mains mentioned and discussed in the literature review of

this thesis.

The reason for choosing cast iron pipes specifically was based on the fact

that cast iron pipes comprise more than half of CWW’s water mains and

contribute disproportionately to the number of failures and customer service

key performance indicators (KPIs) (Righetti 2001). This thesis reports the

failure analysis study as conducted on CWW’s cast iron pipes. However, it

is important to note that the developed techniques and approaches discussed

in this thesis are not exclusive to this data and can be tuned and applied

to other failure histories of other classes of pipes as well.

In-service water mains or pipes are subjected to continuous deleteri-

ous reactions and internal and external loads that undermine the intended

design factor of safety (FS). Consequently, the service life is significantly

reduced if existing stresses on structurally deteriorated pipes exceed the ex-

pected or admissible design loads or stresses. Pipe failure is defined as an

event in which the factor of safety falls below a critical value, FScr (usu-

ally set to 1), i.e., FS < FScr. Cast iron, a brittle material, typically fails

through fracture at strains of 0.5%. Thus, the fracture of brittle materials

such as cast iron is dictated by its ultimate strength.

In the database used in this study, a water main failure means a struc-

tural failure of the pipe that results in water visibly escaping to the en-

vironment. The failure does not mean a failure to meet defined regulated

customer service standards. It also does not include a failure of the pipe

to supply the water at the required quality, or general leakage of water

from unknown sources. At City West Water, these two latter issues are

either small or non-existent. It has been found at CWW that in many

areas, it is currently economically advantageous to reactively repair failure

spots as they occur, rather than undertake renewals to address or prevent

failures (Righetti 2001). However, in some instances a renewal is required

due to regulatory constraints on interruptions to supply, as well as critical

situations where single or repeat failures lead to high profile incidents and

social costs. Both the regulatory requirements and asset conditions are of

significant importance to CWW.

Traditionally, CWW used three categories to assess their assets:

• Frequently bursting pipe: Pipes that had four or more breaks in

12 months or less. Due consideration was also given to the number of

historical breaks recorded for the asset. These assets were considered

at high risk of causing more than five unplanned interruptions per

year to each customer.

• Brittle pipe: Pipes that had three breaks in 12 months or less and

a historically high number of breaks. This criterion was developed in

recognition that CWW had the highest number of failure breaks in

Australia. This category was defined to target the pipes in a shut off

block that fail regularly (three times per year) such that the overall

contribution of each pipe combined to result in customers in the shut

off block experiencing more than five unplanned interruptions. A shut

off block is the zone of water supply that gets interrupted as result of

interruption to any of the water utilities in that area. Water mains

were renewed under this criterion only where it was not viable to

install additional valves to reduce the size of the shut off block. The

reason for considering this category was to proactively renew assets

that were most likely to become frequently bursting pipes in the near

future.

• Critical assets: Assets that were located in areas where the consequences

of a failure were high and it was suspected that the probability

of a failure was also high. For instance, water mains greater than 80

years of age, crossing under tram tracks in the CBD are considered

critical. High failure probability was determined with respect to per-

formance or inferred, based on design life/age of asset.
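As a rough illustration of how the three traditional categories could be applied to a break history, the sketch below counts the largest number of breaks falling in any 12-month span. Several points are assumptions made for the example rather than CWW practice: 12 months is approximated as 365 days, the flags `historically_high` and `critical_location` stand in for judgements the text leaves qualitative, and the order in which categories are checked is arbitrary.

```python
from datetime import date, timedelta


def max_breaks_in_12_months(break_dates):
    """Largest number of breaks within any span of 365 days (sliding window)."""
    dates = sorted(break_dates)
    best = 0
    start = 0
    for end in range(len(dates)):
        # Shrink the window until it spans at most 365 days.
        while dates[end] - dates[start] > timedelta(days=365):
            start += 1
        best = max(best, end - start + 1)
    return best


def classify_asset(break_dates, historically_high=False, critical_location=False):
    """Assign one of the three traditional categories (hypothetical rule order)."""
    peak = max_breaks_in_12_months(break_dates)
    if peak >= 4:
        return "frequently bursting pipe"
    if peak == 3 and historically_high:
        return "brittle pipe"
    if critical_location:
        return "critical asset"
    return "uncategorised"
```

For example, a pipe with breaks on 1 January, 1 March, 1 June and 1 December of the same year would fall into the first category under this rule.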

This categorisation did not prove to be sufficiently efficient and reliable

to be used to develop an effective asset management policy in CWW. In-

deed, this categorisation was insufficient for determining the yearly capital

expenditure assignment and renewal programming because, according to

this simplistic technique, any pipe with four or more failures should be re-

placed even if these failures did not cause an interruption to water supply

(e.g. circumferential failures can be repaired by applying a clamp). There-

fore, by employing this method, the link between the actual customer needs

and the projected likelihood of repeat interruptions is weak. In addition,

there is no financial evaluation of the cost to CWW versus the financial

benefits of this program. To address this deficiency, this study aims at de-

veloping a predictive approach toward assessment of classes of pipes in the

future. Having reliable estimates for the probability of failure occurrences

of the pipes in the future, the above mentioned definitions can be used as

criteria for making decisions about maintenance/replacement strategies for

those classes of water mains.

The data used in this study is a history of 6381 failures recorded during

1997–2000 in a Microsoft Access file. Each record describes the occurrence

of a pipe breakage event and comprises the pipe identification number, pipe

type (material), pipe size (diameter), pipe length, date of construction of

the pipe, failure type, and the date of failure. Environmental factors such

as soil type, climate, pressure zone and cause of failures are not available in

this database.
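The fields listed above suggest a simple record structure. The sketch below is only one possible representation: the class name, field names, the length unit (metres) and the derived age property are assumptions, not the layout of the original Microsoft Access file.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FailureRecord:
    """One pipe breakage event (hypothetical field names)."""
    pipe_id: int
    material: str             # pipe type, e.g. "CI" or "CICL"
    diameter_mm: int          # pipe size
    length_m: float           # pipe length; the unit is an assumption
    construction_date: date
    failure_type: str
    failure_date: date

    @property
    def age_at_failure_years(self) -> float:
        """Approximate pipe age at the time of failure."""
        return (self.failure_date - self.construction_date).days / 365.25


# Illustrative values only, not taken from the CWW database.
record = FailureRecord(
    pipe_id=1,
    material="CICL",
    diameter_mm=100,
    length_m=85.0,
    construction_date=date(1950, 1, 1),
    failure_type="circumferential",
    failure_date=date(1998, 1, 1),
)
```

Note that no environmental fields appear, mirroring the absence of soil type, climate, pressure zone and cause of failure in the database.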

Significant data auditing was undertaken by CWW to assess the accuracy

of the records, and false records were eliminated from the data so that the

dataset could be used with high confidence in its validity and accuracy.

The data used in this study was not prepared on the basis of a regular

inspection along the network. Rather, it consisted of digital records of

breakage occurrences over four years which is not a long history compared to

the age of this network. However, the data contained a sufficiently large number

of records for each group of pipes (with the same type and diameter) for

the analysis to be significant. This is mainly due to the existence of a small

number of pipe groups and also the high frequency of failures.

The majority of failures in the CWW area are experienced by cast iron pipes,

which represent more than 50% of CWW assets (Righetti 2001). There were

two types of pipes in this database: cast iron (CI) and cast iron cement

lined (CICL) pipes. In the existing dataset, the acronym CICL referred

specifically to spun grey cast iron pipes with factory cement lining, and the

acronym CI referred specifically to pit cast, unlined pipe (that may have

been cement lined in-situ).

All pipes with diameters less than 300mm are referred to as reticulation

assets. Existing pipes of this dataset have diameters of 80mm, 100mm,

125mm, 150mm and 175mm. They were installed in the region between

1857 and 1985. The construction history of cast iron pipes in all areas

under City West Water license, categorised by year of manufacture and

construction technique, is presented in Table 3.1. According to this table,

taken from Righetti (2001), poor construction techniques were used for

installing the pipes of this study, and this has contributed to the high rate

of failures observed in the cast iron pipes of the area.

Based on the material and diameter, the pipes of the existing dataset

could be divided into six classes as listed in Table 3.2. Other classes (possible

combinations of pipe types and diameters) had too small a population (fewer

than 20 records) to be analysed by statistical modelling techniques and were

eliminated from this study.

Table 3.1: Construction history of cast iron mains in City West Water (Righetti, 2001)

Construction date    Length of installed pipes (km)    Quality of construction technique
prior to 1920        218                               poor
1920 to 1928         143                               poor
1929 to 1967         617                               poor
1968 to 1985         523                               improved with granular backfill

Table 3.2: Statistics of the six classes of pipes in the dataset, selected for reliability analysis and failure prediction in this study.

Class No.    Pipe Material    Diameter (mm)    No. of Breakages
1            CI               80               321
2            CI               100              922
3            CI               125              60
4            CI               150              105
5            CICL             100              3886
6            CICL             150              793
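The class selection described in this section (grouping failure records by material and diameter, then dropping classes with fewer than 20 records) can be sketched as follows. The function name `select_classes` and the toy input are hypothetical; only the threshold of 20 records comes from the text.

```python
from collections import Counter

MIN_RECORDS = 20  # classes with fewer records were eliminated from the study


def select_classes(records, min_records=MIN_RECORDS):
    """Count failure records per (material, diameter) class and keep large ones.

    `records` is an iterable of (material, diameter_mm) pairs, one per failure.
    """
    counts = Counter((material, diameter) for material, diameter in records)
    return {cls: n for cls, n in counts.items() if n >= min_records}


# Toy input: two populated classes and one sparse class (illustrative counts).
toy_records = [("CI", 80)] * 30 + [("CICL", 100)] * 25 + [("CI", 175)] * 5
kept = select_classes(toy_records)
```

In this toy input the class with only five records is dropped, while the other two are retained.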

3.3 Spatial Location of Pipes

The condition of water pipes is influenced by a number of factors. These

factors include the environmental conditions and structural characteristics

(e.g. pipe diameter, wall thickness, pipe material). External loads and

rainfall and soil characteristics that influence the failure rate of pipes are

generally similar for the pipes in a neighbourhood. In other words, spatial

clustering of water pipes of a network results in homogeneous classes of

pipes in terms of external deteriorating factors.

Environmental factors causing deterioration, especially soil characteris-

tics, and internal factors such as pressure fluctuations usually do not vary

considerably for the pipes across a postcode area. Thus, in order to eliminate

the variation of rainfall and soil characteristics of the water mains being

categorised into homogeneous groups, the pipes in this study were identified in

terms of their postcode.

3.3.1 Estimation of postcodes for given AMG coordinates

Available data did not include the location of all pipes. For most of the

pipes in the dataset, the Australian Map Grid (AMGX and AMGY) coordinates

were available. These coordinates were converted to longitude

and latitude values using a GIS tool. The spatial data transformation is

performed by a Perl script which has parameters passed to it via the forms

interface. The transformation algorithm makes use of Redfearn (1984)’s

rigorous formulae and The Australian Geodetic Datum Technical Manual.

Special Publication (1986). Depending on the precision of the entered coor-

dinates, this transformation computes results to the nearest 4mm (nominal)

on the ground.

In the next step, a package of data, provided by “Australia Post” was

used to estimate the corresponding postcode of each point using its geo-

graphical longitude and latitude. The package included a Microsoft Excel

database containing the coordinates and corresponding postcodes across

Victoria. A MATLAB program was written to read the pipe longitude

and latitude values from the failure history (calculated for the pipes with

given AMG coordinates), find the closest coordinates in the Australia Post

postcode database, and record the corresponding postcodes of all pipes.
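The nearest-coordinate lookup described above was implemented in MATLAB; the thesis does not give the code or the distance measure. The Python sketch below assumes a planar distance in degrees with longitude scaled by the cosine of latitude. The function name, the reference table layout and the dummy postcodes are all hypothetical.

```python
import math


def nearest_postcode(lon, lat, reference):
    """Return the postcode of the reference point closest to (lon, lat).

    `reference` is an iterable of (lon, lat, postcode) tuples, standing in
    for the Australia Post coordinate table.
    """
    def squared_distance(entry):
        ref_lon, ref_lat, _ = entry
        # Scale longitude differences so that degrees are roughly comparable.
        dx = (ref_lon - lon) * math.cos(math.radians(lat))
        dy = ref_lat - lat
        return dx * dx + dy * dy

    return min(reference, key=squared_distance)[2]


# Placeholder reference points with dummy postcodes, not Australia Post data.
reference_table = [
    (144.80, -37.70, "0001"),
    (144.90, -37.80, "0002"),
]
postcode = nearest_postcode(144.81, -37.71, reference_table)
```

A linear scan is adequate for a table covering one state; a spatial index would only matter for much larger reference sets.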

3.3.2 Estimation of postcode for pipes with no spatial

data

For the small number of the pipes in the failure history without any geo-

graphical information, a “linear interpolation” technique was used to esti-

mate their AMG coordinates. The process of calculating unknown values

from known values when a constant rate of change is assumed, is called

linear interpolation (Watson and Duff 1997). The main attribute of this

method is that it is easy to compute and stable. The method works by

effectively drawing a straight line between two neighbouring samples and returning

the appropriate point along that line. For example, let η be a number

between 0 and 1 which represents how far one wants to interpolate a signal

y between the times n and n + 1. Then the linearly interpolated value

y(n + η) can be defined as follows:

y(n + η) = (1 − η) y(n) + η y(n + 1)        (3.1)


Using this method to estimate the AMG coordinates of pipes between

known locations implies the assumption that pipe unique IDs were assigned

in spatial order. In Figures 3.2 and 3.3, the X and Y coordinates of the

pipes with known AMG values are plotted. It is observed that AMGX and

AMGY values vary with Unique ID’s almost linearly. Thus, the missing

AMGX and AMGY coordinates of pipes with failure records in the dataset

can be calculated as follows. First, all failure records are sorted according to

the unique ID numbers of the pipes. Consider a pipe with unique ID number

UID1 with missing AMG coordinates. If the unique ID’s of the two pipes

on two sides of this pipe in the sorted list are UID0 and UID2 in such

a way that UID0 ≤ UID1 ≤ UID2, then the missing AMG coordinates

of the pipe UID1 are given by Equation (3.1) in which y(n) and y(n + 1)

are the corresponding AMG coordinates of the two pipes UID0 and UID2, and η =

(UID1 − UID0)/(UID2 − UID0).

The failure records were first reordered by sorting the pipe unique IDs.

For any pipe with no AMG coordinates, those values were then estimated

by linear interpolation of the AMG coordinates of the closest right and left

neighbouring data records in the list. A MATLAB program was developed

to find the closest records and implement the linear interpolation for each

pipe with unknown AMG coordinates. Having the AMG coordinates, a

similar procedure was used to obtain the postcode for each of those pipes.
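The interpolation of Equation (3.1) with η = (UID1 − UID0)/(UID2 − UID0) can be sketched as below. This is a Python illustration rather than the MATLAB program used in the study. The function names are hypothetical, and leaving a coordinate missing when a pipe has no known neighbour on one side is an assumption, since the text does not cover that case.

```python
def interpolate_coordinate(uid0, y0, uid1, uid2, y2):
    """Equation (3.1): coordinate of pipe uid1 between pipes uid0 and uid2."""
    if uid2 == uid0:
        return y0  # identical neighbours: nothing to interpolate
    eta = (uid1 - uid0) / (uid2 - uid0)
    return (1 - eta) * y0 + eta * y2


def fill_missing(records):
    """Fill missing coordinates in a list of (uid, coordinate or None) pairs."""
    ordered = sorted(records, key=lambda r: r[0])
    known = [(uid, y) for uid, y in ordered if y is not None]
    filled = []
    for uid, y in ordered:
        if y is None:
            # Closest known neighbours on the left and right in the sorted list.
            left = max((k for k in known if k[0] <= uid),
                       key=lambda k: k[0], default=None)
            right = min((k for k in known if k[0] >= uid),
                        key=lambda k: k[0], default=None)
            if left is not None and right is not None:
                y = interpolate_coordinate(left[0], left[1], uid, right[0], right[1])
        filled.append((uid, y))
    return filled


# Illustrative values only.
amgx = fill_missing([(100, 300000.0), (150, None), (200, 301000.0)])
```

The same routine would be run once for AMGX and once for AMGY, after which the postcode lookup can be applied to the filled coordinates.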

3.3.3 Distribution of failures in different postcodes

Using the information on location of pipes in the failure history, a study

of the distribution of failures across different postcode areas is useful for

classifying the pipe breaks into almost homogeneous groups regarding the

environmental variables such as soil characteristics and rainfall.

Figures 3.4 and 3.5 show the rate of failure events in each postcode of

the CWW licence area. These figures show that the failure rates vary from

postcode to postcode. Some postcodes have experienced a markedly large

number of failures, while in some, the failure rate was quite low during that

period. This observation is compatible with the conclusions of the earlier

work by Goulter and Kazemi (1988) on the existence of a strong spatial

clustering effect in the occurrence of subsequent failures.

Figure 3.2: Plot of AMGX coordinates versus the pipes’ unique IDs as evidence

for the credibility of using linear interpolation to estimate the missing

AMGX coordinates. (Axes: unique ID number, horizontal; AMGX coordinates, vertical.)

It is important to note that because of the wide variation of pipe lengths,

the highest failure frequency (the number of breaks occurring per year,

regardless of pipe lengths) does not necessarily occur in the same area as

the largest failure rate per km. For example, according to the bar plots

in Figures 3.4 and 3.5, the pipes in the area with postcode 3061 had the

largest failure rates (per km). But in terms of the gross number of failures

that occurred, the postcode area 3021 was identified as the worst region, with 686

failures recorded just for CICL pipes in this region. In order to reduce the

small sample bias of the reliability estimates and failure prediction given by

the statistical techniques presented in the later chapters of this thesis, those

techniques are tested on the failure records in the area with postcode

3021 (which has the largest population of failure records in the database).
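The distinction drawn above between failure frequency (breaks per year) and failure rate (breaks per km) can be shown with a short sketch. The function name, the area labels and all numbers are invented for illustration; they are not CWW figures.

```python
def failure_statistics(breaks_by_area, length_km_by_area, years):
    """Compute breaks per year and breaks per km for each area."""
    stats = {}
    for area, n_breaks in breaks_by_area.items():
        stats[area] = {
            "breaks_per_year": n_breaks / years,
            "breaks_per_km": n_breaks / length_km_by_area[area],
        }
    return stats


# Made-up example: area "A" has many breaks on a long network,
# area "B" has few breaks on a very short one.
stats = failure_statistics(
    breaks_by_area={"A": 600, "B": 40},
    length_km_by_area={"A": 200.0, "B": 2.0},
    years=4,
)
worst_by_frequency = max(stats, key=lambda a: stats[a]["breaks_per_year"])
worst_by_rate = max(stats, key=lambda a: stats[a]["breaks_per_km"])
```

Here the two rankings disagree, which is the same effect noted for postcodes 3021 and 3061.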

3.4 Adding the Rainfall Information to the Data

Rainfall is also a determining factor in the failure process of pipes,
especially in expansive soil bedding. Since the geographic area under licence
of City West Water has expansive clay, pipes were regarded as being affected
by shrinkage and expansion of the surrounding soil. The severity of these
environmental stresses is influenced by the rainfall profile of the area.
Thus, in order to consider the effect of this environmental factor on the
mechanism of pipe failures, the rainfall history of the area under study
during the period of data collection was also needed.


Figure 3.3: Plot of AMGY coordinates versus the pipes' unique ID numbers, as evidence for the credibility of using linear interpolation to estimate the missing AMGY coordinates.

Figure 3.4: Failure rates in each of the postcodes 3000–3039 in the region under study (average number of breaks per km during 1997–2000). Axes: postcodes; failure rates (per km).

Monthly rainfall records for 1997–2000 were obtained from a climate
station in the region under study. City West Water's boundaries contain
the local government areas of Brimbank, Hobsons Bay, Maribyrnong, Melbourne
(north of the Yarra River), Moonee Valley, Wyndham, Yarra and
parts of Melton and Hume. The licence area of CWW is shown in Figure 3.6.
Keilor station was chosen for its location in the middle of this region. Keilor
(37°42′36″ S, 144°49′44.4″ E) is located in postcode 3036 of Victoria,
partly in the City of Brimbank and partly in the City of Hume.

Figure 3.5: Failure rates in each of the postcodes 3040–5277 in the region under study (average number of breaks per km during 1997–2000). Axes: postcodes; failure rates (per km).

Figure 3.6: Geographical map of the licence area of City West Water.

The rainfall dataset was a Microsoft Excel file containing monthly rainfall
in millimetres. To compare the rainfall profiles in similar seasons of
different years, a histogram of the rainfall in each season of the
years 1997–2000 is plotted in Figure 3.7. The summer of 1997 is
observed to have been considerably drier than subsequent summers, and the
spring of 2000 was wet compared to the other springs. Such
observations, along with the monthly variations of rainfall (as plotted in
Figure 3.8), are studied later in Chapter 5 in conjunction with the breakage


Figure 3.7: Histograms of quarterly records of rainfall of the region in 1997–2000. Horizontal axis: season (summer, autumn, winter, spring); vertical axis: rainfall (mm); legend: 1997, 1998, 1999, 2000.

Figure 3.8: Histograms of monthly records of rainfall of the region in 1997–2000. Horizontal axis: month (January to December); vertical axis: rainfall (mm); legend: 1997, 1998, 1999, 2000.

history of pipes to investigate the non-stationary nature of failure occur-

rences as a random process.

3.5 Failure Times

The available failure history comprises the dates of failures that occurred

during 1997-2000. Different approaches in failure analysis use different no-

tations. Some statistical analysis methods (for example the probabilistic

Figure 3.9: Failure times (T_i) and inter-failure times (IFT_i) of a class of pipes. The time axis starts at 0, with failures marked by crosses at T_1, T_2, ..., T_N and the intervals between them labelled IFT_1, IFT_2, ..., IFT_N.

technique developed in Chapter 6) use the inter-failure times rather than
failure times. Regardless of the terminology, the sequence of failure times
and the sequence of inter-failure times represent the same information about
the failure history.

A graphical description of the failure history, starting from time $t = 0$, is
shown in Figure 3.9. Each cross symbol corresponds to a failure time ($T_i$)
of a class of pipes, where $T_i$ is the actual time of the i-th failure occurrence.
Each inter-failure time is the time elapsed between two consecutive failures. The
inter-failure times are denoted by $\mathrm{IFT}_1, \mathrm{IFT}_2, \ldots$, where
$\mathrm{IFT}_i = T_i - T_{i-1}$ for $i = 1, 2, \ldots$, with $T_0 = 0$.
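As a minimal sketch of this definition, the conversion between the two notations can be written as below. The function names and the example failure times are hypothetical and only serve to show that both sequences carry the same information.

```python
def inter_failure_times(failure_times):
    """Return IFT_i = T_i - T_(i-1) for i = 1..N, taking T_0 = 0."""
    ifts = []
    previous = 0.0  # T_0 = 0
    for t in failure_times:
        if t < previous:
            raise ValueError("failure times must be in ascending order")
        ifts.append(t - previous)
        previous = t
    return ifts


def failure_times_from_ifts(ifts):
    """Recover the failure times as cumulative sums of the inter-failure times."""
    times = []
    elapsed = 0.0
    for gap in ifts:
        elapsed += gap
        times.append(elapsed)
    return times


# Made-up failure times (arbitrary time unit), for illustration only.
example_times = [3.0, 7.5, 8.0, 15.0]
example_ifts = inter_failure_times(example_times)
```

Applying `failure_times_from_ifts` to `example_ifts` returns the original failure times, which is the sense in which the two sequences are equivalent.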

For the purpose of statistical analysis of water networks, it is assumed

that the water network is repaired immediately after occurrence of a failure.

This implies that the repair times are negligible compared to the failure and

inter-failure times, which is a reasonable assumption for water networks.

3.6 Summary

The first section of this chapter describes the limitations of a typical
database that is usually available for water distribution networks. Generally,
the time window of the recorded failure history of old pipes does not cover
their total life span. This means that any analysis based on these databases is
associated with a level of uncertainty as a result of left-censored data. This
is in addition to the recording errors that are inevitable in such databases and
should be minimised using a data integrity check.

The database that is used in this thesis is described in detail in this
chapter. This database was provided by City West Water PTY LTD and
comprises a four-year failure history of their cast iron water pipes. The
definition of failure in this database and other terminology, as well as a
description of the contents of this failure history, are presented. The
failure history contains the material, size, construction date, length, and
failure dates of pipes. However, the spatial location of these pipes, an
important factor in their condition, was missing from the available
information. The attempts to recover this information and attribute a postcode
to each pipe are explained in detail. A study of the distribution of failures
across postcode areas is used for classifying the data into almost homogeneous
groups with regard to environmental factors such as soil characteristics and
rainfall. Rainfall is a determining factor in the breakage of pipes in expansive
soils, which cover most of the CWW area. Obtaining the rainfall information
and adding it to the data is discussed. The discussion of data specification
concludes with a description of the inter-failure times that are used in Chapter 5.
The statistical analysis model developed in Chapter 4, however, takes a
different approach and notation and uses the failure times instead.

Chapter 4

Intelligent Reliability Analysis of

Water Pipes Using Artificial

Neural Networks

4.1 Introduction

During their lifetimes, water mains, as the most essential and
maintenance-intensive components of water distribution systems, are constantly
exposed to a vast range of deleterious influences. Consequently, their design
factors of safety may significantly degrade with time, leading to structural
failures. The conditions causing failures in water mains were discussed earlier in
Chapter 2.

It was also noted previously that prediction of the performance of pipes

in the future is essential for developing proper strategies for maintenance

or replacement of water distribution systems. This is a challenging task as

water mains are buried in the ground and regular monitoring of the state

of each individual pipe is not feasible.

Studies by other researchers, conducted to predict the performance of

pipes, were reviewed in Chapter 2. Existing mathematical models were

explained and their strengths and limitations were discussed. The com-

parative literature review in Chapter 2 concluded that the best option for

water distribution authorities is to perform statistical analysis on the profile

of their water mains. This type of analysis helps managers to estimate
the performance of the pipes in the future.

This chapter introduces a method in which an ANN is used to develop

probabilistic models of the likelihood of failures in water mains. In this
study, the future performance of water mains is studied in the context of
reliability analysis of the water pipes with failures recorded in the
available dataset described in Chapter 3.

4.2 Reliability Analysis: Principles and Definitions

According to ISO 8402, the reliability of a system is defined as the ability of

the system to perform a required function, under given environmental and

operational conditions, for a stated period of time (Hoyland and Rausand

1994). Reliability of a component is commonly defined similarly, namely,

the likelihood of its proper operation for a given period of time in the future.

In a general sense, reliability analysis involves mathematical modelling of

the likelihood of failure events, to estimate the performance of a system or a

component at a certain time. Models are used in failure/reliability analysis
of a system to aid the comprehension of the future behaviour of the system or
its components. Reasonable estimation of component performance is essential
for preparing an optimum asset management strategy for the system. This
way, unreliable components are recognised and actions can be taken to
mitigate the adverse impact of component failures on the cost and/or quality
of system service.

4.2.1 Reliability of water distribution systems

In water distribution systems, reliability means the ability to deliver design

flows under a wide range of conditions (Goulter 1987). Estimating the

reliability with this definition is an overriding challenge in the management

of this sector.

Reliability of a water distribution system or its components can be studied
from several perspectives, which are quite different in their primary causes.
Shamir and Howard (1985) proposed a number of approaches for reliability
analysis in the water industry. However, those approaches were developed
for the quality of water.

For reliability assessment of water distribution systems, Kettler and

Goulter (1985b) analysed the probability of failure of major water sup-

ply paths, while Goulter and Coals (1986) studied the probability of node

isolation. Both approaches used linear programming through constraints

restricting the average number of breaks per year permitted in each link.

The relevant probabilities of interest were also calculated using the average

failure rates in each link. In an examination of the hydraulic reliability

of distribution systems, Cullinane (1986) presented concepts of mechanical

reliability and availability as quantitative measures of system reliability.

Mays et al. (1986) used a cut-set approach for modelling the reliability
of the network. The proposed procedure showed how a failure definition can
be directly included in an optimisation design model. In their summary,
Mays et al. (1986) mentioned that the study of the reliability of water
distribution systems is severely hampered by the lack of an accepted definition
and measures for reliability. Currently, there is no universally accepted
definition or measure for the reliability of water distribution systems in terms
of appropriate and valid criteria for reliability and quantification of these
reliability measures.

4.3 Objectives of the Proposed Reliability Analysis

This study is conducted particularly to estimate the state of reliability/failure

of the water mains using the dataset described in Chapter 3. The limita-

tions mentioned for this data are common for water distribution systems. In

the reliability study presented in this chapter, the first step is to introduce

proper reliability measures. In order to define reliability measures serving

the purpose of this study, it should be noted that delivering a constant and

satisfying service to the customers (under economic considerations) is the

major focus of water distribution systems. Failures of pipes, as major com-

ponents of water distribution systems, can diminish the ability of networks

to perform up to required specifications.

In order to quantify the quality of service, governmental and/or local

regulations may specify a maximum number of interruptions to the service

of water distribution systems. As a consequence, an ongoing problem, trou-

bling water supply managers, is to control the number of upcoming pipe

breakages. It is very difficult for water companies, if not impossible, to

absolutely guarantee that unplanned interruptions do not exceed the limit.

Reliability analysis, however, would give good indications of which pipes
are at risk of exceeding the limit. In this way, the probability of not
exceeding the acceptable number of unplanned failures can be estimated. In
other words, the risks would be quantified and appropriate actions, such as
replacing the assets that are in critical condition, could be taken.

To achieve the above-mentioned objective, the reliability measure targeted
for estimation in this study is the probability that a homogeneous class of
components works properly (with no failure) up to a certain time in the
future. Outcomes of this kind of analysis are useful in
developing reliable strategies for maintenance/replacement of water pipes.

After establishing the scope of the study by setting suitable targets, the
study documented in this chapter improves on the existing probabilistic
techniques for water distribution systems by introducing a new reliability
estimation technique based on ANNs. The probabilistic models developed using
this technique are evaluated, and their performance is compared with that of
some existing probabilistic techniques by applying them to the same dataset.

4.4 Structure of the Proposed Reliability Model

A diagram of the reliability estimation system for pipes of a water
distribution network is presented in Figure 4.1. As shown in this figure, the
estimation system accepts some structural characteristics of the pipes and a
time in the future at which the reliability of the pipes is to be assessed.
After processing the inputs, the estimation system returns the reliability of
the pipe at that time. The reliability of each pipe at a given time in the
future (the probability that the pipe will not fail up to that time) can be
extracted from the reliability model of the homogeneous
class that the pipe belongs to.

In this system, construction and failure dates, pipe material and diameter
are the characteristics of the pipes used as inputs by the estimator.
First, the pipes in the existing failure data should be grouped by their
similar characteristics into homogeneous classes of suitable populations.
In fact, the available data dictates the inputs to the estimator. In this study,
pipes are classified by material and diameter.
However, if more details are available about the physical characteristics of
each pipe, and/or system characteristics such as pressure zone, soil
characteristics, and the like, the estimation system can be tuned to consider
more inputs. If there is a sufficient number of failure records for each group,
this will result in more accurate models.

One of the inputs is the assessment date (expected date) until which the

Figure 4.1: Schematic diagram of a survival function estimator for a water pipe with given type (material), diameter and construction date. The estimator takes the pipe diameter, pipe type, construction date and assessment date as inputs and gives the reliability of the pipe to survive until the given assessment date in the future.

water pipes are studied for their possible failures. The output of the model
is the reliability of pipes in the given class, i.e., the likelihood that the
pipes in that class work properly until the given assessment date.

In this study, a new design is devised for the content of the black box

depicted in Figure 4.1. For this purpose, an artificial neural network is

trained and evaluated. A portion of failure data is used for training the

neural estimation system, and the remainder of it is used for evaluation

of the resulting estimator. Furthermore, in this chapter, two well-known
lifetime models, based on Weibull and lognormal lifetime distributions, are
examined using the same data. These models are used as alternative baseline
models for realising the black box of Figure 4.1. The two probabilistic
lifetime models serve as benchmarks for the proposed intelligent
lifetime model: its ability to predict pipe reliabilities is compared with
the predictions made by the Weibull and lognormal models.

4.5 Empirical Estimation of Survival Functions

The probabilistic technique developed in this study is a method for survival

analysis of water pipes. Survival analysis has been used to predict pipe

breakage behaviour by many researchers in the past two decades. Liter-

ature of this type of analysis was reviewed under probabilistic models in

Chapter 2.

The reliability of a pipe at a time in the future (also called the expected
date in this thesis) can be expressed by a survival function, denoted by the
symbol $S(t)$ and defined as below:

$$S(t) = \Pr(T_{\mathrm{FAIL}} \ge t) \qquad (4.1)$$

where $t$ is the pipe's age at the assessment date (the time interval between
the construction date and the assessment date) and $T_{\mathrm{FAIL}}$ is the time
interval between the construction date and the next failure date. The
proposition $T_{\mathrm{FAIL}} \ge t$ is equivalent to the event of "no failure
before the assessment date".

The data used in this study includes the failure history of water pipes

as described in Chapter 3. The first step in the reliability analysis of these

pipes is to convert the database to an operable format, stratified by material

and diameter of the pipes. The details of simulation studies and application

of the proposed models to the available data are presented in Section 4.8.

After classifying the pipes, reliability values of the pipes should be empir-

ically calculated. These empirical values are used for tuning the parameters

of the resulting models. Given the pipe failure times for pipes of a specific

class in the database, reliability of that class at an age t can be empirically

calculated as follows:

$$S_{\mathrm{EMP}}(t) = \frac{\text{number of failures in the class that occurred at ages over } t}{\text{total number of failures in the class in the history}} \qquad (4.2)$$

where $S_{\mathrm{EMP}}(t)$ is the reliability of that class at age $t$.

Let $t_1, t_2, \ldots, t_n$ be the set of pipe ages at the failure times recorded
in the history of a particular class of pipes. The set is first sorted to
$t_{(1)}, t_{(2)}, \ldots, t_{(n)}$ in ascending order. Then, the general expression of
Equation (4.2) can be refined to Equation (4.3), which gives empirical measures
of the survival function at the pipe ages corresponding to the failure events
and at their right neighbourhood points:

$$S_{\mathrm{EMP}}(t_{(i)}) = 1 - \frac{i-1}{n}; \qquad S_{\mathrm{EMP}}(t_{(i)} + \varepsilon) = 1 - \frac{i}{n} \qquad (4.3)$$

where $\varepsilon$ is an infinitesimal quantity, and $t_{(i)} + \varepsilon$ denotes the right
neighbourhood of $t_{(i)}$, as shown in Figure 4.2.

Equation (4.3) introduces a non-increasing staircase function whose steps
occur at the observed failure ages. This method therefore
returns discrete estimates. However, both the benchmark probabilistic models
and the intelligent ANN model return continuous survival functions.
Therefore, the staircase curve is approximated by a continuous curve that
passes through the mid-points of the discontinuities, as shown in Figure 4.2.

Figure 4.2: The step-wise empirical survival function (solid) and its continuous approximation (dashed). Horizontal axis: t, with steps at t(1), t(2), t(3), t(4); vertical axis: S(t), with levels 1, 1−1/n, 1−2/n, 1−3/n, 1−4/n.

The reference values of the empirical survival function at each failure
age $t_{(i)}$ are given by Equation (4.4):

$$S_{\mathrm{EMP}}(t_{(i)}) = \frac{\left(1 - \frac{i-1}{n}\right) + \left(1 - \frac{i}{n}\right)}{2} = 1 - \frac{i - \frac{1}{2}}{n}. \qquad (4.4)$$
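The calculation in Equations (4.3) and (4.4) can be sketched in a few lines. This is an illustrative sketch only: the function name and the example ages are hypothetical, and tied failure ages are simply treated as consecutive order statistics, which is an assumption not discussed in the text.

```python
def empirical_survival(failure_ages):
    """Return one row per sorted failure age t_(i):
    (age, S_EMP at t_(i), S_EMP just after t_(i), mid-point reference value)."""
    ages = sorted(failure_ages)
    n = len(ages)
    rows = []
    for i, age in enumerate(ages, start=1):
        at_step = 1 - (i - 1) / n     # S_EMP(t_(i)),       Equation (4.3)
        after_step = 1 - i / n        # S_EMP(t_(i) + eps), Equation (4.3)
        mid_point = 1 - (i - 0.5) / n  # reference value,    Equation (4.4)
        rows.append((age, at_step, after_step, mid_point))
    return rows


# Made-up pipe ages at failure (years), for illustration only.
rows = empirical_survival([40, 10, 30, 20])
```

The fourth element of each row is the mid-point of the discontinuity, which is the value the continuous approximation passes through.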

Up to this point, Equation (4.4) has been presented
for calculating empirical measures of the reliability of pipes at a given time.
The concern of the study at this stage is to develop separate models for
each class of pipes that enable managers to estimate the reliability of
pipes at a given time in the future. The estimation system depicted in
Figure 4.1 comprises a number of models, each serving a particular class
of pipes.

Since the Weibull and lognormal lifetime distributions are found to be

the most commonly used models in literature of reliability analysis of differ-

ent systems, these models are examined for performance comparison with

the proposed intelligent model. Therefore, before the proposed intelligent

reliability modelling method is described, the mathematical formulation of

the Weibull and lognormal models are briefly reviewed in the next section.



4.6 Weibull and Lognormal Lifetime Models

4.6.1 Weibull lifetime distribution

The Weibull distribution is one of the most widely used distributions for
life-data analysis. It is also a well-known statistical model in reliability
engineering and failure analysis. Due to its flexibility, this model can mimic
the behaviour of other statistical distributions such as the normal and the
exponential distributions.

In his paper, "A Statistical Distribution Function of Wide Applicability",
Weibull (1951), who was studying metallurgical failures, pointed out that
normal distributions are not applicable for characterising initial
metallurgical strengths.

He then introduced the Weibull distribution and reported its successful ap-

plication for seven case studies. This continuous probability distribution

has been used repeatedly to provide a reasonable life-time model for many

types of components (Crowder et al. 1994). It has been mainly used to

model failures caused by fatigue, corrosion, mechanical abrasion, diffusion,

and other degradation processes.

A number of researchers (e.g. Eisenbeis (1999), Eisenbeis (1997), Lei

and Sgrov (1998) and Le Gat (1999)) used Weibull model in failure analysis

of water pipes.

In this study, a two-parameter Weibull model is used for survival function
analysis. The two-parameter Weibull cumulative distribution function (CDF)
is given by:

$$F(t) = 1 - e^{-(t/\eta)^{\beta}} \qquad (4.5)$$

where $t \ge 0$ is a given component age (corresponding to an assessment date),
$\beta$ is the shape parameter, $\eta$ is the scale parameter and $F(t)$ is the CDF,
i.e. the probability of occurrence of a failure before the assessment date,
which is the complement of the survival function: $F(t) = 1 - S(t)$.

The Weibull plot is a graphical tool for determining whether a dataset comes
from a population that can be fitted by a two-parameter Weibull distribution.
Usually, the Weibull model is fitted to a dataset by linear regression
of $Y = \log(-\log(1 - F(t)))$, which is empirically given by
$Y = \log(-\log(1 - (i - 0.5)/n))$, on $X = \log(t_{(i)})$ for $i = 1, 2, \ldots, n$.
The correlation coefficient of $X$ and $Y$, $\rho_{X,Y}$, is a fitness indicator.
When $\rho_{X,Y}$ reaches a minimum threshold, the Weibull distribution is
considered acceptable for the data and the model parameters are calculated by linear

regression assuming the following linear relationship (derived by taking
logarithms twice on both sides of Equation (4.5)):

$$Y = \beta\,(X - \log \eta). \qquad (4.6)$$
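The regression procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, ordinary least squares of Y on X is assumed as the regression method, and no particular threshold for the correlation coefficient is implied.

```python
import math


def least_squares(xs, ys):
    """Ordinary least squares of ys on xs: (slope, intercept, correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rho = sxy / math.sqrt(sxx * syy)
    return slope, intercept, rho


def fit_weibull(failure_ages):
    """Fit beta and eta via the Weibull plot; also return rho_{X,Y}."""
    ages = sorted(failure_ages)
    n = len(ages)
    xs = [math.log(t) for t in ages]                         # X = log(t_(i))
    ys = [math.log(-math.log(1 - (i - 0.5) / n))             # empirical Y
          for i in range(1, n + 1)]
    slope, intercept, rho = least_squares(xs, ys)
    beta = slope                       # Equation (4.6): Y = beta*X - beta*log(eta)
    eta = math.exp(-intercept / beta)
    return beta, eta, rho


def weibull_survival(t, beta, eta):
    """S(t) = 1 - F(t) for the two-parameter Weibull model of Equation (4.5)."""
    return math.exp(-((t / eta) ** beta))
```

A typical use would be to call `fit_weibull` on the failure ages of one pipe class, inspect the returned correlation coefficient, and then evaluate `weibull_survival` at the age corresponding to an assessment date.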

4.6.2 Lognormal lifetime distribution

The lognormal distribution is another flexible model that can empirically

fit to many types of failure data. Lognormal distributions are encoun-

tered frequently in metal fatigue testing, maintenance data (time to re-

pair), chemical-process equipment failures and repairs, crack propagation,

and loading variables in probabilistic design.

A lognormal distribution arises when the time to failure or repair results
from cumulative contributing factors. This property can be observed

in several deterioration processes associated with fatigue and creep mech-

anisms. Deterioration in such cases is generally progressive. For example,

a crack grows rapidly under high stress because the stress increases pro-

gressively as the crack grows. Indeed, in many situations, failure or repair

times depend on several factors that are random in nature. In such cases,

the multiplication effect of these factors leads to a lognormal failure or repair

distribution. Therefore, the lognormal model can be theoretically derived

under assumptions matching many failure degradation processes (Bishop

and Bloomfield 2003, Goldthwaitel 1976).

A theoretical justification for using lognormal distributions comes from

the Central Limit Theorem when the logarithm of lifetime is considered

the sum of a large number of small independent effects (Crowder et al.

1994). Applying the Central Limit Theorem to small additive errors in the

log domain and justifying a normal model is equivalent to justifying the

lognormal model in real time when a process moves towards failure based

on the cumulative effect of many small “multiplicative” shocks.

The CDF of the lognormal distribution is given as follows:

$$F(t) = \Phi\!\left(\frac{\log(t) - \mu}{\sigma}\right) \qquad (4.7)$$

where $t > 0$, $\mu$ (scale parameter) and $\sigma$ (shape parameter) are the mean and
standard deviation of the log-lifetimes, and $\Phi$ is the CDF of the standard
normal distribution $N(0, 1)$. The survival function $S(t)$ is again given by
$1 - F(t)$, and by taking the inverse CDF ($\Phi^{-1}$) of both sides of Equation (4.7),
the following equation is derived:

$$\Phi^{-1}(1 - S(t)) = \frac{\log(t) - \mu}{\sigma} \qquad (4.8)$$



To fit a lognormal model to a dataset, linear regression is applied to
$Y = \Phi^{-1}((i - 0.5)/n)$ values versus $X = \log(t_{(i)})$ values. The degree
of linearity of the X–Y plot can be considered an indicator of goodness of fit.
If the correlation coefficient of $X$ and $Y$ is sufficiently large, the
assumption of a lognormal distribution for the data is justified, and the
estimates of the parameters $\mu$ and $\sigma$ are computed either by applying linear
regression with the following linear relationship between the X and Y values assumed:

$$Y = \frac{X - \mu}{\sigma} \qquad (4.9)$$

or by computing the average and standard deviation of the log-lifetimes for
$\mu$ and $\sigma$, respectively.
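Both estimation routes mentioned above can be sketched together. The sketch assumes ordinary least squares of Y on X and uses Python's standard-library `statistics.NormalDist` for the standard normal CDF and its inverse; the function names and dictionary keys are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

STANDARD_NORMAL = NormalDist()  # N(0, 1)


def fit_lognormal(failure_ages):
    """Estimate mu and sigma by regression (Equation 4.9) and directly
    from the mean and standard deviation of the log-lifetimes."""
    ages = sorted(failure_ages)
    n = len(ages)
    xs = [math.log(t) for t in ages]                                   # X
    ys = [STANDARD_NORMAL.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]  # Y
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {
        "sigma": 1.0 / slope,          # Y = X/sigma - mu/sigma
        "mu": -intercept / slope,
        "rho": sxy / math.sqrt(sxx * syy),
        "mu_direct": mean_x,           # average of the log-lifetimes
        "sigma_direct": stdev(xs),     # sample standard deviation
    }


def lognormal_survival(t, mu, sigma):
    """S(t) = 1 - Phi((log(t) - mu) / sigma), from Equations (4.7) and (4.8)."""
    return 1.0 - STANDARD_NORMAL.cdf((math.log(t) - mu) / sigma)
```

The two routes generally give slightly different estimates on real data; the correlation coefficient returned under `"rho"` is the linearity indicator described in the text.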

4.7 Intelligent Reliability Prediction by Artificial Neural Networks

Artificial neural networks (ANNs) have been widely applied to solve different
problems such as modelling, system identification, control, feature
extraction, computer vision, software reliability analysis, metal and fracture
analysis and the like. In recent years in particular, neural network
(NN) analysis has been used in reliability analysis of different mechanical
and electronic systems (e.g. Lee et al. (1999), Bevilacqua et al. (2003),
Carvalho et al. (1999), Moon et al. (1998)). An introduction to the history and
basic types of ANNs is presented in Appendix A for interested readers.

ANNs have also been applied to predict water pipe failures
(Sacluti et al. 1999, Sacluti 1999). However, the neural network models
introduced in those works are deterministic models which directly produce
the future failure rates (or numbers of failures). In contrast, the
reliability estimation framework introduced in this chapter
uses an ANN to learn the pattern of survival function values for the water
pipes. More precisely, a probabilistic model is developed here which provides
survival probabilities for given future dates, i.e. survival function values as
defined in Equation (4.1).

This study proposes using an ANN to estimate the survival functions of
water mains. Artificial neural networks, known as
universal approximators (Hertz et al. 1991), are capable of generating models
fitted to the empirical reliability values more accurately than existing
probabilistic models. In this section, a feed-forward perceptron with one

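Figure 4.3: Architecture of the proposed method of reliability analysis by a feed-forward perceptron. Inputs: pipe diameter, pipe type, and the difference between the assessment date and the construction date, passed through a normalisation stage (gains shown in the diagram: 1/5, 1, 1) into the survival function model; the model output is scaled by 1/0.9 to give the reliability of the given type of pipes to survive until the assessment date.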

hidden layer is proposed as an intelligent replacement for current models in

survival function modelling.

Multi-layer feed-forward perceptrons that include linear output units
and a single layer of nonlinear hidden units have been theoretically
proven to be able to represent most reasonable functions as closely as desired
if they are trained using the back-propagation learning algorithm (Leshno et
al. 1993). Hence, in this study, a feed-forward perceptron with one hidden
layer is trained. The architecture of the proposed neural reliability estimator
is illustrated in Figure 4.3.

It is well-known in ANN literature that a multi-layer perceptron will be

trained more rapidly and accurately if its inputs vary in bounded ranges

such as [0, 1] (Tarassenko 1998). Thus, all inputs to the neural network

(including the time lengths) are normalised to the range of [0, 1].

One of the inputs to the network is the time elapsed from the construction
date of the pipe until the assessment date in the future (the pipe age at the
assessment date). Pipe types and diameters were the other inputs to the
network, and were encoded and normalised. The database of this study
contained the failure history of pipes of two different types (materials) and
five different sizes (diameters). Pipe types were encoded as 0 or 1, and
diameters as 1, 2, . . . , 5.
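A possible encoding of the three network inputs is sketched below. The type codes (CI as 0, CICL as 1) follow Section 4.8 and the division of the diameter code by 5 follows the 1/5 gain shown in Figure 4.3. How the pipe age is scaled to [0, 1] is not specified in the text, so the `max_age_years` parameter and its default value are purely an assumption made for illustration, as are the function and constant names.

```python
PIPE_TYPE_CODES = {"CI": 0, "CICL": 1}  # codes as stated in Section 4.8
NUM_DIAMETER_CLASSES = 5                # diameters are encoded as 1, 2, ..., 5


def encode_inputs(pipe_type, diameter_code, construction_year,
                  assessment_year, max_age_years=150.0):
    """Return [type, diameter, age], each normalised to the range [0, 1].

    max_age_years is a hypothetical normalisation constant (an assumption,
    not a value taken from the thesis).
    """
    if not 1 <= diameter_code <= NUM_DIAMETER_CLASSES:
        raise ValueError("diameter code must be between 1 and 5")
    age = assessment_year - construction_year  # pipe age at the assessment date
    if not 0 <= age <= max_age_years:
        raise ValueError("pipe age is outside the normalisation range")
    return [
        float(PIPE_TYPE_CODES[pipe_type]),
        diameter_code / NUM_DIAMETER_CLASSES,
        age / max_age_years,
    ]


# Made-up pipe: CICL, third diameter class, built in 1950, assessed in 2000.
example_inputs = encode_inputs("CICL", 3, 1950, 2000, max_age_years=100.0)
```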

The artificial neuron model used in this study is depicted in Figure 4.4.
The neuron computes the weighted sum of the input signals, applies the
weighted sum as the input to its activation function, and returns the output of
the function. In the proposed design for the intelligent reliability estimation
technique shown in Figure 4.3, the activation function of all neurons is the



Figure 4.4: Diagram of an artificial neuron model in a multi-layer feed-forward perceptron network. Each input X_i from a neuron in the previous layer arrives through a synapse modelled by its weight W_i; the products W_1 X_1, W_2 X_2, ..., W_n X_n are summed together with a constant bias input (usually equal to 1) and passed through the activation function, whose output goes to neurons in the next layer.

Figure 4.5: Nonlinear profile of the Sigmoid function, the activation function of all neurons in the proposed ANN-based reliability model. Horizontal axis: x from −10 to 10; vertical axis: f(x) from 0 to 1.

sigmoid function given below:

$$f(x) = \frac{1}{1 + e^{-x}}. \qquad (4.10)$$

Figure 4.5 shows the nonlinear profile of the sigmoid function.

Since the output neuron of the network has a Sigmoid activation func-

tion, the single output of the neural network is bounded within [0, 1] which

is appropriate to model a probability value (survival function).

The extreme outputs (0 and 1) could occur only if the Sigmoid function
saturates. This is considered undesirable as it reduces the learning capability
of the network. To prevent this situation, the proposed neural network is
set to learn 0.9 times the empirical reliability values. Accordingly, the
output is divided by 0.9, as shown in Figure 4.3.
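The forward computation described in this section (weighted sum, sigmoid activation, one hidden layer, and the 0.9 output scaling) can be sketched as follows. This is not the trained model of the thesis: the weights are random and untrained, and the function names, the weight range and the seed are assumptions chosen only to make the sketch runnable.

```python
import math
import random


def sigmoid(x):
    """Activation function of Equation (4.10)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias_weight):
    """Weighted sum of the inputs plus a constant bias input of 1, then sigmoid."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias_weight * 1.0
    return sigmoid(weighted_sum)


def make_network(n_inputs=3, n_hidden=15, seed=0):
    """Untrained network with small random weights (illustrative only)."""
    rng = random.Random(seed)
    hidden = [([rng.uniform(-0.5, 0.5) for _ in range(n_inputs)],
               rng.uniform(-0.5, 0.5)) for _ in range(n_hidden)]
    output = ([rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
              rng.uniform(-0.5, 0.5))
    return hidden, output


def training_target(empirical_survival_value):
    """The network is trained on 0.9 times the empirical reliability value."""
    return 0.9 * empirical_survival_value


def predict_survival(inputs, network):
    """Forward pass; the raw sigmoid output is divided by 0.9 (Figure 4.3)."""
    hidden, (out_weights, out_bias) = network
    hidden_outputs = [neuron_output(inputs, w, b) for w, b in hidden]
    raw = neuron_output(hidden_outputs, out_weights, out_bias)
    return raw / 0.9
```

Because the raw output lies strictly between 0 and 1, the rescaled value can slightly exceed 1 when the raw output is above 0.9; whether and how such values are clipped is not stated in the text.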

The number of neurons in the hidden layer of the neural network
(denoted by $n_h$) is a determining factor in its learning and generalisation
capabilities. With a very small $n_h$, the network is not able to learn the
empirical survival function accurately. On the other hand, if $n_h$ is very large,
the generalisation power of the network will diminish because it fits too
closely to all random and non-smooth variations of the points in the reference
curve (Kartalopoulos 1996). This trade-off should be balanced by trial and
error. In this study, $n_h = 15$ neurons resulted in neural networks with
a convincing ability to accurately learn the failure profile of all pipe classes
with records in the existing failure history.

The network learns by making small adjustments to the weights of each neuron so as to reduce the difference between the actual and desired outputs of the network. The initial weights are small random numbers, which are updated in each iteration to make the output consistent with the training examples. If, at iteration p, the actual output is Y(p) and the desired output is Yd(p), then the error is given by:

$$e(p) = Y_d(p) - Y(p). \tag{4.11}$$

Iteration p here refers to the p-th training example presented to the neural network. If the error, e(p), is positive, the network output Y(p) should be increased, but if it is negative, it should be reduced.
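To make the role of the error in Equation (4.11) concrete, the sketch below applies one gradient-style update to a single sigmoid neuron. It is an assumption-laden illustration, not the algorithm of Appendix B: the delta-rule update, the learning rate, and all function names are hypothetical choices made for this example.

```python
import math


def sigmoid(x: float) -> float:
    """Sigmoid activation of Equation (4.10)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(weights, bias, inputs):
    """Weighted sum of the inputs plus bias, passed through the sigmoid."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))


def update_step(weights, bias, inputs, desired, learning_rate=0.5):
    """One delta-rule update (assumed here) for a single sigmoid neuron."""
    y = neuron_output(weights, bias, inputs)
    error = desired - y                # Equation (4.11): e(p) = Yd(p) - Y(p)
    gradient = error * y * (1.0 - y)   # error times the sigmoid derivative
    new_weights = [w + learning_rate * gradient * x
                   for w, x in zip(weights, inputs)]
    new_bias = bias + learning_rate * gradient
    return new_weights, new_bias, error


if __name__ == "__main__":
    w, b = [0.1, -0.2], 0.0
    x, target = [1.0, 0.5], 0.9
    before = neuron_output(w, b, x)
    w, b, e = update_step(w, b, x, target)
    after = neuron_output(w, b, x)
    print(before, e, after)  # a positive error moves the output upwards
```

A positive error raises the output on the same example and a negative error lowers it, which is the behaviour described in the paragraph above.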

Complete details of the back-propagation learning algorithm, along with step-by-step instructions for implementing the proposed intelligent model with a given failure database, are given in Appendix B.

4.8 Simulation Results

This section presents comparative results of modelling and prediction of fail-

ure times using Weibull and lognormal models and the proposed ANN-based

reliability estimation method. The models have been applied to reconstruct

the failure profiles of the pipes with breaks recorded in the dataset explained

in Chapter 3.



The dataset is a pipe break database containing the history of pipe failures that occurred during 1997–2000. There are two types of pipes in the dataset:

CI (Cast Iron) pipes which are encoded by “0”; and

CICL (Cast Iron Cement Lined) pipes which are encoded by “1”

in the inputs of the ANN shown in Figure 4.3.

Based on the pipe types and diameters, the pipes in the existing data are divided into six classes as listed in Table 3.2. Other possible combinations of pipe types and diameters had fewer than 20 failure events in the history. Such a small value of n in Equation (4.4) would result in low-precision survival probabilities. For this reason, such classes are ignored in this simulation.

The modelling technique developed in this study returns a specific model for reliability estimation of each particular class of pipes. For comparison, both Weibull and lognormal models are also employed to produce a reliability model for each class of pipes. Although both distribution models fit the data in this study, the simulation results show that the ANN-based estimator achieved markedly higher accuracy in survival function modelling than those two models.

Two sets of data are required for developing the desired network: a training set and a validation set. Each failure record in a dataset comprises the available inputs and the desired output to be generated by the fully trained neural network. The training data are used to tune the synaptic weights of the network and optimise the system variables in accordance with the data that are fed to it. Validation data are used to evaluate the generalisation performance of the trained neural network model and to examine whether it is capable of generating the expected outputs for data records that it has not encountered during the training process.

In this study, each failure record includes the following components: construction date, pipe type, pipe diameter, failure date (which is given to the neural network as the assessment date during training and validation), and the empirical survival probability of the pipe at the failure date.

For each of the six pipe classes, the available failure records are selected alternately: half serve as training patterns and the other half as validation data. This way of selecting the training and validation data ensures that, during training, the neural network learns the whole spectrum of the pipe ages recorded in failure events and, during validation,

Figure 4.6: Empirical and modelled survival function plots (vertical axis: Survival Measures) versus the pipe age in years (Assessment Date - Construction Date) for pipe class 1 (CI type with 80 mm diameter).

the ANN is examined on the complete range of pipe ages recorded in the dataset.
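The alternate selection described above can be sketched as an interleaved split. The text does not say how the records are ordered before alternating; sorting by pipe age is an assumption made here so that both halves span the full age range, and the record layout and function name are hypothetical.

```python
def alternate_split(records, key):
    """Order the records by `key`, then send them alternately to training
    and validation. Sorting first is an assumption, not stated in the text."""
    ordered = sorted(records, key=key)
    training = ordered[0::2]    # 1st, 3rd, 5th, ... record
    validation = ordered[1::2]  # 2nd, 4th, 6th, ... record
    return training, validation


if __name__ == "__main__":
    # Hypothetical records: (pipe age at failure in years, empirical survival)
    records = [(5, 0.80), (1, 0.95), (9, 0.60), (3, 0.90), (7, 0.70), (11, 0.50)]
    train, valid = alternate_split(records, key=lambda r: r[0])
    print([age for age, _ in train])
    print([age for age, _ in valid])
```

Because every second record goes to each set, the youngest and oldest recorded ages are represented in both training and validation, up to one record at each end.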

Figure 4.6 shows the empirical and estimated survival measures plotted

versus the pipe age (assessment date – construction date) for the first class

of pipes (80 mm CI). It is observed that the lognormal and Weibull models,

because of their specific smooth rate of decreasing, cannot closely follow

the empirical survival curve, while the ANN model learns the non-smooth,

stepped behaviour of the empirical plot. It should be noted that assessment

dates do not refer to the first ever failures of the pipes, but to the next

failure.

The superior performance of the proposed neural technique compared to the classic methods is also observed in similar plots for the other five classes of pipes. These plots indicate that, for irregular or highly nonlinear behaviour, neural networks perform better than Weibull and lognormal models. Figures 4.7 and 4.8 show the corresponding plots for classes 2 (100 mm CI) and 5 (100 mm CICL).

In order to compare quantitatively the performance of the proposed neural network method with that of the Weibull and lognormal models, the Mean Square Error (MSE) of survival function estimates is defined and



Figure 4.7: Empirical and modelled survival function plots (vertical axis: Survival Measures) versus pipe age in years (Assessment Date - Construction Date) for pipe class 2 (CI type with 100 mm diameter).

computed for each method and for each class of pipes. The MSE for class i is defined as:

$$\mathrm{MSE}_i = \left[\frac{1}{n_i}\sum_{k=1}^{n_i}\left(S_{\mathrm{EMP}}(t_k) - S(t_k)\right)^2\right]^{\frac{1}{2}} \tag{4.12}$$

where $S_{\mathrm{EMP}}(t_k)$ is the empirical survival probability at $t_k$, $S(t_k)$ is the reliability estimate obtained from the model, $n_i$ is the total number of failure records in the history of pipes of class i, and $t_k$, $k = 1, \ldots, n_i$, denotes the set of failure times in the history.
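Equation (4.12) can be evaluated with a few lines of code. The sketch below follows the equation as written, including the outer power of 1/2, so the value returned is the square root of the mean squared difference. The function name and the sample numbers are hypothetical.

```python
import math


def survival_error(empirical, modelled):
    """Error measure of Equation (4.12) for one pipe class.

    `empirical` holds S_EMP(t_k) and `modelled` holds S(t_k) at the same
    failure times t_k, k = 1, ..., n_i.
    """
    if not empirical or len(empirical) != len(modelled):
        raise ValueError("need two non-empty sequences of equal length")
    n = len(empirical)
    mean_sq = sum((se - s) ** 2 for se, s in zip(empirical, modelled)) / n
    return math.sqrt(mean_sq)  # outer exponent 1/2 in Equation (4.12)


if __name__ == "__main__":
    # Hypothetical survival values at three failure times
    print(survival_error([1.0, 0.8, 0.6], [0.9, 0.8, 0.7]))
```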

The MSE values calculated for each class are presented in Table 4.1 for the Weibull and lognormal models and the proposed neural method. The simulation results indicate that, compared to the Weibull and lognormal methods respectively, the estimation error of the proposed neural scheme is reduced by 79% and 83% for class 1 pipes, 82% and 77% for class 2 pipes, 50% and 60% for class 3 pipes, 33% and 46% for class 4 pipes, 88% and 71% for class 5 pipes, and 60% and 47% for class 6 pipes. There does not appear to be a clear advantage in using Weibull over lognormal as, on average, both had the same level of accuracy.

Figure 4.8: Empirical and modelled survival function plots (vertical axis: Survival Measures) versus pipe age in years (Assessment Date - Construction Date) for pipe class 5 (CICL type with 100 mm diameter).

Table 4.1: Mean Square Error (MSE) of survival function estimation for the different pipe classes using the Weibull, lognormal, and neural network models.

| Class No. | Class (type, diameter in mm) | Weibull | Lognormal | Neural Network |
|---|---|---|---|---|
| 1 | CI 80 | 0.112 | 0.143 | 0.023 |
| 2 | CI 100 | 0.098 | 0.075 | 0.017 |
| 3 | CI 125 | 0.113 | 0.140 | 0.055 |
| 4 | CI 150 | 0.084 | 0.105 | 0.055 |
| 5 | CICL 100 | 0.066 | 0.027 | 0.007 |
| 6 | CICL 150 | 0.032 | 0.024 | 0.013 |
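The relative error reductions and the average errors can be recomputed from the entries of Table 4.1 as sketched below. The dictionary layout and function names are hypothetical. Because the table entries are rounded to three decimals, percentages computed this way may differ by a point or so from those quoted in the text.

```python
# Entries of Table 4.1, keyed by pipe class
TABLE_4_1 = {
    "CI 80":    {"weibull": 0.112, "lognormal": 0.143, "ann": 0.023},
    "CI 100":   {"weibull": 0.098, "lognormal": 0.075, "ann": 0.017},
    "CI 125":   {"weibull": 0.113, "lognormal": 0.140, "ann": 0.055},
    "CI 150":   {"weibull": 0.084, "lognormal": 0.105, "ann": 0.055},
    "CICL 100": {"weibull": 0.066, "lognormal": 0.027, "ann": 0.007},
    "CICL 150": {"weibull": 0.032, "lognormal": 0.024, "ann": 0.013},
}


def relative_reduction(baseline: float, proposed: float) -> float:
    """Fraction by which `proposed` is smaller than `baseline`."""
    return (baseline - proposed) / baseline


def mean_error(model: str) -> float:
    """Average error of one model over the six pipe classes."""
    return sum(row[model] for row in TABLE_4_1.values()) / len(TABLE_4_1)


if __name__ == "__main__":
    for name, row in TABLE_4_1.items():
        vs_weibull = relative_reduction(row["weibull"], row["ann"])
        vs_lognormal = relative_reduction(row["lognormal"], row["ann"])
        print(f"{name}: {vs_weibull:.0%} vs Weibull, {vs_lognormal:.0%} vs lognormal")
    for model in ("ann", "weibull", "lognormal"):
        print(model, round(mean_error(model), 3))
```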

4.9 Conclusions

The first part of this chapter provided a brief background on reliability analysis in the field of water distribution. The technique is general and the arguments behind this study are applicable to any pipe failure history. However, failure histories that are maintained over longer periods of time and contain more data points result in better reliability estimation models. The proposed technique was applied to the failure history explained earlier in Chapter 3.

Pipes of the network were classified into classes of similar material and diameter. The focus was on providing reliability estimation models for each class. To obtain estimates that can aid the managers of water distribution systems in deciding on maintenance/replacement strategies, the reliability of pipes was estimated for homogeneous groups of pipes. The reliability of a pipe is extracted from the reliability model for the corresponding class. In other words, the model obtained for the class to which the pipe belongs is the reference for the reliability estimation.

Artificial neural networks, as universal estimators, have been used to perform probabilistic modelling for this problem. Incomplete data and the non-linearity of the failure pattern limit the use of most of the mathematical models usually applied for modelling purposes in this literature. ANNs are widely used in reliability analysis of other systems, with a proven reputation for handling noisy and incomplete data with nonlinear dynamics. The other advantage of using ANNs is that they can be simulated using software programs, and the learning of converting coordinates can be easily implemented. A brief background on neural networks is presented in Appendix A.

In Section 4.7, a reliability model was developed using neural networks to learn the pattern of variations of the survival values calculated from the failure history. The estimator is trained to predict the future behaviour of water pipes in distribution networks. The proposed estimator benefits from the powerful learning potential of neural networks and their noise tolerance. The inputs to the system were the construction dates, pipe types and diameters of the pipes, and the assessment date. The output was the reliability of individual pipes, i.e. the likelihood of their working properly until the assessment date.

Weibull and lognormal distributions, well-known lifetime models in this literature (Borror et al. 2003, Hariga 1996), were examined in this study for survival modelling. Section 4.6 contains background on the application of these classic mathematical distributions for the purpose of lifetime modelling.

With proper choice, and within their range of credibility, Weibull or lognormal models offer considerable insight into the lifetime and reliability of products. These two models were also used to develop reliability estimation models for the different classes of pipes. Using these models is recognised to be acceptable, based on regression analysis. The quantitative comparison of the performance of the estimation systems developed using the proposed ANN-based technique, the Weibull model, and the lognormal model, in terms of mean square errors of estimation, is presented in Table 4.1. For all six classes of pipes investigated, the ANN model produced the highest level of accuracy compared to the Weibull and lognormal estimates. On average, the mean square errors for the ANN, Weibull and lognormal models were 0.03, 0.084 and 0.086 respectively.

An illustrative comparison between the three above-mentioned estimation systems is also available for each class of pipes separately. Figures 4.6, 4.7, and 4.8 demonstrate this comparison for some of these pipe classes. Both the quantitative and the illustrative comparisons confirmed the advantage of the proposed neural estimation system over the two other examined techniques for the existing failure history.

The results show that the mean square errors of the reliability estimates

given by the proposed method (for different classes of pipes) are up to 83%

(at least 33%) smaller than the errors of the well-known lognormal and

Weibull distribution models.

In addition, there is no guarantee that the classic models will fit the failure history in other, similar cases, which underlines the advantage of the proposed neural network method and adds weight to the generalisation power of neural network based modelling. In other words, lognormal or Weibull models may not always fit failure data acceptably.

The proposed neural network approach, on the other hand, is potentially capable of learning the behaviour of almost any non-linear pattern of failure data and reconstructing the pattern of data for prediction purposes. The proposed ANN-based reliability estimator benefits from the generalisation power of neural networks and their ability to perform modelling without knowing or assuming any underlying distribution, even on the basis of censored data.

In Appendix B, a step-by-step practical algorithm is presented, and details on how to design the different layers of the ANN and train it by error back-propagation are provided. A practical approach to prioritisation of pipes for replacement/rehabilitation based on their predicted reliability in future times is also provided.

It is important to note that the occurrence of a pipe failure is the result of the multiplicative influence of a vast range of physical, environmental and operational factors on the pipe. Some of these factors have rarely been recorded by water distribution systems during the network service. Furthermore, some of these determining factors, such as soil movement and temperature, have irregular and complex variations and need to be modelled separately. This complexity has motivated the author to concentrate on the process of failure occurrences from another point of view.

A critical approach to the existing models, including the proposed neural network model, is taken to move the study to the next stage and to try to fill the gap between existing models and the true nature of the failure process. To be more precise, a component of this study is to provide a more accurate understanding of the nature of failure occurrence. This understanding, in turn, guides the research towards developing more accurate models that attempt to reflect the nature of the failure process.

Chapter 5

Characteristics of Water Main

Lifetimes as Random Processes

5.1 Introduction

Different types of modelling techniques have been developed to analyse pipe breakages by studying their reliability and remaining life, e.g. (Shamir and Howard 1979, O’Day et al. 1980, Andreou et al. 1987, Clark and Goodrich 1988, Karaa and Marks 1990, Gustafson and Clancy 1999). In this context, survival analysis has been commonly applied and has resulted in parametric lifetime models, e.g. Weibull and lognormal (Andreou et al. 1987, Gustafson and Clancy 1999). These models, previously developed for the failure/reliability prediction of water pipes, were reviewed in Chapter 2. An ANN model for survival analysis of water pipes was also proposed and explained in detail, and simulation results of applying this technique to the dataset were presented, in Chapter 4.

Although the lifetime models realised by the proposed ANN-based method

are capable of reconstructing the pattern of previously recorded failure oc-

currences (in order to project this pattern to the future) more accurately

than other classical probabilistic methods, there are still large uncertain-

ties involved in failure prediction by either of the models. This is because

there are a number of factors neglected in the modelling process (and not

recorded in the database), that cause deterioration of a water pipe. These

factors consist of a diverse range of physical characteristics, environmental,

and operational parameters as explained in Chapter 2.

It is important to note that the effects of the above-mentioned factors are not exactly the same in different cases because the incidence of each pipe failure is a complex process, resulting from the multiplicative effect of those factors. Some of the affecting factors are rarely recorded by water distribution systems and are therefore unavailable to be taken into account in the development of any model.

Even if all of the affecting factors were available for predicting the failure

occurrences in the future, prediction of these factors in the future could

involve considerable uncertainty. The reason for this ongoing uncertainty is

that some of these factors, such as soil shrinkage (as a result of significant

variation in moisture content) and temperature differential do not follow

regular patterns and their fluctuations need further consideration.

Developing an in-depth understanding of how pipe failures are influenced by these various factors is the main focus of this chapter. In particular, this chapter illustrates that water pipe failures (failure rates or inter-failure times) are non-stationary random processes, and demonstrates the deficiency of parametric techniques for the analysis of such failure processes through mathematical and empirical analyses.

In Section 5.2, the concepts and definitions of stationary random processes are briefly reviewed and it is shown that parametric lifetime models cannot accurately model non-stationary failure processes. A new set of probability-based definitions of failure rate is introduced and their characteristics are discussed in Section 5.3. Theoretical failure rates for general parametric models and two-parameter Weibull models are derived in Section 5.4, followed by an explanation of the empirical calculation of the new failure rates using a database, presented in Section 5.5. A comparison of the theoretical and empirical results for the CWW dataset is presented in Section 5.6. Section 5.7 presents the conclusions of the study reflected in this chapter, and explains the motivation and direction for further work, which is covered in Chapter 6.

5.2 Non-Stationary Random Failure Processes

and Parametric Lifetime Models

A random process is an ensemble of consecutive random variables that cor-

respond to the possible outcomes of a random event. For example, the

number of pipe failures occurring during one month is a random variable,

and the ensemble of such numbers corresponding to a number of consecutive

months is a random process.


In engineering applications, a random process is usually referred to as

stationary if the mean and variance of the process are time-invariant; oth-

erwise it is referred to as a non-stationary process (Bras and Rodriguez

1993, Balakrishnan 1995). More precisely, by definition, a random pro-

cess is wide-sense-stationary if its mean and second-order statistical prop-

erties (its correlation function) are time-invariant. If the distribution func-

tions of the random variables that constitute the random process are all

identical, then the random process is referred to as strict-sense stationary

(Balakrishnan 1995).
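The engineering notion of stationarity above (time-invariant mean and variance) suggests a simple exploratory check: split a series of counts into consecutive windows and compare the window means and variances. The sketch below is a crude heuristic under assumed names and an assumed tolerance; it is not a formal statistical test and is not taken from the thesis.

```python
from statistics import mean, pvariance


def window_moments(series, window):
    """Mean and variance of each consecutive, non-overlapping window."""
    moments = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        moments.append((mean(chunk), pvariance(chunk)))
    return moments


def looks_stationary(series, window, tolerance):
    """True if window means and variances each vary by at most `tolerance`.

    The tolerance is a hypothetical, user-chosen threshold.
    """
    moments = window_moments(series, window)
    if not moments:
        raise ValueError("series is shorter than one window")
    means = [m for m, _ in moments]
    variances = [v for _, v in moments]
    return (max(means) - min(means) <= tolerance
            and max(variances) - min(variances) <= tolerance)


if __name__ == "__main__":
    steady = [2.0, 4.0] * 4            # same mean and variance throughout
    drifting = [1.0, 2.0, 1.0, 2.0, 10.0, 12.0, 10.0, 12.0]
    print(looks_stationary(steady, window=4, tolerance=0.5))
    print(looks_stationary(drifting, window=4, tolerance=0.5))
```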

Using a lifetime model with time-invariant parameters for the water pipes implicitly assumes that the random process of time intervals between consecutive failures of the pipes is a stationary process in the strict sense. Weibull, lognormal and other similar probabilistic lifetime models are parametric models. The ANN-based model introduced in Chapter 4 also falls within the class of parametric models, with its synaptic weights as its parameters.

The parameters of these various models are tuned and adjusted using an optimisation method. For example, the parameters of the Weibull and lognormal models are tuned using linear regression as explained in Chapter 4, and the weights of the ANN model are tuned using back-propagation as described in Chapter 4 and Appendix B. After the parameters are tuned they remain fixed, and the models are time-invariant parametric models, unable to predict failures that follow a non-stationary random process. It is demonstrated in this chapter that all parametric lifetime models implicitly share the underlying assumption that the random processes of failure occurrences in water mains are stationary random processes. The results presented in this chapter also show that the failure process of the pipes listed in the database (described in Chapter 3) is in fact non-stationary.

5.3 Likelihood of Number of Failures: A Prob-

abilistic Definition for Failure Frequency

Existing failure analysis methods for water pipes usually quantify the past

behaviour of pipe failures in terms of either failure rates, e.g. (Rajani and

Makar 2000b) or inter-failure times, e.g. (Rajani and Tesfamariam 2005),

and project them into the future. In this chapter, a new set of probabilistic

measures are introduced for the purpose of investigating the characteristics

of failure processes.



Instead of deterministic measures of failure rates or inter-failure times,

the failure process is studied in terms of probabilistic measures such as

probabilities of certain numbers of failures occurring during specific time

intervals.

The failure history is divided into equal time intervals. The length of the time intervals, denoted by T in this chapter, should be chosen carefully. Very long time intervals result in a coarse analysis in which variations of the failure process within the intervals remain hidden. On the other hand, the time intervals should be long enough to include a fair number of failures on average. The n-th time interval is the interval [(n − 1)T, nT].

The event of occurrence of k failures during the n-th time interval is denoted by $\mathrm{NOF}_k(nT)$. There is a direct relationship between the inter-failure times (denoted by IFT) and the number of failures occurring during each time interval. To show this relationship, three instances of occurrence of the events $\mathrm{NOF}_0(nT)$, $\mathrm{NOF}_1(nT)$ and $\mathrm{NOF}_2(nT)$ are illustrated in Figure 5.1, where TFF denotes the time to the first failure occurring after the time (n − 1)T, and the next consecutive inter-failure times are denoted by $\mathrm{IFT}_1$ and $\mathrm{IFT}_2$, respectively. The time elapsed since the most recent failure is denoted by $t_f$.

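As Figure 5.1 shows, the event of occurrence of no failure during the n-th time interval, $\mathrm{NOF}_0(nT)$, is equivalent to:

$$\mathrm{NOF}_0(nT) \equiv \mathrm{TFF} > T \tag{5.1}$$

and similarly, the events $\mathrm{NOF}_1(nT)$ and $\mathrm{NOF}_2(nT)$ are equivalent to:

$$\mathrm{NOF}_1(nT) \equiv (\mathrm{TFF} \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 > T) \tag{5.2}$$

$$\mathrm{NOF}_2(nT) \equiv (\mathrm{TFF} \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 + \mathrm{IFT}_2 > T). \tag{5.3}$$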
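The equivalences in Equations (5.1) to (5.3) amount to accumulating TFF and the following inter-failure times until the end of the interval is passed. The sketch below does this for an arbitrary number of failures; the function name and the sample values are hypothetical.

```python
def number_of_failures(tff, ifts, T):
    """Return k such that the event NOF_k occurs in an interval of length T.

    `tff` is the time to the first failure measured from the start of the
    interval, and `ifts` lists the following inter-failure times IFT_1, IFT_2, ...
    """
    if tff > T:
        return 0                      # Equation (5.1): TFF > T
    k = 1                             # the first failure falls inside the interval
    elapsed = tff
    for ift in ifts:
        elapsed += ift
        if elapsed > T:               # the next failure falls outside the interval
            return k
        k += 1
    raise ValueError("not enough inter-failure times to pass the end of the interval")


if __name__ == "__main__":
    print(number_of_failures(1.5, [0.4], T=1.0))       # no failure in the interval
    print(number_of_failures(0.3, [0.9], T=1.0))       # one failure, at 0.3
    print(number_of_failures(0.3, [0.5, 0.4], T=1.0))  # failures at 0.3 and 0.8
```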

The probability of occurrence of k failures during the n-th time interval

is denoted by Pk(nT ) and is equal to Pr(NOFk(nT )). The variable n implies

possible variations of such probabilities with time which will be discussed

further in this chapter.

Each of the probability values in the set $\{P_k(nT) \mid k = 0, 1, \ldots, M\}$ is called a Likelihood of Number of Failures (LNF value for short), where M is the maximum number of failures that can occur within a single time interval.

The calculation of the theoretical and empirical LNF values is discussed in Sections 5.4 and 5.5. It is important to note that in the

Figure 5.1: Demonstration of inter-failure times in three instances of occurrence of the events: (a) $\mathrm{NOF}_0(nT)$, (b) $\mathrm{NOF}_1(nT)$, and (c) $\mathrm{NOF}_2(nT)$. [Each panel shows a time axis marked at (n − 2)T, (n − 1)T, nT, (n + 1)T and (n + 2)T, with $t_f$, TFF, $\mathrm{IFT}_1$ and $\mathrm{IFT}_2$ indicated.]

probabilistic approach to defining and evaluating water pipe failure rates introduced in this chapter, unlike the deterministic approach, the analysis does not merely return a single number of failures (or failure rate, as commonly used in the infrastructure systems analysis context). Instead, the focus is on the probabilities of certain numbers of failures, from which a confidence interval can be estimated (as shown below); this is a valuable measure for developing maintenance strategies.

Given the LNF values for the n-th time interval, the expected number of failures (denoted by ENOF(nT)) in that time interval can be directly calculated as the statistical mean of the number of failures, given by:

$$\mathrm{ENOF}(nT) = \sum_{k=0}^{M} k\,P_k(nT) \tag{5.4}$$

This value is equivalent to the failure rate as commonly computed in

infrastructure systems analysis. However, by using the LNF values, a con-

fidence interval can also be calculated for the above failure rate. A confi-

dence interval quantifies the existing uncertainty in the calculated failure

rate, and it is particularly useful if the future failure rates are calculated

by Equation (5.4). For example, the statement “with a probability of 90%,



18 to 22 failures will occur in each month in future” is more meaningful

and useful for planning, compared to the statement “20 failures will occur

monthly”.

Assume that, using the LNF values, a measure for the expected number of failures, ENOF, is calculated using Equation (5.4). The interval [ENOF − δ, ENOF + δ] is the β-confidence interval corresponding to this failure rate if:

$$\Pr\left(x \in [\mathrm{ENOF} - \delta,\ \mathrm{ENOF} + \delta]\right) = \beta \tag{5.5}$$

where x denotes the number of failures occurring in the time interval.

Calculation of the half-width of the confidence interval, δ, is straight-

forward by histogram analysis of the LNF values. The LNF values for

immediate right and left neighbours of ENOF are added to the LNF value

for the ENOF value. If the result equals β, the interval between these two

neighbours is the β–confidence interval. Otherwise, this interval should be

symmetrically extended until the area under LNF curve equals β.
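Equation (5.4) and the symmetric widening procedure for the confidence interval can be sketched as below. Two points are assumptions made for this discrete setting and are not spelled out in the text: ENOF is rounded to the nearest failure count to obtain the centre of the interval, and the widening stops when the accumulated probability is at least β, since it may not equal β exactly. Function names are hypothetical.

```python
def expected_number_of_failures(lnf):
    """Equation (5.4): statistical mean, with lnf[k] = P_k for k = 0..M."""
    return sum(k * p for k, p in enumerate(lnf))


def confidence_half_width(lnf, beta, tol=1e-12):
    """Widen a symmetric interval around ENOF until it holds at least `beta`.

    Returns (delta, covered probability). The centre is ENOF rounded to the
    nearest count, which is an assumption of this sketch.
    """
    m = len(lnf) - 1
    centre = int(expected_number_of_failures(lnf) + 0.5)
    delta = 1                          # start with the immediate neighbours
    while True:
        lo, hi = max(0, centre - delta), min(m, centre + delta)
        mass = sum(lnf[lo:hi + 1])
        if mass >= beta - tol or (lo == 0 and hi == m):
            return delta, mass
        delta += 1


if __name__ == "__main__":
    lnf = [0.05, 0.2, 0.5, 0.2, 0.05]  # hypothetical LNF values, M = 4
    print(expected_number_of_failures(lnf))
    print(confidence_half_width(lnf, beta=0.85))
    print(confidence_half_width(lnf, beta=0.95))
```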

5.4 Derivation of Theoretical LNF Values From

Lifetime Models

Lifetime distribution models form the core components of probabilistic ap-

proaches used in the analysis of water pipe failures. Such models usually

contain parametric functions with constant coefficients that are tuned by

an optimisation technique applied to failure records in a dataset. If such a

model is available, it provides a probability density function for inter-failure

times and the lifetime, which is the time to the first failure (TFF ). These

probability density functions are denoted by fIFT (t) and fTFF (t), respec-

tively. There is a direct relationship between these two density functions,

as explained below:

The sum tf + TFF is an inter-failure time and therefore, it is a ran-

dom variable with the IFT density function. Since in probabilistic lifetime

modelling, the consecutive failure times are assumed independent from each

other, the random variables tf and TFF are independent and the density

function of their sum equals the convolution of their individual density func-

tions:

$$f_{\mathrm{IFT}}(t) = f_{t_f}(t) * f_{\mathrm{TFF}}(t) = \int_{0}^{t} f_{t_f}(\tau)\, f_{\mathrm{TFF}}(t - \tau)\, d\tau. \tag{5.6}$$

On the other hand, since $T - t_f$ is also a time to the first failure, we can express the density of $t_f$ as $f_{t_f}(t) = f_{\mathrm{TFF}}(T - t)$ and, by substituting into the above equation, the following integral equation for the density function $f_{\mathrm{IFT}}(t)$ is derived:

$$f_{\mathrm{IFT}}(t) = \int_{0}^{t} f_{\mathrm{TFF}}(T - \tau)\, f_{\mathrm{TFF}}(t - \tau)\, d\tau. \tag{5.7}$$

From Equation (5.1), the LNF value P0(nT ) can be calculated as:

$$P_0(nT) = \Pr\{\mathrm{NOF}_0(nT)\} = \Pr\{\mathrm{TFF} > T\} = \int_{T}^{\infty} f_{\mathrm{TFF}}(t)\, dt. \tag{5.8}$$

The event of occurrence of only one failure during the n-th time interval,

NOF1(nT ) is expressed in Equation (5.2) and the LNF value P1(nT ) can

be expressed as follows:

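$$P_1(nT) = \Pr\{\mathrm{NOF}_1(nT)\} = \Pr\{\mathrm{TFF} \le T \wedge \mathrm{TFF} + \mathrm{IFT}_1 > T\} = \int_{0}^{T}\!\int_{T - t_1}^{\infty} f_{\mathrm{TFF}}(t_1)\, f_{\mathrm{IFT}}(t_2)\, dt_2\, dt_1 \tag{5.9}$$

where:

$t_1$ = time of occurrence of the first failure, measured from the start of the interval, and

$t_2$ = time of occurrence of the second failure, measured from the first failure (i.e. the inter-failure time $\mathrm{IFT}_1$).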

Similarly, from Equation (5.3), the LNF value P2(nT ) is derived as below:

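$$P_2(nT) = \Pr\{\mathrm{NOF}_2(nT)\} = \Pr\{(\mathrm{TFF} \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 + \mathrm{IFT}_2 > T)\} = \int_{0}^{T}\!\int_{0}^{T - t_1}\!\int_{T - t_1 - t_2}^{\infty} f_{\mathrm{TFF}}(t_1)\, f_{\mathrm{IFT}}(t_2)\, f_{\mathrm{IFT}}(t_3)\, dt_3\, dt_2\, dt_1. \tag{5.10}$$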

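The above derivation can be generalised to any number of failures k, for which the probability $P_k(nT)$ is given by:

$$P_k(nT) = \Pr\{\mathrm{NOF}_k(nT)\} = \Pr\Big\{(\mathrm{TFF} \le T) \wedge (\mathrm{TFF} + \mathrm{IFT}_1 \le T) \wedge \cdots \wedge \Big(\mathrm{TFF} + \sum_{i=1}^{k-1} \mathrm{IFT}_i \le T\Big) \wedge \Big(\mathrm{TFF} + \sum_{i=1}^{k} \mathrm{IFT}_i > T\Big)\Big\}$$

$$= \int_{0}^{T}\!\int_{0}^{T - t_1}\!\cdots\!\int_{0}^{T - \sum_{i=1}^{k-1} t_i}\!\int_{T - \sum_{i=1}^{k} t_i}^{\infty} f_{\mathrm{TFF}}(t_1)\, f_{\mathrm{IFT}}(t_2) \cdots f_{\mathrm{IFT}}(t_{k+1})\, dt_{k+1} \cdots dt_1. \tag{5.11}$$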

Equations (5.8-5.11) are based on the assumption that the LNF values

are time-invariant (independent of the absolute time nT ) as they merely

depend on the number of failures k and the time-invariant pdf of the TFF

and inter-failure times.

To clarify the above point, the LNF values are derived for a two-parameter Weibull lifetime model, which has been repeatedly applied for failure analysis of many types of units, with the probability density function (Crowder et al. 1994):

$$f_{\mathrm{IFT}}(t) = \frac{\eta}{\alpha}\left(\frac{t}{\alpha}\right)^{\eta - 1} e^{-\left(\frac{t}{\alpha}\right)^{\eta}} \tag{5.12}$$

where η and α are the shape and scale parameters. The following formula

for Pk(nT ) is derived:

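$$P_k(nT) = \int_{0}^{T}\!\int_{0}^{T - t_1}\!\cdots\!\int_{0}^{T - \sum_{i=1}^{k-1} t_i}\!\int_{T - \sum_{i=1}^{k} t_i}^{\infty} \left(\frac{\eta}{T\alpha}\right)^{k+1} f_{\mathrm{TFF}}(t_1) \left[\frac{t_2 \cdots t_{k+1}}{T^{k}\alpha^{k}}\right]^{\eta - 1} e^{-\left(\frac{t_2/T}{\alpha}\right)^{\eta} - \cdots - \left(\frac{t_{k+1}/T}{\alpha}\right)^{\eta}}\, dt_{k+1} \cdots dt_1 \tag{5.13}$$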

where fTFF (t1) can be derived from the Weibull distribution of inter-failure

times given in Equation (5.12), by solving the integral equation (5.7).

The above LNF values are time-invariant and this property is not spe-

cific to the Weibull model. Indeed, as long as the distribution has constant

parameters that are not updated with time, the derived LNF values are

independent of time (they do not depend on either n or T ) and merely

depend on k. When time-invariant lifetime distributions such as Weibull

distribution in Equation (5.12) are utilised to model the failure process, the

random process formed by the consecutive inter-failure times is implicitly

assumed to be strict-sense stationary (with time-invariant probability den-

sity function). The derivations made in this section assume that in such

cases, failure rate (number of failures occurring during a specified time in-

terval) is a strict-sense stationary process, too. More precisely, it would

be a discrete random process with time-invariant probability mass function

(pmf), and the LNF values defined in this chapter would be its pmf.
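The time-invariance claimed above can be illustrated numerically rather than by evaluating the nested integrals. The sketch below simulates a failure process whose inter-failure times are independent draws from one fixed Weibull distribution, counts failures per interval of length T, and compares the estimated pmf over the first and second halves of the simulated history. It is a Monte Carlo illustration under assumed parameter values, a hypothetical burn-in to discard the start-up transient, and hypothetical function names; it does not evaluate Equation (5.13) itself.

```python
import random
from collections import Counter


def simulate_counts(shape, scale, T, n_intervals, burn_in=100, seed=0):
    """Failure counts per interval for i.i.d. Weibull inter-failure times."""
    rng = random.Random(seed)
    total = n_intervals + burn_in
    counts = [0] * total
    horizon = total * T
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape
    t = rng.weibullvariate(scale, shape)
    while t < horizon:
        index = min(int(t // T), total - 1)
        counts[index] += 1
        t += rng.weibullvariate(scale, shape)
    return counts[burn_in:]            # drop the assumed start-up transient


def pmf(counts):
    """Histogram estimate of the probability of each failure count."""
    freq = Counter(counts)
    n = len(counts)
    return {k: freq[k] / n for k in sorted(freq)}


if __name__ == "__main__":
    counts = simulate_counts(shape=1.5, scale=1.0, T=1.0, n_intervals=40000)
    half = len(counts) // 2
    first, second = pmf(counts[:half]), pmf(counts[half:])
    for k in range(4):
        print(k, round(first.get(k, 0.0), 3), round(second.get(k, 0.0), 3))
```

With fixed parameters, the two halves are expected to give nearly the same estimates, which is the behaviour that the empirical analysis later in the chapter contrasts with the recorded data.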

5.5 Empirical Calculation of LNF Values

Given a dataset of water pipe failures over a long period, the LNF values can be empirically estimated using the histogram technique. The failure history is divided into time units referred to as time periods. The duration of the time periods should be short enough to assume that LNF values remain almost constant during the time intervals within each time period. On the other hand, the time periods should be long enough to provide reasonable empirical estimates for LNF values during each period using the histogram technique. For instance, in the analysis presented in this chapter, each time period is three months long and each time interval is one day long. If a time period $S_i$ includes the intervals between $n_1T$ and $n_2T$, i.e. $S_i = [n_1T, n_2T]$, then the following empirical LNF values are given by the histogram technique:

$$P_k^{\mathrm{EMP}}(n_1T) = P_k^{\mathrm{EMP}}((n_1 + 1)T) = \cdots = P_k^{\mathrm{EMP}}(n_2T) = P_k^{\mathrm{EMP}}(S_i) = \frac{\text{Number of } \mathrm{NOF}_k \text{ events occurring during } S_i}{n_2 - n_1} \tag{5.14}$$

where $P_k^{\mathrm{EMP}}(nT)$ is the empirical estimate of $P_k(nT)$.

For each time period Si, the expected number of failures, denoted by ENOF(Si), is the statistical mean of the number of failures occurring during a time interval within Si:

$$ENOF(S_i) = \sum_{k=0}^{M} k\, P_k^{\mathrm{EMP}}(S_i) = \frac{\text{Number of failures occurring during } S_i}{n_2 - n_1} \tag{5.15}$$
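The histogram estimates in Equations (5.14) and (5.15) can be illustrated with a minimal Python sketch. The function names and the ten-day list of daily failure counts are hypothetical and chosen only for illustration; they are not taken from the CWW database. The number of intervals in the period plays the role of n2 − n1.

```python
from collections import Counter


def empirical_lnf(daily_counts):
    """Histogram estimate of the LNF values for one time period (Eq. 5.14).

    daily_counts[i] is the number of failures observed in the i-th time
    interval (day) of the period. The returned list holds P_k^EMP for
    k = 0 .. max observed count.
    """
    n_intervals = len(daily_counts)
    if n_intervals == 0:
        raise ValueError("the time period must contain at least one interval")
    histogram = Counter(daily_counts)
    max_k = max(daily_counts)
    return [histogram.get(k, 0) / n_intervals for k in range(max_k + 1)]


def expected_failures(lnf):
    """ENOF: statistical mean of the number of failures per interval (Eq. 5.15)."""
    return sum(k * p for k, p in enumerate(lnf))


# Hypothetical ten-day period (illustrative numbers only).
counts = [0, 1, 0, 2, 1, 0, 0, 3, 1, 0]
lnf = empirical_lnf(counts)
enof = expected_failures(lnf)
print(lnf, enof)
```

Under these assumptions, the mean computed from the LNF values coincides with the total number of failures divided by the number of intervals, which is the second equality in Equation (5.15).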

5.6 Case Study

This probabilistic approach is implemented using the database of CWW pipe breakages that occurred during 1997–2000. Details of the characteristics of this database and its preparation process are presented in Chapter 3.

In order to consider the effect of the material, size, and geographical

location of the pipes in the life-time models, these factors are chosen as

dividing criteria in classification of pipes. For example one subset of the

pipes in this classification is the class of Cast Iron Cement Lined (CICL)

pipes with diameter of 100 mm, located in an area covered by a single

postcode 3021. A total of 1450 failures have occurred for this class of pipes

over the course of four years (16 seasons), which is sufficient for the purpose

of the analysis presented in this chapter.

In this analysis, each time period is three months (one season) long and

failures are counted on a day-by-day basis, i.e. each time interval is one day

long. For each season, a set of LNF values are empirically calculated using

Equation (5.14).

Figure 5.2 shows the empirical values of Pk(Si) for k = 0, 3, 4 and their variations over 16 consecutive seasons. The main reason why the actual LNF values are time-varying is that the random process of water pipe failures is non-stationary, owing to the environmental factors that affect the rate of failures and the inter-failure times.

Figure 5.2: Empirical LNF values P0, P3 and P4 for 16 consecutive seasons during 1997–2000. [Plot omitted; horizontal axis: Time Period (Seasons), Summer 1997 to Spring 2000; vertical axis: Empirical LNF Values.]

As discussed in Sections 5.2 and 5.4, parametric probabilistic lifetime models assume a stationary random process of failures and therefore cannot predict the LNF variations with time. For instance, consider a two-parameter Weibull model fitted to the water pipe failure records. Given the parameters η and α, the ITF density function in Equation (5.7) is substituted with the Weibull density function given in Equation (5.12), and the TFF density function is numerically calculated by solving the integral Equation (5.7). Then, numerical calculation of the integrals in Equation (5.13) results in the following constant LNF values:

P0 = 0.2431, P3 = 0.1283 and P4 = 0.0729.

5.6.1 Effect of rainfall on failure rates

As discussed earlier in Chapter 2, in addition to the size, material and geographical location of the pipes, other factors affect the pipe failure process. Some examples include construction details, external and internal loads, and corrosion. In most breakages with clear mechanical causes, corrosion has an accelerating role by weakening the fabric of the pipe. Although such factors are not considered in this case study, most of them can be assumed to vary only slightly from one season to the next over the four-year failure history. However, one factor that is not steady over consecutive seasons is soil movement.

Soil movement is a particularly critical factor in regions with expansive soils, which are subject to swelling and shrinkage that vary in proportion to the amount of moisture present in the soil. As water is initially introduced into the soil (by rainfall), the soil expands, and after drying out it contracts, often leaving small fissures or cracks. Excessive drying and wetting of the soil progressively deteriorates structures over the years. The resulting soil movement can exert pressures as large as 718 kPa (Nelson and Debora 1992), which could lead to cracking.

A number of investigations have been conducted on the effect of soil

movement due to its moisture content on breakage of water pipes. For

instance, Rajani et al. (1996) showed that differential temperature change

between pipe and soil, and also soil shrinkage due to dryness result in the

development of stresses in the pipe. Also, in a study of the high rate of failures in the Fort Worth area, shifting soils were suspected to be the main cause (Morris 1975). According to the same reference, bending stresses caused by the swelling of soils such as expansive clay were found to be three to four times greater than effects such as internal pressure.

Similarly, substantial areas of the State of Victoria (including the CWW licence area) are covered by expansive clay soils. The soil map of Victoria is shown in Figure 5.3 (Mann 1997). Different textures and properties found in soils, as reported by Northcote (1979), are presented as a supplement in Tables 5.1 and 5.3. These data demonstrate that the CWW network is mainly located in a region with expansive soil. Therefore, pipe fractures are likely to occur over time due to soil movements mainly caused by rainfall fluctuation.

While the soil type is time-invariant, rainfall is a non-stationary process. Thus, rainfall fluctuation is expected to be the major factor contributing to the non-stationarity of the failure process. Dry seasons are expected to be associated with high rates of breakage due to soil shrinkage.

Expansive soils in Victoria (see Atlas of Australian Soils, CSIRO)
Cracking clay soils (CC10-13, Basaltic)
Cracking clay soils (CC1-5, CC8-9;12;Ka1-3;Ke1-4)
Hard setting loamy soils with brown or mottled brown clayey subsoils (Ra1-2;Rb1;Rf1, Basaltic)
Hard setting loamy soils with mottled dark clayey subsoils (HH1-2, Basaltic)
Hard setting loamy soils with mottled yellow clayey subsoils (Ta4;Tb19;Va2,9, Basaltic)
Hard setting loamy soils with mottled yellow clayey subsoils (Ta2;Tb4;Td5;Ub24-26;Ub29;Va1,3,4,5,7,8,11;Vd1)
Hard setting loamy soils with red clayey subsoils (Oa2-3, Basaltic)
Hard setting loamy soils with red clayey subsoils (Md4;O2,6;Ob4;Oc1-2;P1-2;Qb2-3)
Friable loamy soils (G1;Rg1, Basaltic)
Friable loamy soils (G3)
Friable (highly structured) porous earths (GG1;Mg2,5,7,17;M11, Basaltic)
Peats (Z5)
Yellow leached earths (EE2)
Grey brown highly calcareous loamy earths (Lb8-9)
Hard-setting loamy soils with mottled yellow clayey subsoils (Tb1)
Leached sand soils (Cb2)
Sandy soils with mottled yellow clayey subsoils (Wa8;X1,4,5;Ya4,15,19)
Lakes

Figure 5.3: Expansive soils in Victoria (Mann, 1997). [Map omitted; the entries above are its legend.]

Table 5.1: Approximate clay content of different types of soils

| Texture symbol | Field texture grade | Approx. clay content (%) |
|---|---|---|
| S | Sand | Less than 5% |
| LS | Loamy sand | Approx. 5% |
| CS | Clayey sand | 5–10% |
| SL | Sandy loam | 10–20% |
| FSL | Fine sandy loam | 10–20% |
| SCL | Light sandy clay loam | 15–20% |
| L | Loam | About 25% |
| Lfsy | Loam, fine sandy | Approx. 25% |
| ZL | Silty loam | 25% and with silt 25%+ |
| SCL | Sandy clay loam | 20–30% |
| CL | Clay loam | 30–35% |
| CLS | Clay loam, sandy | 30–35% |
| ZCL | Silty clay loam | 30–35% and with silt 25%+ |
| SC | Sandy clay | 35–40% |
| ZC | Silty clay | 35–40% |
| LC | Light clay | Clay: 35–40% and with silt: 25%+ |
| LMC | Light medium clay | 40–45% |
| MC | Medium clay | 45–55% |
| MHC | Medium heavy clay | 50%+ |
| HC | Heavy clay | 50%+ |

Figure 5.4(a) shows the rainfall records for the 16 seasons during 1997–2000. The variations of rainfall through the seasons are significant, as the average rainfall by season is 110 mm and the standard deviation is 42 mm. In addition, large fluctuations in soil moisture are considered the main source of soil movement resulting in pipe breakages. To examine the consistency of the variations in LNF values with rainfall variations, the empirical failure rates (ENOF values) are calculated using Equation (5.15) and plotted in Figure 5.4(b). The standard deviation of the ENOF values has also been calculated using the following equation:

$$\sigma_{ENOF}(S_i) = \sqrt{\sum_{k=0}^{M} \left(k - ENOF(S_i)\right)^2 P_k^{\mathrm{EMP}}(S_i)}. \tag{5.16}$$

The numerical results of ENOF values and their standard deviations are

presented in Table 5.2. The relatively small standard deviations of ENOF

values in every season show how accurately the expected number of failures

is calculated using Equation (5.15).
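As an illustration of how Equation (5.16) is evaluated from a set of LNF values, the following is a minimal sketch. The function name and the four LNF values are hypothetical and do not correspond to any season in Table 5.2.

```python
import math


def enof_std(lnf):
    """Standard deviation defined in Eq. (5.16) for one time period.

    lnf[k] is the empirical probability of observing k failures in a
    time interval of the period.
    """
    mean = sum(k * p for k, p in enumerate(lnf))  # ENOF, Eq. (5.15)
    variance = sum((k - mean) ** 2 * p for k, p in enumerate(lnf))
    return math.sqrt(variance)


# Hypothetical LNF values P0..P3 (illustrative only; they sum to one).
lnf_example = [0.5, 0.3, 0.1, 0.1]
sigma = enof_std(lnf_example)
print(sigma)
```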

Figure 5.4: Comparison of rainfall in each season with its corresponding empirical average number of failures (empirical failure rates or ENOF values) for CICL pipes with 100 mm diameter: (a) rainfall (mm) in each season, shown with its average; (b) empirical average number of failures over each season, computed as the ratio of the total number of breaks that occurred in each season to the number of pipes involved. [Plots omitted; horizontal axes: seasons from Summer 1997 to Spring 2000.]

Table 5.2: The empirical expected number of failures and their standard deviations for 16 consecutive seasons during 1997–2000.

| Season (Si) | ENOF | σENOF | Season (Si) | ENOF | σENOF |
|---|---|---|---|---|---|
| Summer 97 | 1.00 | 0.0223 | Summer 99 | 0.78 | 0.0327 |
| Autumn 97 | 0.98 | 0.0457 | Autumn 99 | 0.88 | 0.0009 |
| Winter 97 | 0.81 | 0.0190 | Winter 99 | 0.80 | 0.0273 |
| Spring 97 | 0.71 | 0.0149 | Spring 99 | 0.69 | 0.0131 |
| Summer 98 | 0.94 | 0.0398 | Summer 2000 | 0.93 | 0.0387 |
| Autumn 98 | 0.94 | 0.0247 | Autumn 2000 | 0.86 | 0.0216 |
| Winter 98 | 0.81 | 0.0082 | Winter 2000 | 0.79 | 0.0280 |
| Spring 98 | 0.62 | 0.0208 | Spring 2000 | 0.68 | 0.0147 |

In Figure 5.4, it is observed that the directions of variation (increasing or decreasing) of 11 of the 16 failure rates are opposite to those of the rainfall variations. Thus, in this case study, rainfall varies in the opposite direction to the failure pattern in 11 of the 16 seasons (about 69%). For the remaining five seasons (outliers), other factors such as abnormal loading or preventative maintenance may explain the inconsistency of failure pattern variations with rainfall variations.
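The direction comparison described above can be sketched as follows. This is only one possible counting convention (season-to-season changes, so a series of N seasons yields N − 1 comparisons), and both the function name and the five-season series are made up for illustration; they are not the CWW failure rates or the rainfall records.

```python
def opposite_direction_count(rainfall, failure_rates):
    """Count season-to-season changes where rainfall and failure rate move
    in opposite directions.

    Returns (number of opposite changes, number of changes compared).
    """
    if len(rainfall) != len(failure_rates):
        raise ValueError("both series must cover the same seasons")
    opposite = 0
    for i in range(1, len(rainfall)):
        d_rain = rainfall[i] - rainfall[i - 1]
        d_fail = failure_rates[i] - failure_rates[i - 1]
        # A negative product means one series rose while the other fell.
        if d_rain * d_fail < 0:
            opposite += 1
    return opposite, len(rainfall) - 1


# Made-up five-season example (not measured data).
rain_mm = [100, 120, 90, 95, 140]
rates = [0.90, 0.80, 0.95, 0.97, 0.70]
result = opposite_direction_count(rain_mm, rates)
print(result)
```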

In order to highlight the correlation between the rainfall data and the number of failures, Figure 5.5 plots the rainfall data versus the empirical (expected) number of failures, i.e. the ENOF values per km of pipe length, for CICL pipes with 100 mm diameter. It is observed that when the rainfall is significantly higher or lower than its average value (about 110 mm), there is a corresponding increase in the number of pipe failures. The two diverging lines have been plotted to clarify this increase. A similar trend has been observed in the failure pattern of other classes of pipes. For instance, Figure 5.6 shows the same correlation for CI pipes with 100 mm diameter.

To quantify the correlation between rainfall and failure rate variations, the magnitude of the deviation of rainfall from its average is plotted versus the ENOF values for CICL pipes with 100 mm diameter in Figure 5.7, and a regression line is fitted to the points (excluding the five outliers). The outliers are identified by a robust estimation technique, the Least Median of Squares (LMS) estimator. This method finds the optimum linear fit to the data (excluding the detected outlier samples) and automatically results in an inlier-outlier dichotomy. The outliers in this case study are assumed to be mainly associated with random effects and extreme climate variations. Although the outliers are not considered in this analysis, they also contribute to the random time-variations of the failure process and its non-stationarity. Indeed, their existence also demonstrates the deficiency of the parametric (probabilistic) models developed for water pipe failures in the literature.
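The inlier-outlier dichotomy produced by a least-median-of-squares fit can be illustrated with a small sketch. It is not the implementation used in this study: it approximates the LMS line by searching over all lines through pairs of data points, and it flags outliers with the commonly used robust scale 1.4826 (1 + 5/(n − 2)) times the square root of the minimum median squared residual and a cut-off of 2.5, both of which are assumptions here. The function names, the tolerance and the data points are hypothetical.

```python
from itertools import combinations
from statistics import median


def lms_line(xs, ys):
    """Approximate least-median-of-squares line via all point pairs.

    Returns (median squared residual, slope, intercept) of the best line.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # skip vertical candidate lines
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        med = median((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
        if best is None or med < best[0]:
            best = (med, slope, intercept)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best


def find_outliers(xs, ys, cutoff=2.5):
    """Indices of points whose scaled residual exceeds the cut-off (assumed rule)."""
    med, slope, intercept = lms_line(xs, ys)
    n = len(xs)
    scale = 1.4826 * (1 + 5 / (n - 2)) * med ** 0.5
    tol = 1e-9  # guards against a zero scale when most points fit exactly
    return [
        i
        for i, (x, y) in enumerate(zip(xs, ys))
        if abs(y - (slope * x + intercept)) > cutoff * scale + tol
    ]


# Hypothetical points on the line y = 2x + 1 with one gross outlier at x = 4.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [1, 3, 5, 7, 30, 11, 13]
med, slope, intercept = lms_line(xs, ys)
outliers = find_outliers(xs, ys)
print(slope, intercept, outliers)
```

Because the criterion is the median rather than the sum of the squared residuals, a minority of grossly deviating points does not pull the fitted line towards itself, which is what makes the subsequent inlier-outlier split meaningful.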

Figure 5.5: Daily average number of failures during each season in 1997–2000, and the corresponding rainfall records for CICL pipes with 100 mm diameter. [Plot omitted; horizontal axis: Number of Failures/Km; vertical axis: Rainfall (mm).]

Figure 5.6: Daily average number of failures during each season in 1997–2000, and the corresponding rainfall records for CI pipes with 100 mm diameter. [Plot omitted; horizontal axis: Number of Failures/Km; vertical axis: Rainfall (mm).]

Figure 5.7: Deviation of rainfalls from their average, plotted versus the corresponding ENOF values: a regression line demonstrates the nearly linear correlation between the failure rates and rainfall deviations. [Plot omitted; horizontal axis: Number of Failures (ENOF); vertical axis: Absolute Deviation of Rainfall (mm) From Its Average; legend: Regression Line, Outliers.]

The correlation coefficient of the regression is 0.83, which is sufficiently large to validate our assumption of an almost linear relationship between the failure rates and rainfall (disregarding the extreme climate variations which cause the outlier points). A similar trend has been observed in the data of other classes of pipes.

Large variations of failure rate with rainfall are expected, as a significant change in soil moisture (due to a change in rainfall) would lead to the swelling and shrinkage of reactive soils. This in turn would lead to movement, distortion and subsequent failure of pipes. However, it should be noted that it is not only the amount of rainfall that is responsible for the increase in pipe failures but, more importantly, the rate of change of rainfall and soil moisture. In other words, if a very dry soil (due to below-average rainfall) receives a large amount of rainfall (well above average), the resulting soil movement would be quite significant even though the total rainfall might be about average.

The high rate of change in rainfall is illustrated in Figure 5.4(a) by comparing the rainfalls for summer and winter of 1997. It should also be noted that if high rainfall occurs while the soil is fully saturated from earlier events, it is unlikely that further soil movement will take place and hence, pipes would not experience a higher than average number of failures. Indeed, this fact explains the outliers in Figure 5.7, where at certain periods there are high rainfalls but no appreciable increase in the number of failures.

5.7 Conclusion

In this chapter, pipe failures were considered as random processes and their characteristics were studied. A set of probabilistic definitions, called Likelihood of Number of Failures (LNF), was introduced for use in investigating the characteristics of this random process.

Using the database of CWW, LNF values were empirically calculated. Theoretical LNF values were derived from a Weibull distribution model as an example of general parametric lifetime models. It was demonstrated that the LNF values derived from classical lifetime models are time-invariant, while their empirical values were shown to vary from one season to the next. Therefore, the failure of water pipes is a non-stationary random process. This result is in contrast with the underlying assumption of stationarity of the failure process in existing lifetime models.

It is emphasised that these findings and conclusions are general and

applicable to any other database of pipe failures. This chapter also reflected

a more specific investigation using the failure dataset of CWW. It was noted

that CWW is located in an area of expansive soil and soil movement was

shown to be a source of non-stationary behaviour of failure represented in

this database.

To investigate the effect of soil movements (caused by rainfall) on failure rate variations, the variations of the empirical failure rates (the statistical mean of the number of failures occurring each day) were studied together with the variations of rainfall in the area. Comparison of the concurrent plots of rainfall and the empirical failure rates showed that, most of the time, variations of failure rates could be explained by variations of rainfall.

As a general approach to tackle the time-varying nature of pipe failure

processes, it is suggested to regularly update the parameters of lifetime

models. While this may require demanding computational updates and

costly expert staff, the resulting predictions would be more realistic and

reliable.

As an alternative solution to the problem, a non-parametric method has been developed for efficient analysis of non-stationary pipe failure processes and is presented in Chapter 6. This technique is able to handle and automatically update dynamic models for non-stationary pipe failure processes.

Table 5.3: Properties of different texture types found in soils (Northcote)

| Texture symbol | Behaviour of moist bolus |
|---|---|
| S | Coherence nil to very slight, cannot be moulded; sand grains of medium size; single sand grains stick to fingers. |
| LS | Slight coherence; sand grains of medium size; can be sheared between thumb and forefinger to give a minimal ribbon of 5 mm. |
| CS | Slight coherence; sand grains of medium size; sticky when wet; many sand grains stick to fingers; will form a minimal ribbon of 5-15 mm; discolours fingers with clay stain. |
| SL | Bolus coherent but very sandy to touch; will form a ribbon of 15-25 mm; dominant sand grains are of medium size and are readily visible. |
| FSL | Bolus coherent; fine sand can be felt and heard when manipulated; will form a ribbon of 13-25 mm; sand grains are clearly evident under a hand lens. |
| SCL | Bolus strongly coherent but sandy to touch; grains dominantly medium sized and easily visible; will form a ribbon of 2-2.5 cm. |
| L | Bolus coherent and rather spongy; smooth feel when manipulated but no obvious sandiness or 'silkiness'; somewhat greasy to the touch if much organic matter present; will form a ribbon of 25 mm. |
| Lfsy | Bolus coherent and slightly spongy; fine sand can be felt and heard when manipulated; will form a ribbon of 25 mm. |
| ZL | Coherent bolus; silky when manipulated; will form a ribbon of 25 mm. |
| SCL | Strongly coherent bolus, sandy to the touch; medium sized sand grains visible in finer matrix; will form a ribbon of 25-40 mm. |
| CL | Coherent plastic bolus; smooth to manipulate; will form a ribbon of 40-50 mm. |
| CLS | Coherent plastic bolus; medium sized sand grains visible in finer matrix; will form a ribbon of 40-50 mm. |
| ZCL | Coherent smooth bolus, plastic and silky to the touch; will form a ribbon of 40-50 mm. |
| SC | Plastic bolus; fine to medium sands can be seen, felt or heard in clayey matrix; will form a ribbon of 50-75 mm. |
| ZC | Plastic bolus; smooth and silky to manipulate; will form a ribbon of 50-75 mm. |
| LC | Plastic bolus; smooth to touch; slight resistance to shearing; will form a ribbon of 50-75 mm. |
| LMC | Plastic bolus; smooth to touch; slight to moderate resistance to forming a ribbon; will form a ribbon of 75 mm. |
| MC | Smooth plastic bolus; can be moulded into a rod without fracturing; has moderate resistance to forming a ribbon; will form a ribbon of 75 mm+. |
| MHC | Smooth plastic bolus; can be moulded into a rod without fracturing; has a moderate to firm resistance to forming a ribbon; will form a ribbon of 75 mm or more. |
| HC | Smooth plastic bolus; can be moulded into rods without fracturing; has firm resistance to forming a ribbon; will form a ribbon of 75 mm+. |

Chapter 6

A Non-Parametric Technique for Failure Prediction of Deteriorating Components

6.1 Introduction

The statistical models that are currently available for failure prediction

of water pipes were reviewed and discussed in Chapter 2. These models

were classified broadly into Deterministic, Probabilistic multi-variate and

Probabilistic single-variate models applied to grouped data (Kleiner and

Rajani 2001). Most of the existing models fall within the category of de-

terministic models. However, the focus of this study is on probabilistic

models.

A probabilistic model, unlike deterministic models, does not merely re-

turn a certain (predicted) quantity such as the number of failures or the date

of future failures. Instead, it focuses on estimating the failure probabilities

that can be used for developing a maintenance strategy. Using those fail-

ure probabilities, confidence intervals for the predicted quantities can also

be calculated. The concept of confidence interval and its use to develop

proactive maintenance strategies will be discussed later in this chapter.

Most breakage prediction models in water mains developed so far deal

almost exclusively with the static factors. These models may yield biased re-

sults by ignoring environmental time-varying factors in the statistical anal-

ysis of break rates (Kleiner and Rajani 2002). To model the pattern of


failure occurrences, such models assume a particular probability density

function (pdf) for the structural failures (Barraza et al. 2000, Barraza et

al. 1996, Lyu 1996, Hossain and Dahiya 1993, Goel and Okumoto 1979). In

these models, the failure process is assumed to be stationary as a simplifying

hypothesis.

For the water pipes recorded in the dataset explained in Chapter 3,

empirical LNF values were separately calculated for several consecutive

seasons in Chapter 5. The results showed that these empirical LNF values

can vary substantially from one season to another and therefore, the water

pipe failures are non-stationary random processes.

Furthermore, it was shown that the parametric models currently used for

failure prediction assume a time-invariant pdf for the time or the number of

failures (with fixed parameters) and do not take into account the variations

of failure patterns with time (Wood 1996a, Wood 1996b, Sahinoglu 1992,

Musa et al. 1987, Miller 1980, Leighton and Rivest 1986, Sukert 1976, Sukert

1979, Rajani and Tesfamariam 2005). This was clarified and mathemati-

cally demonstrated for the Weibull distribution model which is a well-known

lifetime model, widely used as a failure prediction model for many compo-

nents (Crowder et al. 1994). This limitation is, however, not specific to the

Weibull model. Indeed, as long as the distribution has constant parameters

that are not updated with time (and correspond with a stationary random

process), the probabilities of future failures are modelled as a function of

inter-failure times and not the absolute time n, and therefore the derived

LNF values are time-invariant and merely depend on k and the fixed pa-

rameters of the model.

The only statistical technique in the literature for predicting water main breaks that has considered time-dependent variables other than pipe age and the number of previous breaks is a deterministic multi-variate exponential model developed by Kleiner and Rajani (2002). In addition to the age of pipes, the time-varying factors taken into account by this model include temperature effects, soil-moisture effects, cumulative length of replaced water mains and cumulative length of cathodically protected water mains. The forecasted climate data required for this model were obtained from Fourier analysis, which assumes that climate change follows harmonic cycles. The inherent inaccuracy of climate forecasting naturally adds to the error of failure prediction.

Considering the failure records that are usually available for most water distribution systems, the variety of data needed can be a limitation of the model of Kleiner and Rajani (2002). A computer program named WARP has been developed using this model to perform analyses of historical breakage rates with or without any number of these covariates (Kleiner and Rajani 2003). However, when one or more covariates are selected, the results reflect background ageing (which is the consistent increase in pipe breakage rate due to corrosion and other steady, continuous deterioration processes) as well as annual variations due to the influence of those time-varying factors.

In addition, the model of Kleiner and Rajani (2002) should be applied

to long histories of failures. In some cases, the available dataset is short and

includes a period of predominantly decreasing breakage rates. Applying this

model to such datasets may yield results that are counter-intuitive, such as

positive effect of ageing and/or negative effects of replacement. For this reason, the authors suggested that the model should be used judiciously and the outcome of the analysis interpreted with caution.

Filling the existing gap between the non-stationary random nature of component failure processes and parametric failure prediction models has been the main focus of this study. This chapter presents a novel non-parametric estimation technique for prediction of future LNF values. The proposed technique does not have any constant model parameters and therefore, it does not model the random process of failures based on stationarity assumptions. More precisely, the proposed technique complies with the non-stationary characteristics of the pipe failures.

As will be explained later in detail, given the LNF values, one

can predict the future number of failures within any given period of time.

Moreover, these values can be used to compute accurate confidence intervals

(lower and upper bounds) for such a prediction. Therefore, prediction of

the LNF values of the component failure process is the main point of focus

of the technique that is proposed in this chapter.

In this study, the same notations and definitions introduced in Chapter 5 are followed. The time interval is chosen to be one day. The LNF values vary only slightly within a time period and are therefore assumed constant for the heuristic calculation of empirical LNF values using the histogram technique. For the water pipe failures recorded in the above-mentioned history, such an assumption is applicable for a time period of one season (90 time intervals).

In the next section, an algorithm for prediction of future LNF values using maximum likelihood estimation is given. Section 6.4 discusses the use of the predicted LNF values to calculate the expected number of failures in the immediate next time interval and to obtain confidence intervals for the estimated number of failures.

Figure 6.1: Inter-failure time, showing an example where µ2 = 4T. [Diagram omitted; failure occurrences are marked on a time axis with ticks from (n−5)T to nT.]

The prediction scheme is then extended to estimate the total number of failures that will occur during upcoming multiple time intervals. Application of the proposed technique on the failure history of water pipes described in Chapter 5 is then demonstrated in Section 6.6 followed by the conclusions of the study.

6.2 Maximum Likelihood Estimation of Future LNF Values

The inter-failure time elapsed between two consecutive NOFk events is denoted by µk. Figure 6.1 shows an instance of µ2 (the inter-failure time between two NOF2 events).

If NOFk(nT) is a true statement (i.e. k failures occur during the time interval [(n − 1)T, nT]), then the probability of next µk = m (the next NOFk to occur "m" time intervals later) is given by:

$$\Pr(\text{next } \mu_k = m) = \Pr(\Omega_m)$$

where

$$\Omega_m \equiv\; \sim NOF_k((n+1)T) \wedge \dots \wedge \sim NOF_k((n+m-1)T) \wedge NOF_k((n+m)T) \tag{6.1}$$

where ∼ means the logical "NOT". The above expression can be calculated as follows:

$$\Pr(\text{next } \mu_k = m) = (1 - P_k)^{m-1} P_k. \tag{6.2}$$

Equation (6.2) expresses a relationship between future LNF values and inter-failure times. This can be utilised to turn the problem of prediction of LNF values into the problem of estimation of the next inter-failure time. Later in this chapter, it is mathematically proven that this approach results in more certain and reliable predictions.
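Equation (6.2) is the probability mass function of a geometric distribution. The following minimal sketch evaluates it, assuming (as the equation does) that Pk stays constant over the next m intervals and that the intervals are independent. The function name and the value Pk = 0.25 are hypothetical.

```python
def prob_next_interfailure(p_k, m):
    """Pr(next mu_k = m) from Eq. (6.2): m - 1 intervals without a NOF_k
    event, followed by one interval in which NOF_k occurs."""
    if not 0.0 < p_k <= 1.0:
        raise ValueError("p_k must lie in (0, 1]")
    if m < 1:
        raise ValueError("m must be a positive integer")
    return (1.0 - p_k) ** (m - 1) * p_k


# Hypothetical LNF value; the probabilities over m = 1, 2, ... sum to one.
p_example = 0.25
pmf = [prob_next_interfailure(p_example, m) for m in range(1, 200)]
print(pmf[:3], sum(pmf))
```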


Using a maximum likelihood estimation approach to prediction of LNF

values, the first step is to calculate the likelihood of LNF values for a given

inter-failure time in the future. More precisely, the following question is

to be answered: Given that “next µk = m” what is the likelihood of a

set of LNF values, Pk, during the next m time intervals? The Maximum

Likelihood (ML) estimates of the LNF values are then the values with the

highest likelihood for the observed inter-failure data.

To obtain an ML estimate, denoted by $\hat{P}_k$, the likelihood of the observed data, which include the next inter-failure time $\mu_k(\text{next}) = m$ (yet to be estimated), is maximised. From Equation (6.2), this likelihood is given by the following equation:

$$\Pr\left(\mu_k(\text{next}) = m \mid P_k\right) = P_k (1 - P_k)^{m-1} \tag{6.3}$$

The maximum of the above likelihood is derived by solving the algebraic equation:

$$\frac{d\left(P_k (1 - P_k)^{m-1}\right)}{dP_k} = 0 \tag{6.4}$$

which results in $\hat{P}_k = \frac{1}{m}$.
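The step from Equation (6.4) to this estimate can be written out explicitly. The following short derivation assumes m ≥ 2 and 0 < Pk < 1, so that the factor $(1 - P_k)^{m-2}$ is non-zero:

$$\frac{d}{dP_k}\left[P_k (1 - P_k)^{m-1}\right] = (1 - P_k)^{m-1} - (m-1) P_k (1 - P_k)^{m-2} = (1 - P_k)^{m-2} (1 - m P_k) = 0 \;\Rightarrow\; P_k = \frac{1}{m}.$$

The derivative is positive for Pk < 1/m and negative for Pk > 1/m, so this stationary point is a maximum. For m = 1 the likelihood reduces to Pk, which is maximised at Pk = 1 = 1/m, so the same expression holds.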

On the other hand, for each time interval, the probabilities of all the NOFk's should sum to one:

$$\sum_{k=0}^{M} P_k = 1 \tag{6.5}$$

Therefore, a normalisation factor ξ is introduced into the solution of Equation (6.4) and the following final formula is derived for the ML estimates of the LNF values:

$$\hat{P}_k = \frac{\xi}{\hat{\mu}_k}; \qquad \xi = \left(\sum_{i=0}^{M} \hat{\mu}_i^{-1}\right)^{-1}. \tag{6.6}$$

The future inter-failure times are denoted by $\hat{\mu}_i$ in the above formula, as they also need to be estimated. However, it is shown in the next section that the above maximum likelihood estimates of LNF values are highly robust to inaccuracies involved in the estimation of inter-failure times.
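A minimal sketch of Equation (6.6) is given below. It takes a list of predicted inter-failure times (one per k = 0, ..., M) and returns the normalised LNF estimates. The function name and the three predicted inter-failure times are hypothetical.

```python
def ml_lnf_estimates(predicted_mu):
    """Normalised ML estimates of the LNF values (Eq. 6.6).

    predicted_mu[k] is the predicted next inter-failure time for NOF_k,
    expressed in time intervals; all entries must be positive.
    """
    if any(mu <= 0 for mu in predicted_mu):
        raise ValueError("inter-failure times must be positive")
    xi = 1.0 / sum(1.0 / mu for mu in predicted_mu)  # normalisation factor
    return [xi / mu for mu in predicted_mu]


# Hypothetical predicted inter-failure times for k = 0, 1, 2.
mu_hat = [2.0, 4.0, 4.0]
p_hat = ml_lnf_estimates(mu_hat)
print(p_hat)
```

With these inputs the unnormalised estimates 1/µk already sum to one, so ξ = 1; in general ξ rescales them so that Equation (6.5) holds.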

6.3 Required Level of Accuracy for the Inter-Failure Times

To predict the LNF values by Equation (6.6), future inter-failure times should be predicted first. Before presenting a technique to predict the future inter-failure times, the effect of the accuracy of this prediction on the predicted LNF values is investigated.

Figure 6.2: Variance of the estimated inter-failure times is a decreasing function of the maximum likelihood estimates of Pk values. [Plot omitted; horizontal axis: ML estimate of Pk; vertical axis: Variance of inter-failure estimates.]

From Equation (6.2), the inter-failure times are observed to have geometric distributions with parameter Pk. Therefore, in the process of maximum likelihood estimation of LNF values, the variance of the inter-failure times is given by:

$$\mathrm{VAR}(\mu_k) = \frac{1 - P_k}{P_k^2}. \tag{6.7}$$

Bearing in mind that, in estimation theory, variance is an indicator of the error of estimation, Equation (6.7) and its plot in Figure 6.2 show that the error of inter-failure time prediction decreases as the LNF values predicted from the inter-failure times by Equation (6.6) increase.

The larger the LNF estimates are, the more significant is their effect on the prediction of future failures. Such large LNF values need to be accurately estimated and the level of accuracy for small LNF values is not important. Equation (6.7) and Figure 6.2 show that the inter-failure times that are estimated more accurately (and have a smaller variance) correspond to such large Pk's. In other words, a high accuracy of prediction for the inter-failure time µk is required, and indeed occurs, only when the corresponding LNF value Pk is large.
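The decreasing behaviour of Equation (6.7) can be checked numerically with a short sketch. The function name and the sample Pk values are hypothetical.

```python
def interfailure_variance(p_k):
    """Variance of a geometrically distributed inter-failure time (Eq. 6.7)."""
    if not 0.0 < p_k <= 1.0:
        raise ValueError("p_k must lie in (0, 1]")
    return (1.0 - p_k) / p_k ** 2


# Hypothetical LNF values in increasing order.
sample_p = [0.1, 0.25, 0.5, 0.9]
variances = [interfailure_variance(p) for p in sample_p]
print(variances)
```

For these sample values the variance falls from about 90 at Pk = 0.1 to about 0.12 at Pk = 0.9, in line with the trend described for Figure 6.2.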

If, for some k, the LNF value Pk is large, then the events NOFk occur frequently and there are many short consecutive inter-failure times in each period of time. "Finite Impulse Response" (FIR) filters (discussed in the next section) are applied to predict the inter-failure times. Since the accuracy of such filters increases with the amount of available data, when the LNF value for some k is large (and many short consecutive µk's appear in the recent data), the inter-failure time will automatically be predicted more accurately than for other k's with small LNF. This is a point of strength of the proposed technique that guarantees a satisfying performance for prediction of Pk values by µk estimation using an FIR filter.

6.4 Prediction of Inter-Failure Times

Given a failure history for a class of assets, before predicting future inter-

failure times, previous µk’s need to be computed. The set of µk’s for each k

is empirically calculated by simply determining the times elapsed between

each two consecutive NOFk’s.

For each time interval [(n − 1)T, nT], the inter-failure time µk(nT) is taken to be the number of time units between the two most recent NOFk

events. The µk(nT ) values are initially set to zero at n = 0 and NOFk

events for all k’s are assumed to have occurred at this initial time. The

error incurred by inaccuracy of such initialisation will be transient and will

fade out after training data are applied. More precisely, after the estimation

scheme is initialised and failure data from a history of recent failures (up

to present - most recent failures) are given as inputs, the transient effects

of initialisation will fade out. This is a well-known property of robust and

stable recursive estimation techniques such as the one introduced in this

chapter.

At each time nT, the number of failures occurring during the interval [(n − 1)T, nT] is recorded. If this number is m, then NOFm has occurred during this interval and its corresponding inter-failure time, µm(nT), is updated to the number of time intervals since the last occurrence of NOFm.
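
The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under the stated initialisation (all µk set to zero and every NOFk assumed to have occurred at n = 0); the function name, the argument names and the sample failure counts are hypothetical.

```python
def track_interfailure_times(counts, max_failures):
    """Return mu, where mu[k][n] is the inter-failure time mu_k at time nT.

    counts[n - 1] is the number of failures recorded in [(n - 1)T, nT].
    """
    size = max_failures + 1
    last_seen = [0] * size   # every NOF_k is assumed to have occurred at n = 0
    current = [0] * size     # mu_k values are initialised to zero
    mu = [[0] for _ in range(size)]
    for n, m in enumerate(counts, start=1):
        if not 0 <= m <= max_failures:
            raise ValueError("failure count outside [0, max_failures]")
        # Only mu_m changes: intervals elapsed since the last NOF_m.
        current[m] = n - last_seen[m]
        last_seen[m] = n
        for k in range(size):
            mu[k].append(current[k])
    return mu


# Hypothetical record of failures per interval (not from the CWW database).
history = track_interfailure_times([0, 1, 0, 0, 2, 1], max_failures=2)
for k, series in enumerate(history):
    print(f"mu_{k}:", series)
```

At every step exactly one series changes, while the others hold their previous value.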

Figure 6.3 shows a case example in which the number of failures occur-

ring during 20 consecutive time intervals are used to calculate the inter-



[Figure 6.3 plots. Top panel: No. of Failures (0 to 4) in each of 20 time intervals. Lower panels: inter-failure times µ0, µ1, µ2, µ3 and µ4. Horizontal axis: Time (in terms of time intervals), 1 to 20.]

Figure 6.3: The number of failures occurring during 20 consecutive time

intervals (top plot) and the inter-failure times, obtained from these data.

The µk’s are constant and change only when an NOFk occurs. Therefore,

at each time, only one µk changes and the rest stay at the same value.

failure times. Since at most four failures occurred during a single time interval in the data, only µ0, µ1, µ2, µ3 and µ4 are calculated. As shown in Figure 6.3, for each k, the µk values are constant over time and change only when an NOFk occurs. Therefore, at each time, only one µk changes and the rest stay unchanged. This characteristic

of the consecutive inter-failure times will be revisited later in this chapter

to clarify some properties of the proposed non-parametric failure prediction

method.

Given the set of µk’s for all possible k’s, the number of time intervals between the most recent NOFk and the next anticipated NOFk needs to be predicted for each k. A specific FIR filter is proposed here to predict the next µk as a weighted average of the J + 1 most recent µk’s. J is a user-defined parameter that can be tuned for best performance. The weights are chosen in such a way that more recent inter-failure times have larger effects on the


predicted value. The filter input-output equation is as follows:

$$\mu_k((n+1)T) = \frac{\sum_{i=0}^{J} \lambda^{i} \, \mu_k((n-i)T)}{\sum_{i=0}^{J} \lambda^{i}} \tag{6.8}$$

where λ ∈ [0, 1] is a forgetting factor to give more recent inter-failure times

more influence on the prediction process. The denominator term is a nor-

malising factor to guarantee that FIR coefficients sum to one (as expected

in an averaging scheme).

In the special case of λ = 1, the FIR filter performs a simple averaging

over the recent data. Using the J + 1 recent µk’s rather than the whole

data is particularly necessary when dealing with large sets of data. The

parameters J and λ are tuned by using a portion of failure history data and

validated by the rest of history database.
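
A minimal sketch of the filter in Equation (6.8) is given below. The function and argument names are illustrative assumptions. The handling of histories shorter than J + 1 values (using whatever is available) is also an assumption made for this sketch and is not specified in the text.

```python
def predict_next_interfailure(history, J, lam):
    """Weighted average of the J + 1 most recent values, as in Equation (6.8).

    history[-1] is mu_k(nT), history[-2] is mu_k((n - 1)T), and so on.
    """
    if not history:
        raise ValueError("history must contain at least one value")
    if J < 0:
        raise ValueError("J must be non-negative")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    recent = history[-(J + 1):][::-1]                 # most recent value first
    weights = [lam ** i for i in range(len(recent))]  # lam**0 == 1 always
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)


# Hypothetical inter-failure times for one value of k.
sample = [2, 4, 6]
print(predict_next_interfailure(sample, J=2, lam=1.0))  # plain average
print(predict_next_interfailure(sample, J=2, lam=0.5))  # recent values weigh more
```

With lam = 1 the filter reduces to a simple average, and with smaller lam the prediction moves towards the most recent inter-failure time.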

To summarise, the prediction of LNF values for the immediate next

time interval is performed in three steps:

1. Having the failure database, the inter-failure times, µk’s, are calculated for all k’s, for the most recent J + 1 time intervals.

2. The next inter-failure times, µk(n + 1), are calculated using Equation (6.8) for all k’s.

3. The LNF values at the next time interval, Pk(n + 1)’s, are estimated using Equation (6.6).

6.5 Failure Prediction Using the Estimated LNF Values

6.5.1 Prediction of number of failures

Having the estimated LNF values, one can easily calculate the expected

number of failures as a prediction for the number of failures that will occur

during the next time interval:

$$\mathrm{ENOF}(n+1) = \sum_{k=0}^{M} k \, P_k(n+1). \tag{6.9}$$

In the above equation, the expected number of failures is derived as the

statistical mean of number of failures. Statistical mean is the most mean-

ingful measure for the purpose of this study. For instance, if the mode of



[Figure 6.4 plot. Horizontal axis: Number of Failures (k), 0 to 8. Vertical axis: LNF Value (Pk), 0 to 0.25. Annotation: ENOF=4.2 (rounded to 4), Mode=7.]

Figure 6.4: A case example to show the unreasonable results with using the

mode of distribution instead of the statistical mean of number of failures

for prediction.

distribution of number of failures (the number of failures, k, corresponding

with the largest LNF value) is chosen as the predicted number of failures,

in many circumstances, particularly when the time interval is short, it will

result in unreasonable predictions.

For a “one day” time interval, a typical column plot of LNF values is shown in Figure 6.4. The maximum LNF value corresponds with k = 7. However, since another LNF value (P2) is close to P7, the statistical mean of the number of failures calculated by Equation (6.9) is 4.2, rounded to 4, which appears to be a more logical prediction for the next number of failures.
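
The contrast between the mean and the mode can be reproduced with a small sketch. The LNF values below are invented for illustration, with two peaks in the spirit of Figure 6.4; they are not the values plotted in the figure, and the function names are assumptions.

```python
def expected_number_of_failures(lnf):
    """Statistical mean of the number of failures, as in Equation (6.9)."""
    return sum(k * p for k, p in enumerate(lnf))


def mode_of_failures(lnf):
    """Number of failures k with the largest LNF value."""
    return max(range(len(lnf)), key=lambda k: lnf[k])


# Hypothetical LNF values P_0..P_7 with peaks at k = 2 and k = 7.
lnf = [0.05, 0.10, 0.22, 0.12, 0.08, 0.08, 0.10, 0.25]
print("ENOF:", round(expected_number_of_failures(lnf), 2))
print("Mode:", mode_of_failures(lnf))
```

For this assumed distribution the mode sits at the upper end of the range, while the mean falls between the two peaks.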

6.5.2 Confidence intervals

In addition to the direct prediction of the number of failures, LNF values can also be used to calculate Confidence Intervals (CI) for the predictions given by Equation (6.9). The interval [ENOF − δ, ENOF + δ] is an α–confidence interval (α–CI) for the predicted number of failures, if:

Pr (x ∈ [ENOF − δ, ENOF + δ]) = α. (6.10)

To calculate the above α–CI, the LNF values corresponding with the

right and left neighbourhoods of ENOF are progressively added until the


sum equals (or exceeds) α. The α–CI is then determined as the interval

within the final left and right neighbourhoods.
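
One possible reading of this procedure is sketched below. It assumes that the neighbourhood is centred on ENOF rounded to the nearest integer, that it grows by one failure count on each side per step, and that it is clipped to the range [0, M]; these details, like the function name, are assumptions made for illustration rather than the thesis's exact implementation.

```python
def confidence_interval(lnf, alpha):
    """Grow a symmetric neighbourhood around ENOF until it holds mass >= alpha."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    top = len(lnf) - 1
    enof = sum(k * p for k, p in enumerate(lnf))
    centre = int(enof + 0.5)  # assumed: round ENOF to the nearest integer
    for delta in range(top + 1):
        lo, hi = max(0, centre - delta), min(top, centre + delta)
        # Small tolerance guards against floating-point round-off in the sum.
        if sum(lnf[lo:hi + 1]) >= alpha - 1e-12:
            return lo, hi
    return 0, top  # reached only if the LNF values sum to less than alpha


# Hypothetical LNF values P_0..P_4.
lnf = [0.1, 0.2, 0.4, 0.2, 0.1]
print(confidence_interval(lnf, 0.80))
```

Because the number of failures is discrete, the probability covered by the returned interval generally exceeds α rather than matching it exactly.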

6.5.3 Failure prediction for multiple future time intervals

The three-step prediction scheme results in ML estimates for the Pk values

corresponding with the failures likely to occur during a single time interval

in the future. Using these LNF values and Equation (6.9) and Equation

(6.10), one can predict the number of failures and estimate the confidence

interval for a single time interval.

Since the time interval is short, failure prediction is often required for longer periods that span multiple time intervals.

Therefore, the LNF values for the number of failures during the next L

time intervals are required to be estimated using the Pk values estimated

for a single time interval. The following theorem in probability theory can

be used to estimate the required LNF values (Stark 1994):

Theorem: If K1 and K2 are two independent discrete random variables with their probability mass functions (pmf) given as $P^1_0, P^1_1, \ldots, P^1_M$ and $P^2_0, P^2_1, \ldots, P^2_M$, then the pmf of the sum of the two variables K = K1 + K2 is given by the convolution of the two pmf’s as follows:

$$\Pr(K_1 + K_2 = k) = P^1_k * P^2_k = \sum_{i=0}^{k} P^1_i \, P^2_{k-i}\,; \quad 0 \le k \le 2M. \tag{6.11}$$

If K1 and K2 are the numbers of component failures in an infrastructure system during two different time intervals, then based on the assumptions made so far in this chapter, $P^1_i$ and $P^2_i$ are zero for i > M. Thus, the above convolution results in the following formulae for the LNF values corresponding to the total number of failures over the two time intervals:

$$\Pr(K_1 + K_2 = k) = \begin{cases} \sum_{i=0}^{k} P^1_i \, P^2_{k-i} & \text{if } 0 \le k \le M \\ \sum_{i=k-M}^{k} P^1_i \, P^2_{k-i} & \text{if } M < k \le 2M \\ 0 & \text{otherwise.} \end{cases} \tag{6.12}$$

The past failure history information used to predict the LNF values

does not imply any difference between the likelihood of failures occurring

in the immediate next time interval or in the second next time interval or

the next one, and so on. Therefore, at each time, it is assumed that the

estimated LNF values apply to all future (single) time intervals. Based on



[Figure 6.5 plot. Horizontal axis: Number of failures per day (k), 0 to 6. Vertical axis: Pk estimates for the number of failures per day, 0 to 0.25.]

Figure 6.5: An example of LNF values for the number of failures per day,

to be the base of computation of weekly and monthly LNF values.

this assumption, Equation (6.12) can be used to calculate the LNF values

for two time intervals by substituting:

$$P^1_k = P^2_k = P_k(n+1)\,; \quad \text{for all } k \in [0, M] \tag{6.13}$$

where Pk(n+1) is the predicted LNF value for the next single time interval,

given by the three steps listed in Section 6.4 and will be denoted by Pk,

henceforward, for short.

By generalising the convolution theorem and Equation (6.11), the LNF

values corresponding to the time period that includes the next L time in-

tervals can be calculated as follows:

$$\text{LNF for } k \text{ failures during the next } L \text{ time intervals} = \underbrace{P_k * P_k * \cdots * P_k}_{(L-1)\ \text{convolutions}}. \tag{6.14}$$

In order to study the effect of convolution on the LNF values for longer periods of time, an example is presented in Figures 6.5–6.7. A typical set of LNF values estimated using the three steps listed in Section 6.4 for a single time interval (one day) is shown in Figure 6.5. Using Equation (6.14) with L = 7, the LNF values for a week are estimated through six successive convolutions of the daily LNF values, as shown in Figure 6.6. Similarly, the LNF values are calculated for a one-month period (L = 30) and shown in Figure 6.7.
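
The repeated convolution of Equation (6.14) can be sketched as follows. The function names and the daily LNF values are hypothetical; the sketch only assumes, as in the text, that the same single-interval LNF values apply to every future time interval.

```python
def convolve(p, q):
    """Discrete convolution of two pmf's given as lists (Equation 6.11)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def lnf_for_period(single_interval_lnf, L):
    """LNF values for the total number of failures over L time intervals."""
    if L < 1:
        raise ValueError("L must be at least 1")
    result = list(single_interval_lnf)
    for _ in range(L - 1):  # L - 1 successive convolutions
        result = convolve(result, single_interval_lnf)
    return result


# Hypothetical daily LNF values P_0..P_3 (assumed, not from the CWW data).
daily = [0.4, 0.3, 0.2, 0.1]
weekly = lnf_for_period(daily, L=7)
print(len(weekly) - 1)         # largest possible weekly count, 7 * 3
print(round(sum(weekly), 6))   # the weekly LNF values still sum to one
```

By linearity of expectation, the mean of the resulting distribution is L times the single-interval ENOF, which offers a quick sanity check on the output.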

The results of convolution in the above example show that the distribution of the number of failures during long periods of time (large L)


[Figure 6.6 plot. Horizontal axis: Number of failures per week (k), 0 to 40. Vertical axis: Pk estimates for the number of failures per week, 0 to 0.08.]

Figure 6.6: LNF values for the number of failures per week, based on the

daily LNF values plotted in Figure 6.5.

[Figure 6.7 plot. Horizontal axis: Number of failures per month (k), 0 to 150. Vertical axis: Pk estimates for the number of failures per month, 0 to 0.04.]

Figure 6.7: LNF values for the number of failures per month, based on the

daily LNF values plotted in Figure 6.5.



approaches the normal distribution as L increases. This phenomenon is expected and is explained by the Central Limit Theorem, which states that the distribution of the mean (and therefore the sum) of L independent and identically distributed random variables (such as K1, K2, . . . , KL in the notation of this thesis) approaches a normal distribution as L increases.

6.5.4 A step-by-step algorithm for failure prediction

A step-by-step algorithm for the proposed failure prediction technique is

shown in Figure 6.8. The algorithm is presented in such a way that the prediction mechanism adapts itself to variations in the failure pattern.

More precisely, the predicted number of failures, likely to occur during the

next L time intervals, is updated after each time interval. The algorithm

works as explained below:

At the end of the current time interval, the recorded number of failures

(that have actually occurred during that time interval) is given as an input

to the algorithm. If this number is k, a corresponding new inter-failure

time µk is calculated and appended to the database of the recent inter-

failure times as a new record. Such a database would comprise an ensemble of inter-failure times (µ0’s, µ1’s, . . ., and µM’s) recorded during the history of operation of the infrastructure system.

Using the updated inter-failure times, the next inter-failure times (for

different number of failures, k, to occur during a single time interval) are

updated by using Equation (6.8), and the maximum likelihood estimates of the LNF values (for a single time interval) are calculated by Equation (6.6).

Then the LNF values corresponding to the number of failures occurring

during next L time intervals are predicted by using Equation (6.14).

Using the LNF values, the expected number of failures (that will possi-

bly occur during the next L time intervals) and its α–CI are calculated by

Equations (6.9) and (6.10) and displayed as the current prediction.

As the next time interval expires and the actual number of failures during that time interval is recorded and input to the algorithm, the above procedure is repeated and all predictions are updated for the coming time intervals. This regular updating makes the proposed algorithm adaptive to the non-stationary nature of component failures.


Inputs:

– Maximum number of failures, M, that can occur in each time interval (e.g. day);
– The number of recent inter-failure times, J, to be used by the FIR filter to predict the next inter-failure times;
– The forgetting factor, λ ∈ [0, 1], for the FIR filter to predict the next inter-failure times;
– The number of time intervals, L, in each future time period for which the LNF values, the number of failures and its confidence interval are to be predicted;
– The confidence interval threshold, α ∈ [0, 1]; and
– A dataset containing the number of past failures that occurred in each time interval for a class of pipes.

Repeat the following steps to update the predictions at the end of each time interval:

1- Input the number of failures in the most recent time interval;
2- Update the past inter-failure times, µk’s;
3- For each k ∈ [0, M], estimate the next inter-failure time, µk(n + 1), using Equation (6.8);
4- For each k ∈ [0, M], estimate the LNF values for the next single time interval, Pk, using Equation (6.6);
5- For each k ∈ [0, M], estimate the LNF values for the next time period (L time intervals), using Equation (6.14);
6- Predict the number of failures during the next period of time (L time intervals), ENOF, using Equation (6.9);
7- Calculate the α-CI of the predicted number of failures, using Equation (6.10);
8- Display the outputs: the predicted number of failures for the given time period in the future, ENOF, and its α-CI, [ENOF − δ, ENOF + δ].

Figure 6.8: A step-by-step algorithm for the proposed non-parametric fail-

ure prediction technique.



6.6 Results of Failure Prediction Using the Proposed Non-Parametric Technique

The proposed technique is applied to predict the failures of water pipes

using the CWW failure database. The database of this study, as explained

in Chapter 3, is provided by City West Water PTY LTD (CWW) that

supplies the western suburbs of Melbourne.

The proposed technique is applied and its performance evaluated for

homogeneous classes of pipes. Thus, pipes are classified and failure analysis

of this study is performed separately for each group. The recorded characteristics of the pipes in the database of this study are typical of what is available in most water distribution systems, and this study aims at developing practical models that can be tuned for similar kinds of data.

It is important to note that when the database contains a diverse range of information and sufficient data points, pipe classification can be narrow and specific, resulting in more accurate predictions using the proposed technique.

As described in Chapter 3, the pipes of existing failure history are clas-

sified based on their size, material, and location. For instance, one class

of pipes includes all Cast Iron Cement Lined (CICL) pipes with 100mm

diameter and located in the postcode area 3021 Melbourne, Victoria, Aus-

tralia. For each class of pipes, the algorithm is run separately and the future

failures are predicted accordingly.

The time interval is set to “one day”. By trial and error, J = 29 and λ = 0.90 are found to be suitable choices for the inter-failure time prediction by Equation (6.8). More precisely, the 30 most recent NOFk’s are considered as the most effective data in the process of inter-failure time prediction, and the forgetting factor in the FIR filter is set to 0.90.

The algorithm described in the previous section and shown in Figure 6.8

requires an initial seed of past inter-failure times for prediction of future

inter-failure times and the corresponding LNF values. For this purpose, the failure records of the first year (1997) are used as an initial seed and the LNF values and expected numbers of failures are predicted by the proposed technique for the following three years, 1998-2000.

In three separate studies, weekly (L = 7), monthly (L = 30) and quarterly (L = 90) prediction periods are considered. As an indicator for the reliability of estimations, 80% confidence intervals (α = 0.80) are also calculated for each ENOF prediction. The results of weekly, monthly and quarterly failure predictions for the CICL pipes with 100mm diameter, located in the postcode 3021, are plotted in Figures 6.9, 6.10 and 6.11, respectively.

[Figure 6.9 plot. Vertical axis: Number of Failures (0 to 16). Horizontal axis: seasons from Spr.97 to Win.00. Legend: 80% CI lower bound, 80% CI upper bound, ENOF, True number of failures.]

Figure 6.9: Expected number of failures and their 80% confidence intervals based on weekly updating, for CICL pipes with 100mm diameter located in postcode 3021.

In contrast to previous prediction techniques in the literature, the pro-

posed non-parametric method gives confidence intervals for the predicted

number of failures. Thus, a rejection rate is defined below to quantitatively

evaluate the performance of the prediction technique:

In an experiment, the actual number of failures that have occurred dur-

ing each of N consecutive time periods (in this case study, there are N = 156

weeks, N = 36 months, or N = 12 seasons for the validation data over three

years) are recorded and denoted here by NOF1, . . ., NOFN . On the other

hand, these quantities are predicted by some technique (using previous data,

e.g. the failure record of the first year in our case study) and the predicted

values are denoted by ENOF1 ,. . . , ENOFN . For each predicted number

ENOFi, the prediction technique also provides a lower and upper bound



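[Figure 6.10 plot. Vertical axis: Number of Failures (0 to 50). Legend: 80% CI lower bound, 80% CI upper bound, ENOF, True number of failures.]

Figure 6.10: Expected number of failures and their 80% confidence intervals based on monthly updating, for CICL pipes with 100mm diameter located in postcode 3021.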

(confidence interval), respectively denoted by Li and Ui. The rejection rate of the prediction technique, evaluated by the experiment, is the fraction of time periods for which the actual number of failures does not fall within its estimated confidence interval:

$$\text{rejection rate} \triangleq \frac{\sum_{i=1}^{N} I\left(\mathrm{NOF}_i \notin [L_i, U_i]\right)}{N} \tag{6.15}$$

where I is the indicator function:

$$I(x) = \begin{cases} 1 & \text{if } x \text{ is true} \\ 0 & \text{otherwise.} \end{cases} \tag{6.16}$$

The lower the rejection rate, the better the performance of failure prediction.
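
The rejection rate of Equation (6.15) is straightforward to compute once the actual counts and the interval bounds are available. The sketch below uses hypothetical names and made-up numbers purely for illustration.

```python
def rejection_rate(actual, lower, upper):
    """Fraction of periods whose actual count falls outside [L_i, U_i]."""
    if not actual or not (len(actual) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    misses = sum(
        1 for x, lo, hi in zip(actual, lower, upper) if not lo <= x <= hi
    )
    return misses / len(actual)


# Hypothetical counts and confidence bounds for four periods.
actual = [3, 5, 9, 2]
lower = [2, 4, 5, 3]
upper = [4, 6, 8, 5]
print(rejection_rate(actual, lower, upper))
```

In this made-up example two of the four actual counts fall outside their intervals.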

Table 6.1 shows the rejection rates of the proposed prediction technique

for weekly, monthly and quarterly updating time periods in our case study.

It is observed that the rejection rates increase (and the performance of


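[Figure 6.11 plot. Vertical axis: Number of Failures (10 to 90). Legend: 80% CI lower bound, 80% CI upper bound, ENOF, True number of failures.]

Figure 6.11: Expected number of failures and their 80% confidence intervals based on quarterly updating, for CICL pipes with 100mm diameter located in postcode 3021.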

Table 6.1: Rejection rate for weekly, monthly and quarterly updating

Updating Period               Weekly    Monthly   Quarterly
Total No. of Points           144       36        12
No. of Correct Predictions    136       33        9
Rejection Rate                5.56%     8.33%     25.00%

prediction decreases) as the length of the time period increases. In other words, the predicted LNF values and number of failures are highly accurate for the next day, accurate for the next week, less accurate for the next month, and far less accurate for the next season.

The proposed technique and the developed ANN model both aim to solve the same problem with different approaches. The ANN model benefits from a universal estimation technique well known in reliability estimation of mechanical systems. The technique developed in this chapter is a non-parametric model that is self-updating.

[Figure 6.12 plot. Vertical axis: Number of Failures (0 to 50). Legend: 80% CI lower bound, 80% CI upper bound, ENOF, True number of failures, Predicted by Averaging.]

Figure 6.12: Expected number of failures and 80% confidence interval based on monthly updating, compared to predictions given by simple averaging of recent records.

A simple approach to the failure analysis problem used by some managers is to predict the number of future failures within the next period of time by averaging the numbers of failures occurring during some recent time periods. For comparison purposes, this technique is also applied to predict the number of failures in each month for the case study in this chapter.

The results are illustrated in Figure 6.12. It is observed that for most of the months, the true number of failures is within the confidence interval given by the technique suggested in this chapter and is close to the predicted number of failures. In contrast, the predicted numbers of failures given by the averaging technique are mostly outside the confidence interval and far from the true numbers of failures.

In order to quantify this comparison, the Mean Square Error (MSE, the mean of the squared deviations from the true numbers of failures, as defined in Chapter 4) of both estimation techniques is calculated. The prediction error of the proposed technique, MSE=18, is less than a quarter of that of the averaging technique, MSE=76.40.
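
For reference, the averaging baseline and the MSE measure can be sketched as below. The window length, the function names and the sample counts are assumptions chosen for illustration; the text does not state how many recent periods were averaged in the case study.

```python
def moving_average_forecasts(counts, window):
    """Forecast counts[t] as the mean of the `window` counts preceding it."""
    if window < 1 or window >= len(counts):
        raise ValueError("window must satisfy 1 <= window < len(counts)")
    return [
        sum(counts[t - window:t]) / window for t in range(window, len(counts))
    ]


def mean_square_error(predicted, actual):
    """Mean of the squared deviations of predictions from the true values."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical monthly failure counts (not from the CWW database).
counts = [2, 4, 6, 8]
forecasts = moving_average_forecasts(counts, window=2)
print(forecasts, mean_square_error(forecasts, counts[2:]))
```

The same `mean_square_error` function could be applied to the ENOF predictions of any other technique so that both are scored against the same true counts.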

6.7 Conclusions

This chapter presented a non-parametric technique for failure prediction

which is compliant with the non-stationary nature of water pipes failure

processes. This technique is applied to the water pipe failure history of the

database described in Chapter 3 to estimate the expected number of failures

within a given number of time intervals in the future. Furthermore, an 80%

confidence interval is determined for the estimated number of failures.

The proposed technique implicitly considers the gradual variations of the

factors influencing the deterioration process by automatic updating of the

predictions with time. This updating is performed after each time interval

as every time interval corresponds to a new inter-failure time update. More

precisely, in every time interval, one µk is added to the record of inter-failure

times.

In this technique, the problem of predicting LNF values is turned into the prediction of inter-failure times. It is mathematically shown that, under Maximum Likelihood estimation, the more accurately predicted inter-failure times correspond to the larger LNF values. It is important to note that accuracy matters most for large LNF values. Therefore, the proposed technique is highly tolerant to possible inaccuracies in inter-failure time prediction, and this is a point of strength for the technique.

On the other hand, it was shown that the accuracy of prediction de-

creases as the time-range of prediction widens. In other words, the pre-

dicted number of failures is more accurate for the next week than the next

month.

A step-by-step algorithm was presented for the proposed non-parametric

technique in Subsection 6.5.4 and Figure 6.8. From the progressive steps de-

vised in this algorithm, it is evident that the technique continuously adapts

its predictions with the most recent changes in the failure trends and pat-

terns. This automatic adaptation is the key reason for its compliance with

the unrecorded environmental time-varying factors that affect the failure

process and make it non-stationary.

An illustrative and quantitative comparison of the accuracy of estimations performed by the proposed technique and the simple averaging technique was undertaken in this chapter. The comparison exhibited the satisfactory performance of failure predictions by the suggested technique. The best accuracy in predictions, 94.4%, was obtained from weekly updating. This corresponds to a rejection rate of 5.56% from the 80% confidence interval over a total of 144 points. The rejection rate is 8.33% for monthly updating over a total of 36 points, and much higher (25%) for quarterly updating over a total of 12 points.

It is important to note that the proposed non-parametric technique is not exclusive to failure prediction for different classes of water pipes. The method is generic and can be used for failure prediction of groups of infrastructure system components which show gradual deterioration over time. The technique is most useful in capturing the random processes of overall system degradation in terms of the failure processes of groups of components. It is not suitable for the prediction of a single component failure or for systems where performance is sensitive to the behaviour of a single component whose failure would set off a chain reaction.

Chapter 7

Conclusions and Recommendations for Further Work

7.1 Summary of Study and Achievements

Individual components of infrastructure systems such as water mains are liable to failure due to environmental interactions and stresses and also material degradation. For instance, deterioration and degradation of pipes are inevitable after several decades of service under a vast range of environmental loads. Gradual degradation of components eventually reduces their capacity to resist the imposed loads. This situation leads to the structural failure of components, such as bursts in water pipes.

With increasing customer expectations that form regulatory constraints, shifting technological paradigms, and changing reporting and accountability frameworks and requirements, the corporations that operate infrastructure systems have recognised the need to develop reliable maintenance strategies to maintain a successful course through these many demands. More specifically, planners and decision makers of water distribution systems seek cost-effective strategies to exploit the full extent of the useful life of the pipes, while meeting customer expectations in terms of service quality. In a proactive approach towards development of an efficient strategy for asset management, reliability analysis and failure prediction of water pipes are vitally required.

During recent decades, many researchers have attempted to measure the


effect of different physical characteristics and environmental specifications

of water pipes on their failure frequency. Chapter 2 of this thesis presents a

comprehensive review of the studies and their approaches towards predicting

the future condition of pipes based on their previous performance.

As reviewed in Chapter 2, a number of physical analyses have been de-

veloped for individual pipes to assess the rate of their deterioration in order

to measure their deterioration characteristics and predict their future condi-

tion. Some researchers have conducted statistical analysis on the history of

water pipes to formulate the relationship of failure frequencies with most of

the factors contributing to the overall structural deterioration of the pipes.

In practice, a complete dataset, components of which are collected on a regular basis for each pipe of water networks, is very costly and not readily available. For minor water mains, few data are available and the low cost of failures does not justify an expensive data acquisition campaign. In this situation, statistically derived models that can be applied to various levels of input data are useful for the purpose of failure analysis.

This study has addressed the latter approach that can be applied on

water mains in a cost effective and realistic manner. The objective of this

thesis was to develop new failure prediction techniques for water pipes that

can be used as decision support tools by managers of water distribution

systems (to plan the maintenance/replacement of their water mains). To be

more specific, the study was conducted to improve the existing probabilistic

statistical techniques. Resulting models for describing the technical state of

pipelines are realistic tools for maintenance planning. The models proposed

in this thesis were applied to a failure database provided by City West

Water which supplies potable water to the Western and some inner suburbs

of Melbourne, Australia. This database, similar to failure records in other

water distribution systems, was an incomplete dataset just covering a period

which is only a portion of the total age of metal pipes of a mature water

distribution system.

The failure records available at most water distribution companies cover limited time spans (e.g. less than one decade), while the pipes themselves could be over 100 years old. The techniques developed in this study can cope with this limitation, predicting pipe reliabilities in the future using a limited range of break records.

Furthermore, the techniques developed in this study are general and

the procedure of obtaining the models can be used for any set of failure

data for infrastructure components. The results of the proposed modelling

methods can be used to predict the overall structural state of a network


by investigating the reliabilities of different classes (homogeneous groups) of pipes. By incorporating the data recorded for the whole system, the global maintenance/replacement effort is facilitated using the estimations obtained from the proposed techniques.

An ANN-based technique was used in this study to produce models for

reliability estimation of classes of pipes. Weibull and lognormal models

were also applied to the same data, and the accuracy of the resulting models

was compared both illustratively and quantitatively. The neural network

models proved more accurate in fitting a curve to the

failure history and in predicting pipe reliabilities in

the future. The estimation systems developed using the proposed ANN-based

technique, the Weibull model and the lognormal model were evaluated

quantitatively by comparing the

mean square errors of estimation. For all six classes of pipes investigated,

the ANN model produced the highest level of accuracy compared to the

Weibull and lognormal estimates. On average, the mean square errors for

the ANN, Weibull and lognormal models were 0.03, 0.084 and 0.086, respectively. In

order to make the ANN-based prediction techniques easier to use for other

databases, a manual has been developed and is presented in Appendix B.
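
The mean-square-error comparison summarised above can be illustrated with a minimal sketch. The reliability values, ages and distribution parameters below are hypothetical placeholders chosen for illustration only; they are not data or fitted values from this study, and the ANN model itself is not reproduced here.

```python
import math

def weibull_reliability(t, beta, eta):
    """Two-parameter Weibull reliability R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))

def lognormal_reliability(t, mu, sigma):
    """Lognormal reliability R(t) = 1 - Phi((ln t - mu) / sigma)."""
    z = (math.log(t) - mu) / sigma
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mean_square_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical empirical reliabilities of one pipe class (age in years).
ages = [10, 20, 30, 40, 50, 60]
observed = [0.97, 0.90, 0.78, 0.62, 0.45, 0.30]

# Hypothetical model parameters (assumed, not estimated from the CWW data).
weibull_pred = [weibull_reliability(t, 2.2, 58.0) for t in ages]
lognormal_pred = [lognormal_reliability(t, 3.9, 0.5) for t in ages]

errors = {
    "weibull": mean_square_error(observed, weibull_pred),
    "lognormal": mean_square_error(observed, lognormal_pred),
}
best = min(errors, key=errors.get)  # model with the smallest mean square error
print(errors, best)
```

The same `mean_square_error` function could be applied to the output of any other estimator, including a trained neural network, so that all candidate models are ranked on a common basis.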

In the next step, the ensemble of failures of all pipes in a class was

mathematically studied as a random process. The random process of failures

was demonstrated to be non-stationary because of the time-varying

environmental factors that affect the pipe failure processes.

The non-stationary characteristic of failure processes was visualised by

the changes in the patterns of failure occurrences in similar seasons of

consecutive years. For this purpose, a probabilistic measure to represent the state

of pipe failures in relatively short time intervals was defined. This measure

was used to mathematically demonstrate the deficiency of parametric models

in capturing the non-stationary nature of failure occurrences. Different

patterns of failure occurrences were related to a number of dynamic factors

that affect the process of pipe failures. These time-varying factors were

shown to be specific to each region and to vary widely from network

to network. For the case study of this thesis, the expansive clay soil of the

area was expected to play an important role in this non-stationarity. This

was confirmed with a histogram plot of failure occurrences in conjunction

with the records of rainfall during the time covered by the database.
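
As an illustration of comparing failure patterns in similar seasons of consecutive years, the following sketch counts failures per season and year. It is only a simple counting aid, not the probabilistic measure defined in this study. The failure dates are hypothetical, and southern-hemisphere seasons are assumed because the case study is located in Victoria, Australia.

```python
from collections import Counter
from datetime import date

# Southern-hemisphere seasons by month (an assumption for this sketch).
SEASON_BY_MONTH = {
    12: "summer", 1: "summer", 2: "summer",
    3: "autumn", 4: "autumn", 5: "autumn",
    6: "winter", 7: "winter", 8: "winter",
    9: "spring", 10: "spring", 11: "spring",
}

def season_key(d):
    """Return (season_year, season); December counts towards the next year's summer."""
    year = d.year + 1 if d.month == 12 else d.year
    return (year, SEASON_BY_MONTH[d.month])

def seasonal_counts(failure_dates):
    """Count failures in each (season_year, season) bin."""
    return Counter(season_key(d) for d in failure_dates)

# Hypothetical failure dates for one class of pipes.
failures = [
    date(2001, 1, 15), date(2001, 2, 3), date(2001, 7, 20),
    date(2001, 12, 28), date(2002, 1, 9), date(2002, 1, 30), date(2002, 2, 11),
]

counts = seasonal_counts(failures)
# Compare the same season across consecutive years.
summers = {year: n for (year, season), n in sorted(counts.items()) if season == "summer"}
print(summers)
```

A marked change in the count for the same season from one year to the next is the kind of pattern that a stationary model would not be expected to reproduce.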

At the next stage of this work, a non-parametric probabilistic technique

was developed. The non-stationary process of pipe failures can be captured

by this technique despite the lack of information about time-variant factors


which is typical of the data available in water distribution systems. The

resulting model of the proposed non-parametric method is updated automatically

and therefore takes the gradual time-variant factors into account.

The outcome of the model is an estimate of the expected number of failures

in a given future period, as specified by the operator. It is demonstrated

that the accuracy of predictions is inversely related to the length of the

prediction period. The probabilistic basis of the technique also enables it

to provide upper and lower bounds on the estimated number of failures

at a given confidence level (a confidence interval).

For the existing case study, weekly, monthly and quarterly updating are

examined. The best prediction accuracy observed is 94.4%, which

was obtained with weekly updating. This result was deduced from a 5.56%

rejection rate from the 80% confidence interval over a total of 144

points. The rejection rate is 8.33% for monthly updating, with a total of 36

points, and is much higher (25%) for quarterly

updating, with a total of 12 points.
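
The rejection rates quoted above are simple proportions, and the arithmetic can be sketched as follows. The toy series of observed counts and interval bounds is hypothetical. The rejected-point counts of 8, 3 and 3 are inferred here from the reported percentages and totals; they are an assumption of this sketch rather than figures stated in the text.

```python
def rejection_rate(observed, lower, upper):
    """Fraction of observations that fall outside their [lower, upper] interval."""
    rejected = sum(1 for o, lo, hi in zip(observed, lower, upper) if not lo <= o <= hi)
    return rejected / len(observed)

# Hypothetical observed failure counts with predicted interval bounds.
observed = [3, 5, 2, 9, 4, 6]
lower = [1, 2, 1, 3, 2, 3]
upper = [6, 7, 5, 8, 7, 9]

rate = rejection_rate(observed, lower, upper)  # only the fourth point lies outside
accuracy = 1.0 - rate

# Rejected-point counts consistent with the reported rates (inferred, see lead-in).
inferred = {"weekly": (8, 144), "monthly": (3, 36), "quarterly": (3, 12)}
percentages = {k: round(100.0 * r / n, 2) for k, (r, n) in inferred.items()}
print(rate, accuracy, percentages)
```

With these inferred counts the weekly case gives 8/144, which rounds to 5.56% and corresponds to the 94.4% accuracy quoted for weekly updating.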

It is emphasised that the techniques developed in this study are generic

in nature. Although the presented models and simulation results have

only been applied to the CWW database, the techniques can be tuned and applied

to most other water pipe failure databases to produce probabilistic models

for various classes of pipes in those data. Classes of pipes can also be

defined more narrowly when larger and more complete

databases are available. Indeed, larger and more homogeneous groups will result in more

accurate and reliable estimations.

The non-parametric probabilistic technique proposed in Chapter 6 can

also be applied to the failure history of components of other infrastructure

systems, provided that the patterns of their failures are smooth and do not

include sharp peaks. Infrastructure systems that can be subjected to sudden

significant changes of failure rate (e.g., severe failing behaviour in steel

infrastructures due to fatigue) cannot be modelled using this technique.

The objectives of this research have been achieved by:

(a) Review of the factors affecting pipe failure and of existing models for

predicting future performance in Chapter 2,

(b) Analysis of a typical failure database and identification of typical

limitations of databases in Chapter 3,

(c) Establishment of a neural network model for structural failures of

water reticulation pipes in Chapter 4,

(d) Investigation of the nature of occurrence of failure of water pipes in

terms of its stationarity as a random process in Chapter 5,

(e) Establishment of a non-parametric model that meets the requirement

of treating failure occurrences as a non-stationary random process in

Chapter 6,

(f) Development of supporting documentation for the developed model,

for use by water authorities in adapting it to various pipe categories

and data sets, in Appendix B.

This study has resulted in a number of publications that are listed in

Appendix C.

7.2 Recommendations for Further Work

The objectives of this thesis have been achieved with two new distinct modelling

techniques developed and validated. Both overcome the main limitation

posed by incomplete failure datasets for existing pipe systems. However, there is

room for further studies and enhancement of the developed techniques.

• The vast range of causes of failure makes accurate prediction of pipe

failures a highly complicated task. As discussed in Chapter 2, some

models accurately predict the future failure rates for individual pipes

by incorporating a long list of environmental and operational parameters,

as well as physical characteristics, into their prediction routines.

However, as explained in Chapter 3, because of data limitations,

this kind of analysis is not commonly applicable. Therefore,

as stated in Section 4.4, the models developed in this study accept

a few structural characteristics of the pipes (pipe age, material,

construction date, diameter and location) as the inputs and return the

reliability of a class of pipes (all sharing the given characteristics).

These statistical models carry some level of inaccuracy

for various reasons, e.g. classes of pipes may be too general and

non-uniform behaviour of pipes may be neglected in this classification.

Availability of more details and information for individual pipes

in the failure histories can help to develop more sophisticated models

incorporating other influencing factors as their inputs (in addition to

the ones used in this work). This will enhance the prediction performance

of the models in cases where more details for the failed pipes

are available.

• The proposed techniques do not consider the type of failure (e.g.

longitudinal, circumferential, etc.) in modelling the failure occurrences.

Taking this factor into account in the development of probabilistic models

may result in more realistic and accurate estimations. This factor

may be used in dividing the pipes into homogeneous groups if there

are sufficient failure records for each such group in the history.

• The failure analysis in this study is conducted on the basis of categorising

the pipes into various homogeneous groups. The techniques

proposed here produce a model to estimate the failure of each group

separately. Although this method is useful in obtaining practical results

for asset management purposes, the individual performance of pipes

is overshadowed by group behaviour. For instance, a pipe with one

failure in its record is treated the same as one that has experienced

several failures during the same period. It should be mentioned that

this was not an issue for the models developed using the available

database, which did not contain pipes having experienced more than

5 failures within the duration of the failure history. However, this may

cause inaccuracy in modelling large failure histories with pipes of

significantly high failure rates in the history (i.e. pipes that have

experienced large numbers of failures). To resolve this issue, further data

involving large numbers of failures need to be collected. From these data,

further refinement of the model can be made.

• The whole study is conducted on reticulation pipes. Similar analyses

can also be performed on failures of other components of water

distribution systems such as valves. If sufficient failure data are

available for these components, their failures can also be predicted on

a similar basis. This will serve the goal of assisting the planners of

water distribution systems in providing short- and long-term

maintenance/replacement policies for the entire network.

• The probabilistic techniques developed and presented are generic and

can be used for other failure histories as well. However, they need

to be modified for other databases, and this modification requires an

understanding of the mathematical framework and the underlying fundamental

principles of these techniques as explained throughout the

chapters. Although the manual provides a step-by-step

algorithm and the source code of a MATLAB program for its implementation,

a more user-friendly package, in the form of computer

software, can make the techniques more accessible and easier to use

for the designated purpose. Developing a computer program with

interactive features and the capability of accepting certain inputs and

returning expected outputs has not been in the scope of this thesis and

is recommended for future work.

• A practical way to assess the impact of various pipe failure conditions

on the overall operation of water distribution networks can be quite

helpful in interpreting the outcomes of probabilistic failure analysis

methods. It is suggested that the ability to assess the vulnerability of the

network to the failure of any particular class or group of pipes and, more

specifically, to provide a quantitative estimate of the impact on

each nodal demand be added to the manual of the developed

techniques. Such a tool, similar to the solution suggested by

Jowitt and Xu (1993), requires knowledge of the network configuration

and a set of typical operating conditions that might already be

available from a routine network analysis of the intact distribution

network. Results of such a method can be combined with pipe failure

probabilities to provide measures of network reliability that can be used

more conveniently by operators of water distribution systems.
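
The recommendation above concerning individual pipes being overshadowed by group behaviour can be illustrated with a small sketch that groups failure records into classes and measures how much of each class's failure history comes from pipes with repeated failures. The records, the choice of (material, diameter) as the class key and the threshold of two failures are all hypothetical assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical failure records: (pipe_id, material, diameter_mm), one row per failure.
records = [
    ("P1", "CI", 100), ("P1", "CI", 100), ("P1", "CI", 100),
    ("P2", "CI", 100),
    ("P3", "AC", 150), ("P3", "AC", 150),
    ("P4", "AC", 150),
]

def failures_by_class(records):
    """Map each (material, diameter) class to {pipe_id: number of failures}."""
    classes = defaultdict(lambda: defaultdict(int))
    for pipe_id, material, diameter in records:
        classes[(material, diameter)][pipe_id] += 1
    return classes

def repeat_failure_share(pipe_counts, threshold=2):
    """Fraction of a class's failures that come from pipes with >= threshold failures."""
    total = sum(pipe_counts.values())
    repeated = sum(n for n in pipe_counts.values() if n >= threshold)
    return repeated / total

classes = failures_by_class(records)
shares = {cls: repeat_failure_share(counts) for cls, counts in classes.items()}
print(shares)
```

A class in which most failures come from a few repeatedly failing pipes may be a candidate for the finer classification or individual-pipe modelling discussed in the recommendations.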

Bibliography

Achim, D., F. Ghotb and K. J. McManus (2007). Application of artificial neural networks for prediction of water pipe asset life. ASCE Journal of Infrastructure Systems.

Ahammed, M. and R. E. Melchers (1994). Reliability of underground pipelines subject to corrosion. ASCE Journal of Transport Engineering Division 120(6), 989–1002.

Andreou, S. (1986). Predictive models for pipe break failures and their implications on maintenance planning strategies for deteriorating water distribution systems. PhD thesis. MIT.

Andreou, S. A., D. H. Marks and R. M. Clark (1987). A new methodology for modelling break failure patterns in deteriorating water distribution systems: Theory. Applications and Advances in Water Resources 10(1), 2–10.

Arnold, G. E. (1960). Experience with main breaks in four large cities - Philadelphia. Journal of American Water Works Association 53(8), 1041–1044.

Ascher, H. and H. Feingold (1984). Repairable systems - Modeling, inference, misconceptions and their causes. Marcel Dekker. New York.

Balakrishnan, A.V. (1995). Introduction to Random Processes in Engineering. 1st ed. John Wiley and Sons. New York.

Balakrishnan, N. and W. W. S. Chen (1999). Handbook of Tables for Order Statistics from Lognormal Distributions with Applications. Kluwer. Amsterdam, Netherlands.

Barraza, N. R., B. Cernuschi and F. Cernuschi (1996). Applications and extensions of the chains-of-rare-events model. Journal of IEEE Transactions on Reliability 45, 417–421.

Barraza, N. R., J. D. Pfefferman, B. Cernuschi and F. Cernuschi (2000). An application of the chains-of-rare-events model to software development failure prediction. In: Proc. of 5th International Conference of Reliable Software Technologies (H. B. Keller and E. Odereder, Eds.). Springer-Verlag. Potsdam, Germany. pp. 185–195.

Bevilacqua, M., M. Braglia and R. Montanari (2003). Classification and regression tree approach for pumps failure rate analysis. Reliability Engineering and System Safety 79(1), 59–67.

Bishop, G. P. and E. R. Bloomfield (2003). Using a log-normal failure rate distribution for worst case bound reliability prediction. In: 14th International Symposium on Software Reliability Engineering (ISSRE03). IEEE. Denver, Colorado.

Borror, C.M., J.B. Kates and D.C. Montgomery (2003). Robustness of the time between events cusum. International Journal of Production Research pp. 3435–3444.

Boxall, J. B., A. O’Hagan, S. Pooladsaz, A. J. Saul and D. M. Unwin (2006). Estimation of burst rates in water distribution mains. ICE, Water Management.

Bras, R. L. and I. Rodriguez (1993). Random Functions and Hydrology. Dover Publications. N.Y., USA. Chapter 1, pp 1-11.

Bremond, B. (1997). Statistical modelling as help in network renewal decision. In: Diagnostics of Urban Infrastructure, European Commission Co-operation on Science and Technology (COST). Paris, France.

Butler, M. and J. West (1987). Leakage prevention and system renewal. In: Pipeline Management Seminar, Pipeline Industries Guild. U.K.

Carvalho, H.S., P.C. Nascimento, A.P. Alves da Silva, J.C.S. Souza, M.T. Schilling and M.B. Do Couto Filho (1999). Neural networks based approach for reliability estimation. In: IEEE International Conference on Electric Power Engineering. Budapest. p. 181.

Chambers, G. L. (1983). Analysis of Winnipeg’s water main failure problem. Technical report. City of Winnipeg works and operations division.

Ciottoni, A. (1983). Computerized data management in determining causes of water main breaks: Philadelphia case study. In: 1983 Int. Symp. on Urban Hydrology, Hydraulics and Sediment Control. Univ. of Kentucky, Lexington, Kentucky. pp. 323–329.

Clark, R. M. and J. A. Goodrich (1988). Developing a data base on infrastructure needs. International Journal of American Water Works Association 81(7), 81–87.

Clark, R. M. and J. A. Goodrich (1989). Developing a data base on infrastructure needs. Journal of American Water Works Association 81(7), 81–87.

Clark, R. M., C. L. Stafford and J. A. Goodrich (1982). Water distribution systems: A spatial and cost evaluation. ASCE Journal of Water Resources Planning and Management Division 108(3), 243–256.

Constantine, A. G. and J. N. Darroch (1993). Pipeline reliability: stochastic models in engineering technology and management. World Scientific Publishing Co.

Constantine, A. G., J. N. Darroch and R. Miller (1996). Predicting underground pipe failure. Australian Water Works Association.

Constantine, G., R. Miller and J. Darroch (1998). Prediction of pipeline failures from incomplete data. Technical Report 145. Urban Water Research Association of Australia.

Cooke, R. and E. Jager (1998). A probabilistic model for the failure frequency of underground gas pipelines. Journal of risk analysis.

Cox, D. R. and D. Oakes (1984). Analysis of survival data. Chapman and Hall Ltd. London.

Cox, D.R. (1972). Regression models and life-tables (with discussion). Journal of the Royal Statistical Society.

Crow, E. L., Ed. (1988). Lognormal Distributions: Theory and Applications. Dekker. New York.

Crowder, M. J., A. C. Kimber, R. L. Smith and T. J. Sweeting (1994). Statistical Analysis of Reliability Data. Chapman and Hall. London.

Cullinane, M. J. (1986). Hydraulic reliability of urban water distribution systems. In: ASCE Conference on Water Forum 86: World water issues in evolution (M. Karamouz, G. R. Baumli and W. J. Brick, Eds.). Long Beach, California. pp. 1264–1271.

Deb, A.K. (1998). Quantifying future rehabilitation and replacement needs of water mains. In: AWWA Research. Denver.

Doleac, M. L., S. L. Lackey and G. Bratton (1980). Prediction of time-to-failure for buried cast-iron pipe. In: AWWA 1980 Annual Conference. Denver.

Dyachkov, A. (1994). Rehabilitation of the water distribution in the city of Moscow. In: Water Supply Congress, International Water Supply Association. Vol. 12. Zurich. pp. 89–94.

Eisenbeis, P. (1994). Modelisation statistique de la provision des faillances sur les conduites d’eau potable. PhD thesis. University Louis Pasteur.

Eisenbeis, P. (1997). Estimating the aging of a water mains network with the aid of a record of past failures. In: Deterioration of Built Environment: Buildings, Roads and Water Systems. Norwegian University of Science and Technology. pp. 125–133.

Eisenbeis, P. (1999). Estimating the ageing of a water mains network with the aid of a record of past failures. In: Deterioration of Built Environment: Buildings, Roads and Water Systems. Norwegian University of Science and Technology. pp. 125–133.

Eisenbeis, P., P. Gauffre and S. Sægrov (2000). Water infrastructure management: An overview of European models and databases. In: AWWARF Infrastructure Conference. Baltimore, Maryland.

Finnemore, E. J. and J. B. Franzini (2002). Fluid Mechanics. 10th ed. McGraw Hill.

Fitzgerald, J. H. (1960). Corrosion as a primary cause of cast iron main breaks. Journal of American Water Works Association 60(8), 882–897.

Goel, A. L. and K. Okumoto (1979). Time-dependent error-detection rate model for software reliability and other performance measures. Journal of IEEE Transactions Reliability 28, 206–211.

Goldthwaitel, L. R. (1976). Failure rate study for lognormal lifetime model. Bell laboratories, Inc. New York.

Goodrich, J. A. (1986). Drinking water distribution system reliability: A case study. In: Water Forum 86: World Water Issues in Evolution (M. Karamouz, G. R. Baumli and W. J. Brick, Eds.). ASCE. New York. pp. 1256–1263.

Goulter, I. C. (1987). Current and future use of systems analysis in water distribution network design. Journal of Civil Engineering Systems.

Goulter, I. C. (1990). Reliability-constrained pipe network model. Journal of hydraulic engineering.

Goulter, I. C. (1992). System analysis in water distribution network design: from theory to practice. Journal of water resources and management.

Goulter, I. C. and A. V. Coals (1986). Quantitative approaches to reliability assessment in pipe networks. ASCE Journal of Transportation Engineering 112(3), 287–301.

Goulter, I. C. and F. Bouchart (1987). Joint consideration of pipe breakage and pipe flow probabilities. In: ASCE 1987 National Conference on Hydraulic Engineering (R. M. Ragan, Ed.). Williamsburg, Virginia. pp. 469–474.

Goulter, I., J. Davidson and P. Jacobs (1993). Predicting water main breakage rates. ASCE Journal of Water Resources Planning and Management 119(4), 419–436.

Goulter, I.C. and A.F. Kazemi (1988). Spatial and temporal groupings of water main pipe breakage in Winnipeg. Canadian Journal of Civil Engineering 15(1), 91–97.

Guan, X. (1995). Condition and Replacement of Regina’s Water Distribution System. M.Sc. thesis. University of Regina.

Gupta, R. and P. R. Bhave (1994). Reliability analysis of water-distribution systems. Journal of Environmental Engineering.

Gustafson, J. M. and D. V. Clancy (1999). Modeling the occurrence of breaks in cast iron water mains using methods of survival analysis. In: Proceedings of AWWA Annual Conf. American Water Works Association. AWWA. Denver, USA.

Habibian, A. (1992). Developing and utilizing data bases for water main rehabilitation. Journal of American Water Work Association 84(7), 75–79.

Han, Y. L. and SH. Dai (1996). Artificial neural network method for flawed pipe failure evaluation: probabilistic model. International Journal of Vessels and Piping 68, 203–207.

Hariga, M. A. (1996). Maintenance inspection model for a single machine with general failure distribution. Microelectronics and Reliability.

Hartman, W.F. and K. Karlson (2002). Condition assessment of water mains using remote field technology. In: International Conference of Infrastructures 2002. Montreal, Quebec, Canada.

Herbert, H. (1994). Technical and economic criteria determining the rehabilitation and/or renewal of drinking water pipelines. International Journal of Water Supply 12(3-4), 105–118.

Hertz, J., A. Krogh and R. G. Palmer (1991). Introduction to the theory of neural computation. Santa Fe Institute.

Herz, R. (1996). Ageing processes and rehabilitation needs of drinking water distribution networks. Journal of Water Supply Research and Technology-Aqua 45(5), 221–231.

Herz, R. (1997). Rehabilitation of water mains and sewers. Water-Saving Strategies in Urban Renewal - European Approaches. European Academy of the Urban Environment.

Herz, R. (1998). Exploring rehabilitation needs and strategies for drinking water distribution networks. In: First IWSA/AISE International Conference on Master Plans for Water Utilities. Prague.

Hobbs, B. F. (1985). Computer applications in water resources, reliability assessment of urban water supply. In: ASCE Speciality Conf. Buffalo. New York. pp. 1229–1238.

Hobbs, B. F. and G. K. Beim (1986). Verification of water supply reliability model. In: Proc. ASCE Conf. Water Forum, 86: World Water Issues in Evolution. Buffalo. New York. pp. 1218–1229.

Hossain, S. A. and R. C. Dahiya (1993). Estimating the parameters of a non-homogeneous Poisson-process model for software reliability. Journal of IEEE Transaction on Reliability 42, 604–612.

Hoyland, A. and M. Rausand (1994). System Reliability Theory: Models and Statistical Methods. John Wiley and Sons, Inc. New York.

Hu, Y. and D.W. Hubble (2005). Failure conditions of asbestos cement water mains in Regina. In: 1st CSCE Specialty Conference on Infrastructure Technologies, Management and Policy. Toronto, Ontario, Canada.

Jackson, R. Z., C. Pitt and R. Skabo (1992). Non-destructive testing of water mains for physical integrity. In: Annual Conference of American Water Works Association. AWWA Research Foundation. Denver, USA. p. 109.

Jacobs, P. and B. Karney (1994). GIS development with application to cast iron water main breakage rate. In: 2nd Int. Conf. on Water Pipeline Systems. BHR Group Ltd. Edinburgh, Scotland.

Jarvis, B. (1998). Asbestos-cement pipe corrosion interim report. Technical report. Customer Services Division, Water Corporation.

Jowitt, P.W. and C. Xu (1993). Predicting pipe failure effects in water distribution networks. ASCE Journal of Water Resources Planning and Management 119(1), 18–31.

Kaara, A.F. (1984). A decision support model for the investment planning of the reconstruction and rehabilitation of mature water distribution systems. PhD thesis. MIT.

Karaa, F. A. and D. H. Marks (1990). Performance of water distribution networks: Integrated approach. ASCE Journal of Performance of Constructed Facilities 4(1), 51–67.

Kartalopoulos, S. V. (1996). Understanding Neural Networks and Fuzzy Logic: Concepts and Applications. IEEE Press.

Kelly, D. and D. O’Day (1982). Organizing and analyzing leak and break data for making replacement decisions. Journal of American Water Works Association 74(11), 589–594.

Kettler, A.J. and I.C. Goulter (1985a). Analysis of pipe breakage in urban water distribution networks. In: Canadian Journal of Civil Engineering. Vol. 12. pp. 286–293.

Kettler, A.J. and I.C. Goulter (1985b). Analysis of pipe breakage in urban water distribution networks. In: Canadian Journal of Civil Engineering. Vol. 12. pp. 286–293.

Kirmeyer, G. J., W. Richards and C. D. Smith (1994). An assessment of water distribution systems and associated research needs. In: AWWA Research Federation. Denver.

Kleiner, Y. and B. Rajani (1999). Using limited data to assess future needs. Journal of Water Supply Research and Technology-Aqua 91(7), 47–61.

Kleiner, Y. and B. Rajani (2001). Comprehensive review of structural deterioration of water mains: Statistical models. Urban Water 3(3), 131–150.

Kleiner, Y. and B. Rajani (2002). Forecasting variations and trends in water-main breaks. ASCE Journal of infrastructure systems 8(44), 122–131.

Kleiner, Y. and B. Rajani (2003). Water main assets: from deterioration to renewal. In: AWWA Annual Conference and Exposition, Catch the Wave. AWWA. Anaheim, CA, USA. pp. 1–12.

Kleiner, Y., O. Hunaidi and D. Krys (2005). Failures in Gray Cast Iron Distribution Pipes. Vol. 2006. National Research Council Canada, Institute for Research in Construction.

Kolmogorov, A.N. (1941). American Mathematical Society Translations 1958. Vol. 8.

Kulkarni, R. B., K. Golabi and J. Chuang (1986). Analytical techniques for selection of repair-or-replace options for cast iron gas piping systems phase I. Technical Report II. Gas research institute.

Kumar, A., E. Meronkly and E. Segan (1984). Development of concepts for corrosion assessment and evaluation of underground pipelines. Technical Report ll. U.S. Army Corps of Engineers.

Lackington, D. W. (1991). Leakage control, reliability and quality of supply. International Journal of Civil Engineering Systems 8, Civil Engineering Systems.

Lambert, A. O. (1998). A realistic basis for objective international comparisons of real losses from public water supply systems. In: The Institute of Civil Engineers Conf., Water Environment 98 - Maintaining the Flow. London.

Le Gat, Y. (1999). Forecasting pipe failures in drinking water network using stochastic processes models - respective relevance of renewal and Poisson processes. In: 13th Conference of EJSW. Dresden University of Technology. Montreal.

Lee, J. A., D. P. Almond and B. Harris (1999). The use of neural networks for the prediction of fatigue lives of composite materials. Composites Part A: Applied Science and Manufacturing 30(10), 1159–1169.

Lee, O. S. and H. Kim (2004). Estimation of failure probability using boundary conditions of failure pressure model of buried pipe lines. Journal of Key Engineering Materials pp. 270–273.

Lei, J. (1997). Statistical approach for describing lifetimes of water mains - case Trondheim municipality. Technical report. Trondheim Municipality.

Lei, J. and S. Sægrov (1998). Statistical approach for describing lifetimes of water mains. Journal of Water Science and Technology 38(6), 209–217.

Leighton, T. F. and R. L. Rivest (1986). Estimating a probability using finite memory. Journal of IEEE Transactions of Information Theory 32(6), 733–742.

Leshno, M., V. Y. Lin, A. Pinkus and S. Schocken (1993). Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Networks 6(6), 861–867.

Li, D. and Y. Haimes (1992). Optimal maintenance-related decision-making for deteriorating water distribution systems 1. Semi-Markovian model for a water main. Water Resources Research 28, 1053–1061.

Lillie, K., C. Reed, M. A. R. Rodgers, S. Daniels and D. Smart (2004). In: Workshop on Condition assessment devices for water transmission mains. AWWA Research Foundation. Denver, Colo.

Lyu, M. R. (1996). Handbook of Software Reliability Engineering. McGraw Hill. New York.

Maglionico, M. and R. Ugarelli (2004). Reliability of a water supply system in quantity and quality terms. In: 19th European Junior Scientist Workshop on Process data and integrated urban water modelling. Lyon, France.

Makar, J. M. (1999). Failure analysis for grey cast-iron water pipes. In: 1999 AWWA Distribution System Symp. American Water Works Association. Denver.

Makar, J. M., R. Desnoyers and S. E. McDonald (2001). Failure modes and mechanisms in grey cast-iron pipes. In: International Conference on Underground Infrastructure Research.

Makar, J.M. and Y. Kleiner (2000). Maintaining water pipeline integrity. In: AWWA Infrastructure Conference. Vol. 1. Baltimore, Maryland.

Malandain, J., P. Gauffre and M. Miramond (1998). Organising a decision support system for infrastructure maintenance: Application to water supply systems. In: 1st International Conference on new Information technologies for decision Making in Civil Engineering. Montreal. pp. 1013–1025.

Malandain, J., P. Gauffre and M. Miramond (1999). Modeling the aging of water infrastructure. In: 13th Conference of EJSW.

Male, J. W., T. M. Walski and A. H. Slutsky (1988). Analysis of New York City’s water main replacement policy. In: Conference on Pipeline Infrastructure (B. A. Bennett, Ed.). ASCE. New York. pp. 306–312.

Male, J. W., T. M. Walski and A. H. Slutsky (1990). Analyzing water main replacement policies. ASCE Journal of Water Resources Planning and Management 116(3), 362–374.

Mann, A. (1997). The identification of ideal locations of moisture barriers. Masters thesis. Swinburne University of Technology.

Marks, H. D. (1985). Predicting urban water distribution maintenance strategies: A case study of New Haven, Connecticut. Technical report. US Environmental Protection Agency.

Marks, H. D., S. Andreou, Jeffrey L., C. Park and A. Zaslavski (1987). Statistical models for water main failures.

Marshall, G.P. (2002). The residual structural properties of cast iron pipes structural and design criteria for linings for water mains. Technical Report 01/VVM/02/14. UK Water Improvement Report.

Mavin, K. (1996). Predicting the failure performance of individual water mains. Technical report. Urban Water Research Association of Australia.

Mays, L. W. (1996). Review of reliability analysis of water distribution systems. In: Stochastic Hydraulics (A. A. Balkem, Ed.). Rotterdam, Netherlands. pp. 53–62.

Mays, L. W. (2000). Water Distribution System Handbook. McGraw Hill. New York.

Mays, L. W., N. Duan and Y. C. Sun (1986). Modelling reliability in water distribution network design. In: ASCE Conference on Water Forum 86: world water issues in evolution (M. Karamouz, G. R. Baumli and W. J. Brick, Eds.). Buffalo, New York.

McCulloch, W. S. and W. H. Pitts (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5, 115–133.

McMullen, L. D. (1982). Advanced concepts in soil evaluation for exterior pipeline corrosion. In: AWWA Annual Conference. Miami.

Metropolis, N. and S. Ulam (1949). The Monte Carlo method. Journal of American Statistics Association 44, 335–341.

Miller, A. M. (1980). A study of the reliability model. Master’s thesis. University of Maryland.

Minsky, M. and S. Papert (1969). Perceptrons. MIT Press. Cambridge, MA.

Moon, Y.B., C.K. Divers and H.J. Kim (1998). AEWS: an integrated knowledge-based system with neural networks for reliability prediction. Computers in Industry 35, 101–108.

Mordak, J. and J. Wheeler (1988). Deterioration of asbestos cement water mains. Technical report. Final Report to the Department of the Environment, Water Research Center.

Morris, R. E. (1975). The Distribution System. Manual of water utility operations. Austin the Association. Texas.

Morris, R.E. (1967). Principal causes and remedies of water main breaks. Journal of American Water Works Association pp. 782–798.

Musa, J. D., A. Iannino and K. Okumoto (1987). Software Reliability: Measurement, Prediction, Application. McGraw-Hill. New York, USA.

NACE (1984). Corrosion Basics: An Introduction. National Association of Corrosion Engineers.

Nebesar, B. (1983). Asbestos/cement pipe corrosion: Part 2. Technical Report 83-17E. Canada Centre for Mineral and Energy Technology, Energy, Mines and Resources Canada.

Nelson, J. D. and D. J. Miller (1992). Expansive soils: problems and practice in foundation and pavement engineering. John Wiley and Sons, Inc. New York.

Niemeyer, H. W. (1960). Experience with main breaks in four large cities - Indianapolis. Journal of American Water Work Association 52(8), 1051–1058.

Norin, M. and T. G. Vinka (2005). Corrosion of carbon steel and zinc in filling materials in an urban environment. Technical report. Swedish Corrosion Institute.

Northcote, K.H. (1979). A factual key for the recognition of Australian soils. Adelaide, South Australia. Rellim Technical Publications Pty. Ltd.

O’Day, D. K. (1982). Organizing and analysing leak and break data for making main replacement decisions. Journal of American Water Works Association 74(11), 589–596.

O’Day, D. K. (1983). Analyzing infrastructure conditions - a practical approach. ASCE Journal of Civil Engineering 53(4), 39–42.

O’Day, D. K., C. M. Fox and G. M. Huguet (1980). Ageing urban water systems: A computerized case study. Public Works 111(8), 61–64.

O’Day, K. (1989). External corrosion in distribution systems. Journal of Water Works Association (AWWA) 81(10), 45–52.

Olliff, J. and S. Rolfe (2002). Condition assessment: The essential basis for best rehabilitation practice. In: No-Dig 2002. Copenhagen.

Ostfeld, A. and U. Shamir (1996). Design of optimal reliable multi-quality water-supply systems. ASCE Journal of Water Resources Planning and Management 119(1), 83–98.

Pascal, O. and D. Revol (1994). Renovation of water supply systems. In: Water Supply Congress, International Water Supply Association. Vol. 12. Budapest. pp. 6–7.

Peabody, A.W. (1967). Control of pipeline corrosion. National Association of Corrosion Engineers.

Peabody, A.W. (2001). Control of pipeline corrosion. National Association of Corrosion Engineers.

Pelletier, G., A. Mailhot and J. P. Villeneuve (2003). Modeling water pipe breaks-three case studies. ASCE Journal of Water Resources Planning and Management 129(2), 115–123.

Price, D. and J. Sutton (1988). Technology in Australia 1788-1988. Australian Science and Technology Heritage Centre.

Rajani, B. (1995). Repair or replace? IRC studies corroded water mains.

Rajani, B. and J. M. Makar (2000a). A methodology to estimate remaining service life of grey cast iron water mains. Canadian Journal of Civil Engineering 27, 1259–1272.

Rajani, B. and J. Makar (2000b). A methodology to estimate remaining service life of gray cast iron water mains. Canadian Journal of Civil Engineering 27, 1259–1272.

Rajani, B. and Y. Kleiner (2001). Comprehensive review of structural deterioration of water mains: Physically based models. Urban Water 3, 151–164.

Rajani, B. and S. Tesfamariam (2004). Uncoupled axial, flexural, and circumferential pipe-soil interaction analyses of jointed water mains. Canadian Geotechnical Journal 41, 997–1010.

Rajani, B. and S. Tesfamariam (2005). Estimating time to failure of ageing cast iron water mains under uncertainties. In: International Conference of Computing and Control in the Water Industry (CCWI2005). Exeter, Devon, UK.

Rajani, B. and Y. Kleiner (2003). Protecting ductile-iron water mains: What protection method works best for what soil condition. Journal of American Water Works Association (AWWA) 95(11), 110–125.

Rajani, B. and Y. Kleiner (2004). Non-destructive inspection techniques to determine structural distress indicators in water mains. In: Workshop on Evaluation and Control of Water Loss in Urban Water Networks. Valencia, Spain. pp. 1–20.

Rajani, B., C. Zhan and S. Kuraoka (1996). Pipe-soil interaction analysis of jointed watermains. Canadian Geotechnical Journal 33(3), 393–404.

Rastad, C. (1995). Nordic experiences with water pipeline systems. In: 3rd International Conference, Sector C- Pipe materials and handling. CEOCOR Praha.

Rausand, M. and R. Reinertsen (1996). Failure mechanisms and life models. International Journal of Reliability, Quality and Safety Engineering 3(3), 137–152.

Redfearn, J.C.B. (1984). Transverse Mercator formulae. Empire Survey Review.

Remus, G. J. (1960). Experience with main breaks in four large cities-Detroit. Journal of American Water Works Association 52(8), 1048–1051.

Righetti (2001). Cast iron condition assessment study, recognition systems ltd. Technical report. City West Water.

Rosenblatt, F. (1962). A comparison of several perceptron models. Self-Organizing Systems. Spartan Books. Washington, DC.

Rostum, J. (1997). The concept of business risk used for rehabilitation of water networks. In: 10th EJSW. Tautra, Norway.

Rumelhart, D. E. and J. L. McClelland (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 1. MIT Press. Cambridge, MA.

Sacluti, F. (1999). Modeling water distribution pipe failures using artificial neural networks. Master’s thesis. Department of Civil and Environment Engineering. University of Alberta, Canada.

Sacluti, F., S.J. Stanley and Q. Zhang (1999). Use of artificial neural networks to predict water distribution pipe breaks. In: 51st Annual Conference of the Western Canada Water and Wastewater Association. Saskatoon, Canada. p. 12.

Sahinoglu, M. (1992). Compound-Poisson software reliability model. IEEE Transaction on Software Engineering 18, 624–630.

Sægrov, S., J.F. Melo Baptista, P. Conroy, R.K. Herz, P. LeGauffre, G. Moss, J.E. Oddevald, B. Rajani and M. Schiatti (1999). Rehabilitation of water networks: Survey of research needs and on-going efforts. Journal of Urban Water.

Shamir, U. and C. Howard (1979). Analytic approach to scheduling pipe replacement. International Journal of American Water Works Association 71(5), 248–258.

Shamir, U. and C.D. Howard (1985). Reliability and risk assessment for water supply systems. In: Computer Applications in Water Resources. Buffalo. New York. pp. 1218–1228.

Sheskin, D. (2003). The handbook of parametric and non-parametric statistical procedures, Mathematics. CRC Press. New York.

Skipworth, P., M. Engelhardt, A. Cashman, D. Savic, A. Saul and G. Walters (2002). Whole life costing for water distribution network management. Thomas Telford Publishing. London.

Stark, H. (1994). Probability, random processes, and estimation theory for engineers. 2nd ed. Prentice Hall. Englewood Cliffs, NJ.

Sukert, A. N. (1976). A software reliability modeling study. Technical Report RADC-TR-76-247. Rome Air Development Centre.

Sukert, A. N. (1979). Empirical validation of three error prediction models. Journal of IEEE Transaction on Reliability 28, 199–205.

Sullivan, J. P. (1982). Maintaining ageing systems-Boston’s approach. Journal of American Water Works Association 74(11), 555–559.

Tarassenko, L. (1998). A guide to neural computing applications. First ed. John Wiley and Sons. p. 17.

Tesfamariam, S., B. Rajani and R. Sadiq (2006). Possibilistic approach for consideration of uncertainties to estimate structural capacity of ageing cast iron water mains. Canadian Journal of Civil Engineering pp. 1050–1064.

The Australian Geodetic Datum Technical Manual. Special Publication (1986).

Vorenhout, M., H. G. van der Geest, D. van Marum, K. Wattel and H. J. Eijsackers (2004). Automated and continuous redox potential measurements in soil. Journal of Environment Quality 33, 1562–1567.

Walski, T. M. and A. Pelliccia (1982). Economic analysis of water main breaks. Journal of American Water Works Association (AWWA) 74(3), 140–147.

Walters, G. (1988). Optimal design of pipe networks: a review. In: International Conference on Computer and Water Resources. Vol. 2. Computational Mechanics Publication. Southampton, U.K. pp. 21–31.

Water Main Renewal Study (1991). Technical report. Melbourne Water. Melbourne.

Water Reticulation Asset Status Report (1997). Technical report. City West Water. Melbourne. Pipe Structural Performance for the period July 1994 to June 1997.

Watson, G. A. and Iain S. Duff (1997). The state of the art in numerical analysis. The Institute of Mathematics and Its Applications. Oxford: Clarendon Press. London, England.

Weibull, A. W. (1951). Statistical distribution of wide applicability. Journal of Applied Mechanics 18, 292–297.

Widrow, B. and Hoff (1960). Adaptive switching circuits. 1960 IRE WESCON Convention Record pp. 96–104.

Williams, S., R.G. Ainsworth and A.F. Elvidge (1984). A Method of Assessing the Corrosivity of Waters Towards Iron. WRc. Swindon.

Wood, A. (1996a). Predicting software reliability. Journal of IEEE on Computer Sciences 29, 69–77.

Wood, A. (1996b). Software reliability growth models. Technical Report 96.1. Tandem Tech. Germany.

WSAA facts’99 (1999). Technical report. Water Service Association of Australia (WSAA). Melbourne.

Xu, C. and C. Goulter (1998a). Probabilistic model for water distribution reliability. ASCE Journal of Water Resources Planning and Management 124(4), 218.

Xu, C. and I.C. Goulter (1999). Reliability-based optimal design of water distribution networks. ASCE Journal of Water Resources Planning and Management 125(6), 352–362.

Xu, C. and R.S. Powell (1991). Water supply system reliability - concepts and measures. Journal of Civil Engineering Systems 8(4), 191–195.

Xu, C., I.C. Goulter and K.S. Tickle (2003). Assessing the capacity reliability of ageing water distribution systems. Journal of Civil Engineering and Environmental Systems 20(2), 119–133.

Xu, Ch. and I. Goulter (1998b). Probabilistic model for water distribution reliability. Journal of Water Resources Planning and Management Division, ASCE.

Appendices

Appendix A

An Introduction to Artificial Neural Networks

A.1 Introduction

A neural network can be defined as a model of reasoning based on what occurs in the human brain. The brain consists of a densely interconnected set of nerve cells, or basic information-processing units, called neurons. Learning is a fundamental and essential characteristic of biological neural networks. By using multiple neurons simultaneously, the brain can perform its functions much faster than the fastest computers in existence today.

Each neuron has a very simple structure, but an army of such elements constitutes tremendous processing power. The ease with which they can learn led to attempts to emulate a biological neural network in a computer. The first computational model of an artificial neuron was presented by McCulloch and Pitts (1943). The McCulloch-Pitts neuron was one of the first attempts to capture the essential properties of a real neuron. The inputs and outputs of a McCulloch-Pitts neuron are binary (exclusively ones or zeros), so the nodes produce only binary results. There were no weights, and the activation function was always the unit step function.

A.2 McCulloch-Pitts Neurons

A McCulloch-Pitts neuron consists of a set of $n$ excitatory inputs, $X_i$; a set of $m$ inhibitory inputs, $X_{n+j}$; a threshold, $u$; a unit step activation function;

and a single neuron output, $Y$. The neuron computes the sum of the input signals and compares the result with the threshold value, $u$. If the net input is less than the threshold, the neuron output is “OFF” (e.g. 0); otherwise, the neuron becomes activated and its output attains the value “ON” (e.g. +1).
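The thresholding behaviour described above can be illustrated with a minimal MATLAB sketch. The name `mpNeuron` and the example inputs are hypothetical. The text does not state how the inhibitory inputs enter the sum, so the sketch assumes they are simply subtracted; the original McCulloch-Pitts formulation instead lets any active inhibitory input veto the output.

```matlab
% McCulloch-Pitts neuron: binary inputs, no weights, unit step activation.
% Assumption (not from the text): inhibitory inputs are subtracted from the sum.
mpNeuron = @(xExc, xInh, u) double(sum(xExc) - sum(xInh) >= u);

% Two excitatory inputs with threshold u = 2 behave like a logical AND.
yBothOn = mpNeuron([1 1], [], 2);   % net input 2 is not below the threshold: ON
yOneOn  = mpNeuron([1 0], [], 2);   % net input 1 is below the threshold: OFF

% An active inhibitory input lowers the net input and switches the neuron OFF.
yInhib  = mpNeuron([1 1], 1, 2);    % net input 2 - 1 = 1: OFF
```

With a threshold equal to the number of excitatory inputs the neuron acts as an AND gate, and with a threshold of 1 it would act as an OR gate.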

The next step in the development of artificial neural networks was the introduction of the delta learning rule by Widrow and Hoff (1960), which brought the concept of perceptron networks (which they called “ADALINE”, standing for adaptive linear neuron). The perceptron is the simplest form of a neural network. It consists of a single neuron with adjustable synaptic weights and a hard limiter.

The adaptive linear combiner outputs a linear combination of its inputs. This element receives an input vector $X_k = [X_{0k}\ X_{1k}\ \ldots\ X_{nk}]^\top$, where $\top$ denotes “transpose”. The components of $X_k$ may be continuous or binary values. The components of the input vector are weighted by a set of coefficients, the weight vector $W_k = [W_{0k}\ W_{1k}\ \ldots\ W_{nk}]^\top$. The weights are continuous values that may be positive or negative. The sum of the weighted inputs is then computed, producing a linear output:

\[ s_k = X_k^\top W_k \tag{A.1} \]

A.3 Linear Neuron Models

The adaptive linear neuron, or ADALINE, is an adaptive threshold logic element. It consists of an adaptive linear combiner cascaded with a hard-limiting quantizer that is used to produce a binary $\pm 1$ output, $Y_k = \mathrm{sign}(s_k)$. A bias weight, $W_{0k}$, which is connected to a constant input, $X_0 = 1$, controls the threshold level of the quantizer.

Such an element may be seen as a McCulloch-Pitts neuron augmented with a learning rule for adjusting its weights. In single-element neural networks, the weights are often trained to classify binary patterns using binary desired responses. If, after training, the ADALINE responds correctly to input patterns that were not included in the training set, generalisation is said to have taken place.
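As an illustration of an ADALINE with a learning rule, the following MATLAB sketch trains a single element on the bipolar AND function using the Widrow-Hoff delta (least-mean-square) update. The step size `mu`, the number of epochs and the training patterns are assumptions chosen for this example and are not taken from the thesis.

```matlab
% Training patterns, one per row: [X0 = 1 (bias input), X1, X2].
X = [1 -1 -1; 1 -1 1; 1 1 -1; 1 1 1];
d = [-1; -1; -1; 1];        % desired responses of a bipolar AND
W = zeros(3, 1);            % weight vector [W0; W1; W2]
mu = 0.1;                   % step size (assumed value)

for epoch = 1:50
    for k = 1:size(X, 1)
        s = X(k, :) * W;                       % linear output, Equation (A.1)
        W = W + mu * (d(k) - s) * X(k, :)';    % delta (LMS) weight update
    end
end

Y = sign(X * W);            % hard-limiting quantizer applied to all patterns
```

In this sketch the quantised outputs `Y` agree with the desired responses `d`, even though the linear outputs themselves do not reach exactly ±1; the quantizer only needs the sign of the linear output to be correct.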

Learning and generalisation are among the most useful attributes of ADALINEs and neural networks in general. With $n$ binary inputs and one binary output, there are $2^n$ possible input patterns. A general logic implementation would be capable of classifying each pattern as either +1 or −1, in accordance with the desired response. Thus, there are $2^{2^n}$ possible logic functions connecting $n$ inputs to a single binary output. A single ADALINE is capable of realising only a small subset of these functions, known as the linearly separable logic functions or threshold logic functions. These are the set of logic functions that can be obtained with all possible weight variations. With two inputs, the only two functions an ADALINE cannot learn are exclusive OR and exclusive NOR. With many inputs, however, only a small fraction of all possible logic functions are realisable, i.e., linearly separable.
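The two-input case can be checked by brute force. The MATLAB sketch below enumerates a small grid of weights (the grid of integers from −2 to 2 is an assumption made for this illustration), evaluates the quantised output for the four bipolar input patterns and collects the distinct truth tables that a single element can realise.

```matlab
X = [-1 -1; -1 1; 1 -1; 1 1];          % the four bipolar input patterns (rows)
g = -2:2;                              % candidate weight values (assumed grid)
[w0, w1, w2] = ndgrid(g, g, g);
Wgrid = [w0(:) w1(:) w2(:)];           % 125 candidate weight vectors

S = [ones(4, 1) X] * Wgrid';           % 4 x 125 linear outputs (bias input = 1)
Y = 2 * (S >= 0) - 1;                  % quantised outputs, +1 or -1

tables = unique(Y', 'rows');           % distinct truth tables that were realised
nRealised = size(tables, 1);           % number of realisable logic functions

hasXor  = ismember([-1 1 1 -1], tables, 'rows');   % exclusive OR
hasXnor = ismember([1 -1 -1 1], tables, 'rows');   % exclusive NOR
```

Of the $2^{2^2} = 16$ possible two-input functions, the search finds 14, and neither exclusive OR nor exclusive NOR is among them, in agreement with the statement above.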

In the course of development of artificial neural networks, MADALINE (multiple adaptive linear neuron) was introduced to remove the limitations of ADALINE in dealing with functions that are not linearly separable. MADALINE is one of the earliest trainable layered neural networks. Retinal inputs were connected to a layer of adaptive ADALINE elements, the outputs of which were connected to a fixed logic device that generated the system output. MADALINEs were constructed with various fixed logic devices, such as AND, OR, and majority vote-taker elements, in the second layer. All three of these functions are threshold logic functions.

Early neural networks (MADALINE) were invented for pattern classification problems. Nowadays, neural networks are equally useful for tasks such as interpolation, system modelling, state estimation, adaptive filtering, and nonlinear control. Rosenblatt (1962) took this course of improvement to the next stage by proving the convergence of the perceptron training rule. Later, Minsky and Papert (1969) showed that the perceptron cannot deal with data sets that are not linearly separable, even those that represent simple functions such as the exclusive OR.

A.4 Multi-Layer Feed-Forward Perceptrons

During 1970-1985, very little research was reported on neural networks. The invention of back-propagation, which can learn from data sets that are not linearly separable, by Rumelhart and McClelland (1986) led to a revolution in this field. Since 1986, a great deal of research has been conducted on the enhancement and application of neural networks. For instance, an artificial neuron may use the following transfer or activation function:

\[ X = \sum_{i=1}^{n} X_i W_i, \qquad Y = \begin{cases} 1 & \text{if } X \ge u \\ -1 & \text{otherwise} \end{cases} \tag{A.2} \]


The above type of activation function is called a sign function. This neuron computes the weighted sum of the input signals and compares the result with a threshold value, $u$. If the net input is less than the threshold, the neuron output is OFF (e.g. −1). However, if the net input is greater than or equal to the threshold, the neuron becomes activated and its output attains the value ON (e.g. +1). Some examples of the activation function, or “nonlinearity”, of a unit are presented in Figure A.1.

Artificial neural networks were first invented by computer scientists, but many scientists from physics, engineering, psychology and other fields have tried to improve them. As noted in Section A.3, early networks were designed for pattern classification, whereas neural networks are now also used for tasks such as interpolation, system modelling, state estimation, adaptive filtering, and nonlinear control. A typical feed-forward neural network is illustrated in Figure A.2. The neural network typically consists of a set of $n$ excitatory inputs, $X_i$; a set of $m$ inhibitory inputs, $X_{n+j}$; an activation function; neuron outputs, $y_i$; and weights, $W_{ij}$.

Each neuron computes the weighted sum of its input signals. An activation function processes the outcome of this linear combiner and returns the output of the neuron, which contributes to the inputs of the neurons in the next layer. Nonlinear activation functions allow a network of multiple neurons to achieve increased computational power and to represent nonlinear variations. Examples of activation functions are the step function, the sign function and the sigmoid function.
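The three activation functions named above can be written as one-line MATLAB functions. In the sketch below, the input signals, weights and threshold are hypothetical values chosen only to show how the sign neuron of Equation (A.2) is evaluated.

```matlab
stepFn    = @(x) double(x >= 0);        % unit step: output 0 or 1
signFn    = @(x) 2 * (x >= 0) - 1;      % sign function: output -1 or +1
sigmoidFn = @(x) 1 ./ (1 + exp(-x));    % sigmoid: smooth output in (0, 1)

X = [0.5 -1.0 2.0];     % hypothetical input signals
W = [0.8  0.4 -0.3];    % hypothetical weights
u = 0.1;                % hypothetical threshold

netInput = sum(X .* W);        % weighted sum: 0.4 - 0.4 - 0.6 = -0.6
Y = signFn(netInput - u);      % Equation (A.2): +1 if netInput >= u, otherwise -1
```

Here the net input is below the threshold, so the neuron output is −1 (OFF). Replacing `signFn` with `sigmoidFn` would give a graded output instead of a hard decision.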

The number of covariates that contribute to the process under study determines the number of neurons in the input layer of the network. The output signal of each neuron in the last layer is an output of the neural network. The layers of neurons between the first and last layers are called hidden layers. The ability to adjust the number of neurons in the hidden layers, as well as the number of hidden layers and the initial coefficients, increases the flexibility of neural networks in mimicking the patterns of different data.

Figure A.1: Some examples of activation function or “nonlinearity” of a unit: (a) step function, (b) sign function, (c) linear function. (Each panel plots the output Y against the input X.)


Figure A.2: Architecture of a typical feed-forward neural network, consisting of an input layer, hidden layer(s) and an output layer.

Appendix B

Reliability Prediction and Prioritisation Using an Artificial Neural Network: A Step-by-Step Instruction Manual for Practitioners

B.1 Introduction

This appendix presents a step-by-step instruction manual for practitioners and engineers in the water distribution industry. The implementation comprises three steps, described as follows:

1. Data Preparation: This step involves preparation of pipe failure data as recorded in a history of past pipe breaks. At this stage, the past failure records are sorted and their empirical reliabilities are calculated accordingly, to be used for training the neural network in the next step. The details are explained in Section B.2.

2. Training: At this step, the ensemble of failure records prepared in the previous step is used to train the artificial neural network (ANN), which is a feed-forward perceptron. This ANN can learn the failure patterns through training by the error back-propagation technique. The architecture of the ANN model and details of its training are presented in Section B.3.

3. Prediction and Prioritisation: After being trained, the ANN can be used for failure prediction and prioritisation of the pipes based on their reliabilities. The direct output of the trained ANN is an estimate of the reliability of a given pipe (knowing its construction date, material and diameter) at a given date (the assessment date) in the future. This predicted value can be used to prioritise different classes of pipes (pipe groups with similar diameter and material) for repair, rehabilitation and/or replacement planning and scheduling. The details are presented in Section B.4.

The source code of the MATLAB program that realises both the neural

network and its training in software is provided in Section B.5.

B.2 Preparation of Training Data

Before the neural network model can be used for lifetime prediction and prioritisation in replacement/repair planning, it must first be trained using a database of past pipe breaks. A proper database would include a large number of failure records, each with the following fields:

- Pipe age at the failure date
- Pipe material
- Pipe diameter
- Empirical reliability of the pipe at the time.

The pipe age at the failure date is calculated as the difference between the construction and failure dates. It can be expressed in days, months or years, depending on the required resolution of prediction. However, it is important to note that a higher resolution requires a larger number of failure records for proper training of the neural network. In this study, with the failure database provided by CWW, the pipe ages were calculated in months. The maximum age in the database should also be recorded and will be denoted by $\Delta D$ hereafter.
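As a small illustration of the age calculation, the MATLAB sketch below computes pipe ages in months and the maximum age $\Delta D$. The construction and failure dates are hypothetical, and storing each date as a `[year month]` pair is an assumption made for this example; a real database may hold full dates.

```matlab
% Hypothetical records: [year month] of construction and of failure.
construction = [1965  3; 1972 11; 1990 6];
failure      = [1998  7; 2001  2; 2004 6];

% Age at failure in whole months.
ageMonths = 12 * (failure(:, 1) - construction(:, 1)) ...
               + (failure(:, 2) - construction(:, 2));

deltaD = max(ageMonths);    % maximum age in the database
```

For these three records the ages are 400, 339 and 168 months, so `deltaD` is 400.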

The pipe material is recorded (or later converted in the software) as a numerical code. For example, if only CI, CICL, DI, AC, and GWCL pipes exist in the network, then the pipe material will be coded as shown in Table B.1. The number of existing types of pipe material should also be recorded and will be denoted by $N_{Type}$ hereafter (in the above case, $N_{Type} = 5$).

Table B.1: An example of pipe material and diameter coding.

Material    Code        Diameter (mm)    Code
CI          1           80               1
CICL        2           100              2
DI          3           110              3
AC          4           120              4
GWCL        5           150              5

Similar to the pipe material, the pipe diameter is also coded as one of the numbers 1, 2, . . . For example, if every pipe in the network has a diameter of 80 mm, 100 mm, 110 mm, 120 mm or 150 mm, then the pipe diameters will be coded as shown in Table B.1. The number of existing pipe diameter codes should also be recorded and will be denoted by $N_D$. In the example shown in Table B.1, $N_D = 5$.
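The coding of Table B.1 can be reproduced with a simple lookup. In the MATLAB sketch below, the variable names and the three break records are hypothetical; the position of each material or diameter in its list defines the code.

```matlab
materialList = {'CI', 'CICL', 'DI', 'AC', 'GWCL'};   % list order gives codes 1..5
diameterList = [80 100 110 120 150];                 % mm, list order gives codes 1..5

materials = {'DI'; 'CI'; 'GWCL'};    % hypothetical break records
diameters = [100; 150; 80];          % mm

% The second output of ismember is the position in the list, i.e. the code.
[~, materialCode] = ismember(materials, materialList);
[~, diameterCode] = ismember(diameters, diameterList);

NType = numel(materialList);         % number of material types
ND    = numel(diameterList);         % number of diameter codes
```

A code of 0 returned by `ismember` would indicate a material or diameter that is missing from the lists, which is a convenient check for data-entry errors.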

To calculate the empirical reliabilities of the pipes, their break records in the database should first be grouped into different classes. In each class, the pipes have the same material and diameter. A total of $N_D \times N_{Type}$ different classes can potentially exist, although usually some classes are empty (with no break record in the database).

Suppose there are $n$ break records in the history for a particular class of pipes and the pipe ages at the times of failure are recorded. The set of pipe ages is first sorted into ascending order, $t_{(1)}, t_{(2)}, \ldots, t_{(n)}$. Then, for each age $t_{(i)}$, $i = 1, \ldots, n$, the empirical reliability of the pipes in that class is calculated as follows:

\[
\begin{aligned}
S_{EMP}(t_{(1)}) &= 1 - 1/(2n) \\
S_{EMP}(t_{(2)}) &= 1 - 3/(2n) \\
&\;\;\vdots \\
S_{EMP}(t_{(i)}) &= 1 - (2i-1)/(2n) \\
&\;\;\vdots \\
S_{EMP}(t_{(n)}) &= 1/(2n).
\end{aligned}
\tag{B.1}
\]
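Equation (B.1) can be evaluated in a few lines of MATLAB. The sketch below handles a single class of pipes; the failure ages are hypothetical, and in practice the same calculation would be repeated for every non-empty class.

```matlab
ages = [212 95 340 150];     % hypothetical failure ages (months) of one pipe class

t   = sort(ages);            % t(1) <= t(2) <= ... <= t(n)
n   = numel(t);
idx = 1:n;
Semp = 1 - (2 * idx - 1) / (2 * n);   % Equation (B.1), one value per sorted age
```

For these four records the sorted ages are 95, 150, 212 and 340 months, with empirical reliabilities 0.875, 0.625, 0.375 and 0.125, respectively.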

All four columns of the training dataset are now complete and can be directly applied to train a neural network model as described in the next section. Each row of this dataset corresponds to a break record and includes numerical values for the pipe age, the code of pipe material, the code of pipe diameter and the empirical reliability of the pipe (calculated at the given age of the pipe for the pipe class to which the pipe belongs).

Figure B.1: Input layer of the ANN model. (The pipe age, pipe material code and pipe diameter code are scaled by $1/\Delta D$, $1/N_{Type}$ and $1/N_D$ to give the inputs $I_1$, $I_2$ and $I_3$, which are passed to the hidden layer together with the bias input $I_4 = 1$.)

B.3 Training the Artificial Neural Network

The neural network model of pipe reliability is a feed-forward perceptron with three layers. The first layer is the input layer. As shown in Figure B.1, the pipe age, material code and diameter code are normalised and passed to the hidden layer. The inputs to the hidden layer are denoted by the symbols $I_1$, $I_2$ and $I_3$, along with a fourth input $I_4 = 1$, which is called the bias input.

The hidden layer is shown in Figure B.2. The outputs of this layer are given by:

\[ J_j = f\!\left( \sum_{i=1}^{4} W_{ij} I_i \right); \quad j = 1, \ldots, n_h \tag{B.2} \]

where $n_h$ is the number of neurons in the hidden layer and $W_{ij}$ is the weight of the synapse connecting the $i$-th input $I_i$ to the $j$-th hidden neuron. The function $f(\cdot)$ is called the activation function of the hidden neurons and has the following mathematical form (called the sigmoid function):

\[ y = f(x) = \frac{1}{1 + \exp(-x)}. \tag{B.3} \]

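Figure B.2: Hidden layer of the ANN model. (The inputs $I_1$, $I_2$, $I_3$ and $I_4 = 1$ are connected through the weights $W_{ij}$ to the hidden neurons, whose outputs are $J_1, J_2, \ldots, J_{n_h}$, with $J_{n_h+1} = 1$.)

Figure B.3: Output layer of the ANN model. (The hidden-layer outputs $J_1, \ldots, J_{n_h}$ and $J_{n_h+1} = 1$ are weighted by $V_1, \ldots, V_{n_h+1}$, summed, passed through the activation function and scaled by 10/9 to give the output $S$.)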

The output layer has a single neuron, as shown in Figure B.3. This neuron is connected to the neurons of the hidden layer through synapses with weights $V_1, \ldots, V_{n_h+1}$, where the last weight multiplies the constant bias input $J_{n_h+1} = 1$. The reliability estimate is generated by this single neuron:

\[ S = \frac{10}{9}\, f\!\left( \sum_{j=1}^{n_h+1} V_j J_j \right). \tag{B.4} \]
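Equations (B.2)-(B.4) amount to one forward pass through the network, which can be sketched in MATLAB for a single record as follows. The record values, the number of hidden neurons and the placeholder weights are hypothetical. The normalisation shown (each input divided by $\Delta D$, $N_{Type}$ or $N_D$) is an assumption based on the labels of Figure B.1.

```matlab
f = @(x) 1 ./ (1 + exp(-x));           % sigmoid, Equation (B.3)

% Hypothetical record and database constants.
ageMonths    = 240;  deltaD = 480;     % pipe age and maximum age in the database
materialCode = 2;    NType  = 5;
diameterCode = 3;    ND     = 5;

% Assumed normalisation; the fourth input I4 = 1 is the bias input.
I = [ageMonths / deltaD; materialCode / NType; diameterCode / ND; 1];

nh = 3;                                % number of hidden neurons (assumed)
W  = 0.01 * ones(4, nh);               % placeholder weights W(i, j)
V  = 0.01 * ones(nh + 1, 1);           % placeholder weights V(j)

J = f(W' * I);                         % Equation (B.2): nh x 1 hidden outputs
S = (10 / 9) * f(V' * [J; 1]);         % Equation (B.4), with J_{nh+1} = 1
```

Because the sigmoid takes values strictly between 0 and 1, the output `S` always lies strictly between 0 and 10/9.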


B.3.1 Learning by Error Back-Propagation

Training of the neural network means tuning the weights $W_{ij}$ and $V_j$ in such a way that the output of the network, $S$, follows the failure patterns existing in the training data. The training method described in this section is called error back-propagation, and it is the most popular learning technique used with feed-forward multi-layer perceptron neural networks.

Suppose there is a total of $N$ break records in the training dataset that has been prepared following the instructions given in Section B.2. For the $k$-th record ($1 \le k \le N$), the empirical reliability is denoted by $S^k_{EMP}$, and the neural network is to be trained to generate these empirical reliabilities for the break records listed in the database. If the output of the neural network, given by equations (B.2)-(B.4), is $S_k$ for the $k$-th break record, the reliability estimation error term for this record is $e_k = S^k_{EMP} - S_k$. Using error back-propagation, the weights of the neural network are tuned in such a way that the following objective function is minimised:

\[ E = \frac{1}{2} \sum_{k=1}^{N} \left( S^k_{EMP} - S_k \right)^2 = \frac{1}{2} \sum_{k=1}^{N} e_k^2. \tag{B.5} \]

The optimisation problem is solved using the gradient technique. The training involves multiple repetitive epochs. In each epoch, each weight $w$ changes in the direction opposite to the gradient of the objective function:

\[ \Delta w = -\eta \frac{\partial E}{\partial w} + \alpha\, \Delta w_{prev.}. \tag{B.6} \]

The term $\alpha\, \Delta w_{prev.}$ is called the momentum term and is added to the gradient term to help prevent the minimisation process from being trapped in a local minimum. The two parameters $\eta$ and $\alpha$ should both be positive and less than one, and are chosen by trial and error. For the failure dataset provided by CWW to conduct this study, $\eta = 0.5$ and $\alpha = 0.7$ were suitable choices. For a complete and large dataset, however, different values may be appropriate.

A small $\eta$ will slow down the convergence of the optimisation process, and a large value will cause the parameter values to oscillate widely around their optimum values throughout the optimisation process (oscillatory convergence of the optimisation).
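The effect of the update rule in Equation (B.6) can be seen on a toy problem. The MATLAB sketch below applies it to a single weight with the hypothetical objective $E(w) = \frac{1}{2}(w - 3)^2$, which is not part of the thesis and is used only because its minimum ($w = 3$) is known. The values of $\eta$ and $\alpha$ are those quoted above for the CWW dataset.

```matlab
eta   = 0.5;                   % learning rate quoted above
alpha = 0.7;                   % momentum coefficient quoted above
gradE = @(w) (w - 3);          % gradient of the toy objective E(w) = 0.5 * (w - 3)^2

w = 0;                         % initial weight
dwPrev = 0;                    % no previous weight change before the first epoch
for epoch = 1:200
    dw = -eta * gradE(w) + alpha * dwPrev;   % Equation (B.6)
    w = w + dw;
    dwPrev = dw;
end
```

On this toy objective the weight overshoots and oscillates around the minimum before settling at $w = 3$, which is the oscillatory convergence described above.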

B.3.2 Step-by-Step Training Algorithm

First, the weights of the neural network are initialised to small random values. In this study, they were initialised to values between −0.01 and 0.01.

The training algorithm involves repetition of several steps (each repetition

cycle is called an epoch) until the objective function E converges. In each

epoch the following steps are taken:

1. For every break record in the training dataset, indexed with $k$ ($1 \le k \le N$), compute and save the following values:

   - the outputs of the hidden neurons $J_j(k)$; $j = 1, \ldots, n_h$, using equation (B.2);
   - the output of the neural network (reliability estimate $S_k$), using equation (B.4); and
   - the reliability estimation error $e_k = S^k_{EMP} - S_k$.

2. For each weight $V_j$, calculate the following gradient:

\[ \frac{\partial E}{\partial V_j} = -\sum_{k=1}^{N} e_k\, S_k\, (1 - 0.9 S_k)\, J_j(k). \tag{B.7} \]

3. For each weight $W_{ij}$, calculate the following gradient:

\[ \frac{\partial E}{\partial W_{ij}} = -\sum_{k=1}^{N} e_k\, S_k\, (1 - 0.9 S_k)\, J_j(k)\, (1 - J_j(k))\, V_j\, I_i(k). \tag{B.8} \]

4. Change the weights of the neural network by $\Delta V_j$ and $\Delta W_{ij}$ given below:

\[ \begin{aligned} \Delta V_j &= -\eta\, \partial E / \partial V_j + \alpha\, \Delta V_j(prev.) \\ \Delta W_{ij} &= -\eta\, \partial E / \partial W_{ij} + \alpha\, \Delta W_{ij}(prev.) \end{aligned} \tag{B.9} \]

   where $\Delta V_j(prev.)$ and $\Delta W_{ij}(prev.)$ are the values of the previous epoch.

The above steps are repeated until convergence. The convergence is examined by checking the absolute difference between the values of $E$ in the current and previous epochs. Training stops when the absolute difference is less than a small threshold ($10^{-4}$ in this study).
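The complete training loop can be sketched compactly in MATLAB using matrix operations. This is a minimal illustration under stated assumptions and not the thesis code: the training set (one pipe class with six hypothetical, already normalised records), the number of hidden neurons and the deterministic initial weights are chosen only for the example. The factor $S_k (1 - 0.9 S_k)$ in Equations (B.7) and (B.8) is the derivative of $\frac{10}{9} f(z)$ with respect to $z$, since $f' = f(1 - f)$ and $f = 0.9\,S_k$.

```matlab
f = @(x) 1 ./ (1 + exp(-x));                  % sigmoid, Equation (B.3)

% Hypothetical, already normalised training set for one pipe class.
age  = [10 22 35 41 58 70] / 70;              % ages divided by the maximum age
N    = numel(age);
I    = [age; 0.2 * ones(1, N); 0.4 * ones(1, N); ones(1, N)];   % 4 x N, last row is the bias
Semp = 1 - (2 * (1:N) - 1) / (2 * N);         % empirical reliabilities, Equation (B.1)

nh = 4;  eta = 0.5;  alpha = 0.7;             % nh is assumed; eta and alpha as quoted in B.3.1
W  = 0.01 * cos(reshape(1:4 * nh, 4, nh));    % small deterministic initial weights (assumed)
V  = 0.01 * sin((1:nh + 1)');
dW = zeros(size(W));  dV = zeros(size(V));
Eprev = Inf;

for epoch = 1:5000
    % Step 1: forward pass and errors for all records.
    J  = f(W' * I);                           % nh x N hidden outputs, Equation (B.2)
    Jb = [J; ones(1, N)];                     % append the bias J_{nh+1} = 1
    S  = (10 / 9) * f(V' * Jb);               % 1 x N outputs, Equation (B.4)
    e  = Semp - S;
    E  = 0.5 * sum(e .^ 2);                   % objective, Equation (B.5)
    if abs(E - Eprev) < 1e-4, break; end      % convergence check
    Eprev = E;

    % Steps 2 and 3: gradients.
    delta = e .* S .* (1 - 0.9 * S);                    % 1 x N common factor
    gV = -Jb * delta';                                  % Equation (B.7)
    gW = -I * ((V(1:nh) * delta) .* J .* (1 - J))';     % Equation (B.8)

    % Step 4: weight updates with momentum, Equation (B.9).
    dV = -eta * gV + alpha * dV;  V = V + dV;
    dW = -eta * gW + alpha * dW;  W = W + dW;
end
```

Because the stopping rule only compares consecutive values of $E$, it can also be triggered on a plateau of the objective; in practice it may be worth monitoring the value of $E$ itself as well.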

B.4 Prioritisation of Pipes by Reliability Prediction Using the Neural Network Model

Using the trained neural network, reliability models can be derived for each class of pipe. In this context, the term reliability model means the pipe reliabilities at different ages for the pipes with the same material and diameter.

Figure B.4: An example of the reliability versus age plot of a model derived for a class of pipes. (Horizontal axis: age in years; vertical axis: reliability. Marked reliability values: 1.00, 0.87 at 25 years and 0.21 at 50 years.)

Assume that, using the failure database, the reliability of a class of pipes is modelled as a function of pipe age, as shown in Figure B.4. By projecting this reliability model into the future, the reliability of those pipes in the year 2015 can be calculated as a similar (but reflected) function of the construction dates of the pipes, as shown in Figure B.5.

For the purpose of predictive prioritisation of pipes, the following steps are taken:

1. An assessment date in the future is determined.
2. The reliabilities of the different classes of pipes at the assessment date are calculated and plotted versus their construction dates.
3. A reliability threshold is set (e.g. 100% × α = 80%).
4. For each class, the pipes whose future reliability will be less than the given threshold are determined (in terms of their construction date) and marked as “high-risk”.
5. The list of high-risk pipes is the output of the proactive technique. The pipes in this list have the highest priority for replacement (or rehabilitation).
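The steps above can be sketched in MATLAB for a single class of pipes. The function `reliabilityModel` below is a hypothetical stand-in for the reliability curve produced by the trained neural network, and the range of construction years is also assumed; only the assessment year (2015) and the 80% threshold follow the examples in the text.

```matlab
% Hypothetical stand-in for a trained reliability model of one pipe class:
% reliability as a function of pipe age in years (not taken from the thesis).
reliabilityModel = @(ageYears) exp(-(ageYears / 40) .^ 2);

assessmentYear   = 2015;                                   % step 1
constructionYear = 1950:2010;                              % assumed range of construction years
ageAtAssessment  = assessmentYear - constructionYear;
futureReliability = reliabilityModel(ageAtAssessment);     % step 2

threshold  = 0.80;                                         % step 3
isHighRisk = futureReliability < threshold;                % step 4
highRiskYears = constructionYear(isHighRisk);              % step 5: high-risk list
cutoffYear = max(highRiskYears);   % pipes built in or before this year are high-risk
```

Since reliability decreases with age, the high-risk pipes are those constructed in or before a single cut-off year, which corresponds to the intersection of the reliability plot with the threshold line. With the stand-in model used here, the cut-off year is 1996.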

Figure B.5: The reliability of the same class of pipes (as in Figure B.4) in the year 2015, plotted versus the construction year of the pipes in that class. (Marked values: 0.87 for pipes constructed in 1990 and 0.21 for pipes constructed in 1965.)

Figure B.6: For the “red” and “blue” classes of pipes, the pipes constructed before the year Y1 and Y2, respectively, are high-risk. (Horizontal axis: construction year; vertical axis: reliability, with the threshold line at 100% × α.)

An example is shown in Figure B.6. The reliabilities of two classes of pipes at a given assessment date in the future are predicted and plotted versus their construction dates. The horizontal line at 100% × α intersects the two plots at the construction years Y1 and Y2. Thus, for the class with the “red” reliability plot, the pipes constructed before the year Y1 are identified as high-risk and, for the class with the “blue” reliability plot, the pipes constructed before the year Y2 are identified as high-risk.


    function [shat, J] = NN(I, W, V)
    % NN  Forward pass of the three-layer neural network model.
    %   I    : 3 x 1 input vector
    %   W    : 4 x nh weight matrix (input layer to hidden layer)
    %   V    : (nh+1) x 1 weight vector (hidden layer to output neuron)
    %   shat : reliability estimate
    %   J    : nh x 1 vector of hidden layer neuron activities

    J = f(W' * [I; 1]);              % nh x 1 vector J; the input I is padded with a 1 for the bias input
    shat = 10/9 * f(V' * [J; 1]);    % output shat; J is padded with a 1 for the bias

    function y = f(x)
    % Activation function of each neuron in the ANN, which is a sigmoid function.
    y = 1 ./ (1 + exp(-x));

Figure B.7: The source of the MATLAB function that realises the three layers of the neural network. For a given set of input values, this function returns the output of the network and the activation values (outputs) of the neurons in the hidden layer.

B.5 MATLAB Source Code

In this section, the MATLAB source files of the program, written to imple-

ment and test the performance of the neural network model, are presented.

Figure B.7 shows the function ‘NN.m’ which implements the three layers

of the neural network shown in Figures B.1-B.3. The inputs of the neural

network are denoted by the 3 × 1 vector variable ‘I’ and they are in the

same order as shown in Figure B.1. The weights are denoted by the vari-

able symbols ‘W’ and ‘V’ which are 4 × nh and (nh + 1) × 1, respectively.

The function returns the output of the neural network which is an estimate

S of the reliability of the given pipe at the given age. It also returns an

nh × 1 vector ‘J’ which denotes the outputs of the neurons in the hidden

layer. The function ‘y=f(x)’ defined within ‘NN.m’ is the sigmoid activation

function of the neurons as defined in Equation (B.3).
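
As a rough illustration of the dimensions described above, the following sketch repeats the forward pass of ‘NN.m’ inline, so that it runs on its own. The weights are untrained random placeholders and the pipe (age 40, material code 1, diameter code 1) is a hypothetical example; the scaling of the inputs by 200 and 5 is taken from ‘Training_Epoch’.

```matlab
% Forward pass of the three-input, single-output network, written inline.
% All numeric values below are illustrative assumptions, not thesis results.
f = @(x) 1 ./ (1 + exp(-x));          % sigmoid activation, as in 'NN.m'

nh = 20;                              % number of hidden neurons (as in 'NN_Training')
W  = 0.02 * rand(4, nh) - 0.01;       % untrained placeholder weights, 4 x nh
V  = 0.02 * rand(nh + 1, 1) - 0.01;   % untrained placeholder weights, (nh+1) x 1

% Hypothetical pipe: age 40 years, material code 1, diameter code 1,
% normalised as in 'Training_Epoch' (age/200, code/5).
I = [40 / 200; 1 / 5; 1 / 5];

J    = f(W' * [I; 1]);                % hidden-layer outputs, nh x 1
shat = 10 / 9 * f(V' * [J; 1]);       % reliability estimate, a scalar in (0, 10/9)

fprintf('size(J) = %d x %d, shat = %.4f\n', size(J, 1), size(J, 2), shat);
```

Because the sigmoid output is scaled by 10/9, the estimate lies strictly between 0 and 10/9 rather than between 0 and 1.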

Figure B.8 shows the code ‘Training_Epoch.m’ which is a function that

implements one training epoch for the neural network. Its inputs are the

ages, material and diameter codes, empirical reliabilities, trained weights

in the previous epoch and the weight variations in the previous epoch. It

is important to note that the ages and material and diameter codes and

empirical reliabilities are vectors extracted from the failure history (this will

be performed within another m-file that repeatedly calls ‘Training_Epoch’

over multiple epochs to train the neural network).

The ‘Training_Epoch’ function initialises its variables and, for each failure

record, generates the input vector ‘I’ and calls the ‘NN’ function. It then

accumulates the estimation error over all records, computes the updated

weights and returns them, along with their most recent variations and the

cumulative estimation error E itself.

The complete iterative training scheme for the neural network, using

error back-propagation, is implemented by the function ‘NN_Training’. The

source code of this function, which is the m-file ‘NN_Training.m’, is shown

in Figures B.9 and B.10.

The function ‘NN_Training’ reads the records of a failure database stored in

a Microsoft Excel file, which is assumed to be named ‘Sample_data.xls’; the

name can be changed within the code if required. The Excel file has to be

generated with the format shown in Figure B.11. It has four columns: the

type (material), diameter and age of the pipe at the time of the recorded

failure, and the empirical reliability of the pipe. The failures of each class

of pipe should be sorted in ascending order of pipe age, so that the

empirical reliabilities can then be computed according

to Equation 4.4. More precisely, the empirical reliabilities of the smallest

to largest ages of the failed pipes of the same class will be 1 − 1/(2n),

1 − 3/(2n), 1 − 5/(2n), . . ., 1/(2n), respectively (n is the total number of

recorded failures for the class).
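
The sequence above can be generated directly from the sorted failure ages of one class. The following is a minimal sketch under that reading of the text; the five ages are hypothetical and the variable names are chosen here for illustration only.

```matlab
% Empirical reliabilities for one class of pipes, following the sequence
% 1 - 1/(2n), 1 - 3/(2n), ..., 1/(2n) described in the text.
ages = [35; 12; 58; 27; 41];          % hypothetical ages at failure (years)

ages_sorted = sort(ages);             % ascending order of age
n = numel(ages_sorted);               % number of recorded failures in the class
k = (1:n)';                           % rank of each failure, 1 = youngest
S_emp = 1 - (2 * k - 1) / (2 * n);    % k-th smallest age gets 1 - (2k-1)/(2n)

disp([ages_sorted S_emp]);            % columns: age, empirical reliability
```

For these five ages the second column is 0.9, 0.7, 0.5, 0.3 and 0.1, so the last value equals 1/(2n) as stated.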

Once the contents of the Excel file have been read into a numeric matrix and

a text cell array, the text is processed first and the material information

is coded into material codes. The diameters are then coded in a similar way.

The weights of the neural network are initialised to small random numbers,

and the training epochs are repeated until convergence. At the end, the

weights of the trained neural network are saved into a file named

‘Trained_Weights.mat’ for future use. They can be loaded into variables and

used as inputs to the ‘NN’ function to estimate the reliability of a pipe of

a given class (material and diameter codes) at a given age in the future.
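
The following sketch shows one way the saved weights might be reused. So that it runs on its own, it first writes random placeholder weights to a temporary file (a stand-in for ‘Trained_Weights.mat’, which a real run of ‘NN_Training’ would produce) and repeats the forward pass of ‘NN.m’ inline. The class and the range of ages are hypothetical.

```matlab
f = @(x) 1 ./ (1 + exp(-x));                 % sigmoid activation, as in 'NN.m'

% Stand-in for a real training run: random weights saved to a temporary file
% (an assumption for illustration; not trained weights).
nh = 20;
Wnew = 0.02 * rand(4, nh) - 0.01;
Vnew = 0.02 * rand(nh + 1, 1) - 0.01;
weights_file = [tempname '.mat'];
save(weights_file, 'Wnew', 'Vnew');

% Load the weights and evaluate the network for one class over a range of ages.
w = load(weights_file);
material_code = 1;                           % 'CI' in the coding used by 'NN_Training'
diameter_code = 1;                           % diameter 80 in the same coding
ages = 0:10:100;
shat = zeros(size(ages));
for i = 1:numel(ages)
    I = [ages(i) / 200; material_code / 5; diameter_code / 5]; % same scaling as in training
    J = f(w.Wnew' * [I; 1]);                 % hidden-layer outputs
    shat(i) = 10 / 9 * f(w.Vnew' * [J; 1]);  % reliability estimate at this age
end
disp([ages' shat']);                         % columns: age, estimated reliability
delete(weights_file);                        % remove the temporary file
```

With weights from an actual training run, the second column would trace the estimated reliability curve of the chosen class against age.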


function [Wnew,Vnew,Delta_Wnew,Delta_Vnew,E]=Training_Epoch(Ages,Material_Codes,Diameter_Codes,S_Emp_Data,Wold,Vold,Delta_Wold,Delta_Vold)
% This function performs one epoch of back-propagation training of the ANN.
% The inputs are the vectors of ages, empirical reliability values and
% material and diameter codes, extracted from the failure history, together
% with the old weight values and their previous variations. The outputs are
% the new weights, the updated variations and the total error E.
etha=0.05; alpha=0.1;
DeltaD=200; % Presumably, no pipe in the database is more than 200 years old, so this is a good normalisation factor
Ntype=5; % 5 types of pipe materials are assumed to exist in the database
Nd=5; % 5 types of pipe diameters are assumed to exist in the database
Imatrix=[Ages/DeltaD Material_Codes/Ntype Diameter_Codes/Nd]'; % Each column of this matrix is one input to the neural network, which is to be trained to generate the empirical reliability
N=length(Ages); % the number of failure records

E=0; dE_dV=0; dE_dW=0; % The total error and partial derivatives are initialised to zero to be summed up in the following loop
% Note: since e=S_Emp_Data(k)-shat, dE_dV and dE_dW accumulate the negative gradients of E
for k=1:N
    I=Imatrix(:,k); % the input vector is extracted
    [shat,J]=NN(I,Wold,Vold); % the outputs of the ANN are calculated
    e=S_Emp_Data(k)-shat; % the estimation error is calculated
    E=E+1/2*e^2; % calculates formula (B.5)
    dE_dV=dE_dV+e*shat*(1-0.9*shat)*[J ; 1]; % calculates (B.7)
    dE_dW=dE_dW+e*shat*(1-0.9*shat)*[I ; 1]*(J.*(1-J).*Vold(1:(length(Vold)-1)))'; % calculates (B.8)
end
Delta_Vnew=+etha*dE_dV+alpha*Delta_Vold; % calculates the new V weight variations from formula (B.9)
Delta_Wnew=+etha*dE_dW+alpha*Delta_Wold; % calculates the new W weight variations from formula (B.9)
% updating the weights
Vnew=Vold+Delta_Vnew;
Wnew=Wold+Delta_Wnew;

Figure B.8: One epoch of the training process of Neural Network by error

back-propagation. This epoch is repeated until the estimation error of the

neural network falls below a small given threshold.
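
As a sanity check on the derivative terms used in the listing, the sketch below compares them with central finite differences of the single-record error E = e^2/2. It assumes the same network structure as ‘NN.m’ (sigmoid hidden layer, output scaled by 10/9); the test weights, input and target are arbitrary illustrative values. Since e is defined as the empirical value minus the estimate, the accumulated terms should match the negative of the numerical gradient.

```matlab
f = @(x) 1 ./ (1 + exp(-x));                        % sigmoid activation
nh = 4;                                             % small hidden layer for the check
W = rand(4, nh) - 0.5;  V = rand(nh + 1, 1) - 0.5;  % arbitrary test weights
I = [0.3; 0.4; 0.6];  S = 0.7;                      % hypothetical input and target

% Error of a single record as a function of the weights.
err = @(W, V) 0.5 * (S - 10 / 9 * f(V' * [f(W' * [I; 1]); 1]))^2;

% Analytical terms, as accumulated in 'Training_Epoch' (equal to -dE/dV and -dE/dW).
J = f(W' * [I; 1]);
shat = 10 / 9 * f(V' * [J; 1]);
e = S - shat;
gV = e * shat * (1 - 0.9 * shat) * [J; 1];
gW = e * shat * (1 - 0.9 * shat) * [I; 1] * (J .* (1 - J) .* V(1:nh))';

% Central finite differences of E with respect to each weight.
h = 1e-6;  nV = zeros(size(V));  nW = zeros(size(W));
for i = 1:numel(V)
    d = zeros(size(V));  d(i) = h;
    nV(i) = (err(W, V + d) - err(W, V - d)) / (2 * h);
end
for i = 1:numel(W)
    d = zeros(size(W));  d(i) = h;
    nW(i) = (err(W + d, V) - err(W - d, V)) / (2 * h);
end

fprintf('max |gV + dE/dV| = %.2e\n', max(abs(gV + nV)));
fprintf('max |gW + dE/dW| = %.2e\n', max(abs(gW(:) + nW(:))));
```

The factor shat*(1-0.9*shat) arises because the output is 10/9 times a sigmoid: differentiating gives (10/9)*f*(1-f), and substituting f = 0.9*shat yields shat*(1-0.9*shat).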

function NN_Training
% This function reads the failure data from a database in the form of an Excel
% file and uses them to train the neural network.
[Numeric,Text]=xlsread('Sample_data.xls'); % reads the numeric and text data from the Excel file, which is the failure database
Type_Text=Text(2:end,1); % extracts the first column of the Excel file, which includes the text of pipe materials

% the following lines code the pipe materials into a vector of 1, 2, 3, 4 or 5's according to Table B.1
Material_Codes=zeros(size(Type_Text));
Material_Codes(strcmp(Type_Text,'CI'))=1;
Material_Codes(strcmp(Type_Text,'CICL'))=2;
Material_Codes(strcmp(Type_Text,'DI'))=3;
Material_Codes(strcmp(Type_Text,'AC'))=4;
Material_Codes(strcmp(Type_Text,'GWCL'))=5;

% the following lines code the pipe diameters into a vector of 1, 2, 3, 4 or 5's according to Table B.1
Diameters=Numeric(:,1);
Diameter_Codes=zeros(size(Diameters));
Diameter_Codes(Diameters==80)=1;
% ... (the remaining diameters are coded as 2 to 5 in the same way, according to Table B.1)

Ages=Numeric(:,2); % extracting the ages from the numeric data of the Excel file
S_Emp_Data=Numeric(:,3); % extracting the empirical reliability data from the numeric data of the Excel file

% Start training the neural network

nh=20; % choose the number of neurons in the hidden layer as 20
Wold=0.02*rand(4,nh)-0.01; % initialise the weights W randomly between -0.01 and 0.01
Vold=0.02*rand(nh+1,1)-0.01; % initialise the weights V randomly between -0.01 and 0.01
Delta_Wold=zeros(size(Wold)); % initialise the delta W to all zeros

Figure B.9: Page 1 of the code that repeats the training epochs until

convergence. It reads the failure history from a Microsoft Excel file.


Delta_Vold=zeros(size(Vold)); % initialise the delta V to all zeros

E=1000; % initialise the training error to a large number
while E/length(Ages) > 0.001
    % repeat the training epochs until the average estimation error is smaller than a small threshold (chosen as 0.001 here)
    [Wnew,Vnew,Delta_Wnew,Delta_Vnew,E]=Training_Epoch(Ages,Material_Codes,Diameter_Codes,S_Emp_Data,Wold,Vold,Delta_Wold,Delta_Vold);
    Wold=Wnew; Vold=Vnew; Delta_Wold=Delta_Wnew; Delta_Vold=Delta_Vnew; % update the weights and their inter-epoch variations for the next epoch
    disp(E);
end

save Trained_Weights Wnew Vnew; % saves the final weights of the trained neural network in the 'Trained_Weights.mat' file

Figure B.10: Page 2 of the code that repeats the training epochs until

convergence. It reads the failure history from a Microsoft Excel file.

Figure B.11: Example of the Excel file to be processed by the MATLAB

program.

Appendix C

Research Publications

The results of this research study have been published in the proceedings of

two international conferences. After being presented at the conferences,

those papers were extended and modified, then submitted to and accepted for

publication in two journals. Furthermore, a paper has been submitted to a

journal, has been revised twice, and is awaiting the Editor’s final

decision. The publications are listed below:

1. Dehghan, A., K. J. McManus and E. F. Gad (2008), Statistical Analysis
   of Structural Failures of Water Pipes in a Case Study, Proceedings
   of Institute of Civil Engineers (ICE) - Water Management, 161(4),
   207–214, August.

2. Dehghan, A., K. J. McManus and E. F. Gad (2008), Probabilistic
   Failure Prediction for Deteriorating Pipelines: A Non-Parametric
   Approach, ASCE Journal of Performance of Constructed Facilities,
   22(1), 45–53, February.

3. Dehghan, A., K. J. McManus and E. F. Gad (2007), Non-Parametric
   Approach to Probabilistic Analysis of Structural Failures of Cast Iron
   Pipes, In: Proceedings of the Eleventh International Conference on
   Civil, Structural and Environmental Engineering Computing,
   St. Julians, Malta, Paper No. 240, September.

4. Dehghan, A. and K. J. McManus (2005), Improved Estimation of Water
   Pipes Reliability for Urban Water Supply Systems, In: Proceedings
   of The First International Conference on Structural Condition
   Assessment, Monitoring and Improvement, Perth, Australia, 163–169,
   December.

5. Dehghan, A. and K. J. McManus, Reliability Analysis of Water
   Distribution Pipes Using Artificial Neural Networks, Submitted to AWWA
   Journal, Published by American Water Works Association (AWWA),
   USA, Second revision submitted, Waiting for final decision by the
   journal’s Editor.
