zhang, sheng,ma, lin,sun, yong, &mathew, joseph in gelman ... · 2 monitoring data under...

This may be the author’s version of a work that was submitted/acceptedfor publication in the following source:

Zhang, Sheng, Ma, Lin, Sun, Yong, & Mathew, Joseph(2007)Asset Health Reliability Estimation Based on Condition Data.In Gelman, L (Ed.) Proceedings of the 2nd World Congress on Engineer-ing Asset Management and the 4th International Conference on ConditionMonitoring.Coxmoor Publishing Company, CD Rom, pp. 2195-2204.

This file was downloaded from: https://eprints.qut.edu.au/13311/

c© Copyright 2007 (please consult author)

This work is covered by copyright. Unless the document is being made available under aCreative Commons Licence, you must assume that re-use is limited to personal use andthat permission from the copyright owner must be obtained for all other uses. If the docu-ment is available under a Creative Commons License (or other specified license) then referto the Licence for details of permitted re-use. It is a condition of access that users recog-nise and abide by the legal requirements associated with these rights. If you believe thatthis work infringes copyright please provide details by email to [email protected]

Notice: Please note that this document may not be the Version of Record(i.e. published version) of the work. Author manuscript versions (as Sub-mitted for peer review or as Accepted for publication after peer review) canbe identified by an absence of publisher branding and/or typeset appear-ance. If there is any doubt, please refer to the published source.

https://eprints.qut.edu.au/view/person/Zhang,_Sheng.html

https://eprints.qut.edu.au/view/person/Ma,_Lin.html

https://eprints.qut.edu.au/view/person/Sun,_Yong.html

https://eprints.qut.edu.au/view/person/Mathew,_Joseph.html

https://eprints.qut.edu.au/13311/

QUT Digital Repository: http;;//eprints.qut.edu.au

Zhang, Sheng and Ma, Lin and Sun, Yong and Mathew, Joseph (2007) Asset health reliability estimation based on condition data. In Proceedings World Congress on Engineering Asset Management, pages pp. 2195-2204, Harrogate, England.

© Copyright 2007 (please consult author)

1

Asset health reliability estimation based on condition data

Sheng Zhang, Lin Ma, Yong Sun, Joseph Mathew

CRC for Integrated Engineering Asset Management, School of Engineering Systems,

Queensland University of Technology, Brisbane, QLD 4001, Australia

[email protected]

Abstract

This paper presents a method using recursive Bayesian analysis to estimate asset health

reliability. The method enables reliability analysis and prediction using degradation condition

monitoring data from an individual asset, rather than using failure data from a population of

assets. A scheme was devised to assist the implementation of the method in both normal and

abnormal zones during the whole asset operation cycle. An experiment on bearing life testing

was employed to validate the method.

Keywords: recursive Bayesian analysis, asset health reliability, bearing life testing

Introduction

Reliability models have been reasonably well developed for asset health evaluation in the

engineering asset maintenance and management field. Conventional reliability analysis is

largely conducted through the analysis of population data, e.g. the data from life testing

experiments of a batch of similar assets. With condition monitoring techniques being adopted

in different industrial sectors, a large amount of observation data is typically collected from

individual critical assets during their operation. Such condition monitoring data is used for

troubleshooting (e.g. fault diagnosis) and short term asset condition prediction (e.g.

prognosis) (Jardine A. K. S., Lin D. et al. 2005). Using condition data for the estimation of

asset health reliability however has not been well explored. The idea is dependent on the

assumption that condition monitoring data is able to reflect the underlying degradation

process of an asset, and that the variation of condition data manifests the reliability change of

an asset. As a result, asset health reliability can be estimated from condition data. In this

work, a recursive Bayesian analysis method making use of condition data for asset health

reliability estimation is investigated. The method enables reliability analysis and prediction

based on condition monitoring data from an individual asset, rather than through traditional

methods of using data from a family of assets.

Utilising condition monitoring data for reliability estimation has some similarities

with fault diagnosis. The severity of a fault is a key issue for diagnosis, as it presents insights

into how long before the asset fails. Identification of the severity of faults is analogous to the

estimation of asset reliability. In practice, fault severity has been commonly addressed by

different condition levels using indicators. For example, the widely exercised ISO standard

on vibration monitoring uses the Root Mean Square (RMS) of vibration signals to

differentiate machine conditions (ISO 10816 1998). Fault severity, if scaled in the range

between 0 and 1, expresses similar ideas with the reliability level in reliability analysis.

As the fault severity has a close connection with asset failure probability and hence

asset reliability, increasing efforts have been devoted to converting condition monitoring

information to reliability representation in recent years. Yan et al, (Yan J., Koc M. et al.

2004) modelled the relationship between failure probability and observed condition

mailto:[email protected]

2

monitoring data under different machine conditions using logistic regression algorithms. The

output of the logistic regression model is a probability value about the asset condition, which

is regarded as the asset health reliability. The reliability values were further modelled using

auto-regression moving average algorithms to attain failure prediction, and expert rules were

applied to produce reliability estimation using estimated condition data in case of insufficient

historical condition monitoring data being available.

It is believed that the variation of condition data reflects the degradation process of an

asset and hence can be modelled to represent asset health reliability. Knezevi etc. proposed

the use of relevant condition data for asset reliability estimation at an instant of operation

time (Knezevic 1987). The method was then developed for asset health reliability estimation

under multiple failure modes (Saranga H. and Knezevic 2000; Lu, Lu et al. 2001). When

extending this method to asset health prediction, one of the issues is to estimate probability

density functions (PDF) of future observation indicators. Feed-forward networks and self

organised maps have been proposed as prediction algorithms to represent the dispersion of

predicted condition indicators (Chinnam R.B. 1999). Expert rules and adaptive neuro-fuzzy

inference systems have also been reported on modelling asset reliability using observed data

(Chinnam R.B. and Baruah P. 2004), where the application of the neuro-fuzzy system is

similar to that of Yan (Yan J., Koc M. et al. 2004).

Reliability estimation based on condition data produces a time series of reliability

evaluations with respect to asset operation time. The time series of evaluations can then be

projected into the future for prognosis or prediction. Though the idea is analogous to the

condition based approach which uses regression or artificial intelligence algorithms to model

condition data and achieves prognosis using extrapolation, it provides an alternative approach

to evaluate asset reliability using condition data from individual assets. This work

investigates the formulation of asset health reliability estimation using a recursive Bayesian

estimation technique, and employs an experiment on bearing life testing as a case study.

Recursive Bayesian reliability estimation

We use T to denote the time to failure, )(tR to denote the probability of a failure occurring

after time t (or a failure does not occur before t ), i.e. )()( tTPtR , and )|( 1 ii ttR to denote

the conditional reliability which is interpreted as a probability that an asset will not fail until

1 itt given that the asset is operational at it . According to (Lewis E. E. 1996), the

unconditional reliability at time 1it is given by

)()|()( 11 iiii tRttRtR (1)

Hence the unconditional reliability can be determined using Bayesian estimation to

take account of previous conditional reliability in a recursive manner.

N

n

ninii ttRtRtR0

101 )|()()( (2)

Generally 1)( 0 tR , which is the case when the asset condition is deemed perfect.

The conditional reliability )|( 1 ii ttR can be derived from the condition monitoring

observation variable vector ],...,[ 1 myyy . Such a mixture of degradation variables preferably

reflects the loading and response variations of an asset during its operation. Given the m

indicators are of the ―lower-is-better‖ type signal, and a critical limit is ],...,[ 1

cl

m

clcl yyy , the

following parameter is defined as a failure index

3

11

1

1

1|...)ˆ(...

it

my

clmy

m

y

cly

itdydyygA (3)

where, )ˆ(yg is the joint probability density function of the m degradation variables. The

failure index indicates the failure probability at a particular time point. The critical limit is

referred to as the reliability threshold in this work. The conditional reliability is then defined

as

itit

itititit

ii AA

AAAAttR

1

111 ,1

),1/()1()|( (4)

Eq. (4) explains that the conditional reliability is only calculated when the failure index

increases. In other words, it is only upgraded when the condition deteriorates compared with

the previous detection. The failure index can be served as a criterion to decide if an

observation shall be used for upgrading reliability. In some applications, hazard is a

preferable indicator rather than reliability. According to (Lewis E. E. 1996), an accumulative

hazard at 1it is calculated from

)ln()( 11 ii RtH (5)

A scheme for implementing asset health reliability estimation Asset health reliability estimation should be conducted in the whole operation period of an

asset. According to the PF curve illustrated in Figure 1, any asset operates in two distinctive

zones, the normal zone from installation (I) to a potential failure (P), and the abnormal zone

from a potential failure (P) to a functional failure (F). In the normal zone, an asset runs free of

faults. In the abnormal zone, the asset operates with fault(s) and experiences a deterioration

process. To conduct asset health reliability estimation, it is of importance to differentiate the

two zones, especially to determine the time when the abnormal zone starts. This is generally

accomplished using condition monitoring techniques through examining observed data

(Vachtsevanos G., Lewis F. L. et al. 2006). Figure 2 further illustrates how measured data

indicates the underlying deterioration. The measurements present stable behaviours in the

normal zone (I-P: Installation to Potential failure) while an increasing trend in the abnormal

zone (P-F: Potential failure to Functional failure).

Operation Age

Functional failure

Potential failure

P FI

Functional

capability

Figure 1 PF curve

Eq. (4) assumes the use of a failure index to decide the observation for upgrading

reliability estimate. In practice, the threshold of condition indicators is commonly used to

determine if the condition is normal or not (Zhang S., Hodkiewicz M. et al. 2006). A scheme

4

is therefore required to incorporate condition thresholds in the implementation of recursive

Bayesian reliability estimation. Given observation data are measured at closely spaced

intervals, a procedure comprising two steps is taken to apply the data to asset health

reliability estimation.

Measurement

I P F Operation Age

Figure 2 Illustrative measurements with regards to operation

1) Determine if a measurement shall be counted as a candidate for reliability estimation.

This is achieved by examining the selected indicators from measured data, and can be

considered as a binary classification issue. For example, if the condition is assessed in

the abnormal zone, the measurement is counted in the reliability estimation process.

Otherwise, the measurement is ignored.

2) Apply the rule in Eq. (4) to the determined data and work out the conditional

reliability. The unconditional reliability is then computed using Eq. (2).

This procedure suggests that a filtering mechanism is required to filter out a degradation

process by examining the observation data. Two fundamental issues are then raised to

facilitate the implementation of the procedure.

1) The determination of the threshold which is used to judge the condition so that the

corresponding measurement shall be counted in the reliability estimation.

2) The determination of the threshold ],...,[ 1

cl

m

clcl yyy which is used for reliability

calculation.

The first issue is concerned with diagnosis. The underlying condition needs to be classified as

normal or abnormal. Numerous methods have been available for diagnosis, from simple ones

such as linear discrimination analysis to complex ones such as neural networks (Bishop C. M.

1995; Zhang S., Mathew J. et al. 2003). Statistical process control (SPC) have also been

applied as an alternative for abnormality detection (Goode K. B., Moore J. et al. 2000; Wang

W. 2002). In addition, multivariate analysis shall be investigated to take advantage of the

information from multiple indicators. In the selection of detection algorithms, it is generally

in favour of those capable of deriving instant decisions based on relative fewer

measurements.

As to the second issue, the area of the PDF above the threshold has been considered

reflecting the conditional probability that a failure occurs since the last measurement, given

the asset still functions at the last measurement time. This threshold lies in-between the

thresholds of potential failures and function failures. Empirical knowledge may have to be

resorted to determine the threshold (Ben-Daya M., Duffuaa S. O. et al. 2000). Other related

topics in implementing this scheme include indicator extraction, and PDF estimation of the

indicators etc. The PDF estimation becomes very problematic in case of sparse data and

demands effective methods (MacKay D. J. C. 2003).

Case study

5

Bearing failure is one of the foremost causes of breakdown in rotating machinery. A large

part of the work reported in the literature have focused on crack propagation failures, which

have resulted from an artificial or seeded damage (Dyer 1973; Dyer and Stewart 1978;

Stronach, Cudworth et al. 1984; Prabhu 1996; Kim June 1984) and component characteristic

frequencies can be identified as indicators for early failure warning. In this work, an

accelerated life testing of bearings without any seeded defects was adopted for testing the

algorithm on asset health reliability estimation. The observations from the experiment

showed more complicated frequency components. As a result, it was not suitable to sense

early failure warning using component characteristic frequencies as indicators.

The test rig for bearing life testing is shown in Figure 3 and Figure 4. A hydraulic ram

provided vertical forces on the bearing housing. A hand actuated hydraulic load unit was also

available. A slave bearing block was screwed on the load table with two M10x20 bolts.

Inside the slave bearing block two high precision bearings (7018CTDBLP6) were used to

support the shaft and maintain alignment. The input shaft on the slave bearing block was

driven by a TECO electric motor (5.5 kW and 4 poles) via a 630mm pulley to reduce the

motor speed from 1450 RPM to 211 RPM. The shaft was designed to be capable of

undertaking heavy loads. The bearing housing had two holes 90 degrees apart for mounting

sensors. The test bearing was a single-row deep groove ball bearing (SKF 6805). It was

loaded radically between 50-60 % of its dynamic loading capacity with a calculated cycle-life

of 1.47 million revolutions (Harris 1966).

An accelerometer (IMI 621B51) mounted on a bolt was screwed onto the bearing

outer race. At 90 degrees away from the accelerometer, a type J thermocouple (iron and

copper-nickel, with a range of 0º to 750 ºC) used another hole on the bearing housing and

contacted the bearing outer race. The vibration signals were amplified using a PCB482a20

amplifier (gain of 10), and filtered using an analogue filter (KROHN3202, cut-off frequency

of 3 kHz). Both vibration and thermocouple were connected with a connector (NI BNC2120)

and then came to a digital A/D converter (NI DAC6012). A computer with a data acquisition

software tool coded using LabVIEW was used for signal acquisition. The bearing was tested

until a failure was reached. Data was acquired in an interval of 5 minutes, with a sampling

frequency 12 kHz. For each sample, a total of 172 data points were recorded for 10.55

seconds.

Figure 3 Life testing rig

6

Figure 4 Bearing house and sensors

The bearing was operated 51 hours in four days under heavy loading, with a break in

each evening. Vibration features, i.e. kurtosis, RMS and Peak-to-Peak (P2P), together with

the temperature of the bearing outer race, were extracted and plotted. The ranges of features

extracted from the first and last samples in each day (including the 4th

day before the last 40

minutes) are presented in Table1, together with the maximum values of the features for the

data in the last 40 minutes. Note for the sample of 10.55 seconds, different sub-samples with

a data length of 4096 were used to work out the range of features. Data from the earlier three

days suggested that the bearing condition was normal. These data provide a profile of

baseline indicators.

Table 1 Statistical features summary

Day Temperature RMS P2P Kurtosis

1 28C – 32C 1.1 – 1.3 10 – 12 3 – 3.2

2 28C – 33C 1.1 – 1.3 10 – 12.5 3 – 3.3

3 28C – 34C 1.1 – 1.35 10 – 13 3 – 3.4

4 (before the last 40 min) 28C – 34C 1.1 – 1.25 10 – 13 3 – 3.4

4 (last 40 min) 37C 1.21 14.5 3.6

. Figure 5 displays the last 27 samples (with a sampling interval of 5 minutes) of the

temperature and three derived vibration indicators close to the bearing failure in day 4. Note

the vibration indicators were the average value of the total sample of 10.55 seconds. The

temperature always kept rising during the experiment, presenting a clear trend in relation with

bearing operation. From the sample 20, the temperature showed a significant start of increase,

which matches the indication from Kurtosis. The kurtosis before this sample is around 3.0

and afterwards is large than 3.0. This moment of time was considered as the onset of bearing

deterioration. Interestingly, RMS and P2P appeared to be the least conclusive for detecting

this potential failure occurrence. They presented a decreasing trend when the failure started

before showing an increasing trend (Karimi M. 2006). In the normal zone the Kurtosis also

displayed stable behaviour compared with RMS and P2P. Such stability assists the reduction

of false alarms in reliability estimation. Kurtosis was selected as an indicator for reliability

estimation since it presented a favourable indication throughout the progression of damage.

Reliability estimation was applied for the last 27 samples. The kurtosis of the first 20

samples were around 3.0, which is a common indication that the vibration was normal, and

hence the first 20 samples were treated from normal condition (Davies A. 1998). A kurtosis

level of 3.1 was taken as the threshold for triggering abnormality alarms, and a threshold of

3.7 was taken for triggering reliability estimation.

7

Figure 6 shows the trend of kurtosis with PDF versus operation time. For this specific

case, no false diagnosis occurred in the normal zone. In addition, no missed diagnosis

occurred in the abnormal zone. However, two measurements (24 and 26) demonstrate non-

degradation behaviours compared with their previous one and have been ruled out in the

reliability update process.

0 5 10 15 20 25 3032

34

36

38

temperature

0 5 10 15 20 25 300.5

1

1.5

rms

0 5 10 15 20 25 306

7

8

9

P2P

0 5 10 15 20 25 302.5

3

3.5

kurtosis

Figure 5 Indicators versus operation time

12

2.2

2.4

2.6

2.8

3

3.2

3.4

3.6

3.8

4

Indicator

2 3 4 5 6 7 8 9 10 11 12 13 14

Sample

15 16 17 18 19 20 21 22 23 24 25 26 27

Figure 6 Kurtosis with regards to operation time

Figure 7 shows the reliability estimation using the last 27 samples, where the aging effect is

incorporated using an exponential distribution of time to failure with a failure rate of 0.001.

8

Also the hazard is presented. As can be seen, the reliability of the bearing starts to decline

quickly from the time of bearing failure initiation (the 20th

sample).

0 5 10 15 20 25 300.5

0.6

0.7

0.8

0.9

1

Sample

Reliability

conditional

unconditional

0 5 10 15 20 25 300

0.2

0.4

0.6

0.8

Sample

Hazard

Figure 7 Reliability estimation considering aging effect

Conclusion and future work

A recursive Bayesian analysis was developed in this work to estimate asset health reliability

using condition monitoring data. It enables reliability evaluation using observations from

individual assets. The method can be performed in an ‗on-line‘ mode, and functions in both

normal and abnormal zones. Remaining useful life can also be evaluated based on the derived

reliability curve, given an acceptable reliability (hazard).

Although the proposed method was successfully illustrated through bearing life tests,

it needs to be improved from various aspects. Proposed future work include: developing

methods to minimise false diagnosis in the normal zone and improve the diagnosis response

time; estimating health reliability from mixed condition indicators using data fusion;

modelling remaining useful life estimation using both degradation data and reliability data;

developing novel methods for PDF estimation for multivariate and sparse data; taking

maintenance actions into consideration in the reliability estimation process; and conducting

case studies from field data.

Acknowledgement

The authors are grateful to Dr. Mahdi Karimi at Queensland University of Technology for

providing the data for testing.

References

Ben-Daya M., Duffuaa S. O. and Raouf A. (2000). Maintenance, modelling and optimization,

Kluwer Academic Publisher.

9

Bishop C. M. (1995). Neural Networks for Pattern Recognition, Oxford University Press.

Chinnam R.B. (1999). "On-line reliability estimation of individual components using

degradation signals." IEEE Transactions on Reliability 48(4): 403-412.

Chinnam R.B. and Baruah P. (2004). "A neuro-fuzzy approach for estimating mean residual

life in condition-based maintenance systems." International Journal of Materials and

Product Technology 20: 166–179.

Davies A. (1998). Handbook of condition monitoring techniques and methodology, London:

Chapman & Hall.

Dyer, D. (1973). "Bearing condition monitoring." Interim Report 1 Department of

Mechanical Engineering, University of Southampton, Southampton (UK).

Dyer, D. and R. Stewart, M., (1978). "Detection of rolling element bearing damaged by

statistical vibration analysis." Trans ASME, J Mech Design 100(2): 229-235.

Goode K. B., Moore J. and Roylance B.J. (2000). Plant machinery working life prediction

method utilizing reliability and condition-monitoring data. Proceedings of the Institution

of Mechanical Engineers Part E—Journal of Process Mechanical Engineering 214

(2000).

Harris, T. A. (1966). Rolling bearing analysis. New York, John Wiley and Sons.

ISO 10816 (1998). "Evaluation of machine vibration by measurements on non-rotating parts -

- Part 3: Industrial machines with nominal power above 15 kW and nominal speeds

between 120 r/min and 15 000 r/min when measured in situ."

Jardine A. K. S., Lin D. and Banjevic D. (2005). "A review on machinery diagnostics and

prognostics implementing condition-based maintenance". Mechanical Systems and

Signal Processing.

Karimi M. (2006). Machine Fault Diagnosis Using Blind Deconvolution. Brisbane,

Queensland University of Technology.

Kim, P. Y. (June 1984). "A review of rolling element bearing health monitoring (II):

preliminary test results on current technologies." Proceedings of Machinery Vibration

Monitoring and Analysis Meeting, Vibration Institute, New Orleans, LA, 26-28: 127-

137.

Knezevic, J. (1987). "Condition parameter based approach to calculation of reliability

characteristics." Reliability Engineering 19(1): 29-39.

Lewis E. E. (1996). Introduction to Reliability Engineering, Wiley.

Lu, S., H. Lu and W. J. Kolarik (2001). "Multivariate performance reliability prediction in

real-time." Reliability Engineering & System Safety 72(1): 39-45.

MacKay D. J. C. (2003). Information theory, inference, and learning algorithms, Cambridge

University Press.

Prabhu, R. (1996). "Rolling bearing diagnostics." Proceedings of the Indo-US Symposium on

Emerging Trends in Vibration and Noise Engineering, New Delhi,: 311-320.

Saranga H. and J. Knezevic (2000). "Reliability analysis using multiple relevant condition

parameters." Journal of quality in maintenance engineering 6(3): 165-176.

Stronach, A. F., C. J. Cudworth and A. B. Johnston (1984). "Condition monitoring of rolling

element bearings." Proceedings of the International Condition Monitoring Conference,

Swansea, UK, 10-13: 162-177.

Vachtsevanos G., Lewis F. L., Roemer M., Hess A. and Wu B. (2006). Intelligent Fault

Diagnosis and Prognosis for Engineering Systems, Wiley.

Wang W. (2002). "A model to predict the residual life of rolling element bearings given

monitored condition information to date." IMA Journal of Management Mathematics 13:

3-16.

Yan J., Koc M. and Lee J. (2004). "A prognostic algorithm for machine performance

assessment and its application." Production Planning and Control 15: 796–801.

10

Zhang S., Hodkiewicz M., Ma L. and Mathew J. (2006). Machinery condition prognosis

using multivariate analysis. The 1st World Congress of Engineering Asset Management,

Gold Coast, Australia.

Zhang S., Mathew J., Ma L. and Sun Y. (2003). Integrated machine fault diagnosis using

wavelet packet transform and probabilistic neural networks. The 10th Asia Pacific

Vibration Conference, Gold Coast, Australia.

zhang, sheng,ma, lin,sun, yong, &mathew, joseph in gelman ... · 2 monitoring data under...

Documents