This may be the author's version of a work that was submitted/accepted for publication in the following source:
Zhang, Sheng, Ma, Lin, Sun, Yong, & Mathew, Joseph (2007) Asset Health Reliability Estimation Based on Condition Data. In Gelman, L (Ed.) Proceedings of the 2nd World Congress on Engineering Asset Management and the 4th International Conference on Condition Monitoring. Coxmoor Publishing Company, CD Rom, pp. 2195-2204.
This file was downloaded from: https://eprints.qut.edu.au/13311/
© Copyright 2007 (please consult author)
Zhang, Sheng and Ma, Lin and Sun, Yong and Mathew, Joseph (2007) Asset health reliability estimation based on condition data. In Proceedings World Congress on Engineering Asset Management, pp. 2195-2204, Harrogate, England.
Asset health reliability estimation based on condition data
Sheng Zhang, Lin Ma, Yong Sun, Joseph Mathew
CRC for Integrated Engineering Asset Management, School of Engineering Systems,
Queensland University of Technology, Brisbane, QLD 4001, Australia
Abstract
This paper presents a method using recursive Bayesian analysis to estimate asset health
reliability. The method enables reliability analysis and prediction using degradation condition
monitoring data from an individual asset, rather than using failure data from a population of
assets. A scheme was devised to assist the implementation of the method in both normal and
abnormal zones during the whole asset operation cycle. An experiment on bearing life testing
was employed to validate the method.
Keywords: recursive Bayesian analysis, asset health reliability, bearing life testing
Introduction
Reliability models have been reasonably well developed for asset health evaluation in the
engineering asset maintenance and management field. Conventional reliability analysis is
largely conducted through the analysis of population data, e.g. the data from life testing
experiments of a batch of similar assets. With condition monitoring techniques being adopted
in different industrial sectors, a large amount of observation data is typically collected from
individual critical assets during their operation. Such condition monitoring data is used for
troubleshooting (e.g. fault diagnosis) and short term asset condition prediction (e.g.
prognosis) (Jardine A. K. S., Lin D. et al. 2005). Using condition data for the estimation of
asset health reliability, however, has not been well explored. The idea depends on the
assumption that condition monitoring data is able to reflect the underlying degradation
process of an asset, and that the variation of condition data manifests the reliability change of
an asset. As a result, asset health reliability can be estimated from condition data. In this
work, a recursive Bayesian analysis method making use of condition data for asset health
reliability estimation is investigated. The method enables reliability analysis and prediction
based on condition monitoring data from an individual asset, rather than through traditional
methods of using data from a family of assets.
Utilising condition monitoring data for reliability estimation has some similarities
with fault diagnosis. The severity of a fault is a key issue for diagnosis, as it presents insights
into how long before the asset fails. Identification of the severity of faults is analogous to the
estimation of asset reliability. In practice, fault severity has been commonly addressed by
different condition levels using indicators. For example, the widely exercised ISO standard
on vibration monitoring uses the Root Mean Square (RMS) of vibration signals to
differentiate machine conditions (ISO 10816 1998). Fault severity, if scaled in the range
between 0 and 1, expresses an idea similar to the reliability level in reliability analysis.
As the fault severity has a close connection with asset failure probability and hence
asset reliability, increasing efforts have been devoted to converting condition monitoring
information to reliability representation in recent years. Yan et al. (Yan J., Koc M. et al.
2004) modelled the relationship between failure probability and observed condition
monitoring data under different machine conditions using logistic regression algorithms. The
output of the logistic regression model is a probability value about the asset condition, which
is regarded as the asset health reliability. The reliability values were further modelled using
autoregressive moving average algorithms to attain failure prediction, and expert rules were
applied to produce reliability estimates from estimated condition data when insufficient
historical condition monitoring data were available.
It is believed that the variation of condition data reflects the degradation process of an
asset and hence can be modelled to represent asset health reliability. Knezevic proposed
the use of relevant condition data for asset reliability estimation at an instant of operation
time (Knezevic 1987). The method was then developed for asset health reliability estimation
under multiple failure modes (Saranga H. and Knezevic 2000; Lu, Lu et al. 2001). When
extending this method to asset health prediction, one of the issues is to estimate probability
density functions (PDFs) of future observation indicators. Feed-forward networks and
self-organising maps have been proposed as prediction algorithms to represent the dispersion of
predicted condition indicators (Chinnam R.B. 1999). Expert rules and adaptive neuro-fuzzy
inference systems have also been reported on modelling asset reliability using observed data
(Chinnam R.B. and Baruah P. 2004), where the application of the neuro-fuzzy system is
similar to that of Yan (Yan J., Koc M. et al. 2004).
Reliability estimation based on condition data produces a time series of reliability
evaluations with respect to asset operation time. The time series of evaluations can then be
projected into the future for prognosis or prediction. Though the idea is analogous to the
condition based approach which uses regression or artificial intelligence algorithms to model
condition data and achieves prognosis using extrapolation, it provides an alternative approach
to evaluate asset reliability using condition data from individual assets. This work
investigates the formulation of asset health reliability estimation using a recursive Bayesian
estimation technique, and employs an experiment on bearing life testing as a case study.
Recursive Bayesian reliability estimation
We use $T$ to denote the time to failure, $R(t)$ to denote the probability of a failure occurring
after time $t$ (or a failure does not occur before $t$), i.e. $R(t) = P(T > t)$, and $R(t_{i+1} \mid t_i)$ to denote
the conditional reliability, which is interpreted as the probability that an asset will not fail until
$t_{i+1}$ given that the asset is operational at $t_i$. According to (Lewis E. E. 1996), the
unconditional reliability at time $t_{i+1}$ is given by
$$R(t_{i+1}) = R(t_{i+1} \mid t_i)\, R(t_i) \qquad (1)$$
Hence the unconditional reliability can be determined using Bayesian estimation to
take account of previous conditional reliability in a recursive manner.
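$$R(t_{i+1}) = R(t_0) \prod_{n=0}^{N} R(t_{i-n+1} \mid t_{i-n}) \qquad (2)$$
Here $N = i$, so that the product runs over every inspection interval back to $t_0$.
Generally $R(t_0) = 1$, which is the case when the asset condition is deemed perfect.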
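The recursion in Eqs. (1) and (2) can be illustrated with a minimal Python sketch. The function name `unconditional_reliability` and the example values are hypothetical and are not taken from the paper; the sketch only assumes that the conditional reliabilities are supplied in time order.

```python
from itertools import accumulate
from operator import mul


def unconditional_reliability(conditional, r0=1.0):
    """Return [R(t_0), R(t_1), ..., R(t_N)].

    conditional[i] holds R(t_{i+1} | t_i); r0 is R(t_0), taken as 1
    when the asset is deemed perfect at the start.
    """
    for c in conditional:
        if not 0.0 <= c <= 1.0:
            raise ValueError("conditional reliabilities must lie in [0, 1]")
    # Eq. (1) applied step by step: R(t_{i+1}) = R(t_{i+1} | t_i) * R(t_i).
    # The running product is the same quantity as the product in Eq. (2).
    return list(accumulate(conditional, mul, initial=r0))


# Hypothetical conditional reliabilities for three inspection intervals.
curve = unconditional_reliability([1.0, 0.9, 0.8])
print(curve)
```

Because each new value needs only the previous one, the estimate can be refreshed as every new measurement arrives without storing the full history.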
The conditional reliability $R(t_{i+1} \mid t_i)$ can be derived from the condition monitoring
observation variable vector $y = [y_1, \ldots, y_m]$. Such a mixture of degradation variables preferably
reflects the loading and response variations of an asset during its operation. Given that the $m$
indicators are of the "lower-is-better" signal type, and a critical limit is $y^{cl} = [y_1^{cl}, \ldots, y_m^{cl}]$, the
following parameter is defined as a failure index
$$A_{t_{i+1}} = \left. \int_{y_1^{cl}}^{\infty} \cdots \int_{y_m^{cl}}^{\infty} g(\hat{y})\, dy_1 \cdots dy_m \right|_{t_{i+1}} \qquad (3)$$
where $g(\hat{y})$ is the joint probability density function of the $m$ degradation variables. The
failure index indicates the failure probability at a particular time point. The critical limit is
referred to as the reliability threshold in this work. The conditional reliability is then defined
as
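$$R(t_{i+1} \mid t_i) = \begin{cases} (1 - A_{t_{i+1}}) / (1 - A_{t_i}), & A_{t_{i+1}} > A_{t_i} \\ 1, & A_{t_{i+1}} \le A_{t_i} \end{cases} \qquad (4)$$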
Eq. (4) explains that the conditional reliability is only calculated when the failure index
increases. In other words, it is only updated when the condition deteriorates compared with
the previous detection. The failure index can serve as a criterion to decide whether an
observation shall be used for updating reliability. In some applications, hazard is a
preferable indicator to reliability. According to (Lewis E. E. 1996), the cumulative
hazard at $t_{i+1}$ is calculated from
$$H(t_{i+1}) = -\ln R(t_{i+1}) \qquad (5)$$
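As a minimal sketch of Eqs. (3) to (5) for a single indicator, the Python below assumes that the indicator's PDF is normal with a known mean and standard deviation. The paper does not prescribe a distribution, so this is only one possible choice, and the function names and numerical values are hypothetical.

```python
import math


def failure_index(mean, std, critical_limit):
    """Area of a normal PDF above the critical limit (one-variable Eq. (3)).

    The normal distribution is an assumption made for this sketch only.
    """
    if std <= 0:
        raise ValueError("std must be positive")
    z = (critical_limit - mean) / std
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def conditional_reliability(a_prev, a_next):
    """Eq. (4): update only when the failure index has increased."""
    if a_next > a_prev:
        return (1.0 - a_next) / (1.0 - a_prev)
    return 1.0


def cumulative_hazard(reliability):
    """Eq. (5): H = -ln(R)."""
    return -math.log(reliability)


# Hypothetical indicator statistics at two successive inspections.
a_prev = failure_index(mean=3.0, std=0.2, critical_limit=3.7)
a_next = failure_index(mean=3.5, std=0.2, critical_limit=3.7)
r_cond = conditional_reliability(a_prev, a_next)
print(a_prev, a_next, r_cond, cumulative_hazard(r_cond))
```

For several indicators the single tail area would be replaced by the joint integral in Eq. (3), which generally requires a multivariate PDF estimate.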
A scheme for implementing asset health reliability estimation
Asset health reliability estimation should be conducted in the whole operation period of an
asset. According to the PF curve illustrated in Figure 1, any asset operates in two distinctive
zones, the normal zone from installation (I) to a potential failure (P), and the abnormal zone
from a potential failure (P) to a functional failure (F). In the normal zone, an asset runs free of
faults. In the abnormal zone, the asset operates with fault(s) and experiences a deterioration
process. To conduct asset health reliability estimation, it is of importance to differentiate the
two zones, especially to determine the time when the abnormal zone starts. This is generally
accomplished using condition monitoring techniques through examining observed data
(Vachtsevanos G., Lewis F. L. et al. 2006). Figure 2 further illustrates how measured data
indicates the underlying deterioration. The measurements present stable behaviours in the
normal zone (I-P: Installation to Potential failure) while an increasing trend in the abnormal
zone (P-F: Potential failure to Functional failure).
Figure 1 PF curve (functional capability versus operation age; I: installation, P: potential failure, F: functional failure)
Eq. (4) assumes the use of a failure index to select the observations for updating the
reliability estimate. In practice, the threshold of condition indicators is commonly used to
determine if the condition is normal or not (Zhang S., Hodkiewicz M. et al. 2006). A scheme
is therefore required to incorporate condition thresholds in the implementation of recursive
Bayesian reliability estimation. Given observation data are measured at closely spaced
intervals, a procedure comprising two steps is taken to apply the data to asset health
reliability estimation.
Figure 2 Illustrative measurements with regards to operation (measurement versus operation age, with points I, P and F)
1) Determine if a measurement shall be counted as a candidate for reliability estimation.
This is achieved by examining the selected indicators from measured data, and can be
considered as a binary classification issue. For example, if the condition is assessed in
the abnormal zone, the measurement is counted in the reliability estimation process.
Otherwise, the measurement is ignored.
2) Apply the rule in Eq. (4) to the determined data and work out the conditional
reliability. The unconditional reliability is then computed using Eq. (2).
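The two steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the diagnosis result is supplied as a ready-made boolean per measurement, the reference failure index starts at 0 (matching $R(t_0) = 1$), and each accepted measurement is compared with the last accepted one. The function name and the input values are hypothetical.

```python
def estimate_reliability(failure_indices, is_abnormal, r0=1.0):
    """Return the unconditional reliability after each measurement.

    failure_indices[k]: failure index A of measurement k, as in Eq. (3)
    is_abnormal[k]:     True if measurement k is diagnosed as abnormal
    """
    reliability = r0
    a_ref = 0.0  # failure index of the last accepted measurement (assumed start)
    curve = []
    for a, abnormal in zip(failure_indices, is_abnormal):
        # Step 1: measurements diagnosed as normal are ignored.
        # Step 2: Eq. (4) lowers the reliability only if the index increased.
        if abnormal and a > a_ref:
            reliability *= (1.0 - a) / (1.0 - a_ref)  # recursion of Eq. (1)
            a_ref = a
        curve.append(reliability)
    return curve


# Hypothetical failure indices and diagnosis flags for five measurements.
indices = [0.01, 0.02, 0.05, 0.04, 0.10]
flags = [False, False, True, True, True]
curve = estimate_reliability(indices, flags)
print(curve)
```

Comparing against the last accepted measurement keeps the curve non-increasing when an index dips and then recovers; comparing against the immediately preceding measurement instead is an equally plausible reading of Eq. (4).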
This procedure suggests that a filtering mechanism is required to isolate the degradation
process by examining the observation data. Two fundamental issues are then raised to
facilitate the implementation of the procedure.
1) The determination of the threshold which is used to judge the condition so that the
corresponding measurement shall be counted in the reliability estimation.
2) The determination of the threshold $y^{cl} = [y_1^{cl}, \ldots, y_m^{cl}]$ which is used for reliability
calculation.
The first issue is concerned with diagnosis. The underlying condition needs to be classified as
normal or abnormal. Numerous methods have been available for diagnosis, from simple ones
such as linear discrimination analysis to complex ones such as neural networks (Bishop C. M.
1995; Zhang S., Mathew J. et al. 2003). Statistical process control (SPC) has also been
applied as an alternative for abnormality detection (Goode K. B., Moore J. et al. 2000; Wang
W. 2002). In addition, multivariate analysis should be investigated to take advantage of the
information from multiple indicators. In the selection of detection algorithms, preference is
generally given to those capable of reaching prompt decisions from relatively few
measurements.
As to the second issue, the area of the PDF above the threshold is considered to
reflect the conditional probability that a failure occurs since the last measurement, given that
the asset still functioned at the last measurement time. This threshold lies between the
thresholds of potential failure and functional failure. Empirical knowledge may be
needed to determine the threshold (Ben-Daya M., Duffuaa S. O. et al. 2000). Other related
topics in implementing this scheme include indicator extraction and PDF estimation of the
indicators. The PDF estimation becomes very problematic with sparse data and
demands effective methods (MacKay D. J. C. 2003).
Case study
Bearing failure is one of the foremost causes of breakdown in rotating machinery. A large
part of the work reported in the literature has focused on crack propagation failures
resulting from artificial or seeded damage (Dyer 1973; Dyer and Stewart 1978;
Stronach, Cudworth et al. 1984; Kim 1984; Prabhu 1996), for which component characteristic
frequencies can be identified as indicators for early failure warning. In this work, an
accelerated life test of bearings without any seeded defects was adopted for testing the
algorithm on asset health reliability estimation. The observations from the experiment
showed more complicated frequency components. As a result, component characteristic
frequencies were not suitable as indicators for early failure warning.
The test rig for bearing life testing is shown in Figure 3 and Figure 4. A hydraulic ram
provided vertical forces on the bearing housing. A hand actuated hydraulic load unit was also
available. A slave bearing block was screwed on the load table with two M10x20 bolts.
Inside the slave bearing block two high precision bearings (7018CTDBLP6) were used to
support the shaft and maintain alignment. The input shaft on the slave bearing block was
driven by a TECO electric motor (5.5 kW and 4 poles) via a 630 mm pulley to reduce the
motor speed from 1450 RPM to 211 RPM. The shaft was designed to be capable of
undertaking heavy loads. The bearing housing had two holes 90 degrees apart for mounting
sensors. The test bearing was a single-row deep groove ball bearing (SKF 6805). It was
loaded radially between 50-60 % of its dynamic loading capacity with a calculated cycle-life
of 1.47 million revolutions (Harris 1966).
An accelerometer (IMI 621B51) mounted on a bolt was screwed onto the bearing
outer race. At 90 degrees away from the accelerometer, a type J thermocouple (iron and
copper-nickel, with a range of 0 ºC to 750 ºC) was inserted through the other hole in the bearing
housing and contacted the bearing outer race. The vibration signals were amplified using a PCB482a20
amplifier (gain of 10), and filtered using an analogue filter (KROHN3202, cut-off frequency
of 3 kHz). Both the vibration and thermocouple signals were routed through a connector (NI BNC2120)
to an A/D converter (NI DAC6012). A computer with a data acquisition
software tool coded in LabVIEW was used for signal acquisition. The bearing was tested
until a failure was reached. Data were acquired at intervals of 5 minutes, with a sampling
frequency of 12 kHz. For each sample, a total of 172 data points were recorded for 10.55
seconds.
Figure 3 Life testing rig
Figure 4 Bearing housing and sensors
The bearing was operated for 51 hours over four days under heavy loading, with a break
each evening. Vibration features, i.e. kurtosis, RMS and Peak-to-Peak (P2P), together with
the temperature of the bearing outer race, were extracted and plotted. The ranges of features
extracted from the first and last samples in each day (including the 4th day before the last 40
minutes) are presented in Table 1, together with the maximum values of the features for the
data in the last 40 minutes. Note that for each sample of 10.55 seconds, different sub-samples with
a data length of 4096 were used to work out the range of features. Data from the first three
days suggested that the bearing condition was normal. These data provide a profile of
baseline indicators.
Table 1 Statistical features summary
Day                          Temperature (°C)   RMS          P2P          Kurtosis
1                            28 – 32            1.1 – 1.3    10 – 12      3 – 3.2
2                            28 – 33            1.1 – 1.3    10 – 12.5    3 – 3.3
3                            28 – 34            1.1 – 1.35   10 – 13      3 – 3.4
4 (before the last 40 min)   28 – 34            1.1 – 1.25   10 – 13      3 – 3.4
4 (last 40 min)              37                 1.21         14.5         3.6
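The feature ranges summarised above come from sub-samples of length 4096. A minimal Python sketch of such an extraction is given below. It assumes non-overlapping sub-samples, an RMS computed on the raw signal, and the non-excess definition of kurtosis (3 for a Gaussian signal, consistent with the baseline values near 3.0). The paper does not state these details, and the function names and the synthetic signal are hypothetical.

```python
import math
import random


def vibration_features(segment):
    """RMS, peak-to-peak and (non-excess) kurtosis of one sub-sample."""
    n = len(segment)
    mean = sum(segment) / n
    rms = math.sqrt(sum(v * v for v in segment) / n)
    p2p = max(segment) - min(segment)
    m2 = sum((v - mean) ** 2 for v in segment) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in segment) / n  # fourth central moment
    if m2 == 0.0:
        raise ValueError("kurtosis is undefined for a constant segment")
    return {"rms": rms, "p2p": p2p, "kurtosis": m4 / (m2 * m2)}


def feature_ranges(signal, length=4096):
    """(min, max) of each feature over non-overlapping sub-samples."""
    n_segments = len(signal) // length
    if n_segments == 0:
        raise ValueError("signal is shorter than one sub-sample")
    feats = [vibration_features(signal[k * length:(k + 1) * length])
             for k in range(n_segments)]
    return {name: (min(f[name] for f in feats), max(f[name] for f in feats))
            for name in ("rms", "p2p", "kurtosis")}


# Synthetic Gaussian noise standing in for a healthy vibration record.
rng = random.Random(0)
signal = [rng.gauss(0.0, 1.0) for _ in range(4 * 4096)]
ranges = feature_ranges(signal)
print(ranges)
```

For Gaussian noise the kurtosis range stays close to 3, which is why a departure from that level is commonly read as a sign of impulsive content in the vibration signal.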
Figure 5 displays the last 27 samples (with a sampling interval of 5 minutes) of the
temperature and three derived vibration indicators close to the bearing failure on day 4. Note that
the vibration indicators were averaged over the total sample of 10.55 seconds. The
temperature kept rising during the experiment, presenting a clear trend in relation to
bearing operation. From sample 20, the temperature showed a significant increase,
which matches the indication from kurtosis. The kurtosis before this sample is around 3.0
and afterwards is larger than 3.0. This moment was considered the onset of bearing
deterioration. Interestingly, RMS and P2P appeared to be the least conclusive for detecting
this potential failure occurrence. They presented a decreasing trend when the failure started
before showing an increasing trend (Karimi M. 2006). In the normal zone the kurtosis also
displayed stable behaviour compared with RMS and P2P. Such stability helps to reduce
false alarms in reliability estimation. Kurtosis was selected as the indicator for reliability
estimation since it presented a favourable indication throughout the progression of damage.
Reliability estimation was applied to the last 27 samples. The kurtosis of the first 20
samples was around 3.0, which is a common indication that the vibration was normal, and
hence the first 20 samples were treated as representing normal condition (Davies A. 1998). A kurtosis
level of 3.1 was taken as the threshold for triggering abnormality alarms, and a threshold of
3.7 was taken for triggering reliability estimation.
Figure 6 shows the trend of kurtosis with PDF versus operation time. For this specific
case, no false diagnosis occurred in the normal zone. In addition, no missed diagnosis
occurred in the abnormal zone. However, two measurements (24 and 26) demonstrate
non-degradation behaviour compared with the preceding measurement and have been ruled out of the
reliability update process.
Figure 5 Indicators versus operation time (panels: temperature, rms, P2P, kurtosis)
Figure 6 Kurtosis with regards to operation time (indicator versus sample)
Figure 7 shows the reliability estimation using the last 27 samples, where the aging effect is
incorporated using an exponential distribution of time to failure with a failure rate of 0.001.
The hazard is also presented. As can be seen, the reliability of the bearing starts to decline
quickly from the time of bearing failure initiation (the 20th sample).
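The paper does not spell out how the exponential aging term enters the recursion. One plausible reading, sketched below in Python purely as an assumption, multiplies each condition-based conditional reliability by the exponential survival probability over one sampling interval, $e^{-\lambda \Delta t}$. The function name, the time unit (one sample interval) and the input values are hypothetical.

```python
import math


def aged_conditional_reliability(condition_based, failure_rate=0.001, dt=1.0):
    """Combine a condition-based conditional reliability with ageing.

    Assumption for this sketch: the ageing contribution is the survival
    probability of an exponential time to failure over one interval,
    exp(-failure_rate * dt), with dt measured in sample intervals.
    """
    if failure_rate < 0 or dt < 0:
        raise ValueError("failure_rate and dt must be non-negative")
    return condition_based * math.exp(-failure_rate * dt)


# Hypothetical condition-based conditional reliabilities (Eq. (4) outputs).
reliability = 1.0
curve = []
for r_cond in [1.0, 1.0, 0.95, 1.0, 0.9]:
    reliability *= aged_conditional_reliability(r_cond)  # recursion of Eq. (1)
    curve.append(reliability)
print(curve)
```

Under this reading the reliability declines slowly in the normal zone, where the condition-based term stays at 1, and faster once the failure index starts to rise.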
Figure 7 Reliability estimation considering aging effect (panels: conditional and unconditional reliability versus sample; hazard versus sample)
Conclusion and future work
A recursive Bayesian analysis was developed in this work to estimate asset health reliability
using condition monitoring data. It enables reliability evaluation using observations from
individual assets. The method can be performed in an 'on-line' mode, and functions in both
normal and abnormal zones. Remaining useful life can also be evaluated based on the derived
reliability curve, given an acceptable reliability (hazard).
Although the proposed method was successfully illustrated through bearing life tests,
it needs to be improved in various respects. Proposed future work includes: developing
methods to minimise false diagnosis in the normal zone and improve the diagnosis response
time; estimating health reliability from mixed condition indicators using data fusion;
modelling remaining useful life estimation using both degradation data and reliability data;
developing novel methods for PDF estimation for multivariate and sparse data; taking
maintenance actions into consideration in the reliability estimation process; and conducting
case studies from field data.
Acknowledgement
The authors are grateful to Dr. Mahdi Karimi at Queensland University of Technology for
providing the data for testing.
References
Ben-Daya M., Duffuaa S. O. and Raouf A. (2000). Maintenance, modelling and optimization,
Kluwer Academic Publisher.
Bishop C. M. (1995). Neural Networks for Pattern Recognition, Oxford University Press.
Chinnam R.B. (1999). "On-line reliability estimation of individual components using
degradation signals." IEEE Transactions on Reliability 48(4): 403-412.
Chinnam R.B. and Baruah P. (2004). "A neuro-fuzzy approach for estimating mean residual
life in condition-based maintenance systems." International Journal of Materials and
Product Technology 20: 166–179.
Davies A. (1998). Handbook of condition monitoring techniques and methodology, London:
Chapman & Hall.
Dyer, D. (1973). "Bearing condition monitoring." Interim Report 1 Department of
Mechanical Engineering, University of Southampton, Southampton (UK).
Dyer, D. and Stewart, R. M. (1978). "Detection of rolling element bearing damage by
statistical vibration analysis." Trans ASME, J Mech Design 100(2): 229-235.
Goode K. B., Moore J. and Roylance B.J. (2000). Plant machinery working life prediction
method utilizing reliability and condition-monitoring data. Proceedings of the Institution
of Mechanical Engineers Part E—Journal of Process Mechanical Engineering 214
(2000).
Harris, T. A. (1966). Rolling bearing analysis. New York, John Wiley and Sons.
ISO 10816 (1998). "Evaluation of machine vibration by measurements on non-rotating parts
-- Part 3: Industrial machines with nominal power above 15 kW and nominal speeds
between 120 r/min and 15 000 r/min when measured in situ."
Jardine A. K. S., Lin D. and Banjevic D. (2005). "A review on machinery diagnostics and
prognostics implementing condition-based maintenance". Mechanical Systems and
Signal Processing.
Karimi M. (2006). Machine Fault Diagnosis Using Blind Deconvolution. Brisbane,
Queensland University of Technology.
Kim, P. Y. (1984). "A review of rolling element bearing health monitoring (II):
preliminary test results on current technologies." Proceedings of Machinery Vibration
Monitoring and Analysis Meeting, Vibration Institute, New Orleans, LA, June 26-28: 127-137.
Knezevic, J. (1987). "Condition parameter based approach to calculation of reliability
characteristics." Reliability Engineering 19(1): 29-39.
Lewis E. E. (1996). Introduction to Reliability Engineering, Wiley.
Lu, S., H. Lu and W. J. Kolarik (2001). "Multivariate performance reliability prediction in
real-time." Reliability Engineering & System Safety 72(1): 39-45.
MacKay D. J. C. (2003). Information theory, inference, and learning algorithms, Cambridge
University Press.
Prabhu, R. (1996). "Rolling bearing diagnostics." Proceedings of the Indo-US Symposium on
Emerging Trends in Vibration and Noise Engineering, New Delhi,: 311-320.
Saranga H. and J. Knezevic (2000). "Reliability analysis using multiple relevant condition
parameters." Journal of quality in maintenance engineering 6(3): 165-176.
Stronach, A. F., C. J. Cudworth and A. B. Johnston (1984). "Condition monitoring of rolling
element bearings." Proceedings of the International Condition Monitoring Conference,
Swansea, UK, 10-13: 162-177.
Vachtsevanos G., Lewis F. L., Roemer M., Hess A. and Wu B. (2006). Intelligent Fault
Diagnosis and Prognosis for Engineering Systems, Wiley.
Wang W. (2002). "A model to predict the residual life of rolling element bearings given
monitored condition information to date." IMA Journal of Management Mathematics 13:
3-16.
Yan J., Koc M. and Lee J. (2004). "A prognostic algorithm for machine performance
assessment and its application." Production Planning and Control 15: 796–801.
Zhang S., Hodkiewicz M., Ma L. and Mathew J. (2006). Machinery condition prognosis
using multivariate analysis. The 1st World Congress of Engineering Asset Management,
Gold Coast, Australia.
Zhang S., Mathew J., Ma L. and Sun Y. (2003). Integrated machine fault diagnosis using
wavelet packet transform and probabilistic neural networks. The 10th Asia Pacific
Vibration Conference, Gold Coast, Australia.