Fault Diagnosis Using Wavelet Neural Networks
LIU QIPENG, YU XIAOLING and FENG QUANKE
School of Energy and Power Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China.
e-mail: [email protected]
Abstract. Wavelet neural networks are a class of neural networks built from wavelets. This paper presents them as a novel universal tool for fault diagnosis, and algorithms for wavelet neural network construction are proposed. Using the model of wavelet neural networks, we can not only extract the features of a system but also predict the development of a fault.
Key words. fault diagnosis, feature extraction, neural networks, RBF, wavelet neural networks, WBF
1. Introduction
Fault diagnosis plays an important role in the operation of modern industrial systems. We are confronted with two problems: how do we 'measure' the growth of a fault, and how do we predict the remaining useful lifetime of such a failing component or machine? These two problems have not yet been solved satisfactorily. This paper attempts to address them with intelligence-oriented techniques, specifically wavelet neural networks (WNN).
Based on wavelet theory, wavelet neural networks (WNN) are a novel universal tool for fault diagnosis. A WNN is a feed-forward neural network built on the wavelet transform and the adaptive wavelet network [1-5]. The essence of a WNN is to find a family of wavelets in the characteristic space such that the complex functional relationship contained in the original signal can be exactly expressed. The network inherits the advantages of the wavelet transform in denoising, background reduction, and recovery of characteristic information. In this paper, the theory of WNN is described, and the new network is applied to fault diagnosis.
2. Wavelet Transform
The wavelet transform (WT) is a mathematical theory developed in recent years. It is considered a major breakthrough following Fourier analysis, and has been widely used in many research fields, such as data compression and fault diagnosis.
Let $\psi(x)$ be a mother wavelet. The wavelets $\psi_{a,b}(x)$ can be derived through the following dilation and translation processes [6]:

$$\psi_{a,b}(x) = a^{-1/2}\,\psi\!\left(\frac{x-b}{a}\right), \qquad a, b \in \mathbb{R},\ a > 0 \qquad (1)$$
Neural Processing Letters 18: 115-123, 2003. © 2003 Kluwer Academic Publishers. Printed in the Netherlands.
where $a$ and $b$ are the scale and position parameters, respectively, both real numbers. The basic idea of the WT is to represent an arbitrary function $f(t)$ as a superposition of wavelets. The continuous wavelet transform of $f(t)$ is given by:
$$w_f(a,b) = \int_{-\infty}^{+\infty} \psi_{a,b}(t)\,f(t)\,dt \qquad (2)$$
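As a concrete illustration, the continuous wavelet transform above can be approximated on uniformly sampled data by a Riemann sum. The sketch below is not from the paper: the test signal, the scale and shift grids, and the choice of the Mexican-hat mother wavelet are all illustrative assumptions.

```python
import numpy as np

def mexican_hat(x):
    # Real Mexican-hat wavelet (second derivative of a Gaussian),
    # used here only as an illustrative mother wavelet.
    return (1.0 - x**2) * np.exp(-x**2 / 2.0)

def cwt(f, t, scales, shifts, psi=mexican_hat):
    # Riemann-sum approximation of w_f(a, b) = integral of psi_{a,b}(t) f(t) dt.
    dt = t[1] - t[0]
    w = np.empty((len(scales), len(shifts)))
    for i, a in enumerate(scales):
        for j, b in enumerate(shifts):
            psi_ab = a ** -0.5 * psi((t - b) / a)  # dilation and translation, Eq. (1)
            w[i, j] = np.sum(psi_ab * f) * dt      # inner product, Eq. (2)
    return w

# usage: a chirp-like test signal sampled on [0, 1]
t = np.linspace(0, 1, 1000)
f = np.sin(2 * np.pi * 5 * t**2)
w = cwt(f, t, scales=[0.01, 0.02, 0.05], shifts=np.linspace(0, 1, 50))
print(w.shape)   # (3, 50)
```

Large values of $|w_f(a,b)|$ indicate where the signal locally resembles the wavelet at scale $a$ and position $b$.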
The inverse continuous wavelet transform can be obtained through the following
formula:
$$f(t) = \frac{1}{C_\psi}\int_0^{+\infty}\frac{da}{a^2}\int_{-\infty}^{+\infty} w_f(a,b)\,\psi_{a,b}(t)\,db = \frac{1}{C_\psi}\int_0^{+\infty}\frac{da}{a^2}\int_{-\infty}^{+\infty} w_f(a,b)\,\frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right)db \qquad (3)$$
where $C_\psi = \int_0^{+\infty} \frac{|\Psi(a\omega)|^2}{a}\,da < \infty$ is a constant depending only on $\psi$. In general, the representation in Equation (3) is redundant when the continuous wavelet transform $w_f(a,b)$ is used. Let $a = 2^j$, $b = 2^j k$, with $j, k \in \mathbb{Z}$. Then the dyadic discrete form of $\psi_{a,b}(x)$ is $\psi_{j,k}(x) = 2^{-j/2}\psi(2^{-j}x - k)$, and the signal $f(t)$ can be represented by a set of coefficients $C_{j,k}$:

$$f(t) = \sum_{j\in\mathbb{Z}}\sum_{k\in\mathbb{Z}} C_{j,k}\,\psi_{j,k}(t) \qquad (4)$$
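The dyadic expansion of Equation (4) can be checked numerically. Below is a minimal sketch using the Haar wavelet, an illustrative choice not fixed by the paper, with the doubly infinite sum truncated to a few dyadic levels on $[0, 1)$ and the coefficients $C_{j,k}$ computed by discretized inner products.

```python
import numpy as np

def haar(x):
    # Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere.
    return (np.where((x >= 0) & (x < 0.5), 1.0, 0.0)
            - np.where((x >= 0.5) & (x < 1.0), 1.0, 0.0))

def psi_jk(x, j, k):
    # Dyadic wavelet with a = 2^j, b = 2^j k:  2^{-j/2} psi(2^{-j} x - k).
    return 2.0 ** (-j / 2) * haar(2.0 ** (-j) * x - k)

# Expand a zero-mean signal on [0, 1) over levels j = 0 down to j = -8
# (negative j means finer scale in this convention).
x = np.linspace(0, 1, 1024, endpoint=False)
dx = x[1] - x[0]
f = np.sin(2 * np.pi * x)

recon = np.zeros_like(f)
for j in range(-8, 1):
    for k in range(int(2.0 ** (-j))):        # translates covering [0, 1)
        c_jk = np.sum(f * psi_jk(x, j, k)) * dx   # C_{j,k} = <f, psi_{j,k}>
        recon += c_jk * psi_jk(x, j, k)

print(float(np.max(np.abs(f - recon))))  # residual shrinks as levels are added
```

Because $\sin(2\pi x)$ has zero mean on $[0,1)$, no coarse scaling term is needed; for a general signal the coarsest-level scaling coefficients must be added as well.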
3. Constructing Wavelet Neural Networks
3.1. WAVELET BASIS FUNCTION (WBF) NEURAL NETWORKS
We choose scaling functions $\phi$ satisfying the two-scale relation

$$\phi(x) = \sum_n c_n\,\phi(2x - n) \qquad (5)$$

We then define $V_j$ to be the closed subspace spanned by the $\phi_{j,k}$, $k \in \mathbb{Z}$, with $\phi_{j,k} = 2^{j/2}\phi(2^j x - k)$.
Since the family of functions $\{\phi_{j,k}(x) \mid (j,k) \in \mathbb{Z}^2\}$ forms an orthonormal basis for $V_j$, $f_{2^j}$ can be written as

$$f_{2^j}(x) = \sum_{k=-\infty}^{+\infty} s_{j,k}\,\phi_{j,k}(x) \qquad (6)$$
where

$$s_{j,k} = \int_{-\infty}^{+\infty} f_{2^j}(x)\,\phi_{j,k}(x)\,dx \qquad (7)$$

is the projection of $f_{2^j}$ onto the orthonormal basis function $\phi_{j,k}(x)$.
Wavelets are the basis functions of the subspaces $W_j$, where $V_j \oplus W_j = V_{j+1}$. The information difference in the function $f$ between resolution $j$ and resolution $j+1$ can be described using a wavelet $\psi$ as

$$\Delta f_{2^j} = \sum_{k=-\infty}^{+\infty} d_{j,k}\,\psi_{j,k}(x) \qquad (8)$$

where $\psi_{j,k}(x) = 2^{j/2}\psi(2^j x - k)$, $(j,k) \in \mathbb{Z}^2$, and $d_{j,k} = \int_{-\infty}^{+\infty} f_{2^j}(x)\,\psi_{j,k}(x)\,dx$ are the projections of $f_{2^j}$ onto $W_j$. Further, from $V_{j+1} = V_j \oplus W_j$ with $V_j \perp W_j$, we get

$$f_{2^{j+1}}(x) = f_{2^j}(x) + \sum_{k=-\infty}^{+\infty} d_{j,k}\,\psi_{j,k}(x) \qquad (9)$$
Since the $V$-spaces form a nested set of subspaces, $f$ can be written as

$$f(x) = \sum_k s_{L,k}\,\phi_{L,k}(x) + \sum_{j=1}^{L}\sum_k d_{j,k}\,\psi_{j,k}(x) \qquad (10)$$
3.2. RADIAL WAVELET BASIS FUNCTION (RBF) NEURAL NETWORKS
Previous work using WBF neural networks employed sample data evenly spaced at a known maximum resolution. In such a case, the locations of the basis functions could be calculated using a dyadic grid. There have been few studies dealing with unevenly spaced sample data whose sampling rate is unknown. The dyadic grid used to select the locations of basis functions, and orthogonal discrete basis functions such as the Daubechies wavelet or the Battle-Lemarie wavelet, cannot be obtained if the sampling rate is unknown. Only continuous basis functions can be used for interpolating test data. One such candidate is the RBF, which is known to possess good approximation properties and also provides a solution to the regularization problem [7-10].
The most commonly used Gaussian RBF is:

$$\phi(x) = \exp\!\left(-\frac{\|x - c_j\|^2}{2\sigma_j^2}\right) \qquad (11)$$
where $c_j$ represents the location of its center and $\sigma_j^2$ denotes the spread of the function. The Mexican hat wavelet frame, which is related to the second derivative of a Gaussian,

$$\psi(x) = (1 - x^2)\exp\!\left(-\frac{\|x - c_j\|^2}{2\sigma_j^2}\right) \qquad (12)$$
is used as the wavelet function. This wavelet frame satisfies the admissibility condition and the frame property. It decays rapidly in both the time and frequency domains, and its frame constant is very near unity. Equation (10) can be rewritten using RBFs as
$$f(\bar{x}) = \sum_{k=1}^{N_L} s_{L,k}\,\phi_{L,k}(\|\bar{x} - \bar{c}_k\|) + \sum_{j=1}^{L}\sum_{k=1}^{N_j} d_{j,k}\,\psi_{j,k}(\|\bar{x} - \bar{c}_{j,k}\|) \qquad (13)$$
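Such a network output, a weighted sum of Gaussian scaling RBFs and Mexican-hat wavelet RBFs in the spirit of Equations (11)-(13), can be sketched as follows. All centers, widths, and weights below are hypothetical placeholders, and the radial normalization $1 - (r/\sigma)^2$ in the wavelet profile is my choice for illustration.

```python
import numpy as np

def gaussian_rbf(r, sigma):
    # Scaling-function part, in the spirit of Eq. (11): Gaussian RBF.
    return np.exp(-r**2 / (2 * sigma**2))

def mexican_hat_rbf(r, sigma):
    # Wavelet part, in the spirit of Eq. (12): radial Mexican-hat profile.
    return (1 - (r / sigma) ** 2) * np.exp(-r**2 / (2 * sigma**2))

def wnn_output(x, scaling_terms, wavelet_terms):
    # Eq. (13): weighted sum of scaling RBFs plus wavelet RBFs,
    # each evaluated on the distance from the input to its center.
    y = 0.0
    for s, c, sigma in scaling_terms:
        y += s * gaussian_rbf(np.linalg.norm(x - c), sigma)
    for d, c, sigma in wavelet_terms:
        y += d * mexican_hat_rbf(np.linalg.norm(x - c), sigma)
    return y

# hypothetical centers, widths and weights (weight, center, width)
x = np.array([0.3, 0.7])
scaling = [(1.0, np.array([0.0, 0.0]), 1.0)]
wavelets = [(0.5, np.array([0.5, 0.5]), 0.25)]
y = wnn_output(x, scaling, wavelets)
print(y)
```

The wavelet terms add localized, oscillatory corrections on top of the smooth Gaussian background, which is what lets the resolution of the network control the accuracy of the fit.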
In this case, the dilation at resolution $j$ and translation $2^j n$ can be written in terms of an uncompressed and untranslated scaling function $\phi(t, s)$ and wavelet function $\psi(t, s)$ as

$$\phi_{2^j}(t,s) = \phi\!\left(t - 2^{-j}n,\ \frac{s}{2^j}\right), \qquad \psi_{2^j}(t,s) = \psi\!\left(t - 2^{-j}n,\ \frac{s}{2^j}\right) \qquad (14)$$
In this application, the input of the network, $\bar{x}$, corresponds to the defect signature generated by the sensors in the inspection tool, and the output of the network, $f$, represents the corresponding three-dimensional defect profile.
3.3. TRAINING NEURAL NETWORKS
Training a radial WBF neural network essentially consists of determining the basis function centers $c_{j,k}$, the widths of the basis functions $\sigma_{j,k}$, the total number of resolutions $L$, the number of basis functions at each resolution $N_j$, and the network weights [11-13]. The centers $c_{j,k}$ can be obtained using the k-means clustering algorithm or one of its several variants, and $N_j$ can be determined heuristically.
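The center-placement step can be sketched with a plain k-means implementation; in practice a library routine would normally be used, and the 2-D synthetic data below are illustrative.

```python
import numpy as np

def kmeans_centers(data, k, iters=20, seed=0):
    # Plain k-means to place the basis-function centers c_{j,k}.
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        # assign each sample to its nearest center
        dists = np.linalg.norm(data[:, None] - centers[None], axis=2)
        labels = np.argmin(dists, axis=1)
        # move each center to the mean of its cluster (keep old center if empty)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = data[labels == j].mean(axis=0)
    return centers

# usage: cluster 2-D defect-signature-like samples into 4 centers
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(loc, 0.1, size=(50, 2)) for loc in (0.0, 1.0, 2.0, 3.0)])
centers = kmeans_centers(data, k=4)
print(centers.shape)   # (4, 2)
```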
Selecting the best subset of $G$ for the estimation of $Y$ amounts to selecting a subset of $\{\phi_0, \psi_0, \ldots, \psi_M\}$ that spans the space closest to the vector $Y$:

$$Y = GW$$

where $Y = (y_1, y_2, \ldots, y_N)^T$, $G = (g_1, g_2, \ldots, g_L)$, and $W = (w_1, w_2, \ldots, w_L)^T$.
The residual vector of the wavelet network, $g$, can be written as

$$g = Y - GW \qquad (15)$$

Then $g^T g = Y^T Y - Y^T GW - W^T G^T Y + W^T G^T G W = Y^T Y - W^T G^T G W$, where the second equality uses the least-squares solution, for which $G^T Y = G^T G W$.
The basis vectors $\{\psi_1, \ldots, \psi_L\}$ are not orthogonal. Nevertheless, because the wavelets in $G$ come from a wavelet frame, we assume that they are roughly orthonormal. Then, approximately, we have

$$g^T g \approx Y^T Y - (w_1^2 + \cdots + w_L^2) \qquad (16)$$

Thus the wavelets corresponding to the smallest coefficients $w_i$ contribute least to the minimization of $g^T g$. If the size of the wavelet network is to be reduced, the wavelets corresponding to the smallest coefficients should be removed.
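The pruning rule implied by this approximation, dropping the wavelets with the smallest $|w_i|$, can be sketched as follows; the design matrix and weight values below are hypothetical.

```python
import numpy as np

def prune_network(G, w, keep):
    # Drop the basis functions with the smallest |w_i|, following
    # g^T g ~ Y^T Y - (w_1^2 + ... + w_L^2): small |w_i| contribute least.
    order = np.argsort(np.abs(w))[::-1]   # largest |w_i| first
    kept = np.sort(order[:keep])          # indices of the survivors
    return G[:, kept], w[kept], kept

# usage with a hypothetical 6-basis design matrix
rng = np.random.default_rng(0)
G = rng.normal(size=(100, 6))
w = np.array([2.0, -0.01, 0.8, 0.002, -1.5, 0.05])
G_small, w_small, kept = prune_network(G, w, keep=3)
print(kept)   # -> [0 2 4]
```

After pruning, the weights would normally be re-estimated on the reduced basis, since the basis vectors are only approximately orthonormal.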
$\sigma_{j,k}$ is calculated either at the coarsest or at the finest resolution; widths at the other resolutions are then obtained from Equation (14). In this study, the width at the finest resolution was chosen so as to cover the whole input space.
$W$ is the weight matrix and $L$ is the output dimension. The system $Y = GW$ can be solved for $W$ as

$$W = G^{+}Y$$

where $G^{+}$ is the pseudo-inverse of $G$, defined as

$$G^{+} = (G^T G)^{-1} G^T \qquad (17)$$

In this case, $W = (G^T G)^{-1} G^T Y$.
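The pseudo-inverse solution can be checked numerically. In the sketch below, `np.linalg.lstsq` is the numerically preferred route, while the explicit normal-equation form mirrors Equation (17) and agrees with it when $G$ has full column rank; the matrix sizes and true weights are illustrative.

```python
import numpy as np

# Solve Y = G W in the least-squares sense, W = (G^T G)^{-1} G^T Y.
rng = np.random.default_rng(0)
G = rng.normal(size=(50, 4))            # 50 samples, 4 basis functions
W_true = np.array([1.0, -2.0, 0.5, 3.0])
Y = G @ W_true                          # noise-free targets for the check

W_normal = np.linalg.inv(G.T @ G) @ G.T @ Y    # Eq. (17): W = G+ Y
W_lstsq, *_ = np.linalg.lstsq(G, Y, rcond=None)

print(np.allclose(W_normal, W_true), np.allclose(W_lstsq, W_true))  # True True
```

With noisy targets the two routes still agree with each other, but `lstsq` (built on an orthogonal factorization) is more robust when $G^T G$ is ill-conditioned.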
4. An Example
Compressors are widely used in many fields, so it is important to diagnose the compressor system. Abnormal vibration is the most common fault in a compressor system. Many factors can cause vibration, such as a crack in the slide bearing or rolling bearing of the motor or compressor. A rolling-bearing fault of a compressor system is used to demonstrate the feasibility of the diagnosis algorithms.
Defective bearings cause a compressor to vibrate abnormally. The vibrations are normally monitored by an accelerometer, and the measured signals are transferred to a data acquisition unit via a high-quality co-axial cable. Tri-axial vibration signals originating from a bearing with a crack in its inner race have been collected [17]. An initial crack was seeded in the bearing, the experiment was run for a period of time, and vibration data were recorded during that period. The set-up was then stopped and the crack size was increased, followed by a second run. This procedure was repeated until the bearing failed. The crack sizes were organized in ascending order, while the time information was assumed to be uniformly distributed among the crack sizes. A training data set relating to the crack growth was thus obtained. Time segments of vibration signals from a good bearing and a defective one are shown in Figure 1; their corresponding power spectral densities (PSD) are shown in Figure 2. The original signals were windowed, with each window containing 1000 time points.
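The per-window feature extraction used here, the maximum signal amplitude and the maximum of the PSD over each 1000-point window, can be sketched as below. The synthetic signal and the simple periodogram PSD estimator are assumptions, since the paper does not specify its estimator.

```python
import numpy as np

def window_features(signal, window=1000):
    # Per-window features used as inputs to the diagnosis network:
    # maximum absolute amplitude and peak of a periodogram PSD estimate.
    n = len(signal) // window
    feats = []
    for i in range(n):
        seg = signal[i * window:(i + 1) * window]
        psd = np.abs(np.fft.rfft(seg)) ** 2 / window   # simple periodogram
        feats.append((np.max(np.abs(seg)), np.max(psd)))
    return np.array(feats)

# usage: a synthetic vibration-like signal, 5 windows of 1000 samples
rng = np.random.default_rng(0)
n_samples = 5000
vib = (np.sin(2 * np.pi * 50 * np.arange(n_samples) / 1000.0)
       + 0.1 * rng.normal(size=n_samples))
feats = window_features(vib)
print(feats.shape)   # (5, 2)
```

As a bearing defect grows, both features tend to increase, which is what makes them usable as a crack-size 'sensor' input.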
Figure 3 shows the crack growth as a function of time. The model is first trained using fault data up to the 100th time window; from then on, it predicts the crack evolution until the final bearing failure. The sensor, implemented as a WNN with seven hidden nodes (neurons), is trained through the process of Figure 4. This sensor 'measures' the crack size using the maximum signal amplitude and the maximum signal PSD as inputs. The training results are depicted in Figure 5. It is observed that the 100 data points employed for training lead to very satisfactory results. The WNN acting as the predictor is trained next. The optimized training procedure results in a WNN of eight hidden neurons. The training results are shown in Figure 7. Training is deemed satisfactory when 100 data points are used. The trained predictor is then employed to predict the future crack development, as shown in Figure 8. A failure hazard threshold was established on the basis of empirical evidence, corresponding to Crack_Width = 2000 microns or Crack_Depth = 1000 microns. The crack reaches this hazard condition at the 174th time window.
Figure 1. Vibration signals from a good and a defective bearing.
Figure 2. PSDs of the vibration signals in Figure 1.
Figure 3. The original crack sizes.
Figure 4. The training of the sensor.
Figure 5. The crack sizes measured by the trained sensor.
Figure 6. The training of the predictor.
The Crack_Width criterion is reached first. These results are preliminary and intended only to illustrate the proposed prediction architecture. A substantially larger database is required for feature extraction, training, validation, and optimization. Such a database will permit a series of sensitivity studies that may lead to more conclusive results regarding the capabilities and effectiveness of the proposed approach.
5. Conclusions
In this paper, a fault prognosis architecture consisting of dynamic wavelet neural networks has been developed. Gaussian radial basis functions and Mexican hat wavelet frames are used as scaling functions and wavelets, respectively. The centers of the basis functions are calculated using a dyadic expansion scheme and a k-means clustering algorithm. The results indicate that significant advantages over other neural-network-based defect characterization schemes can be obtained, in that the accuracy of the predicted defect profile can be controlled by the resolution of the network.

Figure 7. The crack growth predicted by the trained predictor up to the 100th time window.

Figure 8. The crack growth predicted by the trained predictor beyond the 100th time window.
Using the model of wavelet neural networks, we can successfully extract features, and a trained WNN can predict the evolution of a defective bearing with a crack in its inner race. The results of this study demonstrate that WBF neural networks, with center locations obtained using a dyadic scheme, can successfully diagnose faults.
References
1. Picton, P.: Neural Networks, Palgrave, 2000, pp. 102-107.
2. Zhang, Q. and Benveniste, A.: Wavelet networks, IEEE Transactions on Neural Networks 3(6) (1992), 889-898.
3. Zhang, Q.: Using wavelet network in nonparametric estimation, IEEE Transactions on Neural Networks 8(2) (1997), 227-236.
4. Chau, T.: A review of analytical techniques for gait data. Part 2: neural network and wavelet methods, Gait and Posture 13(2) (2001), 102-120.
5. Kugarajah, T. and Zhang, Q.: Multidimensional wavelet frames, IEEE Transactions on Neural Networks 6(6) (1995), 1552-1556.
6. Daubechies, I.: The wavelet transform, time-frequency localization and signal analysis, IEEE Transactions on Information Theory 36(5) (1990), 961-967.
7. Hwang, K. and Mandayam, S.: Characterization of gas pipeline inspection signals using wavelet basis function neural networks, NDT & E International 33(8) (2000), 531-545.
8. Jiao, L. and Pan, J.: Multiwavelet neural network and its approximation properties, IEEE Transactions on Neural Networks 12(5) (2001), 1060-1066.
9. Yu, D. L. and Gomm, J. B.: Sensor fault diagnosis in a chemical process via RBF neural networks, Control Engineering Practice 7(1) (1999), 49-55.
10. Chau, T.: A review of analytical techniques for gait data. Part 2: neural network and wavelet methods, Gait and Posture 13(2) (2001), 102-120.
11. Trunov, A. B. and Polycarpou, M. M.: Automated fault diagnosis in nonlinear multivariable systems using a learning methodology, IEEE Transactions on Neural Networks 11(1) (2000), 91-101.
12. Vemuri, A. T. and Polycarpou, M. M.: Neural-network-based robust fault diagnosis in robotic systems, IEEE Transactions on Neural Networks 8(6) (1997), 1410-1420.
13. Wilson, D. J. H. and Irwin, G. W.: RBF principal manifolds for process monitoring, IEEE Transactions on Neural Networks 10(6) (1999), 1424-1434.