Aeroengine Exhausted Gas Temperature Prediction
Using Process Extreme Learning Machine
Ding Gang, Lei Da and Yao Wei
School of Mechatronics Engineering, Harbin Institute of Technology,
Heilongjiang, P.R. China, 150001
Keywords: Process Extreme Learning Machine, Function Inputs, Time Series Prediction, Aeroengine Health Condition Prediction.
Abstract. To solve the aeroengine health condition prediction problem, a process extreme learning machine (P-ELM) is proposed based on the process neural network (PNN) and the extreme learning machine (ELM). The proposed P-ELM can process the time accumulation effects that widely exist in practical systems. It has only one unknown parameter, which can be calculated directly rather than iteratively, so the training time can be significantly reduced. After being validated on the prediction of the Mackey-Glass time series, the proposed P-ELM is used to predict the aeroengine exhausted gas temperature, and the test results are satisfactory. Contrast tests show that the proposed P-ELM outperforms the ELM and performs on par with the PNN. However, with just one unknown parameter that can be calculated directly, the proposed P-ELM is much easier to use and needs much less training time. Thus, the proposed P-ELM is better suited than the PNN to the practical situation of aeroengine health condition prediction.
Introduction
Artificial neural networks (ANN) have attracted significant attention in many fields, including time series prediction, in the last few years, mainly due to their ability to approximate nonlinear mappings. Hornik [1] and Funahashi [2] proved that multilayer feedforward neural networks can approximate any continuous function. Huang [3] took a further step and proved that standard single hidden layer feedforward neural networks with at most N hidden neurons and any bounded nonlinear activation function can learn N distinct samples with zero error. ANN thus have great potential in modeling nonlinear systems.
However, traditional ANN cannot perfectly model the actual signal processing of true biological neurons, in which the states of the synapses depend on the relative timing of the input impulses [4]. Thus, it is difficult for traditional ANN to directly process the time accumulation effects that exist in practical systems. To overcome this limitation, He and Liang proposed the process neuron model [5] in 2000, which has an architecture similar to that of traditional neurons, but whose inputs, outputs and corresponding connection weights can be time-varying functions. With a working mechanism similar to that of true biological neurons, process neural networks (PNN) often achieve higher precision in time series prediction than ANN, and are widely used in such areas [6-8]. However, the training process of PNN is still as complex and time consuming as that of traditional ANN.
The extreme learning machine (ELM) is an extended single hidden layer feedforward neural network proposed by Huang [9], which has universal approximation capability for any type of computational hidden nodes [10]. The essence of ELM is that the hidden layer of the network need not be tuned; its parameters are assigned directly instead. There is thus no iteration in applying ELM, which greatly reduces the training time. Moreover, ELM tends to reach not only the smallest training error but also the smallest norm of output weights, which gives it better generalization performance than traditional neural networks.
Applied Mechanics and Materials Vols. 423-426 (2013), pp. 2355-2362. Online available since 2013/Sep/27 at www.scientific.net. © (2013) Trans Tech Publications, Switzerland. doi:10.4028/www.scientific.net/AMM.423-426.2355
Thus, a process extreme learning machine (P-ELM) is proposed in this paper by combining the advantages of the PNN and the ELM: it can process time accumulation effects, and its parameters are calculated directly rather than iteratively. The P-ELM is then used to monitor the aeroengine health condition by predicting the aeroengine exhausted gas temperature, which reflects the gradual decrease of engine performance as a result of time accumulation.
Process neuron and extreme learning machine
Process Neuron. The process neuron model is composed of three sections: inputs, an activation unit
and output. This is based on the fact that a biological neuron is composed of three basic parts: a
dendrite, a soma and an axonal tree. Traditionally, the inputs and the connection weights of the neuron
in a neural network are discrete values. However, the inputs and the connection weights of process
neuron are continuous time-varying functions. The process neuron architecture is depicted in Fig. 1.
Fig. 1. Schematic diagram of a process neuron: the time-varying inputs x_1(t), …, x_n(t) are weighted by ω_1(t), …, ω_n(t), aggregated by the operator K(·), and passed through the activation function f(·) to produce the output y.
According to the form of the operator K(·), process neurons can be divided into two types. The output of the first type of process neuron model can be expressed as:

y(t) = f( Σ_{i=1}^{n} ω_i(t) x_i(t) − θ ).

The output of the second type of process neuron model can be expressed as:

y = f( Σ_{i=1}^{n} ∫_0^T ω_i(t) x_i(t) dt − θ ),

where x_i(t) ∈ C[0,T] is the i-th input function, C[0,T] denotes the space of continuous functions on [0,T], ω_i(t) is the i-th weight function, θ is the threshold, and f(·) is the activation function.
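As an illustration, the output of a type-2 process neuron can be evaluated numerically by approximating the integrals on a grid. The following sketch is ours, not from the paper; a sigmoid activation and the trapezoidal rule are assumed:

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal-rule approximation of the integral of y over the grid t."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)

def process_neuron_output(x_funcs, w_funcs, theta, T=1.0, n_grid=1001):
    """Type-2 process neuron: y = f(sum_i int_0^T w_i(t) x_i(t) dt - theta).

    x_funcs / w_funcs are vectorized callables (the input and weight
    functions on [0, T]); a sigmoid activation f is assumed here.
    """
    t = np.linspace(0.0, T, n_grid)
    s = sum(trapezoid(w(t) * x(t), t) for x, w in zip(x_funcs, w_funcs))
    return 1.0 / (1.0 + np.exp(-(s - theta)))

# Example with two time-varying inputs on [0, 1].
y = process_neuron_output([np.sin, np.cos],
                          [lambda t: t, lambda t: 1.0 - t],
                          theta=0.1)
```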
Extreme Learning Machine. ELM is an extended single hidden layer feedforward neural network (SLFN), so the basic model of ELM can be described as an approximation model of N arbitrary distinct samples (x_i, t_i), where x_i = [x_{i1}, x_{i2}, …, x_{in}]^T ∈ R^n and t_i = [t_{i1}, t_{i2}, …, t_{im}]^T ∈ R^m. For an ELM with N training samples, L hidden neurons and activation function g(·), the mapping relationship between the inputs and the output is

o_j = Σ_{i=1}^{L} β_i G(w_i, b_i, x_j) = Σ_{i=1}^{L} β_i g(w_i · x_j + b_i),  j = 1, …, N,  (1)
where w_i = [w_{i1}, …, w_{in}]^T is the weight vector connecting the i-th hidden neuron and the inputs, and β_i = [β_{i1}, …, β_{im}]^T is the weight vector between the i-th hidden neuron and the output neurons. The training goal of a traditional neural network is to minimize the error between the outputs and the targets, i.e. min Σ_{j=1}^{N} ‖o_j − t_j‖, where t_j is the target corresponding to o_j. This can be rewritten in matrix form as min ‖Hβ − T‖, where
H = [h(x_1); …; h(x_N)]
  = | G(w_1, b_1, x_1)  …  G(w_L, b_L, x_1) |
    |         ⋮                    ⋮        |
    | G(w_1, b_1, x_N)  …  G(w_L, b_L, x_N) | (N×L)  (2)
β = [β_1^T; …; β_L^T]_{L×m}  and  T = [t_1^T; …; t_N^T]_{N×m}.  (3)
From the viewpoint of ELM, training an SLFN is equivalent to finding a least squares solution of a linear system. Since the hidden layer parameters w_i and b_i can be chosen randomly [11], the only unknown parameter is the output weight matrix β, so the equivalent linear system of the SLFN is

‖Hβ̂ − T‖ = min_β ‖Hβ − T‖,  (4)

and the smallest norm solution of this linear system is

β̂ = H†T,  (5)

where H† is the Moore-Penrose generalized inverse of the matrix H. Different methods, such as the orthogonal projection method, iterative methods, and singular value decomposition, can be adopted to calculate the Moore-Penrose generalized inverse.
Since the only unknown parameter can be calculated directly without iteration, ELM does not need a long training time. Moreover, ELM attains the smallest training error with the smallest norm of output weights, which gives it better generalization performance than traditional neural networks.
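The ELM training procedure described above can be sketched in a few lines. This is a minimal illustration under our own assumptions (function names and the sigmoid activation are ours, not the paper's):

```python
import numpy as np

def elm_train(X, T, L, seed=0):
    """Basic ELM: random hidden parameters, output weights via pseudoinverse."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((L, X.shape[1]))  # random input weights w_i
    b = rng.standard_normal(L)                # random biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))  # hidden output matrix, Eq. 2
    beta = np.linalg.pinv(H) @ T              # smallest-norm solution, Eq. 5
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))
    return H @ beta

# Fit a toy one-dimensional function.
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
T = np.sin(3.0 * X)
W, b, beta = elm_train(X, T, L=20)
```

Note that `np.linalg.pinv` computes the Moore-Penrose generalized inverse via singular value decomposition, one of the methods mentioned above.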
The process extreme learning machine
The process extreme learning machine proposed in this paper is a single hidden layer feedforward
process neural network (SLFPNN). For the sake of convenience, we just analyze P-ELM with single
output node in this section, whose topology structure is depicted in Fig. 2.
Fig. 2. Process extreme learning machine model: the input functions x_1(t), …, x_n(t) feed L hidden process neurons (each computing Σ, ∫ and g) through the weight functions ω_{ki}(t); the hidden outputs are combined through the output weights β_1, …, β_L to produce the output y.
It can be seen from Fig. 2 that the hidden neurons of the P-ELM are process neurons, and the inputs of the P-ELM are continuous functions, which differs from the basic ELM. Given N samples (x_i(t), t_i), where x_i(t) = [x_{i1}(t), x_{i2}(t), …, x_{in}(t)]^T ∈ C^n[0,T] and t_i ∈ R, suppose that we use a P-ELM model with L hidden neurons; then the output is

o_j = Σ_{i=1}^{L} β_i g( Σ_{k=1}^{n} ∫_0^T ω_{ki}(t) x_{jk}(t) dt + b_i ),  j = 1, …, N,  (6)

where x_{jk}(t) is the k-th input of the j-th input vector, and ω_{ki}(t) is the corresponding weight function.
According to Eq. 6 and Section 2.2, the integrals must be calculated before the hidden output matrix H can be formed, but this is always difficult and tedious. In the practical use of PNN, the calculation of the integrals is avoided by expanding the input and weight functions on normalized orthogonal basis functions [12]. We use the same method to simplify the integral calculus in the P-ELM.
Denote the normalized orthogonal basis functions by {b_p(t)}_{p=1}^{+∞}; then the input and weight functions can be expanded as

x_{jk}(t) = Σ_{p=1}^{M} a_{jk}^{(p)} b_p(t),  (7)

ω_{ki}(t) = Σ_{m=1}^{M} w_{ki}^{(m)} b_m(t).  (8)

Since ∫_0^T b_p(t) b_l(t) dt = 1 if p = l and 0 if p ≠ l, Eq. 6 can be rewritten as
o_j = Σ_{i=1}^{L} β_i g( Σ_{k=1}^{n} Σ_{p=1}^{M} a_{jk}^{(p)} w_{ki}^{(p)} + b_i ),  j = 1, …, N.  (9)

Let a_{jk} = (a_{jk}^{(1)}, a_{jk}^{(2)}, …, a_{jk}^{(M)})^T, w_{ki} = (w_{ki}^{(1)}, w_{ki}^{(2)}, …, w_{ki}^{(M)})^T, a_j = (a_{j1}, a_{j2}, …, a_{jn}) and w_i = (w_{1i}, w_{2i}, …, w_{ni})^T, where i = 1, 2, …, L. Then Eq. 9 can be rewritten as

o_j = Σ_{i=1}^{L} β_i G(w_i, b_i, x_j(t)) = Σ_{i=1}^{L} β_i g(a_j · w_i + b_i),  j = 1, …, N,  (10)
where a_j · w_i is the inner product of a_j and w_i. We can then obtain the equivalent linear system of the SLFPNN, described as min ‖Hβ − T‖, where

H = [h(x_1(t)); …; h(x_N(t))]
  = | G(w_1, b_1, x_1(t))  …  G(w_L, b_L, x_1(t)) |
    |          ⋮                        ⋮          |
    | G(w_1, b_1, x_N(t))  …  G(w_L, b_L, x_N(t)) | (N×L)  (11)
β = [β_1, …, β_L]^T  and  T = [t_1, …, t_N]^T.  (12)

Thus, the equivalent linear system of the SLFPNN is

‖Hβ̂ − T‖ = min_β ‖Hβ − T‖,  (13)

and the smallest norm solution of this linear system is

β̂ = H†T,  (14)
where H† is the Moore-Penrose generalized inverse of the matrix H. Notice that for the P-ELM, the elements of the hidden output matrix are G(w_i, b_i, x_j(t)) = g(a_j · w_i + b_i), where g(·) is the activation function; the sigmoid function is chosen as the activation function in this paper. The function inputs are thus finally transformed into vectors, which eliminates the tedious integral calculus.
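The reduction of the integral to a coefficient inner product can be checked numerically. The sketch below is ours; it uses the standard Legendre polynomials orthonormalized on [−1, 1] rather than the paper's [0, T]:

```python
import numpy as np
from numpy.polynomial import legendre

def orthonormal_legendre(k, t):
    """Legendre polynomial of degree k, normalized to unit L2 norm on [-1, 1]."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return np.sqrt((2 * k + 1) / 2.0) * legendre.legval(t, c)

M = 4
rng = np.random.default_rng(1)
a = rng.standard_normal(M)  # coefficients a_jk of an input function, Eq. 7
w = rng.standard_normal(M)  # coefficients w_ki of a weight function, Eq. 8

t = np.linspace(-1.0, 1.0, 20001)
x_t = sum(a[p] * orthonormal_legendre(p, t) for p in range(M))
w_t = sum(w[p] * orthonormal_legendre(p, t) for p in range(M))

# By orthonormality, int w(t) x(t) dt equals the coefficient inner product.
prod = w_t * x_t
integral = float(np.sum((prod[1:] + prod[:-1]) * np.diff(t)) / 2.0)
inner = float(a @ w)
```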
The observations of actual systems are usually discrete samples, so they must be fitted into continuous functions to obtain the input functions before the P-ELM is applied. The steps for using the P-ELM are as follows:
Step 1. Determine the structure of the P-ELM, i.e. the number of hidden process neurons and the hidden activation function;
Step 2. Choose a time interval [0, T] and fit the discrete samples into continuous functions;
Step 3. Choose a set of orthogonal basis functions and expand the input and weight functions as in Eqs. 7 and 8 to obtain a_j;
Step 4. Randomly generate the hidden node parameters w_i and b_i;
Step 5. Calculate the hidden output matrix H;
Step 6. Calculate the output weight β̂ = H†T.
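The steps above can be sketched end to end. This is our own minimal illustration, with a crude grid projection standing in for Steps 2-3 (the paper does not specify its fitting scheme) and placeholder sine data rather than real measurements:

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_coeffs(samples, M):
    """Steps 2-3: treat the discrete samples as a function on a uniform grid
    over [-1, 1] and project it onto the first M orthonormal Legendre basis
    functions (a crude fit, assumed here for illustration)."""
    t = np.linspace(-1.0, 1.0, len(samples))
    coeffs = []
    for p in range(M):
        c = np.zeros(p + 1)
        c[p] = 1.0
        bp = np.sqrt((2 * p + 1) / 2.0) * legendre.legval(t, c)
        prod = samples * bp
        coeffs.append(float(np.sum((prod[1:] + prod[:-1]) * np.diff(t)) / 2.0))
    return np.array(coeffs)

def pelm_train(A, T, L, seed=0):
    """Steps 4-6: A holds the input-function coefficient vectors a_j row-wise."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((L, A.shape[1]))  # random weight coefficients w_i
    b = rng.standard_normal(L)                # random biases b_i
    H = 1.0 / (1.0 + np.exp(-(A @ W.T + b)))  # hidden output matrix, Eq. 11
    beta = np.linalg.pinv(H) @ T              # beta = H^dagger T, Eq. 14
    return W, b, beta

def pelm_predict(A, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(A @ W.T + b)))
    return H @ beta

# Toy usage on sliding windows of a sine wave (placeholder data).
series = np.sin(0.3 * np.arange(60))
A = np.array([legendre_coeffs(series[i:i + 6], 4) for i in range(54)])
T = series[6:].reshape(-1, 1)
W, b, beta = pelm_train(A, T, L=10)
```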
Application test
Mackey-Glass Time Series Prediction. The Mackey-Glass time series is a typical chaotic time series, widely used to test the performance of prediction algorithms. In this section, the procedure of utilizing the P-ELM for time series prediction is explained by forecasting the Mackey-Glass time series, and the effectiveness of the P-ELM is validated at the same time.

We generated a Mackey-Glass time series with 150 samples, denoted {x_k}_{k=1}^{150}. We choose (x_i, x_{i+1}, …, x_{i+5}) to generate the input function IF_i, i = 1, …, 144, and x_{i+6} to be the corresponding output. We thus obtain 144 sample pairs {IF_i, x_{i+6}}_{i=1}^{144}; the first 72 pairs are used to train the P-ELM and the remaining 72 pairs to test the model. The P-ELM employed has 10 hidden neurons, and the input number is 6. The normalized Legendre orthogonal basis functions are adopted to expand the input functions and the weight functions. The prediction results are depicted in Fig. 3, from which it can be seen that the performance of the P-ELM is satisfactory. The absolute value of the relative error is used to evaluate the performance of the model: its average is 0.59% and its maximum is 2.09%, which proves the effectiveness of the P-ELM.
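The paper does not state how its series was generated; a common approach, assumed here with the usual parameter choices (τ = 17, β = 0.2, γ = 0.1, p = 10), is an Euler discretization of the Mackey-Glass delay differential equation:

```python
import numpy as np

def mackey_glass(n, tau=17, beta=0.2, gamma=0.1, p=10, dt=1.0, x0=1.2):
    """Euler scheme for dx/dt = beta*x(t-tau)/(1+x(t-tau)^p) - gamma*x(t).

    Parameter values are the common defaults, not taken from the paper.
    """
    delay = int(tau / dt)
    x = np.full(n + delay, x0)  # constant history before t = 0
    for k in range(delay, n + delay - 1):
        x_tau = x[k - delay]
        x[k + 1] = x[k] + dt * (beta * x_tau / (1.0 + x_tau ** p) - gamma * x[k])
    return x[delay:]

# 150 samples and the 144 (input window, output) pairs used in the text.
series = mackey_glass(150)
pairs = [(series[i:i + 6], series[i + 6]) for i in range(144)]
```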
Fig. 3. Prediction results of the Mackey-Glass time series by P-ELM (actual value and P-ELM prediction; sample value vs. sample point over the test samples).
Aeroengine health condition prediction. An aeroengine is a complicated nonlinear system that always works under extreme conditions such as high temperature, high pressure and high speed, so the performance of its components and subsystems degrades gradually with time. Condition monitoring is therefore essential for flight safety and for reducing preventive maintenance costs. The exhausted gas temperature (EGT) is a pivotal health index for an aeroengine, because the temperature of the engine turbines must be strictly limited for safety reasons; EGT monitoring is thus very important in the daily operation of aeroengines. The EGT fluctuation and trend should be watched carefully, which motivates the use of prediction technologies in this area. In this section, the P-ELM is utilized to forecast the EGT time series to illustrate its application to aeroengine condition monitoring.
The EGT time series is from an airline company in China. To reduce the influence of the environment, a baseline value was subtracted from the original data, and a new time series, DEGT, was generated for prediction. The DEGT time series is denoted {DEGT_j}_{j=1}^{86}. We choose (DEGT_i, DEGT_{i+1}, …, DEGT_{i+5}) to generate the input function IF_i, i = 1, …, 80, and DEGT_{i+6} to be the corresponding output. We thus obtain 80 sample pairs {IF_i, DEGT_{i+6}}_{i=1}^{80}; the first 70 pairs are used to train the P-ELM and the remaining 10 pairs to test the model. The same time series is also predicted by single hidden layer ELM and PNN models for comparison.
The input number of all models is set to 6. As with traditional artificial neural networks, the number of hidden neurons is difficult to choose. We tried several numbers, such as 10, 20 and 30, and found that the performance of ELM and P-ELM worsened as the number of hidden neurons increased; we therefore set the number of hidden neurons of all models to 10. The normalized Legendre orthogonal basis functions are adopted to expand the input functions and the weight functions for the P-ELM and the PNN. The PNN is trained with the Levenberg-Marquardt algorithm [13], a widely used fast training algorithm for neural networks. The prediction results are shown in Table 1.
Table 1. DEGT prediction results

          Absolute relative error /%        Absolute error /°C
          average    max       min          average   max      min
P-ELM     4.4268     10.1314   0.0850       2.0650    4.5085   0.0392
ELM       4.6301     10.7738   0.4577       2.1663    4.7943   0.2110
PNN       4.5270     10.1243   0.2991       2.1147    4.9931   0.1379
As shown in Table 1, the P-ELM performs slightly better than the ELM and the PNN, and the PNN also performs slightly better than the ELM. However, none of the results is satisfactory, because the maximum absolute relative errors reach 10%, which can hardly be accepted in practical engine health monitoring. To improve the prediction precision, cubic spline interpolation is used to generate a new DEGT time series that adds more information. We choose 166 samples from the new time series, denoted {DEGT_j}_{j=1}^{166}; predicting the original DEGT time series is equivalent to predicting the new one. 160 sample pairs are generated with the aforementioned method; the first 140 are used for training and the rest for testing. The P-ELM, ELM and PNN are again employed to predict the same time series. The results are depicted in Fig. 4.
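The densification step can be sketched with SciPy's cubic spline. The stand-in data and the midpoint-insertion scheme below are our assumptions, since the paper gives no details of its interpolation:

```python
import numpy as np
from scipy.interpolate import CubicSpline

def densify(series):
    """Insert a cubic-spline interpolated point between each pair of samples."""
    j = np.arange(len(series))
    spline = CubicSpline(j, series)
    j_new = np.linspace(0.0, len(series) - 1, 2 * len(series) - 1)
    return spline(j_new)

degt = 45.0 + 10.0 * np.sin(np.linspace(0.0, 6.0, 86))  # placeholder DEGT data
degt_dense = densify(degt)  # 171 points; the original 86 samples are preserved
```

Midpoint insertion doubles the resolution (86 points become 171); a subset of the densified series, such as the paper's 166 samples, can then be selected for prediction.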
It can be seen from Fig. 4 that the performance of all three models is improved. The average absolute relative error of the P-ELM predictions is 1.07% and the maximum is 2.01%; for the ELM the average is 1.33% and the maximum is 3.33%; and for the PNN the average is 1.74% and the maximum is 4.20%. Again, the P-ELM outperforms the other two methods, but the PNN performs slightly worse than the ELM this time.
Three more tests were conducted with different ratios of training and testing samples from the time series {DEGT_j}_{j=1}^{166}; the results are listed in Table 2. Taking both the average relative error and the maximum relative error into consideration, the P-ELM performs slightly better than the ELM, and the PNN also performs slightly better than the ELM. Although it is difficult to judge whether the P-ELM performs better than the PNN in our tests, it is worth noting that the training time is significantly reduced when using the P-ELM instead of the PNN.
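The evaluation metric used throughout (average and maximum absolute relative error, in percent) is straightforward to compute; a small helper of our own for reference:

```python
import numpy as np

def relative_errors(pred, actual):
    """Average and maximum absolute relative error, in percent."""
    pred, actual = np.asarray(pred, float), np.asarray(actual, float)
    e = np.abs((pred - actual) / actual) * 100.0
    return float(e.mean()), float(e.max())

avg, mx = relative_errors([1.1, 2.0], [1.0, 2.0])
```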
Fig. 4. Prediction results of DEGT after interpolation (actual value and prediction results by P-ELM, PNN and ELM; DEGT /°C vs. sample time for the 10 test samples).
Table 2. DEGT prediction results with different sample ratios

Sample ratio      Max relative error /%        Average relative error /%
Train   Test      P-ELM    PNN     ELM         P-ELM    PNN     ELM
120     40        3.41     7.35    2.52        1.03     1.89    1.08
100     60        7.52     11.05   15.38       2.84     2.47    2.69
80      80        8.85     8.82    9.52        2.82     2.65    2.95
Conclusions
The process extreme learning machine (P-ELM) proposed in this paper combines the advantages of the PNN and the ELM. The P-ELM can process the time accumulation effects that widely exist in practical systems, and it has only one unknown parameter, which can be calculated directly rather than tuned iteratively; this significantly reduces the training time. Its application to aeroengine health monitoring is described through the prediction of the engine exhausted gas temperature, in comparison with the PNN and the ELM. The P-ELM outperforms the ELM in most of our tests. This can be ascribed to the fact that the P-ELM can process the time accumulation effects in complex systems and that it attains the least error with the least norm of the output weights, and thus has better generalization performance. Although we cannot yet conclude that the P-ELM performs better than the PNN, we can conclude that it performs no worse; moreover, since the P-ELM has only one unknown parameter, which can be calculated directly, it needs much less training time than the PNN, which makes it more suitable for the practical situation of aeroengine health monitoring.
Acknowledgments
This research was supported by the Harbin City Key Technologies R & D Program under Grant No.
2011AA1BG059, and the International S&T Cooperation Projects of Heilongjiang Province under
Grant No. WB10A104.
References
[1] K. Hornik, M. Stinchcombe, H. White: Neural Networks, Vol.2 (1989), p.359-366.
[2] K. Funahashi: Neural Networks, Vol.2 (1989), p.183-192.
[3] G. B. Huang, H. A. Babri: IEEE Transactions on Neural Networks, Vol. 9(1998), p.224-229.
[4] L. I. Zhang, H. W. Tao, C. E. Holt, et al: Nature, Vol.395 (1998), p.37-44.
[5] X.G. He, J.Z. Liang: Procedure Neural Networks. In Proceedings of 16th World Computer
Congress on Intelligent Information processing, Beijing (2000).
[6] G. Ding, S. S. Zhong: J. of Astronautics, Vol.27 (2006), p.645-650.
[7] S. S. Zhong, G. Ding and Z. D. Su: Lecture Notes in Computer Science, vol. 3496(2005), p.
103-124.
[8] S. S. Zhong, Y. Li, G. Ding and L. Lin: Neural network world, Vol.17 (2007), p. 483-495.
[9] G. B. Huang, Q.Y. Zhu, C.K. Siew: Neurocomputing, Vol. 70(2006), p. 489-501.
[10] G. B. Huang, L. Chen: Neurocomputing, Vol. 72 (2007), p. 3056-3062.
[11] G. B. Huang, L. Chen, C.K. Siew, et al: IEEE Transactions on Neural Networks, Vol. 17(2006),
p.879-892.
[12] X. G. He, J. Z. Liang and S. H. Xu: Chinese Journal of Computers, Vol. 5(2004), p. 645-650.
[13] M. T. Hagan, M. Menhaj: IEEE Transactions on Neural Networks, Vol. 6(1994), p. 989-993.