
Comparison of Neural Network Methods for Forecasting Lumpy Demand

Mohammad R. Amin-naseri 1*, Bahman Rostami Tabar 2

1 Department of Industrial Engineering, Faculty of Engineering, Tarbiat Modares University, Tehran, Iran

2 Laboratoire de l'Intégration du Matériau au Système, Université Bordeaux 1, Bordeaux, France

Abstract

Accurate demand forecasting of spare parts is a difficult task, particularly for items with a lumpy demand pattern. This study investigates four types of artificial neural networks (ANNs), namely the multilayer perceptron (MLP), the generalized regression neural network (GRNN), the recurrent neural network (RNN) and the time delay neural network (TDNN), together with two traditional methods, for lumpy demand forecasting. Demand records for 30 types of spare parts from Arak Petrochemical Company in Iran were used to evaluate the accuracy of the different methods. It was found that the ANNs provided suitable forecasts, and that estimation using the RNN yields better results than the other neural networks.

Keywords: Forecasting; Neural network methods; Lumpy demand; Spare parts

* Corresponding author.

Tel: +982182880; Fax: +982188006544; Tarbiat Modares University, Ale Ahmad Highway, Tehran, Iran; E-mail address: [email protected]


1. Introduction

Forecasting the future is a critical element of management decision making. The final effectiveness of any decision depends upon the consequences of the events that follow it. The ability to forecast the uncontrollable aspects of these events prior to making the decision should permit an improved choice over that which would otherwise be made [1]. The need for forecasting is increasing as management attempts to decrease its dependence on chance and becomes more scientific in dealing with its environment [2]. Statistical methods, such as exponential smoothing and regression analysis, have been used by analysts to forecast demand for a number of decades. Many of these methods may perform poorly when demand for an item is lumpy.

Lumpy demand patterns are characterized by intervals in which there is no demand and, for periods with actual demand occurrences, by a large variation in demand levels [3]. Modeling future consumption becomes especially difficult for lumpy patterns, which are common in spare parts inventory systems. Forecasting lumpy demand requires special techniques in comparison with the smooth and continuous case, since the assumptions of continuity and normally distributed demand do not hold [4]. Lumpy demand patterns are very common, particularly in organizations that hold many spare parts. In the aerospace and automotive sectors, for example, organizations may have thousands or tens of thousands of stock keeping units (SKUs) classified as intermittent or lumpy [5]. Lumpy demand has been observed in the automotive industry [5,6], in durable goods spare parts [7], in aircraft maintenance service parts [8], in the petrochemical industry [9,10] and in telecommunication systems, large compressors and textile machines [3].

Croston [11] argued that traditional forecasting methods such as single exponential smoothing (SES) may lead to sub-optimal stocking decisions and proposed an alternative forecasting method. In the proposed procedure, separate forecasts are made for the mean demand interval and the mean demand size. The forecast of demand per period is then calculated as the ratio of the forecasts of demand size and demand interval. Modifications of the original Croston's method were later proposed by several other authors. Willemain et al. [12] compared Croston's method with exponential smoothing and concluded that Croston's method is robustly superior to exponential smoothing, although results with real data in some cases show a more modest benefit. Johnston and Boylan [13] obtained similar results, but further showed that Croston's method is always better than exponential smoothing when the average inter-arrival time between demands is greater than 1.25 review intervals. Sani and Kingsman [14] compared various forecasting and inventory control methods on some long series of low-demand real data from a typical spare parts depot in the UK. They concluded, based on cost and service level, that the best forecasting method is Croston's method. An important contribution is that of Syntetos and Boylan [15]. They showed that Croston's method leads to a biased estimate of demand per unit of time. They also proposed a modified method and demonstrated the improvement in a simulation experiment. Ghobbar and Friend [7] compared various forecasting methods using real data on aircraft maintenance repair parts from an airline operator. The data are lumpy in nature, and they showed that the moving average, Holt's and Croston's forecasting methods are superior to other methods such as exponential smoothing. Willemain et al. [16] compared various forecasting methods using large industrial data sets. They showed that the bootstrapping method produces more accurate forecasts than both exponential smoothing and Croston's method. In an attempt to develop a forecasting procedure that can handle both fast-moving and slow-moving items, Levén and Segerstedt [17] proposed a modification of Croston's method which was thought to avoid the bias indicated by Syntetos and Boylan [15]. Boylan and Syntetos [18] reviewed the modified Croston procedure for intermittent demand forecasting proposed by Levén and Segerstedt [17].

The Levén and Segerstedt modification was found to be more biased than Croston's method, especially for highly intermittent series. Boylan and Syntetos [18] assessed the accuracy of this method using simulated data and the mean square error measure, and showed that Croston's method is generally more accurate than its modification, particularly for strongly intermittent series. Eaves and Kingsman [19] compared various forecasting methods using real data from the UK's Royal Air Force. They showed that the modification of Croston's method [11] by Syntetos and Boylan [15] is the best forecasting method for spare parts inventory control. Syntetos et al. [20] analyzed a wider range of intermittent demand patterns and proposed a categorization to guide the selection of forecasting methods. They indicated that some demand categories are better served by the original Croston's method and others by the Syntetos-Boylan modification. In an attempt to further confirm the good performance of their modified Croston's method, Syntetos and Boylan [21] carried out a comparison of forecasting methods including theirs and the original Croston's method. A simulation exercise was carried out on 3,000 products from the automotive industry with "fast intermittent" demand. It was shown that the modification is the most accurate estimator. Syntetos and Boylan [22] evaluated the empirical stock control performance of the Syntetos-Boylan approximation (SBA). They first discussed the nature of the empirical demand data set and specified the stock control model to be used for experimentation purposes. Performance measures were then selected to report customer service level and stock volume differences. The out-of-sample empirical comparison results demonstrate the superior stock control performance of the SBA and enable insights to be gained into the empirical utility of the other estimators.

Hua et al. [9,10] developed a hybrid approach for forecasting the intermittent demand of spare parts. Their approach provides a mechanism to integrate the autocorrelated demand process and the relationship between explanatory variables and the nonzero demand of spare parts when forecasting occurrences of nonzero demand over lead times. Nezih et al. [23] investigated an alternative method developed by Wright (1986) for data sets with missing values and compared it to the Syntetos-Boylan approximation in a simulated environment.

Little work has been done on lumpy demand forecasting using neural networks. Carmo and Rodrigues [24] applied NN modeling to ten "irregularly spaced" time series. They used a radial basis function (RBF) network. Gaussian basis function networks were shown to be adequate models for short-term prediction of irregularly spaced time series, with the ability to generate better predictive performance than alternative models by taking into account nonlinear correlations in the data. Gutierrez et al. [25] adopted the most widely used method, a multi-layered perceptron (MLP) trained by a back-propagation (BP) algorithm. Their research objective was to assess whether the NN-based approach is a superior alternative to traditional approaches for modeling and forecasting lumpy demand.

In this research, the generalized regression neural network (GRNN), the Elman recurrent neural network (RNN) and the time delay neural network (TDNN) have been applied to 30 types of spare parts in Arak Petrochemical Company in Iran. Statistical accuracy measures showed that our approaches produce more accurate forecasts than two traditional methods (Croston's method and the Syntetos-Boylan approximation) and than the MLP applied to lumpy patterns by Gutierrez et al. [25].

This paper is divided into four sections. In the next section, forecasting methods for

estimating the lumpy demand problem, accuracy measures and characteristics of data are

presented. In the third section, results of our studies are presented. Conclusions and future

studies are found in the fourth section.

2. Forecasting Methods

2.1 Croston's method and Syntetos-Boylan Approximation


Croston's original method estimates the mean demand per period by applying exponential smoothing separately to the intervals between nonzero demands and to their sizes. This method has become an industry standard and has been adopted by many companies and software vendors [15]. The method works as follows:

If a demand occurs, update the forecast of the mean demand size using SES, update the forecast of the mean demand interval using SES and, finally, forecast the mean demand per period as the ratio of size to interval. Otherwise, if there is no demand, leave all forecasts unchanged.

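A disadvantage of the original Croston's method is that it is positively biased. Syntetos and Boylan [15] noted this and proposed a modification. Syntetos and Boylan [21] correct the original method by multiplying the forecast of the demand per period by $(1-\alpha/2)$, where $\alpha$ is the smoothing constant. The forecast of demand per period is therefore calculated as follows:

$$\hat{d}_t = \left(1 - \frac{\alpha}{2}\right)\frac{s_t}{n_t} \tag{1}$$

where $\hat{d}_t$ is the forecast of demand per period, and $s_t$ and $n_t$ are the smoothed forecasts of the demand size and the demand interval, respectively.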
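As an illustration, the update rule of Croston's method and the correction factor $(1-\alpha/2)$ can be sketched in Python as follows. This is a minimal sketch, not the implementation used in this study: the function name `croston_forecast` and the initialization from the first nonzero demand are assumptions made for the example (the study uses a separate initialization block instead).

```python
def croston_forecast(demand, alpha=0.1, sba=False):
    """Per-period demand forecasts; forecasts[t] is made after observing demand[t].

    Assumption for this sketch: the size and interval estimates are
    initialized from the first nonzero demand in the series.
    """
    size = None          # smoothed demand size
    interval = None      # smoothed inter-demand interval
    periods_since = 1    # periods elapsed since the last nonzero demand
    forecasts = []
    for d in demand:
        if d > 0:
            if size is None:
                size, interval = float(d), float(periods_since)
            else:
                # SES updates, applied only when a demand occurs
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1   # no demand: estimates stay unchanged
        if size is None:
            forecasts.append(0.0)   # no demand observed yet
        else:
            f = size / interval
            forecasts.append((1 - alpha / 2) * f if sba else f)
    return forecasts


print(croston_forecast([0, 4, 0, 0, 2], alpha=0.5))
print(croston_forecast([0, 4, 0, 0, 2], alpha=0.5, sba=True))
```

With `sba=True` every forecast is scaled by the same factor, so the two methods differ only in level, not in when they update.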

The selection of the smoothing constant affects the accuracy of the forecasts [23]. The use of low smoothing constant values in the range of 0.20-0.5 has been considered realistic and recommended in the literature on lumpy demand [11, 13]. Syntetos and Boylan [15] suggest that $\alpha$ should be no more than 0.15. Eaves [26] chose values in the range of 0.01-0.1. Syntetos and Boylan [21] used four smoothing constant values of 0.05, 0.10, 0.15 and 0.20 in single exponential smoothing, Croston's method and the Syntetos-Boylan approximation. Nezih et al. [23] used smoothing constants ranging from 0.05 to 0.45 for the Syntetos and Boylan method, and used $0.1<\alpha<0.9$ and $0.1<\beta<0.9$ with increments of 0.1 for their own method. In this study, smoothing constant values ranging from 0.05 to 0.5 were used in Croston's method and the Syntetos-Boylan approximation.


2.2 Artificial neural network models

Hill et al. [27] demonstrated that statistical time-series methods need human interaction and evaluation and can misjudge the functional form relating the independent and dependent variables. These traditional methods can also fail to make necessary data transformations, and may sometimes fail to capture nonlinear patterns in the data [27]. The estimation of artificial neural networks (ANNs), or simply neural networks (NNs), however, can be automated. These models need to be recalibrated on all previous data. In spite of the great deal of time and effort spent by researchers on both conventional and soft computing techniques for time series forecasting, the need for ever more accurate forecasts has driven the development of innovative methods for modeling time series. ANNs have been applied in many areas of time series forecasting and are widely touted as solving many forecasting and decision modeling problems. NNs were introduced as efficient tools for modeling and forecasting about two decades ago. A great deal of research has been devoted to neural network techniques, especially in business and financial forecasting. Wong et al. [28] reviewed the literature on neural network applications in business during the period 1994-1998. Their extensive literature search identified a total of 302 research articles. Production/operations, finance, marketing/distribution and information systems were found to be the most popular application areas. Regarding the lumpy demand case, most traditional forecasting methods assume a specific distribution of the demand time series or of the forecast error time series, while neither assumption is valid for lumpy demand [8]. Artificial neural network modeling is a logical choice to overcome these limitations [25].

The high capability of artificial neural networks in capturing the non-linearity of lumpy demand patterns makes them a promising choice for lumpy demand forecasting. Little work has been done on lumpy demand forecasting using neural networks, and we are aware of only two works, by Carmo and Rodrigues [24] and Gutierrez et al. [25]. In the following subsections, the architectures of four types of neural network are described: the multi-layered perceptron (MLP), the generalized regression neural network (GRNN), the recurrent neural network (RNN) and the time delay neural network (TDNN).

2.2.1 Multilayer perceptron network (MLP)

There are many methods in the NN literature that can be used for flexible nonlinear modeling. The most widely used is the multilayer perceptron (MLP) trained by the back-propagation (BP) algorithm. The standard MLP network consists of three components: an input layer, one hidden layer and one output layer, a layer being a group of neurons (or processing units) that share the same input-output connections. In a conventional MLP network, the information (or input signal) is passed forward only. Thus the MLP network is also referred to as a multilayer feedforward network.

A major advantage of the MLP is that it is less complex than temporal neural networks, such as time delay networks and recurrent networks, while having the same nonlinear input-output mapping capability. Furthermore, the MLP can be trained using the standard back-propagation algorithm [29]. Gutierrez et al. [25] used a three-layer MLP for lumpy demand forecasting: one input layer for the input variables, one hidden layer and one output layer. The MLP had three nodes in the hidden layer and one unit in the output layer. All the input nodes were fully connected to all the hidden nodes, which were in turn connected to the output node. The input nodes represented two variables: (1) the demand at the end of the immediately preceding period and (2) the number of periods separating the last two nonzero demand transactions as of the end of the immediately preceding period. The output node represented the predicted value of the demand transaction for the current period (Figure 1). They used a learning rate of 0.1 and a momentum factor of 0.9. A logistic sigmoid transfer function was used, and all input patterns were scaled between 0 and 1.
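The forward pass of such a 2-3-1 network can be sketched with NumPy as follows. This is only an illustrative sketch: the weights are random placeholders rather than trained values, the helper names `logistic` and `mlp_forward` are hypothetical, and applying the logistic function at the output node as well as the hidden nodes is an assumption.

```python
import numpy as np


def logistic(x):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-x))


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: 2 inputs -> 3 hidden nodes -> 1 output node."""
    hidden = logistic(w_hidden @ x + b_hidden)   # shape (3,)
    return logistic(w_out @ hidden + b_out)      # shape (1,)


rng = np.random.default_rng(0)
w_hidden = rng.normal(size=(3, 2))   # placeholder (untrained) weights
b_hidden = np.zeros(3)
w_out = rng.normal(size=(1, 3))
b_out = np.zeros(1)

# Inputs already scaled to [0, 1]: previous-period demand and the
# interval between the last two nonzero demands.
x = np.array([0.4, 0.25])
y = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
print(y)
```

Because the output passes through a logistic function in this sketch, the forecast lies in (0, 1) and has to be mapped back to the original demand scale.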

<<Insert figure 1 about here>>

To exploit the time structure in the data, the neural network must have access to the time dimension. While MLPs are popular in many application areas, they are not well suited to processing such temporal sequences, because they lack the time delay and/or feedback connections necessary to provide a dynamic model.

2.2.2 Proposed models

2.2.2.1 Generalized regression neural network

The generalized regression neural network (GRNN) proposed by Specht [30] does not require an iterative training procedure as the back-propagation method does. It approximates any arbitrary function between input and output vectors, drawing the function estimate directly from the training data. Furthermore, it is consistent; that is, as the training set size becomes large, the estimation error approaches zero, with only mild restrictions on the function. The GRNN is used for the estimation of continuous variables, as in standard regression techniques. It is related to the radial basis function network and is based on a standard statistical technique called kernel regression [30].

The GRNN consists of four layers: an input layer, a pattern layer, a summation layer and an output layer. The input units are in the first layer. The second layer contains the pattern units, whose outputs are passed on to the summation units in the third layer. The final layer contains the output units. Details of the GRNN are presented by Specht [30], and a schematic diagram of the GRNN architecture is presented in Figure 2. In the GRNN approach, the user only chooses the value of the smoothing parameter σ and the number of input nodes in the input layer [30]. In this research, a wide range of values of the smoothing parameter σ, from 0 to 50, was examined, and the GRNN model was fitted to input variables extracted from the original data series. In this model, the defined input variables were extracted from all observations.
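Since the GRNN is a form of kernel regression, its prediction can be sketched as a Gaussian-weighted average of the training targets. The sketch below assumes the usual Gaussian kernel with a single smoothing parameter `sigma`; the function name `grnn_predict` and the toy data are hypothetical, and `sigma` must be strictly positive for the weights to be defined.

```python
import numpy as np


def grnn_predict(x_train, y_train, x_query, sigma):
    """Kernel-regression estimate: weighted mean of y_train, with Gaussian
    weights based on squared distance between query and training inputs."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    # Pattern layer: squared Euclidean distance to every training pattern
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    weights = np.exp(-d2 / (2.0 * sigma ** 2))
    # Summation and output layers: weighted sum divided by sum of weights
    return (weights @ y_train) / weights.sum(axis=1)


x_train = [[0.0], [1.0], [2.0]]
y_train = [0.0, 10.0, 50.0]
print(grnn_predict(x_train, y_train, [[0.0]], sigma=0.1))     # close to the nearest target
print(grnn_predict(x_train, y_train, [[0.0]], sigma=1000.0))  # close to the overall mean
```

A small `sigma` reproduces the nearest training target, whereas a large `sigma` smooths the estimate toward the mean of all targets, which is why a range of values is usually examined.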


In the GRNN model, the input nodes are the following:

1. The demand at the end of the period immediately preceding the target period (lag 1).

2. The number of consecutive periods with no demand transaction immediately preceding the target period.

3. The number of periods separating the last two nonzero demand transactions as of the end of the period immediately preceding the target period.

4. The mean demand over the four periods immediately preceding the target period.

In order to identify the effective input variables for the GRNN architecture, several combinations of input variables were applied to the network, the performance of each network was evaluated using statistical measures, and the best input set was selected based on these measures [31].

2.2.2.2 Recurrent neural network (RNN)

To exploit the time structure in the data, the neural network must have access to the time dimension. One way of introducing "memory" into the MLP is by adding feedback connections. The RNN used in this study is the basic Elman-type RNN [32], also referred to as the globally connected RNN. The network consists of four layers (Figure 3): an input layer, a hidden layer, a context layer and an output layer. Each input unit is connected to every hidden unit, as is each context unit. Conversely, there are one-to-one downward connections between the hidden nodes and the context units, leading to an equal number of hidden and context units. The downward connections allow the context units to store the outputs of the hidden nodes (i.e. internal states) at each time step, and the fully distributed upward links then feed them back as additional inputs. The recurrent connections therefore allow the hidden units to recycle information over multiple time steps and thereby to discover temporal information that is contained in the sequential input and is relevant to the target function. Thus the RNN has an inherent dynamic (or adaptive) memory provided by the context units in its recurrent connections [29]. The mathematical model of the RNN is as follows:

$$a_j^1(t) = F\left(\sum_{i=1}^{R} w_{j,i}^1\, p_i(t) + \sum_{c=1}^{s^1} w_{j,c}^C\, a_c^1(t-1) + b_j^1\right), \quad 1 \le j \le s^1 \tag{2}$$

$$a_k^2(t) = G\left(\sum_{j=1}^{s^1} w_{k,j}^2\, a_j^1(t) + b_k^2\right), \quad 1 \le k \le s^2 \tag{3}$$

in which

$R$: the number of input nodes.

$p$: the input vector.

$w^1$, $w^C$: the hidden layer weight matrices for the actual and recurrent inputs, respectively.

$w^2$: the output layer weight matrix.

$s^1$, $s^2$: the number of hidden and output nodes, respectively.

$b^1$, $b^2$: the bias vectors of the hidden and output layers.

$a^1$, $a^2$: the output vectors of the hidden and output layers.

$F$ and $G$: the transfer functions of the hidden and output layers, respectively.
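The recursion can be illustrated with a short NumPy sketch of the Elman forward pass, in which the context units simply hold the previous hidden outputs. The weights below are random placeholders, the function names are hypothetical, and the choice of tanh for $F$ and a symmetric saturating linear function for $G$ follows the tan-sigmoid and satlins transfer functions named in this section; starting the context at zero is an assumption.

```python
import numpy as np


def satlins(x):
    """Symmetric saturating linear transfer function: clips to [-1, 1]."""
    return np.clip(x, -1.0, 1.0)


def elman_forward(inputs, w1, wc, b1, w2, b2):
    """Run an Elman network over a sequence of input vectors.

    inputs: (T, R); w1: (s1, R); wc: (s1, s1); w2: (s2, s1).
    """
    context = np.zeros(w1.shape[0])   # assumption: context starts at zero
    outputs = []
    for p in inputs:
        hidden = np.tanh(w1 @ p + wc @ context + b1)   # hidden layer, F = tanh
        outputs.append(satlins(w2 @ hidden + b2))      # output layer, G = satlins
        context = hidden                               # store state for t + 1
    return np.array(outputs)


rng = np.random.default_rng(1)
R, s1, s2, T = 8, 3, 1, 5
out = elman_forward(
    rng.uniform(-1, 1, size=(T, R)),   # scaled input patterns
    rng.normal(size=(s1, R)),          # placeholder (untrained) weights
    rng.normal(size=(s1, s1)),
    np.zeros(s1),
    rng.normal(size=(s2, s1)),
    np.zeros(s2),
)
print(out.ravel())
```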

<<Insert Figure 3 about here>>

In this network, the following variables were defined for the input nodes in the input layer:

1. The demand at the end of the period immediately preceding the target period (lag 1).

2. The number of consecutive periods with a demand transaction immediately preceding the target period.

3. The number of consecutive periods with no demand transaction immediately preceding the target period.

4. The number of periods separating the last two nonzero demand transactions as of the end of the period immediately preceding the target period.

5. The number of periods between the target period and the first nonzero demand immediately preceding the target period.

6. The number of periods between the target period and the first zero demand immediately preceding the target period.

7. The mean demand over the six periods immediately preceding the target period.

8. The maximum demand over the six periods immediately preceding the target period.

The input patterns should be normalized before being presented to the network. When variables are loaded into a neural network, they must be scaled from their numeric range into the numeric range that the neural network deals with efficiently. In this network, the tan-sigmoid and satlins transfer functions were used in the hidden and output layers, respectively. Thus, all input patterns were scaled to the range (-1, 1) according to the following formula:

$$S_i^{\text{scaled}} = \left[\frac{S_i - \min(S)}{\max(S) - \min(S)}\right] \times 2 - 1 \tag{4}$$

where $S$ is the time series of the variable under consideration, $S_i$ is the value of an observation and $S_i^{\text{scaled}}$ is its normalized value. To find the best number of neurons in the hidden layer, a range of 1 to 15 neurons was examined and the number of neurons giving the minimum error was selected. When an input is presented to the network, the training algorithm (learning equation) attempts to adjust the weights so that the desired output is produced. In this research, the back-propagation algorithm was used as the training algorithm, with a learning rate of 0.01. In addition, the network parameters were adjusted using an adaptive calibration algorithm. Because of the limited historical data, this algorithm was preferred to batch-mode training of the network.
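The min-max scaling of the input patterns can be written in a few lines. The sketch below is illustrative only: the function name `scale_to_pm1` is hypothetical, and raising an error for a constant series (where the denominator would be zero) is an assumption about how that case might be handled.

```python
import numpy as np


def scale_to_pm1(series):
    """Linearly map a series so that its minimum becomes -1 and its maximum +1."""
    s = np.asarray(series, dtype=float)
    lo, hi = s.min(), s.max()
    if hi == lo:
        raise ValueError("constant series cannot be min-max scaled")
    return (s - lo) / (hi - lo) * 2.0 - 1.0


print(scale_to_pm1([0, 5, 10]))
```

In practice the minimum and maximum would be taken from the training observations and reused for the test observations, so that both are scaled consistently.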

2.2.2.3 Time-Delayed Neural Network (TDNN)

Another way of introducing memory into the MLP is by replacing the neurons in the input layer with a memory structure called a tapped delay line (TDL). This type of MLP is called a time-delay feedforward neural network (TDNN), and is also referred to as a pseudo-dynamic neural network owing to its static memory structure, as opposed to the adaptive memory structure used in recurrent neural networks. To predict temporal patterns, an ANN requires two distinct components: a memory and an associator. The memory holds the relevant past information, and the associator uses the memory to predict future events. In this case the associator is simply a static MLP network, and the memory is generated by the tapped delay line, which simply holds past samples of the input signal, as shown in Figure 4 [33].

The mathematical model of TDNN is as follows:

$$a_j^1(t) = F\left(\sum_{d=0}^{D}\sum_{i=1}^{R} w_{j,i,d}^1\, p_{i,d+1}(t) + b_j^1\right), \quad 1 \le j \le s^1 \tag{5}$$

$$a_k^2(t) = G\left(\sum_{j=1}^{s^1} w_{k,j}^2\, a_j^1(t) + b_k^2\right), \quad 1 \le k \le s^2 \tag{6}$$

in which

$D$: the time delay memory degree.

$R$: the number of input nodes.

$p$: the input vector.

$w^1$: the hidden layer weight matrix, indexed by hidden node $j$, input node $i$ and delay $d$.

$w^2$: the output layer weight matrix.

$s^1$, $s^2$: the number of hidden and output nodes, respectively.

$b^1$, $b^2$: the bias vectors of the hidden and output layers.

$a^1$, $a^2$: the output vectors of the hidden and output layers.

$F$ and $G$: the transfer functions of the hidden and output layers, respectively.
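A tapped delay line can be sketched as a function that stacks the current input and its $D$ previous values into one row, after which an ordinary feedforward pass is applied. In this sketch the tap with index $d+1$ is assumed to hold the input delayed by $d$ periods, the three-index hidden weights are flattened into a matrix of shape $(s^1, (D+1)R)$, the weights are random placeholders, and the function names are hypothetical.

```python
import numpy as np


def tapped_delay_inputs(series, max_delay):
    """Stack p(t), p(t-1), ..., p(t-D) side by side for t = D, ..., T-1.

    series: (T,) or (T, R); returns an array of shape (T - D, (D + 1) * R).
    """
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]
    n = series.shape[0]
    if max_delay >= n:
        raise ValueError("series is shorter than the delay line")
    taps = [series[max_delay - d : n - d] for d in range(max_delay + 1)]
    return np.hstack(taps)


def tdnn_forward(delayed, w1, b1, w2, b2):
    """Static MLP (tanh in both layers) applied to the delayed inputs."""
    hidden = np.tanh(delayed @ w1.T + b1)
    return np.tanh(hidden @ w2.T + b2)


delayed = tapped_delay_inputs(np.arange(5), max_delay=2)
print(delayed)   # each row: value at t, t-1, t-2

rng = np.random.default_rng(2)
pred = tdnn_forward(delayed, rng.normal(size=(4, 3)), np.zeros(4),
                    rng.normal(size=(1, 4)), np.zeros(1))
print(pred.ravel())
```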


In this research we use a special type of TDNN in which the TDLs are applied in the input layer only. This network is known as the input delay neural network (IDNN). In this network, the input nodes are the same as in the Elman recurrent neural network architecture. The tan-sigmoid transfer function was used in the hidden and output layers; therefore, all patterns were scaled to (-1, 1). To find the best results, a range of 1 to 15 neurons in the hidden layer was examined. The network was trained using the back-propagation algorithm with a learning rate of 0.01. As in the RNN architecture, an adaptive calibration algorithm was used to adjust the network parameters. In this study, tapped delay lines of length 2 to 5 were used in the input layer.

2.3 Data analysis

In this study, real data sets of demand for 30 types of spare parts in Arak Petrochemical Company in Iran were used. The data were gathered from the company's inventory control package and cover 79 monthly periods from 2001 to 2007. The average inter-demand interval (ADI) and the squared coefficient of variation (CV²) were used to classify demand patterns into four categories. A demand pattern is classified as lumpy when the ADI is greater than 1.32 and the CV² is greater than 0.49. For the items studied, the ADI ranges from 1.7 to 7.4 months and the CV² ranges from 0.52 to 2. Table 1 summarizes the statistical characteristics of the 30 types of spare parts. All 30 items under consideration satisfy the criteria specified for demand lumpiness. For Croston's method and the Syntetos-Boylan approximation (SBA), the data series were divided into three blocks: (i) initialization, (ii) calibration and (iii) performance measurement. The initialization block is used to initialize the values required by methods based on recursive formulae (such as the mean inter-demand interval for Croston's method). In the calibration block, the optimal smoothing constants are identified based on the mean square error (MSE). Finally, the optimal smoothing constants are used to update forecasts in the performance block, in which performance statistics are calculated [34]. The lengths of these blocks are 12, 51 and 16 periods, respectively. For the neural network methods, the data series were divided into two blocks: (i) a training set and (ii) a test set. Of the 79 monthly observations, 63 were used to train the networks, and the five methods were tested using the last 16 observations.
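The classification rule can be illustrated with the following sketch. Several details are assumptions made for the example rather than statements about this study: the ADI is computed as the number of periods divided by the number of nonzero demands, the CV² is computed on the nonzero demand sizes with the population standard deviation, and the names of the three non-lumpy categories follow common usage in the intermittent demand literature. The function name `classify_demand` is hypothetical.

```python
import numpy as np


def classify_demand(demand, adi_cut=1.32, cv2_cut=0.49):
    """Return (ADI, CV^2, category) for one demand series."""
    demand = np.asarray(demand, dtype=float)
    sizes = demand[demand > 0]
    if sizes.size == 0:
        raise ValueError("series has no nonzero demand")
    adi = demand.size / sizes.size              # average inter-demand interval
    cv2 = (sizes.std() / sizes.mean()) ** 2     # squared coefficient of variation
    if adi > adi_cut:
        category = "lumpy" if cv2 > cv2_cut else "intermittent"
    else:
        category = "erratic" if cv2 > cv2_cut else "smooth"
    return adi, cv2, category


print(classify_demand([0, 0, 10, 0, 0, 0, 1, 0, 0, 2]))
print(classify_demand([5, 5, 5, 5]))
```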


<<Insert Table 1 about here>>

2.4 Statistical Forecast Accuracy Measures

Forecast accuracy measures are critical guidelines for the proper selection and implementation of forecast models. Standard forecast error measures frequently appearing in the literature do not provide fair evaluations for lumpy demand because of the large number of zero-demand periods. Conventional measures, such as MAPE and GMAE, run into problems when the actual demand in the denominator of the calculation is zero or when the forecast equals the actual demand.

Hoover [35], Boylan and Syntetos [36], Willemain [37] and Hyndman [38] suggested various metrics and discussed their strengths and weaknesses. Although it seems that no single metric is free of drawbacks, in this research the adjusted mean absolute percentage error (A-MAPE), the mean absolute scaled error (MASE) and the percentage best (PB) measures were used. MASE is a scale-free error measure which uses naïve forecasts as a benchmark. Let $e_t$ denote the forecast error, $e_t = d_t - f_t$. The scaled error $q_t$ at time $t$ is then calculated using (7), and MASE is the average of the absolute values of $q_t$:

$$q_t = \frac{e_t}{\dfrac{1}{n-1}\sum_{i=2}^{n}\left|d_i - d_{i-1}\right|} \tag{7}$$
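As a sketch of how MASE might be computed, the following function scales the forecast errors by the mean absolute one-step naïve error. Computing the scale on a separate series of past observations (called `history` here, a hypothetical name) follows common practice and is an assumption, since the text does not state which observations enter the denominator.

```python
import numpy as np


def mase(actual, forecast, history):
    """Mean absolute scaled error.

    The scale is the mean absolute error of the one-step naive forecast
    on `history` (assumed here to be the training observations).
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    history = np.asarray(history, dtype=float)
    scale = np.mean(np.abs(np.diff(history)))
    if scale == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")
    q = (actual - forecast) / scale       # scaled errors q_t
    return float(np.mean(np.abs(q)))


print(mase(actual=[2, 4], forecast=[1, 2], history=[1, 3, 2, 4]))
```

A value below 1 indicates that the method is, on average, more accurate than the naïve forecast on the scaling series.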

Hoover [35] suggested three variations on the MAPE: the denominator-adjusted MAPE, the symmetric MAPE and the ratio of MAD to mean. In this study, the last of these was used as the adjusted MAPE (A-MAPE):

$$\text{A-MAPE} = \frac{\sum_{t=1}^{n}\left|d_t - f_t\right| / n}{\sum_{t=1}^{n} d_t / n} \tag{8}$$
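The ratio of the mean absolute deviation to the mean demand can be computed as in this minimal sketch. The function name `a_mape` is hypothetical, and the value is returned as a plain ratio; any further scaling is left to the reader.

```python
import numpy as np


def a_mape(actual, forecast):
    """Ratio of the mean absolute error to the mean actual demand."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    mean_demand = actual.mean()
    if mean_demand == 0:
        raise ValueError("mean demand is zero; the ratio is undefined")
    return float(np.mean(np.abs(actual - forecast)) / mean_demand)


print(a_mape(actual=[0, 4, 0, 4], forecast=[1, 1, 1, 1]))
```

Because the denominator is the mean over all periods, zero-demand periods do not cause division by zero as they do in the conventional MAPE.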

The third error measure, the percentage best (PB), is the percentage of time periods in which one method performs better than the other methods under consideration. This approach is robust to large forecast errors, and the results can be subjected to formal statistical tests. It is a useful measure, although it does not quantify the degree of improvement in forecast error [39].

The mathematical expression of PB for method $m$ is

$$PB_m = \left(\frac{\sum_{t=1}^{n} B_{m,t}}{n}\right) \times 100 \tag{9}$$

where, for time period $t$, $B_{m,t} = 1$ if $|d_t - f_{m,t}|$ is the minimum of $|d_t - f_{k,t}|$ over all methods $k$ under consideration, and $B_{m,t} = 0$ otherwise.
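A possible computation of PB for several methods at once is sketched below. The function name `percentage_best` and the dictionary layout are hypothetical, and counting every tied method as best in a period is an assumption that follows from reading the definition literally.

```python
import numpy as np


def percentage_best(actual, forecasts):
    """Percentage of periods in which each method attains the smallest absolute error.

    forecasts: dict mapping a method name to its forecast series.
    Ties are counted as best for every tied method.
    """
    actual = np.asarray(actual, dtype=float)
    names = list(forecasts)
    errors = np.array(
        [np.abs(actual - np.asarray(forecasts[m], dtype=float)) for m in names]
    )
    best = errors == errors.min(axis=0)   # indicator B[m, t]
    return {m: float(100.0 * best[i].mean()) for i, m in enumerate(names)}


pb = percentage_best([1, 2, 3], {"A": [1, 0, 3], "B": [0, 2, 0]})
print(pb)
```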

To compare the alternative methods under consideration, results were reported on these three

error measures, A-MAPE, MASE and PB.

3. Results and Discussion

For Croston's method and the Syntetos-Boylan approximation, each of the 30 time series was divided into three blocks: initialization, calibration and performance measurement. The smoothing constant value with the minimum mean squared error (MSE) in the calibration block was selected for use in the performance measurement block. The forecasts of the two traditional methods in the performance measurement blocks were then compared with those of the four neural network approaches.

In this paper, three types of neural network, namely the generalized regression neural network, the Elman recurrent neural network and the time delay neural network, were used to model lumpy demand. In addition, the results of these models were compared with the forecasts of the multi-layered perceptron network applied by Gutierrez et al. [25] to lumpy demand forecasting.

Table 2 reports the overall A-MAPEs for the methods under consideration. The RNN model performs best among all methods in general (simple average of the 30 A-MAPEs: 99.68 for RNN vs. 111.44 for IDNN, 126.07 for GRNN, 155.33 for MLP, 163.3 for SBA and 190.87 for Croston's method). In calculating this measure, all periods were considered, including those with no transaction. The RNN model produces more accurate forecasts than SBA in 27 of the 30 items and than MLP in 24 of the 30 items.


Table 3 shows the MASE results for all forecasting methods. Overall, the RNN approach performs better than the other methods (simple average of the 30 MASEs: 0.72 for RNN vs. 0.83 for IDNN, 0.97 for GRNN, 1.14 for MLP, 1.22 for SBA and 1.39 for Croston's method). According to this measure, the RNN model also outperforms SBA in 29 of the 30 items and MLP in 27 of the 30 items.


Tables 4 and 5 report model performance based on the percentage best statistic. Because SBA was superior to Croston's method, and the recurrent neural network was superior to the other proposed neural networks, the RNN is compared only with SBA and MLP in the subsequent analysis. The RNN had the higher PB value for 25 series vs. SBA and for 24 series vs. MLP. These PB statistics further establish the overall superiority of the RNN model (averaging 68.71% vs. 31.29% for SBA, and 65.67% vs. 34.33% for the multi-layered perceptron proposed by Gutierrez et al. [25]).



In summary, we found our proposed neural network models to be superior, in general, to Croston's method, the Syntetos-Boylan approximation and the multi-layered perceptron recently applied by Gutierrez et al. [25], based on all three error measures, i.e. A-MAPE, PB and MASE.

4. Conclusions and Future Studies

Spare parts inventories frequently display lumpy demand streams, which are difficult to

forecast. Croston's method, introduced in 1972, is often used as a benchmark when testing new

methods [23]. Little work has been done on the application of NN modeling to lumpy demand

forecasting. We are aware of only two studies: Carmo and Rodrigues [24], who applied NN

modeling to 10 "irregularly spaced" demand time series, and more recently Gutierrez et al. [25],

who applied an MLP to 24 demand time series. In this study, the generalized regression neural

network (GRNN), the Elman recurrent neural network (RNN) and the time delay neural

network (TDNN) have been used in lumpy demand forecasting. Our study compares the

performance of the GRNN, RNN and TDNN models with that of two traditional methods,

Croston's method and the Syntetos-Boylan approximation, and of the multi-layered

perceptron (MLP) neural network recently suggested by Gutierrez et al. [25]. Using real data sets of 30

types of spare parts from Arak Petrochemical Company in Iran and three performance

measures, A-MAPE, PB and MASE, we show that our proposed approaches produce more

accurate forecasts than these benchmarks. Moreover, this study showed that the Elman recurrent

neural network is the most effective approach in lumpy demand forecasting.

In future studies we shall attempt to use other dynamic neural networks, such as time delay

models with a TDL in the hidden layer, as well as hybrids of time delay networks and recurrent

neural networks. Another interesting line of research is the use of a genetic algorithm to obtain the

optimal neural network architecture. Combining two neural networks, one forecasting the

occurrence of demand and the other its quantity, may also be useful in lumpy

demand forecasting.
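The occurrence-then-quantity idea can be illustrated with a minimal Python sketch. It is not a neural network implementation: simple exponential smoothing is used here as a stand-in for both models, and the function name, the smoothing constant and the initial values are all assumptions made for illustration. The forecast is the estimated probability that demand occurs multiplied by the estimated size of a demand when it does occur.

```python
def two_stage_forecast(demand, alpha=0.1):
    """One-step-ahead forecast as P(demand occurs) * E[size | demand occurs].

    Both components are updated by exponential smoothing, a placeholder
    for the two neural networks suggested as future work.
    """
    if not demand:
        raise ValueError("demand history must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    # Assumed initial values: first-period occurrence and first non-zero size.
    prob = 1.0 if demand[0] > 0 else 0.0
    size = float(next((d for d in demand if d > 0), 0.0))
    for d in demand:
        occurred = 1.0 if d > 0 else 0.0
        prob += alpha * (occurred - prob)   # stage 1: occurrence of demand
        if d > 0:
            size += alpha * (d - size)      # stage 2: quantity, updated only on demand
    return prob * size


# Hypothetical lumpy history with a single demand of size 4.
forecast = two_stage_forecast([0, 0, 4, 0], alpha=0.5)
```

In a neural network version, each of the two smoothing updates would be replaced by a trained model, while the final multiplication of occurrence probability and expected size would stay the same.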

Acknowledgements

The authors wish to thank Dr. A.A. Syntetos for his encouragement on this research.

References

[1] Montgomery D. et al., Forecasting and Time Series Analysis. McGraw-Hill, 1990.

[2] Makridakis S. et al., Forecasting: Methods and Applications. John Wiley and Sons, 1985.

[3] Bartezzaghi E. et al., A simulation framework for forecasting uncertain lumpy demand. International Journal of Production Economics, 59 (1999), 499–510.

[4] Dolgui A. and Pashkevich M., Extended beta-binomial model for demand forecasting of multiple slow-moving items with low consumption and short requests history. Research report, 2005.

[5] Boylan J., Intermittent and Lumpy Demand: A Forecasting Challenge. Foresight, International Journal of Applied Forecasting, 1(1) (2005), 36–42.

[6] Syntetos A.A. and Boylan J.E., On the stock control performance of intermittent demand estimators. International Journal of Production Economics, 103 (2006), 36–47.

[7] Kalchschmidt M. et al., Inventory management in a multi-echelon spare parts supply chain. International Journal of Production Economics, 81–82 (2003), 397–413.

[8] Ghobbar A.A. and Friend C.H., Evaluation of forecasting methods for intermittent parts demand in the field of aviation: a predictive model. Computers & Operations Research, 30 (2003), 2097–2114.

[9] Hua Z.S. and Zhang B., A hybrid support vector machines and logistic regression approach for forecasting intermittent demand of spare parts. Applied Mathematics and Computation, 181(2) (2006), 1035–1048.

[10] Hua Z.S. et al., A new approach of forecasting intermittent demand for spare parts inventories in the process industries. Journal of the Operational Research Society, 58 (2007), 52–61.

[11] Croston J.D., Forecasting and Stock Control for Intermittent Demand. Operational Research Quarterly, 23(3) (1972), 289–303.

[12] Willemain T.R. et al., Forecasting intermittent demand in manufacturing: a comparative evaluation of Croston’s method. International Journal of Forecasting, 10 (1994), 529–538.

[13] Johnston F.R. and Boylan J.E., Forecasting for items with intermittent demand. Journal of the Operational Research Society, 47 (1996), 113–121.

[14] B. and Kingsman B.G., Selecting the best periodic inventory control and demand forecasting methods for low demand items. Journal of the Operational Research Society, 48 (1997), 700–713.

[15] Syntetos A.A. and Boylan J.E., On the bias of intermittent demand estimates. International Journal of Production Economics, 71 (2001), 457–466.

[16] Willemain T. et al., A new approach to forecasting intermittent demand for service parts inventories. International Journal of Forecasting, 20 (2004), 375–387.

[17] Levén E. and Segerstedt A., Inventory control with a modified Croston procedure and Erlang distribution. International Journal of Production Economics, 90 (2004), 361–367.

[18] Boylan J. and Syntetos A.A., The accuracy of a modified Croston procedure. International Journal of Production Economics, 107 (2007), 511–517.

[19] Kingsman B.G. and Eaves A.H.C., for the ordering and stock-holding of spare parts. Journal of the Operational Research Society, 55 (2003), 431–437.

[20] Syntetos A.A. et al., On the categorization of demand patterns. Journal of the Operational Research Society, 56 (2005), 495–503.

[21] Syntetos A.A. and Boylan J.E., The accuracy of intermittent demand estimates. International Journal of Forecasting, 21 (2005), 303–314.

[22] Syntetos A.A. and Boylan J.E., On the stock control performance of intermittent demand estimators. International Journal of Production Economics, 103 (2006), 36–47.

[23] Nezih A. et al., Adapting Wright’s modification of Holt’s method to forecasting intermittent demand. International Journal of Production Economics, 111(2) (2008), 389–408.

[24] Carmo Jose and A. o. J. Rodriguez, Adaptive forecasting of irregular demand processes. Engineering Applications of Artificial Intelligence, 17 (2004), 137–143.

[25] Gutierrez R.S. et al., Lumpy demand forecasting using neural networks. International Journal of Production Economics, 111 (2008), 409–420.

[26] Eaves A.H.C., Forecasting for the ordering and stockholding of consumable spare parts. PhD thesis, Department of Management Science, Lancaster University, UK, 2002.

[27] Hill T. et al., Artificial neural network models for forecasting and decision making. International Journal of Forecasting, 10 (1994), 5–15.

[28] Wong B.K. et al., A bibliography of neural network business application research: 1994-1998. Computers & Operations Research, 27 (2001), 1045–1076.

[29] Hagan M.T. et al., Neural Network Design. PWS Publishing Company, 1995.

[30] Specht D.F., A general regression neural network. IEEE Transactions on Neural Networks, 2(6) (1991), 568–576.

[31] Amin-Naseri M.R. et al., Generalized Regression Neural Network in Modeling Lumpy Demand. 8th International Conference on Operations and Quantitative Management, Bangkok, Thailand, 2007.

[32] Elman J.L. and Zipser D., Learning the hidden structure of speech. Institute of Cognitive Science, Report 8701, UC San Diego, 1987.

[33] Clouse D.S. et al., Time delay neural networks: representation and induction of finite state machines. IEEE Transactions on Neural Networks, 8(5) (1997), 1065–1070.

[34] Boylan J. et al., Classification for forecasting and stock control: a case study. Journal of the Operational Research Society, 59(4) (2008), 473–481.

[35] Hoover J., Measuring Forecast Accuracy: Omissions in Today’s Forecasting Engines and Demand-Planning Software. Foresight, International Journal of Applied Forecasting, 1(4) (2006), 32–35.

[36] Syntetos A.A. and Boylan J.E., Accuracy and Accuracy-Implication Metrics for Intermittent Demand. Foresight, International Journal of Applied Forecasting, 4 (2006), 39–42.

[37] Willemain T., Forecast-Accuracy Metrics for Intermittent Demands: Look at the Entire Distribution of Demand. Foresight, International Journal of Applied Forecasting, 4 (2006), 36–38.

[38] Hyndman R.J., Another Look at Forecast-Accuracy Metrics for Intermittent Demand. Foresight, International Journal of Applied Forecasting, 1(4) (2006), 43–46.

[39] Syntetos A.A. and Boylan J.E., Forecasting for inventory management of service parts. In: Kobbacy K.A.H. and Murthy D.N.P., "Complex System Maintenance Handbook", Springer, Chapter 20, 2007.
