stream flow forecasting using neuro-wavelet technique

HYDROLOGICAL PROCESSESHydrol. Process. 22, 4142–4152 (2008)Published online 26 March 2008 in Wiley InterScience(www.interscience.wiley.com) DOI: 10.1002/hyp.7014

Stream flow forecasting using neuro-wavelet technique

Ozgur Kisi*Erciyes University, Engineering Faculty, Civil Eng. Dept., 38039, Kayseri, Turkey

Abstract:

This paper proposes the application of a neuro-wavelet technique for modelling monthly stream flows. The neuro-wavelet modelis improved by combining two methods, discrete wavelet transform and multi-layer perceptron, for one-month-ahead streamflow forecasting and results are compared with those of the single multi-layer perceptron (MLP), multi-linear regression (MLR)and auto-regressive (AR) models. Monthly flow data from two stations, Gerdelli Station on Canakdere River and Isakoy Stationon Goksudere River, in the Eastern Black Sea region of Turkey are used in the study. The comparison results revealed thatthe suggested model could increase the forecast accuracy and perform better than the MLP, MLR and AR models. Copyright 2008 John Wiley & Sons, Ltd.

KEY WORDS stream flow; neural networks; discrete wavelet transform; modelling

Received 5 July 2007; Accepted 22 January 2008

INTRODUCTION

Many of the activities associated with planning andoperation of the components of a water resource systemrequire forecasts of future events. For the hydrologiccomponent, there is a need for both short term and longterm forecasts of stream flow events in order to optimizethe system or to plan for future expansion or reduction.

Artificial neural networks (ANN) have been success-fully applied in a number of diverse fields includingwater resources. ANN have been used successfully forrainfall–runoff modelling. Smith and Eli (1995) used amulti-layer perceptron (MLP) to predict the peak dis-charge and the time of peak resulting from a singlerainfall pattern. Shamseldin (1997) compared the MLPmodels with the simple linear model, the seasonallybased linear perturbation model and the nearest neighbourlinear perturbation model in rainfall–runoff modelling.He suggested that the neural network shows consider-able promise in the context of rainfall–runoff modelling.Tokar and Johnson (1999) employed a MLP method-ology to forecast daily runoff as a function of dailyprecipitation, temperature, and snowmelt for the LittlePatuxent River watershed in Maryland. They comparedANN rainfall–runoff model with results obtained usingthe statistical regression and a simple conceptual modeland found that the ANN performs better than the others.Sudheer et al. (2002a) proposed a data-driven algorithmfor designing the radial basis function ANN (RBF) struc-ture in rainfall–runoff modelling. They validated theirmethodology using the data for a river basin in India.

In the hydrological forecasting context, recent exper-iments have reported that ANN may offer a promising

* Correspondence to: Ozgur Kisi, Erciyes University, Engineering Fac-ulty, Civil Eng. Dept., 38039, Kayseri, Turkey.E-mail: [email protected]

alternative for stream flow prediction. Karunanithi et al.(1994) used a MLP model to predict stream flow. Theycompared the accuracy of MLP with that of the ana-lytic nonlinear power model and found that the ANNmodel shows better accuracy than the latter. Zealandet al. (1999) investigated the utility of the MLP modelfor short term stream flow forecasting. They exploredthe capabilities of MLP and compared the performanceof this tool with conventional approaches used to forecaststream flow. They concluded that the ANNs developedconsistently outperformed a conventional model duringthe verification (testing) phase. Chang and Chen (2001)used a fuzzy-neural network model for real-time stream-flow prediction. They compared the model results withthose of the autoregressive moving average exogenousvariables model (ARMAX) and found that the fuzzy-neural network model could offer a higher degree ofreliability and accuracy than the ARMAX in stream flowforecasting. Sivakumar et al. (2002) used MLP modelsfor 1-day and 7-day-ahead forecasts of daily stream flow.Sudheer and Jain (2003) compared the MLP and RBF indaily stream flow estimation. They found that the MLPresults were similar to those of the RBF. Cigizoglu (2003)investigated the accuracy of MLP in estimation, forecast-ing and extrapolation of daily stream flow data. Cigizogluand Kisi (2005) developed MLP models using differenttraining algorithms for 1-day-ahead forecasting of dailystream flows. They used k-fold partitioning, a statisticalmethod, in the ANN training stage. Kisi (2004) used MLPmodels in monthly stream flow forecasting. He comparedthe ANN results with the auto-regressive models (AR)and found that the ANN model performed better than theAR. Sahoo and Ray (2006) investigated the accuracy ofMLP and RBF models in forecasting daily flows of aHawaii stream. To the knowledge of the author, thereis no any published work indicating the input–output

Copyright 2008 John Wiley & Sons, Ltd.

STREAM FLOW FORECASTING USING NEURO-WAVELET TECHNIQUE 4143

mapping capability of neuro-wavelet model in modellingof monthly stream flows.

The present study describes utilization of the input–output mapping capabilities of the neuro-wavelet tech-nique in monthly stream flow forecasting. The neuro-wavelet technique is used to forecast monthly flows ofCanakdere and Goksudere River. The results obtainedusing the neuro-wavelet models are compared with thoseof the single neural network, multi-linear regression andauto-regressive models.

MULTI-LAYER PERCEPTRON (MLP)

A MLP has one or more hidden layers, whose compu-tation nodes are correspondingly called hidden neuronsof hidden units. A MLP network structure is shown inFigure 1. The function of hidden neurons is to inter-vene between the external input and the network outputin some useful manner. By adding one or more hiddenlayers, the network is enabled to extract higher orderstatistics. In a rather loose sense, the network acquires aglobal perspective despite its local connectivity due to theextra set of synaptic connections and the extra dimensionof NN interconnections. Detailed theoretical informationabout MLPs can be found in Haykin (1998).

The MLP were trained using a Levenberg–Marquardttechnique because this technique is more powerful andfaster than the conventional gradient descent technique(Hagan and Menhaj, 1994; Cigizoglu and Kisi, 2005).

The back propagation with gradient descent techniqueis a steepest descent algorithm, while the Levenberg–Marquardt algorithm is an approximation to Newton’smethod (Marquardt, 1963). If we have a function V(x)which we want to minimize with respect to the parametervector x, then Newton’s method would be

x D �[r2V�x�]�1rV �x� �1�

where r2V�x� is the Hessian matrix and rV�x� is thegradient. If we assume that V(x) is a sum of squaresfunction

V�x� DN∑

iD1

ei2�x� �2�

then it can be shown that

rV �x� D JT�x�e�x� �3�

r2V �x� D JT�x�J�x�C S �x� �4�

1

2

i

1

2

j

1

2

kWjkWij

L M

Input Output

N

Figure 1. Three-layer MLP architecture

where J(x) is the Jacobean matrix and

S�x� DN∑iD1

eir2ei �x� �5�

For the Gauss–Newton method it is assumed thatS�x� ³ 0, and the update (1) becomes

x D [JT�x�J�x�]�1 JT�x� e�x� �6�

The Levenberg–Marquardt modification to the Gauss–Newton method is

x D [JT�x� J�x�C µI]�1JT�x�e�x� �7�

The parameter � is multiplied by some factor (ˇ)whenever a step would result in an increased V(x). Whena step reduces V(x), � is divided by ˇ. When � is largethe algorithm becomes steepest descent (with step 1/�),while for small � the algorithm becomes Gauss–Newton.The Levenberg–Marquardt algorithm can be considereda trust-region modification to Gauss–Newton. The keystep in this algorithm is the computation of the Jacobeanmatrix. For the neural network-mapping problem theterms in the Jacobean matrix can be computed by a simplemodification to the back propagation algorithm (Haganand Menhaj, 1994).

A difficult task with MLP involves choosing thenumber of hidden nodes. Here, a MLP with one hidden isused with the common trial and error method to select thenumber of hidden nodes. Sigmoid and linear activationfunctions are used for the hidden and output nodes,respectively. ANN networks training was stopped after100 epochs since variation of the error was very smallafter this epoch. The error graph for an ANN modelduring training is shown in Figure 2.

DISCRETE WAVELET TRANSFORM (DWT)

Wavelet analysis is multi-resolution analysis in time andfrequency domains and is an important derivative of

Figure 2. Training error graph for the MLP model

Copyright 2008 John Wiley & Sons, Ltd. Hydrol. Process. 22, 4142–4152 (2008)DOI: 10.1002/hyp

4144 O. KISI

the Fourier transform. Wavelet function �t�, called themother wavelet, can be defined as

∫ C1�1 �t�dt D 0 math-

ematically. a,b�t� can be acquired through compressingand expanding �t�:

a,b�t� D jaj1/2 (

1 � b

a

)b 2 R, a 2 R, a 6D 0

�8�where a,b�t� is the successive wavelet, a is a scale orfrequency factor, b is a time factor; R is the domain ofreal numbers.

If a,b�t� satisfies Equation (8), for the energy finitesignal or time series f�t� 2 L2�R�, the successive wavelettransform of f�t� is defined as:

W f�a, b� D< f, �a, b� >

D jaj1/2∫Rf�t�

(t � b

a

)dt �9�

where �t� is a complex conjugate functions of �t�.Equation (9) indicates that the wavelet transform is thedecomposition of f�t� under different resolution level(scale). In other words, the essence of wavelet transformis to filter wave for f�t� with different filters.

In real applications the successive wavelet is oftendiscrete. Let a D aj0, b D kb0a

j0, a0 > 1, b0 2 R, k, j are

integer numbers. The discrete wavelet transform of f�t�is written as:

W f�j, k� D a�j/20

∫Rf�t� �a�j

0 t � kb0�dt �10�

When a0 D 0, b0 D 1, Equation (10) becomes the binarywavelet transform:

W f�a, b� D 2�j/2�j/20

∫Rf�t� �a�j

0 t � kb0�dt �11�

W f�a, b�or W f�j, k� reflect the characteristics of theoriginal time series in frequency (a or j) and timedomains (b or k) at the same time. When a or j is smallthe frequency resolution of the wavelet transform is low,but the time domain resolution is high. When a or jbecomes large, the frequency resolution of the wavelettransform is high, but the time domain resolution is low(Wang and Ding, 2003).

The DWT operates two sets of functions viewed ashigh-pass and low-pass filters (Figure 3). The originaltime series are passed through high-pass and low-pass fil-ters, and detailed coefficients and approximation (A) sub-time series are obtained.

Time series

High-pass filter

Low-pass filter

Details

Approximation

Figure 3. DWT decomposition of a time series

MULTI-LINEAR REGRESSION (MLR)

If it is assumed that the dependent variable Y is effectedby m independent variables X1, X2, . . . , Xm and a linearequation is selected for the relation among them, theregression equation of Y can be written as:

y D aC b1x1 C b2x2 C . . .C bmxm �12�

y in this equation shows the expected value of thevariable Y when the independent variables take the valuesX1 D x1, X2 D x2, . . . , Xm D xm.

The regression coefficients a, b1, b2, . . . , bm are evalu-ated, similar to simple regression, by minimizing the sumof the eyi distances of observation points from the planeexpressed by the regression equation (Bayazıt and Oguz,1998):

N∑iD1

e2yi D

N∑iD1

�yi � a� b1x1i � b2x2i � bmxmi�2 �13�

In this study, the coefficients a, b1, b2, . . . , bm weredetermined using a least squares method.

AUTO-REGRESSIVE (AR) MODEL

The autoregressive (AR) model of order p can be writtenas AR(p) and is defined as

Xt D ˛1Xt�1 C . . . . . . . . .C ˛t�p C Zt �14�

where Zt is a purely random process and E�Zt� D 0,Var�Zt� D �2

Z. The parameters ˛1, . . . , ˛p are called theAR coefficients. The name ‘auto-regressive’ comes fromthe fact that Xt is regressed on past values of itself.

In this study, the AR(3) model has been applied tostream flow data by using MATLAB. The least squaresmethod was used to estimate AR coefficients.

CASE STUDY

In this study, the monthly flow data of Gerdelli Stationon Canakdere River and Isakoy Station on GoksudereRiver in the Eastern Black Sea region of Turkey wereused. The location of Gerdelli and Isakoy Stations areshown on Figure 4. The drainage areas at these sites are285 km2 for Gerdelli and 395 km2 for Isakoy Station.The observed data are 40 years (480 months) long withobservation period between 1960 and 1999 for bothstations. The observed data are for hydrologic years, i.e.the first month of the year is October and the last monthof the year is September.

After examining the data and noting the periods inwhich there were gaps in one or more of the two vari-ables, the periods for training and testing were cho-sen. In the applications, the first 32 years of flow data(384 months, 80% of the whole data set) was used fortraining and the remaining 8 years (96 months, 20% ofthe whole data set) for testing. For the Gerdelli and



Figure 4. Gerdelli and Isakoy Stations on Canakdere and Goksudere rivers

Table I. Monthly statistical parameters of data set for Gerdelli and Isakoy Station (xmean, Sx, Csx , xmin, xmax , r1, r2 and r3 denote theoverall mean, standard deviation, skewness, lag-1, lag-2 and lag-3 auto-correlation coefficients, respectively)

Station Data set xmean�m3 s�1) Sx�m3 s�1) Csx�m3 s�1) xmin�m3 s�1) xmax�m3 s�1) r1 r2 r3

Gerdelli Training data 5Ð78 6Ð97 1Ð86 0Ð01 49Ð4 0Ð536 0Ð331 0Ð008Test data 6Ð80 7Ð82 1Ð50 0Ð26 34Ð1 0Ð374 0Ð284 0Ð021Entire data 5Ð98 7Ð15 1Ð78 0Ð01 49Ð4 0Ð499 0Ð322 0Ð012

Isakoy Training data 4Ð50 4Ð95 1Ð31 0Ð00 25Ð0 0Ð566 0Ð336 0Ð019Test data 4Ð60 5Ð31 1Ð28 0Ð02 22Ð4 0Ð410 0Ð325 �0Ð004Entire data 4Ð52 5Ð02 1Ð30 0Ð00 25Ð0 0Ð531 0Ð334 0Ð013

Isakoy Station, the monthly flow statistics of the train-ing set, testing set and entire data set are presented inTable I. In this table xmean, Sx, Csx, xmin, xmax, r1, r2,r3 denote the overall mean, standard deviation, skew-ness, lag-1, lag-2 and lag-3 auto-correlation coefficients,respectively. The observed monthly flows show highpositive skewness (Csx D 1Ð78 and 1Ð30). The data forGerdelli Station have more scattered distribution thanthose for the Isakoy Station. The auto-correlations arequite low showing low persistence (e.g. r1 D 0Ð531, r2 D0Ð334, r3 D 0Ð013). The testing data set extremes (xmin D0Ð26 m3 s�1, xmax D 34Ð1 m3/s) fall within the trainingdata set limits (xmin D 0Ð01 m3 s�1, xmax D 49Ð4 m3 s�1)for the Gerdelli Station. This condition is the same atthe Isakoy Station. This ensures that the trained neu-ral network models do not face difficulties in makingextrapolations.

NEURO-WAVELET MODELS

Two methods, ANN and DWT, were used to obtaina neuro-wavelet (NW) model. The NW model is anANN model which uses sub-time series componentsobtained using DWT on original data. The NW model

structure developed in the present study is shown inFigure 5. For the NW model inputs, the original timeseries are decomposed into a certain number of sub-timeseries components (Ds) by the DWT algorithm. Eachcomponent plays a different role in the original timeseries and the behaviour of each sub-time series is distinct(Wang and Ding, 2003). The ANN is constructed suchthat the Ds of the original input time series are the inputsto the ANN and the original output time series are theoutputs of the ANN.

In the present study, the previous monthly stream flowtime series were decomposed into various Ds at differentresolution levels using the DWT to estimate the current

Decompose inputtime series usingDWT

Output (e.g., currentflow value)

Input time series(e.g., precedingmonthly flows)

ANNmodel

Add the effectiveDs componentsfor input of ANN

Figure 5. NW model structure


4146 O. KISI

stream flow value. Five resolution levels (2-4-8-16-32)were employed in this study. The original stream flowinput time series at Gerdelli Station and their Ds, that is,the time series of 2-month mode (D1), 4-month mode(D2), 8-month mode (D3), 16-month mode (D4), 32-month mode (D5), and approximate mode are shownin Figure 6. The correlation coefficients between each Dsub-time series and the original stream flow time seriesare given in Tables II and III for the Gerdelli and IsakoyStation, respectively. These correlation values provideinformation for the determination of effective waveletcomponents on stream flow. It can be seen from thesetables that the D1 component have significantly low cor-relations for both stations. The approximation compo-nents have the highest correlations. According to thesecorrelation analyses between Ds and the original cur-rent stream flow data (output), the effective components(D2, D3, D4 and D5) are selected. The model becomesexponentially more complex as the number of variables(or weights) increases. Using a large number of inputparameters or weights should be avoided to save timeand calculation effort. Therefore, the new series obtainedby adding the effective Ds and approximation componentare used as input to the ANN model.

APPLICATION AND RESULTS

The mean square errors (MSE), mean absolute relativeerrors (MARE), mean absolute errors (MAE) and corre-lation coefficient (R) statistics are used to evaluate theaccuracy of the ANN and NW model. R measures thedegree to which two variables are linearly related. MSEand MARE provide different types of information aboutthe predictive capabilities of the model. The MSE mea-sures the goodness-of-fit relevant to high flow valueswhereas the MARE yields a more balanced perspective ofthe goodness-of-fit at moderate flows (Karunanithi et al.,1994). The MSE, MARE and MAE are defined as:

MSE D 1

N

N∑iD1

�Yiobserved � Yiestimate�2 �15�

MARE D 1

N

N∑iD1

∣∣∣∣Yiobserved � YiestimateYiobserved

∣∣∣∣ 100 �16�

MAE D 1

N

N∑iD1

jYiobserved � Yiestimatej �17�

in which N is the number of data sets, Yi is the monthlystream flow.

0102030405060

0 100 200 300 400

months

0 100 200 300 400

months

0 100 200 300 400

months

0 100 200 300 400

months

Str

eam

flow

(m

3 /s)

-20

-10

0

10

20

-20

-10

0

10

20

D1

-10

-5

0

5

10

0 100 200 300 400

months

0 100 200 300 400

months

0 100 200 300 400

months

D2

D3

-4

-2

0

2

4

6

D4

-6-4-20246

D5

0

2

4

6

8

10

App

roxi

mat

ion

Figure 6. Decomposed wavelet sub-time series components (Ds) of stream flow data of Gerdelli Station



Table II. The correlation coefficients between each of sub-time series and original stream flow data—Gerdelli Station

Discretewaveletcomponents

Correlation betweenD sub-time series at

time t-1 and measuredstream flow at time t







D1 �0Ð272 0Ð006 0Ð073 �0Ð020D2 0Ð207 �0Ð104 �0Ð217 �0Ð097D3 0Ð644 0Ð334 �0Ð047 �0Ð409D4 0Ð234 0Ð211 0Ð172 0Ð107D5 0Ð167 0Ð161 0Ð150 0Ð129Approximate 0Ð169 0Ð174 0Ð176 0Ð174

Table III. The correlation coefficients between each of sub-time series and original stream flow data—Isakoy Station

Discretewaveletcomponents









D1 �0Ð306 0Ð032 0Ð069 �0Ð018D2 0Ð236 �0Ð136 �0Ð254 �0Ð103D3 0Ð634 0Ð338 �0Ð039 �0Ð407D4 0Ð209 0Ð185 0Ð150 0Ð099D5 0Ð203 0Ð199 0Ð192 0Ð177Approximate 0Ð151 0Ð157 0Ð157 0Ð154

For the Gerdelli Station, four input combinations basedon preceding monthly flows were evaluated to estimatecurrent flow value. The input combinations evaluatedin the study were; (i) Qt�1, (ii) Qt�1 and Qt�2, (iii)Qt�1, Qt�2 and Qt�3, (iv) Qt�1, Qt�2, Qt�3 and Qt�4.In all cases, the output layer has only one neuron, thedischarge Qt for the current month. The MSE, MARE,MAE and R statistics of ANN and NW models in thetest period are given in Table IV. The table reveals thatamong the ANN models, the model whose inputs arethe flows of three previous months (input combination(iii)) has the smallest MSE (39Ð9 �m3 s�1�2) and thehighest R (0Ð561). In Table IV, of all the NW models, themodel comprising input combination (iii) has the smallestMSE (17Ð6 �m3 s�1�2) and MAE (2Ð99 m3 s�1) and thehighest R (0Ð837). As can be seen from the table, the NWmodel provides much better performance than the ANNin monthly stream flow estimation.

A critical issue in training an ANN is to avoidingoverfitting, as this reduces its capacity for generaliza-tion. If too many neurons are used, the network has

too many parameters and may overfit the data. In con-trast, if too few neurons are included in the network,it might not be possible to fully detect the signal andvariance of a complex data set. Here, the number of hid-den nodes in the ANN were determined using a trialand error method. The number of nodes in the hiddenlayer of the ANN and NW models is listed in Table V.For the Gerdelli Station, the optimum number of hid-den nodes in the ANN and NW models was found tovary between 2 and 5. The best way to avoid over-fitting is to use lots of training data. For noise-freedata, if we have at least five times as many train-ing cases as there are weights in the network, overfit-ting is unlikely. The other way to avoid the overfittingproblem is to use different ANN structures (Sudheeret al., 2002b). In this study, different ANN structureswere tried and 32 years of flow data (384 months) wereused for training the ANN and NW models. The 20weights (3 ð 5 C 5 D 20) were used for the most com-plex NW(3,5,1) model comprising three (3) inputs, five(5) hidden and one (1) output nodes (input combination

Table IV. The mean square errors (MSE), mean absolute relative errors (MARE), mean absolute errors (MAE) and correlation(R) statistics of neural networks and neuro-wavelet applications for Gerdelli Station (test period)

Model inputs ANN Neuro-wavelet

MSE(�m3 s�1�2)

MARE(%)

MAE(m3 s�1)

R MSE(�m3 s�1�2)

MARE(%)

MAE(m3 s�1)

R

(i) Qt�1 48Ð9 167 4Ð92 0Ð474 36.5 135 4.09 0.616(ii) Qt�1 and Qt�2 50Ð9 170 4Ð96 0Ð446 21.7 94Ð4 3.05 0.793(iii) Qt�1, Qt�2 and Qt�3 39Ð9 159 4Ð49 0Ð561 17.6 97Ð4 2.99 0.837(iv) Qt�1, Qt�2, Qt�3 and Qt�4 43Ð3 150 4Ð29 0Ð514 21.5 113 3.00 0.794


4148 O. KISI

(iii)). The training data seem to be sufficient to avoidoverfitting.

Table V. Numbers of hidden nodes for ANN and NW models

Model inputs The number of hidden nodes

Gerdelli Station Isakoy StationANN NW ANN NW

(i) Qt�1 3 3 5 5(ii) Qt�1 and Qt�2 5 5 2 2(iii) Qt�1, Qt�2 and Qt�3 2 5 2 3(iv) Qt�1, Qt�2, Qt�3 and Qt�4 3 3 2 2

The ANN and NW estimates in the test period arecompared with the observed flows in the form of ahydrograph in Figure 7. The residuals of the models arealso provided in this figure. It can be seen, especiallyfrom the residuals, that the NW model performs betterthan the ANN. There are underestimations of somepeaks with both models. The availability of only asmall number of training patterns for peak flows couldbe a reason for this. The estimates of each model arerepresented in Figure 8 in the form of a scatterplot. Itis seen from the scatterplots that the NW estimates arecloser to the corresponding observed flow values thanthose of the ANN model. As seen from the fit line

0

4

8

12

16

20

24

28

32

36

40

0 15 30 45 60 75 90

month

0 15 30 45 60 75 90

month

0 10 20 30 40 50 60 70 80 90

month

Str

eam

flo

ws

(m3 /

s)S

trea

m f

low

s (m

3 /s)

observedANN

0

4

8

12

16

20

24

28

32

36

40 observedNeuro-wavelet

-24

-20

-16

-12

-8

-4

0

4

8

12

16

20

Res

idua

ls (

m3 /

s)

ANNNeuro-wavelet

Figure 7. Monthly stream flow estimates of ANN and NW models in test period—Gerdelli Station



y = 0.328x + 4.018R = 0.561

y = 0.763x + 1.584R = 0.837

0

10

20

30

40

0

Observed (m3/s)

AN

N M

odel

(m

3 /s)

0

10

20

30

40

NW

Mod

el (

m3 /

s)

10 20 30 40 0

Observed (m3/s)

10 20 30 40

Figure 8. Monthly stream flow estimates of ANN and NW models in test period—Gerdelli Station

equations (assume that the equation is y D aox C a1)in the scatterplots, the ao and a1 coefficients for theNW model are, respectively, closer to 1 and 0 witha higher R value of 0Ð837 than those of the ANNmodel.

The performances of the ANN, NW, MLR and ARmodels in terms of MSE, MARE, MAE and R arepresented in Table VI. It can be seen from Table VI thatthe NW model outperforms all other models in terms ofseveral performance criteria. The ANN is ranked as thesecond best. The MLR and AR models seem to havealmost equal accuracy.

For the Gerdelli Station, the results were also testedusing one-way analysis of variance (ANOVA) and a t-testto verify the robustness (the significance of differencesbetween the model estimates and observed values) of themodels. Both tests were set at a 95% significance level.Namely, differences between observed and estimatedvalues are considered significant when the resultantsignificance level (P) is lower than 0Ð05 using two-tailed significance levels. The statistics of the tests aregiven in Table VII. The NW(3,5,1) model yields smalltest values (0Ð00001 and 0Ð003) with a high significancelevel (0Ð997) for the ANOVA and t-test, respectively.According to the test results, the NW is more robust(similarity between observed flows and NW estimatesis significantly higher) in monthly flow estimation thanthe ANN.

The ANN, NW, MLR and AR(3) estimate meanflow values of 6Ð69 m3 s�1 as 6Ð21 m3 s�1, 6Ð69 m3 s�1,5Ð11 m3 s�1 and 5Ð11 m3 s�1 with underestimations of7Ð17%, 0Ð06%, 23Ð6% and 23Ð6%, respectively. The

Table VI. Mean square errors (MSE), mean absolute relativeerrors (MARE), mean absolute errors (MAE) and correlation(R) statistics of MLR, AR(3), ANN, and NW models for Gerdelli

Station (test period)

Models MSE(�m3 s�1�2)

MARE(%)

MAE(m3 s�1)

R

MLR 54Ð3 108 4Ð85 0Ð448AR(3) 54Ð2 109 4Ð82 0Ð437ANN 39Ð9 159 4Ð49 0Ð561NW 17Ð6 97Ð4 2Ð99 0Ð837

Table VII. Analysis of variance and t-test for stream flow esti-mation

ANOVA t-test Resultant significance level

F-statistic t-statistic

Gerdelli StationMLR 2Ð49766 1Ð580 0Ð116AR(3) 2Ð48832 1Ð577 0Ð117ANN 0Ð27000 0Ð519 0Ð604NW 0Ð00001 0Ð003 0Ð997Isakoy StationMLR 1Ð702 1Ð305 0Ð193AR(3) 1Ð701 1Ð304 0Ð194ANN 0Ð266 0Ð516 0Ð607NW 0Ð005 0Ð069 0Ð945

NW model provides the best estimate. This result hassignificance for water resources applications, where meanflow is required rather than the extremes.

The MSE, MARE, MAE and R statistics for the ANNand NW models are given for the Isakoy Station inTable VIII. The number of nodes in the hidden layer ofthe ANN and NW models is provided in Table V. Forthis station, the optimum number of hidden nodes for theANN and NW models was found to vary between 2 and5, as found for the first station. Table VIII indicates thatANN(3,2,1) and NW(3,3,1) models, whose inputs are theflows of the three previous months (input combination(iii)), have the best performance. As can be seen fromthis table, the NW model performs much better than theANN. The forecasting performance of the ANN and NWmodels are compared in Figure 9 in the form of a hydro-graph. It is obvious from the hydrographs that the NWmodel provides much closer estimates to the correspond-ing observed flow values than the ANN. The superiorityof the NW model can also be observed from the resid-uals graph. The matching of observed stream flows andthe estimates of each model are plotted in Figure 10, inthe form of a scatter diagram for ease of comparison.As seen from the figure, the NW model has less scat-tered estimates than the ANN model. The performancesof the ANN and NW models are compared with the MLRand AR models in Table IX, and it is seen that the NWmodel performs much better than the other models. The


4150 O. KISI

Table VIII. Mean square errors (MSE), mean absolute relative errors (MARE), mean absolute errors (MAE) and correlation(R) statistics of neural networks and neuro-wavelet applications for Isakoy Station (test period)

Model inputs ANN Neuro-wavelet

MSE(�m3 s�1�2)

MARE(%)

MAE(m3 s�1)

R MSE(�m3 s�1�2)

MARE(%)

MAE(m3 s�1)

R

(i) Qt�1 22Ð3 517 3Ð49 0Ð485 14.0 319 2.74 0.673(ii) Qt�1 and Qt�2 22Ð2 529 3Ð37 0Ð510 9.14 275 2.11 0.801(iii) Qt�1, Qt�2 and Qt�3 14Ð6 588 2Ð89 0Ð657 8.19 285 1.88 0.823(iv) Qt�1, Qt�2, Qt�3 and Qt�4 17Ð8 613 3Ð12 0Ð561 8.88 226 1.95 0.805

0

4

8

12

16

20

0 15 30 45 60 75 90

month

0 15 30 45 60 75 90

month

0 10 20 30 40 50 60 70 80 90

month

Str

eam

flo

ws

(m3 /

s)

0

4

8

12

16

20

Str

eam

flo

ws

(m3 /

s)

observedNeuro-wavelet

-16

-12

-8

-4

0

4

8

12

Res

idua

ls (

m3 /

s)

observedANN

ANNNeuro-wavelet

Figure 9. Monthly stream flow estimates of ANN and NW models in test period—Isakoy Station



0

5

10

15

20

0 10 15 20

Observed (m3/s)

AN

N M

odel

(m

3 /s)

0

5

10

15

20

AN

N M

odel

(m

3 /s)

5 0 10 15 20

Observed (m3/s)

5

y = 0.468x + 2.713R = 0.657

y = 0.710x + 1.347R = 0.823

Figure 10. Monthly stream flow estimates of ANN and NW models in test period—Isakoy Station

Table IX. The mean square errors (MSE), mean absolute relativeerrors (MARE), mean absolute errors (MAE) and correlation(R) statistics of MLR, AR(3), ANN, and NW models for Isakoy

Station (test period)

Models MSE(�m3 s�1�2)

MARE(%)

MAE(m3 s�1)

R

MLR 22Ð0 160 3Ð22 0Ð498AR(3) 21Ð8 160 3Ð18 0Ð493ANN 14Ð6 588 2Ð89 0Ð657NW 8Ð19 285 1Ð88 0Ð823

ANN also has better accuracy than the MLR and ARmodels. The MLR and AR models show similar perfor-mance.

The statistics of the ANOVA and t-test are given inTable VII. The NW model seems to be more robust thanthe ANN, MLR and AR models in estimation of monthlystream flows. For the Isakoy Station, the means of theANN, NW, MLR and AR(3) flow forecasts are 7Ð47%,1Ð08% higher and 19Ð8%, 19Ð8% lower than the observedflow, respectively. Here also the NW model has the bestaccuracy.

For the two different rivers, the models show dramat-ically different abilities to capture the peak flows. Thismay be due to the fact that the statistical properties ofthe flow data are different for the Gerdelli and IsakoyStation. It can be seen from Table I that the flow dataof Gerdelli Station have a much more skewed distribu-tion than those of the Isakoy Station (see Csx valuesin Table I). The auto-correlations for Gerdelli are alsolower than those for Isakoy (see r1, r2 and r3 values inTable I).

CONCLUSIONS

The accuracy of the neuro-wavelet (NW) technique hasbeen investigated for modelling monthly stream flows.The NW models were developed by combining two meth-ods, artificial neural networks (ANN) and the discretewavelet transform (DWT). The NW and ANN mod-els were tested by applying to various input combi-nations of monthly streamflow data from Gerdelli and

Isakoy Stations in the Eastern Black Sea region ofTurkey. The test results indicate that the DWT couldincrease the accuracy of ANN models in modellingmonthly stream flows. The NW and ANN models werecompared with multi-linear regression (MLR) and auto-regressive (AR) models. The comparison results showedthat the NW model performed much better than theANN, MLR, and AR models. The study only used datafrom two areas and further studies using more datafrom different areas may be required to strengthen theseconclusions.

REFERENCES

Bayazıt M, Oguz B. 1998. Probability and Statistics for Engineers. BirsenPublishing House: Istanbul, Turkey.

Chang F-J, Chen Y-C. 2001. A counterpropagation fuzzy-neural networkmodelling approach to real time streamflow prediction. Journal ofHydrology 245: 153–164.

Cigizoglu HK. 2003. Estimation, forecasting and extrapolation of flowdata by artificial neural networks. Hydrological Sciences Journal 48(3):349–361.

Cigizoglu HK, Kisi O. 2005. Flow prediction by three back propagationtechniques using k-fold partitioning of neural network training data.Nordic Hydrology 36(1): 49–64.

Hagan MT, Menhaj MB. 1994. Training feed forward networks withthe Marquaradt algorithm. IEEE Transactions on Neural Networks 6:861–867.

Haykin S. 1998. Neural Networks—A Comprehensive Foundation, 2ndedn. Prentice-Hall: Upper Saddle River, NJ; 26–32.

Karunanithi N, Grenney WJ, Whitley D, Bovee K. 1994. Neuralnetworks for river flow prediction. Journal Computing in CivilEngineering ASCE 8(2): 201–220.

Kisi O. 2004. River flow modelling using artificial neural networks. ASCEJournal of Hydrologic Engineering 9(1): 60–63.

Marquardt D. 1963. An algorithm for least squares estimation of non-linear parameters. Journal of the Society of Industrial and AppliedMathematics 11: 431–441.

Sahoo GB, Ray C. 2006. Flow forecasting for a Hawaii stream usingrating curves and neural networks. Journal of Hydrology 317:63–80.

Shamseldin AY. 1997. Application of a neural network technique torainfall–runoff modeling. Journal of Hydrology 199: 272–294.

Sivakumar B, Jayawardena AW, Fernando TMKG. 2002. River flowforecasting: use of phase space recostruction and artificial neuralnetworks approaches. Journal of Hydrology 265: 225–245.

Smith J, Eli RN. 1995. Neural-network models of rainfall–runoffprocess. Journal Water Resource Planning and Management, ASCE121(6): 499–508.

Sudheer KP, Gosain AK, Ramasastri KS. 2002a. A data-driven algorithmfor constructing artificial neural network rainfall–runoff models.Hydrological Processes 16: 1325–1330.


4152 O. KISI

Sudheer KP, Gosain AK, Rangan DM, Saheb SM. 2002b. Modellingevaporation using an artificial neural network algorithm. HydrologicalProcesses 16: 3189–3202.

Sudheer KP, Jain SK. 2003. Radial basis function neural network formodelling rating curves. Journal of Hydrologic Engineering 8(3):161–164.

Tokar AS, Johnson PA. 1999. Rainfall–runoff modelling using artificialneural networks. Journal of Hydrologic Engineering, ASCE 4(3):232–239.

Wang W, Ding J. 2003. Wavelet network model and its application to theprediction of the hydrology. Nature and Science 1(1): 67–71.

Zealand CM, Burn DH, Simonovic SP. 1999. Short term streamflowforecasting using artificial neural networks Journal of Hydrology 214:32–48.


stream flow forecasting using neuro-wavelet technique

Documents