
Engineering Applications of Artificial Intelligence 25 (2012) 783–792

Precipitation forecasting by using wavelet-support vector machine conjunction model

Ozgur Kisi a,*, Mesut Cimen b

a Erciyes University, Engineering Faculty, Civil Engineering Department, 38039 Kayseri, Turkiye

b Suleyman Demirel University, Engineering Faculty, Civil Engineering Department, Isparta, Turkiye

* Corresponding author. Tel.: +90 352 4374901; fax: +90 352 4375784. E-mail address: [email protected] (O. Kisi).

Article info

Article history: Received 22 March 2011; received in revised form 26 August 2011; accepted 14 November 2011; available online 30 November 2011.

Keywords: Precipitation; Discrete wavelet transform; Support vector machine; Forecast

0952-1976/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.

doi:10.1016/j.engappai.2011.11.003

Abstract

A new wavelet-support vector machine conjunction model for daily precipitation forecasting is proposed in this study. The conjunction method, which combines the discrete wavelet transform and the support vector machine, is compared with the single support vector machine for one-day-ahead precipitation forecasting. Daily precipitation data from the Izmir and Afyon stations in Turkey are used in the study. The root mean square error (RMSE), mean absolute error (MAE), and correlation coefficient (R) statistics are used as comparison criteria. The comparison results indicate that the conjunction method could increase the forecast accuracy and perform better than the single support vector machine. For the Izmir and Afyon stations, the conjunction models, with RMSE = 46.5 mm, MAE = 13.6 mm, R = 0.782 and RMSE = 21.4 mm, MAE = 9.0 mm, R = 0.815 in the test period, respectively, are found to be superior in forecasting daily precipitation to the most accurate support vector regression models, with RMSE = 71.6 mm, MAE = 19.6 mm, R = 0.276 and RMSE = 38.7 mm, MAE = 14.2 mm, R = 0.103, respectively. The ANN method was also applied to the same data set, and only a slight difference was found between the ANN and SVR methods.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Precipitation forecasting is essential for the planning and management of water resources, but the temporal and spatial variation of rainfall makes it difficult. It is especially important for hydrologists and meteorologists to accurately estimate excessive daily precipitation, which is responsible for some floods. Quantitative rainfall prediction has been studied using numerical models (Bustamante et al., 1999; Olson et al., 2004) and physical models (Georgakakos and Bras, 1984). However, these models are not successful enough in forecasting precipitation (Olson et al., 1995) due to inaccurate initial conditions, parameterization schemes of subscale phenomena, and limited spatial resolution (Ramirez et al., 2005). Rainfall prediction with a numerical weather model generally depends heavily on grid point values, which is problematic because rainfall is highly variable in both space and time.

In recent decades, soft computing techniques have been successfully used in precipitation/rainfall forecasting. French et al. (1992) used an artificial neural network (ANN) for forecasting rainfall intensity fields at a lead time of 1 h. Navone and Ceccatto (1994) predicted summer monsoon rainfall over India using an ANN. Sivapragasam et al. (2001) forecasted rainfall using singular spectrum analysis coupled with a support vector machine (SVM) approach. Freiwan and Cigizoglu (2005) investigated the accuracy of the ANN technique for estimating monthly precipitation amounts. Marzano et al. (2006) compared the accuracy of the ANN approach with previously developed regression techniques in estimating precipitation intensity. Ingsrisawang et al. (2008) used decision tree, ANN and SVM methods for short-term rain forecasting. Hung et al. (2009) applied ANN for forecasting rainfall in Bangkok, Thailand. Wu et al. (2010a,b) used SVM for daily rainfall forecasting. Dastorani et al. (2010) examined the potential of ANN and neuro-fuzzy models for dryland precipitation prediction. Wu et al. (2010a,b) predicted daily and monthly rainfall time series using modular ANN coupled with data-preprocessing techniques. Moustris et al. (2011) forecasted precipitation in Greece using the ANN method. El-Shafie et al. (2011) used neuro-fuzzy and ANN techniques for forecasting monthly rainfall of the Klang River in Malaysia. Hong (2008) used SVM and a chaotic particle swarm optimization algorithm for rainfall forecasting.

The application of the wavelet transform for analyzing variations, periodicities and trends in time series has received much attention in recent years (Smith et al., 1998; Lu, 2002; Chou and Wang, 2002; Dai et al., 2003; Coulibaly and Burn, 2004; Partal and Kucuk, 2006; Zhou et al., 2008; Santos et al., 2009; Kisi, 2007, 2009; Kisi and Cimen, 2011). Smith et al. (1998) used a discrete wavelet transform (DWT) for quantifying streamflow variability. They suggested that streamflows could be effectively classified into distinct hydroclimatic categories using the DWT. Lu (2002) applied the wavelet transform to decompose the interdecadal and interannual components of rainy-season rainfall data. Chou and Wang (2002) used a DWT for the decomposition of the unit hydrograph and employed on-line estimation of the unit hydrograph using the DWT components. Xingang et al. (2003) used wavelet analysis to investigate the rainy-season rainfall spectrum of North China and its evolution, with the summer monsoon decaying on an interdecadal time scale. Coulibaly and Burn (2004) used wavelet analysis to identify and describe variability in annual Canadian streamflows and to gain insights into the dynamical link between the streamflows and the dominant modes of climate variability in the Northern Hemisphere. Partal and Kucuk (2006) used a DWT for determining the possible trends in annual total precipitation series. They used the precipitation records from meteorological stations of Turkiye and concluded that trend analysis on the DWT components of the precipitation time series clearly explains the trend structure of the data. Kisi (2008) predicted monthly streamflows using wavelet network models. Zhou et al. (2008) proposed a wavelet predictor-corrector model for the simulation and prediction of monthly discharge time series. Kisi (2009) introduced the wavelet-ANN conjunction model for forecasting intermittent streamflow. Nourani et al. (2009) predicted Ligvanchai watershed precipitation using a combined neural-wavelet model. Kisi and Cimen (2011) proposed a wavelet and support vector machine conjunction model for monthly streamflow forecasting. All these studies showed that the wavelet transform is an effective tool for precisely locating irregularly distributed multi-scale features of climate elements in space and time.

The purpose of this paper is to investigate the performance of the wavelet and support vector machine conjunction model for one-day-ahead precipitation forecasting and to compare it with the performance of single SVM and ANN models. The present study is the first application in the literature of a wavelet and support vector machine model to precipitation forecasting.

2. Support vector machines

The idea of SVMs, which are known as classification and regression procedures, was developed by Vapnik (1995). In the open literature, regression with SVMs is referred to as support vector regression (SVR). In regression estimation with SVR, we attempt to estimate a functional dependency $f(\vec{x})$ between a set of sampled points $X = \{\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_l\}$ taken from $R^n$ and target values $Y = \{y_1, y_2, \ldots, y_l\}$ with $y_i \in R$ (herein, $\vec{x}_i$ denotes preceding daily precipitation and $y_i$ denotes current daily precipitation). Let us assume that these samples have been generated independently from an unknown probability distribution function $P(\vec{x}, y)$ and a class of functions (Vapnik, 1995):

$$F = \{ f \mid f(\vec{x}) = (\vec{w}, \vec{x}) + B : \vec{w} \in R^n,\ R^n \to R \} \tag{1}$$

where $\vec{w}$ and $B$ are coefficients that have to be estimated from the input data. Herein, the fundamental problem is to find a function $f(\vec{x}) \in F$ that minimizes a risk functional:

$$R[f(\vec{x})] = \int l\big(y - f(\vec{x}), \vec{x}\big)\, dP(\vec{x}, y) \tag{2}$$

where $l$ is a loss function used to measure the deviation between the target value, $y$, and the estimate, $f(\vec{x})$. As the probability distribution function $P(\vec{x}, y)$ is unknown, $R[f(\vec{x})]$ cannot be minimized directly; only the empirical risk function can be computed:

$$R_{emp}[f(\vec{x})] = \frac{1}{N} \sum_{i=1}^{N} l\big(y_i - f(\vec{x}_i)\big) \tag{3}$$

This traditional empirical risk minimization is not advisable without any means of structural control or regularization. Therefore, a regularized risk function with the smallest steepness among the functions that minimize the empirical risk function could be used as

$$R_{reg}[f(\vec{x})] = R_{emp}[f(\vec{x})] + \gamma \lVert \vec{w} \rVert^2 \tag{4}$$

where $\gamma$ is a constant ($\gamma \geq 0$). This additional term reduces the model space and thereby controls the complexity of the solution. For this reason, the following form of this expression can be considered (Smola, 1996; Cimen, 2008):

$$R_{reg}[f(\vec{x})] = C_c \sum_{x_i \in X} l_\varepsilon\big(y_i - f(\vec{x}_i)\big) + \frac{1}{2} \lVert \vec{w} \rVert^2 \tag{5}$$

where $C_c$ is a positive constant (i.e., an additional capacity control parameter) that has to be chosen beforehand. The constant $C_c$, which influences the trade-off between the approximation error and the norm of the regression (weight) vector, $\lVert \vec{w} \rVert$, is a design parameter. The loss function in this expression, which is called the $\varepsilon$-insensitive loss function, has the advantage that not all the input data are needed for describing the regression vector $\vec{w}$. It can be written as

$$l_\varepsilon\big(y_i - f(\vec{x}_i)\big) = \begin{cases} 0 & \text{for } |y_i - f(\vec{x}_i)| < \varepsilon \\ |y_i - f(\vec{x}_i)| & \text{otherwise} \end{cases} \tag{6}$$
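As an illustration of Eqs. (5) and (6), the following minimal Python sketch evaluates the regularized risk for a linear model of the form in Eq. (1). The function names and the toy data are assumptions made for this example only; the loss follows Eq. (6) exactly as printed here, whereas the widely used form of the $\varepsilon$-insensitive loss subtracts $\varepsilon$ outside the tube.

```python
def eps_insensitive_loss(residual, eps):
    """Eq. (6) as written above: 0 inside the eps-tube, |residual| outside it."""
    r = abs(residual)
    return 0.0 if r < eps else r


def regularized_risk(w, b, xs, ys, c_c, eps):
    """Eq. (5) for the linear model f(x) = (w, x) + B of Eq. (1)."""
    def f(x):
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    data_term = sum(eps_insensitive_loss(y - f(x), eps) for x, y in zip(xs, ys))
    weight_term = 0.5 * sum(wi * wi for wi in w)
    return c_c * data_term + weight_term


# Toy data (hypothetical): one lagged precipitation value per sample.
xs = [[0.0], [1.0], [2.0]]
ys = [0.0, 1.05, 3.0]
risk = regularized_risk(w=[1.0], b=0.0, xs=xs, ys=ys, c_c=10.0, eps=0.1)
print(risk)
```

Only the third sample falls outside the tube, so a larger $C_c$ penalizes that single deviation more heavily relative to the weight term.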

This function behaves as a biased estimator when combined with a regularization term ($\gamma \lVert \vec{w} \rVert^2$). The loss is equal to 0 if the difference between the predicted value $f(\vec{x}_i)$ and the measured value $y_i$ is less than $\varepsilon$. The choice of the $\varepsilon$ value is easier than the choice of $C_c$, and it is often given as a desired percentage of the output values $y_i$. Hence, the nonlinear regression function is given by the function that minimizes Eq. (5) subject to Eq. (6), as in the following expression (Vapnik, 1995; Gunn, 1998; Cimen, 2008):

$$f(x) = \sum_{i=1}^{N} (\alpha_i^* - \alpha_i)\, K(x, x_i) + B \tag{7}$$

where $\alpha_i, \alpha_i^* \geq 0$ are the Lagrange multipliers, $B$ is a bias term, and $K(x, x_i)$ is the kernel function, which is based upon Reproducing Kernel Hilbert Spaces. The data are often assumed to have zero mean (this can be achieved by pre-processing), so the bias term is dropped. The kernel function enables operations to be performed in the input space rather than in the potentially high-dimensional feature space. Hence, an inner product in the feature space has an equivalent kernel in the input space. In general, the kernel functions treated by the SVR are the polynomial, Gaussian Radial Basis, Exponential Radial Basis, Multi-Layer Perceptron, Splines, etc. The Gaussian Radial Basis function (GRBF) is used in this study because it is the most common kernel function (Caputo et al., 2002). It can be written as follows:

$$K(x, x_i) = \exp\!\left(-\lVert x - x_i \rVert^2 / 2\sigma^2\right) \tag{8}$$

where $\sigma$ is the standard deviation (Gaussian noise level).

During learning by SVR, the purpose is to find a nonlinear function given by Eq. (7) that minimizes the regularized risk function (i.e., Eq. (5)). This is achieved by searching for the smallest value of the desired error criterion (for example, RMSE) over various values of the constants $C_c$ and $\varepsilon$ and various kernel functions $K(x, x_i)$ with various values of $\sigma$. This search was carried out with a program written in Fortran 90, which also produces the Lagrange multipliers in Eq. (7).
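Eqs. (7) and (8) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented for the example, the Lagrange multipliers are hypothetical numbers rather than the output of the Fortran 90 program mentioned above, and the bias defaults to zero as described in the text.

```python
import math


def grbf_kernel(x, xi, sigma):
    """Eq. (8): exp(-||x - xi||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_points, alpha_star, alpha, sigma, bias=0.0):
    """Eq. (7): sum_i (alpha_i* - alpha_i) K(x, x_i) + B."""
    return bias + sum(
        (a_s - a) * grbf_kernel(x, xi, sigma)
        for xi, a_s, a in zip(support_points, alpha_star, alpha)
    )


# Hypothetical multipliers; in practice they come from the training program.
support_points = [[0.0], [2.0]]
alpha_star = [1.5, 0.0]
alpha = [0.0, 0.5]
y_hat = svr_predict([0.0], support_points, alpha_star, alpha, sigma=1.0)
print(y_hat)
```

A small $\sigma$ makes each training point influence only its immediate neighbourhood, while a large $\sigma$ makes the kernel nearly constant, which is one reason $\sigma$ has to be tuned together with $C_c$ and $\varepsilon$.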

Fig. 1. The location of the precipitation stations (map of Turkiye showing the Afyon and Izmir stations).

Table 1. The daily statistical parameters of the data sets for the Izmir and Afyon stations.

| Station | Data set | xmean (cm) | Sx (cm) | Csx | xmin (cm) | xmax (cm) | r1 | r2 | r3 |
|---|---|---|---|---|---|---|---|---|---|
| Izmir | Training | 1.81 | 6.95 | 6.46 | 0.00 | 108 | 0.273 | 0.087 | 0.057 |
| Izmir | Test | 1.94 | 7.30 | 5.95 | 0.00 | 79.5 | 0.283 | 0.096 | 0.083 |
| Izmir | Entire | 1.84 | 7.03 | 6.32 | 0.00 | 108 | 0.275 | 0.088 | 0.062 |
| Afyon | Training | 1.16 | 3.47 | 4.86 | 0.00 | 37.2 | 0.174 | 0.076 | 0.015 |
| Afyon | Test | 1.23 | 3.62 | 4.41 | 0.00 | 38.0 | 0.146 | 0.065 | 0.021 |
| Afyon | Entire | 1.17 | 3.49 | 4.76 | 0.00 | 38.0 | 0.168 | 0.060 | 0.016 |


3. Discrete wavelet transform

The theory of wavelet analysis was developed based on Fourier analysis. In Fourier analysis, a signal is broken up into smooth sinusoids of unlimited duration. In wavelet analysis, a signal is similarly broken up into wavelets, which are waveforms of effectively limited duration and zero mean. Wavelet analysis is a windowing technique with variable-sized regions. It gives a time-scale view of a signal and provides a method of expressing natural phenomena by utilizing their very rudimentary multi-fractal basis. The lower scales refer to the compressed wavelet and are able to follow the high-frequency component or the rapidly changing details of the signal. The higher scales represent the stretched version of a wavelet, and the corresponding coefficients indicate the slowly changing features of a low-frequency component (Kucuk and Agiralioglu, 2006).

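The wavelet function $\psi(t)$, called the mother wavelet, can be defined by $\int_{-\infty}^{+\infty} \psi(t)\, dt = 0$, and $\psi_{a,b}(t)$ can be obtained through compressing and expanding $\psi(t)$:

$$\psi_{a,b}(t) = |a|^{-1/2}\, \psi\!\left(\frac{t-b}{a}\right), \quad b \in R,\ a \in R,\ a \neq 0 \tag{9}$$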

where $\psi_{a,b}(t)$ is the successive (i.e., continuous) wavelet, $a$ is the scale or frequency factor, $b$ is a time factor, and $R$ is the domain of real numbers.

If $\psi_{a,b}(t)$ satisfies Eq. (9), then for a time series $f(t) \in L^2(R)$, i.e., a finite energy signal, the successive wavelet transform of $f(t)$ is defined as

$$W_\psi f(a,b) = |a|^{-1/2} \int_R f(t)\, \bar{\psi}\!\left(\frac{t-b}{a}\right) dt \tag{10}$$

where $\bar{\psi}(t)$ is the complex conjugate function of $\psi(t)$. It can be seen from Eq. (10) that the wavelet transform is the decomposition of $f(t)$ at different resolution levels (scales). In other words, the essence of the wavelet transform is to filter $f(t)$ with different filters.

In real applications, the successive wavelet is often discretized. Let $a = a_0^j$, $b = k b_0 a_0^j$, $a_0 > 1$, $b_0 \in R$, where $k$ and $j$ are integers. The discrete wavelet transform of $f(t)$ can then be written as

$$W_\psi f(j,k) = a_0^{-j/2} \int_R f(t)\, \bar{\psi}\!\left(a_0^{-j} t - k b_0\right) dt \tag{11}$$

The most common (and simplest) choice for the parameters $a_0$ and $b_0$ is 2 and 1 time steps, respectively. This power-of-two logarithmic scaling of the time and scale is known as the dyadic grid arrangement and is the simplest and most efficient case for practical purposes (Mallat, 1989). Eq. (11) becomes the binary wavelet transform when $a_0 = 2$, $b_0 = 1$:

$$W_\psi f(j,k) = 2^{-j/2} \int_R f(t)\, \bar{\psi}\!\left(2^{-j} t - k\right) dt \tag{12}$$

$W_\psi f(a,b)$ or $W_\psi f(j,k)$ reflects the characteristics of the original time series in the frequency domain ($a$ or $j$) and the time domain ($b$ or $k$) at the same time. When $a$ or $j$ is small, the frequency resolution of the wavelet transform is low and the time-domain resolution is high. When $a$ or $j$ is large, the frequency resolution is high and the time-domain resolution is low (Wang and Ding, 2003).

For a discrete time series $f(t)$, which is observed at discrete times $t$ (here integer time steps are used), the DWT can be defined as (Mallat, 1989)

$$W_\psi f(j,k) = 2^{-j/2} \sum_{t=0}^{N-1} f(t)\, \bar{\psi}\!\left(2^{-j} t - k\right) \tag{13}$$

where $W_\psi f(j,k)$ is the wavelet coefficient for the discrete wavelet of scale $a = 2^j$ and location $b = 2^j k$.
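Eq. (13) can be evaluated directly for a short series. The sketch below is a hedged illustration: it assumes the Haar function as the mother wavelet (the mother wavelet actually used is not stated in this section), and the function names and the sample series are hypothetical.

```python
def haar(u):
    """Haar mother wavelet (an assumed choice; zero mean, support [0, 1))."""
    if 0.0 <= u < 0.5:
        return 1.0
    if 0.5 <= u < 1.0:
        return -1.0
    return 0.0


def dwt_coefficient(f, j, k, psi=haar):
    """Eq. (13): 2^(-j/2) * sum_t f(t) * psi(2^(-j) t - k)."""
    scale = 2.0 ** (-j)
    return 2.0 ** (-j / 2.0) * sum(
        value * psi(scale * t - k) for t, value in enumerate(f)
    )


series = [4.0, 2.0, 5.0, 5.0, 0.0, 0.0, 3.0, 1.0]   # hypothetical daily values
level1 = [dwt_coefficient(series, 1, k) for k in range(4)]  # scale a = 2
level2 = [dwt_coefficient(series, 2, k) for k in range(2)]  # scale a = 4
print(level1, level2)
```

With this wavelet, each level-1 coefficient compares two neighbouring days and each level-2 coefficient compares two neighbouring 2-day blocks, which matches the dyadic scales $a = 2^j$ of Eq. (13).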

The DWT operates with two sets of functions viewed as high-pass and low-pass filters. The original time series is passed through the high-pass and low-pass filters and separated at different scales (Kisi, 2009). The time series is decomposed into one component comprising its trend (the approximation) and one comprising the high frequencies and the fast events (the detail). In the present study, the detail coefficients and the approximation sub-time series are obtained using Eq. (13).
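The split into detail and approximation sub-time series can be sketched as follows. This is a minimal illustration that assumes the Haar wavelet, for which the approximation at level j reduces to the mean over dyadic blocks of 2^j samples; it is not the program used in the study, and all names and values are hypothetical.

```python
def block_means(x, width):
    """Replace each consecutive block of `width` values by the block mean."""
    out = []
    for start in range(0, len(x), width):
        block = x[start:start + width]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def haar_subseries(x, levels):
    """Return ([D1, ..., D_levels], approximation), all of the length of x.

    With the Haar wavelet, the level-j approximation is the mean over dyadic
    blocks of 2**j samples, and D_j is the difference between the
    approximations at levels j-1 and j.
    """
    if len(x) % (2 ** levels) != 0:
        raise ValueError("length must be a multiple of 2**levels")
    details, previous = [], list(x)
    for j in range(1, levels + 1):
        current = block_means(x, 2 ** j)
        details.append([p - c for p, c in zip(previous, current)])
        previous = current
    return details, previous


series = [4.0, 2.0, 5.0, 5.0, 0.0, 0.0, 3.0, 1.0]   # hypothetical daily values
details, approx = haar_subseries(series, levels=2)
rebuilt = [sum(parts) for parts in zip(approx, *details)]
print(details, approx, rebuilt)
```

Because the details are successive differences of approximations, adding all details to the final approximation recovers the original series exactly, so any subset of components can be summed to form a smoothed input series.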

4. Data and statistical analysis

In this study, daily precipitation data from two stations located in the Aegean Region, in the west of Turkiye, are used. The locations of the stations are shown in Fig. 1. The observed data records were obtained from the DMI (Turkish State Meteorological Service). The record is 15 years (5479 days) long, covering the period from January 1987 to December 2001 for both stations.

In the applications, the first 12 years of precipitation data (4383 days, 80% of the whole data set) are used for training and the remaining 3 years (1096 days, 20% of the whole data set) are used for testing. The daily statistics of the data sets are presented in Table 1 for the Izmir and Afyon stations. In this table, xmean, Sx, Csx, xmin, xmax, r1, r2 and r3 denote the overall mean, standard deviation, skewness, minimum, maximum, and lag-1, lag-2 and lag-3 autocorrelation coefficients, respectively. The highest maximum daily precipitation value was observed at the Izmir Station (xmax = 1080 mm). The observed daily precipitations show quite high positive skewness values (Csx = 6.32 and 4.76 for the entire Izmir and Afyon data sets, respectively). It can be seen from the skewness coefficients in the fifth column of Table 1 that the precipitation data show a scattered distribution. The data of the Izmir Station are more scattered than those of the Afyon Station. The autocorrelations are quite low, showing low persistence (e.g., r1 = 0.275, r2 = 0.088, r3 = 0.062 for the entire Izmir data set).
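The statistics listed in Table 1 can be computed along the following lines. The sketch assumes population (divide-by-N) estimators for the variance, skewness and autocorrelation; the exact estimators used for Table 1 are not stated, and the sample values are hypothetical rather than station data.

```python
import math


def summary_statistics(x, max_lag=3):
    """Mean, standard deviation, skewness, extremes and lag-k autocorrelations."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev) / n            # population variance
    std = math.sqrt(var)
    skew = (sum(d ** 3 for d in dev) / n) / std ** 3
    # Lag-k autocorrelation: lagged covariance divided by the variance.
    acf = [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / (n * var)
        for k in range(1, max_lag + 1)
    ]
    return {"mean": mean, "std": std, "skew": skew,
            "min": min(x), "max": max(x), "acf": acf}


# Mostly dry days with one large event: strongly right-skewed, like rainfall.
stats = summary_statistics([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.0])
print(stats)
```

A series dominated by zeros with a few large events gives a large positive skewness and autocorrelations near zero, which is the pattern reported in Table 1.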

The root mean square error (RMSE), mean absolute error (MAE), Nash–Sutcliffe coefficient (NS) (Nash and Sutcliffe, 1970) and correlation coefficient (R) statistics are used to evaluate the accuracies of the WSVR and SVR models. R shows the degree to which two variables are linearly related. RMSE and MAE provide different types of information about the predictive capabilities of the model. The RMSE measures the goodness of fit related to high precipitation values, whereas the MAE gives a more balanced perspective of the goodness of fit at moderate precipitations (Karunanithi et al., 1994). The NS is a coefficient of efficiency and gives a relative, dimensionless assessment of the model performance (Nash and Sutcliffe, 1970). The RMSE, MAE and NS are defined as

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(Y_{i,observed} - Y_{i,estimate}\right)^2} \tag{14}$$

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| Y_{i,observed} - Y_{i,estimate} \right| \tag{15}$$

$$NS = 1 - \frac{\sum_{i=1}^{N} \left(Y_{i,observed} - Y_{i,estimate}\right)^2}{\sum_{i=1}^{N} \left(Y_{i,observed} - \bar{Y}_{observed}\right)^2} \tag{16}$$

in which $N$ is the number of data points, $Y_i$ is the daily precipitation and the bar denotes the mean.
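Eqs. (14)–(16), together with the correlation coefficient R, translate directly into code. The following is a small sketch with hypothetical values (not station data); the function names are chosen for this example, and R is computed as the Pearson correlation, which is assumed to be the definition intended here.

```python
import math


def rmse(obs, est):
    """Eq. (14)."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


def mae(obs, est):
    """Eq. (15)."""
    return sum(abs(o - e) for o, e in zip(obs, est)) / len(obs)


def nash_sutcliffe(obs, est):
    """Eq. (16): 1 minus the error sum of squares over the observed variability."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - e) ** 2 for o, e in zip(obs, est))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def correlation(obs, est):
    """Pearson correlation coefficient R (assumed definition)."""
    mo, me = sum(obs) / len(obs), sum(est) / len(est)
    cov = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    se = math.sqrt(sum((e - me) ** 2 for e in est))
    return cov / (so * se)


observed = [0.0, 2.0, 4.0, 6.0]    # hypothetical values, not station data
estimated = [1.0, 2.0, 3.0, 6.0]
print(rmse(observed, estimated), mae(observed, estimated),
      nash_sutcliffe(observed, estimated), correlation(observed, estimated))
```

Because the RMSE squares the errors before averaging, a single badly missed peak raises it far more than it raises the MAE, which is why the two are reported together.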

Fig. 2. The WSVR model structure. Flowchart steps: input time series (e.g., preceding daily precipitations); decompose input time series using DWT; add the effective Ds components for input of SVR; SVR model; output (e.g., current precipitation value).

Fig. 3. Decomposed wavelet sub-time series components (Ds) of precipitation data of Izmir Station. Panels: precipitation (mm), D1, D2, D3, D4 and approximation, each plotted against day.

5. Application and results

The wavelet support vector regression (WSVR) models are obtained by combining two methods, DWT and SVR. The WSVR model is an SVR model that uses sub-time series components obtained by applying the DWT to the original data. The WSVR model structure developed in the present study is shown in Fig. 2. For the WSVR model inputs, the original time series are decomposed into a certain number of sub-time series components (Ds). Each component plays a different role in the original time series, and the behavior of each sub-time series is distinct (Wang and Ding, 2003). In the WSVR model, the inputs are the Ds of the preceding daily precipitation and the output is the original precipitation of the current day.

The previous daily precipitation time series are decomposed into various Ds at different resolution levels by using the DWT to estimate the current precipitation value. Four resolution levels (2–4–8–16) are employed in this study. In general, there are log(n) resolution levels, where n is the length of the time series (Wang and Ding, 2003). In this study, 4383 daily data are used to obtain the SVR models for both stations, which gives approximately four resolution levels (log10(4383) ≈ 3.6). The original precipitation input time series of the Izmir Station and its Ds, that is, the time series of the 2-day mode (D1), 4-day mode (D2), 8-day mode (D3), 16-day mode (D4) and the approximate mode, are shown in Fig. 3. The correlation coefficients between each D sub-time series and the original daily precipitation time series are given in Table 2 for the Izmir and Afyon stations. In this table, $D_{t-i}$ (i = 1, 2, 3, 4) and $P_t$ denote the D sub-time series at time t−i and the measured precipitation at time t, respectively. These correlation values provide information for determining the wavelet components that are effective on precipitation. In order to select the dominant D components, the total absolute correlations are evaluated; they are given in the last column of Table 2. For both stations, the D1 component has the lowest total correlation, and D2 has the highest total correlation among the detail components. According to the total correlations, the D2, D3 and D4 components seem to be more effective than D1. The new series obtained by adding the effective Ds and the approximation component is then used as input to the SVR model. This summed series is labeled DW.
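The component selection described above can be sketched as follows: compute the correlation between each lagged component and the precipitation, sum the absolute values over the lags, and add the retained components to the approximation. The function names and sample values are assumptions made for illustration; the selection threshold itself is a judgment made from Table 2 and is not encoded here.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)


def total_abs_lag_correlation(component, precipitation, max_lag=4):
    """Sum over i = 1..max_lag of |corr(D_{t-i}, P_t)|."""
    total = 0.0
    for lag in range(1, max_lag + 1):
        total += abs(pearson(component[:-lag], precipitation[lag:]))
    return total


def summed_series(effective_components, approximation):
    """DW: element-wise sum of the effective Ds and the approximation."""
    return [sum(values) for values in zip(approximation, *effective_components)]


# Hypothetical example in which P_t is exactly twice the component at t-1.
component = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
precipitation = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
total = total_abs_lag_correlation(component, precipitation, max_lag=1)
dw = summed_series([[1.0, -1.0]], [3.0, 3.0])
print(total, dw)
```

Summing absolute values keeps strongly negative lagged correlations, such as those of D2 at lags 2 and 3 in Table 2, from cancelling the positive ones.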

For the Izmir Station, four input combinations based on preceding daily precipitations are evaluated to estimate the current precipitation value. The input combinations evaluated in the study are: (i) $P_{t-1}$; (ii) $P_{t-1}$ and $P_{t-2}$; (iii) $P_{t-1}$, $P_{t-2}$ and $P_{t-3}$; (iv) $P_{t-1}$, $P_{t-2}$, $P_{t-3}$ and $P_{t-4}$. In all cases, the output is the precipitation $P_t$ for the current day. The RMSE, MAE, NS and R statistics of the SVR models in the training and test periods are given in Table 3. The $C_c$, $\varepsilon$ and $\sigma$ parameters of the optimum SVR and WSVR models are provided in Table 5.
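The input combinations (i)–(iv) amount to pairing each day with its one to four preceding values, and the chronological training/test division keeps the earlier samples for training. The sketch below illustrates this under stated assumptions: the function names are hypothetical, and how the first days without a full history were handled in the study is not stated.

```python
def lagged_samples(series, n_lags):
    """Pairs ([P_{t-1}, ..., P_{t-n_lags}], P_t) for every t with full history."""
    samples = []
    for t in range(n_lags, len(series)):
        inputs = [series[t - i] for i in range(1, n_lags + 1)]
        samples.append((inputs, series[t]))
    return samples


def chronological_split(samples, n_train):
    """First n_train samples for training, the rest for testing (no shuffling)."""
    return samples[:n_train], samples[n_train:]


series = [0.0, 3.0, 1.0, 0.0, 7.0, 2.0]        # hypothetical daily values
samples = lagged_samples(series, n_lags=3)     # input combination (iii)
train, test = chronological_split(samples, n_train=2)
print(train, test)
```

The same helper would apply to the WSVR inputs by passing the summed DW series instead of the raw precipitation for the input values while keeping the measured precipitation as the target.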

Table 3 indicates that the SVR model whose inputs are the precipitations of the four previous days (input combination (iv)) has the highest R among the SVR models in the training period, with RMSE, MAE and NS values practically equal to those of combination (i). In the test period, this model has the lowest RMSE but is slightly worse than combination (iii) from the MAE and R viewpoints. The training and test performance statistics of the WSVR models are given in Table 4. In this table, $DW_{t-1}$ indicates the summed series obtained by adding the effective Ds and the approximation component at time t−1. Table 4 shows that the WSVR model with three inputs has better accuracy than the others in the training and test periods. It can be seen from Tables 3 and 4 that the WSVR models perform much better than the single SVR models with respect to the various performance criteria. For the Izmir Station, the relative RMSE, MAE, NS and R differences between the WSVR (input combination iii) and SVR (input combination iv) models are 35%, 31%, 1502% and 183% in the test period, respectively. The WSVR and SVR forecasts and residuals in the test period are shown in Fig. 4. It can clearly be seen that the WSVR approximates the precipitation data better than the SVR. The scatterplots of each model are illustrated in Fig. 5. It is obvious from the fit line equations and R values that the WSVR model performs much better than the single SVR model. The WSVR predicts the maximum peak as 879 mm instead of the measured 795 mm, an overestimation of 10.6%, while the SVR gives 363 mm, an underestimation of 54.3%. For the second highest peak (780 mm), the WSVR prediction is 280 mm, an underestimation of 64.6%, while the SVR yields 0 mm. The WSVR model thus seems to perform better than the SVR in forecasting peak precipitation.

Table 2. The correlation coefficients between each sub-time series and the original precipitation data.

| Station | Discrete wavelet component | $D_{t-1}/P_t$ | $D_{t-2}/P_t$ | $D_{t-3}/P_t$ | $D_{t-4}/P_t$ | Total (absolute) |
|---|---|---|---|---|---|---|
| Izmir | D1 | −0.348 | −0.069 | 0.132 | 0.016 | 0.565 |
| Izmir | D2 | 0.216 | −0.274 | −0.344 | −0.043 | 0.877 |
| Izmir | D3 | 0.347 | 0.150 | −0.072 | −0.246 | 0.815 |
| Izmir | D4 | 0.262 | 0.217 | 0.159 | 0.095 | 0.733 |
| Izmir | Approximate | 0.377 | 0.372 | 0.363 | 0.352 | 1.405 |
| Afyon | D1 | −0.382 | −0.015 | 0.095 | 0.018 | 0.510 |
| Afyon | D2 | 0.189 | −0.276 | −0.318 | −0.034 | 0.817 |
| Afyon | D3 | 0.331 | 0.130 | −0.082 | −0.241 | 0.784 |
| Afyon | D4 | 0.276 | 0.235 | 0.172 | 0.091 | 0.774 |
| Afyon | Approximate | 0.318 | 0.314 | 0.307 | 0.293 | 1.232 |

Table 3. The RMSE, MAE, NS and R statistics of SVR applications for Izmir Station.

| Model inputs | Training RMSE (mm) | Training MAE (mm) | Training NS | Training R | Test RMSE (mm) | Test MAE (mm) | Test NS | Test R |
|---|---|---|---|---|---|---|---|---|
| (i) $P_{t-1}$ | 63.8 | 15.0 | 0.162 | 0.441 | 74.2 | 20.2 | 0.033 | 0.212 |
| (ii) $P_{t-1}$ and $P_{t-2}$ | 65.4 | 19.6 | 0.120 | 0.375 | 74.1 | 22.9 | 0.034 | 0.253 |
| (iii) $P_{t-1}$, $P_{t-2}$ and $P_{t-3}$ | 69.4 | 20.9 | 0.009 | 0.186 | 72.1 | 18.9 | 0.024 | 0.288 |
| (iv) $P_{t-1}$, $P_{t-2}$, $P_{t-3}$ and $P_{t-4}$ | 63.9 | 15.0 | 0.161 | 0.448 | 71.6 | 19.6 | 0.037 | 0.276 |

Table 4. The RMSE, MAE, NS and R statistics of WSVR applications for Izmir Station.

| Model inputs | Training RMSE (mm) | Training MAE (mm) | Training NS | Training R | Test RMSE (mm) | Test MAE (mm) | Test NS | Test R |
|---|---|---|---|---|---|---|---|---|
| (i) $DW_{t-1}$ | 59.8 | 16.4 | 0.265 | 0.556 | 64.8 | 18.8 | 0.211 | 0.486 |
| (ii) $DW_{t-1}$ and $DW_{t-2}$ | 48.5 | 14.6 | 0.516 | 0.754 | 51.8 | 14.9 | 0.496 | 0.745 |
| (iii) $DW_{t-1}$, $DW_{t-2}$ and $DW_{t-3}$ | 43.3 | 12.9 | 0.615 | 0.794 | 46.5 | 13.6 | 0.593 | 0.782 |
| (iv) $DW_{t-1}$, $DW_{t-2}$, $DW_{t-3}$ and $DW_{t-4}$ | 43.3 | 15.1 | 0.614 | 0.784 | 47.4 | 16.2 | 0.577 | 0.762 |

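Table 5. The optimum parameters of the SVR and WSVR models.

| Input | Parameter | Izmir SVR | Izmir WSVR | Afyon SVR | Afyon WSVR |
|---|---|---|---|---|---|
| $P_{t-1}$ | Cc | 10 | 100 | 100 | 10 |
| | Epsilon | 0 | 0.0001 | 1.00E-06 | 0.1 |
| | Sigma | 1000 | 0.9 | 1000 | 5 |
| $P_{t-1}$ and $P_{t-2}$ | Cc | 10,000 | 10 | 1000 | 100 |
| | Epsilon | 0.000001 | 0.01 | 0.5 | 0.1 |
| | Sigma | 0.7 | 0.01 | 0.28 | 0.09 |
| $P_{t-1}$, $P_{t-2}$ and $P_{t-3}$ | Cc | 1000 | 1000 | 100 | 100 |
| | Epsilon | 0.5 | 0.5 | 0.0001 | 0.1 |
| | Sigma | 0.285 | 0.285 | 2 | 0.2 |
| $P_{t-1}$, $P_{t-2}$, $P_{t-3}$ and $P_{t-4}$ | Cc | 10 | 10,000 | 1000 | 100 |
| | Epsilon | 0.001 | 0.1 | 0.5 | 0.1 |
| | Sigma | 0.8 | 0.001 | 0.28 | 0.08 |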

Fig. 4. Daily precipitation forecasts of SVR and WSVR models in test period—Izmir Station.

Fig. 5. The scatterplots of the SVR and WSVR models in test period—Izmir Station.
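The relative differences quoted for the Izmir Station can be recomputed from the test-period statistics of SVR combination (iv) in Table 3 and WSVR combination (iii) in Table 4. The sketch assumes that the relative difference is the absolute change expressed as a percentage of the SVR value; with that assumption, the results agree with the reported 35%, 31%, 1502% and 183% to within the rounding of the tabulated statistics.

```python
def relative_difference(wsvr_value, svr_value):
    """Absolute change from the SVR to the WSVR statistic, in percent of SVR."""
    return 100.0 * abs(wsvr_value - svr_value) / abs(svr_value)


# Test-period statistics for Izmir: SVR (iv) from Table 3, WSVR (iii) from Table 4.
svr = {"RMSE": 71.6, "MAE": 19.6, "NS": 0.037, "R": 0.276}
wsvr = {"RMSE": 46.5, "MAE": 13.6, "NS": 0.593, "R": 0.782}
differences = {name: relative_difference(wsvr[name], svr[name]) for name in svr}
print(differences)
```

The very large percentages for NS and R arise mainly because the SVR values in the denominator are close to zero.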


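Table 6. The RMSE, MAE, NS and R statistics of SVR applications for Afyon Station.

| Model inputs | Training RMSE (mm) | Training MAE (mm) | Training NS | Training R | Test RMSE (mm) | Test MAE (mm) | Test NS | Test R |
|---|---|---|---|---|---|---|---|---|
| (i) $P_{t-1}$ | 34.3 | 12.2 | 0.018 | 0.262 | 37.3 | 14.0 | 0.075 | 0.035 |
| (ii) $P_{t-1}$ and $P_{t-2}$ | 34.7 | 11.7 | 0.002 | 0.251 | 37.3 | 13.2 | 0.074 | 0.076 |
| (iii) $P_{t-1}$, $P_{t-2}$ and $P_{t-3}$ | 32.9 | 14.0 | 0.100 | 0.348 | 37.5 | 14.5 | 0.086 | 0.118 |
| (iv) $P_{t-1}$, $P_{t-2}$, $P_{t-3}$ and $P_{t-4}$ | 32.9 | 11.2 | 0.096 | 0.374 | 38.7 | 14.2 | 0.154 | 0.103 |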

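Table 7. The RMSE, MAE, NS and R statistics of WSVR applications for Afyon Station.

| Model inputs | Training RMSE (mm) | Training MAE (mm) | Training NS | Training R | Test RMSE (mm) | Test MAE (mm) | Test NS | Test R |
|---|---|---|---|---|---|---|---|---|
| (i) $DW_{t-1}$ | 31.6 | 10.8 | 0.170 | 0.504 | 35.2 | 12.5 | 0.041 | 0.300 |
| (ii) $DW_{t-1}$ and $DW_{t-2}$ | 23.1 | 8.3 | 0.555 | 0.761 | 23.5 | 8.7 | 0.578 | 0.771 |
| (iii) $DW_{t-1}$, $DW_{t-2}$ and $DW_{t-3}$ | 20.6 | 7.1 | 0.645 | 0.812 | 22.2 | 8.0 | 0.626 | 0.796 |
| (iv) $DW_{t-1}$, $DW_{t-2}$, $DW_{t-3}$ and $DW_{t-4}$ | 20.0 | 7.3 | 0.666 | 0.826 | 21.4 | 9.0 | 0.647 | 0.815 |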


For the Afyon Station, four input combinations based on preceding daily precipitations are also tried to estimate the current precipitation value. The performance statistics of the SVR models are given in Table 6, which indicates that the SVR model whose inputs are the precipitations of the four previous days (input combination iv) has the smallest RMSE (32.9 mm) and MAE (11.2 mm) and the highest R (0.374) in the training period. The RMSE, MAE, NS and R statistics of the WSVR models are shown in Table 7. From this table, the WSVR model has the best accuracy in terms of RMSE, NS and R for input combination iv. Compared with the SVR models given in Table 6, the WSVR models provide much better performance in daily precipitation forecasting. The relative RMSE, MAE, NS and R differences between the WSVR (input combination iv) and SVR (input combination iv) models are 45%, 37%, 320% and 691% in the test period, respectively. Fig. 6 illustrates the precipitation forecasts of the WSVR and SVR models and the residuals in the test period. It can be seen from the plots and residuals that the WSVR model performs much better than the single SVR model. The scatterplots of each model are shown in Fig. 7. From this figure, the WSVR model also seems to have a much better forecast accuracy than the SVR model. The WSVR predicts the maximum peak as 344 mm instead of the measured 380 mm, an underestimation of 9%, while the SVR gives 335 mm, an underestimation of 12%. For the second maximum peak (272 mm), the SVR prediction is 16 mm, an underestimation of 94%, while the WSVR yields 334 mm, an overestimation of 23%. Here also the WSVR model has the better accuracy.
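The peak errors quoted for the Afyon Station follow from a simple signed percentage error. The sketch below uses the peak values given in the text; the function name and the sign convention (positive for overestimation) are choices made for this example.

```python
def peak_error(predicted, measured):
    """Signed error in percent of the measured peak (positive = overestimate)."""
    return 100.0 * (predicted - measured) / measured


# Afyon test-period peaks quoted above (mm).
first_peak = {"WSVR": peak_error(344, 380), "SVR": peak_error(335, 380)}
second_peak = {"WSVR": peak_error(334, 272), "SVR": peak_error(16, 272)}
print(first_peak, second_peak)
```

Keeping the sign makes it easy to see whether a model misses a peak from below or from above; the magnitudes here are roughly 9% and 12% for the first peak and 23% and 94% for the second.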

For the purpose of comparison, daily precipitation estimation has also been carried out with feed-forward artificial neural networks (ANN). The Levenberg–Marquardt algorithm is used for adjusting the weights of the ANN model. The sigmoid and linear activation functions are used for the hidden and output node(s), respectively. The number of hidden layer nodes of each ANN model was determined after trying various network structures. The optimal ANN structures for each combination are given in Table 8. In this table, ANN(1,9,1) indicates an ANN model comprising 1 input, 9 hidden and 1 output nodes. The training of the ANN networks was stopped after 100 epochs, since the variation of the error was too small after this epoch. The error graph for an ANN model during training is shown in Fig. 8. Various ANN structures were tried by increasing the number of hidden neurons from 1 to 10. Hecht-Nielsen (1987) suggested an upper limit of (2n + 1) hidden layer neurons, where n is the number of input neurons. The variation of MSE versus the number of hidden nodes for the ANN(1,9,1) model in the test period is shown in Fig. 8. It is clear from the figure that the model with 9 hidden neurons has the lowest MSE. The comparison of Tables 3, 6 and 8 indicates that there is only a slight difference between the ANN and SVR models. It is clear from Table 8 that the ANN method also gives poor precipitation forecasts, as found for the SVR method.
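The network structure described above (sigmoid hidden nodes, one linear output node) and the quoted upper limit on hidden neurons can be sketched as follows. The weights are hypothetical placeholders, since trained values are not given here, and the sketch covers only the forward pass, not Levenberg–Marquardt training.

```python
import math


def max_hidden_nodes(n_inputs):
    """Upper limit 2n + 1 quoted above from Hecht-Nielsen (1987)."""
    return 2 * n_inputs + 1


def ann_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer with sigmoid units followed by a single linear output."""
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        z = sum(w * x for w, x in zip(weights, inputs)) + bias
        hidden.append(1.0 / (1.0 + math.exp(-z)))
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical weights for a tiny ANN(1,2,1); trained values are not given here.
y = ann_forward([0.0], [[1.0], [-1.0]], [0.0, 0.0], [2.0, 4.0], 0.5)
print(y, max_hidden_nodes(4))
```

The linear output node lets the network produce values outside the (0, 1) range of the sigmoid, which is needed for an unscaled precipitation target.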

Overall, the wavelet and support vector regression conjunction model, which combines two methods, DWT and SVR, seems to be more adequate than the single SVR model for forecasting daily precipitation. The original signal (the daily precipitation time series in the present study) is represented at different resolution intervals by the DWT. In other words, the complex hydrological time series is decomposed into several simple time series using the DWT. Thus, some features of the sub-series, such as daily, monthly and annual periods, can be seen more clearly than in the original signal. In setting up a WSVR model, the SVR model is constructed with appropriate sub-series belonging to different scales. The forecasts are more accurate than those obtained directly from the original signal because the features (such as periodicity) of the sub-series are obvious (Ning and Yunping, 1998). This is why the WSVR model performs better than the SVR model.

6. Conclusions

The accuracy of WSVR model has been investigated for fore-casting daily precipitations in the present study. The WSVRmodels were developed by combining two methods, discretewavelet transform and support vector regression. The WSVRmodels were tested applying to different input combinations ofdaily precipitation data of Izmir and Afyon Station located inEagean Region of Turkiye. The test results were compared withthe single support vector regression and artificial neural networkmodels. The comparison results indicated that the WSVRperformed better than the SVR and ANN models in forecastingdaily precipitations. For the Izmir and Afyon stations, the WSVRconjunction model increased the prediction correlation coefficientand Nash–Sutcliffe coefficient with respect to the single SVRmodel by 183–691% and 1502–320% and reduced the root mean

Fig. 6. Daily precipitation forecasts of SVR and WSVR models in test period—Afyon Station.

Fig. 7. The scatterplots of the SVR and WSVR models in test period—Afyon Station.


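Table 8. The RMSE, MAE, MAPE and R statistics of ANN applications.

| Model inputs | Model | Training RMSE | Training MAE | Training MAPE | Training R | Test RMSE | Test MAE | Test MAPE | Test R |
|---|---|---|---|---|---|---|---|---|---|
| Izmir station | | | | | | | | | |
| (i) Pt−1 | (1,9,1) | 65.6 | 26.5 | 0.110 | 0.332 | 68.8 | 27.3 | 0.110 | 0.333 |
| (ii) Pt−1 and Pt−2 | (2,4,1) | 65.7 | 26.5 | 0.108 | 0.329 | 68.4 | 27.3 | 0.120 | 0.348 |
| (iii) Pt−1, Pt−2 and Pt−3 | (3,1,1) | 66.6 | 27.4 | 0.082 | 0.286 | 68.8 | 27.8 | 0.110 | 0.336 |
| (iv) Pt−1, Pt−2, Pt−3 and Pt−4 | (4,1,1) | 66.6 | 27.3 | 0.083 | 0.288 | 68.6 | 27.7 | 0.115 | 0.345 |
| Afyon station | | | | | | | | | |
| (i) Pt−1 | (1,5,1) | 33.6 | 16.6 | 0.061 | 0.247 | 35.3 | 17.6 | 0.050 | 0.226 |
| (ii) Pt−1 and Pt−2 | (2,5,1) | 33.4 | 16.5 | 0.069 | 0.263 | 35.2 | 17.4 | 0.056 | 0.239 |
| (iii) Pt−1, Pt−2 and Pt−3 | (3,4,1) | 33.4 | 16.5 | 0.068 | 0.261 | 35.2 | 17.4 | 0.054 | 0.235 |
| (iv) Pt−1, Pt−2, Pt−3 and Pt−4 | (4,2,1) | 33.5 | 16.5 | 0.067 | 0.259 | 35.1 | 17.4 | 0.058 | 0.243 |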

Fig. 8. The variation of MSE vs. iteration and hidden node numbers for the ANN(1,9,1) model, Izmir Station.


square errors and mean absolute errors by 35–45% and 31–37%, respectively.
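For readers who wish to reproduce comparisons of this kind, the following sketch computes the four statistics named in the conclusions, together with a relative percentage change between two models. It uses the standard textbook definitions, which are assumed (not confirmed here) to match those used in the study; the function names and the sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def correlation(obs, pred):
    """Pearson correlation coefficient R between observed and predicted values."""
    mo = sum(obs) / len(obs)
    mp = sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / (sum of squares about the observed mean)."""
    mo = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mo) ** 2 for o in obs)
    return 1.0 - sse / sst


def percent_change(old, new):
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / abs(old)


# Hypothetical observed and forecast values, for illustration only.
observed = [0.0, 2.0, 4.0, 6.0]
forecast = [1.0, 2.0, 3.0, 6.0]

stats = {
    "RMSE": rmse(observed, forecast),
    "MAE": mae(observed, forecast),
    "R": correlation(observed, forecast),
    "NSE": nash_sutcliffe(observed, forecast),
}
```

With these definitions, a statistic that rises from 0.25 to 0.75 corresponds to `percent_change(0.25, 0.75) == 200.0`, which is the sense in which percentage increases above 100% arise when the baseline value is small.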

References

Bustamante, J., Gomes, J., Chou, S., Rozante, J., 1999. Evaluation of April 1999 Rainfall Forecasts over South America Using the ETA Model. INPE, Cachoeria Paulista, Sp, Brasil.

Caputo, B., Sim, K., Furesjo, F., Smola, A., 2002. Appearance-based object recognition using SVMs: which kernel should I use? In: Proceedings of NIPS Workshop on Statistical Methods for Computational Experiments in Visual Processing and Computer Vision, Whistler.

Cimen, M., 2008. Estimation of daily suspended sediments using support vector machines. Hydrol. Sci. J. 53 (3), 656–666.

Chou, C.M., Wang, R.Y., 2002. On-line estimation of unit hydrographs using the wavelet-based LMS algorithm. Hydrol. Sci. J. 47 (5), 721–738.

Coulibaly, P., Burn, H.D., 2004. Wavelet analysis of variability in annual Canadian streamflows. Water Res. Res. 40, W03105.

Dastorani, M.T., Afkhami, H., Sharifidarani, H., Dastorani, M., 2010. Application of ANN and ANFIS models on dryland precipitation prediction (Case Study: Yazd in Central Iran). J. Appl. Sci. 10, 2387–2394.

El-Shafie, A., Jaafer, O., Seyed, A., 2011. Adaptive neuro-fuzzy inference system based model for rainfall forecasting in Klang River, Malaysia. Int. J. Phys. Sci. 6 (12), 2875–2888.

Freiwan, M., Cigizoglu, H.K., 2005. Prediction of total monthly rainfall in Jordan using feed forward backpropagation method. Fresenius Environ. Bull. 14 (2), 142–151.

French, M.N., Krajewski, W.F., Cuykendal, R.R., 1992. Rainfall forecasting in space and time using a neural network. J. Hydrol. 137, 1–37.

Georgakakos, K.P., Bras, L.R., 1984. A hydrologically useful station precipitation model. Part I and II: formulation and application. Water Res. Res. 20 (11), 1585–1596, 1597–1610.

Gunn, S.R., 1998. Support Vector Machines for Classification and Regression. Technical Report, University of Southampton, England.

Hecht-Nielsen, R., 1987. Neurocomputing: picking the human brain. IEEE Spectrum 25 (3), 36–41.

Hong, W.C., 2008. Rainfall forecasting by technological machine learning models. Appl. Math. Comput. 200 (1), 41–57.

Hung, N.Q., Babel, M.S., Weesakul, S., Tripathi, N.K., 2009. An artificial neural network model for rainfall forecasting in Bangkok, Thailand. Hydrol. Earth Syst. Sci. 13, 1413–1425.

Ingsrisawang, L., Ingsriswang, S., Somchit, S., Aungsuratana, P., Khantiyanan, W., 2008. Machine learning techniques for short-term rain forecasting system in the northeastern part of Thailand. World Acad. Sci. Eng. Technol. 41, 248–253.

Karunanithi, N., Grenney, W.J., Whitley, D., Bovee, K., 1994. Neural networks for river flow prediction. J. Comp. Civil Eng. ASCE 8 (2), 201–220.

Kisi, O., 2008. Stream flow forecasting using neuro-wavelet technique. Hydrol. Process. 22 (20), 4142–4152.

Kisi, O., 2009. Neural networks and wavelet conjunction model for intermittent streamflow forecasting. J. Hydrol. Eng. 14 (8), 773–782.

Kisi, O., Cimen, M., 2011. A wavelet-support vector machine conjunction model for monthly streamflow forecasting. J. Hydrol.

Kucuk, M., Agiralioglu, N., 2006. Wavelet regression techniques for streamflow predictions. J. Appl. Stat. 33 (9), 943–960.

Lu, R.Y., 2002. Decomposition of interdecadal and interannual components for North China rainfall in rainy season. Chin. J. Atmos. 26, 611–624 (in Chinese).

Mallat, S.G., 1989. A theory for multi resolution signal decomposition: the wavelet representation. IEEE Trans. Pattern Anal. Mach. Intell. 11 (7), 674–693.

Marzano, F.S., Fionda, E., Ciotti, P., 2006. Neural-network approach to ground-based passive microwave estimation of precipitation intensity and extinction. J. Hydrol. 328, 121–131.

Moustris, K.P., Larissi, I.K., Nastos, P.T., Paliatsos, A.G., 2011. Precipitation forecast using artificial neural networks in specific regions of Greece. Water Resour. Manage.

Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models, Part 1 – a discussion of principles. J. Hydrol. 10 (3), 282–290.

Navone, H.D., Ceccatto, H.A., 1994. Predicting Indian monsoon rainfall: a neural network approach. Clim. Dyn. 10, 305–312.

Ning, M., Yunping, C., 1998. An ANN and wavelet transformation based method for short term load forecast. In: Energy Management and Power Delivery. International Conferences, vol. 2, pp. 405–410.

Nourani, V., Alami, M.T., Aminfar, M.H., 2009. A combined neural-wavelet model for prediction of Ligvanchai watershed precipitation. Eng. Appl. Artif. Intell. 22, 466–472.

Olson, D.A., Junker, N.W., Korty, 1995. Evaluation 33 years of quantitative precipitation forecasting at NMC. Weather Forecast. 10, 498–511.

Olsson, J., Uvo, C.B., Jinno, K., Kawamura, A., Nishiyama, K., Koreeda, N., Nakashima, T., Morita, O., 2004. Neural networks for rainfall forecasting by atmospheric downscaling. J. Hydrol. Eng. ASCE 9, 1–12.


Partal, T., Kucuk, M., 2006. Long-term trend analysis using discrete wavelet components of annual precipitations measurements in Marmara region (Turkey). Phys. Chem. Earth 31, 1189–1200.

Ramirez, M.C.V., Velho, H.F.C., Ferreira, N.J., 2005. Artificial neural network technique for rainfall forecasting applied to the Sao Paulo region. J. Hydrol. 301, 146–160.

Santos, C.A.G., Morais, B.S., Silva, G.B.L., 2009. Drought forecast using artificial neural network for three hydrological zones in San Francisco river basin. IAHS-AISH Publ. 333, 302–312.

Sivapragasam, C., Liong, S.-Y., Pasha, M.F.K., 2001. Rainfall and runoff forecasting with SSA–SVM approach. J. Hydroinf. 3 (3), 141–152.

Smith, L.C., Turcotte, D.L., Isacks, B., 1998. Stream flow characterization and feature detection using a discrete wavelet transform. Hydrol. Process. 12, 233–249.

Smola, A.J., 1996. Regression Estimation with Support Vector Learning Machines. MSc Thesis, Technische Universitat Munchen, Germany.

Dai, X.G., Wang, P., Chou, J.F., 2003. Multiscale characteristics of the rainy season rainfall and interdecadal decaying of summer monsoon in North China. Chin. Sci. Bull. 48, 2730–2734.

Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer Verlag, New York, USA.

Wang, W., Ding, J., 2003. Wavelet network model and its application to the prediction of the hydrology. Nat. Sci. 1 (1), 67–71.

Wu, C.L., Chau, K.W., Fan, C., 2010a. Prediction of rainfall time series using modular artificial neural networks coupled with data-preprocessing techniques. J. Hydrol. 389, 146–167.

Wu, J., Liu, M., Jin, L., 2010b. Least square support vector machine ensemble for daily rainfall forecasting based on linear and nonlinear regression. Advances in Neural Network Research and Applications. Lect. Notes Electr. Eng. 67 (Part 1), 55–64. doi:10.1007/978-3-642-12990-2_7.

Xingang, D., Ping, W., Jifan, C., 2003. Multiscale characteristics of the rainy season rainfall and interdecadal decaying of summer monsoon in North China. Chin. Sci. Bull. 48, 2730–2734.

Zhou, H.C., Peng, Y., Liang, G.-H., 2008. The research of monthly discharge predictor-corrector model based on wavelet decomposition. Water Resour. Manage. 22 (2), 217–227.
