typhoon flood forecasting using integrated two-stage support vector machine approach


Journal of Hydrology 486 (2013) 334–342


Typhoon flood forecasting using integrated two-stage Support Vector Machine approach

Gwo-Fong Lin a,⇑, Yang-Ching Chou a, Ming-Chang Wu b

a Department of Civil Engineering, National Taiwan University, Taipei 10617, Taiwan
b Taiwan Typhoon and Flood Research Institute, National Applied Research Laboratories, Taipei 10093, Taiwan

Article info

Article history:
Received 29 October 2012
Received in revised form 17 January 2013
Accepted 2 February 2013
Available online 18 February 2013
This manuscript was handled by Konstantine P. Georgakakos, Editor-in-Chief, with the assistance of Eylon Shamir, Associate Editor

Keywords:
Flood forecasting
Forecasted rainfall
Support Vector Machines
Typhoon characteristics

0022-1694/$ - see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jhydrol.2013.02.012

⇑ Corresponding author. Tel.: +886 2 3366 4368; fax: +886 2 2363 1558.
E-mail address: [email protected] (G.-F. Lin).

Summary

Accurate runoff forecasts are essential for flood mitigation and warning. In this paper, a two-stage flood forecasting model based on the Support Vector Machine (SVM) is presented. In the first stage, the observed typhoon characteristics and observed rainfall are used to produce a rainfall forecast; in the second stage, the forecasted rainfall and observed runoff are used to produce a runoff forecast. A dataset of 16 typhoon storms from Taiwan was used to evaluate the two-stage SVM model. The SVM model generated accurate rainfall and runoff forecasts with a 1–6 h lead time, especially for the peak runoff values. A substantial improvement in flood forecasting performance is shown for the 4- to 6-h lead times. In conclusion, the SVM model provides an operational advantage by increasing the forecast lead time during typhoon events.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

To mitigate disasters due to typhoons, accurate and reliable runoff forecasts are required to provide early warning of impending floods, and their improvement has been recognized as an important task. However, the highly non-linear and complex processes of typhoon rainfall and runoff make it difficult to construct a reliable physically-based model. Data-driven Neural Network (NN) approaches have been recommended as an attractive alternative to physically-based models. The American Society of Civil Engineers (ASCE) Task Committee (2000a,b) and Maier and Dandy (2000) provide a general introduction to and a comprehensive review of the application of NNs in hydrology. In recent years, NNs have been successfully used in various hydrologic modeling applications (e.g., de Vos and Rientjes, 2005; Hu et al., 2007; Lin and Chen, 2004) and specifically for rainfall and flood forecasting (e.g., Chang et al., 2004; Chiang et al., 2007; Lin and Chen, 2005; Lin et al., 2010; Luk et al., 2001; Pramanik et al., 2011; Rathinasamy and Khosa, 2012; Toth and Brath, 2007).

The major advantage of NNs is their capability to simulate the complex relationship between the desired output and the available input, given sufficient training data. Because of their flexibility in modeling nonlinear systems and their computational efficiency, NNs have received considerable attention.

However, the flood forecasting performance of most NNs decreases rapidly as the forecast lead time increases. Operational agencies that are responsible for flood mitigation and warnings can benefit from improved forecast accuracy at longer lead times. Multi-stage NN-based models were developed in an attempt to improve forecasts at longer lead times (Chang et al., 2007; Lin and Wu, 2011). The concept of multi-stage NN-based models is that two or more NN-based models are connected. Taking the two-stage case as an example, the connection between the two stages is that forecasted values from the first-stage module are used as input to the second-stage module. It is widely known that rainfall is one of the most important inputs to a flood forecasting model, and the accuracy of long lead time flood forecasting can be improved with more accurate rainfall forecasts. Lin et al. (2009c) improved longer lead time streamflow forecasts by using a Multilayer Perceptron with the back-propagation training algorithm (hereinafter BPN) to predict rainfall as input to a Radial Basis Function (RBF)-based reservoir inflow forecasting model. Chiang and Chang (2009) used Quantitative Precipitation Forecasting (QPF) information as input to a Recurrent Neural Network (RNN)-based flood forecasting model and reported a similar finding, that is, the forecasted rainfall was capable of providing useful information for flood forecasting, especially for long lead times.


In previous studies, multi-stage NN-based flood forecasting models have been based on conventional NNs. The architecture and the weights of these conventional NNs are determined by a trial-and-error procedure, which is an iterative and time-consuming process. Although the selection of NN-based models commonly disregards the efficiency of model training, it is essential to develop a well-performing model that can be trained quickly.

In this paper, a two-stage Support Vector Machine (SVM)-based model is developed to yield 1- to 6-h lead time runoff forecasts. Support Vector Machines (SVMs) have been used for hydrologic time series forecasting (Liong and Sivapragasam, 2002; Sivapragasam and Liong, 2005; Wu et al., 2009; Yu and Liong, 2007; Yu et al., 2004). More recently, Rasouli et al. (2012) used SVM and other machine learning methods with weather and climate inputs to forecast daily streamflow. Based on statistical learning theory, SVM has better generalization ability and requires less training time than conventional NNs (e.g., BPN). For both rainfall and inflow forecasting, Lin et al. (2009a,b) demonstrated that SVM-based models outperform BPN-based models. Moreover, the development of SVM-based models is efficient and is thus expected to be suitable for the two-stage model presented herein.

The objective of this study is to demonstrate a two-stage SVM-based model for typhoon flood forecasting. A rainfall forecasting module is developed in the first stage to pre-process the typhoon information (namely, typhoon characteristics and rainfall) and to produce rainfall forecasts. Then, the rainfall forecasts along with the observed runoff are used as input to the flood forecasting module in the second stage. This procedure is expected to reduce the input dimensionality and improve the performance at longer forecast lead times, especially for the prediction of peak runoff. This paper is organized as follows. Section 2 describes the development of the two-stage model. The application of the proposed two-stage model and the forecast results are presented in Section 3. Section 4 summarizes the main conclusions of this study.

2. Model development

2.1. SVM theory

SVMs were developed for classification applications in the early 1990s, and later extended to regression analysis by Vapnik (1995). In this section, the methodology of the support vector regression (SVR) used in this paper is briefly described; more details can be found in several textbooks (Cristianini and Shawe-Taylor, 2000; Vapnik, 1995, 1998).

Fig. 1. Architectural graph of SVM.

The architecture of an SVM is presented in Fig. 1. Based on $N_d$ training data $[(\mathbf{x}_1, y_1), (\mathbf{x}_2, y_2), \ldots, (\mathbf{x}_{N_d}, y_{N_d})]$, the objective of the SVR is to find a non-linear regression function to yield the output $\hat{y}$, which is the best approximation of the desired output $y$ with an error tolerance of $\varepsilon$. First, the input vector $\mathbf{x}$ is mapped into a higher dimensional feature space by a non-linear function $\phi(\mathbf{x})$. Then the regression function that relates the input vector $\mathbf{x}$ to the output $\hat{y}$ can be written as:

$$\hat{y} = f(\mathbf{x}) = \mathbf{w}^{T}\phi(\mathbf{x}) + b \tag{1}$$

where w and b are the weights and bias of the regression function, respectively. According to the Structural Risk Minimization (SRM) induction principle, the learning objective of an SVM is to minimize both the empirical risk and the model complexity. Based on the SRM induction principle, w and b are estimated by minimizing the following structural risk function:

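$$R = \frac{1}{2}\mathbf{w}^{T}\mathbf{w} + C\sum_{i=1}^{N_d} L_{\varepsilon}(\hat{y}_i) \tag{2}$$

where Vapnik's $\varepsilon$-insensitive loss function $L_{\varepsilon}$ is defined as:

$$L_{\varepsilon}(\hat{y}) = |y - f(\mathbf{x})|_{\varepsilon} = \begin{cases} 0 & \text{for } |y - f(\mathbf{x})| < \varepsilon \\ |y - f(\mathbf{x})| - \varepsilon & \text{for } |y - f(\mathbf{x})| \ge \varepsilon \end{cases} \tag{3}$$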

The first and second terms in Eq. (2) represent the model complexity and the empirical error, respectively. The trade-off between the model complexity and the empirical error is specified by a user-defined parameter C; C = 1 represents the case in which the model complexity is as important as the empirical error. The use of the SRM induction principle results in the better generalization ability of SVMs and avoids over-training of the model.
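To make the roles of the tolerance and of C concrete, the following minimal Python sketch evaluates Eqs. (2) and (3) for a handful of samples. The function names and all numerical values are illustrative assumptions and are not taken from the study.

```python
def eps_insensitive_loss(y, y_hat, eps):
    """Vapnik's epsilon-insensitive loss of Eq. (3) for one sample."""
    residual = abs(y - y_hat)
    # Errors inside the tolerance band cost nothing; outside it the cost grows linearly.
    return 0.0 if residual < eps else residual - eps


def structural_risk(w, y_true, y_pred, C, eps):
    """Structural risk of Eq. (2): 0.5 * w^T w + C * sum of the losses."""
    complexity = 0.5 * sum(wi * wi for wi in w)
    empirical = sum(eps_insensitive_loss(y, yh, eps) for y, yh in zip(y_true, y_pred))
    return complexity + C * empirical


# Illustrative values only (hypothetical, not from the paper).
w = [0.5, -1.0]
y_true = [10.0, 12.0, 15.0]
y_pred = [10.05, 12.5, 14.0]
risk = structural_risk(w, y_true, y_pred, C=1.0, eps=0.1)
```

With a larger C the empirical term dominates and the fit follows the training data more closely; with a smaller C the complexity term dominates.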

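Vapnik (1995) expressed the SVR problem in terms of the following optimization problem:

$$\text{Minimize } R(\mathbf{w}, b, \xi, \xi') = \frac{1}{2}\|\mathbf{w}\|^{2} + C\sum_{i=1}^{N_d}(\xi_i + \xi'_i) \tag{4}$$

subject to

$$\begin{aligned} y_i - \hat{y}_i &= y_i - (\mathbf{w}^{T}\phi(\mathbf{x}_i) + b) \le \varepsilon + \xi_i \\ \hat{y}_i - y_i &= (\mathbf{w}^{T}\phi(\mathbf{x}_i) + b) - y_i \le \varepsilon + \xi'_i \\ \xi_i &\ge 0, \quad i = 1, 2, \ldots, N_d \\ \xi'_i &\ge 0, \quad i = 1, 2, \ldots, N_d \end{aligned} \tag{5}$$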

where $\xi$ and $\xi'$, which are slack variables used to convert an inequality constraint into an equality constraint, represent the upper and the lower training errors, respectively. The above optimization problem is usually solved in its dual form using Lagrange multipliers. Rewriting Eq. (4) in its dual form and differentiating with respect to the primal variables ($\mathbf{w}$, $b$, $\xi$, $\xi'$) gives:

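$$\text{Maximize } \sum_{i=1}^{N_d} y_i(\alpha_i - \alpha'_i) - \varepsilon\sum_{i=1}^{N_d}(\alpha_i + \alpha'_i) - \frac{1}{2}\sum_{i=1}^{N_d}\sum_{j=1}^{N_d}(\alpha_i - \alpha'_i)(\alpha_j - \alpha'_j)\,\phi(\mathbf{x}_i)^{T}\phi(\mathbf{x}_j) \tag{6}$$

subject to:

$$\begin{aligned} &\sum_{i=1}^{N_d}(\alpha_i - \alpha'_i) = 0 \\ &0 \le \alpha_i \le C, \quad i = 1, 2, \ldots, N_d \\ &0 \le \alpha'_i \le C, \quad i = 1, 2, \ldots, N_d \end{aligned} \tag{7}$$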

where $\alpha$ and $\alpha'$ are the dual Lagrange multipliers. Note that the solution to the optimization problem (Eq. (6)) is guaranteed to be unique and to converge to the global optimum because the objective function is convex.
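The step from the primal problem of Eqs. (4) and (5) to the dual problem of Eqs. (6) and (7) can be sketched with the standard Lagrangian argument. The multipliers $\eta_i$ and $\eta'_i$ for the non-negativity constraints are notation introduced here for illustration and do not appear in the original text:

$$\begin{aligned} \mathcal{L} &= \frac{1}{2}\|\mathbf{w}\|^{2} + C\sum_{i=1}^{N_d}(\xi_i + \xi'_i) - \sum_{i=1}^{N_d}\alpha_i\left(\varepsilon + \xi_i - y_i + \mathbf{w}^{T}\phi(\mathbf{x}_i) + b\right) \\ &\quad - \sum_{i=1}^{N_d}\alpha'_i\left(\varepsilon + \xi'_i + y_i - \mathbf{w}^{T}\phi(\mathbf{x}_i) - b\right) - \sum_{i=1}^{N_d}(\eta_i\xi_i + \eta'_i\xi'_i) \\ \frac{\partial\mathcal{L}}{\partial\mathbf{w}} = 0 &\;\Rightarrow\; \mathbf{w} = \sum_{i=1}^{N_d}(\alpha_i - \alpha'_i)\,\phi(\mathbf{x}_i) \\ \frac{\partial\mathcal{L}}{\partial b} = 0 &\;\Rightarrow\; \sum_{i=1}^{N_d}(\alpha_i - \alpha'_i) = 0 \\ \frac{\partial\mathcal{L}}{\partial\xi_i} = 0 &\;\Rightarrow\; C - \alpha_i - \eta_i = 0, \qquad \frac{\partial\mathcal{L}}{\partial\xi'_i} = 0 \;\Rightarrow\; C - \alpha'_i - \eta'_i = 0 \end{aligned}$$

Substituting these conditions back into $\mathcal{L}$ eliminates the primal variables and leaves the objective of Eq. (6); the second condition is the equality constraint in Eq. (7), and the last two, together with $\eta_i, \eta'_i \ge 0$, give the box constraints $0 \le \alpha_i, \alpha'_i \le C$.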

The optimal Lagrange multipliers $\alpha^{*}$ are solved by the standard quadratic programming algorithm, and then the regression function can be rewritten as:


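$$f(\mathbf{x}) = \sum_{i=1}^{N_d}\alpha^{*}_i K(\mathbf{x}_i, \mathbf{x}) + b \tag{8}$$

where $K(\mathbf{x}_i, \mathbf{x})$ is the kernel function. The kernel function used in this paper is the radial basis function:

$$K(\mathbf{x}_i, \mathbf{x}) = \exp\left(-\frac{1}{n_x}|\mathbf{x}_i - \mathbf{x}|^{2}\right) \tag{9}$$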

where $n_x$ is the number of components in the input vector $\mathbf{x}$.

Some of the solved Lagrange multipliers $(\alpha - \alpha')$ are zero and should be eliminated from the regression function. Finally, the regression function involves only the nonzero Lagrange multipliers and the corresponding input vectors of the training data, which are called the support vectors. The final regression function can be rewritten as:

$$f(\mathbf{x}) = \sum_{k=1}^{N_{sv}}\alpha_k K(\mathbf{x}_k, \mathbf{x}) + b \tag{10}$$

where $\mathbf{x}_k$ denotes the kth support vector and $N_{sv}$ is the number of support vectors.
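As an illustration of how Eqs. (9) and (10) are evaluated once the multipliers are known, the sketch below computes the kernel and the regression output in plain Python. The support vectors, coefficients and bias are hypothetical placeholders standing in for a quadratic programming solution; they are not values from the study.

```python
import math


def rbf_kernel(x_i, x):
    """RBF kernel of Eq. (9): exp(-|x_i - x|^2 / n_x), with n_x = len(x)."""
    n_x = len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x))
    return math.exp(-sq_dist / n_x)


def svr_predict(x, support_vectors, coefficients, bias):
    """Evaluate Eq. (10): sum over support vectors of alpha_k * K(x_k, x), plus b."""
    return bias + sum(
        a_k * rbf_kernel(x_k, x) for x_k, a_k in zip(support_vectors, coefficients)
    )


# Hypothetical support vectors and coefficients (assumed, not fitted to any data).
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
coefficients = [2.0, -1.0]
bias = 0.5
y_at_sv = svr_predict([0.0, 0.0], support_vectors, coefficients, bias)
```

Only the support vectors enter the sum, so the cost of one forecast scales with $N_{sv}$ rather than with the size of the training set.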

For a conventional NN, the architecture and the weights are determined by a trial-and-error procedure, which is a time-consuming iterative process. In contrast, the optimal architecture and weights of an SVM are quickly "solved", not "searched", which provides an efficient and consistent platform.

2.2. Model construction

The architecture of the proposed two-stage SVM-based model (hereinafter SVM-QRf) is illustrated in Fig. 2a. In the first stage, the rainfall forecasting module is used to pre-process the typhoon information (that is, typhoon characteristics and rainfall) to produce rainfall forecasts. Then, in the second stage, the forecasted rainfall (Rf) in conjunction with the observed runoff (Q) is used as input to the flood forecasting module. For comparison with SVM-QRf, a one-stage SVM-based flood forecasting model (named SVM-QRT) with observed runoff (Q), rainfall (R) and typhoon characteristics (T) as input is constructed. It should be noted that rainfall and typhoon characteristics are directly used as input to SVM-QRT without preprocessing (Lin et al., 2009a). The architecture of SVM-QRT is illustrated in Fig. 2b.

Fig. 2. Architectural graphs of: (a) the proposed model and (b) the existing model.

The construction of the proposed model, SVM-QRf, is summarized below. First, the rainfall and typhoon characteristics are used as input to the rainfall forecasting module. The general form of the rainfall forecasting module is:

$$\hat{R}_{t+\Delta t} = f\left(R_t, R_{t-1}, \ldots, R_{t-(L_R-1)}, TY_t, TY_{t-1}, \ldots, TY_{t-(L_{TY}-1)}\right) \tag{11}$$

where $t$ is the current time, $\Delta t$ is the lead-time period (from 1 to 6 h), $R_t$ is the observed rainfall at time $t$, $L_R$ denotes the lag length of rainfall, $TY_t$ denotes the typhoon characteristics at time $t$, $L_{TY}$ denotes the lag length of typhoon characteristics, and $\hat{R}_{t+\Delta t}$ is the forecasted rainfall at time $t + \Delta t$.
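The input vector of Eq. (11) can be assembled from hourly series as in the following minimal sketch. The helper names, the two typhoon attributes and all values are hypothetical and serve only to show the indexing of the lagged terms.

```python
def lagged_inputs(series, t, lag_length):
    """Return [s_t, s_{t-1}, ..., s_{t-(L-1)}] for a series indexed by hour."""
    if t - (lag_length - 1) < 0:
        raise ValueError("not enough history for the requested lag length")
    return [series[t - k] for k in range(lag_length)]


def rainfall_module_sample(rain, typhoon, t, lead, lag_r, lag_ty):
    """Build one (input, target) pair in the form of Eq. (11)."""
    x = lagged_inputs(rain, t, lag_r)
    for row in lagged_inputs(typhoon, t, lag_ty):
        x.extend(row)  # each row holds the typhoon characteristics at one hour
    target = rain[t + lead]  # observed rainfall at t + lead, used as training target
    return x, target


# Toy hourly series (illustrative values only, not the study's data).
rain = [0.0, 1.2, 3.4, 8.0, 5.5, 2.1]
typhoon = [[23.0 + 0.1 * h, 960.0 - h] for h in range(6)]  # e.g. latitude, pressure
x, target = rainfall_module_sample(rain, typhoon, t=2, lead=3, lag_r=2, lag_ty=1)
```

A separate set of samples would be built for each lead time from 1 to 6 h, since the target index shifts with the lead.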

Then, the forecasted rainfall ($\hat{R}_{t+\Delta t}$) and observed runoff data are used as input to the flood forecasting module in the second stage. The general form of the proposed model is:

$$\hat{Q}_{t+\Delta t} = f\left(Q_t, Q_{t-1}, \ldots, Q_{t-(L_Q-1)}, \hat{R}_{t+\Delta t}\right) \tag{12}$$

where $Q_t$ is the observed runoff at time $t$, $L_Q$ denotes the lag length of runoff, and $\hat{Q}_{t+\Delta t}$ is the forecasted runoff at time $t + \Delta t$.

The general form of the SVM-QRT model is:

$$\hat{Q}_{t+\Delta t} = f\left(Q_t, Q_{t-1}, \ldots, Q_{t-(L_Q-1)}, R_t, R_{t-1}, \ldots, R_{t-(L_R-1)}, TY_t, TY_{t-1}, \ldots, TY_{t-(L_{TY}-1)}\right) \tag{13}$$

The flowchart of SVM-QRf and SVM-QRT is shown in Fig. 3. In model construction, determining the appropriate lag lengths of the inputs is an important step. A trial-and-error procedure is applied to determine them, and the lag lengths $L_{TY}$, $L_R$ and $L_Q$ are determined by the same process. The criterion for selecting the lag lengths is the relative percentage error (RPE):

$$\mathrm{RPE} = \frac{E(L) - E(L+1)}{E(L)} \times 100 \tag{14}$$

where E(L) and E(L + 1) are the root mean square errors (RMSEs) for models with lag lengths L and L + 1, respectively. In general, the RMSE decreases as the lag length increases. When the RPE is less than 5%, the lag length is no longer increased and the best inputs of the forecasting models are selected.
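The stopping rule can be sketched as follows. This is one reading of the criterion, assuming that the lag length L is retained as soon as moving to L + 1 improves the RMSE by less than 5%; the RMSE values are hypothetical.

```python
def relative_percentage_error(e_l, e_l_plus_1):
    """RPE of Eq. (14) between the RMSEs for lag lengths L and L + 1."""
    return (e_l - e_l_plus_1) / e_l * 100.0


def select_lag(rmse_by_lag, threshold=5.0):
    """Return the smallest lag L whose RPE to L + 1 falls below the threshold.

    rmse_by_lag[0] is the RMSE for L = 1, rmse_by_lag[1] for L = 2, and so on.
    """
    for idx in range(len(rmse_by_lag) - 1):
        rpe = relative_percentage_error(rmse_by_lag[idx], rmse_by_lag[idx + 1])
        if rpe < threshold:
            return idx + 1
    return len(rmse_by_lag)  # the rule never triggered within the tested range


# Hypothetical RMSEs for L = 1, 2, 3, 4 (not from the paper).
rmse_by_lag = [10.0, 8.0, 7.8, 7.7]
best_lag = select_lag(rmse_by_lag)
```

In this toy case the gain from L = 1 to L = 2 is 20%, while the gain from L = 2 to L = 3 is only 2.5%, so the search stops at L = 2.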


Fig. 3. Flowchart of the model development.


For event-based data, the collected events are separated into two datasets: training and testing. Some of the collected events are chosen as training data and used to construct NN-based models. The performance of the NN-based models is tested on the remaining events. Different selections of training data and testing data yield different results and sometimes lead to different conclusions. In this study, we used cross validation, and each single typhoon event (except the event with the maximum runoff) was used in turn to test the NN-based models. Hence, for N typhoon events, a total of N − 1 test results were obtained. The conclusions are drawn on the basis of the overall performance for these testing results.
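The event-wise cross validation described above can be sketched as a leave-one-event-out loop in which one designated event always stays in the training set. The function name and the event labels are hypothetical.

```python
def leave_one_event_out(events, always_train):
    """Yield (train_events, test_event) pairs, one per testable event."""
    for test_event in events:
        if test_event == always_train:
            continue  # the maximum-runoff event is never held out for testing
        train = [e for e in events if e != test_event]
        yield train, test_event


# Hypothetical event labels; "C" plays the role of the maximum-runoff event.
events = ["A", "B", "C", "D"]
splits = list(leave_one_event_out(events, always_train="C"))
```

For N events this produces N − 1 splits, and keeping the largest event in every training set means the models are never asked to forecast beyond the range of runoff they were trained on.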

2.3. Performance measures

To evaluate the forecast performance of the models the follow-ing performance criteria were selected:

1. Mean coefficient of efficiency (MCE)

For a single testing event, the coefficient of efficiency (CE) is written as:

$$\mathrm{CE} = 1 - \frac{\sum_{t=1}^{n}(Q_t - \hat{Q}_t)^{2}}{\sum_{t=1}^{n}(Q_t - \bar{Q})^{2}} \tag{15}$$

where $Q_t$ and $\hat{Q}_t$ denote the observed and forecasted runoff at time $t$, respectively, $\bar{Q}$ is the average of the observed runoff, and $n$ is the number of time steps. If the CE value is equal to one, the forecasts are perfect. Because cross validation is used herein, the mean CE of N testing events is written as:

$$\mathrm{MCE} = \frac{1}{N}\sum_{j=1}^{N}\mathrm{CE}_j \tag{16}$$

where $\mathrm{CE}_j$ is the CE for the jth testing event.

2. Root mean square error (RMSE)

The RMSE is a measure which represents the errors between two sets of data. The smaller the RMSE value, the better the forecasts. The RMSE is written as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}(Q_t - \hat{Q}_t)^{2}} \tag{17}$$

3. Mean error of peak runoff (MEPR)

For a single testing event, the error of peak runoff (EPR) is defined as:

$$\mathrm{EPR} = \left|\frac{\hat{Q}_p - Q_p}{Q_p}\right| \tag{18}$$

where $\hat{Q}_p$ and $Q_p$ are the forecasted peak runoff and the observed peak runoff, respectively. The mean error of peak runoff of N testing events is written as:

$$\mathrm{MEPR} = \frac{1}{N}\sum_{j=1}^{N}\mathrm{EPR}_j \tag{19}$$

where $\mathrm{EPR}_j$ is the error of peak runoff for the jth testing event.
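The three measures can be computed as in the sketch below. It assumes that the forecasted peak is taken as the maximum of the forecasted hydrograph, which the text does not state explicitly; the toy hydrograph is illustrative only.

```python
import math


def coefficient_of_efficiency(obs, fcst):
    """CE of Eq. (15)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - f) ** 2 for o, f in zip(obs, fcst))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def rmse(obs, fcst):
    """RMSE of Eq. (17)."""
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(obs, fcst)) / len(obs))


def error_of_peak_runoff(obs, fcst):
    """EPR of Eq. (18); the forecasted peak is assumed to be max(fcst)."""
    q_p = max(obs)
    return abs((max(fcst) - q_p) / q_p)


def mean_over_events(values):
    """Average a per-event measure, as in Eqs. (16) and (19)."""
    return sum(values) / len(values)


# Toy hydrograph (hypothetical values, not from the study).
obs = [10.0, 20.0, 40.0, 30.0]
fcst = [12.0, 18.0, 36.0, 30.0]
ce = coefficient_of_efficiency(obs, fcst)
epr = error_of_peak_runoff(obs, fcst)
```

Applying `mean_over_events` to the per-event CE or EPR values from the cross-validation runs gives the MCE and MEPR.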

3. Application, results and discussion

3.1. Application

The island of Taiwan is located in one of the main paths of north-western Pacific typhoons. During the past 100 years, on average, about four typhoons have hit Taiwan each year. The study area is the Wu River basin in central Taiwan. The basin, with an area of 2026 km², is the fourth largest in Taiwan. The length of the main river is 119 km and the average slope is 1/92. Heavy rainfall brought by typhoons frequently causes flood disasters in the Wu River basin. The city of Taichung, with a population of about 3 million people, is located downstream of the Nan-Pei Bridge on the Wu River. In 2008, two typhoons (Kalmaegi and Fung-Wong) successively hit central Taiwan, causing an economic loss of about 3 billion USD.

Fig. 4. The study area and locations of rainfall and water-level stations.

Table 1
Description of typhoon events used in the modeling.

Name        Date                 Duration (h)   Scale              Maximum hourly rainfall (mm)   Peak runoff (m³/s)
Tim         10 July 1994         50             Intense typhoon    11.48                          87.8
Doug        8 August 1994        30             Intense typhoon    38.03                          749
Toraji      29 July 2001         62             Moderate typhoon   85.07                          1530
Nari        16 September 2001    111            Moderate typhoon   19.00                          84.5
Nakri       9 July 2002          96             Minor typhoon      21.09                          244
Nanmadol    3 December 2004      37             Moderate typhoon   13.21                          61.6
Haitang     17 July 2005         81             Intense typhoon    38.94                          916.6
Talim       31 August 2005       55             Intense typhoon    14.36                          262.4
Longwang    1 October 2005       40             Intense typhoon    16.73                          208.4
Sepat       17 August 2007       48             Intense typhoon    21.14                          228.2
Wipha       17 September 2007    52             Moderate typhoon   20.02                          183
Krosa       4 October 2007       79             Intense typhoon    27.01                          267.5
Kalmaegi    16 July 2008         58             Moderate typhoon   67.56                          370.2
Fung-wong   26 July 2008         73             Moderate typhoon   30.25                          388.2
Sinlaku     11 September 2008    127            Intense typhoon    47.86                          662
Jangmi      27 September 2008    52             Intense typhoon    29.44                          696

Note: According to the classification system of the Taiwan Central Weather Bureau, the intensities of minor, moderate and intense typhoons are 34–63, 64–99, and ≥100 knots, respectively.

Table 2
Input variables to the NN models.

Lead time (h)   SVM-QRT                        SVM-QRf                     SVM-QRi
1               Q(t), Q(t − 1), R(t), Ty(t)    Q(t), Q(t − 1), R̂(t + 1)    Q(t), Q(t − 1), R(t + 1)
2               Q(t), Q(t − 1), R(t), Ty(t)    Q(t), Q(t − 1), R̂(t + 2)    Q(t), Q(t − 1), R(t + 2)
3               Q(t), Q(t − 1), R(t), Ty(t)    Q(t), Q(t − 1), R̂(t + 3)    Q(t), Q(t − 1), R(t + 3)
4               Q(t), Q(t − 1), R(t), Ty(t)    Q(t), Q(t − 1), R̂(t + 4)    Q(t), Q(t − 1), R(t + 4)
5               Q(t), Q(t − 1), R(t), Ty(t)    Q(t), Q(t − 1), R̂(t + 5)    Q(t), Q(t − 1), R(t + 5)
6               Q(t), Q(t − 1), R(t), Ty(t)    Q(t), Q(t − 1), R̂(t + 6)    Q(t), Q(t − 1), R(t + 6)

Note: Q: observed runoff; R: observed rainfall; R̂: forecasted rainfall; Ty: observed typhoon characteristics.

Fig. 4 shows the study area and the locations of four hourly rainfall stations (Pei-Shan, Chin-Liu, Hui-Suen and Tsui-Luan) and one hourly water-level station (Nan-Pei Bridge). The rainfall and runoff data were obtained from the Water Resources Agency, and the data of typhoon characteristics were provided by the Central Weather Bureau. The mean areal rainfall of the watershed was calculated using the Thiessen method. The typhoon characteristics dataset includes the latitude and longitude (degrees) of the typhoon center, the distance (km) between the typhoon center and the water-level station, the near-center maximum wind speed (m/s), the central pressure (hPa), the storm radius (km) and the speed (km/h) of the typhoon movement. The typhoon events that were used in this study are listed in Table 1.
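In the Thiessen method, each gauge is weighted by the share of the basin area that lies closest to it. The sketch below shows only the weighting step; the polygon areas and rainfall depths are hypothetical placeholders, since the actual Thiessen weights of the four stations are not given in the text.

```python
def thiessen_mean_rainfall(gauge_rain, polygon_areas):
    """Area-weighted mean of gauge rainfall (Thiessen polygon method)."""
    total_area = sum(polygon_areas.values())
    return sum(gauge_rain[name] * area / total_area for name, area in polygon_areas.items())


# Hypothetical polygon areas (km^2) and hourly rainfall (mm); not the study's data.
polygon_areas = {"Pei-Shan": 500.0, "Chin-Liu": 300.0, "Hui-Suen": 700.0, "Tsui-Luan": 500.0}
gauge_rain = {"Pei-Shan": 10.0, "Chin-Liu": 20.0, "Hui-Suen": 5.0, "Tsui-Luan": 15.0}
mean_rain = thiessen_mean_rainfall(gauge_rain, polygon_areas)
```

The weights are fixed by the station geometry, so they are computed once and reused for every hour of every event.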

Table 3
MCE and MEPR for various models.

Lead time (h)   SVM-QRT   SVM-QRf   SVM-QRi
MCE
1               0.73      0.93      0.93
2               0.56      0.85      0.84
3               0.47      0.76      0.74
4               0.12      0.65      0.63
5               −0.28     0.58      0.55
6               −0.80     0.46      0.39
MEPR (%)
1               7.28      4.42      3.74
2               13.23     8.16      4.06
3               16.41     11.49     5.21
4               22.28     13.58     6.73
5               28.12     14.28     7.92
6               32.51     15.10     11.78

3.2. Rainfall forecasts

In order to show the influence of rainfall forecasts on flood forecasting, a hypothetical SVM-based model (named SVM-QRi) was first tested. It should be noted that the rainfall input to SVM-QRi is considered optimal, that is, the raingauge measurements were used to represent the perfect rainfall forecast. Table 2 presents the list of inputs that were used to construct the SVM-based models. The MCE values of SVM-QRi and the conventional model (SVM-QRT) are presented in Fig. 5. Additionally, the result of a BPN-based model (named BPN-QRT), which uses the same inputs as SVM-QRT, is also presented in Fig. 5. As shown in Fig. 5, SVM-QRi performed best among all models. Furthermore, SVM-QRT yielded a higher MCE than BPN-QRT, which is consistent with the conclusion of Lin et al. (2009a, 2009b). It is seen that neither SVM-QRT nor BPN-QRT can yield effective forecasts for a forecast lead time greater than 3 h, whereas SVM-QRi produced accurate flood forecasts up to 6 h. This supports the argument that the SVM-based model can effectively mitigate the negative impact of increasing forecast lead time if reliable and accurate rainfall forecasts are made available.

Fig. 5. MCE values of SVM-QRT, BPN-QRT and SVM-QRi.

Fig. 6. RMSE values of the rainfall forecasts.

Lin et al. (2009b) confirmed that adding typhoon characteristics significantly improves the rainfall forecasting performance, especially for forecast lead times longer than 3 h. Following their recommendation, data of rainfall and typhoon characteristics are used herein to develop an SVM-based rainfall forecasting module. The RMSE values resulting from the rainfall forecasting module are presented in Fig. 6. As shown in Fig. 6, RMSE values increase from 4.3 mm to 6.3 mm for 1- to 3-h lead time forecasts, while they increase only slightly from 6.6 mm to 7 mm for 4- to 6-h lead time forecasts. We note that in this region, the maximum yearly rainfall is higher than 2000 mm and the maximum hourly rainfall is higher than 80 mm. The RMSE values of 1- to 6-h lead time forecasts are all lower than 8 mm, which indicates high accuracy of the rainfall forecasts.

Fig. 7. (a) MCE values of SVM-QRT and SVM-QRf and (b) the improvement in MCE due to the use of SVM-QRf instead of SVM-QRT.

Fig. 8. (a) MEPR values of SVM-QRT and SVM-QRf and (b) the improvement in MEPR due to the use of SVM-QRf instead of SVM-QRT.

Table 4
Paired comparison t-tests of two performance measures (CE and EPR) resulting from SVM-QRT and SVM-QRf.

Alternative hypothesis        t Statistic   Critical t value   Statistically significant at the 1% level
CE(SVM-QRf) > CE(SVM-QRT)     −2.95         −2.37              Yes
EPR(SVM-QRf) < EPR(SVM-QRT)   4.19          2.37               Yes

Fig. 9. Number of events for which (a) CE values of SVM-QRf are higher than those of SVM-QRT and (b) EPR values of SVM-QRf are lower than those of SVM-QRT.

3.3. Influence of forecasted rainfall on flood forecasting

The MCE and MEPR of three SVM-based models (SVM-QRT, SVM-QRf and SVM-QRi) for 1- to 6-h lead time forecasts are summarized in Table 3. The input data of SVM-QRi are the observed record that represents the optimal rainfall forecast and the antecedent runoff. As for the input data of both SVM-QRT and SVM-QRf, the antecedent runoff, rainfall and typhoon characteristics were used. However, in the SVM-QRf model, the rainfall forecasting module was used to pre-process the typhoon information (that is, typhoon characteristics and rainfall) and to provide the forecasted rainfall. For SVM-QRT, the rainfall and typhoon characteristics were directly used as inputs without further processing. In the following subsection we focus on the comparison between SVM-QRf and SVM-QRT.

The MCE values for runoff forecasts of both SVM-QRT and SVM-QRf decrease with increasing forecast lead time (Fig. 7a). However, the MCE values of SVM-QRT decrease more rapidly than those of SVM-QRf. For 1- to 3-h lead time forecasts, both models provided reasonable runoff forecasts. For 4- to 6-h lead time forecasts, the performance of SVM-QRT gets worse and the MCE values are almost equal to or even lower than zero. Clearly, SVM-QRT cannot yield effective forecasts when the forecast lead time is greater than 3 h. As for SVM-QRf, the performance is still acceptable for longer lead times up to 6 h. Regardless of the forecast lead time, the proposed model improved the accuracy of the runoff forecast when compared with the model without forecasted rainfall. Furthermore, the improvement in MCE due to the use of SVM-QRf instead of SVM-QRT is presented in Fig. 7b. It is also concluded that SVM-QRf outperformed SVM-QRT. As for the other performance measure (MEPR), SVM-QRf yields significantly lower MEPR values than SVM-QRT (Fig. 8a), and the improvement increases with increasing forecast lead time (Fig. 8b). This indicates that SVM-QRf is more appropriate for forecasting peak runoff than SVM-QRT, especially for lead times longer than 3 h.

Fig. 10. Comparison of the observed runoff with the 1- to 6-h lead time forecasts resulting from SVM-QRf.

Fig. 11. Comparison of the observed runoff with the 1-h lead time forecasts resulting from: (a) SVM-QRf and (b) SVM-QRT for Typhoon Haitang.

We found that runoff forecasts cannot be improved by using typhoon characteristics as direct input to an SVM-based model. Because of the short time of concentration in the study basin, the direct use of observed data (runoff, rainfall and typhoon characteristics) in model development cannot provide useful information for lead times longer than 3 h. When the forecast lead time increases, the data used for longer lead time forecasting include more complex noise and the correlation between the desired output and the available input decreases. Because the rainfall forecasting module successfully reduces the complication caused by the typhoon characteristics, the proposed model effectively improves the long lead-time forecasting.

In addition to the overall performance, the evaluation of individual events is described herein. The number of events for which SVM-QRf yields a higher CE than SVM-QRT is counted and presented in Fig. 9a. In a like manner, Fig. 9b presents the results for the other performance measure, EPR. Fig. 9 shows that SVM-QRf performs better than SVM-QRT for most of the events. To further assess whether SVM-QRf performs better than SVM-QRT for the same testing events, paired comparison t-tests are conducted at the 1% significance level. Table 4 shows that SVM-QRf yields significantly higher CE and lower EPR than SVM-QRT. For 1- to 6-h lead times, the comparison of the observed runoff with the forecasts resulting from SVM-QRf is presented in Fig. 10. It is shown that the proposed two-stage model (SVM-QRf) produced reliable forecasts and that the forecasted hydrograph accurately matches the observed hydrograph. To highlight the comparison, Fig. 11 shows the hydrographs of 1-h lead time forecasts resulting from SVM-QRf and SVM-QRT for the most extreme runoff event (resulting from Typhoon Haitang). As shown in Fig. 11, both SVM-QRf and SVM-QRT slightly underestimate the peak runoff, but reproduce low runoff appropriately because low runoff is more frequent in the data set than high runoff. However, SVM-QRf captures the peak runoff better than SVM-QRT. Although the result confirms that the proposed model improves the forecasts of peak runoff, more validation of the models in extrapolation is still required in future research.
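The statistic behind a paired comparison t-test can be sketched as below. The per-event CE values are hypothetical and the sign of the statistic depends on the order of subtraction; the sketch stops at the statistic and leaves the comparison with a critical value to a t table.

```python
import math


def paired_t_statistic(a, b):
    """t statistic for paired samples: mean(d) / (sd(d) / sqrt(n)), with d = a - b."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)  # sample variance of differences
    return mean_d / math.sqrt(var_d / n)


# Hypothetical per-event CE values for two models (not the paper's results).
ce_two_stage = [0.90, 0.85, 0.80, 0.95]
ce_one_stage = [0.80, 0.70, 0.75, 0.85]
t_stat = paired_t_statistic(ce_two_stage, ce_one_stage)
```

Pairing by event removes the event-to-event variability that both models share, which is why the test is applied to the same testing events.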

4. Summary and conclusions

In this paper, a two-stage SVM-based model (i.e. SVM-QRf) is proposed for improving runoff forecasts during typhoon events. In the first stage, the rainfall forecasting module is used to pre-process the typhoon information (namely, typhoon characteristics and rainfall) and to produce rainfall forecasts. Then, in the second stage, the forecasted rainfall and observed runoff are used as input to the flood forecasting module to yield runoff forecasts. A case study for the Wu River basin in central Taiwan is performed to assess the model performance. In addition, a single-stage SVM-based model (i.e. SVM-QRT), which directly uses the observed runoff, rainfall and typhoon characteristics as input without any processing, is constructed for comparison.

Regarding the performance of rainfall forecasting, it is found that the first stage of the proposed model yields quite accurate 1- to 6-h lead time rainfall forecasts. The use of typhoon characteristics can effectively reduce the negative impacts of increasing forecast lead time. As to the performance of flood forecasting, a comparison between the proposed two-stage model and the single-stage model shows that the proposed model significantly improved the runoff forecasts. In addition to the overall performance, the proposed model significantly improved the forecasts of peak runoff, especially for long lead time forecasting. The better performance of the proposed model confirms that the processed typhoon information is more useful than the raw typhoon information. The use of forecasted rainfall and the proposed two-stage structure are justified and are expected to improve hourly typhoon flood forecasting.

Acknowledgements

This paper is based on research partially supported by the National Science Council, Taiwan, under grants NSC 101-2625-M-002-007 and NSC 99-2221-E-002-092-MY3. We would like to especially thank the Associate Editor and reviewers for their constructive suggestions that greatly improved the manuscript.

References

ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000a. Artificial neural networks in hydrology. I: Preliminary concepts. J. Hydrol. Eng. 5 (2), 115–123.

ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000b. Artificial neural networks in hydrology. II: Hydrologic applications. J. Hydrol. Eng. 5 (2), 124–137.

Chang, L.C., Chang, F.J., Chiang, Y.M., 2004. A two-step-ahead recurrent neural network for stream-flow forecasting. Hydrol. Process. 18 (1), 81–92.

Chang, F.C., Chiang, Y.M., Chang, L.C., 2007. Multi-step-ahead neural networks for flood forecasting. Hydrol. Sci. J.-J. Sci. Hydrol. 52 (1), 114–130.

Chiang, Y.M., Chang, F.C., 2009. Integrating hydrometeorological information for rainfall-runoff modelling by artificial neural networks. Hydrol. Process. 23 (11), 1650–1659.

Chiang, Y.M., Chang, F.C., Jou, J.D.B., Lin, P.F., 2007. Dynamic ANN for precipitation estimation and forecasting from radar observations. J. Hydrol. 334 (1–2), 250–261.

Cristianini, N., Shawe-Taylor, J., 2000. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, New York.

de Vos, N.J., Rientjes, T.H.M., 2005. Constraints of artificial neural networks for rainfall–runoff modeling: trade-offs in hydrological state representation and model evaluation. Hydrol. Earth Syst. Sci. 9 (1–2), 111–126.

Hu, T.S., Wu, F.Y., Zhang, X., 2007. Rainfall–runoff modeling using principal component analysis and neural network. Hydrol. Res. 38 (3), 235–248.

Lin, G.F., Chen, L.H., 2004. A non-linear rainfall-runoff model using radial basis function network. J. Hydrol. 289 (1–4), 1–8.

Lin, G.F., Chen, L.H., 2005. Application of artificial neural network to typhoon rainfall forecasting. Hydrol. Process. 19 (9), 1825–1837.

Lin, G.F., Wu, M.C., 2011. An RBF network with a two-step learning algorithm for developing a reservoir inflow forecasting model. J. Hydrol. 405 (3–4), 439–450.

Lin, G.F., Chen, G.R., Huang, P.Y., Chou, Y.C., 2009a. Support Vector Machine-based models for hourly reservoir inflow forecasting during typhoon-warning periods. J. Hydrol. 372 (1–4), 17–29.

Lin, G.F., Chen, G.R., Wu, M.C., Chou, Y.C., 2009b. Effective forecasting of hourly typhoon rainfall using Support Vector Machines. Water Resour. Res. 45, W08440. http://dx.doi.org/10.1029/2009WR007911.

Lin, G.F., Wu, M.C., Chen, G.R., Tsai, F.Y., 2009c. An RBF-based model with an information processor for forecasting hourly reservoir inflow during typhoons. Hydrol. Process. 23 (25), 3598–3609.

Lin, G.F., Huang, P.Y., Chen, G.R., 2010. Using typhoon characteristics to improve the long lead-time flood forecasting of a small watershed. J. Hydrol. 380 (3–4), 450–459.

Liong, S.Y., Sivapragasam, C., 2002. Flood stage forecasting with Support Vector Machines. J. Am. Water Resour. Assoc. 38 (1), 173–186.

Luk, K.C., Ball, J.E., Sharma, A., 2001. An application of artificial neural networks for rainfall forecasting. Math. Comput. Modell. 33, 683–693.

Maier, H.R., Dandy, G.C., 2000. Neural networks for the prediction and forecasting of water resources variables: a review of modeling issues and applications. Environ. Modell. Softw. 15, 101–124.

Pramanik, N., Panda, R.K., Singh, A., 2011. Daily river flow forecasting using wavelet ANN hybrid models. J. Hydroinform. 13 (1), 49–63.

Rasouli, K., Hsieh, W.W., Cannon, A.J., 2012. Daily streamflow forecasting by machine learning methods with weather and climate inputs. J. Hydrol. 414–415, 284–293.

Rathinasamy, M., Khosa, R., 2012. Multiscale nonlinear model for monthly streamflow forecasting: a wavelet-based approach. J. Hydroinform. 14 (2), 424–442.

Sivapragasam, C., Liong, S.Y., 2005. Flow categorization model for improving forecasting. Nord. Hydrol. 36 (1), 37–48.

Toth, E., Brath, A., 2007. Multistep ahead streamflow forecasting: role of calibration data in conceptual and neural network modeling. Water Resour. Res. 43, W11405. http://dx.doi.org/10.1029/2006WR005383.

Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer, New York.

Vapnik, V., 1998. Statistical Learning Theory. John Wiley, New York.

Wu, C.L., Chau, K.W., Li, Y.S., 2009. Predicting monthly streamflow using data-driven models coupled with data-preprocessing techniques. Water Resour. Res. 45, W08432. http://dx.doi.org/10.1029/2007WR006737.

Yu, X.Y., Liong, S.Y., 2007. Forecasting of hydrologic time series with ridge regression in feature space. J. Hydrol. 332 (3–4), 290–302.

Yu, X.Y., Liong, S.Y., Babovic, V., 2004. EC-SVM approach for real-time hydrologic forecasting. J. Hydroinform. 6 (3), 209–223.