superposition of three sources of uncertainties in operational flood forecasting chains



Atmospheric Research 100 (2011) 246–262

Contents lists available at ScienceDirect

Atmospheric Research

journal homepage: www.elsevier.com/locate/atmos

Superposition of three sources of uncertainties in operational flood forecasting chains

Massimiliano Zappa a,⁎, Simon Jaun a,b, Urs Germann c, André Walser c, Felix Fundel a

a Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
b Institute for Atmospheric and Climate Science, ETH Zurich, Switzerland
c Swiss Federal Office of Meteorology and Climatology MeteoSwiss, Switzerland

Article info

Article history:
Received 29 May 2009
Received in revised form 1 December 2010
Accepted 3 December 2010

Keywords:
Flood forecasting
Uncertainty superposition
Weather radar ensemble
Atmospheric EPS
Model uncertainty
PREVAH
MAP D-PHASE
COST 731

Abstract

One of the less well-known aspects of operational flood forecasting systems in complex topographic areas is how the uncertainties of their components propagate and superpose when they are fed into a hydrological model. This paper describes an experimental framework for investigating the relative contribution of meteorological forcing uncertainties, initial conditions uncertainties and hydrological model parameter uncertainties in the realization of hydrological ensemble forecasts. Simulations were done for a representative small-scale basin of the Swiss Alps, the Verzasca river basin (186 km²).

For seven events in the time frame from June 2007 to November 2008 it was possible to quantify the uncertainty for a five-day forecast range yielded by inputs of an ensemble numerical weather prediction (NWP) model (COSMO-LEPS, 16 members), the uncertainty in real-time assimilation of weather radar precipitation fields expressed using an ensemble approach (REAL, 25 members), and the equifinal parameter realizations of the hydrological model adopted (PREVAH, 26 members). Combining the three kinds of uncertainty results in a hydrological ensemble of 10,400 members. Analyses of sub-samples from the ensemble provide insight into the contribution of each kind of uncertainty to the total uncertainty.

The results confirm our expectations and show that for the operational simulation of peak-runoff events the hydrological model uncertainty is less pronounced than the uncertainty obtained by propagating radar precipitation fields (by a factor larger than 4 in our specific setup) and NWP forecasts through the hydrological model (by a factor larger than 10). The use of precipitation radar ensembles for generating ensembles of initial conditions shows that the uncertainty in initial conditions decays within the first 48 hours of the forecast. We also show that the total spread obtained when superposing two or more sources of uncertainty is larger than the cumulated spread of experiments in which only one uncertainty source is propagated through the hydrological model. The full spread obtained from uncertainty superposition grows non-linearly.

© 2010 Elsevier B.V. All rights reserved.

⁎ Corresponding author. Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Mountain Hydrology and Torrents, Zürcherstrasse 111, CH-8903 Birmensdorf. Tel.: +41 44 739 24 33.
E-mail address: [email protected] (M. Zappa).

0169-8095/$ – see front matter © 2010 Elsevier B.V. All rights reserved.
doi:10.1016/j.atmosres.2010.12.005

1. Introduction

Operational flood forecasting is an important task for detecting potentially hazardous extreme rainfall-runoff events in time. This is particularly challenging in mountainous areas, where the orography strongly complicates the setup and operational workflow of most components of an end-to-end flood forecasting system. Such systems consist of atmospheric models (e.g. Rotach et al., 2009), hydrological prediction systems (e.g. Zappa et al., 2008), nowcasting tools used for estimating initial conditions (e.g. Germann et al., 2009) and warnings for end-users (Bruen et al., 2010; Frick and Hegg, 2011-this issue).


Each component of the system is affected by uncertainties linked to the physical representation of orography, to the parameterization schemes of the models involved and to the limitations of the observing platforms providing real-time data (Zappa et al., 2010). For an integral consideration of uncertainty, three key sources of error have to be considered: a) the uncertainty arising from incomplete process representation, including the error in the estimation of model parameters (Vrugt et al., 2005), b) the uncertainty in the initial conditions and c) the uncertainty of the observed/forecasted hydrometeorological input. The components of this “uncertainty triplet” (Fig. 1) superpose when data are fed into a hydrological model. The integral uncertainty is the result of the interactions of all propagating sources of uncertainty.

In the field of numerical weather prediction, ensemble systems are established as standard tools to estimate and describe prediction uncertainties. Deterministic numerical weather predictions (NWPs) are intrinsically limited by the chaotic nature of the atmospheric dynamics. Already in the 1960s, Lorenz (1963) demonstrated in a seminal study that small errors in the initial conditions of a weather forecast can grow rapidly, leading to highly diverging solutions. In order to estimate predictability, much research has been undertaken to develop probabilistic forecasting methodologies (see the reviews by Ehrendorfer, 1997 and Palmer, 2000). In recent years, several studies have been devoted to the regional scales using limited-area ensembles, in particular for forecasting heavy precipitation events (e.g. Stensrud et al., 2000; Walser et al., 2004). Motivated by the reported results, initiatives for operational limited-area ensemble prediction systems (EPSs) have emerged, e.g. the SRNWP-PEPS (Quiby and Denhard, 2003) and COSMO-LEPS (Marsigli et al., 2005). It is nowadays common to apply atmospheric EPS as a forcing in operational flood-forecasting systems (Siccardi et al., 2005; Verbunt et al., 2007; Bartholmes et al., 2009; see Cloke and Pappenberger, 2009 for a review).

One of the advantages all meteorological ensemble approaches have in common is the simple interface with hydrological impact models. Each member of the ensemble can be fed into the hydrological model to generate a forecast. The spread arising from the outcomes of all members represents the sensitivity of the hydrological system to the meteorological ensemble. Recently, ensemble techniques have been proposed to quantify uncertainties in observing systems (Collier, 2007), such as radar precipitation estimation and nowcasting (e.g. Berenguer et al., 2005; Bowler et al., 2006; Szturc et al., 2008; Lee et al., 2009; Germann et al., 2009), pluviometer-based ensembles (Ahrens and Jaun, 2007; Villarini and Krajewski, 2008; Moulin et al., 2009; Pappenberger et al., 2009), or satellite rainfall retrieval (e.g. Bellerby and Sun, 2005; Clark and Slater, 2006). In addition, the use of observation-based ensembles allows obtaining a hydrologically consistent ensemble of initial conditions for simulations coupled with atmospheric EPS.

Fig. 1. Main sources of uncertainties propagating and superposing through a hydrological model in hydrometeorological forecasting chains.

The hydrological model uncertainty is a further measure that needs to be accounted for and communicated in hydrological forecasting. The problem of parameter estimation and equifinality is not a prerogative of hydrology (Beven, 1993, 2006; Beven and Freer, 2001; Vrugt et al., 2003; Pappenberger and Beven, 2006), but is a common issue in environmental modelling (see Matott et al., 2009 for a review).

This paper describes an experimental flood-forecasting chain emerging from the joint activities of the MAP D-PHASE project (Rotach et al., 2009) and the COST action 731 (Rossa et al., 2011-this issue). A novel aspect of our study is the superposition (or “cascading”, Pappenberger et al., 2005) of the “uncertainty triplet” described above. To summarize, we will:

- Propagate COSMO-LEPS (Section 2.3) and the radar ensemble fields from REAL (Germann et al., 2009; Section 2.2) through the hydrological model PREVAH (Viviroli et al., 2009a; Section 2.1)

- Estimate the uncertainty of PREVAH tunable parameters by Monte Carlo sampling and select different parameter sub-samples (Section 3.2)

- Define different experimental settings for superposing the uncertainties from PREVAH, REAL and COSMO-LEPS (Section 3.4)

- Quantify uncertainty and express it as average spread for a forecast period of 120 hours, as defined by the lead time of COSMO-LEPS forecasts (Section 3.5).

As experimental area, the Swiss Verzasca river basin (186 km², Section 3.1) has been selected. This was the authors' main test bed during MAP D-PHASE. Data are available since the beginning of the MAP D-PHASE demonstration period in June 2007.


Our main goal is to estimate the different magnitudes of spread generated by our particular definitions of input uncertainties (REAL and COSMO-LEPS), initial conditions uncertainties (REAL for estimating initial conditions before feeding COSMO-LEPS into PREVAH) and hydrological model uncertainties (use of different sets of calibrated parameters). As a further goal, we want to identify how spread grows when different sources of uncertainty are superposed.
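The size of the fully superposed ensemble quoted in the abstract (10,400 members) follows from combining every NWP member with every radar member and every parameter set. The following minimal Python sketch only illustrates this bookkeeping; the function name `superposed_members` and the index-tuple representation are assumptions made for illustration and are not part of the forecasting chain described here.

```python
from itertools import product

# Ensemble sizes quoted in the abstract.
N_COSMO_LEPS = 16  # NWP forecast members
N_REAL = 25        # radar ensemble members
N_PREVAH = 26      # equifinal hydrological parameter sets


def superposed_members(n_nwp, n_radar, n_param):
    """Enumerate every (NWP, radar, parameter-set) index combination.

    Hypothetical helper: each tuple identifies one hydrological run
    of the fully superposed ensemble.
    """
    return list(product(range(n_nwp), range(n_radar), range(n_param)))


members = superposed_members(N_COSMO_LEPS, N_REAL, N_PREVAH)
print(len(members))  # 10400
```

Sub-samples that isolate one or two sources of uncertainty could be obtained in the same representation by fixing one or two of the three indices.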

2. Methods

2.1. The operational hydrological model PREVAH

We adopt the semi-distributed hydrological catchment modelling system PREVAH (Precipitation-Runoff-Evapotranspiration HRU Model; Viviroli et al., 2009a), which has been developed to improve the understanding of the spatial and temporal variability of hydrological processes in catchments with complex topography. A review of previous work with PREVAH is presented in Viviroli et al. (2009a), which also thoroughly introduces the model physics, parameterizations and pre- and post-processing tools.

Besides its application for investigating water resources in mountainous basins (Zappa et al., 2003; Zappa and Kan, 2007; Koboltschnig et al., 2009), PREVAH has recently been used more and more in quasi-operational hydrological applications and re-forecasts of flooding events in Switzerland. Verbunt et al. (2006) presented an indirect verification of deterministic quantitative precipitation forecasts (QPF) for the river Rhine. Verbunt et al. (2007) and Jaun et al. (2008) presented case studies on coupling PREVAH with the ensemble numerical weather prediction system COSMO-LEPS. Jaun and Ahrens (2009) verified a two-year reforecast experiment of the PREVAH/COSMO-LEPS forecasting chain for the Swiss Rhine basin. Romang et al. (2011) introduced the application of PREVAH for early flood warning in Swiss mesoscale basins. Here, PREVAH is adopted as a “hydrological engine” for superposing three sources of uncertainty (Fig. 1).

2.2. Dealing with uncertainties within operational weather radar systems

In the past decade, MeteoSwiss, the Swiss Federal Office of Meteorology and Climatology, developed and implemented a series of sophisticated algorithms to obtain best estimates of surface precipitation rates over Switzerland using a radar network (Germann et al., 2006). In spite of significant improvements, the residual uncertainty is still relatively large. A novel and promising solution to express this residual uncertainty is to generate an ensemble of radar precipitation fields by combining stochastic simulations and detailed knowledge of the radar signal error structure. The method is called REAL, which stands for Radar Ensemble generator designed for usage in the Alps using LU decomposition (Germann et al., 2009).

In REAL, the original (deterministic) radar precipitation field (1×1 km2 resolution) is perturbed with a stochastic component, which has the same mean and the same covariance structure in space and time as the radar errors. In a first step, the mean and covariance structure of the radar errors are determined by comparing radar estimates with rain gauge measurements. Radar errors are defined as the logarithm of the ratio of the true (unknown) precipitation value to the radar estimate. This is a reasonable definition given the fact that most radar errors are actually multiplicative (Germann et al., 2006). In a second step, REAL generates a number of perturbation fields using singular value decomposition of the radar error covariance matrix, stochastic simulation using the LU decomposition algorithm, and autoregressive filtering. Each ensemble member is a possible realization of the unknown true precipitation field time series given the radar reflectivity measurements and the radar error covariance matrix. For the complete mathematical derivation of REAL we refer to Germann et al. (2009).
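The core idea of the second step, drawing correlated perturbations from a factorized error covariance matrix and applying them multiplicatively, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the REAL implementation: it uses a Cholesky factorization (the symmetric special case of LU decomposition), assumes a natural logarithm in the error definition, treats a single time step without the autoregressive filtering, and all numerical values are invented for the example.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L * L^T == cov (cov symmetric positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def perturb_field(radar, mean_err, cov_err, rng):
    """One ensemble member: radar * exp(delta), with delta ~ N(mean_err, cov_err).

    Assumes the error is ln(true / radar), so the perturbation is multiplicative.
    """
    low = cholesky(cov_err)
    z = [rng.gauss(0.0, 1.0) for _ in radar]  # independent standard normals
    delta = [
        m + sum(low[i][k] * z[k] for k in range(i + 1))
        for i, m in enumerate(mean_err)
    ]
    return [r * math.exp(d) for r, d in zip(radar, delta)]


# Hypothetical three-pixel field; the numbers are illustrative only.
radar = [2.0, 5.0, 0.0]  # mm/h
mean_err = [0.1, 0.1, 0.1]
cov_err = [
    [0.25, 0.10, 0.05],
    [0.10, 0.25, 0.10],
    [0.05, 0.10, 0.25],
]
rng = random.Random(42)
ensemble = [perturb_field(radar, mean_err, cov_err, rng) for _ in range(25)]
```

Because the perturbation is multiplicative, pixels without radar precipitation stay dry in every member of this sketch.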

A prototype ensemble generator has been implemented as part of MAP D-PHASE and COST-731 and has been running in real time in automatic mode since spring 2007. The ensemble of precipitation field time series from REAL consists of 25 members; it is updated operationally every 60 min and propagated through PREVAH.

2.3. Quantification of uncertainty from ensemble NWP-systems

Early identification of severe long-lasting rainfall events within the next five days is obtained from the Limited-area Ensemble Prediction System of the COnsortium for Small-scale MOdelling, COSMO-LEPS (Marsigli et al., 2005). In the current configuration, COSMO-LEPS provides once a day a 16-member ensemble forecast with 132 hours lead time for large parts of Europe. COSMO-LEPS is initialized at 12:00 UTC; the first 12 forecast hours are not used because of misrepresentations during model spin-up. Initial and boundary conditions are taken from the European Centre for Medium-Range Weather Forecasts EPS (Molteni et al., 1996). The horizontal grid spacing of COSMO-LEPS is 10×10 km2, which is rather coarse for the small Verzasca basin, but due to the high computational costs ensemble forecasts with higher resolutions are not yet available for the medium range. Six meteorological surface variables (air temperature, precipitation, humidity, wind, sunshine duration derived from cloud cover, and global radiation) are obtained from the ensemble NWP and downscaled for hydrological modelling. The setup adopted for downscaling information from COSMO-LEPS for hydrological applications is the same as presented in Jaun et al. (2008) and relies on bilinear interpolation. Air temperature is adjusted according to elevation by adopting a constant lapse rate of 0.65 °C per 100 m.
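The elevation adjustment of air temperature and the usable forecast length can be illustrated with a short Python sketch. The function name and the example elevations are hypothetical; only the lapse rate, the 132-hour lead time and the 12 discarded spin-up hours are taken from the text.

```python
LAPSE_RATE_C_PER_M = 0.65 / 100.0  # 0.65 °C per 100 m, as stated in the text

# 132 h lead time minus the 12 spin-up hours that are not used.
USABLE_LEAD_TIME_H = 132 - 12


def adjust_temperature(t_model_c, z_model_m, z_target_m):
    """Shift a temperature from the model-grid elevation to a target elevation.

    Temperature decreases with height, so a target above the model
    elevation receives a lower temperature.
    """
    return t_model_c - LAPSE_RATE_C_PER_M * (z_target_m - z_model_m)


# Hypothetical example: 10 °C at a 1500 m grid cell, target cell at 2500 m.
print(round(adjust_temperature(10.0, 1500.0, 2500.0), 2))  # 3.5
print(USABLE_LEAD_TIME_H)  # 120
```

The 120 usable hours match the five-day forecast window used for the spread statistics.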

3. Experimental design

3.1. Study area

The Verzasca basin has an area of 186 km² up to the main gauge in Lavertezzo (Fig. 2). The basin is located in the southern part of Switzerland and is little affected by human activities. Its elevation range is 490–2870 m a.s.l. Forests (30%), shrub (25%), rocks (20%) and alpine pastures (20%) are the predominant land cover classes. Soils are rather shallow (generally less than 30 cm deep) and the plant available field capacity is below 5% volume. The discharge regime is governed by snowmelt in spring and early summer and by heavy rainfall events in fall (Ranzi et al., 2007). The river is rather prone to flash floods (Wohling et al., 2006) and flows into the “Lago di Vogorno”, an artificial reservoir maintained by a private hydropower company.

Fig. 2. Situation map of the Verzasca river basin in southern Switzerland including hydrometric (FOEN) and meteorological networks (MeteoSwiss and UCA). Additionally, the location of the Monte Lema weather radar, a few kilometers south of the basin, is displayed. Graphic elements reproduced by kind authorization of “swisstopo” (JA022265) and BFS GEOSTAT.

The hydrological properties of the catchment are derived from gridded maps of elevation, land use, land cover and soil properties (Gurtz et al., 1999), which are available at 100×100 m2 resolution. For the present application a resolution of 500×500 m2 is generated prior to the delineation of hydrological response units (Viviroli et al., 2009a). The runoff gauging station at the catchment outlet is maintained by the Swiss Federal Office for Environment, which provides data at 10 min resolution operationally. Flood peaks at Lavertezzo may exceed 600 m3 s−1 (~3.2 m3 s−1 km−2). Base flow in winter can be less than 1 m3 s−1.
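The specific discharge quoted in parentheses follows directly from the peak flow and the basin area. As a worked check, with the conversion to an equivalent runoff depth added here purely for orientation:

```latex
q = \frac{Q_{\mathrm{peak}}}{A}
  = \frac{600\ \mathrm{m^3\,s^{-1}}}{186\ \mathrm{km^2}}
  \approx 3.2\ \mathrm{m^3\,s^{-1}\,km^{-2}}

q \approx 3.23\ \mathrm{m^3\,s^{-1}\,km^{-2}}
  \times \frac{3600\ \mathrm{s\,h^{-1}} \times 1000\ \mathrm{mm\,m^{-1}}}{10^{6}\ \mathrm{m^2\,km^{-2}}}
  \approx 11.6\ \mathrm{mm\,h^{-1}}
```

A peak of this size therefore corresponds to roughly 11.6 mm of runoff per hour averaged over the whole basin.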

The operational meteorological forcing is obtained from several sources. MeteoSwiss maintains a network of automatic stations providing a detailed set of meteorological variables with a sampling interval of up to 10 min (Fig. 2). The administration of the Canton Ticino (UCA Ct. Ticino on Fig. 2) maintains an additional network of pluviometers, which samples the precipitation data in real time with a temporal resolution of 30 min. One of the latter is the only automatic pluviometer within the basin. Furthermore, weather radar precipitation fields are available (Section 2.2).

3.2. Consideration of hydrological uncertainty

The initial setup and calibration of the hydrological model was based on previous applications in the Verzasca river basin (Wohling et al., 2006; Ranzi et al., 2007). The default calibration used is focused on the identification of a single parameter set with the highest performance in the simulation of the average flows and with the smallest volume error between observed and simulated time series (Zappa and Kan, 2007; Viviroli et al., 2009a). Since the target of this study is the quantification of uncertainty propagation in hydrometeorological flood forecasting chains, only seven parameters relevant for surface runoff generation were allowed to change randomly during the Monte Carlo (MC) experiment (Table 1). The identification of these seven sensitive parameters relies on experience (Zappa, 2002), on consideration of the model structure (Gurtz et al., 2003) and on targeted sensitivity studies on flood peak calibration (Viviroli et al., 2009b). Table 1 indicates the basic value of the seven parameters after the default calibration and the ranges allowed for parameter sampling during the MC experiment. Further uncertainties linked to the parameters controlling snow accumulation, snow melting and base flow have been disregarded. A total of 2527 MC runs were computed for the period 1996–2001, whereby the year 1996 was only used as a spin-up year. Note that we are not addressing the full predictive uncertainty of the forecasting chain as defined in Draper (1995) and Todini (2009); we focus on the parameter uncertainty as obtained by selecting equifinal realizations from the MC experiment, as well as on the observation and algorithm uncertainty represented by the ensemble methods for the NWP and the radar systems (see above). However, for the model chain used, the obtained uncertainty is the best available estimate of the full predictive uncertainty, and our hydrological experiments, which rely on assessing different sets of model parameters to fit past observations, provide a practicable way to quantify how parameter uncertainty might contribute to the full predictive uncertainty of the system (Fig. 1).

Table 1
Definition of model parameters allowed to vary in the Monte Carlo runs. The “default” parameters are the result of a standard calibration procedure (Viviroli et al., 2009a). The random sampling of the parameters was limited to values included in the interval defined by MCMin and MCMax.

| Symbol | Parameter | Unit | Default | MCMin | MCMax |
|---|---|---|---|---|---|
| Pcorr | Rainfall adjustment (a) | [%] | 12.8 | 0.0 | 30.0 |
| Scorr | Snow adjustment (a) | [%] | 37.4 | 20.0 | 50.0 |
| BETA | Soil moisture recharge exponent | – | 3.8 | 3.0 | 6.0 |
| SGR | Threshold for surface runoff | mm | 41 | 30 | 50 |
| K0 | Storage coefficient for surface runoff | h | 21 | 10 | 30 |
| K1 | Storage coefficient for interflow | h | 127 | 100 | 150 |
| PERC | Deep percolation | mm h−1 | 0.153 | 0.10 | 0.20 |

(a) The two parameters controlling the bias adjustment of the precipitation input (rain or snow) are only used if the hydrological model is fed by interpolated pluviometer data. Although the NWP models and the precipitation estimates from the weather radar contain systematic errors, it was decided to avoid bias corrections (Verbunt et al., 2006).
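The random sampling of the seven parameters within the bounds of Table 1 can be sketched as follows. The text does not state the sampling distribution, so independent uniform draws are an assumption of this sketch, and the function name is hypothetical.

```python
import random

# (MCMin, MCMax) bounds taken from Table 1.
PARAM_RANGES = {
    "Pcorr": (0.0, 30.0),    # rainfall adjustment [%]
    "Scorr": (20.0, 50.0),   # snow adjustment [%]
    "BETA": (3.0, 6.0),      # soil moisture recharge exponent [-]
    "SGR": (30.0, 50.0),     # threshold for surface runoff [mm]
    "K0": (10.0, 30.0),      # storage coefficient, surface runoff [h]
    "K1": (100.0, 150.0),    # storage coefficient, interflow [h]
    "PERC": (0.10, 0.20),    # deep percolation [mm/h]
}


def sample_parameter_sets(n_runs, rng):
    """Draw n_runs parameter sets, each parameter uniform within its bounds.

    Assumption: independent uniform sampling; the source only states that
    the random sampling was limited to [MCMin, MCMax].
    """
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}
        for _ in range(n_runs)
    ]


runs = sample_parameter_sets(2527, random.Random(0))
```

Each dictionary in `runs` would then drive one model run over the calibration period.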

The decision whether a model run is behavioural or not is based on a subjective choice of likelihood function(s) (Beven, 1993; Madsen, 2000, 2003; Viviroli et al., 2009b; Bosshard and Zappa, 2008). As the goal of the modelling experiments is the estimation of flood peaks, two goodness-of-fit measures focused on peak discharge have been computed for each MC realization. As a first measure, the well-known Nash and Sutcliffe (1970) efficiency (NSE) is used:

$$\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{n} \left| Q_t - q_t \right|^2}{\sum_{t=1}^{n} \left| Q_t - \bar{Q} \right|^2}, \qquad \mathrm{NSE} \in (-\infty, 1] \tag{1}$$

where $Q_t$ is the observed hourly runoff at time step $t$, $\bar{Q}$ the average of observed runoff, $q_t$ the simulated runoff at time step $t$ and $n$ the number of time steps. NSE quantifies the relative improvement of the model compared to the mean of the observations. NSE is adequate for our present application, since it is particularly sensitive to high flows. Its use is less advisable for studies focussed on obtaining the best calibrated values for both high and low flows (Legates and McCabe, 1999; Schaefli and Gupta, 2007).

In addition to NSE, a second function is used. Lamb (1999) and Viviroli et al. (2009b) introduce and discuss several scores for obtaining tailored parameter sets for flood-peak estimations. One of them is the sum of weighted absolute errors (SWAE), which is defined as:

$$\mathrm{SWAE} = \sum_{t=1}^{n} \left( Q_t^{a} \left| Q_t - q_t \right| \right), \qquad \mathrm{SWAE} \in [0, \infty) \tag{2}$$

A value of $a = 1.5$ was used, as proposed by Lamb (1999) for the evaluation of peak flow conditions. Behavioural simulations show a lower SWAE.

The 2527 MC runs (Fig. 3) were ranked according to their performance in the defined calibration period. As a compound measure of performance, a weighted product of NSE (weight = 3) and SWAE (weight = 1) was adopted to build a single score $L_i$:

$$L_i = \left( \frac{\mathrm{NSE}_i}{\mathrm{NSE}_{\mathrm{AVG}}} \right)^{3} \cdot \frac{\mathrm{SWAE}_{\mathrm{AVG}}}{\mathrm{SWAE}_i} \tag{3}$$

An MC realization $i$ with $\mathrm{NSE}_i$ above the average $\mathrm{NSE}_{\mathrm{AVG}}$ of all realizations and $\mathrm{SWAE}_i$ lower than the average $\mathrm{SWAE}_{\mathrm{AVG}}$ of all realizations will be ranked higher than MC runs showing the opposite behaviour with respect to the average NSE and SWAE. The analysis of the MC runs showed that SWAE varied between 3500 and 6000, while NSE ranged from 0.71 to 0.84 (Fig. 3). Finally, an $L_i$ range between 0.5 and 1.34 was obtained for all runs.

Fig. 3 shows a dot-plot of all MC realizations with NSE on the y-axis and SWAE on the x-axis. The obtained pattern allows for a visual discrimination between realizations with higher and lower performances, with the best realizations being in the upper-left region of the dot-plot. For the analysis in the remaining sections of the paper, three sub-samples of 26 parameter sets each were isolated by ranking all realizations according to $L_i$. The first sub-sample consists of the best 26 realizations (99.5%; $L_i$: 1.289/1.339). The second sub-sample collects the 26 sets around the 95% ranking ($L_i$: 1.238/1.246). The third sub-sample is a selection of 26 runs around the 80% ranking ($L_i$: 1.152/1.157). Table 2 displays some statistical measures for the three sub-samples of 26 parameter sets. Except for the storage coefficient controlling the generation of interflow, K1, the 26 runs with the highest performance present the lowest standard deviation within the sub-sample for all seven tuneable parameters. The highest variability is computed within the 80% sub-sample.

Fig. 3. Dot-plot of the 2527 Monte Carlo realizations for the application of PREVAH in the Verzasca river basin during the calibration period 1996–2001. NSE and SWAE are used to select three sub-samples (99.5%, 95% and 80%) of acceptable parameter sets consisting of 26 realizations each.

Table 2
Summary of the three parameter sets of 26 members each after inferring the Monte Carlo simulations (see text for details). The numbers give the median (Med.) and standard deviation (St.Dev.) of the seven parameters that were randomly varied. “Range” indicates the ratio between St.Dev. and the width of the interval allowed for each parameter (Table 1).

| Symbol | Unit | MOD_99.5% Med./St.Dev./Range | MOD_95% Med./St.Dev./Range | MOD_80% Med./St.Dev./Range |
|---|---|---|---|---|
| Pcorr | [%] | 11.54/2.8/0.09 | 14.3/5.1/0.17 | 19.3/6.6/0.22 |
| Scorr | [%] | 32.1/7.2/0.24 | 29.6/9.2/0.31 | 33.6/9.5/0.32 |
| BETA | – | 4.6/0.87/0.29 | 4.5/0.82/0.27 | 4.1/1.03/0.34 |
| SGR | mm | 33.2/3.6/0.18 | 39.1/5.1/0.26 | 38.1/5.8/0.29 |
| K0 | h | 12.7/0.97/0.05 | 11.8/2.0/0.1 | 15.6/2.4/0.12 |
| K1 | h | 127/14.5/0.29 | 122/15.6/0.31 | 129/12.8/0.26 |
| PERC | mm h−1 | 0.11/0.013/0.13 | 0.13/0.022/0.22 | 0.14/0.031/0.31 |
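The two goodness-of-fit measures, the compound score and the selection of 26 runs around a ranking percentile can be sketched in Python as follows. The scoring functions follow Eqs. (1) to (3); the windowing rule in `window_around_percentile` is only one plausible reading of "26 sets around the 95% ranking" and is an assumption of this sketch, as are all function names.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency, Eq. (1)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def swae(obs, sim, a=1.5):
    """Sum of weighted absolute errors, Eq. (2); errors weighted by obs**a."""
    return sum(o ** a * abs(o - s) for o, s in zip(obs, sim))


def compound_scores(nse_values, swae_values):
    """Score L_i of Eq. (3) for every run, relative to the all-run averages."""
    nse_avg = sum(nse_values) / len(nse_values)
    swae_avg = sum(swae_values) / len(swae_values)
    return [
        (n / nse_avg) ** 3 * (swae_avg / s)
        for n, s in zip(nse_values, swae_values)
    ]


def window_around_percentile(scores, percentile, size=26):
    """Indices of `size` runs centred on a rank percentile (100 = best score).

    Assumed rule: sort ascending by score, centre the window on the rank
    closest to the percentile, and clip it to the available runs.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    centre = round(percentile / 100.0 * len(order))
    start = min(max(centre - size // 2, 0), len(order) - size)
    return order[start:start + size]


# Hypothetical hourly runoff series (m3/s), for illustration only.
observed = [1.0, 2.0, 10.0, 4.0]
simulated = [1.0, 3.0, 8.0, 4.0]
print(round(nse(observed, simulated), 3))  # 0.897
```

With 2527 runs, this windowing rule returns exactly the 26 best-scoring runs for the 99.5% sub-sample, which agrees with the description of the first sub-sample above.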

3.3. The selected peak-flow events

All experiments rely on a long-term simulation with PREVAH using the basic parameter calibration (Table 1). Initial conditions for September 1st 2005 (Fig. 4) are generated by a reference run using interpolated observed pluviometer data. This reference run, starting on January 1st 1996, was obtained from an offline meteorological database, which also includes stations that are not available in real time. Starting from September 1st 2005, a second long-term simulation relying on operationally available data only has been run to produce initial conditions for March 1st 2007. This run used precipitation data from the operational pluviometers operated by MeteoSwiss and the river network administration of the Canton of Ticino (Fig. 2) as input. From March 1st 2007, operational time series of radar QPE and REAL are also available.

The time frame for the implementation of PREVAH in operational mode was chosen in order to have good initial conditions for the MAP D-PHASE demonstration period. In the period from March 1st 2007 to November 23rd 2008, seven events with peak runoff ranging between 77 and 541 m3 s−1 have been identified (Table 3). The return period of the highest flood peak in the considered period, on September 7th 2008, is approximately 5 years on the basis of extreme value statistics of a time series starting in 1990 and having an average yearly flood of 385 m3 s−1. The cumulative precipitation in the period previous to the seven events is also indicated in Table 3, both as spatially interpolated pluviometer data (areal precipitation estimate with inverse distance weighting interpolation) and as assimilated QPE from the weather radar. It can be observed that the event on October 29th 2008 occurred after a relatively dry antecedent period, while in the days and weeks previous to the August 22nd 2007 event over 150 mm of rainfall was estimated for the Verzasca basin. The antecedent cumulative precipitation for the November 5th 2008 event includes the precipitation event that triggered the October 29th 2008 peak flow.

Fig. 4. Design of the seven experiments run for quantification of uncertainty superposition. The time window for the statistics is defined by the lead time of the C-LEPS forecasts.

Table 3
Accumulated precipitation during the five days previous to the seven peak-flow events investigated. The column “Day-10/-20” declares the moment where initial conditions from a deterministic run are stored in order to trigger experiments on uncertainty propagation and superposition (Fig. 4). The list of the used COSMO-LEPS forecasts is sorted by lead time in days before the event. Columns Day-5 to Day-0 give the initialization (month/day) of the 120-hour COSMO-LEPS forecasts.

| Event (year/month/day) | Peak runoff [m3 s−1] | Day-10/-20 (month/day) | Accumulated precipitation until day-5, pluviometers [mm] | Accumulated precipitation until day-5, radar [mm] | Day-5 | Day-4 | Day-3 | Day-2 | Day-1 | Day-0 |
|---|---|---|---|---|---|---|---|---|---|---|
| 2007/08/22 | 100.7 | 08/01 | 151 | 153 | | | 08/19 | 08/20 | 08/21 | |
| 2008/07/07 | 80.3 | 06/28 | 10 | 38 | | | 07/04 | 07/05 | 07/06 | |
| 2008/07/13 | 163.0 | 06/28 | 87 | 113 | | | 07/10 | 07/11 | | |
| 2008/08/15 | 76.9 | 07/20 | 45 | 98 | 08/10 | 08/11 | 08/12 | 08/13 | 08/14 | |
| 2008/09/07 | 541.0 | 08/20 | 28 | 29 | | | 09/04 | | | |
| 2008/10/29 | 210.6 | 10/09 | 9 | 4 | | | | 10/27 | 10/28 | 10/29 |
| 2008/11/05 | 157.5 | 10/09 | 219 | 186 | | | | 11/03 | 11/04 | |

Table 4
Average ensemble spread (q100–q0) in m3 s−1 for seven peak-runoff events when adopting different sets of model parameter realizations and either pluviometers (MOD_%/PLUV) or weather radar QPE (MOD_%/RAD) as precipitation forcing.

| Event (year/month/day) | Initialization (month/day) | MOD_99.5%/PLUV | MOD_95%/PLUV | MOD_80%/PLUV | MOD_95%/RAD |
|---|---|---|---|---|---|
| Members | – | 26 | 26 | 26 | 26 |
| 2007/08/22 | 08/20 | 10.3 | 14.2 | 18.5 | 9.3 |
| 2008/07/07 | 07/05 | 4.9 | 8.3 | 10.4 | 6.5 |
| 2008/07/13 | 07/11 | 5.3 | 7.2 | 10.0 | 9.5 |
| 2008/08/15 | 08/12 | 5.9 | 10.3 | 11.7 | 8.3 |
| 2008/09/07 | 09/04 | 22.8 | 30.0 | 38.9 | 30.4 |
| 2008/10/29 | 10/27 | 9.9 | 14.6 | 19.6 | 10.8 |
| 2008/11/05 | 11/03 | 10.8 | 13.9 | 18.8 | 9.9 |
| Average [m3 s−1] | | 10.0 | 14.1 | 18.3 | 12.1 |
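The spread measure reported in Table 4 can be illustrated with a short Python sketch. It assumes that q100 and q0 denote the largest and smallest member value at each time step and that the spread is averaged over all time steps of the forecast window; the function name and the toy ensemble are hypothetical, while the list of event spreads is copied from the MOD_95%/PLUV column of Table 4.

```python
def average_spread(ensemble):
    """Mean over time of (max member - min member), read here as q100 - q0.

    `ensemble` is a list of members; each member is a list of runoff
    values (m3/s) on a common time axis.
    """
    n_steps = len(ensemble[0])
    spreads = [
        max(member[t] for member in ensemble) - min(member[t] for member in ensemble)
        for t in range(n_steps)
    ]
    return sum(spreads) / n_steps


# Toy ensemble: three members, three time steps (illustrative values only).
toy = [
    [10.0, 20.0, 30.0],
    [12.0, 26.0, 28.0],
    [9.0, 22.0, 35.0],
]
print(round(average_spread(toy), 2))  # 5.33

# Event-wise spreads from the MOD_95%/PLUV column of Table 4.
mod95_pluv = [14.2, 8.3, 7.2, 10.3, 30.0, 14.6, 13.9]
print(round(sum(mod95_pluv) / len(mod95_pluv), 1))  # 14.1
```

The second print reproduces the column average given in the last row of Table 4.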

3.4. The seven experiments towards estimation of uncertainty superposition

The availability of several different data sets of deterministic and probabilistic precipitation measurements and forecasts and the identification of sets of hydrological model parameters allow the computation of uncertainty superposition. Seven different experiments (Fig. 4 and Table 3) have been completed:

1) MOD/PLUV: in this experiment the simulations from March 1st 2007 have been continued until November 25th 2008 with the same configuration used since September 1st 2005. No sources of uncertainty were considered. During the simulation a series of model starting points were stored 10 to 20 days ahead of a major discharge event (see Section 3.3 and Table 3). The timing for saving the initial conditions was chosen in order to guarantee that almost only base-flow is contributing to the discharge at initialization and that a minor rainfall event is included in the time span between the model's restart point and the peak-flow event. In a second stage a temporally nested simulation was run starting from the defined initialization date (Day-10/-20, Table 3) until 10 to 15 days after the event. For the nested sub-period 3 times 26 model runs were performed (Figs. 3 and 4 and Table 2).

2) MOD/RAD: this experiment is identical to the MOD/PLUV experiment, with the only change that the precipitation forcing is obtained from the weather radar (see Section 2.2). Also in this case model runs for temporally nested sub-periods in correspondence to peak-flow events were run by accounting for the uncertainty in the determination of calibrated model parameters. It is important to note here that in the case of simulations forced with radar data (either deterministic or estimated with REAL) no bias in rainfall and snowfall is accounted for. The radar QPE is already corrected for biases during the pre-processing (Germann et al., 2006). Therefore, the two parameters of PREVAH controlling such corrections are set to 0% (Table 2).

3) REAL: in this experiment only the uncertainty arising from the weather radar QPE is accounted for. 25 ensemble members from the radar ensemble generator (Section 2.2) are used to force PREVAH. The initial conditions at initialization of the nested runs are obtained from the MOD/RAD experiment, being forced with the deterministic radar QPE since March 1st 2007 (Fig. 4).

4) REAL/MOD: this experiment is the first one where uncertainty superposition is considered. To reduce the computational effort, only one of the 3 parameter sub-sets is accounted for (see Section 4.1), namely the 95% sub-set (Table 2), which includes the 26 model runs being ranked in the top 94.5% to 95.5% among the 2527 MC realizations (see Section 3.2). In detail: for each nested period 25 (REAL)×26 (MOD_95%) runs were completed in order to estimate the interaction between the radar QPE and the uncertainties of the hydrological model. Also in this case restart points for the hydrological model were saved for later initialization of probabilistic forecasts with COSMO-LEPS (see below and Table 4).

5) LEPS: in this experiment only the uncertainty arising from feeding PREVAH with the 16 COSMO-LEPS ensemble members is accounted for. The initial conditions at initialization of COSMO-LEPS forecasts (Table 4 and Fig. 4) are obtained from the model being forced with the deterministic radar QPE since March 1st 2007 (Fig. 4). A total of 19 COSMO-LEPS 5-day forecasts for the Verzasca river basin were selected (Table 3). COSMO-LEPS QPF are not bias corrected (Table 2).

6) LEPS/MOD: in this experiment both the uncertainty of the model parameters and of the NWP forecasts are considered. In detail: for each COSMO-LEPS initialization point 16 (COSMO-LEPS)×26 (MOD_95%) runs were completed. This gives an ensemble of 416 5-day forecasts.

7) FULL: the final experiment is the combination of REAL/MOD and LEPS/MOD. For each of the 19 COSMO-LEPS ensemble forecasts 650 different initial conditions are available from the superposition of REAL with the model parameter uncertainty (see above). Thus, 650 (REAL/MOD)×16 (COSMO-LEPS) runs were computed for all 19 forecasts (Fig. 4 and Table 3). An overall ensemble of 10,400 members results for evaluating and quantifying uncertainty superposition by simultaneous consideration of uncertainties in the QPE (REAL), in the NWP forecasts (COSMO-LEPS) and in the determination of the parameters of the hydrological model (MOD_95%).
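The ensemble sizes of the superposition experiments follow from simple Cartesian products of the member lists. The sketch below only illustrates this bookkeeping; the member indices are placeholders (an assumption made for illustration), not the actual forcing files or parameter sets of the forecasting chain.

```python
from itertools import product

# Ensemble sizes as stated in Section 3.4.
N_REAL = 25   # radar ensemble members (REAL)
N_MOD = 26    # parameter realizations in the MOD_95% sub-set
N_LEPS = 16   # COSMO-LEPS members

# Placeholder member indices (hypothetical labels for illustration only).
real_members = range(N_REAL)
mod_members = range(N_MOD)
leps_members = range(N_LEPS)

# REAL/MOD: every radar member combined with every parameter realization.
real_mod = list(product(real_members, mod_members))
# LEPS/MOD: every COSMO-LEPS member combined with every parameter realization.
leps_mod = list(product(leps_members, mod_members))
# FULL: every REAL/MOD initial condition is continued with every LEPS member.
full = [(r, m, k) for (r, m) in real_mod for k in leps_members]

print(len(real_mod), len(leps_mod), len(full))
```

With these sizes the three lists contain 650, 416 and 10,400 entries, matching the member counts given for REAL/MOD, LEPS/MOD and FULL.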

3.5. Quantification of uncertainty

We aim at quantifying the propagation and superposition of uncertainty when forcing PREVAH with different meteorological time series and different configurations of its tunable parameters. In all experiments a time frame of 120 hours is evaluated (Fig. 4). The time frame is defined by the initialization time of the COSMO-LEPS forecast used. We assume that the average spread of the simulated ensemble hydrographs is related to the uncertainty of the experimental settings used. To allow intercomparison between experiments, all statistics have been computed for the same 120-hour period. We take the average of the ensemble quantiles during the 120 hours as an objective measure for quantifying the uncertainty. Prior to the averaging, quantiles ($q_{\%}^{i}$) are determined for each of the 120 hours being evaluated. Eq. (4) defines the computation of the average of quantiles $\bar{q}_{\%}$ for the defined time frame:

$$\bar{q}_{\%} = \frac{1}{n}\sum_{i=1}^{n} q_{\%}^{i}, \qquad n = 120 \text{ time steps} \tag{4}$$

$\bar{q}_{\%}$ denotes the “average quantile” of discharge during n=120 time steps. $\bar{q}_{\%}$ has been computed for the levels 0%, 25%, 50% (the median), 75% and 100%. The average inter-quartile range IQR can be obtained by subtracting q25 from q75, while the average range of spread is computed by subtracting q0 from q100.
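As a minimal sketch of Eq. (4), the snippet below computes the ensemble quantiles at every time step and then averages each quantile level over the time frame. The toy ensemble and the function names are hypothetical, and the quartile definition (linear interpolation, `method="inclusive"`) is an assumption, since the text does not state which quantile estimator was used.

```python
from statistics import mean, quantiles

def step_quantiles(members_at_step):
    """Return (q0, q25, q50, q75, q100) across ensemble members for one time step."""
    q25, q50, q75 = quantiles(members_at_step, n=4, method="inclusive")
    return (min(members_at_step), q25, q50, q75, max(members_at_step))

def average_quantiles(ensemble):
    """Eq. (4): average each quantile level over all n time steps.

    `ensemble` is a list of hydrographs, one list of discharges per member,
    all of the same length n.
    """
    n = len(ensemble[0])
    per_step = [step_quantiles([h[i] for h in ensemble]) for i in range(n)]
    return tuple(mean(step[k] for step in per_step) for k in range(5))

# Toy ensemble: 5 members, 3 time steps (the paper uses n = 120).
toy = [
    [10.0, 20.0, 30.0],
    [12.0, 24.0, 36.0],
    [14.0, 28.0, 42.0],
    [16.0, 32.0, 48.0],
    [18.0, 36.0, 54.0],
]
q0, q25, q50, q75, q100 = average_quantiles(toy)
spread = q100 - q0   # average range of spread
iqr = q75 - q25      # average inter-quartile range
print(q0, q25, q50, q75, q100, spread, iqr)
```

Computing the quantiles per time step first and averaging afterwards, as in the text, means the averaged envelope does not have to correspond to any single ensemble member.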

4. Results

In this section the findings from the different experiments are discussed. The observed runoff hydrograph and the average discharge during the events are also plotted, and should give a subjective indication of the plausibility of the obtained results. The evaluation of long series of operational forecasts with COSMO-LEPS and nowcast runs with REAL, MOD/PLUV and MOD/RAD is not detailed here. Nevertheless, Appendix A and Fig. A1 give a concise summary of the quality of the probabilistic (COSMO-LEPS, REAL) and deterministic (MOD, RAD) simulations during the period June 2007 to November 2008, which includes all selected events and for which there is a detailed verification report (Diezig et al., 2010). The verification indicates that all deterministic and probabilistic meteorological inputs used result in discharge estimations that perform better than climatology. Even if REAL and COSMO-LEPS present similar skill against observations, the following sections will outline that the spread of these two sources of ensemble precipitation input may differ considerably for events leading to high discharge.

4.1. Parameter uncertainty

The MOD/PLUV and MOD/RAD experiments have been evaluated by quantifying the average ensemble spread (q100–q0) during the seven events (Table 4). MOD/PLUV was run using each of the three different sub-sets of parameter realizations (Table 2). For MOD/RAD only the results from the 26 realizations from the set MOD_95% are shown. Depending on the intensity of the event (peak-flow) and the differences in antecedent precipitation (Table 3), different values of spread are obtained for the different events. The largest average ensemble spread (about 30 m3 s−1 for both MOD_95%/PLUV and MOD_95%/RAD) is found during the event leading to the September 7th 2008 peak-flow of 541 m3 s−1.

The application of parameter sub-samples with higher NSE and SWAE results in reduced spread. The average spread resulting from propagating the MOD_99.5% sub-sample is 30% lower than the one computed when propagating MOD_95%. The spread obtained by propagating the MOD_80% sub-sample is about 30% higher than the one obtained from MOD_95% (Table 4).
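The relative differences quoted in this section can be checked against the per-event values of Table 4. The sketch below assumes that the percentages refer to the seven-event means; the variable and function names are illustrative only.

```python
from statistics import mean

# Per-event average spread (q100-q0) in m3 s-1, copied from Table 4.
pluv_995 = [10.3, 4.9, 5.3, 5.9, 22.8, 9.9, 10.8]     # MOD_99.5%/PLUV
pluv_95 = [14.2, 8.3, 7.2, 10.3, 30.0, 14.6, 13.9]    # MOD_95%/PLUV
pluv_80 = [18.5, 10.4, 10.0, 11.7, 38.9, 19.6, 18.8]  # MOD_80%/PLUV
rad_95 = [9.3, 6.5, 9.5, 8.3, 30.4, 10.8, 9.9]        # MOD_95%/RAD

def relative_change(values, reference):
    """Change of the seven-event mean relative to a reference set, in percent."""
    return 100.0 * (mean(values) - mean(reference)) / mean(reference)

change_995 = relative_change(pluv_995, pluv_95)
change_80 = relative_change(pluv_80, pluv_95)
change_rad = relative_change(rad_95, pluv_95)
print(round(change_995), round(change_80), round(change_rad))
```

Under this assumption the means differ by roughly −29%, +30% and −14%, which is consistent with the rounded figures given for MOD_99.5%, MOD_80% and the radar-forced runs.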

The average spread for the seven events obtained from 26 realizations of PREVAH forced with weather radar QPE is about 14% lower (MOD_95%/RAD) than the corresponding spread of the runs forced with interpolated pluviometer data (MOD_95%/PLUV). Only for the event leading to the July 13th 2008 peak flow is the spread of the weather radar-driven simulations larger than that of the runs with rain gauge data. This is due to a local convective rainfall event that was not recorded by the pluviometers, but that resulted in locally very high radar QPE. The main reason for having a lower spread with radar QPE than with pluviometer forcing is the effect of bias correction, which is applied to the pluviometer data only. The variation in the bias correction (Table 3) covers both the input and model uncertainties. This is the way errors in estimating precipitation are currently accounted for. However, there is an important constraint as compared to the state-of-the-art observation-based precipitation ensembles (e.g. Ahrens and Jaun, 2007; Moulin et al., 2009; Pappenberger et al., 2009). The hydrological model uses the precipitation bias corrections (Table 2) as a global tunable parameter for accounting for different sources of error in the treatment of rain gauge data: a) direct measurement errors, b) systematic errors due to the choice, location and availability of meteorological stations and, c) uncertainties in the generation of spatially interpolated fields. Additionally the

Table 5. Average ensemble spread (q100–q0) in m3 s−1 for seven peak-runoff events when adopting different experimental settings for propagating and superposing uncertainty in operational hydrological simulations.

| Event (year/month/day) | Initialization (month/day) | REAL | REAL/MOD | LEPS | LEPS/MOD | FULL |
|---|---|---|---|---|---|---|
| Members | – | 25 | 650 | 16 | 416 | 10400 |
| 2007/08/22 | 08/20 | 37.2 | 48.9 | 117.0 | 137.0 | 141.0 |
| 2008/07/07 | 07/05 | 24.8 | 33.8 | 84.5 | 100.0 | 105.0 |
| 2008/07/13 | 07/11 | 42.0 | 53.9 | 123.7 | 146.0 | 146.0 |
| 2008/08/15 | 08/12 | 37.9 | 48.6 | 100.0 | 123.0 | 142.0 |
| 2008/09/07 | 09/04 | 167.0 | 216.0 | 288.0 | 328.0 | 338.0 |
| 2008/10/29 | 10/27 | 34.9 | 48.9 | 82.0 | 102.0 | 110.0 |
| 2008/11/05 | 11/03 | 43.3 | 56.3 | 116.0 | 138.0 | 139.0 |
| Average [m3 s−1] | | 55.3 | 72.3 | 130.0 | 153.0 | 160.0 |


bias correction parameters also contribute to a compensation of systematic errors in the estimation of evapotranspiration and other water fluxes by PREVAH (Zappa, 2002; Viviroli et al., 2009a). Methods for generating observation-based ensembles (both based on weather radar and simulations) focus only on the estimation uncertainties in the gridding of precipitation information and are therefore better suited for the propagation of input uncertainties.

Fig. 5 shows in detail simulations for the November 5th 2008 event. The related evaluation of the average spread for the 120 hours window starting from November 3rd 2008 00:00 is summarized in Table 4. The spread arising from adopting three different parameter sub-sets clearly increases when using sets with lower Li during the calibration period. While the shape of the simulated ensembles above and below the median remains very similar among the three cases, the distance between the upper and lower ensemble envelopes grows with decreasing likelihood within the calibration period. As a consequence, the number of observations falling within the uncertainty band drawn by the ensembles increases when using MOD_80% as compared to both MOD_99.5% and MOD_95%. The spread computed when using weather radar information is slightly higher at the start of the event. During the event the spread obtained from radar forcing becomes clearly smaller than the one obtained from forcing with interpolated pluviometer data. This is confirmed by the average values of spread during the event (Table 4).

Spreads resulting from this analysis range between 7 and 30 m3 s−1 for the seven investigated events (MOD_95%). We selected MOD_95% as a benchmark against which to compare spreads resulting from the other sources of uncertainty from now on. This decision is taken with the intent of avoiding overfitting (when using MOD_99.5% as a benchmark).

Fig. 5. November 5th 2008 event: visualization of the spread obtained by adopting different sets of model parameter realizations for simulations with PREVAH. The observed hydrograph is drawn as black line. The shaded dark and light grey areas are delimitated by the quantiles of the ensemble realizations (q0, q25, q75, and q100). The dashed black line draws the ensemble median (q50). Top left: realizations obtained from pluviometric data and the MOD_99.5% set. Top right: same but MOD_95% set is applied. Bottom left: same for MOD_80%. Bottom right: weather radar data are used combined with the MOD_95% parameters realizations set.

4.2. Weather radar uncertainty and superposition with parameter uncertainty

Following the proof-of-concept presented in Germann et al. (2009), PREVAH was run by adopting radar QPE ensembles obtained from REAL. The runs forced by REAL members use the initial conditions of a deterministic run forced by the operational radar QPE of MeteoSwiss until some days ahead of the event (Fig. 4 and Table 4). From that initialization point the procedure described in Section 3.4 (experiment “REAL”) is applied. As for the results presented in the previous section, the average ensemble spread of the seven selected events has been computed for a 120 hours time frame (Table 5). By analogy, the experiment REAL/MOD was completed by varying both the REAL member and the calibrated parameter realization from the MOD_95% set one after the other.

The model runs resulted in an average spread rangingbetween 25 and 167 m3 s−1 for REAL and between 34 and216 m3 s−1 for REAL/MOD. If we compare these results withthe outcomes of MOD_95%/RAD, the REAL and REAL/MODruns (Table 4) present a higher spread by a factor of 4.3(REAL) and 5.6 (REAL/MOD).
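The factors quoted above can be approximately recovered from Tables 4 and 5. The sketch below assumes that each factor is the mean of the seven per-event ratios; this is an inference from the tabulated values, not a procedure stated in the text. Dividing the seven-event averages instead gives somewhat larger values (about 4.6 and 6.0).

```python
from statistics import mean

# Per-event average spread (q100-q0) in m3 s-1 for the seven events.
mod95_rad = [9.3, 6.5, 9.5, 8.3, 30.4, 10.8, 9.9]        # Table 4, MOD_95%/RAD
real = [37.2, 24.8, 42.0, 37.9, 167.0, 34.9, 43.3]       # Table 5, REAL
real_mod = [48.9, 33.8, 53.9, 48.6, 216.0, 48.9, 56.3]   # Table 5, REAL/MOD

def mean_event_ratio(values, reference):
    """Mean over events of the per-event ratio values/reference."""
    return mean(v / r for v, r in zip(values, reference))

factor_real = mean_event_ratio(real, mod95_rad)
factor_real_mod = mean_event_ratio(real_mod, mod95_rad)
print(round(factor_real, 1), round(factor_real_mod, 1))
```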

Fig. 6 shows two examples of 5-day ensemble hydrographs obtained for the experiments REAL and REAL/MOD. Contrary to the ensembles shown in Fig. 5, almost all observed values fall within the ensemble envelopes. Only the falling limbs close to the end of the simulation are underestimated by both REAL and REAL/MOD ensembles. For the August 22nd 2007 and November 5th 2008 events there is clear evidence that the spread arising from joint consideration of two sources of uncertainty is higher than the one obtained by propagating only the REAL members through the hydrological model. The spread from the REAL/MOD ensemble is 25% to 40% higher than that of the REAL realizations. The average additional spread for the seven events is 17 m3 s−1 (Table 5). Combining the analyses of Tables 4 and 5, the following findings can be stated for simulations REAL and REAL/MOD:

- The average spread for the seven events stemming from the parameter ensemble is about 12 m3 s−1 (PREVAH forced by deterministic radar QPE and 26 parameter realizations from MOD_95%).

- The coupling of PREVAH with REAL results in hydrograph ensembles with an average spread of over 55 m3 s−1 for the same seven events.

- If both REAL and MOD_95% are applied an ensemble of 650 members is generated. The obtained average spread in this case is about 72 m3 s−1.

This means that REAL/MOD generates a 6% to 7% larger spread than the sum of the spreads obtained from the experiments MOD_95%/RAD and REAL (67 m3 s−1). This indicates that an amplification of spread by superposition of two sources of uncertainty is occurring. Our particular modelling system is characterized by non-linear responses, mostly explained by conceptual threshold processes in the runoff generation module of PREVAH (Gurtz et al., 2003). At the level of the interquartile range, amplification of spread has been observed in only one of the 19 cases considered (Table 3). Thus only a subset of all considered REAL and MOD combinations triggers a non-linear reaction within the runoff generation module of PREVAH. In all other cases the IQR-spread of REAL/MOD is on average 9% smaller than the cumulative spread of REAL and MOD.

Fig. 6. Ensemble hydrographs for the August 22nd 2007 (upper panels) and November 5th 2008 (bottom panels) events as obtained by forcing PREVAH with 25 radar ensemble members (REAL, left panels) and by jointly accounting for both radar and model parameter uncertainty (REAL/MOD, right panels). The observed hydrograph is drawn as black line. The shaded dark and light grey areas are delimitated by the quantiles of the ensemble realizations (q0, q25, q75, and q100). The dashed black line draws the ensemble median (q50).

4.3. COSMO-LEPS uncertainty and superposition with parameter uncertainty

As expected the average spread obtained by propagating NWP forecasts through the hydrological model is much larger than the one obtained from the experiments discussed above (Fig. 7 and Table 4). The computation of LEPS generates average spreads that are about 10 times higher than the ones of MOD_95%/RAD and 2.3 times higher than the ones from REAL (Tables 4 and 5). Contrary to previous experiments, which are always related to a precipitation event that actually occurred, the LEPS ensemble (initialized as declared in Table 5) also includes members that forecast very low or no precipitation at all for the respective event (e.g. Fig. 7 for the August 22nd 2007 event, upper panels). The forecast initialized on August 20th 2007 includes a relevant number of members that show no runoff increase at all within the 120 forecast hours. Even the 25% quantile shows a maximum discharge that is only slightly higher than the discharge at initialization time. In the case of this event the whole observed time series falls within the envelope drawn by the LEPS experiment. Unfortunately the spread is very large, which makes any kind of decision making related to that case almost impossible. Nevertheless, in this specific case a potential end-user taking action on the basis of the 75% quantile would have been very efficient in their decision making. Further considerations on skill for decision making are only possible after sound verification of long-term time series of consecutive forecasts (e.g., Fundel and Zappa, 2011).

The results from the November 5th 2008 event (lower panels in Fig. 7) show different characteristics. All LEPS ensemble members agree that the first runoff peak is to be expected in the second half of the first day of the forecast, and that a second (higher) peak will arrive about 60 hours after initialization of the forecast. Potential users focusing on the 75% quantile would probably have over-reacted at the start of the event, but would have been able to cope with the peak on November 5th 2008.

The LEPS/MOD experiments represent a second series of simulations, for which parameter uncertainty is accounted for and superposed on the uncertainty originating from LEPS (right panels in Fig. 7). The average spread from the LEPS/MOD ensemble is 13% to 25% higher than the one of the model realizations based on LEPS only. The average additional spread for the seven events is 23 m3 s−1 (Table 5).

Fig. 7. The same as Fig. 6 but with PREVAH forced by 16 COSMO-LEPS ensemble members (LEPS, left panels) and by jointly accounting for both COSMO-LEPS and model parameter uncertainty (LEPS/MOD, right panels).

In analogy to the joint consideration of Tables 4 and 5 in Section 4.2, the experiments with LEPS and LEPS/MOD allow the following statements:

- Average spread from MOD_95%/RAD is about 12 m3 s−1 (see above).

- The coupling of PREVAH with LEPS generated hydrograph ensembles with an average spread of over 130 m3 s−1 for the seven events considered.

- Applying both LEPS and MOD_95% results in an ensemble of 416 members. The obtained average spread is larger than 150 m3 s−1.

This means that LEPS/MOD generates a 9% to 10% larger spread than the sum of the spreads obtained from the experiments MOD_95%/RAD and LEPS (142 m3 s−1). Also in this case the superposition of the two sources of uncertainty causes an amplification of the full spread. In this case an amplification of spread measured by the interquartile range has been observed in seven cases (Table 3). On average the IQR-spread of LEPS/MOD is 2% smaller than the cumulative spread of LEPS and MOD.

Fig. 8. Ensemble hydrographs for the August 15th 2008 event with PREVAH initialized on August 12th 2008 and forced by jointly accounting for both COSMO-LEPS and model parameter uncertainty (LEPS/MOD, left panel) and by accounting for all three sources of uncertainty in the experimental chain (FULL, right panel). Legend as in Figs. 6 and 7.

When propagating numerical forecasts from an ensemble prediction system such as COSMO-LEPS through a hydrological model for mesoscale areas such as the Verzasca basin (186 km2), the big mismatch between the basin area and the resolution of the ensemble prediction system (10×10 km2 mesh size) has to be kept in mind. Nevertheless, studies with this kind of hydrological ensemble prediction have been very popular in the last few years (Cloke and Pappenberger, 2009; Jaun et al., 2008) and have already found application in operational chains. This scale restriction is less problematic for applications in macro-scale basins (Pappenberger et al., 2005; Bartholmes et al., 2009).
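The amplification of the full spread discussed in Sections 4.2 and 4.3 compares the spread of a combined experiment with the sum of the spreads of its single-source experiments. The sketch below assumes that the quoted percentages are means of the per-event ratios computed from Tables 4 and 5; this averaging choice is an inference from the tabulated values rather than a documented step, and the function name is illustrative.

```python
from statistics import mean

# Per-event average spread (q100-q0) in m3 s-1 (Tables 4 and 5).
mod95_rad = [9.3, 6.5, 9.5, 8.3, 30.4, 10.8, 9.9]
real = [37.2, 24.8, 42.0, 37.9, 167.0, 34.9, 43.3]
real_mod = [48.9, 33.8, 53.9, 48.6, 216.0, 48.9, 56.3]
leps = [117.0, 84.5, 123.7, 100.0, 288.0, 82.0, 116.0]
leps_mod = [137.0, 100.0, 146.0, 123.0, 328.0, 102.0, 138.0]

def amplification(combined, source_a, source_b):
    """Mean per-event excess (percent) of the combined spread over the summed single-source spreads."""
    ratios = [c / (a + b) for c, a, b in zip(combined, source_a, source_b)]
    return 100.0 * (mean(ratios) - 1.0)

amp_real_mod = amplification(real_mod, real, mod95_rad)
amp_leps_mod = amplification(leps_mod, leps, mod95_rad)
print(round(amp_real_mod, 1), round(amp_leps_mod, 1))
```

A positive value indicates that the superposed experiment spreads more than the two sources added together, which is the signature of the non-linear response described in the text.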

4.4. Superposition of three sources of uncertainty

The last experiment combines the initial conditions obtained from the REAL/MOD experiment (650 members) with the 16 ensemble members of COSMO-LEPS (see Section 3.4 and Fig. 4) and thus considers the entire “uncertainty triplet” (Fig. 1). The LEPS/MOD experiment discussed above is extended by additionally perturbing the initial conditions, forcing PREVAH with REAL up to the start of the COSMO-LEPS propagation through PREVAH. By accounting for these additional perturbations the average spread for the seven events increases by about 4.5%, from 153 (LEPS/MOD) to 160 m3 s−1 (FULL, Table 5). Only the run initialized on August 12th 2008 shows a distinctly higher additional uncertainty (~15% more) in the FULL experiment, as compared to the LEPS/MOD experiment (Fig. 8). The FULL ensemble already shows a large spread at initialization, as determined by the antecedent conditions obtained from the REAL/MOD runs. This difference in the overall spread gradually converges but it is still well defined at the time of the first runoff peak shortly after 2:00 on August 13th 2008, where the maximum peak-flow of FULL is about 50 m3 s−1 higher than the corresponding LEPS/MOD peak. The difference is also well visible in the IQR. The second peak, late in the evening of August 15th 2008, shows nearly identical shape and ranges for both FULL and LEPS/MOD. The uncertainty owed to the REAL influence on REAL/MOD decays during the first part of the event.

Fig. 9 shows an overview of all 19 “FULL” experiments, each of them summarizing the spread arising from 10,400 5-day forecasts. In 12 cases the observed average discharge is found within the IQR. Only the experiment with the longest lead time, initialized on August 10th 2008, produced a q100 lower than the observed average discharge during the 120 forecast hours considered. The model run initialized 24 hours later (August 11th 2008) generates an ensemble spread that strongly overestimates the observed value. The corresponding runs for the three following days show a gradual reduction in ensemble spread. The reason for the large spread is that some COSMO-LEPS members forecast severe convective precipitation, while others predicted no precipitation at all. In the end, a moderate thunderstorm occurred in the evening of August 11th 2008. REAL also generated a large spread in its members, with cumulated rainfall for August 11th 2008 ranging between 3 and 40 mm. This explains the large discrepancy in initial conditions observed at initialization of the LEPS forecasts on August 12th 2008 (see Fig. 8).

Fig. 9. Box plots summarizing the average ensemble discharge quantiles related to 19 experiments (lower captions on the x-axis) of superposing the uncertainty from three sources. The experiments related to different events (upper captions of the x-axis) are separated by a vertical line crossing the x-axis. The observed average discharge during the 120 hours of each experiment is displayed as a thick horizontal black line. The thick horizontal white line depicts q50 within the box plot drawn by q0, q25, q75 and q100.

Table 6. Attributing the contribution of different sources of uncertainty to the average ensemble spread (q100–q0) and IQR (q75–q25) in m3 s−1 for the November 5th 2008 peak runoff event. The observed value and the corresponding ensemble median (q50) are also summarized. Sub-samples of three experiments are evaluated to estimate the different contribution of MOD, LEPS and REAL to the total experimental uncertainty. Details on the experiments and acronyms are found in Section 3.4.

| Experiment | Filter | Varying | Average of n realizations | Members | Observation [m3 s−1] | q50 [m3 s−1] | q100–q0 [m3 s−1] | q75–q25 [m3 s−1] |
|---|---|---|---|---|---|---|---|---|
| FULL | None | All three | 1 | 10400 | 62.4 | 67.0 | 139.0 | 68.4 |
| | MOD | REAL & LEPS | 26 | 400 | 62.4 | 65.3 | 125.4 | 62.3 |
| | REAL | MOD & LEPS | 25 | 416 | 62.4 | 66.8 | 136.1 | 67.0 |
| | LEPS | REAL & MOD | 16 | 650 | 62.4 | 71.9 | 15.9 | 10.8 |
| REAL/MOD | None | Both | 1 | 650 | 62.4 | 56.7 | 56.3 | 31.0 |
| | REAL | MOD | 25 | 26 | 62.4 | 57.7 | 10.1 | 6.9 |
| | MOD | REAL | 26 | 25 | 62.4 | 55.7 | 45.9 | 25.5 |
| LEPS/MOD | None | Both | 1 | 416 | 62.4 | 66.9 | 138.0 | 67.9 |
| | MOD | LEPS | 26 | 16 | 62.4 | 64.4 | 121.8 | 59.6 |
| | LEPS | MOD | 16 | 26 | 62.4 | 71.5 | 15.7 | 10.1 |

4.5. Attributing the contribution to the total uncertainty

The outcome from the three experiments dealing with uncertainty superposition (FULL, REAL/MOD, and LEPS/MOD) can be sorted in order to allocate the contribution of one of the sources of spread to the whole experimental uncertainty. For this analysis we focus on one event only, namely the November 5th 2008 event with COSMO-LEPS forecasts initialized on November 3rd 2008 (see also Figs. 5 to 7). The following procedure was applied:

- Calculate the quantiles of all runs of the experiment (Eq. (4));

- Group in turn all runs sharing the same MOD, REAL or LEPS member (Table 6);

- Average the quantiles of the sub-samples and calculate the corresponding spread metrics (Fig. 10).

Fig. 10. Box plots summarizing the average ensemble discharge quantiles for the November 5th 2008 event initialized on November 3rd 2008. Three experiments (upper captions of the x-axis) are evaluated as complete set (“no filter”) and by separating the influence of different sources of uncertainty (“filter MOD/REAL/LEPS”). The observed average discharge during the 120 hours of each experiment is displayed as a thick horizontal black line. The thick horizontal white line depicts q50 within the box plot drawn by q0, q25, q75 and q100.

The three main findings from Table 6 are:

a) FULL: The 10,400 FULL runs give an average ensemble spread “q100–q0” of 139.0 m3 s−1. There are 400 model runs sharing the same parameter set. This means that we can compute 26 different “q100–q0” values and average them to obtain an integral measure indicating the spread attributed to the two sources of uncertainty that have been varied in this specific case (REAL & LEPS). In this example the “q100–q0” that cannot be attributed to MOD is 125.4 m3 s−1 (90% of the total spread). When REAL is used as a filter and both MOD and LEPS are varied, then almost 98% of the “q100–q0” is obtained. REAL contributes in a very limited way to the whole ensemble spread. Finally, if the influence of LEPS is averaged then only 11.5% of the FULL spread can be attributed (Table 6). Similar outcomes are observed when looking at the IQR “q75–q25”.

b) REAL/MOD: The REAL/MOD ensemble generates a “q100–q0” of 56.3 m3 s−1. Here the “q100–q0” that cannot be allocated to MOD is 45.9 m3 s−1 (80% of the total spread). When REAL is used as a filter and only MOD is varied, then 17.9% of the spread can be allocated (Table 6).

c) LEPS/MOD: The 416 LEPS/MOD realizations result in an average “q100–q0” of 138 m3 s−1. When focusing on the role of changing MOD and averaging the influence of LEPS then “q100–q0” is only 15.6 m3 s−1 (11% of the total spread). If we make a sub-sample that filters the spread of MOD, then about 88% of the whole spread of LEPS/MOD can still be allocated (Table 6).
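The percentages in findings a) to c) are ratios between the filtered and the unfiltered average spread of Table 6. The following sketch recomputes them from the tabulated “q100–q0” values; because the table entries are rounded, the results may differ from the percentages in the text by a few tenths of a percent. The dictionary layout is an illustrative choice.

```python
# Average spread (q100-q0) in m3 s-1 for the November 5th 2008 event (Table 6).
# Inner keys name the filter, i.e. the source held fixed within each sub-sample.
table6 = {
    "FULL": {"none": 139.0, "MOD": 125.4, "REAL": 136.1, "LEPS": 15.9},
    "REAL/MOD": {"none": 56.3, "REAL": 10.1, "MOD": 45.9},
    "LEPS/MOD": {"none": 138.0, "MOD": 121.8, "LEPS": 15.7},
}

def shares(experiment):
    """Percentage of the unfiltered spread that remains for each filter."""
    total = experiment["none"]
    return {f: 100.0 * s / total for f, s in experiment.items() if f != "none"}

full_shares = shares(table6["FULL"])
real_mod_shares = shares(table6["REAL/MOD"])
leps_mod_shares = shares(table6["LEPS/MOD"])
for name, experiment in table6.items():
    print(name, {f: round(p, 1) for f, p in shares(experiment).items()})
```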

Fig. 10 is a graphic rendering of Table 6 in the form of box plots. All experiments in which LEPS contributes to the spread variation show an average spread close to that of the whole experiment. If only LEPS is propagated then the average spread is 130 m3 s−1 (Table 5). If model uncertainty is also propagated, then the average spread increases by about 23 m3 s−1 to 153 m3 s−1. If different initial conditions from REAL are also considered the additional increase is only 7 m3 s−1 (total: 160 m3 s−1, Table 5). If REAL is used to generate initial conditions only, its influence on the total spread is smaller than the influence of the hydrological model uncertainty. Using REAL as a forcing during the event increases the spread by about 4.5 times (in the specific case of November 5th 2008) compared to the spread that can be attributed to the model parameters. This confirms the outcomes summarized in Tables 4 and 5.
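The filtering procedure of this section (group the runs that share one member of a given source, compute the spread within each group, then average over the groups) can be sketched as follows. The ensemble here is a small synthetic toy, not PREVAH output, and the function names, member counts and the linear toy response are all assumptions made for illustration.

```python
from itertools import product
from statistics import mean

def average_spread(hydrographs):
    """Time average of (max - min) across members, i.e. the mean of q100 - q0."""
    n_steps = len(hydrographs[0])
    return mean(
        max(h[t] for h in hydrographs) - min(h[t] for h in hydrographs)
        for t in range(n_steps)
    )

def filtered_spread(runs, axis):
    """Group runs sharing the same member on `axis`, then average the group spreads."""
    groups = {}
    for key, hydrograph in runs.items():
        groups.setdefault(key[axis], []).append(hydrograph)
    return mean(average_spread(group) for group in groups.values())

# Synthetic toy ensemble: 3 MOD x 4 REAL x 2 LEPS members, 6 time steps.
# LEPS is deliberately given the strongest, time-growing influence.
N_MOD, N_REAL, N_LEPS, N_STEPS = 3, 4, 2, 6
runs = {
    (mod, real, leps): [
        10.0 + 1.0 * mod + 0.5 * real + 20.0 * leps * t for t in range(N_STEPS)
    ]
    for mod, real, leps in product(range(N_MOD), range(N_REAL), range(N_LEPS))
}

full_spread = average_spread(list(runs.values()))
by_filter = {
    name: filtered_spread(runs, axis)
    for axis, name in enumerate(("MOD", "REAL", "LEPS"))
}
for name, spread in by_filter.items():
    print(f"filter {name}: {spread:.1f} of {full_spread:.1f}")
```

In this toy case, filtering on the dominant source (LEPS) leaves only a small residual spread, while filtering on MOD or REAL leaves the spread almost unchanged, which mirrors the pattern reported in Table 6.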

5. Discussion and conclusions

The experimental setup presented in this paper accounts for three sources of uncertainty and provides answers to questions linked to uncertainty propagation and superposition in a hydrometeorological forecasting system.

The setup used showed that the hydrological model (PREVAH) uncertainty is less pronounced than the uncertainty obtained by propagating radar precipitation fields (REAL) and NWP forecasts (COSMO-LEPS) through the hydrological model. The average difference in spread for a five-day forecast range in the seven events considered results in a factor larger than four between MOD/RAD and REAL and in a factor above ten between MOD/RAD and LEPS.

Since the size of the Verzasca basin is only a few square kilometers larger than the mesh size of COSMO-LEPS there is almost no averaging effect. This contributes to the large spread of the obtained hydrographs when COSMO-LEPS is used. Gallus (2002) warns against using NWP grid-point information for verification against point data. In the case of the Verzasca this is almost the situation, since we use information from a few COSMO-LEPS grid points to force our impact model and compare it to observations.

The estimation of PREVAH parameter uncertainty is strongly dependent on the way the parameters have been sampled and ranked. Numerous approaches are possible for this kind of problem (Matott et al., 2009). We are confident that the chosen approach is appropriate to estimate the parameter uncertainty of PREVAH within the presented superposition experiment. Of course, the parameter uncertainty is estimated on the basis of the whole calibration period. Current literature (He et al., 2009; Cullmann and Wriedt, 2008; Pappenberger and Beven, 2004) offers some examples of approaches that try to combine parameter configurations that are successful over the complete data basis with other parameter configurations estimated for single events or series of events.

Amplification of spread is obtained if the combination of LEPS (or REAL) and MOD triggers a non-linear reaction of the runoff generation module of PREVAH (Gurtz et al., 2003; Viviroli et al., 2009a), which includes a threshold parameter for activating the generation of surface runoff (Table 1). Such a non-linear response needs to be accounted for by hydrological models, since a sudden increase of discharge coefficients has been observed in many basins during long-lasting heavy precipitation events (e.g. Naef et al., 2008). Such threshold processes can also be identified in the form of step structures in the flood frequency statistics (e.g. Merz and Blöschl, 2008). In all considered cases we observed an amplification of the full spread, while the corresponding interquartile range is mostly smaller when two error sources are superposed.

By use of REAL, input uncertainties are considered for nowcasting. We showed that the simultaneous application of REAL and parameter uncertainties generates ensembles that nicely envelop the observed hydrograph. Besides weather-radar based approaches, observation-based ensembles with pluviometer data have been recently proposed. Recent studies propose the use of the Kriging variance (Ahrens and Jaun, 2007; Moulin et al., 2009; Pappenberger et al., 2009) for the estimation of the interpolation uncertainty of ground-based precipitation data for hydrological purposes. Jaun (2008) showed that hydrological simulation forced by observation-based ensembles is sensitive to the density and number of stations available. The interpolation uncertainty increases with decreasing number of representative stations available. These restrictions do not apply to REAL, which is able to operationally generate high resolution observation-based ensembles for hydrology. Nevertheless, observation-based pluviometer ensembles are certainly a feasible way to consider input uncertainty in regions where the weather radar coverage is not adequate. Further efforts are planned in order to implement interpolation-based ensembles within our experimental chain.

The use of weather radar ensembles for generating hydrologically consistent ensembles of initial conditions prior to the propagation of COSMO-LEPS through the hydrological model shows that the uncertainty in initial conditions decays within the first 48 hours of the forecast. The magnitude of the uncertainty attributed to the difference in initial conditions is smaller than the uncertainty attributed to the hydrological model parameters and almost negligible with respect to the spread due to COSMO-LEPS.

The operational implementation of this experiment for the small Verzasca river basin would be possible in principle. Realizing a run with all 10,400 ensemble members, including the 650 runs for the determination of initial conditions, requires about 6 hours of CPU time. Application to larger river basins requires a reduction in the number of simulations. The adaptive forecasting concept proposed by Romanowicz et al. (2006, 2008) could be a possible approach to estimate which members need to be computed.
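The run counts quoted above follow from the ensemble sizes, as the short sketch below shows. The member counts are those given in the paper; reading the 650 initialization runs as one run per pair of REAL member and parameter set is an inference from 25 × 26 = 650, and the average cost per run assumes, purely for illustration, that all runs are equally expensive.

```python
from itertools import product

# Ensemble sizes as stated in the paper.
N_LEPS = 16  # COSMO-LEPS members (NWP forcing)
N_REAL = 25  # REAL members (radar ensemble)
N_PAR = 26   # PREVAH parameter sets

# Assumption: one initialization run per (REAL member, parameter set) pair.
init_runs = list(product(range(N_REAL), range(N_PAR)))

# Every initial state is forced with every NWP member.
forecast_runs = list(product(range(N_LEPS), range(N_REAL), range(N_PAR)))

print(len(init_runs))      # 650
print(len(forecast_runs))  # 10400

# Rough average cost per run, ASSUMING equal cost for all runs
# (initialization and forecast runs will differ in practice).
cpu_seconds = 6 * 3600
mean_seconds_per_run = cpu_seconds / (len(init_runs) + len(forecast_runs))
print(f"about {mean_seconds_per_run:.1f} s per run on average")
```

Because the total grows with the product of the three ensemble sizes, halving any single ensemble halves the number of forecast runs, which is why member selection matters for larger basins.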

Acknowledgments

We want to acknowledge the Swiss Federal Office for the Environment for providing us with runoff data from their operational networks. Thanks to the Ufficio dei corsi d'acqua (Canton Ticino) and Istituto Scienze della Terra (SUPSI) for additional rain-gauge data. This study is part of COST-731 and MAP D-PHASE, and was funded by MeteoSwiss, WSL and the State Secretariat for Education and Research SER (COST 731). The comments of the two reviewers F. Pappenberger and L. Moulin and of the Guest Editor A. Rossa helped to clarify the paper.

Appendix A

In this appendix we give a concise summary of the verification of the probabilistic (COSMO-LEPS and REAL) and deterministic (MOD and RAD) simulations during the period June 2007 to November 2008, expressed with probabilistic measures of skill. This kind of verification is established in atmospheric sciences (Brier, 1950; Wilks, 2006; Weigel et al., 2007; Ahrens and Walser, 2008) and is enjoying increasing popularity in hydrological sciences, both for the analysis of single events and for the verification of long time series (Jaun et al., 2008; Jaun and Ahrens, 2009; Bartholmes et al., 2009; Roulin and Vannitsem, 2005; Roulin, 2007; Laio and Tamea, 2007; Brown et al., 2010).

Fig. A1 shows the relative operating characteristic curves (ROC, Wilks, 2006) of LEPS and REAL for the period June 2007 to November 2008. The ROC of each deterministic simulation (MOD/RAD and MOD/PLUV) is indicated as a point. The analysis has been completed for three different thresholds, each representing a percentile (50%, 75%, and 95%) of the observed discharge during these 18 months. Additionally, the Brier Skill Score (BSS, Wilks, 2006) of the ensemble products is reported. For LEPS the analysis has been completed for different lead times (Jaun and Ahrens, 2009). Lead times of one and five days are displayed in Fig. A1.
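As a reading aid for these measures, the following minimal sketch computes a Brier Skill Score and one ROC point from ensemble exceedance forecasts. It uses the textbook definitions with the sample climatology as reference forecast; the function names and the toy discharge values are hypothetical, and this is not the verification code used for Fig. A1.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members strictly above the threshold."""
    return sum(m > threshold for m in members) / len(members)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """BSS relative to a constant forecast of the sample base rate."""
    base_rate = sum(outcomes) / len(outcomes)
    bs_ref = brier_score([base_rate] * len(outcomes), outcomes)
    if bs_ref == 0:
        raise ValueError("event always or never occurs; BSS is undefined")
    return 1.0 - brier_score(probs, outcomes) / bs_ref


def roc_point(probs, outcomes, prob_threshold):
    """(false alarm rate, hit rate) when warning if probability >= prob_threshold.
    Requires at least one event and one non-event in the sample."""
    hits = misses = false_alarms = correct_negatives = 0
    for p, o in zip(probs, outcomes):
        warned = p >= prob_threshold
        if warned and o:
            hits += 1
        elif warned:
            false_alarms += 1
        elif o:
            misses += 1
        else:
            correct_negatives += 1
    return (false_alarms / (false_alarms + correct_negatives),
            hits / (hits + misses))


# Hypothetical discharge ensembles and observations (m3/s), one row per forecast.
ensembles = [[12, 15, 11, 14], [8, 9, 11, 7], [5, 6, 4, 7],
             [9, 12, 13, 11], [10.5, 9, 8, 12], [6, 7, 5, 8]]
observed = [13, 6, 5, 12, 9, 11]
discharge_threshold = 10.0  # stands in for an observed-discharge percentile

probs = [exceedance_probability(e, discharge_threshold) for e in ensembles]
outcomes = [int(q > discharge_threshold) for q in observed]

bss = brier_skill_score(probs, outcomes)
print(f"BSS = {bss:.3f}")
print("ROC point at p >= 0.5:", roc_point(probs, outcomes, 0.5))
```

Sweeping `prob_threshold` over the possible member fractions traces the ROC curve of an ensemble. A deterministic simulation only yields probabilities of 0 or 1 and therefore contributes a single ROC point.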

Fig. A1. Relative operating characteristic curves (ROC, Wilks, 2006) of LEPS and REAL for the period June 2007 to November 2008. The ROC for the deterministic simulations MOD/RAD and MOD/PLUV is indicated as a point. ROC curves are plotted for three different discharge thresholds corresponding to the 0.5 (left), 0.75 (middle) and 0.9 (right) quantiles.

The obtained ROC and BSS show that both REAL and LEPS are skillful for all selected thresholds. When low discharge percentiles are tested (50% and 75%), the BSS of LEPS decreases only slightly between day-one and day-five forecasts. The skill of forecasts for discharges above 75.8 m³ s⁻¹ (95% percentile) is better for LEPS forecasts with a lead time of one day than for LEPS forecasts with a lead time of five days.

The BSS of REAL is high for the lowest and the highest percentiles considered. For the 75% discharge percentile (17.2 m³ s⁻¹), REAL tends to have an increased rate of false alarms.

MOD/RAD and MOD/PLUV show similar behaviour to the ensemble products. MOD/RAD has a higher hit rate than MOD/PLUV when the 75% discharge percentile is tested. On the other hand, MOD/PLUV has fewer false alarms than MOD/RAD when discharge above 5.97 m³ s⁻¹ (50% percentile) is verified. An extended objective quantitative verification of the ensemble simulations against observed data will be presented in follow-up studies.

References

Ahrens, B., Jaun, S., 2007. On evaluation of ensemble precipitation forecasts with observation-based ensembles. Advances in Geosciences 10, 139–144.

Ahrens, B., Walser, A., 2008. Information-based skill scores for probabilistic forecasts. Monthly Weather Review 136 (1), 352–363.

Bartholmes, J.C., Thielen, J., Ramos, M.H., Gentilini, S., 2009. The European Flood Alert System EFAS – part 2: statistical skill assessment of probabilistic and deterministic operational forecasts. Hydrology and Earth System Sciences 13 (2), 141–153.

Bellerby, T.J., Sun, J.Z., 2005. Probabilistic and ensemble representations of the uncertainty in an IR/microwave satellite precipitation product. Journal of Hydrometeorology 6 (6), 1032–1044.

Berenguer, M., Corral, C., Sanchez-Diezma, R., Sempere-Torres, D., 2005. Hydrological validation of a radar-based nowcasting technique. Journal of Hydrometeorology 6 (4), 532–549.

Beven, K., 1993. Prophecy, reality and uncertainty in distributed hydrological modeling. Advances in Water Resources 16 (1), 41–51.

Beven, K., 2006. On undermining the science? Hydrological Processes 20 (14), 3141–3146.

Beven, K.J., Freer, J., 2001. Equifinality, data assimilation, and uncertainty estimation in mechanistic modelling of complex environmental systems. Journal of Hydrology 249, 11–29.

Bosshard, T., Zappa, M., 2008. Regional parameter allocation and predictive uncertainty estimation of a rainfall-runoff model in the poorly gauged Three Gorges Area (PR China). Physics and Chemistry of the Earth 33 (17–18), 1095–1104.

Bowler, N.E., Pierce, C.E., Seed, A.W., 2006. STEPS: A probabilistic precipitation forecasting scheme which merges an extrapolation nowcast with downscaled NWP. Quarterly Journal Royal Meteorological Society 132 (620), 2127–2155.

Brier, G.W., 1950. Verification of forecasts expressed in terms of probability. Monthly Weather Review 78 (1), 1–3.

Brown, J.D., Demargne, J., Seo, D.J., Liu, Y.Q., 2010. The Ensemble Verification System (EVS): a software tool for verifying ensemble forecasts of hydrometeorological and hydrologic variables at discrete locations. Environmental Modelling and Software 25 (7), 854–872.

Bruen, M., et al., 2010. Visualizing flood forecasting uncertainty: some current European EPS platforms – COST731 working group 3. Atmospheric Science Letters 11 (2), 92–99.

Clark, M.P., Slater, A.G., 2006. Probabilistic quantitative precipitation estimation in complex terrain. Journal of Hydrometeorology 7 (1), 3–22.

Cloke, H.L., Pappenberger, F., 2009. Ensemble flood forecasting: a review. Journal of Hydrology 375 (3–4), 613–626.

Collier, C.G., 2007. Flash flood forecasting: What are the limits of predictability? Quarterly Journal Royal Meteorological Society 133 (622), 3–23.

Cullmann, J., Wriedt, G., 2008. Joint application of event-based calibration and dynamic identifiability analysis in rainfall-runoff modelling: implications for model parametrisation. Journal of Hydroinformatics 10 (4), 301–316.

Diezig, R., Fundel, F., Jaun, S., Vogt, S., 2010. Verification of runoff forecasts by the FOEN and the WSL. In: CHR (Ed.), Advances in Flood Forecasting and the Implications for Risk Management. International Commission for the Hydrology of the Rhine Basin (CHR), Alkmaar, pp. 111–113.

Draper, D., 1995. Assessment and propagation of model uncertainty. Journal of the Royal Statistical Society: Series B: Methodological 57 (1), 45–97.

Ehrendorfer, M., 1997. Predicting the uncertainty of numerical weather forecasts: a review. Meteorologische Zeitschrift 6 (4), 147–183.

Frick, J., Hegg, C., 2011. Can end-users' flood management decision making be improved by information about forecast uncertainty? Atmospheric Research 100, 296–302 (this issue).

Fundel, F., Zappa, M., 2011. Hydrological Ensemble Forecasting in Mesoscale Catchments: Sensitivity to Initial Conditions and Value of Reforecasts. Water Resources Research, in review.

Gallus, W.A., 2002. Impact of verification grid-box size on warm-season QPF skill measures. Weather and Forecasting 17 (6), 1296–1302.

Germann, U., Galli, G., Boscacci, M., Bolliger, M., 2006. Radar precipitation measurement in a mountainous region. Quarterly Journal Royal Meteorological Society 132 (618), 1669–1692.

Germann, U., Berenguer, M., Sempere-Torres, D., Zappa, M., 2009. REAL – ensemble radar precipitation estimation for hydrology in a mountainous region. Quarterly Journal Royal Meteorological Society 135 (639), 445–456.

Gurtz, J., Baltensweiler, A., Lang, H., 1999. Spatially distributed hydrotope-based modelling of evapotranspiration and runoff in mountainous basins. Hydrological Processes 13 (17), 2751–2768.

Gurtz, J., et al., 2003. A comparative study in modelling runoff and its components in two mountainous catchments. Hydrological Processes 17 (2), 297–311.

He, Y., et al., 2009. Tracking the uncertainty in flood alerts driven by grand ensemble weather predictions. Meteorological Applications 16 (1), 91–101.

Jaun, S., 2008. Towards operational probabilistic runoff forecasts, Dissertation No. 17817, ETH Zurich, [available online at http://e-collection.ethbib.ethz.ch/view/eth:41686].

Jaun, S., Ahrens, B., 2009. Evaluation of a probabilistic hydrometeorological forecast system. Hydrology and Earth System Sciences Discussions 6, 1843–1877.

Jaun, S., Ahrens, B., Walser, A., Ewen, T., Schar, C., 2008. A probabilistic view on the August 2005 floods in the upper Rhine catchment. Natural Hazards and Earth System Sciences 8 (2), 281–291.

Koboltschnig, G.R., Schoner, W., Holzmann, H., Zappa, M., 2009. Glaciermelt of a small basin contributing to runoff under the extreme climate conditions in the summer of 2003. Hydrological Processes 23 (7), 1010–1018.

Laio, F., Tamea, S., 2007. Verification tools for probabilistic forecasts of continuous hydrological variables. Hydrology and Earth System Sciences 11 (4), 1267–1277.

Lamb, R., 1999. Calibration of a conceptual rainfall-runoff model for flood frequency estimation by continuous simulation. Water Resources Research 35 (10), 3103–3114.

Lee, C.K., Lee, G., Zawadzki, I., Kim, K.E., 2009. A preliminary analysis of spatial variability of raindrop size distributions during stratiform rain events. Journal of Applied Meteorology and Climatology 48 (2), 270–283.

Legates, D.R., McCabe, G.J., 1999. Evaluating the use of “Goodness-of-Fit” measures in hydrologic and hydroclimatic model validation. Water Resources Research 35, 233–241.

Lorenz, E.N., 1963. Deterministic nonperiodic flow. Journal of Atmospheric Sciences 20 (2), 130–141.

Madsen, H., 2000. Automatic calibration of a conceptual rainfall-runoff model using multiple objectives. Journal of Hydrology 235 (3–4), 276–288.

Madsen, H., 2003. Parameter estimation in distributed hydrological catchment modelling using automatic calibration with multiple objectives. Advances in Water Resources 26 (2), 205–216.

Marsigli, C., Boccanera, F., Montani, A., Paccagnella, T., 2005. The COSMO-LEPS mesoscale ensemble system: validation of the methodology and verification. Nonlinear Processes in Geophysics 12 (4), 527–536.

Matott, L.S., Babendreier, J.E., Purucker, S.T., 2009. Evaluating uncertainty in integrated environmental models: a review of concepts and tools. Water Resources Research 45.

Merz, R., Bloschl, G., 2008. Flood frequency hydrology: 1. temporal, spatial, and causal expansion of information. Water Resources Research 44 (8).

Molteni, F., Buizza, R., Palmer, T.N., Petroliagis, T., 1996. The ECMWF ensemble prediction system: methodology and validation. Quarterly Journal Royal Meteorological Society 122 (529), 73–119.

Moulin, L., Gaume, E., Obled, C., 2009. Uncertainties on mean areal precipitation: assessment and impact on streamflow simulations. Hydrology and Earth System Sciences 13 (2), 99–114.

Naef, F., Schmocker-Fackel, P., Margreth, M., Kienzler, P., Scherrer, S., 2008. Die Häufung der Hochwasser der letzten Jahre. In: Bezzola, G.R., Hegg, C. (Eds.), Ereignisanalyse der Hochwasser 2005 – Teil 2, Analyse von Prozessen, Massnahmen und Gefahrengrundlagen in Umwelt. Umwelt-Wissen, p. 429. 08025.

Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models (1), a discussion of principles. Journal of Hydrology 10, 282–290.

Palmer, T.N., 2000. Predicting uncertainty in forecasts of weather and climate. Reports on Progress in Physics 63 (2), 71–116.

Pappenberger, F., Beven, K.J., 2004. Functional classification and evaluation of hydrographs based on Multicomponent Mapping (Mx). International Journal of River Basin Management 2 (2), 89–100.

Pappenberger, F., Beven, K.J., 2006. Ignorance is bliss: or seven reasons not to use uncertainty analysis. Water Resources Research 42 (5).

Pappenberger, F., et al., 2005. Cascading model uncertainty from medium range weather forecasts (10 days) through a rainfall-runoff model to flood inundation predictions within the European Flood Forecasting System (EFFS). Hydrology and Earth System Sciences 9 (4), 381–393.

Pappenberger, F., Ghelli, A., Buizza, R., Bodis, K., 2009. The skill of probabilistic precipitation forecasts under observational uncertainties within the generalized likelihood uncertainty estimation framework for hydrological applications. Journal of Hydrometeorology 10 (3), 807–819.

Quiby, J., Denhard, M., 2003. SRNWP-DWD Poor-Man Ensemble Prediction System: the PEPS Project. Eumetnet Newsletter, pp. 9–12.

Ranzi, R., Zappa, M., Bacchi, B., 2007. Hydrological aspects of the Mesoscale Alpine Programme: findings from field experiments and simulations. Quarterly Journal Royal Meteorological Society 133 (625), 867–880.

Romang, H., et al., 2011. IFKIS-Hydro – early warning and information system for floods and debris flows. Natural Hazards 56 (2), 509–527 (19).

Romanowicz, R.J., Young, P.C., Beven, K.J., 2006. Data assimilation and adaptive forecasting of water levels in the river Severn catchment, United Kingdom. Water Resources Research 42 (6).

Romanowicz, R.J., Young, P.C., Beven, K.J., Pappenberger, F., 2008. A data based mechanistic approach to nonlinear flood routing and adaptive flood level forecasting. Advances in Water Resources 31 (8), 1048–1056.

Rossa, A., Liechti, K., Zappa, M., Bruen, M., Germann, U., Haase, G., Keil, C., Krahe, P., 2011. The COST 731 Action: A review on uncertainty propagation in advanced hydro-meteorological forecast systems. Atmospheric Research 100, 150–167 (this issue).

Rotach, M.W., et al., 2009. MAP D-PHASE real-time demonstration of weather forecast quality in the Alpine region. Bulletin of the American Meteorological Society 90 (9), 1321-+.

Roulin, E., 2007. Skill and relative economic value of medium-range hydrological ensemble predictions. Hydrology and Earth System Sciences 11 (2), 725–737.

Roulin, E., Vannitsem, S., 2005. Skill of medium-range hydrological ensemble predictions. Journal of Hydrometeorology 6 (5), 729–744.

Schaefli, B., Gupta, H.V., 2007. Do Nash values have value? Hydrological Processes 21 (15), 2075–2080.

Siccardi, F., Boni, G., Ferraris, L., Rudari, R., 2005. A hydrometeorological approach for probabilistic flood forecast. Journal of Geophysical Research, [Atmospheres] 110 (D5).

Stensrud, D.J., Bao, J.W., Warner, T.T., 2000. Using initial condition and model physics perturbations in short-range ensemble simulations of mesoscale convective systems. Monthly Weather Review 128 (7), 2077–2107.

Szturc, J., Osrodka, K., Jurczyk, A., Jelonek, L., 2008. Concept of dealing with uncertainty in radar-based data for hydrological purpose. Natural Hazards and Earth System Sciences 8 (2), 267–279.

Todini, E., 2009. Predictive uncertainty assessment in real time flood forecasting. In: Baveye, P.C., Laba, M., Mysiak, J. (Eds.), Uncertainties in Environmental Modelling and Consequences for Policy Making. NATO Science for Peace and Security Series C: Environmental Security, pp. 205–228.

Verbunt, M., Zappa, M., Gurtz, J., Kaufmann, P., 2006. Verification of a coupled hydrometeorological modelling approach for alpine tributaries in the Rhine basin. Journal of Hydrology 324 (1–4), 224–238.

Verbunt, M., Walser, A., Gurtz, J., Montani, A., Schar, C., 2007. Probabilistic flood forecasting with a limited-area ensemble prediction system: selected case studies. Journal of Hydrometeorology 8 (4), 897–909.

Villarini, G., Krajewski, W.F., 2008. Empirically-based modeling of spatial sampling uncertainties associated with rainfall measurements by rain gauges. Advances in Water Resources 31 (7), 1015–1023.

Viviroli, D., Zappa, M., Gurtz, J., Weingartner, R., 2009a. An introduction to the hydrological modelling system PREVAH and its pre- and post-processing-tools. Environmental Modelling and Software 24 (10), 1209–1222.

Viviroli, D., Zappa, M., Schwanbeck, J., Gurtz, J., Weingartner, R., 2009b. Continuous simulation for flood estimation in ungauged mesoscale catchments of Switzerland – part I: modelling framework and calibration results. Journal of Hydrology 377 (1–2), 191–207.

Vrugt, J.A., Gupta, H.V., Bouten, W., Sorooshian, S., 2003. A shuffled complex evolution metropolis algorithm for optimization and uncertainty assessment of hydrologic model parameters. Water Resources Research 39 (8).

Vrugt, J.A., Diks, C.G.H., Gupta, H.V., Bouten, W., Verstraten, J.M., 2005. Improved treatment of uncertainty in hydrologic modeling: combining the strengths of global optimization and data assimilation. Water Resources Research 41 (1).

Walser, A., Luthi, D., Schar, C., 2004. Predictability of precipitation in a cloud-resolving model. Monthly Weather Review 132 (2), 560–577.

Weigel, A.P., Liniger, M.A., Appenzeller, C., 2007. Generalization of the discrete brier and ranked probability skill scores for weighted multimodel ensemble forecasts. Monthly Weather Review 135 (7), 2778–2785.

Wilks, D., 2006. Statistical Methods in the Atmospheric Sciences, vol. 91 of International Geophysics Series. Elsevier, Amsterdam, The Netherlands.

Wohling, T., Lennartz, F., Zappa, M., 2006. Technical note: updating procedure for flood forecasting with conceptual HBV-type models. Hydrology and Earth System Sciences 10 (6), 783–788.

Zappa, M., 2002. Multiple-response verification of a distributed hydrological model at different spatial scales, Dissertation No. 14895, ETH Zurich, [available online at: http://e-collection.ethbib.ethz.ch/show?type=diss&nr=14895]. pp.

Zappa, M., Kan, C., 2007. Extreme heat and runoff extremes in the Swiss Alps. Natural Hazards and Earth System Sciences 7 (3), 375–389.

Zappa, M., Pos, F., Strasser, U., Warmerdam, P., Gurtz, J., 2003. Seasonal water balance of an Alpine catchment as evaluated by different methods for spatially distributed snowmelt modelling. Nordic Hydrology 34 (3), 179–202.

Zappa, M., et al., 2008. MAP D-PHASE: real-time demonstration of hydrological ensemble prediction systems. Atmospheric Science Letters 9 (2), 80–87.

Zappa, M., et al., 2010. Propagation of uncertainty from observing systems and NWP into hydrological models: COST-731 Working Group 2. Atmospheric Science Letters 11 (2), 83–91.