

www.elsevier.com/locate/envsoft

Environmental Modelling & Software 20 (2005) 19–32

On the value of experimental data to reduce the prediction uncertainty of a process-oriented catchment model

Stefan Uhlenbrook a,*, Angela Sieber b

a Institute of Hydrology, University of Freiburg, Fahnenbergplatz, D-79098 Freiburg, Germany
b Geological Survey of the State of Baden-Württemberg, Hydrogeology, Albertstr. 5, D-79104 Freiburg, Germany

Received 9 July 2003; received in revised form 14 October 2003; accepted 3 December 2003

Abstract

Predicting hydrological response to rainfall or snowmelt including an estimation of prediction uncertainty is a major challenge in current hydrological research. The process-based catchment model TACD (tracer aided catchment model, distributed) was applied to the mountainous Brugga basin (40 km²), located in the Black Forest Mountains, southwest Germany. The Monte Carlo-based generalized likelihood uncertainty estimation (GLUE) framework was used to analyse the uncertainty of discharge predictions. The model input parameter sets were generated using the Latin Hypercube sampling method, which is an efficient way to sample the parameter space representatively. It was shown that the number of investigated parameter sets should exceed the number of varied parameters by at least a factor of 10. Even if the process basis and suitability of the model could be proven, relatively large uncertainty ranges of the discharge predictions still occurred during the simulation of floods. Prediction uncertainty varied both temporally and spatially. Incorporating additional data, i.e. sub-basin runoff and observed tracer concentrations, reduced the prediction uncertainty. However, the potential restriction of the uncertainty clearly depends on the goodness of the simulation of the additional data set. Knowledge of the uncertainty of model predictions and of the potential for experimental data to reduce it are crucial to sustainable environmental management, and should be considered more thoroughly during the planning of future field studies.

© 2004 Elsevier Ltd. All rights reserved.

Keywords: Uncertainty; Catchment modelling; TACD model; GLUE; Flood modelling; Multi-response data; Model validation

1. Introduction

The ability to reliably model hydrological processes is essential for optimal management of water resources (Singh, 1995). This includes not only the reproduction of a single variable such as the daily discharge of a medium-sized watershed, but also sufficient process simulation at various scales. The latter is needed to simulate the effects of environmental interferences for different spatial and temporal scales. The simulation of the daily discharge can often be accomplished with a relatively simple model calibrated to observed data if the input data are sufficient (e.g. Jakeman and Hornberger, 1993). However, if more detailed and spatially distributed simulations are required for extensive environmental planning, a more complex and distributed hydrological model needs to be applied. Although models that allow flux estimations at a wide range of spatial and temporal scales are potentially good models, they nevertheless suffer from various shortcomings and uncertainties in their predictions (a detailed discussion is provided e.g. by Beven, 1996, 2001 or Refsgaard and Storm, 1996).

* Corresponding author. Tel.: +49-761-2033520; fax: +49-761-2033594.
E-mail address: [email protected] (S. Uhlenbrook).

1364-8152/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.envsoft.2003.12.006

A large portion of prediction uncertainty is often due to problems and errors of input data, e.g. insufficient quality, lack of long-term data, too few measuring stations, or difficulties regionalizing point measurements to catchment scale (e.g. Blöschl and Grayson, 2000). Another source of uncertainty is the over-parameterization of models, combined with a poor understanding of the interrelationships between model parameters. Model parameters are frequently not physically based, not measurable in the field, or not clearly related to catchment properties. Consequently an equifinality problem exists, meaning that different models or parameter sets yield equally good simulation results (e.g. Beven and Binley, 1992). In addition, the model structure uncertainty can be large (e.g. Grayson et al., 1992; Seibert, 1999a), because the model structure should be consistent with the conceptual model of the investigated basin. The conceptual model describes the ‘basin functioning’ and should be based on experimental findings. However, these results are often lacking and the understanding of the dominating processes at different scales and in different environments is still incomplete (e.g. McDonnell and Tanaka, 2001; Uhlenbrook et al., 2003). Another problem of hydrological models is due to the mathematical descriptions of various processes, and the application of equations to the catchment scale that were derived for the laboratory scale (Beven, 2001). These numerous uncertainties make simulations less reliable for periods outside the calibration period (Melching et al., 1990; Harlin and Kung, 1992; Uhlenbrook et al., 1999) or at neighbouring basins (Seibert, 1999b).

Recently, different methods for estimating the uncertainty of model predictions have been suggested. Beven and Binley (1992) proposed the widely used generalized likelihood uncertainty estimation (GLUE) framework that applies many Monte Carlo simulations to identify ‘equally good’ parameter sets and to estimate the uncertainty of simulations by calculating uncertainty bounds. Thiemann et al. (2001) developed a Bayesian recursive estimation (BARE) approach that can be used for simultaneous parameter estimation and prediction. The prediction at each time step considers the probabilities associated with different output values. The uncertainty associated with the parameter estimates is updated continually, resulting in smaller prediction uncertainties as measurement data are successively assimilated. Franks (2002) presented a Bayesian error-sensitive model calibration scheme, which also accounts for rainfall errors. Brath et al. (2002) estimated the prediction uncertainty bounds for discharge simulations using an approach that considers the discharge volume at each time step as the driving variable. The strengths and shortcomings as well as the underlying philosophy of the first two approaches, GLUE and BARE, and general limitations in the state-of-the-art of current uncertainty estimation techniques are discussed in further detail in Beven and Young (2003) and Gupta et al. (2003).

A way forward to reduce the model uncertainty is to force the model to simulate several observed responses of the modelled system simultaneously (multi-criteria calibration). This is achieved by calibration using other data in addition to discharge at the basin outlet, e.g. hydrochemical data (Mroczkowski et al., 1997), groundwater levels (Lamb et al., 1998; Beldring, 2002), environmental isotopes sampled during events (Seibert and McDonnell, 2002) or sampled continuously (Uhlenbrook and Leibundgut, 2002), and the distribution of saturated areas (Ambroise et al., 1996; Franks et al., 1998; Güntner et al., 1999). However, the extent to which the model uncertainty can be reduced by incorporating additional data into the model calibration is not well understood and has hardly been investigated so far.

The process-oriented catchment model TACD (tracer aided catchment model, distributed) was developed for the Brugga basin (Uhlenbrook et al., 2004) after applying a series of models that were able to simulate the discharge well but could be disproved in terms of their ability to simulate internal processes correctly (TOPMODEL, Güntner et al., 1999; HBV, Uhlenbrook et al., 1999; NPSM, Eisele et al., 2001). TACD is a raster-based, modular catchment model, which at its core is a process-oriented runoff generation routine based on experimental findings including tracer studies (Uhlenbrook et al., 2002). The runoff generation routine uses a spatial delineation of units with the same dominating runoff generation processes. Linear and non-linear reservoir routines are applied to conceptualize lateral flow processes. The successful model application to the Brugga basin (40 km²) was shown for different periods using discharge data and tracer data. However, as TACD is a complex, detailed, and fully distributed model, it suffers from some of the shortcomings described above. In particular, the uncertainty of the model predictions has not been quantified so far. Nor have ways of reducing the prediction uncertainties been investigated.

The objectives of this paper are, first, to quantify the uncertainty of stream flow predictions using the process-oriented catchment model TACD during different types of events (convective storms and snow melt events). The second objective is to investigate the spatial and temporal variability of the prediction uncertainty for the whole Brugga basin (40 km²) and for a 15.2 km² sub-basin. The third is to evaluate the potential for additional experimental data to reduce prediction uncertainty. Therefore, the incorporation of data from an additional runoff gauging station (i.e. multi-scale data) is compared to the integration of tracer data measured in the stream (i.e. multi-response data).

2. The Brugga basin

The study was performed in the meso-scale Brugga basin (40 km²) and the sub-basin St. Wilhelmer Talbach (15.2 km²), located in the southern Black Forest Mountains in southwest Germany. The test site is mountainous with elevations ranging from 438 to 1493 m a.m.s.l. The mean annual precipitation is approximately 1750 mm, generating a mean annual discharge of approximately 1220 mm. The gneiss bedrock is covered by soils, debris, and drift of varying depths (0–10 m). The catchment is widely forested (75%) and the remaining area is pastureland; urban land use is less than 2% of the total area. A detailed description of the Brugga basin can be found in Uhlenbrook (1999).

In recent years, detailed experimental investigations have been conducted using artificial and environmental tracers. It was possible to identify runoff sources and flow paths, to quantify runoff components, and to date the age of different water compartments (for further details, see Uhlenbrook et al., 2002). It was shown that surface runoff is generated on impervious or saturated areas. In addition, fast runoff components are generated on steep, highly permeable slopes covered by boulder fields. Subsurface storm flow components can also be generated at hill slopes with permeable soil and periglacial drift material, which is located either above nearly impermeable bedrock, or above deeper drift cover layers with significantly reduced permeability. Base flow components originate from the fractured hard rock aquifer and the deeper parts of the weathering zone.

3. Model description and input data

The model TACD is a process-oriented catchment model with a modular structure. A detailed description that differs only in the calculation of the interception and potential evapotranspiration is given by Uhlenbrook et al. (2004). It is a conceptual but fully distributed model with 50 × 50 m² grid cells as determined by the resolution of the available digital elevation model (DEM). The model routes the water by applying the single-flow direction algorithm (D8, O’Callaghan and Mark, 1984). TACD works on hourly time steps and is coded within the geographical information system PC-Raster (Karssenberg et al., 2001), which offers a dynamic modelling language. Some modules were obtained from the literature (see below). The process-based runoff generation module that requires a spatial delineation of units with the same dominating runoff generation processes represents the model’s core.

3.1. Meteorological input data

Precipitation [mm h⁻¹] was observed at up to seven stations. Three took hourly or more frequent measurements; the others were disaggregated to hourly resolution by transferring the temporal pattern of the nearest station. The systematic wind error was corrected using an approach of Schulla (1997). A mixed approach of the inverse distance weighting (IDW) method and an elevation gradient method was used to calculate basin precipitation. For the latter, a temporally constant elevation factor derived from the mean precipitation–elevation gradient was applied, and precipitation for each grid cell was estimated according to its elevation. The IDW method was used to compute the hourly precipitation from the available rainfall stations during each time step. Finally, for every grid cell, the precipitation was estimated as a weighted mean (80:20) of the precipitation derived from the IDW method and the elevation gradient method. This regionalization scheme is a compromise. First, the spatial distribution of rain cells at the event scale needs to be captured. Here, the IDW method is the suitable tool. Second, for longer events, the rainfall pattern is further influenced by topography. Here, the elevation gradient becomes more important.
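The weighted combination described above can be illustrated with a short sketch. It is not the TACD implementation: the IDW exponent of 2, the multiplicative form of the elevation correction, and all function and argument names are assumptions made for illustration; only the 80:20 weighting is taken from the text.

```python
import math


def idw(stations, x, y, power=2.0):
    """Inverse distance weighting of station values at the point (x, y).

    stations: list of (x, y, value). The exponent `power` is an assumption.
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # the cell coincides with a station
        weight = dist ** -power
        num += weight * value
        den += weight
    return num / den


def elevation_estimate(mean_p, mean_elev, cell_elev, factor):
    """Scale the mean station precipitation with a constant elevation factor.

    `factor` is the relative change per metre (hypothetical formulation).
    """
    return max(0.0, mean_p * (1.0 + factor * (cell_elev - mean_elev)))


def cell_precipitation(stations, station_elevs, x, y, cell_elev, factor, w_idw=0.8):
    """Weighted mean (80:20 by default) of the IDW and elevation estimates."""
    p_idw = idw(stations, x, y)
    mean_p = sum(value for _, _, value in stations) / len(stations)
    mean_elev = sum(station_elevs) / len(station_elevs)
    p_elev = elevation_estimate(mean_p, mean_elev, cell_elev, factor)
    return w_idw * p_idw + (1.0 - w_idw) * p_elev
```

In this sketch the IDW term carries the event-scale rain cell pattern, while the elevation term adds the topographic signal that the text describes as more relevant for longer events.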

Temperature [°C] was observed at up to 10 locations in and around the Brugga basin and was regionalized using a simple elevation gradient. Two hourly variable gradients were applied for the upper and lower parts of the catchment to account for inversions that are common during winter.

The potential evapotranspiration [mm h⁻¹] was calculated using the physically based Penman–Monteith approach that incorporates the following meteorological variables: temperature, wind speed, humidity, and net radiation. For the latter, global radiation [Wh m⁻²] was calculated using the model POTRAD (Potential Radiation Equator model, Version 5, http://www.geo.vu.nl/~damo/potrad/potrad.htm, date: 2003-04-22) (van Dam, 2000). This model calculates the incoming global radiation as flux for a given time step and grid cell, taking topography (slope, aspect, and shadowing effects), solar geometry (declination of the sun, latitude, and azimuth angle), and sunshine duration (cloudiness) into consideration. For the net radiation, the balance of long wave radiation is also needed, which was calculated according to Schulla (1997). To account for spatially and temporally variable land use cover when modelling evapotranspiration, the following vegetation parameters were taken from the literature (DVWK, 1996; Schulla, 1997; Bremicker, 2000): bulk-surface resistance [s m⁻¹], albedo [–], leaf area index [m² m⁻²], and effective plant height [m]. Parameter values were varied monthly for each land use class. Evaporation of snow was considered by monthly variable values (DVWK, 1996).

Wind velocity [m s⁻¹] was measured at up to seven locations, and directly transferred to the other precipitation stations using values from the nearest station. Sunshine duration [–] was measured at four stations, and humidity [%] at three stations. All three parameters were regionalized to catchment scale by applying the IDW method.


3.2. Snow module

A temperature index method was used to simulate snow cover development and snowmelt (e.g. Bergström, 1992). Precipitation is modelled as snow if the air temperature at the specific grid cell is below the threshold temperature TT [°C]. Snowmelt occurs if the temperature is above TT. In this case, TT is varied for forested and non-forested areas (TTmelt_forest and TTmelt) to account for different snowmelt conditions. CFMAX [mm h⁻¹ °C⁻¹] defines the amount of melting water. A snowfall correction parameter, SFCF [–], accounts for systematic error during snowfall measurement. Melted water is conducted into the soil module, but can be stored in the snow cover with up to 10% of the water equivalent. Stored melt water can refreeze at temperatures below TT. The amount of refrozen melt water was defined as 5% of CFMAX multiplied by the difference between TT and the temperature at the respective time step. Both percent values were obtained from Bergström (1992).
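A minimal sketch of one hourly step of such a temperature index routine for a single grid cell follows. It is not the TACD code: the order of operations, the treatment of rain falling on an existing snow cover, the default parameter values, and all names are assumptions; the 10% water holding capacity and the 5% refreezing coefficient follow the text.

```python
def snow_step(swe, liquid, precip, temp, tt=0.0, tt_melt=0.0,
              cfmax=0.08, sfcf=1.0, cwh=0.1, cfr=0.05):
    """Advance the snow storage of one grid cell by one hour.

    swe:    frozen snow water equivalent [mm]
    liquid: liquid water held in the snow cover [mm]
    precip: precipitation in this hour [mm], temp: air temperature [deg C]
    Returns (swe, liquid, outflow), where outflow goes to the soil module.
    """
    if temp < tt:
        swe += sfcf * precip            # snowfall, corrected by SFCF
    else:
        liquid += precip                # rain enters the liquid storage (assumption)

    if temp > tt_melt:
        melt = min(swe, cfmax * (temp - tt_melt))   # degree-hour melt
        swe -= melt
        liquid += melt
    elif temp < tt:
        refreeze = min(liquid, cfr * cfmax * (tt - temp))
        liquid -= refreeze
        swe += refreeze

    # Liquid water beyond the holding capacity (CWH * swe) leaves the snow cover.
    outflow = max(0.0, liquid - cwh * swe)
    liquid -= outflow
    return swe, liquid, outflow
```

For forested cells the same function could be called with the forest melt threshold passed as `tt_melt`. Without any snow cover the holding capacity is zero, so rain passes straight through to the soil module.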

3.3. Soil module

The soil water module from the HBV model (Bergström, 1992) was used for unit types with a soil zone. Its two parameters were varied for each landscape unit. FC [mm] represents the maximum soil water storage. Water transfer to the runoff generation routine for a given precipitation input P [mm h⁻¹] is computed by a non-linear function. This function depends on the actual soil moisture, S_sm [mm], as well as the parameters BETA [–] and FC:

$$\frac{\mathrm{recharge}}{P} = \left( \frac{S_{sm}}{FC} \right)^{BETA} \qquad (1)$$

This simple module can be characterized as process-realistic, as percolation to the runoff generation routine is possible before maximum soil moisture content (defined by FC) is reached. Actual evapotranspiration [mm h⁻¹] is computed for each cell depending on the availability of stored soil moisture. According to Menzel (1997), potential evapotranspiration is reduced linearly if local soil moisture storage is below 60% of soil moisture storage capacity.
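The soil routine can be sketched as follows, using Eq. (1) for the recharge and the linear reduction of potential evapotranspiration below 60% of FC. The order of the two steps, the handling of storage above FC, and the function and variable names are assumptions for illustration, not the published implementation.

```python
def soil_step(ssm, p, pet, fc, beta, lp=0.6):
    """One hourly step of an HBV-type soil moisture routine for one cell.

    ssm: actual soil moisture [mm], p: water input [mm h-1],
    pet: potential evapotranspiration [mm h-1],
    fc: maximum soil water storage [mm], beta: shape parameter [-],
    lp: fraction of fc below which evapotranspiration is reduced.
    Returns (ssm, recharge, aet).
    """
    # Eq. (1): share of the input passed on to the runoff generation routine.
    recharge = p * (ssm / fc) ** beta
    ssm += p - recharge

    # Storage above FC is also passed on (assumption, keeps ssm <= fc).
    excess = max(0.0, ssm - fc)
    ssm -= excess
    recharge += excess

    # Potential evapotranspiration is reduced linearly below lp * fc.
    aet = min(ssm, pet * min(1.0, ssm / (lp * fc)))
    ssm -= aet
    return ssm, recharge, aet
```

With FC = 200 mm and BETA = 2, a half-filled storage passes on a quarter of the input, which illustrates why percolation already occurs before FC is reached.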

3.4. Runoff generation and routing module

Tilch et al. (2002) developed an objective approach based on previous process investigations to delineate units with similar dominating runoff generation behaviour. It considers different hydrogeological units, surface characteristics, and topography as major controls on runoff processes for each grid cell. Therefore, maps of geology, soils, topography, stream network, land use, and forest habitats as well as a DEM were used. The delineation is the basis for the spatial discretization incorporated into the TACD runoff generation routine. The following units are differentiated: areas dominated by (i) Horton overland flow, (ii) saturation overland flow, (iii) fast subsurface storm flow (slope greater than 25°, boulder fields), (iv) fast subsurface storm flow and piston flow (accumulation and colluvium zones at the toes of hill slopes), (v) delayed interflow (slope angle: 6–25°, with stratified soils and drift covers), (vi) very delayed, damped interflow (moraines), (vii) deep percolation and base flow generation (flat or hilly areas at the hilltops), and (viii) deep percolation to large porous aquifers and base flow generation.

Reservoir systems with parallel, sequentially connected, or overflowing reservoirs (‘tanks’) were designed for each of the eight units to model the lateral flow (for further details, see Uhlenbrook et al., 2004). To simulate runoff from urbanized areas, the portion of impervious area in each urban raster cell was distinguished from the portion of areas where infiltration is still possible (open spaces, gardens, etc.). For the latter, the soil and runoff generation routine of the respective runoff unit type was used.

The kinematic wave approach with an implicit non-linear solution scheme was used to simulate runoff routing in the stream network. Further details about the parameterization are discussed in Uhlenbrook et al. (2004).

3.5. Hydrological and hydrochemical data

Runoff is measured continuously at the outlets of the Brugga basin (40 km²) and the St. Wilhelmer Talbach sub-basin (15.2 km²) at official gauging stations of the Baden-Württemberg Environmental Survey. Hourly data were available for the four investigated events (see below). The model was applied for the 400 different parameter sets (see the following Section 4) with identical antecedent hydrological conditions, i.e. the fillings of the surface water storage, snow storage, soil water storage, groundwater storage and channel storage were the same for all runs. The storage fillings were obtained by modelling an initialization period of at least 12 months to reach realistic levels. In addition, a period of at least 10 time steps was simulated before the beginning of each investigated flood event.

Dissolved silica has been used as a tracer to examine flow pathways and to separate different runoff components at the study site (Uhlenbrook et al., 2002). During the four investigated events, stream water was sampled at the catchment outlet at varying temporal resolutions (i.e. 1–4-h intervals). The samples were analysed for dissolved silica according to a photometric method following the German Institute for Standardisation (DIN). The analytical Gaussian standard error was approximately 0.06 mg l⁻¹. A simple mixing model was applied to simulate concentrations of dissolved silica at the Brugga outlet:

$$Si_{conc} = Q_{sim}^{-1} \times \sum \left( Si_{runoff\text{-}comp} \times Q_{runoff\text{-}comp} \right) \qquad (2)$$

where Si_conc [mg l⁻¹] is the simulated silica concentration, Si_runoff-comp [mg l⁻¹] is the silica concentration of each runoff component (see Section 3.4) entering the stream within the modelled time step, Q_runoff-comp [mm h⁻¹] is the flow volume of the respective runoff component, and Q_sim [mm h⁻¹] is the modelled discharge. The silica concentrations assigned to each runoff component are based on previous experimental investigations.
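Eq. (2) is a flow-weighted mean of the component concentrations. The sketch below illustrates it under the simplifying assumption that the modelled discharge equals the sum of the runoff components in the time step; in the routed model this need not hold exactly. The function name and the example concentrations are hypothetical.

```python
def silica_concentration(components):
    """Flow-weighted silica concentration at the outlet, following Eq. (2).

    components: list of (si, q) pairs with the silica concentration [mg l-1]
    and the flow [mm h-1] of each runoff component in the time step.
    Assumes the modelled discharge is the sum of the component flows.
    """
    q_sim = sum(q for _, q in components)
    if q_sim <= 0.0:
        raise ValueError("total discharge must be positive")
    return sum(si * q for si, q in components) / q_sim


# Hypothetical example: a dilute fast component and a silica-rich base flow.
example = silica_concentration([(2.0, 1.0), (6.0, 3.0)])  # 5.0 mg l-1
```

Because the result depends directly on the concentrations assigned to the components, an offset in those values shifts the whole simulated series without changing its relative dynamics.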

4. Investigating model uncertainty

4.1. Monte Carlo simulations using Latin Hypercube sampling

The GLUE procedure (Beven and Binley, 1992) was used to investigate model prediction uncertainty. This Monte Carlo-based technique divides the parameter sets used for simulations into acceptable and unacceptable sets according to their modelling performance. A threshold value of an objective function determines which parameter sets are acceptable. In this study, statistical measures that describe the goodness of model fit for the respective parameter set are used as objective functions (see Section 4.2). The acceptable parameter sets are used to compute uncertainty ranges, e.g. the 5% and 95% quantiles. For that purpose, a probability distribution of the simulated target variable (in a hydrological catchment model, mostly the discharge) and its quantiles are calculated for each modelling time step. The GLUE technique normally requires a large number of model runs if random sampling of the model parameters is performed (e.g. Freer et al., 1996). Due to the long processing time of the complex TACD model, a ‘classical’ analysis with inefficient random sampling of parameter values was impossible. Therefore, Latin Hypercube sampling (McKay et al., 1979) was applied as a more efficient sampling strategy. Using this method, the range of each parameter is divided into n intervals, with n representing the number of necessary parameter combinations. The intervals are sized so that each has the same probability under the chosen distribution function. Assuming a uniform distribution (as proposed by Beven, 2001), the intervals were set to equal sizes and one parameter value was taken from each interval. Each parameter value is combined randomly with the values of the other investigated parameters to obtain n parameter sets. This guarantees that the full range of each parameter is covered. As a result, the parameter space can be sampled representatively with significantly fewer samples and the number of necessary model runs can be reduced. Yu et al. (2001) compared Latin Hypercube sampling and random sampling during an uncertainty analysis of a distributed rainfall-runoff model. The authors demonstrated that Latin Hypercube sampling requires only 10% of the runs needed with random sampling to produce similar uncertainty bounds. This highlights the potential of Latin Hypercube sampling, which is particularly useful if a high number of model runs is impossible and the assessment of high quantiles (>0.99) is not necessary (Helton and Davis, 2000).
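The sampling and acceptance steps described above can be sketched in a few lines. This is an illustration only: all function names are hypothetical, the uniform distribution follows the text, and the uncertainty bounds are computed as plain (unweighted) quantiles of the accepted runs, which is a simplification of the likelihood-weighted bounds often used in GLUE applications.

```python
import random


def latin_hypercube(ranges, n, rng):
    """Draw n parameter sets. Each parameter range is split into n equal
    intervals (uniform distribution) and every interval is used exactly once."""
    columns = {}
    for name, (low, high) in ranges.items():
        width = (high - low) / n
        values = [low + (i + rng.random()) * width for i in range(n)]
        rng.shuffle(values)  # random pairing with the other parameters
        columns[name] = values
    return [{name: columns[name][i] for name in ranges} for i in range(n)]


def quantile(ordered, q):
    """Linearly interpolated quantile of a list sorted in ascending order."""
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def glue_bounds(simulations, scores, threshold, lower=0.05, upper=0.95):
    """Keep the runs whose objective function reaches the threshold and
    return (lower, upper) quantiles of the simulated value per time step."""
    accepted = [sim for sim, score in zip(simulations, scores) if score >= threshold]
    if not accepted:
        raise ValueError("no parameter set reached the threshold")
    bounds = []
    for values in zip(*accepted):
        ordered = sorted(values)
        bounds.append((quantile(ordered, lower), quantile(ordered, upper)))
    return bounds
```

A call such as `latin_hypercube({"FC_3": (200.0, 300.0), "BETA_3": (1.0, 5.0)}, 400, random.Random(1))` would yield 400 parameter sets in which every one of the 400 intervals of each parameter is represented once.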

Latin Hypercube sampling has been widely used in environmental modelling studies (e.g. Helton et al., 1998), but examples using hydrological models are less numerous. Christiaens (2001) used Latin Hypercube sampling for a sensitivity and uncertainty analysis of the highly complex, widely physically based model MIKE-SHE; Lal et al. (1997) used this sampling strategy to analyse a regional hydrological model. Melching (1992) performed comparative studies between different sampling techniques for the hydrological model HEC-1 and demonstrated that Latin Hypercube sampling is a very powerful tool for uncertainty analysis because it offers the flexibility of random sampling but requires less computing power.

The majority of model parameters were defined as uncertain and varied during the analysis (Table 1): 37 and 33 parameters for events with and without the influence of snow, respectively. The parameter ranges were defined in accordance with experience from previous model applications (Uhlenbrook et al., submitted for publication) and an uncertainty analysis of the HBV model (Uhlenbrook et al., 1999).

Recommendations for the necessary number ofmodel runs using Latin Hypercube sampling, n, varyin the literature. Iman and Helton (1985) suggestedsampling two to five times the number of varied modelparameters. Melching (1995) proposed to define n bychecking convergence of statistical measures of modeloutput on the number of executed model runs.Therefore, a preliminary study using a new HBV version(Bergstrom, 1992), the so-called model HBV light(Seibert, 2002), was performed at the Brugga basin(Sieber, 2003). This model computes more quickly thanthe TACD model but uses partially similar routines.Here, the results gathered by Latin Hypercube samplingand random sampling for several 10 000 model runswere compared. It was shown that both samplingstrategies yield similar statistical measurement values ifn exceeds 10 times the number of varied parameters.This demonstrated clearly that the guideline value ofIman and Helton (1985) is too low. Additional inves-tigations performing 1500 model runs with the TACD

model for the short runoff event 1 indicated that similarresults were obtained with the two different sampling

24 S. Uhlenbrook, A. Sieber / Environmental Modelling & Software 20 (2005) 19e32

Table 1

Parameters of the model TACD and their ranges used for the Monte Carlo simulations

Parameter       Unit          Explanation                                               Minimum   Maximum

Precipitation correction
WindA           –             Correction of precipitation measurement                   1.0       1.12
WindB           s m⁻¹         Correction of precipitation measurement                   0.0025    0.04

Snow routine
TT              °C            Temperature threshold for snow fall                       −1.0      1.0
TT_melt         °C            TT for snow melt                                          −1.0      1.0
TT_meltforest   °C            TT for snow melt in forests                               0.25      3.0
SFCF            –             Snow fall correction factor                               0.9       1.1
CFMAX           mm °C⁻¹ d⁻¹   Degree-hour factor                                        0.04      0.125
CWHᵃ            –             Water holding capacity                                    0.1
CFRᵃ            –             Refreezing coefficient                                    0.05

Soil routine
LPᵃ             –             Reduction of potential evapotranspiration                 0.6
FC_3            mm            Field capacity at unit type 3                             200       300
FC_4            mm            Field capacity at unit type 4                             105       155
FC_5            mm            Field capacity at unit type 5                             70        110
FC_6            mm            Field capacity at unit type 6                             160       240
FC_7            mm            Field capacity at unit type 7                             160       240
FC_8            mm            Field capacity at unit type 8                             175       265
BETA_3          –             Beta parameter at unit type 3                             1.0       5.0
BETA_4          –             Beta parameter at unit type 4                             0.75      4.0
BETA_5          –             Beta parameter at unit type 5                             0.5       3.5
BETA_6          –             Beta parameter at unit type 6                             0.75      4.0
BETA_7          –             Beta parameter at unit type 7                             0.75      4.0
BETA_8          –             Beta parameter at unit type 8                             1.0       5.0

Runoff generation routine
UrbanSplitᵃ     –             Portion of sealed areas in unit type 1                    0.4
MTD             mm            Micro-topographic depression storage at unit type 2       10        50
K_2             h⁻¹           Recession coefficient for unit type 2                     0.0025    0.04
K_3             h⁻¹           Recession coefficient for unit type 3                     0.00025   0.004
K_4_u           h⁻¹           Upper recession coefficient for unit type 4               0.006     0.1
K_4_l           h⁻¹           Lower recession coefficient for unit type 4               0.00125   0.02
T_4             mm h⁻¹        Percolation from upper to lower reservoir at unit type 4  0.05      0.8
H_4             mm            Maximal storage capacity of lower storage at unit type 4  300       500
K_5_u           h⁻¹           Upper recession coefficient for unit type 5               0.005     0.7
K_5_l           h⁻¹           Lower recession coefficient for unit type 5               0.006     0.1
T_5             mm h⁻¹        Percolation from upper to lower reservoir at unit type 5  0.15      2.4
H_5             mm            Maximal storage capacity of lower storage at unit type 5  60        100
K_6             h⁻¹           Upper recession coefficient for unit type 6               0.05      0.8
K_7_u           h⁻¹           Upper recession coefficient for unit type 7               0.05      0.7
K_7_l           h⁻¹           Lower recession coefficient for unit type 7               0.002     0.025
T_7             mm h⁻¹        Percolation from upper to lower reservoir at unit type 7  0.15      2.4
H_7             mm            Maximal storage capacity of lower storage at unit type 7  110       190
K_8             h⁻¹           Upper recession coefficient for unit type 8               0.0005    0.008
Upper_H         mm            Maximal storage capacity of all upper groundwaters
GW_K            h⁻¹           Recession coefficient for hard rock aquifer               0.00025   0.0025
GW_Hᵃ           mm            Maximal storage capacity of hard rock aquifer             1000
GW_P            mm h⁻¹        Percolation from upper reservoirs to hard rock aquifer    0.02      0.3

Runoff routing routine
StreamWidthᵃ    m             Stream width                                              0.3–6.1
StreamLengthᵃ   m             Stream lengths per raster cell                            60.6
nᵃ              m^(1/3) s⁻¹   Manning’s n (roughness coefficient)                       0.06–0.08
Betaᵃ           –             Parameter for kinematic wave routing                      0.6

ᵃ Parameter was not varied. Field data or values from the literature were used.


strategies after about 380 model runs (Fig. 1). This is in line with the experience gained using the HBV model, as 380 is about 10 times the number of varied model parameters. Finally, the number of investigated model runs was set to 400.
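The sampling step can be illustrated with a minimal sketch of Latin Hypercube sampling. It assumes uniform distributions within the ranges of Table 1 and uses only four of the varied parameters; the function name `latin_hypercube` and the choice of parameters are illustrative and not taken from the study.

```python
import random

# Ranges copied from Table 1 for a few of the varied parameters (illustrative subset).
RANGES = {
    "TT": (-1.0, 1.0),
    "CFMAX": (0.04, 0.125),
    "FC_3": (200.0, 300.0),
    "K_2": (0.0025, 0.04),
}


def latin_hypercube(ranges, n_runs, seed=0):
    """Draw n_runs parameter sets (assumption: uniform priors).

    Each parameter range is split into n_runs equally wide strata and
    every stratum is used exactly once per parameter.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        # One uniform draw inside each stratum, then a random pairing order.
        u = [(i + rng.random()) / n_runs for i in range(n_runs)]
        rng.shuffle(u)
        columns[name] = [low + ui * (high - low) for ui in u]
    return [{name: columns[name][i] for name in ranges} for i in range(n_runs)]


# Rule of thumb from the text: about 10 model runs per varied parameter.
n_runs = 10 * len(RANGES)
parameter_sets = latin_hypercube(RANGES, n_runs)
```

With the roughly 38 varied parameters of Table 1, the same rule gives the approximately 380 runs mentioned above.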

4.2. Objective functions and choice of acceptable parameter sets with the GLUE framework

Fig. 1. Mean and standard deviation of the simulations of average runoff Qav (a and c) and maximum runoff Qmax (b and d) for event no. 1 subject to the number of simulations for Latin Hypercube sampling.

The model efficiency, Reff [–] (Nash and Sutcliffe, 1970), was used to evaluate the agreement between simulated and observed discharge for hourly time steps. Efficiency values range between −∞ and 1.0. A value of 1.0 indicates perfect agreement between simulated and observed discharge; negative values indicate poor model fits. The coefficient of determination, R² [–], was used to evaluate the agreement between simulated and observed tracer concentrations. For the shorter events 1 and 2 (see below), hourly silica data were available, and for the longer events 3 and 4, silica measurements were available only for every fourth hour. The coefficient of determination was used to evaluate the chemistry simulations, as this statistical measure is sensitive to the simulated and observed relative increase/decrease of concentrations, even if the absolute values do not fit well. The model efficiency would be too strict a statistical measure, as it is sensitive to the absolute values of the concentrations (Weglarczyk, 1998). Due to the simple mixing approach and the uncertainty of the silica concentrations of the different runoff components (Section 3.5), an offset in the simulations can easily occur. Therefore, a good simulation of relative concentration changes is already a success. The stricter model efficiency is suitable for discharge simulations, as the model is more complex in this part and the conservation of mass is valid. In addition, the model efficiency is sensitive to larger values; thus, it is suitable for the evaluation of floods.
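The difference between the two measures can be made concrete with a short sketch. It assumes that R² is computed as the squared Pearson correlation coefficient, which is one common definition; the series are invented for illustration and are not data from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Model efficiency Reff (Nash and Sutcliffe, 1970)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(observed, simulated):
    """Coefficient of determination, taken here as the squared Pearson r."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov ** 2 / (var_o * var_s)


# Hypothetical series: the simulation reproduces the dynamics exactly
# but carries a constant offset of +1.
observed = [1.0, 2.0, 4.0, 3.0, 2.0]
offset_simulation = [o + 1.0 for o in observed]

reff_offset = nash_sutcliffe(observed, offset_simulation)  # close to zero
r2_offset = r_squared(observed, offset_simulation)          # essentially 1
```

In this example the constant offset leaves R² at essentially 1 while Reff drops to about 0.04, which is the behaviour the paragraph relies on when it reserves Reff for discharge and R² for silica.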

The objective functions, i.e. Reff at the Brugga outlet (Reff Brugga) and at the outlet of the St. Wilhelmer Talbach (Reff Talbach) as well as R² of the silica simulations at the Brugga outlet (R² Silica), were combined to analyse the value of incorporating additional experimental data into the modelling process in order to reduce the prediction uncertainty. The single objective functions were multiplied for this combination, with the condition that a minimum number of 40 acceptable parameter sets had to remain after combining two single measures. This threshold of 40 parameter sets ensures that single outliers (significantly different simulations of the respective variable, i.e. discharges or silica concentrations) cannot influence the computed uncertainty bounds. Before combining the single objective functions, they were fuzzy-transformed (see Seibert, 1997).

Normally within the GLUE framework, parameter sets and their corresponding model results are excluded if their objective function does not exceed a certain threshold. To avoid this abrupt distinction between acceptable and unacceptable parameter sets, partial rejection of a parameter combination based on its position relative to upper and lower thresholds was applied. For the upper threshold, the maximum value of an objective function that could be achieved with all parameter sets for a specific event was identified. This value was then set to 1 for the fuzzy membership function. In a next step, the aforementioned threshold of 40 acceptable parameter sets defined the lower threshold. Its value was determined as follows for each runoff event separately: the initial number of parameter sets used for combining the objective functions was fixed to equal size for all three measures. The number m of parameter sets yielding the best measures of fit was iteratively varied until the minimum number of 40 acceptable combined parameter sets was exceeded. The lower threshold for each objective function corresponded to the worst value of the objective functions of the m parameter sets. Between the upper and lower threshold, a linear function was fitted. All objective function values below the lower threshold were set to 0; consequently, these parameter sets were excluded as unacceptable.
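The procedure described above can be sketched as follows. This is one possible reading under stated assumptions: the linear membership is taken to run from 0 at the lower threshold to 1 at the upper threshold, and the search for m counts the parameter sets that appear among the m best sets of both measures. The function names are illustrative.

```python
def fuzzy_membership(value, lower, upper):
    """Linear fuzzy transformation between the lower and upper threshold."""
    if value <= lower:
        return 0.0  # rejected as unacceptable
    if value >= upper:
        return 1.0  # best value achieved for this event
    return (value - lower) / (upper - lower)


def combine(*memberships):
    """Combine single measures by multiplication."""
    product = 1.0
    for membership in memberships:
        product *= membership
    return product


def smallest_m(scores_a, scores_b, min_sets=40):
    """Smallest m for which the m best sets of both measures share at
    least min_sets parameter sets (assumed interpretation of the text)."""
    n = len(scores_a)
    order_a = sorted(range(n), key=lambda i: scores_a[i], reverse=True)
    order_b = sorted(range(n), key=lambda i: scores_b[i], reverse=True)
    for m in range(min_sets, n + 1):
        common = set(order_a[:m]) & set(order_b[:m])
        if len(common) >= min_sets:
            return m, common
    return n, set(range(n))


# Thresholds for event 2, read as (lower, upper) from Table 2; the two
# objective function values of the parameter set are hypothetical.
mu_runoff = fuzzy_membership(0.585, lower=0.20, upper=0.97)
mu_silica = fuzzy_membership(0.50, lower=0.42, upper=0.58)
combined = combine(mu_runoff, mu_silica)
```

For the hypothetical parameter set, both memberships are about 0.5, so the combined measure is about 0.25. A set that falls below the lower threshold of either measure receives a combined value of 0 and drops out.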

This procedure resulted in a variable number of parameter sets that were considered when computing the 5% and 95% uncertainty bounds for every event, and in variable upper and lower thresholds used for the fuzzy-transformation. However, this method guaranteed a fair comparison of the effectiveness of different additional data in reducing the uncertainty bounds. If different initial sets (i.e. with varying numbers of parameter sets) had been used for the combination of the objective functions, the integration of the larger set, which includes relatively more parameter sets that lead to bad simulations, would inevitably lead to wider uncertainty bounds of both combined measures. Table 2 lists the number of parameter sets considered when computing the uncertainty bounds, the upper and lower thresholds used for the fuzzy-transformation, as well as the number of parameter sets remaining after the combination of the single measures.
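How the 5% and 95% bounds respond to such a combination can be sketched with likelihood-weighted quantiles, as is common in GLUE applications. The weighting by the fuzzy measure is an assumption here, and the peak discharges and weights below are hypothetical.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalised weight reaches q.

    Parameter sets with zero weight are ignored, i.e. they count as
    rejected.
    """
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0.0)
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


# Hypothetical simulated peak discharges (m3/s) of six parameter sets.
peaks = [6.0, 7.5, 9.0, 10.0, 10.5, 11.0]
# Fuzzy measure from one objective function only.
weights_single = [0.2, 0.4, 0.6, 1.0, 0.8, 0.3]
# After multiplication with a second measure, two sets are rejected.
weights_combined = [0.0, 0.0, 0.3, 1.0, 0.6, 0.3]

bounds_single = (
    weighted_quantile(peaks, weights_single, 0.05),
    weighted_quantile(peaks, weights_single, 0.95),
)
bounds_combined = (
    weighted_quantile(peaks, weights_combined, 0.05),
    weighted_quantile(peaks, weights_combined, 0.95),
)
```

In this toy example the lower bound rises from 6.0 to 9.0 m³ s⁻¹ while the upper bound stays at 11.0 m³ s⁻¹, because the rejected sets all simulated low peaks.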

5. Simulation results

Four flood events were examined: two convective storm events (no. 1 and 2) and two snow melt events (no. 3 and 4). Events 1, 2, and 3 had a recurrence interval of less than 1 year. The durations of these flood events at the outlet were approximately 10 h, 18 h and 12 days, respectively. Event 3 was a long-lasting multi-peak event, whose simulation was expected to be difficult as it was influenced by snow accumulation and melt processes for several days. Event 4 was a larger flood (duration approximately 5 days), also influenced by snow processes. This flood peak had a recurrence interval of about 8–9 years.

The simulations for the four flood events using the 400 sampled parameter sets yielded results ranging from very poor (negative Reff values) to very good (Reff values close to 1) (Table 3). The Reff values were slightly better for the Brugga basin than for the St. Wilhelmer Talbach sub-basin. This is particularly clear for the complex event 3. Similar results with a better model performance for the larger basin were obtained in the same study area by Seibert et al. (2000), who applied the HBV model, and Kleinhans (2000), who used the WaSiM-ETH model. Bremicker (2000) also observed better modelling results for larger basins by applying the LARSIM model to the macro-scale Weser basin with several nested sub-basins. As a possible explanation, these authors suggest an averaging-out effect of input data errors that is more distinct for larger basins. This could be supplemented by an averaging-out effect of specific process reactions in larger basins. Smaller basins are in most cases more sensitive to explicit process reactions that are, due to data and model constraints, often not well understood or not captured adequately with the applied modelling system.

Simulations of the dissolved silica concentrations generally yielded worse results than the runoff simulations (Table 4). In particular, the silica dynamics for the small convective event 1 were not reproduced well. For this event, the minimum silica concentrations were modelled simultaneously with the simulated peak discharge, but a few hours earlier than the observed minimum silica concentrations. It should be noted that the duration of the main part of the event was only about 6 h; thus, minor timing problems in input and output data have large impacts on the accuracy of the silica simulations. In addition, the simplicity of the silica model and its associated uncertainties (Section 4.2) should be considered. However, it is possible that the temporally variable contributions of the different runoff components activated during the small intense rain storm do not agree with reality as well as they do for the longer events. The silica concentrations during the long-lasting snowmelt-influenced events (3 and 4) were modelled well. In general, the results are at least as good as the simulation results achieved with comparable rainfall-runoff models for other chemical species (see Bergström et al., 1985; Lundquist et al., 1990).

Table 2

Number of parameter sets taken into consideration for the combination of objective functions; upper and lower thresholds of the objective functions and number of parameter sets remaining after the combination of two objective functions

                 Number of parameter sets     Lower        Upper        Sets after
                 taken into consideration     threshold    threshold    combination
                 for combination

Event no. 1
R² Silica        –                            –            –
Reff Brugga      155                          0.80         0.98
Reff Talbach     155                          0.51         0.96

Event no. 2
R² Silica        200                          0.42         0.58
Reff Brugga      200                          0.20         0.97
Reff Talbach     200                          0.79         0.96

Event no. 3
R² Silica        120                          0.74         0.83
Reff Brugga      120                          0.68         0.95
Reff Talbach     120                          0.55         0.87

Event no. 4
R² Silica        120                          0.67         0.73
Reff Brugga      120                          0.82         0.97
Reff Talbach     120                          0.78         0.97

The variations of the uncertainty bounds, i.e. the 5% and 95% quantiles, are discussed in further detail for events 2 (summer) and 3 (snow melt influenced). The uncertainty of the modelled peak discharge of event 2 was reduced from a range of 6.1–10.7 m³ s⁻¹ to 6.9–10.8 m³ s⁻¹ by combining the two objective functions Reff Brugga and R² Silica (Fig. 2). In other words, only parameter sets that were able to simulate both the discharge of the Brugga and the concentrations of dissolved silica were considered for simulating the peak discharge. In particular, the 5% quantile was increased, while the 95% quantile was almost constant throughout the exercise. The reason for the slightly increased 95% quantile (from 10.7 to 10.8 m³ s⁻¹) is that fewer parameter sets resulting in high peak flow simulations were eliminated by the combination of the two objective functions than parameter sets simulating lower peak flows. The uncertainty of the peak discharge simulations for the entire catchment was reduced more effectively by integrating the modelled discharge of the St. Wilhelmer Talbach sub-basin. The peak discharge range could be restricted to 9.1–10.8 m³ s⁻¹. Again, the lower value of the uncertainty bound was significantly increased and the upper value remained constant. Interestingly, the observed discharge is outside the computed uncertainty bounds for this combined measure (see Section 6 for further discussion).
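The peak discharge ranges quoted for event 2 can be compared by their widths. The short calculation below uses only the bounds given in the text; the dictionary keys are illustrative labels.

```python
# 5% and 95% bounds of the simulated peak discharge of event 2 (m3/s),
# as quoted in the text.
bounds = {
    "Reff Brugga": (6.1, 10.7),
    "Reff Brugga x R2 Silica": (6.9, 10.8),
    "Reff Brugga x Reff Talbach": (9.1, 10.8),
}

base_low, base_high = bounds["Reff Brugga"]
base_width = base_high - base_low

# Width of each range and its reduction relative to the single measure.
widths = {name: round(high - low, 1) for name, (low, high) in bounds.items()}
reduction_percent = {
    name: round(100 * (1 - (high - low) / base_width))
    for name, (low, high) in bounds.items()
}
```

The width shrinks from 4.6 to 3.9 m³ s⁻¹ (about 15%) when the silica data are added, and to 1.7 m³ s⁻¹ (about 63%) when the sub-basin runoff is added.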

The simulation results of the additionally integrated data are given in Fig. 3. In agreement with the results shown in Fig. 2, the fuzzy-transformed objective functions were again used for computing the uncertainty bounds. The model overestimated the silica concentrations; the runoff of the sub-basin was modelled reasonably well.

Table 3

Ranges of model efficiencies obtained during the 400 runoff simulations

            Reff Brugga             Reff Talbach
            Minimum    Maximum      Minimum    Maximum
Event 1     −4.7       0.98         0.1        0.96
Event 2     −9.4       0.97         −12.7      0.96
Event 3     −14.8      0.95         −20.5      0.87
Event 4     −5.2       0.97         −30.7      0.97

Table 4

Ranges of coefficients of determination obtained during the 400 silica simulations

            R² Silica
            Minimum    Maximum
Event 1     0.0        0.17
Event 2     0.21       0.58
Event 3     0.46       0.83
Event 4     0.51       0.73

Only minor reductions of the uncertainty bounds were observed for the complex event 3 (Fig. 4) by combining the single measures. The model overestimated the first runoff peak on March 22, but the observed runoff lay almost entirely within the computed uncertainty bounds for the rest of the event. The daily course of runoff variations triggered by snowmelt is reproduced well. The potential for additional data to reduce the uncertainty of discharge predictions varied during different parts of the hydrograph. Considering additional data from the sub-basin seems to be slightly more efficient for reducing the uncertainty bounds than the integration of the silica measurements if the whole event is considered. If one focuses on the highest peak discharge on March 23, the uncertainty range of 3.1–4.7 m³ s⁻¹ was reduced slightly to 3.1–4.6 m³ s⁻¹ and to 3.2–4.6 m³ s⁻¹ by considering additional runoff and silica data, respectively. Again, the simulated silica concentrations demonstrated the tendency of the model to overestimate the concentrations (Fig. 5). However, the overall concentration dynamics with event-dependent and daily fluctuations were modelled well. The simulation results at the sub-basin St. Wilhelmer Talbach (Fig. 5) were less satisfying than for event 2. The discharge was overestimated at the beginning of the event and slightly underestimated for the last part of the event (March 26–28). The prediction uncertainty of the first runoff peak was relatively large.

The runoff peak uncertainty ranges for all four investigated events are illustrated in Fig. 6. The combination with silica data was not executed for event 1, as the silica simulations were too weak (Table 4). The consideration of additional data always reduced the prediction uncertainty, but the efficiency varied from event to event. Incorporating runoff data from the sub-basin reduced the uncertainty of peak runoff to a larger extent for events 2 and 4. In the case of event 3, both additional data sets performed equally well. The relative uncertainty of the summer events (1 and 2) was large compared to the snowmelt events (3 and 4), but accounting for the runoff simulations at the sub-basin could reduce uncertainty significantly. Due to the limited number of investigated events, no clear pattern could be detected concerning the uncertainty reduction for different types of events (summer storm versus snowmelt event) or for different magnitudes of events (large versus small events).

Fig. 2. Basin precipitation and uncertainty bounds (5% and 95% quantile) for runoff simulations at the Brugga basin catchment outlet subject to different measures for goodness of fit (event 2).

Fig. 3. Basin precipitation, uncertainty bounds (5% and 95% quantile) for runoff simulation at the outlet of the St. Wilhelmer Talbach sub-basin, and uncertainty bounds (5% and 95% quantile) for simulation of dissolved silica concentrations at outlet of the Brugga basin (event 2).

Fig. 4. Basin precipitation and uncertainty bounds (5% and 95% quantile) for runoff simulations at the catchment outlet of the Brugga basin subject to different measures for goodness of fit (event 3).

6. Discussion and conclusions

Although the uncertainty ranges computed by a Monte Carlo-based GLUE analysis reflect all uncertainty sources (Beven and Binley, 1992), the results in this study essentially demonstrate the impact of model parameter uncertainty. Latin Hypercube sampling has proven to be an efficient sampling strategy for analysing this uncertainty for the complex and highly parameterized model TACD. It is worth noting that this can be concluded after investigating wide parameter ranges and varying almost all model parameters (see Table 1). However, the results indicate that the number of required model runs is greater than is proposed in the literature. As a rule of thumb, we suggest that the number of model runs should exceed the number of varied parameters by at least a factor of 10. Moreover, the prediction uncertainty obtained with a combined method of Latin Hypercube sampling and GLUE might be even larger if the error of the precipitation and its regionalization is explicitly considered (Franks, 2002).

Fig. 5. Basin precipitation, uncertainty bounds (5% and 95% quantile) for runoff simulation at the outlet of the sub-basin St. Wilhelmer Talbach, and uncertainty bounds (5% and 95% quantile) for simulation of dissolved silica concentrations at outlet of the Brugga basin (event 3).

First, it was shown that the uncertainty of model predictions is temporally variable. It varied during different parts of the investigated events, with the highest absolute values observed during the peak discharge. In addition, the prediction uncertainty varied from event to event and could not be quantified before a separate analysis of each event. Interestingly, in spite of the problems and errors of snowfall measurement and the difficulties of simulating snow cover processes, which also require a larger number of model parameters (i.e. five parameters of the snow routine), the snow melt events showed no proportionally larger uncertainty bounds than the convective storm events. Second, the prediction uncertainty of the same model output variable varied spatially. This was illustrated by comparing the discharge simulation results at the Brugga basin and the St. Wilhelmer Talbach sub-basin. In addition, the computed uncertainty ranges are only valid for this study’s boundary conditions. The choice of the threshold values for the likelihood measures is particularly important. Furthermore, the model initialization can have a significant impact. Due to limited computing power in this study, the model was initialized similarly for all model runs and therefore this point was not analysed.

The value of additional data for model calibration has been clearly demonstrated. The uncertainty of discharge predictions at the Brugga basin was reduced, and a better process basis of the model was achieved as the model’s ability to simulate internal variables was demonstrated. The latter has been investigated in a number of studies using different process-based models (Ambroise et al., 1996; Mroczkowski et al., 1997; Franks et al., 1998; Lamb et al., 1998; Seibert and McDonnell, 2002; Uhlenbrook and Leibundgut, 2002). In this study, multi-scale data (i.e. runoff from a sub-basin) were compared to multi-response data (i.e. concentrations of dissolved silica) in terms of their ability to reduce the uncertainty of discharge predictions. The multi-scale data seem to be more efficient, but the limited number of investigated events has to be considered. This can be explained by the model’s purpose, which is the process-based runoff simulation. Additional assumptions had to be made for the silica simulation; for instance, the silica concentration of each runoff component was assumed to be constant in space and time.

Fig. 6. 95% confidence interval of the simulated runoff peaks. The numbers indicate the range in %-values of the respective maximum.

The ability of the additional data to reduce prediction uncertainty seems to be related to the goodness of simulation of the additional data itself. For event 3, integration of the reasonably good silica simulations resulted in uncertainty reduction comparable to that achieved by use of the additional sub-basin discharge data. Accordingly, poor silica simulations could not reduce the uncertainty of discharge simulation (event 1). Similar findings exist for the discharge of the sub-basin: good discharge simulations at the sub-basin helped to significantly reduce uncertainty at the Brugga basin (events 1 and 2) and poorer simulations were less helpful (event 3).

A point already mentioned above needs further discussion: for events 1 and 2, the uncertainty ranges were reduced so much by integrating the runoff data of the sub-basin that the measured runoff lay outside these ranges. This might have been caused by (i) incorrect input data or insufficient regionalization to basin scale, (ii) insufficient model initialization, (iii) errors in the model structure and process conceptualization, or (iv) unqualified criteria for exclusion of parameter combinations. Incorrect input data or insufficient regionalization (reason i) is very likely for the two summer events. Small convective rain cells with spatially extremely variable rainfall amounts, which are typical for the study basin in summer, are difficult to detect with a limited number of rain gauges and to regionalize adequately to basin scale. Of course, the other sources of error (reasons ii–iv) are still present even if additional experimental data are integrated into the model calibration process. Yet, if only catchment outlet runoff is used for model calibration, their influence might be surpassed by the greater influence of parameter uncertainty. The influence of different model initializations (reason ii) was not investigated in further detail, but the long initialization period of 12 months led the authors to assume that the different storage fillings were realistic at the beginning of the investigated events. However, the fact that the observed discharge lies outside the reduced uncertainty bounds suggests a careful check of the model structure and its assumptions (reason iii) in future studies. This is in particular the case for the simple silica model. Thus, the analysis of the model uncertainty provided further insights into the model itself and gave hints for future model improvements.
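Whether observations leave the reduced uncertainty band can be checked time step by time step. The sketch below is a minimal illustration; the function name and the three-step series are hypothetical, and only the peak bounds of 9.1–10.8 m³ s⁻¹ are taken from the text.

```python
def outside_bounds(observed, lower, upper):
    """Indices of time steps where the observation lies outside the band."""
    return [
        i
        for i, (obs, low, high) in enumerate(zip(observed, lower, upper))
        if not low <= obs <= high
    ]


# Hypothetical hourly values (m3/s); the middle step represents the peak
# with the combined bounds quoted for event 2.
observed = [2.0, 8.5, 3.0]
lower = [1.5, 9.1, 2.5]
upper = [2.5, 10.8, 3.5]

violations = outside_bounds(observed, lower, upper)
```

A non-empty result, as for the peak in this invented series, is the situation discussed above: the band has become narrow enough that other error sources (reasons i–iv) become visible.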

A further reduction of simulation uncertainty might be achieved using additional process data. Beldring (2002) succeeded in reducing both the uncertainty of model parameter estimates and the uncertainty of discharge predictions by integrating observed groundwater levels (multi-response data) into the calibration procedure of a process-based model. However, the groundwater levels are the first-order control on hydrological response in the study basin in Norway, and thus the greater efficiency of these data is reasonable. Lane and Richards (2001) demonstrated the difficulty of reducing parameter uncertainty using integral data such as runoff or tracer concentrations measured at the catchment outlet. Better results could be achieved through the use of distributed data such as soil moisture. However, such distributed data are not yet available for the meso-scale and widely forested Brugga basin. Future developments in remote sensing will show if important progress can be made in this area.

Finally, the authors wish to state that the uncertainty of model predictions is critical and can be significant even for a well-validated, process-based catchment model. The oft-quoted demand that model predictions be given as ranges rather than as single values is thus affirmed. The methodology used in this study showed a way to determine prediction ranges efficiently even for complex hydrological models. Knowing the uncertainty and also the potential for different experimental data to reduce it should be a key focus during the development of strategies for future field studies.

Acknowledgements

The authors thank the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG, Bonn, Germany) for financial support, grant no. Le 698/12-1. Special thanks to Marco Ratto and Stefano Tarantola (Joint Research Centre of European Commission, Institute for Protection and Security of the Citizen, Ispra, Italy) as well as Cees Wesseling (University of Utrecht, The Netherlands) for their prompt support for software problems. Thanks are due to Kendall Watkins for language editing.

References

Ambroise, B., Freer, J., Beven, K.J., 1996. Application of a generalized TOPMODEL to the small Ringelbach catchment, Vosges, France. Wat. Resour. Res. 32 (7), 2147–2159.

Beldring, S., 2002. Multi-criteria validation of a precipitation-runoff model. J. Hydrol. 257, 189–211.

Bergström, S., 1992. The HBV model – its structure and applications. SMHI, RH, 4, Norrköping, Sweden.

Bergström, S., Carlsson, B., Sandberg, G., Maxe, L., 1985. Integrated modelling of runoff, alkalinity and pH on a daily base. Nordic Hydrol. 16, 89–104.

Beven, K.J., 1996. A discussion of distributed hydrological modelling. In: Abbott, M.B., Refsgaard, J.C. (Eds.), Distributed Hydrological Modelling. Kluwer Academic Publishers, Dordrecht, The Netherlands, pp. 255–278.

Beven, K.J., 2001. Rainfall-Runoff Modelling. The Primer. John Wiley & Sons Ltd, Chichester, England, pp. 360.

Beven, K.J., Binley, A., 1992. The future of distributed models: model calibration and uncertainty prediction. Hydrol. Processes 6, 279–298.

Beven, K.J., Young, P., 2003. Comment on “Bayesian recursive parameter estimation for hydrologic models” by M. Thiemann, M. Trosset, H. Gupta, and S. Sorooshian. Wat. Resour. Res. 39 (5), 1116 (doi: 10.1029/2001WR001183).

Blöschl, G., Grayson, R., 2000. Spatial observations and interpolation. In: Grayson, R., Blöschl, G. (Eds.), Spatial Patterns in Catchment Hydrology. Cambridge University Press, Cambridge, UK, pp. 51–81.

Brath, A., Montanari, A., Moretti, G., 2002. On the use of simulation techniques for the estimation of peak river flows. Proceeding of the International Conference on Flood Estimation, March 6–8, 2002, Bern, Switzerland. CHR Report II-17, 587–599.

Bremicker, M., 2000. Das Wasserhaushaltsmodell LARSIM – Modellgrundlagen und Anwendungsbeispiele (The water budget model LARSIM – model basics and applications). Freiburger Schriften zur Hydrologie, Bd. 11, University of Freiburg, Germany (in German).

Christiaens, K., 2001. Sensitivity and uncertainty of physically based spatially distributed hydrological models. Ph.D. thesis, Faculty of agricultural and applied biological sciences, University of Leuven, Belgium.

van Dam, J.C., 2000. Field-scale water flow and solute transport: SWAP model concepts, parameter estimation and case studies. Ph.D. thesis, Wageningen Institute for Environment and Climate Research, Wageningen Universiteit.

DVWK (Deutscher Verband für Wasserwirtschaft und Kulturbau e.V.), 1996. Ermittlung der Verdunstung von Land- und Wasserflächen. Merkblätter zur Wasserwirtschaft, 238. Bonn.

Eisele, M., Kiese, R., Krämer, A., Leibundgut, C., 2001. Application of a catchment water quality model for assessment and prediction of nitrogen budgets. Phys. Chem. Earth 26 (7–8), 547–551.

Franks, S., 2002. Integrating models, methods and measurements for prediction in ungauged basins. In: Hubert, P., Schertzer, D., Takeuchi, K., Koide, S. (Eds.), PUB–Kick-off Workshop of the IAHS Decade on Prediction in Ungauged Basins, November 20–22, 2002, Brasilia, Brazil.

Franks, S.W., Gineste, P., Beven, K.J., Merot, P., 1998. On constraining the predictions of a distributed model: the incorporation of fuzzy estimates of saturated areas into the calibration process. Wat. Resour. Res. 34 (4), 787–797.

Freer, J., Beven, K., Ambroise, B., 1996. Bayesian estimation of uncertainty in runoff prediction and the value of data: an application of the GLUE approach. Wat. Resour. Res. 32 (7), 2161–2173.

Grayson, R.B., Moore, I.D., McMahon, T.A., 1992. Physically based hydrologic modeling, 2. Is the concept realistic? Wat. Resour. Res. 28 (10), 2659–2666.

Güntner, A., Uhlenbrook, S., Seibert, J., Leibundgut, C., 1999. Multi-criterial validation of TOPMODEL in a mountainous catchment. Hydrol. Processes 13, 1603–1620.

Gupta, H., Thiemann, M., Trosset, M., Sorooshian, S., 2003. Reply to comment by K. Beven and P. Young on “Bayesian recursive parameter estimation for hydrologic models”. Wat. Resour. Res. 39 (5), 1117 (doi: 10.1029/2002WR001405).

Harlin, J., Kung, C.S., 1992. Parameter uncertainty and simulation of design floods in Sweden. J. Hydrol. 137, 209–230.

Helton, J.C., Davis, F.J., 2000. Sampling-based methods. In: Saltelli, A., Chan, K., Scott, E.M. (Eds.), Sensitivity Analysis. John Wiley & Sons, Chichester, England, pp. 101–154.

Helton, J.C., Bean, J.E., Berglund, J.W., Davis, F.J., Economy, K., Garner, J.W., Johnson, J.D., MacKinnon, R.J., Miller, J., O’Brien, D.G., Ramsey, J.L., Schreiber, J.D., Shinta, A., Smith, L.N., Stoelzel, D.M., Stockman, C., Vaughn, P., 1998. Uncertainty and sensitivity analysis results obtained in the 1996 performance assessment for the waste isolation pilot plant. SAND98-0365. Sandia National Laboratories, Albuquerque, USA.

Iman, R.L., Helton, J.C., 1985. A comparison of uncertainty and sensitivity analysis techniques for computer models. Technical Report SAND84-1461, Sandia National Laboratories, Albuquerque, USA.

Jakeman, A.J., Hornberger, G.M., 1993. How much complexity is warranted in a rainfall-runoff model. Wat. Resour. Res. 29 (8), 2637–2649.

Karssenberg, D., Burrough, P.A., Sluiter, R., de Jong, K., 2001. The PC Raster software and course materials for teaching numerical modelling in the environmental sciences. Trans. GIS 5 (2), 99–110.

Kleinhans, A., 2000. Anwendung des Wasserhaushaltsmodells WaSiM-ETH im Dreisam-Einzugsgebiet (Application of the water budget model WaSiM-ETH in Dreisam basin). Diploma thesis, University of Freiburg, Institute of Hydrology, Germany (unpublished) (in German).

Lal, A.M.W., Obeysekera, J., Van Zee, R., 1997. Sensitivity and uncertainty of a regional simulation model for the natural system in South Florida. Proceedings of Managing Water: Coping with Scarcity and Abundance, Theme A: Water for a Changing Global Community. 27th Congress of the International Association for Hydraulic Research, Water Resources Engineering Division/ASCE, August 10–15, 1997, San Francisco (http://www.sfwmd.gov/org/pld/hsm/pubs/wlal/sens_iahr.pdf (date: 2002-11-26)).

Lamb, R., Beven, K.J., Myrabø, S., 1998. Use of spatially distributed water table observations to constrain uncertainty in a rainfall-runoff model. Adv. Wat. Resour. 22, 305–317.

Lane, S.N., Richards, K.S., 2001. The ‘validation’ of hydrodynamic models: some critical perspectives. In: Anderson, M.G., Bates, P.D. (Eds.), Model Validation: Perspectives in Hydrological Science. John Wiley & Sons, Chichester, England, pp. 413–438.

Lundquist, D., Christophersen, N., Neal, C., 1990. Towards developing a new short-term model for the Birkenes catchment – lessons learned. J. Hydrol. 116, 391–401.

McDonnell, J.J., Tanaka, T., 2001. Hydrology and Biogeochemistry of Forested Catchments. Hydrol. Processes 15 (10) (special issue).

McKay, M.D., Conover, W.J., Beckman, R.J., 1979. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21, 239–245.

Melching, C.S., 1992. A comparison of methods for estimating variance of water resources model predictions. In: Kuo, J.-T., Lin, G.-F. (Eds.), Stochastic Hydraulics ’92, Proceedings of the Sixth International Association for Hydraulic Research Symposium on Stochastic Hydraulics, Taipei, Taiwan. Water Resources Publications, Littleton, CO, USA, pp. 663–670.

Melching, C.S., 1995. Reliability estimation. In: Singh, V.P. (Ed.), Computer Models of Watershed Hydrology. Water Resources Publications, Highlands Ranch, CO, USA, pp. 69–118.

Melching, C.S., Yen, B.C., Wenzel, Jr., H.G., 1990. A reliability estimation in modeling watershed runoff with uncertainties. Wat. Resour. Res. 26 (10), 2275–2286.

Menzel, L., 1997. Modellierung der Evapotranspiration im System Boden-Pflanze-Atmosphäre. Ph.D. thesis, Züricher Geographische Hefte, 67, ETH Zürich, Zürich, Switzerland.

Mroczkowski, M., Raper, G.P., Kuczera, G., 1997. The quest for more powerful validation of conceptual catchment models. Wat. Resour. Res. 33, 2325–2335.

Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models, 1. A discussion of principles. J. Hydrol. 10, 282–290.

O’Callaghan, J.F., Mark, D.M., 1984. The extraction of drainage networks from digital elevation data. Comput. Vision Graphics Image Process 28, 328–344.

Refsgaard, J.C., Storm, B., 1996. Construction, calibration and validation of hydrological models. In: Abbott, M.B., Refsgaard, J.C. (Eds.), Distributed Hydrological Modelling. Water Science and Technology Library, vol. 22. Kluwer Academic Publishers, Dordrecht, The Netherlands, pp. 41–54.

Schulla, J., 1997. Hydrologische Modellierung von Flußeinzugsgebieten zur Abschätzung der Folgen von Klimaänderungen. Züricher Geographische Hefte, 65. ETH Zürich, Zürich, Switzerland.

Sieber, A., 2003. Parameterstudien und Unsicherheitsanalysen mit dem Einzugsgebietsmodell TACD (Parameter studies and uncertainty analyses of the catchment model TACD). Diploma thesis, University of Freiburg, Institute of Hydrology, Germany (unpublished) (in German).

Seibert, J., 1997. Estimation of parameter uncertainty in the HBV Model. Nordic Hydrol. 28 (4/5), 247–262.

Seibert, J., 1999a. Conceptual runoff models – fiction or representation of reality? Ph.D. thesis, Uppsala, Sweden.

Seibert, J., 1999b. Regionalisation of parameters for a conceptual rainfall-runoff model. Agric. For. Meteorol. 98–99, 279–293.

Seibert, J., 2002. Manual of the model HBV light. SLU, Department of Environmental Assessment, Uppsala, Sweden.

Seibert, J., McDonnell, J., 2002. On the dialog between experimentalist and modeler in catchment hydrology: use of soft data for multi-criteria model calibration. Wat. Resour. Res. 38 (11), 1231–1241.

Seibert, J., Uhlenbrook, S., Leibundgut, Ch., Halldin, S., 2000. Multiscale calibration and validation of a conceptual rainfall-runoff model. Phys. Chem. Earth 25 (1), 59–64.

Singh, V.P. (Ed.), 1995. Computer Models of Watershed Hydrology. Water Resources Publications, Highlands Ranch, CO, USA.

Thiemann, M., Trosset, M., Gupta, H., Sorooshian, S., 2001. Bayesian recursive parameter estimation for hydrologic models. Wat. Resour. Res. 37 (10), 2521–2535.

Tilch, N., Uhlenbrook, S., Leibundgut, Ch., 2002. Regionalisierungsverfahren zur Ausweisung von Hydrotopen in von periglazialem Hangschutt geprägten Gebieten. Grundwasser 2002/2004, 206–216.

Uhlenbrook, S., 1999. Untersuchung und Modellierung der Abflussbildung in einem mesoskaligen Einzugsgebiet (Investigating and modelling of the runoff generation in a mesoscaled catchment). Freiburger Schriften zur Hydrologie, Band 10. Institute of Hydrology, University Freiburg, Germany, pp. 201 (in German).

Uhlenbrook, S., Leibundgut, Ch., 2002. Process-oriented catchment modelling and multiple-response validation. Hydrol. Processes 16, 423–440.

Uhlenbrook, S., Seibert, J., Leibundgut, C., Rodhe, A., 1999. Prediction uncertainty of conceptual rainfall-runoff models caused by problems to identify model parameters and structure. Hydrol. Sci. J. 44 (5), 279–299.

Uhlenbrook, S., Frey, M., Leibundgut, C., Maloszewski, P., 2002. Hydrograph separations in a mesoscale mountainous basin at event and seasonal timescales. Wat. Resour. Res. 38 (6), 1–14.

Uhlenbrook, S., McDonnell, J.J., Leibundgut, Ch., 2003. Runoff generation and implications for river basin modelling. Hydrol. Processes 17 (2), 296 (special issue).

Uhlenbrook, S., Roser, S., Tilch, N., 2004. Hydrological process representation at the meso-scale: the potential of a distributed, conceptual catchment model. J. Hydrol., in press.

Uhlenbrook, S., Didszun, J., Leibundgut, Ch. Runoff generation processes on hillslopes and their susceptibility to global change. In: Huber, U.M., Reasoner, M.A., Bugmann, B. (Eds.), Global Change and Mountain Regions: A State of Knowledge Overview. Advances in Global Change Research. Kluwer Academic Publishers, Dordrecht, in press.

Weglarczyk, S., 1998. The interdependence and applicability of some statistical quality measures for hydrological models. J. Hydrol. 206, 98–103.

Yu, P.-S., Yang, T.-Ch., Chen, S.-J., 2001. Comparison of uncertainty analysis methods for a distributed rainfall-runoff model. J. Hydrol. 244, 43–59.