
Mathematics and Computers in Simulation 30 (1988) 119-131 North-Holland


INPUT ERRORS IN RAINFALL-RUNOFF MODELLING

M.T.P. RETNAM and B.J. WILLIAMS

Department of Civil Engineering and Surveying, University of Newcastle, Shortland, NSW 2308, Australia

Precipitation is one of the most variable of the hydrologic processes. This presents two particular problems when hydrologic processes of basin scale are studied via mathematical models. The first is the extent to which areal rainfall distribution can be determined from point measurements and the second is the effect of precipitation uncertainty on the uncertainty of catchment model parameters. This paper presents a state-of-the-art survey of methods used for handling the first problem. Spatial interpolation models in common use are discussed. A Bayesian-type optimisation procedure to estimate model parameters is presented and demonstrated for a real-world example. The technique relaxes the usual least squares assumption of normally distributed residuals and estimates an error distribution that is a member of a family of distributions which includes the normal and Laplace distributions.

1. INTRODUCTION

Rainfall-runoff models are widely used by engineers to predict the stream flow from a

catchment on which rainfall of known intensity falls. In places where streamflow statistics

are scarce the models are used for design purposes to infer flows of a particular frequency by

applying a "design" storm of the same frequency to the model. Structures at the catchment outlet

are then designed to cope with this flow. Another important use of the models is in flood

forecasting, where measured rainfall is used by the model to predict streamflows.

The models may be simple regression equations applied to a particular catchment or they

may be of a more general "conceptual" form using differential equations of continuity of flow

and requiring a range of data describing the catchment and, of course, the rainfall. Depending

on the complexity of the model and the precision required, the rainfall may be specified as

spatially averaged and varying in time but the measurements in most cases are in fact taken by

continuous (in time) rain gauges at points (in space), although radar measurements, if available,

allow a limited spatial picture of rainfall. The principal concern of this paper is the errors

in the modelling process which derive from the use of limited point measurements of an input

variable which is continuous in space.

Unfortunately, the hydrologist intending to forecast floods does not usually have the opportunity to design or redesign the data collection network, so the spatial variation of precipitation needs to be inferred from existing point measurements. This technique of inferring the spatial variability of precipitation, or the precipitation at an ungauged point, is called spatial interpolation of precipitation.

0378-4754/88/$3.50 © 1988, IMACS/Elsevier Science Publishers B.V. (North-Holland)

FIGURE 1 - Schematic Representation of Flood Forecasting

A schematic representation of the procedure of flood forecasting from a precipitation

record is given in Figure 1.

Rainfall network design is the design of the physical system A, whereas spatial interpolation is the mathematical conceptualisation of system B, which attempts to infer precipitation at ungauged points, thereby generating the areal precipitation required for use in system C. Note that the catchment response depends not only on the interpolation input for the forecast but also on the interpolation used in estimating the model parameters during calibration. Calibration follows the usual criterion that the parameters, when used in system C, "best" (usually in the least squares sense) reproduce an observed flood for an observed precipitation.

Hence, the spatial variation of precipitation presents two particular problems when hydrologic processes of basin (catchment) scale are analysed via rainfall-runoff models. The first is the extent to which areal rainfall distribution can be determined from point measurements and the second is the effect of precipitation uncertainty on the uncertainty of the catchment model parameters and hence the catchment model response. This paper presents a technique for handling the first of the above two problems.

2. SPATIAL INTERPOLATION

2.1 Introduction

Central to the art of spatial interpolation of precipitation is the idea that rainfall

precipitation on an area is a stochastic process indexed by geographical co-ordinates. The

value of the process at a point at m time intervals is thought of as m realisations of the

process at that point. Spatial interpolation models perform mathematical manipulations on

point measurements to infer the process at ungauged sites by preserving the basic nature of

the process in the space dimension. Accordingly, we have classified spatial interpolation

models into two categories: the first preserves the variability of the process as indicated

by the point measurements and the second preserves the spatial covariance of the process for

a given burst. We shall concentrate below on the latter category.

Within these categories, techniques can be differentiated by the choice of model structure, which until recently was rather arbitrary, with little account being taken of the underlying process. Current research is directed towards the difficult problem of "establishment of firm links between atmospheric dynamics and statistical description of spatial and temporal variability of the process" (Gupta, 1985; Cho, 1985).

Finally, most of these models require a fitting procedure, the criterion for which is also

arbitrary; this is discussed in detail in Section 3.

Creutin and Obled (1982) and Tabios and Salas (1985) have reviewed spatial interpolation

techniques with respect to water resources problems.

2.2 Optimal Interpolation

Most of the models which preserve the spatial covariance come under the general title

"optimal interpolation", including (with some variations) the well known "Kriging" procedures.

We describe the general procedure below and subsequently, in Section 3.2 we provide an

illustrative example.

In this technique, the estimate of $P_0$ is given by

$$\hat{P}_0 = \sum_{j=1}^{n} w_j P_j \tag{2.1}$$

in which the weights $w_j$, $j = 1, 2, \ldots, n$ are chosen so that the error variance $\sigma_\varepsilon^2$ is minimised, where

$$\sigma_\varepsilon^2 = \mathrm{var}[\hat{P}_0 - P_0] = \mathrm{var}\Big[P_0 - \sum_{j=1}^{n} w_j P_j\Big] = \sigma_0^2 - 2\sum_{j=1}^{n} w_j\,\mathrm{cov}(P_0 P_j) + \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\,\mathrm{cov}(P_i P_j) \tag{2.2}$$

where

$\sigma_0^2$ = variance of $P_0$

$\mathrm{cov}(P_i P_j)$ = covariance between $P_i$ and $P_j$

For an unbiassed estimate of $P_0$ we require that

$$\sum_{j=1}^{n} w_j = 1 \tag{2.3}$$

Minimising (2.2) subject to (2.3) is a constrained optimisation problem; hence, introducing the Lagrange multiplier $\lambda$, we have

$$\sigma_\varepsilon^2 = \sigma_0^2 - 2\sum_{j=1}^{n} w_j\,\mathrm{cov}(P_0 P_j) + \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\,\mathrm{cov}(P_i P_j) - \lambda\Big(\sum_{j=1}^{n} w_j - 1\Big) \tag{2.4}$$

Differentiating (2.4) with respect to $w_j$ and setting the result to zero yields

$$\sum_{i=1}^{n} w_i\,\mathrm{cov}(P_i P_j) + \mu = \mathrm{cov}(P_0 P_j) \; ; \quad j = 1, \ldots, n \tag{2.5}$$

where $\mu = -\lambda/2$. Assuming a homogeneous variance in the precipitation field we have

$$\sigma_i \sigma_j = \sigma^2 \quad \text{for any } i, j$$

hence (2.5) becomes

$$\sum_{i=1}^{n} w_i\,\rho(P_i P_j) + \frac{\mu}{\sigma^2} = \rho(P_0 P_j) \; ; \quad j = 1, \ldots, n \tag{2.6}$$

where

$\rho(P_i P_j)$ = correlation coefficient between $P_i$ and $P_j$

Assuming a homogeneous and isotropic spatial correlation, $\rho(P_i P_j)$ depends only on the interstation distance $d_{ij}$ and (2.6) becomes

$$\sum_{i=1}^{n} w_i\,\rho(d_{ij}) + \frac{\mu}{\sigma^2} = \rho(d_{0j}) \; ; \quad j = 1, 2, \ldots, n \tag{2.7}$$

Solving the linear system formed by (2.7) and (2.3) yields the weights $w_i$, $i = 1, \ldots, n$ (and the multiplier $\mu$), which when used in (2.1) give an unbiassed estimate of $P_0$ at $(x_0, y_0)$ with the minimum error variance derived from (2.2):

$$\hat{\sigma}_\varepsilon^2 = \sigma^2\Big[1 - \sum_{j=1}^{n} w_j\,\rho(d_{0j})\Big] - \mu \tag{2.8}$$

Note that in (2.8) the ^ indicates an estimate of the error variance, the reason being that $\rho(d_{0j})$ is not known from the data. So far in the treatment we have assumed a known correlation function that depends on distance only (homogeneous and isotropic).
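As an illustration of the procedure above, the following minimal Python sketch assembles the linear system formed by the equations in (2.7) and the constraint (2.3), solves it for the weights and the multiplier, and evaluates the error variance (2.8). The function names, the station coordinates, the exponential correlation function and the value of its parameter are illustrative assumptions, not values taken from this paper.

```python
import numpy as np

def exponential_correlation(d, c0):
    """Assumed correlation model rho(d) = exp(-d / c0)."""
    return np.exp(-np.asarray(d, dtype=float) / c0)

def optimal_interpolation_weights(stations, target, corr, sigma2=1.0):
    """Solve (2.7) with (2.3); return (w, mu, error_variance) as in (2.8)."""
    stations = np.asarray(stations, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(stations)
    # Interstation distances d_ij and station-to-target distances d_0j.
    d_ij = np.linalg.norm(stations[:, None, :] - stations[None, :, :], axis=-1)
    d_0j = np.linalg.norm(stations - target, axis=-1)
    # Augmented system: n rows for (2.7) plus one row for the constraint (2.3).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = corr(d_ij)
    A[n, n] = 0.0
    rhs = np.append(corr(d_0j), 1.0)
    sol = np.linalg.solve(A, rhs)
    w = sol[:n]
    mu = sigma2 * sol[n]  # the unknown solved for is mu / sigma^2
    error_variance = sigma2 * (1.0 - w @ corr(d_0j)) - mu
    return w, mu, error_variance

# Hypothetical square network with the target at its centre.
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
corr = lambda d: exponential_correlation(d, c0=20.0)
w, mu, var = optimal_interpolation_weights(stations, (5.0, 5.0), corr)
# Target placed exactly on the first station.
w0, mu0, var0 = optimal_interpolation_weights(stations, (0.0, 0.0), corr)
```

By symmetry the four weights in the first call are equal, and when the target coincides with a gauge the whole weight falls on that gauge and the error variance vanishes.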

The spatial correlation function $\rho(d)$ can be assumed to be one of several forms (Yevjevich and Karplus, 1973):

a. The reciprocal model $\rho(d) = 1/(1 + d/c_0)$

b. The square-root model $\rho(d) = 1/\sqrt{1 + d/c_0}$

c. The exponential model $\rho(d) = \exp(-d/c_0)$

where $c_0$ is a parameter to be estimated by fitting the model (a), (b) or (c) to the sample correlation coefficients,

$$\hat{\rho}_{ij} = \frac{\dfrac{1}{T}\sum_{t=1}^{T}\,[P_i(t) - \bar{P}_i]\,[P_j(t) - \bar{P}_j]}{\hat{s}_i\,\hat{s}_j} \tag{2.9}$$

where

$T$ = total number of observations in time

$P_j(t)$ = observation at the jth station during time t

$\bar{P}_i$ = time average at $P_i$ = $\dfrac{1}{T}\sum_{t=1}^{T} P_i(t)$

$\hat{s}_i$ = sample standard deviation at the ith station
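The fitting step described above can be sketched as follows. The first function evaluates the sample correlation coefficients of (2.9), assuming that the sample standard deviation uses the same 1/T normalisation as the numerator; the second picks $c_0$ for the exponential model by least squares over a grid of candidate values. The grid search, the function names and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def sample_correlations(P):
    """Sample correlation matrix of (2.9); P has shape (T, number of stations)."""
    P = np.asarray(P, dtype=float)
    T = P.shape[0]
    dev = P - P.mean(axis=0)               # P_i(t) minus the time average
    s = np.sqrt((dev ** 2).mean(axis=0))   # standard deviations, 1/T form (assumed)
    return (dev.T @ dev) / T / np.outer(s, s)

def fit_c0_exponential(d, rho_hat, candidates):
    """Least-squares choice of c0 for rho(d) = exp(-d / c0) over a candidate grid."""
    d = np.asarray(d, dtype=float)
    rho_hat = np.asarray(rho_hat, dtype=float)
    sse = [np.sum((rho_hat - np.exp(-d / c)) ** 2) for c in candidates]
    return candidates[int(np.argmin(sse))]

# Synthetic record: 36 time steps at 6 stations (not the data of this paper).
rng = np.random.default_rng(0)
P = rng.gamma(2.0, 1.0, size=(36, 6))
R = sample_correlations(P)

# Noise-free check of the fit: correlations generated with c0 = 25.
d = np.array([5.0, 10.0, 20.0, 40.0])
c0_fit = fit_c0_exponential(d, np.exp(-d / 25.0), np.arange(5.0, 50.0, 1.0))
```

A grid search is used here only because it is easy to verify; any one-dimensional minimiser could take its place.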

3. MODEL OPTIMISATION CRITERIA

3.1 Error Structure

The method of least squares owes its wide popularity to the simplicity of the associated computations. It does, however, have some particularly useful statistical properties under certain assumptions. It is the "best" (in the sense that its estimates have the minimum variance) of all linear estimators whose estimates are unbiassed. These properties apply when:

a. the errors have zero mean and constant finite variance, and

b. each distinct pair of errors has zero covariance (i.e., the errors are uncorrelated).

If furthermore we assume

c. the errors are normally distributed, then the least squares technique has minimal variance of all unbiassed estimators, the estimates of the regression parameters are themselves normally distributed and it follows that t and F tests can be used to determine confidence limits.

In practice, the conscientious modeller, having estimated his parameters, carefully checks that the assumptions are reasonable. Various transformations of the data (Box and Cox, 1964, for example) are available to account for problems in assumptions (a) and (b). In the water resources area, strong autocorrelations are the norm and one expects the error variances to be non-constant. Because of these difficulties with assumptions (a) and (b) and the error normality assumption, a number of authors in the water resources (and statistical) literature have argued that the error structure in the model should determine the optimisation criterion (Clarke, 1973; Troutman, 1985; Sorooshian and Dracup, 1980; Bard, 1974 and others).

If an estimator other than least squares is to be used, we need to know its statistical

properties such as efficiency, consistency and unbiassedness.

In the following section we demonstrate the use of a proposal by Box and Tiao (1973) to estimate, during the fitting procedure, a measure of the kurtosis, $\beta$, of a family of symmetric distributions which are assumed to contain the error distribution. The family includes the normal distribution ($\beta = 0$) and the double exponential distribution ($\beta = 1$). The procedure optimises the fitting criterion on the basis of the $\beta$ estimate and consequently will be equivalent to a least squares estimate for $\beta = 0$ or a minimum sum of absolute errors for $\beta = 1$. The approach is Bayesian and hence a prior estimate of the distribution of $\beta$ must be made. This is discussed via an illustrative example in the next section (3.2).

3.2 Illustrative Example

Central to the technique of optimal interpolation discussed in Section 2.2 is the fitting of a spatial correlation function $\rho$. Instead of the commonly used functions (Yevjevich and Karplus, 1973), we propose the use of the function

$$\rho_{ij} = e^{-(a r_{ij}^2 + b s_{ij}^2)} \tag{3.1}$$

where

$\rho_{ij}$ = correlation coefficient between $P_i$ and $P_j$

$r_{ij}$ = interstation x deviate $(x_i - x_j)$

$s_{ij}$ = interstation y deviate $(y_i - y_j)$

$a, b$ = parameters to be estimated

This model describes a positive definite spatial correlation structure, as would be expected in a precipitation field, and it incorporates the anisotropy of the correlation of the process. Note that the model is a function of the interstation vector (as opposed to the interstation distance), since $r_{ij}$ and $s_{ij}$ are the components of the vector from point i to point j. Further, this model preserves the symmetry of the covariance matrix ($\rho_{ij} = \rho_{ji}$). That is, in general, this model removes the assumption of isotropy of correlation (isotropic correlation being the special case $a = b$) and preserves the homogeneity condition required to satisfy symmetry of the covariance matrix.
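The properties claimed for the correlation model (3.1) can be checked numerically with a short sketch such as the one below. The station coordinates and the parameter values are hypothetical and chosen only to exercise the function; they are not the Hacking Catchment values.

```python
import numpy as np

def anisotropic_correlation(coords, a, b):
    """Correlation matrix of (3.1) for stations at the given (x, y) coordinates."""
    coords = np.asarray(coords, dtype=float)
    r = coords[:, 0][:, None] - coords[:, 0][None, :]   # x deviates r_ij
    s = coords[:, 1][:, None] - coords[:, 1][None, :]   # y deviates s_ij
    return np.exp(-(a * r ** 2 + b * s ** 2))

# Hypothetical station layout and parameters.
coords = [(0.0, 0.0), (3.0, 1.0), (1.0, 4.0), (6.0, 5.0), (8.0, 2.0)]
rho = anisotropic_correlation(coords, a=0.02, b=0.05)
smallest_eigenvalue = np.linalg.eigvalsh(rho).min()

# With a = b the correlation depends on the interstation distance only.
rho_iso = anisotropic_correlation(coords, a=0.03, b=0.03)
xy = np.asarray(coords)
dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
```

The matrix is symmetric with a unit diagonal, and its smallest eigenvalue is positive for this layout, which is the positive definiteness referred to above.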

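In a study area (catchment) with n recording stations, there are $m = n(n-1)/2$ interstation distances and hence m known sample correlation coefficients. Hence, the model is

$$\hat{\rho}_k = \rho_k + \varepsilon_k = e^{-(a r_k^2 + b s_k^2)} + \varepsilon_k \; ; \quad k = 1, \ldots, m \tag{3.2}$$

where the ^ indicates a sample value. Assuming that each of these sample correlation coefficients comes from a member of a family of symmetric parent distributions

$$p(\hat{\rho}_k \mid a, b, \beta, \sigma) = \omega(\beta)\,\sigma^{-1} \exp\left[-c(\beta)\left|\frac{\hat{\rho}_k - \rho_k}{\sigma}\right|^{2/(1+\beta)}\right] \tag{3.3}$$

$$k = 1, \ldots, m \; ; \quad -1 < \beta \le 1 \; ; \quad -\infty < (\hat{\rho}_k - \rho_k) < \infty$$

where

$$\sigma^2 = \mathrm{var}(\hat{\rho}_k)$$

and, following Box and Tiao (1973),

$$c(\beta) = \left\{\frac{\Gamma[\tfrac{3}{2}(1+\beta)]}{\Gamma[\tfrac{1}{2}(1+\beta)]}\right\}^{1/(1+\beta)}, \qquad \omega(\beta) = \frac{\{\Gamma[\tfrac{3}{2}(1+\beta)]\}^{1/2}}{(1+\beta)\,\{\Gamma[\tfrac{1}{2}(1+\beta)]\}^{3/2}}$$

The parameter $\beta$ can be regarded as a measure of kurtosis indicating the extent of non-normality of the parent distribution.

Now, assuming that any two distinct sample correlation coefficients are independent, the likelihood function is

$$l(a, b, \sigma, \beta \mid \hat{\rho}) \propto [\omega(\beta)]^{m}\,\sigma^{-m} \exp\left[-c(\beta)\sum_{k=1}^{m}\left|\frac{\hat{\rho}_k - \rho_k}{\sigma}\right|^{2/(1+\beta)}\right] \tag{3.4}$$

in which

$$\hat{\rho} = (\hat{\rho}_1, \hat{\rho}_2, \ldots, \hat{\rho}_m)$$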
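To make the error family of (3.3) concrete, the sketch below evaluates its density in Python using the constants $c(\beta)$ and $\omega(\beta)$ of the exponential power family of Box and Tiao (1973), which are assumed here to be the ones intended. The function names are illustrative. The checks at the end compare the density with the normal density at $\beta = 0$ and with the double exponential density of variance $\sigma^2$ at $\beta = 1$.

```python
import math

def c_beta(beta):
    """Constant c(beta) of the exponential power family (Box and Tiao, 1973)."""
    g1 = math.gamma(0.5 * (1.0 + beta))
    g3 = math.gamma(1.5 * (1.0 + beta))
    return (g3 / g1) ** (1.0 / (1.0 + beta))

def omega_beta(beta):
    """Normalising constant omega(beta) of the same family."""
    g1 = math.gamma(0.5 * (1.0 + beta))
    g3 = math.gamma(1.5 * (1.0 + beta))
    return math.sqrt(g3) / ((1.0 + beta) * g1 ** 1.5)

def exp_power_pdf(e, sigma, beta):
    """Density of an error e with standard deviation sigma, for -1 < beta <= 1."""
    z = abs(e / sigma)
    return omega_beta(beta) / sigma * math.exp(-c_beta(beta) * z ** (2.0 / (1.0 + beta)))

# Reference densities for the two named special cases.
e, sigma = 0.3, 1.0
normal_pdf = math.exp(-0.5 * (e / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
laplace_pdf = math.exp(-math.sqrt(2.0) * abs(e) / sigma) / (sigma * math.sqrt(2.0))

# Crude Riemann sum of the density for a flatter-than-normal member (beta = -0.5).
step = 0.001
area = sum(exp_power_pdf(-10.0 + i * step, 1.0, -0.5) for i in range(20001)) * step
```

Values of $\beta$ below zero give flatter-topped densities than the normal, and values above zero give heavier tails.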

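For a fixed value of $\beta$, the likelihood function is

$$l(a, b, \sigma \mid \beta, \hat{\rho}) \propto \sigma^{-m} \exp\left[-c(\beta)\sum_{k=1}^{m}\left|\frac{\hat{\rho}_k - \rho_k}{\sigma}\right|^{2/(1+\beta)}\right] \tag{3.5}$$

Assuming a non-informative reference prior for a, b, and $\sigma$,

$$p(a, b, \sigma) \propto \sigma^{-1} \tag{3.6}$$

the joint posterior distribution of $(a, b, \sigma)$ is given by

$$p(a, b, \sigma \mid \beta, \hat{\rho}) \propto l(a, b, \sigma \mid \beta, \hat{\rho})\; p(a, b, \sigma) \tag{3.7}$$

From (3.5) and (3.6), (3.7) can be rewritten as

$$p(a, b, \sigma \mid \beta, \hat{\rho}) \propto \sigma^{-(m+1)} \exp\left[-c(\beta)\sum_{k=1}^{m}\left|\frac{\hat{\rho}_k - \rho_k}{\sigma}\right|^{2/(1+\beta)}\right] \tag{3.8}$$

Integrating out $\sigma$ from (3.8) gives the posterior distribution of $(a, b)$,

$$p(a, b \mid \beta, \hat{\rho}) = [J(\beta)]^{-1}\,[M(a, b)]^{-\frac{1}{2}m(1+\beta)} \tag{3.9}$$

where

$$M(a, b) = \sum_{k=1}^{m} |\hat{\rho}_k - \rho_k|^{2/(1+\beta)} = \sum_{k=1}^{m} \left|\hat{\rho}_k - e^{-(a r_k^2 + b s_k^2)}\right|^{2/(1+\beta)}$$

and $[J(\beta)]^{-1}$ is a normalising constant such that

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(a, b \mid \beta, \hat{\rho})\; da\, db = 1$$

whence

$$J(\beta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [M(a, b)]^{-\frac{1}{2}m(1+\beta)}\; da\, db$$

Assuming $\beta$ is distributed independently of a, b, and $\sigma$, so that

$$p(a, b, \sigma, \beta) = p(\beta)\; p(a, b, \sigma) \tag{3.10}$$

we can rewrite (3.10) using (3.6) as

$$p(a, b, \sigma, \beta) \propto p(\beta)\,\sigma^{-1}$$

Now the joint posterior distribution of $(a, b, \sigma, \beta)$ is

$$p(a, b, \sigma, \beta \mid \hat{\rho}) \propto p(a, b, \sigma, \beta)\; l(a, b, \sigma, \beta \mid \hat{\rho}) \propto \sigma^{-1}\, p(\beta)\; l(a, b, \sigma, \beta \mid \hat{\rho}) \tag{3.11}$$

where $l(a, b, \sigma, \beta \mid \hat{\rho})$ is given in (3.4).

Substituting (3.4) in (3.11) and integrating out $\sigma$ yields the joint posterior distribution of $(a, b, \beta)$,

$$p(a, b, \beta \mid \hat{\rho}) \propto p(\beta)\,[M(a, b)]^{-\frac{1}{2}m(1+\beta)}\;\Gamma[1 + \tfrac{1}{2}m(1+\beta)]\;\{\Gamma[1 + \tfrac{1}{2}(1+\beta)]\}^{-m} \tag{3.12}$$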
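For a fixed $\beta$, the conditional posterior (3.9) is largest where $M(a, b)$ is smallest, so its mode can be located by minimising $M$. The sketch below does this by brute force on a grid. The synthetic deviates, the parameter grids and the use of a grid search are assumptions for illustration; the paper does not describe its numerical techniques.

```python
import numpy as np

def M(a, b, r, s, rho_hat, beta):
    """Sum over k of |rho_hat_k - exp(-(a r_k^2 + b s_k^2))|^(2/(1+beta))."""
    resid = rho_hat - np.exp(-(a * r ** 2 + b * s ** 2))
    return np.sum(np.abs(resid) ** (2.0 / (1.0 + beta)))

def conditional_mode(r, s, rho_hat, beta, a_grid, b_grid):
    """Grid mode of (3.9) for fixed beta: the (a, b) pair that minimises M."""
    values = np.array([[M(a, b, r, s, rho_hat, beta) for b in b_grid] for a in a_grid])
    i, j = np.unravel_index(np.argmin(values), values.shape)
    return a_grid[i], b_grid[j]

# Synthetic, noise-free example: 15 station pairs, true parameters on the grid.
rng = np.random.default_rng(0)
r = rng.uniform(-8.0, 8.0, size=15)   # hypothetical x deviates
s = rng.uniform(-8.0, 8.0, size=15)   # hypothetical y deviates
a_grid = np.linspace(0.0, 0.1, 11)
b_grid = np.linspace(0.0, 0.1, 11)
a_true, b_true = a_grid[3], b_grid[5]
rho_hat = np.exp(-(a_true * r ** 2 + b_true * s ** 2))

modes = {beta: conditional_mode(r, s, rho_hat, beta, a_grid, b_grid)
         for beta in (-0.5, 0.0, 0.5)}
```

With $\beta = 0$ the exponent in $M$ is 2, so the same routine returns the least squares estimate on the grid.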

Note that (3.9) is a symmetric unimodal distribution of the parameters a and b for any value of $\beta$. In particular, when $\beta = 0$ the modal values of a and b are the least squares estimates. Setting the value of $\beta$ to zero at that stage reflects the modeller's confidence in the normality of the error structure; however, this is seldom the case. If the error structure is assumed to fall within the family of exponential power distributions (3.3), with the normal distribution as the special case $\beta = 0$, the modeller reflects his uncertainty about the value of $\beta$ by specifying a uniform reference prior distribution of $\beta$,

$$p(\beta) = \begin{cases} \tfrac{1}{2}, & -1 < \beta \le 1 \\ 0, & \text{otherwise} \end{cases} \tag{3.13}$$

Given the sample correlation coefficients $\hat{\rho}$, the resulting error structure cannot fully describe the parent distribution from which it is derived; however, it gives an indication of probable values of $\beta$ by means of a posterior distribution. The generality of (3.12) is evident from its form, in that it allows the modeller to specify his prior belief (including ignorance) about the value of $\beta$. Also, on integration of (3.12) with respect to a and b, the posterior distribution of $\beta$ is obtained,

$$p_u(\beta \mid \hat{\rho}) \propto p(\beta)\;\Gamma[1 + \tfrac{1}{2}m(1+\beta)]\;\{\Gamma[1 + \tfrac{1}{2}(1+\beta)]\}^{-m}\; J(\beta) \tag{3.14}$$

Now integrating (3.12) with respect to $\beta$ yields the posterior distribution of $(a, b)$;

$$p(a, b \mid \hat{\rho}) = \int_{-1}^{1} p(a, b, \beta \mid \hat{\rho})\; d\beta = \int_{-1}^{1} p(a, b \mid \beta, \hat{\rho})\; p_u(\beta \mid \hat{\rho})\; d\beta \tag{3.15}$$

Thus (3.14) tells us what our estimate of the distribution of $\beta$ is now that we have taken our data into consideration.
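One way to evaluate the posterior of $\beta$ and the marginal posterior of $(a, b)$ numerically is to tabulate the joint posterior of $(a, b, \beta)$ on a grid and sum over the unwanted dimensions. The sketch below does this with a uniform prior on $\beta$, working in logarithms to avoid overflow. The grids, the truncation of the $(a, b)$ range, the synthetic data and the function names are all assumptions; the paper does not state how its integrals were computed.

```python
import math
import numpy as np

def log_M(r, s, rho_hat, beta, a_grid, b_grid):
    """log M(a, b) tabulated on the (a, b) grid for one value of beta."""
    A, B = np.meshgrid(a_grid, b_grid, indexing="ij")
    model = np.exp(-(A[..., None] * r ** 2 + B[..., None] * s ** 2))
    power = 2.0 / (1.0 + beta)
    return np.log(np.sum(np.abs(rho_hat - model) ** power, axis=-1))

def grid_posteriors(r, s, rho_hat, beta_grid, a_grid, b_grid):
    """Riemann-sum versions of the posterior of beta and of (a, b)."""
    m = len(rho_hat)
    da = a_grid[1] - a_grid[0]
    db = b_grid[1] - b_grid[0]
    dbeta = beta_grid[1] - beta_grid[0]
    log_joint = np.empty((len(beta_grid), len(a_grid), len(b_grid)))
    for i, beta in enumerate(beta_grid):
        h = 0.5 * m * (1.0 + beta)
        # Gamma-function factors of the joint posterior; the uniform prior is constant.
        log_const = math.lgamma(1.0 + h) - m * math.lgamma(1.0 + 0.5 * (1.0 + beta))
        log_joint[i] = log_const - h * log_M(r, s, rho_hat, beta, a_grid, b_grid)
    joint = np.exp(log_joint - log_joint.max())
    joint /= joint.sum() * da * db * dbeta       # normalise on the grid
    p_beta = joint.sum(axis=(1, 2)) * da * db    # posterior of beta
    p_ab = joint.sum(axis=0) * dbeta             # marginal posterior of (a, b)
    return p_beta, p_ab

# Synthetic data with noise, so that M(a, b) is never exactly zero.
rng = np.random.default_rng(1)
r = rng.uniform(-8.0, 8.0, size=15)
s = rng.uniform(-8.0, 8.0, size=15)
rho_hat = np.exp(-(0.03 * r ** 2 + 0.05 * s ** 2)) + rng.normal(0.0, 0.03, size=15)
beta_grid = np.linspace(-0.9, 0.9, 19)
a_grid = np.linspace(0.005, 0.1, 40)
b_grid = np.linspace(0.005, 0.1, 40)
p_beta, p_ab = grid_posteriors(r, s, rho_hat, beta_grid, a_grid, b_grid)
```

The grid stops short of $\beta = -1$, where the exponent $2/(1+\beta)$ is unbounded.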

Results Obtained

The above method of fitting spatial correlation was applied to the Hacking Catchment in New South Wales. Observations from 6 recording stations for a storm of 36 hours duration were used for the analysis. Fifteen sample correlation coefficients were used for the fitting procedure. The numerical techniques used in solving the equations in Section 3.2 are, due to their complexity, not treated in this paper; only the results obtained and their implications are discussed.

The modal values of a and b, obtained by maximising (3.9) for various values of $\beta$, are given in Table 3.1.

TABLE 3.1 - Modal parameter values for various β

β           -0.9   -0.6   -0.3    0.0    0.3    0.6    0.9
a x 1000    1.36   0.96   0.85   0.73   0.61   0.61   0.61
b x 1000    0.35   0.94   0.94   0.88   0.77   0.57   0.52

The apparent insensitivity of the parameters to the values of $\beta > 0.0$ is due to the fact that the catchment considered was small (82 sq. km). This is indeed reflected by the near-zero values of the parameters a and b, as

$$\lim_{a, b \to 0} \rho(r, s) = 1$$

which is expected for stations which are close in space.

The derivation of the posterior density of $\beta$ from the given data is, in fact, the underlying reason for the superiority of this method of optimisation of the model parameters as opposed to that of the least squares technique. The posterior density of $\beta$ is shown in Figure 3.1.

FIGURE 3.1 - Posterior Distribution of β, $p_u(\beta \mid \hat{\rho})$ (horizontal axis: β from -0.9 to 0.9)

Note that, given the sample correlation coefficients, the deviation of the error distribution from normality is significant, as

$$P[\beta < 0 \mid \hat{\rho}] = 0.87$$

When using the least squares technique the modeller sets $\beta = 0.0$ without investigating possibly more probable values of $\beta$ (as in this case). From the posterior density of $\beta$, it may appear that selecting the modal values of the parameters corresponding to the value of $\beta$ for which the posterior density is maximised is justified. However, this is not the case, as the posterior distribution of the parameters a and b varies with $\beta$, as indicated in (3.9). It is in fact (3.15), the posterior density of a and b, that the modeller has to adopt. It gives the combined effect of the variation of the posterior density of a and b conditioned on $\beta$ and the posterior density of $\beta$ itself. A plot of the posterior density of a and b is given in Figure 3.2.

FIGURE 3.2 - Posterior Distribution of a and b given the data

The posterior density is maximised at (a, b) = (1.3, 0.5). Two points are worth mentioning about these values of a and b. The first is that although these values are not much different from the least squares estimate (Table 3.1 with $\beta = 0.0$), the posterior density associated with the latter is very much lower than that of the former. The second is that the posterior density of (a, b) at the conditional mode obtained by conditioning on the mode of the posterior density of $\beta$ (Table 3.1 with $\beta = -0.9$) is again very much lower than that at the mode of a and b. This second point demonstrates the power of this application of Bayesian analysis, which combines the effect of goodness of fit (conditional density of a, b) and the resultant error structure (posterior density of $\beta$).

Finally as the whole posterior density of a and b is given, the modeller can choose the

value of the parameter depending on his use of the model. He may choose the expected values

of a and b for unbiassedness or the modal value. Clearly these two values are not the same

as the posterior density is not symmetric about the mode of the parameters.

As with the least squares technique, we still have to verify the validity of assumptions (a) and (b) given in Section 3.1. An example of a residual plot is given in Figure 3.3 for both directions. The runs test can be employed to check for serial correlation in the residuals.

FIGURE 3.3 - Residual plot against x and y deviates (panels labelled β = 0.6)

The runs test indicated that the number of runs obtained was non-critical and hence there is no evidence of serial correlation. However, the mean of the residuals is not zero, suggesting that a data transformation of the type indicated in Section 3.1 is necessary.
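A runs test of the kind referred to above can be sketched as follows. This is the standard Wald-Wolfowitz form with a normal approximation, applied to the signs of the residuals about zero; whether the original analysis used signs about zero or about the median, and which critical values it used, is not stated, so these choices are assumptions.

```python
import math

def runs_test(residuals):
    """Wald-Wolfowitz runs test on residual signs; returns (runs, expected, z)."""
    signs = [e > 0 for e in residuals if e != 0]   # zero residuals are dropped
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    # A new run starts wherever consecutive residuals change sign.
    runs = 1 + sum(1 for x, y in zip(signs, signs[1:]) if x != y)
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n ** 2 * (n - 1.0))
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z

# Perfectly alternating residuals: the largest possible number of runs.
runs, expected, z = runs_test([1.0, -1.0] * 5)
```

A value of z far from zero in either direction would point to serial dependence: too few runs suggest positive correlation, too many suggest alternation.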

4. CONCLUSIONS

A logical procedure has been demonstrated for estimating the parameters of a simple model

of a spatial rainfall field for which point observations are available. The estimation

procedure takes into account the effect of non-normality of residuals.

5. REFERENCES

BARD, Y., Nonlinear Parameter Estimation, Academic, New York, 1974.

BOX, G.E.P. and COX, D.R., An Analysis of Transformations, Journal Royal Statistical Society, Ser. B, Vol. 26, No. 2, 211, 1964.

BOX, G.E.P. and TIAO, G.C., Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Massachusetts, 1973.

CHO, H.R., Stochastic Dynamics of Precipitation: An Example, Water Resour. Res., 21(8), 1225, 1985.

CLARKE, R.T., Statistical Methods for the Study of Spatial Variation in Hydrological Variables, Facets of Hydrology, Wiley, Bristol, 1976.

CREUTIN, J.D. and OBLED, C., Objective Analyses and Mapping Techniques for Rainfall Fields: An Objective Comparison, Water Resour. Res., 18(2), 413, 1982.

DAVID, M., Geostatistical Ore Reserve Estimation, Elsevier Scientific Publ. Co., New York, 1977.

DELFINER, P. and DELHOMME, J.P., Optimum Interpolation by Kriging, Display and Analysis of Spatial Data, NATO Advanced Study Institute, Wiley, 96, 1975.

DELHOMME, J.P., Kriging in Hydrosciences, Adv. Water Resour., 1(5), 251, 1978.

GUPTA, V.K., Preface, Water Resour. Res., 21(8), 1223, 1985.

SOROOSHIAN, S. and DRACUP, J.A., Stochastic Parameter Estimation Procedures for Hydrologic Rainfall-Runoff Models: Correlated and Heteroscedastic Error Cases, Water Resour. Res., 16(2), 430, 1980.

TABIOS, G.Q. and SALAS, J.D., A Comparative Analysis of Techniques for Spatial Interpolation of Precipitation, Water Resour. Bull., 21(3), 365, 1985.

TROUTMAN, B.M., Errors and Parameter Estimation in Precipitation-Runoff Modelling 1, Theory, Water Resour. Res., 21(8), 1195, 1985.

YEVJEVICH, V. and KARPLUS, A.K., Area-Time Structure of the Monthly Precipitation Process, Hydrology Paper No. 64, Colorado State University, Fort Collins, Colorado, 45 pp., 1973.
