Trond Reitan (Division of statistics and insurance mathematics, Department of Mathematics, University of Oslo)

Classical and Bayesian nonlinear regression applied to hydraulic rating curve inference.

Construction and uncertainty analysis of stage-discharge rating curves


Motivation for this work

• River hydrology – management of fresh water resources

• Decision-making concerning flood risk

• Decision-making concerning drought

• River hydrology => How much water is flowing through the rivers?

• Key definition: discharge, the amount of water passing through a cross-section of the river per unit of time


Key problem

• Discharge is expensive to measure, but hydrologists want discharge time series!

• Solution: Find a relationship between discharge and something that is inexpensive to measure.

• Usually, that something is water level.

• This job must be done over and over again: need solid tools for finding such relationships.

• Discharge measurements are uncertain => need statistical tools.

• The program must be easy for hydrologists to use => user friendliness in statistics?


Water level definitions

• Stage: the height of the water level at a river site

[Figure: river cross-section showing the stage h above the datum (height = 0), the level h0, and the discharge Q.]


Stage-discharge relationship

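[Figure: river cross-section showing the discharge Q, the stage h above the datum (height = 0), and the level h0, annotated with the relationship $Q = C(h - h_0)^b$.]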


Stage-discharge relationship

• Simple physical attributes:

– $Q = 0$ for $h \le h_0$

– $Q(h_2) > Q(h_1)$ for $h_2 > h_1 > h_0$

• Parametric form suggested by hydraulics (Lambie (1978) and ISO 1100/2 (1998)): $Q = C(h - h_0)^b$

• Parameters may be fixed only within stage intervals: segmentation.

[Figure: two panels, Q against h and channel width against h.]
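The two physical attributes and the power-law form can be sketched in a few lines. This is only an illustration: the function name `rating_curve` and the parameter values are assumptions made here, not taken from the work being presented.

```python
import numpy as np

def rating_curve(h, C, h0, b):
    """Discharge Q = C * (h - h0)**b for h > h0, and Q = 0 for h <= h0."""
    h = np.asarray(h, dtype=float)
    # Clip the depth at zero so that stages at or below h0 give zero discharge.
    depth = np.clip(h - h0, 0.0, None)
    return C * depth ** b

# Hypothetical parameter values, chosen only for illustration.
stages = np.array([0.5, 1.0, 2.0, 3.0])
Q = rating_curve(stages, C=10.0, h0=1.0, b=2.0)
```

With positive C and b, the curve is zero up to h0 and strictly increasing above it, which matches the two attributes listed on the slide.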


Calibration data and statistical model

• n stage-discharge measurements.

• Discharge is error-prone.

• Statistical inference on C, b, h0 => nonlinear regression.

• $Q_i = C(h_i - h_0)^b E_i$, where $E_i \sim \mathrm{logN}(0, \sigma^2)$ is i.i.d. noise and $i \in \{1, \dots, n\}$.

• $q_i = a + b\log(h_i - h_0) + \varepsilon_i$, where $q_i = \log Q_i$, $a = \log C$ and $\varepsilon_i \sim N(0, \sigma^2)$ i.i.d.

• Problem: enable hydrological engineers to estimate $Q(h) = C(h - h_0)^b$ and evaluate the calibration uncertainty.
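The link between the multiplicative model and its log-transformed version can be checked on simulated data. The sketch below assumes "true" parameter values and a stage range invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" parameters and sample size, for illustration only.
C, h0, b, sigma = 10.0, 1.0, 2.0, 0.1
n = 50

# Simulated stages above h0 and multiplicative log-normal noise.
h = rng.uniform(1.2, 4.0, size=n)
E = rng.lognormal(mean=0.0, sigma=sigma, size=n)
Q = C * (h - h0) ** b * E

# Taking logs turns the multiplicative noise into additive normal noise.
q = np.log(Q)
a = np.log(C)
eps = q - (a + b * np.log(h - h0))  # equals log(E), so eps ~ N(0, sigma^2)
```

The transformed model is linear in a and b, but still nonlinear in h0.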


One segment fitting, the old way

1) Guess or make an approximate measurement of $h_0$. Then do linear regression of $q_i$ on $\log(h_i - h_0)$.

2) For each plausible value of $h_0$, do linear regression. Choose the $h_0$ that gives the smallest SSE for the regression.

– Same as doing maximum likelihood inference on the model: $q_i = a + b\log(h_i - h_0) + \varepsilon_i$, $i \in \{1, \dots, n\}$

– Means that uncertainty analysis becomes available (?).

– Studied by Venetis (1970).

– Also by Clarke (1999).
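Procedure 2 amounts to a one-dimensional search over h0 with an ordinary least-squares fit at each candidate. The sketch below uses a plain grid search on synthetic data. The function names, grid and data are assumptions made for illustration, not the implementation used in the cited papers.

```python
import numpy as np

def fit_given_h0(h, q, h0):
    """Least-squares fit of q = a + b*log(h - h0) for a fixed h0."""
    x = np.log(h - h0)
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, q, rcond=None)
    resid = q - X @ coef
    return coef[0], coef[1], float(resid @ resid)

def profile_fit(h, q, h0_grid):
    """Return (a, b, h0, SSE) for the grid value of h0 with the smallest SSE."""
    fits = [(fit_given_h0(h, q, h0), h0) for h0 in h0_grid]
    (a, b, sse), h0 = min(fits, key=lambda f: f[0][2])
    return a, b, h0, sse

# Synthetic data from hypothetical parameters (log C = log 10, b = 2, h0 = 1).
rng = np.random.default_rng(1)
h = rng.uniform(1.2, 4.0, size=60)
q = np.log(10.0) + 2.0 * np.log(h - 1.0) + rng.normal(0.0, 0.05, size=60)

# Candidate values of h0 must lie below the lowest measured stage.
h0_grid = np.linspace(0.0, h.min() - 0.01, 200)
a_hat, b_hat, h0_hat, sse = profile_fit(h, q, h0_grid)
```

A grid is used here for transparency. A bounded grid also hides the case where the SSE keeps decreasing as h0 moves away from the data, so a minimum at the edge of the grid deserves a closer look.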


Problems concerning classic one segment curve fitting

• Sometimes exhibits heteroscedasticity.

• Sometimes there's no finite solution!

– Found a set of requirements that ensures finite estimates.

– In practice, broken requirements mean no finite estimates.

– The model can produce data that break the requirements for any set of stage measurements!

– Parameter estimators have infinite expectation => uncertainty inference becomes difficult!

– Explored in paper 1, Reitan and Petersen-Øverleir (2006), and in the appendix, Reitan and Petersen-Øverleir (2005).
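One way to see how a non-finite estimate can arise is to follow a path on which $h_0 \to -\infty$ while $b$ grows in proportion. This is an illustrative sketch, not necessarily the argument of the cited papers, and the symbols $\alpha$ and $\beta$ are introduced here only for this purpose.

```latex
\begin{align*}
a + b\log(h - h_0) &= a + b\log(-h_0) + b\log\!\left(1 - \frac{h}{h_0}\right), \qquad h_0 < 0,\\
\text{with } b = -\beta h_0 \text{ and } a &= \alpha - b\log(-h_0):\\
a + b\log(h - h_0) &= \alpha - \beta h_0\log\!\left(1 - \frac{h}{h_0}\right)
\;\longrightarrow\; \alpha + \beta h \quad \text{as } h_0 \to -\infty .
\end{align*}
```

The limit follows from $\log(1 - h/h_0) \approx -h/h_0$ for large $|h_0|$. If a data set is fitted better by this limiting form, which is linear in $h$ on the log scale, than by any power law with finite parameters, then the SSE keeps decreasing along the path and no finite minimiser exists.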


Bayesian one segment fitting

• Based on the same data model, but with a prior distribution on the set of parameters. The Bayesian study of this resulted in paper 2, Reitan and Petersen-Øverleir (2008a).

• Bayesian analysis of other models done by Moyeed and Clarke (2005) and Árnason (2005).


Pros and cons

• Upsides:

– Encodes hydraulic knowledge.

– Can put softer ‘restrictions’ on the parameters.

– Finite estimates.

– Natural uncertainty measures.

• Downsides:

– Requires heavier numerical methods (MCMC).

– Coming up with a prior distribution can be hard.

– Also sometimes exhibits heteroscedasticity.


Input - prior distribution

• Prior distribution form as simple as possible.

• Prior knowledge either local or regional.

• Regional knowledge can be extracted once and for all.

• At-site prior knowledge can be set through asking for credibility intervals.


Output – estimates and uncertainties

• Estimates: expectations or medians from the posterior distribution.

• Uncertainty: credibility intervals for the parameters and for the curve itself, $Q(h) = C(h - h_0)^b$.
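The input-to-output chain for the Bayesian one-segment fit can be sketched with a random-walk Metropolis sampler. Everything specific below is an assumption made for illustration: the independent normal priors, the constraint b > 0, the step sizes, the starting point and the synthetic data. None of it is the prior or the algorithm of the cited paper.

```python
import numpy as np

def log_posterior(theta, h, q):
    """Unnormalised log posterior for theta = (a, b, h0, log sigma)."""
    a, b, h0, log_sigma = theta
    # h0 must lie below every measured stage; b > 0 is an assumed constraint.
    if h0 >= h.min() or b <= 0:
        return -np.inf
    sigma = np.exp(log_sigma)
    resid = q - (a + b * np.log(h - h0))
    log_lik = -len(q) * np.log(sigma) - 0.5 * np.sum(resid**2) / sigma**2
    # Hypothetical independent normal priors (assumptions, not from the source).
    log_prior = (-0.5 * (a / 5.0) ** 2 - 0.5 * (b - 2.0) ** 2
                 - 0.5 * (h0 / 2.0) ** 2 - 0.5 * (log_sigma / 2.0) ** 2)
    return log_lik + log_prior

def metropolis(h, q, start, step, n_iter, rng):
    """Random-walk Metropolis sampler; returns an (n_iter, 4) array of draws."""
    theta = np.array(start, dtype=float)
    lp = log_posterior(theta, h, q)
    draws = np.empty((n_iter, theta.size))
    for t in range(n_iter):
        prop = theta + step * rng.normal(size=theta.size)
        lp_prop = log_posterior(prop, h, q)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept or reject
            theta, lp = prop, lp_prop
        draws[t] = theta
    return draws

# Synthetic data from hypothetical parameters (log C = log 10, b = 2, h0 = 1).
rng = np.random.default_rng(2)
h = rng.uniform(1.2, 4.0, size=60)
q = np.log(10.0) + 2.0 * np.log(h - 1.0) + rng.normal(0.0, 0.05, size=60)

draws = metropolis(h, q, start=[2.0, 2.0, 0.8, np.log(0.2)],
                   step=np.array([0.02, 0.02, 0.02, 0.1]), n_iter=20000, rng=rng)
post = draws[5000:]  # discard the first draws as burn-in

# Parameter estimates and 95% credibility intervals.
median = np.median(post, axis=0)
lower, upper = np.percentile(post, [2.5, 97.5], axis=0)

# Posterior for the curve itself at a chosen stage.
h_new = 3.0
Q_new = np.exp(post[:, 0] + post[:, 1] * np.log(h_new - post[:, 2]))
Q_median = np.median(Q_new)
Q_lo, Q_hi = np.percentile(Q_new, [2.5, 97.5])
```

Because a, b and h0 are strongly correlated in the posterior, a plain random walk like this mixes slowly. It is meant to show the inputs and outputs, not an efficient sampler.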


Segmentation

• Original idea: Divide the stage-discharge measurements into sets and fit $Q(h) = C(h - h_0)^b$ separately for each segment. This can fit a wider range of measurement sets.

[Figure: Q against h, with Segment 1 and Segment 2 meeting at an intersection.]


Problems with manual segmentation

1) Uncertainty analysis of manual decisions not statistically available.

2) Curves fitted to two neighbouring sets may not intersect.

3) Two such curves may have intersections only inside the sets.

[Figure: two panels of Q against h, one labelled "Jump in the curve".]


Statistical model for segmentation – the interpolation model

• Idea: Make a model with segments and let the data be attached to that model.

• Model: for k segments, introduce k-1 segment limit parameters, $h_{s,1}, \dots, h_{s,k-1}$. For a measurement with $h_{s,j-1} < h_i < h_{s,j}$, assume $q_i = a_j + b_j\log(h_i - h_{0,j}) + \varepsilon_i$.

• Make sure there’s continuity by sacrificing one of the parameters in the upper segments (aj for j>1).

• Goal: Make inference on all parameters in this model. Also, make inference on k.
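The continuity constraint can be made concrete: once $a_1$, the $b_j$, the $h_{0,j}$ and the segment limits are given, each $a_j$ for j>1 follows from matching neighbouring segments at their shared limit. The sketch below is an assumption-laden illustration; the function name and the example values are hypothetical.

```python
import numpy as np

def segmented_log_discharge(h, a1, b, h0, hs):
    """log Q(h) for k segments; b and h0 have length k, hs has length k-1.

    Every stage must exceed the h0 of the segment it falls in.
    """
    h = np.asarray(h, dtype=float)
    b = np.asarray(b, dtype=float)
    h0 = np.asarray(h0, dtype=float)
    hs = np.asarray(hs, dtype=float)
    # Continuity: a_j (j > 1) is fixed by matching segments j-1 and j at hs[j-1].
    a = [a1]
    for j in range(1, len(b)):
        left = a[j - 1] + b[j - 1] * np.log(hs[j - 1] - h0[j - 1])
        a.append(left - b[j] * np.log(hs[j - 1] - h0[j]))
    a = np.array(a)
    # Segment index of each stage: 0 at or below hs[0], 1 up to hs[1], and so on.
    seg = np.searchsorted(hs, h, side="left")
    return a[seg] + b[seg] * np.log(h - h0[seg])

# Hypothetical two-segment curve with a segment limit at h = 2.
params = dict(a1=np.log(10.0), b=[2.0, 1.5], h0=[1.0, 0.5], hs=[2.0])
below = segmented_log_discharge(2.0 - 1e-9, **params)
above = segmented_log_discharge(2.0 + 1e-9, **params)
at_limit = segmented_log_discharge(2.0, **params)
```

Only $a_1$ remains a free level parameter. This is the sense in which one parameter per upper segment is sacrificed for continuity.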


Frequentist inference on the interpolation model

• Segmented model first formulated and treated by using the maximum likelihood method in paper 3, Petersen-Øverleir and Reitan (2005).

• Problems:

1) Possibility of infinite parameter estimates inherited from the one-segment model. (Much more likely than usual for upper segments.)

2) Multi-modality and discontinuous derivative of the marginal likelihood of the changepoint parameters, $\{h_{s,j}\}$.

3) Inference on k through AIC or BIC?


Bayesian inference on the interpolation model

• Need a prior distribution for the changepoints, $\{h_{s,j}\}$, and for the number of segments, k.

• MCMC for each sub-model characterized by k.

• Importance sampling for the posterior sub-model probability, Pr(k|D).

• Input: Data, prior probability of each sub-model and prior for the parameter set of each sub-model. (Can be regional or partially regional. Set by asking for credibility intervals.)

• Output: Pr(k|D) and the posterior distribution of all parameters for each k. (Summarised by estimates and credibility intervals.)
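Once a marginal likelihood has been estimated for each sub-model (for instance by importance sampling), the posterior sub-model probabilities follow from Bayes' rule, Pr(k|D) ∝ Pr(k) p(D|k). The sketch below shows only this last normalisation step. The log marginal likelihood values are made up for illustration, and the function name is hypothetical.

```python
import numpy as np

def posterior_model_probs(log_marginal_lik, prior_probs):
    """Pr(k|D) from log p(D|k) and Pr(k), computed stably on the log scale."""
    log_w = np.asarray(log_marginal_lik, dtype=float) + np.log(
        np.asarray(prior_probs, dtype=float))
    log_w -= log_w.max()  # guard against underflow before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

# Made-up log marginal likelihoods for k = 1, 2, 3 with equal prior probability.
probs = posterior_model_probs([-120.0, -112.0, -113.0], [1 / 3, 1 / 3, 1 / 3])
best_k = int(np.argmax(probs)) + 1
```

Working on the log scale matters in practice, because marginal likelihoods of real data sets are often too small to represent directly.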


Output example for interpolation model inference

[Figure: example output from interpolation model inference, with discharge Q on one axis.]


Problems with Bayesian treatment of segmented models

• Difficult to make efficient inference algorithms (but a semi-efficient one has been made).

• Changepoints only inside the dataset (thus "the interpolation model"). Extrapolation uncertainty is underestimated because there can be changepoints outside the dataset.

• Solution(?): The process model, a new model where the segments appear through a process.

• Problems with the process model: Very inefficient algorithms. Difficult to implement all sorts of relevant prior knowledge.

• Middle ground? Use the changepoints of the most probable sub-model from the interpolation model as data in inference about the changepoint process. The process model is then used for extrapolation of the curve.


References

1) Árnason S (2005), Estimating nonlinear hydrological rating curves and discharge using the Bayesian approach. Masters Degree, Faculty of Engineering, University of Iceland

2) Clarke, RT (1999), Uncertainty in the estimation of mean annual flood due to rating curve indefinition. J Hydrol, 222: 185-190

3) ISO 1100/2. (1998), Stage-discharge Relation, Geneva

4) Lambie JC (1978), Measurement of flow - velocity-area methods. Hydrometry: Principles and Practices, first edition, edited by R.W. Herschy, John Wiley & Sons, UK.

5) Moyeed RA, Clarke RT (2005), The use of Bayesian methods for fitting rating curves, with case studies. Adv Water Res, 28:8: 807-818

6) Petersen-Øverleir A, Reitan T (2005), Objective segmentation in compound rating curves. J Hydrol, 311: 188-201

7) Reitan T, Petersen-Øverleir A (2005), Estimating the discharge rating curve by nonlinear regression - The frequentist approach. Statistical Research Report, University of Oslo, Preprint 2, 2005. Available at: http://www.math.uio.no/eprint/stat report/2005/02-05.html

8) Reitan T, Petersen-Øverleir A (2006), Existence of the frequentistic regression estimate of a power-law with a location parameter, with applications for making discharge rating curves. Stoc Env Res Risk Asses, 20:6: 445-453

9) Reitan T, Petersen-Øverleir A (2008a), Bayesian power-law regression with a location parameter, with applications for construction of discharge rating curves. Stoc Env Res Risk Asses, 22: 351-365

10) Venetis C (1970), A note on the estimation of the parameters in logarithmic stage-discharge relationships with estimation of their error, Bull Inter Assoc Sci Hydrol, 15: 105-111
