
Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models

Prof. Christine Shoemaker

Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman

School of Civil and Environmental Engineering and School of Operations Research and Industrial Engineering

Cornell University

South Florida Water District Morning Meeting, Sept. 24, 2003

Models Help Extract Information from Data

• Point Data: from monitoring or experiments at a limited number of points in space and time

• Model: describes temporal and spatial connections

• Forecasts (with statistical representation)

• Comparison of Alternative Management Options

• Understanding Processes: from point data to processes continuous in space and time


Steps in Modeling

• Calibration—selecting parameter values within acceptable limits to fit the data as well as possible

• Validation—applying the model and calibrated parameters to an independent data set

• Sensitivity Analysis—assessing the impact of changes in uncertain parameter values on model output

• Uncertainty Analysis—assessing the range of model outcomes likely given uncertainty in parameters, model error, and exogenous factors like weather.

Computationally Expensive Models

• It is difficult to calibrate many parameters with existing methods using only a limited number of simulations.

• Most existing uncertainty methods require thousands of simulations.

• We can only do a limited number of model simulations if models take hours to run.

• Our methods are designed to reduce the number of simulations required to do good calibration and sensitivity analysis.

Methods and Applications

• We will discuss a general methodology for calibration, sensitivity analysis and uncertainty analysis that can be applied to many types of computationally expensive models.

• We will present numerical examples for two “real life” problems: a watershed and a groundwater remediation site.

1. Effective Use of Models and Observations Through Calibration, Sensitivity Analysis and Uncertainty Analysis

A description of the technical approach and “real life” applications, including:

1. Sensitivity analysis for a large number of parameters, with application to a large watershed.

2. Optimization methods for calibration, with application to groundwater remediation based on field data.

3. Uncertainty analysis based on a groundwater model.

Cannonsville Watershed

• Cannonsville Reservoir Basin – an agricultural basin of 1200 km2
• Supply of New York City drinking water
• To avoid an $8 billion water filtration plant, model analysis is needed to help manage phosphorus

The watershed is subject to economic constraints if phosphorus loads violate the TMDL.

[Map: monitoring stations in the Cannonsville watershed. Legend: USGS flow gauges, climate stations, sediment monitoring stations; subwatershed boundaries; rivers and streams. Labeled sites: Trout Creek, W. Br. Delaware R. @ Walton, Little Delaware R., W. Br. Delaware @ Delhi, Beerston, Town Brook.]

There are over 20,000 data points for this watershed.

Questions

• Using all these data, can we develop a model that is a useful forecasting tool to assess the impact of weather and phosphorus management actions on future loading to the reservoir?

• What phosphorus management strategies, if any, should be undertaken?

I. Methodology for Sensitivity Analysis of a Model with Many Parameters: Application to Cannonsville Basin

• Joint work with Jennifer Benaman (Cornell Ph.D. in Civil and Environmental Engineering, 2003)

• Funded by an EPA STAR Fellowship

Sensitivity Analysis with Many Parameters

• Sensitivity Analysis measures the change in model output associated with the change (perturbation) in model input (e.g. in parameter values).

• Purposes include:
– To help select which parameters should be adjusted in a calibration and which can be left at default values.
– This makes multivariate sensitivity and uncertainty analysis more feasible for computationally expensive models.

Sensitivity Analysis with Many Parameters – Additional Purposes

– To prioritize additional data collection, and
– To estimate potential errors in model forecasts that could be due to parameter value errors.

• Sensitivity analysis and calibration are difficult with a large number of parameters.

Questions

• Can we develop a sensitivity analysis method that:
– is robust (doesn’t depend strongly on our assumptions)?
– is computationally efficient for a large number of parameters (hundreds)?
– allows us to consider many different model outputs simultaneously?

• 160 parameters:
– 35 basinwide
– 10 vary by land use (10 x 5 land uses)
– 7 vary by soil (7 x 10 soil types)
– 2 additional for corn and hay
– 1 additional for pasture

• Ranges obtained from literature, databases, and the SWAT User’s Manual

Workflow: Choose Parameters → Establish Parameter Ranges → Choose Output Variables of Concern

Application to Cannonsville Watershed

[Map of monitoring stations in the Cannonsville watershed, as shown earlier.]

Output Variables of Concern

• Basinwide (average annual from 1994-1998):
– Surface water runoff
– Snowmelt
– Groundwater flow
– Evapotranspiration
– Sediment yield

• In-stream at specific locations (monthly average over entire simulation):
– Flow @ Beerston
– Flow @ Trout Creek
– Flow @ Town Brook
– Flow @ Little Delaware River
– Sediment load @ Beerston
– Sediment load @ Town Brook


Final Results

Percentage of times each parameter ranked in the 'Top 20', under four weighting methods: A (all equal weights), B (focus on Beerston), C (focus on calibration), D (focus on basinwide management).

Parameter        A    B    C    D
APMBASIN       100  100  100  100
BIOMIXBASIN    100  100  100  100
CN2CSIL        100  100  100  100
CN2FRSD        100  100  100  100
CN2PAST        100  100  100  100
RSDCOPAST      100  100  100  100
SLSUBBSNBASIN  100  100  100  100
SMFMNBASIN     100  100  100  100
T_BASEPAST     100  100  100  100
T_OPTPAST      100  100  100  100
USLEKNY129     100  100  100  100
ESCONY129      100   75   75  100
SMTMPBASIN     100   75   75  100
LAT_SEDBASIN   100   50  100  100
CN2HAY          75   75   75   75
ESCONY132       75   75   75   50
GWQMNBASIN      75   75   75   75
TIMPBASIN       75   50   75   75
BIO_MINPAST     75   50   50   75
ROCKNY132       75   25   50   50
REVAPMNBASIN    50   50   50   75
ROCKNY129       50   25   50   25
USLEPCSIL       25   25   50   25
HVSTICSIL       25   25   25   50
USLECPAST       25   25   25   25
SMFMXBASIN      25    0    0   50
GSIPAST          0    0   25    0
ROCKNY026        0    0   25    0

Parameters at 100 in all four columns are in the top 20 for ALL cases; those at 75 are in the top 20 most of the time.

Computational Issues

• We have a robust method for determining the importance and sensitivity of parameters.

• An advantage is that the number of model simulations is independent of the number of output variables, sensitivity indices, or weighting factors considered in the combined sensitivity analysis. (Almost no extra computation is required to consider many output variables, indices or weightings.)

• The number of simulations is simply the number required to do a single (non-robust) univariate sensitivity analysis multiplied by the number of perturbation methods (= 2 in this example); see the sketch below.
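To make the counting concrete, here is a minimal Python sketch of a one-at-a-time sensitivity screen under stated assumptions: `run_model` is a hypothetical stand-in for the costly simulation, and the perturbation rule (a fixed fraction of each parameter's range) is simplified relative to the full methodology.

```python
import numpy as np

def univariate_screen(run_model, base, ranges, deltas=(0.10, -0.10)):
    """One-at-a-time sensitivity screen (sketch).

    run_model : callable mapping a parameter vector to a vector of outputs
                (hypothetical stand-in for the costly simulation).
    base      : nominal parameter values, shape (n_par,).
    ranges    : (low, high) bounds per parameter, shape (n_par, 2).
    deltas    : perturbation methods, as fractions of each parameter's range.

    The number of costly simulations is n_par * len(deltas) (plus one
    nominal run); it does not depend on how many outputs, indices, or
    weightings are analyzed afterward.
    """
    base = np.asarray(base, dtype=float)
    y0 = np.asarray(run_model(base), dtype=float)    # nominal run
    S = np.zeros((len(deltas), base.size, y0.size))  # sensitivity indices
    for m, delta in enumerate(deltas):               # perturbation methods
        for k in range(base.size):
            x = base.copy()
            x[k] += delta * (ranges[k][1] - ranges[k][0])
            S[m, k] = (np.asarray(run_model(x)) - y0) / y0  # relative change
    return S  # rank parameters by |S| under any output weighting, for free
```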

Next Steps

• Once the most important parameters have been identified, we can extend the analysis to more detailed analyses including:
– Multivariate sensitivity analysis (changes in more than one parameter at a time)
– Uncertainty analysis (e.g. GLUE)

• Both of these analyses are highly computationally demanding and can hence only be done with a small number of parameters.

• The (univariate) sensitivity analysis done here can identify the small number of parameters on which these analyses should be focused.

Questions

• Can we develop a sensitivity analysis method that:

– is robust (doesn’t depend strongly on our assumptions)?

– is computationally efficient for a large number of parameters (hundreds)?

– allows us to consider many different model outputs simultaneously?

• Yes, the results for Cannonsville indicate this is possible with this methodology.

• Models with longer simulation times require more total simulation time or fewer parameters.

II: Use of Response Surface Methods in Non-Convex Optimization, Calibration and Uncertainty Analysis

• Joint work with:
– Pradeep Mugunthan (PhD Candidate in Civil and Environmental Engineering)
– Rommel Regis (Postdoctoral Fellow with PhD in Operations Research)

• Funded by three National Science Foundation (NSF) projects

Computational Effort for Trial and Error (Manual) Calibration

• Assume that you have P parameters and you want to consider N levels of each.

• Then the total number of possible combinations of parameter sets is N^P.

• So with 10 parameters, considering only 2 values each (a very crude evaluation), there are 2^10 = 1024 possible combinations, too many to evaluate exhaustively for a computationally expensive function.

• With 8 parameters and a more reasonable 10 values each, there are 10^8 = 100 million possible combinations of parameters!

• With so many possibilities it is hard to find good solutions by trial and error with few (e.g. 100) function evaluations.
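A quick check of the counts above:

```python
# N levels for each of P parameters gives N**P candidate parameter sets.
print(2 ** 10)   # 10 parameters, 2 levels each -> 1024 combinations
print(10 ** 8)   # 8 parameters, 10 levels each -> 100,000,000 combinations
```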

Automatic Calibration

• We would like to find the set of parameter values (decision variables) such that:
– the calibration error (objective function) is minimized,
– subject to constraints on the allowable range of the parameter values.

This is an Optimization Problem.

It can be a global optimization problem.

NSF Project 1: Function Approximation Algorithms for Environment Analysis with Application to Bioremediation of Chlorinated Ethenes

• Title: “Improving Calibration, Sensitivity and Uncertainty Analysis of Data-Based Models of the Environment”

• The project is funded by the NSF Environmental Engineering Program.

• The following slides will discuss the application of these concepts to uncertainty analysis.

“Real World Problem”: Engineered Dechlorination by Injection of Hydrogen Donor and Extraction

We have developed a user-friendly transport model of engineered anaerobic degradation of chlorinated ethenes that models chemical and biological species and utilizes MT3D and RT3D.

This model is the application for the function approximation research.

Optimization

• Because our model is computationally expensive, we need to find a better way than trial and error to get a good calibration set of parameters.

• Optimization can be used to efficiently search for a “best” solution.

• We have developed optimization methods that are designed for computationally expensive functions.

Optimization

• Our goal is to find the minimum of f(x), where x ∈ D.

• We want to do very few evaluations of f(x) because it is “costly” to evaluate.

Here f(x) can be a measure of error between model predictions and observations, and x can be the parameter values.
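As a minimal sketch of this formulation (a stand-in for illustration, not the talk's RBF method, which comes next), the following Python snippet poses calibration as bound-constrained minimization; `calibration_error` is a hypothetical toy replacement for the costly f(x):

```python
import numpy as np
from scipy.optimize import differential_evolution

def calibration_error(x):
    """Toy stand-in for f(x); in practice this runs the full model (hours
    per evaluation) and returns the error against observations."""
    return float(np.sum((x - 0.3) ** 2))  # known minimum at x = 0.3

bounds = [(0.0, 1.0)] * 10  # allowable range of each parameter value (D)

# A standard global optimizer needs many f(x) evaluations -- exactly the
# motivation for the function approximation methods described next.
result = differential_evolution(calibration_error, bounds, seed=0)
print(result.x.round(2), result.fun)
```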

Global versus Local Minima

[Plot: f(x) versus x (parameter value), showing a local minimum and the global minimum.]

Many optimization methods only find one local minimum.

We want a method that finds the global minimum.

Experimental Design with Symmetric Latin Hypercube (SLHD)

• To fit the first function approximation we need to have evaluated the function at several points.

• We use a symmetric Latin Hypercube (SLHD) to pick these initial points.

• The number of points we evaluate in the SLHD is (d+1)(d+2)/2, where d is the number of parameters (decision variables).
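A minimal sketch of one way to construct such a design, under the assumption that points live on the unit cube (real parameter ranges would be rescaled afterward); the construction mirrors each point through the center of the cube:

```python
import numpy as np

def symmetric_latin_hypercube(n_points, dim, rng=None):
    """Symmetric Latin hypercube design on [0, 1]^dim (sketch).

    Each column is a permutation of the levels 1..n_points, and for every
    design point the mirrored point (level n_points + 1 - k in each
    coordinate) is also in the design.
    """
    rng = np.random.default_rng(rng)
    half = n_points // 2
    A = np.zeros((n_points, dim), dtype=int)
    for j in range(dim):
        chosen, used = [], set()
        for k in rng.permutation(n_points) + 1:  # random level order
            mirror = n_points + 1 - k
            if k == mirror or k in used or mirror in used:
                continue                         # one level per mirror pair
            chosen.append(k)
            used.update((k, mirror))
            if len(chosen) == half:
                break
        A[:half, j] = chosen
        A[n_points - half:, j] = n_points + 1 - np.array(chosen)[::-1]
    if n_points % 2 == 1:
        A[half, :] = (n_points + 1) // 2  # center point is its own mirror
    return (A - 0.5) / n_points  # cell midpoints in (0, 1)

# For d = 5 parameters, the initial design size quoted above is
# (d + 1)(d + 2) / 2 = 21 points:
X0 = symmetric_latin_hypercube(21, 5, rng=0)
```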

[Figure: one-dimensional example of experimental design to obtain the initial function approximation. Axes: x (parameter value), f(x) (objective function, a measure of error). Each dot is a costly function evaluation (e.g. over 0.5 hour of CPU time for one evaluation).]

[Figure: function approximation fitted through the initial points from the experimental design; axes x (parameters) and f(x). In real applications x is multidimensional, since there are many parameters (e.g. 10).]

[Figure: update of the function approximation with a new evaluation; axes x (parameter value) and f(x). The update is done in each iteration of the algorithm. The function approximation is a guess of the function value of f(x) for all x.]
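A minimal sketch of this fit-and-update loop, using scipy's RBFInterpolator as the function approximation and a deliberately simplified rule for picking the next point (the talk's algorithms balance the surrogate value against distance from already-evaluated points); `costly_f` is a toy stand-in for the expensive model:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def costly_f(x):
    """Toy stand-in for a model run that would take hours."""
    return float(np.sum((x - 0.3) ** 2))

rng = np.random.default_rng(0)
d = 5
# initial design (an SLHD in the talk; random points here for brevity)
X = rng.uniform(size=((d + 1) * (d + 2) // 2, d))
y = np.array([costly_f(x) for x in X])

for _ in range(20):  # each iteration costs exactly one expensive run
    surrogate = RBFInterpolator(X, y, kernel="cubic")  # refit approximation
    cand = rng.uniform(size=(2000, d))       # cheap to score on the surrogate
    x_new = cand[np.argmin(surrogate(cand))]  # promising point (simplified)
    X = np.vstack([X, x_new])
    y = np.append(y, costly_f(x_new))        # one new costly evaluation

print(X[np.argmin(y)].round(2), y.min())
```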

Use of Derivatives

• We use the gradient-based methods only on the function approximations R(x) (for which accurate derivatives are inexpensive to compute).

• We do not try to compute gradients/derivatives for the underlying costly function f(x).

Our RBF Algorithm

• Our paper on the RBF optimization algorithm will appear soon in the Journal of Global Optimization.

• The following graphs show a related RBF method, labeled “Our RBF”, as well as an earlier RBF optimization method suggested by Gutmann (2000) in the Journal of Global Optimization, labeled “Gutmann RBF”.

[Plot: comparison of RBF methods on a 14-dimensional Schoen test function (average of 10 trials). Y axis: mean of the best objective function value in 30 runs; X axis: number of function evaluations (120-300). Curves: Our RBF (ExpRBF-L), Gutmann RBF, Greedy RBF.]

[Plot: comparison of RBF methods on a 12-dimensional groundwater aerobic bioremediation problem, a PDE system (average of 10 trials). Y axis: mean of the best objective function value in 10 runs; X axis: number of function evaluations (80-200). Curves: Our RBF (ExpRBF-L), Gutmann RBF, Greedy RBF.]

The following results are from NSF Project 1: Function Approximation Algorithms for Environment Analysis with Application to Bioremediation of Chlorinated Ethenes (described above).

Now a real costly function: DECHLOR, a Transport Model of Anaerobic Bioremediation of Chlorinated Ethenes

• This model was originally developed by Willis and Shoemaker based on kinetics equations by Fennell and Gossett.

• This model will be our “costly” function in the optimization.

• Model based on data from a field site in California.

Complex model: 18 species at each of thousands of nodes of the finite difference model

[Diagram: reaction network. Donors (lactate, butyrate, propionate) are fermented (Lac2Ace, Lac2Prop, But2Ace, Prop2Ace) to acetate and H2; the dechlorinator uses H2 to transform the chlorinated ethenes PCE → TCE → DCE → VC → ethene; H2 is also consumed by methanogenesis (Hyd2Meth) to methane.]

Example of Objective Function for Optimization of Chlorinated Ethene Model

$$\mathrm{SSE} = \sum_{t=1}^{T}\sum_{i=1}^{I}\sum_{j=1}^{J}\left(Y^{o}_{tij} - Y^{s}_{tij}\right)^{2}$$

where SSE is the sum of squared errors between observed and simulated chlorinated ethenes; $Y^{o}_{tij}$ is the observed molar concentration of species j at time t, location i; $Y^{s}_{tij}$ is the simulated molar concentration of species j at time t, location i; t = 1 to T represent the time points at which measured data are available; j = 1 to J represents PCE, TCE, DCE, VC and ethene, in that order; and i = 1 to I is a set of monitoring locations.
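For concreteness, a direct Python transcription of this objective (the array layout is an assumption about how the data would be stored):

```python
import numpy as np

def sse(y_obs, y_sim):
    """Sum of squared errors between observed and simulated molar
    concentrations; arrays have shape (T, I, J): time points x monitoring
    locations x species (PCE, TCE, DCE, VC, ethene)."""
    return float(np.sum((np.asarray(y_obs) - np.asarray(y_sim)) ** 2))
```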

Algorithms Used for Comparison of Optimization Performance on Calibration

• Stochastic Greedy Algorithm
– Neighborhood defined to make the search global
– Neighbors generated from a triangular distribution around the current solution; moves only to a better solution

• Evolutionary Algorithms
– Derandomized evolution strategy (DES) with lambda = 10, b1 = 1/n and b2 = 1/n^0.5 (Ostermeier et al. 1992)
– Binary or real genetic algorithm (GA), population size 10, one-point crossover, mutation probability 0.1, crossover probability 1

• RBF Function Approximation Algorithms
– RBF Gutmann: radial basis function approach, with cycle length five and SLH space-filling design
– RBF-Cornell: our radial basis function approach

• FMINCON
– Derivative-based optimizer in Matlab with numerical derivatives

• 10 trials of 100 function evaluations were performed for the heuristic and function approximation algorithms for comparison.

[Plot: comparison of algorithms with NS as the objective function on a hypothetical problem. Y axis: -(average NS), where the best possible value of -NS is -1; X axis: number of function evaluations (30-90). Curves: FMINCON, RBF-CORNELL (ours), RBF-GUT, FMINCON+RBF, DES, RealGA, BinaryGA. Averages are based on 10 trials; 28 experimental design evaluations were done. A lower curve is better.]

[Boxplot: best objective value (CNS) produced by each algorithm in each trial over 10 trials, marking ours, the average, and outliers.]

Conclusions

• Optimizing costly functions is typically done only once.

• The purpose of our examination of multiple trials is to see how well one is likely to do when solving the problem only once.

• Hence we want the method that has both the smallest mean objective function value and the smallest variance.

• Our RBF has both the smallest mean and the smallest variance.

• The second best method is Gutmann RBF, so RBF methods seem very good in general.


Alameda Field Data

• The next step was to work with a real field site.

• We obtained data from a DOD field site studied by a group including Alleman, Morse, Gossett, and Fennell.

• Running the simulation model takes about three hours for one run of the chlorinated ethene model at this site because of the nonlinearities in the kinetics equations.

[Figure: layout of the Alameda field site.]

[Plot: range of objective values for the SSE objective function at the Alameda field site; mean, min and max shown for each algorithm (DES, FA-Gutmann, FA-RS (ours), FMINCON (gradient-based)). Y axis: SSE (mM)², from 150,000 to 650,000.]

Conclusions on RBF Optimization of Calibration

• Radial Basis Function Approximation Methods can be used effectively to find optimal solutions of costly functions.

• “Our RBF” performed substantially better than the previous RBF method by Gutmann on the difficult chlorinated ethene remediation problem, especially because our RBF is robust (small variance).

• Both Genetic algorithms and derivative-based search did very poorly.

• The two RBF methods did much better on the Alameda field data problem than other methods.

However, 300 hours (100 function evaluations at about 3 hours each) is a long time to wait!

Solution: Parallel Algorithms

• We would like to be able to speed up calculations for costly functions by using parallel computers.

• To get a good speed-up on a parallel computer, you need an algorithm that parallelizes efficiently.

• We are developing such an algorithm through a second NSF grant (from the Computer and Information Science Directorate).

III: Uncertainty Analysis

• Modelers have discovered that there is often more than one set of parameters that gives an “adequate” fit to the data.

• One approach to assessing the uncertainty associated with a model output is to look at the weighted mean and the variability of the output associated with all the sets of parameters that give an equally good fit.

[Figure: f(x) versus x (parameters). More than one parameter value might give acceptable goodness of fit. If we impose a “filter” and allow only the acceptable points, then only the black points are incorporated in the analysis.]

Uncertainty Analysis: GLUE Approach

• GLUE (Generalised Likelihood Uncertainty Estimation) is a methodology (by Beven and co-workers) used largely for watersheds (where computation times are not long).

Uncertainty Analysis via GLUE: Dots are Model Simulations of Parameter Combinations Chosen at Random (Two-Parameter Example)

[Scatter plot: parameter 1 versus parameter 2, distinguishing parameter combinations that give R² greater than 0.75 from those that give R² less than 0.75.]

GLUE Methodology (used mostly in watershed modeling)

• Step 1: Select combinations of parameter values at random and simulate the model for each combination.

• Step 2: Compare the goodness of fit (e.g. R²) of each model simulation with the data.

• Step 3: Take the acceptable points and weight their outputs to determine variability characteristics of the model output (e.g. mean and variance of the amount of contamination remaining after N years). A sketch of these steps follows below.
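A minimal Python sketch of these three steps; `model` and `goodness_of_fit` are hypothetical callables standing in for the costly simulation and its comparison against data (e.g. an R² score):

```python
import numpy as np

def glue_sample(model, goodness_of_fit, bounds, n_runs, threshold, rng=None):
    """GLUE recipe (sketch): random sampling, goodness-of-fit screening,
    and output statistics over the acceptable runs."""
    rng = np.random.default_rng(rng)
    lo, hi = np.asarray(bounds).T              # (low, high) per parameter
    kept = []
    for _ in range(n_runs):
        theta = rng.uniform(lo, hi)            # Step 1: random parameter set
        output = model(theta)                  # one costly simulation
        if goodness_of_fit(output) >= threshold:  # Step 2: screen the fit
            kept.append(output)                # Step 3: keep acceptable runs
    kept = np.asarray(kept, dtype=float)
    # variability characteristics of the output over the acceptable runs
    return kept.mean(axis=0), kept.var(axis=0), len(kept)
```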

Problems with GLUE Methodology

• We applied GLUE to the Cannonsville Watershed SWAT model predictions for sediment (a very hard quantity to model).

• We did 20,000 Monte Carlo runs (which took about three weeks of computer time).

• Of the 20,000 runs, only two were within the allowable R² (only two!).

• This does not adequately characterize uncertainty, and it is not computationally feasible to make more runs.

• For computationally expensive models like our groundwater problem or your Everglades problem, it is not feasible to run the model 20,000 times!

• Hence GLUE has the problem that it finds very few samples within an acceptable level (filter) if the filter is fairly stringent.

Groundwater Example Used for Numerical Comparison with GLUE

• 2-D confined aquifer contaminated with chlorinated ethenes
• Same PDE equations as the earlier field case
• 400 m long, 100 m wide
• Modeled using a coarse 10 m x 10 m finite difference grid
– Simulation time for the 6-month calibration period was approximately 3/4 minute on a Pentium 4® 3 GHz computer
– Typical simulation time for long-term forecast scenarios is of the order of several hours to days

Calibration Problem

• Calibration of 3 parameters was considered: 2 biological parameters and one biokinetic parameter.

• Synthetic observations were generated for a period of 6 months using a known set of parameters.

• Optimal calibration was attempted using a response surface (RS) optimization method (Regis and Shoemaker, 2004).

• GLUE-based calibration/uncertainty assessment was also performed for comparison.

Output Definition

• Output: the total moles of toxic compounds (chlorinated ethenes) remaining in the aquifer at the final time period. (This cannot be measured directly but can be estimated through the model.)

• Uncertainty in the output was analyzed using GLUE and RS-based methods.

Goodness-of-fit Measure

• Nash-Sutcliffe Efficiency Measure (Nash and Sutcliffe, 1970)

• The optimization algorithm was set up to minimize CNS = 1 - NS, so that a CNS of zero is best

$$NS = \frac{1}{S}\sum_{i=1}^{S}\left[1 - \frac{\sum_{j}\sum_{t}\left(C^{\mathrm{sim}}_{j,t,i} - C^{\mathrm{obs}}_{j,t,i}\right)^{2}}{\sum_{j}\sum_{t}\left(C^{\mathrm{obs}}_{j,t,i} - C^{\mathrm{obs,av}}_{i}\right)^{2}}\right]$$

where $C^{\mathrm{sim}}_{j,t,i}$ and $C^{\mathrm{obs}}_{j,t,i}$ are the simulated and observed concentrations of species j at time t and location i, $C^{\mathrm{obs,av}}_{i}$ is the average observed concentration at location i, and S is the number of monitoring locations.
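A Python sketch of the complement CNS = 1 - NS, following the reconstructed formula above (the array layout is an assumption):

```python
import numpy as np

def cns(c_sim, c_obs):
    """CNS = 1 - NS, with NS averaged over monitoring locations.
    Arrays have shape (J, T, S): species x time points x locations."""
    num = ((c_sim - c_obs) ** 2).sum(axis=(0, 1))          # per location
    mean_obs = c_obs.mean(axis=(0, 1), keepdims=True)      # per-location average
    den = ((c_obs - mean_obs) ** 2).sum(axis=(0, 1))
    ns = (1.0 - num / den).mean()                          # average over S
    return 1.0 - ns  # zero indicates a perfect fit
```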

Uncertainty Estimates for Output: Total Moles of Chlorinated Ethenes Remaining

[Boxplot: total moles of chlorinated ethenes remaining (y axis, 141-147) for GLUE with 500, 1000, and 2000 evaluations (G500, G1000, G2000), our RS method with 200 evaluations (RS200), and our RSG method with 20,000 response surface samples (RSG20k), alongside the TRUE value. Bounds obtained using a filter of 0.01 for CNS.]

Our Method 1 (RS200) uses 200 function evaluations.

[Same boxplot, highlighting GLUE with 500 function evaluations (G500).]

[Same boxplot. For each method the marker is the mean and the range covers 99% of the data; the TRUE bar is the true answer.]

[Same boxplot, annotated with the number of points remaining after applying the filter.]

RS200 uses 200 function evaluations. G200 found 0 solutions (none) for this filter; G500 found only 5 solutions.

[Same boxplot.] The mean estimate is almost perfect for our RS method and is far off for the GLUE method even with 250% as many points evaluated!

[Same boxplot.] Even with 2000 function evaluations, GLUE has a much worse mean than our RS method, which used only 1/10 as many function evaluations.

Our Method 2 (RSG)

• Step 1: Same as in Method 1.

• Step 2: Construct a function approximation surface of the output.

• Step 3: Draw a large number of samples from the function approximation. Do further function evaluations if the function approximation is negative, and refit the function approximation.

• Step 4: Filter out points that are not acceptable and compute statistics.

• Determine statistical characteristics of the model output (e.g. mean and variance of the amount of contamination remaining after N years) at all acceptable points. A sketch of these steps follows below.
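A minimal Python sketch of Steps 2-4 under stated assumptions: `X`, `cns_vals`, and `out_vals` are the ~200 costly calibration runs (parameter vectors, their CNS values, and the output of interest), reused rather than rerun; the refitting step for misbehaving approximations in Step 3 is omitted here:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def rsg_uncertainty(X, cns_vals, out_vals, bounds, n_samples=20_000,
                    cns_filter=0.01, rng=None):
    """Method 2 (RSG) sketch: fit response surfaces to the calibration runs,
    sample them cheaply, filter, and compute output statistics."""
    rng = np.random.default_rng(rng)
    fit_surface = RBFInterpolator(X, cns_vals, kernel="cubic")  # goodness of fit
    out_surface = RBFInterpolator(X, out_vals, kernel="cubic")  # model output
    lo, hi = np.asarray(bounds).T
    samples = rng.uniform(lo, hi, size=(n_samples, X.shape[1]))
    keep = fit_surface(samples) <= cns_filter  # Step 4: apply the CNS filter
    out = out_surface(samples[keep])           # thousands of cheap outputs
    return out.mean(), out.var(), int(keep.sum())
```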

[Same boxplot.] Our Method 2 (RSG20k) uses 200 function evaluations and 20,000 samples from the response surface.

Difference Between Method 1 and Method 2

The uncertainty analysis in Method 1 is based only on actual function evaluations.

The uncertainty analysis in Method 2 is based on a very large number of samples from the function approximation.

Comments on Results

• A strict filter produces very few points with GLUE:
– even after 2000 function evaluations, only 12 points remain after filtering.

• Our RS method produces the tightest bounds and also provides more points for uncertainty assessment with only 200 function evaluations.
– It is limited with respect to sample independence.

• The RSG method provides an improvement over GLUE:
– independent samples for uncertainty assessment,
– a larger sample size for a tight filter.

Effect of Relaxing Filter – CNS of 0.1

[Boxplot: empirical 98% bounds on total moles of chlorinated ethenes (y axis, 135-165) obtained using a filter of 0.1 for CNS, for RS200, G200, G500, G1000, G2000, and RSG20k, alongside the TRUE value. The number of points after filtering is shown for each method (90, 12, 44, 84, 167, and 1542).]

[Bar chart: comparison of the percentage of points remaining after filtering (0-120%) for CNS filters of 0.01, 0.1, 0.3, 1 and infinity, for RS200, G200, G500, G1000, G2000 and RSG20k.]

Percentage of Points for Different Filters

[Bar chart: comparison of the percentage of points remaining after filtering (0-50%) for CNS filters of 0.01, 0.1 and 0.3, for RS200, G200, G500, G1000, G2000 and RSG20k.]

Advantages of Method 2

• The samples are independent.

• It reuses information from the calibration.

• It is computationally cheap:
– it uses only the same number of costly function evaluations as the regular RS optimization method (200 in these examples),
– yet can obtain goodness-of-fit and output values for many thousands of points.

Summary

• Models can help us use data taken at a small scale and at discrete time points to understand and manage environmental processes over large spatial areas and time frames.

• Development of computationally efficient methods for automatic calibration, sensitivity analysis and uncertainty analysis is very important.

New Project 2: Parallel Optimization Algorithms

• Funded by the Computer Science (CISE) Directorate at NSF

• The method is general and can be used for a wide range of problems including other engineering systems in addition to environmental systems.

• This research is underway.

2. How are calibration, sensitivity analysis and uncertainty analysis used in environmental analyses?

3. What are the alternatives to sensitivity analysis and uncertainty analysis?

How do we address the uncertainties that are not directly related to parameter uncertainty, such as data uncertainty?

My NSF Projects

• NSF-Environmental Engineering: applications of methods to watershed and groundwater

• NSF-Advanced Computing: development of parallel algorithms for function approximation optimization

• NSF-Statistics: development of an integration of Bayesian statistical methods with function approximation optimization for computationally expensive functions.

• All this previously funded research can be useful in applications to the Everglades.
